Have you checked the output from the bulk load to see if there were lines of the following form (from LoadIncrementalHFiles#splitStoreFile)?
LOG.info("HFile at " + hfilePath + " no longer fits inside a single " + "region. Splitting..."); In the server log, you should see log in the following form: if (LOG.isDebugEnabled()) { LOG.debug("Compacting " + file + ", keycount=" + keyCount + ", bloomtype=" + r.getBloomFilterType().toString() + ", size=" + TraditionalBinaryPrefix.long2String(r.length(), "", 1) + ", encoding=" + r.getHFileReader().getDataBlockEncoding() + ", seqNum=" + seqNum + (allFiles ? ", earliestPutTs=" + earliestPutTs: "")); } where allFiles being true indicates major compaction. The above should give you some idea of the cause for the compaction activity. Thanks On Tue, Jul 17, 2018 at 11:12 AM Austin Heyne <ahe...@ccri.com> wrote: > Hi all, > > I'm trying to bulk load a large amount of data into HBase. The bulk load > succeeds but then HBase starts running compactions. My input files are > typically ~5-6GB and there are over 3k files. I've used the same table > splits for the bulk ingest and the bulk load so there should be no > reason for hbase to run any compactions. However, I'm seeing it first > start compacting the hfiles into 25+GB files and then into 200+GB files > but didn't let it run any longer. Additionally, I've talked with another > coworker who's tried this process in the past and he's experience the > same thing, eventually giving up on the feature. My attempts have been > on HBase 1.4.2. Does anyone have information on why HBase is insisting > on running these compactions or how I can stop them? They are > essentially breaking the feature for us. > > Thanks, > > -- > Austin L. Heyne > >