Have you checked the output from the bulk load to see if there were lines of
the following form (from LoadIncrementalHFiles#splitStoreFile)?

    LOG.info("HFile at " + hfilePath + " no longer fits inside a
single " + "region.
Splitting...");

In the server log, you should see entries in the following form:

      if (LOG.isDebugEnabled()) {
        LOG.debug("Compacting " + file +
          ", keycount=" + keyCount +
          ", bloomtype=" + r.getBloomFilterType().toString() +
          ", size=" + TraditionalBinaryPrefix.long2String(r.length(), "",
1) +
          ", encoding=" + r.getHFileReader().getDataBlockEncoding() +
          ", seqNum=" + seqNum +
          (allFiles ? ", earliestPutTs=" + earliestPutTs: ""));
      }

where allFiles being true indicates a major compaction.

The above should give you some idea of the cause for the compaction
activity.
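
If you just need the region servers to stop compacting the table while you
investigate, one option is to disable compactions on it. A minimal sketch,
assuming the HBase 1.x client API ("my_table" is a placeholder, and note this
turns off minor compactions for the table as well until you re-enable it):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public final class DisableCompactions {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
          TableName table = TableName.valueOf("my_table");  // placeholder name
          HTableDescriptor htd = admin.getTableDescriptor(table);
          // Region servers stop scheduling compactions for this table;
          // set it back to true once the investigation is done.
          htd.setCompactionEnabled(false);
          admin.modifyTable(table, htd);
        }
      }
    }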

Thanks

On Tue, Jul 17, 2018 at 11:12 AM Austin Heyne <ahe...@ccri.com> wrote:

> Hi all,
>
> I'm trying to bulk load a large amount of data into HBase. The bulk load
> succeeds but then HBase starts running compactions. My input files are
> typically ~5-6GB and there are over 3k files. I've used the same table
> splits for the bulk ingest and the bulk load so there should be no
> reason for HBase to run any compactions. However, I'm seeing it first
> start compacting the HFiles into 25+GB files and then into 200+GB files,
> but I didn't let it run any longer. Additionally, I've talked with another
> coworker who tried this process in the past and experienced the same
> thing, eventually giving up on the feature. My attempts have been
> on HBase 1.4.2. Does anyone have information on why HBase is insisting
> on running these compactions or how I can stop them? They are
> essentially breaking the feature for us.
>
> Thanks,
>
> --
> Austin L. Heyne
>
>
