[ 
https://issues.apache.org/jira/browse/HBASE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239944#comment-14239944
 ] 

Vladimir Rodionov commented on HBASE-12657:
-------------------------------------------

I checked trunk code and I think the issue is still there:

HStore:

{code}
  /**
   * @param candidates pre-filtrate
   * @return filtered subset
   * take upto maxFilesToCompact from the start
   */
  private ArrayList<StoreFile> removeExcessFiles(ArrayList<StoreFile> candidates,
      boolean isUserCompaction, boolean isMajorCompaction) {
    int excess = candidates.size() - comConf.getMaxFilesToCompact();
    if (excess > 0) {
      if (isMajorCompaction && isUserCompaction) {
        LOG.debug("Warning, compacting more than " + comConf.getMaxFilesToCompact() +
            " files because of a user-requested major compaction");
      } else {
        LOG.debug("Too many admissible files. Excluding " + excess
          + " files from compaction candidates");
        candidates.subList(comConf.getMaxFilesToCompact(), candidates.size()).clear();
      }
    }
    return candidates;
  }
{code}

Even when this is the first compaction request after a split, we still remove 
the excess files from the list, and this is (imo) the root cause of 
HBASE-12657. All reference files must be compacted immediately after a split, 
not just the first maxFilesToCompact of them.
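
To make the idea concrete, here is a minimal, self-contained sketch of one way 
the selection could avoid dropping files when reference files are present. It 
is not the actual patch and does not use real HBase classes: FileStub, its 
reference flag, ExcessFileFilter, and the explicit maxFilesToCompact parameter 
are all hypothetical stand-ins for StoreFile and comConf, introduced only for 
illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for StoreFile; only the reference flag matters here. */
final class FileStub {
  final String name;
  final boolean reference; // assumption: true for a post-split reference file

  FileStub(String name, boolean reference) {
    this.name = name;
    this.reference = reference;
  }
}

/** Hypothetical helper mirroring the shape of removeExcessFiles quoted above. */
final class ExcessFileFilter {

  /**
   * Returns a copy of the candidates, trimmed to maxFilesToCompact unless the
   * request is a user-requested major compaction or (the assumed change) the
   * candidate list contains at least one reference file.
   */
  static List<FileStub> removeExcessFiles(List<FileStub> candidates,
      int maxFilesToCompact, boolean isUserCompaction, boolean isMajorCompaction) {
    List<FileStub> result = new ArrayList<>(candidates);
    int excess = result.size() - maxFilesToCompact;
    if (excess <= 0) {
      return result; // nothing to trim
    }
    if (isMajorCompaction && isUserCompaction) {
      return result; // same exemption as in the quoted code
    }
    for (FileStub f : result) {
      if (f.reference) {
        // Assumed rule: never trim while reference files are present, so
        // that all of them are compacted away right after the split.
        return result;
      }
    }
    // Otherwise keep only the first maxFilesToCompact files, as before.
    result.subList(maxFilesToCompact, result.size()).clear();
    return result;
  }
}
```

With a limit of 3, five ordinary files would be trimmed to three, while the 
same list containing a single reference file would be returned whole. Whether 
the limit should be lifted for the whole list or only for the reference files 
themselves is a design choice this sketch does not settle.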

> The Region is not being split and far exceeds the desired maximum size.
> -----------------------------------------------------------------------
>
>                 Key: HBASE-12657
>                 URL: https://issues.apache.org/jira/browse/HBASE-12657
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 0.98.8, 0.94.25, 0.99.2
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: 1.0.0, 2.0.0, 0.94.26, 0.98.9
>
>
> We are seeing this behavior when creating indexes in one of our environments.
> When an index is being created, most of the "requests" go into a single 
> region.  Creating an index seems to take longer than usual, and it can take 
> days for the regions to compact and split after the index is created.
> Here is a du of the HBase index table:
> {code}
> -bash-4.1$ sudo -su hdfs hadoop fs -du /hbase/43681
> 705          /hbase/43681/.tableinfo.0000000001
> 0            /hbase/43681/.tmp
> 27981697293  /hbase/43681/0492e22092e21d35fca8e779b21ec797
> 539687093    /hbase/43681/832298c4e975fc47210feb6bac3d2f71
> 560660531    /hbase/43681/be9bdb3bdf9365afe5fe90db4247d82c
> 7081938297   /hbase/43681/cd440e524f96fbe0719b2fe969848560
> 6297860287   /hbase/43681/dc893a2d8daa08c689dc69e6bb2c5b50
> 7189607722   /hbase/43681/ffbceaea5e2f142dbe6cd4cbeacc00e8
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
