churro morales created HBASE-19528:
--------------------------------------

             Summary: Major Compaction Tool 
                 Key: HBASE-19528
                 URL: https://issues.apache.org/jira/browse/HBASE-19528
             Project: HBase
          Issue Type: New Feature
            Reporter: churro morales
            Assignee: churro morales
             Fix For: 2.0.0, 3.0.0


The basic overview of how this tool works is:

Parameters:

    Table

    Stores

    ClusterConcurrency

    Timestamp


You supply a table, the desired concurrency, and the list of stores you wish to 
major compact.  The tool first checks the filesystem to see which stores need 
compaction based on the timestamp you provide (the default is the current time).  
It then takes the list of stores that require compaction and executes those 
requests concurrently, with at most N distinct RegionServers compacting at any 
given time, where N is the desired concurrency.  Each thread waits for its 
compaction to complete before moving to the next queue.  If a region split, 
merge, or move happens while the tool is running, the tool ensures that the 
affected regions get major compacted as well. 
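
The flow described above could be sketched roughly as follows. This is only an 
illustration under stated assumptions, not the tool's actual code: the class 
MajorCompactionSketch, the StoreInfo type and the methods needsCompaction, 
queuesByServer and compactAll are hypothetical names, a store is assumed to 
need compaction when its oldest file predates the given timestamp, and the 
real HBase client calls are replaced by a caller-supplied callback.  Handling 
of splits, merges and moves is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class MajorCompactionSketch {

    /** Hypothetical description of one store: hosting server, region, family, oldest file time. */
    static final class StoreInfo {
        final String server, region, family;
        final long oldestFileTs;
        StoreInfo(String server, String region, String family, long oldestFileTs) {
            this.server = server; this.region = region;
            this.family = family; this.oldestFileTs = oldestFileTs;
        }
    }

    /** Assumption: a store needs compaction if its oldest file predates the cutoff. */
    static List<StoreInfo> needsCompaction(List<StoreInfo> stores, long cutoffTs) {
        List<StoreInfo> pending = new ArrayList<>();
        for (StoreInfo s : stores) {
            if (s.oldestFileTs < cutoffTs) pending.add(s);
        }
        return pending;
    }

    /** Groups pending requests into one FIFO queue per RegionServer. */
    static Map<String, Queue<StoreInfo>> queuesByServer(List<StoreInfo> pending) {
        Map<String, Queue<StoreInfo>> queues = new LinkedHashMap<>();
        for (StoreInfo s : pending) {
            queues.computeIfAbsent(s.server, k -> new ArrayDeque<>()).add(s);
        }
        return queues;
    }

    /**
     * Runs every queued request with at most `concurrency` servers busy at once.
     * A worker checks out an idle server, runs one request to completion, then
     * returns the server to the idle list and moves on to the next queue.
     * Returns the peak number of servers observed compacting simultaneously.
     */
    static int compactAll(Map<String, Queue<StoreInfo>> queues, int concurrency,
                          Consumer<StoreInfo> compactor) {
        ConcurrentLinkedQueue<String> idleServers = new ConcurrentLinkedQueue<>(queues.keySet());
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        for (int i = 0; i < concurrency; i++) {
            pool.submit(() -> {
                String server;
                while ((server = idleServers.poll()) != null) {
                    Queue<StoreInfo> queue = queues.get(server);
                    StoreInfo next = queue.poll();
                    if (next == null) continue;            // this server is drained
                    peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                    try {
                        compactor.accept(next);            // blocks until the compaction is done
                    } finally {
                        active.decrementAndGet();
                    }
                    if (!queue.isEmpty()) idleServers.add(server);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        List<StoreInfo> stores = Arrays.asList(
            new StoreInfo("rs1", "region-a", "cf", now - 5000),
            new StoreInfo("rs1", "region-b", "cf", now - 4000),
            new StoreInfo("rs2", "region-c", "cf", now - 3000),
            new StoreInfo("rs3", "region-d", "cf", now + 1000)); // newer than cutoff, skipped
        Set<String> done = ConcurrentHashMap.newKeySet();
        int peak = compactAll(queuesByServer(needsCompaction(stores, now)), 2,
                              s -> done.add(s.region + "/" + s.family));
        System.out.println("compacted " + done.size() + " stores, peak servers busy: " + peak);
    }
}
```

Because a server is checked out of the idle list while one of its requests 
runs, no server ever compacts two stores at once in this sketch, and the 
thread pool size caps how many servers are busy at the same time.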

This helps us in two ways: we can limit how much I/O bandwidth major compaction 
uses cluster-wide, and we are guaranteed that once the tool finishes, all 
requested compactions have completed, regardless of moves, merges, and splits. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
