churro morales updated HBASE-19528:
    Status: Patch Available  (was: Open)

> Major Compaction Tool 
> ----------------------
>                 Key: HBASE-19528
>                 URL: https://issues.apache.org/jira/browse/HBASE-19528
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: churro morales
>            Assignee: churro morales
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>         Attachments: 0001-HBASE-19528-Major-Compaction-Tool-ADDENDUM.patch, 
> HBASE-19528.branch-1.patch, HBASE-19528.patch, HBASE-19528.v1.branch-1.patch, 
> HBASE-19528.v1.patch, HBASE-19528.v2.branch-1.patch, 
> HBASE-19528.v2.branch-1.patch, HBASE-19528.v2.branch-1.patch, 
> HBASE-19528.v8.patch
> The basic overview of how this tool works is:
> Parameters:
>     Table
>     Stores
>     ClusterConcurrency
>     Timestamp
> So you input a table, desired concurrency and the list of stores you wish to 
> major compact.  The tool first checks the filesystem to see which stores need 
> compaction based on the timestamp you provide (default is current time).  It 
> takes that list of stores that require compaction and executes those requests 
> concurrently with at most N distinct RegionServers compacting at a given 
> time.  Each thread waits for the compaction to complete before moving to the 
> next queue.  If a region split, merge or move happens this tool ensures those 
> regions get major compacted as well. 
> This helps us in two ways, we can limit how much I/O bandwidth we are using 
> for major compaction cluster wide and we are guaranteed after the tool 
> completes that all requested compactions complete regardless of moves, merges 
> and splits. 

This message was sent by Atlassian JIRA

Reply via email to