[jira] [Commented] (HBASE-19528) Major Compaction Tool

churro morales (JIRA) Mon, 18 Dec 2017 11:10:23 -0800

    [ https://issues.apache.org/jira/browse/HBASE-19528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16295491#comment-16295491 ]


churro morales commented on HBASE-19528:
----------------------------------------

[~stack]

To answer your questions: we decide whether to compact by looking at the 
filesystem, but the compaction request itself goes through 
admin.majorCompact(). We then wait until the compaction is done by checking the 
CompactionState for the region in question. (We make this call directly rather 
than through the HBase admin, so that we do not need to instantiate a new 
CatalogTracker each time.)
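
To make the "wait until the compaction is done" part concrete, here is a rough 
sketch of the polling loop. It is not the tool's actual code. The enum is a 
stand-in that mirrors the values of HBase's CompactionState so the snippet 
compiles without HBase on the classpath, and CompactionWaiter, 
waitForMajorCompaction and the Supplier parameter are hypothetical names; in 
practice the supplier would wrap whatever call returns the region's compaction 
state.

```java
import java.util.function.Supplier;

// Stand-in for HBase's CompactionState (assumption: same four values).
enum SketchCompactionState { NONE, MINOR, MAJOR, MAJOR_AND_MINOR }

class CompactionWaiter {
    /**
     * Polls stateSource until it no longer reports a major compaction,
     * or until timeoutMillis has elapsed.
     *
     * @return true if the major compaction finished in time, false on
     *         timeout or if the waiting thread was interrupted
     */
    static boolean waitForMajorCompaction(Supplier<SketchCompactionState> stateSource,
                                          long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            SketchCompactionState state = stateSource.get();
            boolean majorRunning = state == SketchCompactionState.MAJOR
                    || state == SketchCompactionState.MAJOR_AND_MINOR;
            if (!majorRunning) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A minor compaction alone does not keep the loop waiting, since only the major 
compaction is what was requested.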

We wanted to limit concurrent compactions at the regionserver level. Thus, if 
a region moves to a new server and that server is already major compacting a 
region / store, the request gets queued and waits instead of executing 
immediately. 
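
The per-regionserver limit can be pictured roughly as below. This is only an 
illustrative sketch under my own assumptions, not the code in the patch: 
PerServerCompactionQueues and its methods are hypothetical names, servers and 
region/store pairs are plain strings, and each server runs at most one major 
compaction at a time while at most maxBusyServers servers compact 
concurrently.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Queue;
import java.util.Set;

class PerServerCompactionQueues {
    private final int maxBusyServers;
    // Pending region/store requests, keyed by the server hosting the region.
    private final Map<String, Queue<String>> pending = new LinkedHashMap<>();
    // Servers that currently have a major compaction in flight.
    private final Set<String> busy = new HashSet<>();

    PerServerCompactionQueues(int maxBusyServers) {
        this.maxBusyServers = maxBusyServers;
    }

    synchronized void enqueue(String server, String regionStore) {
        pending.computeIfAbsent(server, k -> new ArrayDeque<>()).add(regionStore);
    }

    /**
     * Reserves the next request on an idle server. Returns {server, regionStore},
     * or empty if the cluster-wide limit is reached or every server with
     * pending work is already compacting.
     */
    synchronized Optional<String[]> reserve() {
        if (busy.size() >= maxBusyServers) {
            return Optional.empty();
        }
        for (Map.Entry<String, Queue<String>> e : pending.entrySet()) {
            if (!busy.contains(e.getKey()) && !e.getValue().isEmpty()) {
                busy.add(e.getKey());
                return Optional.of(new String[] { e.getKey(), e.getValue().poll() });
            }
        }
        return Optional.empty();
    }

    /** Marks the server idle again once its compaction has completed. */
    synchronized void release(String server) {
        busy.remove(server);
    }

    /** A region moved: its pending request waits in the new server's queue. */
    synchronized void moveRequest(String fromServer, String toServer, String regionStore) {
        Queue<String> q = pending.get(fromServer);
        if (q != null && q.remove(regionStore)) {
            enqueue(toServer, regionStore);
        }
    }
}
```

A worker thread would call reserve(), issue the compaction, wait for it to 
finish, and then call release() for that server.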

We also want to handle region splits and merges.

Does that answer your question?

> Major Compaction Tool 
> ----------------------
>
>                 Key: HBASE-19528
>                 URL: https://issues.apache.org/jira/browse/HBASE-19528
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: churro morales
>            Assignee: churro morales
>             Fix For: 2.0.0, 3.0.0
>
>
> The basic overview of how this tool works is:
> Parameters:
>     Table
>     Stores
>     ClusterConcurrency
>     Timestamp
> So you input a table, desired concurrency and the list of stores you wish to 
> major compact.  The tool first checks the filesystem to see which stores need 
> compaction based on the timestamp you provide (default is current time).  It 
> takes that list of stores that require compaction and executes those requests 
> concurrently with at most N distinct RegionServers compacting at a given 
> time.  Each thread waits for the compaction to complete before moving to the 
> next queue.  If a region split, merge or move happens this tool ensures those 
> regions get major compacted as well. 
> This helps us in two ways: we can limit how much I/O bandwidth we use for 
> major compaction cluster-wide, and we are guaranteed that, once the tool 
> completes, all requested compactions have completed regardless of moves, 
> merges and splits. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
