[
https://issues.apache.org/jira/browse/HBASE-19528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317658#comment-16317658
]
Yu Li commented on HBASE-19528:
-------------------------------
Just asking, that have you read [this thread|https://s.apache.org/3Uo5] in user
mailing list and [this
project|https://github.com/flipkart-incubator/hbase-compactor] in github? Or
exactly the same thing? Thanks. [~churromorales]
> Major Compaction Tool
> ----------------------
>
> Key: HBASE-19528
> URL: https://issues.apache.org/jira/browse/HBASE-19528
> Project: HBase
> Issue Type: New Feature
> Reporter: churro morales
> Assignee: churro morales
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-19528.patch, HBASE-19528.v1.patch
>
>
> The basic overview of how this tool works is:
> Parameters:
> Table
> Stores
> ClusterConcurrency
> Timestamp
> So you input a table, desired concurrency and the list of stores you wish to
> major compact. The tool first checks the filesystem to see which stores need
> compaction based on the timestamp you provide (default is current time). It
> takes that list of stores that require compaction and executes those requests
> concurrently with at most N distinct RegionServers compacting at a given
> time. Each thread waits for the compaction to complete before moving to the
> next queue. If a region split, merge or move happens this tool ensures those
> regions get major compacted as well.
> This helps us in two ways, we can limit how much I/O bandwidth we are using
> for major compaction cluster wide and we are guaranteed after the tool
> completes that all requested compactions complete regardless of moves, merges
> and splits.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)