[
https://issues.apache.org/jira/browse/HBASE-19528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
churro morales updated HBASE-19528:
-----------------------------------
Attachment: HBASE-19528.v8.patch
> Major Compaction Tool
> ----------------------
>
> Key: HBASE-19528
> URL: https://issues.apache.org/jira/browse/HBASE-19528
> Project: HBase
> Issue Type: New Feature
> Reporter: churro morales
> Assignee: churro morales
> Priority: Major
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-19528.patch, HBASE-19528.v1.patch,
> HBASE-19528.v8.patch
>
>
> A basic overview of how this tool works:
> Parameters:
> Table
> Stores
> ClusterConcurrency
> Timestamp
> You supply a table, the desired concurrency, and the list of stores you wish to
> major compact. The tool first checks the filesystem to see which stores need
> compaction based on the timestamp you provide (default is the current time). It
> then takes the list of stores that require compaction and executes those
> requests concurrently, with at most N distinct RegionServers compacting at any
> given time. Each thread waits for its compaction to complete before moving to
> the next queue. If a region split, merge, or move happens, the tool ensures
> those regions get major compacted as well.
> This helps us in two ways: we can limit how much I/O bandwidth major
> compaction uses cluster-wide, and we are guaranteed that once the tool
> completes, all requested compactions have completed, regardless of moves,
> merges, and splits.
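
The scheduling idea in the description can be illustrated with a minimal sketch. This is not the code from the attached patch: the class `MajorCompactionSketch`, the `StoreInfo` type, and the `compactAndWait` callback are hypothetical names, and the sketch makes two simplifying assumptions. Each task drains one RegionServer's whole queue, and a fixed pool of N threads is what caps the number of servers compacting at once. Handling of splits, merges, and moves is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names come from HBase.
class MajorCompactionSketch {

    /** One store, the server hosting it, and the timestamp of its oldest file. */
    static final class StoreInfo {
        final String server;
        final String store;
        final long oldestFileTs;

        StoreInfo(String server, String store, long oldestFileTs) {
            this.server = server;
            this.store = store;
            this.oldestFileTs = oldestFileTs;
        }
    }

    /** Keep stores that have a file older than the cutoff, grouped per server. */
    static Map<String, Queue<StoreInfo>> selectStores(List<StoreInfo> all, long cutoffTs) {
        Map<String, Queue<StoreInfo>> byServer = new TreeMap<>();
        for (StoreInfo info : all) {
            if (info.oldestFileTs < cutoffTs) {
                byServer.computeIfAbsent(info.server, k -> new ArrayDeque<>()).add(info);
            }
        }
        return byServer;
    }

    /**
     * Runs one task per server on a pool of n threads, so at most n servers
     * compact at once. Returns the peak number of servers seen working together.
     */
    static int compactAll(Map<String, Queue<StoreInfo>> byServer, int n,
                          Consumer<StoreInfo> compactAndWait) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (Queue<StoreInfo> queue : byServer.values()) {
            pool.submit(() -> {
                peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                try {
                    StoreInfo next;
                    // Block on each compaction before taking the next request.
                    while ((next = queue.poll()) != null) {
                        compactAndWait.accept(next);
                    }
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        List<StoreInfo> stores = Arrays.asList(
            new StoreInfo("rs1", "cf1", 100L),
            new StoreInfo("rs2", "cf1", 120L),
            new StoreInfo("rs3", "cf1", 900L)); // newer than the cutoff, skipped
        List<String> done = Collections.synchronizedList(new ArrayList<>());
        int peak = compactAll(selectStores(stores, 500L), 1,
            info -> done.add(info.server + "/" + info.store));
        System.out.println("compacted=" + done.size() + " peakServers=" + peak);
    }
}
```

In a real tool the `compactAndWait` callback would issue the major compaction request and poll until it finishes; here it is just a placeholder supplied by the caller.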
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)