churro morales created HBASE-19528:
--------------------------------------
Summary: Major Compaction Tool
Key: HBASE-19528
URL: https://issues.apache.org/jira/browse/HBASE-19528
Project: HBase
Issue Type: New Feature
Reporter: churro morales
Assignee: churro morales
Fix For: 2.0.0, 3.0.0
The basic overview of how this tool works is:
Parameters:
Table
Stores
ClusterConcurrency
Timestamp
You supply a table, the desired concurrency, and the list of stores you wish to
major compact. The tool first checks the filesystem to see which stores need
compaction based on the timestamp you provide (the default is the current time).
It then takes the list of stores that require compaction and executes those
requests concurrently, with at most N distinct RegionServers compacting at any
given time. Each thread waits for its compaction to complete before moving on to
the next queue. If a region split, merge, or move happens while the tool is
running, the tool ensures that the affected regions get major compacted as well.
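The selection and throttling steps described above can be sketched roughly as
follows. This is a minimal, self-contained simulation and not the tool's actual
code: StoreInfo, needingCompaction, queuesByServer, compactAndWait and
compactAll are hypothetical names, the "last major compaction older than the
timestamp" criterion is an assumption, and the real compaction request is
replaced by a short sleep. It also simplifies by letting each worker drain one
server's queue, and it leaves out the re-check for splits, merges and moves.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class MajorCompactionSketch {

    /** Hypothetical store descriptor; not an HBase class. */
    static final class StoreInfo {
        final String server;              // RegionServer hosting the region
        final String region;
        final String family;
        final long lastMajorCompactionTs; // assumed to be derivable from the filesystem

        StoreInfo(String server, String region, String family, long lastMajorCompactionTs) {
            this.server = server;
            this.region = region;
            this.family = family;
            this.lastMajorCompactionTs = lastMajorCompactionTs;
        }
    }

    /** Assumed criterion: a store qualifies if it was last major compacted before the cutoff. */
    static List<StoreInfo> needingCompaction(List<StoreInfo> stores, long cutoffTs) {
        List<StoreInfo> pending = new ArrayList<>();
        for (StoreInfo s : stores) {
            if (s.lastMajorCompactionTs < cutoffTs) {
                pending.add(s);
            }
        }
        return pending;
    }

    /** Builds one work queue per RegionServer. */
    static Map<String, List<StoreInfo>> queuesByServer(List<StoreInfo> stores) {
        Map<String, List<StoreInfo>> queues = new LinkedHashMap<>();
        for (StoreInfo s : stores) {
            queues.computeIfAbsent(s.server, k -> new ArrayList<>()).add(s);
        }
        return queues;
    }

    /** Placeholder for "request a major compaction and block until it finishes". */
    static void compactAndWait(StoreInfo store) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Compacts every queued store; returns the peak number of servers busy at once. */
    static int compactAll(Map<String, List<StoreInfo>> queues, int concurrency, List<String> done) {
        // A fixed pool of `concurrency` threads, one task per server queue, means
        // no more than `concurrency` distinct servers are compacting at any time.
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (List<StoreInfo> queue : queues.values()) {
            pool.submit(() -> {
                peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                try {
                    for (StoreInfo s : queue) {
                        compactAndWait(s); // wait before taking the next store
                        synchronized (done) {
                            done.add(s.region + "/" + s.family);
                        }
                    }
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("compactions did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        List<StoreInfo> stores = new ArrayList<>();
        stores.add(new StoreInfo("rs1", "regionA", "cf1", 100L));
        stores.add(new StoreInfo("rs1", "regionB", "cf1", 900L)); // newer than cutoff, skipped
        stores.add(new StoreInfo("rs2", "regionC", "cf1", 200L));
        stores.add(new StoreInfo("rs3", "regionD", "cf1", 300L));
        List<String> done = new ArrayList<>();
        int peak = compactAll(queuesByServer(needingCompaction(stores, 500L)), 2, done);
        System.out.println("compacted " + done + ", peak busy servers = " + peak);
    }
}
```

Sizing the thread pool to the concurrency value and giving each task exactly
one server's queue is only one way to get the "at most N servers" bound. A
fuller version would re-scan the table after the queues drain so that regions
that split, merged or moved in the meantime are picked up.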
This helps us in two ways: we can limit how much I/O bandwidth major compaction
uses cluster-wide, and we are guaranteed that, once the tool completes, all
requested compactions have completed regardless of moves, merges, and splits.