[
https://issues.apache.org/jira/browse/HBASE-19528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317298#comment-16317298
]
Ted Yu commented on HBASE-19528:
--------------------------------
Please add a license header to the new files.
Also add an audience annotation.
{code}
+ int getCompactionsLeft() {
{code}
'Compactions' can mean either the compaction requests or the number of compactions.
Please rename the method to make clear that it returns a count.
{code}
+ boolean atCapacity() {
+ lock.readLock().lock();
+ try {
+ return compactingServers.size() >= concurrentServers;
{code}
'atCapacity' seems to imply the case of compactingServers.size() ==
concurrentServers, but the check is >=. See if there is a better method name.
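To illustrate the naming point, here is a minimal sketch of the check under a name that covers the >= case. The class name CompactingServerTracker and the methods hasReachedCapacity, markCompacting and markDone are hypothetical and not taken from the patch; only compactingServers, concurrentServers and the read lock come from the snippet above. It also releases the read lock in a finally block, which the truncated snippet does not show.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the class under review; all names are illustrative.
class CompactingServerTracker {
  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private final Set<String> compactingServers = new HashSet<>();
  private final int concurrentServers;

  CompactingServerTracker(int concurrentServers) {
    this.concurrentServers = concurrentServers;
  }

  // True once the number of compacting servers has reached or passed the limit.
  boolean hasReachedCapacity() {
    lock.readLock().lock();
    try {
      return compactingServers.size() >= concurrentServers;
    } finally {
      lock.readLock().unlock();
    }
  }

  // Records a server as compacting; returns true if it was not already recorded.
  boolean markCompacting(String serverName) {
    lock.writeLock().lock();
    try {
      return compactingServers.add(serverName);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Removes a server from the compacting set; returns true if it was present.
  boolean markDone(String serverName) {
    lock.writeLock().lock();
    try {
      return compactingServers.remove(serverName);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

A name such as hasReachedCapacity (or isAtOrOverCapacity) reads correctly for both the == and the > case, so the caller does not have to inspect the body to know which comparison is used.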
{code}
+ .getRegionInfo().getEncodedName(), " already compacted");
+ }
+ return familiesToCompact;
{code}
Why return familiesToCompact after seeing the first StoreFileInfo?
Please put @VisibleForTesting one line above the method it affects.
{code}
+ private final Connection connection;
{code}
Why include the connection in the request? Can't the connection be created
locally?
> Major Compaction Tool
> ----------------------
>
> Key: HBASE-19528
> URL: https://issues.apache.org/jira/browse/HBASE-19528
> Project: HBase
> Issue Type: New Feature
> Reporter: churro morales
> Assignee: churro morales
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-19528.patch
>
>
> The basic overview of how this tool works is:
> Parameters:
> Table
> Stores
> ClusterConcurrency
> Timestamp
> So you input a table, desired concurrency and the list of stores you wish to
> major compact. The tool first checks the filesystem to see which stores need
> compaction based on the timestamp you provide (default is current time). It
> takes that list of stores that require compaction and executes those requests
> concurrently with at most N distinct RegionServers compacting at a given
> time. Each thread waits for the compaction to complete before moving to the
> next queue. If a region split, merge or move happens, this tool ensures those
> regions get major compacted as well.
> This helps us in two ways: we can limit how much I/O bandwidth we are using
> for major compaction cluster wide and we are guaranteed after the tool
> completes that all requested compactions complete regardless of moves, merges
> and splits.
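The throttling in the description above can be sketched roughly as follows. This is not the patch's implementation: it assumes a fixed thread pool sized to the concurrency limit with one task per server, and ThrottledCompactionSketch, run and compactAndWait are hypothetical names. compactAndWait only sleeps in place of a real compaction request, and the handling of splits, merges and moves is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from the patch.
class ThrottledCompactionSketch {

  // Works through each server's store queue with at most maxConcurrentServers
  // servers busy at once. Returns the peak number of servers seen busy together.
  static int run(Map<String, List<String>> storesByServer, int maxConcurrentServers) {
    ExecutorService pool = Executors.newFixedThreadPool(maxConcurrentServers);
    AtomicInteger active = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<Future<?>> pending = new ArrayList<>();
    try {
      for (Map.Entry<String, List<String>> entry : storesByServer.entrySet()) {
        // One task per server: its stores are compacted one after another.
        pending.add(pool.submit(() -> {
          int now = active.incrementAndGet();
          peak.accumulateAndGet(now, Math::max);
          try {
            for (String store : entry.getValue()) {
              compactAndWait(entry.getKey(), store);
            }
          } finally {
            active.decrementAndGet();
          }
        }));
      }
      // Block until every server's queue has been drained.
      for (Future<?> f : pending) {
        f.get();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  // Placeholder for "request a major compaction and wait for it to finish".
  static void compactAndWait(String server, String store) {
    try {
      Thread.sleep(5);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

Because each server is handled by a single task and the pool has N threads, no more than N distinct servers are compacting at any moment, which is what bounds the cluster-wide I/O.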