[
https://issues.apache.org/jira/browse/HBASE-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029076#comment-13029076
]
Nicolas Spiegelberg commented on HBASE-3797:
--------------------------------------------
@Stack: review board comments. Note that this is being used in our 0.89
branch, but the 0.90 port hasn't had any cluster testing, just unit testing.
The main differences were: (1) coprocessors, (2) have to pass the server to
CompactRequest/SplitRequest classes, & (3) Split code has changed. All 3
differences were pretty trivial, so it should work out of the box and we plan
to use it for our 0.90 build as well.
I wanted to keep the defaults the same as the old algorithm, but I certainly
agree that we should think of some reasonable default for the throttle size and
default to 1 large + 1 small compaction. It's just that our use case has 10GB
StoreFiles, and I don't think that's normal. Maybe after some auto-split
workload uses/tunes this feature? Also, note that 2 compactions/server means
that up to 6 hard disks/server could be busy handling compactions, so there are
diminishing returns from increasing the number of compaction threads past your
HD/server count. However, throttle partitioning will probably always be useful
for congestion scenarios even if HD/server is low.
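To make the throttle partitioning concrete, here is a minimal sketch of routing a compaction to a large or small pool by size. It is not the code in the patch: the class name, the method names, and the choice to compare the total selected size against the throttle with a strict ">" are all assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch; names and the ">" comparison are assumptions. */
class ThrottledCompactionPools {
    private final long throttleBytes;
    private final ExecutorService largePool;
    private final ExecutorService smallPool;

    ThrottledCompactionPools(long throttleBytes, int largeThreads, int smallThreads) {
        this.throttleBytes = throttleBytes;
        this.largePool = Executors.newFixedThreadPool(largeThreads);
        this.smallPool = Executors.newFixedThreadPool(smallThreads);
    }

    /** True when the total size of the selected files exceeds the throttle. */
    boolean isLarge(long totalSelectedBytes) {
        return totalSelectedBytes > throttleBytes;
    }

    /** Queues the compaction on the pool that matches its size. */
    void submit(long totalSelectedBytes, Runnable compaction) {
        ExecutorService pool = isLarge(totalSelectedBytes) ? largePool : smallPool;
        pool.execute(compaction);
    }

    /** Stops accepting new work; already queued compactions still run. */
    void shutdown() {
        largePool.shutdown();
        smallPool.shutdown();
    }
}
```

With 1 large + 1 small thread, a long-running large compaction can only occupy the large pool, so small compactions keep flowing through the other one.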
CompactionRequestor.java#34 : we can remove the Region-level compaction API,
but user-requested compactions and unit tests still use it for simplicity, so I
was going to limit my refactoring.
MemStoreFlusher.java#257 : Concurrent Splits & Compactions are fine. The first
thing a split request does is region.close(), which waits until
region.writestate.compacting == 0, i.e. until all the compactions have finished.
Additionally, compactions abort early on a close request, so the
split won't be stalled for long.
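The close/compaction handshake described above can be sketched roughly as follows. This is a simplified stand-in, not HRegion's actual writestate code: the class, its fields, and its method names are hypothetical, and a real compaction would poll the closing flag between units of work.

```java
/** Hypothetical sketch of the handshake; not HRegion's implementation. */
class WriteState {
    private int compacting = 0;
    private boolean closing = false;

    /** Returns false once the region is closing, so no new compaction starts. */
    synchronized boolean startCompaction() {
        if (closing) {
            return false;
        }
        compacting++;
        return true;
    }

    /** Called when a compaction completes or aborts; wakes a waiting close(). */
    synchronized void finishCompaction() {
        compacting--;
        notifyAll();
    }

    /** Polled by a running compaction so it can abort early. */
    synchronized boolean isClosing() {
        return closing;
    }

    /** Marks the region closing, then blocks until compacting == 0. */
    synchronized boolean close() {
        closing = true;
        try {
            while (compacting > 0) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;
    }

    synchronized int compactingCount() {
        return compacting;
    }
}
```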
SplitRequest.java#34 : The whole setServer() stuff was hacked together [you can
call region.getHRegionServer() in 0.89]. I tried to be consistent with
CompactionRequest, but passing the server through the constructor is fine as well.
Store.java#632 : Note that the old store.compact() code was moved to
store.requestCompaction().
CompactionRequest.java#48 : Feel free to change the comments for the class
header. You don't technically have to call CompactionRequest.run() to execute
a compaction [see HRegion.compactStores()]; however you do need a
CompactionRequest object for the Store to get the details about a compaction.
CompactionRequest.java#177 : The Store.filesCompacting variable and associated
locking ensures that 2 compactions for the same Store will not have overlapping
StoreFiles. See the Collections.disjoint() check at Store.java#881. Currently,
a user can independently add StoreFiles [e.g. bulk import], but not remove
StoreFiles. We would have to revisit some code here if the user were allowed to
arbitrarily remove StoreFiles outside of a custom compaction algorithm.
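For illustration, the reservation idea behind filesCompacting can be sketched as below. This is not the Store.java code: the class and method names are hypothetical, files are represented by plain strings, and the only real API relied on is java.util.Collections.disjoint().

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of per-Store file reservation; not Store.java. */
class FileReservations {
    // Files currently claimed by some in-flight compaction of this Store.
    private final List<String> filesCompacting = new ArrayList<>();

    /** Reserves the candidates only if none is already being compacted. */
    synchronized boolean reserve(Collection<String> candidates) {
        if (candidates.isEmpty()
                || !Collections.disjoint(filesCompacting, candidates)) {
            return false;
        }
        filesCompacting.addAll(candidates);
        return true;
    }

    /** Releases files once their compaction has finished or been aborted. */
    synchronized void release(Collection<String> finished) {
        filesCompacting.removeAll(finished);
    }
}
```

Because selection and reservation happen under one lock, two concurrent compactions on the same Store can never be handed overlapping files.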
> StoreFile Level Compaction Locking
> ----------------------------------
>
> Key: HBASE-3797
> URL: https://issues.apache.org/jira/browse/HBASE-3797
> Project: HBase
> Issue Type: Improvement
> Reporter: Nicolas Spiegelberg
> Assignee: Nicolas Spiegelberg
> Priority: Minor
> Attachments: HBASE-3797+1476.patch
>
>
> Multithreaded compactions (HBASE-1476) will solve the problem of major
> compactions clogging high-priority minor compactions. However, there is
> still a problem here. Since compactions are store-level, the store
> undergoing major compaction will have its storefile count increase during
> the major compaction. We really need a way to allow multiple outstanding compactions
> per store. compactSelection() should lock/reserve the files being used for
> compaction. This will also allow us to know what we're going to compact when
> inserting into the CompactSplitThread and make more informed priority
> queueing decisions.