[ 
https://issues.apache.org/jira/browse/HBASE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594202#comment-13594202
 ] 

Sergey Shelukhin commented on HBASE-7680:
-----------------------------------------

The files are structured as follows.
StripeStoreEngine is the implementation of StoreEngine for stripes; it should 
be pretty straightforward. needsCompaction was moved from CompactionPolicy into 
StoreEngine, and each engine calls the appropriate method.
StripeCompactor is a placeholder for the compactor; it does nothing for now. 
Again, due to vast differences in compactor interfaces, the general Compactor 
lost its compact methods.
DefaultCompactionPolicy was refactored a bit to extract the application of the 
ratio algorithm into a static method for use in stripes. I added the minfiles 
check right in the method; I'll check whether the separate minfiles check that 
the default policy does can be removed.
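
To illustrate what a static ratio method with a built-in minfiles check might 
look like, here is a minimal sketch. It is not the code from the patch: the 
class name, method name, and parameters are hypothetical, it works on plain 
file sizes (ordered oldest to newest) rather than store files, and it leaves 
out details of the real policy such as any minimum compaction size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; names and signature are assumptions, not the patch's API.
final class RatioSelectionSketch {

    /**
     * Skips leading files that are too large relative to the files after them,
     * then returns the remaining selection, or an empty list if fewer than
     * minFiles files are left.
     */
    static List<Long> applyRatio(List<Long> sizes, int minFiles, int maxFiles, double ratio) {
        int n = sizes.size();
        int start = 0;
        while (n - start >= minFiles) {
            // Sum the sizes of the files that would be compacted together with sizes[start].
            long sumOfRest = 0;
            int end = Math.min(n, start + maxFiles);
            for (int i = start + 1; i < end; i++) {
                sumOfRest += sizes.get(i);
            }
            if (sizes.get(start) <= ratio * sumOfRest) {
                break; // this file is small enough relative to the rest
            }
            start++; // too large; drop it from the selection
        }
        // The minfiles check lives inside the method, as described above.
        if (n - start < minFiles) {
            return Collections.emptyList();
        }
        return new ArrayList<>(sizes.subList(start, Math.min(n, start + maxFiles)));
    }

    public static void main(String[] args) {
        System.out.println(applyRatio(Arrays.asList(1000L, 100L, 50L, 40L, 30L), 3, 10, 1.2));
    }
}
```

Keeping the minfiles check inside the method means a stripe policy can call it 
on the files of one stripe without repeating the check itself.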

Then, StripeStoreConfig is a base class now; it has two nested subclasses for 
size-based and count-based stripes, with different parameters.
StripeCompactionPolicy is the base policy class that has some common methods, 
the biggest of which is finding a single-stripe compaction. It's generic on the 
StripeStoreConfig type.
SizeBasedStripeCompactionPolicy and CountBasedStripeCompactionPolicy are the 
actual implementations of the two policies.
Tests for each class are straightforward (in terms of mapping test to class); 
StripeCompactionPolicyTestBase is a base class that contains various common 
methods to create mock state, verify things, etc.
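
The class layout described above can be sketched roughly as follows. Only the 
class names StripeStoreConfig, StripeCompactionPolicy, 
SizeBasedStripeCompactionPolicy and CountBasedStripeCompactionPolicy come from 
the description; the wrapper class, the nested class names, all fields, and the 
simplified single-stripe helper are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the hierarchy; fields and method bodies are invented.
final class StripePolicySketch {

    /** Base config with two nested subclasses that carry different parameters. */
    static abstract class StripeStoreConfig {
        final int minFiles; // assumed shared parameter

        StripeStoreConfig(int minFiles) {
            this.minFiles = minFiles;
        }

        static final class SizeBased extends StripeStoreConfig {
            final long targetStripeSizeBytes; // assumed size-based parameter

            SizeBased(int minFiles, long targetStripeSizeBytes) {
                super(minFiles);
                this.targetStripeSizeBytes = targetStripeSizeBytes;
            }
        }

        static final class CountBased extends StripeStoreConfig {
            final int stripeCount; // assumed count-based parameter

            CountBased(int minFiles, int stripeCount) {
                super(minFiles);
                this.stripeCount = stripeCount;
            }
        }
    }

    /** Base policy, generic on the config type, holding the common methods. */
    static abstract class StripeCompactionPolicy<C extends StripeStoreConfig> {
        protected final C config;

        StripeCompactionPolicy(C config) {
            this.config = config;
        }

        /** Simplified stand-in: index of the first stripe with at least minFiles files, or -1. */
        int findSingleStripeCompaction(List<Integer> fileCountsPerStripe) {
            for (int i = 0; i < fileCountsPerStripe.size(); i++) {
                if (fileCountsPerStripe.get(i) >= config.minFiles) {
                    return i;
                }
            }
            return -1;
        }

        abstract String describe();
    }

    static final class SizeBasedStripeCompactionPolicy
            extends StripeCompactionPolicy<StripeStoreConfig.SizeBased> {
        SizeBasedStripeCompactionPolicy(StripeStoreConfig.SizeBased config) {
            super(config);
        }

        @Override
        String describe() {
            // The typed config gives direct access to size-based parameters.
            return "size-based, target=" + config.targetStripeSizeBytes;
        }
    }

    static final class CountBasedStripeCompactionPolicy
            extends StripeCompactionPolicy<StripeStoreConfig.CountBased> {
        CountBasedStripeCompactionPolicy(StripeStoreConfig.CountBased config) {
            super(config);
        }

        @Override
        String describe() {
            return "count-based, stripes=" + config.stripeCount;
        }
    }

    public static void main(String[] args) {
        StripeCompactionPolicy<?> policy =
            new CountBasedStripeCompactionPolicy(new StripeStoreConfig.CountBased(3, 8));
        System.out.println(policy.describe());
        System.out.println(policy.findSingleStripeCompaction(Arrays.asList(1, 2, 4)));
    }
}
```

Making the base policy generic on the config type lets each concrete policy read 
its own parameters without casts, while the shared logic only touches the base 
config.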

Also: discussion of drop-deletes logic is in HBASE-7902.
In short, to drop deletes, we need to add L0 files to the compaction so that it 
covers all store files, lest we have issues with deletes/puts in the past. Then 
we can only drop deletes from stripe files, not from L0. Note that we already 
have this issue because of the memstore, but the timing window to hit it is 
relatively small, whereas if we ignore some store files it will be large.
Therefore, I added a config parameter, "assume ordering", which basically tells 
us the user is prepared to tolerate it or doesn't use deletes in the past much 
(I assume this is the majority of cases, but it's off by default). In that case 
we can drop deletes on compactions of an entire stripe, ignoring L0.
Then, to avoid blowing up the compaction size/rewrites, I added a ratio config, 
so that L0 will not be added to a compaction to drop deletes unless it's small 
enough. I will need to think more about that, as adding a very small L0 to a 
compaction can result in a different kind of problem: too many small files.
Needless to say, if there's no L0 (no files), we don't have to add it.
Finally, in order to add L0 files to compaction to drop deletes, we need to 
know the resulting stripe boundaries (to split L0), so it's not done for 
compactions that are based on determining the boundaries dynamically (e.g. 
rebalancing, where we determine boundary in compactor based on data size).
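
The drop-deletes rules above can be summarized as a small decision function. 
This is only a sketch under stated assumptions: the enum, method, and parameter 
names are hypothetical, and "small enough" is interpreted here as the L0 size 
being at most a configured ratio of the stripe's file size, which the 
description does not spell out.

```java
// Hypothetical sketch of the decision; not the logic from the patch.
final class DropDeletesSketch {

    enum Decision { KEEP_DELETES, DROP_DELETES_WITHOUT_L0, DROP_DELETES_WITH_L0 }

    static Decision decide(boolean wholeStripe, boolean boundariesKnown,
                           boolean assumeOrdering, long l0Size, long stripeSize,
                           double maxL0Ratio) {
        // Deletes can only be dropped when the entire stripe is compacted.
        if (!wholeStripe) {
            return Decision.KEEP_DELETES;
        }
        // No L0 files: nothing to add, the stripe already covers everything.
        if (l0Size == 0) {
            return Decision.DROP_DELETES_WITHOUT_L0;
        }
        // "Assume ordering": the user tolerates ignoring L0.
        if (assumeOrdering) {
            return Decision.DROP_DELETES_WITHOUT_L0;
        }
        // Splitting L0 requires known stripe boundaries (not the case for
        // compactions that determine boundaries dynamically).
        if (!boundariesKnown) {
            return Decision.KEEP_DELETES;
        }
        // Assumed meaning of the ratio config: add L0 only if it is small
        // relative to the stripe being compacted.
        if (l0Size <= maxL0Ratio * stripeSize) {
            return Decision.DROP_DELETES_WITH_L0;
        }
        return Decision.KEEP_DELETES;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true, false, 10L, 100L, 0.2));
    }
}
```

The sketch does not address the open question of a very small L0 producing too 
many small files; a lower bound on L0 size would be one possible extension.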
 

                
> implement compaction policy for stripe compactions
> --------------------------------------------------
>
>                 Key: HBASE-7680
>                 URL: https://issues.apache.org/jira/browse/HBASE-7680
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-7680-minus0.5.patch, HBASE-7680-v0.patch, 
> HBASE-7680-v0-with-7679-and-7935.patch, HBASE-7680-v-minus1.patch, 
> HBASE-7680-v-minus1.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
