[ 
https://issues.apache.org/jira/browse/HBASE-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617776#comment-13617776
 ] 

Sergey Shelukhin commented on HBASE-7967:
-----------------------------------------


bq. 1. Inside one stripe, can we reuse some logic in the default compaction 
policy? The logic should be similar, right?  There are many new configuration 
parameters, can we re-use some from the default policy, such as max files, min 
files, etc?  Especially, they can be tuned per table/column family.
If you look at the patch in HBASE-7680, we do that. I wanted to keep parameters 
separate as they might be different, but yeah it probably makes sense to reuse 
them. 
HBASE-7571 allows per-table/per-CF settings. An example (from code; the shell 
also supports this):
{code}
htd.setConfiguration(StoreEngine.STORE_ENGINE_CLASS_KEY,
    StripeStoreEngine.class.getName());
htd.setConfiguration(StripeStoreConfig.CountBased.FIXED_COUNT_KEY,
    stripeCount.toString());
htd.setConfiguration(HStore.BLOCKING_STOREFILES_KEY,
    Long.toString(7 * stripeCount));
if (l0FileCount != null) {
  htd.setConfiguration(StripeStoreConfig.MIN_FILES_L0_KEY,
      l0FileCount.toString());
}
if (assumeOrdering != null) {
  htd.setConfiguration(StripeStoreConfig.ASSUME_ORDERING_KEY,
      assumeOrdering.toString());
}
{code}
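
One way to reuse the default policy's parameters while still allowing separate 
stripe values would be a lookup that prefers a stripe-specific key and falls 
back to the default compaction key. This is only a sketch of that idea, not 
code from the patch: the stripe-specific key name is hypothetical, a plain Map 
stands in for the HBase configuration, and the built-in default passed in is 
just an illustrative value.

```java
import java.util.HashMap;
import java.util.Map;

final class StripeConfigFallback {
  // Key used by the default compaction policy for the minimum file count.
  static final String DEFAULT_MIN_FILES_KEY = "hbase.hstore.compaction.min";
  // Hypothetical stripe-specific override key, made up for this sketch.
  static final String STRIPE_MIN_FILES_KEY = "example.stripe.compaction.min";

  /** Prefers the stripe-specific value, then the default-policy value. */
  static int minFiles(Map<String, String> conf, int builtInDefault) {
    String value = conf.get(STRIPE_MIN_FILES_KEY);
    if (value == null) {
      value = conf.get(DEFAULT_MIN_FILES_KEY);
    }
    return value == null ? builtInDefault : Integer.parseInt(value);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(DEFAULT_MIN_FILES_KEY, "4");
    // Only the default-policy key is set, so its value is reused.
    System.out.println(minFiles(conf, 3));
    conf.put(STRIPE_MIN_FILES_KEY, "2");
    // The stripe-specific key now takes precedence.
    System.out.println(minFiles(conf, 3));
  }
}
```

With a lookup like this, a table or column family that sets only the default 
key gets the same value for stripes, and the stripe key is needed only when 
the two should differ.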


bq. 2. There is a configuration assumeOrdering.  When should it be used?
This is related to dropping deletes. There's a recently discussed window in 
HBase where you can make an out-of-order Put before/during a major compaction; 
it is not visible before the major compaction, but becomes visible after the 
compaction finishes and drops the delete markers.
This setting extends that window to up to N memstore flushes instead of 1, 
where N is the number of L0 files (each one a memstore flush), because 
out-of-order puts in L0 files are not considered in most compactions.
As a benefit, you don't need to make bigger compactions just to drop deletes. 
So if you don't use out-of-order puts, or are OK with the existing window, you 
should use it.
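
The window described above can be illustrated with a toy model. This is not 
HBase code: the Cell class, the read rule (a delete marker masks Puts with a 
timestamp at or below its own), and the compaction function are simplified 
assumptions made up for this sketch, and the timestamps are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

final class OutOfOrderPutWindow {
  /** One version of a single row/column: a Put with a value, or a delete marker. */
  static final class Cell {
    final long timestamp;
    final boolean isDelete;
    final String value;

    private Cell(long timestamp, boolean isDelete, String value) {
      this.timestamp = timestamp;
      this.isDelete = isDelete;
      this.value = value;
    }

    static Cell put(long timestamp, String value) {
      return new Cell(timestamp, false, value);
    }

    static Cell delete(long timestamp) {
      return new Cell(timestamp, true, null);
    }
  }

  /** Timestamp of the newest delete marker, or Long.MIN_VALUE if there is none. */
  static long newestDelete(List<Cell> cells) {
    long deleteTs = Long.MIN_VALUE;
    for (Cell c : cells) {
      if (c.isDelete) {
        deleteTs = Math.max(deleteTs, c.timestamp);
      }
    }
    return deleteTs;
  }

  /** Returns the newest Put value that no delete marker masks, or null. */
  static String read(List<Cell> cells) {
    long deleteTs = newestDelete(cells);
    Cell newest = null;
    for (Cell c : cells) {
      if (!c.isDelete && c.timestamp > deleteTs
          && (newest == null || c.timestamp > newest.timestamp)) {
        newest = c;
      }
    }
    return newest == null ? null : newest.value;
  }

  /** Compacts the input, dropping delete markers and the Puts they mask. */
  static List<Cell> compactDroppingDeletes(List<Cell> cells) {
    long deleteTs = newestDelete(cells);
    List<Cell> out = new ArrayList<>();
    for (Cell c : cells) {
      if (!c.isDelete && c.timestamp > deleteTs) {
        out.add(c);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Cell> compactionInput = new ArrayList<>();
    compactionInput.add(Cell.put(3, "old"));
    compactionInput.add(Cell.delete(10));
    // Out-of-order Put: older timestamp than the delete, not in the compaction.
    Cell latePut = Cell.put(5, "late");

    List<Cell> before = new ArrayList<>(compactionInput);
    before.add(latePut);
    List<Cell> after = compactDroppingDeletes(compactionInput);
    after.add(latePut);

    System.out.println("before compaction: " + read(before)); // null
    System.out.println("after compaction:  " + read(after));  // late
  }
}
```

In this model the late Put stands for anything sitting in files that the 
delete-dropping compaction does not look at; with assumeOrdering that can be 
up to N L0 files rather than a single flush.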

bq. 3. Will we support any stripe type other than count based/size based?  If 
so, probably we need to change how stripe type is configured, since it seems 
that we can support only two types now .
Maybe. The hybrid "size+count" scheme that was mentioned would probably be 
just an improvement of the count-based one, if implemented.
Do you think it's worth changing now?

bq. 4. For count based, do we have to always have that many stripes?  Is it ok 
to have a size limit or something so that we don't have many small stripes?
That is possible as a future improvement; I will add it to the doc.

bq. 5. Based on the performance test you did, the write performance is not 
better. You mentioned it could be because of write amplification. Do we have 
some number to prove it?  If we have more IO, should the read performance be 
affected too?
Well, I have numbers for write amplification - in the count scheme, there's at 
least 2x write amplification :) I measured ~2.5 in my first test with bad 
settings (not the one in the doc :)). After the current test finishes I will 
post the results.
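
For reference, write amplification here can be computed as bytes physically 
written divided by bytes flushed. The sketch below assumes the simplest 
accounting (flush output plus compaction output, ignoring the WAL); the method 
name and the byte counts are illustrative only, not measurements from the 
test.

```java
final class WriteAmplification {
  /** Bytes written to store files (flushes plus compactions) per byte flushed. */
  static double ratio(long flushedBytes, long compactionOutputBytes) {
    if (flushedBytes <= 0) {
      throw new IllegalArgumentException("flushedBytes must be positive");
    }
    return (double) (flushedBytes + compactionOutputBytes) / flushedBytes;
  }

  public static void main(String[] args) {
    // Every flushed byte rewritten exactly once (L0 into stripes).
    System.out.println(ratio(1000, 1000)); // 2.0
    // Half of the data rewritten a second time by later compactions.
    System.out.println(ratio(1000, 1500)); // 2.5
  }
}
```

Under this accounting the 2x floor comes from each byte being written once at 
flush and at least once more when it moves out of L0.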

bq. 6. Can we have some doc to walk through the algorithm you implemented for 
the count/size based compaction policy? I was wondering how some L0 files end 
up in a specific stripe, how each stripe is created and maintained. Some 
flow-chart may be very helpful.
The doc attached to this JIRA describes all that. It doesn't have pictures 
though :( Do you mean something on top of that doc?
                
> implement compactor for stripe compactions
> ------------------------------------------
>
>                 Key: HBASE-7967
>                 URL: https://issues.apache.org/jira/browse/HBASE-7967
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-7967-v0.patch, HBASE-7967-v0-with-stuff.patch, 
> HBASE-7967-v1.patch, HBASE-7967-v1-with-7679-7680.patch, HBASE-7967-v2.patch, 
> HBASE-7967-v2-with-7679-7680.patch
>
>
> Compactor needs to be implemented. See details in parent and blocking jira.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
