[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15611478#comment-15611478
 ] 

Eshcar Hillel commented on HBASE-16417:
---------------------------------------

This jira aims to find the best policy (or policies) for in-memory flush, 
merge, and compaction. We'll test various workloads at large scale to find the 
policy that is most beneficial under common workloads and does not degrade 
performance under any workload. 
We would like to approximate real production workloads (as closely as possible 
with the synthetic workload generators we have) and to have decisions driven 
by benchmark results.
I hope we can agree that the default policy, or the one recommended to users, 
should be the one that performs best.

I have started running benchmarks and plan to publish new results here every 
few days, whenever a round of experiments completes. I would be happy to get 
suggestions for further experiments.
The first round will be published soon.

bq. On the data compaction use case of Y, I have some Qs.  Is it increment way? 
 Or they are put ops but many duplicated cells comes in?

The duplicates originate from skewed workloads in which some keys are hot and 
are therefore updated much more frequently than cold keys.
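
To illustrate how a hot/cold skew turns plain put ops into many duplicate 
cells, here is a minimal sketch. It is not taken from the benchmark setup: the 
function names and the parameters (1% hot keys receiving 90% of the puts) are 
hypothetical, and it assumes only the latest version of each key needs to be 
retained.

```python
import random
from collections import Counter


def simulate_skewed_puts(num_puts, num_keys, hot_fraction, hot_share, seed=0):
    """Count puts per key for a hypothetical hot/cold skewed workload."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    num_hot = max(1, int(num_keys * hot_fraction))
    counts = Counter()
    for _ in range(num_puts):
        if rng.random() < hot_share:
            # Hot keys: a small key range that receives most of the puts.
            key = rng.randrange(num_hot)
        else:
            # Cold keys: the remaining key range.
            key = num_hot + rng.randrange(num_keys - num_hot)
        counts[key] += 1
    return counts


def duplicate_ratio(counts):
    """Fraction of cells that are older versions of a key already written."""
    total = sum(counts.values())
    return (total - len(counts)) / total


# Assumed parameters, for illustration only.
counts = simulate_skewed_puts(
    num_puts=100_000, num_keys=10_000, hot_fraction=0.01, hot_share=0.9
)
print(f"distinct keys: {len(counts)}")
print(f"duplicate ratio: {duplicate_ratio(counts):.2f}")
```

Under these assumptions most cells written are overwrites of a key that is 
already in memory, which is the kind of redundancy an in-memory compaction 
could remove before the data is flushed to disk.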

> In-Memory MemStore Policy for Flattening and Compactions
> --------------------------------------------------------
>
>                 Key: HBASE-16417
>                 URL: https://issues.apache.org/jira/browse/HBASE-16417
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anastasia Braginsky
>            Assignee: Anastasia Braginsky
>             Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
