[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904985#comment-15904985
 ] 

Eshcar Hillel commented on HBASE-16417:
---------------------------------------

We already ran some experiments with merge, with really good results for the 
write-only workload and without the extra overhead in mixed workloads.
We thought the right way to go was first to refresh the code, commit to master, 
and then re-run the experiments and publish the results.
You can review the code in HBASE-17765.

In the past we ran experiments with value=1KB (see the penultimate report), but 
the code has changed a lot since then. Indeed, the effect of reducing the 
metadata decreases as the size of the data itself increases. It's a good idea 
to run (at least some of) the experiments with 1KB values.
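
To make the point about value size concrete, here is a minimal sketch of the 
arithmetic. The per-cell overhead numbers are hypothetical, chosen only for 
illustration; they are not measurements from our experiments, and 
relative_saving is just a helper name for this example.

```python
def relative_saving(value_bytes, overhead_before, overhead_after):
    """Fraction of the per-cell footprint saved when only the metadata shrinks."""
    before = value_bytes + overhead_before
    after = value_bytes + overhead_after
    return (before - after) / before


# Hypothetical per-cell metadata overheads in bytes (assumptions, not measured).
OVERHEAD_BEFORE = 100
OVERHEAD_AFTER = 50

for value_bytes in (100, 1024):
    saving = relative_saving(value_bytes, OVERHEAD_BEFORE, OVERHEAD_AFTER)
    print(f"value={value_bytes}B relative saving={saving:.1%}")
```

Under these assumed numbers, the same 50-byte metadata reduction saves a 
quarter of the footprint for 100B values but under 5% for 1KB values.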

We were unable to get greater throughput with sync WAL mode (even with more 
than 12 threads), so we decided to test with async WAL, which by its nature 
helps simulate greater load.
We batch at the client side for the same reason: it significantly increases 
the load on the servers and reduces the running time by an order of 
magnitude.

Note that in sync WAL mode all policies have the same number of WAL files and 
the same volume of WAL data. The number of WAL files is smaller with async WAL 
for all policies (with both zipfian and uniform key distributions). When you 
find out why this happens, it might explain the number of WAL files in the 
eager policy.

Number of pipeline segments: while 4*4=16 would be the maximal number, 4/2=2 
would be the number in expectation.

GC generally takes less than 1% of the running time. Since all experiments run 
with the same GC parameters, I don't think it's important which parameters we 
use. We are not trying to optimize the performance here, just to have a fair 
comparison under high load and high volume of data.

> In-Memory MemStore Policy for Flattening and Compactions
> --------------------------------------------------------
>
>                 Key: HBASE-16417
>                 URL: https://issues.apache.org/jira/browse/HBASE-16417
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anastasia Braginsky
>            Assignee: Eshcar Hillel
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16417-benchmarkresults-20161101.pdf, 
> HBASE-16417-benchmarkresults-20161110.pdf, 
> HBASE-16417-benchmarkresults-20161123.pdf, 
> HBASE-16417-benchmarkresults-20161205.pdf, 
> HBASE-16417-benchmarkresults-20170309.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
