[ https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418735#comment-16418735 ]

Eshcar Hillel commented on HBASE-20188:
---------------------------------------

One way to explain the poor read performance with in-memory compaction is 
that it uses 5 segments in the pipeline. The fact that in the performed 
benchmarks the majority of keys had no values on disk (only 15-20M out of 
100M keys are covered after the load phase) aggravates the negative 
effect of these segments. But this is yet to be proven.
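For reference, this is a sketch of how in-memory compaction could be turned off cluster-wide to isolate its effect; assuming the {{hbase.hregion.compacting.memstore.type}} property (NONE/BASIC/EAGER) that branch-2.0 reads, in {{hbase-site.xml}}:

```xml
<!-- Illustrative only: disable in-memory compaction so the memstore
     keeps a single active segment instead of a multi-segment pipeline.
     Property name/values assumed from branch-2.0; verify before use. -->
<property>
  <name>hbase.hregion.compacting.memstore.type</name>
  <value>NONE</value>
</property>
```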

I will also try the 2.0 code with the same system settings and YCSB workloads; 
the only difference is that I have SSD machines, which I believe is ok.
 Let me make sure I have all settings correct:
 1) You run the code currently committed to *branch-2.0*
 2) run on a single machine, namely *no replication* at the HDFS
 3) master and RS are on the same machine 
 4) Heap size is *8GB*?
 5) what is the {{operationcount}} in the experiments? (defined as 
{{operationcount=${INSERT_COUNT}}})

I would like to run the tests until they are completed and not cap them at 
20 minutes, if that's ok.
 I do understand that this will make the experiments much longer; can we agree 
to use {{RECORD_COUNT=50000000}}, loading 50GB? This is fewer keys but more 
data than in the experiments above.
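To make the proposal concrete, a sketch of the corresponding YCSB workload properties; the ~1 KB record size (10 fields of 100 bytes) is the YCSB core-workload default and is an assumption here, giving 50M records ≈ 50 GB:

```properties
# Illustrative YCSB core workload fragment (assumed defaults, not the
# exact files used in the experiments above).
recordcount=50000000
# operationcount to be confirmed -- see question 5 above
fieldcount=10
fieldlength=100
```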

What is the default WAL durability in branch-2.0: SYNC_WAL or ASYNC_WAL?

> [TESTING] Performance
> ---------------------
>
>                 Key: HBASE-20188
>                 URL: https://issues.apache.org/jira/browse/HBASE-20188
>             Project: HBase
>          Issue Type: Umbrella
>          Components: Performance
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 2.0.0
>
>         Attachments: ITBLL2.5B_1.2.7vs2.0.0_cpu.png, 
> ITBLL2.5B_1.2.7vs2.0.0_gctime.png, ITBLL2.5B_1.2.7vs2.0.0_iops.png, 
> ITBLL2.5B_1.2.7vs2.0.0_load.png, ITBLL2.5B_1.2.7vs2.0.0_memheap.png, 
> ITBLL2.5B_1.2.7vs2.0.0_memstore.png, ITBLL2.5B_1.2.7vs2.0.0_ops.png, 
> ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, 
> YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png, 
> YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png, 
> flamegraph-1072.1.svg, flamegraph-1072.2.svg, tree.txt
>
>
> How does 2.0.0 compare to old versions? Is it faster, slower? There is a 
> rumor that it is much slower, and that the problem is the asyncwal writing. Does 
> in-memory compaction slow us down or speed us up? What happens when you 
> enable offheaping?
> Keep notes here in this umbrella issue. Need to be able to say something 
> about perf when 2.0.0 ships.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
