[
https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507045#comment-16507045
]
stack commented on HBASE-20188:
-------------------------------
Update. Have been working on perf in the background, focused on writes. Writes
were bottlenecking on flush; our flush in hbase2 was 2x slower than hbase1's.
It was also erratic: sometimes it flushed at the limit, other times well in
excess of it. With input from the likes of [~ram_krish] and [~anoop.hbase],
flushes are 'regular' now, with the same profile as hbase1. See HBASE-20483
for detail.
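For context, the flush limits mentioned above are the standard memstore sizing settings. A minimal hbase-site.xml fragment (the values shown are the stock defaults, illustrative only, not the ones used in these tests):

```xml
<!-- Flush a region's memstore once it reaches this size (default 128 MB). -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>134217728</value>
</property>
<!-- Block client updates if a memstore grows past flush.size * multiplier. -->
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value>
</property>
```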
Our writes are still slower. The bottleneck now seems to be WAL writing. While
the dfsclient is a ball of synchronization knots, it can take in bigger blobs
than our async WAL writer, so in a simple benchmark where the region count is
small, the old FSHLog does better (hbase2 is up to 30% slower than hbase1 in
certain setups). But if you up the contention and the region count so the setup
resembles a real deploy, the async WAL starts to shine. At hundreds of regions
it can write faster and, almost as importantly, requires far fewer resources
(to learn more, see the messy experiments hereabouts:
https://docs.google.com/document/d/1vZ_k6_pNR1eQxID5u1xFihuPC7FkPaJQW8c4M5eA2AQ/edit#heading=h.niiqwjd247t4).
A few of us are working on it.
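If you want to reproduce the FSHLog-vs-asyncwal comparison yourself, the WAL implementation is selectable via hbase.wal.provider ("asyncfs" is the hbase2 default; "filesystem" gives the old FSHLog). A fragment, assuming hbase2 defaults otherwise:

```xml
<property>
  <name>hbase.wal.provider</name>
  <!-- "asyncfs" = new async WAL (hbase2 default); "filesystem" = old FSHLog. -->
  <value>filesystem</value>
</property>
```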
> [TESTING] Performance
> ---------------------
>
> Key: HBASE-20188
> URL: https://issues.apache.org/jira/browse/HBASE-20188
> Project: HBase
> Issue Type: Umbrella
> Components: Performance
> Reporter: stack
> Priority: Blocker
> Fix For: 3.0.0, 2.1.0
>
> Attachments: CAM-CONFIG-V01.patch, HBASE-20188-xac.sh,
> HBASE-20188.sh, HBase 2.0 performance evaluation - 8GB(1).pdf, HBase 2.0
> performance evaluation - 8GB.pdf, HBase 2.0 performance evaluation - Basic vs
> None_ system settings.pdf, HBase 2.0 performance evaluation - throughput
> SSD_HDD.pdf, ITBLL2.5B_1.2.7vs2.0.0_cpu.png,
> ITBLL2.5B_1.2.7vs2.0.0_gctime.png, ITBLL2.5B_1.2.7vs2.0.0_iops.png,
> ITBLL2.5B_1.2.7vs2.0.0_load.png, ITBLL2.5B_1.2.7vs2.0.0_memheap.png,
> ITBLL2.5B_1.2.7vs2.0.0_memstore.png, ITBLL2.5B_1.2.7vs2.0.0_ops.png,
> ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png,
> YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png,
> YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png,
> flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml,
> hbase-site.xml, hits.png, hits_with_fp_scheduler.png,
> lock.127.workloadc.20180402T200918Z.svg,
> lock.2.memsize2.c.20180403T160257Z.svg, perregion.png, run_ycsb.sh,
> total.png, tree.txt, workloadx, workloadx
>
>
> How does 2.0.0 compare to old versions? Is it faster, slower? There is rumor
> that it is much slower, that the problem is the asyncwal writing. Does
> in-memory compaction slow us down or speed us up? What happens when you
> enable offheaping?
> Keep notes here in this umbrella issue. Need to be able to say something
> about perf when 2.0.0 ships.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)