Eshcar Hillel commented on HBASE-20188:
Just to summarize the results again --
We see that in write workloads IMC improves performance: it delays flushes to
disk and hence reduces the number of disk compactions. When values are small, IMC
reduces memory occupancy by reducing metadata size (regardless of workload
distribution). When the distribution is skewed, IMC reduces memory occupancy by
eliminating data duplication (regardless of value size). When the values are
big and the workload is uniform, IMC doesn't help. For reads, IMC is either
comparable to or slightly worse than None (no in-memory compaction).
In addition, measurements from past experiments show that IMC reduces write
amplification, again by reducing the number of disk compactions.
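To illustrate why a skewed distribution leaves more duplicates for IMC to eliminate, here is a toy sketch. It is not HBase code and does not reproduce the benchmark: the function name, the key counts, and the Zipf-like weighting are all assumptions made for illustration. It only counts how many distinct keys remain after a batch of writes, which approximates what is left once older versions of the same key are dropped.

```python
import random


def surviving_fraction(num_keys, num_writes, skewed, seed=0):
    """Fraction of written cells left after dropping older duplicates of a key.

    Toy model only: each write targets one key, and deduplication keeps a
    single cell per key. All names and parameters here are hypothetical.
    """
    rng = random.Random(seed)
    keys = range(num_keys)
    if skewed:
        # Zipf-like weights: the key at rank r is written with weight 1/(r+1).
        weights = [1.0 / (rank + 1) for rank in keys]
    else:
        # Uniform workload: every key is equally likely.
        weights = None
    written = rng.choices(keys, weights=weights, k=num_writes)
    return len(set(written)) / num_writes


if __name__ == "__main__":
    for label, skewed in (("uniform", False), ("skewed", True)):
        fraction = surviving_fraction(10_000, 10_000, skewed)
        print(f"{label}: {fraction:.2f} of written cells survive deduplication")
```

In this model the skewed workload keeps a smaller fraction of cells than the uniform one, which is consistent with the observation above that duplicate elimination helps mainly when the distribution is skewed.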
I am opening a new Jira to change the defaults to the parameters that showed
the best performance in the recent benchmarks. Namely, IMC policy = ADAPTIVE,
active segment portion = 0.02, limit on number of segments in pipeline = 2.
We are continuing with our experiments to see if any additional changes can
help improve the performance.
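For anyone who wants to try these values before the defaults change, here is a minimal sketch that renders them as an hbase-site.xml fragment. The three property names are my assumption of how the parameters above map to HBase 2.x configuration keys (policy, active segment portion, pipeline limit); please verify them against the release you are running. The helper name render_site_properties is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed mapping of the parameters named above to HBase 2.x property keys.
# Verify these key names against your HBase version before relying on them.
PROPOSED_DEFAULTS = {
    "hbase.hregion.compacting.memstore.type": "ADAPTIVE",        # IMC policy
    "hbase.memstore.inmemoryflush.threshold.factor": "0.02",     # active segment portion
    "hbase.hregion.compacting.pipeline.segments.limit": "2",     # segments in pipeline
}


def render_site_properties(settings):
    """Return a <configuration> XML string with one <property> per setting."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_site_properties(PROPOSED_DEFAULTS))
```

The printed property elements can be merged into an existing hbase-site.xml. The in-memory compaction policy can also be set per column family instead of cluster-wide, which may be preferable when only some tables see skewed or small-value writes.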
> [TESTING] Performance
> Key: HBASE-20188
> URL: https://issues.apache.org/jira/browse/HBASE-20188
> Project: HBase
> Issue Type: Umbrella
> Components: Performance
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Fix For: 2.0.0
> Attachments: CAM-CONFIG-V01.patch, HBASE-20188-xac.sh,
> HBASE-20188.sh, HBase 2.0 performance evaluation - 8GB(1).pdf, HBase 2.0
> performance evaluation - 8GB.pdf, HBase 2.0 performance evaluation - Basic vs
> None_ system settings.pdf, ITBLL2.5B_1.2.7vs2.0.0_cpu.png,
> ITBLL2.5B_1.2.7vs2.0.0_gctime.png, ITBLL2.5B_1.2.7vs2.0.0_iops.png,
> ITBLL2.5B_1.2.7vs2.0.0_load.png, ITBLL2.5B_1.2.7vs2.0.0_memheap.png,
> ITBLL2.5B_1.2.7vs2.0.0_memstore.png, ITBLL2.5B_1.2.7vs2.0.0_ops.png,
> ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png,
> YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png,
> YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png,
> flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml,
> hbase-site.xml, lock.127.workloadc.20180402T200918Z.svg,
> lock.2.memsize2.c.20180403T160257Z.svg, run_ycsb.sh, tree.txt, workloadx
> How does 2.0.0 compare to old versions? Is it faster, slower? There is a rumor
> that it is much slower, that the problem is the asyncwal writing. Does
> in-memory compaction slow us down or speed us up? What happens when you
> enable offheaping?
> Keep notes here in this umbrella issue. Need to be able to say something
> about perf when 2.0.0 ships.