[
https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559727#comment-14559727
]
Zee Chen commented on HBASE-13259:
----------------------------------
For those interested in repeating the performance test and CPU profiling, you
will need a few things:
- An accurate Get RPC call latency measurement tool. We have an in-house C++
version based on libpcap and boost::accumulators; I can put it in a public
repo if there is enough interest.
- A JDK that preserves the frame pointer, so that you can use the Linux perf
tool to do combined kernel and user space CPU profiling, since we are
comparing pread and memcpy. Note that preserving the frame pointer introduces
a few percent of overhead but should not skew the overall profiling result. I
have a version of https://bugs.openjdk.java.net/browse/JDK-8068945 backported
to OpenJDK 8u45, and I can post the patch for 8u45.
> mmap() based BucketCache IOEngine
> ---------------------------------
>
> Key: HBASE-13259
> URL: https://issues.apache.org/jira/browse/HBASE-13259
> Project: HBase
> Issue Type: New Feature
> Components: BlockCache
> Affects Versions: 0.98.10
> Reporter: Zee Chen
> Fix For: 2.0.0, 1.2.0
>
> Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg,
> mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch
>
>
> Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data
> from kernel space to user space. This is a good choice when the total working
> set is much larger than the available RAM and latency is dominated by IO
> access. However, when the entire working set is small enough to fit in RAM,
> using mmap() (and a subsequent memcpy()) to move data from kernel space to
> user space is faster. I have run some short key-value Get tests, and the
> results indicate a 2%-7% reduction in kernel CPU usage on my system,
> depending on the load. For the Gets, the latency histograms from mmap() are
> identical to those from pread(), but peak throughput is close to 40% higher.
> This patch modifies ByteBufferArray to allow it to specify a backing file.
> Example usage: set hbase.bucketcache.ioengine to
> mmap:/dev/shm/bucketcache.0 in hbase-site.xml.
> Attached are flame graphs of the perf-measured CPU usage breakdown.
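
For readers unfamiliar with the two read paths compared in the description,
here is a minimal, self-contained Java sketch. It is not taken from the
attached patches: the class name ReadPathSketch and both helper methods are
hypothetical, and it reads from a temporary file rather than a real
BucketCache file. It only illustrates the difference between a positional
read (which the JDK typically implements with pread() on Linux) and a copy
out of a memory-mapped region.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical illustration only; not the code from HBASE-13259.
public class ReadPathSketch {

    // pread-style path: every call is a positional read on the channel,
    // which copies the bytes from kernel space into the destination buffer.
    static byte[] readWithPread(FileChannel ch, long offset, int len) throws IOException {
        ByteBuffer dst = ByteBuffer.allocate(len);
        long pos = offset;
        while (dst.hasRemaining()) {
            int n = ch.read(dst, pos);
            if (n < 0) {
                throw new EOFException("reached end of file at position " + pos);
            }
            pos += n;
        }
        return dst.array();
    }

    // mmap-style path: the file is mapped once up front, and each read is
    // then a plain memory copy out of the mapped region.
    static byte[] readWithMmap(MappedByteBuffer mapped, int offset, int len) {
        byte[] out = new byte[len];
        ByteBuffer view = mapped.duplicate(); // own position, shared content
        view.position(offset);
        view.get(out);
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("bucketcache-sketch", ".bin");
        file.toFile().deleteOnExit();

        // Fill the file with a simple repeating byte pattern.
        byte[] data = new byte[64 * 1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        Files.write(file, data);

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] viaPread = readWithPread(ch, 4096, 128);
            byte[] viaMmap = readWithMmap(mapped, 4096, 128);
            // Both paths must return the same bytes.
            System.out.println(Arrays.equals(viaPread, viaMmap));
        }
    }
}
```

Both helpers return the same bytes for the same offset and length; the
difference is only in how the data reaches user space. The sketch says
nothing about relative performance, which depends on the working set size
and load as described above.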