[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159049#comment-15159049 ]

Hudson commented on HBASE-13259:

FAILURE: Integrated in HBase-Trunk_matrix #730 (See [https://builds.apache.org/job/HBase-Trunk_matrix/730/])
HBASE-13259 mmap() based BucketCache IOEngine (Zee Chen & Ram) (ramkrishna: rev 3ba1a7fd23f0b0ca06cf7a9a04cb45975e1c7d91)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/ByteBufferIOEngine.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/FileMmapEngine.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/util/ByteBufferArray.java
* hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestByteBufferArray.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/bucket/TestFileMmapEngine.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/util/ByteBufferAllocator.java

> mmap() based BucketCache IOEngine
> ---------------------------------
>
> Key: HBASE-13259
> URL: https://issues.apache.org/jira/browse/HBASE-13259
> Project: HBase
> Issue Type: New Feature
> Components: BlockCache
> Affects Versions: 0.98.10
> Reporter: Zee Chen
> Assignee: Zee Chen
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, HBASE-13259_v3.patch, HBASE-13259_v4.patch, HBASE-13259_v5.patch, HBASE-13259_v6.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch
>
> Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and latency is dominated by IO access. However, when the entire working set is small enough to fit in RAM, using mmap() (and a subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short key-value get tests, and the results indicate a reduction of 2%-7% in kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher.
> This patch modifies ByteBufferArray to allow it to specify a backing file.
> Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml.
> Attached is a perf-measured CPU usage breakdown as a flame graph.
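For readers who prefer to see the configuration in code rather than hbase-site.xml, here is a minimal sketch of setting the same property through the Hadoop/HBase Configuration API. Only the property name and the mmap:/dev/shm/bucketcache.0 value come from the description above; the class name and the bucket cache size are illustrative assumptions.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Hypothetical helper class; only the property names/values echo the issue description.
public class MmapBucketCacheConfig {
  public static void main(String[] args) {
    // Picks up hbase-default.xml / hbase-site.xml from the classpath.
    Configuration conf = HBaseConfiguration.create();

    // The setting the description asks for: back the BucketCache
    // with an mmap()-ed file under /dev/shm.
    conf.set("hbase.bucketcache.ioengine", "mmap:/dev/shm/bucketcache.0");

    // A BucketCache also needs a size; 4096 MB here is just an illustrative value.
    conf.set("hbase.bucketcache.size", "4096");

    System.out.println("ioengine = " + conf.get("hbase.bucketcache.ioengine"));
  }
}
{code}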
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158756#comment-15158756 ]

ramkrishna.s.vasudevan commented on HBASE-13259:

Thanks for the patch [~zeocio], and all others for the reviews.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158736#comment-15158736 ]

Hadoop QA commented on HBASE-13259:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 3m 14s | master passed |
| +1 | compile | 0m 52s | master passed with JDK v1.8.0_72 |
| +1 | compile | 0m 54s | master passed with JDK v1.7.0_95 |
| +1 | checkstyle | 5m 21s | master passed |
| +1 | mvneclipse | 0m 30s | master passed |
| +1 | findbugs | 2m 42s | master passed |
| +1 | javadoc | 0m 45s | master passed with JDK v1.8.0_72 |
| +1 | javadoc | 0m 53s | master passed with JDK v1.7.0_95 |
| +1 | mvninstall | 1m 7s | the patch passed |
| +1 | compile | 0m 50s | the patch passed with JDK v1.8.0_72 |
| +1 | javac | 0m 50s | the patch passed |
| +1 | compile | 0m 54s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 54s | the patch passed |
| -1 | checkstyle | 4m 10s | Patch generated 1 new checkstyle issues in hbase-server (total was 36, now 37). |
| +1 | mvneclipse | 0m 29s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 27m 19s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 3m 17s | the patch passed |
| +1 | javadoc | 0m 49s | the patch passed with JDK v1.8.0_72 |
| +1 | javadoc | 0m 56s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 1m 44s | hbase-common in the patch passed with JDK v1.8.0_72. |
| -1 | unit | 118m 8s | hbase-server in the patch failed with JDK v1.8.0_72. |
| +1 | unit | 2m 12s | hbase-common in the patch passed with JDK v1.7.0_95. |
| +1 | unit | 122m 14s | hbase-server in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 35s | Patch does not generate ASF License warnings. |
| | | 301m 59s | |

|| Reason || Tests ||
| JDK v1.8.0_72 Failed junit tests | hadoop.hbase.mapreduce.TestImportExport |

|| Subsystem ||
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158386#comment-15158386 ]

stack commented on HBASE-13259:

+1 if hadoopqa is good w/ it. Mind adding a release note, [~ram_krish]? That's great you carried this one home.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156795#comment-15156795 ]

Hadoop QA commented on HBASE-13259:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 3m 0s | master passed |
| +1 | compile | 1m 3s | master passed with JDK v1.8.0_72 |
| +1 | compile | 0m 57s | master passed with JDK v1.7.0_95 |
| +1 | checkstyle | 5m 32s | master passed |
| +1 | mvneclipse | 0m 29s | master passed |
| +1 | findbugs | 3m 16s | master passed |
| +1 | javadoc | 1m 24s | master passed with JDK v1.8.0_72 |
| +1 | javadoc | 1m 5s | master passed with JDK v1.7.0_95 |
| -1 | mvninstall | 0m 21s | hbase-common in the patch failed. |
| -1 | mvninstall | 0m 31s | hbase-server in the patch failed. |
| -1 | compile | 0m 25s | hbase-common in the patch failed with JDK v1.8.0_72. |
| -1 | compile | 0m 34s | hbase-server in the patch failed with JDK v1.8.0_72. |
| -1 | javac | 0m 25s | hbase-common in the patch failed with JDK v1.8.0_72. |
| -1 | javac | 0m 34s | hbase-server in the patch failed with JDK v1.8.0_72. |
| -1 | compile | 0m 21s | hbase-common in the patch failed with JDK v1.7.0_95. |
| -1 | compile | 0m 27s | hbase-server in the patch failed with JDK v1.7.0_95. |
| -1 | javac | 0m 21s | hbase-common in the patch failed with JDK v1.7.0_95. |
| -1 | javac | 0m 27s | hbase-server in the patch failed with JDK v1.7.0_95. |
| -1 | checkstyle | 1m 18s | Patch generated 1 new checkstyle issues in hbase-common (total was 7, now 8). |
| -1 | checkstyle | 5m 9s | Patch generated 4 new checkstyle issues in hbase-server (total was 36, now 40). |
| +1 | mvneclipse | 0m 38s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| -1 | hadoopcheck | 0m 41s | Patch causes 14 errors with Hadoop v2.4.0. |
| -1 | hadoopcheck | 1m 22s | Patch causes 14 errors with Hadoop v2.4.1. |
| -1 | hadoopcheck | 2m 6s | Patch causes 14 errors with Hadoop v2.5.0. |
| -1 | hadoopcheck | 2m 50s | Patch causes 14 errors with Hadoop v2.5.1. |
| -1 | hadoopcheck
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156724#comment-15156724 ]

ramkrishna.s.vasudevan commented on HBASE-13259:

bq. Can we call force() on fileChannel? Depending on that, we may not need sync() in ByteBufferArray as well.

That was the way with FileIOEngine.sync(). Can the same be used here also? In the previous patch, force() was called on the MappedByteBuffers only and not on the fileChannel, hence I went with the same approach. I am not sure of the impact, i.e. whether both are the same.
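For reference, the two flush paths being compared look like this in plain JDK terms. This is only a minimal sketch of the java.nio calls under discussion, not the patch's FileMmapEngine or FileIOEngine code; the file path and mapping size are made-up values.

{code}
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class ForceComparison {
  public static void main(String[] args) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile("/dev/shm/bucketcache.demo", "rw");
         FileChannel channel = raf.getChannel()) {
      // Map a 4 MB region of the backing file read-write (size is illustrative).
      MappedByteBuffer mapped =
          channel.map(FileChannel.MapMode.READ_WRITE, 0, 4 * 1024 * 1024);
      mapped.put(0, (byte) 1);

      // Path used by the previous patch: flush each mapped region itself.
      mapped.force();

      // Path raised above (what FileIOEngine.sync() does): flush via the channel.
      // The 'true' argument asks for file metadata to be written as well.
      channel.force(true);
    }
  }
}
{code}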
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156520#comment-15156520 ]

Anoop Sam John commented on HBASE-13259:

bq. buffers[i] = allocator.allocate(0, false);

This is just a 0-length dummy buffer. It can be an HBB always.

{code}
if (directByteBuffer) {
  buffer = fileChannel.map(java.nio.channels.FileChannel.MapMode.READ_WRITE, pos * size, size);
} else {
  buffer = fileChannel.map(java.nio.channels.FileChannel.MapMode.READ_WRITE, pos * size, 0);
}
{code}

On this engine, we will always call with 'directByteBuffer' true, right? Can we throw an Exception otherwise rather than creating a 0-length BB?

It is a bit ugly to have a sync() API in ByteBufferAllocator.

{code}
public void sync() throws IOException {
  if (fileChannel != null) {
    if (bufferArray != null) {
      bufferArray.sync();
    }
  }
}
{code}

Can we call force() on fileChannel? Depending on that, we may not need sync() in ByteBufferArray as well.

bq. // TODO: See if the SHARED mode can be created here

You checked. I think it is not advisable. It depends on the total data size, and the page hit/miss ratio is going to decide the perf.
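To make the quoted map() calls easier to follow in isolation, here is a self-contained sketch that carves a backing file into fixed-size MappedByteBuffer segments and then force()s each one, in the spirit of the code above. The segment size, segment count, and file path are assumptions for illustration; this is not the actual ByteBufferArray/FileMmapEngine implementation.

{code}
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedSegments {
  public static void main(String[] args) throws IOException {
    final int segmentSize = 4 * 1024 * 1024;   // 4 MB per segment (assumed)
    final int segmentCount = 16;               // 64 MB total (assumed)

    try (RandomAccessFile raf = new RandomAccessFile("/dev/shm/bucketcache.demo", "rw");
         FileChannel fileChannel = raf.getChannel()) {
      // Pre-size the file so every segment maps onto allocated space.
      raf.setLength((long) segmentSize * segmentCount);

      MappedByteBuffer[] buffers = new MappedByteBuffer[segmentCount];
      for (int pos = 0; pos < segmentCount; pos++) {
        // Mirrors the quoted call: map segment 'pos' of the file read-write.
        buffers[pos] = fileChannel.map(FileChannel.MapMode.READ_WRITE,
            (long) pos * segmentSize, segmentSize);
      }

      // A sync() in this style would walk the segments and force() each one.
      for (MappedByteBuffer buffer : buffers) {
        buffer.force();
      }
    }
  }
}
{code}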
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152176#comment-15152176 ]

ramkrishna.s.vasudevan commented on HBASE-13259:

[~zeocio], [~saint@gmail.com], [~anoopsamjohn] Reviews before commit!!
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15145910#comment-15145910 ]

ramkrishna.s.vasudevan commented on HBASE-13259:

bq. So, commit? With a release note on when to use this ioengine?

Sure. But a review of the code once would be great, considering a few changes from the actual patch initially posted.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15145665#comment-15145665 ]

stack commented on HBASE-13259:

So, commit? With a release note on when to use this ioengine?
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138543#comment-15138543 ]

Zee Chen commented on HBASE-13259:

Thanks for the comprehensive test update! As you stated, the kernel read-ahead logic is the difference here for the scan performance. mmap can be made to read ahead as well, but not via the FileChannel API.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138356#comment-15138356 ]

ramkrishna.s.vasudevan commented on HBASE-13259:

Completed the testing. Here are the findings.

Using YCSB, scans and gets were performed with 75 threads and the throughput was measured. I used a server with 50 GB of RAM and measured the throughput difference on a setup where the mmap() file-mode engine was configured with a cache of 100 GB. In one setup only 10 GB of data was loaded, all of which was cached; in the other, around 75 GB of data was loaded and the whole 75 GB was cached in the file-mode BucketCache.

With 10G of cache:
|| Scans || Gets ||
| 13697.46 ops/sec | 69085.88 ops/sec |

With 75G of cache:
|| Scans || Gets ||
| 8745.08 ops/sec | 66221.93 ops/sec |

The same 75G cache setup was run with the current file-mode impl of the BucketCache:
|| Scans || Gets ||
| 12107.92 ops/sec | 42725.07 ops/sec |

Also, my file-mode BucketCache impl is backed by a *PCIe SSD*. So the test clearly shows that the mmap-based file mode is best suited for gets rather than scans: when the data does not fit in RAM there may be a lot of page faults, and we do a lot of read operations (like compares) on the BB that is retrieved out of these mmap buffers. Whereas in the current file-mode BucketCache, since the BB is copied on-heap, there are no page faults.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122009#comment-15122009 ]

Hadoop QA commented on HBASE-13259:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 3m 16s | master passed |
| +1 | compile | 1m 14s | master passed with JDK v1.8.0_72 |
| +1 | compile | 1m 4s | master passed with JDK v1.7.0_91 |
| +1 | checkstyle | 5m 20s | master passed |
| +1 | mvneclipse | 0m 32s | master passed |
| -1 | findbugs | 0m 59s | hbase-common in master has 1 extant Findbugs warnings. |
| -1 | findbugs | 2m 23s | hbase-server in master has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 5s | master passed with JDK v1.8.0_72 |
| +1 | javadoc | 1m 8s | master passed with JDK v1.7.0_91 |
| +1 | mvninstall | 1m 18s | the patch passed |
| +1 | compile | 1m 16s | the patch passed with JDK v1.8.0_72 |
| +1 | javac | 1m 16s | the patch passed |
| +1 | compile | 1m 6s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 1m 6s | the patch passed |
| -1 | checkstyle | 1m 12s | Patch generated 2 new checkstyle issues in hbase-common (total was 7, now 9). |
| -1 | checkstyle | 4m 25s | Patch generated 4 new checkstyle issues in hbase-server (total was 36, now 40). |
| +1 | mvneclipse | 0m 33s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 30m 57s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 4m 18s | the patch passed |
| +1 | javadoc | 1m 34s | the patch passed with JDK v1.8.0_72 |
| +1 | javadoc | 1m 18s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 2m 24s | hbase-common in the patch passed with JDK v1.8.0_72. |
| -1 | unit | 149m 35s | hbase-server in the patch failed with JDK v1.8.0_72. |
| +1 | unit | 4m 27s | hbase-common in the patch passed with JDK v1.7.0_91. |
| -1 | unit | 109m 29s | hbase-server in the patch failed with JDK v1.7.0_91. |
| +1 |
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122950#comment-15122950 ]

ramkrishna.s.vasudevan commented on HBASE-13259:

Failures seem unrelated.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15119994#comment-15119994 ]

Zee Chen commented on HBASE-13259:

[~ram_krish] Yes, please go ahead.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118634#comment-15118634 ]

ramkrishna.s.vasudevan commented on HBASE-13259:

[~zeocio] Thanks for this input. It will be useful for us too. Do you want to refresh this JIRA, or can I start working on it?
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115660#comment-15115660 ]

stack commented on HBASE-13259:

Thanks [~ram_krish]. This feature makes sense to me...
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115743#comment-15115743 ]

Zee Chen commented on HBASE-13259:

One of the things folks can do with a recent Linux kernel's ext4 DAX support (https://www.kernel.org/doc/Documentation/filesystems/dax.txt) is mmap a large BucketCache straight out of an NVM-backed (e.g. NAND flash) mount point, bypassing the page cache and unneeded memory copies. It would be good to do a latency/throughput comparison between the FileIOEngine and FileMmapEngine. Unfortunately I don't have such a system available at the moment.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112116#comment-15112116 ]

ramkrishna.s.vasudevan commented on HBASE-13259:

I can work on rebasing this and getting this in. It should help us in our future work also.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065990#comment-15065990 ]

Anoop Sam John commented on HBASE-13259:

The trunk patch needs a rebase, as the IOEngine#read API itself got changed. Yes, we can get this in if the tests show the given results.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560195#comment-14560195 ]

Andrew Purtell commented on HBASE-13259:

bq. an accurate get rpc call latency measurement tool. we have an in house C++ version that is based on libpcap and boost::accumulators, I can put it in a public repo if there is enough interest.

I think folks around here would be interested, if only to duplicate your results.

bq. a jdk that preserves frame pointer so that you can use the linux perf tool to do a kernel-user space combined CPU profiling [...] I have a version of https://bugs.openjdk.java.net/browse/JDK-8068945 backported to openjdk 8u45. I can post the patch for 8u45.

Yes please, if you don't mind, because I'm looking at that patch now wanting to port it too.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559727#comment-14559727 ]

Zee Chen commented on HBASE-13259:

For those interested in repeating the performance test and the CPU profiling, there are a few things you need:
- an accurate get RPC call latency measurement tool. We have an in-house C++ version that is based on libpcap and boost::accumulators; I can put it in a public repo if there is enough interest.
- a JDK that preserves the frame pointer, so that you can use the Linux perf tool to do a combined kernel/user-space CPU profile, since we are comparing pread and memcpy. Note that preserving the frame pointer will introduce a few percent of overhead but should not skew the overall profiling result. I have a version of https://bugs.openjdk.java.net/browse/JDK-8068945 backported to OpenJDK 8u45. I can post the patch for 8u45.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513177#comment-14513177 ]

Nick Dimiduk commented on HBASE-13259:

Alright. It's too late now for 1.1, but we should be able to get this in for 1.2. [~stack], what other tests would you like to see run? I can see about finding some cycles in a couple of weeks.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509896#comment-14509896 ]

Zee Chen commented on HBASE-13259:

[~ndimiduk] I have run the comparison between mmap and fileio, and between mmap and offheap, quite a few times. My fuzzy memory of the observations from 2 months ago:
- mmap is identical to offheap in both latency and throughput
- mmap is slightly better than file in latency, but it has greater throughput

Since the offheap engine also allocates memory using mmap(), except using anonymous maps as opposed to file-backed maps, the identical performance result is not particularly surprising. In both cases the bulk of the cost is in copying the data from the mmap areas, and we avoid a trip to the kernel. I haven't tested this with those cache options you mentioned yet. I did test with a few different block sizes and bucket sizes, but haven't found any significant impact on latency yet.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503941#comment-14503941 ] zhangduo commented on HBASE-13259: -- I can pick this up and address the 'ugly ByteBufferArray'. But I don't think we have enough time to test it on a large dataset if we want to catch the first RC of 1.1. It is tuning work, so the time needed is unpredictable. We can file a new issue to hold the tuning work and resolve this issue before the first RC of 1.1. What do you think? [~ndimiduk] Thanks.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504049#comment-14504049 ] Nick Dimiduk commented on HBASE-13259: -- Right. Sounds good.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503950#comment-14503950 ] stack commented on HBASE-13259: --- I suggest we kick it out of 1.1 then. It should be finished with a definitive story before it gets committed IMO.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503477#comment-14503477 ] Nick Dimiduk commented on HBASE-13259: -- First rc for 1.1 should be going up Friday (4/24). How are we feeling about this one? Any chance of an updated patch and some of the other requested test runs?
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497124#comment-14497124 ] Nick Dimiduk commented on HBASE-13259: -- bq. I mean could we test it with a size much larger than available memory? i.e., 100G RAM, 500G bucket cache on SSD? If we only test it with a size smaller than available memory, then I think we need to beat the offheap engine, not the file engine (it is good if you can beat both of them)

[~Apache9], [~zeocio], do either of you have a test rig to spin up the above suggestions? The patch is small and clean, looks good. One question: your change to ByteBufferArray introduces all these ugly {{if(filePath)}} checks. Does it make sense to subclass instead? Any chance you've tested it with any of our myriad block cache options (cache on write, block cache compression, evict on close)? Sorry for letting your patch go stale. Mind cleaning it up for master and branch-1? We can get some clean buildbot runs and commit it. Would be good to get this in for 1.1.0 with a nice release note. Oh, and [~zeocio] you may want to configure your git user.email so you get proper attribution in the repo (https://help.github.com/articles/setting-your-email-in-git/). Nice work!
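One hedged sketch of what the "subclass instead" suggestion could look like: hide the anonymous-vs-file decision behind a small allocator hook so that ByteBufferArray itself never has to check filePath. The interface and class names here are illustrative assumptions, not the patch's actual API.
{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Each segment comes from an allocator; ByteBufferArray would just loop and fill its array.
interface BufferAllocator {
  ByteBuffer allocate(long size) throws IOException;
}

// Anonymous mapping, as the existing offheap engine uses.
class DirectBufferAllocator implements BufferAllocator {
  @Override
  public ByteBuffer allocate(long size) {
    return ByteBuffer.allocateDirect((int) size);
  }
}

// File-backed mapping for the mmap engine; successive calls map successive regions of the file.
class FileMmapAllocator implements BufferAllocator {
  private final FileChannel channel;
  private long offset = 0;

  FileMmapAllocator(FileChannel channel) {
    this.channel = channel;
  }

  @Override
  public ByteBuffer allocate(long size) throws IOException {
    ByteBuffer segment = channel.map(FileChannel.MapMode.READ_WRITE, offset, size);
    offset += size;
    return segment;
  }
}
{code}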
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497347#comment-14497347 ] zhangduo commented on HBASE-13259: -- No, I haven't tested the patch...
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368154#comment-14368154 ] zhangduo commented on HBASE-13259: -- I mean, could we test it with a size much larger than available memory? i.e., 100G RAM, 500G bucket cache on SSD? If we only test it with a size smaller than available memory, then I think we need to beat the offheap engine, not the file engine (it is good if you can beat both of them :))
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367770#comment-14367770 ] Zee Chen commented on HBASE-13259: -- [~Apache9] The current ByteBufferArray class already encapsulates the concept of large offheap memory buffers pretty well; all of the memory is obtained from mmap() calls. The only difference is whether the map is anonymous or associated with a named file. It is not necessary to create two separate ByteBufferArray classes. When the working set doesn't fit in RAM, paging will take place, even for the offheap BucketCache option. Again, the difference here is between paging to system swap space and paging to a named local file (except when the file is created on a tmpfs like /dev/shm). Paging happens regardless of whether pread/pwrite (FileIOEngine) or mmap (FileMmapEngine) is used, because the JVM doesn't support direct I/O.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365848#comment-14365848 ] stack commented on HBASE-13259: --- [~zeocio] is it not configurable now? On the patch, nice and clean. Here where you do this... {{fileSize = roundUp(capacity, bufferSize);}} and then... {{raf.setLength(fileSize);}} ... any chance of us reading extra bytes off the end of the file? The end of an hfile has a particular format, so we will probably never get there? Have you tried reading to EOF and verifying it is all goodness you are getting back (I'm guessing you have)? We will get one of these for every file under a RS? {{LOG.info("Allocating " + StringUtils.byteDesc(fileSize) + ", on the path: " + filePath);}} Could be a bunch. Maybe DEBUG? Though it would be good to have a message that verifies the filechannel mmap is working... so just leave it as is... if it is annoying, we can fix it in a new JIRA. The patch looks great. +1 pending an answer to the above question. Needs a nice fat release note. I can add to the refguide too on commit.
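For readers following along, a small self-contained sketch of the rounding being discussed (the numbers are assumptions, and the real helper is private to ByteBufferArray): the requested capacity is rounded up to a whole number of map-sized segments and the backing file is extended to that length, which is why a few bytes past the logical capacity can exist in the file at all.
{code}
public class RoundUpDemo {
  // A plausible shape for the rounding the review refers to: round n up to a multiple of 'to'.
  static long roundUp(long n, long to) {
    return ((n + to - 1) / to) * to;
  }

  public static void main(String[] args) {
    long capacity = 70_000_000_000L;      // requested cache capacity, illustrative
    long bufferSize = 4L * 1024 * 1024;   // per-segment map size
    long fileSize = roundUp(capacity, bufferSize);
    // raf.setLength(fileSize) would then make the backing file this large;
    // the difference is slack that never holds cached blocks.
    System.out.println(fileSize + " bytes mapped, " + (fileSize - capacity) + " bytes of slack");
  }
}
{code}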
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366108#comment-14366108 ] Zee Chen commented on HBASE-13259: -- btw, the test result above is from a patched hbase-0.98.10 running on a slightly modified version of JDK8.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366100#comment-14366100 ] Zee Chen commented on HBASE-13259: -- Test results under the following conditions:
- 22-byte keys mapped to 32-byte values stored in a table, 16k hfile blocksize
- uniform key distribution, tested with gets from a large number of client threads
- hbase.regionserver.handler.count=100
- hbase.bucketcache.size=7 (70GB)
- hbase.bucketcache.combinedcache.enabled=true
- hbase.bucketcache.ioengine=mmap:/dev/shm/bucketcache.0
- hbase.bucketcache.bucket.sizes=5120,7168,9216,11264,13312,17408,33792,41984,50176,58368,66560,99328,132096,197632,263168,394240,525312
- CMS GC

At 85k gets per second, the system looks like:
{code}
total-cpu-usage -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
 58  11  26   0   0   5|   0    16k|  17M   13M|   0     0 |316k  255k
 59  11  25   0   0   5|2048k   12k|  18M   13M|   0     0 |319k  254k
 58  11  25   0   0   5|   0    28k|  18M   13M|   0     0 |318k  253k
 59  11  25   0   0   5|2048k    0 |  18M   13M|   0     0 |318k  252k
{code}
with the following wire latency profile (units are microseconds):
{code}
Quantile: 0.50, Value: 361
Quantile: 0.75, Value: 555
Quantile: 0.90, Value: 830
Quantile: 0.95, Value: 1077
Quantile: 0.98, Value: 1604
Quantile: 0.99, Value: 4212
Quantile: 0.999000, Value: 7221
Quantile: 1.00, Value: 14406
{code}
FileIOEngine's latency profile is identical. It had higher sys CPU and lower user CPU, higher context switches, and about 40% lower maximum throughput in gets per second. The patch was tested at up to 140k gets per second for 2 weeks nonstop.
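For convenience, a sketch of how the settings listed above would be written in hbase-site.xml. The values are copied verbatim from the list and are specific to that test rig (and that HBase version's interpretation of hbase.bucketcache.size), not recommendations.
{code}
<!-- Excerpt from hbase-site.xml mirroring the test settings above (illustrative only). -->
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>100</value>
</property>
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>mmap:/dev/shm/bucketcache.0</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>7</value> <!-- 70 GB on this setup, per the note above -->
</property>
<property>
  <name>hbase.bucketcache.combinedcache.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hbase.bucketcache.bucket.sizes</name>
  <value>5120,7168,9216,11264,13312,17408,33792,41984,50176,58368,66560,99328,132096,197632,263168,394240,525312</value>
</property>
{code}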
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365687#comment-14365687 ] Andrew Purtell commented on HBASE-13259: [~ndimiduk], you expressed interest in this on dev@
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365826#comment-14365826 ] Zee Chen commented on HBASE-13259: -- ByteBufferArray has DEFAULT_BUFFER_SIZE set to 4MB right now. For realistic deployment scenarios, where regionservers can take 100+GB, we should allow this to be set to 1GB (see https://wiki.debian.org/Hugepages#x86_64 ) if the underlying system is configured for it.
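To illustrate why the segment size matters at that scale, a quick back-of-the-envelope calculation (assuming a 70 GB cache, as in the test above): 4 MB segments mean thousands of separate mappings, while 1 GB hugepage-sized segments need only a handful.
{code}
public class SegmentCountDemo {
  public static void main(String[] args) {
    long cacheBytes = 70L << 30;                  // 70 GB cache, illustrative
    System.out.println(cacheBytes / (4L << 20));  // 4 MB segments  -> 17920 mappings
    System.out.println(cacheBytes / (1L << 30));  // 1 GB segments  ->    70 mappings
  }
}
{code}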
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366232#comment-14366232 ] Hadoop QA commented on HBASE-13259: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705181/HBASE-13259-v2.patch against master branch at commit f9a17edc252a88c5a1a2c7764e3f9f65623e0ced.
ATTACHMENT ID: 12705181
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1925 checkstyle errors (more than the master's current 1917 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+ public ByteBufferArray(long capacity, boolean directByteBuffer, String filePath) throws IOException {
+ buffers[i] = fileChannel.map(java.nio.channels.FileChannel.MapMode.READ_WRITE, i*(long)bufferSize, (long)bufferSize);
+ buffers[i] = fileChannel.map(java.nio.channels.FileChannel.MapMode.READ_WRITE, i*(long)bufferSize, 0);
+// Ideally we call ByteBufferArray.close() where we munmap() the segments and close the FileChannel.
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13290//console
This message is automatically generated.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366403#comment-14366403 ] zhangduo commented on HBASE-13259: -- What about just using a different ByteBufferArray implementation? Then we do not need to check 'fileChannel != null' every time. And one doubt: if the working set can fit in RAM, then why not just use the offheap BucketCache? We cannot transfer data from HDFS using mmap... And if we use mmap on a file much larger than available RAM as BucketCache, what will happen? I think this is the common use case for the FileIOEngine that your FileMmapEngine is compared against. It would be great if we can still be faster than FileIOEngine under that condition. Thanks.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365852#comment-14365852 ] stack commented on HBASE-13259: --- Oh, it would be good to add your experience with this patch to the release note too... knowing it has been deployed will give those who wonder confidence that the feature actually works. Thanks.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365959#comment-14365959 ] Hadoop QA commented on HBASE-13259: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705165/HBASE-13259.patch against master branch at commit 99ec36614703fb191fa4093b86efdddf6aaa89ae.
ATTACHMENT ID: 12705165
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests.
{color:red}-1 javac{color}. The patch appears to cause the mvn compile goal to fail with Hadoop version 2.4.1. Compilation errors resume:
[ERROR] COMPILATION ERROR :
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/bucket/TestFileMmapEngine.java:[27,31] cannot find symbol
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/bucket/TestFileMmapEngine.java:[35,11] cannot find symbol
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.2:testCompile (default-testCompile) on project hbase-server: Compilation failure: Compilation failure:
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/bucket/TestFileMmapEngine.java:[27,31] cannot find symbol
[ERROR] symbol: class SmallTests
[ERROR] location: package org.apache.hadoop.hbase
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/bucket/TestFileMmapEngine.java:[35,11] cannot find symbol
[ERROR] symbol: class SmallTests
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hbase-server
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13289//console
This message is automatically generated.
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366047#comment-14366047 ] Zee Chen commented on HBASE-13259: -- [~stack] thanks for the review. roundUp is a private method of ByteBufferArray. I haven't specifically tested reading to EOF yet. Is there some chance of reading uninitialized (already mmap()ed but not yet filled with data) memory? Yes, and I have come across that when testing with the cache persistence (hbase.bucketcache.persistent.path) feature. This should be called out as a caveat: deploying this feature with persistence turned on is currently broken, and I would like to get some help tracking down the bug. In case you are worried about the caller reading past EOF, the assert logic in ByteBufferArray.multiple() should catch those conditions. I can add a unit test for that if so desired.
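For context, the kind of guard being referred to looks roughly like the following. This is a hand-written illustration of such a bounds check, not the actual body of ByteBufferArray.multiple(); the class and values are assumptions.
{code}
// Illustrative only: refuse any access that would run past the logical capacity,
// even though the rounded-up mapping behind it may be slightly larger.
final class BoundsCheck {
  static void check(long start, long len, long capacity) {
    assert start >= 0 && len >= 0 && start + len <= capacity
        : "access [" + start + ", " + (start + len) + ") outside capacity " + capacity;
  }

  public static void main(String[] args) {
    check(0, 16 * 1024, 70L << 30);         // within capacity, passes
    check((70L << 30) - 8, 16, 70L << 30);  // trips the assert when run with -ea
  }
}
{code}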
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364371#comment-14364371 ] zhangduo commented on HBASE-13259: -- Good!
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364401#comment-14364401 ] Ted Yu commented on HBASE-13259:
{code}
LOG.info("MappedByteBuffer: " + buffers[i].toString());
{code}
Consider using DEBUG level logging for the above.
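A sketch of what acting on that suggestion would look like at the quoted spot, using the standard commons-logging idiom (the surrounding LOG and buffers[i] names are taken from the quoted line, so this fragment assumes that context):
{code}
// Only build and emit the message when DEBUG logging is actually enabled.
if (LOG.isDebugEnabled()) {
  LOG.debug("MappedByteBuffer: " + buffers[i].toString());
}
{code}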