[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157955#comment-15157955 ]

Sean Busbey commented on HBASE-11927:
-------------------------------------

is this fine to close now? if not, could we move backport to its own ticket?

> Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-11927
>                 URL: https://issues.apache.org/jira/browse/HBASE-11927
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>            Reporter: stack
>            Assignee: Appy
>             Fix For: 2.0.0, 1.2.0, 1.1.4
>
>         Attachments: HBASE-11927-branch-1.1.patch, HBASE-11927-v1.patch,
> HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch,
> HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch,
> HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg,
> after-randomWrite1M-0.5%.svg, before-compact-22%.svg,
> before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg,
> c2021.zip.svg, crc32ct.svg
>
>
> Up in hadoop they have this change. Let me publish some graphs to show that
> it makes a difference (CRC is a massive amount of our CPU usage in my
> profiling of an upload because of compacting, flushing, etc.). We should
> also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in
> hbase but that is another issue for now.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143293#comment-15143293 ]

Hadoop QA commented on HBASE-11927:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 32s {color} | {color:green} branch-1.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s {color} | {color:green} branch-1.1 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s {color} | {color:green} branch-1.1 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} branch-1.1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} branch-1.1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 41s {color} | {color:red} hbase-common in branch-1.1 has 9 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 46s {color} | {color:red} hbase-server in branch-1.1 has 80 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 14s {color} | {color:red} hbase-common in branch-1.1 failed with JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 24s {color} | {color:red} hbase-server in branch-1.1 failed with JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s {color} | {color:green} branch-1.1 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 4m 9s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.1 2.5.2 2.6.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 46s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 13s {color} | {color:red} hbase-common in the patch failed with JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 26s {color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 21s {color} | {color:green} hbase-common in the patch passed with JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 46s {color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 29s {color} | {color:green} hbase-common in the patch passed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 86m 10s {color} | {color:green} hbase-server in the patch
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143291#comment-15143291 ]

Andrew Purtell commented on HBASE-11927:
----------------------------------------

+1

I made a similar assessment about 0.98, hence HBASE-14738, but have held off commit because of the rather extensive additional changes needed for 0.98-specific complications of dealing with multiple versions of Hadoop.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143358#comment-15143358 ]

stack commented on HBASE-11927:
-------------------------------

Backport looks good to me. You'll need to call out in any release the flip to CRC32C (it won't be a problem but a change). It's a nice boost so worth the backport.
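One reason the flip is worth calling out in release notes: CRC32 and CRC32C use different polynomials, so the same bytes produce different checksum values. The following is a minimal sketch of that difference using the JDK's own `java.util.zip` classes (`CRC32C` needs Java 9 or later). It does not use HBase's `ChecksumType` or the native Hadoop code path from the patch; the class name `ChecksumFlipSketch` and its helper methods are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

// Hypothetical illustration class; not part of HBase or Hadoop.
final class ChecksumFlipSketch {

    /** Runs the given checksum over the whole array and returns the 32-bit result as a long. */
    static long checksum(Checksum algorithm, byte[] data) {
        algorithm.reset();
        algorithm.update(data, 0, data.length);
        return algorithm.getValue();
    }

    /** CRC32, the default before this change. */
    static long crc32(byte[] data) {
        return checksum(new CRC32(), data);
    }

    /** CRC32C (Castagnoli polynomial), the default after this change. */
    static long crc32c(byte[] data) {
        return checksum(new CRC32C(), data);
    }

    public static void main(String[] args) {
        // "123456789" is the conventional check input for CRC algorithms.
        byte[] block = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("CRC32  = %08x%n", crc32(block));   // prints cbf43926
        System.out.printf("CRC32C = %08x%n", crc32c(block));  // prints e3069283
    }
}
```

The two printed values are the standard check values for each algorithm, which makes the sketch easy to verify against any other CRC implementation.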
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545026#comment-14545026 ]

Hadoop QA commented on HBASE-11927:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12733061/HBASE-11927-v8.patch
  against master branch at commit 9ba7337ac82d13b22a1b0c40edaba7873c0bd795.
  ATTACHMENT ID: 12733061

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests.
    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
    {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
    {color:green}+1 site{color}. The mvn site goal succeeds with this patch.
    {color:red}-1 core tests{color}. The patch failed these unit tests:
    {color:red}-1 core zombie tests{color}. There are 1 zombie test(s):
        at org.apache.activemq.transport.mqtt.MQTTTest.testPacketIdGeneratorNonCleanSession(MQTTTest.java:859)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14055//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14055//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14055//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14055//console

This message is automatically generated.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545634#comment-14545634 ]

stack commented on HBASE-11927:
-------------------------------

That failure is not yours [~appy] ... not sure why complaining no test when you've added some. Let me rerun to be sure.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545655#comment-14545655 ]

Anoop Sam John commented on HBASE-11927:
----------------------------------------

+1. Good work [~appy]. Good release notes too. Well summarized.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546237#comment-14546237 ]

Hudson commented on HBASE-11927:
--------------------------------

SUCCESS: Integrated in HBase-TRUNK #6484 (See [https://builds.apache.org/job/HBase-TRUNK/6484/])
HBASE-11927 Use Native Hadoop Library for HFile checksum. (Apekshit) (stack: rev 988593857f5150e5d337ad6b8bf3ba0479441f3e)
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContextBuilder.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546052#comment-14546052 ]

Apekshit Sharma commented on HBASE-11927:
-----------------------------------------

Changed InterfaceAudience of DataChecksum to include HBase.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546143#comment-14546143 ]

Hudson commented on HBASE-11927:
--------------------------------

SUCCESS: Integrated in HBase-1.2 #79 (See [https://builds.apache.org/job/HBase-1.2/79/])
HBASE-11927 Use Native Hadoop Library for HFile checksum. (Apekshit) (stack: rev 1cf85b3f7fd7a7d48894dc7d42dcf6978197f2f7)
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContextBuilder.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546004#comment-14546004 ]

Apekshit Sharma commented on HBASE-11927:
-----------------------------------------

Thanks [~stack] for committing.
Thanks everyone for helping all the way. [~anoop.hbase] [~eclark]
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545875#comment-14545875 ]

Hadoop QA commented on HBASE-11927:
-----------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12733162/HBASE-11927-v8.patch
  against master branch at commit 9ba7337ac82d13b22a1b0c40edaba7873c0bd795.
  ATTACHMENT ID: 12733162

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests.
    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
    {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
    {color:green}+1 site{color}. The mvn site goal succeeds with this patch.
    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14058//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14058//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14058//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14058//console

This message is automatically generated.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544553#comment-14544553 ]

Elliott Clark commented on HBASE-11927:
---------------------------------------

+1 looking forward to seeing the CPU improvements
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544518#comment-14544518 ]

stack commented on HBASE-11927:
-------------------------------

+1 on patch. Want to write a release note [~appy]?

Will commit tomorrow. Others may have comments on the patch.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544707#comment-14544707 ]

stack commented on HBASE-11927:
-------------------------------

hadoopqa is it [~appy] -- you could get ./dev/test-patch.sh to run locally but might as well have jenkins do the job for you. There is no shame in a checkstyle error or too long a line in a patch... Just address and resubmit.

For the checkstyle above, look at the report listed in hadoopqa output: https://builds.apache.org/job/PreCommit-HBASE-Build/14049//artifact/patchprocess/checkstyle-aggregate.html

Looking at files in your patch and what is reported in checkstyle report, these might be yours:

https://builds.apache.org/job/PreCommit-HBASE-Build/14049//artifact/patchprocess/checkstyle-aggregate.html#org.apache.hadoop.hbase.io.hfile.ChecksumUtil.java
https://builds.apache.org/job/PreCommit-HBASE-Build/14049//artifact/patchprocess/checkstyle-aggregate.html#org.apache.hadoop.hbase.util.ChecksumType.java

Just fixing the unused imports should be enough.
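For a quick local check of the line-length complaint before resubmitting, something like the following can flag overlong added lines in a patch file. This is a rough sketch only: it is not how test-patch.sh or hadoopqa implements the lineLengths check, the class name `PatchLineLengthSketch` is hypothetical, and measuring the line without its leading `+` is an assumption about how the limit of 100 is counted.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the HBase dev-support scripts.
final class PatchLineLengthSketch {

    /** Returns the added lines of a unified diff whose content is longer than maxLength characters. */
    static List<String> longAddedLines(List<String> patchLines, int maxLength) {
        List<String> offenders = new ArrayList<>();
        for (String line : patchLines) {
            // Added lines start with '+'. A "+++" prefix is the new-file header, not content.
            // Limitation: an added line whose own content starts with "++" is skipped as well.
            if (!line.startsWith("+") || line.startsWith("+++")) {
                continue;
            }
            // Assumption: the limit applies to the source line itself, without the diff marker.
            String content = line.substring(1);
            if (content.length() > maxLength) {
                offenders.add(content);
            }
        }
        return offenders;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PatchLineLengthSketch.java <patch file>
        List<String> patch = Files.readAllLines(Paths.get(args[0]));
        for (String offender : longAddedLines(patch, 100)) {
            System.out.println(offender);
        }
    }
}
```

Checkstyle findings such as unused imports are outside what this sketch covers; for those, the hadoopqa report remains the reference.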
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544634#comment-14544634 ]

Hadoop QA commented on HBASE-11927:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12732987/HBASE-11927-v5.patch
  against master branch at commit 9ba7337ac82d13b22a1b0c40edaba7873c0bd795.
  ATTACHMENT ID: 12732987

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests.
    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
    {color:red}-1 checkstyle{color}. The applied patch generated 1908 checkstyle errors (more than the master's current 1898 errors).
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
    +HFileBlock.FSReader hbr = new HFileBlock.FSReaderImpl(is, totalSize, (HFileSystem) fs, path, meta);
    + HFileBlock.FSReader hbr = new HFileBlock.FSReaderImpl(is, totalSize, (HFileSystem) fs, path, meta);
    {color:green}+1 site{color}. The mvn site goal succeeds with this patch.
    {color:red}-1 core tests{color}. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14049//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14049//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14049//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14049//console

This message is automatically generated.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544675#comment-14544675 ]

Apekshit Sharma commented on HBASE-11927:
-----------------------------------------

Is there a tool I can run before submitting a patch that can check style, line length, and other small stuff?
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544855#comment-14544855 ]

Hadoop QA commented on HBASE-11927:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12733018/HBASE-11927-v6.patch
against master branch at commit 9ba7337ac82d13b22a1b0c40edaba7873c0bd795.
ATTACHMENT ID: 12733018

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1908 checkstyle errors (more than the master's current 1898 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
	org.apache.hadoop.hbase.io.hfile.TestHFileBlock

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14051//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14051//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14051//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14051//console

This message is automatically generated.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544834#comment-14544834 ]

Hadoop QA commented on HBASE-11927:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12733035/HBASE-11927-v7.patch
against master branch at commit 9ba7337ac82d13b22a1b0c40edaba7873c0bd795.
ATTACHMENT ID: 12733035

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14053//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14053//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14053//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14053//console

This message is automatically generated.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541755#comment-14541755 ]

Anoop Sam John commented on HBASE-11927:
----------------------------------------

Patch looks good.

bq.DataChecksum.Type.valueOf(cktype.getCode())
Do we need a mapping function that maps the HBase ChecksumType to hadoop's DataChecksum type? I agree that the type codes are the same in both, so there is no issue now. Still, that may be cleaner IMO.

We had a fallback mechanism to the Java checksum when the hadoop library is not available. We are removing that now, so make the expectation clear in the Release Notes.

Good that we no longer expect block data to be a byte[] [this one - checksumObject.update(byte[],int,int)]... That will help in the offheap read path. :-)
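The mapping function suggested above could look roughly like the following. This is only a sketch under stated assumptions: both enums are self-contained stand-ins defined here for illustration (they are not the real HBase `ChecksumType` or hadoop `DataChecksum.Type` classes), and the name `toHadoop` is hypothetical. The point is that the translation is spelled out explicitly instead of relying on the two type codes happening to be equal.

```java
import java.util.EnumMap;
import java.util.Map;

public class ChecksumTypeMapping {
    /** Stand-in for the HBase-side checksum type (illustrative only). */
    public enum HBaseChecksumType { NULL, CRC32, CRC32C }

    /** Stand-in for the hadoop-side checksum type (illustrative only). */
    public enum HadoopChecksumType { NULL, CRC32, CRC32C }

    // Explicit table: each HBase type names the hadoop type it maps to.
    private static final Map<HBaseChecksumType, HadoopChecksumType> TO_HADOOP =
        new EnumMap<>(HBaseChecksumType.class);

    static {
        TO_HADOOP.put(HBaseChecksumType.NULL, HadoopChecksumType.NULL);
        TO_HADOOP.put(HBaseChecksumType.CRC32, HadoopChecksumType.CRC32);
        TO_HADOOP.put(HBaseChecksumType.CRC32C, HadoopChecksumType.CRC32C);
    }

    /** Hypothetical helper: fails loudly if a new HBase type is added without a mapping. */
    public static HadoopChecksumType toHadoop(HBaseChecksumType type) {
        HadoopChecksumType mapped = TO_HADOOP.get(type);
        if (mapped == null) {
            throw new IllegalArgumentException("No hadoop checksum type for " + type);
        }
        return mapped;
    }

    public static void main(String[] args) {
        for (HBaseChecksumType t : HBaseChecksumType.values()) {
            System.out.println(t + " -> " + toHadoop(t));
        }
    }
}
```

With a table like this, a change to either side's numeric codes would not silently change which algorithm is used.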
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542772#comment-14542772 ]

Elliott Clark commented on HBASE-11927:
---------------------------------------

Patch looks good. I'd mirror [~stack] and say let's not put something in HConstants. As we try to pull things apart, HConstants is an antipattern.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542872#comment-14542872 ]

Apekshit Sharma commented on HBASE-11927:
-----------------------------------------

I too felt like there were a bunch of things wrong in HConstants. I will move the default out of there, add the map as suggested by Anoop, fix the other things noted above, and re-upload the patch late tonight.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542142#comment-14542142 ]

Elliott Clark commented on HBASE-11927:
---------------------------------------

Does hadoop's crc code require native or does it do a fallback to java version ?
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542230#comment-14542230 ]

stack commented on HBASE-11927:
-------------------------------

bq. Does hadoop's crc code require native or does it do a fallback to java version ?

[~appy] Mind testing? Many deploys do not have the native libs available (misconfig., oversight, etc.).

bq. Do we need a mapping function which maps HBase ChecksumType to hadoop DataChecksum ?

[~anoop.hbase] I was wondering if we could not just strip the hbase checksumtype... there is nothing in it now. Could we just use hadoop's? [~appy] notes that we would need to update hadoop, adding ourselves to the list of projects on the limitedprivate line.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542274#comment-14542274 ]

Anoop Sam John commented on HBASE-11927:
----------------------------------------

bq.I was wondering if we could not just strip the hbase checksumtype... there is nothing in it now. Could we just use hadoops?

I was speaking mainly about the read path, where we have to handle the old HFiles (written with the HBase checksum type). So if we just use hadoop's type code, we will have to change the write path as well.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542310#comment-14542310 ]

stack commented on HBASE-11927:
-------------------------------

bq. On cleaning up ChecksumType, it did occur to me to remove it ... I thought it might be better to do that separately, avoiding cluttering in this patch and all.

Makes sense.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542302#comment-14542302 ]

Apekshit Sharma commented on HBASE-11927:
-----------------------------------------

Native lib is not a requirement. In many of my initial runs, there was no native lib unless I explicitly compiled it. If native is not present (checked [here|https://github.com/apache/hadoop/blob/cbf0ae742ae3db964550df11c4044d3e16013959/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java#L298]), it falls back to java's implementation initialized [here|http://hbase.apache.org/book.html#hadoop.native.lib].

On cleaning up ChecksumType, it did occur to me to remove it since there wasn't anything substantial left, but finding ~50 usages, I thought it might be better to do that separately, avoiding cluttering in this patch and all. A mapping function would be a good idea until then.
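For readers who want to see what a pure-Java checksum path computes when no native library is loaded, here is a minimal sketch. It uses only the JDK's `java.util.zip` classes (`CRC32C` needs Java 9 or later) and is not hadoop's `DataChecksum` code; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

public class JavaChecksumFallback {
    /** Hypothetical factory: picks a pure-Java implementation by algorithm name. */
    public static Checksum newChecksum(String algorithm) {
        switch (algorithm) {
            case "CRC32":
                return new CRC32();
            case "CRC32C":
                return new CRC32C();
            default:
                throw new IllegalArgumentException("Unknown checksum algorithm: " + algorithm);
        }
    }

    /** Computes the checksum of the whole array with the named algorithm. */
    public static long checksum(String algorithm, byte[] data) {
        Checksum sum = newChecksum(algorithm);
        sum.update(data, 0, data.length);
        return sum.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        // "123456789" is the conventional check input for CRC algorithms.
        System.out.printf("CRC32  = %08x%n", checksum("CRC32", data));   // cbf43926
        System.out.printf("CRC32C = %08x%n", checksum("CRC32C", data));  // e3069283
    }
}
```

A native implementation has to produce the same values for the same input; only the CPU cost differs, which is what the profiling graphs attached to this issue are about.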
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541428#comment-14541428 ]

Hadoop QA commented on HBASE-11927:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12732468/HBASE-11927-v4.patch
against master branch at commit befb46c4d5e7f2d5ce41199fbf9ca2fb7bf43cfc.
ATTACHMENT ID: 12732468

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1906 checkstyle errors (more than the master's current 1896 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
	org.apache.hadoop.hbase.io.hfile.TestHFileBlock
	org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
{color:red}-1 core zombie tests{color}. There are 5 zombie test(s):
	at org.apache.hadoop.hbase.io.encoding.TestChangingEncoding.testChangingEncodingWithCompaction(TestChangingEncoding.java:212)
	at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testStoreFileCacheOnWriteInternals(TestCacheOnWrite.java:270)
	at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testStoreFileCacheOnWrite(TestCacheOnWrite.java:472)
	at org.apache.phoenix.end2end.StatsCollectorIT.testCompactUpdatesStats(StatsCollectorIT.java:288)
	at org.apache.phoenix.end2end.StatsCollectorIT.testCompactUpdatesStats(StatsCollectorIT.java:241)
	at org.apache.hadoop.hbase.io.encoding.TestDataBlockEncoders.testSeekingOnSample(TestDataBlockEncoders.java:206)
	at org.apache.hadoop.hbase.io.encoding.TestEncodedSeekers.testEncodedSeeker(TestEncodedSeekers.java:122)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14034//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14034//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14034//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14034//console

This message is automatically generated.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541229#comment-14541229 ]

Hadoop QA commented on HBASE-11927:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12732399/HBASE-11927-v2.patch
against master branch at commit befb46c4d5e7f2d5ce41199fbf9ca2fb7bf43cfc.
ATTACHMENT ID: 12732399

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1904 checkstyle errors (more than the master's current 1896 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14032//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14032//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14032//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14032//console

This message is automatically generated.
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541293#comment-14541293 ]

stack commented on HBASE-11927:
-------------------------------

So, on a machine w/ hardware support, we spend 20% less CPU. Nice one [~appy].

Minor. I don't think we want to do this in HConstants:

    public static ChecksumType DEFAULT_CHECKSUM_TYPE = ChecksumType.CRC32C;

HConstants is a bit of an anti-pattern. It should have defines that are truly global. Better to keep constants with the code they are related to. Maybe in ChecksumType? (I suppose we need ChecksumType? We can't use hadoop's DataChecksum.Type? We'd break too much? Could maybe do in a followup patch.)

Nice test.

And to be clear, if an hfile is written with CRC32, we'll just read the checksum type out of the hfile and use that when verifying, so the change to the new checksum type should only apply to newly written files? At least that is how I read it.

If good, let's get this in. On commit I'll add a note to the perf section of the refguide (unless you want to) about making sure the native libs are available and that they are for sure working for you. We have this http://hbase.apache.org/book.html#hadoop.native.lib but we could do better, I'd say, if it's 20% or more.
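The read-path behaviour asked about above (verify with whatever checksum type the file recorded, and use the new default only when writing) can be sketched as follows. Everything here is an assumption made for illustration: the one-byte type codes, the `[code][checksum][payload]` layout, and all names are hypothetical and much simpler than the real HFile block format.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

public class StoredChecksumTypeSketch {
    // Hypothetical on-disk codes, chosen only for this sketch.
    public static final byte CRC32_CODE = 1;
    public static final byte CRC32C_CODE = 2;
    // New writes default to CRC32C; old blocks keep whatever code they carry.
    public static final byte DEFAULT_CODE = CRC32C_CODE;

    static Checksum forCode(byte code) {
        switch (code) {
            case CRC32_CODE:
                return new CRC32();
            case CRC32C_CODE:
                return new CRC32C();
            default:
                throw new IllegalArgumentException("Unknown checksum code: " + code);
        }
    }

    /** Hypothetical layout: [1-byte type code][4-byte checksum][payload]. */
    public static byte[] write(byte[] payload, byte code) {
        Checksum sum = forCode(code);
        sum.update(payload, 0, payload.length);
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + payload.length);
        buf.put(code).putInt((int) sum.getValue()).put(payload);
        return buf.array();
    }

    /** Verifies using the type code stored in the block, not the current default. */
    public static boolean verify(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        byte code = buf.get();
        int stored = buf.getInt();
        byte[] payload = new byte[buf.remaining()];
        buf.get(payload);
        Checksum sum = forCode(code);
        sum.update(payload, 0, payload.length);
        return stored == (int) sum.getValue();
    }

    public static void main(String[] args) {
        byte[] payload = "123456789".getBytes();
        byte[] oldBlock = write(payload, CRC32_CODE);   // written before the default flipped
        byte[] newBlock = write(payload, DEFAULT_CODE); // written after the default flipped
        System.out.println("old block verifies: " + verify(oldBlock));
        System.out.println("new block verifies: " + verify(newBlock));
    }
}
```

Because the reader never consults the default, flipping it affects only blocks written afterwards, which matches the reading given in the comment.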