[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540957#comment-14540957 ]
stack commented on HBASE-11927:
-------------------------------
A machine with hardware support should win us back at least another 10%
(20% @appy!)
[~appy], add a fat release note saying you are flipping the default. You can
do it by editing this issue; you will then see the 'release note' text box.
There are other places in the codebase where you should probably change the defaults too.
Doing a grep:
{code}
kalashnikov:hbase.git stack$ grep -r CRC32 hbase-*/src/main/java
hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java: public static final ChecksumType DEFAULT_CHECKSUM_TYPE = ChecksumType.CRC32;
hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContextBuilder.java: public static final ChecksumType DEFAULT_CHECKSUM_TYPE = ChecksumType.CRC32;
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java: public static final ChecksumType DEFAULT_CHECKSUM_TYPE = ChecksumType.CRC32;
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java.orig: public static final ChecksumType DEFAULT_CHECKSUM_TYPE = ChecksumType.CRC32;
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java: // Initialize checksum type from name. The names are CRC32, CRC32C, etc.
{code}
You will also want to write a simple unit test that verifies that when we
write new hfiles, they are indeed checksummed with CRC32C.
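
One way such a test could tell the two algorithms apart is to recompute the
checksum over the same bytes with both and see which one matches what was
stored. The following is only a rough sketch of that idea using the JDK's
java.util.zip classes (CRC32C needs Java 9+). It does not touch the HBase
hfile API; the class name ChecksumKindSketch and its helpers are made up for
illustration, and the "stored" value stands in for a checksum read back from a
freshly written file.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

// Hypothetical helper, not part of HBase: identifies which algorithm
// produced a stored checksum by recomputing it both ways.
public class ChecksumKindSketch {

    /** Runs the given checksum over the whole array and returns its value. */
    static long checksum(Checksum algo, byte[] data) {
        algo.reset();
        algo.update(data, 0, data.length);
        return algo.getValue();
    }

    static long crc32(byte[] data) {
        return checksum(new CRC32(), data);
    }

    static long crc32c(byte[] data) {
        return checksum(new CRC32C(), data);
    }

    /** Returns "CRC32C", "CRC32" or "UNKNOWN" depending on which algorithm reproduces the stored value. */
    static String identify(byte[] data, long stored) {
        if (stored == crc32c(data)) {
            return "CRC32C";
        }
        if (stored == crc32(data)) {
            return "CRC32";
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        byte[] block = "123456789".getBytes(StandardCharsets.US_ASCII);
        // Stand-in for a checksum read back from a newly written file.
        long stored = crc32c(block);
        System.out.println(identify(block, stored));
    }
}
```

A real test would presumably write an hfile with the default settings, read
the checksum type (or the raw checksum bytes) back, and assert on that; the
sketch above only shows the comparison step.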
> Use Native Hadoop Library for HFile checksum
> --------------------------------------------
>
> Key: HBASE-11927
> URL: https://issues.apache.org/jira/browse/HBASE-11927
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Apekshit Sharma
> Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch,
> HBASE-11927.patch, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg,
> compact-with-native.svg, compact-without-native.svg, crc32ct.svg
>
>
> Up in hadoop they have this change. Let me publish some graphs to show that
> it makes a difference (CRC is a massive amount of our CPU usage in my
> profiling of an upload because of compacting, flushing, etc.). We should
> also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in
> hbase but that is another issue for now.