[
https://issues.apache.org/jira/browse/HADOOP-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062591#comment-14062591
]
Todd Lipcon commented on HADOOP-10778:
--------------------------------------
I just took a look at running the benchmark locally, and couldn't reproduce the
results on my Linux Core i7 box. For bpc = 512:
java.version = 1.7.0_55
java.runtime.name = OpenJDK Runtime Environment
java.runtime.version = 1.7.0_55-b14
java.vm.version = 24.51-b03
java.vm.vendor = Oracle Corporation
java.vm.name = OpenJDK 64-Bit Server VM
java.vm.specification.version = 1.7
java.specification.version = 1.7
os.arch = amd64
os.name = Linux
os.version = 3.11.0-20-generic
Data Length = 64 MB
Trials = 5
Direct Buffer Performance Table (bpc = bytes-per-crc; throughput in MB/sec; #T = #Threads)
| bpc | #T | Zip | PureJava | % diff vs Zip | Native | % diff vs Zip | % diff vs PureJava |
| 512 | 1 | 973.4 | 1288.5 | 32.4% | 1660.5 | 70.6% | 28.9% |
| 512 | 2 | 946.1 | 1248.4 | 32.0% | 1619.1 | 71.1% | 29.7% |
| 512 | 4 | 931.2 | 1199.1 | 28.8% | 1576.6 | 69.3% | 31.5% |
| 512 | 8 | 762.1 | 683.9 | -10.3% | 1352.5 | 77.5% | 97.8% |
| 512 | 16 | 396.3 | 368.6 | -7.0% | 828.3 | 109.0% | 124.7% |
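The three % diff columns are consistent with a plain relative difference: PureJava over Zip, Native over Zip, and Native over PureJava. A minimal sketch that reproduces the first and fourth rows from the throughput figures above; the class and method names are hypothetical and not part of the Hadoop benchmark:

```java
import java.util.Locale;

// Hypothetical helper, not part of the Hadoop benchmark code.
final class PercentDiff {
    /** Relative difference of candidate over baseline, in percent. */
    static double percentDiff(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    /** Formats a percentage with one decimal place, as in the table. */
    static String format(double pct) {
        return String.format(Locale.ROOT, "%.1f%%", pct);
    }

    public static void main(String[] args) {
        // Throughput in MB/sec for bpc = 512, one thread (first table row).
        double zip = 973.4, pureJava = 1288.5, nativeCrc = 1660.5;
        System.out.println("PureJava vs Zip:    " + format(percentDiff(zip, pureJava)));
        System.out.println("Native vs Zip:      " + format(percentDiff(zip, nativeCrc)));
        System.out.println("Native vs PureJava: " + format(percentDiff(pureJava, nativeCrc)));
    }
}
```

Read this way, the 8- and 16-thread rows show PureJava falling behind Zip while Native stays well ahead of both.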
Also, I remembered that a long time ago I wrote a pipelined
(instruction-level-parallel) implementation of the bulk_verify_crc32 method:
https://github.com/toddlipcon/crc-workbench/blob/master/bulk_crc32.c#L308
It goes about twice as fast: using that benchmark I get about 3.1 GB/sec on the
same machine. So, if we actually care about performance of the zlib
polynomial, we should probably pull in that code rather than switch
implementations dynamically.
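To illustrate what "pipelined" means here: the CRCs of separate checksum chunks are independent of each other, so one loop can advance several of them at once and let the CPU overlap their table lookups. The following is only a sketch of that general idea in Java, using a byte-at-a-time table for the zlib polynomial; it is not the linked C code, which may be structured quite differently, and it makes no claim about speed. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch, not the crc-workbench or Hadoop implementation.
final class InterleavedCrc32 {
    // Standard reflected table for the zlib polynomial (0xEDB88320).
    private static final int[] TABLE = new int[256];
    static {
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    /**
     * Computes the CRC32 of three consecutive chunks of bpc bytes starting
     * at off, storing the results in out[0..2]. The three running CRCs have
     * no data dependency on each other, so their updates can overlap.
     */
    static void crc3(byte[] data, int off, int bpc, int[] out) {
        int c0 = ~0, c1 = ~0, c2 = ~0;
        int p1 = off + bpc, p2 = off + 2 * bpc;
        for (int i = 0; i < bpc; i++) {
            c0 = TABLE[(c0 ^ data[off + i]) & 0xFF] ^ (c0 >>> 8);
            c1 = TABLE[(c1 ^ data[p1 + i]) & 0xFF] ^ (c1 >>> 8);
            c2 = TABLE[(c2 ^ data[p2 + i]) & 0xFF] ^ (c2 >>> 8);
        }
        out[0] = ~c0;
        out[1] = ~c1;
        out[2] = ~c2;
    }

    /** Reference value from java.util.zip.CRC32, used to check crc3. */
    static int reference(byte[] data, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(data, off, len);
        return (int) crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "123456789123456789123456789".getBytes(StandardCharsets.US_ASCII);
        int[] out = new int[3];
        crc3(data, 0, 9, out);
        for (int j = 0; j < 3; j++) {
            System.out.println(Integer.toHexString(out[j])
                    + " == " + Integer.toHexString(reference(data, j * 9, 9)));
        }
    }
}
```

A real bulk verifier would also need to handle a trailing run of fewer than three chunks and compare each result against the stored checksum; both are omitted here.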
> Use NativeCrc32 only if it is faster
> ------------------------------------
>
> Key: HADOOP-10778
> URL: https://issues.apache.org/jira/browse/HADOOP-10778
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Tsz Wo Nicholas Sze
> Attachments: c10778_20140702.patch
>
>
> From the benchmark posted in [this
> comment|https://issues.apache.org/jira/browse/HDFS-6560?focusedCommentId=14044060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14044060],
> NativeCrc32 is slower than java.util.zip.CRC32 for Java 7 and above when
> bytesPerChecksum > 512.