[ https://issues.apache.org/jira/browse/HADOOP-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tsz Wo (Nicholas), SZE updated HADOOP-6148:
-------------------------------------------
Attachment: benchmarks20090715.txt
Changed the benchmark as shown below so that each run processes ~4GB of data.
{code}
for(int j = 10; j < 24; j += 2) {
  for(int k = 0; k < 4; k++) {
    // buffer length is 2^j + k bytes
    final int bytelen = (1 << j) + k;
    final byte[] b = new byte[bytelen];
    // number of passes chosen so that n * bytelen is ~4GB (2^32 bytes)
    final int n = (int)((1L << 32) / bytelen);
    ran.nextBytes(b);
    t.tick("ran.nextBytes, bytelen=" + bytelen);

    final SortedMap<Long, Checksum> rank = new TreeMap<Long, Checksum>();
    test(pure, b, n, t, rank);
    test(test, b, n, t, rank);
    test(zip, b, n, t, rank);
    System.out.println("rank = " + rank);

    // rank is sorted by key, so its first entry is counted as the fastest
    final Checksum c = rank.entrySet().iterator().next().getValue();
    fastest.put(c, fastest.get(c) + 1);
  }
}
{code}
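To make the "~4GB per run" figure concrete, here is a minimal sketch that reproduces only the size arithmetic from the loop above. The class name BenchmarkSizes and its helper methods are hypothetical and are not part of the attached TestCrc32Performance.java; no checksum is computed here.

```java
public class BenchmarkSizes {
    /** Buffer length for a given (j, k) pair: 2^j + k bytes. */
    public static int byteLength(int j, int k) {
        return (1 << j) + k;
    }

    /** Number of passes over the buffer, as in the benchmark: floor(2^32 / bytelen). */
    public static int iterations(int bytelen) {
        return (int) ((1L << 32) / bytelen);
    }

    /** Total bytes checksummed in one run; at most 2^32 because of the integer division. */
    public static long totalBytes(int bytelen) {
        return (long) iterations(bytelen) * bytelen;
    }

    public static void main(String[] args) {
        // Same loop bounds as the benchmark: 7 values of j times 4 values of k = 28 runs.
        for (int j = 10; j < 24; j += 2) {
            for (int k = 0; k < 4; k++) {
                final int bytelen = byteLength(j, k);
                System.out.println("bytelen=" + bytelen
                    + ", n=" + iterations(bytelen)
                    + ", total=" + totalBytes(bytelen));
            }
        }
    }
}
```

For power-of-two lengths (k = 0) the total is exactly 2^32 bytes; for the other lengths it falls short of 2^32 by less than one buffer length.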
benchmarks20090715.txt: new results
TestCrc32 is consistently faster than zip.CRC32, which in turn is faster than
PureJavaCrc32. TestCrc32 improves on PureJavaCrc32 by ~13%.
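For readers unfamiliar with what a pure Java CRC32 involves, the following is a minimal sketch of the textbook byte-at-a-time table-lookup algorithm. It is not the attached PureJavaCrc32 or TestCrc32 implementation, and the class name SimpleTableCrc32 is hypothetical. It only illustrates that the checksum can be computed without crossing into native code, and that it can be checked against java.util.zip.CRC32.

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class SimpleTableCrc32 implements Checksum {
    // Lookup table for the reflected CRC-32 polynomial 0xEDB88320.
    private static final int[] TABLE = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    // Running CRC kept in its inverted form; getValue() undoes the inversion.
    private int crc = 0xFFFFFFFF;

    @Override
    public void update(int b) {
        crc = (crc >>> 8) ^ TABLE[(crc ^ b) & 0xFF];
    }

    @Override
    public void update(byte[] b, int off, int len) {
        int local = crc;
        for (int i = off; i < off + len; i++) {
            local = (local >>> 8) ^ TABLE[(local ^ b[i]) & 0xFF];
        }
        crc = local;
    }

    @Override
    public long getValue() {
        return (~crc) & 0xFFFFFFFFL;
    }

    @Override
    public void reset() {
        crc = 0xFFFFFFFF;
    }

    public static void main(String[] args) {
        final byte[] data = "123456789".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        final Checksum mine = new SimpleTableCrc32();
        final Checksum zip = new CRC32();
        mine.update(data, 0, data.length);
        zip.update(data, 0, data.length);
        System.out.println("match = " + (mine.getValue() == zip.getValue()));
    }
}
```

Because the class implements java.util.zip.Checksum, it could be passed to a timing loop in the same way as the three implementations compared above. No performance claim is made for this sketch.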
> Implement a pure Java CRC32 calculator
> --------------------------------------
>
> Key: HADOOP-6148
> URL: https://issues.apache.org/jira/browse/HADOOP-6148
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Owen O'Malley
> Assignee: Todd Lipcon
> Attachments: benchmarks20090714.txt, benchmarks20090715.txt,
> crc32-results.txt, hadoop-5598-evil.txt, hadoop-5598-hybrid.txt,
> hadoop-5598.txt, hadoop-5598.txt, hdfs-297.txt, PureJavaCrc32.java,
> PureJavaCrc32.java, PureJavaCrc32.java, PureJavaCrc32.java,
> TestCrc32Performance.java, TestCrc32Performance.java,
> TestCrc32Performance.java, TestPureJavaCrc32.java
>
>
> We've seen a reducer writing 200MB to HDFS with replication = 1 spending a
> long time in crc calculation. In particular, it was spending 5 seconds in crc
> calculation out of a total of 6 for the write. I suspect that it is the
> java-jni border that is causing us grief.