[ https://issues.apache.org/jira/browse/HADOOP-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731739#action_12731739 ]
Todd Lipcon commented on HADOOP-6148:
-------------------------------------
Here are the results from my laptop. PureJavaTW is Tsz Wo's updated version.
First, 64-bit JVM:
||num bytes||NewLoopOnly MB/sec||NewInnerOnly MB/sec||NewPureJava MB/sec||PureJava MB/sec||PureJavaTW MB/sec||Native MB/sec||
| 1 |79.721 |87.113 |111.889 |112.642 |88.684 |9.001 |
| 2 |91.820 |130.441 |130.855 |133.645 |131.948 |17.536 |
| 4 |167.315 |223.087 |198.984 |243.936 |198.387 |33.488 |
| 8 |188.153 |254.671 |243.899 |248.944 |242.746 |59.258 |
| 16 |311.118 |353.191 |350.826 |327.231 |333.589 |99.213 |
| 32 |401.696 |416.339 |417.406 |427.676 |408.610 |148.742 |
| 64 |499.747 |483.148 |445.437 |472.814 |487.970 |208.478 |
| 128 |530.055 |520.043 |505.709 |513.645 |457.080 |253.756 |
| 256 |489.407 |541.459 |523.867 |547.794 |510.867 |283.360 |
| 512 |561.871 |528.383 |528.368 |553.071 |530.421 |300.134 |
| 1024 |579.227 |549.401 |537.941 |551.488 |536.391 |293.307 |
| 2048 |586.443 |551.685 |540.289 |564.328 |539.412 |319.766 |
| 4096 |608.470 |573.333 |560.746 |586.014 |560.349 |332.661 |
| 8192 |590.123 |554.975 |543.089 |424.834 |517.325 |322.192 |
| 16384 |583.385 |539.704 |542.656 |567.484 |542.567 |324.026 |
| 32768 |583.592 |551.508 |533.585 |561.559 |529.811 |321.858 |
| 65536 |584.476 |553.217 |537.679 |544.507 |512.978 |310.739 |
| 131072 |548.941 |529.097 |534.430 |564.858 |533.955 |324.626 |
| 262144 |584.379 |551.733 |536.386 |564.063 |479.038 |324.117 |
| 524288 |583.262 |553.893 |536.770 |563.518 |532.404 |324.924 |
| 1048576 |581.947 |550.572 |533.850 |561.049 |512.846 |294.452 |
| 2097152 |566.543 |534.695 |484.256 |551.693 |527.730 |320.850 |
| 4194304 |569.545 |537.748 |520.731 |547.608 |522.762 |318.084 |
| 8388608 |593.932 |563.233 |530.600 |571.098 |545.905 |310.115 |
| 16777216 |573.095 |560.361 |545.069 |576.036 |529.475 |331.984 |
And 32-bit JVM on the same machine:
||num bytes||NewLoopOnly MB/sec||NewInnerOnly MB/sec||NewPureJava MB/sec||PureJava MB/sec||PureJavaTW MB/sec||Native MB/sec||
| 1 |56.832 |51.138 |60.116 |60.826 |47.624 |7.612 |
| 2 |84.111 |82.710 |83.672 |81.422 |82.630 |15.075 |
| 4 |167.204 |138.900 |147.189 |163.798 |147.762 |30.063 |
| 8 |175.720 |177.396 |184.865 |180.241 |186.981 |54.772 |
| 16 |227.769 |257.112 |245.795 |247.928 |245.016 |94.794 |
| 32 |279.098 |336.251 |267.332 |298.998 |286.251 |147.808 |
| 64 |304.806 |381.515 |321.722 |339.217 |318.546 |207.577 |
| 128 |316.745 |433.474 |339.159 |356.571 |337.731 |255.225 |
| 256 |331.208 |454.923 |315.551 |374.849 |342.356 |291.948 |
| 512 |333.462 |451.160 |351.006 |374.545 |344.597 |312.074 |
| 1024 |332.338 |462.433 |350.361 |375.159 |359.086 |324.510 |
| 2048 |329.805 |472.755 |361.305 |379.140 |338.142 |317.101 |
| 4096 |326.613 |466.729 |349.653 |345.593 |337.487 |317.728 |
| 8192 |313.368 |458.838 |357.077 |384.962 |348.174 |332.353 |
| 16384 |338.060 |448.738 |341.744 |371.819 |337.915 |295.606 |
| 32768 |301.724 |451.107 |346.410 |381.720 |322.610 |332.183 |
| 65536 |337.599 |472.797 |328.112 |383.809 |324.456 |336.955 |
| 131072 |338.017 |471.498 |351.307 |383.234 |350.165 |338.728 |
| 262144 |338.063 |472.881 |351.411 |383.875 |350.652 |338.079 |
| 524288 |338.175 |471.574 |349.477 |381.680 |348.984 |334.452 |
| 1048576 |333.453 |460.829 |343.706 |381.459 |346.565 |334.384 |
| 2097152 |335.291 |465.923 |347.260 |374.896 |330.102 |330.254 |
| 4194304 |332.436 |460.711 |340.488 |378.804 |346.389 |334.329 |
| 8388608 |334.700 |464.714 |347.837 |378.336 |346.230 |324.550 |
| 16777216 |316.521 |431.736 |342.807 |373.638 |341.928 |328.748 |
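It seems to me that, on the 64-bit JVM, the pure-Java implementations are
mostly within the margin of error of one another at the sizes that are most
often exercised (128 to 256 bytes). On the 32-bit JVM, NewInnerOnly wins by a
reasonable amount: roughly 433 to 455 MB/sec at those sizes, versus 316 to
375 MB/sec for the other pure-Java versions.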
I say we commit the NewInnerOnly version.
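For anyone who wants to reproduce numbers of this shape: TestCrc32Performance.java is attached to the issue but not shown in this message, so the following is only a minimal sketch of how MB/sec per buffer size could be measured, not the attached harness. The class name Crc32ThroughputSketch, the method measureMBPerSec, the trial counts, and the warm-up scheme are all assumptions made for illustration. It times java.util.zip.CRC32 through the Checksum interface; any other Checksum implementation could be passed in instead.

```java
import java.util.Random;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical sketch; not the TestCrc32Performance.java attached to the issue.
public class Crc32ThroughputSketch {

    /** Returns throughput in MB/sec when checksumming a size-byte buffer trials times. */
    static double measureMBPerSec(Checksum crc, int size, int trials) {
        byte[] data = new byte[size];
        new Random(42).nextBytes(data); // fixed seed so every run sees the same input

        // Untimed warm-up pass so the JIT has a chance to compile update().
        for (int i = 0; i < trials; i++) {
            crc.reset();
            crc.update(data, 0, size);
        }

        long start = System.nanoTime();
        for (int i = 0; i < trials; i++) {
            crc.reset();
            crc.update(data, 0, size);
        }
        // Guard against a zero interval on very coarse clocks.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);

        double megabytes = (double) size * trials / (1024.0 * 1024.0);
        return megabytes / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        // Same doubling sequence of sizes as the tables: 1 byte up to 16 MB.
        for (int size = 1; size <= 16 * 1024 * 1024; size *= 2) {
            // Aim for about 8 MB of input per size, with at least 10 trials.
            int trials = Math.max(10, (8 * 1024 * 1024) / size);
            double rate = measureMBPerSec(new CRC32(), size, trials);
            System.out.printf("| %d |%.3f |%n", size, rate);
        }
    }
}
```

A single timed pass like this is noisy, which is consistent with the occasional outliers in the tables; averaging several runs per size would give steadier figures.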
> Implement a pure Java CRC32 calculator
> --------------------------------------
>
> Key: HADOOP-6148
> URL: https://issues.apache.org/jira/browse/HADOOP-6148
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Owen O'Malley
> Assignee: Todd Lipcon
> Attachments: benchmarks20090714.txt, benchmarks20090715.txt,
> crc32-results.txt, hadoop-5598-evil.txt, hadoop-5598-hybrid.txt,
> hadoop-5598.txt, hadoop-5598.txt, hdfs-297.txt, PureJavaCrc32.java,
> PureJavaCrc32.java, PureJavaCrc32.java, PureJavaCrc32.java,
> PureJavaCrc32New.java, PureJavaCrc32NewInner.java, PureJavaCrc32NewLoop.java,
> TestCrc32Performance.java, TestCrc32Performance.java,
> TestCrc32Performance.java, TestCrc32Performance.java, TestPureJavaCrc32.java
>
>
> We've seen a reducer writing 200MB to HDFS with replication = 1 spending a
> long time in crc calculation. In particular, it was spending 5 seconds in crc
> calculation out of a total of 6 for the write. I suspect that it is the
> java-jni border that is causing us grief.
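To illustrate what a checksum that never crosses the java-jni border looks like, here is a minimal sketch of a table-driven CRC-32 written entirely in Java. It is the textbook one-byte-at-a-time algorithm (reflected polynomial 0xEDB88320), not any of the PureJavaCrc32 variants attached to this issue, which are not shown in this message and may be structured differently. The class name SimpleTableCrc32 is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Checksum;

// Hypothetical sketch of a textbook table-driven CRC-32; not the attached PureJavaCrc32.java.
public class SimpleTableCrc32 implements Checksum {

    // 256-entry lookup table for the reflected CRC-32 polynomial 0xEDB88320.
    private static final int[] TABLE = new int[256];
    static {
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = ((c & 1) != 0) ? (c >>> 1) ^ 0xEDB88320 : (c >>> 1);
            }
            TABLE[n] = c;
        }
    }

    // Running register, kept in its inverted form between calls.
    private int crc = 0xFFFFFFFF;

    @Override
    public void update(int b) {
        crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
    }

    @Override
    public void update(byte[] b, int off, int len) {
        int local = crc; // work on a local copy inside the loop
        for (int i = off; i < off + len; i++) {
            local = TABLE[(local ^ b[i]) & 0xFF] ^ (local >>> 8);
        }
        crc = local;
    }

    @Override
    public long getValue() {
        // Final inversion, widened to an unsigned 32-bit value.
        return (~crc) & 0xFFFFFFFFL;
    }

    @Override
    public void reset() {
        crc = 0xFFFFFFFF;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        SimpleTableCrc32 mine = new SimpleTableCrc32();
        mine.update(data, 0, data.length);
        java.util.zip.CRC32 reference = new java.util.zip.CRC32();
        reference.update(data, 0, data.length);
        // Both values should be cbf43926, the standard CRC-32 check value.
        System.out.println(Long.toHexString(mine.getValue()));
        System.out.println(Long.toHexString(reference.getValue()));
    }
}
```

Because the class implements java.util.zip.Checksum, it can be swapped in wherever a Checksum is accepted, which makes a side-by-side timing against java.util.zip.CRC32 straightforward.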