[ https://issues.apache.org/jira/browse/HADOOP-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735806#action_12735806 ]
Tsz Wo (Nicholas), SZE commented on HADOOP-6166:
------------------------------------------------
Got a different story after updating to the latest JDK.
java.version = 1.6.0_14
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_14-b08
java.vm.version = 14.0-b16
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = amd64
os.name = Linux
os.version = 2.6.9-55.ELsmp
||num bytes||PureJavaCrc32 MB/sec||PureJavaCrc32New MB/sec||Crc32_3_2 MB/sec||Crc32_4_3 MB/sec||Crc32_5_5 MB/sec||Crc32_6_6 MB/sec||Crc32_8_8 MB/sec||Crc32_12_12 MB/sec||
| 8 | 190.202 | 170.445 | 170.560 | 228.872 | 215.769 | 206.454 | 231.685 | 214.802 |
| 16 | 257.234 | 209.434 | 225.336 | 267.450 | 263.165 | 187.785 | 253.917 | 232.935 |
| 32 | 309.992 | 271.358 | 243.099 | 309.840 | 309.997 | 304.166 | 319.013 | 270.716 |
| 64 | 348.461 | 326.343 | 265.435 | 338.049 | 330.947 | 333.372 | 358.959 | 334.240 |
| 128 | 369.745 | 362.989 | 271.531 | 354.615 | 382.880 | 371.822 | 383.919 | 370.870 |
| 256 | 382.773 | 385.201 | 279.028 | 364.379 | 402.521 | 378.506 | 400.110 | 399.904 |
| 512 | 384.597 | 397.015 | 279.898 | 364.408 | 406.135 | 389.814 | 407.351 | 405.674 |
| 1024 | 390.181 | 405.577 | 281.035 | 364.413 | 412.159 | 395.506 | 408.046 | 416.599 |
| 2048 | 392.820 | 408.548 | 275.382 | 360.259 | 412.941 | 395.982 | 409.498 | 422.795 |
| 4096 | 392.362 | 408.593 | 273.375 | 355.012 | 414.489 | 397.885 | 410.857 | 422.115 |
| 8192 | 393.152 | 409.355 | 273.973 | 355.846 | 415.358 | 398.532 | 411.701 | 423.370 |
| 16384 | 393.094 | 409.406 | 274.759 | 355.500 | 415.657 | 398.542 | 411.813 | 422.864 |
| 32768 | 392.515 | 408.989 | 276.169 | 357.135 | 415.965 | 400.295 | 411.622 | 422.887 |
| 65536 | 393.323 | 408.997 | 276.594 | 357.448 | 416.075 | 400.896 | 411.966 | 422.850 |
| 131072 | 393.531 | 408.982 | 276.566 | 357.490 | 416.059 | 400.959 | 412.037 | 422.953 |
| 262144 | 393.638 | 409.070 | 276.585 | 357.407 | 416.030 | 401.040 | 412.046 | 423.034 |
| 524288 | 393.629 | 408.982 | 276.511 | 357.462 | 416.123 | 400.994 | 411.924 | 423.010 |
| 1048576 | 393.652 | 408.943 | 276.397 | 357.050 | 415.844 | 400.785 | 411.927 | 422.808 |
| 2097152 | 393.408 | 408.558 | 276.024 | 356.452 | 415.633 | 400.296 | 411.594 | 422.426 |
| 4194304 | 391.575 | 405.148 | 275.157 | 354.772 | 413.680 | 397.834 | 409.809 | 420.100 |
| 8388608 | 389.204 | 404.179 | 273.648 | 351.896 | 411.007 | 395.309 | 407.030 | 417.661 |
| 16777216 | 388.753 | 403.422 | 273.343 | 351.298 | 410.380 | 394.783 | 406.396 | 416.995 |
The above table makes more sense, since it is easy to tell from the code that
Crc32_N_N for N > 4 is more efficient than PureJavaCrc32 (i.e. Crc32_4_4).
Note that N cannot be increased arbitrarily; otherwise, the tables may not fit
into the CPU cache, as explained previously by Scott. (I tried Crc32_16_16, but
it got worse.)
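
To illustrate the trade-off, here is a minimal sketch of a table-driven CRC32 that
consumes N bytes per loop iteration using N lookup tables. It is not the code from
the attached patches: the class name SlicingCrc32Sketch and its methods are made up
for this example, and it assumes that Crc32_N_N means "N bytes per iteration, N
tables". Under that assumption each table holds 256 ints (1 KB), so N tables occupy
N KB, which is why a larger N puts more pressure on the cache.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical sketch only; not the implementation from the HADOOP-6166 patches. */
final class SlicingCrc32Sketch {
    private static final int POLY = 0xEDB88320; // reflected CRC-32 polynomial

    /** Builds n tables of 256 ints; t[j][i] is the CRC of byte i followed by j zero bytes. */
    static int[][] buildTables(int n) {
        if (n < 4) throw new IllegalArgumentException("this sketch needs n >= 4");
        int[][] t = new int[n][256];
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ POLY : c >>> 1;
            }
            t[0][i] = c; // classic byte-at-a-time table
        }
        for (int j = 1; j < n; j++) {
            for (int i = 0; i < 256; i++) {
                int prev = t[j - 1][i];
                t[j][i] = (prev >>> 8) ^ t[0][prev & 0xFF];
            }
        }
        return t;
    }

    /** Computes the CRC32 of b, consuming t.length bytes per main-loop iteration. */
    static long crc(byte[] b, int[][] t) {
        int n = t.length;
        int crc = 0xFFFFFFFF;
        int off = 0;
        for (; b.length - off >= n; off += n) {
            // Fold the running CRC into the first four bytes of the block.
            int x = crc ^ ((b[off] & 0xFF)
                    | (b[off + 1] & 0xFF) << 8
                    | (b[off + 2] & 0xFF) << 16
                    | (b[off + 3] & 0xFF) << 24);
            int next = 0;
            // Byte k of the block is followed by n-1-k more bytes, so it uses table n-1-k.
            for (int k = 0; k < 4; k++) {
                next ^= t[n - 1 - k][(x >>> (8 * k)) & 0xFF];
            }
            for (int k = 4; k < n; k++) {
                next ^= t[n - 1 - k][b[off + k] & 0xFF];
            }
            crc = next;
        }
        // Remaining bytes: plain one-byte-at-a-time update.
        for (; off < b.length; off++) {
            crc = (crc >>> 8) ^ t[0][(crc ^ b[off]) & 0xFF];
        }
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "The quick brown fox jumps over the lazy dog".getBytes(StandardCharsets.US_ASCII);
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);
        for (int n : new int[] {4, 8, 12, 16}) {
            int[][] tables = buildTables(n);
            System.out.printf("N=%d tables=%d KB matches java.util.zip.CRC32: %b%n",
                    n, n * 256 * 4 / 1024, crc(data, tables) == reference.getValue());
        }
    }
}
```

The inner loops are left as loops for readability; an implementation tuned for speed
would presumably unroll them for a fixed N.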
As shown above, Crc32_12_12 shows improvements of 7% and 26% on my 64-bit and
32-bit machines, respectively, with jdk 1.6.0_14-b08. I cannot explain why the
numbers were generally better with the 1.6.0_10-b33 64-bit VM. Could it be a
JDK-specific feature or bug?
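
For anyone who wants to reproduce this kind of comparison, below is a rough sketch
of a throughput measurement plus the percentage calculation. It is not the
benchmark used for the attached results: the class CrcThroughputSketch and its
methods are hypothetical, and java.util.zip.CRC32 is used only as a stand-in for
any java.util.zip.Checksum implementation. The two constants in main are the
PureJavaCrc32 and Crc32_12_12 values from the 1048576-byte row of the table in this
comment.

```java
import java.util.Random;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical measurement sketch; not the benchmark behind the attached results. */
final class CrcThroughputSketch {

    /** Returns MB/sec for checksumming about totalBytes in chunks of size bytes. */
    static double mbPerSec(Checksum sum, int size, long totalBytes) {
        byte[] data = new byte[size];
        new Random(0).nextBytes(data);
        long rounds = Math.max(1, totalBytes / size);
        // Warm-up pass so the JIT has a chance to compile the hot loop.
        for (long i = 0; i < rounds; i++) {
            sum.update(data, 0, size);
        }
        sum.reset();
        long start = System.nanoTime();
        for (long i = 0; i < rounds; i++) {
            sum.update(data, 0, size);
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        double megabytes = (double) rounds * size / (1024.0 * 1024.0);
        return megabytes / (elapsedNanos / 1e9);
    }

    /** Relative improvement of candidate over base, in percent. */
    static double improvementPercent(double base, double candidate) {
        return (candidate / base - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        for (int size = 8; size <= 16 * 1024 * 1024; size *= 2) {
            System.out.printf("| %d | %.3f |%n", size, mbPerSec(new CRC32(), size, 64L << 20));
        }
        // 64-bit numbers at 1048576 bytes: PureJavaCrc32 vs. Crc32_12_12.
        System.out.printf("improvement: %.1f%%%n", improvementPercent(393.652, 422.808));
    }
}
```

A single timed pass like this is noisy; repeating each measurement and taking the
best or median run would give more stable numbers.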
> Improve PureJavaCrc32
> ---------------------
>
> Key: HADOOP-6166
> URL: https://issues.apache.org/jira/browse/HADOOP-6166
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Tsz Wo (Nicholas), SZE
> Attachments: c6166_20090722.patch, c6166_20090722_benchmark_32VM.txt,
> c6166_20090722_benchmark_64VM.txt, c6166_20090727.patch
>
>
> Got some ideas to improve CRC32 calculation.