[ https://issues.apache.org/jira/browse/HADOOP-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735806#action_12735806 ]

Tsz Wo (Nicholas), SZE commented on HADOOP-6166:
------------------------------------------------

Got a different story after updating to the latest JDK.

java.version = 1.6.0_14
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_14-b08
java.vm.version = 14.0-b16
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = amd64
os.name = Linux
os.version = 2.6.9-55.ELsmp
||num bytes||PureJavaCrc32 MB/sec||PureJavaCrc32New MB/sec||Crc32_3_2 MB/sec||Crc32_4_3 MB/sec||Crc32_5_5 MB/sec||Crc32_6_6 MB/sec||Crc32_8_8 MB/sec||Crc32_12_12 MB/sec||
|        8 |  190.202 |  170.445 |  170.560 |  228.872 |  215.769 |  206.454 |  231.685 |  214.802 |
|       16 |  257.234 |  209.434 |  225.336 |  267.450 |  263.165 |  187.785 |  253.917 |  232.935 |
|       32 |  309.992 |  271.358 |  243.099 |  309.840 |  309.997 |  304.166 |  319.013 |  270.716 |
|       64 |  348.461 |  326.343 |  265.435 |  338.049 |  330.947 |  333.372 |  358.959 |  334.240 |
|      128 |  369.745 |  362.989 |  271.531 |  354.615 |  382.880 |  371.822 |  383.919 |  370.870 |
|      256 |  382.773 |  385.201 |  279.028 |  364.379 |  402.521 |  378.506 |  400.110 |  399.904 |
|      512 |  384.597 |  397.015 |  279.898 |  364.408 |  406.135 |  389.814 |  407.351 |  405.674 |
|     1024 |  390.181 |  405.577 |  281.035 |  364.413 |  412.159 |  395.506 |  408.046 |  416.599 |
|     2048 |  392.820 |  408.548 |  275.382 |  360.259 |  412.941 |  395.982 |  409.498 |  422.795 |
|     4096 |  392.362 |  408.593 |  273.375 |  355.012 |  414.489 |  397.885 |  410.857 |  422.115 |
|     8192 |  393.152 |  409.355 |  273.973 |  355.846 |  415.358 |  398.532 |  411.701 |  423.370 |
|    16384 |  393.094 |  409.406 |  274.759 |  355.500 |  415.657 |  398.542 |  411.813 |  422.864 |
|    32768 |  392.515 |  408.989 |  276.169 |  357.135 |  415.965 |  400.295 |  411.622 |  422.887 |
|    65536 |  393.323 |  408.997 |  276.594 |  357.448 |  416.075 |  400.896 |  411.966 |  422.850 |
|   131072 |  393.531 |  408.982 |  276.566 |  357.490 |  416.059 |  400.959 |  412.037 |  422.953 |
|   262144 |  393.638 |  409.070 |  276.585 |  357.407 |  416.030 |  401.040 |  412.046 |  423.034 |
|   524288 |  393.629 |  408.982 |  276.511 |  357.462 |  416.123 |  400.994 |  411.924 |  423.010 |
|  1048576 |  393.652 |  408.943 |  276.397 |  357.050 |  415.844 |  400.785 |  411.927 |  422.808 |
|  2097152 |  393.408 |  408.558 |  276.024 |  356.452 |  415.633 |  400.296 |  411.594 |  422.426 |
|  4194304 |  391.575 |  405.148 |  275.157 |  354.772 |  413.680 |  397.834 |  409.809 |  420.100 |
|  8388608 |  389.204 |  404.179 |  273.648 |  351.896 |  411.007 |  395.309 |  407.030 |  417.661 |
| 16777216 |  388.753 |  403.422 |  273.343 |  351.298 |  410.380 |  394.783 |  406.396 |  416.995 |

The above table makes more sense, since it is easy to tell from the code that 
Crc32_N_N for N > 4 is more efficient than PureJavaCrc32 (i.e. Crc32_4_4).  
Note that N cannot be increased arbitrarily.  Otherwise, the tables may not fit 
into the CPU cache, as explained previously by Scott.  (Tried Crc32_16_16, but it 
got worse.)
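
For readers without the patch at hand, here is a minimal sketch of the table-driven ("slicing-by-N") idea. It assumes that Crc32_N_N means N input bytes consumed per loop iteration using N lookup tables; that reading, the class name SlicingCrc32Sketch and its method names are illustrative assumptions and are not taken from the attached patches. Each table holds 256 ints (1 KB), so N tables occupy about N KB, which is why a large N starts to compete for cache space. The generic inner loop here is for clarity only; a tuned implementation would presumably unroll it for a fixed N.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration; not the code from the HADOOP-6166 patches.
final class SlicingCrc32Sketch {
    private static final int POLY = 0xEDB88320; // reflected CRC-32 polynomial

    /** Builds n lookup tables of 256 ints each (about n KB in total). */
    static int[][] buildTables(int n) {
        int[][] t = new int[n][256];
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ POLY : c >>> 1;
            }
            t[0][i] = c; // classic one-byte-at-a-time table
        }
        for (int k = 1; k < n; k++) {
            for (int i = 0; i < 256; i++) {
                int prev = t[k - 1][i];
                // t[k][i] is t[k-1][i] advanced by one more zero byte
                t[k][i] = (prev >>> 8) ^ t[0][prev & 0xff];
            }
        }
        return t;
    }

    /** CRC-32 of b, consuming n bytes per iteration with n tables (n >= 4). */
    static long crc32(byte[] b, int n) {
        if (n < 4) {
            throw new IllegalArgumentException("this sketch needs n >= 4");
        }
        int[][] t = buildTables(n); // rebuilt per call to keep the sketch short
        int crc = 0xFFFFFFFF;
        int off = 0;
        for (; off + n <= b.length; off += n) {
            int next = 0;
            for (int j = 0; j < n; j++) {
                int x = b[off + j] & 0xff;
                if (j < 4) {
                    // only the first four bytes are mixed with the running CRC
                    x ^= (crc >>> (8 * j)) & 0xff;
                }
                next ^= t[n - 1 - j][x];
            }
            crc = next;
        }
        for (; off < b.length; off++) { // tail: one byte at a time
            crc = (crc >>> 8) ^ t[0][(crc ^ b[off]) & 0xff];
        }
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);
        for (int n : new int[] {4, 8, 12, 16}) {
            System.out.printf("n=%d tables=%dKB match=%b%n",
                    n, n, crc32(data, n) == reference.getValue());
        }
    }
}
```

The main method only checks agreement with java.util.zip.CRC32; it says nothing about speed, which depends on the JVM and the cache behaviour discussed above.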

With jdk 1.6.0_14-b08, Crc32_12_12 improves on PureJavaCrc32 by about 7% on my 
64-bit machine (table above) and by about 26% on my 32-bit machine.  I cannot 
explain why the numbers were generally better with the 1.6.0_10-b33 64-bit VM.  
A specific JDK feature or bug?
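
As a rough check of the 64-bit figure, the relative gain can be recomputed from one table row. The sketch below uses the 131072-byte row (393.531 MB/sec for PureJavaCrc32, 422.953 MB/sec for Crc32_12_12); the choice of row and the helper name percentGain are assumptions made for illustration, since the comment does not say which row the quoted percentage came from.

```java
// Hypothetical helper; the name and the chosen table row are illustrative only.
final class ImprovementSketch {
    /** Relative throughput gain of candidate over baseline, in percent. */
    static double percentGain(double baselineMBps, double candidateMBps) {
        return (candidateMBps / baselineMBps - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Values copied from the 131072-byte row of the 64-bit table.
        double gain = percentGain(393.531, 422.953);
        System.out.printf("Crc32_12_12 vs PureJavaCrc32: %.1f%%%n", gain);
    }
}
```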

> Improve PureJavaCrc32
> ---------------------
>
>                 Key: HADOOP-6166
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6166
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: c6166_20090722.patch, c6166_20090722_benchmark_32VM.txt, 
> c6166_20090722_benchmark_64VM.txt, c6166_20090727.patch
>
>
> Got some ideas to improve CRC32 calculation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
