mapleFU commented on PR #14351: URL: https://github.com/apache/arrow/pull/14351#issuecomment-1353512353
> * check and clean up CRC32 implementation to follow the code style (should be mostly mechanical)
> * since we determined that the checksum may apply to all page kinds, change the option names to not mention data pages :-)

That's fine. The CRC code is pasted from cyrus-imapd, with some modifications (it uses Arrow's internal library rather than cyrus-imapd's), so you can just diff the code to see the changes. Optimizations such as SIMD could be introduced later. Perhaps: https://www.boost.org/doc/libs/1_73_0/doc/html/crc/crc_optimal.html

To be honest, I think maintaining a CRC implementation pasted from zstd or cyrus-imapd is really tricky. But maybe we can keep my implementation for now; at least it's correct :)

As for the option names, parquet-mr seems to use `PageWriteChecksum` and `PageVerifyChecksum`, and I think those would be better.

Thank you for your help and your patience.
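For anyone who wants to sanity-check a CRC32 routine independently of the pasted code, here is a minimal table-driven sketch. It assumes the standard CRC-32 variant (reflected polynomial 0xEDB88320, the one used by gzip). `MakeCrc32Table`, `ReferenceCrc32` and `CheckKnownVector` are hypothetical names for illustration only, not the functions in this PR, and no SIMD or slicing optimization is attempted.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Build the 256-entry lookup table for the reflected CRC-32 polynomial
// 0xEDB88320 (assumption: the standard gzip-style CRC-32).
inline std::array<uint32_t, 256> MakeCrc32Table() {
  std::array<uint32_t, 256> table{};
  for (uint32_t i = 0; i < 256; ++i) {
    uint32_t c = i;
    for (int bit = 0; bit < 8; ++bit) {
      c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
    }
    table[i] = c;
  }
  return table;
}

// Hypothetical reference routine (not the PR's API): byte-at-a-time CRC-32.
// `prev` is the CRC of the data seen so far (0 for a fresh stream), so the
// checksum can be computed incrementally over several buffers.
inline uint32_t ReferenceCrc32(uint32_t prev, const void* data, size_t length) {
  static const std::array<uint32_t, 256> kTable = MakeCrc32Table();
  const auto* bytes = static_cast<const uint8_t*>(data);
  uint32_t crc = prev ^ 0xFFFFFFFFu;  // undo the final XOR of the previous call
  for (size_t i = 0; i < length; ++i) {
    crc = kTable[(crc ^ bytes[i]) & 0xFFu] ^ (crc >> 8);
  }
  return crc ^ 0xFFFFFFFFu;
}

// Compare against the well-known CRC-32 check value for "123456789".
inline void CheckKnownVector() {
  assert(ReferenceCrc32(0, "123456789", 9) == 0xCBF43926u);
}
```

A slow reference like this is only useful for cross-checking an optimized implementation on random buffers; it is not meant to replace it.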
