itschrispeck commented on PR #12539: URL: https://github.com/apache/pinot/pull/12539#issuecomment-1992606880
Updated to use safe decompressor. This will move src to `src.limit()` instead of based on bytes read ([ref](https://github.com/lz4/lz4-java/blob/master/src/java/net/jpountz/lz4/LZ4SafeDecompressor.java#L149)). I wasn't able to figure out a good way to round trip unit test this. I've added a comment on why the safe decompressor is chosen to hopefully avoid refactoring into the fast decompressor. With more understanding of the LZ4 algorithm I could probably edit some compressed data's bytes to cause this, but given that fast decompressor is deprecated I'd prefer if we can merge this without the test. I did a basic validation of the underlying decompress implementations. In general, perf differences are a wash and vary based on host/input: ``` +--------------+------------------------------+--------------+----------------+--------------+--------------+ |Source |Function Benchmarked | Total Seconds| Iterations/sec| ns/Iteration| % of default| +--------------+------------------------------+--------------+----------------+--------------+--------------+ |Normal Text |LZ4_decompress_safe() | 140.382024000| 356170| 2807| 100.00%| |Normal Text |LZ4_decompress_fast() | 133.453277000| 374662| 2669| 95.06%| | | | | | | | |Compressible |LZ4_decompress_safe() | 28.044312000| 1782892| 560| 19.98%| |Compressible |LZ4_decompress_fast() | 32.404868999| 1542978| 648| 23.08%| +--------------+------------------------------+--------------+----------------+--------------+--------------+ ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
