[ https://issues.apache.org/jira/browse/HADOOP-13578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15743180#comment-15743180 ]
churro morales commented on HADOOP-13578:
-----------------------------------------

Hi [~jlowe],

Thanks for taking the time to review. I agree with all of the above comments and will correct those issues.

The last question you had was about ZSTD_endStream(). ZSTD_endStream() finishes the frame and writes the epilogue only if the uncompressed buffer has been fully consumed. Otherwise it basically does the same thing as ZSTD_compressStream(). You are correct: if the output buffer is too small, it may not be able to flush everything. There is a check in ZSTD_endStream() which does this:

{code}
size_t const notEnded = ZSTD_compressStream_generic(zcs, ostart, &sizeWritten, &srcSize, &srcSize, zsf_end);
size_t const remainingToFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize;
op += sizeWritten;
if (remainingToFlush) {
    output->pos += sizeWritten;
    return remainingToFlush + ZSTD_BLOCKHEADERSIZE /* final empty block */ + (zcs->checksum * 4);
}
// Create the epilogue and flush the epilogue
{code}

So if there is still data left to flush, the library won't finish the frame; it returns a nonzero count of the bytes that remain instead. That makes it safe to call repeatedly from our framework, because we never set the finished flag until the epilogue has been written successfully. The code in CompressorStream.java that calls our codec simply does this:

{code}
@Override
public void finish() throws IOException {
  if (!compressor.finished()) {
    compressor.finish();
    while (!compressor.finished()) {
      compress();
    }
  }
}
{code}

So I believe we won't drop any data with the way things are done. Please let me know if I am missing something obvious here :).
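To make the argument concrete, here is a minimal sketch of the same finish-then-drain loop. It assumes java.util.zip.Deflater as a stand-in for the ZStandard compressor (it is not the codec from the patch), and the class and method names FinishLoopSketch, compressWithSmallBuffer, and decompress are made up for illustration. The output buffer is deliberately tiny, so finishing takes many calls, and the round trip checks that nothing was dropped along the way.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class FinishLoopSketch {

    // Compress `input`, draining the compressor through an output buffer of
    // `outBufSize` bytes. Deflater is only a stand-in for the zstd codec.
    static byte[] compressWithSmallBuffer(byte[] input, int outBufSize) {
        Deflater compressor = new Deflater();
        compressor.setInput(input);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] out = new byte[outBufSize];

        // Same shape as CompressorStream.finish(): signal end of input, then
        // keep draining until the compressor reports the stream is complete.
        if (!compressor.finished()) {
            compressor.finish();
            while (!compressor.finished()) {
                int n = compressor.deflate(out, 0, out.length);
                sink.write(out, 0, n);
            }
        }
        compressor.end();
        return sink.toByteArray();
    }

    // Inflate the whole stream; fails loudly if the stream ends early.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    throw new IllegalStateException("truncated stream");
                }
                sink.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt stream", e);
        } finally {
            inflater.end();
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] input = new byte[10000];
        for (int i = 0; i < input.length; i++) {
            input[i] = (byte) (i * 31 % 251);
        }
        // An 8-byte output buffer forces many drain iterations.
        byte[] compressed = compressWithSmallBuffer(input, 8);
        byte[] restored = decompress(compressed);
        System.out.println("round trip intact: " + Arrays.equals(input, restored));
    }
}
```

The point of the sketch is only the control flow: as long as finished() stays false until the final bytes of the stream have been handed to the caller, a small output buffer costs extra iterations rather than data.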

> Add Codec for ZStandard Compression
> -----------------------------------
>
>                 Key: HADOOP-13578
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13578
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: churro morales
>            Assignee: churro morales
>         Attachments: HADOOP-13578.patch, HADOOP-13578.v1.patch,
> HADOOP-13578.v2.patch, HADOOP-13578.v3.patch, HADOOP-13578.v4.patch,
> HADOOP-13578.v5.patch, HADOOP-13578.v6.patch
>
>
> ZStandard: https://github.com/facebook/zstd has been used in production for 6
> months by facebook now. v1.0 was recently released. Create a codec for this
> library.