aajisaka commented on code in PR #8526:
URL: https://github.com/apache/hadoop/pull/8526#discussion_r3345446206


##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zstd/TestZStandardCompressorDecompressor.java:
##########
@@ -557,6 +557,64 @@ public void testDecompressReturnsWhenNothingToDecompress() throws Exception {
     assertEquals(0, result);
   }
 
+  /**
+   * Verify that {@code setInput()} does not throw {@code BufferOverflowException}
+   * after a previous {@code decompress()} call threw an exception.
+   *
+   * <p>When {@code decompress()} processes compressed data, it sets
+   * {@code compressedDirectBuf.limit(bytesInCompressedBuffer)} — a value that
+   * may be smaller than {@code directBufferSize}. If {@code decompressDirectByteBufferStream}
+   * throws (e.g. on corrupted input), the limit is never restored. A subsequent
+   * {@code reset()} also does not restore {@code compressedDirectBuf.limit}.
+   * So the next {@code setInput()} call will hit {@code BufferOverflowException}
+   * because {@code setInputFromSavedData()} tries to {@code put()} more bytes
+   * than the current limit allows.</p>
+   *
+   * <p>This scenario occurs in practice when reading multiple zstd-compressed
+   * files from a directory: a corrupted file causes an exception mid-decompress,
+   * the decompressor is returned to the pool and reset, but the limit stays
+   * small. The next file's {@code setInput()} then fails.</p>
+   */
+  @Test
+  public void testSetInputAfterDecompressThrowsOnCorruptedData() throws Exception {

Review Comment:
   Thank you for the investigation. Makes sense to me.
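
For anyone following the thread, the stale-limit behaviour described in the Javadoc can be reproduced with a plain `java.nio.ByteBuffer`, independent of the Hadoop classes. This is only a minimal sketch under assumptions: the class name `StaleLimitSketch` and the helper `putOverflows` are hypothetical, the buffer sizes are arbitrary, and the `rewind()` followed by `put()` sequence stands in for what the Javadoc attributes to `setInputFromSavedData()`. It does not show the fix in this PR.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class StaleLimitSketch {

  // Hypothetical helper: rewinds the buffer and tries to put() the data,
  // returning true if the current limit is too small to hold it.
  static boolean putOverflows(ByteBuffer buf, byte[] data) {
    buf.rewind(); // resets position to 0 but leaves the limit untouched
    try {
      buf.put(data);
      return false;
    } catch (BufferOverflowException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Stand-in for compressedDirectBuf with directBufferSize = 16 (assumed size).
    ByteBuffer buf = ByteBuffer.allocateDirect(16);

    // A first, small input: 4 bytes are written, then the limit is shrunk,
    // mirroring compressedDirectBuf.limit(bytesInCompressedBuffer).
    buf.put(new byte[4]);
    buf.limit(4);

    // Assume decompression threw here, so nothing restored the limit.

    // The next, larger input no longer fits within the stale limit.
    System.out.println("stale limit overflows: " + putOverflows(buf, new byte[8]));

    // clear() restores limit = capacity; this is one possible remedy,
    // not necessarily the one the PR applies.
    buf.clear();
    System.out.println("after clear() overflows: " + putOverflows(buf, new byte[8]));
  }
}
```

The key point is that `rewind()` only moves the position, so a limit left small by an earlier call survives until something such as `clear()` or an explicit `limit(capacity())` resets it.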



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
For additional commands, e-mail: [email protected]
