Igloo created HDFS-15445:
----------------------------
Summary: ZStandardCodec compression may fail when encountering a
specific file
Key: HDFS-15445
URL: https://issues.apache.org/jira/browse/HDFS-15445
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs
Affects Versions: 2.6.5
Environment: zstd 1.3.3
hadoop 2.6.5
--- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zstd/TestZStandardCompressorDecompressor.java
+++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zstd/TestZStandardCompressorDecompressor.java
@@ -62,10 +62,8 @@
   @BeforeClass
   public static void beforeClass() throws Exception {
     CONFIGURATION.setInt(IO_FILE_BUFFER_SIZE_KEY, 1024 * 64);
-    uncompressedFile = new File(TestZStandardCompressorDecompressor.class
-        .getResource("/zstd/test_file.txt").toURI());
-    compressedFile = new File(TestZStandardCompressorDecompressor.class
-        .getResource("/zstd/test_file.txt.zst").toURI());
+    uncompressedFile = new File("/tmp/badcase.data");
+    compressedFile = new File("/tmp/badcase.data.zst");
Reporter: Igloo
Attachments: badcase.data, image-2020-06-30-11-35-46-859.png,
image-2020-06-30-11-39-17-861.png
*Problem:*
In our production environment, we write files to HDFS with the zstd compressor.
Recently we found that a specific file may lead to ZStandard compressor
failures.
We can reproduce the issue with that file (attached file: badcase.data).
*Analysis:*
ZStandardCompressor uses a single buffer size (taken from zstd's recommended
compression output buffer size) for both inBufferSize and outBufferSize:
!image-2020-06-30-11-35-46-859.png|width=475,height=179!
However, zstd actually provides two separate recommended sizes, one for the
input buffer and one for the output buffer:
!image-2020-06-30-11-39-17-861.png!
*Workaround:*
One workaround is to use the recommended in/out buffer sizes provided by the zstd lib:
input buffer size: 131072 (128 * 1024)
output buffer size: 131591
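
For reference, here is a rough sketch of where the two sizes may come from. It is not Hadoop or zstd code: the class and method names are hypothetical, and the arithmetic assumes the output size is the worst-case compressed size of one 128 KB block plus a 3-byte block header and a 4-byte checksum, which is how I understand ZSTD_CStreamOutSize() to be derived. In the C library the actual values are reported by ZSTD_CStreamInSize() and ZSTD_CStreamOutSize().

```java
// Hypothetical helper for illustration only; not part of Hadoop or zstd.
public final class ZstdBufferSizes {
    // Assumed zstd constants: maximum block size, block header, checksum.
    static final int BLOCK_SIZE_MAX = 128 * 1024;
    static final int BLOCK_HEADER_SIZE = 3;
    static final int CHECKSUM_SIZE = 4;

    // Recommended input buffer size: one full block.
    static int recommendedInputSize() {
        return BLOCK_SIZE_MAX;
    }

    // Assumed worst-case compressed size for srcSize bytes of input.
    static int compressBound(int srcSize) {
        int margin = srcSize < BLOCK_SIZE_MAX
            ? (BLOCK_SIZE_MAX - srcSize) >> 11
            : 0;
        return srcSize + (srcSize >> 8) + margin;
    }

    // Recommended output buffer size: room for one worst-case block
    // plus its header and the trailing checksum.
    static int recommendedOutputSize() {
        return compressBound(BLOCK_SIZE_MAX) + BLOCK_HEADER_SIZE + CHECKSUM_SIZE;
    }

    public static void main(String[] args) {
        System.out.println("input buffer size: " + recommendedInputSize());
        System.out.println("output buffer size: " + recommendedOutputSize());
    }
}
```

Under these assumptions the output buffer has to be 519 bytes larger than the input buffer, so a single shared size cannot match both recommendations.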
--
This message was sent by Atlassian Jira
(v8.3.4#803005)