ConfX created HDFS-17095:
----------------------------
Summary: Out of Memory when mistakenly set file.bytes-per-checksum
to a large number
Key: HDFS-17095
URL: https://issues.apache.org/jira/browse/HDFS-17095
Project: Hadoop HDFS
Issue Type: Bug
Reporter: ConfX
Attachments: reproduce.sh
h2. What happened:
When {{file.bytes-per-checksum}} is set to a very large number,
{{ChecksumFileSystem}} in HDFS throws a {{java.lang.OutOfMemoryError}}
because the value is not properly validated or handled.
The only check is that the value is greater than 0; no upper bound is enforced.
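To make the failure mode concrete, here is a minimal sketch of the buffer-size arithmetic, using the value from the reproduction steps. The class and method names are hypothetical, and the multiplier of 9 is an assumed value for {{BUFFER_NUM_CHUNKS}} rather than something stated in this report. It shows that the {{int}} product wraps around, yet still requests a roughly 2 GB array.

```java
public class BufferSizeSketch {
    // Size as 32-bit int arithmetic computes it (silently wraps on overflow).
    static int intBufferSize(int bytesPerChecksum, int numChunks) {
        return bytesPerChecksum * numChunks;
    }

    // Same product in 64-bit arithmetic: the size the configuration implies.
    static long exactBufferSize(int bytesPerChecksum, int numChunks) {
        return (long) bytesPerChecksum * numChunks;
    }

    public static void main(String[] args) {
        int bytesPerChecksum = 1666845779; // value from the reproduction steps
        int numChunks = 9;                 // ASSUMPTION: value of BUFFER_NUM_CHUNKS

        System.out.println("exact size (bytes): "
            + exactBufferSize(bytesPerChecksum, numChunks));
        System.out.println("int size (bytes):   "
            + intBufferSize(bytesPerChecksum, numChunks));
        System.out.println("max heap (bytes):   "
            + Runtime.getRuntime().maxMemory());
    }
}
```

Under that assumption the exact product is 15001612011 bytes, which wraps to 2116710123 as an {{int}}. Because the wrapped value happens to be positive, the allocation is attempted and fails on any JVM whose heap is smaller than about 2 GB.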
h2. Buggy code:
In FSOutputSummer.java:
protected FSOutputSummer(DataChecksum sum) {
  ...
  this.buf = new byte[sum.getBytesPerChecksum() * BUFFER_NUM_CHUNKS]; // <<--- getBytesPerChecksum() gets parameter value
  ...
}
In ChecksumFileSystem.java:
bytesPerChecksum = conf.getInt(
    LocalFileSystemConfigKeys.LOCAL_FS_BYTES_PER_CHECKSUM_KEY,
    LocalFileSystemConfigKeys.LOCAL_FS_BYTES_PER_CHECKSUM_DEFAULT);
Preconditions.checkState(bytesPerChecksum > 0, // <<---- Only checks > 0
    "bytes per checksum should be positive but was %s",
    bytesPerChecksum);
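One possible direction for a stricter check is sketched below. This is not the project's fix. The class name, the method, and the 1 MiB cap are all hypothetical and chosen only for illustration. The sketch rejects values above the cap and uses {{Math.multiplyExact}} so that an overflowing buffer size raises an exception instead of wrapping.

```java
public class BytesPerChecksumCheck {
    // HYPOTHETICAL upper bound, picked only for this illustration.
    static final int HYPOTHETICAL_MAX_BYTES_PER_CHECKSUM = 1024 * 1024;

    /**
     * Validates the configured value and returns the buffer size in bytes.
     * numChunks stands in for BUFFER_NUM_CHUNKS (its value is an assumption here).
     */
    static int checkedBufferSize(int bytesPerChecksum, int numChunks) {
        if (bytesPerChecksum <= 0) {
            throw new IllegalArgumentException(
                "bytes per checksum should be positive but was " + bytesPerChecksum);
        }
        if (bytesPerChecksum > HYPOTHETICAL_MAX_BYTES_PER_CHECKSUM) {
            throw new IllegalArgumentException(
                "bytes per checksum should be at most "
                    + HYPOTHETICAL_MAX_BYTES_PER_CHECKSUM
                    + " but was " + bytesPerChecksum);
        }
        // Throws ArithmeticException on int overflow instead of wrapping.
        return Math.multiplyExact(bytesPerChecksum, numChunks);
    }

    public static void main(String[] args) {
        System.out.println(checkedBufferSize(512, 9)); // small value passes
        try {
            checkedBufferSize(1666845779, 9);          // value from this report
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check of this kind, a misconfigured value fails fast with a descriptive message at configuration time rather than surfacing later as an {{OutOfMemoryError}} inside the output stream constructor.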
h2. StackTrace:
java.lang.OutOfMemoryError: Java heap space
  at org.apache.hadoop.fs.FSOutputSummer.<init>(FSOutputSummer.java:55)
  at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:430)
  at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:521)
  at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:500)
  at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1195)
  at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1175)
  at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1064)
  at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1052)
  at org.apache.hadoop.hdfs.DFSTestUtil.writeFile(DFSTestUtil.java:902)
  at org.apache.hadoop.hdfs.DFSTestUtil.writeFile(DFSTestUtil.java:924)
  at org.apache.hadoop.hdfs.util.HostsFileWriter.initialize(HostsFileWriter.java:69
h2. Reproduce:
(1) Set {{file.bytes-per-checksum}} to a large value, e.g., 1666845779
(2) Run a simple test that exercises this parameter, e.g.
{{org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics#testExcessBlocks}}
For an easy reproduction, run the attached reproduce.sh.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)