sunhaibotb commented on a change in pull request #7797: [FLINK-11379] Fix
OutOfMemoryError caused by Files.readAllBytes() when TM loads a large size TDD
URL: https://github.com/apache/flink/pull/7797#discussion_r261027583
##########
File path: flink-core/src/main/java/org/apache/flink/util/FileUtils.java
##########
@@ -107,6 +120,89 @@ public static void writeFileUtf8(File file, String contents) throws IOException
writeFile(file, contents, "UTF-8");
}
+ /**
+ * Reads all the bytes from a file. The method ensures that the file is
+ * closed when all bytes have been read or an I/O error, or other runtime
+ * exception, is thrown.
+ *
+ * <p>This is an implementation that follow {@link java.nio.file.Files#readAllBytes(java.nio.file.Path)},
+ * and the difference is that it limits the size of the direct buffer to avoid
+ * direct-buffer OutOfMemoryError. When {@link java.nio.file.Files#readAllBytes(java.nio.file.Path)}
+ * or other interfaces in java API can do this in the future, we should remove it.
+ *
+ * @param path
+ * the path to the file
+ * @return a byte array containing the bytes read from the file
+ *
+ * @throws IOException
+ * if an I/O error occurs reading from the stream
+ * @throws OutOfMemoryError
+ * if an array of the required size cannot be allocated, for
+ * example the file is larger that {@code 2GB}
+ */
+ public static byte[] readAllBytes(java.nio.file.Path path) throws IOException {
+ try (SeekableByteChannel channel = Files.newByteChannel(path);
Review comment:
@sunjincheng121 is right.
This OutOfMemoryError is different from the one in the [Stack Overflow question](https://stackoverflow.com/questions/43782158/java-read-a-big-file-few-gb-without-exception). That one occurs in heap memory, while this one occurs in direct memory.
The code is not copied from Stack Overflow; it comes from java.nio.file.Files#readAllBytes(), with a few small changes. Does this raise any licensing issues? @zentol
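
To illustrate the heap versus direct memory distinction, here is a minimal sketch of the general idea, not the code in this PR. As far as I understand, when a channel is read into a heap array, the JDK copies the data through a temporary direct buffer sized to the requested read length, so one huge read can exhaust direct memory even when the heap has room. Capping each read keeps that temporary buffer small. The class name `BoundedRead`, the method name `readAllBytesChunked`, and the 4 KB `CHUNK_SIZE` are assumptions chosen for the example, and the sketch does not handle a file that grows while it is being read.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class BoundedRead {
    // Hypothetical upper bound for a single read; illustrative value only.
    private static final int CHUNK_SIZE = 4096;

    static byte[] readAllBytesChunked(Path path) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(path);
             InputStream in = Channels.newInputStream(channel)) {
            long size = channel.size();
            if (size > Integer.MAX_VALUE - 8) {
                // A Java array cannot hold this many bytes.
                throw new OutOfMemoryError("Required array size too large");
            }
            // The result lives on the heap; only the per-read copy may use direct memory.
            byte[] result = new byte[(int) size];
            int offset = 0;
            while (offset < result.length) {
                // Never request more than CHUNK_SIZE bytes in one read.
                int toRead = Math.min(result.length - offset, CHUNK_SIZE);
                int n = in.read(result, offset, toRead);
                if (n < 0) {
                    break; // The file ended earlier than its reported size.
                }
                offset += n;
            }
            // Trim the array if fewer bytes were available than reported.
            return offset == result.length ? result : Arrays.copyOf(result, offset);
        }
    }
}
```

A caller would use it like `byte[] bytes = BoundedRead.readAllBytesChunked(path);`. The heap array is still as large as the file, so this only addresses the direct-memory side of the problem.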