Github user kiszk commented on a diff in the pull request:
https://github.com/apache/spark/pull/19184#discussion_r137982497
--- Diff:
core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java
---
@@ -104,6 +124,10 @@ public void loadNext() throws IOException {
if (taskContext != null) {
taskContext.killTaskIfInterrupted();
}
+ if (this.din == null) {
+ // Good time to init (if all files are opened, we can get Too Many files exception)
+ initStreams();
+ }
--- End diff --
IIUC, this PR does not reduce the total number of open files. Since it opens
files only when they are required, it may reduce the likelihood of a "too many
open files" error.
As @viirya pointed out, it is still necessary to provide a feature that limits
the number of files open at any one point (e.g. a priority queue).
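To make the lazy-open idea concrete, here is a minimal, self-contained sketch of the pattern the diff applies: defer opening the underlying stream until the first read, so many reader objects can exist without holding many open file handles. Class and method names (`LazySpillReader`, `readByte`, `isOpen`) are illustrative only, not Spark's actual API, and an in-memory stream stands in for a spill file on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of lazy stream initialization; not Spark code.
class LazySpillReader {
    private final byte[] spillData; // stands in for a spill file on disk
    private InputStream in;         // stays null until the first read

    LazySpillReader(byte[] spillData) {
        // No stream is opened here, so constructing N readers
        // does not consume N file handles up front.
        this.spillData = spillData;
    }

    boolean isOpen() {
        return in != null;
    }

    int readByte() throws IOException {
        if (in == null) {
            // "Good time to init": open only when data is actually needed.
            in = new ByteArrayInputStream(spillData);
        }
        return in.read();
    }
}

public class LazyInitDemo {
    public static void main(String[] args) throws IOException {
        LazySpillReader r = new LazySpillReader(new byte[] {42});
        System.out.println(r.isOpen());   // false: nothing opened yet
        System.out.println(r.readByte()); // 42: stream opened on demand
        System.out.println(r.isOpen());   // true
    }
}
```

Note this only delays the opens; if every reader is eventually read from concurrently, the peak handle count is unchanged, which is the gap the priority-queue suggestion above is meant to close.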
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]