pnowojski commented on a change in pull request #13718:
URL: https://github.com/apache/flink/pull/13718#discussion_r512677181
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/serialization/SpanningWrapper.java
##########
@@ -286,11 +292,21 @@ private FileChannel createSpillingChannel() throws IOException {
     // try to find a unique file name for the spilling channel
     int maxAttempts = 10;
     for (int attempt = 0; attempt < maxAttempts; attempt++) {
-        String directory = tempDirs[rnd.nextInt(tempDirs.length)];
+        int dirIndex = rnd.nextInt(tempDirs.length);
+        String directory = tempDirs[dirIndex];
         File file = new File(directory, randomString(rnd) + ".inputchannel");
-        if (file.createNewFile()) {
-            spillFile = new RefCountedFile(file);
-            return new RandomAccessFile(file, "rw").getChannel();
+        try {
+            if (file.createNewFile()) {
+                spillFile = new RefCountedFile(file);
+                return new RandomAccessFile(file, "rw").getChannel();
+            }
+        } catch (IOException e) {
+            // if there is no tempDir left to try
+            if (tempDirs.length <= 1) {
+                throw e;
+            }
+            LOG.warn("Caught an IOException when creating spill file: " + directory + ". Attempt " + attempt, e);
+            tempDirs = (String[]) ArrayUtils.remove(tempDirs, dirIndex);
Review comment:
OK, good point. Let's keep it as it is. At worst, if intermittent errors
deplete the available temp dirs, the job will restart and re-attempt writing
to the same directories (and the intermittent error might be gone by then).
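
For anyone reading this thread later, here is a minimal standalone sketch of the behavior discussed above. It is not the code in `SpanningWrapper`: the class name `SpillFileSketch`, the method `createSpillFile`, the use of a `List<String>` in place of the `tempDirs` array, and `UUID` in place of `randomString(rnd)` are all assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Random;
import java.util.UUID;

/** Hypothetical sketch of "drop a failing temp dir and retry"; not Flink code. */
final class SpillFileSketch {

    /**
     * Tries to create a uniquely named file in a randomly chosen directory.
     * A directory that throws an IOException is removed from {@code dirs},
     * unless it is the last one left, in which case the exception propagates.
     * {@code dirs} must be non-empty and mutable.
     */
    static File createSpillFile(List<String> dirs, Random rnd, int maxAttempts)
            throws IOException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int dirIndex = rnd.nextInt(dirs.size());
            String directory = dirs.get(dirIndex);
            File file = new File(directory, UUID.randomUUID() + ".inputchannel");
            try {
                if (file.createNewFile()) {
                    return file;
                }
                // The name already existed: loop again with a fresh name.
            } catch (IOException e) {
                // No other directory left to try, so give up.
                if (dirs.size() <= 1) {
                    throw e;
                }
                // Stop using this directory for the remaining attempts.
                dirs.remove(dirIndex);
            }
        }
        throw new IOException(
                "Could not create a spill file after " + maxAttempts + " attempts");
    }
}
```

In this sketch the list of directories only shrinks for as long as the caller keeps the same list. A caller that rebuilds the list from configuration (as a restarted job presumably would) gets all directories back, which corresponds to the "worst case" described in the comment.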