David Heller created FLINK-2334:
-----------------------------------
Summary: IOException: Channel to path could not be opened
Key: FLINK-2334
URL: https://issues.apache.org/jira/browse/FLINK-2334
Project: Flink
Issue Type: Bug
Affects Versions: 0.9
Environment: local and cluster environment; Linux and MacOS
Reporter: David Heller
Priority: Minor
We've encountered an IOException due to missing temporary files (see stacktrace
at the bottom). It occurred both in local and cluster execution and on MacOS as
well as on linux. Data size does not seem to be the reason: the error occurred
on a 6.5GB dataset as well as on a small 400MB dataset.
Our code uses Bulk iterations and the suspicion is that cached build-side files
are accidentally removed too early. As far as we observed it, the exception
always happens in an iteration later than the first one (mostly iteration 2).
Interestingly, on one occasion the error disappeared consistently when setting
the number of maximum iterations higher (from 2 to 6).
On another occasion, the exception appeared when adding a simple map operator
at the end (holding the identity function).
Generally, the error is quite hard to reproduce.
The stacktrace:
java.io.IOException: Channel to path
'/var/folders/xx/0dd3w4jd7fbb4ytmhqxm157h0000gn/T/flink-io-f5061483-ff59-43dc-883f-79af813d5804/19a70637e025c7ee3919b30239060895.000023.channel'
could not be opened.
at
org.apache.flink.runtime.io.disk.iomanager.AbstractFileIOChannel.<init>(AbstractFileIOChannel.java:61)
at
org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.<init>(AsynchronousFileIOChannel.java:86)
at
org.apache.flink.runtime.io.disk.iomanager.AsynchronousBulkBlockReader.<init>(AsynchronousBulkBlockReader.java:46)
at
org.apache.flink.runtime.io.disk.iomanager.AsynchronousBulkBlockReader.<init>(AsynchronousBulkBlockReader.java:39)
at
org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync.createBulkBlockChannelReader(IOManagerAsync.java:263)
at
org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:751)
at
org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
at
org.apache.flink.runtime.operators.hash.ReOpenableMutableHashTable.prepareNextPartition(ReOpenableMutableHashTable.java:167)
at
org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:544)
at
org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashMatchIterator.callWithNextKey(NonReusingBuildFirstHashMatchIterator.java:104)
at
org.apache.flink.runtime.operators.AbstractCachedBuildSideMatchDriver.run(AbstractCachedBuildSideMatchDriver.java:155)
at
org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
at
org.apache.flink.runtime.iterative.task.AbstractIterativePactTask.run(AbstractIterativePactTask.java:139)
at
org.apache.flink.runtime.iterative.task.IterationIntermediatePactTask.run(IterationIntermediatePactTask.java:92)
at
org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException:
/var/folders/xx/0dd3w4jd7fbb4ytmhqxm157h0000gn/T/flink-io-f5061483-ff59-43dc-883f-79af813d5804/19a70637e025c7ee3919b30239060895.000023.channel
(No such file or directory)
at java.io.RandomAccessFile.open0(Native Method)
at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124)
at
org.apache.flink.runtime.io.disk.iomanager.AbstractFileIOChannel.<init>(AbstractFileIOChannel.java:57)
... 16 more
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)