David Heller created FLINK-2334:
-----------------------------------

             Summary: IOException: Channel to path could not be opened
                 Key: FLINK-2334
                 URL: https://issues.apache.org/jira/browse/FLINK-2334
             Project: Flink
          Issue Type: Bug
    Affects Versions: 0.9
         Environment: local and cluster environment; Linux and MacOS
            Reporter: David Heller
            Priority: Minor

We've encountered an IOException due to missing temporary files (see the stacktrace at the bottom). It occurred in both local and cluster execution, on MacOS as well as on Linux. Data size does not seem to be the reason: the error occurred on a 6.5GB dataset as well as on a small 400MB dataset.

Our code uses bulk iterations, and the suspicion is that cached build-side files are accidentally removed too early. As far as we have observed, the exception always happens in an iteration later than the first one (mostly iteration 2).

Interestingly, on one occasion the error disappeared consistently when the maximum number of iterations was set higher (from 2 to 6). On another occasion, the exception appeared after adding a simple map operator at the end (applying the identity function). Generally, the error is quite hard to reproduce.

The stacktrace:

java.io.IOException: Channel to path '/var/folders/xx/0dd3w4jd7fbb4ytmhqxm157h0000gn/T/flink-io-f5061483-ff59-43dc-883f-79af813d5804/19a70637e025c7ee3919b30239060895.000023.channel' could not be opened.
	at org.apache.flink.runtime.io.disk.iomanager.AbstractFileIOChannel.<init>(AbstractFileIOChannel.java:61)
	at org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.<init>(AsynchronousFileIOChannel.java:86)
	at org.apache.flink.runtime.io.disk.iomanager.AsynchronousBulkBlockReader.<init>(AsynchronousBulkBlockReader.java:46)
	at org.apache.flink.runtime.io.disk.iomanager.AsynchronousBulkBlockReader.<init>(AsynchronousBulkBlockReader.java:39)
	at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync.createBulkBlockChannelReader(IOManagerAsync.java:263)
	at org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:751)
	at org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
	at org.apache.flink.runtime.operators.hash.ReOpenableMutableHashTable.prepareNextPartition(ReOpenableMutableHashTable.java:167)
	at org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:544)
	at org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashMatchIterator.callWithNextKey(NonReusingBuildFirstHashMatchIterator.java:104)
	at org.apache.flink.runtime.operators.AbstractCachedBuildSideMatchDriver.run(AbstractCachedBuildSideMatchDriver.java:155)
	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
	at org.apache.flink.runtime.iterative.task.AbstractIterativePactTask.run(AbstractIterativePactTask.java:139)
	at org.apache.flink.runtime.iterative.task.IterationIntermediatePactTask.run(IterationIntermediatePactTask.java:92)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: /var/folders/xx/0dd3w4jd7fbb4ytmhqxm157h0000gn/T/flink-io-f5061483-ff59-43dc-883f-79af813d5804/19a70637e025c7ee3919b30239060895.000023.channel (No such file or directory)
	at java.io.RandomAccessFile.open0(Native Method)
	at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124)
	at org.apache.flink.runtime.io.disk.iomanager.AbstractFileIOChannel.<init>(AbstractFileIOChannel.java:57)
	... 16 more

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
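
For reference, the innermost failure in the trace (a FileNotFoundException thrown by the RandomAccessFile constructor) can be illustrated outside Flink with the JDK alone. The sketch below is not a reproduction of the Flink bug and makes no claim about where Flink deletes the file. It only shows what happens when a spilled file is removed before a later read-only reopen, which is what the report suspects. The class name MissingChannelDemo, the method reopen, and the "demo-" file prefix are hypothetical and chosen for illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;

// Hypothetical demo class; not part of Flink.
class MissingChannelDemo {

    // Writes a temp file, optionally deletes it, then tries to reopen it read-only.
    // Returns true if the reopen succeeded, false if the file was missing.
    static boolean reopen(boolean deleteBeforeReopen) throws IOException {
        File channel = File.createTempFile("demo-", ".channel");
        try {
            // "Spill" phase: write a few bytes and close the file.
            try (RandomAccessFile writer = new RandomAccessFile(channel, "rw")) {
                writer.write(new byte[] {1, 2, 3});
            }
            // Simulate premature cleanup between two passes over the data.
            if (deleteBeforeReopen) {
                Files.delete(channel.toPath());
            }
            // "Re-read" phase: read-only mode does not create a missing file,
            // so the constructor throws FileNotFoundException instead.
            try (RandomAccessFile reader = new RandomAccessFile(channel, "r")) {
                return reader.length() == 3;
            } catch (FileNotFoundException e) {
                System.out.println("reopen failed: " + e.getMessage());
                return false;
            }
        } finally {
            // Clean up in the case where the file was kept.
            Files.deleteIfExists(channel.toPath());
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("file kept:    " + reopen(false));
        System.out.println("file deleted: " + reopen(true));
    }
}
```

With the file kept, the second open succeeds. With the file deleted, the read-only open fails in the RandomAccessFile constructor, matching the "Caused by" section of the trace. The exact exception message text is platform-dependent.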