Hi guys,
a fairly long-running job failed with a "No space left on device" exception,
but the machine's disk is nowhere near full:
okkam@okkam-nano-2:/opt/flink-0.8$ df
Filesystem   1K-blocks     Used Available Use% Mounted on
/dev/sdb2    223302236 22819504 189116588  11% /
none                 4        0         4   0% /sys/fs/cgroup
udev           8156864        4   8156860   1% /dev
tmpfs          1633520      524   1632996   1% /run
none              5120        0      5120   0% /run/lock
none           8167584        0   8167584   0% /run/shm
none            102400        0    102400   0% /run/user
/dev/sdb1       523248     3428    519820   1% /boot/efi
/dev/sda1    961302560  2218352 910229748   1% /media/data
cm_processes   8167584    12116   8155468   1% /run/cloudera-scm-agent/process
Is it possible that the temporary files were deleted after the failure, so
that df no longer shows the disk filling up? I read that this can happen, but
found no confirmation. For reference, each of the 6 nodes has a 256 GB SSD.
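One way to check this on the next run would be to watch the free space on the partition that holds the temporary files while the job is still running, rather than after it has failed. Below is a minimal sketch, not anything taken from Flink itself: TempSpaceMonitor, minUsable and percentUsed are names made up for illustration, and it assumes the spill files end up under the JVM's java.io.tmpdir unless a different directory is passed as the first argument.

```java
import java.io.File;

public class TempSpaceMonitor {

    /** Percentage of the partition in use, or 0 if the total size is unknown. */
    static int percentUsed(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            return 0;
        }
        return (int) Math.round(100.0 * (totalBytes - usableBytes) / totalBytes);
    }

    /**
     * Polls the partition that holds dir and returns the smallest
     * usable space (in bytes) seen across all samples.
     */
    static long minUsable(File dir, int samples, long intervalMillis) {
        long min = Long.MAX_VALUE;
        for (int i = 0; i < samples; i++) {
            // getUsableSpace() returns 0 if the path does not name a partition.
            long usable = dir.getUsableSpace();
            min = Math.min(min, usable);
            System.out.printf("%s: %d MB usable, %d%% used%n",
                    dir, usable / (1024 * 1024),
                    percentUsed(dir.getTotalSpace(), usable));
            if (i < samples - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // Stop sampling early but keep the interrupt flag set.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return min;
    }

    public static void main(String[] args) {
        // Assumption: temp files live under java.io.tmpdir unless a path is given.
        File dir = new File(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        int samples = args.length > 1 ? Integer.parseInt(args[1]) : 3;
        long intervalMillis = args.length > 2 ? Long.parseLong(args[2]) : 1000L;
        long min = minUsable(dir, samples, intervalMillis);
        System.out.println("Lowest usable space seen: " + min / (1024 * 1024) + " MB");
    }
}
```

Started on each node alongside the job with a large sample count (for example `java TempSpaceMonitor /tmp 3600 1000` for roughly an hour at one-second intervals), the low-water mark it prints should show whether the disk really filled up before the temporary files were removed.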
Here is the stack trace:
16:37:59,581 ERROR org.apache.flink.runtime.operators.RegularPactTask - Error in task code: CHAIN Join (org.okkam.flink.maintenance.deduplication.consolidate.Join2ToGetCandidates) -> Filter (org.okkam.flink.maintenance.deduplication.match.SingleMatchFilterFunctionWithFlagMatch) -> Map (org.okkam.flink.maintenance.deduplication.match.MapToTuple3MapFunction) -> Combine(org.apache.flink.api.java.operators.DistinctOperator$DistinctFunction) (4/28)
java.io.IOException: The channel is erroneous.
    at org.apache.flink.runtime.io.disk.iomanager.ChannelAccess.checkErroneous(ChannelAccess.java:132)
    at org.apache.flink.runtime.io.disk.iomanager.BlockChannelWriter.writeBlock(BlockChannelWriter.java:73)
    at org.apache.flink.runtime.io.disk.iomanager.ChannelWriterOutputView.writeSegment(ChannelWriterOutputView.java:218)
    at org.apache.flink.runtime.io.disk.iomanager.ChannelWriterOutputView.nextSegment(ChannelWriterOutputView.java:204)
    at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.advance(AbstractPagedOutputView.java:140)
    at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeByte(AbstractPagedOutputView.java:223)
    at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.write(AbstractPagedOutputView.java:173)
    at org.apache.flink.types.StringValue.writeString(StringValue.java:808)
    at org.apache.flink.api.common.typeutils.base.StringSerializer.serialize(StringSerializer.java:68)
    at org.apache.flink.api.common.typeutils.base.StringSerializer.serialize(StringSerializer.java:28)
    at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:95)
    at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:30)
    at org.apache.flink.runtime.operators.hash.HashPartition.insertIntoProbeBuffer(HashPartition.java:269)
    at org.apache.flink.runtime.operators.hash.MutableHashTable.processProbeIter(MutableHashTable.java:474)
    at org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:537)
    at org.apache.flink.runtime.operators.hash.BuildSecondHashMatchIterator.callWithNextKey(BuildSecondHashMatchIterator.java:106)
    at org.apache.flink.runtime.operators.MatchDriver.run(MatchDriver.java:148)
    at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:484)
    at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:359)
    at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:246)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: No space left on device
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:205)
    at org.apache.flink.runtime.io.disk.iomanager.SegmentWriteRequest.write(BlockChannelAccess.java:259)
    at org.apache.flink.runtime.io.disk.iomanager.IOManager$WriterThread.run(IOManager.java:636)