[jira] [Commented] (DRILL-6590) DATA_WRITE ERROR: Hash Join failed to write to output file: /tmp/drill/spill/24bac407
[ https://issues.apache.org/jira/browse/DRILL-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541235#comment-16541235 ]

Khurram Faraaz commented on DRILL-6590:
---------------------------------------

The stack trace contains a java.nio.channels.ClosedByInterruptException, which may explain the cause of the failure.

{noformat}
2018-07-11 20:38:03,974 [24b9591c-5ed1-bccf-8512-0b7707559b29:frag:4:14] INFO  o.a.d.e.p.impl.common.HashPartition - User Error Occurred: Hash Join failed to write to output file: /tmp/drill/spill/24b9591c-5ed1-bccf-8512-0b7707559b29_HashJoin_4-22-14/spill5_outer (null)
org.apache.drill.common.exceptions.UserException: DATA_WRITE ERROR: Hash Join failed to write to output file: /tmp/drill/spill/24b9591c-5ed1-bccf-8512-0b7707559b29_HashJoin_4-22-14/spill5_outer

[Error Id: 97755089-0ac9-4a9f-9d0d-95a9a996894e ]
	at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.common.HashPartition.spillThisPartition(HashPartition.java:350) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.common.HashPartition.completeABatch(HashPartition.java:263) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.common.HashPartition.completeAnOuterBatch(HashPartition.java:237) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.common.HashPartition.appendOuterRow(HashPartition.java:232) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.test.generated.HashJoinProbeGen50.executeProbePhase(HashJoinProbeTemplate.java:306) [na:na]
	at org.apache.drill.exec.test.generated.HashJoinProbeGen50.probeAndProject(HashJoinProbeTemplate.java:393) [na:na]
	at org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:348) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch(HashJoinBatch.java:274) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides(HashJoinBatch.java:236) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema(HashJoinBatch.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:152) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch(HashJoinBatch.java:274) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides(HashJoinBatch.java:236) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema(HashJoinBatch.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:152) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch(HashJoinBatch.java:274) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
	...
Caused by: java.nio.channels.ClosedByInterruptException: null
	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[na:1.8.0_161]
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:216) ~[na:1.8.0_161]
	at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[na:1.8.0_161]
	at java.nio.channels.Channels.writeFully(Channels.java:101) ~[na:1.8.0_161]
	at java.nio.channels.Channels.access$000(Channels.java:61) ~[na:1.8.0_161]
	at java.nio.channels.Channels$1.write(Channels.java:174) ~[na:1.8.0_161]
	at com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833) ~[protobuf-java-2.5.0.jar:na]
	at com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843) ~[protobuf-java-2.5.0.jar:na]
	at com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:91) ~[protobuf-java-2.5.0.jar:na]
	at ...
{noformat}
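For context on why a ClosedByInterruptException shows up under FileChannelImpl.write: FileChannel is an interruptible channel, so interrupting the thread that is writing to it closes the channel and makes the pending write fail with exactly this exception. A minimal, self-contained sketch (class and method names are mine, not Drill's) of that mechanism, assuming the fragment thread gets interrupted mid-spill:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicReference;

public class SpillInterruptDemo {

    // Writes to a temp "spill" file in a loop until the writer thread is
    // interrupted; returns the class of the exception the writer observed.
    static Class<?> interruptDuringWrite() throws Exception {
        Path spill = Files.createTempFile("spill", ".tmp");
        AtomicReference<Class<?>> seen = new AtomicReference<>();
        Thread writer = new Thread(() -> {
            try (FileChannel ch = FileChannel.open(spill, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.allocate(8192);
                while (true) {
                    buf.clear();
                    // FileChannel is an interruptible channel: an interrupt
                    // delivered during (or just before) this call closes the
                    // channel and makes write() throw ClosedByInterruptException.
                    ch.write(buf);
                }
            } catch (IOException e) {
                seen.set(e.getClass());
            }
        });
        writer.start();
        Thread.sleep(50);      // let a few "spill" writes go through
        writer.interrupt();    // roughly what a query cancellation does to the fragment thread
        writer.join();
        Files.deleteIfExists(spill);
        return seen.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(interruptDuringWrite().getName());
        // java.nio.channels.ClosedByInterruptException
    }
}
```

So the DATA_WRITE error here is most likely a symptom of the fragment thread being interrupted (e.g. by cancellation) rather than of the spill file or disk itself.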
[jira] [Commented] (DRILL-6590) DATA_WRITE ERROR: Hash Join failed to write to output file: /tmp/drill/spill/24bac407
[ https://issues.apache.org/jira/browse/DRILL-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540997#comment-16540997 ]

Boaz Ben-Zvi commented on DRILL-6590:
-------------------------------------

This bug came out of DRILL-6453, where a query was cancelled for a reason that was never explained. It looks like the cancellation somehow caused the spill writing to fail - maybe another thread closed that spill file while the original thread was writing?

> DATA_WRITE ERROR: Hash Join failed to write to output file:
> /tmp/drill/spill/24bac407
> -----------------------------------------------------------
>
>                 Key: DRILL-6590
>                 URL: https://issues.apache.org/jira/browse/DRILL-6590
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.14.0
>            Reporter: Khurram Faraaz
>            Priority: Major
>
> Apache Drill 1.14.0 git.commit.id.abbrev=eb946b0
> There was enough space on /tmp, however Hash Join failed to write to a spill file.
> [test@qa102-45 drill-1.14.0]# clush -a df -h /tmp
> : Filesystem                   Size  Used Avail Use% Mounted on
> : /dev/mapper/vg_root-lv_root  500G  150G  351G  30% /
> : Filesystem                   Size  Used Avail Use% Mounted on
> : /dev/mapper/vg_root-lv_root  500G   17G  484G   4% /
> : Filesystem                   Size  Used Avail Use% Mounted on
> : /dev/mapper/vg_root-lv_root  500G   14G  487G   3% /
> : Filesystem                   Size  Used Avail Use% Mounted on
> : /dev/mapper/vg_root-lv_root  500G   13G  488G   3% /
> Stack trace from drillbit.log
> {noformat}
> 2018-07-10 18:17:51,953 [BitServer-10] WARN  o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there was no registered listener for that message: profile {
>   state: FAILED
>   error {
>     error_id: "6e258de2-2d4f-4b48-967d-df1b329955cd"
>     endpoint {
>       address: "qa102-48.qa.lab"
>       user_port: 31010
>       control_port: 31011
>       data_port: 31012
>       version: "1.14.0-SNAPSHOT"
>       state: STARTUP
>     }
>     error_type: DATA_WRITE
>     message: "DATA_WRITE ERROR: Hash Join failed to write to output file: /tmp/drill/spill/24bac407-2adb-5763-ed08-cb5714dca2c0_HashJoin_4-22-53/spill15_outer\n\nFragment 4:53\n\n[Error Id: 6e258de2-2d4f-4b48-967d-df1b329955cd on qa102-48.qa.lab:31010]"
>     exception {
>       exception_class: "java.nio.channels.ClosedByInterruptException"
>       stack_trace {
>         class_name: "..."
>         line_number: 0
>         method_name: "..."
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "com.google.protobuf.CodedOutputStream"
>         file_name: "CodedOutputStream.java"
>         line_number: 833
>         method_name: "refreshBuffer"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "com.google.protobuf.CodedOutputStream"
>         file_name: "CodedOutputStream.java"
>         line_number: 843
>         method_name: "flush"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "com.google.protobuf.AbstractMessageLite"
>         file_name: "AbstractMessageLite.java"
>         line_number: 91
>         method_name: "writeDelimitedTo"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "org.apache.drill.exec.cache.VectorSerializer$Writer"
>         file_name: "VectorSerializer.java"
>         line_number: 97
>         method_name: "write"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "org.apache.drill.exec.physical.impl.common.HashPartition"
>         file_name: "HashPartition.java"
>         line_number: 346
>         method_name: "spillThisPartition"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "org.apache.drill.exec.physical.impl.common.HashPartition"
>         file_name: "HashPartition.java"
>         line_number: 263
>         method_name: "completeABatch"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "org.apache.drill.exec.physical.impl.common.HashPartition"
>         file_name: "HashPartition.java"
>         line_number: 237
>         method_name: "completeAnOuterBatch"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "org.apache.drill.exec.physical.impl.common.HashPartition"
>         file_name: "HashPartition.java"
>         line_number: 232
>         method_name: "appendOuterRow"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "org.apache.drill.exec.test.generated.HashJoinProbeGen49"
>         file_name: "HashJoinProbeTemplate.java"
>         line_number: 306
>         method_name: "executeProbePhase"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "org.apache.drill.exec.test.generated.HashJoinProbeGen49"
>         file_name: "HashJoinProbeTemplate.java"
>         line_number: 393
>         method_name: "probeAndProject"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "org.apache.drill.exec.physical.impl.join.HashJoinBatch"
>         file_name: "HashJoinBatch.java"
>         line_number: 357
>         method_name: "innerNext"
>         is_native_method: false
>       }
>       stack_trace {
>         class_name: "org.apache.drill.exec.record.AbstractRecordBatch"
>         file_name: "AbstractRecordBatch.java"
>         line_number: 172
>         method_name: "next"
>         is_native_method: false
>       }
> ...