[jira] [Commented] (DRILL-6590) DATA_WRITE ERROR: Hash Join failed to write to output file: /tmp/drill/spill/24bac407

2018-07-12 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541235#comment-16541235
 ] 

Khurram Faraaz commented on DRILL-6590:
---

The stack trace below contains a java.nio.channels.ClosedByInterruptException: null, which may explain the failure.
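
For context, java.nio raises that exception when the thread doing a blocking write on an interruptible channel (such as the FileChannel backing a spill file) is interrupted: the channel is closed and the in-flight write fails. A minimal standalone sketch of that behavior (hypothetical file name, not Drill code); it matches the Caused by entry in the trace below:

{code:java}
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptedSpillWrite {
  public static void main(String[] args) throws Exception {
    // Hypothetical stand-in for a spill file such as /tmp/drill/spill/...
    Path spill = Files.createTempFile("spill", ".bin");

    Thread writer = new Thread(() -> {
      ByteBuffer buf = ByteBuffer.allocate(1 << 20); // 1 MiB of zeros per write
      try (FileChannel ch = FileChannel.open(spill, StandardOpenOption.WRITE)) {
        while (true) {        // keep writing until interrupted
          buf.clear();
          ch.write(buf, 0);   // rewrite at offset 0 so the file does not grow
        }
      } catch (ClosedByInterruptException e) {
        // Interrupting the writing thread closes the interruptible channel and
        // the in-flight write fails with this exception.
        System.out.println("writer saw: " + e);
      } catch (Exception e) {
        System.out.println("writer saw: " + e);
      }
    });

    writer.start();
    Thread.sleep(100);   // let a few writes go through
    writer.interrupt();  // e.g. a cancellation interrupting the fragment thread
    writer.join();
    Files.deleteIfExists(spill);
  }
}
{code}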

{noformat}
2018-07-11 20:38:03,974 [24b9591c-5ed1-bccf-8512-0b7707559b29:frag:4:14] INFO o.a.d.e.p.impl.common.HashPartition - User Error Occurred: Hash Join failed to write to output file: /tmp/drill/spill/24b9591c-5ed1-bccf-8512-0b7707559b29_HashJoin_4-22-14/spill5_outer (null)
org.apache.drill.common.exceptions.UserException: DATA_WRITE ERROR: Hash Join failed to write to output file: /tmp/drill/spill/24b9591c-5ed1-bccf-8512-0b7707559b29_HashJoin_4-22-14/spill5_outer


[Error Id: 97755089-0ac9-4a9f-9d0d-95a9a996894e ]
    at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.common.HashPartition.spillThisPartition(HashPartition.java:350) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.common.HashPartition.completeABatch(HashPartition.java:263) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.common.HashPartition.completeAnOuterBatch(HashPartition.java:237) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.common.HashPartition.appendOuterRow(HashPartition.java:232) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.test.generated.HashJoinProbeGen50.executeProbePhase(HashJoinProbeTemplate.java:306) [na:na]
    at org.apache.drill.exec.test.generated.HashJoinProbeGen50.probeAndProject(HashJoinProbeTemplate.java:393) [na:na]
    at org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:348) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch(HashJoinBatch.java:274) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides(HashJoinBatch.java:236) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema(HashJoinBatch.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:152) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch(HashJoinBatch.java:274) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides(HashJoinBatch.java:236) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema(HashJoinBatch.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:152) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch(HashJoinBatch.java:274) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
    ...
    ...
Caused by: java.nio.channels.ClosedByInterruptException: null
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[na:1.8.0_161]
    at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:216) ~[na:1.8.0_161]
    at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[na:1.8.0_161]
    at java.nio.channels.Channels.writeFully(Channels.java:101) ~[na:1.8.0_161]
    at java.nio.channels.Channels.access$000(Channels.java:61) ~[na:1.8.0_161]
    at java.nio.channels.Channels$1.write(Channels.java:174) ~[na:1.8.0_161]
    at com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833) ~[protobuf-java-2.5.0.jar:na]
    at com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843) ~[protobuf-java-2.5.0.jar:na]
    at com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:91) ~[protobuf-java-2.5.0.jar:na]
    ...
{noformat}

[jira] [Commented] (DRILL-6590) DATA_WRITE ERROR: Hash Join failed to write to output file: /tmp/drill/spill/24bac407

2018-07-11 Thread Boaz Ben-Zvi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540997#comment-16540997
 ] 

Boaz Ben-Zvi commented on DRILL-6590:
-

 This bug came out of DRILL-6453, where a query was cancelled in an unexplained way. It looks like the cancellation somehow caused the spill write to fail - maybe another thread closed that spill file while the original thread was still writing?
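
One data point on the "another thread closed the file" idea (a standalone illustration along the same lines as the sketch above, not Drill code): when a different thread closes the channel while a write is in progress, the blocked writer gets an AsynchronousCloseException; ClosedByInterruptException - the subclass reported in the trace - is the variant thrown when the writing thread itself is interrupted.

{code:java}
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousCloseException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CloseFromAnotherThread {
  public static void main(String[] args) throws Exception {
    Path spill = Files.createTempFile("spill", ".bin"); // hypothetical spill file stand-in
    FileChannel ch = FileChannel.open(spill, StandardOpenOption.WRITE);

    Thread writer = new Thread(() -> {
      ByteBuffer buf = ByteBuffer.allocate(1 << 20);
      try {
        while (true) {        // keep the channel busy writing
          buf.clear();
          ch.write(buf, 0);   // rewrite at offset 0 so the file does not grow
        }
      } catch (AsynchronousCloseException e) {
        // Another thread closed the channel while this one was blocked writing.
        System.out.println("writer saw: " + e.getClass().getSimpleName());
      } catch (Exception e) {
        // A close that lands between writes surfaces as ClosedChannelException instead.
        System.out.println("writer saw: " + e);
      }
    });

    writer.start();
    Thread.sleep(100);
    ch.close();   // the "other" thread closes the spill file
    writer.join();
    Files.deleteIfExists(spill);
  }
}
{code}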


> DATA_WRITE ERROR: Hash Join failed to write to output file: 
> /tmp/drill/spill/24bac407
> -
>
> Key: DRILL-6590
> URL: https://issues.apache.org/jira/browse/DRILL-6590
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Priority: Major
>
> Apache Drill 1.14.0 git.commit.id.abbrev=eb946b0
> There was enough space on /tmp, yet the Hash Join failed to write to its spill file:
> [test@qa102-45 drill-1.14.0]# clush -a df -h /tmp
> : Filesystem Size Used Avail Use% Mounted on
> : /dev/mapper/vg_root-lv_root 500G 150G 351G 30% /
> : Filesystem Size Used Avail Use% Mounted on
> : /dev/mapper/vg_root-lv_root 500G 17G 484G 4% /
> : Filesystem Size Used Avail Use% Mounted on
> : /dev/mapper/vg_root-lv_root 500G 14G 487G 3% /
> : Filesystem Size Used Avail Use% Mounted on
> : /dev/mapper/vg_root-lv_root 500G 13G 488G 3% /
> Stack trace from drillbit.log
> {noformat}
> 2018-07-10 18:17:51,953 [BitServer-10] WARN 
> o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there 
> was no registered listener for that message: profile {
>  state: FAILED
>  error {
>  error_id: "6e258de2-2d4f-4b48-967d-df1b329955cd"
>  endpoint {
>  address: "qa102-48.qa.lab"
>  user_port: 31010
>  control_port: 31011
>  data_port: 31012
>  version: "1.14.0-SNAPSHOT"
>  state: STARTUP
>  }
>  error_type: DATA_WRITE
>  message: "DATA_WRITE ERROR: Hash Join failed to write to output file: 
> /tmp/drill/spill/24bac407-2adb-5763-ed08-cb5714dca2c0_HashJoin_4-22-53/spill15_outer\n\nFragment
>  4:53\n\n[Error Id: 6e258de2-2d4f-4b48-967d-df1b329955cd on 
> qa102-48.qa.lab:31010]"
>  exception {
>  exception_class: "java.nio.channels.ClosedByInterruptException"
>  stack_trace {
>  class_name: "..."
>  line_number: 0
>  method_name: "..."
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "com.google.protobuf.CodedOutputStream"
>  file_name: "CodedOutputStream.java"
>  line_number: 833
>  method_name: "refreshBuffer"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "com.google.protobuf.CodedOutputStream"
>  file_name: "CodedOutputStream.java"
>  line_number: 843
>  method_name: "flush"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "com.google.protobuf.AbstractMessageLite"
>  file_name: "AbstractMessageLite.java"
>  line_number: 91
>  method_name: "writeDelimitedTo"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "org.apache.drill.exec.cache.VectorSerializer$Writer"
>  file_name: "VectorSerializer.java"
>  line_number: 97
>  method_name: "write"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "org.apache.drill.exec.physical.impl.common.HashPartition"
>  file_name: "HashPartition.java"
>  line_number: 346
>  method_name: "spillThisPartition"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "org.apache.drill.exec.physical.impl.common.HashPartition"
>  file_name: "HashPartition.java"
>  line_number: 263
>  method_name: "completeABatch"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "org.apache.drill.exec.physical.impl.common.HashPartition"
>  file_name: "HashPartition.java"
>  line_number: 237
>  method_name: "completeAnOuterBatch"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "org.apache.drill.exec.physical.impl.common.HashPartition"
>  file_name: "HashPartition.java"
>  line_number: 232
>  method_name: "appendOuterRow"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "org.apache.drill.exec.test.generated.HashJoinProbeGen49"
>  file_name: "HashJoinProbeTemplate.java"
>  line_number: 306
>  method_name: "executeProbePhase"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "org.apache.drill.exec.test.generated.HashJoinProbeGen49"
>  file_name: "HashJoinProbeTemplate.java"
>  line_number: 393
>  method_name: "probeAndProject"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "org.apache.drill.exec.physical.impl.join.HashJoinBatch"
>  file_name: "HashJoinBatch.java"
>  line_number: 357
>  method_name: "innerNext"
>  is_native_method: false
>  }
>  stack_trace {
>  class_name: "org.apache.drill.exec.record.AbstractRecordBatch"
>  file_name: "AbstractRecordBatch.java"
>  line_number: 172
>  method_name: "next"
>  is_native_method: false
>  }
>