[
https://issues.apache.org/jira/browse/DRILL-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622998#comment-16622998
]
ASF GitHub Bot commented on DRILL-6747:
---------------------------------------
ilooner commented on a change in pull request #1472: DRILL-6747: Fixed IOB in
HashJoin for inner and right joins on empty probe side
URL: https://github.com/apache/drill/pull/1472#discussion_r219367526
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatch.java
##########
@@ -115,6 +115,14 @@
* This value will be returned only after {@link #OK_NEW_SCHEMA} has been
* returned at least once (not necessarily <em>immediately</em> after).
* </p>
+ * <p>
+ * Also, after a RecordBatch returns NONE, it should:
+ * <ul>
+ * <li>Contain the last valid schema seen by the operator.</li>
+ * <li>Contain a VectorContainer with empty columns corresponding to the last valid schema.</li>
+ * <li>Return a record count of 0.</li>
+ * </ul>
+ * </p>
*/
NONE(false),
Review comment:
@sohami I agree that guarantee was not provided before. However, I am
proposing to change the contract in the way described in the comment. Note that
no data would be associated with a record batch after it returns NONE; it would
only have a VectorContainer with **EMPTY** columns corresponding to the last
valid schema seen, and it would return a record count of 0. In practice almost
all operators already do this (except that they don't always zero their
container). The advantage of adding this new guarantee is that downstream
operators do not have to do bookkeeping that the upstream can do for them very
easily.
The details of why this was done are in the JIRA. There were many ways to
fix the issue in HashJoin, but this is by far the cleanest and least risky,
since most of the code in HashJoin has no unit tests. It also has the advantage
of clarifying the behavior of RecordBatches for future operator
implementations. So I felt this approach was a win-win and the best way to fix
the issue. However, there should be a follow-up JIRA to validate the new
contract I am proposing for the rest of the operators, not just
UnorderedReceiver.
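To make the proposed contract concrete, here is a minimal, self-contained sketch. The `Column` and `MockBatch` classes below are hypothetical stand-ins for Drill's value vectors and `RecordBatch` (not the real API): an exhausted operator keeps the columns of its last valid schema, zeroes them, and reports a record count of 0, instead of clearing its container the way UnorderedReceiver's RecordBatchLoader did.

```java
import java.util.ArrayList;
import java.util.List;

public class NoneContractSketch {
  enum Outcome { OK_NEW_SCHEMA, OK, NONE }

  /** Hypothetical stand-in for a value vector: a named column plus its value count. */
  static class Column {
    final String name;
    int valueCount;
    Column(String name, int valueCount) { this.name = name; this.valueCount = valueCount; }
  }

  /** Hypothetical stand-in for a RecordBatch backed by a VectorContainer. */
  static class MockBatch {
    final List<Column> container = new ArrayList<>();
    int recordCount;
    private int batchesLeft;
    private boolean schemaSent;

    MockBatch(int batches) { this.batchesLeft = batches; }

    Outcome next() {
      if (batchesLeft > 0) {
        batchesLeft--;
        container.clear();
        container.add(new Column("key", 10)); // last valid schema: one column
        recordCount = 10;
        if (!schemaSent) { schemaSent = true; return Outcome.OK_NEW_SCHEMA; }
        return Outcome.OK;
      }
      // Proposed contract on NONE: keep the columns of the last valid schema,
      // but zero them and report a record count of 0 -- do NOT remove them.
      for (Column c : container) { c.valueCount = 0; }
      recordCount = 0;
      return Outcome.NONE;
    }
  }

  public static void main(String[] args) {
    MockBatch b = new MockBatch(2);
    while (b.next() != Outcome.NONE) { }
    // A downstream operator (e.g. HashJoin building its hash table) can still
    // look up the schema's columns here without an IndexOutOfBoundsException.
    System.out.println(b.container.size() + " " + b.recordCount); // prints "1 0"
  }
}
```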
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> Empty Probe Side in Right or Inner Join Causes IOB.
> ---------------------------------------------------
>
> Key: DRILL-6747
> URL: https://issues.apache.org/jira/browse/DRILL-6747
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Timothy Farkas
> Assignee: Timothy Farkas
> Priority: Major
> Labels: ready-to-commit
> Fix For: 1.15.0
>
>
> When a right join is done with an empty probe side, which first returns
> OK_NEW_SCHEMA and then NONE, an IOB would be triggered.
> This happens in the following scenario:
> 1. We finish reading the build side.
> 2. The upstream probe-side operator is an UnorderedReceiver.
> 3. We sniff for probe data and find none, so the UnorderedReceiver returns
> NONE.
> 4. When the UnorderedReceiver returns NONE, it clears the VectorContainer in
> its RecordBatchLoader, causing all the value vectors to be deleted.
> 5. We then try to build hash tables from the build side data.
> 6. The HashTable requires a probe-side vector container from the
> UnorderedReceiver.
> 7. The UnorderedReceiver's vector container lives in the RecordBatchLoader,
> which was cleared when NONE was returned, so all of the vector container's
> columns were removed.
> 8. Building the hash table attempts to access the vector container's columns,
> and we get an IOB.
> There are a couple of ways to fix the issue.
> Currently, updating the value vectors in the hashtable for the build and
> probe sides is tightly coupled; you cannot update them independently. So we
> could refactor to update the build and probe sides independently in the case
> of an empty probe side. This would be an invasive change, though.
> Another solution, which I am going to implement here, is to add a requirement
> to the contract for IterOutcome.NONE that the VectorContainer's columns are
> preserved, and to fix UnorderedReceiver to obey that contract. It looks like
> all the other operators already do this, but it is not explicitly tested. I
> will create other JIRAs to explicitly test this behavior. Adding this
> requirement to the contract for IterOutcome.NONE has the benefit of making
> the state of a RecordBatch well defined after NONE, which it currently is not.
> {code}
> Fragment 3:3
> [Error Id: 061e7d71-e3c1-43bd-a6d5-9d588ecf6551 on perf109-52.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IndexOutOfBoundsException: Index: 1, Size: 0
> Fragment 3:3
> [Error Id: 061e7d71-e3c1-43bd-a6d5-9d588ecf6551 on perf109-52.qa.lab:31010]
> at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_171]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[na:1.8.0_171]
> at java.util.ArrayList.get(ArrayList.java:433) ~[na:1.8.0_171]
> at org.apache.drill.exec.record.VectorContainer.getValueAccessorById(VectorContainer.java:317) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById(RecordBatchLoader.java:251) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.getValueAccessorById(UnorderedReceiverBatch.java:139) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.test.generated.HashTableGen783.doSetup(HashTableGen783.java:49) ~[na:na]
> at org.apache.drill.exec.physical.impl.common.HashTableTemplate.updateBatches(HashTableTemplate.java:513) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.common.HashTableTemplate.updateIncoming(HashTableTemplate.java:870) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.common.HashPartition.buildContainersHashTableAndHelper(HashPartition.java:510) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase(HashJoinBatch.java:973) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:436) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:175) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:122) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:112) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:143) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:175) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:122) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:112) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:143) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:175) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:103) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:152) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:293) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:280) ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_171]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_171]
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) ~[hadoop-common-2.7.0-mapr-1707.jar:na]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:280) [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT]
> ... 4 common frames omitted
> 2018-09-18 15:32:59,977 [245e830b-9128-eaa9-44b3-907398183f09:frag:3:43] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IndexOutOfBoundsException: Index: 1, Size: 0
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)