[ 
https://issues.apache.org/jira/browse/DRILL-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-2107:
------------------------------
    Attachment: q4_1_mj.json
                q4_1_mj_phy.txt
                q4_1_hj.json
                q4_1_hj_phy.txt

q4_1_hj_phy.txt: the explain plan output when HashJoin is enabled.
q4_1_hj.json: the physical plan in JSON format when HashJoin is enabled.
q4_1_mj_phy.txt: the explain plan output when MergeJoin is used.
q4_1_mj.json: the physical plan in JSON format when MergeJoin is used.

To submit the physical plan in JSON format directly, use the following command:

{
distribution/target/apache-drill-0.8.0-SNAPSHOT/apache-drill-0.8.0-SNAPSHOT/bin/submit_plan -f q4_1_hj.json -t physical -l
}


> Hash Join throw IOBE for a query with exists subquery. 
> -------------------------------------------------------
>
>                 Key: DRILL-2107
>                 URL: https://issues.apache.org/jira/browse/DRILL-2107
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Execution - Operators
>            Reporter: Jinfeng Ni
>            Assignee: Chris Westin
>            Priority: Critical
>         Attachments: q4_1_hj.json, q4_1_hj_phy.txt, q4_1_mj.json, 
> q4_1_mj_phy.txt
>
>
> I hit an IndexOutOfBoundsException (IOBE) in TestTpchDistributed Q4 when I 
> tried to enable an optimizer rule. I then simplified Q4 to the following 
> query, which still reproduces the same IOBE.
> {code}
> select
>   o.o_orderpriority
> from
>   cp.`tpch/orders.parquet` o
> where
>   exists (
>     select
>       *
>     from
>       cp.`tpch/lineitem.parquet` l
>     where
>       l.l_orderkey = o.o_orderkey
>   )
> ;
> {code}
> Stack trace of the exception:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>      at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[na:1.7.0_45]
>      at java.util.ArrayList.get(ArrayList.java:411) ~[na:1.7.0_45]
>      at org.apache.drill.exec.record.VectorContainer.getValueAccessorById(VectorContainer.java:232) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>      at org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById(RecordBatchLoader.java:149) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>      at org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.getValueAccessorById(UnorderedReceiverBatch.java:132) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>      at org.apache.drill.exec.test.generated.HashTableGen307.doSetup(HashTableTemplate.java:71) ~[na:na]
>      at org.apache.drill.exec.test.generated.HashTableGen307.updateBatches(HashTableTemplate.java:473) ~[na:na]
>      at org.apache.drill.exec.test.generated.HashJoinProbeGen313.executeProbePhase(HashJoinProbeTemplate.java:139) ~[na:na]
>      at org.apache.drill.exec.test.generated.HashJoinProbeGen313.probeAndProject(HashJoinProbeTemplate.java:223) ~[na:na]
>      at org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:227) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> ....
> {code}
> The physical plan appears to be correct after the new rule is enabled. In 
> fact, if I disable HashJoin and use merge join for the query, it works fine. 
> So the IOBE seems to expose a bug in HashJoin.
> There are two ways to reproduce this issue:
>  1) - Modify DrillRuleSets.java and remove the comment before SwapJoinRule
>     - alter session set `planner.slice_target` = 10;
>     - run the query
>
>  2) Use the attached physical plan in the JSON file, and use "submit_plan" 
> to submit the physical plan.
> For comparison, I also attached the physical plan produced with HashJoin 
> disabled (merge join used instead), and the explain plan at the physical 
> operator level.
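
The first reproduction option above can be sketched as the session below. This is only a sketch under assumptions: it presumes SwapJoinRule has already been enabled in DrillRuleSets.java and Drill rebuilt, and it assumes `planner.enable_hashjoin` is the session option used to switch between hash join and merge join (the description does not name the option).

{code}
-- Assumption: SwapJoinRule is already enabled in DrillRuleSets.java.
alter session set `planner.slice_target` = 10;

-- Assumed option name; hash join is expected to be on by default.
alter session set `planner.enable_hashjoin` = true;

-- Simplified Q4 from the description; reported to fail with the IOBE.
select
  o.o_orderpriority
from
  cp.`tpch/orders.parquet` o
where
  exists (
    select
      *
    from
      cp.`tpch/lineitem.parquet` l
    where
      l.l_orderkey = o.o_orderkey
  )
;

-- For comparison: fall back to merge join, which is reported to work.
alter session set `planner.enable_hashjoin` = false;
{code}

Re-running the same query after the last statement should exercise the merge join plan, which corresponds to the attached q4_1_mj files.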



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
