[
https://issues.apache.org/jira/browse/TAJO-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyunsik Choi resolved TAJO-926.
-------------------------------
Resolution: Fixed
Committed it to the master branch.
> Join condition including column references of a row-preserving table in left
> outer join causes incorrect result
> ---------------------------------------------------------------------------------------------------------------
>
> Key: TAJO-926
> URL: https://issues.apache.org/jira/browse/TAJO-926
> Project: Tajo
> Issue Type: Bug
> Components: physical operator, planner/optimizer
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> Fix For: 0.9.0
>
>
> This patch fixes two bugs.
> The first is an incorrect projection push down (PPD). The following query
> reproduces the bug:
> {noformat}
> select
> r_name,
> r_regionkey,
> n_name,
> n_regionkey
> from
> region left outer join nation on n_regionkey = r_regionkey and r_name in
> ('AMERICA', 'ASIA')
> order by r_name;
> {noformat}
> The above query includes one left outer join (LOJ) and one join filter. Since
> the join filter {{R_NAME in ('AMERICA', 'ASIA')}} includes column references
> to the row-preserved table {{region}}, the join filter is placed on the LOJ
> operator. As a result, projection push down pushes down only the
> RowConstantEval sub expression and replaces the right-hand side (RHS) of the
> IN predicate with a FieldEval. However, InEval assumes that its RHS is always
> a RowConstantEval. This is the main cause of the bug.
> {noformat}
> 2014-07-09 16:39:37,527 ERROR: org.apache.tajo.worker.Task (run(395)) -
> org.apache.tajo.engine.eval.FieldEval cannot be cast to
> org.apache.tajo.engine.eval.RowConstantEval
> java.lang.ClassCastException: org.apache.tajo.engine.eval.FieldEval cannot be
> cast to org.apache.tajo.engine.eval.RowConstantEval
> at org.apache.tajo.engine.eval.InEval.eval(InEval.java:62)
> at org.apache.tajo.engine.eval.BinaryEval.eval(BinaryEval.java:104)
> at
> org.apache.tajo.engine.planner.physical.NLLeftOuterJoinExec.next(NLLeftOuterJoinExec.java:109)
> at
> org.apache.tajo.engine.planner.physical.ExternalSortExec.sortAndStoreAllChunks(ExternalSortExec.java:201)
> at
> org.apache.tajo.engine.planner.physical.ExternalSortExec.next(ExternalSortExec.java:278)
> at
> org.apache.tajo.engine.planner.physical.RangeShuffleFileWriteExec.next(RangeShuffleFileWriteExec.java:99)
> at org.apache.tajo.worker.Task.run(Task.java:388)
> at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:406)
> at java.lang.Thread.run(Thread.java:744)
> 2014-07-09 16:39:37,528 INFO: org.apache.tajo.worker.TaskAttemptContext
> (setState(115)) - Query status of ta_1404891573341_0004_000003_000000_02 is
> changed to TA_FAILED
> 2014-07-09 16:39:37,529 INFO: org.apache.tajo.worker.Task (run(452)) -
> Worker's task counter - total:3, succeeded: 0, killed: 3, failed: 3
> {noformat}
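The failure in the trace above can be reproduced in miniature. The following is a minimal sketch, not Tajo's code: `InEvalSketch`, `RowConstant`, `FieldRef`, and `In` are hypothetical stand-ins for the eval classes named in the trace. It only illustrates how an unchecked cast of the IN predicate's RHS fails once the planner has swapped the constant list for a column reference.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins for the eval nodes named in the trace; not Tajo's classes.
public class InEvalSketch {
    public interface Eval { }

    // A literal value list such as ('AMERICA', 'ASIA').
    public static final class RowConstant implements Eval {
        final Set<String> values;
        public RowConstant(String... values) {
            this.values = new HashSet<>(Arrays.asList(values));
        }
    }

    // A reference to a column of the input tuple.
    public static final class FieldRef implements Eval {
        final String name;
        public FieldRef(String name) { this.name = name; }
    }

    // An IN predicate that assumes its right-hand side is always a RowConstant.
    public static final class In {
        final FieldRef lhs;
        final Eval rhs;
        public In(FieldRef lhs, Eval rhs) { this.lhs = lhs; this.rhs = rhs; }

        public boolean eval(Map<String, String> tuple) {
            // Unchecked assumption: throws ClassCastException when rhs is a FieldRef.
            RowConstant constants = (RowConstant) rhs;
            return constants.values.contains(tuple.get(lhs.name));
        }
    }

    // Returns true if evaluating the predicate fails with a ClassCastException.
    public static boolean failsWithClassCast(In predicate, Map<String, String> tuple) {
        try {
            predicate.eval(tuple);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

With a `RowConstant` on the right-hand side the predicate evaluates normally. With a `FieldRef` there, which is the shape the incorrect push down produces, the cast fails in the same way as in the trace.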
> The second bug is that HashLeftOuterJoin produces an incorrect result when it
> has a join filter that references the row-preserved table, as in the above
> example query.
> To fix this bug, we have to skip the right iterator of the hash table when
> the joined tuple is filtered out.
> Expected result:
> {noformat}
> r_name,r_regionkey,n_name,n_regionkey
> -------------------------------
> AFRICA,0,null,null
> AMERICA,1,ARGENTINA,1
> AMERICA,1,BRAZIL,1
> AMERICA,1,CANADA,1
> AMERICA,1,PERU,1
> AMERICA,1,UNITED STATES,1
> ASIA,2,INDIA,2
> ASIA,2,INDONESIA,2
> ASIA,2,JAPAN,2
> ASIA,2,CHINA,2
> ASIA,2,VIETNAM,2
> EUROPE,3,null,null
> MIDDLE EAST,4,null,null
> {noformat}
> Actual result:
> {noformat}
> r_name,r_regionkey,n_name,n_regionkey
> -------------------------------
> AFRICA,0,null,null
> AFRICA,0,null,null
> AFRICA,0,null,null
> AFRICA,0,null,null
> AFRICA,0,null,null
> AMERICA,1,ARGENTINA,1
> AMERICA,1,BRAZIL,1
> AMERICA,1,CANADA,1
> AMERICA,1,PERU,1
> AMERICA,1,UNITED STATES,1
> ASIA,2,INDIA,2
> ASIA,2,INDONESIA,2
> ASIA,2,JAPAN,2
> ASIA,2,CHINA,2
> ASIA,2,VIETNAM,2
> EUROPE,3,null,null
> EUROPE,3,null,null
> EUROPE,3,null,null
> EUROPE,3,null,null
> EUROPE,3,null,null
> MIDDLE EAST,4,null,null
> MIDDLE EAST,4,null,null
> MIDDLE EAST,4,null,null
> MIDDLE EAST,4,null,null
> MIDDLE EAST,4,null,null
> {noformat}
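The difference between the expected and actual results above comes down to how many null-padded rows each preserved row yields. The following is a minimal sketch of the intended semantics, not Tajo's operator: `HashLeftOuterJoinSketch`, its `join` method, and the `rightWidth` parameter are hypothetical. For each left row it probes a hash table built on the right side, emits every joined row that passes the join filter, and emits exactly one null-padded row when none pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical illustration of left outer join semantics; not Tajo's operator.
public class HashLeftOuterJoinSketch {

    // Joins on left[leftKey] = right[rightKey]; joinFilter is the remaining
    // part of the ON clause. rightWidth is the column count of the right side.
    public static List<String[]> join(List<String[]> left, int leftKey,
                                      List<String[]> right, int rightKey,
                                      int rightWidth,
                                      BiPredicate<String[], String[]> joinFilter) {
        // Build phase: hash the right (non-preserved) side on its join key.
        Map<String, List<String[]>> table = new HashMap<>();
        for (String[] r : right) {
            table.computeIfAbsent(r[rightKey], k -> new ArrayList<>()).add(r);
        }

        List<String[]> out = new ArrayList<>();
        for (String[] l : left) {
            boolean matched = false;
            for (String[] r : table.getOrDefault(l[leftKey], Collections.emptyList())) {
                if (joinFilter.test(l, r)) {
                    out.add(concat(l, r));
                    matched = true;
                }
            }
            // One null-padded row per preserved row, not one per filtered candidate.
            if (!matched) {
                out.add(concat(l, new String[rightWidth]));
            }
        }
        return out;
    }

    // Concatenates a left row and a right row into one output row.
    private static String[] concat(String[] l, String[] r) {
        String[] row = Arrays.copyOf(l, l.length + r.length);
        System.arraycopy(r, 0, row, l.length, r.length);
        return row;
    }
}
```

In this sketch the null-padded row is emitted after the probe loop ends, so a region whose candidate nations are all rejected by the filter appears once, as in the expected result, rather than once per rejected nation.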