[
https://issues.apache.org/jira/browse/TAJO-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041758#comment-14041758
]
ASF GitHub Bot commented on TAJO-853:
-------------------------------------
Github user hyunsik commented on the pull request:
https://github.com/apache/tajo/pull/28#issuecomment-46936065
Thank you for the nice contribution. The patch looks good to me. This patch
seems to fix outer join behavior in the following cases:
* join conditions located in either the ON clause or the WHERE clause
* column references in join conditions that correspond to either
row-preserving or null-supplying tables.
There are three comments:
* The patch needs a rebase. It conflicts with the changes of
TAJO-850.
* TestJoinQuery is changed to use parameterized unit tests. The constructor
of TestJoinQuery changes the global configuration in all workers, which affects
other unit tests. As a result, I'm facing the following errors. When I removed the
reconfiguration in TestJoinQuery, all tests passed.
```
Results :
Failed tests:
testGroupbyWithJson(org.apache.tajo.engine.query.TestGroupByQuery): Result
Verification expected:<...-------------------(..)
testHavingWithAggFunction(org.apache.tajo.engine.query.TestGroupByQuery):
Result Verification expected:<...-------------------(..)
testGroupByWithSameConstantKeys1(org.apache.tajo.engine.query.TestGroupByQuery):
Result Verification expected:<...---------(..)
testHavingWithNamedTarget(org.apache.tajo.engine.query.TestGroupByQuery):
Result Verification expected:<...-------------------(..)
testFilterPushDownPartitionColumnCaseWhen(org.apache.tajo.engine.query.TestJoinOnPartitionedTables):
Result Verification
expected:<[c_custkey,c_nationkey,c_name,o_custkey,?casewhen(..)
```
* I found a trivial NPE bug in the outer join tests. I could fix it by
adding some trivial workaround code to SeqScanExec::scanAndAddCache. The
error log is as follows:
```
2014-06-24 14:08:15,673 INFO:
org.apache.tajo.engine.planner.PhysicalPlannerImpl
(createBestLeftOuterJoinPlan(470)) - Left Outer Join (5) chooses [Hash Join].
2014-06-24 14:08:15,675 ERROR: org.apache.tajo.worker.Task (run(395)) -
java.lang.NullPointerException
    at org.apache.tajo.engine.planner.physical.SeqScanExec.scanAndAddCache(SeqScanExec.java:236)
    at org.apache.tajo.engine.planner.physical.SeqScanExec.init(SeqScanExec.java:176)
    at org.apache.tajo.engine.planner.physical.BinaryPhysicalExec.init(BinaryPhysicalExec.java:53)
    at org.apache.tajo.engine.planner.physical.UnaryPhysicalExec.init(UnaryPhysicalExec.java:52)
    at org.apache.tajo.engine.planner.physical.StoreTableExec.init(StoreTableExec.java:48)
    at org.apache.tajo.worker.Task.run(Task.java:386)
    at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:406)
    at java.lang.Thread.run(Thread.java:744)
2014-06-24 14:08:15,675 INFO: org.apache.tajo.worker.TaskAttemptContext
(setState(115)) - Query status of ta_1403586203430_0348_000003_000000_00 is
changed to TA_FAILED
2014-06-24 14:08:15,676 INFO: org.apache.tajo.worker.Task (run(452)) -
Worker's task counter - total:1, succeeded: 0, killed: 1, failed: 1
2014-06-24 14:08:15,676 ERROR:
org.apache.tajo.master.querymaster.QueryUnitAttempt (transition(418)) -
ta_1403586203430_0348_000003_000000_00 FROM 192.168.0.205 >>
java.lang.NullPointerException
2014-06-24 14:08:15,678 INFO: org.apache.tajo.worker.TaskRunner (run(346))
- Request GetTask:
eb_1403586203430_0348_000003,container_1403586203430_0348_01_001096
```
Fixed code in SeqScanExec::scanAndAddCache():
```
if (scanner != null) {
  scanner.close();
  scanner = null;
}
```
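Regarding the second comment above (TestJoinQuery changing shared configuration), one possible way to keep such a change from leaking into other tests is to snapshot the settings before overriding them and restore them afterwards. The following is only a minimal sketch of that idea: `SHARED_CONF`, `runWithOverride`, and the key `example.join.threshold` are hypothetical stand-ins and not Tajo's actual configuration API or test code.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfIsolationSketch {
  // Hypothetical stand-in for a configuration shared by every test in the JVM.
  static final Map<String, String> SHARED_CONF = new HashMap<>();

  // Applies one override, runs the "test body", and always restores the
  // previous settings so that later tests see the original configuration.
  static String runWithOverride(String key, String value) {
    Map<String, String> snapshot = new HashMap<>(SHARED_CONF);
    try {
      SHARED_CONF.put(key, value);
      // A real test body would run here; this sketch just reads the value back.
      return SHARED_CONF.get(key);
    } finally {
      SHARED_CONF.clear();
      SHARED_CONF.putAll(snapshot);
    }
  }

  public static void main(String[] args) {
    SHARED_CONF.put("example.join.threshold", "default");
    String seenInside = runWithOverride("example.join.threshold", "1");
    System.out.println("inside test: " + seenInside);
    System.out.println("after test:  " + SHARED_CONF.get("example.join.threshold"));
  }
}
```

With JUnit, the same snapshot and restore steps would typically live in setup and teardown methods rather than in the test constructor.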
> Refactoring FilterPushDown for OUTER JOIN
> -----------------------------------------
>
> Key: TAJO-853
> URL: https://issues.apache.org/jira/browse/TAJO-853
> Project: Tajo
> Issue Type: Improvement
> Reporter: Hyoungjun Kim
> Assignee: Hyoungjun Kim
> Priority: Minor
>
> Currently Tajo doesn't support filters in an OUTER JOIN's ON clause,
> or has some bugs with them. There are some rules for this at the following URLs:
> - http://www.ibm.com/developerworks/data/library/techarticle/purcell/0112purcell.html
> - https://cwiki.apache.org/confluence/display/Hive/OuterJoinBehavior
> Briefly summarized as follows.
> - Join Predicate on Preserved Row Table: Used as a join condition (not a filter)
> - Join Predicate on Null Supplying Table: Can be pushed down to the table scan
> - Where Predicate on Preserved Row Table: Can be pushed down to the table scan
> - Where Predicate on Null Supplying Table: Used as a filter on the join result
> data. This filter condition is attached to the SELECTION node.
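
The four rules summarized above can be written as a small decision table. The sketch below is only an illustration of that summary, not the code from the patch: the class, enum, and method names are all hypothetical, and it assumes each predicate references columns from only one side of the join.

```java
public class OuterJoinPredicateRules {
  // Where the predicate appears in the query.
  enum Clause { ON, WHERE }

  // Which side of the outer join the predicate's columns belong to.
  enum Side { PRESERVED_ROW, NULL_SUPPLYING }

  // What the optimizer may do with the predicate.
  enum Action { KEEP_AS_JOIN_CONDITION, PUSH_DOWN_TO_SCAN, FILTER_AFTER_JOIN }

  // Encodes the four cases from the issue description.
  static Action classify(Clause clause, Side side) {
    if (clause == Clause.ON) {
      return side == Side.PRESERVED_ROW
          ? Action.KEEP_AS_JOIN_CONDITION   // must not filter preserved rows
          : Action.PUSH_DOWN_TO_SCAN;       // safe to apply before the join
    }
    return side == Side.PRESERVED_ROW
        ? Action.PUSH_DOWN_TO_SCAN          // safe to apply before the join
        : Action.FILTER_AFTER_JOIN;         // evaluated on the join result
  }

  public static void main(String[] args) {
    for (Clause c : Clause.values()) {
      for (Side s : Side.values()) {
        System.out.println(c + " predicate on " + s + " table -> " + classify(c, s));
      }
    }
  }
}
```

For example, in `a LEFT OUTER JOIN b ON a.id = b.id AND b.x > 10 WHERE a.y = 1`, table `a` is the preserved row table and `b` is the null supplying table, so under these rules both `b.x > 10` and `a.y = 1` could be pushed down to their table scans.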