[
https://issues.apache.org/jira/browse/DRILL-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285101#comment-16285101
]
ASF GitHub Bot commented on DRILL-5851:
---------------------------------------
Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/1059#discussion_r155939344
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/MergeJoinBatch.java ---
@@ -152,7 +152,10 @@ public void buildSchema() {
return;
}
- if (leftOutcome == IterOutcome.NONE && rightOutcome == IterOutcome.NONE) {
+ if (joinType == JoinRelType.INNER && (leftOutcome == IterOutcome.NONE || rightOutcome == IterOutcome.NONE) ||
+ joinType != JoinRelType.INNER && (leftOutcome == IterOutcome.NONE && rightOutcome == IterOutcome.NONE)) {
+ drainStream(leftOutcome, 0, left);
+ drainStream(rightOutcome, 1, right);
--- End diff --
Does drainStream do what its name suggests, that is, read values until there
are no more to read? I wonder whether this has been fully tested, or whether
it will end up running the subquery to completion unnecessarily.
Also, here we check only for NONE, whereas above we checked for error codes.
Should we check for the error codes here as well?
Or, better, when we receive an error code, should we simply throw an
exception and end it all? (That is, the code does not currently do anything
useful with STOP, OUT_OF_MEMORY or NOTYET.)
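To make the two suggestions concrete, here is a rough sketch of what I have in mind. It is not the actual MergeJoinBatch code: the enums below are hypothetical stand-ins for Drill's IterOutcome and JoinRelType, and the class and method names (JoinOutcomeSketch, failOnError, joinIsEmpty) are made up for illustration. The condition in joinIsEmpty mirrors the one in the diff.

```java
// Illustrative sketch only. The enums are hypothetical stand-ins for
// Drill's IterOutcome and JoinRelType, not the real classes.
final class JoinOutcomeSketch {
    enum IterOutcome { OK, OK_NEW_SCHEMA, NONE, STOP, OUT_OF_MEMORY, NOT_YET }
    enum JoinRelType { INNER, LEFT, RIGHT, FULL }

    // Fail fast on outcomes the join cannot do anything useful with,
    // instead of letting them fall through to the NONE checks.
    static void failOnError(IterOutcome outcome, String side) {
        switch (outcome) {
            case STOP:
            case OUT_OF_MEMORY:
            case NOT_YET:
                throw new IllegalStateException(
                    "Unexpected outcome " + outcome + " on " + side + " input");
            default:
                break;
        }
    }

    // Same condition as in the diff: an inner join is empty if either
    // side is empty; any other join type only if both sides are empty.
    static boolean joinIsEmpty(JoinRelType joinType,
                               IterOutcome left, IterOutcome right) {
        boolean leftDone = left == IterOutcome.NONE;
        boolean rightDone = right == IterOutcome.NONE;
        if (joinType == JoinRelType.INNER) {
            return leftDone || rightDone;
        }
        return leftDone && rightDone;
    }
}
```

With something like this, buildSchema() would call failOnError for each side first and only then evaluate the emptiness condition, so the error codes are never silently treated as "has data".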
> Empty table during a join operation with a non empty table produces cast
> exception
> -----------------------------------------------------------------------------------
>
> Key: DRILL-5851
> URL: https://issues.apache.org/jira/browse/DRILL-5851
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.11.0
> Reporter: Hanumath Rao Maduri
> Assignee: Hanumath Rao Maduri
>
> A Hash Join between an empty table and a non-empty table throws an
> exception:
> {code}
> Error: SYSTEM ERROR: DrillRuntimeException: Join only supports implicit casts
> between 1. Numeric data
> 2. Varchar, Varbinary data 3. Date, Timestamp data Left type: VARCHAR, Right
> type: INT. Add explicit casts to avoid this error
> {code}
> Here is an example query that reproduces it:
> {code}
> select * from cp.`sample-data/nation.parquet` nation left outer join
> dfs.tmp.`2.csv` as two on two.a = nation.`N_COMMENT`;
> {code}
> The file 2.csv is empty (i.e., it does not even contain header info).