[
https://issues.apache.org/jira/browse/DRILL-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196581#comment-16196581
]
Prasad Nagaraj Subramanya commented on DRILL-5851:
--------------------------------------------------
I did some analysis on the test case. The issue occurs when both of the
following hold:
1) the query is a star query against an empty csv file
2) the parquet column involved in the join is of a non-integer type (the
parquet side could be any other data source)
The issue is not observed under the following circumstances -
a) If the parquet column was of type integer. Because it's a '*' query that
involves a csv file with no headers, an instance of RepeatedVarCharOutput is
used, which returns nullable int when there is no data, so both join keys are
integers.
b) If the csv column was projected using columns[]
{code}
select * from cp.`sample-data/nation.parquet` nation left outer join
dfs.tmp.`2.csv` as two on two.columns[1] = nation.`N_COMMENT`;
{code}
c) If an empty csv was used with extract header set to true, and we had
explicit projections rather than *
{code}
select nation.`N_COMMENT`, nation.`N_NAME`, two.b from
cp.`sample-data/nation.parquet` nation left outer join dfs.tmp.`2.csv` as two
on two.a = nation.`N_COMMENT`;
{code}
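The type rule quoted in the error message, together with case (a), can be
sketched as a small model. This is only an illustration under stated
assumptions, not Drill's actual code: the function names are hypothetical, the
type families hold an illustrative subset of types, and the nullable INT for
the empty csv side is taken from the analysis above.

```python
# Hypothetical model of the implicit-cast rule from the error message.
# Not Drill's implementation; the families list only a subset of types.
NUMERIC = {"INT", "BIGINT", "FLOAT4", "FLOAT8"}
CHARACTER = {"VARCHAR", "VARBINARY"}
TEMPORAL = {"DATE", "TIMESTAMP"}
FAMILIES = (NUMERIC, CHARACTER, TEMPORAL)


def can_implicitly_cast(left, right):
    """Return True when both join-key types belong to the same family."""
    return any(left in family and right in family for family in FAMILIES)


def join_key_type_for_empty_csv():
    """Assumed from the analysis: with no data the column is nullable INT."""
    return "INT"


# Reported case: N_COMMENT is VARCHAR, the empty csv side resolves to INT.
print(can_implicitly_cast("VARCHAR", join_key_type_for_empty_csv()))  # False

# Case (a): an integer parquet column matches the INT from the empty csv.
print(can_implicitly_cast("INT", join_key_type_for_empty_csv()))  # True
```

Under this model the reported query fails because VARCHAR and INT fall into
different families, while an integer join column on the parquet side passes
the check.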
> Empty table during a join operation with a non empty table produces cast
> exception
> -----------------------------------------------------------------------------------
>
> Key: DRILL-5851
> URL: https://issues.apache.org/jira/browse/DRILL-5851
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.11.0
> Reporter: Hanumath Rao Maduri
> Assignee: Hanumath Rao Maduri
>
> Hash Join operation on tables with one table empty and the other non empty
> throws an exception
> {code}
> Error: SYSTEM ERROR: DrillRuntimeException: Join only supports implicit casts
> between 1. Numeric data
> 2. Varchar, Varbinary data 3. Date, Timestamp data Left type: VARCHAR, Right
> type: INT. Add explicit casts to avoid this error
> {code}
> Here is an example query with which it is reproducible.
> {code}
> select * from cp.`sample-data/nation.parquet` nation left outer join
> dfs.tmp.`2.csv` as two on two.a = nation.`N_COMMENT`;
> {code}
> 2.csv is empty (i.e. it does not even contain header info).