Khurram Faraaz created DRILL-5345:
-------------------------------------
Summary: USING(<column-name>) should return error, when column in
not present in the right table
Key: DRILL-5345
URL: https://issues.apache.org/jira/browse/DRILL-5345
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.10.0
Reporter: Khurram Faraaz
When the column used in, USING(column) is not present in the right table of the
JOIN, we show report an error to user instead of returning zero rows.
{noformat}
<named columns join> ::=
USING <left paren> <join column list> <right paren>
<join column list> ::=
<column name list>
<column name list> ::=
<column name> [ { <comma> <column name> }... ]
<column name> ::=
<identifier>
{noformat}
Postgres 9.3 reports an error for such a scenario, because column c1 is not
present in the other table in the join.
{noformat}
postgres=# select * from tbl_l JOIN tbl_r USING (c1);
ERROR: column "c1" specified in USING clause does not exist in right table
{noformat}
Drill 1.10.0 returns zero rows in one case and RuntimeException in the other
case. In both cases Drill should return a meaning error message, like the one
reported by Postgres.
{noformat}
0: jdbc:drill:schema=dfs.tmp> create table tbl_l as select cast(columns[0] as
integer) c1, cast(columns[1] as varchar(25)) c2 from `tbl_l.txt`;
+-----------+----------------------------+
| Fragment | Number of records written |
+-----------+----------------------------+
| 0_0 | 10 |
+-----------+----------------------------+
1 row selected (0.197 seconds)
0: jdbc:drill:schema=dfs.tmp> create table tbl_r as select cast(columns[0] as
integer) c3, cast(columns[1] as varchar(25)) c4 from `tbl_r.txt`;
+-----------+----------------------------+
| Fragment | Number of records written |
+-----------+----------------------------+
| 0_0 | 10 |
+-----------+----------------------------+
1 row selected (0.221 seconds)
## this is not right, we should report error for below query.
0: jdbc:drill:schema=dfs.tmp> select * from tbl_l JOIN tbl_r USING (c1);
+-----+-----+-----+-----+
| c1 | c2 | c3 | c4 |
+-----+-----+-----+-----+
+-----+-----+-----+-----+
No rows selected (0.222 seconds)
## we should return a meaningful error message
0: jdbc:drill:schema=dfs.tmp> select * from tbl_l JOIN tbl_r USING (c2);
Error: SYSTEM ERROR: DrillRuntimeException: Join only supports implicit casts
between 1. Numeric data
2. Varchar, Varbinary data 3. Date, Timestamp data Left type: INT, Right type:
VARCHAR. Add explicit casts to avoid this error
Fragment 0:0
[Error Id: 56107e6e-54b5-4905-9aa4-b0af2aea4dd9 on centos-01.qa.lab:31010]
(state=,code=0)
{noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)