[ https://issues.apache.org/jira/browse/DRILL-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aman Sinha reassigned DRILL-1401: --------------------------------- Assignee: Aman Sinha > Wrong result with csv data when projecting columns not part of the Join or > Filter > --------------------------------------------------------------------------------- > > Key: DRILL-1401 > URL: https://issues.apache.org/jira/browse/DRILL-1401 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization > Affects Versions: 0.5.0 > Reporter: Chun Chang > Assignee: Aman Sinha > Attachments: aggregate_100r.csv > > > #Wed Sep 10 13:35:05 PDT 2014 > git.commit.id.abbrev=686eb9e > join (inner,full outer,left,right) does not work if the join is directly > applied to csv files. It will produce either null values on the projected > columns or NumberFormatException depends on joining condition applied on > which columns (first or second columns). If you create views on the csv > files, then join works on the views. > For example, the following join give null values: > 0: jdbc:drill:schema=dfs> select cast(`aggregate_100r.csv`.columns[0] as > int), cast(`aggregate_100r.csv`.columns[1] as int), > cast(`join_100r.csv`.columns[1] as int) from `aggregate_100r.csv` inner join > `join_100r.csv` on cast(`aggregate_100r.csv`.columns[0] as int) = > cast(`join_100r.csv`.columns[0] as int); > +------------+------------+------------+ > | EXPR$0 | EXPR$1 | EXPR$2 | > +------------+------------+------------+ > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 0 | null | null | > | 1 | null | null | > | 1 | null | null | > | 1 | null | null | > | 1 | null | null | > The following give NumberFormatException: > 0: jdbc:drill:schema=dfs> select cast(`aggregate_100r.csv`.columns[0] as > int), cast(`aggregate_100r.csv`.columns[1] as int), > cast(`join_100r.csv`.columns[1] as int) from `aggregate_100r.csv` inner join > `join_100r.csv` on cast(`aggregate_100r.csv`.columns[1] as int) = > cast(`join_100r.csv`.columns[1] as int); > Query failed: Failure while running fragment. > [3f67299e-f312-445b-8d6b-74984a820f0c] > Error: exception while executing query: Failure while trying to get next > result batch. (state=,code=0) > The following with views works: > 0: jdbc:drill:schema=dfs> select `aggregate_100r_v`.c0, > `aggregate_100r_v`.c1, `join_100r_v`.c1 from `aggregate_100r_v` inner join > `join_100r_v` on `aggregate_100r_v`.c0 = `join_100r_v`.c0; > +------------+------------+------------+ > | c0 | c1 | c10 | > +------------+------------+------------+ > | 0 | 0 | 0 | > | 0 | 0 | 2 | > | 0 | 0 | 1 | > | 0 | 0 | 0 | > | 0 | 0 | 0 | > | 0 | 0 | 2 | > | 0 | 0 | 1 | > | 0 | 0 | 0 | > | 0 | 1 | 0 | > | 0 | 1 | 2 | > | 0 | 1 | 1 | > | 0 | 1 | 0 | > | 0 | 2 | 0 | > | 0 | 2 | 2 | > | 0 | 2 | 1 | > | 0 | 2 | 0 | > | 1 | 1 | 1 | > | 1 | 1 | 2 | > | 1 | 1 | 1 | > | 1 | 1 | 0 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)