[jira] [Commented] (DRILL-3781) Using CURRENT_DATE in a group by throws a column not found error for hive tables and csv files

Victoria Markman (JIRA) Mon, 28 Sep 2015 09:30:05 -0700

    [ https://issues.apache.org/jira/browse/DRILL-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933525#comment-14933525 ]


Victoria Markman commented on DRILL-3781:
-----------------------------------------

This fix is verified with 1.2.0

#Thu Sep 24 00:27:47 UTC 2015
git.commit.id.abbrev=d2caa62

Verification was done on a Drill view.

With 1.1.0:
{code}
0: jdbc:drill:schema=dfs> create or replace view v1(a1, b1, c1) as select cast(a1 as integer), cast(b1 as varchar(10)), cast(c1 as date) from t1;
+-------+-------------------------------------------------------------+
|  ok   |                           summary                           |
+-------+-------------------------------------------------------------+
| true  | View 'v1' replaced successfully in 'dfs.subqueries' schema  |
+-------+-------------------------------------------------------------+
1 row selected (0.446 seconds)

0: jdbc:drill:schema=dfs> describe v1;
+--------------+--------------------+--------------+
| COLUMN_NAME  |     DATA_TYPE      | IS_NULLABLE  |
+--------------+--------------------+--------------+
| a1           | INTEGER            | YES          |
| b1           | CHARACTER VARYING  | YES          |
| c1           | DATE               | YES          |
+--------------+--------------------+--------------+
3 rows selected (0.911 seconds)

0: jdbc:drill:schema=dfs> select CURRENT_DATE, count(*) from v1 group by CURRENT_DATE;
Error: PARSE ERROR: From line 1, column 48 to line 1, column 59: Column 'CURRENT_DATE' not found in any table
[Error Id: 97e61cdc-f0c6-4ae4-9194-88583f6e2603 on atsqa4-133.qa.lab:31010] (state=,code=0)
{code}

With 1.2.0:
{code}
0: jdbc:drill:schema=dfs> select CURRENT_DATE, count(*) from v1 group by CURRENT_DATE;
+---------------+---------+
| CURRENT_DATE  | EXPR$1  |
+---------------+---------+
| 2015-09-28    | 10      |
+---------------+---------+
1 row selected (2.642 seconds)

0: jdbc:drill:schema=dfs> select count(*) from v1 group by CURRENT_DATE;
+---------+
| EXPR$0  |
+---------+
| 10      |
+---------+
1 row selected (0.516 seconds)
{code}

Added test to: Functional/Passing/aggregation/bugs
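
For readers without a Drill cluster, the expected behaviour (grouping by a constant expression such as CURRENT_DATE collapses all rows into one group) can be sketched with Python's built-in sqlite3 module. This is an illustration on a different engine, not the Drill test that was added; the function name and the sample rows are hypothetical, and only the view name v1 and the row count of 10 are taken from the output above.

```python
import sqlite3


def count_grouped_by_current_date(row_count):
    """Run the verified query against an in-memory SQLite view (not Drill)."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical stand-in for the t1 table behind the v1 view.
        conn.execute("CREATE TABLE t1 (a1 INTEGER, b1 TEXT, c1 TEXT)")
        conn.executemany(
            "INSERT INTO t1 VALUES (?, ?, ?)",
            [(i, "row%d" % i, "2015-09-28") for i in range(row_count)],
        )
        conn.execute("CREATE VIEW v1 AS SELECT a1, b1, c1 FROM t1")
        # CURRENT_DATE has the same value for every row, so GROUP BY
        # produces a single group containing all rows of the view.
        cursor = conn.execute(
            "SELECT CURRENT_DATE, count(*) FROM v1 GROUP BY CURRENT_DATE"
        )
        return cursor.fetchall()
    finally:
        conn.close()


rows = count_grouped_by_current_date(10)
print(rows)  # one (date, count) tuple; the date depends on when it runs
```

With 10 rows in the view, the query returns a single row whose count is 10, which matches the shape of the 1.2.0 result above.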

> Using CURRENT_DATE in a group by throws a column not found error for hive 
> tables and csv files
> ----------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3781
>                 URL: https://issues.apache.org/jira/browse/DRILL-3781
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill, Query Planning & Optimization
>            Reporter: Rahul Challapalli
>            Assignee: Jinfeng Ni
>             Fix For: 1.2.0
>
>         Attachments: 
> 0002-DRILL-3781-Group-by-system-function-in-schema-based-.patch, 
> csv-error.log, hive_error.log
>
>
> Commit # : e43155d8eabb6fc2d0fa4c68c25d6e7c59bf4521
> Using CURRENT_DATE in a group by seems to be failing against hive tables and 
> csv files. With parquet and json, there seem to be no issues.
> Query against a hive table :
> {code}
> select CURRENT_DATE from student_hive group by CURRENT_DATE;
> Error: PARSE ERROR: From line 1, column 48 to line 1, column 59: Column 'CURRENT_DATE' not found in any table
> [Error Id: e7d7df50-c5e8-4eda-990a-050b9a2b188e on qa-node190.qa.lab:31010] (state=,code=0)
> {code}
> Query against csv files :
> {code}
> select CURRENT_DATE from `temp.tbl` group by CURRENT_DATE;
> Error: DATA_READ ERROR: Selected column 'CURRENT_DATE' must have name 'columns' or must be plain '*'
> File Path maprfs:///drill/testdata/temp.tbl
> Fragment 0:0
> [Error Id: 1856f171-966e-4078-bbea-7ff3e9e22e15 on qa-node190.qa.lab:31010] (state=,code=0)
> {code}
> Similar queries against json and parquet work fine:
> {code}
> select current_date from `a.json` group by current_date;
> +---------------+
> | current_date  |
> +---------------+
> | 2015-09-14    |
> +---------------+
> select CURRENT_DATE from cp.`tpch/lineitem.parquet` group by CURRENT_DATE;
> +---------------+
> | CURRENT_DATE  |
> +---------------+
> | 2015-09-14    |
> +---------------+
> {code}
> I attached the log files for the failing cases. Let me know if you need 
> anything else.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
