[
https://issues.apache.org/jira/browse/DRILL-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nitin updated DRILL-5109:
-------------------------
Description:
When the following query is executed:
0: jdbc:drill:zk=local> create table dfs.tmp.`/tmp/t` as select
cast(employee_id as double) employee_id, cast(department_id as double)
department_id,cast(salary as double) salary,DENSE_RANK( ) over( partition by
cast(department_id as double) order by cast(salary as double) asc nulls first
) dummy_DENSE_RANK from cp.`employee.json`;
0: jdbc:drill:zk=local> select * from dfs.tmp.`/tmp/t` limit 5;
+------+------+----------+-----+
| $0 | $1 | $2 | $3 |
+------+------+----------+-----+
| 1.0 | 1.0 | 80000.0 | 4 |
| 2.0 | 1.0 | 40000.0 | 3 |
| 4.0 | 1.0 | 40000.0 | 3 |
| 5.0 | 1.0 | 35000.0 | 2 |
| 6.0 | 2.0 | 25000.0 | 4 |
+------+------+----------+-----+
The result should have had the proper column names. Even the Parquet schema
shows the placeholder names:
bash-3.2$ java -jar parquet-tools-1.6.0rc4.jar schema /tmp/tmp/t/0_0_0.parquet
message root {
optional double $0;
optional double $1;
optional double $2;
required int64 $3;
}
But when we add an ORDER BY clause to the query, the column names are written
correctly, so this looks like an issue with the storage writer. The problem
occurs with every file format we choose as the CTAS storage format.
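
To check other CTAS outputs for the same symptom, the schema text printed by the parquet-tools `schema` command above can be scanned for placeholder names such as `$0`. The following is only a minimal sketch, not part of Drill or parquet-tools: the function name `placeholder_columns` and the regular expression are assumptions made for illustration, and the sample text is the schema quoted in this report.

```python
import re

# Matches one field line of a parquet-tools schema dump, for example
# "optional double $0;". An optional trailing annotation such as "(UTF8)"
# is tolerated. This pattern is an illustrative assumption, not a full
# parser for the Parquet schema syntax.
FIELD_RE = re.compile(
    r"^\s*(optional|required|repeated)\s+(\w+)\s+(\S+?)(\s*\(.*\))?;\s*$"
)


def placeholder_columns(schema_text):
    """Return field names that look like positional placeholders ($0, $1, ...)."""
    names = []
    for line in schema_text.splitlines():
        match = FIELD_RE.match(line)
        if match and re.fullmatch(r"\$\d+", match.group(3)):
            names.append(match.group(3))
    return names


# Schema text copied from the report above.
SCHEMA_FROM_REPORT = """\
message root {
optional double $0;
optional double $1;
optional double $2;
required int64 $3;
}
"""

if __name__ == "__main__":
    print(placeholder_columns(SCHEMA_FROM_REPORT))
```

An empty list would mean that no field in the dump carries a placeholder name; for the schema in this report, all four fields are flagged.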
was:
When the following query is executed:
0: jdbc:drill:zk=local> create table dfs.tmp.`/tmp/t` as select
cast(employee_id as double) employee_id, cast(department_id as double)
department_id,cast(salary as double) salary,DENSE_RANK( ) over( partition by
cast(department_id as double) order by cast(salary as double) asc nulls first
) dummy_DENSE_RANK from cp.`employee.json`;
0: jdbc:drill:zk=local> select * from dfs.tmp.`/tmp/t` limit 5;
+------+------+----------+-----+
| $0 | $1 | $2 | $3 |
+------+------+----------+-----+
| 1.0 | 1.0 | 80000.0 | 4 |
| 2.0 | 1.0 | 40000.0 | 3 |
| 4.0 | 1.0 | 40000.0 | 3 |
| 5.0 | 1.0 | 35000.0 | 2 |
| 6.0 | 2.0 | 25000.0 | 4 |
+------+------+----------+-----+
The result should have had the proper column names. Even the Parquet schema
shows the placeholder names:
bash-3.2$ java -jar parquet-tools-1.6.0rc4.jar schema /tmp/tmp/t/0_0_0.parquet
message root {
optional double $0;
optional double $1;
optional double $2;
required int64 $3;
}
But when we add an ORDER BY clause to the query, the column names are written
correctly, so this looks like an issue with the storage writer. The problem
occurs with every file format we choose as the CTAS storage format.
> CTAS queries for window functions creating files without column names
> ---------------------------------------------------------------------
>
> Key: DRILL-5109
> URL: https://issues.apache.org/jira/browse/DRILL-5109
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill, Storage - Writer
> Affects Versions: 1.8.0, 1.9.0
> Reporter: Nitin
>
> When the following query is executed:
> 0: jdbc:drill:zk=local> create table dfs.tmp.`/tmp/t` as select
> cast(employee_id as double) employee_id, cast(department_id as double)
> department_id,cast(salary as double) salary,DENSE_RANK( ) over( partition by
> cast(department_id as double) order by cast(salary as double) asc nulls
> first ) dummy_DENSE_RANK from cp.`employee.json`;
> 0: jdbc:drill:zk=local> select * from dfs.tmp.`/tmp/t` limit 5;
> +------+------+----------+-----+
> | $0 | $1 | $2 | $3 |
> +------+------+----------+-----+
> | 1.0 | 1.0 | 80000.0 | 4 |
> | 2.0 | 1.0 | 40000.0 | 3 |
> | 4.0 | 1.0 | 40000.0 | 3 |
> | 5.0 | 1.0 | 35000.0 | 2 |
> | 6.0 | 2.0 | 25000.0 | 4 |
> +------+------+----------+-----+
> The result should have had the proper column names. Even the Parquet schema
> shows the placeholder names:
> bash-3.2$ java -jar parquet-tools-1.6.0rc4.jar schema
> /tmp/tmp/t/0_0_0.parquet
> message root {
> optional double $0;
> optional double $1;
> optional double $2;
> required int64 $3;
> }
> But when we add an ORDER BY clause to the query, the column names are written
> correctly, so this looks like an issue with the storage writer. The problem
> occurs with every file format we choose as the CTAS storage format.