All,
The question is: why does to_date work on a Parquet column but fail when a
CSV column is used?
Drill 1.9.0, commit a29f1e29
A simple projection of a column from a CSV file works.
{noformat}
0: jdbc:drill:schema=dfs.tmp> select columns[4] FROM `typeall_l.csv` t1
limit 5;
+-------------+
| EXPR$0 |
+-------------+
| 2011-11-04 |
| 1986-10-22 |
| 1992-09-10 |
| 2016-08-07 |
| 1986-01-25 |
+-------------+
5 rows selected (0.26 seconds)
{noformat}
Using the TO_DATE function with columns[x] as the first argument fails with
an IllegalArgumentException.
{noformat}
0: jdbc:drill:schema=dfs.tmp> select to_date(columns[4],'yyyy-mm-dd') FROM
`typeall_l.csv` t1 limit 5;
Error: SYSTEM ERROR: IllegalArgumentException: Invalid format: ""
Fragment 0:0
[Error Id: 9cff3eb9-4045-4d9a-a6a1-1eadaa597f30 on centos-01.qa.lab:31010]
(state=,code=0)
{noformat}
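One possible explanation, offered as an assumption rather than a confirmed
diagnosis: the text Invalid format: "" looks like the message Joda-Time
produces when asked to parse an empty string, so some row of the CSV file may
have an empty value in columns[4]. The LIMIT does not necessarily prevent
this, because the function can be evaluated over a whole batch of rows before
the limit is applied. A minimal sketch for checking this and for guarding the
call, reusing the file and column index from the query above (the CASE guard
is an illustration, not a verified fix):
{noformat}
-- Count rows whose fifth field is an empty string
-- (assumption: such rows exist in typeall_l.csv).
select count(*) from `typeall_l.csv` t1 where columns[4] = '';

-- Map empty strings to NULL so that to_date only sees non-empty input.
select case when columns[4] = '' then cast(null as date)
            else to_date(columns[4], 'yyyy-mm-dd') end
from `typeall_l.csv` t1 limit 5;
{noformat}
If the count comes back as zero, the empty-string theory does not hold and
the cause lies elsewhere.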
Interestingly, however, the same query over a Parquet column holding the same
data runs without error.
{noformat}
0: jdbc:drill:schema=dfs.tmp> select to_date(col_dt,'yyyy-mm-dd') FROM
typeall_l limit 5;
+-------------+
| EXPR$0 |
+-------------+
| 2011-01-04 |
| 1986-01-22 |
| 1992-01-10 |
| 2016-01-07 |
| 1986-01-25 |
+-------------+
5 rows selected (0.286 seconds)
{noformat}
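Note that the months in this output are all 01, while the raw CSV values show
11, 10, 09, 08 and 01. to_date takes a Joda-Time style pattern, in which
lowercase mm means minute-of-hour and uppercase MM means month-of-year, so
'yyyy-mm-dd' probably leaves the month at its default. A sketch of the same
query with the month pattern, assuming col_dt holds strings such as
2011-11-04:
{noformat}
-- MM = month-of-year; lowercase mm would be read as minute-of-hour.
select to_date(col_dt, 'yyyy-MM-dd') from typeall_l limit 5;
{noformat}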
When a date string literal is passed as the first argument, the to_date
function returns the correct result.
{noformat}
0: jdbc:drill:schema=dfs.tmp> select to_date('2011-01-04','yyyy-mm-dd')
from (values(1));
+-------------+
| EXPR$0 |
+-------------+
| 2011-01-04 |
+-------------+
1 row selected (0.235 seconds)
{noformat}
Thanks,
Khurram