Hello Drill Devs,
I'm noticing some strange behavior with the newest version of Drill. If you
query a CSV file with extracted headers (csvh), you get the following metadata:
SELECT * FROM dfs.test.`domains.csvh` LIMIT 1
{
  "queryId": "22eee85f-c02c-5878-9735-091d18788061",
  "columns": [
    "domain"
  ],
  "rows": [
    {
      "domain": "thedataist.com"
    }
  ],
  "metadata": [
    "VARCHAR(0, 0)",
    "VARCHAR(0, 0)"
  ],
  "queryState": "COMPLETED",
  "attemptedAutoLimit": 0
}
There are two issues here:
1. VARCHAR is now reported with precision, as VARCHAR(0, 0).
2. The metadata array has two entries, but the result has only one column,
   so there are twice as many as there should be.
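For anyone who wants to check the second issue quickly, here is a minimal
Python sketch that compares the two counts. It assumes the metadata array is
meant to hold exactly one entry per column; the response is pasted in (trimmed
to the relevant fields) rather than fetched, and metadata_mismatch is just a
helper name I made up for this illustration.

```python
import json

# Response copied from the query above, trimmed to the relevant fields.
response_text = """
{
  "columns": ["domain"],
  "rows": [{"domain": "thedataist.com"}],
  "metadata": ["VARCHAR(0, 0)", "VARCHAR(0, 0)"]
}
"""


def metadata_mismatch(response):
    """Return (column_count, metadata_count) if they differ, else None.

    Assumption: one metadata entry is expected per column.
    """
    n_cols = len(response["columns"])
    n_meta = len(response["metadata"])
    return (n_cols, n_meta) if n_cols != n_meta else None


response = json.loads(response_text)
mismatch = metadata_mismatch(response)
print(mismatch)  # (1, 2)
```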
Additionally, if you query a regular CSV, without the columns extracted, you
get the following:
"rows": [
  {
    "columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]"
  }
],
"metadata": [
  "VARCHAR(0, 0)",
  "VARCHAR(0, 0)"
],
This is bizarre for two reasons. The data type is not being reported
correctly (it should be LIST or something like that), and we're again getting
too many entries in the metadata. I'll submit a JIRA as well, but could
someone please take a look?
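To illustrate the type problem, here is a small Python sketch using the row
exactly as pasted above. It assumes the string in the response is a
JSON-encoded array, which is how it looks in the output; the variable names
are only for illustration.

```python
import json

# The row as it appears in the response: the value is a string, not an array.
row = {"columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]"}

# Decoding the string shows that the underlying value is a list of four
# strings, which is why VARCHAR looks like the wrong reported type.
value = json.loads(row["columns"])
print(type(value).__name__, len(value))  # list 4
```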
Thanks,
-- C