[jira] [Commented] (DRILL-2223) Empty parquet file created with Limit 0 query errors out when querying

Khurram Faraaz (JIRA) Sat, 19 Mar 2016 21:55:58 -0700

    [ https://issues.apache.org/jira/browse/DRILL-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200792#comment-15200792 ]


Khurram Faraaz commented on DRILL-2223:
---------------------------------------

Yes, the reported error is not reproducible on Drill 1.7.0. The open question 
is what a successful CTAS over a LIMIT 0 query should produce. Every successful 
CTAS creates a valid parquet file, so one would expect this CTAS to create an 
empty parquet file: schema metadata in the parquet footer and no actual data, 
since the query was a LIMIT 0 query. Instead, querying the new table fails with 
a "table not found" error:

{noformat}
0: jdbc:drill:schema=dfs.tmp> create table t_2223 as select firstName, 
lastName, isAlive, age, height_cm, address, phoneNumbers, hobbies from 
`employee.json` LIMIT 0;
+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 0_0       | 0                          |
+-----------+----------------------------+
1 row selected (0.31 seconds)
0: jdbc:drill:schema=dfs.tmp> select * from t_2223;
Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 20: Table 
't_2223' not found

SQL Query null

[Error Id: 18273406-da54-415d-b8fe-aa96c6cc3c85 on centos-01.qa.lab:31010] 
(state=,code=0)
{noformat}
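
To make the "valid but empty" expectation concrete, here is a minimal sketch 
of the structural minimum a Parquet file has to meet: a leading PAR1 magic, a 
footer, a 4-byte little-endian footer length, and a trailing PAR1 magic, which 
gives a floor of 12 bytes. The function name check_parquet_envelope is 
hypothetical, and the sketch does not decode the footer or claim to mirror the 
exact check Drill's Parquet reader performs; it only illustrates why a 
0-length file cannot carry schema information.

```python
import os
import struct

PARQUET_MAGIC = b"PAR1"
# Smallest possible layout: leading magic + 4-byte footer length + trailing magic.
MIN_PARQUET_SIZE = len(PARQUET_MAGIC) + 4 + len(PARQUET_MAGIC)


def check_parquet_envelope(path):
    """Return (ok, reason) for structural checks only; the footer is not decoded.

    Hypothetical helper for illustration, not part of Drill or any Parquet library.
    """
    size = os.path.getsize(path)
    if size < MIN_PARQUET_SIZE:
        return False, f"too small ({size} bytes, need at least {MIN_PARQUET_SIZE})"

    with open(path, "rb") as f:
        head = f.read(4)
        # The last 8 bytes are the footer length followed by the trailing magic.
        f.seek(-8, os.SEEK_END)
        footer_len_bytes = f.read(4)
        tail = f.read(4)

    if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
        return False, "missing PAR1 magic bytes"

    (footer_len,) = struct.unpack("<I", footer_len_bytes)
    # The footer must be non-empty and fit between the two magic markers.
    if footer_len == 0 or footer_len > size - MIN_PARQUET_SIZE:
        return False, f"implausible footer length {footer_len}"

    return True, f"footer of {footer_len} bytes present"
```

Run against a 0-length 0_0_0.parquet such as the one in the original report, 
this returns False with a "too small" reason. A file written by a LIMIT 0 CTAS 
that keeps the schema would pass these checks while containing no row data.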

> Empty parquet file created with Limit 0 query errors out when querying
> ----------------------------------------------------------------------
>
>                 Key: DRILL-2223
>                 URL: https://issues.apache.org/jira/browse/DRILL-2223
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 0.7.0
>            Reporter: Aman Sinha
>             Fix For: Future
>
>
> Doing a CTAS with limit 0 creates a 0 length parquet file which errors out 
> during querying.  This should at least write the schema information and 
> metadata which will allow queries to run. 
> {code}
> 0: jdbc:drill:zk=local> create table tt_nation2 as select n_nationkey, 
> n_name, n_regionkey from cp.`tpch/nation.parquet` limit 0;
> +------------+---------------------------+
> |  Fragment  | Number of records written |
> +------------+---------------------------+
> | 0_0        | 0                         |
> +------------+---------------------------+
> 1 row selected (0.315 seconds)
> 0: jdbc:drill:zk=local> select n_nationkey from tt_nation2;
> Query failed: RuntimeException: file:/tmp/tt_nation2/0_0_0.parquet is not a 
> Parquet file (too small)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
