[jira] [Commented] (DRILL-2223) Empty parquet file created with Limit 0 query errors out when querying

SAIKRISHNA (JIRA) Tue, 29 Nov 2016 00:12:03 -0800

    [ https://issues.apache.org/jira/browse/DRILL-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15704613#comment-15704613 ]


SAIKRISHNA commented on DRILL-2223:
-----------------------------------

Is there any way to create an empty Parquet table, that is, a schema with zero 
records? For my business use case we need to create an empty Parquet file that 
carries the schema, the same way it is created for JSON with zero records.

I tried the query below, which writes zero records:
create table target.HIVE.employeeTest2911 AS SELECT * FROM cp.`employee.json` 
where employee_id > 1157

Fragment      Number of records written
0_0                  0

When I then run
             select * from target.HIVE.employeeTest2911
I get this exception:
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From 
line 1, column 16 to line 1, column 21: Table 'target.HIVE.employeeTest2911' 
not found SQL Query null [Error Id: 5ee67a9b-b3ec-4ac8-88bd-13d8428f1d48 on 
DataNode1:31010]

The workspace configuration is as follows:

{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://XXXXXXXXXXX:8020",
  "config": null,
  "workspaces": {
    "HIVE": {
      "location": "/user/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "parquet": {
      "type": "parquet"
    }
  }
}

Is there a solution for this? If anyone knows a way to overcome it, please 
let me know. Thanks in advance.

> Empty parquet file created with Limit 0 query errors out when querying
> ----------------------------------------------------------------------
>
>                 Key: DRILL-2223
>                 URL: https://issues.apache.org/jira/browse/DRILL-2223
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 0.7.0
>            Reporter: Aman Sinha
>             Fix For: Future
>
>
> Doing a CTAS with limit 0 creates a 0 length parquet file which errors out 
> during querying.  This should at least write the schema information and 
> metadata which will allow queries to run. 
> {code}
> 0: jdbc:drill:zk=local> create table tt_nation2 as select n_nationkey, 
> n_name, n_regionkey from cp.`tpch/nation.parquet` limit 0;
> +------------+---------------------------+
> |  Fragment  | Number of records written |
> +------------+---------------------------+
> | 0_0        | 0                         |
> +------------+---------------------------+
> 1 row selected (0.315 seconds)
> 0: jdbc:drill:zk=local> select n_nationkey from tt_nation2;
> Query failed: RuntimeException: file:/tmp/tt_nation2/0_0_0.parquet is not a 
> Parquet file (too small)
> {code}
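
For context on the "too small" error quoted above: the Parquet format frames 
every file with the 4-byte magic "PAR1" at the start, and ends it with the 
file metadata (footer), a 4-byte little-endian footer length, and "PAR1" 
again. A 0-length file therefore has no schema to read. The following is a 
minimal Python sketch of that framing check, assuming the file is reachable 
on a local path; the function name check_parquet_framing is hypothetical and 
is not part of Drill or any Parquet library, and the check does not parse the 
footer itself.

```python
import os
import struct

PARQUET_MAGIC = b"PAR1"
# 4-byte leading magic + 4-byte footer length + 4-byte trailing magic
MIN_PARQUET_SIZE = len(PARQUET_MAGIC) + 4 + len(PARQUET_MAGIC)


def check_parquet_framing(path):
    """Return (ok, reason) describing the basic Parquet framing of a file.

    Hypothetical helper for illustration: it inspects only the magic bytes
    and the footer length field, not the footer metadata.
    """
    size = os.path.getsize(path)
    if size < MIN_PARQUET_SIZE:
        return False, f"too small ({size} bytes, need at least {MIN_PARQUET_SIZE})"

    with open(path, "rb") as f:
        head = f.read(4)
        # The last 8 bytes are the footer length followed by the magic.
        f.seek(-8, os.SEEK_END)
        footer_len_bytes = f.read(4)
        tail = f.read(4)

    if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
        return False, "missing PAR1 magic bytes"

    (footer_len,) = struct.unpack("<I", footer_len_bytes)
    # The footer must be non-empty and must fit between the two magics.
    if footer_len == 0 or footer_len > size - MIN_PARQUET_SIZE:
        return False, f"implausible footer length {footer_len}"

    return True, "framing looks valid"
```

Running this against a file such as the 0_0_0.parquet written by the CTAS 
above would report it as too small, which matches the reader's complaint. 
A file that passes this check can still have an unreadable footer.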



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
