[ 
https://issues.apache.org/jira/browse/DRILL-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025680#comment-14025680
 ] 

Jason Altekruse commented on DRILL-815:
---------------------------------------

Impala does not currently mark columns with the standard Parquet metadata that 
indicates how the data should be read. Instead, it uses the Hive metastore to 
persist this information. This goes against Drill's model, in which we avoid a 
metastore and let users point at any file and read it. For now, this means the 
data must be cast to varchar if you want it shown as strings. We should talk to 
the Impala team about supporting this metadata alongside the metastore, as this 
is an issue for every Hadoop project that wants to read Parquet files produced 
by Impala.
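
As a rough sketch of the workaround, the query from the report could be 
rewritten with an explicit cast. The varchar length of 100 is an assumption 
picked for illustration, not something taken from the file; size it to fit the 
actual data.
{code}
-- Cast the binary column to varchar so it prints as text.
-- varchar(100) is an assumed length; adjust it for the real values.
select cast(keycolumn as varchar(100)) as keycolumn, column1
from `dfs`.`/opt/drill/integer.parquet`;
{code}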

> Parquet files created in impala using data from hive tables resulted in 
> incorrect string representation
> -------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-815
>                 URL: https://issues.apache.org/jira/browse/DRILL-815
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>            Reporter: Norris Lee
>            Assignee: Jason Altekruse
>
> The parquet file was created by first loading a csv file into a hive table. A 
> parquet table was then created in impala and data from the hive table was 
> loaded in. The file was extracted from hdfs to local and placed into drill's 
> dfs.
> The keycolumn column in hive is of type string.
> {code}
> 0: jdbc:drill:schema=hivestg> select * from 
> `dfs`.`/opt/drill/integer.parquet`;
> +------------+------------+
> | keycolumn  |  column1   |
> +------------+------------+
> | [B@7385c043 | 0          |
> | [B@5211a9f5 | 1          |
> | [B@5ad3deb | -1         |
> | [B@30bc1236 | 2          |
> | [B@b4fb039 | 127        |
> | [B@1cba73fc | -128       |
> | [B@1514b420 | 255        |
> | [B@23dabb0 | 128        |
> | [B@1ed2b0f6 | -129       |
> | [B@1a5ff649 | 256        |
> | [B@12224026 | 32767      |
> | [B@6a18817 | -32768     |
> | [B@56eda167 | 65535      |
> | [B@aff9dc7 | -32769     |
> | [B@13cf7975 | 32768      |
> | [B@1a2efa7c | 65536      |
> | [B@23ef052 | 2147483647 |
> | [B@721398a4 | -2147483648 |
> +------------+------------+
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
