[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281561#comment-16281561
 ] 

Vitalii Diravka commented on DRILL-6016:
----------------------------------------

Interesting dataset.
Drill reads INT96 by default as VARBINARY: 
https://drill.apache.org/docs/parquet-format/#sql-data-types-to-parquet
With the provided dataset, however, Drill returns an error, even with an 
explicit conversion:
{code}
0: jdbc:drill:zk=local> select CONVERT_FROM(run_date, 'TIMESTAMP_IMPALA')
from dfs.`/home/vitalii/Downloads/result/parquet/latest/part-r-00000-0c44161e-49e7-4b40-b4ab-c3d8e492bf33.snappy.parquet`
limit 1;
Error: DATA_READ ERROR: Error reading from Parquet file

File:  /home/vitalii/Downloads/result/parquet/latest/part-r-00000-0c44161e-49e7-4b40-b4ab-c3d8e492bf33.snappy.parquet
Column:  run_date
Row Group Start:  5523
Fragment 0:0
{code}

But the schema looks good:
{code}
vitalii@vitalii-pc:~/parquet-tools/parquet-mr/parquet-tools/target$ java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema /home/vitalii/Downloads/result/parquet/latest/part-r-00000-0c44161e-49e7-4b40-b4ab-c3d8e492bf33.snappy.parquet
message spark_schema {
  optional binary article_no (UTF8);
  optional binary qty (UTF8);
  required int96 run_date;
}
{code}
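
For reference, an INT96 timestamp such as run_date can be decoded by hand once 
the raw 12 bytes are available as VARBINARY. The sketch below is not Drill's 
implementation. It assumes the layout conventionally used by Impala and Spark 
(8 little-endian bytes of nanoseconds within the day, followed by a 4-byte 
little-endian Julian day number), and the name int96_to_datetime is 
hypothetical:

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number of the Unix epoch, 1970-01-01.
JULIAN_DAY_OF_EPOCH = 2440588


def int96_to_datetime(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 timestamp to a UTC datetime.

    Assumption: Impala/Spark layout, i.e. nanoseconds of day (8 bytes)
    followed by the Julian day (4 bytes), both little-endian.
    """
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    # "<qi" = little-endian, no padding: int64 nanos, then int32 Julian day.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_EPOCH
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # datetime has microsecond resolution, so sub-microsecond digits are dropped.
    return epoch + timedelta(days=days_since_epoch,
                             microseconds=nanos_of_day // 1000)
```

Feeding it a value extracted from the file and comparing the result with what 
Spark reports for the same row would be one way to check whether the stored 
bytes follow this layout.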

> Error reading INT96 created by Apache Spark
> -------------------------------------------
>
>                 Key: DRILL-6016
>                 URL: https://issues.apache.org/jira/browse/DRILL-6016
>             Project: Apache Drill
>          Issue Type: Bug
>         Environment: Drill 1.11
>            Reporter: Rahul Raj
>
> Hi,
> I am getting the error "SYSTEM ERROR : ClassCastException: 
> org.apache.drill.exec.vector.TimeStampVector cannot be cast to 
> org.apache.drill.exec.vector.VariableWidthVector" while trying to read a Spark 
> INT96 datetime field on Drill 1.11, in spite of setting the property 
> store.parquet.reader.int96_as_timestamp to true.
> I believe this was fixed in Drill 1.10 
> (https://issues.apache.org/jira/browse/DRILL-4373). What could be wrong?
> I have attached the dataset at 
> https://github.com/rajrahul/files/blob/master/result.tar.gz



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
