[
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281561#comment-16281561
]
Vitalii Diravka commented on DRILL-6016:
----------------------------------------
Interesting dataset.
Drill reads INT96 as VARBINARY by default:
https://drill.apache.org/docs/parquet-format/#sql-data-types-to-parquet
But with the provided dataset it returns an error. Even with an explicit
conversion it still returns an error:
{code}
0: jdbc:drill:zk=local> select CONVERT_FROM(run_date, 'TIMESTAMP_IMPALA') from
dfs.`/home/vitalii/Downloads/result/parquet/latest/part-r-00000-0c44161e-49e7-4b40-b4ab-c3d8e492bf33.snappy.parquet`
limit 1;
Error: DATA_READ ERROR: Error reading from Parquet file
File:
/home/vitalii/Downloads/result/parquet/latest/part-r-00000-0c44161e-49e7-4b40-b4ab-c3d8e492bf33.snappy.parquet
Column: run_date
Row Group Start: 5523
Fragment 0:0
{code}
But the schema looks good:
{code}
vitalii@vitalii-pc:~/parquet-tools/parquet-mr/parquet-tools/target$ java -jar
parquet-tools-1.6.0rc3-SNAPSHOT.jar schema
/home/vitalii/Downloads/result/parquet/latest/part-r-00000-0c44161e-49e7-4b40-b4ab-c3d8e492bf33.snappy.parquet
message spark_schema {
optional binary article_no (UTF8);
optional binary qty (UTF8);
required int96 run_date;
}
{code}
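For context on the run_date column above: an INT96 timestamp in the layout commonly written by Impala and Spark is 12 bytes, little-endian, with 8 bytes of nanoseconds within the day followed by 4 bytes of Julian day number. The sketch below shows roughly what a TIMESTAMP_IMPALA-style conversion has to do with such a value. It is only an illustration, not Drill's actual reader code; the function name decode_int96_timestamp is hypothetical, and treating the result as UTC is an assumption.

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number of 1970-01-01 (the Unix epoch).
JULIAN_DAY_OF_UNIX_EPOCH = 2440588


def decode_int96_timestamp(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 timestamp (hypothetical helper, not Drill code).

    Assumed layout: little-endian int64 nanoseconds within the day,
    followed by a little-endian int32 Julian day number.
    """
    if len(raw) != 12:
        raise ValueError(f"INT96 value must be 12 bytes, got {len(raw)}")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # datetime only has microsecond resolution, so sub-microsecond
    # nanoseconds are truncated here.
    return epoch + timedelta(days=days_since_epoch,
                             microseconds=nanos_of_day // 1000)
```

For example, decode_int96_timestamp(struct.pack("<qi", 0, 2440588)) gives 1970-01-01 00:00:00 UTC. A value that is not exactly 12 bytes is rejected, which is the kind of check that helps distinguish a malformed column from a reader bug.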
> Error reading INT96 created by Apache Spark
> -------------------------------------------
>
> Key: DRILL-6016
> URL: https://issues.apache.org/jira/browse/DRILL-6016
> Project: Apache Drill
> Issue Type: Bug
> Environment: Drill 1.11
> Reporter: Rahul Raj
>
> Hi,
> I am getting the error - SYSTEM ERROR : ClassCastException:
> org.apache.drill.exec.vector.TimeStampVector cannot be cast to
> org.apache.drill.exec.vector.VariableWidthVector while trying to read a Spark
> INT96 datetime field on Drill 1.11, in spite of setting the property
> store.parquet.reader.int96_as_timestamp to true.
> I believe this was fixed in Drill
> 1.10 (https://issues.apache.org/jira/browse/DRILL-4373). What could be wrong?
> I have attached the dataset at
> https://github.com/rajrahul/files/blob/master/result.tar.gz