[
https://issues.apache.org/jira/browse/DRILL-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vitalii Diravka reopened DRILL-5495:
------------------------------------
The issue appears to be fixed for
{{NullableFixedByteAlignedReaders.NullableFixedBinaryAsTimeStampReader}}
{code}set `store.parquet.reader.int96_as_timestamp` = true;{code}
but it still reproduces for
{{NullableFixedByteAlignedReaders.NullableFixedBinaryReader}}
{code}set `store.parquet.reader.int96_as_timestamp` = false;{code}
With the option set to false, the query below still fails after returning some rows:
{code}
0: jdbc:drill:zk=local> select convert_from(d, 'TIMESTAMP_IMPALA') from
dfs.`/home/vitalii/Downloads/d4`;
........
........
| 2011-06-19 19:10:02.0 |
| 2011-06-19 19:10:02.0 |
| 2011-06-19 19:10:02.0 |
| 2011-06-19 19:10:02.0 |
Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
Message:
Hadoop path:
/home/vitalii/Downloads/d4/part-r-00003-08c5c621-62ea-4fee-b690-11576eddc39c.snappy.parquet
Total records read: 0
Row group index: 0
Records in row group: 10000
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message spark_schema {
optional int32 a;
optional binary b (UTF8);
optional int32 c (DATE);
optional int96 d;
}
, metadata:
{org.apache.spark.sql.parquet.row.metadata={"type":"struct","fields":[{"name":"a","type":"integer","nullable":true,"metadata":{}},{"name":"b","type":"string","nullable":true,"metadata":{}},{"name":"c","type":"date","nullable":true,"metadata":{}},{"name":"d","type":"timestamp","nullable":true,"metadata":{}}]}}},
blocks: [BlockMetaData{10000, 8627 [ColumnMetaData{SNAPPY [a] optional int32
a [BIT_PACKED, PLAIN, RLE], 4}, ColumnMetaData{SNAPPY [b] optional binary b
(UTF8) [BIT_PACKED, PLAIN_DICTIONARY, RLE], 2348}, ColumnMetaData{SNAPPY [c]
optional int32 c (DATE) [BIT_PACKED, PLAIN, RLE], 4578}, ColumnMetaData{SNAPPY
[d] optional int96 d [BIT_PACKED, PLAIN_DICTIONARY, RLE], 5822}]}]}
Fragment 1:0
[Error Id: 7cae35f2-eade-4e12-aaca-25aaf11e63d5 on vitalii-pc:31010]
(state=,code=0)
{code}
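For readers unfamiliar with the encoding that {{TIMESTAMP_IMPALA}} targets, here is a minimal sketch of how a 12-byte INT96 timestamp is commonly laid out: 8 little-endian bytes of nanoseconds within the day, followed by 4 little-endian bytes holding the Julian day number. The class name {{Int96TimestampSketch}} and its methods are hypothetical and are not Drill code. The length check only illustrates a defensive guard; it is not a diagnosis of this bug.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

// Hypothetical helper, not part of Drill: decodes the 12-byte INT96 timestamp layout.
final class Int96TimestampSketch {
    static final int INT96_WIDTH = 12;                 // bytes per INT96 value
    static final long JULIAN_DAY_OF_EPOCH = 2_440_588L; // Julian day number of 1970-01-01
    static final long MILLIS_PER_DAY = 86_400_000L;
    static final long NANOS_PER_MILLI = 1_000_000L;

    // Converts one INT96 value to milliseconds since the Unix epoch (UTC).
    static long toEpochMillis(byte[] int96) {
        if (int96 == null || int96.length != INT96_WIDTH) {
            // Fail with a clear message instead of an index error further down.
            throw new IllegalArgumentException("expected " + INT96_WIDTH + " bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong(); // bytes 0..7
        int julianDay = buf.getInt();    // bytes 8..11
        return (julianDay - JULIAN_DAY_OF_EPOCH) * MILLIS_PER_DAY
                + nanosOfDay / NANOS_PER_MILLI;
    }

    // Builds an INT96 value for illustration; the inverse of the layout read above.
    static byte[] encode(int julianDay, long nanosOfDay) {
        return ByteBuffer.allocate(INT96_WIDTH)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt(julianDay)
                .array();
    }

    public static void main(String[] args) {
        // Julian day 2455732 is 2011-06-19; 69002 seconds is 19:10:02.
        byte[] value = encode(2_455_732, 69_002L * 1_000_000_000L);
        System.out.println(Instant.ofEpochMilli(toEpochMillis(value)));
    }
}
```

The sample value in {{main}} is chosen for illustration only and is interpreted as UTC; it is not read from the attached data set.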
> convert_from function on top of int96 data results in
> ArrayIndexOutOfBoundsException
> ------------------------------------------------------------------------------------
>
> Key: DRILL-5495
> URL: https://issues.apache.org/jira/browse/DRILL-5495
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.10.0
> Reporter: Rahul Challapalli
> Assignee: Vitalii Diravka
> Priority: Major
> Fix For: 1.14.0
>
> Attachments: 26edf56f-6bc6-1e1f-5aa4-d98aec858a4a.sys.drill,
> d4.tar.gz, drillbit.log
>
>
> git.commit.id.abbrev=1e0a14c
> The data set was generated by Spark and contains a timestamp column stored as
> int96
> {code}
> [root@qa-node190 framework]# /home/parquet-tools-1.5.1-SNAPSHOT/parquet-meta
> /home/framework/framework/resources/Datasources/parquet_date/spark_generated/d4/part-r-00000-08c5c621-62ea-4fee-b690-11576eddc39c.snappy.parquet
>
> creator: parquet-mr (build 32c46643845ea8a705c35d4ec8fc654cc8ff816d)
> extra: org.apache.spark.sql.parquet.row.metadata =
> {"type":"struct","fields":[{"name":"a","type":"integer","nullable":true,"metadata":{}},{"name":"b","type":"strin
> [more]...
> file schema: spark_schema
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> a: OPTIONAL INT32 R:0 D:1
> b: OPTIONAL BINARY O:UTF8 R:0 D:1
> c: OPTIONAL INT32 O:DATE R:0 D:1
> d: OPTIONAL INT96 R:0 D:1
> row group 1: RC:10000 TS:8661
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> a: INT32 SNAPPY DO:0 FPO:4 SZ:2367/2571/1.09 VC:10000
> ENC:RLE,PLAIN,BIT_PACKED
> b: BINARY SNAPPY DO:0 FPO:2371 SZ:2329/2843/1.22 VC:10000
> ENC:RLE,PLAIN_DICTIONARY,BIT_PACKED
> c: INT32 SNAPPY DO:0 FPO:4700 SZ:1374/1507/1.10 VC:10000
> ENC:RLE,PLAIN,BIT_PACKED
> d: INT96 SNAPPY DO:0 FPO:6074 SZ:1597/1740/1.09 VC:10000
> ENC:RLE,PLAIN_DICTIONARY,BIT_PACKED
> {code}
> The query below fails with an ArrayIndexOutOfBoundsException:
> {code}
> select convert_from(d, 'TIMESTAMP_IMPALA') from
> dfs.`/drill/testdata/resource-manager/d4`;
> Fails with the error below after displaying some records:
> Error: SYSTEM ERROR: ArrayIndexOutOfBoundsException: 0
> Fragment 1:0
> [Error Id: f963f6c0-3306-49a6-9d98-a193c5e7cfee on qa-node190.qa.lab:31010]
> (state=,code=0)
> {code}
> The logs, profiles, and data files are attached.