[
https://issues.apache.org/jira/browse/DRILL-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
salim achouche updated DRILL-6685:
----------------------------------
Labels: pull-request-available (was: )
> Error in parquet record reader
> ------------------------------
>
> Key: DRILL-6685
> URL: https://issues.apache.org/jira/browse/DRILL-6685
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Parquet
> Affects Versions: 1.14.0
> Reporter: Robert Hou
> Assignee: salim achouche
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
>
> Attachments: drillbit.log.6685
>
>
> This is the query:
> select VarbinaryValue1 from
> dfs.`/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet` limit
> 36;
> It appears to be caused by this commit:
> DRILL-6570: Fixed IndexOutofBoundException in Parquet Reader
> aee899c1b26ebb9a5781d280d5a73b42c273d4d5
> This is the stack trace:
> {noformat}
> Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
> Message:
> Hadoop path:
> /drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet/0_0_0.parquet
> Total records read: 0
> Row group index: 0
> Records in row group: 1250
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
> optional int64 Index;
> optional binary VarbinaryValue1;
> optional int64 BigIntValue;
> optional boolean BooleanValue;
> optional int32 DateValue (DATE);
> optional float FloatValue;
> optional binary VarcharValue1 (UTF8);
> optional double DoubleValue;
> optional int32 IntegerValue;
> optional int32 TimeValue (TIME_MILLIS);
> optional int64 TimestampValue (TIMESTAMP_MILLIS);
> optional binary VarbinaryValue2;
> optional fixed_len_byte_array(12) IntervalYearValue (INTERVAL);
> optional fixed_len_byte_array(12) IntervalDayValue (INTERVAL);
> optional fixed_len_byte_array(12) IntervalSecondValue (INTERVAL);
> optional binary VarcharValue2 (UTF8);
> }
> , metadata: {drill-writer.version=2, drill.version=1.14.0-SNAPSHOT}}, blocks:
> [BlockMetaData{1250, 23750308 [ColumnMetaData{UNCOMPRESSED [Index] optional
> int64 Index [PLAIN, RLE, BIT_PACKED], 4}, ColumnMetaData{UNCOMPRESSED
> [VarbinaryValue1] optional binary VarbinaryValue1 [PLAIN, RLE, BIT_PACKED],
> 10057}, ColumnMetaData{UNCOMPRESSED [BigIntValue] optional int64 BigIntValue
> [PLAIN, RLE, BIT_PACKED], 8174655}, ColumnMetaData{UNCOMPRESSED
> [BooleanValue] optional boolean BooleanValue [PLAIN, RLE, BIT_PACKED],
> 8179722}, ColumnMetaData{UNCOMPRESSED [DateValue] optional int32 DateValue
> (DATE) [PLAIN, RLE, BIT_PACKED], 8179916}, ColumnMetaData{UNCOMPRESSED
> [FloatValue] optional float FloatValue [PLAIN, RLE, BIT_PACKED], 8184959},
> ColumnMetaData{UNCOMPRESSED [VarcharValue1] optional binary VarcharValue1
> (UTF8) [PLAIN, RLE, BIT_PACKED], 8190002}, ColumnMetaData{UNCOMPRESSED
> [DoubleValue] optional double DoubleValue [PLAIN, RLE, BIT_PACKED],
> 10230058}, ColumnMetaData{UNCOMPRESSED [IntegerValue] optional int32
> IntegerValue [PLAIN, RLE, BIT_PACKED], 10240111},
> ColumnMetaData{UNCOMPRESSED [TimeValue] optional int32 TimeValue
> (TIME_MILLIS) [PLAIN, RLE, BIT_PACKED], 10245154},
> ColumnMetaData{UNCOMPRESSED [TimestampValue] optional int64 TimestampValue
> (TIMESTAMP_MILLIS) [PLAIN, RLE, BIT_PACKED], 10250197},
> ColumnMetaData{UNCOMPRESSED [VarbinaryValue2] optional binary VarbinaryValue2
> [PLAIN, RLE, BIT_PACKED], 10260250}, ColumnMetaData{UNCOMPRESSED
> [IntervalYearValue] optional fixed_len_byte_array(12) IntervalYearValue
> (INTERVAL) [PLAIN, RLE, BIT_PACKED], 19632385}, ColumnMetaData{UNCOMPRESSED
> [IntervalDayValue] optional fixed_len_byte_array(12) IntervalDayValue
> (INTERVAL) [PLAIN, RLE, BIT_PACKED], 19647446}, ColumnMetaData{UNCOMPRESSED
> [IntervalSecondValue] optional fixed_len_byte_array(12) IntervalSecondValue
> (INTERVAL) [PLAIN, RLE, BIT_PACKED], 19662507}, ColumnMetaData{UNCOMPRESSED
> [VarcharValue2] optional binary VarcharValue2 (UTF8) [PLAIN, RLE,
> BIT_PACKED], 19677568}]}]}
> Fragment 0:0
> [Error Id: 25852cdb-3217-4041-9743-66e9f3a2fbe4 on qa-node186.qa.lab:31010]
> (state=,code=0)
> {noformat}
> The table can be found at 10.10.100.186:/tmp/fourvarchar_asc_nulls_16MB.parquet
> sys.version is:
> 1.15.0-SNAPSHOT a05f17d6fcd80f0d21260d3b1074ab895f457bac Changed
> PROJECT_OUTPUT_BATCH_SIZE to System + Session 30.07.2018 @ 17:12:53 PDT
> [email protected] 30.07.2018 @ 17:25:21 PDT
> fourvarchar_asc_nulls70.q
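
For anyone triaging this, the column offsets in the error message give a rough idea of how large each column chunk is. The sketch below is only an estimate under two assumptions that the report does not confirm: that the trailing number in each ColumnMetaData entry is the chunk's first data page offset, and that chunks are stored back to back in schema order. The names FIRST_PAGE_OFFSETS, ROW_COUNT, and approx_chunk_sizes are illustrative and not part of Drill or Parquet.

```python
# Offsets copied from the ColumnMetaData entries in the error message above.
# Assumption: each trailing number is the column chunk's first data page offset.
FIRST_PAGE_OFFSETS = [
    ("Index", 4),
    ("VarbinaryValue1", 10057),
    ("BigIntValue", 8174655),
    ("BooleanValue", 8179722),
    ("DateValue", 8179916),
    ("FloatValue", 8184959),
    ("VarcharValue1", 8190002),
    ("DoubleValue", 10230058),
    ("IntegerValue", 10240111),
    ("TimeValue", 10245154),
    ("TimestampValue", 10250197),
    ("VarbinaryValue2", 10260250),
    ("IntervalYearValue", 19632385),
    ("IntervalDayValue", 19647446),
    ("IntervalSecondValue", 19662507),
    ("VarcharValue2", 19677568),
]
ROW_COUNT = 1250  # "Records in row group: 1250"


def approx_chunk_sizes(offsets):
    """Estimate each chunk's size as the gap to the next chunk's offset.

    Assumes contiguous layout in schema order. The last column is skipped
    because the message gives no end offset for it.
    """
    sizes = {}
    for (name, start), (_, next_start) in zip(offsets, offsets[1:]):
        sizes[name] = next_start - start
    return sizes


if __name__ == "__main__":
    for name, size in approx_chunk_sizes(FIRST_PAGE_OFFSETS).items():
        print(f"{name:<20} {size:>10} bytes  ~{size / ROW_COUNT:.0f} bytes/row")
```

Under those assumptions, the queried column VarbinaryValue1 spans about 8,164,598 bytes, roughly 6.5 KB per row averaged over all 1250 records (nulls included), which is far larger than the fixed-width columns around it. The gap also includes page headers, so treat the figure as an upper bound on the value data.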
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)