[ 
https://issues.apache.org/jira/browse/PARQUET-246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620908#comment-14620908
 ] 

Michael Allman commented on PARQUET-246:
----------------------------------------

Thanks, everyone, for your awesome work on this. [~rdblue], to make sure I 
understand correctly: will setting {{parquet.split.files=false}} cause all 
Parquet files to be read sequentially, or only the ones with the defective 
encoding?

> ArrayIndexOutOfBoundsException with Parquet write version v2
> ------------------------------------------------------------
>
>                 Key: PARQUET-246
>                 URL: https://issues.apache.org/jira/browse/PARQUET-246
>             Project: Parquet
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Konstantin Shaposhnikov
>            Assignee: Konstantin Shaposhnikov
>             Fix For: 1.8.0
>
>
> I am getting the following exception when reading a parquet file that was 
> created using Avro WriteSupport and Parquet write version v2.0:
> {noformat}
> Caused by: parquet.io.ParquetDecodingException: Can't read value in column [colName, rows, array, name] BINARY at value 313601 out of 428260, 1 out of 39200 in currentPage. repetition level: 0, definition level: 2
>       at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:462)
>       at parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:364)
>       at parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:405)
>       at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:209)
>       ... 27 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>       at parquet.column.values.deltastrings.DeltaByteArrayReader.readBytes(DeltaByteArrayReader.java:70)
>       at parquet.column.impl.ColumnReaderImpl$2$6.read(ColumnReaderImpl.java:307)
>       at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:458)
>       ... 30 more
> {noformat}
> The file is quite big (500 MB), so I cannot upload it here, but there may be 
> enough information in the exception message to identify the cause of the 
> error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
