[ https://issues.apache.org/jira/browse/PARQUET-400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047074#comment-15047074 ]
Daniel Weeks commented on PARQUET-400:
--------------------------------------
[~jaltekruse] Thanks for creating the JIRA.
I tested the attached file with Hadoop 2.4 (Ubuntu 14.04.3 LTS) against S3 and
HDFS, and both failed. I just tested it against the local filesystem, and it was
able to read the file.
The larger example I have can't be read from S3, HDFS, or the local filesystem.
I'll try to create an example of this and test it across different versions of
Hadoop.
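For reference, something like the read loop below is enough to compare filesystems:
the same code is run with a file://, hdfs://, or S3 path and only the failing
combinations hit the PageHeader error quoted in the issue description. This is just
a sketch assuming parquet-mr's example GroupReadSupport, not the exact harness or
data used here.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.parquet.example.data.Group;
    import org.apache.parquet.hadoop.ParquetReader;
    import org.apache.parquet.hadoop.example.GroupReadSupport;

    public class ReadRepro {
      public static void main(String[] args) throws Exception {
        // args[0] is the file to read; the scheme (file://, hdfs://, s3 ...)
        // selects which filesystem implementation handles the read.
        Path path = new Path(args[0]);
        try (ParquetReader<Group> reader =
                 ParquetReader.builder(new GroupReadSupport(), path)
                              .withConf(new Configuration())
                              .build()) {
          long count = 0;
          // On the failing filesystem/file combinations, read() throws the
          // "can not read class org.apache.parquet.format.PageHeader" error.
          while (reader.read() != null) {
            count++;
          }
          System.out.println("read " + count + " records from " + path);
        }
      }
    }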
> Error reading some files after PARQUET-77 bytebuffer read path
> --------------------------------------------------------------
>
> Key: PARQUET-400
> URL: https://issues.apache.org/jira/browse/PARQUET-400
> Project: Parquet
> Issue Type: Bug
> Reporter: Jason Altekruse
> Assignee: Jason Altekruse
> Attachments: bytebyffer_read_fail.gz.parquet
>
>
> This issue is based on a discussion on the list started by [~dweeks]
> Full discussion:
> https://mail-archives.apache.org/mod_mbox/parquet-dev/201512.mbox/%3CCAMpYv7C_szTheua9N95bXvbd2ROmV63BFiJTK-K-aDNK6ZNBKA%40mail.gmail.com%3E
> From the thread (he later provided a small repro file that is attached here):
> Just wanted to see if you or anyone else has run into problems reading
> files after the ByteBuffer patch. I've been running into issues and have
> narrowed it down to the ByteBuffer commit using a small repro file (written
> with 1.6.0, unfortunately can't share the data).
> It doesn't happen for every file, but those that fail give this error:
> can not read class org.apache.parquet.format.PageHeader: Required field
> 'uncompressed_page_size' was not found in serialized data! Struct:
> PageHeader(type:null, uncompressed_page_size:0, compressed_page_size:0)
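For anyone tracing the message: it is raised while Thrift-deserializing the page
header in parquet-format. A tiny illustration of the failure mode (not the actual
bug) is handing Util.readPageHeader a stream positioned on what looks like a Thrift
STOP byte, which produces the same required-field error; the suspicion is that the
ByteBuffer read path from PARQUET-77 hands the Thrift reader a mis-positioned
stream with the same end result.

    import java.io.ByteArrayInputStream;
    import org.apache.parquet.format.PageHeader;
    import org.apache.parquet.format.Util;

    public class PageHeaderError {
      public static void main(String[] args) throws Exception {
        // A lone 0x00 byte reads as an empty Thrift struct, so the required
        // uncompressed_page_size field is never set and readPageHeader throws
        // an error like the one quoted above (empty struct, all fields unset).
        PageHeader header =
            Util.readPageHeader(new ByteArrayInputStream(new byte[] {0}));
        System.out.println(header);
      }
    }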