[ 
https://issues.apache.org/jira/browse/AVRO-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277192#comment-17277192
 ] 

Andrew Olson commented on AVRO-2944:
------------------------------------

[~rskraba] Thanks for the feedback. We're not sure what exactly is happening 
with the stream (S3AInputStream, via S3AFileSystem). This bug just started 
afflicting us in the past few days. It's sporadic, only happens a very small 
percentage of the time, but often enough to cause stability issues with our MR 
jobs that read a lot of separate files out of S3. I think throwing an 
EOFException here does make sense if a -1 is unexpectedly returned from the 
read before the magic bytes are consumed. I'll try to get a PR for that created 
in the next couple days.
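
For illustration only, here is a minimal sketch of the behaviour described 
above: keep reading until all magic bytes have arrived, advance the offset by 
the number of bytes each read returned, and throw an EOFException if -1 comes 
back first. The class name MagicReader and the method readMagic are 
hypothetical and are not part of Avro; this is not the actual patch.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Avro: reads a fixed-length magic header.
final class MagicReader {

    private MagicReader() {
    }

    /**
     * Reads exactly {@code length} bytes from {@code in}.
     * Short reads are tolerated; a -1 before the header is complete is an error.
     */
    static byte[] readMagic(InputStream in, int length) throws IOException {
        byte[] magic = new byte[length];
        int offset = 0;
        while (offset < length) {
            // read() may return fewer bytes than requested, or -1 at end of stream.
            int n = in.read(magic, offset, length - offset);
            if (n < 0) {
                throw new EOFException(
                    "Stream ended after " + offset + " of " + length + " magic bytes");
            }
            // Accumulate the count instead of overwriting the offset.
            offset += n;
        }
        return magic;
    }
}
```

With a stream that hands back one byte per call, the loop simply runs once per 
byte; with a stream that ends early, the caller gets an EOFException instead of 
a hang.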

> DataFileReader has incorrect logic reading magic header
> -------------------------------------------------------
>
>                 Key: AVRO-2944
>                 URL: https://issues.apache.org/jira/browse/AVRO-2944
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.2
>            Reporter: Mick Jermsurawong
>            Assignee: Mick Jermsurawong
>            Priority: Major
>             Fix For: 1.10.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When creating a reader through the static method that checks for the magic 
> header, we currently read 4 bytes, but the pointer is not correctly updated.
> [https://github.com/apache/avro/blob/328c539afc77da347ec52be1e112a6a7c371143b/lang/java/avro/src/main/java/org/apache/avro/file/DataFileReader.java#L61-L62]
> When the input stream reads fewer bytes than expected, this gets stuck in the 
> loop until the end of the file. Or, if the input stream returns -1 for EOF, as 
> S3AInputStream does, the read hangs in this loop.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
