[
https://issues.apache.org/jira/browse/HADOOP-17764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran reassigned HADOOP-17764:
---------------------------------------
Assignee: Zamil Majdy
> S3AInputStream read does not re-open the input stream on the second read
> retry attempt
> --------------------------------------------------------------------------------------
>
> Key: HADOOP-17764
> URL: https://issues.apache.org/jira/browse/HADOOP-17764
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 3.3.1
> Reporter: Zamil Majdy
> Assignee: Zamil Majdy
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.2
>
> Time Spent: 8h 10m
> Remaining Estimate: 0h
>
> *Bug description:*
> The read method in S3AInputStream has this following behaviour when an
> IOException happening during the read:
> * {{reopen and read quickly}}: The client after failing in the first attempt
> of {{read}}, will reopen the stream and try reading again without {{sleep}}.
> * {{reopen and wait for fixed duration}}: The client after failing in the
> attempt of {{read}}, will reopen the stream, sleep for
> {{fs.s3a.retry.interval}} milliseconds (defaults to 500 ms), and then try
> reading from the stream.
> While doing the {{reopen and read quickly}} process, the subsequent read will
> be retried without reopening the input stream in case of the second failure
> happened. This leads to some of the bytes read being skipped which results to
> corrupt/less data than required.
>
> *Scenario to reproduce:*
> * Execute S3AInputStream `read()` or `read(b, off, len)`.
> * The read failed and throws `Connection Reset` exception after reading some
> data.
> * The InputStream is re-opened and another `read()` or `read(b, off, len)`
> is executed
> * The read failed for the second time and throws `Connection Reset`
> exception after reading some data.
> * The InputStream is not re-opened and another `read()` or `read(b, off,
> len)` is executed after sleep
> * The read succeed, but it skips the first few bytes that has already been
> read on the second failure.
>
> *Proposed fix:*
> [https://github.com/apache/hadoop/pull/3109]
> Added the test that reproduces the issue along with the fix
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]