Zamil Majdy created HADOOP-17764:
------------------------------------

             Summary: S3AInputStream read does not re-open the input stream on 
the second read retry attempt
                 Key: HADOOP-17764
                 URL: https://issues.apache.org/jira/browse/HADOOP-17764
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs/s3
            Reporter: Zamil Majdy


*Bug description:*

The read method in S3AInputStream behaves as follows when an IOException 
occurs during a read:
 * {{reopen and read quickly}}: After the first {{read}} attempt fails, the 
client reopens the stream and retries the read immediately, without a {{sleep}}.


 * {{reopen and wait for fixed duration}}: After a {{read}} attempt fails, the 
client reopens the stream, sleeps for {{fs.s3a.retry.interval}} milliseconds 
(defaults to 500 ms), and then tries reading from the stream again.

If the read issued during the {{reopen and read quickly}} step also fails (the 
second failure), the next read is retried without reopening the input stream. 
The bytes consumed by the failed attempt are then skipped, so the caller 
receives corrupt data or fewer bytes than requested.

 

*Scenario to reproduce:*
 * Execute S3AInputStream `read()` or `read(b, off, len)`.
 * The read fails and throws a `Connection Reset` exception after reading some 
data.
 * The InputStream is re-opened and another `read()` or `read(b, off, len)` is 
executed.
 * The read fails a second time and throws a `Connection Reset` exception 
after reading some data.
 * The InputStream is not re-opened, and another `read()` or `read(b, off, len)` 
is executed after a sleep.
 * The read succeeds, but it skips the first few bytes that were already 
read during the second failure.
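
The scenario above can be sketched with a small, self-contained simulation. 
This is not the S3AInputStream code: {{FlakySource}}, {{FlakyStream}}, 
{{readWithRetry}} and the {{reopenEveryTime}} flag are hypothetical names 
introduced only for illustration, the sleep between retries is omitted, and a 
failed read is assumed to consume two bytes before the connection reset.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReopenOnRetrySketch {

    /** Hypothetical stand-in for a remote object; each open() models a fresh ranged request. */
    static class FlakySource {
        private final byte[] data;
        private int failuresLeft; // number of read calls that will still fail

        FlakySource(byte[] data, int failures) {
            this.data = data;
            this.failuresLeft = failures;
        }

        InputStream open(int pos) {
            return new FlakyStream(pos);
        }

        class FlakyStream extends InputStream {
            private int cursor; // position of the underlying connection

            FlakyStream(int pos) {
                this.cursor = pos;
            }

            @Override
            public int read() throws IOException {
                byte[] one = new byte[1];
                return read(one, 0, 1) < 0 ? -1 : one[0] & 0xff;
            }

            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                if (failuresLeft > 0) {
                    failuresLeft--;
                    // Assumption: some bytes are consumed before the reset.
                    cursor = Math.min(data.length, cursor + 2);
                    throw new IOException("Connection reset");
                }
                if (cursor >= data.length) {
                    return -1;
                }
                int n = Math.min(len, data.length - cursor);
                System.arraycopy(data, cursor, b, off, n);
                cursor += n;
                return n;
            }
        }
    }

    /**
     * Reads up to len bytes from position 0, retrying on IOException.
     * With reopenEveryTime == false the stream is reopened only after the
     * first failure, which mimics the behaviour described in this report.
     */
    static byte[] readWithRetry(FlakySource src, int len, int maxRetries,
                                boolean reopenEveryTime) throws IOException {
        int pos = 0; // logical position, advanced only by successful reads
        InputStream in = src.open(pos);
        byte[] buf = new byte[len];
        int failures = 0;
        while (true) {
            try {
                int n = in.read(buf, 0, len);
                return Arrays.copyOf(buf, Math.max(n, 0));
            } catch (IOException e) {
                failures++;
                if (failures > maxRetries) {
                    throw e;
                }
                if (reopenEveryTime || failures == 1) {
                    in = src.open(pos); // discard the stale connection
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "abcdefgh".getBytes(StandardCharsets.US_ASCII);
        byte[] ok = readWithRetry(new FlakySource(data, 2), 4, 2, true);
        byte[] bad = readWithRetry(new FlakySource(data, 2), 4, 2, false);
        System.out.println(new String(ok, StandardCharsets.US_ASCII));  // abcd
        System.out.println(new String(bad, StandardCharsets.US_ASCII)); // cdef
    }
}
```

In this sketch, reopening at the tracked position after every failure returns 
the expected leading bytes, while skipping the reopen after the second failure 
returns data that starts two bytes too late.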

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
