[ https://issues.apache.org/jira/browse/HADOOP-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868982#comment-17868982 ]

ASF GitHub Bot commented on HADOOP-19221:
-----------------------------------------

steveloughran commented on PR #6938:
URL: https://github.com/apache/hadoop/pull/6938#issuecomment-2253149858

   @shameersss1 
   I really don't know what best to do here. 
   
   We have massively cut back on the number of retries which take place in the 
V2 SDK compared to V1; in the past we have even discussed turning SDK retries 
off completely and handling them all ourselves. However, that would break 
things the transfer manager does in separate threads.
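   
   For illustration, a minimal sketch of the SDK-level knob involved (not the 
actual S3A wiring; the class name and retry count here are invented) showing 
how a V2 SDK client's retries can be dialed down via ClientOverrideConfiguration:
   
   {code}
   import software.amazon.awssdk.core.client.config.ClientOverrideConfiguration;
   import software.amazon.awssdk.core.retry.RetryPolicy;
   
   public class RetryPolicySketch {
     public static void main(String[] args) {
       // Illustrative only: the V2 SDK lets a client's retry count be reduced
       // (or set to zero) through its override configuration; this is the
       // layer at which "cutting back on retries" happens.
       ClientOverrideConfiguration override = ClientOverrideConfiguration.builder()
           .retryPolicy(RetryPolicy.builder()
               .numRetries(2)   // illustrative value, not the S3A default
               .build())
           .build();
       System.out.println(override.retryPolicy());
     }
   }
   {code}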
   
   The thing is, I do not know how often we see 500 errors against AWS S3 
stores (rather than third-party ones with unrecoverable issues), and now that 
we have seen them I don't know what the right policy should be. The only 
documentation on what to do seems more focused on 503s, and doesn't provide any 
hints about why a 500 could happen or what to do other than "keep trying, maybe 
it'll go away": https://repost.aws/knowledge-center/http-5xx-errors-s3 . I do 
suspect it is very rare; otherwise the AWS team might have noticed their lack 
of resilience here, and we would have found it during our own testing. A 500 
error at any point other than a multipart upload probably gets recovered from 
nicely, so there could have been a background noise of these which we have 
never noticed before. The S3A FS statistics will now track them, which may be 
informative.
   
   I don't want to introduce another configuration switch if possible, because 
that adds more documentation, testing, maintenance, et cetera. One thing I was 
considering: should we treat this exactly the same as a throttling exception, 
which has its own configuration settings for retries?
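   
   To be concrete, these are the existing throttle-retry settings I mean; 
routing 500s through that same policy is only an idea at this point, and the 
values and class name below are purely illustrative:
   
   {code}
   import org.apache.hadoop.conf.Configuration;
   
   public class ThrottleRetryTuning {
     public static void main(String[] args) {
       Configuration conf = new Configuration();
       // Existing S3A throttle-retry knobs; the idea above is to route 500
       // responses through this same policy rather than add a new switch.
       conf.setInt("fs.s3a.retry.throttle.limit", 20);       // max throttle retries
       conf.set("fs.s3a.retry.throttle.interval", "500ms");  // initial backoff
       System.out.println(conf.get("fs.s3a.retry.throttle.limit"));
     }
   }
   {code}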
   
   Anyway, if you could talk to your colleagues and make some suggestions based 
on real knowledge of what can happen, that would be really nice. Note that we 
are treating 500 as idempotent, the way we do with all the other failures, even 
though from a distributed-computing purist's perspective that is not in fact 
true.
   
   Not looked at the other comments yet; will do later. Based on a code 
walk-through with Mukund, Harshit and Saikat, I've realised we should make 
absolutely sure that the stream providing a subset of a file fails immediately 
if a read() goes past the allocated space. With tests, obviously.
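   
   A minimal sketch of the sort of fail-fast bound I mean (the class name and 
exact failure semantics are invented here, not what the PR implements): the 
wrapper never serves more than the allocated length and raises an error on any 
attempt to read past it.
   
   {code}
   import java.io.EOFException;
   import java.io.FilterInputStream;
   import java.io.IOException;
   import java.io.InputStream;
   
   // Sketch only: serves at most `length` bytes and fails fast if a caller
   // tries to read beyond that, instead of silently reading adjacent data.
   public class BoundedUploadStream extends FilterInputStream {
     private final long length;
     private long pos;
   
     public BoundedUploadStream(InputStream in, long length) {
       super(in);
       this.length = length;
     }
   
     @Override
     public int read() throws IOException {
       if (pos >= length) {
         throw new EOFException("read() past allocated length " + length);
       }
       int b = in.read();
       if (b >= 0) {
         pos++;
       }
       return b;
     }
   
     @Override
     public int read(byte[] buf, int off, int len) throws IOException {
       if (len > 0 && pos >= length) {
         throw new EOFException("read() past allocated length " + length);
       }
       int bytesRead = in.read(buf, off, (int) Math.min(len, length - pos));
       if (bytesRead > 0) {
         pos += bytesRead;
       }
       return bytesRead;
     }
   }
   {code}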




> S3A: Unable to recover from failure of multipart block upload attempt "Status 
> Code: 400; Error Code: RequestTimeout"
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-19221
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19221
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> If a multipart PUT request fails for some reason (e.g. network error) then 
> all subsequent retry attempts fail with a 400 response and error code 
> RequestTimeout.
> {code}
> Your socket connection to the server was not read from or written to within 
> the timeout period. Idle connections will be closed. (Service: Amazon S3; 
> Status Code: 400; Error Code: RequestTimeout; Request ID:; S3 Extended 
> Request ID:
> {code}
> The list of suppressed exceptions contains the root cause (the initial 
> failure was a 500); all retries failed to upload properly from the source 
> input stream {{RequestBody.fromInputStream(fileStream, size)}}.
> Hypothesis: the mark/reset mechanism doesn't work for these input streams. 
> With the v1 SDK we would build a multipart block upload request passing in 
> (file, offset, length); the way we are now doing this doesn't recover.
> Probably fixable by providing our own {{ContentStreamProvider}} 
> implementations for
> # file + offset + length (a sketch of this case is below)
> # bytebuffer
> # byte array
> The SDK does have explicit support for the memory ones, but they copy the 
> data blocks first; we don't want that as it would double the memory 
> requirements of active blocks.
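>
> A hedged sketch of the file + offset + length case (the class name is 
> invented and this is not the code in the PR): a {{ContentStreamProvider}} 
> whose {{newStream()}} re-opens the file and seeks to the block offset, so 
> every retry gets a fresh stream over exactly the same slice without relying 
> on mark/reset.
> {code}
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.UncheckedIOException;
> import java.nio.file.Files;
> import java.nio.file.Path;
> 
> import org.apache.commons.io.input.BoundedInputStream;
> import software.amazon.awssdk.http.ContentStreamProvider;
> 
> // Sketch only: file-backed provider for one block of a multipart upload.
> public class FileSliceContentStreamProvider implements ContentStreamProvider {
>   private final Path file;
>   private final long offset;
>   private final long length;
> 
>   public FileSliceContentStreamProvider(Path file, long offset, long length) {
>     this.file = file;
>     this.offset = offset;
>     this.length = length;
>   }
> 
>   @Override
>   public InputStream newStream() {
>     try {
>       InputStream in = Files.newInputStream(file);
>       long skipped = in.skip(offset);          // a real version would loop
>       if (skipped != offset) {
>         in.close();
>         throw new IOException("Could not seek to offset " + offset + " in " + file);
>       }
>       // Bound the stream to the block length so a retry never reads into
>       // the next block; a fail-fast wrapper could be used here instead.
>       return new BoundedInputStream(in, length);
>     } catch (IOException e) {
>       throw new UncheckedIOException(e);
>     }
>   }
> }
> {code}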


