Steve Loughran created HADOOP-19295:
---------------------------------------

             Summary: S3A: fs.s3a.connection.request.timeout too low for large 
uploads over slow links
                 Key: HADOOP-19295
                 URL: https://issues.apache.org/jira/browse/HADOOP-19295
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.4.0, 3.4.1
            Reporter: Steve Loughran
            Assignee: Steve Loughran


The value of {{fs.s3a.connection.request.timeout}} (default = 60s) is too low 
for large uploads over slow connections.

I suspect something changed between the v1 and v2 SDK versions: PUT requests 
appear to have been exempt from the normal request timeout in v1, but they are 
not in v2. This now surfaces as failures to upload 1+ GB files over slower 
network connections; smaller files (for example, 128 MB) work.
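
For illustration, a minimal sketch of how a v2 SDK client applies a per-attempt 
timeout to every request, PUT included; the builder wiring and the 60s value 
here are assumptions standing in for what S3A configures, not a copy of its 
code:

{code:java}
import java.time.Duration;

import software.amazon.awssdk.core.client.config.ClientOverrideConfiguration;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

public class RequestTimeoutSketch {
  public static void main(String[] args) {
    // In the v2 SDK, apiCallAttemptTimeout bounds every HTTP attempt;
    // there is no per-operation exemption for PUT.
    ClientOverrideConfiguration override = ClientOverrideConfiguration.builder()
        .apiCallAttemptTimeout(Duration.ofSeconds(60)) // stand-in for the S3A default
        .build();
    try (S3Client s3 = S3Client.builder()
        .region(Region.US_EAST_1) // illustrative region
        .overrideConfiguration(override)
        .build()) {
      // Any block PUT that takes longer than 60s now fails with an
      // ApiCallAttemptTimeoutException, regardless of upload progress.
    }
  }
}
{code}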

The parallel queuing of writes in the S3ABlockOutputStream helps create this 
problem: it queues multiple blocks at the same time, so per-block bandwidth 
becomes (available bandwidth)/(active blocks). Four concurrent blocks cut each 
block's share of the link capacity to a quarter.
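
A worked example of that arithmetic, with a made-up uplink speed: a 128 MB 
block on a 20 Mbit/s link shared four ways needs roughly 205 seconds, well 
past the 60s default.

{code:java}
public class PerBlockUploadTime {
  public static void main(String[] args) {
    final double linkMbitPerSec = 20.0; // assumed total uplink; illustrative only
    final int activeBlocks = 4;         // blocks queued in parallel by the stream
    final double blockMB = 128.0;       // illustrative block size

    double perBlockMbitPerSec = linkMbitPerSec / activeBlocks;  // 5 Mbit/s per block
    double uploadSeconds = (blockMB * 8) / perBlockMbitPerSec;  // 1024 Mbit / 5 = ~205 s

    System.out.printf("per-block upload time: ~%.0fs against a 60s timeout%n",
        uploadSeconds);
  }
}
{code}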

The fix is straightforward: use a much bigger timeout. I'm going to propose 15 
minutes. We need to strike a balance between giving uploads enough time to 
complete and not leaving other requests hanging before they time out.
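
A sketch of the proposed override, assuming the option is parsed as a Hadoop 
time duration so the "15m" suffix is accepted:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class RequestTimeoutOverride {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Raise the per-request timeout from the 60s default to the proposed 15 minutes.
    conf.set("fs.s3a.connection.request.timeout", "15m");
    // Any FileSystem instance created from this configuration picks up
    // the longer timeout for its S3 requests.
  }
}
{code}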

I do worry about other consequences: we've found that the timeout exception is 
happy to hide the underlying causes of retry failures, so in fact a longer 
timeout may be better for everything except a server hanging after the HTTP 
request is initiated.

Too bad we can't alter the timeout for different requests.


