[ 
https://issues.apache.org/jira/browse/IMPALA-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583318#comment-17583318
 ] 

ASF subversion and git services commented on IMPALA-11514:
----------------------------------------------------------

Commit 8e0482294975352d3d34d75adb50602d85b3c155 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=8e0482294 ]

IMPALA-11514: Workaround s3 connection timeout issues

When running on S3, dataload fails with errors
like "Timeout waiting for connection from pool". The
underlying cause is a subtle bug in the async draining
codepath (HADOOP-18410). As a temporary workaround, this
adds fs.s3a.input.async.drain.threshold=512G to core-site.xml,
which disables the async drain codepath.
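
For illustration only, the setting would look roughly like the
property below. This is a sketch in the standard Hadoop
core-site.xml property format. The name and value are taken from
this message; where the entry sits in Impala's test configuration
is not shown here and is left as an assumption.

```xml
<!-- Sketch: temporary workaround for HADOOP-18410.
     Name and value come from the commit message; surrounding
     file layout is assumed. -->
<property>
  <name>fs.s3a.input.async.drain.threshold</name>
  <value>512G</value>
</property>
```

The entry is meant to be removed once a Hadoop release containing
the HADOOP-18410 fix is picked up.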

Testing:
 - An s3 job passed with this setting

Change-Id: I08d03eb653fdcb6955340519b0cf5ba97b10d590
Reviewed-on: http://gerrit.cloudera.org:8080/18872
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Wenzhe Zhou <[email protected]>


> Workaround s3 timeout waiting for connection from pool (HADOOP-18410)
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-11514
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11514
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 4.2.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Blocker
>              Labels: broken-build
>
> When testing on s3, we see dataload fail when trying to load testcases:
> {noformat}
> 12:00:17 Creating tpcds testcase data (logging to 
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/data_loading/create-tpcds-testcase-data.log)...
>  
> 12:00:17     FAILED (Took: 0 min 13 sec)
> 12:00:30     
> '/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/testdata/bin/create-tpcds-testcase-files.sh'
>  failed. Tail of log:
> 12:00:30  order by t_s_secyear.customer_id
> 12:00:30          ,t_s_secyear.customer_first_name
> 12:00:30          ,t_s_secyear.customer_last_name
> 12:00:30          ,t_s_secyear.customer_email_address
> 12:00:30 limit 100
> 12:00:30 Query submitted at: 2022-08-18 12:00:25 (Coordinator: 
> http://hostname:25000)
> 12:00:30 ERROR: AnalysisException: getFileStatus on 
> s3a://bucketname/test-warehouse/tpcds-testcase-data: 
> com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout 
> waiting for connection from pool
> 12:00:30 CAUSED BY: InterruptedIOException: getFileStatus on 
> s3a://bucketname/test-warehouse/tpcds-testcase-data: 
> com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout 
> waiting for connection from pool
> 12:00:30 CAUSED BY: SdkClientException: Unable to execute HTTP request: 
> Timeout waiting for connection from pool
> 12:00:30 CAUSED BY: ConnectionPoolTimeoutException: Timeout waiting for 
> connection from pool{noformat}
> This has been tracked down to 
> https://issues.apache.org/jira/browse/HADOOP-18410
> A temporary workaround is to specify fs.s3a.input.async.drain.threshold=512G 
> in core-site.xml.
> We should work around this issue until the fix arrives.



