[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=784536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784536
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 24/Jun/22 11:04
Start Date: 24/Jun/22 11:04
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#discussion_r905947970


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java:
##
@@ -1203,4 +1203,18 @@ private Constants() {
* Default maximum read size in bytes during vectored reads : {@value}.
*/
   public static final int DEFAULT_AWS_S3_VECTOR_READS_MAX_MERGED_READ_SIZE = 
1253376; //1M
+
+  /**
+   * Flag for immediate failure when observing an
+   * {@link AWSBadRequestException}.
+   * If it is disabled (set to false), the failure is treated as retryable.
+   * Value {@value}.
+   */
+  public static final String FAIL_ON_AWS_BAD_REQUEST = 
"fs.s3a.fail.on.aws.bad.request";

Review Comment:
   I now think "fs.s3a.retry.on.400.response.enabled" would be better, with 
default flipped. docs would say "experimental"
   
   and assuming we do have a custom policy, adjacent 
   
   ```
   fs.s3a.retry.on.400.response.delay  // delay between attempts, default "10s"
   fs.s3a.retry.on.400.response.attempts // number of attempts, default 6
   ```
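   As a concrete illustration, a minimal sketch of setting the proposed keys 
from Java; the names and defaults are only the suggestions above, not a 
released S3A API:
   
   ```java
   import org.apache.hadoop.conf.Configuration;
   
   // Hedged sketch: these three keys are review proposals only.
   Configuration conf = new Configuration();
   // opt in to the experimental retry-on-400 behaviour (proposed default: false)
   conf.setBoolean("fs.s3a.retry.on.400.response.enabled", true);
   // flat delay between attempts (proposed default "10s")
   conf.set("fs.s3a.retry.on.400.response.delay", "10s");
   // number of attempts before giving up (proposed default 6)
   conf.setInt("fs.s3a.retry.on.400.response.attempts", 6);
   ```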
   





Issue Time Tracking
---

Worklog Id: (was: 784536)
Time Spent: 2h 20m  (was: 2h 10m)

> Add option and make 400 bad request retryable
> -
>
> Key: HADOOP-18310
> URL: https://issues.apache.org/jira/browse/HADOOP-18310
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.4
>Reporter: Tak-Lon (Stephen) Wu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> When one is using a customized credential provider via 
> fs.s3a.aws.credentials.provider, e.g. 
> org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider, and the credential 
> supplied by this pluggable provider has expired, the request fails with an 
> error code of 400 as a bad request exception.
> Here, the current S3ARetryPolicy fails immediately and does not retry at the 
> S3A level. 
> A recent use case in HBase found that this can lead to a Region Server being 
> abandoned immediately on this exception, without any retry, when the file 
> system is trying to open a file or S3AInputStream is trying to reopen one. 
> Especially in the S3AInputStream cases, we cannot find a good way to retry 
> outside of the file system semantics (an ongoing stream that is failing is 
> currently considered an irreparable state), and thus we came up with this 
> optional flag for retrying in S3A.
> {code}
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The provided 
> token has expired. (Service: Amazon S3; Status Code: 400; Error Code: 
> ExpiredToken; Request ID: XYZ; S3 Extended Request ID: ABC; Proxy: null), S3 
> Extended Request ID: 123
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1862)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1415)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1384)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5453)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5400)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1524)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem$InputStreamCallbacksImpl.getObject(S3AFileSystem.java:1506)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.lambda$reopen$0(S3AInputStream.java:217)
>   at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)
>   ... 35 more
> {code}

[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=784532&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784532
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 24/Jun/22 10:52
Start Date: 24/Jun/22 10:52
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#discussion_r905945310


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ARetryPolicy.java:
##
@@ -214,7 +214,10 @@ protected Map, RetryPolicy> 
createExceptionMap() {
 
 // policy on a 400/bad request still ambiguous.
 // Treated as an immediate failure
-policyMap.put(AWSBadRequestException.class, fail);
+RetryPolicy awsBadRequestExceptionRetryPolicy =
+configuration.getBoolean(FAIL_ON_AWS_BAD_REQUEST, 
DEFAULT_FAIL_ON_AWS_BAD_REQUEST) ?
+fail : retryIdempotentCalls;

Review Comment:
   1. should retry on all calls, rather than just idempotent ones, as long as 
we are confident that the request is never executed before the failure
   2. I don't believe the normal exponential backoff strategy is the right one, 
as the initial delays are very short lived (500ms), whereas if you are hoping 
that credential providers will fetch new credentials, an initial delay of a few 
seconds would seem better. I wouldn't even bother with exponential growth here, 
just say 6 times at 10 seconds.
   
   I think we would also want to log at warn that this is happening, assuming 
this is rare. 
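   
   A minimal sketch of that policy, using hadoop-common's existing 
RetryPolicies factory (the 6 attempts / 10 seconds figures mirror the 
suggestion above; wiring it in place of `fail` is assumed, not part of the 
current patch):
   
   ```java
   import java.util.concurrent.TimeUnit;
   import org.apache.hadoop.io.retry.RetryPolicies;
   import org.apache.hadoop.io.retry.RetryPolicy;
   
   // Flat interval: up to 6 attempts, 10 seconds apart, no exponential growth.
   RetryPolicy retryOn400 = RetryPolicies.retryUpToMaximumCountWithFixedSleep(
       6, 10, TimeUnit.SECONDS);
   // in S3ARetryPolicy.createExceptionMap(), replacing the 'fail' mapping:
   policyMap.put(AWSBadRequestException.class, retryOn400);
   ```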



##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java:
##
@@ -1203,4 +1203,18 @@ private Constants() {
* Default maximum read size in bytes during vectored reads : {@value}.
*/
   public static final int DEFAULT_AWS_S3_VECTOR_READS_MAX_MERGED_READ_SIZE = 
1253376; //1M
+
+  /**
+   * Flag for immediate failure when observing an
+   * {@link AWSBadRequestException}.
+   * If it is disabled (set to false), the failure is treated as retryable.
+   * Value {@value}.
+   */
+  public static final String FAIL_ON_AWS_BAD_REQUEST = 
"fs.s3a.fail.on.aws.bad.request";

Review Comment:
   I now think "fs.s3a.retry.on.400.response.enabled" would be better, with 
default flipped. docs would say "experimental"
   
   and assuming we do have a custom policy, adjacent 
   
   ```
   fs.s3a.retry.on.400.response.delay  // delay between attempts, default "10s"
   fs.s3a.retry.on.400.response.attempts // number of attempts, default 6
   ```
   
   fs.s3a.retry.on.400



##
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestInvoker.java:
##
@@ -311,12 +311,25 @@ public void testRetryAWSConnectivity() throws Throwable {
*/
   @Test(expected = AWSBadRequestException.class)
   public void testRetryBadRequestNotIdempotent() throws Throwable {
-invoker.retry("test", null, false,
+
+invoker.retry("test", null, true,
 () -> {
   throw BAD_REQUEST;
 });
   }
 
+  @Test
+  public void testRetryBadRequestIdempotent() throws Throwable {

Review Comment:
   test looks ok.
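   
   As a sketch, the idempotent-path test could also assert that the retry 
actually happened by counting attempts (this assumes the test invoker is 
built with a policy that retries 400s; everything beyond the quoted diff is 
illustrative):
   
   ```java
   // requires java.util.concurrent.atomic.AtomicInteger and org.junit.Assert
   @Test
   public void testRetryBadRequestIdempotent() throws Throwable {
     final AtomicInteger attempts = new AtomicInteger(0);
     invoker.retry("test", null, true,
         () -> {
           if (attempts.incrementAndGet() < 2) {
             throw BAD_REQUEST; // first attempt fails with the 400
           }
           // second attempt succeeds, so retry() returns normally
         });
     Assert.assertEquals(2, attempts.get());
   }
   ```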





Issue Time Tracking
---

Worklog Id: (was: 784532)
Time Spent: 2h 10m  (was: 2h)


[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=784416&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784416
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 24/Jun/22 06:01
Start Date: 24/Jun/22 06:01
Worklog Time Spent: 10m 
  Work Description: taklwu commented on PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#issuecomment-1165228403

   reading from the [hadoop-aws 
page](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html)
   
   > The status code 400, Bad Request usually means that the request is 
unrecoverable; it’s the generic “No” response. Very rarely it does recover, 
which is why it is in this category, rather than that of unrecoverable failures.
   
   That does not match the code: in fact we're not retrying on a 400 bad 
request. In the case I'm reporting (and yeah, sorry, it's not as common as 
others), the retry did help, and let us use the beauty of the retry policy API.




Issue Time Tracking
---

Worklog Id: (was: 784416)
Time Spent: 2h  (was: 1h 50m)




[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=784372&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784372
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 23/Jun/22 22:46
Start Date: 23/Jun/22 22:46
Worklog Time Spent: 10m 
  Work Description: mukund-thakur commented on PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#issuecomment-1164984485

   Do we really need to introduce this config? Seems like overkill. 
   I think a 400 Bad Request is supposed to be non-retryable.




Issue Time Tracking
---

Worklog Id: (was: 784372)
Time Spent: 1h 50m  (was: 1h 40m)




[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=784108&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784108
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 23/Jun/22 10:01
Start Date: 23/Jun/22 10:01
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#issuecomment-1164217094

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 56s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  40m 21s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 53s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 52s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 27s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 40s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 24s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 16s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 45s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 43s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 103m 28s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4483 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint 
|
   | uname | Linux 1c7232363729 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 
19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 60df0f6f3637b114bb3fc7d43c74dc2208268310 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/5/testReport/ |
   | Max. process+thread count | 581 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | 

[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=784041&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784041
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 23/Jun/22 06:00
Start Date: 23/Jun/22 06:00
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#issuecomment-1163984809

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 59s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  40m 10s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 52s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 52s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 33s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 31s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/4/artifact/out/blanks-eol.txt)
 |  The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   0m 23s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m  9s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 45s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 43s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 103m 10s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4483 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint 
|
   | uname | Linux 1e6c7306374e 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 
19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f57025b0b24962cb67bfee97eb7b1c08d95e8599 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 

[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=784031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784031
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 23/Jun/22 04:17
Start Date: 23/Jun/22 04:17
Worklog Time Spent: 10m 
  Work Description: taklwu commented on PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#issuecomment-1163909148

   @steveloughran I should have provided the test results from running the 
integration tests in the description; they're not perfect, but we can discuss 
how we move forward. 
   
   




Issue Time Tracking
---

Worklog Id: (was: 784031)
Time Spent: 1h 20m  (was: 1h 10m)




[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=784027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784027
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 23/Jun/22 03:48
Start Date: 23/Jun/22 03:48
Worklog Time Spent: 10m 
  Work Description: taklwu commented on code in PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#discussion_r904491406


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java:
##
@@ -1177,4 +1177,8 @@ private Constants() {
*/
   public static final String FS_S3A_CREATE_HEADER = "fs.s3a.create.header";
 
+  public static final String FAIL_ON_AWS_BAD_REQUEST = 
"fs.s3a.retry.failOnAwsBadRequest";

Review Comment:
   ack and thanks, I will update it soon.





Issue Time Tracking
---

Worklog Id: (was: 784027)
Time Spent: 1h 10m  (was: 1h)




[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=784026&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784026
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 23/Jun/22 03:48
Start Date: 23/Jun/22 03:48
Worklog Time Spent: 10m 
  Work Description: taklwu commented on code in PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#discussion_r904491346


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ARetryPolicy.java:
##
@@ -214,7 +214,10 @@ protected Map, RetryPolicy> 
createExceptionMap() {
 
 // policy on a 400/bad request still ambiguous.
 // Treated as an immediate failure
-policyMap.put(AWSBadRequestException.class, fail);
+RetryPolicy awsBadRequestExceptionRetryPolicy =

Review Comment:
   correct me if I'm wrong, but before our change the response surfaced as 
`AWSBadRequestException` in fact comes back with an HTTP 400 error code. It is 
different from the other network failures that the 
`fail`/`RetryPolicies.TRY_ONCE_THEN_FAIL` policy has been applied to.  
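   
   For reference, a sketch of the status check behind that mapping; the real 
translation lives in S3A's exception-translation code, so this helper is 
purely illustrative:
   
   ```java
   import com.amazonaws.AmazonServiceException;
   
   // Illustrative only: S3A raises AWSBadRequestException when the
   // service exception carries an HTTP 400 status code.
   static boolean isBadRequest(AmazonServiceException e) {
     return e.getStatusCode() == 400;
   }
   ```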





Issue Time Tracking
---

Worklog Id: (was: 784026)
Time Spent: 1h  (was: 50m)




[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=783893&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783893
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 22/Jun/22 15:36
Start Date: 22/Jun/22 15:36
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#discussion_r903916520


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ARetryPolicy.java:
##
@@ -214,7 +214,10 @@ protected Map, RetryPolicy> 
createExceptionMap() {
 
 // policy on a 400/bad request still ambiguous.
 // Treated as an immediate failure
-policyMap.put(AWSBadRequestException.class, fail);
+RetryPolicy awsBadRequestExceptionRetryPolicy =

Review Comment:
   should the normal retry policy (which is expected to handle network errors) 
be applied here, or something else?



##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java:
##
@@ -1177,4 +1177,8 @@ private Constants() {
*/
   public static final String FS_S3A_CREATE_HEADER = "fs.s3a.create.header";
 
+  public static final String FAIL_ON_AWS_BAD_REQUEST = 
"fs.s3a.retry.failOnAwsBadRequest";

Review Comment:
   1. needs to be all lower case with "." between words
   2. and javadocs with {@value}
   3. and something in the documentation
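   
   For comparison, the revised patch quoted earlier in this thread applies 
those points; a sketch of the resulting constants (the default companion's 
value is an assumption, inferred from the current fail-fast behaviour):
   
   ```java
   /**
    * Flag for immediate failure when observing an
    * {@link AWSBadRequestException}.
    * If it is disabled (set to false), the failure is treated as retryable.
    * Value {@value}.
    */
   public static final String FAIL_ON_AWS_BAD_REQUEST =
       "fs.s3a.fail.on.aws.bad.request";
   
   /** Default value for {@link #FAIL_ON_AWS_BAD_REQUEST}: {@value}. */
   public static final boolean DEFAULT_FAIL_ON_AWS_BAD_REQUEST = true;
   ```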





Issue Time Tracking
---

Worklog Id: (was: 783893)
Time Spent: 50m  (was: 40m)


[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=783891&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783891
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 22/Jun/22 15:32
Start Date: 22/Jun/22 15:32
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#issuecomment-1163267093

   which s3 endpoint did you run the hadoop-aws integration tests against, and 
what was the full mvn command line used? thanks




Issue Time Tracking
---

Worklog Id: (was: 783891)
Time Spent: 40m  (was: 0.5h)




[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=783645&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783645
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 22/Jun/22 00:42
Start Date: 22/Jun/22 00:42
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#issuecomment-1162498247

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  70m  5s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 53s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 43s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 51s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 36s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 47s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 39s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 40s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 133m 43s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4483 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux edee9c8b9866 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 
19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f9292dffe2c3cdc8d351a9f87010fea36c003074 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/2/testReport/ |
   | Max. process+thread count | 601 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/2/console |
   | versions | git=2.25.1 

[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=783557&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783557
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 21/Jun/22 20:45
Start Date: 21/Jun/22 20:45
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4483:
URL: https://github.com/apache/hadoop/pull/4483#issuecomment-1162333506

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  40m 11s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 53s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 43s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 52s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 51s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 24s | 
[/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/1/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt)
 |  hadoop-tools/hadoop-aws: The patch generated 3 new + 2 unchanged - 0 fixed 
= 5 total (was 2)  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m  7s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 37s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 103m 15s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4483 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 9e62d98eb4d1 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 
19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / a9a636cbfa48cddb8e4c37cff9e03a17736d49eb |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4483/1/testReport/ |
   | Max. 

[jira] [Work logged] (HADOOP-18310) Add option and make 400 bad request retryable

2022-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18310?focusedWorklogId=783530&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783530
 ]

ASF GitHub Bot logged work on HADOOP-18310:
---

Author: ASF GitHub Bot
Created on: 21/Jun/22 19:00
Start Date: 21/Jun/22 19:00
Worklog Time Spent: 10m 
  Work Description: taklwu opened a new pull request, #4483:
URL: https://github.com/apache/hadoop/pull/4483

   ### Description of PR
   
   Add an option to make 400 Bad Request responses retryable. This adds 
`fs.s3a.retry.failOnAwsBadRequest`, defaulting to `true`, so behavior is 
unchanged unless the flag is explicitly set to `false`.
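   
   A minimal sketch of how a client might opt in, assuming the flag is read 
straight from the filesystem configuration as described above (the bucket 
name and class name here are placeholders, not part of the patch):
   
   ```java
   import java.net.URI;
   
   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.FileSystem;
   
   public class EnableBadRequestRetry {
     public static void main(String[] args) throws Exception {
       Configuration conf = new Configuration();
       // default is true (fail fast, today's behavior); false opts in to retries
       conf.setBoolean("fs.s3a.retry.failOnAwsBadRequest", false);
       try (FileSystem fs = FileSystem.get(URI.create("s3a://example-bucket/"), conf)) {
         System.out.println("S3A filesystem created: " + fs.getUri());
       }
     }
   }
   ```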
   
   ### How was this patch tested?
   Added a new unit test.
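   
   A hedged sketch of the shape such a test might take; the actual test in 
the patch may differ, and the flag name and expected decision here are 
assumptions based on the description above:
   
   ```java
   import static org.junit.Assert.assertEquals;
   
   import com.amazonaws.AmazonServiceException;
   
   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.s3a.AWSBadRequestException;
   import org.apache.hadoop.fs.s3a.S3ARetryPolicy;
   import org.apache.hadoop.io.retry.RetryPolicy;
   import org.junit.Test;
   
   public class TestBadRequestRetrySketch {
     @Test
     public void testRetryWhenFailFastDisabled() throws Exception {
       Configuration conf = new Configuration();
       conf.setBoolean("fs.s3a.retry.failOnAwsBadRequest", false);
       // build a 400 "ExpiredToken"-style service exception, as in the
       // stack trace on the JIRA
       AmazonServiceException ase =
           new AmazonServiceException("The provided token has expired.");
       ase.setStatusCode(400);
       AWSBadRequestException ex = new AWSBadRequestException("open", ase);
       RetryPolicy.RetryAction action =
           new S3ARetryPolicy(conf).shouldRetry(ex, 0, 0, true);
       // with fail-fast disabled, the policy should choose to retry
       assertEquals(RetryPolicy.RetryAction.RetryDecision.RETRY, action.action);
     }
   }
   ```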
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




Issue Time Tracking
---

Worklog Id: (was: 783530)
Remaining Estimate: 0h
Time Spent: 10m

> Add option and make 400 bad request retryable
> -
>
> Key: HADOOP-18310
> URL: https://issues.apache.org/jira/browse/HADOOP-18310
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.4
>Reporter: Tak-Lon (Stephen) Wu
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When using a customized credential provider via 
> fs.s3a.aws.credentials.provider, e.g. 
> org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider, the credential 
> supplied by the pluggable provider may expire, and the service then returns 
> error code 400 as a bad request exception.
> Here, the current S3ARetryPolicy fails immediately and does not retry at 
> the S3A level.
> A recent use case in HBase showed that this exception could cause a Region 
> Server to be abandoned immediately, without retry, when the file system is 
> opening a file or S3AInputStream is reopening one. For the S3AInputStream 
> cases in particular, we cannot find a good way to retry outside of the file 
> system semantics (an ongoing stream that fails is treated as being in an 
> irreparable state), and thus we propose this optional flag for retrying in 
> S3A (see the sketch after the stack trace below).
> {code}
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The provided 
> token has expired. (Service: Amazon S3; Status Code: 400; Error Code: 
> ExpiredToken; Request ID: XYZ; S3 Extended Request ID: ABC; Proxy: null), S3 
> Extended Request ID: 123
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1862)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1415)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1384)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5453)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5400)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1524)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem$InputStreamCallbacksImpl.getObject(S3AFileSystem.java:1506)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.lambda$reopen$0(S3AInputStream.java:217)
>   at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)
>   ... 35 more
> {code}
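
A hedged sketch of the idea behind the flag, for readers unfamiliar with 
S3ARetryPolicy: the policy keeps a map from exception class to retry policy, 
and AWSBadRequestException currently maps to fail-fast. The wiring below and 
the retry count/delay are illustrative assumptions, not the patch itself:

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.fs.s3a.AWSBadRequestException;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;

final class BadRequestPolicySketch {
  static void mapBadRequest(
      Map<Class<? extends Exception>, RetryPolicy> policyMap,
      boolean failOnBadRequest) {
    RetryPolicy policy = failOnBadRequest
        // current behavior: a 400 response fails immediately
        ? RetryPolicies.TRY_ONCE_THEN_FAIL
        // opt-in: retry a bounded number of times with a fixed delay
        : RetryPolicies.retryUpToMaximumCountWithFixedSleep(
            6, 10, TimeUnit.SECONDS);
    policyMap.put(AWSBadRequestException.class, policy);
  }
}
```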



--
This message was sent by Atlassian Jira
(v8.20.7#820007)