[jira] [Created] (HADOOP-19044) AWS SDK V2 - Update S3A region logic

2024-01-18 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-19044:
-

 Summary: AWS SDK V2 - Update S3A region logic 
 Key: HADOOP-19044
 URL: https://issues.apache.org/jira/browse/HADOOP-19044
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Ahmar Suhail


If both fs.s3a.endpoint and fs.s3a.endpoint.region are empty, Spark will set 
fs.s3a.endpoint to s3.amazonaws.com here:

[https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L540]
 

 

HADOOP-18908 updated the region logic so that cross-region access is not 
enabled if fs.s3a.endpoint.region is set, or if a region can be parsed from 
fs.s3a.endpoint. The latter happens in this case: the region is parsed as 
US_EAST_1. This will cause 400 errors if the bucket is not in US_EAST_1. 

 

Proposed: update the logic so that if the endpoint is the global 
s3.amazonaws.com, cross-region access is enabled.
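
To make the proposal concrete, here is a minimal sketch of the decision in Python (S3A itself is Java; Python is used only for brevity). The function name cross_region_access_enabled and the constant CENTRAL_ENDPOINT are hypothetical. Two points are assumptions rather than anything this issue states: an explicit region takes precedence over the global-endpoint check, and every other non-empty endpoint is treated as one a region can be parsed from.

```python
from typing import Optional

# Hypothetical constant for the global endpoint named in this issue.
CENTRAL_ENDPOINT = "s3.amazonaws.com"


def cross_region_access_enabled(endpoint: Optional[str], region: Optional[str]) -> bool:
    """Sketch of the proposed rule; not the S3A implementation."""
    endpoint = (endpoint or "").strip().lower()
    region = (region or "").strip()
    if region:
        # fs.s3a.endpoint.region is set: cross-region access stays off.
        # Assumption: an explicit region wins over the endpoint check below.
        return False
    if endpoint == CENTRAL_ENDPOINT:
        # Proposed change: the global endpoint says nothing about where the
        # bucket lives, so cross-region access is left on.
        return True
    if endpoint:
        # Simplification: any other endpoint is treated as one a region can
        # be parsed from, which keeps the HADOOP-18908 behaviour (off).
        return False
    # Neither option is set.
    return True
```

With the Spark default described above, cross_region_access_enabled("s3.amazonaws.com", "") returns True in this sketch, which is the case that currently produces the 400 errors.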

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19007) S3A: transfer manager not wired up to s3a executor pool

2023-12-08 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17794670#comment-17794670
 ] 

Ahmar Suhail commented on HADOOP-19007:
---

So I think what it means to pass in the executor has changed between V1 and V2. 

With V1, that executor pool would be used to make requests to S3, documented 
[here|https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html#TransferManager-com.amazonaws.services.s3.AmazonS3-java.util.concurrent.ExecutorService-]

With V2, if you pass it in, it's only used for certain background tasks before 
calling the S3AsyncClient, such as visiting the file tree in an 
S3TransferManager.uploadDirectory(UploadDirectoryRequest) operation. I don't 
think it's relevant for our use case of copy. Documented 
[here|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/transfer/s3/S3TransferManager.Builder.html#executor(java.util.concurrent.Executor)]

It'll end up using the executor of the S3AsyncClient. Currently that client 
creates its own executor pool, but we can also pass in our own if required. 
That behaviour is documented 
[here|https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/asynchronous.html]

Do you think there is an advantage to passing the boundedThreadPool in to the 
S3AsyncClient?

> S3A: transfer manager not wired up to s3a executor pool
> ---
>
> Key: HADOOP-19007
> URL: https://issues.apache.org/jira/browse/HADOOP-19007
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Major
>
> S3ClientFactory.createS3TransferManager() doesn't use the executor declared 
> in S3ClientCreationParameters.transferManagerExecutor
> * method needs to take S3ClientCreationParameters
> * and set the transfer manager executor






[jira] [Updated] (HADOOP-18888) S3A. createS3AsyncClient() always enables multipart

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18888:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> S3A. createS3AsyncClient() always enables multipart
> ---
>
> Key: HADOOP-18888
> URL: https://issues.apache.org/jira/browse/HADOOP-18888
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> DefaultS3ClientFactory.createS3AsyncClient() always creates clients with 
> multipart enabled; if it is disabled in s3a config it should be disabled here 
> and in the transfer manager






[jira] [Updated] (HADOOP-18908) Improve s3a region handling, including determining from endpoint

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18908:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> Improve s3a region handling, including determining from endpoint
> 
>
> Key: HADOOP-18908
> URL: https://issues.apache.org/jira/browse/HADOOP-18908
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> S3A region logic improved for better inference and
> to be compatible with previous releases
> 1. If you are using an AWS S3 AccessPoint, its region is determined
>    from the ARN itself.
> 2. If fs.s3a.endpoint.region is set and non-empty, it is used.
> 3. If fs.s3a.endpoint is an s3.*.amazonaws.com URL,
>    the region is determined by parsing the URL.
>    Note: vpce endpoints are not handled by this.
> 4. If fs.s3a.endpoint.region==null, and none could be determined
>    from the endpoint, use us-east-2 as default.
> 5. If fs.s3a.endpoint.region=="" then it is handed off to
>    the default AWS SDK resolution process.
> Consult the AWS SDK documentation for the details on its resolution
> process, knowing that it is complicated and may use environment variables,
> entries in ~/.aws/config, IAM instance information within
> EC2 deployments and possibly even JSON resources on the classpath.
> Put differently: it is somewhat brittle across deployments.
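
The resolution order in the list above can be sketched as follows. This is an illustration in Python, not the S3A code (which is Java): resolve_region, its parameters, and the regular expression are hypothetical names, and a return value of None is this sketch's own convention for "hand off to the AWS SDK". The global endpoint s3.amazonaws.com is not modelled here.

```python
import re
from typing import Optional

# Endpoints of the form s3.<region>.amazonaws.com, with an optional scheme.
# vpce endpoints do not match, as noted in the list above.
_REGIONAL_ENDPOINT = re.compile(r"(?:https?://)?s3\.([a-z0-9-]+)\.amazonaws\.com/?")

DEFAULT_REGION = "us-east-2"


def resolve_region(access_point_arn: Optional[str],
                   region_option: Optional[str],
                   endpoint: Optional[str]) -> Optional[str]:
    """Return a region name, or None to defer to the AWS SDK resolution chain."""
    # 1. Access point: arn:<partition>:s3:<region>:<account>:accesspoint/<name>
    if access_point_arn:
        return access_point_arn.split(":")[3]
    # 2. fs.s3a.endpoint.region is set and non-empty.
    if region_option:
        return region_option
    # 3. Region embedded in fs.s3a.endpoint.
    if endpoint:
        match = _REGIONAL_ENDPOINT.fullmatch(endpoint.strip().lower())
        if match:
            return match.group(1)
    # 4. Option unset (None) and nothing parsed from the endpoint.
    if region_option is None:
        return DEFAULT_REGION
    # 5. Option explicitly set to "": leave it to the SDK.
    return None
```

For example, resolve_region(None, None, "https://s3.us-west-2.amazonaws.com") yields "us-west-2", while a vpce endpoint with no region option falls through to the us-east-2 default.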






[jira] [Updated] (HADOOP-18930) S3A: make fs.s3a.create.performance an option you can set for the entire bucket

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18930:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> S3A: make fs.s3a.create.performance an option you can set for the entire 
> bucket
> ---
>
> Key: HADOOP-18930
> URL: https://issues.apache.org/jira/browse/HADOOP-18930
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> Make the fs.s3a.create.performance option something you can set everywhere, 
> rather than just in an openFile() option or under a magic path.
> This improves performance on apps like Iceberg where filenames are generated 
> with UUIDs in them, so we know there are no overwrites.






[jira] [Updated] (HADOOP-18915) Tune/extend S3A http connection and thread pool settings

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18915:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> Tune/extend S3A http connection and thread pool settings
> 
>
> Key: HADOOP-18915
> URL: https://issues.apache.org/jira/browse/HADOOP-18915
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> Increases existing pool sizes, as with server scale and vector
> IO, larger pools are needed
>   fs.s3a.connection.maximum 200
>   fs.s3a.threads.max 96
> Adds new configuration options for v2 sdk internal timeouts,
> both with default of 60s:
>   fs.s3a.connection.acquisition.timeout
>   fs.s3a.connection.idle.time
> All the pool/timeout options are covered in performance.md
> Moves all timeout/duration options in the s3a FS to taking
> temporal units (h, m, s, ms,...); retaining the previous default
> unit (normally millisecond)
> Adds a minimum duration for most of these, in order to recover from
> deployments where a timeout has been set on the assumption the unit
> was seconds, not millis.
> Uses java.time.Duration throughout the codebase;
> retaining the older numeric constants in
> org.apache.hadoop.fs.s3a.Constants for backwards compatibility;
> these are now deprecated.
> Adds new class AWSApiCallTimeoutException to be raised on
> sdk-related methods and also gateway timeouts. This is a subclass
> of org.apache.hadoop.net.ConnectTimeoutException to support
> existing retry logic.
> + reverted default value of fs.s3a.create.performance to false; 
> inadvertently set to true during testing.
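
The unit handling described above (optional suffix, a default unit for bare numbers, and a minimum duration) can be illustrated with a small Python sketch. It is not the Hadoop implementation: parse_duration is a hypothetical name, only the suffixes listed in the description (h, m, s, ms) are accepted, and raising a too-small value up to the minimum is an assumption about how the "minimum duration" recovery works.

```python
import re
from datetime import timedelta

# Only the suffixes named in the description; longer suffixes are listed first.
_UNITS = {
    "ms": timedelta(milliseconds=1),
    "s": timedelta(seconds=1),
    "m": timedelta(minutes=1),
    "h": timedelta(hours=1),
}
_DURATION = re.compile(r"\s*(\d+)\s*(ms|s|m|h)?\s*")


def parse_duration(value: str, default_unit: str, minimum: timedelta) -> timedelta:
    """Parse '60s', '500ms' or a bare number, then enforce a lower bound."""
    match = _DURATION.fullmatch(value.lower())
    if match is None:
        raise ValueError(f"not a duration: {value!r}")
    number, suffix = match.groups()
    # A bare number keeps the option's previous default unit.
    parsed = int(number) * _UNITS[suffix or default_unit]
    # Assumption: values below the minimum are raised to it, which covers a
    # timeout written as "60" by someone who expected seconds, not millis.
    return max(parsed, minimum)
```

With a hypothetical 15 second minimum, parse_duration("60", "ms", timedelta(seconds=15)) gives 15 seconds rather than 60 milliseconds, whereas "60s" is taken as written.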






[jira] [Updated] (HADOOP-18932) Upgrade AWS v2 SDK to 2.20.160 and v1 to 1.12.565

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18932:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> Upgrade AWS v2 SDK to 2.20.160 and v1 to 1.12.565
> -
>
> Key: HADOOP-18932
> URL: https://issues.apache.org/jira/browse/HADOOP-18932
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> Bump up the sdk versions for both...even if we don't ship v1 it helps us 
> qualify releases with newer versions, and means that an upgrade of that alone 
> to branch-3.3 will be in sync.






[jira] [Updated] (HADOOP-18995) S3A: Upgrade AWS SDK version to 2.21.33 for Amazon S3 Express One Zone support

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18995:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> S3A: Upgrade AWS SDK version to 2.21.33 for Amazon S3 Express One Zone support
> --
>
> Key: HADOOP-18995
> URL: https://issues.apache.org/jira/browse/HADOOP-18995
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> Upgrade SDK version to 2.21.33, which adds S3 Express One Zone support.






[jira] [Updated] (HADOOP-18939) NPE in AWS v2 SDK RetryOnErrorCodeCondition.shouldRetry()

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18939:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> NPE in AWS v2 SDK RetryOnErrorCodeCondition.shouldRetry()
> -
>
> Key: HADOOP-18939
> URL: https://issues.apache.org/jira/browse/HADOOP-18939
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> NPE in error handling code of RetryOnErrorCodeCondition.shouldRetry(); in 
> bundle-2.20.128.jar
> This is AWS SDK code; fix needs to go there. 
> {code}
> Caused by: java.lang.NullPointerException
>   at 
> software.amazon.awssdk.awscore.retry.conditions.RetryOnErrorCodeCondition.shouldRetry(RetryOnErrorCodeCondition.java:45)
>  ~[bundle-2.20.128.jar:?]
>   at 
> software.amazon.awssdk.core.retry.conditions.OrRetryCondition.lambda$shouldRetry$0(OrRetryCondition.java:46)
>  ~[bundle-2.20.128.jar:?]
>   at java.util.stream.MatchOps$1MatchSink.accept(MatchOps.java:90) 
> ~[?:1.8.0_382]
> {code}






[jira] [Updated] (HADOOP-18945) S3A: IAMInstanceCredentialsProvider failing: Failed to load credentials from IMDS

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18945:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> S3A: IAMInstanceCredentialsProvider failing: Failed to load credentials from 
> IMDS
> -
>
> Key: HADOOP-18945
> URL: https://issues.apache.org/jira/browse/HADOOP-18945
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 7.2.18.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> Failures in Impala test VMs using IAM for auth
> {code}
> Failed to open file as a parquet file: java.net.SocketTimeoutException: 
> re-open 
> s3a://impala-test-uswest2-1/test-warehouse/test_pre_gregorian_date_parquet_2e80ae30.db/hive2_pre_gregorian.parquet
>  at 84 on 
> s3a://impala-test-uswest2-1/test-warehouse/test_pre_gregorian_date_parquet_2e80ae30.db/hive2_pre_gregorian.parquet:
>  org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException: +: Failed to load 
> credentials from IMDS
> {code}






[jira] [Updated] (HADOOP-18996) S3A to provide full support for S3 Express One Zone

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18996:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> S3A to provide full support for S3 Express One Zone
> ---
>
> Key: HADOOP-18996
> URL: https://issues.apache.org/jira/browse/HADOOP-18996
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> HADOOP-18995 upgrades the SDK version, which allows connecting to S3 Express 
> One Zone stores. 
> Complete support needs to be added to address tests that fail with S3 Express 
> One Zone, additional tests, documentation etc. 
> * hadoop-common path capability to indicate that treewalking may encounter 
> missing dirs
> * use this in treewalking code in shell, mapreduce FileInputFormat etc to not 
> fail during treewalks
> * extra path capability for s3express too.
> * tests for this
> * anything else






[jira] [Updated] (HADOOP-18946) S3A: testMultiObjectExceptionFilledIn() assertion error

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18946:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> S3A: testMultiObjectExceptionFilledIn() assertion error
> ---
>
> Key: HADOOP-18946
> URL: https://issues.apache.org/jira/browse/HADOOP-18946
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> Failure in the new test of HADOOP-18939.
> I've been fiddling with the sdk upgrade, and only merged HADOOP-18932 after 
> submitting the new pr, so maybe, just maybe, the SDK changed some defaults.
> anyway, 
> {code}
> [ERROR] 
> testMultiObjectExceptionFilledIn(org.apache.hadoop.fs.s3a.impl.TestErrorTranslation)
>   Time elapsed: 0.026 s  <<< FAILURE!
> java.lang.AssertionError: retry policy of MultiObjectException
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.assertTrue(Assert.java:42)
> at 
> {code}
> easily fixed






[jira] [Updated] (HADOOP-18948) S3A. Add option fs.s3a.directory.operations.purge.uploads to purge on rename/delete

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18948:
--
Fix Version/s: 3.3.7-aws
   (was: 3.3.6-aws)

> S3A. Add option fs.s3a.directory.operations.purge.uploads to purge on 
> rename/delete
> ---
>
> Key: HADOOP-18948
> URL: https://issues.apache.org/jira/browse/HADOOP-18948
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.7-aws
>
>
> On third-party stores without lifecycle rules it's possible to accrue many GB 
> of pending multipart uploads, including from
> * magic committer jobs where spark driver/MR AM failed before commit/abort
> * distcp jobs which time out and get aborted
> * any client code writing datasets which are interrupted before close.
> Although there's a purge pending uploads option, that's dangerous because if 
> any fs is instantiated with it, it can destroy in-flight work.
> Otherwise, the "hadoop s3guard uploads" command does work but needs 
> scheduling/manual execution.
> Proposed: add a new property {{fs.s3a.directory.operations.purge.uploads}} 
> which will automatically cancel all pending uploads under a path
> * delete: everything under the dir
> * rename: all under the source dir
> This will be done in parallel to the normal operation, but with no attempt to 
> post abortMultipartUploads in different threads. The assumption here is that 
> this is rare. And it'll be off by default, as in AWS people should have 
> rules for these things.
> + doc (third_party?)
> + add new counter/metric for abort operations, count and duration
> + test to include cost assertions
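
As a rough illustration of "all pending uploads under a path", the selection step could look like the Python sketch below. Listing and aborting the uploads is left out, and uploads_under and the PendingUpload tuple are hypothetical names, not S3A classes. For a delete, the directory is the one being deleted; for a rename, it is the source directory.

```python
from typing import Iterable, List, Tuple

# (object key, upload ID) for one pending multipart upload.
PendingUpload = Tuple[str, str]


def uploads_under(directory_key: str,
                  pending: Iterable[PendingUpload]) -> List[PendingUpload]:
    """Select the pending uploads whose destination key lies under directory_key."""
    prefix = directory_key.strip("/")
    # The trailing slash stops "data/out" from also matching "data/output/...".
    prefix = prefix + "/" if prefix else ""
    return [(key, upload_id) for key, upload_id in pending
            if key.startswith(prefix)]
```

The trailing-slash normalisation is the one design point worth noting: a plain string-prefix test on the directory name would also cancel uploads belonging to sibling directories with a longer name.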






[jira] [Commented] (HADOOP-19003) S3A Assume role tests failing against S3Express stores

2023-12-06 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17793861#comment-17793861
 ] 

Ahmar Suhail commented on HADOOP-19003:
---

Checked: even if we disable createSession, any roles still need to use the 
s3Express namespace and CreateSession action. I can work on this once I'm back 
from holiday; need to see if we should create new roles or skip failing tests, 
as you can only restrict on a bucket level and not by prefix. 

> S3A Assume role tests failing against S3Express stores
> --
>
> Key: HADOOP-19003
> URL: https://issues.apache.org/jira/browse/HADOOP-19003
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Minor
>
> The test suites which assume roles with restricted permissions down paths 
> still fail on S3Express, even after disabling createSession.
> This is with a role which *should* work.
> Either the role setup is wrong, or there's something special about role 
> configuration for S3Express buckets






[jira] [Updated] (HADOOP-18996) S3A to provide full support for S3 Express One Zone

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18996:
--
Fix Version/s: 3.3.6-aws

> S3A to provide full support for S3 Express One Zone
> ---
>
> Key: HADOOP-18996
> URL: https://issues.apache.org/jira/browse/HADOOP-18996
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> HADOOP-18995 upgrades the SDK version, which allows connecting to S3 Express 
> One Zone stores. 
> Complete support needs to be added to address tests that fail with S3 Express 
> One Zone, additional tests, documentation etc. 
> * hadoop-common path capability to indicate that treewalking may encounter 
> missing dirs
> * use this in treewalking code in shell, mapreduce FileInputFormat etc to not 
> fail during treewalks
> * extra path capability for s3express too.
> * tests for this
> * anything else






[jira] [Updated] (HADOOP-18995) S3A: Upgrade AWS SDK version to 2.21.33 for Amazon S3 Express One Zone support

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18995:
--
Fix Version/s: 3.3.6-aws

> S3A: Upgrade AWS SDK version to 2.21.33 for Amazon S3 Express One Zone support
> --
>
> Key: HADOOP-18995
> URL: https://issues.apache.org/jira/browse/HADOOP-18995
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> Upgrade SDK version to 2.21.33, which adds S3 Express One Zone support.






[jira] [Updated] (HADOOP-18915) Tune/extend S3A http connection and thread pool settings

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18915:
--
Fix Version/s: 3.3.6-aws

> Tune/extend S3A http connection and thread pool settings
> 
>
> Key: HADOOP-18915
> URL: https://issues.apache.org/jira/browse/HADOOP-18915
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> Increases existing pool sizes, as with server scale and vector
> IO, larger pools are needed
>   fs.s3a.connection.maximum 200
>   fs.s3a.threads.max 96
> Adds new configuration options for v2 sdk internal timeouts,
> both with default of 60s:
>   fs.s3a.connection.acquisition.timeout
>   fs.s3a.connection.idle.time
> All the pool/timeout options are covered in performance.md
> Moves all timeout/duration options in the s3a FS to taking
> temporal units (h, m, s, ms,...); retaining the previous default
> unit (normally millisecond)
> Adds a minimum duration for most of these, in order to recover from
> deployments where a timeout has been set on the assumption the unit
> was seconds, not millis.
> Uses java.time.Duration throughout the codebase;
> retaining the older numeric constants in
> org.apache.hadoop.fs.s3a.Constants for backwards compatibility;
> these are now deprecated.
> Adds new class AWSApiCallTimeoutException to be raised on
> sdk-related methods and also gateway timeouts. This is a subclass
> of org.apache.hadoop.net.ConnectTimeoutException to support
> existing retry logic.
> + reverted default value of fs.s3a.create.performance to false; 
> inadvertently set to true during testing.






[jira] [Updated] (HADOOP-18930) S3A: make fs.s3a.create.performance an option you can set for the entire bucket

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18930:
--
Fix Version/s: 3.3.6-aws

> S3A: make fs.s3a.create.performance an option you can set for the entire 
> bucket
> ---
>
> Key: HADOOP-18930
> URL: https://issues.apache.org/jira/browse/HADOOP-18930
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> Make the fs.s3a.create.performance option something you can set everywhere, 
> rather than just in an openFile() option or under a magic path.
> This improves performance on apps like Iceberg where filenames are generated 
> with UUIDs in them, so we know there are no overwrites.






[jira] [Updated] (HADOOP-18945) S3A: IAMInstanceCredentialsProvider failing: Failed to load credentials from IMDS

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18945:
--
Fix Version/s: 3.3.6-aws

> S3A: IAMInstanceCredentialsProvider failing: Failed to load credentials from 
> IMDS
> -
>
> Key: HADOOP-18945
> URL: https://issues.apache.org/jira/browse/HADOOP-18945
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 7.2.18.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> Failures in Impala test VMs using IAM for auth
> {code}
> Failed to open file as a parquet file: java.net.SocketTimeoutException: 
> re-open 
> s3a://impala-test-uswest2-1/test-warehouse/test_pre_gregorian_date_parquet_2e80ae30.db/hive2_pre_gregorian.parquet
>  at 84 on 
> s3a://impala-test-uswest2-1/test-warehouse/test_pre_gregorian_date_parquet_2e80ae30.db/hive2_pre_gregorian.parquet:
>  org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException: +: Failed to load 
> credentials from IMDS
> {code}






[jira] [Updated] (HADOOP-18946) S3A: testMultiObjectExceptionFilledIn() assertion error

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18946:
--
Fix Version/s: 3.3.6-aws

> S3A: testMultiObjectExceptionFilledIn() assertion error
> ---
>
> Key: HADOOP-18946
> URL: https://issues.apache.org/jira/browse/HADOOP-18946
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> Failure in the new test of HADOOP-18939.
> I've been fiddling with the sdk upgrade, and only merged HADOOP-18932 after 
> submitting the new pr, so maybe, just maybe, the SDK changed some defaults.
> anyway, 
> {code}
> [ERROR] 
> testMultiObjectExceptionFilledIn(org.apache.hadoop.fs.s3a.impl.TestErrorTranslation)
>   Time elapsed: 0.026 s  <<< FAILURE!
> java.lang.AssertionError: retry policy of MultiObjectException
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.assertTrue(Assert.java:42)
> at 
> {code}
> easily fixed






[jira] [Updated] (HADOOP-18948) S3A. Add option fs.s3a.directory.operations.purge.uploads to purge on rename/delete

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18948:
--
Fix Version/s: 3.3.6-aws

> S3A. Add option fs.s3a.directory.operations.purge.uploads to purge on 
> rename/delete
> ---
>
> Key: HADOOP-18948
> URL: https://issues.apache.org/jira/browse/HADOOP-18948
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> On third-party stores without lifecycle rules it's possible to accrue many GB 
> of pending multipart uploads, including from
> * magic committer jobs where spark driver/MR AM failed before commit/abort
> * distcp jobs which time out and get aborted
> * any client code writing datasets which are interrupted before close.
> Although there's a purge pending uploads option, that's dangerous because if 
> any fs is instantiated with it, it can destroy in-flight work.
> Otherwise, the "hadoop s3guard uploads" command does work but needs 
> scheduling/manual execution.
> Proposed: add a new property {{fs.s3a.directory.operations.purge.uploads}} 
> which will automatically cancel all pending uploads under a path
> * delete: everything under the dir
> * rename: all under the source dir
> This will be done in parallel to the normal operation, but with no attempt to 
> post abortMultipartUploads in different threads. The assumption here is that 
> this is rare. And it'll be off by default, as in AWS people should have 
> rules for these things.
> + doc (third_party?)
> + add new counter/metric for abort operations, count and duration
> + test to include cost assertions
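
The "everything under the dir" selection described above can be sketched as a plain prefix filter. This is only an illustration under assumptions: the class and method names are hypothetical, pending uploads are modelled as a list of object keys, and nothing here calls S3 or the S3A code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative sketch only; names are hypothetical, not S3A classes. */
final class PendingUploadSelectionSketch {

  /**
   * Returns the keys of pending uploads that sit under the given directory.
   * A trailing slash is appended so that "data/job1" does not match "data/job10".
   */
  static List<String> uploadsUnder(List<String> pendingUploadKeys, String dir) {
    String prefix = dir.endsWith("/") ? dir : dir + "/";
    return pendingUploadKeys.stream()
        .filter(key -> key.startsWith(prefix))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> pending = Arrays.asList(
        "data/job1/part-0", "data/job10/part-0", "other/x");
    // For delete: the directory being deleted. For rename: the source directory.
    System.out.println(uploadsUnder(pending, "data/job1"));
  }
}
```

The same selection would apply to both cases in the proposal; only the directory passed in differs (the deleted dir, or the rename source).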






[jira] [Updated] (HADOOP-18908) Improve s3a region handling, including determining from endpoint

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18908:
--
Fix Version/s: 3.3.6-aws

> Improve s3a region handling, including determining from endpoint
> 
>
> Key: HADOOP-18908
> URL: https://issues.apache.org/jira/browse/HADOOP-18908
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> S3A region logic improved for better inference and
> to be compatible with previous releases
> 1. If you are using an AWS S3 AccessPoint, its region is determined
>    from the ARN itself.
> 2. If fs.s3a.endpoint.region is set and non-empty, it is used.
> 3. If fs.s3a.endpoint is an s3.*.amazonaws.com url, 
>    the region is determined by parsing the URL. 
>    Note: vpce endpoints are not handled by this.
> 4. If fs.s3a.endpoint.region==null, and none could be determined
>    from the endpoint, use us-east-2 as default.
> 5. If fs.s3a.endpoint.region=="" then it is handed off to
>    the default AWS SDK resolution process.
> Consult the AWS SDK documentation for the details on its resolution
> process, knowing that it is complicated and may use environment variables,
> entries in ~/.aws/config, IAM instance information within
> EC2 deployments and possibly even JSON resources on the classpath.
> Put differently: it is somewhat brittle across deployments.
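
A minimal sketch of the five rules above, applied in the order they are listed. It is not the S3A implementation: the class name, method signature and endpoint regex are assumptions made for illustration, and an empty Optional stands in for "hand off to the AWS SDK resolution process". The global s3.amazonaws.com endpoint is deliberately not matched by this pattern.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the S3A region logic. */
final class RegionResolutionSketch {

  /** Default named in rule 4 of the description. */
  static final String DEFAULT_REGION = "us-east-2";

  /** Assumed pattern for s3.<region>.amazonaws.com endpoints. */
  private static final Pattern STANDARD_ENDPOINT =
      Pattern.compile("(?:https?://)?s3\\.([a-z0-9-]+)\\.amazonaws\\.com/?");

  /**
   * @param accessPointArn   access point ARN, or null if none is in use
   * @param configuredRegion value of fs.s3a.endpoint.region: null if unset, "" if set empty
   * @param endpoint         value of fs.s3a.endpoint, or null if unset
   * @return the region, or empty to mean "let the SDK resolve it"
   */
  static Optional<String> resolve(String accessPointArn, String configuredRegion, String endpoint) {
    // Rule 1: the region is the fourth colon-separated field of the ARN.
    if (accessPointArn != null) {
      String[] parts = accessPointArn.split(":");
      if (parts.length > 3 && !parts[3].isEmpty()) {
        return Optional.of(parts[3]);
      }
    }
    // Rule 2: an explicit, non-empty region wins.
    if (configuredRegion != null && !configuredRegion.isEmpty()) {
      return Optional.of(configuredRegion);
    }
    // Rule 3: parse a standard endpoint; vpce endpoints do not match.
    if (endpoint != null) {
      Matcher m = STANDARD_ENDPOINT.matcher(endpoint);
      if (m.matches()) {
        return Optional.of(m.group(1));
      }
    }
    // Rule 4: region unset and nothing parsed, so fall back to the default.
    if (configuredRegion == null) {
      return Optional.of(DEFAULT_REGION);
    }
    // Rule 5: region explicitly set to "", so defer to the SDK.
    return Optional.empty();
  }
}
```

The distinction between null and "" carries real meaning here, which is why the sketch keeps them as separate cases rather than normalising both to "unset".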






[jira] [Updated] (HADOOP-18939) NPE in AWS v2 SDK RetryOnErrorCodeCondition.shouldRetry()

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18939:
--
Fix Version/s: 3.3.6-aws

> NPE in AWS v2 SDK RetryOnErrorCodeCondition.shouldRetry()
> -
>
> Key: HADOOP-18939
> URL: https://issues.apache.org/jira/browse/HADOOP-18939
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> NPE in error handling code of RetryOnErrorCodeCondition.shouldRetry(); in 
> bundle-2.20.128.jar
> This is AWS SDK code; fix needs to go there. 
> {code}
> Caused by: java.lang.NullPointerException
>   at 
> software.amazon.awssdk.awscore.retry.conditions.RetryOnErrorCodeCondition.shouldRetry(RetryOnErrorCodeCondition.java:45)
>  ~[bundle-2.20.128.jar:?]
>   at 
> software.amazon.awssdk.core.retry.conditions.OrRetryCondition.lambda$shouldRetry$0(OrRetryCondition.java:46)
>  ~[bundle-2.20.128.jar:?]
>   at java.util.stream.MatchOps$1MatchSink.accept(MatchOps.java:90) 
> ~[?:1.8.0_382]
> {code}






[jira] [Updated] (HADOOP-18932) Upgrade AWS v2 SDK to 2.20.160 and v1 to 1.12.565

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18932:
--
Fix Version/s: 3.3.6-aws

> Upgrade AWS v2 SDK to 2.20.160 and v1 to 1.12.565
> -
>
> Key: HADOOP-18932
> URL: https://issues.apache.org/jira/browse/HADOOP-18932
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> Bump up the sdk versions for both...even if we don't ship v1 it helps us 
> qualify releases with newer versions, and means that an upgrade of that alone 
> to branch-3.3 will be in sync.






[jira] [Updated] (HADOOP-18888) S3A. createS3AsyncClient() always enables multipart

2023-12-06 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18888:
--
Fix Version/s: 3.3.6-aws

> S3A. createS3AsyncClient() always enables multipart
> ---
>
> Key: HADOOP-18888
> URL: https://issues.apache.org/jira/browse/HADOOP-18888
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6-aws
>
>
> DefaultS3ClientFactory.createS3AsyncClient() always creates clients with 
> multipart enabled; if it is disabled in s3a config it should be disabled here 
> and in the transfer manager






[jira] [Assigned] (HADOOP-18995) Add support for Amazon S3 Express One Zone Storage - SDK version upgrade

2023-11-29 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail reassigned HADOOP-18995:
-

Assignee: Ahmar Suhail

> Add support for Amazon S3 Express One Zone Storage - SDK version upgrade
> 
>
> Key: HADOOP-18995
> URL: https://issues.apache.org/jira/browse/HADOOP-18995
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
>
> Upgrade SDK version to 2.21.33, which adds S3 Express One Zone support.






[jira] [Updated] (HADOOP-18996) Add necessary software support for S3 Express One Zone

2023-11-29 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18996:
--
Component/s: fs/s3

> Add necessary software support for S3 Express One Zone
> --
>
> Key: HADOOP-18996
> URL: https://issues.apache.org/jira/browse/HADOOP-18996
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> HADOOP-18995 upgrades the SDK version, which allows connecting to S3 Express 
> One Zone. 
> Complete support still needs to be added: fixing tests that fail with S3 
> Express One Zone, adding new tests, documentation, etc. 






[jira] [Updated] (HADOOP-18996) Add necessary software support for S3 Express One Zone

2023-11-29 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18996:
--
Affects Version/s: 3.4.0

> Add necessary software support for S3 Express One Zone
> --
>
> Key: HADOOP-18996
> URL: https://issues.apache.org/jira/browse/HADOOP-18996
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> HADOOP-18995 upgrades the SDK version, which allows connecting to S3 Express 
> One Zone. 
> Complete support still needs to be added: fixing tests that fail with S3 
> Express One Zone, adding new tests, documentation, etc. 






[jira] [Created] (HADOOP-18996) Add necessary software support for S3 Express One Zone

2023-11-29 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18996:
-

 Summary: Add necessary software support for S3 Express One Zone
 Key: HADOOP-18996
 URL: https://issues.apache.org/jira/browse/HADOOP-18996
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


HADOOP-18995 upgrades the SDK version, which allows connecting to S3 Express 
One Zone. 

Complete support still needs to be added: fixing tests that fail with S3 
Express One Zone, adding new tests, documentation, etc. 






[jira] [Updated] (HADOOP-18995) Add support for Amazon S3 Express One Zone Storage - SDK version upgrade

2023-11-29 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18995:
--
Summary: Add support for Amazon S3 Express One Zone Storage - SDK version 
upgrade  (was: Add support for Amazon S3 Express One Zone Storage)

> Add support for Amazon S3 Express One Zone Storage - SDK version upgrade
> 
>
> Key: HADOOP-18995
> URL: https://issues.apache.org/jira/browse/HADOOP-18995
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> Upgrade SDK version to 2.21.33, which adds S3 Express One Zone support.






[jira] [Updated] (HADOOP-18995) Add support for Amazon S3 Express One Zone Storage

2023-11-29 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18995:
--
Description: Upgrade SDK version to 2.21.33, which adds S3 Express One Zone 
support.

> Add support for Amazon S3 Express One Zone Storage
> --
>
> Key: HADOOP-18995
> URL: https://issues.apache.org/jira/browse/HADOOP-18995
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> Upgrade SDK version to 2.21.33, which adds S3 Express One Zone support.






[jira] [Created] (HADOOP-18995) Add support for Amazon S3 Express One Zone Storage

2023-11-29 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18995:
-

 Summary: Add support for Amazon S3 Express One Zone Storage
 Key: HADOOP-18995
 URL: https://issues.apache.org/jira/browse/HADOOP-18995
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Ahmar Suhail









[jira] [Updated] (HADOOP-18938) Handle non standard endpoints

2023-10-16 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18938:
--
Parent Issue: HADOOP-18886  (was: HADOOP-18073)

> Handle non standard endpoints 
> --
>
> Key: HADOOP-18938
> URL: https://issues.apache.org/jira/browse/HADOOP-18938
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> For non standard endpoints such as VPCE the region parsing added in 
> HADOOP-18908 doesn't work. This is expected as that logic is only meant to be 
> used for standard endpoints. 
> If you are using a non-standard endpoint, check if a region is also provided, 
> else fail fast. 
> Also update documentation to explain region and endpoint behaviour with 
> SDK V2. 
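
The fail-fast check proposed above could look roughly like the following. This is a sketch under assumptions, not the S3A code: the class name, the regex used to recognise a "standard" endpoint, and the exception message are all invented for illustration.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only; not the S3A endpoint validation. */
final class EndpointRegionCheckSketch {

  /** Assumed definition of "standard": s3.amazonaws.com or s3.<region>.amazonaws.com. */
  private static final Pattern STANDARD_ENDPOINT =
      Pattern.compile("(?:https?://)?s3(?:\\.[a-z0-9-]+)?\\.amazonaws\\.com/?");

  static boolean isStandardEndpoint(String endpoint) {
    return STANDARD_ENDPOINT.matcher(endpoint).matches();
  }

  /**
   * Throws if a non-standard endpoint (for example a VPCE one) is configured
   * without an accompanying region. Does nothing when no endpoint is set.
   */
  static void validate(String endpoint, String region) {
    if (endpoint == null || endpoint.isEmpty()) {
      return;
    }
    boolean regionMissing = region == null || region.isEmpty();
    if (!isStandardEndpoint(endpoint) && regionMissing) {
      throw new IllegalArgumentException(
          "Non-standard endpoint " + endpoint + " requires fs.s3a.endpoint.region to be set");
    }
  }
}
```

Failing at configuration time keeps the error close to its cause, rather than surfacing later as an opaque request failure.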






[jira] [Created] (HADOOP-18938) Handle non standard endpoints

2023-10-16 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18938:
-

 Summary: Handle non standard endpoints 
 Key: HADOOP-18938
 URL: https://issues.apache.org/jira/browse/HADOOP-18938
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Ahmar Suhail


For non standard endpoints such as VPCE the region parsing added in 
HADOOP-18908 doesn't work. This is expected as that logic is only meant to be 
used for standard endpoints. 

If you are using a non-standard endpoint, check if a region is also provided, 
else fail fast. 

Also update documentation to explain region and endpoint behaviour with SDK 
V2. 






[jira] [Updated] (HADOOP-18915) HTTP timeouts are not set correctly

2023-09-29 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18915:
--
Parent Issue: HADOOP-18886  (was: HADOOP-18073)

> HTTP timeouts are not set correctly
> ---
>
> Key: HADOOP-18915
> URL: https://issues.apache.org/jira/browse/HADOOP-18915
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> In the client config builders, when [setting 
> timeouts|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSClientConfig.java#L120],
>  it uses Duration.ofSeconds(). The configs all use milliseconds, so this needs 
> to be updated to Duration.ofMillis().
>  
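
To illustrate the size of the error described above, here is a small sketch using only java.time. The 200000 value is a hypothetical configuration setting in milliseconds, and the class and method names are made up for the example.

```java
import java.time.Duration;

/** Illustrative sketch only; the config value and names are hypothetical. */
final class TimeoutUnitsSketch {

  /** Hypothetical setting: 200 seconds, stored in milliseconds as the configs are. */
  static final long CONFIGURED_TIMEOUT_MS = 200_000L;

  /** The bug: a millisecond value interpreted as seconds. */
  static Duration asSeconds(long configuredMillis) {
    return Duration.ofSeconds(configuredMillis);
  }

  /** The fix: the value interpreted in the unit it was configured in. */
  static Duration asMillis(long configuredMillis) {
    return Duration.ofMillis(configuredMillis);
  }

  public static void main(String[] args) {
    System.out.println("ofSeconds: " + asSeconds(CONFIGURED_TIMEOUT_MS));
    System.out.println("ofMillis:  " + asMillis(CONFIGURED_TIMEOUT_MS));
  }
}
```

With the wrong factory method every timeout is 1000 times longer than configured, so a timeout meant to be minutes long becomes days long.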






[jira] [Created] (HADOOP-18915) HTTP timeouts are not set correctly

2023-09-29 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18915:
-

 Summary: HTTP timeouts are not set correctly
 Key: HADOOP-18915
 URL: https://issues.apache.org/jira/browse/HADOOP-18915
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Ahmar Suhail


In the client config builders, when [setting 
timeouts|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSClientConfig.java#L120],
 it uses Duration.ofSeconds(). The configs all use milliseconds, so this needs to 
be updated to Duration.ofMillis().

 






[jira] [Commented] (HADOOP-18889) S3A: V2 SDK client does not work with third-party store

2023-09-21 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767567#comment-17767567
 ] 

Ahmar Suhail commented on HADOOP-18889:
---

[https://github.com/apache/hadoop/pull/6106] (still WIP) removes the region 
check.  

> S3A: V2 SDK client does not work with third-party store
> ---
>
> Key: HADOOP-18889
> URL: https://issues.apache.org/jira/browse/HADOOP-18889
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
>
> testing against an external store without specifying region now blows up 
> because the region is queried off eu-west-1.
> What are we to do here? Require the region setting, which wasn't needed 
> before? Which region do we even provide for third-party stores?






[jira] [Assigned] (HADOOP-18889) S3A: V2 SDK client does not work with third-party store

2023-09-13 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail reassigned HADOOP-18889:
-

Assignee: Ahmar Suhail

> S3A: V2 SDK client does not work with third-party store
> ---
>
> Key: HADOOP-18889
> URL: https://issues.apache.org/jira/browse/HADOOP-18889
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Ahmar Suhail
>Priority: Critical
>
> testing against an external store without specifying region now blows up 
> because the region is queried off eu-west-1.
> What are we to do here? Require the region setting, which wasn't needed 
> before? Which region do we even provide for third-party stores?






[jira] [Commented] (HADOOP-18889) S3A: V2 SDK client does not work with third-party store

2023-09-13 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764664#comment-17764664
 ] 

Ahmar Suhail commented on HADOOP-18889:
---

We can probably get rid of the region query now, as SDK V2 has had cross-region 
support since v2.20.99; it wasn't there when we started this work. 

> S3A: V2 SDK client does not work with third-party store
> ---
>
> Key: HADOOP-18889
> URL: https://issues.apache.org/jira/browse/HADOOP-18889
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Critical
>
> testing against an external store without specifying region now blows up 
> because the region is queried off eu-west-1.
> What are we to do here? Require the region setting, which wasn't needed 
> before? Which region do we even provide for third-party stores?






[jira] [Assigned] (HADOOP-18877) AWS SDK V2 - Move to S3 Java async client

2023-09-01 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail reassigned HADOOP-18877:
-

Assignee: Ahmar Suhail

> AWS SDK V2 - Move to S3 Java async client
> -
>
> Key: HADOOP-18877
> URL: https://issues.apache.org/jira/browse/HADOOP-18877
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Ahmar Suhail
>Priority: Major
>
> With the upgrade, S3A now has two S3 clients: the Java async client and the 
> Java sync client.
> Java async is required for the transfer manager.
> Java sync is used for everything else. 
>  
> * Move all operations to use the Java async client and remove the sync 
> client. 
> * Provide option to configure java async client with the CRT HTTP client. 
> * Create a new interface for S3Client operations and move them out of S3AFS. 
> The interface will take a request and span, and return a response.  
>  
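
One possible shape for the "request and span in, response out" interface mentioned in the last bullet is sketched below. All type names are hypothetical stand-ins invented for this example; they are not Hadoop or AWS SDK types, and the real design may differ.

```java
/** Hypothetical stand-in for an audit span; not the Hadoop type. */
interface SpanSketch {
  String spanId();
}

/** Hypothetical operation contract: takes a request and a span, returns a response. */
interface StoreOperationSketch<REQ, RESP> {
  RESP execute(REQ request, SpanSketch span);
}

final class StoreOperationDemo {

  /** Toy operation that only echoes its inputs; a real one would call the S3 client. */
  static final StoreOperationSketch<String, String> HEAD =
      (key, span) -> "HEAD " + key + " [span=" + span.spanId() + "]";

  public static void main(String[] args) {
    SpanSketch span = () -> "span-1";
    System.out.println(HEAD.execute("bucket/key", span));
  }
}
```

Keeping the span as an explicit parameter means callers cannot invoke a store operation without one, which is the property the bullet seems to be after.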






[jira] [Created] (HADOOP-18877) AWS SDK V2 - Move to S3 Java async client

2023-09-01 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18877:
-

 Summary: AWS SDK V2 - Move to S3 Java async client
 Key: HADOOP-18877
 URL: https://issues.apache.org/jira/browse/HADOOP-18877
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Ahmar Suhail


With the upgrade, S3A now has two S3 clients: the Java async client and the Java 
sync client.

Java async is required for the transfer manager.

Java sync is used for everything else. 

 

* Move all operations to use the Java async client and remove the sync client. 

* Provide option to configure java async client with the CRT HTTP client. 

* Create a new interface for S3Client operations and move them out of S3AFS. 
The interface will take a request and span, and return a response.  

 






[jira] [Updated] (HADOOP-18853) AWS SDK V2 - Upgrade SDK to 2.20.28 and restores multipart copy

2023-08-22 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18853:
--
Summary: AWS SDK V2 - Upgrade SDK to 2.20.28 and restores multipart copy  
(was: AWS SDK V2 - Integrate new transfer manager)

> AWS SDK V2 - Upgrade SDK to 2.20.28 and restores multipart copy
> ---
>
> Key: HADOOP-18853
> URL: https://issues.apache.org/jira/browse/HADOOP-18853
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
>
> With 2.20.121, the TM has MPU functionality. Upgrading to this version 
> will also solve the issue with needing to include the CRT dependency. 






[jira] [Updated] (HADOOP-18853) AWS SDK V2 - Upgrade SDK to 2.20.28 and restores multipart copy

2023-08-22 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18853:
--
Description: With 2.20.121, the TM has MPU functionality. Upgrading to the 
latest version (2.20.28) will also solve the issue with needing to include the 
CRT dependency.   (was: With 2.20.121, the TM has MPU functionality. Upgrading 
to to this version will also solve the issue with needing to include the CRT 
dependency. )

> AWS SDK V2 - Upgrade SDK to 2.20.28 and restores multipart copy
> ---
>
> Key: HADOOP-18853
> URL: https://issues.apache.org/jira/browse/HADOOP-18853
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
>
> With 2.20.121, the TM has MPU functionality. Upgrading to the latest version 
> (2.20.28) will also solve the issue with needing to include the CRT 
> dependency. 






[jira] [Created] (HADOOP-18853) AWS SDK V2 - Integrate new transfer manager

2023-08-17 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18853:
-

 Summary: AWS SDK V2 - Integrate new transfer manager
 Key: HADOOP-18853
 URL: https://issues.apache.org/jira/browse/HADOOP-18853
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Ahmar Suhail


With 2.20.121, the TM has MPU functionality. Upgrading to this version will 
also solve the issue with needing to include the CRT dependency. 






[jira] [Commented] (HADOOP-18747) AWS SDK V2 - sigv2 support

2023-07-26 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747555#comment-17747555
 ] 

Ahmar Suhail commented on HADOOP-18747:
---

* Yeah, the NPE isn't ideal, we could update with something like throw new 
IllegalArgumentException("unknown signer type, ensure it's included using 
fs.s3a.custom.signers" );
 * The CRT doesn't currently support custom signers, which is why we don't use 
it yet. We may want to add it in the future (but without custom signer support, 
it will have to be an optional client and not the default)
 * The async client supports custom signers, and they are configured in the 
code, same as the sync client, in AWSClientConfig.createClientConfigBuilder.
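
The IllegalArgumentException suggested in the first bullet could be wired into a signer lookup along these lines. This is a sketch only: the registry class, its methods and the use of Object as a signer placeholder are assumptions for illustration, not the S3A signer code.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; a hypothetical registry, not the S3A signer manager. */
final class SignerLookupSketch {

  /** Signer name to signer instance; Object is a placeholder for a real signer type. */
  private final Map<String, Object> signers = new HashMap<>();

  void register(String name, Object signer) {
    signers.put(name, signer);
  }

  /** Returns the registered signer, or fails with a descriptive message instead of an NPE. */
  Object lookup(String name) {
    Object signer = signers.get(name);
    if (signer == null) {
      throw new IllegalArgumentException(
          "unknown signer type '" + name
              + "', ensure it's included using fs.s3a.custom.signers");
    }
    return signer;
  }
}
```

The message names the configuration key so that a user hitting the error knows which setting to check.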

> AWS SDK V2 - sigv2 support
> --
>
> Key: HADOOP-18747
> URL: https://issues.apache.org/jira/browse/HADOOP-18747
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> AWS SDK V2 does not support sigV2 signing. However, the S3 client supports 
> configurable signers so a custom sigV2 signer can be implemented and 
> configured. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18778) Test failures with CSE enabled

2023-06-20 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail reassigned HADOOP-18778:
-

Assignee: Ahmar Suhail

> Test failures with CSE enabled
> --
>
> Key: HADOOP-18778
> URL: https://issues.apache.org/jira/browse/HADOOP-18778
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Assignee: Ahmar Suhail
>Priority: Major
> Fix For: 3.3.9
>
>
> The following tests fail when the hadoop-aws suite is run with CSE enabled:
>  
> {{ITestS3APrefetchingInputStream.testRandomReadLargeFile}}
> {{ITestS3APrefetchingInputStream.testReadLargeFileFully}}
> {{ITestS3APrefetchingInputStream.testReadLargeFileFullyLazySeek}}
> {{ITestS3ARequesterPays.testRequesterPaysOptionSuccess}}
> {{ITestAssumeRole.testReadOnlyOperations }}
> {{ITestPartialRenamesDeletes.testRenameParentPathNotWriteable}}
> {{ITestPartialRenamesDeletes.testRenameParentPathNotWriteable}}
> {{ITestS3GuardTool.testLandsatBucketRequireUnencrypted}}
>  
> Most of these fail because they use landsat data, which is not encrypted, 
> so trying to read it with CSE enabled will fail. These tests should be 
> skipped if using CSE.
>  






[jira] [Created] (HADOOP-18778) Test failures with CSE enabled

2023-06-20 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18778:
-

 Summary: Test failures with CSE enabled
 Key: HADOOP-18778
 URL: https://issues.apache.org/jira/browse/HADOOP-18778
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Ahmar Suhail
 Fix For: 3.3.9


The following tests fail when the hadoop-aws suite is run with CSE enabled:

 

{{ITestS3APrefetchingInputStream.testRandomReadLargeFile}}
{{ITestS3APrefetchingInputStream.testReadLargeFileFully}}
{{ITestS3APrefetchingInputStream.testReadLargeFileFullyLazySeek}}
{{ITestS3ARequesterPays.testRequesterPaysOptionSuccess}}
{{ITestAssumeRole.testReadOnlyOperations }}
{{ITestPartialRenamesDeletes.testRenameParentPathNotWriteable}}
{{ITestPartialRenamesDeletes.testRenameParentPathNotWriteable}}
{{ITestS3GuardTool.testLandsatBucketRequireUnencrypted}}

 

Most of these fail because they use landsat data, which is not encrypted, so 
trying to read it with CSE enabled will fail. These tests should be skipped if using CSE.

 






[jira] [Updated] (HADOOP-18673) AWS SDK V2 - Refactor getS3Region & other follow up items

2023-06-13 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18673:
--
Description: 
* Factor getS3Region into its own ExecutingStoreOperation;
 * Remove InconsistentS3ClientFactory.
 * Fix issue with getXAttr(/)
 * Look at adding flexible checksum support

  was:
* Factor getS3Region into its own ExecutingStoreOperation;
 * Remove InconsistentS3ClientFactory.
 * Fix issue with getXAttr(/)


> AWS SDK V2 - Refactor getS3Region & other follow up items 
> --
>
> Key: HADOOP-18673
> URL: https://issues.apache.org/jira/browse/HADOOP-18673
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> * Factor getS3Region into its own ExecutingStoreOperation;
>  * Remove InconsistentS3ClientFactory.
>  * Fix issue with getXAttr(/)
>  * Look at adding flexible checksum support






[jira] [Commented] (HADOOP-18747) AWS SDK V2 - sigv2 support

2023-05-30 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17727402#comment-17727402
 ] 

Ahmar Suhail commented on HADOOP-18747:
---

Hey [~aajisaka], yes that's true. Maybe it's not so important currently, but 
as newer features are added in the future, not having sigV2 will block users 
who require it from upgrading. So I think at some point it will need to be 
added in. 

> AWS SDK V2 - sigv2 support
> --
>
> Key: HADOOP-18747
> URL: https://issues.apache.org/jira/browse/HADOOP-18747
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> AWS SDK V2 does not support sigV2 signing. However, the S3 client supports 
> configurable signers so a custom sigV2 signer can be implemented and 
> configured. 






[jira] [Commented] (HADOOP-18747) AWS SDK V2 - sigv2 support

2023-05-23 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725448#comment-17725448
 ] 

Ahmar Suhail commented on HADOOP-18747:
---

We will have to find a bucket created before June 24, 2020 in a region that 
supports sigV2 (we do have one we could try), or test with a third-party store.

> AWS SDK V2 - sigv2 support
> --
>
> Key: HADOOP-18747
> URL: https://issues.apache.org/jira/browse/HADOOP-18747
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> AWS SDK V2 does not support sigV2 signing. However, the S3 client supports 
> configurable signers so a custom sigV2 signer can be implemented and 
> configured. 






[jira] [Created] (HADOOP-18749) AWS SDK V2 - ITestS3AHugeFilesNoMultipart failure

2023-05-22 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18749:
-

 Summary: AWS SDK V2 - ITestS3AHugeFilesNoMultipart failure
 Key: HADOOP-18749
 URL: https://issues.apache.org/jira/browse/HADOOP-18749
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Ahmar Suhail


ITestS3AHugeFilesNoMultipart fails with

java.lang.AssertionError: Expected a 
org.apache.hadoop.fs.s3a.api.UnsupportedRequestException to be thrown, but got 
the result: : true

Happens because the transfer manager currently does not do any MPU when used 
with the Java async client, so the UnsupportedRequestException never gets 
thrown. 






[jira] [Created] (HADOOP-18747) AWS SDK V2 - sigv2 support

2023-05-22 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18747:
-

 Summary: AWS SDK V2 - sigv2 support
 Key: HADOOP-18747
 URL: https://issues.apache.org/jira/browse/HADOOP-18747
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Ahmar Suhail


AWS SDK V2 does not support sigV2 signing. However, the S3 client supports 
configurable signers so a custom sigV2 signer can be implemented and 
configured. 






[jira] [Created] (HADOOP-18744) ITestS3ABlockOutputArray failure with IO File name too long

2023-05-18 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18744:
-

 Summary: ITestS3ABlockOutputArray failure with IO File name too long
 Key: HADOOP-18744
 URL: https://issues.apache.org/jira/browse/HADOOP-18744
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Ahmar Suhail


On an EC2 instance, the following tests are failing:

 

 * {{ITestS3ABlockOutputArray.testDiskBlockCreate}}
 * {{ITestS3ABlockOutputByteBuffer>ITestS3ABlockOutputArray.testDiskBlockCreate}}
 * {{ITestS3ABlockOutputDisk>ITestS3ABlockOutputArray.testDiskBlockCreate}}

 

with the error "java.io.IOException: File name too long". 

 

The tests create a file with a 1024-character file name and rely on 
File.createTempFile() to truncate the name to below the OS limit. 

 

Stack trace:

{{java.io.IOException: File name too long}}
{{    at java.io.UnixFileSystem.createFileExclusively(Native Method)}}
{{    at java.io.File.createTempFile(File.java:2063)}}
{{    at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)}}
{{    at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)}}
{{    at org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)}}
{{    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)}}
{{    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)}}
{{    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}}
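
One possible way to avoid depending on File.createTempFile() to shorten the name is to truncate the prefix explicitly before calling it. The following is only a sketch under stated assumptions: the class and method names are hypothetical, the 255-character cap is an assumed per-name limit (it varies by filesystem, and is really a byte limit), and this is not the change that was made in S3A.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper; not part of S3AFileSystem. */
class TempFileNameSketch {
  // Assumption: the filesystem caps a single file name at 255 characters.
  static final int ASSUMED_MAX_NAME_LENGTH = 255;
  // Room reserved for the random number createTempFile() adds to the name.
  static final int RESERVED_FOR_RANDOM = 20;

  /** Shorten the prefix so prefix + random number + suffix fits under the cap. */
  static String truncatePrefix(String prefix, String suffix) {
    int budget = ASSUMED_MAX_NAME_LENGTH - suffix.length() - RESERVED_FOR_RANDOM;
    return prefix.length() <= budget ? prefix : prefix.substring(0, budget);
  }

  /** Create the temp file with an explicitly truncated prefix. */
  static File createTmpFile(String prefix, String suffix, File dir) throws IOException {
    return File.createTempFile(truncatePrefix(prefix, suffix), suffix, dir);
  }
}
```

With a 1024-character prefix and a ".tmp" suffix, the prefix passed on is cut to 231 characters, so the final name stays within the assumed cap whatever the JDK does internally.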






[jira] [Comment Edited] (HADOOP-18073) Upgrade AWS SDK to v2

2023-05-17 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723516#comment-17723516
 ] 

Ahmar Suhail edited comment on HADOOP-18073 at 5/17/23 3:40 PM:


new rebased branch is 
[feature-HADOOP-18073-s3a-sdk-upgrade-rebase|https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade-rebase]

ITestS3ABlockOutputArray.testDiskBlockCreate fails (also failing on trunk) on 
an EC2 instance, works ok on Mac. Looks like file names aren't being truncated 
on EC2


was (Author: JIRAUSER283484):
new rebased branch is 
[https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade-rebase|https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade-rebase.]

 

ITestS3ABlockOutputArray.testDiskBlockCreate fails (also failing on trunk) on 
an EC2 instance, works ok on Mac. Looks like file names aren't being truncated 
on EC2

> Upgrade AWS SDK to v2
> -
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when to set 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}






[jira] [Comment Edited] (HADOOP-18073) Upgrade AWS SDK to v2

2023-05-17 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723516#comment-17723516
 ] 

Ahmar Suhail edited comment on HADOOP-18073 at 5/17/23 3:39 PM:


new rebased branch is 
[https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade-rebase|https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade-rebase.]

 

ITestS3ABlockOutputArray.testDiskBlockCreate fails (also failing on trunk) on 
an EC2 instance, works ok on Mac. Looks like file names aren't being truncated 
on EC2


was (Author: JIRAUSER283484):
new rebased branch is 
[https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade-rebase
 
|https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade-rebase.]

 

ITestS3ABlockOutputArray.testDiskBlockCreate fails (also failing on trunk) on 
an EC2 instance, works ok on Mac. Looks like file names aren't being truncated 
on EC2

> Upgrade AWS SDK to v2
> -
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when to set 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}






[jira] [Commented] (HADOOP-18073) Upgrade AWS SDK to v2

2023-05-17 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723516#comment-17723516
 ] 

Ahmar Suhail commented on HADOOP-18073:
---

new rebased branch is 
[https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade-rebase
 
|https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade-rebase.]

 

ITestS3ABlockOutputArray.testDiskBlockCreate fails (also failing on trunk) on 
an EC2 instance, works ok on Mac. Looks like file names aren't being truncated 
on EC2

> Upgrade AWS SDK to v2
> -
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when to set 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}






[jira] [Resolved] (HADOOP-18572) AWS SDK V2 - Fix failing tests

2023-05-17 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail resolved HADOOP-18572.
---
Resolution: Fixed

resolved in https://issues.apache.org/jira/browse/HADOOP-18565

> AWS SDK V2 - Fix failing tests
> --
>
> Key: HADOOP-18572
> URL: https://issues.apache.org/jira/browse/HADOOP-18572
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
>
> We have a few failing tests for various reasons. Some are dependent on the 
> Transfer Manager (TM), but others can be looked into and fixed. 
> |TestS3AExceptionTranslation|test301ContainsEndpoint|Missing endpoint in SDK exception ([aws/aws-sdk-java-v2#3048|https://github.com/aws/aws-sdk-java-v2/issues/3048])|
> |TestStreamChangeTracker|testCopyETagRequired, testCopyVersionIdRequired|Transfer Manager response does not yet have {{CopyObjectResult}}|
> |ITestS3AFileContextStatistics|testStatistics|ProgressListeners not attached to non-TM uploads|
> |ITestS3AEncryptionSSEC|multiple tests (14 out of 24)|Transfer Manager issue with SSE-C|
> |ITestXAttrCost|testXAttrRoot|{{headObject()}} with empty key fails|
> |ITestSessionDelegationInFileystem|testDelegatedFileSystem|Succeeds, but {{headObject()}} with empty key commented out|
> |ITestS3ACannedACLs|testCreatedObjectsHaveACLs|AWSCannedACL.LogDeliveryWrite not supported in SDK v2|






[jira] [Resolved] (HADOOP-18570) AWS SDK V2 - Update region logic

2023-05-17 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail resolved HADOOP-18570.
---
Resolution: Fixed

Resolved in https://issues.apache.org/jira/browse/HADOOP-18565

> AWS SDK V2 - Update region logic
> 
>
> Key: HADOOP-18570
> URL: https://issues.apache.org/jira/browse/HADOOP-18570
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> SDK V2 will no longer resolve a bucket's region if it is not set when 
> initialising the client. 
>  
> Current logic will always make a head bucket call on FS initialisation. We 
> should review this. Possible solution:
>  * Warn if region is not set.
>  * If no region is set, try to resolve it. If resolution fails, throw an 
> exception. Cache the region to optimise for short-lived FS instances. 
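
The possible solution described above (warn, try to resolve, fail if resolution fails, cache the result) could look roughly like the sketch below. It is plain Java with no SDK calls: the class name, the lookup function (standing in for something like a head bucket call) and the static cache are all assumptions made for illustration, not the logic that went into S3A.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of the warn / resolve / cache flow. */
class RegionResolverSketch {
  // Shared across instances so that short-lived filesystems reuse earlier lookups.
  private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

  // Assumed lookup: returns the bucket's region, or null if it cannot be found.
  private final Function<String, String> lookup;

  RegionResolverSketch(Function<String, String> lookup) {
    this.lookup = lookup;
  }

  String resolve(String bucket, String configuredRegion) {
    if (configuredRegion != null && !configuredRegion.isEmpty()) {
      return configuredRegion; // explicitly configured: no lookup needed
    }
    System.err.println("WARN: no region set for bucket " + bucket
        + "; attempting to resolve it");
    // computeIfAbsent stores nothing if the mapping function throws.
    return CACHE.computeIfAbsent(bucket, b -> {
      String region = lookup.apply(b);
      if (region == null) {
        throw new IllegalStateException("Could not resolve region for bucket " + b);
      }
      return region;
    });
  }
}
```

A configured region short-circuits the lookup entirely; an unset one triggers at most one lookup per bucket for the lifetime of the JVM.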






[jira] (HADOOP-18570) AWS SDK V2 - Update region logic

2023-05-17 Thread Ahmar Suhail (Jira)


[ https://issues.apache.org/jira/browse/HADOOP-18570 ]


Ahmar Suhail deleted comment on HADOOP-18570:
---

was (Author: JIRAUSER283484):
marking as resolved as this was done as part of 
https://issues.apache.org/jira/browse/HADOOP-18565

> AWS SDK V2 - Update region logic
> 
>
> Key: HADOOP-18570
> URL: https://issues.apache.org/jira/browse/HADOOP-18570
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> SDK V2 will no longer resolve a bucket's region if it is not set when 
> initialising the client. 
>  
> Current logic will always make a head bucket call on FS initialisation. We 
> should review this. Possible solution:
>  * Warn if region is not set.
>  * If no region is set, try to resolve it. If resolution fails, throw an 
> exception. Cache the region to optimise for short-lived FS instances. 






[jira] [Commented] (HADOOP-18570) AWS SDK V2 - Update region logic

2023-05-17 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723510#comment-17723510
 ] 

Ahmar Suhail commented on HADOOP-18570:
---

marking as resolved as this was done as part of 
https://issues.apache.org/jira/browse/HADOOP-18565

> AWS SDK V2 - Update region logic
> 
>
> Key: HADOOP-18570
> URL: https://issues.apache.org/jira/browse/HADOOP-18570
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> SDK V2 will no longer resolve a bucket's region if it is not set when 
> initialising the client. 
>  
> Current logic will always make a head bucket call on FS initialisation. We 
> should review this. Possible solution:
>  * Warn if region is not set.
>  * If no region is set, try to resolve it. If resolution fails, throw an 
> exception. Cache the region to optimise for short-lived FS instances. 






[jira] [Created] (HADOOP-18708) AWS SDK V2 - Implement CSE

2023-04-19 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18708:
-

 Summary: AWS SDK V2 - Implement CSE
 Key: HADOOP-18708
 URL: https://issues.apache.org/jira/browse/HADOOP-18708
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Ahmar Suhail


S3 Encryption client for SDK V2 is now available, so add client side encryption 
back in. 






[jira] [Updated] (HADOOP-18683) Add new store vendor config option

2023-03-27 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18683:
--
Affects Version/s: 3.3.5

> Add new store vendor config option
> --
>
> Key: HADOOP-18683
> URL: https://issues.apache.org/jira/browse/HADOOP-18683
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Ahmar Suhail
>Priority: Minor
>
> Add a new fs.s3a.store.vendor config, where users can specify the storage 
> vendor they are using (e.g. aws, netapp, minio).
> This will allow us to configure S3A correctly per vendor. For example, if the 
> vendor is not AWS, you probably want to use ListObjectsV1.






[jira] [Created] (HADOOP-18683) Add new store vendor config option

2023-03-27 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18683:
-

 Summary: Add new store vendor config option
 Key: HADOOP-18683
 URL: https://issues.apache.org/jira/browse/HADOOP-18683
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


Add a new fs.s3a.store.vendor config, where users can specify the storage 
vendor they are using (e.g. aws, netapp, minio).

This will allow us to configure S3A correctly per vendor. For example, if the 
vendor is not AWS, you probably want to use ListObjectsV1.
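
As a rough illustration of how a vendor setting might drive a default, the sketch below maps a vendor string to a list API version. The class and method are hypothetical, and treating an unset or empty value as "aws" is an assumption; the proposal above does not say how a missing value should be handled.

```java
import java.util.Locale;

/** Hypothetical mapping from a store vendor name to a default list API version. */
class StoreVendorSketch {
  /** Returns 2 (ListObjectsV2) for AWS and 1 (ListObjectsV1) for any other vendor. */
  static int defaultListVersion(String vendor) {
    // Assumption: an unset or blank vendor means AWS.
    String v = vendor == null ? "aws" : vendor.trim().toLowerCase(Locale.ROOT);
    return (v.isEmpty() || v.equals("aws")) ? 2 : 1;
  }
}
```

An explicitly configured list version would presumably still take precedence over a vendor-derived default like this one.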






[jira] [Updated] (HADOOP-18683) Add new store vendor config option

2023-03-27 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18683:
--
Component/s: fs/s3

> Add new store vendor config option
> --
>
> Key: HADOOP-18683
> URL: https://issues.apache.org/jira/browse/HADOOP-18683
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Minor
>
> Add a new fs.s3a.store.vendor config, where users can specify the storage 
> vendor they are using (e.g. aws, netapp, minio).
> This will allow us to configure S3A correctly per vendor. For example, if the 
> vendor is not AWS, you probably want to use ListObjectsV1.






[jira] [Updated] (HADOOP-18673) AWS SDK V2 - Refactor getS3Region & other follow up items

2023-03-21 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18673:
--
Component/s: fs/s3

> AWS SDK V2 - Refactor getS3Region & other follow up items 
> --
>
> Key: HADOOP-18673
> URL: https://issues.apache.org/jira/browse/HADOOP-18673
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> * Factor getS3Region into its own ExecutingStoreOperation;
>  * Remove InconsistentS3ClientFactory.
>  * Fix issue with getXAttr(/)






[jira] [Updated] (HADOOP-18674) AWS SDK V2 - Add socket factory to Netty Client

2023-03-21 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18674:
--
Affects Version/s: 3.4.0

> AWS SDK V2 - Add socket factory to Netty Client
> ---
>
> Key: HADOOP-18674
> URL: https://issues.apache.org/jira/browse/HADOOP-18674
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Minor
>
> The Java async client uses the netty http client. We should investigate how 
> to add a socket factory to this.






[jira] [Updated] (HADOOP-18674) AWS SDK V2 - Add socket factory to Netty Client

2023-03-21 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18674:
--
Component/s: fs/s3

> AWS SDK V2 - Add socket factory to Netty Client
> ---
>
> Key: HADOOP-18674
> URL: https://issues.apache.org/jira/browse/HADOOP-18674
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Minor
>
> The Java async client uses the netty http client. We should investigate how 
> to add a socket factory to this.






[jira] [Updated] (HADOOP-18673) AWS SDK V2 - Refactor getS3Region & other follow up items

2023-03-21 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18673:
--
Affects Version/s: 3.4.0

> AWS SDK V2 - Refactor getS3Region & other follow up items 
> --
>
> Key: HADOOP-18673
> URL: https://issues.apache.org/jira/browse/HADOOP-18673
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> * Factor getS3Region into its own ExecutingStoreOperation;
>  * Remove InconsistentS3ClientFactory.
>  * Fix issue with getXAttr(/)






[jira] [Created] (HADOOP-18674) AWS SDK V2 - Add socket factory to Netty Client

2023-03-21 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18674:
-

 Summary: AWS SDK V2 - Add socket factory to Netty Client
 Key: HADOOP-18674
 URL: https://issues.apache.org/jira/browse/HADOOP-18674
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


The Java async client uses the netty http client. We should investigate how to 
add a socket factory to this.






[jira] [Updated] (HADOOP-18674) AWS SDK V2 - Add socket factory to Netty Client

2023-03-21 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18674:
--
Priority: Minor  (was: Major)

> AWS SDK V2 - Add socket factory to Netty Client
> ---
>
> Key: HADOOP-18674
> URL: https://issues.apache.org/jira/browse/HADOOP-18674
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ahmar Suhail
>Priority: Minor
>
> The Java async client uses the netty http client. We should investigate how 
> to add a socket factory to this.






[jira] [Created] (HADOOP-18673) AWS SDK V2 - Refactor getS3Region & other follow up items

2023-03-21 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18673:
-

 Summary: AWS SDK V2 - Refactor getS3Region & other follow up items 
 Key: HADOOP-18673
 URL: https://issues.apache.org/jira/browse/HADOOP-18673
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


* Factor getS3Region into its own ExecutingStoreOperation;
 * Remove InconsistentS3ClientFactory.
 * Fix issue with getXAttr(/)






[jira] [Updated] (HADOOP-18638) Encryption behaviour on copy

2023-02-21 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18638:
--
Summary: Encryption behaviour on copy  (was: Encryption behaviour )

> Encryption behaviour on copy
> 
>
> Key: HADOOP-18638
> URL: https://issues.apache.org/jira/browse/HADOOP-18638
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ahmar Suhail
>Priority: Major
>
> When doing a copy, S3A always uses the encryption configuration of the 
> filesystem rather than that of the source object. This behaviour may not 
> have been intended: `RequestFactoryImpl.copyEncryptionParameters()` does copy 
> the source object's encryption properties 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java#L336]
>  , but a missing return statement means it ends up using the FS settings 
> anyway. 
>  
> Proposed:
>  * If the copy is called by rename, always preserve source object encryption 
> properties. 
>  * For all other copies, use current FS encryption settings. 






[jira] [Created] (HADOOP-18638) Encryption behaviour

2023-02-21 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18638:
-

 Summary: Encryption behaviour 
 Key: HADOOP-18638
 URL: https://issues.apache.org/jira/browse/HADOOP-18638
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


When doing a copy, S3A always uses the encryption configuration of the 
filesystem rather than that of the source object. This behaviour may not have 
been intended: `RequestFactoryImpl.copyEncryptionParameters()` does copy the 
source object's encryption properties 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java#L336]
 , but a missing return statement means it ends up using the FS settings 
anyway. 

 

Proposed:
 * If the copy is called by rename, always preserve source object encryption 
properties. 
 * For all other copies, use current FS encryption settings. 
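
To make the fall-through concrete, here is a simplified sketch of the shape of the problem and of the proposed behaviour. It is not the actual `RequestFactoryImpl` code: the types, method names and the `isRename` flag are hypothetical stand-ins chosen for illustration.

```java
/** Hypothetical, simplified model of copy encryption handling. */
class CopyEncryptionSketch {
  enum Algorithm { NONE, SSE_S3, SSE_KMS }

  /** Stand-in for a copy request that carries an encryption setting. */
  static class CopyRequest {
    Algorithm algorithm = Algorithm.NONE;
  }

  /** Shape of the bug: source settings are applied, then overwritten. */
  static void copyParamsMissingReturn(Algorithm source, Algorithm fs, CopyRequest req) {
    if (source != Algorithm.NONE) {
      req.algorithm = source;
      // No return here, so execution falls through to the FS settings below.
    }
    req.algorithm = fs;
  }

  /** Proposed shape: preserve source settings only when the copy is part of a rename. */
  static void copyParams(Algorithm source, Algorithm fs, boolean isRename, CopyRequest req) {
    if (isRename && source != Algorithm.NONE) {
      req.algorithm = source;
      return;
    }
    req.algorithm = fs;
  }
}
```

In the first method the source object's algorithm is always overwritten, which matches the observed behaviour; in the second, a rename keeps the source algorithm and every other copy takes the filesystem's.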






[jira] [Updated] (HADOOP-18565) AWS SDK V2 - Complete outstanding items

2023-02-21 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18565:
--
Description: 
The following work remains to complete the SDK upgrade work:
 * S3A allows users to configure custom signers; add support for this.
 * Remove SDK V1 bundle dependency
 * Update `getRegion()` logic to use retries. 
 * Add in progress listeners for `S3ABlockOutputStream`
 * Fix any failing tests.

  was:S3A allows users to configure custom signers; add support for this.


> AWS SDK V2 - Complete outstanding items
> ---
>
> Key: HADOOP-18565
> URL: https://issues.apache.org/jira/browse/HADOOP-18565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> The following work remains to complete the SDK upgrade work:
>  * S3A allows users to configure custom signers; add support for this.
>  * Remove SDK V1 bundle dependency
>  * Update `getRegion()` logic to use retries. 
>  * Add in progress listeners for `S3ABlockOutputStream`
>  * Fix any failing tests.






[jira] [Updated] (HADOOP-18565) AWS SDK V2 - Complete outstanding items

2023-02-21 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18565:
--
Summary: AWS SDK V2 - Complete outstanding items  (was: AWS SDK V2 - Add in 
support of custom signers)

> AWS SDK V2 - Complete outstanding items
> ---
>
> Key: HADOOP-18565
> URL: https://issues.apache.org/jira/browse/HADOOP-18565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3A allows users to configure custom signers; add support for this.






[jira] [Commented] (HADOOP-18073) Upgrade AWS SDK to v2

2023-01-18 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678308#comment-17678308
 ] 

Ahmar Suhail commented on HADOOP-18073:
---

[~ste...@apache.org] / [~mthakur] , have just rebased my branch for the 
upgrade. Could you please push 
[this|https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade]
 branch up to the [Apache feature 
branch|https://github.com/apache/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade]
 ? I'd like to open a PR against this rebased branch which addresses some 
outstanding issues.

> Upgrade AWS SDK to v2
> -
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when setting 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}






[jira] [Updated] (HADOOP-18579) Warn when no region is configured

2022-12-16 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18579:
--
Description: 
The AWS Java SDK V1 allows for cross region access. This means that even if you 
instantiate the S3 client with US_EAST_1 (or any region different to your 
actual bucket's region), the SDK will figure out the region. 

 

With the upgrade to SDK V2, this is no longer supported and the region should 
be set explicitly. Requests with the incorrect region will fail. To prepare for 
this change, S3A should warn when a region is not set via 
fs.s3a.endpoint.region. 

 

We should warn even if fs.s3a.endpoint is set and region can be parsed from 
this. This is because it is recommended to let the SDK V2 figure out the 
endpoint to use from the region, and so S3A should discourage setting the 
endpoint unless absolutely required (e.g. for third-party stores). 

 

Ideally we would also rename fs.s3a.endpoint.region to fs.s3a.region, but it is 
unclear whether this is ok to do. 

  was:
The AWS Java SDK V1 allows for cross region access. This means that even if you 
instantiate the S3 client with US_EAST_1 (or any region different to your 
actual bucket's region), the SDK will figure out the region. 

 

With the upgrade to SDK V2, this is no longer supported and the region should 
be set explicitly. Requests with the incorrect region will fail. To prepare for 
this change, S3A should warn when a region is not set via 
fs.s3a.endpoint.region. 

 

We should warn even if fs.s3a.endpoint is set and region can be parsed from 
this. This is because it is recommended to let the SDK V2 figure out the 
endpoint to use from the region, and so S3A should discourage setting the 
endpoint unless absolutely required (e.g. for third-party stores). 


> Warn when no region is configured
> -
>
> Key: HADOOP-18579
> URL: https://issues.apache.org/jira/browse/HADOOP-18579
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.9
>Reporter: Ahmar Suhail
>Priority: Minor
>
> The AWS Java SDK V1 allows for cross region access. This means that even if 
> you instantiate the S3 client with US_EAST_1 (or any region different to your 
> actual bucket's region), the SDK will figure out the region. 
>  
> With the upgrade to SDK V2, this is no longer supported and the region should 
> be set explicitly. Requests with the incorrect region will fail. To prepare 
> for this change, S3A should warn when a region is not set via 
> fs.s3a.endpoint.region. 
>  
> We should warn even if fs.s3a.endpoint is set and region can be parsed from 
> this. This is because it is recommended to let the SDK V2 figure out the 
> endpoint to use from the region, and so S3A should discourage setting 
> the endpoint unless absolutely required (e.g. for third-party stores). 
>  
> Ideally we would also rename fs.s3a.endpoint.region to fs.s3a.region, but it 
> is unclear whether this is ok to do. 






[jira] [Updated] (HADOOP-18579) Warn when no region is configured

2022-12-16 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18579:
--
Affects Version/s: 3.3.9

> Warn when no region is configured
> -
>
> Key: HADOOP-18579
> URL: https://issues.apache.org/jira/browse/HADOOP-18579
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.9
>Reporter: Ahmar Suhail
>Priority: Minor
>
> The AWS Java SDK V1 allows for cross region access. This means that even if 
> you instantiate the S3 client with US_EAST_1 (or any region different to your 
> actual bucket's region), the SDK will figure out the region. 
>  
> With the upgrade to SDK V2, this is no longer supported and the region should 
> be set explicitly. Requests with the incorrect region will fail. To prepare 
> for this change, S3A should warn when a region is not set via 
> fs.s3a.endpoint.region. 
>  
> We should warn even if fs.s3a.endpoint is set and region can be parsed from 
> this. This is because it is recommended to let the SDK V2 figure out the 
> endpoint to use from the region, and so S3A should discourage setting 
> the endpoint unless absolutely required (e.g. for third-party stores). 






[jira] [Updated] (HADOOP-18579) Warn when no region is configured

2022-12-16 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18579:
--
Component/s: fs/s3

> Warn when no region is configured
> -
>
> Key: HADOOP-18579
> URL: https://issues.apache.org/jira/browse/HADOOP-18579
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Minor
>
> The AWS Java SDK V1 allows for cross region access. This means that even if 
> you instantiate the S3 client with US_EAST_1 (or any region different to your 
> actual bucket's region), the SDK will figure out the region. 
>  
> With the upgrade to SDK V2, this is no longer supported and the region should 
> be set explicitly. Requests with the incorrect region will fail. To prepare 
> for this change, S3A should warn when a region is not set via 
> fs.s3a.endpoint.region. 
>  
> We should warn even if fs.s3a.endpoint is set and region can be parsed from 
> this. This is because it is recommended to let the SDK V2 figure out the 
> endpoint to use from the region, and so S3A should discourage setting 
> the endpoint unless absolutely required (e.g. for third-party stores). 






[jira] [Updated] (HADOOP-18579) Warn when no region is configured

2022-12-16 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18579:
--
Description: 
The AWS Java SDK V1 allows for cross region access. This means that even if you 
instantiate the S3 client with US_EAST_1 (or any region different to your 
actual bucket's region), the SDK will figure out the region. 

 

With the upgrade to SDK V2, this is no longer supported and the region should 
be set explicitly. Requests with the incorrect region will fail. To prepare for 
this change, S3A should warn when a region is not set via 
fs.s3a.endpoint.region. 

 

We should warn even if fs.s3a.endpoint is set and region can be parsed from 
this. This is because it is recommended to let the SDK V2 figure out the 
endpoint to use from the region, and so S3A should discourage setting the 
endpoint unless absolutely required (e.g. for third-party stores). 

  was:
The AWS Java SDK V1 allows for cross region access. This means that even if you 
instantiate the S3 client with US_EAST_1 (or any region different to your 
actual bucket's region), the SDK will figure out the region. 

 

With the upgrade to SDK V2, this is no longer supported and the region should 
be set explicitly. To prepare for this change, S3A should warn when a region is 
not set via fs.s3a.endpoint.region. 

 

We should warn even if fs.s3a.endpoint is set and region can be parsed from 
this. This is because it is recommended to let the SDK V2 figure out the 
endpoint to use from the region, and so S3A should discourage setting the 
endpoint unless absolutely required (e.g. for third-party stores). 


> Warn when no region is configured
> -
>
> Key: HADOOP-18579
> URL: https://issues.apache.org/jira/browse/HADOOP-18579
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ahmar Suhail
>Priority: Minor
>
> The AWS Java SDK V1 allows for cross region access. This means that even if 
> you instantiate the S3 client with US_EAST_1 (or any region different to your 
> actual bucket's region), the SDK will figure out the region. 
>  
> With the upgrade to SDK V2, this is no longer supported and the region should 
> be set explicitly. Requests with the incorrect region will fail. To prepare 
> for this change, S3A should warn when a region is not set via 
> fs.s3a.endpoint.region. 
>  
> We should warn even if fs.s3a.endpoint is set and region can be parsed from 
> this. This is because it is recommended to let the SDK V2 figure out the 
> endpoint to use from the region, and so S3A should discourage setting 
> the endpoint unless absolutely required (e.g. for third-party stores). 






[jira] [Created] (HADOOP-18579) Warn when no region is configured

2022-12-16 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18579:
-

 Summary: Warn when no region is configured
 Key: HADOOP-18579
 URL: https://issues.apache.org/jira/browse/HADOOP-18579
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


The AWS Java SDK V1 allows for cross region access. This means that even if you 
instantiate the S3 client with US_EAST_1 (or any region different to your 
actual bucket's region), the SDK will figure out the region. 

 

With the upgrade to SDK V2, this is no longer supported and the region should 
be set explicitly. To prepare for this change, S3A should warn when a region is 
not set via fs.s3a.endpoint.region. 

 

We should warn even if fs.s3a.endpoint is set and region can be parsed from 
this. This is because it is recommended to let the SDK V2 figure out the 
endpoint to use from the region, and so S3A should discourage setting the 
endpoint unless absolutely required (e.g. for third-party stores). 
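
A minimal sketch of the check described above, not the S3A implementation. The class RegionConfigCheck, its method names, the endpoint regex, and the message wording are all assumptions made for illustration. It warns whenever fs.s3a.endpoint.region is unset, including the case where a region could be parsed from fs.s3a.endpoint:

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of S3A. */
final class RegionConfigCheck {

  // Assumed pattern for regional AWS endpoints, e.g. s3.eu-west-1.amazonaws.com
  // or s3-us-west-2.amazonaws.com. The global s3.amazonaws.com does not match.
  private static final Pattern REGIONAL_ENDPOINT = Pattern.compile(
      "^(?:https?://)?s3[.-]([a-z0-9-]+)\\.amazonaws\\.com(?:\\.cn)?/?$");

  private RegionConfigCheck() {
  }

  /** Returns the region embedded in the endpoint, if one can be parsed. */
  static Optional<String> parseRegion(String endpoint) {
    if (endpoint == null) {
      return Optional.empty();
    }
    Matcher m = REGIONAL_ENDPOINT.matcher(endpoint.trim().toLowerCase(Locale.ROOT));
    return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
  }

  /**
   * Returns a warning message when fs.s3a.endpoint.region is unset,
   * even if a region can be parsed from fs.s3a.endpoint.
   */
  static Optional<String> warningFor(String endpoint, String endpointRegion) {
    if (endpointRegion != null && !endpointRegion.trim().isEmpty()) {
      return Optional.empty(); // region set explicitly: nothing to warn about
    }
    Optional<String> parsed = parseRegion(endpoint);
    if (parsed.isPresent()) {
      return Optional.of("fs.s3a.endpoint.region is not set; region " + parsed.get()
          + " was parsed from fs.s3a.endpoint. Prefer setting the region"
          + " and leaving the endpoint unset.");
    }
    return Optional.of("fs.s3a.endpoint.region is not set; requests to a bucket"
        + " in another region will fail with SDK V2.");
  }
}
```

A caller would log the returned message at WARN level during filesystem initialisation. Third-party stores that genuinely need a custom endpoint would still set fs.s3a.endpoint, but with the region set as well.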






[jira] [Updated] (HADOOP-18572) AWS SDK V2 - Fix failing tests

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18572:
--
Affects Version/s: 3.4.0

> AWS SDK V2 - Fix failing tests
> --
>
> Key: HADOOP-18572
> URL: https://issues.apache.org/jira/browse/HADOOP-18572
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
>
> We have a few failing tests for various reasons. Some are dependent on the 
> TM, but others can be looked into and fixed. 
> |TestS3AExceptionTranslation|test301ContainsEndpoint|Missing endpoint in SDK exception ([aws/aws-sdk-java-v2#3048|https://github.com/aws/aws-sdk-java-v2/issues/3048])|
> |TestStreamChangeTracker|testCopyETagRequired, testCopyVersionIdRequired|Transfer Manager response does not yet have {{CopyObjectResult}}|
> |ITestS3AFileContextStatistics|testStatistics|ProgressListeners not attached to non-TM uploads|
> |ITestS3AEncryptionSSEC|multiple tests (14 out of 24)|Transfer Manager issue with SSE-C|
> |ITestXAttrCost|testXAttrRoot.|{{headObject()}} with empty key fails|
> |ITestSessionDelegationInFileystem|testDelegatedFileSystem|Succeeds, but {{headObject()}} with empty key commented out|
> |ITestS3ACannedACLs|testCreatedObjectsHaveACLs|AWSCannedACL.LogDeliveryWrite not supported in SDK v2|






[jira] [Updated] (HADOOP-18572) AWS SDK V2 - Fix failing tests

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18572:
--
Component/s: fs/s3

> AWS SDK V2 - Fix failing tests
> --
>
> Key: HADOOP-18572
> URL: https://issues.apache.org/jira/browse/HADOOP-18572
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
>
> We have a few failing tests for various reasons. Some are dependent on the 
> TM, but others can be looked into and fixed. 
> |TestS3AExceptionTranslation|test301ContainsEndpoint|Missing endpoint in SDK exception ([aws/aws-sdk-java-v2#3048|https://github.com/aws/aws-sdk-java-v2/issues/3048])|
> |TestStreamChangeTracker|testCopyETagRequired, testCopyVersionIdRequired|Transfer Manager response does not yet have {{CopyObjectResult}}|
> |ITestS3AFileContextStatistics|testStatistics|ProgressListeners not attached to non-TM uploads|
> |ITestS3AEncryptionSSEC|multiple tests (14 out of 24)|Transfer Manager issue with SSE-C|
> |ITestXAttrCost|testXAttrRoot.|{{headObject()}} with empty key fails|
> |ITestSessionDelegationInFileystem|testDelegatedFileSystem|Succeeds, but {{headObject()}} with empty key commented out|
> |ITestS3ACannedACLs|testCreatedObjectsHaveACLs|AWSCannedACL.LogDeliveryWrite not supported in SDK v2|






[jira] [Updated] (HADOOP-18572) AWS SDK V2 - Fix failing tests

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18572:
--
Summary: AWS SDK V2 - Fix failing tests  (was: Fix failing tests)

> AWS SDK V2 - Fix failing tests
> --
>
> Key: HADOOP-18572
> URL: https://issues.apache.org/jira/browse/HADOOP-18572
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
>
> We have a few failing tests for various reasons. Some are dependent on the 
> TM, but others can be looked into and fixed. 
> |TestS3AExceptionTranslation|test301ContainsEndpoint|Missing endpoint in SDK exception ([aws/aws-sdk-java-v2#3048|https://github.com/aws/aws-sdk-java-v2/issues/3048])|
> |TestStreamChangeTracker|testCopyETagRequired, testCopyVersionIdRequired|Transfer Manager response does not yet have {{CopyObjectResult}}|
> |ITestS3AFileContextStatistics|testStatistics|ProgressListeners not attached to non-TM uploads|
> |ITestS3AEncryptionSSEC|multiple tests (14 out of 24)|Transfer Manager issue with SSE-C|
> |ITestXAttrCost|testXAttrRoot.|{{headObject()}} with empty key fails|
> |ITestSessionDelegationInFileystem|testDelegatedFileSystem|Succeeds, but {{headObject()}} with empty key commented out|
> |ITestS3ACannedACLs|testCreatedObjectsHaveACLs|AWSCannedACL.LogDeliveryWrite not supported in SDK v2|






[jira] [Updated] (HADOOP-18571) AWS SDK V2 - Qualify the upgrade.

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18571:
--
Summary: AWS SDK V2 - Qualify the upgrade.   (was: Qualify the upgrade. )

> AWS SDK V2 - Qualify the upgrade. 
> --
>
> Key: HADOOP-18571
> URL: https://issues.apache.org/jira/browse/HADOOP-18571
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ahmar Suhail
>Priority: Major
>
> Run tests as per [qualifying an AWS SDK 
> update|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md#-qualifying-an-aws-sdk-update]






[jira] [Updated] (HADOOP-18571) AWS SDK V2 - Qualify the upgrade.

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18571:
--
Component/s: fs/s3

> AWS SDK V2 - Qualify the upgrade. 
> --
>
> Key: HADOOP-18571
> URL: https://issues.apache.org/jira/browse/HADOOP-18571
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> Run tests as per [qualifying an AWS SDK 
> update|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md#-qualifying-an-aws-sdk-update]






[jira] [Updated] (HADOOP-18571) AWS SDK V2 - Qualify the upgrade.

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18571:
--
Affects Version/s: 3.4.0

> AWS SDK V2 - Qualify the upgrade. 
> --
>
> Key: HADOOP-18571
> URL: https://issues.apache.org/jira/browse/HADOOP-18571
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> Run tests as per [qualifying an AWS SDK 
> update|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md#-qualifying-an-aws-sdk-update]






[jira] [Updated] (HADOOP-18565) AWS SDK V2 - Add in support of custom signers

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18565:
--
Affects Version/s: 3.4.0

> AWS SDK V2 - Add in support of custom signers
> -
>
> Key: HADOOP-18565
> URL: https://issues.apache.org/jira/browse/HADOOP-18565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3A allows users to configure custom signers; add support for this.






[jira] [Updated] (HADOOP-18570) AWS SDK V2 - Update region logic

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18570:
--
Summary: AWS SDK V2 - Update region logic  (was: Update region logic)

> AWS SDK V2 - Update region logic
> 
>
> Key: HADOOP-18570
> URL: https://issues.apache.org/jira/browse/HADOOP-18570
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> SDK V2 will no longer resolve a bucket's region if it is not set when 
> initialising the client. 
>  
> Current logic always makes a HEAD bucket call on FS initialisation. We 
> should review this. Possible solution:
>  * Warn if region is not set.
>  * If no region is set, try to resolve it. If resolution fails, throw an exception. 
> Cache the region to optimise for short-lived FS instances. 






[jira] [Updated] (HADOOP-18565) AWS SDK V2 - Add in support of custom signers

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18565:
--
Summary: AWS SDK V2 - Add in support of custom signers  (was: Add in 
support of custom signers)

> AWS SDK V2 - Add in support of custom signers
> -
>
> Key: HADOOP-18565
> URL: https://issues.apache.org/jira/browse/HADOOP-18565
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3A allows users to configure custom signers; add support for this.






[jira] [Updated] (HADOOP-18565) AWS SDK V2 - Add in support of custom signers

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18565:
--
Component/s: fs/s3

> AWS SDK V2 - Add in support of custom signers
> -
>
> Key: HADOOP-18565
> URL: https://issues.apache.org/jira/browse/HADOOP-18565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3A allows users to configure custom signers; add support for this.






[jira] [Updated] (HADOOP-18570) Update region logic

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18570:
--
Component/s: fs/s3

> Update region logic
> ---
>
> Key: HADOOP-18570
> URL: https://issues.apache.org/jira/browse/HADOOP-18570
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> SDK V2 will no longer resolve a bucket's region if it is not set when 
> initialising the client. 
>  
> Current logic always makes a HEAD bucket call on FS initialisation. We 
> should review this. Possible solution:
>  * Warn if region is not set.
>  * If no region is set, try to resolve it. If resolution fails, throw an exception. 
> Cache the region to optimise for short-lived FS instances. 






[jira] [Updated] (HADOOP-18570) Update region logic

2022-12-14 Thread Ahmar Suhail (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmar Suhail updated HADOOP-18570:
--
Affects Version/s: 3.4.0

> Update region logic
> ---
>
> Key: HADOOP-18570
> URL: https://issues.apache.org/jira/browse/HADOOP-18570
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> SDK V2 will no longer resolve a bucket's region if it is not set when 
> initialising the client. 
>  
> Current logic always makes a HEAD bucket call on FS initialisation. We 
> should review this. Possible solution:
>  * Warn if region is not set.
>  * If no region is set, try to resolve it. If resolution fails, throw an exception. 
> Cache the region to optimise for short-lived FS instances. 






[jira] [Created] (HADOOP-18572) Fix failing tests

2022-12-14 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18572:
-

 Summary: Fix failing tests
 Key: HADOOP-18572
 URL: https://issues.apache.org/jira/browse/HADOOP-18572
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


We have a few failing tests for various reasons. Some are dependent on the TM, 
but others can be looked into and fixed. 


|TestS3AExceptionTranslation|test301ContainsEndpoint|Missing endpoint in SDK exception ([aws/aws-sdk-java-v2#3048|https://github.com/aws/aws-sdk-java-v2/issues/3048])|
|TestStreamChangeTracker|testCopyETagRequired, testCopyVersionIdRequired|Transfer Manager response does not yet have {{CopyObjectResult}}|
|ITestS3AFileContextStatistics|testStatistics|ProgressListeners not attached to non-TM uploads|
|ITestS3AEncryptionSSEC|multiple tests (14 out of 24)|Transfer Manager issue with SSE-C|
|ITestXAttrCost|testXAttrRoot.|{{headObject()}} with empty key fails|
|ITestSessionDelegationInFileystem|testDelegatedFileSystem|Succeeds, but {{headObject()}} with empty key commented out|
|ITestS3ACannedACLs|testCreatedObjectsHaveACLs|AWSCannedACL.LogDeliveryWrite not supported in SDK v2|






[jira] [Commented] (HADOOP-18073) Upgrade AWS SDK to v2

2022-12-14 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647002#comment-17647002
 ] 

Ahmar Suhail commented on HADOOP-18073:
---

Thanks [~mthakur], I've run the test and all is ok. I'd prefer to do the 
refactoring of that method as a separate PR. If all looks good to you, could 
you/[~ste...@apache.org] please push this rebased branch up to the feature 
branch?

> Upgrade AWS SDK to v2
> -
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when setting 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}






[jira] [Created] (HADOOP-18571) Qualify the upgrade.

2022-12-12 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18571:
-

 Summary: Qualify the upgrade. 
 Key: HADOOP-18571
 URL: https://issues.apache.org/jira/browse/HADOOP-18571
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


Run tests as per [qualifying an AWS SDK 
update|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md#-qualifying-an-aws-sdk-update]






[jira] [Created] (HADOOP-18570) Update region logic

2022-12-12 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18570:
-

 Summary: Update region logic
 Key: HADOOP-18570
 URL: https://issues.apache.org/jira/browse/HADOOP-18570
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


SDK V2 will no longer resolve a bucket's region if it is not set when 
initialising the client. 

 

Current logic always makes a HEAD bucket call on FS initialisation. We 
should review this. Possible solution:
 * Warn if region is not set.
 * If no region is set, try to resolve it. If resolution fails, throw an exception. 
Cache the region to optimise for short-lived FS instances. 
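
The possible solution above could be sketched roughly as follows. This is an illustration under stated assumptions, not the S3A code: BucketRegionResolver and its lookup function are hypothetical, with the lookup standing in for whatever call (for example a HEAD bucket request) actually discovers the region.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.logging.Logger;

/** Hypothetical resolver for illustration only; not part of S3A. */
final class BucketRegionResolver {

  private static final Logger LOG =
      Logger.getLogger(BucketRegionResolver.class.getName());

  // In-memory cache of bucket name to region, so that repeated
  // filesystem initialisations do not repeat the lookup.
  private final Map<String, String> cache = new ConcurrentHashMap<>();

  // Assumed lookup, e.g. a HEAD bucket call; returns null if unresolved.
  private final Function<String, String> lookup;

  BucketRegionResolver(Function<String, String> lookup) {
    this.lookup = lookup;
  }

  /** Returns the configured region, or resolves and caches it. */
  String resolve(String bucket, String configuredRegion) {
    if (configuredRegion != null && !configuredRegion.trim().isEmpty()) {
      return configuredRegion.trim();
    }
    LOG.warning("No region configured for bucket " + bucket
        + "; attempting to resolve it.");
    // If the mapping function throws, nothing is cached for this bucket.
    return cache.computeIfAbsent(bucket, b -> {
      String region = lookup.apply(b);
      if (region == null || region.isEmpty()) {
        throw new IllegalStateException(
            "Unable to resolve region for bucket " + b);
      }
      return region;
    });
  }
}
```

For the cache to help short-lived filesystem instances, one resolver would need to be shared across them within the same JVM. Caching across processes is out of scope for this sketch.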






[jira] [Comment Edited] (HADOOP-18073) Upgrade AWS SDK to v2

2022-12-09 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645348#comment-17645348
 ] 

Ahmar Suhail edited comment on HADOOP-18073 at 12/9/22 3:37 PM:


[~ste...@apache.org] I've rebased our branch 
[here|https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade].
 The only notable conflict was in S3AInputStream, with new code that was added 
for https://issues.apache.org/jira/browse/HADOOP-18460. We've made a couple of 
small changes to fix the conflict 
[here|https://github.com/ahmarsuhail/hadoop/pull/35/files#diff-f84380cfce48a9682320d596f593808fe16d81a71dcc5cfcf10842f932d0ff13R1030].
 [~mthakur] could you check whether this looks ok, and whether there's anything 
else we should do to verify that this issue does not resurface? 


was (Author: JIRAUSER283484):
[~ste...@apache.org] I've rebased our branch: 
[https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade].
 The only notable conflict was in S3AInputStream, with new code that was added 
for https://issues.apache.org/jira/browse/HADOOP-18460. We've made a couple of 
small changes to fix the conflict 
[here|https://github.com/ahmarsuhail/hadoop/pull/35/files#diff-f84380cfce48a9682320d596f593808fe16d81a71dcc5cfcf10842f932d0ff13R1030].
 [~mthakur] could you check whether this looks ok, and whether there's anything 
else we should do to verify that this issue does not resurface? 

> Upgrade AWS SDK to v2
> -
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when setting 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}
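
The use case quoted above comes down to which SDK generation a configured 
credentials provider class belongs to. A minimal sketch, assuming the package 
prefix alone is enough to tell the two apart; the class, enum, and method names 
here are hypothetical and are not part of S3A:

```java
public class CredentialsProviderSdkSketch {

    /** Which AWS SDK generation a provider class name appears to belong to. */
    enum SdkGeneration { V1, V2, UNKNOWN }

    /**
     * Classify a value of fs.s3a.aws.credentials.provider by package prefix.
     * Assumption: V1 classes live under com.amazonaws and V2 classes under
     * software.amazon.awssdk, as in the two class names quoted in the issue.
     */
    static SdkGeneration classify(String className) {
        if (className == null) {
            return SdkGeneration.UNKNOWN;
        }
        String name = className.trim();
        if (name.startsWith("com.amazonaws.")) {
            return SdkGeneration.V1;
        }
        if (name.startsWith("software.amazon.awssdk.")) {
            return SdkGeneration.V2;
        }
        // Anything else (for example a third-party provider) is not classified.
        return SdkGeneration.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify("com.amazonaws.auth.AWSCredentialsProvider"));
        System.out.println(classify(
            "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider"));
    }
}
```

This only illustrates the distinction the reporter draws; it says nothing about 
how S3A itself loads or validates providers.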






[jira] [Comment Edited] (HADOOP-18073) Upgrade AWS SDK to v2

2022-12-09 Thread Ahmar Suhail (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645348#comment-17645348
 ] 

Ahmar Suhail edited comment on HADOOP-18073 at 12/9/22 3:36 PM:


[~ste...@apache.org] I've rebased our branch, 
[https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade].
 The only notable conflict was in S3AInputStream, with new code that was added 
for https://issues.apache.org/jira/browse/HADOOP-18460. We've made a couple of 
small changes to fix the conflict 
[here|https://github.com/ahmarsuhail/hadoop/pull/35/files#diff-f84380cfce48a9682320d596f593808fe16d81a71dcc5cfcf10842f932d0ff13R1030].
 [~mthakur] could you check whether this looks OK, and whether there's anything 
else we should do to verify that this issue does not resurface?


was (Author: JIRAUSER283484):
[~ste...@apache.org] I've rebased our branch, 
[https://github.com/ahmarsuhail/hadoop/tree/feature-HADOOP-18073-s3a-sdk-upgrade].
 The only notable conflict was in S3AInputStream, with new code that was added 
for https://issues.apache.org/jira/browse/HADOOP-18460. We've made a couple of 
small changes to fix the conflict 
[here|https://github.com/ahmarsuhail/hadoop/pull/35/files#diff-f84380cfce48a9682320d596f593808fe16d81a71dcc5cfcf10842f932d0ff13R1030].
 [~mthakur] could you check whether this looks OK, and whether there's anything 
else we should do to verify that this issue does not resurface?

> Upgrade AWS SDK to v2
> -
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when setting 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}






[jira] [Created] (HADOOP-18565) Add in support of custom signers

2022-12-09 Thread Ahmar Suhail (Jira)
Ahmar Suhail created HADOOP-18565:
-

 Summary: Add in support of custom signers
 Key: HADOOP-18565
 URL: https://issues.apache.org/jira/browse/HADOOP-18565
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ahmar Suhail


S3A allows users to configure custom signers; add support for this.
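
A minimal sketch of what a custom signer configuration looks like, assuming the 
existing S3A option fs.s3a.custom.signers takes a comma-separated list of 
SignerName:SignerClassName[:SignerInitializerClassName] entries. The option name 
and format are assumptions taken from the S3A documentation, not from this 
issue, and the class and method names below are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class CustomSignerConfigSketch {

    /** One parsed entry; initializerClass is null when not supplied. */
    static final class SignerEntry {
        final String name;
        final String signerClass;
        final String initializerClass;

        SignerEntry(String name, String signerClass, String initializerClass) {
            this.name = name;
            this.signerClass = signerClass;
            this.initializerClass = initializerClass;
        }
    }

    /** Parse a value in the assumed fs.s3a.custom.signers format. */
    static List<SignerEntry> parse(String value) {
        List<SignerEntry> entries = new ArrayList<>();
        if (value == null || value.trim().isEmpty()) {
            return entries;
        }
        for (String raw : value.split(",")) {
            String entry = raw.trim();
            if (entry.isEmpty()) {
                continue; // tolerate stray commas
            }
            String[] parts = entry.split(":");
            if (parts.length < 2 || parts.length > 3
                    || parts[0].isEmpty() || parts[1].isEmpty()) {
                throw new IllegalArgumentException("Invalid signer entry: " + entry);
            }
            entries.add(new SignerEntry(
                parts[0], parts[1], parts.length == 3 ? parts[2] : null));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Example value; the com.example classes are placeholders.
        String value = "SignerA:com.example.SignerA, "
            + "SignerB:com.example.SignerB:com.example.SignerBInit";
        for (SignerEntry e : parse(value)) {
            System.out.println(e.name + " -> " + e.signerClass
                + (e.initializerClass == null ? "" : " (init: " + e.initializerClass + ")"));
        }
    }
}
```

The sketch covers only parsing the setting; registering the signer with the V2 
SDK client is the part this sub-task would need to add.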





