LDVSOFT opened a new pull request, #7966:
URL: https://github.com/apache/hadoop/pull/7966

   ### Description of PR
   
   `URIBuilder` was used from the AWS SDK for Java v2, specifically from the Apache HTTP Client that is shaded into the SDK bundle. That is a problem for anyone who would rather not ship the whole AWS SDK bundle: only about three modules are actually needed (s3, s3-transfer & sts), but then the shaded class is missing, and pulling in an unshaded HTTP client risks dependency version conflicts. Since a plain `java.net.URI` constructor achieves the same result here, I switched to it as the preferred option (sketched below).
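   A minimal sketch of the substitution (the class, method and parameter names here are hypothetical, not the actual Hadoop code this PR touches):
   
   ```java
   import java.net.URI;
   import java.net.URISyntaxException;
   
   // Illustrative only: EndpointUriSketch and buildEndpoint() are made-up names.
   public final class EndpointUriSketch {
       static URI buildEndpoint(String host) throws URISyntaxException {
           // Before: new URIBuilder().setScheme("https").setHost(host).build(),
           // where URIBuilder is the Apache HTTP Client class shaded into the
           // AWS SDK bundle, so it disappears once the bundle is swapped for
           // individual SDK modules.
           // After: the JDK's URI(scheme, host, path, fragment) constructor
           // builds the same URI with no extra dependency.
           return new URI("https", host, null, null);
       }
   }
   ```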
   
   ### How was this patch tested?
   
   I've run [the test suite](https://hadoop.apache.org/docs/r3.4.1/hadoop-aws/tools/hadoop-aws/testing.html) against a _eu-west-1_ bucket, without the scale/load tests since the change shouldn't affect those. Specifically, with a configuration like this:
   <details>
     <summary><code class="notranslate">auth-keys.xml</code></summary>
   
       <configuration>
       <property>
           <name>test.fs.s3a.name</name>
           <value>s3a://hadoop-test-‹edited›</value>
       </property>
   
       <property>
           <name>test.fs.s3a.encryption.enabled</name>
           <value>false</value>
           <description>Don't wanna</description>
       </property>
   
       <property>
           <name>test.fs.s3a.create.acl.enabled</name>
           <value>false</value>
           <description>disabled on server</description>
       </property>
   
       <property>
           <name>fs.s3a.endpoint.region</name>
           <value>eu-west-1</value>
       </property>
   
       <property>
           <name>fs.s3a.assumed.role.sts.endpoint.region</name>
           <value>eu-west-1</value>
       </property>
   
       <property>
           <name>test.sts.endpoint</name>
           <description>Specific endpoint to use for STS requests.</description>
           <value>sts.eu-west-1.amazonaws.com</value>
       </property>
   
       <property>
           <name>fs.s3a.assumed.role.sts.endpoint</name>
           <value>${test.sts.endpoint}</value>
       </property>
   
       <property>
           <name>fs.contract.test.fs.s3a</name>
           <value>${test.fs.s3a.name}</value>
       </property>
   
       <property>
           <!-- Runs under aws-vault --no-session -->
            <name>fs.s3a.aws.credentials.provider</name>
            <value>software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider</value>
       </property>
   
       <property>
           <!-- Runs under aws-vault --no-session -->
            <name>fs.s3a.assumed.role.credentials.provider</name>
            <value>software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider</value>
       </property>
   
       <property>
           <name>fs.s3a.assumed.role.arn</name>
           <value>arn:aws:iam::‹edited›:role/hadoop_test_role_‹edited›</value>
       </property>
   
       <!-- is there a typo in the docs? -->
       <property>
           <name>fs.s3a.delegation.token.endpoint</name>
           <value>${fs.s3a.assumed.role.sts.endpoint}</value>
       </property>
       </configuration>
   </details>
   
   **Almost** all tests pass:
   * I wasn't able to make `ITestDelegatedMRJob` work. It probably clears out the environment somewhere, so my environment-provided AWS credentials didn't reach it. It also looks parameterized, and I can't tell from the Surefire/Failsafe reports which parameter causes the problem.
   * `ITestRoleDelegationInFilesystem`/`ITestSessionDelegationInFilesystem` fail in `missmatch2`, but I'm really unfamiliar with credentials delegation. Probably the environment variables got lost on their way?
   * Sometimes `ITestS3APrefetchingInputStream` fails with a size of 0.
   * To be honest, these also fail for me on trunk!
   
   Given that the other tests pass, and given the scope of the change, I think this is fine and the failures come down to a misconfiguration in my test setup. If you know how to fix the setup, I can rerun with other options.
   
   Also, I found this bug while repackaging Spark for a local K8s deployment; with this fix, the STS configuration options work even when I replace the AWS SDK bundle with only the required SDK modules. A rough smoke check of that setup is sketched below.
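   A minimal smoke check under those assumptions (only `software.amazon.awssdk:s3`, `software.amazon.awssdk:s3-transfer-manager` and `software.amazon.awssdk:sts`, plus their transitive dependencies, on the classpath instead of the bundle; the class name is made up for illustration):
   
   ```java
   import software.amazon.awssdk.services.s3.S3Client;
   import software.amazon.awssdk.services.sts.StsClient;
   
   // Hypothetical check that the individual SDK modules resolve and
   // initialize without the full bundle on the classpath.
   public class SdkModulesSmokeCheck {
       public static void main(String[] args) {
           // create() picks up region and credentials from the default
           // provider chain (environment variables in my setup).
           try (S3Client s3 = S3Client.create();
                StsClient sts = StsClient.create()) {
               System.out.println("S3 and STS clients created without the SDK bundle");
           }
       }
   }
   ```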
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [x] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [x] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   ### Sign-off
   
   I give a license to the Apache Software Foundation to use this code, as 
required under §5 of the Apache License.
   
   ### P.S.
   
   Re-opened from #7483 where there was a review by @steveloughran.

