adnanhemani commented on PR #6507:
URL: https://github.com/apache/hadoop/pull/6507#issuecomment-1920244731

   Hi @steveloughran - thanks for your time on this. We appreciate it a lot. 
I've responded to your comments inline below:
   
   > We cannot have any more unshaded aws sdk jars as required on our 
classpath; removing s3 select in https://github.com/apache/hadoop/pull/6144 has 
simplified our life by removing another optional one.
   >Do you have a timetable for incorporating this plugging into bundle.jar?
   >Otherwise, it is critical that if the jar is not on the cross path normal 
S3 clients can be constructed and used.
   
   Totally agreed with you on this. @jxhan3 and I will work to make sure this 
is optional as we do not have any timeline (or known plans) for getting the 
plugin incorporated into the AWS Java SDKv2 bundle.jar. If this changes in the 
future, we'd be glad to reverse any classloading code we may need for now.
   
   >This will need documentation. Either in connecting.md or a new file in the 
same directory src/site/markdown/tools/hadoop-aws
   
   Noted. @jxhan3 will work on this in the next PR revision.
   
   >I do not see any integration tests. What is the story here? Is it possible 
to run the whole mvn verify test run with access grants? if so, adding a 
paragraph in testing.md would be good, and particular: how to set it up. I am 
particularly curious about how well the delegation tokens worked...are session 
credentials supported?
   
   The story currently is, if we treat the plugin as yet another third-party 
(and optional) dependency, then this PR is only going to be providing the bare 
minimum code for users to be able to enable the plugin if they explicitly 
choose to do so. Any issues with the actual functionality of the plugin should 
be addressed by the plugin itself at their [open source 
GitHub](https://github.com/aws/aws-s3-accessgrants-plugin-java-v2). So then, 
the only testing that we'd require would be to ensure that if users are 
explicitly enabling this feature, that S3A is ensuring its S3 clients have the 
plugin attached. Other open-source contributions (e.g. Iceberg) have accepted 
this testing model - and I'd also recommend it to reduce the need for redundant 
test coverage between S3A and the S3 Access Grants plugin.
   
   If you don't agree with this testing model, we can surely try to add 
additional ITest cases that will both setup and tear down the S3 Access Grants 
instance, locations, and required grants (to be noted: S3 Access Grants APIs 
are _not_ free) - and then test both when users should and should not be able 
to receive access. However, running all existing test cases under this model 
will be a heavy task that will likely require lots of test case refactoring, as 
Access Grants are defined on a location-by-location basis. In order to test 
both when users should and should not have access for each test case will 
require both additional setup and test code to ensure that those situations can 
be adequately tested with multiple data locations. I'm not sure that the ROI on 
making such a large change will be there. Please let me know your thoughts on 
this.
   
   As for how the feature works, S3 Access Grants will authenticate the 
credentials to find the IAM user associated with it - then use that identity 
for the authorization before returning a new set of scoped credentials to 
actually access the data (in other words, the credentials that are inputted to 
the S3 client will not be the credentials used to actually access the data). 
The S3 Access Grants plugin is the mechanism that will do the entire credential 
vending process and using the vended credentials properly in any calls made 
from the attached S3 client. As such, session credentials and delegation tokens 
will work given that the credentials that are passed to the S3 client (using 
any mechanism) are valid and can be authenticated properly.
   
   >The feature probably also needs an extra line in the "qualifying an SDK" 
section.
   
   Noted. @jxhan3 will work on this in the next PR revision.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to