[ 
https://issues.apache.org/jira/browse/HADOOP-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609137#comment-16609137
 ] 

Steve Loughran commented on HADOOP-15694:
-----------------------------------------

bq. I know it's late for this comment, but I realized while reviewing this that 
it is a feature which would be nice in hadoop-common.  

That would be a significant change to a structure which is heavily used across, 
is wire-serialized, and 

It's interesting to see that the Azure stores so far customise on the account 
name, represented by an FQDN; what's being tuned is how you auth.

For S3A we're tuning on a per-bucket basis, which includes the choice of auth 
mechanisms:

fs.s3a.bucket.steve-ireland.aws.credentials.provider=org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider
fs.s3a.bucket.steve-ireland.assumed.role.arn=arn:aws:iam::00000000000:role/stevel-s3-restricted

I think they both have roles. The S3A one is good for final customisation but, 
because it's not known by the rest of the system, it makes resolving properties 
hard: you can't reliably have references across properties, as the per-bucket 
ones are only resolved in the bucket creation phase. It also makes 
unsetting/overwriting and diagnosing provenance hard, as in "where did this 
option get set?"
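
To illustrate why per-bucket values are only visible late, here is a minimal 
sketch of the propagation step, using a plain Map rather than Hadoop's 
Configuration class. The class and method names (PerBucketOptions, propagate) 
and the AnonymousProvider value are hypothetical, chosen for illustration; this 
is not the S3A code itself.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-bucket option propagation; not the S3A source. */
final class PerBucketOptions {

  /**
   * Returns a copy of source in which every fs.s3a.bucket.BUCKET.X entry
   * has been copied onto fs.s3a.X, overriding any base value.
   */
  static Map<String, String> propagate(Map<String, String> source, String bucket) {
    String bucketPrefix = "fs.s3a.bucket." + bucket + ".";
    Map<String, String> dest = new HashMap<>(source);
    for (Map.Entry<String, String> e : source.entrySet()) {
      if (e.getKey().startsWith(bucketPrefix)) {
        String generic = "fs.s3a." + e.getKey().substring(bucketPrefix.length());
        dest.put(generic, e.getValue());
      }
    }
    return dest;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Base value; "AnonymousProvider" is a placeholder, not a real class name.
    conf.put("fs.s3a.aws.credentials.provider", "AnonymousProvider");
    conf.put("fs.s3a.bucket.steve-ireland.aws.credentials.provider",
        "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider");

    // Only the bucket-specific view sees the override.
    System.out.println(
        propagate(conf, "steve-ireland").get("fs.s3a.aws.credentials.provider"));
    System.out.println(
        propagate(conf, "other-bucket").get("fs.s3a.aws.credentials.provider"));
  }
}
```

Anything that reads fs.s3a.aws.credentials.provider before this step runs sees 
only the base value, which is the cross-reference and provenance problem 
described above.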

The OAuth stuff here is similar: you need to know which account's details you 
are looking up before you can determine the value to use, which again means 
knowing the URL of the container.
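
The lookup proposed in this issue (try <config_key>.<account> first, then fall 
back to <config_key>) could be sketched roughly as follows. This uses a plain 
Map in place of Hadoop's Configuration, and the class name, method name, config 
key and account names are all made up for illustration; it is not the patch's 
actual code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper illustrating account-specific lookup with fallback. */
final class AccountConfigLookup {

  /**
   * Returns the value of key.account if present, otherwise the value of
   * key, otherwise null.
   */
  static String lookup(Map<String, String> conf, String key, String account) {
    String specific = conf.get(key + "." + account);
    return specific != null ? specific : conf.get(key);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Illustrative key and account names only.
    conf.put("example.oauth.client.id", "cluster-identity");
    conf.put("example.oauth.client.id.acct1.example.net", "acct1-identity");

    // The account name must be known before the right value can be chosen.
    System.out.println(lookup(conf, "example.oauth.client.id", "acct1.example.net"));
    System.out.println(lookup(conf, "example.oauth.client.id", "acct2.example.net"));
  }
}
```

The account argument is the point: the caller has to derive it from the 
container URL before any of these settings can be resolved.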

The sub-key collection model proposed at least makes those details visible, but 
it'd quite radically change Hadoop configs from simple key->value to a more 
complex model. Though as that creeps in in different ways (go look at how HDFS 
HA naming works!), maybe it's something to propose for Hadoop 3.3+. Backports 
would scare people.

> ABFS: Allow OAuth credentials to not be tied to accounts
> --------------------------------------------------------
>
>                 Key: HADOOP-15694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15694
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Major
>         Attachments: HADOOP-15694-HADOOP-15407-005.patch, 
> HADOOP-15694-HADOOP-15407.003.patch, HADOOP-15694-HADOOP-15407.004.patch, 
> HADOOP-15694-HADOOP-15407.006.patch, HADOOP-15694.001.patch, 
> HADOOP-15694.002.patch, HADOOP-15694.003.patch
>
>
> Now that there's OAuth support, it's possible to have a notion of identity 
> that's distinct from the account itself. If a cluster is configured via OAuth 
> with its own identity, it's likely operators will want to use that identity 
> regardless of which storage account a job uses.
> So OAuth configs right now (and probably others) are looked up with 
> <config_key>.<account>. I propose that we add a function for looking up these 
> configs that returns an account-specific value if it exists, but in the event 
> it does not will also try to return <config_key>, if that exists.
> I can work on a patch for this if nobody has any objections.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
