[
https://issues.apache.org/jira/browse/HADOOP-17915?focusedWorklogId=653075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-653075
]
ASF GitHub Bot logged work on HADOOP-17915:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 20/Sep/21 16:20
Start Date: 20/Sep/21 16:20
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #3442:
URL: https://github.com/apache/hadoop/pull/3442#issuecomment-923078073
Yetus is happy, my ITests are happy, so this is ready for review
@mukund-thakur @mehakmeet @lmccay @snehavarma: any spare review capacity?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 653075)
Time Spent: 0.5h (was: 20m)
> ABFS AbfsDelegationTokenManager to generate canonicalServiceName if DT plugin
> doesn't
> -------------------------------------------------------------------------------------
>
> Key: HADOOP-17915
> URL: https://issues.apache.org/jira/browse/HADOOP-17915
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Critical
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently in {{AbfsDelegationTokenManager}}, a
> {{CustomDelegationTokenManager}} only provides a canonical service name if it
> implements {{BoundDTExtension}} and its {{getCanonicalServiceName()}} method.
> If this doesn't hold, {{AbfsDelegationTokenManager}} returns null, which
> causes {{AzureBlobFileSystem.getCanonicalServiceName()}}
> to call {{super.getCanonicalServiceName()}}, *which resolves the IP address of
> the abfs endpoint, and then the FQDN of that IP address*.
> If a storage account is served over >1 endpoint, then the DT will only have a
> valid service name for one of the possible
> endpoints, so it will _only work if all processes get the same IP address when
> they look up the storage account address_.
> Fix
> # DT plugins SHOULD generate the canonical service name
> # If they don't, and DTs are enabled: {{AbfsDelegationTokenManager}} to
> create a default one
> # and {{AzureBlobFileSystem.getCanonicalServiceName()}} MUST NOT call
> superclass.
> The default canonical service name of a store will be {{"abfs://" +
> FsURI.getHost() + "/"}}, so all containers in the same storage account have the
> same service name.
> {code}
> abfs://[email protected]/path
> {code}
> maps to
> {code}
> abfs://stevel-testing.dfs.core.windows.net/
> {code}
> This will mean that only one DT will be created per storage account; applications
> will not need to list all containers which deployed processes will wish to
> interact with. Today's behaviour, based on rDNS lookup of the storage account, is
> possibly slightly broader in that all storage accounts which map to the same
> IP address share a DT. The proposed scheme will still be much broader than that
> of S3A, where every bucket has its own unique service name, so apps need to list
> all target filesystems at launch time (easy for MR, a source of trouble in
> Spark).
> Fix: straightforward.
> Test
> * no DTs: service name == null
> * DTs: will match proposed pattern, even if extension returns null.
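The fallback described in the issue can be illustrated with a minimal sketch. This is not the code from the pull request: the class name, the {{ServiceNameSource}} interface and the method names are hypothetical stand-ins, and the container name {{container1}} mentioned afterwards is made up. It only assumes what the description states: no DTs means a null service name, a plugin-supplied name wins if present, and otherwise the name is "abfs://" + host + "/".

```java
import java.net.URI;

// Illustrative sketch only; names here are hypothetical and are not
// taken from the Hadoop codebase or from the pull request.
final class DefaultServiceNameSketch {

  /** Hypothetical stand-in for a DT plugin that may or may not supply a name. */
  interface ServiceNameSource {
    /** @return the plugin's canonical service name, or null if it has none. */
    String getCanonicalServiceName();
  }

  /** Default name per the issue text: "abfs://" + host + "/". */
  static String defaultServiceName(URI fsUri) {
    return "abfs://" + fsUri.getHost() + "/";
  }

  /**
   * Resolve the service name without any DNS or reverse-DNS lookup.
   * Returns null when delegation tokens are disabled.
   */
  static String canonicalServiceName(boolean dtEnabled,
      ServiceNameSource plugin, URI fsUri) {
    if (!dtEnabled) {
      return null;
    }
    String fromPlugin = plugin == null ? null : plugin.getCanonicalServiceName();
    // Fall back to the default when the plugin returns null.
    return fromPlugin != null ? fromPlugin : defaultServiceName(fsUri);
  }
}
```

With a URI such as abfs://container1@stevel-testing.dfs.core.windows.net/path, {{java.net.URI.getHost()}} drops the container (user-info) part, so every container in the account yields the same name, and the result no longer depends on which IP address a process resolves.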
--
This message was sent by Atlassian Jira
(v8.3.4#803005)