[
https://issues.apache.org/jira/browse/HADOOP-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934170#comment-16934170
]
Sneha Vijayarajan commented on HADOOP-16578:
--------------------------------------------
Filesystem REST API calls for which ABFS driver has supporting code are:
* Create FileSystem - REST Syntax - PUT
https://{accountName}.{dnsSuffix}/{filesystem}?resource=filesystem
* Get FileSystem Properties - REST Syntax - HEAD
https://{accountName}.{dnsSuffix}/{filesystem}?resource=filesystem
* Set FileSystem Properties - REST Syntax - PATCH
https://{accountName}.{dnsSuffix}/{filesystem}?resource=filesystem
* Delete FileSystem - REST Syntax - DELETE
https://{accountName}.{dnsSuffix}/{filesystem}?resource=filesystem
[FileSystem REST API documentation:
https://docs.microsoft.com/en-us/rest/api/storageservices/data-lake-storage-gen2]
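To make the list above concrete, here is a minimal sketch of how the four filesystem-level calls differ only in their HTTP verb while sharing one URL. The class FilesystemRestCalls and its members are hypothetical names for illustration and are not part of the ABFS driver; the account, DNS suffix and filesystem values in main() are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a hypothetical helper, not code from the ABFS driver. */
final class FilesystemRestCalls {
    /** Filesystem-level operations and the HTTP verb each one uses, as listed in the comment. */
    static final Map<String, String> VERBS = new LinkedHashMap<>();
    static {
        VERBS.put("Create FileSystem", "PUT");
        VERBS.put("Get FileSystem Properties", "HEAD");
        VERBS.put("Set FileSystem Properties", "PATCH");
        VERBS.put("Delete FileSystem", "DELETE");
    }

    /** All four operations share this URL; only the verb differs. */
    static String filesystemUrl(String accountName, String dnsSuffix, String filesystem) {
        return "https://" + accountName + "." + dnsSuffix + "/" + filesystem
                + "?resource=filesystem";
    }

    public static void main(String[] args) {
        // Placeholder values, chosen only for the demonstration.
        String url = filesystemUrl("myaccount", "dfs.core.windows.net", "myfs");
        VERBS.forEach((op, verb) -> System.out.println(verb + " " + url + "  (" + op + ")"));
    }
}
```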
Of these, AzureBlobFileSystem currently triggers only the following 2 APIs:
* Create FileSystem - Called in AzureBlobFileSystem::initialize(), if
"fs.azure.createRemoteFileSystemDuringInitialization" is true and the
GetFileSystemProperties call returned 404 Not Found
* Get FileSystem Properties - Called in the following 2 places:
# In AzureBlobFileSystem::initialize(), to check FileSystem existence if
"fs.azure.createRemoteFileSystemDuringInitialization" is true
# In the GetFileStatus API code flow, if the request is for the root path and
the account is not namespace enabled, GetFileSystemProperties() gets called.
This is expected, as GetPathProperties() does not support the root path if the
account is not namespace enabled. Hence the GetFileSystemProperties() call will
have to be retained in this flow.
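The GetFileStatus behaviour described in the second case can be sketched roughly as follows. This is an illustration under assumptions, not the driver's code: the class GetFileStatusDispatch, the RestCall enum and the callFor() method are hypothetical names, and treating "/" as the root path is an assumption made for the example.

```java
/** Illustrative only: names mirror the REST calls in the comment, not driver methods. */
final class GetFileStatusDispatch {
    enum RestCall { GET_PATH_PROPERTIES, GET_FILESYSTEM_PROPERTIES }

    /** Picks the REST call a GetFileStatus request would end up issuing. */
    static RestCall callFor(String path, boolean namespaceEnabled) {
        boolean isRoot = "/".equals(path); // assumption: root is written as "/"
        // GetPathProperties does not support the root path on accounts that are
        // not namespace enabled, so that one case falls back to the filesystem-level call.
        if (isRoot && !namespaceEnabled) {
            return RestCall.GET_FILESYSTEM_PROPERTIES;
        }
        return RestCall.GET_PATH_PROPERTIES;
    }
}
```

Only the combination of root path and non-namespace account reaches the filesystem-level API; every other request stays on the path-level call.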
The proposed change is to convert the GetFileSystemProperties call in
AzureBlobFileSystem::initialize() to GetFileStatus() on the root path.
Hence, if the FileSystem is already present and the account is namespace
enabled, no FileSystem API will be called.
The existing flow, in which the config createRemoteFileSystemDuringInitialization
is true and the FileSystem does not exist, will continue to work as it does today,
i.e., a Create FileSystem request will be sent to the server, which will validate
the RBAC role needed for container create. [RBAC - ADLS Gen2 documentation:
https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control#role-based-access-control]
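A rough sketch of the proposed initialize() flow follows. It is not the actual patch: the Store interface, its two methods and the InitializeSketch class are hypothetical stand-ins, and it assumes that a 404 from the server surfaces as a FileNotFoundException from getFileStatus().

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Illustrative only: a hypothetical model of the proposed flow, not the patch itself. */
final class InitializeSketch {
    /** Hypothetical stand-in for the store operations used during initialization. */
    interface Store {
        /** Assumed to throw FileNotFoundException when the server answers 404. */
        void getFileStatus(String path) throws IOException;
        void createFilesystem() throws IOException;
    }

    /** Returns true if a Create FileSystem request had to be sent. */
    static boolean initialize(Store store, boolean createRemoteFileSystemDuringInitialization)
            throws IOException {
        if (!createRemoteFileSystemDuringInitialization) {
            return false; // existence is only checked when the config is true
        }
        try {
            // Proposed: probe the root path instead of calling GetFileSystemProperties.
            store.getFileStatus("/");
            return false; // FileSystem already present, nothing to create
        } catch (FileNotFoundException e) {
            // Unchanged behaviour: the server validates the RBAC role for container create.
            store.createFilesystem();
            return true;
        }
    }
}
```

With this shape, an existing FileSystem on a namespace-enabled account is checked through a path-level call only, and the create path is reached solely when the root probe reports that the FileSystem is missing.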
> ABFS: fileSystemExists() should not call container level apis
> -------------------------------------------------------------
>
> Key: HADOOP-16578
> URL: https://issues.apache.org/jira/browse/HADOOP-16578
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Da Zhou
> Assignee: Sneha Vijayarajan
> Priority: Major
> Fix For: 3.3.0
>
>
> ABFS driver should not use container level api "Get Container Properties" as
> there is no concept of container in HDFS, and this caused some RBAC check
> issue.
> Fix: use getFileStatus() to check if the container exists.