[ https://issues.apache.org/jira/browse/HADOOP-19178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848360#comment-17848360 ]
Steve Loughran edited comment on HADOOP-19178 at 5/22/24 2:17 PM:
------------------------------------------------------------------

Makes sense, I've been undertesting it anyway.

It'd be good to have some form of graceful degradation of the wasb driver:
* immediate PR to warn that it's deprecated for trunk, 3.4 and 3.3.9 branches, docs updated
* after the cut, have some stub fs to fail on instantiate() with a meaningful error message. We did this with s3n, way back. No attempt at migration, just a "gone, go look at the docs"

was (Author: ste...@apache.org):

Makes sense, I've been undertesting it anyway.

It'd be good to have some form of graceful degradation of the wasb driver:
* immediate PR to warn that it's deprecated for trunk, 3.4 and 3.3.9 branches, docs updated
* after the cut, have some stub fs to fail on instantiate() with a meaningful error message. We did this with s3n, way back. No attempt at migration, just a "done, go look at the docs"

> WASB Driver Deprecation and eventual removal
> --------------------------------------------
>
>                 Key: HADOOP-19178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19178
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.4.0
>            Reporter: Sneha Vijayarajan
>            Assignee: Sneha Vijayarajan
>            Priority: Major
>             Fix For: 3.4.1
>
>
> *WASB Driver*
> The WASB driver was developed to support FNS (FlatNameSpace) Azure Storage
> accounts. FNS accounts do not honor File-Folder syntax. HDFS folder
> operations are hence mimicked on the client side by the WASB driver, and
> certain folder operations like Rename and Delete can lead to a lot of IOPs,
> with client-side enumeration and orchestration of the rename/delete
> operation blob by blob. It was not ideal for other APIs either, as the
> initial check of whether a path is a file or a folder needs multiple
> metadata calls. These led to degraded performance.
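The blob-by-blob folder rename described above can be illustrated with a small sketch. This is not WASB code: `FlatNamespaceRenameSketch`, its in-memory `blobs` map, and the `requests` counter are hypothetical names used only to show why a "folder" rename on a flat namespace costs roughly one copy and one delete per blob, on top of the listing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a flat-namespace (FNS) container.
// It only models the cost pattern; it does not call any Azure or Hadoop API.
class FlatNamespaceRenameSketch {
    // Full blob name -> content. There are no directory objects, only names.
    final TreeMap<String, String> blobs = new TreeMap<>();
    // Counts simulated store requests (list, copy, delete).
    int requests = 0;

    // Renames every blob under srcDir to dstDir, one blob at a time.
    // Returns the number of blobs moved.
    int renameFolder(String srcDir, String dstDir) {
        String srcPrefix = srcDir.endsWith("/") ? srcDir : srcDir + "/";
        String dstPrefix = dstDir.endsWith("/") ? dstDir : dstDir + "/";

        // Client-side enumeration: counted as a single listing request here
        // (an assumption; a real store would page through results).
        requests++;
        List<String> matches = new ArrayList<>();
        for (String name : blobs.tailMap(srcPrefix, true).keySet()) {
            if (!name.startsWith(srcPrefix)) {
                break; // sorted order: no further names share the prefix
            }
            matches.add(name);
        }

        // Client-side orchestration: copy, then delete, for each blob.
        for (String name : matches) {
            String target = dstPrefix + name.substring(srcPrefix.length());
            blobs.put(target, blobs.get(name));
            requests++; // copy
            blobs.remove(name);
            requests++; // delete
        }
        return matches.size();
    }

    public static void main(String[] args) {
        FlatNamespaceRenameSketch store = new FlatNamespaceRenameSketch();
        store.blobs.put("logs/a.txt", "1");
        store.blobs.put("logs/sub/b.txt", "2");
        int moved = store.renameFolder("logs", "archive");
        System.out.println(moved + " blobs moved using " + store.requests + " requests");
    }
}
```

On a hierarchical namespace the same rename is a single directory operation, which is the gap the description is pointing at.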
> To provide better service to Analytics customers, Microsoft released ADLS
> Gen2, whose accounts are HNS (Hierarchical Namespace), i.e. a File-Folder
> aware store. The ABFS driver was designed to overcome the inherent
> deficiencies of WASB, and customers were advised to migrate to the ABFS
> driver.
>
> *Customers who still use the legacy WASB driver and the challenges they face*
> Some of our customers have not migrated to the ABFS driver yet and continue
> to use the legacy WASB driver with FNS accounts.
> These customers face the following challenges:
> * They cannot leverage the optimizations and benefits of the ABFS driver.
> * They need to deal with compatibility issues if files and folders are
> modified with the legacy WASB driver and the ABFS driver concurrently in a
> phased transition.
> * The features supported over the ABFS driver differ between FNS and HNS
> accounts.
> * In certain cases, they must perform a significant amount of re-work on
> their workloads to migrate to the ABFS driver, which is available only on
> HNS enabled accounts in a fully tested and supported scenario.
>
> *Deprecation plans for WASB*
> We are introducing a new feature that will enable the ABFS driver to support
> FNS accounts (over BlobEndpoint) using the ABFS scheme. This feature will
> enable customers to use the ABFS driver to interact with data stored in GPv2
> (General Purpose v2) storage accounts.
> With this feature, the customers who still use the legacy WASB driver will be
> able to migrate to the ABFS driver without much re-work on their workloads.
> They will, however, need to change the URIs from the WASB scheme to the ABFS
> scheme.
> Once the ABFS driver has built the FNS support needed to migrate WASB
> customers, the WASB driver will be declared deprecated in the OSS
> documentation and marked for removal in the next major release.
> This will remove any ambiguity for newly onboarding customers, as there will
> be only one Microsoft driver for Azure Storage, and migrating customers will
> get SLA-bound support for the driver and service, which was not guaranteed
> over WASB.
> We anticipate that this feature will serve as a stepping stone for customers
> to move to HNS enabled accounts with the ABFS driver, which is our
> recommended stack for big data analytics on ADLS Gen2.
>
> *Any impact for existing customers who are using ADLS Gen2 (HNS enabled
> account) with ABFS driver?*
> This feature does not impact the existing customers who are using ADLS Gen2
> (HNS enabled account) with the ABFS driver.
> They do not need to make any changes to their workloads or configurations.
> They will still enjoy the benefits of HNS, such as atomic operations,
> fine-grained access control, scalability, and performance.
>
> *Official recommendation*
> Microsoft continues to recommend that all Big Data and Analytics customers
> use Azure Data Lake Gen2 (ADLS Gen2) with the ABFS driver and will continue
> to optimize this scenario in future. We believe that this new option will
> help all those customers transition to a supported scenario immediately,
> while they plan to ultimately move to ADLS Gen2 (HNS enabled account).
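The URI scheme change mentioned under the deprecation plans could be scripted along these lines. This is a minimal sketch, not part of any driver: `WasbToAbfsUriSketch` and `toAbfsScheme` are hypothetical names, and mapping the secure scheme `wasbs` to `abfss` is an assumption. The authority (container and account host) is left untouched because the description only mentions a scheme change; whether the host endpoint must also change is not stated there.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper: rewrites only the scheme of a WASB URI.
class WasbToAbfsUriSketch {
    // Returns the same URI with wasb -> abfs or wasbs -> abfss (assumed mapping).
    // URIs with any other scheme, or no scheme, are returned unchanged.
    static URI toAbfsScheme(URI uri) {
        String scheme = uri.getScheme();
        if (scheme == null) {
            return uri;
        }
        String newScheme;
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "wasb":
                newScheme = "abfs";
                break;
            case "wasbs":
                newScheme = "abfss";
                break;
            default:
                return uri;
        }
        try {
            // Authority, path, query and fragment are carried over as they are.
            return new URI(newScheme, uri.getAuthority(), uri.getPath(),
                    uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rewrite " + uri, e);
        }
    }

    public static void main(String[] args) {
        URI in = URI.create("wasbs://container@account.blob.core.windows.net/data/part-0000");
        System.out.println(toAbfsScheme(in));
    }
}
```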
>
> *New authentication options that a customer migrating from the WASB to the
> ABFS driver will get*
> The auth types below, which WASB provides, will continue to work on the new
> FNS over ABFS Driver, over configuration that accepts these SAS types
> (similar to WASB):
> * SharedKey
> * Account SAS
> * Service/Container SAS
>
> The authentication types below, which were not supported by the WASB driver
> but are supported by the ABFS driver, will continue to be available for the
> new FNS over ABFS Driver:
> * OAuth 2.0 Client Credentials
> * OAuth 2.0: Refresh Token
> * Azure Managed Identity
> * Custom OAuth 2.0 Token Provider
>
> The ABFS Driver SAS Token Provider plugin present today for UserDelegation
> SAS and Directly SAS will continue to work only for HNS accounts.
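The stub filesystem suggested in the comment at the top of this message (fail fast with a meaningful error, no migration attempt) might look roughly like the sketch below. It is plain Java so that it stands alone: `RemovedWasbStubSketch` is a hypothetical class, and a real stub would extend `org.apache.hadoop.fs.FileSystem` and throw from `initialize(URI, Configuration)` instead. The message wording is also only an assumption.

```java
import java.io.IOException;
import java.net.URI;

// Hypothetical stand-in for a stub that replaces a removed connector.
// It never opens anything; it only explains where the connector went.
class RemovedWasbStubSketch {
    static final String DOCS_HINT =
            "see the hadoop-azure documentation for the ABFS connector";

    // Always fails, naming the URI that was requested and pointing at the docs.
    void initialize(URI name) throws IOException {
        throw new IOException("The WASB connector has been removed and cannot open "
                + name + "; " + DOCS_HINT);
    }

    public static void main(String[] args) {
        try {
            new RemovedWasbStubSketch()
                    .initialize(URI.create("wasb://container@account.blob.core.windows.net/"));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at initialization means any job that still references the old scheme stops immediately with the pointer to the docs, rather than with a less obvious "no filesystem for scheme" style error.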