Dheeraj Panangat created HUDI-5103:
--------------------------------------

             Summary: Does not work with Azure Data Lake Gen2
                 Key: HUDI-5103
                 URL: https://issues.apache.org/jira/browse/HUDI-5103
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Dheeraj Panangat


Unable to use Hudi with Flink against Azure Data Lake Storage Gen2.
FileSystem initialization fails with:
{code:java}
Caused by: Configuration property <datalakeaccount>.dfs.core.windows.net not found.
    at org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:372)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:1133)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:174)
{code}

The following properties are specified when running the Flink job:
{code:java}
"fs.azure.account.auth.type": "OAuth",
"fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
"fs.azure.account.oauth2.client.id": "<appId>",
"fs.azure.account.oauth2.client.secret": "<clientSecret>",
"fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant>/oauth2/token",
"fs.azure.createRemoteFileSystemDuringInitialization": "true"
{code}
as per the document: [Microsoft Azure Spark to Data Lake Gen2|https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-use-databricks-spark]
On AWS this works because the credentials are taken from the environment, but on Azure they must come from the configuration, and the configuration does not reach the point where the FileSystem is initialized.
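The failing call is {{AbfsConfiguration.getStorageAccountKey}}, which suggests the global {{fs.azure.account.auth.type}} setting was never seen and ABFS fell back to SharedKey lookup. As a possible workaround (an assumption, not a confirmed fix), the Hadoop ABFS connector also accepts account-scoped keys of the form {{fs.azure.account.auth.type.<account>.dfs.core.windows.net}}. A minimal plain-JDK sketch (no Hadoop dependency; the account name and the helper below are hypothetical) of building the suffixed properties:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AbfsAuthProps {

    // Hypothetical helper: builds the OAuth properties from the issue, each
    // suffixed with the account-specific form that AbfsConfiguration also
    // resolves. The account name is a placeholder.
    static Map<String, String> abfsOAuthProps(String account) {
        String suffix = "." + account + ".dfs.core.windows.net";
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.azure.account.auth.type" + suffix, "OAuth");
        props.put("fs.azure.account.oauth.provider.type" + suffix,
                "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider");
        props.put("fs.azure.account.oauth2.client.id" + suffix, "<appId>");
        props.put("fs.azure.account.oauth2.client.secret" + suffix, "<clientSecret>");
        props.put("fs.azure.account.oauth2.client.endpoint" + suffix,
                "https://login.microsoftonline.com/<tenant>/oauth2/token");
        return props;
    }

    public static void main(String[] args) {
        // Print the keys as they would appear in core-site.xml / flink-conf.
        abfsOAuthProps("datalakeaccount")
                .forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Placing these suffixed entries in {{core-site.xml}} (rather than only in the Flink job properties) would make them visible to the Hadoop {{Configuration}} at the point where the FileSystem is initialized.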



--
This message was sent by Atlassian Jira
(v8.20.10#820010)