[ https://issues.apache.org/jira/browse/HUDI-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17634648#comment-17634648 ]
Dheeraj Panangat commented on HUDI-5103:
----------------------------------------
There are a couple of ways to resolve this:
1. Add core-site.xml to resources:
{code:xml}
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.hdfs.impl</name>
    <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
  </property>
  <property>
    <name>fs.file.impl</name>
    <value>org.apache.hadoop.fs.LocalFileSystem</value>
  </property>
  <property>
    <name>fs.azure.account.auth.type</name>
    <value>OAuth</value>
  </property>
  <property>
    <name>fs.azure.account.oauth.provider.type</name>
    <value>org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider</value>
  </property>
  <property>
    <name>fs.azure.account.oauth2.client.id</name>
    <value>****</value>
  </property>
  <property>
    <name>fs.azure.account.oauth2.client.secret</name>
    <value>**********</value>
  </property>
  <property>
    <name>fs.azure.account.oauth2.client.endpoint</name>
    <value>https://login.microsoftonline.com/***********/oauth2/token</value>
  </property>
  <property>
    <name>fs.azure.createRemoteFileSystemDuringInitialization</name>
    <value>true</value>
  </property>
</configuration>
{code}
2. Pass these properties in the Flink Table config, prefixing each key with 'hadoop.':
{code:java}
"hadoop.fs.azure.account.auth.type": "OAuth",
"hadoop.fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
"hadoop.fs.azure.account.oauth2.client.id": "<appId>",
"hadoop.fs.azure.account.oauth2.client.secret": "<clientSecret>",
"hadoop.fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant>/oauth2/token",
"hadoop.fs.azure.createRemoteFileSystemDuringInitialization": "true"
{code}
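For option 2, the prefixing can be done programmatically rather than by hand. The following is only a minimal sketch using the JDK alone: the class name HadoopPrefixedOptions and both helper methods are hypothetical names chosen for illustration, not part of Hudi or Flink. It takes the plain fs.azure.* keys, adds the 'hadoop.' prefix to each, and renders them as the body of a Flink SQL WITH (...) clause, assuming the table is defined through SQL DDL.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not a Hudi or Flink class.
final class HadoopPrefixedOptions {

    /** Returns a copy of the given Hadoop properties with every key prefixed by "hadoop.". */
    static Map<String, String> withHadoopPrefix(Map<String, String> hadoopProps) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : hadoopProps.entrySet()) {
            out.put("hadoop." + e.getKey(), e.getValue());
        }
        return out;
    }

    /** Renders the options as 'key' = 'value' pairs, doubling single quotes inside values. */
    static String toWithClause(Map<String, String> options) {
        return options.entrySet().stream()
                .map(e -> "  '" + e.getKey() + "' = '" + e.getValue().replace("'", "''") + "'")
                .collect(Collectors.joining(",\n"));
    }

    public static void main(String[] args) {
        // Placeholders exactly as in the comment above; substitute real values outside of source control.
        Map<String, String> azure = new LinkedHashMap<>();
        azure.put("fs.azure.account.auth.type", "OAuth");
        azure.put("fs.azure.account.oauth.provider.type",
                "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider");
        azure.put("fs.azure.account.oauth2.client.id", "<appId>");
        azure.put("fs.azure.account.oauth2.client.secret", "<clientSecret>");
        azure.put("fs.azure.account.oauth2.client.endpoint",
                "https://login.microsoftonline.com/<tenant>/oauth2/token");
        azure.put("fs.azure.createRemoteFileSystemDuringInitialization", "true");

        // Prints the option lines to paste into the table's WITH (...) clause.
        System.out.println(toWithClause(withHadoopPrefix(azure)));
    }
}
```

The printed lines would go alongside the table's other options in the WITH clause. Keeping the unprefixed map as the single source means the same values can also be used to generate the core-site.xml from option 1.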
Thanks
> Does not work with Azure Data lake Gen2
> ---------------------------------------
>
> Key: HUDI-5103
> URL: https://issues.apache.org/jira/browse/HUDI-5103
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Dheeraj Panangat
> Priority: Major
> Labels: features, pull-request-available
>
> Unable to use Hudi with Flink on Azure Data Lake.
> It fails while looking up a configuration property:
> {code:java}
> Caused by: Configuration property <datalakeaccount>.dfs.core.windows.net not found.
>   at org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:372)
>   at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:1133)
>   at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:174)
> {code}
> The following properties are specified when running the Flink code:
> {code:java}
> "fs.azure.account.auth.type": "OAuth",
> "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
> "fs.azure.account.oauth2.client.id": "<appId>",
> "fs.azure.account.oauth2.client.secret": "<clientSecret>",
> "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant>/oauth2/token",
> "fs.azure.createRemoteFileSystemDuringInitialization": "true"
> {code}
> as per the document: [Microsoft Azure Spark to Data Lake Gen2|https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-use-databricks-spark]
>
>
> In AWS it works because the credentials are taken from the environment, but in
> Azure they have to come from the config, which is not propagated to the point
> where the FileSystem is initialized.