[ https://issues.apache.org/jira/browse/HUDI-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17634648#comment-17634648 ]

Dheeraj Panangat commented on HUDI-5103:
----------------------------------------

There are a couple of ways to resolve this:
1. Add a core-site.xml to the resources directory on the classpath:
{code:xml}
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.hdfs.impl</name>
        <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
    </property>
    <property>
        <name>fs.file.impl</name>
        <value>org.apache.hadoop.fs.LocalFileSystem</value>
    </property>
    <property>
        <name>fs.azure.account.auth.type</name>
        <value>OAuth</value>
    </property>
    <property>
        <name>fs.azure.account.oauth.provider.type</name>
        <value>org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider</value>
    </property>
    <property>
        <name>fs.azure.account.oauth2.client.id</name>
        <value>****</value>
    </property>
    <property>
        <name>fs.azure.account.oauth2.client.secret</name>
        <value>**********</value>
    </property>
    <property>
        <name>fs.azure.account.oauth2.client.endpoint</name>
        <value>https://login.microsoftonline.com/***********/oauth2/token</value>
    </property>
    <property>
        <name>fs.azure.createRemoteFileSystemDuringInitialization</name>
        <value>true</value>
    </property>
</configuration>
{code}
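For reference, the core-site.xml above follows Hadoop's plain name/value property schema. As a self-contained illustration of that schema (not Hadoop's actual loader, which additionally handles resource precedence and variable expansion), a minimal parser:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class CoreSiteReader {

    // Parse a Hadoop-style <configuration><property><name/><value/></property>...
    // document into a name -> value map.
    public static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            throw new RuntimeException("failed to parse core-site.xml content", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<configuration><property>"
                + "<name>fs.azure.account.auth.type</name><value>OAuth</value>"
                + "</property></configuration>";
        System.out.println(CoreSiteReader.parse(xml)); // prints {fs.azure.account.auth.type=OAuth}
    }
}
```

Placing the real file under src/main/resources makes it visible on the classpath, where Hadoop's own Configuration picks it up automatically.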

2. Pass the same properties in the Flink table config, prefixed with 'hadoop.':
{code:java}
"hadoop.fs.azure.account.auth.type": "OAuth",
"hadoop.fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
"hadoop.fs.azure.account.oauth2.client.id": "<appId>",
"hadoop.fs.azure.account.oauth2.client.secret": "<clientSecret>",
"hadoop.fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant>/oauth2/token",
"hadoop.fs.azure.createRemoteFileSystemDuringInitialization": "true"
{code}
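The 'hadoop.' prefix convention means options carrying that prefix are stripped of it and forwarded to the Hadoop Configuration, while other table options stay with Flink. A minimal sketch of that convention (the class and method names here are illustrative, not Hudi's actual internals):

```java
import java.util.HashMap;
import java.util.Map;

public class HadoopPrefixOptions {

    static final String HADOOP_PREFIX = "hadoop.";

    // Collect every option starting with "hadoop.", strip the prefix, and
    // return the remainder as Hadoop configuration key/value pairs.
    public static Map<String, String> stripHadoopPrefix(Map<String, String> flinkOptions) {
        Map<String, String> hadoopConf = new HashMap<>();
        for (Map.Entry<String, String> e : flinkOptions.entrySet()) {
            if (e.getKey().startsWith(HADOOP_PREFIX)) {
                hadoopConf.put(e.getKey().substring(HADOOP_PREFIX.length()), e.getValue());
            }
        }
        return hadoopConf;
    }

    public static void main(String[] args) {
        Map<String, String> opts = new HashMap<>();
        opts.put("hadoop.fs.azure.account.auth.type", "OAuth");
        opts.put("connector", "hudi"); // non-prefixed options are left alone
        System.out.println(stripHadoopPrefix(opts)); // prints {fs.azure.account.auth.type=OAuth}
    }
}
```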
Thanks

> Does not work with Azure Data lake Gen2
> ---------------------------------------
>
>                 Key: HUDI-5103
>                 URL: https://issues.apache.org/jira/browse/HUDI-5103
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Dheeraj Panangat
>            Priority: Major
>              Labels: features, pull-request-available
>
> Unable to use Hudi with Flink on Azure Data Lake Gen2.
> Initialization fails while looking for the storage account key:
> {code:java}
> Caused by: Configuration property <datalakeaccount>.dfs.core.windows.net not found.
>     at org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:372)
>     at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:1133)
>     at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:174)
>  {code}
> The following properties are specified when running the Flink code:
> {code:java}
> "fs.azure.account.auth.type": "OAuth",
> "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
> "fs.azure.account.oauth2.client.id": "<appId>",
> "fs.azure.account.oauth2.client.secret": "<clientSecret>",
> "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant>/oauth2/token",
> "fs.azure.createRemoteFileSystemDuringInitialization": "true"
> {code}
> as per the document: [Microsoft Azure Spark to Data Lake Gen2|https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-use-databricks-spark]
>  
>  
> In AWS this works because the credentials are picked up from the environment, but in
> Azure they must come from the configuration, which does not propagate to the point
> where the FileSystem is initialized.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
