martin-traverse commented on issue #47353:
URL: https://github.com/apache/arrow/issues/47353#issuecomment-3195965432

   Hi - thanks for getting back to me. I had already tried credential_kind = cli, 
which didn't work. Using credential_kind = workload_identity now gives a 
different error:
   
       Traceback (most recent call last):
         File "/__w/tracdap/tracdap/tracdap-runtime/python/src/tracdap/rt/_impl/core/storage.py", line 575, in _wrap_operation
           return func(operation_name, storage_path, *args, **kwargs)
         File "/__w/tracdap/tracdap/tracdap-runtime/python/src/tracdap/rt/_impl/core/storage.py", line 391, in _mkdir
           prior_stat: pa_fs.FileInfo = self._fs.get_file_info(resolved_path)
                                        ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
         File "pyarrow/_fs.pyx", line 615, in pyarrow._fs.FileSystem.get_file_info
         File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
         File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
       pyarrow.lib.ArrowException: Unknown error: Check for Hierarchical Namespace support on 'https://tracdapcistorage1.blob.core.windows.net/tracdap-ci-storage' failed:
           N5Azure4Core11Credentials23AuthenticationExceptionE:
           WorkloadIdentityCredential authentication unavailable. Azure Kubernetes environment is not set up correctly.
   
   I'm not certain whether Arrow should be doing workload_identity in code, or 
whether it should be doing cli (because the login already happened in a GitHub 
Action), in which case the problem is that it doesn't accept the cli login for a 
workload service principal.
   
   One thing that might be relevant - I had a look at the release notes for 
Azure identity SDK and it seems there have been quite a few fixes / additions 
recently:
   
   
https://github.com/Azure/azure-sdk-for-cpp/blob/main/sdk/identity/azure-identity/CHANGELOG.md
   
   I'm not entirely clear on how the Azure SDK version in the Arrow build 
scripts relates to the individual component libraries, but it seems safe to say 
there are several recent fixes / features that are not being picked up yet. 
Perhaps just moving the Azure identity SDK to the latest stable release might 
sort this out?
   
   The best result obviously is that we can use the AzureFileSystem() 
constructor with just account_name and it works automatically through the 
default mechanism. We can do explicit setup with from_uri() if need be although 
it's slightly less obvious for a long-lived fs object.
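
For reference, the two setup styles can be sketched roughly as below. This is 
only an illustration, not our actual plugin code: the helper names 
(build_abfs_uri, open_default, open_explicit) are made up for this comment, and 
the abfs:// URI layout is my reading of the Arrow docs, so treat it as an 
assumption. pyarrow is imported lazily so the URI helper can be tried without 
an Azure-enabled build.

```python
from urllib.parse import urlencode


def build_abfs_uri(account_name: str, container: str, credential_kind: str) -> str:
    """Build an abfs:// URI that names the credential kind explicitly.

    Assumed layout: abfs://<container>@<account>.dfs.core.windows.net/?credential_kind=<kind>
    """
    query = urlencode({"credential_kind": credential_kind})
    return f"abfs://{container}@{account_name}.dfs.core.windows.net/?{query}"


def open_default(account_name: str):
    """Option 1: constructor with just account_name, relying on the default credential chain."""
    # Lazy import: requires a pyarrow build with Azure support.
    from pyarrow import fs as pa_fs

    return pa_fs.AzureFileSystem(account_name=account_name)


def open_explicit(account_name: str, container: str, credential_kind: str):
    """Option 2: explicit setup through from_uri(), selecting the credential kind in the URI."""
    from pyarrow import fs as pa_fs

    # from_uri() returns a (filesystem, path) pair.
    uri = build_abfs_uri(account_name, container, credential_kind)
    return pa_fs.FileSystem.from_uri(uri)


if __name__ == "__main__":
    # Account and container names are the ones shown in the error message above.
    print(build_abfs_uri("tracdapcistorage1", "tracdap-ci-storage", "cli"))
```

With option 2 the credential kind is fixed at construction time, which is why 
it feels less natural for a long-lived fs object than letting the default 
mechanism pick a credential.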
   
   You can see our [setup code here](https://github.com/martin-traverse/tracdap/blob/feature/azure-arrow-native/tracdap-runtime/python/src/tracdap/rt/_plugins/storage_azure.py) if you like - it's pretty straightforward.


