sugibuchi commented on issue #43197:
URL: https://github.com/apache/arrow/issues/43197#issuecomment-2224735341
> Could you share bad scenarios you think?
There are several Azure Blob File System implementations in Python, and we
frequently need to use multiple implementations in the same code. However,
1. Most ABFS implementations, except for Arrow's `AzureFileSystem`, do not assume that ABFS URLs can contain confidential information such as storage account keys.
2. It is not always clear which implementation a library actually uses.
   * PyArrow has had a native `AzureFileSystem` implementation since Arrow 16.0.0. However, `delta-io`, which is built on Arrow, uses the Rust `object_store` crate.
   * Pandas initially used fsspec's ABFS implementation but silently started using Arrow's native implementation after the release of Arrow 16.0.0 (#41496).
   * DuckDB has native ABFS support, but the Rust `object_store` crate is eventually used when reading Delta Lake into DuckDB via the [`ibis`](https://duckdb.org/docs/guides/python/ibis.html) API.
Because of 2, an Arrow-style ABFS URL containing a storage account key can accidentally be passed to a different ABFS implementation. As explained in 1, that implementation usually does not expect the passed URL to contain a storage account key.

This leads to the URL being rejected, with an error message like the one below exposed in error logs, etc.:
```
Invalid file path: abfs://my-container:(plain text of a storage account
key)@mycontainer.dfs.core.windows.net/...
```
Embedding storage account keys in ABFS URLs can therefore cause this kind of interoperability issue with other ABFS implementations.
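As a defensive measure when such a URL has to cross implementation boundaries, the credential embedded in the userinfo part can be masked before the URL is logged or handed to another library. The sketch below is a minimal illustration of that idea, not part of any of the libraries above; the function name and the `account:key@host` URL shape are assumptions:

```python
from urllib.parse import urlsplit, urlunsplit

def redact_abfs_credentials(url: str) -> str:
    """Mask a credential embedded in the userinfo part of an ABFS-style URL.

    Keeps the account name for debugging but replaces the key with '***',
    so the URL can be safely written to error logs. (Hypothetical helper,
    assuming an Arrow-style ``abfs://account:key@host/...`` layout.)
    """
    parts = urlsplit(url)
    if "@" in parts.netloc:
        userinfo, host = parts.netloc.rsplit("@", 1)
        account = userinfo.split(":", 1)[0]  # keep the account name, drop the key
        parts = parts._replace(netloc=f"{account}:***@{host}")
    return urlunsplit(parts)

redacted = redact_abfs_credentials(
    "abfs://myaccount:SECRETKEY@myaccount.dfs.core.windows.net/container/file.parquet"
)
print(redacted)
```

A URL without a userinfo section passes through unchanged, so the helper can be applied unconditionally before logging.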