[PR] fix: Extract ADLS account_name from URI hostname in FsspecFileIO [iceberg-python]

via GitHub Fri, 06 Feb 2026 02:07:19 -0800


antonlin1 opened a new pull request, #3005:
URL: https://github.com/apache/iceberg-python/pull/3005


   ## Summary
   - When `adls.account-name` is not in catalog/table properties (common for 
tables created by Spark/Hadoop), `FsspecFileIO` created `AzureBlobFileSystem` 
with `account_name=None`
   - adlfs `_strip_protocol()` strips 
`abfss://container@<account>.dfs.core.windows.net/path` to `container/path`, 
losing the storage account info, causing `FileNotFoundError`
   - The fix extracts `account_name` from the URI hostname as a last-resort 
fallback in `_adls()`, after SAS token extraction and explicit property checks
   
   ### Priority order for account_name resolution:
   1. Explicit `adls.account-name` property
   2. SAS token key extraction (existing behavior)
   3. **NEW**: URI hostname extraction (e.g. 
`usagestorageprod.dfs.core.windows.net` → `usagestorageprod`)
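
The priority order above can be sketched as a small standalone function. This is an illustration only, not the code in the PR: `resolve_account_name` and `SAS_TOKEN_PREFIX` are hypothetical names, the SAS token key shape (`adls.sas-token.<account>.<endpoint suffix>`) is an assumption, and the example URI simply combines the hostname quoted above with a placeholder container.

```python
from typing import Mapping, Optional
from urllib.parse import urlparse

# Hypothetical constant: assumed prefix of per-account SAS token property keys.
SAS_TOKEN_PREFIX = "adls.sas-token."


def resolve_account_name(
    properties: Mapping[str, str], location: Optional[str] = None
) -> Optional[str]:
    """Resolve the storage account name using the priority order listed above."""
    # 1. An explicit `adls.account-name` property always wins.
    explicit = properties.get("adls.account-name")
    if explicit:
        return explicit

    # 2. Otherwise take the account from a SAS token key
    #    (assumed shape: adls.sas-token.<account>.<endpoint suffix>).
    for key in properties:
        if key.startswith(SAS_TOKEN_PREFIX):
            remainder = key[len(SAS_TOKEN_PREFIX):]
            if remainder:
                return remainder.split(".")[0]

    # 3. Last resort: the first label of the URI hostname.
    if location:
        hostname = urlparse(location).hostname
        if hostname and "." in hostname:
            return hostname.split(".")[0]

    return None


# Illustrative URI: placeholder container plus the hostname from the example above.
uri = "abfss://container@usagestorageprod.dfs.core.windows.net/path"
print(resolve_account_name({}, uri))  # usagestorageprod
print(resolve_account_name({"adls.account-name": "explicit"}, uri))  # explicit
```

`urlparse(...).hostname` drops the `container@` userinfo part, so only the host is split on dots. A schemeless path such as `container/path` has no hostname, so the sketch returns `None` for it.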
   
   ### Root cause
   Spark/Java Iceberg uses Hadoop's `FileSystem` API, which resolves schemeless 
paths against `fs.defaultFS`. PyIceberg has no equivalent: `FsspecFileIO` 
relies entirely on `adls.account-name` in properties, which Spark-created 
tables typically don't set.
   
   ## Test plan
   - [x] `test_adls_account_name_extracted_from_uri_hostname` — verifies 
account extraction from full ABFSS URI
   - [x] `test_adls_account_name_not_overridden_when_in_properties` — verifies 
explicit property takes priority
   - [x] Existing `test_adls_account_name_sas_token_extraction` still passes 
(SAS token takes priority over hostname)
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


