GitHub user tustvold added a comment to the discussion: Azure Storage Gen2 (hierarchical namespaces) - use DFS endpoint to improve performance
If anything, I would expect the blob endpoint to be faster: listing there is literally just a range scan on a sorted map of keys, most likely some sort of BTreeMap. It does not get simpler from a data-retrieval perspective. The DFS endpoint, by comparison, is significantly more complicated to implement, as it has to do a recursive traversal, performing a separate range scan at each level.

Now the reality is that when hierarchical namespaces are enabled, Azure is probably forced to store data in a hierarchical fashion, which will make recursive listing more expensive for both APIs alike. I would not expect either API to perform better than the other for this use case; the fact that directories exist in the data model will hurt both equally.

GitHub link: https://github.com/apache/arrow-rs-object-store/discussions/481#discussioncomment-14592052

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
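To make the contrast in the comment above concrete, here is a minimal sketch in Rust. It is a toy model only: it assumes a flat store is a `BTreeMap` keyed by object path, as the comment guesses, and it says nothing about how Azure actually implements either endpoint. The function names `list_flat`, `list_level`, and `list_recursive` are hypothetical and are not part of `object_store` or any Azure SDK.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Flat listing (toy model of a blob-style list): one range scan over the
/// sorted keys, starting at `prefix` and stopping at the first key that no
/// longer shares it.
fn list_flat(store: &BTreeMap<String, u64>, prefix: &str) -> Vec<String> {
    store
        .range(prefix.to_string()..)
        .take_while(|(key, _)| key.starts_with(prefix))
        .map(|(key, _)| key.clone())
        .collect()
}

/// One level of a directory-style listing: the objects directly under `dir`
/// plus the names of its immediate subdirectories. `dir` must end with '/'.
/// This naive version walks every key under `dir`; it only illustrates the
/// shape of the work, not an optimized implementation.
fn list_level(store: &BTreeMap<String, u64>, dir: &str) -> (Vec<String>, BTreeSet<String>) {
    let mut files = Vec::new();
    let mut dirs = BTreeSet::new();
    for (key, _) in store
        .range(dir.to_string()..)
        .take_while(|(k, _)| k.starts_with(dir))
    {
        let rest = &key[dir.len()..];
        match rest.find('/') {
            // A further '/' means the key lives in a subdirectory of `dir`.
            Some(idx) => {
                dirs.insert(format!("{}{}", dir, &rest[..=idx]));
            }
            None => files.push(key.clone()),
        }
    }
    (files, dirs)
}

/// Recursive listing (toy model of a hierarchical traversal): one scan per
/// directory visited. `scans` counts how many separate scans were issued.
fn list_recursive(store: &BTreeMap<String, u64>, dir: &str, scans: &mut usize) -> Vec<String> {
    *scans += 1;
    let (mut files, dirs) = list_level(store, dir);
    for sub in dirs {
        files.extend(list_recursive(store, &sub, scans));
    }
    files
}
```

In this model, `list_flat` answers a recursive listing with a single pass over a contiguous key range, while `list_recursive` issues one scan per directory it visits, so the number of scans grows with the size of the directory tree.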
