GitHub user tustvold added a comment to the discussion: Azure Storage Gen2 
(hierarchical namespaces) - use DFS endpoint to improve performance

If anything, I would expect the blob endpoint to be faster: it is literally
just a range scan on a sorted map of keys, most likely some sort of BTreeMap.
From a data-retrieval perspective, it does not get simpler than that.
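
To illustrate what a flat listing looks like under that model, here is a minimal sketch using an in-memory `BTreeMap`. It assumes object keys are plain strings mapped to a size; `list_prefix` is a hypothetical helper for illustration, not part of `object_store` and not a description of Azure's actual implementation.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical flat listing: return every key that starts with `prefix`.
/// Because the map is sorted, all matching keys are contiguous, so one
/// range scan starting at `prefix` is enough.
fn list_prefix<'a>(store: &'a BTreeMap<String, u64>, prefix: &str) -> Vec<&'a str> {
    store
        // Seek to the first key >= prefix.
        .range::<str, _>((Bound::Included(prefix), Bound::Unbounded))
        // Stop as soon as a key no longer shares the prefix.
        .take_while(|(key, _)| key.starts_with(prefix))
        .map(|(key, _)| key.as_str())
        .collect()
}
```

The cost is one seek plus a linear walk over the matching keys, regardless of how many `/` separators those keys contain.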

The DFS endpoint, by comparison, is significantly more complicated to
implement, as it has to do a recursive traversal, performing a separate range
scan at each level of the hierarchy.
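
For contrast, a recursive listing over a hierarchical layout might be sketched as follows. The `Node` type and `list_recursive` function are assumptions made purely for illustration (each directory holds its own sorted map of children); they say nothing about how Azure actually stores hierarchical namespaces.

```rust
use std::collections::BTreeMap;

/// Hypothetical hierarchical model: a directory owns a sorted map of children.
enum Node {
    File,
    Dir(BTreeMap<String, Node>),
}

/// Recursively collect the full path of every file below `dir`.
/// Each directory level requires its own scan over that level's map.
fn list_recursive(dir: &BTreeMap<String, Node>, path: &str, out: &mut Vec<String>) {
    for (name, node) in dir {
        // Build the full path for this entry.
        let full = if path.is_empty() {
            name.clone()
        } else {
            format!("{}/{}", path, name)
        };
        match node {
            Node::File => out.push(full),
            // Descend: a separate scan per sub-directory.
            Node::Dir(children) => list_recursive(children, &full, out),
        }
    }
}
```

Unlike a single scan over a flat sorted map, the work here is split into one scan per directory, which is the extra complexity described above.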

Now, the reality is that when hierarchical namespaces are enabled, Azure is
probably forced to store data in a hierarchical fashion, which will make
recursive listing more expensive for both APIs. I would not expect either API
to perform better than the other for this use case; the fact that directories
exist in the data model will hurt both equally.

GitHub link: 
https://github.com/apache/arrow-rs-object-store/discussions/481#discussioncomment-14592052
