[
https://issues.apache.org/jira/browse/ARROW-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Carlos O'Ryan updated ARROW-15121:
----------------------------------
Description:
The current implementation ignores the {{max_recursion}} attribute in the
selector. Honoring it seems like a useful thing to do.
In GCS it is *more* expensive to do {{ls foo/}} and then recurse over the
results than to do a {{ls -R foo/}}. The running time of a (recursive or
non-recursive) operation is proportional to the number of objects in the
prefix, not to the number of objects returned.
Therefore, the implementation will probably list all the objects and
directories, and simply filter out those that are "too deep" in the recursion
hierarchy.
was:
The current implementation ignores the {{max_recursion}} attribute in the
selector. Seems like a useful thing to do.
In GCS it is *more* expensive to do {{ls foo/*}} and then recurse over the
results than to do a {{ls foo/**}}. The running time of a (recursive or
non-recursive) operation is proportional to the number of objects in the
prefix, not to the number of objects returned.
Therefore, the implementation will probably list all the objects and
directories, and simply filter out those that are "too deep" in the recursion
hierarchy.
> [C++] Implement max recursion for GcsFileSystem
> -----------------------------------------------
>
> Key: ARROW-15121
> URL: https://issues.apache.org/jira/browse/ARROW-15121
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Carlos O'Ryan
> Priority: Major
>
> The current implementation ignores the {{max_recursion}} attribute in the
> selector. Honoring it seems like a useful thing to do.
> In GCS it is *more* expensive to do {{ls foo/}} and then recurse over the
> results than to do a {{ls -R foo/}}. The running time of a (recursive or
> non-recursive) operation is proportional to the number of objects in the
> prefix, not to the number of objects returned.
> Therefore, the implementation will probably list all the objects and
> directories, and simply filter out those that are "too deep" in the recursion
> hierarchy.
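
The "list everything, then filter by depth" idea in the description could look
roughly like the sketch below. It is only an illustration, not the GcsFileSystem
code: {{RelativeDepth}} and {{FilterByMaxRecursion}} are hypothetical helper
names, the object names are assumed to arrive as plain strings from a single
recursive listing of the prefix, and it assumes that {{max_recursion == 0}}
means "direct children of the prefix only".

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: number of directory levels between `prefix` and `name`.
// Relative to "foo/", "foo/a.txt" has depth 0 and "foo/bar/b.txt" has depth 1.
// A trailing '/' (a directory marker object) is not counted as a level.
// Precondition: `name` starts with `prefix`.
int RelativeDepth(const std::string& prefix, const std::string& name) {
  std::string rest = name.substr(prefix.size());
  if (!rest.empty() && rest.back() == '/') rest.pop_back();
  return static_cast<int>(std::count(rest.begin(), rest.end(), '/'));
}

// Hypothetical helper: keep only the names under `prefix` whose depth does not
// exceed `max_recursion`. `names` stands in for the result of one recursive
// listing of the prefix; no additional listing requests are made here.
std::vector<std::string> FilterByMaxRecursion(
    const std::vector<std::string>& names, const std::string& prefix,
    std::int32_t max_recursion) {
  std::vector<std::string> kept;
  for (const auto& name : names) {
    // Skip the prefix itself and anything that is not under it.
    if (name.size() <= prefix.size() ||
        name.compare(0, prefix.size(), prefix) != 0) {
      continue;
    }
    if (RelativeDepth(prefix, name) <= max_recursion) kept.push_back(name);
  }
  return kept;
}
```

The filtering happens entirely on the client side, so it does not reduce the
cost of the listing itself. That matches the reasoning above: the listing cost
depends on the number of objects under the prefix either way.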