[ 
https://issues.apache.org/jira/browse/ARROW-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlos O'Ryan updated ARROW-15121:
----------------------------------
    Description: 
The current implementation ignores the {{max_recursion}} attribute in the 
selector.  Honoring it seems like a useful thing to do.

In GCS it is *more* expensive to do {{ls foo/}} and then recurse over the 
results than to do a {{ls -R foo/}}.  The running time of a (recursive or 
non-recursive) operation is proportional to the number of objects in the 
prefix, not to the number of objects returned.

Therefore, the implementation will probably list all the objects and 
directories, and simply filter out those that are "too deep" in the recursion 
hierarchy.

  was:
The current implementation ignores the {{max_recursion}} attribute in the 
selector.  Seems like a useful thing to do.

In GCS it is *more* expensive to do {{ls foo/*}} and then recurse over the 
results than to do a {{ls foo/**}}.  The running time of a (recursive or 
non-recursive) operation is proportional to the number of objects in the 
prefix, not to the number of objects returned.

Therefore, the implementation will probably list all the objects and 
directories, and simply filter out those that are "too deep" in the recursion 
hierarchy.


> [C++] Implement max recursion for GcsFileSystem
> -----------------------------------------------
>
>                 Key: ARROW-15121
>                 URL: https://issues.apache.org/jira/browse/ARROW-15121
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Carlos O'Ryan
>            Priority: Major
>
> The current implementation ignores the {{max_recursion}} attribute in the 
> selector.  Honoring it seems like a useful thing to do.
> In GCS it is *more* expensive to do {{ls foo/}} and then recurse over the 
> results than to do a {{ls -R foo/}}.  The running time of a (recursive or 
> non-recursive) operation is proportional to the number of objects in the 
> prefix, not to the number of objects returned.
> Therefore, the implementation will probably list all the objects and 
> directories, and simply filter out those that are "too deep" in the recursion 
> hierarchy.
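
The "list everything, then filter" approach described above could look 
roughly like the sketch below.  This is only an illustration, not the actual 
GcsFileSystem code: {{DepthBelowPrefix}} and {{FilterByMaxRecursion}} are 
hypothetical helpers, object names are plain strings rather than GCS 
metadata, and it assumes that entries directly under the prefix have depth 0 
and are kept when their depth is at most {{max_recursion}}.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: number of directory levels between `prefix` and `name`.
// Assumes `name` starts with `prefix`, and that `prefix` is empty or ends in '/'.
// A trailing '/' on `name` (a directory marker) is not counted as a level.
int32_t DepthBelowPrefix(const std::string& prefix, const std::string& name) {
  std::string relative = name.substr(prefix.size());
  if (!relative.empty() && relative.back() == '/') relative.pop_back();
  return static_cast<int32_t>(
      std::count(relative.begin(), relative.end(), '/'));
}

// Hypothetical filter applied to the names returned by one full listing of
// `prefix`. Keeps only the entries that are at most `max_recursion` levels
// below the prefix (assumed semantics: direct children have depth 0).
std::vector<std::string> FilterByMaxRecursion(
    const std::string& prefix, const std::vector<std::string>& names,
    int32_t max_recursion) {
  assert(max_recursion >= 0);
  std::vector<std::string> kept;
  for (const auto& name : names) {
    // Skip the marker for the base directory itself and anything that is
    // not actually under the prefix.
    if (name.size() <= prefix.size() ||
        name.compare(0, prefix.size(), prefix) != 0) {
      continue;
    }
    if (DepthBelowPrefix(prefix, name) <= max_recursion) {
      kept.push_back(name);
    }
  }
  return kept;
}
```

With the names {{foo/}}, {{foo/a.txt}}, {{foo/bar/}}, {{foo/bar/b.txt}} and 
{{foo/bar/baz/c.txt}}, a {{max_recursion}} of 1 would keep {{foo/a.txt}}, 
{{foo/bar/}} and {{foo/bar/b.txt}}.  The sketch does not synthesize implicit 
directories (such as {{foo/bar/baz/}} when no marker object exists); a real 
implementation would probably need to handle those too.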



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
