westonpace commented on PR #12625: URL: https://github.com/apache/arrow/pull/12625#issuecomment-1104419660
Originally my goal was to avoid reimplementing the glob logic. However, at this point, I think we have to do that for Windows anyway, so it might not add as much complexity as I was concerned about. I'm not entirely sure I agree that local filesystems and remote filesystems would implement glob in exactly the same way. For example, if the glob is `/foo/bar*.txt` then a remote filesystem would probably issue a prefix request for `/foo/bar` and filter from there, which might be more efficient than a directory-crawling approach that would issue a prefix request for `/foo` and then filter from there. However, I agree that it would be much simpler to have a user utility and not modify the filesystem API. It would also give us glob support for all filesystems immediately instead of waiting for support to be added one by one. I also don't think the performance difference between the two approaches would matter in most cases. So yes, I think Antoine is right. Apologies for not suggesting this approach sooner.
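To make the `/foo/bar*.txt` comparison concrete, here is a minimal Python sketch of the two strategies. It is only an illustration under stated assumptions: `PATHS`, `list_prefix`, `glob_by_prefix`, and `glob_by_directory` are hypothetical names that stand in for a filesystem listing call, and none of them are part of the Arrow filesystem API. Both strategies return the same matches; they differ only in how many candidate paths the listing has to return before filtering.

```python
import fnmatch

# Hypothetical flat listing standing in for the contents of a filesystem.
PATHS = [
    "/foo/bar1.txt",
    "/foo/bar2.txt",
    "/foo/bar_dir/x.txt",
    "/foo/baz.txt",
    "/foo/sub/bar3.txt",
    "/other/bar.txt",
]


def list_prefix(prefix):
    """Stand-in for a prefix request: every path that starts with `prefix`."""
    return [p for p in PATHS if p.startswith(prefix)]


def literal_prefix(pattern):
    """Longest leading part of `pattern` that contains no glob metacharacters."""
    for i, ch in enumerate(pattern):
        if ch in "*?[":
            return pattern[:i]
    return pattern


def matches(pattern, path):
    """Match segment by segment so that '*' never crosses a '/'."""
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(pattern_parts) != len(path_parts):
        return False
    return all(
        fnmatch.fnmatchcase(segment, pat)
        for segment, pat in zip(path_parts, pattern_parts)
    )


def glob_by_prefix(pattern):
    """Remote-style strategy: list by the literal prefix, then filter."""
    candidates = list_prefix(literal_prefix(pattern))
    return candidates, [p for p in candidates if matches(pattern, p)]


def glob_by_directory(pattern):
    """Crawling strategy: list the enclosing directory, then filter."""
    prefix = literal_prefix(pattern)
    directory = prefix[: prefix.rfind("/") + 1]
    candidates = list_prefix(directory)
    return candidates, [p for p in candidates if matches(pattern, p)]


if __name__ == "__main__":
    for strategy in (glob_by_prefix, glob_by_directory):
        candidates, matched = strategy("/foo/bar*.txt")
        print(strategy.__name__, len(candidates), matched)
```

In this toy listing the prefix strategy inspects fewer candidates than the directory strategy, which is the efficiency difference described above. The generic utility can still be written once against whatever listing call a filesystem already provides.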
