wjones127 commented on code in PR #34170:
URL: https://github.com/apache/arrow/pull/34170#discussion_r1115026290
##########
cpp/src/arrow/filesystem/localfs.cc:
##########
@@ -228,7 +247,12 @@ Status StatSelector(const PlatformFilename& dir_fn, const FileSelector& select,
for (const auto& child_fn : *result) {
PlatformFilename full_fn = dir_fn.Join(child_fn);
- ARROW_ASSIGN_OR_RAISE(FileInfo info, StatFile(full_fn.ToNative()));
+ FileInfo info;
+ if (select.needs_extended_file_info == true) {
+ ARROW_ASSIGN_OR_RAISE(info, StatFile(full_fn.ToNative()));
+ } else {
+ ARROW_ASSIGN_OR_RAISE(info, IdentifyFile(full_fn.ToNative()));
Review Comment:
From the [docs on `std::filesystem::status`](https://en.cppreference.com/w/cpp/filesystem/status)
(which `std::filesystem::is_directory` calls):

> The information provided by this function is usually also provided as a
> byproduct of directory iteration, and may be obtained by the member functions
> of [filesystem::directory_entry](https://en.cppreference.com/w/cpp/filesystem/directory_entry).
> During directory iteration, calling status again is unnecessary.

The original example, which looked more performant, used
`std::filesystem::directory_iterator`. This approach instead appears to make a
separate call for each path rather than reusing the information the directory
iterator already provides. Is there a reason you moved away from the original
approach?