lidavidm commented on code in PR #12977:
URL: https://github.com/apache/arrow/pull/12977#discussion_r857748824
##########
cpp/src/arrow/dataset/discovery.cc:
##########
@@ -278,8 +278,13 @@ Result<std::shared_ptr<Dataset>>
FileSystemDatasetFactory::Finish(FinishOptions
}
std::vector<std::shared_ptr<FileFragment>> fragments;
+ std::string fixed_path;
for (const auto& info : files_) {
- auto fixed_path = StripPrefixAndFilename(info.path(),
options_.partition_base_dir);
+ if (partitioning->type_name() == "filename") {
+ fixed_path = StripPrefix(info.path(), options_.partition_base_dir);
+ } else {
+ fixed_path = StripPrefixAndFilename(info.path(),
options_.partition_base_dir);
+ }
Review Comment:
Yeah, I think we can still strip the prefix, but we can let the partitioning
handle the rest.
And in that case it might make sense to split the remainder of the path and
the filename for the partitioning implementation too and just pass both as
arguments.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]