BlakeOrth commented on code in PR #18146:
URL: https://github.com/apache/datafusion/pull/18146#discussion_r2446211441
##########
datafusion/catalog-listing/src/helpers.rs:
##########
@@ -239,105 +238,6 @@ pub async fn list_partitions(
Ok(out)
}
-async fn prune_partitions(
Review Comment:
@alamb I've reviewed the changes and discussion in the PR you referenced, and
the big performance boost there appears to come mostly from listing partitions
in parallel. There is likely a way to get the same parallelism with the code
in this PR, but I admit I haven't explored that yet. I may also have wrongly
assumed that `object_store::list()` is already parallelized in some manner,
since it returns an unordered stream of results. A synthetic benchmark seems
like a good place to start exploring potential solutions. Is there any prior
art I can reference to build a benchmark in DataFusion?
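
To make the parallelism question concrete, here is a minimal, std-only sketch of the kind of synthetic comparison I have in mind. Everything in it is an assumption for illustration: `list_prefix` is a hypothetical stand-in that sleeps to simulate the latency of one listing request, and nothing here calls the real `object_store` crate or DataFusion's listing code.

```rust
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical stand-in for one listing request against a single partition
// prefix. The sleep simulates request latency; no real object store is used.
fn list_prefix(prefix: &str, files_per_prefix: usize, latency: Duration) -> Vec<String> {
    thread::sleep(latency);
    (0..files_per_prefix)
        .map(|i| format!("{prefix}/file_{i}.parquet"))
        .collect()
}

// List every prefix one after another.
fn list_sequential(prefixes: &[String], files_per_prefix: usize, latency: Duration) -> Vec<String> {
    prefixes
        .iter()
        .flat_map(|p| list_prefix(p, files_per_prefix, latency))
        .collect()
}

// List every prefix on its own scoped thread, then merge the results in
// prefix order.
fn list_parallel(prefixes: &[String], files_per_prefix: usize, latency: Duration) -> Vec<String> {
    thread::scope(|s| {
        // Collect the handles first so all threads are spawned before any join.
        let handles: Vec<_> = prefixes
            .iter()
            .map(|p| s.spawn(move || list_prefix(p, files_per_prefix, latency)))
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("listing thread panicked"))
            .collect::<Vec<String>>()
    })
}

// Run a closure and report how long it took.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}
```

Wrapping each strategy in `timed`, for example `timed(|| list_parallel(&prefixes, 4, latency))`, gives a rough wall-clock comparison. With simulated latency the sequential time should grow with the number of prefixes while the parallel time stays near a single request, but a real benchmark would need an actual store and an async runtime to say anything about this PR.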
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]