Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
    I believe that using a method like `TableFileCatalog.filterPartitions` to
build a new file catalog restricted to a set of pruned partitions is a sound
approach. However, I'm starting to reconsider the implementation. Specifically,
I think having that method return a `ListingFileCatalog` is the wrong thing to
do; it may be unnecessarily heavy-handed.
For one thing, a `ListingFileCatalog` performs a file tree traversal right
off the bat. However, the external catalog returns the locations of partitions
as part of the `listPartitionsByFilter` call. I believe that should suffice for
the purpose of building a query plan for metastore-backed tables and executing
it.
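    To illustrate what I have in mind, here's a rough sketch. The names below (`PartitionInfo`, `PrunedFileCatalog`, `buildPrunedCatalog`) are illustrative stand-ins, not the actual Spark classes or method signatures; the point is just that the pruned catalog can be seeded with the partition locations the external catalog already returned, rather than re-walking the table's file tree.

    ```scala
    // Sketch only: these types are stand-ins, not the real Spark internals.
    object PrunedCatalogSketch {

      // What listPartitionsByFilter conceptually returns: the partitions that
      // survive the predicate, each with its storage location.
      case class PartitionInfo(spec: Map[String, String], location: String)

      // A file catalog seeded directly with the pruned partition locations,
      // so no eager file tree traversal is needed at construction time.
      class PrunedFileCatalog(val rootPaths: Seq[String])

      def buildPrunedCatalog(pruned: Seq[PartitionInfo]): PrunedFileCatalog =
        new PrunedFileCatalog(pruned.map(_.location))

      def main(args: Array[String]): Unit = {
        val pruned = Seq(
          PartitionInfo(Map("dt" -> "2016-08-01"), "hdfs:///warehouse/t/dt=2016-08-01"),
          PartitionInfo(Map("dt" -> "2016-08-02"), "hdfs:///warehouse/t/dt=2016-08-02"))
        val catalog = buildPrunedCatalog(pruned)
        // Only the surviving partitions' paths end up in the catalog's roots.
        println(catalog.rootPaths)
      }
    }
    ```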
    The current implementation works, but I'm going to explore this alternative,
since it should be more efficient. If I can make it work without too much
complexity, I'll push it as a new commit to this PR.