Github user Parth-Brahmbhatt commented on the issue:
https://github.com/apache/spark/pull/14817
@hvanhovell I looked at AlterTableRecoverPartitionsCommand, and while the
parallelism in listing could help, it would still cause a huge perf penalty.
We have tables with millions of partitions, and we use S3 for storage, where
listing is more expensive. I think it is much better to just stop listing once
we know that the stat, which is used only for join optimization, won't meet
the threshold, and I don't see a downside compared to what we currently offer.
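
To illustrate the early-stop idea, here is a minimal sketch in plain Python, not the actual Spark code. The names `size_up_to_threshold` and `fake_listing` are hypothetical, and the threshold stands in for whatever size limit the join optimization compares against. The point is only that the listing is consumed lazily and abandoned as soon as the running total passes the threshold:

```python
from typing import Iterable, Tuple


def size_up_to_threshold(file_sizes: Iterable[int], threshold: int) -> Tuple[int, bool]:
    """Sum file sizes lazily and stop once the total exceeds `threshold`.

    Returns (bytes_seen_so_far, exceeded). When `exceeded` is True the total
    is only a lower bound, which is enough to rule the table out.
    """
    total = 0
    for size in file_sizes:
        total += size
        if total > threshold:
            # No need to list any further: the table is already too large.
            return total, True
    return total, False


# Hypothetical stand-in for an expensive listing (e.g. one call per partition).
listed = []


def fake_listing(num_partitions: int, bytes_per_partition: int):
    for i in range(num_partitions):
        listed.append(i)  # record that this partition was actually listed
        yield bytes_per_partition


total, exceeded = size_up_to_threshold(fake_listing(1_000_000, 4), threshold=10)
print(total, exceeded, len(listed))
```

With a generator as input, only the partitions needed to cross the threshold are ever listed; the rest of the million are never touched.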