viirya edited a comment on pull request #29804: URL: https://github.com/apache/spark/pull/29804#issuecomment-701110070
I think it is easy to run into such an example. In case 1, the sub-plan from the root to the bucketed table scan does not contain any [[hasInterestingPartition]] operator. Suppose we cache a query plan like that, and another query reuses the cached plan and puts an operator with [[hasInterestingPartition]] on top of it. Then we won't do a bucketed scan, even though a bucketed scan could benefit the later query. It seems to me that this feature can easily cause unintentional regressions like that. Maybe we should disable it by default? This feature can easily confuse users.
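To make the scenario concrete, here is a rough sketch of the kind of query I have in mind. The table names `t1` and `t2`, the column names, and the bucket count are hypothetical and chosen only for illustration. Broadcast joins are turned off so that the join on top of the cached plan actually cares about partitioning. Whether the final plan shows an extra shuffle depends on the Spark version and on how this rule ends up being configured, so the sketch only prints the plan rather than asserting anything about it.

```scala
import org.apache.spark.sql.SparkSession

object CachedBucketedScanSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cached-bucketed-scan-sketch")
      // Disable broadcast joins so the small demo tables use a join
      // that requires a specific partitioning of its children.
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical tables, both bucketed on "key" into 8 buckets.
    Seq((1, "a1"), (2, "a2"), (3, "a3")).toDF("key", "value")
      .write.mode("overwrite").bucketBy(8, "key").saveAsTable("t1")
    Seq((1, "b1"), (2, "b2"), (3, "b3")).toDF("key", "other")
      .write.mode("overwrite").bucketBy(8, "key").saveAsTable("t2")

    // Case 1: only a filter sits between the root and the bucketed scan,
    // so the plan has no join or aggregate that cares about partitioning.
    val cached = spark.table("t1").filter($"value" =!= "a1").cache()
    cached.count() // materialize the cache with whatever scan was chosen

    // A later query puts a join on top of the cached plan. The scan
    // decision was already made when the cached plan was planned.
    val joined = cached.join(spark.table("t2"), "key")
    joined.explain()

    spark.stop()
  }
}
```

Comparing the `explain()` output with and without the `.cache()` call should show whether the cached side still reports the bucketed partitioning to the join.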
