[GitHub] [spark] viirya edited a comment on pull request #29804: [SPARK-32859][SQL] Introduce physical rule to decide bucketing dynamically

GitBox Tue, 29 Sep 2020 18:50:40 -0700


viirya edited a comment on pull request #29804:
URL: https://github.com/apache/spark/pull/29804#issuecomment-701110070



   I think it is easy to run into such an example. In case 1, the sub-plan from 
the root to the bucketed table scan does not contain a 
[[hasInterestingPartition]] operator. Suppose we cache a query plan like that, 
and another query then uses the cached plan with a 
[[hasInterestingPartition]] operator on top of it. In that case we won't do a 
bucketed scan, even though a bucketed scan would benefit the later query. It 
seems to me that this feature can easily cause unintentional regressions like 
that.
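
To make the scenario concrete, here is a toy model of it in plain Python. It is only a sketch, not the rule from this PR: `Node`, `decide_bucketing`, and the `interesting` flag (standing in for [[hasInterestingPartition]]) are hypothetical names, and it assumes the cached plan is planned on its own and then reused as-is by later queries.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A node in a toy physical plan (hypothetical, not a Spark class)."""
    name: str
    children: List["Node"] = field(default_factory=list)
    interesting: bool = False           # stands in for hasInterestingPartition
    bucketed_scan: bool = False         # True for a scan over a bucketed table
    use_buckets: Optional[bool] = None  # decision recorded on scan nodes


def decide_bucketing(node: Node, seen_interesting: bool = False) -> Node:
    """Toy rule: keep a bucketed scan only if some operator on the path
    from the root down to it is 'interesting'."""
    seen = seen_interesting or node.interesting
    if node.bucketed_scan:
        node.use_buckets = seen
    for child in node.children:
        decide_bucketing(child, seen)
    return node


# Query 1 is planned on its own and then cached:
# there is no interesting operator above the scan.
cached_scan = Node("Scan t (bucketed)", bucketed_scan=True)
cached_plan = decide_bucketing(Node("Filter", [cached_scan]))

# Query 2 puts a join on top of the cached plan. The cached plan is reused
# as an opaque leaf, so the rule never revisits the scan inside it.
later = Node("Join", [Node("CachedRelation")], interesting=True)
decide_bucketing(later)

# The same query without caching: the rule sees the join above the scan.
fresh_scan = Node("Scan t (bucketed)", bucketed_scan=True)
decide_bucketing(Node("Join", [Node("Filter", [fresh_scan])], interesting=True))

print(cached_scan.use_buckets)  # False
print(fresh_scan.use_buckets)   # True
```

In this model the cached scan stays non-bucketed even though the later query has a join above it, while the uncached version of the same query keeps the bucketed scan.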
   
   Maybe we should disable it by default? This feature can easily confuse users.





