Re: [PR] [SPARK-54554][SQL] Enable Dynamic Partition Pruning with CommandResult [spark]

via GitHub Thu, 04 Dec 2025 07:55:11 -0800


disliketd commented on PR #53263:
URL: https://github.com/apache/spark/pull/53263#issuecomment-3612935127


   I have concerns about this approach. The motivating use case parses the 
string output of ```SHOW PARTITIONS``` to drive query logic, which is an 
anti-pattern compared with a standard scalar subquery such as 
```(WHERE col = (SELECT MAX(col)...))```.
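
   For illustration, here is a minimal sketch of the scalar-subquery pattern. It uses Python's built-in `sqlite3` as a stand-in for Spark SQL, and the `events` table, the `dt` column, and both function names are hypothetical, so treat it as a sketch under those assumptions and not as the query from the PR.

```python
import sqlite3


def latest_partition_rows(conn):
    """Select rows from the newest `dt` value with a scalar subquery.

    No command output is parsed: the engine computes MAX(dt) itself.
    """
    query = """
        SELECT id, dt
        FROM events
        WHERE dt = (SELECT MAX(dt) FROM events)
        ORDER BY id
    """
    return conn.execute(query).fetchall()


def demo():
    # Hypothetical table standing in for a date-partitioned fact table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER, dt TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [(1, "2025-12-01"), (2, "2025-12-02"), (3, "2025-12-03"), (4, "2025-12-03")],
    )
    rows = latest_partition_rows(conn)
    conn.close()
    return rows
```

   The filter value comes from the data itself, so nothing depends on the textual format of a command's output.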
   
   Furthermore, blindly treating every ```CommandResult``` node as 'selective' 
```(hasSelectivePredicate = true)``` seems risky. If the command returns all 
partitions, we pay the DPP overhead without any pruning benefit. We shouldn't 
modify core optimizer heuristics to support a fragile query pattern.
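
   The selectivity concern can be made concrete with a toy calculation. This is not Spark's cost model; `pruned_fraction` and the partition values are hypothetical and only show that a filter covering every partition prunes nothing.

```python
def pruned_fraction(fact_partitions, filter_values):
    """Fraction of fact-table partitions a pruning filter would let us skip.

    Toy model only: a partition is kept if its value appears in the filter set.
    """
    all_parts = set(fact_partitions)
    if not all_parts:
        return 0.0
    kept = all_parts & set(filter_values)
    return 1.0 - len(kept) / len(all_parts)


partitions = ["2025-12-01", "2025-12-02", "2025-12-03", "2025-12-04"]

# A genuinely selective filter: only the latest partition survives.
selective = pruned_fraction(partitions, ["2025-12-04"])

# A command that returns every partition: the filter removes nothing.
unselective = pruned_fraction(partitions, partitions)
```

   In the second case the pruned fraction is zero, so any cost spent building and applying the filter is pure overhead.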


