disliketd commented on PR #53263:
URL: https://github.com/apache/spark/pull/53263#issuecomment-3612935127

   I have concerns about this approach. The motivating use case relies on
parsing string outputs from ```SHOW PARTITIONS``` to drive logic, which is an
anti-pattern compared to standard scalar subqueries
```(WHERE col = (SELECT MAX(col)...))```.
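
   For reference, a minimal sketch of the scalar-subquery pattern mentioned above. It uses Python's built-in `sqlite3` as a stand-in for Spark SQL, and the table and column names (`events`, `dt`, `value`) are hypothetical; the point is only that the filter value is computed inside the query rather than parsed from command output.

```python
import sqlite3

# Hypothetical table and columns; SQLite stands in for Spark SQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (dt TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2024-01-01", 1), ("2024-01-02", 2), ("2024-01-03", 3), ("2024-01-03", 4)],
)

# The scalar subquery returns a single value (the latest dt), and the
# engine sees the whole predicate instead of a string parsed by the client.
latest_rows = conn.execute(
    "SELECT dt, value FROM events "
    "WHERE dt = (SELECT MAX(dt) FROM events) ORDER BY value"
).fetchall()
```

Here `latest_rows` holds only the rows for the most recent `dt`, with no intermediate string handling on the client side.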
   
   Furthermore, blindly treating all ```CommandResult``` nodes as 'selective' 
```(hasSelectivePredicate = true)``` seems risky. If the command returns all 
partitions, we incur the DPP overhead without any pruning benefit. We shouldn't 
modify core optimizer heuristics to support a fragile query pattern.
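
   To make the cost argument concrete, here is a toy model, not Spark's actual heuristic: the function name, parameters, and numbers are all hypothetical. It only illustrates that when the command returns every partition, nothing is skipped and the filter overhead is pure cost.

```python
def pruning_pays_off(total_partitions, kept_partitions,
                     bytes_per_partition, filter_overhead_bytes):
    """Toy check: do the bytes skipped by pruning exceed the filter's cost?

    Illustrative only; this is not how Spark estimates DPP benefit.
    """
    skipped = total_partitions - kept_partitions
    bytes_saved = skipped * bytes_per_partition
    return bytes_saved > filter_overhead_bytes


# Command returns every partition: nothing is skipped, only overhead remains.
all_returned = pruning_pays_off(100, 100, 1_000_000, 50_000)

# Command returns a single partition: most of the scan is skipped.
one_returned = pruning_pays_off(100, 1, 1_000_000, 50_000)
```

Under these assumed numbers, `all_returned` is `False` and `one_returned` is `True`, which is why treating every ```CommandResult``` as selective regardless of what it returns looks unsafe.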


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

