Re: [PR] [SPARK-47672][SQL] Avoid double eval from filter pushDown w/ projection pushdown [spark]

via GitHub Mon, 05 Jan 2026 05:27:59 -0800


cloud-fan commented on PR #46143:
URL: https://github.com/apache/spark/pull/46143#issuecomment-3710426392


   To discuss https://github.com/apache/spark/pull/46143#discussion_r2643476999 
further:
   
   > Yes that's true, but given your previous statement around how adding 
projections is not free I don't think that's the right way to structure this.
   
   That's why my initial suggestion was to not do this optimization at all. We 
just create a single `Filter` and place it above the `Project`. By doing so we 
avoid the expensive expression duplication caused by filter pushdown, but all 
expressions in `Project` now need to be evaluated against the full input. I'm 
not sure how serious this issue is; my proposal is meant to simplify the 
algorithm, given that you are doing this optimization. I'd be more than happy 
if you agreed to drop this optimization and simplify the code.
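
The trade-off between the two plan shapes can be illustrated with a toy model. This is only a sketch in plain Python, not Spark or Catalyst code: `expensive`, `other`, `run_pushed_down`, and `run_filter_above` are hypothetical names, and the query is assumed to be something like `SELECT expensive(a) AS e, other(a) AS o FROM t WHERE e > 10`. It counts how often each expression is evaluated when the predicate is pushed below the `Project` (duplicating the aliased expression) versus when a single `Filter` sits above the `Project`.

```python
from collections import Counter


def expensive(a, counts):
    # Stand-in for a costly expression that the filter references via an alias.
    counts["expensive"] += 1
    return a * a


def other(a, counts):
    # Stand-in for any other projected expression the filter does not reference.
    counts["other"] += 1
    return a + 1


def run_pushed_down(rows, counts):
    # Filter below Project: the predicate re-computes expensive(a),
    # then the projection computes it again for every surviving row.
    out = []
    for a in rows:
        if expensive(a, counts) > 10:
            out.append((expensive(a, counts), other(a, counts)))
    return out


def run_filter_above(rows, counts):
    # Single Filter above Project: every projected expression runs on
    # every input row, but expensive(a) is computed only once per row.
    out = []
    for a in rows:
        e, o = expensive(a, counts), other(a, counts)
        if e > 10:
            out.append((e, o))
    return out


rows = list(range(10))
pushed, above = Counter(), Counter()
assert run_pushed_down(rows, pushed) == run_filter_above(rows, above)
print("pushed down: ", dict(pushed))
print("filter above:", dict(above))
```

In this toy input, six of the ten rows pass the predicate, so the pushed-down shape evaluates `expensive` 16 times and `other` 6 times, while the filter-above shape evaluates each of them 10 times. Which shape wins in practice depends on the filter's selectivity and the relative cost of the expressions, which this sketch does not try to model.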


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


