cloud-fan commented on PR #46143: URL: https://github.com/apache/spark/pull/46143#issuecomment-3710426392
To discuss https://github.com/apache/spark/pull/46143#discussion_r2643476999 further:

> Yes that's true, but given your previous statement around how adding projections is not free I don't think that's the right way to structure this.

That's why my initial suggestion was to not do this optimization at all. We just create a single `Filter` and place it above the `Project`. By doing so we avoid the expensive expression duplication caused by filter pushdown, but all expressions in `Project` now need to be evaluated against the full input. I'm not sure how serious this issue is, and my proposal is meant to help simplify the algorithm, given that you are doing this optimization. I'm more than happy if you agree to drop this optimization and simplify the code.
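To make the trade-off in this comment concrete, here is a toy model in plain Python. It is not Spark code, and the names `expensive`, `other`, `filter_above_project`, and `filter_pushed_below_project` are hypothetical stand-ins chosen for illustration. It assumes a `Project` that computes two expressions and a `Filter` that references one of the projected aliases, and it counts how often each expression is evaluated under the two plan shapes.

```python
from collections import Counter

# Counts how many times each stand-in expression is evaluated.
calls = Counter()

def expensive(x):
    """Hypothetical costly expression that the filter depends on."""
    calls["expensive"] += 1
    return x * x

def other(x):
    """Hypothetical projected expression unrelated to the filter."""
    calls["other"] += 1
    return x + 1

def filter_above_project(rows):
    # Project first: every expression runs against the full input.
    projected = [{"a": expensive(x), "b": other(x)} for x in rows]
    # A single Filter on top reuses the already computed alias `a`.
    return [r for r in projected if r["a"] > 50]

def filter_pushed_below_project(rows):
    # Pushdown inlines the alias into the predicate, so `expensive`
    # runs once per input row inside the Filter ...
    kept = [x for x in rows if expensive(x) > 50]
    # ... and once more inside the Project for each surviving row.
    return [{"a": expensive(x), "b": other(x)} for x in kept]

def count_calls(plan, rows):
    """Run a plan on the rows and report per-expression call counts."""
    calls.clear()
    result = plan(rows)
    return result, dict(calls)

rows = list(range(10))
above_result, above_calls = count_calls(filter_above_project, rows)
pushed_result, pushed_calls = count_calls(filter_pushed_below_project, rows)

print(above_calls)
print(pushed_calls)
```

Both plans return the same rows. In this toy model the filter-above-project shape never evaluates `expensive` twice for a row but runs `other` on every input row, while the pushed-down shape duplicates `expensive` for surviving rows and runs `other` only on those rows. Which cost dominates depends on the selectivity of the filter and the relative cost of the expressions, which is the uncertainty the comment points to.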
