zml1206 commented on PR #49202: URL: https://github.com/apache/spark/pull/49202#issuecomment-2567396659
> After more thought, I think we should think of filter pushdown in a different way. Once you push a predicate through `Project` and expand the attribute reference into an expensive expression, even only once, there is a risk of a perf regression, because that expensive expression will be evaluated twice: once in the `Filter` being pushed down, and once in the `Project` that stays up.
>
> A safe approach is that when pushing filters through `Project`, the filters should promise to produce the expensive expressions as attributes, so that we can rewrite the `Project` during the pushdown to use these attributes, ensuring the expensive expressions are evaluated only once. This is kind of like pushing down the filters and part of the `Project` together. We can extend the `With` expression to use a pre-defined `Alias` to support it.

This seems to go back to sharing `With` across nodes.
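To make the double-evaluation risk concrete, here is a toy model of the two rewrites (plain Python, not Catalyst code; the row data, the `expensive` stand-in function, and the plan shapes are all hypothetical):

```python
# Toy model: Project(e = expensive(x)) followed by Filter(e > 15).
# Naively pushing the filter through the Project expands `e` back into
# expensive(x), so the expression runs again for every surviving row.

calls = {"expensive": 0}

def expensive(x):
    """Stand-in for a costly expression; counts its own invocations."""
    calls["expensive"] += 1
    return x * 10

rows = [1, 2, 3]

# Original plan: Filter sits above Project, so `e` is computed once per row.
projected = [expensive(x) for x in rows]
filtered = [e for e in projected if e > 15]
assert calls["expensive"] == 3  # one eval per input row

# Naive pushdown: Filter(expensive(x) > 15) below Project(e = expensive(x)).
calls["expensive"] = 0
pushed = [x for x in rows if expensive(x) > 15]  # filter evaluates it
result = [expensive(x) for x in pushed]          # project re-evaluates it
assert calls["expensive"] == 3 + len(pushed)     # 3 filter evals + re-evals

# Safe pushdown: push the filter together with the alias, so `e` is
# produced as an attribute below and the Project above just reuses it.
calls["expensive"] = 0
combined = [(x, expensive(x)) for x in rows]     # e materialized once
safe = [e for (x, e) in combined if e > 15]
assert calls["expensive"] == 3                   # back to one eval per row
```

The third rewrite is what the comment describes: the filter promises to produce the expensive expression as an attribute, and the `Project` is rewritten during pushdown to reference that attribute instead of recomputing it.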
