cloud-fan commented on PR #46143: URL: https://github.com/apache/spark/pull/46143#issuecomment-3482107890
Hi @holdenk, we tried very hard to solve this issue efficiently but failed. The idea was to let the filter carry a project list and push them down together, but when we push through a Project/Aggregate that also contains a project list, we may still hit expression duplication and have to make a decision based on cost. Sorry, I should have come back to this PR earlier. I think we can simplify it a bit, as we will likely never have a practical cost model for Spark expressions. Let's just avoid duplicating UDF expressions (those that extend the marker expression `UserDefinedExpression`) during filter pushdown, and add a config to enable it.
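
To make the proposed rule concrete, here is a minimal sketch in Python of the decision it implies. It is a toy model, not Catalyst code: `Expr`, `contains_udf`, `references`, and `can_push_filter_through_project` are hypothetical names, the `is_udf` field only stands in for the `UserDefinedExpression` marker, and the flag `allow_udf_duplication` (its name, default, and polarity) is an assumption about how the config might behave.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Expr:
    """Toy expression node; `is_udf` stands in for the UserDefinedExpression marker."""
    name: str
    children: Tuple["Expr", ...] = ()
    is_udf: bool = False


def contains_udf(expr: Expr) -> bool:
    """True if the expression or any of its descendants is marked as a UDF."""
    return expr.is_udf or any(contains_udf(c) for c in expr.children)


def references(expr: Expr) -> Set[str]:
    """Names of the leaf attributes an expression reads (leaves are treated as attributes)."""
    if not expr.children:
        return {expr.name}
    refs: Set[str] = set()
    for child in expr.children:
        refs |= references(child)
    return refs


def can_push_filter_through_project(
    condition: Expr,
    aliases: Dict[str, Expr],
    allow_udf_duplication: bool = False,  # hypothetical config flag
) -> bool:
    """Pushing the filter below the project inlines every alias the condition
    reads, so the aliased expression is evaluated twice. Refuse the pushdown
    when that would duplicate a UDF, unless the flag allows it."""
    for attr in references(condition):
        aliased = aliases.get(attr)
        if aliased is not None and contains_udf(aliased) and not allow_udf_duplication:
            return False
    return True


# Project list: x = my_udf(a), y = a + b
aliases = {
    "x": Expr("my_udf", (Expr("a"),), is_udf=True),
    "y": Expr("add", (Expr("a"), Expr("b"))),
}
filter_on_x = Expr("gt", (Expr("x"), Expr("threshold")))
filter_on_y = Expr("gt", (Expr("y"), Expr("threshold")))

print(can_push_filter_through_project(filter_on_x, aliases))        # False
print(can_push_filter_through_project(filter_on_y, aliases))        # True
print(can_push_filter_through_project(filter_on_x, aliases, True))  # True
```

The point of the sketch is that no cost model is needed: the check is a purely structural test for a UDF under any alias the filter condition would inline.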
