GitHub user rdblue commented on the issue:
https://github.com/apache/spark/pull/20476
@cloud-fan, @gatorsmile, this PR demonstrates why we should use
PhysicalOperation. I ported the tests from this PR over to our branch and they
pass without modifying the push-down code. That's because it reuses code that
we already trust.
I see no benefit to using a brand new code path for push-down when we can
use what is already well tested. I know you want to push other operations, but
I've already raised concerns about the design of this new code: it is brittle
because it requires matching specific plan nodes.
Push-down should work as it always has: by pushing nodes that are adjacent
to relations in the logical plan and relying on the optimizer to push
projections and filters down as far as possible. The separation of concerns
into simple rules is fundamental to the design of the optimizer. I don't think
there is a good argument for new code that breaks how the optimizer is intended
to work.
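
As a rough illustration of the adjacency-based approach described above, here is a toy sketch in Python. It is not Spark's code and not the PhysicalOperation implementation: the `Relation`, `Filter`, `Project`, and `Limit` classes and the `collect_pushdown` function are hypothetical names made up for this example, and conditions are plain strings. It also ignores aliases and non-deterministic expressions, which a real implementation would have to handle.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical, simplified plan nodes (assumptions for illustration only).
@dataclass
class Relation:
    name: str


@dataclass
class Filter:
    condition: str
    child: object


@dataclass
class Project:
    columns: List[str]
    child: object


@dataclass
class Limit:
    n: int
    child: object


def collect_pushdown(plan):
    """Walk down through adjacent Project/Filter nodes to a relation.

    Returns (columns, filters, relation), where columns is None if no
    projection was seen. Returns None if any other node type sits between
    the top of the plan and the relation.
    """
    columns = None
    filters = []
    node = plan
    while True:
        if isinstance(node, Project):
            # Keep the outermost projection; this toy model has no aliases.
            if columns is None:
                columns = node.columns
            node = node.child
        elif isinstance(node, Filter):
            filters.append(node.condition)
            node = node.child
        elif isinstance(node, Relation):
            return columns, filters, node
        else:
            # Not adjacent to a relation: leave it to other optimizer rules.
            return None


plan = Project(["a"], Filter("b > 1", Filter("a < 5", Relation("t"))))
print(collect_pushdown(plan))
```

The sketch never matches one fixed plan shape. It accepts any stack of projections and filters that sits directly on the relation, and assumes separate optimizer rules have already moved those nodes down as far as they can go.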
cc @henryr, who might want to chime in.