Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/21319
Hi @rdblue, I looked into the plan visitor approach but ran into some
problems:
1. How do we pass the `DataSourceReader` to the physical plan? We don't want
to apply operator pushdown again when planning, which is why we made the
`DataSourceReader` a `lazy val` in `DataSourceV2Relation`: it effectively caches
the pushdown result and gives the `DataSourceReader` to the physical plan directly.
2. How do we eliminate unneeded operators, such as filters that have already
been pushed down? We currently just rewrite the logical plan to remove these
operators. Alternatively, we could do that during planning, if we eventually
move the stats to the physical plan. I feel it's hard to make a plan visitor
change the plan.
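
To make the two points above concrete, here is a minimal sketch of the `lazy val` caching idea. It does not use Spark's classes: `Filter`, `PushdownResult`, `PushdownCounter`, and `SketchRelation` are hypothetical stand-ins, and the "supported columns" rule is an assumed placeholder for whatever a real source decides to accept. It only illustrates that the pushdown runs once, that later accesses reuse the same result, and that the filters left over after pushdown are the ones that still need an operator in the plan.

```scala
// Hypothetical stand-ins for illustration; these are not Spark's actual classes.
final case class Filter(column: String, value: Int)

// What the source accepted, and what the engine must still evaluate itself.
final case class PushdownResult(pushed: Seq[Filter], remaining: Seq[Filter])

object PushdownCounter {
  // Counts how many times the pushdown logic actually runs.
  var runs: Int = 0
}

final class SketchRelation(filters: Seq[Filter], supported: Set[String]) {
  // A `lazy val` is computed on first access and cached for later accesses.
  lazy val pushdown: PushdownResult = {
    PushdownCounter.runs += 1
    // Assumed rule: the source accepts filters on the columns it supports.
    val (pushed, remaining) = filters.partition(f => supported.contains(f.column))
    PushdownResult(pushed, remaining)
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val relation = new SketchRelation(
      Seq(Filter("a", 1), Filter("b", 2)),
      supported = Set("a"))

    // First access (think: the optimizer) triggers the pushdown.
    val seenByOptimizer = relation.pushdown
    // Second access (think: the planner) reuses the cached result.
    val seenByPlanner = relation.pushdown

    assert(seenByOptimizer eq seenByPlanner)
    assert(PushdownCounter.runs == 1)

    // Only the remaining filters would still need a Filter operator in the plan.
    println(s"pushed=${seenByPlanner.pushed}, remaining=${seenByPlanner.remaining}")
  }
}
```

In this sketch the cache lives on the relation instance, so it only helps as long as the same instance reaches the planner.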
It would be great if you could share some ideas about this. In the meantime,
can we unblock this cleanup PR? We can open a new PR for the plan visitor
approach when it's ready.