Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/14619
see discussion here:
https://github.com/apache/spark/pull/13893#discussion_r73117855
Currently we collect the projects and filters sitting on top of a scan node in the planner, via `PhysicalOperator.unapply`. `PhysicalOperator.unapply` mostly duplicates the logic of the column-pruning and filter-pushdown rules in the optimizer, but it doesn't handle non-deterministic expressions well. By adding a wrapper node on the scan, we can push projects and filters down to the scan node during the optimizer phase and reuse the existing rules. This eliminates the duplicated code and handles non-deterministic expressions correctly.
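For illustration, here is a minimal, self-contained Scala sketch of what a planner-side pattern like `PhysicalOperator.unapply` does: it walks down through `Project`/`Filter` nodes to find the scan, collecting the projection list and filter conditions along the way. The plan classes and names below are simplified stand-ins, not Spark's actual Catalyst API.

```scala
// Simplified stand-ins for Catalyst logical plan nodes (not Spark's real classes).
sealed trait LogicalPlan
case class Scan(table: String) extends LogicalPlan
case class Filter(condition: String, child: LogicalPlan) extends LogicalPlan
case class Project(columns: Seq[String], child: LogicalPlan) extends LogicalPlan

// An extractor that mimics the planner-side collection of projects and filters:
// it descends through Project/Filter nodes until it reaches the scan.
object CollectProjectsAndFilters {
  def unapply(plan: LogicalPlan): Option[(Seq[String], Seq[String], Scan)] =
    plan match {
      case s: Scan =>
        Some((Nil, Nil, s))
      case Filter(cond, child) =>
        // Accumulate this filter condition on the way down.
        unapply(child).map { case (proj, filters, scan) =>
          (proj, cond +: filters, scan)
        }
      case Project(cols, child) =>
        // Keep the topmost projection list (a simplification of the real logic).
        unapply(child).map { case (_, filters, scan) =>
          (cols, filters, scan)
        }
    }
}

object Demo extends App {
  val plan = Project(Seq("a"), Filter("a > 1", Scan("t")))
  plan match {
    case CollectProjectsAndFilters(projects, filters, scan) =>
      assert(projects == Seq("a"))
      assert(filters == Seq("a > 1"))
      assert(scan == Scan("t"))
      println("ok")
  }
}
```

Note how this recursion re-implements, in miniature, what the optimizer's column-pruning and filter-pushdown rules already do on the logical plan; a wrapper node on the scan would let those optimizer rules do the pushdown instead, so the planner only has to match the wrapper.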