GitHub user liancheng commented on the issue:
https://github.com/apache/spark/pull/13585
You probably meant "conjunction" (aka "logical and") instead of
"disjunction" (aka "logical or") in the PR title and comments.
As @clockfly pointed out, the current approach isn't correct. I think a
better way to extract as many partition column predicates as possible is
[CNF conversion][1], which pulls all conjunctions up to the top level. Once
the predicate is in that form, it's safe to do the optimization you intended
in this PR.
There have been PRs that tried to add CNF conversion to Spark SQL. However, one
problem is that CNF conversion can lead to exponential explosion with respect to
expression size (i.e. the number of nodes in the expression tree). Thus we
usually need to set an upper limit on the expression size and stop doing
CNF conversion once that limit is exceeded.
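To make the idea concrete, here is a rough, language-neutral sketch in Python (not Spark code). The `Pred`, `And`, and `Or` classes, the `to_cnf` and `partition_clauses` helpers, and the `max_clauses` parameter are all hypothetical names made up for this illustration. It also assumes negations have already been pushed down to the leaves, and it caps the number of CNF clauses as a simple stand-in for a limit on expression size.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Pred:
    """Leaf predicate; `columns` is the set of columns it references."""
    text: str
    columns: frozenset


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Pred, And, Or]
Clause = Tuple[Pred, ...]  # predicates OR-ed together


def to_cnf(expr: Expr, max_clauses: int) -> Optional[List[Clause]]:
    """Return the AND-ed clauses of `expr`, or None if the limit is exceeded."""
    if isinstance(expr, Pred):
        return [(expr,)]
    left = to_cnf(expr.left, max_clauses)
    right = to_cnf(expr.right, max_clauses)
    if left is None or right is None:
        return None
    if isinstance(expr, And):
        clauses = left + right
    else:
        # Distribute OR over AND; this cross product is the source of blow-up.
        if len(left) * len(right) > max_clauses:
            return None
        clauses = [l + r for l in left for r in right]
    return clauses if len(clauses) <= max_clauses else None


def partition_clauses(expr: Expr, partition_cols, max_clauses: int = 64) -> List[Clause]:
    """Top-level conjuncts that reference partition columns only."""
    cnf = to_cnf(expr, max_clauses)
    if cnf is None:
        return []  # give up: extracting nothing is still correct, just no pruning
    return [c for c in cnf if all(p.columns <= partition_cols for p in c)]


# (p = 1 AND a > 0) OR (p = 2 AND b > 0), where p is the partition column
p1 = Pred("p = 1", frozenset({"p"}))
p2 = Pred("p = 2", frozenset({"p"}))
a = Pred("a > 0", frozenset({"a"}))
b = Pred("b > 0", frozenset({"b"}))
expr = Or(And(p1, a), And(p2, b))

print([" OR ".join(p.text for p in c) for c in partition_clauses(expr, {"p"})])
# prints ['p = 1 OR p = 2']
```

Since every clause in the CNF is a top-level conjunct, each one must hold for any matching row, so keeping only the clauses over partition columns is a safe filter for partition pruning. When the limit is hit, the sketch simply extracts nothing rather than risk an incorrect predicate.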
[1]: https://en.wikipedia.org/wiki/Conjunctive_normal_form