Asif created SPARK-55110:
----------------------------
Summary: Order of rules BooleanSimplification and
SimplifyBinaryComparison is suboptimal in achieving idempotency
Key: SPARK-55110
URL: https://issues.apache.org/jira/browse/SPARK-55110
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.1.1, 4.2.0
Reporter: Asif
The current order of rule execution is as follows:
{quote} BooleanSimplification, SimplifyConditionals,
SimplifyBinaryComparison,
{quote}
For the following expression:
{quote}($"a" > 1000 && $"c" =!= false){quote}
The optimized expression
{quote}$"a" > 1000 && $"c"{quote}
is achieved in 2 passes.
The reason is that SimplifyBinaryComparison runs after
BooleanSimplification.
SimplifyBinaryComparison converts
{quote} $"c" =!= false{quote}
to
{quote}Not (Not ( $"c")){quote}
in the first pass, and then in the second pass BooleanSimplification converts
it into
{quote}$"c"{quote}
If the order of the rules is changed to
SimplifyConditionals, SimplifyBinaryComparison, BooleanSimplification,
idempotency will be achieved in a single pass.
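
The effect of the ordering can be illustrated with a small, self-contained toy model. This is only a sketch and does not use Spark's Catalyst classes: the names Expr, Col, Lit, IntLit, transformUp, runToFixedPoint and RuleOrderSketch are hypothetical, and each rule is reduced to the single rewrite mentioned above (SimplifyConditionals is modelled as a no-op because it does not affect this expression). The executor simply reruns the rule list until a pass changes nothing and counts the passes that changed the expression.

```scala
// Toy expression tree for illustration only; NOT Spark's Catalyst classes.
sealed trait Expr
case class Col(name: String) extends Expr
case class Lit(value: Boolean) extends Expr
case class IntLit(value: Int) extends Expr
case class Gt(left: Expr, right: Expr) extends Expr
case class EqualTo(left: Expr, right: Expr) extends Expr
case class Not(child: Expr) extends Expr
case class And(left: Expr, right: Expr) extends Expr

object RuleOrderSketch {
  type Rule = Expr => Expr

  // Apply a partial rewrite bottom-up: children first, then the node itself.
  def transformUp(e: Expr)(f: PartialFunction[Expr, Expr]): Expr = {
    val withNewChildren = e match {
      case Gt(l, r)      => Gt(transformUp(l)(f), transformUp(r)(f))
      case EqualTo(l, r) => EqualTo(transformUp(l)(f), transformUp(r)(f))
      case Not(c)        => Not(transformUp(c)(f))
      case And(l, r)     => And(transformUp(l)(f), transformUp(r)(f))
      case leaf          => leaf
    }
    f.applyOrElse(withNewChildren, (x: Expr) => x)
  }

  // Stand-in for BooleanSimplification: only the double-negation rewrite.
  val booleanSimplification: Rule =
    e => transformUp(e) { case Not(Not(c)) => c }

  // Stand-in for SimplifyConditionals: irrelevant here, so a no-op.
  val simplifyConditionals: Rule = e => e

  // Stand-in for SimplifyBinaryComparison: only `x = false` ==> Not(x).
  val simplifyBinaryComparison: Rule =
    e => transformUp(e) { case EqualTo(c, Lit(false)) => Not(c) }

  // Rerun the rules in order until a pass changes nothing. Returns the final
  // expression and the number of passes that actually changed it.
  def runToFixedPoint(rules: Seq[Rule], start: Expr): (Expr, Int) = {
    var current = start
    var changingPasses = 0
    var changed = true
    while (changed) {
      val next = rules.foldLeft(current)((e, rule) => rule(e))
      changed = next != current
      if (changed) { changingPasses += 1; current = next }
    }
    (current, changingPasses)
  }

  // ($"a" > 1000 && $"c" =!= false), with =!= modelled as Not(EqualTo(...)).
  val example: Expr =
    And(Gt(Col("a"), IntLit(1000)), Not(EqualTo(Col("c"), Lit(false))))

  val currentOrder: Seq[Rule] =
    Seq(booleanSimplification, simplifyConditionals, simplifyBinaryComparison)
  val proposedOrder: Seq[Rule] =
    Seq(simplifyConditionals, simplifyBinaryComparison, booleanSimplification)

  def main(args: Array[String]): Unit = {
    println(runToFixedPoint(currentOrder, example)._2)  // prints 2
    println(runToFixedPoint(proposedOrder, example)._2) // prints 1
  }
}
```

In this toy model both orders end at the same expression, a > 1000 && c; only the number of passes needed to reach it differs.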