[ https://issues.apache.org/jira/browse/SPARK-36786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Asif updated SPARK-36786:
-------------------------
    Description: 
h2. Q1. What are you trying to do? Articulate your objectives using absolutely 
no jargon.

The aim is to improve the compile time performance of a query which, in WorkDay's 
use case, takes > 24 hrs (and eventually fails), down to < 8 min.

To explain the problem, I will provide some context.

The query plan in our production system is huge, with nested *case when* 
expressions (the level of nesting can be > 8), where each *case when* can 
sometimes have > 1000 branches.

The plan could look like
{quote}Project1
   |
Filter1
   |
Project2
   |
Filter2
   |
Project3
   |
Filter3
   |
Join
{quote}
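Purely for illustration, a chain of filter/select calls over a join produces exactly such an alternating Project/Filter stack; the column names and the tiny datasets below are made up for the example, not our actual production query:

{code:scala}
// Illustrative only: each .filter/.select pair adds a Filter/Project layer
// above the Join, reproducing the plan shape sketched above.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("plan-shape").getOrCreate()
import spark.implicits._

// Tiny made-up datasets; only the plan shape matters here.
val left  = Seq((1, 10), (2, 20)).toDF("id", "a")
val right = Seq((1, 100), (2, 200)).toDF("id", "b")

val plan = left.join(right, "id")                                   // Join
  .filter($"a" + $"b" > 0).select($"id", ($"a" + $"b").as("c3"))    // Filter3 / Project3
  .filter($"c3" > 0).select($"id", ($"c3" * 2).as("c2"))            // Filter2 / Project2
  .filter($"c2" > 0).select($"id", ($"c2" + 1).as("c1"))            // Filter1 / Project1

plan.explain(true)   // prints the analyzed plan with the alternating stack
{code}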
Now the optimizer has a Batch of rules, intended to run at most 100 times.

*Also note that the batch will continue to run until one of these conditions 
is satisfied,*

*i.e. either numIter == 100 || inputPlan == outputPlan (idempotency is 
achieved).*
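For readers unfamiliar with this loop, here is a minimal sketch of the fixed-point execution; LogicalPlan and Rule below are simplified stand-ins, not Catalyst's actual RuleExecutor machinery:

{code:scala}
// Minimal sketch of the fixed-point batch loop described above. LogicalPlan
// and Rule are simplified stand-ins for Catalyst's RuleExecutor machinery.
object FixedPointSketch {
  trait LogicalPlan
  type Rule = LogicalPlan => LogicalPlan

  def runBatch(rules: Seq[Rule], plan: LogicalPlan, maxIterations: Int = 100): LogicalPlan = {
    var current = plan
    var iteration = 0
    var reachedFixedPoint = false
    while (!reachedFixedPoint && iteration < maxIterations) {
      // Apply every rule in the batch once, in order.
      val next = rules.foldLeft(current)((p, rule) => rule(p))
      // Stop when the plan no longer changes: inputPlan == outputPlan.
      reachedFixedPoint = next == current
      current = next
      iteration += 1
    }
    current
  }
}
{code}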

One of the early rules is *PushDownPredicates*, followed by *CollapseProject*.

 

The first issue is with the *PushDownPredicate* rule.

It picks one filter at a time and pushes it to the lowest level (I understand 
that in 3.1 it pushes through joins, while in 2.4 it stops at a Join); in 
either case it picks one filter at a time, starting from the top, in each 
iteration.

*The above is no longer true in the 3.1 release, as the rule now combines 
filters, so it does push all the encountered filters in a single pass. But it 
still materializes the filter on each push by re-aliasing.*
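To make the re-aliasing cost concrete, here is a rough sketch of the per-push substitution over Catalyst's classes. This is illustrative only; the actual rule handles many more cases and uses internal helpers for the substitution, and the object and method names below are made up:

{code:scala}
import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, Expression}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, Project}

// Illustrative sketch, not Spark's actual PushDownPredicates code.
object PushWithRealiasSketch {
  def pushOnceThroughProject(filterCond: Expression, project: Project): LogicalPlan = {
    // Map each aliased output attribute of the Project back to its source expression.
    val aliasMap: Map[Attribute, Expression] = project.projectList.collect {
      case a: Alias => a.toAttribute -> a.child
    }.toMap
    // transformUp rewrites the entire condition tree on every single push;
    // the condition grows with each substitution, so this cost accumulates.
    val rewritten = filterCond.transformUp {
      case attr: Attribute => aliasMap.getOrElse(attr, attr)
    }
    Project(project.projectList, Filter(rewritten, project.child))
  }
}
{code}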

*Also, it seems that with this change, part of the PruneFilters rule's 
functionality, the replacement with an empty relation, appears to be broken by 
the addition of CombineFilters into the PushDownPredicates rule. More on it 
later.*

*I have filed a new bug ticket ([PruneFilter optimization 
broken|https://issues.apache.org/jira/browse/SPARK-36878])*

So if there are, say, 50 Projects interspersed with Filters, idempotency is 
guaranteed not to be achieved until around 49 iterations. Moreover, 
CollapseProject will also be modifying the tree on each iteration, as each 
iteration removes a Filter from between two Projects.

Moreover, on each movement of a filter through the project tree, the filter is 
re-aliased using transformUp. transformUp is very expensive compared to 
transformDown, and the filter keeps growing in size as it gets pushed down.

To optimize this rule, two things are needed:
 # Instead of pushing one filter at a time, collect all the filters as we 
traverse the tree in that same iteration.
 # Do not re-alias the filters on each push. Collect the sequence of Projects 
each filter has passed through, and when the filters have reached their 
resting place, do the re-aliasing by processing the collected Projects in a 
down-to-up manner.

This will result in achieving idempotency within a couple of iterations; a rough sketch of the idea follows.
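The sketch below is my illustration of points 1 and 2 over Catalyst's classes, not the actual PR. It handles only a plain Filter/Project chain, and the names (Pending, pushAllFilters, substitute) are made up for the example:

{code:scala}
import org.apache.spark.sql.catalyst.expressions.{Alias, And, Attribute, Expression}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, Project}

// Illustrative sketch: collect filter conditions and the Projects each one
// crosses in a single downward pass; re-alias each condition exactly once at
// its resting place instead of materializing it on every push.
object CollectThenRealiasSketch {
  private case class Pending(cond: Expression, crossed: Seq[Project])

  def pushAllFilters(plan: LogicalPlan): LogicalPlan = rec(plan, Nil)

  private def rec(node: LogicalPlan, pending: Seq[Pending]): LogicalPlan = node match {
    case Filter(cond, child) =>
      rec(child, pending :+ Pending(cond, Nil))          // collect; no re-aliasing yet
    case p: Project =>
      // Record that every already-collected filter crosses this Project; defer rewriting.
      p.copy(child = rec(p.child, pending.map(f => f.copy(crossed = f.crossed :+ p))))
    case other if pending.nonEmpty =>
      // Resting place reached (e.g. a Join or a leaf relation): substitute each
      // condition through its crossed Projects once, then materialize one Filter.
      val rewritten = pending.map(f => f.crossed.foldLeft(f.cond)(substitute))
      Filter(rewritten.reduce(And(_, _)), other)
    case other => other
  }

  private def substitute(cond: Expression, project: Project): Expression = {
    val aliasMap = project.projectList.collect { case a: Alias => a.toAttribute -> a.child }.toMap
    cond.transformUp { case attr: Attribute => aliasMap.getOrElse(attr, attr) }
  }
}
{code}

(The fold here expands aliases through each crossed Project in top-down order; composing the alias maps bottom-up, as point 2 describes, is the equivalent single-substitution variant.)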

*How reducing the number of iterations helps performance:*

There are many rules, like *NullPropagation, OptimizeIn, SimplifyConditionals 
(and around 6 more such rules),* which traverse the tree using transformUp, 
and they run unnecessarily in each iteration, even when the expressions in an 
operator have not changed since the previous runs.

*I have a different proposal, which I will share later, for preventing the 
above rules from running unnecessarily when it can be guaranteed that the 
expressions in an operator are not going to mutate.* 
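For intuition only, here is one purely hypothetical shape such gating could take, using Catalyst's TreeNodeTag mechanism; the tag and this wiring are my invention, not part of any existing Spark rule or of the deferred proposal:

{code:scala}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.trees.TreeNodeTag

// Hypothetical sketch: once an operator's expressions are known to be final,
// expression-level simplification rules skip it in later iterations instead
// of re-traversing it with transformUp.
object SkipStableOperatorsSketch {
  // Made-up tag marking operators whose expressions will not mutate again.
  val ExpressionsStable = TreeNodeTag[Boolean]("expressionsStable")

  def markStable(op: LogicalPlan): Unit = op.setTagValue(ExpressionsStable, true)

  // Wraps an expression-level rule so it becomes a no-op on stable operators.
  def gated(rule: LogicalPlan => LogicalPlan)(op: LogicalPlan): LogicalPlan =
    if (op.getTagValue(ExpressionsStable).contains(true)) op else rule(op)
}
{code}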

The above has been identified as the cause of our huge compilation time.
  
h2. Q2. What problem is this proposal NOT designed to solve?

It is not going to change any runtime profile.
h2. Q3. How is it done today, and what are the limits of current practice?

As mentioned above, currently PushDownPredicate pushes one filter at a time, 
and at each Project it materializes the re-aliased filter. This results in a 
large number of iterations to achieve idempotency. The immediate 
materialization of the Filter after each Project pass also results in 
unnecessary traversals of the filter's expression tree, using transformUp at 
that, and the filter's expression tree is bound to keep growing as it is 
pushed down.
h2. Q4. What is new in your approach and why do you think it will be successful?

In the new approach we push all the filters down in a single pass, and we do 
not materialize filters as they pass through Projects. Instead we keep 
collecting the Projects in sequential order and materialize the final filter 
once its final position is reached (above a Join in the case of 2.4, or above 
the base relation, etc.).

When this approach is coupled with the logic of identifying those Project 
operators whose expressions will not mutate (which I will share later), so 
that rules like

NullPropagation,
 OptimizeIn,
 LikeSimplification,
 BooleanSimplification,
 SimplifyConditionals,
 RemoveDispensableExpressions,
 SimplifyBinaryComparison,
 SimplifyCaseConversionExpressions,
 SimplifyExtractValueOps

are applied only in the first pass on the expressions of such Project 
operators, the compilation time of the offending queries is reduced from 
24 hrs or more to under 8 mins.

 
h2. Q5. Who cares? If you are successful, what difference will it make?

For my company, WorkDay, it will fix the currently failing plans, which run 
into OOMs and compilation times of 24 hrs or so. I have a PR for this locally 
and will publish it in some time.

 
h2. Q6. What are the risks?

The risk in the change to PushDownPredicate is very low.
 The next proposal, identifying Project operators whose expressions will be 
immutable so that the above set of rules runs only once, involves some 
relatively complex logic, but with extra test coverage it should be safe.
h2. Q7. How long will it take?

The basic changes are already in place. Tests will take time, around 10-15 
days.
h2. Q8. What are the mid-term and final “exams” to check for success?

All tests should pass.
 The perf benefit should justify the changes.



> SPIP: Improving the compile time performance, by improving  a couple of 
> rules, from 24 hrs to under 8 minutes
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36786
>                 URL: https://issues.apache.org/jira/browse/SPARK-36786
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.1, 3.1.2
>            Reporter: Asif
>            Priority: Major
>              Labels: SPIP
>


