[jira] [Comment Edited] (SYSTEMML-2021) Perftest stratstats w/ codegen getting stuck in optimization

Matthias Boehm (JIRA) Fri, 17 Nov 2017 21:19:30 -0800

    [ 
https://issues.apache.org/jira/browse/SYSTEMML-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257925#comment-16257925
 ]


Matthias Boehm edited comment on SYSTEMML-2021 at 11/18/17 5:18 AM:
--------------------------------------------------------------------

With the fix for robustess as described above, the test passed successfully 
with the following enumeration statistics:
{code}
Codegen enum (ALLt/p,EVALt/p):  17867063951501/1099512676970/832230/832172.
{code}
where we now enumerate 832K plans out of the one trillion plans (with 
partitioning) and 17 trillion plans (without partitioning).


was (Author: mboehm7):
With the fix for robustess as described above, the test passed successfully 
with the following enumeration statistics:
{code}
Codegen enum (ALLt/p,EVALt/p):  17867063951501/1099512676970/832230/832172.
{code}

> Perftest stratstats w/ codegen getting stuck in optimization
> ------------------------------------------------------------
>
>                 Key: SYSTEMML-2021
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2021
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Matthias Boehm
>
> On stratstats 10K, the codegen optimizer gets stuck during plan enumeration 
> of a DAG partition with 274,877,906,944 plans. During initial compilation, 
> the same partition was processed in 2.8s as most candidate plans have been 
> successfully pruned by cost or structure.
> After a detailed analysis, we can characterize the problem as follows: there 
> is a fusion partition with 38 interesting points, resulting in the huge 
> search space of 274 billion plans. Furthermore, all these materialization 
> points are extremely small while at the same time there are large inputs and 
> compute-intensive operations as well as many common subexpression, which 
> renders the pruning by costs starting from the fuse-all heuristic 
> ineffective. 
> We can address this as follows: First, we should run both heuristics 
> (fuse-all and fuse-no-redundancy) which are the first and last plan upfront 
> to quickly obtain a good lower bound for the costs. Second, we should only 
> consider enumerating large partitions (say with more than 20 interesting 
> points) if the opening heuristics show costs that is further than an epsilon 
> (say 1%) away from the minimal static costs. Together these rules will make 
> the optimizer significantly more robust without missing any meaningful plan 
> choices.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Comment Edited] (SYSTEMML-2021) Perftest stratstats w/ codegen getting stuck in optimization

Reply via email to