Mike Dusenberry created SYSTEMML-1561:
-----------------------------------------

             Summary: Improve constant folding during compilation
                 Key: SYSTEMML-1561
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
             Project: SystemML
          Issue Type: Improvement
            Reporter: Mike Dusenberry
         Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, 
scenario2.py

In our {{nn}} library, the convolution and pooling layers have to pass around the 
spatial dimensions (height and width) of the images that are stretched out into 
rows of the input/output matrices.  These output dimensions are computed within 
the forward functions of the above layers as small scalar equations.  From a 
mathematical standpoint, these sizes can be determined at compile time, and it 
is nice to have these size equations in DML (vs. hiding them inside the engine 
within built-in functions).  However, we do not currently evaluate these 
expressions during compilation, and thus we are left with unknown sizes even 
during recompilation.  This naturally leads to worst-case (max) memory 
estimates, and thus often to unnecessary distributed runtime ops rather than 
simple CP ones.
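
For reference, the size equations in question look roughly like the following 
(a minimal sketch modeled on the {{nn}} conv2d layer; the exact parameter names 
are assumptions):

{code}
# Sketch of the scalar output-size equations inside a forward function.
# Hin, Win = input height/width; Hf, Wf = filter height/width;
# strideh, stridew = strides; padh, padw = padding (names assumed).
Hout = as.integer(floor((Hin + 2*padh - Hf)/strideh + 1))
Wout = as.integer(floor((Win + 2*padw - Wf)/stridew + 1))
{code}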

I have two related scenarios for which this is a problem.  They both involve 
the {{Houtc1}} & {{Woutc1}} values that are returned from a 
{{conv2d::forward(...)}} function.  These represent the spatial dimensions 
(height and width) of the volume associated with each row of the output 
{{outc1}} of the function, with {{F1}} as the third dimension.  Thus, {{outc1}} 
has a number of columns equal to {{F1*Houtc1*Woutc1}}.
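
As a concrete illustration of the shapes involved (a sketch; the variable names 
and the exact argument list of the call are assumptions):

{code}
# Hypothetical conv2d call; argument order/names are assumed for illustration.
[outc1, Houtc1, Woutc1] = conv2d::forward(X, W1, b1, C, Hin, Win,
                                          Hf, Wf, stride, stride, pad, pad)
# Each row of outc1 is a flattened (F1 x Houtc1 x Woutc1) volume, so:
#   ncol(outc1) == F1 * Houtc1 * Woutc1
{code}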

In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is created 
that should have the same dimensions as {{outc1}}.  For the columns, if I use 
{{cols=ncol(outc1)}} in this rand statement, the size will be propagated and CP 
ops will be compiled and run.  If I instead use {{cols=F1*Houtc1*Woutc1}}, the 
size will forever be unknown, even during recompilation, and thus Spark ops 
will be compiled and run.  I have included the recompile hops plan 
({{scenario1_plan.txt}}).
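
In other words, the two variants of the rand statement behave differently 
(a sketch, with {{N}} as an assumed row count):

{code}
# Size propagates at compile time -> CP ops:
doutc1 = rand(rows=N, cols=ncol(outc1))
# Without constant folding of F1*Houtc1*Woutc1, size stays unknown -> Spark ops:
doutc1 = rand(rows=N, cols=F1*Houtc1*Woutc1)
{code}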

In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} 
function is inserted after the {{conv2d::forward(...)}} function and requires 
the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments.  Since 
the scalar expressions defining those variables are not evaluated during 
compilation, the max pooling sizes remain unknown, even during recompilation, 
and thus Spark ops will be compiled and run.  I have included the recompile 
hops plan ({{scenario2_plan.txt}}).
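
The second scenario looks roughly like the following (a sketch; the pooling 
hyperparameters and argument order are assumptions):

{code}
[outc1, Houtc1, Woutc1] = conv2d::forward(X, W1, b1, C, Hin, Win,
                                          Hf, Wf, stride, stride, pad, pad)
# Houtc1 & Woutc1 are scalar expressions that are not folded at compile time,
# so the pooled output sizes below remain unknown -> Spark ops:
[outp1, Houtp1, Woutp1] = max_pool2d::forward(outc1, F1, Houtc1, Woutc1,
                                              2, 2, 2, 2, 0, 0)
{code}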

We should improve (or fix) our constant folding rewrites so that these 
scenarios are handled correctly, as doing so is necessary for performant deep 
learning applications.  Note, too, that this issue will be present in other, 
non-deep-learning scenarios as well.
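
For instance, even a trivial non-deep-learning script exhibits the same 
behavior if scalar arithmetic is not folded (a minimal, hypothetical example):

{code}
a = 7
b = 3
n = a * b  # should fold to the literal 21 during compilation
X = rand(rows=1000, cols=n)  # with folding, sizes are known and CP ops compile
{code}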



