[
https://issues.apache.org/jira/browse/SYSTEMML-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Glenn Weidner updated SYSTEMML-1554:
------------------------------------
Fix Version/s: (was: SystemML 1.0)
SystemML 0.15
> IPA Scalar Transient Read Replacement
> -------------------------------------
>
> Key: SYSTEMML-1554
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1554
> Project: SystemML
> Issue Type: Improvement
> Reporter: Mike Dusenberry
> Assignee: Mike Dusenberry
> Fix For: SystemML 0.15
>
> Attachments: convnet_distrib_sgd.dml, parfor_oom_convnet_plan.txt,
> parfor_oom_convnet.py, parfor_oom_plan.txt, parfor_oom.py
>
>
> Currently, during IPA we collect all variables (scalars & matrices) eligible
> for propagation across blocks (i.e., not updated in a block), and then propagate
> only the matrix sizes across the blocks. It seems plausible that we
> could also replace all eligible scalar transient reads with literals based on
> the variables that have already been collected. The benefit is that many ops
> will be able to determine their respective output sizes during regular
> compilation, instead of having to wait until dynamic recompilation, and thus
> we can reduce the pressure on dynamic recompilation.
> Are there drawbacks to this approach? The use case: while training a
> convolutional net, I was seeing a large number of memory warnings because the
> sizes were unknown during regular compilation, yet the engine had only CP
> versions of the ops. Additionally, I was running into actual heap-space OOM
> errors in situations that should not run out of memory, and thus I started
> exploring.
> I've attached an example script and the explain plan (hops & runtime) w/ and
> w/o the IPA scalar replacement.
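> The idea above can be sketched outside of SystemML. The following is a
> minimal, hypothetical Python model (not the actual IPA code) of the proposed
> pass: scalars assigned exactly once across all statement blocks are treated
> as eligible, and their transient reads are replaced by literals, so a
> downstream `rand(rows=n, cols=n)` gets known sizes at compile time. The
> block/expression representation here is invented for illustration.

```python
import re

def propagate_scalars(blocks):
    """Sketch of IPA scalar transient-read replacement.

    blocks: list of dicts mapping variable name -> expression, where an
    expression is either a numeric literal (a scalar assignment) or a
    string (an opaque expression that may read scalars).
    Returns new blocks with reads of single-assignment scalar variables
    replaced by their literal values.
    """
    # Count assignments per variable across all blocks; a variable updated
    # in more than one block is not safe to propagate.
    counts = {}
    for block in blocks:
        for var in block:
            counts[var] = counts.get(var, 0) + 1

    # Collect literal values of scalars assigned exactly once.
    literals = {var: expr
                for block in blocks
                for var, expr in block.items()
                if counts[var] == 1 and isinstance(expr, (int, float))}

    # Replace eligible scalar reads inside expressions (whole-word match
    # only, so 'n' does not clobber the 'n' inside 'rand').
    out = []
    for block in blocks:
        new_block = {}
        for var, expr in block.items():
            if isinstance(expr, str):
                for name, val in literals.items():
                    expr = re.sub(r'\b%s\b' % re.escape(name), str(val), expr)
            new_block[var] = expr
        out.append(new_block)
    return out

blocks = [
    {"n": 256},                     # scalar, assigned once -> eligible
    {"X": "rand(rows=n, cols=n)"},  # sizes become known at compile time
]
print(propagate_scalars(blocks))
```

> With the scalar read replaced, the op's output size is determinable during
> regular compilation rather than deferred to dynamic recompilation, which is
> the benefit described above.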
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)