[
https://issues.apache.org/jira/browse/SYSTEMML-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Glenn Weidner updated SYSTEMML-1623:
------------------------------------
Fix Version/s: (was: SystemML 1.0)
SystemML 0.15
> Memory efficiency JMLC matrix and frame conversions
> ---------------------------------------------------
>
> Key: SYSTEMML-1623
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1623
> Project: SystemML
> Issue Type: Bug
> Reporter: Matthias Boehm
> Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> The current JMLC conversion functions cause a very inefficient and
> memory-intensive code path, which leads to unnecessary OOMs that can be
> easily avoided. This task aims to add and improve these primitives to allow
> convenient data conversions with much better memory efficiency.
> For example, consider a scenario of a 500k x 90 input model available as a
> CSV file in the classpath, whose string representation requires 1GB. The
> typical code path currently used looks as follows:
> {code}
> ResourceStream(model_file)
> -> prep
> ---> StringBuilder -> String [3GB tmp, 1GB]
> -> convertToDoubleMatrix
> ---> byte[] -> ByteInputStream [2GB]
> ---> MatrixBlock [360MB]
> ---> double[][] [400MB]
> -> setMatrix
> ---> MatrixBlock [360MB]
> {code}
> which requires at least 4GB of memory due to strong references to all
> intermediates. The goal of this task is to reduce this to the following,
> which only requires 360MB of memory:
> {code}
> ResourceStream(model_file)
> -> convertToMatrix
> ---> MatrixBlock [360MB]
> -> setMatrix
> ---> by reference
> {code}
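
To illustrate the idea behind the target code path, here is a minimal sketch in plain Java. It is not the SystemML/JMLC implementation: the class `StreamingCsvSketch` and its methods are hypothetical names introduced only for this example, and it assumes a dense, comma-separated input with known dimensions. It parses the stream line by line straight into one row-major `double[]`, so the full file contents are never held as a `String` or `byte[]`. The size helper shows that 500k x 90 cells at 8 bytes each come to 360,000,000 bytes, which is consistent with the 360MB figure quoted above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of SystemML or JMLC.
final class StreamingCsvSketch {

    // Bytes needed for the values of a dense rows x cols matrix (8 bytes per double).
    static long denseSizeInBytes(long rows, long cols) {
        return rows * cols * 8L;
    }

    // Parses comma-separated rows from the stream directly into a single
    // row-major double[]; only one line is held as a String at any time.
    static double[] readDense(InputStream in, int rows, int cols) {
        double[] values = new double[Math.multiplyExact(rows, cols)];
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            for (int r = 0; r < rows; r++) {
                String line = reader.readLine();
                if (line == null) {
                    throw new IllegalArgumentException(
                            "expected " + rows + " rows, found " + r);
                }
                int start = 0;
                for (int c = 0; c < cols; c++) {
                    if (start > line.length()) {
                        throw new IllegalArgumentException(
                                "row " + r + " has fewer than " + cols + " columns");
                    }
                    int end = line.indexOf(',', start);
                    if (end < 0) {
                        end = line.length(); // last field on the line
                    }
                    values[r * cols + c] =
                            Double.parseDouble(line.substring(start, end).trim());
                    start = end + 1;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }
}
```

A caller would pass the resource stream and the known dimensions, for example `StreamingCsvSketch.readDense(stream, 500_000, 90)`, and then hand the resulting array on without copying it. Using a single flat array rather than `double[][]` also avoids the per-row array overhead.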