[ 
https://issues.apache.org/jira/browse/SYSTEMML-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1623:
------------------------------------
    Fix Version/s:     (was: SystemML 1.0)
                   SystemML 0.15

> Memory efficiency JMLC matrix and frame conversions
> ---------------------------------------------------
>
>                 Key: SYSTEMML-1623
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1623
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>             Fix For: SystemML 0.15
>
>
> The current JMLC conversion functions lead to a very inefficient and 
> memory-intensive code path, which causes unnecessary OOMs that can be easily 
> avoided. This task aims to add and improve these primitives to allow 
> convenient data conversions with much better memory efficiency. 
> For example, consider a scenario of a 500k x 90 input model available as a csv 
> file in the classpath, whose string representation requires 1GB. The typical 
> code path currently used looks as follows:
> {code}
> ResourceStream(model_file)
> -> prep
> ---> StringBuilder -> String [3GB tmp, 1GB]
> -> convertToDoubleMatrix
> ---> byte[] -> ByteInputStream [2GB]
> ---> MatrixBlock [360MB]
> ---> double[][] [400MB]
> -> setMatrix
> ---> MatrixBlock [360MB]
> {code} 
> which requires at least 4GB of memory due to strong references to all 
> intermediates. The goal of this task is to reduce this to the following, 
> which only requires 360MB of memory:
> {code}
> ResourceStream(model_file)
> -> convertToMatrix
> ---> MatrixBlock [360MB]
> -> setMatrix
> ---> by reference
> {code} 
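
As an illustration of the target code path, here is a minimal Java sketch that parses a csv stream line by line directly into a dense row-major double array, so that neither the full file contents as a String nor a byte[] copy is ever held in memory. It is not the SystemML implementation: the class StreamingCsvToDense and the method readDense are hypothetical names, and the sketch assumes a comma-delimited file without header or quoting and with known dimensions. For the 500k x 90 example, the dense array takes 500,000 x 90 x 8 bytes = 360MB, which is consistent with the MatrixBlock figure above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the JMLC API.
final class StreamingCsvToDense {

    // Reads a rows x cols csv matrix from the stream into a row-major array.
    // Only one line of text is alive at a time, so peak memory is roughly
    // the size of the output array plus a single line buffer.
    static double[] readDense(InputStream in, int rows, int cols) throws IOException {
        double[] data = new double[rows * cols];
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            int r = 0;
            while ((line = br.readLine()) != null) {
                if (line.isEmpty())
                    continue; // tolerate blank lines, e.g. a trailing newline
                if (r >= rows)
                    throw new IOException("More than " + rows + " rows in input");
                int start = 0;
                for (int c = 0; c < cols; c++) {
                    if (start > line.length())
                        throw new IOException("Too few columns in row " + r);
                    int end = line.indexOf(',', start);
                    if (end < 0)
                        end = line.length();
                    // An empty field raises NumberFormatException here.
                    data[r * cols + c] = Double.parseDouble(line.substring(start, end).trim());
                    start = end + 1;
                }
                r++;
            }
            if (r != rows)
                throw new IOException("Expected " + rows + " rows but found " + r);
        }
        return data;
    }
}
```

A caller would pass the classpath resource stream straight in, for example readDense(getClass().getResourceAsStream(modelFile), 500000, 90), where modelFile is a placeholder for the resource name. The resulting array could then be wrapped into a matrix once and handed over by reference instead of being copied through String, byte[], and double[][] intermediates.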



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
