[ https://issues.apache.org/jira/browse/SYSTEMML-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Glenn Weidner updated SYSTEMML-1548:
------------------------------------
Fix Version/s: (was: SystemML 1.0)
               SystemML 0.15
> Performance ultra-sparse matrix read
> ------------------------------------
>
> Key: SYSTEMML-1548
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1548
> Project: SystemML
> Issue Type: Task
> Reporter: Matthias Boehm
> Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> For certain data sizes and memory configurations, reading ultra-sparse
> matrices performs poorly due to garbage collection overhead. This task
> covers two scenarios that will be addressed independently:
> 1) Large heap: With a large heap, the problem is temporarily deserialized
> sparse blocks that are not reused due to an inefficient reset, which
> produces a lot of garbage and hence a high cost for full garbage
> collection. This will be addressed by using our CSR sparse blocks for
> ultra-sparse blocks, because CSR has a smaller memory footprint and allows
> for an efficient reset.
> 2) Small heap: With a small heap, the bottleneck is not the temporary
> blocks but the memory overhead of the target sparse matrix. The cause is a
> relatively large memory overhead per sparse row, which is not amortized if
> a row has just one or very few non-zeros. This will be addressed by
> modifying the MCSR representation for ultra-sparse matrices. Note that we
> cannot use CSR or COO here because we want to support efficient
> multi-threaded incremental construction and subsequent operations.
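
The two scenarios above can be illustrated with a small Python sketch. It is
not SystemML code: the class CsrBlock and the function allocation_counts are
hypothetical names, and the assumption that an MCSR-style layout allocates
one row object plus two arrays per non-empty row is a simplification made
for illustration. The sketch shows (1) a CSR block whose reset reuses the
already allocated arrays instead of creating garbage, and (2) why per-row
allocations are not amortized when each row holds a single non-zero.

```python
from array import array


class CsrBlock:
    """Toy CSR block (hypothetical): three flat arrays shared by all rows."""

    def __init__(self, num_rows, capacity):
        self.num_rows = num_rows
        self.row_ptr = array("i", [0] * (num_rows + 1))  # start offset per row
        self.col_idx = array("i", [0] * capacity)        # column of each non-zero
        self.values = array("d", [0.0] * capacity)       # value of each non-zero
        self.size = 0                                     # non-zeros in use

    def reset(self):
        """Clear the block in place; no array is reallocated."""
        for i in range(self.num_rows + 1):
            self.row_ptr[i] = 0
        self.size = 0

    def load(self, triples):
        """Fill from (row, col, value) triples sorted by row, then column."""
        self.reset()
        extra = len(triples) - len(self.values)
        if extra > 0:  # grow in place only when the old capacity is too small
            self.col_idx.extend([0] * extra)
            self.values.extend([0.0] * extra)
        for r, c, v in triples:
            self.col_idx[self.size] = c
            self.values[self.size] = v
            self.size += 1
            self.row_ptr[r + 1] = self.size  # end offset of row r so far
        # Empty rows inherit the end offset of the preceding row.
        for i in range(1, self.num_rows + 1):
            if self.row_ptr[i] < self.row_ptr[i - 1]:
                self.row_ptr[i] = self.row_ptr[i - 1]

    def get(self, r, c):
        """Return the value at (r, c), or 0.0 if it is not stored."""
        for k in range(self.row_ptr[r], self.row_ptr[r + 1]):
            if self.col_idx[k] == c:
                return self.values[k]
        return 0.0


def allocation_counts(non_empty_rows):
    """Count allocated containers under the stated assumptions.

    MCSR-like: one outer row array, plus one row object and two arrays
    (indexes, values) per non-empty row. CSR: three arrays in total.
    """
    return {"mcsr_like": 1 + 3 * non_empty_rows, "csr": 3}
```

With one non-zero per row, the MCSR-like count grows linearly with the
number of rows while the CSR count stays constant, which mirrors the
per-row overhead described in scenario 2. The CSR layout, however, does not
lend itself to concurrent incremental row inserts, which is the reason the
issue gives for keeping a modified MCSR for the target matrix.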