[ https://issues.apache.org/jira/browse/SYSTEMML-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias Boehm reassigned SYSTEMML-1548:
----------------------------------------

    Assignee: Matthias Boehm

> Performance ultra-sparse matrix read
> ------------------------------------
>
>                 Key: SYSTEMML-1548
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1548
>             Project: SystemML
>          Issue Type: Task
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>             Fix For: SystemML 1.0
>
>
> Reading ultra-sparse matrices performs poorly for certain data sizes and
> memory configurations because of garbage collection overhead.
> In detail, this task covers two scenarios that will be addressed
> independently:
> 1) Large heap: With a large heap, the problem is the temporarily
> deserialized sparse blocks, which are not reused because of an inefficient
> reset. This produces a lot of garbage and hence a high cost for full garbage
> collection. This will be addressed by using our CSR sparse blocks for
> ultra-sparse blocks, because CSR has a smaller memory footprint and allows
> for an efficient reset.
> 2) Small heap: With a small heap, the bottleneck is not the temporary blocks
> but the memory overhead of the target sparse matrix. The cause is a
> relatively large memory overhead per sparse row, which is not amortized if a
> row has just one or very few non-zeros. This will be addressed via a
> modification of the MCSR representation for ultra-sparse matrices. Note that
> we cannot use CSR or COO here because we want to support efficient
> multi-threaded incremental construction and subsequent operations.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
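
The two scenarios in the description can be illustrated with a small Java sketch. This is not SystemML's implementation: the class UltraSparseSketch, the nested CsrBlock, both estimate methods, and the byte constants (8 bytes per row reference, a caller-supplied per-row overhead) are hypothetical and only meant to show the idea. CsrBlock keeps all rows in three flat arrays, so reset() can clear the block while keeping the arrays allocated for the next deserialized block. The estimate methods show roughly why a fixed per-row cost dominates when each row holds only one or a few non-zeros.

```java
import java.util.Arrays;

public class UltraSparseSketch {

    /** Hypothetical minimal CSR block: three flat arrays shared by all rows. */
    public static final class CsrBlock {
        private final int[] rowPtr; // row r occupies positions [rowPtr[r], rowPtr[r+1])
        private int[] colIdx;
        private double[] values;
        private int size;           // number of stored non-zeros
        private int lastRow;        // highest row index appended so far

        public CsrBlock(int rows, int capacity) {
            rowPtr = new int[rows + 1];
            colIdx = new int[Math.max(capacity, 1)];
            values = new double[Math.max(capacity, 1)];
        }

        /** Appends one non-zero; rows must arrive in non-decreasing order. */
        public void append(int row, int col, double v) {
            if (row < lastRow || row >= rowPtr.length - 1)
                throw new IllegalArgumentException("row out of order or out of range");
            if (size == values.length) { // grow both arrays together
                colIdx = Arrays.copyOf(colIdx, 2 * size);
                values = Arrays.copyOf(values, 2 * size);
            }
            // Rows skipped since the last append are empty and start at the current end.
            for (int r = lastRow + 1; r <= row; r++)
                rowPtr[r] = size;
            colIdx[size] = col;
            values[size] = v;
            size++;
            rowPtr[row + 1] = size;
            lastRow = row;
        }

        /** Returns the stored value at (row, col), or 0.0 if none is stored. */
        public double get(int row, int col) {
            if (row > lastRow) return 0.0; // rows never appended to are empty
            for (int i = rowPtr[row]; i < rowPtr[row + 1]; i++)
                if (colIdx[i] == col) return values[i];
            return 0.0;
        }

        public int nnz() { return size; }

        public int capacity() { return values.length; }

        /** Clears the block but keeps colIdx/values allocated for reuse. */
        public void reset() {
            Arrays.fill(rowPtr, 0);
            size = 0;
            lastRow = 0;
        }
    }

    /** Rough CSR payload size: one int per row pointer, int + double per non-zero. */
    public static long estimateCsrBytes(long rows, long nnz) {
        return 4L * (rows + 1) + 12L * nnz;
    }

    /** Rough MCSR size; the 8-byte reference and perRowOverheadBytes are assumptions. */
    public static long estimateMcsrBytes(long rows, long nonEmptyRows, long nnz,
                                         long perRowOverheadBytes) {
        return 8L * rows + nonEmptyRows * perRowOverheadBytes + 12L * nnz;
    }
}
```

With an assumed overhead of 64 bytes per non-empty row, 1000 rows holding one non-zero each come to 84000 bytes under the MCSR estimate versus 16004 bytes under the CSR estimate. The 64-byte figure is an illustrative input, not a measured value. The sketch also hints at the trade-off named in the description: append() requires rows in order, which is why a flat CSR layout is awkward for multi-threaded incremental construction.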