[ https://issues.apache.org/jira/browse/SYSTEMML-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias Boehm closed SYSTEMML-1587.
------------------------------------
Resolution: Fixed
Assignee: Matthias Boehm
Fix Version/s: SystemML 1.0
> Performance ultra-sparse matrix reads
> -------------------------------------
>
> Key: SYSTEMML-1587
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1587
> Project: SystemML
> Issue Type: Task
> Reporter: Matthias Boehm
> Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> We use the MCSR (modified compressed sparse row) format by default for sparse
> and ultra-sparse matrices because it allows for efficient incremental
> construction, including multi-threaded operations. However, even with
> SYSTEMML-1548, MCSR is still too memory-inefficient, which leads to
> unnecessary garbage collection overhead.
> This task aims to read ultra-sparse matrices (e.g., permutation matrices)
> into CSR format. Since CSR does not allow for efficient incremental
> construction (with multiple unordered input streams), the approach is to use
> thread-local COO representations and finally merge them into a CSR
> representation. The temporary memory requirements are not problematic because
> size(CSR) + size(COO) < size(MCSR) for ultra-sparse matrices, and the COO
> representation can be partitioned across threads.
> Note that this change should be done in a consistent manner for all matrix
> readers (single-threaded/multi-threaded, all formats).
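
The merge step described above (thread-local COO buffers combined into a
single CSR structure) can be sketched roughly as follows. This is only an
illustration and not the SystemML implementation: the names CooToCsr,
CooBuffer, Csr, and merge are hypothetical, and the sketch assumes that each
reader thread fills its own buffer and that the merge runs single-threaded
after all readers have finished.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not SystemML code: merge per-thread COO buffers into CSR.
final class CooToCsr {

    /** Append-only COO buffer, assumed to be owned by exactly one reader thread. */
    static final class CooBuffer {
        int[] rows = new int[16];
        int[] cols = new int[16];
        double[] vals = new double[16];
        int size = 0;

        void add(int r, int c, double v) {
            if (size == rows.length) {
                // Grow all three arrays together (doubling).
                int n = size * 2;
                rows = Arrays.copyOf(rows, n);
                cols = Arrays.copyOf(cols, n);
                vals = Arrays.copyOf(vals, n);
            }
            rows[size] = r;
            cols[size] = c;
            vals[size] = v;
            size++;
        }
    }

    /** Plain CSR triple: row pointers, column indexes, values. */
    static final class Csr {
        final int[] rowPtr;
        final int[] colIdx;
        final double[] values;

        Csr(int[] rowPtr, int[] colIdx, double[] values) {
            this.rowPtr = rowPtr;
            this.colIdx = colIdx;
            this.values = values;
        }
    }

    static Csr merge(List<CooBuffer> parts, int numRows) {
        // Pass 1: count non-zeros per row across all buffers.
        int[] rowPtr = new int[numRows + 1];
        int nnz = 0;
        for (CooBuffer p : parts) {
            for (int i = 0; i < p.size; i++) {
                rowPtr[p.rows[i] + 1]++;
            }
            nnz += p.size;
        }
        // Prefix sum turns the counts into row start offsets.
        for (int r = 0; r < numRows; r++) {
            rowPtr[r + 1] += rowPtr[r];
        }
        // Pass 2: scatter each entry into the next free slot of its row.
        int[] next = Arrays.copyOf(rowPtr, numRows);
        int[] colIdx = new int[nnz];
        double[] values = new double[nnz];
        for (CooBuffer p : parts) {
            for (int i = 0; i < p.size; i++) {
                int pos = next[p.rows[i]]++;
                colIdx[pos] = p.cols[i];
                values[pos] = p.vals[i];
            }
        }
        // Pass 3: sort each row by column index. Insertion sort is adequate
        // here because rows of an ultra-sparse matrix hold very few entries.
        for (int r = 0; r < numRows; r++) {
            for (int i = rowPtr[r] + 1; i < rowPtr[r + 1]; i++) {
                int c = colIdx[i];
                double v = values[i];
                int j = i - 1;
                while (j >= rowPtr[r] && colIdx[j] > c) {
                    colIdx[j + 1] = colIdx[j];
                    values[j + 1] = values[j];
                    j--;
                }
                colIdx[j + 1] = c;
                values[j + 1] = v;
            }
        }
        return new Csr(rowPtr, colIdx, values);
    }
}
```

Because the first pass only counts entries, the CSR arrays are allocated once
at their final size, and the COO buffers can be released right after the
scatter pass. Duplicate (row, column) entries are not combined in this sketch.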