[ https://issues.apache.org/jira/browse/SYSTEMML-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Glenn Weidner updated SYSTEMML-1752:
------------------------------------
Fix Version/s: SystemML 0.15 (was: SystemML 1.0)
> Cache-conscious mmchain matrix multiply for wide matrices
> ---------------------------------------------------------
>
> Key: SYSTEMML-1752
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1752
> Project: SystemML
> Issue Type: Task
> Reporter: Matthias Boehm
> Assignee: Matthias Boehm
> Fix For: SystemML 0.15
>
>
> The fused mmchain matrix multiply for patterns such as {{t(X) %*% (w * (X %*%
> v))}} uses row-wise {{dotProduct}} and {{vectMultAdd}} operations, which
> work very well for the common case of tall and skinny matrices where
> individual rows fit into L1 cache. However, for graph and text scenarios with
> wide matrices, this leads to cache thrashing on the input and output vectors.
> This task aims to generalize these dense and sparse operations to perform the
> computation in a cache-conscious manner when necessary, by accessing
> fragments of the input and output vectors for groups of rows. For dense
> matrices this is trivial to realize, while for sparse matrices it requires a
> careful determination of the block sizes according to the input sparsity.
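
The dense case described above can be illustrated with a minimal sketch. This is
not SystemML's implementation: the function names, the list-of-lists matrix
layout, and the default block sizes (row_block, col_block) are assumptions
chosen for illustration only. The first function mirrors the row-wise
dotProduct/vectMultAdd pattern; the second processes a group of rows at a time
and touches only one fragment of v and of the output vector per inner loop.
Pure Python will not show the cache effect itself; the sketch only shows the
access pattern.

```python
def mmchain_rowwise(X, v, w):
    """Baseline: t(X) %*% (w * (X %*% v)), scanning one full row at a time."""
    n = len(X[0])
    out = [0.0] * n
    for i, row in enumerate(X):
        # dotProduct over the whole row, scaled by the row weight
        s = w[i] * sum(a * b for a, b in zip(row, v))
        # vectMultAdd over the whole row into the whole output vector
        for j in range(n):
            out[j] += s * row[j]
    return out


def mmchain_blocked(X, v, w, row_block=4, col_block=1024):
    """Same result, computed per row group over column fragments.

    row_block and col_block are hypothetical tuning parameters; the values
    here are arbitrary and not taken from SystemML.
    """
    m, n = len(X), len(X[0])
    out = [0.0] * n
    for r0 in range(0, m, row_block):
        r1 = min(r0 + row_block, m)
        # Pass 1: accumulate partial dot products, one fragment of v at a time,
        # so the fragment is reused across all rows of the group.
        tmp = [0.0] * (r1 - r0)
        for c0 in range(0, n, col_block):
            c1 = min(c0 + col_block, n)
            for i in range(r0, r1):
                row = X[i]
                acc = 0.0
                for j in range(c0, c1):
                    acc += row[j] * v[j]
                tmp[i - r0] += acc
        # Apply the row weights once the dot products are complete.
        for i in range(r0, r1):
            tmp[i - r0] *= w[i]
        # Pass 2: update one fragment of the output at a time for the group.
        for c0 in range(0, n, col_block):
            c1 = min(c0 + col_block, n)
            for i in range(r0, r1):
                row = X[i]
                s = tmp[i - r0]
                for j in range(c0, c1):
                    out[j] += s * row[j]
    return out
```

Both functions should agree on any input. The blocked variant needs two passes
per row group because the weighted dot product of a row must be complete before
any part of the output can be updated with it.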