[ https://issues.apache.org/jira/browse/SYSTEMML-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias Boehm closed SYSTEMML-1752.
------------------------------------

> Cache-conscious mmchain matrix multiply for wide matrices
> ---------------------------------------------------------
>
>                 Key: SYSTEMML-1752
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1752
>             Project: SystemML
>          Issue Type: Task
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>             Fix For: SystemML 1.0
>
>
> The fused mmchain matrix multiply for patterns such as
> {{t(X) %*% (w * (X %*% v))}} uses row-wise {{dotProduct}} and
> {{vectMultAdd}} operations. This works very well for the common case of
> tall and skinny matrices, where individual rows fit into L1 cache. However,
> for graph and text scenarios with wide matrices, it leads to cache thrashing
> on the input and output vectors.
>
> This task aims to generalize these dense and sparse operations to perform
> the computation in a cache-conscious manner when necessary, by accessing
> fragments of the input and output vectors for groups of rows. For dense
> inputs this is trivial to realize, while for sparse inputs it requires a
> careful determination of the block sizes according to the input sparsity.
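
To illustrate the dense case described above, here is a minimal sketch in plain Python. It is not SystemML's actual code: the function names `mmchain_rowwise` and `mmchain_blocked`, and the `row_block` and `col_block` parameters with their default values, are assumptions made purely for illustration. The first function mirrors the row-wise scheme (one dot product and one scaled vector add per row). The second processes a group of rows at a time and visits `v` and the output one column fragment at a time, so each fragment is reused across all rows of the group.

```python
def mmchain_rowwise(X, v, w):
    """Row-wise t(X) %*% (w * (X %*% v)) for a dense list-of-lists X."""
    n = len(v)
    out = [0.0] * n
    for i, row in enumerate(X):
        # dot product of row i with v, scaled by the weight w[i]
        s = w[i] * sum(row[j] * v[j] for j in range(n))
        # scaled vector add of row i into the output
        for j in range(n):
            out[j] += s * row[j]
    return out


def mmchain_blocked(X, v, w, row_block=4, col_block=1024):
    """Same result, but computed over row groups and column fragments.

    row_block and col_block are hypothetical tuning parameters.
    """
    m, n = len(X), len(v)
    out = [0.0] * n
    for bi in range(0, m, row_block):
        rows = range(bi, min(bi + row_block, m))

        # Step 1: partial dot products, one fragment of v at a time,
        # reusing that fragment for every row in the group.
        tmp = [0.0] * len(rows)
        for bj in range(0, n, col_block):
            hi = min(bj + col_block, n)
            for k, i in enumerate(rows):
                row = X[i]
                acc = 0.0
                for j in range(bj, hi):
                    acc += row[j] * v[j]
                tmp[k] += acc

        # Step 2: apply the per-row weights.
        for k, i in enumerate(rows):
            tmp[k] *= w[i]

        # Step 3: scaled adds, one fragment of the output at a time,
        # again reusing that fragment for every row in the group.
        for bj in range(0, n, col_block):
            hi = min(bj + col_block, n)
            for k, i in enumerate(rows):
                row = X[i]
                s = tmp[k]
                for j in range(bj, hi):
                    out[j] += s * row[j]
    return out


X = [[1, 2, 3], [4, 5, 6]]
v = [1, 0, -1]
w = [2, 3]
print(mmchain_rowwise(X, v, w))
print(mmchain_blocked(X, v, w, row_block=1, col_block=2))
```

Both calls print `[-28.0, -38.0, -48.0]` for this small input. In a real implementation the fragment size would presumably be chosen so that a fragment of `v` and of the output stays cache-resident. For sparse inputs, the issue notes that the block sizes must additionally account for the input sparsity, which this sketch does not attempt.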