[
https://issues.apache.org/jira/browse/MAHOUT-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020702#comment-14020702
]
Ted Dunning commented on MAHOUT-1574:
-------------------------------------
{quote}
(1) first, an added test is failing in jenkins build. Not sure whether
legitimately or not.
{quote}
It was a legitimate beef and a small triumph for the abstract tests.
{quote}
(2)
if (other instanceof SparseRowMatrix) {
....
is a good first step but still a bit naive. Not sure it solves Sebastian's
case. While SRM %*% SRM is definitely one of the cases, there's also a handful
of other cases (e.g. SRM %*% SCM.t which is organizationally equivalent).
{quote}
Read the rest of the code. There is handling for SRM %*% general sparse matrix
in there. It is just that when the other operand is also an SRM, there is
special handling to be had. If the transpose happens to expose sparse rows,
then this fix will give substantial speedups whenever an SRM is on the left.
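To make the row-oriented idea concrete, here is a minimal sketch of a sparse-rows times sparse-rows product. It is not the Mahout patch and does not use Mahout's classes. The class name SparseRowProduct, the helper row(), and the choice of a TreeMap per row are all assumptions made for illustration. The point is only that each non-zero A[i][k] scales row k of B, so only stored entries are visited and no get(row, col) lookup is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative stand-in for a sparse-row matrix product; not Mahout's SparseRowMatrix. */
public class SparseRowProduct {

    /**
     * Computes C = A * B where each matrix is a list of sparse rows
     * (column index -> non-zero value). B must have a row for every
     * column index that occurs in A.
     */
    static List<Map<Integer, Double>> times(List<Map<Integer, Double>> a,
                                            List<Map<Integer, Double>> b) {
        List<Map<Integer, Double>> c = new ArrayList<>();
        for (Map<Integer, Double> aRow : a) {
            Map<Integer, Double> cRow = new TreeMap<>();
            // Iterate only over the stored non-zeros of this row of A.
            for (Map.Entry<Integer, Double> aEntry : aRow.entrySet()) {
                int k = aEntry.getKey();
                double aik = aEntry.getValue();
                // Accumulate aik * (row k of B) into the output row.
                for (Map.Entry<Integer, Double> bEntry : b.get(k).entrySet()) {
                    cRow.merge(bEntry.getKey(), aik * bEntry.getValue(), Double::sum);
                }
            }
            c.add(cRow);
        }
        return c;
    }

    /** Hypothetical helper: builds a sparse row from (column, value) pairs. */
    static Map<Integer, Double> row(double... columnValuePairs) {
        Map<Integer, Double> r = new TreeMap<>();
        for (int i = 0; i < columnValuePairs.length; i += 2) {
            r.put((int) columnValuePairs[i], columnValuePairs[i + 1]);
        }
        return r;
    }

    public static void main(String[] args) {
        List<Map<Integer, Double>> a = new ArrayList<>();
        a.add(row(0, 1.0, 2, 2.0));   // dense form: [1, 0, 2]
        a.add(row(1, 3.0));           // dense form: [0, 3, 0]

        List<Map<Integer, Double>> b = new ArrayList<>();
        b.add(row(1, 4.0));           // dense form: [0, 4]
        b.add(row(0, 5.0));           // dense form: [5, 0]
        b.add(row(0, 6.0));           // dense form: [6, 0]

        System.out.println(times(a, b));
    }
}
```

The work here is proportional to the number of non-zero pairs that actually meet, rather than rows x columns x inner dimension with a lookup at every step.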
It is true that this isn't a complete cost-based solution, but it is a good
step forward for an important case. In fact, the general optimization of
matrix products is quite difficult.
It might be worth a follow-up bug that handles certain important matrix x vector
cases.
Dealing well with these will require that we let matrices expose considerably
more information about memory locality than we already do.
> SparseRowMatrix needs performance improvement for times()
> ---------------------------------------------------------
>
> Key: MAHOUT-1574
> URL: https://issues.apache.org/jira/browse/MAHOUT-1574
> Project: Mahout
> Issue Type: Bug
> Reporter: Ted Dunning
> Assignee: Ted Dunning
>
> According to ssc,
> > * SparseRowMatrix with sequential vectors times SparseRowMatrix with
> > sequential vectors is totally broken, it uses three nested loops and uses
> > get(row, col) on the matrices, which internally uses binary search...