[
https://issues.apache.org/jira/browse/MAHOUT-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966657#comment-13966657
]
Dmitriy Lyubimov commented on MAHOUT-1508:
------------------------------------------
[~ssc] i am not so sure. SRM is fastest to construct from rows and fastest to
iterate over rows (since it doesn't create vector views, and vector elements in
this case are co-located in memory.
if x cross y doesn't have efficient representation in SRM, then maybe it
shouldn't use SRM.
The real problem is that we don't have in-core algebraic matrix optimizer here
that would look at the expression as a whole. I am not sure we would need one,
but if we had, we'd be able to compute cost variations not only of the current
operation, but of subsequent, too. But since we are constrainted by the cost of
current operation only, we do what's fastest for the current operation
(one-step greedy approach). so SRM is the fastest to wrap a bunch of rows, and
so that's what blockify() uses. If subsequent operation is custom row-wise
iterator, this choice is still the best. but for others, it may not -- but we
have no way of knowing at the point we construct SRM.
More likely, we might do what we did with aggregate() and dot() calls on
vectors, i.e. make it operator-centric approach where operator chooses approach
based on underlying matrix argument structures. in that sense we could
introduce some organizational concept the same way we did for vectors. E.g. we
could have methods isRowWiseLike, isColumnWiseLike, isSparseHashLike etc. etc.
which would allow us to asses costs for certain iteration patterns.
this would strike a middle ground between full-fledged optimizer and
doing-nothing.
> SparseMatrix assign has huge performance problems
> -------------------------------------------------
>
> Key: MAHOUT-1508
> URL: https://issues.apache.org/jira/browse/MAHOUT-1508
> Project: Mahout
> Issue Type: Bug
> Components: Math
> Reporter: Sebastian Schelter
> Priority: Blocker
> Fix For: 1.0
>
>
> I'm currently working with the new Scala DSL and running into a problem with
> SparseMatrix and SparseRowMatrix.
> The problem is that they don't implement a specialized assign(Matrix other,
> DoubleDoubleFunction f) function, but use the implementation from
> AbstractMatrix which loops through all (!) possible entries.
> We have to fix this asap.
--
This message was sent by Atlassian JIRA
(v6.2#6252)