[ 
https://issues.apache.org/jira/browse/MAHOUT-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641313#comment-13641313
 ] 

Jake Mannix commented on MAHOUT-1197:
-------------------------------------

The big issue is the loop over row from 0 to size, which visits every index 
no matter how few entries are actually non-zero.

Yes, a band-aid fix would be to use iterateNonZero instead, picking out the 
indices as we go (this is what I've done in my work code, which ran into 15ms 
timings for individual v.cross(w) calls).
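
To make the idea concrete, here is a rough, self-contained sketch of a cross 
product that only walks the non-zero entries. It deliberately does not use the 
Mahout classes: the SparseCross class and the plain index -> value maps are 
stand-ins invented for illustration, and the real fix would go through the 
Vector iterator rather than a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not Mahout code: a sparse vector is modelled as a
// map from index to non-zero value, and the result as row -> (column -> value).
final class SparseCross {

    private SparseCross() {
    }

    // Outer product that visits only the stored (non-zero) entries of v and w,
    // so the work is proportional to nnz(v) * nnz(w) rather than the cardinality.
    static Map<Integer, Map<Integer, Double>> cross(
            Map<Integer, Double> v, Map<Integer, Double> w) {
        Map<Integer, Map<Integer, Double>> result = new HashMap<>();
        for (Map.Entry<Integer, Double> rowEntry : v.entrySet()) {
            double scale = rowEntry.getValue();
            if (scale == 0.0) {
                continue; // an explicitly stored zero contributes an empty row
            }
            Map<Integer, Double> row = new HashMap<>();
            for (Map.Entry<Integer, Double> colEntry : w.entrySet()) {
                double product = scale * colEntry.getValue();
                if (product != 0.0) {
                    row.put(colEntry.getKey(), product);
                }
            }
            if (!row.isEmpty()) {
                result.put(rowEntry.getKey(), row);
            }
        }
        return result;
    }
}
```

With v holding two non-zeroes (at indices 3 and 1000000) and w holding one (at 
index 7), the loops above run twice in total, independent of the nominal size 
of either vector.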
                
> AbstractVector#cross is only appropriately efficient for dense vectors
> ----------------------------------------------------------------------
>
>                 Key: MAHOUT-1197
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1197
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.6
>            Reporter: Jake Mannix
>             Fix For: 0.8
>
>
> Nobody overrides this implementation:
> [code]
>   @Override
>   public Matrix cross(Vector other) {
>     Matrix result = matrixLike(size, other.size());
>     for (int row = 0; row < size; row++) {
>       result.assignRow(row, other.times(getQuick(row)));
>     }
>     return result;
>   }
> [code]
> I think you can imagine what kind of performance this has on sparse vectors 
> (k non-zeroes) with high cardinality (N) - scales as O(N^2) instead of O(k^2).
> I think the right approach is to *not* implement this in AbstractVector at 
> all, and force concrete implementations to properly implement it performantly.
> Alternatively, killing this method entirely might be appropriate.  If anyone 
> was using it (and uses sparse vectors), they'd have complained about this by 
> now.
