[ 
https://issues.apache.org/jira/browse/MAHOUT-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159493#comment-13159493
 ] 

[email protected] commented on MAHOUT-880:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2955/#review3562
-----------------------------------------------------------



trunk/core/src/main/java/org/apache/mahout/math/hadoop/DistributedRowMatrix.java
<https://reviews.apache.org/r/2955/#comment7976>

    I'm not sure about this method: you take in a DistributedRowMatrix, which 
by design is a huge SequenceFile<IntWritable,VectorWritable>.  Why not just 
take in a Vector, put that in the DistributedCache (or even serialize it into 
the Configuration, if it's small enough), and use that?
    
    Passing in a DistributedRowMatrix makes people assume they can pass in a 
real, full matrix.
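
    To make the "serialize it into the Configuration" idea concrete, here is a 
rough sketch in plain Java. It is not code from the patch and it does not use 
the Hadoop or Mahout APIs: a Map<String, String> stands in for the job 
Configuration, a double[] stands in for a small dense Vector, and the class 
name, method names and the key "sketch.addend.vector" are all made up for 
illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

/** Sketch only: a Map<String, String> stands in for the job configuration. */
final class VectorInConfSketch {

  /** Encodes a small dense vector as Base64 text and stores it under key. */
  static void putVector(Map<String, String> conf, String key, double[] vector) {
    ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + vector.length * Double.BYTES);
    buf.putInt(vector.length);
    for (double v : vector) {
      buf.putDouble(v);
    }
    conf.put(key, Base64.getEncoder().encodeToString(buf.array()));
  }

  /** Decodes the vector again; a mapper would do this once during setup. */
  static double[] getVector(Map<String, String> conf, String key) {
    String encoded = conf.get(key);
    if (encoded == null) {
      throw new IllegalStateException("No vector stored under key " + key);
    }
    ByteBuffer buf = ByteBuffer.wrap(Base64.getDecoder().decode(encoded));
    double[] vector = new double[buf.getInt()];
    for (int i = 0; i < vector.length; i++) {
      vector[i] = buf.getDouble();
    }
    return vector;
  }

  /** Adds the side-loaded vector to one matrix row, as a map step might. */
  static double[] addToRow(double[] row, double[] vector) {
    if (row.length != vector.length) {
      throw new IllegalArgumentException("Cardinality mismatch");
    }
    double[] result = new double[row.length];
    for (int i = 0; i < row.length; i++) {
      result[i] = row[i] + vector[i];
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    putVector(conf, "sketch.addend.vector", new double[] {0.5, 0.5, 0.5});
    double[] addend = getVector(conf, "sketch.addend.vector");
    System.out.println(Arrays.toString(addToRow(new double[] {1, 2, 3}, addend)));
  }
}
```

    This only makes sense while the vector is small, since the encoded text 
travels with the job configuration to every task; for anything larger the 
DistributedCache route is probably the better fit.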



trunk/core/src/main/java/org/apache/mahout/math/hadoop/MatrixRowAverageJob.java
<https://reviews.apache.org/r/2955/#comment7977>

    This will force a huge bottleneck of one reducer, will it not?
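
    One common way to soften a single-reducer bottleneck is to pre-aggregate 
on the map side, so the lone reducer sees one partial sum per input split 
instead of one record per row. The sketch below illustrates that idea in plain 
Java rather than with the Hadoop API, and it assumes "row average" means the 
mean of all row vectors. The class and method names are hypothetical and are 
not taken from the patch.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: simulates map-side pre-aggregation for a mean of rows. */
final class RowAverageSketch {

  /** Element-wise sum of some rows plus the number of rows summed. */
  static final class Partial {
    final double[] sum;
    long count;

    Partial(int cardinality) {
      this.sum = new double[cardinality];
    }

    void addRow(double[] row) {
      for (int i = 0; i < sum.length; i++) {
        sum[i] += row[i];
      }
      count++;
    }

    void merge(Partial other) {
      for (int i = 0; i < sum.length; i++) {
        sum[i] += other.sum[i];
      }
      count += other.count;
    }

    double[] mean() {
      if (count == 0) {
        throw new IllegalStateException("No rows were aggregated");
      }
      double[] result = new double[sum.length];
      for (int i = 0; i < sum.length; i++) {
        result[i] = sum[i] / count;
      }
      return result;
    }
  }

  /** Map/combine side: collapses one input split into a single partial. */
  static Partial combineSplit(List<double[]> rows, int cardinality) {
    Partial partial = new Partial(cardinality);
    for (double[] row : rows) {
      partial.addRow(row);
    }
    return partial;
  }

  /** Reduce side: merges one partial per split, then divides by the count. */
  static double[] reduce(List<Partial> partials, int cardinality) {
    Partial total = new Partial(cardinality);
    for (Partial partial : partials) {
      total.merge(partial);
    }
    return total.mean();
  }

  public static void main(String[] args) {
    List<double[]> split1 = Arrays.asList(new double[] {1, 2}, new double[] {3, 4});
    List<double[]> split2 = Arrays.asList(new double[] {5, 6});
    List<Partial> partials = Arrays.asList(combineSplit(split1, 2), combineSplit(split2, 2));
    System.out.println(Arrays.toString(reduce(partials, 2)));
  }
}
```

    Carrying the row count alongside the sum is what lets partials be merged 
in any order before the final division; the reducer is still a single task, 
but its input shrinks to one record per split.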



trunk/core/src/main/java/org/apache/mahout/math/hadoop/MatrixRowAverageJob.java
<https://reviews.apache.org/r/2955/#comment7978>

    I think we already have a VectorSummingReducer somewhere; we should re-use 
that.


- Jake


On 2011-11-29 18:44:49, Raphael Cendrillon wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2955/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-11-29 18:44:49)
bq.  
bq.  
bq.  Review request for Ted Dunning, Jake Mannix and Sebastian Schelter.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Jobs for matrix-vector addition, covariance matrix calculation and row average calculation in DistributedRowMatrix
bq.  
bq.  
bq.  This addresses bug MAHOUT-880.
bq.      https://issues.apache.org/jira/browse/MAHOUT-880
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/core/src/main/java/org/apache/mahout/math/hadoop/DistributedRowMatrix.java 1206431 
bq.    trunk/core/src/main/java/org/apache/mahout/math/hadoop/MatrixCovarianceJob.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/math/hadoop/MatrixRowAverageJob.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/math/hadoop/MatrixVectorAdditionJob.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/math/hadoop/TestDistributedRowMatrix.java 1206431 
bq.  
bq.  Diff: https://reviews.apache.org/r/2955/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Junit tests for each job
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Raphael
bq.  
bq.


                
> Add some matrix method(like addition, subtraction, norm ... etc) to 
> DistributedRowMatrix
> ----------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-880
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-880
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Math
>    Affects Versions: 0.6
>            Reporter: Wangda Tan
>            Priority: Minor
>              Labels: DistributedRowMatrix
>         Attachments: MAHOUT-880.patch
>
>
> I'm new to Mahout, and I couldn't find some basic matrix functions. Without 
> them, users cannot do many tasks through the CLI or API: if a user gets a 
> result from an existing map-reduce matrix operation (like SVD), they cannot 
> take further steps. Here is a list:
> 1) Addition, subtraction 
> 2) Norm (like norm-1, norm-2, Frobenius norm)
> 3) Matrix comparison
> 4) Get the lower triangle, upper triangle and diagonal
> 5) Get identity and zero matrices
> 6) Put two or more matrices together: A = [A1, A2]
> 7) More linear equation solver methods, like Gaussian elimination (maybe 
> hard to implement)
> 8) Import and export of CSV, ARFF ... (this would be very useful when a user 
> wants to reuse results from or in other applications like MATLAB)
> I'd like to know whether there is any plan to do this; if so, I can put some 
> effort into implementing these.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
