[jira] [Commented] (MAHOUT-1578) Optimizations in matrix serialization

ASF GitHub Bot (JIRA) Wed, 11 Jun 2014 22:35:25 -0700

    [ 
https://issues.apache.org/jira/browse/MAHOUT-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028844#comment-14028844
 ]


ASF GitHub Bot commented on MAHOUT-1578:
----------------------------------------

Github user dlyubimov commented on a diff in the pull request:

    https://github.com/apache/mahout/pull/16#discussion_r13688078
  
    --- Diff: mrlegacy/src/main/java/org/apache/mahout/math/MatrixWritable.java ---
    @@ -113,27 +113,28 @@ public static Matrix readMatrix(DataInput in) throws IOException {
         int rows = in.readInt();
         int columns = in.readInt();
     
    +    byte vectorFlags = in.readByte();
    +
         Matrix matrix;
    +
         if (dense) {
           matrix = new DenseMatrix(rows, columns);
    -    } else {
    -      if (isSparseRowMatrix) {
    -        matrix = new SparseRowMatrix(rows, columns, sequential);
    -      } else {
    -        matrix = new SparseMatrix(rows, columns);
    +      for (int row = 0; row < rows; row++) {
    +        matrix.assignRow(row, VectorWritable.readVector(in, vectorFlags, columns));
           }
    -    }
    -
    -    if (dense || isSparseRowMatrix) {
    +    } else if (isSparseRowMatrix) {
    +      Vector[] rowVectors = new Vector[rows];
           for (int row = 0; row < rows; row++) {
    -        matrix.assignRow(row, VectorWritable.readVector(in));
    --- End diff --
    
    Is that where the major problem was? Is it because assigning a
    sequential vector to a sequential vector is that slow, or is this an
    assignment of a random-access vector to a sequential vector? (Sequential
    to sequential should actually be OK, methinks.)
    
    Anyway, I don't see any immediate problems, and the quality of your work
    usually doesn't require deep scrutiny, so I'd say ship it. Actually, the
    sooner the better, because I am very close to giving it all a good
    spanking.
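
    To make the access-pattern question concrete, here is a toy sketch of why
    insertion order can matter for a sorted-array sparse vector. It is not
    Mahout code: `ToySequentialVector` and `shiftsWhenInserting` are
    hypothetical names, and the sorted parallel-array layout is an assumption
    about how a sequential-access vector might be stored. It only counts how
    many elements must be shifted per insertion order.

```java
import java.util.Arrays;

public class SequentialInsertSketch {

  /** Toy sparse vector that keeps its indices sorted in parallel arrays (assumed layout). */
  static final class ToySequentialVector {
    private int[] indices = new int[4];
    private double[] values = new double[4];
    private int size = 0;
    long shifted = 0; // total number of elements moved to make room for inserts

    void set(int index, double value) {
      int pos = Arrays.binarySearch(indices, 0, size, index);
      if (pos >= 0) {          // index already present: overwrite in place
        values[pos] = value;
        return;
      }
      int insertAt = -pos - 1; // position that keeps the indices sorted
      if (size == indices.length) {
        indices = Arrays.copyOf(indices, size * 2);
        values = Arrays.copyOf(values, size * 2);
      }
      int toMove = size - insertAt; // elements after the insertion point
      System.arraycopy(indices, insertAt, indices, insertAt + 1, toMove);
      System.arraycopy(values, insertAt, values, insertAt + 1, toMove);
      shifted += toMove;
      indices[insertAt] = index;
      values[insertAt] = value;
      size++;
    }
  }

  /** Inserts the given indices in order and reports how many elements were shifted. */
  static long shiftsWhenInserting(int[] order) {
    ToySequentialVector vector = new ToySequentialVector();
    for (int index : order) {
      vector.set(index, 1.0);
    }
    return vector.shifted;
  }
}
```

    In this toy model, inserting indices in ascending order never shifts
    anything, while out-of-order inserts shift existing entries each time.
    Whether that is what made the original code slow is exactly the open
    question in the comment.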


> Optimizations in matrix serialization
> -------------------------------------
>
>                 Key: MAHOUT-1578
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1578
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>            Reporter: Sebastian Schelter
>             Fix For: 1.0
>
>
> MatrixWritable contains inefficient code in a few places:
>  
>  * type and size are stored with every vector, although they are the same 
> for every vector
>  * in some places vectors are added to the matrix via assign() where we 
> could directly use the instance
>  
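
The two points in the description can be sketched with plain `java.io` streams. This is a minimal illustration under stated assumptions, not the MatrixWritable format: `SharedHeaderSketch`, `FLAG_DENSE`, and the byte layout are hypothetical, and rows are modeled as `double[]` rather than Mahout vectors. The first writer repeats a flags byte and a size for every row; the second writes them once, and the reader stores each freshly read row directly instead of copying it into a preallocated matrix.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SharedHeaderSketch {

  // Hypothetical flag value standing in for "dense vector"; not Mahout's encoding.
  static final byte FLAG_DENSE = 1;

  /** Baseline layout: every row repeats its own flags byte and size. */
  static byte[] writePerRowHeader(double[][] rows) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(rows.length);
      for (double[] row : rows) {
        out.writeByte(FLAG_DENSE);
        out.writeInt(row.length);
        for (double value : row) {
          out.writeDouble(value);
        }
      }
    }
    return bytes.toByteArray();
  }

  /** Shared layout: flags and column count are written once for the whole matrix. */
  static byte[] writeSharedHeader(double[][] rows, int columns) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(rows.length);
      out.writeInt(columns);
      out.writeByte(FLAG_DENSE);
      for (double[] row : rows) {
        for (double value : row) {
          out.writeDouble(value);
        }
      }
    }
    return bytes.toByteArray();
  }

  /** Reads the shared layout and keeps each row instance as read, without a second copy. */
  static double[][] readSharedHeader(byte[] data) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int rows = in.readInt();
      int columns = in.readInt();
      byte vectorFlags = in.readByte();
      if (vectorFlags != FLAG_DENSE) {
        throw new IOException("unexpected vector flags: " + vectorFlags);
      }
      double[][] matrix = new double[rows][];
      for (int row = 0; row < rows; row++) {
        double[] values = new double[columns];
        for (int column = 0; column < columns; column++) {
          values[column] = in.readDouble();
        }
        matrix[row] = values; // use the instance directly instead of assigning element by element
      }
      return matrix;
    }
  }
}
```

In this toy layout the shared header saves five bytes per row (one flags byte plus a four-byte size) at a one-time cost of five bytes for the matrix-level header.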



--
This message was sent by Atlassian JIRA
(v6.2#6252)
