[jira] Commented: (MAHOUT-6) Need a matrix implementation

Ted Dunning (JIRA) Mon, 25 Feb 2008 18:12:45 -0800

    [ 
https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572341#action_12572341
 ]


Ted Dunning commented on MAHOUT-6:
----------------------------------


Yes!

The ability to do destructive operations via views is critical to almost any
decomposition algorithm (QR, LU, Lanczos).

The author of Colt made a persuasive case that mutation by views is
critical for performance without a HUGE API.  Take, for instance, the common
operation of zeroing out a column.  With a Colt-style (mutable view) API,
this is done as:

    A.viewColumn(n).assign(0)

Zeroing out a row is done this way:

    A.viewRow(n).assign(0)

But what about adding a vector to a particular row?

    A.viewRow(n).assign(v, Function.plus)

Or zeroing out a sub-matrix:

    A.viewBlock(tl, br, width, height).assign(0)

If you don't have these mutable views, one of two things happens.

Either:

- the programmer calls set() a LOT, resulting in really, really slow code that
the optimizer can't handle,

Or

- the API becomes multiplicatively larger because every common mutation
(setting to zero, incrementing by a constant, adding a vector and so on)
gets multiplied by the number of kinds of pieces (row, column, sub-matrix)
that you want to work on.  In fact, it is a good idea to factor out the kind
of mutation as well, just as Colt does.
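
To make the factoring concrete, here is a minimal Java sketch of the idea. It is not Colt's or Mahout's actual API: the names VectorView, DenseMatrix, strided and of are hypothetical, storage is assumed to be a row-major double[], and Double::sum stands in for the plus function used in the snippets above. The point is only that a view selects which piece to touch, assign selects what to do to it, and every write lands in the original matrix.

```java
import java.util.Arrays;
import java.util.function.DoubleBinaryOperator;

/** Hypothetical mutable vector view; reads and writes go through to the backing storage. */
interface VectorView {
    int size();
    double get(int i);
    void set(int i, double value);

    /** Set every element of the view to a constant. */
    default VectorView assign(double value) {
        for (int i = 0; i < size(); i++) set(i, value);
        return this;
    }

    /** Element-wise update: x[i] = f(x[i], other[i]). */
    default VectorView assign(VectorView other, DoubleBinaryOperator f) {
        if (other.size() != size()) throw new IllegalArgumentException("size mismatch");
        for (int i = 0; i < size(); i++) set(i, f.applyAsDouble(get(i), other.get(i)));
        return this;
    }

    /** Wrap a plain array as a view (illustrative helper). */
    static VectorView of(double... values) {
        return new VectorView() {
            public int size() { return values.length; }
            public double get(int i) { return values[i]; }
            public void set(int i, double v) { values[i] = v; }
        };
    }
}

/** Hypothetical dense matrix stored row-major in a single array. */
final class DenseMatrix {
    final int rows, cols;
    final double[] data;

    DenseMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    double get(int r, int c) { return data[r * cols + c]; }
    void set(int r, int c, double v) { data[r * cols + c] = v; }

    /** A row is a contiguous run; a column is a run with stride = cols. */
    VectorView viewRow(int r) { return strided(r * cols, 1, cols); }
    VectorView viewColumn(int c) { return strided(c, cols, rows); }

    /** One strided view covers both cases; no data is copied. */
    private VectorView strided(int offset, int stride, int n) {
        return new VectorView() {
            public int size() { return n; }
            public double get(int i) { return data[offset + i * stride]; }
            public void set(int i, double v) { data[offset + i * stride] = v; }
        };
    }
}

class MutableViewDemo {
    public static void main(String[] args) {
        DenseMatrix a = new DenseMatrix(3, 3);
        for (int r = 0; r < 3; r++)
            for (int c = 0; c < 3; c++)
                a.set(r, c, r * 3 + c + 1);           // fill with 1..9

        a.viewColumn(1).assign(0);                     // zero out a column
        a.viewRow(2).assign(VectorView.of(10, 20, 30), Double::sum); // add a vector to a row

        // prints [1.0, 0.0, 3.0, 4.0, 0.0, 6.0, 17.0, 20.0, 39.0]
        System.out.println(Arrays.toString(a.data));
    }
}
```

With this shape, two assign methods and two view methods cover four operations, and adding a new kind of view or a new kind of mutation adds one method rather than a whole family of them.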





> Need a matrix implementation
> ----------------------------
>
>                 Key: MAHOUT-6
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-6
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Ted Dunning
>         Attachments: MAHOUT-6a.diff, MAHOUT-6b.diff, MAHOUT-6c.diff, 
> MAHOUT-6d.diff
>
>
> We need matrices for Mahout.
> An initial set of basic requirements includes:
> a) sparse and dense support are required
> b) row and column labels are important
> c) serialization for hadoop use is required
> d) reasonable floating point performance is required, but awesome FP is not
> e) the API should be simple enough to understand
> f) it should be easy to carve out sub-matrices for sending to different 
> reducers
> g) a reasonable set of matrix operations should be supported, these should 
> eventually include:
>     simple matrix-matrix and matrix-vector and matrix-scalar linear algebra 
> operations, A B, A + B, A v, A + x, v + x, u + v, dot(u, v)
>     row and column sums  
>     generalized level 2 and 3 BLAS primitives, alpha A B + beta C and A u + 
> beta v
> h) easy and efficient iteration constructs, especially for sparse matrices
> i) easy to extend with new implementations

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
