This is also a project called HAMA  <http://wiki.apache.org/hama/>.  
Perhaps your efforts can help further (or coordinate with) HAMA?

Best regards,
Avram


 
On Wednesday, December 09, 2009, at 10:05AM, "Ted Dunning" 
<[email protected]> wrote:
>See here:
>
>http://arxiv.org/abs/0909.4061v1
>
>On Tue, Dec 8, 2009 at 6:47 PM, Zhengguo 'Mike' SUN
><[email protected]>wrote:
>
>> Hi Jake,
>>
>> I am implementing the classical multiplicative update rule of NMF. The
>> matrix to be factorized is really big and sparse. Are you suggesting that I
>> can use some specialised algorithms for sparse matrix instead of the
>> standard multiplication algorithm? But what algorithms are you referring to?
>> Could you please provide some pointers?
>>
>> Thanks
>>
>>
>>
>>
>> ________________________________
>> From: Jake Mannix <[email protected]>
>> To: [email protected]
>> Sent: Tue, December 8, 2009 8:53:39 PM
>> Subject: Re: Fwd: A MapReduce Algorithm for Matrix Multiplication
>>
>> Hi Mike,
>>
>>   In the NMF work you're doing, the matrices you're multiplying are very
>> thin, right?  They're a handful of dense vectors on the left, an equal
>> number of dense vectors on the right, and you're only computing the values
>> in the outer product which coincide with the nonzero entries of the
>> (sparse)
>> input matrix?  In which case you're not really doing the full matrix
>> multiplication, and doing this a slightly specialized way should provide
>> some serious speedup.
>>
>>   Unless you're not dealing with the "really big, but sparse" case, and
>> minimizing error only over the nonzero inputs?
>>
>>   -jake
>>
>> On Tue, Dec 8, 2009 at 5:43 PM, Zhengguo 'Mike' SUN
>> <[email protected]>wrote:
>>
>> > Hi Ted, John,
>> >
>> > I am also interested in matrix multiplication. Actually, I am
>> implementing
>> > Non-negative Matrix Factorization, which basically is iterations of
>> matrix
>> > multiplication. What I did is exactly as what Ted said: doing an outer
>> > product using one column from left matrix and one row from right matrix
>> and
>> > summing up all the results. Idealy, this could be done using only one
>> job:
>> > mappers doing multiplication and reducers doing summing. But the thing is
>> > that how do you match a column from left matrix and a row from right
>> matrix.
>> > Assuming that two input matrices are give in column major order and row
>> > major order, it seems to me that you have to implement a customized
>> > InputFormat to do the matching. Or maybe you could use one matrix as
>> input
>> > and read the other inside mappers, but this also has problem when you
>> have a
>> > lot of mappers. The straightforward way seems to be using two jobs. The
>> > mappers of the first job emit the id and the content of the row or column
>> it
>> >  reads and the reducers do the multiplication. The second job is to do
>> the
>> > summing. This is kind of similar to what John did. But it is inefficient
>> if
>> > you have many iterations. Any comments?
>> >
>> >
>> >
>> >
>> > ________________________________
>> > From: Ted Dunning <[email protected]>
>> > To: [email protected]; mahout-user <
>> [email protected]
>> > >
>> > Sent: Tue, December 8, 2009 2:59:06 PM
>> > Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication
>> >
>> > John,
>> >
>> > Your comment about matrix multiplication was forwarded to the mahout-user
>> > emailing list.
>> >
>> > I have a question for you about your approach.  Have you considered doing
>> > the multiplication in a single step by storing the the first matrix in
>> > column major order and the second in row major order?  The idea would be
>> to
>> > do a block-wise outer product and use the combiner to sum elements ahead
>> of
>> > the reducer.  This could be done by reading a (block) column from the
>> left
>> > matrix and a (block) row from the second and emitting all of the (block)
>> > elements of the outer product of this column and row.  The key emitted
>> with
>> > these blocks would be the i,j index of the final location of each block
>> in
>> > the final product.  The combiner and reducer could add all of the blocks
>> > together and the use of the combiner would serve to substantially
>> minimize
>> > the amount of network traffic.
>> >
>> > This alternative is equivalent to inverting the loop nesting in a
>> > conventional matrix multiplication algorithm.  It is also very closely
>> > related to the way that cooccurrence counting is typically done in
>> Hadoop.
>> >
>> > Is your project an ongoing effort?
>> >
>> > ---------- Forwarded message ----------
>> > From: Robin Anil <[email protected]>
>> > Date: Tue, Dec 8, 2009 at 11:09 AM
>> > Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication
>> > To: [email protected]
>> > Cc: John Norstad <[email protected]>
>> >
>> >
>> > ---------- Forwarded message ----------
>> > From: John Norstad <[email protected]>
>> > Date: Tue, Dec 8, 2009 at 10:14 PM
>> > Subject: A MapReduce Algorithm for Matrix Multiplication
>> > To: MapReduce <[email protected]>
>> >
>> >
>> > As an exercise while learning MapReduce, I developed an algorithm for
>> > matrix multiplication and wrote it up on my web site. If you're
>> > interested, it's at:
>> >
>> >  http://homepage.mac.com/j.norstad/matrix-multiply
>> >
>> > I present the algorithm in English, as pseudo-code, and as Java source
>> > code for Hadoop.
>> >
>> > --
>> > John Norstad
>> > Academic and Research Technologies
>> > Northwestern University
>> >
>> >
>> >
>> > --
>> > Ted Dunning, CTO
>> > DeepDyve
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>>
>
>
>
>
>-- 
>Ted Dunning, CTO
>DeepDyve
>
>

Reply via email to