Hi,
I'm new to Spark and Hadoop, and I'd like to know if the following
problem is solvable in terms of Spark's primitives.
To compute the K-nearest neighbours of an N-dimensional dataset, I can
multiply my very large normalized sparse matrix by its transpose. As
this yields all pairwise distance
max(nnz(L)*log p, nnz(L)*n/p). I
have to warn you, though: when I played with matrix multiplication, I was getting
nowhere near serial performance.
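
For reference, here is a minimal single-machine sketch of the idea in the question: multiply the row-normalized sparse matrix by its transpose and keep the K largest entries per row. It is plain Python, not Spark, and it rests on assumptions that are not in the original message: "normalized" is taken to mean unit L2 norm per row (so the products are cosine similarities rather than distances), rows are stored as `{column: value}` dicts, and the names `normalize_rows` and `top_k_neighbours` are made up for illustration.

```python
import heapq
import math
from collections import defaultdict


def normalize_rows(rows):
    """Scale each sparse row (dict: column -> value) to unit L2 norm."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row.values()))
        out.append({c: v / norm for c, v in row.items()} if norm else {})
    return out


def top_k_neighbours(rows, k):
    """Return, per row, up to k (other_row, similarity) pairs, best first.

    Only pairs sharing at least one non-zero column are ever touched,
    which is what makes the sparse product cheaper than the dense one.
    """
    unit = normalize_rows(rows)

    # Column index: column -> [(row_id, value)]. This plays the role of
    # the transpose in the product A * A^T.
    by_col = defaultdict(list)
    for i, row in enumerate(unit):
        for c, v in row.items():
            by_col[c].append((i, v))

    result = []
    for i, row in enumerate(unit):
        sims = defaultdict(float)
        for c, v in row.items():
            for j, w in by_col[c]:
                if j != i:  # skip the trivial self-match
                    sims[j] += v * w
        result.append(heapq.nlargest(k, sims.items(), key=lambda kv: kv[1]))
    return result


# Toy data, invented for this example: 4 rows, 3 columns.
rows = [
    {0: 1.0, 1: 1.0},
    {0: 1.0, 1: 0.9},
    {2: 1.0},
    {1: 1.0, 2: 0.1},
]
neighbours = top_k_neighbours(rows, k=2)
for i, nbrs in enumerate(neighbours):
    print(i, [(j, round(s, 4)) for j, s in nbrs])
```

The column index is the part that would have to become a shuffle or join in a distributed setting, since every row needs to meet every other row that shares a non-zero column; the sketch says nothing about how well that step performs on a cluster.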
On Wed, May 28, 2014 at 11:00 AM, Christian Jauvin cjau...@gmail.com
wrote:
Hi,
I'm new to Spark and Hadoop, and I'd like to know if the following