Hey Shannon,

  The whole usage of temp directories in DistributedRowMatrix is...
"confused" at best.  It really needs to be better managed.  But of the
two approaches to the current problem,

-Modify DistributedRowMatrix.times() so "productWith" is instead
"productWith" + (System.nanoTime() & 0xFF) .

I think I like this approach, yes.
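
For concreteness, here is a rough sketch of just the naming part. The
class and the withNanoSuffix helper are made up for illustration and are
not anything in Mahout; how the resulting name gets joined onto the tmp
base path is left out, since that depends on the existing times() code.

```java
// Sketch only: TmpNameSketch and withNanoSuffix are hypothetical names,
// not part of DistributedRowMatrix.
final class TmpNameSketch {

    // Appends the low 8 bits of System.nanoTime() to the base name,
    // so the suffix is always an integer in the range 0..255.
    static String withNanoSuffix(String base) {
        return base + (System.nanoTime() & 0xFF);
    }

    public static void main(String[] args) {
        // Prints "productWith" followed by a number between 0 and 255.
        System.out.println(withNanoSuffix("productWith"));
    }
}
```

Note that the 0xFF mask leaves only 256 possible suffixes, so this makes a
collision between two products much less likely but does not rule it out.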

  -jake

On Thu, Jun 17, 2010 at 1:35 PM, Shannon Quinn <[email protected]> wrote:

> Hi all,
>
> I'm attempting to carry out the multiplication of two DistributedRowMatrix
> objects. It's similar to an SVD multiplication in that I have a matrix A and
> a matrix B, and the operation looks like BAB, or:
>
> B.times(A.times(B))
>
> When I run it, I get a FileAlreadyExistsException from Hadoop, since for
> both A and B the result of outputTmpBasePath.getParent() is the same (A is
> read from input, and B is calculated from A). Which would be a better
> approach:
>
> -Modify DistributedRowMatrix.times() so "productWith" is instead
> "productWith" + (System.nanoTime() & 0xFF) .
> -Change how the matrices are initialized and avoid the problem from within
> the Driver.
>
> I only ask because transpose() employs the first technique, and I don't
> know how common (or uncommon) this particular situation is that might
> warrant the first change I mentioned.
>
> Thank you!
>
> Regards,
> Shannon
>
