Re: Discussion Of ML environment/MR, Mahout

Sean Owen Wed, 13 Mar 2013 03:00:36 -0700

On Wed, Mar 13, 2013 at 2:04 AM, Dmitriy Lyubimov <[email protected]> wrote:


> Yeah. The stuck point for me is page-rankish-finding stationary
> distributions and extremely popular ALS based stuff. We've beaten the heck
> out of it a year ago and Sebastian conclusively stated Giraph ALS knocks
> the socks off MR version. Add to that a bisect search for a good
>

This keeps being said, but I thought Sebastian just said that the much
slower M/R version he mentioned was a different one, since deleted from
this project? See my other email. The current version is similar to the
one I just benchmarked, which appeared to be about as fast as GraphLab
(though it is still not clear whether the same amount of work is being
compared).

This matches my hunch that these things are about the same, modulo some
extra disk I/O, which is not most of the runtime.

I point it out in case this is underpinning many people's logic for
rebuilding a bunch of stuff because it will be a *lot* faster. Surely some
stuff can be done more naturally in a graph paradigm, but not everything,
or even most of it? I'm worried about that conclusion because of cases
like this.
