Does it provide a search for the optimum fit, i.e. the regularization rate?
On Mar 19, 2013 8:10 AM, "Sebastian Schelter" <[email protected]> wrote:

> Played a little more with the code, it works astonishingly well. I was
> totally off in my expectations.
>
> I was able to run an iteration of ALS (two map-only jobs) on the Yahoo
> Songs dataset (700M interactions) in less than 2 minutes.
>
>
> On 14.03.2013 17:02, Sean Owen wrote:
> > On Wed, Mar 13, 2013 at 7:41 PM, Sebastian Schelter <[email protected]> wrote:
> >
> >> Hadoop has to reschedule every iteration as a separate job, reread the
> >> input data from disk, and write the iteration's result to HDFS. In fact,
> >> an ALS iteration always does each of these twice, as it needs two M/R
> >> jobs. GraphLab/Giraph/Stratosphere, on the other hand, have to do none
> >> of these three things (GraphLab doesn't even do synchronous iterations),
> >> and I highly doubt that a Hadoop implementation can reach comparable
> >> performance.
> >>
> >
> > That's all true but would you imagine I/O is 97.5% of the run-time? A
> > 100-feature vector is 400 bytes, but to compute an update you need to
> > invert a 100x100 matrix. I can't see the former taking 40x longer than
> > the latter. That's why I bet you'll find the current implementation is
> > nothing like 40x slower.
> >
> > 2x? maybe. And 2x is nothing to sneeze at!
> >
>
>
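
For what it's worth, the back-of-the-envelope comparison quoted above (400 bytes per 100-feature vector versus the work of a 100x100 solve) can be sketched in a few lines of Python. This is only a rough illustration, not a measurement of any implementation: the function names are made up for this example, 4-byte floats are assumed, and the ~(2/3)n^3 figure is the textbook estimate for solving a dense system by Gaussian elimination. It also ignores that a real ALS update reads one vector per observed interaction.

```python
# Rough arithmetic only; function names are hypothetical, not from any library.

def feature_vector_bytes(num_features, bytes_per_value=4):
    """Size of one dense feature vector, assuming 4-byte floats."""
    return num_features * bytes_per_value


def dense_solve_flops(num_features):
    """Textbook estimate (~2/3 * n^3) for one dense n x n solve."""
    return 2 * num_features ** 3 // 3


n = 100
vector_bytes = feature_vector_bytes(n)
solve_flops = dense_solve_flops(n)

# Floating-point operations per byte of a single feature vector.
flops_per_byte = solve_flops / vector_bytes

print("bytes per vector:", vector_bytes)
print("flops per solve (estimate):", solve_flops)
print("flops per vector byte:", round(flops_per_byte))
```

Changing `n` shows how quickly the solve cost grows relative to the vector size, since the former is cubic and the latter linear in the number of features.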
