I would recommend against reusing a mutable object, on maintenance grounds.

A better approach is to keep track of the threshold that a new score must
meet and only construct the object when needed. That cuts the allocation
down to negligible levels.
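
To sketch what I mean (illustrative names only; TopNCollector and
ScoredItem are stand-ins, not Mahout classes, for however the top-N
candidates are actually gathered):

import java.util.PriorityQueue;

public class TopNCollector {

    static final class ScoredItem {
        final long itemId;
        final double score;
        ScoredItem(long itemId, double score) {
            this.itemId = itemId;
            this.score = score;
        }
    }

    private final int n;
    // Min-heap: the head is the weakest of the current top-N,
    // which doubles as the admission threshold.
    private final PriorityQueue<ScoredItem> heap;

    TopNCollector(int n) {
        this.n = n;
        this.heap = new PriorityQueue<>(n,
            (a, b) -> Double.compare(a.score, b.score));
    }

    void offer(long itemId, double score) {
        if (heap.size() < n) {
            heap.add(new ScoredItem(itemId, score));
        } else if (score > heap.peek().score) {
            // Only candidates that beat the current threshold are
            // ever allocated; everything below it is rejected on a
            // primitive comparison alone.
            heap.poll();
            heap.add(new ScoredItem(itemId, score));
        }
    }
}

The key point is that the comparison against heap.peek() happens before
any allocation, so rejected candidates cost nothing and the garbage is
roughly bounded by the size of the final top-N list.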

On Wed, Mar 6, 2013 at 6:11 AM, Sean Owen <[email protected]> wrote:

> OK, that's reasonable on 35 machines. (You could probably turn it up to 70
> reducers, as most machines can handle 2 reducers at once.)
> I think the recommendation step loads one whole matrix into memory. You're
> not running out of memory, but if you're turning up the heap size to
> accommodate it, you might be hitting swap, yes. I think (?) the
> conventional wisdom is to turn off swap for Hadoop.
>
> Sebastian, yes, that is probably a good optimization; I've had good results
> reusing a mutable object in this context.
>
>
> On Wed, Mar 6, 2013 at 10:54 AM, Josh Devins <[email protected]> wrote:
>
> > The factorization at 2 hours is kind of a non-issue (certainly fast
> > enough). It was run with (if I recall correctly) 30 reducers across a
> > 35-node cluster, with 10 iterations.
> >
> > I was a bit shocked at how long the recommendation step took and will
> > throw some timing debug in to see where the problem lies exactly. There
> > were no other jobs running on the cluster during these attempts, but it's
> > certainly possible that something is swapping or the like. I'll be looking
> > more closely today before I start to consider other options for
> > calculating the recommendations.
> >
> >
>