Got sidetracked today but I'll run Sebastian's version in trunk tomorrow
and report back.


On 6 March 2013 17:07, Sebastian Schelter <[email protected]> wrote:

> I already committed a fix in that direction. I modified our
> FixedSizePriorityQueue to allow inspection of its head for direct
> comparison. This obviates the need to instantiate a Comparable and offer
> it to the queue.
>
> /s
>
>
> On 06.03.2013 17:01, Ted Dunning wrote:
> > I would recommend against a mutable object on maintenance grounds.
> >
> > Better is to keep the threshold that a new score must meet and only
> > construct the object on need.  That cuts the allocation down to
> > negligible levels.
> >
> > On Wed, Mar 6, 2013 at 6:11 AM, Sean Owen <[email protected]> wrote:
> >
> >> OK, that's reasonable on 35 machines. (You can turn up to 70 reducers,
> >> probably, as most machines can handle 2 reducers at once).
> >> I think the recommendation step loads one whole matrix into memory.
> >> You're not running out of memory but if you're turning up the heap size to
> >> accommodate, you might be hitting swapping, yes. I think (?) the
> >> conventional wisdom is to turn off swap for Hadoop.
> >>
> >> Sebastian yes that is probably a good optimization; I've had good
> >> results reusing a mutable object in this context.
> >>
> >>
> >> On Wed, Mar 6, 2013 at 10:54 AM, Josh Devins <[email protected]> wrote:
> >>
> >>> The factorization at 2-hours is kind of a non-issue (certainly fast
> >>> enough). It was run with (if I recall correctly) 30 reducers across a
> >>> 35 node cluster, with 10 iterations.
> >>>
> >>> I was a bit shocked at how long the recommendation step took and will
> >>> throw some timing debug in to see where the problem lies exactly. There
> >>> were no other jobs running on the cluster during these attempts, but
> >>> it's certainly possible that something is swapping or the like. I'll be
> >>> looking more closely today before I start to consider other options for
> >>> calculating the recommendations.
> >>>
> >>>
> >>
> >
>
>
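
For reference, the idea discussed in the quoted messages (compare a new
score against the current threshold first, and only construct an object when
the score qualifies) can be sketched roughly as below. This is a minimal
illustration that assumes a plain java.util.PriorityQueue as the bounded
min-heap; it is not the FixedSizePriorityQueue code committed to trunk, and
the names TopKSketch, ScoredItem and topK are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class TopKSketch {

    /** Hypothetical holder for a scored item; built only when a score qualifies. */
    static final class ScoredItem implements Comparable<ScoredItem> {
        final long itemId;
        final double score;

        ScoredItem(long itemId, double score) {
            this.itemId = itemId;
            this.score = score;
        }

        @Override
        public int compareTo(ScoredItem other) {
            return Double.compare(score, other.score);
        }
    }

    /** Returns the k highest-scoring items, best first; the array index is used as the item id. */
    static List<ScoredItem> topK(double[] scores, int k) {
        List<ScoredItem> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        // Min-heap: the head is the weakest item currently kept, i.e. the threshold.
        PriorityQueue<ScoredItem> heap = new PriorityQueue<>(k);
        for (int i = 0; i < scores.length; i++) {
            double score = scores[i];
            if (heap.size() < k) {
                // Queue not full yet, so every candidate is kept.
                heap.add(new ScoredItem(i, score));
            } else if (score > heap.peek().score) {
                // Inspect the head directly; allocate only for scores that beat it.
                heap.poll();
                heap.add(new ScoredItem(i, score));
            }
            // Otherwise the score is below the threshold and nothing is allocated.
        }
        result.addAll(heap);
        Collections.sort(result, Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        for (ScoredItem item : topK(new double[] {0.1, 0.9, 0.4, 0.7}, 2)) {
            System.out.println(item.itemId + " " + item.score);
        }
    }
}
```

Once the queue is full, most candidate scores fall below the head and are
rejected with a single primitive comparison, so allocation only happens for
the few scores that actually enter the top k.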
