That sounds way too long. How is the U matrix stored? What type?

On Wed, Mar 6, 2013 at 6:44 AM, Josh Devins <[email protected]> wrote:
> First bit of feedback. The `M.forEachPair` loop is about 1600-1800 millis
> per user (recall the size is ~2.6M users x ~2.8M items). There doesn't
> appear to be any out of the ordinary GC going on (yet). Going to look at
> optimising this loop a bit and see where I can get. Definitely time-boxing
> this though ;)
>
> On 6 March 2013 12:16, Sebastian Schelter <[email protected]> wrote:
>
> > Btw, all important jobs in ALS are map-only, so it's the number of map
> > slots that counts.
> >
> > On 06.03.2013 12:11, Sean Owen wrote:
> > > OK, that's reasonable on 35 machines. (You can turn up to 70 reducers,
> > > probably, as most machines can handle 2 reducers at once).
> > > I think the recommendation step loads one whole matrix into memory.
> > > You're not running out of memory but if you're turning up the heap
> > > size to accommodate, you might be hitting swapping, yes. I think (?)
> > > the conventional wisdom is to turn off swap for Hadoop.
> > >
> > > Sebastian yes that is probably a good optimization; I've had good
> > > results reusing a mutable object in this context.
> > >
> > > On Wed, Mar 6, 2013 at 10:54 AM, Josh Devins <[email protected]> wrote:
> > >
> > >> The factorization at 2 hours is kind of a non-issue (certainly fast
> > >> enough). It was run with (if I recall correctly) 30 reducers across a
> > >> 35 node cluster, with 10 iterations.
> > >>
> > >> I was a bit shocked at how long the recommendation step took and will
> > >> throw some timing debug in to see where the problem lies exactly.
> > >> There were no other jobs running on the cluster during these
> > >> attempts, but it's certainly possible that something is swapping or
> > >> the like. I'll be looking more closely today before I start to
> > >> consider other options for calculating the recommendations.
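
For illustration, here is a rough sketch of the "reuse a mutable object" idea mentioned in the quoted thread. It is plain Java, not Mahout's actual code, and it does not use `M.forEachPair`. It assumes the item features sit in a dense `double[][]` and that `n >= 1`. The names `TopNScorer`, `Scored`, and `topN` are hypothetical. The loop scores every item for one user and keeps the best `n` in a min-heap, recycling the evicted entry instead of allocating one object per item.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Mahout.
final class TopNScorer {

    /** Mutable (item, score) holder that is recycled by the scoring loop. */
    static final class Scored {
        int item;
        double score;

        Scored set(int item, double score) {
            this.item = item;
            this.score = score;
            return this;
        }
    }

    /** Returns the n highest-scoring items for one user, best first. Assumes n >= 1. */
    static List<Scored> topN(double[] userFeatures, double[][] itemFeatures, int n) {
        Comparator<Scored> byScore = Comparator.comparingDouble((Scored s) -> s.score);
        // Min-heap: the weakest of the current top n is always at the head.
        PriorityQueue<Scored> heap = new PriorityQueue<>(n, byScore);

        for (int item = 0; item < itemFeatures.length; item++) {
            double[] f = itemFeatures[item];
            double score = 0.0;
            for (int k = 0; k < userFeatures.length; k++) {
                score += userFeatures[k] * f[k]; // dot product of user and item features
            }
            if (heap.size() < n) {
                // Only the first n items allocate a holder.
                heap.add(new Scored().set(item, score));
            } else if (score > heap.peek().score) {
                // Recycle the evicted holder instead of allocating a new one.
                Scored recycled = heap.poll();
                heap.add(recycled.set(item, score));
            }
        }

        List<Scored> result = new ArrayList<>(heap);
        result.sort(byScore.reversed());
        return result;
    }
}
```

With this shape, at most `n` holder objects are allocated per user regardless of the number of items. Whether allocation is really the bottleneck in the loop measured above is not established in the thread, so timing before and after would be needed to confirm it.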
