Re: Failed stages and dropped executors when running implicit matrix factorization/ALS

2015-08-21 Thread Ravi Mody
re the ALS settings, e.g., number of blocks, rank and
> >> > number of iterations, as well as number of users/items in your
> >> > dataset?
> >> > 2. If you monitor the progress in the WebUI, how much data is stored
> >> > in memory and how much data is

Re: Failed stages and dropped executors when running implicit matrix factorization/ALS

2015-06-26 Thread Ravi Mody
t 11:26 AM, Xiangrui Meng wrote:
> Please see my comments inline. It would be helpful if you can attach
> the full stack trace. -Xiangrui
>
> On Fri, Jun 26, 2015 at 7:18 AM, Ravi Mody wrote:
> > 1. These are my settings:
> > rank = 100
> > iterations = 12
> > user

Re: Failed stages and dropped executors when running implicit matrix factorization/ALS

2015-06-26 Thread Ravi Mody
Forgot to mention: rank of 100 usually works ok, 120 consistently cannot finish.

On Fri, Jun 26, 2015 at 10:18 AM, Ravi Mody wrote:
> 1. These are my settings:
> rank = 100
> iterations = 12
> users = ~20M
> items = ~2M
> training examples = ~500M-1B (I'm running into t

Re: Failed stages and dropped executors when running implicit matrix factorization/ALS

2015-06-26 Thread Ravi Mody
a is stored
> > in memory and how much data is shuffled per iteration?
> > 3. Do you have enough disk space for the shuffle files?
> > 4. Did you set checkpointDir in SparkContext and checkpointInterval in ALS?
> >
> > Best,
> > Xiangrui

Failed stages and dropped executors when running implicit matrix factorization/ALS

2015-06-19 Thread Ravi Mody
Hi, I'm running implicit matrix factorization/ALS in Spark 1.3.1 on fairly large datasets (1+ billion input records). As I grow my dataset I often run into issues with a lot of failed stages and dropped executors, ultimately leading to the whole application failing. The errors are like "org.apache.

Re: Implicit matrix factorization returning different results between spark 1.2.0 and 1.3.0

2015-05-07 Thread Ravi Mody
n Wed, May 6, 2015 at 12:29 PM, Ravi Mody wrote:
> Whoops I just saw this thread, it got caught in my spam filter. Thanks for
> looking into this Xiangrui and Sean.
>
> The implicit situation does seem fairly complicated to me. The cost
> function (not including the regularization te

Re: Implicit matrix factorization returning different results between spark 1.2.0 and 1.3.0

2015-05-06 Thread Ravi Mody
>>>>> is a good change. In 1.2, we multiply lambda by the number of ratings in
> >>>>> each sub-problem. This makes it "scale-invariant" for explicit
> >>>>> feedback. However, in implicit feedback model, a user's sub-problem
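
To illustrate the quoted remark about multiplying lambda by the number of ratings in each sub-problem, here is a minimal sketch in plain Python. It is not Spark's implementation: it assumes a rank-1 factor model so that one user's regularized least-squares sub-problem has a closed-form scalar solution, and the function name, the ratings and the item factor values are all made up for illustration.

```python
# Illustrative only: a rank-1, single-user sub-problem, NOT Spark's ALS code.
# We minimize sum_i (r_i - x * y_i)^2 + lam_eff * x^2 over the user factor x,
# whose closed-form solution is x = sum(r_i * y_i) / (sum(y_i^2) + lam_eff).

def solve_user_factor(ratings, item_factors, lam, scale_by_count):
    """Solve one user's rank-1 sub-problem (hypothetical helper)."""
    n = len(ratings)
    # "1.2-style" scaling as described in the thread: lambda * number of ratings.
    lam_eff = lam * n if scale_by_count else lam
    num = sum(r * y for r, y in zip(ratings, item_factors))
    den = sum(y * y for y in item_factors) + lam_eff
    return num / den

# Made-up data for one user.
ratings = [4.0, 2.0, 5.0]
item_factors = [0.8, 0.3, 1.1]
lam = 0.1

# Duplicate every observation to mimic a dataset that is twice as large.
ratings_x2 = ratings * 2
item_factors_x2 = item_factors * 2

x_scaled = solve_user_factor(ratings, item_factors, lam, True)
x_scaled_x2 = solve_user_factor(ratings_x2, item_factors_x2, lam, True)
x_plain = solve_user_factor(ratings, item_factors, lam, False)
x_plain_x2 = solve_user_factor(ratings_x2, item_factors_x2, lam, False)

print("lambda * n:", x_scaled, x_scaled_x2)   # same factor at both data sizes
print("plain lambda:", x_plain, x_plain_x2)   # factor shifts as data grows
```

With lambda scaled by the rating count, duplicating the data leaves the solved factor unchanged, which is one way to read "scale-invariant" for explicit feedback. With an unscaled lambda, the same lambda regularizes less as the number of ratings grows.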

Implicit matrix factorization returning different results between spark 1.2.0 and 1.3.0

2015-03-26 Thread Ravi Mody
After upgrading to 1.3.0, ALS.trainImplicit() has been returning vastly smaller factors (and hence scores). For example, the first few product's factor values in 1.2.0 are (0.04821, -0.00674, -0.0325). In 1.3.0, the first few factor values are (2.535456E-8, 1.690301E-8, 6.99245E-8). This differenc