Yep, much better with 0.1. "The best model was trained with rank = 12 and lambda = 0.1, and numIter = 20, and its RMSE on the test set is 0.869092" (Spark 1.3.0)
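For readers comparing the RMSE figures in this thread, here is a minimal sketch of how such a number is computed and one way to read it. The ratings below are made up for illustration, and the `rmse` helper and the mean-rating baseline are assumptions of this sketch, not code from the tutorial.

```python
import math

def rmse(pairs):
    """Root mean squared error over (predicted, actual) rating pairs."""
    pairs = list(pairs)
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))

# Made-up star ratings, for illustration only (not MovieLens data).
actual = [5.0, 3.0, 4.0, 1.0, 2.0]
model_preds = [4.5, 3.5, 3.0, 2.0, 2.0]

# Naive baseline: always predict the mean rating.
# (In practice the mean would come from the training set.)
mean_rating = sum(actual) / len(actual)
baseline_preds = [mean_rating] * len(actual)

model_rmse = rmse(zip(model_preds, actual))
baseline_rmse = rmse(zip(baseline_preds, actual))

print("model RMSE    = %f" % model_rmse)
print("baseline RMSE = %f" % baseline_rmse)
```

RMSE is expressed in the same units as the ratings, so a value near 0.87 corresponds to a typical error of somewhat under one star, with large misses weighted more heavily than small ones. Comparing against a naive baseline like the one above is one way to judge whether a given value is good for a particular dataset.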
Question: What is the intuition behind an RMSE of 0.86 vs. 1.3? I know that smaller is better, but how much better is it? And what is a good number for a recommendation engine?

Cheers
<k/>

On Tue, Feb 24, 2015 at 1:03 AM, Guillaume Charhon <guilla...@databerries.com> wrote:

> I am using Spark 1.2.1.
>
> Thank you Krishna, I am getting almost the same results as you, so it must
> be an error in the tutorial. Xiangrui, I ran some additional tests with
> lambda set to 0.1 and I am getting a much better RMSE:
>
> RMSE (validation) = 0.868981 for the model trained with rank = 8, lambda = 0.1, and numIter = 10.
> RMSE (validation) = 0.869628 for the model trained with rank = 8, lambda = 0.1, and numIter = 20.
> RMSE (validation) = 1.361321 for the model trained with rank = 8, lambda = 1.0, and numIter = 10.
> RMSE (validation) = 1.361321 for the model trained with rank = 8, lambda = 1.0, and numIter = 20.
> RMSE (validation) = 3.755870 for the model trained with rank = 8, lambda = 10.0, and numIter = 10.
> RMSE (validation) = 3.755870 for the model trained with rank = 8, lambda = 10.0, and numIter = 20.
> RMSE (validation) = 0.866605 for the model trained with rank = 12, lambda = 0.1, and numIter = 10.
> RMSE (validation) = 0.867498 for the model trained with rank = 12, lambda = 0.1, and numIter = 20.
> RMSE (validation) = 1.361321 for the model trained with rank = 12, lambda = 1.0, and numIter = 10.
> RMSE (validation) = 1.361321 for the model trained with rank = 12, lambda = 1.0, and numIter = 20.
> RMSE (validation) = 3.755870 for the model trained with rank = 12, lambda = 10.0, and numIter = 10.
> RMSE (validation) = 3.755870 for the model trained with rank = 12, lambda = 10.0, and numIter = 20.
>
> The best model was trained with rank = 12, lambda = 0.1, and numIter = 10,
> and its RMSE on the test set is 0.865407.
>
> On Tue, Feb 24, 2015 at 7:23 AM, Xiangrui Meng <men...@gmail.com> wrote:
>
>> Try to set lambda to 0.1. -Xiangrui
>>
>> On Mon, Feb 23, 2015 at 3:06 PM, Krishna Sankar <ksanka...@gmail.com> wrote:
>> > The RMSE varies a little bit between the versions.
>> > I partitioned the training, validation, and test sets like so:
>> >
>> > training = ratings_rdd_01.filter(lambda x: (x[3] % 10) < 6)
>> > validation = ratings_rdd_01.filter(lambda x: (x[3] % 10) >= 6 and (x[3] % 10) < 8)
>> > test = ratings_rdd_01.filter(lambda x: (x[3] % 10) >= 8)
>> >
>> > Validation MSE:
>> >
>> > # 1.3.0 Mean Squared Error = 0.871456869392
>> > # 1.2.1 Mean Squared Error = 0.877305629074
>> >
>> > Itertools results:
>> >
>> > 1.3.0 - RMSE = 1.354839 (rank = 8, lambda = 1.0, and numIter = 20)
>> > 1.1.1 - RMSE = 1.335831 (rank = 8, lambda = 1.0, and numIter = 10)
>> >
>> > Cheers
>> > <k/>
>> >
>> > On Mon, Feb 23, 2015 at 12:37 PM, Xiangrui Meng <men...@gmail.com> wrote:
>> >>
>> >> Which Spark version did you use? Btw, there are three datasets from
>> >> MovieLens. The tutorial used the medium one (1 million). -Xiangrui
>> >>
>> >> On Mon, Feb 23, 2015 at 8:36 AM, poiuytrez <guilla...@databerries.com> wrote:
>> >> > What do you mean?
>> >> >
>> >> > --
>> >> > View this message in context:
>> >> > http://apache-spark-user-list.1001560.n3.nabble.com/Movie-Recommendation-tutorial-tp21769p21771.html
>> >> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>> >> >
>> >> > ---------------------------------------------------------------------
>> >> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> >> > For additional commands, e-mail: user-h...@spark.apache.org
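
The timestamp-based split and the parameter sweep discussed in this thread can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `split_by_timestamp` and `pick_best` are hypothetical helpers, ratings are assumed to be `(user, movie, rating, timestamp)` tuples, and the validation RMSEs reported above for Spark 1.2.1 stand in for actually training an ALS model.

```python
import itertools

def split_by_timestamp(ratings):
    """Split (user, movie, rating, timestamp) tuples 60/20/20 on timestamp % 10."""
    training = [r for r in ratings if (r[3] % 10) < 6]
    validation = [r for r in ratings if 6 <= (r[3] % 10) < 8]
    test = [r for r in ratings if (r[3] % 10) >= 8]
    return training, validation, test

def pick_best(ranks, lambdas, num_iters, evaluate):
    """Return (rmse, rank, lambda, numIter) with the lowest validation RMSE.

    `evaluate` is a hypothetical callback; in a real run it would train
    a model with the given parameters and return its validation RMSE.
    """
    best = None
    for rank, lmbda, num_iter in itertools.product(ranks, lambdas, num_iters):
        score = evaluate(rank, lmbda, num_iter)
        if best is None or score < best[0]:
            best = (score, rank, lmbda, num_iter)
    return best

# Validation RMSEs as reported in the thread for Spark 1.2.1.
reported = {
    (8, 0.1, 10): 0.868981, (8, 0.1, 20): 0.869628,
    (8, 1.0, 10): 1.361321, (8, 1.0, 20): 1.361321,
    (8, 10.0, 10): 3.755870, (8, 10.0, 20): 3.755870,
    (12, 0.1, 10): 0.866605, (12, 0.1, 20): 0.867498,
    (12, 1.0, 10): 1.361321, (12, 1.0, 20): 1.361321,
    (12, 10.0, 10): 3.755870, (12, 10.0, 20): 3.755870,
}

best = pick_best([8, 12], [0.1, 1.0, 10.0], [10, 20],
                 lambda rank, lmbda, num_iter: reported[(rank, lmbda, num_iter)])
print("best (rmse, rank, lambda, numIter):", best)

# Made-up ratings with timestamps 0..19, only to show the split proportions.
sample = [(1, 1, 4.0, ts) for ts in range(20)]
training, validation, test = split_by_timestamp(sample)
print(len(training), len(validation), len(test))
```

In the reported numbers, lambda dominates the outcome: rank and numIter barely change the validation RMSE once lambda is fixed, which is consistent with the advice to set lambda to 0.1.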