Hi,

In the current factorization flow, we cross-validate on the test dataset using RMSE, but there are other measures worth looking into.

If the ratings 1-5 are treated as 5 classes rather than as regression targets, it is possible to generate a confusion matrix using MulticlassMetrics.scala. If the ratings are only 0/1 (as in the Spotify demo from Spark Summit), then it is possible to use BinaryClassificationMetrics to come up with the ROC curve.

For top-K user/product recommendations, we should also look into prec@k and pdcg@k as metrics.

Does it make sense to add the multiclass metrics and prec@k, pdcg@k to examples.MovieLensALS along with RMSE?

Thanks.
Deb
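
P.S. To make two of the proposed measures concrete, here is a minimal sketch in plain Python. It is not the MLlib API, only an illustration of the arithmetic. The function names (round_to_rating, confusion_matrix, precision_at_k) are hypothetical. Rounding a predicted score to the nearest rating is an assumption about how the 5 classes would be obtained, and prec@k is assumed to mean hits in the top k divided by k.

```python
import math
from collections import Counter


def round_to_rating(score, lo=1, hi=5):
    """Map a real-valued predicted rating to the nearest integer class,
    clamped to [lo, hi]. Halves round up (assumption, not from MLlib)."""
    return max(lo, min(hi, math.floor(score + 0.5)))


def confusion_matrix(actual, predicted, labels):
    """Return a matrix where rows are actual labels and columns are
    predicted labels, both ordered as in `labels`."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def precision_at_k(recommended, relevant, k):
    """Fraction of the first k recommended items that are relevant.
    Divides by k even when fewer than k items were recommended."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


# Made-up illustrative data, not results from any real run.
actual_ratings = [5, 3, 4, 1, 5]
predicted_scores = [4.6, 3.2, 2.9, 1.4, 3.7]
predicted_ratings = [round_to_rating(s) for s in predicted_scores]

matrix = confusion_matrix(actual_ratings, predicted_ratings, [1, 2, 3, 4, 5])
for row in matrix:
    print(row)

# Top-3 precision for one user: ranked recommendations vs. held-out items.
print(precision_at_k(["a", "b", "c", "d"], {"b", "d", "x"}, 3))
```

In the ALS example the same idea would apply per user: rank the products by predicted score, take the top k, and compare them against the products the user actually rated highly in the test split.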