The standard metric for recommenders is mean average precision, and RankingMetrics already computes that as-is. I'm not sure a confusion matrix for this binary classification adds much.
On Thu, Oct 30, 2014 at 9:41 PM, Debasish Das <debasish.da...@gmail.com> wrote:

> I am working on it. I will open a JIRA once I see some results.
>
> The idea is to build a train/test split based on users: for each user, we
> take 80% of the data as train and 20% as test.
>
> Then we pick a K (each user could have a different K, e.g. some multiplier
> on the number of movies he watched), compute the topK for each user, and
> look at the per-user confusion matrix.
>
> This data will also go to RankingMetrics, I think: one array is the ground
> truth and the other is our prediction. I would like to see the raw
> confusion counts as well.
>
> These measures are also necessary to validate any of the topic-modeling
> algorithms.
>
> Is there a better place for this than the mllib examples?
>
> On Thu, Oct 30, 2014 at 8:13 AM, Debasish Das <debasish.da...@gmail.com>
> wrote:
>
>> I thought topK would save us: for each user we have a 1 x rank vector,
>> and the movie factors are an RDD. We pick the topK movie factors based on
>> vector norm; with K = 50, that is 50 vectors * num_executors in an RDD.
>> With the 1 x rank user vector we do a distributed dot product using the
>> RowMatrix APIs.
>>
>> Maybe we can't find the topK using the vector norm on movie factors,
>> though.
>>
>> On Thu, Oct 30, 2014 at 1:12 AM, Nick Pentreath <nick.pentre...@gmail.com>
>> wrote:
>>
>>> Looking at
>>> https://github.com/apache/spark/blob/814a9cd7fabebf2a06f7e2e5d46b6a2b28b917c2/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L82
>>>
>>> For each user in the test set, you generate an Array of the top K
>>> predicted item ids (Int or String, probably) and an Array of ground
>>> truth item ids (the known rated or liked items in the test set for that
>>> user), and pass those to precisionAt(k) to compute MAP@k. (Actually the
>>> method name is a bit misleading; it should be meanAveragePrecisionAt,
>>> where the other method there is without a cutoff at k. However, both
>>> compute MAP.)
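[The metric being passed around here can be sketched in a few lines of plain Python. This is a hedged illustration of the average-precision quantity under discussion, not the MLlib source; function names are mine.]

```python
def average_precision_at(pred, labels, k):
    """AP@k for one user: pred is the ranked list of predicted item ids,
    labels is the set of ground-truth (relevant) item ids."""
    hits = 0
    score = 0.0
    for i, item in enumerate(pred[:k]):
        if item in labels:
            hits += 1
            score += hits / (i + 1)  # precision at this cut-off position
    return score / min(len(labels), k) if labels else 0.0


def mean_average_precision_at(users, k):
    """MAP@k: mean of per-user AP@k over (pred, labels) pairs."""
    return sum(average_precision_at(p, l, k) for p, l in users) / len(users)
```

For example, a user with predictions [1, 2, 3] and ground truth {1, 3} gets AP@3 = (1/1 + 2/3) / 2 = 5/6; averaging these per-user scores over the test set gives MAP@k.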
>>> The challenge at scale is actually computing all the top Ks for each
>>> user, as it requires broadcasting all the item factors (unless there is
>>> a smarter way?).
>>>
>>> I wonder if it is possible to extend the DIMSUM idea to a top-K matrix
>>> multiply between the user and item factor matrices, as opposed to
>>> all-pairs similarity of one matrix?
>>>
>>> On Thu, Oct 30, 2014 at 5:28 AM, Debasish Das <debasish.da...@gmail.com>
>>> wrote:
>>>
>>>> Is there an example of how to use RankingMetrics?
>>>>
>>>> Let's take the user/document example: the model gives us user x topic
>>>> and document x topic matrices.
>>>>
>>>> Now for each user, we can generate the topK documents by sorting
>>>> (1 x topic) dot (topic x document) and picking the topK.
>>>>
>>>> Is it possible to validate such a topK-finding algorithm using
>>>> RankingMetrics?
>>>>
>>>> On Wed, Oct 29, 2014 at 12:14 PM, Xiangrui Meng <men...@gmail.com>
>>>> wrote:
>>>>
>>>>> Let's narrow the context from matrix factorization to recommendation
>>>>> via ALS. It adds extra complexity if we treat it as a multi-class
>>>>> classification problem: ALS only outputs a single value for each
>>>>> prediction, which is hard to convert to a probability distribution
>>>>> over the 5 rating levels. Treating it as a binary classification
>>>>> problem or a ranking problem does make sense. RankingMetrics is in
>>>>> master. Feel free to add prec@k and ndcg@k to examples.MovielensALS.
>>>>> ROC would be good to add as well. -Xiangrui
>>>>>
>>>>> On Wed, Oct 29, 2014 at 11:23 AM, Debasish Das
>>>>> <debasish.da...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> In the current factorization flow, we cross-validate on the test
>>>>>> dataset using the RMSE number, but there are some other measures
>>>>>> worth looking into.
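[The per-user scoring step being debated above, i.e. the part that forces broadcasting the item factors, boils down to a dot product against every item factor followed by a top-K cut. A hedged single-machine sketch; the names `user_factor` and `item_factors` are illustrative, not MLlib API.]

```python
def dot(a, b):
    """Dot product of two factor vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def top_k(user_factor, item_factors, k):
    """item_factors: dict of item id -> factor vector. Returns the k item
    ids with the highest predicted score (user . item) for this user."""
    scored = sorted(item_factors.items(),
                    key=lambda kv: dot(user_factor, kv[1]),
                    reverse=True)
    return [item for item, _ in scored[:k]]
```

In the distributed setting this sort runs per user over all items, which is why each worker needs every item factor in hand (the broadcast Nick mentions) unless something like a top-K DIMSUM-style multiply can prune candidates first.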
>>>>>> If we consider the problem as a regression problem and treat the
>>>>>> ratings 1-5 as 5 classes, it is possible to generate a confusion
>>>>>> matrix using MulticlassMetrics.scala.
>>>>>>
>>>>>> If the ratings are only 0/1 (like in the Spotify demo from Spark
>>>>>> Summit), then it is possible to use BinaryClassificationMetrics to
>>>>>> come up with the ROC curve.
>>>>>>
>>>>>> For topK user/products we should also look into prec@k and ndcg@k as
>>>>>> the metrics.
>>>>>>
>>>>>> Does it make sense to add the multiclass metric and prec@k, ndcg@k
>>>>>> in examples.MovielensALS along with RMSE?
>>>>>>
>>>>>> Thanks.
>>>>>> Deb

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
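[For the 0/1-ratings case raised in the thread, the raw confusion counts behind an ROC point can be sketched as below. A hedged plain-Python illustration of what BinaryClassificationMetrics tallies at one threshold, not its implementation.]

```python
def confusion(scores, labels, threshold):
    """Binary confusion counts (tp, fp, tn, fn) for 0/1 labels, with each
    predicted score thresholded into a 0/1 prediction."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1          # predicted positive, actually positive
        elif pred == 1 and y == 0:
            fp += 1          # predicted positive, actually negative
        elif pred == 0 and y == 0:
            tn += 1          # predicted negative, actually negative
        else:
            fn += 1          # predicted negative, actually positive
    return tp, fp, tn, fn
```

Sweeping the threshold and plotting the resulting (fp-rate, tp-rate) pairs traces the ROC curve that the thread proposes adding alongside RMSE.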