The pretty standard metric for recommenders is mean average precision,
and RankingMetrics will already do that as-is. I don't know that a
confusion matrix for this binary classification does much.
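For reference, the quantity RankingMetrics reports can be sketched in a few lines of plain Python (hypothetical helper names; this mirrors the mean-average-precision formula per user, not the Spark API itself):

```python
def average_precision(pred, labels):
    """Average precision of one user's ranked predictions.

    `pred` is the ranked list of recommended item ids, `labels` the
    ground-truth relevant items for that user. Precision is accumulated
    at each rank where a hit occurs, then normalized by the number of
    ground-truth items."""
    label_set = set(labels)
    if not label_set:
        return 0.0
    hits, score = 0, 0.0
    for i, p in enumerate(pred):
        if p in label_set:
            hits += 1
            score += hits / (i + 1)
    return score / len(label_set)

def mean_average_precision(pairs):
    """MAP: the mean of per-user average precision over (pred, labels) pairs."""
    return sum(average_precision(p, l) for p, l in pairs) / len(pairs)
```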


On Thu, Oct 30, 2014 at 9:41 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> I am working on it...I will open up a JIRA once I see some results..
>
> Idea is to come up with a test train set based on users...basically for
> each user, we come up with 80% train data and 20% test data...
>
> Now we pick a K (each user should have a different K based on the number
> of movies he watched, times some multiplier) and then we get the topK for
> each user and look at the confusion matrix for each user...
>
> This data will also go to RankingMetrics I think...one is ground truth
> array and the other is our prediction...I would like to see the raw
> confusions as well..
>
> These measures are necessary to validate any of the topic modeling
> algorithms as well...
>
> Is there a better place for it other than mllib examples ?
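The per-user 80/20 split described above might look like this in plain Python (a sketch with a hypothetical `per_user_split` helper, not Spark code; every user contributes to both the train and test sets):

```python
import random

def per_user_split(ratings_by_user, train_frac=0.8, seed=42):
    """Split each user's rated items into train/test, per user.

    `ratings_by_user` maps user id -> iterable of rated item ids. Each
    user's items are shuffled and cut at `train_frac`, so the same user
    appears in both sets."""
    rng = random.Random(seed)
    train, test = {}, {}
    for user, items in ratings_by_user.items():
        items = list(items)
        rng.shuffle(items)
        cut = max(1, int(len(items) * train_frac))
        train[user], test[user] = items[:cut], items[cut:]
    return train, test
```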
>
> On Thu, Oct 30, 2014 at 8:13 AM, Debasish Das <debasish.da...@gmail.com>
> wrote:
>
>> I thought topK will save us...for each user we have 1xrank...now our movie
>> factor is a RDD...we pick topK movie factors based on vector norm...with K
>> = 50, we will have 50 vectors * num_executors in a RDD...with the user
>> 1xrank we do a distributed dot product using RowMatrix APIs...
>>
>> Maybe we can't find topK using vector norm on movie factors...
>>
>> On Thu, Oct 30, 2014 at 1:12 AM, Nick Pentreath <nick.pentre...@gmail.com>
>> wrote:
>>
>>> Looking at
>>> https://github.com/apache/spark/blob/814a9cd7fabebf2a06f7e2e5d46b6a2b28b917c2/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L82
>>>
>>> For each user in test set, you generate an Array of top K predicted item
>>> ids (Int or String probably), and an Array of ground truth item ids (the
>>> known rated or liked items in the test set for that user), and pass that to
>>> precisionAt(k) to compute MAP@k (Actually this method name is a bit
>>> misleading - it should be meanAveragePrecisionAt, the other method
>>> there being the version without a cutoff at k. However, both compute MAP).
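For comparison, plain precision@k (what the name `precisionAt` suggests on its face) can be sketched as follows (a hypothetical helper in plain Python, not the Spark method itself):

```python
def precision_at_k(pred, labels, k):
    """Fraction of the first k ranked predictions that appear in the
    ground-truth set: ordinary precision@k, with no rank weighting."""
    if k <= 0:
        return 0.0
    label_set = set(labels)
    hits = sum(1 for p in pred[:k] if p in label_set)
    return hits / k
```

Note how this differs from average precision: hits at any rank within the cutoff count equally, whereas MAP rewards hits near the top.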
>>>
>>> The challenge at scale is actually computing all the top Ks for each
>>> user, as it requires broadcasting all the item factors (unless there is a
>>> smarter way?)
>>>
>>> I wonder if it is possible to extend the DIMSUM idea to computing top K
>>> matrix multiply between the user and item factor matrices, as opposed to
>>> all-pairs similarity of one matrix?
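The brute-force alternative to a DIMSUM-style top-K multiply - broadcasting the item factors and scoring every (user, item) pair - can be sketched like this (plain Python with hypothetical names, ignoring the distributed part):

```python
import heapq

def top_k_items(user_factors, item_factors, k):
    """For each user vector, dot-product against every item vector (as if
    the item-factor matrix were broadcast to each worker) and keep the K
    highest-scoring item ids, best first."""
    recs = {}
    for user, uf in user_factors.items():
        scores = ((sum(a * b for a, b in zip(uf, vf)), item)
                  for item, vf in item_factors.items())
        recs[user] = [item for _, item in heapq.nlargest(k, scores)]
    return recs
```

This is O(users x items x rank), which is exactly the cost that makes the broadcast approach painful at scale.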
>>>
>>> On Thu, Oct 30, 2014 at 5:28 AM, Debasish Das <debasish.da...@gmail.com>
>>> wrote:
>>>
>>>> Is there an example of how to use RankingMetrics ?
>>>>
>>>> Let's take the user, document example...we get user x topic and
>>>> document x topic matrices as the model...
>>>>
>>>> Now for each user, we can generate topK document by doing a sort on (1 x
>>>> topic)dot(topic x document) and picking topK...
>>>>
>>>> Is it possible to validate such a topK finding algorithm using
>>>> RankingMetrics ?
>>>>
>>>>
>>>> On Wed, Oct 29, 2014 at 12:14 PM, Xiangrui Meng <men...@gmail.com>
>>>> wrote:
>>>>
>>>> > Let's narrow the context from matrix factorization to recommendation
>>>> > via ALS. It adds extra complexity if we treat it as a multi-class
>>>> > classification problem. ALS only outputs a single value for each
>>>> > prediction, which is hard to convert to probability distribution over
>>>> > the 5 rating levels. Treating it as a binary classification problem or
>>>> > a ranking problem does make sense. The RankingMetrics is in master.
>>>> > Feel free to add prec@k and ndcg@k to examples.MovielensALS. ROC
>>>> > should be good to add as well. -Xiangrui
>>>> >
>>>> >
>>>> > On Wed, Oct 29, 2014 at 11:23 AM, Debasish Das <
>>>> debasish.da...@gmail.com>
>>>> > wrote:
>>>> > > Hi,
>>>> > >
>>>> > > In the current factorization flow, we cross validate on the test
>>>> > > dataset using the RMSE number but there are some other measures
>>>> > > which are worth looking into.
>>>> > >
>>>> > > If we consider the problem as a regression problem and the ratings
>>>> > > 1-5 are considered as 5 classes, it is possible to generate a
>>>> > > confusion matrix using MultiClassMetrics.scala
>>>> > >
>>>> > > If the ratings are only 0/1 (like in the Spotify demo from Spark
>>>> > > Summit) then it is possible to use BinaryClassificationMetrics to
>>>> > > come up with the ROC curve...
>>>> > >
>>>> > > For topK user/products we should also look into prec@k and ndcg@k
>>>> > > as the metrics..
>>>> > >
>>>> > > Does it make sense to add the multiclass metric and prec@k, ndcg@k
>>>> > > in examples.MovielensALS along with RMSE ?
>>>> > >
>>>> > > Thanks.
>>>> > > Deb
>>>> >
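For the multiclass view raised at the start of the thread, the per-rating confusion matrix could be tabulated like this (a plain-Python sketch of what a MultiClassMetrics-style evaluation would produce, not Spark code):

```python
def confusion_matrix(actual, predicted, classes):
    """CxC confusion matrix: rows are actual classes, columns are
    predicted classes, entries are counts. For 1-5 star ratings,
    classes would be [1, 2, 3, 4, 5]."""
    idx = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        m[idx[a]][idx[p]] += 1
    return m
```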
>>>>
>>>
>>>
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
