Here's a better way to describe my point of view: there is a list of use
cases you would like to implement with your recommender, and some of these
are about the psychology and actions of when and why people push the
button. Then there is a list of features available in the (medium-rich)
recommender class suite. Together these form your classic matrix of
use cases vs. features.

The tools are composable. The contract around the tool APIs is
somewhat loose, and different tools have different interpretations.
The features on the side of the matrix are often non-intuitive
combinations of tools rather than individual tools.

There is a learning curve here, and I would like it to be something other
than "Ask Ted". The Herlocker et al. paper is really helpful. This paper by
some of the same crew is about explaining the recommendation scores to the user:

http://www.grouplens.org/papers/pdf/explain-CSCW.pdf



On Tue, Dec 28, 2010 at 6:26 AM, Alan Said <[email protected]> wrote:
> There's a very nice paper by Herlocker et al. - "Evaluating Collaborative 
> Filtering Recommender Systems" which describes different aspects of 
> evaluation. Recommended reading if you're interested in the topic.
>
> PDF available here:
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.5270&rep=rep1&type=pdf
>
> --
> ***************************************
> M.Sc.(Eng.) Alan Said
> Competence Center Information Retrieval & Machine Learning
> Technische Universität Berlin / DAI-Lab
> Sekr. TEL 14 Ernst-Reuter-Platz 7
> 10587 Berlin / Germany
> Phone:  0049 - 30 - 314 74072
> Fax:    0049 - 30 - 314 74003
> E-mail: [email protected]
> http://www.dai-labor.de
> ***************************************
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:[email protected]]
> Sent: Monday, December 27, 2010 3:54 PM
> To: [email protected]
> Subject: Evaluating recommendations through user observation
>
> Hi,
>
> I was wondering how people evaluate the quality of recommendations other
> than with RMSE and the like in the eval package.
> For example, what are some good ways to measure/evaluate the quality of
> recommendations based simply on observing users' usage of them?
> Here are 2 ideas.
>
> * If you have a mechanism to capture a user's rating of the watched item,
> that gives you (in)direct feedback about the quality of the recommendation.
> When evaluating and comparing you probably also want to take into account
> the ordinal of the recommended item in the list of recommended items. If a
> person chooses the 1st recommendation and gives it a score of 10 (best),
> it's different than when a person chooses the 7th recommendation and gives
> it a score of 10. Or if a person chooses the 1st recommendation and gives
> it a rating of 1.0 (worst) vs. choosing the 10th recommendation and rating
> it 1.0.
>
> * Even if you don't have a mechanism to capture rating feedback from
> viewers, you can still evaluate and compare, purely by looking at the
> ordinals of the items selected from the recommendations. If a person
> chooses something closer to "the top" of the recommendation list, the
> recommendations can be considered better than if the user chooses
> something closer to "the bottom". This idea is similar to MRR in search:
> http://en.wikipedia.org/wiki/Mean_reciprocal_rank
>
> * The above ideas assume recommendations are not shuffled, i.e. that their
> displayed order reflects their actual recommendation scores.
>
> I'm wondering:
> A) if these ways of measuring/evaluating the quality of recommendations are
> good/bad/flawed
> B) if there are other, better ways of doing this
>
> Thanks,
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
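The two ordinal-based ideas in the quoted mail are easy to prototype. Here is a
minimal sketch in Python, under assumed input shapes (not anything from the
eval package): a list of 1-based clicked ranks for the MRR-style metric, and a
list of (rank, rating) pairs for the rank-discounted rating:

```python
def mean_reciprocal_rank(clicked_ranks):
    """MRR over the 1-based ranks of the recommendations users chose.
    Higher is better: clicks near the top of the list score close to 1."""
    return sum(1.0 / r for r in clicked_ranks) / len(clicked_ranks)

def rank_discounted_rating(events):
    """Average rating, discounting each rating by the reciprocal of the
    rank at which the item was shown (rank 1 = top of the list), so a
    score of 10 on the 1st slot counts more than a 10 on the 7th."""
    return sum(rating / rank for rank, rating in events) / len(events)

# Three users clicked ranks 1, 2 and 7:
print(mean_reciprocal_rank([1, 2, 7]))  # (1 + 1/2 + 1/7) / 3 ~ 0.548

# A rating of 10 given at the 1st slot vs. the same 10 at the 7th slot:
print(rank_discounted_rating([(1, 10.0), (7, 10.0)]))  # ~ 5.71
```

The reciprocal discount is just one choice; a logarithmic discount (as in
DCG/NDCG) is another common option and penalizes lower ranks less harshly.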



-- 
Lance Norskog
[email protected]
