On Wed, Oct 9, 2013 at 12:54 PM, Michael Sokolov <
[email protected]> wrote:

>
>> BTW, lest we forget, this does not imply the Solr-recommender is better
>> than Myrrix or the Mahout-only recommenders. There needs to be some careful
>> comparison of results. Michael, did you do offline or A/B tests during your
>> implementation?
>>
>
> I ran some offline tests using our historical data, but I don't have much
> faith in these beyond the fact that they indicate we didn't make any obvious
> implementation errors.  We haven't attempted A/B testing yet since our site
> is so new, and we need to get a meaningful baseline going and sort out a
> lot of other more pressing issues on the site - recommendations are only
> one piece, albeit an important one.
>
>
> Actually there was an interesting idea for an article posted recently
> about the difficulty of comparing results across systems in this field:
> http://www.docear.org/2013/09/23/research-paper-recommender-system-evaluation-a-quantitative-literature-survey/
> but that's no excuse not to do better.  I'll certainly share when I know
> more :)


I tend to be a pessimist with regard to off-line evaluation.  It is fine to
do, but if a system scores anywhere near the best off-line, I think that it
should be considered for A/B testing.
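
For concreteness, here's a minimal sketch of the kind of off-line check
described above: hold out part of the historical interaction data, recommend
against the rest, and measure precision@k.  Everything in it (the toy data,
the popularity-based stand-in for the real recommender, k=2) is illustrative
and not taken from Solr-recommender, Myrrix, or Mahout:

import random
from collections import defaultdict

# Toy historical interactions (user, item); purely illustrative.
history = [("u1", "a"), ("u1", "b"), ("u1", "c"),
           ("u2", "a"), ("u2", "d"), ("u3", "b"), ("u3", "d")]

# Hold out one interaction per user for testing; train on the rest.
random.seed(0)
by_user = defaultdict(list)
for user, item in history:
    by_user[user].append(item)
test = {u: random.choice(items) for u, items in by_user.items()
        if len(items) > 1}
train = [(u, i) for (u, i) in history if test.get(u) != i]

def recommend(user, k=2):
    # Stand-in for the real recommender: most popular training items
    # the user has not already interacted with.
    seen = {i for (u, i) in train if u == user}
    counts = defaultdict(int)
    for _, i in train:
        if i not in seen:
            counts[i] += 1
    return sorted(counts, key=counts.get, reverse=True)[:k]

# Precision@k: fraction of users whose held-out item shows up in
# their top-k recommendations.
hits = sum(test[u] in recommend(u, k=2) for u in test)
print("precision@2 = %.2f" % (hits / len(test)))

A number like this mostly catches gross implementation errors, which matches
the caveat above; choosing between systems still needs an A/B test to trust.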
