Re: Can someone suggest an approach for calculating precision and recall for distributed recommendations?

Jonathan Hodges Tue, 28 Aug 2012 06:04:11 -0700

Thanks again, Ted. These are great suggestions for helping out us
newbies.




On Mon, Aug 27, 2012 at 11:28 PM, Ted Dunning <[email protected]> wrote:

> In another forum, I responded to this question this way:
>
> > One short answer is that you only need enough test data to drive the
> > accuracy of your PR estimates to the point you need them. That isn't all
> > that much data, so the sequential version should do rather well.
> >
> > The gold standard, of course, is actual user behavior. Especially when you
> > are starting out, views are going to be entirely driven by your other
> > discovery mechanisms such as search. This means that maximizing recall and
> > precision is going to drive your recommender to replicate your current
> > discovery patterns, which isn't really what you want.
> >
> > Regarding your use of raw views, you will have problems if your videos
> > have lots of misleading meta-data, since users will click on things that
> > they don't really want to watch. This is a key user satisfaction issue, of
> > course.
> >
> > You should also consider dithering in your system for lots of reasons.
> > Also, make sure you have alternative discovery mechanisms. A "recently
> > added" page is really helpful for this.
>
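
To put a rough number on "enough test data": a minimal Python sketch, assuming each held-out recommendation is an independent hit or miss, so that precision behaves like a binomial proportion. That independence assumption and the function names are mine, not something from Ted's reply.

```python
import math


def precision_standard_error(hits, trials):
    """Standard error of a precision estimate under a binomial assumption."""
    p = hits / trials
    return math.sqrt(p * (1.0 - p) / trials)


def trials_needed(expected_precision, target_se):
    """Approximate number of test recommendations needed to reach target_se.

    Assumes independent trials; real click data is usually correlated,
    so treat the result as a lower bound rather than a guarantee.
    """
    variance = expected_precision * (1.0 - expected_precision)
    return math.ceil(variance / target_se ** 2)


if __name__ == "__main__":
    print(precision_standard_error(50, 100))
    print(trials_needed(0.5, 0.01))
```

Under these assumptions, pinning precision down to about one percentage point takes on the order of a few thousand test recommendations, which a sequential evaluator should handle without trouble.
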
>
> And then added this about dithering:
>
> > All clicks are implicit data and you can use boolean methods on any or all
> > of them. Nothing in these kinds of data prevents you from using LLR methods
> > or matrix factorization methods.
> >
> > For dithering, what I do is set a synthetic score that looks like
> > exp(-rank). Then I add random noise to this that is exponentially
> > distributed (aka -log(random())). I scale the noise to be as small as I
> > would like. This method means that the top items generally mix with just
> > the top items and deeper items mix with much deeper items.
> >
> > You can experiment with this using the following R commands (with sample
> > output):
> >
> > > order(-exp(-(0:99)/4) + rexp(100, rate=10))
> > [1] 2 1 4 3 6 8 5 10 7 12 29 11 26 21 70 86 79 52
> > [19] 14 68 17 83 44 72 30 89 35 34 84 39 74 100 73 87 78 56
> > [37] 15 66 46 40 9 95 96 67 16 49 80 90 53 32 27 48 37 76
> > [55] 77 91 88 62 98 51 19 50 93 99 23 28 65 33 25 54 71 97
> > [73] 43 57 18 92 94 45 22 38 81 75 85 13 20 82 41 42 58 64
> > [91] 60 59 61 69 47 55 31 24 36 63
> > > order(-exp(-(0:99)/4) + rexp(100, rate=10))
> > [1] 1 2 3 4 5 6 9 12 7 10 15 23 78 72 16 60 95 68
> > [19] 24 65 90 94 55 22 40 21 17 47 39 71 59 66 79 88 97 56
> > [37] 26 99 74 41 44 45 50 70 49 75 62 31 84 51 11 33 91 19
> > [55] 61 28 77 18 52 54 48 43 87 25 35 38 30 73 27 89 53 8
> > [73] 82 83 93 57 13 36 69 29 98 63 76 85 64 37 96 46 81 67
> > [91] 92 20 80 42 58 34 86 32 14 100
> >
> > As you can see, the top items stay near the top, but mixing down deeper is
> > quite strong.
>
>
>
> You can use uniform noise instead to get a somewhat different effect.
>
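
For anyone who wants to try the dithering outside R, here is a minimal Python sketch of the same idea as the `order(-exp(-(0:99)/4) + rexp(100, rate=10))` one-liner quoted above. The function name `dither` and its parameters `scale`, `noise`, and `rng` are hypothetical names of mine; the defaults (4 and rate 10) just mirror the R example.

```python
import math
import random


def dither(items, scale=4.0, noise=None, rng=None):
    """Reorder an already-ranked list by perturbing a synthetic score.

    items: results in their original order, best first.
    scale: divisor applied to the rank inside exp(), as in exp(-rank/4).
    noise: callable taking a random.Random and returning one noise value;
           defaults to exponential noise with rate 10 (mean 0.1).
    """
    rng = rng or random.Random()
    if noise is None:
        noise = lambda r: r.expovariate(10.0)

    keyed = []
    for rank, item in enumerate(items):
        # Negated synthetic score, so that an ascending sort puts the
        # best items first, matching R's order().
        synthetic = -math.exp(-rank / scale)
        keyed.append((synthetic + noise(rng), item))

    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed]


if __name__ == "__main__":
    # 1-based labels so the output is comparable to the R listing.
    print(dither(list(range(1, 101)), rng=random.Random(42)))
```

To try the uniform-noise variant, pass something like `noise=lambda r: r.uniform(0.0, 0.2)`; the width 0.2 is an arbitrary value chosen for illustration, not one taken from the thread.
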
