Hello,
The jittering and your sample output look good and make sense intuitively,
but it looks like I'm not interpreting your recipe correctly.
Here is a code snippet that pretends to loop through the top 20
recommendations:
// exp(-n/5) + rexp() * 0.1
for (int i = 1; i < 20; i++) {
    float exp = (float) i / 5; // not sure why you used -n/5
    float rexp = (float) Math.log(i - Math.random()); // tried with 1 instead of i like you said, too
    float rank = exp + rexp * 0.1f;
    float round = Math.round(rank);
    System.out.println("EXP: " + exp + "\tREXP: " + rexp + "\tRANK: " + rank + "\tROUND: " + round);
}
But the output doesn't quite look like yours, so I must be misinterpreting
something.
EXP: 0.2 REXP: -0.59645164 RANK: 0.14035484 ROUND: 0.0
EXP: 0.4 REXP: 0.48764116 RANK: 0.44876412 ROUND: 0.0
EXP: 0.6 REXP: 0.89331275 RANK: 0.6893313 ROUND: 1.0
EXP: 0.8 REXP: 1.2796263 RANK: 0.92796266 ROUND: 1.0
EXP: 1.0 REXP: 1.5976489 RANK: 1.1597649 ROUND: 1.0
EXP: 1.2 REXP: 1.6399297 RANK: 1.363993 ROUND: 1.0
EXP: 1.4 REXP: 1.8479612 RANK: 1.5847961 ROUND: 2.0
EXP: 1.6 REXP: 1.9524398 RANK: 1.795244 ROUND: 2.0
EXP: 1.8 REXP: 2.0999322 RANK: 2.009993 ROUND: 2.0
EXP: 2.0 REXP: 2.218352 RANK: 2.2218351 ROUND: 2.0
EXP: 2.2 REXP: 2.3666646 RANK: 2.4366665 ROUND: 2.0
EXP: 2.4 REXP: 2.4445183 RANK: 2.6444519 ROUND: 3.0
EXP: 2.6 REXP: 2.5367393 RANK: 2.853674 ROUND: 3.0
EXP: 2.8 REXP: 2.633957 RANK: 3.0633957 ROUND: 3.0
EXP: 3.0 REXP: 2.7041435 RANK: 3.2704144 ROUND: 3.0
EXP: 3.2 REXP: 2.751766 RANK: 3.4751766 ROUND: 3.0
EXP: 3.4 REXP: 2.8236845 RANK: 3.6823685 ROUND: 4.0
EXP: 3.6 REXP: 2.854854 RANK: 3.8854854 ROUND: 4.0
EXP: 3.8 REXP: 2.9142253 RANK: 4.0914226 ROUND: 4.0
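If I instead read the recipe literally as Math.exp(-i / 5.0), with rexp() as a
standard exponential deviate -Math.log(1 - Math.random()), I get a decaying base
score with a small positive jitter on top. Is this closer to what you meant?
(Just my sketch of the interpretation, so corrections welcome.)

```java
// Literal reading of exp(-n/5) + rexp() * 0.1, with rexp() a standard
// exponential deviate (nonnegative, mean 1). Sketch of my interpretation only.
public class JitterSketch {
    public static void main(String[] args) {
        for (int i = 1; i <= 20; i++) {
            double exp = Math.exp(-i / 5.0);            // decays from ~0.82 toward 0
            double rexp = -Math.log(1 - Math.random()); // exponential deviate, >= 0
            double rank = exp + rexp * 0.1;             // base score plus jitter
            System.out.println("EXP: " + exp + "\tREXP: " + rexp + "\tRANK: " + rank);
        }
    }
}
```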
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
> From: Ted Dunning <[email protected]>
> To: [email protected]
> Sent: Wednesday, June 3, 2009 5:23:15 PM
> Subject: Re: Inconsistent recommendations
>
> My experience is that users like to see recommendations that change.
>
> In fact, this preference is strong enough that I now typically add jitter to
> the recommendations that I return. Typically I do this by computing a
> synthetic score used only to permute results lists:
>
> exp(-n/5) + rexp() * 0.1
>
> Here rexp is an exponentially distributed random deviate that can be
> generated using -Math.log(1 - Math.random()). The value n is the rank of the
> item (offset 0 or 1, doesn't matter). The magic constants (/ 5 and * 0.1)
> must be tuned to fit the number of results you show and how you want to
> trade off stability versus novelty. Usually I implement this as a
> meta-recommendation engine that uses the output of another engine as input
> and which returns permuted results.
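So the meta-engine part would look roughly like this? Take a ranked list from
the underlying engine, score each position with the synthetic formula, and
re-sort by that score. (My sketch only; the class and method names are made up.)

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Sketch of the permutation step: reorder an already-ranked list by the
// synthetic score exp(-n/5) + rexp() * 0.1. Illustrative names only.
public class JitterPermuter {
    public static <T> List<T> permute(List<T> ranked, Random rng) {
        int n = ranked.size();
        Integer[] order = new Integer[n];
        double[] score = new double[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
            double rexp = -Math.log(1 - rng.nextDouble()); // exponential deviate
            score[i] = Math.exp(-i / 5.0) + rexp * 0.1;
        }
        // Higher synthetic score comes first.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> score[i]).reversed());
        List<T> out = new ArrayList<>(n);
        for (int i : order) {
            out.add(ranked.get(i));
        }
        return out;
    }
}
```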
>
> The idea here is that the first few items will generally appear in the
> "correct" order. Items beyond the top 10 will be dramatically shuffled.
> Here is a sequence of 20 draws for the top 20 items out of 200 (after
> permutation by synthetic score):
>
> 2 1 4 3 5 8 7 6 14 13 10 15 37 30 146 40 94 76 172
> 125
> 1 2 4 6 3 8 9 5 11 13 15 7 169 190 174 44 90 171 95
> 74
> 1 2 3 4 5 6 9 7 11 8 15 14 16 17 53 81 34 33 37
> 30
> 1 2 3 4 5 6 8 10 7 9 15 12 19 88 121 30 55 43 200
> 168
> 1 2 4 3 5 7 8 10 6 13 11 19 17 133 139 124 194 123 79
> 186
> 1 2 3 4 5 6 9 11 8 12 10 13 14 19 151 53 102 48 117
> 169
> 1 3 2 5 7 4 6 10 12 15 24 59 83 61 156 148 109 28 188
> 126
> 1 2 3 4 6 7 5 8 10 12 14 11 192 57 54 182 38 158 128
> 123
> 1 3 4 6 5 7 8 11 9 10 13 15 2 12 24 28 43 179 180
> 100
> 1 2 3 4 6 5 7 8 9 10 11 25 15 12 83 124 59 45 169
> 199
> 1 2 3 4 5 7 8 11 6 14 26 85 69 163 40 58 12 182 144
> 109
> 1 2 4 5 3 6 7 8 10 12 9 20 22 109 43 108 27 62 157
> 84
> 1 2 4 3 5 7 8 10 6 9 12 11 16 13 17 15 140 39 122
> 190
> 1 3 2 5 4 7 8 10 16 14 11 15 41 38 42 100 171 68 113
> 178
> 1 2 3 4 5 6 7 8 10 12 11 194 89 43 80 129 126 181 94
> 140
> 1 3 2 5 4 7 6 8 11 10 13 12 9 19 20 53 99 30 183
> 115
> 1 2 3 5 6 4 7 8 10 11 13 16 21 23 153 82 52 163 31
> 186
> 1 2 3 4 5 6 9 10 13 18 16 11 19 7 27 23 29 41 72
> 64
> 1 2 3 4 5 6 8 7 11 9 10 18 13 17 33 194 196 35 128
> 75
> 1 2 3 5 4 7 6 8 9 10 12 11 13 15 16 14 188 82 147
> 163
>
> Thus, for the first row, we would present the second item from the
> recommendations first, followed by items 1, 4, 3 and 5.
>
> In blind testing, I found users typically prefer jittered results
> significantly over unjittered results. One interpretation for this is that
> this is simply a way to get them to look beyond the first page of results.
> Another is that they are more willing to look at lists that change.
>
> That said, I have also found it helpful to make the results be static for a
> small period of time. To ensure that, I typically seed a random number
> generator on each request with the user id and the current time in seconds
> rounded down to the time period of stability. Recommendations can often be
> made fast enough that caching is of little interest, but if caching is used,
> the expiration times should ideally be synchronized with the reseeding to
> give the desired mix of stability and novelty.
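And for the stability part, you mean something like this, seeding from the
user id plus the current time bucket? (My sketch; the period length is an
arbitrary choice here.)

```java
import java.util.Random;

// Seed the jitter RNG from (userId, time bucket) so a user's jittered list is
// stable within each bucket and changes between buckets. Sketch only;
// PERIOD_SECONDS and the seed-mixing are illustrative choices.
public class StableSeed {
    static final long PERIOD_SECONDS = 300; // e.g. 5 minutes of stability

    static Random rngFor(long userId, long nowMillis) {
        long bucket = (nowMillis / 1000) / PERIOD_SECONDS;
        return new Random(userId * 31 + bucket);
    }
}
```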
>
> On Wed, Jun 3, 2009 at 12:45 PM, Sean Owen wrote:
>
> > 3) I suppose I think of computing recommendation as a
> > relatively-speaking infrequent event. You might compute them once a
> > day or hour. Or you compute on the fly and cache it, either externally
> > or in the framework. So, it shouldn't be the case that the same
> > recommendations are computed over and over in a row, where the
> > differences might become noticeable, in an application, to a user
> >
> >
> > Is it possible to guarantee the same recommendation, even when using
> > sampling, if the data doesn't change? It wouldn't be too hard to always
> > use a local RNG and always seed it the same way, no? It would be a
> > performance hit, though.
> >
> > My first reaction though is #3 -- cache. Is that a feasible response?
> >
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>
> 111 West Evelyn Ave. Ste. 202
> Sunnyvale, CA 94086
> http://www.deepdyve.com
> 858-414-0013 (m)
> 408-773-0220 (fax)