Yes, that's well put. My only objection is that this sounds like you're saying there is a systematic problem with the ordering, so that it will usually help to pick a different ordering from the one you thought was optimal. Surely you do this in an effort to learn what the better orderings are: are top recs under-performing their rank-adjusted expected click-through rate, or something like that? (Why is a different question.) That's what I mean by "for evaluation" rather than "as a way to improve recs per se", and I imagine it's just a difference of semantics.
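As an aside, here is a minimal sketch of the kind of check meant by "for evaluation": compare the observed click-through rate at each rank position against the rank-adjusted rate you'd expect from position bias alone, and flag positions that deviate. The position-bias curve, the counts, and the 20% threshold below are made-up illustrative numbers, not anything from this thread.

    /**
     * Sketch: compare observed clicks per rank position against what a
     * position-bias curve predicts, and flag positions that deviate.
     */
    public class RankCtrCheck {

      public static void main(String[] args) {
        // Hypothetical position bias: relative propensity to click on
        // rank 1, 2, 3, ... regardless of what is shown there.
        double[] positionBias = {1.0, 0.55, 0.35, 0.25, 0.20};

        // Hypothetical observed impressions and clicks per rank.
        long[] impressions = {10000, 10000, 10000, 10000, 10000};
        long[] clicks      = {  900,   510,   180,   230,   185};

        // Use rank 1 as the reference point for the expected CTR curve.
        double baseCtr = (double) clicks[0] / impressions[0];

        for (int r = 0; r < positionBias.length; r++) {
          double expected = baseCtr * positionBias[r] / positionBias[0];
          double observed = (double) clicks[r] / impressions[r];
          double ratio = observed / expected;
          System.out.printf("rank %d: observed %.4f, expected %.4f, ratio %.2f%s%n",
              r + 1, observed, expected, ratio,
              Math.abs(ratio - 1.0) > 0.2 ? "  <-- investigate" : "");
        }
      }
    }

A position that consistently over- or under-performs its rank-adjusted expectation is the hint that the ordering itself is off.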
It sounds like the problem here is preventing popular items from dominating recommendations just because they are generally popular -- being in recommendations makes them more popular, so you have a positive feedback loop. It's a real problem, and it's often one that simple approaches to recs have, since they don't naturally normalize away general popularity that is not specific to the current user. For example, a simplistic similarity metric like raw co-occurrence will strongly favor popular items in a way that is clearly undesirable. I'd guess that anyone who thinks they have this problem probably has a relatively easy-to-fix algorithmic problem, and it's better to fix that directly if so.

On this note, I like the approach of systems like iTunes, which appear to try to give you top recs in several genres rather than one monolithic list of recommendations. I find it much more reasonable to recommend by genre; that's a better-sized world from which to recommend. I find it hard to reason about whether I like a certain rock album better than a certain classical piece, though I can easily tell you which rock songs are better than other rock songs. A system that tries to combine all of these things will struggle: is the idea of "top 10 recommendations from across all of music" just problematic to begin with? (Go look at your list of recommendations across all object types at Amazon -- they do a great job, but man, how do you meaningfully state that an electric toothbrush slots between an Alan Partridge DVD and Harry Potter?)

So, if *that* is the sort of issue you're trying to solve by shaking up recommendations, maybe it would be more meaningful to restrict recommendations instead. Can you list the top 2 recs from each of 5 popular categories? Or can you heavily bias the recommendations to take the current context (the current item) into account? Those are also ways to surface more "lower" items. Just some more ideas...

On Sat, Jul 2, 2011 at 5:42 PM, Ted Dunning <[email protected]> wrote:
> I agree that randomness is good to introduce, but I disagree that it is only for evaluation purposes. In particular, I find it useful to smooth out some discontinuities that result from the incestuous relationship between recommendation results and later recommendations. Without this, you can quickly wind up with a very strong echo in your recommendations of the first page of your previous recommendations. By dithering result orders, you can adjust the exploration / exploitation trade-off in your system. In anecdotal reports, I have heard and seen very dramatic improvements in recommendation quality with dithering.
>
> That aside, a useful way to do the reordering is to generate a synthetic score for each item based on its rank r in the original sorted result set:
>
> double newScore = Math.exp(-r / scale) + fuzz * random.nextDouble();
>
> The size of fuzz determines where the reordering will occur: results at adjacent ranks r and r+1 can only swap when exp(-r/scale) - exp(-(r+1)/scale) < fuzz.
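As an aside, here is a minimal, self-contained Java sketch of the dithering Ted describes: compute the synthetic score for each position of an already-ranked list and re-sort by it. The class and method names are placeholders, and the generic List<T> stands in for whatever result type your recommender returns; it is not a Mahout API.

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;
    import java.util.Random;

    /**
     * Re-orders an already-ranked result list by the synthetic score
     * exp(-rank / scale) + fuzz * random, as described above. A small
     * scale shuffles aggressively; a large scale only disturbs the tail.
     */
    public final class Dither {

      private Dither() {
      }

      public static <T> List<T> dither(List<T> ranked, double scale, double fuzz, Random random) {
        int n = ranked.size();
        final double[] score = new double[n];
        for (int r = 0; r < n; r++) {
          score[r] = Math.exp(-r / scale) + fuzz * random.nextDouble();
        }
        // Sort positions by descending synthetic score, then map back to items.
        List<Integer> order = new ArrayList<>(n);
        for (int r = 0; r < n; r++) {
          order.add(r);
        }
        order.sort(Comparator.comparingDouble((Integer r) -> score[r]).reversed());
        List<T> result = new ArrayList<>(n);
        for (int r : order) {
          result.add(ranked.get(r));
        }
        return result;
      }
    }

Calling dither(recs, 3.0, 0.1, new Random()) should behave like the R experiments Ted shows next, which illustrate how scale controls how deep the shuffling reaches.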
> For instance, with scale = 1, 3 and 10 and fuzz = 0.1, here are some sample re-orderings:
>
>> order(-exp(-(1:30)/3) + 0.1 * runif(30))
> [1] 1 2 3 4 5 6 8 7 9 13 12 26 14 28 27 24 17 10 11 20 29 19 16 15 23 22 25 30 21 18
>> order(-exp(-(1:30)/3) + 0.1 * runif(30))
> [1] 1 2 3 4 6 5 11 7 8 21 20 29 16 24 25 14 19 26 9 15 13 28 10 22 12 17 27 18 30 23
>> order(-exp(-(1:30)/3) + 0.1 * runif(30))
> [1] 1 2 3 4 6 5 7 8 12 16 26 18 24 21 9 22 28 10 11 19 29 30 15 20 14 23 17 25 13 27
>> order(-exp(-(1:30)/3) + 0.1 * runif(30))
> [1] 1 2 3 4 5 6 7 8 10 11 28 20 26 18 29 19 22 24 25 9 15 21 27 30 13 16 14 12 17 23
>
>> order(-exp(-(1:30)) + 0.1 * runif(30))
> [1] 1 2 17 23 19 11 8 21 3 28 12 22 25 13 30 9 27 24 7 4 14 29 10 5 6 20 26 15 16 18
>> order(-exp(-(1:30)) + 0.1 * runif(30))
> [1] 1 2 15 25 20 29 14 5 7 16 10 3 18 13 17 23 8 26 22 12 21 24 4 30 9 27 28 19 11 6
>> order(-exp(-(1:30)) + 0.1 * runif(30))
> [1] 1 2 19 3 5 28 23 24 12 25 7 22 17 21 4 6 20 13 16 29 9 14 30 10 27 8 18 15 26 11
>
>> order(-exp(-(1:30)/10) + 0.1 * runif(30))
> [1] 1 2 3 5 4 7 6 8 9 10 11 12 13 14 16 18 15 22 19 17 23 24 25 20 21 29 30 27 28 26
>> order(-exp(-(1:30)/10) + 0.1 * runif(30))
> [1] 1 2 3 4 5 6 7 8 9 10 12 11 13 16 14 18 15 20 17 21 19 26 29 25 30 22 24 23 27 28
>> order(-exp(-(1:30)/10) + 0.1 * runif(30))
> [1] 1 2 3 5 4 7 6 8 9 12 11 10 13 16 14 15 20 19 17 22 25 18 21 28 24 23 29 26 27 30
>
> As you can see, with scale = 1, only the first 2 results are stable and very deep results can be surfaced. With scale = 10, reordering only becomes very strong below 10-20 results. I usually use scale = 3 because result pages are usually 20 long. With shorter result pages, scale = 2 is probably justified.
>
> On Sat, Jul 2, 2011 at 12:56 AM, Sean Owen <[email protected]> wrote:
>
>> Yes, it's a good idea. Usually it serves a purpose for evaluation only. You know the relative strength of recommendations, and know how much ranking them 1st, 2nd, 3rd, etc biases the user to click on them. So you can predict how many clicks each should relatively get. And you can easily pull up recommendation #10 if you want to see if it gets unusually more clicks than you'd expect. This tells you there's something suboptimal about recs if so.
>>
>> Don't make similarity change randomly. It is supposed to have properties like symmetry and transitivity, which would then break.
>>
>> You can make the neighborhood pick other users, yes. But I think the most direct way to do this is to reorder the final recommendations. That works for any implementation. So I would do #2.
>>
>> But again I would not do this just for its own sake; on its face, it hurts recommendation quality. Do so if you are using it to evaluate quality.
>>
>> On Fri, Jul 1, 2011 at 7:42 PM, Salil Apte <[email protected]> wrote:
>>
>> > My first post to the Mahout group. First, Mahout devs, you have created something great so thanks!
>> >
>> > I have inherited some Mahout code and am trying to make some improvements. I was hoping to get some guidance.
>> >
>> > 1. We are using the NearestNUserNeighborhood class for neighborhood calculations. While I want to use the similarity metrics provided in Mahout, I also want to introduce some randomness. In effect, I want to include a few people into the final nearest neighbors set that are not actually that close.
>> > That way, my recommender will include some outliers in the results, which is a desirable property for our recommender. What's the best way of doing this? I can of course implement my own similarity metric (which could internally use PearsonCorrelationSimilarity) and then randomly give a high correlation number to certain people. But is there a better way?
>> >
>> > 2. I also want to introduce some randomness into the final recommended set. I am thinking I can do this by creating a custom IDRescorer and randomly bumping up the score for some of the items. This will of course require some tweaking (how often an item gets a bump, how much of a bump it gets, etc.) but is there a better way of going about this?
>> >
>> > Thanks for the help!
>> >
>> > -Salil
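For completeness, here is a minimal sketch of the IDRescorer approach Salil describes in point 2 (which Sean advises using only as an evaluation aid): an implementation that randomly bumps some scores upward. The class name and the bump parameters are placeholders of my own; it assumes the Taste IDRescorer interface with rescore(long, double) and isFiltered(long).

    import java.util.Random;
    import org.apache.mahout.cf.taste.recommender.IDRescorer;

    /**
     * Sketch of point 2 above: an IDRescorer that randomly bumps up the
     * score of some items so that lower-ranked items occasionally surface.
     * Bump probability and size are arbitrary and would need tuning.
     */
    public class RandomBumpRescorer implements IDRescorer {

      private final Random random;
      private final double bumpProbability; // e.g. 0.1 => bump roughly 10% of items
      private final double maxBump;         // maximum amount added to a score

      public RandomBumpRescorer(Random random, double bumpProbability, double maxBump) {
        this.random = random;
        this.bumpProbability = bumpProbability;
        this.maxBump = maxBump;
      }

      @Override
      public double rescore(long id, double originalScore) {
        if (random.nextDouble() < bumpProbability) {
          return originalScore + maxBump * random.nextDouble();
        }
        return originalScore;
      }

      @Override
      public boolean isFiltered(long id) {
        return false; // never exclude items, only perturb their scores
      }
    }

It would be passed to the Recommender.recommend overload that accepts an IDRescorer, e.g. recommender.recommend(userID, 10, new RandomBumpRescorer(new Random(), 0.1, 0.5)), assuming scores on a scale where a bump of up to 0.5 is meaningful.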
