The adPredictor paper has a good example of how to do this. See http://research.microsoft.com/apps/pubs/default.aspx?id=122779
On Sat, Feb 15, 2014 at 6:38 PM, Pat Ferrel <[email protected]> wrote:

> I’m still unclear about how to apply TS to dithering. TS is usually talked
> about in conjunction with a feedback loop, which makes the examples seem
> more like multi-armed bandit examples.
>
> Are you suggesting some feedback in recommender ranking, or just using the
> same distribution assumptions used in TS?
>
> On Feb 8, 2014, at 12:13 PM, Ted Dunning <[email protected]> wrote:
>
> Thompson sampling doesn't require time, other than a sense of what we now
> know. It really is just a correct form of dithering that uses our current
> knowledge.
>
> For a worked-out version of Thompson sampling with ranking, see this blog:
>
> http://tdunning.blogspot.com/2013/04/learning-to-rank-in-very-bayesian-way.html
>
> The reason that we aren't adding this, like cross-rec and other things, is
> that "we" have full-time jobs, mostly. Suneel is full-time on Mahout, but
> the rest are not. You seem more active than most.
>
> On Sat, Feb 8, 2014 at 8:50 AM, Pat Ferrel <[email protected]> wrote:
>
>> Didn’t mean to imply I had historical view data—yet.
>>
>> The Thompson sampling ‘trick’ looks useful for auto-converging to the best
>> of A/B versions and as a replacement for dithering. Below you are proposing
>> another case to replace dithering—this time on a list of popular items?
>> Dithering works on anything you can rank, but Thompson sampling usually
>> implies a time dimension. The initial guess, the first Thompson sample,
>> could be thought of as a form of dithering, I suppose? I haven’t looked at
>> the math, but it wouldn’t surprise me to find they are very similar things.
>>
>> While we are talking about it, why aren’t we adding things like
>> cross-recommendations, dithering, popularity, and other generally useful
>> techniques into the Mahout recommenders? All the data is there to do these
>> things, and they could be packaged in the same Mahout jobs. They seem to be
>> languishing a bit while technology and the art of recommendations move on.
>>
>> If we add temporal data to preference data, a bunch of new features come to
>> mind, like hot lists or asymmetric train/query preference history.
>>
>> On Feb 6, 2014, at 9:43 PM, Ted Dunning <[email protected]> wrote:
>>
>> One way to deal with that is to build a model that predicts the ultimate
>> number of views/plays/purchases for the item based on the history so far.
>>
>> If this model can be made Bayesian enough to sample from the posterior
>> distribution of total popularity, then you can use the Thompson sampling
>> trick and sort by sampled total views rather than estimated total views.
>> That will give uncertain items (typically new ones) a chance to be shown
>> in the rankings without flooding the list with newcomers.
>>
>> Sent from my iPhone
>>
>>> On Feb 7, 2014, at 3:38, Pat Ferrel <[email protected]> wrote:
>>>
>>> The particular thing I’m looking at now is how to rank a list of items
>>> by some measure of popularity when you don’t have a velocity. There is an
>>> introduction date, though, so another way to look at popularity might be
>>> to decay it with something like e^-t, where t is its age. You can see the
>>> decay in the views histogram.
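
A minimal sketch in Python of the Thompson sampling ranking trick described in
the thread: sample each item's popularity from its posterior and sort by the
samples rather than by the point estimates, so uncertain new items get
occasional exposure without flooding the list. The per-item Beta posterior,
the uniform prior, and the example items below are illustrative assumptions,
not anything specified in the thread.

import random

def thompson_rank(items):
    """items: dict of name -> (views, impressions). Returns a sampled ranking."""
    sampled = {}
    for name, (views, impressions) in items.items():
        # Beta(views + 1, non-views + 1) posterior under a uniform prior.
        sampled[name] = random.betavariate(views + 1, impressions - views + 1)
    # Sort by the sampled scores, best first. Each call gives a different,
    # but plausibility-weighted, ordering.
    return sorted(sampled, key=sampled.get, reverse=True)

# Example: a well-estimated item vs. a new item with a wide posterior.
catalog = {
    "old-favorite": (900, 10000),   # ~9% view rate, narrow posterior
    "new-arrival": (3, 20),         # wide posterior, sometimes samples high
    "also-ran": (200, 10000),
}
print(thompson_rank(catalog))

Repeated calls keep the well-known items near the top on average while the
new, uncertain item occasionally ranks high, which is the dithering behavior
discussed above.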

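For the age-decayed popularity mentioned at the end of the thread (decay by
something like e^-t, where t is the item's age), a sketch along the same
lines; the time constant tau_days and the daily granularity are assumed
parameters, not something from the thread.

import math
import time

def decayed_popularity(views, introduced_at, tau_days=30.0, now=None):
    """Down-weight raw view counts by the item's age in days."""
    now = time.time() if now is None else now
    age_days = (now - introduced_at) / 86400.0
    return views * math.exp(-age_days / tau_days)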