Re: Popularity of recommender items

2014-02-14 Thread Pat Ferrel
There’s been work done on the cross-recommender. There is a Mahout-style XRecommenderJob that has two preference models for two actions or preference types. It uses matrix multiply to get a cooccurrence type similarity matrix. If we had a cross-row-similarity-job, it could pretty easily be
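
A rough sketch of the matrix-multiply step mentioned above, assuming the cross-cooccurrence is the transpose of the primary-action matrix times the secondary-action matrix. This is not the XRecommenderJob code; the function names and the toy purchase/view data are hypothetical.

```python
# Hypothetical illustration only: rows are users, columns are items.
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    y_t = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_t] for row in x]

def cross_cooccurrence(primary, secondary):
    # [primary^T * secondary][i][j] = number of users who did the primary
    # action on item i and the secondary action on item j.
    return matmul(transpose(primary), secondary)

# Assumed toy data: 3 users, 2 purchasable items, 3 viewable items.
purchases = [[1, 0],
             [1, 1],
             [0, 1]]
views = [[1, 1, 0],
         [1, 0, 1],
         [0, 1, 1]]

cross = cross_cooccurrence(purchases, views)
print(cross)
```

Each row of the result could then be reduced by a similarity test such as LLR, in the same way a plain cooccurrence matrix is.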

Re: Popularity of recommender items

2014-02-14 Thread Andrew Musselman
I'd like to see cross-recommendations added too. But I also want some automation of the steps required to build a simple recommender like the solr/mahout example Ted and Ellen have in their pamphlet. Lowering the barrier to entry by providing a sample pipeline would help a lot of folks get

Re: Popularity of recommender items

2014-02-14 Thread Ted Dunning
Yes! But it is very hard to find the time. On Fri, Feb 14, 2014 at 11:51 AM, Andrew Musselman andrew.mussel...@gmail.com wrote: I'd like to see cross-recommendations added too. But I also want some automation of the steps required to build a simple recommender like the solr/mahout

Re: Popularity of recommender items

2014-02-14 Thread Andrew Musselman
Oh yes. I do have a small team I could enlist to do things like this; is there a starting point somewhere on Github, Ted? On Fri, Feb 14, 2014 at 12:39 PM, Ted Dunning ted.dunn...@gmail.com wrote: Yes! But it is very hard to find the time. On Fri, Feb 14, 2014 at 11:51 AM, Andrew

Re: Popularity of recommender items

2014-02-14 Thread Pat Ferrel
Not sure if this is what you are looking for. I assume you are talking about Ted’s paper describing a Solr-based recommender pipeline? Much of the paper was implemented in the solr-recommender referenced below, which has a fairly flexible parallel version of a logfile reader that uses

Re: Popularity of recommender items

2014-02-14 Thread Andrew Musselman
Precisely; that's the one, thanks! On Fri, Feb 14, 2014 at 12:45 PM, Pat Ferrel p...@occamsmachete.com wrote: Not sure if this is what you are looking for. I assume you are talking about Ted's paper describing a Solr-based recommender pipeline? Much of the paper was implemented in the

Re: Popularity of recommender items

2014-02-08 Thread Pat Ferrel
Didn’t mean to imply I had historical view data—yet. The Thompson sampling ‘trick’ looks useful for auto converging to the best of A/B versions and a replacement for dithering. Below you are proposing another case to replace dithering—this time on a list of popular items? Dithering works on

Re: Popularity of recommender items

2014-02-08 Thread Ted Dunning
Thompson sampling doesn't require time other than a sense of what do we now know. It really is just a correct form for dithering that uses our current knowledge. For a worked out version of Thompson sampling with ranking, see this blog:

Re: Popularity of recommender items

2014-02-08 Thread Suneel Marthi
I am not full-time on Mahout either and have a full-time job which is unrelated to Mahout. It's just that I have been sacrificing personal time to keep things moving on Mahout. On Saturday, February 8, 2014 3:13 PM, Ted Dunning ted.dunn...@gmail.com wrote: Thompson sampling doesn't

Re: Popularity of recommender items

2014-02-08 Thread Pat Ferrel
That was by no means to criticize effort level, which has been impressive especially during the release. It was more a question about the best place to add these things and whether they are important. Whether people see these things as custom post processing or core. On Feb 8, 2014, at 12:13

Re: Popularity of recommender items

2014-02-08 Thread Ted Dunning
I have different opinions about each piece. I think that cross recommendation is as core as RowSimilarityJob and should be a parallel implementation or integrated. Parallel is probably easier. It is even plausible to have a version of RowSimilarityJob that doesn't support all the different

Re: Popularity of recommender items

2014-02-06 Thread Tevfik Aytekin
Well, I think what you are suggesting is to define popularity as being similar to other items. So in this way most popular items will be those which are most similar to all other items, like the centroids in K-means. I would first check the correlation between this definition and the standard one

Re: Popularity of recommender items

2014-02-06 Thread Ted Dunning
If you look at the indicator matrix (cooccurrence reduced by LLR), you will usually have asymmetry due to limitations on the number of indicators per row. This will give you some interesting results when you look at the column sums. I wouldn't call it popularity, but it is an interesting
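
A minimal sketch of the idea above, assuming a small in-memory data set: score cooccurrences with the log-likelihood ratio (LLR), keep only the top few indicators per row, then count how often each item appears as an indicator (the column sums). The entropy form of the LLR is standard; the function names, the `max_per_row` parameter and the toy data are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

def x_log_x(x):
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    # Unnormalized Shannon entropy of a set of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

def llr(k11, k12, k21, k22):
    # Log-likelihood ratio for a 2x2 contingency table.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))  # clamp rounding noise

def indicator_column_sums(user_items, max_per_row=2):
    n_users = len(user_items)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in user_items.values():
        for i in items:
            item_count[i] += 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] += 1
            pair_count[(b, a)] += 1
    rows = defaultdict(list)
    for (a, b), k11 in pair_count.items():
        k12 = item_count[a] - k11          # users with a but not b
        k21 = item_count[b] - k11          # users with b but not a
        k22 = n_users - k11 - k12 - k21    # users with neither
        rows[a].append((llr(k11, k12, k21, k22), b))
    kept, col_sums = {}, defaultdict(int)
    for a, scored in rows.items():
        # The per-row cap is what makes the indicator matrix asymmetric.
        top = sorted(scored, key=lambda s: (-s[0], s[1]))[:max_per_row]
        kept[a] = [b for _score, b in top]
        for b in kept[a]:
            col_sums[b] += 1
    return kept, dict(col_sums)

# Assumed toy data: user -> set of items interacted with.
demo = {"u1": {"a", "b"}, "u2": {"a", "b", "c"}, "u3": {"a", "c"},
        "u4": {"d"}, "u5": {"a", "d"}}
kept, col_sums = indicator_column_sums(demo, max_per_row=1)
print(kept, col_sums)
```

An item with a high column sum is one that many other items selected as an indicator, which is a different quantity from its raw preference count.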

Re: Popularity of recommender items

2014-02-06 Thread Pat Ferrel
The problem with the usual preference count is that big hit items can be overwhelmingly popular. If you want to know which ones the most people saw and are likely to have an opinion about then this seems a good measure. But these hugely popular items may not differentiate taste. So we

Re: Popularity of recommender items

2014-02-06 Thread Pat Ferrel
A velocity measure of sorts makes a lot of sense for a “what’s hot” list. The particular thing I’m looking at now is how to rank a list of items by some measure of popularity when you don’t have a velocity. There is an introduction date though so another way to look at popularity might be to

Re: Popularity of recommender items

2014-02-06 Thread Ted Dunning
One way to deal with that is to build a model that predicts the ultimate number of views/plays/purchases for the item based on history so far. If this model can be made Bayesian enough to sample from the posterior distribution of total popularity, then you can use the Thompson sampling trick
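
A simplified sketch of ranking by posterior sampling, under assumptions not taken from the thread: instead of a model of ultimate popularity, it uses a Gamma-Poisson model of views per day, and the prior parameters, function name and data are hypothetical. Each call draws one sample per item from its posterior and sorts by the sampled value, so well-observed items stay put while sparsely observed items move around.

```python
import random

def thompson_rank(stats, prior_shape=1.0, prior_rate=1.0, rng=random):
    """Rank items by one posterior sample of their view rate.

    stats maps item -> (views, days_since_introduction).
    With a Gamma(prior_shape, prior_rate) prior on a Poisson rate, the
    posterior is Gamma(prior_shape + views, prior_rate + days).
    """
    sampled = {}
    for item, (views, days) in stats.items():
        shape = prior_shape + views
        scale = 1.0 / (prior_rate + days)  # gammavariate takes a scale
        sampled[item] = rng.gammavariate(shape, scale)
    return sorted(sampled, key=sampled.get, reverse=True)

# Assumed toy data: (views, days since introduction).
stats = {"hit": (1000, 10), "flop": (1, 10), "new": (3, 1)}
ranking = thompson_rank(stats, rng=random.Random(42))
print(ranking)
```

Repeated calls give different orderings for items with little history, which is the exploratory effect that dithering approximates by adding noise to ranks.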

Popularity of recommender items

2014-02-04 Thread Pat Ferrel
Trying to come up with a relative measure of popularity for items in a recommender. Something that could be used to rank items. The user-item preference matrix would be the obvious thought. Just add the number of preferences per item. Maybe transpose the preference matrix (the temp DRM
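
The baseline described above, counting preferences per item, can be sketched as follows. The data is a hypothetical list of (user, item) pairs rather than a Mahout DRM, and the function name is illustrative.

```python
from collections import Counter

def item_popularity(preferences):
    """Count distinct users per item and rank items by that count."""
    counts = Counter(item for _user, item in set(preferences))
    # Sort by descending count, then by item id for stable ties.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Assumed toy data: (user, item) preference pairs.
prefs = [("u1", "a"), ("u2", "a"), ("u3", "a"),
         ("u1", "b"), ("u2", "b"), ("u3", "c")]
ranked = item_popularity(prefs)
print(ranked)
```

This is equivalent to taking the column sums of a boolean user-item matrix.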