There’s been work done on the cross-recommender. There is a Mahout-style
XRecommenderJob that has two preference models for two actions or preference
types. It uses matrix multiply to get a cooccurrence-type similarity matrix. If
we had a cross-row-similarity-job, it could pretty easily be
I'd like to see cross-recommendations added too.
But I also want some automation of the steps required to build a simple
recommender like the solr/mahout example Ted and Ellen have in their
pamphlet.
Lowering the barrier to entry by providing a sample pipeline would help a
lot of folks get
Yes!
But it is very hard to find the time.
On Fri, Feb 14, 2014 at 11:51 AM, Andrew Musselman
andrew.mussel...@gmail.com wrote:
I'd like to see cross-recommendations added too.
But I also want some automation of the steps required to build a simple
recommender like the solr/mahout
Oh yes. I do have a small team I could enlist to do things like this; is
there a starting point somewhere on Github, Ted?
On Fri, Feb 14, 2014 at 12:39 PM, Ted Dunning ted.dunn...@gmail.com wrote:
Yes!
But it is very hard to find the time.
On Fri, Feb 14, 2014 at 11:51 AM, Andrew
Not sure if this is what you are looking for. I assume you are talking about
Ted’s paper describing a Solr-based recommender pipeline?
Much of the paper was implemented in the solr-recommender referenced below,
which has a fairly flexible parallel version of a logfile reader that uses
Precisely; that's the one, thanks!
On Fri, Feb 14, 2014 at 12:45 PM, Pat Ferrel p...@occamsmachete.com wrote:
Not sure if this is what you are looking
about Ted's paper describing a Solr-based recommender pipeline?
Much of the paper was implemented in the
Didn’t mean to imply I had historical view data—yet.
The Thompson sampling ‘trick’ looks useful for auto-converging to the best of
A/B versions and as a replacement for dithering. Below you are proposing another
case to replace dithering, this time on a list of popular items? Dithering works
on
Thompson sampling doesn't require time, other than a sense of what we now
know. It really is just a correct form of dithering that uses our current
knowledge.
For a worked out version of Thompson sampling with ranking, see this blog:
I am not full-time on Mahout either and have a full-time job which is unrelated
to Mahout.
It's just that I have been sacrificing personal time to keep things moving on
Mahout.
On Saturday, February 8, 2014 3:13 PM, Ted Dunning ted.dunn...@gmail.com
wrote:
Thompson sampling doesn't
That was by no means a criticism of effort level, which has been impressive,
especially during the release.
It was more a question about the best place to add these things and whether
they are important: whether people see them as custom post-processing or as
core.
On Feb 8, 2014, at 12:13
I have different opinions about each piece.
I think that cross-recommendation is as core as RowSimilarityJob and should
be a parallel implementation or integrated. Parallel is probably easier.
It is even plausible to have a version of RowSimilarityJob that doesn't
support all the different
Well, I think what you are suggesting is to define popularity as being
similar to other items. In this way the most popular items will be
those which are most similar to all other items, like the centroids in
K-means.
I would first check the correlation between this definition and the
standard one
If you look at the indicator matrix (cooccurrence reduced by LLR), you will
usually have asymmetry due to limitations on the number of indicators per
row.
This will give you some interesting results when you look at the column
sums. I wouldn't call it popularity, but it is an interesting
The problem with the usual preference count is that big hit items can be
overwhelmingly popular. If you want to know which ones the most people saw and
are likely to have an opinion about then this seems a good measure. But these
hugely popular items may not differentiate taste.
So we
A velocity measure of sorts makes a lot of sense for a “what’s hot” list.
The particular thing I’m looking at now is how to rank a list of items by some
measure of popularity when you don’t have a velocity. There is an introduction
date though so another way to look at popularity might be to
One way to deal with that is to build a model that predicts the ultimate number
of views/plays/purchases for the item based on history so far.
If this model can be made Bayesian enough to sample from the posterior
distribution of total popularity, then you can use the Thompson sampling trick
Trying to come up with a relative measure of popularity for items in a
recommender. Something that could be used to rank items.
The user-item preference matrix would be the obvious thought. Just sum the
number of preferences per item. Maybe transpose the preference matrix (the temp
DRM
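The column-sum version of that can be sketched as follows (hypothetical toy matrix); summing each column of the user-by-item matrix is equivalent to summing each row of its transpose:

```python
import numpy as np

# Hypothetical user-by-item preference matrix (1 = preference).
prefs = np.array([[1, 0, 1, 1],
                  [1, 1, 0, 0],
                  [1, 0, 0, 1]])

# Popularity as raw preference count per item: the column sums.
popularity = prefs.sum(axis=0)
ranking = np.argsort(popularity)[::-1]  # item indices, most popular first
print(popularity, ranking)
```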