Wow!  What a great community :-D

Working through all this will take me a little while.  Thank you and I'll
probably have more questions once I'm up to speed.

Thanks again
Peter

On Sun, Sep 14, 2014 at 11:22 AM, Pat Ferrel <[email protected]> wrote:

> We’ve developed the idea that multiple indicators can be blended at
> recommendation time.
>
> Imagine a catalog of items. Each item has many indicators of similarity.
> One indicator might be similar items measured by users who preferred the
> item. These are the classical cooccurrence/collaborative filtering
> indicators. However if you have a near realtime similarity engine (search
> engine) you can mix all sorts of indicators and boost each to your purpose.
> The user is the query: it is made up of whichever indicators apply to the
> user. In the simple collaborative filtering case that is the items they
> prefer, but indicators can be of many flavors.
>
> For instance you could use the content of the articles a user reads.
> Calculate (often) for each item which other items have similar content and
> add those as a new indicator field in your catalog. The articles a user has
> viewed then act as a content-based indicator of preference, and with the
> content-based similar items you can recommend new articles on content alone.
>
> Let’s create the catalog with indicators for view (CF type indicator,
> similar articles by who viewed them), category, “hotness”, and content
> (similar articles by content). The user has indicators like the items they
> have viewed, the category they are currently looking at (or prefer
> depending on your application). Now use this as a query to the similarity
> engine. Each field of the query maps to one or more indicators. The user’s
> views map to the CF indicators _and_ content indicators, perhaps with
> different boosts. The category maps to the category field of the articles.
> Perform the query and order by “hotness”. Or put hotness in the query and
> boost that part of the query to favor hot articles.
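
The blended query described above can be sketched as a toy in-memory scorer. This is only an illustration under assumptions: the catalog contents, the field names (view_cf, content, category, hotness), the boost values, and the recommend function are all hypothetical, and a real setup would hand the same structure to a search engine instead of scoring items in Python.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical catalog: item id -> indicator fields (all values are made up).
CATALOG: Dict[str, dict] = {
    "a1": {"view_cf": {"a2", "a3"}, "content": {"a3"},
           "category": {"sports"}, "hotness": 0.2},
    "a4": {"view_cf": set(), "content": {"a2"},
           "category": {"sports"}, "hotness": 0.9},
    "a5": {"view_cf": {"a9"}, "content": set(),
           "category": {"politics"}, "hotness": 0.5},
}

# Which catalog fields each part of the user "query" is matched against.
# The user's views map to both the CF and the content indicators.
FIELD_MAP: Dict[str, List[str]] = {
    "views": ["view_cf", "content"],
    "category": ["category"],
}


def recommend(user: Dict[str, Set[str]],
              boosts: Dict[str, float],
              hotness_boost: float = 0.0) -> List[Tuple[str, float]]:
    """Score every unseen item by boosted indicator overlap plus hotness."""
    scored = []
    for item_id, fields in CATALOG.items():
        if item_id in user.get("views", set()):
            continue  # do not recommend what the user already viewed
        score = hotness_boost * fields["hotness"]
        for user_field, indicator_fields in FIELD_MAP.items():
            terms = user.get(user_field, set())
            for name in indicator_fields:
                # Overlap count stands in for a search engine's relevance score.
                score += boosts.get(name, 1.0) * len(terms & fields[name])
        scored.append((item_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling recommend({"views": {"a2"}, "category": {"sports"}}, {"view_cf": 3.0, "content": 1.0, "category": 0.5}, hotness_boost=1.0) ranks the CF match a1 first. Swapping the view_cf and content boosts puts the content match a4 first, and a user with only a category gets the hot items in that category.
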
>
> There are also techniques to use user attributes or actions that may seem
> unrelated to the action you want to recommend (views in your case?). For
> instance, say you want to recommend views but you also have “thumbs-down”
> data. You use something like spark-itemsimilarity to create cross-action
> indicators. Put them in your catalog in a separate field, and when you
> create the user (aka similarity engine query) make sure to include their
> “thumbs-down” history mapped to the thumbs-down cross-indicator. This won’t
> always work, since there may be no correlation, but it’s ok to include it
> because the tools you use from Mahout will generally discover
> non-correlation.
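
One common way to test whether two actions are correlated is Dunning's log-likelihood ratio on a 2x2 table of counts. The sketch below shows that test on its own. Treat it as an assumption that it matches what the Mahout tools do internally, and note that the function names and the example counts are mine.

```python
import math


def x_log_x(x: float) -> float:
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts: float) -> float:
    """Unnormalized Shannon entropy of a list of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11: int, k12: int, k21: int, k22: int) -> float:
    """Log-likelihood ratio for a 2x2 contingency table.

    k11: users with both actions (e.g. thumbs-down on A and view of B)
    k12: users with the first action only
    k21: users with the second action only
    k22: users with neither action
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))
```

A table with independent counts such as llr(10, 10, 10, 10) scores essentially zero, so that pair would not become an indicator, while a table where the two actions mostly occur together scores much higher.
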
>
> What you know about the user may be spotty or even empty. But the blended
> CF, metadata, and content recommender described above will be able to make
> recommendations even for new users and new content. In this degenerate case
> you’ll get hot articles in the category the user is viewing. The more you
> know about the user, the more personalized the recommendations become.
> Boosting the importance of CF indicators over content indicators means you
> are favoring CF, but this is not a filter: depending on how large the boost
> is, some content-based recs may still get in if they are strong enough.
> Flip the boosts to favor content over CF.
>
> Currently we have spark-itemsimilarity and spark-rowsimilarity to create
> CF type indicators, cross-indicators, and content indicators. The metadata
> can be taken as-is for indicators in your catalog (categories, color, tags,
> hotness, etc.)
>
> We’re starting to document and make these techniques easier to use. See
> Ted’s book here: https://www.mapr.com/practical-machine-learning and some
> related blog posts here:
> http://occamsmachete.com/ml/2014/09/09/mahout-on-spark-whats-new-in-recommenders-part-2/
>
> There are a lot of moving parts so I’d take each of the things you want to
> affect the recs and think about how to create an indicator with them.
>
> We also have a way to create a multi-armed bandit effect but I’ve already
> blathered on for long enough.
>
> On Sep 13, 2014, at 4:25 PM, Peter Wolf <[email protected]> wrote:
>
> Awesome!  Thank you very much Ted.  I'll try that
>
> As I am just getting started with Mahout, can you recommend any good
> example code that does something similar?
>
> On Sat, Sep 13, 2014 at 3:38 PM, Ted Dunning <[email protected]>
> wrote:
>
> > Rebuilding every day works very well in practice, but it captures a
> > moving average, not a good estimate of the current popularity of items.
> >
> > A simple hack is to implement a search-based recommender and simply put
> > an empirically scaled boost on items which are rising rapidly in
> > popularity. Of course you should also have a specialized page that shows
> > popular items and another that shows rapidly rising items.
> >
> > The simplest approach to marking rapidly rising items that I know is to
> > use the log of recent plays over less recent plays, offsetting both
> > counts in a manner similar to Laplace correction.  The philosophy behind
> > the score is that for power law play counts, log play count is
> > proportional to -log rank.  Then, the thought is that something that
> > rises from 2000-th rank to 1000-th rank is rising as significantly as
> > something going from 100-th to 50-th.
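
The rising-item score described above can be written in a few lines. This is a minimal sketch, assuming the two counts come from windows of equal length. The function name and the offset value of 10 are my own choices, since the message gives no specific constant.

```python
import math


def rising_score(recent_plays: int, older_plays: int, offset: float = 10.0) -> float:
    """Log ratio of recent to older play counts, with both counts offset.

    The offset plays the role of a Laplace-style correction: it keeps the
    ratio defined when a count is zero and damps the score for items with
    very few plays. Positive means rising, negative means falling.
    """
    return math.log((recent_plays + offset) / (older_plays + offset))
```

With this offset, an item going from 100 to 200 plays scores well above one going from 1 to 2 plays, even though both doubled, and an item with no plays in either window scores exactly zero.
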
> >
> >
> > On Sat, Sep 13, 2014 at 11:25 AM, Peter Wolf <[email protected]> wrote:
> >
> >> Thanks Dmitriy,
> >>
> >> Is anyone working on an open source version of RLFM?
> >>
> >> For the moment, I have few enough classes of users that I can just build
> >> multiple recommenders.  For example, one for men and one for women.
> >>
> >> What about adaptive on-line algorithms?  Just like in Agarwal's Yahoo
> >> research, my items may rise and fall in popularity over time.  In fact,
> >> time may be more important than user preferences in my application.
> >>
> >> Do I just rebuild every day with a window of recent data, or does Mahout
> >> have something better?
> >>
> >> On Sat, Sep 13, 2014 at 12:26 PM, Dmitriy Lyubimov <[email protected]>
> >> wrote:
> >>
> >>> AFAIK Mahout doesn't have these algorithms. Agarwal's RLFM is one of
> >>> the more promising approaches that does that while still being simple
> >>> enough to implement at scale.
> >>> On Sep 13, 2014 9:07 AM, "Peter Wolf" <[email protected]> wrote:
> >>>
> >>>> Hello, I am new to Mahout but not ML in general
> >>>>
> >>>> I want to create a Recommender that combines things I know about
> >>>> Users with their Ratings.
> >>>>
> >>>> For example, perhaps I know the sex, age and nationality of my users.
> >>>> I'd like to use that information to improve the recommendations.
> >>>>
> >>>> How is this information represented in the Mahout API?  I have not
> >>>> been able to find any documentation or examples about this.
> >>>>
> >>>> Thanks
> >>>> Peter
> >>>>
> >>>
> >>
> >
>
>
