Precisely; that's the one, thanks!

On Fri, Feb 14, 2014 at 12:45 PM, Pat Ferrel <[email protected]> wrote:

> Not sure if this is what you are looking for. I assume you are talking
> about Ted's paper describing a Solr based recommender pipeline?
>
> Much of the paper was implemented in the solr-recommender referenced
> below, which has a fairly flexible parallel version of a logfile reader
> that uses Cascading for mapreduce. It picks out columns in delimited text
> files. You can choose a constant string for your action id, like "purchase"
> or "thumbs-up". Then specify the field index for user, item, and action. It
> assumes strings for all these inputs and creates
> string-id->Mahout-Integer-id->string-id bidirectional hashmaps as
> dictionary and reverse dictionary. Everything is scalable except the
> BiHashmaps, which are in-memory. They aren't usually too big for that.
> There is also a pattern for the input log file names and they are searched
> for recursively from some root directory.
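To make the reader described above concrete, here is a minimal single-machine sketch in Python. It is not the solr-recommender code (that uses Cascading); the names BiDict and read_actions, the tab delimiter, and the sample log lines are all assumptions made for illustration. It picks the user, item, and action columns by index, keeps only rows whose action matches a constant string, and builds the string-id <-> integer-id dictionaries in memory.

```python
import csv


class BiDict:
    """In-memory two-way map between external string IDs and dense integer IDs.

    Hypothetical helper for illustration; not the class used in solr-recommender.
    """

    def __init__(self):
        self.to_int = {}   # string id -> integer id (dictionary)
        self.to_str = []   # integer id -> string id (reverse dictionary)

    def get_or_add(self, key):
        idx = self.to_int.get(key)
        if idx is None:
            idx = len(self.to_str)
            self.to_int[key] = idx
            self.to_str.append(key)
        return idx


def read_actions(lines, action, user_col, item_col, action_col, delimiter="\t"):
    """Return (user_int, item_int) pairs for rows whose action column equals `action`."""
    users, items = BiDict(), BiDict()
    pairs = []
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) <= max(user_col, item_col, action_col):
            continue  # skip short or malformed rows
        if row[action_col] != action:
            continue  # keep only the chosen action, e.g. "purchase"
        pairs.append((users.get_or_add(row[user_col]),
                      items.get_or_add(row[item_col])))
    return pairs, users, items


# Made-up log lines: timestamp, user, action, item.
log = [
    "2014-02-14\tu1\tpurchase\tiphone",
    "2014-02-14\tu2\tview\tipad",
    "2014-02-14\tu2\tpurchase\tiphone",
    "2014-02-14\tu1\tpurchase\tipad",
]
pairs, users, items = read_actions(log, "purchase",
                                   user_col=1, item_col=3, action_col=2)
```

The integer pairs are what a Mahout-style job would consume; items.to_str maps recommender output back to the original string IDs.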
>
> Caveat emptor: not all the options are implemented or tested. One person
> has already implemented a scaffolded option and their pull request was
> merged so feel free to contribute.
>
> It is an example of how to digest logfiles, build Mahout data, and run the
> recommender. It creates Solr indexing data too but the output of the
> recommender is up to you to implement. It is a Solr query or a lookup in
> the Mahout recommender DRM output.
>
> https://github.com/pferrel/solr-recommender
>
>
> On Feb 14, 2014, at 12:39 PM, Ted Dunning <[email protected]> wrote:
>
> Yes!
>
> But it is very hard to find the time.
>
>
>
> On Fri, Feb 14, 2014 at 11:51 AM, Andrew Musselman <
> [email protected]> wrote:
>
> > I'd like to see cross-recommendations added too.
> >
> > But I also want some automation of the steps required to build a simple
> > recommender like the solr/mahout example Ted and Ellen have in their
> > pamphlet.
> >
> > Lowering the barrier to entry by providing a sample pipeline would help a
> > lot of folks get started and hopefully would keep them interested.  Perhaps
> > in examples/bin?
> >
> >
> > On Fri, Feb 14, 2014 at 10:56 AM, Pat Ferrel <[email protected]>
> > wrote:
> >
> >> There's been work done on the cross-recommender. There is a Mahout-style
> >> XRecommenderJob that has two preference models for two actions or
> >> preference types. It uses matrix multiply to get a cooccurrence type
> >> similarity matrix. If we had a cross-row-similarity-job, it could pretty
> >> easily be integrated and I'd volunteer to integrate it. The XRSJ is
> >> probably beyond me right now so if we can scare up someone to do that we'd
> >> be a long way down the road.
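As a rough illustration of the matrix-multiply idea, here is a small pure-Python sketch. It assumes the usual formulation, which the message does not spell out: with a user-by-item matrix A for the primary action and B for the secondary action, A^T A gives item cooccurrence and A^T B gives cross-cooccurrence. The matrices and the helper names are made up, and this is not the XRecommenderJob code.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain dense matrix product; fine for toy data, not for real workloads."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


# Rows are users, columns are items; 1 means the user took the action.
purchases = [   # primary action (assumed: purchase)
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
]
views = [       # secondary action (assumed: view)
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
]

# Item-item cooccurrence within one action: A^T A.
cooccurrence = matmul(transpose(purchases), purchases)

# Cross-cooccurrence between actions: A^T B. Entry [i][j] counts users
# who purchased item i and viewed item j.
cross_cooccurrence = matmul(transpose(purchases), views)
```

Raw counts like these would normally be filtered or reweighted by a similarity measure before being used as a model.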
> >>
> >> I'll put a feature request into Jira and take this to the dev list.
> >>
> >> BTW this is already integrated with the solr-recommender.
> >>
> >> On Feb 8, 2014, at 7:19 PM, Ted Dunning <[email protected]> wrote:
> >>
> >> I have different opinions about each piece.
> >>
> >> I think that cross recommendation is as core as RowSimilarityJob and should
> >> be a parallel implementation or integrated.  Parallel is probably easier.
> >> It is even plausible to have a version of RowSimilarityJob that doesn't
> >> support all the different distance measures but does support multiple cross
> >> and direct processing using LLR or related cooccurrence based measures.  It
> >> would be very cool if a single pass over the data could do many kinds of co
> >> or cross occurrence operations.
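For readers who have not met LLR, here is a minimal sketch of the log-likelihood ratio (G^2) score on a 2x2 contingency table, written in the entropy form. The function names and the example counts are assumptions for illustration; it is not a copy of any Mahout class. The same score applies whether the counts come from cooccurrence or cross-occurrence.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 table of event counts.

    k11: both events together       k12: first event without the second
    k21: second without the first   k22: neither event
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


independent = llr(10, 10, 10, 10)   # counts consistent with independence
associated = llr(100, 5, 5, 1000)   # strongly associated pair
```

A table consistent with independence scores near zero, while a strongly associated pair scores high, which is what makes the measure useful for filtering cooccurrence counts.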
> >>
> >> For dithering, it really is post processing.  That said, it is also the
> >> single largest improvement that anybody typically gets when testing
> >> different options so it is a bit goofy to not have good support for some
> >> kinds of dithering.
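Since dithering is pure post-processing, it can be sketched in a few lines. The formulation below is one commonly described variant and an assumption here, since the message does not give a formula: add Gaussian noise with standard deviation ln(epsilon) to the log of each item's rank and re-sort. The function name, the epsilon default, and the sample list are illustrative only.

```python
import math
import random


def dither(ranked_items, epsilon=2.0, rng=None):
    """Reorder a ranked list by sorting on log(rank) + Gaussian noise.

    epsilon = 1.0 gives zero noise and leaves the order unchanged; larger
    values mix deeper results into the top of the list more often.
    """
    if rng is None:
        rng = random.Random()
    sigma = math.log(epsilon)
    noisy = [(math.log(rank) + rng.gauss(0.0, sigma), item)
             for rank, item in enumerate(ranked_items, start=1)]
    noisy.sort(key=lambda pair: pair[0])
    return [item for _, item in noisy]


recommendations = ["a", "b", "c", "d", "e", "f"]
shuffled = dither(recommendations, epsilon=2.0, rng=random.Random(42))
unchanged = dither(recommendations, epsilon=1.0)
```

Because the noise is added to log(rank), the top few positions are swapped among themselves often while items far down the list only occasionally surface, so each page view shows a slightly different list.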
> >>
> >> For Thompson sampled recommenders, I am not sure where to start hacking on
> >> our current code.
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Sat, Feb 8, 2014 at 4:53 PM, Pat Ferrel <[email protected]> wrote:
> >>
> >>> That was by no means to criticize effort level, which has been impressive
> >>> especially during the release.
> >>>
> >>> It was more a question about the best place to add these things and
> >>> whether they are important. Whether people see these things as custom post
> >>> processing or core.
> >>>
> >>> On Feb 8, 2014, at 12:13 PM, Ted Dunning <[email protected]> wrote:
> >>>
> >>> ...
> >>>
> >>> The reason that we aren't adding things like cross-rec is
> >>> that "we" have full-time jobs, mostly.  Suneel is full-time on Mahout, but
> >>> the rest are not.  You seem more active than most.
> >>>
> >>>
> >>>
> >>
> >>
> >
>
>
