This sounds great.  Go for it.  Put a comment on the design doc with a
pointer to text that I should import.




On Tue, Jul 23, 2013 at 9:39 AM, Pat Ferrel <[email protected]> wrote:

> I can supply:
>
> 1) a Maven-based project in a public GitHub repo as a baseline that
> creates the following
> 2) ingest and split actions, in-memory, single process, from text file,
> one line per preference
> 3) create DistributedRowMatrixes one per action (max of 3) with unified
> item and user space
> 4) create the 'similarity matrix' for [B'B] using LLR and [B'A] using
> matrix multiply/cooccurrence.
> 5) can take a stab at loading Solr.  I know the Mahout side and the
> internal to external ID translation. The Solr side sounds pretty simple for
> this case.
>
> This pipeline lacks downsampling since I had to replace
> PreparePreferenceMatrixJob and potentially LLR for [B'A]. I assume
> Sebastian is the person to talk to about these bits?
>
> The job this creates is launched with the hadoop script. Each job extends
> AbstractJob, so it runs locally or using HDFS or MapReduce (at least for
> the Mahout parts).
>
> I have some obligations coming up so if you want this I'll need to get
> moving. I can have the project ready on GitHub in a day or two. The Solr
> integration may take longer, and if someone has a passion for taking
> that bit on, great. This work is in my personal plans for the next couple
> of weeks anyway, as it happens.
>
> Let me know if you want me to proceed.
>
> On Jul 22, 2013, at 3:42 PM, Ted Dunning <[email protected]> wrote:
>
> On Mon, Jul 22, 2013 at 12:40 PM, Pat Ferrel <[email protected]>
> wrote:
>
> > Yes.  And the combined recommender would query on both at the same time.
> >
> > Pat-- doesn't it need ensemble type weighting for each recommender
> > component? Probably a wishlist item for later?
>
>
> Yes.  Weighting different fields differently is a very nice (and very easy)
> feature.
>
>
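
For readers following step 4 of Pat's list in the quoted message, here is a minimal, standalone sketch of how an LLR score for one item pair can be computed from cooccurrence counts. It is only an illustration under stated assumptions: it does not use Mahout's classes, the names LlrSketch, llr, and cooccurrenceLlr are hypothetical, and a small dense boolean matrix (rows are users, columns are items) stands in for a DistributedRowMatrix.

```java
// Standalone sketch; not Mahout code. All names here are hypothetical.
final class LlrSketch {

    // x * ln(x), using the convention that 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy of a set of counts: N ln N - sum(k ln k).
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            sum += c;
            parts += xLogX(c);
        }
        return xLogX(sum) - parts;
    }

    // Log-likelihood ratio for a 2x2 contingency table:
    //   k11 = both events, k12 = first only, k21 = second only, k22 = neither.
    static double llr(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Clamp at zero to absorb tiny negative values from round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + colEntropy - matrixEntropy));
    }

    // Builds the 2x2 table for items i and j from a user-by-item
    // interaction matrix, then scores the pair with llr().
    static double cooccurrenceLlr(boolean[][] b, int i, int j) {
        long k11 = 0, k12 = 0, k21 = 0, k22 = 0;
        for (boolean[] user : b) {
            if (user[i] && user[j]) {
                k11++;
            } else if (user[i]) {
                k12++;
            } else if (user[j]) {
                k21++;
            } else {
                k22++;
            }
        }
        return llr(k11, k12, k21, k22);
    }
}
```

A pair of items whose interactions are statistically independent scores close to zero, while a pair that always appears together (or never) scores high. The real pipeline would also need the downsampling that Pat notes is missing.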
