True, making your project independent. That should already work so go for it.

On Apr 26, 2014, at 10:21 AM, Saikat Kanjilal <[email protected]> wrote:

That shouldn't technically matter. My thought is to create a Spring-based 
Elasticsearch recommender that leverages the Spark cooccurrence code underneath.

Sent from my iPad

> On Apr 26, 2014, at 10:07 AM, "Pat Ferrel" <[email protected]> wrote:
> 
> Oh, and the example uses the old Hadoop MapReduce code; we’re redoing this 
> with the new Spark cooccurrence code, which will replace the ItemSimilarity job.
> 
> On Apr 26, 2014, at 10:03 AM, Pat Ferrel <[email protected]> wrote:
> 
> If you want, fork the github repo, do the integration and create a pull 
> request. If the pull is accepted it will automatically be included in the 
> Mahout build’s examples.
> 
> Some things to consider:
> 1) It is actually easier to use either Solr/Lucid/ElasticSearch’s web GUI for 
> bare-bones illustration purposes. You’d have to enter the recs query by hand. 
>  For demo purposes some example queries could be created ahead of time to 
> illustrate the recs generating queries. I did this myself but didn’t include 
> it in the example. I’d actually recommend this as a simple illustration.
> 2) I’d suspect the Solr+DB integration route would be the most common way 
> people would actually use this, but I could be wrong. This is what I did on 
> the demo site, but it is far beyond what you’d put in an example.
> 3) What data to use? Unless the data has human-readable item ids, the demo is 
> not as compelling.
> 
> I can’t give you the demo site’s data since I mined the web for it, which 
> allows me to use it but I don’t think I can republish it. Data actually 
> gathered on the site by users I could share but there isn’t enough to work 
> with. Maybe Ted has some from his demo.
> 
> On Apr 26, 2014, at 9:18 AM, Saikat Kanjilal <[email protected]> wrote:
> 
> 
> 
> Sent from my iPad
> 
>> On Apr 26, 2014, at 9:18 AM, "Saikat Kanjilal" <[email protected]> wrote:
>> 
>> Is it worth adding the Elasticsearch piece to the demo and tying it into a 
>> generic MVC framework like Spring? In fact, we could leverage Spring Data's 
>> Elasticsearch plugin.
>> 
>> Sent from my iPad
>> 
>>> On Apr 26, 2014, at 9:08 AM, "Pat Ferrel" <[email protected]> wrote:
>>> 
>>> Yes, it already does. It’s not named well; all it really does is create an 
>>> indicator matrix (item-item similarity using LLR) in a form that is 
>>> digestible by a text indexer. You could use Solr or ElasticSearch to do the 
>>> indexing and queries.
>>> 
>>> In the actual installation on the demo site https://guide.finderbots.com 
>>> the indicator matrix is put into a DB and Solr is used to index the item 
>>> collection’s similarity data field. The queries are handled by the web app 
>>> framework. If I swapped out Solr for ElasticSearch for indexing the DB, it 
>>> would work just fine and I looked into how to integrate it with my web app 
>>> framework (RoR). The integration methods were significantly different 
>>> though so I chose not to do both.
>>> 
>>> The reason I chose to put the indicator matrix in the DB is because it 
>>> makes it very convenient to mix metadata into the recs queries. In the case 
>>> of the demo site where the items are videos I have a bunch of 
>>> recommendation types:
>>> 1) user-history based recs: the query is the user’s recent “likes” history, 
>>> run against the videos collection on the similar items field, which is a 
>>> list of video id strings. This is usually what people think a recommender 
>>> does, but it is only the start.
>>> 2-9) use various methods of biasing the results by genre metadata. 
>>> Search engines also allow filtering by fields, so you can specify videos 
>>> filtered by source. So you can get comedies based on your “likes” filtered 
>>> by source = Netflix. In fact, when you set the source filter to Netflix, 
>>> every set of recs will contain only those on Netflix.
>>> 
>>> There are so many ways to combine bias with filter and what you use as the 
>>> query, that putting the fields in a DB made the most sense. I am still 
>>> thinking of new ways to use this. One example is item-set similarity, which 
>>> is used to give shopping cart recs in some systems. On the demo site you 
>>> could do the same with the watchlist if there were enough watchlists. Use 
>>> the user’s watchlist as the query against all other watchlists and get back 
>>> an ordered set of watchlists most similar to yours, then take recs from there.
>>> 
>>> Some day I’ll write some blog posts about it but I’d encourage anyone with 
>>> data to try the DB route rather than raw indexing of the text files just 
>>> for the amazing flexibility and convenience it brings.
>>> 
>>> On Apr 26, 2014, at 8:25 AM, Saikat Kanjilal <[email protected]> wrote:
>>> 
>>> Pat,
>>> I was wondering if you'd given any thought to genericizing the Solr 
>>> recommender to work with both Solr and Elasticsearch. Namely, are there 
>>> pieces of the recommender that could plug into or be lifted above a search 
>>> engine (or, in the case of Elasticsearch, a set of REST APIs)? I would be 
>>> very interested in helping out with this.
>>> 
>>> Thoughts?
>>> 
>>> Sent from my iPad
>>> 
> 
> 
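For readers following the thread, here is a minimal Python sketch of what the indicator matrix described above amounts to. It is not the Mahout or Spark code. It assumes the standard entropy form of the log-likelihood ratio (LLR) on a 2x2 cooccurrence table, and the function names (`llr`, `indicators`), the `top_n` cutoff, and the sample data are all made up for illustration. Each item ends up with a space-separated string of similar item ids, which is the kind of field a text indexer such as Solr or Elasticsearch can index.

```python
import math
from collections import defaultdict
from itertools import combinations


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized entropy of a list of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 table.

    k11: users with both items, k12: only the first item,
    k21: only the second item, k22: neither item.
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))  # clamp rounding noise at zero


def indicators(user_items, top_n=2):
    """Map each item id to a space-separated string of its top similar items.

    user_items is a dict of user id -> set of item ids (hypothetical input shape).
    """
    n_users = len(user_items)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in user_items.values():
        for item in items:
            item_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] += 1

    scored = defaultdict(list)
    for (a, b), k11 in pair_count.items():
        k12 = item_count[a] - k11
        k21 = item_count[b] - k11
        k22 = n_users - k11 - k12 - k21
        score = llr(k11, k12, k21, k22)
        if score > 0:
            scored[a].append((score, b))
            scored[b].append((score, a))

    # Highest score first; ties broken by item id so the output is stable.
    return {
        item: " ".join(other for _, other in
                       sorted(pairs, key=lambda p: (-p[0], p[1]))[:top_n])
        for item, pairs in scored.items()
    }


# Made-up interaction data: user id -> items the user liked.
users = {
    "u1": {"a", "b"},
    "u2": {"a", "b"},
    "u3": {"a", "b", "c"},
    "u4": {"c", "d"},
    "u5": {"c", "d"},
    "u6": {"d"},
}
indicator_fields = indicators(users)
```

Note that LLR measures how surprising a cooccurrence count is, so a production job would typically add thresholds and downsampling that this sketch leaves out.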

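The recs queries described in the thread (user history as the query, genre as a bias, source as a filter) can be sketched as a search-engine request body. The example below is only an illustration under assumptions: it uses the `bool` query form of Elasticsearch's query DSL with `match` and `term` clauses, and the field names `similar_items`, `genres`, and `source`, the function name `build_recs_query`, and the boost value are hypothetical, not taken from the demo site.

```python
import json


def build_recs_query(liked_ids, genre=None, source=None, size=10, genre_boost=2.0):
    """Build a request body for a user-history recs query.

    liked_ids: the user's recent "likes", used as the query text against the
               field holding each item's similar-item ids (assumed field name).
    genre:     optional genre used to bias (not restrict) the ranking.
    source:    optional exact-match filter, e.g. only items from one provider.
    """
    bool_query = {
        # Items whose indicator field shares ids with the user's history match.
        "must": [{"match": {"similar_items": " ".join(liked_ids)}}],
    }
    if genre is not None:
        # "should" clauses raise the score of matching items without excluding others.
        bool_query["should"] = [
            {"match": {"genres": {"query": genre, "boost": genre_boost}}}
        ]
    if source is not None:
        # "filter" clauses restrict results and do not affect scoring.
        # A term query is not analyzed, so this assumes an exact-value field.
        bool_query["filter"] = [{"term": {"source": source}}]
    return {"size": size, "query": {"bool": bool_query}}


# Comedies biased by the user's likes, restricted to one source (made-up ids).
query = build_recs_query(["v12", "v7", "v33"], genre="comedy", source="netflix")
print(json.dumps(query, indent=2))
```

The resulting dictionary would be sent as the JSON body of a search request. With Solr the same idea would be expressed through its query and filter-query parameters instead.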