Pat, the "similarity recommender" looks great. I'd like to explain why I think so, based on my understanding of Fusion, which I have never used; I have only read the documentation.
It looks to me that Lucidworks proposes something like a document clustering/search engine++. Possibly something like http://search.carrot2.org and similar, but very B2B oriented. The "search engine++" part comes from the fact that, thanks to the recent developments described in Ted and Helen's book, they can now also provide recommendations.

I see a problem with Lucidworks' Fusion. It is not obvious to me how they can blend a B2B feature (the clustering, possibly with ACLs for accessing various documents inside a company) with a B2C feature (recommendations). B2B is inherently not suited for collaborative filtering, as discovered by Google 10+ years ago.

More specifically, Lucidworks says it provides an "extensible connector framework to gather content from a rich variety of sources such as Twitter, SharePoint, *databases, web sites*..." This is good if the service can provide me with the documents in my "information ecosystem" that are connected to the document I am reading now. E.g., if I am a lawyer reading the new parliament bill, the software provides me with any document my law firm possesses, plus any webpage we have access to, that is connected to that bill. They also provide a way to easily index websites for which we have a username and password. This is great, but can this product be bundled with collaborative filtering recommendations?

Back to the real point, i.e. Mahout/Spark. Its strength is a great recommender, with the possibility of leveraging search engines (because the weight is -log(frequency) etc). This is different from Lucidworks. Mahout/Spark is, before anything, a recommender. If the "similarity recommender" can provide the features you listed, it would not only be great software for implementing B2C businesses, but would also not be in direct competition with Lucidworks' Fusion. Fusion is not something I'd take as a "benchmark". I'd not even compare the "similarity recommender's" features to Fusion's. They look quite different to me.
And, as a developer, I'd stress that few things are more important to me than providing a RESTful API to a business. It makes integration so much easier! This would be the first real asset for me. BTW, this is also the reason why, I guess, Cloudera saw potential in Oryx.

Cheers!
Mario

On Tue, Sep 23, 2014 at 6:20 PM, Pat Ferrel <[email protected]> wrote:

> Name suggestions are appreciated but this is meant to be about the
> similarity engine (search engine) recommender.
>
> Recently Lucidworks (the Solr people) announced Fusion, a closed source
> extension to the Lucidworks offering. It includes a recommender API, which
> makes it easier to deal with non-text "signals". What they call signal we
> have been calling "indicator". They don't create indicators--just make them
> easier to index and query. If you're following this discussion you may want
> to have a look at it.
> https://docs.lucidworks.com/display/fusion/Lucidworks+Fusion+Documentation
>
> As more interest builds in this approach it's time to talk about features
> and use cases:
>
> Features:
> 1) Multiple user action collaborative filtering type indicators to use more
> user behavior than possible with single action recommenders.
> 2) Realtime reaction to new interactions. Both anonymous users and known
> users with extremely recent interaction history can get personalized
> recommendations.
> 3) blended CF, metadata, and content indicators. Blended at query time so
> the same model will support a great variety of blends without re-training.
> 4) ability to use context to affect recommendations
> 5) addresses the cold-start problem with metadata and content indicators.
> In other words items with no interaction history can be recommended to
> completely new users
> 6) simple recs query API encapsulating and insulating the developer from
> indicator complexity.
>
> Use cases:
> 1) cooccurrence collaborative filtering recommender, better quality
> because multiple user actions can be used
> 2) context affects recs so recs can be specialized based on factors like
> geolocation of user, mobile vs web, place on site, category being browsed,
> time of year.
> 3) user profile data can be used to skew recs for things like gender or
> stated categorical preferences
> 4) enable cold-start recommendations that gracefully and automatically
> improve in situations where more data is available.
> 5) enable item-set recs for shopping carts, wishlists, and other session
> specific groupings
> 6) improve recommendations in near realtime: as users take new actions,
> recs reflect them. This, for example, enables NRT recs for new users based
> on their recent views because #1 has allowed views to be used to recommend
> purchases.
> 7) the same recommender using the same API can be used anywhere along the
> spectrum of content-based to collaborative filtering based. The developer
> can use the recommender in exactly the same way as they back-fill data to
> gradually improve recs.
> 8) the content-based flavor of the recommender enables personalized
> recommendations, not just item similarity-based, even without cooccurrence
> data. This allows content apps like news to personalize content-based recs
> even though they never get enough CF data on short-lived articles to make
> CF recs.
>
> You can see why I call it universal but maybe that's a bit too much
> hyperbole.
>
> Did I miss important use cases or needed features?
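To make the "multiple actions" and "blended at query time" points in the quoted list a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not Mahout's or Fusion's API: the function names, the toy data, and the weights are all hypothetical, and it uses raw cooccurrence counts where the Mahout approach selects indicators with a log-likelihood ratio (LLR) test and hands the query to a search engine.

```python
from collections import defaultdict
from itertools import product


def cooccurrence(primary, secondary, skip_same_item=False):
    """For each primary-action item, count the secondary-action items
    seen in the history of the same user.

    Raw counts are a simplification (assumption): no LLR filtering here.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for user, primary_items in primary.items():
        for p, s in product(primary_items, secondary.get(user, ())):
            if skip_same_item and p == s:
                continue
            counts[p][s] += 1
    return counts


def recommend(history, indicators, weights, top_n=3):
    """Score items against the user's history, one indicator table per action.

    The weights are supplied at query time, so the blend can change
    without rebuilding the indicator tables.
    """
    scores = defaultdict(float)
    for action, table in indicators.items():
        w = weights.get(action, 0.0)
        recent = history.get(action, set())
        if w == 0 or not recent:
            continue
        for item, linked in table.items():
            hits = sum(c for other, c in linked.items() if other in recent)
            if hits:
                scores[item] += w * hits
    owned = history.get("purchase", set())
    ranked = sorted((i for i in scores if i not in owned),
                    key=lambda i: (-scores[i], i))
    return ranked[:top_n]


# Toy interaction logs (hypothetical): user -> set of items per action.
purchases = {"u1": {"phone", "case"}, "u2": {"phone", "charger"},
             "u3": {"tablet", "case"}}
views = {"u1": {"phone", "case", "charger"}, "u2": {"phone", "charger"},
         "u3": {"tablet", "case", "phone"}}

# "Training": one indicator table per action, all anchored on purchases.
indicators = {
    "purchase": cooccurrence(purchases, purchases, skip_same_item=True),
    "view": cooccurrence(purchases, views),
}

# A new user with only one recent view still gets purchase recommendations.
new_user_recs = recommend({"view": {"phone"}}, indicators,
                          {"purchase": 1.0, "view": 0.5})

# Same tables, different blends chosen at query time.
known_user = {"purchase": {"phone"}, "view": {"tablet"}}
cf_only = recommend(known_user, indicators, {"purchase": 1.0, "view": 0.0})
view_heavy = recommend(known_user, indicators, {"purchase": 1.0, "view": 2.0})
```

The point of the sketch is the shape of the query: the indicator tables are built once, and the per-action weights (or any context filter one might add) are just query parameters, which is what makes a thin RESTful wrapper around it plausible.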
