Hi Deejay,
> 1. We're using Mahout as a recommendation system. Has anyone had any success
> plugging Neo4j into this?
I work with various companies that use graph databases for recommendation.
A couple of them are also experimenting with Mahout over MySQL and Hadoop. I
had never thought about backing Mahout with a graph database. That is an
interesting idea... Here are three related thoughts (though they do not
directly address your question):
1. With Mahout (as I understand it, as of a few versions back), only
single-relational data can be processed. That is, it only supports data of the
form "X likes[weight] Y," where weight can be binary or real-valued. When a
domain model is sufficiently complex (people liking things, people knowing each
other, people working in the same or similar places, products having features,
designers, etc.), there is more information in the domain that can be
capitalized on for recommendation.
2. With pure graph-based recommendation, no recommendation model
(intermediate data structure) is generated, as recommendations are calculated
on the fly over the raw graph using traversal techniques. Traversals can
propagate over more complex relations and are not limited to "X likes[weight]
Y". Better yet, such basic relations can be derived from implicit relations
(i.e. paths) [ http://markorodriguez.com/2011/02/08/property-graph-algorithms/
]. Along this line of thought, the raw graph representation of your domain can
be used for more than just recommendation, e.g. path analysis, global
ranking, searching, reasoning, abstraction, etc. [
http://markorodriguez.com/2011/07/14/graphs-brains-and-gremlin/ ]
3. With various forms of graph sampling/weighting, it is possible to
put as many clock cycles (and thus as much compute time) as desired into the
determination of a recommendation. Generally, more clock cycles yield greater
accuracy. However, with accumulative methods, it is possible to reach an
ergodic state [ http://en.wikipedia.org/wiki/Ergodicity ] whereby additional
clock cycles do not yield more information (i.e. do not alter the order of
the resultant recommendation ranking).
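
As a rough illustration of the three points above, here is a minimal Python
sketch over an in-memory list of labeled edges. It is not Mahout, Neo4j, or
Gremlin code. The toy data and every function name (out, inn,
recommend_exact, recommend_from_friends, recommend_sampled, ranking) are
assumptions made up for this example. Note that the random-walk version
weights paths by inverse degree, so it approximates a degree-normalized
variant of the exact path count rather than the count itself.

```python
import random
from collections import Counter

# Hypothetical multi-relational toy graph: (source, label, target) triples.
EDGES = [
    ("alice", "likes", "matrix"),
    ("alice", "likes", "inception"),
    ("bob", "likes", "matrix"),
    ("bob", "likes", "memento"),
    ("carol", "likes", "inception"),
    ("carol", "likes", "memento"),
    ("carol", "likes", "alien"),
    ("alice", "knows", "carol"),
]


def out(vertex, label):
    """Targets of edges that leave `vertex` with the given label."""
    return [t for s, l, t in EDGES if s == vertex and l == label]


def inn(vertex, label):
    """Sources of edges that enter `vertex` with the given label."""
    return [s for s, l, t in EDGES if t == vertex and l == label]


def recommend_exact(user):
    """Count all paths user -likes-> item <-likes- peer -likes-> candidate."""
    liked = set(out(user, "likes"))
    scores = Counter()
    for item in liked:
        for peer in inn(item, "likes"):
            if peer == user:
                continue
            for candidate in out(peer, "likes"):
                if candidate not in liked:
                    scores[candidate] += 1
    return scores


def recommend_from_friends(user):
    """Use a second relation: user -knows-> friend -likes-> candidate."""
    liked = set(out(user, "likes"))
    return Counter(c for friend in out(user, "knows")
                   for c in out(friend, "likes") if c not in liked)


def recommend_sampled(user, walks, seed=0):
    """Approximate the co-like traversal with `walks` random walks."""
    rng = random.Random(seed)
    liked = out(user, "likes")
    liked_set = set(liked)
    scores = Counter()
    for _ in range(walks):
        item = rng.choice(liked)
        peers = [p for p in inn(item, "likes") if p != user]
        if not peers:
            continue  # dead end: nobody else likes this item
        candidate = rng.choice(out(rng.choice(peers), "likes"))
        if candidate not in liked_set:
            scores[candidate] += 1
    return scores


def ranking(scores):
    """Items ordered from highest to lowest score."""
    return [item for item, _ in scores.most_common()]


if __name__ == "__main__":
    print(ranking(recommend_exact("alice")))  # memento (2 paths), alien (1)
    print(ranking(recommend_from_friends("alice")))
    for walks in (10, 100, 1000):
        print(walks, ranking(recommend_sampled("alice", walks)))
```

No model is built ahead of time: each call walks the raw edges. Raising
`walks` spends more compute on the same question, and once the printed
ordering stops changing, further walks add no new ranking information.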
While your question was about overlaying Mahout on top of Neo4j, I would argue
that by using Neo4j in its native form (through its API and its approach to
data analysis), you can exploit much more from your domain model than
recommendation alone.
To conclude, I recently wrote up a post on graph-based recommendation that may
be of interest to you:
http://markorodriguez.com/2011/09/22/a-graph-based-movie-recommender-engine/
Good luck with your explorations,
Marko.
http://markorodriguez.com
_______________________________________________
Neo4j mailing list
https://lists.neo4j.org/mailman/listinfo/user