Hi, I have a proposal for a very different solution. First
On Wed, Jul 28, 2010 at 6:36 PM, David Montag <[email protected]> wrote:
> Hi Alberto,
>
> On Wed, Jul 28, 2010 at 5:02 PM, Alberto Perdomo
> <[email protected]> wrote:
>
> > Hi David,
> >
> > >
> > > But then you need to store the result. You can store these metrics as
> > > relationships in neo4j, and then just update them for each user when
> > > you recompute. You can find the user nodes via indexing. Maybe it's
> > > acceptable that some metrics are out of date, so you can just
> > > background process them continuously.
> >
> > I already have background processes that go through all users and
> > calculate new pairs. But then in order to do that I do need to
> > exclude the pairs I already have... because it would be silly and as
> > the relationship density grows the probability of calculating a pair
> > again would be higher and higher...
> > Would I be able to do that kind of query using indexing?
> >
>
> From your description it sounds like the factors that influence the metric
> don't change, so a single calculation per pair is enough. In this case, you
> could just determine the pairs in some way and then do the computation,
> storing the relationship in Neo4j. You can do it all in one go, nothing
> fancy. You would of course have to compute the metric to N peers for each
> new user.
>
> In other scenarios, the factors that influence the metric might change over
> time, e.g. a user's city or favorite movie. Then you actually need to keep
> recomputing the metric between existing users, and yes, then you probably
> want some scheme to make sure that you don't starve some users. You might
> for example want to prioritize the most active users first. Again, I don't
> know if this applies to your case though.
>
> As for the indexing, I'm not sure how you would use it here. Like, what kind
> of querying were you picturing?
>
> > >
> > > Depending on your scenario, if your users know each other, it might be
> > > interesting to start computing in a foaf style order (breadth first).
> > > Remember, the power is in the relationships. Isolated nodes are not
> > > interesting.
> >
> > You mean I look first for possible pairs with users that are friends
> > of friends instead of randomly? We are also interested in storing
> > friendship relationships so that sounds interesting.
> > That would be a different type of query: Traverse the graph from node
> > A to nodes which are friends of friends of A and have no match
> > relationship with A. I guess that is not difficult to implement using
> > Neo4j?
> >
>
> Exactly, so you might want to start with the most relevant other people,
> i.e. people you can realistically meet IRL via friends. Don't know if that's
> relevant to your application though.
>
> Neo4j would be a perfect fit for storing friendship relationships between
> users. It opens up all kinds of interesting data mining possibilities.
>
> The FOAF query would be easy to write using the Neo4j APIs, or some other
> tool such as Gremlin on top of Neo4j.
>
> So you could combine the friendship relationships with your processing step
> and prioritize active users, and start by checking people close to them in
> their social network. Again, if it's relevant. And, as Mattias suggested, if
> you can leverage friendship relationships between users, you might be able
> to calculate your metric on the fly, given that you limit the search to the
> user's extended social network. Of course, if you go deep enough, you might
> reach all users this way too.
>
> >
> > Thanks for your input David!
> >
>
> Glad to be of service. Ask as much as you like!
> We're all learning here :)
>
> > _______________________________________________
> > Neo4j mailing list
> > [email protected]
> > https://lists.neo4j.org/mailman/listinfo/user
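
For reference, the friend-of-friend query described in the quoted thread (start at node A, collect friends of friends of A, skip anyone who already has a match relationship with A) can be sketched in plain Python. This is only an illustration of the selection logic, not Neo4j or Gremlin code: the dictionaries `friends` and `matched` and the function name `foaf_candidates` are assumptions made up for this sketch, and excluding A itself from the result is also an assumption.

```python
def foaf_candidates(friends, matched, user):
    """Return friends of friends of `user` that have no match with `user`.

    friends: dict mapping a user to the set of that user's friends
             (hypothetical in-memory stand-in for friendship relationships)
    matched: dict mapping a user to the set of users already paired with them
             (hypothetical stand-in for stored match relationships)
    """
    already_matched = matched.get(user, set())
    candidates = set()
    # Two hops out from `user`: first to each friend, then to that friend's friends.
    for friend in friends.get(user, set()):
        for fof in friends.get(friend, set()):
            if fof == user:
                continue  # assumption: never pair a user with themselves
            if fof in already_matched:
                continue  # pair was computed before, so skip it
            candidates.add(fof)
    return candidates
```

Calling `foaf_candidates(friends, matched, "A")` returns only the two-hop neighbours of A that still need a metric computed, which is the set a background job would process first under the prioritization idea discussed above.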

