Hi Rick,

On Wed, Jul 28, 2010 at 4:39 AM, <[email protected]> wrote:
> Sounds like a pretty easy SQL query, though. ;-)

We are using SQL right now. We have the users table and a table called
user_matches with the following columns (simplified version):

  user_1_id
  user_2_id
  result

The first two columns are foreign keys to the users' IDs, and the third
column is the result of the match, i.e. the score. The score is
bidirectional: the score between users A and B is the same as between B
and A. To keep the table from growing very big very fast, we store only
one row per match. This means that when querying we need to look in both
columns, user_1_id and user_2_id.

Now, when we want to calculate some new matches, we need to get some
pseudorandom users that are not already a match. That means a SELECT
from users that excludes all users that already have a match with the
user in question. This query is quite complex and gets slower and slower
as the user_matches table grows. That's why we thought of using a node
database.

> Actually the "random sampling" aspect definitely throws a complication
> into the requirements. I can't even picture how to achieve that in Neo
> without first obtain some (large) set of nodes and using a randomizer
> to select from the set/array. Iterating on getAllNodes and stopping
> after "n" matches wouldn't meet the "random" requirement.

I've also thought of getting a pseudorandom array of users from the SQL
DB and then querying whether a relationship exists within Neo, but this
will become less and less effective as the relationship density in the
graph grows.

Cheers,
Alberto.

_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user
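
For readers following the thread, the exclusion query described in the message could look roughly like the sketch below. It is only an illustration under stated assumptions, not the query actually in use: it runs against an in-memory SQLite database, assumes the users table has an integer `id` primary key, and the helper name `candidate_users` and both indexes are hypothetical. Only the three user_matches columns come from the message.

```python
import sqlite3


def candidate_users(conn, user_id, limit):
    """Return up to `limit` pseudorandom user ids with no stored match against user_id.

    Hypothetical helper: the name and signature are illustrative only.
    """
    rows = conn.execute(
        """
        SELECT u.id
        FROM users AS u
        WHERE u.id <> :uid
          AND NOT EXISTS (
              -- One row per match, so the pair may be stored in either order.
              SELECT 1
              FROM user_matches AS m
              WHERE (m.user_1_id = :uid AND m.user_2_id = u.id)
                 OR (m.user_2_id = :uid AND m.user_1_id = u.id)
          )
        ORDER BY RANDOM()
        LIMIT :n
        """,
        {"uid": user_id, "n": limit},
    ).fetchall()
    return [row[0] for row in rows]


# Assumed schema: only user_1_id, user_2_id and result are from the message.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE user_matches (
        user_1_id INTEGER NOT NULL REFERENCES users(id),
        user_2_id INTEGER NOT NULL REFERENCES users(id),
        result    REAL
    );
    -- Assumed indexes, one per lookup direction.
    CREATE INDEX idx_matches_u1 ON user_matches (user_1_id, user_2_id);
    CREATE INDEX idx_matches_u2 ON user_matches (user_2_id, user_1_id);
    """
)
conn.executemany("INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 7)])
conn.executemany(
    "INSERT INTO user_matches VALUES (?, ?, ?)",
    [(1, 2, 0.8), (3, 1, 0.4), (4, 5, 0.9)],
)

# User 1 already has matches with users 2 and 3, so only 4, 5 and 6 qualify.
picked = candidate_users(conn, 1, 10)
```

Note that `ORDER BY RANDOM()` still has to evaluate every remaining candidate before picking, so this sketch shows the shape of the query rather than a fix for the slowdown described above.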

