Hi Rick,

On Wed, Jul 28, 2010 at 4:39 AM,  <[email protected]> wrote:
>   Sounds like a pretty easy SQL query, though. ;-)

We are using SQL right now. We have the users table and a table called
user_matches with the following columns (simplified version):

user_1_id
user_2_id
result

The first two columns are foreign keys to the users table's IDs, and
the third column is the result of the match, i.e. the score. It's
bidirectional: the score between users A and B is the same as between
B and A.

To keep the table from growing too quickly, we store only one row per
match. This means that every query has to look in both columns,
user_1_id and user_2_id.
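
To make the two-column lookup concrete, here is a minimal sketch using
Python's sqlite3 module. The column names are the ones listed above;
the in-memory database, the sample rows and the function name
matched_user_ids are only assumptions for illustration, not our real
setup.

```python
import sqlite3


def matched_user_ids(conn, user_id):
    """Return the IDs of all users that already have a match with user_id.

    Each match is stored once, so the user may appear in either column.
    """
    rows = conn.execute(
        """
        SELECT user_2_id FROM user_matches WHERE user_1_id = ?
        UNION
        SELECT user_1_id FROM user_matches WHERE user_2_id = ?
        """,
        (user_id, user_id),
    ).fetchall()
    return {row[0] for row in rows}


# Hypothetical sample data, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE user_matches (
        user_1_id INTEGER NOT NULL REFERENCES users(id),
        user_2_id INTEGER NOT NULL REFERENCES users(id),
        result    REAL
    );
    """
)
conn.executemany("INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 6)])
conn.executemany(
    "INSERT INTO user_matches VALUES (?, ?, ?)",
    [(1, 2, 0.8), (3, 1, 0.4), (2, 4, 0.1)],
)

print(matched_user_ids(conn, 1))  # {2, 3}
```

Splitting the lookup into two SELECTs joined by UNION lets each half
use its own index (one on user_1_id, one on user_2_id), which a single
OR condition does not always manage.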

Now, when we want to calculate some new matches, we need to get some
pseudorandom users that are not already matched with the user in
question. In other words, it's a SELECT from users that excludes all
users who already have a match with that user.
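
Roughly, the kind of query I mean looks like the sketch below. It is
not our production query: the sample data and the function name
random_unmatched_users are made up, and ORDER BY RANDOM() is the
SQLite/PostgreSQL spelling (MySQL uses RAND() instead).

```python
import sqlite3

# Pick up to :n users that have no match row with :uid in either direction.
QUERY = """
SELECT u.id
FROM users AS u
WHERE u.id <> :uid
  AND NOT EXISTS (
      SELECT 1
      FROM user_matches AS m
      WHERE (m.user_1_id = :uid AND m.user_2_id = u.id)
         OR (m.user_2_id = :uid AND m.user_1_id = u.id)
  )
ORDER BY RANDOM()
LIMIT :n
"""


def random_unmatched_users(conn, user_id, n):
    """Return up to n pseudorandom user IDs not yet matched with user_id."""
    return [row[0] for row in conn.execute(QUERY, {"uid": user_id, "n": n})]


# Hypothetical sample data, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE user_matches (
        user_1_id INTEGER NOT NULL REFERENCES users(id),
        user_2_id INTEGER NOT NULL REFERENCES users(id),
        result    REAL
    );
    """
)
conn.executemany("INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 6)])
conn.executemany(
    "INSERT INTO user_matches VALUES (?, ?, ?)",
    [(1, 2, 0.8), (3, 1, 0.4), (2, 4, 0.1)],
)

print(random_unmatched_users(conn, 1, 2))  # users 4 and 5, in random order
```

The database has to evaluate the exclusion for every candidate user
before it can shuffle and cut the result, so the work grows with both
tables.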

This query is quite complex and gets slower and slower as the
user_matches table grows. That's why we thought of using a node
database.

>   Actually the "random sampling" aspect definitely throws a complication
>   into the requirements.  I can't even picture how to achieve that in Neo
>   without first obtain some (large) set of nodes and using a randomizer
>   to select from the set/array.  Iterating on getAllNodes and stopping
>   after "n" matches wouldn't meet the "random" requirement.

I've also thought of getting a pseudorandom array of users from the
SQL DB and then querying whether a relationship exists within Neo, but
this will become less and less effective as the relationship density
grows in the graph.
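
A rough sketch of that sample-then-filter idea, in plain Python so it
stays independent of any particular Neo API. Everything here is
hypothetical: has_match stands in for the relationship lookup in Neo,
and all_user_ids stands in for the random batch fetched from the SQL
DB.

```python
import random


def sample_unmatched(user_id, all_user_ids, has_match, n,
                     batch_size=50, max_batches=20, rng=random):
    """Draw random candidates and keep those not yet matched with user_id.

    Returns (found, lookups), where lookups counts the calls to has_match.
    Gives up after max_batches batches, so fewer than n users may come back.
    """
    found, seen, lookups = [], {user_id}, 0
    for _ in range(max_batches):
        batch = rng.sample(all_user_ids, min(batch_size, len(all_user_ids)))
        for candidate in batch:
            if candidate in seen:
                continue  # never test the same candidate twice
            seen.add(candidate)
            lookups += 1
            if not has_match(user_id, candidate):
                found.append(candidate)
                if len(found) == n:
                    return found, lookups
    return found, lookups


# Hypothetical data: user 1 is already matched with users 2..7 out of 1..10.
users = list(range(1, 11))
matches = {frozenset((1, other)) for other in range(2, 8)}


def has_match(a, b):
    # One entry per pair; frozenset makes the lookup direction-independent.
    return frozenset((a, b)) in matches


found, lookups = sample_unmatched(1, users, has_match, n=3, batch_size=10)
print(sorted(found), lookups)
```

The denser the graph around a user, the more lookups are wasted on
candidates that turn out to be matched already, which is the drawback
mentioned above.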

Cheers,
Alberto.
_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user
