One benefit of Neo4j is that you can get rid of these pesky
background jobs and instead calculate such things on the fly quite
fast, without needing to store the calculated info at all. Have you
tried it?
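
To make "on the fly" concrete, here is a minimal sketch in plain Ruby.
The scoring rule (overlap of interests), the helper names match_score
and top_matches, and the sample data are all assumptions for
illustration. The thread only says the result is a symmetric integer
between 0 and 100, and no Neo4j API is used here.

```ruby
require 'set'

# Hypothetical scoring rule (NOT from the thread): Jaccard overlap of two
# users' interests, scaled to an integer between 0 and 100. It is
# symmetric because set intersection and union are symmetric.
def match_score(interests_a, interests_b)
  a = interests_a.to_set
  b = interests_b.to_set
  union = a | b
  return 0 if union.empty?
  ((a & b).size * 100.0 / union.size).round
end

# Top n matches for one user, computed on demand instead of being stored.
# `interests` maps a user name to a list of interests (assumed shape).
def top_matches(user, others, interests, n)
  others.reject { |other| other == user }
        .map { |other| [other, match_score(interests[user], interests[other])] }
        .sort_by { |_, score| -score }
        .first(n)
end

# Made-up sample data.
interests = {
  'alice' => %w[jazz hiking chess],
  'bob'   => %w[jazz chess],
  'carol' => %w[opera]
}

p top_matches('alice', interests.keys, interests, 2)
```

Whether this is fast enough depends on how expensive the real scoring
is and how many candidates are scored per request; limiting the
candidates to a user's graph neighbourhood is one way to keep that
number small.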

2010/7/28, Alberto Perdomo <alberto.perd...@gmail.com>:
> Hi everyone,
>
> I would have an SQL db for the app besides the graph db.
>
> I have users that I would store as nodes within the graph, in
> addition to storing them in SQL. Within those nodes I store attributes
> like male/female, age or date of birth, etc.
> I would have one kind of relationship for friendship, which doesn't
> present any kind of problem and I would do the standard type of
> queries neo4jr-social provides (e.g. friend suggestions, degrees of
> separation, friends in common, ...)
>
> We want to measure the compatibility/taste match/whatever between
> users in the background, meaning for instance how much you have in
> common.
> This is done in Ruby. The result will be an integer between 0 and 100.
> BTW, this value is symmetric, meaning it could be modelled as a
> bidirectional relationship.
>
> Let's say I have 10k users and for every user I calculate the match
> between them and 10 other users.
> If I store all the results I calculate, I potentially create up to
> 100k relationships every day, or about 3m relationships every month.
> If I store this in SQL it can turn into a bottleneck very fast. The
> table will soon grow too big and the queries will get slower and
> slower.
>
> That's when I started thinking of storing those relationships in Neo4j
> because it's meant to handle a very large number of nodes and
> relationships really efficiently. I can model that as a relationship
> and either store the value inside the relationship or encode it in the
> relationship name, e.g. 'match_high', 'match_medium', 'match_low'.
>
> Now back to step 1: selecting the users I'll be calculating new
> relationships with. They must match certain criteria, e.g.
> female/male, similar age, etc., and the selection could be pseudo
> random.
> If you think in SQL, the first step is to query for all users that
> match the criteria and don't have a relationship with user A.
>
> And then yesterday, looking at the Neo4j docs, I thought this kind of
> query cannot be done. I could select all the users that match the
> criteria from SQL, then query all the relationships for A from Neo4j,
> subtract those from the array of valid users and pick n users at
> random. Because n is a low value, perhaps 10, this looks to me like a
> very inefficient way of doing this. Also, it will be fast at the
> beginning but it will get slower as the relationship density grows
> with time...
>
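
The select / subtract / pick steps described above could be sketched
in Ruby roughly as follows. The helper name pick_new_partners and the
sample ids are hypothetical; the two input lists stand in for the SQL
result and for the relationships read from Neo4j, neither of which is
actually queried here.

```ruby
require 'set'

# Hypothetical helper: picks up to n users that A has no stored match with.
#   candidate_ids - ids matching the criteria (assumed to come from SQL)
#   related_ids   - ids A already has a match relationship with
#                   (assumed to come from Neo4j)
def pick_new_partners(candidate_ids, related_ids, n, rng: Random.new)
  seen = related_ids.to_set # constant-time membership checks
  candidate_ids.reject { |id| seen.include?(id) }.sample(n, random: rng)
end

# Made-up sample data.
candidates      = (1..20).to_a
already_matched = [2, 4, 6, 8]

picked = pick_new_partners(candidates, already_matched, 10, rng: Random.new(42))
puts picked.inspect
```

The cost is one pass over the candidate list plus one pass over A's
existing relationships, so it grows with both, which is the slowdown
the paragraph above worries about.
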
> Maybe I should consider a different strategy. I've also been
> considering storing only high or interesting values, but it would be
> more interesting to have the top n users for A ordered by relationship
> value. If I go ahead with this then I could just go and store it
> within SQL.
>
> This is not what we strive for, but if I don't find a better way I
> guess we'll have to live with that. Also, the solution I find should
> be easily scalable. It should also work with, for instance, 100k
> users.
>
> Any thoughts or comments?
> What would you recommend?
>
> Thanks for your help, guys!
> Alberto.
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>


-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
