Re: [Neo4j] Scalability Roadmap
I don't know if a node with a hundred relationships can be considered a super node, though. Loading a hundred relationships is pretty fast, and warm nodes (those with all their relationships in cache) are already fast at this size.

I'm planning to maybe switch the internal store representation at a given threshold, so that nodes with a relationship count below, say, 50 or 100 keep the normal representation and behavior, while nodes beyond that switch to a format more optimized for loading. That format is more expensive in disk size and in the amount of memory you'd want to spend on memory mapping. The threshold could be configurable, though, so that "the switch" happens earlier.

2011/11/19 Krzysztof Raczyński
> Great, since my schema is a tree (1 incoming, up to a hundred outgoing),
> I was worried about that.

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
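The threshold idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Neo4j's store format: the class `NodeRelationships`, the constant `DENSE_THRESHOLD`, and the grouping by (type, direction) are all hypothetical names and choices made for the example.

```python
from collections import defaultdict

DENSE_THRESHOLD = 50  # assumed cut-over point; the message suggests "say 50 or 100"


class NodeRelationships:
    """Hypothetical per-node relationship container (not Neo4j's actual store)."""

    def __init__(self, threshold=DENSE_THRESHOLD):
        self.threshold = threshold
        self.flat = []        # normal representation: one flat list of records
        self.grouped = None   # dense representation: (type, direction) -> records

    @property
    def is_dense(self):
        return self.grouped is not None

    def add(self, rel_type, direction, other_node):
        record = (rel_type, direction, other_node)
        if self.is_dense:
            self.grouped[(rel_type, direction)].append(record)
            return
        self.flat.append(record)
        if len(self.flat) > self.threshold:
            self._switch_to_dense()

    def _switch_to_dense(self):
        # One-time conversion once the node grows past the threshold.
        self.grouped = defaultdict(list)
        for record in self.flat:
            self.grouped[(record[0], record[1])].append(record)
        self.flat = []

    def get(self, rel_type, direction):
        if self.is_dense:
            # Dense nodes answer from the matching group only.
            return list(self.grouped.get((rel_type, direction), []))
        # Small nodes scan the flat list, which is cheap at this size.
        return [r for r in self.flat if r[0] == rel_type and r[1] == direction]
```

Passing a lower `threshold` to the constructor corresponds to making "the switch" happen earlier.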
Re: [Neo4j] Scalability Roadmap
Great, since my schema is a tree (1 incoming, up to a hundred outgoing), I was worried about that.
Re: [Neo4j] Scalability Roadmap
2011/11/19 Krzysztof Raczyński
> On Sat, Nov 19, 2011 at 11:20 AM, Mattias Persson wrote:
> > 2011/11/18 serge
> > Specifically what's bad about how they are handled is that to get any
> > relationship from a node they all have to be loaded once first into cache,
> > regardless of which type you requested.
>
> Does this also pertain to relationship direction? If, for example, I request
> only inbound relationships, are outbound relationships loaded too?

Yes, sorry, relationships get loaded per type AND direction. So if you've got 1 incoming and 1 000 000 outgoing relationships of type A, you can load only the incoming one if you request it.

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
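As a rough illustration of loading per type and direction, here is a small sketch. It assumes a hypothetical in-memory `GroupedRelationshipStore` keyed by (type, direction); it is not Neo4j code, and the `records_read` counter exists only to show how much a request touches.

```python
from collections import defaultdict


class GroupedRelationshipStore:
    """Hypothetical store that keeps one relationship group per (type, direction)."""

    def __init__(self):
        self.groups = defaultdict(list)
        self.records_read = 0  # counts how many records the requests have touched

    def add(self, rel_type, direction, other_node):
        self.groups[(rel_type, direction)].append(other_node)

    def load(self, rel_type, direction):
        # Only the requested group is read; other groups stay untouched.
        group = self.groups.get((rel_type, direction), [])
        self.records_read += len(group)
        return list(group)


store = GroupedRelationshipStore()
store.add("A", "INCOMING", "parent")
for i in range(1000):  # stands in for the 1 000 000 outgoing relationships
    store.add("A", "OUTGOING", f"child-{i}")

incoming = store.load("A", "INCOMING")
print(incoming, store.records_read)
```

Requesting the single incoming relationship reads one record, however many outgoing relationships of the same type the node has.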
Re: [Neo4j] Scalability Roadmap
On Sat, Nov 19, 2011 at 11:20 AM, Mattias Persson wrote:
> 2011/11/18 serge
> Specifically what's bad about how they are handled is that to get any
> relationship from a node they all have to be loaded once first into cache,
> regardless of which type you requested.

Does this also pertain to relationship direction? If, for example, I request only inbound relationships, are outbound relationships loaded too?
Re: [Neo4j] Scalability Roadmap
2011/11/18 serge
> Will the following topics be addressed in a future release (and when, if
> you know)?
>
> 1/ Supernode
>
> I know there is a big downside in the handling of super-nodes, which can be
> a big issue in a Twitter-like website with, for example, a user followed by
> more than 200k users (I have a real case in mind), or in a recommendation
> system with sophisticated rules.
>
> I would like to know if the "super-node issue" (as we call it) is planned
> to be investigated in future releases?

Just to clarify: super nodes aren't handled well in Neo4j at the moment. Specifically, what's bad about how they are handled is that to get any relationship from a node, all of its relationships have to be loaded into cache once first, regardless of which type you requested.

One solution is to load only the relationships that you request, so that getting, say, 3 relationships of type A from a super node with millions of relationships of types A, B, C and D would only need to load those 3 A relationships.

If you would like to get all relationships from such a super node, they would all have to be loaded anyway, right? That brings me to another solution, preferably in conjunction with the former: an optimized storage format for such super nodes, so that relationships can be loaded serialized in bigger chunks from disk.

When you think of super nodes (and a solution), which of these are you thinking about first and foremost? Or even something that I didn't mention here?

> 2/ Sharding and horizontal scalability
>
> I guess sharding is a complex problem to handle with a graph DB, but is it
> planned to address the horizontal scalability goal, even if that leads to a
> somewhat "inconsistent" but acceptable situation? (For example, there are
> many cases where a website can accept synchronization latency when it is
> under heavy load.)
>
> Thanks

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
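The difference between the behavior described above and the two proposed solutions can be sketched as follows. This is a toy model under stated assumptions, not Neo4j internals: `RECORDS`, the three function names, and the chunk size of 512 are hypothetical and chosen only for illustration.

```python
from itertools import islice

# Hypothetical on-disk relationship records of a super node: (type, other_node).
RECORDS = [(t, i) for i in range(1000) for t in ("A", "B", "C", "D")]


def load_all_then_filter(records, rel_type, limit):
    """Behavior described as the problem: every record is read into cache first."""
    cache = list(records)  # reads all records, whatever type was requested
    wanted = [r for r in cache if r[0] == rel_type][:limit]
    return wanted, len(cache)


def load_requested_only(records_by_type, rel_type, limit):
    """First proposed solution: read only the requested type, only as many as asked."""
    wanted = list(islice(records_by_type[rel_type], limit))
    return wanted, len(wanted)


def load_in_chunks(records, chunk_size):
    """Second proposed solution: read relationships sequentially in bigger chunks."""
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


# Group the records by type once, as a per-type storage layout would.
by_type = {}
for record in RECORDS:
    by_type.setdefault(record[0], []).append(record)

_, read_all = load_all_then_filter(RECORDS, "A", 3)
_, read_some = load_requested_only(by_type, "A", 3)
chunks = list(load_in_chunks(RECORDS, 512))
print(read_all, read_some, len(chunks))
```

In this model, asking for 3 relationships of type A touches every record in the first function and only 3 records in the second. The chunked reader covers the case where all relationships are wanted anyway.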
Re: [Neo4j] Scalability Roadmap
...but I'm sure the community will come up with a wide range of sharding patterns, code, and best practices!

On Nov 18, 2011, at 5:46 PM, "Jim Webber" wrote:
> Hey Matt,
>
>> Not to nitpick, but that's for an ideal graph partitioning, not graph
>> sharding overall, right? E.g. the problem is solvable in many specific
>> domains?
>
> You're right - it's the general case. I was just making the point that
> sharding isn't something that's an afternoon's hacking to complete.
>
> Jim
Re: [Neo4j] Scalability Roadmap
Hey Matt,

> Not to nitpick, but that's for an ideal graph partitioning, not graph
> sharding overall, right? E.g. the problem is solvable in many specific
> domains?

You're right - it's the general case. I was just making the point that sharding isn't something that's an afternoon's hacking to complete.

Jim
Re: [Neo4j] Scalability Roadmap
Jim,

Not to nitpick, but that's for an ideal graph partitioning, not graph sharding overall, right? E.g. the problem is solvable in many specific domains?

- Matt

On Nov 18, 2011 1:27 PM, "Jim Webber" wrote:
> > 1/ Supernode
>
> 2012, around Q2.
>
> > 2/ Sharding and horizontal scalability
>
> 2013, around Q1.
>
> These are guesses not promises :-)
>
> Jim
>
> PS - sharding graphs is NP complete. In theory no general solution exists.
Re: [Neo4j] Scalability Roadmap
> 1/ Supernode

2012, around Q2.

> 2/ Sharding and horizontal scalability

2013, around Q1.

These are guesses, not promises :-)

Jim

PS - sharding graphs is NP-complete. In theory no general solution exists.
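To make the PS concrete: finding a balanced partition that minimizes the number of edges crossing between shards is NP-hard in general, so exact answers are only practical for very small graphs. The sketch below is a brute-force illustration, not a sharding strategy anyone in this thread proposed. The function names and the example graph are hypothetical.

```python
from itertools import combinations


def edge_cut(edges, shard_a):
    """Number of edges whose endpoints land on different shards."""
    return sum((u in shard_a) != (v in shard_a) for u, v in edges)


def best_balanced_split(nodes, edges):
    """Exhaustively try every split into two equal halves; feasible only for tiny graphs."""
    nodes = sorted(nodes)
    half = len(nodes) // 2
    best = None
    for candidate in combinations(nodes, half):
        shard_a = set(candidate)
        cut = edge_cut(edges, shard_a)
        if best is None or cut < best[0]:
            best = (cut, shard_a)
    return best


# Two triangles joined by a single edge (illustrative data).
EDGES = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
cut, shard_a = best_balanced_split(range(1, 7), EDGES)
print(cut, sorted(shard_a))
```

The number of candidate splits grows combinatorially with the node count, which is why practical approaches tend to rely on heuristics or on domain-specific structure, as Matt's follow-up suggests.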
Re: [Neo4j] Scalability Roadmap
Thanks, it sounds great :) Is there a release date for 1.6?

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Scalability-Roadmap-tp3519034p3519137.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
Re: [Neo4j] Scalability Roadmap
Hi Serge,

Regarding supernodes, I already opened an issue about this some time ago:

https://github.com/neo4j/community/issues/19

As you can read there, at the end of the conversation Peter said:

"we will hopefully be on it for 1.6 !"

I really hope they keep planning to fix this for the 1.6 release. I'd actually say that this is one of the most urgent points that should be covered right now...

Cheers,

Pablo Pareja

On Fri, Nov 18, 2011 at 5:38 PM, serge wrote:
> Will the following topics be addressed in a future release (and when, if
> you know)?
>
> 1/ Supernode
>
> I know there is a big downside in the handling of super-nodes, which can be
> a big issue in a Twitter-like website with, for example, a user followed by
> more than 200k users (I have a real case in mind), or in a recommendation
> system with sophisticated rules.
>
> I would like to know if the "super-node issue" (as we call it) is planned
> to be investigated in future releases?
>
> 2/ Sharding and horizontal scalability
>
> I guess sharding is a complex problem to handle with a graph DB, but is it
> planned to address the horizontal scalability goal, even if that leads to a
> somewhat "inconsistent" but acceptable situation? (For example, there are
> many cases where a website can accept synchronization latency when it is
> under heavy load.)
>
> Thanks

--
Pablo Pareja Tobes
My site http://about.me/pablopareja
LinkedIn http://www.linkedin.com/in/pabloparejatobes
Twitter http://www.twitter.com/pablopareja
Creator of Bio4j --> http://www.bio4j.com
http://www.ohnosequences.com
[Neo4j] Scalability Roadmap
Will the following topics be addressed in a future release (and when, if you know)?

1/ Supernode

I know there is a big downside in the handling of super-nodes, which can be a big issue in a Twitter-like website with, for example, a user followed by more than 200k users (I have a real case in mind), or in a recommendation system with sophisticated rules.

I would like to know if the "super-node issue" (as we call it) is planned to be investigated in future releases?

2/ Sharding and horizontal scalability

I guess sharding is a complex problem to handle with a graph DB, but is it planned to address the horizontal scalability goal, even if that leads to a somewhat "inconsistent" but acceptable situation? (For example, there are many cases where a website can accept synchronization latency when it is under heavy load.)

Thanks

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Scalability-Roadmap-tp3519034p3519034.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.