Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Mattias Persson
2011/11/18 serge s.fedoro...@gmail.com

 Will the following topics be addressed in a future release (and, if you
 know, when)?

 1/ Supernode

 I know there is a big downside in how super-nodes are handled, which can be
 a big issue in a Twitter-like website where, for example, a user is followed
 by more than 200k users (I have a real case in mind), or in a recommendation
 system that has sophisticated rules.

 I would like to know whether the super-node issue (as we call it) is planned
 to be investigated in future releases.

Just to clarify: super nodes aren't handled well in Neo4j at the moment.
Specifically, what's bad about how they are handled is that to get any
relationship from a node, all of its relationships have to be loaded into
cache once first, regardless of which type you requested. One solution is to
load only the relationships that you request, so that getting, say, 3
relationships of type A from a super node with millions of relationships of
types A, B, C and D would only need to load those 3 A relationships. If you
want to get all relationships from such a super node, they would all have to
be loaded anyway, right? And that brings me to another solution, preferably
in conjunction with the former: an optimized storage format for such super
nodes, so that relationships can be loaded serialized in bigger chunks from
disk.

When you think of super nodes (and a solution), which of these are you
thinking about first and foremost? Or something that I didn't even mention
here?
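
The two ideas above (loading only the requested type, and reading a super
node's relationships in bigger chunks) can be illustrated with a toy model.
This is only a sketch and not Neo4j code: the class ToySuperNode, its
methods, and the helper read_calls are hypothetical names made up for the
illustration, and the "records read" counters merely stand in for disk and
cache work.

```python
from collections import defaultdict


class ToySuperNode:
    """Toy model of a node's relationship storage (not Neo4j internals)."""

    def __init__(self, relationships):
        # One mixed chain of (rel_type, other_node_id) records.
        self.chain = list(relationships)
        # Hypothetical alternative layout: the same records grouped by type.
        self.by_type = defaultdict(list)
        for rel_type, other in self.chain:
            self.by_type[rel_type].append((rel_type, other))

    def get_via_full_load(self, rel_type, limit):
        """Read every record first, then filter by type in memory."""
        records_read = len(self.chain)
        hits = [rel for rel in self.chain if rel[0] == rel_type][:limit]
        return hits, records_read

    def get_via_typed_load(self, rel_type, limit):
        """Read only as many records of the requested type as needed."""
        hits = self.by_type.get(rel_type, [])[:limit]
        return hits, len(hits)


def read_calls(record_count, chunk_size):
    """Number of sequential reads needed to fetch record_count records."""
    return -(-record_count // chunk_size)  # ceiling division


# A node with 100,000 relationships spread evenly over types A, B, C and D.
node = ToySuperNode(("ABCD"[i % 4], i) for i in range(100_000))

full_hits, full_read = node.get_via_full_load("A", 3)
typed_hits, typed_read = node.get_via_typed_load("A", 3)
print(full_read, typed_read)    # 100000 3
print(full_hits == typed_hits)  # True
print(read_calls(100_000, 1), read_calls(100_000, 4096))  # 100000 25
```

Both lookups return the same three relationships; only the amount of data
touched differs. The last line shows why bigger chunks help when everything
has to be loaded anyway: the number of read calls drops with the chunk size.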


 2/ Sharding and horizontal scalability

 I guess sharding is a complex problem for a graph DB, but is there a plan
 to address the horizontal scalability goal? Even if that brings some
 inconsistency, it can be an acceptable situation (for example, there are
 many cases where a website can accept synchronization latency when it is
 under heavy load).

 Thanks



 --
 View this message in context:
 http://neo4j-community-discussions.438527.n3.nabble.com/Scalability-Roadmap-tp3519034p3519034.html
 Sent from the Neo4j Community Discussions mailing list archive at
 Nabble.com.
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user




-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Krzysztof Raczyński
On Sat, Nov 19, 2011 at 11:20 AM, Mattias Persson
matt...@neotechnology.com wrote:
 2011/11/18 serge s.fedoro...@gmail.com
 Specifically what's bad about how they are handled is that to get any
 relationship from a node they all have to be loaded once first into cache,
 regardless of which type you requested.

Does this also pertain to relationship direction? If, for example, I request
only inbound relationships, are outbound relationships loaded too?


Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Mattias Persson
2011/11/19 Krzysztof Raczyński racz...@gmail.com

 Does this also pertain to relationship direction? If, for example, I request
 only inbound relationships, are outbound relationships loaded too?


Yes, sorry, relationships get loaded per type AND direction. So if you've
got 1 incoming and 1 000 000 outgoing relationships of type A, you can load
only the incoming one if you request it.
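
As a rough illustration of "per type AND direction", here is a toy cache
with one bucket per (type, direction) pair. It is a sketch under
assumptions, not how Neo4j stores anything: ToyRelationshipCache and the
Direction enum below are hypothetical names defined only for this example.

```python
from enum import Enum


class Direction(Enum):
    INCOMING = "incoming"
    OUTGOING = "outgoing"


class ToyRelationshipCache:
    """Toy cache with one bucket per (type, direction); illustration only."""

    def __init__(self):
        self._buckets = {}
        self.records_loaded = 0

    def add(self, rel_type, direction, other_node_id):
        self._buckets.setdefault((rel_type, direction), []).append(other_node_id)

    def load(self, rel_type, direction):
        """Return only the requested bucket and count the records touched."""
        bucket = self._buckets.get((rel_type, direction), [])
        self.records_loaded += len(bucket)
        return list(bucket)


cache = ToyRelationshipCache()
cache.add("A", Direction.INCOMING, 0)     # the single incoming relationship
for child_id in range(1, 1_000_001):      # 1 000 000 outgoing relationships
    cache.add("A", Direction.OUTGOING, child_id)

parents = cache.load("A", Direction.INCOMING)
print(parents, cache.records_loaded)      # [0] 1
```

Requesting the incoming type-A relationship touches one record, even though
a million outgoing relationships of the same type sit on the same node.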



Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Krzysztof Raczyński
Great, since my schema is a tree (1 incoming, up to a hundred
outgoing), I was worried about that.


Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Mattias Persson
I don't know if a node with a hundred relationships can be considered a
super node, though. Loading a hundred relationships is pretty fast, and warm
nodes (with all their relationships in cache) are already fast at this. I'm
planning to maybe switch the internal store representation at a given
threshold, so that nodes with a relationship count less than, say, 50 or 100
keep the normal representation/function but switch to a format more
optimized for loading beyond that, although it's more expensive in disk size
and in the amount of memory you'd want to spend on memory mapping. The
threshold could be configurable, though, so that the switch happens earlier.
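
The threshold idea can be sketched as a container that starts with a plain
list and converts itself to a grouped layout once it grows past a
configurable size. This is a hypothetical in-memory illustration only
(AdaptiveRelationshipStore and dense_threshold are made-up names); the
actual change discussed here concerns the on-disk store format.

```python
class AdaptiveRelationshipStore:
    """Toy store that switches layout at a configurable threshold."""

    def __init__(self, dense_threshold=50):
        self.dense_threshold = dense_threshold
        self._flat = []        # normal layout: one mixed list
        self._grouped = None   # optimized layout: dict keyed by (type, direction)

    @property
    def is_dense(self):
        return self._grouped is not None

    def add(self, rel_type, direction, other_node_id):
        if self.is_dense:
            key = (rel_type, direction)
            self._grouped.setdefault(key, []).append(other_node_id)
            return
        self._flat.append((rel_type, direction, other_node_id))
        if len(self._flat) >= self.dense_threshold:
            self._convert()

    def _convert(self):
        """Regroup the flat list by (type, direction), done once."""
        grouped = {}
        for rel_type, direction, other_node_id in self._flat:
            grouped.setdefault((rel_type, direction), []).append(other_node_id)
        self._grouped = grouped
        self._flat = []

    def get(self, rel_type, direction):
        if self.is_dense:
            return list(self._grouped.get((rel_type, direction), []))
        return [other for t, d, other in self._flat
                if t == rel_type and d == direction]


store = AdaptiveRelationshipStore(dense_threshold=50)
for i in range(49):
    store.add("A", "outgoing", i)
print(store.is_dense)                      # False
store.add("A", "outgoing", 49)
print(store.is_dense)                      # True
print(len(store.get("A", "outgoing")))     # 50
```

Lowering dense_threshold makes the switch happen earlier, at the price of
paying for the larger grouped layout on more nodes.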



Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Pablo Pareja
Hi Serge,

Regarding supernodes I already opened an issue about this some time ago:

https://github.com/neo4j/community/issues/19

and as you can read there, at the end of the conversation Peter said: "we
will hopefully be on it for 1.6!"
I really hope they still plan to fix this for the 1.6 release. I'd
actually say that this is one of the most urgent points that should be
covered right now...

Cheers,

Pablo Pareja





-- 
Pablo Pareja Tobes

My site http://about.me/pablopareja
LinkedIn  http://www.linkedin.com/in/pabloparejatobes
Twitter   http://www.twitter.com/pablopareja

Creator of Bio4j -- http://www.bio4j.com

http://www.ohnosequences.com


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread serge
Thanks, that sounds great :)

Is there a release date for 1.6?



Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Jim Webber
 1/ Supernode

2012, around Q2.

 2/ Sharding and horizontal scalability

2013, around Q1.

These are guesses not promises :-)

Jim

PS - sharding graphs is NP-complete. In theory no general solution exists.


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Matt Luongo
Jim,

Not to nitpick, but that's for an ideal graph partitioning, not graph
sharding overall, right? E.g., the problem is solvable in many specific
domains?

- Matt


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Jim Webber
Hey Matt,

 Not to nitpick, but that's for an ideal graph partitioning, not graph
 sharding overall, right? E.g., the problem is solvable in many specific
 domains?

You're right - it's the general case. I was just making the point that sharding 
isn't something that's an afternoon's hacking to complete.
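
To make the general-case vs. specific-domain distinction concrete, here is
a small sketch on a made-up graph of two tightly connected groups joined by
one edge. All names (edge_cut, best_balanced_bisection, the example graph)
are hypothetical. The brute-force search finds the ideal balanced split but
has to try a number of candidates that grows explosively with graph size,
while a domain rule ("shard by group") gets the same result directly.

```python
from itertools import combinations
from math import comb


def edge_cut(edges, shard_of):
    """Count edges whose endpoints live on different shards."""
    return sum(1 for a, b in edges if shard_of[a] != shard_of[b])


def best_balanced_bisection(nodes, edges):
    """Try every equal-size 2-way split and keep the smallest edge cut."""
    nodes = list(nodes)
    best_cut, best_assignment = None, None
    for left in combinations(nodes, len(nodes) // 2):
        left = set(left)
        shard_of = {n: (0 if n in left else 1) for n in nodes}
        cut = edge_cut(edges, shard_of)
        if best_cut is None or cut < best_cut:
            best_cut, best_assignment = cut, shard_of
    return best_cut, best_assignment


# Two fully connected groups of four nodes, linked by a single edge (3, 4).
group_a, group_b = [0, 1, 2, 3], [4, 5, 6, 7]
edges = (list(combinations(group_a, 2))
         + list(combinations(group_b, 2))
         + [(3, 4)])
nodes = group_a + group_b

by_modulo = {n: n % 2 for n in nodes}                  # naive id-based sharding
by_group = {n: (0 if n in group_a else 1) for n in nodes}  # domain knowledge

print(edge_cut(edges, by_modulo))   # 9
print(edge_cut(edges, by_group))    # 1
best_cut, _ = best_balanced_bisection(nodes, edges)
print(best_cut)                     # 1
print(comb(8, 4), comb(40, 20))     # 70 137846528820
```

With 8 nodes the exhaustive search checks 70 candidate splits; with 40 nodes
it would already be over 137 billion, which is why the ideal partitioning
is impractical in general even though a good sharding exists for this
particular graph.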

Jim


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Rick Bullotta
...but I'm sure the community will come up with a wide range of sharding 
patterns, code, and best practices!
