Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Mattias Persson
I don't know if a node with a hundred relationships can be considered a
super node though. Loading a hundred relationships is pretty fast, and warm
nodes (with all their relationships in cache) are already fast on these
things. I'm planning to maybe switch the internal store representation at a
given threshold, so that nodes with a relationship count less than, say, 50
or 100 have the normal representation/function, but nodes beyond that will
switch to a format more optimized for loading, although it's more expensive
in disk size and in the amount of memory you'd want to spend on memory
mapping. It could be configurable though, so that "the switch" happens
earlier.
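
As a rough illustration of the threshold idea above, here is a minimal in-memory sketch. It is not Neo4j's store format: the class `ToyNode`, the constant `DENSE_THRESHOLD`, and the two layouts are assumptions made up for this example. Below the threshold the node keeps one flat list; at the threshold it regroups its relationships by type and direction so that a lookup touches only the requested group.

```python
from collections import defaultdict

# Hypothetical value; the post only says "say 50 or 100" and that it could be configurable.
DENSE_THRESHOLD = 50


class ToyNode:
    """Toy node that changes its in-memory layout once it has enough relationships."""

    def __init__(self, threshold=DENSE_THRESHOLD):
        self.threshold = threshold
        self.flat = []        # normal layout: list of (rel_type, direction, other_id)
        self.grouped = None   # dense layout: {(rel_type, direction): [other_id, ...]}

    @property
    def is_dense(self):
        return self.grouped is not None

    def add(self, rel_type, direction, other_id):
        if self.is_dense:
            self.grouped[(rel_type, direction)].append(other_id)
            return
        self.flat.append((rel_type, direction, other_id))
        # "The switch": regroup once the count reaches the threshold.
        if len(self.flat) >= self.threshold:
            self._switch_to_dense()

    def _switch_to_dense(self):
        grouped = defaultdict(list)
        for rel_type, direction, other_id in self.flat:
            grouped[(rel_type, direction)].append(other_id)
        self.grouped = grouped
        self.flat = []

    def relationships(self, rel_type, direction):
        if self.is_dense:
            # Only the requested group is touched.
            return list(self.grouped.get((rel_type, direction), []))
        # Small node: scanning the whole flat list is cheap.
        return [o for t, d, o in self.flat if t == rel_type and d == direction]
```

The trade-off mentioned in the post shows up even in this toy: the grouped layout needs an extra index structure per node, which is only worth paying for once the flat scan becomes expensive.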

2011/11/19 Krzysztof Raczyński 

> Great, since my schema is a tree (1 incoming, up to a hundred
> outgoing) I was worried about that.
> ___
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Krzysztof Raczyński
Great, since my schema is a tree (1 incoming, up to a hundred
outgoing) I was worried about that.


Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Mattias Persson
2011/11/19 Krzysztof Raczyński 

> On Sat, Nov 19, 2011 at 11:20 AM, Mattias Persson
>  wrote:
> > 2011/11/18 serge 
> > Specifically what's bad about how they are handled is that to get any
> > relationship from a node they all have to be loaded once first into
> > cache, regardless of which type you requested.
>
> Does this also pertain to relationship direction? If, e.g., I request
> only inbound relationships, are outbound relationships loaded too?
>

Yes, sorry, relationships get loaded per type AND direction. So if you've
got 1 incoming and 1 000 000 outgoing relationships of type A you can load
only the incoming one if you request it.
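
To make the "per type AND direction" point concrete, here is a small sketch of a store that keeps a separate chain of relationship records for each (node, type, direction) combination and counts how many records a load touches. The class `ToyRelationshipStore`, its methods, and the `"IN"`/`"OUT"` strings are hypothetical names for this illustration, not Neo4j's API or on-disk layout.

```python
from collections import defaultdict


class ToyRelationshipStore:
    """Toy store with one chain of records per (node, type, direction)."""

    def __init__(self):
        self.chains = defaultdict(list)
        self.records_read = 0  # how many relationship records loads have touched

    def add(self, node_id, rel_type, direction, other_id):
        self.chains[(node_id, rel_type, direction)].append(other_id)

    def load(self, node_id, rel_type, direction):
        # Only the chain for the requested type and direction is read.
        chain = self.chains.get((node_id, rel_type, direction), [])
        self.records_read += len(chain)
        return list(chain)


# A tree-like node: 1 incoming and many outgoing relationships of type A.
store = ToyRelationshipStore()
store.add(1, "A", "IN", 0)
for other in range(2, 1002):
    store.add(1, "A", "OUT", other)

incoming = store.load(1, "A", "IN")
print(len(incoming), store.records_read)  # prints "1 1"
```

Under this assumed layout, asking for the single incoming relationship reads one record no matter how many outgoing relationships the node has.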



Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Krzysztof Raczyński
On Sat, Nov 19, 2011 at 11:20 AM, Mattias Persson
 wrote:
> 2011/11/18 serge 
> Specifically what's bad about how they are handled is that to get any
> relationship from a node they all have to be loaded once first into cache,
> regardless of which type you requested.

Does this also pertain to relationship direction? If, e.g., I request
only inbound relationships, are outbound relationships loaded too?


Re: [Neo4j] Scalability Roadmap

2011-11-19 Thread Mattias Persson
2011/11/18 serge 

> Will the following topics be addressed in a future release (and when, if
> you know)?
>
> 1/ Supernode
>
> I know there is a big downside in the handling of super-nodes, which can
> be a big issue in a Twitter-like website with, for example, a user followed
> by more than 200k users (I have a real case in mind) or in a recommendation
> system which has sophisticated rules.
>
> I would like to know if the "super-node issue" (as we call it) is planned
> to be investigated in future releases?
>
Just to clarify: super nodes aren't handled well in Neo4j a.t.m.
Specifically, what's bad about how they are handled is that to get any
relationship from a node they all have to be loaded once first into cache,
regardless of which type you requested. One solution is to be able to only
load relationships that you request, so that getting, say, 3 relationships of
type A from a super node with millions of relationships of types A, B, C and
D would only need to load those 3 A relationships. If you would like to
get all relationships from such a super node they would all have to be
loaded anyway, right? And that brings me to another solution, preferably in
conjunction with the former: to have an optimized storage of such super
nodes so that relationships can be loaded serialized in bigger chunks from
disk.

When you think of super nodes (and a solution) which of these are you
thinking about first and foremost? Or something that I didn't mention here
even?
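
The second idea, loading relationships in bigger chunks, can be pictured with a little arithmetic. The sketch below only counts read operations in an idealized model where each read returns a fixed number of consecutive records; the function names and the chunk size of 1000 are assumptions for illustration and say nothing about how Neo4j actually lays out its store.

```python
def read_operations(relationship_count, records_per_read):
    """Reads needed to load every relationship when each read returns a fixed batch."""
    if records_per_read < 1:
        raise ValueError("records_per_read must be at least 1")
    # Integer ceiling division.
    return (relationship_count + records_per_read - 1) // records_per_read


def load_in_chunks(records, chunk_size):
    """Yield consecutive slices of `records`, each at most `chunk_size` long."""
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


# The 200k-follower node from the original question.
followers = 200_000
print(read_operations(followers, 1))     # prints 200000
print(read_operations(followers, 1000))  # prints 200
```

In this model the amount of data is the same either way; what shrinks is the number of separate reads, which is where a chunked layout would be expected to help when all relationships of a super node have to be loaded anyway.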


> 2/ Sharding and horizontal scalability
>
> I guess sharding is a complex problem to handle with a graph db, but is it
> planned to address the horizontal scalability goal? And that even if it
> brings us towards a kind of "inconsistency", but an acceptable one
> (for example, there are many cases where a website can accept
> synchronization latency when it is under heavy load).
>
> Thanks
>
>
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Scalability-Roadmap-tp3519034p3519034.html
> Sent from the Neo4j Community Discussions mailing list archive at
> Nabble.com.


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Rick Bullotta
...but I'm sure the community will come up with a wide range of sharding 
patterns, code, and best practices!

On Nov 18, 2011, at 5:46 PM, "Jim Webber"  wrote:

> Hey Matt,
> 
>> Not to nitpick, but that's for an ideal graph partitioning, not graph
>> sharding overall, right? Eg the problem is solvable in many specific
>> domains?
> 
> You're right - it's the general case. I was just making the point that 
> sharding isn't something that's an afternoon's hacking to complete.
> 
> Jim


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Jim Webber
Hey Matt,

> Not to nitpick, but that's for an ideal graph partitioning, not graph
> sharding overall, right? Eg the problem is solvable in many specific
> domains?

You're right - it's the general case. I was just making the point that sharding 
isn't something that's an afternoon's hacking to complete.

Jim


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Matt Luongo
Jim,

Not to nitpick, but that's for an ideal graph partitioning, not graph
sharding overall, right? Eg the problem is solvable in many specific
domains?

- Matt

On Nov 18, 2011 1:27 PM, "Jim Webber"  wrote:

> > 1/ Supernode
>
> 2012, around Q2.
>
> > 2/ Sharding and horizontal scalability
>
> 2013, around Q1.
>
> These are guesses not promises :-)
>
> Jim
>
> PS - sharding graphs is NP complete. In theory no general solution exists.


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Jim Webber
> 1/ Supernode

2012, around Q2.

> 2/ Sharding and horizontal scalability

2013, around Q1.

These are guesses not promises :-)

Jim

PS - sharding graphs is NP complete. In theory no general solution exists.


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread serge
Thanks, it sounds great :)

Is there a release date for 1.6?

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Scalability-Roadmap-tp3519034p3519137.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.


Re: [Neo4j] Scalability Roadmap

2011-11-18 Thread Pablo Pareja
Hi Serge,

Regarding supernodes, I already opened an issue about this some time ago:

https://github.com/neo4j/community/issues/19

and as you can read there, at the end of the conversation Peter said: "we
will hopefully be on it for 1.6!"
I really hope they keep planning to fix this for the 1.6 release. I'd
actually say that this is one of the most urgent points that should be
covered right now...

Cheers,

Pablo Pareja

On Fri, Nov 18, 2011 at 5:38 PM, serge  wrote:

> Will the following topics be addressed in a future release (and when, if
> you know)?
>
> 1/ Supernode
>
> I know there is a big downside in the handling of super-nodes, which can
> be a big issue in a Twitter-like website with, for example, a user followed
> by more than 200k users (I have a real case in mind) or in a recommendation
> system which has sophisticated rules.
>
> I would like to know if the "super-node issue" (as we call it) is planned
> to be investigated in future releases?
>
> 2/ Sharding and horizontal scalability
>
> I guess sharding is a complex problem to handle with a graph db, but is it
> planned to address the horizontal scalability goal? And that even if it
> brings us towards a kind of "inconsistency", but an acceptable one
> (for example, there are many cases where a website can accept
> synchronization latency when it is under heavy load).
>
> Thanks
>
>
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Scalability-Roadmap-tp3519034p3519034.html
> Sent from the Neo4j Community Discussions mailing list archive at
> Nabble.com.



-- 
Pablo Pareja Tobes

My site   http://about.me/pablopareja
LinkedIn  http://www.linkedin.com/in/pabloparejatobes
Twitter   http://www.twitter.com/pablopareja

Creator of Bio4j --> http://www.bio4j.com

http://www.ohnosequences.com


[Neo4j] Scalability Roadmap

2011-11-18 Thread serge
Will the following topics be addressed in a future release (and when, if you
know)?

1/ Supernode

I know there is a big downside in the handling of super-nodes, which can be
a big issue in a Twitter-like website with, for example, a user followed by
more than 200k users (I have a real case in mind) or in a recommendation
system which has sophisticated rules.

I would like to know if the "super-node issue" (as we call it) is planned to
be investigated in future releases?

2/ Sharding and horizontal scalability

I guess sharding is a complex problem to handle with a graph db, but is it
planned to address the horizontal scalability goal? And that even if it
brings us towards a kind of "inconsistency", but an acceptable one
(for example, there are many cases where a website can accept
synchronization latency when it is under heavy load).

Thanks



--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Scalability-Roadmap-tp3519034p3519034.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.