Sharding a graph is often reasonably easy to do for a specific domain (based on knowledge of that domain), but very hard to do generically for all domains. For this reason, while it has been considered for a long time, Neo4j has not supported built-in partitioning of the graph.
Unsurprisingly, this problem is analogous to the partitioning performance issue with an RDBMS. In an RDBMS it might be simple to pick a consistent hash for any specific table and partition based on that. However, it becomes hard to pick consistent hashes for multiple tables such that likely join operations do not cross partitions too often. This means that simple, less connected data (few joins) partitions easily in an RDBMS, but for highly connected data, performance tanks because joins cross partitions.

If you have a nicely designed, domain-specific graph in Neo4j, you should consider partitioning it in the application layer (a rough sketch of what that could look like follows at the end of this message). Since the partitioning is domain specific, this is the natural place to do it anyway (even with an RDBMS for highly connected models). I can also comment that the 'big data' models under consideration for partitioning often turn out to be small enough to fit inside one Neo4j instance; everyone has a different idea about what 'big data' is. Make sure that your decision to partition is based on a real need to split the data, not on the perception that you might need it. As Michael implied, perhaps the built-in HA mode is sufficient for your needs.

On Wed, Jun 4, 2014 at 7:22 AM, Michael Hunger <[email protected]> wrote:

> Each instance holds the _full_ graph. That way you achieve zero-copy
> failover and high-performance traversals which never have to cross the
> network.
>
> Michael
>
>
> On Wed, Jun 4, 2014 at 1:06 AM, Bernardo Hermont <[email protected]> wrote:
>
>> Hi Stefan,
>>
>> Thank you for your e-mail.
>> So there is no way of having each cluster member store only part of the
>> graph, I mean, to have more control over which part is stored on each
>> cluster node?
>>
>> I ask this just to see how exactly Neo4j fits my requirements right now.
>>
>> Thank you again,
>>
>> Bernardo
>>
>>
>> On Monday, June 2, 2014 3:11:42 AM UTC-4, Stefan Armbruster wrote:
>>
>>> Hi,
>>>
>>> Neo4j's clustering model is master-slave replication. Each cluster
>>> member has a copy of the full graph, enabling read operations without
>>> cluster intercommunication - and therefore reads scale almost linearly.
>>>
>>> Cheers,
>>> Stefan
>>>
>>> 2014-06-02 2:35 GMT+02:00 Bernardo Hermont <[email protected]>:
>>> > Hi all,
>>> >
>>> > I have the following questions about Neo4j:
>>> >
>>> > Is it possible to choose which slave node the data inserted via the
>>> > PUT REST interface is stored on?
>>> > Or is the data automatically sharded across the slave servers that
>>> > are part of the cluster?
>>> >
>>> > Is there any way to do this in a clustered configuration?
>>> >
>>> > Regards,
>>> >
>>> > Bernardo
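P.S. To make the application-layer idea concrete, here is a minimal sketch of a consistent-hash router that maps a domain key (say, a customer id) to one of several independent Neo4j instances, so that each customer's subgraph lives entirely on one instance and traversals never cross the network. To be clear, none of this is a Neo4j API; it would live entirely in your own application code, and the instance URLs, the "customer-42" key, the use of MD5, and the virtual-node count are all illustrative assumptions.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.SortedMap;
import java.util.TreeMap;

/**
 * Hypothetical application-layer router, not a Neo4j feature.
 * Each domain key is hashed onto a ring; the instance owning the
 * next ring position stores that key's entire subgraph.
 */
public class GraphPartitionRouter {

    // Ring position -> instance URL. Each instance appears at many
    // positions ("virtual nodes") to even out the key distribution.
    private final SortedMap<Long, String> ring = new TreeMap<>();

    public GraphPartitionRouter(Iterable<String> instanceUrls, int virtualNodes) {
        for (String url : instanceUrls) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(url + "#" + i), url);
            }
        }
    }

    /** All reads and writes for this key go to the returned instance. */
    public String instanceFor(String domainKey) {
        long h = hash(domainKey);
        // First virtual node at or after the key's hash, wrapping around.
        SortedMap<Long, String> tail = ring.tailMap(h);
        return tail.isEmpty() ? ring.get(ring.firstKey())
                              : tail.get(tail.firstKey());
    }

    // Stable hash: first 8 bytes of MD5, so routing does not depend on
    // JVM-specific hashCode() behaviour.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always available
        }
    }

    public static void main(String[] args) {
        GraphPartitionRouter router = new GraphPartitionRouter(
                Arrays.asList("http://neo4j-a:7474", "http://neo4j-b:7474"), 128);
        // The application then sends the REST request for this customer's
        // subgraph to the chosen instance.
        System.out.println(router.instanceFor("customer-42"));
    }
}

The point of the ring over plain modulo hashing is exactly the concern raised above about picking hashes well: adding a third instance only remaps the keys falling on the ring segments taken over by the new instance's virtual nodes, instead of reshuffling nearly every key. The hard part remains choosing a domain key such that traversals you care about stay within one partition.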
