Re: [Neo4j] Neo4j Multiple Nodes Issue

David Montag Wed, 21 Jul 2010 13:12:13 -0700

Hi Maaz,

Rick is on the right track with the UID generation. You need to make
more than the ID generation thread safe, though. Your first code
snippet is obviously not thread safe. The second one uses
double-checked locking, and should be OK. You can also simply
synchronize around the whole first snippet, or try Johan's suggested
locking strategy. I'd recommend staying away from Neo4j's node IDs in
this case, for the reasons Rick stated.
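
To make the "synchronize around the whole first snippet" option
concrete, here is a minimal sketch. It does not use the Neo4j API: the
HashMap is only a stand-in for the index, a plain Object stands in for
a node, and the names GetOrCreateSketch, getOrCreate and countCreated
are made up for illustration. It also assumes that the write is
visible to other threads before the lock is released. With
Spring-managed transactions that means the commit would have to happen
inside the lock, since uncommitted changes are not visible to other
transactions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class GetOrCreateSketch {
    // Stand-in for the index (UID -> node). NOT Neo4j's IndexService.
    private final Map<Long, Object> index = new HashMap<>();
    // Counts how many "nodes" were actually created.
    private final AtomicInteger created = new AtomicInteger();

    // The lookup and the creation happen under one lock, so two threads
    // can never both see "not found" for the same UID.
    public synchronized Object getOrCreate(long id) {
        Object node = index.get(id);
        if (node == null) {
            node = new Object(); // stand-in for createNode() + setProperty()
            created.incrementAndGet();
            index.put(id, node); // stand-in for index(node, UID, id)
        }
        return node;
    }

    // Starts `threads` threads that all ask for the same UID at once and
    // returns how many nodes were created in total.
    public static int countCreated(int threads, long id) {
        GetOrCreateSketch sketch = new GetOrCreateSketch();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // line all threads up before racing
                    sketch.getOrCreate(id);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sketch.created.get();
    }

    public static void main(String[] args) {
        System.out.println("nodes created: " + countCreated(8, 42L));
    }
}
```

With the synchronized keyword in place, the count should be one no
matter how many threads race for the same UID. Removing it brings back
the check-then-act race from the first snippet.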


Now, regarding performance, there are a lot of factors here. Will your
code serve requests? Over HTTP? If so, does the locking here really
matter? I.e. is it washed out by the orders of magnitude greater
network I/O times? If you're really concerned about performance, then
you *will* want to do some kind of profiling. Point is, does the
locking here matter in relation to other delays? Knowing where you
should put your time optimizing is key.

As to the performance of neoIndexService.getSingleNode(), I'm afraid I
currently don't know. Maybe some of the other guys can help you out
with this. Regarding your question about batching operations together
in a single transaction versus doing them in different transactions,
you can easily try it by writing a test. But just thinking about it,
each mutating transaction has to hit disk, so using separate
transactions will probably cost you extra I/O seeks, and I would count
on it taking longer. How much longer, I can't say. And someone please
correct me if I'm wrong here!
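
As a rough illustration of why I'd expect that, here is a small sketch
that models only the "every commit has to hit disk" part. It is not
Neo4j code: it appends small records to a temp file and forces the
file to disk either once per record or once per batch. The class name
CommitCostSketch and the method write are made up for this example,
and the timings it prints will depend entirely on your disk and OS.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CommitCostSketch {
    // Appends `records` 8-byte records to a temp file and forces the file
    // to disk after every `perCommit` records (and once more at the end if
    // records remain). Returns the number of forced writes, i.e. the
    // number of simulated commits.
    public static int write(int records, int perCommit) {
        int forces = 0;
        try {
            Path file = Files.createTempFile("commit-cost", ".log");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                for (int i = 1; i <= records; i++) {
                    ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
                    buffer.putLong(i);
                    buffer.flip();
                    while (buffer.hasRemaining()) {
                        channel.write(buffer);
                    }
                    if (i % perCommit == 0 || i == records) {
                        channel.force(true); // simulated commit: flush to disk
                        forces++;
                    }
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return forces;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        int batched = write(1000, 1000); // one commit for all records
        long t1 = System.nanoTime();
        int separate = write(1000, 1);   // one commit per record
        long t2 = System.nanoTime();
        System.out.printf("batched:  %d commit(s), %d ms%n", batched, (t1 - t0) / 1_000_000);
        System.out.printf("separate: %d commit(s), %d ms%n", separate, (t2 - t1) / 1_000_000);
    }
}
```

This only shows the shape of the cost. A real test against your own
graph, with your own node and relationship creation inside the
transactions, is the only way to get numbers that mean anything.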

David

On Wednesday, July 21, 2010, Maaz Bin Tariq <maaz.ta...@yahoo.com> wrote:
> Thanks Johan Svensson and Rick Bullotta.
> Yes Bullotta, you are right that the node creation is the problem; our code
> is something similar to the following code. We do not want to synchronize the
> method, as it costs some performance. Any suggestions to improve it? Also, how
> costly is the neoIndexService.getSingleNode() method if we call it two or three
> times and the node has not been created? Will it search the whole graph?
> Svensson, in our case the problem is the creation of duplicate reference nodes,
> which is not even handled in the sample code.
> ---------------------------------------------------------------------------
> private IndexService neoIndexService;
> private GraphDatabaseService neoService;
>
>
> private Node getOrCreateUserNodeByUserId(final Long id) {
>     Node node = neoIndexService.getSingleNode(UID, id);
>
>     if (node == null) {
>         node = neoService.createNode();
>         node.setProperty(UID, id);
>         neoIndexService.index(node, UID, id);
>     }
>
>     return node;
> }
> ---------------------------------------------------------------------------
> How costly is the following solution?
>
>     private Node getOrCreateUserNodeByUserId(final Long id) {
>         Node node = neoIndexService.getSingleNode(UID, id);
>
>         if (node == null) {
>             node = createUserNode(id);
>         }
>
>         return node;
>     }
>
>      private synchronized Node createUserNode(final Long id){
>         Node node = neoIndexService.getSingleNode(UID, id);
>         if (node == null) {
>             node = neoService.createNode();
>             node.setProperty(UID, id);
>             neoIndexService.index(node, UID, id);
>         }
>         return node;
> }
>
>
> Thanks
> -Maaz
>
> --- On Wed, 7/21/10, Johan Svensson <jo...@neotechnology.com> wrote:
>
> From: Johan Svensson <jo...@neotechnology.com>
> Subject: Re: [Neo4j] Neo4j Multiple Nodes Issue
> To: "Neo4j user discussions" <user@lists.neo4j.org>
> Date: Wednesday, July 21, 2010, 7:38 PM
>
> Hi,
>
> One can use the built in locking in the kernel to synchronize and make
> code thread safe. Here is an example of this:
>
> https://svn.neo4j.org/examples/apoc-examples/trunk/src/main/java/org/neo4j/examples/socnet/PersonFactory.java
>
> The "createPerson" method guards against creation of multiple persons
> with the same name by creating a relationship from the reference node.
> After the relationship has been created (in the transaction but not
> yet committed) the write lock for the reference node has been acquired
> making sure any other running transaction has to wait for the lock to
> be released. Finally the index is checked to make sure some other
> transaction did not create the person while the current transaction
> was waiting for the write lock.
>
> Even simpler is to just remove a non-existing property from a node or
> relationship. That will grab a lock on the specific node or
> relationship (that will be held until the transaction commits or is
> rolled back).
>
> Regards,
> Johan
>
> On Wed, Jul 21, 2010 at 4:07 PM, Rick Bullotta
> <rick.bullo...@burningskysoftware.com> wrote:
>> The node ID indirectly achieves this, but node IDs can be recycled when
>> nodes are deleted.  Also, depending on the node ID may or may not work in
>> future versions of Neo that might support sharding or distributed storage.
>>
>> Sounds to me like you have a simpler issue, in that your UID generator
>> isn't coded properly.  It should be designed as "thread safe" so that you
>> couldn't get the same UID in the first place.
>>
>> -----Original Message-----
>> From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On
>> Behalf Of Maaz Bin Tariq
>> Sent: Wednesday, July 21, 2010 9:39 AM
>> To: user@lists.neo4j.org
>> Subject: [Neo4j] Neo4j Multiple Nodes Issue
>>
>> Hi,
>>
>> We are facing a situation in our application using the Neo4j graph, and we
>> would like your advice on it.
>>
>> We use the graph in such a way that we create a node, set a property on
>> every node, i.e. UID = value (numeric), and then create the relationships
>> between those nodes. According to our use case requirement, only one node
>> with a given UID value should exist in the whole graph space. That is, UID
>> should be the unique identifier for a node in the graph space.
>>
>> Our graph service is configured using the Spring framework, and all
>> transaction handling is managed by Spring itself. Now we are facing a
>> problem where multiple nodes get created with the same UID, because
>> multiple transactions run at the same time and the effect of one
>> transaction is not visible to the others until it is committed. What we do
>> is look for a node in the graph with a specific UID, and if it is not there
>> we create one. So there is a chance that multiple nodes are created with
>> the same UID if multiple transactions run at the same time and try to look
>> up and create the same UID.
>>
>> I would like to ask whether Neo4j has some kind of unique constraint that
>> can be applied to a specific property to prevent multiple nodes from being
>> created with the same UID. Second, say I am creating 1000 nodes and their
>> relationships in one transaction: what is the performance cost if I instead
>> create each node and its relationships in a separate transaction?
>>
>> Thanks
>> -Maaz
>>
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
