FWIW, I have replicated this issue on 2.0.3 as well. While GC churn kicks 
in at approximately the same point as with 2.1.2, it is interesting to 
note how much faster the test case's Cypher queries perform in 
2.1.2--something like 50% faster! :)

Nonetheless, the memory leak continues to be an issue for me. AFAICT, 
the py2neo API is properly opening, submitting, and closing the Cypher 
transactions according to spec. I'd greatly appreciate any assistance in 
determining whether this is indeed a bug in Neo4j.
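
For reference, here is my understanding of what py2neo sends under the hood. The endpoint paths and JSON shape are from the Neo4j 2.x transactional Cypher endpoint documentation (POST /db/data/transaction to open, POST /db/data/transaction/{id}/commit to commit); the helper function itself is just my own sketch, not py2neo code:

```python
import json

# Hypothetical helper: build the JSON body that the transactional
# Cypher endpoint expects. POSTing this to /db/data/transaction/commit
# opens, executes, and commits in a single round trip.
def cypher_payload(statements):
    """Wrap (query, params) pairs in the endpoint's request format."""
    return json.dumps({
        "statements": [
            {"statement": query, "parameters": params}
            for query, params in statements
        ]
    })

body = cypher_payload([
    ("CREATE (p:Person {name: {name}})", {"name": "Alice"}),
])
```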

Thanks,
-Matt


On Wednesday, July 2, 2014 2:20:10 PM UTC-4, Matt Aldridge wrote:
>
> Hi everyone,
>
> I have a use case that appears to expose a memory leak in Neo4j. I've been 
> testing this with Neo4j 2.1.2 on OS X.
>
> I've created a test case that reproduces the issue consistently and mimics 
> the behavior of my real-world application: 
> https://gist.github.com/mlaldrid/85a03fc022170561b807. It uses py2neo to 
> interface with Neo4j's transactional Cypher HTTP endpoint. To force the 
> suspected memory leak to surface more quickly, I limit Neo4j's max 
> heap to 1GB. In practice I tend to use an 8GB heap, but the misbehavior 
> still occurs (albeit delayed).
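>
> For completeness, this is how I capped the heap -- the standard Neo4j 2.x 
> wrapper settings in conf/neo4j-wrapper.conf, values in MB (reproduced from 
> memory, so double-check against your install):

```
# conf/neo4j-wrapper.conf -- cap the JVM heap at 1GB for the test
wrapper.java.initmemory=1024
wrapper.java.maxmemory=1024
```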
>
> In my real-world application, we need to CREATE millions of primary nodes 
> of interest and MERGE ancillary nodes into the graph, as they can be shared 
> by any number of other primary nodes. In the test case here we give the 
> primary nodes the Person label, and the ancillary nodes are labeled Address 
> and Phone. A fixed set of Address and Phone nodes are generated and 
> randomly attached to Person nodes.
>
> Each Cypher transaction CREATEs 1000 Person nodes and MERGEs in 2 Address 
> and 1 Phone node for each Person. The transactions are created and then 
> committed without any intermediate executions of the open transaction.
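>
> Roughly, each batch is assembled like this -- a simplified sketch of the 
> gist, not the gist itself (the labels match, but the property keys and 
> relationship types here are made up):

```python
import random

def batch_statements(people, addresses, phones):
    """Build one transaction's worth of Cypher statements: CREATE each
    Person, then MERGE two shared Address nodes and one shared Phone
    node and attach them. Property keys and relationship types are
    illustrative; parameter syntax is Neo4j 2.x style."""
    statements = []
    for name in people:
        street1, street2 = random.sample(addresses, 2)
        statements.append((
            "CREATE (p:Person {name: {name}}) "
            "MERGE (a1:Address {street: {street1}}) "
            "MERGE (a2:Address {street: {street2}}) "
            "MERGE (ph:Phone {number: {number}}) "
            "MERGE (p)-[:HAS_ADDRESS]->(a1) "
            "MERGE (p)-[:HAS_ADDRESS]->(a2) "
            "MERGE (p)-[:HAS_PHONE]->(ph)",
            {"name": name, "street1": street1, "street2": street2,
             "number": random.choice(phones)},
        ))
    return statements
```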
>
> This log demonstrates increasingly poor load performance until finally 
> Neo4j runs out of heap space and fails a transaction:
>
> % time python neo4j_heap_stress.py
> 2014-07-01 17:30:07,596 :: __main__ :: Generating fake data ...
> 2014-07-01 17:30:31,430 :: __main__ :: Creating label indices ...
> 2014-07-01 17:30:31,992 :: __main__ :: Beginning load ...
> 2014-07-01 17:33:33,949 :: __main__ :: Finished batch 100
> 2014-07-01 17:35:49,346 :: __main__ :: Finished batch 200
> 2014-07-01 17:37:56,856 :: __main__ :: Finished batch 300
> 2014-07-01 17:40:01,333 :: __main__ :: Finished batch 400
> 2014-07-01 17:42:04,855 :: __main__ :: Finished batch 500
> 2014-07-01 17:44:11,104 :: __main__ :: Finished batch 600
> 2014-07-01 17:46:17,261 :: __main__ :: Finished batch 700
> 2014-07-01 17:48:21,778 :: __main__ :: Finished batch 800
> 2014-07-01 17:50:28,206 :: __main__ :: Finished batch 900
> 2014-07-01 17:52:39,424 :: __main__ :: Finished batch 1000
> 2014-07-01 17:54:56,618 :: __main__ :: Finished batch 1100
> 2014-07-01 17:57:22,797 :: __main__ :: Finished batch 1200
> 2014-07-01 18:02:27,327 :: __main__ :: Finished batch 1300
> 2014-07-01 18:12:35,143 :: __main__ :: Finished batch 1400
> 2014-07-01 18:24:16,126 :: __main__ :: Finished batch 1500
> 2014-07-01 18:38:25,835 :: __main__ :: Finished batch 1600
> 2014-07-01 18:56:18,826 :: __main__ :: Finished batch 1700
> 2014-07-01 19:22:00,779 :: __main__ :: Finished batch 1800
> 2014-07-01 20:17:18,317 :: __main__ :: Finished batch 1900
> Traceback (most recent call last):
>   File "neo4j_heap_stress.py", line 112, in <module>
>     main()
>   File "neo4j_heap_stress.py", line 34, in main
>     load_batch(fake, addresses, phones)
>   File "neo4j_heap_stress.py", line 68, in load_batch
>     tx.commit()
>   File "/Users/matt/.virtualenvs/dataenv/lib/python2.7/site-packages/py2neo/cypher.py", line 242, in commit
>     return self._post(self._commit or self._begin_commit)
>   File "/Users/matt/.virtualenvs/dataenv/lib/python2.7/site-packages/py2neo/cypher.py", line 217, in _post
>     raise TransactionError.new(error["code"], error["message"])
> py2neo.cypher.CouldNotCommit: commit threw exception
> python neo4j_heap_stress.py  1351.47s user 48.76s system 10% cpu 3:46:27.77 total
>
> Early in the load process, 100K Person nodes are inserted approximately 
> every 2 minutes. However, after 1.2M Person nodes (batch 1200) have been 
> inserted into the graph, the batch times slow dramatically. This corresponds 
> with increased Java garbage collection times--near the end, the GC suspends 
> application threads for several seconds at a time. I have captured the GC 
> logs for Neo4j if there is any interest in seeing them.
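>
> (For anyone who wants the same data: I enabled GC logging with extra JVM 
> flags in conf/neo4j-wrapper.conf. These are standard HotSpot flags; the 
> exact lines below are reproduced from memory, and the log path is my own 
> choice:)

```
# conf/neo4j-wrapper.conf -- enable HotSpot GC logging
wrapper.java.additional=-verbose:gc
wrapper.java.additional=-Xloggc:data/log/gc.log
wrapper.java.additional=-XX:+PrintGCDetails
wrapper.java.additional=-XX:+PrintGCDateStamps
```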
>
> Eventually a transaction fails, and this is in console.log:
>
> Exception in thread "GC-Monitor"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "GC-Monitor"
> Exception in thread "DateCache"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "DateCache"
> Exception in thread "RMI RenewClean-[192.168.1.214:60253]" java.lang.OutOfMemoryError: Java heap space
>
> Of all the variables here, the many MERGE queries appear to be the 
> culprit behind the performance degradation. Before changing my 
> application's graph model to share ancillary nodes among the primary 
> nodes, I was doing a simple CREATE for all nodes. That scaled nearly 
> linearly to tens of millions of nodes in the graph, using the same 
> Cypher transaction method via py2neo. Only when I wanted to 
> de-duplicate the shared ancillary nodes with MERGE did this behavior arise. 
> Hopefully that's not a red herring, but something is almost certainly amiss 
> here.
>
> Please let me know if I can provide any other relevant details to 
> reproduce.
>
> Thanks,
> -Matt
>
>
