(TL;DR: skip to the last paragraph :) )
All of the following is for Neo4j version 2.1.2 running under 64-bit Linux.
I am trying to create a Neo4j database by importing from a CSV file. The CSV
file has three columns: node1Name, node2Name, count.
I run 
  neo4j-shell -path mydatabase
and execute the following commands:

CREATE CONSTRAINT ON (node:Node) ASSERT node.name IS UNIQUE;

USING PERIODIC COMMIT 500
LOAD CSV FROM 'file:/where/my/csv/file/is.csv' AS line
MERGE (node1:Node {name:line[0]})
MERGE (node2:Node {name:line[1]})
CREATE (node1)-[:REL {count:toInt(line[2])}]->(node2);

The CSV file contains about 12 million lines and there are about 5 million 
distinct nodes, most of them with just one or a few relationships, and only 
a few thousand nodes with more than 1,000 or 10,000 relationships. 

I run the neo4j-shell command with 8G of maximum heap memory on a machine 
with 2 cores and 16GB RAM in total. The odd thing is that the required heap 
memory seems to go up linearly over time and eventually the process aborts 
with an out of memory condition. 
Another attempt on a larger machine, with a 20G max heap size and 16 cores, 
shows exactly the same behavior in jconsole. On that machine, the 
memory usage initially oscillates around 2G, then over the course of the 
next hour or so slowly climbs to oscillate around 12G. Then there is a 
sharp rise and all of the 20G is consumed, with no chance of reclaiming 
any of it!

What could the reason for that be? I cannot see why simply adding nodes 
and relationships like this would cause a steady accumulation of 
non-reclaimable heap memory.

UPDATE: while writing this I also tried to split the CSV up into smaller 
chunks (500000 rows): when I load each chunk during the same neo4j-shell 
session, the amount of memory that cannot get re-claimed goes up in the 
same way and eventually I get an out of memory exception, even with 20G. 
However, loading each chunk in a separate neo4j-shell session works just 
fine and never uses more than 4G of memory. 
This seems to indicate a memory leak somewhere, because that memory does 
not actually appear to be needed to create the graph.
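
For reference, here is a minimal sketch of the one-session-per-chunk 
workaround described above. It assumes a POSIX `split` is available and 
that neo4j-shell accepts a whole statement through its `-c` option; the 
function names and the chunk prefix are made up for illustration. The 
uniqueness constraint is assumed to have been created once beforehand.

```shell
#!/bin/sh
# Sketch only: function names and chunk prefix are hypothetical.

# Split a CSV into fixed-size chunks named <prefix>aa, <prefix>ab, ...
split_csv() {
  # $1 = input CSV, $2 = rows per chunk, $3 = output prefix
  split -l "$2" "$1" "$3"
}

# Load one chunk in its own neo4j-shell process, so the JVM (and its heap)
# goes away when the process exits.
load_chunk() {
  # $1 = database path, $2 = absolute path to the chunk (no spaces assumed)
  neo4j-shell -path "$1" -c "USING PERIODIC COMMIT 500
LOAD CSV FROM 'file:$2' AS line
MERGE (node1:Node {name:line[0]})
MERGE (node2:Node {name:line[1]})
CREATE (node1)-[:REL {count:toInt(line[2])}]->(node2);"
}

# Load every chunk with the given prefix, stopping at the first failure.
load_all() {
  # $1 = database path, $2 = absolute chunk prefix
  for chunk in "$2"*; do
    load_chunk "$1" "$chunk" || return 1
  done
}
```

With these definitions, something like 
`split_csv /where/my/csv/file/is.csv 500000 /tmp/chunk_ && load_all mydatabase /tmp/chunk_` 
would reproduce the separate-session run; `/tmp/chunk_` is just a 
placeholder prefix.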

johann
