Commit your changes after every 1,000-2,000 node and relationship writes instead of after each one, and it will go MUCH faster.
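
A minimal sketch of the batching I mean, not a drop-in fix. To keep it runnable without Neo4j on the classpath, Tx and TxSource are hypothetical stand-ins that mirror only the beginTx(), success() and finish() calls used in your code; runInBatches and the batch size of 2000 are likewise just illustrative choices.

```java
import java.util.function.IntConsumer;

public class BatchedWrites {
    /** Hypothetical stand-in for a transaction: only success() and finish(). */
    public interface Tx {
        void success();
        void finish();
    }

    /** Hypothetical stand-in for whatever hands out transactions (beginTx()). */
    public interface TxSource {
        Tx beginTx();
    }

    /**
     * Runs write.accept(i) for i in [0, total), opening one transaction per
     * batch of batchSize writes instead of one per write.
     * Returns the number of batches that completed.
     */
    public static int runInBatches(TxSource db, int total, int batchSize, IntConsumer write) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        for (int start = 0; start < total; start += batchSize) {
            int end = Math.min(start + batchSize, total);
            Tx tx = db.beginTx();
            try {
                for (int i = start; i < end; i++) {
                    write.accept(i); // e.g. create node i or its relationships
                }
                tx.success();
            } finally {
                tx.finish(); // one finish per batch, not per write
            }
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        int[] writes = {0};
        // Fake transaction source so the sketch runs on its own.
        TxSource fakeDb = () -> new Tx() {
            public void success() {}
            public void finish() {}
        };
        int commits = runInBatches(fakeDb, 1_000_000, 2_000, i -> writes[0]++);
        System.out.println(writes[0] + " writes in " + commits + " commits");
    }
}
```

In your loops, the body passed as write would be the setProperty or createRelationshipTo calls for index i, with graphDb.beginTx() in place of the fake source.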

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Gautam Thaker
Sent: Wednesday, August 31, 2011 12:32 PM
To: Neo4j user discussions
Subject: [Neo4j] Running out of memory in creating large graph..

Hi:

I am creating a graph with 1,000,000 nodes. The nodes are simple and have just 2 simple properties. I am using "embedded" server model and running java with "-Xmx2500m" (2.5 Gb heap). I am able to create the 1,000,000 nodes ok, taking an average of 4.4msec per node (this is on dedicated DELL 1950 with the datastore on a local dedicated disk.)

I next create 1 relation from each node to another randomly selected node. In this step I run out of memory. I get the backtrace as below. Interestingly the total disk usage by the datastore directory is just 250Mbytes, so I am not sure why I run out of memory. The code fragment is also below, any hints/comments welcome.

How much memory in the java heap is taken up per "Node"?

Thanks.

Gautam

N 1000000 nrels 1
completed creating 1000000 nodes time_per_node 4.412987 msec
# Now create 1 relation from each node to another randomly selected node.
Exception in thread "main" java.lang.OutOfMemoryError
    at sun.misc.Unsafe.allocateMemory(Native Method)
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:126)
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
    at org.neo4j.kernel.impl.transaction.xaframework.DirectMappedLogBuffer.<init>(DirectMappedLogBuffer.java:40)
    at org.neo4j.kernel.impl.transaction.TxLog.switchToLogFile(TxLog.java:476)
    at org.neo4j.kernel.impl.transaction.TxManager.getTxLog(TxManager.java:222)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:761)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:640)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:109)
    at org.neo4j.kernel.TopLevelTransaction.finish(TopLevelTransaction.java:85)
    at org.neo4j.examples.ATL_Test1.main(ATL_Test1.java:119)

    Transaction tx;
    long t1, t2;

    t1 = System.currentTimeMillis();
    for (int i = 0 ; i < N; i++) {
        tx = graphDb.beginTx();
        try {
            n[i] = getOrCreateNode("node_" + i);
            // place the object randomly in a 1000km x 1000km space.
            n[i].setProperty("X", r.nextInt(1000000)); // x location, units=meter (1000km=1E6m)
            n[i].setProperty("Y", r.nextInt(1000000)); // y location, units=meter (1000km=1E6m)
            tx.success();
        } finally {
            tx.finish();
        }
    }
    t2 = System.currentTimeMillis();
    float per_node = ((float) (t2-t1))/N;
    System.out.println("completed creating " + N + " nodes time_per_node " + per_node + " msec");

    t1 = System.currentTimeMillis();
    Node from, to;
    // from each node build "nrels" relationships to other nodes.
    for (int i = 0 ; i < N; i++) {
        from = n[i];
        for (int k = 0; k < nrels; k++) {
            int j;
            do {
                j = r.nextInt(N);
            } while (i == j); // no self relationship
            to = n[j];
            tx = graphDb.beginTx();
            try {
                Relationship rel = from.createRelationshipTo(to, MyRelationshipTypes.CALLED);
                rel.setProperty("NOW", System.currentTimeMillis()); // time the call was placed
                rel.setProperty("DURATION", r.nextInt(10) + 1); // 1 to 10 minutes call
                tx.success();
            } finally {
                tx.finish();
            }
        }
        // do a sanity check, make sure # of relations is correct.
        Iterable<Relationship> rels = from.getRelationships();
        int len = IteratorUtil.count(rels);
        if (len < nrels) {
            System.err.println("rel count is not " + nrels);
            System.exit(1);
        }
    }
    t2 = System.currentTimeMillis();
    float per_rel = ((float) (t2-t1))/(N*nrels);
    System.out.println("completed building nrels " + nrels + " per node time_per_rel " + per_rel + " msec");

_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user

