Hey all,

I've been working on adding batch support to Neo4jPHP
(http://github.com/jadell/Neo4jPHP).  Here are the results of my latest
benchmarks.  The first column is the number of nodes inserted, the second
is the average time in seconds over 5 runs to insert that many nodes in a
single batch, and the third is the average time in seconds over 5 runs to
insert that many nodes one at a time:

#nodes   batch (s)   single (s)
10       0           0
100      0.2         0.4
250      0           1
500      0.8         2
1000     1.4         4
2500     6           10.6
5000     23.2        21.2
10000    91.6        40.4

Batches win out up to 2500 nodes at a time.  At around 5000 nodes the two
are about even, and at 10000 nodes the batch takes more than twice as long.
I've profiled my code, and the time spent in PHP is roughly equivalent for
batch vs. single.  All of the time difference is spent in a curl_exec call,
talking to or waiting to hear back from the server.
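
To make the crossover easier to see, here is a small sketch that turns the table above into per-node costs.  It is written in Python only for illustration and is not part of Neo4jPHP or the benchmark script; the names RESULTS, per_node_ms and crossover are made up for this example, and the numbers are copied straight from the table.

```python
# Timings copied from the table above: (nodes, batch seconds, single seconds).
RESULTS = [
    (10, 0.0, 0.0),
    (100, 0.2, 0.4),
    (250, 0.0, 1.0),
    (500, 0.8, 2.0),
    (1000, 1.4, 4.0),
    (2500, 6.0, 10.6),
    (5000, 23.2, 21.2),
    (10000, 91.6, 40.4),
]


def per_node_ms(seconds, nodes):
    """Average cost of inserting one node, in milliseconds."""
    return seconds * 1000.0 / nodes


def crossover(results):
    """Return the first node count where batch is slower than single, or None."""
    for nodes, batch, single in results:
        if batch > single:
            return nodes
    return None


if __name__ == "__main__":
    for nodes, batch, single in RESULTS:
        print(
            f"{nodes:>6}  batch {per_node_ms(batch, nodes):5.2f} ms/node"
            f"  single {per_node_ms(single, nodes):5.2f} ms/node"
        )
    print("batch first loses at", crossover(RESULTS), "nodes")
```

Read this way, the single-insert cost stays close to 4 ms per node from 1000 nodes upward, while the batch cost per node keeps climbing as the batch gets larger (about 1.4 ms at 1000 nodes, about 9.2 ms at 10000).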

I tried going up to 100000 nodes.  Single insert handled this just fine, but
the server kept returning a "500 Java Heap space" error on the batch, even
with 512M max heap.
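
One possible workaround, which was not part of the benchmarks above and is untested against the server, would be to split a large insert into several smaller batches so that no single request has to hold all 100000 nodes.  The sketch below only shows the splitting step, in Python for illustration; the helper name chunked and the batch size of 1000 are assumptions, and the actual commit of each batch is left out.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Hypothetical workload: 100000 node payloads, sent 1000 per batch.
    nodes = [{"index": i} for i in range(100000)]
    batches = list(chunked(nodes, 1000))
    print(len(batches), "batches of up to 1000 nodes each")
```

Whether this actually avoids the heap error, and which batch size works best, would need to be measured.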

The benchmark script can be found here:  http://gist.github.com/1169394
Benchmarks were run on a 4-core 2.3GHz Intel i7 with 4G RAM, running Ubuntu
10.10.  The Neo4j server was run with out-of-the-box settings in a VM running
Ubuntu 10.10 with 1 dedicated core and 1G RAM.

I hope this is of interest to someone.  I'd love to get some feedback from
anyone using Neo4j from PHP, with Neo4jPHP or any other library.

-- Josh Adell

