Folks,

I've got an application that has (will have) about 2 billion vertices
and maybe 8 billion edges (?).  Maybe an avg of 4 properties per
vertex -- with maybe an avg of 32 bytes/value.  So I guess that's 16
billion primitives.  Let's round to 20 billion.  My edges estimate is
a relatively uninformed guess.  Just starting to dig into the data.
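
For reference, here is the arithmetic behind those estimates as a minimal Python sketch. The counts come straight from the numbers above. The function name `estimate` is only illustrative, and the sketch assumes edges carry no properties, which may not hold for the real data. Edges plus properties give 16 billion; counting the vertices as well gives 18 billion, which still rounds to 20 billion. The raw property values alone come to about 256 GB (decimal) before any per-record overhead.

```python
# Back-of-envelope sizing from the estimates in this post.
VERTICES = 2_000_000_000   # ~2 billion vertices
EDGES = 8_000_000_000      # ~8 billion edges (a rough guess)
PROPS_PER_VERTEX = 4       # average properties per vertex
BYTES_PER_VALUE = 32       # average bytes per property value


def estimate(vertices=VERTICES, edges=EDGES,
             props_per_vertex=PROPS_PER_VERTEX,
             bytes_per_value=BYTES_PER_VALUE):
    """Return primitive counts and the raw property payload in bytes.

    Assumes only vertices have properties (edges have none).
    """
    properties = vertices * props_per_vertex
    return {
        "vertices": vertices,
        "edges": edges,
        "properties": properties,
        "primitives": vertices + edges + properties,
        "raw_value_bytes": properties * bytes_per_value,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name:>16}: {value:,}")
```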

Traversals will be relatively shallow.  Concurrent access.  Throughput
is more important than latency.  But latency should be better than
maybe 50ms 99% of the time (allowing for some cache warming and some
GC).  I don't know much yet about locality.  I'm not sure yet how
sensitive the app will be to long GCs.
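
To make the latency target concrete, here is a small Python sketch of how I would check "better than 50ms 99% of the time" against measured samples. It uses the nearest-rank percentile, which is one of several common definitions. The helper names `percentile` and `meets_target` are hypothetical, and the samples are assumed to be per-request latencies in milliseconds collected after warm-up.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile of a non-empty list of latencies (ms)."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # 1-based nearest rank: smallest rank covering pct percent of samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_target(samples_ms, pct=99, budget_ms=50.0):
    """True if the pct-th percentile latency is within the budget."""
    return percentile(samples_ms, pct) <= budget_ms


if __name__ == "__main__":
    # One slow request (e.g. a GC pause) out of 100 still meets the target.
    samples = [10.0] * 99 + [400.0]
    print(percentile(samples, 99), meets_target(samples))
```

A single long GC pause that hits many concurrent requests at once would push more than 1% of samples over budget, so this check is only meaningful over a long enough measurement window.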

We will need to do a big batch load, and writes will need to be fast
in that phase.  After that, we'll see more reads than writes.  So I
imagine a config for the batch load and another config for production.

I understand cache sharding, application-level partitioning, and so
forth.  I'm wondering what I can do on a single machine -- and what
that machine should look like.

http://docs.neo4j.org/chunked/stable/configuration-jvm.html and
http://wiki.neo4j.org/content/Neo4j_Performance_Guide are encouraging.
And having knobs as documented at
http://wiki.neo4j.org/content/Configuration_Settings is great.  Nice
work!

I'm hoping I might be able to get away with 128GB RAM on 12 cores with
data striped over a handful of disks (SSDs if required).  We'll
probably also need a cluster for both traffic and availability, but
that's another topic.
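
A rough way to frame the 128GB question is to ask what fraction of the fixed-size store records would fit in RAM. The Python sketch below does that. The per-record byte sizes in `RECORD_BYTES` are placeholder assumptions, not figures from this post; they need to be replaced with the documented record sizes for the Neo4j version in use. The sketch also ignores the property value payload itself, the JVM heap, and OS overhead, so the real coverage would be lower.

```python
# ASSUMPTION: fixed per-record sizes in bytes. Placeholder values only;
# check the documentation for the store format of your Neo4j version.
RECORD_BYTES = {"node": 9, "relationship": 33, "property": 25}

# Counts taken from the estimates earlier in this post.
COUNTS = {
    "node": 2_000_000_000,
    "relationship": 8_000_000_000,
    "property": 8_000_000_000,
}


def store_bytes(counts, record_bytes):
    """Bytes per store, as record count times assumed record size."""
    return {kind: counts[kind] * record_bytes[kind] for kind in counts}


def ram_coverage(counts, record_bytes, ram_bytes):
    """Fraction (0..1) of the total record bytes that fits in ram_bytes."""
    total = sum(store_bytes(counts, record_bytes).values())
    return min(1.0, ram_bytes / total)


if __name__ == "__main__":
    ram = 128 * 2**30  # 128 GiB
    for kind, size in store_bytes(COUNTS, RECORD_BYTES).items():
        print(f"{kind:>12}: {size / 1e9:,.0f} GB")
    print(f"coverage: {ram_coverage(COUNTS, RECORD_BYTES, ram):.0%}")
```

If the coverage comes out well below 100%, locality of access matters a lot more, which is part of why I'm asking about SSDs.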

Does anybody have experience with a data set like this on a similar
machine?  How much RAM and how much disk -- and what kinds and in what
configuration?  Latency, throughput, general experience?  Any
production deployments?

I'd appreciate any guidance or feedback.  I'm happy to summarize later
if that'd be helpful.

BTW, my testbed uses Clojure with clojure.contrib.server-socket and
https://github.com/wagjo/borneo. Very convenient!

--Jamie
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
