Hey Mark,

That's a really fantastic and useful design metric. Can I paraphrase it a bit and write it up on the Neo4j blog/my blog?
I'll credit my source, naturally :-)

Jim

On 24 Feb 2011, at 14:08, Mark Harwood wrote:

>>> But in answering this, I wonder if there are actually two use cases here
>
> Yes, I see the use cases as the design decision points you are forced
> to make at varying points in the scale of increasing data volumes:
> 1) 0-10s of gigabytes:
> Slam in the RAM on a single server and all is plain sailing
> 2) Hundreds of Gigabytes
> Too big to hold all in RAM on a single server but not too big to worry
> about the cost of replicating the data on disk. Use the suggested
> "intelligent cache router" to favour replica servers with a likelihood
> of a pre-warmed cache for the given keys. The cost of a cache miss is
> not too catastrophic ( a local disk read vs RAM access)
> 3) Terabytes and above
> Too big for RAM, too big to store or replicate in its entirety on each
> server. The cost of not finding what you are after in RAM is then
> potentially very large - not just a local disk read but due to
> sharding potentially a network hop and related issues of the traversal
> state must now be exchanged between server processes.
>
> Cheers
> Mark
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
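
For the second tier in Mark's list (hundreds of gigabytes, fully replicated on disk), one possible way to sketch the "intelligent cache router" idea is rendezvous hashing: each key is always sent to the same replica, so that replica's cache is likely to be warm for it. This is only an illustration under assumptions, not how Neo4j or the thread's authors implement it; the names `route`, `_score`, and the replica labels are hypothetical.

```python
import hashlib
from typing import Sequence


def _score(server: str, key: str) -> int:
    """Deterministic pseudo-random weight for a (server, key) pair."""
    digest = hashlib.sha256(f"{server}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def route(key: str, replicas: Sequence[str]) -> str:
    """Pick the replica with the highest score for this key.

    Every replica holds the full data set on disk (an assumption taken
    from tier 2 in the message above), so any choice is correct; sending
    a key to the same replica each time just makes a warm cache likelier.
    """
    if not replicas:
        raise ValueError("no replicas available")
    return max(replicas, key=lambda server: _score(server, key))


if __name__ == "__main__":
    # Hypothetical replica names and key, for illustration only.
    servers = ["replica-a", "replica-b", "replica-c"]
    print(route("node:42", servers))
```

With this scheme, taking a replica out of rotation only re-routes the keys that were assigned to it; keys assigned to the remaining replicas keep their warm caches. It does not address the third tier, where the data is sharded and a miss may cost a network hop.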

