Hello everyone,

We are relatively new to Neo4j and are evaluating some test scenarios to decide whether to use Neo4j in production systems. We used the latest stable release, 1.4.2.
I wrote an import script and generated some random data with the given tree structure: http://neo4j-community-discussions.438527.n3.nabble.com/file/n3467806/neo4j_nodes.png

Node summary:
Nodes with type A: 1
Nodes with type B: 100
Nodes with type C: 50'000 (100x500)
Nodes with type D: 500'000 (50'000x10)
Nodes with type E: 25'000'000 (500'000x50)
Nodes with type F: 375'000'000 (25'000'000x15)

This all worked quite well; the import took approx. 30 hours using the batch importer.

We have multiple indexes, but we also have one index in which all nodes are indexed. My first question: does it make sense to index all nodes in the same index? If I list all nodes with the property "type": "type E", it is quite slow the first time (~270s); the second time it is fast (~0.5s). I know this is normal and most likely fixed in the current milestone version, but I am not sure how long the query result stays cached in memory. Are there any configuration settings I should be concerned about?

We also used the hardware sizing calculator; see the result here: http://neo4j-community-discussions.438527.n3.nabble.com/file/n3467806/neo4j_hardware.png
Are these realistic values? I guess 128GB of RAM and 12TB of SSD storage might be a bit cost-intensive. Are there any reference applications with this number of nodes and relationships?

Also, Neoclipse won't start or connect to the database anymore with this amount of data. Am I missing some configuration for Neoclipse?

Best regards
--
alican

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-performance-with-400million-nodes-tp3467806p3467806.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user
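P.S. For context, the import script follows the usual BatchInserter pattern, sketched below with simplified names and a tiny data set (the directory name, relationship type, and index name are made up for illustration; package and class names are from the 1.4 API as I understand it, so treat this as a sketch rather than our exact script):

```java
import java.util.Map;

import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.index.BatchInserterIndex;
import org.neo4j.graphdb.index.BatchInserterIndexProvider;
import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class TreeImport {
    public static void main(String[] args) {
        // Non-transactional batch insertion straight into the store files.
        BatchInserter inserter = new BatchInserterImpl("target/graph.db");
        BatchInserterIndexProvider indexProvider =
                new LuceneBatchInserterIndexProvider(inserter);
        // The one global index over all nodes, keyed on "type".
        BatchInserterIndex allNodes = indexProvider.nodeIndex(
                "allNodes", MapUtil.stringMap("type", "exact"));

        Map<String, Object> rootProps = MapUtil.map("type", "type A");
        long root = inserter.createNode(rootProps);
        allNodes.add(root, rootProps);

        // First level of the tree: 100 type-B children of the root.
        for (int i = 0; i < 100; i++) {
            Map<String, Object> bProps = MapUtil.map("type", "type B");
            long b = inserter.createNode(bProps);
            allNodes.add(b, bProps);
            inserter.createRelationship(root, b,
                    DynamicRelationshipType.withName("CHILD"), null);
        }
        // ...the same pattern repeats down to the type-F leaves.

        allNodes.flush();
        indexProvider.shutdown();
        inserter.shutdown();
    }
}
```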

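P.P.S. Regarding the configuration question above: the knobs we have been experimenting with are the memory-mapped store settings in conf/neo4j.properties. The values below are placeholders I picked for illustration, not tuned recommendations:

```
# Memory-mapped I/O per store file (placeholder values, not recommendations)
neostore.nodestore.db.mapped_memory=2G
neostore.relationshipstore.db.mapped_memory=8G
neostore.propertystore.db.mapped_memory=4G
neostore.propertystore.db.strings.mapped_memory=2G
neostore.propertystore.db.arrays.mapped_memory=500M

# Object cache type: soft, weak, strong or none
cache_type=soft
```

The JVM heap itself is set separately (e.g. wrapper.java.maxmemory in conf/neo4j-wrapper.conf when running as a server). Pointers on how these interact with the first-run/second-run query times would be much appreciated.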
