Hello all, I hope this question is relevant to this community. Please let me know.
The question is roughly: how do you avoid unexpected heap exhaustion or garbage-collection thrashing that causes timeouts when handling large graphs?

The application I am writing depends on ~11M nodes with ~100M relationships, each with 2 or 3 properties. In the application I will select a handful of nodes, find the connection paths between them, and then expand sub-graphs from the results. Conceivably the expansion of the sub-graphs could return hundreds of thousands or millions of nodes, and even more relationships and associated properties. I will run some analytics on the sub-graphs and then update the properties. I guess this is a fairly standard usage model. (The machine running the DB has 16G of RAM.)

As I run trials on the queries I am seeing "almost random" heap usage that every now and again causes out-of-heap errors. I understand the use of limits and batching to a reasonable level, but I feel there should be a solid and consistent programmatic way to protect against heap problems. As heap usage increases, I am seeing increasingly wide variations in query performance for repeated queries. Ideally I wouldn't have to artificially limit the size of the sub-graphs that I work on, as that erodes the performance of my analytic algorithms.

In my application, reliability is my #1 goal, consistent performance is #2, and absolute performance is #3.

So the question is: is there a solid and consistent programmatic way to protect against heap problems for all classes of queries and large volumes of property updates?

Best regards,
John.
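For reference, the batching strategy mentioned above can be sketched in plain Python. This is only an illustration of the chunking logic, not Neo4j-specific code: the helper name and batch size are arbitrary, and in a real application each chunk would be sent to the server in its own transaction (e.g. via the official Neo4j driver) so that no single transaction holds the whole sub-graph in heap at once.

```python
from itertools import islice

def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`.

    Bounding the number of nodes touched per transaction bounds the
    server-side heap footprint of each property-update round trip.
    """
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# Illustrative use: 11 node IDs split into batches of 4.  With a driver,
# each chunk would be committed in its own transaction before the next
# chunk is fetched, keeping peak memory roughly constant.
node_ids = list(range(11))
batches = list(batched(node_ids, 4))
# batches == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]
```

The trade-off the question raises is real: smaller batches give more predictable heap usage (goals #1 and #2) at some cost in absolute throughput (goal #3), since each commit adds a round trip.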
