Glad I could help!

> On May 4, 2016, at 21:55, John Fry <[email protected]> wrote:
>
> Many many thanks Clark - a huge help....
>
> see in line:
>
>> On Wednesday, May 4, 2016 at 1:11:44 PM UTC-7, Clark Richey wrote:
>> John,
>> Here are my initial thoughts:
>> You have too much memory assigned to the JVM given that Neo uses
>> off-heap memory for node caches. You have a lot of nodes in memory at a
>> given time (550k nodes if I understand correctly). The OS is probably paging
>> like mad to keep up, which is consistent with the profile you are seeing. You
>> need to reduce the number of nodes you are loading in a transaction and
>> likely decrease the JVM size.
> I did not know that nodes were cached off-heap! I reduced the allocation by
> roughly 8G and ran -Xms12000m -Xmx16000m.
> I also decreased the batch size to 5k source nodes to keep the total number
> of nodes computed over per batch to less than 100k.
> The application now completes in about 20 mins with uniform loading across
> threads :)
>
>> If your graph is actually connected, meaning that a destination node may be
>> attached to more than one source node, you could probably do this more
>> efficiently so as to not have to reload the destination nodes multiple times.
> There is probably a clever way to do what I need to do but with a 20 min run
> time it is now good enough and readable etc.
>> If your graph is NOT that connected and you just have
>> (:source)-[]->(:connected) 15 million times and nothing else then I might
>> even question why you are using a graph database.
> The graph is highly connected with an average in/out degree of 10.
>>
>>> On May 4, 2016, at 1:30 PM, John Fry <[email protected]> wrote:
>>>
>>> Hi All,
>>>
>>> I am seeing slow and worsening memory performance and ~6 hour run times for
>>> the application detailed below.
>>> I am running too close to using all 32G of RAM and the AWS instance often
>>> stalls/fails due to memory allocation issues. This is despite limiting the
>>> application/JVM to 24G.
>>> Garbage collection rates worsen as the application progresses.
>>>
>>> What I need help with is the following:
>>> - given the description below - is 6 hours run-time and 32G footprint about
>>>   correct?
>>> - how do you know when you have tuned a single instance embedded use of neo4j
>>>   to optimal memory performance? (is there a benchmark to tune against?)
>>> - is there anything obviously wrong or naive with the approach below?
>>> - what other tuning options are available for me to try?
>>> Let me know if you need to see any code.
>>>
>>> Many thanks in advance for help, John.
>>>
>>> Environment:
>>> - AWS m4.2large - 16 VCPUs; 32G RAM
>>> - Application is using neo4j embedded in Java
>>> - neo4j-community-2.3.0
>>>
>>> Graph Size:
>>> - 15M Nodes - with properties: a name/string; some floating point values
>>> - 170M Relationships - with properties: 5 floating point values
>>> - approximately 15G of data
>>>
>>> Algorithm Intent:
>>> - Fetch every source node in turn (all 15M), its outgoing relationships and
>>>   connecting destination nodes
>>> - Calculate some statistics and parameters based on the properties in the
>>>   source and destination nodes and their connecting relationships
>>> - For every outgoing relationship update and write back the 5 floating point
>>>   properties
>>>
>>> Implementation & Runtime Details:
>>> - Using 8 threads in a thread pool to queue up the algorithm in batches
>>> - Batch sizes of 50,000 nodes
>>> - tx.success is therefore posted every 50k iterations of the algorithm after
>>>   touching 50k source nodes and about 500k destination nodes and 500k
>>>   relationships (~1000k objects + properties)
>>> - JVM params: -Xms16000m -Xmx24000m -XX:NewRatio=1 -XX:+UseG1G
>>> - neo4j-properties - everything commented out including -
>>>   #dbms.pagecache.memory=10g
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "Neo4j" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
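
For readers who want to try the fix described in this thread (smaller batches, one commit per batch, a fixed pool of 8 threads), here is a minimal Java sketch of the batching structure only. It is not John's code and deliberately contains no Neo4j calls: the class name BatchSketch and the methods partition, processBatch and runAll are hypothetical, and processBatch is a stand-in for wherever the real application would open a transaction, update the relationship properties, and mark the transaction successful. The figures (15M source nodes, 5k per batch, 8 threads) are the ones quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchSketch {

    /** Splits the index range [0, total) into consecutive [start, end) batches. */
    public static List<long[]> partition(long total, int batchSize) {
        List<long[]> batches = new ArrayList<>();
        for (long start = 0; start < total; start += batchSize) {
            batches.add(new long[] {start, Math.min(start + batchSize, total)});
        }
        return batches;
    }

    /**
     * Hypothetical placeholder for the per-batch work. In a real application,
     * this is where a single transaction would be opened, the source nodes in
     * [start, end) processed, and the transaction committed. Here it only
     * reports how many source nodes the batch covers.
     */
    public static long processBatch(long start, long end) {
        return end - start;
    }

    /** Runs every batch on a fixed-size pool and returns the total number of source nodes covered. */
    public static long runAll(long totalSourceNodes, int batchSize, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (long[] batch : partition(totalSourceNodes, batchSize)) {
                Callable<Long> task = () -> processBatch(batch[0], batch[1]);
                results.add(pool.submit(task));
            }
            long processed = 0;
            for (Future<Long> result : results) {
                processed += result.get(); // blocks until that batch has finished
            }
            return processed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Figures from the thread: 15M source nodes, 5k per batch, 8 threads.
        System.out.println(runAll(15_000_000L, 5_000, 8));
    }
}
```

Keeping the batch size as a single parameter makes it easy to experiment: with an average degree of about 10, a 5k batch touches on the order of 50k destination nodes and 50k relationships, which is the "less than 100k" working set John reports above.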
