Lokesh,

nice machine that you have there.

I think you might have run into a long Garbage Collector pause, as you run 
Neo4j with 65GB heap it takes a while to do the full-stop garbage collection.

I would run Neo4j with at most 16GB heap here and rather configure the mmio 
settings in neo4j.properties to conform to your (expected) store sizes, esp. 
for nodestore and percentage-wise for rel-store and the property-stores.

013-12-22 16:45:25.635+0000 INFO  [o.n.k.i.DiagnosticsManager]: Total Physical 
memory: 251.78 GB
2013-12-22 16:45:25.635+0000 INFO  [o.n.k.i.DiagnosticsManager]: Free Physical 
memory: 198.02 GB
2013-12-22 16:45:25.635+0000 INFO  [o.n.k.i.DiagnosticsManager]: Committed 
virtual memory: 734.69 GB
2013-12-22 16:45:25.635+0000 INFO  [o.n.k.i.DiagnosticsManager]: Total swap 
space: 59.60 GB
2013-12-22 16:45:25.635+0000 INFO  [o.n.k.i.DiagnosticsManager]: Free swap 
space: 59.60 GB
2013-12-22 16:45:25.636+0000 INFO  [o.n.k.i.DiagnosticsManager]: JVM memory 
information:
2013-12-22 16:45:25.636+0000 INFO  [o.n.k.i.DiagnosticsManager]: Free  memory: 
60.69 GB
2013-12-22 16:45:25.636+0000 INFO  [o.n.k.i.DiagnosticsManager]: Total memory: 
61.33 GB
2013-12-22 16:45:25.636+0000 INFO  [o.n.k.i.DiagnosticsManager]: Max   memory: 
80.00 GB
2013-12-22 16:45:25.636+0000 INFO  [o.n.k.i.DiagnosticsManager]: Garbage 
Collector: PS Scavenge: [PS Eden Space, PS Survivor Space]
2013-12-22 16:45:25.636+0000 INFO  [o.n.k.i.DiagnosticsManager]: Garbage 
Collector: PS MarkSweep: [PS Eden Space, PS Survivor Space, PS Old Gen, PS Perm 
Gen]
2013-12-22 16:45:25.637+0000 INFO  [o.n.k.i.DiagnosticsManager]: Memory Pool: 
Code Cache (Non-heap memory): committed=2.44 MB, used=912.69 kB, max=48.00 MB, 
threshold=0.00 B
2013-12-22 16:45:25.637+0000 INFO  [o.n.k.i.DiagnosticsManager]: Memory Pool: 
PS Eden Space (Heap memory): committed=16.00 GB, used=655.36 MB, max=16.00 GB, 
threshold=?
2013-12-22 16:45:25.638+0000 INFO  [o.n.k.i.DiagnosticsManager]: Memory Pool: 
PS Survivor Space (Heap memory): committed=2.67 GB, used=0.00 B, max=2.67 GB, 
threshold=?
2013-12-22 16:45:25.638+0000 INFO  [o.n.k.i.DiagnosticsManager]: Memory Pool: 
PS Old Gen (Heap memory): committed=42.67 GB, used=0.00 B, max=384.00 GB, 
threshold=0.00 B
2013-12-22 16:45:25.638+0000 INFO  [o.n.k.i.DiagnosticsManager]: Memory Pool: 
PS Perm Gen (Non-heap memory): committed=66.59 MB, used=18.02 MB, max=512.00 
MB, threshold=0.00 B

You've got quite a lot of VM GC arguments, where did you get them from?

2013-12-22 16:45:25.640+0000 INFO  [o.n.k.i.DiagnosticsManager]: VM Arguments: 
[-XX:+DisableExplicitGC, 
-Dorg.neo4j.server.properties=conf/neo4j-server.properties, 
-Djava.util.logging.config.file=conf/logging.properties, 
-Dlog4j.configuration=file:conf/log4j.properties, -XX:ParallelGCThreads=48, 
-XX:+UseParallelOldGC, -XX:+UseNUMA, -XX:-UseAdaptiveNUMAChunkSizing, 
-XX:+UseAdaptiveSizePolicy, -XX:+BindGCTaskThreadsToCPUs, 
-XX:+UseGCTaskAffinity, -XX:-UseLargePages, -XX:-UseCompressedOops, 
-XX:-ParallelRefProcEnabled, -XX:MaxPermSize=512m, -Xms65536m, -Xmx65536m, 
-Dneo4j.home=/home/lokesh/code/neo4j-community-2.0.0, 
-Dneo4j.instance=/home/lokesh/code/neo4j-community-2.0.0, -Dfile.encoding=UTF-8]

I would probably go with just CMS for the time being.

I continue to investigate but cannot promise too much over the holidays.

Michael

Am 22.12.2013 um 21:07 schrieb Lokesh Gidra <[email protected]>:

> All the queries are read requests. I am just trying to performance test the 
> server. So I am using only shortestPath queries.
> 
> Please find attached the files. I have compressed the log dir as one of the 
> file was pretty big.
> 
> 
> Thanks,
> Lokesh
> 
> On Sunday, December 22, 2013 6:48:52 PM UTC+1, Michael Hunger wrote:
> Are these queries only reading or reading and writing?
> 
> can you produce a thread dump of you neo4j server when that happens ?
> 
> either send a kill -3 <pid>
> or use jstack <pid>
> 
> and send us the thread-dump and the content of your logfiles (data/log/* and 
> data/graph.db/messages.log)
> 
> Am 22.12.2013 um 14:06 schrieb Lokesh Gidra <[email protected]>:
> 
>> Hello,
>> 
>> I am running a neo4j-2.0.0 server on a linux machine with 48-cores. I run a 
>> python script on another machine. The script uses multiple threads to 
>> perform multiple shortestPath queries to the server. I am using py2neo 
>> package in the python script. In the beginning the queries are processed 
>> fine. I can see multiple cpus being used by neo4j server in the "top" 
>> output. But suddenly, the server gets into a serial phase where only 1 cpu 
>> is used 100%. During this time, the python also doesn't make any progress.
>> 
>> I am certain that the script is not faulty as sometimes this serial phase 
>> begins AFTER processing all the queries sent by the script, but BEFORE the 
>> script exits.
>> 
>> Can anyone please suggest me what causes this behaviour. And what can be 
>> done to avoid it.
>> 
>> 
>> Regards,
>> Lokesh
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/groups/opt_out.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
> <jstack.output><messages.log><log.tar.bz2>

-- 
You received this message because you are subscribed to the Google Groups 
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to