Thanks Chris!

This is really helpful.

Bryan Wooten
Tel: (801)585-9323
Email: [email protected]<mailto:[email protected]>


From: Christopher Myers [mailto:[email protected]]
Sent: Monday, August 31, 2015 2:35 PM
To: [email protected]; Bryan Wooten
Subject: Re: [cas-user] Hazelcast / Slow CAS

In the past when I've run into things like this, I've started a VNC session on 
the server and let jvisualvm watch the tomcat process so that it could give me 
statistics on gc activity.
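
If a VNC session is not practical, roughly the same GC numbers can be sampled 
from inside the JVM. The following is only a sketch: the class name 
GcOverhead and the 10-second interval are made up for illustration, and it 
uses the standard java.lang.management beans rather than anything 
jvisualvm-specific.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: estimates what share of wall-clock time went to GC.
final class GcOverhead {
    /** Sum of accumulated collection time (ms) over all collectors in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Percentage of the elapsed interval spent in GC between two samples. */
    static double overheadPercent(long gcBeforeMs, long gcAfterMs, long elapsedMs) {
        if (elapsedMs <= 0) {
            return 0.0;
        }
        return 100.0 * (gcAfterMs - gcBeforeMs) / elapsedMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long intervalMs = 10_000; // arbitrary sampling interval
        long before = totalGcMillis();
        Thread.sleep(intervalMs);
        long after = totalGcMillis();
        System.out.println("GC overhead %: " + overheadPercent(before, after, intervalMs));
    }
}
```

This only measures the JVM it runs in, so to say anything about CAS it would 
have to be called from code running inside the Tomcat process. As far as I 
know, the time reported for a concurrent collector is not purely pause time, 
so treat the result as a rough indicator.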

For memory tuning, I spent roughly two months slowly tweaking the config for 
our (very active) cluster nodes (which also host our webmail and campus 
portal) and came up with:

-Xms6g -Xmx6g -Xss512k 
-Dorg.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER=true 
-XX:+UseCompressedOops -XX:MaxPermSize=256m -XX:NewRatio=3 -XX:SurvivorRatio=8 
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+DisableExplicitGC 
-XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled 
-XX:+CMSScavengeBeforeRemark -XX:CMSInitiatingOccupancyFraction=68

With these settings we routinely see regular minor collections and very few 
major collections, even after an extended period of high use.
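
To make the sizing flags a bit more concrete, here is a rough calculation of 
what they imply for a 6 GB heap. It is a sketch only: the class and method 
names are invented, and the real JVM rounds sizes to alignment boundaries, so 
the actual numbers will differ slightly.

```java
// Rough generation sizing implied by the flags above.
// Real sizes are rounded to alignment boundaries, so treat these as estimates.
final class GenSizing {
    /** NewRatio is old:young, so young = heap / (NewRatio + 1). */
    static double youngMb(double heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** SurvivorRatio is eden:survivor, and there are two survivor spaces. */
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    /** Old-generation occupancy at which a CMS cycle is started. */
    static double cmsTriggerMb(double oldMb, int occupancyPercent) {
        return oldMb * occupancyPercent / 100.0;
    }

    public static void main(String[] args) {
        double heap = 6 * 1024;                  // -Xms6g -Xmx6g
        double young = youngMb(heap, 3);         // -XX:NewRatio=3
        double survivor = survivorMb(young, 8);  // -XX:SurvivorRatio=8
        double eden = young - 2 * survivor;
        double old = heap - young;
        double trigger = cmsTriggerMb(old, 68);  // -XX:CMSInitiatingOccupancyFraction=68
        System.out.printf("young=%.1fM eden=%.1fM survivor=%.1fM old=%.1fM cmsTrigger=%.1fM%n",
                young, eden, survivor, old, trigger);
    }
}
```

That works out to a young generation of 1536 MB (about 1229 MB eden plus two 
survivor spaces of about 154 MB each), an old generation of 4608 MB, and a CMS 
cycle starting once the old generation is roughly 3133 MB full.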

For additional monitoring, we also run a home-built diagnostics page 
(attached) on all of our cluster nodes, which our load balancer polls. It 
reports things like used db threads, server connections, ldap connections, 
heap size, gc activity, etc.:




15:16pm up 19 days 5:55, 0 users, load average: 0.19, 0.31, 0.31


Connection to PROD ok : connections in use/idle/max: 18/4/25
Connection to Moodle ok : connections in use/idle/max: 1/1/25
Connection to Jira ok : connections in use/idle/max: 1/0/25
Connection to Diebold ok : connections in use/idle/max: 1/1/25


Connection to LDAP on mulinedir1 is ok.
Connection to LDAPS on mulinedir1 is ok.
Connection to LDAP on mulinedir2 is ok.
Connection to LDAPS on mulinedir2 is ok.

Java Heap in use/max: 2140M/5990M
Java non-Heap in use/max: 112M/304M

Number of Java threads: 177
Peak Java threads: 226


Garbage Collection: Copy: 5969
Garbage Collection: ConcurrentMarkSweep: 11


Waiting for I/O accept: org.apache.catalina.core.StandardServer


active internet connections (w/o servers)
proto  recv-q  send-q  local address            foreign address         state
tcp    9630    0       muwacnode1.millik:60700  muoradbprod.milli:6010  established
tcp    10200   0       muwacnode1.millik:54433  muoradbprod.milli:6010  established
tcp    10200   0       muwacnode1.millik:54433  muoradbprod.milli:6010  established
tcp    0       0       muwacnode1.millik:44428  muoradbprod.milli:6010  established
tcp    0       0       localhost:8009           localhost:40585         established
<snip/>

---------------------------
ESTABLISHED: 104
TIME_WAIT: 42
CLOSE_WAIT: 2
LDAP: 45
LDAPS: 0
HTTP: 0
HTTPS: 3
eDir1 Est: 7
eDir2 Est: 10


Filesystem                   Size  Used  Avail  Use%  Mounted on
/dev/sda4                    74G   17G   58G    23%   /
udev                         4.0G  96K   4.0G   1%    /dev
tmpfs                        4.0G  0     4.0G   0%    /dev/shm
/dev/sda1                    92M   21M   66M    25%   /boot
/dev/sda3                    4.0G  1.6G  2.5G   40%   /var
172.16.Y.X:/srv/www/htdocs   26G   8.7G  16G    36%   /srv/www/htdocs
172.16.Y.X:/var/export       4.0G  2.0G  1.9G   51%   /var/export
172.16.Y.X:/srv/deploy       26G   8.7G  16G    36%   /srv/deploy
172.16.Y.X:/mnt/data         26G   8.7G  16G    36%   /data
//muoesfile2/data            3.2T  2.6T  610G   81%   /mnt/oesfile2
myMILLIKIN project is deployed.
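
For anyone who wants something similar, the JVM lines in that output (heap, 
non-heap, threads, GC counts) can be produced with the standard 
java.lang.management beans. This is a minimal sketch and not the attached 
page itself: the class name JvmDiagnostics is hypothetical, and the db, LDAP, 
netstat and df sections are left out.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical sketch of the JVM portion of a diagnostics page.
final class JvmDiagnostics {
    /** Formats a byte count as whole megabytes; the max is -1 when undefined. */
    static String mb(long bytes) {
        return bytes < 0 ? "n/a" : (bytes / (1024 * 1024)) + "M";
    }

    static String usage(String label, MemoryUsage u) {
        return label + " in use/max: " + mb(u.getUsed()) + "/" + mb(u.getMax()) + "\n";
    }

    static String report() {
        StringBuilder out = new StringBuilder();
        out.append(usage("Java Heap",
                ManagementFactory.getMemoryMXBean().getHeapMemoryUsage()));
        out.append(usage("Java non-Heap",
                ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage()));

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        out.append("Number of Java threads: ").append(threads.getThreadCount()).append("\n");
        out.append("Peak Java threads: ").append(threads.getPeakThreadCount()).append("\n");

        // One line per collector, with its cumulative collection count.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append("Garbage Collection: ").append(gc.getName())
               .append(": ").append(gc.getCollectionCount()).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Called from a JSP or servlet inside Tomcat, report() describes the Tomcat 
JVM; run standalone it only describes its own short-lived process.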




Finally, we run JavaMelody on our cluster nodes as well, which gives some 
really good statistics (note that these stats also include our campus portal 
and webmail, not just CAS, but you get the idea):












>>> Bryan Wooten <[email protected]<mailto:[email protected]>> 08/31/15 2:58 PM >>>

Hi all,

So twice in the past few months CAS (3.5.2) has gotten really slow. A restart 
of the Tomcat servers makes the issue go away.

There are no errors in either cas.log or catalina.out; it is just really slow.

Because the issue occurs only in production and not in test I have never had 
time to attempt any kind of root cause analysis.

Our Hazelcast is configured to use 85% of the heap, which is set to 2048 MB. 
We get about 200k logins a day.

I think I may be running into a tomcat/jvm tuning issue (heap size or garbage 
collection issue).

Does anyone have suggestions on how I should monitor this or what config 
settings for Tomcat I should be using?

Thanks,

Bryan Wooten
Tel: (801)585-9323
Email: [email protected]<mailto:[email protected]>



--
You are currently subscribed to 
[email protected]<mailto:[email protected]> as: 
[email protected]<mailto:[email protected]>
To unsubscribe, change settings or access archives, see 
http://www.ja-sig.org/wiki/display/JSG/cas-user

