The short version:
I claim that in Init.instance()
this line:
| Init init = (Init) Contexts.getApplicationContext().get(Init.class);
|
should be replaced with :
| Init init = (Init)
Contexts.getApplicationContext().get("org.jboss.seam.core.init")
|
because the current version calles Class.getAnnotation which in turns calls
synchronized method and this results in poor behavior under heavy load.
The long version:
First some history (and a similar previous problem):
Over the past few months I've been working with a team converting an existing
webapp of moderate complexity to a JSF / Seam stack w/ Facelets.
It's been in production now for about a month and a half.
Running on jboss 4.0.5.GA w/ seam 1.2.0.patch1 Sun JSF RI 1.2 b4 (?)
Before we put it in production had to do some performance tuning. What we
started out with was about 10% as fast as the original app , which wasn't
acceptable, but eventually we got to about 60%, which was fine.
One undesirable behavior was that under moderately heavy load, when requests
were slowing down from their normal ~300ms to 3-4 seconds, the distribution of
request response time got very wide with some requests taking 5 seconds while
other, identical, requests were timing out at 5 minutes.
With some testing we tracked this down to the fact that
org/apache/catalina/core/ApplicationContext.get() is being called something
like 50K - 100K times per request in our app. Internally this call is
synchronized.
The solution we tried and eventually adopted was to edit the tomcat
ApplicationContext code and replace the HashMap with a ConcurrentHashMap and
to remove the synchronization in the get().
This resulted in something like a 30-40% performance improvement (if memory
serves), but more importantly it significantly narrowed the response time
curve. That is, the response time for requests ,under load, was a lot more
consistent.
(note 1: this makes sense since having lots of threads contending for a lock
can lead to starvation. In this case, the fact that any request has to "win"
the lock 100K times before it succeeds makes it more likely that starvation
will occur despite any anti-starvation attempts by the OS/VM)
(Note 2: This change not strictly speaking a safe thing to do, but it works
for us. (Tomcat 6 also takes this approach but presumably they've gone further
in validating that the rest of the app is safe with the concurrent hash map
semantics)).
THE CURRENT ISSUE:
Basically, in production under heavy load, we saw the same distribution of
request response time. (measured by mod-log-firstbyte in apache). Bascially
once the per-page response got to ~7-10 seconds / page some requests start to
take 100 - 300 seconds (there's some timeout in the system that caps things at
300 seconds). The situation then often spirals out of control with active
connections climbing to the 100s and inevitably resulting in an OOM.
This time a thread-dump of jboss showed that many threads were waiting in to
lock a specific lock in java.lang.Class.initAnnotationsIfNecessary().
Looking at the stack traces it became clear that the Class in question was
Init.class and that all these threads were calling Init.instance().
( Init.instance() is called ... everywhere... all the time).
The specific line that leads to the eventual lock contention is this one:
Init init = (Init) Contexts.getApplicationContext().get(Init.class);
Which internally leads to a call to get the value of the @Name annotation for
the Init.class so that it can then look up the instance by that name.
It is this lookup of the Name of the Init component that causes the contention.
Replacing the get(Init.class) with get("org.jboss.seam.core.init") bypasses
this lookup and also that synchronized method.
I've been running this modification on half of the servers for the past two
days and they are showing MUCH more linear performance under heavy load.
Basically when subjected to a stressor (GC pause, DB related pause, traffic
spike etc ) the modified machines return to normal much more quickly (say... 20
-30 seconds compared to 5-10 minutes) and they haven't shown a tendency to
spiral out of control towards OOM.
I generally like the compile time checking that one gets from using the
get(Class) version of this method, however in this specific case i think the
loss of safety is worth it given the deep loop nature of this call.
I'm curious as to what you guys think.
Thanks,
radu
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4042697#4042697
Reply to the post :
http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4042697
_______________________________________________
jboss-user mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/jboss-user