Kay,

Just to make sure: You upgraded to 4.0.8 AND applied the patch, right?
(Because 4.0.8 does not include that fix)

Hm, a slow container can have many causes.
Some things that might help narrow down the issue:

* What exactly is slow, and in what situations: heavy container load, low load,
  no load?
* Is the persistence directory on a shared filesystem (e.g. NFS)?
* What are the container thread settings in
  $GLOBUS_LOCATION/etc/globus_wsrf_core/server-config.wsdd?
  We have seen cases where, with the default configuration, only 2 threads
  handled incoming requests.
  (See http://www.globus.org/toolkit/docs/4.0/execution/wsgram/WS_GRAM_Performance_Guide.html)
* Are job deletions involved when it's getting slow?
* Do you have a lot of persisted jobs/credentials/subscriptions in the
  persistence directory?
  (by default ~<containeruser>/.globus/persisted/)
* When the container is slow: a JVM thread dump might show us what the
  threads are actually doing.
  (kill -QUIT <container-pid>)
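
To collect the data points from the list above in one go, something like the
following might help. It is only a rough sketch for a POSIX shell: the function
names are hypothetical helpers (not Globus tools), the paths are the defaults
mentioned above, and taking several thread dumps a few seconds apart is my own
assumption, meant to show whether threads stay stuck in the same place.

```shell
#!/bin/sh
# Sketch only: these helper names are made up for illustration.

# Print the number of regular files below a persistence directory
# (prints 0 if the directory does not exist).
count_persisted() {
    dir="$1"
    if [ -d "$dir" ]; then
        find "$dir" -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Show which filesystem backs a directory. An NFS mount typically
# appears as host:/export in the first column of the df output.
show_filesystem() {
    df -P "$1"
}

# Print every line of a config file that mentions threads,
# prefixed with its line number.
show_thread_settings() {
    grep -n -i 'thread' "$1"
}

# Send SIGQUIT to a JVM process several times (default: 3 dumps,
# 10 seconds apart). The JVM keeps running and writes each thread
# dump to its standard output, i.e. wherever the container's stdout goes.
request_thread_dumps() {
    pid="$1"
    count="${2:-3}"
    interval="${3:-10}"
    i=0
    while [ "$i" -lt "$count" ]; do
        kill -QUIT "$pid" || return 1
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example usage (run as the container user, fill in the real pid):
#   count_persisted "$HOME/.globus/persisted"
#   show_filesystem "$HOME/.globus/persisted"
#   show_thread_settings "$GLOBUS_LOCATION/etc/globus_wsrf_core/server-config.wsdd"
#   request_thread_dumps <container-pid>
```

The block only defines functions, so sourcing it changes nothing on the
system until one of them is called.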

So, if you are willing to debug, we could try to track some things down.

Also: Is there a way for you to try 4.2.1? I'd argue that Gram4 has far less
potential in 4.2 to be the resource hog in the container. I can point you to
documentation that describes how to run Gram4 in a container that uses at
most 200M and is scalable nonetheless.
(This will be the default in 4.2.2.)

-Martin

Kay Dörnemann wrote:
Hi,

first of all I want to thank you. We had the same problem, and your suggested
fix helped at first, but after approximately 24h the container process pegged
100% of the CPU again until we restarted it. It is fully functional, but the
system is slow. The only error message we found in the logs is:
2009-01-25 22:01:29,820 ERROR container.ServiceThread [ServiceThread-5370,run:297] Unexpected error during request processing
java.lang.NullPointerException
        at org.globus.wsrf.container.GSIServiceThread.process(GSIServiceThread.java:151)
        at org.globus.wsrf.container.ServiceThread.run(ServiceThread.java:291)
2009-01-25 22:01:33,139 ERROR container.ServiceThread [ServiceThread-5353,run:297] Unexpected error during request processing
java.lang.NullPointerException
        at org.globus.wsrf.container.GSIServiceThread.process(GSIServiceThread.java:151)
        at org.globus.wsrf.container.ServiceThread.run(ServiceThread.java:291)

We upgraded from 4.0.5 to 4.0.8, but the problem still persists.

Any ideas?

Cheers,

Kay

Patrick Armstrong wrote, on 16.01.2009 23:05:
On 16-Jan-09, at 1:10 PM, Martin Feller wrote:
This sounds as if you are hitting
http://bugzilla.globus.org/globus/show_bug.cgi?id=6341
Yep! This was exactly the issue, as far as I can see. I patched the
system as described in the bug, and I get a nice container error now!
Much better than an infinite loop.

I made a patch, and if anyone is running into this issue, you can patch
your system by running the following on your globus server:

wget -qO- https://particle.phys.uvic.ca/~patricka/globus-gram-local-proxy-tool.patch \
  | patch $GLOBUS_LOCATION/libexec/globus-gram-local-proxy-tool

--patrick

