Hello Carl,

The failures we've seen occur anywhere between 8 hours and a week of
runtime. Most of the machines have (still) been running for almost a
month without failure. There are ~100 machines.

Off the top of my head, I think we've had 10+ failures now.

We have also had failures with hotspot error files (hs_err) present,
where the cause was indeed a SIGSEGV indicating a page fault. But I
don't know if the two are related.

We're also on JDK 1.6.0_18; I'm downgrading machines to 1.6.0_16 when
the situation allows (during regular updates of the application, or
after a crash) to see if that helps.

It might be useful to note that the failures happen with tomcat 6.0.20
as well as 6.0.24.

As far as load is concerned, I haven't had a failure on an idle machine.
The machines are well loaded, but run only at a fraction of their
capacity in terms of load and CPU utilization.
Most memory is committed to Tomcat, where a 24G machine has 18G
allocated to heap, 128M to permgen, and some unspecified amount used by
JNI for APR. About 4G remains free after taking the JVM itself into
account.
A 16G machine would have 12G allocated to the heap.
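A back-of-the-envelope version of that budget (the non-heap overhead
number here is my own rough estimate, not something I've measured):

```python
# Rough memory budget for one of the 24G machines (sizes in MB).
# The JVM/native overhead figure is an assumption: code cache, thread
# stacks, and the unspecified APR/JNI buffers mentioned above.
total_mb        = 24 * 1024   # physical RAM
heap_mb         = 18 * 1024   # -Xmx
permgen_mb      = 128         # -XX:MaxPermSize
jvm_overhead_mb = 1536        # assumed non-heap JVM footprint

free_mb = total_mb - heap_mb - permgen_mb - jvm_overhead_mb
print(free_mb)  # 4480 MB, i.e. roughly the ~4G we see free
```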

Besides the fact that our apps make heavy use of NIO and MINA, I
wouldn't say there's anything else noteworthy. There can be up to
10,000 concurrent connections on one machine.

I have searched for core dumps, but no luck. Running Tomcat in the
foreground might show something, but then again I could be waiting a
month for it to happen.
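In case it helps anyone reproduce this, one way to make sure a core
actually gets written next time (the paths and the one-week window are
just examples, adjust for your setup):

```shell
# Raise the core file limit in the shell/init script that launches
# Tomcat, before catalina.sh runs, so the JVM inherits it.
ulimit -c unlimited

# Optionally pin cores to a known location (needs root; path is an
# example):
#   echo '/var/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern

# After a crash, cores may land in the JVM's working directory or
# wherever core_pattern points; search the likely spots, last 7 days:
find /var/tmp /tmp . -maxdepth 2 -name 'core*' -mtime -7 2>/dev/null
```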

On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
> Taylan,
> 
> I am the person who started the "Tomcat dies suddenly" thread which I still 
> haven't resolved.  I am curious about the pattern of failures you are 
> experiencing because they may provide some clues to my problem.  In my case, 
> the system will run for 15 minutes to 10 days before failing (most of the 
> time it is several days to a week.)  It appears to die from a seg fault in 
> the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... you 
> may be able to see the cause of the failure from the core file (the core 
> files on my systems were in several directories so you may have to do a 
> 'find' to locate them.)  Load may be a factor but the failures generally 
> come after the load has been heavy for a while.  I am running a couple of 
> applications and it seems the failures are more frequent when people are 
> hitting the additional apps (the primary app is always used, the remaining 
> apps are used sporadically.)
> 
> How does this compare to what you are experiencing?
> 
> Thanks,
> 
> Carl
> 
> ----- Original Message ----- 
> From: "Taylan Develioglu" <tdevelio...@ebuddy.com>
> To: "Tomcat Users List" <users@tomcat.apache.org>; <p...@pidster.com>
> Sent: Wednesday, February 24, 2010 5:09 AM
> Subject: Re: jvm exits without trace
> 
> 
> > The GC log shows plenty of heap space left in all the spaces.
> >
> > I purposely didn't bother replacing the variables because I figured they
> > would not be relevant.
> >
> > But if you think they might provide clues they're as follows:
> >
> > JAVA_HEAP_SIZE=18432M
> > JAVA_EDEN_SIZE=$(($(echo $JAVA_HEAP_SIZE|sed 's/M$\|G$//')/6))M
> > JAVA_PERM_SIZE=128M
> > JAVA_STCK_SIZE=128K
> >
> > EDEN_SIZE is 1/6th of total heap.
> >
> > And I said there was nothing in the system logs.
> > But you get a couple of points for trying.
> >
> > On Wed, 2010-02-24 at 10:44 +0100, Pid wrote:
> >> On 24/02/2010 09:36, Taylan Develioglu wrote:
> >> > I thought I'd add the connector definitions too, :
> >> >
> >> >     <Connector port="80"
> >> >                protocol="org.apache.coyote.http11.Http11AprProtocol"
> >> >                compression="1024" keepAliveTimeout="60000"
> >> >                maxKeepAliveRequests="-1"
> >> >                enableLookups="false" redirectPort="443"
> >> >                maxThreads="150" pollerSize="32768"
> >> >                pollerThreadCount="4"/>
> >> >
> >> >      <Connector port="443"
> >> >                 protocol="org.apache.coyote.http11.Http11AprProtocol"
> >> >                 SSLEnabled="true"
> >> >                 enableLookups="false" maxThreads="10" scheme="https"
> >> >                 secure="true"
> >> >                 SSLCertificateFile="/etc/ssl/private/something.crt"
> >> >                 SSLCertificateKeyFile="/etc/ssl/private/something.key"
> >> >                 SSLCACertificateFile="/etc/ssl/certs/ca.crt"/>
> >> >
> >> >
> >> > On Wed, 2010-02-24 at 10:23 +0100, Taylan Develioglu wrote:
> >> >> Hi,
> >> >>
> >> >> I have jvm's, running tomcat and our application, exiting 
> >> >> mysteriously,
> >> >> and was wondering if anyone could give me some advice on how to debug
> >> >> this thing.
> >> >>
> >> >> There is nothing in catalina.out, nor our application logs, and no
> >> >> hotspot error file. GC log looks normal. No trace in system logs.
> >> >>
> >> >> I am left completely clueless :(, has anyone dealt with a problem like
> >> >> this before?
> >> >>
> >> >> Any help appreciated.
> >> >>
> >> >> - Tomcat 6.0.24
> >> >> - TC native 1.1.18
> >> >> - APR 1.3.9
> >> >> - Sun JDK 6u18
> >> >> - Debian Lenny, 2.6.31.10-amd64
> >> >>
> >> >> 2 servlets, one as ROOT. 2 HTTP connectors that use TCNative/APR.
> >> >>
> >> >> JAVA_OPTS ( ):
> >> >>
> >> >>      -verbose:gc
> >> >>      -Djava.awt.headless=true
> >> >>      -Dsun.net.inetaddr.ttl=60
> >> >>      -Dfile.encoding=UTF-8
> >> >>      -Djava.io.tmpdir=$TMP_DIR
> >> >>      -Djava.library.path=/usr/local/lib
> >> >>      -Djava.endorsed.dirs=$CATALINA_BASE/endorsed
> >> >>      -Dcatalina.base=$CATALINA_BASE
> >> >>      -Dcatalina.home=$CATALINA_HOME
> >> >>      -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
> >> >>      -Djava.util.logging.config.file="$CATALINA_BASE/conf/logging.properties"
> >> >>      -XX:+PrintGCDetails
> >> >>      -Xloggc:$CATALINA_BASE/logs/gc.log
> >> >>      -XX:+UseConcMarkSweepGC
> >> >>      -XX:CMSInitiatingOccupancyFraction=70
> >> >>      -Xms$JAVA_HEAP_SIZE
> >> >>      -Xmx$JAVA_HEAP_SIZE
> >> >>      -XX:NewSize=$JAVA_EDEN_SIZE
> >> >>      -XX:MaxNewSize=$JAVA_EDEN_SIZE
> >> >>      -XX:PermSize=$JAVA_PERM_SIZE
> >> >>      -XX:MaxPermSize=$JAVA_PERM_SIZE
> >> >>      -Xss$JAVA_STCK_SIZE
> >> >>      -XX:+UseLargePages
> >>
> >> There's no actual heap size settings in the above.  But you get a couple
> >> of points for trying.
> >>
> >> Google "Linux Out Of Memory killer" or "OOM Killer" and then check the
> >> server logs carefully.  (e.g. /var/log/messages)
> >>
> >>
> >> p
> >>
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >> > For additional commands, e-mail: users-h...@tomcat.apache.org
> >> >
> >>
> >>
> >
> >
> >
> > 
> 
> 


