Re: [OT] Re: jvm exits without trace

2010-03-17 Thread Taylan Develioglu
Here's an hs_err file from a crash I had yesterday. We turned off some
things in our code without restarting, and the crashes have virtually
stopped, but we still get the odd one here and there where the
application has not been restarted. It could be that the problem lingers
and builds up over time, who knows.

It's a SIGSEGV in GCTaskThread. From the occupancy of eden it looks
like it happened during a scavenge (ParNew).

Maybe an expert in some dark cave could shed some more light on it.
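
For anyone skimming such a file, here is a minimal sketch of pulling the
two facts mentioned above (the signal and the crashing thread) out of
hs_err text. It assumes the usual HotSpot layout, where the header has a
line like "#  SIGSEGV (0xb) at pc=..." and the thread section starts with
"Current thread (0x...):". The class and method names are made up for
illustration, and the sample lines are hypothetical, not taken from the
attached file.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: extract the signal name and crashing thread from hs_err text. */
public class HsErrSummary {

    /** Returns the signal named on the "#  SIGSEGV (0xb) at pc=..." line, or null. */
    public static String signalOf(List<String> lines) {
        for (String line : lines) {
            String t = line.trim();
            // Header lines start with '#'; the signal line contains " at pc=".
            if (t.startsWith("#") && t.contains(" at pc=")) {
                String rest = t.substring(1).trim();
                int space = rest.indexOf(' ');
                return space < 0 ? rest : rest.substring(0, space);
            }
        }
        return null;
    }

    /** Returns the text after "Current thread (0x...):", or null if absent. */
    public static String currentThreadOf(List<String> lines) {
        for (String line : lines) {
            String t = line.trim();
            if (t.startsWith("Current thread")) {
                int colon = t.indexOf(':');
                return colon < 0 ? null : t.substring(colon + 1).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical excerpt for illustration only.
        List<String> sample = Arrays.asList(
            "#  SIGSEGV (0xb) at pc=0x00007f0000000000, pid=1234, tid=5678",
            "Current thread (0x0000000040000000):  GCTaskThread [stack: 0x1,0x2] [id=5678]");
        System.out.println(signalOf(sample));
        System.out.println(currentThreadOf(sample));
    }
}
```

The lines can come from any reader over the hs_err_pid*.log file; the
sketch only looks at the header and the "Current thread" line and ignores
the heap section.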


On Tue, 2010-03-16 at 22:00 +0100, André Warnier wrote:
 Carl wrote:
  My approach is to get something (a JVM) that works and then gradually 
  change until it breaks.  Then, I know what is causing the problem.  To 
  date, I haven't been able to get a JVM that works.
  
 I think we understand that, and agree.
 Our remarks were tongue in cheek, if that is the right expression.
 
 At the bottom of things, finding a bug in the most recent JVM would be 
 much more globally important than finding it in your applications, 
 particularly a bug that can cause the JVM to segfault.
 
 -
 To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
 For additional commands, e-mail: users-h...@tomcat.apache.org
 






Re: jvm exits without trace

2010-03-16 Thread Taylan Develioglu
By parent I meant the main JVM process as opposed to forked processes
or threads; sorry for the confusion. Stracing the threads generates
too much data to store, so I had to settle for the parent process.

To answer your other questions:

The code is 100% pure Java. Why it causes this messy crash is still
unclear, but development is working to figure it out.

I'll follow up when we find out more, but I'm not sure we're likely
to dig into the root cause. Working around it is more of a priority
right now than debugging the JVM.


On Mon, 2010-03-15 at 17:08 +0100, Christopher Schultz wrote:
 Taylan,
 
 On 3/15/2010 10:19 AM, Taylan Develioglu wrote:
  The cause for the crashes was in our own application code, we're
  currently investigating the exact reason.
 
 Yeah, I'd like to second Chuck's question: was it native code?
 
  A strace of the parent process shows killed by sigsegv, why or how this
  can happen is still unclear.
 
 So, the parent was being killed? What was the parent of the JVM?
 
  Thanks to everyone that gave their assistance.
 
 Definitely follow-up to let us all know what you've uncovered... this
 was certainly a weird situation.
 
 -chris
 



Re: jvm exits without trace

2010-03-15 Thread Taylan Develioglu
The cause of the crashes was in our own application code; we're
currently investigating the exact reason.

An strace of the parent process shows it was killed by SIGSEGV; why or
how this can happen is still unclear.

Thanks to everyone who gave their assistance.


On Thu, 2010-03-11 at 15:40 +0100, Taylan Develioglu wrote:
 Hi Carl, thanks for the suggestion. I am going to try jvm 1.6.07
 regardless of what I said before.
 
 Funny coincidence, I tried the ibm jvm as well and ran into a similar
 issue (part of our ssl implementation uses sun specific libraries).
 
 
 On Thu, 2010-03-11 at 12:38 +0100, Carl wrote:
  Taylan,
 
  I am currently trying JVM 1.6.0_7 per Chuck's suggestion and, so far (4
  days), it is working.
 
  I started down the IBM JVM path but have abandoned that for now due to
  difficulties with the SSL implementation (somne browsers would work and some
  wouldn't with seemingly the same setup.)
 
  Thanks,
 
  Carl
  - Original Message -
  From: Taylan Develioglu tdevelio...@ebuddy.com
  To: Tomcat Users List users@tomcat.apache.org
  Sent: Thursday, March 11, 2010 6:13 AM
  Subject: Re: jvm exits without trace
 
 
  a different kernel did not help either...
  
   On Thu, 2010-03-11 at 11:37 +0100, Taylan Develioglu wrote:
   Changing to JIO didn't help, the silent crashes continue.
  
   I'm changing kernel versions now.
  
   On Fri, 2010-03-05 at 10:45 +0100, Taylan Develioglu wrote:
It's performing rather poorly performance wise, compared to the apr
connector. The number of threads required to handle the requests has
gone up significantly over the board.
   
Stability wise, I don't have complaints yet.
   
I'm keeping my fingers crossed.
   
On Fri, 2010-03-05 at 10:09 +0100, Pid wrote:
 On 05/03/2010 08:41, Taylan Develioglu wrote:
  Pid, that would assume we had a working  1.6.10 version before
  that we
  replaced.

 That it would.

  We've run 1.6.10 upwards succesfully for a very long time. So I
  don't
  see the point in doing this.

 I must have missed that.

 How is the HTTP connector performing?


 p

  On Wed, 2010-03-03 at 12:00 +0100, Pid wrote:
  On 03/03/2010 09:11, Taylan Develioglu wrote:
  Downgrading to 1.6.0_16 did not help. I'm replacing the apr
  connector
  with http now.
 
  As Chuck mentioned in the other thread, significant changes
  occurred at
  1.6.10, so trying the release before (1.6.7) might be necessary to
  establish a better determination.
 
 
  p
 
  On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
  Taylan,
 
  The failures we've seen are in anywhere between 8 hours to a
  week of
  runtime.
 
  The timing of the failures seems similar.
 
  We have also had failures with hotspot error files (hs_err)
  present, and
  the cause specified was indeed SIGSEGV indicating a page fault.
 
  I have never seen any hs_* files but have seen core files where
  strace
  showed the jvm stopped on a seg fault.
 
  We also use jdk 1.6.0_18, I'm downgrading the machines to
  1.6.0_16 when
  the situation allows (during regular updates of the
  application, or a
  crash) to see if that helps.
 
  I have used jdk 1.6.0_17 and 1.6.0_18 with the same results...
  have not
  tried 1.6.0_16.  Please post your results of this trial.
 
  Running tomcat on the
  foreground might show something, but then again I could be
  waiting for a
  month for it to happen.
 
  Yes, this has been part of my problem as anytime we change
  something, we
  have to wait a week for the server to fail.
 
  In one sense, I am fortunate that I have a little more
  flexibility than you.
  I have two servers (different hardware) but only need one in
  service at a
  time.  Therefore, I always have one server I can test ideas on
  although I
  have never been able to develop a meaningful stress test, i.e.,
  the only way
  I can test a change is to put it in production.
 
  Thanks,
 
  Carl
 
  - Original Message -
  From: Taylan Develioglutdevelio...@ebuddy.com
  To: Tomcat Users Listusers@tomcat.apache.org
  Sent: Wednesday, February 24, 2010 8:31 AM
  Subject: Re: jvm exits without trace
 
 
  Hello Carl,
 
  The failures we've seen are in anywhere between 8 hours to a
  week of
  runtime. Most of them have (still) been running for almost a
  month
  without failure. There are ~100 machines.
 
   From the top of my head, I think we've had about 10+ failures
  now.
 
  We have also had failures with hotspot error files (hs_err)
  present, and
  the cause specified was indeed SIGSEGV indicating a page fault.
  But I
  don't know if the two are related

Re: jvm exits without trace

2010-03-11 Thread Taylan Develioglu
Changing to JIO didn't help; the silent crashes continue.

I'm changing kernel versions now.

On Fri, 2010-03-05 at 10:45 +0100, Taylan Develioglu wrote:
 It's performing rather poorly performance wise, compared to the apr
 connector. The number of threads required to handle the requests has
 gone up significantly over the board.
 
 Stability wise, I don't have complaints yet.
 
 I'm keeping my fingers crossed.
 
 On Fri, 2010-03-05 at 10:09 +0100, Pid wrote:
  On 05/03/2010 08:41, Taylan Develioglu wrote:
   Pid, that would assume we had a working  1.6.10 version before that we
   replaced.
 
  That it would.
 
   We've run 1.6.10 upwards succesfully for a very long time. So I don't
   see the point in doing this.
 
  I must have missed that.
 
  How is the HTTP connector performing?
 
 
  p
 
   On Wed, 2010-03-03 at 12:00 +0100, Pid wrote:
   On 03/03/2010 09:11, Taylan Develioglu wrote:
   Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector
   with http now.
  
   As Chuck mentioned in the other thread, significant changes occurred at
   1.6.10, so trying the release before (1.6.7) might be necessary to
   establish a better determination.
  
  
   p
  
   On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
   Taylan,
  
   The failures we've seen are in anywhere between 8 hours to a week of
   runtime.
  
   The timing of the failures seems similar.
  
   We have also had failures with hotspot error files (hs_err) present, 
   and
   the cause specified was indeed SIGSEGV indicating a page fault.
  
   I have never seen any hs_* files but have seen core files where strace
   showed the jvm stopped on a seg fault.
  
   We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 
   when
   the situation allows (during regular updates of the application, or a
   crash) to see if that helps.
  
   I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not
   tried 1.6.0_16.  Please post your results of this trial.
  
   Running tomcat on the
   foreground might show something, but then again I could be waiting 
   for a
   month for it to happen.
  
   Yes, this has been part of my problem as anytime we change something, 
   we
   have to wait a week for the server to fail.
  
   In one sense, I am fortunate that I have a little more flexibility 
   than you.
   I have two servers (different hardware) but only need one in service 
   at a
   time.  Therefore, I always have one server I can test ideas on 
   although I
   have never been able to develop a meaningful stress test, i.e., the 
   only way
   I can test a change is to put it in production.
  
   Thanks,
  
   Carl
  
   - Original Message -
   From: Taylan Develioglutdevelio...@ebuddy.com
   To: Tomcat Users Listusers@tomcat.apache.org
   Sent: Wednesday, February 24, 2010 8:31 AM
   Subject: Re: jvm exits without trace
  
  
   Hello Carl,
  
   The failures we've seen are in anywhere between 8 hours to a week of
   runtime. Most of them have (still) been running for almost a month
   without failure. There are ~100 machines.
  
From the top of my head, I think we've had about 10+ failures now.
  
   We have also had failures with hotspot error files (hs_err) present, 
   and
   the cause specified was indeed SIGSEGV indicating a page fault. But I
   don't know if the two are related.
  
   We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 
   when
   the situation allows (during regular updates of the application, or a
   crash) to see if that helps.
  
   It might be useful to note that the failures happen with tomcat 6.0.20
   as well as 6.0.24.
  
   As far as load concerns, I haven't had a failure on an idle machines.
   The machines are well loaded, but only at a fraction limit in regards 
   to
   load and cpu utilization.
   Most memory is commited to tomcat, where a 24G machine would have 18G
   allocated to heap, 128M to permgen and some unspecified amount would 
   get
   used by jni for apr. About 4G remains free after calculating taking 
   into
   account the jvm itsself.
   A 16G machine would have 12G allocated to the heap.
  
   Besides the fact that our apps heavily use nio and mina I wouldn't say
   there's anything else noteworthy. There can be anywhere up to 1
   concurrents on one machine.
  
   I had searched for coredumps, but no luck. Running tomcat on the
   foreground might show something, but then again I could be waiting 
   for a
   month for it to happen.
  
   On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
   Taylan,
  
   I am the person who started the Tomcat dies suddenly thread which I
   still
   haven't resolved.  I am curious about the pattern of failures you are
   experiencing because they may provide some clues to my problem.  In 
   my
   case,
   the system will run for 15 minutes to 10 days before failing (most 
   of the
   time it is several days to a week.)  It appears to die from a seg 
   fault
   in
   the JVM (I am using Sun

Re: jvm exits without trace

2010-03-11 Thread Taylan Develioglu
A different kernel did not help either...

On Thu, 2010-03-11 at 11:37 +0100, Taylan Develioglu wrote:
 Changing to JIO didn't help, the silent crashes continue.
 
 I'm changing kernel versions now.
 
 On Fri, 2010-03-05 at 10:45 +0100, Taylan Develioglu wrote:
  It's performing rather poorly performance wise, compared to the apr
  connector. The number of threads required to handle the requests has
  gone up significantly over the board.
 
  Stability wise, I don't have complaints yet.
 
  I'm keeping my fingers crossed.
 
  On Fri, 2010-03-05 at 10:09 +0100, Pid wrote:
   On 05/03/2010 08:41, Taylan Develioglu wrote:
Pid, that would assume we had a working  1.6.10 version before that we
replaced.
  
   That it would.
  
We've run 1.6.10 upwards succesfully for a very long time. So I don't
see the point in doing this.
  
   I must have missed that.
  
   How is the HTTP connector performing?
  
  
   p
  
On Wed, 2010-03-03 at 12:00 +0100, Pid wrote:
On 03/03/2010 09:11, Taylan Develioglu wrote:
Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector
with http now.
   
As Chuck mentioned in the other thread, significant changes occurred at
1.6.10, so trying the release before (1.6.7) might be necessary to
establish a better determination.
   
   
p
   
On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
Taylan,
   
The failures we've seen are in anywhere between 8 hours to a week of
runtime.
   
The timing of the failures seems similar.
   
We have also had failures with hotspot error files (hs_err) 
present, and
the cause specified was indeed SIGSEGV indicating a page fault.
   
I have never seen any hs_* files but have seen core files where 
strace
showed the jvm stopped on a seg fault.
   
We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 
when
the situation allows (during regular updates of the application, or 
a
crash) to see if that helps.
   
I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have 
not
tried 1.6.0_16.  Please post your results of this trial.
   
Running tomcat on the
foreground might show something, but then again I could be waiting 
for a
month for it to happen.
   
Yes, this has been part of my problem as anytime we change 
something, we
have to wait a week for the server to fail.
   
In one sense, I am fortunate that I have a little more flexibility 
than you.
I have two servers (different hardware) but only need one in service 
at a
time.  Therefore, I always have one server I can test ideas on 
although I
have never been able to develop a meaningful stress test, i.e., the 
only way
I can test a change is to put it in production.
   
Thanks,
   
Carl
   
- Original Message -
From: Taylan Develioglutdevelio...@ebuddy.com
To: Tomcat Users Listusers@tomcat.apache.org
Sent: Wednesday, February 24, 2010 8:31 AM
Subject: Re: jvm exits without trace
   
   
Hello Carl,
   
The failures we've seen are in anywhere between 8 hours to a week of
runtime. Most of them have (still) been running for almost a month
without failure. There are ~100 machines.
   
 From the top of my head, I think we've had about 10+ failures now.
   
We have also had failures with hotspot error files (hs_err) 
present, and
the cause specified was indeed SIGSEGV indicating a page fault. But 
I
don't know if the two are related.
   
We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 
when
the situation allows (during regular updates of the application, or 
a
crash) to see if that helps.
   
It might be useful to note that the failures happen with tomcat 
6.0.20
as well as 6.0.24.
   
As far as load concerns, I haven't had a failure on an idle 
machines.
The machines are well loaded, but only at a fraction limit in 
regards to
load and cpu utilization.
Most memory is commited to tomcat, where a 24G machine would have 
18G
allocated to heap, 128M to permgen and some unspecified amount 
would get
used by jni for apr. About 4G remains free after calculating taking 
into
account the jvm itsself.
A 16G machine would have 12G allocated to the heap.
   
Besides the fact that our apps heavily use nio and mina I wouldn't 
say
there's anything else noteworthy. There can be anywhere up to 1
concurrents on one machine.
   
I had searched for coredumps, but no luck. Running tomcat on the
foreground might show something, but then again I could be waiting 
for a
month for it to happen.
   
On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
Taylan,
   
I am the person who started the Tomcat dies suddenly thread 
which I
still
haven't resolved.  I am curious about the pattern

Re: jvm exits without trace

2010-03-11 Thread Taylan Develioglu
Hi Carl, thanks for the suggestion. I am going to try JVM 1.6.0_07
regardless of what I said before.

Funny coincidence: I tried the IBM JVM as well and ran into a similar
issue (part of our SSL implementation uses Sun-specific libraries).


On Thu, 2010-03-11 at 12:38 +0100, Carl wrote:
 Taylan,
 
 I am currently trying JVM 1.6.0_7 per Chuck's suggestion and, so far (4
 days), it is working.
 
 I started down the IBM JVM path but have abandoned that for now due to
 difficulties with the SSL implementation (somne browsers would work and some
 wouldn't with seemingly the same setup.)
 
 Thanks,
 
 Carl
 - Original Message -
 From: Taylan Develioglu tdevelio...@ebuddy.com
 To: Tomcat Users List users@tomcat.apache.org
 Sent: Thursday, March 11, 2010 6:13 AM
 Subject: Re: jvm exits without trace
 
 
 a different kernel did not help either...
 
  On Thu, 2010-03-11 at 11:37 +0100, Taylan Develioglu wrote:
  Changing to JIO didn't help, the silent crashes continue.
 
  I'm changing kernel versions now.
 
  On Fri, 2010-03-05 at 10:45 +0100, Taylan Develioglu wrote:
   It's performing rather poorly performance wise, compared to the apr
   connector. The number of threads required to handle the requests has
   gone up significantly over the board.
  
   Stability wise, I don't have complaints yet.
  
   I'm keeping my fingers crossed.
  
   On Fri, 2010-03-05 at 10:09 +0100, Pid wrote:
On 05/03/2010 08:41, Taylan Develioglu wrote:
 Pid, that would assume we had a working  1.6.10 version before
 that we
 replaced.
   
That it would.
   
 We've run 1.6.10 upwards succesfully for a very long time. So I
 don't
 see the point in doing this.
   
I must have missed that.
   
How is the HTTP connector performing?
   
   
p
   
 On Wed, 2010-03-03 at 12:00 +0100, Pid wrote:
 On 03/03/2010 09:11, Taylan Develioglu wrote:
 Downgrading to 1.6.0_16 did not help. I'm replacing the apr
 connector
 with http now.

 As Chuck mentioned in the other thread, significant changes
 occurred at
 1.6.10, so trying the release before (1.6.7) might be necessary to
 establish a better determination.


 p

 On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
 Taylan,

 The failures we've seen are in anywhere between 8 hours to a
 week of
 runtime.

 The timing of the failures seems similar.

 We have also had failures with hotspot error files (hs_err)
 present, and
 the cause specified was indeed SIGSEGV indicating a page fault.

 I have never seen any hs_* files but have seen core files where
 strace
 showed the jvm stopped on a seg fault.

 We also use jdk 1.6.0_18, I'm downgrading the machines to
 1.6.0_16 when
 the situation allows (during regular updates of the
 application, or a
 crash) to see if that helps.

 I have used jdk 1.6.0_17 and 1.6.0_18 with the same results...
 have not
 tried 1.6.0_16.  Please post your results of this trial.

 Running tomcat on the
 foreground might show something, but then again I could be
 waiting for a
 month for it to happen.

 Yes, this has been part of my problem as anytime we change
 something, we
 have to wait a week for the server to fail.

 In one sense, I am fortunate that I have a little more
 flexibility than you.
 I have two servers (different hardware) but only need one in
 service at a
 time.  Therefore, I always have one server I can test ideas on
 although I
 have never been able to develop a meaningful stress test, i.e.,
 the only way
 I can test a change is to put it in production.

 Thanks,

 Carl

 - Original Message -
 From: Taylan Develioglutdevelio...@ebuddy.com
 To: Tomcat Users Listusers@tomcat.apache.org
 Sent: Wednesday, February 24, 2010 8:31 AM
 Subject: Re: jvm exits without trace


 Hello Carl,

 The failures we've seen are in anywhere between 8 hours to a
 week of
 runtime. Most of them have (still) been running for almost a
 month
 without failure. There are ~100 machines.

  From the top of my head, I think we've had about 10+ failures
 now.

 We have also had failures with hotspot error files (hs_err)
 present, and
 the cause specified was indeed SIGSEGV indicating a page fault.
 But I
 don't know if the two are related.

 We also use jdk 1.6.0_18, I'm downgrading the machines to
 1.6.0_16 when
 the situation allows (during regular updates of the
 application, or a
 crash) to see if that helps.

 It might be useful to note that the failures happen with tomcat
 6.0.20
 as well as 6.0.24.

 As far as load concerns, I haven't had a failure on an idle
 machines.
 The machines are well loaded, but only at a fraction limit

Re: jvm exits without trace

2010-03-10 Thread Taylan Develioglu
Sorry, I wasn't clear.

I didn't mean 2172 concurrent requests, just sessions.
It hadn't occurred to me that the number of sessions does not necessarily
equal the number of connections (duh).

The number of established connections does indeed equal the number of
threads, so what Chuck said was true.
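
To spell out the arithmetic, here is a rough sketch of the two connector
models Chuck described. It is only a toy model under stated assumptions,
not Tomcat code: the class and method names are invented, and the figure
of 50 requests in flight is an assumption borrowed from the peak
busy-thread count reported for APR.

```java
/** Sketch: expected busy worker threads under two connector models. */
public class BusyThreadModel {

    /** Blocking (JIO-style): each established connection pins one worker, capped by maxThreads. */
    public static int blockingBusyThreads(int establishedConnections, int maxThreads) {
        return Math.min(establishedConnections, maxThreads);
    }

    /**
     * Poller-based (APR/NIO-style): idle keep-alive connections wait in the
     * poller, so only requests actually in flight occupy a worker.
     */
    public static int pollerBusyThreads(int requestsInFlight, int maxThreads) {
        return Math.min(requestsInFlight, maxThreads);
    }

    public static void main(String[] args) {
        int established = 637; // reported in this thread: busy threads == established connections
        int maxThreads = 720;  // from the JIO connector definition posted in this thread
        int inFlight = 50;     // assumption for illustration: the APR-era peak busy-thread count

        System.out.println(blockingBusyThreads(established, maxThreads)); // prints 637
        System.out.println(pollerBusyThreads(inFlight, maxThreads));      // prints 50
    }
}
```

The 2172 sessions do not appear in either formula, which is the point:
sessions outlive connections, so they say little about thread usage.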




On Tue, 2010-03-09 at 19:29 +0100, André Warnier wrote:
 Taylan Develioglu wrote:
  Chuck, if that is true how can we explain I see only 637 busy threads on
  a server that is serving 2172 clients ?
 
 Woaw ! can you give us your trick ?
 
  
  If every connection requires its own thread there should be 2172
  threads.
 
 Seriously now : when a thread is finished serving a request, there is 
 still some time during which the response bytes are cascading through 
 the network to the clients.
 I think you need to defined serving 2172 clients a bit more precisely 
 before you can say this, no ?
 
 
  
  On Tue, 2010-03-09 at 16:40 +0100, Caldarale, Charles R wrote:
  From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
  Subject: RE: jvm exits without trace
 
  where peak busy-threads used to be ~50 with APR, now it has become ~200
  with JIO.
  To be expected when you have unlimited keep-alives configured.  Each HTTP 
  connection requires a separate thread with JIO, whereas the NIO and APR 
  connectors use a single poller thread.
 
   - Chuck
 
 
  THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY 
  MATERIAL and is thus for use only by the intended recipient. If you 
  received this in error, please contact the sender and delete the e-mail 
  and its attachments from all computers.
 
 
  __
  This email has been scanned by the MessageLabs Email Security System.
  For more information please visit http://www.messagelabs.com/email 
  __
  
  
  



RE: jvm exits without trace

2010-03-09 Thread Taylan Develioglu
The switch is from APR to JIO. SSL is practically unused.

Almost all pages served are JSP or Java; very few static files are
served, and keep-alive is on.

Where peak busy-threads used to be ~50 with APR, it has now become ~200
with JIO.

Here are the connector definitions for reference (no executor is used):

- APR:

<Connector port="80"
           protocol="org.apache.coyote.http11.Http11AprProtocol"
           compression="1024" keepAliveTimeout="6"
           maxKeepAliveRequests="-1"
           enableLookups="false" redirectPort="443" maxThreads="150"
           pollerSize="32768"
           />

- JIO:

<Connector port="80"
           protocol="org.apache.coyote.http11.Http11Protocol"
           compression="1024" connectionTimeout="1"
           keepAliveTimeout="6" maxKeepAliveRequests="-1"
           enableLookups="false" redirectPort="443"
           maxThreads="720"/>


On Fri, 2010-03-05 at 19:13 +0100, Caldarale, Charles R wrote:
  From: Christopher Schultz [mailto:ch...@christopherschultz.net]
  Subject: Re: jvm exits without trace
  
  I thought he said he was using APR, not NIO.
 
 He was, but IIRC, switched away from it to see if that would affect the 
 outages.  What we don't know is what was switched to - JIO or NIO.  If it's 
 JIO, there may be a lot of threads tied up handling persistent HTTP 
 connections, possibly causing heap or other resource problems.
 
  - Chuck
 
 
 






RE: jvm exits without trace

2010-03-09 Thread Taylan Develioglu
Chuck, if that is true, how do we explain that I see only 637 busy
threads on a server that is serving 2172 clients?

If every connection requires its own thread, there should be 2172
threads.

On Tue, 2010-03-09 at 16:40 +0100, Caldarale, Charles R wrote:
  From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
  Subject: RE: jvm exits without trace
  
  where peak busy-threads used to be ~50 with APR, now it has become ~200
  with JIO.
 
 To be expected when you have unlimited keep-alives configured.  Each HTTP 
 connection requires a separate thread with JIO, whereas the NIO and APR 
 connectors use a single poller thread.
 
  - Chuck
 
 






Re: jvm exits without trace

2010-03-05 Thread Taylan Develioglu
Pid, that would assume we had a working pre-1.6.10 version that we
then replaced.

We've run 1.6.10 and upwards successfully for a very long time, so I
don't see the point in doing this.

On Wed, 2010-03-03 at 12:00 +0100, Pid wrote:
 On 03/03/2010 09:11, Taylan Develioglu wrote:
  Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector
  with http now.
 
 As Chuck mentioned in the other thread, significant changes occurred at 
 1.6.10, so trying the release before (1.6.7) might be necessary to 
 establish a better determination.
 
 
 p
 
  On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
  Taylan,
 
  The failures we've seen are in anywhere between 8 hours to a week of
  runtime.
 
  The timing of the failures seems similar.
 
  We have also had failures with hotspot error files (hs_err) present, and
  the cause specified was indeed SIGSEGV indicating a page fault.
 
  I have never seen any hs_* files but have seen core files where strace
  showed the jvm stopped on a seg fault.
 
  We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
  the situation allows (during regular updates of the application, or a
  crash) to see if that helps.
 
  I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not
  tried 1.6.0_16.  Please post your results of this trial.
 
  Running tomcat on the
  foreground might show something, but then again I could be waiting for a
  month for it to happen.
 
  Yes, this has been part of my problem as anytime we change something, we
  have to wait a week for the server to fail.
 
  In one sense, I am fortunate that I have a little more flexibility than 
  you.
  I have two servers (different hardware) but only need one in service at a
  time.  Therefore, I always have one server I can test ideas on although I
  have never been able to develop a meaningful stress test, i.e., the only 
  way
  I can test a change is to put it in production.
 
  Thanks,
 
  Carl
 
  - Original Message -
  From: Taylan Develioglutdevelio...@ebuddy.com
  To: Tomcat Users Listusers@tomcat.apache.org
  Sent: Wednesday, February 24, 2010 8:31 AM
  Subject: Re: jvm exits without trace
 
 
  Hello Carl,
 
  The failures we've seen are in anywhere between 8 hours to a week of
  runtime. Most of them have (still) been running for almost a month
  without failure. There are ~100 machines.
 
   From the top of my head, I think we've had about 10+ failures now.
 
  We have also had failures with hotspot error files (hs_err) present, and
  the cause specified was indeed SIGSEGV indicating a page fault. But I
  don't know if the two are related.
 
  We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
  the situation allows (during regular updates of the application, or a
  crash) to see if that helps.
 
  It might be useful to note that the failures happen with tomcat 6.0.20
  as well as 6.0.24.
 
  As far as load concerns, I haven't had a failure on an idle machines.
  The machines are well loaded, but only at a fraction limit in regards to
  load and cpu utilization.
  Most memory is commited to tomcat, where a 24G machine would have 18G
  allocated to heap, 128M to permgen and some unspecified amount would get
  used by jni for apr. About 4G remains free after calculating taking into
  account the jvm itsself.
  A 16G machine would have 12G allocated to the heap.
 
  Besides the fact that our apps heavily use nio and mina I wouldn't say
  there's anything else noteworthy. There can be anywhere up to 1
  concurrents on one machine.
 
  I had searched for coredumps, but no luck. Running tomcat on the
  foreground might show something, but then again I could be waiting for a
  month for it to happen.
 
  On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
  Taylan,
 
  I am the person who started the Tomcat dies suddenly thread which I
  still
  haven't resolved.  I am curious about the pattern of failures you are
  experiencing because they may provide some clues to my problem.  In my
  case,
  the system will run for 15 minutes to 10 days before failing (most of the
  time it is several days to a week.)  It appears to die from a seg fault
  in
  the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... you
  may be able to see the cause of the failure from the core file (the core
  files on my systems were in several directories so you may have to do a
  'find' to locate them.)  Load may be a factor but the failures generally
  come after the load has been heavy for a while.  I am running a couple of
  applications and it seems the failures are more frequent when people are
  hitting the additional apps (the primary app is always used, the
  remaining
  apps are used sporadically.)
 
  How does this compare to what you are experiencing?
 
  Thanks,
 
  Carl
 
  - Original Message -
  From: Taylan Develioglutdevelio...@ebuddy.com
  To: Tomcat Users Listusers@tomcat.apache.org;p...@pidster.com
  Sent: Wednesday, February 24, 2010 5

Re: jvm exits without trace

2010-03-05 Thread Taylan Develioglu
It's performing rather poorly compared to the apr connector. The number
of threads required to handle the requests has gone up significantly
across the board.

Stability wise, I don't have complaints yet. 

I'm keeping my fingers crossed.

On Fri, 2010-03-05 at 10:09 +0100, Pid wrote:
 On 05/03/2010 08:41, Taylan Develioglu wrote:
  Pid, that would assume we had a working  1.6.10 version before that we
  replaced.
 
 That it would.
 
  We've run 1.6.10 upwards succesfully for a very long time. So I don't
  see the point in doing this.
 
 I must have missed that.
 
 How is the HTTP connector performing?
 
 
 p
 
  On Wed, 2010-03-03 at 12:00 +0100, Pid wrote:
  On 03/03/2010 09:11, Taylan Develioglu wrote:
  Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector
  with http now.
 
  As Chuck mentioned in the other thread, significant changes occurred at
  1.6.10, so trying the release before (1.6.7) might be necessary to
  establish a better determination.
 
 
  p
 
  On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
  Taylan,
 
  The failures we've seen are in anywhere between 8 hours to a week of
  runtime.
 
  The timing of the failures seems similar.
 
  We have also had failures with hotspot error files (hs_err) present, and
  the cause specified was indeed SIGSEGV indicating a page fault.
 
  I have never seen any hs_* files but have seen core files where strace
  showed the jvm stopped on a seg fault.
 
  We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
  the situation allows (during regular updates of the application, or a
  crash) to see if that helps.
 
  I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not
  tried 1.6.0_16.  Please post your results of this trial.
 
  Running tomcat on the
  foreground might show something, but then again I could be waiting for a
  month for it to happen.
 
  Yes, this has been part of my problem as anytime we change something, we
  have to wait a week for the server to fail.
 
  In one sense, I am fortunate that I have a little more flexibility than 
  you.
  I have two servers (different hardware) but only need one in service at a
  time.  Therefore, I always have one server I can test ideas on although I
  have never been able to develop a meaningful stress test, i.e., the only 
  way
  I can test a change is to put it in production.
 
  Thanks,
 
  Carl
 
  - Original Message -
  From: Taylan Develioglutdevelio...@ebuddy.com
  To: Tomcat Users Listusers@tomcat.apache.org
  Sent: Wednesday, February 24, 2010 8:31 AM
  Subject: Re: jvm exits without trace
 
 
  Hello Carl,
 
  The failures we've seen are in anywhere between 8 hours to a week of
  runtime. Most of them have (still) been running for almost a month
  without failure. There are ~100 machines.
 
   From the top of my head, I think we've had about 10+ failures now.
 
  We have also had failures with hotspot error files (hs_err) present, and
  the cause specified was indeed SIGSEGV indicating a page fault. But I
  don't know if the two are related.
 
  We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
  the situation allows (during regular updates of the application, or a
  crash) to see if that helps.
 
  It might be useful to note that the failures happen with tomcat 6.0.20
  as well as 6.0.24.
 
  As far as load concerns, I haven't had a failure on an idle machines.
  The machines are well loaded, but only at a fraction limit in regards to
  load and cpu utilization.
  Most memory is commited to tomcat, where a 24G machine would have 18G
  allocated to heap, 128M to permgen and some unspecified amount would get
  used by jni for apr. About 4G remains free after calculating taking into
  account the jvm itsself.
  A 16G machine would have 12G allocated to the heap.
 
  Besides the fact that our apps heavily use nio and mina I wouldn't say
  there's anything else noteworthy. There can be anywhere up to 1
  concurrents on one machine.
 
  I had searched for coredumps, but no luck. Running tomcat on the
  foreground might show something, but then again I could be waiting for a
  month for it to happen.
 
  On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
  Taylan,
 
  I am the person who started the Tomcat dies suddenly thread which I
  still
  haven't resolved.  I am curious about the pattern of failures you are
  experiencing because they may provide some clues to my problem.  In my
  case,
  the system will run for 15 minutes to 10 days before failing (most of 
  the
  time it is several days to a week.)  It appears to die from a seg fault
  in
  the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... 
  you
  may be able to see the cause of the failure from the core file (the 
  core
  files on my systems were in several directories so you may have to do a
  'find' to locate them.)  Load may be a factor but the failures 
  generally
  come after the load has been heavy for a while

Re: jvm exits without trace

2010-03-03 Thread Taylan Develioglu
Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector
with http now.

On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
 Taylan,
 
  The failures we've seen are in anywhere between 8 hours to a week of
  runtime.
 
 The timing of the failures seems similar.
 
  We have also had failures with hotspot error files (hs_err) present, and
  the cause specified was indeed SIGSEGV indicating a page fault.
 
 I have never seen any hs_* files but have seen core files where strace 
 showed the jvm stopped on a seg fault.
 
  We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
  the situation allows (during regular updates of the application, or a
  crash) to see if that helps.
 
 I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not 
 tried 1.6.0_16.  Please post your results of this trial.
 
  Running tomcat on the
  foreground might show something, but then again I could be waiting for a
  month for it to happen.
 
 Yes, this has been part of my problem as anytime we change something, we 
 have to wait a week for the server to fail.
 
 In one sense, I am fortunate that I have a little more flexibility than you. 
 I have two servers (different hardware) but only need one in service at a 
 time.  Therefore, I always have one server I can test ideas on although I 
 have never been able to develop a meaningful stress test, i.e., the only way 
 I can test a change is to put it in production.
 
 Thanks,
 
 Carl
 
 - Original Message - 
 From: Taylan Develioglu tdevelio...@ebuddy.com
 To: Tomcat Users List users@tomcat.apache.org
 Sent: Wednesday, February 24, 2010 8:31 AM
 Subject: Re: jvm exits without trace
 
 
  Hello Carl,
 
  The failures we've seen are in anywhere between 8 hours to a week of
  runtime. Most of them have (still) been running for almost a month
  without failure. There are ~100 machines.
 
 From the top of my head, I think we've had about 10+ failures now.
 
  We have also had failures with hotspot error files (hs_err) present, and
  the cause specified was indeed SIGSEGV indicating a page fault. But I
  don't know if the two are related.
 
  We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
  the situation allows (during regular updates of the application, or a
  crash) to see if that helps.
 
  It might be useful to note that the failures happen with tomcat 6.0.20
  as well as 6.0.24.
 
  As far as load concerns, I haven't had a failure on an idle machines.
  The machines are well loaded, but only at a fraction limit in regards to
  load and cpu utilization.
  Most memory is commited to tomcat, where a 24G machine would have 18G
  allocated to heap, 128M to permgen and some unspecified amount would get
  used by jni for apr. About 4G remains free after calculating taking into
  account the jvm itsself.
  A 16G machine would have 12G allocated to the heap.
 
  Besides the fact that our apps heavily use nio and mina I wouldn't say
  there's anything else noteworthy. There can be anywhere up to 1
  concurrents on one machine.
 
  I had searched for coredumps, but no luck. Running tomcat on the
  foreground might show something, but then again I could be waiting for a
  month for it to happen.
 
  On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
  Taylan,
 
  I am the person who started the Tomcat dies suddenly thread which I 
  still
  haven't resolved.  I am curious about the pattern of failures you are
  experiencing because they may provide some clues to my problem.  In my 
  case,
  the system will run for 15 minutes to 10 days before failing (most of the
  time it is several days to a week.)  It appears to die from a seg fault 
  in
  the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... you
  may be able to see the cause of the failure from the core file (the core
  files on my systems were in several directories so you may have to do a
  'find' to locate them.)  Load may be a factor but the failures generally
  come after the load has been heavy for a while.  I am running a couple of
  applications and it seems the failures are more frequent when people are
  hitting the additional apps (the primary app is always used, the 
  remaining
  apps are used sporadically.)
 
  How does this compare to what you are experiencing?
 
  Thanks,
 
  Carl
 
  - Original Message - 
  From: Taylan Develioglu tdevelio...@ebuddy.com
  To: Tomcat Users List users@tomcat.apache.org; p...@pidster.com
  Sent: Wednesday, February 24, 2010 5:09 AM
  Subject: Re: jvm exits without trace
 
 
   The GC log shows plenty of heap space left in all the spaces.
  
   I purposely didn't bother replacing the variables because I figured 
   they
   would not be relevant.
  
   But if you think they might provide clues they're as follows:
  
   JAVA_HEAP_SIZE=18432M
   JAVA_EDEN_SIZE=$(($(echo $JAVA_HEAP_SIZE|sed 's/M$\|G$//')/6))M
   JAVA_PERM_SIZE=128M
   JAVA_STCK_SIZE=128K
  
   EDEN_SIZE is 1/6th of total heap

Re: [OT] jvm exits without trace

2010-02-26 Thread Taylan Develioglu
Chuck, I am aware.

A SIGSEGV is a signal sent by the kernel, not a violation itself.

A SIGSEGV is sent when a process in userspace attempts an invalid memory
access. In other words, a page fault occurs where the page is actually
present in physical memory but cannot be accessed by the program.
When this kind of violation occurs, a SIGSEGV is sent by the kernel to
the violating program.

At least that's what 'Understanding the Linux Kernel' leads me to
believe (chapter "Process Address Space", page fault exception handler,
p. 376-378).
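
As a small illustration of the signal side of this (not of the page
fault itself), the sketch below makes a child shell send SIGSEGV to
itself and then inspects the exit status seen by the parent. It assumes
a Linux system, where SIGSEGV is signal 11; a real JVM installs its own
handler, so this only shows how a plain process killed by the signal
appears to whatever started it.

```shell
# Start a child shell that sends SIGSEGV to itself.
# The "|| status=$?" keeps the script going even under "set -e".
status=0
sh -c 'kill -SEGV $$' || status=$?

# A shell reports death by signal as 128 + the signal number.
echo "exit status: $status, signal: $((status - 128))"
```

A wrapper script that starts the JVM could log this status to tell a
signal-induced exit apart from a normal one.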

On Thu, 2010-02-25 at 22:36 +0100, Christopher Schultz wrote:
 Taylan,
 
 On 2/24/2010 8:31 AM, Taylan Develioglu wrote:
  We have also had failures with hotspot error files (hs_err) present, and
  the cause specified was indeed SIGSEGV indicating a page fault. But I
  don't know if the two are related.
 
 Just to be clear, a SIGSEGV is a segmentation violation (memory read
 outside process space), not a page fault, which is a perfectly normal
 thing to occur during execution. The latter is a virtual memory matter
 handled by the operating system and should be transparent (other than a
 delay) to the application.
 
 http://en.wikipedia.org/wiki/Segmentation_violation
 http://en.wikipedia.org/wiki/Page_fault
 
 The Wikipedia page for Page fault does indicate that Invalid page
 fault is a term that essentially means null pointer dereference but
 I've never heard that term used, ever.
 
 - -chris

Re: jvm exits without trace

2010-02-26 Thread Taylan Develioglu
Hi Chris, 

There's no doubt about it. The amount free is what's left after
everything is taken into account: heap, JVM, JNI, permgen.

And trust me, I'd like it to be the OOM killer, but it's not.

They could survive, but then I might as well throw away half of my RAM.
I don't see any point in doing that (it doesn't fix the problem).

On Thu, 2010-02-25 at 22:38 +0100, Christopher Schultz wrote:
 Taylan,
 
 On 2/24/2010 8:31 AM, Taylan Develioglu wrote:
  Most memory is commited to tomcat, where a 24G machine would have 18G
  allocated to heap, 128M to permgen and some unspecified amount would get
  used by jni for apr. About 4G remains free after calculating taking into
  account the jvm itsself.
  A 16G machine would have 12G allocated to the heap.
 
 Are you sure the rest of the JVM can fit into this space? I've heard of
 JVMs (particularly on Windows) that take a significant chunk of memory
 on top of the heap space requested on the command-line.
 
 Definitely check your system logs for OOM killer, here.
 
 What happens if you cut your heap in half? Can each machine in your
 (probably) cluster survive with less heap space?
 
 - -chris

Re: [OT] jvm exits without trace

2010-02-26 Thread Taylan Develioglu
For some reason I keep calling you Chuck... I hope I'm not offending
anyone :O

On Fri, 2010-02-26 at 13:55 +0100, Taylan Develioglu wrote:
 Chuck I am aware.
 
 A SIGSEGV is a signal sent by the kernel. Not a violation itsself.
 
 A sigsegv is sent when an invalid memory access is attempted by a
 process in userspace, in other words a page fault occurs, when the page
 is actually present in physical memory but cannot be accessed by the
 program. 
 When this kind of violation occurs the a sigsegv is sent by the kernel
 to the violating program.
 
 At least that's what 'Understanding the linux kernel' leads me to
 believe (chapter process address space, page fault exception handler
 p376 - 378).
 
 On Thu, 2010-02-25 at 22:36 +0100, Christopher Schultz wrote:
  Taylan,
  
  On 2/24/2010 8:31 AM, Taylan Develioglu wrote:
   We have also had failures with hotspot error files (hs_err) present, and
   the cause specified was indeed SIGSEGV indicating a page fault. But I
   don't know if the two are related.
  
  Just to be clear, a SIGSEGV is a segmentation violation (memory read
  outside process space), not a page fault, which is a perfectly normal
  thing to occur during execution. The latter is a virtual memory matter
  handled by the operating system and should be transparent (other than a
  delay) to the application.
  
  http://en.wikipedia.org/wiki/Segmentation_violation
  http://en.wikipedia.org/wiki/Page_fault
  
  The Wikipedia page for Page fault does indicate that Invalid page
  fault is a term that essentially means null pointer dereference but
  I've never heard that term used, ever.
  
  - -chris

jvm exits without trace

2010-02-24 Thread Taylan Develioglu
Hi,

I have JVMs running Tomcat and our application that exit mysteriously,
and I was wondering if anyone could give me some advice on how to debug
this.

There is nothing in catalina.out, nor our application logs, and no
hotspot error file. GC log looks normal. No trace in system logs.

I am left completely clueless :(, has anyone dealt with a problem like
this before?

Any help appreciated.

- Tomcat 6.0.24
- TC native 1.1.18
- APR 1.3.9 
- Sun JDK 6u18
- Debian Lenny, 2.6.31.10-amd64

2 servlets, one as ROOT. 2 HTTP connectors that use TCNative/APR.

JAVA_OPTS ( ):

-verbose:gc
-Djava.awt.headless=true
-Dsun.net.inetaddr.ttl=60
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=$TMP_DIR
-Djava.library.path=/usr/local/lib
-Djava.endorsed.dirs=$CATALINA_BASE/endorsed
-Dcatalina.base=$CATALINA_BASE
-Dcatalina.home=$CATALINA_HOME
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
-Djava.util.logging.config.file=$CATALINA_BASE/conf/logging.properties
-XX:+PrintGCDetails
-Xloggc:$CATALINA_BASE/logs/gc.log
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=70
-Xms$JAVA_HEAP_SIZE
-Xmx$JAVA_HEAP_SIZE
-XX:NewSize=$JAVA_EDEN_SIZE
-XX:MaxNewSize=$JAVA_EDEN_SIZE
-XX:PermSize=$JAVA_PERM_SIZE
-XX:MaxPermSize=$JAVA_PERM_SIZE
-Xss$JAVA_STCK_SIZE
-XX:+UseLargePages




Re: jvm exits without trace

2010-02-24 Thread Taylan Develioglu
I thought I'd add the connector definitions too:

<Connector port="80" protocol="org.apache.coyote.http11.Http11AprProtocol"
           compression="1024" keepAliveTimeout="6" maxKeepAliveRequests="-1"
           enableLookups="false" redirectPort="443" maxThreads="150"
           pollerSize="32768" pollerThreadCount="4"/>

<Connector port="443" protocol="org.apache.coyote.http11.Http11AprProtocol"
           SSLEnabled="true" enableLookups="false" maxThreads="10"
           scheme="https" secure="true"
           SSLCertificateFile="/etc/ssl/private/something.crt"
           SSLCertificateKeyFile="/etc/ssl/private/something.key"
           SSLCACertificateFile="/etc/ssl/certs/ca.crt"/>


On Wed, 2010-02-24 at 10:23 +0100, Taylan Develioglu wrote:
 Hi,
 
 I have jvm's, running tomcat and our application, exiting mysteriously,
 and was wondering if anyone could give me some advice on how to debug
 this thing.
 
 There is nothing in catalina.out, nor our application logs, and no
 hotspot error file. GC log looks normal. No trace in system logs.
 
 I am left completely clueless :(, has anyone dealt with a problem like
 this before?
 
 Any help appreciated.
 
 - Tomcat 6.0.24
 - TC native 1.1.18
 - APR 1.3.9 
 - Sun JDK 6u18
 - Debian Lenny, 2.6.31.10-amd64
 
 2 servlets, one as ROOT. 2 HTTP connectors that use TCNative/APR.
 
 JAVA_OPTS ( ):
 
 -verbose:gc
 -Djava.awt.headless=true
 -Dsun.net.inetaddr.ttl=60
 -Dfile.encoding=UTF-8
 -Djava.io.tmpdir=$TMP_DIR
 -Djava.library.path=/usr/local/lib
 -Djava.endorsed.dirs=$CATALINA_BASE/endorsed
 -Dcatalina.base=$CATALINA_BASE
 -Dcatalina.home=$CATALINA_HOME
 -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
 -Djava.util.logging.config.file=$CATALINA_BASE/conf/logging.properties
 -XX:+PrintGCDetails
 -Xloggc:$CATALINA_BASE/logs/gc.log
 -XX:+UseConcMarkSweepGC
 -XX:CMSInitiatingOccupancyFraction=70
 -Xms$JAVA_HEAP_SIZE
 -Xmx$JAVA_HEAP_SIZE
 -XX:NewSize=$JAVA_EDEN_SIZE
 -XX:MaxNewSize=$JAVA_EDEN_SIZE
 -XX:PermSize=$JAVA_PERM_SIZE
 -XX:MaxPermSize=$JAVA_PERM_SIZE
 -Xss$JAVA_STCK_SIZE
 -XX:+UseLargePages
 




Re: jvm exits without trace

2010-02-24 Thread Taylan Develioglu
The GC log shows plenty of heap space left in all the spaces.

I purposely didn't bother replacing the variables because I figured they
would not be relevant.

But if you think they might provide clues they're as follows:

JAVA_HEAP_SIZE=18432M
JAVA_EDEN_SIZE=$(($(echo $JAVA_HEAP_SIZE|sed 's/M$\|G$//')/6))M
JAVA_PERM_SIZE=128M
JAVA_STCK_SIZE=128K

EDEN_SIZE is 1/6th of total heap.
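
As a minimal sketch of how that expression evaluates: with GNU sed it
strips a trailing M or G without converting units, so 18432M gives
3072M, but a value such as 18G would give 3M. The variant below
normalizes to megabytes first; to_megabytes is a hypothetical helper
introduced only for this illustration, not part of the original
startup script.

```shell
# Hypothetical helper: convert a size such as 18432M or 18G to megabytes.
to_megabytes() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  echo "unsupported size: $1" >&2; return 1 ;;
    esac
}

JAVA_HEAP_SIZE=18432M
# Eden is one sixth of the total heap, expressed in megabytes.
JAVA_EDEN_SIZE=$(( $(to_megabytes "$JAVA_HEAP_SIZE") / 6 ))M
echo "$JAVA_EDEN_SIZE"
```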

And I said there was nothing in the system logs.
But you get a couple of points for trying.

On Wed, 2010-02-24 at 10:44 +0100, Pid wrote:
 On 24/02/2010 09:36, Taylan Develioglu wrote:
  I thought I'd add the connector definitions too, :
 
  Connector port=80
  protocol=org.apache.coyote.http11.Http11AprProtocol
  compression=1024 keepAliveTimeout=6
  maxKeepAliveRequests=-1
  enableLookups=false redirectPort=443 maxThreads=150
  pollerSize=32768
  pollerThreadCount=4/
 
   Connector port=443
  protocol=org.apache.coyote.http11.Http11AprProtocol SSLEnabled=true
  enableLookups=false maxThreads=10 scheme=https
  secure=true
  SSLCertificateFile=/etc/ssl/private/something.crt
  SSLCertificateKeyFile=/etc/ssl/private/something.key
  SSLCACertificateFile=/etc/ssl/certs/ca.crt/
 
 
  On Wed, 2010-02-24 at 10:23 +0100, Taylan Develioglu wrote:
  Hi,
 
  I have jvm's, running tomcat and our application, exiting mysteriously,
  and was wondering if anyone could give me some advice on how to debug
  this thing.
 
  There is nothing in catalina.out, nor our application logs, and no
  hotspot error file. GC log looks normal. No trace in system logs.
 
  I am left completely clueless :(, has anyone dealt with a problem like
  this before?
 
  Any help appreciated.
 
  - Tomcat 6.0.24
  - TC native 1.1.18
  - APR 1.3.9
  - Sun JDK 6u18
  - Debian Lenny, 2.6.31.10-amd64
 
  2 servlets, one as ROOT. 2 HTTP connectors that use TCNative/APR.
 
  JAVA_OPTS ( ):
 
   -verbose:gc
   -Djava.awt.headless=true
   -Dsun.net.inetaddr.ttl=60
   -Dfile.encoding=UTF-8
   -Djava.io.tmpdir=$TMP_DIR
   -Djava.library.path=/usr/local/lib
   -Djava.endorsed.dirs=$CATALINA_BASE/endorsed
   -Dcatalina.base=$CATALINA_BASE
   -Dcatalina.home=$CATALINA_HOME
   -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
  -Djava.util.logging.config.file=$CATALINA_BASE/conf/logging.properties
   -XX:+PrintGCDetails
   -Xloggc:$CATALINA_BASE/logs/gc.log
   -XX:+UseConcMarkSweepGC
   -XX:CMSInitiatingOccupancyFraction=70
   -Xms$JAVA_HEAP_SIZE
   -Xmx$JAVA_HEAP_SIZE
   -XX:NewSize=$JAVA_EDEN_SIZE
   -XX:MaxNewSize=$JAVA_EDEN_SIZE
   -XX:PermSize=$JAVA_PERM_SIZE
   -XX:MaxPermSize=$JAVA_PERM_SIZE
   -Xss$JAVA_STCK_SIZE
   -XX:+UseLargePages
 
 There's no actual heap size settings in the above.  But you get a couple 
 of points for trying.
 
 Google Linux Out Of Memory killer or OOM Killer and then check the 
 server logs carefully.  (e.g. /var/log/messages)
 
 
 p
 

Re: jvm exits without trace

2010-02-24 Thread Taylan Develioglu
Thank you Konstantin, I've read the thread you mentioned.

I should have mentioned the mysterious exit happens on several different
servers with different hardware and configuration. So it's very unlikely
it's being caused by a hardware issue.

It's also not the OOM killer, as I mentioned before; I had already
investigated those possibilities.
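
For anyone who wants to repeat the OOM killer check, a rough sketch
follows. The search patterns are typical kernel messages rather than an
exhaustive list, the log paths vary by distribution, and
find_oom_events is a hypothetical helper name used only here.

```shell
# Hypothetical helper: print log lines that suggest the OOM killer fired.
find_oom_events() {
    grep -i -E 'out of memory|oom-killer|killed process' "$@"
}

# Log locations are assumptions; adjust them to the distribution in use.
for log in /var/log/messages /var/log/syslog /var/log/kern.log; do
    if [ -r "$log" ]; then
        # grep exits non-zero when nothing matches; that is not an error here.
        find_oom_events "$log" || true
    fi
done
```

No output means none of the readable logs contain a matching line.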

I'm suspecting JNI with Tomcat Native and APR now. I believe native code
outside the JVM could very well cause a crash like this, but my ignorance
on the subject isn't helping.

I've had strange behavior with libapr 1.3 and Apache on machines with
Debian 5.0 that synchronize their clock using clock slew (ntpdate), so I
decreased the ntpdate frequency to see if that helps. (As you can tell,
I'm getting a bit desperate.)


On Wed, 2010-02-24 at 11:28 +0100, Konstantin Kolinko wrote:
 2010/2/24 Taylan Develioglu tdevelio...@ebuddy.com:
  Hi,
 
  I have jvm's, running tomcat and our application, exiting mysteriously,
  and was wondering if anyone could give me some advice on how to debug
  this thing.
 
  There is nothing in catalina.out, nor our application logs, and no
  hotspot error file. GC log looks normal. No trace in system logs.
 
  I am left completely clueless :(, has anyone dealt with a problem like
  this before?
 
 
 There is currently a thread named Tomcat dies suddenly
 Look there for starters.  While that is unlikely your case, most ideas
 of diagnosing such an issue are mentioned in the first dozen of
 messages of that thread.
 
 http://marc.info/?t=12632496092r=1w=2
 http://marc.info/?t=12633901125r=1w=2
 http://marc.info/?t=12647949758r=6w=2
 http://marc.info/?t=12660960545r=1w=2
 
 Best regards,
 Konstantin Kolinko
 

Re: jvm exits without trace

2010-02-24 Thread Taylan Develioglu
Hello Carl,

The failures we've seen occur anywhere between 8 hours and a week of
runtime. Most of the machines have (still) been running for almost a
month without failure. There are ~100 machines.

Off the top of my head, I think we've had 10+ failures now.

We have also had failures with hotspot error files (hs_err) present, and
the cause specified was indeed SIGSEGV indicating a page fault. But I
don't know if the two are related.

We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
the situation allows (during regular updates of the application, or a
crash) to see if that helps.

It might be useful to note that the failures happen with tomcat 6.0.20
as well as 6.0.24.

As far as load is concerned, I haven't had a failure on an idle machine.
The machines are well loaded, but only at a fraction of their limit with
regard to load and CPU utilization.
Most memory is committed to Tomcat: a 24G machine would have 18G
allocated to heap, 128M to permgen, and some unspecified amount would
get used by JNI for APR. About 4G remains free after taking the JVM
itself into account.
A 16G machine would have 12G allocated to the heap.
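
A rough budget for the 24G case, as a sketch only: it assumes 24G means
24576M and that "about 4G free" means 4096M, both of which are rounding
assumptions. What is left over is what the JVM itself, JNI and APR
would be using.

```shell
# All figures in megabytes; the rounding is an assumption (see above).
total_mb=$((24 * 1024))
heap_mb=$((18 * 1024))
permgen_mb=128
free_mb=$((4 * 1024))

# Memory used outside heap and permgen (JVM internals, thread stacks, JNI/APR).
other_mb=$((total_mb - heap_mb - permgen_mb - free_mb))
echo "Outside heap and permgen: ${other_mb}M"
```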

Besides the fact that our apps heavily use nio and mina I wouldn't say
there's anything else noteworthy. There can be anywhere up to 1
concurrents on one machine.

I have searched for core dumps, but no luck. Running Tomcat in the
foreground might show something, but then again I could be waiting a
month for it to happen.
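
The search can be sketched roughly as follows. It assumes the default
HotSpot error log name (hs_err_pid<pid>.log) and core files named
"core" or "core.<pid>"; the actual core file name and location depend
on the kernel's core_pattern setting. find_crash_files is a
hypothetical helper name, and the 7-day window is arbitrary.

```shell
# Allow core files of any size for processes started from this shell.
ulimit -c unlimited 2>/dev/null || echo "could not raise core size limit" >&2

# Hypothetical helper: list core files and HotSpot error logs under a
# directory that were modified within the last N days.
find_crash_files() {
    dir=$1
    days=$2
    find "$dir" -xdev -type f \
        \( -name 'core' -o -name 'core.*' -o -name 'hs_err_pid*.log' \) \
        -mtime "-$days" 2>/dev/null
}

# Search the Tomcat base directory, or the current directory if unset.
find_crash_files "${CATALINA_BASE:-.}" 7 || true
```

The ulimit call only affects processes started from the same shell, so
it would have to go into the script that launches Tomcat.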

On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
 Taylan,
 
 I am the person who started the Tomcat dies suddenly thread which I still 
 haven't resolved.  I am curious about the pattern of failures you are 
 experiencing because they may provide some clues to my problem.  In my case, 
 the system will run for 15 minutes to 10 days before failing (most of the 
 time it is several days to a week.)  It appears to die from a seg fault in 
 the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... you 
 may be able to see the cause of the failure from the core file (the core 
 files on my systems were in several directories so you may have to do a 
 'find' to locate them.)  Load may be a factor but the failures generally 
 come after the load has been heavy for a while.  I am running a couple of 
 applications and it seems the failures are more frequent when people are 
 hitting the additional apps (the primary app is always used, the remaining 
 apps are used sporadically.)
 
 How does this compare to what you are experiencing?
 
 Thanks,
 
 Carl
 
 - Original Message - 
 From: Taylan Develioglu tdevelio...@ebuddy.com
 To: Tomcat Users List users@tomcat.apache.org; p...@pidster.com
 Sent: Wednesday, February 24, 2010 5:09 AM
 Subject: Re: jvm exits without trace
 
 
  The GC log shows plenty of heap space left in all the spaces.
 
  I purposely didn't bother replacing the variables because I figured they
  would not be relevant.
 
  But if you think they might provide clues they're as follows:
 
  JAVA_HEAP_SIZE=18432M
  JAVA_EDEN_SIZE=$(($(echo $JAVA_HEAP_SIZE|sed 's/M$\|G$//')/6))M
  JAVA_PERM_SIZE=128M
  JAVA_STCK_SIZE=128K
 
  EDEN_SIZE is 1/6th of total heap.
 
  And I said there was nothing in the system logs.
  But you get a couple of points for trying.
 
  On Wed, 2010-02-24 at 10:44 +0100, Pid wrote:
  On 24/02/2010 09:36, Taylan Develioglu wrote:
   I thought I'd add the connector definitions too, :
  
   Connector port=80
   protocol=org.apache.coyote.http11.Http11AprProtocol
   compression=1024 keepAliveTimeout=6
   maxKeepAliveRequests=-1
   enableLookups=false redirectPort=443 
   maxThreads=150
   pollerSize=32768
   pollerThreadCount=4/
  
Connector port=443
   protocol=org.apache.coyote.http11.Http11AprProtocol SSLEnabled=true
   enableLookups=false maxThreads=10 scheme=https
   secure=true
   SSLCertificateFile=/etc/ssl/private/something.crt
   SSLCertificateKeyFile=/etc/ssl/private/something.key
   SSLCACertificateFile=/etc/ssl/certs/ca.crt/
  
  
   On Wed, 2010-02-24 at 10:23 +0100, Taylan Develioglu wrote:
   Hi,
  
   I have jvm's, running tomcat and our application, exiting 
   mysteriously,
   and was wondering if anyone could give me some advice on how to debug
   this thing.
  
   There is nothing in catalina.out, nor our application logs, and no
   hotspot error file. GC log looks normal. No trace in system logs.
  
   I am left completely clueless :(, has anyone dealt with a problem like
   this before?
  
   Any help appreciated.
  
   - Tomcat 6.0.24
   - TC native 1.1.18
   - APR 1.3.9
   - Sun JDK 6u18
   - Debian Lenny, 2.6.31.10-amd64
  
   2 servlets, one as ROOT. 2 HTTP connectors that use TCNative/APR.
  
   JAVA_OPTS ( ):
  
-verbose:gc
-Djava.awt.headless=true
-Dsun.net.inetaddr.ttl=60
-Dfile.encoding=UTF-8

Re: jvm exits without trace

2010-02-24 Thread Taylan Develioglu
It's possible.
I'm going to try an earlier JVM first. u16 was the previous one running
in production, so I will try moving back to u16.

If that fails, removing APR is the next thing to try.

After that I'm going to try beating the dev team with a stick (I know
you're reading this!).

This is incredibly frustrating, thanks for all the help.

 Can you disable APR, use the alternative SSL configuration or is that 
 not possible?
 
 Also, would be it be possible to use an earlier 1.6 JVM* or perhaps even 
 a completely different one?  I can't remember, offhand, what (if any) 
 results Carl had with other JVMs.
 
 
 p
 
 
 * Perhaps there's a subtle bug in recent releases of the JVM.
 
 
  On Wed, 2010-02-24 at 11:28 +0100, Konstantin Kolinko wrote:
  2010/2/24 Taylan Develioglutdevelio...@ebuddy.com:
  Hi,
 
  I have jvm's, running tomcat and our application, exiting mysteriously,
  and was wondering if anyone could give me some advice on how to debug
  this thing.
 
  There is nothing in catalina.out, nor our application logs, and no
  hotspot error file. GC log looks normal. No trace in system logs.
 
  I am left completely clueless :(, has anyone dealt with a problem like
  this before?
 
 
  There is currently a thread named "Tomcat dies suddenly".
  Look there for starters.  While that is unlikely to be your case, most
  ideas for diagnosing such an issue are mentioned in the first dozen
  messages of that thread.
 
  http://marc.info/?t=12632496092r=1w=2
  http://marc.info/?t=12633901125r=1w=2
  http://marc.info/?t=12647949758r=6w=2
  http://marc.info/?t=12660960545r=1w=2
 
  Best regards,
  Konstantin Kolinko
 
  -
  To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
  For additional commands, e-mail: users-h...@tomcat.apache.org
 
 
 
 



Re: jvm exits without trace

2010-02-24 Thread Taylan Develioglu
I'll be sure to post an update if u16 resolves it. Or any other progress
for that matter.

In the meantime don't be shy either :)

On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
 Taylan,
 
  The failures we've seen are in anywhere between 8 hours to a week of
  runtime.
 
 The timing of the failures seems similar.
 
  We have also had failures with hotspot error files (hs_err) present, and
  the cause specified was indeed SIGSEGV indicating a page fault.
 
 I have never seen any hs_* files but have seen core files where strace 
 showed the jvm stopped on a seg fault.
 
  We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
  the situation allows (during regular updates of the application, or a
  crash) to see if that helps.
 
 I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not 
 tried 1.6.0_16.  Please post your results of this trial.
 
  Running tomcat on the
  foreground might show something, but then again I could be waiting for a
  month for it to happen.
 
 Yes, this has been part of my problem as anytime we change something, we 
 have to wait a week for the server to fail.
 
 In one sense, I am fortunate that I have a little more flexibility than you. 
 I have two servers (different hardware) but only need one in service at a 
 time.  Therefore, I always have one server I can test ideas on although I 
 have never been able to develop a meaningful stress test, i.e., the only way 
 I can test a change is to put it in production.
 
 Thanks,
 
 Carl






Re: CLOSE_WAIT and what to do about it

2009-04-09 Thread Taylan Develioglu
Skimmed quickly through your post there while working, so forgive me if
this is irrelevant.

CLOSE_WAIT is a state where the connection has been closed on the tcp/ip
level, but the application (in this case java) has not closed the socket
descriptor yet.

As a coincidence we just fixed this very same issue in our application,
which uses the httpclient library.

There is a known issue with the httpclient library where sockets are not
closed after the connection ends (issue or feature, you be the judge);
we worked around this by explicitly calling close ourselves.

If httpclient is used that could be the culprit.

See
http://www.nabble.com/tcp-connections-left-with-CLOSE_WAIT-td13757202.html
for a better description
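
A minimal sketch of the explicit-close workaround, using only java.net
rather than httpclient itself (the class and method names here are
hypothetical, and this is not the code from our application). A throwaway
loopback server closes its end first, which is what puts the client side
into CLOSE_WAIT; the client socket only leaves that state once the
application closes it, which try-with-resources guarantees here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class CloseWaitSketch {
    /**
     * Connects to a local server that immediately closes its end, reads until
     * end-of-stream, then closes the client socket explicitly.
     * Returns the value of the final read() call (-1 means the peer closed).
     */
    static int readUntilPeerCloses() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // try-with-resources guarantees client.close() even if read() throws.
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 Socket accepted = server.accept()) {
                accepted.close(); // peer sends FIN; the client side is now in CLOSE_WAIT
                InputStream in = client.getInputStream();
                int last;
                while ((last = in.read()) != -1) {
                    // drain any remaining bytes (none in this sketch)
                }
                return last; // the client socket is closed on leaving the try block
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("final read() returned " + readUntilPeerCloses());
    }
}
```

If the close is skipped, the descriptor stays open until the object is
garbage collected or the process exits, and the socket keeps showing up
as CLOSE_WAIT in netstat in the meantime.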

Rgds,

Taylan

André Warnier wrote:


 Hi.
 As a follow-up on another thread originally entitled "apache/tomcat
 communication issues (502 response)", I'd like to pursue the
 CLOSE-WAIT subject.

 Sorry if this post is a bit long, I want to make sure that I do
 provide all the necessary information.

 Like the original poster, I am seeing on my systems a fair number of
 sockets apparently stuck for a long time in the CLOSE_WAIT state.
 (Sometimes several hundreds of them).
 They seem to predominantly concern Tomcat and other java processes,
 but as Alan pointed out previously and I confirm, my perspective is
 slanted, because we use a lot of common java programs and webapps on
 our servers, and the ones mostly affected talk to each other and come
 from the same vendor.
 Unfortunately also, I do not have the sources of these
 programs/webapps available, and will not get them, and I can't do
 without these programs.

 It has been previously established that a socket in a
 long-time-lingering CLOSE-WAIT status, is due to one or the other side
 of a TCP connection not properly closing its side of the connection
 when it is done with it.
 I also surmise (without having a definite proof of this), that this is
 essentially bad, as it ties up some resources that could be
 otherwise freed.
 I have also been told or discovered that, our servers being Linux
 Debian servers, programs such as ps, netstat and lsof can help
 in determining precisely how many such lingering sockets there are,
 and who the culprit processes are (to some extent).

 In our case, we know which are the programs involved, because we know
 which ones open a listening socket and on what fixed port, and we also
 know which are the other processes talking to them.
 But, as mentioned previously, we do not have the source of these
 programs and will not get them, but cannot practically do without them
 for now. But we do have full root control of the Linux servers where
 these programs are running.

 So my question is : considering the situation above, is there
 something I can do locally to free these lingering CLOSE_WAIT sockets,
 and under which conditions ?
 (I must admit that I am a bit lost among the myriad options of lsof)

 For example, suppose I start with a netstat -pan command and I see
 the display below (sorry for the line-wrapping).
 I see a number of sockets in the CLOSE_WAIT state, and for those I
 have a process-id, which I can associate to a particular process.
 For example, I see this line :
 tcp6  12  0 :::127.0.0.1:41764  :::127.0.0.1:11002  CLOSE_WAIT 29649/java
 which tells me that there is a local process 29649/java, with a
 local socket port 41764 in the CLOSE_WAIT state, related to another
 socket #11002 on the same host.
 On the other hand, I see this line :
 tcp   0  0 127.0.0.1:11002 127.0.0.1:41764 FIN_WAIT2  -
 which shows a local socket on port 11002, related to this other
 local socket port #41764, with no process-id/program displayed.
 What does that tell me ?

 I also know that the process-id 29649 corresponds to a local java
 process, of the daemon variety, multi-threaded.  That program talks
 to another known server program, written in C, of which instances are
 started on an ad-hoc base by inetd, and which listens on port 11002
 (in fact it is inetd who does, and it passes this socket on to the
 process it forks, I understand that).

 (The link with Tomcat is that I also see frequently the same
 situation, where the process owning the CLOSE_WAIT socket is Tomcat,
 more specifically one webapp running inside it.  It's just that in
 this particular snapshot it isn't.)

 What it looks like to me in this case, is that at some point one of
 the threads of process #29649 opened a client socket #41764 to the
 local inetd port #11002; that inetd then started the underlying server
 process (the C program); that the underlying C program then at some
 point exited; but that process #29649 never closed its side of
 the connection with port #11002.
 Can I somehow detect this condition, and force the offending thread
 of process #29649 to close that socket (or just force this thread to
 exit) ?

 I realise this may be a complex question, and that the answers may be
 

Re: CPU usage with APR and connectionTimeout impact

2009-04-02 Thread Taylan Develioglu
Funny,

according to the documentation there exists no connectionTimeout
attribute for the apr connector.

Setting the value to '0' could mean all sorts of behavior; there is no
way to know for sure short of checking the code. (It could mean the
connector will not wait for the URI line at all.)

I can't comment on a correct value for your application.

Setting it to a low value will have the connector thread return to the
pool faster on connections where the peer has gone to lunch after the
initial connection. This only matters if you have a large number of such
peers.

I'm sure one of the veterans here can clear this up for you.

 Hello,

 In my project, we are using Tomcat 6.0.18, with APR 1.2.12 and tc
 native 1.1.14 on an Redhat OS (Linux kernel 2.6.18).
 There is a behavior that I can't explain:

 -with connectionTimeout=0, the tomcat process uses a huge percentage
 of CPU, even if there is no traffic,
 but we don't observe any problem and the response time is good.

 -with connectionTimeout=5000, the process tomcat uses a normal
 percentage of CPU, when there is no traffic.

 -without APR and with connectionTimeout=0, the process tomcat uses a
 normal percentage of CPU when there is no traffic.

 After different searches on the web, tomcat manual and mailing lists, I
 don't find the reason of the link between CPU usage and
 connectionTimeout/keepAliveTimeout with APR.
 With the previous release of Tomcat (5.5) and APR, we have a similar CPU
 usage (without traffic, high CPU load) and when we modify another
 parameter (firstReadTimeout), the behavior also changes in the same
 way.

 I know there is no real trouble, but I'm curious and prudent: I don't
 like to do something, when I don't understand what is hidden behind.
 Could somebody explain to me why Tomcat/APR has these behaviors?
 Is there a performance risk to set connectionTimeout to 5000?

 Thank you for your answers.
 Yann





Re: CPU usage with APR and connectionTimeout impact

2009-04-02 Thread Taylan Develioglu
You're right. I missed it. APR has the same attributes as the HTTP
connector.

I think a separate overview of attributes per connector would be clearer.

The HTTP connectionTimeout description states:

- The number of milliseconds this *Connector* will wait, after accepting
a connection, for the request URI line to be presented. The default
value is 60000 (i.e. 60 seconds).

'0' is not explicitly defined as a special value. According to the
description it would mean a wait period of 0 milliseconds for the uri to
be presented. This would make the connector practically useless.
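
For comparison, here is a small sketch of how a read timeout behaves on a
plain java.net socket (hypothetical class and method names). It only
illustrates the JDK's own semantics: a positive SO_TIMEOUT makes a blocked
read give up with SocketTimeoutException, while 0 means "block forever".
Whether the APR connector interprets connectionTimeout=0 the same way is
an assumption that would need checking in the Tomcat source.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutSketch {
    /**
     * Accepts one connection from a client that never sends anything and waits
     * up to timeoutMillis for its first byte. Returns true if the wait timed out.
     */
    static boolean timesOutWaitingForRequest(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket silentClient = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // In java.net, a timeout of 0 means "wait indefinitely",
            // not "do not wait at all".
            accepted.setSoTimeout(timeoutMillis);
            try {
                accepted.getInputStream().read();
                return false; // data or end-of-stream arrived in time
            } catch (SocketTimeoutException e) {
                return true;  // the peer stayed silent for the whole timeout
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + timesOutWaitingForRequest(500));
    }
}
```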

Caldarale, Charles R wrote:
 From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
 Subject: Re: CPU usage with APR and connectionTimeout impact

 according to the documentation there exists no connectionTimeout
 attribute for the apr connector.
 

 Which documentation is that?  Note that the HTTP connector attributes apply 
 when running in APR mode.  Quoting from the APR-specific doc:

 The following attributes are supported in the HTTP APR connector in addition 
 to the ones supported in the regular HTTP connector:

 What's not clear in the doc is that many of the HTTP attributes also apply to 
 the NIO version of the protocol handler.

  - Chuck


 THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY 
 MATERIAL and is thus for use only by the intended recipient. If you received 
 this in error, please contact the sender and delete the e-mail and its 
 attachments from all computers.





Re: very off topic marketing question

2009-03-20 Thread Taylan Develioglu



and it pre-compiles all of its code before it runs each script
  


For starters, I'd point out the jsp page compiler does this as well...

Then redirect the person to this thread to get lynched. (seriously, 
aside from the lynching this is a good idea)





Re: tomcat w/apr data lost in http post request?

2009-03-16 Thread Taylan Develioglu
Possibly IE writes to the socket buffer in separate steps for header 
info and post parameters. This would cause the data to be sent out in 
separate packets if Nagle's algorithm is off.


Caldarale, Charles R wrote:
From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
Subject: Re: tomcat w/apr data lost in http post request?


Can MSIE even control which data goes in which packet?



TCP/IP APIs on most platforms allow the Nagle algorithm to be disabled, which 
will cause data to be sent out on each call.  Most TCP/IP stacks also set the 
push flag on the last packet of a sequence to force the peer stack to deliver 
the data to the receiver without delay.  That's probably all that IE is doing 
(but I don't know the MS APIs).

 - Chuck





Re: tomcat w/apr data lost in http post request?

2009-03-16 Thread Taylan Develioglu

Hi Chris,

Raising the keepalive-timeout value on the connector definitely improves 
the situation.


From what I've gathered from what people posted here (thanks guys) and 
dumping packets I believe the situation to be somewhat as follows:


With Nagle's algorithm off, IE sends out the http request in two separate
packets.

Somewhere between Tomcat's receipt of packet 1 (header) and packet 2 
(body/parameters) a timeout occurs, causing the contents of the second 
packet to be ignored.
Raising keepalive-timeout alleviates the problem by reducing the 
chance that a timeout occurs.


Christopher Schultz wrote:


Taylan,

On 3/6/2009 4:05 AM, Taylan Develioglu wrote:
  

James, thank you very much.

I suspected IE to be guilty because it was happening only with IE clients.

Chris, I guess we don't need to try and reproduce this anymore now that
we know the cause?



Well, you might want to figure out how to handle this situation. You
can't simply ignore 80% of the potential clients out there :)

- -chris




Re: tomcat w/apr data lost in http post request?

2009-03-16 Thread Taylan Develioglu

Christopher Schultz wrote:


Taylan,

  
No, you'd need to modify the source. It's not particularly useful in
most scenarios to intentionally stall an HTTP conversation, so it's not
a built-in feature :)

No, I'm saying that you should send exactly the right amount of data,
but you should stall in the middle. For instance, set the Content-Length
to 10 bytes, then send 5 bytes, then wait 10 or 20 seconds, and send the
rest.
  
Ah, of course, I understand what you're saying now. We basically wait for 
the timeout to occur before we send the post parameters. Could be done 
by calling socket.setTcpNoDelay(true), then writing to the socket and
closing, then waiting and writing the other half, I think.
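
A rough sketch of such a test client with plain java.net sockets. The
request path, the parameter names and the loopback helper are made up for
illustration; Content-Length is 10 and the body goes out as 5 + 5 bytes
with a pause in between. The socket stays open during the pause, and only
the sending side is shut down once the whole body has been written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class StalledPostSketch {
    /** Sends a POST whose 10-byte body is split into two writes, stallMillis apart. */
    static void sendStalledPost(String host, int port, long stallMillis)
            throws IOException, InterruptedException {
        try (Socket socket = new Socket(host, port)) {
            socket.setTcpNoDelay(true); // disable Nagle so each write goes out at once
            OutputStream out = socket.getOutputStream();
            String head = "POST /test HTTP/1.1\r\n"   // hypothetical path
                    + "Host: " + host + "\r\n"
                    + "Content-Type: application/x-www-form-urlencoded\r\n"
                    + "Content-Length: 10\r\n"
                    + "Connection: close\r\n\r\n";
            out.write((head + "a=123").getBytes(StandardCharsets.US_ASCII)); // headers + 5 bytes
            out.flush();
            Thread.sleep(stallMillis);                                       // stall mid-body
            out.write("&b=45".getBytes(StandardCharsets.US_ASCII));          // remaining 5 bytes
            out.flush();
            socket.shutdownOutput();                  // close only the sending side
            InputStream in = socket.getInputStream();
            while (in.read() != -1) {
                // discard whatever the server answers
            }
        }
    }

    /** Loopback check: a throwaway server collects everything the client sends. */
    static String captureOnLoopback(long stallMillis) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            Thread sink = new Thread(() -> {
                try (Socket s = server.accept()) {
                    InputStream in = s.getInputStream();
                    int b;
                    while ((b = in.read()) != -1) {
                        received.write(b);
                    }
                } catch (IOException ignored) {
                    // the sketch only cares about what was received before any error
                }
            });
            sink.start();
            sendStalledPost(server.getInetAddress().getHostAddress(),
                    server.getLocalPort(), stallMillis);
            sink.join();
            return received.toString("US-ASCII");
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(captureOnLoopback(2000));
    }
}
```

Pointing sendStalledPost at the real connector with a stall longer than
the configured timeout should show whether the second half of the body
is dropped.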




Wow, I didn't realize that browsers would keep an HTTP connection open
to a web server for 10 idle seconds. That seems like a really long time.

  
Actually, the default for IE is even 60 seconds (idle) on a keepalive 
connection.





Re: tomcat w/apr data lost in http post request?

2009-03-16 Thread Taylan Develioglu

Hi Andre,

I meant to stop writing, not closing the socket. Poor choice of words, 
apologies.


André Warnier wrote:


Taylan Develioglu wrote:

Christopher Schultz wrote:


Taylan,

  No, you'd need to modify the source. It's not particularly useful in
most scenarios to intentionally stall an HTTP conversation, so it's not
a built-in feature :)

  No, I'm saying that you should send exactly the right amount of data,
but you should stall in the middle. For instance, set the Content-Length
to 10 bytes, then send 5 bytes, then wait 10 or 20 seconds, and send the
rest.
  
Ah, of course, I understand what you're saying now. We basically wait 
for the timeout to occur before we send the post parameters. Could be 
done by calling socket.setTcpNoDelay(true), then writing to the socket
and closing, then waiting and writing the other half, I think.



No, I don't think you want to close.
Send the first 5 bytes, then wait, then send the rest.
Then maybe close (but only the sending side of the socket). If you 
close the connection totally (including the receiving side), you will 
provoke an error at the server side.






Re: Effect of Heap Size on Performance?

2009-03-11 Thread Taylan Develioglu

Chris,

We have 100+ application servers in a load-balancing setup (application
based, not Tomcat). If servers are removed from the load-balancing pool,
the others need to be able to pick up the load, so the number of
concurrent users is highly dynamic. You can imagine the problem if we
keep the heap size to a minimum on every server. I'm talking about a
fixed size here, of course.


I also don't think the relationship between number of objects and young 
gc duration is a linear one.


Increasing the young generation leads to longer GCs. Increasing young 
to 682M on a 4G heap from its default size increased GC time approx. 3-4x 
(47ms average to 154ms average on one server), but it also decreased 
the number of GCs performed by 15-20x.
So eventually, a larger heap size saved cycles, and subsequently 
increased throughput. At least for us.
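
As a back-of-the-envelope check on those figures, the sketch below
estimates total young-GC pause time after the change relative to before,
assuming total pause time is simply (number of collections) x (average
pause) over the same period. The class and method names are made up for
illustration; only the 47ms, 154ms and 15-20x values come from the
measurements above.

```java
class GcPauseEstimate {
    /**
     * Total pause time after a tuning change, expressed as a fraction of the
     * total before it, given the average pause before/after and the factor by
     * which the number of collections dropped.
     */
    static double relativeTotalPause(double avgPauseBeforeMs, double avgPauseAfterMs,
                                     double collectionCountReductionFactor) {
        double pauseGrowth = avgPauseAfterMs / avgPauseBeforeMs; // each GC takes longer
        return pauseGrowth / collectionCountReductionFactor;     // but there are fewer of them
    }

    public static void main(String[] args) {
        // 47 ms -> 154 ms average pause, 15x to 20x fewer collections.
        System.out.printf("15x fewer GCs: %.2f of the previous total pause time%n",
                relativeTotalPause(47, 154, 15));
        System.out.printf("20x fewer GCs: %.2f of the previous total pause time%n",
                relativeTotalPause(47, 154, 20));
    }
}
```

Under that assumption the total time spent in young collections drops to
roughly 16-22% of what it was, even though each individual pause is about
3.3 times longer.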


I think it also gives short-lived objects (for example short sessions) a 
longer time to 'die out', so they won't be promoted to the tenured 
generation, because survivor space is increased and GC frequency is 
decreased (can anyone confirm this?).


Christopher Schultz wrote:


Taylan,

On 3/5/2009 5:11 AM, Taylan Develioglu wrote:
  

I always hold this as a ground rule:

Increase heapsize as much as possible as long as:



My rule has always been to run with the smallest heap you can get away
with. We ran our main production app in 64MB of heap (the default for
our platform) for 4 years before we got our first OOME. Now we run it
with a 192MB heap.

A smaller heap means that you'll catch even small memory leaks faster.
At least, that's my position.

Surprisingly, Chuck hasn't responded (he usually has something to say
about GC/heap myths), but I suspect he'd say something like heap size
itself has little effect on the GC's performance... it's really the
number of objects that affect the performance. Granted, a larger heap
invites more objects into it, but generational garbage collection is
decent enough that the generations rarely grow to such a size that the
app stalls while the GC runs.

- -chris



Re: tomcat w/apr data lost in http post request?

2009-03-03 Thread Taylan Develioglu
I would like to correct this: it seems to happen only with IE6/7, and
maybe old Firefox 2.0.
 It happens with different clients indeed.

   





RE: Record and simulate a web app

2009-02-22 Thread Taylan Develioglu
Guys, I've been following this thread for a while now, but doesn't
JMeter already do what you're trying to accomplish here?

I've used jmeter's proxy to record and replay http requests/responses
before with success.

Or am I missing something here?

Here's a link to some instructions:
http://jakarta.apache.org/jmeter/usermanual/jmeter_proxy_step_by_step.pdf

Rgds,

T

-Original Message-
From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
Sent: vrijdag 20 februari 2009 17:07
To: Tomcat Users List
Subject: Re: Record and simulate a web app



Youssef,

On 2/20/2009 10:45 AM, Youssef Mohammed wrote:
 Yeah I was thinking that the capture code would perfectly fit in some
 HTTP tunnel so that we can capture the whole thing coming out of the
 web server, what do you think?

Okay, I took a stack trace of my servlet's code right in the middle of
the request (TC 5.5.26) and found this at the top (it's really my first
delve into exactly what code gets executed before application code, so
forgive me if you already knew this):

...blah...blah...blah...
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874)
at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)


The blah blah blah part is all application code from there on up.
ApplicationFilterChain.internalDoFilter calls the configured filters in
order, starting with mine (which I of course configured first).

Note that I'm looking at the source for Tomcat 6.0.16 yet running
5.5.26. Stupid, I know, but the architecture hasn't changed /that/ much.

I started at the /bottom/, ignoring the socket stuff, and right there in
Http11Processor.prepareResponse I find this:

headers.setValue("Date").setString(date);

So, Tomcat post-processes the HTTP headers at the Connector level.
Without writing your own /Connector/, you aren't going to be able to
intercept the response properly. I was hoping to get away with a valve.

I suppose you could subclass, say, Http11Processor and, in your
constructor, replace the outputBuffer class with a wrapper for
InternalOutputBuffer.

But this is getting a little messy for me. Since I don't need it, I'm
not too concerned about getting it done. :(

If you figure out a way to capture the response, and determine how
to uniquely identify requests to match the response you want to return,
I can show you how to write a servlet that can re-play the
previously-saved responses.

Good luck,
- -chris



RE: Fatal error: Cleaner terminated abnormally

2009-02-21 Thread Taylan Develioglu
I wanted to let you know it worked. The System.exit() call does get trapped.

java.lang.Error: Cleaner terminated abnormally
at sun.misc.Cleaner$1.run(Cleaner.java:130)
at java.security.AccessController.doPrivileged(Native Method)
at sun.misc.Cleaner.clean(Cleaner.java:127)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:124)
Caused by: java.lang.Error: java.io.IOException: Broken pipe
at sun.nio.ch.Util$SelectorWrapper$Closer.run(Util.java:97)
at sun.misc.Cleaner.clean(Cleaner.java:125)
... 1 more
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.EPollArrayWrapper.interrupt(Native Method)
at sun.nio.ch.EPollArrayWrapper.interrupt(EPollArrayWrapper.java:242)
at sun.nio.ch.EPollSelectorImpl.wakeup(EPollSelectorImpl.java:170)
at sun.nio.ch.SelectorImpl.implCloseSelector(SelectorImpl.java:92)
at java.nio.channels.spi.AbstractSelector.close(AbstractSelector.java:91)
at sun.nio.ch.Util$SelectorWrapper$Closer.run(Util.java:95)
... 2 more
Exception in thread "Reference Handler" java.lang.SecurityException: Can't call System.exit()
at com.emessenger.web.CustomSecurityManager.checkExit(CustomSecurityManager.java:22)
at java.lang.Runtime.exit(Runtime.java:88)
at java.lang.System.exit(System.java:906)
at sun.misc.Cleaner$1.run(Cleaner.java:132)
at java.security.AccessController.doPrivileged(Native Method)
at sun.misc.Cleaner.clean(Cleaner.java:127)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:124)


- What we see is, GC breaks and class unloading starts after the first
full GC:

[Unloading class sun.reflect.GeneratedMethodAccessor100]
[Unloading class sun.reflect.GeneratedConstructorAccessor76]
[Unloading class sun.reflect.GeneratedConstructorAccessor80]
[Unloading class sun.reflect.GeneratedConstructorAccessor77]
[Unloading class sun.reflect.GeneratedMethodAccessor95]
[Unloading class sun.reflect.GeneratedMethodAccessor98]
[Unloading class sun.reflect.GeneratedConstructorAccessor78]
[Unloading class sun.reflect.GeneratedMethodAccessor106]
[Unloading class sun.reflect.GeneratedMethodAccessor91]
[Unloading class sun.reflect.GeneratedMethodAccessor105]
[Unloading class sun.reflect.GeneratedMethodAccessor85]

- And then Full GC madness begins:

[GC 4080947K(4177280K), 0.0603370 secs]
[GC 4090738K(4177280K), 0.0683390 secs]
[Full GC 4177280K->4081390K(4177280K), 16.2954960 secs]
[GC 4081673K(4177280K), 0.0607990 secs]
[GC 4097803K(4177280K), 0.0739870 secs]
[Full GC 4177279K->4081951K(4177280K), 16.2857450 secs]
[GC 4082100K(4177280K), 0.0614000 secs]
[GC 4101581K(4177280K), 0.0814330 secs]
[Full GC 4177279K->4082845K(4177280K), 16.2079870 secs]
[GC 4084452K(4177280K), 0.0628080 secs]
[GC 4106928K(4177280K), 0.0835720 secs]
[Full GC 4177279K->4083187K(4177280K), 16.3403530 secs]
[GC 4084203K(4177280K), 0.0627750 secs]
[GC 4101856K(4177280K), 0.0737540 secs]
[Full GC 4177278K->4083998K(4177280K), 16.2605530 secs]
[GC 4084493K(4177280K), 0.0632620 secs]
[GC 4107486K(4177280K), 0.0804700 secs]
[Full GC 4177278K->4084298K(4177280K), 16.3931240 secs]
[GC 4084500K(4177280K), 0.0633480 secs]
[Full GC 4177279K->4085842K(4177280K), 16.4017970 secs]
[GC 409K(4177280K), 0.0702090 secs]
[GC 4127816K(4177280K), 0.1089220 secs]

- But it's still better than an instant shutdown.
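
To put a number on the "Full GC madness", here is a small sketch that adds
up the pause times of the Full GC entries in a -verbose:gc log. The class
name and the regular expression are my own and only cover the simple
one-line format shown in this message; other collectors and flags print
different layouts.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FullGcPauseSum {
    // Matches e.g. "[Full GC 4177280K->4081390K(4177280K), 16.2954960 secs]".
    // The ">" is optional so the pattern also tolerates logs where the arrow
    // was mangled in transit.
    private static final Pattern FULL_GC =
            Pattern.compile("\\[Full GC \\d+K->?\\d+K\\(\\d+K\\), ([0-9.]+) secs\\]");

    /** Sums the pause durations (in seconds) of all Full GC lines. */
    static double totalFullGcSeconds(List<String> logLines) {
        double total = 0.0;
        for (String line : logLines) {
            Matcher m = FULL_GC.matcher(line);
            if (m.find()) {
                total += Double.parseDouble(m.group(1));
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "[GC 4090738K(4177280K), 0.0683390 secs]",
                "[Full GC 4177280K->4081390K(4177280K), 16.2954960 secs]",
                "[Full GC 4177279K-4081951K(4177280K), 16.2857450 secs]");
        System.out.printf("Full GC pause total: %.3f secs%n", totalFullGcSeconds(sample));
    }
}
```

Each Full GC in the snippet above takes a little over 16 seconds and
frees only a few hundred kilobytes, so the JVM spends nearly all of its
time collecting.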


-Original Message-
From: Caldarale, Charles R [mailto:chuck.caldar...@unisys.com] 
Sent: donderdag 19 februari 2009 16:22
To: Tomcat Users List
Subject: RE: Fatal error: Cleaner terminated abnormally


 From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
 Subject: Re: Fatal error: Cleaner terminated abnormally

 By trapping the exit call using security manager we hope to prevent
 Tomcat from closing down on a cleaner termination.

This is not likely to work, since the Cleaner is running this code as a
privileged operation; if regular applications could trap those, I think
there would be some serious security holes.

 Not sure what the side  effects would be to keep running
 after a cleaner terminates (any idea).

The thread doing the System.exit() call is the reference handler; the
JVM will not function properly if it's not running.  The exception
should have been logged and ignored, not result in JVM termination, but
I suspect it will be difficult to convince Sun of that at this point.

 I forgot to say thanks for the response guys. Especially
 yours Chris, it was very helpful.

Odd, because Chris didn't participate in this thread...

 - Chuck



Re: Fatal error: Cleaner terminated abnormally

2009-02-19 Thread Taylan Develioglu
We're trying a workaround now.

By trapping the exit call using security manager we hope to prevent
Tomcat from closing down on a cleaner termination.

Not sure what the side effects of continuing to run after a cleaner
terminates would be (any ideas?). Keeping fingers crossed.
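
For reference, a minimal sketch of the exit-trapping idea. The class below
is a hypothetical stand-in, not our actual security manager: it vetoes
checkExit and permits everything else. Note that SecurityManager is
deprecated in current JDKs, and recent releases refuse to install one at
runtime, which is why main() guards the call.

```java
@SuppressWarnings("removal")
class ExitTrapSketch {
    /** Rejects every System.exit() / Runtime.exit() call and allows all other checks. */
    static class ExitTrappingSecurityManager extends SecurityManager {
        @Override
        public void checkExit(int status) {
            throw new SecurityException("Can't call System.exit()");
        }

        @Override
        public void checkPermission(java.security.Permission perm) {
            // Permit everything else; this sketch only cares about exit calls.
        }
    }

    /** Calls the check directly, without installing the manager. */
    static boolean exitIsTrapped() {
        try {
            new ExitTrappingSecurityManager().checkExit(1);
            return false;
        } catch (SecurityException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        try {
            System.setSecurityManager(new ExitTrappingSecurityManager());
        } catch (UnsupportedOperationException e) {
            System.out.println("This JDK does not allow installing a security manager");
            return;
        }
        try {
            System.exit(1); // vetoed by checkExit, so the JVM keeps running
        } catch (SecurityException e) {
            System.out.println("exit trapped: " + e.getMessage());
        }
    }
}
```

This only stops the exit call itself; whatever made the caller want to
shut down is still there afterwards.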

I forgot to say thanks for the response guys. Especially yours Chris, it
was very helpful.

Rgds,

T




Re: Fatal error: Cleaner terminated abnormally

2009-02-19 Thread Taylan Develioglu
This is bad news, but it was a longshot to begin with.

I submitted a bug report which is under review now.

And apologies for the name mixup. Chuck is obviously a much prettier name :)



Caldarale, Charles R wrote:
 From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
 Subject: Re: Fatal error: Cleaner terminated abnormally

 By trapping the exit call using security manager we hope to prevent
 Tomcat from closing down on a cleaner termination.
 

 This is not likely to work, since the Cleaner is running this code as a 
 privileged operation; if regular applications could trap those, I think there 
 would be some serious security holes.

   
 Not sure what the side  effects would be to keep running
 after a cleaner terminates (any idea).
 

 The thread doing the System.exit() call is the reference handler; the JVM 
 will not function properly if it's not running.  The exception should have 
 been logged and ignored, not result in JVM termination, but I suspect it will 
 be difficult to convince Sun of that at this point.

   
 I forgot to say thanks for the response guys. Especially
 yours Chris, it was very helpful.
 

 Odd, because Chris didn't participate in this thread...

  - Chuck





Fatal error: Cleaner terminated abnormally

2009-02-17 Thread Taylan Develioglu
Hi Guys,

Our application is a servlet running in a container in Tomcat
standalone. It uses the following NIO connector definition:

<Connector port="80"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="65000" keepAliveTimeout="1"
           maxKeepAliveRequests="1000"
           redirectPort="443" maxThreads="2000"/>

Lately Tomcat has been hitting a fatal error, apparently related to GC,
that causes it to stop and unload. I hoped you could give some advice
on it.

I'm still unclear on what is causing the cleaner to terminate, but I
guess that's more of a question for the Java forums (I cannot find
anything related to Tomcat when I cross-reference).

Following the GC trail, it looks like an OOM situation (maybe a memory
leak in our application; our heap size is 4GB). Is it normal behavior
for Tomcat to destroy itself like this?
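
To put a number on the "GC trail", the -verbose:gc lines in the
catalina.out snippet further down can be read as
before->after(capacity). The following is only a rough sketch for
pulling the post-collection occupancy out of such a line; the class and
method names (GcLineSketch, occupancyAfter) are made up for this
example, and the pattern assumes the simple "[GC ...]" form shown in
the snippet.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLineSketch {
    // Matches the start of "[GC 4052829K->3924296K(4177280K), 0.0519680 secs]".
    // The ">" is optional because some archived copies of the log have lost it.
    private static final Pattern GC_LINE =
            Pattern.compile("\\[GC (\\d+)K->?(\\d+)K\\((\\d+)K\\)");

    /**
     * Returns the heap occupancy after the collection as a fraction of the
     * reported capacity, or -1 if the line is not in the expected form.
     */
    static double occupancyAfter(String line) {
        Matcher m = GC_LINE.matcher(line);
        if (!m.find()) {
            return -1.0;
        }
        long afterKb = Long.parseLong(m.group(2));
        long capacityKb = Long.parseLong(m.group(3));
        return (double) afterKb / capacityKb;
    }

    public static void main(String[] args) {
        String line = "[GC 4052829K->3924296K(4177280K), 0.0519680 secs]";
        System.out.printf("occupancy after GC: %.1f%%%n", 100 * occupancyAfter(line));
    }
}
```

For the first line of the snippet this comes out at roughly 94% of the
reported capacity still in use after the collection, which is why the
log reads like a heap that is nearly full.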

Has anyone experienced a similar problem before? What are usual causes
for Tomcat to stop like this?

*Any* advice or feedback is welcome. Either way, thanks in advance.

Debian 4.0
Tomcat 6.0.18
Sun JDK 1.6.0_11

We use the following java options:

OPTS=
 -verbose:gc
 -Dsun.net.inetaddr.ttl=60
 -Dfile.encoding=UTF-8
 -Djava.io.tmpdir=$TMP_DIR
 -Djava.library.path=/usr/local/lib
 -Djava.endorsed.dirs=$CATALINA_BASE/endorsed
 -Dcatalina.base=$CATALINA_BASE
 -Dcatalina.home=$CATALINA_HOME
 -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
 -Djava.util.logging.config.file=$CATALINA_BASE/conf/logging.properties
 -XX:+UseConcMarkSweepGC
 -XX:+UseParNewGC
 -XX:+CMSIncrementalMode
 -Xms4096M
 -Xmx4096M
 -Xss128k
 -XX:PermSize=256M
 -XX:MaxPermSize=256M


--- catalina.out snippet 

[GC 4052829K->3924296K(4177280K), 0.0519680 secs]
[GC 4060616K->3924100K(4177280K), 0.1517880 secs]
[GC 4060420K->3926867K(4177280K), 0.0883940 secs]
[GC 4062488K->3931589K(4177280K), 0.1008470 secs]
[GC 4067906K->3935097K(4177280K), 0.0931530 secs]
[GC 4071417K->3934946K(4177280K), 0.0787300 secs]
[GC 4029027K(4177280K), 0.1941170 secs]
java.lang.Error: Cleaner terminated abnormally
at sun.misc.Cleaner$1.run(Cleaner.java:130)
at java.security.AccessController.doPrivileged(Native Method)
at sun.misc.Cleaner.clean(Cleaner.java:127)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:124)
Caused by: java.lang.Error: java.io.IOException: Broken pipe
at sun.nio.ch.Util$SelectorWrapper$Closer.run(Util.java:97)
at sun.misc.Cleaner.clean(Cleaner.java:125)
... 1 more
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.EPollArrayWrapper.interrupt(Native Method)
at sun.nio.ch.EPollArrayWrapper.interrupt(EPollArrayWrapper.java:242)
at sun.nio.ch.EPollSelectorImpl.wakeup(EPollSelectorImpl.java:170)
at sun.nio.ch.SelectorImpl.implCloseSelector(SelectorImpl.java:92)
at java.nio.channels.spi.AbstractSelector.close(AbstractSelector.java:91)
at sun.nio.ch.Util$SelectorWrapper$Closer.run(Util.java:95)
... 2 more
Feb 17, 2009 12:10:38 AM org.apache.coyote.http11.Http11NioProtocol pause
INFO: Pausing Coyote HTTP/1.1 on http-80
Feb 17, 2009 12:10:38 AM org.apache.coyote.http11.Http11AprProtocol pause
INFO: Pausing Coyote HTTP/1.1 on http-443
Feb 17, 2009 12:10:38 AM org.apache.coyote.ajp.AjpAprProtocol pause
INFO: Pausing Coyote AJP/1.3 on ajp-8009
[GC 4071265K->3937784K(4177280K), 0.0921220 secs]
Feb 17, 2009 12:10:39 AM org.apache.catalina.core.StandardService stop
INFO: Stopping service Catalina
Feb 17, 2009 12:10:39 AM org.apache.catalina.core.StandardWrapper unload
INFO: Waiting for 28,017 instance(s) to be deallocated
Feb 17, 2009 12:10:41 AM org.apache.catalina.core.StandardWrapper unload
INFO: Waiting for 27,669 instance(s) to be deallocated
Feb 17, 2009 12:10:42 AM org.apache.catalina.core.StandardWrapper unload
INFO: Waiting for 27,666 instance(s) to be deallocated
Feb 17, 2009 12:10:43 AM org.apache.catalina.core.StandardWrapper unload
INFO: Waiting for 3 instance(s) to be deallocated
Feb 17, 2009 12:10:44 AM org.apache.catalina.core.StandardWrapper unload
INFO: Waiting for 3 instance(s) to be deallocated
Feb 17, 2009 12:10:45 AM org.apache.catalina.core.StandardWrapper unload
INFO: Waiting for 3 instance(s) to be deallocated
360358820 [SocketConnectorIoProcessor-0.0] null org.apache.mina.common.support.DefaultExceptionMonitor - Unexpected exception.
java.lang.NullPointerException
at org.apache.mina.common.ByteBuffer.allocate(ByteBuffer.java:225)
at org.apache.mina.common.ByteBuffer.allocate(ByteBuffer.java:208)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.read(SocketIoProcessor.java:210)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.process(SocketIoProcessor.java:198)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.access$400(SocketIoProcessor.java:45)
at org.apache.mina.transport.socket.nio.SocketIoProcessor$Worker.run(SocketIoProcessor.java:485)
at

RE: Fatal error: Cleaner terminated abnormally

2009-02-17 Thread Taylan Develioglu

Sadly, there is no mention of an NIO-related fix in the 6u12 release
notes. That is a bit of a bummer, as we were hoping to build a Comet
implementation soon.

The native/APR connector looks like it could replace NIO for us, but
after searching I could not find anything conclusive about its
scalability and performance compared to NIO. Opinions on native vs. NIO
in the discussions I have found seem to be divided.

I'm also not sure whether the native/APR implementation is completely
separate from the NIO API.

Does anyone know of any downsides or pitfalls I should look out for when
using native/APR?

As always, any comment is appreciated.

- Taylan
-----Original Message-----
From: Caldarale, Charles R [mailto:chuck.caldar...@unisys.com]
Sent: Tuesday, 17 February 2009 16:36
To: Tomcat Users List
Subject: RE: Fatal error: Cleaner terminated abnormally


> From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
> Subject: Fatal error: Cleaner terminated abnormally
>
> Lately we've been experiencing a fatal error, related to gc,
> with Tomcat that causes it to stop and unload

It's not really a GC problem - rather a silly bug in NIO.  You might try
the standard HTTP connector to avoid the problem.  Sun seems to be
continually fixing NIO, so there may be something for this in 6u12, if
you want to keep using the NIO connector.

> I'm still unclear on what is causing the cleaner to terminate

The Cleaner terminates if the run() method of the registered object
throws *any* kind of exception - and then takes the entire JVM down with
it, via a System.exit() call (bloody brilliant, that one).  In this
case, the NIO Selector Closer object didn't like the fact that its peer
had gone away, and puked.  Not quite as robust as one might hope.
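
The behaviour described above can be imitated in a few lines. This is
only a sketch to illustrate the failure mode: CleanerSketch, clean()
and the exit hook are made-up names, not the JDK-internal
sun.misc.Cleaner code, and the hook stands in for System.exit() so that
the demo does not actually kill the JVM.

```java
import java.io.IOException;
import java.util.function.IntConsumer;

class CleanerSketch {

    /**
     * Hypothetical stand-in for the cleaner logic: runs the registered thunk
     * and treats ANY Throwable as fatal by invoking the exit hook with
     * status 1. Returns true if the thunk completed normally.
     */
    static boolean clean(Runnable thunk, IntConsumer exitHook) {
        try {
            thunk.run();
            return true;
        } catch (Throwable t) {
            // Same message as in the stack trace from catalina.out.
            new Error("Cleaner terminated abnormally", t).printStackTrace();
            exitHook.accept(1); // the real thing would take the JVM down here
            return false;
        }
    }

    public static void main(String[] args) {
        // A closer that fails the way the selector closer did in the log.
        Runnable failingCloser = () -> {
            throw new Error(new IOException("Broken pipe"));
        };
        // Record the exit status instead of exiting, so the demo keeps running.
        int[] recorded = { -1 };
        boolean ok = clean(failingCloser, status -> recorded[0] = status);
        System.out.println("completed=" + ok + " exitStatus=" + recorded[0]);
    }
}
```

The point of the sketch is that a single failing close, something an
application would normally log and move on from, is enough to reach the
exit path.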

 - Chuck




RE: Fatal error: Cleaner terminated abnormally

2009-02-17 Thread Taylan Develioglu
Yes, 64-bit HotSpot server VM.

-----Original Message-----
From: Mark Thomas [mailto:ma...@apache.org]
Sent: Tuesday, 17 February 2009 16:23
To: Tomcat Users List
Subject: Re: Fatal error: Cleaner terminated abnormally


Taylan Develioglu wrote:
> Following the gc trail, it looks like an oom situation (maybe a mem leak
> in our application, our heapsize is 4GB), is it normal behavior for
> tomcat to destroy itself like this?

Are you on a 64-bit JVM? If not, the process heap is limited to 4GB so
the Java object heap (set with Xmx) needs to allow for this. I would use
3.5GB as a starting point.
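
One way to check which case applies is to ask the running JVM itself.
The snippet below is a small sketch; the class name HeapCheck is made
up, and "sun.arch.data.model" is a HotSpot-specific system property
that may be absent on other JVMs, hence the "unknown" fallback.

```java
class HeapCheck {

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // HotSpot reports "32" or "64" here; other JVMs may not set it at all.
        String dataModel = System.getProperty("sun.arch.data.model", "unknown");
        String arch = System.getProperty("os.arch");
        // Upper bound the JVM will try to use for the Java object heap.
        long maxHeap = Runtime.getRuntime().maxMemory();

        System.out.println("data model: " + dataModel + "-bit (os.arch=" + arch + ")");
        System.out.println("max heap:   " + toMiB(maxHeap) + " MiB");
    }
}
```

Run it with the same options as the Tomcat process to see the limits
that process actually gets.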

Mark





RE: Fatal error: Cleaner terminated abnormally

2009-02-17 Thread Taylan Develioglu
I found bug 4938372, but it didn't seem related to me at the time.

There's a post from Alan Bateman, dated 2007, indicating they'd try
putting the fix in a Java 6 update.

I'll submit a bug report and, in the meantime, explore other options
such as native/APR.
 
-----Original Message-----
From: Caldarale, Charles R [mailto:chuck.caldar...@unisys.com]
Sent: Tuesday, 17 February 2009 23:46
To: Tomcat Users List
Subject: RE: Fatal error: Cleaner terminated abnormally


> From: Filip Hanik - Dev Lists [mailto:devli...@hanik.com]
> Subject: Re: Fatal error: Cleaner terminated abnormally
>
> search the sun database, some results there
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6521677

It's somewhat related, but I don't think it will cover the case reported
here, which looked like a simple socket closure rather than anything to
do with memory mapping of files.

I think a new bug submission is in order (preferably by the OP).

 - Chuck

