Re: Signal 11 crash and mod_jk, mod_jk2, apache2

2004-01-15 Thread Christopher Schultz
All,

> With RH8, I assume that at the very least you'll need to set
> LD_ASSUME_KERNEL.
>
> I would consider that the first course of action, and likely would not
> need to do anything else. I could see hyperthreading being a problem if
> the kernel didn't support it very well. You could try the latest 2.4.x
> kernel.

If LD_ASSUME_KERNEL doesn't help, try disabling both SMP and
hyperthreading at the same time. I think that's your next most likely fix.
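
As a rough illustration of the LD_ASSUME_KERNEL suggestion, the variable only needs to be present in the environment of the process that starts the JVM. The sketch below builds such an environment in Python. The helper name `env_with_assume_kernel` and the value "2.4.1" are assumptions made for illustration; the thread does not say which value to use.

```python
import os


def env_with_assume_kernel(base_env, kernel_version):
    """Return a copy of base_env with LD_ASSUME_KERNEL set.

    The original mapping is left untouched, so the setting only
    affects a child process that is given the returned copy.
    """
    env = dict(base_env)
    env["LD_ASSUME_KERNEL"] = kernel_version
    return env


# "2.4.1" is only a placeholder value (assumption); the thread does not
# say which kernel ABI version to request.
tomcat_env = env_with_assume_kernel(os.environ, "2.4.1")

# To start Tomcat with this environment, something like the following
# could be used (the script path is an assumption, not from the thread):
#   import subprocess
#   subprocess.call(["/opt/tomcat/tomcat4.1/bin/catalina.sh", "start"],
#                   env=tomcat_env)
```

Passing a copy of the environment keeps the setting scoped to Tomcat, so testing with and without the variable is just a matter of restarting with a different environment.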

Just a note: I've had big, beefy servers die with SIG11 on Linux before
(of course, that was back when a dual Athlon 1GHz was considered 'beefy'
:). Anyway, we tried everything, including hiring BEA consultants for a
bazillion dollars per hour to help us tune both WebLogic and our VM.

It turned out to be bad hardware. We had six identical machines and two 
of 'em kept crapping out. They just sucked. The only solution was to 
send them back to the manufacturer and ask for more. It turns out that 
not only did those two (production!) machines suck, but two QA machines 
and one dev machine (all the same) sucked, too. They all died when we 
put them under high load. They seemed to do okay under development load 
(about zero).

I'll never buy a machine from Penguin Computing again.

Just wanted to mention that sometimes it's not the software's fault. Do 
you have other similar machines that you can try this on?

-chris

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


RE: Signal 11 crash and mod_jk, mod_jk2, apache2

2004-01-15 Thread Ralph Einfeldt
An additional remark,

hardware shops are sometimes quite sloppy with the RAM they
use. Even if you order specific RAM you may get something
different. (E.g., if you order Infineon RAM, you sometimes
just get no-name modules with Infineon chips, which is a
completely different thing.) This caused us all kinds of
trouble with stability. (Certainly we verify what we get,
but sometimes there was not enough time to wait for the
replacement.)

 -Original Message-
 From: Christopher Schultz [mailto:[EMAIL PROTECTED]
 Sent: Thursday, January 15, 2004 4:44 PM
 To: Tomcat Users List
 Subject: Re: Signal 11 crash and mod_jk, mod_jk2, apache2
 




Re: Signal 11 crash and mod_jk, mod_jk2, apache2

2004-01-15 Thread Oscar Carrillo
I thought later about mentioning this, and that's exactly right.

I always use the free memtest86 http://memtest86.com to test the
memory sub-system before deploying.

Sometimes turning down the FSB and the memory bus speed makes the crashes
go away.

Oscar




Signal 11 crash and mod_jk, mod_jk2, apache2

2004-01-14 Thread Peter Burkholder
I've seen similar threads but not a resolution, so I'll raise this issue
again.

I'm moving our web server and web apps to a new machine, a dual Xeon Redhat
8 box.  Tomcat 5.0.16 and Tomcat 4.1.29 both handle loads well when using
the Tomcat HTTP server.  But I need to connect Tomcat to an Apache
front-end, currently 2.0.28.  When I use either mod_jk or mod_jk2, the JVM
crashes after 1970-1980 requests with Signal 11 and this in
logs/catalina.out:

JVMDG217: Dump Handler is Processing a Signal - Please Wait.
JVMDG303: JVM Requesting Java core file
JVMDG304: Java core file written to
/ucar/dpc/aegean/www/web/www.dlese.org/tomcat/javacore.20040113.142044.6635.txt
JVMDG215: Dump Handler has Processed Exception Signal 11.

and this exception in the java core:

2XMEXCPINFO  JVM Exception 0x2 (subcode 0x0) occurred in thread
TP-Processor9 (TID:0x10174AB8)

I've seen suggested avenues like:
# kernel rebuilds (some 2.4.18-2.4.2? kernels had bad thread handling),
# switching JVMs, e.g., Sun, IBM, BEA JRockit,
# cleaning out the work directory,
# using or not using LD_ASSUME_KERNEL,
# disabling hyperthreading,
# disabling SMP,
# adjusting heap and stack sizes,
# rebuilding the mod_jk or mod_jk2 connectors,
# expanding user ulimits.
That makes for a huge matrix of possibilities.  Any hints on which paths
might be more fruitful, or on others to consider?  Are more data needed to
help isolate this problem?
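
To give a feel for how quickly that matrix grows, here is a small Python sketch that enumerates just five of the avenues as on/off or either/or choices. The axis names and the values per axis are illustrative assumptions drawn from the list above, not a complete or recommended test plan.

```python
from itertools import product

# Axes taken from the list above; the concrete values per axis are
# illustrative assumptions, not a complete or recommended set.
AXES = {
    "jvm": ["Sun", "IBM", "BEA JRockit"],
    "ld_assume_kernel": ["set", "unset"],
    "hyperthreading": ["on", "off"],
    "smp": ["on", "off"],
    "connector": ["mod_jk", "mod_jk2"],
}


def build_matrix(axes):
    """Return every combination of axis values as a list of dicts."""
    names = list(axes)
    combos = product(*(axes[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


matrix = build_matrix(AXES)
print(len(matrix), "combinations")
```

Even this reduced set gives 48 combinations, which is why a hint about the most likely cause is worth more than testing exhaustively.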

Thanks,

Peter


MISC DETAILS:
Kernel: Redhat 2.4.20-27.8smp 
JVM:  IBMJava2-141
glibc Version: 2.3.2
Memory: 2 GB
JVM Memory: Behavior is the same with either -Xms250m or -Xms512m
CPUs: 2, w/ hyperthreading (looks like 4 CPUs)

CORE EXCERPTS:  
1HPUSERLIMITS  User Limits (in bytes except for NOFILE and NPROC) -
NULL   ---
2HPUSERLIMIT   RLIMIT_FSIZE   : infinity
2HPUSERLIMIT   RLIMIT_DATA    : infinity
2HPUSERLIMIT   RLIMIT_STACK   : 2093056
2HPUSERLIMIT   RLIMIT_CORE    : 0
2HPUSERLIMIT   RLIMIT_NOFILE  : 1024
2HPUSERLIMIT   RLIMIT_NPROC   : 7168
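
For anyone comparing these limits across machines, the 2HPUSERLIMIT lines are regular enough to pull apart with a short script. The following Python sketch is one way to do it; the function name `parse_user_limits` is made up for illustration, and the line format is assumed from this excerpt only, not from any javacore specification.

```python
import re

# Matches lines such as "2HPUSERLIMIT   RLIMIT_STACK   : 2093056".
# The spacing around the colon varies in the excerpt, so it is optional.
LIMIT_LINE = re.compile(r"^\s*2HPUSERLIMIT\s+(RLIMIT_\w+)\s*:\s*(\S+)")


def parse_user_limits(text):
    """Map each RLIMIT name to an int, or None when it is 'infinity'."""
    limits = {}
    for line in text.splitlines():
        match = LIMIT_LINE.match(line)
        if match:
            name, value = match.groups()
            limits[name] = None if value == "infinity" else int(value)
    return limits


excerpt = """\
2HPUSERLIMIT   RLIMIT_FSIZE   : infinity
2HPUSERLIMIT   RLIMIT_DATA: infinity
2HPUSERLIMIT   RLIMIT_STACK   : 2093056
2HPUSERLIMIT   RLIMIT_CORE: 0
2HPUSERLIMIT   RLIMIT_NOFILE  : 1024
2HPUSERLIMIT   RLIMIT_NPROC   : 7168
"""
limits = parse_user_limits(excerpt)
print(limits["RLIMIT_STACK"] // 1024, "KB stack")
```

On the excerpt above this reports a stack limit of 2044 KB. RLIMIT_CORE is 0, which would normally mean the OS writes no core file when the process dies.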

2HPENVVAR  IBM_JAVA_COMMAND_LINE=/opt/java/java/bin/java -Xmx512m
    -Djava.endorsed.dirs=/opt/tomcat/tomcat4.1/common/endorsed
    -classpath /opt/java/java/lib/tools.jar:/opt/tomcat/tomcat4.1/bin/bootstrap.jar
    -Dcatalina.base=/www/web//www.dlese.org/tomcat
    -Dcatalina.home=/opt/tomcat/tomcat4.1
    -Djava.io.tmpdir=/www/web//www.dlese.org/tomcat/temp
    org.apache.catalina.startup.Bootstrap
    -config /www/web//www.dlese.org/tomcat/conf/dds.xml start

--
Peter Burkholder, System Administrator
Digital Library for Earth System Education (DLESE® -- http://www.dlese.org)
[EMAIL PROTECTED]
DLESE Program Center (DPC)   ~~~  ~~     __o
UCAR/DPC, P.O. Box 3000   Ph) +1-303-497-2663  ~~~   ~~_`\,_
Boulder, CO 80307-3000    Fx) +1 303-497-8336  ~~~    (*)/ (*)




RE: Signal 11 crash and mod_jk, mod_jk2, apache2

2004-01-14 Thread David Short
Hi Peter,

It sounds like you've got Tomcat 5.0.16 running in-process with Apache2.
I'm trying to get this configuration working.  Would you mind posting your
workers.properties file?

Thanks,

Dave

-Original Message-
From: Peter Burkholder [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 14, 2004 8:29 AM
To: [EMAIL PROTECTED]
Subject: Signal 11 crash and mod_jk, mod_jk2, apache2





Re: Signal 11 crash and mod_jk, mod_jk2, apache2

2004-01-14 Thread Peter Burkholder
On Wed, Jan 14, 2004 at 09:31:42AM -0800, David Short wrote:
> It sounds like you've got Tomcat 5.0.16 running in-process with Apache2.
> I'm trying to get this configuration working.  Would you mind posting your
> workers.properties file?

No, I'm not running Tomcat in-process.  Sorry if I wasn't clear.

P.
 



Re: Signal 11 crash and mod_jk, mod_jk2, apache2

2004-01-14 Thread Oscar Carrillo
With RH8, I assume that at the very least you'll need to set
LD_ASSUME_KERNEL.

I would consider that the first course of action, and likely would not
need to do anything else. I could see hyperthreading being a problem if the
kernel didn't support it very well. You could try the latest 2.4.x kernel.

Oscar
