RE: JK 1.2.14.1 SIG BUS Error on Solaris 9

2005-09-09 Thread Guernsey, Byron (GE Consumer & Industrial)

Thanks Dave, I'll second the report and confirm that it is an issue on
Solaris as well.

Byron
 

-----Original Message-----
From: David Rees [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, September 06, 2005 2:30 PM
To: Tomcat Users List
Subject: Re: JK 1.2.14.1 SIG BUS Error on Solaris 9

On 9/6/05, David Rees [EMAIL PROTECTED] wrote:
>
> That is the exact same core dump and back trace that I reported a
> while back when running on SGI Irix.  Could be a 64bit or big endian
> problem?
>
> http://marc.theaimsgroup.com/?l=tomcat-dev&m=112501659012202&w=2

I've opened a bug for this issue:
http://issues.apache.org/bugzilla/show_bug.cgi?id=36525

-Dave

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]







Re: JK 1.2.14.1 SIG BUS Error on Solaris 9

2005-09-06 Thread David Rees
On 9/2/05, Guernsey, Byron (GE Consumer & Industrial)
[EMAIL PROTECTED] wrote:
> I apologize for adding to this, but I'm hoping to jar someone's memory.
> I gdb'ed the process now and the BUS error occurs in:
<snip>
> Program received signal SIGBUS, Bus error.
> 0xfdfb4208 in service (e=0xf7c90, s=0xfe501848, l=0x118b40,
> is_error=0xfe500840) at jk_lb_worker.c:605
> jk_lb_worker.c:605: No such file or directory.
> (gdb) bt
> #0  0xfdfb4208 in service (e=0xf7c90, s=0xfe501848, l=0x118b40,
> is_error=0xfe500840) at jk_lb_worker.c:605

That is the exact same core dump and back trace that I reported a
while back when running on SGI Irix.  Could be a 64bit or big endian
problem?

http://marc.theaimsgroup.com/?l=tomcat-dev&m=112501659012202&w=2

-Dave
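
On the "64bit" guess: if the crash really is a 64-bit value being accessed at an address that is not 8-byte aligned, one portable way to touch such a value is to copy its bytes with memcpy instead of dereferencing a pointer to it. The following is only a sketch under that assumption. The helper names add_u64_unaligned and load_u64_unaligned are made up for illustration and are not part of mod_jk.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not mod_jk code: add delta to a 64-bit counter
 * stored at p, where p may be misaligned. memcpy copies the bytes
 * without assuming any alignment, so it does not trap on
 * strict-alignment CPUs. */
static void add_u64_unaligned(void *p, uint64_t delta)
{
    uint64_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* read the current value byte-wise */
    v += delta;
    memcpy(p, &v, sizeof v);   /* write the updated value back */
}

/* Hypothetical helper: read the counter back the same way. */
static uint64_t load_u64_unaligned(const void *p)
{
    uint64_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Note that this read-modify-write is not atomic, so it would not by itself be safe for counters shared between processes. Fixing the layout so the field is aligned in the first place is usually the cleaner route.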

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: JK 1.2.14.1 SIG BUS Error on Solaris 9

2005-09-06 Thread David Rees
On 9/6/05, David Rees [EMAIL PROTECTED] wrote:
>
> That is the exact same core dump and back trace that I reported a
> while back when running on SGI Irix.  Could be a 64bit or big endian
> problem?
>
> http://marc.theaimsgroup.com/?l=tomcat-dev&m=112501659012202&w=2

I've opened a bug for this issue:
http://issues.apache.org/bugzilla/show_bug.cgi?id=36525

-Dave




JK 1.2.14.1 SIG BUS Error on Solaris 9

2005-09-02 Thread Guernsey, Byron (GE Consumer & Industrial)

I tried to upgrade our JK modules from 1.2.12 to 1.2.14.1 and now every
Apache process crashes while processing a request with:

[Fri Sep 02 14:23:54 2005] [notice] child pid 11079 exit signal Bus
error (10)

This appears in the logs for every request.

The config works fine with JK 1.2.12 (I can swap it in and things work)
and looks like:

==
httpd.conf:
==
JkShmSize 60
JkShmFile logs/jk1.shm
JkWorkersFile conf/workers.properties
JkMountFile conf/uriworkermap.properties
JkLogFile logs/mod_jk_log
JkLogLevel info
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
JkRequestLogFormat "%w %V:%p%U%q %s %T"

<Location /jkstatus/>
JkMount jkstatus
Order deny,allow
Deny from all
Allow from 127.
</Location>

==
workers.properties:
==
ps=/
worker.list=jkstatus,CAMCentral_lb
worker.jkstatus.type=status
worker.CAMCentral_lb.type=lb
worker.CAMCentral_lb.balance_workers=ap1lnx60,ap1lnx61

worker.ap1lnx60.type=ajp13
worker.ap1lnx60.host=3.130.232.239
worker.ap1lnx60.port=15753

worker.ap1lnx61.type=ajp13
worker.ap1lnx61.host=3.130.233.24
worker.ap1lnx61.port=15753

==
uriworkermap.properties
==
/CAMCentral/*=CAMCentral_lb
/CAMCentral=CAMCentral_lb

Apache version:

Server version: Apache/2.0.52
Server built:   Oct 28 2004 12:24:42
Server's Module Magic Number: 20020903:9
Architecture:   32-bit
Server compiled with
 -D APACHE_MPM_DIR=server/mpm/worker
 -D APR_HAS_MMAP
 -D APR_USE_FCNTL_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D HTTPD_ROOT=/usr/local/apache2
 -D SUEXEC_BIN=/usr/local/apache2/bin/suexec
 -D DEFAULT_SCOREBOARD=logs/apache_runtime_status
 -D DEFAULT_ERRORLOG=logs/error_log
 -D AP_TYPES_CONFIG_FILE=conf/mime.types
 -D SERVER_CONFIG_FILE=conf/httpd.conf

What am I missing?  I built jk with configure
--with-apxs=/usr/local/apache2/bin/apxs;make

I've had no problems building/running in the past with mod_jk 1.2.12.  I
thought it must have been something in my config, but apparently not?

Byron







RE: JK 1.2.14.1 SIG BUS Error on Solaris 9

2005-09-02 Thread Guernsey, Byron (GE Consumer & Industrial)
Some additional information: I ran truss on httpd -X to get this trace:

...
11438:  lwp_cond_wait(0xFEE434E8, 0xFEE434F8, 0xFEE3CD80) (sleeping...)
11438:  accept(3, 0x001EAAF4, 0x001EAB04, 1)= 14
11438:  lwp_sema_wait(0xFE909E60)   = 0
11438:  lwp_sema_post(0xFE909E60)   = 0
11438:  lwp_mutex_lock(0xFEE434F8)  = 0
11438:  lwp_mutex_wakeup(0xFEE434F8)= 0
11438:  lwp_sema_post(0xFE909E60)   = 0
11438:  lwp_sema_wait(0xFE909E60)   = 0
11438:  lwp_mutex_lock(0xFEE434F8)  = 0
11438:  lwp_mutex_wakeup(0xFEE434F8)= 0
11438:  fcntl(14, F_GETFL, 0x)  = 2
11438:  fstat64(14, 0xFE909630) = 0
11438:  getsockopt(14, 65535, 8192, 0xFE909730, 0xFE909728, 44) = 0
11438:  fstat64(14, 0xFE909630) = 0
11438:  getsockopt(14, 65535, 8192, 0xFE909730, 0xFE90972C, 44) = 0
11438:  setsockopt(14, 65535, 8192, 0xFE909730, 4, 44)  = 0
11438:  fcntl(14, F_SETFL, 0x0082)  = 0
11438:  read(14,  G E T   / C A M C e n t.., 8000)= 114
11438:  time()  = 1125691419
11438:  time()  = 1125691419
11438:  brk(0x001FC478) = 0
11438:  brk(0x001FE478) = 0
11438:  brk(0x001FE478) = 0
11438:  brk(0x00200478) = 0
11438:  brk(0x00200478) = 0
11438:  brk(0x00202478) = 0
11438:  brk(0x00202478) = 0
11438:  brk(0x00204478) = 0
11438:  so_socket(2, 2, 0, , 1)   = 15
11438:  setsockopt(15, 6, 1, 0xFE908574, 4, 1)  = 0
11438:  setsockopt(15, 65535, 128, 0xFE908578, 8, 1)= 0
11438:  connect(15, 0x000F6CB0, 16, 1)  = 0
11438:  fcntl(15, F_GETFL, 0x)  = 2
11438:  fstat64(15, 0xFE908290) = 0
11438:  getsockopt(15, 65535, 8192, 0xFE908390, 0xFE908388, 0) = 0
11438:  fstat64(15, 0xFE908290) = 0
11438:  getsockopt(15, 65535, 8192, 0xFE908390, 0xFE90838C, 0) = 0
11438:  setsockopt(15, 65535, 8192, 0xFE908390, 4, 0)   = 0
11438:  fcntl(15, F_SETFL, 0x0002)  = 0
11438:  write(15, 12 4\0AD0202\0\b H T T P.., 177)= 177
11438:  read(15,  A B\0A4, 4) = 4
11438:  read(15, 04\0C8\0\0\0\003\0\n S e.., 164) = 164
11438:  read(15,  A B\0D7, 4) = 4
11438:  read(15, 03\0D3\r\n\r\n\r\n  h t.., 215) = 215
11438:  brk(0x00204478) = 0
11438:  brk(0x00206478) = 0
11438:  read(15,  A B\002, 4) = 4
11438:  read(15, 0501, 2) = 2
11438:  brk(0x00206478) = 0
11438:  brk(0x00208478) = 0
11438:  brk(0x00208478) = 0
11438:  brk(0x0020A478) = 0
11438:  writev(14, 0xFE907E38, 2)   = 583
11438:  Incurred fault #5, FLTACCESS  %pc = 0xFDFC4208
11438:siginfo: SIGBUS BUS_ADRALN addr=0xFE0E0234
11438:  Received signal #10, SIGBUS [default]
11438:siginfo: SIGBUS BUS_ADRALN addr=0xFE0E0234
11438:  *** process killed ***
 
I hope I don't have to build a debug version of apache/mod_jk and gdb it
to find this issue. 

Help?

Byron
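
For anyone reading the truss output: BUS_ADRALN means the CPU refused a memory access because the address was not aligned for the size of the access. A minimal sketch in C, using only the fault address from the siginfo line, checks which power-of-two alignments that address satisfies. The helper names is_aligned and max_alignment are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when addr is a multiple of align.
 * align must be a nonzero power of two. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: largest power-of-two alignment, capped at 16,
 * that addr satisfies. */
static unsigned max_alignment(uintptr_t addr)
{
    unsigned align = 16;

    while (align > 1 && !is_aligned(addr, align))
        align /= 2;
    return align;
}
```

Calling max_alignment((uintptr_t)0xFE0E0234UL) gives 4, so the address is fine for a 32-bit access but not for a 64-bit one. That is consistent with a 64-bit field sitting on a 4-byte boundary, though the trace alone does not prove it.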


RE: JK 1.2.14.1 SIG BUS Error on Solaris 9

2005-09-02 Thread Guernsey, Byron (GE Consumer & Industrial)
I apologize for adding to this, but I'm hoping to jar someone's memory.
I've now run the process under gdb, and the bus error occurs in:

Starting program: /usr/local/apache2/bin/httpd -X -f
/opt/GEinet/webconfigs/ap_i1/conf/httpd.conf -DMOD_JK -DCGI
[New LWP2]
[New LWP3]
[New LWP4]
[New LWP5]
[New LWP6]
[New LWP7]

Program received signal SIGBUS, Bus error.
[Switching to LWP3]

Program received signal SIGBUS, Bus error.
0xfdfb4208 in service (e=0xf7c90, s=0xfe501848, l=0x118b40,
is_error=0xfe500840) at jk_lb_worker.c:605
jk_lb_worker.c:605: No such file or directory.
(gdb) bt
#0  0xfdfb4208 in service (e=0xf7c90, s=0xfe501848, l=0x118b40,
is_error=0xfe500840) at jk_lb_worker.c:605
#1  0xfdfa9a7c in jk_handler (r=0x1feac0) at mod_jk.c:1889
#2  0x33f8c in ap_run_handler (r=0x1feac0) at config.c:151
#3  0x34588 in ap_invoke_handler (r=0x1feac0) at config.c:363
#4  0x2f82c in ap_process_request (r=0x1feac0) at http_request.c:246
#5  0x2aab4 in ap_process_http_connection (c=0x1f2b60) at
http_core.c:250
#6  0x3e660 in ap_run_process_connection (c=0x1f2b60) at connection.c:42
#7  0x3e95c in ap_process_connection (c=0x1f2b60, csd=0x1f2a90) at
connection.c:175
#8  0x30a5c in process_socket (p=0x1f2a58, sock=0x1f2a90,
my_child_num=0, my_thread_num=4, bucket_alloc=0x1fca80) at worker.c:520
#9  0x310f4 in worker_thread (thd=0x120c10, dummy=0x1f2a58) at
worker.c:834
#10 0xff1144c4 in dummy_worker (opaque=0x74000) at thread.c:88

And jk_lb_worker.c:605 looks like:

        if (rec && rec != prec) {
            int is_service_error = JK_HTTP_OK;
            int service_stat = JK_FALSE;
            jk_endpoint_t *end = NULL;

            s->jvm_route = rec->r;
            rc = rec->w->get_endpoint(rec->w, &end, l);

            if (JK_IS_DEBUG_LEVEL(l))
                jk_log(l, JK_LOG_DEBUG,
                       "service worker=%s jvm_route=%s",
                       rec->s->name, s->jvm_route);
            rec->s->elected++;
            if (rc && end) {
                /* Reset endpoint read and write sizes for
                 * this request.
                 */
                end->rd = end->wr = 0;
                /* Increment the number of workers serving request */
                p->worker->s->busy++;
                if (p->worker->s->busy > p->worker->s->max_busy)
                    p->worker->s->max_busy = p->worker->s->busy;
                rec->s->busy++;
                if (rec->s->busy > rec->s->max_busy)
                    rec->s->max_busy = rec->s->busy;
                service_stat = end->service(end, s, l,
                                            &is_service_error);
                /* Update partial reads and writes if any */
605:            rec->s->readed += end->rd;
                rec->s->transferred += end->wr;
                end->done(end, l);

From the debugger:

(gdb) print *end
$6 = {rd = 393, wr = 177, endpoint_private = 0x14e990, service =
0xfdfc1c34 <ajp_service>, done = 0xfdfc309c <ajp_done>}

(gdb) print *rec->s
$5 = {id = 2, busy = 1, max_busy = 1, name = "ap1lnx60", '\000' <repeats
55 times>, domain = '\000' <repeats 63 times>,
  redirect = '\000' <repeats 63 times>, is_disabled = 0, is_stopped = 0,
is_busy = 0, lb_factor = 1, lb_value = 0,
  in_error_state = 0, in_recovering = 0, sticky_session = 0,
sticky_session_force = 0, recover_wait_time = 0, retries = 0,
  error_time = 0, readed = 0, transferred = 0, elected = 1, errors = 0}

(gdb) print *s
$7 = {ws_private = 0xfe5018e8, pool = 0xfe5018e8, method = 0x1ff838
"GET", protocol = 0x1ff888 "HTTP/1.0",
  req_uri = 0x1ff870 "/CMSCentral/Dispatcher", remote_addr = 0x1f2ee0
"64.37.211.156", remote_host = 0x0, remote_user = 0x0,
  auth_type = 0x0, query_string = 0x0, server_name = 0xf5030 "hostname",
server_port = 80,
  server_software = 0x118cf8 "Apache/2.0.52 (Unix) mod_jk/1.2.14 DAV/2",
content_length = 0, is_chunked = 0, no_more_chunks = 0,
  content_read = 0, is_ssl = 0, ssl_cert = 0x0, ssl_cert_len = 0,
ssl_cipher = 0x0, ssl_session = 0x0, ssl_key_size = -1,
  headers_names = 0x200248, headers_values = 0x200258, num_headers = 4,
attributes_names = 0x0, attributes_values = 0x0,
  num_attributes = 0, jvm_route = 0xfe0d0140 "ap1lnx60", secret = 0x0,
reco_buf = 0xfe500848, reco_status = 1, retries = 3,
  flush_packets = 0, uw_map = 0x14a818, start_response = 0xfdfa754c
<ws_start_response>, read = 0xfdfa76e4 <ws_read>,
  write = 0xfdfa77ac <ws_write>, flush = 0xfdfa777c <ws_flush>}

Byron
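
Line 605 updates a counter inside a record that lives in the JK shared memory file. One way such an update can fault on SPARC is if the record holds a 64-bit field but starts at an offset that is only 4-byte aligned. The sketch below illustrates that layout problem and the usual remedy of rounding offsets up to 8. It is an assumption for illustration, not the confirmed cause. The struct worker_rec, its field types, and the helpers align_up and readed_offset are all hypothetical and are not taken from the mod_jk source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-worker record kept in shared memory.
 * Field names echo the gdb dump; types and layout are assumptions. */
struct worker_rec {
    int32_t  id;
    int32_t  busy;
    uint64_t readed;       /* 64-bit counter: needs 8-byte alignment on SPARC */
    uint64_t transferred;
};

/* Round n up to the next multiple of align (a nonzero power of two). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Byte offset of the readed field of record `index` when records are
 * laid out back to back after a header of header_size bytes. */
static size_t readed_offset(size_t header_size, size_t index)
{
    return header_size
         + index * sizeof(struct worker_rec)
         + offsetof(struct worker_rec, readed);
}
```

With a 4-byte header, readed_offset(4, 0) % 8 is 4, so the 64-bit counter is misaligned in every record. With the header rounded via align_up(4, 8), the same expression is 0. This assumes the shared memory mapping itself starts on an 8-byte boundary, which holds for page-aligned mappings.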
