RE: JK 1.2.14.1 SIG BUS Error on Solaris 9
Thanks Dave, I'll second the report and confirm that it is an issue on Solaris as well.

Byron

-----Original Message-----
From: David Rees [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, September 06, 2005 2:30 PM
To: Tomcat Users List
Subject: Re: JK 1.2.14.1 SIG BUS Error on Solaris 9

On 9/6/05, David Rees [EMAIL PROTECTED] wrote:
> That is the exact same core dump and back trace that I reported a while back when running on SGI Irix. Could be a 64bit or big endian problem?
>
> http://marc.theaimsgroup.com/?l=tomcat-dev&m=112501659012202&w=2

I've opened a bug for this issue:

http://issues.apache.org/bugzilla/show_bug.cgi?id=36525

-Dave
Re: JK 1.2.14.1 SIG BUS Error on Solaris 9
On 9/2/05, Guernsey, Byron (GE Consumer Industrial) [EMAIL PROTECTED] wrote:
> I apologize for adding to this, but I'm hoping to jar someone's memory. I gdb'ed the process now and the BUS error occurs in:
> <snip>
> Program received signal SIGBUS, Bus error.
> 0xfdfb4208 in service (e=0xf7c90, s=0xfe501848, l=0x118b40, is_error=0xfe500840) at jk_lb_worker.c:605
> jk_lb_worker.c:605: No such file or directory.
> (gdb) bt
> #0 0xfdfb4208 in service (e=0xf7c90, s=0xfe501848, l=0x118b40, is_error=0xfe500840) at jk_lb_worker.c:605

That is the exact same core dump and back trace that I reported a while back when running on SGI Irix. Could be a 64bit or big endian problem?

http://marc.theaimsgroup.com/?l=tomcat-dev&m=112501659012202&w=2

-Dave
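Dave's question about 64-bit versus big-endian can be narrowed down on an affected host with a few lines of C. The following is only a minimal sketch: `is_big_endian` and `pointer_bits` are hypothetical helper names for illustration, not part of mod_jk or Apache, and the result describes however the snippet itself is compiled.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 on a little-endian one.
 * Hypothetical helper: inspects the first byte of a known 32-bit value. */
static int is_big_endian(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Width of a data pointer in bits for this build (e.g. 32 or 64). */
static unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}
```

Built with the same compiler and flags as the module, `pointer_bits()` shows whether the build is 32-bit or 64-bit, and `is_big_endian()` shows the byte order. Neither value proves a cause by itself; they only help compare the failing platforms with the ones where 1.2.14.1 works.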
Re: JK 1.2.14.1 SIG BUS Error on Solaris 9
On 9/6/05, David Rees [EMAIL PROTECTED] wrote:
> That is the exact same core dump and back trace that I reported a while back when running on SGI Irix. Could be a 64bit or big endian problem?
>
> http://marc.theaimsgroup.com/?l=tomcat-dev&m=112501659012202&w=2

I've opened a bug for this issue:

http://issues.apache.org/bugzilla/show_bug.cgi?id=36525

-Dave
JK 1.2.14.1 SIG BUS Error on Solaris 9
I tried to upgrade our JK modules from 1.2.12 to 1.2.14.1 and now every Apache process crashes while processing a request with:

[Fri Sep 02 14:23:54 2005] [notice] child pid 11079 exit signal Bus error (10)

This appears in the logs for every request. The config works fine with JK 1.2.12 (I can swap it in and things work) and looks like:

== httpd.conf: ==
JkShmSize 60
JkShmFile logs/jk1.shm
JkWorkersFile conf/workers.properties
JkMountFile conf/uriworkermap.properties
JkLogFile logs/mod_jk_log
JkLogLevel info
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
JkRequestLogFormat "%w %V:%p%U%q %s %T"
<Location /jkstatus/>
    JkMount jkstatus
    Order deny,allow
    Deny from all
    Allow from 127.
</Location>

== workers.properties: ==
ps=/
worker.list=jkstatus,CAMCentral_lb
worker.jkstatus.type=status
worker.CAMCentral_lb.type=lb
worker.CAMCentral_lb.balance_workers=ap1lnx60,ap1lnx61
worker.ap1lnx60.type=ajp13
worker.ap1lnx60.host=3.130.232.239
worker.ap1lnx60.port=15753
worker.ap1lnx61.type=ajp13
worker.ap1lnx61.host=3.130.233.24
worker.ap1lnx61.port=15753

== uriworkermap.properties ==
/CAMCentral/*=CAMCentral_lb
/CAMCentral=CAMCentral_lb

Apache version:

Server version: Apache/2.0.52
Server built: Oct 28 2004 12:24:42
Server's Module Magic Number: 20020903:9
Architecture: 32-bit
Server compiled with
 -D APACHE_MPM_DIR=server/mpm/worker
 -D APR_HAS_MMAP
 -D APR_USE_FCNTL_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D HTTPD_ROOT=/usr/local/apache2
 -D SUEXEC_BIN=/usr/local/apache2/bin/suexec
 -D DEFAULT_SCOREBOARD=logs/apache_runtime_status
 -D DEFAULT_ERRORLOG=logs/error_log
 -D AP_TYPES_CONFIG_FILE=conf/mime.types
 -D SERVER_CONFIG_FILE=conf/httpd.conf

What am I missing? I built jk with:

configure --with-apxs=/usr/local/apache2/bin/apxs;make

I've had no problems building or running mod_jk 1.2.12 in the past. I thought it must have been something in my config, but apparently not?

Byron
RE: JK 1.2.14.1 SIG BUS Error on Solaris 9
Some additional information: I trussed httpd -X to get this trace:

...
11438: lwp_cond_wait(0xFEE434E8, 0xFEE434F8, 0xFEE3CD80) (sleeping...)
11438: accept(3, 0x001EAAF4, 0x001EAB04, 1) = 14
11438: lwp_sema_wait(0xFE909E60) = 0
11438: lwp_sema_post(0xFE909E60) = 0
11438: lwp_mutex_lock(0xFEE434F8) = 0
11438: lwp_mutex_wakeup(0xFEE434F8) = 0
11438: lwp_sema_post(0xFE909E60) = 0
11438: lwp_sema_wait(0xFE909E60) = 0
11438: lwp_mutex_lock(0xFEE434F8) = 0
11438: lwp_mutex_wakeup(0xFEE434F8) = 0
11438: fcntl(14, F_GETFL, 0x) = 2
11438: fstat64(14, 0xFE909630) = 0
11438: getsockopt(14, 65535, 8192, 0xFE909730, 0xFE909728, 44) = 0
11438: fstat64(14, 0xFE909630) = 0
11438: getsockopt(14, 65535, 8192, 0xFE909730, 0xFE90972C, 44) = 0
11438: setsockopt(14, 65535, 8192, 0xFE909730, 4, 44) = 0
11438: fcntl(14, F_SETFL, 0x0082) = 0
11438: read(14, G E T / C A M C e n t.., 8000) = 114
11438: time() = 1125691419
11438: time() = 1125691419
11438: brk(0x001FC478) = 0
11438: brk(0x001FE478) = 0
11438: brk(0x001FE478) = 0
11438: brk(0x00200478) = 0
11438: brk(0x00200478) = 0
11438: brk(0x00202478) = 0
11438: brk(0x00202478) = 0
11438: brk(0x00204478) = 0
11438: so_socket(2, 2, 0, , 1) = 15
11438: setsockopt(15, 6, 1, 0xFE908574, 4, 1) = 0
11438: setsockopt(15, 65535, 128, 0xFE908578, 8, 1) = 0
11438: connect(15, 0x000F6CB0, 16, 1) = 0
11438: fcntl(15, F_GETFL, 0x) = 2
11438: fstat64(15, 0xFE908290) = 0
11438: getsockopt(15, 65535, 8192, 0xFE908390, 0xFE908388, 0) = 0
11438: fstat64(15, 0xFE908290) = 0
11438: getsockopt(15, 65535, 8192, 0xFE908390, 0xFE90838C, 0) = 0
11438: setsockopt(15, 65535, 8192, 0xFE908390, 4, 0) = 0
11438: fcntl(15, F_SETFL, 0x0002) = 0
11438: write(15, 12 4\0AD0202\0\b H T T P.., 177) = 177
11438: read(15, A B\0A4, 4) = 4
11438: read(15, 04\0C8\0\0\0\003\0\n S e.., 164) = 164
11438: read(15, A B\0D7, 4) = 4
11438: read(15, 03\0D3\r\n\r\n\r\n h t.., 215) = 215
11438: brk(0x00204478) = 0
11438: brk(0x00206478) = 0
11438: read(15, A B\002, 4) = 4
11438: read(15, 0501, 2) = 2
11438: brk(0x00206478) = 0
11438: brk(0x00208478) = 0
11438: brk(0x00208478) = 0
11438: brk(0x0020A478) = 0
11438: writev(14, 0xFE907E38, 2) = 583
11438: Incurred fault #5, FLTACCESS %pc = 0xFDFC4208
11438:   siginfo: SIGBUS BUS_ADRALN addr=0xFE0E0234
11438: Received signal #10, SIGBUS [default]
11438:   siginfo: SIGBUS BUS_ADRALN addr=0xFE0E0234
11438: *** process killed ***

I hope I don't have to build a debug version of apache/mod_jk and gdb it to find this issue. Help?

Byron
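The `BUS_ADRALN` code in the trace means the faulting access was misaligned, and the reported address can be checked by hand. A minimal sketch, assuming nothing about mod_jk itself (`is_aligned` is a hypothetical helper written for this illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when addr is a multiple of width.
 * width must be a power of two (4 for 32-bit, 8 for 64-bit accesses). */
static int is_aligned(uintptr_t addr, uintptr_t width)
{
    assert(width != 0 && (width & (width - 1)) == 0);
    return (addr & (width - 1)) == 0;
}
```

For the address in the trace, `is_aligned(0xFE0E0234, 4)` is true and `is_aligned(0xFE0E0234, 8)` is false. That pattern would be consistent with an 8-byte load or store hitting data that is only 4-byte aligned, which strict-alignment CPUs reject with SIGBUS. The trace alone does not show the width of the access, so this is a hint rather than a diagnosis.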
RE: JK 1.2.14.1 SIG BUS Error on Solaris 9
I apologize for adding to this, but I'm hoping to jar someone's memory. I gdb'ed the process now and the BUS error occurs in:

Starting program: /usr/local/apache2/bin/httpd -X -f /opt/GEinet/webconfigs/ap_i1/conf/httpd.conf -DMOD_JK -DCGI
[New LWP2]
[New LWP3]
[New LWP4]
[New LWP5]
[New LWP6]
[New LWP7]
Program received signal SIGBUS, Bus error.
[Switching to LWP3]
Program received signal SIGBUS, Bus error.
0xfdfb4208 in service (e=0xf7c90, s=0xfe501848, l=0x118b40, is_error=0xfe500840) at jk_lb_worker.c:605
jk_lb_worker.c:605: No such file or directory.
(gdb) bt
#0 0xfdfb4208 in service (e=0xf7c90, s=0xfe501848, l=0x118b40, is_error=0xfe500840) at jk_lb_worker.c:605
#1 0xfdfa9a7c in jk_handler (r=0x1feac0) at mod_jk.c:1889
#2 0x33f8c in ap_run_handler (r=0x1feac0) at config.c:151
#3 0x34588 in ap_invoke_handler (r=0x1feac0) at config.c:363
#4 0x2f82c in ap_process_request (r=0x1feac0) at http_request.c:246
#5 0x2aab4 in ap_process_http_connection (c=0x1f2b60) at http_core.c:250
#6 0x3e660 in ap_run_process_connection (c=0x1f2b60) at connection.c:42
#7 0x3e95c in ap_process_connection (c=0x1f2b60, csd=0x1f2a90) at connection.c:175
#8 0x30a5c in process_socket (p=0x1f2a58, sock=0x1f2a90, my_child_num=0, my_thread_num=4, bucket_alloc=0x1fca80) at worker.c:520
#9 0x310f4 in worker_thread (thd=0x120c10, dummy=0x1f2a58) at worker.c:834
#10 0xff1144c4 in dummy_worker (opaque=0x74000) at thread.c:88

And jk_lb_worker.c:605 looks like:

        if (rec && rec != prec) {
            int is_service_error = JK_HTTP_OK;
            int service_stat = JK_FALSE;
            jk_endpoint_t *end = NULL;

            s->jvm_route = rec->r;
            rc = rec->w->get_endpoint(rec->w, &end, l);

            if (JK_IS_DEBUG_LEVEL(l))
                jk_log(l, JK_LOG_DEBUG,
                       "service worker=%s jvm_route=%s",
                       rec->s->name, s->jvm_route);
            rec->s->elected++;
            if (rc && end) {
                /* Reset endpoint read and write sizes for
                 * this request.
                 */
                end->rd = end->wr = 0;
                /* Increment the number of workers serving request */
                p->worker->s->busy++;
                if (p->worker->s->busy > p->worker->s->max_busy)
                    p->worker->s->max_busy = p->worker->s->busy;
                rec->s->busy++;
                if (rec->s->busy > rec->s->max_busy)
                    rec->s->max_busy = rec->s->busy;
                service_stat = end->service(end, s, l, &is_service_error);
                /* Update partial reads and writes if any */
605:            rec->s->readed += end->rd;
                rec->s->transferred += end->wr;
                end->done(&end, l);

From the debugger:

(gdb) print *end
$6 = {rd = 393, wr = 177, endpoint_private = 0x14e990, service = 0xfdfc1c34 <ajp_service>, done = 0xfdfc309c <ajp_done>}

(gdb) print *rec->s
$5 = {id = 2, busy = 1, max_busy = 1, name = "ap1lnx60", '\000' <repeats 55 times>, domain = '\000' <repeats 63 times>, redirect = '\000' <repeats 63 times>, is_disabled = 0, is_stopped = 0, is_busy = 0, lb_factor = 1, lb_value = 0, in_error_state = 0, in_recovering = 0, sticky_session = 0, sticky_session_force = 0, recover_wait_time = 0, retries = 0, error_time = 0, readed = 0, transferred = 0, elected = 1, errors = 0}

(gdb) print *s
$7 = {ws_private = 0xfe5018e8, pool = 0xfe5018e8, method = 0x1ff838 "GET", protocol = 0x1ff888 "HTTP/1.0", req_uri = 0x1ff870 "/CMSCentral/Dispatcher", remote_addr = 0x1f2ee0 "64.37.211.156", remote_host = 0x0, remote_user = 0x0, auth_type = 0x0, query_string = 0x0, server_name = 0xf5030 "hostname", server_port = 80, server_software = 0x118cf8 "Apache/2.0.52 (Unix) mod_jk/1.2.14 DAV/2", content_length = 0, is_chunked = 0, no_more_chunks = 0, content_read = 0, is_ssl = 0, ssl_cert = 0x0, ssl_cert_len = 0, ssl_cipher = 0x0, ssl_session = 0x0, ssl_key_size = -1, headers_names = 0x200248, headers_values = 0x200258, num_headers = 4, attributes_names = 0x0, attributes_values = 0x0, num_attributes = 0, jvm_route = 0xfe0d0140 "ap1lnx60", secret = 0x0, reco_buf = 0xfe500848, reco_status = 1, retries = 3, flush_packets = 0, uw_map = 0x14a818, start_response = 0xfdfa754c <ws_start_response>, read = 0xfdfa76e4 <ws_read>, write = 0xfdfa77ac <ws_write>, flush = 0xfdfa777c <ws_flush>}

Byron
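The backtrace stops on the `readed +=` update at line 605. If that counter is a 64-bit field in a record that starts on a 4-byte boundary, the update would fault on a strict-alignment CPU. That is an assumption: the thread does not show the struct layout or how the record is allocated. The sketch below shows two generic ways to avoid such a fault, rounding an offset up to an 8-byte boundary and updating a counter through `memcpy`. All names here (`align_up`, `load_u64`, `add_u64`) are hypothetical and are not mod_jk functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round off up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t off, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (off + align - 1) & ~(align - 1);
}

/* Read a 64-bit value from any byte address.
 * memcpy makes no alignment assumption about p. */
static uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/* Add delta to a 64-bit counter stored at any byte address. */
static void add_u64(unsigned char *p, uint64_t delta)
{
    uint64_t v = load_u64(p) + delta;

    memcpy(p, &v, sizeof v);
}
```

With the low bits of the faulting address from the truss output, `align_up(0x234, 8)` returns `0x238`, the next offset at which an 8-byte field could be accessed directly. Laying records out on 8-byte boundaries keeps the plain `+=` form valid; the `memcpy` route tolerates any offset at the cost of a copy.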