DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://issues.apache.org/bugzilla/show_bug.cgi?id=44402>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=44402

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

------- Additional Comments From [EMAIL PROTECTED]  2008-02-13 22:56 -------
I did the stress test with the patch you suggested. With your patch applied, I still got the 1st crash. If it crashes with the second stack trace, I will update the bug.

Here is some more information about the 1st crash.

* I am able to reproduce the crash on Solaris 10 update 1 (on a different machine) too. It took around 4 hours of stress before I got the crash on Solaris 10, while it takes less than 30 minutes to reproduce on Solaris Nevada. It was crash 1 (allocator_free with node = null), without your patch.

Here is more information on crash 1:

(dbx) where
current thread: [EMAIL PROTECTED]
=>[1] allocator_free(allocator = 0x8afe2e0, node = (nil)), line 331 in "apr_pools.c"
  [2] apr_pool_clear(pool = 0xa0d01c0), line 710 in "apr_pools.c"
  [3] ap_core_output_filter(f = 0xa0b28c8, b = 0xa0b2a08), line 899 in "core_filters.c"
  [4] ap_pass_brigade(next = 0xa0b28c8, bb = 0xa0b2a08), line 526 in "util_filter.c"
  [5] logio_out_filter(f = 0xa0b2888, bb = 0xa0b2a08), line 135 in "mod_logio.c"
  [6] ap_pass_brigade(next = 0xa0b2888, bb = 0xa0b2a08), line 526 in "util_filter.c"
  [7] ap_flush_conn(c = 0xa0b23e8), line 84 in "connection.c"
  [8] ap_lingering_close(c = 0xa0b23e8), line 123 in "connection.c"
  [9] process_socket(p = 0x8afe368, sock = 0x8aff660, my_child_num = 1, my_thread_num = 18, bucket_alloc = 0xa0be178), line 545 in "worker.c"
  [10] worker_thread(thd = 0x81487d8, dummy = 0x8117b30), line 894 in "worker.c"
  [11] dummy_worker(opaque = 0x81487d8), line 142 in "thread.c"
  [12] _thr_setup(0xfe244800), at 0xfeccf92e
  [13] _lwp_start(), at 0xfeccfc10
(dbx) up
Current function is apr_pool_clear
  710       allocator_free(pool->allocator, active->next);
(dbx) p *active
*active = {
    next = (nil)
    ref = 0xa0d01a8
    index = 1U
    free_index = 0
    first_avail = 0xa0d01f8 "\xc0^A^M\n\xfc^A^M\n\xfc^A^M\nx\xe1^K\n"
    endp = 0xa0d21a8 "^A "
}
(dbx) up
Current function is ap_core_output_filter
  899       apr_pool_clear(ctx->deferred_write_pool);
(dbx) p *ctx
*ctx = {
    b = (nil)
    deferred_write_pool = 0xa0d01c0
}
(dbx) p *ctx->deferred_write_pool
*ctx->deferred_write_pool = {
    parent = 0x8afe368
    child = (nil)
    sibling = 0xa0c6198
    ref = 0x8afe36c
    cleanups = (nil)
    free_cleanups = (nil)
    allocator = 0x8afe2e0
    subprocesses = (nil)
    abort_fn = (nil)
    user_data = (nil)
    tag = 0x80bfd1c "deferred_write"
    active = 0xa0d01a8
    self = 0xa0d01a8
    self_first_avail = 0xa0d01f8 "\xc0^A^M\n\xfc^A^M\n\xfc^A^M\nx\xe1^K\n"
}
(dbx) p *c
*c = {
    pool = 0x8afe368
    base_server = 0x80e6bf8
    vhost_lookup_data = (nil)
    local_addr = 0x8aff698
    remote_addr = 0x8aff7c0
    remote_ip = 0xa0b2850 "192.168.11.1"
    remote_host = (nil)
    remote_logname = (nil)
    aborted = 0
    keepalive = AP_CONN_KEEPALIVE
    double_reverse = 0
    keepalives = 1
    local_ip = 0xa0b2840 "192.168.11.2"
    local_host = (nil)
    id = 518
    conn_config = 0xa0b2448
    notes = 0xa0b26e8
    input_filters = 0xa0b2870
    output_filters = 0xa0b2888
    sbh = 0xa0b23e0
    bucket_alloc = 0xa0be178
    cs = (nil)
    data_in_input_filters = 0
}

On putting in some printfs, I figured out the following:

In apr_pool_clear (when invoked for deferred_write_pool)
    ...
    active = pool->active = pool->self;
    active->first_avail = pool->self_first_avail;

    if (active->next == active)
        return;

active->next should typically be part of a circular linked list. What is happening in some cases is that active->next points to something else while active->ref still points to active->next.

I put a printf of active->next before it is set to NULL. For a particular crash, here is my debugging session. I found that active->next was set to 0x20e8810 before it was set to NULL.

(dbx) up
Current function is apr_pool_clear
  774       allocator_free(pool->allocator, active->next);
(dbx) up
Current function is ap_core_output_filter
  923       apr_pool_clear(ctx->deferred_write_pool);
(dbx) p (struct apr_memnode_t*)0x20e8810     -----> This was active->next before it was set to NULL.
(struct apr_memnode_t *) 0x20e8810 = 0x20e8810
(dbx) p *(struct apr_memnode_t*)0x20e8810
*((struct apr_memnode_t *) 0x20e8810) = {
    next = 0x288c5b0
    ref = 0x20e8810
    index = 1U
    free_index = 0
    first_avail = 0x20e9eb0 "GET /file_set/dir00104/class1_3 HTTP/1.0"
    endp = 0x20ea810 "^A "
}
(dbx) down
Current function is apr_pool_clear
  774       allocator_free(pool->allocator, active->next);
(dbx) p active
active = 0x20e27e0
(dbx) p *((struct apr_memnode_t*)0x20e8810)->next
*((struct apr_memnode_t *) 0x20e8810)->next = {
    next = 0x20e07d0
    ref = 0x20e07d0
    index = 1U
    free_index = 0
    first_avail = 0x288d008 ""
    endp = 0x288e5b0 "^A "
}
(dbx) p active
active = 0x20e27e0
(dbx) p *(((struct apr_memnode_t*)0x20e8810)->next)->next
*((struct apr_memnode_t *) 0x20e8810)->next->next = {
    next = 0x28905d0
    ref = 0x288c5b0
    index = 1U
    free_index = 0
    first_avail = 0x20e2738 ""
    endp = 0x20e27d0 "^A "
}
(dbx) p *((((struct apr_memnode_t*)0x20e8810)->next)->next)->next
*((struct apr_memnode_t *) 0x20e8810)->next->next->next = {
    next = 0x288e5c0
    ref = 0x28905d0
    index = 1U
    free_index = 0
    first_avail = 0x2890668 "\xf8^E\x89^B"
    endp = 0x28925d0 "^Q^P"
}
(dbx) p *(((((struct apr_memnode_t*)0x20e8810)->next)->next)->next)->next
*((struct apr_memnode_t *) 0x20e8810)->next->next->next->next = {
    next = (nil)
    ref = (nil)
    index = 1U
    free_index = 0
    first_avail = 0x288e5e8 "`^_"
    endp = 0x28905c0 "^A "
}

On further debugging, I figured out that ap_core_output_filter is typically called 4 times for a request. The crash always happens in the 4th invocation.
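
The manual walk along next in the dbx session above amounts to checking the list invariant by hand. As a rough sketch only, assuming each node's ref holds the address of the pointer that refers to it (so that *node->ref == node on a healthy circular list), a debug helper could do the same walk automatically. The memnode type and ring_is_consistent below are hypothetical names, not APR code, and model only the next and ref fields seen in the dumps.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for apr_memnode_t: only the two
 * link fields shown in the dbx dumps are modelled here. */
typedef struct memnode {
    struct memnode  *next; /* next node in the ring */
    struct memnode **ref;  /* assumed: address of the pointer that points at this node */
} memnode;

/* Returns 1 if following next from start comes back to start within
 * max_nodes steps and every visited node satisfies *node->ref == node.
 * Returns 0 otherwise. */
static int ring_is_consistent(const memnode *start, size_t max_nodes)
{
    const memnode *n = start;
    size_t i;

    if (start == NULL)
        return 0;

    for (i = 0; i < max_nodes; i++) {
        if (n->ref == NULL || *n->ref != n)
            return 0;          /* back-link does not match this node */
        if (n->next == NULL)
            return 0;          /* chain ends instead of looping */
        n = n->next;
        if (n == start)
            return 1;          /* closed the ring */
    }
    return 0;                  /* never returned to start within the bound */
}
```

A check along these lines, placed where the pool internals are visible (for example inside apr_pools.c) and run on pool->active at each entry to ap_core_output_filter, might help narrow down when the list goes bad. The max_nodes bound is there so that a corrupted chain cannot make the check loop forever.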

It seems to me that the list gets corrupted somewhere after the 3rd invocation (after it returns from ap_core_output_filter) and before it enters ap_core_output_filter the 4th time (when ap_lingering_close is in the call stack). Also, conn->keepalives was always set to 1.

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
