DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://issues.apache.org/bugzilla/show_bug.cgi?id=29709>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=29709

Error in pool management on multiprocessor environment

           Summary: Error in pool management on multiprocessor environment
           Product: APR
           Version: 0.9.0
          Platform: PC
        OS/Version: Windows NT/2K
            Status: NEW
          Severity: Critical
          Priority: Other
         Component: APR
        AssignedTo: [email protected]
        ReportedBy: [EMAIL PROTECTED]

Hello!

It seems there is a bug in the memory management of Apache within a multiprocessor environment.

First I will describe my environment: I use Apache 2.0.47 in the default configuration (built with the workspace shipped by apache.org). The bug was reproduced on Windows NT4 with dual processors, on W2K with dual processors, and on W2K with a Hyper-Threading processor.

To reproduce the bug I modified mod_example as follows. In x_handler() I inserted a Sleep(1000); to simulate a long operation during the request, like this:

--------------------------------------
    ap_rprintf(r, " Apache HTTP Server version: \"%s\"\n",
               ap_get_server_version());
    ap_rputs(" <BR>\n", r);
    /* Simulating a long operation */
    Sleep(1000);
    ap_rprintf(r, " Server built: \"%s\"\n", ap_get_server_built());
--------------------------------------

Then I sent heavy request load to normal URLs and to mod_example simultaneously. After a while Apache fails at random in one of two ways.

The first is an access violation. The log reads as follows:

[Thu Jun 17 16:45:47 2004] [notice] Parent: child process exited with status 3221225477 -- Restarting.
[Thu Jun 17 16:45:50 2004] [notice] Parent: Created child process 3464
[Thu Jun 17 16:45:51 2004] [debug] mpm_winnt.c(505): Parent: Sent the scoreboard to the child
[Thu Jun 17 16:45:53 2004] [notice] Child 3464: Child process is running
[Thu Jun 17 16:45:53 2004] [info] Parent: Duplicating socket 404 and sending it to child process 3464
[Thu Jun 17 16:45:53 2004] [debug] mpm_winnt.c(426): Child 3464: Retrieved our scoreboard from the parent.
[Thu Jun 17 16:45:53 2004] [debug] mpm_winnt.c(623): Parent: Sent 1 listeners to child 3464
[Thu Jun 17 16:45:53 2004] [debug] mpm_winnt.c(582): Child 3464: retrieved 1 listeners from parent
[Thu Jun 17 16:45:53 2004] [notice] Child 3464: Acquired the start mutex.
[Thu Jun 17 16:45:54 2004] [notice] Child 3464: Starting 25 worker threads.

Here the status 3221225477 is 0xC0000005, which means Access Violation.

The access violation happens in apr_pool_walk_tree() while accessing

    child = pool->child;

At the same time another thread tries to free the pool memory. I have no stack backtrace for this at the moment, but I can try to produce one if there is interest.

The other failure is that Apache stops responding and stays at a processor load of about 50%. Breaking into the code shows that the pool is damaged. One thread stays in allocator_free() and tries to free pool memory, but the next pointer of the current node points to itself, so the loop never terminates. This happens at this point:

    do {
        next = node->next;
        index = node->index;

where the node has the following content:

    next = 0x007dadd8
    node->index = 1
    node->next = 0x007dadd8
    node->ref = 0x007dadd8
    node->free_index = 3452816845

The thread backtrace looks like this:

allocator_free(apr_allocator_t * 0x00773dd8, apr_memnode_t * 0x007dadd8) line 362 + 6 bytes
apr_pool_destroy(apr_pool_t * 0x007d8db8) line 797 + 13 bytes
trace_add(server_rec * 0x0077c290, request_rec * 0x00000000, x_cfg * 0x007b7c18, const char * 0x10014908 `string') line 408 + 15 bytes
x_insert_filter(request_rec * 0x007d2d48) line 997 + 23 bytes
ap_run_insert_filter(request_rec * 0x007d2d48) line 121 + 31 bytes
ap_invoke_handler(request_rec * 0x6ff09466) line 374
ap_process_http_connection(conn_rec * 0x6ff03f8f) line 293 + 6 bytes
ap_run_process_connection(conn_rec * 0x007f51c8) line 85 + 31 bytes
ap_process_connection(conn_rec * 0x007f51c8, void * 0x007f5100) line 211 + 6 bytes
worker_main(long 2013300156) line 731
MSVCRT! 780085bc()

Another thread is in this stack state:

NTDLL! 77894091()
NTDLL! 778922f8()
allocator_alloc(apr_allocator_t * 0x00773dd8, unsigned int 8192) line 242
apr_pool_create_ex(apr_pool_t * * 0x007d51dc, apr_pool_t * 0x007d4d48, int (int) * 0x00000000, apr_allocator_t * 0x00773dd8) line 829 + 14 bytes
core_output_filter(ap_filter_t * 0x6ff182c1, apr_bucket_brigade * 0x007d5198) line 4108
ap_pass_brigade(ap_filter_t * 0x007d5198, apr_bucket_brigade * 0x007ed5a0) line 550 + 7 bytes
ap_http_header_filter(ap_filter_t * 0x6ff182c1, apr_bucket_brigade * 0x007ab1f0) line 1695
ap_pass_brigade(ap_filter_t * 0x007ab1f0, apr_bucket_brigade * 0x007ed408) line 550 + 7 bytes
ap_content_length_filter(ap_filter_t * 0x6ff182c1, apr_bucket_brigade * 0x007ab1d8) line 1252 + 20 bytes
ap_pass_brigade(ap_filter_t * 0x007ab1d8, apr_bucket_brigade * 0x007ed408) line 550 + 7 bytes
ap_byterange_filter(ap_filter_t * 0x6ff182c1, apr_bucket_brigade * 0x007ab1c0) line 3036 + 5 bytes
ap_pass_brigade(ap_filter_t * 0x007ab1c0, apr_bucket_brigade * 0x007ed408) line 550 + 7 bytes
ap_old_write_filter(ap_filter_t * 0x007ed3f0, apr_bucket_brigade * 0x007ed528) line 1321 + 10 bytes
end_output_stream(request_rec * 0x007aa550) line 1039 + 29 bytes
ap_finalize_request_protocol(request_rec * 0x007aa550) line 1061 + 6 bytes
ap_send_error_response(request_rec * 0x6ff0d26f, int 404) line 2423 + 6 bytes
ap_die(int 1878053609, request_rec * 0x00000000) line 232 + 11 bytes
ap_process_request(request_rec * 0x007aa550) line 311
ap_process_http_connection(conn_rec * 0x6ff03f8f) line 293 + 6 bytes
ap_run_process_connection(conn_rec * 0x007d4e48) line 85 + 31 bytes
ap_process_connection(conn_rec * 0x007d4e48, void * 0x007d4d80) line 211 + 6 bytes
worker_main(long 2013300156) line 731
MSV

and another thread looks like this:

allocator_alloc(apr_allocator_t * 0x00773dd8, unsigned int 8192) line 242
apr_pool_create_ex(apr_pool_t * * 0x0156ff28, apr_pool_t * 0x007d6d80, int (int) * 0x00000000, apr_allocator_t * 0x00773dd8) line 829 + 14 bytes
ap_read_request(conn_rec * 0x6ff09431) line 848
ap_process_http_connection(conn_rec * 0x6ff03f8f) line 286 + 6 bytes
ap_run_process_connection(conn_rec * 0x007d6e80) line 85 + 31 bytes
ap_process_connection(conn_rec * 0x007d6e80, void * 0x007d6db8) line 211 + 6 bytes
worker_main(long 2013300156) line 731
MSVCRT! 780085bc()

All other threads are in winnt_get_connection().

So in my opinion there is a race condition where one thread tries to free the pool memory and another thread tries to access this memory at the same time. It seems that this behavior can only be reproduced on a multiprocessor or Hyper-Threading machine. I don’t know if it affects operating systems other than Windows.

Greetings
Gabriel Kalkuhl

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
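
The hang reported in allocator_free() (a node whose next pointer refers to the node itself) can be illustrated with a simplified model. The following is only a sketch under stated assumptions, not APR's code: demo_node and walk_bounded are hypothetical names, and the structure carries just the two fields quoted in the report. It shows why a plain "follow next until NULL" walk never ends on such a node, and how a step limit makes the damage detectable while debugging.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a memory node.
 * This is NOT APR's apr_memnode_t; it only keeps the two
 * fields that the report quotes (next and index). */
typedef struct demo_node {
    struct demo_node *next;
    unsigned int index;
} demo_node;

/* Follow the next pointers as a free-list walk would, but stop
 * after max_steps nodes. Returns the number of nodes visited, or
 * -1 if the limit was reached before the end of the list, which
 * suggests a cycle such as node->next == node. */
static long walk_bounded(const demo_node *node, long max_steps)
{
    long visited = 0;

    assert(max_steps > 0);
    while (node != NULL) {
        if (visited == max_steps)
            return -1;          /* still not at the end: cycle suspected */
        visited++;
        node = node->next;
    }
    return visited;
}
```

With a healthy three-node list the function returns 3. With a node whose next field points back at itself, as in the dump above (node and node->next both 0x007dadd8), it returns -1 instead of spinning forever. A check of this kind only detects the corruption; it does not address whatever damaged the list in the first place.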

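The decimal exit status in the log and the hexadecimal code named in the report are the same number. A minimal sketch of that conversion follows; the helper names format_status_hex and is_access_violation and the macro name are made up for this illustration, and only the value 0xC0000005 itself comes from the report.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Status code for an access violation, as named in the report. */
#define DEMO_STATUS_ACCESS_VIOLATION UINT32_C(0xC0000005)

/* Hypothetical helper: write the status as 0xXXXXXXXX into buf.
 * Returns the number of characters snprintf wanted to write. */
static int format_status_hex(uint32_t status, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%08lX", (unsigned long)status);
}

/* Hypothetical helper: true if the exit status is the
 * access-violation code. */
static int is_access_violation(uint32_t status)
{
    return status == DEMO_STATUS_ACCESS_VIOLATION;
}
```

Passing 3221225477, the status from the "child process exited" log line, to format_status_hex yields the string 0xC0000005, and is_access_violation returns non-zero for it.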