Obligatory http://xkcd.com/979/ ;-)

On Jun 29, 2012, at 9:35 AM, Brook, James wrote:

> It's probably bad form to keep answering my own mails but no-one had anything
> to say about this. Are there still people on the list who are familiar with
> the adaptor internals? This problem is causing us a lot of pain in production.
>
> Does anyone use the MPM worker module with Apache or are we all still with
> pre-fork? I don't think we could live without the performance gains. Perhaps
> it doesn't matter.
>
> I haven't quite proven this but I am pretty certain that my problem is with
> fcntl. That's what the adaptor uses to lock the shared memory file. It's
> apparently an outdated way of doing this - APR now has better abstractions
> for these sorts of mutexes. Even the code that does the locking is in a retry
> loop with up to 50 attempts! I started trying to rewrite the locking stuff
> but I am out of my depth.
>
> It strikes me that in general this would not be a bad bit of code for the
> community to have updated. Can anyone help me with that please?
>
> James
>
> ________________________________________
> From: Brook, James
> Sent: 13 June 2012 18:48
> To: <[email protected]>
> Subject: Re: Deadlock on Apache 2.2 Adaptor under high load - Solaris 10 -
> Worker MPM
>
> Now I have some detailed adaptor logging from a time close to the deadlock.
> Here is an example of an error with a lock:
>
> Debug: thread 37 locking WOShmem_lock from ../Adaptor/shmem.c:375
> Debug: thread 37 unlocking WOShmem_lock from ../Adaptor/shmem.c:379
> Error: lock_file_section(): failed to lock (1 attempts): Deadlock situation detected/avoided
> Debug: thread 37 locking str_lock from ../Adaptor/wastring.c:93
> Debug: thread 37 unlocking str_lock from ../Adaptor/wastring.c:100
> Debug: thread 37 locking str_lock from ../Adaptor/wastring.c:152
> Debug: thread 37 unlocking str_lock from ../Adaptor/wastring.c:158
> Debug: thread 37 locking WOShmem_lock from ../Adaptor/shmem.c:391
> Debug: thread 37 unlocking WOShmem_lock from ../Adaptor/shmem.c:394
> Error: ac_readConfiguration: WOShmem_lock() failed. Skipping reading config.
>
> On Jun 13, 2012, at 5:30 PM, James Brook wrote:
>
>> We have a big problem with the Apache 2.2 WebObjects adaptor on our Solaris
>> 10 web servers. We are using the 'worker' MPM but when the sites get busy
>> nearly every Apache thread is waiting for a shared memory lock to call the
>> function that reads the adaptor config. The remaining threads are in the
>> fcntl function trying to lock a section of shared memory. See below for a
>> couple of example thread stacks.
>>
>> I read in several posts that fcntl on Solaris 10 causes deadlocks under high
>> load and that the problem is worse with the 'worker' MPM. The recommended
>> locking mechanism for Solaris seems to be to use pthreads.
>>
>> I know that at least a few list members are running with the Solaris
>> adaptor. My questions:
>> * Has anyone experienced this problem and found a solution?
>> * Anyone using the 'worker' MPM or do people still use pre-fork (I don't
>>   think this is a thread safety problem).
>> * Any help or suggestions? Especially, any tips on rewriting to use
>>   pthreads?
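
On the pthreads question: here is a minimal sketch of the usual technique, a mutex marked PTHREAD_PROCESS_SHARED that lives inside the shared region it protects. It is not taken from the adaptor. The names shared_region, region_create and region_demo are made up for illustration, and the demo uses an anonymous mapping plus fork() where the adaptor maps a file, so treat it as a starting point under those assumptions.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout: the mutex sits in the shared memory it guards. */
typedef struct {
    pthread_mutex_t lock;
    long counter;
} shared_region;

/* Map a shared region and initialise a process-shared mutex inside it.
   Returns NULL on failure. */
static shared_region *region_create(void)
{
    shared_region *r = mmap(NULL, sizeof *r, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED)
        return NULL;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(r, sizeof *r);
        return NULL;
    }
    /* Without this attribute the mutex is only valid within one process. */
    int rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0)
        rc = pthread_mutex_init(&r->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        munmap(r, sizeof *r);
        return NULL;
    }
    r->counter = 0;
    return r;
}

/* Fork nprocs children; each increments the counter iters times under the
   lock. Returns the final counter value, or -1 on error. */
static long region_demo(int nprocs, int iters)
{
    assert(nprocs > 0 && iters >= 0);
    shared_region *r = region_create();
    if (r == NULL)
        return -1;

    int failed = 0;
    for (int p = 0; p < nprocs; p++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            for (int i = 0; i < iters; i++) {
                pthread_mutex_lock(&r->lock);
                r->counter++;
                pthread_mutex_unlock(&r->lock);
            }
            _exit(0);
        }
    }
    while (wait(NULL) > 0) {
        /* reap every child before reading the result */
    }

    long total = failed ? -1 : r->counter;
    pthread_mutex_destroy(&r->lock);
    munmap(r, sizeof *r);
    return total;
}
```

Calling region_demo(4, 1000) should come back with 4000 if the lock is doing its job. Two things this sketch does not cover: the mutex must be initialised exactly once, by whichever process creates the region, and a process that dies while holding it leaves it locked unless the platform offers a robust mutex attribute.
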
>>
>> --
>> James
>>
>>
>> feec5638 fcntl (d8, 7, 2abe588)
>> feeb8258 fcntl (d8, 1, fefcc200, 4d6880, 1580, 20a58) + 84
>> febe8570 lock_file_section (d8, 4d6880, 14, 2abe588, 147c, 2) + 58
>> febe8e14 WOShmem_lock (2abe588, 14, 1, 4d6880, 1580, 1400) + d4
>> febef410 ac_readConfiguration (1, fffee980, 11400, fec08f74, 1d84, 1c00) + 40
>> febe71cc _runRequest (fc9fb9c4, 0, 2d77168, 2d18b40, 5, 0) + 260
>> febe6a0c tr_handleRequest (2d18b40, 27226f0, fc9fbc50, 0, 5, 2) + 30c
>> febf42a8 WebObjects_handler (2721208, 0, 10000, 0, 2d18b40, fec08f74) + 48c
>> 00041484 ap_run_handler (2721208, febf3e1c, 7b578, 6b5a10, 2, 8) + 40
>> 00041ab4 ap_invoke_handler (2721208, 0, 2721208, 0, 6b58bc, 79c00) + ec
>> 0005132c ap_process_request (2721208, 79400, 4, 1, 0, 2721208) + 54
>> 0004d9a4 ap_process_http_connection (26b61c0, 7c000, 0, 1, 79548, 5) + 78
>> 00049654 ap_process_connection (26b61c0, 26b5f10, 6b5d90, 0, 7bd98, 6b5d78) + d4
>> 00057558 worker_thread (14d888, ad7, fc9fbf98, 7c24c, 2b, 17) + 280
>> feec5238 _lwp_start (0, 0, 0, 0, 0, 0)
>>
>>
>> feec52d8 lwp_park (0, 0, 0)
>> feebf350 cond_wait_queue (ef50a8, ef5090, 0, 0, 1c00, 0) + 28
>> feebf874 cond_wait (ef50a8, ef5090, ef50a8, 0, fec0a8f8, 3) + 10
>> feebf8b0 pthread_cond_wait (ef50a8, ef5090, ef5090, 0, 1c00, 3a) + 8
>> febf2730 _WA_lock (ef5088, febf5974, ef50a8, 0, fec0a8f8, 3) + 90
>> febe9494 sha_lock (100, 4, fffeca64, fec08f74, ef3230, 13400) + 5c
>> febedd84 ac_findApplication (fe0fb54c, 4, fec0acfc, fec08f74, 0, fec0a474) + 70
>> febe6794 tr_handleRequest (2402c38, 30bbec0, fe0fb7d8, 798f0, ffffffff, 14400) + 94
>> febf42a8 WebObjects_handler (30baf40, 0, 10000, 0, 2402c38, fec08f74) + 48c
>> 00041484 ap_run_handler (30baf40, febf3e1c, 7b578, 6b5a10, 2, 8) + 40
>> 00041ab4 ap_invoke_handler (30baf40, 0, 2ba5f10, 2ba5348, 30baf40, 2b824d8) + ec
>> 0003f080 ap_run_sub_req (ffffffff, 30bb0e8, 20, 0, 30bc370, 30baf40) + 3c
>> fed336d8 handle_include (2ba4d20, 10800, 2ba5f10, 2ba5348, 30baf40, 2b824d8) + 334
>> fed378f8 send_parsed_content (11a8, 7c021, 2ba4d20, 2c01898, 2ba5f14, 2ba5f10) + 1080
>> 0003afb0 default_handler (0, 2c01898, 2b91e10, 2b7c748, 2b7e598, 2ba5328) + 4a8
>> 00041484 ap_run_handler (2c01898, 3ab08, 7b578, 6b5a74, 7, 8) + 40
>> 00041ab4 ap_invoke_handler (2c01898, 0, 2c01898, 2b9eb80, ffb1b6a0, 4e4960) + ec
>> 00051a58 ap_internal_redirect (0, 2c01898, fe0fbd10, fe0fbcac, 1, 2c01898) + 44
>> febab53c handler_redirect (2b9eb80, ffffffff, febbd238, 2c01560, fffefd64, 10000) + 90
>> 00041484 ap_run_handler (2b9eb80, febab4ac, 7b578, 6b5a4c, 5, 8) + 40
>> 00041ab4 ap_invoke_handler (2b9eb80, 0, 2b9eb80, 0, 6b58bc, 79c00) + ec
>> 0005132c ap_process_request (2b9eb80, 79400, 4, 1, 0, 2b9eb80) + 54
>> 0004d9a4 ap_process_http_connection (2b7c748, 7c000, 0, 1, 79548, 5) + 78
>> 00049654 ap_process_connection (2b7c748, 2b7c498, 6b5d90, 0, 7bd98, 6b5d78) + d4
>> 00057558 worker_thread (14d5a8, a00, fe0fbf98, 7c24c, 28, 0) + 280
>> feec5238 _lwp_start (0, 0, 0, 0, 0, 0)
>>
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Webobjects-dev mailing list ([email protected])
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/webobjects-dev/ramseygurley%40gmail.com
>
> This email sent to [email protected]
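
For anyone following along who has not looked at the fcntl() side: the first stack in the quoted message shows a thread inside fcntl via lock_file_section, and the logged "Deadlock situation detected/avoided" looks like the strerror() text for EDEADLK. Below is a rough sketch of byte-range locking with a bounded retry, written to illustrate the pattern James describes. range_lock and range_lock_demo are hypothetical names, not the adaptor's code, and retrying on EDEADLK is only an assumption about what such a loop might do.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: take or release an fcntl() record lock on bytes
   [start, start + len) of fd. type is F_RDLCK, F_WRLCK or F_UNLCK.
   Returns 0 on success, -1 with errno set on failure. */
static int range_lock(int fd, short type, off_t start, off_t len,
                      int max_attempts)
{
    assert(max_attempts > 0);
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    int saved = 0;
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (fcntl(fd, F_SETLKW, &fl) == 0)
            return 0;
        saved = errno;
        if (saved != EINTR && saved != EDEADLK)
            break;                      /* not worth retrying */
        struct timespec pause = { 0, 1000000L };   /* 1 ms back-off */
        nanosleep(&pause, NULL);
    }
    errno = saved;
    return -1;
}

/* Parent locks a range, then a forked child makes a non-blocking attempt on
   the same range. Returns 1 if the child was refused, 0 if it was not,
   -1 on setup error. */
static int range_lock_demo(void)
{
    char path[] = "/tmp/rangelockXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file vanishes once fd is closed */
    if (ftruncate(fd, 64) != 0 || range_lock(fd, F_WRLCK, 0, 16, 50) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Record locks are not inherited across fork(), so this conflicts
           with the lock the parent still holds. */
        struct flock fl;
        memset(&fl, 0, sizeof fl);
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 16;
        int refused = fcntl(fd, F_SETLK, &fl) == -1 &&
                      (errno == EAGAIN || errno == EACCES);
        _exit(refused ? 0 : 1);
    }

    int status = 0;
    int waited = waitpid(pid, &status, 0) == pid;
    range_lock(fd, F_UNLCK, 0, 16, 1);
    close(fd);
    if (!waited || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

The point worth keeping in mind with the worker MPM is that these locks belong to the process, not the thread: two threads in the same Apache child do not exclude each other through fcntl(), and closing any descriptor for the file drops all of that process's locks on it. Any thread-level exclusion has to come from a separate mutex.
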
