Hi Mark, You're right. That looks like something else. do_monitor() relates to MWI. This could be a completely different issue, if I need to guess I'd say we're looking at a lock-ordering deadlock. Eg:
Thread 1 grabs lock A. Thread 2 grabs lock B. Thread 1 blocks trying to lock B. Thread 2 blocks trying to lock A. I'm not seeing any direct changes in chan_sip from 11.19 => 11.20 that would do anything with respect to netlock. Also no changes with respect to dblock (and ast_db_put is so trivial that there is almost no way that it can be wrong unless the unlock call fails - which I think is impossible). Out of my depth. Sorry. Kind Regards, Jaco On 12/11/2015 07:07, [email protected] wrote: > Hi Jaco, > > I've tried this tmpfs workaround on asterisk 11 and I'm running into > the following (very very long duration) locks that block sip activity > completely. My core issue is definitely not disk-io. > > > === <pending> <lock#> (<file>): <lock type> <line num> <function> > <lock name> <lock addr> (times locked) > === > === Thread ID: 0xb524fb70 (do_monitor started at [29517] > chan_sip.c restart_monitor()) > === ---> Lock #0 (chan_sip.c): MUTEX 28923 handle_request_do &netlock > 0xb630b440 (1) > main/logger.c:1702 ast_bt_get_addresses() (0x813f3ee+19) > main/lock.c:258 __ast_pthread_mutex_lock() (0x813864f+85) > channels/chan_sip.c:28926 handle_request_do() > channels/chan_sip.c:28885 sipsock_read() > main/io.c:292 ast_io_wait() (0x8132d50+175) > channels/chan_sip.c:29484 do_monitor() > main/utils.c:1223 dummy_start() > :0 start_thread() > libc.so.6 clone() (0xb77270d0+5E) > === ---> Lock #1 (chan_sip.c): MUTEX 8959 sip_pvt_lock_full pvt > 0x84b5660 (1) > main/logger.c:1702 ast_bt_get_addresses() (0x813f3ee+19) > main/lock.c:258 __ast_pthread_mutex_lock() (0x813864f+85) > main/astobj2.c:198 __ao2_lock() (0x80906ad+7C) > channels/chan_sip.c:8960 sip_pvt_lock_full() > channels/chan_sip.c:28939 handle_request_do() > channels/chan_sip.c:28885 sipsock_read() > main/io.c:292 ast_io_wait() (0x8132d50+175) > channels/chan_sip.c:29484 do_monitor() > main/utils.c:1223 dummy_start() > :0 start_thread() > libc.so.6 clone() (0xb77270d0+5E) > === ---> Lock #2 (chan_sip.c): MUTEX 17047 register_verify peer > 0x86df2b8 (1) > main/logger.c:1702 ast_bt_get_addresses() (0x813f3ee+19) > main/lock.c:258 __ast_pthread_mutex_lock() (0x813864f+85) > main/astobj2.c:198 __ao2_lock() (0x80906ad+7C) > channels/chan_sip.c:17048 register_verify() > channels/chan_sip.c:28477 handle_request_register() > channels/chan_sip.c:28785 handle_incoming() > channels/chan_sip.c:28953 handle_request_do() > channels/chan_sip.c:28885 sipsock_read() > main/io.c:292 ast_io_wait() (0x8132d50+175) > channels/chan_sip.c:29484 do_monitor() > main/utils.c:1223 dummy_start() > :0 start_thread() > libc.so.6 clone() (0xb77270d0+5E) > === ---> Lock #3 (db.c): MUTEX 329 ast_db_put &dblock 0x8214f60 (1) > main/logger.c:1702 ast_bt_get_addresses() (0x813f3ee+19) > main/lock.c:380 __ast_pthread_mutex_trylock() (0x81389b7+85) > main/db.c:329 ast_db_put() (0x80eeec4+124) > channels/chan_sip.c:16242 parse_register_contact() > channels/chan_sip.c:17069 register_verify() > channels/chan_sip.c:28477 handle_request_register() > channels/chan_sip.c:28785 handle_incoming() > channels/chan_sip.c:28953 handle_request_do() > channels/chan_sip.c:28885 sipsock_read() > main/io.c:292 ast_io_wait() (0x8132d50+175) > channels/chan_sip.c:29484 do_monitor() > main/utils.c:1223 dummy_start() > :0 start_thread() > libc.so.6 clone() (0xb77270d0+5E) > === ------------------------------------------------------------------- > === > ======================================================================= > > > On 11.11.2015 10:03, Jaco Kroon wrote: >> Hi Mark, >> >> I suspect the following relates: >> >> >> http://jkroon.blogs.uls.co.za/it/voip/asterisk-massively-speeding-up-those-register-requests >> >> >> That should explain the underlying problem for you, and potentially >> provide you with a fix depending on your situation. We've been using >> the proposed "workaround" there since the time of writing with no ill >> effect, however, you need to take heed of the warnings posted there. >> >> We had to modify code on at least some instances to avoid DB() dialplan >> use, in our case that wasn't difficult, but still, you need to be aware >> of the risks. >> >> Kind Regards, >> Jaco >> > -- _____________________________________________________________________ -- Bandwidth and Colocation Provided by http://www.api-digital.com -- asterisk-dev mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-dev
