GitHub user zwoop opened an issue:

    https://github.com/apache/trafficserver/issues/1335

    Deadlock in HostDB

    We have some 7.0.0 boxes that end up completely wedged, with all ET_NET 
threads stuck on the same lock (so, a deadlock):
    
    ```
    #6  HostDBProcessor::getbyname_imm (this=<optimized out>, 
cont=cont@entry=0x2ab037b1d420, process_hostdb_info=<optimized out>, 
hostname=<optimized out>, len=<optimized out>, opt=...) at HostDB.cc:816
    #6  HostDBProcessor::getbyname_imm (this=<optimized out>, 
cont=cont@entry=0x2aabc1e66a00, process_hostdb_info=<optimized out>, 
hostname=<optimized out>, len=<optimized out>, opt=...) at HostDB.cc:816
    ...
    ```
    
    The trace is always the same in every thread:
    
    ```
    #0  __lll_lock_wait () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
    #1  0x00002aaaad73e5d8 in _L_lock_854 () from /lib64/libpthread.so.0
    #2  0x00002aaaad73e4a7 in __pthread_mutex_lock (mutex=0x2aaab098a290) at 
pthread_mutex_lock.c:61
    #3  0x00002aaaaadca986 in ink_mutex_acquire (m=0x2aaab098a290) at 
../../lib/ts/ink_mutex.h:90
    #4  Mutex_lock (t=0x2aaab160db40, m=0x2aaab098a280) at 
../../iocore/eventsystem/I_Lock.h:410
    #5  MutexLock::MutexLock (t=0x2aaab160db40, am=0x2aaab098a280, 
this=0x2aaab470a890) at ../../iocore/eventsystem/I_Lock.h:497
    #6  HostDBProcessor::getbyname_imm (this=<optimized out>, 
cont=cont@entry=0x2aab91432580, process_hostdb_info=<optimized out>, 
hostname=<optimized out>, len=<optimized out>, opt=...) at HostDB.cc:816
    #7  0x00002aaaaacae21c in HttpSM::do_hostdb_lookup 
(this=this@entry=0x2aab91432580) at HttpSM.cc:4133
    #8  0x00002aaaaacc0093 in HttpSM::set_next_state (this=0x2aab91432580) at 
HttpSM.cc:7248
    #9  0x00002aaaaacad47a in HttpSM::call_transact_and_set_next_state 
(this=this@entry=0x2aab91432580, f=f@entry=0x0) at HttpSM.cc:7111
    #10 0x00002aaaaacb7baf in HttpSM::handle_api_return (this=0x2aab91432580) 
at HttpSM.cc:1604
    #11 0x00002aaaaacba5eb in HttpSM::state_api_callout (this=0x2aab91432580, 
event=0, data=0x0) at HttpSM.cc:1542
    #12 0x00002aaaaacbf62b in HttpSM::set_next_state (this=0x2aab91432580) at 
HttpSM.cc:7144
    #13 0x00002aaaaacad47a in HttpSM::call_transact_and_set_next_state 
(this=this@entry=0x2aab91432580, f=f@entry=0x0) at HttpSM.cc:7111
    #14 0x00002aaaaacb9910 in HttpSM::state_hostdb_lookup (this=0x2aab91432580, 
event=500, data=0x2aebe3144800) at HttpSM.cc:2217
    #15 0x00002aaaaacc165d in HttpSM::main_handler (this=0x2aab91432580, 
event=500, data=0x2aebe3144800) at HttpSM.cc:2661
    #16 0x00002aaaaadc7f37 in Continuation::handleEvent (data=0x2aebe3144800, 
event=500, this=0x2aab91432580) at ../../iocore/eventsystem/I_Continuation.h:153
    #17 reply_to_cont (cont=0x2aab91432580, r=0x2aebe3144800, is_srv=<optimized 
out>) at HostDB.cc:474
    #18 0x00002aaaaadcc79d in HostDBContinuation::dnsEvent (this=<optimized 
out>, event=<optimized out>, e=<optimized out>) at HostDB.cc:1450
    #19 0x00002aaaaade3821 in Continuation::handleEvent (data=<optimized out>, 
event=600, this=<optimized out>) at 
../../iocore/eventsystem/I_Continuation.h:153
    #20 DNSEntry::postEvent (this=this@entry=0x2aaab76b4e00) at DNS.cc:1269
    #21 0x00002aaaaade880b in dns_result (h=h@entry=0x2aaabafc9ec0, 
e=e@entry=0x2aaab76b4e00, ent=<optimized out>, ent@entry=0x2aaaee3aa440, 
retry=retry@entry=false) at DNS.cc:1221
    #22 0x00002aaaaadeb189 in dns_process (len=<optimized out>, 
buf=0x2aaaee3aa440, handler=0x2aaabafc9ec0) at DNS.cc:1587
    #23 DNSHandler::recv_dns (this=this@entry=0x2aaabafc9ec0) at DNS.cc:782
    #24 0x00002aaaaadebac9 in DNSHandler::mainEvent (this=0x2aaabafc9ec0, 
event=<optimized out>, e=<optimized out>) at DNS.cc:794
    #25 0x00002aaaaaf0758e in Continuation::handleEvent (data=0x2aaab1788980, 
event=5, this=<optimized out>) at I_Continuation.h:153
    #26 EThread::process_event (calling_code=5, e=0x2aaab1788980, 
this=0x2aaab160db40) at UnixEThread.cc:143
    #27 EThread::execute (this=0x2aaab160db40) at UnixEThread.cc:270
    #28 0x00002aaaaaf06136 in spawn_thread_internal (a=0x2aaab09981f0) at 
Thread.cc:84
    #29 0x00002aaaad73caa1 in start_thread (arg=0x2aaab470c700) at 
pthread_create.c:301
    #30 0x00002aaaae5f393d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:115
    ```
    
    We're not sure whether this relates to HostDB sync, but the boxes we 
encountered this on did have syncing enabled.
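
For readers less familiar with the pattern in the trace: frames #17 through #6 show a callback delivered by HostDB (reply_to_cont) re-entering HostDB through getbyname_imm, where the thread then waits on a blocking lock. The sketch below is not the HostDB code and not a diagnosis of this issue. It only contrasts a blocking acquire with a try-lock that reports "busy" so the caller can retry later. Bucket, LookupResult, lookup_blocking, lookup_try and try_while_held are hypothetical names used for illustration, and std::mutex stands in for whatever lock type the real code uses.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a lock-protected lookup table partition.
struct Bucket {
  std::mutex mutex;
  int lookups = 0;
};

enum class LookupResult { Done, Retry };

// Blocking variant: waits until the lock is free. A thread that calls this
// from inside a callback can wait indefinitely if the current owner never
// releases the lock.
LookupResult lookup_blocking(Bucket &b) {
  std::lock_guard<std::mutex> guard(b.mutex);
  ++b.lookups;
  return LookupResult::Done;
}

// Non-blocking variant: returns Retry right away when the lock is busy,
// leaving it to the caller to reschedule the lookup.
LookupResult lookup_try(Bucket &b) {
  std::unique_lock<std::mutex> guard(b.mutex, std::try_to_lock);
  if (!guard.owns_lock()) {
    return LookupResult::Retry;
  }
  assert(guard.owns_lock());
  ++b.lookups;
  return LookupResult::Done;
}

// Holds the lock on the calling thread and runs lookup_try on a second
// thread, so the second thread always finds the lock busy.
LookupResult try_while_held(Bucket &b) {
  std::lock_guard<std::mutex> held(b.mutex);
  LookupResult result = LookupResult::Done;
  std::thread worker([&] { result = lookup_try(b); });
  worker.join();
  return result;
}
```

In this sketch, try_while_held returns LookupResult::Retry and leaves the counter untouched, whereas calling lookup_blocking from the worker thread in the same situation would never return, because the owner is waiting in join().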

----

