[GitHub] trafficserver issue #1335: Deadlock in HostDB

2017-03-09 Thread zwoop
Github user zwoop commented on the issue:

https://github.com/apache/trafficserver/issues/1335
  
@jacksontj @vmamidi is working on this, and has a fix (hopefully).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] trafficserver issue #1335: Deadlock in HostDB

2017-03-06 Thread jacksontj
Github user jacksontj commented on the issue:

https://github.com/apache/trafficserver/issues/1335
  
@zwoop Were you able to come up with a reproducible case for this? I'm 
going to try to take a look this week; it'd be easier if I had a repro method 
:)




[GitHub] trafficserver issue #1335: Deadlock in HostDB

2017-02-08 Thread zwoop
Github user zwoop commented on the issue:

https://github.com/apache/trafficserver/issues/1335
  
@randall reports that this happens with sync off as well :-/




[GitHub] trafficserver issue #1335: Deadlock in HostDB

2017-01-18 Thread zwoop
GitHub user zwoop opened an issue:

https://github.com/apache/trafficserver/issues/1335

Deadlock in HostDB

We have some 7.0.0 boxes that end up completely wedged, with all ET_NET 
threads stuck on the same lock (so, a deadlock):

```
#6  HostDBProcessor::getbyname_imm (this=, cont=cont@entry=0x2ab037b1d420, process_hostdb_info=, hostname=, len=, opt=...) at HostDB.cc:816
#6  HostDBProcessor::getbyname_imm (this=, cont=cont@entry=0x2aabc1e66a00, process_hostdb_info=, hostname=, len=, opt=...) at HostDB.cc:816
...
```

The trace is always the same in every thread:

```
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x2d73e5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x2d73e4a7 in __pthread_mutex_lock (mutex=0x2aaab098a290) at pthread_mutex_lock.c:61
#3  0x2adca986 in ink_mutex_acquire (m=0x2aaab098a290) at ../../lib/ts/ink_mutex.h:90
#4  Mutex_lock (t=0x2aaab160db40, m=0x2aaab098a280) at ../../iocore/eventsystem/I_Lock.h:410
#5  MutexLock::MutexLock (t=0x2aaab160db40, am=0x2aaab098a280, this=0x2aaab470a890) at ../../iocore/eventsystem/I_Lock.h:497
#6  HostDBProcessor::getbyname_imm (this=, cont=cont@entry=0x2aab91432580, process_hostdb_info=, hostname=, len=, opt=...) at HostDB.cc:816
#7  0x2acae21c in HttpSM::do_hostdb_lookup (this=this@entry=0x2aab91432580) at HttpSM.cc:4133
#8  0x2acc0093 in HttpSM::set_next_state (this=0x2aab91432580) at HttpSM.cc:7248
#9  0x2acad47a in HttpSM::call_transact_and_set_next_state (this=this@entry=0x2aab91432580, f=f@entry=0x0) at HttpSM.cc:7111
#10 0x2acb7baf in HttpSM::handle_api_return (this=0x2aab91432580) at HttpSM.cc:1604
#11 0x2acba5eb in HttpSM::state_api_callout (this=0x2aab91432580, event=0, data=0x0) at HttpSM.cc:1542
#12 0x2acbf62b in HttpSM::set_next_state (this=0x2aab91432580) at HttpSM.cc:7144
#13 0x2acad47a in HttpSM::call_transact_and_set_next_state (this=this@entry=0x2aab91432580, f=f@entry=0x0) at HttpSM.cc:7111
#14 0x2acb9910 in HttpSM::state_hostdb_lookup (this=0x2aab91432580, event=500, data=0x2aebe3144800) at HttpSM.cc:2217
#15 0x2acc165d in HttpSM::main_handler (this=0x2aab91432580, event=500, data=0x2aebe3144800) at HttpSM.cc:2661
#16 0x2adc7f37 in Continuation::handleEvent (data=0x2aebe3144800, event=500, this=0x2aab91432580) at ../../iocore/eventsystem/I_Continuation.h:153
#17 reply_to_cont (cont=0x2aab91432580, r=0x2aebe3144800, is_srv=) at HostDB.cc:474
#18 0x2adcc79d in HostDBContinuation::dnsEvent (this=, event=, e=) at HostDB.cc:1450
#19 0x2ade3821 in Continuation::handleEvent (data=, event=600, this=) at ../../iocore/eventsystem/I_Continuation.h:153
#20 DNSEntry::postEvent (this=this@entry=0x2aaab76b4e00) at DNS.cc:1269
#21 0x2ade880b in dns_result (h=h@entry=0x2aaabafc9ec0, e=e@entry=0x2aaab76b4e00, ent=, ent@entry=0x2aaaee3aa440, retry=retry@entry=false) at DNS.cc:1221
#22 0x2adeb189 in dns_process (len=, buf=0x2aaaee3aa440, handler=0x2aaabafc9ec0) at DNS.cc:1587
#23 DNSHandler::recv_dns (this=this@entry=0x2aaabafc9ec0) at DNS.cc:782
#24 0x2adebac9 in DNSHandler::mainEvent (this=0x2aaabafc9ec0, event=, e=) at DNS.cc:794
#25 0x2af0758e in Continuation::handleEvent (data=0x2aaab1788980, event=5, this=) at I_Continuation.h:153
#26 EThread::process_event (calling_code=5, e=0x2aaab1788980, this=0x2aaab160db40) at UnixEThread.cc:143
#27 EThread::execute (this=0x2aaab160db40) at UnixEThread.cc:270
#28 0x2af06136 in spawn_thread_internal (a=0x2aaab09981f0) at Thread.cc:84
#29 0x2d73caa1 in start_thread (arg=0x2aaab470c700) at pthread_create.c:301
#30 0x2e5f393d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
```

We're not sure whether this relates to HostDB sync, but the boxes where we 
encountered it did have syncing on.





