Hi Bernhard,

Thank you for looking into this.
I posted a third patch and checked it again today. I think it is still relevant. Please see this post:
http://lists.uclibc.org/pipermail/uclibc/2013-April/047723.html

I have to admit that I spent quite some time today figuring out what the exact problem was. However, the attached program demonstrates the issue. Both the main program and the thread should return a DNS answer, but only the main program does. See below for the output from the program.

What happens is that the main program tries to resolve both openser.ip6000 and openser.ip6000.spectralink.com, whereas the thread only tries openser.ip6000. This happens because the thread runs out of retries before it should.

Details:
1) __dns_lookup() calculates retries_left = __nameservers * __resolv_attempts
2) retries_left becomes 0 because __nameservers is 0
3) __nameservers is 0 because __open_nameservers() calls res_sync_func(), which sets __nameservers = rp->_u._ext.nscount
4) rp->_u._ext.nscount is 0 because it has not yet been initialized with the data read by __open_nameservers()

On a different note: shouldn't retries_left be calculated as
retries_left = __nameservers * __searchdomains * __resolv_attempts

I hope this helps to verify the patch.

Thanks
/Kenneth

-------------- program output with resolv DEBUG --------------
23
adding search spectralink.com
nameservers = 2
25
11
adding search spectralink.com
nameservers = 2
13
Nothing found in /etc/hosts
Looking up type 1 answer for 'openser.ip6000'
adding search spectralink.com
nameservers = 2
encoding header
lookup name: openser.ip6000
On try -1, sending query to 172.29.129.47, port 53
Xmit packet len:32 id:2 qr:0
len:107 id:2 qr:1
Got response (i think)!
qrcount=1,ancount=0,nscount=1,arcount=0
opcode=0,aa=0,tc=0,rd=1,ra=1,rcode=3
variant:-1 sdomains:1
__dns_lookup returned < 0
thread (nil)
15
29
Nothing found in /etc/hosts
Looking up type 1 answer for 'openser.ip6000'
adding search spectralink.com
nameservers = 2
encoding header
lookup name: openser.ip6000
On try 7, sending query to 172.29.129.47, port 53
Xmit packet len:32 id:3 qr:0
len:107 id:3 qr:1
Got response (i think)!
qrcount=1,ancount=0,nscount=1,arcount=0
opcode=0,aa=0,tc=0,rd=1,ra=1,rcode=3
variant:-1 sdomains:1
encoding header
lookup name: openser.ip6000.spectralink.com
On try 6, sending query to 172.29.129.47, port 53
Xmit packet len:48 id:4 qr:0
len:64 id:4 qr:1
Got response (i think)!
qrcount=1,ancount=1,nscount=0,arcount=0
opcode=0,aa=1,tc=0,rd=1,ra=1,rcode=0
Skipping question 0 at 12
Length of question 0 is 36
Decoding answer at pos 48
decode_answer(start): off 48, len 64
Total decode len = 2
i=2,rdlength=4
Answer name = |openser.ip6000.spectralink.com|
Answer type = |1|
a.add_count:0 a.rdlength:4 a.rdata:0x1e2711c
main 0xb6f44ec4
31
-------------- program output with resolv DEBUG --------------

________________________________________
From: Bernhard Reutner-Fischer <rep.dot....@gmail.com>
Sent: Tuesday, November 12, 2013 16:18
To: Sørensen, Kenneth
Cc: uclibc@uclibc.org
Subject: Re: [PATCH 2/3] Make res_init() thread safe.

On Thu, Apr 11, 2013 at 06:51:36AM +0000, Sørensen, Kenneth wrote:
>
> From c34c95553ac7ebd278059fada06319aaf132c906 Mon Sep 17 00:00:00 2001
> From: Kenneth Soerensen <kenneth.soren...@spectralink.com>
> Date: Wed, 10 Apr 2013 16:52:52 +0200
> Subject: [PATCH 2/3] Make res_init() thread safe.
>
> res_init() was not atomic, which could give undesired behaviour. Now
> res_init() is completely locked under one lock and the locking is
> removed from __res_vinit().
>
> Signed-off-by: Kenneth Soerensen <kenneth.soren...@spectralink.com>
> ---
>  libc/inet/resolv.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/libc/inet/resolv.c b/libc/inet/resolv.c
> index df6fefd..c230534 100644
> --- a/libc/inet/resolv.c
> +++ b/libc/inet/resolv.c
> @@ -3434,7 +3434,6 @@ __res_vinit(res_state rp, int preinit)
>       int m = 0;
>  #endif
>
> -     __UCLIBC_MUTEX_LOCK(__resolv_lock);
>       __close_nameservers();
>       __open_nameservers();
>
> @@ -3526,7 +3525,6 @@ __res_vinit(res_state rp, int preinit)
>
>       rp->options |= RES_INIT;
>
> -     __UCLIBC_MUTEX_UNLOCK(__resolv_lock);
>       return 0;
>  }
>
> @@ -3576,11 +3574,11 @@ res_init(void)
>       if (!_res.id)
>               _res.id = res_randomid();
>
> -     __UCLIBC_MUTEX_UNLOCK(__resolv_lock);
> -
>       __res_vinit(&_res, 1);
>       __res_sync = res_sync_func;
>
> +     __UCLIBC_MUTEX_UNLOCK(__resolv_lock);
> +
>       return 0;
>  }
>  libc_hidden_def(res_init)
> @@ -3679,7 +3677,9 @@ struct __res_state *__resp = &_res;
>  int
>  res_ninit(res_state statp)
>  {
> +     __UCLIBC_MUTEX_LOCK(__resolv_lock);
>       return __res_vinit(statp, 0);
> +     __UCLIBC_MUTEX_UNLOCK(__resolv_lock);

This hunk had the locking wrong as you can see.
Applied with slight adjustment.

Thanks,
resolvtest.cap
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
#include <netdb.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>

static void* the_thread(void *arg)
{
  printf("%d\n", __LINE__);
  res_init();
  printf("%d\n", __LINE__);
  printf("thread %p\n", gethostbyname("openser.ip6000"));
  printf("%d\n", __LINE__);
  return NULL;
}

int main(void)
{
  pthread_t handle;

  printf("%d\n", __LINE__);
  res_init();
  printf("%d\n", __LINE__);
  sleep(1);
  pthread_create(&handle, NULL, the_thread, NULL);
  sleep(1);
  printf("%d\n", __LINE__);
  printf("main %p\n", gethostbyname("openser.ip6000"));
  printf("%d\n", __LINE__);
  return 0;
}