Hi Bernhard

Thank you for looking into this.

I posted a third patch, and I checked today: I think it is still relevant.
Please see this post:

http://lists.uclibc.org/pipermail/uclibc/2013-April/047723.html

I have to admit that I spent quite some time today figuring out what the exact
problem was. However, the attached program demonstrates the issue. Both the
main program and the thread should get a DNS answer, but only the main program
does. See below for the output from the program.

What happens is that the main program tries to resolve openser.ip6000 and
openser.ip6000.spectralink.com. The thread only tries to resolve
openser.ip6000. This happens because the thread runs out of retries
before it should.

Details:
1) __dns_lookup() calculates retries_left = __nameservers * __resolv_attempts
2) retries_left becomes 0 because __nameservers is 0
3) __nameservers is 0 because __open_nameservers() calls res_sync_func(), which
   sets __nameservers = rp->_u._ext.nscount
4) rp->_u._ext.nscount is 0 because it has not yet been initialized with the
   data read by __open_nameservers()
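
To make the chain in steps 1-4 concrete, here is a small stand-alone model of
it. It is only a sketch: every model_* name is made up for illustration and
nothing here is copied from resolv.c. It assumes, as described above, that the
sync hook overwrites the global nameserver count with a count from a state that
has not been filled in yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the resolver globals; NOT the uClibc definitions. */
static unsigned model_nameservers;      /* plays the role of __nameservers */
static unsigned model_state_nscount;    /* plays the role of rp->_u._ext.nscount */
static void (*model_sync)(void);        /* plays the role of the __res_sync hook */

/* Models res_sync_func(): overwrite the global with the count stored in the state. */
static void model_sync_func(void)
{
    model_nameservers = model_state_nscount;
}

/* Models __open_nameservers(): take the parsed count, then run the hook if installed. */
static void model_open_nameservers(unsigned parsed)
{
    model_nameservers = parsed;
    if (model_sync != NULL)
        model_sync();
}

/*
 * Returns the retry budget a lookup would start with.
 * hook_installed: nonzero if the sync hook is already set when the
 *                 nameservers are opened (the failing case).
 * parsed:         number of nameservers found in the configuration.
 * attempts:       attempts per nameserver (illustrative, chosen by the caller).
 */
unsigned model_retry_budget(int hook_installed, unsigned parsed, unsigned attempts)
{
    assert(attempts > 0);
    model_state_nscount = 0;  /* state not yet filled with the parsed data */
    model_sync = hook_installed ? model_sync_func : NULL;
    model_open_nameservers(parsed);
    return model_nameservers * attempts;
}
```

In this model, model_retry_budget(0, 2, 3) gives a budget of 6, while
model_retry_budget(1, 2, 3) gives 0 even though two nameservers were parsed.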

On a different note: shouldn't retries_left be calculated as
retries_left = __nameservers * __searchdomains * __resolv_attempts ?
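
As a rough illustration of that question, the sketch below compares the two
formulas side by side. The function names and the input values are
hypothetical. The third variant, which counts the bare name as one extra
candidate, is purely an assumption added here for comparison; it is based on
the debug output trying the plain name first (variant:-1) and is not part of
the proposal above.

```c
#include <assert.h>

/* Budget as computed today, per step 1 of the details. */
unsigned budget_current(unsigned nameservers, unsigned attempts)
{
    return nameservers * attempts;
}

/* Budget as proposed: additionally scaled by the number of search domains.
 * Note that the literal product is 0 when no search domain is configured. */
unsigned budget_proposed(unsigned nameservers, unsigned searchdomains,
                         unsigned attempts)
{
    return nameservers * searchdomains * attempts;
}

/* Assumed variant: one budget share per candidate name, where the candidates
 * are the bare name plus one name per search domain. */
unsigned budget_per_candidate(unsigned nameservers, unsigned searchdomains,
                              unsigned attempts)
{
    return nameservers * (searchdomains + 1) * attempts;
}
```

With 2 nameservers, 1 search domain and an illustrative 3 attempts, the
current and the proposed formula both give 6, and the per-candidate variant
gives 12.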

I hope this helps to verify the patch.

Thanks

/Kenneth

-------------- program output with resolv DEBUG --------------
23
adding search spectralink.com
nameservers = 2
25
11
adding search spectralink.com
nameservers = 2
13
Nothing found in /etc/hosts
Looking up type 1 answer for 'openser.ip6000'
adding search spectralink.com
nameservers = 2
encoding header
lookup name: openser.ip6000
On try -1, sending query to 172.29.129.47, port 53
Xmit packet len:32 id:2 qr:0
len:107 id:2 qr:1
Got response (i think)!
qrcount=1,ancount=0,nscount=1,arcount=0
opcode=0,aa=0,tc=0,rd=1,ra=1,rcode=3
variant:-1 sdomains:1
__dns_lookup returned < 0
thread (nil)
15
29
Nothing found in /etc/hosts
Looking up type 1 answer for 'openser.ip6000'
adding search spectralink.com
nameservers = 2
encoding header
lookup name: openser.ip6000
On try 7, sending query to 172.29.129.47, port 53
Xmit packet len:32 id:3 qr:0
len:107 id:3 qr:1
Got response (i think)!
qrcount=1,ancount=0,nscount=1,arcount=0
opcode=0,aa=0,tc=0,rd=1,ra=1,rcode=3
variant:-1 sdomains:1
encoding header
lookup name: openser.ip6000.spectralink.com
On try 6, sending query to 172.29.129.47, port 53
Xmit packet len:48 id:4 qr:0
len:64 id:4 qr:1
Got response (i think)!
qrcount=1,ancount=1,nscount=0,arcount=0
opcode=0,aa=1,tc=0,rd=1,ra=1,rcode=0
Skipping question 0 at 12
Length of question 0 is 36
Decoding answer at pos 48
decode_answer(start): off 48, len 64
Total decode len = 2
i=2,rdlength=4
Answer name = |openser.ip6000.spectralink.com|
Answer type = |1|
a.add_count:0 a.rdlength:4 a.rdata:0x1e2711c
main 0xb6f44ec4
31
-------------- program output with resolv DEBUG --------------

________________________________________
From: Bernhard Reutner-Fischer <rep.dot....@gmail.com>
Sent: Tuesday, November 12, 2013 16:18
To: Sørensen, Kenneth
Cc: uclibc@uclibc.org
Subject: Re: [PATCH 2/3] Make res_init() thread safe.

On Thu, Apr 11, 2013 at 06:51:36AM +0000, Sørensen, Kenneth wrote:
>
> From c34c95553ac7ebd278059fada06319aaf132c906 Mon Sep 17 00:00:00 2001
> From: Kenneth Soerensen <kenneth.soren...@spectralink.com>
> Date: Wed, 10 Apr 2013 16:52:52 +0200
> Subject: [PATCH 2/3] Make res_init() thread safe.
>
> res_init() was not atomic, which could give undesired behaviour. Now
> res_init() is completely locked under one lock and the locking is
> removed from __res_vinit().
>
> Signed-off-by: Kenneth Soerensen <kenneth.soren...@spectralink.com>
> ---
>  libc/inet/resolv.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/libc/inet/resolv.c b/libc/inet/resolv.c
> index df6fefd..c230534 100644
> --- a/libc/inet/resolv.c
> +++ b/libc/inet/resolv.c
> @@ -3434,7 +3434,6 @@ __res_vinit(res_state rp, int preinit)
>       int m = 0;
>  #endif
>
> -     __UCLIBC_MUTEX_LOCK(__resolv_lock);
>       __close_nameservers();
>       __open_nameservers();
>
> @@ -3526,7 +3525,6 @@ __res_vinit(res_state rp, int preinit)
>
>       rp->options |= RES_INIT;
>
> -     __UCLIBC_MUTEX_UNLOCK(__resolv_lock);
>       return 0;
>  }
>
> @@ -3576,11 +3574,11 @@ res_init(void)
>       if (!_res.id)
>               _res.id = res_randomid();
>
> -     __UCLIBC_MUTEX_UNLOCK(__resolv_lock);
> -
>       __res_vinit(&_res, 1);
>       __res_sync = res_sync_func;
>
> +     __UCLIBC_MUTEX_UNLOCK(__resolv_lock);
> +
>       return 0;
>  }
>  libc_hidden_def(res_init)
> @@ -3679,7 +3677,9 @@ struct __res_state *__resp = &_res;
>  int
>  res_ninit(res_state statp)
>  {
> +     __UCLIBC_MUTEX_LOCK(__resolv_lock);
>       return __res_vinit(statp, 0);
> +     __UCLIBC_MUTEX_UNLOCK(__resolv_lock);

This hunk had the locking wrong as you can see. Applied with slight
adjustment.
Thanks,
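
For reference, the pattern that the last hunk was presumably aiming for looks
roughly like the following. This is a sketch only: sketch_lock, sketch_vinit()
and sketch_ninit() are hypothetical stand-ins built on plain pthread mutexes.
They are not the uClibc macros and not the adjustment that was actually
applied. The point is that the result is stored first, so the unlock runs
before the return instead of after it.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical lock standing in for the resolver lock. */
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts how often the (hypothetical) initializer ran. */
static int sketch_init_calls;

/* Stand-in for the real initialization work; always succeeds here. */
static int sketch_vinit(void)
{
    ++sketch_init_calls;
    return 0;
}

/* Take the lock, do the work, release the lock, then return the result. */
int sketch_ninit(void)
{
    int ret;

    pthread_mutex_lock(&sketch_lock);
    ret = sketch_vinit();
    pthread_mutex_unlock(&sketch_lock);  /* reached on every path */
    return ret;
}
```

Calling sketch_ninit() twice in a row works because the lock is released each
time; with the unlock placed after the return statement, the lock would still
be held when the second call tries to take it.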

Attachment: resolvtest.cap
Description: resolvtest.cap

#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
#include <netdb.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>

static void* the_thread(void *arg)
{
  printf("%d\n", __LINE__);
  res_init();
  printf("%d\n", __LINE__);  
  printf("thread %p\n", gethostbyname("openser.ip6000"));
  printf("%d\n", __LINE__);
  return NULL;
}

int main(void)
{
  pthread_t handle;

  printf("%d\n", __LINE__);
  res_init();
  printf("%d\n", __LINE__);
  sleep(1);
  pthread_create(&handle, NULL, the_thread, NULL);
  sleep(1);
  printf("%d\n", __LINE__);
  printf("main %p\n", gethostbyname("openser.ip6000"));
  printf("%d\n", __LINE__);
  return 0;
}

_______________________________________________
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc
