Hi Mark,

Thanks a lot for the information; it is very helpful.

From your comments, I understand that we should keep the current DNS
resolving behavior as is. The thing we need to improve is to stop
resolving if there is no /etc/resolv.conf configured. And if
/etc/resolv.conf exists and has "127.0.0.1" as the DNS server, as Numan
mentioned, the resolver will block.

For testing, I feel that a dns_resolve with a timeout makes sense. Can we
determine at runtime whether we are in a testing context?

Best,
Yifeng

On Thu, Nov 1, 2018 at 6:20 AM Mark Michelson <[email protected]> wrote:

> On 10/31/2018 06:24 PM, Yifeng Sun wrote:
> > Hi Ben,
> >
> > The dns resolving depends on libunbound's ub_resolve, which, from
> > Numan's experience as well as my reading on its documentation,
> > doesn't support timeout. I agree there is a bug and we should fix it.
> >
> > Thanks,
> > Yifeng
> >
>
> I don't think you're going to find many resolvers that support timeouts
> being passed to them directly. Most of the time, the system settings are
> going to be honored. On Linux distributions, this means using the
> resolv.conf timeout and attempts values. By default, these values are
> set to 5 and 2 respectively. This means that the resolution will wait 5
> seconds before it determines it has timed out, and will attempt the
> query 2 times before it decides that the query has failed.
>
> Working this way is great when it comes to user-friendliness. System
> admins are accustomed to using resolv.conf to control resolver behavior,
> so the DNS library isn't doing anything unexpected.
>
> However, this *sucks* when it comes to trying to test your application.
> Those defaults I specified before are not guaranteed to be the same
> across different Linux distributions, not to mention other platforms.
> Trying to predict what the timeout for your DNS query is going to be is
> going to be a pain.
>
> If you want to implement an upper bound on a timeout, your best bet is
> to use an asynchronous query and start your own timer. When your timer
> expires, then cancel the query.
> However, I would only recommend doing
> this in a test environment. Like I said before, administrators won't
> like it if we're messing with their configured DNS timeouts.
>
> I think you're onto the right idea here by modifying the behavior when
> there are no servers configured. This way, you're not relying on a
> timeout in your test for something that really should fail immediately.
>
> > On Wed, Oct 31, 2018 at 1:59 PM Ben Pfaff <[email protected]> wrote:
> >
> >> On Thu, Oct 25, 2018 at 03:27:41PM +0530, [email protected] wrote:
> >>> From: Numan Siddique <[email protected]>
> >>>
> >>> When 'make check' is called by the mock rpm build (which disables networking),
> >>> the test "ovn-nbctl: LBs - daemon" fails when it runs the command
> >>> "ovn-nbctl lb-add lb0 30.0.0.1a 192.168.10.10:80,192.168.10.20:80". ovn-nbctl
> >>> extracts the vip by calling the socket util function 'inet_parse_active()',
> >>> and this function blocks when libunbound function ub_resolve() is called
> >>> further down. ub_resolve() is a blocking function without timeout and all the
> >>> ovs/ovn utilities use this function.
> >>>
> >>> As reported by Timothy Redaelli, the issue can also be reproduced by running
> >>> the below commands
> >>>
> >>> $ sudo unshare -mn -- sh -c 'ip addr add dev lo 127.0.0.1 && \
> >>> mount --bind /dev/null /etc/resolv.conf && runuser $SUDO_USER'
> >>> $ make sandbox SANDBOXFLAGS="--ovn"
> >>> $ ovn-nbctl -vsocket_util:off lb-add lb0 30.0.0.1a \
> >>> 192.168.10.10:80,192.168.10.20:80
> >>>
> >>> To address this issue, this patch adds a new function - inet_parse_ip_addr_and_port()
> >>> which expects IP:[port] address in the 'target_' argument and disables resolving
> >>> the host. This new function is now used in ovn-northd, ovn-nbctl and ovn-trace.
> >>> It is fine to use this function as load balancer VIP cannot be a hostname.
> >>>
> >>> Reported-by: Timothy Redaelli <[email protected]>
> >>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1641672
> >>> Tested-by: Timothy Redaelli <[email protected]>
> >>> Signed-off-by: Numan Siddique <[email protected]>
> >>
> >> I have multiple thoughts here.
> >>
> >> First, if the resolver in OVS never times out, then that seems like a
> >> bug in the OVS resolver. Yifeng, you wrote the DNS code. Is it true
> >> that it never times out? If so, should we fix that.
> >>
> >> Second, about the mock RPM build with disabled networking. Does this
> >> environment have a /etc/resolv.conf that specifies a DNS server? If it
> >> does, then that seems like a bug in the build environment. If it does
> >> not, then that seems like a bug in our DNS resolver code, because DNS
> >> resolution should immediately fail if no DNS servers are available.
> >>
> >> Third, again about naming. If we are going to have two functions that
> >> act similarly, with the only difference being that one resolves DNS
> >> names and the other does not, then the naming should reflect that
> >> clearly. It still isn't obvious to me with the new names.
> >>
> >> Thanks,
> >>
> >> Ben.
> >>
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> >
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
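
A minimal sketch of the "fail immediately when no servers are configured"
idea discussed in this thread, written in Python for brevity rather than
in OVS's C. The names parse_resolv_conf() and should_resolve() are
hypothetical and are not part of OVS or libunbound. The defaults of 5
seconds and 2 attempts are the values quoted in the thread and may differ
between platforms. The code only inspects resolv.conf-style text; it does
not issue any DNS query.

```python
def parse_resolv_conf(text):
    """Return (nameservers, timeout, attempts) from resolv.conf-style text."""
    nameservers = []
    timeout, attempts = 5, 2  # assumed defaults, as quoted in the thread
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' or ';' in the first column).
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "options":
            for opt in fields[1:]:
                name, _, value = opt.partition(":")
                if name == "timeout" and value.isdigit():
                    timeout = int(value)
                elif name == "attempts" and value.isdigit():
                    attempts = int(value)
    return nameservers, timeout, attempts


def should_resolve(text):
    """Hypothetical policy: attempt DNS resolution only if a server is listed."""
    nameservers, _, _ = parse_resolv_conf(text)
    return bool(nameservers)
```

With the /dev/null bind mount from the reproduction steps, the file
content is empty, so should_resolve() returns False and a caller could
fail right away instead of blocking. A file that lists 127.0.0.1 still
returns True, which matches the blocking case Numan described.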
