Hi Mark,

Thanks a lot for the information, that is very helpful.

>From your comments, I understand that we should keep the current
DNS resolving behavior as is. The thing we need to improve is to
stop resolving if there is no /etc/resolv.conf configured.

And if /etc/resolv.conf exists and has "127.0.0.1" as the dns server,
like Numan mentioned, resolver will block. For testing, I feel that a
timeout dns_resolve makes sense. Can we determine testing
context in runtime?

Best,
Yifeng


On Thu, Nov 1, 2018 at 6:20 AM Mark Michelson <[email protected]> wrote:

> On 10/31/2018 06:24 PM, Yifeng Sun wrote:
> > Hi Ben,
> >
> > The dns resolving depends on libunbound's ub_resolve, which, from
> > Numan's experience as well as my reading on its documentation,
> > doesn't support timeout. I agree there is a bug and we should fix it.
> >
> > Thanks,
> > Yifeng
> >
>
> I don't think you're going to find many resolvers that support timeouts
> being passed to them directly. Most of the time, the system settings are
> going to be honored. On Linux distributions, this means using the
> resolv.conf timeout and attempts values. By default, these values are
> set to 5 and 2 respectively. This means that the resolution will wait 5
> seconds before it determines it has timed out, and will attempt the
> query 2 times before it decides that the query has failed.
>
> Working this way is great when it comes to user-friendliness. System
> admins are accustomed to using resolv.conf to control resolver behavior,
> so the DNS library isn't doing anything unexpected.
>
> However, this *sucks* when it comes to trying to test your application.
> Those defaults I specified before are not guaranteed to be the same
> across different Linux distributions, not to mention other platforms.
> Trying to predict what the timeout for your DNS query is going to be is
> going to be a pain.
>
> If you want to implement an upper bound on a timeout, your best bet is
> to use an asynchronous query and start your own timer. When your timer
> expires, then cancel the query. However, I would only recommend doing
> this in a test environment. Like I said before, administrators won't
> like it if we're messing with their configured DNS timeouts.
>
> I think you're onto the right idea here by modifying the behavior when
> there are no servers configured. This way, you're not relying on a
> timeout in your test for something that really should fail immediately.
>
> > On Wed, Oct 31, 2018 at 1:59 PM Ben Pfaff <[email protected]> wrote:
> >
> >> On Thu, Oct 25, 2018 at 03:27:41PM +0530, [email protected] wrote:
> >>> From: Numan Siddique <[email protected]>
> >>>
> >>> When 'make check' is called by the mock rpm build (which disables
> >> networking),
> >>> the test "ovn-nbctl: LBs - daemon" fails when it runs the command
> >>> "ovn-nbctl lb-add lb0 30.0.0.1a 192.168.10.10:80,192.168.10.20:80".
> >> ovn-nbctl
> >>> extracts the vip by calling the socket util function
> >> 'inet_parse_active()',
> >>> and this function blocks when libunbound function ub_resolve() is
> called
> >>> further down. ub_resolve() is a blocking function without timeout and
> >> all the
> >>> ovs/ovn utilities use this function.
> >>>
> >>> As reported by Timothy Redaelli, the issue can also be reproduced by
> >> running
> >>> the below commands
> >>>
> >>> $ sudo unshare -mn -- sh -c 'ip addr add dev lo 127.0.0.1 && \
> >>>    mount --bind /dev/null /etc/resolv.conf && runuser $SUDO_USER'
> >>> $ make sandbox SANDBOXFLAGS="--ovn"
> >>> $ ovn-nbctl -vsocket_util:off lb-add lb0 30.0.0.1a \
> >>>    192.168.10.10:80,192.168.10.20:80
> >>>
> >>> To address this issue, this patch adds a new function -
> >> inet_parse_ip_addr_and_port()
> >>> which expects IP:[port] address in the 'target_' argument and disables
> >> resolving
> >>> the host. This new function is now used in ovn-northd, ovn-nbctl and
> >> ovn-trace.
> >>> It is fine to use this function as load balancer VIP cannot be a
> >> hostname.
> >>>
> >>> Reported-by: Timothy Redaelli <[email protected]>
> >>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1641672
> >>> Tested-by: Timothy Redaelli <[email protected]>
> >>> Signed-off-by: Numan Siddique <[email protected]>
> >>
> >> I have multiple thoughts here.
> >>
> >> First, if the resolver in OVS never times out, then that seems like a
> >> bug in the OVS resolver.  Yifeng, you wrote the DNS code.  Is it true
> >> that it never times out?  If so, should we fix that.
> >>
> >> Second, about the mock RPM build with disabled networking.  Does this
> >> environment have a /etc/resolv.conf that specifies a DNS server?  If it
> >> does, then that seems like a bug in the build environment.  If it does
> >> not, then that seems like a bug in our DNS resolver code, because DNS
> >> resolution should immediately fail if no DNS servers are available.
> >>
> >> Third, again about naming.  If we are going to have two functions that
> >> act similarly, with the only difference being that one resolves DNS
> >> names and the other does not, then the naming should reflect that
> >> clearly.  It still isn't obvious to me with the new names.
> >>
> >> Thanks,
> >>
> >> Ben.
> >>
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to