Re: NXDOMAIN problems

2020-11-17 Thread G.W. Haywood via bind-users

Hi there,

On Tue, 17 Nov 2020, Boylan, Ross wrote:


I have been experiencing NXDOMAIN errors ...
... There are a lot of complications.
... The remote machine is only accessible through VPN
... the nameserver ... is also accessible only through VPN

... The VPN connection has always been a bit touchy ...


In my experience, complicated usually also means unreliable.

Does it _need_ to be complicated?

Could you not just put

192.0.2.3   mymachine.ucsf.edu  mymachine

or similar into /etc/hosts (or whatever passes for that on the client)?
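
As a rough illustration of what such an entry buys you, here is a minimal
Python sketch of how a hosts-format file maps names to addresses. The
parse_hosts helper and the SAMPLE text are hypothetical, and 192.0.2.3 is just
the placeholder address from the line above. On a typical Debian system the
hosts file is consulted before DNS only when /etc/nsswitch.conf lists "files"
ahead of "dns" on its hosts line.

```python
def parse_hosts(text):
    """Map every hostname and alias in hosts-format text to its IP address."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no names is not a usable entry
        ip, names = fields[0], fields[1:]
        for name in names:
            # The first entry for a name wins, as in a top-to-bottom scan.
            table.setdefault(name.lower(), ip)
    return table


# Hypothetical file contents; the address is a documentation placeholder.
SAMPLE = """\
127.0.0.1   localhost
192.0.2.3   mymachine.ucsf.edu  mymachine
"""

hosts = parse_hosts(SAMPLE)
print(hosts["mymachine.ucsf.edu"])
print(hosts["mymachine"])
```

Both the full name and the short alias resolve without any DNS query, so the
VPN nameserver, the forwarders and the search list never come into play for
this one machine.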

--

73,
Ged.
___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

ISC funds the development of this software with paid support subscriptions. 
Contact us at https://www.isc.org/contact/ for more information.


bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: NXDOMAIN problems

2020-11-16 Thread Matus UHLAR - fantomas

On 17.11.20 05:41, Boylan, Ross wrote:

One other detail may be important: I just added a bridge interface and
virtual machines.  I presume the VPN tunnel was using the hardware
interface (enp5s0) before, and is using the bridge (br0) now.  OpenConnect
creates the tunnel (tun0); both the name and inspection of the code
indicate the tunnel is based on the TUN interface, at the IP layer,
instead of the TAP interface, at the MAC layer.  If some of the
communication is not using IP then I presume it could be disappearing at
the bridge.


I guess that your VPN uses the IP that is topologically closest to the
other side of the VPN tunnel. Usually that is the IP of the interface with
the default route.

You can often override it in the VPN configuration.
Note that this is not a BIND issue.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Eagles may soar, but weasels don't get sucked into jet engines.


Re: NXDOMAIN problems

2020-11-16 Thread Matus UHLAR - fantomas

On 16.11.20 22:58, Boylan, Ross wrote:

I have been experiencing NXDOMAIN errors persistently, though not 100% of
the time, for a machine I am trying to reach.  The queries worked OK
before today.  I not only don't know what's causing it, but am having
trouble tracing what's going on inside of bind.  I'd be grateful for help
on either front, getting DNS to work or debugging.

There are a lot of complications.  In brief, the machine and name
resolution for it are only available through VPN; I have a search list
which should cause some failed lookups if the original doesn't work; and
I'm using views.  Some details follow, and then discussion of my debugging
attempts.

DETAILS

The remote machine is only accessible through VPN, and the nameserver that
knows how to find it is also accessible only through VPN.  The IP of that
nameserver is first on my forwarders list on my local machine.  When
failures happen the replies indicate the request was addressed to the
public-facing nameservers; it is good that they don't provide any info,
but they shouldn't be getting the request.


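Forwarders are not used in the order specified; named measures the round-trip
time (RTT) of each and prefers the server that answers fastest.

You can configure your domain with its own forwarders and set it to
"forward only", so that named never falls back to resolving those names via
the public nameservers itself.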
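
A minimal sketch of what such a per-domain stanza could look like, written as
a small Python helper that prints it. The forward_zone function is
hypothetical, and 192.0.2.53 is a placeholder for the VPN-only nameserver,
whose real address is not given in this thread. Since views are in use, the
stanza would presumably have to go inside the "inside" view.

```python
def forward_zone(domain, forwarders):
    """Render a BIND forward-zone stanza that sets 'forward only'."""
    addrs = " ".join(f"{ip};" for ip in forwarders)
    return (
        f'zone "{domain}" {{\n'
        f"    type forward;\n"
        f"    forward only;\n"
        f"    forwarders {{ {addrs} }};\n"
        f"}};\n"
    )


# 192.0.2.53 stands in for the nameserver reachable only through the VPN.
print(forward_zone("ucsf.edu", ["192.0.2.53"]))
```

With a stanza like this, queries under ucsf.edu go only to the listed
forwarder, while the global forwarders list keeps handling everything else.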


I also added the target domain (ucsf.edu) to my search list.  So when I ask
for mymachine.ucsf.edu, this will also generate a query for
mymachine.ucsf.edu.ucsf.edu if the first query fails.  The second query is
asking for a non-existent domain, and so maybe that is the proximate
source of the NXDOMAIN.


This is controlled by the "ndots" option in resolv.conf. With "ndots:1" (the
default), any hostname with one or more dots is tried as-is first, and the
search list is used only if that lookup fails.
... this is not a BIND issue but a stub resolver issue.
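
To make the ordering concrete, here is a small Python model of the names a
stub resolver would try, in order, for a given search list and ndots value. It
is a sketch of the behaviour documented in resolv.conf(5) for the glibc
resolver, not its actual code, and the candidates function is hypothetical;
other stub resolvers may behave differently.

```python
def candidates(name, search, ndots=1):
    """Return the fully qualified names a stub resolver would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute: no search list at all.
        return [name]
    as_is = name + "."
    expanded = [f"{name}.{domain.rstrip('.')}." for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [as_is] + expanded
    # Too few dots: try the search list first, the bare name last.
    return expanded + [as_is]


print(candidates("mymachine.ucsf.edu", ["ucsf.edu"]))
# ['mymachine.ucsf.edu.', 'mymachine.ucsf.edu.ucsf.edu.']
print(candidates("mymachine", ["ucsf.edu"]))
# ['mymachine.ucsf.edu.', 'mymachine.']
print(candidates("mymachine.ucsf.edu.", ["ucsf.edu"]))
# ['mymachine.ucsf.edu.']
```

The resolver stops at the first name that resolves, so the second candidate
is only queried when the first one fails. Writing the name with a trailing dot
is one way to keep the search list out of the picture while testing.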

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I intend to live forever - so far so good.


Re: NXDOMAIN problems

2020-11-16 Thread Ondřej Surý
Ross,

I don’t have an answer for what’s happening, but it would help your debugging
to see what happens on the wire. Wireshark is usually helpful for that.

Also, reviewing named.conf after the networking change might help, and
sharing an anonymized named.conf might prompt a reply from somebody with
similar experience.
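
As a complement to a packet capture, the following Python sketch sends the
same question to a chosen server over UDP and reports the RCODE from the
reply header (RFC 1035, section 4.1.1), so the local named and the campus
nameserver can be compared directly. The function names are hypothetical, and
it assumes plain UDP on port 53 with no EDNS, which is less than dig does by
default.

```python
import socket
import struct

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qid=0x1234, qtype=1):
    """Build a minimal DNS query with RD set; qtype 1 is an A record."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def rcode(response):
    """Extract the RCODE (low four bits of the flags word) from a reply."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, str(code))


def ask(server, name, timeout=3.0):
    """Send one A query for `name` to `server` and return the reply's RCODE."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _ = sock.recvfrom(4096)
    return rcode(data)
```

For example, ask("127.0.0.1", "mymachine.ucsf.edu") would query a named
listening on localhost, and calling ask again with the campus nameserver's
address in place of 127.0.0.1 would show whether the two disagree.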

Ondrej
--
Ondřej Surý — ISC (He/Him)

> On 17. 11. 2020, at 6:42, Boylan, Ross wrote:
> 
> One other detail may be important: I just added a bridge interface and 
> virtual machines.  I presume the VPN tunnel was using the hardware interface 
> (enp5s0) before, and is using the bridge (br0) now.  OpenConnect creates the 
> tunnel (tun0); both the name and inspection of the code indicate the tunnel 
> is based on the TUN interface, at the IP layer, instead of the TAP interface, 
> at the MAC layer.  If some of the communication is not using IP then I 
> presume it could be disappearing at the bridge.
> 
> This theory seems to imply that DNS lookup will always fail, which is not the 
> case.  dig always works (though not a lot of tests) and general lookup rarely 
> works.  I presume the general lookups go through bind, though maybe lwres is 
> involved. If dig and bind use different communication methods that have 
> different abilities to traverse the network stack that might explain some of 
> the differences.
> 
> I don't think the virtual network is running any DNS servers since a) with 
> bridging it is not an option and b) they are getting IPs from my main 
> machine.  But if they were, that could definitely mess things up.
> 
> This is on Debian 10 (buster) with a Linux 4.19 kernel and bind 9.11.5.

Re: NXDOMAIN problems

2020-11-16 Thread Boylan, Ross
One other detail may be important: I just added a bridge interface and virtual 
machines.  I presume the VPN tunnel was using the hardware interface (enp5s0) 
before, and is using the bridge (br0) now.  OpenConnect creates the tunnel 
(tun0); both the name and inspection of the code indicate the tunnel is based 
on the TUN interface, at the IP layer, instead of the TAP interface, at the MAC 
layer.  If some of the communication is not using IP then I presume it could be 
disappearing at the bridge.

This theory seems to imply that DNS lookup will always fail, which is not the 
case.  dig always works (though not a lot of tests) and general lookup rarely 
works.  I presume the general lookups go through bind, though maybe lwres is 
involved. If dig and bind use different communication methods that have 
different abilities to traverse the network stack that might explain some of 
the differences.

I don't think the virtual network is running any DNS servers since a) with 
bridging it is not an option and b) they are getting IPs from my main machine.  
But if they were, that could definitely mess things up.

This is on Debian 10 (buster) with a Linux 4.19 kernel and bind 9.11.5.




NXDOMAIN problems

2020-11-16 Thread Boylan, Ross
I have been experiencing NXDOMAIN errors persistently, though not 100% of the 
time, for a machine I am trying to reach.  The queries worked OK before today.  
I not only don't know what's causing it, but am having trouble tracing what's 
going on inside of bind.  I'd be grateful for help on either front, getting DNS 
to work or debugging.

There are a lot of complications.  In brief, the machine and name resolution 
for it are only available through VPN; I have a search list which should cause 
some failed lookups if the original doesn't work; and I'm using views.  Some 
details follow, and then discussion of my debugging attempts.

DETAILS

The remote machine is only accessible through VPN, and the nameserver that knows 
how to find it is also accessible only through VPN.  The IP of that nameserver 
is first on my forwarders list on my local machine.  When failures happen the 
replies indicate the request was addressed to the public-facing nameservers; it 
is good that they don't provide any info, but they shouldn't be getting the 
request.

I also added the target domain (ucsf.edu) to my search list.  So when I ask for 
mymachine.ucsf.edu, this will also generate a query for 
mymachine.ucsf.edu.ucsf.edu if the first query fails.  The second query is 
asking for a non-existent domain, and so maybe that is the proximate source of 
the NXDOMAIN.

The machine I'm making the query from is in my own domain, which is why I'm 
running BIND.  I use views, and the query is processed through my "inside" view 
according to the logs.  ucsf.edu is NOT a domain I manage.

DEBUGGING

I directed, either explicitly or via default, all channels to a file and I have 
tried rndc trace as high as 4.  But I can't tell if the values are coming from 
the cache or where external queries are going.  Even after flushing the cache I 
didn't see any info on outbound queries.  I tried using the query-errors 
channel first, but it didn't seem to capture anything.  I guess NXDOMAIN is not 
considered an error.

Occasionally I've had success, particularly after flushing the cache (though 
that doesn't always work).  But when I try 30 seconds later I get NXDOMAIN.

Every query I have directed explicitly (with dig) at the campus nameserver has 
succeeded.

The VPN connection has always been a bit touchy, and the problem first arose 
immediately after it went down for somewhere between 30 seconds and a couple of 
minutes.  My initial theory was that had caused a failure to be cached, but the 
way I get failures right after successes is not consistent with that.

Thanks for any help.

Ross Boylan