Re: How Zone Files Are Read

2020-12-16 Thread Mark Andrews


> On 17 Dec 2020, at 06:44, Timothe Litt  wrote:
> 
> 
> On 16-Dec-20 13:52, Tim Daneliuk wrote:
>> On 12/16/20 12:25 PM, Timothe Litt wrote:
>> 
>>> On 16-Dec-20 11:37, Tim Daneliuk wrote:
>>> 
>>>> I ran into a situation yesterday which got me pondering something about 
>>>> bind.
>>>> 
>>>> In this case, a single line in a zone file was bad.  The devops automation
>>>> had inserted a space in the hostname field of a PTR record.
>>>> 
>>>> What was interesting was that - at startup - bind absolutely refused
>>>> to load the zone file at all.  I would have expected it to complain
>>>> about the bad record and ignore it, but load the rest of the
>>>> good records.
>>>> 
>>>> Can someone please explain the rationale or logic for this?  Not 
>>>> complaining,
>>>> just trying to understand for future reference.
>>>> 
>>>> TIA,
>>>> Tim
>>>> 
>>> DNS is complicated.  The scope of an error in a zonefile is hard to 
>>> determine.
>>> 
>>> To avoid this, your automation should use named-checkzone before releasing 
>>> a zone file.
>>> 
>>> This will perform all the checks that named will when it is loaded.
>>> 
>>> 
>> 
>> Kind of what I thought.  Whoever built the environment in question
>> really didn't understand DNS very well and hacked together a kludge
>> that I am still trying to get my head around.
>> 
>> 
> For a simple example of why it's complicated - what if the typo you had was 
> for a host that sends e-mail?
> 
> You'll see intermittent delivery errors when remote hosts can't resolve the 
> host's address; some require that a reverse lookup resolve to the host as an 
> anti-spoofing measure.  Others won't.  You'll spend a long time diagnosing.
> named can't tell this case from a typo for a local printer's PTR - where it's 
> unlikely that a reverse lookup failure will matter.  Of course, this means it 
> could go undetected for years - until it IS needed.
> 
> Or the typo is in a NS record - which you probably won't detect until the 
> other NS goes down...
> 
> And, any errors are cached for their TTL by resolvers.  The TTL may 
> (hopefully for query rate reduction) be large.  In your case, it would be the 
> negative TTL (meaning that even adding the record later wouldn't have 
> immediate effect).
> The bottom line is that named must assume that anything placed in a zone file 
> is important, and that the external impact - either sin of omission or 
> commission - might be large.
> 
> Thus, while named can't detect all (or even most) errors, those that it does 
> detect cause immediate failure to load.  That prevents caching and 
> propagation as well as getting human attention.
> When something's wrong, it's best to stop and fix it.  Error recovery is a 
> very good thing - but only when you can demonstrate that the cure is better 
> than the disease.  Skipping format errors in a zone file would not satisfy 
> that constraint.
> Timothe Litt
> ACM Distinguished Engineer
> --
> This communication may not represent the ACM or my employer's views,
> if any, on the matters discussed. 

And on top of all this there is STD 13 (RFC 1034, RFC 1035) which says
in RFC 1035:

"5.2. Use of master files to define zones

When a master file is used to load a zone, the operation should be
suppressed if any errors are encountered in the master file.  The
rationale for this is that a single error can have widespread
consequences.  For example, suppose that the RRs defining a delegation
have syntax errors; then the server will return authoritative name
errors for all names in the subzone (except in the case where the
subzone is also present on the server).

Several other validity checks that should be performed in addition to
insuring that the file is syntactically correct:

   1. All RRs in the file should have the same class.

   2. Exactly one SOA RR should be present at the top of the zone.

   3. If delegations are present and glue information is required,
      it should be present.

   4. Information present outside of the authoritative nodes in the
      zone should be glue information, rather than the result of an
      origin or similar error."
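
As a rough illustration of checks 1 and 2 in that list (this is not BIND's
code; the Record shape and the function name are invented for the example,
and the records are assumed to be parsed already), something like:

```python
from __future__ import annotations

from typing import NamedTuple


class Record(NamedTuple):
    """One already-parsed resource record (hypothetical, simplified shape)."""
    owner: str
    rr_class: str
    rr_type: str


def validity_errors(origin: str, records: list[Record]) -> list[str]:
    """Return problems found by checks 1 and 2; an empty list means both passed."""
    errors = []

    # Check 1: all RRs in the file should have the same class.
    classes = {r.rr_class.upper() for r in records}
    if len(classes) > 1:
        errors.append(f"mixed classes: {sorted(classes)}")

    # Check 2: exactly one SOA RR should be present at the top of the zone.
    soa_owners = [r.owner.lower() for r in records if r.rr_type.upper() == "SOA"]
    if len(soa_owners) != 1:
        errors.append(f"expected exactly one SOA, found {len(soa_owners)}")
    elif soa_owners[0] != origin.lower():
        errors.append(f"SOA owner {soa_owners[0]} is not the zone origin {origin}")

    return errors
```

A loader following section 5.2 would refuse the whole zone whenever this
list is non-empty, rather than dropping the offending records.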

Those of us with long memories have seen all the error scenarios
reported here play out in real life because early versions of BIND
did just drop bad lines and continue on as "best effort".  We fixed
this behaviour over 2 decades ago now with no regrets other than we
didn't fix it sooner.

The above list of things to check is not exhaustive.  BIND checks much
more these days.

Mark
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org

___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

ISC funds the development of this software with paid support subscriptions. 
Contact us at https://www.isc.org/contact/ for more information.


bind-users mailing list

Re: How Zone Files Are Read

2020-12-16 Thread Reindl Harald




On 16.12.20 at 19:18, Tim Daneliuk wrote:
> On 12/16/20 11:36 AM, Reindl Harald wrote:
>> where did i give the advice "don't fail"?
>> please read my response again!
>>
>> * the zone fails on the master
>> * the zone is still available on the slaves
>> * so the error isn't fatal
>> * but you recognize your mistake
>>
>> what happens when the error is in the line of the MX record and named would
>> say "well, it's only one line, we still have the zone but no longer an MX"?
>>
>> it would lead to a *fatal error* for the behavior of the whole zone, even if
>> *all* of your nameservers go down it would be better because every delivering
>> MTA would just queue the messages in case of a SERVFAIL
>>
>> without the MX they would go to the A record of the zone which is in most
>> cases simply the wrong destination
>
> I agree that in a master-slave topology, your argument makes sense

sorry, i can't think of any network with only one nameserver given that 
DNS is one of the most important services

> In this case, the server was a singleton responsible for a small virtual
> private network within a much larger one. So, when the server failed to start,
> the client had NO DNS for that subnet.

don't get me wrong but that's how one learns the hard way to build basic 
redundancy for services one cares about, and if one doesn't care it's no 
problem if they fail

you have 3 options:

1: master/slave as recommended always
2: verify zone files before writing them
3: fix the software which generates broken zones

normally you choose all 3 in the sense of "and" instead of "or"
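
for option 3, a minimal sketch of what the generating software could do 
before it writes a PTR line. the function name and the label rule 
(letters, digits, hyphen) are assumptions for the example, not anything 
the tooling in question is known to use, and it assumes the bad space was 
in the target hostname:

```python
import re

# One DNS label under the conventional hostname rules: letters, digits, hyphen.
_LABEL = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def ptr_line(owner: str, target: str) -> str:
    """Return one PTR line, or raise ValueError instead of emitting a bad one."""
    name = target[:-1] if target.endswith(".") else target
    labels = name.split(".")
    if not name or len(name) > 253 or not all(_LABEL.fullmatch(l) for l in labels):
        raise ValueError(f"refusing to emit PTR with invalid target {target!r}")
    return f"{owner}\tIN\tPTR\t{name}."
```

a target like "ho st.example.com" raises here, so the broken line never 
reaches the zone file in the first place.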


Re: How Zone Files Are Read

2020-12-16 Thread Timothe Litt

On 16-Dec-20 13:52, Tim Daneliuk wrote:
> On 12/16/20 12:25 PM, Timothe Litt wrote:
>> On 16-Dec-20 11:37, Tim Daneliuk wrote:
>>> I ran into a situation yesterday which got me pondering something about 
>>> bind.
>>>
>>> In this case, a single line in a zone file was bad.  The devops automation
>>> had inserted a space in the hostname field of a PTR record.
>>>
>>> What was interesting was that - at startup - bind absolutely refused
>>> to load the zone file at all.  I would have expected it to complain
>>> about the bad record and ignore it, but load the rest of the
>>> good records.
>>>
>>> Can someone please explain the rationale or logic for this?  Not 
>>> complaining,
>>> just trying to understand for future reference.
>>>
>>> TIA,
>>> Tim
>> DNS is complicated.  The scope of an error in a zonefile is hard to 
>> determine.
>>
>> To avoid this, your automation should use named-checkzone before releasing a 
>> zone file.
>>
>> This will perform all the checks that named will when it is loaded.
>>
>
> Kind of what I thought.  Whoever built the environment in question
> really didn't understand DNS very well and hacked together a kludge
> that I am still trying to get my head around.
>
For a simple example of why it's complicated - what if the typo you had
was for a host that sends e-mail?

You'll see intermittent delivery errors when remote hosts can't resolve
the host's address; some require that a reverse lookup resolve to the
host as an anti-spoofing measure.  Others won't.  You'll spend a long
time diagnosing.

named can't tell this case from a typo for a local printer's PTR - where
it's unlikely that a reverse lookup failure will matter.  Of course,
this means it could go undetected for years - until it IS needed.

Or the typo is in a NS record - which you probably won't detect until
the other NS goes down...

And, any errors are cached for their TTL by resolvers.  The TTL may
(hopefully for query rate reduction) be large.  In your case, it would
be the negative TTL (meaning that even adding the record later wouldn't
have immediate effect).
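
To put a number on that: under RFC 2308 the negative-caching TTL is the
smaller of the SOA record's own TTL and its MINIMUM field.  A small sketch,
with made-up values (resolvers may also apply their own cap, which is not
modelled here):

```python
def negative_cache_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """Seconds a resolver may cache a 'no such name/record' answer (RFC 2308)."""
    return min(soa_ttl, soa_minimum)


# Hypothetical values: SOA record TTL of one day, MINIMUM field of one hour.
ttl = negative_cache_ttl(soa_ttl=86400, soa_minimum=3600)
print(f"a missing PTR may stay 'missing' in caches for up to {ttl} seconds")
```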

The bottom line is that named must assume that anything placed in a zone
file is important, and that the external impact - either sin of omission
or commission - might be large.

Thus, while named can't detect all (or even most) errors, those that it
does detect cause immediate failure to load.  That prevents caching and
propagation as well as getting human attention.

When something's wrong, it's best to stop and fix it.  Error recovery is
a very good thing - but only when you can demonstrate that the cure is
better than the disease.  Skipping format errors in a zone file would
not satisfy that constraint.
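
A toy sketch of that all-or-nothing policy, assuming a deliberately
simplified three-field record format.  The parser, the exception and the
function names are invented for illustration; this is not how named is
implemented:

```python
from __future__ import annotations


class ZoneLoadError(Exception):
    """Raised when any line of a candidate zone cannot be parsed."""


def parse_line(line: str) -> tuple[str, str, str]:
    """Toy parser: expects exactly 'owner TYPE rdata' separated by whitespace."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    return fields[0], fields[1], fields[2]


def load_zone(lines: list[str]) -> list[tuple[str, str, str]]:
    """Return every record, or raise ZoneLoadError if any single line is bad."""
    records = []
    errors = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines carry no record
        try:
            records.append(parse_line(line))
        except ValueError as exc:
            errors.append(f"line {number}: {exc}")
    if errors:
        # Suppress the whole load rather than return a partial zone.
        raise ZoneLoadError("; ".join(errors))
    return records


# The caller keeps serving the previous data when the new file is rejected.
served = [("10", "PTR", "old-host.example.com.")]
try:
    served = load_zone(["10 PTR host.example.com.", "11 PTR bad host.example.com."])
except ZoneLoadError as exc:
    print(f"zone not loaded, still serving previous data: {exc}")
```

The stray space in the second line turns three fields into four, so the
whole candidate is rejected and a human gets to look at it.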

Timothe Litt
ACM Distinguished Engineer
--
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed. 






Re: How Zone Files Are Read

2020-12-16 Thread Tim Daneliuk
On 12/16/20 12:25 PM, Timothe Litt wrote:
> On 16-Dec-20 11:37, Tim Daneliuk wrote:
>> I ran into a situation yesterday which got me pondering something about bind.
>>
>> In this case, a single line in a zone file was bad.  The devops automation
>> had inserted a space in the hostname field of a PTR record.
>>
>> What was interesting was that - at startup - bind absolutely refused
>> to load the zone file at all.  I would have expected it to complain
>> about the bad record and ignore it, but load the rest of the
>> good records.
>>
>> Can someone please explain the rationale or logic for this?  Not complaining,
>> just trying to understand for future reference.
>>
>> TIA,
>> Tim
> 
> DNS is complicated.  The scope of an error in a zonefile is hard to determine.
> 
> To avoid this, your automation should use named-checkzone before releasing a 
> zone file.
> 
> This will perform all the checks that named will when it is loaded.
> 


Kind of what I thought.  Whoever built the environment in question
really didn't understand DNS very well and hacked together a kludge
that I am still trying to get my head around.


-- 

Tim Daneliuk tun...@tundraware.com
PGP Key: http://www.tundraware.com/PGP/



Re: How Zone Files Are Read

2020-12-16 Thread Timothe Litt
On 16-Dec-20 11:37, Tim Daneliuk wrote:
> I ran into a situation yesterday which got me pondering something about bind.
>
> In this case, a single line in a zone file was bad.  The devops automation
> had inserted a space in the hostname field of a PTR record.
>
> What was interesting was that - at startup - bind absolutely refused
> to load the zone file at all.  I would have expected it to complain
> about the bad record and ignore it, but load the rest of the
> good records.
>
> Can someone please explain the rationale or logic for this?  Not complaining,
> just trying to understand for future reference.
>
> TIA,
> Tim

DNS is complicated.  The scope of an error in a zonefile is hard to
determine.

To avoid this, your automation should use named-checkzone before
releasing a zone file.

This will perform all the checks that named will when it is loaded.
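
A minimal sketch of such a gate, assuming the automation is written in
Python.  The function name, the file paths and the injectable checker
argument are illustrative; the only thing relied on is that
named-checkzone takes a zone name and a file name and exits non-zero when
the zone would not load:

```python
import os
import subprocess


def release_zone(zone_name, candidate, live, checker=("named-checkzone",)):
    """Install `candidate` as `live` only if the checker exits with status 0.

    `checker` is injectable so the gate can be exercised without BIND installed.
    """
    result = subprocess.run(
        [*checker, zone_name, str(candidate)],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Keep the previous live file and surface the checker's complaint.
        print(result.stdout + result.stderr)
        return False
    os.replace(candidate, live)  # atomic rename on the same filesystem
    return True
```

The automation would write the new zone to a temporary file, call
something like release_zone("example.com", "db.example.com.new",
"db.example.com"), and reload named only when it returns True.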

Timothe Litt
ACM Distinguished Engineer
--
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed. 





Re: How Zone Files Are Read

2020-12-16 Thread Tim Daneliuk
On 12/16/20 11:36 AM, Reindl Harald wrote:
> 
> 
> On 16.12.20 at 18:26, Gregory Sloop wrote:
>> This isn't, IMO, very useful as a response to the OP.
> 
> let the OP decide that
> 
>> To sum up the response; "It's better to never fail!"
>>
>> Yes, that seems pretty obvious. It *would* be better to never fail. Way, way 
>> better.
>> But the big problem in life is; We're always failing! Dammit!
>>
>> So, learning how to gracefully fail, and understanding what happens and why, 
>> when something fails, is pretty important to achieve the outcome of; "Not 
>> failing quite so catastrophically."
> 
> loading an invalid zone file is far away from "failing gracefully"! if a 
> computer doesn't understand the input fully it's not supposed to guess
> 
>> So, while I don't have helpful knowledge to impart to the OP, I think I can 
>> say that giving the advice of "don't fail" doesn't seem very helpful.
> 
> where did i give the advice "don't fail"?
> please read my response again!
> 
> * the zone fails on the master
> * the zone is still available on the slaves
> * so the error isn't fatal
> * but you recognize your mistake
> 
> what happens when the error is in the line of the MX record and named would 
> say "well, it's only one line, we still have the zone but no longer an MX"?
> 
> it would lead to a *fatal error* for the behavior of the whole zone, even if 
> *all* of your nameservers go down it would be better because every delivering 
> MTA would just queue the messages in case of a SERVFAIL
> 
> without the MX they would go to the A record of the zone which is in most 
> cases simply the wrong destination
> 

I agree that in a master-slave topology, your argument makes sense.
In this case, the server was a singleton responsible for a small virtual
private network within a much larger one. So, when the server failed to start,
the client had NO DNS for that subnet.



-- 

Tim Daneliuk tun...@tundraware.com
PGP Key: http://www.tundraware.com/PGP/



Re: How Zone Files Are Read

2020-12-16 Thread Reindl Harald




On 16.12.20 at 18:26, Gregory Sloop wrote:
> This isn't, IMO, very useful as a response to the OP.

let the OP decide that

> To sum up the response; "It's better to never fail!"
>
> Yes, that seems pretty obvious. It *would* be better to never fail. Way,
> way better.
>
> But the big problem in life is; We're always failing! Dammit!
>
> So, learning how to gracefully fail, and understanding what happens and
> why, when something fails, is pretty important to achieve the outcome
> of; "Not failing quite so catastrophically."

loading an invalid zone file is far away from "failing gracefully"! if a 
computer doesn't understand the input fully it's not supposed to guess

> So, while I don't have helpful knowledge to impart to the OP, I think I
> can say that giving the advice of "don't fail" doesn't seem very helpful.

where did i give the advice "don't fail"?
please read my response again!

* the zone fails on the master
* the zone is still available on the slaves
* so the error isn't fatal
* but you recognize your mistake

what happens when the error is in the line of the MX record and named 
would say "well, it's only one line, we still have the zone but no 
longer an MX"?

it would lead to a *fatal error* for the behavior of the whole zone, 
even if *all* of your nameservers go down it would be better because 
every delivering MTA would just queue the messages in case of a SERVFAIL

without the MX they would go to the A record of the zone which is in most 
cases simply the wrong destination
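
a simplified sketch of that difference from the sending MTA's point of 
view. the function and its string results are made up for illustration 
and ignore NXDOMAIN and multiple-MX retry logic; the fallback branch is 
the implicit MX rule of RFC 5321 section 5.1:

```python
from __future__ import annotations


def delivery_plan(domain: str, rcode: str, mx_hosts: list[str]) -> str:
    """Describe what a sending MTA does with the answer to an MX query.

    `mx_hosts` is assumed to be sorted by preference already.
    """
    if rcode == "SERVFAIL":
        # Temporary lookup failure: keep the message queued and retry later.
        return "queue and retry"
    if mx_hosts:
        return f"deliver to {mx_hosts[0]}"
    # Successful lookup with no MX: fall back to the domain's own address
    # records (implicit MX), which is often not a mail server at all.
    return f"deliver to address record of {domain}"


# Zone refused everywhere: mail waits in the sender's queue.
print(delivery_plan("example.com", "SERVFAIL", []))
# Zone loaded with the MX line silently dropped: mail goes to the wrong host.
print(delivery_plan("example.com", "NOERROR", []))
```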



> RH> On 16.12.20 at 17:37, Tim Daneliuk wrote:
> >> I ran into a situation yesterday which got me pondering something about bind.
> >>
> >> In this case, a single line in a zone file was bad.  The devops automation
> >> had inserted a space in the hostname field of a PTR record.
> >>
> >> What was interesting was that - at startup - bind absolutely refused
> >> to load the zone file at all.  I would have expected it to complain
> >> about the bad record and ignore it, but load the rest of the
> >> good records.
> >>
> >> Can someone please explain the rationale or logic for this?  Not complaining,
> >> just trying to understand for future reference.
>
> RH> it's better not to load an invalid zone on a single nameserver at all as you
> RH> are supposed to have at least two nameservers and the second one won't
> RH> get the failure via master/slave replication
>
> RH> if it has an error something is wrong
> RH> if the last version had no error that version is good
>
> RH> for the world *everything* still is good as long as there is one slave -
> RH> subtle errors can lead to completely unexpected behavior




Re: How Zone Files Are Read

2020-12-16 Thread Gregory Sloop
This isn't, IMO, very useful as a response to the OP.

To sum up the response; "It's better to never fail!"

Yes, that seems pretty obvious. It *would* be better to never fail. Way, way 
better.
But the big problem in life is; We're always failing! Dammit!

So, learning how to gracefully fail, and understanding what happens and why, 
when something fails, is pretty important to achieve the outcome of; "Not 
failing quite so catastrophically." 

So, while I don't have helpful knowledge to impart to the OP, I think I can say 
that giving the advice of "don't fail" doesn't seem very helpful.





RH> On 16.12.20 at 17:37, Tim Daneliuk wrote:
>> I ran into a situation yesterday which got me pondering something about bind.

>> In this case, a single line in a zone file was bad.  The devops automation
>> had inserted a space in the hostname field of a PTR record.

>> What was interesting was that - at startup - bind absolutely refused
>> to load the zone file at all.  I would have expected it to complain
>> about the bad record and ignore it, but load the rest of the
>> good records.

>> Can someone please explain the rationale or logic for this?  Not complaining,
>> just trying to understand for future reference.

RH> it's better not to load an invalid zone on a single nameserver at all as you
RH> are supposed to have at least two nameservers and the second one won't
RH> get the failure via master/slave replication

RH> if it has an error something is wrong
RH> if the last version had no error that version is good

RH> for the world *everything* still is good as long as there is one slave - 
RH> subtle errors can lead to completely unexpected behavior


Re: How Zone Files Are Read

2020-12-16 Thread Reindl Harald




On 16.12.20 at 17:37, Tim Daneliuk wrote:
> I ran into a situation yesterday which got me pondering something about bind.
>
> In this case, a single line in a zone file was bad.  The devops automation
> had inserted a space in the hostname field of a PTR record.
>
> What was interesting was that - at startup - bind absolutely refused
> to load the zone file at all.  I would have expected it to complain
> about the bad record and ignore it, but load the rest of the
> good records.
>
> Can someone please explain the rationale or logic for this?  Not complaining,
> just trying to understand for future reference.

it's better not to load an invalid zone on a single nameserver at all as you 
are supposed to have at least two nameservers and the second one won't 
get the failure via master/slave replication

if it has an error something is wrong
if the last version had no error that version is good

for the world *everything* still is good as long as there is one slave - 
subtle errors can lead to completely unexpected behavior



How Zone Files Are Read

2020-12-16 Thread Tim Daneliuk
I ran into a situation yesterday which got me pondering something about bind.

In this case, a single line in a zone file was bad.  The devops automation
had inserted a space in the hostname field of a PTR record.

What was interesting was that - at startup - bind absolutely refused
to load the zone file at all.  I would have expected it to complain
about the bad record and ignore it, but load the rest of the
good records.

Can someone please explain the rationale or logic for this?  Not complaining,
just trying to understand for future reference.

TIA,
Tim