Re: [DNSOP] I-D Action: draft-ietf-dnsop-extended-error-07.txt

2019-09-11 Thread Wes Hardaker
Loganaden Velvindron  writes:

> On Wed, Sep 11, 2019 at 7:42 AM Wes Hardaker  wrote:
> >
> > Loganaden Velvindron  writes:
> >
> > Hi Loganaden,
> >
> > Thanks for the comments about the EDE draft.  I've marked up your
> > comments with responses and actions below.  Let us know if you have any
> > questions.
> Hi Wes,
> 
> One small note: This reply was from John Todd from Quad9. I asked him to
> review the draft, and he sent me his comments which I then forwarded to the
> dnsop wg mailing list.

Got it, thanks.  And I added his name to the acknowledgements too.
-- 
Wes Hardaker
USC/ISI

___
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop


Re: [DNSOP] I-D Action: draft-ietf-dnsop-extended-error-07.txt

2019-09-11 Thread Loganaden Velvindron
On Wed, Sep 11, 2019 at 7:42 AM Wes Hardaker  wrote:
>
> Loganaden Velvindron  writes:
>
> Hi Loganaden,
>
> Thanks for the comments about the EDE draft.  I've marked up your
> comments with responses and actions below.  Let us know if you have any
> questions.
Hi Wes,

One small note: This reply was from John Todd from Quad9. I asked him to review
the draft, and he sent me his comments which I then forwarded to the
dnsop wg mailing list.


>
> 11 Loganaden Velvindron
> ==
>
> 11.1 NOCHANGE pass-through
> ~~
>
>   1) I see at least one more model that needs to be supported, which is
>   how to handle edns extended codes that are generated by a remote
>   server, i.e. passthrough. Layering multiple forwarding resolvers
>   behind each other is common, and some way to notify the end user that
>   the originating message was not generated by the first resolver would
>   be important.  I don't know if there needs to be some way to indicate
>   how "deep" the error was away from the end user; it seems just two
>   levels (locally generated or non-locally generated) would be
>   sufficient with only minor thought on it.
>
>   Re: 1) This is a good point, but implementation will likely run afoul
>   of existing standards or else require duplicative response codes or
>   use of an additional flag in the INFO-CODES section.  Perhaps a new
>   flag type, similar to AA, which can be used to say that this recursor
>   will return this result reliably/deterministically.  Attempting to
>   provide depth is perhaps unlikely, but flags for
>   stub/forwarder/recursive/intermediate recursive or a subset of those
>   might make sense.  Perhaps a non-descript flag such as 'DR' for
>   Deterministic Response.  Obviously INFO-CODES can support many
>   different flags, of which IR (Intermediate Resolver) or such could be
>   included at the point of response generation, with the last server
>   providing actual data in the chain being the one to authoritatively
>   set the flag, which then must not be modified by further downstream
>   resolvers in the process of returning the response.
>
>   + Response: this has been discussed a few times, and the current view
> (that at least I hold, and likely others based on past discussions)
> is that it would be best to get this out as is, without a
> pass-through model while we deploy it and get operational experience
> with its use.  Pass-through is complex for a bunch of reasons (NAT
> alone, eg), and it's unclear we can come up with a solution for all
> the likely corner cases to appear.
>
> TL;DR: we should definitely work on it, but in the future.
>
>
> 11.2 DONE network error code needed beyond timeout
> ~~
>
>   1) SERVFAIL needs another error code to indicate the difference
>   between a network error (unexpected network response like ICMP, or TCP
>   error such as connection refused) versus timeout of the remote auth
>   server, as that is often a confusing issue.
>
>   + Response: looks like a reasonable idea, so it has been added to the
> latest draft.  thank you!
>
>   Re: 2) Specifics as an item in the below list.
>
>
> 11.3 NOCHANGE
> ~~
>
>   1) Really, I'd like to see a definition of some of the EXTRA TEXT
>   strings here, since that will be almost immediately an issue that
>   would need to be sorted out before this could be useful. There have
>   been some discussions (sorry, don't know if it's a draft or just
>   talking) about browsers consuming "extra" data in DNS responses that
>   can do a number of things.  As an example that is important to Quad9
>   (or any blocking-based DNS service) it might be the case that upon
>   receiving a request for a "blocked" qname/qtype, we would hand back a
>   forged answer that leads to a splash page as the default result.
>   However, if the request was made from a resolver stack that had the
>   EDNS extensions, we might include the "real" result in the EXTRA TEXT
>   field, as well as a URL that points the user to an explanation of why
>   that particular qname/qtype was blocked.  Or we might add a risk
>   factor, or type of risk ("risk=100, risktype=phishing") or the like.
>   This allows a single query to be digestible by "dumb" stacks that we
>   want to have do the most safe thing, but also allow "smart" resolver
>   stacks to present a set of options to the end user.
>
>   + Again, I suspect that the complexity associated with standardizing
> on exactly a structure (including internationalization) of
> extra-information in a machine understandable and parsable mechanism
> is fraught with a very long discussion period.  It might be worthy
> of future work, and I certainly think it would be valuable, but
> (IMHO) it would be better to get this out and work on that as a
> follow-on project *if* we could achieve consensus on it (which, 

Re: [DNSOP] I-D Action: draft-ietf-dnsop-extended-error-07.txt

2019-09-10 Thread Wes Hardaker
Loganaden Velvindron  writes:

Hi Loganaden,

Thanks for the comments about the EDE draft.  I've marked up your
comments with responses and actions below.  Let us know if you have any
questions.

11 Loganaden Velvindron
==

11.1 NOCHANGE pass-through
~~

  1) I see at least one more model that needs to be supported, which is
  how to handle edns extended codes that are generated by a remote
  server, i.e. passthrough. Layering multiple forwarding resolvers
  behind each other is common, and some way to notify the end user that
  the originating message was not generated by the first resolver would
  be important.  I don't know if there needs to be some way to indicate
  how "deep" the error was away from the end user; it seems just two
  levels (locally generated or non-locally generated) would be
  sufficient with only minor thought on it.

  Re: 1) This is a good point, but implementation will likely run afoul
  of existing standards or else require duplicative response codes or
  use of an additional flag in the INFO-CODES section.  Perhaps a new
  flag type, similar to AA, which can be used to say that this recursor
  will return this result reliably/deterministically.  Attempting to
  provide depth is perhaps unlikely, but flags for
  stub/forwarder/recursive/intermediate recursive or a subset of those
  might make sense.  Perhaps a non-descript flag such as 'DR' for
  Deterministic Response.  Obviously INFO-CODES can support many
  different flags, of which IR (Intermediate Resolver) or such could be
  included at the point of response generation, with the last server
  providing actual data in the chain being the one to authoritatively
  set the flag, which then must not be modified by further downstream
  resolvers in the process of returning the response.

  + Response: this has been discussed a few times, and the current view
    (that at least I hold, and likely others based on past discussions)
    is that it would be best to get this out as is, without a
    pass-through model, while we deploy it and get operational experience
    with its use.  Pass-through is complex for a bunch of reasons (NAT
    alone, e.g.), and it's unclear we can come up with a solution for all
    the corner cases that are likely to appear.

TL;DR: we should definitely work on it, but in the future.


11.2 DONE network error code needed beyond timeout
~~

  1) SERVFAIL needs another error code to indicate the difference
  between a network error (unexpected network response like ICMP, or TCP
  error such as connection refused) versus timeout of the remote auth
  server, as that is often a confusing issue.

  + Response: looks like a reasonable idea, so it has been added to the
    latest draft.  Thank you!

  Re: 2) Specifics as an item in the below list.
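
To illustrate the distinction requested in 11.2, here is a minimal sketch of
how a resolver might separate a timeout from an active network error when a
query to an authoritative server fails.  The function name and the mapping are
assumptions made for illustration only.  The numeric values are the ones
registered in the published RFC 8914 ("No Reachable Authority" and "Network
Error"); the numbering in draft -07 may differ, and treating a timeout as
"No Reachable Authority" is a choice made here, not something the thread
specifies.

```python
import socket

# INFO-CODE values as registered in the published RFC 8914.
# Assumption: draft -07 may have used different numbers.
NO_REACHABLE_AUTHORITY = 22
NETWORK_ERROR = 23


def classify_upstream_failure(exc: Exception) -> int:
    """Map an upstream query failure to an EDE INFO-CODE (hypothetical helper)."""
    # Test for timeouts first: timeout exceptions are also OSError subclasses.
    if isinstance(exc, (socket.timeout, TimeoutError)):
        # Assumed mapping: the remote server simply never answered.
        return NO_REACHABLE_AUTHORITY
    if isinstance(exc, OSError):
        # Covers ConnectionRefusedError, host/network unreachable, and similar.
        return NETWORK_ERROR
    # Anything else is not a transport failure; let the caller handle it.
    raise exc
```

A caller would wrap the upstream socket operation in try/except and pass the
caught exception to this helper when building the SERVFAIL response.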


11.3 NOCHANGE 
~~

  1) Really, I'd like to see a definition of some of the EXTRA TEXT
  strings here, since that will be almost immediately an issue that
  would need to be sorted out before this could be useful. There have
  been some discussions (sorry, don't know if it's a draft or just
  talking) about browsers consuming "extra" data in DNS responses that
  can do a number of things.  As an example that is important to Quad9
  (or any blocking-based DNS service) it might be the case that upon
  receiving a request for a "blocked" qname/qtype, we would hand back a
  forged answer that leads to a splash page as the default result.
  However, if the request was made from a resolver stack that had the
  EDNS extensions, we might include the "real" result in the EXTRA TEXT
  field, as well as a URL that points the user to an explanation of why
  that particular qname/qtype was blocked.  Or we might add a risk
  factor, or type of risk ("risk=100, risktype=phishing") or the like.
  This allows a single query to be digestible by "dumb" stacks that we
  want to have do the most safe thing, but also allow "smart" resolver
  stacks to present a set of options to the end user.

  + Again, I suspect that standardizing on an exact structure
    (including internationalization) for extra information that is
    machine-understandable and parsable would involve a very long
    discussion period.  It might be worthy of future work, and I
    certainly think it would be valuable, but (IMHO) it would be better
    to get this out and work on that as a follow-on project *if* we
    could achieve consensus on it (which, I'll be honest, will be
    either difficult or take a long time or both).

  Re: 3) Seems reasonable.
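
As a rough illustration of what a "smart" stack would have to do with the
example string quoted in 11.3, the sketch below parses comma-separated
key=value pairs out of an EXTRA TEXT value.  The draft defines no such
structure; the format is taken only from the "risk=100, risktype=phishing"
example in the comment, and the function name is hypothetical.

```python
def parse_extra_text(extra_text: str) -> dict:
    """Parse 'key=value, key=value' pairs from EXTRA TEXT (assumed format)."""
    fields = {}
    for part in extra_text.split(","):
        key, sep, value = part.partition("=")
        # Skip fragments that are not key=value pairs (e.g. free-form text).
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


info = parse_extra_text("risk=100, risktype=phishing")
```

Note that a value containing a comma (a URL with a query string, for example)
would already break this naive parser, which is one small instance of the
structure problem raised in the response above.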


11.4 NOCHANGE blocked/censored/retry
~~

  1) I'm confused as to why a "blocked" or "censored" result would have
  a retry as mandatory.  The resolver gave a canonical answer from the
  point of policy.

  + the retry flag is now gone.

  Re: 4) See below notes.

  

Re: [DNSOP] I-D Action: draft-ietf-dnsop-extended-error-07.txt

2019-08-26 Thread Shane Kerr

Loganaden & all,

On 10/08/2019 07.37, Loganaden Velvindron wrote:

> On Sat, Aug 10, 2019 at 9:14 AM  wrote:
> >
> > A New Internet-Draft is available from the on-line Internet-Drafts directories.
> > This draft is a work item of the Domain Name System Operations WG of the IETF.
> >
> > Title   : Extended DNS Errors
> > Authors : Warren Kumari
> >   Evan Hunt
> >   Roy Arends
> >   Wes Hardaker
> >   David C Lawrence
> > Filename: draft-ietf-dnsop-extended-error-07.txt
> > Pages   : 13
> > Date: 2019-08-09
> >
> > Abstract:
> > This document defines an extensible method to return additional
> > information about the cause of DNS errors.  Though created primarily
> > to extend SERVFAIL to provide additional information about the cause
> > of DNS and DNSSEC failures, the Extended DNS Errors option defined in
> > this document allows all response types to contain extended error
> > information.
> >
>
> I went to talk to quad9. Here is the reply they sent.
>
> Fwd:
>
> 1) I see at least one more model that needs to be supported, which is
> how to handle edns extended codes that are generated by a remote
> server, i.e. passthrough. Layering multiple forwarding resolvers
> behind each other is common, and some way to notify the end user that
> the originating message was not generated by the first resolver would
> be important.  I don't know if there needs to be some way to indicate
> how "deep" the error was away from the end user; it seems just two
> levels (locally generated or non-locally generated) would be
> sufficient with only minor thought on it.
>
> Re: 1) This is a good point, but implementation will likely run afoul
> of existing standards or else require duplicative response codes or
> use of an additional flag in the INFO-CODES section.
> Perhaps a new flag type, similar to AA, which can be used to say that
> this recursor will return this result reliably/deterministically.
> Attempting to provide depth is perhaps unlikely, but flags for
> stub/forwarder/recursive/intermediate recursive or a subset of those
> might make sense.
> Perhaps a non-descript flag such as 'DR' for Deterministic Response.
> Obviously INFO-CODES can support many different flags, of which IR
> (Intermediate Resolver) or such could be included
> at the point of response generation, with the last server providing
> actual data in the chain being the one to authoritatively set the
> flag, which then must not be modified by further
> downstream resolvers in the process of returning the response.


Yeah. In principle EDNS0 is hop-by-hop, so getting more information like 
this doesn't really fit in the protocol.


Maybe EDNS1 should include information indicating which hop is 
responsible for any particular bit of EDNS? 



> 3) Really, I'd like to see a definition of some of the EXTRA TEXT
> strings here, since that will be almost immediately an issue that
> would need to be sorted out before this could be useful. There have
> been some discussions (sorry, don't know if it's a draft or just
> talking) about browsers consuming "extra" data in DNS responses that
> can do a number of things.  As an example that is important to Quad9
> (or any blocking-based DNS service) it might be the case that upon
> receiving a request for a "blocked" qname/qtype, we would hand back a
> forged answer that leads to a splash page as the default result.
> However, if the request was made from a resolver stack that had the
> EDNS extensions, we might include the "real" result in the EXTRA TEXT
> field, as well as a URL that points the user to an explanation of why
> that particular qname/qtype was blocked.  Or we might add a risk
> factor, or type of risk ("risk=100, risktype=phishing") or the like.
> This allows a single query to be digestible by "dumb" stacks that we
> want to have do the most safe thing, but also allow "smart" resolver
> stacks to present a set of options to the end user.
>
> Re: 3) Seems reasonable.


I think this sort of structured information can and probably should be 
included today using the private EDNS option space. Rather than 
embedding parsable information in the EDE equivalent of a TXT record, 
why not just add a new option with this data, which can be defined and 
structured?
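
A minimal sketch of the idea above: carrying the block details in a separate,
structured EDNS option instead of in free-form text.  The outer framing
(16-bit OPTION-CODE, 16-bit OPTION-LENGTH, then OPTION-DATA) follows RFC 6891,
and 65001 falls in the range that RFC reserves for local/experimental use.
Everything else is an assumption made for illustration: the choice of 65001,
the payload layout (one risk byte, a length-prefixed risk type, then a URL),
and all function names.

```python
import struct

# Hypothetical pick from the RFC 6891 local/experimental range (65001-65534).
LOCAL_OPTION_CODE = 65001


def encode_option(code: int, data: bytes) -> bytes:
    """Frame OPTION-DATA as an EDNS option: code, length, data (RFC 6891)."""
    return struct.pack("!HH", code, len(data)) + data


def encode_block_info(risk: int, risk_type: str, url: str) -> bytes:
    """Encode an assumed payload: risk (1 byte), risk type (length-prefixed), URL."""
    rt = risk_type.encode("ascii")
    payload = struct.pack("!BB", risk, len(rt)) + rt + url.encode("utf-8")
    return encode_option(LOCAL_OPTION_CODE, payload)


def decode_block_info(wire: bytes):
    """Reverse encode_block_info; returns (code, risk, risk_type, url)."""
    code, length = struct.unpack("!HH", wire[:4])
    data = wire[4:4 + length]
    risk, rt_len = data[0], data[1]
    risk_type = data[2:2 + rt_len].decode("ascii")
    url = data[2 + rt_len:].decode("utf-8")
    return code, risk, risk_type, url


wire = encode_block_info(100, "phishing", "https://example.net/why")
decoded = decode_block_info(wire)
```

Because each field has a fixed position and type, the receiver does not need
to guess at delimiters or escaping the way it would with text embedded in the
EDE option.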


Cheers,

--
Shane



Re: [DNSOP] I-D Action: draft-ietf-dnsop-extended-error-07.txt

2019-08-09 Thread Loganaden Velvindron
On Sat, Aug 10, 2019 at 9:14 AM  wrote:
>
>
> A New Internet-Draft is available from the on-line Internet-Drafts directories.
> This draft is a work item of the Domain Name System Operations WG of the IETF.
>
> Title   : Extended DNS Errors
> Authors : Warren Kumari
>   Evan Hunt
>   Roy Arends
>   Wes Hardaker
>   David C Lawrence
> Filename: draft-ietf-dnsop-extended-error-07.txt
> Pages   : 13
> Date: 2019-08-09
>
> Abstract:
> This document defines an extensible method to return additional
> information about the cause of DNS errors.  Though created primarily
> to extend SERVFAIL to provide additional information about the cause
> of DNS and DNSSEC failures, the Extended DNS Errors option defined in
> this document allows all response types to contain extended error
> information.
>
>
I went to talk to quad9. Here is the reply they sent.

Fwd:

1) I see at least one more model that needs to be supported, which is
how to handle edns extended codes that are generated by a remote
server, i.e. passthrough. Layering multiple forwarding resolvers
behind each other is common, and some way to notify the end user that
the originating message was not generated by the first resolver would
be important.  I don't know if there needs to be some way to indicate
how "deep" the error was away from the end user; it seems just two
levels (locally generated or non-locally generated) would be
sufficient with only minor thought on it.


Re: 1) This is a good point, but implementation will likely run afoul
of existing standards or else require duplicative response codes or
use of an additional flag in the INFO-CODES section.
Perhaps a new flag type, similar to AA, which can be used to say that
this recursor will return this result reliably/deterministically.
Attempting to provide depth is perhaps unlikely, but flags for
stub/forwarder/recursive/intermediate recursive or a subset of those
might make sense.
Perhaps a non-descript flag such as 'DR' for Deterministic Response.
Obviously INFO-CODES can support many different flags, of which IR
(Intermediate Resolver) or such could be included
at the point of response generation, with the last server providing
actual data in the chain being the one to authoritatively set the
flag, which then must not be modified by further
downstream resolvers in the process of returning the response.

2) SERVFAIL needs another error code to indicate the difference
between a network error (unexpected network response like ICMP, or TCP
error such as connection refused) versus timeout of the remote auth
server, as that is often a confusing issue.

Re: 2)  Specifics as an item in the below list.

3) Really, I'd like to see a definition of some of the EXTRA TEXT
strings here, since that will be almost immediately an issue that
would need to be sorted out before this could be useful. There have
been some discussions (sorry, don't know if it's a draft or just
talking) about browsers consuming "extra" data in DNS responses that
can do a number of things.  As an example that is important to Quad9
(or any blocking-based DNS service) it might be the case that upon
receiving a request for a "blocked" qname/qtype, we would hand back a
forged answer that leads to a splash page as the default result.
However, if the request was made from a resolver stack that had the
EDNS extensions, we might include the "real" result in the EXTRA TEXT
field, as well as a URL that points the user to an explanation of why
that particular qname/qtype was blocked.  Or we might add a risk
factor, or type of risk ("risk=100, risktype=phishing")  or the like.
This allows a single query to be digestible by "dumb" stacks that we
want to have do the most safe thing, but also allow "smart" resolver
stacks to present a set of options to the end user.

Re: 3) Seems reasonable.

4) I'm confused as to why a "blocked" or "censored" result would have
a retry as mandatory.   The resolver gave a canonical answer from the
point of policy.

Re: 4) See below notes.

Potential inclusions/Adjustments:

4.1.3.1: A use case exists where a stale answer should attempt a
retry. A declarative setting for the Retry bit should not be specified
here, but instead guidance on whether or not the R bit should be set
should be included. For example, when using a front-end load balancer,
if the recursive backends are temporarily inaccessible but are
expected to recover in time to handle a subsequent query, it would be
prudent to include the R bit. No additional load would be generated
towards the Authoritatives in this case, and the Intermediate Recursor
may choose to set the R bit or not based on whether the failure mode
appears to be temporary.

4.1.5: Another area where guidance should be provided. Some recursive
resolvers process requests out of order,