--On Saturday, January 28, 2023 15:03 -0800 Rob Sayre
<[email protected]> wrote:

> On Sat, Jan 28, 2023 at 2:26 PM John C Klensin
> <[email protected]> wrote:
> 
>> 
>> That is clearly kicking the can down the road, but, until the
>> now-closed I18n Directorate comes up with carefully developed
>> recommendations about how to use IDNs when either standard
>> might be assumed
>> 
> 
> The biggest value any internet standards organization provides
> is a global namespace. We have to get that part right. I agree
> that the discussion can be tiresome, but saying "IDNA2008" is
> the standard is not correct. We have to document the current
> practice, but that doesn't mean there was anything wrong with
> IDNA2008 at the time.

While I know many active participants in the IETF (maybe even
whole Areas) who would dispute your "biggest value" assertion,
let us, for the moment, put that aside and examine what I believe
is your claim that UTS#46 is the current practice and see where
that leads us.

Much of UTS#46 is written descriptively rather than normatively.
It allows many choices.   In particular:

*  While it has a "Conformance Testing" section (Section 8),
that is just a collection of test cases.  Section 8.2 specifies
use of toAsciiN and toAsciiT, operations that are not defined in
any of UTS#46, IDNA2003, or IDNA2008.  

* While those tests do not cover it, Section 3.1 defines, and
Section 4.1.1 allows use with, "UseSTD3ASCIIRules" set either
"true" or "false".  The tests in Section 8 check only the former
because, with the latter, "validity tests" are
implementation-dependent.  In particular, Section 3.1 says "some
browsers also allow characters such as U+005F ( _ ) LOW LINE
(underbar) in domain names, and thus use
UseSTD3ASCIIRules=false, plus their own validity checks for the
other ASCII characters".

* Section 4.1 requires a choice between "Transitional" and
"Nontransitional" processing.    It also requires the use of
discretionary flags (CheckHyphens, CheckJoiners, and CheckBidi)
which can be set either way and that produce different sets of
allowed labels.
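
To make the size of that choice space concrete, here is a minimal
sketch (mine, not anything taken from UTS#46) that simply enumerates
the combinations of the switches named in the three points above.
The option names come from the sections cited; the dictionary layout
and the function name are assumptions made for illustration only.

```python
from itertools import product

# Switches named above (UTS#46 Sections 3.1 and 4.1). The structure of
# this table is illustrative, not an API defined by any specification.
OPTIONS = {
    "UseSTD3ASCIIRules": (True, False),
    "Processing": ("Transitional", "Nontransitional"),
    "CheckHyphens": (True, False),
    "CheckJoiners": (True, False),
    "CheckBidi": (True, False),
}


def all_configurations():
    """Return every combination of the option values as a list of dicts."""
    names = list(OPTIONS)
    return [dict(zip(names, values)) for values in product(*OPTIONS.values())]


configs = all_configurations()
print(len(configs))  # 2**5 = 32
```

Thirty-two is only a lower bound on the number of distinct
behaviours: with UseSTD3ASCIIRules=false, each implementation may add
its own validity checks for the other ASCII characters, and those are
not captured by any flag.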

In addition...

* Section 4.1.1 also gives a definition in terms of
"[\u002Da-zA-Z0-9]", a notation I cannot find explained in
either that document or in Appendix A ("Notational Conventions")
at least through Unicode 14.0.  Probably it denotes a character
class of some sort, but it does not conform to any of the forms
listed there.  

* The foundation for UseSTD3ASCIIRules is a reference to STD 3,
i.e., RFC 1034, without citing any particular section there.
While my assumption is that it is referring to the "Preferred
name syntax" in Section 3.5 of RFC 1034, STD 3 itself allows
binary labels, i.e., labels whose octets can contain any value.
draft-ietf-uta-rfc6125bis-10 has this right, but UTS#46 does not.

* Section 7 is not easy to read and some of its suggestions
depend on "how registries handle IDNA2008 going forward, and how
soon IDNA2003 clients are retired".  While I have no idea about
how many "IDNA2003 clients" are left, the registry situation
falls into four categories today:

(i) The root, i.e., IDN labels for top-level domains.  These are
strictly conformant to IDNA2008.  Many (probably most) are
conformant to the Maximal Standing Repertoire for the root,
which is a small subset of what is allowed by IDNA2008.

(ii) Registries containing labels at the second level allocated
by what ICANN calls "Contracted parties" and other registries
who follow the rules.  "The rules" in that case are IDNA2008 or
some subset of it, including subsets associated with the MSR.

(iii) Registries containing labels at the second level that do
not conform to ICANN rules and recommendations and hence not to
IDNA2008 either.  I don't know, but I would lay good odds that,
if they conform to UTS#46, it is a happy accident: AFAICT, they
are following only the famous rule sometimes expressed as "Do
what thou wilt".

(iv) Registries for labels below the second level: anyone's
guess.  Where there is a difference (and, again, there often is
not), I would guess more of them are guided by IDNA2008 than by
either IDNA2003 or UTS#46.
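
On the "[\u002Da-zA-Z0-9]" notation mentioned above: if it is read as
a regular-expression-style character class (my assumption; as noted,
the document does not say so), it would denote hyphen-minus, ASCII
letters, and ASCII digits.  The sketch below shows that reading in
Python.  The function name is hypothetical, and the check covers only
the character repertoire, not hyphen placement or label length.

```python
import re

# Assumption: the notation is a regex-style character class.
# Python's re module expands \u002D to "-" inside the class.
LDH_CLASS = re.compile(r"[\u002Da-zA-Z0-9]+")


def only_ldh(label: str) -> bool:
    """True if every character of a non-empty label is a letter, digit, or hyphen."""
    return LDH_CLASS.fullmatch(label) is not None


print(only_ldh("xn--bcher-kva"))  # True
print(only_ldh("_dmarc"))         # False: U+005F LOW LINE is outside the class
print(only_ldh("a\x00b"))         # False: arbitrary octets are outside the class
```

The second and third cases are the ones that matter here: labels of
that kind are outside the letter-digit-hyphen repertoire even though
STD 3 itself permits arbitrary octets in labels.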

Now the key question:  If you would like the document to assert
that UTS#46 is "the current practice", are you prepared to claim
(demonstrate?) that every single implementation that is based on
UTS#46 (or even that claims conformance to it) has made the same
decisions about each of the permitted variables and variations
described above (and probably others -- I don't believe that
list is comprehensive)?  If the answer to that one is "no", then,
in order to make strong statements about common practice, I
think you need to start identifying specific implementations,
determining what applications are using each one, and then start
counting users or usages to determine actual common practice.  

If that (IMO rathole) is not appealing, I think reality --the
same reality you appeal to in not wanting to cite IDNA2008 as
definitive-- dictates abandoning, at least for the present, the
idea of a global namespace in which all actors follow the same
rules.  Other than trying to standardize wishful thinking (which
I don't believe the AD and IETF will let you get away with, but
I could be wrong), I think that leaves two choices.  The first
is to do more or less what I suggested, which is to describe the
current situation, messy as it clearly is, to whatever degree of
accuracy and precision seems appropriate, and move on.  The
other is to conclude that UTA cannot move further forward with
this document unless there is, in practice, a global namespace
for which everyone is following the same rules and drop the
document itself.

I hope the WG has as little enthusiasm for the latter as I do.

In addition, there is another problem with
draft-ietf-uta-rfc6125bis-10 if your intention is to replace the
reference to IDNA2008 with one to UTS#46.  As I read it, it is
heavily dependent on PKIX as defined in RFC 5280.   That
document is defined in terms of IDNA2003, which is definitely
not current practice (or, except in its embedding in UTS#46,
anything but obsolete).  That was fixed by updates to 5280 in
RFCs 8398 and 8399, which are not referenced at all in the
present document.  That should be a simple fix, which I
recommend to the authors.    However, with or without those
updates to 5280, draft-ietf-uta-rfc6125bis-10 appears to be
solidly tied to service identities in PKIX certificates (see,
e.g., the first sentence of Section 1.4.1).  That means IDNA2008
(I hope) or maybe IDNA2003 but certainly not a specification
inconsistent with them that PKIX has not been updated to
recognize.  That also suggests that, if you have a problem with
both the current reference to IDNA2008 and my attempt to find a
way to recognize UTS#46 and work around the problem, you should
be having a discussion with the Security Area about a revision
to PKIX and its normative references, not trying to "fix" the
perceived problem by fussing with the current document.

    john



 Do you know which one is "current practice"?  I don't, but
suggest it is some of each.  In any event, citing UTS#46 does
not answer the question.
> 
> thanks,
> Rob


