Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-11 Thread Andrew Sullivan
On Tue, Mar 10, 2009 at 10:27:21AM +0100, Stephane Bortzmeyer wrote:

 recollection of one specific person. The alphabetic-only rule in RFC
 1123 is just a side note, never detailed, and presented as a fact
 (which it was at this time), not as a mandatory restriction.

I don't know whether I agree that it's just a side note.  It seems
to be a clarifying discussion used to explain why an innovation is
safe.  As we have seen in the current discussion, there is possibly
more than one interpretation of that safety.  I think that's what we
have to consider.

 There are no *TECHNICAL* reasons to limit TLD to alphabetic
 characters.

I think this is what's up for dispute.  If people have interpreted the
text in 1123 as normative and built resolvers using the logic there,
then that is a technical reason to limit TLD characters.  Even if we
think those resolvers were mistaken in their implementation, they're
deployed.  Interoperation is one of our more important values, and
that includes interoperation with reasonable interpretations of RFCs
that we nevertheless think are mistaken.

Best,

A
-- 
Andrew Sullivan
a...@shinkuro.com
Shinkuro, Inc.
___
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-11 Thread James Seng
By the same logic, the whole of IDN would be pointless because RFC 1035
restricts labels to alphabetic letters only.

IDNA transforms IDN labels into Punycode so that they become transparent
to the resolvers that made those assumptions.

-James Seng

 I think this is what's up for dispute.  If people have interpreted the
 text in 1123 as normative and built resolvers using the logic there,
 then that is a technical reason to limit TLD characters.  Even if we
 think those resolvers were mistaken in their implementation, they're
 deployed.  Interoperation is one of our more important values, and
 that includes interoperation with reasonable interpretations of RFCs
 that we nevertheless think are mistaken.

 Best,

 A
 --
 Andrew Sullivan
 a...@shinkuro.com
 Shinkuro, Inc.


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-11 Thread Stephane Bortzmeyer
On Wed, Mar 11, 2009 at 10:56:10PM +0800,
 James Seng ja...@seng.sg wrote 
 a message of 4 lines which said:

 By the same logic, the whole of IDN would be pointless because RFC
 1035 restricts labels to alphabetic letters only.

I assume you're playing the devil's advocate? Because I believe that
all dnsop members know that the reason why we need IDN is not the
inability of the DNS to handle 8-bit characters...


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-11 Thread Andrew Sullivan
On Wed, Mar 11, 2009 at 10:56:10PM +0800, James Seng wrote:
 By the same logic, the whole of IDN would be pointless because RFC 1035
 restricts labels to alphabetic letters only.

I'd like the reference to where 1035 says that, please.  In
particular, the following passage in §3.1 of RFC 1035 seems to say
something different:

  Although labels can contain any 8 bit values in octets that
  make up a label, it is strongly recommended that labels
  follow the preferred syntax described elsewhere in this
  memo, which is compatible with existing host naming
  conventions.

In addition,
 
 IDNA transforms IDN labels into Punycode so that they become transparent
 to the resolvers that made those assumptions.

you seem to be making my argument for me.  The reason IDNA is
preferable to some of the alternatives is that some resolver software
indeed understood 1034 and 1035 to mean that the preferred syntax
ought to be enforced (in what seems to me a plain violation of those
RFCs).  We have to live with those widely-deployed resolvers, and
therefore we need to design other protocols as though the additional
restrictions that are _not_ part of the DNS protocol are in fact part
of it.  Designing the protocols for the actually existing conditions
in the network is what makes the design activity engineering rather
than research, I think.
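
To make the "any 8 bit values" point concrete, here is a minimal Python sketch of how a name is laid out on the wire as RFC 1035 describes it: each label is a length octet followed by that many arbitrary octets, and the name ends with the zero-length root label. The function name `encode_name` is hypothetical and not taken from any resolver; the sketch also ignores the 255-octet limit on the whole name and name compression.

```python
def encode_name(labels):
    """Encode a sequence of labels (bytes objects) in DNS wire format.

    Each label becomes one length octet followed by 1..63 octets of
    arbitrary content; the name is terminated by the root label (0).
    """
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("a label must be 1..63 octets long")
        out.append(len(label))
        out += label
    out.append(0)
    return bytes(out)


# A conventional host name and a label holding raw UTF-8 octets.
print(encode_name([b"www", b"example", b"com"]))
print(encode_name(["日本".encode("utf-8")]))
```

Nothing in this encoding step cares whether the octets are letters; any restriction to the preferred syntax has to come from a separate check layered on top.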

A
-- 
Andrew Sullivan
a...@shinkuro.com
Shinkuro, Inc.


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-11 Thread James Seng
On Wed, Mar 11, 2009 at 11:36 PM, Andrew Sullivan a...@shinkuro.com wrote:
 On Wed, Mar 11, 2009 at 10:56:10PM +0800, James Seng wrote:
 By the same logic, the whole of IDN would be pointless because RFC 1035
 restricts labels to alphabetic letters only.

 I'd like the reference to where 1035 says that, please.  In
 particular, the following passage in §3.1 of RFC 1035 seems to say
 something different:


label ::= letter [ [ ldh-str ] let-dig ]

...

letter ::= any one of the 52 alphabetic characters A through Z in
upper case and a through z in lower case
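
As an illustration only, the grammar quoted above can be written as a regular expression. This is a sketch of the "preferred syntax" from RFC 1035 (a letter first, letters, digits or hyphens in the middle, a letter or digit last, at most 63 characters); the name `is_preferred_label` is hypothetical.

```python
import re

# label ::= letter [ [ ldh-str ] let-dig ]
_PREFERRED = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_preferred_label(label):
    """True if label matches the RFC 1035 preferred label syntax."""
    return len(label) <= 63 and _PREFERRED.fullmatch(label) is not None


for candidate in ("shinkuro", "com", "xn--wgv71a", "3com", "666", "a-"):
    print(candidate, is_preferred_label(candidate))
```

Note that this check says nothing about what the DNS protocol can carry; it only tests conformance to the recommended syntax.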

 you seem to be making my argument for me.  The reason IDNA is
 preferable to some of the alternatives is that some resolver software
 indeed understood 1034 and 1035 to mean that the preferred syntax
 ought to be enforced (in what seems to me a plain violation of those
 RFCs).  We have to live with those widely-deployed resolvers, and
 therefore we need to design other protocols as though the additional
 restrictions that are _not_ part of the DNS protocol are in fact part
 of it.  Designing the protocols for the actually existing conditions
 in the network is what makes the design activity engineering rather
 than research, I think.

Precisely. Punycode was selected instead of UTF-8 because of widely
deployed implementations, even though in theory the DNS should be 8-bit
clean.

My point is that the RFC 1123 statement on the alphabetic requirement

a) is highly debatable, because it is not an explicit requirement: it
is mentioned in passing, in a section called DISCUSSION, that "since
at least the highest-level component label will be alphabetic", in
the context that TLDs were alphabetic-only as a matter of fact at that
time, not as a matter of technical requirement

b) even if it is an explicit requirement, should be taken in context
and in spirit, in the same way as RFC 1035 forbids non-alphabetic
characters in labels.

-James Seng


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-11 Thread Andrew Sullivan
On Wed, Mar 11, 2009 at 11:44:54PM +0800, James Seng wrote:
 
 
 label ::= letter [ [ ldh-str ] let-dig ]
 
 ...
 
 letter ::= any one of the 52 alphabetic characters A through Z in
 upper case and a through z in lower case

Selective quoting can prove anything.  Immediately prior to that
section, RFC 1035 says

  The following syntax will result in fewer problems with many
  applications that use domain names (e.g., mail, TELNET).

 a) is highly debatable, because it is not an explicit requirement: it
 is mentioned in passing, in a section called DISCUSSION, that "since
 at least the highest-level component label will be alphabetic", in
 the context that TLDs were alphabetic-only as a matter of fact at that
 time, not as a matter of technical requirement

I just responded to that exact argument up-thread, but since that
wasn't apparently convincing, let's do it in more detail.

The beginning of 2.1 relaxes a requirement of RFC 952 that host names
may never start with a digit.  1123 says that host software MUST
support the more liberal syntax.  Moreover, the host SHOULD check a
candidate string syntactically for a dotted-decimal number before
looking it up in the Domain Name System.

As Mark Andrews has argued elsewhere on this list, the single label
666 could be interpreted as an IP address.  Various hex
representations may also be interpreted as an IP address.  These may
therefore pass the check for being a dotted-decimal number.

The DISCUSSION portion of 2.1 is explaining why relaxing RFC 952's
restriction is safe.  The safety flows exclusively from the premise
that the highest-level component label of a domain name will be
alphabetic; this guarantees that a syntactic check for an IP address
will fail due to at least one label being made up only of letters.  It
may be, therefore, that the alphabetic restriction is in fact
policy, and is not strictly a protocol issue.  The problem is that it
is policy on which other technical decisions rest.  Change the policy,
and the justification for those other technical decisions is
undermined.  In this sense, the claim in the DISCUSSION portion of 2.1
is not just a policy: it is also the foundation of other protocol
issues, and is therefore normative on the protocol even if it _is_ a
policy matter.
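
The dependency described above can be sketched in a few lines of Python. The helper `looks_like_ipv4_literal` is hypothetical and purely syntactic: it is loosely modelled on the lenient forms that classic address parsers are generally understood to accept (one to four dot-separated parts, each decimal, octal, or 0x-prefixed hex), and it does no range checking. `top_label_is_alphabetic` is likewise an illustrative name.

```python
import re

# One numeric part: 0x-prefixed hex, or a run of digits (decimal/octal).
_PART = re.compile(r"0[xX][0-9A-Fa-f]+|[0-9]+")


def looks_like_ipv4_literal(text):
    """Syntactic check only: could text be read as an IPv4 address?"""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        return False
    return all(_PART.fullmatch(part) is not None for part in parts)


def top_label_is_alphabetic(name):
    """True if the rightmost label consists only of ASCII letters."""
    top = name.rstrip(".").split(".")[-1]
    return top.isascii() and top.isalpha()


for candidate in ("192.0.2.1", "666", "0x7f.1", "example.com", "1.2.3.com"):
    print(candidate,
          looks_like_ipv4_literal(candidate),
          top_label_is_alphabetic(candidate))
```

Under these assumptions, no string with an alphabetic top label can pass the address check, which is the guarantee the DISCUSSION text leans on; a numeric top label such as 666 removes it.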

Finally, it is well-known that there are many implementations of
software -- particularly with respect to the DNS -- where people with
a less-than-nuanced reading of various RFCs have based what they will
allow on that reading of the RFC.  The 7 bit DNS implementations are
an excellent example of this: RFC 1035 was clear that the DNS itself
allowed other characters, but implementations checked for the
preferred syntax anyway because that was the safest bet.  We know
empirically that there were lots of checks (and in some cases still
are) for valid TLD labels that looked for things no longer than
three letters.  The 2001 introduction of a number of new TLDs was
rockier than necessary partly because of those checks, even though
there was never an RFC that suggested such was a good check.  1123
_does_ suggest that it is reasonable to check for top-level labels
being alphabetic, and I'd bet a pretty good lunch that we can find
implementations that decide whether something is a domain name based
on whether the top label starts with a letter.  Therefore, even if we
don't think that 1123 does in fact restrict the top-level label to
letters only, it is prudent to treat such a restriction as a _de
facto_ part of the protocol.

To the extent we want to change that de facto part of the protocol, we
want to do as little damage as possible.  An argument in favour of
John Klensin's suggestion to make an explicit exception for IDNA2008
A-labels is that it is the smallest change that can be made that still
accommodates the new feature we want.

Best regards,

Andrew

-- 
Andrew Sullivan
a...@shinkuro.com
Shinkuro, Inc.


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-11 Thread James Seng
 The DISCUSSION portion of 2.1 is explaining why relaxing RFC 952's
 restriction is safe.  The safety flows exclusively from the premise
 that the highest-level component label of a domain name will be
 alphabetic; this guarantees that a syntactic check for an IP address
 will fail due to at least one label being made up only of letters.  It
 may be, therefore, that the alphabetic restriction is in fact
 policy, and is not strictly a protocol issue.  The problem is that it
 is policy on which other technical decisions rest.  Change the policy,
 and the justification for those other technical decisions is
 undermined.  In this sense, the claim in the DISCUSSION portion of 2.1
 is not just a policy: it is also the foundation of other protocol
 issues, and is therefore normative on the protocol even if it _is_ a
 policy matter.

Okay, I agree with this line of logic.

1. We agree that the TLD restriction is therefore a policy one, and
that other technical specifications (e.g. allowing all-digit labels at
the 2LD) were derived on the assumption that the policy holds.

2. However, IDNA does not change that technical assumption, since an
A-label will never be all digits, nor start or end with a digit.

 The 2001 introduction of a number of new TLDs was
 rockier than necessary partly because of those checks, even though
 there was never an RFC that suggested such was a good check.

Agreed

 1123 _does_ suggest that it is reasonable to check for top-level labels
 being alphabetic, and I'd bet a pretty good lunch that we can find
 implementations that decide whether something is a domain name based
 on whether the top label starts with a letter.  Therefore, even if we
 don't think that 1123 does in fact restrict the top-level label to
 letters only, it is prudent to treat such a restriction as a _de
 facto_ part of the protocol.

This is where we differ.

1. RFC 1123 does not suggest that top-level labels be checked for being
alphabetic. RFC 1123 assumed that TLDs are alphabetic and therefore made
certain technical assumptions about what is considered valid.

But I agree with you that there will be implementations that decide
what a TLD should be. That is a problem with the implementation, not
with RFC 1123 or RFC 952, especially given what they did not say.

2. Again, IDNA does not change that either, since an A-label is always
LDH, or at least valid according to RFC 1123.

 To the extent we want to change that de facto part of the protocol, we
 want to do as little damage as possible.  An argument in favour of
 John Klensin's suggestion to make an explicit exception for IDNA2008
 A-labels is that it is the smallest change that can be made that still
 accommodates the new feature we want.

What I fail to see is why we need an update to RFC 1123... but I can
accept the small change proposed by John if that's what the group
thinks is best moving forward.

-James Seng


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-10 Thread Patrik Fältström

On 9 mar 2009, at 19.16, David Conrad wrote:

This doesn't make any sense to me.  I am fairly certain there will  
be a request to add the U-label 日本 (Japanese Kanji for Japan).   
This isn't alphabetic in any sense of the term.


To some degree it is, as the two characters are:

U+65E5 : Lo, Letter Other -- Alphabetic
U+672C : Lo, Letter Other -- Alphabetic

I.e. they both have the Unicode properties of letters, and because of  
that the derived property Alphabetic.


Compare this with the character 7:

U+0037 : Nd, Decimal_Number digit -- Not Alphabetic
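
These properties can be checked with Python's standard `unicodedata` module. This is only a sketch: the module exposes the general category but not the derived Alphabetic property itself, so `str.isalpha()` (which is true for the letter categories) is used here as an approximation. The helper name `describe` is hypothetical.

```python
import unicodedata


def describe(ch):
    """Return (code point, general category, str.isalpha()) for ch."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), ch.isalpha())


for ch in "日本7":
    print(describe(ch))
```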

Interestingly, I tried a couple of IDN test tools (IMC's and NASK's)  
to convert that UTF-8 string into the appropriate A-label and both  
indicated there are invalid characters.  I'm getting an uneasy  
feeling...


If you use a mac, let me recommend UnicodeChecker from http://earthlingsoft.net

   Patrik



Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-10 Thread Patrik Fältström


On 9 mar 2009, at 19.11, Edward Lewis wrote:

If A-labels conform to the rules in 1123 and all U-labels can be  
translated to A-labels, is BiDi an issue (for the DNS)?


The $1 question has to do with the "(for the DNS)" part of what you  
write. Domain names are not only used in the DNS, as we all know. If  
you have a domain name that, for example, ends with a 7, and you use  
this domain name in text with a right-to-left context, with some  
unfortunate characters adjacent to the 7, then the domain name  
might look weird.


So it is a domain name problem, not a DNS problem. And not a (big)  
problem if you only look at the domain name in isolation. It is the  
context in which the domain name is used that creates the problem.


   Patrik



Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-10 Thread Krzysztof Olesik
Hello,

Just a short explanation.

 Interestingly, I tried a couple of IDN test tools (IMC's and NASK's) to
 convert that UTF-8 string into the appropriate A-label and both
 indicated there are invalid characters.  I'm getting an uneasy feeling...
The IDN translation tool accepts only characters that are allowed for
registration at NASK. I will add a few words explaining this on the web
page to avoid further confusion.

Regards,
Krzysztof Olesik
NASK




Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-10 Thread Stephane Bortzmeyer
On Mon, Mar 09, 2009 at 01:04:42PM -0400,
 Andrew Sullivan a...@shinkuro.com wrote 
 a message of 59 lines which said:

 John's view is that the original alphabetic restriction in 1123
 was indeed intended as a restriction,

I was not there at the creation but I find it worrying to rely on the
recollection of one specific person. The alphabetic-only rule in RFC
1123 is just a side note, never detailed, and presented as a fact
(which it was at this time), not as a mandatory restriction.

It is nice to remove the ambiguity (and therefore
draft-liman-tld-names is a good idea) but it should be treated as a
small adjustment, not a big reform.

 He argues that it is a good idea to be as restrictive as possible in
 the top level,

I completely fail to see why. Most reasons given were policy
issues. Here, I fully agree with Edward Lewis's law: "bus drivers
shouldn't determine the bus route". There are no *TECHNICAL* reasons
to limit TLD to alphabetic characters. There may be non-technical
reasons and even valid non-technical reasons, but they are completely
off-topic for the IETF.

The IETF should be really careful not to be used as a pretext in
policy disputes. If some governance body wants to prohibit IDN in the
root (which is the case today), they must not be able to say that it
is at the request of the IETF, because this would drag the IETF into
the line of fire.

 His suggestion is to re-iterate the alphabetic-only criterion,

This would turn a small ambiguity in RFC 1123 into a real rule. -1 for
me.



Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-10 Thread Patrik Fältström


On 10 mar 2009, at 08.30, Patrik Fältström wrote:


If you use a mac, let me recommend UnicodeChecker from http://earthlingsoft.net


Hmm...that domain seems to be not delegated at the moment. Anyone have  
other contacts?


   Patrik



Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-10 Thread bmanning
On Tue, Mar 10, 2009 at 10:27:21AM +0100, Stephane Bortzmeyer wrote:
 On Mon, Mar 09, 2009 at 01:04:42PM -0400,
  Andrew Sullivan a...@shinkuro.com wrote 
  a message of 59 lines which said:
 
  John's view is that the original alphabetic restriction in 1123
  was indeed intended as a restriction,
 
 I was not there at the creation but I find it worrying to rely on the
 recollection of one specific person. The alphabetic-only rule in RFC
 1123 is just a side note, never detailed, and presented as a fact
 (which it was at this time), not as a mandatory restriction.

we -could- ask the author of RFC 1123.


--bill 


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-10 Thread David Conrad

Patrik,

On Mar 10, 2009, at 12:30 AM, Patrik Fältström wrote:

On 9 mar 2009, at 19.16, David Conrad wrote:
This doesn't make any sense to me.  I am fairly certain there will  
be a request to add the U-label 日本 (Japanese Kanji for  
Japan).  This isn't alphabetic in any sense of the term.


To some degree it is, as the two characters are:

U+65E5 : Lo, Letter Other -- Alphabetic
U+672C : Lo, Letter Other -- Alphabetic

I.e. they have both the Unicode Properties of letters, and because  
of that the derived property Alphabetic.


Ah.  A new definition for alphabetic of which I was unaware.

Thanks,
-drc

P.S. Out of curiosity, what is 一 (Japanese Kanji for the number 1)  
considered?




Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-10 Thread Patrik Fältström

On 10 mar 2009, at 19.07, David Conrad wrote:

P.S. Out of curiosity, what is 一 (Japanese Kanji for the number  
1) considered?


U+4E00 : Lo, Other_Letter, L, Left_To_Right

I.e. it is a letter. With strong directionality. So according to the  
Unicode properties that we use so far, that is not a number.
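
The directionality can be looked up alongside the general category with Python's `unicodedata` module. In this small sketch (the helper name `bidi_summary` is hypothetical), the ideograph is compared with the ASCII digit 7 and, as an arbitrary right-to-left example, the Hebrew letter alef.

```python
import unicodedata


def bidi_summary(ch):
    """Return (general category, bidirectional class) for ch."""
    return (unicodedata.category(ch), unicodedata.bidirectional(ch))


for ch in ("一", "7", "א"):
    print(f"U+{ord(ch):04X}", bidi_summary(ch))
```

The digit comes back with a weak directional class (European Number) rather than a strong one, which is why its display position can depend on the surrounding text.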


Fun?

   Patrik



Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-10 Thread David Conrad

On Mar 10, 2009, at 11:31 AM, Patrik Fältström wrote:

On 10 mar 2009, at 19.07, David Conrad wrote:
P.S. Out of curiosity, what is 一 (Japanese Kanji for the number  
1) considered?

U+4E00 : Lo, Other_Letter, L, Left_To_Right
I.e. it is a letter. With strong directionality. So according to the  
Unicode properties that we use so far, that is not a number.

Fun?


I'd say interesting (in the same way I find the Ebola virus  
interesting).  :-)


Regards,
-drc



Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-09 Thread Edward Lewis

At 13:04 -0400 3/9/09, Andrew Sullivan (based on someone else's note) wrote:


His suggestion is to re-iterate the alphabetic-only criterion,
except to allow one extension to permit A-labels conforming to
IDNA2008 (which work, note, is not yet complete).  In addition, I
think he believes that the document should also require that any
U-label that is to correspond with an A-label that is added to the
root zone must _also_ be alphabetic.


For those of us not reading idnabis, what is an A-label and what is a 
U-label?  I have not seen a reference to their definition so I'm 
assuming these are idnabis terms.



I will say that I am (personally, no hats) uneasy importing to the
technical constraints on top level labels what seem to me to be policy
considerations.  Such policy considerations seem to me to be the sort
of thing that ought to be handled in policy-making bodies set up for
the purpose.  At the same time, I accept the argument that there are
strong technical reasons to minimize the changes to rules about the
root zone, since we know there are many DNS-using systems in the world
built around fragile readings of various RFCs.  So I'm of two minds
about the position I've laid out above.


The problem with saying "these are the technical rules and they 
shouldn't be changed" is that this essentially closes off the global 
public Internet from becoming global.  If the Internet is based on 
"fragile readings of various RFCs" then the Internet should be the 
entity that suffers, not the world economy it serves.


I understand the advantages of maintaining technical purity, but what 
good is it if the purity was defined by a small percentage of the 
population and is then maintained in a way that results in a 
technical elite?  (Even I have relatives that do not speak English.)


I agree though that "policy...ought to be handled in policy-making 
bodies."  I have used the statement "bus drivers shouldn't determine 
the bus route" a few times in the past - meaning here that having DNS 
experts determine the rules for what's to be allowed in the global 
public Internet root zone is a misplaced assignment.  For engineers, 
no change in the specification is always good.  For the DNS, it 
doesn't matter what's in that root (so long as it's globally 
coherent).  But it matters to a lot of other protocols.


Ultimately, I think that there are no technical restrictions on what 
is placed in the root, no algorithmic way to say "thumbs up" or say 
"thumbs down".

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Edward Lewis
NeuStar                    You can leave a voice message at +1-571-434-5468

Getting everything you want is easy, if you don't want much.


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-09 Thread Paul Hoffman
At 1:04 PM -0400 3/9/09, Andrew Sullivan wrote:
His suggestion is to re-iterate the alphabetic-only criterion,
except to allow one extension to permit A-labels conforming to
IDNA2008 (which work, note, is not yet complete).  In addition, I
think he believes that the document should also require that any
U-label that is to correspond with an A-label that is added to the
root zone must _also_ be alphabetic. 

...give or take. Some language/script combinations require non-alphabetic 
characters to represent real words. The people who speak those languages would 
not call those characters "non-alphabetic". For example, if Unicode required 
you to have two characters to represent "e with umlaut", an "e" and a 
"combining umlaut", one can argue ad nauseam whether or not "combining umlaut" 
was alphabetic.

There is not a technical name for "alphabetic and the additional characters you 
need to form words", nor is there a definitive list.

A suggestion would be "word characters", explicitly excluding numerals and 
symbol characters (because some people might say that a copyright character is 
a word character).

I will say that I am (personally, no hats) uneasy importing to the
technical constraints on top level labels what seem to me to be policy
considerations.  Such policy considerations seem to me to be the sort
of thing that ought to be handled in policy-making bodies set up for
the purpose.  At the same time, I accept the argument that there are
strong technical reasons to minimize the changes to rules about the
root zone, since we know there are many DNS-using systems in the world
built around fragile readings of various RFCs.  So I'm of two minds
about the position I've laid out above.

+2 (+1 for each mind), unfortunately.

--Paul Hoffman, Director
--VPN Consortium


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-09 Thread Patrik Fältström


On 9 mar 2009, at 18.35, Edward Lewis wrote:

For those of us not reading idnabis, what is an A-label and what is  
a U-label?  I have not seen a reference to their definition so I'm  
assuming these are idnabis terms.


In short, an A-label is what is registered in the DNS (the xn-- version) and a
U-label is the equivalent but in Unicode.


There are some details of course, but this is the 3 foot level view.

   Patrik



Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-09 Thread Andrew Sullivan
On Mon, Mar 09, 2009 at 01:35:08PM -0400, Edward Lewis wrote:

 For those of us not reading idnabis, what is an A-label and what is a  
 U-label?  I have not seen a reference to their definition so I'm  
 assuming these are idnabis terms.

Oops, sorry, yes, I should have provided that.

An A-label is the ASCII-compatible-encoding version of an IDNA-legal
string.  Basically, these are the labels you see in the DNS that start
with "xn--".  Note that traditional labels (all ASCII ones that people
use, like "shinkuro" and "com") are _not_ A-labels.

A U-label is the Unicode version of the label -- that is, the thing
that we expect people to see and to type in.  It must include at least
one non-ASCII character (otherwise, it's just an ASCII label, and IDNA
doesn't kick in).  There are some other restrictions having to do with
the legal form for U-labels, but they're beyond our scope for the
purposes of this discussion.
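
For readers who want to see the two forms side by side, here is a minimal Python sketch using the standard library's "idna" codec. Two assumptions apply: that codec implements the older IDNA2003 rules (RFC 3490), not IDNA2008, so it only approximates the definitions under discussion; and the helper names `to_a_label` and `to_u_label` are hypothetical.

```python
def to_a_label(u_label):
    """Convert a Unicode label to its ASCII-compatible ("xn--") form."""
    return u_label.encode("idna").decode("ascii")


def to_u_label(a_label):
    """Convert an ASCII-compatible label back to Unicode."""
    return a_label.encode("ascii").decode("idna")


a_label = to_a_label("日本")
print(a_label)                  # the ASCII form that would appear in a zone
print(to_u_label(a_label))      # round-trips to the Unicode form
print(to_a_label("shinkuro"))   # all-ASCII labels pass through unchanged
```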

 The problem with saying "these are the technical rules and they  
 shouldn't be changed" is that this essentially closes off the global  
 public Internet from becoming global.  

Surely not, if we are defining an infrastructure by which that
globalness may be expressed (i.e. IDNA).  Or is there some other
thing you think ought to be permitted that would be closed off by
John's position?

A
-- 
Andrew Sullivan
a...@shinkuro.com
Shinkuro, Inc.


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-09 Thread David Conrad

On Mar 9, 2009, at 10:04 AM, Andrew Sullivan wrote:

I think he believes that the document should also require that any
U-label that is to correspond with an A-label that is added to the
root zone must _also_ be alphabetic.


This doesn't make any sense to me.  I am fairly certain there will be  
a request to add the U-label 日本 (Japanese Kanji for Japan).  This  
isn't alphabetic in any sense of the term.


Interestingly, I tried a couple of IDN test tools (IMC's and NASK's)  
to convert that UTF-8 string into the appropriate A-label and both  
indicated there are invalid characters.  I'm getting an uneasy  
feeling...


Regards,
-drc



Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-09 Thread Edward Lewis

At 13:56 -0400 3/9/09, Andrew Sullivan wrote:


globalness may be expressed (i.e. IDNA).  Or is there some other
thing you think ought to be permitted that would be closed off by
John's position?


I'm not sure (as this is all second hand arguing). ;)  I have gotten 
the feeling that some folks, perhaps John/perhaps not, have been 
arguing that the specs in 1123 were sacred and ought not to change 
lest other bad things happen.  IOW, maintaining the status quo of a 
working Internet was imperative.


I suppose my confusion is now a bit deeper.  If A-labels conform to 
the rules in 1123 and all U-labels can be translated to A-labels, is 
BiDi an issue (for the DNS)?


If A-labels are what is in the DNS, isn't the DNS taken care of? And 
therefore all of this talk about what U-labels are good for TLDs 
belongs in fora for other protocols (URLs/IRIs, SMTP, etc.) or 
policy on things like bundling (ICANN)?


BTW, I was under the impression that ICANN wanted a root zone without 
any confusion, no homonyms, etc., but the IDN test bed has that - the 
two Chinese-language[0] entries are pronounced the same but written 
differently.


[0] Mandarin/simplified and Mandarin/traditional

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Edward Lewis
NeuStar                    You can leave a voice message at +1-571-434-5468

Getting everything you want is easy, if you don't want much.


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-09 Thread Andrew Sullivan
On Mon, Mar 09, 2009 at 02:11:13PM -0400, Edward Lewis wrote:

 I'm not sure (as this is all second hand arguing). ;)  I have gotten the 
 feeling that some folks, perhaps John/perhaps not, have been arguing that 
 the specs in 1123 were sacred and ought not to change lest other bad 
 things happen.  IOW, maintaining the status quo of a working Internet 
 was imperative.

No, I think John is arguing that IDNs at the top level are a good
idea, but opening things any wider would be bad.  I think his position
is very similar to what Patrik argued, which was that we ought to
require people to show that there is not a harm before allowing the
innovation at the top level, rather than presuming that there's no
harm and reacting if we run into one.

 I suppose my confusion is now a bit deeper.  If A-labels conform to the 
 rules in 1123 and all U-labels can be translated to A-labels, is BiDi an 
 issue (for the DNS)?

A-labels do _not_ conform to 1123.  1123 says "alphabetic", and all
A-labels have at least two hyphen-minus ("-") characters in them.
That's part of why we have a problem.

 If A-labels are what is in the DNS, isn't the DNS taken care-of? 

Maybe.  That's what I was asking in idnabis about.  It isn't clear to
me yet whether it is possible for an A-label (which is always
xn--[output from Punycode algorithm]) to end with a digit instead of
always a letter.  I _think_ it's always a letter, but I'm not totally
sure.  You can definitely have digits inside a Punycode-encoded label,
because I've seen them.  We may want to restrict any A-label that
happens to end with a digit anyway, because of what might happen if
that label is exposed in A-label form, but in a BiDi context.
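
One way to probe the trailing-character question empirically is to encode a sample of code points with Python's standard "punycode" codec and look at the last character each time. This is a sketch and not a proof: the sample range is arbitrary, the helper names are hypothetical, and the codec implements the Punycode encoding of RFC 3492 without any of the IDNA validity rules. For what it is worth, my reading of RFC 3492 is that the final digit of each encoded integer is always below the threshold, which never exceeds 26, and digit values 0 to 25 are written as the letters a to z; if that reading is right, the check below should succeed for any input containing a non-ASCII character.

```python
import string


def punycode_tail(text):
    """Return the last character of the Punycode encoding of text."""
    return text.encode("punycode").decode("ascii")[-1]


def tails_are_letters(code_points, prefix=""):
    """True if every sampled encoding ends in a lower-case ASCII letter."""
    return all(
        punycode_tail(prefix + chr(cp)) in string.ascii_lowercase
        for cp in code_points
    )


# Arbitrary sample: U+0080 up to (not including) U+3000.
SAMPLE = range(0x80, 0x3000)
print(tails_are_letters(SAMPLE))          # one non-ASCII character alone
print(tails_are_letters(SAMPLE, "ab9"))   # the same after an ASCII prefix
```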

 BTW, I was under the impression that ICANN wanted a root zone without  
 any confusion, no homonyms, etc., but the IDN test bed has that - the  
 two Chinese-language[0] entries are pronounced the same but written  
 differently.

I am happy to report that I do not know anything about what ICANN
wants in respect of the contents of the root zone, and that I don't
yet think I need to learn.

A

-- 
Andrew Sullivan
a...@shinkuro.com
Shinkuro, Inc.


Re: [DNSOP] Some second-hand remarks on draft-liman-tld-names-00.txt

2009-03-09 Thread Andrew Sullivan
On Mon, Mar 09, 2009 at 11:16:36AM -0700, David Conrad wrote:

 This doesn't make any sense to me.  I am fairly certain there will be a 
 request to add the U-label 日本 (Japanese Kanji for Japan).  This  
 isn't alphabetic in any sense of the term.

See Paul's note about this.  The point is that one might try to
restrict numbers.  My opinion, in any case, is that anything having to
do with U-labels is completely outside the scope of any document
focussed on the DNS: no U-label should ever be anywhere close to a
zone file.

A
-- 
Andrew Sullivan
a...@shinkuro.com
Shinkuro, Inc.