On Sat, Dec 30, 2017 at 7:26 AM, Stephen J. Turnbull
<turnbull.stephen...@u.tsukuba.ac.jp> wrote:
> Christian Heimes writes:
>
> > Questions:
> > - Is everybody OK with breaking backwards compatibility? The risk is
> >   small. ASCII-only domains are not affected
>
> That's not quite true, as your German example shows. In some Oriental
> renderings it is impossible to distinguish halfwidth digits from
> full-width ones as the same glyphs are used. (This occasionally
> happens with other ASCII characters, but users are more fussy about
> digits lining up.) That is, while technically ASCII-only domain names
> are not affected, users of ASCII-only domain names are potentially
> vulnerable to confusable names when IDNA is introduced. (Hopefully
> the Asian registrars are as woke as the German ones! But you could
> still register a .com containing full-width digits or letters.)

This particular example isn't an issue: in IDNA encoding, full-width and
half-width digits are normalized together, so number1.com and
number１.com actually refer to the same domain name. This is true in
both the 2003 and 2008 versions:

    # IDNA 2003
    In [7]: "number\uff11.com".encode("idna")
    Out[7]: b'number1.com'

    # IDNA 2008 (using the 'idna' package from PyPI)
    In [8]: idna.encode("number\uff11.com", uts46=True)
    Out[8]: b'number1.com'

That said, IDNA does still allow for a bunch of spoofing opportunities
that aren't possible with pure ASCII, and this requires some care:

    https://unicode.org/faq/idn.html#16

This is mostly a UI issue, though; there's not much that the socket or
ssl modules can do to help here.

-n

--
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
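
For anyone who wants to check the normalization claim in the message above
without installing anything, here is a minimal stdlib-only sketch. It
assumes the built-in "idna" codec (the IDNA 2003 one shown in the message)
and uses NFKC compatibility normalization from unicodedata to show why the
full-width digit folds to ASCII. The variable names are illustrative only,
and this says nothing about confusables that survive normalization.

```python
import unicodedata

# U+FF11 is FULLWIDTH DIGIT ONE; the second name is plain ASCII.
fullwidth = "number\uff11.com"
ascii_name = "number1.com"

# NFKC compatibility normalization folds the full-width digit to ASCII "1".
nfkc = unicodedata.normalize("NFKC", fullwidth)

# The built-in IDNA 2003 codec therefore encodes both names identically.
encoded_fullwidth = fullwidth.encode("idna")
encoded_ascii = ascii_name.encode("idna")

print(nfkc == ascii_name)
print(encoded_fullwidth == encoded_ascii)
```

Both comparisons should come out equal, which is another way of saying the
two spellings name the same domain as far as the wire format is concerned.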