I decided to create a domain named S.cr.yp.to, but with the S changed to a UTF-8 contour-integral sign. That's Unicode #222E; \342\210\256. (Some readers say that their MUAs can't display UTF-8 text correctly, so I've changed contour-integral back to S in the following text.)

I'm using UNIX, with the UTF-8 version of xterm. Here's what I did:

     # cd /etc/tinydns/root
     # echo @S.cr.yp.to:131.193.178.181:mx.contour.cr.yp.to >> data
     # make

This created S.cr.yp.to MX mx.contour.cr.yp.to A 131.193.178.181.

     # dnsmx S.cr.yp.to
     0 mx.contour.cr.yp.to

This did a DNS lookup through my local cache, dnscache, and obtained the result without trouble. Can your cache do this? If not, why not?

     # cd /var/qmail/control
     # echo S.cr.yp.to:contour >> virtualdomains
     # echo S.cr.yp.to >> rcpthosts
     # svc -h /service/qmail

This arranged for qmail to accept mail for (e.g.) [EMAIL PROTECTED] and deliver it to contour-postmaster on the local host.

     # ( echo To: [EMAIL PROTECTED]; echo Testing ) | qmail-inject

This sent a message to [EMAIL PROTECTED] I read the message using the UTF-8 version of less, and it was displayed comprehensibly, with the To line shown as follows:

     To: postmaster@"S".cr.yp.to

qmail-inject doesn't mind weird characters, such as control characters and 8-bit characters, in atoms. It converts the atoms to quoted strings. Of course, RFC 822 doesn't allow quoted strings, never mind 8-bit characters, in domain names, but these are easy protocol extensions.

I then tried sending a message to [EMAIL PROTECTED] from another machine. My SMTP client, qmail-remote, quoted the S the same way that qmail-inject did. The message was received and read without trouble.

What's wrong with handling S this way? The answer seems to be that some other programs don't work. What are those programs? What exactly do they do wrong? How hard is it to fix them? Why should we believe that the other IDN proposals will require less effort?
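For readers who want to check the byte sequence quoted at the top of this message, here is a minimal sketch in POSIX shell. It writes the three bytes \342\210\256 with printf and prints them back as octal escapes using od. The function name octal_escapes is a hypothetical helper made up for this illustration; it is not part of djbdns or qmail.

```shell
# octal_escapes: hypothetical helper (not part of djbdns or qmail).
# Reads bytes from stdin and prints each one as a \ooo octal escape.
octal_escapes() {
    # od -An drops the offset column, -v disables line folding,
    # -to1 prints one octal number per input byte.
    for byte in $(od -An -v -to1); do
        printf '\\%s' "$byte"
    done
    printf '\n'
}

# The UTF-8 encoding of the contour-integral sign, written as raw bytes.
printf '\342\210\256' | octal_escapes

# For comparison, the plain ASCII letter S is a single byte.
printf 'S' | octal_escapes
```

The first call should echo back the same three escapes, showing that the name contains three 8-bit bytes where the ASCII S has one. Any program that claims to handle the name has to pass all three through untouched.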
Keith Moore writes:
> all apps will have to be modified (many will require significant
> modification) if they want to deal meaningfully with IDNs.

False. As demonstrated above, qmail and djbdns already work with UTF-8 domain names. They're both widely deployed. Apparently Microsoft also has some clients and servers that work with UTF-8 domain names.

Changing those programs has a cost. What is the benefit?

Brian W. Spolarich writes:
> Using 8-bit data will break some applications.
> Using 7-bit data (presumably) will not.

False. If the user's S.cr.yp.to has to be encoded inside DNS and mail messages as ace-blah.cr.yp.to, then qmail will be faced with S.cr.yp.to in (e.g.) /var/qmail/control/virtualdomains, and ace-blah.cr.yp.to in SMTP. This simply won't work unless the software is changed.

If, on the other hand, S.cr.yp.to is used as is, then the software will work fine.

James Seng/Personal writes:
> Patching sendmail might be trival for a good programmer like yourself.
> How fast do you think you can get everyone to use your patch and would
> unpatch software fallback safely?

All the proposals require a sendmail patch. To tell sendmail to accept mail for S.foo.dom, the user adds S.foo.dom to a file with his UTF-8 editor; sendmail mishandles the \210 if it isn't patched.

The patch required for direct use of UTF-8 is by far the simplest. No, deployment isn't free, but the other proposals don't change this fact.

---Dan

P.S. I'm a subscriber to this mailing list. I don't want to receive extra copies of messages sent to the list. I've set Mail-Followup-To accordingly.

P.P.S. You may have noticed an unusual From line on this message. The problem is that the software running this mailing list can't deal with the concept of sublist subscribers; it forwards my messages to Seng. Seng eventually approved two of my messages, editing the Date field and removing the Received lines to hide the delay. He has refused to approve this one unless I take special actions to fool the list software. So I'm using his address in From, and my address in Reply-To.
[idn] An experiment with UTF-8 domain names
D. J. Bernstein c/o James Seng Fri, 05 Jan 2001 00:08:17 -0800
