----- Original Message ----- 
From: "Soobok Lee" <[EMAIL PROTECTED]>
To: "IETF idn working group" <[EMAIL PROTECTED]>
Sent: Sunday, March 24, 2002 9:36 PM
Subject: Re: [idn] URL encoding in html page


> 
> ----- Original Message ----- 
> From: "Soobok Lee" <[EMAIL PROTECTED]>
> > > Not necessary, since the HTML and URI specs already limit the host to
> > > ASCII letters, digits, hyphens, and dots.
> > 
> > We experts already know this, but many ML.com registrants are unaware of this
> > limitation of ML.com. They want to use native ML.com names in their HTML home pages.
> > 
> > If we want interoperable URIs that support native IDN, we should revise BOTH
> > the URI spec and the HTTP spec. But native IDN support brings potential
> > legacy-code versioning and interoperability problems.
> > Could anyone provide an in-depth analysis of this caveat?
> > 
> 
>  
>  Even if we stay with the current HTTP/1.1, which allows only ASCII Host: header values,
>  we could still revise the URI spec to allow native (UTF-8 or legacy-encoded) IDN in URIs.
> 
>  1) With IDNA and HTTP/1.1, the web browser can encode the native IDN in the URI into
>  its ACE form, and then open an HTTP/1.1 session to the ACE hostname with an ACE Host: value.
> 
>  2) With IDNA and a revised HTTP with UTF-8 host support, the web browser can encode the
>  UTF-8 IDN in the URI into its ACE form, and then open an HTTP session to the ACE hostname
>  with a UTF-8 Host: value.
> 
>  3) With UTF-8-based IDN and a revised HTTP with UTF-8 host support, the browser can check
>  whether the native IDN is in UTF-8 and, if not, convert it to UTF-8, and then open an
>  HTTP session to the UTF-8 web host with a UTF-8 Host: value.
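
To make the three options concrete, here is a rough Python sketch of what the
browser would hand to the resolver and what it would put in the Host: header in
each case. The names WirePlan and plan_request are made up for illustration, and
Python's built-in "idna" codec (the later-standardized Punycode form with the
"xn--" prefix) only stands in for whichever ACE is finally chosen.

```python
from typing import NamedTuple


class WirePlan(NamedTuple):
    connect_to: bytes   # name handed to the resolver
    host_header: bytes  # bytes sent as the Host: value


def plan_request(native_host: str, option: int) -> WirePlan:
    """Return what goes on the wire for options 1), 2) and 3)."""
    ace = native_host.encode("idna")    # ACE form, pure ASCII (stand-in codec)
    utf8 = native_host.encode("utf-8")  # raw UTF-8, no nameprep applied here
    if option == 1:
        return WirePlan(ace, ace)    # IDNA + HTTP/1.1
    if option == 2:
        return WirePlan(ace, utf8)   # IDNA + revised HTTP
    if option == 3:
        return WirePlan(utf8, utf8)  # UTF-8 IDN + revised HTTP
    raise ValueError("option must be 1, 2 or 3")


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, plan_request("bücher.example", n))
```

Only option 1) keeps every byte on the wire inside ASCII, which is why it works
with today's servers unchanged.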
> 
> 
>  2) and 3) may be infeasible due to HTTP's lack of a capability negotiation feature like that of ESMTP,

  s/and 3)//    :-)     In 3), the web server surely supports a native UTF-8 Host: value.
  

>  because a new web browser with native IDN URI support can't determine whether the web
>  server supports native IDN or only an ASCII (ACE) host in the Host: value without trying
>  the request twice, once with each form of the Host: value (UTF-8 first, then ACE if needed).
>  Using an ACE Host: value is always safe in 1) and 2).
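
The "try twice" behaviour in 2) could look roughly like the sketch below. The
send() hook and fetch_with_fallback are hypothetical names, and the sketch
assumes a legacy server answers a non-ASCII Host: value with 400. A real legacy
server might instead quietly serve its default virtual host, in which case this
detection would not work at all.

```python
from typing import Callable, Tuple

# send(connect_name, host_header) -> HTTP status code.
# Hypothetical transport hook so the sketch needs no network access.
Send = Callable[[bytes, bytes], int]


def fetch_with_fallback(native_host: str, send: Send) -> Tuple[int, str]:
    """Try a UTF-8 Host: value first, then fall back to the ACE form."""
    ace = native_host.encode("idna")    # stand-in ACE, pure ASCII
    utf8 = native_host.encode("utf-8")
    status = send(ace, utf8)            # first attempt: UTF-8 Host: value
    if status != 400:                   # assumption: 400 means "not understood"
        return status, "utf8"
    return send(ace, ace), "ace"        # second attempt: ACE Host: value
```

Against a server without UTF-8 host support every request costs two round
trips, which is the price of having no negotiation step.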
> 
>  BTW, in 1) and 2), we cannot avoid legacy versioning problems, because most ACE
>  conversion would be done as "ACE(NFKC(CaseFold(legacy-to-Unicode(native label))))".
>  Most home pages in East Asia are in legacy encodings, and that near-100% dominance
>  won't change in the foreseeable future.
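
The nested conversion quoted above can be sketched step by step in Python. This
is a simplification: real nameprep also applies mapping and prohibition tables
that are left out here, legacy_label_to_ace is an illustrative name, and the
"xn--" prefix with Python's "punycode" codec is only a stand-in for the ACE.

```python
import unicodedata


def legacy_label_to_ace(raw: bytes, legacy_encoding: str) -> bytes:
    """ACE(NFKC(CaseFold(legacy-to-Unicode(label)))) for a single label."""
    text = raw.decode(legacy_encoding)                   # legacy-to-Unicode
    folded = text.casefold()                             # CaseFold
    normalized = unicodedata.normalize("NFKC", folded)   # NFKC
    if normalized.isascii():
        return normalized.encode("ascii")                # plain LDH label, no ACE needed
    return b"xn--" + normalized.encode("punycode")       # ACE (assumed prefix)


if __name__ == "__main__":
    # The same label typed in two different encodings ends up as one ACE name.
    print(legacy_label_to_ace("한국".encode("euc_kr"), "euc_kr"))
    print(legacy_label_to_ace("한국".encode("utf-8"), "utf-8"))
```

Everything after the first step is defined on Unicode alone. The
legacy-to-Unicode step is the only one that depends on tables shipped with the
browser.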
> 
>  New legacy codes may be created after IDN-aware browsers are distributed.
>  Old legacy codes may get new code points for newly added characters.
>  If IDN-aware browsers/applications are not upgraded with the new legacy-to-Unicode
>  mappings, they will occasionally fail to convert a legacy-encoded IDN into its Unicode form.
>  That kind of failure has never been seen in LDH DNS.
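
As one illustration of the versioning failure, the sketch below uses Python's
"euc_kr" and "cp949" codecs to play the role of an old and an upgraded
legacy-to-Unicode mapping. cp949 is a superset that adds Hangul syllables
missing from the precomposed set of KS X 1001. The helper name
legacy_to_unicode is made up, and treating these two codecs as "old browser"
and "new browser" is only an analogy.

```python
from typing import Optional


def legacy_to_unicode(raw: bytes, mapping: str) -> Optional[str]:
    """Decode a legacy-encoded label, or return None if the mapping lacks it."""
    try:
        return raw.decode(mapping)
    except UnicodeDecodeError:
        return None  # the conversion step fails, so no ACE can be produced


# A syllable that only the extended code (cp949) can encode as two bytes.
newer_label = "똠".encode("cp949")

old_browser = legacy_to_unicode(newer_label, "euc_kr")  # outdated mapping
new_browser = legacy_to_unicode(newer_label, "cp949")   # upgraded mapping
```

Here the browser with the outdated mapping gets None and cannot resolve the
name at all, while a label made only of older characters still converts fine.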
> 
> Soobok Lee
> 

