----- Original Message -----
From: "Bruce Thomson" <[EMAIL PROTECTED]>
To: "Soobok Lee" <[EMAIL PROTECTED]>; "IETF idn working group" <[EMAIL PROTECTED]>
Sent: Friday, March 22, 2002 6:29 PM
Subject: Re: [idn] URL encoding in html page


> > What if all the html viewable text is in english, but, only the href url contains
> > legacy (korean) encoded hostnames?  chinese visitors would see clean english homepage,
> > but fail to click through the korean link.
> >
> Well, that could happen, but a META tag would solve that so easily. Personally
> I often use a simple text editor to deal with HTML, and would find it easier to
> use legacy encodings or UTF-8 than cut-and-paste ACE from somewhere.
> Of course the user could do it either way and it would work.

Yes, charset META tags help. But many homepages assume the main audience's default
character encoding and very often omit the META tag that declares it, such as:
  <meta http-equiv="Content-Type" content="text/html; charset=euc-kr">

Moreover, an IDN URL may be used in a pure FRAMESET document that defines frame URLs
and contains no viewable text. Such FRAMESET documents often omit charset META tags
(look at the HTML source of http://www.freeway.co.kr/ ).
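
To make the failure concrete, here is a minimal Python sketch of what happens to
the bytes of a legacy-encoded hostname when the page declares no charset. The label
is a hypothetical one picked only for illustration, and gb2312 stands in for whatever
default a Chinese visitor's browser might assume. The last step uses Python's
built-in "idna" codec (RFC 3490, which postdates this discussion) only to show that
an ASCII-compatible form does not depend on the page charset at all.

```python
# Hypothetical hostname label, chosen only for illustration.
label = "한국"

# Bytes as they would sit in an href of a page saved in EUC-KR.
raw = label.encode("euc_kr")


def decode_as(data: bytes, assumed_charset: str) -> str:
    """Decode href bytes the way a browser assuming this charset would."""
    return data.decode(assumed_charset, errors="replace")


# A browser that guesses EUC-KR recovers the intended label.
seen_by_korean_browser = decode_as(raw, "euc_kr")

# A browser that guesses GB2312 (an assumed default) gets different characters
# from the very same bytes, so the hostname lookup would target another name.
seen_by_chinese_browser = decode_as(raw, "gb2312")

# An ASCII-compatible encoding of the same label is plain ASCII, so it reads
# the same under any of these charsets.
ace = label.encode("idna")

print(seen_by_korean_browser == label)
print(seen_by_chinese_browser == label)
print(ace.decode("ascii"))
```

The same mismatch would occur for any pair of legacy charsets that assign different
characters to the same byte values.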

AFAIK, virtually all Korean homepages use a legacy Korean encoding
(KS_C_5601-1987 or euc-kr), implicitly or explicitly. So do most Japanese and
Chinese homepages.
UTF-8/UCS-2 encodings are rarely used in global web publishing.  Legacy encodings
will dominate even in the future, because they are compact and inexpensive.

If we want to make IDN truly internationally interoperable, all IDN-aware
web browsers and applications would have to contain libraries of legacy-to-Unicode
conversion routines for every legacy encoding. That would place too heavy a memory
burden on handheld devices such as PDAs.

Moreover, legacy encodings are revised separately from Unicode. We may face
versioning problems as tough as those we faced in stringprep/nameprep versioning
for newly added Unicode code points.
How can we guarantee the stability and integrity of IDN operations across all
combinations of the numerous kinds and versions of IDN-aware applications and
legacy encodings?

Soobok Lee

