At 23:40 02/03/26 +0000, Adam M. Costello wrote:
>"David Leung (Neteka Inc.)" <[EMAIL PROTECTED]> wrote:
>
> > If we use ACE as the URL links then all web designers in the world
> > needs to be retrained for ACE conversion.
>
>Either the web designer uses an HTML editor, in which case the editor
>should know the HTML syntax rules and convert to/from the local charset
>as needed,

I don't know exactly what you mean by 'convert to/from'. Many HTML
editors these days work in Unicode internally, and convert to whatever
encoding the author chooses on input/output. This conversion is
independent of HTML syntax, except for the use (or not) of numeric
character references (&#xHHHH;,...) and the <meta> charset information.
Some HTML editors may work in a local encoding throughout, and then
they don't do any conversion.

You seem to imply that HTML editors check, or should check, the URI
syntax, and that they could be upgraded to add a legacy
encoding->Unicode->ACE conversion.

First, I'm not aware of HTML editors actually doing such syntax checks.

Second, there is the very fundamental problem that they cannot do such
checks. If you enter hppt:something, should the editor tell you this is
an error? It has no idea whether hppt is a legal URI scheme or not.
Similarly, even if the HTML editor does the very limited checks that
RFC 2396 allows, this doesn't get it very far. For example, while http:
uses generic URI syntax, mailto: uses opaque syntax, so there is no
general way to know where in a URI there is a domain name. This
includes the case where a domain name is sent as a parameter.

>or the web designer uses a text editor, in which case the web
>designer is taking responsibility for knowing and obeying the HTML/URI
>syntax rules, one of which is that href and src attributes contain only
>ASCII characters.
>
>Maybe future HTML/URI specs will allow non-ASCII characters in href and
>src attributes, but it's not obvious how to do that without breaking
>deployed browsers, and that discussion is for another forum.

HTML 4 already says what a browser should do if it finds a non-ASCII
character in a URI. Please see
http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.1
Appendix B.2.1: Non-ASCII characters in URI attribute values
(this is supported by all major browsers in newer versions)

For further information, e.g. on other W3C specs, see also:
http://www.w3.org/International/O-URL-and-ident.html

Of course this doesn't solve the problem of how to handle IDNs in URIs
automatically, but it provides a clear direction.

Regards, Martin.

#-#-# Martin J. Du"rst, I18N Activity Lead, World Wide Web Consortium
#-#-# mailto:[EMAIL PROTECTED] http://www.w3.org/People/D%C3%BCrst
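
As an illustration of the HTML 4 Appendix B.2.1 rule referred to in the
message, here is a minimal Python sketch: each non-ASCII character is
represented in UTF-8 and every resulting byte is escaped as %HH. The
function name escape_non_ascii is hypothetical and not taken from any
spec or library, and the sketch assumes the input is already a decoded
Unicode string.

```python
def escape_non_ascii(uri: str) -> str:
    """Escape non-ASCII characters as %HH of their UTF-8 bytes.

    ASCII characters, including existing %HH escapes, are left as-is.
    """
    out = []
    for ch in uri:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # One %HH triplet per UTF-8 byte of the character.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)


print(escape_non_ascii("http://www.w3.org/People/Dürst"))
# prints http://www.w3.org/People/D%C3%BCrst
```

The sketch treats the URI as an opaque string and knows nothing about
where a domain name might sit inside it, so it does not address IDNs.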
