Hi!

Judson Valeski wrote:
> 
> We decided on the following proposal. dougt, rpotts, chak, dmose, gagan,
> valeski, and nhotta attended the meeting.
> 
> URIs would accept, and store, only UTF8 encoded strings. Protocols not able
> to handle UTF8 (HTTP for example) would access the charset attribute
> (proposed) off of nsIURI to convert back to the original string. The charset
> would be set by the URI creator, as they have the best charset context. Is
> nsIURI the right place for the charset attribute?

I think it is. Also, do away with the char representation of the URI
components. Use strings instead.
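
As a rough sketch of the charset round trip described above (Python for
brevity; `Uri`, `make_uri`, and `http_wire_bytes` are hypothetical names
for illustration, not necko API): the URI is held as a string, and the
charset attribute lets a protocol recover the creator's original bytes.

```python
from dataclasses import dataclass


@dataclass
class Uri:
    """Hypothetical stand-in for nsIURI (an assumption, not the real interface).

    spec holds the URI as a Unicode string; charset records the encoding
    the URI creator originally used.
    """
    spec: str
    charset: str = "utf-8"


def make_uri(raw: bytes, charset: str) -> Uri:
    # The creator knows the charset context, so it decodes and records it.
    return Uri(spec=raw.decode(charset), charset=charset)


def http_wire_bytes(uri: Uri) -> bytes:
    # A protocol that cannot send UTF-8 converts back to the original bytes.
    return uri.spec.encode(uri.charset)


raw = "/caf\u00e9".encode("iso-8859-1")
uri = make_uri(raw, "iso-8859-1")
utf8_form = uri.spec.encode("utf-8")   # what a UTF-8 capable protocol would use
original = http_wire_bytes(uri)        # what HTTP would send
```

Here `original` equals `raw` again, while `utf8_form` is the UTF-8 byte
sequence for the same path.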
 
> The current ASCII % encoding would be removed from the internal URI
> representation. Again, this encoding would be pushed out to the protocol
> level.

So we will have two levels of %-encoding? I don't think the %-encoding
can be removed completely. The first level applies to all URIs and masks
reserved chars, as the current code does. On a second level, non-ASCII
chars can be encoded as the protocol needs it.
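
To illustrate the two levels, here is a minimal sketch in Python. The
function names and the exact unreserved set are assumptions made for the
example, not what necko does today.

```python
import string

# Assumed unreserved set for this sketch; the real set depends on the URI spec
# and on which component is being escaped.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def escape_reserved(component: str) -> str:
    """Level 1 (all URIs): mask ASCII chars outside the unreserved set.

    Non-ASCII characters are left untouched.
    """
    out = []
    for ch in component:
        if ord(ch) < 128 and ch not in UNRESERVED:
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


def escape_for_protocol(component: str, charset: str) -> str:
    """Level 2 (per protocol): %-encode non-ASCII chars as bytes in charset."""
    out = []
    for ch in component:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode(charset))
    return "".join(out)


level1 = escape_reserved("a b/\u00e9")
latin1 = escape_for_protocol(level1, "iso-8859-1")
utf8 = escape_for_protocol(level1, "utf-8")
```

The first level leaves the non-ASCII char alone, so each protocol can
still choose its own byte encoding on the second level.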

> Currently, necko provides the ability to create both UTF8 encoded URIs, as
> well as ASCII URIs. This is a bug that needs to be fixed so *all* necko URI
> creation facilities would create UTF8 URIs.
> 
> This proposal addresses LDAP's immediate need for UTF8 URIs (it is a
> protocol that can handle UTF8 strings), as well as HTTP's need to *not* use
> UTF8 (the charset attribute will allow HTTP to convert back to the original
> string).
> 
> IDNS, and future HTTP servers handling UTF8 are believed to be covered under
> this model.
> 
> Migration to this new world would be phased something like the following to
> minimize impact...
> 
> First phase:
> - The URI charset attribute would be added first, and URI creators would
>   start feeding in the charset.
> - Necko would provide consistency in URI creation facilities (all UTF8), and
>   callers/users expecting non-UTF8 URIs would need to deal w/ the new
>   encoding.
> - HTTP would convert out of UTF8 before sending requests (fixes chak's bug).
> 
> Second phase:
> - ASCII % encoding would be removed from the URL implementation(s), and
>   pushed out to the protocols that need it. Callers expecting the encoding
>   would also need to be repaired to handle the new UTF8 format.
> 
> Jud

Andreas
