Jürgen,

Have a look at

http://www.w3.org/International/O-URL-and-ident.html

for a recommendation about defaults in URI character
encodings.

On Wednesday, 6 March 2002, at 11:30, Pill, Juergen wrote:

> Hello,
>
> The client will send a URI encoded in a specific encoding. This encoding
> may vary from client to client, e.g. IE sends UTF-8, while WebFolders sends
> a platform-specific encoding (in the US and Germany this is ISO...), at
> least on Windows 2000.
True.

> The server will accept encoded URLs based on a default encoding
> (defaultEncoding = new java.io.InputStreamReader(System.in).getEncoding())
> or on one set in slide.properties (org.apache.slide.urlEncoding=xxxxxx).
This is no good. How should a client guess what encoding the server uses?
The only interoperable way is to have a standard encoding, namely UTF-8.

> The client and server encodings must be identical when the character set
> extends US-ASCII. Unfortunately most (all?) clients do not send any
> information about the encoding that was used when the URL was encoded.
> This leaves the server to guess, or to make the accepted encoding
> configurable via a parameter. (Does someone know an algorithm to determine
> the encoding used from an encoded URL?)
How should a client announce the encoding used? The "charset" parameter
in Tomcat is proprietary, AFAIK. I'd be interested to know why Tomcat went
the charset=xxx way in URIs instead of following the W3C recommendations.

The algorithm is to check whether the octets are valid UTF-8 (pretty easy)
and, if not, to fall back to, well, 8859-1, which I'd assume is the most
commonly used _other_ encoding. A server could have setup parameters for the
fallback encoding(s).
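
A minimal sketch of that check in Java, under some assumptions: the class
and method names (UriDecodeSketch, percentDecode, decode) are made up for
illustration and are not part of Slide or Tomcat, and the input is assumed
to be an ASCII-only, percent-escaped URI path. It unescapes the path to raw
octets, tries a strict UTF-8 decode, and uses the configured fallback only
if that fails.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not an existing Slide or Tomcat class.
final class UriDecodeSketch {

    // Hex value of an ASCII hex digit, or -1 for anything else.
    private static int hex(char c) {
        return c < 128 ? Character.digit(c, 16) : -1;
    }

    // Turn %XX escapes into raw octets. Assumes an ASCII-only escaped path.
    static byte[] percentDecode(String path) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c > 127) {
                throw new IllegalArgumentException(
                        "unescaped non-ASCII character at index " + i);
            }
            if (c == '%' && i + 2 < path.length()) {
                int hi = hex(path.charAt(i + 1));
                int lo = hex(path.charAt(i + 2));
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            out.write(c); // literal character, including a malformed '%'
        }
        return out.toByteArray();
    }

    // Strict UTF-8 first; if the octets are not valid UTF-8, use the fallback.
    static String decode(String path, Charset fallback) {
        byte[] octets = percentDecode(path);
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(octets))
                    .toString();
        } catch (CharacterCodingException e) {
            return new String(octets, fallback);
        }
    }
}
```

With StandardCharsets.ISO_8859_1 as the fallback, both "M%C3%BCller" (UTF-8)
and "M%FCller" (8859-1) come out as the same name. The check is a heuristic:
some 8859-1 byte sequences also happen to be valid UTF-8 and would be read as
UTF-8. Also note that 8859-1 has no euro sign, so a client sending it as a
single octet is probably using something like windows-1252 instead.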

> With the above changes we got Japanese and German characters to work
> properly.
What if the server is installed on a German Windows?

>
> Best regards,
>
> Juergen
>
>
>
>
>  -----Original Message-----
> From:         Stefan Eissing [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, March 05, 2002 17.47 PM
> To:   Slide Users Mailing List
> Subject:      Re: Special letters
>
> Well, depending on which client is making the request on Windows
> (WebFolders, IE or Office), it uses UTF-8 or (on my German Win2k
> installation) 8859-1 as the character encoding for URIs.
>
> A server can try to fall back to 8859-1 if the URI does not
> look like valid UTF-8...
>
> The best test case is still the euro sign.
>
> On Tuesday, 5 March 2002, at 17:32, Remy Maucherat wrote:
>
>>> This sounds like Slide/Tomcat is not defaulting to UTF-8 encoded
>>> URIs.
>>
>> I think it does. I've had a lot more success with TeamDrive, or when
>> using an HTTP browser to access the server, so it may be i18n problems
>> with the current MS client.
>> (I'm not 100% sure about that; it's just my current theory)
>>
>> Remy
>>
>>
>> --
>> To unsubscribe, e-mail:   <mailto:slide-user-
>> [EMAIL PROTECTED]>
>> For additional commands, e-mail: <mailto:slide-user-
>> [EMAIL PROTECTED]>
>>