All I did was test with utf-8 and it seems to work.

Also, be careful with Windows Explorer. Its encoding differs depending on the combination of Windows version and Office version installed on your machine. I have Windows XP and Office 2003, and it seems to do a correct utf-8 encoding of the URI.

In general, clients that encode using the client character set do not really impress me :-)

When I use the latest Slide and TC 5 with the Slide encoding set to utf-8 and the TC connector set to utf-8 encoding, everything works for me. However, I only use clients that encode to utf-8.

I am not sure I understand your test correctly. Do you want a server configuration that can accept both utf-8 and other encodings?

/Jacob

----- Original Message ----- From: "Thomas Draier" <[EMAIL PROTECTED]>
To: "Slide Developers Mailing List" <[EMAIL PROTECTED]>
Sent: Friday, September 17, 2004 11:40 AM
Subject: Re: slide and encodings



interesting, i did not know about these parameters. i just tried on tomcat 5: the result of the getPathInfo() method is clearly different, url-encoded chars are now interpreted as utf-8. the useURIValidationHack was already set to false - when it is set to false, some (non-ascii) bytes are transformed to the '-' char.

but the problem is that in general, clients send iso-8859-1 or windows-1252, except for weird clients that do not know how to send non-iso characters. for example, if my filename is \u00E9-\u0153-\u20AC-\u0430.txt (LATIN SMALL LETTER E WITH ACUTE - LATIN SMALL LIGATURE OE - EURO SIGN - CYRILLIC SMALL LETTER A), it will be sent by windows explorer as %E9-[C5][93]-[E2][82][AC]-[D0][B0].txt, where [xx] is the raw byte xx. as you can see, the first char is url-encoded as iso-8859-1, while the other chars are not url-encoded and are written as utf-8.

just to compare, webdrive will send for the same file the sequence %E9-%9C-%80-?, which is completely url-encoded in windows-1252 (LATIN SMALL LIGATURE OE does not exist in iso-8859-1 but is %9C in windows-1252; EURO SIGN is %A4 in iso-8859-15 but %80 in windows-1252). the strange thing is that the cyrillic letter is passed as a ?, which tomcat interprets as the query-string separator and which therefore cuts off the end of the string, for both getPathInfo() and getRequestURI().

whatever the encoding set in tomcat, the path is not properly decoded to the original string. that's why i had to bypass the tomcat decoding and write another method that tries to guess the encoding from the raw data.

thomas
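To make the byte values quoted above easy to check, here is a small sketch that encodes the same four characters with the three charsets mentioned. The class and method names are made up for illustration and are not part of Slide or Tomcat.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FilenameBytes {
    /** Hex dump of the bytes produced when text is encoded with the given charset. */
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        // getBytes() substitutes '?' (0x3F) for characters the charset cannot represent
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // e with acute, oe ligature, euro sign, cyrillic a, separated by '-' (0x2D)
        String name = "\u00E9-\u0153-\u20AC-\u0430";
        System.out.println("UTF-8:        " + hex(name, StandardCharsets.UTF_8));
        System.out.println("windows-1252: " + hex(name, Charset.forName("windows-1252")));
        System.out.println("ISO-8859-1:   " + hex(name, StandardCharsets.ISO_8859_1));
    }
}
```

The windows-1252 line ends in 3F, i.e. a literal '?', which matches the trailing ? seen from webdrive for the cyrillic letter.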

On 17 Sept. 04, at 09:19, Jacob Lund wrote:

Did you remember to set encoding in the connector in server.xml?

in TC 5:
add URIEncoding="UTF-8" as a parameter to the connector.

in TC 4.1 (Coyote connector):
add useURIValidationHack="false" as a parameter to the connector.
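As a rough illustration of why the connector's URI encoding matters, the sketch below decodes the same percent-encoded names under two charsets. It uses java.net.URLDecoder as a stand-in only: Tomcat's own decoding is not shown here, URLDecoder additionally turns '+' into a space, and the class name is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class UriEncodingDemo {
    /** Percent-decodes the input, interpreting the resulting bytes in the named charset. */
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // A utf-8 client sends e-acute as %C3%A9, an iso-8859-1 client sends it as %E9.
        String fromUtf8Client = "%C3%A9.txt";
        String fromLatin1Client = "%E9.txt";

        // Matching charset: one character (U+00E9) followed by ".txt", length 5
        System.out.println(decode(fromUtf8Client, "UTF-8").length());
        // Wrong charset: the two bytes become two characters (U+00C3 U+00A9), length 6
        System.out.println(decode(fromUtf8Client, "ISO-8859-1").length());
        // A lone E9 byte is not valid utf-8 and is replaced by U+FFFD
        System.out.println(decode(fromLatin1Client, "UTF-8").charAt(0) == '\uFFFD');
    }
}
```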

/jacob

----- Original Message ----- From: "Thomas Draier" <[EMAIL PROTECTED]>
To: "Slide Developers Mailing List" <[EMAIL PROTECTED]>
Sent: Thursday, September 16, 2004 5:00 PM
Subject: slide and encodings



hi,
back with my encoding problems, when using strange characters in filenames ..
i've made some tests on different servers/platforms/clients - my base config is tomcat 5.0.27 on mac os; i've also tested with tomcat 4.1.29, and moved the server to windows xp. as clients, i used windows explorer, webdrive (from south river), cadaver on mac os, the mac os finder webdav client (almost unusable because of its specific mac encoding), cadaver on linux, and konqueror on linux. of course, each configuration gave me different results :-)
the getPathInfo() method is supposed to return a decoded path - but the behaviour differs between containers when strange characters are sent: tomcat 4 and 5 do not return the same value. i replaced it with a parsing of getRequestURI(), which returns what the client has sent without decoding - and behaves the same with either tomcat 4 or 5, and hopefully with other servers too.
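the idea of taking the path from the raw request uri can be sketched roughly as below. this is not the actual patch: the method name is hypothetical, the three arguments stand for the values of getRequestURI(), getContextPath() and getServletPath(), and the context and servlet paths are assumed to be plain ascii.

```java
class RawPathInfo {
    /**
     * Returns the still-encoded part of the request URI that follows the
     * context path and servlet path, or the whole URI if the prefix does not match.
     */
    static String rawPathInfo(String requestUri, String contextPath, String servletPath) {
        String prefix = contextPath + servletPath;
        if (requestUri.startsWith(prefix)) {
            // Nothing is decoded here: %XX escapes are kept for later handling
            return requestUri.substring(prefix.length());
        }
        return requestUri;
    }

    public static void main(String[] args) {
        // Hypothetical deployment: webapp at /slide, servlet mapped to /*
        System.out.println(rawPathInfo("/slide/files/%E9-%9C.txt", "/slide", ""));
    }
}
```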
i've also found some problems with the urlEncoding configuration parameter. even if the system is configured as utf-8, some clients can still send a mix of utf-8 and another encoding - so i added another parameter in Configuration to define whether utf-8 should be used or not, and kept the existing one as a "secondary" encoding. i changed the decodeString method (the method i previously added), which now decodes either utf-8 or the encoding specified in the configuration, and i updated the fixTomcatURL() method to work with these changes.
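one possible way to decode such mixed input is sketched below. it is only a guess at the general approach, not the decodeString code from the patch: each run of non-ascii bytes is tried as strict utf-8 and, if that fails, decoded with the secondary charset. all names are hypothetical, and the raw input is modelled as a string in which every char holds one byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class FallbackDecoder {
    /** Turns %XX escapes into raw bytes; every other char is taken as a single byte. */
    static byte[] percentDecode(String raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '%' && i + 2 < raw.length()) {
                int hi = Character.digit(raw.charAt(i + 1), 16);
                int lo = Character.digit(raw.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            out.write(c & 0xFF);
        }
        return out.toByteArray();
    }

    /** Decodes one run of non-ASCII bytes: strict utf-8 first, then the secondary charset. */
    static String decodeRun(byte[] bytes, int off, int len, Charset secondary) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes, off, len))
                    .toString();
        } catch (CharacterCodingException e) {
            return new String(bytes, off, len, secondary);
        }
    }

    static String decode(String raw, Charset secondary) {
        byte[] bytes = percentDecode(raw);
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < bytes.length) {
            if (bytes[i] >= 0) {               // ASCII byte, copied as-is
                sb.append((char) bytes[i++]);
                continue;
            }
            int start = i;
            while (i < bytes.length && bytes[i] < 0) {
                i++;                           // extend the run of bytes >= 0x80
            }
            sb.append(decodeRun(bytes, start, i - start, secondary));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        // Mixed form: %E9 escape followed by raw utf-8 bytes, one char per byte
        String mixed = "%E9-\u00C5\u0093-\u00E2\u0082\u00AC-\u00D0\u00B0.txt";
        System.out.println(decode(mixed, cp1252).equals("\u00E9-\u0153-\u20AC-\u0430.txt"));
    }
}
```

a heuristic like this can still guess wrong: a secondary-encoded run that happens to be valid utf-8 (for example the bytes C3 A9) will be read as utf-8.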
i changed the propfind method so that the encoding declared in the xml response matches the encoding being used (it was always returning "utf-8").
the "Destination" header should be decoded the same way as the url - for all the clients i've tested, the same form is used - getHeader() works like getRequestURI(), and does not decode anything. i do not know about the "Label" header - the clients i have, except the slide client, do not support it for now.
finally, i added a transformation for the ? character. that character does not work with most of the clients i've used, but it appears if a character not supported by the encoding specified in the configuration is sent, which then makes the file unusable.
now that seems to work fine, whatever the server, client, or encoding being used .. for what i've tested .. but i'm sure we can still find some other problems :-)
there are 5 modified files - slide.properties, Configuration, AbstractWebdavMethod, PropFindMethod and WebdavUtils - can i send the patches here or on the bugzilla ?
thomas
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


