Of course I spoke too soon...

The problem with this setup is that you get mixed encodings. I want to use ISO-8859-1 as the default system encoding (-Dfile.encoding="ISO-8859-1"), but I have to set URIEncoding="UTF-8" in Tomcat for Slide to work. Now when you ask the request which encoding it is using, it tells you ISO-8859-1 when in fact it is always decoding with UTF-8. Forms then work OK (at least with POST), but links with special characters in their attributes (GET) do not, since the attribute is encoded in ISO-8859-1 but decoded as UTF-8.
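
To illustrate the mismatch described above, here is a minimal Java sketch. It assumes Java 10 or later (for the Charset overloads of URLEncoder and URLDecoder), and the class and method names are made up for illustration; they are not part of Slide or Tomcat. It percent-encodes a value with one charset and decodes it with another, roughly the way a GET link written in ISO-8859-1 meets a connector that decodes as UTF-8.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MixedEncodingDemo {
    // Hypothetical helper: percent-encodes with one charset, decodes with another.
    static String roundTrip(String value, Charset encodeWith, Charset decodeWith) {
        String encoded = URLEncoder.encode(value, encodeWith);
        return URLDecoder.decode(encoded, decodeWith);
    }

    public static void main(String[] args) {
        String value = "ma\u00F0ur"; // "madur" with an Icelandic eth as third letter

        // Same charset on both sides: the value survives.
        System.out.println(
            roundTrip(value, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(value));

        // Encoded as ISO-8859-1 but decoded as UTF-8: the value is damaged.
        System.out.println(
            roundTrip(value, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8).equals(value));
    }
}
```

The first line should print true and the second false, because the single byte that ISO-8859-1 produces for the eth is not a valid UTF-8 sequence on its own.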

So, all my babble aside, the ONLY solution is to set everything to UTF-8: file.encoding, the page encoding (in the HTML) and the Tomcat attribute URIEncoding in server.xml.
Fortunately, at least with Oracle and the thin driver, that works with an ISO-8859-1 database, since every ISO-8859-1 character also exists in Unicode and so has a one-to-one mapping in UTF-8.
However, this makes Slide unusable with other "stranger" encodings, such as those used for Russian, UNLESS you have a Unicode database.
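
A small Java sketch of why this works for ISO-8859-1 but not for Russian text. The class and method names are invented for illustration. Note that only the character repertoire is shared: the bytes differ, since characters above 0x7F take two bytes in UTF-8.

```java
import java.nio.charset.StandardCharsets;

class Latin1SubsetDemo {
    // True if every ISO-8859-1 byte decodes to the Unicode code point
    // with the same numeric value (the "1-1 mapping").
    static boolean latin1MapsOneToOne() {
        for (int b = 0; b < 256; b++) {
            String s = new String(new byte[] {(byte) b}, StandardCharsets.ISO_8859_1);
            if (s.length() != 1 || s.charAt(0) != b) {
                return false;
            }
        }
        return true;
    }

    // True if the text can be stored in an ISO-8859-1 column without loss.
    static boolean fitsInLatin1(String text) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(latin1MapsOneToOne());                               // true
        System.out.println(fitsInLatin1("ma\u00F0ur"));                         // true
        System.out.println(fitsInLatin1("\u041F\u0440\u0438\u0432\u0435\u0442")); // false (Cyrillic)
    }
}
```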


I looked at the patch that Thomas put in Bugzilla and it looks like it could work, but I think the FAQ or the Wiki should contain clear instructions on how to use Slide in a non-UTF-8 environment.

Thanks for all the help guys, hope this will be resolved in the next minor version.

Best Regards

Eirikur S. Hrafnsson, [EMAIL PROTECTED]
Chief Software Engineer
Idega Software
http://www.idega.com



On 26.12.2004, at 05:08, Eirikur Hrafnsson wrote:

Wow thanks a lot : ) it worked.

Setting the value to UTF-8 changed everything. Setting it to ISO-8859-1 worked exactly like before. The URIEncoding attribute did not exist in the Tomcat I downloaded (not from Slide, since we are integrating it ourselves), so it defaults to ISO-8859-1 according to the docs and won't work with Slide 2.1RC1. Maybe the FAQ should say that if you download Tomcat from the Jakarta project and create your own stores, you should set the value to UTF-8 in server.xml? Or maybe it won't be an issue once we have the patch in Slide 2.1.1 that you said Thomas has?

Well at least now we can continue with our project and hopefully finish in time ; )

Best regards and thanks for the help...
Eiki, idega.


On 24.12.2004, at 11:48, Oliver Zeigermann wrote:

Yes, sorry, it is URIEncoding. This is the connector entry in
server.xml which is set in the Tomcat bundle by default:

<Connector port="8080"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
debug="0" connectionTimeout="20000"
disableUploadTimeout="true" URIEncoding="UTF-8"/>


Oliver

On Fri, 24 Dec 2004 09:46:46 +0000, Eirikur Hrafnsson <[EMAIL PROTECTED]> wrote:
Actually the Tomcat docs say:
"URIEncoding

This specifies the character encoding used to decode the URI bytes,
after %xx decoding the URL. If not specified, ISO-8859-1 will be used.
"
As opposed to UTF-8.


-Eiki

On 24.12.2004, at 09:37, Eirikur Hrafnsson wrote:


On 23.12.2004, at 17:47, Oliver Zeigermann wrote:

Certainly not a showstopper. Encoding is set in the Tomcat connector
and defaults to UTF-8.
Great I hope that works but how is it defined and in which connector
in the server.xml? (with and without apache)
What's the point of the URIEncoding parameter in Slide.properties then?



Thomas had a patch that can find out in most scenarios what the URL is
encoded in and I guess he will add it to 2.2 and 2.1.1
That would be excellent, but could someone send me the patch now or
point me in the right direction?

Best regards

Eiki, Idega Software



Oliver


On Thu, 23 Dec 2004 16:07:41 +0000, Eirikur Hrafnsson <[EMAIL PROTECTED]>
wrote:
Hi,

I think there is a serious bug in the way Slide handles URI encoding
that I think should not go into the 2.1 release. (Sorry if the email is
a little long... our fairly big project depends on this working
a.s.a.p.)


I observed that neither Swedish nor Icelandic letters (nor Latvian for
that matter) make it correctly to the Slide store (any type) when you
are using the "ISO-8859-1" encoding (like most western countries) on
the server.


When stepping through the WebdavServlet I noticed that the characters
become a mess before "and" after a seemingly pointless "encode" that is
followed directly by a "decode".


I think the problem is as follows:
In Slide there are at least two points where a URI is
encoded/decoded before being used.
1. In WebdavResource when you try to upload a file by specifying the
path and an InputStream with the method
public boolean putMethod(String path, InputStream is)
...
PutMethod method = new
PutMethod(URIUtil.encodePathQuery(path));
...
As you can see from this code, it encodes the path with the system's
default encoding, which by the way on OS X at least is the infamous
MacRoman character set, and unfortunately there is no way for the
developer to specify an alternative encoding here. This causes one
problem right away on the server side, because it tries to
encode+decode the string using ISO-8859-1 and of course fails miserably.


2. So... of course I just copied all the relevant code from the putMethod
to my class and tried again using
URIUtil.encodePathQuery(path,"ISO-8859-1").
This time the path makes it safely to the encoding+decoding stage in
the WebdavServlet and looks like this (hope you can view Icelandic
characters...)
"/files/Áíóéúþðöæ-maður.doc"

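The difference between relying on the platform default and naming the charset explicitly can be sketched with the JDK alone. The helper below is hypothetical and is not the URIUtil API from HttpClient; it percent-encodes the bytes of a path in a caller-supplied charset, and assumes an ASCII-compatible charset such as UTF-8 or ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetPathEncoder {
    // Characters left untouched besides ASCII letters and digits.
    private static final String SAFE = "-._~/";

    // Hypothetical helper: percent-encodes a path using the given charset
    // instead of whatever the platform default happens to be.
    static String encodePath(String path, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : path.getBytes(charset)) {
            int v = b & 0xFF;
            boolean plain = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || SAFE.indexOf(v) >= 0;
            if (plain) {
                out.append((char) v);
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String path = "/files/ma\u00F0ur.doc";
        // Depends on the machine (MacRoman on the OS X setup described above).
        System.out.println(Charset.defaultCharset() + ": "
                + encodePath(path, Charset.defaultCharset()));
        System.out.println(encodePath(path, StandardCharsets.ISO_8859_1)); // /files/ma%F0ur.doc
        System.out.println(encodePath(path, StandardCharsets.UTF_8));      // /files/ma%C3%B0ur.doc
    }
}
```

The same path produces different percent-encoded bytes per charset, so client and server have to agree on one.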

Then next in the service method the problem starts
protected void service(..)
...
req.setAttribute("slide_uri", WebdavUtils.getRelativePath(req,
(WebdavServletConfig) getServletConfig()));
...
I don't understand why, but in WebdavUtils.getRelativePath(...) the path
is first encoded and then decoded back (why not use the original?).
The encoding starts....


decodeURL(fixTomcatURL(result, "UTF-8"));

OK... so the path, which still looks readable, first gets encoded with
ISO-8859-1 (that I set in Slide.properties btw) in
URLUtil.URLEncode(path, enc);
Now the string looks like this
"/files/%C3%81%C3%AD%C3%B3%C3%A9%C3%BA%C3%BE%C3%B0%C3%B6%C3%A6-ma%C3%B0ur.doc"


I don't really know if it is still OK, but let's continue anyway... the
last phase before the path gets totally ruined is the decodeURL part
public static String decodeURL(String path) {
    return decodeURL(path, Configuration.urlEncoding());
}


(One point here. This method call gets the desired encoding right but
has no fallback like the "UTF-8" in the encode part, other than the
current system encoding, which makes it incompatible with the encoded
string right away if that encoding is not UTF-8.)


In decodeURL(path, Configuration.urlEncoding()) you hit a brick wall.
In there is a method call that destroys the path, namely
String normalized = URLUtil.URLDecode(path, enc); which leads to a
return URLDecode(bytes, enc);


The final path that then gets saved (to the database in the URI table)
is this poor bugger
"/files/�?�?�?̩̼�?̡̦�?-ma̡ur.doc"
and then you have a file you can't even download or a folder that you
can't browse (if you were using a .mkCol(..))
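
The decode step can be reproduced outside Slide with the JDK's URLDecoder (Java 10 or later assumed; the class below is only an illustration and does not call Slide's URLUtil). It takes the percent-encoded string quoted above and decodes it once as UTF-8 and once as ISO-8859-1. The second result is the same kind of garbage as the saved path, though not necessarily byte for byte what Slide wrote.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class DecodeMismatchDemo {
    // The percent-encoded path from the message above, joined onto one line.
    static final String ENCODED =
        "/files/%C3%81%C3%AD%C3%B3%C3%A9%C3%BA%C3%BE%C3%B0%C3%B6%C3%A6-ma%C3%B0ur.doc";

    // Decoding with the charset the bytes were produced in restores the path.
    static String decodeAsUtf8() {
        return URLDecoder.decode(ENCODED, StandardCharsets.UTF_8);
    }

    // Decoding the same bytes as ISO-8859-1 turns every two-byte
    // UTF-8 sequence into two unrelated characters.
    static String decodeAsLatin1() {
        return URLDecoder.decode(ENCODED, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(decodeAsUtf8());
        System.out.println(decodeAsLatin1());
        System.out.println(decodeAsUtf8().length() + " vs " + decodeAsLatin1().length());
    }
}
```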


So can anybody help me fix this, or am I just crazy and Slide is working
for everybody else? (Anybody else using ISO-8859-1?)

I have a vague idea of what I think might be causing this but I would
like to see the whole roundabout fixed permanently.
One idea is to always send and receive the String as UTF-8 but then
change it to the desired encoding just before writing to Slide.
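
A rough sketch of that idea, using only the JDK. The class and method names are hypothetical and nothing here is Slide code: the path always travels as UTF-8 and is converted to the store's charset at the last moment, failing loudly instead of silently writing garbage when a character does not fit.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StoreTranscoder {
    // Hypothetical helper: takes UTF-8 bytes from the wire and returns the
    // bytes to write to a store that uses storeCharset.
    static byte[] toStoreBytes(byte[] utf8FromWire, Charset storeCharset) {
        String text = new String(utf8FromWire, StandardCharsets.UTF_8);
        try {
            // A fresh encoder reports unmappable characters instead of replacing them.
            ByteBuffer buf = storeCharset.newEncoder().encode(CharBuffer.wrap(text));
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(
                "Not representable in " + storeCharset + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = "/files/ma\u00F0ur.doc".getBytes(StandardCharsets.UTF_8);
        byte[] stored = toStoreBytes(wire, StandardCharsets.ISO_8859_1);
        System.out.println(wire.length + " UTF-8 bytes -> " + stored.length + " ISO-8859-1 bytes");
    }
}
```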


Best Regards

Eirikur S. Hrafnsson, [EMAIL PROTECTED]
Chief Software Engineer
Idega Software
http://www.idega.com

On 23.12.2004, at 12:32, Daniel Florey wrote:

I've no clue how the 2.4.1 made it into the binary package. I've built
the packages and they seem to look right.
Can you check if you have some weird build.properties in your ${home},
James?
I've added the xml-im-exporter needed by the clientlib to the dist
target.
Would you be so kind as to build the distro for the clientlib once
again?
Thanks,
Daniel


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:slide-dev-return-15249-[EMAIL PROTECTED] On behalf of Daniel Florey
Sent: Thursday, 23 December 2004 13:22
To: 'Slide Developers Mailing List'
Subject: RE: 2.1 Release


Hi James,
The webdavclient version number is still 2.4.1 (??). Something must be
messed up with the build script.
Also the xml-im-exporter needed by the webdavclient is missing in the
distribution. I'll fix that now and would be very happy if you could do
the distro again.


Cheers,
Daniel

-----Original Message-----
From:
slide-dev-return-15232-apmail-[EMAIL PROTECTED]
[mailto:slide-dev-return-15232-apmail-[EMAIL PROTECTED] On behalf of James Mason
Sent: Wednesday, 22 December 2004 18:17
To: Slide Developers Mailing List
Subject: 2.1 Release


I've uploaded everything to
http://cvs.apache.org/~masonjm/slide_2.1/
Can someone have a look and make sure I haven't missed anything? I want
to double check before I put this somewhere that it's going to get
mirrored :).


-James


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
