I would like Tomcat to be able to serve a file named

http://www.whatever.com/b%f8ger.html

(note the percent-encoded international character %f8, which is ø in
ISO-8859-1, in the name)

As several people have mentioned, it takes some additional Tomcat
configuration to make this work. Unfortunately I have not been able to
make it work yet.

My assumption is that Tomcat transforms the %f8 character into a Unicode
Java character before requesting a file of that Unicode name from the
file system. My Tomcat 4.1.27 is installed on a Windows XP PC, and the
default file encoding is (as most of you know) Cp1252. What I don't
know is whether I need to take that into account: I do not know whether
the file encoding applies to file names as well as to file contents.
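
A minimal sketch of the assumption above, using the JDK's
java.net.URLDecoder as a stand-in for whatever decoder Tomcat uses
internally (that substitution is an assumption; Tomcat has its own
decoding code, and URLDecoder additionally turns '+' into a space). The
class name UriDecodeDemo is made up for illustration. It decodes the
requested name under three candidate charsets and prints the code point
of the second character:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UriDecodeDemo {

    // Percent-decode a path segment, interpreting the raw bytes in the
    // named charset. Wraps the checked exception so callers stay simple.
    static String decode(String segment, String charset) {
        try {
            return URLDecoder.decode(segment, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String requested = "b%f8ger.html";
        for (String cs : new String[] {"ISO-8859-1", "windows-1252", "UTF-8"}) {
            String name = decode(requested, cs);
            // Print the second character as a code point so the result
            // does not depend on the console's own encoding.
            System.out.printf("%-12s -> U+%04X%n", cs, (int) name.charAt(1));
        }
    }
}
```

Under ISO-8859-1 the byte F8 maps to U+00F8 (ø), so the decoded name
matches a file called bøger.html. A lone F8 byte is not valid UTF-8, so
a UTF-8 decoder cannot produce ø from it, and a lookup for the resulting
name would be expected to fail.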

I am thinking the attribute useBodyEncodingForURI should be set to false
(though I tried both) since I do not want to make Tomcat's success
depend on what browsers put into the request headers. Do you have any
comments on that?

The files which are served are xhtml files with an ISO-8859-1 encoding.
I have tried setting URIEncoding to both Cp1252 and UTF-8, but to no
avail.

Has any of you been able to make Tomcat serve such internationally
named files from a PC? Which attribute settings did you use?

Any suggestions for things I could try would be greatly appreciated!


Randahl



-----Original Message-----
From: Edward Toro [mailto:[EMAIL PROTECTED] 
Sent: 11 March 2004 19:47
To: Tomcat Users List
Subject: RE: international filenames inaccessible

It still seems incorrect for the server to decide which type of encoding
to use.  To support the portability of webapps, shouldn't each webapp
decide its own encoding?  Otherwise, once "URIEncoding=UTF-8" is set,
every webapp on the server has to send international characters in
UTF-8.  Instead, each webapp should specify the encoding it wants to use
in a header.

So the worthwhile change would be, as Yan said, to default the
useBodyEncodingForURI to true.  But if that only applies to the query
string, then it only solves part of the problem.

-ET

-----Original Message-----
From: Larry Isaacs [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 11, 2004 1:23 PM
To: Tomcat Users List
Subject: RE: international filenames inaccessible


This has been discussed on tomcat-dev pretty thoroughly
already.  Tomcat 4.1.27 and earlier were hard coded to
use UTF-8 for decoding URLs.  This allowed you to easily
develop a dependency on this "feature" and then later
discover your webapp isn't portable.  Tomcat 4.1.30 and
5.0.19 fix this by forcing you to change the default,
which supports portability, to something that does not.
Hence, no surprises with respect to portability.

Note that URL query string encoding is affected by the
useBodyEncodingForURI attribute.  Tomcat 4.1.30 defaults
this to true, to maintain the same behavior as prior
Tomcat 4.1.x versions. In Tomcat 5.0.19 it defaults to
false.  If you try to serve some webapps that aren't
using UTF-8 everywhere, you could be impacted by this.
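
To make the portability concern concrete, here is a small sketch (not
Tomcat's code; it uses the JDK's java.net.URLEncoder and URLDecoder, and
the class and method names are invented for illustration). It checks
whether a file name survives when the side producing the link and the
side decoding it assume different charsets:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UriRoundTripDemo {

    // Percent-encode a name using the given charset for non-ASCII characters.
    static String encode(String name, String charset) {
        try {
            return URLEncoder.encode(name, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Percent-decode a name, interpreting the raw bytes in the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // True if a name encoded with linkCharset comes back unchanged when
    // the receiving side decodes it with serverCharset.
    static boolean roundTrips(String name, String linkCharset, String serverCharset) {
        return decode(encode(name, linkCharset), serverCharset).equals(name);
    }

    public static void main(String[] args) {
        String name = "b\u00f8ger.html"; // written with an escape to avoid source-encoding issues
        String[] charsets = {"UTF-8", "ISO-8859-1"};
        for (String link : charsets) {
            for (String server : charsets) {
                System.out.println(encode(name, link) + " decoded as " + server
                        + ": round trip ok = " + roundTrips(name, link, server));
            }
        }
    }
}
```

The name only survives when both sides agree on the charset, which is
why a webapp that silently relied on one server-wide decoding can break
when the default changes.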

Cheers,
Larry

> -----Original Message-----
> From: Edward Toro [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, March 11, 2004 12:58 PM
> To: Tomcat Users List
> Subject: RE: international filenames inaccessible
> 
> 
> Wow, that worked!
> 
> The problem may actually be in Java rather than Tomcat.  I 
> set the DEBUG value to 1001 on a Tomcat 5 server and a 4.1.18 server 
> to check the request info.  The call to getServletPath() 
> returns a different value between 4.1.18 and the latest 
> releases.  I suppose previously Java did the decoding, but 
> now the servlet is responsible for the decoding?  Or maybe 
> the newer servers specify ISO-8859-1 instead of letting Java 
> do the work?
> 
> It's really annoying that this value overrides the use of the 
> "file.encoding" System property.  A previous "solution" 
> mentioned using that, but I couldn't get it to work.
> 
> IMO, the server should be able to serve files with 
> international file names without any extra configuration, 
> especially since it used to do it before.  UTF-8 is becoming 
> the standard for international character transmission over 
> the net, if it's not the standard already.  And UTF-8 looks 
> exactly like ASCII for all the values in the ASCII range.  Is 
> this something worth bringing up in the Tomcat-Dev group?
> 
> -ET
> 
> -----Original Message-----
> From: Larry Isaacs [mailto:[EMAIL PROTECTED]
> Sent: Thursday, March 11, 2004 12:36 PM
> To: Tomcat Users List
> Subject: RE: international filenames inaccessible
> 
> 
> See the "URIEncoding" attribute described at:
> 
> http://jakarta.apache.org/tomcat/tomcat-5.0-doc/config/http.html
> 
> The same attribute applies to Tomcat 4.1.30 as well.
> 
> I'm not aware of any specs that guarantee behavior when using
> non-ASCII characters in the URL in this fashion, but it might
> work.
> 
> Cheers,
> Larry
> 
> > -----Original Message-----
> > From: Edward Toro [mailto:[EMAIL PROTECTED] 
> > Sent: Thursday, March 11, 2004 11:10 AM
> > To: Tomcat Users List
> > Subject: international filenames inaccessible
> > 
> > 
> > Does anyone know if Tomcat 5 is supposed to serve files with 
> > international characters in their filenames?  It used to work 
> > in Tomcat 4.1.24, but stopped working in 4.1.30 and doesn't 
> > work in 5.0.19.
> > 
> > In all the versions of Tomcat I've seen, the international 
> > characters are converted using URLEncoder.encode(filename, "UTF-8") 
> > as per the standard at 
> > http://www.w3.org/International/O-URL-code.html.  But the 
> > broken servers return 404 when you try 
> > to access international filenames like that.
> > 
> > The code to interpret the encoding is provided on that w3.org 
> > page.  Why isn't it part of the server anymore?
> > 
> > -Ed
> > 
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]