Short version:
Does Tomcat 5 no longer serve files with international characters in their filenames?
Long version:
Environment: Tomcat 5.1.19 on WinXP Pro
I have a file located in: <tomcat-home>/<webapps>/MyWebApp/. The filename contains
international characters: 0x305f 0x3079 0x304f (a.k.a. E3-81-9F E3-81-B9 E3-81-8F in
UTF-8).
When I navigate to the directory via http://<server>:8080/<webappname>/ I get a
directory listing of the files in that directory. I can access every file on that
list except those that contain international characters.
When I click on a filename that contains international characters, I'm sent to
http://<server>:8080/<webappname>/%E3%81%9F%E3%81%B9%E3%81%8F.xml. This is the correct
result of putting the filename through a URLEncoder with the UTF-8 character set,
which is what I assume is being done behind the scenes by the server. But the
file is not served; I get a 404 error.
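As a sanity check on the encoding step, here is a minimal sketch that reproduces the escaped name from the three code points given above. The class and method names are made up for illustration. The ISO-8859-1 decode is included only to show what a charset mismatch on the decoding side would look like; it is an assumption, not a statement about which charset the server actually uses for the request path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FilenameEncodingDemo {
    // Percent-encode a name using UTF-8, as the directory listing link appears to do.
    static String encode(String name) throws UnsupportedEncodingException {
        return URLEncoder.encode(name, "UTF-8");
    }

    // Decode a percent-encoded name with the given charset.
    static String decode(String escaped, String charset) throws UnsupportedEncodingException {
        return URLDecoder.decode(escaped, charset);
    }

    public static void main(String[] args) throws Exception {
        // 0x305f 0x3079 0x304f, the characters from the filename
        String name = "\u305f\u3079\u304f";
        String escaped = encode(name);
        System.out.println(escaped + ".xml");

        // Decoding with the same charset recovers the original name.
        System.out.println("UTF-8 round trip ok: " + decode(escaped, "UTF-8").equals(name));

        // Decoding with a different charset yields a different (9-character) string,
        // which would not match the file on disk.
        String mismatched = decode(escaped, "ISO-8859-1");
        System.out.println("ISO-8859-1 matches: " + mismatched.equals(name)
                + " (length " + mismatched.length() + ")");
    }
}
```

If the encode and decode sides disagree on the charset, the decoded path no longer names the file, which would look exactly like a missing file to the client.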
So I made some Java testing code:
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class Test {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://<server>:8080/<webapp>/%E3%81%9F%E3%81%B9%E3%81%8F.xml");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();

            // checking the headers (index 0 is the status line, whose key is null)
            String header;
            String key;
            int i = 0;
            while ((header = conn.getHeaderField(i)) != null) {
                key = conn.getHeaderFieldKey(i);
                System.out.println(key + " = " + header);
                i++;
            }

            // checking the content (reuses the connection opened above)
            InputStream is = conn.getInputStream();
            InputStreamReader isr = new InputStreamReader(is);
            int chr;
            while ((chr = isr.read()) != -1) {
                System.out.print((char) chr);
            }
            System.out.println("success");
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }
}
The headers I get back are:
HTTP/1.1 404 /<webapp>/%E3%81%9F%E3%81%B9%E3%81%8F.scene.xml
Content-Type = text/html;charset=ISO-8859-1
Content-Language = en-US
Content-Length = 1091
Date = Wed, 10 Mar 2004 18:02:01 GMT
Server = Apache-Coyote/1.1
No help there because I get those same headers when I try to access a file that
doesn't exist at all:
HTTP/1.1 404 /<webapp>/inexistent.xml
Content-Type = text/html;charset=ISO-8859-1
Content-Language = en-US
Content-Length = 1040
Date = Wed, 10 Mar 2004 18:03:22 GMT
Server = Apache-Coyote/1.1
When I try to access the input stream to read for content, I get a
FileNotFoundException.
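The FileNotFoundException by itself says nothing beyond the 404: HttpURLConnection reports a 404 response that way when getInputStream() is called. The sketch below shows this against a throwaway local server that answers every request with 404. It assumes a JDK that ships com.sun.net.httpserver (Java 8 or later for the lambda); the class and method names are hypothetical, and the local server stands in for Tomcat only to make the example self-contained.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class NotFoundDemo {
    // Request the URL and report the status code plus how getInputStream() behaved.
    static String probe(URL url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        int status = conn.getResponseCode();
        try {
            conn.getInputStream().close();
            return status + " body readable";
        } catch (FileNotFoundException e) {
            // The error page, if any, is available from getErrorStream() instead.
            InputStream err = conn.getErrorStream();
            if (err != null) {
                err.close();
            }
            return status + " FileNotFoundException";
        }
    }

    // Start a local server that always answers 404, probe it, then shut it down.
    static String demo() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            exchange.sendResponseHeaders(404, -1); // 404 with no response body
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            return probe(new URL("http://127.0.0.1:" + port + "/%E3%81%9F%E3%81%B9%E3%81%8F.xml"));
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Calling getResponseCode() first, and getErrorStream() on failure, gives access to the status and the error page without relying on the exception.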
I'm pretty confident that this problem does not exist in Tomcat 4.
I'm also pretty confident that this problem is not related to the characters being
3-byte UTF-8. I've tested using 2-byte UTF-8 (D0-9F, D1-80) and the result is the
same.
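For reference, the byte sequences quoted above can be reproduced with a short sketch. The mapping of D0-9F and D1-80 to U+041F and U+0440 is my own reading of those bytes, and the class and method names are illustrative only.

```java
import java.io.UnsupportedEncodingException;

public class Utf8BytesDemo {
    // Return the UTF-8 encoding of a string as dash-separated uppercase hex bytes.
    static String utf8Hex(String s) throws UnsupportedEncodingException {
        byte[] bytes = s.getBytes("UTF-8");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append('-');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // 2-byte sequences (Cyrillic)
        System.out.println("U+041F -> " + utf8Hex("\u041f"));
        System.out.println("U+0440 -> " + utf8Hex("\u0440"));
        // 3-byte sequences (the characters from the filename)
        System.out.println("U+305F -> " + utf8Hex("\u305f"));
        System.out.println("U+3079 -> " + utf8Hex("\u3079"));
        System.out.println("U+304F -> " + utf8Hex("\u304f"));
    }
}
```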
Is this a bug?
-Ed Toro