Sylvie Perrin wrote:
Christopher,

Here is the stack trace of the FileNotFoundException:

java.io.FileNotFoundException: /home/me/mountDir/fichi��.txt (No such file or directory)

Sylvie,

maybe what appears above shows the origin of the problem, and explains
what I was trying previously to tell you.
It is difficult to be sure, because (again) there are several layers of
encoding/decoding between your logfile, and how it may show up in this email.

The problem is not of your own making. You are not necessarily doing anything wrong. The problem is basically the lack of a common standard, between different operating systems and filesystem types, for how to represent filenames containing non-US-ASCII characters.

Below, I am trying to explain the root of the problem, concisely but fully. It *is* a complex matter, that's why it is confusing. But you are not alone in being confused or puzzled. Unless one has had to deal with such issues many times, it is really easy to get confused, because in this case, what one sees is not necessarily what one gets.

Assuming that what I see above is also what you see in the logfile ("fichi" + 2 strange characters + ".txt") :

- Java is trying to open a file named "fichi" + 2 strange characters +
".txt"
- these two characters *may* be the two bytes of the Unicode/UTF-8
encoding of the character "é" (e with acute accent)
- but Java is not finding that file (obviously)
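
To illustrate the second point above, here is a minimal sketch (the class and method names are mine, just for illustration) that prints the UTF-8 bytes of "é", and shows what happens when those same two bytes are displayed by something that assumes a single-byte charset such as ISO-8859-1. I cannot know which charset your logfile viewer or mail client actually assumed, so this only shows the general mechanism.

```java
import java.nio.charset.StandardCharsets;

class AccentBytes {
    // Hex dump of a byte array, bytes separated by spaces
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decode UTF-8 bytes with the wrong (single-byte) charset
    static String misreadAsLatin1(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "é" written as an escape, so the encoding of this source file does not matter
        String eAcute = "\u00E9";
        byte[] utf8 = eAcute.getBytes(StandardCharsets.UTF_8);
        System.out.println(hex(utf8));                        // C3 A9
        System.out.println(misreadAsLatin1(utf8).length());   // 2
    }
}
```

One character in Java becomes two bytes in UTF-8, and those two bytes show up as two "strange characters" wherever they are displayed with the wrong charset.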

Furthermore :
The file is really located on a Windows server.
The Windows directory where the file is located, is "mounted" through the CIFS filesystem, onto a local mountpoint on your (Linux) Java and Tomcat host.
On your Java/Tomcat host, Java is seeing the contents of this directory
*through* this CIFS filesystem mount.
In principle (but that is only an assumption here), the CIFS filesystem code (running on the local host) shows this (remote) directory content to a local application "as is", without making any character set translation.

Now Java (on your local system) is trying to find this file "fichiXX.txt", and not finding it (XX being the two unknown bytes). That means that, on the remote system, this file "fichiXX.txt" does not exist.

If you connect to that remote system via, for instance, a Remote Desktop or a VNC console (or even from your local station, just browse this "share" through the Windows Explorer), and examine the content of that directory, you probably see a file named "fichié.txt".

But that is only what you *see*, through whatever interface you use.
In reality, the "é" in this filename may (or may not) be encoded, in the Windows directory entry, as 2 bytes. Or it may be encoded with (for instance) a Windows 8-bit codepage, as a single byte. If so, that is why Java, which is trying to find this "é" as 2 bytes, does not find it.
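
As a sketch of that mismatch (assuming, purely for illustration, that the 8-bit codepage is windows-1252; I do not know what your server really uses), the same filename gives two different byte sequences depending on the charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class NameBytes {
    // Encode a filename with the given charset
    static byte[] encode(String name, Charset cs) {
        return name.getBytes(cs);
    }

    public static void main(String[] args) {
        String name = "fichi\u00E9.txt"; // "fichié.txt"
        // Assumption: windows-1252 stands in for "a Windows 8-bit codepage"
        Charset cp1252 = Charset.forName("windows-1252");

        byte[] asUtf8 = encode(name, StandardCharsets.UTF_8);
        byte[] asCp1252 = encode(name, cp1252);

        System.out.println(asUtf8.length);                    // 11 ("é" takes 2 bytes)
        System.out.println(asCp1252.length);                  // 10 ("é" takes 1 byte)
        System.out.println(Arrays.equals(asUtf8, asCp1252));  // false
    }
}
```

A lookup that compares names byte by byte cannot match these two, even though both "look like" fichié.txt on screen.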

Now comes the difficult part :

To solve your problem, you thus have to make sure that when Java is looking for a filename which, from the Java point of view, contains an "é" character, this Java "é" *character* (whatever its representation as bytes inside Java) ends up as the same bytes that represent the "é" character in the filesystem of the remote host where the file actually resides.

And the problem is that the two sides (Java on your current platform, and the remote OS) do not necessarily agree on what this byte representation of an "é" character is.
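
One way to see what Java itself believes is to let it list the mounted directory and print each name as Unicode code points, so that nothing depends on how a terminal or logfile viewer renders the characters. This is only a diagnostic sketch: the class name is made up, the directory is taken from the command line, and "sun.jnu.encoding" is an internal property of OpenJDK-based JVMs that may be absent elsewhere.

```java
import java.io.File;
import java.nio.charset.Charset;

class ListNames {
    // Render a string as "U+XXXX" code points separated by spaces
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Directory to inspect, e.g. the CIFS mountpoint; defaults to the current directory
        File dir = new File(args.length > 0 ? args[0] : ".");

        System.out.println("default charset : " + Charset.defaultCharset());
        // Internal property (assumption: OpenJDK-based JVM); prints "null" if absent
        System.out.println("sun.jnu.encoding: " + System.getProperty("sun.jnu.encoding"));

        String[] names = dir.list();
        if (names == null) {
            System.err.println("cannot list " + dir);
            return;
        }
        for (String n : names) {
            System.out.println(n + " -> " + codePoints(n));
        }
    }
}
```

If the name comes out with U+00E9 where the "é" should be, Java decoded it as you expect. If it shows U+FFFD or two code points such as U+00C3 U+00A9 instead, the bytes delivered by the mount and the charset Java uses for filenames do not agree.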

For example, suppose you find the right set of measures that make your Java program find the file in the end. Then you replace the Windows fileserver with a Linux server, sharing its files through Samba. Well, the problem may then show up again, because the encoding may be different again. That is why I was recommending to stick to US-ASCII names. It was not a joke.
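
If you do go the US-ASCII route, a small check like the following (a sketch, with a hypothetical helper name) could be used to reject or flag names before they are ever written to the share:

```java
import java.nio.charset.StandardCharsets;

class AsciiNames {
    // True if every character of the name can be encoded as US-ASCII
    static boolean isUsAscii(String name) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        System.out.println(isUsAscii("fichier.txt"));      // true
        System.out.println(isUsAscii("fichi\u00E9.txt"));  // false
    }
}
```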
