Sylvie Perrin wrote:
Christopher,
Here is the stack trace of the FileNotFoundException:
java.io.FileNotFoundException: /home/me/mountDir/fichi��.txt (No such file or
directory)
Sylvie,
maybe what appears above shows the origin of the problem, and explains
what I was trying previously to tell you.
It is difficult to be sure, because (again) there are several layers of
encoding/decoding between your logfile, and how it may show up in this
email.
The problem is not your problem per se. You are not necessarily doing
anything wrong. The problem is basically in the lack of a common
standard between different OSes and filesystem types, about how to
represent filenames containing non-US-ASCII characters.
Below, I am trying to explain the root of the problem, concisely but
fully. It *is* a complex matter, that's why it is confusing. But you
are not alone in being confused or puzzled. Unless one has had to deal
with such issues many times, it is really easy to get confused, because
in this case, what one sees is not necessarily what one gets.
Assuming that what I see above is also what you see in the logfile
("fichi" + 2 strange characters + ".txt") :
- java is trying to open a file named "fichi" + 2 strange characters +
".txt"
- these two characters *may* be the Unicode/UTF-8 encoding of the
character "é" (e with acute accent)
- but java is not finding that file (obviously)
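
To make the "2 strange characters" less mysterious, here is a small sketch (my own illustration, not code from your application; the class name AccentBytes is made up). It shows that "é" is 2 bytes in UTF-8 but 1 byte in ISO-8859-1, and that reading the 2 UTF-8 bytes with a single-byte charset produces 2 characters instead of 1.

```java
import java.nio.charset.StandardCharsets;

public class AccentBytes {
    // Render bytes as space-separated uppercase hex, e.g. "C3 A9".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decode the UTF-8 bytes of a string as if they were ISO-8859-1.
    public static String misreadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "é" written as an escape, so the encoding of this source file does not matter.
        String e = "\u00E9";
        System.out.println("UTF-8:      " + hex(e.getBytes(StandardCharsets.UTF_8)));      // C3 A9
        System.out.println("ISO-8859-1: " + hex(e.getBytes(StandardCharsets.ISO_8859_1))); // E9
        System.out.println("misread length: " + misreadAsLatin1(e).length());              // 2
    }
}
```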
Furthermore :
The file is really located on a Windows server.
The Windows directory where the file is located is "mounted" through
the CIFS filesystem, onto a local mountpoint on your (Linux) Java and
Tomcat host.
On your Java/Tomcat host, Java is seeing the contents of this directory
*through* this CIFS filesystem mount.
In principle (but that is only an assumption here), the CIFS filesystem
code (running on the localhost) shows this (remote) directory content to
a local application "as is", without making any character set translation.
Now Java (on your local system) is trying to find this file
"fichiXX.txt", and not finding it (XX being the two unknown bytes).
That means that, on the remote system, no file with exactly the name
"fichiXX.txt" exists.
If you connect to that remote system via, for instance, a Remote Desktop
or a VNC console (or even from your local station, just browse this
"share" through the Windows Explorer), and examine the content of that
directory, you probably see a file named "fichié.txt".
But that is only what you *see*, through whatever interface you use.
In reality, the "é" in this filename may (or may not) be encoded, in the
Windows directory entry, as 2 bytes. Or it may be encoded with (for
instance) a Windows 8-bit codepage, as a single byte.
If so, that is why Java, which is trying to find this "é" as 2 bytes,
does not find it.
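
One way to check what Java really sees through the mount is to list the directory from Java and print each name as Unicode code points. This is only a diagnostic sketch (the class name ListNames is made up; pass your mount directory as the first argument). A correctly decoded "é" should show up as U+00E9. If you see something else at that position, for instance two code points or U+FFFD, then the name is being decoded differently from what you expect.

```java
import java.io.File;

public class ListNames {
    // Describe each code point of a name as U+XXXX, so invisible differences show up.
    public static String codePoints(String name) {
        StringBuilder sb = new StringBuilder();
        name.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Directory to inspect; defaults to the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        String[] names = dir.list(); // null if dir is not a readable directory
        if (names == null) {
            System.err.println("Cannot list " + dir);
            return;
        }
        for (String name : names) {
            System.out.println(name + " -> " + codePoints(name));
        }
    }
}
```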
Now comes the difficult part :
So, to solve your problem, you have to make sure that when Java is
looking for a filename which, from the Java point of view, contains an
"é" character, this Java "é" *character* (whatever its representation is
as bytes in Java) matches the byte representation of the "é" character
in the filesystem of the remote host where the file actually resides.
And the problem is that the two sides (Java on your current platform,
and the remote OS) do not necessarily agree on what this byte
representation of an "é" character is.
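
To see what the Java side assumes, you can print the JVM's encoding settings. This is a sketch under assumptions: the class name JvmEncodings is made up, and "sun.jnu.encoding" is an implementation-specific property which, as far as I know, reflects the encoding the JVM uses for file names on common JVMs; it may be absent (null) on yours. It tells you nothing about the remote side or about the CIFS mount.

```java
import java.nio.charset.Charset;

public class JvmEncodings {
    // Summarize the encodings this JVM reports. Any property may be null.
    public static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding")
                + ", sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Run it under the same user and environment as Tomcat, since on Linux these values normally depend on the locale of the process.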
For example, suppose you find the right set of measures that make your
Java program find the file in the end.
Then, you replace the Windows fileserver by a Linux server, sharing its
files through Samba.
Well, the problem may then show up again, because the encoding may be
different again.
That is why I was recommending to stick to US-ASCII names. It was not a
joke.
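
If you go that way, a check like the following could reject or flag names at the point where files get created. Again, this is just a minimal sketch with a made-up class name (AsciiNames), not a complete filename policy.

```java
import java.nio.charset.StandardCharsets;

public class AsciiNames {
    // True if every character of the name can be encoded as US-ASCII.
    public static boolean isAscii(String name) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        System.out.println(isAscii("fichier.txt"));      // true
        System.out.println(isAscii("fichi\u00E9.txt"));  // false
    }
}
```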