Re: Create FileInputStream in servlet from remote file with accentuated character name

André Warnier Mon, 21 Sep 2009 02:46:38 -0700

Sylvie Perrin wrote:

Christopher,
Here is the stack trace of the FileNotFoundException:
java.io.FileNotFoundException: /home/me/mountDir/fichi��.txt (No such file or directory)


Sylvie,

maybe what appears above shows the origin of the problem, and explains
what I was previously trying to tell you.
It is difficult to be sure, because (again) there are several layers of
encoding/decoding between your logfile and how it may show up in this
email.

The problem is not your problem per se. You are not necessarily doing
anything wrong. The problem is basically the lack of a common
standard between different OSes and filesystem types about how to
represent filenames containing non-US-ASCII characters.

Below, I am trying to explain the root of the problem, concisely but
fully. It *is* a complex matter, and that is why it is confusing. But you
are not alone in being confused or puzzled. Unless one has had to deal
with such issues many times, it is really easy to get confused, because
in this case, what one sees is not necessarily what one gets.

Assuming that what I see above is also what you see in the logfile
("fichi" + 2 strange characters + ".txt"):


- java is trying to open a file named "fichi" + 2 strange characters +
".txt"
- these two characters *may* be the Unicode/UTF-8 encoding of the
character "é" (e with acute accent)
- but java is not finding that file (obviously)
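To illustrate how one "é" can turn into 2 strange characters, here is a minimal sketch. It assumes (this is only a guess about your setup) that the name was written as UTF-8 bytes and then displayed by something that reads each byte as one ISO-8859-1 character. The class and method names are made up for the illustration.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode the text as UTF-8, then (wrongly) decode those bytes as
    // ISO-8859-1, the way a mismatched viewer or log reader might.
    static String utf8SeenAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The accented letter is written as a Unicode escape so that the
        // example does not depend on the encoding of this source file.
        String name = "fichi\u00e9.txt";
        String garbled = utf8SeenAsLatin1(name);
        System.out.println(name.length());    // 10
        System.out.println(garbled.length()); // 11: the single accented letter became two characters
    }
}
```

Which two characters actually show up depends on the charset used by whatever displays the bytes, so the output of your own logfile viewer may differ.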

Furthermore :
The file is really located on a Windows server.

The Windows directory where the file is located is "mounted" through
the CIFS filesystem onto a local mountpoint on your (Linux) Java and
Tomcat host.

On your Java/Tomcat host, Java is seeing the contents of this directory
*through* this CIFS filesystem mount.

In principle (but that is only an assumption here), the CIFS filesystem
code (running on the localhost) shows this (remote) directory content to
a local application "as is", without making any character set translation.

Now Java (on your local system) is trying to find this file
"fichiXX.txt", and not finding it (XX being the two unknown bytes).
That means that, on the remote system, this file "fichiXX.txt" does not
exist under that exact byte sequence.

If you connect to that remote system via, for instance, a Remote Desktop
or a VNC console (or even from your local station, just browse this
"share" through the Windows Explorer), and examine the content of that
directory, you probably see a file named "fichié.txt".


But that is only what you *see*, through whatever interface you use.

In reality, the "é" in this filename may (or may not) be encoded, in the
Windows directory entry, as 2 bytes. Or it may be encoded with (for
instance) a Windows 8-bit codepage, as a single byte.
If so, that is why Java, which is trying to find this "é" as 2 bytes,
does not find it.
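To make the "one byte or two bytes" point concrete, here is a small sketch that prints the bytes Java produces for "é" under a few charsets. ISO-8859-1 stands in for a Windows 8-bit codepage here (in codepage 1252 the "é" has the same byte value); which encoding your server and mount really use is not something this code can tell you. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class AccentBytes {
    // Return the bytes of the text in the given charset as hex, e.g. "C3 A9".
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // e with acute accent
        System.out.println("UTF-8:      " + hex(eAcute, StandardCharsets.UTF_8));      // C3 A9
        System.out.println("ISO-8859-1: " + hex(eAcute, StandardCharsets.ISO_8859_1)); // E9
        System.out.println("UTF-16LE:   " + hex(eAcute, StandardCharsets.UTF_16LE));   // E9 00
    }
}
```

The same character gives three different byte sequences, so a lookup made with one of them will not match a directory entry stored with another.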


Now comes the difficult part :

Thus, to solve your problem, you have to make sure that when Java is
looking for a filename which, from the Java point of view, contains an
"é" character, this Java "é" *character* (whatever its representation is
as bytes in Java) matches the byte representation of the "é" character
in the filesystem of the remote host where the file actually resides.
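One way to see what Java actually receives through the mount is to list the directory and print each name as Unicode code points, instead of trusting what a terminal shows. The sketch below does that, and also looks up a file by taking its name from the listing rather than from a typed literal. The class and method names are hypothetical, and the NFC normalization step is an extra precaution of mine, not something established earlier in this thread. This is a diagnostic only: if the mount hands Java bytes that it cannot decode, the listed names will be damaged too.

```java
import java.io.File;
import java.text.Normalizer;

class ListedNames {
    // Render a string as code points, e.g. "U+0066 U+00E9".
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Return the directory entry whose name equals 'wanted' after NFC
    // normalization, built from the name exactly as listed; null if none.
    static File findListed(File dir, String wanted) {
        String[] names = dir.list();
        if (names == null) {
            return null; // not a directory, or not readable
        }
        String target = Normalizer.normalize(wanted, Normalizer.Form.NFC);
        for (String name : names) {
            if (Normalizer.normalize(name, Normalizer.Form.NFC).equals(target)) {
                return new File(dir, name);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Pass the mountpoint as the first argument; defaults to the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        String[] names = dir.list();
        if (names == null) {
            System.out.println("Cannot list " + dir);
            return;
        }
        for (String name : names) {
            System.out.println(codePoints(name));
        }
    }
}
```

If the entry for your file shows U+00E9 where the accent should be, Java is decoding the name as you expect. If it shows something else (for example U+FFFD, or two code points), the mismatch is between the mount and the JVM's filename encoding.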

And the problem is that these two "systems" (Java and your current
platform) and the remote OS do not necessarily agree on what this byte
representation of an "é" character is.

For example, suppose you find the right set of measures that make your
Java program find the file in the end.
Then, you replace the Windows fileserver by a Linux server, sharing its
files through Samba.
Well, the problem may then show up again, because the encoding may be
different again.
That is why I was recommending to stick to US-ASCII names. It was not a
joke.
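If you do decide to stick to US-ASCII names, a check like the following could be applied wherever your application creates or accepts filenames. It is only a sketch, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class AsciiNames {
    // True if every character of the name can be encoded in US-ASCII.
    static boolean isUsAscii(String name) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        System.out.println(isUsAscii("fichier.txt"));      // true
        System.out.println(isUsAscii("fichi\u00e9.txt"));  // false
    }
}
```

A name that passes this check is represented by the same bytes in US-ASCII, ISO-8859-1, Windows codepage 1252 and UTF-8, which is the reason the encoding question does not arise for it among those encodings.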






