Sylvie Perrin wrote:
Christopher, André,

Christopher Schultz wrote:

And (just to anticipate the next issue), Sylvie, does your program
actually need to read the content of the file and do something with that
content ?

Yeah, remember to use a Reader and specify the character encoding.

Yes, my program needs to do something with the content of the files in the shared Windows directory. Actually, the main action is to parse each file and read its content through an "InputStreamReader(new FileInputStream(file))".

According to what Christopher says, I always need to specify the character encoding, i.e. use "InputStreamReader(new FileInputStream(file), encoding)".
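
For illustration, here is a minimal sketch of reading a file with an explicit encoding. The class and method names (ExplicitEncodingReader, readAll) are made up for this example and are not part of the program discussed in this thread; it also assumes Java 7 or later for try-with-resources and StandardCharsets.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Hypothetical helper class, named only for this example.
final class ExplicitEncodingReader {

    // Reads the whole file as text, decoding its bytes with the given
    // charset instead of relying on the platform default encoding.
    static String readAll(String path, Charset encoding) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), encoding))) {
            char[] buffer = new char[4096];
            int n;
            // read() returns -1 at end of stream
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }
}
```

A call would look like ExplicitEncodingReader.readAll(path, StandardCharsets.UTF_8). Passing the wrong charset does not fail; it just produces different (wrong) characters, which is why the encoding has to be known in advance.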

Yes.
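If you know that all the files dropped there will be UTF-8 encoded, then specify UTF-8 as the encoding. The problem is that, if you do not control who puts files there or how, then at some point you may encounter a file whose content is encoded in, say, ISO-8859-1 instead of UTF-8. In that case, an InputStreamReader created with just a charset name will not throw an exception when it encounters something that is not valid UTF-8; it silently substitutes the replacement character U+FFFD, so the text you parse is corrupted. You only get an exception if you build the reader from a CharsetDecoder configured to report malformed input.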
You have to be prepared to deal with that.
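
One way to be prepared, sketched below under some assumptions: an InputStreamReader built from a charset name replaces invalid bytes rather than failing, so to actually get an exception on invalid UTF-8 you can hand it a CharsetDecoder set to CodingErrorAction.REPORT. The names StrictUtf8Reader and readStrict are made up for this example, and Java 7 or later is assumed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named only for this example.
final class StrictUtf8Reader {

    // Decodes the stream as UTF-8 and throws a CharacterCodingException
    // (a subclass of IOException) on the first invalid byte sequence,
    // instead of silently substituting U+FFFD.
    static String readStrict(InputStream in) throws IOException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, decoder)) {
            char[] buffer = new char[4096];
            int n;
            // read() returns -1 at end of stream
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }
}
```

The caller can then catch java.nio.charset.CharacterCodingException and decide what to do with the file: reject it, log it, or retry with a fallback encoding such as ISO-8859-1. Which of these is right depends on the requirements for the shared directory.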

The general point of all this is: until the whole computing world has agreed to use Unicode/UTF-8 encoding everywhere (in directories, in text files, in URLs, in program source code, ...), dealing with a priori unknown directory entries and text files is messy. Without additional constraints on the clients, or additional information provided separately, there is no 100% sure way to determine what you are going to get.

If, as you indicate above, you are being asked to "parse" these files, then I suppose that they must have some pre-defined form. Does that form also impose a given character set and encoding? If not yet, I strongly suggest that you try to add this to the requirements, because otherwise the application will be unreliable. Not because your programs would be bad, but because it is just impossible to be 100% reliable in such cases.


