Hey there. I was in Prague this past week and got some fresh air there... What a beautiful city.
---------------------------

It appears that Apache's Java-based XML projects rely on some "configuration files" in order to operate. Those files are read through a Reader object (such as java.io.BufferedReader or java.io.InputStreamReader) that is constructed on the input stream without naming an encoding, so it falls back to the platform-default encoding.

For example, in the Xalan-J project, class org.apache.xalan.serialize.CharInfo reads XMLEntities.res or HTMLEntities.res through a BufferedReader constructed on an InputStreamReader, which in turn is constructed on an InputStream, without mentioning the encoding in which the file is stored - thus assuming the platform-default encoding.

This works pretty well in ASCII environments, since those text files really are stored as ASCII text, so the platform-default encoding is good enough. However, a problem arises when trying to use Xalan-J in a non-ASCII environment, such as IBM's OS/390, which uses EBCDIC. The product simply doesn't work.

I have already opened a bug report about the specific problem in org.apache.xalan.serialize.CharInfo (see the link to bug #4000 at the bottom), and I believe it should be fixed, but that might not be enough. Xalan-J was just an example. If I recall correctly, Xerces-J has this problem too.

It's obvious that if we want to make Apache's Java-based XML projects as portable as possible, then every place where a text file that is NOT encoding-standardized is read (unlike manifest files, for example, which must be UTF-8) should NOT make any assumption about the platform-default encoding. If Xalan is built in an ASCII environment and its configuration files are in ASCII, then ASCII should be named explicitly when reading the configuration file.

I'm not talking about the actual way of doing this (hard-coding it into the source, or reading some properties from the manifest files, which are standardized to UTF-8, so no problem there); I'm just raising the problem.

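
A minimal sketch of what naming the encoding explicitly could look like, assuming the resource files are stored as US-ASCII. The class name ExplicitEncodingReader, the helper readLines and the sample entity lines are made up for illustration and are not taken from the Xalan-J source:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ExplicitEncodingReader {

    // Assumption: the resource file is stored as US-ASCII text.
    public static final Charset RESOURCE_ENCODING = StandardCharsets.US_ASCII;

    // Reads all lines from the stream, decoding bytes with the given charset
    // instead of whatever the platform default happens to be.
    public static List<String> readLines(InputStream in, Charset encoding) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, encoding))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample content standing in for an entities file.
        byte[] asciiBytes = "quot 34\namp 38\n".getBytes(RESOURCE_ENCODING);
        List<String> lines =
            readLines(new ByteArrayInputStream(asciiBytes), RESOURCE_ENCODING);
        System.out.println(lines);
    }
}
```

The only difference from the platform-default variant is the second argument to InputStreamReader. With it, the same bytes decode to the same characters whether the JVM's default encoding is ASCII-based or EBCDIC-based.
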
I think we should state our position on this subject. We can go and fix this wherever needed; we can also just state that "these are the changes that must be made in order for this product to run in a non-ASCII environment", etc. Either way, some action is needed. My opinion is that the code should be as portable as possible, with no modifications required from the user (such as converting ASCII to EBCDIC).

Any comments, please?

- Isaac

P.S. Here's the link to the original problem I reported in Bugzilla - http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4000