[Bug libgcj/31939] Command line arguments are byteswapped before being passed to the program running in custom locale.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31939

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #5 from Andrew Pinski ---
Closing as won't fix as libgcj (and the java front-end) has been removed from the trunk.
--- Comment #4 from serg at vostok dot net 2007-05-18 20:07 ---
For subject 2.

The point is to find where the arguments of int main(int argc, char** argv) are converted into java.lang.String to be passed to static void main(String[] args).

Trail:

gcc/java/jvgenmain.c:
  int main(int argc, char **argv)
    constructs C code that calls the main Java method with JvRunMain(classname, argc, argv)

libjava/prims.cc:
  void JvRunMain (jclass klass, int argc, const char **argv)
    simply calls _Jv_RunMain (klass, NULL, argc, argv, false)
  void _Jv_RunMain (jclass klass, const char *name, int argc, const char **argv, bool is_jar)
    simply calls _Jv_RunMain (NULL, klass, name, argc, argv, is_jar)
  void _Jv_RunMain (JvVMInitArgs *vm_args, jclass klass, const char *name, int argc, const char **argv, bool is_jar)
    calls JvConvertArgv (argc - 1, argv + 1)
  JArray<jstring> * JvConvertArgv (int argc, const char **argv)
    copies each argument into jbyteArray bytes, then calls new java::lang::String (bytes, 0, len) to make a String from it with the default encoding.

libjava/java/lang/String.java:
  public String(byte[] data, int offset, int count)
    calls init (data, offset, count, System.getProperty("file.encoding", "8859_1"))

libjava/java/lang/natString.cc:
  void java::lang::String::init (jbyteArray bytes, jint offset, jint count, jstring encoding)
    uses gnu::gcj::convert::BytesToUnicode::getDecoder(encoding) to make a converter and converter->read(array, outpos, avail) to convert the data

libjava/gnu/gcj/convert/BytesToUnicode.java:
  class BytesToUnicode extends IOConverter
  public static BytesToUnicode getDecoder (String encoding)
    uses either new Input_iconv(encoding) or new BytesToCharsetAdaptor(Charset.forName(encoding))

Looks like I will have to test both Input_iconv and BytesToCharsetAdaptor to see if either of them is buggy.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31939
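The last steps of the trail (copy the raw argument bytes, then build a String using the encoding named by file.encoding, falling back to 8859_1) can be illustrated with a minimal sketch in plain Java. This is not the libgcj code: the class ArgvDecodingSketch and the method convertArgs are hypothetical names, and the standard java.nio.charset.Charset API stands in for the gcj BytesToUnicode converters.

```java
import java.nio.charset.Charset;

// Hypothetical illustration only; libgcj does this in JvConvertArgv and String::init.
class ArgvDecodingSketch {
    // Decode each raw argument with the named encoding, mirroring the
    // "copy bytes, then construct a String" step described in the trail.
    public static String[] convertArgs(byte[][] rawArgs, String encoding) {
        Charset charset = Charset.forName(encoding); // throws if the name is unknown
        String[] result = new String[rawArgs.length];
        for (int i = 0; i < rawArgs.length; i++) {
            result[i] = new String(rawArgs[i], 0, rawArgs[i].length, charset);
        }
        return result;
    }

    public static void main(String[] args) {
        // Same property lookup and fallback as the String constructor quoted in the trail.
        String encoding = System.getProperty("file.encoding", "8859_1");
        byte[][] raw = { { (byte) 0xD0, (byte) 0x9F } }; // UTF-8 bytes of U+041F
        for (char c : convertArgs(raw, encoding)[0].toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

With UTF-8 the two bytes decode to the single character U+041F; with 8859_1 they decode to the two characters U+00D0 and U+009F. Either way the code units come out in the right byte order, which is why the decoder implementations are the remaining suspects.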
--- Comment #1 from serg at vostok dot net 2007-05-15 18:05 ---
This bug is relevant only for iconv implementations whose UCS-2 encoding uses a byte order different from the platform's native byte order, e.g. GNU libiconv on i386.

There are actually two subjects here.

1. Fix the source code to use a UCS-2* encoding with native byte order that is appropriate for the platform, e.g. UCS-2-INTERNAL for GNU libiconv on i386.
2. Find the bug that may leave command line arguments in the wrong byte order and fix it.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31939
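To make the symptom concrete, here is a small sketch of what a byte-order mismatch does to 16-bit code units. It does not touch iconv or libgcj; the class ByteOrderSketch and its methods are hypothetical, and Java's UTF-16BE and UTF-16LE charsets stand in for big-endian and little-endian UCS-2 (the two agree for characters in the Basic Multilingual Plane).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the byteswapping effect, not libgcj code.
class ByteOrderSketch {
    // Swap the two bytes of one 16-bit code unit.
    public static char swapBytes(char c) {
        return (char) (((c & 0xFF) << 8) | ((c >> 8) & 0xFF));
    }

    // Encode as big-endian, then decode as little-endian: this simulates a
    // converter that emits one byte order while the consumer assumes the other.
    public static String decodeWithWrongOrder(String text) {
        byte[] bigEndian = text.getBytes(StandardCharsets.UTF_16BE);
        return new String(bigEndian, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        for (char c : decodeWithWrongOrder("ab").toCharArray()) {
            System.out.printf("U+%04X%n", (int) c); // U+6100, then U+6200
        }
    }
}
```

Each character of the garbled result equals swapBytes applied to the corresponding original character, so 'a' (U+0061) turns into U+6100.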
--- Comment #2 from serg at vostok dot net 2007-05-15 18:11 ---
Created an attachment (id=13559)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13559&action=view)
A hack to use UCS-2-INTERNAL instead of plain UCS-2

For subject 1.

This works only for those who actually have UCS-2-INTERNAL, which is at least all GNU libiconv users. It is shown here to point out that only 2 source files need to change.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31939
--- Comment #3 from serg at vostok dot net 2007-05-15 18:33 ---
For subject 1.

Can java/gcj even be used without iconv in general? Considering that java.io.File assumes all file names in the OS are encoded in UTF-8 and java.lang.String stores its data in UCS-2 (UTF-16), the answer should be NO.

TODO: Add a check in configure for the presence of a UCS-2* encoding in iconv. There we can test several well-known encoding names (UCS-2-INTERNAL, UCS-2-LE, UCS-2-BE, UCS-2LE, UCS-2BE) and set HAVE_UCS2 to the name of the one with native byte order, if found. If none is found, test for plain UCS-2. If that is not found either, bail out with a message that Java needs an iconv with UCS-2 support.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31939
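The selection logic of the proposed configure check could look roughly like the sketch below. It models the decision only: a real check would live in configure and probe iconv itself. The class Ucs2ProbeSketch, the method pickUcs2Name, and the isSupported predicate (standing in for "does iconv accept this name?") are all hypothetical, and the sketch assumes that UCS-2-INTERNAL, where present, always uses native byte order.

```java
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model of the configure-time choice described in the TODO.
class Ucs2ProbeSketch {
    // Prefer a name with native byte order, then plain UCS-2, else nothing.
    public static Optional<String> pickUcs2Name(ByteOrder nativeOrder,
                                                Predicate<String> isSupported) {
        String[] candidates = nativeOrder == ByteOrder.LITTLE_ENDIAN
                ? new String[] { "UCS-2-INTERNAL", "UCS-2-LE", "UCS-2LE" }
                : new String[] { "UCS-2-INTERNAL", "UCS-2-BE", "UCS-2BE" };
        for (String name : candidates) {
            if (isSupported.test(name)) {
                return Optional.of(name);
            }
        }
        // Fallback: plain UCS-2, whose byte order is up to the iconv implementation.
        return isSupported.test("UCS-2") ? Optional.of("UCS-2") : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed example set of names; a real check would ask iconv instead.
        Set<String> available = new HashSet<>(Arrays.asList("UCS-2LE", "UCS-2"));
        Optional<String> chosen = pickUcs2Name(ByteOrder.nativeOrder(), available::contains);
        System.out.println(chosen.map(n -> "HAVE_UCS2=" + n)
                .orElse("error: Java needs an iconv with a UCS-2 encoding"));
    }
}
```

Passing the probe as a predicate keeps the ordering rules separate from the mechanism that actually queries iconv.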