Where are you getting your list from?  On 1.5.0_07, at least on Linux,
UnicodeBig certainly does seem to be defined:

BeanShell 2.0-0.b1.7jpp - by Pat Niemeyer ([EMAIL PROTECTED])
bsh % print(System.getProperty("java.version"));
1.5.0_07
bsh % b = "TEST".getBytes("UnicodeBig");
bsh % print(b.length);
10
bsh % print(new String(b, "UnicodeBig"));
TEST
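
The length of 10 printed above is consistent with a two-byte byte order mark in front of the eight bytes of UTF-16 data for "TEST". A minimal sketch of that difference follows, using only UTF-16 and UTF-16BE, which every Java platform is required to support. The class and method names are made up for illustration, and it needs Java 6 or later for String.getBytes(Charset).

```java
import java.nio.charset.Charset;

public class BomLengthDemo {

    // Number of bytes produced when encoding text with the named charset.
    static int encodedLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        // UTF-16 writes a big-endian byte order mark before the data;
        // UTF-16BE writes the data only.
        System.out.println("UTF-16:   " + encodedLength("TEST", "UTF-16"));
        System.out.println("UTF-16BE: " + encodedLength("TEST", "UTF-16BE"));
    }
}
```

If a name such as UnicodeBig yields 10 bytes here, it is probably behaving like the marked UTF-16 encoding rather than like UTF-16BE.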

Neat. I just ran that under BeanShell and it worked fine (Linux, 1.5.0_08).
If you run my code below you'll see that "UnicodeBig" does not exist.

charset:UTF-16BE
  alias:X-UTF-16BE
  alias:UnicodeBigUnmarked
  alias:UTF_16BE
  alias:ISO-10646-UCS-2

XmlBeans _only_ defines UNICODEBIG to correspond to UTF-16BE. So it
still seems impossible to support UTF-16BE (IANA ISO-10646-UCS-2).
As I said above, UNICODEBIG is a java alias for UTF-16BE, so why
shouldn't XmlBeans define this mapping?
It's not (see above).

NOTE: I initially tried (and would prefer) to use ISO-10646-UCS-2 as
this is identical in IANA and Java. It does not work. No IANA/Java
translation is required and XmlBeans still gets it wrong.
Why not just use UTF-16BE, which is the canonical IANA name for this
character set?  Does XmlBeans still have the wrong behavior, for
either incoming or outgoing documents, if you specify UTF-16BE as the
charset?  If so, then I agree we have a bug.
Yes, I tried this and it didn't work.

Ok, then I agree, there is a problem. Was it the incoming or outgoing
or both that did not work? What was the failure mode?

I will post the code on Tuesday.


This is a bug.
I'm not an XmlBeans developer, but I'm not sure I agree (unless, as I
said, there is a problem with specifying UTF-16BE). That said, there
may well be, and probably are, mappings missing for certain IANA
aliases.
Since 'UNICODEBIG' is not a Java or IANA character set name, perhaps the
bug is a simple typo in the character set name. I can try fixing this
and testing it on Tuesday.

bsh % b = "TEST".getBytes("UNICODEBIG");
// Error: // Uncaught Exception: target exception : at Line: 2 : in
file: <unknown file> : .getBytes ( "UNICODEBIG" )

Interesting, so UNICODEBIG is not valid, but UnicodeBig is (as shown
above).

That is interesting. The javadocs state (and the code backs this up) that charset aliases are not case sensitive.
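
One way to check the case-sensitivity question on a given runtime is to ask java.nio.charset.Charset directly, since that is the lookup the javadocs describe. The sketch below is only a probe: the class name and helper are hypothetical, and what it prints for UnicodeBig and UNICODEBIG depends on the JVM version in use. It uses multi-catch, so it needs Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

public class CharsetNameProbe {

    // Returns the canonical name the runtime resolves `name` to,
    // or null if the name is illegal or not supported.
    static String canonicalNameOf(String name) {
        try {
            return Charset.forName(name).name();
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "UTF-16BE", "utf-16be", "UnicodeBig", "UNICODEBIG", "ISO-10646-UCS-2"
        };
        for (String name : names) {
            System.out.println(name + " -> " + canonicalNameOf(name));
        }
    }
}
```

If the two spellings of UnicodeBig resolve differently from String.getBytes(String) on the same JVM, that points at the older string-encoding path rather than at the Charset lookup itself.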

FYI the Java charset/aliases I posted were for Java 1.5. For Java 1.4.1
they are:

charset:UTF-16BE
  alias:X-UTF-16BE
  alias:UTF_16BE
  alias:ISO-10646-UCS-2

For Java 1.6 they are:

charset:UTF-16BE
  alias:X-UTF-16BE
  alias:UTF_16BE
  alias:ISO-10646-UCS-2
  alias:UnicodeBigUnmarked

Can I ask where you are getting this list from?

Sure - from the Java runtime itself:

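import java.nio.charset.Charset;
import java.util.Iterator;
import java.util.Set;
import java.util.SortedMap;

// Prints every charset the running JVM supports, followed by its aliases.
// Raw collection types are kept so this also compiles on Java 1.4.
public class DisplayCharsets {

        public static void main(String[] args) throws Exception {
                // Keys are canonical charset names, in sorted order.
                SortedMap availableCharsets = Charset.availableCharsets();
                Iterator i = availableCharsets.keySet().iterator();
                for (; i.hasNext(); ) {
                        String charsetName = (String)i.next();
                        System.out.println("charset:" + charsetName);
                        Charset charset = Charset.forName(charsetName);
                        Set aliases = charset.aliases();
                        //System.out.println("aliases:" + aliases.size());
                        Iterator j = aliases.iterator();
                        for (; j.hasNext(); )
                                System.out.println("  alias:" + j.next());
                }

        }

}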
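
To answer "which charset, if any, declares this alias?" without scanning the full listing by eye, the same runtime data can be turned into a reverse index. This is a rough sketch assuming Java 8 or later (generics, Map.putIfAbsent). The AliasIndex class is hypothetical, and the results for names like UnicodeBig will vary by JVM version.

```java
import java.nio.charset.Charset;
import java.util.Map;
import java.util.TreeMap;

public class AliasIndex {

    // Maps every canonical name and alias (compared case-insensitively)
    // to the canonical name of the charset that declares it.
    static Map<String, String> build() {
        Map<String, String> index = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        // Canonical names go in first so an alias can never shadow them.
        for (Charset cs : Charset.availableCharsets().values()) {
            index.put(cs.name(), cs.name());
        }
        for (Charset cs : Charset.availableCharsets().values()) {
            for (String alias : cs.aliases()) {
                index.putIfAbsent(alias, cs.name());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> index = build();
        String[] queries = { "UnicodeBig", "UnicodeBigUnmarked", "ISO-10646-UCS-2" };
        for (String q : queries) {
            // Prints null when no available charset declares the name.
            System.out.println(q + " -> " + index.get(q));
        }
    }
}
```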


