Hi Mike,
Perhaps I can rephrase Misha's answer ;-):
1. EUC-JP is an encoding ("charset") that was originally created to encode
Japanese character sets such as JIS X 0208 and JIS X 0212.
2. As such, EUC-JP can be used to encode the subset of Unicode that contains
all of the characters in JIS X 0208 and JIS X 0212, etc.
3. An XML parser uses the Unicode character set internally to represent and
process character data. As such, the most natural encoding to use for an XML
file would be a Unicode encoding such as UTF-8 or UTF-16.
4. However, you can use any other encoding, provided you tag the file
appropriately (so that the parser knows what the encoding is and can
translate it to its internal representation).
5. You are not required to use EUC-JP for your Japanese XML files: you can
use the Unicode encodings directly. In some cases, though, your file
editing software may make it easier to work with EUC-JP (or
Shift-JIS/Microsoft Code Page 932).
As for an XML parser that handles all of these, I know from extensive
testing that ours does <g>. And it is worth mentioning because, in fact,
EUC-JP (like many other encodings) is perfectly interoperable for the
subset of characters that it represents. Most XML interchanges (for
example, marketplaces such as CommerceOne or Ariba) tend to prefer that
"legacy encoded" files be converted to UTF-8 for interoperability, but there
is no requirement that one do so, and many backend XML systems, *especially*
in Japan, use the non-Unicode encodings.
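
If you do need to hand a legacy-encoded file to an interchange that prefers
UTF-8, the conversion is mechanical. The following is only a rough sketch
under stated assumptions: the function name transcode_to_utf8 is
hypothetical, the input must start with an XML declaration that names its
encoding, and that encoding must be ASCII-compatible (EUC-JP and Shift-JIS
qualify, UTF-16 does not). It decodes the bytes using the declared encoding,
rewrites the declaration, and re-encodes as UTF-8.

```python
import re
import xml.etree.ElementTree as ET

# Matches an XML declaration that carries an encoding name. Every byte this
# pattern can match is ASCII, so byte offsets equal character offsets.
_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*(["\'])1\.[0-9]+\1'
    rb'\s+encoding\s*=\s*(["\'])([A-Za-z][A-Za-z0-9._-]*)\2'
)


def transcode_to_utf8(raw: bytes) -> bytes:
    """Re-encode an XML document as UTF-8 and update its declaration.

    Hypothetical helper. Input without an encoding declaration is returned
    unchanged, since XML then defaults to UTF-8 (or UTF-16 with a BOM).
    """
    match = _DECL.match(raw)
    if match is None:
        return raw
    declared = match.group(3).decode("ascii")
    text = raw.decode(declared)  # legacy bytes -> Unicode
    start, end = match.span(3)   # position of the encoding name
    text = text[:start] + "UTF-8" + text[end:]
    return text.encode("utf-8")


# Illustrative input: a tiny document stored as EUC-JP.
sample = '<?xml version="1.0" encoding="EUC-JP"?><名前>山田</名前>'.encode("euc_jp")
converted = transcode_to_utf8(sample)
root = ET.fromstring(converted)  # every XML processor must accept UTF-8
```

The character data is identical before and after; only the byte
representation and the label in the declaration change.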
Best Regards,
Addison
Addison P. Phillips
Globalization Architect / Manager, Globalization Engineering
webMethods, Inc. 432 Lakeside Drive, Sunnyvale, CA
+1 408.962.5487 (phone) +1 408.210.3659 (mobile)
-------------------------------------------------
Internationalization is an architecture. It is not a feature.
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, August 30, 2001 8:37 AM
To: Ayers, Mike
Cc: [EMAIL PROTECTED]
Subject: RE: japanese xml
I have no idea of what you're talking about.
Misha
On 30/08/2001 16:11:14 "Ayers, Mike" wrote:
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, August 30, 2001 06:06 AM
>
> > IMO, I correctly replied to Viranga's question and I've
> > no idea what you're talking about below.
>
> Let me try to put it another way. What you said may have been
> technically correct, but it was probably not worth mentioning because it
> represents a noninteroperable encoding. Perhaps I am mistaken though - do
> you know of an XML parser that can parse the encoding that you suggested?
>
>
> /|/|ike