André Malo <[EMAIL PROTECTED]> writes:

>> Not really. Iso-2022-jp is the most auto-detection friendly
>> encoding because of infamous escape sequence but shift_jis
>> would be OK, too. I'm +-0 on conversion at the moment
>> because it hasn't caused me much trouble so far.
>
> Well, then we should stick with it.
>
> I find it just annoying that the recoding is not stable (i.e. the xalan
> serializer output differs from version to version and depending on other
> things like moon phases or so :).

From my experience, it was the Java version that mattered. Since I
upgraded to JDK 1.4, I haven't had any problems with the encoding.

The reason this happens is that iso-2022-jp is a stateful encoding.
An escape sequence switches the state, and the bytes that follow are
interpreted according to that state: after one escape sequence they
are read as ASCII characters, and after another they are read as
Japanese characters. Because of this, you can have bogus escape
sequences, such as switching to another state and then immediately
going back to the previous one. Such sequences change the bytes of
the file without changing the text. There were lots of them when I
was using JDK 1.2. I'm hoping this won't happen anymore since all
files have been re-encoded with the newer JDK. If it happens again,
I would give +1 for changing the generated files to shift_jis.

> (though... actually I don't know if shift_jis would make the things better)

Yes, shift_jis would make it easier because it's a plain 8-bit
character encoding scheme. There is only one way to encode a
character.

-- 
Yoshiki Hayashi
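
P.S. To make the point about bogus escape sequences concrete, here is
a minimal sketch in Python. It is only an illustration: it uses
Python's built-in codecs, not xalan or the JDK, and all variable names
are made up for the example. It assumes the decoder accepts an escape
pair with nothing between them, which CPython's iso2022_jp codec does
as far as I know.

```python
# Escape sequences defined for ISO-2022-JP (RFC 1468).
ESC_TO_JIS = b"\x1b$B"    # switch to JIS X 0208 (two-byte Japanese)
ESC_TO_ASCII = b"\x1b(B"  # switch back to ASCII

text = "abc\u65e5\u672c"  # "abc" followed by two kanji

# What a well-behaved encoder produces: one switch in, one switch out.
canonical = text.encode("iso2022_jp")

# Hypothetical "noisy" output: a switch to JIS that is immediately
# undone. It adds six bytes but no characters.
noisy = ESC_TO_JIS + ESC_TO_ASCII + canonical

# Both byte strings decode to the same text, so the files differ
# byte-for-byte even though their content is identical.
same_text = noisy.decode("iso2022_jp") == text
different_bytes = noisy != canonical

# Decoding and re-encoding drops the redundant escapes again.
normalized = noisy.decode("iso2022_jp").encode("iso2022_jp")

# Shift_JIS has no escape sequences, so this kind of noise cannot occur.
sjis = text.encode("shift_jis")
has_escape = b"\x1b" in sjis

print(same_text, different_bytes, normalized == canonical, has_escape)
```

Comparing the decoded text rather than the raw bytes is one way to
tell a real content change from encoder noise when diffing generated
files.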
