Hello,
we're about to make a major decision about the encoding support in
Sablotron. We are interested in your opinions about this.
Apologies to all those who sent related patches long ago (Rui Hirokawa,
Igor Mikhailov, Alexander Cheshev & others). We never felt that we had a
reasonably general approach, so it seemed safer to leave the decision for
the next release (and then the next one...).
Anyway, I had thought that doing all conversions using the iconv library
might be the right solution. It has many advantages, but also serious
drawbacks (as far as the use with expat is concerned), and I came to
appreciate an alternative suggested by Rui - to let Sablotron do the
conversions itself, using the encoding tables coming from the
XML::Encoding (Perl) module. I will discuss the pros and cons in more
detail if asked, but for the moment, I'd like to ask the following.
Would anyone greatly miss any encoding that does NOT appear in the list
below? (This is the list of encodings covered by XML::Encoding.)
Big5, ISO-8859-2 to ISO-8859-9, x-euc-jp, x-euc-kr, x-sjis (Shift_JIS),
windows-1250
(plus the built-in ISO-8859-1, US-ASCII, UTF-8 and UTF-16)
Any other comments will be welcome too, especially if you have
experience with using either of these two alternatives.
Another option is IBM's ICU library. In the present situation, though,
ICU is not much different from iconv.
Tom Kaiser