Re: Expat XML Parser Full Character Encoding Support

Michael B. Allen Tue, 21 Jan 2003 11:53:07 -0800

On Tue, 21 Jan 2003 13:21:30 +0100 (CET)
Bruno Haible <[EMAIL PROTECTED]> wrote:


> > Is there a way to determine how many bytes will be needed to
> > represent each character in a character set?
> 
> Yes, just take a look at the conversion tables, e.g. in
> libiconv/tests/*.TXT.

Mmm. Yes, this appears to be precisely what I need. So the first column
is a big-endian representation of the multibyte sequence corresponding
to the UCS code in the right column? If so, I could generate the maps
from that information and use the libiconv *_mbtowc functions to do the
multibyte conversions.
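
To make the idea concrete, here is a rough sketch of how the byte counts
could be pulled out of such a table. It assumes each data line holds two
whitespace-separated hex columns (multibyte sequence, then UCS code) and
that '#' starts a comment; I have not checked that every tests/*.TXT file
follows that layout. The names parse_table and length_by_lead_byte are
made up for illustration, and the three-line sample is inline rather than
read from a real file.

```python
import io

def parse_table(lines):
    """Yield (multibyte_bytes, ucs_code) pairs from a two-column hex table.

    Assumed line layout: "<multibyte hex> <UCS hex>", optional '#' comment.
    """
    for line in lines:
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue                            # not a mapping line
        mb_hex = fields[0]
        if mb_hex.lower().startswith("0x"):
            mb_hex = mb_hex[2:]
        if len(mb_hex) % 2:
            mb_hex = "0" + mb_hex               # pad to whole bytes
        # The first column is read as big endian: leftmost byte comes first.
        yield bytes.fromhex(mb_hex), int(fields[1], 16)

def length_by_lead_byte(pairs):
    """Map each lead byte to the set of sequence lengths seen for it."""
    lengths = {}
    for mb, _ucs in pairs:
        lengths.setdefault(mb[0], set()).add(len(mb))
    return lengths

# Illustrative sample in the assumed format, not copied from a real table.
SAMPLE = "0x41\t0x0041\n0x8140\t0x3000\n0x8141\t0x3001\n"

pairs = list(parse_table(io.StringIO(SAMPLE)))
lengths = length_by_lead_byte(pairs)
for lead in sorted(lengths):
    print("lead byte 0x%02X: lengths %s" % (lead, sorted(lengths[lead])))
```

Grouping by lead byte is only one possible choice. If a lead byte turns up
with more than one length, the encoding's sequence length cannot be
decided from the first byte alone, and a per-byte length map would not be
enough for it.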

> 
> > Can I dynamically generate this information with Markus Kuhn's perl
> > tools or by some other means?
> 
> If you want it to be slow, you can certainly use perl for that
> purpose.

Well, I just meant to generate the maps once, but it looks like your
tests/*.TXT maps will do the job.

Incidentally, why is there no ISO-2022-JP.TXT?

Mike

-- 
A  program should be written to model the concepts of the task it
performs rather than the physical world or a process because this
maximizes  the  potential  for it to be applied to tasks that are
conceptually  similar and, more important, to tasks that have not
yet been conceived. 
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
