Re: iconv limitations

Glenn Maynard Thu, 08 Apr 2004 15:42:04 -0700

On Thu, Apr 08, 2004 at 06:17:55PM -0400, Michael B Allen wrote:
> > On the other hand, the iconv API is more flexible the way it is. It
> > can handle strings with embedded zeroes,
> 
> Now *that* is rare.


I use std::string, which is 8-bit clean, and I always like to make things
remain that way unless I have a strong reason not to.
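
To illustrate "8-bit clean", here is a minimal sketch, assuming only the standard library. The helper names make_binary_string and c_visible_length are made up for this example. A std::string built from a pointer and a length keeps embedded NUL bytes, while C string functions only see the part before the first one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: build a std::string from an explicit (pointer, length)
// pair so that embedded NUL bytes are kept instead of ending the string.
inline std::string make_binary_string(const char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    return std::string(data, len);
}

// Hypothetical helper: the length a C string function would see, that is,
// the number of bytes before the first NUL.
inline std::size_t c_visible_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the five bytes 'a', 'b', NUL, 'c', 'd', size() reports 5, while c_visible_length reports 2.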

> For that use iconv.
>...
> Just because the conversion routine stops at a null terminator in the
> source doesn't mean it cannot operate on a string that is not null
> terminated. The encdec interface I described can convert non-null
> terminated strings by limiting the number of bytes inspected in src using
> the sn parameter.

I'd suggest that one shouldn't have to use two notably different interfaces
just because one's nul-termination needs differ, and that "stop on
nul" should be a conversion flag, as should other things that some need
and some don't want: replacing unconvertible characters (æ -> ?),
transliteration (Ã -> a), etc.
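
A rough sketch of what such flags could look like, using a toy ISO-8859-1 to ASCII converter. Everything here is hypothetical: the flag names, latin1_to_ascii, and the tiny transliteration table are invented for illustration and belong to neither iconv nor encdec.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical flags; none of these names come from iconv or encdec.
enum ConvFlags : unsigned {
    CONV_NONE          = 0,
    CONV_STOP_ON_NUL   = 1u << 0,  // treat a NUL byte as the end of the input
    CONV_REPLACE       = 1u << 1,  // emit '?' for characters with no ASCII form
    CONV_TRANSLITERATE = 1u << 2   // map a few accented letters to base letters
};

// Toy transliteration table for ISO-8859-1; returns 0 when there is no entry.
inline char transliterate_latin1(unsigned char c) {
    switch (c) {
        case 0xE0: case 0xE1: case 0xE2: case 0xE3: case 0xE4: return 'a';
        case 0xE8: case 0xE9: case 0xEA: case 0xEB: return 'e';
        default: return 0;
    }
}

// Convert ISO-8859-1 bytes to ASCII. Returns false when a character cannot be
// converted under the given flags; out then holds the output produced so far.
inline bool latin1_to_ascii(const std::string& in, unsigned flags, std::string& out) {
    assert((flags & ~0x7u) == 0);  // reject unknown flag bits in debug builds
    out.clear();
    for (unsigned char c : in) {
        if (c == 0 && (flags & CONV_STOP_ON_NUL)) break;
        if (c < 0x80) { out.push_back(static_cast<char>(c)); continue; }
        if (flags & CONV_TRANSLITERATE) {
            if (char t = transliterate_latin1(c)) { out.push_back(t); continue; }
        }
        if (flags & CONV_REPLACE) { out.push_back('?'); continue; }
        return false;  // unconvertible and no fallback requested
    }
    return true;
}
```

With one entry point, a caller who wants nul-terminated semantics passes CONV_STOP_ON_NUL, and a caller holding binary-safe data simply leaves it out.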

Better would be a low-level conversion interface that allows implementing
these things efficiently (which iconv doesn't), with iconv, encdec, etc.
interfaces being implemented on top of that.  At the very least, this could
solve the problem of having to lug around large conversion tables when
you outgrow iconv().
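
The layering might be sketched like this, assuming a toy ISO-8859-1 to UTF-8 conversion in place of real tables. All three function names are hypothetical, and the wrappers only loosely mimic a counted interface and a bounded, nul-stopping one (the sn parameter name is borrowed from the quoted description). The point is that both wrappers share one low-level step, so the conversion data exists only once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Low-level primitive: encode one ISO-8859-1 byte as UTF-8 into out[0..1]
// and return the number of bytes written (1 or 2).
inline std::size_t latin1_to_utf8_step(unsigned char c, char out[2]) {
    if (c < 0x80) { out[0] = static_cast<char>(c); return 1; }
    out[0] = static_cast<char>(0xC0 | (c >> 6));
    out[1] = static_cast<char>(0x80 | (c & 0x3F));
    return 2;
}

// Counted wrapper: converts exactly n bytes, embedded NULs pass through.
inline std::string convert_counted(const char* src, std::size_t n) {
    assert(src != nullptr || n == 0);
    std::string dst;
    char buf[2];
    for (std::size_t i = 0; i < n; ++i) {
        std::size_t w = latin1_to_utf8_step(static_cast<unsigned char>(src[i]), buf);
        dst.append(buf, w);
    }
    return dst;
}

// Bounded wrapper: inspects at most sn bytes and stops early at a NUL.
inline std::string convert_bounded_cstr(const char* src, std::size_t sn) {
    assert(src != nullptr || sn == 0);
    std::size_t n = 0;
    while (n < sn && src[n] != '\0') ++n;
    return convert_counted(src, n);
}
```

A real low-level layer would also need to report errors and handle stateful encodings, which this sketch leaves out.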

> pages and MIME messages with bogus length parameters. The W3C claims all
> apps should use UTF-16 internally so if you want to use those in your

FWIW, I'd say that what the W3C claims applications should use internally is
no more interesting than what the FSF claims I should eat for breakfast.  :)
(Not to mention that UTF-16 is such a horrible recommendation to be making!)

-- 
Glenn Maynard

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
