On Sat, Sep 08, 2001 at 12:02:55PM +0100, Markus Kuhn wrote:
> > Well, I have to normalize to something!
>
> Use iconv to convert to UTF-8 or UTF-16 before you write into data streams
> that other programs than yours have to read in a locale-independent way.
Well, I'm working with data streams that have predefined character
encodings, like CIFS/SMB network messages, which use UCS-2LE (or UTF-16LE,
not sure yet), or binary file formats like Word 97, which has a variety
of string encodings from UCS-2 and CP1252 to Pascal-style weirdness.
For all of this stuff I want to normalize to some string type. So far
I have serialization primitives like:
size_t enc_strn(const wchar_t *src, unsigned int n, char *dst, size_t max, int enc);
size_t dec_str_size(const char *src, size_t max, int enc);
unsigned int dec_strn(const char *src, size_t n, wchar_t *dst, unsigned int max, int enc);
wchar_t *dec_str_new(const char *src, int enc);
so I'm leaning toward wchar_t rather than UTF-8 or UTF-16. Of course my
program will still need to communicate with the host environment, in which
case I'm just going to call wcstombs at the last moment.
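
To make the "normalize to wchar_t, wcstombs at the last moment" idea
concrete, here is a minimal sketch. It is not the enc_strn/dec_strn code
above: the names ucs2le_to_wcs, wcs_to_host and print_wire_string are
hypothetical, it handles plain UCS-2LE only (no surrogate pairs), and it
assumes the caller has already run setlocale(LC_CTYPE, "") so that
wcstombs targets the host encoding.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: decode up to n UCS-2LE code units from src into dst,
 * stopping early at a 0x0000 terminator. Writes at most max - 1 characters
 * plus a terminating L'\0' and returns the number of characters written.
 * Surrogate pairs (UTF-16) are NOT combined; each unit is copied as-is. */
size_t ucs2le_to_wcs(const unsigned char *src, size_t n,
                     wchar_t *dst, size_t max)
{
    size_t i = 0;

    assert(src != NULL && dst != NULL);
    if (max == 0)
        return 0;
    while (i < n && i + 1 < max) {
        /* low byte first, then high byte */
        wchar_t c = (wchar_t)(src[2 * i] | (src[2 * i + 1] << 8));
        if (c == L'\0')
            break;
        dst[i++] = c;
    }
    dst[i] = L'\0';
    return i;
}

/* Convert to the host's multibyte encoding as late as possible.
 * Returns 0 on success, -1 if a character cannot be represented in the
 * current LC_CTYPE locale or dst has no room for the terminator. */
int wcs_to_host(const wchar_t *src, char *dst, size_t max)
{
    size_t r = wcstombs(dst, src, max);

    if (r == (size_t)-1 || r >= max)
        return -1;
    return 0;
}

/* Usage: decode a wire string and print it in the host encoding. */
int print_wire_string(const unsigned char *wire, size_t units)
{
    wchar_t w[64];
    char out[256];

    ucs2le_to_wcs(wire, units, w, sizeof w / sizeof w[0]);
    if (wcs_to_host(w, out, sizeof out) != 0)
        return -1;
    puts(out);
    return 0;
}
```

Keeping wchar_t as the internal type means only wcs_to_host depends on
the locale; the decoders for each wire format do not.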
> Most likely, you forgot to tell the C library to initialize the locale.
> Add at the start of your program something like
>
> if (!setlocale(LC_CTYPE, "")) {
Yup.
> > Err, not with RH 6.2 glibc-2.1.3-15.
>
> Any Linux user/developer interested in locales and character sets
> is today *strongly* recommended to upgrade to a glibc 2.2 based
> distribution. There have been huge improvements between 2.1 and 2.2!
Yeah but all the new distros use kernel 2.4 which seems to be the
development kernel masquerading as the stable release. I'd like to see
some VM stability before I throw this 2.2 rock away. They only recently
discovered that page aging didn't work at all for the last 9 releases (err,
something fundamentally wrong there).
> http://clisp.cons.org/~haible/
>
> has vanished ... :-( Bruno, where art thou?
So where do people discuss libiconv problems? iconv_open is giving me "No
such file or directory". The open()s just before this message is printed are:
open("/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT
open("/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT
and iconv is giving me "Invalid argument", but I don't think I'm giving it
invalid arguments (is '\0' an incomplete multibyte sequence?).
But it all works great if I ignore errno so I'm going to piddle along
here until I get myself a glibc-2.2. Hopefully that will fix the problem.
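
For reference, errno is only meaningful after a call has actually
reported failure: iconv_open signals an error by returning (iconv_t)-1
and iconv by returning (size_t)-1, and a successful call may leave a
stale value in errno. When iconv does fail, EINVAL means the input ended
in the middle of a multibyte sequence, EILSEQ means an invalid sequence,
and E2BIG means the output buffer is full. A minimal sketch of that
checking pattern follows. The helper name utf16le_to_utf8 is
hypothetical, and the encoding names "UTF-16LE" and "UTF-8" are
assumptions that depend on what the installed iconv supports.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: convert inlen bytes of UTF-16LE text to UTF-8.
 * Returns the number of bytes written to out, or (size_t)-1 on failure.
 * errno is inspected only after a call has reported an error. */
size_t utf16le_to_utf8(const char *in, size_t inlen,
                       char *out, size_t outmax)
{
    char *inp = (char *)in;   /* iconv takes char **, but does not modify the data */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outmax;
    size_t r;
    iconv_t cd;

    cd = iconv_open("UTF-8", "UTF-16LE");   /* (tocode, fromcode) */
    if (cd == (iconv_t)-1) {
        perror("iconv_open");
        return (size_t)-1;
    }

    r = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (r == (size_t)-1) {
        int saved = errno;    /* save before any other library call */
        fprintf(stderr, "iconv: %s (%lu input bytes left)\n",
                strerror(saved), (unsigned long)inleft);
        iconv_close(cd);
        return (size_t)-1;
    }

    iconv_close(cd);
    return outmax - outleft;  /* bytes produced */
}
```

If the return values say success, whatever happens to be in errno can
be ignored safely.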
Mike
--
Wow a memory-mapped fork bomb! Now what on earth did you expect? - lkml
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/