Re: Wide character format specifier stick in the mud

Michael B Allen Thu, 03 Jul 2003 01:38:18 -0700

On Thu, 3 Jul 2003, srintuar26 wrote:

> > I'm curious; most of the wide character functions have parameters that
> > are equivalent to their multi-byte versions. This permits you to
> > re-#define things nicely and conditionally compile using multi-byte or
> > wide characters. But the format specifiers are different for wide
> > character strings regardless. Wide character strings are %ls vs. %s.
> > Why didn't wprintf and friends use %s for wide character strings to
> > complete the abstraction?
> 
> I've found it's generally useless to write programs that support both
> wide and multibyte characters internally. You're better off picking one
> or the other and using it consistently.


I want my general purpose library (http://www.ioplex.com/~miallen/libmba/)
to work on as many systems as possible. Since Win32 is one of my target
systems I need wide character support. Actually I'm just creating an
mba/tchar.h header with a TEXT macro like the Win32 one. Now that I've
created the header it *should* be pretty easy to code for both *nix and
Win32 but I haven't actually tried the Win32 side yet. The coding rules
for UTF-8 are a superset of the coding rules to support this technique
so I assume everything will just work.
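
As a rough sketch of the technique (not the actual mba/tchar.h; the names
USE_WCHAR, tchar_t, t_strlen, t_snprintf, T_FMT_S and greet are made up
for illustration), a header like this maps one set of names onto either
the multi-byte or the wide functions. It also shows where the format
specifier gets in the way: the string conversion has to be hidden behind
a macro too, because it is %s in one build and %ls in the other.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical switch: define USE_WCHAR to build the wide variant. */
#ifdef USE_WCHAR
typedef wchar_t tchar_t;
#define TEXT(s)     L##s          /* TEXT("x") becomes L"x" */
#define t_strlen    wcslen
#define t_snprintf  swprintf
#define T_FMT_S     L"%ls"        /* wide string argument */
#else
typedef char tchar_t;
#define TEXT(s)     s             /* TEXT("x") stays "x" */
#define t_strlen    strlen
#define t_snprintf  snprintf
#define T_FMT_S     "%s"          /* multi-byte string argument */
#endif

/* Illustrative helper: format "hello, <name>" into buf and return the
 * length of the result in tchar_t units. The caller is assumed to pass
 * a buffer large enough to hold the whole string. */
static size_t greet(tchar_t *buf, size_t n, const tchar_t *name)
{
    assert(buf != NULL && n > 0);
    /* Adjacent literals concatenate, so the format is built from
     * TEXT() pieces plus the build-specific string conversion. */
    t_snprintf(buf, n, TEXT("hello, ") T_FMT_S, name);
    return t_strlen(buf);
}
```

Calling greet(buf, 64, TEXT("world")) compiles unchanged in either build;
only the T_FMT_S macro differs between them.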

> I recommend using utf-8 internally and for string literals and comments,
> and converting to/from locale-encoding on i/o. (Many programs will need

Actually my technique recently has been to just convert everything to
the locale encoding which on *nix usually means you can do UTF-8. That
also dovetails well with wchar_t support b/c of mbstowcs and wcstombs. I
thought this was how it was all meant to be....
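
A minimal sketch of that round trip, assuming a fixed 256-element
intermediate buffer and a made-up helper name (round_trip). A real
program would call setlocale(LC_CTYPE, "") first so that mbstowcs and
wcstombs use the user's locale encoding rather than the "C" locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Illustrative helper: convert a locale-encoded multi-byte string to
 * wide characters and back again. Returns 0 on success, -1 if the
 * input is invalid in the current locale or does not fit. */
static int round_trip(const char *src, char *dst, size_t dstsize)
{
    wchar_t wide[256];   /* assumed upper bound for this sketch */
    size_t n;

    assert(src != NULL && dst != NULL && dstsize > 0);

    /* mbstowcs returns (size_t)-1 on an invalid sequence; a return
     * equal to the limit means the result was not terminated. */
    n = mbstowcs(wide, src, 256);
    if (n == (size_t)-1 || n >= 256)
        return -1;

    /* wcstombs follows the same convention in the other direction. */
    n = wcstombs(dst, wide, dstsize);
    if (n == (size_t)-1 || n >= dstsize)
        return -1;

    return 0;
}
```

In a UTF-8 locale the wide form holds one wchar_t per character on
Linux, and the string comes back byte for byte the same.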

> no changes to meet this requirement.) It's better to ignore the whole
> misbegotten path of wide character c-lib functions.

Actually it all seems to work pretty well. I have test cases for each
string function from string.h, wchar.h, time.h, and a few others. About
100 functions in all. Granted they're trivial tests but everything works
except for a few type mismatch warnings and the format specifier problem.

Is there a serious flaw with wchar_t on Linux?

> C++ is not really compatible with utf-32 or utf-16: for example you
> cannot use them natively in string literals.

Well I'm using C but can you be more specific? You're saying you cannot
have wide character string literals in C++?

Mike

-- 
A  program should be written to model the concepts of the task it
performs rather than the physical world or a process because this
maximizes  the  potential  for it to be applied to tasks that are
conceptually  similar and, more important, to tasks that have not
yet been conceived. 

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
