Ben Pfaff <[email protected]> writes:
> I think you might be right that we need a wide-character
> [sub]string or at least a UTF-8 [sub]string.
>
> I'll think about this before I proceed.

I'm inclined to believe that we should adopt libunistring. It's
part of gnulib, so it would be easy to do so without adding extra
dependencies, and I believe that it has all the functionality
that we need.

libunistring uses uint8_t (unsigned char) for UTF-8. That's the
same as what PSPP uses currently for "union value". I'm inclined
to do this:

    * Use uint8_t for UTF-8, via libunistring.

    * Use char for multibyte strings in the current locale.

    * Change "union value" to use signed char, in place of
      its current use of unsigned char.

Then we'll be able to cleanly distinguish each of these types
(char vs. unsigned char vs. signed char) and get some help from
the compiler.
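
As a rough illustration of the kind of compiler help meant here:
in C, char, signed char, and unsigned char are three distinct
types, so a C11 _Generic selection can tell pointers to them
apart, and passing one where another is expected draws a
diagnostic. This is only a sketch. STRING_KIND, enum string_kind,
utf8_byte_length, and check_kinds are made-up names for the
example, not PSPP or libunistring API, and it assumes uint8_t is a
typedef for unsigned char, as it is on common platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the three string flavors in the proposal. */
enum string_kind { KIND_UTF8, KIND_LOCALE, KIND_VALUE };

/* Selects a label from the static pointer type of P.  This works
   because char, signed char, and unsigned char are distinct types. */
#define STRING_KIND(p) _Generic((p),                            \
    uint8_t *: KIND_UTF8,       const uint8_t *: KIND_UTF8,     \
    char *: KIND_LOCALE,        const char *: KIND_LOCALE,      \
    signed char *: KIND_VALUE,  const signed char *: KIND_VALUE)

/* Hypothetical helper that accepts only UTF-8 strings.  Returns the
   length in bytes, not in characters. */
static size_t utf8_byte_length(const uint8_t *s)
{
    return strlen((const char *) s);
}

/* Returns 1 if each pointer is classified as expected. */
static int check_kinds(void)
{
    static const uint8_t utf8_text[] = { 0xc3, 0xa9, 0 }; /* U+00E9 */
    static const char locale_text[] = "abc";
    static const signed char value_text[] = { 'a', 'b', 'c', ' ' };

    const uint8_t *u = utf8_text;
    const char *l = locale_text;
    const signed char *v = value_text;

    /* utf8_byte_length(l) or utf8_byte_length(v) would draw a
       pointer-type diagnostic from GCC or Clang, which is the point
       of keeping the three types separate. */
    assert(utf8_byte_length(u) == 2);

    return STRING_KIND(u) == KIND_UTF8
           && STRING_KIND(l) == KIND_LOCALE
           && STRING_KIND(v) == KIND_VALUE;
}
```

How loudly the compiler complains about a mismatch depends on the
compiler and warning flags in use, so the commented-out calls are
worth trying with the project's usual settings.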
What do you think?

--
Ben Pfaff
http://benpfaff.org
_______________________________________________
pspp-dev mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/pspp-dev