Khaled Hosny wrote:
On Fri, Feb 05, 2010 at 08:11:40AM +0100, Arthur Reutenauer wrote:
could you please point me to some documentation on the selene utf
library that is mentioned in the luatex reference?
  I don't think there is any, but the test file that comes with the library
sources should provide some indications, as well as the general idea that the
unicode.ascii, unicode.utf8 and unicode.grapheme each provide the same
functions as the standard string library (i.e., unicode.utf8.gmatch is a global
match function for UTF-8 strings just like string.gmatch is for ASCII strings,
etc.)
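
A minimal sketch of that parallel, for illustration only. It assumes the
selene library is exposed as the global `unicode` table, as it is in LuaTeX,
and that `unicode.utf8.len` and `unicode.utf8.gmatch` mirror `string.len` and
`string.gmatch`. The sample string and variable names are made up.

```lua
-- "héllo" written with decimal byte escapes so the example does not depend on
-- the encoding of this file: 'é' is the two bytes 195 169 in UTF-8.
local s = "h\195\169llo"

-- The standard string library works on bytes.
local byte_len = string.len(s)
print(byte_len)                      -- 6

local byte_steps = 0
for _ in string.gmatch(s, ".") do    -- "." matches one byte here
  byte_steps = byte_steps + 1
end
print(byte_steps)                    -- 6

-- Assumption: the global `unicode` table exists only inside LuaTeX, so guard
-- the calls to keep the snippet runnable in a plain Lua interpreter too.
if unicode and unicode.utf8 then
  print(unicode.utf8.len(s))         -- counts characters instead of bytes
  for ch in unicode.utf8.gmatch(s, ".") do
    io.write(ch, " ")                -- one UTF-8 character per step
  end
  io.write("\n")
end
```

Run with texlua, the guarded branch should execute as well. In a stock Lua
interpreter only the byte-oriented half runs.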

I was wondering, since luatex defaults to utf-8 everywhere, why the
built-in, non-Unicode-compliant string library isn't overridden by the
unicode library. Instead of having two libraries, make the standard
one Unicode-compliant and get rid of the separate unicode library. This
would reduce the current confusion and avoid bugs caused by code that is
not aware of this important fact. Has this ever been considered? Maybe
there are technical difficulties?

Not technical difficulties, but practical ones. The normal string
library allows arbitrary bytes to be handled, which is very useful
functionality. And as the string library has to stay, it may as well
stay in its normal form, as that avoids confusion.
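
To illustrate the point about arbitrary bytes, here is a small sketch using
only the standard string library. The byte values are made up for the example;
255 (0xFF) can never occur in valid UTF-8, so a UTF-8-only library could not
be relied on to handle this string.

```lua
-- Build a string from raw byte values, including a NUL byte and 255.
local blob = string.char(0, 255, 65)

print(#blob)                         -- 3: every byte counts, NUL included
print(string.byte(blob, 1, -1))      -- 0  255  65

-- Round trip: take the bytes apart and put them back together unchanged.
local copy = string.char(string.byte(blob, 1, -1))
print(copy == blob)                  -- true
```

This kind of byte-level handling is what is needed for binary data such as
font or image files read from Lua.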

I have thought about replacing the selene unicode library with
something else, but that is a low-priority task.

Best wishes,
Taco
