On Fri, May 30, 2025 at 05:08:21PM +0200, Grégory Vanuxem wrote:
> 
> However, I noticed this about utf-8 support:
> 
> In SBCL or Clozure CL, the latter needs a special routine to handle utf-8
> strings returned from the C wrapper:
> 
> (1) -> #"Hello world, Καλημέρα κόσμε, コンニチハ"
> 
>    (1)  34
> 
> Whereas in gcl27, 2.7.1-7 (the Debian sid package):
> 
> (2) -> #"Hello world, Καλημέρα κόσμε, コンニチハ"
> 
>    (2)  59
> 
> # is the operation that returns the number of elements in an Aggregate.
> 
> So I think handling strings returned by Julia, which I sometimes use,
> mainly for formatting purposes or regular expression operations,
> will probably become difficult with gcl27, since they are in utf-8. For BLAS,
> LAPACK and purely numerical functions I don't think it will be a problem,
> but for returned strings (char *) I wonder if there is a special function
> to let GCL "know" that they are in utf-8 and handle them correctly.
> 

A nice thing about utf-8 is that in most cases code expecting
8-bit characters will handle it correctly, so the only
parts which need to know about utf-8 are the input and display
subsystems.  In particular, for regexes, in most cases you
should be able to pass utf-8 strings to an 8-bit regex engine
and obtain the correct result.
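
As a rough illustration of the point above, here is a minimal Python
sketch that runs byte-oriented regexes directly over utf-8 data.  The
sample string is the one from the quoted message; the variable names are
made up for the example, and Python's `re` in bytes mode merely stands in
for any 8-bit regex engine.

```python
import re

# utf-8 bytes, roughly what a C wrapper returning (char *) would hand over.
data = "Hello world, Καλημέρα κόσμε, コンニチハ".encode("utf-8")

# Byte-oriented pattern using only ASCII: split on comma plus whitespace.
fields = re.split(rb",\s*", data)
decoded = [f.decode("utf-8") for f in fields]

# Literal search for a multi-byte word, done purely on bytes.
needle = "κόσμε".encode("utf-8")
match = re.search(re.escape(needle), data)
found = data[match.start():match.end()].decode("utf-8")

# Caveat: "." consumes one byte, not one character.
first_unit = re.match(rb".", "κ".encode("utf-8")).group()

print(decoded)
print(found, len(first_unit))
```

The split and the literal search work because every byte of a multi-byte
utf-8 sequence has its high bit set, so ASCII pattern elements cannot
match in the middle of a character.  The "most cases" caveat shows up
with `.` and bracket expressions, which operate on single bytes.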

AFAICS the number 34 above (the count of code points) is useless
for formatting purposes.  59 looks like the correct number of bytes,
which is crucial for manipulating the string using low-level operations.
For screen positioning you need the number of positions on the
screen, which seems to be 39.  To know this you need to
know a lot of specific things, like which characters are
double width and which are combining (so do not need their
own position).
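
The three numbers discussed above can be computed with a short Python
sketch.  `display_width` is a hypothetical helper written for this
example; it only looks at the East Asian Width and combining-class
properties from `unicodedata`, so it is an approximation of what a
terminal does, not a full width table.  The byte count depends on the
exact code points in the literal (for instance how the accented Greek
letters are encoded), so it may differ slightly from the 59 reported
above.

```python
import unicodedata


def display_width(s: str) -> int:
    """Approximate number of screen columns needed to show s."""
    width = 0
    for ch in s:
        if unicodedata.combining(ch):
            # Combining marks share the cell of their base character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters occupy two columns.
            width += 2
        else:
            width += 1
    return width


text = "Hello world, Καλημέρα κόσμε, コンニチハ"

code_points = len(text)                  # what SBCL / Clozure CL report
byte_count = len(text.encode("utf-8"))   # what an 8-bit Lisp string holds
columns = display_width(text)            # what screen positioning needs

print(code_points, byte_count, columns)
```

Only the last value is usable for aligning output on the screen; the
byte count is the one needed for low-level buffer manipulation.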

BTW: Clef has to handle display issues, so I wrote a few support
routines for handling utf-8 (see 'dist_left' and 'dist_right' in
'src/clef/e_buf.c').  They use the Clef representation for the buffer,
but the idea should be clear.
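
To give a flavour of what such routines involve, here is a minimal
Python sketch of stepping over one utf-8 character in a byte buffer.
This is not the Clef code: `step_right` and `step_left` are hypothetical
names, and the assumption that the job is to return a byte distance to
the neighbouring character boundary is mine.  It also assumes the buffer
holds valid utf-8.

```python
def is_continuation(byte: int) -> bool:
    """True for utf-8 continuation bytes, which have the form 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def step_right(buf: bytes, pos: int) -> int:
    """Bytes to skip from pos to reach the start of the next character."""
    if pos >= len(buf):
        return 0
    n = 1
    while pos + n < len(buf) and is_continuation(buf[pos + n]):
        n += 1
    return n


def step_left(buf: bytes, pos: int) -> int:
    """Bytes to go back from pos to reach the start of the previous character."""
    if pos <= 0:
        return 0
    n = 1
    while pos - n > 0 and is_continuation(buf[pos - n]):
        n += 1
    return n


buf = "aκコ".encode("utf-8")   # 1-byte, 2-byte and 3-byte characters
print(step_right(buf, 0), step_right(buf, 1), step_right(buf, 3))
print(step_left(buf, len(buf)), step_left(buf, 3), step_left(buf, 1))
```

Moving the cursor by these distances keeps it on character boundaries,
so the rest of the editor can keep treating the buffer as plain bytes.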

-- 
                              Waldek Hebisch
