Martin Baker wrote:
>
> I know it's not fair to ask you about timescales, given all the work
> you do on this, but it would be nice to get a rough estimate so that I
> can judge whether there is a requirement for a shorter-term hack for
> Unicode.
>
Right now, using SBCL-based FriCAS, you can do the following:
1) Start FriCAS in a Unicode setting, for example:
(export LC_CTYPE=en_US.UTF-8; fricas)
Make sure your console is set to UTF-8.
2) In FriCAS:
new(1, char(8721))$String
You should see a summation sign (it looks like a capital sigma) on
the screen. In the same way you can create any Unicode character
you want.
What does not work:
1) Printing characters (that is easy to fix).
2) Functions from CharacterClass. It would be easy to change
CharacterClass so that for ASCII it works as before and
non-ASCII is ignored. Proper support requires getting a
database of data about Unicode characters. Adapting the
database for FriCAS is easy (IIUC the database can be obtained
from the Unicode website), but it adds bulk (IIUC it is several
megabytes in size). Also, naively extending CharacterClass
to full Unicode is likely to lead to poor performance
(currently CharacterClass uses bit vectors of length 256;
changing that to 1114112 means much bigger bit vectors and
consequently much more work creating them and poorer
cache utilization).
If we are satisfied with a CharacterClass which works correctly
only for ASCII, the changes can be done in a day or two. This
may be reasonable because currently CharacterClass is used
only by a few internal functions in Character, String and
StringAggregate, and those functions are only used with ASCII.
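To illustrate the ASCII-only behaviour described above, here is a
minimal sketch. It is written in Python rather than SPAD, it is not
the actual CharacterClass code, and the name AsciiCharClass and its
methods are hypothetical. It assumes "ASCII" means code points
below 128 and keeps the 256-entry bit vector mentioned above.

```python
class AsciiCharClass:
    """Hypothetical ASCII-only character class backed by a 256-bit mask."""

    SIZE = 256          # bit vector length mentioned in the text
    ASCII_LIMIT = 128   # assumption: "ASCII" means code points 0..127

    def __init__(self, chars=""):
        self.bits = 0   # a Python int used as a bit vector
        for ch in chars:
            cp = ord(ch)
            if cp < self.ASCII_LIMIT:
                self.bits |= 1 << cp
            # non-ASCII characters are silently ignored

    def member(self, ch):
        """Return True if ch is an ASCII character in this class."""
        cp = ord(ch)
        return cp < self.ASCII_LIMIT and bool((self.bits >> cp) & 1)

    def union(self, other):
        """Return a new class containing the members of both classes."""
        result = AsciiCharClass()
        result.bits = self.bits | other.bits
        return result


def bit_vector_bytes(n_bits):
    """Bytes needed to store a bit vector of n_bits bits."""
    return (n_bits + 7) // 8
```

For comparison, bit_vector_bytes(256) is 32, while
bit_vector_bytes(1114112) is 139264 (136 KiB) per class, which is
the size increase behind the performance concern above.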
For non-Unicode Lisps we could add a function, for example called
'ucodeToString', which given an integer produces the UTF-8 encoded
string corresponding to the given Unicode code point. For Unicode
Lisps 'ucodeToString' is as simple as:
ucodeToString(c : Integer) : String == new(1, char(c))$String
but for non-Unicode Lisps it is a bit more complicated. Given
'ucodeToString' one can do things like
alpha := ucodeToString(945)::Symbol
alpha^2+1
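For the non-Unicode Lisp case, the extra work is essentially the
standard UTF-8 encoding of a code point into one to four bytes.
Here is a minimal sketch of that step, in Python rather than SPAD;
the function name ucode_to_utf8 is hypothetical and this is not
FriCAS code.

```python
def ucode_to_utf8(code_point):
    """Encode one Unicode code point as UTF-8 bytes (hypothetical helper)."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates cannot be encoded in UTF-8")
    if code_point < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])
```

With this sketch, code point 945 (Greek small alpha) becomes the
two bytes CE B1, and 8721 (the summation sign) becomes the three
bytes E2 88 91.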
In principle we could provide domain(s) which would give access
to a large number of special symbols via names (I am not sure if
this is really useful).
As I wrote, simple-minded Unicode support can be done in a few
days. For better support it is hard to give a timeline, because
once you try to use Unicode you will find that some things
do not work as expected and some things are suddenly extremely
slow and need to be rewritten to get reasonable speed.
--
Waldek Hebisch
[email protected]