I didn't study it carefully, but I don't think they are really handled as UTF-8. Perhaps it just groups a run of high-bit bytes (128-255), of any length, with the preceding ASCII characters.
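A minimal sketch of what I mean. It does not run ;: at all; it only looks at the character-class map. The names mj0, mj1, valid and stray are made up for this illustration, and mj0 is a cut-down map (letters only), not the beta's full mj. The point is that once bytes 128..255 are classed as letters, the map has no way to tell a well-formed UTF-8 sequence from stray high bytes.

```j
NB. Hypothetical cut-down class map, for illustration only.
mj0=: 256$0                       NB. class 0: other
mj0=: 2 (,(a.i.'Aa')+/i.26)}mj0   NB. class 2: A-Z a-z
mj1=: 2 (128+i.128)}mj0           NB. proposed change: bytes 128..255 become class 2

valid=: 'H₂O'                     NB. H, the three UTF-8 bytes of ₂, then O
stray=: 'H',(130 130{a.),'O'      NB. two bare continuation bytes: not valid UTF-8

mj0 {~ a.i. valid                 NB. 2 0 0 0 2   the multibyte character is class 0
mj1 {~ a.i. valid                 NB. 2 2 2 2 2   every byte is now a letter
mj1 {~ a.i. stray                 NB. 2 2 2 2     and so is every byte of the malformed string
```

So with the change, any run of high bytes is glued to the neighbouring letters as one word, whether or not the bytes form valid UTF-8.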
On Fri, Nov 20, 2020, 11:51 PM Don Guinn <[email protected]> wrote:

> You are correct. They are not. I took their definitions from one of the
> beta forum messages concerning the implementation of Direct Definition. I
> am showing them below.
>
> mj=: 256$0                        NB. X other
> mj=: 1 (9,a.i.' ')}mj             NB. S space and tab
> mj=: 2 (,(a.i.'Aa')+/i.26)}mj     NB. A A-Z a-z excluding N B
> mj=: 3 (a.i.'N')}mj               NB. N the letter N
> mj=: 4 (a.i.'B')}mj               NB. B the letter B
> mj=: 5 (a.i.'0123456789_')}mj     NB. 9 digits and _
> mj=: 6 (a.i.'.')}mj               NB. . the decimal point
> mj=: 7 (a.i.':')}mj               NB. : the colon
> mj=: 8 (a.i.'''')}mj              NB. Q quote
> mj=: 9 (a.i.'{')}mj               NB. { the left curly brace
> mj=:10 (10)}mj                    NB. LF and CR
> mj=:11 (a.i.'}')}mj               NB. } the right curly brace
>
> NB. mj=: 2 (128+i.128) }mj
>
> sj=: 0 10#:10*}.".;._2(0 :0)
> ' X S A N B 9 . : Q { } LF ']0
>  1.1  0.0  2.1  3.1  2.1  6.1  1.1  1.1  7.1 11.1 10.1 12.1 NB. 0 space
>  1.2  0.3  2.2  3.2  2.2  6.2  1.0  1.0  7.2 11.2 10.2 12.2 NB. 1 other
>  1.2  0.3  2.0  2.0  2.0  2.0  1.0  1.0  7.2 11.2 10.2 12.2 NB. 2 alp/num
>  1.2  0.3  2.0  2.0  4.0  2.0  1.0  1.0  7.2 11.2 10.2 12.2 NB. 3 N
>  1.2  0.3  2.0  2.0  2.0  2.0  5.0  1.0  7.2 11.2 10.2 12.2 NB. 4 NB
>  9.0  9.0  9.0  9.0  9.0  9.0  1.0  1.0  9.0  9.0 10.2  9.0 NB. 5 NB.
>  1.4  0.5  6.0  6.0  6.0  6.0  6.0  1.0  7.4 11.4 10.2 12.4 NB. 6 num
>  7.0  7.0  7.0  7.0  7.0  7.0  7.0  7.0  8.0  7.0  7.0  7.0 NB. 7 '
>  1.2  0.3  2.2  3.2  2.2  6.2  1.2  1.2  7.0 11.2 10.2 12.2 NB. 8 ''
>  9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0 10.2  9.0 NB. 9 comment
>  1.2  0.2  2.2  3.2  2.2  6.2  1.0  1.0  7.2 11.2 10.2 12.2 NB. 10 LF
>  1.2  0.3  2.2  3.2  2.2  6.2  1.0  1.0  7.2 13.0 10.2  1.2 NB. 11 {
>  1.2  0.3  2.2  3.2  2.2  6.2  1.0  1.0  7.2  1.2 10.2 14.0 NB. 12 }
>  1.2  0.3  2.2  3.2  2.2  6.2  1.7  1.7  7.2  1.2 10.2  1.2 NB. 13 {{
>  1.2  0.3  2.2  3.2  2.2  6.2  1.7  1.7  7.2  1.2 10.2  1.2 NB. 14 }}
> )
>
> On Fri, Nov 20, 2020 at 8:32 AM Henry Rich <[email protected]> wrote:
>
> > What are sj/mj? They are not part of the JE, right? I think you are
> > saying that (x ;: y) can be used to parse well-formed UTF-8 with a small
> > change to the input translation.
> >
> > Henry Rich
> >
> > On 11/20/2020 10:17 AM, Don Guinn wrote:
> > > Sequential machine does not do well when dealing with UTF-8. It works
> > > well within comments (NB.) and literals ('⌹'), but outside those cases
> > > it makes a mess.
> > >
> > > Given some of the changes to ;: in the beta it seems that it would be
> > > desirable to have UTF-8 handled outside of comments and literals as
> > > handled in them. There is a simple change that can be made to mj that
> > > accomplishes that. Simply assigning the value 2 for letters for the
> > > range 128+i.128 accomplishes that making UTF-8 like letters a-z and A-Z.
> > >
> > > I don't know where J will be going with UTF-8 and other unicode
> > > handling, but this seems to me to help in the handling of UTF-8 in the
> > > sequential machine.
> > >
> > > Example shown below:
> > >
> > > NB. Definitions for sj and mj not shown but as
> > > NB. the current beta.
> > >
> > > NB. A noun to show the handling of UTF-8 in ;:
> > > test=:{{)n
> > > The symbol for the Euro is ₠
> > > Other symbols like π show up also
> > > How about ⌹ in APL
> > > Common expressions like 'H₂O' for water
> > > Common expressions like H₂O for water
> > > }}
> > >
> > > NB. How ;: in beta handles it
> > > ,.<;.2(0;sj;mj);:test
> > > +-----------------------------------------------+
> > > |+---+------+---+---+----+--+-+-+-+-+           |
> > > ||The|symbol|for|the|Euro|is|â|‚| | |           |
> > > |+---+------+---+---+----+--+-+-+-+-+           |
> > > +-----------------------------------------------+
> > > |+-----+-------+----+-+-+----+--+----+-+        |
> > > ||Other|symools|like|Ï|€|show|up|also| |        |
> > > |+-----+-------+----+-+-+----+--+----+-+        |
> > > +-----------------------------------------------+
> > > |+---+-----+-+-+-+--+---+-+                     |
> > > ||How|about|â|Œ|¹|in|APL| |                     |
> > > |+---+-----+-+-+-+--+---+-+                     |
> > > +-----------------------------------------------+
> > > |+------+-----------+----+-----+---+-----+-+    |
> > > ||Common|expressions|like|'H₂O'|for|water| |    |
> > > |+------+-----------+----+-----+---+-----+-+    |
> > > +-----------------------------------------------+
> > > |+------+-----------+----+-+-+-+-+-+---+-----+-+|
> > > ||Common|expressions|like|H|â|‚|‚|O|for|water| ||
> > > |+------+-----------+----+-+-+-+-+-+---+-----+-+|
> > > +-----------------------------------------------+
> > >
> > > NB. Assigning UTF8 as character
> > > mj=: 2 (128+i.128)}mj
> > >
> > > NB. How UTF-8 is now handled
> > > ,.<;.2(0;sj;mj);:test
> > > +-------------------------------------------+
> > > |+---+------+---+---+----+--+-+-+           |
> > > ||The|symbol|for|the|Euro|is|₠| |           |
> > > |+---+------+---+---+----+--+-+-+           |
> > > +-------------------------------------------+
> > > |+-----+-------+----+-+----+--+----+-+      |
> > > ||Other|symools|like|π|show|up|also| |      |
> > > |+-----+-------+----+-+----+--+----+-+      |
> > > +-------------------------------------------+
> > > |+---+-----+-+--+---+-+                     |
> > > ||How|about|⌹|in|APL| |                     |
> > > |+---+-----+-+--+---+-+                     |
> > > +-------------------------------------------+
> > > |+------+-----------+----+-----+---+-----+-+|
> > > ||Common|expressions|like|'H₂O'|for|water| ||
> > > |+------+-----------+----+-----+---+-----+-+|
> > > +-------------------------------------------+
> > > |+------+-----------+----+---+---+-----+-+  |
> > > ||Common|expressions|like|H₂O|for|water| |  |
> > > |+------+-----------+----+---+---+-----+-+  |
> > > +-------------------------------------------+
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm
