You are correct: sj and mj are not part of the JE. I took their definitions
from one of the beta forum messages concerning the implementation of Direct
Definition. They are shown below.
mj=: 256$0 NB. X other
mj=: 1 (9,a.i.' ')}mj NB. S space and tab
mj=: 2 (,(a.i.'Aa')+/i.26)}mj NB. A A-Z a-z excluding N B
mj=: 3 (a.i.'N')}mj NB. N the letter N
mj=: 4 (a.i.'B')}mj NB. B the letter B
mj=: 5 (a.i.'0123456789_')}mj NB. 9 digits and _
mj=: 6 (a.i.'.')}mj NB. . the decimal point
mj=: 7 (a.i.':')}mj NB. : the colon
mj=: 8 (a.i.'''')}mj NB. Q quote
mj=: 9 (a.i.'{')}mj NB. { the left curly brace
mj=:10 (10)}mj NB. LF the line feed (CR, 13, is not set and stays X)
mj=:11 (a.i.'}')}mj NB. } the right curly brace
NB. mj=: 2 (128+i.128) }mj
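
As a quick sanity check of the mapping, the sketch below looks up the class that mj assigns to a few characters. The mj lines are repeated only so that the snippet runs on its own; class is a hypothetical helper name and not part of the original definitions.

```j
NB. mj rebuilt exactly as in the listing, so this runs standalone
mj=: 256$0
mj=: 1 (9,a.i.' ')}mj
mj=: 2 (,(a.i.'Aa')+/i.26)}mj
mj=: 3 (a.i.'N')}mj
mj=: 4 (a.i.'B')}mj
mj=: 5 (a.i.'0123456789_')}mj
mj=: 6 (a.i.'.')}mj
mj=: 7 (a.i.':')}mj
mj=: 8 (a.i.'''')}mj
mj=: 9 (a.i.'{')}mj
mj=: 10 (10)}mj
mj=: 11 (a.i.'}')}mj

NB. class (hypothetical helper): class code of each byte of y
class=: 3 : '(a.i.y){mj'

class 'aN B9.:'   NB. 2 3 1 4 5 6 7
class 'π'         NB. 0 0 : both UTF-8 bytes are in class X (other)
class CR,LF       NB. 0 10 : only LF is given class 10
```

The second result is the reason multi-byte characters come out as separate one-byte words: every byte in the range 128+i.128 is in class X until it is reassigned.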
sj=: 0 10#:10*}.".;._2(0 :0)
' X    S    A    N    B    9    .    :    Q    {    LF   }   ']0
 1.1  0.0  2.1  3.1  2.1  6.1  1.1  1.1  7.1  11.1 10.1 12.1  NB. 0 space
 1.2  0.3  2.2  3.2  2.2  6.2  1.0  1.0  7.2  11.2 10.2 12.2  NB. 1 other
 1.2  0.3  2.0  2.0  2.0  2.0  1.0  1.0  7.2  11.2 10.2 12.2  NB. 2 alp/num
 1.2  0.3  2.0  2.0  4.0  2.0  1.0  1.0  7.2  11.2 10.2 12.2  NB. 3 N
 1.2  0.3  2.0  2.0  2.0  2.0  5.0  1.0  7.2  11.2 10.2 12.2  NB. 4 NB
 9.0  9.0  9.0  9.0  9.0  9.0  1.0  1.0  9.0  9.0  10.2 9.0   NB. 5 NB.
 1.4  0.5  6.0  6.0  6.0  6.0  6.0  1.0  7.4  11.4 10.2 12.4  NB. 6 num
 7.0  7.0  7.0  7.0  7.0  7.0  7.0  7.0  8.0  7.0  7.0  7.0   NB. 7 '
 1.2  0.3  2.2  3.2  2.2  6.2  1.2  1.2  7.0  11.2 10.2 12.2  NB. 8 ''
 9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0  10.2 9.0   NB. 9 comment
 1.2  0.2  2.2  3.2  2.2  6.2  1.0  1.0  7.2  11.2 10.2 12.2  NB. 10 LF
 1.2  0.3  2.2  3.2  2.2  6.2  1.0  1.0  7.2  13.0 10.2 1.2   NB. 11 {
 1.2  0.3  2.2  3.2  2.2  6.2  1.0  1.0  7.2  1.2  10.2 14.0  NB. 12 }
 1.2  0.3  2.2  3.2  2.2  6.2  1.7  1.7  7.2  1.2  10.2 1.2   NB. 13 {{
 1.2  0.3  2.2  3.2  2.2  6.2  1.7  1.7  7.2  1.2  10.2 1.2   NB. 14 }}
)
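
For anyone reading the table: each entry is written as state.code, and the 0 10#:10* in the definition of sj splits it into a (next state, output code) pair. A minimal sketch on three sample entries taken from the table:

```j
NB. 1.2  -> next state 1,  output code 2
NB. 11.1 -> next state 11, output code 1
NB. 7.0  -> next state 7,  output code 0
0 10 #: 10 * 1.2 11.1 7.0
NB. result is a 3-by-2 table with rows  1 2 ,  11 1 ,  7 0
```

As I understand the ;: documentation, output code 0 means no output, 1 marks the start of a word, and 2 emits the current word and marks the start of the next one. Please check the ;: documentation for the remaining codes; I have not verified them against the beta.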
On Fri, Nov 20, 2020 at 8:32 AM Henry Rich <[email protected]> wrote:
> What are sj/mj? They are not part of the JE, right? I think you are
> saying that (x ;: y) can be used to parse well-formed UTF-8 with a small
> change to the input translation.
>
> Henry Rich
>
> On 11/20/2020 10:17 AM, Don Guinn wrote:
> > Sequential machine does not do well when dealing with UTF-8. It works well
> > within comments (NB.) and literals ('⌹'), but outside those cases it makes
> > a mess.
> >
> >
> > Given some of the changes to ;: in the beta it seems that it would be
> > desirable to have UTF-8 handled outside of comments and literals as handled
> > in them. There is a simple change that can be made to mj that accomplishes
> > that. Simply assigning the value 2 for letters for the range 128+i.128
> > accomplishes that making UTF-8 like letters a-z and A-Z.
> >
> >
> > I don't know where J will be going with UTF-8 and other unicode handling,
> > but this seems to me to help in the handling of UTF-8 in the sequential
> > machine.
> >
> >
> > Example shown below:
> >
> > NB. Definitions for sj and mj are not shown; they are as in
> > NB. the current beta.
> > NB. A noun to show the handling of UTF-8 in ;:
> > test=:{{)n
> > The symbol for the Euro is ₠
> > Other symbols like π show up also
> > How about ⌹ in APL
> > Common expressions like 'H₂O' for water
> > Common expressions like H₂O for water
> > }}
> >
> > NB. How ;: in beta handles it
> > ,.<;.2(0;sj;mj);:test
> > +-----------------------------------------------+
> > |+---+------+---+---+----+--+-+-+-+-+           |
> > ||The|symbol|for|the|Euro|is|â|‚| | |           |
> > |+---+------+---+---+----+--+-+-+-+-+           |
> > +-----------------------------------------------+
> > |+-----+-------+----+-+-+----+--+----+-+        |
> > ||Other|symbols|like|Ï|€|show|up|also| |        |
> > |+-----+-------+----+-+-+----+--+----+-+        |
> > +-----------------------------------------------+
> > |+---+-----+-+-+-+--+---+-+                     |
> > ||How|about|â|Œ|¹|in|APL| |                     |
> > |+---+-----+-+-+-+--+---+-+                     |
> > +-----------------------------------------------+
> > |+------+-----------+----+-----+---+-----+-+    |
> > ||Common|expressions|like|'H₂O'|for|water| |    |
> > |+------+-----------+----+-----+---+-----+-+    |
> > +-----------------------------------------------+
> > |+------+-----------+----+-+-+-+-+-+---+-----+-+|
> > ||Common|expressions|like|H|â|‚|‚|O|for|water| ||
> > |+------+-----------+----+-+-+-+-+-+---+-----+-+|
> > +-----------------------------------------------+
> >
> > NB. Assigning the UTF-8 bytes (128+i.128) to the letter class
> > mj=: 2 (128+i.128)}mj
> >
> > NB. How UTF-8 is now handled
> > ,.<;.2(0;sj;mj);:test
> > +-------------------------------------------+
> > |+---+------+---+---+----+--+-+-+           |
> > ||The|symbol|for|the|Euro|is|₠| |           |
> > |+---+------+---+---+----+--+-+-+           |
> > +-------------------------------------------+
> > |+-----+-------+----+-+----+--+----+-+      |
> > ||Other|symbols|like|π|show|up|also| |      |
> > |+-----+-------+----+-+----+--+----+-+      |
> > +-------------------------------------------+
> > |+---+-----+-+--+---+-+                     |
> > ||How|about|⌹|in|APL| |                     |
> > |+---+-----+-+--+---+-+                     |
> > +-------------------------------------------+
> > |+------+-----------+----+-----+---+-----+-+|
> > ||Common|expressions|like|'H₂O'|for|water| ||
> > |+------+-----------+----+-----+---+-----+-+|
> > +-------------------------------------------+
> > |+------+-----------+----+---+---+-----+-+  |
> > ||Common|expressions|like|H₂O|for|water| |  |
> > |+------+-----------+----+---+---+-----+-+  |
> > +-------------------------------------------+
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm