I didn't study it carefully. However, I think they aren't UTF-8 aware.
Perhaps it groups a run of high-ASCII bytes of arbitrary length with the
preceding low-ASCII characters.
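
A rough way to check that guess outside J is the Python sketch below. It is
only a toy model and an assumption on my part, not the real sj/mj tables or
the J engine: the function name tokenize and the flag high_bytes_are_letters
are made up for illustration. It treats a byte as a "letter" if it is ASCII
alphanumeric or underscore, and optionally if it is 128 or above.

```python
def tokenize(data: bytes, high_bytes_are_letters: bool) -> list[bytes]:
    """Toy word splitter: runs of 'letter' bytes form one word,
    whitespace separates words, any other byte is a one-byte word."""

    def is_letter(b: int) -> bool:
        if b < 128:
            return chr(b).isalnum() or b == ord("_")
        # Hypothetical switch standing in for mj=: 2 (128+i.128)}mj
        return high_bytes_are_letters

    words: list[bytes] = []
    current = bytearray()
    for b in data:
        if is_letter(b):
            current.append(b)
            continue
        if current:                 # a non-letter byte ends the current word
            words.append(bytes(current))
            current.clear()
        if b not in b" \t\n":       # whitespace is dropped, other bytes stand alone
            words.append(bytes([b]))
    if current:
        words.append(bytes(current))
    return words


text = "H₂O for water".encode("utf-8")

# High bytes as "other": the three bytes of ₂ are split apart.
print(tokenize(text, False))
# [b'H', b'\xe2', b'\x82', b'\x82', b'O', b'for', b'water']

# High bytes as letters: they are glued to the neighbouring ASCII letters.
print(tokenize(text, True))
# [b'H\xe2\x82\x82O', b'for', b'water']

# The grouping is by byte class only, so invalid UTF-8 is grouped as well.
print(tokenize(b"ab\xff\xfe cd", True))
# [b'ab\xff\xfe', b'cd']
```

The last call is the point: nothing here validates UTF-8, so any run of
bytes 128 and above is simply attached to the surrounding word.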

On Fri, Nov 20, 2020, 11:51 PM Don Guinn <[email protected]> wrote:

> You are correct. They are not. I took their definitions from one of the
> beta forum messages concerning the implementation of Direct Definition. I
> am showing them below.
>
>
> mj=: 256$0                     NB. X other
> mj=: 1 (9,a.i.' ')}mj          NB. S space and tab
> mj=: 2 (,(a.i.'Aa')+/i.26)}mj  NB. A A-Z a-z excluding N B
> mj=: 3 (a.i.'N')}mj            NB. N the letter N
> mj=: 4 (a.i.'B')}mj            NB. B the letter B
> mj=: 5 (a.i.'0123456789_')}mj  NB. 9 digits and _
> mj=: 6 (a.i.'.')}mj            NB. . the decimal point
> mj=: 7 (a.i.':')}mj            NB. : the colon
> mj=: 8 (a.i.'''')}mj           NB. Q quote
> mj=: 9 (a.i.'{')}mj            NB. { the left curly brace
> mj=:10 (10)}mj                 NB. LF linefeed (CR, 13, stays X)
> mj=:11 (a.i.'}')}mj            NB. } the right curly brace
>
> NB. mj=: 2 (128+i.128)        }mj
>
> sj=: 0 10#:10*}.".;._2(0 :0)
> ' X   S   A   N   B   9   .   :   Q    {   LF    } ']0
>  1.1 0.0 2.1 3.1 2.1 6.1 1.1 1.1 7.1 11.1 10.1 12.1 NB. 0 space
>  1.2 0.3 2.2 3.2 2.2 6.2 1.0 1.0 7.2 11.2 10.2 12.2 NB. 1 other
>  1.2 0.3 2.0 2.0 2.0 2.0 1.0 1.0 7.2 11.2 10.2 12.2 NB. 2 alp/num
>  1.2 0.3 2.0 2.0 4.0 2.0 1.0 1.0 7.2 11.2 10.2 12.2 NB. 3 N
>  1.2 0.3 2.0 2.0 2.0 2.0 5.0 1.0 7.2 11.2 10.2 12.2 NB. 4 NB
>  9.0 9.0 9.0 9.0 9.0 9.0 1.0 1.0 9.0  9.0 10.2  9.0 NB. 5 NB.
>  1.4 0.5 6.0 6.0 6.0 6.0 6.0 1.0 7.4 11.4 10.2 12.4 NB. 6 num
>  7.0 7.0 7.0 7.0 7.0 7.0 7.0 7.0 8.0  7.0  7.0  7.0 NB. 7 '
>  1.2 0.3 2.2 3.2 2.2 6.2 1.2 1.2 7.0 11.2 10.2 12.2 NB. 8 ''
>  9.0 9.0 9.0 9.0 9.0 9.0 9.0 9.0 9.0  9.0 10.2  9.0 NB. 9 comment
>  1.2 0.2 2.2 3.2 2.2 6.2 1.0 1.0 7.2 11.2 10.2 12.2 NB. 10 LF
>  1.2 0.3 2.2 3.2 2.2 6.2 1.0 1.0 7.2 13.0 10.2  1.2 NB. 11 {
>  1.2 0.3 2.2 3.2 2.2 6.2 1.0 1.0 7.2  1.2 10.2 14.0 NB. 12 }
>  1.2 0.3 2.2 3.2 2.2 6.2 1.7 1.7 7.2  1.2 10.2  1.2 NB. 13 {{
>  1.2 0.3 2.2 3.2 2.2 6.2 1.7 1.7 7.2  1.2 10.2  1.2 NB. 14 }}
> )
>
> On Fri, Nov 20, 2020 at 8:32 AM Henry Rich <[email protected]> wrote:
>
> > What are sj/mj?  They are not part of the JE, right?  I think you are
> > saying that (x ;: y) can be used to parse well-formed UTF-8 with a small
> > change to the input translation.
> >
> > Henry Rich
> >
> > On 11/20/2020 10:17 AM, Don Guinn wrote:
> > > The sequential machine does not do well when dealing with UTF-8. It
> > > works well within comments (NB.) and literals ('⌹'), but outside those
> > > cases it makes a mess.
> > >
> > > Given some of the changes to ;: in the beta, it seems desirable to have
> > > UTF-8 handled outside comments and literals the same way it is handled
> > > inside them. A simple change to mj accomplishes that: assigning the
> > > value 2 (the class for letters) to the range 128+i.128 makes UTF-8
> > > bytes behave like the letters a-z and A-Z.
> > >
> > > I don't know where J will be going with UTF-8 and other Unicode
> > > handling, but this seems to me to help with UTF-8 in the sequential
> > > machine.
> > >
> > > Example shown below:
> > >
> > > NB. Definitions for sj and mj not shown but as
> > > NB. the current beta.
> > >
> > > NB. A noun to show the handling of UTF-8 in ;:
> > > test=:{{)n
> > > The symbol for the Euro is ₠
> > > Other symbols like π show up also
> > > How about ⌹ in APL
> > > Common expressions like 'H₂O' for water
> > > Common expressions like H₂O for water
> > > }}
> > >
> > > NB. How ;: in beta handles it
> > > ,.<;.2(0;sj;mj);:test
> > > +-----------------------------------------------+
> > > |+---+------+---+---+----+--+-+-+-+-+           |
> > > ||The|symbol|for|the|Euro|is|â|‚| | |           |
> > > |+---+------+---+---+----+--+-+-+-+-+           |
> > > +-----------------------------------------------+
> > > |+-----+-------+----+-+-+----+--+----+-+        |
> > > ||Other|symbols|like|Ï|€|show|up|also| |        |
> > > |+-----+-------+----+-+-+----+--+----+-+        |
> > > +-----------------------------------------------+
> > > |+---+-----+-+-+-+--+---+-+                     |
> > > ||How|about|â|Œ|¹|in|APL| |                     |
> > > |+---+-----+-+-+-+--+---+-+                     |
> > > +-----------------------------------------------+
> > > |+------+-----------+----+-----+---+-----+-+    |
> > > ||Common|expressions|like|'H₂O'|for|water| |    |
> > > |+------+-----------+----+-----+---+-----+-+    |
> > > +-----------------------------------------------+
> > > |+------+-----------+----+-+-+-+-+-+---+-----+-+|
> > > ||Common|expressions|like|H|â|‚|‚|O|for|water| ||
> > > |+------+-----------+----+-+-+-+-+-+---+-----+-+|
> > > +-----------------------------------------------+
> > >
> > > NB. Assigning UTF-8 bytes (128+i.128) as letters (class 2)
> > > mj=: 2 (128+i.128)}mj
> > >
> > > NB. How UTF-8 is now handled
> > > ,.<;.2(0;sj;mj);:test
> > > +-------------------------------------------+
> > > |+---+------+---+---+----+--+-+-+           |
> > > ||The|symbol|for|the|Euro|is|₠| |           |
> > > |+---+------+---+---+----+--+-+-+           |
> > > +-------------------------------------------+
> > > |+-----+-------+----+-+----+--+----+-+      |
> > > ||Other|symbols|like|π|show|up|also| |      |
> > > |+-----+-------+----+-+----+--+----+-+      |
> > > +-------------------------------------------+
> > > |+---+-----+-+--+---+-+                     |
> > > ||How|about|⌹|in|APL| |                     |
> > > |+---+-----+-+--+---+-+                     |
> > > +-------------------------------------------+
> > > |+------+-----------+----+-----+---+-----+-+|
> > > ||Common|expressions|like|'H₂O'|for|water| ||
> > > |+------+-----------+----+-----+---+-----+-+|
> > > +-------------------------------------------+
> > > |+------+-----------+----+---+---+-----+-+  |
> > > ||Common|expressions|like|H₂O|for|water| |  |
> > > |+------+-----------+----+---+---+-----+-+  |
> > > +-------------------------------------------+
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm