On reflection, I think this form
(7&u: 'someutf8string')&,
is not feasible because the string data can be
rank-2 or higher. The UTF-8 encoding is non-atomic:
a single Unicode character may occupy several bytes.
Fri, 22 Nov 2019, bill lam wrote:
> (u: 97 98 99)&, <== (3)
> (10&u: 97 98 99)&, <== (3)
>
> Good suggestion for 3/; however, it is possible if the ASCII
> codes of the string are all in the range 32 to 127:
> (u: 'abc')&,
> (10&u: 'abc')&,
>
> Otherwise, if the string is valid Unicode, then it may also
> be possible using the form
> (7&u: 'someutf8string')&,
> (9&u: 'someutf8string')&,
>
> But a J literal4 can contain the full range of 4 bytes, so some values
> are invalid Unicode characters and should use the current format.
>
> Thu, 21 Nov 2019, Kirk Iverson wrote:
> > A recent thread in the programming forum (00 strange?) inspired me to look
> > at this new behaviour in 901.
> >
> > NB. Numeric
> > datatype&.> 'b i f x c'=. 0 1;(0 1+0);(0 1%1);0 1x;0,1j1-0j1
> > +-------+-------+--------+--------+-------+
> > |boolean|integer|floating|extended|complex|
> > +-------+-------+--------+--------+-------+
> > b&+
> > 0 1&+
> > i&+
> > 00 1&+ <== (1)
> > f&+
> > 0 1.&+
> > x&+
> > 0 1x&+
> > c&+
> > 0 1&+ <== (2)
> >
> >
> > NB. Character
> > d=. (0{a.);'abc'
> > 'lctrl lascii'=. d
> > 'uctrl uascii'=. u:&.> d
> > 'Uctrl Uascii'=. 10 u:&.> d
> > datatype&.>lctrl;lascii;uctrl;uascii;Uctrl;Uascii
> > +-------+-------+-------+-------+--------+--------+
> > |literal|literal|unicode|unicode|unicode4|unicode4|
> > +-------+-------+-------+-------+--------+--------+
> > lctrl&,
> > (00{a.)&,
> > lascii&,
> > 'abc'&,
> > uctrl&,
> > (u: 00)&,
> > uascii&,
> > (u: 97 98 99)&, <== (3)
> > Uctrl&,
> > (10&u: 00)&,
> > Uascii&,
> > (10&u: 97 98 99)&, <== (3)
> >
> >
> > 1/ Is there a reason that it is the first element of an integer array which
> > has extra decoration, rather than the last element (as in the other cases)?
> >
> > 2/ Shouldn't this display as 0 1j0&+ ?
> > Also, when interpreting constants in a line of J:
> > 0 1 is boolean
> > 00 1 is integer
> > 0 1. is floating
> > 0 1x is extended (as is 0 1r1)
> > 0 1j0 is integer
> > Shouldn't this last one be complex?
> >
> > 3/ Can the representations of uascii and Uascii make use of literals (like
> > lascii does)?
> >
> > /K
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
>
> --
> regards,
> ====================================================
> GPG key 1024D/4434BAB3 2008-08-24
> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3