On reflection, I think this form
(7&u: 'someutf8string')&,
is not feasible, because the string data can be
rank-2 or higher. The UTF-8 encoding is non-atomic:
a single unicode character may occupy several bytes.
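
To illustrate the non-atomic point, here is a minimal sketch. The sample code points (97, 8364, 98) and the names s and w are only illustrative, and I am assuming the usual behaviour of 4 u: (integers to unicode), 8 u: (encode as UTF-8) and 7 u: (decode UTF-8 when necessary):

```j
NB. Sketch: the same text as UTF-8 bytes and as unicode characters.
s =. 8 u: 4 u: 97 8364 98     NB. 'a', euro sign, 'b' encoded as UTF-8 bytes
datatype s                    NB. literal, i.e. a list of bytes
# s                           NB. 5 items: the euro sign alone takes 3 bytes
w =. 7 u: s                   NB. decode the UTF-8 bytes to unicode
datatype w                    NB. unicode
# w                           NB. 3 items: one per character
```

Since the byte count and the character count differ, the items of a UTF-8 literal do not map one-to-one onto the items of the unicode array it represents, which is what makes the form awkward for arrays of higher rank.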

Fri, 22 Nov 2019, bill lam wrote:
> (u: 97 98 99)&,        <== (3)
> (10&u: 97 98 99)&,     <== (3)
> 
> Good suggestion for 3/. However,
> if the character codes of the string are in the range 32 to 127 then it is possible to use
> (u: 'abc')&,
> (10&u: 'abc')&,
> 
> else if the string is valid unicode then it may also
> be possible to use the form
> (7&u: 'someutf8string')&,
> (9&u: 'someutf8string')&,
> 
> But J literal4 contains the full range of 4 bytes, so some values are
> invalid unicode characters and should use the current format.
> 
> Thu, 21 Nov 2019, Kirk Iverson wrote:
> > A recent thread in the programming forum (00 strange?) inspired me to look 
> > at this new behaviour in 901.
> > 
> >    NB. Numeric
> >    datatype&.> 'b i f x c'=. 0 1;(0 1+0);(0 1%1);0 1x;0,1j1-0j1
> > +-------+-------+--------+--------+-------+
> > |boolean|integer|floating|extended|complex|
> > +-------+-------+--------+--------+-------+
> >    b&+
> > 0 1&+
> >    i&+
> > 00 1&+           <== (1)
> >    f&+
> > 0 1.&+
> >    x&+
> > 0 1x&+
> >    c&+
> > 0 1&+            <== (2)
> > 
> > 
> >    NB. Character
> >    d=. (0{a.);'abc'
> >    'lctrl lascii'=. d
> >    'uctrl uascii'=. u:&.> d
> >    'Uctrl Uascii'=. 10 u:&.> d
> >    datatype&.>lctrl;lascii;uctrl;uascii;Uctrl;Uascii
> > +-------+-------+-------+-------+--------+--------+
> > |literal|literal|unicode|unicode|unicode4|unicode4|
> > +-------+-------+-------+-------+--------+--------+
> >    lctrl&,
> > (00{a.)&,
> >    lascii&,
> > 'abc'&,
> >    uctrl&,
> > (u: 00)&,
> >    uascii&,
> > (u: 97 98 99)&,        <== (3)
> >    Uctrl&,
> > (10&u: 00)&,
> >    Uascii&,
> > (10&u: 97 98 99)&,     <== (3)
> > 
> > 
> > 1/ Is there a reason that it is the first element of an integer array which 
> > has extra decoration, rather than the last element (as in the other cases)?
> > 
> > 2/ Shouldn't this display as  0 1j0&+   ?
> > Also, when interpreting constants in a line of J:
> > 0 1    is boolean
> > 00 1   is integer
> > 0 1.   is floating
> > 0 1x   is extended (as is  0 1r1)
> > 0 1j0  is integer
> > Shouldn't this last one be complex?
> > 
> > 3/ Can the representations of uascii and Uascii make use of literals (like 
> > lascii does)?
> > 
> > /K
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
> 
> -- 
> regards,
> ====================================================
> GPG key 1024D/4434BAB3 2008-08-24
> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3

