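I think this is the intended behavior. Your string is UTF-8 encoded, and you can use 9&u: to convert it to unicode4. 10&u: converts atom by atom, so each UTF-8 byte becomes its own unicode4 character. That is why the result has 12 items: four characters of three bytes each.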
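
To see the difference in a session, here is a small sketch. The name y is just for illustration, and it assumes the literal reaches J as UTF-8 bytes, as it did in your example. The outputs are what I would expect from the description of 9 u: and 10 u:, with 3 u: used to show the integer values:

```j
   y =: '♥♦♣♠'          NB. a byte literal holding the UTF-8 encoding of four characters
   # y                  NB. 12 bytes: each of these characters takes 3 bytes in UTF-8
12
   a. i. y              NB. the raw byte values
226 153 165 226 153 166 226 153 163 226 153 160
   # 9 u: y             NB. 9 u: decodes the UTF-8, giving 4 unicode4 characters
4
   3 u: 9 u: y          NB. their code points
9829 9830 9827 9824
   # 10 u: y            NB. 10 u: widens each byte separately, so still 12 atoms
12
   3 u: 10 u: y         NB. the same values as the raw bytes above
226 153 165 226 153 166 226 153 163 226 153 160
```

So 10 u: does give unicode4, but each atom holds one byte of the UTF-8 sequence rather than one decoded character.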

9 u: y    Convert to unicode4 unless all characters are ASCII
    byte        Leave unchanged
    unicode4    any character precision containing UCPs > 127; or integer in (0,16b10ffff)
                Convert to unicode4. Any UTF-8 is converted to unicode4, and surrogate pairs in unicode are converted.

10 u: y   Convert to unicode4
    unicode4    any character precision, or integer in (0,16b10ffff)
                Convert to unicode4

------------------------------

On Thu, Jan 5, 2023 at 1:46 PM Raul Miller <[email protected]> wrote:
> 10 u:'♥♦♣♠'
> ♥♦♣♠
> #10 u:'♥♦♣♠'
> 12
>
> I can't make heads nor tails of this result.
>
> nuvoc suggests that 10 u: should be used to generate unicode4 (which
> probably means that it would use the ucs-4 encoding, containing a
> utf-32 representation of the argument characters), but while it's
> literally the case that the result is in J's unicode4 format:
>
> datatype 10 u:'♥♦♣♠'
> unicode4
>
> ... it does not look like the argument characters were encoded in this
> format.
>
> --
> Raul
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
