Great work, Bob. I usually avoid mixing literal and unicode data in the same application, and instead convert all data to utf-8 or utf-16, depending on the application.
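
As a minimal sketch of that habit, assuming utf-16 is the chosen internal form: convert every argument with 7&u: before concatenating, so a utf-8 literal never meets unicode data directly. The names toU16 and joinU16 are hypothetical, made up for this illustration; only the u: calls come from the quoted session. It also assumes that 7 u: leaves data that is already unicode unchanged.

```j
NB. toU16 and joinU16 are hypothetical names, not part of any J library.
toU16   =: 7&u:          NB. convert a utf-8 literal to unicode, as in the quoted session
joinU16 =: ,&:toU16      NB. dyad: (toU16 x) , (toU16 y)

lit3  =: 8 u: 3101       NB. utf-8 literal (3 bytes), from the quoted session
uni_1 =: 7 u: 3101       NB. unicode, from the quoted session

lit3 joinU16 uni_1       NB. same as (7 u: lit3),(7 u: uni_1)
$ lit3 joinU16 uni_1     NB. item count of the joined result
```

Routing every concatenation through one verb keeps the conversion in a single place, so the choice of encoding can be changed later without hunting for stray commas.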

On Thu, Sep 19, 2019, 12:16 PM 'robert therriault' via Programming <[email protected]> wrote:

> Don,
>
> You are quite correct that care must be taken to convert literals in utf-8
> to unicode before concatenating.
>
>    [ lit3=:8 u: 3101
> ఝ
>    3 u: lit3
> 224 176 157
>    [ uni_1=:7 u: 3101
> ఝ
>    lit3,uni_1
> à° ఝ
>    $ lit3,uni_1
> 4
>    datatype lit3,uni_1
> unicode
>    (7 u: lit3),uni_1 NB. Displays properly when first converted
> ఝఝ
>    $ (7 u: lit3),uni_1
> 2
>    datatype (7 u: lit3),uni_1
> unicode
>
> So the parting wisdom that I have is that when you are working with
> unicode in J, you should be aware of what is going on.
>
> It may be useful for someone with knowledge to create a lab that shows the
> preferred way of dealing with conversions and encodings. I might take a run
> at it eventually, but if anyone wants to be 'my hero', they could put
> something together sooner.
> My current confusion is over the number of ways that the outputs of 9&u:
> and 7&u: depend on the type of their argument.
>
>    9 u: 128512
> 😀
>    9 u: '😀'
> 😀
>    3 u: 9 u: '😀'
> 128512
>    7 u: '😀'
> 😀
>    3 u: 7 u: '😀'
> 55357 56832
>    9 u: 55357 56832
> 😀
>
>    9 u: 7 u: 55357 56832
> 😀
>    3 u: 9 u: 7 u: 55357 56832
> 128512
>    3 u: 7 u: 55357 56832
> 55357 56832
>    3 u: 9 u: 55357 56832
> 55357 56832
>
>
> Cheers, bob
>
> > On Sep 18, 2019, at 5:58 PM, Don Guinn <[email protected]> wrote:
> >
> > If any utf-8 characters
> > are in the literal the literal must be converted to unicode before
> > concatenating.
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
