You are quite right, Don. I suppose I should change the request to displaying unicode in UTF8. Converting to unicode as you have done also allows manipulation of characters within arrays, but I am looking for ways to show the results when reshaping breaks the UTF8 representation.
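To make the target concrete, here is a rough sketch in J of the grouping rule I have in mind. The verb name boxutf8 is hypothetical, and the rule is my own assumption: a lead byte is boxed with its continuation bytes only when the whole sequence is present in the row, and any other byte is boxed alone. It does not check for overlong or otherwise invalid sequences, and I have not tested it beyond tracing the example by hand.

```j
NB. boxutf8 (hypothetical name): y is a list of byte values (integers 0-255).
NB. Returns a list of boxes, one per complete UTF-8 sequence;
NB. bytes that do not form a complete sequence are boxed singly.
boxutf8 =: 3 : 0
  r =. 0 $ a:
  i =. 0
  while. i < # y do.
    b =. i { y
    NB. expected length from the lead byte: 1, 2, 3 or 4
    len =. 1 + +/ b >: 192 224 240
    NB. take len bytes; {. pads with 0 if the row ends early
    seg =. len {. i }. y
    NB. every trailing byte must be a continuation byte (128-191)
    if. -. *./ 2 = <. (}. seg) % 64 do. len =. 1 end.
    r =. r , < len {. i }. y
    i =. i + len
  end.
  r
)

NB. apply to each row of the reshaped literal array
boxutf8"1 a. i. 2 6 $ 'ఝ' ,'a','ఝ'
```

Applied row by row like this, it should group the 2 6 example into the same 2 4 arrangement as r in my original message. Rows that yield different numbers of boxes would be padded with empty boxes by the rank conjunction.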
Do you have a way to take a literal array in UTF8 and box the encodings for each character? I have seen your posts in the past and they have helped as I work through this process. Thank you.

One of the ways that I am looking at dealing with the width issue is to have the characters display in a smaller font so that some of the unicode display width issues can be resolved.

Cheers, bob

> On Jun 16, 2016, at 11:25 AM, Don Guinn <[email protected]> wrote:
>
> You are not dealing with unicode. You have UTF8.
>
>    ]s=. 7 u: 'ఝ' ,'a','ఝ'   NB. s is converted to unicode.
> ఝaఝ
>    $s
> 3
>    <"0 s
> +---+-+---+
> |ఝ|a|ఝ|
> +---+-+---+
>
> But the display still is messed up because the display first converts the
> unicode to UTF8. Then does a byte count to determine how many boxing
> characters to put around the data. But there is still a problem as many
> unicode/UTF8 characters beyond ASCII are proportional. Notice how wide the
> first and last characters are compared to the "a".
>
> On Thu, Jun 16, 2016 at 12:08 PM, robert therriault <[email protected]>
> wrote:
>
>> I am in the process of extending some of the type and shape visualizations
>> that I have done in the past [0] into the realm of unicode.
>>
>> If you look through the archives of these message lists you will find that
>> unicode can be quite confounding, but my question is relatively simple.
>>
>> I would like to take
>>
>>    [s=. 2 6 $ 'ఝ' ,'a','ఝ'   NB. � results from 224 176 157 being broken across dimensions
>> ఝa��
>> �ఝa�
>>    [encode=. a. i. s   NB. shape of 2 6 refers to the encoding numbers not the number of characters displayed
>> 224 176 157  97 224 176
>> 157 224 176 157  97 224
>>
>> and convert encode to a form where the encoding for each character is in
>> it's own box. Of course, this would be a verb that can work with any
>> literal array not just the example given.
>>
>>    [r=. 2 4 $ 224 176 157 ; 97 ; 224 ; 176 ; 157 ; 224 176 157 ; 97 ; 224
>> ┌───────────┬───────────┬───┬───┐
>> │224 176 157│97         │224│176│
>> ├───────────┼───────────┼───┼───┤
>> │157        │224 176 157│97 │224│
>> └───────────┴───────────┴───┴───┘
>>
>> which could be converted back to
>>
>>    {&a. each r
>> ┌───┬───┬─┬─┐
>> │ఝ│a  │�│�│
>> ├───┼───┼─┼─┤
>> │�  │ఝ│a│�│
>> └───┴───┴─┴─┘
>>
>> With this in place it may be possible to have the literal view of unicode
>> display a little more consistently
>>
>> Any suggestions would be welcome.
>>
>> Cheers, bob
>>
>> [0] Video of Enhanced display of literals
>> https://www.youtube.com/watch?v=BzjfJjGb5cs
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
