Hello, Eric.
> The problem comes in degrees and the question is where to draw the line. If the utf-8 2 byte chars are European in a fixed pitch font then your suggestion of counting glyphs works.
Add Turkey, Israel, Armenia, Georgia, Mongolia, Canada (and some others) to the list of "European" and your statement becomes closer to the truth.
> If the utf-8 2 byte chars are Chinese in a fixed pitch font that takes 2 positions for the Chinese chars then the current scheme of counting bytes works.
No, it does not. Chinese characters (if you include, somewhat arbitrarily, kana and hangul) are encoded at Unicode positions 16b3000 .. 16bfa2d, all of which require

   #@(8&u:)"> 4 u: 16b3000 16bfa2d
3 3

3 bytes when encoded as UTF-8.
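For readers who do not have J at hand, the same byte counts can be checked with a minimal Python sketch. Only U+3000 and U+FA2D come from the range above; the other sample characters and the helper name `utf8_len` are picked here purely for illustration. The general rule is that code points U+0080..U+07FF take 2 bytes in UTF-8 and U+0800..U+FFFF take 3.

```python
# Illustrative samples; only U+3000 and U+FA2D are taken from the post.
samples = {
    "Latin e-acute": "\u00e9",
    "Cyrillic zhe": "\u0436",
    "Hebrew alef": "\u05d0",
    "Armenian ayb": "\u0531",
    "start of the CJK range": "\u3000",
    "end of the CJK range": "\ufa2d",
}


def utf8_len(ch: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    for name, ch in samples.items():
        print(f"U+{ord(ch):04X} {name}: {utf8_len(ch)} byte(s)")
```

The first four samples come out at 2 bytes each and the two CJK endpoints at 3, so a 2-byte UTF-8 sequence never encodes a Chinese character.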
> The proper solution is to have the box drawing done in the front end where the actual length of each contents is calculated and the box lines are drawn with graphics.
Please, do not ever do this. Please. I was so happy when the box drawing characters returned that I even installed the newest beta just for that.
> This is a lot of work for what we consider to be seriously diminished returns.
That is exactly what I am talking about: the biggest bang for the buck. Right now you calculate width (in "positions") as #@(8&u:) (or something equivalent), and this approach works only when the input is pure ASCII. I suggest using #@(7&u:) instead. That approach will work for pure ASCII, accented characters, Cyrillic, Greek, Hebrew, APL, box drawing and most of the block, arrow and special characters, when displayed in a J session, a properly set up console window or, say, putty in utf8 mode.

Are you suggesting that #@(7&u:) is somehow an order of magnitude more complicated than #@(8&u:) as a function to calculate display width? I might agree that calling the OS to account for the zero width of combining character classes could be considered too much trouble, but in the case of codes vs bytes... come on!

nsg

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
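To make the codes-versus-bytes difference concrete outside J, here is a minimal sketch in Python. It rests on two assumptions: that counting code points matches #@(7&u:) for the characters discussed (all are in the Basic Multilingual Plane), and that counting UTF-8 bytes matches #@(8&u:). The function names are hypothetical, and the third function only approximates the "zero width of combining characters" refinement using the standard `unicodedata` module.

```python
import unicodedata


def width_by_bytes(s: str) -> int:
    """Count UTF-8 bytes (assumed analogue of #@(8&u:))."""
    return len(s.encode("utf-8"))


def width_by_codes(s: str) -> int:
    """Count code points (assumed analogue of #@(7&u:) for BMP text)."""
    return len(s)


def width_skipping_combining(s: str) -> int:
    """Count code points that are not combining marks (rough refinement)."""
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


if __name__ == "__main__":
    examples = [
        "abc",                  # pure ASCII
        "\u0436\u0443\u043a",   # three Cyrillic letters
        "\u250c\u2500\u2510",   # three box drawing characters
        "e\u0301",              # 'e' followed by a combining acute accent
    ]
    for s in examples:
        print(
            repr(s),
            width_by_bytes(s),
            width_by_codes(s),
            width_skipping_combining(s),
        )
```

In this sketch the byte count agrees with the number of displayed positions only for the ASCII string. The code count is right for every example except the last one, and that case is exactly the combining-character situation set aside above as possibly too much trouble.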
