[Jbeta] boxed unicode

Andrew Nikitin Sat, 17 Jun 2006 11:35:16 -0700

Hello, Eric.

The problem comes in degrees and the question is where to draw the
line. If the utf-8 2 byte chars are European in a fixed pitch font
then your suggestion of counting glyphs works.


Add Turkey, Israel, Armenia, Georgia, Mongolia, Canada (and some
others) to the list of "European" and your statement becomes closer to
the truth.

If the utf-8 2 byte
chars are Chinese in a fixed pitch font that takes 2 positions for the
Chinese chars then the current scheme of counting bytes works.


No, it does not. Chinese characters (if you include, somewhat
arbitrarily, kana and hangul) are encoded at Unicode positions 16b3000 ..
16bfa2d, all of which require 3 bytes when encoded as UTF-8:
  #@(8&u:)"> 4 u: 16b3000 16bfa2d
3 3
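
As a rough cross-check outside J, here is a minimal Python sketch of the same byte-count claim. The helper name utf8_len and the constants LOW and HIGH are made up for illustration; the only assumption is that the range above is read as U+3000 .. U+FA2D.

```python
# Endpoints of the range quoted above, as Unicode code points (assumed reading of 16b3000 .. 16bfa2d).
LOW, HIGH = 0x3000, 0xFA2D


def utf8_len(code_point: int) -> int:
    """Number of bytes needed to encode one code point as UTF-8."""
    return len(chr(code_point).encode("utf-8"))


# Surrogates (U+D800..U+DFFF) fall inside the range but are not characters
# and cannot be encoded as UTF-8, so they are skipped.
lengths = {
    utf8_len(cp)
    for cp in range(LOW, HIGH + 1)
    if not 0xD800 <= cp <= 0xDFFF
}

print(utf8_len(LOW), utf8_len(HIGH))  # 3 3
print(lengths)  # {3}
```

Every encodable code point in that range takes 3 bytes, so none of these characters is a "2 byte char".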

The proper solution is to have the box drawing done in the front end
where the actual length of each contents is calculated and the box
lines are drawn with graphics.


Please, do not ever do this. Please. I was so happy when the box drawing
characters returned that I even installed the newest beta just for that.

This is a lot of work for what we consider to be seriously diminished
returns.


That is exactly what I am talking about: the biggest bang for the buck.
Right now you calculate width (in "positions") as #@(8&u:) (or something
equivalent), and this approach works only when the input is pure ASCII. I
suggest using #@(7&u:) instead; this approach will work for pure ASCII,
accented characters, Cyrillic, Greek, Hebrew, APL, box drawing and most
block, arrow and special characters, when displayed in a J session, a
properly set up console window or, say, PuTTY in UTF-8 mode.
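
The difference between the two counts can be illustrated with a small Python sketch. It is only an analogue of the J expressions, not the J implementation: byte_width stands in for counting UTF-8 bytes and code_point_width for counting characters, and both names and the sample strings are made up for illustration.

```python
def byte_width(text: str) -> int:
    """Width estimate taken from the UTF-8 byte count."""
    return len(text.encode("utf-8"))


def code_point_width(text: str) -> int:
    """Width estimate taken from the number of code points."""
    return len(text)


# Illustrative samples, written with escapes so the source stays ASCII.
samples = {
    "ascii": "abc",
    "accented": "caf\u00e9",  # 'cafe' with a precomposed e-acute
    "cyrillic": "\u043f\u0440\u0438\u0432\u0435\u0442",  # six Cyrillic letters
    "box drawing": "\u250c\u2500\u2510",  # corner, horizontal line, corner
}

for label, text in samples.items():
    print(f"{label:12} bytes={byte_width(text):2} code_points={code_point_width(text)}")
```

The two counts agree only for the ASCII sample. For the others the byte count overstates the number of positions taken in a fixed pitch font, while the code point count matches it. Neither count covers glyphs that take two positions or combining characters.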

Are you suggesting that #@(7&u:) is somehow an order of magnitude more
complicated than #@(8&u:) as a function to calculate display width?

I might agree that calling the OS to handle the zero width of combining
character classes could be considered too much trouble, but in the case
of code points vs bytes... come on!

nsg


----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
