June Kim wrote:
> When a string is boxed and the string includes characters whose
> display width differs from their byte length, the box is broken in J. It
> is not because of the font. It is because J assumes that
> every character's width is the same as its byte length, which is
> obviously false in many writing and encoding systems, including East
> Asian ones. We can definitely say J's box display isn't internationalized
> yet.
> 
> For example, 54620 (as a Unicode code point) is a Korean character,
> which is pronounced "han". Its width is "Wide" (twice as wide as Latin
> letters).
> 
>   han=.4 u: 54620
>   <han
> +---+
> |한|
> +---+
>   <8 u: han
> +---+
> |한|
> +---+
> 
> Since J counts the byte length to determine a character's width, and
> the byte length of han is 3 in UTF-8 ( 3-: #8 u: han ), the box's
> horizontal character '-' (whose width is "Narrow") is printed three
> times, and the box is broken on the display.
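
The mismatch described above can be reproduced outside J. What follows is a
minimal Python sketch, not J's own logic: display_width is a hypothetical
helper that assumes a character occupies two columns when its Unicode East
Asian Width is Wide or Fullwidth and one column otherwise (combining and
zero-width characters are not handled).

```python
import unicodedata

def display_width(text):
    """Approximate column count (assumption): 2 for Wide/Fullwidth, else 1."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )

han = chr(54620)                      # same code point as 4 u: 54620 in the quoted example
utf8_len = len(han.encode("utf-8"))   # byte length, the count the quoted text says J uses
cols = display_width(han)             # columns the character occupies on screen

print(utf8_len, cols)
```

The two numbers differ for this character, which is why a border sized from
the byte count ends up wider than the text it encloses.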

Yes, you are quite right, but again, this is not because we are not
using unicode. This particular issue was raised before; see
http://www.jsoftware.com/pipermail/programming/2006-February/001107.html

As Eric commented: "The proper solution is for the front end window
display to detect box draw chars and do the display with proper
graphics. That is parse all the contents finding lengths, then draw the
texts where they belong and draw nice boxes with graphic lines. This is
beyond the scope of the 601 release."
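
Short of drawing real graphics, the "parse all the contents finding lengths"
step can be approximated in plain text by sizing the border from display
columns rather than bytes. This is only a sketch in Python, not the J front
end: boxed and display_width are hypothetical names, and the width rule (two
columns for East Asian Wide or Fullwidth characters, one otherwise) is an
assumption that ignores combining characters.

```python
import unicodedata

def display_width(text):
    """Approximate column count (assumption): 2 for Wide/Fullwidth, else 1."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )

def boxed(lines):
    """Draw an ASCII box whose border matches the widest line in columns."""
    width = max(display_width(line) for line in lines)
    border = "+" + "-" * width + "+"
    rows = [
        # Pad each row with spaces up to the common column width.
        "|" + line + " " * (width - display_width(line)) + "|"
        for line in lines
    ]
    return "\n".join([border] + rows + [border])

print(boxed([chr(54620)]))
```

With a monospace font that renders wide characters at exactly two columns,
the border lines up with the content; fonts that do not will still show a
misaligned box, which is the reason graphic lines are the proper fix.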

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
