Re: [Jbeta] Boxed Unicode displays inconsistently

Henry Rich Sun, 23 Apr 2017 09:24:23 -0700

Bill has hit the nail on the head.

For unboxed display, JE just sends the bytes to the front-end, which does whatever it wants with invalid UTF-8 sequences.

For boxed display, in order to get the boxing right, JE has to predict the number of character positions that will be taken up by each Unicode character, so it converts anything that looks like UTF-8 to Unicode characters. Whatever it chooses to do with invalid UTF-8 might be different from what the front-end does.


My take on this is that it's not worth trying to fix.

Henry Rich

On 4/23/2017 3:54 AM, bill lam wrote:

The difficulty is converting an invalid UTF-8 sequence to Unicode
and then converting it back to the original invalid UTF-8.
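
To illustrate the round-trip difficulty, here is a small Python sketch. It is not how JE works; it only shows that a typical lossy decode cannot reproduce the original invalid bytes, while a decoder that reserves escape code points (Python's `surrogateescape` error handler) can. The helper name `round_trips` is made up for this example, and the sample bytes are the second row of the 2 6 $ 'cఝa' table discussed in this thread.

```python
def round_trips(raw: bytes, errors: str) -> bool:
    """Decode raw as UTF-8 with the given error handler, re-encode, compare."""
    text = raw.decode("utf-8", errors=errors)
    return text.encode("utf-8", errors=errors) == raw

# 224 176 157 is a complete UTF-8 sequence; the trailing 224 is a stray lead byte.
row = bytes([224, 176, 157, 97, 99, 224])

print(round_trips(row, "replace"))          # stray byte becomes U+FFFD, original is lost
print(round_trips(row, "surrogateescape"))  # stray byte is kept as a lone surrogate
```

With `replace` the stray byte is replaced and cannot be recovered; with `surrogateescape` the bytes survive, at the cost of carrying code points that a font engine cannot draw.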

The display of an invalid sequence depends on the front-end's font
engine.

I hope you can bear with it.

Sun, 23 Apr 2017, robert therriault wrote:

Thanks Bill,

I thought it may be related and I also suspected that it might not be an easy 
fix.

It does result in some strange situations where a UTF-8 sequence 224 176 157 is 
interpreted one way in the first row of an array and differently in the second 
row. I suppose that is the nature of UTF-8 shards. It is a messy business.

      <2 6  $ 'cఝa'
┌──────┐
│cఝac  │
│à°acà│
└──────┘
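
For anyone who wants to see the bytes behind this, here is a rough Python sketch of the same 2 by 6 reshape. The rendering rule in `render_row` (decode a row as UTF-8 only if the whole row is valid, otherwise show one character per byte) is my assumption about what might be happening, not something confirmed about JE; both function names are made up for the example.

```python
from itertools import cycle, islice

def reshape_bytes(text, rows, cols):
    """Cyclically fill a rows x cols table with the UTF-8 bytes of text."""
    data = text.encode("utf-8")
    flat = bytes(islice(cycle(data), rows * cols))
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def render_row(row):
    """Assumed rule: valid rows decode as UTF-8, invalid rows one char per byte."""
    try:
        return row.decode("utf-8")
    except UnicodeDecodeError:
        return row.decode("latin-1")

for row in reshape_bytes("cఝa", 2, 6):
    print(list(row), repr(render_row(row)))
```

Both rows contain the complete sequence 224 176 157, but the second row ends in a stray 224, so under the assumed rule the whole row falls back to byte-per-character display. That matches the à ° ... à pattern in the box above, where byte 157 maps to a non-printing control character.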

Cheers, bob

On Apr 23, 2017, at 12:05 AM, bill lam wrote:

_3 s: 2 5  $ 'cb鲨a'

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm


