The feature was there because a co-author (bill) was lazy.
A better user experience would be possible if illegal sequences
were skipped or replaced by a space or another special character.
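
To make the difference concrete, here is a rough model of the two
behaviours. It is written in Python, not J, and it is not the code J
uses; the function names are made up for illustration. It assumes that
"byte by byte" means each byte becomes the code point of the same value
(which is what Latin-1 decoding does), and it uses U+FFFD as the
replacement character, as Paul suggested.

```python
# Python model of the behaviour discussed in this thread.
# Not J's implementation; function names are hypothetical.


def decode_all_or_bytes(raw: bytes) -> str:
    """Whole-vector fallback: decode as UTF-8 if the vector is well formed,
    otherwise map every byte to the code point of the same value."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps byte n to code point n, i.e. a byte-by-byte conversion.
        return raw.decode("latin-1")


def decode_replacing(raw: bytes) -> str:
    """Alternative: keep the valid parts and put U+FFFD where bytes are illegal."""
    return raw.decode("utf-8", errors="replace")


# Second row of the reshaped example: the trailing 224 is a truncated character.
row = bytes([224, 176, 157, 97, 99, 224])

fallback = decode_all_or_bytes(row)
print(list(fallback.encode("utf-8")))
# [195, 160, 194, 176, 194, 157, 97, 99, 195, 160]

repaired = decode_replacing(row)
print([hex(ord(ch)) for ch in repaired])
# ['0xc1d', '0x61', '0x63', '0xfffd']
```

The first result matches the damaged round trip quoted below; in the
second only the truncated final byte is replaced and the valid
character in front of it survives.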


Tue, 09 May 2017, Paul Jackson wrote:
> Thanks, that explains a lot. Careful reading of your message made me
> realize it is a feature.  If the characters are not UTF8, the code assumes
> you wanted to display bytes.
> 
> It should be simple to write a _7 u: which inserted 16bfffd wherever
> 7 u: would signal a domain error.
> 
> On Sun, May 7, 2017, 10:39 PM bill lam <[email protected]> wrote:
> 
> > As far as I can recall, it works this way:
> > if a rank-1 character vector is malformed utf8,
> > it converts the whole vector (not just the
> > illegal characters) byte by byte. In your
> > example the second row
> >   224 176 157 97 99 224
> > is malformed because of the last 224, so it
> > is converted to unicode in this way
> >      7 u: 224 176 157 97 99 224
> > à° acà
> >
> > After blanks are inserted for formatting, it is converted
> > back to utf8
> >      a. i. 8 u: 7 u: 224 176 157 97 99 224
> > 195 160 194 176 194 157 97 99 195 160
> >
> > So the round trip doesn't look pretty. Ideally
> > it should convert only the illegal subsequence, in
> > this case the last 224 to 195 160, to repair it
> >
> >    7 u: a.{~224 176 157 97 99 195 160
> > ఝacà
> >    a.i. 8 u: 7 u: a.{~224 176 157 97 99 195 160
> > 224 176 157 97 99 195 160
> >
> >
> > Mon, 08 May 2017, Paul Jackson wrote:
> > > I know the previous discussion concluded this wasn't worth fixing, but I've
> > > been looking at the causes of damaged output. I've confirmed that the
> > > visible appearance of truncated UTF8 characters is due to the environment.
> > >    a.i. v0=. 'cఝa'
> > > 99 224 176 157 97
> > >    224 { a.
> > > ʀ
> > >    224 176{a.
> > > ఊ
> > >
> > > However, you can also see faults in what default format provides. As an APL
> > > developer, I assume it shares code with default output. My tests suggest
> > > embedded failures are due to J.
> > >    a.i.": <2 6$ v0
> > > 16 26 26 26 26 26 26 18 32 32 32 32
> > > 25 99 224 176 157 97 99 32 32 25 32 32
> > > 25 195 160 194 176 194 157 97 99 195 160 25
> > > 22 26 26 26 26 26 26 24 32 32 32 32
> > >
> > > Note that there is no 195 160 in the text, and it seems 224 176 157 has
> > > become 195 160 194 176 194 157.  A further example of this behaviour is
> > > shown with the next unicode character.
> > >    a.i. v1=. 'cఞa'
> > > 99 224 176 158 97
> > >    a.i.": <2 6$ v1
> > > 16  26  26   26   26   26   26  18  32  32  32  32
> > > 25  99 224 176 158  97  99  32  32  25  32  32
> > > 25 195 160 194 176 194 158  97  99 195 160  25
> > > 22  26  26  26  26  26  26  24  32  32  32  32
> > >
> > > While it should be possible to fix these internal mistakes, there cannot be
> > > a safe way to use verbs like
> > > # $ { {. }. } |. |:
> > > on UTF8 values, so I still don't know if it is worth fixing.
> > >
> > > However, running these tests made me realize default format converts
> > > everything to UTF8. While the characters are not damaged by reshape, some
> > > rows of enclosed arrays will end in blanks.
> > >    a.i.": <2 4$7 u: v0
> > > 16  26  26  26  26  18  32  32  32  32
> > > 25  99 224 176 157  97  99  25  32  32
> > > 25 224 176 157  97  99 224 176 157  25
> > > 22  26  26  26  26  24  32  32  32  32
> > > --
> > >
> > > Paul
> > > 650-766-1863
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm
> >
> > --
> > regards,
> > ====================================================
> > GPG key 1024D/4434BAB3 2008-08-24
> > gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
> > gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
