Pascal says:
> J stores the full 8 bits of binary data. There very well may be a great
> reason not to provide a friendly display for binary data, I just don't see
> it yet.
Yes there is, and it has to do with the nature of the utf-8 standard. Before
unicode came along, J did indeed display (255 { a.), say, as a single glyph:
y-umlaut (ÿ, if that shows up on your screen!).
There are many ways J could have "supported" unicode. In fact J uses two
distinct ways: wide-characters and utf-8. The second way, utf-8, is used by
the session window and edit window in j602 (I'm less familiar with JQt and
JHS, but I guess they do likewise), by a lot of other popular software,
and -- most important! -- by inter-application copy/paste. It's a very
popular standard, because you can keep all your old ascii-based code --
most of it still works.
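
To make the two representations concrete, here is a small session sketch. The names w and b are just mine for illustration, and I'm assuming the documented behaviour of 4 u: (code points to wide chars), 8 u: (to utf-8 bytes) and 3!:0 (datatype code); I haven't checked every J version.

```j
   w =: 4 u: 104 105 255     NB. wide characters built from code points: h i ÿ
   b =: 8 u: w               NB. the same text encoded as utf-8 bytes
   (3!:0 w) , 3!:0 b         NB. datatype codes: 131072 is wide char, 2 is byte char
131072 2
   (# w) , # b               NB. three characters, but four bytes
3 4
   a. i. b                   NB. h and i are unchanged; ÿ became the pair 195 191
104 105 195 191
```

The two ascii characters come through as the same single bytes, which is why old ascii-based code mostly keeps working.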
But there's a cost to using utf-8. As soon as your display software meets a
"superascii" byte (what you're calling "extended ascii") it needs to
interpret it as part of a multi-byte code representing a "unicode code
point": either its first byte or one of its continuation bytes. It detects a
"superascii" by the leading bit of its byte code being 1.
In consequence, display software can't both use utf-8 and treat a
superascii as a single glyph. So in order to use utf-8, J is forced to
abandon the ability to display (128{a.) to (255{a.) in one of the old
obsolete standards like latin-1.
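
A rough sketch of that leading-bit test, in case it helps. The name bytes is only for illustration, and I build the utf-8 pair for ÿ from its byte values (195 191) so the example doesn't depend on how your editor encodes the source.

```j
   bytes =: a. i. 'A' , 195 191 { a.   NB. 'A' followed by the utf-8 encoding of ÿ
   bytes
65 195 191
   127 < bytes                         NB. 1 where the leading bit is set (superascii)
0 1 1
   {."1 (8#2) #: bytes                 NB. the same test, reading the top bit directly
0 1 1
   192 <: bytes                        NB. 1 only for the byte that starts a multi-byte code
0 1 0
```

So the display software sees 195 as the start of a two-byte code and 191 as its continuation. Neither byte is free to stand for a latin-1 glyph of its own.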
If you want to display a superascii such as (255{a.) as a single latin-1
glyph, you can still do so (in j602 at least) like this:
   u: 255{a.
ÿ
That works ok because the old latin-1 character set has been imported into
unicode as a proper "codespace". Raul gave you the official reference to it:
http://www.unicode.org/charts/PDF/U0080.pdf
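
If you want to see what is going on underneath, something like the following should do it. The name c is only for illustration, and I'm assuming 3 u: returns code points, 8 u: encodes to utf-8 and 7 u: decodes utf-8, as the vocabulary page for u: describes.

```j
   c =: u: 255 { a.          NB. byte 255 promoted to a wide character
   3 u: c                    NB. its code point is still 255, as in latin-1
255
   a. i. 8 u: c              NB. but written out as utf-8 it takes two bytes
195 191
   3 u: 7 u: 195 191 { a.    NB. reading those two bytes back gives one code point
255
```

In other words the latin-1 numbering survives as code points 128 to 255, but only the wide-character form keeps one atom per character.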
It's hard to understand the ins and outs of unicode unless you call
everything by its right name, in particular using these words rigorously:
glyph, grapheme, char, codespace, code point. So few people do!
http://www.jsoftware.com/jwiki/Guides/UnicodeGettingStarted#Superasciis_and_utf8-encoding
tries to explain unicode and utf-8 in beginners' terms. But even there I
see a mistake: it occasionally uses the word "glyph" when it should use
"grapheme".
Still, it's a start.
On Fri, Mar 28, 2014 at 4:11 PM, Pascal Jasmin <[email protected]> wrote:
> J stores the full 8 bits of binary data. There very well may be a great
> reason not to provide a friendly display for binary data, I just don't see
> it yet.
>
> On another note, there seems to be a case for an extra dyad form for u: .
> Say 9 u:
>
> It would behave as monad u: does for char and wchar, but for any other
> argument type (including integers) would return ] y. I understand it not
> being a priority since this can be implemented by users with 3!:0 checking.
>
>
> ----- Original Message -----
> From: Raul Miller <[email protected]>
> To: Programming forum <[email protected]>
> Cc:
> Sent: Friday, March 28, 2014 10:22:18 AM
> Subject: Re: [Jprogramming] font with extended ascii? -display binary data
>
> "Normal ascii" occupies only 7 bits, so it's 128{.a. (or u: i.128).
>
> The problems created by what to do with the other half of the byte (along
> with our love/hate relationship with standards and professionalism) have a
> lot to do with why we are using ascii instead of ebcdic.
>
> Thanks,
>
> --
> Raul
>
>
>
> On Fri, Mar 28, 2014 at 11:04 AM, Pascal Jasmin <[email protected]> wrote:
>
> > thank you Raul,
> >
> > On further thought, it appears to be impractical to use larger than base
> > 128 for binary encoding.
> >
> > A friendlier display of my numeric list compression routine is possible
> > through u:
> >
> > BASE128 =: BASE64 , a.{~ 192 + i.64
> >
> >
> > u: compresslistnum 1000000239482039420348x 2 248 +"1 i. 3 3
> > bN5o8ÒÁDâïA BÀ ýA
> > bN5o8ÒÁDâïà DA þÀ
> > bN5o8ÒÁDâðÀ EÀ AÀA
> >
> >
> >
> > There is a formatting problem displaying boxed unicode data. Is there any
> > chance that normal ascii could display as above for codes 192+? or boxed
> > unicode could line up?
> >
> > and
> >
> > BASE128 i. 'bN5o8ÒÁDâðÀ EÀ AÀA'
> > 27 13 57 40 60 67 128 67 128 3 67 128 67 128 67 128 128 4 67 128 128 0 67 128 0
> >
> > basically show that all of the extended characters are not found in
> > BASE128 but
> >
> > a. i. 'bN5o8ÒÁDâðÀ EÀ AÀA'
> > 98 78 53 111 56 195 146 195 129 68 195 162 195 176 195 128 32 69 195 128 32 65 195 128 65
> >
> > shows that 2 characters are embedded for extended chars (195 x), and
> > intermixed with single codes.
> >
> > Worth noting is that the extended characters display in my html email
> > client.
> >
> >
> >
> >
> > ----- Original Message -----
> > From: Raul Miller <[email protected]>
> > To: Programming forum <[email protected]>
> > Cc:
> > Sent: Friday, March 28, 2014 9:07:15 AM
> > Subject: Re: [Jprogramming] font with extended ascii? -display binary data
> >
> > There's http://www.unicode.org/charts/PDF/U0080.pdf
> >
> > But it's not an informal page.
> >
> > 240-248 corresponds to the rightmost column (the one with the caption 00F),
> > and the top half of that column (00F0 through 00F8 in the small print at
> > the bottom of each cell).
> >
> > Thanks,
> >
> > --
> > Raul
> >
> >
> >
> > On Fri, Mar 28, 2014 at 9:48 AM, Pascal Jasmin <[email protected]> wrote:
> >
> > > Jqt uses menlo as default font. Printing binary data over 127 all produce
> > > identical "not found" glyphs. Is it a font issue? and is there a fixed
> > > width font that would display extended ascii as this list (or as much of it
> > > as possible)? iso-latin 1? Is there some informal code page that shows a
> > > printable character for every (or 240-248) binary value(s)?
> > >
> > > http://www.danshort.com/ASCIImap/
> > >
> > >
> > > A related question is wd edit will not display the prettier line drawing
> > > (box character set) symbols even when the font is set to Menlo. Is there a
> > > workaround for that?
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm