Re: [Jchat] APL character support (moved to Chat)

Björn Helgason Fri, 28 Feb 2014 00:59:50 -0800

What you can easily do is create names for the signs you want to use.

If you want, you could use NB. and/or some Unicode picture to replace that
name, and toggle between the three representations.
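
A minimal sketch of the naming idea. The names rho and iota are only
illustrative assumptions; they are not predefined in J:

```j
   rho  =: $     NB. hypothetical name for shape/reshape (APL ⍴)
   iota =: i.    NB. hypothetical name for integers (APL ⍳)
   2 3 rho iota 6
0 1 2
3 4 5
```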


Unicode pictures include chess pieces, polar bears, etc.

-
Björn Helgason
gsm:6985532
skype:gosiminn
On 27.2.2014 21:57, "Skip Cave" <[email protected]> wrote:

> Just my two cents worth...
>
> As an old APL (occasional) programmer, I always wanted a way to flip a
> switch in the J editor and turn J's 2-character primitives into APL
> characters (where appropriate), and either leave J's unique verbs alone,
> have the community decide on an appropriate single glyph, or let me pick a
> symbol for those myself. Then I could always flip that switch in the editor
> back, and see the actual J code, any time I wanted.
>
> For me, it was never about how many characters I had to type. It was about
> what I saw, when I looked at the code. IMHO, the APL single glyphs just
> made the functionality of programs much easier to grasp as I read through
> them.
>
> If I am entering code and the switch is in APL mode, I could just type
> the actual J 2-character primitives, and the one-character APL symbol would
> appear on the screen.
>
> When sending code around, I can always send the normal ASCII J
> representation (like sending the compiled binaries of a program), and the
> receiver of the code would have the option of looking at the J code in its
> native form, or viewing the APL-like symbols.
>
> I'm sure this plan has many (undiscovered by me) flaws, but it is my
> dream...
>
> Skip
>
>
>
> Skip Cave
> Cave Consulting LLC
>
>
> On Thu, Feb 27, 2014 at 1:03 PM, Don Guinn <[email protected]> wrote:
>
> > This discussion started out on using APL characters as executable in J.
> > I'm not sure I would want to make many equivalences between APL symbols
> > and J primitives; however, representing APL characters and international
> > characters gets into the way J handles these characters with the
> > character types literal, unicode and UTF-8.
> >
> > Those not interested can bail out now, as the rest is kind of boring,
> > but it is my soapbox.
> >
> > About the time mini-computers and personal computers became common, 7-bit
> > ASCII was a well-established standard. But by this time computers had
> > standardized on 8 bits to the character. This extra bit allowed for
> > supporting international characters while still fitting in a byte. In
> > addition, APL used those extra characters to support APL characters. But
> > this led to confusion since those characters varied between countries and
> > systems.
> >
> > Unicode was created to attempt to clean this mess up. It took the 7-bit
> > ASCII and a fairly accepted version of 8-bit extended ASCII (ISO 8859-1,
> > also called Latin-1) and added leading zeros up to 32 bits. Now there is
> > all kinds of room to support many languages in a compatible manner.
> >
> > Enter UCS Transformation Format, in particular UTF-8. There are many
> > problems with Unicode, as it made ASCII files much larger and slower to
> > send over slow communications lines. And there is the endian issue
> > between different computers. UTF-8 is an ingenious technique to compress
> > unicode in a manner that is completely compatible with 7-bit ASCII. The
> > endian problem is eliminated. It is not compatible with 8-bit ASCII
> > extensions. 7-bit ASCII text looks identical to UTF-8 text. The 8-bit
> > ASCII extension text does not. Those characters become two bytes each
> > using the UTF-8 compression algorithm.
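> >
> > A minimal sketch of that byte-level behavior, as an illustration only.
> > It uses 4 u: to build unicode from code points and 8 u: to convert
> > unicode to UTF-8:
> >
> > ```j
> >    a. i. 8 u: 4 u: 97 254   NB. 'a' keeps its one ASCII byte; þ (254) becomes two
> > 97 195 190
> > ```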
> >
> > J converts literal to unicode by simply putting a zero byte in front,
> > extending it to the 16-bit version of Unicode implemented in Windows
> > and Unix. This is perfectly valid as the numeric values of the first 256
> > Unicode characters match the 8-bit ASCII extension. UTF-8 assumes that
> > the _128{.a. characters in literal are used in the compression algorithm,
> > and that they do not represent extended ASCII. But J treats UTF-8 as
> > literal, making it impossible to tell if those characters represent
> > extended ASCII or UTF-8 compression.
> >
> > UTF-8 is a compressed version of Unicode that J fits in literal. J treats
> > literal as 8-bit extended ASCII when combining and converting to/from
> > unicode (wide). It treats literal as UTF-8 when entered from the keyboard
> > and displayed. Got a bit of an inconsistency here.
> >
> >    U =: 7 u: u =: 'þ'
> >    3!:0 u   NB. u is literal
> > 2
> >    3!:0 U   NB. U is unicode
> > 131072
> >    #u       NB. u takes 2 atoms
> > 2
> >    #U       NB. U takes 1 atom
> > 1
> >    'abc',u  NB. ASCII literals catenate with UTF-8
> > abcþ
> >    'abc',U  NB. ASCII literals catenate with unicode
> > abcþ
> >    u,U      NB. UTF-8 literals do not catenate well with unicode
> > Ã¾þ
> >    a.i.u,U  NB. Here we have þ in two forms
> > 195 190 254
> >
> > So, when programming in J one must never mix UTF-8 and unicode without
> > being extremely careful and aware of what can happen. It is easiest to
> > use ASCII and UTF-8 together. Not a problem as one cannot get any unicode
> > into J without specifically converting to unicode using u: .
> >
> > The alternative is to make sure all text that might contain UTF-8 is
> > converted to unicode. That can be difficult at times.
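> >
> > A sketch of that alternative, reusing u and U from the session above and
> > using 3 u: to show the code point held in each atom:
> >
> > ```j
> >    # (7 u: u) , U        NB. convert the UTF-8 side first: one atom per þ
> > 2
> >    3 u: (7 u: u) , U     NB. both atoms now hold code point 254
> > 254 254
> > ```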
> >
> > The trouble with mixing ASCII and UTF-8 is that J primitives work on the
> > atoms of literal. Any UTF-8 bytes are treated as 8-bit extended ASCII.
> > Counting characters and reshaping fail with UTF-8. Searching for UTF-8
> > characters is harder. An example of character counting failing with
> > UTF-8 is the display of boxed literals.
> >
> >    <u
> > +--+
> > |þ|
> > +--+
> >
> > Notice that þ is treated as two characters but displays as one.
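> >
> > The same mismatch shows up with other primitives. A small sketch, again
> > using u from the session above:
> >
> > ```j
> >    a. i. |. u     NB. reversing literal swaps the two bytes: no longer valid UTF-8
> > 190 195
> >    # 7 u: u       NB. after conversion, þ is a single atom
> > 1
> > ```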
> >
> > I choose to make sure everything that might contain UTF-8 is run through
> > 7 u: which will convert it to unicode if it contains any UTF-8, or leave
> > it literal otherwise. Now all the J primitives work as expected. A
> > character fits in an atom. I never worry about the possibility of UTF-8
> > characters being garbled. When I'm through, I simply convert my final
> > result back to UTF-8.
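> >
> > That workflow can be sketched as a small verb. The name rev is
> > hypothetical, chosen only for illustration; it converts to unicode,
> > reverses, and converts back:
> >
> > ```j
> >    rev =: 8&u: @: |. @: (7&u:)   NB. UTF-8 in, work on unicode, UTF-8 out
> >    a. i. rev 'a' , u             NB. u is the two-byte þ from the session above
> > 195 190 97
> > ```
> >
> > The two bytes of þ stay together because the reversal happens on unicode
> > atoms rather than on literal bytes.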
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
