Don asked:
> Do you want unicode characters to be
> considered letters to be parts of
> names or do you want to treat them as
> primitives?
Let's call the set of Unicode characters which can legitimately compose
identifiers ID. I want contiguous sequences of ID characters to be considered
a single word. I want pure ASCII characters to be treated as they are today.
I want characters which are neither in ID nor in ASCII to produce a spelling
error (as non-ASCII characters do today).
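To make the three cases concrete, here is a minimal sketch in Python (not J, and not JPP itself). The names split_runs, is_id_char and SpellingError are hypothetical, and Python's str.isidentifier is only a rough stand-in for the set ID, which is not pinned down here.

```python
class SpellingError(ValueError):
    """Raised for a character that is neither ASCII nor in ID (hypothetical name)."""


def is_id_char(ch):
    # Assumption: a non-ASCII character belongs to ID if Python would accept it
    # as a continuation character of an identifier.
    return ord(ch) > 127 and ("a" + ch).isidentifier()


def split_runs(sentence):
    """Group a sentence into contiguous runs tagged 'ascii' or 'id'."""
    runs = []
    for ch in sentence:
        if ord(ch) < 128:
            kind = "ascii"      # treated exactly as today
        elif is_id_char(ch):
            kind = "id"         # part of a single Unicode word
        else:
            raise SpellingError("spelling error at %r" % ch)
        if runs and runs[-1][0] == kind:
            runs[-1] = (kind, runs[-1][1] + ch)   # extend the current run
        else:
            runs.append((kind, ch))               # start a new run
    return runs


print(split_runs("xxøabc+de"))
# [('ascii', 'xx'), ('id', 'ø'), ('ascii', 'abc+de')]
```

The ASCII runs would be handed on unchanged to ordinary J word formation; only the 'id' runs need special treatment.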
> xxøabc+de . How should the ø be treated?
By default, that sentence would be treated exactly like J treats xx xn_pda
abc+de . If xn_pda (AKA ø) hadn't been previously defined, you'd get a value
error.
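The spelling xn_pda is consistent with Punycode (RFC 3492), under which ø encodes to pda. Assuming that is the intended scheme, the rewrite can be sketched in Python as below. The function names are hypothetical, and for brevity this sketch treats every non-ASCII run as a word without checking membership in ID.

```python
import re

# One or more consecutive non-ASCII characters.
NON_ASCII_RUN = re.compile(r"[^\x00-\x7f]+")


def to_ascii_word(word):
    # Assumption: Punycode with an "xn_" prefix, by analogy with the "xn--"
    # prefix used for internationalized domain names.
    return "xn_" + word.encode("punycode").decode("ascii")


def translate(sentence):
    """Replace each non-ASCII run with its ASCII name, set off by spaces."""
    return NON_ASCII_RUN.sub(
        lambda m: " " + to_ascii_word(m.group(0)) + " ", sentence
    )


print(translate("xxøabc+de"))   # xx xn_pda abc+de
```

A run made up only of non-ASCII characters encodes to lowercase letters and digits, so the result is always spelled like an ordinary J name.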
If I'm still not being clear, read on. Let's say my utility/framework is
called the J pre-processor, or JPP. It is intended to allow users to (easily)
implement simple J- or APL-like languages. JPP itself is not a language;
applications of JPP are languages. JUICE is an example of a language that
could be implemented using JPP.
Now, executing a sentence of a programming language consists of two major
steps: lexing (spelling, word formation, rhematics) and parsing (interpreting,
grammar, syntax). JPP is mostly concerned with lexing; it lets its users
concentrate on writing the parser/interpreter for their language (which is the
interesting, and difficult, part).
But, for several reasons, it will be helpful to provide a default parser along
with the lexer. One, it will help me develop & debug JPP. Two, it'll provide
an example/reference implementation for JPP users. Three, using ". as the
default parser already has its uses (e.g. knocking up simple J-like languages
or exploring extensions of the J language).
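The division of labour between lexer and parser can be sketched as a two-stage pipeline in which the parser is a replaceable argument. This is only an illustration in Python: run, lex, default_parser and word_list_parser are hypothetical names, the lexer assumes a Punycode-style xn_ encoding, and default_parser merely returns the ASCII sentence that would be handed to ". rather than executing any J.

```python
import re


def lex(sentence):
    """Lexing stage: rewrite non-ASCII words as ASCII names (assumed encoding)."""
    def ascii_name(match):
        return " xn_" + match.group(0).encode("punycode").decode("ascii") + " "
    return re.sub(r"[^\x00-\x7f]+", ascii_name, sentence)


def default_parser(ascii_sentence):
    # Stand-in for ". : returns the sentence instead of executing it.
    return ascii_sentence


def word_list_parser(ascii_sentence):
    # A trivially different parser, to show that the stage is replaceable.
    return ascii_sentence.split()


def run(sentence, parser=default_parser):
    """Lex first, then hand the pure-ASCII result to whichever parser is chosen."""
    return parser(lex(sentence))


print(run("xxøabc+de"))                           # xx xn_pda abc+de
print(run("xxøabc+de", parser=word_list_parser))  # ['xx', 'xn_pda', 'abc+de']
```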
So the short story is, to JPP ø is just a word; it is neither a primitive nor a
name. It is the parser which makes that distinction. All such (Unicode) words
will be converted to the (ASCII) form xn_... which, to the default parser (".),
look like names. Of course, ". (being J) will treat undefined names as verbs so
long as their value isn't needed, and will balk with a value error when their
value is needed. Those who apply JPP with its default parser can decide to
pre-define certain names, so that they never produce value errors.
Let's use JUICE and your sentence xxøabc+de as an example. JUICE is a very
J-like language, so it probably makes sense to use the default parser, ". . If
Alan, using JPP to write JUICE, pre-defined a word ø (AKA xn_pda=: ... or JPP
'ø=: ...'), then to users of JUICE, ø would look (feel, sound, & smell) like a
primitive. If he didn't define that word, then users of JUICE would perceive
it as an "available name" (and when they used it without defining it, they'd
get a value error, like they'd expect). JPP could even provide a mechanism to
lock the definitions of such primitives (so if Alan defined ø=: ... users of
JUICE couldn't redefine it; see [1]).
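As a rough model of that behaviour (pre-defined words acting like primitives, undefined words giving a value error, locked words refusing redefinition), here is a small Python sketch. The class and exception names are hypothetical, and the locking shown is just one possible mechanism, not necessarily the one described in [1].

```python
class NameValueError(Exception):
    """Models J's value error for a name that has no definition."""


class LockedNameError(Exception):
    """Raised when a locked (primitive-like) name is redefined."""


class Names:
    def __init__(self):
        self._values = {}
        self._locked = set()

    def define(self, name, value, lock=False):
        if name in self._locked:
            raise LockedNameError(name + " is locked")
        self._values[name] = value
        if lock:
            self._locked.add(name)

    def value(self, name):
        try:
            return self._values[name]
        except KeyError:
            raise NameValueError("value error: " + name) from None


names = Names()
names.define("xn_pda", "language-supplied definition", lock=True)  # acts like a primitive
names.define("xn_other", "user definition")                        # an ordinary name
names.define("xn_other", "user redefinition")                      # allowed: not locked
```

With this setup, asking for the value of a name that was never defined raises NameValueError, and a second names.define("xn_pda", ...) raises LockedNameError.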
Of course, sophisticated JPP users will want to swap out the default parser
for something custom, and that's when trace.ijs will come in handy.
-Dan
[1] Locking definitions
http://www.jsoftware.com/jwiki/DanBron/Temp/SingleAssignmentJ