I think we need to restrict the set of Unicode characters that are valid
in identifiers somehow.

For example, Unicode whitespace characters probably should not be allowed
in variable names:

https://www.cs.tut.fi/~jkorpela/chars/spaces.html

Or, for example, right-to-left marks should not be in variable names:

https://en.wikipedia.org/wiki/Right-to-left_mark

(Also, we should be aware that we currently do not support all of
Unicode - we only [roughly] support the UCS-2 characters.)
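
As a rough sketch of the kind of restriction I mean (in Python rather than J, only because its unicodedata module exposes the Unicode general categories; the function names, the rejected-category list, and the bmp_only flag are all my own assumptions, not anything in Unbox or jsource):

```python
import unicodedata

# Assumption: general categories to reject inside names.
# Zs/Zl/Zp are space and line/paragraph separators, Cf is format
# characters (the right-to-left mark is one), Cc is control characters,
# Cn is unassigned, Co is private use, Cs is surrogates.
REJECTED_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc", "Cn", "Co", "Cs"}


def is_name_char(ch, bmp_only=True):
    """Return True if the single character ch may appear in a name."""
    cp = ord(ch)
    if cp < 128:
        # ASCII: letters, digits and underscore only.
        return ch.isalnum() or ch == "_"
    if bmp_only and cp > 0xFFFF:
        # Outside the range UCS-2 can represent.
        return False
    return unicodedata.category(ch) not in REJECTED_CATEGORIES


def is_valid_name(name, bmp_only=True):
    """Per-character check only; J's rules about leading digits and
    trailing underscores are deliberately not modelled here."""
    return len(name) > 0 and all(is_name_char(c, bmp_only) for c in name)
```

With something like this, π and π_1 would still be accepted, while a name containing a no-break space or a right-to-left mark would be rejected. Which categories belong in the rejected set is exactly the part that would need discussion.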

Thanks,

-- 
Raul



On Sun, Jun 12, 2016 at 4:30 PM, Marshall Lochbaum <mwlochb...@gmail.com> wrote:
> Unbox has code to allow Unicode identifiers in J, with the following
> rules:
>
> - All code must be UTF-8. Invalid UTF-8 causes a spelling error.
> - Any non-ASCII character is treated as alphabetic. Identifiers can use
>   these characters freely.
>
> This is completely backwards-compatible with existing J, and allows us
> to use things like Greek characters and code in other languages:
>
>    π
> |value error: π
>    π =: 1p1
>    π
> 3.14159
>    π_1
> |value error: π_1
>
> What do people think about this? Should it be added to jsource? Should
> the rules be changed for some characters?
>
> Marshall
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm