I have this feature in jhc, where I have a 'trailing' character class that can appear at the end of both symbols and ids.
Currently it consists of

$trailing = [₀₁₂₃₄₅₆₇₈₉⁰¹²³⁴⁵⁶⁷⁸⁹₍₎⁽⁾₊₋]

    John

On Sat, Jun 14, 2014 at 7:48 AM, Mikhail Vorozhtsov <mikhail.vorozht...@gmail.com> wrote:
> Hello lists,
>
> As some of you may know, GHC's support for Unicode characters in lexemes is
> rather crude and hence prone to inconsistencies in their handling versus the
> ASCII counterparts. For example, APOSTROPHE is treated differently from
> PRIME:
>
> λ> data a +' b = Plus a b
> <interactive>:3:9:
>     Unexpected type ‘b’
>     In the data declaration for ‘+’
>     A data declaration should have form
>     data + a b c = ...
> λ> data a +′ b = Plus a b
>
> λ> let a' = 1
> λ> let a′ = 1
> <interactive>:10:8: parse error on input ‘=’
>
> Also some rather bizarre looking things are accepted:
>
> λ> let ᵤxᵤy = 1
>
> In the spirit of improving things little by little I would like to propose:
>
> 1. Handle single/double/triple/quadruple Unicode PRIMEs the same way as
> APOSTROPHE, meaning the following alterations to the lexer:
>
> primes  -> U+2032 | U+2033 | U+2034 | U+2057
> symbol  -> ascSymbol | uniSymbol (EXCEPT special | _ | " | ' | primes)
> graphic -> small | large | symbol | digit | special | " | ' | primes
> varid   -> (small { small | large | digit | ' | primes }) (EXCEPT reservedid)
> conid   -> large { small | large | digit | ' | primes }
>
> 2. Introduce a new lexer nonterminal "subsup" that would include the Unicode
> sub/superscript[1] versions of numbers, "-", "+", "=", "(", ")", Latin and
> Greek letters. And allow these characters to be used in names and operators:
>
> symbol  -> ascSymbol | uniSymbol (EXCEPT special | _ | " | ' | primes | subsup)
> digit   -> ascDigit | uniDigit (EXCEPT subsup)
> small   -> ascSmall | uniSmall (EXCEPT subsup) | _
> large   -> ascLarge | uniLarge (EXCEPT subsup)
> graphic -> small | large | symbol | digit | special | " | ' | primes | subsup
> varid   -> (small { small | large | digit | ' | primes | subsup }) (EXCEPT reservedid)
> conid   -> large { small | large | digit | ' | primes | subsup }
> varsym  -> (symbol (EXCEPT :) {symbol | subsup}) (EXCEPT reservedop | dashes)
> consym  -> (: {symbol | subsup}) (EXCEPT reservedop)
>
> If this proposal is received favorably, I'll write a patch for GHC based on
> my previous stab at the problem[2].
>
> P.S. I'm CC-ing Cafe for extra attention, but please keep the discussion to
> the GHC users list.
>
> [1] https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts
> [2] https://ghc.haskell.org/trac/ghc/ticket/5108
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users@haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

--
John Meacham - http://notanumber.net/
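
For illustration, here is a minimal Haskell sketch of the "trailing" idea described at the top of this message: a fixed set of sub/superscript characters that may only appear as a suffix of a symbol or id. This is not jhc's actual lexer code. The names isTrailing, splitTrailing and validTrailing are hypothetical, and the rule that the body must be non-empty is an assumption made for the example.

```haskell
import Data.List (dropWhileEnd)

-- The $trailing class exactly as quoted in the message.
trailingChars :: String
trailingChars = "₀₁₂₃₄₅₆₇₈₉⁰¹²³⁴⁵⁶⁷⁸⁹₍₎⁽⁾₊₋"

-- True for characters in the trailing class.
isTrailing :: Char -> Bool
isTrailing c = c `elem` trailingChars

-- Split a lexeme into its body and its run of trailing characters.
splitTrailing :: String -> (String, String)
splitTrailing s = (body, drop (length body) s)
  where
    body = dropWhileEnd isTrailing s

-- Hypothetical well-formedness check (an assumption, not jhc's rule):
-- the body is non-empty and trailing characters occur only at the end.
validTrailing :: String -> Bool
validTrailing s = not (null body) && not (any isTrailing body)
  where
    (body, _) = splitTrailing s

main :: IO ()
main = do
  print (validTrailing "x₁₂")  -- subscripts only at the end
  print (validTrailing "x₁y")  -- subscript in the middle
  print (validTrailing "+₊")   -- operator body with a trailing subscript plus
```

A real lexer would apply the body check with the ordinary varid/conid/varsym rules and accept the trailing run afterwards; the sketch only shows the suffix restriction itself.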