On Fri, Mar 16, 2012 at 6:00 PM, Malcolm Wallace <malcolm.wall...@me.com> wrote:
>>> no purpose to a completely overlapping category unless it is intended to
>>> relate to an earlier standard (say Haskell 1.4).
>
> I believe all Haskell Reports, even since 1.0, have specified that the 
> language "uses" Unicode.  If it helps to bring perspective to this 
> discussion, it is my impression that the initial designers of Haskell did not 
> know very much about Unicode, but wanted to avoid the trap of being stuck 
> with ASCII-only, and so decided to reference "whatever Unicode does", as the 
> most obvious and unambiguous way of not having to think about (or specify) 
> these lexical issues themselves.
>

OK.

>> One of the underlying questions is: what is the concrete syntax of a
>> Unicode character in a Haskell program?  Note that Chapter 2 goes to
>> great pains to specify the ASCII concrete syntax.
>
> In my view, the Haskell Report is deliberately agnostic on concrete syntax 
> for Unicode, believing that to be outside the scope of a programming language 
> standard, whilst entirely within the scope of the Unicode standards body.

The trouble is that the Unicode standards body believes concrete syntax
to be entirely within the scope of the programming language definition
(or of any other client using Unicode characters), and largely restricts
itself to talking about code points, which are more abstract.  So the
trick of referencing the Unicode standard is not satisfactory :-(
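
To make the gap between code points and concrete representation tangible,
here is a minimal sketch (mine, not taken from the Report or from the
Unicode standard) that maps one abstract code point to two different
concrete forms.  The names utf8Bytes and utf16Units are hypothetical
helpers; only ord, shiftR, (.&.) and (.|.) come from base.  Surrogate code
points are deliberately not rejected, to keep the sketch short.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word16, Word8)

-- | Encode a single code point as UTF-8 bytes.
-- Hypothetical helper for illustration; does not reject surrogates.
utf8Bytes :: Char -> [Word8]
utf8Bytes c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- Leading bits of the code point, placed in the first byte.
    hi s   = fromIntegral (n `shiftR` s)
    -- A continuation byte: 10xxxxxx carrying six payload bits.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- | Encode a single code point as UTF-16 code units.
-- Hypothetical helper for illustration; does not reject surrogates.
utf16Units :: Char -> [Word16]
utf16Units c
  | n < 0x10000 = [fromIntegral n]
  | otherwise   = [ 0xD800 + fromIntegral (m `shiftR` 10)
                  , 0xDC00 + fromIntegral (m .&. 0x3FF) ]
  where
    n = ord c
    m = n - 0x10000
```

The same code point U+03BB comes out as two bytes under utf8Bytes and as
one 16-bit unit under utf16Units; neither choice is dictated by the code
point itself, which is the part a language definition would have to pin down.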

> Seeing as there are (in practice) numerous concrete representations of 
> Unicode (UTF-8 and other encodings), it is largely up to individual compiler 
> implementations which encodings they support for (a) source text, and (b) 
> input/output at runtime.

OK, thanks!  I guess one takeaway from this discussion is that what
counts as punctuation is far less well defined than it appears...

A common practice (exemplified by the link I gave earlier) is to restrict the
concrete -syntax- of the input program to the ASCII charset, and to use Unicode
escape sequences to reach the rest of the Unicode charset.  It is common to use
\uNNNN or \UNNNNNNNN to introduce Unicode characters, but I suspect that is
out of the question for Haskell programs because it would clash with
lambda abstraction.
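
A small sketch of the clash, as I understand it: outside literals a
backslash already introduces a lambda, so \u reads as "a lambda binding a
variable named u".  Inside character and string literals, Haskell 2010
does have numeric escapes (\x followed by hex digits, or a backslash
followed by decimal digits), but those do not help with identifiers or
operators.  The names addOne, lambdaHex, lambdaDec and term below are
made up for illustration.

```haskell
-- Outside a literal, "\u" starts a lambda abstraction over a variable u,
-- so a C-style \uNNNN escape in program text would be ambiguous.
addOne :: Int -> Int
addOne = \u -> u + 1

-- Inside literals, Haskell 2010 numeric escapes name a code point
-- while keeping the source file pure ASCII.
lambdaHex, lambdaDec :: Char
lambdaHex = '\x3BB'   -- hexadecimal escape for U+03BB
lambdaDec = '\955'    -- decimal escape for the same code point

-- The empty escape \& ends a numeric escape, so the following 'a'
-- is a separate character and not another hex digit.
term :: String
term = "\x3BB\&a"
```

Note that term has two characters; without the \& the literal would
denote the single code point U+3BBA instead.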

-- Gaby

_______________________________________________
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime
