Simon Marlow wrote:
> Quick quiz: how many Haskell lexemes are represented by the following
> sequences of characters?
>
> 1) M.x
> 2) M.let
> 3) M.as
> 4) M..
> 5) M...
> 6) M.!
>
> answers:
>
> 1) 1. This is a qualified identifier.
We all know what M.x means, but recently I wondered how the
report ensures this. I'm afraid it doesn't.
Of course, there is section "5.5.1 Qualified names" saying:
  A qualified name is written as modid.name. Since qualifier names are
  part of the lexical syntax, no spaces are allowed between the
  qualifier and the name. Sample parses are shown below.
[I guess "qualifier names" should be "qualified names".]
But this seems to be an explanation, not additional information.
The second sentence seems to say that M.x is a lexeme, since lexemes
are the fundamental items of lexical analysis.
(Section "2.2 Lexical Program Structure": At each point, the longest
possible lexeme satisfying the lexeme production is read, using a
context-independent deterministic lexical analysis ...)
And if it weren't a lexeme, we'd really be in trouble, because:
  Any kind of whitespace is also a proper delimiter for lexemes.
Still, it isn't. It surely is a qvarid, but lexeme is defined like
this:
  lexeme -> varid | conid | varsym | consym
          | literal | special | reservedop | reservedid
A varid is unqualified, and M.x is none of the other alternatives
either.
So maybe this should be:
  lexeme -> qvarid | qconid | qvarsym | qconsym
          | literal | special | reservedop | reservedid
And then I guess we should have qtyc{on,ls} -> qconid.
Am I terribly missing something?
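
One way to picture the amended production is the rough Haskell sketch
below. Since the report defines qvarid as [modid .] varid, the
qualified forms already include the unqualified ones, so nothing is
lost by swapping them in. All type and constructor names here are made
up for illustration; none of them come from the report.

```haskell
-- Hypothetical names throughout; this only mirrors the shape of the
-- proposed production, it is not taken from the report.
type ModId = String

-- The four kinds of name that may carry a module qualifier.
data Name
  = VarId String
  | ConId String
  | VarSym String
  | ConSym String
  deriving (Eq, Show)

data Lexeme
  = QName (Maybe ModId) Name  -- qvarid | qconid | qvarsym | qconsym
  | Literal String
  | Special Char
  | ReservedOp String
  | ReservedId String
  deriving (Eq, Show)

-- Render a lexeme back to source text. A qualified name is written
-- without spaces around the dot.
render :: Lexeme -> String
render (QName q n)    = maybe "" (++ ".") q ++ nameText n
render (Literal s)    = s
render (Special c)    = [c]
render (ReservedOp s) = s
render (ReservedId s) = s

nameText :: Name -> String
nameText (VarId s)  = s
nameText (ConId s)  = s
nameText (VarSym s) = s
nameText (ConSym s) = s
```

In this picture M.x is the single value QName (Just "M") (VarId "x"),
and a plain x is QName Nothing (VarId "x").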
> 2) 3. 'let' is a keyword, which excludes this string
> from being a qualified identifier.
That's really ugly. I never thought about such things.
Good that you finally uncovered it.
> 3) 1. 'as' is a "specialid", not a "reservedid".
>
> 4) 1. This is a qualified symbol.
>
> 5) 2. '..' is a reserved operator, so a qualified symbol
> is out. The sequence '...' is a valid operator and
> according to the maximal munch rule this must be
> the second lexeme.
>
> 6) 1. '!' is a "specialop", not a "reservedop".
>
>
> I especially like case 5 :-)
Yes, it's amazing! Why didn't you go on? Is M.... a qualified symbol?
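
To make the reading behind the quiz concrete, here is a rough Haskell
sketch of a tokenizer. It assumes that after "M." the lexer takes the
longest run of identifier or symbol characters, and falls back to
unqualified lexemes when that run is reserved. That is only one
interpretation of the report, not the definitive one. All names
(Token, munch, qualify, tokens) are made up for this sketch, and it
ignores literals, comments, special characters and the varsym/consym
distinction.

```haskell
import Data.Char (isAlpha, isAlphaNum, isUpper)

-- Token classes, loosely named after the report's nonterminals.
data Token
  = Qualified String String  -- module name and the name it qualifies
  | ConId String
  | VarId String
  | VarSym String
  | ReservedId String
  | ReservedOp String
  deriving (Eq, Show)

reservedIds, reservedOps :: [String]
reservedIds =
  [ "case", "class", "data", "default", "deriving", "do", "else", "if"
  , "import", "in", "infix", "infixl", "infixr", "instance", "let"
  , "module", "newtype", "of", "then", "type", "where", "_" ]
reservedOps = ["..", ":", "::", "=", "\\", "|", "<-", "->", "@", "~", "=>"]

isIdChar, isSymChar :: Char -> Bool
isIdChar c  = isAlphaNum c || c == '_' || c == '\''
isSymChar c = c `elem` "!#$%&*+./<=>?@\\^|-~:"

-- Maximal munch: the longest run of identifier or symbol characters
-- at the head of the input, together with the remaining input.
munch :: String -> Maybe (String, String)
munch s@(c : _)
  | isAlpha c || c == '_' = Just (span isIdChar s)
  | isSymChar c           = Just (span isSymChar s)
munch _ = Nothing

classify :: String -> Token
classify name
  | name `elem` reservedIds = ReservedId name
  | name `elem` reservedOps = ReservedOp name
classify name@(c : _)
  | isUpper c   = ConId name
  | isSymChar c = VarSym name
classify name = VarId name

-- Try to read ".name" after a conid; fail if the name is reserved.
qualify :: String -> String -> Maybe (Token, String)
qualify modid ('.' : s)
  | ConId _ <- classify modid
  , Just (name, rest) <- munch s
  , unreserved (classify name) = Just (Qualified modid name, rest)
qualify _ _ = Nothing

unreserved :: Token -> Bool
unreserved (ReservedId _) = False
unreserved (ReservedOp _) = False
unreserved _              = True

tokens :: String -> [Token]
tokens [] = []
tokens s@(_ : rest) =
  case munch s of
    Nothing -> tokens rest  -- skip whitespace and anything unsupported
    Just (name, after) ->
      case qualify name after of
        Just (tok, after') -> tok : tokens after'
        Nothing            -> classify name : tokens after
```

With these assumptions, map (length . tokens) over the six quiz
strings gives 1, 3, 1, 1, 2 and 1, matching the quoted answers, and
"M...." comes out as a single token, the operator "..." qualified by M.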
> This is pretty bogus. I suggest as a fix for Haskell 2 that all of the
> above be treated as 1 lexeme, i.e. qualified identifiers/symbols.
But what would M.let mean? Module M can't define let, neither this way:
  M.let = ...    -- qualified name not allowed
nor that way:
  let = ...      -- let is reserved
However, 'let' does mean something in module M, so a strange option is
to let 'M.let' mean 'let'.
Should we just disallow it?
There is still another problem in the report.
Section "2.3 Comments" says:
  A nested comment begins with the lexeme "{-" ...
There is no such lexeme.
We'd need lexeme -> ... | opencom
What does M.-- mean?
All the best,
Christian Sievers
--
Freeing Software is a good beginning. Now how about people?