Re: The dreaded layout rule

1999-08-12 Thread Andreas Rossberg

Simon Marlow wrote:
 
  Does it mean that the following expressions would be illegal?
 
  if cond then do proc1; proc2 else do proc3; proc4
  (case e of Just x - x  0; Nothing - False)
 
 Unfortunately, yes.
 
  Now one can forget about {} and use layout everywhere. He would no
  longer be able to forget or he would have to split some expressions
  into indented lines, even when they are unambiguous in one line.

You could just enumerate all keywords that allow/enforce insertion of }.
A suitable list for Haskell 98 might be:

in
where
)
]
module
type
data
newtype
class
instance
default

In fact I think that this would be the cleanest and simplest rule. (At
least that is how I once implemented layout similar to Haskell's,
because I couldn't get Yacc's error productions to work properly in all
cases).

For Haskell 2(000) I would suggest removing all but the first 4 tokens
from the list above.

- Andreas

-- 
Andreas Rossberg, [EMAIL PROTECTED]

:: be declarative. be functional. just be. ::





Re: The dreaded layout rule

1999-08-03 Thread Christian Sievers

I wrote:

 lexeme  - qvarid | qconid | qvarsym | qconsym
  | literal | special | reservedop | reservedid 
 
 Now we could replace qvarsym and qconsym by qop, and have both
 examples parse in the same way. However, unlike the other change in
 lexeme's definition, I don't suggest this, I only want to point out
 that there is a (formally) simple way out of the present somewhat
 inconsistent state.

I changed my mind about this issue, I do suggest to change it as
proposed, for if `elem` were three lexemes, any whitespace between
them would be allowed. This might even be considered a typo, as I
think no one intended to allow expressions like

 x ` {- look ma -} elem   -- comments inside!
   ` l 


All the best,
Christian Sievers





Re: The dreaded layout rule

1999-08-03 Thread Guy Lapalme

As an author of an Haskell Emacs mode that deals with the layout rule
(described in Journal of Functional Programming 8(5) 493-502), I
strongly agree that the "parse-error condition" is really a bad idea.
For example, in Emacs, no full Haskell parse is done.

After all, layout should be there to indicate clearly to a user what
section of code depends on which other; the user should not have to parse
and deal with some local fixity declarations.

I know this suggestion would break a few Haskell programs but perhaps
it should be interesting to come back to the first functional language
that implemented the layout rule, Miranda (tm) where the rule was much
more simply stated

  Syntactic objects obey Landin's offside rule.  This requires that
  every token of the object lie directly or to the right of its first
  token.  A token which breaks this rule is said to offside with
  respect to that object.

And that's it... no need to have three pages of explanations and an
appendix.  

One can find many examples where Haskell rules and Miranda
differ and some times one is better than the other, but you would be
surprised to see that in the majority of the cases, the indentation
that people normally produce are very similar under both rules.

Guy Lapalme
Université de Montréal

PS: as a Quiz, can you guess how in Haskell the following is interpreted?

f x = 1 + x
 g y = 1 + y







RE: The dreaded layout rule

1999-08-01 Thread Manuel M. T. Chakravarty

Simon Marlow [EMAIL PROTECTED] wrote,

  Does anybody disagree with my interpretation of the standard?  Are
  there any implementations that actually follow the standard here?
  (Maybe the standard should be changed to follow the implementations in
  this area.)
 
 Phew.  Well spotted.  Of course, none of the existing Haskell
 implementations are in conformance here.
 
 I think this has just about convinced me that the parse-error condition is a
 really bad idea.  

It definitely is.

 The main reason for its inclusion was to allow things like
 
   let f x = x in ...
 
 and also to automatically insert the final '}' before the end of file.
 Perhaps the layout rule should be restricted to these two cases?
 
 Proposal:
 
   - replace t by 'in' in the parse-error rule.

 EOF is already handled by the last clause in the layout spec.  My guess is
 that this would break very few programs.

The problem with this fix is that layout for let-in and
other grouping constructs would be handled differently,
which is not very intuitive.  The problem with the
`do'-notation also appears for `case'-constructs.

 A simpler rule might involve automatically inserting '}' before 'in' during
 lexical analysis iff (a) we're in a layout context and (b) the close brace
 hasn't already been inserted by the layout rule.  This would decouple the
 parser and lexer which is a Good Thing.

Still not intuitive, but from a syntactical point of view
much more preferable.  But you want to change (b) to ``the
last lexeme was no close brace'' (it may have been
explicitly inserted by the programmer.

IMHO, in the long run (= next version of Haskell) this rule
should completely vanish.  So, who is collecting the
proposals for the next Haskell standard?

Cheers,

Manuel





RE: The dreaded layout rule

1999-07-31 Thread Simon Marlow

 Does it mean that the following expressions would be illegal?
 
 if cond then do proc1; proc2 else do proc3; proc4
 (case e of Just x - x  0; Nothing - False)

Unfortunately, yes.

 Now one can forget about {} and use layout everywhere. He would no
 longer be able to forget or he would have to split some expressions
 into indented lines, even when they are unambiguous in one line.
 
 
 Hmm, the `do x == y == z' case is a real trouble. Would it be not
 too ugly to formalize the current common behavior as something like
 "for the purposes of layout resolution, the syntax does not care
 about fixity declarations"? I guess that treating them in any way at
 this stage, as long as they don't reject non-associative operators,
 would yield the same result... Ugly but practical.

One other possible solution is to remove the fixity resolution from the
grammar itself and describe it as a separate process post-parsing.  This is
probably a good thing anyway: it matches the way most implementations work
and it would clean up the grammar.

Cheers,
Simon





Re: The dreaded layout rule

1999-07-30 Thread Wolfram Kahl

Simon Peyton-Jones [EMAIL PROTECTED] writes:
  
   In other words, it is a bug (and GHC and Hugs don't do it
   right - see my previous message; from your comment, I
   presume HBC also doesn't follow the definition).  I think,
   the only Right Thing is to remove this awful rule (unless
   somebody comes up with a rule that can be decided locally).
  
  Maybe so.  But (H98 editors hat on) this is more than a "typo".

I am surprised!  ;-)  

  It's a Haskell 2  issue.  Perhaps there will be no fully conforming 
  H98 compilers!

Perhaps it would be a reasonable Haskell 1.6 issue?


Wolfram





RE: The dreaded layout rule

1999-07-30 Thread Simon Peyton-Jones


 In other words, it is a bug (and GHC and Hugs don't do it
 right - see my previous message; from your comment, I
 presume HBC also doesn't follow the definition).  I think,
 the only Right Thing is to remove this awful rule (unless
 somebody comes up with a rule that can be decided locally).

Maybe so.  But (H98 editors hat on) this is more than a "typo".  
It's a Haskell 2  issue.  Perhaps there will be no fully conforming 
H98 compilers!

Simon





Re: The dreaded layout rule

1999-07-30 Thread Malcolm Wallace

| How about the Carl Witty's
| 
|   do a == b == c
| 
| does NHC handle this correctly?

It matches ghc and Hugs, reporting
Error when renaming:
  Infix operator at 2:21 is non-associative.

Note that this is reported one stage *after* parsing.  Because parsing
of infix operators is difficult, all implementations (to my knowledge)
leave resolution of fixity and associativity until later.  Indeed, the
Haskell 98 standard recognises this (in an oblique way) by permitting
infix decls to appear *after* the first use.  Hence, it is now
impossible to resolve fix/assoc in a single pass anyway.

Regards,
Malcolm






Re: The dreaded layout rule

1999-07-30 Thread Lennart Augustsson

Malcolm Wallace wrote:

 Because parsing
 of infix operators is difficult, all implementations (to my knowledge)
 leave resolution of fixity and associativity until later.  Indeed, the
 Haskell 98 standard recognises this (in an oblique way) by permitting
 infix decls to appear *after* the first use.

Hbc does the resolution while parsing, which means it cannot be
Haskell 98 compliant.  I really dislike the new rules about where infix
can occur since they don't really buy us anything, but they do force
a more complicated implementation.

(Just as an aside.  Local infix declarations, which was just
added to Haskell 98, was very high on the list of design mistakes
in SML and will be removed in ML2000.)


--

-- Lennart








RE: The dreaded layout rule

1999-07-30 Thread Simon Marlow


 Does anybody disagree with my interpretation of the standard?  Are
 there any implementations that actually follow the standard here?
 (Maybe the standard should be changed to follow the implementations in
 this area.)

Phew.  Well spotted.  Of course, none of the existing Haskell
implementations are in conformance here.

I think this has just about convinced me that the parse-error condition is a
really bad idea.  The main reason for its inclusion was to allow things like

let f x = x in ...

and also to automatically insert the final '}' before the end of file.
Perhaps the layout rule should be restricted to these two cases?

Proposal:

- replace t by 'in' in the parse-error rule.
 
EOF is already handled by the last clause in the layout spec.  My guess is
that this would break very few programs.

A simpler rule might involve automatically inserting '}' before 'in' during
lexical analysis iff (a) we're in a layout context and (b) the close brace
hasn't already been inserted by the layout rule.  This would decouple the
parser and lexer which is a Good Thing.

Cheers,
Simon





Re: The dreaded layout rule

1999-07-30 Thread Marcin 'Qrczak' Kowalczyk

Fri, 30 Jul 1999 05:12:51 -0700, Simon Marlow [EMAIL PROTECTED] pisze:

 The main reason for its inclusion was to allow things like
 
   let f x = x in ...
 
 and also to automatically insert the final '}' before the end of file.
 Perhaps the layout rule should be restricted to these two cases?

Does it mean that the following expressions would be illegal?

if cond then do proc1; proc2 else do proc3; proc4
(case e of Just x - x  0; Nothing - False)

Now one can forget about {} and use layout everywhere. He would no
longer be able to forget or he would have to split some expressions
into indented lines, even when they are unambiguous in one line.


Hmm, the `do x == y == z' case is a real trouble. Would it be not
too ugly to formalize the current common behavior as something like
"for the purposes of layout resolution, the syntax does not care
about fixity declarations"? I guess that treating them in any way at
this stage, as long as they don't reject non-associative operators,
would yield the same result... Ugly but practical.

-- 
 __("Marcin Kowalczyk * [EMAIL PROTECTED] http://kki.net.pl/qrczak/
 \__/  GCS/M d- s+:-- a22 C++$ UL++$ P+++ L++$ E-
  ^^W++ N+++ o? K? w(---) O? M- V? PS-- PE++ Y? PGP-+ t
QRCZAK  5? X- R tv-- b+++ DI D- G+ e h! r--%++ y-






RE: The dreaded layout rule

1999-07-30 Thread Manuel M. T. Chakravarty

Malcolm Wallace [EMAIL PROTECTED] wrote,

[...]
 Simon Marlow replies:
 
  GHC and Hugs both make use of yacc-style error recovery, albeit in a very
  limited form.
 
 And nhc uses parser combinators, which give you backtracking on error
 conditions for free.  We actually do almost all layout processing at
 the lexical stage, but where the parser expects a } and doesn't get
 one, we just insert the }, and re-lex the remaining input.  I suppose
 having to re-lex is a bit of a chore, but laziness comes to the rescue
 somewhat.

How about the Carl Witty's

  do a == b == c

does NHC handle this correctly?

Manuel






RE: The dreaded layout rule

1999-07-29 Thread Simon Marlow


Manuel Chakravarty writes:

 What kind of implementation did the originators of this
 clause envision?  If the layout rule is really implemented
 as a filter between the scanner and the parser, it seems
 extremely awkward to add a dependency on the error condition
 of the parser - in particular, it makes a functional, ie,
 side-effect free implementation rather hard and a true two
 phase implementation impossible.  So, I guess (I hope!!) 
 there is a nifty trick that lets you achieve the same effect
 by using only conditions depending on local information
 (either during layout processing or by letting the parser
 insert the missing braces).

GHC and Hugs both make use of yacc-style error recovery, albeit in a very
limited form.  The idea is to have a production in your grammar like this:

close_brace :   '}'
| error

where the '}' token is assumed to have been inserted by the lexical analyser
as a result of layout (i.e. a token was found to be less indented than the
current layout context).  The error case fires if any other token is
encountered, and the semantic action for this production will probably pop
the current layout context and carry on (in practice you also have to tell
yacc not to continue with error recovery, otherwise all sorts of strange
things happen).  Take a look at GHC's parser for the details.

I believe you're right in that a true two-phase implementation of the
Haskell grammar is impossible.  This is consistent with Haskell's policy of
making life easy for programmers and hard for compiler writers :)

Cheers,
Simon





The dreaded layout rule

1999-07-29 Thread Manuel M. T. Chakravarty

One of our students just pointed out an IMHO rather
problematic clause in the layout rule.  In Section 2.7 of
the Haskell 98 Report it says,

  A close brace is also inserted whenever the syntactic
  category containing the layout list ends; that is, if an
  illegal lexeme is encountered at a point where a close
  brace would be legal, a close brace is inserted.

And in B.3, we have in the first equation of the definition
of `L',

  L (t:ts) (m:ms) = } : (L (t:ts) ms)   if parse-error(t)  (Note 1)

where Note 1 says,

  The side condition parse-error(t) is to be interpreted as
  follows: if the tokens generated so far by L together with
  the next token t represent an invalid prefix of the
  Haskell grammar, and the tokens generated so far by L
  followed by the token } represent a valid prefix of the
  Haskell grammar, then parse-error(t) is true.

What kind of implementation did the originators of this
clause envision?  If the layout rule is really implemented
as a filter between the scanner and the parser, it seems
extremely awkward to add a dependency on the error condition
of the parser - in particular, it makes a functional, ie,
side-effect free implementation rather hard and a true two
phase implementation impossible.  So, I guess (I hope!!) 
there is a nifty trick that lets you achieve the same effect
by using only conditions depending on local information
(either during layout processing or by letting the parser
insert the missing braces).

Cheers,

Manuel





Re: The dreaded layout rule

1999-07-29 Thread Ian Holyer

This is a multi-part message in MIME format.
--F93F7E72348E2F23CC7D1D40
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Manuel says

 One of our students just pointed out an IMHO rather problematic clause in the  
layout rule ... So, I guess (I hope!!) there is a nifty trick that lets you
 achieve the same effect by using only conditions depending on local
 information ...

Attached is Haskell code which handles the layout rule reasonably well as a
separate pass between scanning and parsing (though it is Haskell 1.4 rather
than 98 and imperfect).
-- 
Ian[EMAIL PROTECTED], http://www.cs.bris.ac.uk/~ian
--F93F7E72348E2F23CC7D1D40
Content-Type: text/plain; charset=us-ascii; name="Layout.hs"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="Layout.hs"

{--
LAYOUT ANALYSIS

The layout function deals with the layout conventions of Haskell 1.4, inserting
extra "{", ";" and "}" tokens to represent implicit blocks.  The inserted
tokens are marked as implicit, and are inserted as early as possible in the
token stream, in order to promote well-phrased and well-positioned error
messages in case of trouble.  The layout function never fails; it is left up to
a parser to detect errors.

The Haskell standard says that a block (or layout list) is terminated "whenever
the syntactic category containing the layout list ends, that is, if an illegal
lexeme is encountered at a point where a close brace would be legal".  This can
only be implememented easily if layout processing is combined with parsing.
Here, layout processing is done separately, so an approximation to the standard
is achieved by keeping track of brackets.  See the end of this file for
examples where the layout function deviates from the standard.

ISSUES TO BE RESOLVED

-- "case" terminated by "where" may be common enough to make a special case
-- "case" terminated by "," may be worth dealing with
-- check that "let"s inside "do" (which don't have "in") are handled OK
-- check explicit blocks, and their interaction with implicit ones

Ian Holyer  @(#) $Id: Layout.hs,v 1.2 1998/10/26 15:18:39 ian Exp $
--}

module Layout (layout) where

import Haskell
import Lex

-- Start layout processing.  If the source does not begin with "module" or "{",
-- then there is an implicit surrounding block.  Here and elsewhere, a
-- lookahead past possible comments is done so that a token can be inserted
-- before the comments if necessary; also, the end-of-file token makes it
-- unnecessary to check for an empty token stream.

layout :: [Token] - [Token]
layout ts =
   if s == "module" then comments ++ scanExplicit [] (tok:rest) else
   openBlock [] (Tok "}" 1 0 Implicit) ts
   where
   comments = takeWhile isComment ts
   tok @ (Tok s r c k) : rest = dropWhile isComment ts

-- A stack of tokens is used to keep track of the surrounding blocks.  For each
-- block, its opening "{" token is pushed onto the stack.  In an implicit
-- block, the brackets "(",")" and "[","]" and "case","of" and "let","in" and
-- "if","then","else" are tracked by putting the opening bracket on the stack
-- until the matching closing bracket is found.  Each opening bracket is stored
-- on the stack with the indent for the current block in place of its actual
-- column.

type Stack = [Token]

-- Scan the source tokens while in an explicit block (or while not in any
-- blocks) when layout is inactive.  Look for an explicit close block token, or
-- a keyword which indicates the beginning of a new block. Treat field selector
-- as an explicit block.

scanExplicit :: Stack - [Token] - [Token]
scanExplicit stack [] = []
scanExplicit stack (t@(Tok s r c k) : ts1) = case s of
   "}" - t : closeBlock stack t ts1
   "where" - t : openBlock stack t ts1
   "let" - t : openBlock stack t ts1
   "do" - t : openBlock stack t ts1
   "of" - t : openBlock stack t ts1
   "{" - openBlock stack undefined (t:ts1)
   _ - t : scanExplicit stack ts1

-- Scan the source tokens while in an implicit block, with layout active.  The
-- parameters are the stack, the last token dealt with, and the remaining
-- tokens.  The block is terminated by indenting or by a suitable closing
-- bracket.  Treat field selector as an explicit block.

scanImplicit :: Stack - Token - [Token] - [Token]
scanImplicit stack@(Tok bs br bc bk : stack1) last@(Tok ls lr lc lk) ts =
   if c  bc || k == EndToken then close else
   if c == bc  r  lr then newline else
   case s of
  "where" - open
  "let" - pushopen
  "do" - open
  "of" - popopen "case"
  "in" - pop "let"
  "(" - push
  ")" - pop "("
  "[" - push
  "]" - pop "["
  "case" - push
  "if" - push
  "then" - poppush "if"
  "else" - pop "then"
  "}" - close
  "{" - 

RE: The dreaded layout rule

1999-07-29 Thread Malcolm Wallace

Manuel Chakravarty writes:
 
 What kind of implementation did the originators of this
 clause envision?  If the layout rule is really implemented
 as a filter between the scanner and the parser, it seems
 extremely awkward to add a dependency on the error condition
 of the parser - in particular, it makes a functional, ie,
 side-effect free implementation rather hard and a true two
 phase implementation impossible.

Simon Marlow replies:

 GHC and Hugs both make use of yacc-style error recovery, albeit in a very
 limited form.

And nhc uses parser combinators, which give you backtracking on error
conditions for free.  We actually do almost all layout processing at
the lexical stage, but where the parser expects a } and doesn't get
one, we just insert the }, and re-lex the remaining input.  I suppose
having to re-lex is a bit of a chore, but laziness comes to the rescue
somewhat.

Regards,
Malcolm






Re: The dreaded layout rule

1999-07-29 Thread Mike Thyer

If the scanning stage pairs the tokens it returns with
their positions, then scanning can be done once before
parsing begins.  I've done this with a parser implemented
with parser combinators, these combinators then decide
whether or not to accept a token based on which token
it is and how far it is indented.  I think this means
the grammar being parsed is a context sensitive one,
since the state of the parser is represented by more than
just a single stack.  We need a stack telling us what to
do next, and a stack of indentation levels, although
the way in which these stacks grow and shrink is related
they could not be replaced by a single stack, so the
grammar is not context free.

Now that I write this, I think that we could combine these 
stacks as a stack of stacks, although this isn't how I did it.  
I don't think this satisfies the requirements for a context free 
grammar (CFG) but I don't have a definition to hand at the moment.

Mike

Simon Marlow wrote:
 I believe you're right in that a true two-phase implementation of 
 the Haskell grammar is impossible.  This is consistent with Haskell's 
 policy of making life easy for programmers and hard for compiler 
 writers :)