Re: [Haskell-cafe] The Layout Rule
Hi Michael,

Michael D. Adams wrote:
> I am looking for background material on how GHC and other Haskell
> compilers implement the layout rule.

In the context of our work on syntactic extensibility, we have implemented a declarative and extensible mechanism to specify and implement layout rules. A paper about the approach is currently under review, and a draft is available [1]. The implementation and evaluation data are available [2].

[1] http://sugarj.org/layout-parsing.pdf
[2] http://github.com/seba--/layout-parsing

We used our parser in the implementation of SugarHaskell, a syntactically extensible variant of Haskell. A paper about SugarHaskell is currently under review, and again, a draft is available [3]. The implementation can be installed as an Eclipse plugin from the SugarJ website [4]. A command-line version is forthcoming.

[3] http://sugarj.org/sugarhaskell.pdf
[4] http://sugarj.org/

Best Regards,
Tillmann (on behalf of the SugarJ team)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] The Layout Rule
I am looking for background material on how GHC and other Haskell compilers implement the layout rule. Are there any papers, documentation, commentary, etc. that discuss the actual implementation of this rule (even if only a paragraph or two)?

I've already looked at the parsing code in GHC and UHC. Do any other Haskell compilers have interesting approaches for implementing the layout rule?

I am writing a paper about a new formalism for indentation-sensitive languages and I want to ensure I've covered the appropriate background material on existing implementations of the layout rule.

Michael D. Adams
mdmko...@gmail.com
RE: New Layout Rule take 2
[EMAIL PROTECTED] wrote:
> I have made some improvements to the algorithm, and I am happy to say
> that with some minor tweaks, it correctly lays out the programs in the
> nofib suite.
>
> The algorithm is not much more complicated than the current one in the
> report, but doesn't have the parse-error rule. It does require a single
> token of lookahead to look for an "in".
>
>   darcs get http://repetae.net/repos/getlaid/
>
> I have also added a mode so it can work as a ghc preprocessor, allowing
> very easy testing. Just compile with
>
>   ghc -pgmF /path/to/getlaid -F --make Main.hs
>
> and it will automatically process all your files.

Nice! I ran the GHC parser tests using your preprocessor, and get 9 failures out of 27 in the should_compile class. Some of these are bogus (problems with the lexer you're using rather than the layout preprocessor). The should_fail class all failed, but that's because column numbers are different in the preprocessed result, so the error messages changed; I'll need to look at these individually.

I've attached a patch that corrects a couple of the failures in the should_compile class.

Cheers,
Simon

Attachment: simonmar.patches
___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime
[Haskell-cafe] New Layout Rule take 2
I have made some improvements to the algorithm, and I am happy to say that with some minor tweaks, it correctly lays out the programs in the nofib suite.

The algorithm is not much more complicated than the current one in the report, but doesn't have the parse-error rule. It does require a single token of lookahead to look for an "in".

  darcs get http://repetae.net/repos/getlaid/

I have also added a mode so it can work as a ghc preprocessor, allowing very easy testing. Just compile with

  ghc -pgmF /path/to/getlaid -F --make Main.hs

and it will automatically process all your files.

Now, it isn't perfect. I can construct pathological examples that the old rule would parse, but this one won't. However, if those examples don't actually occur in practice, then that is not so much of an issue. My program doesn't handle many non-Haskell 98 extensions, but can probably be easily modified to do so.

John

--
John Meacham - ⑆repetae.net⑆john⑈
New Layout Rule
Motivated by some recent discussion, I thought I would explore the possibility of formalizing the Haskell layout rule without the dreaded parse-error clause, as in, one that can be completely handled by the lexer.

Motivated by that, I have written a little program that takes a Haskell file with layout on stdin and spits out one without layout on stdout. It can be gotten here:

  darcs get http://repetae.net/repos/getlaid/

The code is designed to make the layout algorithm completely transparent, so that we might experiment with it. The function layout in 'Layout.hs' is the single and complete layout algorithm and the only thing that need be modified by experimenters.

I have come up with a simple improvement to the algorithm given in the paper that seems to catch a very large number of layouts. Basically, whenever it comes across something that must come in matched pairs, such as ( and ), case and of, or if and then, it pushes a special context onto the stack. When it comes across the closing token, it pops every layout context down to the special context. There is a special case for "in" that causes it to pop only up to the last context created with a let, but not further.
Here is the complete algorithm (with my modification, sans the parse-error rule):

    data Token = Token String | TokenVLCurly String !Int | TokenNL !Int
        deriving(Show)

    data Context = NoLayout | Layout String !Int

    -- the string on 'Layout' and 'TokenVLCurly' is the token that
    -- created the layout, always one of "where", "let", "do", or "of"

    layout :: [Token] -> [Context] -> [Token]
    layout (TokenNL n:rs) (Layout h n':ls)
        | n == n' = semi:layout rs (Layout h n':ls)
        | n > n'  = layout rs (Layout h n':ls)
        | n < n'  = rbrace:layout (TokenNL n:rs) ls
    layout (TokenNL _:rs) ls = layout rs ls
    layout (TokenVLCurly h n:rs) (Layout h' n':ls)
        | n >= n'   = lbrace:layout rs (Layout h n:Layout h' n':ls)
        | otherwise = error "inner layout can't be shorter than outer one"
    layout (TokenVLCurly h n:rs) ls = lbrace:layout rs (Layout h n:ls)
    layout (t@(Token s):rs) ls | s `elem` fsts layoutBrackets = t:layout rs (NoLayout:ls)
    layout (t@(Token s):rs) ls | s `elem` snds layoutBrackets = case ls of
        Layout _ _:ls -> rbrace:layout (t:rs) ls
        NoLayout:ls   -> t:layout rs ls
        []            -> error $ "unexpected " ++ show s
    layout (t@(Token "in"):rs) ls = case ls of
        Layout "let" n:ls -> rbrace:t:layout rs ls
        Layout _ _:ls     -> rbrace:layout (t:rs) ls
        ls                -> t:layout rs ls
    layout (t:rs) ls = t:layout rs ls
    layout [] (Layout _ n:ls) = rbrace:layout [] ls
    layout [] [] = []

    layoutBrackets = [ ("case","of"), ("if","then"), ("(",")"), ("[","]"), ("{","}") ]

Now, there are a few cases it doesn't catch, the hanging case at the end of a guard for instance. I believe this can be solved easily by treating '|' and '=' as opening and closing pairs in lets and wheres, and '|' and '->' as opening and closing pairs in case bodies. It is easy to see which one you are in by looking at the context stack. Commas are trickier and are the only other case I think we need to consider.
I welcome people to experiment and send patches or brainstorm ideas. I have what I believe is a full solution percolating in my head, but am unhappy with it; I am going to sleep on it and see if it crystallizes by morning. In the meantime, perhaps someone can come up with something more elegant for dealing with the remaining cases, or at least find some real programs that this code breaks down on!

(Bug fixes for the lexer and everything are very much welcome. It will probably choke on some ghc extensions that would be trivial to add to the alex grammar.)

John

--
John Meacham - ⑆repetae.net⑆john⑈
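For readers who want to try the posted algorithm without fetching the repository, here is a minimal self-contained sketch. The layout function follows the code in the message; the definitions of semi, lbrace, rbrace, fsts and snds are not shown in the post, so the ones below are assumptions, as are the render helper and the hand-written example token stream (the real lexer's output format may differ).

```haskell
data Token = Token String | TokenVLCurly String !Int | TokenNL !Int
    deriving (Show)

data Context = NoLayout | Layout String !Int

-- Assumed helpers: named in the post, but their definitions are guesses.
semi, lbrace, rbrace :: Token
semi   = Token ";"
lbrace = Token "{"
rbrace = Token "}"

fsts :: [(a, b)] -> [a]
fsts = map fst

snds :: [(a, b)] -> [b]
snds = map snd

layoutBrackets :: [(String, String)]
layoutBrackets = [("case","of"), ("if","then"), ("(",")"), ("[","]"), ("{","}")]

layout :: [Token] -> [Context] -> [Token]
layout (TokenNL n:rs) (Layout h n':ls)
    | n == n' = semi : layout rs (Layout h n':ls)     -- same column: new item
    | n > n'  = layout rs (Layout h n':ls)            -- deeper: continuation line
    | n < n'  = rbrace : layout (TokenNL n:rs) ls     -- shallower: close block
layout (TokenNL _:rs) ls = layout rs ls
layout (TokenVLCurly h n:rs) (Layout h' n':ls)
    | n >= n'   = lbrace : layout rs (Layout h n:Layout h' n':ls)
    | otherwise = error "inner layout can't be shorter than outer one"
layout (TokenVLCurly h n:rs) ls = lbrace : layout rs (Layout h n:ls)
layout (t@(Token s):rs) ls
    | s `elem` fsts layoutBrackets = t : layout rs (NoLayout:ls)
layout (t@(Token s):rs) ls
    | s `elem` snds layoutBrackets = case ls of
        Layout _ _:ls' -> rbrace : layout (t:rs) ls'  -- close blocks inside the bracket
        NoLayout:ls'   -> t : layout rs ls'
        []             -> error $ "unexpected " ++ show s
layout (t@(Token "in"):rs) ls = case ls of
    Layout "let" _:ls' -> rbrace : t : layout rs ls'  -- found the matching let
    Layout _ _:ls'     -> rbrace : layout (t:rs) ls'  -- close inner blocks first
    _                  -> t : layout rs ls
layout (t:rs) ls = t : layout rs ls
layout [] (Layout _ _:ls) = rbrace : layout [] ls
layout [] [] = []

-- Hypothetical helper: show the resulting token stream as one line.
render :: [Token] -> String
render ts = unwords [s | Token s <- ts]

-- Hand-written tokens for the one-liner "let x = 1 in x"; the
-- TokenVLCurly carries the column of the lexeme after "let".
example :: [Token]
example =
    [ Token "let", TokenVLCurly "let" 5, Token "x", Token "=", Token "1"
    , Token "in", Token "x" ]
```

Evaluating render (layout example []) in GHCi gives "let { x = 1 } in x": the "in" token closes the block opened by "let" without any parse-error rule.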
Re: New Layout Rule
On Fri, Dec 08, 2006 at 03:26:30PM +, Ian Lynagh wrote:
> On Fri, Dec 08, 2006 at 02:33:47AM -0800, John Meacham wrote:
> > Motivated by some recent discussion, I thought I would explore the
> > possibility of formalizing the haskell layout rule without the dreaded
> > parse-error clause, as in, one that can be completely handled by the
> > lexer.
>
> There was some discussion about that a while ago on this list, e.g.
> http://www.haskell.org/pipermail/haskell-prime/2006-March/000915.html
> and other subthreads in that thread.
>
> I'd still love to see a replacement which can be a separate phase between
> lexing and parsing, even if it means we need to lay some things out
> differently or tweak other bits of the syntax.

"let" isn't an issue (at least not for the reason specified in that mail). It is taken care of properly in the version I posted. The trick is to annotate each layout context with what caused it to occur. When you reach an "in", rather than popping up to the most recent NoLayout (as you would with a bracket), you pop up to the most recent layout context that was started with a "let". (If such a context doesn't exist, it is a syntax error.)

John

--
John Meacham - ⑆repetae.net⑆john⑈
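The "in" handling described above can be sketched in isolation as a function over the context stack. This is only an illustration of the description, not the code from the repository: closeForIn is a hypothetical name, and treating an unclosed bracket (NoLayout) found before the matching let as an error is an assumption of this sketch.

```haskell
data Context = NoLayout | Layout String Int
    deriving (Show, Eq)

-- | On reaching "in", pop layout contexts up to and including the most
-- recent one opened by "let". Returns how many implicit close braces to
-- emit together with the remaining stack, or Nothing for a syntax error.
closeForIn :: [Context] -> Maybe (Int, [Context])
closeForIn = go 0
  where
    go _ []                      = Nothing          -- no enclosing let at all
    go k (Layout "let" _ : rest) = Just (k + 1, rest)
    go k (Layout _ _     : rest) = go (k + 1) rest  -- close an inner block
    go _ (NoLayout       : _)    = Nothing          -- assumption: bracket still open
```

For example, with a do block nested inside a let inside a where, closeForIn [Layout "do" 8, Layout "let" 4, Layout "where" 0] closes two blocks and leaves the where context on the stack.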
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
On Monday, 6 March 2006 16:52, Malcolm Wallace wrote:
> Daniel Fischer [EMAIL PROTECTED] wrote:
> > > At the beginning of the module, there is _no_ current indentation
> > > level - thus the fourth equation of L applies.
> >
> > I think, the third from last equation of L applies, since
> > "If the first lexeme of a module is _not_ { or module, then it is
> > preceded by {n} where n is the indentation of the lexeme.", so we start
> > L with L ('module':ts) [].
>
> Indeed, and thus, when we get to the end of the first 'where' token, the
> stack of indentation contexts is still empty. Hence my remark about the
> fourth equation.

Aha, I read 'At the beginning of the module' as 'at the very beginning', whereas you meant 'At the beginning, after the module-where'. Sorry to have misunderstood.

> > body -> { impdecls; topdecls }
> >       | { impdecls }
> >       | { topdecls }
> >
> > The first line seems to suggest that import declarations were
> > admissible also after topdecls, but any attempt to place an impdecl
> > after a topdecl leads --fortunately-- to a parse error in hugs and ghc,
> > shouldn't the production be
> >
> > body -> { impdecls }; { topdecls } ?
>
> I think you have mis-read the brace characters as if they were the EBNF
> meta symbols for repetition. They do in fact mean the literal brace
> symbol, which may be explicitly present in the source, or inserted by
> the layout rule. Thus, topdecls must follow impdecls, and be at the same
> indentation level if layout matters.

Ah, damn, fonts are too similar in my browser. And since I've never used explicit braces at the top level, I didn't expect literal brace characters there.

> Regards,
> Malcolm

Thanks,
Daniel

--
"In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt." -- Blair P. Houghton
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
Brian Hulley wrote:
> However I think there is an error in the description of this in section
> 2.7 of the Haskell98 report, which states:
>
>   "If the indentation of the non-brace lexeme immediately following a
>   where, let, do or of is less than or equal to the current indentation
>   level, then instead of starting a layout, an empty list "{}" is
>   inserted, and layout processing occurs for the current level ..."
>
> I dispute the "or equal" in the above statement, since it seems to be
> clearly in contradiction to what is actually being done.

Section 2.7 does say that it is an informal description, so although it is correct, it is not complete. In the case of the module header, the question is really "what is the current indentation level?" (that we must be strictly greater than). The answer can be found in the formal definition of the layout rule in section 9.3. At the beginning of the module, there is _no_ current indentation level - thus the fourth equation of L applies.

Regards,
Malcolm
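To make the reference to section 9.3 concrete, here is a rough sketch of the function L as an executable Haskell function. It is written from the report's equations as I understand them, and it deliberately omits the parse-error(t) equation, which cannot be expressed without the parser. The names Tok and layoutL are made up for this sketch; Angle n and Curly n stand for the report's <n> and {n} annotations.

```haskell
data Tok
    = Lexeme String  -- an ordinary lexeme
    | Angle Int      -- <n>: first lexeme on a line starts at column n
    | Curly Int      -- {n}: lexeme after where/let/do/of starts at column n
    deriving (Show, Eq)

-- | The second argument is the stack of enclosing layout contexts;
-- 0 marks an explicit brace.
layoutL :: [Tok] -> [Int] -> [String]
layoutL (Angle n : ts) (m : ms)
    | m == n = ";" : layoutL ts (m : ms)
    | n < m  = "}" : layoutL (Angle n : ts) ms
layoutL (Angle _ : ts) ms = layoutL ts ms
layoutL (Curly n : ts) (m : ms)
    | n > m = "{" : layoutL ts (n : m : ms)
layoutL (Curly n : ts) []
    | n > 0 = "{" : layoutL ts [n]   -- the "fourth equation": empty stack
layoutL (Curly n : ts) ms = "{" : "}" : layoutL (Angle n : ts) ms
layoutL (Lexeme "}" : ts) (0 : ms) = "}" : layoutL ts ms
layoutL (Lexeme "}" : _) _ = error "parse error: unmatched explicit }"
layoutL (Lexeme "{" : ts) ms = "{" : layoutL ts (0 : ms)
layoutL (Lexeme t : ts) ms = t : layoutL ts ms
layoutL [] [] = []
layoutL [] (m : ms)
    | m /= 0    = "}" : layoutL [] ms
    | otherwise = error "parse error: unclosed explicit {"

-- "module M where" followed by two declarations, both at column 1.
moduleExample :: [Tok]
moduleExample =
    [ Lexeme "module", Lexeme "M", Lexeme "where"
    , Curly 1, Lexeme "data", Lexeme "T", Lexeme "=", Lexeme "A"
    , Angle 1, Lexeme "f", Lexeme "=", Lexeme "1" ]
```

Here unwords (layoutL moduleExample []) yields "module M where { data T = A ; f = 1 }". Because the stack is empty after the module header, a body at column 1 only needs n > 0 to open a block. The same Curly 1 met inside an enclosing context at column 1 would instead fall through to the "{" "}" equation, which is the "less than or equal" case of section 2.7.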
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
Daniel Fischer [EMAIL PROTECTED] wrote:
> > At the beginning of the module, there is _no_ current indentation
> > level - thus the fourth equation of L applies.
>
> I think, the third from last equation of L applies, since
> "If the first lexeme of a module is _not_ { or module, then it is
> preceded by {n} where n is the indentation of the lexeme.", so we start
> L with L ('module':ts) [].

Indeed, and thus, when we get to the end of the first 'where' token, the stack of indentation contexts is still empty. Hence my remark about the fourth equation.

> body -> { impdecls; topdecls }
>       | { impdecls }
>       | { topdecls }
>
> The first line seems to suggest that import declarations were admissible
> also after topdecls, but any attempt to place an impdecl after a topdecl
> leads --fortunately-- to a parse error in hugs and ghc, shouldn't the
> production be
>
> body -> { impdecls }; { topdecls } ?

I think you have mis-read the brace characters as if they were the EBNF meta symbols for repetition. They do in fact mean the literal brace symbol, which may be explicitly present in the source, or inserted by the layout rule. Thus, topdecls must follow impdecls, and be at the same indentation level if layout matters.

Regards,
Malcolm
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
Malcolm Wallace wrote:
> Brian Hulley wrote:
> > However I think there is an error in the description of this in
> > section 2.7 of the Haskell98 report, which states:
> >
> >   "If the indentation of the non-brace lexeme immediately following a
> >   where, let, do or of is less than or equal to the current
> >   indentation level, then instead of starting a layout, an empty list
> >   "{}" is inserted, and layout processing occurs for the current
> >   level ..."
> >
> > I dispute the "or equal" in the above statement, since it seems to be
> > clearly in contradiction to what is actually being done.
>
> Section 2.7 does say that it is an informal description, so although it
> is correct, it is not complete. In the case of the module header, the
> question is really "what is the current indentation level?" (that we
> must be strictly greater than). The answer can be found in the formal
> definition of the layout rule in section 9.3. At the beginning of the
> module, there is _no_ current indentation level - thus the fourth
> equation of L applies.

Thanks. However I do think the fact that there is a special case for the module head would merit a mention in section 2.7, because at the moment it's a bit like looking at a stack of chocolate cookies and defining the top one to be vanilla - it works, but who'd ever have thought of it for themselves just looking at the visual indentation on the screen?

On the subject of 9.3, I'm puzzled by:

  "For the purposes of the layout rule, Unicode characters in a source program are considered to be of the same, fixed, width as an ASCII character. However, to avoid visual confusion, programmers should avoid writing programs in which the meaning of implicit layout depends on the width of non-space characters."

Surely almost all Haskell programs rely on the width of every non-space character being the same as the width of a space (i.e. a monospaced font where one character == one glyph), as in

    let a = 3
        b = 5

I'd suggest that the word "non-space" should be replaced by "multi-glyph", and perhaps there could be a recommendation to avoid the use of multi-glyph characters in the first place (otherwise an editor would have to be smart enough to maintain the correct multi-glyph spaces in the columns under them...)

Regards,
Brian.
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
On Friday, 3 March 2006 19:21, Brian Hulley wrote:
> Brian Hulley wrote:
> > One other thing I've been wanting to ask (not to change! :-)) for a
> > while is: how is the following acceptable according to the rules in
> > the Haskell98 report where "where" is one of the lexemes, which when
> > followed by a line more indented than the line the
> > layout-starting-lexeme is on, should start an implicit block:
> >
> >     module M where
> >     data T = ...   -- not indented!
> >
> > According to my understanding of the layout algorithm, the above code
> > would have to be written:
> >
> >     module M where
> >         data T = ...
> >
> > Can anyone shed some light on what the formal rule is that allows the
> > first (and very useful) way of laying out code to be ok?
>
> The solution (as someone pointed out to me in an email) is that the
> layout block only *finishes* when the current indentation is *less*
> than the indentation of the lines in the layout block (rather than
> *starting* only when the current indentation is *more* than the
> indentation of the line containing the "where" etc).
>
> However I think there is an error in the description of this in section
> 2.7 of the Haskell98 report, which states:
>
>   "If the indentation of the non-brace lexeme immediately following a
>   where, let, do or of is less than or equal to the current indentation
>   level, then instead of starting a layout, an empty list "{}" is
>   inserted, and layout processing occurs for the current level ..."
>
> I dispute the "or equal" in the above statement, since it seems to be
> clearly in contradiction to what is actually being done.
>
> Regards, Brian.

AFAICT, the description in the report is correct, *except for the 'where' in module LayOut where*.

Consider

    module LayOut
    where
    fun x y = bum x y + y 4
      where
    bum x y = y x

a) The module-where is at indentation level 0. This is accepted here, but nowhere else: even if I indent fun and bum, fun's where must be indented further than fun itself.

b) bum's definition is top-level now, but in

    module LayOut
    where
    fun x y = bum x y + y 4
          where
       bum x y = y x

it is local (bum is indented more than fun, but less than where), in perfect accord with the report.

Even

    module LayOut ( fun, bum)
    where
    fun x y = bum x y + y 4
      where
    bum x y = y x

is accepted. So my guess is that layout processing is applied only to the module body, not to the module head, and probably that should be mentioned in the report.

BTW, when I read about layout in the report, this irritated me, too, so thanks for asking.

Cheers,
Daniel

--
"In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt." -- Blair P. Houghton
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
Daniel Fischer wrote:
> On Friday, 3 March 2006 19:21, Brian Hulley wrote:
> > Brian Hulley wrote:
> > > Brian Hulley wrote:
> [snip]
> AFAICT, the description in the report is correct, *except for the
> 'where' in module LayOut where*.
> [snip]
> So my guess is that layout-processing is applied only to the
> module-body, not to the module head and probably that should be
> mentioned in the report.

Thanks - that's quite a relief because my incremental parser absolutely relies on the indentation of a child block being more than that of its parent in the AST...

Perhaps a future incarnation of Haskell could just omit the keyword "where" in the module head to avoid all this confusion.

Also, all the tutorials (and book) I've read only mention the layout rule in passing somewhere deep inside the text and usually give a rather unsatisfactory hand-waving description that omits to mention the special case for "where" in the module head and/or the need for the sub-block to be indented more than the parent block. So I think, depending on what tutorials people have read, and putting this together with the module "where", a lot of confusion is floating about... Perhaps a wiki page is indicated?

Regards,
Brian.
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
Brian Hulley wrote:
> One other thing I've been wanting to ask (not to change! :-)) for a
> while is: how is the following acceptable according to the rules in the
> Haskell98 report where "where" is one of the lexemes, which when
> followed by a line more indented than the line the
> layout-starting-lexeme is on, should start an implicit block:
>
>     module M where
>     data T = ...   -- not indented!
>
> According to my understanding of the layout algorithm, the above code
> would have to be written:
>
>     module M where
>         data T = ...
>
> Can anyone shed some light on what the formal rule is that allows the
> first (and very useful) way of laying out code to be ok?

The solution (as someone pointed out to me in an email) is that the layout block only *finishes* when the current indentation is *less* than the indentation of the lines in the layout block (rather than *starting* only when the current indentation is *more* than the indentation of the line containing the "where" etc).

However I think there is an error in the description of this in section 2.7 of the Haskell98 report, which states:

  "If the indentation of the non-brace lexeme immediately following a where, let, do or of is less than or equal to the current indentation level, then instead of starting a layout, an empty list "{}" is inserted, and layout processing occurs for the current level ..."

I dispute the "or equal" in the above statement, since it seems to be clearly in contradiction to what is actually being done.

Regards,
Brian.
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)
On 28/02/06, Brian Hulley [EMAIL PROTECTED] wrote:
> Why? Surely typing one tab is better than having to hit the spacebar 4
> (or 8) times?
>
> I'm really puzzled here. I've been using tabs to indent my C++ code for
> at least 10 years and don't see the problem. The only problem would be
> if someone mixed tabs with spaces. Since it has to be either tabs only
> or spaces only I'd choose tabs only to save keystrokes.
>
> I suppose though it is always going to be a matter of personal taste...

It's easy to configure most editors (vim and emacs included of course) to treat multiple spaces as if they were tabs, but to only save spaces into your file. This is what I do, as it ensures that the way that the code looks to me in my editor is exactly what it looks like to the compiler. Quite often, if it looks better, I will align things past a tab stop with a few extra spaces (which only has to be done once, if your editor will start the next line at the same indentation as the previous).

- Cale
[Haskell-cafe] Re: Layout rule (was Re: PrefixMap: code review request)
I wrote: I just installed Visual Haskell 0.1, and when I type in the editor, CPU usage rises to about 70% and there's a noticeable delay before each character appears on the screen. This is no longer happening, so I guess I ran afoul of a bug. -- Ben ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
Brian Hulley wrote:
> [snip]
> So any solutions welcome :-)

Thanks to everyone who replied to my queries about this whole layout issue.

One other thing I've been wanting to ask (not to change! :-)) for a while is: how is the following acceptable according to the rules in the Haskell98 report where "where" is one of the lexemes, which when followed by a line more indented than the line the layout-starting-lexeme is on, should start an implicit block:

    module M where
    data T = ...   -- not indented!

According to my understanding of the layout algorithm, the above code would have to be written:

    module M where
        data T = ...

Can anyone shed some light on what the formal rule is that allows the first (and very useful) way of laying out code to be ok?

Thanks,
Brian.
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
Layout only applies when something is less indented than previous lines, I believe... e.g. do c - getContents filename putStrLn blah or do x - getContents filename putStrLn ok works fine but do c - blahAction putStrLn blah obviously won't work Jared. On 3/2/06, Brian Hulley [EMAIL PROTECTED] wrote: Brian Hulley wrote: [snip] So any solutions welcome :-) Thank to everyone who replied to my queries about this whole layout issue. One other thing I've been wanting to ask (not to change! :-)) for a while is: how is the following acceptable according to the rules in the Haskell98 report where where is one of the lexemes, which when followed by a line more indented than the line the layout-starting-lexeme is on, should start an implicit block: module M where data T = .-- not indented! According to my understanding of the layout algorithm, the above code would have to be written: module M where data T = Can anyone shed some light on what the formal rule is that allows the first (and very useful) way of laying out code to be ok? Thanks, Brian. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe -- http://www.updike.org/~jared/ reverse )-: ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)
On Wednesday 01 March 2006 02:36, Brian Hulley wrote:
> Ben Rudiak-Gould wrote:
> > Brian Hulley wrote:
> > > Here is my proposed layout rule:
> > >
> > > 1) All layout keywords (where, of, let, do) must either be followed
> > > by a single element of the corresponding block type, an explicit
> > > block introduced by '{', or a layout block whose first line starts
> > > on the *next* line
> >
> > I wouldn't have much trouble adapting to that.
> >
> > > and whose indentation is accomplished *only* by tabs
> >
> > You can't be serious. This would cause far more problems than the
> > current rule.
>
> Why? Surely typing one tab is better than having to hit the spacebar 4
> (or 8) times?

What kind of editor are you using? Notepad?

I am used to hitting the TAB key and getting the correct number of spaces, according to how I configured my editor (NEdit) for the current language mode.

TAB characters in program text should be forbidden by law. As well as editors that by default insert a tab char instead of spaces.

Ben
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
Benjamin Franksen wrote:
> [snip]
> I am used to hitting TAB key and get the correct number of spaces,
> according to how I configured my editor (NEdit) for the current
> language mode.

The only thing then is what happens when you type backspace or left arrow to get back out to a previous indentation? If the TAB character inserts spaces, there's no problem going from left to right, but it would seem more complicated to go back out again, i.e. without having to type backspace 4 times and hoping, when outdenting more, that I haven't typed backspace 23 times instead of 24 times by mistake, thus not getting to the column I expected.

This is my only reason for wanting to keep tab characters in the text, and certainly it does give some disadvantages when trying to line up '|' '=' etc. vertically - at the moment I admit my layouts do end up a bit contrived as I have to use more newlines to ensure I can use tabs only to accomplish the line-up...

So any solutions welcome :-)

Regards,
Brian.

"... flee from the Hall of Learning. This Hall is dangerous in its perfidious beauty, is needed but for thy probation. Beware, Lanoo, lest dazzled by illusive radiance thy soul should linger and be caught in its deceptive light." -- Voice of the Silence, stanza 33
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code reviewrequest)
On Wednesday 01 March 2006 13:35, Brian Hulley wrote: Benjamin Franksen wrote: [snip] I am used to hitting TAB key and get the correct number of spaces, according to how I configured my editor (NEdit) for the current language mode. The only thing then is what happens when you type backspace or left arrow to get back out to a previous indentation? If the TAB character inserts spaces, there's no problem going from left to right but it would seem more complicated to go back out again ie without having to type backspace 4 times and try to hope when outdenting more that I haven't typed backspace 23 times instead of 24 times by mistake thus not getting to the column I expected. With NEdit, hitting backspace /right after/ hitting the tab key deletes all the whitespace that were inserted, be it a tab character or multiple spaces. (This works also if the line was auto-indented to the same indentation depth as the previous one. That is, hit enter and then backspace, and you are at previous indentation level minus one.) If, however, you press any other key (e.g. any arrow keys), subsequent backspace will only delete a single space. Other behaviors can be easily implemented by writing a macro and binding it to the backspace key. The same is most probably true for emacs. The upshot is: Any decent modern text editor allows to map keys like tab and backspace to almost any action desired, depending on context, language mode, whatever. Ben ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)
Am Mittwoch, 1. März 2006 11:57 schrieb Benjamin Franksen: TAB characters in program text should be forbidden by law. As well as editors that by default insert a tab char instead of spaces. As founding member of the church of The only good Tabbing involves Michaela, I wholeheartedly agree. Cheers, Daniel -- In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt. -- Blair P. Houghton ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Layout rule (was Re: PrefixMap: code review request)
Duncan Coutts wrote: hIDE and Visual Haskell use the ghc lexer and get near-instantaneous syntax highlighting. Hmm... I just installed Visual Haskell 0.1, and when I type in the editor, CPU usage rises to about 70% and there's a noticeable delay before each character appears on the screen. This is a very short module (~100 lines) and a Pentium M 1600 CPU. Am I doing something wrong? -- Ben ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Layout rule (was Re: PrefixMap: code review request)
Benjamin Franksen wrote: TAB characters in program text should be forbidden by law. Well... they are quite useful for lining things up if you're using a proportional font, and I don't think proportionally-spaced code is a bad idea. I want them to be optional. But it would be nice if parsers would warn about (or even reject) programs whose meaning depends on tab width. -- Ben
[Haskell-cafe] Re: Layout rule
Ketil Malde wrote: Multi line comments are nice for commenting out blocks of code. They're also nice for comments within a line. E.g. haskell-src-exts contains the declaration data HsQualConDecl = HsQualConDecl SrcLoc {- forall -} [HsName] {- . -} HsContext {- => -} HsConDecl Probably half of my uses of {- -} begin and end on the same line. -- Ben
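A minimal sketch of the same-line comment style described above. The four helper types are hypothetical stand-ins so that the declaration compiles on its own; they are not the real haskell-src-exts definitions.

```haskell
-- Hypothetical stand-ins for the haskell-src-exts types named in the post.
data SrcLoc = SrcLoc String Int Int deriving (Eq, Show)
newtype HsName = HsName String deriving (Eq, Show)
type HsContext = [HsName]
newtype HsConDecl = HsConDecl HsName deriving (Eq, Show)

-- Each {- ... -} comment begins and ends on the same line and marks the
-- piece of concrete syntax that sits between two fields.
data HsQualConDecl
  = HsQualConDecl SrcLoc
      {- forall -} [HsName]
      {- .      -} HsContext
      {- =>     -} HsConDecl
  deriving (Eq, Show)

-- A sample value: "forall a . C" with an empty context.
example :: HsQualConDecl
example =
  HsQualConDecl (SrcLoc "M.hs" 1 1) [HsName "a"] [] (HsConDecl (HsName "C"))
```

A line comment (`--`) could not be used this way, because it would swallow the remaining fields on the line.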
Re: [Haskell-cafe] Re: Layout rule (was Re: PrefixMap: code review request)
On Wed, 2006-03-01 at 22:58 +, Ben Rudiak-Gould wrote: Duncan Coutts wrote: hIDE and Visual Haskell use the ghc lexer and get near-instantaneous syntax highlighting. Hmm... I just installed Visual Haskell 0.1, and when I type in the editor, CPU usage rises to about 70% and there's a noticeable delay before each character appears on the screen. This is a very short module (~100 lines) and a Pentium M 1600 CPU. Am I doing something wrong? I can't say too much about the internals of VH since I've not seen the code, only the description. Perhaps that's because they're starting the parser immediately after every keystroke and/or not killing the parser when the user types another key. I've been using hIDE on a Pentium M 1600 laptop and on the size of modules I've tried so far it's quick. The syntax highlighting updates immediately and the type checker shows up errors a second or so after I stop typing (which is because we wait about that long before starting the parser). Duncan
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)
On Wed, 1 Mar 2006 12:35:44 -, Brian Hulley [EMAIL PROTECTED] wrote: The only thing then is what happens when you type backspace or left arrow to get back out to a previous indentation? The Borland IDEs have long supported various smart indentation features, which can each be individually turned on or off (see the third one for the answer to your specific question):
* Auto indent mode - Positions the cursor under the first nonblank character of the preceding nonblank line when you press ENTER in the Code Editor.
* Smart tab - Tabs to the first non-whitespace character in the preceding line. If Use tab character is enabled, this option is off.
* Backspace unindents - Aligns the insertion point to the previous indentation level (outdents it) when you press BACKSPACE, if the cursor is on the first nonblank character of a line.
There are a number of other tab-related options as well. Steve Schafer Fenestra Technologies Corp. http://www.fenestra.com/
[Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)
Brian Hulley wrote: Here is my proposed layout rule: 1) All layout keywords (where, of, let, do) must either be followed by a single element of the corresponding block type, and explicit block introduced by '{', or a layout block whose first line starts on the *next* line I wouldn't have much trouble adapting to that. and whose indentation is accomplished *only* by tabs You can't be serious. This would cause far more problems than the current rule. I would also make it that explicit braces are not allowed to switch off the layout rule (ie they can be used within a layout), I don't understand. What does used within a layout mean? multiline strings would not be permitted, They aren't now, except with \ escapes. A stray will be caught on the same line unless the line happens to end with \ and the next line happens to begin with \, which is exceedingly unusual. and multiline comments would not be permitted (pragmas could easily be used just by using --#) But --# doesn't introduce a comment. And this would make UNPACK pragmas rather inconvenient to use. 1) When you see a ';' you could immediately tell which block it belongs to by looking backwards till the next '{' I guess that might be helpful, but it doesn't seem easier than looking left to the beginning of the current line and then up to the first less-indented line. 2) Variable width fonts can be used, They can be used now, if you adhere to a certain style, but not everyone likes that style. I wrote in C++ with a variable width font and tabs at one time, but eventually went back to fixed width. One reason was that I couldn't use comment layout conventions that tend (in my experience) to improve readability more than monospacing hurts it. Another reason was that glyph widths appropriate to natural languages didn't work all that well for source code. Spaces are much more important in source code than in natural language, for example. 
A proportional font designed for source code would be nice, but I haven't found one yet. Stroustrup used a mixture of proportional and monospaced glyphs in _The C++ Programming Language_ and it worked well. or different font faces to represent different sorts of identifier eg class names, tycons, value constructors, operators like `seq` as opposed to seq etc Lots of editors do this with monospaced fonts; I think it's orthogonal to the layout issue. 3) Using only tabs ensures that vertical alignment goes to the same position on the screen regardless of the font and tabs could even have different widths just like in a wordprocessor Requiring tabs is a really bad idea. Just forget it. Seriously. 4) Any keypress has a localised effect on the parse tree of the buffer as a whole ( { no longer kill everything which follows and there would be no {- ) I don't understand why this is an advantage. If you have an editor that highlights comments in green, then large sections of the program will flash green while you type a {- -} comment, which might be annoying, but it also means you'll never forget to close the comment, so the practical benefit of forbidding {- -}, as opposed to simply not typing it yourself, seems nil. 5) It paves the way for a much more immersive editing environment, but I can't say more about this at the moment because I haven't finished writing it yet and it will be a commercial product :-))) I guess everything has been leading up to this, but my reaction is that it renders the whole debate irrelevant. The only reason layout exists in the first place is to make source code look good in ordinary text editors. If you have a high-level source code editor that manipulates the AST, then you don't need layout, or tabs, or any of that silly ASCII stuff. The only time you need to worry about layout is when interoperating with implementations that use the concrete syntax, and then there's nothing to stop you from exporting in any style you like. 
And when importing, there's no reason to place restrictions on Haskell's layout rule, because the visual layout you display in the editor need have no connection to the layout of the imported file. Using my self-imposed layout rule I'm currently editing all my Haskell code in a standard text editor using tabs set to 4 spaces and a variable width font and have no problems. Which is the best argument for keeping the current rule! If it were changed as you propose, then someday Hugh Briley would come along and complain that Haskell's layout syntax squandered screen space---but he *wouldn't* be able to program in his preferred style, because it would no longer be allowed. Religious freedom is a good thing. {- Ben -} ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
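The remark above that multi-line strings are only possible "with \ escapes" refers to the string gap of the Haskell Report. A minimal sketch, with names chosen purely for illustration:

```haskell
-- A string literal continued across lines with a "gap": a backslash,
-- then whitespace (including the newline), then another backslash.
-- The gap itself contributes nothing to the string.
banner :: String
banner = "The layout rule \
         \does not apply inside string gaps."

-- The same text written on a single line, for comparison.
bannerOneLine :: String
bannerOneLine = "The layout rule does not apply inside string gaps."
```

Both definitions denote the same string, which is why a stray quote is normally caught on its own line: the literal can only run on if the line ends with a backslash and the next one begins with one.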
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)
Ben Rudiak-Gould wrote: Brian Hulley wrote: Here is my proposed layout rule: 1) All layout keywords (where, of, let, do) must either be followed by a single element of the corresponding block type, and explicit block introduced by '{', or a layout block whose first line starts on the *next* line I wouldn't have much trouble adapting to that. and whose indentation is accomplished *only* by tabs You can't be serious. This would cause far more problems than the current rule. Why? Surely typing one tab is better than having to hit the spacebar 4 (or 8) times? I would also make it that explicit braces are not allowed to switch off the layout rule (ie they can be used within a layout), I don't understand. What does used within a layout mean? I meant that {;} would be used just like any other construct that has to respect the layout rule so you could write let a = let { b = 6; z = 77; h = 99; p = 100} in b+z+h + p etc but not: let a = let { b = 6; z = 77; h = 99; -- this binding would be part of the outermost 'let' p = 100} in b+z+h + p multiline strings would not be permitted, They aren't now, except with \ escapes. A stray will be caught on the same line unless the line happens to end with \ and the next line happens to begin with \, which is exceedingly unusual. and multiline comments would not be permitted (pragmas could easily be used just by using --#) But --# doesn't introduce a comment. And this would make UNPACK pragmas rather inconvenient to use. -- # but I hadn't thought about UNPACK... The motivation in both points is to make it easy for an editor to determine which lines need to be re-parsed based on the number of leading tabs alone. 1) When you see a ';' you could immediately tell which block it belongs to by looking backwards till the next '{' I guess that might be helpful, but it doesn't seem easier than looking left to the beginning of the current line and then up to the first less-indented line. 
There was an example posted on another thread where someone had got into confusion by using ; after a let binding in a do construct with an explicit brace after the 'do' but not after the 'let' (sorry I can't find it again). Also the current layout rule uses the notion of an implicit opening brace which is to be regarded as a real opening brace as far as ';' is concerned but an unreal non-existent opening brace as far as '}' is concerned. Thus I think it is a real mix-up. 2) Variable width fonts can be used, They can be used now, if you adhere to a certain style, but not everyone likes that style. I wrote in C++ with a variable width font and tabs at one time, but eventually went back to fixed width. One reason was that I couldn't use comment layout conventions that tend (in my experience) to improve readability more than monospacing hurts it. Another reason was that glyph widths appropriate to natural languages didn't work all that well for source code. Spaces are much more important in source code than in natural language, for example. A proportional font designed for source code would be nice, but I haven't found one yet. Stroustrup used a mixture of proportional and monospaced glyphs in _The C++ Programming Language_ and it worked well. or different font faces to represent different sorts of identifier eg class names, tycons, value constructors, operators like `seq` as opposed to seq etc Lots of editors do this with monospaced fonts; I think it's orthogonal to the layout issue. For example on Windows Trebuchet MS is a very nice font, also Verdana, both of which are not monospaced. But yes I agree it's not a major issue and I just see the option of being able to use them as a nice side-effect. 3) Using only tabs ensures that vertical alignment goes to the same position on the screen regardless of the font and tabs could even have different widths just like in a wordprocessor Requiring tabs is a really bad idea. Just forget it. Seriously. 
I'm really puzzled here. I've been using tabs to indent my C++ code for at least 10 years and don't see the problem. The only problem would be if someone mixed tabs with spaces. Since it has to be either tabs only or spaces only I'd choose tabs only to save keystrokes. I suppose though it is always going to be a matter of personal taste... 4) Any keypress has a localised effect on the parse tree of the buffer as a whole ( { no longer kill everything which follows and there would be no {- ) I don't understand why this is an advantage. If you have an editor that highlights comments in green, then large sections of the program will flash green while you type a {- -} comment, which might be annoying, but it also means you'll never forget to close the comment, so the practical benefit of forbidding {- -}, as opposed to simply not typing it yourself
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)
On Wed, 1 Mar 2006, Brian Hulley wrote: Ben Rudiak-Gould wrote: Brian Hulley wrote: Here is my proposed layout rule: snip and whose indentation is accomplished *only* by tabs You can't be serious. This would cause far more problems than the current rule. Why? Surely typing one tab is better than having to hit the spacebar 4 (or 8) times? Not when it prevents me from ever exhibiting the slightest shred of style in my code. I use that control for readability purposes in my code. -- [EMAIL PROTECTED] My religion says so explains your beliefs. But it doesn't explain why I should hold them as well, let alone be restricted by them. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)
BH Why? Surely typing one tab is better than having to hit the spacebar 4 (or 8) BH times? PC Not when it prevents me from ever exhibiting the slightest shred of style PC in my code. I use that control for readability purposes in my code. [snip] BH I'm really puzzled here. I've been using tabs to indent my C++ code for at BH least 10 years and don't see the problem. At least two reasons: 1. C++ doesn't care about any whitespace (except to separate tokens). Haskell cares about leading whitespace (which it is clear you are thinking a lot about...) but 2. as Philippa mentioned, Haskell programmers care a ton about inter-line, inter-word layout/alignment, for example, lining up = signs and arguments to functions in pattern matches, etc. C++ does not invite this style of declarative programming so it is not surprising that it wasn't an issue: aside from the indentation, I rarely type fancy whitespace inside a given line of C++ code to align elements with those on a preceding line. In Haskell, this unofficial layout style doesn't affect the machine-parsing of the code, but rather the human-parsing of the code. (In fact, it's one of my favorite things about Haskell.) If you want to see what can be accomplished with variable width fonts and complex layouts (not just beginning of lines but rather inter-line, inter-word alignment) you should check out lhs2TeX. They accomplish all their magic with spaces. BH The only problem would be if BH someone mixed tabs with spaces. Since it has to be either tabs only or BH spaces only I'd choose tabs only to save keystrokes. BTW, tab doesn't type the tab character (at least in emacs and I think vim) but instead moves the left edge of the current line by adding or deleting spaces (or trying to indent the right amount). This usually means you don't have to type 4 or 8 spaces. (And anyway, I would just hold the key down if I had to type more than one space, etc.) 
[snip] For example on Windows Trebuchet MS is a very nice font, also Verdana, both of which are not monospaced. But yes I agree it's not a major issue and I just see the option of being able to use them as a nice side-effect. Very few programmers I know would go to variable width fonts just to use some Microsoft font to edit code... (BTW I like Trebuchet and Verdana too.) To each his/her own! Cheers, Jared. -- http://www.updike.org/~jared/ reverse )-: ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Layout rule
Brian Hulley [EMAIL PROTECTED] writes: You can't be serious. This would cause far more problems than the current rule. Why? Surely typing one tab is better than having to hit the spacebar 4 (or 8) times? What you type depends on your editor. I hit tab, and the editor inserts an appropriate number of spaces. (I thought all editors did this now?) There was an example posted on another thread where someone had got into confusion by using ; after a let binding in a do construct with an explicit brace after the 'do' but not after the 'let' (sorry I can't find it again). If you allow {- everything becomes a lot more complicated and who needs them anyway? Multi line comments are nice for commenting out blocks of code. It is much less intrusive, in particular if you're using version control. back to editing a function at the top of a file. Things like {- would mean that all the parse trees for everything after it would have to be discarded. Also, flashing of highlighting on this scale could be very annoying for a user, so I'd rather just delete this particular possibility of the user getting annoyed when using my software :-) Couldn't your editor just be a little bit smarter? E.g. count the {-s and -}s, and only comment-hilight them if there are two of them? Retain a history of old parse trees, so that it is quick to return to a previous one? Haskell, which in turn might lead to more people understanding and therefore using the language, more libraries, more possibilities for You forget one thing: Avoid success at all costs :-) -k -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: layout rule infelicity
Jon Fairbairn [EMAIL PROTECTED] writes: Why -f anyway? It took me ages to work out what -fallow-overlapping-instances meant -- I wondered how fallow could apply to overlapping instances. I suppose it's a GCCism, where options starting with -f specify *f*lags. (Which doesn't seem to apply to GHC, unless there's a -fno-allow... (or -fdont-allow...?)) -kzm -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: layout rule infelicity
At 2002-05-30 02:26, Jon Fairbairn wrote: I think this is extremely bad language design! In general I like having layout rules, but ... What's the deal with the whole layout thing anyway? I've never come across it before in another language. Is it an academic thing? It drove me nuts when I first started Haskell, until I discovered you could use semicolons/braces instead (which I always do). If I were teaching Haskell to working programmer types like myself, I would encourage them to always use full semicolons and braces and forget layout entirely (except a lot of available Haskell source seems to use it). Certainly I find {;} more readable, and I suspect anyone else with a C/C++/Java background (or even a Scheme/Lisp background) does too./RANT -- Ashley Yakeley, Seattle WA ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: layout rule infelicity
At 2002-05-30 02:46, I wrote: What's the deal with the whole layout thing anyway? I've never come across it before in another language. Oh, wait, there's Python and Ruby. For some reason it doesn't bother me so much with them. -- Ashley Yakeley, Seattle WA
Re: layout rule infelicity
Ashley Yakeley wrote: At 2002-05-30 02:26, Jon Fairbairn wrote: I think this is extremely bad language design! In general I like having layout rules, but ... What's the deal with the whole layout thing anyway? I've never come across it before in another language. Is it an academic thing? How about FORTRAN (to a very small extent) or Python? I used to dislike layout, but I must say that it doesn't take long to become a supporter once you start using it. If you look at C (& offspring), it's not the {;} that makes the code readable, it's the indentation that does. So why not acknowledge that? -- Lennart
Re: layout rule infelicity
On Thu, 30 May 2002, Ashley Yakeley wrote: it). Certainly I find {;} more readable, and I suspect anyone else with a C/C++/Java background (or even a Scheme/Lisp background) does too./RANT Just a data point: I learned Basic, Pascal, Standard ML, C, Haskell, C++, Perl, Python in that order and actively use Haskell, C++, Perl & Python at the moment, and I find the `visual noise' of braces and semi-colons in C++ and Perl to be very irritating when, as Lennart points out, to be readable by me my code has to embody these structures by layout. (It's primarily the noise of all those `fun', `val' and `end's rather than deeper language issues that put me off looking at ML again.) Indeed, I (half) think there ought to be a warning on the main page of Haskell.org saying `WARNING: Using Haskell can lead to semi-colon blindness' since I relatively frequently spend ten minutes trying to figure out why C++ code isn't compiling only to realise that, whilst indented structurally, the semi-colons are missing :-S I suspect using layout rule is forever destined to be controversial... cheers, dave -- www.cs.bris.ac.uk/~tweed/ email: [EMAIL PROTECTED] work tel: (0117) 954-5250 | `It's no good going home to practise a Special Outdoor Song which Has To Be Sung In The Snow' -- Winnie the Pooh
Re: layout rule infelicity
At 2002-05-30 02:54, Lennart Augustsson wrote: If you look at C ( offspring), it's not the {;} that makes the code readable, it's the indentation that does. So why not acknowledge that? In C, the indentation is an important visual clue, but there are many different indentation styles. It's the braces that actually tell you the beginning and end of a block. I might also use indentation for non-blocks, for instance: void foo (int n) { if (n 0) bar ( Sproing!,// title getBounds(n), // bounds true, // bordered true, // bright false, // not transparent true, // use v2 appearance 5, // shadow size null // next ); } Equally, I always indent my braced blocks in Haskell as well as C ( o). If you're used to braces, complicated Haskell expressions with layout look confusing, since it's not immediately clear which indentation style the layout rules are trying to enforce. It's also not clear to the unlearned how best to split an expression onto two lines, or how it interacts with parentheses, etc. And then there are those nasty little infelicities... -- Ashley Yakeley, Seattle WA ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: layout rule infelicity
I like layout but I think the existing rules are too complicated. Unfortunately it's difficult to do anything with them without breaking vast swathes of existing code, so we'll just have to put up with them. The reason I think layout is better than using {'s and ,'s is that humans use the layout to group the structure anyway, which means you can have confusing situations where a structure looks alright to a human but not to a computer.
Re: layout rule infelicity
Martin Odersky [EMAIL PROTECTED] writes: Redundancy maybe? What's wrong in having both layout and punctuation? Short answer: What's wrong with it is that humans use layout to infer the semantic meaning, compilers use punctuation. Thus it's not really redundancy. -kzm -- If I haven't seen further, it is by standing in the footprints of giants
Re: layout rule infelicity
What's the deal with the whole layout thing anyway? I've never come across it before in another language. Python has it as well (they stole it from Haskell?) If I were teaching Haskell to working programmer types like myself, I would encourage them to always use full semicolons and braces ... while we're at it - what's the deal with type inference? sometimes I think it is *really bad* language design if the program may contain untyped declarations of identifiers. ghc -Wall warns nicely about undeclared top-level types but what about locals? I've never come across a language that would allow them declared untyped. of course I know (some of) the `academic' background (type inference, type checking) but what about it from a software engineering point of view? \end{rant} .. I think neither the layout rule nor type inferencing are likely to disappear from Haskell .. -- -- Johannes Waldmann http://www.informatik.uni-leipzig.de/~joe/ -- -- [EMAIL PROTECTED] -- phone/fax (+49) 341 9732 204/252 --
Re: layout rule infelicity
... layout rules somewhat like Haskell's. In our experience it was the single thing that confused students most. same here, for exactly these reasons. students get really confused. on the other hand, students regularly get confused by other things as well, like homework assignments on formal languages, so that alone is not enough reason to drop the subject altogether :-) -- -- Johannes Waldmann http://www.informatik.uni-leipzig.de/~joe/ -- -- [EMAIL PROTECTED] -- phone/fax (+49) 341 9732 204/252 --
Re: layout rule infelicity
At 2002-05-30 04:19, Johannes Waldmann wrote: same here, for exactly these reasons. students get really confused. on the other hand, students regularly get confused by other things as well, like homework assignments on formal languages, so that alone is not enough reason to drop the subject altogether :-) In the latter case, they are learning something useful. In the former case, the confusion emerges out of a useless property of the language. Let the students use {;} if it eliminates confusion, it's still perfectly good Haskell. I am certainly not proposing Haskell be modified to eliminate the layout option. I'm just curious as to why Haskell programmers choose to use it. -- Ashley Yakeley, Seattle WA
Re: layout rule infelicity
At 2002-05-30 03:59, Ketil Z. Malde wrote: Short answer: What's wrong with it is that humans use layout to infer the semantic meaning, No... layout by itself can't be trusted. It's only a clue. One needs to learn the precise Haskell-specific layout rules, and they're not obvious. -- Ashley Yakeley, Seattle WA
Re: layout rule infelicity
Hi everyone, I thought I would bring a student's perspective into this discussion. Moving from a C background to Haskell, the layout wasn't very intuitive at first. This was mainly due to my hands-on approach (looking at examples and trying to code similar programs). Had I read up on the layout rule first, I would have had less trouble. I did notice that the error messages generated by incorrect layout don't offer much clue to the origin of the layout error, at least from a beginner's interpretation of the error messages. Having said that, now that I have gotten used to the Haskell layout I simply adore it. I often remember tiredly coding in C and relying on the compiler to locate where I had left out a ';' at the end of a statement or two. With Haskell there is no such need! Tuong, a happy haskell student..
Re: layout rule infelicity
I like layout but I think the existing rules are too complicated. Unfortunately it's difficult to do anything with them without breaking vast swathes of existing code, so we'll just have to put up with them. Well, there's two things to consider: Haskell 98, which probably shouldn't change, and extended Haskell, which probably should. Especially if we can make the rules both simpler and better. The reason I think layout is better than using {'s and ,'s is that humans use the layout to group the structure anyway, which means you can have confusing situations where a structure looks alright to a human but not to a computer. Which is exactly the problem with the programme I posted. Having thought about it a bit, it strikes me that the particular problem is the insertion of a closing brace. From the human reader's point of view, there's no visual equivalent of the closing brace in the example:

    possible_int = do skip_blanks
                      fmap Just int
                      +++ (literal - `as` Nothing)

What happens is that a semicolon is inserted because the indentation is the same as the previous line -- that's fair enough, subject to some quibbles about treating all expressions the same -- but then the +++ is a syntax error unless a closing brace is inserted. Visually, the equivalent of a closing brace is when indentation is less (to my eye it ought to be right down to where the 'do' is and inbetween be an error). What's wrong with the notion that closing braces should only be inserted when the indentation is less (or the file ends)? This would reject some programmes, but only ones where the appearance is misleading. So

    possible_int = do skip_blanks
                      fmap Just int
                        +++ (literal - `as` Nothing)
    whatever ...

parses as possible_int = do {skip_blanks ;fmap Just int +++ (literal - `as` Nothing) } whatever ... and

    possible_int = do skip_blanks
                      fmap Just int
                      +++ (literal - `as` Nothing)
    whatever ...

parses as possible_int = do {skip_blanks ;fmap Just int ;+++ (literal - `as` Nothing) } whatever ... 
and then gives a syntax error but

    possible_int = do skip_blanks
                      fmap Just int
                   +++ (literal - `as` Nothing)
    whatever ...

parses as possible_int = do {skip_blanks ;fmap Just int } +++ (literal - `as` Nothing) whatever ... Which is just about acceptable to me, because the +++ does stick out, though I'd prefer that one to be rejected too. I wasn't fit enough to follow the earlier discussions of the layout rule, so I'm not sure how this interacts with previous awkward cases. I'd be happiest if we could come up with a rule that didn't involve sticking in braces and semicolons because it won't parse otherwise. Can someone remind me why the "A close brace is also inserted whenever the syntactic category containing the layout list ends" part of the rule is there? Jón -- Jón Fairbairn [EMAIL PROTECTED] 31 Chalmers Road [EMAIL PROTECTED] Cambridge CB1 3SZ +44 1223 570179 (after 14:00 only, please!)
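The difference between a continuation line and an outdented line can be tried out with today's layout rule. The sketch below is not the parser from the message: it assumes `Maybe` in place of the parser type and `<|>` in place of `+++`, purely so that the two readings give visibly different results.

```haskell
import Control.Applicative ((<|>))

-- The operator line is indented less than the statements of the do block
-- (but more than the definition itself), so the implicit block is closed
-- before it. Reads as: (do { Nothing; Just 1 }) <|> Just 2
outdented :: Maybe Int
outdented = do Nothing
               Just 1
            <|> Just 2

-- The operator line is indented more than the statements, so it continues
-- the last statement. Reads as: do { Nothing; Just 1 <|> Just 2 }
continued :: Maybe Int
continued = do Nothing
               Just 1
                 <|> Just 2
```

Under these assumptions `outdented` is `Just 2` (the alternative rescues the failed block) while `continued` is `Nothing` (the failure in the first statement ends the whole block). The contested case, with the operator in exactly the statement column, is left out on purpose.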
Re: layout rule infelicity
Jon Fairbairn wrote [snip] Well, there's two things to consider: Haskell 98, which probably shouldn't change, and extended Haskell, which probably should. Especially if we can make the rules both simpler and better. [snip] How can I resist? I proposed the following revised layout rule some time ago in a message to the Twa Simons. Note that unlike the standard Haskell layout rules it does not need to read the parser's mind. Of course the problem is that while it should work fine for the way I lay out Haskell, it might not work for other people. We represent the lines in a file in a tree like structure: data Grouped line = Grouped line [Grouped line] The meaning of Grouped A lines is a line A, followed by a list of groups, each beginning at the same deeper indentation. So for example A B C D would go to something like Grouped A [Grouped B [Grouped C []],Grouped D []] In the code I've written A B C produces an error message, but on second thoughts I think the best behaviour would be to treat it like A ++ B C though it's too late to code that now . . . The layout processor would group the lines according to this algorithm. It would then output the result of the grouping. When it came to Grouped first rest it would determine if the last token of first is do, of, where or let, and rest does _not_ begin with a { token. If both these conditions were satisfied, it would output { before, ; inbetween elements, and } after when outputting the rest list. This seems to me to solve most of the fundamental problems, and be somewhat more intuitive than the existing algorithm. It would behave differently in that do if test then do act1 act2 else do act3 act4 is legal. But it would also be necessary to alter the context-free-syntax so that (1) the contents of the module were not separated by ;'s, but by each being a single item in the [Grouped line] list. (The old where {decl1 ; decl2 ; . . . ; decln} syntax would probably have to remain, for compatibility reasons). 
(2) single-line forms without braces, like let a = 5 in a+a work. This is only a first approximation, in that do if test then do act1 act2 else do act3 act4 isn't legal. Perhaps one way of fixing this is to modify the layout algorithm so that tokens such as then, else, in and ) before which a semicolon can't make any sense anyway, get tagged onto the previous group if that began at the same column as they did. I don't claim this as the perfect solution. But since layout is something which is rather confusing and at the moment seems to have distinctly rough edges, it might be worthwhile experimenting with something like this, to see how much code it would break ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
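The grouping scheme described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code the author mentions having written: the (column, text) input format and the names Line, groupLines and render are hypothetical, tokens are approximated by whitespace-separated words, and the sketch silently accepts the inconsistently indented case that the message says should produce an error.

```haskell
import Data.List (intercalate)

-- A line together with the groups of more deeply indented lines that follow it.
data Grouped line = Grouped line [Grouped line]
  deriving (Eq, Show)

-- Hypothetical input format: (indentation column, text of the line).
type Line = (Int, String)

-- Every line owns the run of more deeply indented lines directly after it.
groupLines :: [Line] -> [Grouped String]
groupLines [] = []
groupLines ((n, s) : rest) = Grouped s (groupLines children) : groupLines siblings
  where
    (children, siblings) = span ((> n) . fst) rest

-- Emit braces and semicolons around a group whose header line ends in a
-- layout keyword, unless the first child already starts with "{".
render :: Grouped String -> String
render (Grouped first rest)
  | null rest = first
  | opensBlock = first ++ " {" ++ intercalate "; " (map render rest) ++ "}"
  | otherwise = unwords (first : map render rest)
  where
    ws = words first
    opensBlock = not (null ws)
      && last ws `elem` ["do", "of", "where", "let"]
      && not explicitBrace
    explicitBrace = case rest of
      Grouped l _ : _ -> take 1 (dropWhile (== ' ') l) == "{"
      [] -> False
```

With lines A, B, C, D at columns 0, 2, 4, 2, groupLines yields the Grouped A [Grouped B [Grouped C []], Grouped D []] shape given in the message. Note that no parser feedback is needed at any point, which is the property the message is after.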
Re: layout rule infelicity
I wrote: Can someone remind me why the "A close brace is also inserted whenever the syntactic category containing the layout list ends" part of the rule is there? Lennart wrote: It's so you can write let x = 2+2 in x*x (and similar things) and Arjan van IJzendoorn wrote: x = (3, case True of True -> 4) The ')' ends the syntactic category 'tuple' So we get all this misery just so that people can cram things onto fewer lines? let x = 2+2 in x*x could be let {x = 2+2} in x*x or let x = 2+2 in x*x and x = (3, case True of True -> 4 ) would be fine. I'd like to see a -fuse-simpler-layout-rule¹ option on the compilers. . . Jón 1. Why -f anyway? It took me ages to work out what -fallow-overlapping-instances meant -- I wondered how fallow could apply to overlapping instances. -- Jón Fairbairn [EMAIL PROTECTED] 31 Chalmers Road [EMAIL PROTECTED] Cambridge CB1 3SZ +44 1223 570179 (after 14:00 only, please!)
Re: layout rule infelicity
G'day all. On Thu, May 30, 2002 at 01:10:03PM +0200, Johannes Waldmann wrote: Python has it as well (they stole it from Haskell?) Python's layout rule looks more like Occam's than Haskell's, to my eyes. Aside: Was Occam the first language of the post-punched-card era to use layout as syntax? while we're at it - what's the deal with type inference? sometimes I think it is *really bad* language design if the program may contain untyped declarations of identifiers. Presumably you're not suggesting requiring type declarations in every pattern match too? I think it's something to do with where you draw the line. You could theoretically require type declarations:
- Nowhere, unless the type inference mechanism can't cope with it.
- Module interfaces.
- Top-level declarations.
- where clauses too.
- let
- Everywhere that a variable could be defined, including case-expressions, list comprehension generators and lambdas.
- Every subexpression.
I personally think it's wrong not to require explicit type declarations for everything exported from a module for engineering reasons. Sane separate compilation is important, IMO. Cheers, Andrew Bromage
Re: layout rule infelicity
Hi All, Andrew J Bromage wrote: G'day all. On Thu, May 30, 2002 at 01:10:03PM +0200, Johannes Waldmann wrote: Python has it as well (they stole it from Haskell?) Python's layout rule looks more like Occam's than Haskell's, to my eyes. Aside: Was Occam the first language of the post-punched-card era to use layout as syntax? I fuzzily recall that SICStus Prolog silently tolerated omissions of commas and dots, allowing for: p(X) :- g(X,Y) h(Y) p(X) g(Y,Z) :- ... But Haskell already existed at this point. Alexander
Re: layout rule infelicity
[redirected to haskell-cafe] Ashley Yakeley wrote (on 30-05-02 03:18 -0700): At 2002-05-30 02:54, Lennart Augustsson wrote: If you look at C (& offspring), it's not the {;} that makes the code readable, it's the indentation that does. So why not acknowledge that? In C, the indentation is an important visual clue, but there are many different indentation styles. I think there are not so many different _indentation_ styles. The differences seem mostly to revolve around where to put _the braces_. And one might argue that the reason for that is precisely that it's so arbitrary, since the indentation is the main visual clue to structure anyway. Personally, I always try to factor out arbitrariness from my programs. I think layout does the same thing for syntax. If you're used to braces, complicated Haskell expressions with layout look confusing, since it's not immediately clear which indentation style the layout rules are trying to enforce. If you're used to C, then layout and indentation will be the least of your difficulties when you start using Haskell... Anyway, I have the feeling that, for every person on this list who complains about layout being unintuitive, there are 10 people who would say the opposite. Shall we take a poll? -- Frank Atanassow, Information Computing Sciences, Utrecht University Padualaan 14, PO Box 80.089, 3508 TB Utrecht, Netherlands Tel +31 (030) 253-3261 Fax +31 (030) 251-379 ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: layout rule infelicity
On Thursday 30 May 2002 13:43, Frank Atanassow wrote: Anyway, I have the feeling that, for every person on this list who complains about layout being unintuitive, there are 10 people who would say the opposite. Shall we take a poll? It's not unintuitive, it's counter-intuitive. :) The errors resulting from layout mistakes are hard to spot and are annoying. On the other hand, it has a blend of intuition. If you are writing in a certain, not-so-clear-what style, it makes life easier. :) As you can see I've contradicted myself. ** Type error. -- Eray Ozkural (exa) [EMAIL PROTECTED] Comp. Sci. Dept., Bilkent University, Ankara www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C
Re: layout rule infelicity
Just to add my voice to the din... I come from a c/c++/java background, and I taught myself haskell. The layout rules were the part I had the least problem with. I'd prefer that if any change is made it's one that adds options, not removes them. I'm confused as to the source of the problem, anyway - if you don't like the layout rules, use braces and semicolons and ignore them. Abe
Naming conventions for compiler options [Was: layout rule infelicity]
[redirected to haskell-cafe] Jón Fairbairn wrote: 1. Why -f anyway? It took me ages to work out what -fallow-overlapping-instances meant -- I wondered how fallow could apply to overlapping instances. I believe the authors of GHC followed the naming conventions of GCC, which can be gleaned from GCC's info page.
- warning-related options typically start with -W
- machine-independent flags that control optimization, code-generation or language dialects start with -f. Most flags have positive and negative forms.
- many pre-processor-related options start with -i
- hardware-dependent options start with -m, e.g.,
  -mcpu=i686
  -mno-fancy-math-387
  -malign-int
  -mno-power [RS/6000-related option]
  -myellowknife [On embedded PowerPC systems, assume that the startup module is called `crt0.o' and the standard C libraries are `libyk.a' and `libc.a']
Of possible interest is a file gcc-2.95.2/gcc/future.options from GCC's source code. The file lists a few suggested options. The file is an e-mail message from Noah Friedman to Richard Stallman, Jim Blandy and a few other people. Some of the suggested options are:
-Waggravate-return
-Wcast-spell
-Wcaste-align [cf. the existing GCC option -Wcast-align]
-Win [I guess warn about the Windows system]
-Wmissing-protons
-Wredundant-repetitions
-antsy
-fbungee-jump
-fexpensive-operations [cf. -fexpensive-optimizations]
-fextra-strength [must be a negative form of -fstrength-reduce]
-fkeep-programmers-inline [cf. the existing option -fkeep-inline-functions]
-fjesus-saves [cf. the existing option -fcaller-saves]
-fno-peeping-toms [cf. the existing option -fno-peephole]
-fruit-roll-ups [cf. the existing option -funroll-loops]
-fshort-enough
-mno-dialogue
-vomit-frame-pointer
Re: Naming conventions for compiler options [Was: layout rule infelicity]
At 2002-05-30 16:51, [EMAIL PROTECTED] wrote: -Wmissing-protons Compiles BASIC? -- Ashley Yakeley, Seattle WA
RE: Naming conventions for compiler options [Was: layout rule infelicity]
-Wmissing-prototypes, actually (http://www.esat.kuleuven.ac.be/~gcc/). But good guess. ;) -J- -Original Message- From: Ashley Yakeley [mailto:[EMAIL PROTECTED]] Sent: Thu 5/30/2002 5:26 PM To: Haskell Cafe List Cc: Subject: Re: Naming conventions for compiler options [Was: layout rule infelicity] At 2002-05-30 16:51, [EMAIL PROTECTED] wrote: -Wmissing-protons Compiles BASIC? -- Ashley Yakeley, Seattle WA
Layout rule (again)
I'm afraid it doesn't seem to be quite right yet :-( Consider
   instance Foo Maybe where
   foo = 5
=> {4}instance Foo Maybe where {4}foo = 5
=> {instance Foo Maybe where {}}foo = 5
The second {4} has meant there is no <4> to cause an implicit semicolon to be inserted. This can be fixed by changing
L ({n}:ts) (m:ms) = { : (L ts (n:m:ms)) if n > m, (Note 1)
                  = { : } : (L ts (m:ms)) otherwise
to
L ({n}:ts) (m:ms) = { : (L ts (n:m:ms)) if n > m, (Note 1)
                  = { : } : (L (<n>:ts) (m:ms)) otherwise
Thanks Ian, having a bad day, predominantly due to the layout rule :-(
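As an illustration of the corrected equation, here is a rough executable sketch of the function L. It assumes the shape of L given in the revised Haskell 98 Report, with the parse-error(t) clause deliberately left out, since that clause cannot be computed without a parser. The names Tok, Open, Line and layoutL are hypothetical: Open n stands for the report's {n} annotation and Line n for <n>.

```haskell
-- Annotated token stream, following the report's notation.
data Tok
  = Tok String   -- an ordinary lexeme
  | Open Int     -- {n}: first token of a new block starts in column n
  | Line Int     -- <n>: first token of a line starts in column n
  deriving (Eq, Show)

-- layoutL tokens contexts: the contexts are enclosing block columns,
-- with 0 standing for an explicit (programmer-written) brace.
layoutL :: [Tok] -> [Int] -> [String]
layoutL (Line n : ts) (m : ms)
  | m == n = ";" : layoutL ts (m : ms)
  | n < m  = "}" : layoutL (Line n : ts) ms
layoutL (Line _ : ts) ms = layoutL ts ms
layoutL (Open n : ts) (m : ms)
  | n > m = "{" : layoutL ts (n : m : ms)
layoutL (Open n : ts) []
  | n > 0 = "{" : layoutL ts [n]
-- The fix from this message: re-issue the position as a Line token so
-- that an implicit semicolon can still be inserted after the empty block.
layoutL (Open n : ts) ms = "{" : "}" : layoutL (Line n : ts) ms
layoutL (Tok "}" : ts) (0 : ms) = "}" : layoutL ts ms
layoutL (Tok "}" : _) _ = error "layout: unmatched explicit close brace"
layoutL (Tok "{" : ts) ms = "{" : layoutL ts (0 : ms)
layoutL (Tok t : ts) ms = t : layoutL ts ms
layoutL [] [] = []
layoutL [] (m : ms)
  | m /= 0 = "}" : layoutL [] ms
  | otherwise = error "layout: unclosed explicit brace"
```

Running it on the example from the message, [Open 4, Tok "instance", Tok "Foo", Tok "Maybe", Tok "where", Open 4, Tok "foo", Tok "=", Tok "5"] with an empty context stack, gives an empty where block followed by a semicolon before foo, rather than foo running straight on after the closing brace.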
Layout rule
Hi all The report says "The layout rule matches only those open braces that it has inserted" in the lexical structure section. However, in the syntax section function L starts L (t:ts) (m:ms) = } : (L (t:ts) ms) if parse-error(t) (Note 1) which AFAICT will implicitly close an explicit open brace. Testing with foo = let { x = 5 in x shows hugs, ghc and nhc98 all seem to agree with me. L (t:ts) (m:ms) = } : (L (t:ts) ms) if m /= 0 and parse-error(t) (Note 1) Along similar lines, I think writing the last two lines as L [] [0] = [] L [] (m:ms) = } : L [] ms if m /=0 (Note 5) would be clearer. In fact as the preceding text says 'and for the empty stream' I think it may be written correctly but have gotten lost in the conversion from whatever it is really written in. Thanks Ian
Re: The dreaded layout rule
Simon Marlow wrote: Does it mean that the following expressions would be illegal? if cond then do proc1; proc2 else do proc3; proc4 (case e of Just x -> x 0; Nothing -> False) Unfortunately, yes. Now one can forget about {} and use layout everywhere. He would no longer be able to forget or he would have to split some expressions into indented lines, even when they are unambiguous in one line. You could just enumerate all keywords that allow/enforce insertion of }. A suitable list for Haskell 98 might be: in where ) ] module type data newtype class instance default In fact I think that this would be the cleanest and simplest rule. (At least that is how I once implemented layout similar to Haskell's, because I couldn't get Yacc's error productions to work properly in all cases). For Haskell 2(000) I would suggest removing all but the first 4 tokens from the list above. - Andreas -- Andreas Rossberg, [EMAIL PROTECTED] :: be declarative. be functional. just be. ::
Re: The dreaded layout rule
I wrote: lexeme -> qvarid | qconid | qvarsym | qconsym | literal | special | reservedop | reservedid Now we could replace qvarsym and qconsym by qop, and have both examples parse in the same way. However, unlike the other change in lexeme's definition, I don't suggest this, I only want to point out that there is a (formally) simple way out of the present somewhat inconsistent state. I changed my mind about this issue, I do suggest to change it as proposed, for if `elem` were three lexemes, any whitespace between them would be allowed. This might even be considered a typo, as I think no one intended to allow expressions like x ` {- look ma -} elem -- comments inside! ` l All the best, Christian Sievers
Re: The dreaded layout rule
As an author of a Haskell Emacs mode that deals with the layout rule (described in Journal of Functional Programming 8(5) 493-502), I strongly agree that the "parse-error condition" is really a bad idea. For example, in Emacs, no full Haskell parse is done. After all, layout should be there to indicate clearly to a user what section of code depends on which other; the user should not have to parse and deal with some local fixity declarations. I know this suggestion would break a few Haskell programs but perhaps it would be interesting to come back to the first functional language that implemented the layout rule, Miranda (tm), where the rule was much more simply stated: Syntactic objects obey Landin's offside rule. This requires that every token of the object lie directly below or to the right of its first token. A token which breaks this rule is said to be offside with respect to that object. And that's it... no need to have three pages of explanations and an appendix. One can find many examples where Haskell rules and Miranda differ and sometimes one is better than the other, but you would be surprised to see that in the majority of the cases, the indentation that people normally produce is very similar under both rules. Guy Lapalme Université de Montréal PS: as a Quiz, can you guess how in Haskell the following is interpreted? f x = 1 + x g y = 1 + y
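The offside rule quoted above is simple enough to state as a few lines of code. The following is only a sketch to make the definition concrete: the Token record and the function names are hypothetical, and columns are assumed to be 1-based. It is not taken from Miranda or from the Emacs mode mentioned in the message.

```haskell
-- A token with the column in which it starts (1-based, by assumption).
data Token = Token { tokCol :: Int, tokText :: String }
  deriving (Eq, Show)

-- A token is offside with respect to an object if it starts strictly to
-- the left of the object's first token.
isOffside :: Token -> Token -> Bool
isOffside first t = tokCol t < tokCol first

-- The tokens belonging to the object that starts at the head of the list:
-- everything up to, but not including, the first offside token.
objectTokens :: [Token] -> [Token]
objectTokens [] = []
objectTokens (t : ts) = t : takeWhile (not . isOffside t) ts
```

For a right-hand side whose first token starts in column 7, a continuation line starting in column 7 or later still belongs to it, while the next definition starting in column 1 is offside and ends it. The decision uses only column numbers, so no parser state is involved.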
RE: The dreaded layout rule
Simon Marlow [EMAIL PROTECTED] wrote, Does anybody disagree with my interpretation of the standard? Are there any implementations that actually follow the standard here? (Maybe the standard should be changed to follow the implementations in this area.) Phew. Well spotted. Of course, none of the existing Haskell implementations are in conformance here. I think this has just about convinced me that the parse-error condition is a really bad idea. It definitely is. The main reason for its inclusion was to allow things like let f x = x in ... and also to automatically insert the final '}' before the end of file. Perhaps the layout rule should be restricted to these two cases? Proposal: - replace t by 'in' in the parse-error rule. EOF is already handled by the last clause in the layout spec. My guess is that this would break very few programs. The problem with this fix is that layout for let-in and other grouping constructs would be handled differently, which is not very intuitive. The problem with the `do'-notation also appears for `case'-constructs. A simpler rule might involve automatically inserting '}' before 'in' during lexical analysis iff (a) we're in a layout context and (b) the close brace hasn't already been inserted by the layout rule. This would decouple the parser and lexer which is a Good Thing. Still not intuitive, but from a syntactical point of view much more preferable. But you want to change (b) to ``the last lexeme was no close brace'' (it may have been explicitly inserted by the programmer). IMHO, in the long run (= next version of Haskell) this rule should completely vanish. So, who is collecting the proposals for the next Haskell standard? Cheers, Manuel
RE: The dreaded layout rule
Does it mean that the following expressions would be illegal? if cond then do proc1; proc2 else do proc3; proc4 (case e of Just x -> x 0; Nothing -> False) Unfortunately, yes. Now one can forget about {} and use layout everywhere. He would no longer be able to forget or he would have to split some expressions into indented lines, even when they are unambiguous in one line. Hmm, the `do x == y == z' case is a real trouble. Would it be not too ugly to formalize the current common behavior as something like "for the purposes of layout resolution, the syntax does not care about fixity declarations"? I guess that treating them in any way at this stage, as long as they don't reject non-associative operators, would yield the same result... Ugly but practical. One other possible solution is to remove the fixity resolution from the grammar itself and describe it as a separate process post-parsing. This is probably a good thing anyway: it matches the way most implementations work and it would clean up the grammar. Cheers, Simon
Re: The dreaded layout rule
Simon Peyton-Jones [EMAIL PROTECTED] writes: In other words, it is a bug (and GHC and Hugs don't do it right - see my previous message; from your comment, I presume HBC also doesn't follow the definition). I think, the only Right Thing is to remove this awful rule (unless somebody comes up with a rule that can be decided locally). Maybe so. But (H98 editors hat on) this is more than a "typo". I am surprised! ;-) It's a Haskell 2 issue. Perhaps there will be no fully conforming H98 compilers! Perhaps it would be a reasonable Haskell 1.6 issue? Wolfram
RE: The dreaded layout rule
In other words, it is a bug (and GHC and Hugs don't do it right - see my previous message; from your comment, I presume HBC also doesn't follow the definition). I think, the only Right Thing is to remove this awful rule (unless somebody comes up with a rule that can be decided locally). Maybe so. But (H98 editors hat on) this is more than a "typo". It's a Haskell 2 issue. Perhaps there will be no fully conforming H98 compilers! Simon
Re: The dreaded layout rule
| How about the Carl Witty's | | do a == b == c | | does NHC handle this correctly? It matches ghc and Hugs, reporting Error when renaming: Infix operator at 2:21 is non-associative. Note that this is reported one stage *after* parsing. Because parsing of infix operators is difficult, all implementations (to my knowledge) leave resolution of fixity and associativity until later. Indeed, the Haskell 98 standard recognises this (in an oblique way) by permitting infix decls to appear *after* the first use. Hence, it is now impossible to resolve fix/assoc in a single pass anyway. Regards, Malcolm
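To make the point about resolving fixity after parsing concrete, here is a rough sketch of such a post-parse pass. It follows the general shape of the resolution algorithm later given in the Haskell 2010 Report (section 10.6), with prefix negation omitted. The types Exp, Item, Assoc and Fixity are hypothetical names for this illustration, not nhc's, GHC's or Hugs's actual data structures.

```haskell
data Assoc = L | R | N deriving (Eq, Show)   -- infixl, infixr, infix
type Fixity = (Int, Assoc)                   -- (precedence, associativity)

data Exp = Var String | Op String Exp Exp
  deriving (Eq, Show)

-- What the parser hands over: a flat sequence of operands and operators.
data Item = Operand Exp | Operator String Fixity

-- Turn the flat sequence into a tree, or fail on an illegal combination.
resolve :: [Item] -> Maybe Exp
resolve items = case parseOperand (-1, N) items of
  Just (e, []) -> Just e
  _ -> Nothing

-- Parse an operand and whatever binds to it more tightly than `outer`.
parseOperand :: Fixity -> [Item] -> Maybe (Exp, [Item])
parseOperand outer (Operand e : rest) = loop outer e rest
parseOperand _ _ = Nothing

loop :: Fixity -> Exp -> [Item] -> Maybe (Exp, [Item])
loop _ e [] = Just (e, [])
loop outer@(p1, a1) e1 input@(Operator name fix@(p2, a2) : rest)
  -- Same precedence but incompatible associativity, e.g. a == b == c.
  | p1 == p2 && (a1 /= a2 || a1 == N) = Nothing
  -- The operator to the left binds tighter: hand e1 back to it.
  | p1 > p2 || (p1 == p2 && a1 == L) = Just (e1, input)
  -- The new operator binds tighter: it takes e1 as its left operand.
  | otherwise = do
      (e2, rest') <- parseOperand fix rest
      loop outer (Op name e1 e2) rest'
loop _ _ _ = Nothing
```

With == declared as (4, N), the sequence a == b == c is rejected by this pass rather than by the grammar, which matches the observation above that the error is reported one stage after parsing.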
Re: The dreaded layout rule
Malcolm Wallace wrote: Because parsing of infix operators is difficult, all implementations (to my knowledge) leave resolution of fixity and associativity until later. Indeed, the Haskell 98 standard recognises this (in an oblique way) by permitting infix decls to appear *after* the first use. Hbc does the resolution while parsing, which means it cannot be Haskell 98 compliant. I really dislike the new rules about where infix can occur since they don't really buy us anything, but they do force a more complicated implementation. (Just as an aside. Local infix declarations, which was just added to Haskell 98, was very high on the list of design mistakes in SML and will be removed in ML2000.) -- -- Lennart
RE: The dreaded layout rule
Does anybody disagree with my interpretation of the standard? Are there any implementations that actually follow the standard here? (Maybe the standard should be changed to follow the implementations in this area.) Phew. Well spotted. Of course, none of the existing Haskell implementations are in conformance here. I think this has just about convinced me that the parse-error condition is a really bad idea. The main reason for its inclusion was to allow things like let f x = x in ... and also to automatically insert the final '}' before the end of file. Perhaps the layout rule should be restricted to these two cases? Proposal: - replace t by 'in' in the parse-error rule. EOF is already handled by the last clause in the layout spec. My guess is that this would break very few programs. A simpler rule might involve automatically inserting '}' before 'in' during lexical analysis iff (a) we're in a layout context and (b) the close brace hasn't already been inserted by the layout rule. This would decouple the parser and lexer which is a Good Thing. Cheers, Simon
Re: The dreaded layout rule
Fri, 30 Jul 1999 05:12:51 -0700, Simon Marlow [EMAIL PROTECTED] writes: The main reason for its inclusion was to allow things like let f x = x in ... and also to automatically insert the final '}' before the end of file. Perhaps the layout rule should be restricted to these two cases? Does it mean that the following expressions would be illegal? if cond then do proc1; proc2 else do proc3; proc4 (case e of Just x -> x 0; Nothing -> False) Now one can forget about {} and use layout everywhere. He would no longer be able to forget or he would have to split some expressions into indented lines, even when they are unambiguous in one line. Hmm, the `do x == y == z' case is a real trouble. Would it be not too ugly to formalize the current common behavior as something like "for the purposes of layout resolution, the syntax does not care about fixity declarations"? I guess that treating them in any way at this stage, as long as they don't reject non-associative operators, would yield the same result... Ugly but practical. -- __("Marcin Kowalczyk * [EMAIL PROTECTED] http://kki.net.pl/qrczak/ \__/ GCS/M d- s+:-- a22 C++$ UL++$ P+++ L++$ E- ^^W++ N+++ o? K? w(---) O? M- V? PS-- PE++ Y? PGP-+ t QRCZAK 5? X- R tv-- b+++ DI D- G+ e h! r--%++ y-
RE: The dreaded layout rule
Malcolm Wallace [EMAIL PROTECTED] wrote, [...] Simon Marlow replies: GHC and Hugs both make use of yacc-style error recovery, albeit in a very limited form. And nhc uses parser combinators, which give you backtracking on error conditions for free. We actually do almost all layout processing at the lexical stage, but where the parser expects a } and doesn't get one, we just insert the }, and re-lex the remaining input. I suppose having to re-lex is a bit of a chore, but laziness comes to the rescue somewhat. How about the Carl Witty's do a == b == c does NHC handle this correctly? Manuel
RE: The dreaded layout rule
Manuel Chakravarty writes: What kind of implementation did the originators of this clause envision? If the layout rule is really implemented as a filter between the scanner and the parser, it seems extremely awkward to add a dependency on the error condition of the parser - in particular, it makes a functional, ie, side-effect free implementation rather hard and a true two phase implementation impossible. So, I guess (I hope!!) there is a nifty trick that lets you achieve the same effect by using only conditions depending on local information (either during layout processing or by letting the parser insert the missing braces). GHC and Hugs both make use of yacc-style error recovery, albeit in a very limited form. The idea is to have a production in your grammar like this: close_brace : '}' | error where the '}' token is assumed to have been inserted by the lexical analyser as a result of layout (i.e. a token was found to be less indented than the current layout context). The error case fires if any other token is encountered, and the semantic action for this production will probably pop the current layout context and carry on (in practice you also have to tell yacc not to continue with error recovery, otherwise all sorts of strange things happen). Take a look at GHC's parser for the details. I believe you're right in that a true two-phase implementation of the Haskell grammar is impossible. This is consistent with Haskell's policy of making life easy for programmers and hard for compiler writers :) Cheers, Simon
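The effect of the close_brace production can be imitated in a few lines of Haskell. This is only a sketch of the idea, not GHC's or Hugs's parser: the State type and the closeBrace function are hypothetical, and it is an assumption of the sketch that the parser owns the stack of layout contexts (a column for an implicit block, 0 for an explicit one) and pops it in both cases.

```haskell
-- Remaining tokens paired with the stack of layout contexts.
-- A context of 0 marks an explicit, programmer-written block.
type State = ([String], [Int])

-- Parse the end of a block.
closeBrace :: State -> Maybe State
-- A real "}" token, written by the programmer or inserted by the lexer.
closeBrace ("}" : ts, _ : ctx) = Just (ts, ctx)
-- The "error" alternative: any other token ends an implicit block. The
-- context is popped and the offending token is left for the enclosing rule.
closeBrace (ts, m : ctx)
  | m /= 0 = Just (ts, ctx)
-- An explicit block must be closed by a real brace.
closeBrace _ = Nothing
```

On the input in x with one implicit context open (as in let f x = x in ...), the second clause fires: the block is closed without consuming in, which is exactly the case the parse-error rule was introduced for.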
The dreaded layout rule
One of our students just pointed out an IMHO rather problematic clause in the layout rule. In Section 2.7 of the Haskell 98 Report it says, A close brace is also inserted whenever the syntactic category containing the layout list ends; that is, if an illegal lexeme is encountered at a point where a close brace would be legal, a close brace is inserted. And in B.3, we have in the first equation of the definition of `L', L (t:ts) (m:ms) = } : (L (t:ts) ms) if parse-error(t) (Note 1) where Note 1 says, The side condition parse-error(t) is to be interpreted as follows: if the tokens generated so far by L together with the next token t represent an invalid prefix of the Haskell grammar, and the tokens generated so far by L followed by the token } represent a valid prefix of the Haskell grammar, then parse-error(t) is true. What kind of implementation did the originators of this clause envision? If the layout rule is really implemented as a filter between the scanner and the parser, it seems extremely awkward to add a dependency on the error condition of the parser - in particular, it makes a functional, ie, side-effect free implementation rather hard and a true two phase implementation impossible. So, I guess (I hope!!) there is a nifty trick that lets you achieve the same effect by using only conditions depending on local information (either during layout processing or by letting the parser insert the missing braces). Cheers, Manuel
Re: The dreaded layout rule
Manuel says One of our students just pointed out an IMHO rather problematic clause in the layout rule ... So, I guess (I hope!!) there is a nifty trick that lets you achieve the same effect by using only conditions depending on local information ... Attached is Haskell code which handles the layout rule reasonably well as a separate pass between scanning and parsing (though it is Haskell 1.4 rather than 98 and imperfect). -- Ian [EMAIL PROTECTED], http://www.cs.bris.ac.uk/~ian

Layout.hs:

{-- LAYOUT ANALYSIS

The layout function deals with the layout conventions of Haskell 1.4,
inserting extra "{", ";" and "}" tokens to represent implicit blocks. The
inserted tokens are marked as implicit, and are inserted as early as possible
in the token stream, in order to promote well-phrased and well-positioned
error messages in case of trouble. The layout function never fails; it is
left up to a parser to detect errors.

The Haskell standard says that a block (or layout list) is terminated
"whenever the syntactic category containing the layout list ends, that is, if
an illegal lexeme is encountered at a point where a close brace would be
legal". This can only be implemented easily if layout processing is combined
with parsing. Here, layout processing is done separately, so an approximation
to the standard is achieved by keeping track of brackets. See the end of this
file for examples where the layout function deviates from the standard.

ISSUES TO BE RESOLVED
-- "case" terminated by "where" may be common enough to make a special case
-- "case" terminated by "," may be worth dealing with
-- check that "let"s inside "do" (which don't have "in") are handled OK
-- check explicit blocks, and their interaction with implicit ones

Ian Holyer  @(#) $Id: Layout.hs,v 1.2 1998/10/26 15:18:39 ian Exp $
--}

module Layout (layout) where

import Haskell
import Lex

-- Start layout processing. If the source does not begin with "module" or "{",
-- then there is an implicit surrounding block. Here and elsewhere, a
-- lookahead past possible comments is done so that a token can be inserted
-- before the comments if necessary; also, the end-of-file token makes it
-- unnecessary to check for an empty token stream.

layout :: [Token] -> [Token]
layout ts =
   if s == "module" then comments ++ scanExplicit [] (tok:rest)
   else openBlock [] (Tok "}" 1 0 Implicit) ts
   where
   comments = takeWhile isComment ts
   tok @ (Tok s r c k) : rest = dropWhile isComment ts

-- A stack of tokens is used to keep track of the surrounding blocks. For each
-- block, its opening "{" token is pushed onto the stack. In an implicit
-- block, the brackets "(",")" and "[","]" and "case","of" and "let","in" and
-- "if","then","else" are tracked by putting the opening bracket on the stack
-- until the matching closing bracket is found. Each opening bracket is stored
-- on the stack with the indent for the current block in place of its actual
-- column.

type Stack = [Token]

-- Scan the source tokens while in an explicit block (or while not in any
-- blocks) when layout is inactive. Look for an explicit close block token, or
-- a keyword which indicates the beginning of a new block. Treat field selector
-- as an explicit block.

scanExplicit :: Stack -> [Token] -> [Token]
scanExplicit stack [] = []
scanExplicit stack (t@(Tok s r c k) : ts1) = case s of
   "}" -> t : closeBlock stack t ts1
   "where" -> t : openBlock stack t ts1
   "let" -> t : openBlock stack t ts1
   "do" -> t : openBlock stack t ts1
   "of" -> t : openBlock stack t ts1
   "{" -> openBlock stack undefined (t:ts1)
   _ -> t : scanExplicit stack ts1

-- Scan the source tokens while in an implicit block, with layout active. The
-- parameters are the stack, the last token dealt with, and the remaining
-- tokens. The block is terminated by indenting or by a suitable closing
-- bracket. Treat field selector as an explicit block.

scanImplicit :: Stack -> Token -> [Token] -> [Token]
scanImplicit stack@(Tok bs br bc bk : stack1) last@(Tok ls lr lc lk) ts =
   if c bc || k == EndToken then close els
RE: The dreaded layout rule
Manuel Chakravarty writes: What kind of implementation did the originators of this clause envision? If the layout rule is really implemented as a filter between the scanner and the parser, it seems extremely awkward to add a dependency on the error condition of the parser - in particular, it makes a functional, ie, side-effect free implementation rather hard and a true two phase implementation impossible. Simon Marlow replies: GHC and Hugs both make use of yacc-style error recovery, albeit in a very limited form. And nhc uses parser combinators, which give you backtracking on error conditions for free. We actually do almost all layout processing at the lexical stage, but where the parser expects a } and doesn't get one, we just insert the }, and re-lex the remaining input. I suppose having to re-lex is a bit of a chore, but laziness comes to the rescue somewhat. Regards, Malcolm
Re: The dreaded layout rule
If the scanning stage pairs the tokens it returns with their positions, then scanning can be done once before parsing begins. I've done this with a parser implemented with parser combinators, these combinators then decide whether or not to accept a token based on which token it is and how far it is indented. I think this means the grammar being parsed is a context sensitive one, since the state of the parser is represented by more than just a single stack. We need a stack telling us what to do next, and a stack of indentation levels, although the way in which these stacks grow and shrink is related they could not be replaced by a single stack, so the grammar is not context free. Now that I write this, I think that we could combine these stacks as a stack of stacks, although this isn't how I did it. I don't think this satisfies the requirements for a context free grammar (CFG) but I don't have a definition to hand at the moment. Mike Simon Marlow wrote: I believe you're right in that a true two-phase implementation of the Haskell grammar is impossible. This is consistent with Haskell's policy of making life easy for programmers and hard for compiler writers :)
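The approach described here, scanning once and letting the parsing functions accept or reject tokens by their column, can be sketched as follows. This is a deliberately reduced illustration and not the combinator library the message refers to: the types and the functions item and block are hypothetical, and only a single block level is handled, so the second stack of indentation levels mentioned above collapses to one column.

```haskell
type Pos = (Int, Int)          -- (line, column), both 1-based by assumption
type Tok = (Pos, String)

column :: Tok -> Int
column ((_, c), _) = c

-- One item of a block: a token exactly in the block's column, followed by
-- all tokens that are indented further.
item :: Int -> [Tok] -> Maybe ([String], [Tok])
item col (t : ts)
  | column t == col =
      let (more, rest) = span ((> col) . column) ts
      in Just (map snd (t : more), rest)
item _ _ = Nothing

-- A block: as many items as possible, all starting in the column of the
-- block's first token. Returns the items and the unconsumed tokens.
block :: [Tok] -> ([[String]], [Tok])
block [] = ([], [])
block ts@(t : _) = go ts
  where
    col = column t
    go inp = case item col inp of
      Just (x, rest) -> let (xs, rest') = go rest in (x : xs, rest')
      Nothing -> ([], inp)
```

Given the tokens of a = 1 on one line and b = 2 continued by an indented + 3 on the next, block returns two items, the second containing the continuation line. A token to the left of the block's column stops the block and is handed back to the caller.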
Layout rule confusion
Is the following fragment legal Haskell? Section B.3 of the report is not clear enough in this respect (at least for me :-} foo x = do case x of _ -> return '?' If it *is* legal, GHC is wrong and Hugs is correct, otherwise GHC is right and Hugs is too liberal and its library contains some layout errors. Waiting for enlightenment, Sven "Nitpick" Panne -- Sven Panne Tel.: +49/89/2178-2235 LMU, Institut fuer Informatik FAX : +49/89/2178-2211 LFE Programmier- und Modellierungssprachen Oettingenstr. 67 mailto:[EMAIL PROTECTED] D-80538 Muenchen http://www.pms.informatik.uni-muenchen.de/mitarbeiter/panne
small wart in the Report's description of the layout rule
The Haskell Report says: To facilitate the use of layout at the top level of a module (an implementation may allow several modules may reside in one file), the keyword module and the end-of-file token are assumed to occur in column 0 (whereas normally the first column is 1). Otherwise, all top-level declarations would have to be indented. I've read this many times without thinking about it; however, once I thought about it, it doesn't make sense. Following a module, the keyword "module" is "an illegal lexeme...encountered at a point where a close brace would be legal"; therefore, the close brace is properly inserted no matter what column "module" occurs in. Therefore, I suggest that the above paragraph be removed from the Report. Carl Witty [EMAIL PROTECTED]