On Thu, Jul 08, 2010 at 07:09:29AM +0000, Simon Peyton-Jones wrote: > (ie as infix operators) and I have to squizzle around to re-interpret them as > prefix operators. Not very cool. Something unified would be a Good Thing.
So, after thinking about it some, I think there may be a somewhat elegant solution. The other day I found myself writing a prolog parser in haskell, an interesting thing about prolog is that it is a pure operator precedence grammar[1]. Meaning that the entire grammar can be defined by a list of symbols, their fixities and their priorities. An example of a definition for a prolog-like language is http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/Builtin-Operators.html Now, a really nice thing about operator precedence grammars is that they admit a very simple linear parsing algorithm[2] and they are quite simple to understand. So, Why not utilize the nice properties of this style of grammar when defining haskell, we already attempt to interpret infix operators in the grammar BNF proper, but then go and refix them anyway in the fixity fixups pass, probably with something very similar to an operator precedence parser. so the idea is basically to get rid of the initial dummy parsing of expressions altogether and parse expressions as a pure operator precedence grammar in the fixups pass. This will allow seamless handling of prefix operators and likely simplify the formal description of the language to boot. So, for the most part the grammar will look like it does now, except when we get to expressions, patterns, and types, we just parse them uniformly as a sequence of atomic nodes. These may be variables, but also may be things like infix operators, or even an entire parenthesized term or case expression. These can be recursive, a parenthesized expression will itself contain a list of nodes. Now, we can simply define the fixity resolution pass as applying the appropriate operator precedence grammar to each list of nodes, producing expressions, types, or patterns. The really nice thing is that we are under no obligation to use the same operator precedence grammar for each context, we can always tell from the parsing context whether we are parsing a type, expression, or pattern, so we just use the appropriate grammar, for instance, we will augment the grammar with the prefix '~' in patterns, and the prefix '!' (for strictness) in types. '!' can be defined as prefix in patterns and infix in expressions simply by using a slightly different precedence table when interpreting them. This also makes very clear how user defined fixities are used, they are simply appended to the precedence table in this pass. Turning on and off bang patterns with a switch is also extremely easy, just omit that prefix operator from the table when they are switched off. no need to mess with the lexer or the parser. I also suspect we can produce much better error messages with this strategy. John [1] http://en.wikipedia.org/wiki/Operator-precedence_parser [2] http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm -- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/ _______________________________________________ Haskell-prime mailing list Haskell-prime@haskell.org http://www.haskell.org/mailman/listinfo/haskell-prime