On 25/04/2013 18:12, Graydon Hoare wrote:
I've been relatively insistent on LL(1) since it is a nice intersection-of-inputs, practically guaranteed to parse under any framework we retarget it to.
I'm a fan of this choice too, if only because the simplest efficient parser-generators and/or parser-composition methodologies I know of take an LL(1) grammar as input.

However, Paul's earlier plea on this thread ("Please don't do this [grammar factoring] to the official parser!") raised the following question in my mind:

Are we allowing for the possibility of choosing the semi-middle ground of: "There *exists* an LL(1) grammar for Rust that is derivable from the non-LL(1)-but-official grammar for Rust." ? Or do we want to go all the way to ensuring that our own grammar that we e.g. use for defining the syntactic classes of the macro system etc is strictly LL(1) (or perhaps LL(k) for some small but known fixed k)?

(I'd have to go review my compiler text books at home to review how much this would actually buy us.)

If we've already discussed the latter, mea culpa.  :)

Cheers,
-Felix

On 25/04/2013 18:12, Graydon Hoare wrote:
On 13-04-25 08:37 AM, John Clements wrote:

FWIW, I'm (mildly) concerned too. In particular, I'm disappointed to discover that in its present form (and using my present caveman-like invocation style), ANTLR parses source files so slowly that it's impossible to use directly as a validation tool; I would very much like to directly validate the grammar used for documentation purposes against the existing sources. I haven't yet asked for help from the ANTLR folks, because I don't yet feel like I've finished due diligence on RTFMing ANTLR, which I would prefer to do before dumping the problem in their lap.

I'm sorry for the confusion; I don't think patrick's work here represents a divergence from yours so much as a continuation of it, in a direction that answers a question I've asked repeatedly while you were working on your grammar: "can we actually find an LL(1) factoring?". Also "can any other non-antlr tools consume this grammar?"

Since you chose antlr4 rather than antlr3, the LL(1) question in particular was obscured under the "antlr4 will parse anything!" sales pitch[1]. Which is fine as far as getting the grammar roughed out and running -- I'm not criticizing that decision, it was yours to make, as was the choice of antlr in the first place. But it _did_ dodge a question I've been quite persistent about asking; one which I wanted to have an answer for before considering the grammar "done".

Longer term, I would like whatever grammar we wind up denoting as canonical / documented / spec'ed to be as (re)target-able as possible. I've been relatively insistent on LL(1) since it is a nice intersection-of-inputs, practically guaranteed to parse under any framework we retarget it to. IOW I do _not_ want to force anyone working with rust grammars in the future to use antlr (3, 4, or anything else). That's too tool-specific[2]. A grammar that is trivally translatable between antlr4, antlr3, yapp2, llgen, llnextgen, coco, javacc, parsec, spirit, "some rust parser-generator", and so forth is my "eventual" goal here.

-Graydon

[1]: "We parse any grammar" is unfortunately common in parser-generator sales pitches these days, with a profusion GLR and LL(*) things. As a downstream consumer of parser-generator technology, let me point out that while I appreciate broad guarantees by tool-makers, I very much _dislike_ a tool that offers broad guarantees at the expense of being able to make promises about efficiency, grammar class and algorithmic complexity. IOW I actually prefer tools that can tell me what I need to leave out (or change) in my grammar in order to arrive at an efficient parser in a given complexity class. "Don't worry about it" is the wrong answer here. I want to worry about it.

[2]: we also seem to be most-invested in python for in-tree "maintainer-mode" tools associated with rust development; it seems like a lot to ask to install a JDK in order to verify the grammar. If the grammar-check _can_ be done in a python module, I'm happy to shift over to using it. Unless antlr-ness is an important part of the grammar in some way I'm not perceiving; do you have a strong preference for keeping the java + antlr dependency?
_______________________________________________
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


--
irc: pnkfelix on irc.mozilla.org
email: {fklock, pnkfelix}@mozilla.org

_______________________________________________
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev

Reply via email to