[rust-dev] parsing, ambiguity, and empty structs

Niko Matsakis Wed, 27 Feb 2013 07:36:29 -0800

A recent and very welcome pull request [1] pointed out Yet Another Ambiguity around struct syntax. If you have something like this:

    ... match x { ...

is that `match (x {})`, where `x` is the name of a struct and `x {}` is a struct literal, or `match x {` where `x` is the variable being matched and what follows are the arms?

Before I go any further, I want to emphasize that I am not picking on the author of the pull request. As I said, it's excellent work and the author made a logical decision on how to proceed with the ambiguity. However, since it is dealing with our grammar, it seems like we should decide how to resolve this with more discussion than a review on a pull request, so I thought I'd write up an e-mail describing the issue and gather some feedback.

Now, to some extent, you can resolve this if there are fields present because the code would look like:

    ... match x { field: ...

However, this breaks down if you have empty structs, which didn't use to be allowed but currently are. Plus it clearly requires more lookahead, though a bounded amount.
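To illustrate the lookahead idea, here is a minimal sketch of the decision a parser would face after seeing `x {`. It is written in the syntax a current Rust compiler accepts, and the names `Tok`, `Guess`, and `classify_after_brace` are made up for this example; it is not the actual parser code.

```rust
/// A tiny, hypothetical token type; just enough to show the lookahead.
#[derive(Debug, PartialEq)]
enum Tok {
    Ident(&'static str),
    Colon,
    RBrace,
    FatArrow,
}

/// What the parser can conclude from the tokens that follow `x {`.
#[derive(Debug, PartialEq)]
enum Guess {
    /// `x { field: ...` can only be a struct literal.
    StructLiteral,
    /// `x { ...` followed by anything else is read as arms or a block.
    ArmsOrBlock,
    /// `x {}` could be an empty struct literal or an empty body.
    Ambiguous,
}

/// `after_brace` holds the tokens immediately following the `{`.
/// Two tokens of lookahead are enough when fields are present.
fn classify_after_brace(after_brace: &[Tok]) -> Guess {
    match after_brace {
        [Tok::Ident(_), Tok::Colon, ..] => Guess::StructLiteral,
        [Tok::RBrace, ..] => Guess::Ambiguous,
        _ => Guess::ArmsOrBlock,
    }
}
```

The `Ambiguous` case is exactly the empty-struct problem: no fixed amount of lookahead past the `}` tells the two readings apart in general, so some rule has to pick one.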

The pull request took the approach of parsing `match x {}` as an empty struct literal, so to write a match with no arms (an admittedly bizarre thing to write) one must write `match (x) {}`. This is reasonable, but I personally find it somewhat surprising that `match x {}` would not parse (...and then likely lead to an exhaustiveness checking failure).

However, this same ambiguity arises in a lot of places: if/else-if expressions, match expressions, `do` and `for` expressions, and perhaps a few others. Currently I *think* we use lookahead for field names to resolve the ambiguity that arises with struct literals, but of course this doesn't work with empty structs. I'd like it if we could resolve this in a uniform way.


I see various options:

1. Treat `Foo {}` as a struct literal, requiring parentheses to disambiguate in some cases (e.g., `if (x) {}`). This is what the pull request does.

2. Declare that `Foo { ... }` literals must always have at least one field, and use newtype structs for the empty struct case.

3. Place a parser restriction on those contexts where `{` terminates the expression and say that struct literals cannot appear there unless they are in parentheses.


Some details follow.

### Treat `Foo {}` as a struct literal

I don't have anything more to say about this approach. =)

### Treat empty structs the way we treat enum variants?

Perhaps we should just not parse a declaration like:

    struct X {}

Instead, one would write something like:

    struct X;

or

    struct X();

much as you write:

    enum Foo { Y }

This would be a "new-type" struct so `X` would also serve as a value, just like the constant `Y` in the enum case. This would mean that one never writes a struct literal `Foo {}` but instead just `Foo`.
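As a rough sketch of how this reads in practice, written in the syntax a current Rust compiler accepts (where the variant is spelled `Foo::Y` rather than a bare `Y`), the struct name itself is the value and no braces ever appear. The helper names `make_x` and `make_y` are made up for illustration.

```rust
/// A field-less struct declared without braces.
#[derive(Debug, PartialEq)]
struct X;

/// The enum case the struct form is modeled on.
#[derive(Debug, PartialEq)]
enum Foo {
    Y,
}

/// The name `X` is used directly as a value; no `X {}` literal is needed.
fn make_x() -> X {
    X
}

/// The variant is used as a value in the same way.
fn make_y() -> Foo {
    Foo::Y
}
```

Because no expression of the form `X {}` exists under this option, a `{` that follows a bare identifier never has to be considered as the start of an empty literal.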


### Restrict where struct literals can appear

We could also just have a subclass of expressions which can appear in `if`, `do`, etc. This subclass would not permit struct literals. That means that `if Foo { x: 10 }.is_true() {}` or something would have to be written `if (Foo { x: 10 }.is_true()) { ... }`. This rule implies that very little lookahead is needed. Such rules can be a pain for the pretty printer, however. To some extent we already have a rule like this for `do` and `for`, since we will parse:

    ...for x.each |y...

as a method call with one argument and not `(x.each | y)`. Since this rule would presumably not apply to `if` etc., there would actually be three classes of expressions: those that can appear in `if`, those that can appear in `do`/`for`, and full expressions.
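One way such a restriction could be threaded through a recursive-descent parser is sketched below. This is a toy under stated assumptions, not the real parser: it is written in the syntax a current Rust compiler accepts, it only understands identifiers, parentheses, and empty braces, tokens must be separated by whitespace, and every name in it (`Tok`, `Expr`, `Restriction`, `Parser`) is made up for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Ident(String),
    LBrace,
    RBrace,
    LParen,
    RParen,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Var(String),
    EmptyStruct(String), // `Foo {}`
    Paren(Box<Expr>),
}

/// The expression class the caller is willing to accept.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Restriction {
    Unrestricted,
    NoStructLiteral,
}

struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    /// Builds a parser from whitespace-separated tokens.
    fn new(src: &str) -> Parser {
        let toks = src
            .split_whitespace()
            .map(|w| match w {
                "{" => Tok::LBrace,
                "}" => Tok::RBrace,
                "(" => Tok::LParen,
                ")" => Tok::RParen,
                other => Tok::Ident(other.to_string()),
            })
            .collect();
        Parser { toks, pos: 0 }
    }

    fn peek(&self) -> Option<&Tok> {
        self.toks.get(self.pos)
    }

    fn bump(&mut self) -> Option<Tok> {
        let tok = self.toks.get(self.pos).cloned();
        if tok.is_some() {
            self.pos += 1;
        }
        tok
    }

    fn parse_expr(&mut self, r: Restriction) -> Option<Expr> {
        match self.bump()? {
            Tok::LParen => {
                // Parentheses lift the restriction for the inner expression.
                let inner = self.parse_expr(Restriction::Unrestricted)?;
                match self.bump()? {
                    Tok::RParen => Some(Expr::Paren(Box::new(inner))),
                    _ => None,
                }
            }
            Tok::Ident(name) => {
                // Only one token of lookahead, and only when unrestricted.
                if r == Restriction::Unrestricted && self.peek() == Some(&Tok::LBrace) {
                    self.bump(); // consume `{`
                    match self.bump()? {
                        Tok::RBrace => Some(Expr::EmptyStruct(name)),
                        _ => None,
                    }
                } else {
                    Some(Expr::Var(name))
                }
            }
            _ => None,
        }
    }

    /// Parses what follows the `match` keyword: a restricted head
    /// expression, then an empty `{ }` body (arms are omitted here).
    fn parse_match_head(&mut self) -> Option<Expr> {
        let head = self.parse_expr(Restriction::NoStructLiteral)?;
        match (self.bump()?, self.bump()?) {
            (Tok::LBrace, Tok::RBrace) => Some(head),
            _ => None,
        }
    }
}
```

With this sketch, `Parser::new("x { }").parse_match_head()` reads `x` as the variable being matched, while `Parser::new("( x { } ) { }").parse_match_head()` reads the parenthesized `x { }` as an empty struct literal. A third expression class for `do`/`for` would presumably be one more variant of `Restriction`.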


### My personal opinion

I started out preferring the final option, but I am now leaning towards option #2, which seems to simplify the grammar overall and still requires only fixed lookahead to disambiguate.



Niko

[1] https://github.com/mozilla/rust/pull/5137
