Hi,
In the process of writing rustc (the second, self-hosted compiler),
we're revisiting a number of minor (and some major) choices made in the
implementation of rustboot. One of these is the division of syntactic
forms into expressions and statements.
The concept of a statement is a little bit arbitrary. There are
languages that do not have such a concept, but they're few; most
languages have *some* concept of a form shorter-than-a-program that
represents a single "chunk" of declaration-or-computation.
The distinction is essentially syntactic, though at a semantic level it
captures the conceptual difference between forms which can, or cannot,
produce "values" as natural "results" of their execution. In a language
with a first class unit-value like rust, the concept becomes blurrier:
any "just for side-effects" execution can also be considered as a
unit-value expression.
Many unit-value-equipped languages sink far more syntactic forms into
the expression grammar than we initially put in rustboot. In particular,
rustboot treats all of these as statements rather than expressions:
- Blocks.
- All 'movement' operations (assignment, send, receive) that have
  a guaranteed side-effect.
- All branches (if, alt, alt type, etc.).
- All loops (while, do-while, for, for-each).
- All non-local jumps (ret, put, break, cont).
- All 'diagnostic' operations (log, note, check, claim, prove)
  that may not execute and in any case would be unit-valued.
- All declarations (let, auto, type, obj, tag, mod).
By "treats as statement" I mean, in particular, that they "have no
value" and cannot syntactically nest inside any of the more "natural"
expression nodes (binary and unary operators, function calls).
Part of the motivation here was to provide a simplified flow-graph on
which to run the typestate algorithm; part of it was my own bias against
programs that nest too deeply rather than just using more lines of text.
In any case, in rustc we're revisiting this classification; the
flow-graph argument isn't strong enough to justify inconveniencing
users, and the bias argument is easily counterweighed by the number of
cases that benefit in readability from throwing (say) conditionals into
the expression tree.
The only *essential* statements we're changing to expressions are
blocks: once a block can be an expression (with a terminal
expression-statement that provides its "value") then all other
statements can effectively nest into "expression position" by wrapping
them in a block. This is the C-with-GNU-extensions option, and it works.
But for convenience, we're considering making a few of the others into
expressions as well.
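A small sketch of the block-as-expression idea, written in later Rust syntax rather than anything rustboot accepts (that choice of syntax is an assumption, and both function names are made up for illustration):

```rust
// Hypothetical helper: a block used in expression position.
fn block_value() -> i32 {
    // The block runs its statements in order; its final expression,
    // written without a trailing semicolon, becomes the block's value.
    let x = {
        let a = 3;
        let b = 4;
        a + b
    };
    x
}

// Hypothetical helper: a plain statement nested into expression
// position by wrapping it in a block. A block that ends in a
// statement has the unit value `()`.
fn wrapped_statement() -> (i32, ()) {
    let mut counter = 0;
    let unit = {
        counter += 1; // an ordinary statement, wrapped in a block
    };
    (counter, unit)
}
```

Here `block_value()` yields 7, and `wrapped_statement()` yields the updated counter paired with the unit value.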
I thought I'd conduct a straw poll here to see which of the above
enumerated statement forms you'd like to sink into the expression
grammar. At the extreme end, you sink *everything*: the language then
has the syntactic structure of lisp, and you can do things like:
    auto x = break + type t = int;
Personally that makes me a bit queasy, and I think it *might* be
painting us into some corners syntactically; but it is plausible. At the
moment the implementation is doing less than this: sinking blocks,
branches, loops and movement operations into expressions, but stopping
short of the non-local control flow operators, diagnostic operators and
the declarations. That is: anything that definitely, by construction,
does not and cannot be understood as having a "result value", remains
classified as a statement.
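To make the difference concrete, here is a rough sketch of the same code in statement style and in expression style. It uses later Rust syntax as a stand-in (an assumption: `match` plays the role of `alt`, and all function names are hypothetical):

```rust
// Statement style: the branch has no value, so a variable is declared
// first and then assigned in each arm.
fn sign_statement_style(n: i32) -> &'static str {
    let label;
    if n < 0 {
        label = "negative";
    } else {
        label = "non-negative";
    }
    label
}

// Expression style: the branch itself yields the value.
fn sign_expression_style(n: i32) -> &'static str {
    if n < 0 { "negative" } else { "non-negative" }
}

// A multi-way branch nested directly inside a call's argument list.
fn describe(n: u32) -> String {
    format!("{} item{}", n, match n { 1 => "", _ => "s" })
}

// A 'movement' operation in expression position: the assignment
// performs its side effect and evaluates to the unit value.
fn assignment_is_unit() -> (i32, ()) {
    let mut x = 0;
    let unit: () = x = 5;
    x += 1;
    (x, unit)
}
```

Both `sign_*` functions return the same strings; the expression form just avoids the separate declaration and the two assignments.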
This is a definable, but slightly arbitrary, place to draw the line --
only one of the loops (do-while) can even *potentially* be typed as
non-unit -- so I'm curious where others would draw the line (if at all).
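A sketch of what typing loops as expressions can look like, again in later Rust syntax. Later Rust has no do-while, so `loop` with a valued `break` is used here only as the nearest stand-in for a loop that can carry a non-unit result; that substitution and the function names are assumptions, not a description of what rustc does today:

```rust
// A `while` loop in expression position: its value is always unit,
// since the body may run zero times and there is nothing to yield.
fn while_is_unit() -> (u32, ()) {
    let mut i = 0;
    let unit: () = while i < 3 {
        i += 1;
    };
    (i, unit)
}

// A loop whose body always runs at least once can hand back a value
// when it exits; here the value is supplied by `break`.
fn loop_with_value() -> u32 {
    let mut i = 0;
    loop {
        i += 1;
        if i * i > 50 {
            break i; // the loop expression evaluates to `i`
        }
    }
}
```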
Opinions?
-Graydon
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev