Hi,

In the process of writing rustc (the second, self-hosted compiler), we're revisiting a number of minor (and some major) choices made in the implementation of rustboot. One of these is the division of syntactic forms into expressions and statements.

The concept of a statement is a little bit arbitrary. There are languages that do not have such a concept, but they're few; most languages have *some* concept of a form shorter-than-a-program that represents a single "chunk" of declaration-or-computation.

The distinction is essentially syntactic, though at a semantic level it captures the conceptual difference between forms which can, or cannot, produce "values" as natural "results" of their execution. In a language with a first class unit-value like rust, the concept becomes blurrier: any "just for side-effects" execution can also be considered as a unit-value expression.

Many unit-value-equipped languages sink many more syntactic forms into the expression grammar than we initially put in rustboot. In particular, rustboot treats all of these as statements rather than expressions:

  - Blocks
  - All 'movement' operations (assignment, send, receive) that have
    a guaranteed side-effect.
  - All branches (if, alt, alt type, etc.)
  - All loops (while, do-while, for, for-each)
  - All non-local jumps (ret, put, break, cont).
  - All 'diagnostic' operations (log, note, check, claim, prove)
    that may not execute and in any case would be unit-valued.
  - All declarations (let, auto, type, obj, tag, mod).

By "treats as statement" I mean, in particular, that they "have no value" and cannot syntactically nest inside any of the more "natural" expression nodes (binary and unary operators, function calls).

Part of the motivation here was to provide a simplified flow-graph on which to run the typestate algorithm; part of it was my own bias against programs that nest too deeply rather than just using more lines of text. In any case, in rustc we're revisiting this classification: the flow-graph argument isn't strong enough to justify inconveniencing users, and the bias argument is easily outweighed by the number of cases that benefit in readability from throwing (say) conditionals into the expression tree.

The only *essential* statements we're changing to expressions are blocks: once a block can be an expression (with a terminal expression-statement that provides its "value") then all other statements can effectively nest into "expression position" by wrapping them in a block. This is the C-with-GNU-extensions option, and it works. But for convenience, we're considering making a few of the others into expressions as well.
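
As a rough sketch of the block-as-expression idea: the example below uses present-day Rust syntax, which is an assumption on my part and not the rustboot syntax under discussion. The function name `clamp_and_scale` and the bounds are made up for illustration. The point is only that a block with statements inside it can sit in expression position, with its final expression supplying the value.

```rust
// Hypothetical helper, named for illustration only.
// Uses present-day Rust syntax, not rustboot's.
fn clamp_and_scale(input: i32) -> i32 {
    // The braces form a block nested inside a binary operator.
    // The `let` statements run first; the final line, which has no
    // trailing semicolon, supplies the block's value.
    2 * {
        let lower = 0;
        let upper = 100;
        input.max(lower).min(upper)
    }
}
```

Any statement form can reach expression position this way: wrap it in a block and end the block with the expression whose value you want.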

I thought I'd conduct a straw poll here to see which of the above enumerated statement forms you'd like to sink into the expression grammar. At the extreme end, you sink *everything*: the language then has the syntactic structure of lisp, and you can do things like:

    auto x = break + type t = int;

Personally that makes me a bit queasy, and I think it *might* be painting us into some corners syntactically; but it is plausible. At the moment the implementation is doing less than this: sinking blocks, branches, loops and movement operations into expressions, but stopping short of the non-local control flow operators, diagnostic operators and the declarations. That is: anything that definitely, by construction, does not and cannot be understood as having a "result value", remains classified as a statement.
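
To make the intermediate position a little more concrete, here is a minimal sketch, again in present-day Rust syntax (an assumption; it is not what rustboot accepts), with hypothetical function names. It shows a branch used for its value, and a loop that is an expression but can only ever be unit-valued.

```rust
// Hypothetical examples, named for illustration only.
// Uses present-day Rust syntax, not rustboot's.

fn classify(n: i32) -> &'static str {
    // A branch in expression position: each arm yields a value,
    // and the whole `if` becomes the function's result.
    if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    }
}

fn count_to(limit: u32) -> u32 {
    let mut count = 0;
    // A `while` loop can appear where an expression is expected,
    // but the only value it can produce is unit.
    let _unit: () = while count < limit {
        count += 1;
    };
    count
}
```

The branch clearly benefits from being an expression; the loop gains little beyond uniformity, which is part of why the line feels slightly arbitrary.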

This is a definable, but slightly arbitrary, place to draw the line -- only one of the loops (do-while) can even *potentially* be typed as non-unit -- so I'm curious where others would draw the line (if at all). Opinions?

-Graydon

_______________________________________________
Rust-dev mailing list
https://mail.mozilla.org/listinfo/rust-dev
