On 09/10/2012 12:21 PM, James Boyden wrote:
Hi Rust-dev,
Hi!
To start with, here's the three-sentence summary of my post:
I propose 2 minor syntax alterations that very slightly extend the
existing "let" keyword in a logical way, to improve the syntax of
variable binding in destructuring pattern matching and closures.
By "improve", I mean that the syntax will become:
- more intuitive (i.e. the meaning and behaviour are self-evident at
the first encounter)
- more consistent with the basic variable-declaration binding syntax
- less exotic (or, more integrated with the rest of the Rust syntax)
- less magical/implicit (where magical effects "just happen") and
- less cryptic
- less reliant on the use of special sigils such as '|'.
I'm sending this post now, because I understand that the Rust syntax
will be slushifying soon (at the 0.4 release) and I wanted to propose
these alterations before the bar was raised even higher.
Thanks for such a well-thought-out post. My minor comments are going to
seem an insufficient response, but I hope others will reply as well.
I'll mention that both the pattern matching and closure syntaxes are
very cramped, with a lot of competing requirements.
== 0. Introduction ==
Over the past 12 years, I've been employed as a programmer of C++
(in particular), Python and C. I've "learned" (but not programmed
significant amounts of) Java, JavaScript, Common Lisp, Lua and Go.
For years, I've been on the lookout for a "better C++" that simplified
the core language but added better built-in support for concurrency,
memory safety and memory management. But I also wanted something that
retained an approximately C++-like syntax -- and more importantly,
a C++-like precision of expression (some level of static typing; some
constness; some form of RAII; and some form of compile-time generics).
I'm a huge fan of what you've designed and created with Rust. I agree
strongly with almost all the feature choices, and there's nothing
that really rubs me the wrong way. (Without turning this post into
a love letter about all my favourite features, I'll just say that
I was particularly impressed by the 3 memory realms as a solution
to the concurrency/memory-management/locking/performance problem.)
I'm glad to hear all this.
I've been watching Rust with interest since I first encountered 0.3
on Hacker News in early July. I'm looking forward to the language
spec solidifying and the compiler implementation maturing to a point
that it makes sense to start using Rust at home for my own projects.
I understand that a syntax-slushifying 0.4 release is on its way:
http://news.ycombinator.com/item?id=4467402
This is great news, but it also spurs me to write to you now about
the two issues I have with Rust: the implicit binding of variables
in destructuring pattern matching; and the exotic, cryptic closure
syntax.
== 1. Improving the destructuring variable binding syntax ==
When I first read the Rust Tutorial in July, I took "beginner's notes"
of the features and syntax that stood out for better or worse. It was
almost all positive -- in fact, the only negatives on my list were the
implicit variable binding in destructuring pattern matching, and the
exotic closure syntax (which I'll discuss more in section 2 below).
The destructuring variable binding syntax wasn't nearly as intuitive
(i.e. meaning and behaviour are self-evident at the first encounter)
and unambiguous (not easily misinterpreted to mean something else)
to me as the rest of the Rust syntax. Several times as I read through
the Tutorial, I had to scan back several pages to remind myself what
was happening when I encountered the destructuring syntax.
I agree the binding syntax is not intuitive in places.
Plus, after a few years of Python most recently (and a few occurrences
of the unintentional new-variable-definition mistake described here:
http://programmers.stackexchange.com/a/30098 ), I've become uneasy
with implicit variable declaration. For this reason, I agree strongly
with Rust's use of the "let" keyword to avoid this problem. I also
appreciate the ability to scan the code for "let", to discover quickly
and easily where a variable was declared.
Finally, I think this implicit variable binding makes the Rust syntax
less self-consistent: Inside a function, it's no longer the case that
a new variable is bound if and only if there's a "let" preceding it.
Now, sometimes, new variables can be bound magically.
Hence, I propose that the destructuring syntax be altered so that each
variable binding in a pattern must be preceded by the "let" keyword.
Thus, this example from the Tutorial:
http://dl.rust-lang.org/doc/tutorial.html#pattern-matching
fn angle(vector: (float, float)) -> float {
    match vector {
        (0f, y) if y < 0f => 1.5 * pi,
        (0f, y) => 0.5 * pi,
        (x, y) => float::atan(y / x)
    }
}
would become this:
fn angle(vector: (float, float)) -> float {
    match vector {
        (0f, let y) if y < 0f => 1.5 * pi,
        (0f, let y) => 0.5 * pi,
        (let x, let y) => float::atan(y / x)
    }
}
I think that most people will see this as too verbose, but I am sort of
warming to this and the reason is `ref`. Currently, patterns implicitly
bind by reference, but in the future they will create a copy, and to
bind a reference you will need to write `ref`, as in `(0f, ref y) =>`.
This is the only place that `ref` exists in the language. Saying that a
binding is always introduced using either `let` or `ref` at least makes
`ref` not stand out as bad. Of course, to be consistent with the goal of
always using `let`, we would probably have to use `let ref` and then
we're sort of back in the same spot.
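To make the `ref` point concrete, here is a rough sketch in present-day Rust (not the 0.4-era syntax under discussion). The function and variable names are made up for the example; the only thing it is meant to show is that `ref` in a pattern borrows the matched value in place instead of copying or moving it.

```rust
/// Returns the length of the name without moving it out of `entry`.
/// (`name_len` and `entry` are hypothetical names for illustration.)
fn name_len(entry: &Option<String>) -> usize {
    match *entry {
        // `ref name` binds a reference to the String inside the Option,
        // so nothing is moved or copied out of `entry`.
        Some(ref name) => name.len(),
        None => 0,
    }
}

fn main() {
    let entry = Some(String::from("rust"));
    println!("{}", name_len(&entry));
    // `entry` is still usable here because the pattern only borrowed from it.
    println!("{:?}", entry);
}
```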
Similarly, this example of record destructuring:
http://dl.rust-lang.org/doc/tutorial.html#struct-patterns
struct Point { x: float, y: float }
match mypoint {
    Point { x: 0.0, y: y } => { /* use y */ }
    Point { x: x, y: y } => { /* use x and y */ }
}
would become:
struct Point { x: float, y: float }
match mypoint {
    Point { x: 0.0, y: let y } => { /* use y */ }
    Point { x: let x, y: let y } => { /* use x and y */ }
}
And finally, this example of enum destructuring:
http://dl.rust-lang.org/doc/tutorial.html#enum-patterns
fn area(sh: shape) -> float {
    match sh {
        circle(_, size) => float::consts::pi * size * size,
        rectangle({x, y}, {x: x2, y: y2}) => (x2 - x) * (y2 - y)
    }
}
would become:
fn area(sh: shape) -> float {
    match sh {
        circle(_, let size) => float::consts::pi * size * size,
        rectangle({let x, let y}, {x: let x2, y: let y2})
            => (x2 - x) * (y2 - y)
    }
}
This last example was particularly cryptic to me in its original form
("Which 'x' is which?") but it becomes much clearer with the insertion
of the "let" keyword in the appropriate locations.
Record destructuring is confusing for me too, and I agree 'let' makes it
much clearer.
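For anyone puzzling over the "which 'x' is which?" question, a small sketch in present-day Rust may help. It assumes a hypothetical `Point` struct and `rect_area` function, neither taken from the tutorial. In a struct pattern the name to the left of the colon is the existing field and the name to the right is the new binding; the shorthand `{ x, y }` uses the same name for both.

```rust
// Hypothetical types and names, for illustration only.
struct Point {
    x: f64,
    y: f64,
}

fn rect_area(top_left: Point, bottom_right: Point) -> f64 {
    // Shorthand: binds new variables `x` and `y` from the fields `x` and `y`.
    let Point { x, y } = top_left;
    // Explicit form: field `x` is bound to new variable `x2`, field `y` to `y2`.
    let Point { x: x2, y: y2 } = bottom_right;
    (x2 - x) * (y2 - y)
}

fn main() {
    let a = Point { x: 0.0, y: 0.0 };
    let b = Point { x: 3.0, y: 2.0 };
    println!("{}", rect_area(a, b));
}
```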
I've skimmed the recent Rust-dev archives, and I saw this thread that
discussed the same part of the destructuring syntax from a different
point-of-view:
https://mail.mozilla.org/pipermail/rust-dev/2012-August/002258.html
(The original poster was concerned more about ambiguity of existing
enums vs binding new variables, but we're both focussed on the same
part of the syntax.)
I think my proposal would address that poster's concerns too, without
violating any "Hard requirements", "Misuse avoidances" or "Ergonomics"
listed by Graydon Hoare in this reply:
https://mail.mozilla.org/pipermail/rust-dev/2012-August/002272.html
This approach would also avoid adding any new sigils, instead re-using
a short, existing keyword in a logical (and very minimal) extension of
its current meaning.
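The ambiguity mentioned above (an existing name versus a new binding) can be sketched in present-day Rust as follows. The constant `LIMIT` and the function `classify` are hypothetical. Nothing in the pattern syntax itself distinguishes the two cases; the meaning depends on whether the identifier already names a constant (or enum variant) in scope.

```rust
// Hypothetical constant, for illustration only.
const LIMIT: u32 = 10;

fn classify(n: u32) -> &'static str {
    match n {
        // `LIMIT` names an existing constant, so this arm matches only 10.
        LIMIT => "at the limit",
        // `other` names nothing in scope, so it is a fresh binding
        // that matches any value; the guard then narrows it.
        other if other < LIMIT => "below",
        _ => "above",
    }
}

fn main() {
    println!("{}", classify(10));
    println!("{}", classify(3));
    println!("{}", classify(11));
}
```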
OK, so how do let bindings work under this scheme? Here's current syntax:
let (foo, bar) = baz();
Any irrefutable pattern can go after 'let'. So does that become:
let (let foo, let bar) = baz();
Just writing `(let foo, let bar)`, allowing patterns to begin
statements, is probably unparseable.
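As a reminder of what "irrefutable" means here, a minimal sketch in present-day Rust. `baz` is a made-up stand-in for the function above, and the other names are also hypothetical. A pattern that can never fail to match may follow `let` directly, while one that can fail needs `if let` or `match`.

```rust
// Hypothetical stand-in for the `baz()` mentioned above.
fn baz() -> (i32, i32) {
    (3, 4)
}

fn sum_pair() -> i32 {
    // A tuple pattern always matches a tuple, so it is irrefutable
    // and may follow `let` directly.
    let (foo, bar) = baz();
    foo + bar
}

fn first_or_zero(maybe_pair: Option<(i32, i32)>) -> i32 {
    // `Some(..)` can fail to match, so it is refutable and needs
    // `if let` (or `match`) rather than a plain `let`.
    if let Some((first, _)) = maybe_pair {
        first
    } else {
        0
    }
}

fn main() {
    println!("{}", sum_pair());
    println!("{}", first_or_zero(Some((5, 6))));
    println!("{}", first_or_zero(None));
}
```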
== 2. Improving the closure variable binding syntax ==
As I mentioned above, my only other issue with the current state of
the Rust language is the syntax of closures. The exotic '|x|' syntax
makes closures look cryptic and mysterious. The use of the pipe sign
offers no intuitive (to a C-family programmer) clues as to what '|x|'
means or does.
It is an exotic closure syntax.
I think that closures should be a seamlessly-integrated, decidedly
non-exotic part of Rust. Closures shouldn't seem any more mysterious
than heap allocation or pointers. And finally, it would be nice if
the closure variable binding was preceded by the "let" keyword. ;)
(I glossed over this point in the previous section, to avoid becoming
mired in details, and because closures could sort of be considered
functions if you squint at them just right. ;)
I'd been staring at my proposed destructuring variable binding syntax
for a while, when it dawned on me that a closure is very similar to
a destructuring pattern matching arm in a 'match' construct. If you
ignore the different defining characteristics (a closure is allocated
somewhere in memory and referenced through a pointer, while a pattern
matching arm is just code trapped in an enclosing 'match' construct),
you observe that both constructs define unnamed, function-like blocks
of code that accept some arity of named parameters and are able to
access variables from the enclosing scope.
Why not make their syntax more similar to emphasise this similarity?
It would make closures seem less mysterious, and the Rust syntax would
be more self-consistent overall.
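The similarity can be seen side by side in a small sketch written in present-day Rust, which keeps the '|x|' closure syntax. The function names are hypothetical. Both forms destructure a tuple with the same pattern and both can use a variable from the enclosing scope.

```rust
// Hypothetical function names, for illustration only.
fn sum_with_closure(pair: (i32, i32), z: i32) -> i32 {
    // The closure parameter is a tuple pattern; `z` is captured
    // from the enclosing scope.
    let add = |(x, y): (i32, i32)| x + y + z;
    add(pair)
}

fn sum_with_match(pair: (i32, i32), z: i32) -> i32 {
    match pair {
        // The same tuple pattern as a match arm; `z` is simply in scope.
        (x, y) => x + y + z,
    }
}

fn main() {
    println!("{}", sum_with_closure((1, 2), 10));
    println!("{}", sum_with_match((1, 2), 10));
}
```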
To remind you, this is the destructuring variable binding syntax that
I'm proposing, in which the variable binding is preceded by "let":
match foo {
    /* arms consisting of patterns and expressions or code blocks */
    (let x, let y) => /* use x, y, and maybe z from enclosing scope */
}
Hence, I propose that the closure syntax be altered from this:
|x, y| { /* use x, y, and maybe z from the enclosing scope */ }
to this:
&(let x, let y) => /* use x, y, and maybe z from enclosing scope */
This would have the following benefits:
1. Now *all* (really "all", this time ;) non-function-parameter
variable bindings in a function are preceded by the "let" keyword
(which is a more self-explanatory syntax than '|x|', and also
makes it easier to scan to see where a variable was bound).
Since closures are functions and closure arguments are function
arguments I don't really see this as being more consistent, but perhaps
'differently consistent'.
2. The exotic, cryptic '|' sigil is not used.
That is nice.
3. Parameter declarations for a function-like code block are enclosed
in the familiar parentheses construct.
Also nice.
4. The similarity to the 'match' construct arm is emphasised.
5. Closures are no more mysterious than heap allocation or pointers
(which is made explicit by the pointer sigil out the front).
Here is what the examples from the Closures chapter would look like:
http://dl.rust-lang.org/doc/tutorial.html#closures
let bloop = &(let well, let oh: mygoodness) -> what_the => /* ... */;
let mut max = 0;
(~[1, 2, 3]).map(&(let x) => if x > max { max = x });
fn mk_appender(suffix: str) -> fn@(str) -> str {
    ret @(let s: str) -> str => s + suffix;
}
fn call_twice(f: fn()) { f(); f(); }
call_twice(&() => ~"I am a stack closure");
call_twice(@() => ~"I am a boxed closure");
call_twice(~() => ~"I am a unique closure");
The above example demonstrates the closest this proposed syntax comes
to ambiguity: On its own, @() could be interpreted as "allocate a box
of nil on the task heap". (But unless I'm mistaken, the "fat arrow"
that immediately follows would be sufficient to disambiguate?)
No other arities of closure even flirt with any potential ambiguity,
due to the "let" keyword before the variable name, which distinguishes
the closure parameters from a parenthesised variable/enum or a tuple.
Interesting. We'd considered syntaxes like `(foo, bar) => baz` but they
couldn't be parsed without unbounded lookahead. Your proposal does seem
to fix that problem.
Finally, the "real use" examples of a closure in combination with the
'each' function:
each(~[1, 2, 3], &(let n) => {
    debug!("%i", n);
    do_some_work(n);
});
do each(~[1, 2, 3]) &(let n) => {
    debug!("%i", n);
    do_some_work(n);
}
I would still want the option of inferring the storage for the
closure, like `do foo.each (let n) => {`.
I do think the 'let's here make the syntax not that aesthetically
pleasing. Also the parens for closure syntax make this harder for
eyeballs to parse.
do foo.each(bar) (let baz) => {
The argument lists look pretty similar.
Also:
do foo.each() () => {
do foo.each () => {
Weirdness.
I find that in these two "real use" examples in particular, changing
'|n|' to '&(let n)' improves the self-descriptiveness of the closure
syntax.
== 3. Conclusion ==
In closing, I think that either of these two proposed syntax changes
(to the variable binding syntax of destructuring pattern matching and
closures) would individually contribute to improving the readability,
learnability, predictability and un-ambiguity of the language.
Further, I think that the positive effects would be even greater if
the syntax changes were applied together, due to the aforementioned
emphasis of the similarities of the two constructs and the overall
increase in language syntax self-consistency.
Thanks for your time,
Thank you. Great suggestions.
-Brian