Hi Rust-dev,

To start with, here's the three-sentence summary of my post:

I propose 2 minor syntax alterations that very-slightly extend the existing "let" keyword in a logical way, to improve the syntax of variable binding in destructuring pattern matching and closures. By "improve", I mean that the syntax will become:

 - more intuitive (i.e. the meaning and behaviour are self-evident at the first encounter)
 - more consistent with the basic variable-declaration binding syntax
 - less exotic (or, more integrated with the rest of the Rust syntax)
 - less magical/implicit (where magical effects "just happen")
 - less cryptic, and
 - less reliant on the use of special sigils such as '|'.

I'm sending this post now, because I understand that the Rust syntax will be slushifying soon (at the 0.4 release) and I wanted to propose these alterations before the bar was raised even higher.


== 0. Introduction ==

Over the past 12 years, I've been employed as a programmer of C++ (in particular), Python and C. I've "learned" (but not programmed significant amounts of) Java, JavaScript, Common Lisp, Lua and Go.

For years, I've been on the lookout for a "better C++" that simplified the core language but added better built-in support for concurrency, memory safety and memory management. But I also wanted something that retained an approximately C++-like syntax -- and more importantly, a C++-like precision of expression (some level of static typing; some constness; some form of RAII; and some form of compile-time generics).

I'm a huge fan of what you've designed and created with Rust. I agree strongly with almost all the feature choices, and there's nothing that really rubs me the wrong way. (Without turning this post into a love letter about all my favourite features, I'll just say that I was particularly impressed by the 3 memory realms as a solution to the concurrency/memory-management/locking/performance problem.)

I've been watching Rust with interest since I first encountered 0.3 on Hacker News in early July.
I'm looking forward to the language spec solidifying and the compiler implementation maturing to a point that it makes sense to start using Rust at home for my own projects.

I understand that a syntax-slushifying 0.4 release is on its way:

    http://news.ycombinator.com/item?id=4467402

This is great news, but it also spurs me to write to you now about the two issues I have with Rust: the implicit binding of variables in destructuring pattern matching; and the exotic, cryptic closure syntax.


== 1. Improving the destructuring variable binding syntax ==

When I first read the Rust Tutorial in July, I took "beginner's notes" of the features and syntax that stood out for better or worse. It was almost all positive -- in fact, the only negatives on my list were the implicit variable binding in destructuring pattern matching, and the exotic closure syntax (which I'll discuss more in section 2 below).

The destructuring variable binding syntax wasn't nearly as intuitive (i.e. meaning and behaviour are self-evident at the first encounter) and unambiguous (not easily misinterpreted to mean something else) to me as the rest of the Rust syntax. Several times as I read through the Tutorial, I had to scan back several pages to remind myself what was happening when I encountered the destructuring syntax.

Plus, after a few years of Python most recently (and a few occurrences of the unintentional new-variable-definition mistake described here: http://programmers.stackexchange.com/a/30098 ), I've become uneasy with implicit variable declaration. For this reason, I agree strongly with Rust's use of the "let" keyword to avoid this problem. I also appreciate the ability to scan the code for "let", to discover quickly and easily where a variable was declared.

Finally, I think this implicit variable binding makes the Rust syntax less self-consistent: Inside a function, it's no longer the case that a new variable is bound if and only if there's a "let" preceding it.
Now, sometimes, new variables can be bound magically.

Hence, I propose that the destructuring syntax be altered so that each variable binding in a pattern must be preceded by the "let" keyword.

Thus, this example from the Tutorial:

    http://dl.rust-lang.org/doc/tutorial.html#pattern-matching

    fn angle(vector: (float, float)) -> float {
        match vector {
            (0f, y) if y < 0f => 1.5 * pi,
            (0f, y) => 0.5 * pi,
            (x, y) => float::atan(y / x)
        }
    }

would become this:

    fn angle(vector: (float, float)) -> float {
        match vector {
            (0f, let y) if y < 0f => 1.5 * pi,
            (0f, let y) => 0.5 * pi,
            (let x, let y) => float::atan(y / x)
        }
    }

Similarly, this example of record destructuring:

    http://dl.rust-lang.org/doc/tutorial.html#struct-patterns

    struct Point { x: float, y: float }

    match mypoint {
        Point { x: 0.0, y: y } => { /* use y */ }
        Point { x: x, y: y } => { /* use x and y */ }
    }

would become:

    struct Point { x: float, y: float }

    match mypoint {
        Point { x: 0.0, y: let y } => { /* use y */ }
        Point { x: let x, y: let y } => { /* use x and y */ }
    }

And finally, this example of enum destructuring:

    http://dl.rust-lang.org/doc/tutorial.html#enum-patterns

    fn area(sh: shape) -> float {
        match sh {
            circle(_, size) => float::consts::pi * size * size,
            rectangle({x, y}, {x: x2, y: y2}) => (x2 - x) * (y2 - y)
        }
    }

would become:

    fn area(sh: shape) -> float {
        match sh {
            circle(_, let size) => float::consts::pi * size * size,
            rectangle({let x, let y}, {x: let x2, y: let y2}) => (x2 - x) * (y2 - y)
        }
    }

This last example was particularly cryptic to me in its original form ("Which 'x' is which?") but it becomes much clearer with the insertion of the "let" keyword in the appropriate locations.

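
For readers who want something that compiles with a current toolchain, here is a minimal sketch of struct destructuring in present-day Rust syntax, where bindings in patterns are implicit. It is not taken from the Tutorial: the `describe` function is hypothetical, and I'm assuming `i32` fields instead of `float` so that the literal pattern stays simple. The proposed "let" forms appear only in comments, since they are not valid Rust.

```rust
// Hypothetical example type, modelled on the Tutorial's Point record
// but with integer fields (an assumption made for this sketch).
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical helper: describes where a point lies.
fn describe(p: &Point) -> String {
    match *p {
        // `y` is bound implicitly by the pattern.
        // Under the proposal this arm would read: Point { x: 0, y: let y }
        Point { x: 0, y } => format!("on the y axis at {}", y),
        // Both `x` and `y` are bound implicitly.
        // Under the proposal: Point { x: let x, y: let y }
        Point { x, y } => format!("at ({}, {})", x, y),
    }
}
```

Calling `describe(&Point { x: 0, y: 5 })` takes the first arm and `describe(&Point { x: 2, y: 3 })` takes the second. Nothing in either arm marks `x` or `y` as a newly introduced name, which is the point the comments try to make visible.
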
I've skimmed the recent Rust-dev archives, and I saw this thread that discussed the same part of the destructuring syntax from a different point-of-view:

    https://mail.mozilla.org/pipermail/rust-dev/2012-August/002258.html

(The original poster was concerned more about ambiguity of existing enums vs binding new variables, but we're both focussed on the same part of the syntax.)

I think my proposal would address that poster's concerns too, without violating any "Hard requirements", "Misuse avoidances" or "Ergonomics" listed by Graydon Hoare in this reply:

    https://mail.mozilla.org/pipermail/rust-dev/2012-August/002272.html

This approach would also avoid adding any new sigils, instead re-using a short, existing keyword in a logical (and very minimal) extension of its current meaning.


== 2. Improving the closure variable binding syntax ==

As I mentioned above, my only other issue with the current state of the Rust language is the syntax of closures.

The exotic '|x|' syntax makes closures look cryptic and mysterious. The use of the pipe sign offers no intuitive (to a C-family programmer) clues as to what '|x|' means or does. I think that closures should be a seamlessly-integrated, decidedly non-exotic part of Rust. Closures shouldn't seem any more mysterious than heap allocation or pointers.

And finally, it would be nice if the closure variable binding was preceded by the "let" keyword. ;) (I glossed over this point in the previous section, to avoid becoming mired in details, and because closures could sort of be considered functions if you squint at them just right. ;)

I'd been staring at my proposed destructuring variable binding syntax for a while, when it dawned on me that a closure is very similar to a destructuring pattern matching arm in a 'match' construct.
If you ignore the different defining characteristics (a closure is allocated somewhere in memory and referenced through a pointer, while a pattern matching arm is just code trapped in an enclosing 'match' construct), you observe that both constructs define unnamed, function-like blocks of code that accept some arity of named parameters and are able to access variables from the enclosing scope.

Why not make their syntax more similar to emphasise this similarity? It would make closures seem less mysterious, and the Rust syntax would be more self-consistent overall.

To remind you, this is the destructuring variable binding syntax that I'm proposing, in which the variable binding is preceded by "let":

    match foo {
        /* arms consisting of patterns and expressions or code blocks */
        (let x, let y) => /* use x, y, and maybe z from enclosing scope */
    }

Hence, I propose that the closure syntax be altered from this:

    |x, y| { /* use x, y, and maybe z from the enclosing scope */ }

to this:

    &(let x, let y) => /* use x, y, and maybe z from enclosing scope */

This would have the following benefits:

 1. Now *all* (really "all", this time ;) non-function-parameter variable bindings in a function are preceded by the "let" keyword (which is a more self-explanatory syntax than '|x|', and also makes it easier to scan to see where a variable was bound).
 2. The exotic, cryptic '|' sigil is not used.
 3. Parameter declarations for a function-like code block are enclosed in the familiar parentheses construct.
 4. The similarity to the 'match' construct arm is emphasised.
 5. Closures are no more mysterious than heap allocation or pointers (which is made explicit by the pointer sigil out the front).

Here is what the examples from the Closures chapter would look like:

    http://dl.rust-lang.org/doc/tutorial.html#closures

    let bloop = &(let well, let oh: mygoodness) -> what_the => /* ... */;

    let mut max = 0;
    (~[1, 2, 3]).map(&(let x) => if x > max { max = x });

    fn mk_appender(suffix: str) -> fn@(str) -> str {
        ret @(let s: str) -> str => s + suffix;
    }

    fn call_twice(f: fn()) { f(); f(); }
    call_twice(&() => ~"I am a stack closure");
    call_twice(@() => ~"I am a boxed closure");
    call_twice(~() => ~"I am a unique closure");

The above example demonstrates the closest this proposed syntax comes to ambiguity: On its own, @() could be interpreted as "allocate a box of nil on the task heap". (But unless I'm mistaken, the "fat arrow" that immediately follows would be sufficient to disambiguate?)

No other arities of closure even flirt with any potential ambiguity, due to the "let" keyword before the variable name, which distinguishes the closure parameters from a parenthesised variable/enum or a tuple.

Finally, the "real use" examples of a closure in combination with the 'each' function:

    each(~[1, 2, 3], &(let n) => {
        debug!("%i", n);
        do_some_work(n);
    });

    do each(~[1, 2, 3]) &(let n) => {
        debug!("%i", n);
        do_some_work(n);
    }

I find that in these two "real use" examples in particular, changing '|n|' to '&(let n)' improves the self-descriptiveness of the closure syntax.


== 3. Conclusion ==

In closing, I think that either of these two proposed syntax changes (to the variable binding syntax of destructuring pattern matching and closures) would individually contribute to improving the readability, learnability, predictability and un-ambiguity of the language.

Further, I think that the positive effects would be even greater if the syntax changes were applied together, due to the aforementioned emphasis of the similarities of the two constructs and the overall increase in language syntax self-consistency.

Thanks for your time,
jb
