Re: [rust-dev] Implicit environment capture for closures - a compromise?

Graydon Hoare Mon, 30 Aug 2010 19:05:04 -0700

On 10-08-30 02:07 PM, David Herman wrote:

> In principle, I'd prefer to have a lambda form that can implicitly bind upvars,
> although I think I'd look at this from a slightly different direction than
> Sebastian.

I think we may have a divergence of topic here. Not an unhealthy one -- we should discuss both at a little length, given the importance attached to these matters -- but let's be clear on what Sebastian proposed and what you're talking about. There are two kinds of relatively different lambdas here:


  - Down-Lambdas (I'll call them this for now) which can't outlive their
    current scope, and thereby alias the environment they're in by
    frame pointer, directly and at minimal cost, about the same as
    the current foreach blocks do.

  - Heap-Lambdas (again, hope this isn't offputting) which you're
    describing. These are an expression form of our local fn items
    that can be placed in a bind expression. Or perhaps bypass a bind
    expression altogether.
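
For readers who know present-day Rust better than the 2010 language, the two kinds of lambda above can be loosely sketched as follows. This is only an illustration in modern Rust syntax, not the syntax or semantics under discussion in this thread, and the names for_each_item, sum_with_down_closure and make_adder are hypothetical.

```rust
// Hypothetical helper: calls a borrowed closure once per element.
// The closure is only lent to this function, so it cannot outlive
// the caller's frame (roughly the "down-lambda" idea).
fn for_each_item(items: &[i32], f: &mut dyn FnMut(i32)) {
    for &item in items {
        f(item);
    }
}

// "Down" style: the closure aliases `total` in the enclosing frame
// and is gone by the time the call returns.
fn sum_with_down_closure(items: &[i32]) -> i32 {
    let mut total = 0;
    for_each_item(items, &mut |x| total += x);
    total
}

// "Heap" style: the closure owns a copy of `n` and is boxed, so it
// may escape the frame that created it.
fn make_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}
```

Under these assumptions, sum_with_down_closure(&[1, 2, 3]) yields 6 and make_adder(10)(5) yields 15. Modern Rust rejects an attempt to return the borrowing closure from sum_with_down_closure, which is one answer to the escape problem, though not necessarily the one contemplated here.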

Sebastian is proposing down-lambdas as an addition so that we can support a few pass-custom-logic-in idioms (presumably parallel-iter, also the obvious like map and filter). We need to get aliasing rules right here, which are complex on params and possibly fatal on the closure value itself, but it seems *plausible*.

In contrast, you're proposing we try to come up with rules by which heap-lambdas can be made to safely capture their environment, merely lowering the syntax barriers to using the current bind / local-fn combo.

I'm sympathetic to both proposals and interested in working details of both or either out (ideally combining them tastefully, or doing them separately if not). Though I'd caution that there *are* serious hazards involved; I'm an old lisper too, in terms of personal history, but this is a language aiming for a slightly different sweet spot.

> But I don't think lambda is just about RAII.


Indeed not. Didn't mean to imply that.

> In particular, I think a very common use case for lambda is for event-driven
> programming, a style I'd expect to come up often in Rust programs. Requiring
> programmers to name their functions and place them out-of-band breaks up the
> flow of the programming and just makes it a little harder to read. Or from the
> other direction: being able to pass an event handler directly inline as an
> anonymous lambda makes it immediately clear to the reader that the relevance of
> the function doesn't extend beyond this one place.


Fair points.

> Graydon, Re: control flow complexity, I'm not sure whether lambda adds too much
> control-flow complexity, given that we already have higher-order constructs
> with |bind| and objects.

Nope, I don't think it adds any control-flow complexity. I was mentioning control-flow complexity wrt. Sebastian's example, that used catchable-exceptions and an implied nonlocal control transfer (which we don't have).

> Primarily, I see it as a lightweight notational convenience for a common pattern, which you can already express using helpers.

Yup. Issue 6 is open, right? I'm not going to reject a patch that implements a fn-expression form. I just want to be careful with what it *means*, particularly as far as any implied capture :)

> I wouldn't think we'd want to allow lambda-functions to close over
> stack-allocated locals.

Probably not if the closure escapes to heap. Not unless making one implies a copy at the point of capture. That's a possibility. Not an appealing one to me, or not by absolute-most-common default, but a possibility.
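
To make "a copy at the point of capture" concrete, here is a minimal sketch in modern Rust (again not the 2010 language; the function name snapshot_capture is hypothetical). A move closure over a plain integer takes its own copy, so later changes to the local are invisible to the closure.

```rust
// Returns (value seen by the closure, value of the local afterwards).
fn snapshot_capture() -> (i32, i32) {
    let mut local = 1;
    // `move` copies the i32 into the closure at this point.
    let captured = move || local;
    // Mutating the original afterwards does not affect the copy.
    local += 41;
    (captured(), local)
}
```

The closure keeps seeing the value from the moment of capture while the local moves on, which is part of why this is a surprising default: the reader has to know that the two have silently diverged.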

> Long story short, I'm suggesting we could restrict lambdas to only be allowed
> to refer to @-typed variables.


Sneaky but possible. Let's consider this further...

> One approach would be to restrict upvars to be read-only.


Yeah. It's a bit random-feeling though. Users will certainly complain.

> You can just view this as syntactic sugar for the one above. (BTW, I'm not
> trying to propose exact concrete syntaxes, just looking for an existence proof
> that it's possible to propagate typestate constraints into lambdas.)

Yeah. I think that proof has been achieved. Let's consider concrete proposals.

> That's just my current thinking, anyway. Graydon, does any of this sound
> plausible?


Yeah, it does.

Let me toss a few design considerations into the stew here. We're obviously getting into brainstorming mode for a few emails. Just try to keep it focused on the existing semantic categories and runtime vocabulary. Considering these points:


  - The hard part: if we're going to capture by aliasing-fp, we have
    to come up with some way of prohibiting the formation of a copy
    of the down-fn. It has to have a non-copyable type. Otherwise
    you can copy to the heap, and then everything explodes. This
    is currently design problem #1 for down-lambdas. If we can't solve
    this, the remainder of down-lambdas is doomed. In foreach loops, at
    present, there's no such problem because the inner fn isn't named.

  - I don't see a strong reason to add "lambda" to the language, as a
    syntactic keyword, when we've got the shorter "fn" already. So for
    sketching sake, let's restrict to that. If we are really aiming to
    shave syntax we can even play the smalltalk game and move the params
    inside the block: "{(x, y) foo(x+y); }"

  - If you're going to start providing methods for environment capture,
    it is worth considering whether to keep "bind" at all. It gives you
    something like currying, but the argument about redundancy cuts both
    ways: "bind f(10, _)" can be written "fn(int x) { f(10, x); }". It
    depends a lot on the relative frequency of currying vs. capture. Now
    that we're down to just two slot modes, it might make sense to call
    the "bind" game off. It might not be paying for itself.

  - I'm still somewhat concerned with allowing the programmer to specify
    clearly which variables are being captured, when it makes sense.
    I'll admit that there are enough cases in which it is an annoyance,
    but if you have a large body of logic, it's a comprehension hazard
    to have a closure copying something to the heap and/or retaining a
    reference without a reader noticing. It's good to be able to be
    explicit when you want to be. Rust's design has tried to keep in the
    foreground the fact that when working on a large codebase, the
    programmer wants to fasten seatbelts because they don't trust
    *themselves* to get things right, and want double-checks to occur.

  - You still need something like an argument-list to indicate which
    arguments you want the resulting function to accept. What its type
    is. So you need to indicate stuff about the captures *and* the
    residual arguments. This is why it gets chatty.

  - I don't want to get into inferring function types or type-param
    capture. Too much effort, those subsystems are already overloaded.
    So I want to keep a result-type in there too.

  - C++ capture clauses have two forms, one in which they explicitly
    list the variables captured and one in which they capture
    "everything" mentioned in the body. I wonder if we can follow
    their lead here.
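
The redundancy between "bind" and a fn-expression noted in the list above can be illustrated with a small sketch in modern Rust. This is not the 2010 syntax; f and bind_first are hypothetical names, and the body of f is arbitrary.

```rust
// A hypothetical two-argument function to partially apply.
fn f(a: i32, b: i32) -> i32 {
    a * 100 + b
}

// Rough equivalent of "bind f(10, _)" when called as bind_first(10):
// the closure captures `a` and leaves one residual argument, and the
// return type spells out the resulting function type explicitly.
fn bind_first(a: i32) -> impl Fn(i32) -> i32 {
    move |x| f(a, x)
}
```

With these definitions, bind_first(10)(7) gives the same result as f(10, 7), namely 1007. The closure form is chattier than bind, but it states both the capture and the residual argument list, which is the trade-off the points above describe.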

Suppose we do this:

  - use the "fn" keyword.

  - permit fn-expressions, only for monomorphic functions.

  - Permit an optional capture clause between "fn" and its params in
    expression context.

  - The capture clause can be @ or & followed by an optional list of
    captured vars (just their names), or omitted to indicate "inspect
    the function to figure out the captures". Assuming we figure out
    a way to prohibit copying down-lambdas. If not, just @.

  - Remove "bind". Having shipped in Sather is not exactly a huge
    sales pitch, and the haskell/ML people are the only ones who will
    even think to use currying. Everyone else won't notice its absence.

This is essentially the C++-0x-lambdas approach, just adapted to our current syntax and semantics a touch. So we'd get:


  - "fn &(a,b) (int x) -> int { ... }" -- alias a, b; fn takes one int.
  - "fn @(a,b) (int x) -> int { ... }" -- box a,b; fn takes one int.
  - "fn & (int x) -> int { ... }" -- alias everything mentioned in fn.
  - "fn @ (int x) -> int { ... }" -- box everything mentioned in fn.
  - "fn (int x) -> int { ... }" -- no capture at all.
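
As a rough point of comparison only, three of these forms have loose analogues in modern Rust, sketched below. Modern Rust has no capture clause, so the explicit list is imitated by cloning a handle before a move closure; Rc<Cell<i32>> stands in for an @ box here, which is an assumption of this sketch rather than an exact match. All function names are hypothetical.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Like "fn (int x) -> int { ... }": no capture at all.
fn no_capture(x: i32) -> i32 {
    x + 1
}

// Like "fn & (int x) -> int { ... }": the closure aliases whatever it
// mentions (a and b) and is used only inside this frame.
fn alias_capture(a: i32, b: i32, x: i32) -> i32 {
    let g = |x: i32| -> i32 { a + b + x };
    g(x)
}

// Like "fn @(counter) (int x) -> int { ... }": the captured variable
// is named explicitly and shared through a reference-counted box, so
// the closure can escape to the heap.
fn box_capture(counter: &Rc<Cell<i32>>) -> Box<dyn Fn(i32) -> i32> {
    let counter = Rc::clone(counter); // the explicit "capture list"
    Box::new(move |x| {
        counter.set(counter.get() + x);
        counter.get()
    })
}
```

Calling the closure returned by box_capture updates the same counter the caller still holds, which is the sharing behaviour a boxed capture is meant to give.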

To C++'s credit here, their scheme gives programmers the option to choose to be lazy and capture stuff implicitly, if they feel safe doing so, or the ability to be more-precise and capture only what they mention.

Anyway, as I say above, I think there's at least one hard semantic problem here (how to prohibit escape of a down-lambda) along with a few syntax considerations. Thoughts welcome on how to overcome the former. Proceeding with the heap-capture variant is possible, if everyone's able to accept losing currying and by-name "bind" expressions.


-Graydon
