Re: [racket-dev] `letrec' and continuations

2011-07-08 Thread Matthew Flatt
I've implemented (not yet pushed) a revision of internal-definition
expansion. The `quicksort!' functions at the start of this thread (see
http://lists.racket-lang.org/dev/archive/2011-May/006547.html) run the
same with that change.

Definitions are partitioned into minimal sets that satisfy the
following rule: if a definition's binding is referenced (after all
macro expansion) by an earlier definition, the two definitions and all
in between are in the same set. If a set contains a single definition
that does not refer to itself, then it's a `let' layer, otherwise the
group is a `letrec' layer.


Actually, I didn't change internal-definition expansion. I changed
`letrec-syntaxes+values' while leaving `letrec' along. That is,

 (letrec-syntaxes+values () clauses body ...)

is not necessarily the same as

 (letrec clauses body ...)

Making them different is a little awkward, but it solves a problem with
`local-expand':

At Fri, 20 May 2011 09:28:40 -0700, Matthew Flatt wrote:
 At Fri, 20 May 2011 11:03:04 -0400, Sam Tobin-Hochstadt wrote:
  On Fri, May 20, 2011 at 10:53 AM, Matthew Flatt mfl...@cs.utah.edu wrote:
   I like the idea of adjusting the semantics of internal definitions and
   leaving `letrec' alone.
  
  While this seems like a nice change, how does it interact with
  internal syntax definitions?  If there are internal syntax
  definitions, do we fall back to `letrec-syntaxes+values'?
 
 Good question. Yes, I think an internal syntax definition would have to
 be treated like a definition that refers to the last binding in the set
 of definitions.

No, it's no good to give up when variable definitions are mixed with
syntax definitions. For example (and possibly what Sam was thinking
of), internal-definition positions that have Typed Racket declarations
correspond to mixtures of variable and syntax definitions. We certainly
want an improvement of internal definitions to work in Typed Racket.

For fully expanded programs, mixing variable and syntax definitions
causes no trouble. The problem is with `local-expand', which is
supposed to preserve `letrec-syntaxes+values' in an expansion, and
Typed Racket relies on that property of the expander. As Sam noted, we
could introduce `letrec-syntaxes+let*-values', but that breaks existing
tools. Changing the semantics of `letrec-syntaxes+values' is more
backward-compatible in practice.


For those rare cases where you want the old `letrec-syntaxes+values'
(e.g., the expansion of `#%stratified-body'), it's easy to force the
old semantics by adding an initial binding clause

   [() (begin (if #f id (values)))]

where `id' is the last variable in the sequence of bindings. Then, all
bindings are forced into a single `letrec'-like set.

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


[racket-dev] `letrec' and continuations

2011-05-20 Thread Matthew Flatt
If you run this program in Racket,

 (let ([restart void])
   (letrec ([forward-reference (lambda () maybe-ready)] ; affects compiler
[dummy1 (let/cc k (set! restart k))]
[dummy2 (displayln maybe-ready)]
[maybe-ready 'ready])
 (let ([rs restart])
   (set! restart void)
   (rs #f

then the output is

 #undefined
 ready

The second printout is ready because locations for all of the
`letrec'-bound variables are allocated before the right-hand sides are
evaluated --- which means before the continuation is captured.


If you comment out the `forward-reference' binding, however, then the
printout is

 #undefined
 #undefined

That's because the optimizer, seeing no forward references, incorrectly
converts the `letrec' to a `let*'.

There's some logic in the optimizer to avoid a transformation from
`letrec' to `let*' when a continuation might be captured, but it's
obviously broken. I could fix it, but before I fix the optimizer for
the current semantics, I wonder whether there's some way to specify the
semantics of `letrec' so that a conversion to `let*' would be legal.


A programmer almost never needs the semantics of `letrec' that
`call/cc' exposes, and a programmer often wants `letrec' to be as
efficient as `let' or `let*'. This is particularly true now that
Racket's syntax lets you use internal definitions in so many places.
For example, a quicksort core

 (define (quicksort! vec n m)
   (when ( (- m n) 1)
 (let* ([around-val (vector-ref vec n)]
[pos (pivot around-val vec n m)])
   (vector-set! vec pos pivot)
   (quicksort! vec n pos)
   (quicksort! vec (add1 pos) m

is 15% faster (on a vector of 100 random numbers) than 

 (define (quicksort! vec n m)
   (when ( (- m n) 1)
 (define around-val (vector-ref vec n))
 (define pos (pivot around-val vec n m))
 (vector-set! vec pos around-val)
 (quicksort! vec n pos)
 (quicksort! vec (add1 pos) m)))

With internal definitions, the compiler can't see enough of `pivot' to
be sure that it won't capture a continuation, and so it heap-allocates
a location for `pos'.

R6RS addresses the problem by saying that an expression on the
right-hand side of `letrec*' cannot return multiple times. In practice,
I expect that means a compiler would convert `letrec*' to `let*'
without checking the multiple-return constraint --- exploiting the
usual unspecified trap door. Obviously, I'm not in favor of
unspecified behavior. I also don't see how to make a single-return
check efficient; it seems to require heap allocation.


Any ideas?

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Sam Tobin-Hochstadt
On Fri, May 20, 2011 at 10:53 AM, Matthew Flatt mfl...@cs.utah.edu wrote:
 I like the idea of adjusting the semantics of internal definitions and
 leaving `letrec' alone.

While this seems like a nice change, how does it interact with
internal syntax definitions?  If there are internal syntax
definitions, do we fall back to `letrec-syntaxes+values'?  Do we want
`letrec-syntaxes+let*-values' (I hope not)?


 At Fri, 20 May 2011 09:39:06 -0500, Robby Findler wrote:
 How about making letrec (or letrec*?) be syntactically illegal if
 there are no forward references and make the internal definition
 expander avoid it in that case? (Or maybe just the second half of
 that.)

 Robby

 On Fri, May 20, 2011 at 9:28 AM, Matthew Flatt mfl...@cs.utah.edu wrote:
  If you run this program in Racket,
 
   (let ([restart void])
    (letrec ([forward-reference (lambda () maybe-ready)] ; affects compiler
             [dummy1 (let/cc k (set! restart k))]
             [dummy2 (displayln maybe-ready)]
             [maybe-ready 'ready])
      (let ([rs restart])
        (set! restart void)
        (rs #f
 
  then the output is
 
   #undefined
   ready
 
  The second printout is ready because locations for all of the
  `letrec'-bound variables are allocated before the right-hand sides are
  evaluated --- which means before the continuation is captured.
 
 
  If you comment out the `forward-reference' binding, however, then the
  printout is
 
   #undefined
   #undefined
 
  That's because the optimizer, seeing no forward references, incorrectly
  converts the `letrec' to a `let*'.
 
  There's some logic in the optimizer to avoid a transformation from
  `letrec' to `let*' when a continuation might be captured, but it's
  obviously broken. I could fix it, but before I fix the optimizer for
  the current semantics, I wonder whether there's some way to specify the
  semantics of `letrec' so that a conversion to `let*' would be legal.
 
 
  A programmer almost never needs the semantics of `letrec' that
  `call/cc' exposes, and a programmer often wants `letrec' to be as
  efficient as `let' or `let*'. This is particularly true now that
  Racket's syntax lets you use internal definitions in so many places.
  For example, a quicksort core
 
   (define (quicksort! vec n m)
    (when ( (- m n) 1)
      (let* ([around-val (vector-ref vec n)]
             [pos (pivot around-val vec n m)])
        (vector-set! vec pos pivot)
        (quicksort! vec n pos)
        (quicksort! vec (add1 pos) m
 
  is 15% faster (on a vector of 100 random numbers) than
 
   (define (quicksort! vec n m)
    (when ( (- m n) 1)
      (define around-val (vector-ref vec n))
      (define pos (pivot around-val vec n m))
      (vector-set! vec pos around-val)
      (quicksort! vec n pos)
      (quicksort! vec (add1 pos) m)))
 
  With internal definitions, the compiler can't see enough of `pivot' to
  be sure that it won't capture a continuation, and so it heap-allocates
  a location for `pos'.
 
  R6RS addresses the problem by saying that an expression on the
  right-hand side of `letrec*' cannot return multiple times. In practice,
  I expect that means a compiler would convert `letrec*' to `let*'
  without checking the multiple-return constraint --- exploiting the
  usual unspecified trap door. Obviously, I'm not in favor of
  unspecified behavior. I also don't see how to make a single-return
  check efficient; it seems to require heap allocation.
 
 
  Any ideas?
 
  _
   For list-related administrative tasks:
   http://lists.racket-lang.org/listinfo/dev
 

 _
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev



-- 
sam th
sa...@ccs.neu.edu

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Matthias Felleisen

On May 20, 2011, at 11:55 AM, Robby Findler wrote:

 Would it make sense to have a new construct, say letrec-super-star,
 that did one of those things and then use that as the core form in
 Racket (that's also a big change, but probably smaller than changing
 letrec itself).


One day we should reduce the language not just add to it. 

A language is not only about the positive things you can say but also about the 
negative ones. 
_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Robby Findler
On Fri, May 20, 2011 at 10:59 AM, Matthias Felleisen
matth...@ccs.neu.edu wrote:

 On May 20, 2011, at 11:55 AM, Robby Findler wrote:

 Would it make sense to have a new construct, say letrec-super-star,
 that did one of those things and then use that as the core form in
 Racket (that's also a big change, but probably smaller than changing
 letrec itself).


 One day we should reduce the language not just add to it.

 A language is not only about the positive things you can say but also about 
 the negative ones.

Sorry-- I meant removing/changing the current letrec in the fully
expanded language.

Robby
_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Matthew Flatt
At Fri, 20 May 2011 11:03:04 -0400, Sam Tobin-Hochstadt wrote:
 On Fri, May 20, 2011 at 10:53 AM, Matthew Flatt mfl...@cs.utah.edu wrote:
  I like the idea of adjusting the semantics of internal definitions and
  leaving `letrec' alone.
 
 While this seems like a nice change, how does it interact with
 internal syntax definitions?  If there are internal syntax
 definitions, do we fall back to `letrec-syntaxes+values'?

Good question. Yes, I think an internal syntax definition would have to
be treated like a definition that refers to the last binding in the set
of definitions.

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Matthew Flatt
At Fri, 20 May 2011 11:36:48 -0400, Matthias Felleisen wrote:
 
 On May 20, 2011, at 10:28 AM, Matthew Flatt wrote:
 
  The second printout is ready because locations for all of the
  `letrec'-bound variables are allocated before the right-hand sides are
  evaluated --- which means before the continuation is captured.
 
  A programmer almost never needs the semantics of `letrec' that
  `call/cc' exposes, and a programmer often wants `letrec' to be as
  efficient as `let' or `let*'.
 
 
 What this code recalled for me is the (in)famous 1980s style 
 puzzle (I believe due to Jinx) of retrieving reference cells 
 (boxes) from letrec plus call/cc (see 1988 LFP paper on LC_v-CS). 

Yes.

 1. Are you sure Robby's idea -- which you modified to compile internal define 
 differently -- works in all cases? 

I don't think we've pinned down exactly the definition, yet, but I'm
pretty sure we can define something that works and that is useful.

 2. Some other ideas: 
 
 -- We could request that RHS are syntactic values. That's Draconian.

Yes, that's out.

 PLUS, we could accommodate this change with a change to internal
 DEFINEs syntax. Don't make it a LETREC. Turn it into a mostly LET
 plus LETREC when you encounter a recursive function.

I think we can define internal definitions that way without a
requirement such as values on the the RHS.

 -- We could use prompts on the right hand side of letrec. 

I've considered that possibility. I like how it would be more like
`define' at the top level. Much like a check for multiple returns,
however, I don't see how to make it cheap enough.

 -- We could install code that re-inits the ref cells to UNDEFINED if a 
 continuation from inside a RHS letrec is invoked.

Same as multiple-return checking --- sounds expensive to me.

Of course, those possibilities be cheap enough through some
implementation that I haven't thought of.

 3. We are Racket and we no longer need to live up to the Scheme standard. 

I mention standards only as an illustration of how the issue has been
addressed before.

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Matthew Flatt
At Fri, 20 May 2011 10:55:36 -0500, Robby Findler wrote:
 Can we really change the semantics of letrec in such a fundamental way?

Sure. :)  But it's easier to not change `letrec', of course.

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Matthias Felleisen


Let me make my proposals (2 and 3) more precise because your response suggests 
they were too short. 

1. We could make internal define the primary vehicle for definitions, i.e., not 
compile thru letrec. As far as I am concerned, your change to the language to 
allow defines in many more places has made letrec superfluous. 

2. The semantics for internal defines would be more Algol like, meaning your 
example would immediately behave like let and thus be fast. 

3. As far as letrec is concerned, we can make it 'expensive' if it is no longer 
the intermediate target instruction from the macro compiler's perspective. 

-- I think my preferred solution would be to wrap letrec so that continuations 
grabbed during the setup set up a continuation mark that labels them as 
'dangerous'. When you reinvoke them, the existence of the mark tells you that 
the reference cells should be reinitialized (probably only the ones on the 
control flow from the continuation point). 

-- An alternative could be to stick a lexical identifier into letrec 
declarations that gets removed from the scope once the letrec is established. 
It would reappear when you invoke a continuation from the RHS and thus you'd 
know to reini the ref cells. BUT, this requires a mechanism that is not 
expressible at the surface of Racket. And it's odd. 

-- Matthias





On May 20, 2011, at 12:37 PM, Matthew Flatt wrote:

 At Fri, 20 May 2011 11:36:48 -0400, Matthias Felleisen wrote:
 
 On May 20, 2011, at 10:28 AM, Matthew Flatt wrote:
 
 The second printout is ready because locations for all of the
 `letrec'-bound variables are allocated before the right-hand sides are
 evaluated --- which means before the continuation is captured.
 
 A programmer almost never needs the semantics of `letrec' that
 `call/cc' exposes, and a programmer often wants `letrec' to be as
 efficient as `let' or `let*'.
 
 
 What this code recalled for me is the (in)famous 1980s style 
 puzzle (I believe due to Jinx) of retrieving reference cells 
 (boxes) from letrec plus call/cc (see 1988 LFP paper on LC_v-CS). 
 
 Yes.
 
 1. Are you sure Robby's idea -- which you modified to compile internal 
 define 
 differently -- works in all cases? 
 
 I don't think we've pinned down exactly the definition, yet, but I'm
 pretty sure we can define something that works and that is useful.
 
 2. Some other ideas: 
 
 -- We could request that RHS are syntactic values. That's Draconian.
 
 Yes, that's out.
 
 PLUS, we could accommodate this change with a change to internal
 DEFINEs syntax. Don't make it a LETREC. Turn it into a mostly LET
 plus LETREC when you encounter a recursive function.
 
 I think we can define internal definitions that way without a
 requirement such as values on the the RHS.
 
 -- We could use prompts on the right hand side of letrec. 
 
 I've considered that possibility. I like how it would be more like
 `define' at the top level. Much like a check for multiple returns,
 however, I don't see how to make it cheap enough.
 
 -- We could install code that re-inits the ref cells to UNDEFINED if a 
 continuation from inside a RHS letrec is invoked.
 
 Same as multiple-return checking --- sounds expensive to me.
 
 Of course, those possibilities be cheap enough through some
 implementation that I haven't thought of.
 
 3. We are Racket and we no longer need to live up to the Scheme standard. 
 
 I mention standards only as an illustration of how the issue has been
 addressed before.
 


_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Matthias Felleisen

On May 20, 2011, at 4:42 PM, Sam Tobin-Hochstadt wrote:

 On Fri, May 20, 2011 at 4:39 PM, Matthias Felleisen
 matth...@ccs.neu.edu wrote:
 
 -- I think my preferred solution would be to wrap letrec so that 
 continuations grabbed during the setup set up a continuation mark that 
 labels them as 'dangerous'. When you reinvoke them, the existence of the 
 mark tells you that the reference cells should be reinitialized (probably 
 only the ones on the control flow from the continuation point).
 
 -- An alternative could be to stick a lexical identifier into letrec 
 declarations that gets removed from the scope once the letrec is 
 established. It would reappear when you invoke a continuation from the RHS 
 and thus you'd know to reini the ref cells. BUT, this requires a mechanism 
 that is not expressible at the surface of Racket. And it's odd.
 
 I think the key missing piece here is that Matthew wants to avoid
 having the reference cells *at all*.  If you use `let*', you don't get
 any reference cells.


He wants an optimizer that compiles away ref cells for recursive declaration 
constructs when possible. 

I think that this is much more easily doable with an internal define compiler 
that goes around the letrec intermediate steps. 

So I am proposing to leave letrec around for the cases when you need an 
extremely general recursion construct AND to modify it so that it is safe 
against call/cc. I would go as far as giving up on ref cell elimination for 
letrec. 

Yes, I forgot to add that this means we need to eliminate letrec from the code 
base in many cases. 
_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Robby Findler
On Fri, May 20, 2011 at 3:39 PM, Matthias Felleisen
matth...@ccs.neu.edu wrote:


 Let me make my proposals (2 and 3) more precise because your response 
 suggests they were too short.

 1. We could make internal define the primary vehicle for definitions, i.e., 
 not compile thru letrec. As far as I am concerned, your change to the 
 language to allow defines in many more places has made letrec superfluous.

 2. The semantics for internal defines would be more Algol like, meaning your 
 example would immediately behave like let and thus be fast.

 3. As far as letrec is concerned, we can make it 'expensive' if it is no 
 longer the intermediate target instruction from the macro compiler's 
 perspective.

 -- I think my preferred solution would be to wrap letrec so that 
 continuations grabbed during the setup set up a continuation mark that labels 
 them as 'dangerous'. When you reinvoke them, the existence of the mark tells 
 you that the reference cells should be reinitialized (probably only the ones 
 on the control flow from the continuation point).

Why do you prefer this to putting a prompt around the rhs of each
letrec binding?

Robby

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Sam Tobin-Hochstadt
On Fri, May 20, 2011 at 5:25 PM, John Clements
cleme...@brinckerhoff.org wrote:

 On May 20, 2011, at 1:39 PM, Matthias Felleisen wrote:



 Let me make my proposals (2 and 3) more precise because your response 
 suggests they were too short.

 1. We could make internal define the primary vehicle for definitions, i.e., 
 not compile thru letrec. As far as I am concerned, your change to the 
 language to allow defines in many more places has made letrec superfluous.

 Perhaps this goes without saying, but I'm hoping that if internal defines 
 don't expand into letrec any more, that they expand into some similar form 
 that has syntactically obvious scoping; I like the fact that the scope of 
 letrec-declared variables is delimited by the syntactic letrec term.

Yes, this is very important for Typed Racket and other tools that
process expanded syntax.
-- 
sam th
sa...@ccs.neu.edu

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Matthew Flatt
At Fri, 20 May 2011 16:39:23 -0400, Matthias Felleisen wrote:
 2. The semantics for internal defines would be more Algol like, meaning your 
 example would immediately behave like let and thus be fast. 

Ok, I see how that's a better way of saying what I agree with (i.e.,
what I think Robby suggested).

 3. As far as letrec is concerned, we can make it 'expensive'

I see no reason to change `letrec'. Fixing internal definitions is the
goal; I didn't see (until Robby's suggestion) that fixing internal
definitions doesn't necessarily require a change to `letrec'.

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Matthew Flatt
At Fri, 20 May 2011 14:25:54 -0700, John Clements wrote:
 Perhaps this goes without saying, but I'm hoping that if internal
 defines don't expand into letrec any more, that they expand into some
 similar form that has syntactically obvious scoping; I like the fact
 that the scope of letrec-declared variables is delimited by the
 syntactic letrec term.

Yes. The goal as I see it is to preserve the syntax and scoping of
internal definitions (since lots of code relies on that), but to adjust
the semantics of locations for internal-definition identifiers (which
no code likely relies on).

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Robby Findler
On Fri, May 20, 2011 at 6:17 PM, Matthew Flatt mfl...@cs.utah.edu wrote:
 At Fri, 20 May 2011 16:39:23 -0400, Matthias Felleisen wrote:
 2. The semantics for internal defines would be more Algol like, meaning your
 example would immediately behave like let and thus be fast.

 Ok, I see how that's a better way of saying what I agree with (i.e.,
 what I think Robby suggested).

(And indeed, what I meant to suggest :).

Robby
_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] `letrec' and continuations

2011-05-20 Thread Matthias Felleisen

On May 20, 2011, at 7:17 PM, Matthew Flatt wrote:

 I see no reason to change `letrec'.

I think letrec's behavior with call/cc on the right-hand side exposes ref cells 
and that will bite us again and again. That's why I think changing it would 
make sense. Then again, the bites are rare, subtle, and probably barely 
noticable in most cases. 
_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev