Re: [racket-users] Using serial-lambda to send lambdas to places

2018-04-16 Thread Zelphir Kaltstahl
Thanks for the example code.

I will look at it soon, when I can work on that code again, and see how far
I get. It seems I misunderstood and thought you were suggesting to use
the non-ergonomic way of identifying procedures for serial-lambda.
That is probably because I do not understand the source code of the
serial-lambda macro, which I only looked at for a moment.

So maybe serial-lambda will save me in the end : ) I will try soon, and
thank you! (I think I should also reorganize the code in the repo,
because all the comments make it hard to grasp much of the code at a
glance. Maybe I need to split things up, write detailed comments or
explanations with references to the code at the top, and then just
let the code be there.)


-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Using serial-lambda to send lambdas to places

2018-04-15 Thread Philip McGrath
If an example would be helpful, here's a toy implementation of
`parallel-set-map` that uses `serial-lambda` to send the user-supplied
function across the place-channels:

#lang racket

(require web-server/lang/serial-lambda
         racket/serialize)

(define (place-printf pch fmt . args)
  (place-channel-put pch (apply format fmt args)))

(define (spawn-worker id pch print-pch srl-proc)
  (place/context _
    (define proc
      (deserialize srl-proc))
    (let loop ()
      (let* ([v (sync pch)]
             [rslt (proc v)])
        (place-printf print-pch "place ~a:\t~a -> ~a\n" id v rslt)
        (place-channel-put pch rslt)
        (loop)))))

(define/contract (parallel-set-map proc st send-print-pch)
  (-> (and/c serializable? (-> any/c any/c))
      (set/c any/c)
      place-channel?
      any/c)
  (define-values {manager-pch worker-pch}
    (place-channel))
  (define workers
    (for/list ([id (in-range 2)])
      (spawn-worker id worker-pch send-print-pch (serialize proc))))
  (define count
    (for/sum ([v (in-set st)])
      (place-channel-put manager-pch v)
      1))
  (define results
    (let loop ([so-far (set)]
               [i 1])
      (define st
        (set-add so-far (sync manager-pch)))
      (if (= i count)
          st
          (loop st (add1 i)))))
  (for-each place-kill workers)
  results)

(define (try-it)
  (define example-set
    (set 1 2 3))
  (define-values {get-print-pch send-print-pch}
    (place-channel))
  (thread (λ ()
            (let loop ()
              (write-string (sync get-print-pch) (current-output-port))
              (loop))))
  (place-printf send-print-pch
                "~v\n"
                (parallel-set-map (serial-lambda (x)
                                    (+ x x))
                                  example-set
                                  send-print-pch))
  (place-printf send-print-pch
                "~v\n"
                (parallel-set-map (serial-lambda (x)
                                    (* x x))
                                  example-set
                                  send-print-pch)))


-Philip



Re: [racket-users] Using serial-lambda to send lambdas to places

2018-04-15 Thread Philip McGrath
On Sun, Apr 15, 2018 at 2:51 PM, Zelphir Kaltstahl <
zelphirkaltst...@gmail.com> wrote:

> Having to write all things in terms of where things come from like in:
>
> > '([racket/base +] . [1 2])
>
> is not ergonomic at all.
>

Absolutely! To be clear, I was not suggesting that you use that format in
practice: I was trying to illustrate part of the low-level mechanism by
which `serial-lambda` (and `racket/serialize` in general) work.

Using `serial-lambda` does all of the difficult accounting for you to make
sure the right function from the right module is there when you deserialize
it and to arrange for the serializable procedure to take its lexical
environment along with it. You do have to be sure that you're dealing with
pure functions and serializable data structures, but you can use it as a
drop-in replacement for `lambda` and get surprisingly far before you have
to think about any of the details.


Re: [racket-users] Using serial-lambda to send lambdas to places

2018-04-15 Thread Zelphir Kaltstahl
I am aware that each place has its own state, even for things I put at the
top level of the module, since each place is its own Racket instance. I think
sending things as immutable data is what I want to do. "Just send
everything as immutable things to that place, so that it has everything
it needs to complete some computation." (including procedure
definitions) is the idea.

Having to write all things in terms of where things come from like in:

> '([racket/base +] . [1 2])

is not ergonomic at all. I think no one unsuspecting would think to do
that when using some library. If such is really necessary to send things
as immutable data, then users would expect it to translate procedures to
"where they come from + name" automatically. Maybe a macro could find
these things out for the user. However, things might be ambiguous and
the macro would have to make educated guesses or have some policy about
how it guesses and things could go wrong, making some stuff impossible.

(I am not experienced at all at writing macros, so maybe it's nonsense.
Every time I go through "Fear of Macros" I sort of get part of it, until
I don't get the rest, and then I don't use it (because I rarely get the
idea: "Oh, here I should use a macro!") and then forget most of it again.)

I just feel that things like the '([racket/base +] . [1 2]) are not good
enough to make a general work-distributing library. It feels like people
would have to consider too much when doing anything with that.

> Finally, on a broader point, I don't think you can avoid having to
> think about the fact that your code is going to run in parallel: for
> example, you will always have to make sure that you don't depend on
> shared state. With the implementation of places in particular, you also
> need to consider communication overhead and startup time. If I were
> designing a library for parallelism, I would start with some specific
> use-case in mind and focus on coming up with reusable solutions for
> specific sub-parts of the problem.

I get the point, I mean "Why do it if there is no use-case?". But I sort
of want to do it "for the future", "in case someone needs easy
parallelism in Racket", as a "drop in library that solves your multi
processing problem". I simply don't have a personal use-case for it,
except for its own sake and except for minor use-cases, where it would
only be a little better to have things be a bit faster, but it does not
really matter, because stuff is fast enough. I think it would be great
though, if people could say:

"It takes too much time? Never mind, I'll simply finally use my many cores!"

when using Racket. I am sorry, I cannot provide a use-case for this right
now, as I am not working on a high-performance project in Racket. Racket
has so many great features, I would love to see such a thing. That's why
I am going at the problem from such a general perspective. I don't
personally need this tomorrow or next month. Of course one can always
bake one's own solution specific to one use-case, but then one has to
"reinvent the wheel" every single time, by writing how one wants to use
places again for every project, when all one wants is to spread the work
across multiple cores to make it faster.

It might take someone way more experienced in Racket than me to get it
done, but I thought with a lot of asking and trying, maybe I could get
something at least mostly working for arbitrary use cases going ^^'
Right now I have no idea how to get closer to solving this problem and
feel like: "It's not possible in Racket."

Thank you for your response again though, like before in the other
topics, it's been helpful.



Re: [racket-users] Using serial-lambda to send lambdas to places

2018-04-15 Thread Matthias Felleisen

You may wish to look at

 Haller, Miller, and Müller
 https://www.cambridge.org/core/journals/journal-of-functional-programming/article/programming-model-and-foundation-for-lineagebased-distributed-computation/B410CE79B21E33462843B408B716E1E5


an article that describes the theory and sketches the practice of Apache Spark, 
whose insight is (1) to bring all data into memory and (2) to bring all 
functionality to the data. It required a change to the Scala type system to 
make sure functions could be sent over to the place where things needed to be 
processed. 

I conjecture that this could be done in an untyped world, as you are trying,
but an extension to Typed Racket might be cheaper.

And alas, communication cost is critical and often forgotten. So you really
want to keep “big” computation in mind when you go through places.

— Matthias







Re: [racket-users] Using serial-lambda to send lambdas to places

2018-04-15 Thread Philip McGrath
I think it would help to take a step back and think about what you're doing
when you communicate with a place. As you know, places are effectively
separate instances of the Racket VM: other than the explicit, low-level
mechanisms like `make-shared-bytes`, they share no state at all, not even
the same module instances. Let's imagine we have this module and have
required it in two different places:
(module example racket
  (provide remember!
           recall)
  (define store
    (make-hash))
  (define (remember! k v)
    (hash-set! store k v))
  (define (recall k)
    (hash-ref store k)))
Each place would have its own distinct instance of that module, each with
its own hash table: mutating the hash table in one place would not change
the other place's hash table.
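
As a rough illustration of that isolation (a sketch that is not part of the original message; `recall-in-new-place` is a hypothetical helper, and `recall` is given a `#f` default here so a missing key does not raise), one could put the state and the place creation in the same module and compare what each place sees:

```racket
#lang racket/base
(require racket/place)

;; Module-level mutable state, analogous to `store` in the module above.
(define store (make-hash))
(define (remember! k v) (hash-set! store k v))
(define (recall k) (hash-ref store k #f)) ; #f default is an assumption of this sketch

;; Hypothetical helper: starts a new place, asks it to look up `k`
;; in ITS instance of this module, and returns the answer.
(define (recall-in-new-place k)
  (define pl
    (place ch
      (place-channel-put ch (recall (place-channel-get ch)))))
  (place-channel-put pl k)
  (place-channel-get pl))

;; The demo lives in a `main` submodule because the module body is
;; instantiated again inside every new place; module-level code that
;; itself created places would keep spawning them.
(module+ main
  (remember! 'answer 42)
  (printf "this place:  ~s\n" (recall 'answer))
  (printf "other place: ~s\n" (recall-in-new-place 'answer)))
```

Run from a file, this should report 42 for the original place and `#f` for the new one, since the new place starts with its own empty hash table.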

This is the reason why procedures can't be sent across places. Place A's
version of `remember!` closes over Place A's version of `store`. It doesn't
know anything about Place B's version of `store`, so it certainly can't
mutate that. Allowing Place A's `remember!` to be called from Place B and
mutate Place A's version of `store` would violate the safety guarantees
that places provide by requiring explicit message-passing rather than
shared state.

That means, if you want one place to tell another to call a function, you
need to send it some kind of immutable message telling it what to do. It's
rather like calling an API over the network. Let's say you want to tell
some place to execute the following thunk:
(λ ()
  (+ 1 2))

How might you represent that function as data?

Well, if a function is available as a module-level export, you can access
it with `dynamic-require`, so a natural way to represent "call this
function with these arguments" would be with a list of arguments for
`dynamic-require` to get the function you have in mind, plus a list of the
arguments to give to the function. The example above might be represented
like this:
'([racket/base +] . [1 2])

The receiver place could then interpret such a message like this:
(λ (message)
  (apply (apply dynamic-require (car message))
         (cdr message)))
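
Put together as a small runnable sketch (the name `interpret-message` and the second message are illustrative assumptions, not something from the original message), the encoding and its interpreter can be tried in a single place before any channels are involved:

```racket
#lang racket/base
(require racket/place)

;; The message format from the text: (module-path export-name) . arguments
(define message '([racket/base +] . [1 2]))

;; Hypothetical name for the receiver-side interpreter shown above.
(define (interpret-message message)
  (apply (apply dynamic-require (car message))
         (cdr message)))

;; The message is plain data (pairs, symbols, numbers), so it is the
;; kind of value a place channel accepts.
(place-message-allowed? message)

;; Looks up `+` in racket/base and applies it to 1 and 2.
(interpret-message message)

;; Another message in the same format, naming an export of racket/list.
(interpret-message '([racket/list range] . [3]))
```

The design point is that only names and data cross the boundary; the receiver resolves the names against its own module instances.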

That's essentially how `serial-lambda` works under the hood. Each syntactic
use of the `serial-lambda` macro is turned into a module-level structure
type definition that implements `prop:procedure`. When a use is evaluated
and a closure is allocated, it creates an instance of that structure type,
packaging up its free lexical variables into the structure's fields (that's
the hard part). The `prop:serializable` protocol for `racket/serialize`
essentially records the same information you would need to use
`dynamic-require`.
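
To give a feel for that description, here is a hand-written stand-in for what a use like `(serial-lambda (y) (+ x y))` could roughly correspond to. This is only a sketch under assumptions: it uses `serializable-struct` from `racket/serialize` rather than the actual expansion of `serial-lambda`, and the names `adder-closure` and `make-adder` are made up for illustration.

```racket
#lang racket/base
(require racket/serialize)

;; A module-level structure type that is both serializable and applicable.
;; Its single field plays the role of the closed-over variable `x`.
(serializable-struct adder-closure (x)
  #:property prop:procedure
  (λ (self y) (+ (adder-closure-x self) y)))

;; "Allocating the closure" is just constructing an instance.
(define (make-adder x)
  (adder-closure x))

(define add5 (make-adder 5))

;; Serializing records where the structure type lives plus the field
;; values; deserializing rebuilds an equivalent instance.
(define round-tripped (deserialize (serialize add5)))

(add5 1)
(round-tripped 1)
```

What `serial-lambda` adds on top of this manual version is the part described above as hard: working out the free lexical variables and generating the structure type automatically.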

So, to answer some of your specific questions:

On Sun, Apr 15, 2018 at 10:51 AM, Zelphir Kaltstahl <
zelphirkaltst...@gmail.com> wrote:

> - What if in that serial-lambda the user needs to use some custom
> procedure? Does that suddenly also have to be serializable? What about
> its "dependencies"? --> everything in the user program ends up being a
> serial-lambda. That would be really bad.
>

For the procedure value to be serializable, all of the values it lexically
closes over have to be serializable. If you remember that those values have
to be packaged up into fields of a struct, this makes sense: a list is also
only serializable if its contents are serializable.

It takes some experience to readily recognize just what it is that an
anonymous function will close over. One helpful rule is that module-level
variables are never part of the closure, so they aren't required to be
serializable: thus, it's ok that things like + aren't serializable. On the
other hand, in this example:
(define (make-thunk x)
  (serial-lambda ()
    (println x)))
the function returned by make-thunk will only be serializable when `x` is
serializable.
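
A small experiment along these lines (a sketch, not from the original message: `can-serialize?` is a hypothetical helper, and the thunk returns `x` instead of printing it) shows the rule in action. It actually attempts `serialize`, because the `serializable?` predicate does not inspect the contents of compound values:

```racket
#lang racket
(require web-server/lang/serial-lambda
         racket/serialize)

;; Same shape as make-thunk above, returning x instead of printing it.
(define (make-thunk x)
  (serial-lambda ()
    x))

;; Hypothetical helper: #t if `serialize` succeeds on v, #f if it raises.
(define (can-serialize? v)
  (with-handlers ([exn:fail? (λ (e) #f)])
    (serialize v)
    #t))

;; Closes over a number, which is serializable.
(can-serialize? (make-thunk 42))

;; Closes over an ordinary lambda, which is not serializable.
(can-serialize? (make-thunk (λ () 42)))

;; Closes over another serial-lambda closure, which is serializable.
(can-serialize? (make-thunk (make-thunk 42)))
```

One would expect `#t`, `#f`, `#t` here: a closed-over procedure is fine as long as it was itself created with `serial-lambda`.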

You can find some more background about this in the #lang web-server
documentation. I also wrote some notes on serialization pitfalls for the
`web-server/formlets` library, which (now) uses serializable procedures
internally:
http://docs.racket-lang.org/web-server/formlets.html#%28part._.Formlets_and_.Stateless_.Servlets%29


> - Is there a better way than requiring everything to be serial-lambda?
>

With the caveat that, as I said, not "everything" has to use serial-lambda,
I don't think there is a better way. Any other solution for serializing an
arbitrary function would just end up re-implementing what
`web-server/lang/serial-lambda` already does (and it has been tested and used
in production for that purpose). I can think of things that `serial-lambda` doesn't
do—for example, I've experimented with trying to find a mechanism for
serialized procedures to take their contracts with them—but I would want to
use `serial-lambda` to implement such additional features, not replace
`serial-lambda`. It does its job very well.


> - Is the idea to have lambdas be serializable by default language wide
> insane? It would be great to be able to simply start a new place and
> give it some arbitrary lambda to execute.
>

#lang 

[racket-users] Using serial-lambda to send lambdas to places

2018-04-15 Thread Zelphir Kaltstahl
Today I wrote some example code for trying out `serial-lambda` from
`(require web-server/lang/serial-lambda)`. Here is what I currently have:


#lang racket

(require web-server/lang/serial-lambda)
(require racket/serialize)

(define to-send (serial-lambda (x) (* x x)))
(define to-send-2
  (serial-lambda (place-id data)
    (list 'result
          place-id
          (for/list ([i (range 1000)]
                     [elem (in-cycle data)])
            (* i elem)))))
(define to-send-3
  (serial-lambda (place-id data custom-proc)
    (list 'result
          place-id
          (custom-proc
           (for*/list ([i (range 1000)]
                       [elem data])
             (* i elem))))))

(define (custom-proc lst)
  (map (λ (x) (* 2 x))
       lst))

(to-send 4)

(fprintf (current-output-port)
         "Serialized lambda: ~a~n"
         (serialize to-send))
(fprintf (current-output-port)
         "Deserialized Serialized lambda: ~a~n"
         (deserialize (serialize to-send)))
(fprintf (current-output-port)
         "Deserialized Serialized lambda of 4: ~a~n"
         ((deserialize (serialize to-send)) 4))

(fprintf (current-output-port)
         "Deserialized Serialized to-send-2 result: ~a~n"
         ((deserialize (serialize to-send-2)) 3 '(0 1 2 3)))

(fprintf (current-output-port)
         "Length of Deserialized Serialized to-send-2 result: ~a~n"
         (length (caddr ((deserialize (serialize to-send-2)) 3 '(0 1 2 3)))))

(fprintf (current-output-port)
         "Deserialized Serialized to-send-3 result: ~a~n"
         ((deserialize (serialize to-send-3)) 3 '(0 1 2 3) custom-proc))
(fprintf (current-output-port)
         "Length of Deserialized Serialized to-send-3 result: ~a~n"
         (length
          (caddr
           ((deserialize (serialize to-send-3)) 3 '(0 1 2 3) custom-proc))))


Which I can simply run with `racket serial-lambda-example.rkt`. Some
time ago I started a project called "work-distributor"
(https://github.com/ZelphirKaltstahl/work-distributor), which I want to
be usable in such a way that users can simply give some lambda to the
distributor and some data, so that the distributor then creates as many
places as specified and distributes the data and lambda to these places.
Then the places should apply the lambda to their portion of the data and
return the results, which the work distributor would merge and return to
the user / caller.
So far the theory.

As it turns out I cannot send lambdas on place channels
(https://docs.racket-lang.org/reference/places.html?q=place-message-allowed%3F#%28def._%28%28lib._racket%2Fplace..rkt%29._place-message-allowed~3f%29%29),
so the first issue is that I have to use an import,
(require web-server/lang/serial-lambda), to use something from a
completely unrelated package in order to get a lambda that can be
serialized, and then send that on the place channel. That would not be so
bad if it were constrained to that one usage, when the user builds their
lambda, which is going to be given to the work distributor, so that the
work distributor can go on and send it to places.
In a user program however, such serial-lambda seems to be "infectious":

- What if in that serial-lambda the user needs to use some custom
procedure? Does that suddenly also have to be serializable? What about
its "dependencies"? --> everything in the user program ends up being a
serial-lambda. That would be really bad.
- OK, let's say the procedure is visible from the places ... I cannot
seriously require the user of the work distributor to go into the code
of the work distributor and add custom procedures there. That would be
like requiring them to write custom parallelization code.

Maybe I am missing something simple in this whole line of thought.

- Is there a better way than requiring everything to be serial-lambda?
- Is there a better way than requiring the user to edit the work
distributor code to make custom procedures available to the places?
- Is the idea to have lambdas be serializable by default language wide
insane? It would be great to be able to simply start a new place and
give it some arbitrary lambda to execute.
