Re: [racket-users] Re: Need help with parallelizing a procedure
> On Jan 13, 2018, at 11:14 AM, Zelphir Kaltstahl wrote:
>
> Anyway, I hope to understand how the Y-combinator works with the help of "The
> Little Schemer", which I am reading currently. I am not at that chapter yet.
> I am taking my time, first thinking through the stuff and later typing it
> into the machine and thinking about it again, sometimes noticing some detail
> when I type the code and sometimes making a little experiment with that code.
> Already on my way through "The Little Schemer" I hit some useful procedures,
> which helped me with a totally unrelated problem when coding my blog in
> Racket. So it's definitely useful.

Yes, that’s the correct approach to reading, something I have to teach most college freshmen (and I fail many of them or do they fail me?).

> "The Little Schemer" is starting slow for people who have programmed before,
> but seeing, that I am only half-way through and already got some interesting
> knowledge from it, one should not underestimate the acceleration in this book.

:-)

-- Matthias

--
You received this message because you are subscribed to the Google Groups "Racket Users" group. To unsubscribe from this group and stop receiving emails from it, send an email to racket-users+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Re: [racket-users] Re: Need help with parallelizing a procedure
On 8/22/2017 6:08 AM, Zelphir Kaltstahl wrote:

> Will "The Seasoned Schemer" be understandable for people who did not read "The Little Schemer"? And is what is in there directly applicable in Racket (I guess it is, because of Racket's background, but I better ask beforehand!)

Yes, and sort of.

Racket is derived from Scheme - whatever you learn about Scheme will be relevant to programming in Racket.

But all the system things that real programs depend on: threads, networking, file i/o, etc. - all of those are outside the scope of the language. Learning Scheme will not necessarily help you, e.g., write a multi-threaded program or a TCP based service ... for that you need to learn Racket's libraries and how to use the facilities that they provide.

> So far I've read some part of SICP (Chapter 1 & 2), part of Realm of Racket and did some small Racket projects.

Not a bad start. I'd add The Scheme Programming Language by R. Kent Dybvig. It's better as a reference work than a tutorial, but it does pack pretty much all you need to know about Scheme into one place.

George
[racket-users] Re: Need help with parallelizing a procedure
Will "The Seasoned Schemer" be understandable for people who did not read "The Little Schemer"? And is what is in there directly applicable in Racket (I guess it is, because of Racket's background, but I better ask beforehand!)

So far I've read some part of SICP (Chapter 1 & 2), part of Realm of Racket and did some small Racket projects.
[racket-users] Re: Need help with parallelizing a procedure
On Sat, 19 Aug 2017 10:36:37 -0400, Matthias Felleisen wrote:

> May I recommend The Seasoned Schemer?

You may certainly. Ironically, that is one book on Scheme that I have never read.

My point to Zelphir was that continuations per se are not unique to Scheme ... only the ways in which they are exposed to the programmer.

George
[racket-users] Re: Need help with parallelizing a procedure
On Sat, 19 Aug 2017 09:28:49 -0400, George Neuner wrote:

> When you say, (let/ec foo ... ) , all that's happening is the compiler
> defines a pseudo-function named 'foo' that when called will exit from
> the block of code in the scope of the let. (call/cc foo ...) does the
> same, but assumes you will be immediately calling a function instead of
> executing inline code. Invoking foo anywhere in the function (or its
> descendants) will jump back out of the call chain. Scheme makes
> invoking the continuation look like a function call even though it
> really is a jump that won't return.

I need to clarify that a bit, because it makes continuations sound a bit too much like exceptions.

Code that invokes a continuation has to have a reference to the continuation's (pseudo)function ... in contrast, any code can raise a generic exception.

George
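The distinction drawn above can be sketched in a few lines. This is only an illustration, not code from the thread: the function names (`find-first-negative`, `check-one`, and the variants) are made up for the example. It shows that a helper can escape only if it is handed the continuation, whereas `raise` needs no such reference, and it spells the same escape with both `let/ec` and `call/ec`.

```racket
#lang racket

;; Hypothetical helper: it cannot escape unless the caller passes in `k`.
(define (check-one x k)
  (when (negative? x)
    (k x)))                       ; jumps out of the let/ec; never returns here

;; let/ec binds `return` to an escape continuation for the enclosed body.
(define (find-first-negative lst)
  (let/ec return
    (for ([x (in-list lst)])
      (check-one x return))       ; the reference is handed over explicitly
    #f))                          ; reached only if nothing escaped

;; The same thing with call/ec, which takes a procedure that
;; receives the continuation as its argument.
(define (find-first-negative/call-ec lst)
  (call/ec
   (lambda (return)
     (for ([x (in-list lst)])
       (check-one x return))
     #f)))

;; By contrast, any code can raise; the nearest matching handler catches it.
(define (find-first-negative/exn lst)
  (with-handlers ([number? (lambda (n) n)])
    (for ([x (in-list lst)])
      (when (negative? x) (raise x)))
    #f))
```

All three variants return the first negative element, or `#f` when there is none; only the exception-based one works without threading a reference through to the code that decides to leave.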
Re: [racket-users] Re: Need help with parallelizing a procedure
On 8/19/2017 6:16 AM, Zelphir Kaltstahl wrote:

> I looked at the code for a while again and think I now begin to understand it a bit more:
>
> I did not know `(let/ec` so I had to read about it. It says, that it is equivalent to `(call/ec proc` or something, which is equivalent to `(call-with-escape-continuation ...`. Uff … I don't know much about continuations, except from a vague idea of when they are called. The idea gets mingled with structures in other programming languages which are "catching exceptions" and such stuff. I don't know this stuff, so I guess `(call-with-escape-continuation` works like catching an exception, which is not really an exception, but just a signal, that code is returning. Then instead of simply returning, the escape continuation is called. This is done implicitly without specifying a named exception for returning and without defining some conditional structure on the side where it would return to. This could all be wrong ...

It's at least a misconception. Continuations are _not_ exceptions - they are a much lower level construct. Exception handling is built on top of continuations.

There is nothing magic or mysterious about a continuation. In Scheme, continuations are the moral equivalent of GOTO with arguments [yes, you can pass arguments to continuations]. In actual fact they can be used to write code that jumps around arbitrarily ... but the standard uses are more structured.

When you say, (let/ec foo ... ) , all that's happening is the compiler defines a pseudo-function named 'foo' that when called will exit from the block of code in the scope of the let. (call/cc foo ...) does the same, but assumes you will be immediately calling a function instead of executing inline code. Invoking foo anywhere in the function (or its descendants) will jump back out of the call chain. Scheme makes invoking the continuation look like a function call even though it really is a jump that won't return.
You do need to learn about continuations because they are the underlying basis of many programming techniques: exceptions, co-routines, threads, etc. ... all of which can be implemented directly in Scheme without dipping into assembler.

Dan Friedman's paper is great, but is too technical and too Scheme-centric for beginners.

The Wikipedia article on continuations (https://en.wikipedia.org/wiki/Continuation) is simpler to understand. Be sure to look through the "further reading" and links at the end.

But the way to really get a handle on what continuations are is to read a book on compilers - at which point you'll realize that they really are little more than a branch target.

> However, with this kind of idea about escape continuations in mind, I think I get the idea behind using `(let/ec` in the code. When a place wants to exit, it is stopped from doing so, giving it more work, as if to say: "Hey wait! You can escape, but only if you do THIS! (escape continuation)" Then the place, desperately wanting to escape thinks: "Damn, OK, I'll do just that little bit of code more.", not realizing, that it is stuck in a loop. How mean.
>
> I wonder however, if there is no simpler way of doing this. I mean, if `(place-channel-get ...)` blocks, could I simply put stuff into an endless loop without escape continuation and only break the loop, if a certain symbol is received on the channel?

How are you going to "break out" of the loop?

In Scheme there are only 3 ways to jump over or out of a block of code: return from a function, invoke a continuation, or raise an exception. The loop I wrote was in the middle of the function where returning was not an option and there was no reason to raise an exception.

Escape continuations are a structured way to jump out of a loop. A 'do' loop is a macro that hides continuations inside. You might as well learn to use them directly.
> Since `(place-channel-get ...)` is blocking, it should not generate any CPU load, when there is no message on the channel (right?). Why do I need to introduce something as complex as `(let/ec ...)`?

Because you need to exit from the - otherwise infinite - loop.

> I appreciate the code shared here. I just hesitate to use it, when I don't even understand it myself or when I am super unsure about understanding it (escape continuations). I am also thinking about how I can replace all the exclamation mark procedures, before adding it to my other code, which does not deal with assignments so far.

Don't get too hung up on style points ... the goal is to write code that is easy to read and that you [and others] will understand without needing days of intensive study. If avoiding assignments makes your code longer and more convoluted, then doing it was a bad thing.

> Is there some easy to understand introduction to continuations in Racket? (not a super clever and scientific Friedman paper, which I'd probably need 1 year to actually understand
[racket-users] Re: Need help with parallelizing a procedure
On Saturday, August 19, 2017 at 12:16:53 PM UTC+2, Zelphir Kaltstahl wrote:

> Is there some easy to understand introduction to continuations in Racket?
> (not a super clever and scientific Friedman paper, which I'd probably need 1
> year to actually understand :D)

Try the short Beautiful Racket introduction first: http://beautifulracket.com/explainer/continuations.html
[racket-users] Re: Need help with parallelizing a procedure
I looked at the code for a while again and think I now begin to understand it a bit more:

I did not know `(let/ec` so I had to read about it. It says, that it is equivalent to `(call/ec proc` or something, which is equivalent to `(call-with-escape-continuation ...`. Uff … I don't know much about continuations, except from a vague idea of when they are called. The idea gets mingled with structures in other programming languages which are "catching exceptions" and such stuff. I don't know this stuff, so I guess `(call-with-escape-continuation` works like catching an exception, which is not really an exception, but just a signal, that code is returning. Then instead of simply returning, the escape continuation is called. This is done implicitly without specifying a named exception for returning and without defining some conditional structure on the side where it would return to.

This could all be wrong and all I found was the following very unhelpful https://lists.racket-lang.org/users/archive/2010-October/042356.html where it links me to some paper (https://www.cs.indiana.edu/~dfried/appcont.pdf), which I have no motivation to read in its entirety, even if it's probably really clever stuff. Maybe that is the actual reason: Fearing that I won't understand any of it in depth. It also includes some exercises, which seem to be really difficult.

However, with this kind of idea about escape continuations in mind, I think I get the idea behind using `(let/ec` in the code. When a place wants to exit, it is stopped from doing so, giving it more work, as if to say: "Hey wait! You can escape, but only if you do THIS! (escape continuation)" Then the place, desperately wanting to escape thinks: "Damn, OK, I'll do just that little bit of code more.", not realizing, that it is stuck in a loop. How mean.

I wonder however, if there is no simpler way of doing this.
I mean, if `(place-channel-get ...)` blocks, could I simply put stuff into an endless loop without escape continuation and only break the loop, if a certain symbol is received on the channel? Since `(place-channel-get ...)` is blocking, it should not generate any CPU load, when there is no message on the channel (right?). Why do I need to introduce something as complex as `(let/ec ...)`?

I appreciate the code shared here. I just hesitate to use it, when I don't even understand it myself or when I am super unsure about understanding it (escape continuations). I am also thinking about how I can replace all the exclamation mark procedures, before adding it to my other code, which does not deal with assignments so far.

Is there some easy to understand introduction to continuations in Racket? (not a super clever and scientific Friedman paper, which I'd probably need 1 year to actually understand :D)
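The simpler loop asked about here can be sketched as follows. This is an illustration under stated assumptions, not the code from the thread: `serve` and its `handle` callback are hypothetical names, and the demo runs inside one place, using a thread and the two connected endpoints returned by `(place-channel)`, so that it needs no separate place module. The point it shows is that a named `let` stops as soon as a branch does not call `loop` again, so this particular loop needs no escape continuation.

```racket
#lang racket

;; serve: keep handling messages until 'terminate arrives.
;; `handle` is a hypothetical callback standing in for the real work.
(define (serve ch handle)
  (let loop ()
    (match (place-channel-get ch)     ; blocks until a message is available
      ['terminate (void)]             ; no recursive call, so the loop ends here
      [msg (handle msg)
           (loop)])))                 ; tail call: wait for the next message

;; Demo: (place-channel) returns two connected place-channel endpoints.
(define-values (here there) (place-channel))
(define received '())
(define worker
  (thread (lambda ()
            (serve there (lambda (m) (set! received (cons m received)))))))

(for ([m (in-list '(the quick brown fox))])
  (place-channel-put here m))
(place-channel-put here 'terminate)
(thread-wait worker)                  ; the worker returns once 'terminate is seen
```

Whether this reads better than `let/ec` is a matter of taste; the escape continuation becomes more useful when the exit has to happen from deeper inside nested code rather than from the loop's own dispatch.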
Re: [racket-users] Re: Need help with parallelizing a procedure
On 8/16/2017 3:34 PM, Zelphir Kaltstahl wrote:

> Just a few quick questions regarding the places code:
>
> 1. Is `?` like `if`? For me it says that it is an undefined identifier.

Um? ... I think that probably is a lambda glyph that didn't render for you. If you are looking at

    (let [ (output (lambda (fmt . whatever)
                     (apply eprintf fmt whatever)
                     (flush-output (current-error-port)) ))
       :

then that surely is the case.

That special output function is an artifact of wanting to print on the console from the place. Normally stderr to a process is unbuffered, however a place is just a kernel thread running a separate copy of the Racket VM. Racket (or maybe the DrRacket debugger) buffers stderr for places, so the port has to be flushed to see output as it happens rather than waiting for the place to terminate or the buffer to fill up.

Places support stdio so they can be pipelined like Unix processes, but in practice most interactions with them should be through channel messaging. Channel descriptors, file handles, tcp ports, etc. all can be passed around via messages, so you can implement arbitrarily complex connection schemes using messaging that are impossible using simple pipes.

> 2. If I understand correctly, the place is looping and in each iteration it looks if there is a message on the channel, which matches something. Is this creating a lot of extra work for looking on the channel all the time?

Yes, it loops. A place terminates when its last instruction is executed, so you need to use a loop to keep it alive.

Channels are signaling objects (event sources). 'place-channel-get' blocks until there is a message available or until the channel is closed, so waiting on the channel does not burn CPU.

> 3. What are `p`, `_i`, `_o` and `_e` doing?

The 'place*' call returns 4 values: a channel to the new place, and the ports bound to its stdio. 'p' is the channel - the only value I'm interested in. The others are placeholders.
Unfortunately, with multiple value returns, Racket [like Scheme] doesn't allow for ignoring don't cares - you have to catch all the values.

> 4. Could I replace `set!-values` with a `let`?

No, but you could replace it with 'let-values'.

Or you could use 'place' which returns a single value (the channel) instead of 'place*' which returns 4. I chose 'place*' specifically because I wanted console output from the places and needed to pass the stdio port to them.

https://docs.racket-lang.org/reference/places.html

It's a matter of style. I prefer to limit nesting levels (or use let*) because indentation quickly gets out of control. YMMV.

From a technical perspective, binding code is somewhat faster than assignment code - but in practice with the current compiler the difference rarely matters. And starting places is a slow process anyway.

George
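The three ways of catching multiple values mentioned above can be compared side by side. To keep the sketch runnable without starting a place, `four-values` is a hypothetical stand-in for a call such as `place*`, and the symbols it returns are placeholders, not what `place*` actually produces.

```racket
#lang racket

;; Hypothetical stand-in for a call that returns four values, as place* does.
(define (four-values)
  (values 'channel 'stdin 'stdout 'stderr))

;; Assignment style, as in the posted code: bind dummies, then overwrite them.
(define (first-value/set!)
  (let ([p #f] [_i #f] [_o #f] [_e #f])
    (set!-values (p _i _o _e) (four-values))
    p))

;; Binding style: let-values names all four results in a single step.
(define (first-value/let-values)
  (let-values ([(p _i _o _e) (four-values)])
    p))

;; define-values does the same without adding a nesting level.
(define (first-value/define-values)
  (define-values (p _i _o _e) (four-values))
  p)
```

In every variant all four results still have to be named; the underscore prefix is only a naming convention for the ones that are not used.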
[racket-users] Re: Need help with parallelizing a procedure
Just a few quick questions regarding the places code:

1. Is `?` like `if`? For me it says that it is an undefined identifier.

2. If I understand correctly, the place is looping and in each iteration it looks if there is a message on the channel, which matches something. Is this creating a lot of extra work for looking on the channel all the time?

3. What are `p`, `_i`, `_o` and `_e` doing?

4. Could I replace `set!-values` with a `let`?
[racket-users] Re: Need help with parallelizing a procedure
On Mon, 14 Aug 2017 13:04:36 -0700 (PDT), Zelphir Kaltstahl wrote:

> I tried with future-visualizer, but did not save screenshots of
> the result. Would it help showing them? I was not able to figure
> out the problem, even when clicking the red dots to see what is
> blocking. It made no sense to me, that those operations were
> blocking.

Believe me, I don't understand it very much better. I rarely use futures ... a long time ago I came to the conclusion that pretty much anything other than pure arithmetic or logic is suspect.

> However, I tried a lot of things and don't remember what procedures
> were blocking in which case. I remember, that I once tried to use
> flonum operations and that even they were shown to be blocking,
> while they solved the problem in the guide for parallelization in Racket.

One thing you can try is using 'would-be-future' instead of 'future'.

https://docs.racket-lang.org/reference/futures.html?q=would-be-futures#%28def._%28%28lib._racket%2Ffuture..rkt%29._would-be-future%29%29

'would-be-future' is a debugging form - it creates a pseudo future that won't be executed in parallel, but when evaluated it logs unsafe operations that might cause it to block if the evaluation _were_ done in parallel.
[As long as you don't force [touch] it ... to get the logging you have to let it be evaluated by the future mechanism AS IF it were a real future.]

George
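A minimal sketch of swapping `future` for `would-be-future`, for anyone who wants to try the suggestion. The thunk `noisy-sum` is a made-up example that deliberately mixes arithmetic with `printf`, an operation that is not future-safe. As I read the reference documentation, a would-be future never runs in parallel and its thunk is evaluated when the future is touched, with the unsafe operations logged along the way; please check that reading against the linked page before relying on it.

```racket
#lang racket
(require racket/future)

;; Hypothetical workload: pure arithmetic plus one printf, which would
;; interfere with parallel execution in a real future.
(define (noisy-sum)
  (printf "working~n")
  (for/sum ([i (in-range 1000)])
    (* i i)))

;; Same shape as (future noisy-sum), but for debugging only.
(define f (would-be-future noisy-sum))

;; touch yields the thunk's result, just as it would for a real future.
(define result (touch f))
```

The log messages themselves are not shown by default. I believe they are emitted at debug level under the `future` topic, so running with something like `racket -W debug@future file.rkt` should display them; treat that exact flag as an assumption and fall back to the future visualizer if it shows nothing.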
[racket-users] Re: Need help with parallelizing a procedure
Thanks for the example code, I'll try it soon!

The whole code is on my repository at: https://github.com/ZelphirKaltstahl/racket-ml/blob/master/decision-tree.rkt

I tried with future-visualizer, but did not save screenshots of the result. Would it help showing them? I was not able to figure out the problem, even when clicking the red dots to see what is blocking. It made no sense to me, that those operations were blocking.

However, I tried a lot of things and don't remember what procedures were blocking in which case. I remember, that I once tried to use flonum operations and that even they were shown to be blocking, while they solved the problem in the guide for parallelization in Racket.

When I ran my whole decision tree program with some example data and the statistical profiler package, I found that the algorithm spent most time as follows (for the code in the repository, possibly dev branch):

- data-majority-prediction = 15084(32.4%)
- calc-proportion = 15656(33.6%)
- get-best-split = 4304(9.2%)
- gini-index = ??? (but is part of get-best-split)
[racket-users] Re: Need help with parallelizing a procedure
On Sat, 12 Aug 2017 16:17:21 -0700 (PDT), Zelphir Kaltstahl wrote:

> Is it in general not possible to use a place more than once?

No problem. A very simple example is shown below.

This example only demonstrates that places can remain running and responding to new requests. Real interaction between the coordinator [main program] and the places [threads] is left as an exercise.

George

##
#lang racket

;==

(define place-body
  (let [ (output (lambda (fmt . whatever)
                   (apply eprintf fmt whatever)
                   (flush-output (current-error-port)) ))
         (id #f)
       ]
    (lambda (ch)
      (output "place starting~n")
      (let/ec done
        (let loop [ (msg (place-channel-get ch)) ]
          (match msg
            ([list 'init val]
             (set! id val) )
            ('terminate
             (done) )
            (else
             (output "place ~s: ~s~n" id msg) ))
          (loop (place-channel-get ch)) ))
      (output "place ~s stopping~n" id)
      )))

;==

(define my-places (list))

(define (start-places how-many)
  (for/list [(i how-many)]
    (let [ (p #f) (_i #f)(_o #f)(_e #f) ]
      (set!-values (p _i _o _e)
        (place* #:err (current-error-port)
                ch
                (place-body ch)))
      (printf "created place ~s~n" i)
      (place-channel-put p (list 'init i ))
      p )))

(define (stop-places lst)
  (for [(p lst)]
    (place-channel-put p 'terminate)
    (place-wait p) ))

;==

(define (main)
  (set! my-places (start-places 3))
  (sleep 1)
  (let loop [ (msg (list 'the 'quick 'brown 'fox 'jumped 'over 'the 'lazy 'dogs))
              (index 0)
            ]
    (cond
      ([null? msg])
      (else
       (place-channel-put (list-ref my-places index) (car msg))
       (loop (cdr msg) (remainder (+ index 1) (length my-places))) ))
    )
  (sleep 1)
  (stop-places my-places)
  )
##
[racket-users] Re: Need help with parallelizing a procedure
On Sun, 13 Aug 2017 15:59:17 -0700 (PDT), Zelphir Kaltstahl wrote:

> I tested using futures. While it is significantly faster than creating places,
> it is still also significantly slower than the single threaded solution
> without places and without futures:
>
> ~~~
> (define (gini-index-futures subsets label-column-index)
>   (let ([futures (flatten (for/list ([subset (in-list subsets)])
>                             (for/list ([label (in-list (list 0 1))])
>                               (future (lambda ()
>                                         (calc-proportion subset
>                                                          label
>                                                          label-column-index))))))])
>     (for/sum ([a-future (in-list futures)])
>       (touch a-future))))
> ~~~
>
> There always seems to be something blocking, so that only one core is used by
> the futures. Creating places is too expensive, so maybe I could create places
> at the very beginning before my program runs and always re-use those.

You don't show the code for 'calc-proportion' [not that that necessarily would help much]. There are many conditions that prevent futures from being executed in parallel - it's best if the code is purely computation ... almost anything else and you're asking for trouble.

The future-visualizer can help you figure out what's blocking them.

George
[racket-users] Re: Need help with parallelizing a procedure
I tested using futures. While it is significantly faster than creating places, it is still also significantly slower than the single threaded solution without places and without futures:

~~~
(define (gini-index-futures subsets label-column-index)
  (let ([futures (flatten (for/list ([subset (in-list subsets)])
                            (for/list ([label (in-list (list 0 1))])
                              (future (lambda ()
                                        (calc-proportion subset
                                                         label
                                                         label-column-index))))))])
    (for/sum ([a-future (in-list futures)])
      (touch a-future))))
~~~

There always seems to be something blocking, so that only one core is used by the futures. Creating places is too expensive, so maybe I could create places at the very beginning before my program runs and always re-use those.
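For comparison, here is a sketch of the kind of thunk that futures are most likely to run without blocking: a purely numeric loop over an flvector using only flonum-specific operations. It is not the thread's `calc-proportion`, whose code is not shown here; `sum-of-squares`, `parallel-sum-of-squares` and the sample data are made up for the example. Whether a particular Racket build actually runs it in parallel is not something this sketch can promise, so the future visualizer is still the way to check.

```racket
#lang racket
(require racket/future racket/flonum)

;; Hypothetical per-chunk work: flonum-only arithmetic over an flvector.
(define (sum-of-squares v)
  (for/fold ([acc 0.0]) ([x (in-flvector v)])
    (fl+ acc (fl* x x))))

;; One future per chunk; touching them afterwards collects the results.
(define (parallel-sum-of-squares chunks)
  (define futures
    (for/list ([v (in-list chunks)])
      (future (lambda () (sum-of-squares v)))))
  (for/fold ([total 0.0]) ([f (in-list futures)])
    (fl+ total (touch f))))

;; Made-up sample data: two chunks of flonums.
(define chunks
  (list (flvector 1.0 2.0 3.0)
        (flvector 4.0 5.0)))
```

Creating all futures before touching any of them gives each one a chance to start; touching a future immediately after creating it would simply run the thunk in the calling thread.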
[racket-users] Re: Need help with parallelizing a procedure
On Saturday, August 12, 2017 at 12:21:19 PM UTC+2, Zelphir Kaltstahl wrote:

> I want to parallelize a procedure which looks like this:
>
> ~~~
> (define (gini-index subsets label-column-index)
>   (for/sum ([subset (in-list subsets)])
>     (for/sum ([label (in-list (list 0 1))])
>       (calc-proportion subset
>                        label
>                        label-column-index))))
> ~~~
>
> I tried some variations of using places without success and then I found:
> https://rosettacode.org/wiki/Parallel_calculations#Racket
>
> Where the code is:
>
> ~~~
> #lang racket
> (require math)
> (provide main)
>
> (define (smallest-factor n)
>   (list (first (first (factorize n))) n))
>
> (define numbers
>   '(112272537195293 112582718962171 112272537095293
>     115280098190773 115797840077099 1099726829285419))
>
> (define (main)
>   ; create as many instances of Racket as
>   ; there are numbers:
>   (define ps
>     (for/list ([_ numbers])
>       (place ch
>              (place-channel-put
>               ch
>               (smallest-factor
>                (place-channel-get ch))))))
>   ; send the numbers to the instances:
>   (map place-channel-put ps numbers)
>   ; get the results and find the maximum:
>   (argmax first (map place-channel-get ps)))
> ~~~
>
> So inside the list places are created and it seems that the whole definition
> of what they are supposed to do is wrapped in that (place ...) expression. I
> tried to do the same for my example:
>
> ~~~
> (define (gini-index subsets label-column-index)
>   (for*/list ([subset (in-list subsets)]
>               [label (in-list (list 0 1))])
>     (place pch
>            (place-channel-put pch (list subset label label-column-index))
>            (let ([data (place-channel-get pch)])
>              (calc-proportion (first data)
>                               (second data)
>                               (third data))))))
> ~~~
>
> The `subset` inside `(place-channel-put pch (list subset label
> label-column-index))` gets underlined and the error is:
>
> subset: identifier used out of context
>
> (I) In the example from Rosetta code it is all easy, as there only one number
> is passed and it does not need a name or anything, but in my example I am
> not sure how to do it.
> (II) A second thing I tried to do was using a place more than once (put, get
> then put get to the channel again), but it did not work and my program simply
> did nothing anymore, no cpu load or anything, but also did not finish,
> probably waiting for an answer from the place and never getting any. Is it in
> general not possible to use a place more than once?

Meanwhile I could find a way to use places which is the following:

~~~
(define (gini-index subsets label-column-index)
  ;; (displayln "1 calculating gini index")
  #|
  Takes:
  - a list of place descriptors
  - subsets (should always be two in this implementation)
  - labels (should always be a (list 0 1))
  Returns:
  - a list of place descriptors
  |#
  (define (iter-subsets subsets labels place-descriptors) ; call with empty
    (cond [(empty? subsets) place-descriptors]
          [else (iter-subsets (rest subsets)
                              labels
                              (cons (iter-labels (first subsets) labels empty)
                                    place-descriptors))]))

  (define (iter-labels subset labels place-descriptors)
    (cond [(empty? labels) place-descriptors]
          [else (let ([a-place (dynamic-place "decision-tree-places.rkt"
                                              'place-calc-proportion-main)])
                  (place-channel-put a-place
                                     (list subset (first labels) label-column-index))
                  (iter-labels subset
                               (rest labels)
                               (cons a-place place-descriptors)))]))

  (let ([places (flatten (iter-subsets subsets (list 0 1) empty))])
    (let ([result (for/sum ([a-place (in-list places)])
                    (place-channel-get a-place))])
      (display "result: ")
      (displayln result)
      result)))
~~~

I could not find a more elegant solution for creating the places and keeping a handle on them for calling place-channel-get on them.

However, my tests confirm exactly what you said: It's way slower than even the single core implementation. The overhead seems to be huge for starting another Racket instance and all that goes with that.

I wondered about futures too, because I read (most of ;) the parallelism guide.
I only thought that it would not work out, because of actually using potentially large lists of vectors and not only floats. I once (some months ago) ran the example for futures, which shows that they fail when allocating large integers. This and thinking about lists of vectors made me discard the idea of using futures for this. Do you think using futures would work if the data is: 1) a list of vectors of numbers