On 15/01/13 09:25, Marko Topolnik wrote:
The order in which you are polling is not very relevant given the fact that /doall/ won't return until *all* futures are realized. It's just an internal detail.

I finally fully grasped what you were saying... So yes, you're right: as long as I'm forcing realisation at the end, there is nothing to be gained. However, what if I submit jobs eagerly and poll for results lazily? Then there must be some gain from using the completion service, which will bring back the results in the order they finished. Some basic testing:

(defn pool-map
  "A saner, more disciplined version of pmap. Submits jobs eagerly but polls for results lazily.
  Don't use if original ordering of 'coll' matters."
  [f coll]
  (let [cpu-no  (.. Runtime getRuntime availableProcessors)
        exec    (java.util.concurrent.Executors/newFixedThreadPool cpu-no)
        pool    (java.util.concurrent.ExecutorCompletionService. exec)
        futures (doall (for [x coll] (.submit pool #(f x))))] ;;submit everything up front
    (try
      (for [_ futures] (.. pool take get)) ;;lazy: each element blocks until the next job finishes
      (finally (.shutdown exec))))) ;;runs before the lazy seq is consumed; already-submitted jobs still complete

;;your version is 'pool-map1'
;;weirdly enough 'pool-map1' doesn't behave lazily (even though it has a call to 'map'!)!!!


user=> (def dummy-times [3000 10 9 8 7 6 5 4 3 2 1])
#'user/dummy-times
user=> (time  (pmap #(do (Thread/sleep %) %) dummy-times))
"Elapsed time: 16.213366 msecs"
(3000 10 9 8 7 6 5 4 3 2 1) ;;here you waited 3s before sleeping for 0.01 s
user=> (time  (pool-map #(do (Thread/sleep %) %) dummy-times))
"Elapsed time: 21.004979 msecs"
(10 9 8 7 6 5 4 3 2 1 3000) ;;here you've not waited at all - sleeping for 3s finished last and is last
user=> (time  (pool-map1 #(do (Thread/sleep %) %) dummy-times))
"Elapsed time: 3008.174631 msecs"  ;;non-lazy?
(3000 10 9 8 7 6 5 4 3 2 1) ;;again your version will wait for the first item to finish before proceeding
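
A note on reading the small timings in the transcript: when the timed expression returns a lazy seq, `time` reports only how long it took to build that seq, and the REPL realises it afterwards while printing. Here is a minimal sketch of the difference, assuming a hypothetical helper `elapsed-ms` and shorter sleeps than in the transcript (neither is from the thread):

```clojure
;; Hypothetical helper (not from the thread): runs a zero-arg fn and
;; returns the elapsed wall-clock time in milliseconds.
(defn elapsed-ms [thunk]
  (let [start (System/nanoTime)]
    (thunk)
    (/ (- (System/nanoTime) start) 1e6)))

(def sleep-times [300 10 9 8 7 6 5 4 3 2 1])

(defn sleep-and-return [ms]
  (Thread/sleep (long ms))
  ms)

;; Only builds the lazy seq; does not wait for any result.
(def unrealised-ms
  (elapsed-ms #(pmap sleep-and-return sleep-times)))

;; doall forces every element inside the timed region,
;; so this measurement includes the 300 ms task.
(def realised-ms
  (elapsed-ms #(doall (pmap sleep-and-return sleep-times))))
```

Wrapping the call in `doall` inside `time` is one way to compare the versions on equal terms when the full result is needed.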

I think what you are trying to get across is that the overall timings (if we do realise the result) will not differ much, as all jobs have to finish eventually. In other words, sleeping for 3 s first and for 1 s later is the same thing as sleeping for 1 s and then for 3 s... and of course this is generally true! However, there is no real benefit in waiting for the 1st task to finish when we don't mind about ordering. You'll get the first item whenever it finishes, in whatever position. This MUST be good, but perhaps it needs to be paired with laziness to witness any effect?
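
To make the "first item whenever it finishes" point concrete, here is a minimal sketch, not taken from the thread. `first-finished` is a hypothetical helper that submits every job up front and returns only the result that completes first. It assumes a non-empty `coll` and uses one thread per job so the outcome does not depend on the number of cores.

```clojure
(import '(java.util.concurrent Executors ExecutorCompletionService))

(defn first-finished
  "Submits (f x) for every x in coll and returns the result of whichever
  job completes first. Hypothetical helper; assumes coll is non-empty."
  [f coll]
  (let [exec (Executors/newFixedThreadPool (count coll)) ; one thread per job
        pool (ExecutorCompletionService. exec)]
    (try
      (doseq [x coll]
        (.submit pool #(f x)))          ; eager submission
      (.get (.take pool))               ; blocks only until the first job is done
      (finally
        (.shutdownNow exec)))))         ; interrupt the jobs we no longer need

(def winner
  (first-finished (fn [ms] (Thread/sleep (long ms)) ms)
                  [300 10 9 8 7 6 5 4 3 2 1]))
;; winner is one of the short sleeps; the 300 ms job does not hold it up
```

A consumer that needs only some of the results benefits from completion order; a consumer that realises everything still waits for the slowest job.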

Taking into account all that was said, /pool-map/ can't offer much more than /pmap/. You can't know which tasks will take less time until they are already done. It is theoretically impossible to pre-order them according to execution time, thereby harvesting the results of the fastest ones earlier, eventually promoting total concurrency.

Hmmm... so the completion service is useless? It can't be... You say that 'You can't know which tasks will take less time until they are already done', but the way I see it you don't need to. All you need to know at any given time is whether one or more futures have completed. If one has indeed completed, you invoke .get for the result. If it hasn't finished and you call .get, it will block until it finishes, just like deref-ing in Clojure... I honestly don't see why harvesting the results of the fastest ones earlier requires knowing the execution times up front! As you go along you can ask the futures whether they have finished or not, can't you?
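
The idea of asking "has anything finished yet?" can be sketched with the completion service's non-blocking `.poll`, which returns nil when no submitted task has completed. `drain-finished` is a hypothetical helper name, not something from the thread:

```clojure
(import '(java.util.concurrent Executors ExecutorCompletionService))

(defn drain-finished
  "Returns a vector with the results of all tasks that have already
  completed on the given completion service, without ever blocking.
  Hypothetical helper, for illustration only."
  [^ExecutorCompletionService pool]
  (loop [acc []]
    (if-let [fut (.poll pool)]          ; nil when nothing has finished yet
      (recur (conj acc (.get fut)))     ; .get returns at once: fut is done
      acc)))
```

No execution times are needed up front here: the service queues each future as it completes, and the caller simply collects whatever is ready at that moment.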

I am in no way trying to contradict you, I'm just trying to set things straight so we are all on the same page... again, thanks for your time and comments! :)


Jim

