On Tuesday, January 15, 2013 12:36:04 AM UTC+1, Jim foo.bar wrote: > On 14/01/13 22:15, Marko Topolnik wrote: > > But... you were quite clear in stating that your function isn't lazy, and > you were right to say that: *doall* will not return until *all* the tasks > have completed. Maybe you did want laziness, after all? > > > I'm not entirely sure what you mean here...The only real reason I am using > 'doall' is so that I can shutdown the ExecutorService, otherwise I wouldn't > mind returning a lazy-seq. In fact that is how I originally had it - I was > never shutting down the pool until recently! Inside the doall however I'm > polling the pool for answers and *it* looks for finished jobs...I'm not > deref-ing the futures in a serial manner which has higher chances of > bocking...Is my thinking not correct? >
The order in which you are polling is not very relevant given the fact that *doall* won't return until *all* futures are realized. It's just an internal detail. As for shutting down the executor service, it is safe to do so immediatly after all tasks have been submitted: (defn pool-map [f coll] (let [exec (Executors/newFixedThreadPool (.availableProcessors (Runtime/getRuntime)))] (->> (try (.invokeAll exec (for [x coll] #(f x))) (finally (.shutdown exec))) (map #(.get ^FutureTask %))))) > Yet another issue with your function is that it doesn't submit all the > tasks up front so it clearly won't prevent longer tasks from delaying the > shorter ones. But, even if they are submitted all at once, like in my > version, the executor service will still enqueue them internally, with the > same end result. You'd need an unbounded thread pool, and that would > obviously bring in other issues. > > > You're quite right here...Again using 'for' was a remnant from the old > version...Perhaps 'mapv' would be a better choice for submitting the > jobs...In any case, is this executor-based approach (with or without the > completion service) encouraged? Would you care if tomorrow pmap started > using this approach? > Using *mapv* (or simply *doall*, same thing) won't solve the root problem of tasks waiting in executor's own queue. Taking into account all that was said, *pool-map* can't offer much more than *pmap*. You can't know which tasks will take less time until they are already done. It is theoretically impossible to pre-order them according to execution time, thereby harvesting the results of the fastest ones earlier, eventually promoting total concurrency. The way to go is grinding the tasks down to chunks of predictable runtime and submitting them to a fork/join solver, which will optimize concurrency through work-stealing. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en