On Sun, Jan 23, 2011 at 11:34 PM, Ken Wesson <kwess...@gmail.com> wrote:
> Other posts to the thread indicate that longer-range patterns in the
> inputs could cause problems. If you know you'll be consuming the full
> sequence, try this:
>
> (defn eager-pmap [f & colls]
>  (map deref (doall (apply map (fn [& args] (future (apply f args))) colls))))
>
> This creates all of the futures right away (due to the doall) and
> leaves it up to the future implementation to distribute work around
> some thread pool. On my system, it creates 80 or so of them each time
> it's run on a seq of 100000 things, which persist for some time
> afterward -- so, a bit of a problem there. Another one that seems like
> it ought to be more efficient:
>
> (defn eager-pmap [f & colls]
>  (let [cores (.. Runtime getRuntime availableProcessors)
>        agents (cycle (for [_ (range cores)] (agent nil)))
>        promises (apply map (fn [& _] (promise)) colls)]
>    (doall
>      (apply map (fn [a p & args]
>                   (send a
>                     (fn [_] (deliver p (apply f args)))))
>        agents promises colls))
>    (map deref promises)))
>
> This one uses the agent "send" thread pool to divide up the work.
> Oddly, it's actually 2-3x slower than the previous and only uses 75%
> CPU on a dual-core machine, though it doesn't leak threads.

I've managed to make this more efficient by having each agent send
process a chunk of items rather than a single item:

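(defn eager-pmap [f & colls]
  (let [cores (.. Runtime getRuntime availableProcessors)
        ;; one agent per core, reused round-robin via cycle
        agents (cycle (for [_ (range cores)] (agent nil)))
        ;; split each input into chunks of up to 16 items
        ccolls (map (partial partition 16 16 []) colls)
        ;; one promise per chunk
        promises (apply map (fn [& _] (promise)) ccolls)]
    ;; queue one action per chunk; doall issues all the sends right away
    (doall
      (apply map (fn [a p & chunks]
                   (send a
                     (fn [_]
                       (deliver p
                         (apply map f chunks)))))
        agents promises ccolls))
    ;; wait for the chunks in order and splice their results together
    (mapcat deref promises)))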

This is much faster than either of the other eager-pmaps I posted to
this thread, and yet it's only using 60% CPU on my dual-core box. That
means another 1.6x or so speedup should still be possible(!), but I'm
not sure how to get it.

Raising the size of the partitions from 16 all the way to 1024 makes
no noticeable difference.
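
One possible contributor to the idle CPU, offered as a guess rather than
a measurement: map is lazy, so (apply map f chunks) hands the promise an
unrealized seq, and the calls to f only happen later, on whichever thread
walks the result of (mapcat deref promises). That would also fit the
partition size making no difference. Below is a minimal sketch of a
variant that forces each chunk inside the agent action with doall. The
name eager-pmap-forced is made up for illustration, and the variant has
not been benchmarked here.

```clojure
;; Sketch only: same structure as the chunked eager-pmap, with one change.
;; eager-pmap-forced is a hypothetical name used for this illustration.
(defn eager-pmap-forced [f & colls]
  (let [cores    (.. Runtime getRuntime availableProcessors)
        agents   (cycle (for [_ (range cores)] (agent nil)))
        ccolls   (map (partial partition 16 16 []) colls)
        promises (apply map (fn [& _] (promise)) ccolls)]
    (doall
      (apply map (fn [a p & chunks]
                   (send a
                     (fn [_]
                       ;; doall realizes the whole chunk here, on the
                       ;; agent's thread, before the promise is delivered.
                       (deliver p (doall (apply map f chunks))))))
        agents promises ccolls))
    (mapcat deref promises)))

;; Usage: results match plain map, including the final partial chunk.
(= (map inc (range 100)) (eager-pmap-forced inc (range 100)))
;; => true
```

To check where the work runs, have f record
(.getName (Thread/currentThread)) and compare the names seen under each
version.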
