Following my recent adventure with word ranking, here's the parallel version:

(use 'clojure.contrib.duck-streams)

;; Count occurrences of each lower-cased word in s.
(defn top-words-core [s]
  (reduce #(assoc %1 %2 (inc (%1 %2 0)))
          {}
          (re-seq #"\w+" (.toLowerCase s))))

;; Render the counts as text, most frequent word first.
(defn format-words [words]
  (apply str
         (map #(format "%20s : %5d \r\n" (key %) (val %))
              (sort-by #(- (val %)) words))))

;; Split s into two halves of roughly equal length.
(defn split-string-in-two [s]
  (let [chunk-size (quot (count s) 2)]
    [(subs s 0 chunk-size), (subs s chunk-size)]))

;; Count each half in its own agent, then merge the two maps.
(defn parallel-top-words [in-filepath out-filepath]
  (let [string (slurp in-filepath)
        agents (map #(agent %) (split-string-in-two string))]
    (doseq [a agents]
      (send a top-words-core))
    (apply await agents)
    (spit out-filepath
          (format-words (apply merge-with + (map deref agents))))))

(http://pastie.org/348106)

On a 38 MB file it takes 28 s, compared to 38 s for a similar but sequential version.

1. Is there a better way to do it? Perhaps agents should share some data structure?
2. Despite producing valid results, the program never ends. Why?

regards,
Piotrek