I wrote a simple word counter, as described at http://ptrace.fefe.de/wp/. It reads stdin and counts the occurrences of words. However, I notice that it runs significantly slower than the Java version in the link.
I was wondering why there is such a dramatic difference. The approach I took was to create a map keyed on words, with the occurrence count as the value. As each line is read from the input, it is tokenized and the word counts are updated. The slowdown seems to occur in the inc-count function, where it "updates" the map using assoc. Is this not a proper way to approach this in Clojure?

I've also noticed that there is a significant speed difference between conj and assoc. Why is that? If I understand correctly, both should only create the delta between the new elements and the old structure, yet assoc appears to perform much better.

(import '(java.io BufferedReader InputStreamReader))

;; Returns words with the count for word incremented; empty strings are skipped.
(defn inc-count [words word]
  (if (= (. word (length)) 0)
    words
    (let [cnt (get words word)]
      (if cnt
        (assoc words word (inc cnt))
        (assoc words word 1)))))

;; Returns [count word] pairs, highest count first.
(defn sort-words [words]
  (reverse
    (sort-by (fn [x] (first x))
             (map (fn [x] [(get words x) x]) (keys words)))))

(defn print-words [words]
  (let [head (first words)
        tail (rest words)]
    (if head
      (do (println head)
          (recur tail)))))

;; Folds every token of one line into the words map.
;; Stops when no token is left, so the last token of the line is counted too.
(defn read-words [words line]
  (let [head (first line)
        tail (rest line)]
    (if (nil? head)
      words
      ;; time prints the elapsed time of every single update
      (recur (time (inc-count words head)) tail))))

(defn read-input []
  (with-open [stream System/in]
    (let [buf (BufferedReader. (InputStreamReader. stream))]
      (loop [line (. buf (readLine))
             words {}]
        (if (nil? line)
          (print-words (sort-words words))
          (recur (. buf (readLine))
                 (read-words words (. line (split " ")))))))))

(time (read-input))
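
To make the update step easier to look at in isolation, here is a minimal sketch of the same approach without any I/O. It assumes the input has already been split into words. The name count-words is only made up for this illustration and is not part of the program above. The last expression compares the two ways of adding an entry that I am asking about; it says nothing about their relative speed, only that the resulting maps are equal.

```clojure
;; Hypothetical helper for illustration: builds a word -> count map
;; from a sequence of already-tokenized words, skipping empty strings.
(defn count-words [words]
  (reduce (fn [counts word]
            (if (empty? word)
              counts
              ;; assoc returns a new map; counts itself is left unchanged
              (assoc counts word (inc (get counts word 0)))))
          {}
          words))

(def counts (count-words ["a" "b" "a" "" "c" "a"]))
;; counts is equal to {"a" 3, "b" 1, "c" 1}

;; conj with a [key value] pair yields the same map as assoc
(= (assoc counts "d" 1) (conj counts ["d" 1]))
```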