While grocery shopping this morning, it occurred to me that it would be even faster to do a single pass over the blocks array, and update the count in one of 128 maps depending on the current index. Of course, when I got home and took another look at your gist page, it turns out that's you've already done in the second most recent iteration :) So the following is almost the same as that iteration, except:
1. I call assoc! directly on the transient maps instead of writing my own update-in!. 2. I use unchecked-remainder instead of rem to calculate the index of the map to update; this shaved off 3-4 seconds. 3. The transient maps are stored in a plain old vector. Using a transient vector didn't make things much faster for me, so I dropped it to keep the code a bit simpler. (def num-layers 128) (defn freqs [^bytes blocks] (map persistent! (areduce blocks idx all-freqs (vec (repeatedly num-layers #(transient {}))) (let [layer (unchecked-remainder (int idx) (int num-layers)) layer-freqs (nth all-freqs layer) block (aget blocks idx) old-count (get layer-freqs block 0)] (assoc! layer-freqs block (inc old-count)) all-freqs)))) On my home machine, this computes the frequencies over 99844096 blocks in about 11 seconds. My home machine is slower than the machine I used earlier, so this version should be about twice as fast as my earlier code. On Nov 6, 4:23 am, "pepijn (aka fliebel)" <pepijnde...@gmail.com> wrote: > Awesome. You managed to reproduce my initial solution, but working > with arrays all the way. It is already quite a bit faster than what I > have, but still nowhere near Python. > > I'll put it on the list for things to look at. > > On Nov 6, 1:27 am, Benny Tsai <benny.t...@gmail.com> wrote: > > > > > > > > > Oops, sorry, got my terminology wrong. The sub-arrays represent > > *layers*, not levels. So the code should actually read as follows: > > > (def num-layers 128) > > > (defn get-layer [layer-num ^bytes blocks] > > (let [size (/ (count blocks) num-layers) > > output (byte-array size)] > > (doseq [output-idx (range size)] > > (let [block-idx (+ (* output-idx num-layers) layer-num)] > > (aset output output-idx (aget blocks block-idx)))) > > output)) > > > (defn afrequencies > > [^bytes a] > > (persistent! > > (areduce a > > idx > > counts > > (transient {}) > > (let [x (aget a idx)] > > (assoc! counts x (inc (get counts x 0))))))) > > > (defn freqs [^bytes blocks] > > (let [layers (map #(get-layer % blocks) (range num-layers))] > > (map afrequencies layers))) > > > On Nov 5, 4:57 pm, Benny Tsai <benny.t...@gmail.com> wrote: > > > > Here's what I have so far. The code splits blocks into 128 smaller > > > sub-arrays, each representing a level, then calls a modified version > > > of frequencies (using areduce instead of reduce) on each level. On my > > > machine, with server mode on, it takes about 20 seconds to compute the > > > frequencies for an array of 99844096 blocks. I haven't tested it on > > > your level, because I'm too lazy to get JNBT set up, but I'm curious > > > to see if it returns the correct result for you. > > > > (def num-levels 128) > > > > (defn get-level [level-num ^bytes blocks] > > > (let [size (/ (count blocks) num-levels) > > > output (byte-array size)] > > > (doseq [output-idx (range size)] > > > (let [block-idx (+ (* output-idx num-levels) level-num)] > > > (aset output output-idx (aget blocks block-idx)))) > > > output)) > > > > (defn afrequencies > > > [^bytes a] > > > (persistent! > > > (areduce a > > > idx > > > counts > > > (transient {}) > > > (let [x (aget a idx)] > > > (assoc! counts x (inc (get counts x 0))))))) > > > > (defn freqs [^bytes blocks] > > > (let [levels (map #(get-level % blocks) (range num-levels))] > > > (map afrequencies levels))) > > > > user=> (def blocks (byte-array 99844096)) > > > #'user/blocks > > > user=> (time (count (freqs blocks))) > > > "Elapsed time: 20160.780769 msecs" > > > 128 > > > > On Nov 4, 3:28 pm, Pepijn de Vos <pepijnde...@gmail.com> wrote: > > > > > Hi all, > > > > > I have written a Python script to analyze Minecraft levels and render a > > > > graph. Then I did the same with Clojure. It takes Python 10 seconds to > > > > analyze a map, while it takes Clojure over a minute. > > > > > After having tried different options without any significant > > > > improvement, I am lost as to why there is such a huge difference. I > > > > wouldn't mind an extra pair of eyes/brains to look at this. > > > > > I blogged about it in more detail > > > > here:http://pepijndevos.nl/clojure-versus-python > > > > Clojure version:https://github.com/pepijndevos/Clomian/ > > > > Python version:https://github.com/l0b0/mian > > > > > Clojure spends most of its time in the freqs function, here are a > > > > couple of variations:https://gist.github.com/663096 > > > > > If you want to run the code yourself, you'll need a Minecraft level and > > > > JNBT, which is not on Maven. > > > > JNBT:http://jnbt.sourceforge.net/ > > > > The level used in the > > > > blogpost:http://dl.dropbox.com/u/10094764/World2.zip > > > > > Groeten, > > > > Pepijn de Vos > > > > -- > > > > Sent from my iPod Shufflehttp://pepijndevos.nl -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en