Re: Having trouble getting full performance from a quad-core with trivial code

2010-06-04 Thread Zak Wilson
I have some new data that suggests there are issues inherent to pmap and possibly other parallelism with Clojure on older Intel quad+ core machines. I added a noop loop to the benchmark. It looks like this: (defn noops [n] (when (> n 0) (recur (- n 1)))) Running those in parallel is also n

Re: Having trouble getting full performance from a quad-core with trivial code

2010-06-03 Thread Zak Wilson
> It seems very weird that my version of fac changes performance > characteristics on my machine and not yours (OS/hardware dependent?). > Can you tell your hardware configuration, esp. number of physical and > logical cores? It's an early Mac Pro with two dual-core Xeon 5150s, 5 GB RAM, Mac OS 10.

Re: Having trouble getting full performance from a quad-core with trivial code

2010-06-03 Thread ka
Hi Zak, It seems very weird that my version of fac changes performance characteristics on my machine and not yours (OS/hardware dependent?). Can you tell your hardware configuration, esp. number of physical and logical cores? I am planning next to leave out using pmap and just try to run the thing

Re: Having trouble getting full performance from a quad-core with trivial code

2010-06-02 Thread Zak Wilson
ka, I ran some more tests, including partition-work and your version of fac. I also ran some code from http://shootout.alioth.debian.org in both C and Java. On these 10-element sequences, partition-work seems to be a few tens of milliseconds slower than partition-all. It does look generally useful
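A minimal sketch of the chunking idea being compared here, for readers who have not seen it: split the input into fixed-size chunks with partition-all and hand each chunk to pmap as a single task. The name chunked-pmap and the chunk size are assumptions made for illustration; the thread's partition-work function is not shown in this listing, and this is not it.

```clojure
;; Hypothetical helper, not from the thread: parallelize over chunks
;; instead of over individual elements.
(defn chunked-pmap
  "Like (map f coll), but splits coll into chunks of chunk-size and
  processes the chunks in parallel, one pmap task per chunk."
  [f chunk-size coll]
  (->> (partition-all chunk-size coll)
       (pmap #(doall (map f %)))   ; doall forces each chunk inside its worker
       (apply concat)))            ; stitch the chunk results back together in order

;; Usage: ten elements in chunks of three gives four tasks (3, 3, 3, 1).
(println (chunked-pmap inc 3 (range 10)))
```

Larger chunks reduce per-task overhead but can leave cores idle when the chunk count does not divide evenly across them, which may matter on sequences as short as ten elements.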

Re: Having trouble getting full performance from a quad-core with trivial code

2010-06-01 Thread ka
Hi Zak, I tried your example on my i7 (4 physical cores, 8 logical); here are the results - 1:298 user=> (time (do (doall (map fac (take 10 (repeat 5 nil)) "Elapsed time: 54166.665145 msecs" 1:300 user=> (time (do (doall (pmap fac (take 10 (repeat 5 nil)) "Elapsed time: 27418.263
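For anyone wanting to repeat this kind of comparison, here is a rough sketch of a map versus pmap timing. The fac below is a hypothetical stand-in, since the definition used in the thread is not shown in this listing; the job size of 2000 and the helper elapsed-ms are likewise assumptions. It uses *' for arbitrary-precision multiplication, which needs Clojure 1.3 or later.

```clojure
;; Hypothetical stand-in for the thread's `fac`; the original is not shown here.
(defn fac
  "Factorial of n using arbitrary-precision arithmetic."
  [n]
  (reduce *' 1 (range 1 (inc n))))

(defn elapsed-ms
  "Calls the zero-argument function f once and returns wall-clock milliseconds."
  [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (- (System/nanoTime) start) 1e6)))

;; Ten identical jobs; the size 2000 is arbitrary and chosen only for illustration.
(def inputs (repeat 10 2000))

;; doall forces the lazy sequences so the work happens inside the timed region.
(def serial-ms   (elapsed-ms #(doall (map fac inputs))))
(def parallel-ms (elapsed-ms #(doall (pmap fac inputs))))

(println "map:" serial-ms "ms, pmap:" parallel-ms "ms, logical cores:"
         (.availableProcessors (Runtime/getRuntime)))
```

When run as a script rather than at the REPL, calling (shutdown-agents) at the end lets the JVM exit promptly, because pmap runs its work on the agent thread pool. A single timed run like this is noisy, so repeating it several times after JIT warm-up gives more trustworthy numbers.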

Re: Having trouble getting full performance from a quad-core with trivial code

2010-05-30 Thread Zak Wilson
Heinz - playing with the size of the number doesn't have much effect, except that when it becomes very small, parallelization overhead eventually exceeds compute time. Lee - Parallel GC slowed it down by 3 seconds on the four core benchmark.

Re: Having trouble getting full performance from a quad-core with trivial code

2010-05-30 Thread Lee Spector
Zak, This may not be your main issue and I haven't done enough testing with my own code to know if it's even my main issue, but I've found that things appear to go better for me on multicore machines if I invoke java with the -XX:+UseParallelGC option. -Lee On May 30, 2010, at 12:31 PM, Zak

Re: Having trouble getting full performance from a quad-core with trivial code

2010-05-30 Thread Heinz N. Gies
On May 30, 2010, at 18:31 , Zak Wilson wrote: > I'm running Clojure code on an early Mac Pro with OS X 10.5 and Java > 1.6. It has two dual-core Xeon 5150s and 5GB of memory. Just an idea, two dual cores != 4 cores. Parallelism on more than one CPU is always slower than on one CPU with multiple c

Having trouble getting full performance from a quad-core with trivial code

2010-05-30 Thread Zak Wilson
I'm running Clojure code on an early Mac Pro with OS X 10.5 and Java 1.6. It has two dual-core Xeon 5150s and 5GB of memory. I'm not getting the performance I expected despite top reporting 390% steady-state CPU use, so I wrote some trivial tests to see if I was actually getting the benefit of all