Re: Pure-functional N-body benchmark implementation

2009-08-18 Thread Brad Beveridge
On 2009-08-17, at 8:58 PM, FFT fft1...@gmail.com wrote: On Mon, Aug 17, 2009 at 9:25 AM, Bradbev brad.beveri...@gmail.com wrote: On Aug 17, 1:32 am, Nicolas Oury nicolas.o...@gmail.com wrote: I was referring to the rules of the benchmark game. When you benchmark language, using

Re: Pure-functional N-body benchmark implementation

2009-08-18 Thread FFT
On Mon, Aug 17, 2009 at 9:25 AM, Bradbev brad.beveri...@gmail.com wrote: On Aug 17, 1:32 am, Nicolas Oury nicolas.o...@gmail.com wrote: I was referring to the rules of the benchmark game. When you benchmark language, using another language is not fair. If you were to do your own program, of

Re: Pure-functional N-body benchmark implementation

2009-08-18 Thread Aaron Cohen
On Tue, Aug 18, 2009 at 11:28 AM, Brad Beveridge brad.beveri...@gmail.com wrote: On 2009-08-17, at 8:58 PM, FFT fft1...@gmail.com wrote: On Mon, Aug 17, 2009 at 9:25 AM, Bradbev brad.beveri...@gmail.com wrote: Ah, that makes more sense re the cheating then.  Your insight for array range

Re: Pure-functional N-body benchmark implementation

2009-08-18 Thread Aaron Cohen
On Tue, Aug 18, 2009 at 3:32 PM, Aaron Cohen remled...@gmail.com wrote: On Tue, Aug 18, 2009 at 11:28 AM, Brad Beveridge brad.beveri...@gmail.com wrote: On 2009-08-17, at 8:58 PM, FFT fft1...@gmail.com wrote: On Mon, Aug 17, 2009 at 9:25 AM, Bradbev brad.beveri...@gmail.com wrote: Ah, that

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Nicolas Oury
I was referring to the rules of the benchmark game. When you benchmark a language, using another language is not fair. If you were writing your own program, of course you could use Java. However, in this particular circumstance, it is a bit annoying to use Java just to create a data structure type.

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Mark Engelberg
Here's what I've learned from following this benchmark thread: From the various things I've read about Clojure's performance, I've always had this sense that: a) if you have a performance problem, there's probably some inner loop that needs to be optimized, and so b) you can use Clojure's

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread David Nolen
On Mon, Aug 17, 2009 at 4:32 AM, Nicolas Oury nicolas.o...@gmail.com wrote: I was referring to the rules of the benchmark game. When you benchmark language, using another language is not fair. If you were to do your own program, of course you could use Java. However, in the particular

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread e
i don't know much about this (haven't followed closely, lately), but do the new Transients come into play to somewhat address this? Sounds like they were designed just for this sort of thing: inner-loop optimization and low-level mutation that still works functionally to everything outside... On
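For readers who have not used them, here is a minimal sketch of the transient pattern the message above alludes to: mutation confined to an inner loop, with an ordinary persistent vector handed back to the caller. The function name `squares` is made up for illustration and does not come from the thread, and nothing here is meant to suggest that transients would close the gap on this particular benchmark.

```clojure
;; Hypothetical example: build a vector of squares using a transient.
;; The transient is only visible inside the loop; callers see a normal
;; immutable vector.
(defn squares [n]
  (loop [i 0
         acc (transient [])]
    (if (< i n)
      (recur (inc i) (conj! acc (* i i)))   ; in-place append on the transient
      (persistent! acc))))                  ; freeze back to a persistent vector
```

Calling `(squares 4)` returns a regular vector, so code outside the function cannot tell that mutation was involved.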

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Nicolas Oury
On this particular example, I think we are a bit further than what Transients currently offer. Even using a mutable primitive Java array results in code 2 or 3 times slower than the Java implementation of the benchmarks. I have no doubt the struct and transients in Clojure will allow to do that

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread David Nolen
On Sun, Aug 16, 2009 at 6:50 AM, Nicolas Oury nicolas.o...@gmail.com wrote: Dear all, The good news: I have a version of the N-body benchmark that goes as fast as java. The bad news: I am cheating a little bit... You're only cheating if you care about the fantasy world that is

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Bradbev
On Aug 17, 1:32 am, Nicolas Oury nicolas.o...@gmail.com wrote: I was referring to the rules of the benchmark game. When you benchmark language, using another language is not fair. If you were to do your own program, of course you could use Java. However, in the particular circumstance, it is

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Nicolas Oury
Seems to mean that I was wrong and that the cost is in both the bounds check and unpacking the indices, mostly the latter. On Mon, 2009-08-17 at 09:25 -0700, Bradbev wrote: On Aug 17, 1:32 am, Nicolas Oury nicolas.o...@gmail.com wrote: I was referring to the rules of the benchmark game. When

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Aaron Cohen
On Mon, Aug 17, 2009 at 7:45 PM, Mark Engelberg mark.engelb...@gmail.com wrote: On Mon, Aug 17, 2009 at 9:25 AM, Bradbev brad.beveri...@gmail.com wrote: I found another 2-3x speed up by coercing the indexes with (int x), ie (defmacro mass [p] `(double (aget ~p (int 0)))) Which makes me
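A self-contained sketch of the accessor-macro idea quoted above, for anyone who wants to try it. The `mass` macro follows the snippet in the message; the helper `total-mass` and the choice of slot 0 as the mass field are assumptions made for illustration. Whether the `(int 0)` coercion still buys anything depends on the Clojure version, so treat the 2-3x figure as specific to the setup discussed in this thread.

```clojure
(set! *warn-on-reflection* true)

;; Accessor macro as quoted in the thread: read slot 0 of a double array,
;; coercing the index to a primitive int.
(defmacro mass [p]
  `(double (aget ~p (int 0))))

;; Hypothetical helper: the ^doubles hints on the arguments let the
;; expanded aget calls compile without reflection.
(defn total-mass [^doubles a ^doubles b]
  (+ (mass a) (mass b)))
```

Because `mass` is a macro, the array access is expanded at each call site rather than going through a function call.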

Re: Pure-functional N-body benchmark implementation

2009-08-16 Thread Nicolas Oury
Dear all, The good news: I have a version of the N-body benchmark that goes as fast as Java. The bad news: I am cheating a little bit... As I suspected that a lot of time was spent in the array bounds-check arithmetic, I replaced #^doubles in the implementation of body by an object implemented

Re: Pure-functional N-body benchmark implementation

2009-08-16 Thread Meikel Brandmeyer
Hi, Am 16.08.2009 um 12:50 schrieb Nicolas Oury: The bad news: I am cheating a little bit... Why is this cheating? People wrote programs in C and dropped down to Assembly if necessary. People write programs in Python and drop down to C if necessary. Why can't we write programs in Clojure

Re: Pure-functional N-body benchmark implementation

2009-08-16 Thread Bradbev
Why can't we write programs in Clojure and drop down to Java if necessary? That's what I find funny about these threads, Clojure's Java interop is good, Java is easy to write performant code in. There is a clear path to getting the best JVM performance possible from a Clojure environment.

Re: Pure-functional N-body benchmark implementation

2009-08-13 Thread Nicolas Oury
-XX:+AggressiveOpts improves another 5-10%. EscapeAnalysis seems more important than BiasedLocking. I don't have a disassembling module installed. Could someone use the PrintAssembly option and put the asm for the JITed method somewhere? It could be interesting to see it side by side with the

Re: Pure-functional N-body benchmark implementation

2009-08-12 Thread Nicolas Oury
Hello, I tried to inline everything in the main loop (the updater loops) and obtained a 15% speed-up on my machine. One possible slowdown may come from having arrays rather than objects. Maybe each access needs to perform a size check on the array, which is not very costly but not negligible

Re: Pure-functional N-body benchmark implementation

2009-08-12 Thread Aaron Cohen
I'm getting a very significant performance improvement by adding a couple of JVM parameters (using jdk 1.6.0_14). They are: -XX:+DoEscapeAnalysis -XX:+UseBiasedLocking (I think the -server flag is required for those two flags to do anything). My runtime with n = 5,000,000 goes from ~7.5 seconds

Re: Pure-functional N-body benchmark implementation

2009-08-12 Thread Aaron Cohen
On Wed, Aug 12, 2009 at 4:49 PM, Aaron Cohen remled...@gmail.com wrote: I'm getting a very significant performance improvement by adding a couple of JVM parameters (using jdk 1.6.0_14).  They are: -XX:+DoEscapeAnalysis -XX:+UseBiasedLocking (I think the -server flag is required for those two

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Andy Fingerhut
On Aug 10, 5:57 pm, Mark Engelberg mark.engelb...@gmail.com wrote: Andy, My understanding is that any double that gets stored in a vector or map is boxed, and therefore, the vast majority of your double conversions aren't really doing anything, because when you pull them out of the vector
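The boxing point quoted above can be checked at the REPL. The sketch below is only an illustration under assumed names (`position`, `boxed?` and `sum-array` are not from the thread): a double stored in a persistent vector comes back as a `java.lang.Double` object, whereas a Java `double[]` holds primitives that can be summed without boxing each element.

```clojure
;; Hypothetical data: a persistent vector stores its elements as Objects.
(def position [1.5 2.5 3.5])

;; True when x is a boxed java.lang.Double rather than something else.
(defn boxed? [x]
  (instance? Double x))

;; Summing a primitive double array; areduce walks the array by index.
(defn sum-array [^doubles xs]
  (areduce xs i acc 0.0 (+ acc (aget xs i))))
```

For example, `(boxed? (nth position 0))` is true, and `(sum-array (double-array position))` adds up the same three values from a primitive array.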

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Mark Engelberg
On Mon, Aug 10, 2009 at 11:15 PM, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: I suspect I'm doing something wrong in my mutable Java array implementation, but I don't see what it could be. There still seems to be a lot of boxing and unboxing going on. For example, in: (let [[momx momy

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Jonathan Smith
On Aug 10, 11:08 pm, fft1976 fft1...@gmail.com wrote: On Aug 10, 2:19 pm, Jonathan Smith jonathansmith...@gmail.com wrote: 1.) use something mutable 2.) unroll all the loops (mapping is a loop) 3.) try not to coerce between seq/vec/hash-map too much. Are you saying this w.r.t. my code

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Andy Fingerhut
On Aug 10, 11:33 pm, Mark Engelberg mark.engelb...@gmail.com wrote: On Mon, Aug 10, 2009 at 11:15 PM, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: I suspect I'm doing something wrong in my mutable Java array implementation, but I don't see what it could be. There still seems to be

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Christophe Grand
Hi Andy, On Tue, Aug 11, 2009 at 8:15 AM, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: I've tried an approach like you suggest, using mutable Java arrays of doubles, macros using aget / aset-double for reading and writing these arrays, and loop/recur everywhere iteration is needed in

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Andy Fingerhut
On Aug 10, 11:50 pm, Christophe Grand christo...@cgrand.net wrote: Hi Andy, On Tue, Aug 11, 2009 at 8:15 AM, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: I've tried an approach like you suggest, using mutable Java arrays of doubles, macros using aget / aset-double for reading and

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Mark Engelberg
On Tue, Aug 11, 2009 at 12:39 AM, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: Wow, you ain't kiddin.  I changed about 10 lines from my last version, to avoid using aset-double, using aset and type hints until the reflection warnings went away, and it sped up by a factor of 10.  I'm
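A small sketch of the change described above, i.e. plain `aset` on a type-hinted array instead of `aset-double`, with reflection warnings switched on so that any missing hint shows up at compile time. The function name `scale-velocity!` is invented for illustration; this is not the code from the benchmark repository, and the factor-of-10 figure belongs to the poster's setup.

```clojure
(set! *warn-on-reflection* true)

;; Hypothetical helper: multiply every element of a double array by k, in place.
;; The ^doubles hint lets aget/aset compile to direct array access.
(defn scale-velocity! [^doubles v k]
  (let [k (double k)]                       ; coerce once, outside the loop
    (dotimes [i (alength v)]
      (aset v i (* k (aget v i))))
    v))
```

If the `^doubles` hint is removed, the compiler should print reflection warnings for the array calls, which is the signal the poster used to find the slow spots.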

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 10, 11:42 pm, Jonathan Smith jonathansmith...@gmail.com wrote: The way your code is setup, you will spend a lot of time in funcall overhead just because you used a lot of functions instead of doing the calculation in bigger chunks. I thought, as I understood from Rich's lectures, JVM

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 12:39 am, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: http://github.com/jafingerhut/clojure-benchmarks/blob/9dc56d8ff53f0b8... Why isn't the array-using version as fast as Java? Shouldn't using Java's data structures, mutation and no reflection be equivalent to

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 12:39 am, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: On Aug 10, 11:50 pm, Christophe Grand christo...@cgrand.net wrote: Hi Andy, On Tue, Aug 11, 2009 at 8:15 AM, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: I've tried an approach like you suggest, using

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 2:25 am, fft1976 fft1...@gmail.com wrote: Hmmm I just ran your version #8, and it's almost as slow as mine (nbody_v2.clj): 53 times slower than Java, but I'm running Clojure 1.0 and Strike that. I f'ed up the namespaces and was actually measuring my own version. Yours is 8x

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Jonathan Smith
On Aug 11, 4:42 am, fft1976 fft1...@gmail.com wrote: On Aug 10, 11:42 pm, Jonathan Smith jonathansmith...@gmail.com wrote: The way your code is setup, you will spend a lot of time in funcall overhead just because you used a lot of functions instead of doing the calculation in bigger

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 4:50 am, Jonathan Smith jonathansmith...@gmail.com wrote: I don't think you have to put *everything* in the let, just your constants. (so days per year and solar mass, the bodies themselves). How will they escape from the LET though? I see that in your code everything is inside a

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Jonathan Smith
On Aug 11, 2:43 pm, fft1976 fft1...@gmail.com wrote: On Aug 11, 4:50 am, Jonathan Smith jonathansmith...@gmail.com wrote: I don't think you have to put *everything* in the let, just your constants. (so days per year and solar mass, the bodies themselves). How will they escape from the

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Andy Fingerhut
In case it matters to anyone, my intent in creating these Clojure programs to compare their speed to others isn't to try to rip into Clojure, or start arguments. It is for me to get my feet wet with Clojure, and perhaps produce some examples that others can learn from on what performs well in

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Aaron Cohen
On Tue, Aug 11, 2009 at 5:26 PM, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: In case it matters to anyone, my intent in creating these Clojure programs to compare their speed to others isn't to try to rip into Clojure, or start arguments.  It is for me to get my feet wet with Clojure,

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 2:26 pm, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: As always, suggestions or improved versions are welcome. I noticed that when I wrap ~new-mass in (double ...) in this (defmacro set-mass! [p new-mass] `(aset ~p 0 ~new-mass)) and other setters, I get warnings.

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Andy Fingerhut
On Aug 11, 2:36 pm, Aaron Cohen remled...@gmail.com wrote: At that point is it possible you're just paying the price of PersistentVector for the bodies vector?  Does it improve much if you change bodies to an array? About 7% faster changing bodies to a Java array of java.lang.Object's, each

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Aaron Cohen
On Tue, Aug 11, 2009 at 8:13 PM, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: On Aug 11, 2:36 pm, Aaron Cohen remled...@gmail.com wrote: At that point is it possible you're just paying the price of PersistentVector for the bodies vector?  Does it improve much if you change bodies to an

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Jarkko Oranen
On Aug 10, 12:41 pm, fft1976 fft1...@gmail.com wrote: I just uploaded to the group an implementation of the n-body benchmark in Clojure (see nbody_init.clj) http://shootout.alioth.debian.org/u32/benchmark.php?test=nbody&lang=j... My goal was to write a pure-functional version and to avoid any

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread fft1976
On Aug 10, 4:46 am, Jarkko Oranen chous...@gmail.com wrote: I'm not going to start optimising, Somebody'd better! You always hear this dogma that one should write elegant code first and optimize later, and when you do that, a few little changes can make Clojure as fast as Java. Here's your

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Andy Fingerhut
On Aug 10, 11:35 am, fft1976 fft1...@gmail.com wrote: On Aug 10, 4:46 am, Jarkko Oranen chous...@gmail.com wrote: I'm not going to start optimising, Somebody'd better! You always hear this dogma that one should write elegant code first and optimize later, and when you do that, a few

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Andy Fingerhut
On Aug 10, 2:19 pm, Jonathan Smith jonathansmith...@gmail.com wrote: 1.) use something mutable 2.) unroll all the loops (mapping is a loop) 3.) try not to coerce between seq/vec/hash-map too much. in real world, stuff like the shootout is pretty useless, as generally you'd reach for a

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Isaac Gouy
On Aug 10, 3:00 pm, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: On Aug 10, 2:19 pm, Jonathan Smith jonathansmith...@gmail.com wrote: 1.) use something mutable 2.) unroll all the loops (mapping is a loop) 3.) try not to coerce between seq/vec/hash-map too much. in real world,

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread fft1976
On Aug 10, 5:15 pm, Andy Fingerhut andy_finger...@alum.wustl.edu wrote: OK, I've got a new Clojure program for the n-body benchmark, and it is significantly faster than my previous one -- down from 138 x Java run time, to 37 x Java run time.  Still room for improvement somewhere there, I'm

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread fft1976
On Aug 10, 2:19 pm, Jonathan Smith jonathansmith...@gmail.com wrote: 1.) use something mutable 2.) unroll all the loops (mapping is a loop) 3.) try not to coerce between seq/vec/hash-map too much. Are you saying this w.r.t. my code or in general? If the former, be specific, better yet, show