Re: loneclojurian at ICFP programming contest

2009-07-07 Thread Jon Harrop

On Tuesday 07 July 2009 02:08:57 Bradbev wrote:
 On Jul 6, 4:30 pm, fft1976 fft1...@gmail.com wrote:
  On Jul 5, 11:42 pm, Bradbev brad.beveri...@gmail.com wrote:
   more to modern x86 chips.  After you have the best algorithm for the
   job, you very quickly find that going fast is entirely bound by memory
   speed (actually latency) - cache misses are the enemy.
 
  IME (outside JVM), this depends strongly on the kind of problem you
  are solving as well as your implementation (you need to know how to
  cache-optimize). One can easily think of problems that would fit
  entirely in cache, but take an enormous amount of time.

 What sort of problems did you have in mind?  Anything that I can think
 of quickly spills the cache.

There are many examples in scientific computing where many small problems are 
attacked instead of one large problem. For example, practical use of FFTs 
fall into this category with most users performing many transforms with no 
more than 1,024 elements rather than fewer longer transforms. Time frequency 
analysis usually takes n samples and produces an nxn grid over time and 
frequency representing the signal where each frequency is computed from a 
separate FFT. So you can get away with naive distribution of FFTs across 
cores with no regard for cache coherence and still get very good performance.

This is somewhat reflected in the SciMark2 benchmark, which has small and 
large variants. Most people are interested in the in-cache small variant 
because it is more practically relevant. Cache coherence is still relevant in 
some of the subtasks even in the small case but it is a much smaller 
effect.

I am in the process of reimplementing stock numerical algorithms (Cholesky, 
LU, QR, Eigenvalues) for our F# for Numerics product, written entirely in F# 
and parallelized using the TPL. Surprisingly, some of my F# code is 3x faster 
than the equivalent MATLAB for small (e.g. 100x100 matrices) problems even 
though MATLAB is calling directly into vendor-tuned code. For larger problems 
(e.g. 1000x1000 matrices), MATLAB is usually 10x faster because it is using 
code that was carefully optimized for cache issues.

I would actually say that intercore cache effects are more important than 
conventional cache coherence today because you get massive performance 
degradation if you cock it up and it is not at all obvious when that might 
occur because it depends upon things like where the allocator places your 
data. For example, if you have cores mutating global counters then you must 
make sure they are spaced out enough in memory that none share cache lines.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-07 Thread Jon Harrop

On Sunday 05 July 2009 23:19:31 fft1976 wrote:
 On Jul 5, 10:53 am, igorrumiha igorrum...@gmail.com wrote:
  I think it's safe to say that once again it's proved that Clojure
  easily matches the Java level of performance.

 I think one shouldn't generalize from one [unverified] example.

 Personally, I'll wait for Jon Harrop or someone to port the relevant
 Shootout benchmarks or his Ray tracing benchmark to Clojure and see
 what time they get and what the code looks like.

That's a fantastic idea! Let's try porting my ray tracer to Clojure.

Incidentally, Java's performance was extremely variable on the ray tracer 
benchmark. All of the other submissions I received were substantially slower 
than the one currently on the site. I forget why but I remember thinking it 
was an innocuous-looking tweak at the time...

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-07 Thread Bradbev

On Jul 7, 6:23 am, Jon Harrop j...@ffconsultancy.com wrote:
 On Tuesday 07 July 2009 02:08:57 Bradbev wrote:

  On Jul 6, 4:30 pm, fft1976 fft1...@gmail.com wrote:
   On Jul 5, 11:42 pm, Bradbev brad.beveri...@gmail.com wrote:
more to modern x86 chips.  After you have the best algorithm for the
job, you very quickly find that going fast is entirely bound by memory
speed (actually latency) - cache misses are the enemy.

   IME (outside JVM), this depends strongly on the kind of problem you
   are solving as well as your implementation (you need to know how to
   cache-optimize). One can easily think of problems that would fit
   entirely in cache, but take an enormous amount of time.

  What sort of problems did you have in mind?  Anything that I can think
  of quickly spills the cache.

 There are many examples in scientific computing where many small problems are
 attacked instead of one large problem. For example, practical use of FFTs
 fall into this category with most users performing many transforms with no
 more than 1,024 elements rather than fewer longer transforms. Time frequency
 analysis usually takes n samples and produces an nxn grid over time and
 frequency representing the signal where each frequency is computed from a
 separate FFT. So you can get away with naive distribution of FFTs across
 cores with no regard for cache coherence and still get very good performance.

Interesting, I've never dealt with problems in this domain - most of
my performance problems involve relatively simple transforms over
streams of data.  It would be quite a different mindset to program for
problems that fit entirely in cache, it would be fun to try  squeeze
the theoretical power out of a chip.  I guess at those levels you're
mostly concerned about preventing pipeline stalls  instruction
conflicts.  I think I'd pretty much go for handwritten assembler at
those levels.

 I would actually say that intercore cache effects are more important than
 conventional cache coherence today because you get massive performance
 degradation if you cock it up and it is not at all obvious when that might
 occur because it depends upon things like where the allocator places your
 data. For example, if you have cores mutating global counters then you must
 make sure they are spaced out enough in memory that none share cache lines.
Effects like that  cache line aliasing are difficult to diagnose
without good tools.  Not to mention that in most machines you have an
OS scheduling other threads  polluting your cache.

Brad


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-07 Thread fft1976

On Jul 6, 6:08 pm, Bradbev brad.beveri...@gmail.com wrote:
 On Jul 6, 4:30 pm, fft1976 fft1...@gmail.com wrote: On Jul 5, 11:42 pm, 
 Bradbev brad.beveri...@gmail.com wrote:

   more to modern x86 chips.  After you have the best algorithm for the
   job, you very quickly find that going fast is entirely bound by memory
   speed (actually latency) - cache misses are the enemy.

  IME (outside JVM), this depends strongly on the kind of problem you
  are solving as well as your implementation (you need to know how to
  cache-optimize). One can easily think of problems that would fit
  entirely in cache, but take an enormous amount of time.

 What sort of problems did you have in mind?  Anything that I can think
 of quickly spills the cache.


For an extreme example, just about any NP-complete, NP-hard, EXP-hard
problem may do. Who wins at checkers assuming perfect play: black or
white? Answering this requires little memory and lots of time.

Of course, a program does not need to fit in cache not to be memory-
bound.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-06 Thread fft1976

On Jul 5, 11:42 pm, Bradbev brad.beveri...@gmail.com wrote:

 more to modern x86 chips.  After you have the best algorithm for the
 job, you very quickly find that going fast is entirely bound by memory
 speed (actually latency) - cache misses are the enemy.

IME (outside JVM), this depends strongly on the kind of problem you
are solving as well as your implementation (you need to know how to
cache-optimize). One can easily think of problems that would fit
entirely in cache, but take an enormous amount of time.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-06 Thread Bradbev

On Jul 6, 4:30 pm, fft1976 fft1...@gmail.com wrote:
 On Jul 5, 11:42 pm, Bradbev brad.beveri...@gmail.com wrote:

  more to modern x86 chips.  After you have the best algorithm for the
  job, you very quickly find that going fast is entirely bound by memory
  speed (actually latency) - cache misses are the enemy.

 IME (outside JVM), this depends strongly on the kind of problem you
 are solving as well as your implementation (you need to know how to
 cache-optimize). One can easily think of problems that would fit
 entirely in cache, but take an enormous amount of time.
What sort of problems did you have in mind?  Anything that I can think
of quickly spills the cache.

Cheers,
Brad
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-05 Thread Nicolas Oury
Hello,

Thanks to Rich's advices, I managed to make your ICFP program run very fast.
I have made a few modifications:
- changed the code following Rich's advices in order to have primitive array
access everywhere.
- I didn't managed to do it for booleans, so I used a java object
conataining only a public boolean for StatusReg.
- The biggest speep-up was the chunking of the generated program. It seems
that the fact that the program was so big prevented
the JIT to compile it. So I changed the code to emit it by chunk of around
50 guest VM instructions. (after a few tries, it seemed this was a good
figure. On my computer, the performance seems to be highly sensitive on the
size of the chunk. I don't really understand why and it worries me a bit on
the correctness of my program. )
To do so, I create intermediate local functions that I call.

- I generate the program at compile time, in order to only mesure execution
time and not compilation time.


After, when I run the benchmark in -server with a big enough CacheCode area
(1000m), and enough iterations to have everything JITed, I get more than
860.000 iterations per second. (I benchmarked 100 000 000 iterations in 121
sec, on my 2.4GHz computer).

When you look at the profile, 98.7% is spent in compiled code, 1.3% in
interpreted and 0% in stub. This proves that input program actually get
translated to Clojure, compiled to byte code, recompiled to native
instructions and then executed natively.

This is so fast, that I actually wonder whether too many things get
optimized away and the program is not benchmarking anything anymore. Or
maybe my code transformation is completly wrong and throws everything away.

igorrumiha, I am going to send you the source in a private mail so you can
check whether the program still works, and put it in your git hub if it
does.
(I don't know what is the policy of this list regarding attaching files.)

If it does work, and I have not made some stupid mistakes, is is a nice
example of program where Clojure's performance can compete with C and even
outperform it, because you can write easily a better implementation.
 (The code I wrote is really ugly, because it was made as multiple hacks,
but I think with more experience than me and more cleverness in the design,
you can keep the code really expressive and clear, while achieving the same
level of performance.).

Best,

Nicolas.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-05 Thread fft1976

On Jul 5, 2:31 am, Nicolas Oury nicolas.o...@gmail.com wrote:

 After, when I run the benchmark in -server with a big enough CacheCode area
 (1000m), and enough iterations to have everything JITed, I get more than
 860.000 iterations per second. (I benchmarked 100 000 000 iterations in 121
 sec, on my 2.4GHz computer).

That's 3000 clock cycles per VM instruction? I'm not very familiar
with the problem, but I thought straight C bytecode interpreters
were at around 30 and compiling the VM code to native code (with JIT)
reduced it to 7. Is this right?

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-05 Thread Nicolas Oury
Each iteration contains 266 instructions. So, it is more like 10.5 clock
cycle per instructions.
Probably the cost of the method call, that I had to introduce in order to
have the JIT compile, or the fact that the status register is
not directly in a local variable but with an indirection, because we can't
set the value of a local variable directly.
 I could improve that by threading the status reg as a local parameter with
let instructions, but that would mean change a lot of things in the code...
And I got lazy.

If the program is correct, of which I am not sure yet, it is already quite
good as it is quite straightforward and probably far more
readable/maintanable than the implementations in other languages.

Moreover, Clojure is young and being so close from the best solution would
already be great.

Actually, that would be so good, that I actually think there is a mistake
somewhere either in my program or my computation of the speed.
Let's wait for igorrumiha to check and to test this implementation.


Best regards,
Nicolas.





On Sun, Jul 5, 2009 at 12:18 PM, fft1976 fft1...@gmail.com wrote:


 On Jul 5, 2:31 am, Nicolas Oury nicolas.o...@gmail.com wrote:

  After, when I run the benchmark in -server with a big enough CacheCode
 area
  (1000m), and enough iterations to have everything JITed, I get more than
  860.000 iterations per second. (I benchmarked 100 000 000 iterations in
 121
  sec, on my 2.4GHz computer).

 That's 3000 clock cycles per VM instruction? I'm not very familiar
 with the problem, but I thought straight C bytecode interpreters
 were at around 30 and compiling the VM code to native code (with JIT)
 reduced it to 7. Is this right?

 


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-05 Thread igorrumiha

On Jul 5, 3:05 pm, Nicolas Oury nicolas.o...@gmail.com wrote:
 Actually, that would be so good, that I actually think there is a mistake
 somewhere either in my program or my computation of the speed.
 Let's wait for igorrumiha to check and to test this implementation.


OK, I did my tests, I found and fixed two small bugs and as far as I
can tell the VM is now working correctly. I couldn't, however,
reproduce the timings Nicolas had, on my machine (a 2 Ghz MacBook with
java 1.6 build 1.6.0_13-b03-211) I got timings between 570 and 637
seconds for 100 000 000 executions. So, this translates, on average,
to 163585 executions/second. That translates to 45.9 clock cycles per
guest VM instruction.

I used these tuning options for the Java VM:
-XX:InitialCodeCacheSize=768m
-XX:ReservedCodeCacheSize=1024m
-Xms768M
-Xmx768M
-server

I consider this an excellent result, about 2.5 times faster than my
simple Java implementation of the VM.

I can't explain the 5.2x difference between my and Nicolas' timings,
though :)

What I can say is that my quest for Orbital Virtual Machine speed is
satisfied. With a simulation time limit of 3 million seconds (set by
the contest rules) this VM now reaches that limit in ~19 seconds of
wallclock time. That's even good enough for interactive exploration of
problem solutions (I think it's very likely that it will usually take
more than 19 seconds to write a new version of a solution and then run
it in the REPL :))

I think it's safe to say that once again it's proved that Clojure
easily matches the Java level of performance.

What I've learned from this is:
1. I need to read the documentation more carefully :),
2. I need to learn how to write macros,
3. I need to get myself acquainted with the underlying JVM platform,
4. When in trouble, the clojure group is an excellent place to get
help.
5. Write lots and lots of Clojure code, that is the only way to get
enough experience for the next ICFP contest :)

Thanks a bunch everyone, especially Nicolas for being so persistent :)


p.s.
I'll update my git repo in a day or two, I need to clean up the code a
bit and port the game logic code to use the new VM.


Igor.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-05 Thread fft1976

On Jul 5, 10:53 am, igorrumiha igorrum...@gmail.com wrote:

 I think it's safe to say that once again it's proved that Clojure
 easily matches the Java level of performance.

I think one shouldn't generalize from one [unverified] example.

Personally, I'll wait for Jon Harrop or someone to port the relevant
Shootout benchmarks or his Ray tracing benchmark to Clojure and see
what time they get and what the code looks like.

For example, some people claim that Haskell is as fast as C, while
also being very high-level, but when you look at the Shootout, you'll
see that it's BS: they are doing pointer arithmetic in Haskell, for
Christ's sake!

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-04 Thread fft1976



On Jul 3, 5:52 pm, Jon Harrop j...@ffconsultancy.com wrote:
 On Thursday 02 July 2009 07:58:11 you wrote:

  I wonder if Jon Harrop is still planning to write Clojure for
  Scientists or Scala for Scientists or both?

 I am certainly interested in writing both books. I reviewed Scala back in 2007
 and decided that it was not ready to be advocated. Perhaps things have
 progressed significantly since then but my impression is that Clojure is
 developing in a more productive direction and much more rapidly. I am also
 more interested in Clojure because it strives to be a genuine functional
 language rather than an OOP language with some odds and sods bolted on (Scala
 feels like a minor departure from Java to me, and I am not a Java fan. In
 fact, more like C# 3 than any real functional language) and because Clojure
 is designed with industrial use in mind rather than as an academic exercise.
 However, I have yet to give Clojure the thorough study that it deserves
 simply because I am tied up getting our F# products ready for its big release
 in 2010.

 If anyone here is interested in a Clojure book aimed at technical users
 (scientists and engineers), please let me know.

I'd probably be interested if Clojure lived up to the promise of being
as fast as anything on JVM, and JVM lived up to the promise of being
as fast as C++, but I'm afraid that's not happening, as this thread
indicates.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-04 Thread Nicolas Oury
Hello,

after experimenting a bit, it seems that you can get a small gain by putting
things in local variables (25% or so) but not much.

I looked at the implementation and I was assuming something that is not
true:
aset-double does not compile to JVM instructions for setting the array
directly.
It calls a method setDouble. I think the big difference between Clojure and
java comes from that.
When you write r[i] = ... in java, I believe it is compiled to not much more
than a few bytecode instructions to arrange the stack and
one to store in an array.

When we do the same in Clojure, we call a function that call a method, that
does that. As setting an array is very quick, even the slight overhead of
calling a method makes the program far slower, in a tight loop.

I will start another thread to ask whether this is really the problem and if
it can be solved.

Best,

Nicolas.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-04 Thread Rich Hickey

On Sat, Jul 4, 2009 at 4:50 AM, Nicolas Ourynicolas.o...@gmail.com wrote:
 Hello,

 after experimenting a bit, it seems that you can get a small gain by putting
 things in local variables (25% or so) but not much.

 I looked at the implementation and I was assuming something that is not
 true:
 aset-double does not compile to JVM instructions for setting the array
 directly.
 It calls a method setDouble. I think the big difference between Clojure and
 java comes from that.
 When you write r[i] = ... in java, I believe it is compiled to not much more
 than a few bytecode instructions to arrange the stack and
 one to store in an array.

 When we do the same in Clojure, we call a function that call a method, that
 does that. As setting an array is very quick, even the slight overhead of
 calling a method makes the program far slower, in a tight loop.

 I will start another thread to ask whether this is really the problem and if
 it can be solved.


With -server, HotSpot inlines the calls to static methods generated by
aset on arrays of primitives to the exact same machine instructions.

People that want to match the performance of Java on arrays of
primitives need to follow the advice given here carefully:

http://clojure.org/java_interop#primitives

http://clojure.org/java_interop#optimization

In particular, aget/aset are overloaded for arrays of primitives.
aset-xxx are not.

Rich

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-03 Thread igorrumiha

On Jul 2, 7:13 pm, Nicolas Oury nicolas.o...@gmail.com wrote:
 Hello,
 I just have a look at your code. I was wondering why reflect.Array/getDouble
 was faster than aget?

(aget arrayname) actually makes a call to java.lang.reflect.Array/get
which looks at the type of data in the array and then does the right
thing. Calling  java.lang.reflect.Array/getDouble directly saves the
time for type resolving (I guess) and gives me a 30% speedup.

 And I realized than you create arrays of objects and not array of doubles.

 Have you tried to replace make-array by double-array?

I'm using an equivalent of double-array:

user (make-array Double/TYPE 5)
#double[] [...@7f14671e
user (double-array 5)
#double[] [...@60d8edf8
user

 You can also try to replace the arithmetic by unchecked arithmetic. (Is it
 correct from the point of view of the semantic of the VM you are
 implementing?).

I didn't get to that yet because my profiling shows that the amount of
time spent getting values from an array (even when using getDouble)
surpasses by far the time spent in any other operation. Of course, my
understanding of the JVM profiler logs may be wrong.

 Can you try to check in your profiling whether the function you generate get
 compiled or not?

Looks like my function did not get compiled, here is the java -Xprof
log: http://paste.org/8811 On line 10 you can se the method
loneclojurian.ovm2$eval__3920$fn__3922.invoke. But you can see that
only 0.5% of time is spent in that function. Scroll down to line 44
and you will see the real time wasters:

54.9% 0  +  2413java.lang.reflect.Array.getDouble
21.0% 0  +   924java.lang.reflect.Array.setDouble

Looks like the JIT compiler decided it would be a waste of time to
compile my function and I agree.  :)

 What function are you running at the top level? Is that a loop where you run
 multiple times the same program?

This is the contents of benchmark.clj:

(require 'loneclojurian.ovm2)

(in-ns 'loneclojurian.ovm2)
(def machine (init-ovm /path/to/bin1.obf))
(defn one-run [] (machine) [(aget output-ports 0)
(aget output-ports 1)
(aget output-ports 2)
(aget output-ports 3)
(aget output-ports 4)])

(time (dotimes [_ 20] (one-run)))

So, init-ovm returns a function that executes the program, then I
define the one-run function that mimics the real usage of the VM:
execute the program once and then get the values of the first 5 output
ports (this is all related to problem 1).

So, I think the problem is that this approach of emulating computer
memory is quite resource hungry. Either a way needs to be found to
make these gets and sets really cheap or I need to rethink the
problem.

--
IgorR
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-03 Thread Shawn Hoover
On Fri, Jul 3, 2009 at 9:29 AM, igorrumiha igorrum...@gmail.com wrote:


 On Jul 2, 7:13 pm, Nicolas Oury nicolas.o...@gmail.com wrote:

 You can also try to replace the arithmetic by unchecked arithmetic. (Is it
  correct from the point of view of the semantic of the VM you are
  implementing?).

 I didn't get to that yet because my profiling shows that the amount of
 time spent getting values from an array (even when using getDouble)
 surpasses by far the time spent in any other operation. Of course, my
 understanding of the JVM profiler logs may be wrong.


Interesting. Maybe I'm crazy, but when running java -server, replacing
vectors/assoc with arrays/aset slowed my project down from 4000
iterations/second to 100! Similarly, removing multimethods for each
instruction and loop/recur to churn through them and replacing with one big
fn generated up front resulted in a massive slowdown. I haven't profiled
yet, but I plan to do so because I'm so surprised by the results.

Shawn

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-03 Thread igorrumiha

On Jul 3, 4:30 pm, Shawn Hoover shawn.hoo...@gmail.com wrote:
 Interesting. Maybe I'm crazy, but when running java -server, replacing
 vectors/assoc with arrays/aset slowed my project down from 4000
 iterations/second to 100! Similarly, removing multimethods for each
 instruction and loop/recur to churn through them and replacing with one big
 fn generated up front resulted in a massive slowdown. I haven't profiled
 yet, but I plan to do so because I'm so surprised by the results.


Well, I did one more thing: implement the VM core in Java. Only the
storage and the instructions. Quite ugly, one class, everything is
public static final, and this thing is fast! I get ~65000 iterations/
s. It would be interesting to find out why

machine_data[pc] = machine_data[r1] + machine_data[r2];

is so much faster than:

(aset-double machine-data (+ (Array/getDouble machine-data r1) (Array/
getDouble machine-data r2)))

but I guess, at this level, it would involve comparing the generated
JVM bytecode for each case but that is way over my capabilities at
this moment.

--
IgorR
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-03 Thread Jon Harrop

On Thursday 02 July 2009 07:58:11 you wrote:
 I wonder if Jon Harrop is still planning to write Clojure for
 Scientists or Scala for Scientists or both?

I am certainly interested in writing both books. I reviewed Scala back in 2007 
and decided that it was not ready to be advocated. Perhaps things have 
progressed significantly since then but my impression is that Clojure is 
developing in a more productive direction and much more rapidly. I am also 
more interested in Clojure because it strives to be a genuine functional 
language rather than an OOP language with some odds and sods bolted on (Scala 
feels like a minor departure from Java to me, and I am not a Java fan. In 
fact, more like C# 3 than any real functional language) and because Clojure 
is designed with industrial use in mind rather than as an academic exercise. 
However, I have yet to give Clojure the thorough study that it deserves 
simply because I am tied up getting our F# products ready for its big release 
in 2010.

If anyone here is interested in a Clojure book aimed at technical users 
(scientists and engineers), please let me know.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-03 Thread Robert Fischer

Jon Harrop wrote:
 Scala 
 feels like a minor departure from Java to me, and I am not a Java fan. In 
 fact, more like C# 3 than any real functional language

Don't say that too loud unless you're ready for a drawn-out flame war.

http://enfranchisedmind.com/blog/posts/scala-not-functional/

~~ Robert Fischer, Smokejumper IT Consulting.
Enfranchised Mind Blog http://EnfranchisedMind.com/blog

Check out my book, Grails Persistence with GORM and GSQL!
http://www.smokejumperit.com/redirect.html

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-02 Thread fft1976



On Jun 30, 3:02 pm, igorrumiha igorrum...@gmail.com wrote:

 Some people claim
 that the JVM can give you C-like performance, but I would be more than
 happy if I got my VM to be 10x slower than the C ones :)

I like your honesty! You can come to my house and * my sister!

I wonder if Jon Harrop is still planning to write Clojure for
Scientists or Scala for Scientists or both?
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-02 Thread Daniel Lyons


On Jul 2, 2009, at 12:58 AM, fft1976 wrote:
 On Jun 30, 3:02 pm, igorrumiha igorrum...@gmail.com wrote:

 Some people claim
 that the JVM can give you C-like performance, but I would be more  
 than
 happy if I got my VM to be 10x slower than the C ones :)

 I like your honesty! You can come to my house and * my sister!

I wish you wouldn't talk like that.

—
Daniel Lyons


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-02 Thread Daniel Lyons

Mark,

On Jul 1, 2009, at 3:16 PM, Mark Engelberg wrote:
 But each language encourages a certain style of design and algorithm,
 so it IS fair to compare the way that the language encourages a
 certain approach.  For example, Clojure encourages you to use a more
 functional approach and persistent data structures.  If this ends up
 being way slower than the mutable approach that Python encourages,
 that is worth knowing.

 I remember one year, my ICFP VM in Python ran ridiculously slow in
 part because Python doesn't have a switch statement, whereas other
 languages do.   So it is useful to look at what a language does and
 does not allow you to do, and how that impacts speed.

 My point is: no comparison is going to be exactly apples to apples,
 but the comparison can still be useful.

I think I may have conflated some ideas in my previous message. Let me  
explain.

Firstly, I don't think there's any question that Clojure's performance  
isn't up to that of C. The JVM can always get better but it's probably  
going to be a while on that one. And I think the JVM establishes a  
reasonable ceiling on performance. To wit, if you don't think your  
program would run fast enough on the JVM, it's probably not going to  
run fast enough under Clojure either. Unless you can make up for it  
with cheap parallelism. I think everyone ought to know this coming in,  
but if C-level performance is really necessary, only a few languages  
will suffice and the decision of what to use to implement the code  
will necessarily flow from that.

As for FP techniques themselves, there is ample proof that they are  
not a performance inhibition in the form of other FP languages that do  
not compromise on methodology but still achieve high performance. I'm  
talking about OCaml, Haskell, Common Lisp (not as FPy, but it can be  
done) and Scheme and sometimes Erlang. The language shootout has  
plenty of evidence for that.

Please forgive my derision towards benchmarking languages. I guess you  
could say I am channelling Dijkstra here, but I think the algorithm  
dominates the debate. Quicksort in Ruby will beat bubble sort in C if  
your data is large enough. This is one reason I am not a big fan of  
comparing the speed of languages; another is that it often changes out  
from under you. Java itself is a great example of this: by the time  
everyone had gotten the message that it was slow, it had already  
become reasonably fast. The code that was to take the most advantage  
of the improvements was that which was written plainly and clearly  
rather than doing a lot of legwork to circumvent performance issues,  
and this is always the case. Clojure's young.

I appreciate the novelty of the ICFP and I think it's being examined  
because it's on our minds rather than due to any qualities it has as  
an experiment, so I won't harp on it much, but I think to compare  
between different teams implementations would be to assume that each  
team was more-or-less equal in terms of skill but for the language.  
The whole point of the contest is to discover just how untrue that is.  
I remember in 2006 my team made a very small VM in C and we were  
extremely proud of it because we had wasted so much time on a VM in  
Ruby and again in Python, hoping it would be a substantial  
improvement. Another team had written a VM with a just-in-time  
compiler which defeated our braindead C implementation handily. It was  
written in Haskell, of course:

http://comonad.com/reader/wiki;mode=categoryitem=Perl

All the best,

—
Daniel Lyons

PS: you can emulate a switch in Python by storing functions in a hash  
with the key being the dispatch value. But I don't think it will save  
your performance.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-02 Thread Konrad Hinsen

On Jul 2, 2009, at 10:36, Daniel Lyons wrote:

 This is one reason I am not a big fan of
 comparing the speed of languages; another is that it often changes out
 from under you. Java itself is a great example of this: by the time
 everyone had gotten the message that it was slow, it had already
 become reasonably fast. The code that was to take the most advantage
 of the improvements was that which was written plainly and clearly
 rather than doing a lot of legwork to circumvent performance issues,
 and this is always the case. Clojure's young.

I agree. I don't think discussing Clojure's performance vs. language  
X is very productive in general, and certainlyl not at this moment.  
The implementation is young, and so is the community's experience  
with efficient coding style.

Two more interesting questions are
1) What are good strategies to get the best performance out of Clojure?
2) How can the current implementation (compiler + runtime library) be  
improved?

Konrad.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-02 Thread Nicolas Oury
Hello,
I just have a look at your code. I was wondering why reflect.Array/getDouble
was faster than aget?
And I realized than you create arrays of objects and not array of doubles.

Have you tried to replace make-array by double-array?
I am not a Java specialist but if I understand well the API ot should create
an array of unboxed doubles, making acces really faster. (No creation of
Object when you write, nor indirection when you read).

For access, I would use (double (aget ...)). But I am not sure.

Can a clojure expert tell whether there is something like aget-double or
another way to do it?

You can also try to replace the arithmetic by unchecked arithmetic. (Is it
correct from the point of view of the semantic of the VM you are
implementing?).

Can you try to check in your profiling whether the function you generate get
compiled or not?

What function are you running at the top level? Is that a loop where you run
multiple times the same program?

Best regards,

Nicolas.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-02 Thread fft1976



On Jul 2, 1:01 am, Daniel Lyons fus...@storytotell.org wrote:
 On Jul 2, 2009, at 12:58 AM, fft1976 wrote:

  On Jun 30, 3:02 pm, igorrumiha igorrum...@gmail.com wrote:

  Some people claim
  that the JVM can give you C-like performance, but I would be more  
  than
  happy if I got my VM to be 10x slower than the C ones :)

  I like your honesty! You can come to my house and * my sister!

 I wish you wouldn't talk like that.


Sorry, it seemed funny when I wrote it:

http://www.youtube.com/watch?v=t8Nf1MK7lts


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread Nicolas Oury
Hello,
That is an interesting blog post. I am not a clojure specialist, but if I
was to try to change your program to get towards C-like speed (or better), I
would :

- Use java arrays for memory. You don't seem to use the vectors in a
persistent way, and there does not seem to be easy parallelization of the
vm. That's less elegant but probably quicker. I wouldn't say the same thing
if there were possible ways of using persistency later on in the problem.

- Most criticaly, try to use JVM's very good JIT. If it is possible. I
haven't looked to the OVM code precisely.
  If you replace :

(defn make-add
 [#^Integer r1 #^Integer r2 #^Integer pc]
 (fn [machine-state]
 (update-in machine-state [:data] ...)

by :

(defmacro make-add
 [r1 r2 pc-var]

 `(let [pc# (+ 1 pc-var)  (update-in machine-state ...)...)

And instead of reading the instruction and storing them in a vector
you create a term:

program-term =

   `(fn [input-array output-array memory]

  (do ~...@list-of-instructions)

Then you (eval program-term) (once only) at run time, or macro expand
it at compile time (if you have the right to use the binary at compile
time.)

This way, the code should be very fast, once the JIT starts. You can
hope better speed than C, because once compiled, you remove the
decoding phase of the opcode the C program has to do on each run.
Somehow you used clojure to translate OVM bytecode to - hopefully good
- JVM bytecode, that the JVM is very good at executing.

Can a clojure expert confirms wether it would work or not?

Best regards,

Nicolas.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread igorrumiha

On Jul 1, 8:55 am, Nicolas Oury nicolas.o...@gmail.com wrote:
 - Use java arrays for memory. You don't seem to use the vectors in a
 persistent way, and there does not seem to be easy parallelization of the
 vm. That's less elegant but probably quicker. I wouldn't say the same thing
 if there were possible ways of using persistency later on in the problem.

I have already started on that path, I have created an experimental
version of the VM that stores machine state in Java arrays. That gives
me around 100% speed increase (from 600 to approx. 1200 program
executions / second).

[SNIP interesting advice]

This looks like an interesting approach. I will try it out. I have
been avoiding the use of macros for some time now, maybe this is an
ideal opportunity to learn how to use them :)

--
Igor.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread Maurice Codik
I did icfp09 on my own using a mix of clojure+java. I'd sitll call myself a
clojure newbie, but have been writting java for a few years now. my code is
here:

http://github.com/mcodik/icfp2009-momo/tree/master

I wrote the vm in java since I figured that most of the code would be really
imperative with lots of writing/reading arrays and bit twiddling, both of
which I was more comfortable doing in java. My vm's main run method takes
in an IFn which it calls after every iteration.. this let me write all of
the physics code in clojure. I didnt really benchmark my vm at all until
just now, but looks like I get around 75k iterations/s on my thinkpad t60
with visualization disabled, and around 22k iters/s with visualization on.
the visualization is a simple swing ui that draws points and circles
representing orbits and sattelite positions (also written in java).

the only actually interesting part about the clojure code was the state
machine I ended up using: I had a ref set to a function of vm inputs -
velocity change and on every vm tick I would call the function currently
assigned to the ref. I created a few of these functions that would do
specific things: I had one that would wait N vm ticks then ref-set another
function as the active one. Another would execute a burn for N ticks then
ref-set another active function when complete. The function that executed a
hohmann orbit transfer would compose these two and then set another active
function, etc. I was having a tough time getting things to happen when I
wanted to before I got the state machine going.

the code in that git repo solves 100x, 2001, 2003 and 2004. I never got 2002
to work due to some rounding error.. I'd always end up slightly outside the
1km range of the target sattelite.

overall, pretty fun contest... learned quite a bit doing it.

Maurice

On Wed, Jul 1, 2009 at 6:45 AM, igorrumiha igorrum...@gmail.com wrote:


 On Jul 1, 8:55 am, Nicolas Oury nicolas.o...@gmail.com wrote:
  - Use java arrays for memory. You don't seem to use the vectors in a
  persistent way, and there does not seem to be easy parallelization of the
  vm. That's less elegant but probably quicker. I wouldn't say the same
 thing
  if there were possible ways of using persistency later on in the problem.

 I have already started on that path, I have created an experimental
 version of the VM that stores machine state in Java arrays. That gives
 me around 100% speed increase (from 600 to approx. 1200 program
 executions / second).

 [SNIP interesting advice]

 This looks like an interesting approach. I will try it out. I have
 been avoiding the use of macros for some time now, maybe this is an
 ideal opportunity to learn how to use them :)

 --
 Igor.
 


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread Shawn Hoover
Another one: http://bitbucket.org/shoover/icfp. Like Jeff we had fun on the
VM but didn't get to post a solution :)

On Wed, Jul 1, 2009 at 1:34 AM, Jeff Foster dr.jeff.fos...@gmail.comwrote:


 I looked at the ICFP Contest too.  I didn't even get as far as solving
 the first problem, but I did implement a virtual machine that appeared
 to work.  I really enjoyed the coding, though I didn't get very far
 with the physics!  I tried a couple of approaches but settling on the
 functional side.  Performance was not bad (from what I've seen it was
 vaguely comparable to the Python implementations, but was completely
 blown away by C/C++ implementations).  I really wish I'd had the time
 to do a visualizer!

 My code is on github too (http://github.com/fffej/ClojureProjects/tree/
 1494815e83febebe9af28b0cb08b812a63df9e96/icfp/uk/co/fatvathttp://github.com/fffej/ClojureProjects/tree/%0A1494815e83febebe9af28b0cb08b812a63df9e96/icfp/uk/co/fatvat)
 and
 there's a write-up on my blog (http://www.fatvat.co.uk/2009/06/icfp-
 contest-this-time-its-functional.htmlhttp://www.fatvat.co.uk/2009/06/icfp-%0Acontest-this-time-its-functional.html
 ).

 Again, I'd appreciate any guidance on anything that I could improve!

 Cheers

 jeff

 On Jun 30, 11:02 pm, igorrumiha igorrum...@gmail.com wrote:
  Greetings everyone,
 
  I didn't actually plan it but I ended up participating in the ICFP
  programming contest (http://www.icfpcontest.org). This year the task
  was to move satellites from one orbit to another, to meet with other
  satellites etc. Quite interesting, and to start it all you need to
  implement a virtual machine for the physics simulator.
 
  I used Clojure and managed to solve the first of the four problems,
  which means I didn't get really far. I was simply too slow to get to
  the really interesting stuff. I have written a rather long article
  describing my solution so I hope some of you may find it interesting:
 
  http://igor.rumiha.net/tag/icfp/
 
  I have also put the code on github:
 http://github.com/irumiha/icfp-loneclojurian/tree/master
 
  I hope someone has the interest and the time to take a look at the
  code. I consider myself a Clojure beginner and any suggestions are
  welcome, especially considering possible speed improvements to the
  virtual machine. According to some of the people on the #icfp-contest
  channel my VM implementation is 500x to 1000x slower than a typical
  implementation written in C. It is, on the other hand, in the same
  performance range as some VMs written in Python. Some people claim
  that the JVM can give you C-like performance, but I would be more than
  happy if I got my VM to be 10x slower than the C ones :)
 
  Igor.
 


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread fft1976

On Jun 30, 3:02 pm, igorrumiha igorrum...@gmail.com wrote:

 According to some of the people on the #icfp-contest channel my
 VM implementation is 500x to 1000x slower than a typical
 implementation written in C. It is, on the other hand, in the same
 performance range as some VMs written in Python.


On Jun 30, 10:34 pm, Jeff Foster dr.jeff.fos...@gmail.com wrote:

 Performance was not bad (from what I've seen it was
 vaguely comparable to the Python implementations, but was completely
 blown away by C/C++ implementations).


Has either one of you tried adding type declarations?
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread igorrumiha

On Jul 1, 8:25 pm, fft1976 fft1...@gmail.com wrote:
 On Jun 30, 3:02 pm, igorrumiha igorrum...@gmail.com wrote:

  According to some of the people on the #icfp-contest channel my
  VM implementation is 500x to 1000x slower than a typical
  implementation written in C. It is, on the other hand, in the same
  performance range as some VMs written in Python.

 On Jun 30, 10:34 pm, Jeff Foster dr.jeff.fos...@gmail.com wrote:

  Performance was not bad (from what I've seen it was
  vaguely comparable to the Python implementations, but was completely
  blown away by C/C++ implementations).

 Has either one of you tried adding type declarations?

Yes, I had them from the start.

--
IgorR
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread igorrumiha

On Jul 1, 8:55 am, Nicolas Oury nicolas.o...@gmail.com wrote:
[SNIP]
 And instead of reading the instruction and storing them in a vector
 you create a term:

 program-term =

    `(fn [input-array output-array memory]

           (do ~...@list-of-instructions)

 Then you (eval program-term) (once only) at run time, or macro expand
 it at compile time (if you have the right to use the binary at compile
 time.)

Looks like your suggestion is a definitive move in the right
direction. As you suggested, now I'm using native arrays and I'm
generating only one big function to run the whole program. The git
repo has the latest changes. The performance has jumped from 1200
(native array only, one function per VM instruction) to 3400
executions/second on my machine. Woohoo! :) Still there's room for
improvement, so the search continues...


Igor.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread fft1976



On Jul 1, 1:10 pm, igorrumiha igorrum...@gmail.com wrote:
 On Jul 1, 8:25 pm, fft1976 fft1...@gmail.com wrote:

  On Jun 30, 3:02 pm, igorrumiha igorrum...@gmail.com wrote:

   According to some of the people on the #icfp-contest channel my
   VM implementation is 500x to 1000x slower than a typical
   implementation written in C. It is, on the other hand, in the same
   performance range as some VMs written in Python.

  On Jun 30, 10:34 pm, Jeff Foster dr.jeff.fos...@gmail.com wrote:

   Performance was not bad (from what I've seen it was
   vaguely comparable to the Python implementations, but was completely
   blown away by C/C++ implementations).

  Has either one of you tried adding type declarations?

 Yes, I had them from the start.


Isn't it strange that Clojure with type declarations (that some people
say should be as fast as Java) was only as fast as Python (which does
not allow type declarations and does not exactly have a reputation for
speed)?
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread Daniel Lyons


On Jul 1, 2009, at 2:24 PM, fft1976 wrote:

 Isn't it strange that Clojure with type declarations (that some people
 say should be as fast as Java) was only as fast as Python (which does
 not allow type declarations and does not exactly have a reputation for
 speed)?


Unless the two implementations share the same general design and  
algorithms, we're comparing apples and oranges. Quicksort in Python  
will always dominate bubble sort in Clojure.

—
Daniel Lyons


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread Mark Engelberg

On Wed, Jul 1, 2009 at 2:08 PM, Daniel Lyonsfus...@storytotell.org wrote:


 On Jul 1, 2009, at 2:24 PM, fft1976 wrote:

 Isn't it strange that Clojure with type declarations (that some people
 say should be as fast as Java) was only as fast as Python (which does
 not allow type declarations and does not exactly have a reputation for
 speed)?


 Unless the two implementations share the same general design and
 algorithms, we're comparing apples and oranges. Quicksort in Python
 will always dominate bubble sort in Clojure.

But each language encourages a certain style of design and algorithm,
so it IS fair to compare the way that the language encourages a
certain approach.  For example, Clojure encourages you to use a more
functional approach and persistent data structures.  If this ends up
being way slower than the mutable approach that Python encourages,
that is worth knowing.

I remember one year, my ICFP VM in Python ran ridiculously slow in
part because Python doesn't have a switch statement, whereas other
languages do.   So it is useful to look at what a language does and
does not allow you to do, and how that impacts speed.

My point is: no comparison is going to be exactly apples to apples,
but the comparison can still be useful.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-07-01 Thread igorrumiha

On Jul 1, 10:24 pm, fft1976 fft1...@gmail.com wrote:

   Has either one of you tried adding type declarations?

  Yes, I had them from the start.

 Isn't it strange that Clojure with type declarations (that some people
 say should be as fast as Java) was only as fast as Python (which does
 not allow type declarations and does not exactly have a reputation for
 speed)?

I don't think type declarations help much in this case. For instance,
I have run java with -Xrunhprof option, with my functional
implementation the top list looks like this:

CPU SAMPLES BEGIN (total = 2241) Wed Jul  1 22:57:01 2009
rank   self  accum   count trace method
   1 20.30% 20.30% 455 301116 clojure.lang.RT.next
   2 14.73% 35.03% 330 301122 clojure.lang.RT.first
   3  9.37% 44.40% 210 301121 clojure.lang.Var.deref
   4  5.85% 50.25% 131 301134 clojure.lang.Var.deref
   5  4.73% 54.98% 106 301123 clojure.lang.Var.deref
   6  3.53% 58.50%  79 301138 clojure.core
$reduce__267$fn__270.invoke

etc...

And when I run my latest code:

CPU SAMPLES BEGIN (total = 4257) Wed Jul  1 23:05:17 2009
rank   self  accum   count trace method
   1  0.42%  0.42%  18 301156 java.lang.reflect.Array.getDouble
   2  0.38%  0.80%  16 301192 java.lang.reflect.Array.getDouble
   3  0.38%  1.17%  16 301238 java.lang.reflect.Array.getDouble
   4  0.35%  1.53%  15 301209 java.lang.reflect.Array.getDouble
   5  0.35%  1.88%  15 301235 java.lang.reflect.Array.getDouble
   6  0.33%  2.21%  14 301247 java.lang.reflect.Array.getDouble
   7  0.33%  2.54%  14 301702 java.lang.reflect.Array.getDouble
   8  0.33%  2.87%  14 301164 java.lang.reflect.Array.getDouble
   9  0.33%  3.19%  14 301337 java.lang.reflect.Array.getDouble
 10  0.31%  3.50%  13 301291 java.lang.reflect.Array.getDouble

(with around 400 more lines like this)

Please note the self column on both logs. So, my first approach was
maybe intuitive to me but it was dealing with a lot of hash lookups,
shuffling with vectors, merging hashes and whatnot. Performance was
not my goal, I just wanted to make the VM run correctly. Also, as an
exercise I wanted to use immutable/persistent data structures, to see
how I could implement such a thing as a virtual machine in a
functional language. I became interested in performance when I
discovered that my VM runs at 45 iterations/second. That had to be
improved, so I tweaked it until I got 500 (still using immutable data
structures), and that allowed me to continue with the contest.

Now, thanks to Nicolas's suggestions I get just short of 5000
iterations/s (yes, I improved it a bit more since my last post), of
course, with the help of mutable java arrays. That is nice.

--
IgorR
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



loneclojurian at ICFP programming contest

2009-06-30 Thread igorrumiha

Greetings everyone,

I didn't actually plan it but I ended up participating in the ICFP
programming contest (http://www.icfpcontest.org). This year the task
was to move satellites from one orbit to another, to meet with other
satellites etc. Quite interesting, and to start it all you need to
implement a virtual machine for the physics simulator.

I used Clojure and managed to solve the first of the four problems,
which means I didn't get really far. I was simply too slow to get to
the really interesting stuff. I have written a rather long article
describing my solution so I hope some of you may find it interesting:

http://igor.rumiha.net/tag/icfp/

I have also put the code on github: 
http://github.com/irumiha/icfp-loneclojurian/tree/master

I hope someone has the interest and the time to take a look at the
code. I consider myself a Clojure beginner and any suggestions are
welcome, especially considering possible speed improvements to the
virtual machine. According to some of the people on the #icfp-contest
channel my VM implementation is 500x to 1000x slower than a typical
implementation written in C. It is, on the other hand, in the same
performance range as some VMs written in Python. Some people claim
that the JVM can give you C-like performance, but I would be more than
happy if I got my VM to be 10x slower than the C ones :)


Igor.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: loneclojurian at ICFP programming contest

2009-06-30 Thread Jeff Foster

I looked at the ICFP Contest too.  I didn't even get as far as solving
the first problem, but I did implement a virtual machine that appeared
to work.  I really enjoyed the coding, though I didn't get very far
with the physics!  I tried a couple of approaches but settling on the
functional side.  Performance was not bad (from what I've seen it was
vaguely comparable to the Python implementations, but was completely
blown away by C/C++ implementations).  I really wish I'd had the time
to do a visualizer!

My code is on github too (http://github.com/fffej/ClojureProjects/tree/
1494815e83febebe9af28b0cb08b812a63df9e96/icfp/uk/co/fatvat) and
there's a write-up on my blog (http://www.fatvat.co.uk/2009/06/icfp-
contest-this-time-its-functional.html).

Again, I'd appreciate any guidance on anything that I could improve!

Cheers

jeff

On Jun 30, 11:02 pm, igorrumiha igorrum...@gmail.com wrote:
 Greetings everyone,

 I didn't actually plan it but I ended up participating in the ICFP
 programming contest (http://www.icfpcontest.org). This year the task
 was to move satellites from one orbit to another, to meet with other
 satellites etc. Quite interesting, and to start it all you need to
 implement a virtual machine for the physics simulator.

 I used Clojure and managed to solve the first of the four problems,
 which means I didn't get really far. I was simply too slow to get to
 the really interesting stuff. I have written a rather long article
 describing my solution so I hope some of you may find it interesting:

 http://igor.rumiha.net/tag/icfp/

 I have also put the code on 
 github:http://github.com/irumiha/icfp-loneclojurian/tree/master

 I hope someone has the interest and the time to take a look at the
 code. I consider myself a Clojure beginner and any suggestions are
 welcome, especially considering possible speed improvements to the
 virtual machine. According to some of the people on the #icfp-contest
 channel my VM implementation is 500x to 1000x slower than a typical
 implementation written in C. It is, on the other hand, in the same
 performance range as some VMs written in Python. Some people claim
 that the JVM can give you C-like performance, but I would be more than
 happy if I got my VM to be 10x slower than the C ones :)

 Igor.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---