I have some more performance stuff coming.

Now interestingly, I think I know why Felix slaughters C and Ocaml
in nbody. The Ocaml code does this:

  for i = 0 to Array.length bodies - 1 do
    let b = bodies.(i) in
    for j = i+1 to Array.length bodies - 1 do
      let b' = bodies.(j) in
      let dx = b.x -. b'.x  and dy = b.y -. b'.y  and dz = b.z -. b'.z in
      let dist2 = dx *. dx +. dy *. dy +. dz *. dz in
      let mag = dt /. (dist2 *. sqrt(dist2)) in

      b.vx <- b.vx -. dx *. b'.mass *. mag;
      b.vy <- b.vy -. dy *. b'.mass *. mag;
      b.vz <- b.vz -. dz *. b'.mass *. mag;

      b'.vx <- b'.vx +. dx *. b.mass *. mag;
      b'.vy <- b'.vy +. dy *. b.mass *. mag;
      b'.vz <- b'.vz +. dz *. b.mass *. mag;
    done
  done;

The Felix code is marginally hand optimised:

   for i in 0 upto nbodies - 1 do
      val bimass = bodies.i.mass;
      var delta = 0.0,0.0,0.0;
      for j in i+1 upto nbodies - 1 do
         val bjmass = bodies.j.mass;
         val d = bodies.i.pos - bodies.j.pos;
         val d2 = sqr d;
         val distance = sqrt d2;

         delta = delta + d * bjmass * dt / (distance * d2);
         bodies.j.vel = bodies.j.vel + d * bimass * dt / (distance * d2);
      done;
      bodies.i.vel = bodies.i.vel - delta;
   done;

Notice bodies.i.vel is adjusted only at the end, by delta.
It's a tiny improvement intended to avoid indexing overheads.
I also lift out bimass (mass of plant I) as a loop invariant of the inner loop.

Well then I decided to look at the output:

Felix:
-0.169075164
-0.169075164
clang
-0.169075164
-0.169059907
gcc
-0.169075164
-0.169059907
ocaml
-0.169075164
-0.169059907

and I'm going "why is Felix wrong??" How did I shortcut
the calculation.

Then I looked what these numbers ARE: Ocaml code:

var r0 = Energy ();
for var i in 1 upto n do Advance 0.01; done;
var r1 = Energy ();
println$ f"%.9f"$ r0;
println$ f"%.9f"$ r1;

So actually .. C and Ocaml get the WRONG answer as well as being
slower. Felix gets the right answer. 

Law of Conservation of Energy.
The number should be the same.

My algorithm is more numerically sound because I'm accumulating
small differences in delta before adjusting plant i velocity.
Really, this should be done for planet j too: the delta's should
be accumulated first then used to adjust the planet velocities.

--
john skaller
skal...@users.sourceforge.net
http://felix-lang.org




------------------------------------------------------------------------------
Free Next-Gen Firewall Hardware Offer
Buy your Sophos next-gen firewall before the end March 2013 
and get the hardware for free! Learn more.
http://p.sf.net/sfu/sophos-d2d-feb
_______________________________________________
Felix-language mailing list
Felix-language@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/felix-language

Reply via email to