Okay, I rewrote my Lisp version without any weirdness (like #. or
optimize or type declarations), and got the following:

(let* ((n 1000)
       (mat (make-array (* n n) :element-type 'double-float)))

  (defun naive ()
    (macrolet ((mat (n)
                 `(aref mat ,n)))
      (dotimes (i (* n n))
        (setf (mat i) 1.0d0))
      (dotimes (i 10)
        (loop
         for pos1 from 1 below (1- n)
         do (loop
             for pos2 from (+ pos1 n) below (+ pos1 (* n (1- n))) by n
             do (setf (mat pos2)
                      (* 0.11111111111111d0
                         (+ (mat (- pos2 1001)) (mat (1- pos2)) (mat (+ pos2 999))
                            (mat (- pos2 1000)) (mat pos2)      (mat (+ pos2 1000))
                            (mat (- pos2 999))  (mat (1+ pos2)) (mat (+ pos2 
1001))))))))
      (princ (mat (1+ n))))))

I went through and added a minimum of type declarations, making them
as accurate as possible[*].  Unfortunately, Python wasn't able to
figure out the type of N or MAT on its own.  Adding in the OPTIMIZE
declarations, I have the following code:

(let* ((n 1000)
       (mat (make-array (* n n) :element-type 'double-float)))
  (declare (type (eql 1000) n)
           (type (simple-array double-float (1000000)) mat)
           (optimize (speed 3) (safety 0) (debug 0) (space 0) (compilation-speed 0)))

  (defun fast ()
    (macrolet ((mat (n)
                 `(aref mat ,n)))
      (dotimes (i (* n n))
        (setf (mat i) 1.0d0))
      (dotimes (i 10)
        (loop
         for pos1 of-type (integer 1 1000) from 1 below (1- n)
         do (loop
             for pos2 of-type (integer 1000 1000000)
             from (+ pos1 n) below (+ pos1 (* n (1- n))) by n
             do (setf (mat pos2)
                      (* 0.11111111111111d0
                         (+ (mat (- pos2 1001)) (mat (1- pos2)) (mat (+ pos2 999))
                            (mat (- pos2 1000)) (mat pos2)      (mat (+ pos2 1000))
                            (mat (- pos2 999))  (mat (1+ pos2)) (mat (+ pos2 
1001))))))))
      (princ (mat (1+ n))))))

I'm quite happy with it.  It's readible, not littered with type
information, and runs 1.08x the speed of GCC's code (1.1 for Sun's C
compiler).  Note that it does the exact same computations as the C
version.  There are still some things in the disassembly that are
suboptimal, but in all it's as good as the C.

[*] Thanks to Gerd for pointing out in his code that MAT is a
SIMPLE-ARRAY.  Duh.  For some reason I didn't notice that was missing.

-- 
           /|_     .-----------------------.                        
         ,'  .\  / | No to Imperialist war |                        
     ,--'    _,'   | Wage class war!       |                        
    /       /      `-----------------------'                        
   (   -.  |                               
   |     ) |                               
  (`-.  '--.)                              
   `. )----'                               

Reply via email to