On Mon, 2003-10-27 at 17:11, rif wrote:
> 
> Yes, l2-distance-squared is inlined.  I don't personally see how the
> lack of this could be causing a problem converting between dist and
> best-dist --- they are both double-floats, and they are both declared
> as double-floats, so I'd hope the conversion happens without going
> through float-to-pointer.
> 
> I'm tentatively guessing that Raymond's suggestion, that the system's
> out of FP registers, is the issue.
> 

I'd also buy Raymond's explanation.  There are things going on with the
optimizer that I don't understand.  Here's one I ran across while trying
to reproduce your problem....


(declaim (inline foo))
(defun foo ()
  (declare (values double-float))
  0d0)

(defun bar (i)
  (declare (type fixnum i))
  (declare (values double-float))
  (let ((x 0d0))
    (loop for j fixnum from 0 to i
          do (setf x (the double-float (+ x (foo)))))
    x))

(defun baz (i)
  (declare (type fixnum i))
  (declare (values double-float))
  (let ((x 0d0)
        (y 0d0))
    (loop for j fixnum from 0 to i
          do (setf x (the double-float (+ x y))))
    x))

(with-compilation-unit (:optimize '(optimize (speed 3)))
  (compile 'foo)
  (compile 'bar)
  (compile 'baz))

(format t "Bar:~%")
(time (bar 1000000))
(format t "Baz:~%")
(time (baz 1000000))

(with-compilation-unit (:optimize '(optimize (speed 3) (safety 0) (debug 0)))
  (compile 'foo)
  (compile 'bar)
  (compile 'baz))

(format t "Bar (2):~%")
(time (bar 1000000))
(format t "Baz (2):~%")
(time (baz 1000000))

Executive summary: 
Bar:      32,198,876 CPU cycles,         80 bytes consed.
Baz:      17,623,876 CPU cycles,         80 bytes consed.
Bar (2): 412,752,556 CPU cycles, 16,000,144 bytes consed.
Baz (2):  12,066,088 CPU cycles,         80 bytes consed.


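Bar (2) seems to have fallen back to generic-+ and float-to-pointer
coercion in the more aggressively optimized compilation!  The extra
consing works out to about 16 bytes per iteration, which is consistent
with a double-float being boxed on every pass through the loop.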
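
One way to check that guess is to look at the disassembly rather than
the timings.  The following is only a sketch: the two helper names are
made up for illustration, and searching for the text "generic-+"
assumes the disassembler annotates calls to the generic arithmetic
routine by name, which may not hold on every platform or version.

```lisp
;; Hypothetical helpers, not part of the original test case.

(defun disassembly-string (fn-name)
  "Return the DISASSEMBLE listing for FN-NAME as a string."
  ;; DISASSEMBLE prints to *STANDARD-OUTPUT*, so capture it there.
  (with-output-to-string (*standard-output*)
    (disassemble fn-name)))

(defun listing-mentions-p (fn-name needle)
  "True if the disassembly of FN-NAME contains NEEDLE, ignoring case."
  (search (string-upcase needle)
          (string-upcase (disassembly-string fn-name))))

;; Usage, after either compilation pass:
;; (listing-mentions-p 'bar "generic-+")
;; (listing-mentions-p 'baz "generic-+")
```

If the guess is right, the check should come back true for bar only
after the second compilation pass, and false for baz both times.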


