On Mon, 2003-10-27 at 17:11, rif wrote:
>
> Yes, l2-distance-squared is inlined. I don't personally see how the
> lack of this could be causing a problem converting between dist and
> best-dist --- they are both double-floats, and they are both declared
> as double-floats, so I'd hope the conversion happens without going
> through float-to-pointer.
>
> I'm tentatively guessing that Raymond's suggestion, that the system's
> out of FP registers, is the issue.
>
I'd also buy Raymond's explanation. There are things going on with the
optimizer that I don't understand. Here's one I ran across while trying
to reproduce your problem....

(declaim (inline foo))
(defun foo ()
  (declare (values double-float))
  0d0)

(defun bar (i)
  (declare (type fixnum i))
  (declare (values double-float))
  (let ((x 0d0))
    (loop for j fixnum from 0 to i
          do (setf x (the double-float (+ x (foo)))))
    x))

(defun baz (i)
  (declare (type fixnum i))
  (declare (values double-float))
  (let ((x 0d0)
        (y 0d0))
    (loop for j fixnum from 0 to i
          do (setf x (the double-float (+ x y))))
    x))

(with-compilation-unit (:optimize '(optimize (speed 3)))
  (compile 'foo)
  (compile 'bar)
  (compile 'baz))
(format t "Bar:~%")
(time (bar 1000000))
(format t "Baz:~%")
(time (baz 1000000))

(with-compilation-unit (:optimize '(optimize (speed 3) (safety 0) (debug 0)))
  (compile 'foo)
  (compile 'bar)
  (compile 'baz))
(format t "Bar (2):~%")
(time (bar 1000000))
(format t "Baz (2):~%")
(time (baz 1000000))

Executive summary:
Bar: 32,198,876 CPU cycles, 80 bytes consed.
Baz: 17,623,876 CPU cycles, 80 bytes consed.
Bar (2): 412,752,556 CPU cycles, 16,000,144 bytes consed.
Baz (2): 12,066,088 CPU cycles, 80 bytes consed.
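
Bar (2) seems to have punted to generic + and float-to-pointer coercion
in the more highly optimized compilation! It conses about 16 bytes per
iteration (16,000,144 bytes over 1,000,001 passes through the loop),
which would be consistent with boxing a fresh double-float on every pass.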
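
One way to check that guess is to look at the compiled code directly. The
following is only a minimal sketch: the helper name disassembly-mentions-p
is made up for illustration, and the string to search for ("GENERIC-+"
here) is an assumption about how a particular implementation labels the
routine in its listing, so adjust it after reading the raw output of
(disassemble 'bar).

```lisp
;; Hypothetical helper, not part of any library: capture the output of
;; DISASSEMBLE as a string and report whether NEEDLE occurs in it.
(defun disassembly-mentions-p (function-name needle)
  (let ((listing (with-output-to-string (*standard-output*)
                   ;; DISASSEMBLE prints its listing to *STANDARD-OUTPUT*.
                   (disassemble function-name))))
    ;; Case-insensitive search; normalise the result to T or NIL.
    (not (null (search needle listing :test #'char-equal)))))
```

After each compilation pass, something like
(disassembly-mentions-p 'bar "GENERIC-+") could then be compared against
the same call for 'baz. The listing format is implementation-specific, so
a NIL result only means the string was not found, not that no generic
arithmetic is happening.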