#|
Hello,
I noticed a compilation quirk in CMUCL 18d on x86 [1].
The functions force-func-pwrlaw-2 and force-func-pwrlaw (below) do
the same thing, but compiling force-func-pwrlaw-2 produces a note [2]
saying that expt forces runtime allocation of %SAP-ALIEN when calling
kernel::%pow.
Yet force-func-pwrlaw-2 runs a tiny bit FASTER than force-func-pwrlaw,
contrary to what the compilation note suggests.
The only difference between the two is that force-func-pwrlaw-2
uses intermediate double-float variables to build up the result, while
force-func-pwrlaw modifies a single variable to perform
the same computations. Also, if in force-func-pwrlaw-2 I change
the line:
(let* ((r (sqrt (+ 1d-100 (* x x) (* y y) (* z z))))
to
(let* ((r (sqrt (+ (* x x) (* y y) (* z z))))
the compilation note goes away.
Does anyone know what is happening? Is this a bug in
compiler note generation, or a well known quirk?
----
[1]
* (lisp-implementation-version)
"18d-pre, level-1 built 2002-01-29 on maftia1"
[2] compilation note:
In: lambda (x y z)
(expt r 0.1d0)
--> kernel:%pow block with-alien compiler-let symbol-macrolet values prog1 let
--> alien-funcall kernel:%pow alien::%heap-alien alien::extract-alien-value
--> alien::naturalize
==>
(alien::%sap-alien alien
'#<alien::alien-function-type
(function double-float double-float double-float)>)
Note: Unable to optimize because:
Could not optimize away %SAP-ALIEN: forced to do runtime
allocation of alien-value structure.
Note: Doing SAP to pointer coercion (cost 20).
---
|#
(eval-when (load eval compile)
  (defparameter *rho-exponent* 0.1d0))
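;; force-func-pwrlaw-2 and force-func-pwrlaw both return the components of
;; -(x,y,z)/r^*rho-exponent*, where r = sqrt(1d-100+x^2+y^2+z^2)
;; (force-func-pwrlaw adds the 1d-100 after taking the sqrt instead)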
(declaim (inline force-func-pwrlaw-2))
;; this version gives compilation note for expt
(defun force-func-pwrlaw-2 (x y z)
  (declare (type double-float x y z)
           (optimize speed)
           (inline expt))
  (let* ((r (sqrt (+ 1d-100 (* x x) (* y y) (* z z))))
         (pow-r (expt r #.*rho-exponent*))
         (tmp (- (/ 1d0 pow-r))))
    (declare (type (double-float 0d0) r pow-r)
             (type double-float tmp))
    (values (* x tmp) (* y tmp) (* z tmp))))
;; version of force-func-pwrlaw-2, with no compilation note
(declaim (inline force-func-pwrlaw))
(defun force-func-pwrlaw (x y z)
  (declare (type double-float x y z)
           (optimize speed)
           (inline expt))
  (let* ((tmp 0d0)) ;; do everything with a single variable 'tmp'
    (declare (type double-float tmp))
    (setf tmp (sqrt (+ (* x x) (* y y) (* z z))))
    (incf tmp 1d-100)
    (setf tmp (expt (the (double-float 0d0) tmp) #.*rho-exponent*))
    (setf tmp (- (/ 1d0 tmp)))
    (values (* x tmp) (* y tmp) (* z tmp))))
;; run both for n iterations and print times
(defun testtimes (&optional (n 10000000))
  (declare (optimize speed)
           (type (unsigned-byte 28) n))
  (format t "Testing force-func-pwrlaw~%")
  (time (loop for i of-type (unsigned-byte 28) below n
              do (force-func-pwrlaw 1d0 2d0 3d0)))
  (format t "Testing force-func-pwrlaw-2~%")
  (time (loop for i of-type (unsigned-byte 28) below n
              do (force-func-pwrlaw-2 1d0 2d0 3d0))))
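;; A quick sanity check, offered only as a sketch and not part of the
;; timing test above: the speed comparison is meaningful only if the two
;; versions return the same values, so this compares their results.
;; CHECK-SAME-RESULT and its TOLERANCE argument are hypothetical names
;; introduced here for illustration; the function assumes that
;; force-func-pwrlaw and force-func-pwrlaw-2 are defined as above.
(defun check-same-result (x y z &optional (tolerance 1d-12))
  (declare (type double-float x y z tolerance))
  ;; Collect the three return values of each version as lists.
  (let ((a (multiple-value-list (force-func-pwrlaw x y z)))
        (b (multiple-value-list (force-func-pwrlaw-2 x y z))))
    ;; True when every pair of components differs by at most TOLERANCE.
    (every (lambda (u v) (<= (abs (- u v)) tolerance)) a b)))
;; Example call: (check-same-result 1d0 2d0 3d0)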