#|

Hello,

I noticed a compilation quirk in cmucl 18d on x86 [1].

The functions force-func-pwrlaw-2 and force-func-pwrlaw (below) do
the same thing, but compiling force-func-pwrlaw-2 produces a note [2]
saying that expt forces runtime allocation of an alien-value structure
(%SAP-ALIEN) when calling kernel::%pow.
Yet force-func-pwrlaw-2 runs a tiny bit FASTER than force-func-pwrlaw,
contrary to what the compilation note suggests.

The only difference between the two is that force-func-pwrlaw-2
uses intermediate double-float variables to build up the result, while
force-func-pwrlaw modifies a single variable to perform
the same computations.  Also, if in force-func-pwrlaw-2 I change
the line:

  (let* ((r (sqrt (+ 1d-100 (* x x) (* y y) (* z z))))
to
  (let* ((r (sqrt (+        (* x x) (* y y) (* z z))))

the compilation note goes away.

Does anyone know what is happening?  Is this a bug in
compiler note generation, or a well known quirk?
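
A quick way to confirm that the two functions really return the same
values is a check like the sketch below.  The helper same-force-p and
its tol argument are hypothetical names introduced only for this
illustration; they are not part of the code in this message, and the
default tolerance is an arbitrary assumption.

```lisp
;; Hypothetical helper (not from the original code): compare the three
;; values returned by two force functions F and G at the point (X, Y, Z).
(defun same-force-p (f g x y z &optional (tol 1d-12))
  "Return true if F and G agree, to relative tolerance TOL, on all three values."
  (let ((a (multiple-value-list (funcall f x y z)))
        (b (multiple-value-list (funcall g x y z))))
    (and (= (length a) (length b) 3)
         (every (lambda (u v)
                  ;; relative comparison, with a floor of 1d0 on the scale
                  (<= (abs (- u v))
                      (* tol (max 1d0 (abs u) (abs v)))))
                a b))))

;; Usage, once both functions from this message are loaded:
;; (same-force-p #'force-func-pwrlaw #'force-func-pwrlaw-2 1d0 2d0 3d0)
```

Calling through funcall defeats inlining, so this is only meant as a
correctness check, not as a way to time the two functions.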




----
[1]
* (lisp-implementation-version)
"18d-pre, level-1 built 2002-01-29 on maftia1"

[2] compilation note:

In: lambda (x y z)
  (expt r 0.1d0)
--> kernel:%pow block with-alien compiler-let symbol-macrolet values prog1 let 
--> alien-funcall kernel:%pow alien::%heap-alien alien::extract-alien-value 
--> alien::naturalize 
==>
  (alien::%sap-alien alien
                     '#<alien::alien-function-type
                        (function double-float double-float double-float)>)
Note: Unable to optimize because:
      Could not optimize away %SAP-ALIEN: forced to do runtime 
allocation of alien-value structure.
Note: Doing SAP to pointer coercion (cost 20).
---
|#



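;; *rho-exponent* must exist at read time, because the functions below
;; splice in its value with #.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defparameter *rho-exponent* 0.1d0))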

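;; force-func-pwrlaw-2 and force-func-pwrlaw both return the three values
;; -(x, y, z) / r^*rho-exponent*, where r is approximately sqrt(x^2+y^2+z^2).
;; The 1d-100 term keeps r positive: force-func-pwrlaw-2 adds it inside the
;; sqrt, force-func-pwrlaw adds it to the result of the sqrt.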

(declaim (inline force-func-pwrlaw-2))
;; this version gives compilation note for expt
(defun force-func-pwrlaw-2 (x y z) 
  (declare (type double-float x y z)
           (optimize speed)
           (inline expt))
  (let* ((r (sqrt (+ 1d-100 (* x x) (* y y) (* z z))))
         (pow-r (expt r #.*rho-exponent*))
         (tmp (- (/ 1d0 pow-r))))
    (declare (type (double-float 0d0) r pow-r)
             (type double-float tmp))
    (values (* x tmp) (* y tmp) (* z tmp))))

;; version of force-func-pwrlaw-2, with no compilation note
(declaim (inline force-func-pwrlaw))
(defun force-func-pwrlaw (x y z) 
  (declare (type double-float x y z)
           (optimize speed)
           (inline expt))
  (let* ((tmp 0d0)) ;; do everything with a single variable 'tmp'
    (declare (type double-float tmp))
    (setf tmp (sqrt (+ (* x x) (* y y) (* z z))))
    (incf tmp 1d-100)
    (setf tmp (expt (the (double-float 0d0) tmp) #.*rho-exponent*))
    (setf tmp (- (/ 1d0 tmp)))
    (values (* x tmp) (* y tmp) (* z tmp))))


;; run both functions for n iterations and print the times
(defun testtimes (&optional (n 10000000))
  (declare (optimize speed)
           (type (unsigned-byte 28) n))
  (format t "Testing force-func-pwrlaw~%")
  (time (loop for i of-type (unsigned-byte 28) below n
              do (force-func-pwrlaw 1d0 2d0 3d0)))
  (format t "Testing force-func-pwrlaw-2~%")
  (time (loop for i of-type (unsigned-byte 28) below n
              do (force-func-pwrlaw-2 1d0 2d0 3d0))))

