This was very interesting to me: it is exactly the kind of thing I'd
been looking for, thinking about, and trying, but hadn't been able to
make work. The key is inlining. In fact, just to test my
understanding: with the specific syntax you used, I don't think you
need a macro at all to make it fast. The reason your functional
version is slow is that the function nmap-df-vector is not declared
inline, which in turn means that the call

  (funcall #'nmap-df-vector #'square-df *vector*)

can't inline its call to square-df. And the inlining is crucial, as
it's the only thing that will allow me to avoid boxing up the
double-float after every call to square-df, which is prohibitively
expensive.
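
To make the boxing point concrete, here is a minimal sketch that
contrasts an inlined callee with one forced to be a full call. All of
the names in it (square-inline, square-full-call, sum-squares-inline,
sum-squares-full-call) are hypothetical and not from the thread, and
whether the full-call version actually conses depends on the
implementation and its optimization settings.

```lisp
;; Hypothetical names for illustration only; not from the thread.

;; Inlined callee: the compiler may keep the double-float unboxed.
(declaim (inline square-inline))
(defun square-inline (x)
  (declare (double-float x))
  (* x x))

;; Same body, but every use is forced to be a full function call,
;; so the result typically has to be returned as a boxed object.
(declaim (notinline square-full-call))
(defun square-full-call (x)
  (declare (double-float x))
  (* x x))

(defun sum-squares-inline (v)
  (declare (type (simple-array double-float (*)) v)
           (optimize (speed 3)))
  (let ((acc 0d0))
    (declare (double-float acc))
    (dotimes (i (length v) acc)
      (incf acc (square-inline (aref v i))))))

(defun sum-squares-full-call (v)
  (declare (type (simple-array double-float (*)) v)
           (optimize (speed 3)))
  (let ((acc 0d0))
    (declare (double-float acc))
    (dotimes (i (length v) acc)
      (incf acc (the double-float (square-full-call (aref v i)))))))
```

Both functions return the same sum; wrapping each call in (time ...)
on a large vector is one way to compare the consing, in
implementations whose TIME output reports it.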

So I think I can get rid of the macro by just making the mapping
function itself inline:

(declaim (inline nmap-df-vector-2))
(defun nmap-df-vector-2 (func df-vector)
  (declare (type df-vector df-vector)
           (type df->df-func func))
  (loop for i of-type fixnum from 0 below (length df-vector)
        do (setf (aref df-vector i)
                 (funcall func (aref df-vector i)))))

;; Runs as fast as test-1, conses 0 bytes
(defun test-3 ()
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (dotimes (i 30000)
    (funcall #'nmap-df-vector-2 #'square-df *vector*)))

Have I got the idea?
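
For anyone who wants to try the inline-function version on its own,
the following is a self-contained sketch. It repeats square-df and
nmap-df-vector-2 from this thread so the block loads by itself. Three
things are assumptions added here: the checker check-nmap-sketch is a
hypothetical name, func is declared as plain FUNCTION rather than
df->df-func to keep the declaration portable, and the mapper returns
the vector for convenience.

```lisp
(deftype df-vector () '(simple-array double-float 1))

(declaim (inline square-df))
(defun square-df (x)
  (declare (double-float x))
  (* x x))

(declaim (inline nmap-df-vector-2))
(defun nmap-df-vector-2 (func df-vector)
  ;; Assumption: FUNCTION instead of df->df-func, for portability.
  (declare (type df-vector df-vector)
           (type function func))
  (loop for i of-type fixnum from 0 below (length df-vector)
        do (setf (aref df-vector i)
                 (the double-float (funcall func (aref df-vector i)))))
  ;; Assumption: return the vector so callers can inspect it.
  df-vector)

;; Hypothetical helper: squares a small vector in place and returns it.
(defun check-nmap-sketch ()
  (let ((v (make-array 4 :element-type 'double-float
                         :initial-contents '(1d0 2d0 3d0 4d0))))
    (nmap-df-vector-2 #'square-df v)
    v))
```

Calling (check-nmap-sketch) should return a vector holding the squares
of the initial contents. Whether the inner call really gets inlined
and unboxed is up to the compiler, so it is worth confirming with
(time ...) or (disassemble 'check-nmap-sketch) on the implementation
in use.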
rif
>
> On Fri, Feb 14, 2003 at 07:15:04PM -0500, rif wrote:
>
> > In order to get good performance, will I have to make a single
> > function that takes the whole vector and does all the operations
> > internally (i.e., a function called square-vector for instance), or
> > can I build a higher-order function which takes a procedure like
> > square and operates on the elements of the vector? It seems that if
> > I'm guaranteed to lose as soon as I call a function with a
> > double-float argument, that it's going to be hard to do this fast.
> > What're good strategies here?
>
> Perhaps judicious use of inlined functions and macros (possibly
> compiler macros, if you still want to have function-like behaviour
> under some circumstances...)? E.g.:
>
> (declaim (inline square-df))
> (defun square-df (x)
>   (declare (optimize (speed 3) (safety 0) (debug 0))
>            (double-float x))
>   (* x x))
>
> (deftype df->df-func () `(function (double-float) double-float))
>
> (deftype df-vector () `(simple-array double-float 1))
>
> (defmacro %nmap-df-vector (func df-vector)
>   (let (($func (gensym "FUNC"))
>         ($df-vector (gensym "DF-VECTOR"))
>         ($index (gensym "INDEX")))
>     `(let ((,$func ,func)
>            (,$df-vector ,df-vector))
>        (declare (type df-vector ,$df-vector)
>                 (type df->df-func ,$func))
>        (loop for ,$index of-type fixnum from 0 below (length ,$df-vector)
>              do (setf (aref ,$df-vector ,$index)
>                       (funcall ,$func (aref ,$df-vector ,$index)))))))
>
> (define-compiler-macro nmap-df-vector (func df-vector)
>   `(%nmap-df-vector ,func ,df-vector))
>
> (defun nmap-df-vector (func df-vector)
>   (%nmap-df-vector func df-vector))
>
> (defvar *vector* (make-array (list 4096) :element-type 'double-float))
>
> ;; Runs in 0.32 seconds, 0 bytes consed
> (defun test-1 ()
>   (declare (optimize (speed 3) (safety 0) (debug 0)))
>   (dotimes (i 30000)
>     (nmap-df-vector #'square-df *vector*)))
>
> ;; Runs in 64.33 seconds with 3.64 gigabytes consed
> (defun test-2 ()
>   (declare (optimize (speed 3) (safety 0) (debug 0)))
>   (dotimes (i 30000)
>     (funcall #'nmap-df-vector #'square-df *vector*)))
>