Yo,

I'm back with another optimization question. I stumbled across this
by pure chance: I have a rather complicated function that needs some
fresh arrays on each invocation, and it is called quite often. I used
to create these arrays with MAKE-ARRAY. After a while, I don't know
why, I had the idea to keep a template of each array and create the
fresh ones with COPY-SEQ instead. And, presto, the code runs much
faster now. Here's a simple example:

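  ;; FOO: build a fresh array of size X with MAKE-ARRAY on each iteration.
  ;; (X must be at least 5 because index 4 is written.)
  (defun foo (n x)
    (dotimes (i n)
      (let ((a (make-array x :initial-element 0)))
        (setf (svref a 4) 42))))
  
  ;; BAR: build one template array up front, then COPY-SEQ it on each
  ;; iteration.
  (defun bar (n x)
    (let ((template (make-array x :initial-element 0)))
      (dotimes (i n)
        (let ((a (copy-seq template)))
          (setf (svref a 4) 42)))))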

The compiled code yields these results in CMUCL 18d:

  * (time (foo 1000000 10))
  Compiling LAMBDA NIL: 
  Compiling Top-Level Form: 
  
  Evaluation took:
    7.01 seconds of real time
    6.805664 seconds of user run time
    0.171875 seconds of system run time
    [Run times include 0.54 seconds GC run time]
    0 page faults and
    95666256 bytes consed.
  NIL
  * (time (bar 1000000 10))
  Compiling LAMBDA NIL: 
  Compiling Top-Level Form: 
  
  Evaluation took:
    1.77 seconds of real time
    1.606445 seconds of user run time
    0.147461 seconds of system run time
    [Run times include 0.33 seconds GC run time]
    0 page faults and
    63763840 bytes consed.
  NIL
  *

I think this is rather funny 'cause I would have expected FOO to be at
least as fast as BAR, yet here BAR takes 1.77 seconds against 7.01 for
FOO. FOO is indeed at least as fast in the other CL implementations
I've tested (ACL, CLISP, LW), with one exception: LW favors BAR for
small N.

Some other observations:

1. This "optimization" only holds for small X. On my machine BAR is
   faster for X < 100 approximately while FOO wins for bigger X. My
   app only needs small arrays, so that suits it perfectly.

2. Also, the speed gain is lost if the size is hard-coded instead of
   being passed in as X, i.e.

  (defun foo (n)
    (dotimes (i n)
      (let ((a (make-array 10 :initial-element 0)))
        (setf (svref a 4) 42))))
  
  (defun bar (n)
    (let ((template (make-array 10 :initial-element 0)))
      (dotimes (i n)
        (let ((a (copy-seq template)))
          (setf (svref a 4) 42)))))

   In this case FOO always wins.
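
For anyone who wants to look for the crossover from observation 1 on
their own machine, here is a rough sketch of a timing loop. The names
RUN-SECONDS, ALLOC-FRESH, ALLOC-COPY and COMPARE-SIZES are made up for
this illustration; ALLOC-FRESH and ALLOC-COPY are FOO and BAR from the
first example, except that they write index 0 so that any X >= 1 works.
It only uses standard CL (GET-INTERNAL-RUN-TIME), so the numbers are
coarse, and the functions should be compiled before timing them.

```lisp
;; Hypothetical helper: run time in seconds consumed by one call of THUNK.
(defun run-seconds (thunk)
  (let ((start (get-internal-run-time)))
    (funcall thunk)
    (/ (- (get-internal-run-time) start)
       (float internal-time-units-per-second))))

;; Same idea as FOO: a fresh MAKE-ARRAY on every iteration.
(defun alloc-fresh (n x)
  (dotimes (i n)
    (let ((a (make-array x :initial-element 0)))
      (setf (svref a 0) 42))))

;; Same idea as BAR: COPY-SEQ of a template built once.
(defun alloc-copy (n x)
  (let ((template (make-array x :initial-element 0)))
    (dotimes (i n)
      (let ((a (copy-seq template)))
        (setf (svref a 0) 42)))))

;; Return one (X FRESH-SECONDS COPY-SECONDS) entry per size in SIZES.
(defun compare-sizes (n sizes)
  (mapcar (lambda (x)
            (list x
                  (run-seconds (lambda () (alloc-fresh n x)))
                  (run-seconds (lambda () (alloc-copy n x)))))
          sizes))
```

Something like (compare-sizes 1000000 '(10 50 100 200 1000)) then
gives one row per size, and the crossover should show up as the row
where the second number stops being larger than the third.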

I'm pretty happy that this behaviour means a significant speed boost
for my app but nevertheless I'd be interested to know what exactly is
happening here.

Thanks,
Edi.

