optimizing numerical code.

Harvey J. Stein Mon, 13 Dec 2004 15:56:28 -0800

I'm investigating using lisp for numerical code, looking at ease of
development and runtime speed.  I could use a little help, though, in
getting the optimizations right.


My first test was a little Monte Carlo.  In xlispstat this is:

(defun callpay (s k)
  (- (max s k)
     k))

(defun bsmc (s0 sigma mat k r steps trials &optional (debug ()) (end 0))
"(BSMC S0 SIGMA MAT K R STEPS TRIALS &optional (debug ())) Returns
Black Scholes option price for an option with maturity MAT & strike K,
on a stock with initial price S0 and volatility SIGMA.  Risk free rate
is R.  Monte-carlo of 2*TRIALS paths of STEPS timesteps each.  Turn on
DEBUG to dump intermediate values.  Uses antithetic sampling."
  (let* ((dt (/ mat steps))
         (drift (* (- r (* .5 sigma sigma)) dt))
         (vol (* sigma (sqrt dt)))
         (sumval 0))
    (dotimes (i trials)
      (let ((path (normal-rand steps)))
        (incf sumval
              (+ 
               (callpay (* s0 (exp (sum (+ drift (* vol path)))))
                        k)
               (callpay (* s0 (exp (sum (+ drift (* vol (- path))))))
                        k)))))
    (* (exp (* -1 r mat))
       (/ sumval (* 2 trials)))))

It's especially simple because (normal-rand n) creates a list of N
normally distributed random numbers, and all the operators in
xlispstat are vectorized, in the sense that a scalar plus a vector (or
a list) adds the scalar to all elements of the vector, the sum of two
vectors is done as vector addition, the product of two vectors is done
componentwise, etc.

Presumably this is also slow because it's consing and throwing away
lists of boxed doubles.

None the less, it runs in about the same time in xlispstat as the
straight forward translation to common lisp (after compilation):

;;; Straight-forward conversion of xlispstat version to cmucl.

(defconst SQRT2OVERE (sqrt (/ 2 (exp 1))))

(defun one-normal-rand ()
  (let* ((u (random 1d0))
         (u1 (random 1d0))
         (v (* SQRT2OVERE (- (* 2d0  u1) 1d0)))
         (x (/ v u))
         (y (/ (* x x) 4d0)))
    (if (and (> y (- 1d0 u))
             (> y (- (log u))))
        (one-normal-rand)
      x)))

(defun normal-rand (s)
  (do ((r (list (one-normal-rand)) (cons (one-normal-rand) r))
       (s s (- s 1)))
      ((<= s 1) r)))

(defun callpay (s k)
  (- (max s k)
     k))

(defun scalar-vector-op (op s v)
  (mapcar (lambda (e) (funcall op s e)) v))

(defun vector-op (op v)
  (mapcar (lambda (e) (funcall op e)) v))

(defun sum (v)
  (reduce #'+ v))

(defun terminal-val (s0 drift vol path)
  (* s0 (exp (sum (scalar-vector-op #'+
                                    drift
                                    (scalar-vector-op #'*
                                                      vol
                                                      path))))))

(defun bsmc (s0 sigma mat k r steps trials)
"(BSMC S0 SIGMA MAT K R STEPS TRIALS &optional (debug ())) Returns
Black Scholes option price for an option with maturity MAT & strike K,
on a stock with initial price S0 and volatility SIGMA.  Risk free rate
is R.  Monte-carlo of 2*TRIALS paths of STEPS timesteps each.  Turn on
DEBUG to dump intermediate values.  Uses antithetic sampling."
  (let* ((dt (/ mat steps))
         (drift (* (- r (* .5d0 sigma sigma)) dt))
         (vol (* sigma (sqrt dt)))
         (sumval 0))
    (dotimes (i trials)
      (let ((path (normal-rand steps)))
        (incf sumval
              (+ 
               (callpay (terminal-val s0 drift vol path)
                        k)
               (callpay (terminal-val s0 drift vol (vector-op #'- path))
                        k)))))
    (* (exp (* -1 r mat))
       (/ sumval trials 2d0))))


However, I'd like to get it to run about 10x faster, to be competitive
with the C version.  Changing the lists to vectors & doing the work in
place instead of creating new lists each time didn't help much.
Presumably, because although it wasn't throwing away the arrays, it
was still boxing and unboxing intermediate floating point results.

Finally I tried writing it in straight C style, threw in lots of
declarations, and managed to get it within a factor of 3 of C, but I'm
still not getting nearly the speed I'd like to get, and still can't
get it to get rid of the boxing.  Here's this version:

;;; rewrite, trying to get cmucl to unpack everything.
;;; Iteration done directly instead of by vectors.

(eval-when (compile)
  (proclaim '(optimize (speed 3) (safety 1) (space 0) (debug 0))))

(defconstant SQRT2OVERE (sqrt (/ 2d0 (exp 1d0))))

(eval-when (compile)
  (proclaim '(type double-float SQRT2OVERE)))

(defun one-normal-rand ()
  (declare (values (double-float)))
  (let* ((u (random 1d0))
         (u1 (random 1d0)))
    (declare (type (double-float (0d0) 1d0) u u1))
    (let* ((v (* SQRT2OVERE (- (* 2d0  u1) 1d0)))
           (x (/ v u))
           (y (/ (* x x) 4d0)))
      (if (and (> y (- 1d0 u))
               (> y (- (log u))))
          (one-normal-rand)
        x))))

(defun callpay (s k)
   (declare (type double-float s k)
           (values double-float))
  (- (max s k)
     k))

(defun bsmc (s0 sigma mat k r steps trials)
   (declare (type (double-float (0d0) (1d100)) s0 sigma mat r)
           (type (integer 1 1000000) steps trials))
"(BSMC S0 SIGMA MAT K R STEPS TRIALS &optional (debug ())) Returns
Black Scholes option price for an option with maturity MAT & strike K,
on a stock with initial price S0 and volatility SIGMA.  Risk free rate
is R.  Monte-carlo of 2*TRIALS paths of STEPS timesteps each.  Turn on
DEBUG to dump intermediate values.  Uses antithetic sampling."
  (let* ((dt (/ mat steps))
         (drift (* (- r (* .5d0 sigma sigma)) mat))
         (vol (* sigma (sqrt dt)))
         (sumval 0d0))
    (dotimes (i trials)
      (let ((s (the double-float 0d0)))
        (dotimes (j steps)
          (incf s (the double-float (one-normal-rand))))
        (incf sumval
              (+ 
               (callpay (* s0 (exp (+ drift (* vol s))))
                        k)
               (callpay (* s0 (exp (+ drift (* vol (- s)))))
                        k)))))
    (* (exp (* -1d0 r mat))
       (/ sumval trials 2d0))))

The compiler won't unbox the result of (one-normal-rand) and
(callpay), and I still don't know why.  Is the problem with
(one-normal-rand) that the compiler doesn't know that (random 1d0) is
going to return a double-float between 0 & 1, and hence ignores my
declarations?  But what about (callpay ...)?  If I can't get something
like one-normal-rand to do the calc unboxed & return an unboxed
double-float, there's not much hope in getting cmucl to run as fast as
a similar C version.

Thanks,

-- 
Harvey Stein
Bloomberg LP
[EMAIL PROTECTED]

optimizing numerical code.

Reply via email to