Re: optimizing numerical code.

Janos G. Hajagos Tue, 14 Dec 2004 08:33:29 -0800

Harvey,

Check out this post from Bill Clementson on optimizing lisp code for
the Coyote Gulch problem:


http://home.comcast.net/~bc19191/blog/040308.html

Richard Fateman has an excellent paper on numeric programming in
lisp:

http://http.cs.berkeley.edu/~fateman/papers/lispfloat.ps

Your friend for optimizing numeric code is the disassemble
function. This will give you a clue of what the compiler is doing.

Janos

On Tue, Dec 14, 2004 at 09:03:04AM -0500, Harvey J. Stein wrote:
> 
> I'm investigating using lisp for numerical code, looking at ease of
> development and runtime speed.  I could use a little help, though, in
> getting the optimizations right.
> 
> My first test was a little Monte Carlo.  In xlispstat this is:
> 
> (defun callpay (s k)
>   (- (max s k)
>      k))
> 
> (defun bsmc (s0 sigma mat k r steps trials &optional (debug ()) (end 0))
> "(BSMC S0 SIGMA MAT K R STEPS TRIALS &optional (debug ())) Returns
> Black Scholes option price for an option with maturity MAT & strike K,
> on a stock with initial price S0 and volatility SIGMA.  Risk free rate
> is R.  Monte-carlo of 2*TRIALS paths of STEPS timesteps each.  Turn on
> DEBUG to dump intermediate values.  Uses antithetic sampling."
>   (let* ((dt (/ mat steps))
>          (drift (* (- r (* .5 sigma sigma)) dt))
>          (vol (* sigma (sqrt dt)))
>          (sumval 0))
>     (dotimes (i trials)
>       (let ((path (normal-rand steps)))
>         (incf sumval
>               (+ 
>                (callpay (* s0 (exp (sum (+ drift (* vol path)))))
>                         k)
>                (callpay (* s0 (exp (sum (+ drift (* vol (- path))))))
>                         k)))))
>     (* (exp (* -1 r mat))
>        (/ sumval (* 2 trials)))))
> 
> It's especially simple because (normal-rand n) creates a list of N
> normally distributed random numbers, and all the operators in
> xlispstat are vectorized, in the sense that a scalar plus a vector (or
> a list) adds the scalar to all elements of the vector, the sum of two
> vectors is done as vector addition, the product of two vectors is done
> componentwise, etc.
> 
> Presumably this is also slow because it's consing and throwing away
> lists of boxed doubles.
> 
> None the less, it runs in about the same time in xlispstat as the
> straight forward translation to common lisp (after compilation):
> 
> ;;; Straight-forward conversion of xlispstat version to cmucl.
> 
> (defconst SQRT2OVERE (sqrt (/ 2 (exp 1))))
> 
> (defun one-normal-rand ()
>   (let* ((u (random 1d0))
>          (u1 (random 1d0))
>          (v (* SQRT2OVERE (- (* 2d0  u1) 1d0)))
>          (x (/ v u))
>          (y (/ (* x x) 4d0)))
>     (if (and (> y (- 1d0 u))
>              (> y (- (log u))))
>         (one-normal-rand)
>       x)))
> 
> (defun normal-rand (s)
>   (do ((r (list (one-normal-rand)) (cons (one-normal-rand) r))
>        (s s (- s 1)))
>       ((<= s 1) r)))
> 
> (defun callpay (s k)
>   (- (max s k)
>      k))
> 
> (defun scalar-vector-op (op s v)
>   (mapcar (lambda (e) (funcall op s e)) v))
> 
> (defun vector-op (op v)
>   (mapcar (lambda (e) (funcall op e)) v))
> 
> (defun sum (v)
>   (reduce #'+ v))
> 
> (defun terminal-val (s0 drift vol path)
>   (* s0 (exp (sum (scalar-vector-op #'+
>                                     drift
>                                     (scalar-vector-op #'*
>                                                       vol
>                                                       path))))))
> 
> (defun bsmc (s0 sigma mat k r steps trials)
> "(BSMC S0 SIGMA MAT K R STEPS TRIALS &optional (debug ())) Returns
> Black Scholes option price for an option with maturity MAT & strike K,
> on a stock with initial price S0 and volatility SIGMA.  Risk free rate
> is R.  Monte-carlo of 2*TRIALS paths of STEPS timesteps each.  Turn on
> DEBUG to dump intermediate values.  Uses antithetic sampling."
>   (let* ((dt (/ mat steps))
>          (drift (* (- r (* .5d0 sigma sigma)) dt))
>          (vol (* sigma (sqrt dt)))
>          (sumval 0))
>     (dotimes (i trials)
>       (let ((path (normal-rand steps)))
>         (incf sumval
>               (+ 
>                (callpay (terminal-val s0 drift vol path)
>                         k)
>                (callpay (terminal-val s0 drift vol (vector-op #'- path))
>                         k)))))
>     (* (exp (* -1 r mat))
>        (/ sumval trials 2d0))))
> 
> 
> However, I'd like to get it to run about 10x faster, to be competitive
> with the C version.  Changing the lists to vectors & doing the work in
> place instead of creating new lists each time didn't help much.
> Presumably, because although it wasn't throwing away the arrays, it
> was still boxing and unboxing intermediate floating point results.
> 
> Finally I tried writing it in straight C style, threw in lots of
> declarations, and managed to get it within a factor of 3 of C, but I'm
> still not getting nearly the speed I'd like to get, and still can't
> get it to get rid of the boxing.  Here's this version:
> 
> ;;; rewrite, trying to get cmucl to unpack everything.
> ;;; Iteration done directly instead of by vectors.
> 
> (eval-when (compile)
>   (proclaim '(optimize (speed 3) (safety 1) (space 0) (debug 0))))
> 
> (defconstant SQRT2OVERE (sqrt (/ 2d0 (exp 1d0))))
> 
> (eval-when (compile)
>   (proclaim '(type double-float SQRT2OVERE)))
> 
> (defun one-normal-rand ()
>   (declare (values (double-float)))
>   (let* ((u (random 1d0))
>          (u1 (random 1d0)))
>     (declare (type (double-float (0d0) 1d0) u u1))
>     (let* ((v (* SQRT2OVERE (- (* 2d0  u1) 1d0)))
>            (x (/ v u))
>            (y (/ (* x x) 4d0)))
>       (if (and (> y (- 1d0 u))
>                (> y (- (log u))))
>           (one-normal-rand)
>         x))))
> 
> (defun callpay (s k)
>    (declare (type double-float s k)
>            (values double-float))
>   (- (max s k)
>      k))
> 
> (defun bsmc (s0 sigma mat k r steps trials)
>    (declare (type (double-float (0d0) (1d100)) s0 sigma mat r)
>            (type (integer 1 1000000) steps trials))
> "(BSMC S0 SIGMA MAT K R STEPS TRIALS &optional (debug ())) Returns
> Black Scholes option price for an option with maturity MAT & strike K,
> on a stock with initial price S0 and volatility SIGMA.  Risk free rate
> is R.  Monte-carlo of 2*TRIALS paths of STEPS timesteps each.  Turn on
> DEBUG to dump intermediate values.  Uses antithetic sampling."
>   (let* ((dt (/ mat steps))
>          (drift (* (- r (* .5d0 sigma sigma)) mat))
>          (vol (* sigma (sqrt dt)))
>          (sumval 0d0))
>     (dotimes (i trials)
>       (let ((s (the double-float 0d0)))
>         (dotimes (j steps)
>           (incf s (the double-float (one-normal-rand))))
>         (incf sumval
>               (+ 
>                (callpay (* s0 (exp (+ drift (* vol s))))
>                         k)
>                (callpay (* s0 (exp (+ drift (* vol (- s)))))
>                         k)))))
>     (* (exp (* -1d0 r mat))
>        (/ sumval trials 2d0))))
> 
> The compiler won't unbox the result of (one-normal-rand) and
> (callpay), and I still don't know why.  Is the problem with
> (one-normal-rand) that the compiler doesn't know that (random 1d0) is
> going to return a double-float between 0 & 1, and hence ignores my
> declarations?  But what about (callpay ...)?  If I can't get something
> like one-normal-rand to do the calc unboxed & return an unboxed
> double-float, there's not much hope in getting cmucl to run as fast as
> a similar C version.
> 
> Thanks,
> 
> -- 
> Harvey Stein
> Bloomberg LP
> [EMAIL PROTECTED]
> 

-- 

Homepage: <http://creativelimits.net>

Re: optimizing numerical code.

Reply via email to