# Re: Bouncing on trampolines (so cool...)

Hi all,

I didn't bother with trampolines so far, but add some comments to

On Thu, May 16, 2013 at 12:00:20AM +0800, Samuel Dennis Borlongan wrote:
> ...
> http://stackoverflow.com/questions/12186672/how-can-i-prevent-my-ackerman-function-from-overflowing-the-stack
> ...

The bad thing with overflowing your stack is ... well, overflowing your
stack.

But you may as well overflow your heap, so nothing is gained.

As it is recommended for PicoLisp anyway to set 'ulimit -s unlimited',
the stack will use as much memory as is physically available, like the
heap.

The recursive version is usually faster than the one using an explicit
stack, but uses perhaps a little bit more memory because each recursion
level takes several stack lokations.

Let's take the standard Ackermann function

(de ack (M N)
(cond
((=0 M) (inc N))
((=0 N) (ack (dec M) 1))
(T (ack (dec M) (ack M (dec N)))) ) )

Eliminating tail-recursion gives

(de ack2 (M N)
(loop
(T (=0 M) (inc N))
(NIL (=0 N) (ack (dec M) (ack M (dec N))))
(dec 'M)
(one N) ) )

Both functions show roughly the same speed,

: (bench (ack 4 1))
249.465 sec
-> 65533

: (bench (ack2 4 1))
252.487 sec
-> 65533

though the less-recursive function is slightly slower (as expected).

If we go - as usually recommended - for the iterative version with an
explict stack,

(de ack3 (M N)
(for (Stack (cons M)  Stack)
(if (=0 (setq M (pop 'Stack)))
(inc 'N)
(push 'Stack (dec M))
(if (=0 N)
(one N)
(push 'Stack M)
(dec 'N) ) ) ) )

things get significantly slower:

: (bench (ack3 4 1))
312.028 sec
-> 65533

The fastest is - as expected and described - the "cheating"
version:

(de ack4 (M N)
(for (Stack (cons M)  Stack)
(cond
((=0 (setq M (pop 'Stack))) (inc 'N))
((= 1 M) (inc 'N 2))
((= 2 M) (setq N (+ 3 (* 2 N))))
(T
(push 'Stack (dec M))
(if (=0 N)
(one N)
(push 'Stack M)
(dec 'N) ) ) ) ) )

With that we get

: (bench (ack4 4 1))
0.000 sec
-> 65533

: (bench (length (ack4 4 2)))
0.981 sec
-> 19729

> (de ack/t (M N R S)
>   (default R '())
>   (if (> (length R) 0)
>     (unless S
>       (setq M (pop 'R) ) )
>     (push 'R M) )
>   (cond ((=0 (length R) ) (done N) )
>         ((= M 0) (continue ack/t M (+ N 1) R) )
>         ((= M 1) (continue ack/t M (+ N 2) R) )
>         ((= M 2) (continue ack/t M (+ (* N 2) 3) R) )
>         ((= N 0) (continue ack/t (- M 1) 1 R T) )
>         (T
>           (push 'R (- M 1) )
>           (continue ack/t M (- N 1) R T) ) ) )
>
> Honestly, I hate my Ackermann implementation, not because its performance
> is bad ( (trampoline ack/t 3 8000) takes 1.970 sec and 10.287 sec in 32-bit
> and ersatz, respectively)

Are you sure? Here I get

: (bench (length (ack4 3 8000)))
0.018 sec
-> 2410

and even on my humble Netbook with a slow Atom CPU (pil32):

: (bench (length (ack4 3 8000)))
0.213 sec
-> 2410

♪♫ Alex

Side note (though this doesn't really matter for the speed): It is never
wise to use 'length' to detect short lists. Better replace

(if (=0 (length R)) ..)    with  (if R ..)

or

(if (>= (length R) 3) ..)  with  (if (cddr R) ..)

to avoid possibly counting along long lists.
--
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe