Before running 8.out, increase $profsize to eliminate the `Prof error':
profsize=40000
I profiled this program and, after switching its output to Bprint, found
that allocating one node at a time was using much of the CPU time;
allocating a few thousand nodes at a time helps. I also had to fix
some bugs; compile it with
8c -FTVw tds.c
to see them. The next biggest cost was tail recursion to find the
correct tree node for insertion; converting the tail recursion to
explicit iteration reduced the run time noticeably. Given the large
number of nodes and the trivial nature of the operations on them, even
small coding changes can produce measurable improvements. Final times
are:
; time 8.out >/dev/null
3.84u 0.01s 3.90r 8.out
; cat /dev/cputype
PentiumII/Xeon 333
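
The two changes described above, batched node allocation and iterative insertion, can be sketched roughly as follows. This is not the code from tds.c, which is not reproduced here: the Node layout, the names newnode and insert, and the batch size Nchunk are all assumptions made for illustration, and the sketch is written in portable C rather than the Plan 9 dialect.

```c
#include <stdlib.h>
#include <string.h>

enum { Nchunk = 4096 };	/* nodes per malloc call; an assumed batch size */

typedef struct Node Node;
struct Node {
	const char *key;	/* not copied; the caller keeps it alive */
	int count;		/* number of times key was inserted */
	Node *left;
	Node *right;
};

static Node *pool;	/* next unused node in the current batch */
static int npool;	/* nodes still unused in the current batch */

/* Hand out nodes from a batch, calling malloc once per Nchunk nodes. */
Node*
newnode(const char *key)
{
	Node *n;

	if(npool == 0){
		pool = malloc(Nchunk * sizeof *pool);
		if(pool == NULL)
			return NULL;
		npool = Nchunk;
	}
	n = pool++;
	npool--;
	n->key = key;
	n->count = 1;
	n->left = NULL;
	n->right = NULL;
	return n;
}

/*
 * Iterative insert: instead of recursing into a subtree, walk a
 * pointer to the link (root, left or right) that should hold key.
 */
Node*
insert(Node **root, const char *key)
{
	Node **p;
	int c;

	for(p = root; *p != NULL; ){
		c = strcmp(key, (*p)->key);
		if(c == 0){
			(*p)->count++;
			return *p;
		}
		p = c < 0 ? &(*p)->left : &(*p)->right;
	}
	*p = newnode(key);
	return *p;	/* NULL if allocation failed */
}
```

Nodes handed out this way cannot be freed one at a time, which is acceptable for a program that builds its tree once and exits. Calling insert(&root, key) with root initially nil builds the tree; inserting a key that is already present only increments its count.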