In message <[EMAIL PROTECTED]>, David Schultz writes:

>You can find a somewhat more thorough comparison of malloc
>implementations at .

There are many problems with this paper, and my feeling is that it was
written with a very specific purpose in mind, although I haven't been
able to figure out just what that purpose was.

My own research while writing phkmalloc indicated that applications
are their own worst enemy when it comes to memory allocation, and
therefore the focus for phkmalloc was not to optimize for the
individual application, but rather for the system as a whole.  The
philosophy is that if the entire system gets more work done, the
individual applications must ipso facto on average also benefit.

I was never able to get the authors of this paper into a dialogue.
In particular, I tried to find out whether they just measured where
sbrk() was or whether they had tried to examine the real memory load
of the application.  The reference to correspondence with me in the
conclusion can at best be interpreted as "we didn't understand his
answer and/or didn't have, or didn't want to spend, more time."

It is to some extent remarkable that people, ten years after I
pointed out the fact in my paper on phkmalloc, still haven't
realized that VM systems don't behave like swapping systems did
and that memory allocators need to be aware of this.

The position of the break, as reported by sbrk(), is only a very weak
indicator of the live set of active pages needed for an application
to run in a VM system, yet people still keep measuring it as a
performance parameter.

Imagine a C-compiler:

        read in source for function foo()
        check on it a lot (allocating 4M)
        allocate the resulting info, 64 bytes.
        generate the assembly output
        free all temp memory

        read in source for function bar()

This will usually result in sbrk() sitting just above that 64 bytes
but there may be 4MB of untouched memory beneath it.
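
The scenario above can be sketched with a toy model of a break-based
heap.  This is bookkeeping only, not phkmalloc and not a real
allocator; the names toy_heap, toy_alloc, toy_free and compile_foo are
made up for the illustration, and the model assumes the break can only
drop when the topmost block is freed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a break-based heap (hypothetical, for illustration). */
struct toy_heap {
    size_t brk;   /* top of the heap, in bytes above the heap base */
    size_t live;  /* bytes currently allocated and in use */
};

/* Grow the heap by n bytes and return the offset of the new block. */
static size_t toy_alloc(struct toy_heap *h, size_t n)
{
    size_t off = h->brk;
    h->brk += n;
    h->live += n;
    return off;
}

/* Free a block.  The break only drops if the block is the topmost one. */
static void toy_free(struct toy_heap *h, size_t off, size_t n)
{
    h->live -= n;
    if (off + n == h->brk)
        h->brk = off;
}

/* The foo() step of the compiler example; returns the final break. */
static size_t compile_foo(struct toy_heap *h)
{
    size_t temp = toy_alloc(h, 4u << 20);  /* 4M of temporary data */
    size_t info = toy_alloc(h, 64);        /* the 64 bytes we keep */
    (void)info;
    toy_free(h, temp, 4u << 20);           /* info sits on top: break stays */
    return h->brk;
}
```

Starting from an empty heap, compile_foo() leaves the break at 4M plus
64 bytes while only 64 bytes are live, so the break overstates the
memory in use by the 4M of freed space sitting beneath it.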

If you run phkmalloc with the 'H' option, this will be madvise(2)'ed
to the system, but even without that, the fact that it is _truly_
untouched means that it will soon become a candidate for pageout,
and it will stay out until needed.
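
As a rough illustration of handing unused pages back with madvise(2),
here is a minimal sketch.  It does not show what phkmalloc does
internally: the function name scratch_and_release is made up, the
region comes from an anonymous mmap(2) rather than the heap, and the
choice of MADV_DONTNEED is an assumption (the exact effect of each
advice value differs between systems).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map len bytes of anonymous memory, dirty every page, then advise the
 * kernel that the contents are no longer needed.
 * Returns 0 on success, -1 on failure.
 */
static int scratch_and_release(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xA5, len);                     /* pages are now dirty */

    /* Assumed advice value; the address range stays mapped afterwards. */
    int rc = madvise(p, len, MADV_DONTNEED);

    /* Unmapped here only to keep the demo tidy; an allocator would
     * keep the range around for reuse. */
    munmap(p, len);
    return rc;
}
```

Calling scratch_and_release(4u << 20) mirrors the 4M of temporary data
in the compiler example.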

Many mallocs make the mistake of storing the free-list in the actual
free memory as a linked list.  This means that to free a bit of memory
you have to page in all the otherwise unused space, just to traverse
the list.  If physical RAM is limited, this is a bad plan.
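
To put rough numbers on this, the sketch below builds a free list
whose links live inside the free memory, one chunk per page, and
counts how many pages a traversal has to read.  For comparison it
computes how many pages an out-of-band bitmap with one bit per page
would occupy.  The 4096-byte page size, the function names and the
bitmap itself are assumptions for the illustration, not a description
of phkmalloc's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u  /* assumed page size for the model */

struct chunk { struct chunk *next; };

/*
 * Put one free chunk at the start of each of npages pages, link them
 * through the free memory itself, then walk the list.  Every step
 * reads a link stored in a different page.  Returns the number of
 * pages read, or 0 if the arena could not be allocated.
 */
static size_t intrusive_pages_touched(size_t npages)
{
    unsigned char *arena = malloc(npages * PAGE_SIZE);
    if (arena == NULL)
        return 0;

    struct chunk *head = NULL;
    for (size_t i = npages; i-- > 0; ) {
        struct chunk *c = (struct chunk *)(arena + i * PAGE_SIZE);
        c->next = head;
        head = c;
    }

    size_t touched = 0;
    for (struct chunk *c = head; c != NULL; c = c->next)
        touched++;               /* reading c->next needs c's page */

    free(arena);
    return touched;
}

/* Pages occupied by a separate bitmap holding one bit per page. */
static size_t bitmap_pages(size_t npages)
{
    size_t bytes = (npages + 7) / 8;
    return (bytes + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

For 1024 free pages the traversal reads all 1024 of them, while the
bitmap fits in a single page that is likely to be resident anyway.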

The _real_ way to benchmark a malloc implementation is therefore to
measure how it performs with different amounts of physical RAM
available, because then a bad malloc will suffer a terrible hit in
paging activity where a good malloc may not result in any significant
amount of paging.

In informal tests along those lines I have seen differences of up
to a factor five in wall-clock time.
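
One way to put numbers on paging activity in such a test is to read
the fault counters from getrusage(2) before and after the workload.
The sketch below is only a measuring harness: the workload is a
made-up stand-in, the names fault_counts, faults_now, workload and
measure are hypothetical, and how the amount of physical RAM gets
restricted is left outside the program.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/resource.h>

struct fault_counts { long minor; long major; };

static volatile unsigned char sink;  /* keeps the stores from being elided */

/* Snapshot of this process's page fault counters. */
static struct fault_counts faults_now(void)
{
    struct rusage ru;
    struct fault_counts fc = {0, 0};
    if (getrusage(RUSAGE_SELF, &ru) == 0) {
        fc.minor = ru.ru_minflt;   /* faults served without I/O */
        fc.major = ru.ru_majflt;   /* faults that required I/O */
    }
    return fc;
}

/* Stand-in workload: allocate, touch and free `rounds` blocks. */
static void workload(size_t rounds, size_t block)
{
    for (size_t i = 0; i < rounds; i++) {
        unsigned char *p = malloc(block);
        if (p == NULL)
            return;
        memset(p, 1, block);
        sink = p[block - 1];
        free(p);
    }
}

/* Fault counts attributable to one run of the workload. */
static struct fault_counts measure(size_t rounds, size_t block)
{
    struct fault_counts before = faults_now();
    workload(rounds, block);
    struct fault_counts after = faults_now();
    struct fault_counts d = { after.minor - before.minor,
                              after.major - before.major };
    return d;
}
```

Running the same measurement under different amounts of available RAM
and comparing the major fault counts, together with wall-clock time,
gives the kind of comparison described above.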

Of course, with 512MB RAM in a workstation, people will never notice
this and I'm just an old foghorn who doesn't know anything about
performance, but run a server where you actually have RAM pressure
and you might notice the difference.

Progress may be overrated, but it seldom goes too far...


Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED]         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.
