Ken,

I wrote out a reply to your post, and just as I was ready to send it my
machine reset with a nice NT BSOD :(

Here's a similar rendition of my thoughts.

I wish I had talked with you and Alan before I started this project.
It would have saved me a bunch of time!

        >I'd have to argue with this: everyone and their sibling has at
        >one time written their own allocation system on top of the OS,
        >and it is a bog standard technique. The overhead shouldn't be
        >significant beyond the size of your own code. (Because most of
        >the time user code will only be triggering one of the
        >allocators.) The timewise overhead should be minimal,
        >_assuming_ that the OS doesn't use virtual (paged or swapped)
        >allocation. Which it doesn't, which is where we came in.

Writing any old allocator is pretty easy to accomplish when it comes
right down to it.  Writing an allocator that is both fast and space
efficient is much more of a challenge.  The biggest killer would be
page sizes, or rather varying page sizes, which is what developers
would want.

So, one makes all VMM page sizes the same, say 1K (this is how the
Win32 VMM works, except its page sizes are larger).  This makes page
allocation relatively fast because we can preallocate records and
partition them according to page size.  But we are limited to
coarse-grained allocations: each 100-byte allocation would require a
1K page to store it.  This is fast but storage inefficient.  So, the
next thing to do is create a heap that sits on top of these pages and
allows for finer granularity.  This indirection away from direct
database interaction removes the majority of the DmNewRecord calls.
But now we have to write the heap API all over again.  Intuition tells
me that this is going to be slower than the DmNewRecord call I'm
making now--I haven't tried it so I may be pleasantly surprised!
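
To make that concrete, here's a rough sketch in plain C of the two
layers I'm describing: a coarse 1K-page allocator with a trivial bump
heap sitting on top of it.  All the names (VmmPage, PAGE_SIZE,
vmm_alloc) are mine and made up for illustration, and the pages come
from malloc just to keep the sketch self-contained; the real thing
would get its pages from DmNewRecord.

#include <stdlib.h>

/* Illustrative names only (VmmPage, PAGE_SIZE, vmm_alloc are mine);
 * the real pages would come from DmNewRecord, not malloc. */
#define PAGE_SIZE 1024u

typedef struct VmmPage {
    struct VmmPage *next;   /* pages kept in a simple list            */
    unsigned        used;   /* bytes handed out from this page so far */
    unsigned char   data[PAGE_SIZE];
} VmmPage;

static VmmPage *pageList = NULL;

/* Coarse path: grab a whole 1K page.  Fast, but a 100-byte request on
 * its own page wastes over 900 bytes. */
static VmmPage *vmm_new_page(void)
{
    VmmPage *p = (VmmPage *)malloc(sizeof(VmmPage));
    if (p == NULL)
        return NULL;
    p->next  = pageList;
    p->used  = 0;
    pageList = p;
    return p;
}

/* Finer-grained path: a trivial bump heap on top of the pages.  This
 * is the indirection that avoids a DmNewRecord per small allocation,
 * at the price of rewriting a heap (no free list here, just to keep
 * the sketch short). */
void *vmm_alloc(unsigned size)
{
    VmmPage *p;
    void    *result;

    if (size == 0 || size > PAGE_SIZE)
        return NULL;

    /* first-fit search of the existing pages */
    for (p = pageList; p != NULL; p = p->next)
        if (PAGE_SIZE - p->used >= size)
            break;

    if (p == NULL && (p = vmm_new_page()) == NULL)
        return NULL;

    result   = p->data + p->used;
    p->used += size;
    return result;
}

A real version needs freeing, coalescing, and handle bookkeeping on
top of this, which is exactly the heap API I'd be rewriting.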

Another wrinkle with a custom allocator is the size of and access to
the supporting data structures.  Access to the memory needs to be
fast, but also must not consume a lot of resources.  For example, the
VMM hashes its handles because of the fast access hashing provides.
Yet hash tables generally waste a lot of space.  Now, my test client
allocates 50,000 handles at a time.  Without getting really creative
in the implementation of my hash table, I could be allocating many Ks
of up-front storage RAM just for the data structures that track those
50,000 handles!  That would be storage RAM needed just to use the VMM,
before any allocations!  Fortunately, the VMM needs less than 1K of
storage space when it runs with no allocations, and it grows and
shrinks on demand.  But the bottom line is that it becomes quite
tricky to walk the fine line between speed and storage size.  Careful
tuning becomes really important because two competing requirements are
at work here.
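
For what it's worth, here's the kind of structure I have in mind: a
chained hash table that starts tiny and doubles itself as handles are
added, so the up-front cost stays small and the 50,000-handle case
only pays for what it actually uses.  Everything here (HandleTable,
table_insert, the hash function, the 3/4 load factor) is illustrative,
not the actual VMM code.

#include <stdlib.h>

/* Maps a VMM handle (an integer here) to whatever bookkeeping record
 * the VMM keeps for it.  Starts tiny and doubles when the load factor
 * passes 3/4. */
typedef struct Entry {
    unsigned long  handle;
    void          *record;
    struct Entry  *next;
} Entry;

typedef struct {
    Entry        **buckets;
    unsigned long  nbuckets;
    unsigned long  count;
} HandleTable;

static int table_init(HandleTable *t)
{
    t->nbuckets = 16;       /* small up-front cost: 16 pointers */
    t->count    = 0;
    t->buckets  = (Entry **)calloc(t->nbuckets, sizeof(Entry *));
    return t->buckets != NULL;
}

static unsigned long hash(unsigned long h, unsigned long nbuckets)
{
    return (h * 2654435761ul) % nbuckets;   /* multiplicative hash */
}

static int table_grow(HandleTable *t)
{
    unsigned long  newsize = t->nbuckets * 2, i;
    Entry        **newb    = (Entry **)calloc(newsize, sizeof(Entry *));

    if (newb == NULL)
        return 0;
    /* rehash every entry into the larger bucket array */
    for (i = 0; i < t->nbuckets; i++) {
        Entry *e = t->buckets[i];
        while (e != NULL) {
            Entry        *nexte = e->next;
            unsigned long b     = hash(e->handle, newsize);
            e->next = newb[b];
            newb[b] = e;
            e = nexte;
        }
    }
    free(t->buckets);
    t->buckets  = newb;
    t->nbuckets = newsize;
    return 1;
}

static int table_insert(HandleTable *t, unsigned long handle, void *record)
{
    unsigned long b;
    Entry        *e;

    if (t->count * 4 >= t->nbuckets * 3)    /* load factor > 3/4 */
        if (!table_grow(t))
            return 0;

    e = (Entry *)malloc(sizeof(Entry));
    if (e == NULL)
        return 0;
    b             = hash(handle, t->nbuckets);
    e->handle     = handle;
    e->record     = record;
    e->next       = t->buckets[b];
    t->buckets[b] = e;
    t->count++;
    return 1;
}

The doubling keeps lookups fast on average, but every doubling is a
full rehash, which is exactly the speed-versus-space tension I'm
talking about.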

        >That's why my first thought would be to keep the active pool
        >data on the heap, swapping inactive pool data out into storage,
        >moving the MRU blocks into the active pool area. Without trying
        >it, I've no idea whether this would be workable.

This is definitely workable.  It all boils down to performance.  I
looked at the various paging schemes when I started the project two
weeks ago.  LRU was attractive because it performs very well when
memory accesses are spatially localized (IBM did some really good
research on this at one time).  Unfortunately, the LRU paging scheme
is also inefficient to implement (refer to Operating Systems by Harvey
Deitel, for example) because there is a lot of overhead in tracking
page ages (which we would also be doing with the MRU method).
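
To show where the overhead comes from, here's a bare-bones sketch of
strict LRU bookkeeping in C: a doubly-linked list where the page just
touched is moved to the head on every single access, and the eviction
victim is whatever ends up at the tail.  The names are made up; the
point is the per-access work.

#include <stddef.h>

/* Illustrative only.  Strict LRU means touching the bookkeeping on
 * every memory access: the page just used moves to the head of the
 * list, and eviction takes the tail. */
typedef struct Page {
    struct Page *prev, *next;
    /* ... page payload / record handle would live here ... */
} Page;

typedef struct {
    Page *head;   /* most recently used                      */
    Page *tail;   /* least recently used: eviction candidate */
} LruList;

/* Called on EVERY access to a page -- this is the cost of strict LRU. */
void lru_touch(LruList *l, Page *p)
{
    if (l->head == p)
        return;                       /* already most recent */

    /* unlink from its current position */
    if (p->prev) p->prev->next = p->next;
    if (p->next) p->next->prev = p->prev;
    if (l->tail == p) l->tail = p->prev;

    /* push to the front */
    p->prev = NULL;
    p->next = l->head;
    if (l->head) l->head->prev = p;
    l->head = p;
    if (l->tail == NULL) l->tail = p;
}

/* Eviction is cheap by comparison: just take the tail. */
Page *lru_victim(LruList *l)
{
    return l->tail;
}

It's the lru_touch on every access that worries me, not the eviction
itself.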

As far as keeping as much in the dynamic heap as possible, I
considered doing this too.  But then I realized that not all clients
will be using the VMM.  So if I hog all the dynamic RAM, I may keep
other applications from running (during a beam, or a Find, for
example).  I need to be as polite as possible to the system because
not all the apps are going to be using the VMM.

Now, if I hooked the OS functions (I haven't looked into this), then
this becomes a non-issue, because all OS calls to MemHandleX and
MemPtrX would be routed through the VMM.  In that case, we can afford
to be less polite.
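
I haven't checked the details, but I'd expect the hook to look roughly
like the following, assuming the usual SysGetTrapAddress /
SysSetTrapAddress patching mechanism and the sysTrapMemHandleNew
selector from the SDK headers.  VmmAllocHandle is a hypothetical VMM
entry point; treat the whole thing as a guess until I actually try it.

#include <PalmOS.h>

/* Hypothetical VMM entry point (not a real SDK call). */
extern MemHandle VmmAllocHandle(UInt32 size);

typedef MemHandle (*MemHandleNewProc)(UInt32 size);

static MemHandleNewProc gOldMemHandleNew;

/* Our replacement: route the request through the VMM instead of the
 * dynamic heap, falling back to the original call if the VMM declines. */
static MemHandle VmmMemHandleNew(UInt32 size)
{
    MemHandle h = VmmAllocHandle(size);
    if (h != NULL)
        return h;
    return gOldMemHandleNew(size);
}

static void InstallPatch(void)
{
    gOldMemHandleNew =
        (MemHandleNewProc)SysGetTrapAddress(sysTrapMemHandleNew);
    SysSetTrapAddress(sysTrapMemHandleNew, (void *)VmmMemHandleNew);
}

static void RemovePatch(void)
{
    SysSetTrapAddress(sysTrapMemHandleNew, (void *)gOldMemHandleNew);
}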

Great thoughts!

Mike

