Re: [ApacheDS] Default partition design ideas (was: Re: [ApacheDS] Going to need to implement a splay tree)

Emmanuel Lecharny Sat, 02 Feb 2008 00:54:59 -0800

Howard Chu wrote:

Alex Karasulu wrote:
    Those long must be fetched
    quickly, as we will always get an entry by its ID. Using a BTree for
    that is time consuming, as fetching an entry will be done in O(log(N)).
You're absolutely right, a hash would be much better. We don't need to
sort the IDs.
Way back in the OpenLDAP 2.1 days we used hashes for our indexing in back-bdb. But we found that B-Trees still performed better, even though index lookups have nearly zero locality of reference. The problem is with large DBs, the hash tables grow too large to fit entirely in cache. Once the table grows past a certain size, you can no longer directly reference the records; there's a lot of expensive paging in/out that needs to be done. With a B-Tree, the number of internal pages in the tree is still very small relative to the total number of data pages, so you get a lot of cache reuse referencing those pages. So we switched everything to use B-Trees in OpenLDAP 2.2...
Hashing is faster *in theory*, but in practice it loses out.

That's a very valid point. The issue is how many IDs can we store in a Hash table in memory? Assuming that you have a correct hash method, around 30% of the table will be empty, and the average number of lookups will be around 2. A DN is a pretty fat object (I would say around fifty chars), so storing a million of those guys needs 200 Mbytes of memory. Pretty heavy.
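To make that estimate concrete, here is a rough sketch of the arithmetic in Java. Every size in it is an assumption for illustration, not something measured on ApacheDS: 2 bytes per char, about 80 bytes of per-entry overhead (object headers, the key, the table node), one 8-byte reference per table slot, and a 0.7 load factor to match the 30% of empty slots. The class and method names are hypothetical.

```java
// Back-of-the-envelope estimate only. All per-object sizes are assumptions,
// not measurements from ApacheDS or from any particular JVM.
final class DnTableEstimate {

    /**
     * Estimates the bytes needed to keep `entries` DNs in an in-memory hash table.
     *
     * @param entries          number of DNs stored
     * @param dnChars          assumed average DN length in characters
     * @param bytesPerChar     assumed bytes per character
     * @param perEntryOverhead assumed fixed overhead per stored DN, in bytes
     * @param loadFactor       fraction of table slots in use (0.7 means 30% empty)
     */
    static long estimateBytes(long entries, int dnChars, int bytesPerChar,
                              int perEntryOverhead, double loadFactor) {
        long perDn = (long) dnChars * bytesPerChar + perEntryOverhead;
        long slots = (long) Math.ceil(entries / loadFactor);
        long slotBytes = slots * 8; // one 8-byte reference per slot, used or empty
        return entries * perDn + slotBytes;
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(1_000_000L, 50, 2, 80, 0.7);
        System.out.println(bytes + " bytes");
    }
}
```

With these guesses, a million fifty-char DNs come out a little over 190 million bytes, which is in line with the 200 Mbytes figure. Changing the overhead assumption moves the result a lot, so treat it as an order of magnitude.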

I would draw a line somewhere in the middle, depending on the available memory and the number of DNs we want to manage. Obviously, at some point a BTree will be faster, just because you can cache the BTree intermediate nodes, leading to less paging than with a HashTable. But for a small number of entries (say, 100K entries), with 512 Mbytes of memory, it might be a good idea to use a Hash instead.
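The caching argument can be checked with a small calculation: count how many leaf pages and how many internal pages a BTree needs for a given number of keys. This is only a sketch under simplifying assumptions (every page is full and holds the same number of entries, called `fanout` here); the class and method names are hypothetical and this is not how any particular BTree implementation lays out its pages.

```java
// Simplified model: every page is full and holds `fanout` entries.
// Real BTree pages are usually only partly full, so real counts are higher.
final class BTreePages {

    /** Returns {leafPages, internalPages} for `keys` keys and `fanout` entries per page. */
    static long[] pageCounts(long keys, int fanout) {
        if (keys < 1 || fanout < 2) {
            throw new IllegalArgumentException("need keys >= 1 and fanout >= 2");
        }
        long leaves = (keys + fanout - 1) / fanout; // ceiling division
        long internal = 0;
        long level = leaves;
        // Each level above the leaves needs one page per `fanout` children,
        // until a single root page remains.
        while (level > 1) {
            level = (level + fanout - 1) / fanout;
            internal += level;
        }
        return new long[] { leaves, internal };
    }

    public static void main(String[] args) {
        long[] counts = pageCounts(1_000_000L, 100);
        System.out.println("leaf pages: " + counts[0] + ", internal pages: " + counts[1]);
    }
}
```

For a million keys and an assumed 100 entries per page, that gives 10,000 leaf pages but only 101 internal pages, about 1% of the tree. Keeping that 1% in cache means a lookup pays for at most one page fetched from disk, whereas a hash table that does not fit in memory has no such small hot set.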

The ultimate server would weigh those parameters to pick the correct data structure.


Let's be pragmatic:

- Using BTrees for every element means handling one single kind of data structure instead of two: potentially fewer bugs.
- We won't have to face the performance slowdown Howard is mentioning if we keep BTrees to manage DNs, even if it can be slower when handling a few DNs. It's more important to have a scalable server.
- One API to play with is easier than two.
- A HashTable can degenerate if the hash function is not correctly chosen.
- BTrees use memory efficiently; a hash wastes 30% of it with empty slots.
- If the server starts to swap because the hash is not in memory, then, as the key distribution is random, performance will really fall by an order of magnitude (maybe more).
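To illustrate the point about a badly chosen hash function, here is a small sketch that measures the longest collision chain for a set of made-up DNs under two hash functions. The DN pattern, the bucket count, and the class and method names are all hypothetical; the "bad" hash (the string length) is deliberately extreme to make the effect obvious.

```java
// Illustration only: sample DNs and both hash functions are made up for the demo.
final class HashSkewDemo {

    /** Returns the size of the fullest bucket when `keys` are spread over `buckets` slots. */
    static int longestChain(String[] keys, int buckets,
                            java.util.function.ToIntFunction<String> hash) {
        int[] counts = new int[buckets];
        int longest = 0;
        for (String key : keys) {
            int slot = Math.floorMod(hash.applyAsInt(key), buckets);
            counts[slot]++;
            longest = Math.max(longest, counts[slot]);
        }
        return longest;
    }

    /** Builds n DNs of identical length (n should stay below 10,000). */
    static String[] sampleDns(int n) {
        String[] dns = new String[n];
        for (int i = 0; i < n; i++) {
            dns[i] = String.format("uid=u%04d,ou=people,dc=example,dc=com", i);
        }
        return dns;
    }

    public static void main(String[] args) {
        String[] dns = sampleDns(1000);
        // Bad hash: every DN has the same length, so all keys share one bucket.
        System.out.println("length hash:   " + longestChain(dns, 1024, String::length));
        // Reasonable hash: the JDK's String.hashCode spreads the keys out.
        System.out.println("String.hashCode: " + longestChain(dns, 1024, String::hashCode));
    }
}
```

With the length-based hash, all the DNs land in a single bucket and every lookup becomes a linear scan, which is the degenerate case mentioned in the list above.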

So I would say: thanks Howard for this very interesting insight! F*ck the hashtable ;)


--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org
