Hi list,
this is a new Dict hash table implementation, which modifies the
internal hash algorithm we use in the gwlib/dict.[ch] module.
The short summary is: It is round about 7,5-8 times faster then the
existing implementation!
Now, the explanation part:
Running the SVN trunk's test/test_dict on a machine I was benchmarking
against gives us the following runtime results:
Average execution speed: 52,x sec.
OProfile's opreport for test_dict:
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples % image name symbol name
19000 39.3954 test_dict item_has_key
17563 36.4158 test_dict seems_valid_real
4528 9.3885 test_dict gwlist_search
3510 7.2778 test_dict octstr_compare
456 0.9455 libc-2.11.3.so vfprintf
251 0.5204 libc-2.11.3.so _int_malloc
175 0.3629 libc-2.11.3.so random_r
159 0.3297 libc-2.11.3.so __i686.get_pc_thunk.bx
150 0.3110 test_dict octstr_len
148 0.3069 test_dict octstr_get_char
139 0.2882 libc-2.11.3.so random
(BTW, this was on a VMware VM, so we used timer interrupts, but the
relative results on a real machine are the same.)
So, what does it tell us?
First of all, most CPU cycles are used for the item_has_key() function
internally, which is at the end of the day nothing more then a
octstr_compare(). We compare the hash's key Octstr here while searching
for the one we're looking for.
The "real problem" about the current implementation is the hashing of
the keys. This happens within key_to_index() and we use this implementation:
static long key_to_index(Dict *dict, Octstr *key)
{
return octstr_hash_key(key) % dict->size;
}
where octstr_hash_key() accumulates the single byte values to one long
and we modulo against the dict size.
That's ok, for small Dicts. But when the Dict size get's bigger and
bigger, we'll notice un-leveled congestion in the hash table. Which
means for some hash table buckets we'll have larger chained items in the
bucket list, then for others.
This is also where the gwlist_search() hits us from the OProfile
results. We spend 10% by simply looking for the right item in the hash
bucket list.
So, the key to performance is having a well distributed hash function,
which make the overall average list len of the hash table bucket
minimal, and well distributed.
The new implementation does this. It is based on the code of Bob
Jenkins, which was released to the public domain. I have incorporated
the parts into our existing gwlib/dict.[ch] structure, and also added
some "debugging capabilities" to the structure itself.
Ok, you want some figures now? Here is the same test run of test_dict on
the same machine, under same conditions:
Average execution speed: 6,7 sec.
OProfile's opreport for test_dict:
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples % image name symbol name
568 14.2036 libc-2.11.3.so vfprintf
310 7.7519 libc-2.11.3.so random
273 6.8267 libc-2.11.3.so random_r
193 4.8262 libc-2.11.3.so __i686.get_pc_thunk.bx
191 4.7762 libc-2.11.3.so _int_malloc
185 4.6262 test_dict lock
179 4.4761 libpthread-2.11.3.so pthread_mutex_lock
149 3.7259 test_dict seems_valid_real
121 3.0258 libc-2.11.3.so _IO_default_xsputn
87 2.1755 test_dict dict_put_normal
81 2.0255 [vdso] (tgid:15020 range:0xb7781000-0xb7782000) [vdso]
(tgid:15020 range:0xb7781000-0xb7782000)
81 2.0255 [vdso] (tgid:15024 range:0xb76fb000-0xb76fc000) [vdso]
(tgid:15024 range:0xb76fb000-0xb76fc000)
76 1.9005 libc-2.11.3.so _itoa_word
76 1.9005 libc-2.11.3.so strchrnul
75 1.8755 libpthread-2.11.3.so __pthread_mutex_unlock_usercnt
66 1.6504 test_dict octstr_compare
63 1.5754 test_dict get_random_fd
62 1.5504 libc-2.11.3.so _int_free
62 1.5504 libc-2.11.3.so malloc
61 1.5254 test_dict get_random_bytes
59 1.4754 libc-2.11.3.so free
53 1.3253 test_dict dict_destroy
53 1.3253 test_dict gwlist_search
51 1.2753 test_dict item_has_key
50 1.2503 test_dict gwlist_extract_first
For those that identify the parts, it speaks for itself.
Please review and vote for commit.
BTW, this code is in production for a huge majority of the commercial
Kannel SMPP v3.4 server (smppbox) users around the globe and has been
proven to be rock stable since introduction.
Stipe
--
-------------------------------------------------------------------
Kölner Landstrasse 419
40589 Düsseldorf, NRW, Germany
tolj.org system architecture Kannel Software Foundation (KSF)
http://www.tolj.org/ http://www.kannel.org/
mailto:st_{at}_tolj.org mailto:stolj_{at}_kannel.org
-------------------------------------------------------------------
Index: gwlib/dict.c
===================================================================
--- gwlib/dict.c (revision 4986)
+++ gwlib/dict.c (working copy)
@@ -61,8 +61,10 @@
* might be interesting to use a trie instead.
*
* Lars Wirzenius, based on code by Tuomas Luttinen
+ * Stipe Tolj, based on code by Bob Jenkins
*/
+#include <math.h>
#include "gwlib.h"
@@ -75,6 +77,7 @@
struct Item {
Octstr *key;
void *value;
+ int direct_hit;
};
@@ -85,6 +88,8 @@
item = gw_malloc(sizeof(*item));
item->key = octstr_duplicate(key);
item->value = value;
+ item->direct_hit = 0;
+
return item;
}
@@ -114,11 +119,20 @@
*/
struct Dict {
+ int bitsize;
List **tab;
long size;
long key_count;
void (*destroy_value)(void *);
Mutex *lock;
+
+ /* enhanced version for debug information */
+ long inserts;
+ long direct_hits;
+ long duplicates;
+ int (*put_true)(Dict *, Octstr *, void *);
+ int (*put)(Dict *, Octstr *, void *);
+ void *(*remove)(Dict *, Octstr *);
};
@@ -134,11 +148,341 @@
}
-static long key_to_index(Dict *dict, Octstr *key)
+/*************************************************************************
+ * The hashing function itself.
+ *
+ * The following hash function is a public domain contribution of
+ * Bob Jenkins. It can be found at the following URls:
+ *
+ * http://burtleburtle.net/bob/hash/doobs.html
+ * http://burtleburtle.net/bob/c/lookup3.c
+ */
+
+/*
+ * My best guess at if you are big-endian or little-endian. This may
+ * need adjustment.
+ */
+#if (defined(__BYTE_ORDER) && defined(__LITTLE_ENDIAN) && \
+ __BYTE_ORDER == __LITTLE_ENDIAN) || \
+ (defined(i386) || defined(__i386__) || defined(__i486__) || \
+ defined(__i586__) || defined(__i686__) || defined(vax) || defined(MIPSEL))
+# define HASH_LITTLE_ENDIAN 1
+# define HASH_BIG_ENDIAN 0
+#elif (defined(__BYTE_ORDER) && defined(__BIG_ENDIAN) && \
+ __BYTE_ORDER == __BIG_ENDIAN) || \
+ (defined(sparc) || defined(POWERPC) || defined(mc68000) || defined(sel))
+# define HASH_LITTLE_ENDIAN 0
+# define HASH_BIG_ENDIAN 1
+#else
+# define HASH_LITTLE_ENDIAN 0
+# define HASH_BIG_ENDIAN 0
+#endif
+
+
+#define hashsize(n) ((uint32_t)1<<(n))
+#define hashmask(n) (hashsize(n)-1)
+#define rot(x,k) (((x)<<(k)) | ((x)>>(32-(k))))
+
+/*
+-------------------------------------------------------------------------------
+mix -- mix 3 32-bit values reversibly.
+
+This is reversible, so any information in (a,b,c) before mix() is
+still in (a,b,c) after mix().
+
+If four pairs of (a,b,c) inputs are run through mix(), or through
+mix() in reverse, there are at least 32 bits of the output that
+are sometimes the same for one pair and different for another pair.
+This was tested for:
+* pairs that differed by one bit, by two bits, in any combination
+ of top bits of (a,b,c), or in any combination of bottom bits of
+ (a,b,c).
+* "differ" is defined as +, -, ^, or ~^. For + and -, I transformed
+ the output delta to a Gray code (a^(a>>1)) so a string of 1's (as
+ is commonly produced by subtraction) look like a single 1-bit
+ difference.
+* the base values were pseudorandom, all zero but one bit set, or
+ all zero plus a counter that starts at zero.
+
+Some k values for my "a-=c; a^=rot(c,k); c+=b;" arrangement that
+satisfy this are
+ 4 6 8 16 19 4
+ 9 15 3 18 27 15
+ 14 9 3 7 17 3
+Well, "9 15 3 18 27 15" didn't quite get 32 bits diffing
+for "differ" defined as + with a one-bit base and a two-bit delta. I
+used http://burtleburtle.net/bob/hash/avalanche.html to choose
+the operations, constants, and arrangements of the variables.
+
+This does not achieve avalanche. There are input bits of (a,b,c)
+that fail to affect some output bits of (a,b,c), especially of a. The
+most thoroughly mixed value is c, but it doesn't really even achieve
+avalanche in c.
+
+This allows some parallelism. Read-after-writes are good at doubling
+the number of bits affected, so the goal of mixing pulls in the opposite
+direction as the goal of parallelism. I did what I could. Rotates
+seem to cost as much as shifts on every machine I could lay my hands
+on, and rotates are much kinder to the top and bottom bits, so I used
+rotates.
+-------------------------------------------------------------------------------
+*/
+#define mix(a,b,c) \
+{ \
+ a -= c; a ^= rot(c, 4); c += b; \
+ b -= a; b ^= rot(a, 6); a += c; \
+ c -= b; c ^= rot(b, 8); b += a; \
+ a -= c; a ^= rot(c,16); c += b; \
+ b -= a; b ^= rot(a,19); a += c; \
+ c -= b; c ^= rot(b, 4); b += a; \
+}
+
+/*
+-------------------------------------------------------------------------------
+final -- final mixing of 3 32-bit values (a,b,c) into c
+
+Pairs of (a,b,c) values differing in only a few bits will usually
+produce values of c that look totally different. This was tested for
+* pairs that differed by one bit, by two bits, in any combination
+ of top bits of (a,b,c), or in any combination of bottom bits of
+ (a,b,c).
+* "differ" is defined as +, -, ^, or ~^. For + and -, I transformed
+ the output delta to a Gray code (a^(a>>1)) so a string of 1's (as
+ is commonly produced by subtraction) look like a single 1-bit
+ difference.
+* the base values were pseudorandom, all zero but one bit set, or
+ all zero plus a counter that starts at zero.
+
+These constants passed:
+ 14 11 25 16 4 14 24
+ 12 14 25 16 4 14 24
+and these came close:
+ 4 8 15 26 3 22 24
+ 10 8 15 26 3 22 24
+ 11 8 15 26 3 22 24
+-------------------------------------------------------------------------------
+*/
+#define final(a,b,c) \
+{ \
+ c ^= b; c -= rot(b,14); \
+ a ^= c; a -= rot(c,11); \
+ b ^= a; b -= rot(a,25); \
+ c ^= b; c -= rot(b,16); \
+ a ^= c; a -= rot(c,4); \
+ b ^= a; b -= rot(a,14); \
+ c ^= b; c -= rot(b,24); \
+}
+
+/*
+-------------------------------------------------------------------------------
+hashlittle() -- hash a variable-length key into a 32-bit value
+ k : the key (the unaligned variable-length array of bytes)
+ length : the length of the key, counting by bytes
+ initval : can be any 4-byte value
+Returns a 32-bit value. Every bit of the key affects every bit of
+the return value. Two keys differing by one or two bits will have
+totally different hash values.
+
+The best hash table sizes are powers of 2. There is no need to do
+mod a prime (mod is sooo slow!). If you need less than 32 bits,
+use a bitmask. For example, if you need only 10 bits, do
+ h = (h & hashmask(10));
+In which case, the hash table should have hashsize(10) elements.
+
+If you are hashing n strings (uint8_t **)k, do it like this:
+ for (i=0, h=0; i<n; ++i) h = hashlittle( k[i], len[i], h);
+
+By Bob Jenkins, 2006. [email protected]. You may use this
+code any way you wish, private, educational, or commercial. It's free.
+
+Use for hash table lookup, or anything where one collision in 2^^32 is
+acceptable. Do NOT use for cryptographic purposes.
+-------------------------------------------------------------------------------
+*/
+
+uint32_t hashlittle( const void *key, size_t length, uint32_t initval)
{
- return octstr_hash_key(key) % dict->size;
+ uint32_t a,b,c; /* internal state */
+ union { const void *ptr; size_t i; } u; /* needed for Mac Powerbook G4 */
+
+ /* Set up the internal state */
+ a = b = c = 0xdeadbeef + ((uint32_t)length) + initval;
+
+ u.ptr = key;
+ if (HASH_LITTLE_ENDIAN && ((u.i & 0x3) == 0)) {
+ const uint32_t *k = (const uint32_t *)key; /* read 32-bit chunks */
+#ifdef VALGRIND
+ const uint8_t *k8;
+#endif
+
+ /*------ all but last block: aligned reads and affect 32 bits of (a,b,c) */
+ while (length > 12)
+ {
+ a += k[0];
+ b += k[1];
+ c += k[2];
+ mix(a,b,c);
+ length -= 12;
+ k += 3;
+ }
+
+ /*----------------------------- handle the last (probably partial) block */
+ /*
+ * "k[2]&0xffffff" actually reads beyond the end of the string, but
+ * then masks off the part it's not allowed to read. Because the
+ * string is aligned, the masked-off tail is in the same word as the
+ * rest of the string. Every machine with memory protection I've seen
+ * does it on word boundaries, so is OK with this. But VALGRIND will
+ * still catch it and complain. The masking trick does make the hash
+ * noticably faster for short strings (like English words).
+ */
+#ifndef VALGRIND
+
+ switch(length)
+ {
+ case 12: c+=k[2]; b+=k[1]; a+=k[0]; break;
+ case 11: c+=k[2]&0xffffff; b+=k[1]; a+=k[0]; break;
+ case 10: c+=k[2]&0xffff; b+=k[1]; a+=k[0]; break;
+ case 9 : c+=k[2]&0xff; b+=k[1]; a+=k[0]; break;
+ case 8 : b+=k[1]; a+=k[0]; break;
+ case 7 : b+=k[1]&0xffffff; a+=k[0]; break;
+ case 6 : b+=k[1]&0xffff; a+=k[0]; break;
+ case 5 : b+=k[1]&0xff; a+=k[0]; break;
+ case 4 : a+=k[0]; break;
+ case 3 : a+=k[0]&0xffffff; break;
+ case 2 : a+=k[0]&0xffff; break;
+ case 1 : a+=k[0]&0xff; break;
+ case 0 : return c; /* zero length strings require no mixing */
+ }
+
+#else /* make valgrind happy */
+
+ k8 = (const uint8_t *)k;
+ switch(length)
+ {
+ case 12: c+=k[2]; b+=k[1]; a+=k[0]; break;
+ case 11: c+=((uint32_t)k8[10])<<16; /* fall through */
+ case 10: c+=((uint32_t)k8[9])<<8; /* fall through */
+ case 9 : c+=k8[8]; /* fall through */
+ case 8 : b+=k[1]; a+=k[0]; break;
+ case 7 : b+=((uint32_t)k8[6])<<16; /* fall through */
+ case 6 : b+=((uint32_t)k8[5])<<8; /* fall through */
+ case 5 : b+=k8[4]; /* fall through */
+ case 4 : a+=k[0]; break;
+ case 3 : a+=((uint32_t)k8[2])<<16; /* fall through */
+ case 2 : a+=((uint32_t)k8[1])<<8; /* fall through */
+ case 1 : a+=k8[0]; break;
+ case 0 : return c;
+ }
+
+#endif /* !valgrind */
+
+ } else if (HASH_LITTLE_ENDIAN && ((u.i & 0x1) == 0)) {
+ const uint16_t *k = (const uint16_t *)key; /* read 16-bit chunks */
+ const uint8_t *k8;
+
+ /*--------------- all but last block: aligned reads and different mixing */
+ while (length > 12)
+ {
+ a += k[0] + (((uint32_t)k[1])<<16);
+ b += k[2] + (((uint32_t)k[3])<<16);
+ c += k[4] + (((uint32_t)k[5])<<16);
+ mix(a,b,c);
+ length -= 12;
+ k += 6;
+ }
+
+ /*----------------------------- handle the last (probably partial) block */
+ k8 = (const uint8_t *)k;
+ switch(length)
+ {
+ case 12: c+=k[4]+(((uint32_t)k[5])<<16);
+ b+=k[2]+(((uint32_t)k[3])<<16);
+ a+=k[0]+(((uint32_t)k[1])<<16);
+ break;
+ case 11: c+=((uint32_t)k8[10])<<16; /* fall through */
+ case 10: c+=k[4];
+ b+=k[2]+(((uint32_t)k[3])<<16);
+ a+=k[0]+(((uint32_t)k[1])<<16);
+ break;
+ case 9 : c+=k8[8]; /* fall through */
+ case 8 : b+=k[2]+(((uint32_t)k[3])<<16);
+ a+=k[0]+(((uint32_t)k[1])<<16);
+ break;
+ case 7 : b+=((uint32_t)k8[6])<<16; /* fall through */
+ case 6 : b+=k[2];
+ a+=k[0]+(((uint32_t)k[1])<<16);
+ break;
+ case 5 : b+=k8[4]; /* fall through */
+ case 4 : a+=k[0]+(((uint32_t)k[1])<<16);
+ break;
+ case 3 : a+=((uint32_t)k8[2])<<16; /* fall through */
+ case 2 : a+=k[0];
+ break;
+ case 1 : a+=k8[0];
+ break;
+ case 0 : return c; /* zero length requires no mixing */
+ }
+
+ } else { /* need to read the key one byte at a time */
+ const uint8_t *k = (const uint8_t *)key;
+
+ /*--------------- all but the last block: affect some 32 bits of (a,b,c) */
+ while (length > 12)
+ {
+ a += k[0];
+ a += ((uint32_t)k[1])<<8;
+ a += ((uint32_t)k[2])<<16;
+ a += ((uint32_t)k[3])<<24;
+ b += k[4];
+ b += ((uint32_t)k[5])<<8;
+ b += ((uint32_t)k[6])<<16;
+ b += ((uint32_t)k[7])<<24;
+ c += k[8];
+ c += ((uint32_t)k[9])<<8;
+ c += ((uint32_t)k[10])<<16;
+ c += ((uint32_t)k[11])<<24;
+ mix(a,b,c);
+ length -= 12;
+ k += 12;
+ }
+
+ /*-------------------------------- last block: affect all 32 bits of (c) */
+ switch(length) /* all the case statements fall through */
+ {
+ case 12: c+=((uint32_t)k[11])<<24;
+ case 11: c+=((uint32_t)k[10])<<16;
+ case 10: c+=((uint32_t)k[9])<<8;
+ case 9 : c+=k[8];
+ case 8 : b+=((uint32_t)k[7])<<24;
+ case 7 : b+=((uint32_t)k[6])<<16;
+ case 6 : b+=((uint32_t)k[5])<<8;
+ case 5 : b+=k[4];
+ case 4 : a+=((uint32_t)k[3])<<24;
+ case 3 : a+=((uint32_t)k[2])<<16;
+ case 2 : a+=((uint32_t)k[1])<<8;
+ case 1 : a+=k[0];
+ break;
+ case 0 : return c;
+ }
+ }
+
+ final(a,b,c);
+ return c;
}
+
+static inline long key_to_index(Dict *dict, Octstr *key)
+{
+ return (hashlittle(octstr_get_cstr(key), octstr_len(key), 13)
+ & hashmask(dict->bitsize));
+}
+
+
+/*************************************************************************
+ * Other internal fuctions
+ */
+
static int handle_null_value(Dict *dict, Octstr *key, void *value)
{
if (value == NULL) {
@@ -151,7 +495,7 @@
return 0;
}
-static int dict_put_true(Dict *dict, Octstr *key, void *value)
+static int dict_put_true_normal(Dict *dict, Octstr *key, void *value)
{
Item *p;
long i;
@@ -162,15 +506,15 @@
i = key_to_index(dict, key);
if (dict->tab[i] == NULL) {
- dict->tab[i] = gwlist_create();
- p = NULL;
+ dict->tab[i] = gwlist_create();
+ p = NULL;
} else {
- p = gwlist_search(dict->tab[i], key, item_has_key);
+ p = gwlist_search(dict->tab[i], key, item_has_key);
}
if (p == NULL) {
p = item_create(key, value);
- gwlist_append(dict->tab[i], p);
+ gwlist_append(dict->tab[i], p);
dict->key_count++;
item_unique = 1;
} else {
@@ -184,90 +528,159 @@
return item_unique;
}
-/*
- * And finally, the public functions.
- */
-
-Dict *dict_create(long size_hint, void (*destroy_value)(void *))
+static int dict_put_true_debug(Dict *dict, Octstr *key, void *value)
{
- Dict *dict;
+ Item *p;
long i;
+ int item_unique;
+ int direct_hit = 0;
+
+ item_unique = 0;
+ lock(dict);
+ i = key_to_index(dict, key);
+
+ if (dict->tab[i] == NULL) {
+ dict->tab[i] = gwlist_create();
+ p = NULL;
+ dict->direct_hits++;
+ direct_hit = 1;
+ } else {
+ if (gwlist_len(dict->tab[i]) == 0) {
+ p = NULL;
+ dict->direct_hits++;
+ direct_hit = 1;
+ } else {
+ p = gwlist_search(dict->tab[i], key, item_has_key);
+ }
+ }
+
+ if (p == NULL) {
+ p = item_create(key, value);
+ p->direct_hit = direct_hit;
+ gwlist_append(dict->tab[i], p);
+ dict->key_count++;
+ item_unique = 1;
+ } else {
+ if (dict->destroy_value != NULL)
+ dict->destroy_value(value);
+ item_unique = 0;
+ dict->duplicates++;
+ }
- dict = gw_malloc(sizeof(*dict));
+ dict->inserts++;
- /*
- * Hash tables tend to work well until they are fill to about 50%.
- */
- dict->size = size_hint * 2;
+ unlock(dict);
- dict->tab = gw_malloc(sizeof(dict->tab[0]) * dict->size);
- for (i = 0; i < dict->size; ++i)
- dict->tab[i] = NULL;
- dict->lock = mutex_create();
- dict->destroy_value = destroy_value;
- dict->key_count = 0;
-
- return dict;
+ return item_unique;
}
-void dict_destroy(Dict *dict)
+static inline int dict_put_true(Dict *dict, Octstr *key, void *value)
{
+ return dict->put_true(dict, key, value);
+}
+
+
+inline int dict_put(Dict *dict, Octstr *key, void *value)
+{
+ return dict->put(dict, key, value);
+}
+
+
+/*
+ * And finally, the public functions.
+ */
+
+int dict_put_normal(Dict *dict, Octstr *key, void *value)
+{
long i;
Item *p;
-
- if (dict == NULL)
- return;
+ int item_unique;
- for (i = 0; i < dict->size; ++i) {
- if (dict->tab[i] == NULL)
- continue;
+ if (value == NULL) {
+ value = dict_remove(dict, key);
+ if (dict->destroy_value != NULL)
+ dict->destroy_value(value);
+ return -1;
+ }
- while ((p = gwlist_extract_first(dict->tab[i])) != NULL) {
- if (dict->destroy_value != NULL)
- dict->destroy_value(p->value);
- item_destroy(p);
- }
- gwlist_destroy(dict->tab[i], NULL);
+ lock(dict);
+ i = key_to_index(dict, key);
+ if (dict->tab[i] == NULL) {
+ dict->tab[i] = gwlist_create();
+ p = NULL;
+ } else
+ p = gwlist_search(dict->tab[i], key, item_has_key);
+
+ if (p == NULL) {
+ p = item_create(key, value);
+ gwlist_append(dict->tab[i], p);
+ dict->key_count++;
+ item_unique = 1;
+ } else {
+ if (dict->destroy_value != NULL)
+ dict->destroy_value(p->value);
+ p->value = value;
+ item_unique = 0;
}
- mutex_destroy(dict->lock);
- gw_free(dict->tab);
- gw_free(dict);
+ unlock(dict);
+
+ return item_unique;
}
-void dict_put(Dict *dict, Octstr *key, void *value)
+int dict_put_debug(Dict *dict, Octstr *key, void *value)
{
long i;
Item *p;
+ int item_unique;
+ int direct_hit = 0;
if (value == NULL) {
value = dict_remove(dict, key);
- if (dict->destroy_value != NULL)
- dict->destroy_value(value);
- return;
+ if (dict->destroy_value != NULL)
+ dict->destroy_value(value);
+ return -1;
}
lock(dict);
i = key_to_index(dict, key);
if (dict->tab[i] == NULL) {
- dict->tab[i] = gwlist_create();
- p = NULL;
- } else
- p = gwlist_search(dict->tab[i], key, item_has_key);
+ dict->tab[i] = gwlist_create();
+ p = NULL;
+ dict->direct_hits++;
+ direct_hit = 1;
+ } else {
+ if (gwlist_len(dict->tab[i]) == 0) {
+ p = NULL;
+ dict->direct_hits++;
+ direct_hit = 1;
+ } else {
+ p = gwlist_search(dict->tab[i], key, item_has_key);
+ }
+ }
+
if (p == NULL) {
- p = item_create(key, value);
- gwlist_append(dict->tab[i], p);
+ p = item_create(key, value);
+ p->direct_hit = direct_hit;
+ gwlist_append(dict->tab[i], p);
dict->key_count++;
+ item_unique = 1;
} else {
- if (dict->destroy_value != NULL)
- dict->destroy_value(p->value);
- p->value = value;
+ if (dict->destroy_value != NULL)
+ dict->destroy_value(p->value);
+ p->value = value;
+ dict->duplicates++;
+ item_unique = 0;
}
+ dict->inserts++;
unlock(dict);
+
+ return item_unique;
}
+
int dict_put_once(Dict *dict, Octstr *key, void *value)
{
int ret;
@@ -304,7 +717,7 @@
}
-void *dict_remove(Dict *dict, Octstr *key)
+void *dict_remove_normal(Dict *dict, Octstr *key)
{
long i;
Item *p;
@@ -317,21 +730,64 @@
list = NULL;
else
list = gwlist_extract_matching(dict->tab[i], key, item_has_key);
+
gw_assert(list == NULL || gwlist_len(list) == 1);
- if (list == NULL)
+
+ if (list == NULL) {
value = NULL;
- else {
- p = gwlist_get(list, 0);
- gwlist_destroy(list, NULL);
+ } else {
+ p = gwlist_get(list, 0);
+ gwlist_destroy(list, NULL);
value = p->value;
- item_destroy(p);
- dict->key_count--;
+ item_destroy(p);
+ dict->key_count--;
}
unlock(dict);
+
return value;
}
+void *dict_remove_debug(Dict *dict, Octstr *key)
+{
+ long i;
+ Item *p;
+ void *value;
+ List *list;
+
+ lock(dict);
+ i = key_to_index(dict, key);
+ if (dict->tab[i] == NULL)
+ list = NULL;
+ else
+ list = gwlist_extract_matching(dict->tab[i], key, item_has_key);
+
+ gw_assert(list == NULL || gwlist_len(list) == 1);
+
+ if (list == NULL) {
+ value = NULL;
+ } else {
+ p = gwlist_get(list, 0);
+ gwlist_destroy(list, NULL);
+ value = p->value;
+ if (p->direct_hit)
+ dict->direct_hits--;
+ item_destroy(p);
+ dict->key_count--;
+ dict->inserts--;
+ }
+ unlock(dict);
+
+ return value;
+}
+
+
+inline void *dict_remove(Dict *dict, Octstr *key)
+{
+ return dict->remove(dict, key);
+}
+
+
long dict_key_count(Dict *dict)
{
long result;
@@ -367,8 +823,223 @@
}
+void dict_enable_debug(Dict *dict)
+{
+ lock(dict);
+ dict->put_true = dict_put_true_debug;
+ dict->put = dict_put_debug;
+ dict->remove = dict_remove_debug;
+ unlock(dict);
+}
+Dict *dict_create(long size_hint, void (*destroy_value)(void *))
+{
+ Dict *dict;
+ long i;
+
+ dict = gw_malloc(sizeof(*dict));
+ /*
+ * Hash tables perform better if they are of power of 2 sized.
+ */
+ dict->bitsize = (int) ceil(log2((double) size_hint));
+ dict->size = (long) exp2((double) dict->bitsize);
+ dict->tab = gw_malloc(sizeof(dict->tab[0]) * dict->size);
+ for (i = 0; i < dict->size; ++i)
+ dict->tab[i] = NULL;
+ dict->lock = mutex_create();
+ dict->destroy_value = destroy_value;
+ dict->key_count = 0;
+
+ /* enhancements for debug information */
+ dict->inserts = 0;
+ dict->direct_hits = 0;
+ dict->duplicates = 0;
+ dict->put_true = dict_put_true_normal;
+ dict->put = dict_put_normal;
+ dict->remove = dict_remove_normal;
+
+ return dict;
+}
+
+void dict_destroy(Dict *dict)
+{
+ long i;
+ Item *p;
+
+ if (dict == NULL)
+ return;
+
+ for (i = 0; i < dict->size; ++i) {
+ if (dict->tab[i] == NULL)
+ continue;
+
+ while ((p = gwlist_extract_first(dict->tab[i])) != NULL) {
+ if (dict->destroy_value != NULL)
+ dict->destroy_value(p->value);
+ item_destroy(p);
+ }
+ gwlist_destroy(dict->tab[i], NULL);
+ }
+ mutex_destroy(dict->lock);
+ gw_free(dict->tab);
+ gw_free(dict);
+}
+
+
+long dict_traverse(Dict *dict, void (*func)(Octstr *, void *, void *), void
*data)
+{
+ Item *item;
+ long i, j, r = 0;
+
+ lock(dict);
+ for (i = 0; i < dict->size; ++i) {
+ if (dict->tab[i] == NULL)
+ continue;
+ for (j = 0; j < gwlist_len(dict->tab[i]); ++j) {
+ item = gwlist_get(dict->tab[i], j);
+ func(item->key, item->value, data);
+ r++;
+ }
+ }
+ unlock(dict);
+
+ return r;
+}
+
+
+long dict_traverse_sorted(Dict *dict, int (*cmp)(const void *, const void *),
+ void (*func)(Octstr *, void *, void *), void
*data)
+{
+ Item *item;
+ long i, j, r = 0;
+ List *l;
+
+ l = gwlist_create();
+ lock(dict);
+
+ /* We need to aggregate a list of all item elements first. */
+ for (i = 0; i < dict->size; ++i) {
+ if (dict->tab[i] == NULL)
+ continue;
+ for (j = 0; j < gwlist_len(dict->tab[i]); ++j) {
+ gwlist_append(l, gwlist_get(dict->tab[i], j));
+ }
+ }
+
+ /* Now we can sort the list. */
+ gwlist_sort(l, cmp);
+
+ /* And traverse the list. */
+ r = gwlist_len(l);
+ while ((item = gwlist_consume(l)) != NULL) {
+ func(item->key, item->value, data);
+ }
+
+ unlock(dict);
+ gwlist_destroy(l, NULL);
+
+ return r;
+}
+
+
+Dict *dict_duplicate(Dict *dict, void *(*duplicate_value)(void *))
+{
+ Item *item;
+ long i, j;
+ Dict *dup;
+
+ lock(dict);
+ dup = dict_create(dict->size, dict->destroy_value);
+ for (i = 0; i < dict->size; ++i) {
+ if (dict->tab[i] == NULL)
+ continue;
+ for (j = 0; j < gwlist_len(dict->tab[i]); ++j) {
+ item = gwlist_get(dict->tab[i], j);
+ dict_put(dup, item->key, duplicate_value(item->value));
+ }
+ }
+ unlock(dict);
+
+ return dup;
+}
+
+
+void dict_reset(Dict *dict)
+{
+ long i;
+ Item *p;
+
+ if (dict == NULL)
+ return;
+
+ lock(dict);
+ for (i = 0; i < dict->size; ++i) {
+ if (dict->tab[i] == NULL)
+ continue;
+
+ while ((p = gwlist_extract_first(dict->tab[i])) != NULL) {
+ if (dict->destroy_value != NULL)
+ dict->destroy_value(p->value);
+ item_destroy(p);
+ dict->key_count--;
+ }
+ }
+ unlock(dict);
+}
+
+
+double dict_hit_ratio(Dict *dict)
+{
+ double ret;
+
+ lock(dict);
+ ret = ((double) dict->direct_hits / (dict->inserts ? dict->inserts : 1));
+ unlock(dict);
+
+ return ret;
+}
+
+
+double dict_bucket_ratio(Dict *dict)
+{
+ long i, l;
+ double ret;
+
+ l = 0;
+
+ lock(dict);
+ for (i = 0; i < dict->size; ++i) {
+ if (dict->tab[i] == NULL)
+ continue;
+ l++;
+ }
+ ret = ((double) l / dict->size);
+ unlock(dict);
+
+ return ret;
+}
+
+
+double dict_avg_list_len(Dict *dict)
+{
+ long i;
+ long l, len;
+
+ l = len = 0;
+
+ lock(dict);
+ for (i = 0; i < dict->size; ++i) {
+ /* don't count in empty slots */
+ if (dict->tab[i] == NULL)
+ continue;
+ l++;
+ len += gwlist_len(dict->tab[i]);
+ }
+ unlock(dict);
+
+ return ((double) len / (l ? l : 1));
+}
Index: gwlib/dict.h
===================================================================
--- gwlib/dict.h (revision 4986)
+++ gwlib/dict.h (working copy)
@@ -62,6 +62,7 @@
* of it as an array indexed by octet strings.
*
* Lars Wirzenius
+ * Stipe Tolj
*/
#ifndef DICT_H
@@ -89,11 +90,26 @@
/*
+ * Create a Dict with the duplicate entries, by using the duplicate_value()
+ * function callback to duplicate the value entries.
+ */
+Dict *dict_duplicate(Dict *dict, void *(*duplicate_value)(void *));
+
+
+/*
+ * Resets all values within a Dict, leaving the Dict empty.
+ */
+void dict_reset(Dict *dict);
+
+
+/*
* Put a new value into a Dict. If the same key existed already, the
* old value is destroyed. If `value' is NULL, the old value is destroyed
* and the key is removed from the Dict.
+ * Returns 1 if a new value is put into the Dict, or 0 if the value has
+ * been replaced for the key, and -1 if the old value is destroyed.
*/
-void dict_put(Dict *dict, Octstr *key, void *value);
+int dict_put(Dict *dict, Octstr *key, void *value);
/*
* Put a new value into a Dict. Return error, if the same key existed all-
@@ -129,10 +145,47 @@
List *dict_keys(Dict *dict);
-#endif
+/*
+ * Traverse to all elements of a Dict, and execute the passed callback
+ * function with the key as first argument, a void pointer as data argument
+ * and a variadic argument list.
+ * Returns the number of Dict elements that have been traversed.
+ */
+long dict_traverse(Dict *dict, void (*func)(Octstr *, void *, void *), void
*data);
+/*
+ * Same as dict_traverse(), but ensures that the elements are sorted before
+ * the traversal. The sorting is done via quick sort, applied via a compare
+ * callback function that compares 2 elements of the overall elements list.
+ */
+long dict_traverse_sorted(Dict *dict, int (*cmp)(const void *, const void *),
+ void (*func)(Octstr *, void *, void *), void
*data);
+/*
+ * Return the ratio between direct hit inserts, which means the hashing
+ * provided an empty bucket to put the object into vs. the overall operations.
+ */
+double dict_hit_ratio(Dict *dict);
+/*
+ * Return the ratio between assigned and unassigned buckets in the hash table.
+ */
+double dict_bucket_ratio(Dict *dict);
+
+
+/*
+ * Return the average list len of buckets that have chained lists.
+ */
+double dict_avg_list_len(Dict *dict);
+
+
+/*
+ * Activates the Dict internal counters via a re-mapping of the function
pointers.
+ * Use this function after the dict_create() to enable debug counters.
+ */
+void dict_enable_debug(Dict *dict);
+
+#endif