Hi Stipe, this is really complex patch for the hash issue, but it could be made easier and almost the same speed. I'm not really convenient that we need to apply this complex patch...
Here are my simple tests and patch attached: Command used: time ./test/test_dict -- old 2012-08-07 14:47:49 [57291] [0] DEBUG: Dict populate phase. 2012-08-07 14:48:16 [57291] [0] INFO: ok, got 200000 entries in dict1. 2012-08-07 14:48:16 [57291] [0] INFO: ok, got 200000 entries in dict2. 2012-08-07 14:48:16 [57291] [0] DEBUG: Dict lookup phase. real 0m41.842s user 0m40.680s sys 0m1.078s -- sedgewick (my patch) 2012-08-07 14:47:01 [57016] [0] DEBUG: Dict populate phase. 2012-08-07 14:47:03 [57016] [0] INFO: ok, got 200000 entries in dict1. 2012-08-07 14:47:03 [57016] [0] INFO: ok, got 200000 entries in dict2. 2012-08-07 14:47:03 [57016] [0] DEBUG: Dict lookup phase. real 0m4.160s user 0m2.998s sys 0m0.977s real 0m3.797s user 0m2.738s sys 0m1.015s real 0m3.732s user 0m2.767s sys 0m0.938s -- Bob Jenkins 2012-08-07 14:50:32 [58470] [0] DEBUG: Dict populate phase. 2012-08-07 14:50:34 [58470] [0] INFO: ok, got 200000 entries in dict1. 2012-08-07 14:50:34 [58470] [0] INFO: ok, got 200000 entries in dict2. 2012-08-07 14:50:34 [58470] [0] DEBUG: Dict lookup phase. real 0m3.747s user 0m2.682s sys 0m0.919s real 0m3.661s user 0m2.692s sys 0m0.946s real 0m3.522s user 0m2.606s sys 0m0.908s Alex
hash_patch.diff
Description: Binary data
Am 06.08.2012 um 15:59 schrieb Stipe Tolj <st...@kannel.org>: > Hi list, > > this is a new Dict hash table implementation, which modifies the internal > hash algorithm we use in the gwlib/dict.[ch] module. > > The short summary is: It is round about 7,5-8 times faster then the existing > implementation! > > Now, the explanation part: > > Running the SVN trunk's test/test_dict on a machine I was benchmarking > against gives us the following runtime results: > > Average execution speed: 52,x sec. > OProfile's opreport for test_dict: > > CPU: CPU with timer interrupt, speed 0 MHz (estimated) > Profiling through timer interrupt > samples % image name symbol name > 19000 39.3954 test_dict item_has_key > 17563 36.4158 test_dict seems_valid_real > 4528 9.3885 test_dict gwlist_search > 3510 7.2778 test_dict octstr_compare > 456 0.9455 libc-2.11.3.so vfprintf > 251 0.5204 libc-2.11.3.so _int_malloc > 175 0.3629 libc-2.11.3.so random_r > 159 0.3297 libc-2.11.3.so __i686.get_pc_thunk.bx > 150 0.3110 test_dict octstr_len > 148 0.3069 test_dict octstr_get_char > 139 0.2882 libc-2.11.3.so random > > (BTW, this was on a VMware VM, so we used timer interrupts, but the relative > results on a real machine are the same.) > > So, what does it tell us? > > First of all, most CPU cycles are used for the item_has_key() function > internally, which is at the end of the day nothing more then a > octstr_compare(). We compare the hash's key Octstr here while searching for > the one we're looking for. > > The "real problem" about the current implementation is the hashing of the > keys. This happens within key_to_index() and we use this implementation: > > static long key_to_index(Dict *dict, Octstr *key) > { > return octstr_hash_key(key) % dict->size; > } > > where octstr_hash_key() accumulates the single byte values to one long and we > modulo against the dict size. > > That's ok, for small Dicts. But when the Dict size get's bigger and bigger, > we'll notice un-leveled congestion in the hash table. Which means for some > hash table buckets we'll have larger chained items in the bucket list, then > for others. > > This is also where the gwlist_search() hits us from the OProfile results. We > spend 10% by simply looking for the right item in the hash bucket list. > > So, the key to performance is having a well distributed hash function, which > make the overall average list len of the hash table bucket minimal, and well > distributed. > > The new implementation does this. It is based on the code of Bob Jenkins, > which was released to the public domain. I have incorporated the parts into > our existing gwlib/dict.[ch] structure, and also added some "debugging > capabilities" to the structure itself. > > Ok, you want some figures now? Here is the same test run of test_dict on the > same machine, under same conditions: > > Average execution speed: 6,7 sec. > OProfile's opreport for test_dict: > > CPU: CPU with timer interrupt, speed 0 MHz (estimated) > Profiling through timer interrupt > samples % image name symbol name > 568 14.2036 libc-2.11.3.so vfprintf > 310 7.7519 libc-2.11.3.so random > 273 6.8267 libc-2.11.3.so random_r > 193 4.8262 libc-2.11.3.so __i686.get_pc_thunk.bx > 191 4.7762 libc-2.11.3.so _int_malloc > 185 4.6262 test_dict lock > 179 4.4761 libpthread-2.11.3.so pthread_mutex_lock > 149 3.7259 test_dict seems_valid_real > 121 3.0258 libc-2.11.3.so _IO_default_xsputn > 87 2.1755 test_dict dict_put_normal > 81 2.0255 [vdso] (tgid:15020 range:0xb7781000-0xb7782000) [vdso] > (tgid:15020 range:0xb7781000-0xb7782000) > 81 2.0255 [vdso] (tgid:15024 range:0xb76fb000-0xb76fc000) [vdso] > (tgid:15024 range:0xb76fb000-0xb76fc000) > 76 1.9005 libc-2.11.3.so _itoa_word > 76 1.9005 libc-2.11.3.so strchrnul > 75 1.8755 libpthread-2.11.3.so __pthread_mutex_unlock_usercnt > 66 1.6504 test_dict octstr_compare > 63 1.5754 test_dict get_random_fd > 62 1.5504 libc-2.11.3.so _int_free > 62 1.5504 libc-2.11.3.so malloc > 61 1.5254 test_dict get_random_bytes > 59 1.4754 libc-2.11.3.so free > 53 1.3253 test_dict dict_destroy > 53 1.3253 test_dict gwlist_search > 51 1.2753 test_dict item_has_key > 50 1.2503 test_dict gwlist_extract_first > > For those that identify the parts, it speaks for itself. > > Please review and vote for commit. > > BTW, this code is in production for a huge majority of the commercial Kannel > SMPP v3.4 server (smppbox) users around the globe and has been proven to be > rock stable since introduction. > > Stipe > > -- > ------------------------------------------------------------------- > Kölner Landstrasse 419 > 40589 Düsseldorf, NRW, Germany > > tolj.org system architecture Kannel Software Foundation (KSF) > http://www.tolj.org/ http://www.kannel.org/ > > mailto:st_{at}_tolj.org mailto:stolj_{at}_kannel.org > ------------------------------------------------------------------- > <gateway-dict.diff>