[Openpbx-dev] Hashing for OpenPBX.org

bkml Mon, 14 Aug 2006 00:59:15 -0700

Hi

everytime I see all those strcmp()'s and strcasecmp()'s in the MaxSpecter sources I am asking myself:

"Has this been written by a 12 year old on a Commodore VIC-20 toy in1970's BASIC?"

Folks, this is so incredibly illiterate and unprofessional, it hasgot to be thrown out.

So, I have made a start towards removing all the strcmp()'s andstrcasecmp()'s in pbx.c, providing a hashing API and hash codes forall the keywords used in there. The source is available here ...


http://www.sunrise-tel.com/benjk/OpenPBX/opbx_hash.h
http://www.sunrise-tel.com/benjk/OpenPBX/opbx_hash.c
http://www.sunrise-tel.com/benjk/OpenPBX/opbx_keywords.h

Eventually, everything should be hashed, contexts, extensions,variables etc, but for starters I focused on apps and keywords.

Including the above headers in pbx.c allows the followingimprovements to be made:

In pbx.c the structs opbx_app and opbx_builtin need an additionalmember ...


unsigned int hash;

Then, each builtin needs to have its hash code added, like so ...

{ "AbsoluteTimeout", OPBX_KEYWORD_ABSOLUTETIMEOUT,pbx_builtin_atimeout, ...


Function pbx_findapp() needs ...

unsigned int hash = opbx_hash_string(app);

alternatively, ...

unsigned int hash = opbx_hash_string_toupper(app);

... in case you want to retain case insensitivity. Note, that this isslightly more expensive, computationally.


Function pbx_findapp() also needs this ...

while(tmp) {
                // The proper way ...
                if (hash == tmp->hash)
                        break;
                // The silly way ...
                /*
                if (!strcasecmp(tmp->name, app))
                        break;
                */
                tmp = tmp->next;
        }

and this illustrates very well how hashes save many CPU cycles:

First, there is no function call to check the application name for amatch in every turn of the loop. Context switches are ratherexpensive and we avoid them altogether by not calling strcasecmp()anymore. Secondly, instead of one test per character in theapplication name, we now have only one single test per applicationentry to test the hash code.

The result is that application search performance becomes linearinstead of logarithmic. One integer comparison per registeredapplication, instead of one context switch plus average number ofcharacters in an application name times number of registeredapplications.

Someone appears to have been aware of the impact on performance asthere is a comment at the top of pbx.c:


/*
* I M P O R T A N T :
*

* The speed of extension handling will likely be among the mostimportant* aspects of this PBX. The switching scheme as it exists right nowisn't* terribly bad (it's O(N+M), where N is the # of extensions and M isthe avg #

* of priorities, but a constant search time here would be great ;-)
*
*/

For some reason though that someone didn't seem to have done anythingabout it. Maybe their skillset was insufficient, but that's not anyexcuse for the rest of us who have the skills not to be doinganything about it.

Now, in order to get the hash codes for arbitrary applications intothe opbx_app struct in the first place, we also need to add some codeto function opbx_register_application() ...


unsigned int hash = opbx_hash_string(app);

or, again, for retaining case insensitivity ...

unsigned int hash = opbx_hash_string_toupper(app);

Then, ...

        while(tmp) {
                if (hash == tmp->hash) {

// replaced code: (!strcasecmp(app, tmp->name)) {


and ...

        if (tmp) {
                memset(tmp, 0, sizeof(struct opbx_app));
                strncpy(tmp->name, app, sizeof(tmp->name)-1);
                tmp->hash = hash;

Function opbx_unregister_application() should also be modifiedaccordingly.



Finally, we can replace the strcmp()'s for builtin keyword checks ...

if (c && (hash == OPBX_KEYWORD_CALLERIDNUM)) {
                // replaced code: (c && !strcmp(var, "CALLERIDNUM")) {

etc etc

I haven't made any patchfile yet, because there is the issue of casesensitivity about which I feel there needs to be a discussion for aconsensus to be reached.

Some of the checks at present involve strcasecmp() which means thetested identifiers are case insensitive. My personal preference isthat all identifiers should be case sensitive. This is alsocomputationally more efficient, since each character will need to beconverted to its uppercase equivalent within the hash function.

However, if we settle for case sensitive identifiers, then it maybreak some people's dialplans. Consider for example ...


exten => s,1,noop()

instead of

exten => s,1,NoOp()

where the lowercase 'noop' would no longer be recognised.

Thus, as a next step I suggest we discuss the issue of casesensitivity and come to a consensus. After that I will make a patchfile for pbx.c conforming to the consensus.

The next step then will be to convert more tests to the hash codebased system.

Eventually, I envisage to extend the hash library so it can storearbitrary strings in a hash table from where they can be retrievedmuch faster than the current method using linear searches on linkedlists and arrays.


rgds
benjk

___________________________________________________________All New Yahoo! Mail Tired of [EMAIL PROTECTED]@! come-ons? Let our SpamGuard protect you. http://uk.docs.yahoo.com/nowyoucan.html

_______________________________________________
Openpbx-dev mailing list
[email protected]
http://lists.openpbx.org/mailman/listinfo/openpbx-dev

[Openpbx-dev] Hashing for OpenPBX.org

Reply via email to