Hi

everytime I see all those strcmp()'s and strcasecmp()'s in the Max Specter sources I am asking myself:

"Has this been written by a 12 year old on a Commodore VIC-20 toy in 1970's BASIC?"

Folks, this is so incredibly illiterate and unprofessional, it has got to be thrown out.

So, I have made a start towards removing all the strcmp()'s and strcasecmp()'s in pbx.c, providing a hashing API and hash codes for all the keywords used in there. The source is available here ...

http://www.sunrise-tel.com/benjk/OpenPBX/opbx_hash.h
http://www.sunrise-tel.com/benjk/OpenPBX/opbx_hash.c
http://www.sunrise-tel.com/benjk/OpenPBX/opbx_keywords.h

Eventually, everything should be hashed, contexts, extensions, variables etc, but for starters I focused on apps and keywords.

Including the above headers in pbx.c allows the following improvements to be made:

In pbx.c the structs opbx_app and opbx_builtin need an additional member ...

unsigned int hash;

Then, each builtin needs to have its hash code added, like so ...

{ "AbsoluteTimeout", OPBX_KEYWORD_ABSOLUTETIMEOUT, pbx_builtin_atimeout, ...

Function pbx_findapp() needs ...

unsigned int hash = opbx_hash_string(app);

alternatively, ...

unsigned int hash = opbx_hash_string_toupper(app);

... in case you want to retain case insensitivity. Note, that this is slightly more expensive, computationally.

Function pbx_findapp() also needs this ...

while(tmp) {
                // The proper way ...
                if (hash == tmp->hash)
                        break;
                // The silly way ...
                /*
                if (!strcasecmp(tmp->name, app))
                        break;
                */
                tmp = tmp->next;
        }

and this illustrates very well how hashes save many CPU cycles:

First, there is no function call to check the application name for a match in every turn of the loop. Context switches are rather expensive and we avoid them altogether by not calling strcasecmp() anymore. Secondly, instead of one test per character in the application name, we now have only one single test per application entry to test the hash code.

The result is that application search performance becomes linear instead of logarithmic. One integer comparison per registered application, instead of one context switch plus average number of characters in an application name times number of registered applications.

Someone appears to have been aware of the impact on performance as there is a comment at the top of pbx.c:

/*
* I M P O R T A N T :
*
* The speed of extension handling will likely be among the most important * aspects of this PBX. The switching scheme as it exists right now isn't * terribly bad (it's O(N+M), where N is the # of extensions and M is the avg #
* of priorities, but a constant search time here would be great ;-)
*
*/

For some reason though that someone didn't seem to have done anything about it. Maybe their skillset was insufficient, but that's not any excuse for the rest of us who have the skills not to be doing anything about it.

Now, in order to get the hash codes for arbitrary applications into the opbx_app struct in the first place, we also need to add some code to function opbx_register_application() ...

unsigned int hash = opbx_hash_string(app);

or, again, for retaining case insensitivity ...

unsigned int hash = opbx_hash_string_toupper(app);

Then, ...

        while(tmp) {
                if (hash == tmp->hash) {
// replaced code: (!strcasecmp(app, tmp- >name)) {

and ...

        if (tmp) {
                memset(tmp, 0, sizeof(struct opbx_app));
                strncpy(tmp->name, app, sizeof(tmp->name)-1);
                tmp->hash = hash;

Function opbx_unregister_application() should also be modified accordingly.


Finally, we can replace the strcmp()'s for builtin keyword checks ...

if (c && (hash == OPBX_KEYWORD_CALLERIDNUM)) {
                // replaced code: (c && !strcmp(var, "CALLERIDNUM")) {

etc etc


I haven't made any patchfile yet, because there is the issue of case sensitivity about which I feel there needs to be a discussion for a consensus to be reached.

Some of the checks at present involve strcasecmp() which means the tested identifiers are case insensitive. My personal preference is that all identifiers should be case sensitive. This is also computationally more efficient, since each character will need to be converted to its uppercase equivalent within the hash function.

However, if we settle for case sensitive identifiers, then it may break some people's dialplans. Consider for example ...

exten => s,1,noop()

instead of

exten => s,1,NoOp()

where the lowercase 'noop' would no longer be recognised.


Thus, as a next step I suggest we discuss the issue of case sensitivity and come to a consensus. After that I will make a patch file for pbx.c conforming to the consensus.

The next step then will be to convert more tests to the hash code based system.

Eventually, I envisage to extend the hash library so it can store arbitrary strings in a hash table from where they can be retrieved much faster than the current method using linear searches on linked lists and arrays.

rgds
benjk


                
___________________________________________________________ All New Yahoo! Mail – Tired of [EMAIL PROTECTED]@! come-ons? Let our SpamGuard protect you. http://uk.docs.yahoo.com/nowyoucan.html
_______________________________________________
Openpbx-dev mailing list
[email protected]
http://lists.openpbx.org/mailman/listinfo/openpbx-dev

Reply via email to