On 25/11/2015 22:02, Jim Jagielski wrote:
In general, strcmp() is not implemented via strcmp.c
(although if you do a source code search for strcmp, that's
what you'll get). Most of the time it's implemented in
assembly (strcmp.s) or simply leverages memcmp() where
you aren't doing a byte by byte comparison but are doing
a native memory word (32 or 64bit) comparison. This
makes them super fast.

Once we need to worry about case insensitivity, then
we see a whole gamut of implementations; some use
a mapped array as I did; some go char by char and call
tolower() on each one; some do other things such as
testing if isupper() before calling tolower() if needed.
The word-based optimizations seem less viable, as seen
in test results that I ran and Yann also verified (afaict).
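
For readers following the thread, the two case-insensitive approaches mentioned above (a mapped array versus calling tolower() on each char) could look roughly like the sketch below. This is not the code that was benchmarked in this thread; the names lower_map, init_lower_map, sketch_casecmp_table and sketch_casecmp_tolower are hypothetical, the table assumes an ASCII character set, and the lazy table initialization is not thread-safe (a real implementation would more likely use a static const table).

```c
#include <ctype.h>

/* Hypothetical 256-entry map: 'A'..'Z' fold to 'a'..'z', every other
 * byte maps to itself. Assumes ASCII, where 'A'..'Z' are contiguous. */
static unsigned char lower_map[256];
static int lower_map_ready = 0;

/* Fill the map once. Not thread-safe; illustration only. */
static void init_lower_map(void)
{
    int i;
    for (i = 0; i < 256; i++) {
        lower_map[i] = (unsigned char)i;
    }
    for (i = 'A'; i <= 'Z'; i++) {
        lower_map[i] = (unsigned char)(i - 'A' + 'a');
    }
    lower_map_ready = 1;
}

/* Table-based comparison: one array lookup per byte, no function call. */
int sketch_casecmp_table(const char *a, const char *b)
{
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;

    if (!lower_map_ready) {
        init_lower_map();
    }
    while (lower_map[*ua] == lower_map[*ub]) {
        if (*ua == '\0') {
            return 0;           /* both strings ended together */
        }
        ua++;
        ub++;
    }
    return (int)lower_map[*ua] - (int)lower_map[*ub];
}

/* Char-by-char comparison calling tolower() on each byte.
 * Note that tolower() depends on the current locale. */
int sketch_casecmp_tolower(const char *a, const char *b)
{
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;

    while (tolower(*ua) == tolower(*ub)) {
        if (*ua == '\0') {
            return 0;
        }
        ua++;
        ub++;
    }
    return tolower(*ua) - tolower(*ub);
}
```

Both functions return zero for strings that match ignoring ASCII case, and a negative or positive value otherwise, in the same spirit as strcasecmp(). Which one is faster is exactly what is being debated here, so the sketch makes no claim on that point.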

In my tests, my impl was faster on OSX and CentOS5 and 6.
It's a very common function we use and with my test results
it seemed to make sense to provide our own impl, esp if
we decided that what we were really concerned about was
comparing for equality, and so would be able to avoid
the !strcasecmp logic leaping.
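
To illustrate the equality-only idea, here is a minimal sketch of a helper that answers "are these equal, ignoring case?" directly, so callers would not need to negate a strcasecmp() result. The name sketch_str_ieq is hypothetical and is not a function proposed in this thread; the sketch uses tolower() for brevity rather than a mapped array.

```c
#include <ctype.h>

/* Returns 1 if a and b are equal ignoring case, 0 otherwise.
 * Only equality is reported; no ordering is computed. */
int sketch_str_ieq(const char *a, const char *b)
{
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;

    while (tolower(*ua) == tolower(*ub)) {
        if (*ua == '\0') {
            return 1;           /* reached both terminators: equal */
        }
        ua++;
        ub++;
    }
    return 0;                   /* first mismatch: not equal */
}
```

A caller could then write `if (sketch_str_ieq(name, "Host"))` instead of `if (!strcasecmp(name, "Host"))`.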

If we decide that all this was moot, that's fine.
That's what these types of investigations and discussions
are for.


Personally, my testing shows that faster/slower is not that self-evident. On my machine, it depends on the length of the string. With shorter strings (less than ~10 chars), Yann's proposal seems to be the best with the test program. What happens if the const char table is not in the L1 cache? Do we still get the same speedup?
When strings are longer, the standard strncasecmp always wins.

Short strings are our use case, so I would say: why not use this implementation, after all?


My personal reservations would be:
- it adds complexity to the code (one more function that looks really similar to existing ones)
- the speed increase is 'only' 15%, if I remember correctly the latest numbers given by Yann
- the speed increase is potentially platform/compiler/C library dependent
- it does not remove (IMO) the 'switch' used to get to the right test even faster
- many of the tests against ASCII strings are hidden in apr functions (apr_table_get...)

Do we have an idea of the overall time spent in these str[n]casecmp functions when processing a request? 15% of that time should be, IMO, quite low.
Is it worth the added complexity? For me, the answer is: not sure.

CJ
