improving the string library

Tomas Frydrych Sat, 23 Dec 2000 09:53:53 -0800

First of all, I have got to apologise, because some of the figures I 
posted a couple of days back were badly wrong; I forgot to turn the 
optimization on in my testing programme :-). I have done some 
more tests now and the general conclusion is that it does not 
make sense to provide replacements for any of the basic string 
functions such as strlen, strcpy and strcat.

This is not so much because one could not improve on the raw 
assembly at all (although the margins are tight), but because the 
optimizing compiler inlines these, and by this I do not mean that it 
simply pastes the assembly in, but rather that it does so without the 
standard C prologue and epilogue, which it cannot do with my 
externs.

For the same reasons I agree with Mike that also for the UT_UCS_ 
functions we should use the wstr* functions from the library if 
available.

There is one function from the std lib which I can significantly 
speed up, strstr; my asm version takes only 60% of the time. 
The algorithm we have in UT_UCS_strstr is, though, faster than the 
one used by the library; I can improve on it by only about 20%. 
However, I suspect that neither of these functions is speed critical 
for us.

The other function which I can improve on a lot is unichar_to_utf8, 
where I can get a 20-30% speed-up (my implementation is biased 
toward the shorter chars, i.e., 30% for 1-byte utf8 and 20% for 6-byte 
utf8). This so far appears to be the only function that might be 
worth replacing.
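
To make the "biased toward the shorter chars" point concrete, here is a minimal C++ sketch of a code-point-to-UTF-8 encoder that tests the 1-byte case first. It is not the existing unichar_to_utf8 nor the asm version; the names sketch_unichar and sketch_unichar_to_utf8 are made up for illustration. It assumes the original 1-to-6-byte UTF-8 scheme that the figures above refer to (current UTF-8 stops at 4 bytes) and a code point below 0x80000000.

```cpp
#include <cstddef>

// Hypothetical 32-bit code point type, for illustration only.
typedef unsigned int sketch_unichar;

// Writes the UTF-8 form of c into out (room for at least 6 bytes)
// and returns the number of bytes written.
// Assumption: c < 0x80000000 and the old 1-to-6-byte UTF-8 layout.
inline std::size_t sketch_unichar_to_utf8(sketch_unichar c, unsigned char *out)
{
    // Fast path: plain ASCII is a single byte, so test it first.
    if (c < 0x80) {
        out[0] = static_cast<unsigned char>(c);
        return 1;
    }

    // Pick the sequence length and the marker bits of the lead byte.
    std::size_t len;
    unsigned char lead;
    if      (c < 0x800)     { len = 2; lead = 0xC0; }
    else if (c < 0x10000)   { len = 3; lead = 0xE0; }
    else if (c < 0x200000)  { len = 4; lead = 0xF0; }
    else if (c < 0x4000000) { len = 5; lead = 0xF8; }
    else                    { len = 6; lead = 0xFC; }

    // Fill the continuation bytes from the end, six payload bits each.
    for (std::size_t i = len - 1; i > 0; --i) {
        out[i] = static_cast<unsigned char>(0x80 | (c & 0x3F));
        c >>= 6;
    }

    // Whatever bits remain go into the lead byte.
    out[0] = static_cast<unsigned char>(lead | c);
    return len;
}
```

The longer the sequence, the more comparisons and loop iterations are paid before the function returns, which is one plausible reason a speed-up would shrink from the 1-byte to the 6-byte case.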

Just to make sure I have made clear what I have in mind, I am not 
talking about writing some inline code into the C++ sources, but 
about writing a library entirely in asm, one independent of the C++ 
sources which would never come near the GNU tools until linkage. 
The choice of the functions from the C++ sources or from the asm 
lib would be made at compile time using an ABI_OPT_USE_NASM 
variable. This only requires some #define's and #ifdef's in ut_string.h 
and ut_string.cpp, and not including <string.h> directly, but 
rather including it through ut_string.h; there is only one file in the 
Unix tree that includes <string.h> directly (I do not know about the 
other platforms though).
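
A rough sketch of what such a switch in a ut_string.h-style header might look like is given here. Only ABI_OPT_USE_NASM comes from the proposal; the UT_strstr macro, the ut_asm_strstr symbol and the ut_contains helper are hypothetical names chosen for illustration.

```cpp
#include <cstring>

#ifdef ABI_OPT_USE_NASM
// Hypothetical symbol exported by the separately assembled asm library;
// it is resolved only at link time.
extern "C" char *ut_asm_strstr(const char *haystack, const char *needle);
#define UT_strstr(h, n) ut_asm_strstr((h), (n))
#else
// Default build: use the C library, which the compiler may inline.
#define UT_strstr(h, n) std::strstr((h), (n))
#endif

// Callers include this header rather than <string.h> and use only the
// UT_ name, so they never need to know which implementation is linked in.
inline bool ut_contains(const char *text, const char *word)
{
    return UT_strstr(text, word) != NULL;
}
```

Built without ABI_OPT_USE_NASM defined, this compiles and links against the C library alone, so platforms without the asm library are unaffected.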

>  GAS's (gnu assembler) syntax is totally different from NASM's one (order of
> operands, etc). You have to use C preprocessor and macros for asm code to be
> compiled by gnu tools (i.e on Linux, BSD and possible Solaris for x86, and
> even may be QNX and BeOS for x86 since they use gnu toolchain AFAIR).

I am aware of this; however Gas is a pain to programme for, and I 
have some code for NASM I have written a while back that I can 
reuse. Since NASM is freely available, I do not see this as a 
problem; however, if the library was to contain only a few functions, 
I might consider converting it to Gas syntax once it is debugged.

Tomas

*********************************************
[EMAIL PROTECTED] / www.frydrych.net
PGP keys:  http://www.frydrych.net/contact.html