If we really want to squeak out optimizations, judicious use
of 'register' might even help...

But after a while things start getting silly :)
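
For the record, this is roughly what I have in mind; a sketch only, not
run against the benchmark. The names ap_casecmpstr_reg and fold_ascii
are placeholders of mine, and the ASCII-only fold is an assumption,
since the quoted loop as shown compares the raw bytes. Compilers at -O2
are free to ignore 'register', so I would not expect a large gain.

```c
#include <stddef.h>

/* Hypothetical helper: ASCII-only lowercase fold, standing in for
 * whatever case mapping the real function applies. */
static int fold_ascii(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical name: the indexed loop from the quoted mail, with
 * 'register' hints added on the hot locals. */
int ap_casecmpstr_reg(const char *s1, const char *s2)
{
    register size_t i;
    register const unsigned char *ps1 = (const unsigned char *) s1;
    register const unsigned char *ps2 = (const unsigned char *) s2;

    for (i = 0; ; ++i) {
        register const int c1 = fold_ascii(ps1[i]);
        register const int c2 = fold_ascii(ps2[i]);

        if (c1 != c2) {
            return c1 - c2;   /* first differing (folded) byte decides */
        }
        if (!c1) {
            break;            /* both strings ended together */
        }
    }
    return 0;
}
```

Note that 'register' only forbids taking the variable's address; whether
the value actually lives in a register is still up to the compiler.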

> On Nov 24, 2015, at 1:04 PM, Yann Ylavic <ylavic....@gmail.com> wrote:
> 
> I did some testing with different implementations and my results show
> that the fastest one is:
> 
> int ap_casecmpstr_2(const char *s1, const char *s2)
> {
>    size_t i;
>    const unsigned char *ps1 = (const unsigned char *) s1;
>    const unsigned char *ps2 = (const unsigned char *) s2;
> 
>    for (i = 0; ; ++i) {
>        const int c1 = ps1[i];
>        const int c2 = ps2[i];
> 
>        if (c1 != c2) {
>            return c1 - c2;
>        }
>        if (!c1) {
>            break;
>        }
>    }
>    return (0);
> }
> 
> int ap_casecmpstrn_2(const char *s1, const char *s2, size_t n)
> {
>    size_t i;
>    const unsigned char *ps1 = (const unsigned char *) s1;
>    const unsigned char *ps2 = (const unsigned char *) s2;
> 
>    for (i = 0; i < n; ++i) {
>        const int c1 = ps1[i];
>        const int c2 = ps2[i];
> 
>        if (c1 != c2) {
>            return c1 - c2;
>        }
>        if (!c1) {
>            break;
>        }
>    }
>    return (0);
> }
> 
> Some samples (test program attached):
> 
> $ gcc -Wall -O2 newtest.c -o newtest -lrt
> $ for i in `seq 0 2`; do
>    ./newtest $i 150000000 \
>        xcxcxcxcxcxcxcxcxcxcwwwwwwwwwwaaaaaaaaaa \
>        xcxcxcxcxcxcxcxcxcxcwwwwwwwwwwaaaaaaaaaa \
>        0
> done
> - str[n]casecmp (nb=150000000, len=0)
> time = 8.444547186 : res = 0
> - ap_casecmpstr[n] (nb=150000000, len=0)
> time = 8.299781468 : res = 0
> - ap_casecmpstr[n] w/ index (nb=150000000, len=0)
> time = 6.148787259 : res = 0
> 
> That's ~26% less time than the current ap_casecmpstr[n] (about 1.35x faster).
> 
> $ gcc -Wall -Os newtest.c -o newtest -lrt
> $ for i in `seq 0 2`; do
>    ./newtest $i 150000000 \
>        xcxcxcxcxcxcxcxcxcxcwwwwwwwwwwaaaaaaaaaa \
>        xcxcxcxcxcxcxcxcxcxcwwwwwwwwwwaaaaaaaaaa \
>        0
> done
> - str[n]casecmp (nb=150000000, len=0)
> time = 8.528311136 : res = 0
> - ap_casecmpstr[n] (nb=150000000, len=0)
> time = 10.150553381 : res = 0
> - ap_casecmpstr[n] w/ index (nb=150000000, len=0)
> time = 9.758638566 : res = 0
> 
> string.h's str[n]casecmp beats us with -Os, but this new
> implementation is still better than the current one.
> 
> WDYT, should I commit these new versions?
> <newtest.c>
