Re: Better ap_casecmpstr[n]?

William A Rowe Jr Tue, 24 Nov 2015 12:03:07 -0800

On Tue, Nov 24, 2015 at 1:04 PM, Yann Ylavic <[email protected]> wrote:


> I tested this change with both Jim's and my versions, that's slower.
>
> The better implementation I have so far is:
>
> int ap_casecmpstr_2(const char *s1, const char *s2)
> {
>     size_t i;
>     const unsigned char *ps1 = (const unsigned char *) s1;
>     const unsigned char *ps2 = (const unsigned char *) s2;
>
>     for (i = 0; ; ++i) {
>         const int c1 = ucharmap[ps1[i]];
>         const int c2 = ucharmap[ps2[i]];
>
>         if (c1 != c2) {
>             return c1 - c2;
>         }
>         if (!c1) {
>             break;
>         }
>     }
>     return (0);
> }
>
> which is ~15% faster than the current one (and not 50% unless I remove
> the translation :)
>
> I also tried:
>
>     for (i = 0; ; ++i) {
>         const int c1 = ps1[i];
>         const int c2 = ps2[i];
>
>         if (c1 != c2 && ucharmap[c1] != ucharmap[c2]) {
>             return ucharmap[c1] - ucharmap[c2];
>         }
>         ...
>
> but no.
>
>
> On Tue, Nov 24, 2015 at 7:56 PM, William A Rowe Jr <[email protected]>
> wrote:
> > For the optimization cases Graham was proposing, how does this perform on
> > your test setup? Looking for both absmatches, case mismatches and proper vs
> > lowercase comparisons...
> >
> > int ap_casecmpstr_2(const char *s1, const char *s2)
> > {
> >     size_t i;
> >     const unsigned char *ps1 = (const unsigned char *) s1;
> >     const unsigned char *ps2 = (const unsigned char *) s2;
> >
> >     for (i = 0; ; ++i) {
> >         const int c1 = ucharmap[ps1[i]];
> >         const int c2 = ucharmap[ps2[i]];
> >         /* Above lookups are optimized away if first test below succeeds */
> >
> >         if ((ps1[i] != ps2[i]) && (c1 != c2)) {
> >             return c1 - c2;
> >         }
> >         if (!c1) {
> >             break;
> >         }
> >     }
> >     return (0);
> > }
> >
> >
> > On Tue, Nov 24, 2015 at 12:43 PM, Yann Ylavic <[email protected]>
> > wrote:
> >>
> >> On Tue, Nov 24, 2015 at 7:39 PM, Mikhail T. <[email protected]>
> >> wrote:
> >> > On 24.11.2015 13:04, Yann Ylavic wrote:
> >> >
> >> > int ap_casecmpstr_2(const char *s1, const char *s2)
> >> > {
> >> >     size_t i;
> >> >     const unsigned char *ps1 = (const unsigned char *) s1;
> >> >     const unsigned char *ps2 = (const unsigned char *) s2;
> >> >
> >> >     for (i = 0; ; ++i) {
> >> >         const int c1 = ps1[i];
> >> >         const int c2 = ps2[i];
> >> >
> >> >         if (c1 != c2) {
> >> >             return c1 - c2;
> >> >         }
> >> >         if (!c1) {
> >> >             break;
> >> >         }
> >> >     }
> >> >     return (0);
> >> > }
> >> >
> >> > Sorry, but would not the above declare "A" and "a" to be different?
> >>
> >> Yeah, forgot the translation, I went too fast :)
> >
> >
>

Sounds like this concludes the discussion of calling strcmp, followed by
this function, for optimization benefits.

Your code is much easier to read, IMHO, and I'd prefer that syntax even if
it optimizes almost identically to the existing code.

The final question was what to call it and how to document it.  Documenting
can be iterative in svn, but do we agree that ap_strcmp_token clarifies that
this is for unambiguous token comparisons, or should we stick with something
that implies ascii (not literally true), or lc_posix (closer)?
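
For anyone following along without the tree checked out, here is a
self-contained sketch of the table-driven loop quoted above. The thread never
shows the contents of ucharmap, so fold_map below is a hypothetical stand-in
that assumes plain ASCII folding ('A'-'Z' to 'a'-'z', every other byte left
alone). The names fold_map, fold_map_init and casecmp_token_sketch are
invented for illustration and are not what gets committed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ucharmap: identity mapping, except that
 * 'A'-'Z' fold to 'a'-'z'. Assumes an ASCII execution character set. */
static unsigned char fold_map[256];
static int fold_map_ready = 0;

/* Fill the table once. Production code would more likely use a static
 * const table; the lazy fill here is not thread-safe. */
static void fold_map_init(void)
{
    int c;

    for (c = 0; c < 256; ++c) {
        fold_map[c] = (unsigned char)
            ((c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c);
    }
    fold_map_ready = 1;
}

/* Same loop shape as the ap_casecmpstr_2 proposal quoted above:
 * translate both bytes, return on the first difference, stop at NUL. */
int casecmp_token_sketch(const char *s1, const char *s2)
{
    const unsigned char *ps1 = (const unsigned char *) s1;
    const unsigned char *ps2 = (const unsigned char *) s2;
    size_t i;

    assert(s1 != NULL && s2 != NULL);
    if (!fold_map_ready) {
        fold_map_init();
    }

    for (i = 0; ; ++i) {
        const int c1 = fold_map[ps1[i]];
        const int c2 = fold_map[ps2[i]];

        if (c1 != c2) {
            return c1 - c2;   /* sign gives ordering of the folded bytes */
        }
        if (!c1) {
            break;            /* both strings ended at the same position */
        }
    }
    return 0;
}
```

With this table, "Content-Type" and "content-type" compare equal, while bytes
outside 'A'-'Z' are compared as-is. That locale-independent behavior is what
the naming question above is trying to capture.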
