On 4 May 2013 15:14, Andrey Chernov <a...@freebsd.org> wrote:
> On 04.05.2013 0:48, Sergey Kandaurov wrote:
>> On 3 May 2013 23:55, Jilles Tjoelker <jil...@stack.nl> wrote:
>>> Some sort of perfect hashing can also be an option, although it makes it
>>> harder to add new properties or adds a build dependency on gperf(1) that
>>> we would like to get rid of.
>> I hacked a bit on wctype. Speaking about speed, it shows about 1-3.5x
>> improvement over the previous fast version (before r250215).
>>
>> Time spend for 2097152 wctype() calls for each of wctype property
>>             current      previous     mine
>> alnum       0.090554676  0.035821210  0.033270579
>> alpha       0.172074310  0.052461036  0.044916572
>> blank       0.261109989  0.055735281  0.036682745
>> cntrl       0.357318986  0.069249831  0.038292782
>> digit       0.436381530  0.094194364  0.039249005
>> graph       0.540954812  0.085580099  0.043331460
>> lower       0.618306476  0.095665215  0.044070399
>> print       0.707443135  0.132559305  0.048216097
>> punct       0.788922052  0.142809109  0.062871432
>> space       0.888263108  0.150516644  0.054086142
>> upper       0.966903461  0.173593592  0.054027834
>> xdigit      0.406611275  0.201614227  0.060695939
>> ideogram    0.439763499  0.239640723  0.068566486
>> special     0.523128094  0.249156298  0.099278051
>> phonogram   0.564975870  0.260972651  0.135751471
>> rune        0.637392247  0.235195497  0.064093971
>>
>> Index: locale/wctype.c
>> ===================================================================
>> --- locale/wctype.c     (revision 250217)
>> +++ locale/wctype.c     (working copy)
>> @@ -74,6 +74,9 @@
>>                     "special\0"     /* BSD extension */
>>                     "phonogram\0"   /* BSD extension */
>>                     "rune\0";       /* BSD extension */
>> +       static const size_t propnamlen[] = {
>> +               5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 8, 7, 9, 4, 0
>> +       };
>>         static const wctype_t propmasks[] = {
>>                 _CTYPE_A|_CTYPE_D,
>>                 _CTYPE_A,
>> @@ -92,16 +95,17 @@
>>                 _CTYPE_Q,
>>                 0xFFFFFF00L
>>         };
>> -       size_t len1, len2;
>> +       const size_t *len2;
>>         const char *p;
>>         const wctype_t *q;
>>
>> -       len1 = strlen(property);
>>         q = propmasks;
>> -       for (p = propnames; (len2 = strlen(p)) != 0; p += len2 + 1) {
>> -               if (len1 == len2 && memcmp(property, p, len1) == 0)
>> +       len2 = propnamlen;
>> +       for (p = propnames; *len2 != 0; ) {
>> +               if (property[0] == p[0] && strcmp(property, p) == 0)
>>                         return (*q);
>> -               q++;
>> +               p += *len2 + 1;
>> +               q++; len2++;
>>         }
>>
>>         return (0UL);
>> [...]
>
> BTW, I don't run tests and look in asm code for sure, but it seems
> property[0] == p[0] is unneeded because almost every compiler tries to
> inline strcmp().

strcmp() doesn't seem to get inlined here, see below.
Apparently property[0] == p[0] is cheaper than strcmp() for negative checks.
Removing this condition brings the performance numbers back to the
"previous" column.

Looking into the asm, the first-character check is a single cmp/jne pair,
while strcmp() is still an actual function call:

# property[0] == p[0]
  4d:   44 3a 75 00             cmp    0x0(%rbp),%r14b
  51:   75 dd                   jne    30 <wctype_l+0x30>
# strcmp()
  53:   48 89 ee                mov    %rbp,%rsi
  56:   4c 89 ff                mov    %r15,%rdi
  59:   e8 00 00 00 00          callq  5e <wctype_l+0x5e>
  5e:   85 c0                   test   %eax,%eax
  60:   75 ce                   jne    30 <wctype_l+0x30>

-- 
wbr,
pluknet
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
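
For anyone who wants to poke at the effect outside libc, here is a minimal
standalone sketch of the same loop shape: a NUL-separated name table, a
parallel length table ending in 0, and the property[0] == p[0] pre-check in
front of strcmp(). All demo_* names, the shortened name list and the mask
values are made up for illustration; they are not the real propnames[] or
_CTYPE_* constants. The strcmp() wrapper with a call counter is also only an
assumption of this sketch, added to make the skipped calls visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mask type and values; not the real wctype_t/_CTYPE_* bits. */
typedef unsigned long demo_mask_t;

/* NUL-separated names, same layout idea as propnames[] in the patch. */
static const char demo_names[] =
	"alnum\0"
	"alpha\0"
	"blank\0"
	"xdigit\0"
	"rune\0";

/* Length of each name, terminated by 0 (plays the role of propnamlen[]). */
static const size_t demo_namelen[] = { 5, 5, 5, 6, 4, 0 };

/* One illustrative mask per name, in the same order. */
static const demo_mask_t demo_masks[] = { 0x01UL, 0x02UL, 0x04UL, 0x08UL, 0x10UL };

/* Counts how often strcmp() is actually reached (demo only). */
unsigned int demo_strcmp_calls;

static int
demo_counted_strcmp(const char *a, const char *b)
{
	demo_strcmp_calls++;
	return (strcmp(a, b));
}

demo_mask_t
demo_lookup(const char *property)
{
	const char *p = demo_names;
	const size_t *len = demo_namelen;
	const demo_mask_t *q = demo_masks;

	assert(property != NULL);
	for (; *len != 0; p += *len + 1, len++, q++) {
		/* Cheap reject: call strcmp() only if the first bytes agree. */
		if (property[0] == p[0] &&
		    demo_counted_strcmp(property, p) == 0)
			return (*q);
	}
	return (0UL);
}
```

With this table, looking up "xdigit" reaches strcmp() once, because no other
name starts with 'x', while "alpha" reaches it twice (once for "alnum", once
for "alpha"). A name whose first character matches nothing never reaches
strcmp() at all, which is the negative-check case the timings above are
about.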