On 25 Aug 2014, at 15:55, J. Hannken-Illjes <hann...@eis.cs.tu-bs.de> wrote:
> On 24 Aug 2014, at 18:57, J. Hannken-Illjes <hann...@eis.cs.tu-bs.de> wrote:
>
> <snip>
>
>> I tried to bisect and got an increase in time from ~15 secs to ~24 secs
>> between the time stamps '2012-09-18 06:00 UTC' '2012-09-18 09:00 UTC'.
>>
>> Someone should redo this test as this interval is the import of the
>> compiler (GCC 4.5.3 -> 4.5.4) and I had to rebuild tools. I cant
>> believe this to be a compiler problem.
>
> GCC 4.5.4 disabled builtin memcmp as x86 has no cmpmemsi pattern.
>
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052, Comment 16.
>
> Could this be the cause of this big loss in performance?

Short answer: it is -- reverting external/gpl3/gcc/dist/gcc/builtins.c
from Rev. 1.3 to 1.2 brings back the old times which are the same as
they were on NetBSD 6.

Given that this test has many calls to ufs_lookup/cache_lookup using
memcmp to check for equal filenames this is not a surprise.

A rather naive "implementation" of memcmp (see below) drops the
running time from ~15 sec to ~9 secs. We should consider improving
our memcmp.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)

Index: libkern.h
===================================================================
RCS file: /cvsroot/src/sys/lib/libkern/libkern.h,v
retrieving revision 1.106
diff -p -u -2 -r1.106 libkern.h
--- libkern.h 30 Aug 2012 12:16:49 -0000 1.106
+++ libkern.h 25 Aug 2014 17:23:35 -0000
@@ -262,5 +262,18 @@ void *memset(void *, int, size_t);
 #if __GNUC_PREREQ__(2, 95) && !defined(_STANDALONE)
 #define memcpy(d, s, l) __builtin_memcpy(d, s, l)
-#define memcmp(a, b, l) __builtin_memcmp(a, b, l)
+static inline int __memcmp(const void *a, const void *b, size_t l)
+{
+	const unsigned char *pa = a, *pb = b;
+
+	if (l > 8)
+		return memcmp(a, b, l);
+	while (l-- > 0) {
+		if (__predict_false(*pa != *pb))
+			return *pa < *pb ? -1 : 1;
+		pa++; pb++;
+	}
+	return 0;
+}
+#define memcmp(a, b, l) __memcmp(a, b, l)
 #endif
 #if __GNUC_PREREQ__(2, 95) && !defined(_STANDALONE)
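
For anyone who wants to try the idea outside the kernel, the fast path
from the patch can be sketched in userland roughly as below. This is
only a sketch under assumptions: small_memcmp and small_memcmp_check
are made-up names, __builtin_expect stands in for the kernel's
__predict_false (so GCC or Clang is assumed), and the cutoff of 8
bytes is simply the value used in the patch, not a tuned number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userland counterpart of the __memcmp helper in the patch:
 * compare short buffers with an inline byte loop and hand anything
 * longer than 8 bytes to the library memcmp.
 */
static inline int
small_memcmp(const void *a, const void *b, size_t l)
{
	const unsigned char *pa = a, *pb = b;

	/* Longer buffers: let the out-of-line routine do the work. */
	if (l > 8)
		return memcmp(a, b, l);

	/* Short buffers: byte loop, with a mismatch hinted as unlikely. */
	while (l-- > 0) {
		if (__builtin_expect(*pa != *pb, 0))
			return *pa < *pb ? -1 : 1;
		pa++;
		pb++;
	}
	return 0;
}

/* Minimal self-check covering both the inline and the fallback path. */
static void
small_memcmp_check(void)
{
	assert(small_memcmp("abc", "abc", 3) == 0);
	assert(small_memcmp("abc", "abd", 3) < 0);
	assert(small_memcmp("0123456789", "0123456789", 10) == 0);
}
```

The inline loop only returns -1, 0 or 1, while the fallback returns
whatever the library memcmp returns, so callers should test the sign
of the result and not a specific value.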