Hi devs,

this is part of the delta_files() speedup patch series:
http://svn.haxx.se/dev/archive-2010-03/0604.shtml

translate_chunk spends most of its time in a loop looking for the next
'$' or newline. The call to strchr is excessively expensive since the
'interesting' string is no longer than 3 chars.

It is much more efficient to use a simple lookup table (boolean array)
that tells us whether a certain char is 'interesting' or not. Since we
call it for almost every char in the file, the initialization overhead
amortizes within the first two lines of the respective file.

Performance gain is ~9%:

~$ time ~/1.7-928181/svn export --ignore-externals -q $REPO/trunk /dev/shm/t
real    0m3.727s
user    0m3.189s
sys     0m0.542s

~$ time ~/1.7-patched/svn export --ignore-externals -q $REPO/trunk /dev/shm/t
real    0m3.410s
user    0m2.872s
sys     0m0.537s

-- Stefan^2.

[[[
Optimize the search for 'interesting' characters that control
the keyword substitution. For details see
http:// ...

* subversion/libsvn_subr/subst.c
  (translation_baton): the 'interesting' member is now a
   boolean array.
  (create_translation_baton): adapt initialization code
  (translate_chunk): eliminate call to strchr

patch by stefanfuhrmann < at > alice-dsl.de
]]]

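For readers who want to try the idea outside of subst.c, here is a minimal standalone sketch of the two scan loops side by side. The function names (init_interesting, skip_with_table, skip_with_strchr, check_same) are made up for this illustration and are not part of Subversion; the loops only mirror the shape of the old and new code in translate_chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: mark the byte values that end a run of
   uninteresting characters.  Mirrors the idea of the patch, not
   the actual create_translation_baton() code. */
static void init_interesting(char table[256], int have_keywords, int have_eol)
{
  memset(table, 0, 256);
  if (have_keywords)
    table[(unsigned char)'$'] = 1;
  if (have_eol)
    {
      table[(unsigned char)'\r'] = 1;
      table[(unsigned char)'\n'] = 1;
    }
}

/* Length of the leading run of uninteresting bytes, table lookup.
   The cast matters: plain char may be signed, and a negative
   index would read outside the table. */
static size_t skip_with_table(const char *p, const char *end,
                              const char table[256])
{
  size_t len = 0;
  while (p + len < end && !table[(unsigned char)p[len]])
    len++;
  return len;
}

/* Same scan using strchr().  NUL has to be skipped explicitly
   because strchr() treats the terminator as part of the string. */
static size_t skip_with_strchr(const char *p, const char *end,
                               const char *interesting)
{
  size_t len = 0;
  while (p + len < end && (!p[len] || !strchr(interesting, p[len])))
    len++;
  return len;
}

/* Sanity check: both scans must agree on a given buffer. */
static void check_same(const char *p, const char *end,
                       const char table[256], const char *interesting)
{
  assert(skip_with_table(p, end, table)
         == skip_with_strchr(p, end, interesting));
}
```

The table version does one indexed load per input byte, while the strchr version pays a function call plus a scan of up to three characters for each byte. With the table, a NUL byte needs no special case because entry 0 is simply never set.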
Index: subversion/libsvn_subr/subst.c
===================================================================
--- subversion/libsvn_subr/subst.c	(revision 928181)
+++ subversion/libsvn_subr/subst.c	(working copy)
@@ -769,9 +769,9 @@
   apr_hash_t *keywords;
   svn_boolean_t expand;
 
-  /* Characters (excluding the terminating NUL character) which
+  /* 'short boolean' array that encodes what character values
      may trigger a translation action, hence are 'interesting' */
-  const char *interesting;
+  char interesting[256];
 
   /* Length of the string EOL_STR points to. */
   apr_size_t eol_str_len;
@@ -821,11 +821,21 @@
   b->repair = repair;
   b->keywords = keywords;
   b->expand = expand;
-  b->interesting = (eol_str && keywords) ? "$\r\n" : eol_str ? "\r\n" : "$";
   b->newline_off = 0;
   b->keyword_off = 0;
   b->src_format_len = 0;
 
+  /* Most characters don't start translation actions.
+   * Mark those that do depending on the parameters we got. */
+  memset(b->interesting, FALSE, sizeof(b->interesting));
+  if (keywords)
+    b->interesting['$'] = TRUE;
+  if (eol_str)
+    {
+      b->interesting['\r'] = TRUE;
+      b->interesting['\n'] = TRUE;
+    }
+
   return b;
 }
 
@@ -938,14 +948,9 @@
       len = 0;
 
       /* We wanted memcspn(), but lacking that, the loop below has
-         the same effect.
-
-         Also, skip NUL characters explicitly, since strchr()
-         considers them part of the string argument,
-         but we don't consider them interesting
+         the same effect. Also, skip NUL characters.
       */
-      while ((p + len) < end
-             && (! p[len] || ! strchr(interesting, p[len])))
+      while ((p + len) < end && !interesting[(unsigned char)p[len]])
         len++;
 
       if (len)