Hi devs,

this is part of the delta_files() speedup patch series:
http://svn.haxx.se/dev/archive-2010-03/0604.shtml

translate_chunk spends most of its time in a loop
looking for the next '$' or newline. Calling strchr
for every input char is needlessly expensive here, since
the 'interesting' string is at most 3 chars long.

It is much more efficient to use a simple lookup table
(boolean array) that tells us whether a certain char
is 'interesting' or not. Since we call it for almost every
char in the file, the initialization overhead amortizes
within the first two lines of the respective file.
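
To illustrate the idea outside of subst.c, here is a minimal standalone
sketch. The helper names (init_interesting, skip_with_strchr,
skip_with_table) are made up for this mail and do not exist in
Subversion; the actual change is the diff further down.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: mark the byte values that may start a
   translation action.  Mirrors the idea of the patch, not its code. */
static void
init_interesting(char table[256], int have_keywords, int have_eol)
{
  assert(table != NULL);
  memset(table, 0, 256);
  if (have_keywords)
    table[(unsigned char)'$'] = 1;
  if (have_eol)
    {
      table[(unsigned char)'\r'] = 1;
      table[(unsigned char)'\n'] = 1;
    }
}

/* Old approach: one strchr() call per input byte.  NUL needs a special
   case because strchr() also matches the terminator of its string. */
static size_t
skip_with_strchr(const char *p, const char *end, const char *interesting)
{
  size_t len = 0;
  while (p + len < end && (!p[len] || !strchr(interesting, p[len])))
    len++;
  return len;
}

/* New approach: one array lookup per input byte.  table[0] is 0, so
   NUL bytes are skipped without any special case. */
static size_t
skip_with_table(const char *p, const char *end, const char table[256])
{
  size_t len = 0;
  while (p + len < end && !table[(unsigned char)p[len]])
    len++;
  return len;
}
```

Both skip functions return the number of leading 'uninteresting' bytes
in [p, end). The cast to unsigned char matters: on platforms where char
is signed, bytes >= 0x80 would otherwise index the table negatively.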

Performance gain is ~9% (8.5% real, 9.9% user time):

~$ time ~/1.7-928181/svn export --ignore-externals -q $REPO/trunk /dev/shm/t
real    0m3.727s
user    0m3.189s
sys     0m0.542s

~$ time ~/1.7-patched/svn export --ignore-externals -q $REPO/trunk /dev/shm/t
real    0m3.410s
user    0m2.872s
sys     0m0.537s

-- Stefan^2.

[[[
Optimize the search for 'interesting' characters that
control the keyword substitution. For details see
http:// ...

* subversion/libsvn_subr/subst.c
  (translation_baton): the 'interesting' member is now
  a boolean array.
  (create_translation_baton): adapt initialization code
  (translate_chunk): eliminate call to strchr

patch by stefanfuhrmann < at > alice-dsl.de
]]]


Index: subversion/libsvn_subr/subst.c
===================================================================
--- subversion/libsvn_subr/subst.c	(revision 928181)
+++ subversion/libsvn_subr/subst.c	(working copy)
@@ -769,9 +769,9 @@
   apr_hash_t *keywords;
   svn_boolean_t expand;
 
-  /* Characters (excluding the terminating NUL character) which
+  /* 'short boolean' array that encodes what character values
      may trigger a translation action, hence are 'interesting' */
-  const char *interesting;
+  char interesting[256];
 
   /* Length of the string EOL_STR points to. */
   apr_size_t eol_str_len;
@@ -821,11 +821,21 @@
   b->repair = repair;
   b->keywords = keywords;
   b->expand = expand;
-  b->interesting = (eol_str && keywords) ? "$\r\n" : eol_str ? "\r\n" : "$";
   b->newline_off = 0;
   b->keyword_off = 0;
   b->src_format_len = 0;
 
+  /* Most characters don't start translation actions.
+   * Mark those that do depending on the parameters we got. */
+  memset(b->interesting, FALSE, sizeof(b->interesting));
+  if (keywords)
+    b->interesting['$'] = TRUE;
+  if (eol_str)
+    {
+      b->interesting['\r'] = TRUE;
+      b->interesting['\n'] = TRUE;
+    }
+
   return b;
 }
 
@@ -938,14 +948,9 @@
           len = 0;
 
           /* We wanted memcspn(), but lacking that, the loop below has
-             the same effect.
-
-             Also, skip NUL characters explicitly, since strchr()
-             considers them part of the string argument,
-             but we don't consider them interesting
+             the same effect. Also, skip NUL characters.
           */
-          while ((p + len) < end
-                 && (! p[len] || ! strchr(interesting, p[len])))
+          while ((p + len) < end && !interesting[(unsigned char)p[len]])
             len++;
 
           if (len)
