From: Jeff Hostetler <jeffh...@microsoft.com>

Use ALLOC_GROW() macro when reallocing a string_list array
rather than simply increasing it by 32.  This is a performance
optimization.

During status on a very large repo and there are many changes,
a significant percentage of the total run time is spent
reallocing the wt_status.changes array.

This change decreases the time in wt_status_collect_changes_worktree()
from 125 seconds to 45 seconds on my very large repository.

This produced a modest gain on my 1M file artificial repo, but
broke even on linux.git.

Test                                            HEAD^^            HEAD
---------------------------------------------------------------------------------------
0005.2: read-tree status br_ballast (1000001)   8.29(5.62+2.62)   
8.22(5.57+2.63) -0.8%

Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com>
---
 string-list.c          |  5 +----
 t/perf/p0005-status.sh | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 4 deletions(-)
 create mode 100755 t/perf/p0005-status.sh

diff --git a/string-list.c b/string-list.c
index 45016ad..003ca18 100644
--- a/string-list.c
+++ b/string-list.c
@@ -41,10 +41,7 @@ static int add_entry(int insert_at, struct string_list 
*list, const char *string
        if (exact_match)
                return -1 - index;
 
-       if (list->nr + 1 >= list->alloc) {
-               list->alloc += 32;
-               REALLOC_ARRAY(list->items, list->alloc);
-       }
+       ALLOC_GROW(list->items, list->nr+1, list->alloc);
        if (index < list->nr)
                memmove(list->items + index + 1, list->items + index,
                                (list->nr - index)
diff --git a/t/perf/p0005-status.sh b/t/perf/p0005-status.sh
new file mode 100755
index 0000000..0b0aa98
--- /dev/null
+++ b/t/perf/p0005-status.sh
@@ -0,0 +1,49 @@
+#!/bin/sh
+#
+# This test measures the performance of various read-tree
+# and status operations.  It is primarily interested in
+# the algorithmic costs of index operations and recursive
+# tree traversal -- and NOT disk I/O on thousands of files.
+
+test_description="Tests performance of read-tree"
+
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+# If the test repo was generated by ./repos/many-files.sh
+# then we know something about the data shape and branches,
+# so we can isolate testing to the ballast-related commits
+# and setup sparse-checkout so we don't have to populate
+# the ballast files and directories.
+#
+# Otherwise, we make some general assumptions about the
+# repo and consider the entire history of the current
+# branch to be the ballast.
+
+test_expect_success "setup repo" '
+       if git rev-parse --verify refs/heads/p0006-ballast^{commit}
+       then
+               echo Assuming synthetic repo from many-files.sh
+               git branch br_base            master
+               git branch br_ballast         p0006-ballast
+               git config --local core.sparsecheckout 1
+               cat >.git/info/sparse-checkout <<-EOF
+               /*
+               !ballast/*
+               EOF
+       else
+               echo Assuming non-synthetic repo...
+               git branch br_base            $(git rev-list HEAD | tail -n 1)
+               git branch br_ballast         HEAD
+       fi &&
+       git checkout -q br_ballast &&
+       nr_files=$(git ls-files | wc -l)
+'
+
+test_perf "read-tree status br_ballast ($nr_files)" '
+       git read-tree HEAD &&
+       git status
+'
+
+test_done
-- 
2.9.3

Reply via email to