We insert the commit pointed to by each ref one-by-one into
the "complete" commit_list using insert_by_date. Because
each insertion is O(n), we end up with O(n^2) behavior.

This typically doesn't matter, because the number of refs is
reasonably small. And even if there are a lot of refs, they
often point to a smaller set of objects (in which case the
optimization in commit ea5f220 keeps our "n" small).

However, in pathological repositories (hundreds of thousands
of refs, each pointing to a unique commit), this quadratic
behavior can make a difference. Since we do not care about
the list order until we have finished building it, we can
simply keep it unsorted during the insertion phase, then
sort it afterwards.

On a repository like the one described above, this dropped
the time to do a no-op fetch from 2.0s to 1.7s. On normal
repositories, it probably does not matter at all, but it
does not hurt to protect ourselves from pathological cases.

Signed-off-by: Jeff King <p...@peff.net>
---
A note on the timings. I measured the times above last year when I wrote
the same patch here:

  http://article.gmane.org/gmane.comp.version-control.git/194939

And earlier tonight, I did a fetch that showed the same result. But when
I tried to replicate it while writing the commit message, I had trouble,
because either:

  1. I was fetching actual commits, in which case the more serious
     performance problem in find_common kicked in, ruining the measurement.

  2. It was a no-op fetch, in which case quickfetch() kicked in and we
     did not call mark_complete at all.

So I'm rather confused how I managed to get timings earlier (both last
year, and earlier today). But I still think it's an obviously correct
thing to do, and does protect us in case (1) above (actually fetching a
commit) once we fix the problem in find_common.
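
For anyone who wants to see the complexity argument in isolation, here is
a minimal standalone sketch of the two strategies. It is not git's code:
struct node, prepend, insert_by_date, sort_by_date and is_newest_first
are hypothetical stand-ins for the commit_list helpers, and the list
holds bare dates rather than commits. The merge sort is only one way to
get an O(n log n) sort on a linked list and is an assumption here, not a
claim about how commit_list_sort_by_date is implemented.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a commit list entry; only the date matters. */
struct node {
	unsigned long date;
	struct node *next;
};

/* O(1): push onto the head of the list, ignoring order. */
struct node *prepend(struct node *head, unsigned long date)
{
	struct node *n = malloc(sizeof(*n));
	if (!n)
		abort();
	n->date = date;
	n->next = head;
	return n;
}

/* O(n) per call: walk past every entry at least as new, then link in.
 * Calling this n times is what gives the O(n^2) build. */
struct node *insert_by_date(struct node *head, unsigned long date)
{
	struct node **pp = &head;
	while (*pp && (*pp)->date >= date)
		pp = &(*pp)->next;
	*pp = prepend(*pp, date);
	return head;
}

/* Merge two newest-first lists into one newest-first list. */
struct node *merge(struct node *a, struct node *b)
{
	struct node *head = NULL, **tail = &head;
	while (a && b) {
		struct node **pick = (a->date >= b->date) ? &a : &b;
		*tail = *pick;
		tail = &(*pick)->next;
		*pick = (*pick)->next;
	}
	*tail = a ? a : b;
	return head;
}

/* O(n log n): split the list in half, sort each half, merge. */
struct node *sort_by_date(struct node *head)
{
	struct node *slow = head, *fast, *second;
	if (!head || !head->next)
		return head;
	fast = head->next;
	while (fast && fast->next) {
		slow = slow->next;
		fast = fast->next->next;
	}
	second = slow->next;
	slow->next = NULL;
	return merge(sort_by_date(head), sort_by_date(second));
}

/* Returns 1 if no entry is older than the one that follows it. */
int is_newest_first(const struct node *head)
{
	for (; head && head->next; head = head->next)
		if (head->date < head->next->date)
			return 0;
	return 1;
}
```

Building a list with n calls to insert_by_date corresponds to the old
code path; n calls to prepend followed by a single sort_by_date
corresponds to the new one. Both yield the same newest-first sequence of
dates, which is all the later consumers of the list care about.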

 fetch-pack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index abe5ffb..4df8abd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -505,7 +505,7 @@ static int mark_complete(const char *refname, const unsigned char *sha1, int fla
                struct commit *commit = (struct commit *)o;
                if (!(commit->object.flags & COMPLETE)) {
                        commit->object.flags |= COMPLETE;
-                       commit_list_insert_by_date(commit, &complete);
+                       commit_list_insert(commit, &complete);
                }
        }
        return 0;
@@ -622,6 +622,7 @@ static int everything_local(struct fetch_pack_args *args,
        if (!args->depth) {
                for_each_ref(mark_complete, NULL);
                for_each_alternate_ref(mark_alternate_complete, NULL);
+               commit_list_sort_by_date(&complete);
                if (cutoff)
                        mark_recent_complete_commits(args, cutoff);
        }
-- 
1.8.3.rc2.14.g7eee6b3
