Re: git pull takes ~8 seconds on up-to-date Linux git tree
On Fri, Oct 05, 2012 at 10:20:37PM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: @@ -617,6 +618,8 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, struc one-object.flags |= PARENT1; commit_list_insert_by_date(one, list); +if (!n) +return list; for (i = 0; i n; i++) { twos[i]-object.flags |= PARENT2; commit_list_insert_by_date(twos[i], list); This seems like an obvious optimization, but does it really have anything to do with the patch at hand? The function picks one and paints it against all others, but the logic assumes there must be at least one other to paint against; otherwise the traversal will not ever find a node that is painted with both PARENT1 and PARENT2 to stop, leading us to traverse all the way down to root. Ah, OK. I was thinking it was just a way to skip the further logic, which would come to the same answer (it does, just not quickly). Makes sense. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git pull takes ~8 seconds on up-to-date Linux git tree
Junio C Hamano gits...@pobox.com writes: Here is a tested (in the sense that it passes the test suite, and also in the sense that an empty pull in the kernel history gives quick turnaround) patch. As I do not think we would want to revert 5802f81 (fmt-merge-msg: discard needless merge parents, 2012-04-18) which was a correctness fix, I think we would rather want to do something like this. OK, I think I am convinced myself that this patch is the right fix. The performance regression Markus saw is in fmt-merge-message, and it is caused by the updated remove_redundant() that is used by get_merge_bases_many() and reduce_heads(). On the topic branch, all callers of reduce_heads() were passing commits that are already parsed, but before the topic was merged to 'master', we added one more caller to reduce_heads() on the 'master' front that passed an unparsed commit, which is why the problem surfaced at that merge. It might make sense to assert or die in commit_list_insert_by_date() when a caller mistakenly pass an unparsed commit object to prevent this kind of breakages in the future. -- 8 -- Subject: paint_down_to_common(): parse commit before relying on its timestamp When refactoring the merge-base computation to reduce the pairwise O(n*(n-1)) traversals to parallel O(n) traversals, the code forgot that timestamp based heuristics needs each commit to have been parsed. This caused an empty git pull to spend cycles, traversing the history all the way down to 0 (because an unparsed commit object has 0 timestamp, and any other commit object with positive timestamp will be processed for its parents, all getting parsed), only to come up with a merge message to be used. Signed-off-by: Junio C Hamano gits...@pobox.com --- commit.c | 5 + 1 file changed, 5 insertions(+) diff --git i/commit.c w/commit.c index 0246767..213bc98 100644 --- i/commit.c +++ w/commit.c @@ -609,6 +609,7 @@ static struct commit *interesting(struct commit_list *list) return NULL; } +/* all input commits in one and twos[] must have been parsed! */ static struct commit_list *paint_down_to_common(struct commit *one, int n, struct commit **twos) { struct commit_list *list = NULL; @@ -617,6 +618,8 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, struc one-object.flags |= PARENT1; commit_list_insert_by_date(one, list); + if (!n) + return list; for (i = 0; i n; i++) { twos[i]-object.flags |= PARENT2; commit_list_insert_by_date(twos[i], list); @@ -737,6 +740,8 @@ static int remove_redundant(struct commit **array, int cnt) redundant = xcalloc(cnt, 1); filled_index = xmalloc(sizeof(*filled_index) * (cnt - 1)); + for (i = 0; i cnt; i++) + parse_commit(array[i]); for (i = 0; i cnt; i++) { struct commit_list *common; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git pull takes ~8 seconds on up-to-date Linux git tree
On Fri, Oct 05, 2012 at 01:34:02PM -0700, Junio C Hamano wrote: OK, I think I am convinced myself that this patch is the right fix. The performance regression Markus saw is in fmt-merge-message, and it is caused by the updated remove_redundant() that is used by get_merge_bases_many() and reduce_heads(). On the topic branch, all callers of reduce_heads() were passing commits that are already parsed, but before the topic was merged to 'master', we added one more caller to reduce_heads() on the 'master' front that passed an unparsed commit, which is why the problem surfaced at that merge. Thanks for tracking it down. That makes a lot of sense with the results we are seeing. It might make sense to assert or die in commit_list_insert_by_date() when a caller mistakenly pass an unparsed commit object to prevent this kind of breakages in the future. I wonder if it would be too much to just have commit_list_insert_by_date call parse_commit. It is, after all, the exact moment when we need to have the date valid (and by waiting until the last minute, we can potentially avoid parses that would not otherwise need to happen). The overhead in the common case should basically be the same as an assert: checking that commit-object.parsed is true (we can always inline that bit of parse_commit if we have to). Of course, in this case it is not just commit_list_insert_by_date that cares. paint_down_to_common also want commit-parents to be valid; I'm surprised that dealing with unparsed commits did not also reveal an error there. In an object-oriented world, we would always get the attributes of a commit through accessors that made sure the object was parsed. That would be nicer, but it would also mean paying for the if (parsed) conditional a lot more frequently. @@ -617,6 +618,8 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, struc one-object.flags |= PARENT1; commit_list_insert_by_date(one, list); + if (!n) + return list; for (i = 0; i n; i++) { twos[i]-object.flags |= PARENT2; commit_list_insert_by_date(twos[i], list); This seems like an obvious optimization, but does it really have anything to do with the patch at hand? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git pull takes ~8 seconds on up-to-date Linux git tree
Jeff King p...@peff.net writes: @@ -617,6 +618,8 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, struc one-object.flags |= PARENT1; commit_list_insert_by_date(one, list); + if (!n) + return list; for (i = 0; i n; i++) { twos[i]-object.flags |= PARENT2; commit_list_insert_by_date(twos[i], list); This seems like an obvious optimization, but does it really have anything to do with the patch at hand? The function picks one and paints it against all others, but the logic assumes there must be at least one other to paint against; otherwise the traversal will not ever find a node that is painted with both PARENT1 and PARENT2 to stop, leading us to traverse all the way down to root. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git pull takes ~8 seconds on up-to-date Linux git tree
Hi, with current trunk I get the following on an up-to-date Linux tree: markus@x4 linux % time git pull Already up-to-date. git pull 7.84s user 0.26s system 92% cpu 8.743 total git version 1.7.12 is much quicker: markus@x4 linux % time git pull Already up-to-date. git pull 0.10s user 0.02s system 16% cpu 0.740 total perf shows for trunk: 22.11% git-merge libz.so.1.2.7 [.] 0x73bc 22.03%git libz.so.1.2.7 [.] 0x7338 14.18%git libz.so.1.2.7 [.] inflate 13.70% git-merge libz.so.1.2.7 [.] inflate 9.18%git git[.] 0x000ea391 8.56% git-merge git-merge [.] 0x000f0598 1.58% git-merge libz.so.1.2.7 [.] adler32 1.52%git libz.so.1.2.7 [.] adler32 0.59%git [kernel.kallsyms] [k] clear_page_c and for 1.7.12: 39.29%git git[.] 0x000b9fa8 12.16%git libz.so.1.2.7 [.] inflate 8.67%git libz.so.1.2.7 [.] 0xa18e 8.49% git-merge git-merge [.] 0x000efa15 4.96%git libc-2.16.90.so[.] memcpy@@GLIBC_2.14 2.63%git libc-2.16.90.so[.] _int_malloc 2.61% git-merge [kernel.kallsyms] [k] clear_page_c 2.32%git [kernel.kallsyms] [k] clear_page_c 2.04%git [kernel.kallsyms] [k] filemap_fault 1.87% git-merge libc-2.16.90.so[.] memcpy@@GLIBC_2.14 -- Markus -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git pull takes ~8 seconds on up-to-date Linux git tree
On Thu, Oct 04, 2012 at 04:14:54PM +0200, Markus Trippelsdorf wrote: with current trunk I get the following on an up-to-date Linux tree: markus@x4 linux % time git pull Already up-to-date. git pull 7.84s user 0.26s system 92% cpu 8.743 total git version 1.7.12 is much quicker: markus@x4 linux % time git pull Already up-to-date. git pull 0.10s user 0.02s system 16% cpu 0.740 total Yikes. I can easily reproduce here. Bisecting between master and v1.7.12 gives a curious result: the slowdown first occurs with the merge commit 34f5130 (Merge branch 'jc/merge-bases', 2012-09-11). But neither of its parents is slow. I don't see anything obviously suspect in the merge, though. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git pull takes ~8 seconds on up-to-date Linux git tree
Jeff King p...@peff.net writes: On Thu, Oct 04, 2012 at 04:14:54PM +0200, Markus Trippelsdorf wrote: with current trunk I get the following on an up-to-date Linux tree: markus@x4 linux % time git pull Already up-to-date. git pull 7.84s user 0.26s system 92% cpu 8.743 total git version 1.7.12 is much quicker: markus@x4 linux % time git pull Already up-to-date. git pull 0.10s user 0.02s system 16% cpu 0.740 total Yikes. I can easily reproduce here. Bisecting between master and v1.7.12 gives a curious result: the slowdown first occurs with the merge commit 34f5130 (Merge branch 'jc/merge-bases', 2012-09-11). But neither of its parents is slow. I don't see anything obviously suspect in the merge, though. Thanks; we probably should revert the merge before the final and handle the fallout later, unless we know why the integrated whole is slower than its parts. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git pull takes ~8 seconds on up-to-date Linux git tree
Jeff King p...@peff.net writes: On Thu, Oct 04, 2012 at 04:14:54PM +0200, Markus Trippelsdorf wrote: with current trunk I get the following on an up-to-date Linux tree: markus@x4 linux % time git pull Already up-to-date. git pull 7.84s user 0.26s system 92% cpu 8.743 total git version 1.7.12 is much quicker: markus@x4 linux % time git pull Already up-to-date. git pull 0.10s user 0.02s system 16% cpu 0.740 total Yikes. I can easily reproduce here. Bisecting between master and v1.7.12 gives a curious result: the slowdown first occurs with the merge commit 34f5130 (Merge branch 'jc/merge-bases', 2012-09-11). But neither of its parents is slow. I don't see anything obviously suspect in the merge, though. There is one extra call site of reduce_heads() added by that merge relative to 34f5130. It is likely that the updated reduce_heads() is totally broken from performance point of view, or at least it is not optimized for the usage pattern at the new call site, in which case it would be very plausible that both parents of the merge perform well while the merge result is sucky. It gets more curious, though. I am getting ~9 seconds with the tip of master and ~0.4 seconds with 1.7.12. When 34f5130^2 is reverted at the tip of master, I get ~0.35 seconds. That matches Markus's observation. However. If I revert 5802f81 that updated the implementation of fmt-merge-msg on top of 'master', *without* reverting 34f5130^2, I get ~4.5 seconds. As we are doing an Already up-to-date pull, I thought there is no need to call fmt-merge-msg in the first place? Which may indicate that git merge has been broken for a long time and making unnecessary calls. Hrmmm... -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git pull takes ~8 seconds on up-to-date Linux git tree
Jeff King p...@peff.net writes: On Thu, Oct 04, 2012 at 04:14:54PM +0200, Markus Trippelsdorf wrote: with current trunk I get the following on an up-to-date Linux tree: markus@x4 linux % time git pull Already up-to-date. git pull 7.84s user 0.26s system 92% cpu 8.743 total git version 1.7.12 is much quicker: markus@x4 linux % time git pull Already up-to-date. git pull 0.10s user 0.02s system 16% cpu 0.740 total Yikes. I can easily reproduce here. Bisecting between master and v1.7.12 gives a curious result: the slowdown first occurs with the merge commit 34f5130 (Merge branch 'jc/merge-bases', 2012-09-11). But neither of its parents is slow. I don't see anything obviously suspect in the merge, though. I think the following is likely to be the correct solution to this. commit.c | 3 +++ 1 file changed, 3 insertions(+) diff --git i/commit.c w/commit.c index 0246767..e1dd5b9 100644 --- i/commit.c +++ w/commit.c @@ -733,6 +733,9 @@ static int remove_redundant(struct commit **array, int cnt) int *filled_index; int i, j, filled; + if (cnt 2) + return cnt; + work = xcalloc(cnt, sizeof(*work)); redundant = xcalloc(cnt, 1); filled_index = xmalloc(sizeof(*filled_index) * (cnt - 1)); -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git pull takes ~8 seconds on up-to-date Linux git tree
Junio C Hamano gits...@pobox.com writes: Jeff King p...@peff.net writes: On Thu, Oct 04, 2012 at 04:14:54PM +0200, Markus Trippelsdorf wrote: with current trunk I get the following on an up-to-date Linux tree: markus@x4 linux % time git pull Already up-to-date. git pull 7.84s user 0.26s system 92% cpu 8.743 total git version 1.7.12 is much quicker: markus@x4 linux % time git pull Already up-to-date. git pull 0.10s user 0.02s system 16% cpu 0.740 total Yikes. I can easily reproduce here. Bisecting between master and v1.7.12 gives a curious result: the slowdown first occurs with the merge commit 34f5130 (Merge branch 'jc/merge-bases', 2012-09-11). But neither of its parents is slow. I don't see anything obviously suspect in the merge, though. I think the following is likely to be the correct solution to this. No, it is not. Sorry for the noise. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git pull takes ~8 seconds on up-to-date Linux git tree
Junio C Hamano gits...@pobox.com writes: Junio C Hamano gits...@pobox.com writes: Jeff King p...@peff.net writes: On Thu, Oct 04, 2012 at 04:14:54PM +0200, Markus Trippelsdorf wrote: with current trunk I get the following on an up-to-date Linux tree: markus@x4 linux % time git pull Already up-to-date. git pull 7.84s user 0.26s system 92% cpu 8.743 total git version 1.7.12 is much quicker: markus@x4 linux % time git pull Already up-to-date. git pull 0.10s user 0.02s system 16% cpu 0.740 total Yikes. I can easily reproduce here. Bisecting between master and v1.7.12 gives a curious result: the slowdown first occurs with the merge commit 34f5130 (Merge branch 'jc/merge-bases', 2012-09-11). But neither of its parents is slow. I don't see anything obviously suspect in the merge, though. I think the following is likely to be the correct solution to this. No, it is not. Sorry for the noise. Here is a tested (in the sense that it passes the test suite, and also in the sense that an empty pull in the kernel history gives quick turnaround) patch. As I do not think we would want to revert 5802f81 (fmt-merge-msg: discard needless merge parents, 2012-04-18) which was a correctness fix, I think we would rather want to do something like this. -- 8 -- Subject: paint_down_to_common(): parse commit before relying on its timestamp When refactoring the merge-base computation to reduce the pairwise O(n*(n-1)) traversals to parallel O(n) traversals, the code forgot that timestamp based heuristics needs each commit to have been parsed. This caused an empty git pull to spend cycles, traversing the history all the way down to 0 (because an unparsed commit object has 0 timestamp, and any other commit object with positive timestamp will be processed for its parents, all getting parsed), only to come up with a merge message to be used. Signed-off-by: Junio C Hamano gits...@pobox.com --- commit.c | 5 + 1 file changed, 5 insertions(+) diff --git i/commit.c w/commit.c index 0246767..213bc98 100644 --- i/commit.c +++ w/commit.c @@ -609,6 +609,7 @@ static struct commit *interesting(struct commit_list *list) return NULL; } +/* all input commits in one and twos[] must have been parsed! */ static struct commit_list *paint_down_to_common(struct commit *one, int n, struct commit **twos) { struct commit_list *list = NULL; @@ -617,6 +618,8 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, struc one-object.flags |= PARENT1; commit_list_insert_by_date(one, list); + if (!n) + return list; for (i = 0; i n; i++) { twos[i]-object.flags |= PARENT2; commit_list_insert_by_date(twos[i], list); @@ -737,6 +740,8 @@ static int remove_redundant(struct commit **array, int cnt) redundant = xcalloc(cnt, 1); filled_index = xmalloc(sizeof(*filled_index) * (cnt - 1)); + for (i = 0; i cnt; i++) + parse_commit(array[i]); for (i = 0; i cnt; i++) { struct commit_list *common; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html