Re: Drastic jump in the time required for the test suite
On Fri, Oct 28, 2016 at 3:41 AM, Eric Wong wrote:
> Johannes Schindelin wrote:
>> I know you are a fan of testing things thoroughly in the test suite, but I
>> have to say that it is getting out of hand, in particular due to our
>> over-use of shell script idioms (which really only run fast on Linux, not
>> a good idea for portable software).
>
> How much effort would it take to optimize a /bin/sh?
>
> Would replacing uses of fork+execve with posix_spawn be fast and
> portable enough?
>
> Even on Linux, performance sucks for me. I've been hoping dash
> can use posix_spawn (or vfork directly) to see if that can
> help things.

Busybox hush uses vfork on no-mmu builds. You could try it out. I don't know how complete it is, though.
-- 
Duy
Re: Drastic jump in the time required for the test suite
Johannes Schindelin wrote:
> I know you are a fan of testing things thoroughly in the test suite, but I
> have to say that it is getting out of hand, in particular due to our
> over-use of shell script idioms (which really only run fast on Linux, not
> a good idea for portable software).

How much effort would it take to optimize a /bin/sh?

Would replacing uses of fork+execve with posix_spawn be fast and
portable enough?

Even on Linux, performance sucks for me. I've been hoping dash
can use posix_spawn (or vfork directly) to see if that can
help things. That won't help with subshells, though...

(I'm back to using a Centrino laptop from 2005)
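The cost under discussion here is per-process creation overhead: every external command in a test script pays a fork+execve (or a full process spawn on Windows), while a shell builtin pays nothing. A minimal sketch of that difference, not taken from the thread itself (the /bin/true loop and the iteration count are arbitrary; wrapping each loop in `time` makes the gap visible on a slow platform):

```shell
#!/bin/sh
# Run the same no-op 200 times as an external program and as a builtin.
# Only the first loop pays fork+execve on every iteration.
n=0
while [ $n -lt 200 ]
do
	/bin/true        # external: one fork+execve per iteration
	n=$((n+1))
done
echo "external iterations: $n"

m=0
while [ $m -lt 200 ]
do
	true             # builtin: no new process at all
	m=$((m+1))
done
echo "builtin iterations: $m"
```

On Linux the builtin loop is typically orders of magnitude faster; on Windows the gap is far larger still, which is the point of replacing fork+execve with posix_spawn (or vfork) where the shell allows it.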
Re: Drastic jump in the time required for the test suite
On 21.10.2016 12:59, Duy Nguyen wrote:
> On Thu, Oct 20, 2016 at 11:40 PM, René Scharfe wrote:
>> I get this on WSL with prove -j8:
>>
>> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)
>>
>> And this for a run on Debian inside a Hyper-V VM on the same system:
>>
>> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)
>>
>> All tests pass on master.
>
> Thank you for doing this. 10 times slower is probably not worth following up
> (though the absolute numbers still look amazing, you have some beefy machine
> there).

Thanks, but it's not too impressive: Xeon E3-1231 v3, Crucial BX100 SSD, 8GB RAM. I wonder how much faster a brand-new CPU with more RAM and a PCIe SSD would be...

René
Re: Drastic jump in the time required for the test suite
On 21.10.2016 15:10, Matthieu Moy wrote:
> René Scharfe writes:
>
>> I get this on WSL with prove -j8:
>>
>> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)
>>
>> And this for a run on Debian inside a Hyper-V VM on the same system:
>>
>> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)
>
> What about the same without WSL on Windows?

Files=755, Tests=14894, 1774 wallclock secs ( 9.31 usr 9.58 sys + 821.87 cusr 1961.23 csys = 2801.99 CPU)

René
Re: Drastic jump in the time required for the test suite
René Scharfe writes:

> I get this on WSL with prove -j8:
>
> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)
>
> And this for a run on Debian inside a Hyper-V VM on the same system:
>
> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)

What about the same without WSL on Windows?
-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
Re: Drastic jump in the time required for the test suite
On Fri, Oct 21, 2016 at 3:40 AM, Dennis Kaarsemaker wrote:
> On Thu, 2016-10-20 at 08:31 -0400, Jeff King wrote:
>
>> I'm also not entirely convinced that the test suite being a shell script
>> is the main culprit for its slowness. We run git a lot of times, and
>> that's inherent in testing it. I ran the whole test suite under
>> "strace -f -e execve". There are ~335K execs. Here's the breakdown of
>> the top ones:
>
> You're measuring execve's, but fork (well, fork emulation; there's no
> actual fork) is also expensive on Windows iirc, so subshells add a lot
> to this cost.

Shells fork for pipes as well, and for redirections and some other constructs, if I remember correctly (I attempted to port busybox ash to Windows and had to find and "fix" all the forks).

> That said, strace -eclone says that a 'make test' forks
> ~408k times, and while this is significantly more than the number of
> execs in your example, this does include cvs and svn tests and it's
> still in the same ballpark.
-- 
Duy
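To make the subshell cost concrete: every `( ... )` group is a forked child (or expensive fork emulation on Windows), which you can observe from the shell itself because the child's state changes never reach the parent, whereas a brace group stays in the current process. A minimal sketch, not taken from the test suite:

```shell
#!/bin/sh
# ( ... ) forks a subshell; assignments made inside die with the child.
x=1
( x=2 )                          # forked subshell: the change is lost
echo "after subshell: x=$x"      # still 1

# { ...; } is a brace group in the current shell: no fork happens.
{ x=3; }
echo "after brace group: x=$x"   # now 3
```

This is also why rewriting `(cd dir && git cmd)` as `git -C dir cmd`, as discussed elsewhere in this thread, saves a fork on top of the exec.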
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 11:40 PM, René Scharfe wrote:
> On 20.10.2016 13:02, Duy Nguyen wrote:
>> On Wed, Oct 19, 2016 at 4:18 PM, Johannes Schindelin wrote:
>>> Hi Junio,
>>>
>>> I know you are a fan of testing things thoroughly in the test suite, but I
>>> have to say that it is getting out of hand, in particular due to our
>>> over-use of shell script idioms (which really only run fast on Linux, not
>>> a good idea for portable software).
>>>
>>> My builds of `pu` now time out, after running for 3h straight in the VM
>>> dedicated to perform the daily routine of building and testing the git.git
>>> branches in Git for Windows' SDK. For comparison, `next` passes build &
>>> tests in 2.6h. That is quite the jump.
>>
>> I'm just curious, will running git.exe from WSL [1] help speed things
>> up a bit (or, hopefully, a lot)? I'm assuming that shell's speed in
>> WSL is quite fast.
>>
>> I'm pretty sure the test suite would need some adaptation, but if the
>> speedup is significant, maybe it's worth spending time on.
>>
>> [1] https://news.ycombinator.com/item?id=12748395
>
> I get this on WSL with prove -j8:
>
> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)
>
> And this for a run on Debian inside a Hyper-V VM on the same system:
>
> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)
>
> All tests pass on master.

Thank you for doing this. 10 times slower is probably not worth following up (though the absolute numbers still look amazing, you have some beefy machine there).
-- 
Duy
Re: Drastic jump in the time required for the test suite
On Fri, Oct 21, 2016 at 07:27:40AM +0200, Johannes Sixt wrote:

> On 20.10.2016 23:38, Jeff King wrote:
>> test_cmp () {
>> 	# optimize for common "they are the same" case
>> 	# without any subshells or subprograms
>
> We do this already on Windows; it's the function mingw_test_cmp in our
> test suite:
> https://github.com/git/git/blob/master/t/test-lib-functions.sh#L884-L945

Ah, thanks, I didn't even think to look for an existing solution like this. Looks like it was done for LF/CRLF reasons, though, not performance. I wonder if anybody has measured whether it improves things there or not.

-Peff
Re: Drastic jump in the time required for the test suite
On 20.10.2016 23:38, Jeff King wrote:
> test_cmp () {
> 	# optimize for common "they are the same" case
> 	# without any subshells or subprograms

We do this already on Windows; it's the function mingw_test_cmp in our test suite:
https://github.com/git/git/blob/master/t/test-lib-functions.sh#L884-L945
-- 
Hannes
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 05:00:13PM -0400, Jeff King wrote:

> > I am not an expert on perl nor tracing, but is it feasible to find out
> > how many internal calls there are? i.e. either some shell script (rebase,
> > submodule) calling git itself a couple of times or even from compile/git/git
> > itself, e.g. some submodule operations use forking in there.
>
> The script below is my attempt, though I think it is not quite right, as
> "make" should be the single apex of the graph. You can run it like:
>
> strace -f -o /tmp/foo.out -e clone,execve make test
> perl graph.pl /tmp/foo.out | less -S

Ah, I see why it is sometimes wrong. We may see parent/child interactions out of order. E.g., we may see:

  1. Process B execs "git".
  2. Process B exits.
  3. Process A forks pid B.

because strace does not manage to ptrace "A" before "B" finishes running. It's hard to tell in step 3 if this is the same "A" we saw earlier. You cannot just go by PID, because the tests run enough processes that we end up reusing PIDs. My script has a hack which increments a counter when processes go away, but that happens in step 2 above (and steps 1 and 3 think they are seeing different processes, even though they are the same).

I'm sure it could be cleverly hacked around, but the easiest thing is probably to just make sure the tests are not run in parallel. High load makes that kind of out-of-order sequence much more likely. Making sure I run it with "make -j1", the script I posted earlier gives pretty reasonable output. I won't post mine here, as it's double-digit megabytes, but you should be able to recreate it yourself.

Of course, the call graph by itself isn't that interesting. But you might be able to do some interesting analysis, or just find spots of interest to you (like git-submodule).

-Peff
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 02:53:36PM -0700, Stefan Beller wrote:

> >> That said I really like the idea of having a helper that would eliminate
> >> the cat for you, e.g.:
> >>
> >> git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
> >> --porcelain=v2 --branch --untracked-files=all <<-EOF
> >> 1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
> >> 1 AM S.M. 00 16 16 $_z40 $HSUP sub1
> >> EOF
> >
> > I think that helper still ends up using "cat" and "diff" under the hood,
>
> I assumed that helper being a C program, that only needs to fork once
> for the actual git call and then it sits idle to compare the exact output
> from stdout to its stdin.

Ah, I see. I thought you meant a shell helper. A C helper does drop 2 forks down to 1, but it would be nice if we could drop it to 0.

-Peff
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 05:38:03PM -0400, Jeff King wrote:

> I think that helper still ends up using "cat" and "diff" under the hood,
> unless you write those bits in pure shell. But at that point, I suspect
> we could do "cat" and "test_cmp" in pure shell, something like:
> [...]
> Those are both completely untested. But maybe they are worth playing
> around with for somebody on Windows to see if they make a dent in the
> test runtime.

If you tried to run them, you probably noticed that the "untested" was really true. One of the functions was missing an "else", and the other forgot to add a "\n" to its printf.

The patch below gets closer, though there are still a handful of test failures. I didn't investigate deeply, but I think at least one is related to the "read/printf" version of cat not being binary-clean.

-Peff

---
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index fdaeb3a96b..de37f3d 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -685,9 +685,48 @@ test_expect_code () {
 # - not all diff versions understand "-u"
 
 test_cmp() {
+	# optimize for common "they are the same" case
+	# without any subshells or subprograms
+	while true; do
+		if ! read -r line1 <&3
+		then
+			if ! read -r line2 <&4
+			then
+				# EOF on both; good
+				return 0
+			else
+				# EOF only on file1; fail
+				break
+			fi
+		fi
+		if ! read -r line2 <&4
+		then
+			# EOF only on file2; fail
+			break
+		fi
+		test "$line1" = "$line2" || break
+	done 3<"$1" 4<"$2"
+
+	# if we get here, the optimized version found some
+	# difference. We can just "return 1", but let's run
+	# the real $GIT_TEST_CMP to provide pretty output.
+	# This should generally only happen on test failures,
+	# so performance isn't a big deal.
 	$GIT_TEST_CMP "$@"
 }
 
+cat () {
+	# optimize common here-doc usage
+	if test $# -eq 0
+	then
+		while read -r line
+		do
+			printf '%s\n' "$line"
+		done
+	fi
+	command cat "$@"
+}
+
 # test_cmp_bin - helper to compare binary files
 test_cmp_bin() {
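One way to see the "not binary-clean" failure mode suspected above, without running the whole suite: `read` reports failure at EOF even when it has already filled its variable, so a read/printf loop silently drops a final line that lacks a trailing newline. A standalone sketch (the temp file name is arbitrary):

```shell
#!/bin/sh
# `read` returns nonzero at EOF, so the loop body never runs for a
# final unterminated line -- its content is left behind in $line.
tmp=readloop-demo.txt
printf 'one\ntwo' >"$tmp"        # note: no newline after "two"

out=
while read -r line
do
	out="$out$line."
done <"$tmp"

echo "copied: $out"              # "one." only; "two" was dropped
rm -f "$tmp"
```

NUL bytes cause similar trouble for `read`, another way a read-based loop can diverge from the real `cat`.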
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 2:38 PM, Jeff King wrote:
> On Thu, Oct 20, 2016 at 12:54:32PM -0700, Stefan Beller wrote:
>
>> Maybe we should stop introducing un-optimized tests.
>> [...]
>> * heavy use of the "git -C <dir>" pattern. When applying that
>> thoroughly we'd save spawning the subshells.
>
> Yeah, I imagine with some style changes we could drop quite a few
> subshells. The problem is that the conversion work is manual and
> tedious. I'd look first for spots where we can eliminate thousands of
> calls with a single change.
>
>> That said I really like the idea of having a helper that would eliminate
>> the cat for you, e.g.:
>>
>> git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
>> --porcelain=v2 --branch --untracked-files=all <<-EOF
>> 1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
>> 1 AM S.M. 00 16 16 $_z40 $HSUP sub1
>> EOF
>
> I think that helper still ends up using "cat" and "diff" under the hood,

I assumed that helper, being a C program, only needs to fork once for the actual git call, and then it sits idle to compare the exact output from stdout to its stdin. If there is a difference it will do the extra work (i.e. put stdin and stdout into a file and call diff on them).
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 12:54:32PM -0700, Stefan Beller wrote:

> Maybe we should stop introducing un-optimized tests.
> [...]
> * heavy use of the "git -C <dir>" pattern. When applying that
> thoroughly we'd save spawning the subshells.

Yeah, I imagine with some style changes we could drop quite a few subshells. The problem is that the conversion work is manual and tedious. I'd look first for spots where we can eliminate thousands of calls with a single change.

> That said I really like the idea of having a helper that would eliminate
> the cat for you, e.g.:
>
> git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
> --porcelain=v2 --branch --untracked-files=all <<-EOF
> 1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
> 1 AM S.M. 00 16 16 $_z40 $HSUP sub1
> EOF

I think that helper still ends up using "cat" and "diff" under the hood, unless you write those bits in pure shell. But at that point, I suspect we could do "cat" and "test_cmp" in pure shell, something like:

cat () {
	# optimize common here-doc usage
	if test $# -eq 0
	then
		while read -r line
		do
			printf '%s' "$line"
		done
	fi
	command cat "$@"
}

test_cmp () {
	# optimize for common "they are the same" case
	# without any subshells or subprograms
	while true; do
		if ! read -r line1 <&3
		then
			if ! read -r line2 <&4
				# EOF on both; good
				return 0
			else
				# EOF only on file1; fail
				break
			fi
		fi
		if ! read -r line2 <&4
		then
			# EOF only on file2; fail
			break
		fi
		test "$line1" = "$line2" || break
	done 3<"$1" 4<"$2"

	# if we get here, the optimized version found some
	# difference. We can just "return 1", but let's run
	# the real $GIT_TEST_CMP to provide pretty output.
	# This should generally only happen on test failures,
	# so performance isn't a big deal.
	"$GIT_TEST_CMP" "$@"
}

Those are both completely untested. But maybe they are worth playing around with for somebody on Windows to see if they make a dent in the test runtime.

-Peff
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 10:38:23PM +0200, Johannes Sixt wrote:

> On 20.10.2016 14:31, Jeff King wrote:
>> Close to 1/3 of those processes are just invoking the bin-wrapper
>> script to set up the EXEC_PATH, etc. I imagine it would not be too hard
>> to just do that in the test script. In fact, it looks like:
>>
>> make prefix=/wherever install
>> GIT_TEST_INSTALLED=/wherever/bin make test
>>
>> might give you an immediate speedup by skipping bin-wrappers entirely.
>
> Running the tests with --with-dashes should give you the same effect, no?

Yeah, looks like it. It still uses bin-wrappers for t/helper, but that should be a minority of calls. Which I think explains why I saw some test failures with the GIT_TEST_INSTALLED above: it does not know about t/helper, but relies on those programs being present in $GIT_BUILD_DIR. So I suspect it has been totally broken since e6e7530d10 (test helpers: move test-* to t/helper/ subdirectory, 2016-04-13).

-Peff
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 09:30:27AM -0700, Stefan Beller wrote:

> On Thu, Oct 20, 2016 at 5:31 AM, Jeff King wrote:
>
>> $ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c | sort -rn | head
>> 152271 /home/peff/compile/git/git
>>  57340 /home/peff/compile/git/t/../bin-wrappers/git
>>  16865 /bin/sed
>>  12650 /bin/rm
>>  11257 /bin/cat
>>   9326 /home/peff/compile/git/git-sh-i18n--envsubst
>>   9079 /usr/bin/diff
>>   8013 /usr/bin/wc
>>   5924 /bin/mv
>>   4566 /bin/grep
>
> I am not an expert on perl nor tracing, but is it feasible to find out
> how many internal calls there are? i.e. either some shell script (rebase,
> submodule) calling git itself a couple of times or even from compile/git/git
> itself, e.g. some submodule operations use forking in there.

The script below is my attempt, though I think it is not quite right, as "make" should be the single apex of the graph. You can run it like:

strace -f -o /tmp/foo.out -e clone,execve make test
perl graph.pl /tmp/foo.out | less -S

One thing that it counts (that was not counted above) is the number of forks for subshells, which is considerable. I don't know how expensive that is versus, say, running "cat" (if your fork() doesn't copy-on-write, and you implement sub-programs via an efficient spawn() call, it's possible that the subshells are significantly more expensive).

-Peff

-- >8 --
#!/usr/bin/perl

my %clone;
my %exec;
my %is_child;
my %counter;

while (<>) {
	# execve("some-prog", ...
	if (/^(\d+)\s+execve\("(.*?)"/) {
		push @{$exec{node($1)}}, $2;
	}
	# clone(...) = <pid>
	# or
	# <... clone resumed> ...) = <pid>
	elsif (/^(\d+)\s+.*clone.*\) = (\d+)$/) {
		push @{$clone{node($1)}}, node($2);
		$is_child{node($2)} = 1;
	}
	# +++ exited with <status> +++
	# We have to keep track of this because pids get recycled,
	# and so are not unique node names in our graph.
	elsif (/^(\d+)\s+.*exited with/) {
		$counter{$1}++;
	}
}

show($_, 0) for grep { !$is_child{$_} } keys(%clone);

sub show {
	my ($pid, $indent) = @_;
	my @progs = @{$exec{$pid}};
	if (!@progs) {
		@progs = ("(fork)");
	}
	print ' ' x $indent;
	print "$pid: ", shift @progs;
	print " => $_" for @progs;
	print "\n";
	show($_, $indent + 2) for @{$clone{$pid}};
}

sub node {
	my $pid = shift;
	my $c = $counter{$pid} || "0";
	return "$pid-$c";
}
Re: Drastic jump in the time required for the test suite
On Thu, 2016-10-20 at 08:31 -0400, Jeff King wrote:

> I'm also not entirely convinced that the test suite being a shell script
> is the main culprit for its slowness. We run git a lot of times, and
> that's inherent in testing it. I ran the whole test suite under
> "strace -f -e execve". There are ~335K execs. Here's the breakdown of
> the top ones:

You're measuring execve's, but fork (well, fork emulation; there's no actual fork) is also expensive on Windows iirc, so subshells add a lot to this cost. That said, strace -eclone says that a 'make test' forks ~408k times, and while this is significantly more than the number of execs in your example, this does include cvs and svn tests and it's still in the same ballpark.

D.
Re: Drastic jump in the time required for the test suite
On 20.10.2016 14:31, Jeff King wrote:
> Close to 1/3 of those processes are just invoking the bin-wrapper
> script to set up the EXEC_PATH, etc. I imagine it would not be too hard
> to just do that in the test script. In fact, it looks like:
>
> make prefix=/wherever install
> GIT_TEST_INSTALLED=/wherever/bin make test
>
> might give you an immediate speedup by skipping bin-wrappers entirely.

Running the tests with --with-dashes should give you the same effect, no?
-- 
Hannes
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 3:50 AM, Johannes Schindelin wrote:
> Hi Peff,
>
> On Wed, 19 Oct 2016, Jeff King wrote:
>
>> On Wed, Oct 19, 2016 at 10:32:12AM -0700, Junio C Hamano wrote:
>>
>>> Maybe we should start optimizing the tests...

Maybe we should stop introducing un-optimized tests.

For other reasons I just stumbled upon t7064 (porcelain V2 output for git status). This is how an arbitrary test looks:

test_expect_success 'staged changes in added submodule (AM S.M.)' '
	(cd super_repo &&
		## stage the changes in the submodule.
		(cd sub1 &&
			git add file_in_sub
		) &&
		HMOD=$(git hash-object -t blob -- .gitmodules) &&
		HSUP=$(git rev-parse HEAD) &&
		HSUB=$HSUP &&
		cat >expect <<-EOF &&
		# branch.oid $HSUP
		# branch.head master
		# branch.upstream origin/master
		# branch.ab +0 -0
		1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
		1 AM S.M. 00 16 16 $_z40 $HSUB sub1
		EOF
		git status --porcelain=v2 --branch --untracked-files=all >actual &&
		test_cmp expect actual
	)
'

Following "modern" Git tests I would have expected:

* heavy use of the "git -C <dir>" pattern. When applying that
  thoroughly we'd save spawning the subshells.
* no `cd` on the same line as the opening paren. (This is style and
  would derail the performance discussion.)

test_expect_success 'staged changes in added submodule (AM S.M.)' '
	git -C super_repo/sub1 add file_in_sub &&
	HMOD=$(git -C super_repo hash-object -t blob -- .gitmodules) &&
	HSUP=$(git -C super_repo rev-parse HEAD) &&
	# as a comment: HSUB is equal to HSUP, because ...
	cat >expect <<-EOF &&
	# branch.oid $HSUP
	# branch.head master
	# branch.upstream origin/master
	# branch.ab +0 -0
	1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
	1 AM S.M. 00 16 16 $_z40 $HSUP sub1
	EOF
	git -C super_repo status --porcelain=v2 --branch --untracked-files=all >actual &&
	test_cmp expect actual
'

That said I really like the idea of having a helper that would eliminate the cat for you, e.g.:

git_test_helper_equal_stdin_or_diff_and_die -C super_repo status --porcelain=v2 --branch --untracked-files=all <<-EOF
1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
1 AM S.M. 00 16 16 $_z40 $HSUP sub1
EOF

Thanks,
Stefan
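The proposal above is for a C helper, but its intended contract can be sketched in shell to pin the semantics down: run a command, capture its stdout, and diff it against the expected text supplied on stdin. The function and file names below are made up for illustration, and a shell version of course still pays the forks the C helper exists to avoid:

```shell
#!/bin/sh
# Hypothetical sketch of the helper's contract: compare a command's
# stdout against the expected output fed to us as a here-doc.
check_stdout_matches_stdin () {
	"$@" >cs_actual &&     # run the command, capture stdout
	cat >cs_expect &&      # the here-doc arrives on our stdin
	diff -u cs_expect cs_actual
}

check_stdout_matches_stdin printf '%s\n' hello <<EOF
hello
EOF
rc=$?
echo "matched: rc=$rc"
rm -f cs_actual cs_expect
```

A C implementation would compare the two streams in memory and only fall back to writing files and spawning diff when they differ, reducing the per-assertion fork count to the single git invocation.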
Re: Drastic jump in the time required for the test suite
Junio C Hamano writes:

> Are you proposing to replace the tests written as shell scripts with
> scripts in another language or framework that run equivalent
> sequences of git commands that is as portable as, if not more, Bourne
> shell?

The language (/bin/sh) is probably not the biggest issue. The way we use it may be. I don't have a benchmark to tell what slows down the testsuite, but for example we often write

cat >expected
Re: Drastic jump in the time required for the test suite
On 20.10.2016 13:02, Duy Nguyen wrote:
> On Wed, Oct 19, 2016 at 4:18 PM, Johannes Schindelin wrote:
>> Hi Junio,
>>
>> I know you are a fan of testing things thoroughly in the test suite, but I
>> have to say that it is getting out of hand, in particular due to our
>> over-use of shell script idioms (which really only run fast on Linux, not
>> a good idea for portable software).
>>
>> My builds of `pu` now time out, after running for 3h straight in the VM
>> dedicated to perform the daily routine of building and testing the git.git
>> branches in Git for Windows' SDK. For comparison, `next` passes build &
>> tests in 2.6h. That is quite the jump.
>
> I'm just curious, will running git.exe from WSL [1] help speed things
> up a bit (or, hopefully, a lot)? I'm assuming that shell's speed in
> WSL is quite fast.
>
> I'm pretty sure the test suite would need some adaptation, but if the
> speedup is significant, maybe it's worth spending time on.
>
> [1] https://news.ycombinator.com/item?id=12748395

I get this on WSL with prove -j8:

Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)

And this for a run on Debian inside a Hyper-V VM on the same system:

Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)

All tests pass on master.

René
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 5:31 AM, Jeff King wrote:
>
> $ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c | sort -rn | head
> 152271 /home/peff/compile/git/git
>  57340 /home/peff/compile/git/t/../bin-wrappers/git
>  16865 /bin/sed
>  12650 /bin/rm
>  11257 /bin/cat
>   9326 /home/peff/compile/git/git-sh-i18n--envsubst
>   9079 /usr/bin/diff
>   8013 /usr/bin/wc
>   5924 /bin/mv
>   4566 /bin/grep

I am not an expert on perl nor tracing, but is it feasible to find out how many internal calls there are? I.e. either some shell script (rebase, submodule) calling git itself a couple of times, or even compile/git/git itself; e.g. some submodule operations use forking in there.
Re: Drastic jump in the time required for the test suite
Johannes Schindelin writes:

> Of course, if you continue to resist (because the problem is obviously not
> affecting you personally, so why would you care), I won't even try to find
> the time to start on that project.

Sorry, but I did not know I was resisting, as I didn't see any proposal to resist against in the first place. I was trying to help by mentioning two tricks that may be helping my test runtime that may help you as well.

Are you proposing to replace the tests written as shell scripts with scripts in another language or framework that run equivalent sequences of git commands that is as portable as, if not more, Bourne shell? If that is what you are proposing, well, I won't stop you and I may even help you in there, but I fail to guess what alternative you have in mind. I certainly do not have a suggestion myself, and I won't suggest migrating to tclsh or perl for that matter.

If that is not what you are trying to propose, and if parallelism has already been employed, then there may or may not be other tricks you are not yet using that help to speed up your shell execution that others are using; being confrontational is not an effective way to ask others about them.
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 12:17:33PM +0200, Johannes Schindelin wrote:

> If you want to know just how harmful this reliance on shell scripting is
> to our goal of keeping Git portable: already moving from Linux to MacOSX
> costs you roughly 3x as long to run the build & test (~12mins vs ~36mins
> for GCC, according to https://travis-ci.org/git/git/builds/159125647).

Wait, shell scripts are slow on MacOS now? Perhaps, but it seems more likely that one or more of the following is true:

  - setup of the OS X VM takes longer (it does; if you click through to
    the test results, you'll see that the "make test" step goes from
    647s on Linux to 1108s on MacOS. That's much worse, but not even
    twice as slow, let alone 3x).

  - Travis Linux and OSX VMs do not have identical hardware. Looking at
    https://docs.travis-ci.com/user/ci-environment/, it appears that
    Linux containers get twice as many cores.

  - Git performance on Linux may be better than MacOS. The test suite
    is very filesystem-heavy because it creates and destroys a lot of
    files and repositories. If the kernel vfs performance is worse,
    it's likely to show up in the test suite (especially if the issue
    is latency and you aren't doing it massively in parallel). I don't
    have a real way to measure that, but it seems like a plausible
    factor.

So that sucks that the MacOS Travis build takes a half hour to run. But I don't think that shell scripting is the culprit.

> So the only thing that would really count as an improvement would be to
> change the test suite in such a manner that it relies more on helpers in
> t/helper/ and less on heavy-duty shell scripting.
>
> Of course, if you continue to resist (because the problem is obviously not
> affecting you personally, so why would you care), I won't even try to find
> the time to start on that project.

I'm not sure what you mean by "resist". The test suite has been a set of shell scripts for over a decade. As far as I know there is not currently a viable alternative. If you have patches that make it faster without negatively impacting the ease of writing tests, I'd be happy to see them. If you have more t/helper programs that can eliminate expensive bits of the shell scripts and speed up the test run, great. If you have some other proposal entirely, I'd love to hear it. But I do not see that there is any proposal to "resist" at this point.

I'm also not entirely convinced that the test suite being a shell script is the main culprit for its slowness. We run git a lot of times, and that's inherent in testing it. I ran the whole test suite under "strace -f -e execve". There are ~335K execs. Here's the breakdown of the top ones:

$ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c | sort -rn | head
 152271 /home/peff/compile/git/git
  57340 /home/peff/compile/git/t/../bin-wrappers/git
  16865 /bin/sed
  12650 /bin/rm
  11257 /bin/cat
   9326 /home/peff/compile/git/git-sh-i18n--envsubst
   9079 /usr/bin/diff
   8013 /usr/bin/wc
   5924 /bin/mv
   4566 /bin/grep

Almost half are running git itself. Let's assume that can't be changed. That leaves ~180K of shell-related overhead (versus the optimal case, that the entire test suite becomes one monolithic program ;) ).

Close to 1/3 of those processes are just invoking the bin-wrapper script to set up the EXEC_PATH, etc. I imagine it would not be too hard to just do that in the test script. In fact, it looks like:

make prefix=/wherever install
GIT_TEST_INSTALLED=/wherever/bin make test

might give you an immediate speedup by skipping bin-wrappers entirely.

The rest of it is harder. I think you'd have to move the test suite to a language like perl that can do more of that as builtins (I'm sure you'd enjoy the portability implications of _that_). It would almost be easier to build a variant of the shell that has sed, rm, cat, and a few others compiled in.

-Peff

PS I haven't kept up with all of this POSIX-layer stuff that's been announced in Windows the past few months. Is it a viable path forward that would have better performance (obviously not in the short term, but where we may arrive in a few years)?
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 12:50:32PM +0200, Johannes Schindelin wrote: > That reflects my findings, too. I want to add that I found preciously > little difference between running slow-to-fast and running in numeric > order, so I gave up on optimizing on that front. Interesting. It makes a 10-15% difference here. I also point "--root" at a ram disk. The tests are very I/O heavy and sometimes fsync; even on a system with an SSD, this saves another ~10%. I know that's small potatoes compared to the Windows vs Linux times, but it might be worth exploring. > Further, I found that the Subversion tests (which run at the end) are so > close in their running time that running the tests in parallel with -j5 > does not result in any noticeable improvement when reordered. I normally don't run the Subversion tests at all. Installing cvs, cvsps, subversion, and libsvn-perl nearly doubles the runtime of the test suite for me (I imagine adding p4 to the mix would bump it further). While it's certainly possible to break them with a change in core git, it doesn't seem like a good tradeoff if I'm not touching them often. As the GfW maintainer, you probably should be running them, at least before a release. But cutting them might be a good way to speed up your day-to-day runs. I also use -j16 on a quad-core (+hyperthreads) machine, which I arrived at experimentally. At least on Linux, it's definitely worth having more threads than processors, to keep the processors busy. > I guess I will have to bite into the sour apple and try to profile, say, > t3404 somehow, including all the shell scripting stuff, to identify where > exactly all that time is lost. My guess is that it boils down to > gazillions of calls to programs like expr.exe or merely subshells. I'm not so sure it isn't gazillions of calls to git. It is testing rebase, after all, which is itself a shell script. 
GIT_TRACE_PERFORMANCE gives sort of a crude measure; it reports only
builtins (so it will underestimate the total time spent in git), but it
also doesn't make clear which programs call which, so some times are
double-counted (if a builtin shells out to another builtin). But:

  $ export GIT_TRACE_PERFORMANCE=/tmp/foo.out
  $ rm /tmp/foo.out
  $ time ./t3404-rebase-interactive.sh
  real    0m29.755s
  user    0m1.444s
  sys     0m2.268s

  $ perl -lne '
      /performance: ([0-9.]+)/ and $total += $1;
      END { print $total }
  ' /tmp/foo.out
  32.851352624

Clearly that's not 100% accurate, as it claims we spent longer in git than
the script actually took to run. Given the caveats above, I'm not even
sure if it is in the right ballpark. But there are 11,000 git builtins run
as part of that script. Even at 2ms each, that's still most of the time
going to git.

And obviously the fix involves converting git-rebase, which you're already
working on. But it's not clear to me that the test infrastructure or shell
scripts are the primary cause of the slowness in this particular case.

-Peff
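For what it's worth, the same tally can be done with awk alone. A rough
sketch follows; the two trace lines are fabricated stand-ins that only
mimic the general shape of GIT_TRACE_PERFORMANCE output, not real data
from any test run.

```shell
# Fabricated sample trace, shaped like GIT_TRACE_PERFORMANCE lines.
cat >/tmp/sample-trace <<'EOF'
12:00:00.000000 performance: 0.002000 s: git command: 'git' 'rev-parse'
12:00:00.010000 performance: 0.001500 s: git command: 'git' 'status'
EOF

# Sum the field that follows each "performance:" token.
awk '/performance: / { for (i = 1; i <= NF; i++)
         if ($i == "performance:") total += $(i + 1) }
     END { printf "%.6f\n", total }' /tmp/sample-trace
```

On the sample above this prints 0.003500, i.e. the total seconds the trace
attributes to git processes.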
Re: Drastic jump in the time required for the test suite
On Wed, Oct 19, 2016 at 4:18 PM, Johannes Schindelin wrote:
> Hi Junio,
>
> I know you are a fan of testing things thoroughly in the test suite, but I
> have to say that it is getting out of hand, in particular due to our
> over-use of shell script idioms (which really only run fast on Linux, not
> a good idea for a portable software).
>
> My builds of `pu` now time out, after running for 3h straight in the VM
> dedicated to perform the daily routine of building and testing the git.git
> branches in Git for Windows' SDK. For comparison, `next` passes build &
> tests in 2.6h. That is quite the jump.

I'm just curious, will running git.exe from WSL [1] help speed things up a
bit (or, hopefully, a lot)? I'm assuming that shell's speed in WSL is
quite fast. I'm pretty sure the test suite would need some adaptation, but
if the speedup is significant, maybe it's worth spending time on.

[1] https://news.ycombinator.com/item?id=12748395
--
Duy
Re: Drastic jump in the time required for the test suite
Hi Peff,

On Wed, 19 Oct 2016, Jeff King wrote:

> On Wed, Oct 19, 2016 at 10:32:12AM -0700, Junio C Hamano wrote:
>
> > > Maybe we should start optimizing the tests...
> >
> > Yup, two things that come to mind are to identify long ones and see if
> > each of them can be split into two halves that can be run in parallel,
> > and to go through the tests with fine toothed comb and remove the ones
> > that test exactly the same thing as another test. The latter would be
> > very time consuming, though.
>
> FWIW, I have made attempts at "split long ones into two" before, and
> didn't come up with much. There _are_ some tests that are much longer
> than others[1], but they are not longer than the whole suite takes to
> run. So running in slow-to-fast order means they start first, are run in
> parallel with the other tests, and the CPUs stay relatively full through
> the whole run.

That reflects my findings, too. I want to add that I found precious
little difference between running slow-to-fast and running in numeric
order, so I gave up on optimizing on that front.

> 43.216765165329 t3404-rebase-interactive.sh
> 30.6568658351898 t3421-rebase-topology-linear.sh
> 27.92564702034 t9001-send-email.sh
> 15.5906939506531 t9500-gitweb-standalone-no-errors.sh
> 15.4882569313049 t6030-bisect-porcelain.sh
> 14.487174987793 t7610-mergetool.sh
> 13.8276169300079 t3425-rebase-topology-merges.sh
> 12.7450480461121 t3426-rebase-submodule.sh
> 12.4915001392365 t3415-rebase-autosquash.sh
> 11.8122401237488 t5572-pull-submodule.sh

That looks very similar to what I found: t3404 on top, followed by t3421.

Further, I found that the Subversion tests (which run at the end) are so
close in their running time that running the tests in parallel with -j5
does not result in any noticeable improvement when reordered.

I guess I will have to bite into the sour apple and try to profile, say,
t3404 somehow, including all the shell scripting stuff, to identify where
exactly all that time is lost. My guess is that it boils down to
gazillions of calls to programs like expr.exe or merely subshells.

Ciao,
Dscho
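The overhead behind that guess can be seen in miniature: every `expr`
invocation is a full fork+exec of an external program (especially costly
on platforms where process creation is slow, as on Windows), while POSIX
shell arithmetic stays in-process. A minimal sketch, not taken from the
actual test suite:

```shell
# One fork+exec per invocation: the shell must spawn an external
# expr process just to perform a single addition.
i=0
i=$(expr "$i" + 1)

# The POSIX arithmetic expansion built into the shell does the
# same work without creating any child process at all.
j=0
j=$((j + 1))

echo "$i $j"
```

Both forms print the same result; the difference is purely the number of
processes spawned, which is exactly what dominates on a slow-fork system.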
Re: Drastic jump in the time required for the test suite
Hi Junio,

On Wed, 19 Oct 2016, Junio C Hamano wrote:

> Johannes Schindelin writes:
>
> > What I can also say for certain is that the version from yesterday (commit
> > 4ef44ce) was the first one in a week that built successfully and hence
> > reached the test phase, and it was the first version of `pu` ever to time
> > out after running for 3h.
>
> I am sympathetic, but I'd be lying if I said I can feel your pain.

Obviously.

> Admittedly I do not run _all_ the tests (I certainly know that I
> exclude the ones behind EXPENSIVE prerequisite), but after every
> rebuilding of 'jch' and 'pu', I run the testsuite (and also rebuild
> docs) before pushing them out, and "make test && make doc && make
> install install-doc" run sequentially for the four integration
> branches finishes within 15 minutes, even when I run them after
> "make clean".

You have the luxury of a system that runs shell scripts fast. Or maybe I
should put it differently: you set up the test suite in such a way that it
runs fast on your system, exploiting the fact that shell scripts run fast
there.

If you want to know just how harmful this reliance on shell scripting is
to our goal of keeping Git portable: already moving from Linux to MacOSX
costs you roughly 3x as long to run the build & test (~12mins vs ~36mins
for GCC, according to https://travis-ci.org/git/git/builds/159125647).

Again, I have to repeat myself: this is not good.

> Perhaps the recent change to run the tests in parallel from slower
> to faster under prove may be helping my case.

No, you misunderstand. The tests are *already* run in parallel. And
running them from slower to faster would only make a difference if the
last tests were not requiring roughly the same time to run.

Also, I cannot use prove(1) because it proved to be unreliable in my setup
(it hangs from time to time, for no good reason whatsoever, which would
only make my situation worse, of course).

> > Maybe we should start optimizing the tests...
>
> Yup, two things that come to mind are to identify long ones and see
> if each of them can be split into two halves that can be run in
> parallel, and to go through the tests with fine toothed comb and
> remove the ones that test exactly the same thing as another test.
> The latter would be very time consuming, though.

Again, you misunderstand. The problem is not whether or not to run the
tests in parallel. The problem is that our tests take an insanely long
time to run. That is a big problem; it is no good: tests are only useful
if they are cheap enough to run all the time.

So the only thing that would really count as an improvement would be to
change the test suite in such a manner that it relies more on helpers in
t/helper/ and less on heavy-duty shell scripting.

Of course, if you continue to resist (because the problem is obviously not
affecting you personally, so why would you care), I won't even try to find
the time to start on that project.

Ciao,
Dscho

P.S.: After increasing the time-out to 240 minutes, the test suite still
times out. I investigated a little bit and it appears that
t6030-bisect-porcelain.sh now hangs in the `git bisect next` call of its
"2 - bisect starts with only one bad" test case. This is specific to
Windows; I cannot reproduce the problem on Linux, and I will keep you
posted on my progress.
Re: Drastic jump in the time required for the test suite
On Wed, Oct 19, 2016 at 10:32:12AM -0700, Junio C Hamano wrote:

> > Maybe we should start optimizing the tests...
>
> Yup, two things that come to mind are to identify long ones and see
> if each of them can be split into two halves that can be run in
> parallel, and to go through the tests with fine toothed comb and
> remove the ones that test exactly the same thing as another test.
> The latter would be very time consuming, though.

FWIW, I have made attempts at "split long ones into two" before, and
didn't come up with much. There _are_ some tests that are much longer than
others[1], but they are not longer than the whole suite takes to run. So
running in slow-to-fast order means they start first, are run in parallel
with the other tests, and the CPUs stay relatively full through the whole
run.

Of course YMMV; the long tests on Windows may be different, or
proportionally much longer (I note the worst cases almost all involve
rebase, which as a shell script is presumably worse on Windows than
elsewhere). And of course any reasoning about slow-to-fast order does not
apply if you are not using a tool to do that for you. :)

-Peff

[1] After running "make prove" (times are in seconds):

  $ perl -MYAML -e '
      $_ = do { local $/; <> };
      # prove puts this non-YAML cruft at the end
      s/\.\.\.$//s;
      my $t = YAML::Load($_)->{tests};
      print "$_->[1] $_->[0]\n"
          for sort { $b->[1] <=> $a->[1] }
              map { [$_, $t->{$_}->{elapsed}] }
              keys(%$t);
  ' .prove | head
  43.216765165329 t3404-rebase-interactive.sh
  30.6568658351898 t3421-rebase-topology-linear.sh
  27.92564702034 t9001-send-email.sh
  15.5906939506531 t9500-gitweb-standalone-no-errors.sh
  15.4882569313049 t6030-bisect-porcelain.sh
  14.487174987793 t7610-mergetool.sh
  13.8276169300079 t3425-rebase-topology-merges.sh
  12.7450480461121 t3426-rebase-submodule.sh
  12.4915001392365 t3415-rebase-autosquash.sh
  11.8122401237488 t5572-pull-submodule.sh
Re: Drastic jump in the time required for the test suite
Johannes Schindelin writes:

> What I can also say for certain is that the version from yesterday (commit
> 4ef44ce) was the first one in a week that built successfully and hence
> reached the test phase, and it was the first version of `pu` ever to time
> out after running for 3h.

I am sympathetic, but I'd be lying if I said I can feel your pain.

Admittedly I do not run _all_ the tests (I certainly know that I exclude
the ones behind EXPENSIVE prerequisite), but after every rebuilding of
'jch' and 'pu', I run the testsuite (and also rebuild docs) before pushing
them out, and "make test && make doc && make install install-doc" run
sequentially for the four integration branches finishes within 15 minutes,
even when I run them after "make clean".

Perhaps the recent change to run the tests in parallel from slower to
faster under prove may be helping my case.

> Maybe we should start optimizing the tests...

Yup, two things that come to mind are to identify long ones and see if
each of them can be split into two halves that can be run in parallel, and
to go through the tests with a fine-toothed comb and remove the ones that
test exactly the same thing as another test. The latter would be very time
consuming, though.
Drastic jump in the time required for the test suite
Hi Junio,

I know you are a fan of testing things thoroughly in the test suite, but I
have to say that it is getting out of hand, in particular due to our
over-use of shell script idioms (which really only run fast on Linux, not
a good idea for portable software).

My builds of `pu` now time out, after running for 3h straight in the VM
dedicated to performing the daily routine of building and testing the
git.git branches in Git for Windows' SDK. For comparison, `next` passes
build & tests in 2.6h. That is quite the jump.

Unfortunately, I cannot tell you precisely when this massive increase
happened (we are talking about half an hour, after all), because there
were build failures preventing the tests from being run for the last 40
days (and my build job currently retains the results for only 7 days,
anyway, in addition to the last successful/unsuccessful build).

What I can say for certain is that the version from 41 days ago (commit
7837d4a) ran for only 2.6h, just as `next` does today, and passed without
failure. Every single revision of `pu` since then has been broken in one
way or another, mostly already at the build stage, so that the tests would
not even start to run.

What I can also say for certain is that the version from yesterday (commit
4ef44ce) was the first one in a week that built successfully and hence
reached the test phase, and it was the first version of `pu` ever to time
out after running for 3h.

I will increase the time-out, of course, but we are walking into dangerous
territory here: the build & test now requires such an incredibly long time
that the VM will start to take more than 24h to complete all 8 tasks
(building & testing the `pu`, `next`, `master` and `maint` branches, as
well as trying to rebase Git for Windows' patches onto each of them). As
those tasks are scheduled to run once every 24h, I will no longer be able
to notify you and the mailing list in a timely manner, if at all, because
the queue will clog up.

Maybe we should start optimizing the tests to become more useful again, by
forcing them not to take such an insane amount of time?

Ciao,
Dscho