Re: Drastic jump in the time required for the test suite
On Fri, Oct 28, 2016 at 3:41 AM, Eric Wong wrote:
> Johannes Schindelin wrote:
>> I know you are a fan of testing things thoroughly in the test suite, but I
>> have to say that it is getting out of hand, in particular due to our
>> over-use of shell script idioms (which really only run fast on Linux, not
>> a good idea for portable software).
>
> How much effort would it take to optimize a /bin/sh?
>
> Would replacing uses of fork+execve with posix_spawn be fast and
> portable enough?
>
> Even on Linux, performance sucks for me. I've been hoping dash
> can use posix_spawn (or vfork directly) to see if that can
> help things.

Busybox hush uses vfork on no-mmu builds. You could try it out. I don't know how complete it is, though.
-- 
Duy
Re: Drastic jump in the time required for the test suite
Johannes Schindelin wrote:
> I know you are a fan of testing things thoroughly in the test suite, but I
> have to say that it is getting out of hand, in particular due to our
> over-use of shell script idioms (which really only run fast on Linux, not
> a good idea for portable software).

How much effort would it take to optimize a /bin/sh?

Would replacing uses of fork+execve with posix_spawn be fast and
portable enough?

Even on Linux, performance sucks for me. I've been hoping dash
can use posix_spawn (or vfork directly) to see if that can
help things. That won't help with subshells, though...

(I'm back to using a Centrino laptop from 2005)
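The cost under discussion here is per-process creation overhead: every external command in a test script pays a fork+execve (or a full process spawn on Windows), while a shell builtin pays nothing. A minimal sketch of that difference, not taken from the thread itself (the /bin/true loop and the iteration count are arbitrary; wrapping each loop in `time` makes the gap visible on a slow platform):

```shell
#!/bin/sh
# Run the same no-op 200 times as an external program and as a builtin.
# Only the first loop pays fork+execve on every iteration.
n=0
while [ $n -lt 200 ]
do
	/bin/true        # external: one fork+execve per iteration
	n=$((n+1))
done
echo "external iterations: $n"

m=0
while [ $m -lt 200 ]
do
	true             # builtin: no new process at all
	m=$((m+1))
done
echo "builtin iterations: $m"
```

On Linux the builtin loop is typically orders of magnitude faster; on Windows the gap is far larger still, which is the point of replacing fork+execve with posix_spawn (or vfork) where the shell allows it.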
Re: Drastic jump in the time required for the test suite
On 21.10.2016 12:59, Duy Nguyen wrote:
> On Thu, Oct 20, 2016 at 11:40 PM, René Scharfe wrote:
>> I get this on WSL with prove -j8:
>>
>> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)
>>
>> And this for a run on Debian inside a Hyper-V VM on the same system:
>>
>> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)
>>
>> All tests pass on master.
>
> Thank you for doing this. 10 times slower is probably not worth following up
> (though the absolute numbers still look amazing, you have some beefy machine
> there).

Thanks, but it's not too impressive: Xeon E3-1231 v3, Crucial BX100 SSD, 8GB RAM. I wonder how much faster a brand-new CPU with more RAM and a PCIe SSD would be...

René
Re: Drastic jump in the time required for the test suite
On 21.10.2016 15:10, Matthieu Moy wrote:
> René Scharfe writes:
>
>> I get this on WSL with prove -j8:
>>
>> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)
>>
>> And this for a run on Debian inside a Hyper-V VM on the same system:
>>
>> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)
>
> What about the same without WSL on Windows?

Files=755, Tests=14894, 1774 wallclock secs ( 9.31 usr 9.58 sys + 821.87 cusr 1961.23 csys = 2801.99 CPU)

René
Re: Drastic jump in the time required for the test suite
René Scharfe writes:

> I get this on WSL with prove -j8:
>
> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)
>
> And this for a run on Debian inside a Hyper-V VM on the same system:
>
> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)

What about the same without WSL on Windows?
-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
Re: Drastic jump in the time required for the test suite
On Fri, Oct 21, 2016 at 3:40 AM, Dennis Kaarsemaker wrote:
> On Thu, 2016-10-20 at 08:31 -0400, Jeff King wrote:
>
>> I'm also not entirely convinced that the test suite being a shell script
>> is the main culprit for its slowness. We run git a lot of times, and
>> that's inherent in testing it. I ran the whole test suite under
>> "strace -f -e execve". There are ~335K execs. Here's the breakdown of
>> the top ones:
>
> You're measuring execve's, but fork (well, fork emulation; there's no
> actual fork) is also expensive on Windows iirc, so subshells add a lot
> to this cost.

Shells fork for pipes as well, and for redirections and some other constructs, if I remember correctly (I attempted to port busybox ash to Windows and had to find and "fix" all the forks).

> That said, strace -eclone says that a 'make test' forks
> ~408k times, and while this is significantly more than the number of
> execs in your example, this does include cvs and svn tests and it's
> still in the same ballpark.
-- 
Duy
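To make the subshell cost concrete: every `( ... )` group is a forked child (or expensive fork emulation on Windows), which you can observe from the shell itself because the child's state changes never reach the parent, whereas a brace group stays in the current process. A minimal sketch, not taken from the test suite:

```shell
#!/bin/sh
# ( ... ) forks a subshell; assignments made inside die with the child.
x=1
( x=2 )                          # forked subshell: the change is lost
echo "after subshell: x=$x"      # still 1

# { ...; } is a brace group in the current shell: no fork happens.
{ x=3; }
echo "after brace group: x=$x"   # now 3
```

This is also why rewriting `(cd dir && git cmd)` as `git -C dir cmd`, as discussed elsewhere in this thread, saves a fork on top of the exec.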
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 11:40 PM, René Scharfe wrote:
> On 20.10.2016 13:02, Duy Nguyen wrote:
>> On Wed, Oct 19, 2016 at 4:18 PM, Johannes Schindelin wrote:
>>> Hi Junio,
>>>
>>> I know you are a fan of testing things thoroughly in the test suite, but I
>>> have to say that it is getting out of hand, in particular due to our
>>> over-use of shell script idioms (which really only run fast on Linux, not
>>> a good idea for portable software).
>>>
>>> My builds of `pu` now time out, after running for 3h straight in the VM
>>> dedicated to perform the daily routine of building and testing the git.git
>>> branches in Git for Windows' SDK. For comparison, `next` passes build &
>>> tests in 2.6h. That is quite the jump.
>>
>> I'm just curious, will running git.exe from WSL [1] help speed things
>> up a bit (or, hopefully, a lot)? I'm assuming that shell's speed in
>> WSL is quite fast.
>>
>> I'm pretty sure the test suite would need some adaptation, but if the
>> speedup is significant, maybe it's worth spending time on.
>>
>> [1] https://news.ycombinator.com/item?id=12748395
>
> I get this on WSL with prove -j8:
>
> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)
>
> And this for a run on Debian inside a Hyper-V VM on the same system:
>
> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)
>
> All tests pass on master.

Thank you for doing this. 10 times slower is probably not worth following up (though the absolute numbers still look amazing, you have some beefy machine there).
-- 
Duy
Re: Drastic jump in the time required for the test suite
On Fri, Oct 21, 2016 at 07:27:40AM +0200, Johannes Sixt wrote:

> On 20.10.2016 23:38, Jeff King wrote:
>> test_cmp () {
>> 	# optimize for common "they are the same" case
>> 	# without any subshells or subprograms
>
> We do this already on Windows; it's the function mingw_test_cmp in our
> test suite:
> https://github.com/git/git/blob/master/t/test-lib-functions.sh#L884-L945

Ah, thanks, I didn't even think to look for an existing solution like this. Looks like it was done for LF/CRLF reasons, though, not performance. I wonder if anybody has measured whether it improves things there or not.

-Peff
Re: Drastic jump in the time required for the test suite
On 20.10.2016 23:38, Jeff King wrote:
> test_cmp () {
> 	# optimize for common "they are the same" case
> 	# without any subshells or subprograms

We do this already on Windows; it's the function mingw_test_cmp in our test suite:
https://github.com/git/git/blob/master/t/test-lib-functions.sh#L884-L945
-- 
Hannes
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 05:00:13PM -0400, Jeff King wrote:

> > I am not an expert on perl nor tracing, but is it feasible to find out
> > how many internal calls there are? i.e. either some shell script (rebase,
> > submodule) calling git itself a couple of times or even from compile/git/git
> > itself, e.g. some submodule operations use forking in there.
>
> The script below is my attempt, though I think it is not quite right, as
> "make" should be the single apex of the graph. You can run it like:
>
> strace -f -o /tmp/foo.out -e clone,execve make test
> perl graph.pl /tmp/foo.out | less -S

Ah, I see why it is sometimes wrong. We may see parent/child interactions out of order. E.g., we may see:

  1. Process B execs "git".
  2. Process B exits.
  3. Process A forks pid B.

because strace does not manage to ptrace "A" before "B" finishes running. It's hard to tell in step 3 if this is the same "A" we saw earlier. You cannot just go by PID, because the tests run enough processes that we end up reusing PIDs. My script has a hack which increments a counter when processes go away, but that happens in step 2 above (and steps 1 and 3 think they are seeing different processes, even though they are the same).

I'm sure it could be cleverly hacked around, but the easiest thing is probably to just make sure the tests are not run in parallel. High load makes that kind of out-of-order sequence much more likely. Making sure I run it with "make -j1", the script I posted earlier gives pretty reasonable output. I won't post mine here, as it's double-digit megabytes, but you should be able to recreate it yourself.

Of course, the call graph by itself isn't that interesting. But you might be able to do some interesting analysis, or just find spots of interest to you (like git-submodule).

-Peff
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 02:53:36PM -0700, Stefan Beller wrote:

> >> That said I really like the idea of having a helper that would eliminate
> >> the cat for you, e.g.:
> >>
> >> git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
> >> --porcelain=v2 --branch --untracked-files=all <<-EOF
> >> 1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
> >> 1 AM S.M. 00 16 16 $_z40 $HSUP sub1
> >> EOF
> >
> > I think that helper still ends up using "cat" and "diff" under the hood,
>
> I assumed that helper being a C program, that only needs to fork once
> for the actual git call and then it sits idle to compare the exact output
> from stdout to its stdin.

Ah, I see. I thought you meant a shell helper. A C helper does drop 2 forks down to 1, but it would be nice if we could drop it to 0.

-Peff
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 05:38:03PM -0400, Jeff King wrote:

> I think that helper still ends up using "cat" and "diff" under the hood,
> unless you write those bits in pure shell. But at that point, I suspect
> we could do "cat" and "test_cmp" in pure shell, something like:
> [...]
> Those are both completely untested. But maybe they are worth playing
> around with for somebody on Windows to see if they make a dent in the
> test runtime.

If you tried to run them, you probably noticed that the "untested" was really true. One of the functions was missing an "else", and the other forgot to add a "\n" to its printf.

The patch below gets closer, though there are still a handful of test failures. I didn't investigate deeply, but I think at least one is related to the "read/printf" version of cat not being binary-clean.

-Peff

---
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index fdaeb3a96b..de37f3d 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -685,9 +685,48 @@ test_expect_code () {
 # - not all diff versions understand "-u"
 
 test_cmp() {
+	# optimize for common "they are the same" case
+	# without any subshells or subprograms
+	while true; do
+		if ! read -r line1 <&3
+		then
+			if ! read -r line2 <&4
+			then
+				# EOF on both; good
+				return 0
+			else
+				# EOF only on file1; fail
+				break
+			fi
+		fi
+		if ! read -r line2 <&4
+		then
+			# EOF only on file2; fail
+			break
+		fi
+		test "$line1" = "$line2" || break
+	done 3<"$1" 4<"$2"
+
+	# if we get here, the optimized version found some
+	# difference. We can just "return 1", but let's run
+	# the real $GIT_TEST_CMP to provide pretty output.
+	# This should generally only happen on test failures,
+	# so performance isn't a big deal.
 	$GIT_TEST_CMP "$@"
 }
 
+cat () {
+	# optimize common here-doc usage
+	if test $# -eq 0
+	then
+		while read -r line
+		do
+			printf '%s\n' "$line"
+		done
+	fi
+	command cat "$@"
+}
+
 # test_cmp_bin - helper to compare binary files
 test_cmp_bin() {
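One way to see the "not binary-clean" failure mode suspected above, without running the whole suite: `read` reports failure at EOF even when it has already filled its variable, so a read/printf loop silently drops a final line that lacks a trailing newline. A standalone sketch (the temp file name is arbitrary):

```shell
#!/bin/sh
# `read` returns nonzero at EOF, so the loop body never runs for a
# final unterminated line -- its content is left behind in $line.
tmp=readloop-demo.txt
printf 'one\ntwo' >"$tmp"        # note: no newline after "two"

out=
while read -r line
do
	out="$out$line."
done <"$tmp"

echo "copied: $out"              # "one." only; "two" was dropped
rm -f "$tmp"
```

NUL bytes cause similar trouble for `read`, another way a read-based loop can diverge from the real `cat`.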
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 2:38 PM, Jeff King wrote:
> On Thu, Oct 20, 2016 at 12:54:32PM -0700, Stefan Beller wrote:
>
>> Maybe we should stop introducing un-optimized tests.
>> [...]
>> * heavy use of the "git -C <dir>" pattern. When applying that
>> thoroughly we'd save spawning the subshells.
>
> Yeah, I imagine with some style changes we could drop quite a few
> subshells. The problem is that the conversion work is manual and
> tedious. I'd look first for spots where we can eliminate thousands of
> calls with a single change.
>
>> That said I really like the idea of having a helper that would eliminate
>> the cat for you, e.g.:
>>
>> git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
>> --porcelain=v2 --branch --untracked-files=all <<-EOF
>> 1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
>> 1 AM S.M. 00 16 16 $_z40 $HSUP sub1
>> EOF
>
> I think that helper still ends up using "cat" and "diff" under the hood,

I assumed that helper, being a C program, only needs to fork once for the actual git call, and then it sits idle to compare the exact output from stdout to its stdin. If there is a difference it will do the extra work (i.e. put stdin and stdout into a file and call diff on them).
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 12:54:32PM -0700, Stefan Beller wrote:

> Maybe we should stop introducing un-optimized tests.
> [...]
> * heavy use of the "git -C <dir>" pattern. When applying that
> thoroughly we'd save spawning the subshells.

Yeah, I imagine with some style changes we could drop quite a few subshells. The problem is that the conversion work is manual and tedious. I'd look first for spots where we can eliminate thousands of calls with a single change.

> That said I really like the idea of having a helper that would eliminate
> the cat for you, e.g.:
>
> git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
> --porcelain=v2 --branch --untracked-files=all <<-EOF
> 1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
> 1 AM S.M. 00 16 16 $_z40 $HSUP sub1
> EOF

I think that helper still ends up using "cat" and "diff" under the hood, unless you write those bits in pure shell. But at that point, I suspect we could do "cat" and "test_cmp" in pure shell, something like:

cat () {
	# optimize common here-doc usage
	if test $# -eq 0
	then
		while read -r line
		do
			printf '%s' "$line"
		done
	fi
	command cat "$@"
}

test_cmp () {
	# optimize for common "they are the same" case
	# without any subshells or subprograms
	while true; do
		if ! read -r line1 <&3
		then
			if ! read -r line2 <&4
				# EOF on both; good
				return 0
			else
				# EOF only on file1; fail
				break
			fi
		fi
		if ! read -r line2 <&4
		then
			# EOF only on file2; fail
			break
		fi
		test "$line1" = "$line2" || break
	done 3<"$1" 4<"$2"

	# if we get here, the optimized version found some
	# difference. We can just "return 1", but let's run
	# the real $GIT_TEST_CMP to provide pretty output.
	# This should generally only happen on test failures,
	# so performance isn't a big deal.
	"$GIT_TEST_CMP" "$@"
}

Those are both completely untested. But maybe they are worth playing around with for somebody on Windows to see if they make a dent in the test runtime.

-Peff
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 10:38:23PM +0200, Johannes Sixt wrote:

> On 20.10.2016 14:31, Jeff King wrote:
>> Close to 1/3 of those processes are just invoking the bin-wrapper
>> script to set up the EXEC_PATH, etc. I imagine it would not be too hard
>> to just do that in the test script. In fact, it looks like:
>>
>> make prefix=/wherever install
>> GIT_TEST_INSTALLED=/wherever/bin make test
>>
>> might give you an immediate speedup by skipping bin-wrappers entirely.
>
> Running the tests with --with-dashes should give you the same effect, no?

Yeah, looks like it. It still uses bin-wrappers for t/helper, but that should be a minority of calls. Which I think explains why I saw some test failures with the GIT_TEST_INSTALLED above: it does not know about t/helper, but relies on those programs being present in $GIT_BUILD_DIR. So I suspect it has been totally broken since e6e7530d10 (test helpers: move test-* to t/helper/ subdirectory, 2016-04-13).

-Peff
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 09:30:27AM -0700, Stefan Beller wrote:

> On Thu, Oct 20, 2016 at 5:31 AM, Jeff King wrote:
>
>> $ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c | sort -rn | head
>> 152271 /home/peff/compile/git/git
>>  57340 /home/peff/compile/git/t/../bin-wrappers/git
>>  16865 /bin/sed
>>  12650 /bin/rm
>>  11257 /bin/cat
>>   9326 /home/peff/compile/git/git-sh-i18n--envsubst
>>   9079 /usr/bin/diff
>>   8013 /usr/bin/wc
>>   5924 /bin/mv
>>   4566 /bin/grep
>
> I am not an expert on perl nor tracing, but is it feasible to find out
> how many internal calls there are? i.e. either some shell script (rebase,
> submodule) calling git itself a couple of times or even from compile/git/git
> itself, e.g. some submodule operations use forking in there.

The script below is my attempt, though I think it is not quite right, as "make" should be the single apex of the graph. You can run it like:

strace -f -o /tmp/foo.out -e clone,execve make test
perl graph.pl /tmp/foo.out | less -S

One thing that it counts (that was not counted above) is the number of forks for subshells, which is considerable. I don't know how expensive that is versus, say, running "cat" (if your fork() doesn't copy-on-write, and you implement sub-programs via an efficient spawn() call, it's possible that the subshells are significantly more expensive).

-Peff

-- >8 --
#!/usr/bin/perl

my %clone;
my %exec;
my %is_child;
my %counter;

while (<>) {
	# execve("some-prog", ...
	if (/^(\d+)\s+execve\("(.*?)"/) {
		push @{$exec{node($1)}}, $2;
	}
	# clone(...) = <pid>
	# or
	# <... clone resumed> ...) = <pid>
	elsif (/^(\d+)\s+.*clone.*\) = (\d+)$/) {
		push @{$clone{node($1)}}, node($2);
		$is_child{node($2)} = 1;
	}
	# +++ exited with <status> +++
	# We have to keep track of this because pids get recycled,
	# and so are not unique node names in our graph.
	elsif (/^(\d+)\s+.*exited with/) {
		$counter{$1}++;
	}
}

show($_, 0) for grep { !$is_child{$_} } keys(%clone);

sub show {
	my ($pid, $indent) = @_;
	my @progs = @{$exec{$pid}};
	if (!@progs) {
		@progs = ("(fork)");
	}
	print ' ' x $indent;
	print "$pid: ", shift @progs;
	print " => $_" for @progs;
	print "\n";
	show($_, $indent + 2) for @{$clone{$pid}};
}

sub node {
	my $pid = shift;
	my $c = $counter{$pid} || "0";
	return "$pid-$c";
}
Re: Drastic jump in the time required for the test suite
On Thu, 2016-10-20 at 08:31 -0400, Jeff King wrote:

> I'm also not entirely convinced that the test suite being a shell script
> is the main culprit for its slowness. We run git a lot of times, and
> that's inherent in testing it. I ran the whole test suite under
> "strace -f -e execve". There are ~335K execs. Here's the breakdown of
> the top ones:

You're measuring execve's, but fork (well, fork emulation; there's no actual fork) is also expensive on Windows iirc, so subshells add a lot to this cost. That said, strace -eclone says that a 'make test' forks ~408k times, and while this is significantly more than the number of execs in your example, this does include cvs and svn tests and it's still in the same ballpark.

D.
Re: Drastic jump in the time required for the test suite
On 20.10.2016 14:31, Jeff King wrote:
> Close to 1/3 of those processes are just invoking the bin-wrapper
> script to set up the EXEC_PATH, etc. I imagine it would not be too hard
> to just do that in the test script. In fact, it looks like:
>
> make prefix=/wherever install
> GIT_TEST_INSTALLED=/wherever/bin make test
>
> might give you an immediate speedup by skipping bin-wrappers entirely.

Running the tests with --with-dashes should give you the same effect, no?
-- 
Hannes
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 3:50 AM, Johannes Schindelin wrote:
> Hi Peff,
>
> On Wed, 19 Oct 2016, Jeff King wrote:
>
>> On Wed, Oct 19, 2016 at 10:32:12AM -0700, Junio C Hamano wrote:
>>
>>> Maybe we should start optimizing the tests...

Maybe we should stop introducing un-optimized tests.

For other reasons I just stumbled upon t7064 (porcelain V2 output for git status). This is how an arbitrary test looks:

test_expect_success 'staged changes in added submodule (AM S.M.)' '
	(cd super_repo &&
		## stage the changes in the submodule.
		(cd sub1 &&
			git add file_in_sub
		) &&
		HMOD=$(git hash-object -t blob -- .gitmodules) &&
		HSUP=$(git rev-parse HEAD) &&
		HSUB=$HSUP &&
		cat >expect <<-EOF &&
		# branch.oid $HSUP
		# branch.head master
		# branch.upstream origin/master
		# branch.ab +0 -0
		1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
		1 AM S.M. 00 16 16 $_z40 $HSUB sub1
		EOF
		git status --porcelain=v2 --branch --untracked-files=all >actual &&
		test_cmp expect actual
	)
'

Following "modern" Git tests I would have expected:

* heavy use of the "git -C <dir>" pattern. When applying that
  thoroughly we'd save spawning the subshells.
* no `cd` on the same line as the opening paren. (This is style and
  would derail the performance discussion.)

test_expect_success 'staged changes in added submodule (AM S.M.)' '
	git -C super_repo/sub1 add file_in_sub &&
	HMOD=$(git -C super_repo hash-object -t blob -- .gitmodules) &&
	HSUP=$(git -C super_repo rev-parse HEAD) &&
	# as a comment: HSUB is equal to HSUP, because ...
	cat >expect <<-EOF &&
	# branch.oid $HSUP
	# branch.head master
	# branch.upstream origin/master
	# branch.ab +0 -0
	1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
	1 AM S.M. 00 16 16 $_z40 $HSUP sub1
	EOF
	git -C super_repo status --porcelain=v2 --branch --untracked-files=all >actual &&
	test_cmp expect actual
'

That said I really like the idea of having a helper that would eliminate the cat for you, e.g.:

git_test_helper_equal_stdin_or_diff_and_die -C super_repo status --porcelain=v2 --branch --untracked-files=all <<-EOF
1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
1 AM S.M. 00 16 16 $_z40 $HSUP sub1
EOF

Thanks,
Stefan
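The proposal above is for a C helper, but its intended contract can be sketched in shell to pin the semantics down: run a command, capture its stdout, and diff it against the expected text supplied on stdin. The function and file names below are made up for illustration, and a shell version of course still pays the forks the C helper exists to avoid:

```shell
#!/bin/sh
# Hypothetical sketch of the helper's contract: compare a command's
# stdout against the expected output fed to us as a here-doc.
check_stdout_matches_stdin () {
	"$@" >cs_actual &&     # run the command, capture stdout
	cat >cs_expect &&      # the here-doc arrives on our stdin
	diff -u cs_expect cs_actual
}

check_stdout_matches_stdin printf '%s\n' hello <<EOF
hello
EOF
rc=$?
echo "matched: rc=$rc"
rm -f cs_actual cs_expect
```

A C implementation would compare the two streams in memory and only fall back to writing files and spawning diff when they differ, reducing the per-assertion fork count to the single git invocation.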
Re: Drastic jump in the time required for the test suite
Junio C Hamano writes:

> Are you proposing to replace the tests written as shell scripts with
> scripts in another language or framework that run equivalent
> sequences of git commands that is as portable as, if not more, Bourne
> shell?

The language (/bin/sh) is probably not the biggest issue. The way we use it may be. I don't have a benchmark to tell what slows down the testsuite, but for example we often write

cat >expected
Re: Drastic jump in the time required for the test suite
On 20.10.2016 13:02, Duy Nguyen wrote:
> On Wed, Oct 19, 2016 at 4:18 PM, Johannes Schindelin wrote:
>> Hi Junio,
>>
>> I know you are a fan of testing things thoroughly in the test suite, but I
>> have to say that it is getting out of hand, in particular due to our
>> over-use of shell script idioms (which really only run fast on Linux, not
>> a good idea for portable software).
>>
>> My builds of `pu` now time out, after running for 3h straight in the VM
>> dedicated to perform the daily routine of building and testing the git.git
>> branches in Git for Windows' SDK. For comparison, `next` passes build &
>> tests in 2.6h. That is quite the jump.
>
> I'm just curious, will running git.exe from WSL [1] help speed things
> up a bit (or, hopefully, a lot)? I'm assuming that shell's speed in
> WSL is quite fast.
>
> I'm pretty sure the test suite would need some adaptation, but if the
> speedup is significant, maybe it's worth spending time on.
>
> [1] https://news.ycombinator.com/item?id=12748395

I get this on WSL with prove -j8:

Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 3731.85 csys = 4040.15 CPU)

And this for a run on Debian inside a Hyper-V VM on the same system:

Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr 1.06 sys + 39.70 cusr 25.82 csys = 71.39 CPU)

All tests pass on master.

René
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 5:31 AM, Jeff King wrote:
>
> $ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c | sort -rn | head
> 152271 /home/peff/compile/git/git
>  57340 /home/peff/compile/git/t/../bin-wrappers/git
>  16865 /bin/sed
>  12650 /bin/rm
>  11257 /bin/cat
>   9326 /home/peff/compile/git/git-sh-i18n--envsubst
>   9079 /usr/bin/diff
>   8013 /usr/bin/wc
>   5924 /bin/mv
>   4566 /bin/grep

I am not an expert on perl nor tracing, but is it feasible to find out how many internal calls there are? I.e. either some shell script (rebase, submodule) calling git itself a couple of times, or even compile/git/git itself; e.g. some submodule operations use forking in there.
Re: Drastic jump in the time required for the test suite
Johannes Schindelin writes:

> Of course, if you continue to resist (because the problem is obviously not
> affecting you personally, so why would you care), I won't even try to find
> the time to start on that project.

Sorry, but I did not know I was resisting, as I didn't see any proposal to resist against in the first place. I was trying to help by mentioning two tricks that may be helping my test runtime that may help you as well.

Are you proposing to replace the tests written as shell scripts with scripts in another language or framework that run equivalent sequences of git commands that is as portable as, if not more, Bourne shell? If that is what you are proposing, well, I won't stop you and I may even help you in there, but I fail to guess what alternative you have in mind. I certainly do not have a suggestion myself, and I won't suggest migrating to tclsh or perl for that matter.

If that is not what you are trying to propose, and if parallelism has already been employed, then there may or may not be other tricks you are not yet using that help to speed up your shell execution that others are using; being confrontational is not an effective way to ask others about them.
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 12:17:33PM +0200, Johannes Schindelin wrote:

> If you want to know just how harmful this reliance on shell scripting is
> to our goal of keeping Git portable: already moving from Linux to MacOSX
> costs you roughly 3x as long to run the build & test (~12mins vs ~36mins
> for GCC, according to https://travis-ci.org/git/git/builds/159125647).

Wait, shell scripts are slow on MacOS now? Perhaps, but it seems more likely that one or more of the following is true:

  - setup of the OS X VM takes longer (it does; if you click through to
    the test results, you'll see that the "make test" step goes from
    647s on Linux to 1108s on MacOS. That's much worse, but not even
    twice as slow, let alone 3x).

  - Travis Linux and OSX VMs do not have identical hardware. Looking at
    https://docs.travis-ci.com/user/ci-environment/, it appears that
    Linux containers get twice as many cores.

  - Git performance on Linux may be better than MacOS. The test suite
    is very filesystem-heavy because it creates and destroys a lot of
    files and repositories. If the kernel vfs performance is worse,
    it's likely to show up in the test suite (especially if the issue
    is latency and you aren't doing it massively in parallel). I don't
    have a real way to measure that, but it seems like a plausible
    factor.

So that sucks that the MacOS Travis build takes a half hour to run. But I don't think that shell scripting is the culprit.

> So the only thing that would really count as an improvement would be to
> change the test suite in such a manner that it relies more on helpers in
> t/helper/ and less on heavy-duty shell scripting.
>
> Of course, if you continue to resist (because the problem is obviously not
> affecting you personally, so why would you care), I won't even try to find
> the time to start on that project.

I'm not sure what you mean by "resist". The test suite has been a set of shell scripts for over a decade. As far as I know there is not currently a viable alternative. If you have patches that make it faster without negatively impacting the ease of writing tests, I'd be happy to see them. If you have more t/helper programs that can eliminate expensive bits of the shell scripts and speed up the test run, great. If you have some other proposal entirely, I'd love to hear it. But I do not see that there is any proposal to "resist" at this point.

I'm also not entirely convinced that the test suite being a shell script is the main culprit for its slowness. We run git a lot of times, and that's inherent in testing it. I ran the whole test suite under "strace -f -e execve". There are ~335K execs. Here's the breakdown of the top ones:

$ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c | sort -rn | head
 152271 /home/peff/compile/git/git
  57340 /home/peff/compile/git/t/../bin-wrappers/git
  16865 /bin/sed
  12650 /bin/rm
  11257 /bin/cat
   9326 /home/peff/compile/git/git-sh-i18n--envsubst
   9079 /usr/bin/diff
   8013 /usr/bin/wc
   5924 /bin/mv
   4566 /bin/grep

Almost half are running git itself. Let's assume that can't be changed. That leaves ~180K of shell-related overhead (versus the optimal case, that the entire test suite becomes one monolithic program ;) ).

Close to 1/3 of those processes are just invoking the bin-wrapper script to set up the EXEC_PATH, etc. I imagine it would not be too hard to just do that in the test script. In fact, it looks like:

make prefix=/wherever install
GIT_TEST_INSTALLED=/wherever/bin make test

might give you an immediate speedup by skipping bin-wrappers entirely.

The rest of it is harder. I think you'd have to move the test suite to a language like perl that can do more of that as builtins (I'm sure you'd enjoy the portability implications of _that_). It would almost be easier to build a variant of the shell that has sed, rm, cat, and a few others compiled in.

-Peff

PS I haven't kept up with all of this POSIX-layer stuff that's been announced in Windows the past few months. Is it a viable path forward that would have better performance (obviously not in the short term, but where we may arrive in a few years)?
Re: Drastic jump in the time required for the test suite
On Thu, Oct 20, 2016 at 12:50:32PM +0200, Johannes Schindelin wrote: > That reflects my findings, too. I want to add that I found preciously > little difference between running slow-to-fast and running in numeric > order, so I gave up on optimizing on that front. Interesting. It makes a 10-15% difference here. I also point "--root" at a ram disk. The tests are very I/O heavy and sometimes fsync; even on a system with an SSD, this saves another ~10%. I know that's small potatoes compared to the Windows vs Linux times, but it might be worth exploring. > Further, I found that the Subversion tests (which run at the end) are so > close in their running time that running the tests in parallel with -j5 > does not result in any noticeable improvement when reordered. I normally don't run the Subversion tests at all. Installing cvs, cvsps, subversion, and libsvn-perl nearly doubles the runtime of the test suite for me (I imagine adding p4 to the mix would bump it further). While it's certainly possible to break them with a change in core git, it doesn't seem like a good tradeoff if I'm not touching them often. As the GfW maintainer, you probably should be running them, at least before a release. But cutting them might be a good way to speed up your day-to-day runs. I also use -j16 on a quad-core (+hyperthreads) machine, which I arrived at experimentally. At least on Linux, it's definitely worth having more threads than processors, to keep the processors busy. > I guess I will have to bite into the sour apple and try to profile, say, > t3404 somehow, including all the shell scripting stuff, to identify where > exactly all that time is lost. My guess is that it boils down to > gazillions of calls to programs like expr.exe or merely subshells. I'm not so sure it isn't gazillions of calls to git. It is testing rebase, after all, which is itself a shell script. 
GIT_TRACE_PERFORMANCE gives sort of a crude measure; it reports only
builtins (so it will underestimate the total time spent in git), but it
also doesn't make clear which programs call which, so some times are
double-counted (if a builtin shells out to another builtin). But:

  $ export GIT_TRACE_PERFORMANCE=/tmp/foo.out
  $ rm /tmp/foo.out
  $ time ./t3404-rebase-interactive.sh
  real    0m29.755s
  user    0m1.444s
  sys     0m2.268s

  $ perl -lne '
      /performance: ([0-9.]+)/ and $total += $1;
      END { print $total }
  ' /tmp/foo.out
  32.851352624

Clearly that's not 100% accurate, as it claims we spent longer in git than
the script actually took to run. Given the caveats above, I'm not even
sure if it is in the right ballpark. But there are 11,000 git builtins run
as part of that script. Even at 2ms each, that's still most of the time
going to git.

And obviously the fix involves converting git-rebase, which you're already
working on. But it's not clear to me that the test infrastructure or shell
scripts are the primary cause of the slowness in this particular case.

-Peff
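For what it's worth, the same tally can be done with awk alone. A rough
sketch follows; the two trace lines are fabricated stand-ins that only
mimic the general shape of GIT_TRACE_PERFORMANCE output, not real data
from any test run.

```shell
# Fabricated sample trace, shaped like GIT_TRACE_PERFORMANCE lines.
cat >/tmp/sample-trace <<'EOF'
12:00:00.000000 performance: 0.002000 s: git command: 'git' 'rev-parse'
12:00:00.010000 performance: 0.001500 s: git command: 'git' 'status'
EOF

# Sum the field that follows each "performance:" token.
awk '/performance: / { for (i = 1; i <= NF; i++)
         if ($i == "performance:") total += $(i + 1) }
     END { printf "%.6f\n", total }' /tmp/sample-trace
```

On the sample above this prints 0.003500, i.e. the total seconds the trace
attributes to git processes.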
Re: Drastic jump in the time required for the test suite
On Wed, Oct 19, 2016 at 4:18 PM, Johannes Schindelin wrote:
> Hi Junio,
>
> I know you are a fan of testing things thoroughly in the test suite, but I
> have to say that it is getting out of hand, in particular due to our
> over-use of shell script idioms (which really only run fast on Linux, not
> a good idea for a portable software).
>
> My builds of `pu` now time out, after running for 3h straight in the VM
> dedicated to perform the daily routine of building and testing the git.git
> branches in Git for Windows' SDK. For comparison, `next` passes build &
> tests in 2.6h. That is quite the jump.

I'm just curious, will running git.exe from WSL [1] help speed things up a
bit (or, hopefully, a lot)? I'm assuming that shell's speed in WSL is
quite fast. I'm pretty sure the test suite would need some adaptation, but
if the speedup is significant, maybe it's worth spending time on.

[1] https://news.ycombinator.com/item?id=12748395
--
Duy
Re: Drastic jump in the time required for the test suite
Hi Peff,

On Wed, 19 Oct 2016, Jeff King wrote:

> On Wed, Oct 19, 2016 at 10:32:12AM -0700, Junio C Hamano wrote:
>
> > > Maybe we should start optimizing the tests...
> >
> > Yup, two things that come to mind are to identify long ones and see if
> > each of them can be split into two halves that can be run in parallel,
> > and to go through the tests with fine toothed comb and remove the ones
> > that test exactly the same thing as another test. The latter would be
> > very time consuming, though.
>
> FWIW, I have made attempts at "split long ones into two" before, and
> didn't come up with much. There _are_ some tests that are much longer
> than others[1], but they are not longer than the whole suite takes to
> run. So running in slow-to-fast order means they start first, are run in
> parallel with the other tests, and the CPUs stay relatively full through
> the whole run.

That reflects my findings, too. I want to add that I found precious
little difference between running slow-to-fast and running in numeric
order, so I gave up on optimizing on that front.

> 43.216765165329 t3404-rebase-interactive.sh
> 30.6568658351898 t3421-rebase-topology-linear.sh
> 27.92564702034 t9001-send-email.sh
> 15.5906939506531 t9500-gitweb-standalone-no-errors.sh
> 15.4882569313049 t6030-bisect-porcelain.sh
> 14.487174987793 t7610-mergetool.sh
> 13.8276169300079 t3425-rebase-topology-merges.sh
> 12.7450480461121 t3426-rebase-submodule.sh
> 12.4915001392365 t3415-rebase-autosquash.sh
> 11.8122401237488 t5572-pull-submodule.sh

That looks very similar to what I found: t3404 on top, followed by t3421.

Further, I found that the Subversion tests (which run at the end) are so
close in their running time that running the tests in parallel with -j5
does not result in any noticeable improvement when reordered.

I guess I will have to bite into the sour apple and try to profile, say,
t3404 somehow, including all the shell scripting stuff, to identify where
exactly all that time is lost. My guess is that it boils down to
gazillions of calls to programs like expr.exe or merely subshells.

Ciao,
Dscho
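The overhead behind that guess can be seen in miniature: every `expr`
invocation is a full fork+exec of an external program (especially costly
on platforms where process creation is slow, as on Windows), while POSIX
shell arithmetic stays in-process. A minimal sketch, not taken from the
actual test suite:

```shell
# One fork+exec per invocation: the shell must spawn an external
# expr process just to perform a single addition.
i=0
i=$(expr "$i" + 1)

# The POSIX arithmetic expansion built into the shell does the
# same work without creating any child process at all.
j=0
j=$((j + 1))

echo "$i $j"
```

Both forms print the same result; the difference is purely the number of
processes spawned, which is exactly what dominates on a slow-fork system.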
Re: Drastic jump in the time required for the test suite
Hi Junio,

On Wed, 19 Oct 2016, Junio C Hamano wrote:

> Johannes Schindelin writes:
>
> > What I can also say for certain is that the version from yesterday (commit
> > 4ef44ce) was the first one in a week that built successfully and hence
> > reached the test phase, and it was the first version of `pu` ever to time
> > out after running for 3h.
>
> I am sympathetic, but I'd be lying if I said I can feel your pain.

Obviously.

> Admittedly I do not run _all_ the tests (I certainly know that I
> exclude the ones behind EXPENSIVE prerequisite), but after every
> rebuilding of 'jch' and 'pu', I run the testsuite (and also rebuild
> docs) before pushing them out, and "make test && make doc && make
> install install-doc" run sequentially for the four integration
> branches finishes within 15 minutes, even when I run them after
> "make clean".

You have the luxury of a system that runs shell scripts fast. Or maybe I
should put it differently: you set up the test suite in such a way that it
runs fast on your system, exploiting the fact that shell scripts run fast
there.

If you want to know just how harmful this reliance on shell scripting is
to our goal of keeping Git portable: already moving from Linux to MacOSX
costs you roughly 3x as long to run the build & test (~12mins vs ~36mins
for GCC, according to https://travis-ci.org/git/git/builds/159125647).

Again, I have to repeat myself: this is not good.

> Perhaps the recent change to run the tests in parallel from slower
> to faster under prove may be helping my case.

No, you misunderstand. The tests are *already* run in parallel. And
running them from slower to faster would only make a difference if the
last tests were not requiring roughly the same time to run.

Also, I cannot use prove(1) because it proved to be unreliable in my setup
(it hangs from time to time, for no good reason whatsoever, which would
only make my situation worse, of course).

> > Maybe we should start optimizing the tests...
>
> Yup, two things that come to mind are to identify long ones and see
> if each of them can be split into two halves that can be run in
> parallel, and to go through the tests with fine toothed comb and
> remove the ones that test exactly the same thing as another test.
> The latter would be very time consuming, though.

Again, you misunderstand. The problem is not whether or not to run the
tests in parallel. The problem is that our tests take an insanely long
time to run. That is a big problem; it is no good: tests are only useful
if they are cheap enough to run all the time.

So the only thing that would really count as an improvement would be to
change the test suite in such a manner that it relies more on helpers in
t/helper/ and less on heavy-duty shell scripting.

Of course, if you continue to resist (because the problem is obviously not
affecting you personally, so why would you care), I won't even try to find
the time to start on that project.

Ciao,
Dscho

P.S.: After increasing the time-out to 240 minutes, the test suite still
times out. I investigated a little bit and it appears that
t6030-bisect-porcelain.sh now hangs in the `git bisect next` call of its
"2 - bisect starts with only one bad" test case. This is specific to
Windows; I cannot reproduce the problem on Linux, and I will keep you
posted on my progress.
Re: Drastic jump in the time required for the test suite
On Wed, Oct 19, 2016 at 10:32:12AM -0700, Junio C Hamano wrote:

> > Maybe we should start optimizing the tests...
>
> Yup, two things that come to mind are to identify long ones and see
> if each of them can be split into two halves that can be run in
> parallel, and to go through the tests with fine toothed comb and
> remove the ones that test exactly the same thing as another test.
> The latter would be very time consuming, though.

FWIW, I have made attempts at "split long ones into two" before, and
didn't come up with much. There _are_ some tests that are much longer than
others[1], but they are not longer than the whole suite takes to run. So
running in slow-to-fast order means they start first, are run in parallel
with the other tests, and the CPUs stay relatively full through the whole
run.

Of course YMMV; the long tests on Windows may be different, or
proportionally much longer (I note the worst cases almost all involve
rebase, which as a shell script is presumably worse on Windows than
elsewhere). And of course any reasoning about slow-to-fast order does not
apply if you are not using a tool to do that for you. :)

-Peff

[1] After running "make prove" (times are in seconds):

  $ perl -MYAML -e '
      $_ = do { local $/; <> };
      # prove puts this non-YAML cruft at the end
      s/\.\.\.$//s;
      my $t = YAML::Load($_)->{tests};
      print "$_->[1] $_->[0]\n"
          for sort { $b->[1] <=> $a->[1] }
              map { [$_, $t->{$_}->{elapsed}] }
              keys(%$t);
  ' .prove | head
  43.216765165329 t3404-rebase-interactive.sh
  30.6568658351898 t3421-rebase-topology-linear.sh
  27.92564702034 t9001-send-email.sh
  15.5906939506531 t9500-gitweb-standalone-no-errors.sh
  15.4882569313049 t6030-bisect-porcelain.sh
  14.487174987793 t7610-mergetool.sh
  13.8276169300079 t3425-rebase-topology-merges.sh
  12.7450480461121 t3426-rebase-submodule.sh
  12.4915001392365 t3415-rebase-autosquash.sh
  11.8122401237488 t5572-pull-submodule.sh
Re: Drastic jump in the time required for the test suite
Johannes Schindelin writes:

> What I can also say for certain is that the version from yesterday (commit
> 4ef44ce) was the first one in a week that built successfully and hence
> reached the test phase, and it was the first version of `pu` ever to time
> out after running for 3h.

I am sympathetic, but I'd be lying if I said I can feel your pain.

Admittedly I do not run _all_ the tests (I certainly know that I exclude
the ones behind EXPENSIVE prerequisite), but after every rebuilding of
'jch' and 'pu', I run the testsuite (and also rebuild docs) before pushing
them out, and "make test && make doc && make install install-doc" run
sequentially for the four integration branches finishes within 15 minutes,
even when I run them after "make clean".

Perhaps the recent change to run the tests in parallel from slower to
faster under prove may be helping my case.

> Maybe we should start optimizing the tests...

Yup, two things that come to mind are to identify long ones and see if
each of them can be split into two halves that can be run in parallel, and
to go through the tests with a fine-toothed comb and remove the ones that
test exactly the same thing as another test. The latter would be very time
consuming, though.
Drastic jump in the time required for the test suite
Hi Junio,

I know you are a fan of testing things thoroughly in the test suite, but I
have to say that it is getting out of hand, in particular due to our
over-use of shell script idioms (which really only run fast on Linux, not
a good idea for portable software).

My builds of `pu` now time out, after running for 3h straight in the VM
dedicated to performing the daily routine of building and testing the
git.git branches in Git for Windows' SDK. For comparison, `next` passes
build & tests in 2.6h. That is quite the jump.

Unfortunately, I cannot tell you precisely when this massive increase
happened (we are talking about half an hour, after all), because there
were build failures preventing the tests from being run for the last 40
days (and my build job currently retains the results for only 7 days,
anyway, in addition to the last successful/unsuccessful build).

What I can say for certain is that the version from 41 days ago (commit
7837d4a) ran for only 2.6h, just as `next` does today, and passed without
failure. Every single revision of `pu` since then has been broken in one
way or another, mostly already at the build stage, so that the tests would
not even start to run.

What I can also say for certain is that the version from yesterday (commit
4ef44ce) was the first one in a week that built successfully and hence
reached the test phase, and it was the first version of `pu` ever to time
out after running for 3h.

I will increase the time-out, of course, but we are walking into dangerous
territory here: the build & test now requires such an incredibly long time
that the VM will start to take more than 24h to complete all 8 tasks
(building & testing the `pu`, `next`, `master` and `maint` branches, as
well as trying to rebase Git for Windows' patches onto each of them). As
those tasks are scheduled to run once every 24h, I will no longer be able
to notify you and the mailing list in a timely manner, if at all, because
the queue will clog up.

Maybe we should start optimizing the tests to become more useful again, by
forcing them not to take such an insane amount of time?

Ciao,
Dscho