Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-20 Thread Johannes Schindelin
Hi Stolee,

On Mon, 19 Nov 2018, Derrick Stolee wrote:

> On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote:
> > On Mon, Nov 19 2018, Derrick Stolee wrote:
> >
> > > [...]
> > > builtin/rebase.c
> > > 62c23938fa 55) return env;
> > > [...]
> > > Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
> > > where rebase.useBuiltin is off
> > This one would be covered with
> > GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
> > the rest of the coverage would look different when passed through the
> > various GIT_TEST_* options.
> >
> 
> Thanks for pointing out this GIT_TEST_* variable to me. I had been running
> builds with some of them enabled, but didn't know about this one.
> 
> Unfortunately, t3406-rebase-message.sh fails with
> GIT_TEST_REBASE_USE_BUILTIN=false and it bisects to 4520c2337: Merge branch
> 'ab/rebase-in-c-escape-hatch'.
> 
> The issue is that the commit 04519d72 "rebase: validate -C and
> --whitespace= parameters early" introduced the following test that cares
> about error messages:
> 
> +test_expect_success 'error out early upon -C or --whitespace=' '
> +   test_must_fail git rebase -Cnot-a-number HEAD 2>err &&
> +   test_i18ngrep "numerical value" err &&
> +   test_must_fail git rebase --whitespace=bad HEAD 2>err &&
> +   test_i18ngrep "Invalid whitespace option" err
> +'
> 
> The merge commit then was the first place where this test could run with that
> variable.
> 
> What's the correct fix here? Force the builtin rebase in this test? Unify the
> error message in the non-builtin case?

I am obviously biased, but my take is that the correct fix is in
https://public-inbox.org/git/pull.86.git.gitgitgad...@gmail.com

Ciao,
Dscho

Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-20 Thread Jeff King
On Tue, Nov 20, 2018 at 07:17:46AM -0500, Derrick Stolee wrote:

> > And I think that's a pattern with the delta-island code. What we really
> > care about most is that if we throw a real fork-network repository at
> > it, it produces faster clones with fewer un-reusable deltas. So I think
> > a much more interesting approach here would be perf tests. But:
> > 
> >- we'd want to count those as coverage, and that likely makes your
> >  coverage tests prohibitively expensive
> > 
> >- it requires a specialized repo to demonstrate, which most people
> >  aren't going to have handy
> 
> Do you have regularly-running tests that check this in your infrastructure?
> As long as someone would notice if this code starts failing, that would be
> enough.

We do integration tests, and this feature gets exercised quite a bit in
production at GitHub. But none of that caught the breakage I fixed this
morning for the simple fact that we don't keep up with upstream master
in real-time. We're running the delta-island code I wrote years ago, and
the bug was introduced by the upstreaming. Probably in a month or three
I'll merge v2.20 into our fork, and we would have noticed then.

I've had dreams of following upstream more closely, but there are two
blockers:

  - we have enough custom bits that the merges are time-consuming (and
introduce the possibility of funny interactions or regressions). I'd
like to reduce our overall diff from upstream, but it's slow going
and time is finite. (And ironically, it's made harder in some ways
by the lag, because many of our topics are backports of things I
send upstream).

  - deploying the ".0" release from master can be scary. For instance, I
was really happy to have a fix for 9ac3f0e5b3 (pack-objects: fix
performance issues on packing large deltas, 2018-07-22) before
putting the bug into production.

> > Yeah, this triggers if your regex has more than one capture group (and
> > likewise, we almost certainly don't run the "you have too many groups"
> > warning).
> 
> Did you know that regexes are notoriously under-tested [1]? When looking at
> this code, I didn't even know regexes were involved (but I didn't look
> enough at the context).
> 
> [1] https://dl.acm.org/citation.cfm?id=3236072

Heh. Well, fortunately we are compiling regexes from the user's config,
so it's not Git's fault if the regexes are bad. ;)

(Of course on our production servers I'm on the hook for both sides!)

-Peff


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-20 Thread Derrick Stolee

On 11/20/2018 6:34 AM, Jeff King wrote:

On Mon, Nov 19, 2018 at 10:40:53AM -0500, Derrick Stolee wrote:


Since depth is never incremented, we are not covering this block. Is it
possible to test?
This should be covered by the fix in:

   https://public-inbox.org/git/20181120095053.gc22...@sigill.intra.peff.net/

because now entries at the top-level are depth "1". The "depth++" line
is still never executed in our test suite. I'm not sure how much that
matters.


Thanks! I'll go take a look at your patch.


delta-islands.c
c8d521faf7  53) memcpy(b, old, size);

This memcpy happens when the 'old' island_bitmap is passed to
island_bitmap_new(), but...


c8d521faf7 187) b->refcount--;
c8d521faf7 188) b = kh_value(island_marks, pos) = island_bitmap_new(b);

This block has the only non-NULL caller to island_bitmap_new().

This is another case where it triggers a lot for a reasonably-sized
repo, but it's hard to construct a small case. This code implements a
copy-on-write of the bitmap, which means the same objects have to be
accessible from two different paths through the reachability graph, each
with different island marks. And then a test would I guess make sure
that the correct subsets of objects never become deltas, which gets
complicated.

And I think that's a pattern with the delta-island code. What we really
care about most is that if we throw a real fork-network repository at
it, it produces faster clones with fewer un-reusable deltas. So I think
a much more interesting approach here would be perf tests. But:

   - we'd want to count those as coverage, and that likely makes your
 coverage tests prohibitively expensive

   - it requires a specialized repo to demonstrate, which most people
 aren't going to have handy


Do you have regularly-running tests that check this in your 
infrastructure? As long as someone would notice if this code starts 
failing, that would be enough.



c8d521faf7 212) obj = ((struct tag *)obj)->tagged;
c8d521faf7 213) if (obj) {
c8d521faf7 214) parse_object(the_repository, >oid);
c8d521faf7 215) marks = create_or_get_island_marks(obj);
c8d521faf7 216) island_bitmap_set(marks, island_counter);

It appears that this block would happen if we placed a tag in the delta
island.

Yep. Again, exercised by real repositories.

I'm not sure how far we want to go in the blind pursuit of coverage.
Certainly we could add a tag to the repo in t5320, and this code would
get executed. But verifying that it's doing the right thing is much
harder (and is more easily done with a perf test).


Blind coverage goals are definitely not worth the effort. My goal here 
was to re-check all of the new code (since last release) that is not 
covered, because it's easier to hide a bug there.





c8d521faf7 397) strbuf_addch(_name, '-');

This block is inside the following patch:
[...]

Yeah, this triggers if your regex has more than one capture group (and
likewise, we almost certainly don't run the "you have too many groups"
warning).


Did you know that regexes are notoriously under-tested [1]? When looking 
at this code, I didn't even know regexes were involved (but I didn't 
look enough at the context).


[1] https://dl.acm.org/citation.cfm?id=3236072




c8d521faf7 433) continue;
c8d521faf7 436) list[dst] = list[src];

These blocks are inside the following nested loop in deduplicate_islands():

+   for (ref = 0; ref + 1 < island_count; ref++) {
+   for (src = ref + 1, dst = src; src < island_count; src++) {
+   if (list[ref]->hash == list[src]->hash)
+   continue;
+
+   if (src != dst)
+   list[dst] = list[src];
+
+   dst++;
+   }
+   island_count = dst;
+   }

This means that our "deduplication" logic is never actually doing anything
meaningful.

Sorry, I don't even remember what this code is trying to do. The island
code is 5+ years old, and just recently ported to upstream Git by
Christian. And that's perhaps part of my laziness in the above tests; it
would be a significant effort to re-figure out all these corner cases.
It's a big part of why I hadn't been sending the patches upstream
myself.


Sure. Hopefully pointing out these blocks gives us more motivation to 
manually inspect them and avoid silly bugs.


Thanks,
-Stolee


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-20 Thread Jeff King
On Mon, Nov 19, 2018 at 10:40:53AM -0500, Derrick Stolee wrote:

> > 28b8a73080 builtin/pack-objects.c 2793) depth++;
> > 108f530385 builtin/pack-objects.c 2797) oe_set_tree_depth(_pack, ent,
> > depth);
> 
> This 'depth' variable is incremented as part of a for loop in this patch:
> 
>  static void show_object(struct object *obj, const char *name, void *data)
> @@ -2686,6 +2706,19 @@ static void show_object(struct object *obj, const
> char *name, void *data)
>     add_preferred_base_object(name);
>     add_object_entry(>oid, obj->type, name, 0);
>     obj->flags |= OBJECT_ADDED;
> +
> +   if (use_delta_islands) {
> +   const char *p;
> +   unsigned depth = 0;
> +   struct object_entry *ent;
> +
> +   for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
> +   depth++;
> +
> +   ent = packlist_find(_pack, obj->oid.hash, NULL);
> +   if (ent && depth > ent->tree_depth)
> +   ent->tree_depth = depth;
> +   }
>  }
> 
> And that 'ent->tree_depth = depth;' line is replaced with the
> oe_set_tree_depth() call in the report.
> 
> Since depth is never incremented, we are not covering this block. Is it
> possible to test?

This should be covered by the fix in:

  https://public-inbox.org/git/20181120095053.gc22...@sigill.intra.peff.net/

because now entries at the top-level are depth "1". The "depth++" line
is still never executed in our test suite. I'm not sure how much that
matters.

> > delta-islands.c
> > c8d521faf7  53) memcpy(b, old, size);
> 
> This memcpy happens when the 'old' island_bitmap is passed to
> island_bitmap_new(), but...
> 
> > c8d521faf7 187) b->refcount--;
> > c8d521faf7 188) b = kh_value(island_marks, pos) = island_bitmap_new(b);
> 
> This block has the only non-NULL caller to island_bitmap_new().

This is another case where it triggers a lot for a reasonably-sized
repo, but it's hard to construct a small case. This code implements a
copy-on-write of the bitmap, which means the same objects have to be
accessible from two different paths through the reachability graph, each
with different island marks. And then a test would I guess make sure
that the correct subsets of objects never become deltas, which gets
complicated.

And I think that's a pattern with the delta-island code. What we really
care about most is that if we throw a real fork-network repository at
it, it produces faster clones with fewer un-reusable deltas. So I think
a much more interesting approach here would be perf tests. But:

  - we'd want to count those as coverage, and that likely makes your
coverage tests prohibitively expensive

  - it requires a specialized repo to demonstrate, which most people
aren't going to have handy

> > c8d521faf7 212) obj = ((struct tag *)obj)->tagged;
> > c8d521faf7 213) if (obj) {
> > c8d521faf7 214) parse_object(the_repository, >oid);
> > c8d521faf7 215) marks = create_or_get_island_marks(obj);
> > c8d521faf7 216) island_bitmap_set(marks, island_counter);
> 
> It appears that this block would happen if we placed a tag in the delta
> island.

Yep. Again, exercised by real repositories.

I'm not sure how far we want to go in the blind pursuit of coverage.
Certainly we could add a tag to the repo in t5320, and this code would
get executed. But verifying that it's doing the right thing is much
harder (and is more easily done with a perf test).

> > c8d521faf7 397) strbuf_addch(_name, '-');
> 
> This block is inside the following patch:
> [...]

Yeah, this triggers if your regex has more than one capture group (and
likewise, we almost certainly don't run the "you have too many groups"
warning).

> > c8d521faf7 433) continue;
> > c8d521faf7 436) list[dst] = list[src];
> 
> These blocks are inside the following nested loop in deduplicate_islands():
> 
> +   for (ref = 0; ref + 1 < island_count; ref++) {
> +   for (src = ref + 1, dst = src; src < island_count; src++) {
> +   if (list[ref]->hash == list[src]->hash)
> +   continue;
> +
> +   if (src != dst)
> +   list[dst] = list[src];
> +
> +   dst++;
> +   }
> +   island_count = dst;
> +   }
> 
> This means that our "deduplication" logic is never actually doing anything
> meaningful.

Sorry, I don't even remember what this code is trying to do. The island
code is 5+ years old, and just recently ported to upstream Git by
Christian. And that's perhaps part of my laziness in the above tests; it
would be a significant effort to re-figure out all these corner cases.
It's a big part of why I hadn't been sending the patches upstream
myself.

-Peff


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Derrick Stolee wrote:

> On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Nov 19 2018, Derrick Stolee wrote:
>>
>>> [...]
>>> builtin/rebase.c
>>> 62c23938fa 55) return env;
>>> [...]
>>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
>>> where rebase.useBuiltin is off
>> This one would be covered with
>> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
>> the rest of the coverage would look different when passed through the 
>> various GIT_TEST_* options.
>>
>
> Thanks for pointing out this GIT_TEST_* variable to me. I had been
> running builds with some of them enabled, but didn't know about this
> one.
>
> Unfortunately, t3406-rebase-message.sh fails with
> GIT_TEST_REBASE_USE_BUILTIN=false and it bisects to 4520c2337: Merge
> branch 'ab/rebase-in-c-escape-hatch'.
>
> The issue is that the commit 04519d72 "rebase: validate -C and
> --whitespace= parameters early" introduced the following test
> that cares about error messages:
>
> +test_expect_success 'error out early upon -C or --whitespace=' '
> + test_must_fail git rebase -Cnot-a-number HEAD 2>err &&
> + test_i18ngrep "numerical value" err &&
> + test_must_fail git rebase --whitespace=bad HEAD 2>err &&
> + test_i18ngrep "Invalid whitespace option" err
> +'
>
> The merge commit then was the first place where this test could run
> with that variable.

Yup. Sorry should have mentioned that, it's broken in master. Reported
it earlier today:
https://public-inbox.org/git/874lcd1bub@evledraar.gmail.com/

> What's the correct fix here? Force the builtin rebase in this test?
> Unify the error message in the non-builtin case?

Probably to just force the builtin, unless Johannes wants to also fix
the bug for the shellscript version. I don't know if for 2.20 we're
trying to maintain 100% compatibility.


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee

On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote:

On Mon, Nov 19 2018, Derrick Stolee wrote:


[...]
builtin/rebase.c
62c23938fa 55) return env;
[...]
Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
where rebase.useBuiltin is off

This one would be covered with
GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
the rest of the coverage would look different when passed through the various 
GIT_TEST_* options.



Thanks for pointing out this GIT_TEST_* variable to me. I had been 
running builds with some of them enabled, but didn't know about this one.


Unfortunately, t3406-rebase-message.sh fails with 
GIT_TEST_REBASE_USE_BUILTIN=false and it bisects to 4520c2337: Merge 
branch 'ab/rebase-in-c-escape-hatch'.


The issue is that the commit 04519d72 "rebase: validate -C and 
--whitespace= parameters early" introduced the following test that 
cares about error messages:


+test_expect_success 'error out early upon -C or --whitespace=' '
+   test_must_fail git rebase -Cnot-a-number HEAD 2>err &&
+   test_i18ngrep "numerical value" err &&
+   test_must_fail git rebase --whitespace=bad HEAD 2>err &&
+   test_i18ngrep "Invalid whitespace option" err
+'

The merge commit then was the first place where this test could run with 
that variable.


What's the correct fix here? Force the builtin rebase in this test? 
Unify the error message in the non-builtin case?


Thanks,
-Stolee


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee

On 11/19/2018 2:00 PM, Ben Peart wrote:



On 11/19/2018 10:40 AM, Derrick Stolee wrote:


There are a lot of lines introduced by the IEOT extension in these 
commits:


 > Ben Peart  3255089ad: ieot: add Index Entry Offset Table 
(IEOT) extension
 > Ben Peart  3b1d9e045: eoie: add End of Index Entry (EOIE) 
extension
 > Ben Peart  77ff1127a: read-cache: load cache entries on worker 
threads
 > Ben Peart  abb4bb838: read-cache: load cache extensions on a 
worker thread
 > Ben Peart  c780b9cfe: config: add new index.threads config 
setting
 > Ben Peart  d1664e73a: add: speed up cmd_add() by utilizing 
read_cache_preload()
 > Ben Peart  fa655d841: checkout: optimize "git checkout -b 
"




These should be hit if you run the test suite with 
GIT_TEST_INDEX_THREADS=2.  Without that, the indexes for the various 
tests are too small to trigger multi-threaded index reads/writes.


From t/README:

GIT_TEST_INDEX_THREADS= enables exercising the multi-threaded loading
of the index for the whole test suite by bypassing the default number of
cache entries and thread minimums. Setting this to 1 will make the
index loading single threaded.


I updated my build to add GIT_TEST_INDEX_THREADS=2 and I still see lots 
of uncovered stuff, including that load_cache_entries_threaded() is 
never run.


I added the following diff to my repo and ran the test suite manually 
with GIT_TEST_INDEX_THREADS=2 and it didn't fail:


diff --git a/read-cache.c b/read-cache.c
index 4ca81286c0..36502586a2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2057,6 +2057,9 @@ static unsigned long 
load_cache_entries_threaded(struct index_state *istate, con

    struct load_cache_entries_thread_data *data;
    unsigned long consumed = 0;

+   fprintf(stderr, "load_cache_entries_threaded\n");
+   exit(1);
+
    /* a little sanity checking */
    if (istate->name_hash_initialized)
    BUG("the name hash isn't thread safe");

Am I missing something? Is there another variable I should add?

When I look for where the GIT_TEST_INDEX_THREADS variable is checked, I 
see that the calls to git_config_get_index_threads() are followed by a 
check for NO_PTHREADS (setting the number of threads to 1 again). Is it 
possible that my compiler environment is not allowing me to even compile 
with threads?


Thanks,
-Stolee





Re: [PATCH] tests: add a special setup where rebase.useBuiltin is off (Re: Git Test Coverage Report (v2.20.0-rc0))

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Jonathan Nieder wrote:

> Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Nov 19 2018, Derrick Stolee wrote:
>
>>> [...]
>>> builtin/rebase.c
>>> 62c23938fa 55) return env;
>>> [...]
>>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
>>> where rebase.useBuiltin is off
>>
>> This one would be covered with
>> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
>> the rest of the coverage would look different when passed through the 
>> various GIT_TEST_* options.
>
> I wonder if we can do better for this kind of thing.
>
> When I do routine development, I am not running tests with any
> non-default flags.  So why should tests run with non-default flags
> count toward coverage?  Is there a way to make the default test
> settings dip their feet into some non-default configurations, without
> running the full battery of tests and slowing tests down accordingly?
> E.g. is there some kind of smoke test that rebase with
> useBuiltin=false works at all that could run, even if I am not running
> the full battery of rebase tests?

Yeah, definitely. Just pointing out that it would smoke out coverage we
don't have at all v.s. cases where we just don't have coverage with the
default tests without any special modes.

Derrick: I think it would be useful to produce some delta report showing
covered lines without any GIT_TEST_* variables v.s. when they're set. As
Jonathan points out those should ideally be tested with the normal test
suite, leaving GIT_TEST_* just for stress testing to find new bugs.

> That's a bit of a non sequitor for this example, which is actual code
> to handle GIT_TEST_REBASE_USE_BUILTIN, though.  For it, I wonder why
> we need rebase.c to understand the envvar --- couldn't test-lib.sh
> take care of setting rebase.useBuiltin to false when it's set?

I guess the test-lib.sh could pass down things like
"GIT_CONFIG_PARAMETERS='x.y=z'". Maybe that's a better way to do it.


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee

On 11/19/2018 2:39 PM, Ævar Arnfjörð Bjarmason wrote:

On Mon, Nov 19 2018, Derrick Stolee wrote:


The coverage report has been using the following:

export GIT_TEST_MULTI_PACK_INDEX=1
export GIT_TEST_COMMIT_GRAPH=1
export GIT_TEST_INDEX_VERION=4
export GIT_TEST_SPLIT_INDEX=yes
export GIT_TEST_OE_SIZE=10
export GIT_TEST_OE_DELTA_SIZE=5

I need to add GIT_TEST_INDEX_THREADS=2 and GIT_TEST_REBASE_USE_BUILTIN=false

...although note you'll need to also test without
GIT_TEST_REBASE_USE_BUILTIN=false, otherwise a lot of the new C code
won't have coverage.


Sorry for lack of clarity: I first run 'make coverage-test' with no 
GIT_TEST_* variables, then run the test suite again with the optional 
variables.


Thanks,
-Stolee


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Derrick Stolee wrote:

> On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Nov 19 2018, Derrick Stolee wrote:
>>
>>> Here is a test coverage report for the uncovered lines introduced in
>>> v2.20.0-rc0 compared to v2.19.1.
>> Thanks a lot for this.
>>
>>> [...]
>>> builtin/rebase.c
>>> 62c23938fa 55) return env;
>>> [...]
>>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
>>> where rebase.useBuiltin is off
>> This one would be covered with
>> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
>> the rest of the coverage would look different when passed through the 
>> various GIT_TEST_* options.
>
> The coverage report has been using the following:
>
> export GIT_TEST_MULTI_PACK_INDEX=1
> export GIT_TEST_COMMIT_GRAPH=1
> export GIT_TEST_INDEX_VERION=4
> export GIT_TEST_SPLIT_INDEX=yes
> export GIT_TEST_OE_SIZE=10
> export GIT_TEST_OE_DELTA_SIZE=5
>
> I need to add GIT_TEST_INDEX_THREADS=2 and GIT_TEST_REBASE_USE_BUILTIN=false

...although note you'll need to also test without
GIT_TEST_REBASE_USE_BUILTIN=false, otherwise a lot of the new C code
won't have coverage.


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee

On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote:

On Mon, Nov 19 2018, Derrick Stolee wrote:


Here is a test coverage report for the uncovered lines introduced in
v2.20.0-rc0 compared to v2.19.1.

Thanks a lot for this.


[...]
builtin/rebase.c
62c23938fa 55) return env;
[...]
Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
where rebase.useBuiltin is off

This one would be covered with
GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
the rest of the coverage would look different when passed through the various 
GIT_TEST_* options.


The coverage report has been using the following:

export GIT_TEST_MULTI_PACK_INDEX=1
export GIT_TEST_COMMIT_GRAPH=1
export GIT_TEST_INDEX_VERION=4
export GIT_TEST_SPLIT_INDEX=yes
export GIT_TEST_OE_SIZE=10
export GIT_TEST_OE_DELTA_SIZE=5

I need to add GIT_TEST_INDEX_THREADS=2 and GIT_TEST_REBASE_USE_BUILTIN=false

Thanks!
-Stolee


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Ben Peart




On 11/19/2018 10:40 AM, Derrick Stolee wrote:
The test coverage reports started mid-way through this release cycle, so 
I thought it would be good to do a full review of the new uncovered code 
since the last release.


I eliminated most of the uncovered code due to the following cases:

1. Code was only moved or refactored.
2. Code was related to unusual error conditions (e.g. open_pack_index() 
fails)


The comments below are intended only to point out potential directions 
to improve test coverage. Some of it is for me to do!


Thanks,
-Stolee

On 11/18/2018 9:54 PM, Derrick Stolee wrote:

There are a lot of lines introduced by the IEOT extension in these commits:

 > Ben Peart  3255089ad: ieot: add Index Entry Offset Table (IEOT) 
extension

 > Ben Peart  3b1d9e045: eoie: add End of Index Entry (EOIE) extension
 > Ben Peart  77ff1127a: read-cache: load cache entries on worker 
threads
 > Ben Peart  abb4bb838: read-cache: load cache extensions on a 
worker thread

 > Ben Peart  c780b9cfe: config: add new index.threads config setting
 > Ben Peart  d1664e73a: add: speed up cmd_add() by utilizing 
read_cache_preload()
 > Ben Peart  fa655d841: checkout: optimize "git checkout -b 
"




These should be hit if you run the test suite with 
GIT_TEST_INDEX_THREADS=2.  Without that, the indexes for the various 
tests are too small to trigger multi-threaded index reads/writes.


From t/README:

GIT_TEST_INDEX_THREADS= enables exercising the multi-threaded loading
of the index for the whole test suite by bypassing the default number of
cache entries and thread minimums. Setting this to 1 will make the
index loading single threaded.




Re: [PATCH] tests: add a special setup where rebase.useBuiltin is off (Re: Git Test Coverage Report (v2.20.0-rc0))

2018-11-19 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:
> On Mon, Nov 19 2018, Derrick Stolee wrote:

>> [...]
>> builtin/rebase.c
>> 62c23938fa 55) return env;
>> [...]
>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
>> where rebase.useBuiltin is off
>
> This one would be covered with
> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
> the rest of the coverage would look different when passed through the various 
> GIT_TEST_* options.

I wonder if we can do better for this kind of thing.

When I do routine development, I am not running tests with any
non-default flags.  So why should tests run with non-default flags
count toward coverage?  Is there a way to make the default test
settings dip their feet into some non-default configurations, without
running the full battery of tests and slowing tests down accordingly?
E.g. is there some kind of smoke test that rebase with
useBuiltin=false works at all that could run, even if I am not running
the full battery of rebase tests?

That's a bit of a non sequitor for this example, which is actual code
to handle GIT_TEST_REBASE_USE_BUILTIN, though.  For it, I wonder why
we need rebase.c to understand the envvar --- couldn't test-lib.sh
take care of setting rebase.useBuiltin to false when it's set?

Thanks,
Jonathan


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Jeff King
On Mon, Nov 19, 2018 at 11:21:38AM -0500, Jeff King wrote:

> > +   if (use_delta_islands) {
> > +   const char *p;
> > +   unsigned depth = 0;
> > +   struct object_entry *ent;
> > +
> > +   for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
> > +   depth++;
> > +
> > +   ent = packlist_find(_pack, obj->oid.hash, NULL);
> > +   if (ent && depth > ent->tree_depth)
> > +   ent->tree_depth = depth;
> > +   }
> >  }
> > 
> > And that 'ent->tree_depth = depth;' line is replaced with the
> > oe_set_tree_depth() call in the report.
> > 
> > Since depth is never incremented, we are not covering this block. Is it
> > possible to test?
> 
> Looks like t5320 only has single-level trees. We probably just need to
> add a deeper tree. A more interesting case is when an object really is
> found at multiple depths, but constructing a case that cared about that
> would be quite complicated.
> 
> That said, there is much bigger problem with this code, which is that
> 108f530385 (pack-objects: move tree_depth into 'struct packing_data',
> 2018-08-16) is totally broken. It works on the trivial repository in the
> test, but try this (especially under valgrind or ASan) on a real
> repository:
> 
>   git repack -adi
> 
> which will crash immediately.  It doesn't correctly maintain the
> invariant that if tree_depth is not NULL, it is the same size as
> the main object array.
> 
> I'll see if I can come up with a fix.

Just a quick update to prevent anyone else looking at it: I have a fix
for this (and another related issue with that commit).

There's an edge case in that depth computation that I think is
unhandled, as well. I _think_ it doesn't trigger very often, but I'm
running some experiments to verify it. That's S-L-O-W, so I probably
won't have results until tomorrow.

-Peff


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Derrick Stolee wrote:

> Here is a test coverage report for the uncovered lines introduced in
> v2.20.0-rc0 compared to v2.19.1.

Thanks a lot for this.

> [...]
> builtin/rebase.c
> 62c23938fa 55) return env;
> [...]
> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
> where rebase.useBuiltin is off

This one would be covered with
GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
the rest of the coverage would look different when passed through the various 
GIT_TEST_* options.

That would make it take at least 12x as long if you did them one at a
time. Probably just 2-3x as long in practice since most can be combined
in some way, and somewhat computationally infeasible if you're going to
try all possible combinations.

> [...]
> fsck.c
> fb8952077d 214) die_errno("Could not read '%s'", path);
>
> René Scharfe fb8952077: fsck: use strbuf_getline() to read
> skiplist file

The fault of a patch I submitted. Couldn't find a sane & portable way to
emulate cases where ferror() would trigger.


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Jeff King
On Mon, Nov 19, 2018 at 10:40:53AM -0500, Derrick Stolee wrote:

> > 2fa233a554 builtin/pack-objects.c 1512) hashcpy(base_oid.hash,
> > base_sha1);
> > 2fa233a554 builtin/pack-objects.c 1513) if
> > (!in_same_island(>idx.oid, _oid))
> > 2fa233a554 builtin/pack-objects.c 1514) return 0;
> 
> These lines are inside a block for the following if statement:
> 
> +   /*
> +    * Otherwise, reachability bitmaps may tell us if the receiver has
> it,
> +    * even if it was buried too deep in history to make it into the
> +    * packing list.
> +    */
> +   if (thin && bitmap_has_sha1_in_uninteresting(bitmap_git, base_sha1))
> {
> 
> Peff: is this difficult to test?

A bit.

The two features (thin-packs and delta-islands) would not generally be
used together (one is for serving fetches, the other is for optimizing
on-disk packs to serve fetches). Something like:

 echo HEAD^ | git pack-objects --revs --thin --delta-islands

would probably exercise it (assuming there's a delta in HEAD^ against
something in HEAD), but you'd need to construct a very specific scenario
if you wanted to do any kind of checks no the output.

> > 28b8a73080 builtin/pack-objects.c 2793) depth++;
> > 108f530385 builtin/pack-objects.c 2797) oe_set_tree_depth(_pack, ent,
> > depth);
> 
> This 'depth' variable is incremented as part of a for loop in this patch:
> 
>  static void show_object(struct object *obj, const char *name, void *data)
> @@ -2686,6 +2706,19 @@ static void show_object(struct object *obj, const
> char *name, void *data)
>     add_preferred_base_object(name);
>     add_object_entry(>oid, obj->type, name, 0);
>     obj->flags |= OBJECT_ADDED;
> +
> +   if (use_delta_islands) {
> +   const char *p;
> +   unsigned depth = 0;
> +   struct object_entry *ent;
> +
> +   for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
> +   depth++;
> +
> +   ent = packlist_find(_pack, obj->oid.hash, NULL);
> +   if (ent && depth > ent->tree_depth)
> +   ent->tree_depth = depth;
> +   }
>  }
> 
> And that 'ent->tree_depth = depth;' line is replaced with the
> oe_set_tree_depth() call in the report.
> 
> Since depth is never incremented, we are not covering this block. Is it
> possible to test?

Looks like t5320 only has single-level trees. We probably just need to
add a deeper tree. A more interesting case is when an object really is
found at multiple depths, but constructing a case that cared about that
would be quite complicated.

That said, there is much bigger problem with this code, which is that
108f530385 (pack-objects: move tree_depth into 'struct packing_data',
2018-08-16) is totally broken. It works on the trivial repository in the
test, but try this (especially under valgrind or ASan) on a real
repository:

  git repack -adi

which will crash immediately.  It doesn't correctly maintain the
invariant that if tree_depth is not NULL, it is the same size as
the main object array.

I'll see if I can come up with a fix.

-Peff


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee
The test coverage reports started mid-way through this release cycle, so 
I thought it would be good to do a full review of the new uncovered code 
since the last release.


I eliminated most of the uncovered code due to the following cases:

1. Code was only moved or refactored.
2. Code was related to unusual error conditions (e.g. open_pack_index() 
fails)


The comments below are intended only to point out potential directions 
to improve test coverage. Some of it is for me to do!


Thanks,
-Stolee

On 11/18/2018 9:54 PM, Derrick Stolee wrote:

66ec0390e7 builtin/fsck.c 888) midx_argv[2] = "--object-dir";
66ec0390e7 builtin/fsck.c 889) midx_argv[3] = alt->path;
66ec0390e7 builtin/fsck.c 890) if (run_command(_verify))
66ec0390e7 builtin/fsck.c 891) errors_found |= ERROR_COMMIT_GRAPH;



There are two things wrong here:

1. Not properly covering multi-pack-index fsck with alternates.
2. the ERROR_COMMIT_GRAPH flag when the multi-pack-index is being checked.

I'll submit a patch to fix this.

2fa233a554 builtin/pack-objects.c 1512) hashcpy(base_oid.hash, 
base_sha1);
2fa233a554 builtin/pack-objects.c 1513) if 
(!in_same_island(>idx.oid, _oid))

2fa233a554 builtin/pack-objects.c 1514) return 0;


These lines are inside a block for the following if statement:

+   /*
+    * Otherwise, reachability bitmaps may tell us if the receiver 
has it,

+    * even if it was buried too deep in history to make it into the
+    * packing list.
+    */
+   if (thin && bitmap_has_sha1_in_uninteresting(bitmap_git, 
base_sha1)) {


Peff: is this difficult to test?


28b8a73080 builtin/pack-objects.c 2793) depth++;
108f530385 builtin/pack-objects.c 2797) oe_set_tree_depth(_pack, 
ent, depth); 


This 'depth' variable is incremented as part of a for loop in this patch:

 static void show_object(struct object *obj, const char *name, void *data)
@@ -2686,6 +2706,19 @@ static void show_object(struct object *obj, const 
char *name, void *data)

    add_preferred_base_object(name);
    add_object_entry(>oid, obj->type, name, 0);
    obj->flags |= OBJECT_ADDED;
+
+   if (use_delta_islands) {
+   const char *p;
+   unsigned depth = 0;
+   struct object_entry *ent;
+
+   for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
+   depth++;
+
+   ent = packlist_find(_pack, obj->oid.hash, NULL);
+   if (ent && depth > ent->tree_depth)
+   ent->tree_depth = depth;
+   }
 }

And that 'ent->tree_depth = depth;' line is replaced with the 
oe_set_tree_depth() call in the report.


Since depth is never incremented, we are not covering this block. Is it 
possible to test?



builtin/repack.c
16d75fa48d  48) use_delta_islands = git_config_bool(var, value);
16d75fa48d  49) return 0;


This is a simple config option check for "repack.useDeltaIslands". The 
logic it enables is tested, so this is an OK gap, in my opinion.


> builtin/submodule--helper.c

ee69b2a90c 1476) out->type = sub->update_strategy.type;
ee69b2a90c 1477) out->command = sub->update_strategy.command;


This block was introduced by this part of the patch:

+   } else if (sub->update_strategy.type != SM_UPDATE_UNSPECIFIED) {
+   trace_printf("loaded thing");
+   out->type = sub->update_strategy.type;
+   out->command = sub->update_strategy.command;

Which seems to be an important case, as the other SM_UPDATE_* types seem 
like interesting cases.


Stefan: what actions would trigger this block? Is it easy to test?


delta-islands.c
c8d521faf7  53) memcpy(b, old, size);


This memcpy happens when the 'old' island_bitmap is passed to 
island_bitmap_new(), but...



c8d521faf7 187) b->refcount--;
c8d521faf7 188) b = kh_value(island_marks, pos) = island_bitmap_new(b);


This block has the only non-NULL caller to island_bitmap_new().


c8d521faf7 212) obj = ((struct tag *)obj)->tagged;
c8d521faf7 213) if (obj) {
c8d521faf7 214) parse_object(the_repository, >oid);
c8d521faf7 215) marks = create_or_get_island_marks(obj);
c8d521faf7 216) island_bitmap_set(marks, island_counter);


It appears that this block would happen if we placed a tag in the delta 
island.



c8d521faf7 397) strbuf_addch(_name, '-');


This block is inside the following patch:

+   if (matches[ARRAY_SIZE(matches) - 1].rm_so != -1)
+   warning(_("island regex from config has "
+ "too many capture groups (max=%d)"),
+   (int)ARRAY_SIZE(matches) - 2);
+
+   for (m = 1; m < ARRAY_SIZE(matches); m++) {
+   regmatch_t *match = [m];
+
+   if (match->rm_so == -1)
+   continue;
+
+   if (island_name.len)
+   strbuf_addch(_name, '-');
+
+   strbuf_add(_name, refname + match->rm_so, 
match->rm_eo - match->rm_so);

+   }

This likely means that ARRAY_SIZE(matches) is never more than 

Git Test Coverage Report (v2.20.0-rc0)

2018-11-18 Thread Derrick Stolee
Here is a test coverage report for the uncovered lines introduced in 
v2.20.0-rc0 compared to v2.19.1.


Thanks,

-Stolee

[1] https://dev.azure.com/git/git/_build/results?buildId=263=logs

---


apply.c
eccb5a5f3d 4071) return get_oid_hex(p->old_oid_prefix, oid);
517fe807d6 4776) BUG_ON_OPT_NEG(unset);
735ca208c5 4830) return -1;

blame.c
a470beea39  113)  !strcmp(r->index->cache[-1 - pos]->name, path))
a470beea39  272) int pos = index_name_pos(r->index, path, len);
a470beea39  274) mode = r->index->cache[pos]->ce_mode;

builtin/add.c
d1664e73ad builtin/add.c 458) die(_("index file corrupt"));

builtin/am.c
2abf350385 1362) repo_init_revisions(the_repository, _info, NULL);
fce5664805 2117) *opt_value = PATCH_FORMAT_UNKNOWN;

builtin/blame.c
517fe807d6 builtin/blame.c    759) BUG_ON_OPT_NEG(unset);

builtin/cat-file.c
98f425b453 builtin/cat-file.c  56) die("unable to stream %s to stdout", 
oid_to_hex(oid));
0eb8d3767c builtin/cat-file.c 609) return error(_("only one batch option 
may be specified"));


builtin/checkout.c
fa655d8411 builtin/checkout.c  539) return 0;
fa655d8411 builtin/checkout.c  953) return error(_("index file corrupt"));

builtin/difftool.c
4a7e27e957 441) if (oideq(, ))

builtin/fast-export.c
4a7e27e957 builtin/fast-export.c  387) if (oideq(>oid, >oid) &&

builtin/fetch.c
builtin/fsck.c
b29759d89a builtin/fsck.c 613) fprintf(stderr, "Checking %s link\n", 
head_ref_name);

b29759d89a builtin/fsck.c 618) return error("Invalid %s", head_ref_name);
454ea2e4d7 builtin/fsck.c 769) for (p = get_all_packs(the_repository); p;
66ec0390e7 builtin/fsck.c 888) midx_argv[2] = "--object-dir";
66ec0390e7 builtin/fsck.c 889) midx_argv[3] = alt->path;
66ec0390e7 builtin/fsck.c 890) if (run_command(_verify))
66ec0390e7 builtin/fsck.c 891) errors_found |= ERROR_COMMIT_GRAPH;

builtin/gc.c
3029970275 builtin/gc.c 461) ret = error_errno(_("cannot stat '%s'"), 
gc_log_path);
3029970275 builtin/gc.c 470) ret = error_errno(_("cannot read '%s'"), 
gc_log_path);

fec2ed2187 builtin/gc.c 495) die(FAILED_RUN, pack_refs_cmd.argv[0]);
fec2ed2187 builtin/gc.c 498) die(FAILED_RUN, reflog.argv[0]);
3029970275 builtin/gc.c 585) exit(128);
fec2ed2187 builtin/gc.c 637) die(FAILED_RUN, repack.argv[0]);
fec2ed2187 builtin/gc.c 647) die(FAILED_RUN, prune.argv[0]);
fec2ed2187 builtin/gc.c 654) die(FAILED_RUN, prune_worktrees.argv[0]);
fec2ed2187 builtin/gc.c 658) die(FAILED_RUN, rerere.argv[0]);

builtin/grep.c
76e9bdc437 builtin/grep.c  424) grep_read_unlock();
fd6263fb73 builtin/grep.c 1051) warning(_("invalid option combination, 
ignoring --threads"));
fd6263fb73 builtin/grep.c 1057) die(_("invalid number of threads 
specified (%d)"), num_threads);


builtin/help.c
e6e76baaf4 builtin/help.c 429) if (!exclude_guides || alias[0] == '!') {
e6e76baaf4 builtin/help.c 430) printf_ln(_("'%s' is aliased to '%s'"), 
cmd, alias);

e6e76baaf4 builtin/help.c 431) free(alias);
e6e76baaf4 builtin/help.c 432) exit(0);
e6e76baaf4 builtin/help.c 441) fprintf_ln(stderr, _("'%s' is aliased to 
'%s'"), cmd, alias);

e6e76baaf4 builtin/help.c 442) count = split_cmdline(alias, );
e6e76baaf4 builtin/help.c 443) if (count < 0)
e6e76baaf4 builtin/help.c 444) die(_("bad alias.%s string: %s"), cmd,
e6e76baaf4 builtin/help.c 446) free(argv);
e6e76baaf4 builtin/help.c 448) return alias;

builtin/log.c
517fe807d6 builtin/log.c 1196) BUG_ON_OPT_NEG(unset);
2e6fd71a52 builtin/log.c 1472) die(_("failed to infer range-diff ranges"));
ee6cbf712e builtin/log.c 1818) die(_("--interdiff requires 
--cover-letter or single patch"));

8631bf1cdd builtin/log.c 1828) else if (!rdiff_prev)
8631bf1cdd builtin/log.c 1829) die(_("--creation-factor requires 
--range-diff"));
40ce41604d builtin/log.c 1833) die(_("--range-diff requires 
--cover-letter or single patch"));


builtin/multi-pack-index.c
6d68e6a461 35) usage_with_options(builtin_multi_pack_index_usage,
6d68e6a461 39) die(_("too many arguments"));
6d68e6a461 48) die(_("unrecognized verb: %s"), argv[0]);

builtin/pack-objects.c
6a22d52126 builtin/pack-objects.c 1091) continue;
2fa233a554 builtin/pack-objects.c 1512) hashcpy(base_oid.hash, base_sha1);
2fa233a554 builtin/pack-objects.c 1513) if 
(!in_same_island(>idx.oid, _oid))

2fa233a554 builtin/pack-objects.c 1514) return 0;
28b8a73080 builtin/pack-objects.c 2793) depth++;
108f530385 builtin/pack-objects.c 2797) oe_set_tree_depth(_pack, ent, 
depth);

454ea2e4d7 builtin/pack-objects.c 2981) p = get_all_packs(the_repository);

builtin/pack-redundant.c
454ea2e4d7 builtin/pack-redundant.c 580) struct packed_git *p = 
get_all_packs(the_repository);
454ea2e4d7 builtin/pack-redundant.c 595) struct packed_git *p = 
get_all_packs(the_repository);


builtin/pull.c
01a31f3bca 565) die(_("unable to access commit %s"),

builtin/rebase--interactive.c
53bbcfbde7 builtin/rebase--interactive2.c  24) return error(_("no HEAD?"));
53bbcfbde7 builtin/rebase--interactive2.c  51) return 
error_errno(_("could not create temporary %s"), path_state_dir());
53bbcfbde7