UPDATE for V2: We now know the full repro, and a test is added. Thanks Szeder and Peff for your insights!
While dogfooding, Johannes found a bug in the fetch.writeCommitGraph config behavior. While his example initially happened during a clone with --recurse-submodules, (UPDATE) and the submodule is important, but --recurse-submodules is not: $ git clone <url> test $ cd test $ git -c fetch.writeCommitGraph=true fetch origin Computing commit graph generation numbers: 100% (12/12), done. BUG: commit-graph.c:886: missing parent <hash1> for commit <hash2> Aborted (core dumped) In the repo I had cloned, there were really 60 commits to scan, but only 12 were in the list to write when calling compute_generation_numbers(). A commit in the list expects to see a parent, but that parent is not in the list. The simple example I used for my testing was https://github.com/derrickstolee/numbers. Thie repo HAS A SUBMODULE, I just forgot. Sorry for derailing the investigation somewhat. The details above are the start of the commit message for [PATCH 1/2], including a test that fails when fetching after cloning a repo with a submodule. In [PATCH 2/2], I actually have the fix. I tried to include as much detail as I could for how I investigated the problem and why I think this is the right solution. I added details that have come from the on-list discussion, including what the submodule code is doing and why REACHABLE is no longer used in commit-reach.c. Thanks, -Stolee Derrick Stolee (2): t5510-fetch.sh: demonstrate fetch.writeCommitGraph bug commit-graph: fix writing first commit-graph during fetch commit-graph.c | 11 +++++++---- commit-reach.c | 1 - object.h | 3 ++- t/t5510-fetch.sh | 17 +++++++++++++++++ 4 files changed, 26 insertions(+), 6 deletions(-) base-commit: d966095db01190a2196e31195ea6fa0c722aa732 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-415%2Fderrickstolee%2Ffetch-first-write-fail-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-415/derrickstolee/fetch-first-write-fail-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/415 Range-diff vs v1: -: ---------- > 1: 6ac0a05746 t5510-fetch.sh: demonstrate fetch.writeCommitGraph bug 1: a1e5280d4b ! 2: ca59b118f1 commit-graph: fix writing first commit-graph during fetch @@ -2,10 +2,13 @@ commit-graph: fix writing first commit-graph during fetch - While dogfooding, Johannes found a bug in the fetch.writeCommitGraph - config behavior. While his example initially happened during a clone - with --recurse-submodules, we found that it is actually a problem with - the first fetch after a new clone and no existing commit-graph file: + The previous commit includes a failing test for an issue around + fetch.writeCommitGraph and fetching in a repo with a submodule. Here, we + fix that bug and set the test to "test_expect_success". + + The prolem arises with this set of commands when the remote repo at + <url> has a submodule. Note that --recurse-submodules is not needed to + demonstrate the bug. $ git clone <url> test $ cd test @@ -14,12 +17,7 @@ BUG: commit-graph.c:886: missing parent <hash1> for commit <hash2> Aborted (core dumped) - In the repo I had cloned, there were really 60 commits to scan, but - only 12 were in the list to write when calling - compute_generation_numbers(). A commit in the list expects to see a - parent, but that parent is not in the list. - - As an initial test, I converted the code in builtin/fetch.c that calls + As an initial fix, I converted the code in builtin/fetch.c that calls write_commit_graph_reachable() to instead launch a "git commit-graph write --reachable --split" process. That code worked, but is not how we want the feature to work long-term. @@ -60,20 +58,28 @@ flag inside close_reachable(), but the tips did not have the flag, so that did nothing. + It turns out that the calculate_changed_submodule_paths() method is at + fault. Thanks, Peff, for pointing out this detail! More specifically, + for each submodule, the collect_changed_submodules() runs a revision + walk to essentially do file-history on the list of submodules. That + revision walk marks commits UNININTERESTING if they are simiplified away + by not changing the submodule. + Instead, I finally arrived on the conclusion that I should use a flag that is not used in any other part of the code. In commit-reach.c, a number of flags were defined for commit walk algorithms. The REACHABLE flag seemed like it made the most sense, and it seems it was not - actually used in the file. + actually used in the file. The REACHABLE flag was used in early versions + of commit-reach.c, but was removed by 4fbcca4 (commit-reach: make + can_all_from_reach... linear, 2018-07-20). Add the REACHABLE flag to commit-graph.c and use it instead of UNINTERESTING in close_reachable(). This fixes the bug in manual testing. - I have failed to produce a test using the file:// protocol that - demonstrates this bug. - Reported-by: Johannes Schindelin <johannes.schinde...@gmx.de> + Helped-by: Jeff King <p...@peff.net> + Helped-by: Szeder Gábor <szeder....@gmail.com> Signed-off-by: Derrick Stolee <dsto...@microsoft.com> diff --git a/commit-graph.c b/commit-graph.c @@ -147,3 +153,16 @@ * sha1-name.c: 20 * list-objects-filter.c: 21 * builtin/fsck.c: 0--3 + + diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh + --- a/t/t5510-fetch.sh + +++ b/t/t5510-fetch.sh +@@ + ) + ' + +-test_expect_failure 'fetch.writeCommitGraph with submodules' ' ++test_expect_success 'fetch.writeCommitGraph with submodules' ' + pwd="$(pwd)" && + git clone dups super && + ( -- gitgitgadget