Fixes the issues SZEDER raised with v1, except for displaying an
accurate ETA in write_graph_*(). As noted in 2/6 I don't think that's
worth it, so I just adjusted the message instead.
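
To illustrate why the "Writing out" number is a plain count rather than
a percentage, here is a minimal standalone sketch. It is not the code
in commit-graph.c: the demo_* names and the struct are hypothetical
stand-ins, and the only things taken from the series are the idea of
one counter shared by all chunk writers and a display call that
tolerates a NULL progress pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a progress meter; not git's struct progress. */
struct demo_progress {
	const char *title;
	uint64_t last; /* last value reported */
};

/* NULL-safe, so callers need no "if (progress)" guard around it. */
static void demo_display_progress(struct demo_progress *p, uint64_t n)
{
	if (!p)
		return;
	p->last = n;
}

/* One "chunk writer": visits every commit once, bumping the shared counter. */
static void demo_write_chunk(size_t nr_commits, struct demo_progress *p,
			     uint64_t *progress_cnt)
{
	size_t i;

	assert(progress_cnt);
	for (i = 0; i < nr_commits; i++)
		demo_display_progress(p, ++*progress_cnt);
}

/*
 * Runs nr_chunks writers against a single counter and returns the final
 * count. No total is known up front, so only a raw number is printed.
 */
static uint64_t demo_write_graph(size_t nr_commits, int nr_chunks,
				 struct demo_progress *p)
{
	uint64_t progress_cnt = 0;
	int i;

	for (i = 0; i < nr_chunks; i++)
		demo_write_chunk(nr_commits, p, &progress_cnt);
	if (p)
		fprintf(stderr, "%s: %llu, done.\n", p->title,
			(unsigned long long)p->last);
	return progress_cnt;
}
```

With this sketch, calling demo_write_graph(5, 4, &p) counts to 20, i.e.
the number of commits times the number of loops. Showing a percentage
instead would mean working out that total before the first writer runs,
which is the extra complexity argued against above.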
Ævar Arnfjörð Bjarmason (6):
commit-graph write: rephrase confusing progress output
commit-graph write: add more progress output
commit-graph write: show progress for object search
commit-graph write: add more describing progress output
commit-graph write: remove empty line for readability
commit-graph write: add even more progress output
commit-graph.c | 92 +++++++++++++++++++++++++++++++++++++++-----------
1 file changed, 73 insertions(+), 19 deletions(-)
Range-diff:
1: 751d3a7561 ! 1: 093c63e99f commit-graph write: add more progress output
a => b | 0
1 file changed, 0 insertions(+), 0 deletions(-)
@@ -13,22 +13,30 @@
point at which we're not producing progress output:
$ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
- Finding commits for commit graph: 6418991, done.
- Computing commit graph generation numbers: 100% (797205/797205),
done.
- Writing out commit graph chunks: 2399861, done.
+ Finding commits for commit graph: 6365492, done.
+ Computing commit graph generation numbers: 100% (797222/797222),
done.
+ Writing out commit graph: 2399912, done.
- This "graph chunks" number is not meant to be meaningful to the user,
+ This "writing out" number is not meant to be meaningful to the user,
but just to show that we're doing work and the command isn't
hanging.
+ In the current implementation it's approximately 4x the number of
+ commits. As noted in the on-list discussion[1] we could add the loops up
+ and show percentage progress here, but I don't think it's worth it. It
+ would make the implementation more complex and harder to maintain for
+ very little gain.
+
On a much larger in-house repository I have we'll show (note how we
also say "Annotating[...]"):
$ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
- Finding commits for commit graph: 48271163, done.
- Annotating commit graph: 21424536, done.
- Computing commit graph generation numbers: 100% (7141512/7141512),
done.
- Writing out commit graph chunks: 21424913, done.
+ Finding commits for commit graph: 50026015, done.
+ Annotating commit graph: 21567407, done.
+ Computing commit graph generation numbers: 100% (7144680/7144680),
done.
+ Writing out commit graph: 21434417, done.
+
+ 1. https://public-inbox.org/git/[email protected]/
Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
@@ -50,8 +58,7 @@
*/
for (i = 0; i < 256; i++) {
while (count < nr_commits) {
-+ if (progress)
-+ display_progress(progress, ++*progress_cnt);
++ display_progress(progress, ++*progress_cnt);
if ((*list)->object.oid.hash[0] != i)
break;
count++;
@@ -68,8 +75,7 @@
int count;
- for (count = 0; count < nr_commits; count++, list++)
+ for (count = 0; count < nr_commits; count++, list++) {
-+ if (progress)
-+ display_progress(progress, ++*progress_cnt);
++ display_progress(progress, ++*progress_cnt);
hashwrite(f, (*list)->object.oid.hash, (int)hash_len);
+ }
}
@@ -87,15 +93,13 @@
struct commit **list = commits;
struct commit **last = commits + nr_commits;
@@
+ struct commit_list *parent;
int edge_value;
uint32_t packedDate[2];
++ display_progress(progress, ++*progress_cnt);
-+ if (progress)
-+ display_progress(progress, ++*progress_cnt);
-+
parse_commit(*list);
hashwrite(f, get_commit_tree_oid(*list)->hash, hash_len);
-
@@
static void write_graph_chunk_large_edges(struct hashfile *f,
@@ -108,20 +112,18 @@
struct commit **list = commits;
struct commit **last = commits + nr_commits;
@@
+ commits,
nr_commits,
commit_to_sha1);
++ display_progress(progress, ++*progress_cnt);
-+ if (progress)
-+ display_progress(progress, ++*progress_cnt);
-+
if (edge_value < 0)
edge_value = GRAPH_PARENT_MISSING;
- else if (!parent->next)
@@
int num_extra_edges;
struct commit_list *parent;
struct progress *progress = NULL;
-+ uint64_t progress_cnt;
++ uint64_t progress_cnt = 0;
if (!commit_graph_compatible(the_repository))
return;
@@ -135,8 +137,8 @@
- write_graph_chunk_large_edges(f, commits.list, commits.nr);
+ if (report_progress)
+ progress = start_delayed_progress(
-+ _("Writing out commit graph chunks"),
-+ progress_cnt = 0);
++ _("Writing out commit graph"),
++ 0);
+ write_graph_chunk_fanout(f, commits.list, commits.nr, progress,
+ &progress_cnt);
+ write_graph_chunk_oids(f, GRAPH_OID_LEN, commits.list, commits.nr,
2: d750f0dd16 ! 2: 6c71de9460 commit-graph write: show progress for object
search
a => b | 0
1 file changed, 0 insertions(+), 0 deletions(-)
@@ -8,18 +8,17 @@
Before we'd emit on e.g. linux.git with "commit-graph write":
- Finding commits for commit graph: 6418991, done.
+ Finding commits for commit graph: 6365492, done.
[...]
And now:
- Finding commits for commit graph: 100% (6418991/6418991), done.
+ Finding commits for commit graph: 100% (6365492/6365492), done.
[...]
- Since the commit graph only includes those commits that are
- packed (via for_each_packed_object(...)) the
- approximate_object_count() returns the actual number of objects we're
- going to process.
+ Since the commit graph only includes those commits that are packed
+ (via for_each_packed_object(...)) the approximate_object_count()
+ returns the actual number of objects we're going to process.
Still, it is possible due to a race with "gc" or another process
maintaining packs that the number of objects we're going to process is
@@ -40,7 +39,7 @@
@@
struct commit_list *parent;
struct progress *progress = NULL;
- uint64_t progress_cnt;
+ uint64_t progress_cnt = 0;
+ unsigned long approx_nr_objects;
if (!commit_graph_compatible(the_repository))
3: a175ab49ff ! 3: c665dbdacb commit-graph write: add more describing
progress output
a => b | 0
1 file changed, 0 insertions(+), 0 deletions(-)
@@ -10,18 +10,26 @@
we support:
$ git commit-graph write
- Finding commits for commit graph among packed objects: 100%
(6418991/6418991), done.
+ Finding commits for commit graph among packed objects: 100%
(6365492/6365492), done.
[...]
+
+ # Actually we don't emit this since this takes almost no time at
+ # all. But if we did (s/_delayed//) we'd show:
$ git for-each-ref --format='%(objectname)' | git commit-graph
write --stdin-commits
- Finding commits for commit graph from 584 ref tips: 100%
(584/584), done.
+ Finding commits for commit graph from 584 refs: 100% (584/584),
done.
[...]
+
$ (cd .git/objects/pack/ && ls *idx) | git commit-graph write
--stdin-pack
- Finding commits for commit graph in 4 packs: 6418991, done.
+ Finding commits for commit graph in 3 packs: 6365492, done.
[...]
- The middle on of those is going to be the output users will most
- commonly see, since it'll be emitted when they get the commit graph
- via gc.writeCommitGraph=true.
+ The middle one of those is going to be the output users might see in
+ practice, since it'll be emitted when they get the commit graph via
+ gc.writeCommitGraph=true. But as noted above you need a really large
+ number of refs for this message to show. It'll show up on a test
+ repository I have with ~165k refs:
+
+ Finding commits for commit graph from 165203 refs: 100%
(165203/165203), done.
Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
@@ -30,7 +38,7 @@
+++ b/commit-graph.c
@@
struct progress *progress = NULL;
- uint64_t progress_cnt;
+ uint64_t progress_cnt = 0;
unsigned long approx_nr_objects;
+ struct strbuf progress_title = STRBUF_INIT;
@@ -66,8 +74,8 @@
- commit_hex->nr);
+ if (report_progress) {
+ strbuf_addf(&progress_title,
-+ Q_("Finding commits for commit graph from
%d ref tip",
-+ "Finding commits for commit graph from
%d ref tips",
++ Q_("Finding commits for commit graph from
%d ref",
++ "Finding commits for commit graph from
%d refs",
+ commit_hex->nr),
+ commit_hex->nr);
+ progress = start_delayed_progress(progress_title.buf,
4: 4e11c8b2fd = 4: f70fc5045d commit-graph write: remove empty line for
readability
a => b | 0
1 file changed, 0 insertions(+), 0 deletions(-)
5: 6fbba22fac ! 5: 2e943fa925 commit-graph write: add even more progress
output
a => b | 0
1 file changed, 0 insertions(+), 0 deletions(-)
@@ -4,24 +4,26 @@
Add more progress output to sections of code that can collectively
take 5-10 seconds on a large enough repository. On a test repository
- with 7141512 commits (see earlier patches for details) we'll now emit:
+ I have with ~7 million commits and ~50 million objects we'll now
+ emit:
$ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
- Finding commits for commit graph among packed objects: 100%
(50009986/50009986), done.
- Annotating commit graph: 21564240, done.
- Counting distinct commits in commit graph: 100% (7188080/7188080),
done.
- Finding extra edges in commit graph: 100% (7188080/7188080), done.
- Computing commit graph generation numbers: 100% (7143635/7143635),
done.
- Writing out commit graph chunks: 21431282, done.
+ Finding commits for commit graph among packed objects: 100%
(50026015/50026015), done.
+ Annotating commit graph: 21567407, done.
+ Counting distinct commits in commit graph: 100% (7189147/7189147),
done.
+ Finding extra edges in commit graph: 100% (7189147/7189147), done.
+ Computing commit graph generation numbers: 100% (7144680/7144680),
done.
+ Writing out commit graph: 21434417, done.
- Whereas on a medium-sized repository such as linux.git we'll still
+ Whereas on a medium-sized repository such as linux.git these new
+ progress bars won't have time to kick in and, as before, we'll still
emit output like:
$ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
- Finding commits for commit graph among packed objects: 100%
(6365328/6365328), done.
- Annotating commit graph: 2391621, done.
- Computing commit graph generation numbers: 100% (797207/797207),
done.
- Writing out commit graph chunks: 2399867, done.
+ Finding commits for commit graph among packed objects: 100%
(6365492/6365492), done.
+ Annotating commit graph: 2391666, done.
+ Computing commit graph generation numbers: 100% (797222/797222),
done.
+ Writing out commit graph: 2399912, done.
The "Counting distinct commits in commit graph" phase will spend most
of its time paused at "0/*" as we QSORT(...) the list. That's not
--
2.20.0.rc0.387.gc7a69e6b6c