> Teach git-commit-graph to delete the graph previously referenced by 
> 'graph_head'
> when writing a new graph file and updating 'graph_head'. This prevents
> data creep by storing a list of useless graphs. Be careful to not delete
> the graph if the file did not change.

We have to be careful with deleting the previously referenced graph
file right away after generating the new one.  Consider two processes
running concurrently, one writing new graph files with
--delete-expire', and the other reading the commit graph, e.g. a
future graph-aware 'git gc' and 'git log --topo-order':

  1. 'log' reads the hash of the graph file from graph-head.
  2. 'gc' writes the new graph and graph head files and deletes the
     old graph file.
  3. 'log' tries to open the the graph file with the hash it just
     read, but that file is already gone.

At this point 'log' could simply error out, but that would be rather
unfriendly.  Or it could try harder and could just ignore the missing
graph file and walk revisions the old school way.  It would be slower,
depending on the history size maybe much slower, but it would work.
Good.

However, in addition to the reader trying harder, I think we should
also consider making the writer more careful, too, and only delete a
stale graph file after a certain grace period is elapsed; similar to
how 'git gc' only deletes old loose objects.  And then perhaps it
should delete all graph files that are older than that grace period;
as it is, neither '--clear' nor '--delete-expired' seem to care about
graph files that aren't or weren't referenced by the graph-head.


> diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> index 4970dec133..766f09e6fc 100644
> --- a/builtin/commit-graph.c
> +++ b/builtin/commit-graph.c

> @@ -121,6 +122,17 @@ static int graph_write(void)
>       if (graph_hash)
>               printf("%s\n", oid_to_hex(graph_hash));
>  
> +
> +     if (opts.delete_expired && opts.update_head && opts.has_existing &&
> +         oidcmp(graph_hash, &opts.old_graph_hash)) {
> +             char *old_path = get_commit_graph_filename_hash(opts.pack_dir,
> +                                                             
> &opts.old_graph_hash);
> +             if (remove_path(old_path))
> +                     die("failed to remove path %s", old_path);
> +
> +             free(old_path);
> +     }
> +
>       free(graph_hash);
>       return 0;
>  }
> @@ -139,6 +151,8 @@ int cmd_commit_graph(int argc, const char **argv, const 
> char *prefix)
>                       N_("write commit graph file")),
>               OPT_BOOL('u', "update-head", &opts.update_head,
>                       N_("update graph-head to written graph file")),
> +             OPT_BOOL('d', "delete-expired", &opts.delete_expired,
> +                     N_("delete expired head graph file")),
>               { OPTION_STRING, 'H', "graph-hash", &opts.graph_hash,
>                       N_("hash"),
>                       N_("A hash for a specific graph file in the pack-dir."),

Like '--update-head', '--delete-expired' is silently ignored when it's
not used with '--write'.


> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index 6e3b62b754..b56a6d4217 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh

> +test_expect_success 'write graph with merges' \
> +    'graph3=$(git commit-graph --write --update-head --delete-expired) &&
> +     test_path_is_file ${packdir}/graph-${graph3}.graph &&
> +     test_path_is_missing ${packdir}/graph-${graph2}.graph &&
> +     test_path_is_file ${packdir}/graph-${graph1}.graph &&
> +     test_path_is_file ${packdir}/graph-head &&
> +     echo ${graph3} >expect &&
> +     cmp -n 40 expect ${packdir}/graph-head &&

printf and test_cmp.

> +     git commit-graph --read --graph-hash=${graph3} >output &&
> +     _graph_read_expect "23" "${packdir}" &&
> +     cmp expect output'
> +
> +test_expect_success 'write graph with nothing new' \
> +    'graph4=$(git commit-graph --write --update-head --delete-expired) &&
> +     test_path_is_file ${packdir}/graph-${graph4}.graph &&
> +     test_path_is_file ${packdir}/graph-${graph1}.graph &&
> +     test_path_is_file ${packdir}/graph-head &&
> +     echo ${graph4} >expect &&
> +     cmp -n 40 expect ${packdir}/graph-head &&

Likewise.

> +     git commit-graph --read --graph-hash=${graph4} >output &&
> +     _graph_read_expect "23" "${packdir}" &&
> +     cmp expect output'
> +
>  test_expect_success 'clear graph' \
>      'git commit-graph --clear &&
>       test_path_is_missing ${packdir}/graph-${graph2}.graph &&
> +     test_path_is_file ${packdir}/graph-${graph1}.graph &&
>       test_path_is_missing ${packdir}/graph-head'
>  
>  test_expect_success 'setup bare repo' \
> @@ -121,7 +185,7 @@ test_expect_success 'write graph in bare repo' \
>       echo ${graphbare} >expect &&
>       cmp -n 40 expect ${baredir}/graph-head &&
>       git commit-graph --read --graph-hash=${graphbare} >output &&
> -     _graph_read_expect "18" "${baredir}" &&
> +     _graph_read_expect "23" "${baredir}" &&
>       cmp expect output'
>  
>  test_done
> -- 
> 2.16.0.15.g9c3cf44.dirty


Reply via email to