[PATCH 12/12] commit-graph: update design document

2018-05-10 Thread Derrick Stolee
The commit-graph feature is now integrated with 'fsck' and 'gc',
so remove those items from the "Future Work" section of the
commit-graph design document.

Also remove the section on lazy-loading trees, as that was completed
in an earlier patch series.

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 22 --
 1 file changed, 22 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index e1a883eb46..c664acbd76 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -118,9 +118,6 @@ Future Work
 - The commit graph feature currently does not honor commit grafts. This can
   be remedied by duplicating or refactoring the current graft logic.
 
-- The 'commit-graph' subcommand does not have a "verify" mode that is
-  necessary for integration with fsck.
-
 - After computing and storing generation numbers, we must make graph
   walks aware of generation numbers to gain the performance benefits they
   enable. This will mostly be accomplished by swapping a commit-date-ordered
@@ -130,25 +127,6 @@ Future Work
 - 'log --topo-order'
 - 'tag --merged'
 
-- Currently, parse_commit_gently() requires filling in the root tree
-  object for a commit. This passes through lookup_tree() and consequently
-  lookup_object(). Also, it calls lookup_commit() when loading the parents.
-  These method calls check the ODB for object existence, even if the
-  consumer does not need the content. For example, we do not need the
-  tree contents when computing merge bases. Now that commit parsing is
-  removed from the computation time, these lookup operations are the
-  slowest operations keeping graph walks from being fast. Consider
-  loading these objects without verifying their existence in the ODB and
-  only loading them fully when consumers need them. Consider a method
-  such as "ensure_tree_loaded(commit)" that fully loads a tree before
-  using commit->tree.
-
-- The current design uses the 'commit-graph' subcommand to generate the graph.
-  When this feature stabilizes enough to recommend to most users, we should
-  add automatic graph writes to common operations that create many commits.
-  For example, one could compute a graph on 'clone', 'fetch', or 'repack'
-  commands.
-
 - A server could provide a commit graph file as part of the network protocol
   to avoid extra calculations by clients. This feature is only of benefit if
   the user is willing to trust the file, because verifying the file is correct
-- 
2.16.2.329.gfb62395de6



Re: [RFC PATCH 12/12] commit-graph: update design document

2018-04-20 Thread Jakub Narebski
Derrick Stolee  writes:

> The commit-graph feature is now integrated with 'fsck' and 'gc',
> so remove those items from the "Future Work" section of the
> commit-graph design document.

See comments below, but this looks good to me.

What is missing from the "Future Work" section (and which was somewhat
implied by now removed "When this feature stabilizes enough to recommend
to most users") is safety against history [view] changing features:
git-replace, shallow clone and grafts file.  I wrote about this in
"Re: [PATCH v8 04/14] graph: add commit graph design document"
https://public-inbox.org/git/86vacsjdcg@gmail.com/

JN> The problem in my opinion lies in different direction, namely that
JN> commit grafts can change, changing the view of the history.  If we want
JN> commit-graph file to follow user-visible view of the history of the
JN> project, it needs to respect current version of commit grafts - but what
JN> if commit grafts changed since commit-graph file was generated?
JN> 
JN> Actually, there are currently three ways to affect the view of the
JN> history:
JN> 
JN> a. legacy commit grafts mechanism; it was first, but it is not safe,
JN>cannot be transferred on fetch / push, and is now deprecated.
JN> 
JN> b. shallow clones, which are kind of specialized and limited grafts;
JN>they used to limit available functionality, but restrictions are
JN>being lifted (or perhaps even got lifted)
JN> 
JN> c. git-replace mechanism, where we can create an "overlay" of any
JN>object, and is intended to be among others replacement for commit
JN>grafts; safe, transferable, can be turned off with "git
JN>--no-replace-objects "
JN> 
JN> All those can change; some more likely than others.  The problem is if
JN> they change between writing commit-graph file (respecting old view of
JN> the history) and reading it (where we expect to see the new view).
JN> 
JN> a. grafts file can change: lines can be added, removed or changed
JN> 
JN> b. shallow clones can be deepened or shortened, or even make
JN>not shallow
JN> 
JN> c. new replacements can be added, old removed, and existing edited
JN> 
JN> 
JN> There are, as far as I can see, two ways of handling the issue of Git
JN> features that can change the view of the project's history, namely:
JN> 
JN>  * Disable commit-graph reading when any of this features are used, and
JN>always write full graph info.
JN> 
JN>This may not matter much for shallow clones, where commit count
JN>should be small anyway, but when using git-replace to stitch together
JN>current repository with historical one, commit-graph would be
JN>certainly useful.  Also, git-replace does not need to change history.
JN> 
JN>On the other hand I think it is the easier solution.
JN> 
JN> Or
JN> 
JN>  * Detect somehow that the view of the history changed, and invalidate
JN>commit-graph (perhaps even automatically regenerate it).
JN> 
JN>For shallow clone changes I think one can still use the old
JN>commit-graph file to generate the new one.  For other cases, the
JN>metadata is simple to modify, but indices such as generation number
JN>would need to be at least partially calculated anew.

Note that in all cases one can simply discard generation number
information (treating it as if it was ZERO), and use commit-graph as
values before applying history-changing feature: replacements, grafts or
shallow.

Well, at least for shallow you can do that for generation numbers: using
grafts (deprecated) or replacements to join repository with historical
one would mean that we are no longer have commit-graph transitively
closed under reachability.

>
> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/technical/commit-graph.txt | 9 -
>  1 file changed, 9 deletions(-)
>
> diff --git a/Documentation/technical/commit-graph.txt 
> b/Documentation/technical/commit-graph.txt
> index d9f2713efa..d04657b781 100644
> --- a/Documentation/technical/commit-graph.txt
> +++ b/Documentation/technical/commit-graph.txt
> @@ -118,9 +118,6 @@ Future Work
>  - The commit graph feature currently does not honor commit grafts. This can
>be remedied by duplicating or refactoring the current graft logic.
>  
> -- The 'commit-graph' subcommand does not have a "verify" mode that is
> -  necessary for integration with fsck.
> -

All right (though "verify" mode is actually done via "check" command).

>  - After computing and storing generation numbers, we must make graph
>walks aware of generation numbers to gain the performance benefits they
>enable. This will mostly be accomplished by swapping a commit-date-ordered
> @@ -142,12 +139,6 @@ Future Work
>such as "ensure_tree_loaded(commit)" that fully loads a tree before
>using commit->tree.
>  
> -- The current design uses the 'commit-graph' subcommand to generate the 
> graph.
> -  When this feature stabilizes enough to recommend to most users, we should

[RFC PATCH 12/12] commit-graph: update design document

2018-04-17 Thread Derrick Stolee
The commit-graph feature is now integrated with 'fsck' and 'gc',
so remove those items from the "Future Work" section of the
commit-graph design document.

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 9 -
 1 file changed, 9 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index d9f2713efa..d04657b781 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -118,9 +118,6 @@ Future Work
 - The commit graph feature currently does not honor commit grafts. This can
   be remedied by duplicating or refactoring the current graft logic.
 
-- The 'commit-graph' subcommand does not have a "verify" mode that is
-  necessary for integration with fsck.
-
 - After computing and storing generation numbers, we must make graph
   walks aware of generation numbers to gain the performance benefits they
   enable. This will mostly be accomplished by swapping a commit-date-ordered
@@ -142,12 +139,6 @@ Future Work
   such as "ensure_tree_loaded(commit)" that fully loads a tree before
   using commit->tree.
 
-- The current design uses the 'commit-graph' subcommand to generate the graph.
-  When this feature stabilizes enough to recommend to most users, we should
-  add automatic graph writes to common operations that create many commits.
-  For example, one could compute a graph on 'clone', 'fetch', or 'repack'
-  commands.
-
 - A server could provide a commit graph file as part of the network protocol
   to avoid extra calculations by clients. This feature is only of benefit if
   the user is willing to trust the file, because verifying the file is correct
-- 
2.17.0.39.g685157f7fb