Re: [PATCH 2/3] prune: fix pruning with multiple worktrees and split index

2017-12-11 Thread Thomas Gummerer
On 12/11, Brandon Williams wrote:
> On 12/10, Thomas Gummerer wrote:
> > be489d02d2 ("revision.c: --indexed-objects add objects from all
> > worktrees", 2017-08-23) made sure that pruning takes objects from all
> > worktrees into account.
> > 
> > It did that by reading the index of every worktree and adding the
> > necessary index objects to the set of pending objects.  The index is
> > read by read_index_from.  As mentioned in the previous commit,
> > read_index_from depends on the CWD for the location of the split index,
> > and add_index_objects_to_pending doesn't set that before using
> > read_index_from.
> > 
> > Instead of using read_index_from, use repo_read_index, which is aware of
> > the proper paths for the worktree.
> > 
> > This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.
> > 
> 
> I'm on the fence about this change.  I understand that this will ensure
> that the proper objects aren't pruned when using a split index in the
> presence of worktrees but I think the solution needs to be thought
> through a bit more.
> 
> My big concern right now is the interaction of 'struct worktree's and
> 'struct repository'.  I'll try to highlight my concerns below.

Thanks for the review!  My thoughts on the concerns are below.

> > Signed-off-by: Thomas Gummerer 
> > ---
> > 
> > This also fixes t7009 when ran with GIT_TEST_SPLIT_INDEX.  I'm not
> > quite sure why it is fixed by this.  Either way I tracked the failure
> > down to f767178a5a ("Merge branch 'jk/no-null-sha1-in-cache-tree'",
> > 2017-05-16).  Maybe Peff has an idea why this fixes that test?
> > 
> >  repository.c | 11 +++
> >  repository.h |  2 ++
> >  revision.c   | 13 -
> >  3 files changed, 21 insertions(+), 5 deletions(-)
> > 
> > diff --git a/repository.c b/repository.c
> > index 928b1f553d..3c9bfbd1b8 100644
> > --- a/repository.c
> > +++ b/repository.c
> > @@ -2,6 +2,7 @@
> >  #include "repository.h"
> >  #include "config.h"
> >  #include "submodule-config.h"
> > +#include "worktree.h"
> >  
> >  /* The main repository */
> >  static struct repository the_repo = {
> > @@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char 
> > *gitdir, const char *worktree)
> > return -1;
> >  }
> >  
> > +/*
> > + * Initialize 'repo' based on the provided worktree
> > + * Return 0 upon success and a non-zero value upon failure.
> > + */
> > +int repo_worktree_init(struct repository *repo, struct worktree *worktree)
> > +{
> > +   return repo_init(repo, get_worktree_git_dir(worktree),
> > +worktree->path);
> > +}
> 
> My first concern is the use of 'get_worktree_git_dir()'.  Under the hood
> it calls 'get_git_dir()', 'get_git_common_dir()', and
> 'git_common_path()' which rely on global state as stored in
> 'the_repository'.  So how does one initialize a repository struct (using
> this initializer) using a worktree from a repository other than the
> global 'the_repository' struct?  I'm not sure I have an answer right
> now, but its an issue that needs to be thought through before we head
> down this road.

The main reason for just using 'get_worktree_git_dir()' was because it
exists and it does exactly what I needed.  If we ever have the need to
initialize a repository struct for a worktree from another repository
struct, I would imagine we introduce a new function
'repo_get_worktree_git_dir()', which takes a struct repository as
input and returns the worktree git dir based on that.

And then we'd have to add a different initializer
'repo_worktree_init_from_repo()' or something like that, where we
could initialize a worktree from a repository struct different from
'the_repository'.  But maybe there are some complications there that I
didn't think about?

> Just thinking to myself, Does it make sense to have worktree's as a
> separate struct or to have them stored in 'struct repository' in some
> way?

I'm not sure I have a good answer for this.  Looking at the libgit2
source, which has a repository struct for (I assume at least, I'm not
very familiar with the libgit2 source :)), they have a repository
struct for each worktree, with a flag specifying whether the struct is
for a worktree, or a "normal" git repository/the main worktree.

Obviously that doesn't mean we should do it the same way, but it
probably also doesn't hurt to look at prior art.

> Shouldn't a repository struct have a way to interact with all of
> its worktrees?

It could go either way I guess depending on how we want to design it.
If we want the repository struct to interact with all worktrees, it
seems to me like we would almost need some kind of 'struct
super_repository', which then contains a list of 'struct repository's
representing each worktree, probably including the main worktree.

But then again in normal operation we probably only want the
repository struct for the current repository we're working with, not
all of the worktrees.  I suspect it's only in some special cases where
we 

Re: [PATCH 2/3] prune: fix pruning with multiple worktrees and split index

2017-12-11 Thread Brandon Williams
On 12/10, Thomas Gummerer wrote:
> be489d02d2 ("revision.c: --indexed-objects add objects from all
> worktrees", 2017-08-23) made sure that pruning takes objects from all
> worktrees into account.
> 
> It did that by reading the index of every worktree and adding the
> necessary index objects to the set of pending objects.  The index is
> read by read_index_from.  As mentioned in the previous commit,
> read_index_from depends on the CWD for the location of the split index,
> and add_index_objects_to_pending doesn't set that before using
> read_index_from.
> 
> Instead of using read_index_from, use repo_read_index, which is aware of
> the proper paths for the worktree.
> 
> This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.
> 

I'm on the fence about this change.  I understand that this will ensure
that the proper objects aren't pruned when using a split index in the
presence of worktrees but I think the solution needs to be thought
through a bit more.

My big concern right now is the interaction of 'struct worktree's and
'struct repository'.  I'll try to highlight my concerns below.

> Signed-off-by: Thomas Gummerer 
> ---
> 
> This also fixes t7009 when ran with GIT_TEST_SPLIT_INDEX.  I'm not
> quite sure why it is fixed by this.  Either way I tracked the failure
> down to f767178a5a ("Merge branch 'jk/no-null-sha1-in-cache-tree'",
> 2017-05-16).  Maybe Peff has an idea why this fixes that test?
> 
>  repository.c | 11 +++
>  repository.h |  2 ++
>  revision.c   | 13 -
>  3 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/repository.c b/repository.c
> index 928b1f553d..3c9bfbd1b8 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -2,6 +2,7 @@
>  #include "repository.h"
>  #include "config.h"
>  #include "submodule-config.h"
> +#include "worktree.h"
>  
>  /* The main repository */
>  static struct repository the_repo = {
> @@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char 
> *gitdir, const char *worktree)
>   return -1;
>  }
>  
> +/*
> + * Initialize 'repo' based on the provided worktree
> + * Return 0 upon success and a non-zero value upon failure.
> + */
> +int repo_worktree_init(struct repository *repo, struct worktree *worktree)
> +{
> + return repo_init(repo, get_worktree_git_dir(worktree),
> +  worktree->path);
> +}

My first concern is the use of 'get_worktree_git_dir()'.  Under the hood
it calls 'get_git_dir()', 'get_git_common_dir()', and
'git_common_path()' which rely on global state as stored in
'the_repository'.  So how does one initialize a repository struct (using
this initializer) using a worktree from a repository other than the
global 'the_repository' struct?  I'm not sure I have an answer right
now, but its an issue that needs to be thought through before we head
down this road.

Just thinking to myself, Does it make sense to have worktree's as a
separate struct or to have them stored in 'struct repository' in some
way?  Shouldn't a repository struct have a way to interact with all of
its worktrees?  How would initializing a repository struct for every
worktree work once we migrate the object store to be stored in 'struct
repoisotry'?  Shouldn't every worktree share the same object store
in-memory like they do on-disk?

> +
>  /*
>   * Initialize 'submodule' as the submodule given by 'path' in parent 
> repository
>   * 'superproject'.
> diff --git a/repository.h b/repository.h
> index 7f5e24a0a2..2adeb05bf4 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -4,6 +4,7 @@
>  struct config_set;
>  struct index_state;
>  struct submodule_cache;
> +struct worktree;
>  
>  struct repository {
>   /* Environment */
> @@ -87,6 +88,7 @@ extern struct repository *the_repository;
>  extern void repo_set_gitdir(struct repository *repo, const char *path);
>  extern void repo_set_worktree(struct repository *repo, const char *path);
>  extern int repo_init(struct repository *repo, const char *gitdir, const char 
> *worktree);
> +extern int repo_worktree_init(struct repository *repo, struct worktree 
> *worktree);
>  extern int repo_submodule_init(struct repository *submodule,
>  struct repository *superproject,
>  const char *path);
> diff --git a/revision.c b/revision.c
> index e2e691dd5a..9d8d9b96d1 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -22,6 +22,7 @@
>  #include "packfile.h"
>  #include "worktree.h"
>  #include "argv-array.h"
> +#include "repository.h"
>  
>  volatile show_early_output_fn_t show_early_output;
>  
> @@ -1346,15 +1347,17 @@ void add_index_objects_to_pending(struct rev_info 
> *revs, unsigned int flags)
>   worktrees = get_worktrees(0);
>   for (p = worktrees; *p; p++) {
>   struct worktree *wt = *p;
> - struct index_state istate = { NULL };
> + struct repository *repo;
>  
> + repo = xmalloc(sizeof(struct repository));

This was 

[PATCH 2/3] prune: fix pruning with multiple worktrees and split index

2017-12-10 Thread Thomas Gummerer
be489d02d2 ("revision.c: --indexed-objects add objects from all
worktrees", 2017-08-23) made sure that pruning takes objects from all
worktrees into account.

It did that by reading the index of every worktree and adding the
necessary index objects to the set of pending objects.  The index is
read by read_index_from.  As mentioned in the previous commit,
read_index_from depends on the CWD for the location of the split index,
and add_index_objects_to_pending doesn't set that before using
read_index_from.

Instead of using read_index_from, use repo_read_index, which is aware of
the proper paths for the worktree.

This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.

Signed-off-by: Thomas Gummerer 
---

This also fixes t7009 when ran with GIT_TEST_SPLIT_INDEX.  I'm not
quite sure why it is fixed by this.  Either way I tracked the failure
down to f767178a5a ("Merge branch 'jk/no-null-sha1-in-cache-tree'",
2017-05-16).  Maybe Peff has an idea why this fixes that test?

 repository.c | 11 +++
 repository.h |  2 ++
 revision.c   | 13 -
 3 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/repository.c b/repository.c
index 928b1f553d..3c9bfbd1b8 100644
--- a/repository.c
+++ b/repository.c
@@ -2,6 +2,7 @@
 #include "repository.h"
 #include "config.h"
 #include "submodule-config.h"
+#include "worktree.h"
 
 /* The main repository */
 static struct repository the_repo = {
@@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char *gitdir, 
const char *worktree)
return -1;
 }
 
+/*
+ * Initialize 'repo' based on the provided worktree
+ * Return 0 upon success and a non-zero value upon failure.
+ */
+int repo_worktree_init(struct repository *repo, struct worktree *worktree)
+{
+   return repo_init(repo, get_worktree_git_dir(worktree),
+worktree->path);
+}
+
 /*
  * Initialize 'submodule' as the submodule given by 'path' in parent repository
  * 'superproject'.
diff --git a/repository.h b/repository.h
index 7f5e24a0a2..2adeb05bf4 100644
--- a/repository.h
+++ b/repository.h
@@ -4,6 +4,7 @@
 struct config_set;
 struct index_state;
 struct submodule_cache;
+struct worktree;
 
 struct repository {
/* Environment */
@@ -87,6 +88,7 @@ extern struct repository *the_repository;
 extern void repo_set_gitdir(struct repository *repo, const char *path);
 extern void repo_set_worktree(struct repository *repo, const char *path);
 extern int repo_init(struct repository *repo, const char *gitdir, const char 
*worktree);
+extern int repo_worktree_init(struct repository *repo, struct worktree 
*worktree);
 extern int repo_submodule_init(struct repository *submodule,
   struct repository *superproject,
   const char *path);
diff --git a/revision.c b/revision.c
index e2e691dd5a..9d8d9b96d1 100644
--- a/revision.c
+++ b/revision.c
@@ -22,6 +22,7 @@
 #include "packfile.h"
 #include "worktree.h"
 #include "argv-array.h"
+#include "repository.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -1346,15 +1347,17 @@ void add_index_objects_to_pending(struct rev_info 
*revs, unsigned int flags)
worktrees = get_worktrees(0);
for (p = worktrees; *p; p++) {
struct worktree *wt = *p;
-   struct index_state istate = { NULL };
+   struct repository *repo;
 
+   repo = xmalloc(sizeof(struct repository));
if (wt->is_current)
continue; /* current index already taken care of */
+   if (repo_worktree_init(repo, wt))
+   BUG("couldn't initialize repository object from 
worktree");
 
-   if (read_index_from(,
-   worktree_git_path(wt, "index")) > 0)
-   do_add_index_objects_to_pending(revs, );
-   discard_index();
+   if (repo_read_index(repo) > 0)
+   do_add_index_objects_to_pending(revs, repo->index);
+   discard_index(repo->index);
}
free_worktrees(worktrees);
 }
-- 
2.15.1.504.g5279b80103