Re: [PATCH v3 07/28] shallow.c: add remove_reachable_shallow_points()

2013-11-26 Thread Duy Nguyen
On Tue, Nov 26, 2013 at 4:53 AM, Junio C Hamano gits...@pobox.com wrote:
 Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 When we receive a pack and the shallow points from another repository,
 we may want to add more shallow points to current repo to make sure no
 commits point to nowhere. However we do not want to add unnecessary
 shallow points and cut our history short because the other end is a
 shallow version of this repo. The output shallow points may or may not
 be added to .git/shallow, depending on whether they are actually
 reachable in the new pack.

 This function filters such shallow points out, leaving ones that might
 potentially be added. A simple has_sha1_file won't do because we may
 have incomplete object islands (i.e. not connected to any refs) and
 the shallow points are on these islands. In that case we want to keep
 the shallow points as candidates until we figure out if the new pack
 connects to such object islands.

 Typical cases that use remove_reachable_shallow_points() are:

  - fetch from a repo that has longer history: in this case all remote
shallow roots do not exist in the client repo, this function will
be cheap as it just does a few lookup_commit_graft and
has_sha1_file.

 It is unclear why.  If you fetched from a repository more complete
 than you, you may deepen your history which may allow you to unplug
 the shallow points you originally had, and remote shallow root (by
 the way, lets find a single good phrase to represent this thing and
 stick to it) may want to replace them, no?

Except that deepen/shorten history is a different mode that this
function is not used at all. I should have made that clear. This and
the next patch are about stick to our base and add something on top

Any suggestions about a good phase? I've been swinging between
shallow points (from 4 months ago) and shallow roots (recently).

  - fetch from a repo that has exactly the same shallow root set
(e.g. a clone from a shallow repo): this case may trigger
in_merge_bases_many all the way to roots. An exception is made to
avoid such costly path with a faith that .git/shallow does not
usually points to unreachable commit islands.

 ... and when the faith is broken, you will end up with a broken
 repository???

Not really broken because the new ref will be cut at the troublesome
shallow root before it goes out of bound, so the user may be surprised
that he got a history shorter than he wanted. It's when the root is
removed that we have a problem. But commits in .git/shallow are only
removed by

1) deepening history
2) the prune patch 28/28

#1 should send the missing objects and insert a new commit to
.git/shallow to plug the hole, so we're good. #2 only removes commits
from .git/shallow if they are not reachable from any refs, which is no
longer true.

 +static int add_ref(const char *refname,
 +const unsigned char *sha1, int flags, void *cb_data)
 +{
 + struct commit_array *ca = cb_data;
 + ALLOC_GROW(ca-commits, ca-nr + 1, ca-alloc);
 + ca-commits[ca-nr++] = lookup_commit(sha1);
 + return 0;
 +}

 Can't a ref point at a non-commit-ish?  Is the code prepared to deal
 with such an entry (possibly a NULL pointer) in the commit_array
 struct?

Eck, yes a ref can. No the code is not :P Thanks for pointing this
out. We don't care about non-commit refs, so we just need to filter
them out.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 07/28] shallow.c: add remove_reachable_shallow_points()

2013-11-25 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 When we receive a pack and the shallow points from another repository,
 we may want to add more shallow points to current repo to make sure no
 commits point to nowhere. However we do not want to add unnecessary
 shallow points and cut our history short because the other end is a
 shallow version of this repo. The output shallow points may or may not
 be added to .git/shallow, depending on whether they are actually
 reachable in the new pack.

 This function filters such shallow points out, leaving ones that might
 potentially be added. A simple has_sha1_file won't do because we may
 have incomplete object islands (i.e. not connected to any refs) and
 the shallow points are on these islands. In that case we want to keep
 the shallow points as candidates until we figure out if the new pack
 connects to such object islands.

 Typical cases that use remove_reachable_shallow_points() are:

  - fetch from a repo that has longer history: in this case all remote
shallow roots do not exist in the client repo, this function will
be cheap as it just does a few lookup_commit_graft and
has_sha1_file.

It is unclear why.  If you fetched from a repository more complete
than you, you may deepen your history which may allow you to unplug
the shallow points you originally had, and remote shallow root (by
the way, lets find a single good phrase to represent this thing and
stick to it) may want to replace them, no?

  - fetch from a repo that has exactly the same shallow root set
(e.g. a clone from a shallow repo): this case may trigger
in_merge_bases_many all the way to roots. An exception is made to
avoid such costly path with a faith that .git/shallow does not
usually points to unreachable commit islands.

... and when the faith is broken, you will end up with a broken
repository???

  - push from a shallow repo that has shorter history than the
server's: in_merge_bases_many() is unavoidable, so the longer
history the client has the higher the server cost is. The cost may
be reduced with the help of pack bitmaps, or commit cache, though.

  - push from a shallow clone that has exactly the same shallow root
set: the same as the second fetch case above, .i.e. cheap by
exception.

 The function must be called before the new pack is installed, or we
 won't know which objects are ours, which theirs.

 Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
 ---
  commit.h  |  3 +++
  connect.c |  2 +-
  remote.h  |  1 +
  shallow.c | 45 +
  4 files changed, 50 insertions(+), 1 deletion(-)

 diff --git a/commit.h b/commit.h
 index a879526..98044e6 100644
 --- a/commit.h
 +++ b/commit.h
 @@ -193,6 +193,7 @@ extern struct commit_list *get_octopus_merge_bases(struct 
 commit_list *in);
  /* largest positive number a signed 32-bit integer can contain */
  #define INFINITE_DEPTH 0x7fff
  
 +struct extra_have_objects;
  extern int register_shallow(const unsigned char *sha1);
  extern int unregister_shallow(const unsigned char *sha1);
  extern int for_each_commit_graft(each_commit_graft_fn, void *);
 @@ -206,6 +207,8 @@ extern void setup_alternate_shallow(struct lock_file 
 *shallow_lock,
   const char **alternate_shallow_file);
  extern char *setup_temporary_shallow(void);
  extern void advertise_shallow_grafts(int);
 +extern void remove_reachable_shallow_points(struct extra_have_objects *out,
 + const struct extra_have_objects 
 *in);
  
  int is_descendant_of(struct commit *, struct commit_list *);
  int in_merge_bases(struct commit *, struct commit *);
 diff --git a/connect.c b/connect.c
 index d0602b0..80e4360 100644
 --- a/connect.c
 +++ b/connect.c
 @@ -45,7 +45,7 @@ int check_ref_type(const struct ref *ref, int flags)
   return check_ref(ref-name, strlen(ref-name), flags);
  }
  
 -static void add_extra_have(struct extra_have_objects *extra, unsigned char 
 *sha1)
 +void add_extra_have(struct extra_have_objects *extra, unsigned char *sha1)
  {
   ALLOC_GROW(extra-array, extra-nr + 1, extra-alloc);
   hashcpy((extra-array[extra-nr][0]), sha1);
 diff --git a/remote.h b/remote.h
 index 773faa9..ff604ff 100644
 --- a/remote.h
 +++ b/remote.h
 @@ -145,6 +145,7 @@ extern struct ref **get_remote_heads(int in, char 
 *src_buf, size_t src_len,
struct ref **list, unsigned int flags,
struct extra_have_objects *have,
struct extra_have_objects *shallow);
 +extern void add_extra_have(struct extra_have_objects *extra, unsigned char 
 *sha1);
  
  int resolve_remote_symref(struct ref *ref, struct ref *list);
  int ref_newer(const unsigned char *new_sha1, const unsigned char *old_sha1);
 diff --git a/shallow.c b/shallow.c
 index 9552512..a974d2d 100644
 --- a/shallow.c
 +++ b/shallow.c
 @@ -2,6 +2,8 @@
  #include