Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-09-02 Thread Michael Haggerty
On 08/23/2012 10:31 PM, Jeff King wrote:
 On Thu, Aug 23, 2012 at 03:56:48PM -0400, Jeff King wrote:
 
 This code blames back to:

  commit 4bcb310c2539b66d535e87508d1b7a90fe29c083
  Author: Alexandre Julliard julli...@winehq.org
  Date:   Fri Nov 24 16:00:13 2006 +0100

 fetch-pack: Do not fetch tags for shallow clones.

 A better fix may be to only fetch tags that point to commits that we
 are downloading, but git-clone doesn't have support for following
 tags. This will happen automatically on the next git-fetch though.

 So it is about making sure that git clone --depth=1 does not
 accidentally pull a single commit from v1.0, v1.1, v1.2, and so forth,
 losing the purpose of using --depth in the first place. These days it is
 largely irrelevant, since this code path is not followed by clone, and
 clone will automatically restrict its list of fetched refs to a single
 branch if --depth is used.
 
 I think part of the confusion of this code is that inside the loop over
 the refs it is sometimes checking aspects of the ref, and sometimes
 checking invariants of the loop (like args.fetch_all). Splitting it into
 separate loops makes it easier to see what is going on, like the patch
 below (on top of Michael's series).
 
 I'm not sure if it ends up being more readable, since the generic cut
 down this linked list function has to operate through callbacks with a
 void pointer. On the other hand, that function could also be used
 elsewhere.
 
 diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
 index 90683ca..877cf38 100644
 --- a/builtin/fetch-pack.c
 +++ b/builtin/fetch-pack.c
 @@ -521,51 +521,80 @@ static void mark_recent_complete_commits(unsigned long 
 cutoff)
   }
  }
  
 -static void filter_refs(struct ref **refs, int *nr_heads, char **heads)
 +static void filter_refs_callback(struct ref **refs,
 +  int (*want)(struct ref *, void *),
 +  void *data)
  {
 - struct ref *newlist = NULL;
 - struct ref **newtail = newlist;
 + struct ref **tail = refs;
   struct ref *ref, *next;
 - int match_pos = 0, unmatched = 0;
  
   for (ref = *refs; ref; ref = next) {
 - int keep_ref = 0;
   next = ref-next;
 - if (!memcmp(ref-name, refs/, 5) 
 - check_refname_format(ref-name, 0))
 - ; /* trash */
 - else if (args.fetch_all 
 -(!args.depth || prefixcmp(ref-name, refs/tags/)))
 - keep_ref = 1;
 - else
 - while (match_pos  *nr_heads) {
 - int cmp = strcmp(ref-name, heads[match_pos]);
 - if (cmp  0) { /* definitely do not have it */
 - break;
 - } else if (cmp == 0) { /* definitely have it */
 - free(heads[match_pos++]);
 - keep_ref = 1;
 - break;
 - } else { /* might have it; keep looking */
 - heads[unmatched++] = heads[match_pos++];
 - }
 - }
 -
 - if (keep_ref) {
 - *newtail = ref;
 - ref-next = NULL;
 - newtail = ref-next;
 - } else {
 + if (want(ref, data))
 + tail = ref-next;
 + else {
   free(ref);
 + *tail = next;
   }
   }
 +}
  
 - /* copy any remaining unmatched heads: */
 - while (match_pos  *nr_heads)
 - heads[unmatched++] = heads[match_pos++];
 - *nr_heads = unmatched;
 +static int ref_name_is_ok(struct ref *ref, void *data)
 +{
 + return memcmp(ref-name, refs/, 5) ||
 + !check_refname_format(ref-name, 0);
 +}
 +
 +static int ref_ok_for_shallow(struct ref *ref, void *data)
 +{
 + return prefixcmp(ref-name, refs/tags/);
 +}
  
 - *refs = newlist;
 +struct filter_by_name_data {
 + char **heads;
 + int nr_heads;
 + int match_pos;
 + int unmatched;
 +};
 +
 +static int want_ref_name(struct ref *ref, void *data)
 +{
 + struct filter_by_name_data *f = data;
 +
 + while (f-match_pos  f-nr_heads) {
 + int cmp = strcmp(ref-name, f-heads[f-match_pos]);
 + if (cmp  0) /* definitely do not have it */
 + return 0;
 + else if (cmp == 0) { /* definitely have it */
 + free(f-heads[f-match_pos++]);
 + return 1;
 + } else /* might have it; keep looking */
 + f-heads[f-unmatched++] = f-heads[f-match_pos++];
 + }
 + return 0;
 +}
 +
 +static void filter_refs(struct ref **refs, int *nr_heads, char **heads)
 +{
 + struct filter_by_name_data f;
 +

Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-25 Thread Michael Haggerty
On 08/23/2012 10:31 PM, Jeff King wrote:
 I think part of the confusion of this code is that inside the loop over
 the refs it is sometimes checking aspects of the ref, and sometimes
 checking invariants of the loop (like args.fetch_all). Splitting it into
 separate loops makes it easier to see what is going on, like the patch
 below (on top of Michael's series).
 
 I'm not sure if it ends up being more readable, since the generic cut
 down this linked list function has to operate through callbacks with a
 void pointer. On the other hand, that function could also be used
 elsewhere.
 [...]

Despite requiring a bit more boilerplate, I think that your change makes
the logic clearer.

*If* we want to switch to using callbacks, then we can get even more
bang for the buck, as in the attached patch (which applies on top of my
patch v2).  Beyond your suggestion, this patch:

* Inlines your filter_refs() into everything_local(), because (a) it's
short and (b) the policy work implemented there logically belongs
higher-up in the call chain.

* Renames your filter_refs_callback() to filter_refs().

* Moves the initialization of the filter_by_name_data structure
(including sorting and de-duping) all the way up to fetch_pack(), and
passes a filter_by_name_data* (rather than (nr_heads, heads)) down to
the callees.

If you like this change, let me know and I'll massage it into a
digestible patch series.

Michael
-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index db77ee6..d1dabcf 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -521,51 +521,91 @@ static void mark_recent_complete_commits(unsigned long cutoff)
 	}
 }
 
-static void filter_refs(struct ref **refs, int *nr_heads, char **heads)
+static void filter_refs(struct ref **refs,
+			int (*want)(struct ref *, void *),
+			void *data)
 {
-	struct ref *newlist = NULL;
-	struct ref **newtail = newlist;
+	struct ref **tail = refs;
 	struct ref *ref, *next;
-	int head_pos = 0, unmatched = 0;
 
 	for (ref = *refs; ref; ref = next) {
-		int keep_ref = 0;
 		next = ref-next;
-		if (!memcmp(ref-name, refs/, 5) 
-		check_refname_format(ref-name, 0))
-			; /* trash */
-		else if (args.fetch_all 
-			   (!args.depth || prefixcmp(ref-name, refs/tags/)))
-			keep_ref = 1;
-		else
-			while (head_pos  *nr_heads) {
-int cmp = strcmp(ref-name, heads[head_pos]);
-if (cmp  0) { /* definitely do not have it */
-	break;
-} else if (cmp == 0) { /* definitely have it */
-	free(heads[head_pos++]);
-	keep_ref = 1;
-	break;
-} else { /* might have it; keep looking */
-	heads[unmatched++] = heads[head_pos++];
-}
-			}
-
-		if (keep_ref) {
-			*newtail = ref;
-			ref-next = NULL;
-			newtail = ref-next;
-		} else {
+		if (want(ref, data))
+			tail = ref-next;
+		else {
 			free(ref);
+			*tail = next;
 		}
 	}
+}
 
-	/* copy any remaining unmatched heads: */
-	while (head_pos  *nr_heads)
-		heads[unmatched++] = heads[head_pos++];
-	*nr_heads = unmatched;
+static int ref_name_is_ok(struct ref *ref, void *data)
+{
+	return memcmp(ref-name, refs/, 5) ||
+		!check_refname_format(ref-name, 0);
+}
+
+static int ref_ok_for_shallow(struct ref *ref, void *data)
+{
+	return prefixcmp(ref-name, refs/tags/);
+}
+
+static int compare_heads(const void *a, const void *b)
+{
+	return strcmp(*(const char **)a, *(const char **)b);
+}
+
+static int remove_duplicates(int nr_heads, char **heads)
+{
+	int src, dst;
+	for (src = dst = 1; src  nr_heads; src++)
+		if (strcmp(heads[src], heads[dst-1]))
+			heads[dst++] = heads[src];
+		else
+			free(heads[src]);
+	return dst;
+}
 
-	*refs = newlist;
+struct filter_by_name_data {
+	char **heads;
+	int *nr_heads;
+	int head_pos;
+	int unmatched;
+};
+
+static void filter_by_name_init(struct filter_by_name_data *f,
+			 int *nr_heads, char **heads)
+{
+	memset(f, 0, sizeof(*f));
+	f-heads = heads;
+	f-nr_heads = nr_heads;
+	qsort(f-heads, *f-nr_heads, sizeof(*f-heads), compare_heads);
+	*f-nr_heads = remove_duplicates(*f-nr_heads, f-heads);
+}
+
+static int filter_by_name_want(struct ref *ref, void *data)
+{
+	struct filter_by_name_data *f = data;
+
+	while (f-head_pos  *f-nr_heads) {
+		int cmp = strcmp(ref-name, f-heads[f-head_pos]);
+		if (cmp  0) /* definitely do not have it */
+			return 0;
+		else if (cmp == 0) { /* definitely have it */
+			free(f-heads[f-head_pos++]);
+			return 1;
+		} else /* might have it; keep looking */
+			f-heads[f-unmatched++] = f-heads[f-head_pos++];
+	}
+	return 0;
+}
+
+static void filter_by_name_finish(struct filter_by_name_data *f)
+{
+	/* copy any remaining unmatched heads: */
+	while (f-head_pos  *f-nr_heads)
+		f-heads[f-unmatched++] = f-heads[f-head_pos++];
+	*f-nr_heads = f-unmatched;
 }
 
 static void mark_alternate_complete(const struct ref *ref, void *unused)
@@ -573,7 +613,8 @@ static void mark_alternate_complete(const struct ref *ref, void *unused)
 	mark_complete(NULL, 

Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-24 Thread Philip Oakley

From: Junio C Hamano gits...@pobox.com
Sent: Friday, August 24, 2012 5:23 AM

Jeff King p...@peff.net writes:


It may be (?) that it is a good time to think about a 'datedepth'
capability to bypass the current counted-depth shallow fetch that 
can

cause so much trouble. With a date limited depth the relevant tags
could also be fetched.


I don't have anything against such an idea, but I think it is 
orthogonal

to the issue being discussed here.


Correct.

The biggest problem with the shallow hack is that the deepening
fetch counts from the tip of the refs at the time of deepening when
deepening the history (i.e. clone --depth, followed by number of
fetch, followed by fetch --depth).  If you start from a shallow
clone of depth 1, repeated fetch to keep up while the history grew
by 100, you would still have a connected history down to the initial
cauterization point, and fetch --depth=200 would give you a
history that is deeper than your original clone by 100 commits.  But
if you start from the same shallow clone of depth 1, did not do
anything while the history grew by 100, and then decide to fetch
again with fetch --depth=20, it does not grow.  It just makes
20-commit deep history from the updated tip, and leave the commit
from your original clone dangling.

The problem with depth does not have anything to do with how it is
specified at the UI level.


That is, correct me if I'm wrong, the server currently lacks a 
capability to provide anything other than the counted depth shallow 
response (if available). Hence my comment about needing an additional 
server side capability, though that would need a well thought out set of 
use cases to make it useful.


  The end-user input is used to 
find out

at what set of commits the history is cauterized, and once they are
computed, the shallow logic solely works on is the history before
these cauterization points, or after, in topological terms. (and it
has to be that way to make sure we get reproducible results).  Even
if a new way to specify these cauterization points in terms of date
is introduced, it does not change anything and does not solve the
fundamental problem the code has when deepening.


fundamental problem = no server capability other than counted depth.




--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Jeff King
On Thu, Aug 23, 2012 at 10:10:25AM +0200, mhag...@alum.mit.edu wrote:

 There were various confusing things (and a couple of bugs) in the way
 that fetch_pack() handled the list (nr_heads, heads) of references to
 be sought from the remote:

Aside from the minor comments I made to individual patches, this all
looks good. As usual, thanks for breaking it down; I wish all series
were as easy to review as this.

 I'm still suspicious about the logic related to args.fetch_all and
 args.depth, but I don't think I've made anything worse.

I think the point of that is that when doing git fetch-pack --all
--depth=1, the meaning of --all is changed from all refs to
everything but tags.

Which I kind of see the point of, because you don't want to grab ancient
tags that will be expensive. But wouldn't it make more sense to limit it
only to the contents of refs/heads in that case? Surely you wouldn't
want refs/notes, refs/remotes, or other hierarchies.

I suspect this code is never even run at all these days. All of the
callers inside git should actually provide a real list of refs, not
--all. So it is really historical cruft for anybody who calls the
fetch-pack plumbing (I wonder if any third-party callers even exist;
this is such a deep part of the network infrastructure that any sane
scripts would probably just be calling fetch).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Philip Oakley

From: Jeff King p...@peff.net
Sent: Thursday, August 23, 2012 10:26 AM

On Thu, Aug 23, 2012 at 10:10:25AM +0200, mhag...@alum.mit.edu wrote:


There were various confusing things (and a couple of bugs) in the way
that fetch_pack() handled the list (nr_heads, heads) of references to
be sought from the remote:


Aside from the minor comments I made to individual patches, this all
looks good. As usual, thanks for breaking it down; I wish all series
were as easy to review as this.


I'm still suspicious about the logic related to args.fetch_all and
args.depth, but I don't think I've made anything worse.


I think the point of that is that when doing git fetch-pack --all
--depth=1, the meaning of --all is changed from all refs to
everything but tags.



There is a comment in \git\Documentation\technical\shallow.txt that
- If you deepen a history, you'd want to get the tags of the
 newly stored (but older!) commits. This does not work right now.
which may be the source of this restriction. That is, how is the depth
of the tag fetching to be restricted to the requested depth count?
[assuming I've undestood the problem correctly]

It may be (?) that it is a good time to think about a 'datedepth' 
capability to bypass the current counted-depth shallow fetch that can 
cause so much trouble. With a date limited depth the relevant tags could 
also be fetched.


Which I kind of see the point of, because you don't want to grab 
ancient
tags that will be expensive. But wouldn't it make more sense to limit 
it

only to the contents of refs/heads in that case? Surely you wouldn't
want refs/notes, refs/remotes, or other hierarchies.

I suspect this code is never even run at all these days. All of the
callers inside git should actually provide a real list of refs, not
--all. So it is really historical cruft for anybody who calls the
fetch-pack plumbing (I wonder if any third-party callers even exist;
this is such a deep part of the network infrastructure that any sane
scripts would probably just be calling fetch).

-Peff
--



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Jeff King
On Thu, Aug 23, 2012 at 08:13:29PM +0100, Philip Oakley wrote:

 I'm still suspicious about the logic related to args.fetch_all and
 args.depth, but I don't think I've made anything worse.
 
 I think the point of that is that when doing git fetch-pack --all
 --depth=1, the meaning of --all is changed from all refs to
 everything but tags.
 
 
 There is a comment in \git\Documentation\technical\shallow.txt that
 - If you deepen a history, you'd want to get the tags of the
  newly stored (but older!) commits. This does not work right now.
 which may be the source of this restriction. That is, how is the depth
 of the tag fetching to be restricted to the requested depth count?
 [assuming I've undestood the problem correctly]

I don't think this is about deepening, but rather about making sure we
remain shallow for the initial fetch. Remember that this is on the
fetch-pack --all code path, which used to be used by git clone when
it was a shell script (these days, clone is a C builtin and will
actually feed the list of refs to fetch-pack).

This code blames back to:

 commit 4bcb310c2539b66d535e87508d1b7a90fe29c083
 Author: Alexandre Julliard julli...@winehq.org
 Date:   Fri Nov 24 16:00:13 2006 +0100

fetch-pack: Do not fetch tags for shallow clones.

A better fix may be to only fetch tags that point to commits that we
are downloading, but git-clone doesn't have support for following
tags. This will happen automatically on the next git-fetch though.

So it is about making sure that git clone --depth=1 does not
accidentally pull a single commit from v1.0, v1.1, v1.2, and so forth,
losing the purpose of using --depth in the first place. These days it is
largely irrelevant, since this code path is not followed by clone, and
clone will automatically restrict its list of fetched refs to a single
branch if --depth is used.

The bug that shallow.txt talks about (and which is mentioned in that
commit message) is that we will not properly auto-follow tags during
such a clone (i.e., when we fetch a tag because it is pointing to a
commit that we already have or are already pulling). I'm not sure if
that is still the case or not. But assuming your workflow is something
like:

  [make an initial, cheap clone]
  git clone --depth=1 $repo

  [the next day, you do a regular fetch, which will just get new stuff
   on top of what you already have]
  git fetch

Then that second fetch will auto-follow the tags, anyway. And that is
what the commit message is pointing: it's a bug, but one you can work
around.

 It may be (?) that it is a good time to think about a 'datedepth'
 capability to bypass the current counted-depth shallow fetch that can
 cause so much trouble. With a date limited depth the relevant tags
 could also be fetched.

I don't have anything against such an idea, but I think it is orthogonal
to the issue being discussed here.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Jeff King
On Thu, Aug 23, 2012 at 03:56:48PM -0400, Jeff King wrote:

 This code blames back to:
 
  commit 4bcb310c2539b66d535e87508d1b7a90fe29c083
  Author: Alexandre Julliard julli...@winehq.org
  Date:   Fri Nov 24 16:00:13 2006 +0100
 
 fetch-pack: Do not fetch tags for shallow clones.
 
 A better fix may be to only fetch tags that point to commits that we
 are downloading, but git-clone doesn't have support for following
 tags. This will happen automatically on the next git-fetch though.
 
 So it is about making sure that git clone --depth=1 does not
 accidentally pull a single commit from v1.0, v1.1, v1.2, and so forth,
 losing the purpose of using --depth in the first place. These days it is
 largely irrelevant, since this code path is not followed by clone, and
 clone will automatically restrict its list of fetched refs to a single
 branch if --depth is used.

I think part of the confusion of this code is that inside the loop over
the refs it is sometimes checking aspects of the ref, and sometimes
checking invariants of the loop (like args.fetch_all). Splitting it into
separate loops makes it easier to see what is going on, like the patch
below (on top of Michael's series).

I'm not sure if it ends up being more readable, since the generic cut
down this linked list function has to operate through callbacks with a
void pointer. On the other hand, that function could also be used
elsewhere.

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 90683ca..877cf38 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -521,51 +521,80 @@ static void mark_recent_complete_commits(unsigned long 
cutoff)
}
 }
 
-static void filter_refs(struct ref **refs, int *nr_heads, char **heads)
+static void filter_refs_callback(struct ref **refs,
+int (*want)(struct ref *, void *),
+void *data)
 {
-   struct ref *newlist = NULL;
-   struct ref **newtail = newlist;
+   struct ref **tail = refs;
struct ref *ref, *next;
-   int match_pos = 0, unmatched = 0;
 
for (ref = *refs; ref; ref = next) {
-   int keep_ref = 0;
next = ref-next;
-   if (!memcmp(ref-name, refs/, 5) 
-   check_refname_format(ref-name, 0))
-   ; /* trash */
-   else if (args.fetch_all 
-  (!args.depth || prefixcmp(ref-name, refs/tags/)))
-   keep_ref = 1;
-   else
-   while (match_pos  *nr_heads) {
-   int cmp = strcmp(ref-name, heads[match_pos]);
-   if (cmp  0) { /* definitely do not have it */
-   break;
-   } else if (cmp == 0) { /* definitely have it */
-   free(heads[match_pos++]);
-   keep_ref = 1;
-   break;
-   } else { /* might have it; keep looking */
-   heads[unmatched++] = heads[match_pos++];
-   }
-   }
-
-   if (keep_ref) {
-   *newtail = ref;
-   ref-next = NULL;
-   newtail = ref-next;
-   } else {
+   if (want(ref, data))
+   tail = ref-next;
+   else {
free(ref);
+   *tail = next;
}
}
+}
 
-   /* copy any remaining unmatched heads: */
-   while (match_pos  *nr_heads)
-   heads[unmatched++] = heads[match_pos++];
-   *nr_heads = unmatched;
+static int ref_name_is_ok(struct ref *ref, void *data)
+{
+   return memcmp(ref-name, refs/, 5) ||
+   !check_refname_format(ref-name, 0);
+}
+
+static int ref_ok_for_shallow(struct ref *ref, void *data)
+{
+   return prefixcmp(ref-name, refs/tags/);
+}
 
-   *refs = newlist;
+struct filter_by_name_data {
+   char **heads;
+   int nr_heads;
+   int match_pos;
+   int unmatched;
+};
+
+static int want_ref_name(struct ref *ref, void *data)
+{
+   struct filter_by_name_data *f = data;
+
+   while (f-match_pos  f-nr_heads) {
+   int cmp = strcmp(ref-name, f-heads[f-match_pos]);
+   if (cmp  0) /* definitely do not have it */
+   return 0;
+   else if (cmp == 0) { /* definitely have it */
+   free(f-heads[f-match_pos++]);
+   return 1;
+   } else /* might have it; keep looking */
+   f-heads[f-unmatched++] = f-heads[f-match_pos++];
+   }
+   return 0;
+}
+
+static void filter_refs(struct ref **refs, int *nr_heads, char **heads)
+{
+   struct filter_by_name_data f;
+
+   

Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Philip Oakley

From: Jeff King p...@peff.net
Sent: Thursday, August 23, 2012 8:56 PM

On Thu, Aug 23, 2012 at 08:13:29PM +0100, Philip Oakley wrote:


I'm still suspicious about the logic related to args.fetch_all and
args.depth, but I don't think I've made anything worse.

I think the point of that is that when doing git fetch-pack --all
--depth=1, the meaning of --all is changed from all refs to
everything but tags.


There is a comment in \git\Documentation\technical\shallow.txt that
- If you deepen a history, you'd want to get the tags of the
 newly stored (but older!) commits. This does not work right now.
which may be the source of this restriction. That is, how is the 
depth

of the tag fetching to be restricted to the requested depth count?
[assuming I've undestood the problem correctly]


I don't think this is about deepening, but rather about making sure we
remain shallow for the initial fetch. Remember that this is on the
fetch-pack --all code path, which used to be used by git clone 
when

it was a shell script (these days, clone is a C builtin and will
actually feed the list of refs to fetch-pack).


OK



This code blames back to:

commit 4bcb310c2539b66d535e87508d1b7a90fe29c083
Author: Alexandre Julliard julli...@winehq.org
Date:   Fri Nov 24 16:00:13 2006 +0100

   fetch-pack: Do not fetch tags for shallow clones.

   A better fix may be to only fetch tags that point to commits that 
we

   are downloading, but git-clone doesn't have support for following
   tags. This will happen automatically on the next git-fetch though.

So it is about making sure that git clone --depth=1 does not
accidentally pull a single commit from v1.0, v1.1, v1.2, and so forth,
losing the purpose of using --depth in the first place. These days it 
is

largely irrelevant, since this code path is not followed by clone, and
clone will automatically restrict its list of fetched refs to a single
branch if --depth is used.

The bug that shallow.txt talks about (and which is mentioned in that
commit message) is that we will not properly auto-follow tags during
such a clone (i.e., when we fetch a tag because it is pointing to a
commit that we already have or are already pulling). I'm not sure if
that is still the case or not. But assuming your workflow is something
like:

 [make an initial, cheap clone]
 git clone --depth=1 $repo

 [the next day, you do a regular fetch, which will just get new stuff
  on top of what you already have]
 git fetch

Then that second fetch will auto-follow the tags, anyway. And that is
what the commit message is pointing: it's a bug, but one you can work
around.


I hadn't appreciated that the fetch would limit itself to the original 
shallow

depth. I'd gained  the impression that one need to use the --depth to
control what was being fetched.




It may be (?) that it is a good time to think about a 'datedepth'
capability to bypass the current counted-depth shallow fetch that can
cause so much trouble. With a date limited depth the relevant tags
could also be fetched.


I don't have anything against such an idea, but I think it is 
orthogonal

to the issue being discussed here.

OK. I'll stick it on my slow-burner pile ;-)



-Peff


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 It may be (?) that it is a good time to think about a 'datedepth'
 capability to bypass the current counted-depth shallow fetch that can
 cause so much trouble. With a date limited depth the relevant tags
 could also be fetched.

 I don't have anything against such an idea, but I think it is orthogonal
 to the issue being discussed here.

Correct.

The biggest problem with the shallow hack is that the deepening
fetch counts from the tip of the refs at the time of deepening when
deepening the history (i.e. clone --depth, followed by number of
fetch, followed by fetch --depth).  If you start from a shallow
clone of depth 1, repeated fetch to keep up while the history grew
by 100, you would still have a connected history down to the initial
cauterization point, and fetch --depth=200 would give you a
history that is deeper than your original clone by 100 commits.  But
if you start from the same shallow clone of depth 1, did not do
anything while the history grew by 100, and then decide to fetch
again with fetch --depth=20, it does not grow.  It just makes
20-commit deep history from the updated tip, and leave the commit
from your original clone dangling.

The problem with depth does not have anything to do with how it is
specified at the UI level.  The end-user input is used to find out
at what set of commits the history is cauterized, and once they are
computed, the shallow logic solely works on is the history before
these cauterization points, or after, in topological terms. (and it
has to be that way to make sure we get reproducible results).  Even
if a new way to specify these cauterization points in terms of date
is introduced, it does not change anything and does not solve the
fundamental problem the code has when deepening.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html