Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-09-02 Thread Michael Haggerty
On 08/23/2012 10:31 PM, Jeff King wrote:
> On Thu, Aug 23, 2012 at 03:56:48PM -0400, Jeff King wrote:
> 
>> This code blames back to:
>>
>>  commit 4bcb310c2539b66d535e87508d1b7a90fe29c083
>>  Author: Alexandre Julliard 
>>  Date:   Fri Nov 24 16:00:13 2006 +0100
>>
>> fetch-pack: Do not fetch tags for shallow clones.
>>
>> A better fix may be to only fetch tags that point to commits that we
>> are downloading, but git-clone doesn't have support for following
>> tags. This will happen automatically on the next git-fetch though.
>>
>> So it is about making sure that "git clone --depth=1" does not
>> accidentally pull a single commit from v1.0, v1.1, v1.2, and so forth,
>> losing the purpose of using --depth in the first place. These days it is
>> largely irrelevant, since this code path is not followed by clone, and
>> clone will automatically restrict its list of fetched refs to a single
>> branch if --depth is used.
> 
> I think part of the confusion of this code is that inside the loop over
> the refs it is sometimes checking aspects of the ref, and sometimes
> checking invariants of the loop (like args.fetch_all). Splitting it into
> separate loops makes it easier to see what is going on, like the patch
> below (on top of Michael's series).
> 
> I'm not sure if it ends up being more readable, since the generic "cut
> down this linked list" function has to operate through callbacks with a
> void pointer. On the other hand, that function could also be used
> elsewhere.
> 
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 90683ca..877cf38 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -521,51 +521,80 @@ static void mark_recent_complete_commits(unsigned long 
> cutoff)
>   }
>  }
>  
> -static void filter_refs(struct ref **refs, int *nr_heads, char **heads)
> +static void filter_refs_callback(struct ref **refs,
> +  int (*want)(struct ref *, void *),
> +  void *data)
>  {
> - struct ref *newlist = NULL;
> - struct ref **newtail = &newlist;
> + struct ref **tail = refs;
>   struct ref *ref, *next;
> - int match_pos = 0, unmatched = 0;
>  
>   for (ref = *refs; ref; ref = next) {
> - int keep_ref = 0;
>   next = ref->next;
> - if (!memcmp(ref->name, "refs/", 5) &&
> - check_refname_format(ref->name, 0))
> - ; /* trash */
> - else if (args.fetch_all &&
> -(!args.depth || prefixcmp(ref->name, "refs/tags/")))
> - keep_ref = 1;
> - else
> - while (match_pos < *nr_heads) {
> - int cmp = strcmp(ref->name, heads[match_pos]);
> - if (cmp < 0) { /* definitely do not have it */
> - break;
> - } else if (cmp == 0) { /* definitely have it */
> - free(heads[match_pos++]);
> - keep_ref = 1;
> - break;
> - } else { /* might have it; keep looking */
> - heads[unmatched++] = heads[match_pos++];
> - }
> - }
> -
> - if (keep_ref) {
> - *newtail = ref;
> - ref->next = NULL;
> - newtail = &ref->next;
> - } else {
> + if (want(ref, data))
> + tail = &ref->next;
> + else {
>   free(ref);
> + *tail = next;
>   }
>   }
> +}
>  
> - /* copy any remaining unmatched heads: */
> - while (match_pos < *nr_heads)
> - heads[unmatched++] = heads[match_pos++];
> - *nr_heads = unmatched;
> +static int ref_name_is_ok(struct ref *ref, void *data)
> +{
> + return memcmp(ref->name, "refs/", 5) ||
> + !check_refname_format(ref->name, 0);
> +}
> +
> +static int ref_ok_for_shallow(struct ref *ref, void *data)
> +{
> + return prefixcmp(ref->name, "refs/tags/");
> +}
>  
> - *refs = newlist;
> +struct filter_by_name_data {
> + char **heads;
> + int nr_heads;
> + int match_pos;
> + int unmatched;
> +};
> +
> +static int want_ref_name(struct ref *ref, void *data)
> +{
> + struct filter_by_name_data *f = data;
> +
> + while (f->match_pos < f->nr_heads) {
> + int cmp = strcmp(ref->name, f->heads[f->match_pos]);
> + if (cmp < 0) /* definitely do not have it */
> + return 0;
> + else if (cmp == 0) { /* definitely have it */
> + free(f->heads[f->match_pos++]);
> + return 1;
> + } else /* might have it; keep looking */
> + f->heads[f->unmatched++] = f->head

Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-25 Thread Michael Haggerty
On 08/23/2012 10:31 PM, Jeff King wrote:
> I think part of the confusion of this code is that inside the loop over
> the refs it is sometimes checking aspects of the ref, and sometimes
> checking invariants of the loop (like args.fetch_all). Splitting it into
> separate loops makes it easier to see what is going on, like the patch
> below (on top of Michael's series).
> 
> I'm not sure if it ends up being more readable, since the generic "cut
> down this linked list" function has to operate through callbacks with a
> void pointer. On the other hand, that function could also be used
> elsewhere.
> [...]

Despite requiring a bit more boilerplate, I think that your change makes
the logic clearer.

*If* we want to switch to using callbacks, then we can get even more
bang for the buck, as in the attached patch (which applies on top of my
patch v2).  Beyond your suggestion, this patch:

* Inlines your filter_refs() into everything_local(), because (a) it's
short and (b) the policy work implemented there logically belongs
higher-up in the call chain.

* Renames your filter_refs_callback() to filter_refs().

* Moves the initialization of the filter_by_name_data structure
(including sorting and de-duping) all the way up to fetch_pack(), and
passes a filter_by_name_data* (rather than (nr_heads, heads)) down to
the callees.

If you like this change, let me know and I'll massage it into a
digestible patch series.

Michael
-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index db77ee6..d1dabcf 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -521,51 +521,91 @@ static void mark_recent_complete_commits(unsigned long cutoff)
 	}
 }
 
-static void filter_refs(struct ref **refs, int *nr_heads, char **heads)
+static void filter_refs(struct ref **refs,
+			int (*want)(struct ref *, void *),
+			void *data)
 {
-	struct ref *newlist = NULL;
-	struct ref **newtail = &newlist;
+	struct ref **tail = refs;
 	struct ref *ref, *next;
-	int head_pos = 0, unmatched = 0;
 
 	for (ref = *refs; ref; ref = next) {
-		int keep_ref = 0;
 		next = ref->next;
-		if (!memcmp(ref->name, "refs/", 5) &&
-		check_refname_format(ref->name, 0))
-			; /* trash */
-		else if (args.fetch_all &&
-			   (!args.depth || prefixcmp(ref->name, "refs/tags/")))
-			keep_ref = 1;
-		else
-			while (head_pos < *nr_heads) {
-int cmp = strcmp(ref->name, heads[head_pos]);
-if (cmp < 0) { /* definitely do not have it */
-	break;
-} else if (cmp == 0) { /* definitely have it */
-	free(heads[head_pos++]);
-	keep_ref = 1;
-	break;
-} else { /* might have it; keep looking */
-	heads[unmatched++] = heads[head_pos++];
-}
-			}
-
-		if (keep_ref) {
-			*newtail = ref;
-			ref->next = NULL;
-			newtail = &ref->next;
-		} else {
+		if (want(ref, data))
+			tail = &ref->next;
+		else {
 			free(ref);
+			*tail = next;
 		}
 	}
+}
 
-	/* copy any remaining unmatched heads: */
-	while (head_pos < *nr_heads)
-		heads[unmatched++] = heads[head_pos++];
-	*nr_heads = unmatched;
+static int ref_name_is_ok(struct ref *ref, void *data)
+{
+	return memcmp(ref->name, "refs/", 5) ||
+		!check_refname_format(ref->name, 0);
+}
+
+static int ref_ok_for_shallow(struct ref *ref, void *data)
+{
+	return prefixcmp(ref->name, "refs/tags/");
+}
+
+static int compare_heads(const void *a, const void *b)
+{
+	return strcmp(*(const char **)a, *(const char **)b);
+}
+
+static int remove_duplicates(int nr_heads, char **heads)
+{
+	int src, dst;
+	for (src = dst = 1; src < nr_heads; src++)
+		if (strcmp(heads[src], heads[dst-1]))
+			heads[dst++] = heads[src];
+		else
+			free(heads[src]);
+	return dst;
+}
 
-	*refs = newlist;
+struct filter_by_name_data {
+	char **heads;
+	int *nr_heads;
+	int head_pos;
+	int unmatched;
+};
+
+static void filter_by_name_init(struct filter_by_name_data *f,
+			 int *nr_heads, char **heads)
+{
+	memset(f, 0, sizeof(*f));
+	f->heads = heads;
+	f->nr_heads = nr_heads;
+	qsort(f->heads, *f->nr_heads, sizeof(*f->heads), compare_heads);
+	*f->nr_heads = remove_duplicates(*f->nr_heads, f->heads);
+}
+
+static int filter_by_name_want(struct ref *ref, void *data)
+{
+	struct filter_by_name_data *f = data;
+
+	while (f->head_pos < *f->nr_heads) {
+		int cmp = strcmp(ref->name, f->heads[f->head_pos]);
+		if (cmp < 0) /* definitely do not have it */
+			return 0;
+		else if (cmp == 0) { /* definitely have it */
+			free(f->heads[f->head_pos++]);
+			return 1;
+		} else /* might have it; keep looking */
+			f->heads[f->unmatched++] = f->heads[f->head_pos++];
+	}
+	return 0;
+}
+
+static void filter_by_name_finish(struct filter_by_name_data *f)
+{
+	/* copy any remaining unmatched heads: */
+	while (f->head_pos < *f->nr_heads)
+		f->heads[f->unmatched++] = f->heads[f->head_pos++];
+	*f->nr_heads = f->unmatched;
 }
 
 static void mark_alternate_complete(const struct ref *ref, void *unused)
@@ -573,7 +613,8 @@ static void mark_alternate_complete

Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-24 Thread Philip Oakley

From: "Junio C Hamano" 
Sent: Friday, August 24, 2012 5:23 AM

Jeff King  writes:


It may be (?) that it is a good time to think about a 'datedepth'
capability to bypass the current counted-depth shallow fetch that 
can

cause so much trouble. With a date limited depth the relevant tags
could also be fetched.


I don't have anything against such an idea, but I think it is 
orthogonal

to the issue being discussed here.


Correct.

The biggest problem with the "shallow" hack is that the deepening
fetch counts from the tip of the refs at the time of deepening when
deepening the history (i.e. "clone --depth", followed by number of
"fetch", followed by "fetch --depth").  If you start from a shallow
clone of depth 1, repeated fetch to keep up while the history grew
by 100, you would still have a connected history down to the initial
cauterization point, and "fetch --depth=200" would give you a
history that is deeper than your original clone by 100 commits.  But
if you start from the same shallow clone of depth 1, did not do
anything while the history grew by 100, and then decide to fetch
again with "fetch --depth=20", it does not grow.  It just makes
20-commit deep history from the updated tip, and leave the commit
from your original clone dangling.

The problem with "depth" does not have anything to do with how it is
specified at the UI level.


That is, correct me if I'm wrong, the server currently lacks a 
capability to provide anything other than the counted depth shallow 
response (if available). Hence my comment about needing an additional 
server side capability, though that would need a well thought out set of 
use cases to make it useful.


  The end-user input is used to 
find out

at what set of commits the history is cauterized, and once they are
computed, the "shallow" logic solely works on "is the history before
these cauterization points, or after, in topological terms." (and it
has to be that way to make sure we get reproducible results).  Even
if a new way to specify these cauterization points in terms of date
is introduced, it does not change anything and does not solve the
fundamental problem the code has when deepening.


fundamental problem = no server capability other than counted depth.




--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Junio C Hamano
Jeff King  writes:

>> It may be (?) that it is a good time to think about a 'datedepth'
>> capability to bypass the current counted-depth shallow fetch that can
>> cause so much trouble. With a date limited depth the relevant tags
>> could also be fetched.
>
> I don't have anything against such an idea, but I think it is orthogonal
> to the issue being discussed here.

Correct.

The biggest problem with the "shallow" hack is that the deepening
fetch counts from the tip of the refs at the time of deepening when
deepening the history (i.e. "clone --depth", followed by number of
"fetch", followed by "fetch --depth").  If you start from a shallow
clone of depth 1, repeated fetch to keep up while the history grew
by 100, you would still have a connected history down to the initial
cauterization point, and "fetch --depth=200" would give you a
history that is deeper than your original clone by 100 commits.  But
if you start from the same shallow clone of depth 1, did not do
anything while the history grew by 100, and then decide to fetch
again with "fetch --depth=20", it does not grow.  It just makes
20-commit deep history from the updated tip, and leave the commit
from your original clone dangling.

The problem with "depth" does not have anything to do with how it is
specified at the UI level.  The end-user input is used to find out
at what set of commits the history is cauterized, and once they are
computed, the "shallow" logic solely works on "is the history before
these cauterization points, or after, in topological terms." (and it
has to be that way to make sure we get reproducible results).  Even
if a new way to specify these cauterization points in terms of date
is introduced, it does not change anything and does not solve the
fundamental problem the code has when deepening.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Philip Oakley

From: "Jeff King" 
Sent: Thursday, August 23, 2012 8:56 PM

On Thu, Aug 23, 2012 at 08:13:29PM +0100, Philip Oakley wrote:


>>I'm still suspicious about the logic related to args.fetch_all and
>>args.depth, but I don't think I've made anything worse.
>
>I think the point of that is that when doing "git fetch-pack --all
>--depth=1", the meaning of "--all" is changed from "all refs" to
>"everything but tags".
>

There is a comment in \git\Documentation\technical\shallow.txt that
"- If you deepen a history, you'd want to get the tags of the
 newly stored (but older!) commits. This does not work right now."
which may be the source of this restriction. That is, how is the 
depth

of the tag fetching to be restricted to the requested depth count?
[assuming I've undestood the problem correctly]


I don't think this is about deepening, but rather about making sure we
remain shallow for the initial fetch. Remember that this is on the
"fetch-pack --all" code path, which used to be used by "git clone" 
when

it was a shell script (these days, clone is a C builtin and will
actually feed the list of refs to fetch-pack).


OK



This code blames back to:

commit 4bcb310c2539b66d535e87508d1b7a90fe29c083
Author: Alexandre Julliard 
Date:   Fri Nov 24 16:00:13 2006 +0100

   fetch-pack: Do not fetch tags for shallow clones.

   A better fix may be to only fetch tags that point to commits that 
we

   are downloading, but git-clone doesn't have support for following
   tags. This will happen automatically on the next git-fetch though.

So it is about making sure that "git clone --depth=1" does not
accidentally pull a single commit from v1.0, v1.1, v1.2, and so forth,
losing the purpose of using --depth in the first place. These days it 
is

largely irrelevant, since this code path is not followed by clone, and
clone will automatically restrict its list of fetched refs to a single
branch if --depth is used.

The bug that shallow.txt talks about (and which is mentioned in that
commit message) is that we will not properly auto-follow tags during
such a clone (i.e., when we fetch a tag because it is pointing to a
commit that we already have or are already pulling). I'm not sure if
that is still the case or not. But assuming your workflow is something
like:

 [make an initial, cheap clone]
 git clone --depth=1 $repo

 [the next day, you do a regular fetch, which will just get new stuff
  on top of what you already have]
 git fetch

Then that second fetch will auto-follow the tags, anyway. And that is
what the commit message is pointing: it's a bug, but one you can work
around.


I hadn't appreciated that the fetch would limit itself to the original 
shallow

depth. I'd gained  the impression that one need to use the --depth to
control what was being fetched.




It may be (?) that it is a good time to think about a 'datedepth'
capability to bypass the current counted-depth shallow fetch that can
cause so much trouble. With a date limited depth the relevant tags
could also be fetched.


I don't have anything against such an idea, but I think it is 
orthogonal

to the issue being discussed here.

OK. I'll stick it on my slow-burner pile ;-)



-Peff


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Jeff King
On Thu, Aug 23, 2012 at 03:56:48PM -0400, Jeff King wrote:

> This code blames back to:
> 
>  commit 4bcb310c2539b66d535e87508d1b7a90fe29c083
>  Author: Alexandre Julliard 
>  Date:   Fri Nov 24 16:00:13 2006 +0100
> 
> fetch-pack: Do not fetch tags for shallow clones.
> 
> A better fix may be to only fetch tags that point to commits that we
> are downloading, but git-clone doesn't have support for following
> tags. This will happen automatically on the next git-fetch though.
> 
> So it is about making sure that "git clone --depth=1" does not
> accidentally pull a single commit from v1.0, v1.1, v1.2, and so forth,
> losing the purpose of using --depth in the first place. These days it is
> largely irrelevant, since this code path is not followed by clone, and
> clone will automatically restrict its list of fetched refs to a single
> branch if --depth is used.

I think part of the confusion of this code is that inside the loop over
the refs it is sometimes checking aspects of the ref, and sometimes
checking invariants of the loop (like args.fetch_all). Splitting it into
separate loops makes it easier to see what is going on, like the patch
below (on top of Michael's series).

I'm not sure if it ends up being more readable, since the generic "cut
down this linked list" function has to operate through callbacks with a
void pointer. On the other hand, that function could also be used
elsewhere.

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 90683ca..877cf38 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -521,51 +521,80 @@ static void mark_recent_complete_commits(unsigned long 
cutoff)
}
 }
 
-static void filter_refs(struct ref **refs, int *nr_heads, char **heads)
+static void filter_refs_callback(struct ref **refs,
+int (*want)(struct ref *, void *),
+void *data)
 {
-   struct ref *newlist = NULL;
-   struct ref **newtail = &newlist;
+   struct ref **tail = refs;
struct ref *ref, *next;
-   int match_pos = 0, unmatched = 0;
 
for (ref = *refs; ref; ref = next) {
-   int keep_ref = 0;
next = ref->next;
-   if (!memcmp(ref->name, "refs/", 5) &&
-   check_refname_format(ref->name, 0))
-   ; /* trash */
-   else if (args.fetch_all &&
-  (!args.depth || prefixcmp(ref->name, "refs/tags/")))
-   keep_ref = 1;
-   else
-   while (match_pos < *nr_heads) {
-   int cmp = strcmp(ref->name, heads[match_pos]);
-   if (cmp < 0) { /* definitely do not have it */
-   break;
-   } else if (cmp == 0) { /* definitely have it */
-   free(heads[match_pos++]);
-   keep_ref = 1;
-   break;
-   } else { /* might have it; keep looking */
-   heads[unmatched++] = heads[match_pos++];
-   }
-   }
-
-   if (keep_ref) {
-   *newtail = ref;
-   ref->next = NULL;
-   newtail = &ref->next;
-   } else {
+   if (want(ref, data))
+   tail = &ref->next;
+   else {
free(ref);
+   *tail = next;
}
}
+}
 
-   /* copy any remaining unmatched heads: */
-   while (match_pos < *nr_heads)
-   heads[unmatched++] = heads[match_pos++];
-   *nr_heads = unmatched;
+static int ref_name_is_ok(struct ref *ref, void *data)
+{
+   return memcmp(ref->name, "refs/", 5) ||
+   !check_refname_format(ref->name, 0);
+}
+
+static int ref_ok_for_shallow(struct ref *ref, void *data)
+{
+   return prefixcmp(ref->name, "refs/tags/");
+}
 
-   *refs = newlist;
+struct filter_by_name_data {
+   char **heads;
+   int nr_heads;
+   int match_pos;
+   int unmatched;
+};
+
+static int want_ref_name(struct ref *ref, void *data)
+{
+   struct filter_by_name_data *f = data;
+
+   while (f->match_pos < f->nr_heads) {
+   int cmp = strcmp(ref->name, f->heads[f->match_pos]);
+   if (cmp < 0) /* definitely do not have it */
+   return 0;
+   else if (cmp == 0) { /* definitely have it */
+   free(f->heads[f->match_pos++]);
+   return 1;
+   } else /* might have it; keep looking */
+   f->heads[f->unmatched++] = f->heads[f->match_pos++];
+   }
+   return 0;
+}
+
+static void filter_refs(struct ref **refs, int *nr_heads, char **heads)
+{
+   struc

Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Jeff King
On Thu, Aug 23, 2012 at 08:13:29PM +0100, Philip Oakley wrote:

> >>I'm still suspicious about the logic related to args.fetch_all and
> >>args.depth, but I don't think I've made anything worse.
> >
> >I think the point of that is that when doing "git fetch-pack --all
> >--depth=1", the meaning of "--all" is changed from "all refs" to
> >"everything but tags".
> >
> 
> There is a comment in \git\Documentation\technical\shallow.txt that
> "- If you deepen a history, you'd want to get the tags of the
>  newly stored (but older!) commits. This does not work right now."
> which may be the source of this restriction. That is, how is the depth
> of the tag fetching to be restricted to the requested depth count?
> [assuming I've undestood the problem correctly]

I don't think this is about deepening, but rather about making sure we
remain shallow for the initial fetch. Remember that this is on the
"fetch-pack --all" code path, which used to be used by "git clone" when
it was a shell script (these days, clone is a C builtin and will
actually feed the list of refs to fetch-pack).

This code blames back to:

 commit 4bcb310c2539b66d535e87508d1b7a90fe29c083
 Author: Alexandre Julliard 
 Date:   Fri Nov 24 16:00:13 2006 +0100

fetch-pack: Do not fetch tags for shallow clones.

A better fix may be to only fetch tags that point to commits that we
are downloading, but git-clone doesn't have support for following
tags. This will happen automatically on the next git-fetch though.

So it is about making sure that "git clone --depth=1" does not
accidentally pull a single commit from v1.0, v1.1, v1.2, and so forth,
losing the purpose of using --depth in the first place. These days it is
largely irrelevant, since this code path is not followed by clone, and
clone will automatically restrict its list of fetched refs to a single
branch if --depth is used.

The bug that shallow.txt talks about (and which is mentioned in that
commit message) is that we will not properly auto-follow tags during
such a clone (i.e., when we fetch a tag because it is pointing to a
commit that we already have or are already pulling). I'm not sure if
that is still the case or not. But assuming your workflow is something
like:

  [make an initial, cheap clone]
  git clone --depth=1 $repo

  [the next day, you do a regular fetch, which will just get new stuff
   on top of what you already have]
  git fetch

Then that second fetch will auto-follow the tags, anyway. And that is
what the commit message is pointing: it's a bug, but one you can work
around.

> It may be (?) that it is a good time to think about a 'datedepth'
> capability to bypass the current counted-depth shallow fetch that can
> cause so much trouble. With a date limited depth the relevant tags
> could also be fetched.

I don't have anything against such an idea, but I think it is orthogonal
to the issue being discussed here.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Philip Oakley

From: "Jeff King" 
Sent: Thursday, August 23, 2012 10:26 AM

On Thu, Aug 23, 2012 at 10:10:25AM +0200, mhag...@alum.mit.edu wrote:


There were various confusing things (and a couple of bugs) in the way
that fetch_pack() handled the list (nr_heads, heads) of references to
be sought from the remote:


Aside from the minor comments I made to individual patches, this all
looks good. As usual, thanks for breaking it down; I wish all series
were as easy to review as this.


I'm still suspicious about the logic related to args.fetch_all and
args.depth, but I don't think I've made anything worse.


I think the point of that is that when doing "git fetch-pack --all
--depth=1", the meaning of "--all" is changed from "all refs" to
"everything but tags".



There is a comment in \git\Documentation\technical\shallow.txt that
"- If you deepen a history, you'd want to get the tags of the
 newly stored (but older!) commits. This does not work right now."
which may be the source of this restriction. That is, how is the depth
of the tag fetching to be restricted to the requested depth count?
[assuming I've undestood the problem correctly]

It may be (?) that it is a good time to think about a 'datedepth' 
capability to bypass the current counted-depth shallow fetch that can 
cause so much trouble. With a date limited depth the relevant tags could 
also be fetched.


Which I kind of see the point of, because you don't want to grab 
ancient
tags that will be expensive. But wouldn't it make more sense to limit 
it

only to the contents of refs/heads in that case? Surely you wouldn't
want refs/notes, refs/remotes, or other hierarchies.

I suspect this code is never even run at all these days. All of the
callers inside git should actually provide a real list of refs, not
"--all". So it is really historical cruft for anybody who calls the
fetch-pack plumbing (I wonder if any third-party callers even exist;
this is such a deep part of the network infrastructure that any sane
scripts would probably just be calling fetch).

-Peff
--



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread Jeff King
On Thu, Aug 23, 2012 at 10:10:25AM +0200, mhag...@alum.mit.edu wrote:

> There were various confusing things (and a couple of bugs) in the way
> that fetch_pack() handled the list (nr_heads, heads) of references to
> be sought from the remote:

Aside from the minor comments I made to individual patches, this all
looks good. As usual, thanks for breaking it down; I wish all series
were as easy to review as this.

> I'm still suspicious about the logic related to args.fetch_all and
> args.depth, but I don't think I've made anything worse.

I think the point of that is that when doing "git fetch-pack --all
--depth=1", the meaning of "--all" is changed from "all refs" to
"everything but tags".

Which I kind of see the point of, because you don't want to grab ancient
tags that will be expensive. But wouldn't it make more sense to limit it
only to the contents of refs/heads in that case? Surely you wouldn't
want refs/notes, refs/remotes, or other hierarchies.

I suspect this code is never even run at all these days. All of the
callers inside git should actually provide a real list of refs, not
"--all". So it is really historical cruft for anybody who calls the
fetch-pack plumbing (I wonder if any third-party callers even exist;
this is such a deep part of the network infrastructure that any sane
scripts would probably just be calling fetch).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/17] Clean up how fetch_pack() handles the heads list

2012-08-23 Thread mhagger
From: Michael Haggerty 

There were various confusing things (and a couple of bugs) in the way
that fetch_pack() handled the list (nr_heads, heads) of references to
be sought from the remote:

* Different names were used for the list in different functions for no
  special reason.

* fetch_pack() modified the list in-place:

  * It moved entries around

  * It sometimes shrunk the list without informing the caller (which
could lead to spurious error messages)

  * It overwrote the first byte of matching entries with NUL, leaving
a sparse list that the caller had to interpret

  * Its interaction with the list was documented

* No error was reported if *all* requested references were missing
  from the remote.

I'm still suspicious about the logic related to args.fetch_all and
args.depth, but I don't think I've made anything worse.

This patch series applies to the merge between master and
jc/maint-push-refs-all, though the dependency on the latter is only
textual.

Michael Haggerty (17):
  t5500: add tests of error output for missing refs
  Rename static function fetch_pack() to http_fetch_pack()
  Fix formatting.
  Name local variables more consistently
  Do not check the same match_pos twice
  Let fetch_pack() inform caller about number of unique heads
  Pass nr_heads to do_pack_ref() by reference
  Pass nr_heads to everything_local() by reference
  Pass nr_heads to filter_refs() by reference
  Remove ineffective optimization
  filter_refs(): do not leave gaps in return_refs
  filter_refs(): compress unmatched refs in heads array
  cmd_fetch_pack: return early if finish_connect() returns an error
  Report missing refs even if no existing refs were received
  cmd_fetch_pack(): simplify computation of return value
  fetch_pack(): free matching heads
  fetch_refs(): simplify logic

 builtin/fetch-pack.c  | 128 --
 fetch-pack.h  |  19 +---
 http-walker.c |   4 +-
 t/t5500-fetch-pack.sh |  32 -
 transport.c   |   8 ++--
 5 files changed, 101 insertions(+), 90 deletions(-)

-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html