Re: Fwd: [PATCH/RFC] upload-pack: ignore 'shallow' lines with unknown obj-ids

2013-04-28 Thread Michael Heemskerk
Thanks Junio,

That looks fine to me. I'll try to find some time this week to create a
follow-up patch for removing pruned commits from the shallow file as
well.

Cheers,
Michael

On 29 April 2013 15:32, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> OK, the change the patch text shows looks sensible to me.  Can you
>> resend it in full, with the log message, your sign-off, and Duy's
>> "Reviewed-by:", so that it can be applied?
>
> I managed to reassemble what I _think_ is close to the original from
> a few messages by dequoting and fixing MUA whitespace breakages in
> them.
>
> Here is with a slight tweak to the log message.  Please holler if I
> grabbed a wrong version of the patch or made any silly mistakes.
>
> Thanks.
>
> -- >8 --
> From: Michael Heemskerk 
> Subject: [PATCH] upload-pack: ignore 'shallow' lines with unknown obj-ids
>
> When the client sends a 'shallow' line for an object that the server does
> not have, the server currently dies with the error: "did not find object
> for shallow ".  The client may have truncated the history at
> the commit by fetching shallowly from a different server, or the commit
> may have been garbage collected by the server. In either case, this
> unknown commit is not relevant for calculating the pack that is to be
> sent and can be safely ignored, and it is not used when recomputing where
> the updated history of the client is cauterised.
>
> The documentation in technical/pack-protocol.txt has been updated to
> remove the restriction that "Clients MUST NOT mention an obj-id which it
> does not know exists on the server". This requirement is not realistic
> because clients cannot know whether an object has been garbage collected
> by the server.
>
> Signed-off-by: Michael Heemskerk 
> Reviewed-by: Nguyễn Thái Ngọc Duy 
> Signed-off-by: Junio C Hamano 
> ---
>  Documentation/technical/pack-protocol.txt | 3 +--
>  upload-pack.c | 2 +-
>  2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/technical/pack-protocol.txt 
> b/Documentation/technical/pack-protocol.txt
> index f1a51ed..b898e97 100644
> --- a/Documentation/technical/pack-protocol.txt
> +++ b/Documentation/technical/pack-protocol.txt
> @@ -228,8 +228,7 @@ obtained through ref discovery.
>  The client MUST write all obj-ids which it only has shallow copies
>  of (meaning that it does not have the parents of a commit) as
>  'shallow' lines so that the server is aware of the limitations of
> -the client's history. Clients MUST NOT mention an obj-id which
> -it does not know exists on the server.
> +the client's history.
>
>  The client now sends the maximum commit history depth it wants for
>  this transaction, which is the number of commits it wants from the
> diff --git a/upload-pack.c b/upload-pack.c
> index bfa6279..127e59a 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -592,7 +592,7 @@ static void receive_needs(void)
> die("invalid shallow line: %s", line);
> object = parse_object(sha1);
> if (!object)
> -   die("did not find object for %s", line);
> +   continue;
> if (object->type != OBJ_COMMIT)
> die("invalid shallow object %s", 
> sha1_to_hex(sha1));
> if (!(object->flags & CLIENT_SHALLOW)) {
> --
> 1.8.3-rc0-117-g5915a95
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: [PATCH/RFC] upload-pack: ignore 'shallow' lines with unknown obj-ids

2013-04-28 Thread Junio C Hamano
Junio C Hamano  writes:

> OK, the change the patch text shows looks sensible to me.  Can you
> resend it in full, with the log message, your sign-off, and Duy's
> "Reviewed-by:", so that it can be applied?

I managed to reassemble what I _think_ is close to the original from
a few messages by dequoting and fixing MUA whitespace breakages in
them.

Here is with a slight tweak to the log message.  Please holler if I
grabbed a wrong version of the patch or made any silly mistakes.

Thanks.

-- >8 --
From: Michael Heemskerk 
Subject: [PATCH] upload-pack: ignore 'shallow' lines with unknown obj-ids

When the client sends a 'shallow' line for an object that the server does
not have, the server currently dies with the error: "did not find object
for shallow ".  The client may have truncated the history at
the commit by fetching shallowly from a different server, or the commit
may have been garbage collected by the server. In either case, this
unknown commit is not relevant for calculating the pack that is to be
sent and can be safely ignored, and it is not used when recomputing where
the updated history of the client is cauterised.

The documentation in technical/pack-protocol.txt has been updated to
remove the restriction that "Clients MUST NOT mention an obj-id which it
does not know exists on the server". This requirement is not realistic
because clients cannot know whether an object has been garbage collected
by the server.

Signed-off-by: Michael Heemskerk 
Reviewed-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 Documentation/technical/pack-protocol.txt | 3 +--
 upload-pack.c | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index f1a51ed..b898e97 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -228,8 +228,7 @@ obtained through ref discovery.
 The client MUST write all obj-ids which it only has shallow copies
 of (meaning that it does not have the parents of a commit) as
 'shallow' lines so that the server is aware of the limitations of
-the client's history. Clients MUST NOT mention an obj-id which
-it does not know exists on the server.
+the client's history.
 
 The client now sends the maximum commit history depth it wants for
 this transaction, which is the number of commits it wants from the
diff --git a/upload-pack.c b/upload-pack.c
index bfa6279..127e59a 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -592,7 +592,7 @@ static void receive_needs(void)
die("invalid shallow line: %s", line);
object = parse_object(sha1);
if (!object)
-   die("did not find object for %s", line);
+   continue;
if (object->type != OBJ_COMMIT)
die("invalid shallow object %s", 
sha1_to_hex(sha1));
if (!(object->flags & CLIENT_SHALLOW)) {
-- 
1.8.3-rc0-117-g5915a95

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: [PATCH/RFC] upload-pack: ignore 'shallow' lines with unknown obj-ids

2013-04-28 Thread Junio C Hamano
Michael Heemskerk  writes:

> Re-sent to the mailing list because the original was bounced (HTML subpart):
> ...
>
> With the patch applied, the server ignores the shallow line mentioned by
> the server and will not send a "shallow" or "unshallow" line for it back to
> the client. This scenario is not explicitly described in pack-protocol.txt
> but I'd be happy to add it to. I'll also update the comment to cover this
> aspect.
>
> As Duy pointed out, it doesn't cause problems in the current C Git
> implementation: the client adds a new entry to the shallow file for each
> "shallow" line it receives from the server and removes an entry for each
> "unshallow" line it receives. Any current shallow object that is not
> mentioned by the server is still marked as shallow after the fetch.
>
> I think that's how it should be: it should be the client's
> responsibility to track
> the list of objects it only has in shallow form. It should not rely on
> the server
> to tell it what that list is. Again, an extra line or two in pack-protocol.txt
> would help to clear this up.
> ...
>> > I do not seem to find the patch you are responding to, so I do not
>> > know how the patch handled the unshallowing part, but the impression
>> > I got from reading the log message quoted is that the patch was not
>> > even aware of the issue.
>>
>> I can't find it on gmane.org either. Patch quoted below.

OK, the change the patch text shows looks sensible to me.  Can you
resend it in full, with the log message, your sign-off, and Duy's
"Reviewed-by:", so that it can be applied?

Thanks.

>> On Sat, Apr 20, 2013 at 8:05 PM, Michael Heemskerk
>>  wrote:
>> > diff --git a/Documentation/technical/pack-protocol.txt
>> > b/Documentation/technical/pack-protocol.txt
>> > index f1a51ed..b898e97 100644
>> > --- a/Documentation/technical/pack-protocol.txt
>> > +++ b/Documentation/technical/pack-protocol.txt
>> > @@ -228,8 +228,7 @@ obtained through ref discovery.
>> ...
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fwd: [PATCH/RFC] upload-pack: ignore 'shallow' lines with unknown obj-ids

2013-04-20 Thread Michael Heemskerk
Re-sent to the mailing list because the original was bounced (HTML subpart):

On 21 April 2013 09:56, Duy Nguyen  wrote:
>
> On Sun, Apr 21, 2013 at 6:51 AM, Junio C Hamano  wrote:
> > Duy Nguyen  writes:
> > But the shallow list is also used to compute the updated boundary
> > (i.e. "this client does not have a valid history behind these
> > commits")?  When we know their current shallow boundary, after
> > sending history that crosses that boundary, we can tell them "before
> > this fetch, you did not have any history behind this commit, but we
> > know that you now have a bit more. Update your record with these new
> > boundaries. You still do not have any history behind these commits."
> > That is how deepening works.
> >
> > When you receive a shallow boundary unknown to you, it might not
> > hurt if you keep it and parrot it at the end, so that the fetcher
> > will still remember that it does not have any history behind the
> > commit.  There may be reasons why doing so is not sufficient and we
> > have to reject clients whose history is truncated at a place unknown
> > to us.
> >
> > You would declare "now you have everything behind that unknown
> > shallow boundary", if you ignore an unknown shallow boundary and do
> > not send it back.
> >
> > That sounds ourright wrong to me. You simply do not know enough to
> > make such a declaration.
>
> It's a good point. But I think the receiver does not rely solely on
> "shallow " lines from the sender to create new shallow file. If it
> receives "shallow " line, it registers the received sha-1 as a cut
> point. If it receives "unshallow " line, it unregisters. If it
> receives neither, the current registered cutpoints in memory are
> simply written back to disk. So not sending it back should not be a
> big problem (at least for C Git).


With the patch applied, the server ignores the shallow line mentioned by
the server and will not send a "shallow" or "unshallow" line for it back to
the client. This scenario is not explicitly described in pack-protocol.txt
but I'd be happy to add it to. I'll also update the comment to cover this
aspect.

As Duy pointed out, it doesn't cause problems in the current C Git
implementation: the client adds a new entry to the shallow file for each
"shallow" line it receives from the server and removes an entry for each
"unshallow" line it receives. Any current shallow object that is not
mentioned by the server is still marked as shallow after the fetch.

I think that's how it should be: it should be the client's
responsibility to track
the list of objects it only has in shallow form. It should not rely on
the server
to tell it what that list is. Again, an extra line or two in pack-protocol.txt
would help to clear this up.

In testing this patch, I also discovered that the shallow file is not updated
as part of a prune operation. Arguably, any pruned commits should also
be removed from the shallow file. I haven't included a fix for this in my
patch because it's an existing issue that is best fixed in a separate patch
(in progress).

>
> > I do not seem to find the patch you are responding to, so I do not
> > know how the patch handled the unshallowing part, but the impression
> > I got from reading the log message quoted is that the patch was not
> > even aware of the issue.
>
> I can't find it on gmane.org either. Patch quoted below.
>
> On Sat, Apr 20, 2013 at 8:05 PM, Michael Heemskerk
>  wrote:
> > diff --git a/Documentation/technical/pack-protocol.txt
> > b/Documentation/technical/pack-protocol.txt
> > index f1a51ed..b898e97 100644
> > --- a/Documentation/technical/pack-protocol.txt
> > +++ b/Documentation/technical/pack-protocol.txt
> > @@ -228,8 +228,7 @@ obtained through ref discovery.
> >  The client MUST write all obj-ids which it only has shallow copies
> >  of (meaning that it does not have the parents of a commit) as
> >  'shallow' lines so that the server is aware of the limitations of
> > -the client's history. Clients MUST NOT mention an obj-id which
> > -it does not know exists on the server.
> > +the client's history.
> >
> >  The client now sends the maximum commit history depth it wants for
> >  this transaction, which is the number of commits it wants from the
> > diff --git a/upload-pack.c b/upload-pack.c
> > index bfa6279..127e59a 100644
> > --- a/upload-pack.c
> > +++ b/upload-pack.c
> > @@ -592,7 +592,7 @@ static void receive_needs(void)
> > die("invalid shallow line: %s", line);
> > object = parse_object(sha1);
> > if (!object)
> > -   die("did not find object for %s", line);
> > +   continue;
> > if (object->type != OBJ_COMMIT)
> > die("invalid shallow object %s",
> > sha1_to_hex(sha1));
> > if (!(object->flags & CLIENT_SHALLOW)) {
> > --
> > 1.8.0.2
> >
> --
> Duy
--
To unsubscribe from th