Re: $> git branch splat response considered harmful

2019-08-09 Thread Philip Oakley

On 08/08/2019 22:28, Emily Shaffer wrote:

On Thu, Aug 8, 2019 at 2:20 PM Bryan Turner  wrote:

On Thu, Aug 8, 2019 at 2:08 PM  wrote:

fwiw,

jimc@frodo:~/prj-1/capnproto.git$ git branch -l
* master

I find the splat in the response unhelpful
when wrapped in shell for loop, the splat expands into everything in
current directory

jimc@frodo:~/prj-1/capnproto.git$ for b in `git branch -l`; do echo $b; done
appveyor.yml
c++
CMakeLists.txt
CONTRIBUTORS
...

it would be nice if some flag combo would suppress that splat.
save me from fugly brittle sh $IFS fiddlery and incomplete workarounds

Have you tried "git for-each-ref --format="%(refname:short)"
refs/heads/"? That's going to provide short names for branches without
any indicator for the default branch, and without any special
ordering.

More generally, I think you should take a look at `git help git` and
check out the difference between "porcelain" and "plumbing" commands.
The former, of which `git branch` is one, are intended for interactive
use and not really meant for scripting or piping. You can usually come
up with an equivalent from the plumbing commands, which Bryan has
suggested for you with `git for-each-ref`.  Git project tries very
hard to maintain output format of the plumbing commands so as to not
break folks' scripts, but such promises aren't usually made for
porcelain commands.

I think this (the broad Q&A) also suggests that the porcelain command 
documentation could be more helpful for pointing to the appropriate 
plumbing commands for these scripting issues (and it could reduce list 
noise..).

--
Philip


Re: $> git branch splat response considered harmful

2019-08-08 Thread Junio C Hamano
Emily Shaffer  writes:

> More generally, I think you should take a look at `git help git` and
> check out the difference between "porcelain" and "plumbing" commands.
> The former, of which `git branch` is one, are intended for interactive
> use and not really meant for scripting or piping. You can usually come
> up with an equivalent from the plumbing commands, which Bryan has
> suggested for you with `git for-each-ref`.  Git project tries very
> hard to maintain output format of the plumbing commands so as to not
> break folks' scripts, but such promises aren't usually made for
> porcelain commands.

Thanks for a detailed response.

Git project promises that output format of the Porcelain commands is
subject to change, in order to support human eyeballs better.

Unless documented otherwise (e.g. "git status --porcelain"), it is
unwise to try parsing porcelain output---your script may break any
time and you get to keep both halves.

That is not the same as saying "do not script using Porcelain".
Sending the output directly to the end user (e.g. compute the
revision range in your script that finally calls "git log" using
that computed revision range) is perfectly fine.

Thanks.
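[Editorial note: a minimal sketch of the split Junio describes above, in a throwaway repo. Plumbing (rev-parse/rev-list) computes a value the script can rely on; Porcelain (git log) renders it for human eyes only. The commit messages are made up for the demo.]

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m first
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m second
# Plumbing: stable, machine-parseable output the script may inspect
base=$(git rev-parse HEAD~1)
# Porcelain: the computed range goes straight to the end user, never to a parser
git log --oneline "$base..HEAD"
```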


Re: $> git branch splat response considered harmful

2019-08-08 Thread SZEDER Gábor
On Thu, Aug 08, 2019 at 03:08:06PM -0600, jim.cro...@gmail.com wrote:
> fwiw,
> 
> jimc@frodo:~/prj-1/capnproto.git$ git branch -l
> * master
> 
> I find the splat in the response unhelpful
> when wrapped in shell for loop, the splat expands into everything in
> current directory
> 
> jimc@frodo:~/prj-1/capnproto.git$ for b in `git branch -l`; do echo $b; done
> appveyor.yml
> c++
> CMakeLists.txt
> CONTRIBUTORS
> ...
> 
> it would be nice if some flag combo would suppress that splat.
> save me from fugly brittle sh $IFS fiddlery and incomplete workarounds

'git branch' is not intended for scripting, try 'git for-each-ref'
instead.  The equivalent command should be:

  git for-each-ref --format='%(refname:strip=2)' refs/heads/

> also, I found no mention of the splat in the man page.

It's in the first sentence of the description :)

  If --list is given, or if there are no non-option arguments, existing
  branches are listed; the current branch will be highlighted in green
  and marked with an asterisk.
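[Editorial note: a sketch of the suggested approach in a throwaway repo. Because for-each-ref's output is one plain ref name per line, a while-read loop needs no $IFS fiddling and never sees the asterisk. The branch names `trunk` and `topic` are made up for the demo.]

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git checkout -q -b trunk
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m init
git branch topic
# One ref name per line; no marker, no word splitting
git for-each-ref --format='%(refname:short)' refs/heads/ |
while read -r b; do
    echo "branch: $b"
done
```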



Re: $> git branch splat response considered harmful

2019-08-08 Thread Emily Shaffer
On Thu, Aug 8, 2019 at 2:20 PM Bryan Turner  wrote:
>
> On Thu, Aug 8, 2019 at 2:08 PM  wrote:
> >
> > fwiw,
> >
> > jimc@frodo:~/prj-1/capnproto.git$ git branch -l
> > * master
> >
> > I find the splat in the response unhelpful
> > when wrapped in shell for loop, the splat expands into everything in
> > current directory
> >
> > jimc@frodo:~/prj-1/capnproto.git$ for b in `git branch -l`; do echo $b; done
> > appveyor.yml
> > c++
> > CMakeLists.txt
> > CONTRIBUTORS
> > ...
> >
> > it would be nice if some flag combo would suppress that splat.
> > save me from fugly brittle sh $IFS fiddlery and incomplete workarounds
>
> Have you tried "git for-each-ref --format="%(refname:short)"
> refs/heads/"? That's going to provide short names for branches without
> any indicator for the default branch, and without any special
> ordering.

More generally, I think you should take a look at `git help git` and
check out the difference between "porcelain" and "plumbing" commands.
The former, of which `git branch` is one, are intended for interactive
use and not really meant for scripting or piping. You can usually come
up with an equivalent from the plumbing commands, which Bryan has
suggested for you with `git for-each-ref`.  Git project tries very
hard to maintain output format of the plumbing commands so as to not
break folks' scripts, but such promises aren't usually made for
porcelain commands.

-- 
Emily Shaffer


Re: $> git branch splat response considered harmful

2019-08-08 Thread Bryan Turner
On Thu, Aug 8, 2019 at 2:08 PM  wrote:
>
> fwiw,
>
> jimc@frodo:~/prj-1/capnproto.git$ git branch -l
> * master
>
> I find the splat in the response unhelpful
> when wrapped in shell for loop, the splat expands into everything in
> current directory
>
> jimc@frodo:~/prj-1/capnproto.git$ for b in `git branch -l`; do echo $b; done
> appveyor.yml
> c++
> CMakeLists.txt
> CONTRIBUTORS
> ...
>
> it would be nice if some flag combo would suppress that splat.
> save me from fugly brittle sh $IFS fiddlery and incomplete workarounds

Have you tried "git for-each-ref --format="%(refname:short)"
refs/heads/"? That's going to provide short names for branches without
any indicator for the default branch, and without any special
ordering.


$> git branch splat response considered harmful

2019-08-08 Thread jim . cromie
fwiw,

jimc@frodo:~/prj-1/capnproto.git$ git branch -l
* master

I find the splat in the response unhelpful
when wrapped in shell for loop, the splat expands into everything in
current directory

jimc@frodo:~/prj-1/capnproto.git$ for b in `git branch -l`; do echo $b; done
appveyor.yml
c++
CMakeLists.txt
CONTRIBUTORS
...

it would be nice if some flag combo would suppress that splat.
save me from fugly brittle sh $IFS fiddlery and incomplete workarounds

git branch -l  # no splat
  master
git branch # splat ok for back-compat (if you think it necessary)
  * master

it appears that by convention, checked out branch is both 1st in list,
and splatted,
perhaps _DIRTY_, JUNK, or maybe "" would be good 1st line of response
if not on a branch ?

also, I found no mention of the splat in the man page.
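[Editorial note: `git branch` itself also understands a `--format` option (added in 2.13, reusing for-each-ref's field syntax) whose output carries no asterisk, which addresses the scripting complaint directly. A minimal sketch in a throwaway repo; the branch name `trunk` is made up.]

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git checkout -q -b trunk
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m init
# Same field syntax as for-each-ref; prints bare branch names, no "*"
git branch --format='%(refname:short)'
```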


[PATCH 1/4] am: simplify prompt response handling

2019-05-20 Thread Jeff King
We'll never see a NULL returned from git_prompt(); if it can't produce
any input for us (e.g., because the terminal got EOF) then it will just
die().

So there's no need for us to handle NULL here. And even if there was, it
doesn't make sense to continue; on a true terminal hangup we'd just loop
infinitely trying to get input that will never come.

Signed-off-by: Jeff King 
---
 builtin/am.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 912d9821b1..644bb11f6c 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1661,9 +1661,7 @@ static int do_interactive(struct am_state *state)
 */
	reply = git_prompt(_("Apply? [y]es/[n]o/[e]dit/[v]iew patch/[a]ccept all: "), PROMPT_ECHO);
 
-   if (!reply) {
-   continue;
-   } else if (*reply == 'y' || *reply == 'Y') {
+   if (*reply == 'y' || *reply == 'Y') {
return 0;
} else if (*reply == 'a' || *reply == 'A') {
state->interactive = 0;
-- 
2.22.0.rc1.539.g7bfcdfe86d



Re: [PATCH v2 0/8] CDN offloading of fetch response

2019-04-24 Thread Ævar Arnfjörð Bjarmason


On Fri, Mar 08 2019, Jonathan Tan wrote:

> One relatively significant change: someone pointed out that the issue fixed by
> 50d3413740 ("http: make redirects more obvious", 2016-12-06) could also
> occur here, so I have changed it so that the server is required to send
> the packfile's hash along with the URI.
>
> This does mean that Ævar's workflow described in
> https://public-inbox.org/git/87k1hv6eel@evledraar.gmail.com/ would
> not work.  Quoting Ævar:
>
>> More concretely, I'd like to have a setup where a server can just dumbly
>> point to some URL that probably has most of the data, without having any
>> idea what OIDs are in it. So that e.g. some machine entirely
>> disconnected from the server (and with just a regular clone) can
>> continually generate an up-to-date-enough packfile.

>
> With 50d3413740, it seems to me that it's important for the server to
> know details about the URIs that it points to, so such a disconnection
> would not work.

For what it's worth I'm fine with that, and as is obvious in some of the
previous threads I hadn't thought about that threat model. This "I know
XYZ has a pack that should have most of it, check what's in it and get
back to me with HAVEs" was a nice-to-have, but not a must.

And you can still get most of the way there with this proposal and the
techniques described in the follow-up to
https://public-inbox.org/git/87bmhiykvw@evledraar.gmail.com/ if I'm
reading this series correctly.

I.e. the server would know the CDNs are going to be generating a pack
from the "master" history with the tip being the Nth commit, and since
both agree on the pack algorithm they can generate the same OID/pack
hash pair without further coordination, the server would then
optimistically advertise that. This would give you the sort of "lazy"
CDN setup I was describing with slightly more work than just "here's
some random latest pack".

But isn't in the case that if a malicious server ever got a hold of a
pack SHA-1 it knows to be on a private CDN *and* a commit that's in it
we'd have the same problem? Shouldn't this security model be prominently
documented in Documentation/technical/packfile-uri.txt? I.e. "THOU MUST
NEVER LEAK A COMBINATION OF THE PACK SHA-1 OF A PRIVATE CDN AND A
PRIVATE COMMIT SHA-1!".

It seems likely that once we have git CDN support the first thing CDNs
are going to do is to serve up such packs via a URL that includes the
pack SHA-1. Once the combination of that & a commit in it leaks we're
back to square one, no? *My* default approach in setting up such
infrastructure without knowing about 50d3413740 would be not to care
about the privacy of the SHA-1s, even in the case of private data.

Isn't there also overlap here with non-CDN shallow/narrow fetching?
Where a server can detect such a client, rely on it to get the objects
from elsewhere (e.g. via adding an unrelated remote), and then get a
push that gives it secret data?

I don't see any fool-proof way out of it, but maybe a mode where we:

 1. Get the CDN data URLs (which for the purposes of this example I'm
assuming are "base packs" containing at least some commits...)

 2. Do a follow-up request to the server where we e.g. pretend not to
have the last N commits for tips it advertises and ask it to serve
that up from a non-CDN. If it can't we know it's lying and trying an
attack similar to the one described in 50d3413740.

Conversely, if it can serve them we can't prove it isn't lying, but at
that point an attack seems unlikely, since surely the useful thing to
serve is a delta against the recent-but-server-lied-about-having-it
commit, at least in most cases.

This would also have the advantage of addressing some of the reliability
concerns brought up elsewhere. I.e. once we're doing a sanity-check
post-fetch anyway a failure to download 1/3 packs from a CDN becomes
recoverable, although the design outlined here (understandably) would
prefer that to be done in another "fetch" dialog so the server can keep
the "CDN + my data should be 100% of the data" state.


Re: [PATCH v2 0/8] CDN offloading of fetch response

2019-04-23 Thread Jonathan Tan
> On Fri, Mar 08, 2019 at 01:55:12PM -0800, Jonathan Tan wrote:
> 
> > Here's my current progress - the only thing that is lacking is more
> > tests, maybe, so I think it's ready for review.
> 
> A bit belated, but here are some overall thoughts. A lot of my thinking
> comes from comparing this to previous work I had done on teaching the
> client to fetch first from a bundle and then do a follow-up fetch from
> the server.

Thanks for your thoughts.

> > This code now starts CDN downloads after closing the first HTTP request,
> > and it holds on to the .keep files until after the refs are set. I'll
> > leave parallelization of the CDN downloads for later work.
> 
> Good. One of the problems with the bundle+followup approach was either
> having to hold open or re-establish the connection to the original
> server, since that fetch had to be put on hold.
> 
> I agree that parallelizing can come later. I do wonder if it's worth
> making a new tool rather than trying to reuse git-http-fetch. Yes, it
> basically does what we want already, but there's quite a lot of cruft in
> that dumb-http code base. Using http_get_file(), or even curl more
> directly might be a bit easier.
> 
> One problem in particular I'd worry about is that the http code
> generally expects authentication to happen through the initial
> ref-discovery, and then be set for all subsequent requests. So I doubt
> an http_pack_request actually handles auth at all, and retro-fitting it
> may be tricky.

Thanks - I'll keep that in mind. If auth isn't handled, then we'll
definitely need something new.

> > One relatively significant change: someone pointed out that the issue fixed by
> > 50d3413740 ("http: make redirects more obvious", 2016-12-06) could also
> > occur here, so I have changed it so that the server is required to send
> > the packfile's hash along with the URI.
> 
> I have mixed feelings on that. I like how it clearly makes the server
> the source of authority. You don't have to care about CDN tampering,
> because you know you are getting the bytes the server wanted you to.

I was thinking not of CDN tampering but of a malicious server. Suppose a
server knew of (1) the hash of a target commit, and (2) the URL of a
packfile containing that target commit. The server can craft a commit
whose parent is that target commit and advertise it, and serves both
that commit as a packfile and the URL as packfile-uri. When the user
subsequently pushes, they will probably inadvertently push everything
including the target commit.

This is similar to the 50d3413740 case except that now the server needs
an additional datum (2). The packfile's hash is yet another additional
datum. I admit that it may not be that useful, though.

> I think even without that this is still _mostly_ true, though. You're
> going to compute the hash of every object the CDN sends you, and those
> objects must fit into the missing gaps in what the server sends you. So
> the worst case is that a malicious CDN could send you some extra
> objects. That's probably not the end of the world, but I do like the
> extra guarantee of knowing that you got byte-for-byte what the server
> wanted you to.

Yes, the guarantee is nice.

> > This does mean that Ævar's workflow described in [1] would not work.
> > Quoting Ævar:
> > 
> > > More concretely, I'd like to have a setup where a server can just dumbly
> > > point to some URL that probably has most of the data, without having any
> > > idea what OIDs are in it. So that e.g. some machine entirely
> > > disconnected from the server (and with just a regular clone) can
> > > continually generate an up-to-date-enough packfile.
> > 
> > With 50d3413740, it seems to me that it's important for the server to
> > know details about the URIs that it points to, so such a disconnection
> > would not work.
> 
> I think even without 50d3413740, this is important for your scheme. One
> of the weaknesses (and strengths, I suppose) of the bundle+followup
> scheme was that the initial bundle generally had to meet the git
> repository guarantee. After you fetched the bundle, you'd tell the
> server "OK, now I have commit X" just like you were a regular client.
> 
> But I really like that your proposal does away with that, and asks for
> tighter cooperation between the server and the CDN. It means the CDN can
> serve some random subset of the objects. But once we do that, now the
> server _has_ to know what was in those URLs it pointed to, because our
> protocol has no good way of communicating a random subset of objects (if
> they're just blobs, we could enumerate all of them to the server as
> haves; yuck.  But as soon as you get a tree or commit from the CDN, you
> have to have all of the reachable objects to be able to claim it).

Ah...you're right that the server still needs some inkling of what the
CDN's packfile has.

Without the packfile hash, the knowledge needed is slightly less: the
server just has to know a subset of the objects instead o

Re: [PATCH v2 0/8] CDN offloading of fetch response

2019-04-22 Thread Jeff King
On Fri, Mar 08, 2019 at 01:55:12PM -0800, Jonathan Tan wrote:

> Here's my current progress - the only thing that is lacking is more
> tests, maybe, so I think it's ready for review.

A bit belated, but here are some overall thoughts. A lot of my thinking
comes from comparing this to previous work I had done on teaching the
client to fetch first from a bundle and then do a follow-up fetch from
the server.

> This code now starts CDN downloads after closing the first HTTP request,
> and it holds on to the .keep files until after the refs are set. I'll
> leave parallelization of the CDN downloads for later work.

Good. One of the problems with the bundle+followup approach was either
having to hold open or re-establish the connection to the original
server, since that fetch had to be put on hold.

I agree that parallelizing can come later. I do wonder if it's worth
making a new tool rather than trying to reuse git-http-fetch. Yes, it
basically does what we want already, but there's quite a lot of cruft in
that dumb-http code base. Using http_get_file(), or even curl more
directly might be a bit easier.

One problem in particular I'd worry about is that the http code
generally expects authentication to happen through the initial
ref-discovery, and then be set for all subsequent requests. So I doubt
an http_pack_request actually handles auth at all, and retro-fitting it
may be tricky.

> One relatively significant change: someone pointed out that the issue fixed by
> 50d3413740 ("http: make redirects more obvious", 2016-12-06) could also
> occur here, so I have changed it so that the server is required to send
> the packfile's hash along with the URI.

I have mixed feelings on that. I like how it clearly makes the server
the source of authority. You don't have to care about CDN tampering,
because you know you are getting the bytes the server wanted you to.

I think even without that this is still _mostly_ true, though. You're
going to compute the hash of every object the CDN sends you, and those
objects must fit into the missing gaps in what the server sends you. So
the worst case is that a malicious CDN could send you some extra
objects. That's probably not the end of the world, but I do like the
extra guarantee of knowing that you got byte-for-byte what the server
wanted you to.

> This does mean that Ævar's workflow described in [1] would not work.
> Quoting Ævar:
> 
> > More concretely, I'd like to have a setup where a server can just dumbly
> > point to some URL that probably has most of the data, without having any
> > idea what OIDs are in it. So that e.g. some machine entirely
> > disconnected from the server (and with just a regular clone) can
> > continually generate an up-to-date-enough packfile.
> 
> With 50d3413740, it seems to me that it's important for the server to
> know details about the URIs that it points to, so such a disconnection
> would not work.

I think even without 50d3413740, this is important for your scheme. One
of the weaknesses (and strengths, I suppose) of the bundle+followup
scheme was that the initial bundle generally had to meet the git
repository guarantee. After you fetched the bundle, you'd tell the
server "OK, now I have commit X" just like you were a regular client.

But I really like that your proposal does away with that, and asks for
tighter cooperation between the server and the CDN. It means the CDN can
serve some random subset of the objects. But once we do that, now the
server _has_ to know what was in those URLs it pointed to, because our
protocol has no good way of communicating a random subset of objects (if
they're just blobs, we could enumerate all of them to the server as
haves; yuck.  But as soon as you get a tree or commit from the CDN, you
have to have all of the reachable objects to be able to claim it).

So I think this is pretty inherent to your proposal, but it's the
right tradeoff to be making here (I think there could still be room for
the other thing, too, but it's just a different feature).

I have a few other comments on the design, but I'll send them in
response to patch 5.

-Peff


[PATCH 2/7] t5530: check protocol response for "not our ref"

2019-04-12 Thread Jeff King
Back in 9f9aa76130 (upload-pack: Improve error message when bad ref
requested, 2010-07-31), we added a test to make sure that we die with a
sensible message when the client asks for an object we don't have.

Much later, in bdb31eada7 (upload-pack: report "not our ref" to client,
2017-02-23), we started reporting that information via an "ERR" line in
the protocol. Let's check that part, as well.

While we're touching this test, let's drop the "-q" on the grep calls.
Our usual test style just relies on --verbose to control output.

Signed-off-by: Jeff King 
---
If like me, you are scratching your head over the grep for
multi_ack_detailed, it is from 9f9aa76130, which made sure we did not
spew extra cruft as part of the die message.

 t/t5530-upload-pack-error.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t5530-upload-pack-error.sh b/t/t5530-upload-pack-error.sh
index 4f6e32b04c..295bd0c83c 100755
--- a/t/t5530-upload-pack-error.sh
+++ b/t/t5530-upload-pack-error.sh
@@ -62,8 +62,9 @@ test_expect_success 'upload-pack error message when bad ref requested' '
printf "0045want %s multi_ack_detailed\n0009done\n" \
"deadbeefdeadbeefdeadbeefdeadbeefdeadbeef" >input &&
test_must_fail git upload-pack . output 2>output.err &&
-   grep -q "not our ref" output.err &&
-   ! grep -q multi_ack_detailed output.err
+   grep "not our ref" output.err &&
+   grep "ERR" output &&
+   ! grep multi_ack_detailed output.err
 '
 
 test_expect_success 'upload-pack fails due to error in pack-objects enumeration' '
-- 
2.21.0.931.gd0bc72a411



Re: [PATCH v2 0/8] CDN offloading of fetch response

2019-03-19 Thread Josh Steadmon
On 2019.03.08 13:55, Jonathan Tan wrote:
> Here's my current progress - the only thing that is lacking is more
> tests, maybe, so I think it's ready for review.

This series looks good to me.

Reviewed-by: Josh Steadmon 


[PATCH v2 8/8] upload-pack: send part of packfile response as uri

2019-03-08 Thread Jonathan Tan
Teach upload-pack to send part of its packfile response as URIs.

An administrator may configure a repository with one or more
"uploadpack.blobpackfileuri" lines, each line containing an OID, a pack
hash, and a URI. A client may configure fetch.uriprotocols to be a
comma-separated list of protocols that it is willing to use to fetch
additional packfiles - this list will be sent to the server. Whenever an
object with one of those OIDs would appear in the packfile transmitted
by upload-pack, the server may exclude that object, and instead send the
URI. The client will then download the packs referred to by those URIs
before performing the connectivity check.
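[Editorial note: a hedged sketch of the configuration described above, based on the format the patch parses (OID, pack hash, URI, space-separated). The blob OID, pack hash, and CDN URL below are made-up placeholders, not real objects.]

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
# Server side: one blobpackfileuri line per excludable object,
# "<object-hash> <pack-hash> <uri>" (placeholder values here)
git config --add uploadpack.blobpackfileuri \
    "0123456789012345678901234567890123456789 abcdefabcdefabcdefabcdefabcdefabcdefabcd https://cdn.example.com/base.pack"
# Client side: comma-separated protocols it will accept packfile URIs over
git config fetch.uriprotocols https
git config --get uploadpack.blobpackfileuri
```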

Signed-off-by: Jonathan Tan 
Signed-off-by: Junio C Hamano 
---
 builtin/pack-objects.c |  76 
 fetch-pack.c   | 112 +++--
 t/t5702-protocol-v2.sh |  57 +
 upload-pack.c  |  78 +---
 4 files changed, 312 insertions(+), 11 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index a9fac7c128..2fa962c87d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -110,6 +110,8 @@ static unsigned long window_memory_limit = 0;
 
 static struct list_objects_filter_options filter_options;
 
+static struct string_list uri_protocols = STRING_LIST_INIT_NODUP;
+
 enum missing_action {
MA_ERROR = 0,  /* fail if any missing objects are encountered */
MA_ALLOW_ANY,  /* silently allow ALL missing objects */
@@ -118,6 +120,15 @@ enum missing_action {
 static enum missing_action arg_missing_action;
 static show_object_fn fn_show_object;
 
+struct configured_exclusion {
+   struct oidmap_entry e;
+   char *pack_hash_hex;
+   char *uri;
+};
+static struct oidmap configured_exclusions;
+
+static struct oidset excluded_by_config;
+
 /*
  * stats
  */
@@ -832,6 +843,25 @@ static off_t write_reused_pack(struct hashfile *f)
return reuse_packfile_offset - sizeof(struct pack_header);
 }
 
+static void write_excluded_by_configs(void)
+{
+   struct oidset_iter iter;
+   const struct object_id *oid;
+
+   oidset_iter_init(&excluded_by_config, &iter);
+   while ((oid = oidset_iter_next(&iter))) {
+   struct configured_exclusion *ex =
+   oidmap_get(&configured_exclusions, oid);
+
+   if (!ex)
+   BUG("configured exclusion wasn't configured");
+   write_in_full(1, ex->pack_hash_hex, strlen(ex->pack_hash_hex));
+   write_in_full(1, " ", 1);
+   write_in_full(1, ex->uri, strlen(ex->uri));
+   write_in_full(1, "\n", 1);
+   }
+}
+
 static const char no_split_warning[] = N_(
 "disabling bitmap writing, packs are split due to pack.packSizeLimit"
 );
@@ -1125,6 +1155,25 @@ static int want_object_in_pack(const struct object_id *oid,
}
}
 
+   if (uri_protocols.nr) {
+   struct configured_exclusion *ex =
+   oidmap_get(&configured_exclusions, oid);
+   int i;
+   const char *p;
+
+   if (ex) {
+   for (i = 0; i < uri_protocols.nr; i++) {
+   if (skip_prefix(ex->uri,
+   uri_protocols.items[i].string,
+   &p) &&
+   *p == ':') {
+   oidset_insert(&excluded_by_config, oid);
+   return 0;
+   }
+   }
+   }
+   }
+
return 1;
 }
 
@@ -2726,6 +2775,29 @@ static int git_pack_config(const char *k, const char *v, void *cb)
pack_idx_opts.version);
return 0;
}
+   if (!strcmp(k, "uploadpack.blobpackfileuri")) {
+   struct configured_exclusion *ex = xmalloc(sizeof(*ex));
+   const char *oid_end, *pack_end;
+   /*
+* Stores the pack hash. This is not a true object ID, but is
+* of the same form.
+*/
+   struct object_id pack_hash;
+
+   if (parse_oid_hex(v, &ex->e.oid, &oid_end) ||
+   *oid_end != ' ' ||
+   parse_oid_hex(oid_end + 1, &pack_hash, &pack_end) ||
+   *pack_end != ' ')
+   die(_("value of uploadpack.blobpackfileuri must be "
+ "of the form '<object-hash> <pack-hash> <uri>' "
+ "(got '%s')"), v);
+   if (oidmap_get(&configured_exclusions, &ex->e.oid))
+   die(_("object already configu

[PATCH v2 0/8] CDN offloading of fetch response

2019-03-08 Thread Jonathan Tan
Here's my current progress - the only thing that is lacking is more
tests, maybe, so I think it's ready for review.

I wrote in version 1:

> I know
> that there are some implementation details that could be improved (e.g.
> parallelization of the CDN downloads, starting CDN downloads *after*
> closing the first HTTP request, holding on to the .keep locks until
> after the refs are set), but will work on those once the overall design
> is more or less finalized.

This code now starts CDN downloads after closing the first HTTP request,
and it holds on to the .keep files until after the refs are set. I'll
leave parallelization of the CDN downloads for later work.

One relatively significant change: someone pointed out that the issue fixed by 
50d3413740 ("http: make redirects more obvious", 2016-12-06) could also
occur here, so I have changed it so that the server is required to send
the packfile's hash along with the URI.

This does mean that Ævar's workflow described in [1] would not work.
Quoting Ævar:

> More concretely, I'd like to have a setup where a server can just dumbly
> point to some URL that probably has most of the data, without having any
> idea what OIDs are in it. So that e.g. some machine entirely
> disconnected from the server (and with just a regular clone) can
> continually generate an up-to-date-enough packfile.

With 50d3413740, it seems to me that it's important for the server to
know details about the URIs that it points to, so such a disconnection
would not work.

[1] https://public-inbox.org/git/87k1hv6eel@evledraar.gmail.com/

Jonathan Tan (8):
  http: use --stdin when getting dumb HTTP pack
  http: improve documentation of http_pack_request
  http-fetch: support fetching packfiles by URL
  Documentation: order protocol v2 sections
  Documentation: add Packfile URIs design doc
  upload-pack: refactor reading of pack-objects out
  fetch-pack: support more than one pack lockfile
  upload-pack: send part of packfile response as uri

 Documentation/git-http-fetch.txt |   8 +-
 Documentation/technical/packfile-uri.txt |  78 
 Documentation/technical/protocol-v2.txt  |  44 +--
 builtin/fetch-pack.c |  17 ++-
 builtin/pack-objects.c   |  76 +++
 connected.c  |   8 +-
 fetch-pack.c | 129 +--
 fetch-pack.h |   2 +-
 http-fetch.c |  64 --
 http.c   |  82 +++-
 http.h   |  32 -
 t/t5550-http-fetch-dumb.sh   |  25 
 t/t5702-protocol-v2.sh   |  57 +
 transport-helper.c   |   5 +-
 transport.c  |  14 +-
 transport.h  |   6 +-
 upload-pack.c| 155 +--
 17 files changed, 670 insertions(+), 132 deletions(-)
 create mode 100644 Documentation/technical/packfile-uri.txt

Range-diff against v1:
1:  987c9b686b < -:  -- http: use --stdin and --keep when downloading pack
-:  -- > 1:  87173d0ad1 http: use --stdin when getting dumb HTTP pack
2:  4b53a6f48c ! 2:  66d31720a0 http: improve documentation of http_pack_request
@@ -45,4 +45,4 @@
 + */
  extern struct http_pack_request *new_http_pack_request(
struct packed_git *target, const char *base_url);
 - int finish_http_pack_request(struct http_pack_request *preq, char **lockfile);
+ extern int finish_http_pack_request(struct http_pack_request *preq);
3:  afe73a282d ! 3:  02373f6afe http-fetch: support fetching packfiles by URL
@@ -31,7 +31,8 @@
 +--packfile::
 +  Instead of a commit id on the command line (which is not expected in
 +  this case), 'git http-fetch' fetches the packfile directly at the given
-+  URL and generates the corresponding .idx file.
++  URL and uses index-pack to generate corresponding .idx and .keep files.
++  The output of index-pack is printed to stdout.
 +
  --recover::
Verify that everything reachable from target is fetched.  Used after
@@ -101,12 +102,12 @@
 +  struct http_pack_request *preq;
 +  struct slot_results results;
 +  int ret;
-+  char *lockfile;
 +
 +  preq = new_http_pack_request(NULL, url);
 +  if (preq == NULL)
 +  die("couldn't create http pack request");
 +  preq->slot->results = &results;
++  preq->generate_keep = 1;
 +
 +  if (start_active_slot(preq->slot)) {
 +  run_active_slot(preq->slot);
@@ -118,9 +119,8 @@
 +  die("Unable to start request");
 +

Re: [WIP 0/7] CDN offloading of fetch response

2019-03-04 Thread Christian Couder
Hi,

On Fri, Mar 1, 2019 at 12:21 AM Jonathan Nieder  wrote:
>
> Sorry for the slow followup.  Thanks for probing into the design ---
> this should be useful for getting the docs to be clear.
>
> Christian Couder wrote:
>
> > So it's likely that users will want a way to host on such sites
> > incomplete repos using CDN offloading to a CDN on another site. And
> > then if the CDN is not accessible for some reason, things will
> completely break when users clone.
>
> I think this would be a broken setup --- we can make it clear in the
> protocol and server docs that you should only point to a CDN for which
> you control the contents, to avoid breaking clients.

We can say whatever we want in the docs, but in real life, if it's
simpler/cheaper for repo admins to use, for example, a CDN on Google and
a repo on GitHub, they are likely to do it anyway.

> That doesn't prevent adding additional features in the future e.g. for
> "server suggested alternates" --- it's just that I consider that a
> separate feature.
>
> Using CDN offloading requires cooperation of the hosting provider.
> It's a way to optimize how fetches work, not a way to have a partial
> repository on the server side.

We can say whatever we want about what it is for. Users are likely to
use it anyway in the way they think it will benefit them the most.

> > On Tue, Feb 26, 2019 at 12:45 AM Jonathan Nieder  wrote:
>
> >> This doesn't stop a hosting provider from using e.g. server options to
> >> allow the client more control over how their response is served, just
> >> like can be done for other features of how the transfer works (how
> >> often to send progress updates, whether to prioritize latency or
> >> throughput, etc).
> >
> > Could you give a more concrete example of what could be done?
>
> What I mean is passing server options using "git fetch --server-option".
> For example:
>
> git fetch -o priority=BATCH origin master
>
> or
>
> git fetch -o avoid-cdn=badcdn.example.com origin master
>
> The interpretation of server options is up to the server.

If you often have to say things like "-o
avoid-cdn=badcdn.example.com", then how is that better than just
specifying "-o usecdn=goodcdn.example.com", or better yet, using the
remote mechanism to configure a remote for goodcdn.example.com and
then configuring that remote to be used alongside the origin remote
(which is what the many-promisor-remotes approach is about)?
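The "remote mechanism" mentioned above can be sketched concretely. In
the sketch below, the remote name "cdn" and its URL are made-up
placeholders; remote.<name>.promisor and
remote.<name>.partialclonefilter are the configuration keys that
partial clone already understands, so this is the manual setup a
promisor-remote approach would rely on:

```shell
set -e
# Create a throwaway repository to demonstrate the configuration.
repo=$(mktemp -d)
git init -q "$repo"

# Manually declare a second remote and mark it as a promisor remote,
# so missing objects may be fetched from it on demand.
git -C "$repo" remote add cdn https://goodcdn.example.com/repo.git
git -C "$repo" config remote.cdn.promisor true
git -C "$repo" config remote.cdn.partialclonefilter blob:none

# Show that the configuration took effect.
git -C "$repo" config --get remote.cdn.promisor   # prints "true"
```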

> >> What the client *can* do is turn off support for packfile URLs in a
> >> request completely.  This is required for backward compatibility and
> >> allows working around a host that has configured the feature
> >> incorrectly.
> >
> > If the full content of a repo is really large, the size of a full pack
> > file sent by an initial clone could be really big and many client
> > machines could not have enough memory to deal with that. And this
> > supposes that repo hosting providers would be ok to host very large
> > repos in the first place.
>
> Do we require the packfile to fit in memory?  If so, we should fix
> that (to use streaming instead).

Even if we stream the packfile to write it, at some point we have to use it.

And I could be wrong, but I think that mmap doesn't work on Windows,
so I think we will just try to read the whole thing into memory. Even
on Linux I don't think it's a good idea to mmap a very large file and
then use some big parts of it, which I think we will have to do when
checking out the large files from inside the packfile.

Yeah, we can improve that part of Git too. I think though that it
means yet another thing (and not an easy one) that needs to be
improved before CDN offloading can work well in the real world.

I think that the Git "development philosophy" since the beginning has
been more about adding things that work well in the real world first,
even if they are small and a bit manual, and then improving on top of
those early things, rather than adding a big thing that doesn't quite
work well in the real world but is automated, and then improving on
that.


Re: [WIP 0/7] CDN offloading of fetch response

2019-03-04 Thread Christian Couder
On Tue, Feb 26, 2019 at 10:12 AM Ævar Arnfjörð Bjarmason
 wrote:
>
> On Tue, Feb 26 2019, Christian Couder wrote:
>
> > On Tue, Feb 26, 2019 at 12:45 AM Jonathan Nieder  wrote:

> >> But the same thing can happen with redirects, with embedded assets in
> >> web pages, and so on.
> >
> > I don't think it's the same situation, because the CDN offloading is
> > likely to be used for large objects that some hosting sites like
> > GitHub, GitLab and BitBucket might not be ok to have them store for
> > free on their machines. (I think the current limitations are around
> > 10GB or 20GB, everything included, for each user.)
> >
> > So it's likely that users will want a way to host on such sites
> > incomplete repos using CDN offloading to a CDN on another site. And
> > then if the CDN is not accessible for some reason, things will
> > completely break when users clone.
> >
> > You could say that it's the same issue as when a video is not
> > available on a web page, but the web browser can still render the page
> > when a video is not available. So I don't think it's the same kind of
> > issue.
> >
> > And by the way that's a reason why I think it's important to think
> > about this in relation to promisor/partial clone remotes. Because with
> > them it's less of a big deal if the CDN is unavailable, temporarily or
> > not, for some reason.
>
> I think all of that's correct. E.g. you can imagine a CDN where the CDN
> serves literally one blob (not a pack), and the server the rest of the
> trees/commits/blobs.
>
> But for the purposes of reviewing this I think it's better to say that
> we're going to have a limited initial introduction of CDN where those
> more complex cases don't need to be handled.
>
> That can always be added later, as far as I can tell from the protocol
> alteration in the RFC nothing's closing the door on that, we could
> always add another capability / protocol extension.

Yeah, it doesn't close the door on further improvements. The issue,
though, is that it doesn't seem to have many benefits over implementing
things with many promisor remotes. The main benefit seems to be that the
secondary server locations are automatically configured. But when
looking at what can happen in the real world, this benefit seems more
like a drawback to me, as it potentially creates a lot of problems.

A solution, many promisor remotes, where:

- first secondary server URLs are manually specified on the client
side, and then
- some kind of negotiation, so that they can be automatically
selected, is implemented

seems better to me than a solution, CDN offloading, where:

- first the main server decides the secondary server URLs, and then
- we work around the cases where this creates problems

In the case of CDN offloading, it is likely that early client and
server implementations will create problems for many people as long as
most of the workarounds aren't implemented, while in the case of many
promisor remotes there is always the manual solution as long as the
automated selection doesn't work well enough.


Re: [WIP 7/7] upload-pack: send part of packfile response as uri

2019-02-28 Thread Jonathan Tan
> So we process the packfile URIs one by one as we receive them from the
> server? If we expect these packfiles to be large (otherwise why are we
> bothering to offload them to the CDN), is there a risk that the
> connection to the server might time out while we're downloading from the
> CDN?

You're right that this is undesirable - this is one of the things that I
will fix, as I mention in the cover letter ("starting CDN downloads...")
[1].

> Please take a look. Feel free to comment on anything, but I prefer
> comments on the major things first (e.g. my usage of a separate process
> (http-fetch) to fetch packfiles, since as far as I know, Git doesn't
> link to libcurl; any of the design decisions I described above). I know
> that there are some implementation details that could be improved (e.g.
> parallelization of the CDN downloads, starting CDN downloads *after*
> closing the first HTTP request, holding on to the .keep locks until
> after the refs are set), but will work on those once the overall design
> is more or less finalized.

[1] https://public-inbox.org/git/20190301000954.ga47...@google.com/


Re: [WIP 7/7] upload-pack: send part of packfile response as uri

2019-02-28 Thread Josh Steadmon
On 2019.02.23 15:39, Jonathan Tan wrote:
> Teach upload-pack to send part of its packfile response as URIs.
> 
> An administrator may configure a repository with one or more
> "uploadpack.blobpackfileuri" lines, each line containing an OID and a
> URI. A client may configure fetch.uriprotocols to be a comma-separated
> list of protocols that it is willing to use to fetch additional
> packfiles - this list will be sent to the server. Whenever an object
> with one of those OIDs would appear in the packfile transmitted by
> upload-pack, the server may exclude that object, and instead send the
> URI. The client will then download the packs referred to by those URIs
> before performing the connectivity check.
> 
> Signed-off-by: Jonathan Tan 
> ---
>  builtin/pack-objects.c | 63 ++
>  fetch-pack.c   | 58 +++
>  t/t5702-protocol-v2.sh | 54 +
>  upload-pack.c  | 78 ++
>  4 files changed, 247 insertions(+), 6 deletions(-)
> 
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index a9fac7c128..73309821e2 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -110,6 +110,8 @@ static unsigned long window_memory_limit = 0;
>  
>  static struct list_objects_filter_options filter_options;
>  
> +static struct string_list uri_protocols = STRING_LIST_INIT_NODUP;
> +
>  enum missing_action {
>   MA_ERROR = 0,  /* fail if any missing objects are encountered */
>   MA_ALLOW_ANY,  /* silently allow ALL missing objects */
> @@ -118,6 +120,14 @@ enum missing_action {
>  static enum missing_action arg_missing_action;
>  static show_object_fn fn_show_object;
>  
> +struct configured_exclusion {
> + struct oidmap_entry e;
> + char *uri;
> +};
> +static struct oidmap configured_exclusions;
> +
> +static struct oidset excluded_by_config;
> +
>  /*
>   * stats
>   */
> @@ -832,6 +842,23 @@ static off_t write_reused_pack(struct hashfile *f)
>   return reuse_packfile_offset - sizeof(struct pack_header);
>  }
>  
> +static void write_excluded_by_configs(void)
> +{
> + struct oidset_iter iter;
> + const struct object_id *oid;
> +
> + oidset_iter_init(&excluded_by_config, &iter);
> + while ((oid = oidset_iter_next(&iter))) {
> + struct configured_exclusion *ex =
> + oidmap_get(&configured_exclusions, oid);
> +
> + if (!ex)
> + BUG("configured exclusion wasn't configured");
> + write_in_full(1, ex->uri, strlen(ex->uri));
> + write_in_full(1, "\n", 1);
> + }
> +}
> +
>  static const char no_split_warning[] = N_(
>  "disabling bitmap writing, packs are split due to pack.packSizeLimit"
>  );
> @@ -1125,6 +1152,25 @@ static int want_object_in_pack(const struct object_id *oid,
>   }
>   }
>  
> + if (uri_protocols.nr) {
> + struct configured_exclusion *ex =
> + oidmap_get(&configured_exclusions, oid);
> + int i;
> + const char *p;
> +
> + if (ex) {
> + for (i = 0; i < uri_protocols.nr; i++) {
> + if (skip_prefix(ex->uri,
> + uri_protocols.items[i].string,
> + &p) &&
> + *p == ':') {
> + oidset_insert(&excluded_by_config, oid);
> + return 0;
> + }
> + }
> + }
> + }
> +
>   return 1;
>  }
>  
> @@ -2726,6 +2772,19 @@ static int git_pack_config(const char *k, const char *v, void *cb)
>   pack_idx_opts.version);
>   return 0;
>   }
> + if (!strcmp(k, "uploadpack.blobpackfileuri")) {
> + struct configured_exclusion *ex = xmalloc(sizeof(*ex));
> + const char *end;
> +
> + if (parse_oid_hex(v, &ex->e.oid, &end) || *end != ' ')
> + die(_("value of uploadpack.blobpackfileuri must be "
> +   "of the form '<oid> <uri>' (got '%s')"), v);
> + if (oidmap_get(&configured_exclusions, &ex->e.oid))
> + die(_("object already configured in another "
> +   "uploadpack.blobpackfileuri (got '%s')"), v);

Re: [WIP 0/7] CDN offloading of fetch response

2019-02-28 Thread Jonathan Nieder
Hi,

Sorry for the slow followup.  Thanks for probing into the design ---
this should be useful for getting the docs to be clear.

Christian Couder wrote:

> So it's likely that users will want a way to host on such sites
> incomplete repos using CDN offloading to a CDN on another site. And
> then if the CDN is not accessible for some reason, things will
> completely break when users clone.

I think this would be a broken setup --- we can make it clear in the
protocol and server docs that you should only point to a CDN for which
you control the contents, to avoid breaking clients.

That doesn't prevent adding additional features in the future e.g. for
"server suggested alternates" --- it's just that I consider that a
separate feature.

Using CDN offloading requires cooperation of the hosting provider.
It's a way to optimize how fetches work, not a way to have a partial
repository on the server side.

[...]
> On Tue, Feb 26, 2019 at 12:45 AM Jonathan Nieder  wrote:

>> This doesn't stop a hosting provider from using e.g. server options to
>> allow the client more control over how their response is served, just
>> like can be done for other features of how the transfer works (how
>> often to send progress updates, whether to prioritize latency or
>> throughput, etc).
>
> Could you give a more concrete example of what could be done?

What I mean is passing server options using "git fetch --server-option".
For example:

git fetch -o priority=BATCH origin master

or

git fetch -o avoid-cdn=badcdn.example.com origin master

The interpretation of server options is up to the server.

>> What the client *can* do is turn off support for packfile URLs in a
>> request completely.  This is required for backward compatibility and
>> allows working around a host that has configured the feature
>> incorrectly.
>
> If the full content of a repo is really large, the size of a full pack
> file sent by an initial clone could be really big and many client
> machines could not have enough memory to deal with that. And this
> > supposes that repo hosting providers would be ok to host very large
> repos in the first place.

Do we require the packfile to fit in memory?  If so, we should fix
that (to use streaming instead).

Thanks,
Jonathan


Re: [WIP 0/7] CDN offloading of fetch response

2019-02-26 Thread Ævar Arnfjörð Bjarmason


On Tue, Feb 26 2019, Christian Couder wrote:

> Hi,
>
> On Tue, Feb 26, 2019 at 12:45 AM Jonathan Nieder  wrote:
>>
>> Christian Couder wrote:
>> > On Sun, Feb 24, 2019 at 12:39 AM Jonathan Tan  wrote:
>>
>> > Especially I'd like to know what should the client do if they find out
>> > that for example a repo that contains a lot of large files is
>> > configured so that the large files should be fetched from a CDN that
>> > the client cannot use? Is the client forced to find or setup another
>> > repo configured differently if the client still wants to use CDN
>> > offloading?
>>
>> The example from that message:
>>
>>   For example I think the Great Firewall of China lets people in China
>>   use GitHub.com but not Google.com. So if people start configuring
>>   their repos on GitHub so that they send packs that contain Google.com
>>   CDN URLs (or actually anything that the Firewall blocks), it might
>>   create many problems for users in China if they don't have a way to
>>   opt out of receiving packs with those kind of URLs.
>>
>> But the same thing can happen with redirects, with embedded assets in
>> web pages, and so on.
>
> I don't think it's the same situation, because the CDN offloading is
> likely to be used for large objects that some hosting sites like
> GitHub, GitLab and BitBucket might not be ok to have them store for
> free on their machines. (I think the current limitations are around
> 10GB or 20GB, everything included, for each user.)
>
> So it's likely that users will want a way to host on such sites
> incomplete repos using CDN offloading to a CDN on another site. And
> then if the CDN is not accessible for some reason, things will
> > completely break when users clone.
>
> You could say that it's the same issue as when a video is not
> available on a web page, but the web browser can still render the page
> when a video is not available. So I don't think it's the same kind of
> issue.
>
> And by the way that's a reason why I think it's important to think
> about this in relation to promisor/partial clone remotes. Because with
> them it's less of a big deal if the CDN is unavailable, temporarily or
> not, for some reason.

I think all of that's correct. E.g. you can imagine a CDN where the CDN
serves literally one blob (not a pack), and the server the rest of the
trees/commits/blobs.

But for the purposes of reviewing this I think it's better to say that
we're going to have a limited initial introduction of CDN where those
more complex cases don't need to be handled.

That can always be added later, as far as I can tell from the protocol
alteration in the RFC nothing's closing the door on that, we could
always add another capability / protocol extension.


Re: [WIP 0/7] CDN offloading of fetch response

2019-02-26 Thread Christian Couder
Hi,

On Tue, Feb 26, 2019 at 12:45 AM Jonathan Nieder  wrote:
>
> Christian Couder wrote:
> > On Sun, Feb 24, 2019 at 12:39 AM Jonathan Tan  wrote:
>
> > Especially I'd like to know what should the client do if they find out
> > that for example a repo that contains a lot of large files is
> > configured so that the large files should be fetched from a CDN that
> > the client cannot use? Is the client forced to find or setup another
> > repo configured differently if the client still wants to use CDN
> > offloading?
>
> The example from that message:
>
>   For example I think the Great Firewall of China lets people in China
>   use GitHub.com but not Google.com. So if people start configuring
>   their repos on GitHub so that they send packs that contain Google.com
>   CDN URLs (or actually anything that the Firewall blocks), it might
>   create many problems for users in China if they don't have a way to
>   opt out of receiving packs with those kind of URLs.
>
> But the same thing can happen with redirects, with embedded assets in
> web pages, and so on.

I don't think it's the same situation, because the CDN offloading is
likely to be used for large objects that some hosting sites like
GitHub, GitLab and BitBucket might not be ok to have them store for
free on their machines. (I think the current limitations are around
10GB or 20GB, everything included, for each user.)

So it's likely that users will want a way to host on such sites
incomplete repos using CDN offloading to a CDN on another site. And
then if the CDN is not accessible for some reason, things will
completely break when users clone.

You could say that it's the same issue as when a video is not
available on a web page, but the web browser can still render the page
when a video is not available. So I don't think it's the same kind of
issue.

And by the way that's a reason why I think it's important to think
about this in relation to promisor/partial clone remotes. Because with
them it's less of a big deal if the CDN is unavailable, temporarily or
not, for some reason.

> I think in this situation the user would likely
> (and rightly) blame the host (github.com) for requiring access to a
> separate inaccessible site, and the problem could be resolved with
> them.

The host will say that it's repo admins' responsibility to use a CDN
that works for the repo users (or to pay for more space on the host).
Then repo admins will say that they use this CDN because it's simpler
for them or the only thing they can afford or deal with. (For example
I don't think it would be easy for westerners to use a Chinese CDN.)
Then users will likely blame Git for not supporting a way to use a
different CDN than the one configured in each repo.

> The beauty of this is that it's transparent to the client: the fact
> that packfile transfer was offloaded to a CDN is an implementation
> detail, and the server takes full responsibility for it.

Who is "the server" in real life? Are you sure they would be ok to
take full responsibility?

And yes, I agree that transparency for the client is nice. And if it's
really nice, then why not have it for promisor/partial clone remotes
too? But then do we really need duplicated functionality between
promisor remotes and CDN offloading?

And also I just think that in real life there needs to be an easy way
to override this transparency, and we already have that with promisor
remotes.

> This doesn't stop a hosting provider from using e.g. server options to
> allow the client more control over how their response is served, just
> like can be done for other features of how the transfer works (how
> often to send progress updates, whether to prioritize latency or
> throughput, etc).

Could you give a more concrete example of what could be done?

> What the client *can* do is turn off support for packfile URLs in a
> request completely.  This is required for backward compatibility and
> allows working around a host that has configured the feature
> incorrectly.

If the full content of a repo is really large, the size of a full pack
file sent by an initial clone could be really big, and many client
machines might not have enough memory to deal with that. And this
supposes that repo hosting providers would be ok with hosting very
large repos in the first place.

With promisor remotes, it's less of a problem if for example:

- a repo hosting provider is not ok with very large repos,
- a CDN is unavailable,
- a repo admin has not configured some repos very well.

Thanks for your answer,
Christian.


Re: [WIP 7/7] upload-pack: send part of packfile response as uri

2019-02-25 Thread Christian Couder
Hi,

On Tue, Feb 26, 2019 at 2:53 AM Jonathan Nieder  wrote:
>
> Christian Couder wrote:
> > On Sun, Feb 24, 2019 at 12:39 AM Jonathan Tan  wrote:
>
> > Ok, so each URI sent in the packfile corresponds to exactly one
> > object. And when the client fetches one such URI it gets a packfile
> > that contains only the corresponding object. Or is there something I
> > misunderstood?
>
> I think it's worth separating the protocol and the server
> implementation:
>
> The protocol allows arbitrary packfiles --- they do not have to be
> single-object packfiles.
>
> This patch for a server implementation only uses it to serve
> single-object packfiles.

Ok, this explains things much better.

Thanks,
Christian.


Re: [WIP 7/7] upload-pack: send part of packfile response as uri

2019-02-25 Thread Jonathan Nieder
Hi,

Christian Couder wrote:
> On Sun, Feb 24, 2019 at 12:39 AM Jonathan Tan  wrote:

>> Teach upload-pack to send part of its packfile response as URIs.
>>
>> An administrator may configure a repository with one or more
>> "uploadpack.blobpackfileuri" lines, each line containing an OID and a
>> URI. A client may configure fetch.uriprotocols to be a comma-separated
>> list of protocols that it is willing to use to fetch additional
>> packfiles - this list will be sent to the server.
>
> So only packfiles will be fetched by the client using those protocols.

Yes.

>> Whenever an object
>> with one of those OIDs would appear in the packfile transmitted by
>> upload-pack, the server may exclude that object, and instead send the
>> URI.
>
> Ok, so each URI sent in the packfile corresponds to exactly one
> object. And when the client fetches one such URI it gets a packfile
> that contains only the corresponding object. Or is there something I
> misunderstood?

I think it's worth separating the protocol and the server
implementation:

The protocol allows arbitrary packfiles --- they do not have to be
single-object packfiles.

This patch for a server implementation only uses it to serve
single-object packfiles.

Thanks,
Jonathan


Re: [WIP 0/7] CDN offloading of fetch response

2019-02-25 Thread Jonathan Nieder
Hi,

Christian Couder wrote:
> On Sun, Feb 24, 2019 at 12:39 AM Jonathan Tan  wrote:

>> There are probably some more design discussions to be had:
>
> [...]
>
>>  - Client-side whitelist of protocol and hostnames. I've implemented
>>whitelist of protocol, but not hostnames.
>
> I would appreciate a more complete answer to my comments in:
>
> https://public-inbox.org/git/CAP8UFD16fvtu_dg3S_J9BjGpxAYvgp8SXscdh=tjb5jvabz...@mail.gmail.com/
>
> Especially I'd like to know what should the client do if they find out
> that for example a repo that contains a lot of large files is
> configured so that the large files should be fetched from a CDN that
> the client cannot use? Is the client forced to find or setup another
> repo configured differently if the client still wants to use CDN
> offloading?

The example from that message:

  For example I think the Great Firewall of China lets people in China
  use GitHub.com but not Google.com. So if people start configuring
  their repos on GitHub so that they send packs that contain Google.com
  CDN URLs (or actually anything that the Firewall blocks), it might
  create many problems for users in China if they don't have a way to
  opt out of receiving packs with those kind of URLs.

But the same thing can happen with redirects, with embedded assets in
web pages, and so on.  I think in this situation the user would likely
(and rightly) blame the host (github.com) for requiring access to a
separate inaccessible site, and the problem could be resolved with
them.

The beauty of this is that it's transparent to the client: the fact
that packfile transfer was offloaded to a CDN is an implementation
detail, and the server takes full responsibility for it.

This doesn't stop a hosting provider from using e.g. server options to
allow the client more control over how their response is served, just
like can be done for other features of how the transfer works (how
often to send progress updates, whether to prioritize latency or
throughput, etc).

What the client *can* do is turn off support for packfile URLs in a
request completely.  This is required for backward compatibility and
allows working around a host that has configured the feature
incorrectly.

Thanks for an interesting example,
Jonathan


Re: [WIP 0/7] CDN offloading of fetch response

2019-02-25 Thread Christian Couder
On Sun, Feb 24, 2019 at 12:39 AM Jonathan Tan  wrote:

> There are probably some more design discussions to be had:

[...]

>  - Client-side whitelist of protocol and hostnames. I've implemented
>whitelist of protocol, but not hostnames.

I would appreciate a more complete answer to my comments in:

https://public-inbox.org/git/CAP8UFD16fvtu_dg3S_J9BjGpxAYvgp8SXscdh=tjb5jvabz...@mail.gmail.com/

Especially I'd like to know what should the client do if they find out
that for example a repo that contains a lot of large files is
configured so that the large files should be fetched from a CDN that
the client cannot use? Is the client forced to find or setup another
repo configured differently if the client still wants to use CDN
offloading?

Wouldn't it be better if the client could use the same repo and just
select a CDN configuration among many?


Re: [WIP 7/7] upload-pack: send part of packfile response as uri

2019-02-25 Thread Christian Couder
On Sun, Feb 24, 2019 at 12:39 AM Jonathan Tan  wrote:
>
> Teach upload-pack to send part of its packfile response as URIs.
>
> An administrator may configure a repository with one or more
> "uploadpack.blobpackfileuri" lines, each line containing an OID and a
> URI. A client may configure fetch.uriprotocols to be a comma-separated
> list of protocols that it is willing to use to fetch additional
> packfiles - this list will be sent to the server.

So only packfiles will be fetched by the client using those protocols.

> Whenever an object
> with one of those OIDs would appear in the packfile transmitted by
> upload-pack, the server may exclude that object, and instead send the
> URI.

Ok, so each URI sent in the packfile corresponds to exactly one
object. And when the client fetches one such URI it gets a packfile
that contains only the corresponding object. Or is there something I
misunderstood?

> The client will then download the packs referred to by those URIs
> before performing the connectivity check.
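A concrete sketch of the configuration described in the quoted commit
message, using the WIP patch's two-field '<OID> <URI>' format for
uploadpack.blobpackfileuri and the fetch.uriprotocols setting it
introduces. The OID here (the well-known empty-blob SHA-1) and the
URLs are only placeholders:

```shell
set -e
# Server side: map one object ID to a packfile URI that upload-pack
# may send instead of including the object in its packfile response.
server=$(mktemp -d)
git init -q "$server"
git -C "$server" config --add uploadpack.blobpackfileuri \
  "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 https://cdn.example.com/one-blob.pack"

# Client side: advertise which protocols we are willing to use for
# fetching the additional packfiles.
client=$(mktemp -d)
git init -q "$client"
git -C "$client" config fetch.uriprotocols https
git -C "$client" config --get fetch.uriprotocols   # prints "https"
```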


Re: [WIP 7/7] upload-pack: send part of packfile response as uri

2019-02-24 Thread Junio C Hamano
Jonathan Tan  writes:

> diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
> index db4ae09f2f..6dbe2e9584 100755
> --- a/t/t5702-protocol-v2.sh
> +++ b/t/t5702-protocol-v2.sh
> @@ -656,6 +656,60 @@ test_expect_success 'when server does not send "ready", expect FLUSH' '
>   test_i18ngrep "expected no other sections to be sent after no .ready." err
>  '
>  
> +test_expect_success 'part of packfile response provided as URI' '
> ...
> + if test "$(grep "^[0-9a-f]\{40\} " out | wc -l)" = 1
> + then

Against this, test-lint-shell-syntax barks.  You'd have seen it if
you did "make test".

I am not sure hard-coding 40 here is a good idea for longer term, as
we are *not* testing that the output from "verify-pack --verbose"
shows the full object name in SHA-1 hash.

Perhaps squash something like this in ("16" is of course negotiable)?

 t/t5702-protocol-v2.sh | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 6dbe2e9584..e9950f0853 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -694,7 +694,10 @@ test_expect_success 'part of packfile response provided as URI' '
for idx in http_child/.git/objects/pack/*.idx
do
git verify-pack --verbose $idx >out &&
-   if test "$(grep "^[0-9a-f]\{40\} " out | wc -l)" = 1
+   {
+   grep "^[0-9a-f]\{16,\} " out || :
+   } >out.objectlist &&
+   if test_line_count = 1 out.objectlist
then
if grep $(cat h) out
then


[WIP 7/7] upload-pack: send part of packfile response as uri

2019-02-23 Thread Jonathan Tan
Teach upload-pack to send part of its packfile response as URIs.

An administrator may configure a repository with one or more
"uploadpack.blobpackfileuri" lines, each line containing an OID and a
URI. A client may configure fetch.uriprotocols to be a comma-separated
list of protocols that it is willing to use to fetch additional
packfiles - this list will be sent to the server. Whenever an object
with one of those OIDs would appear in the packfile transmitted by
upload-pack, the server may exclude that object, and instead send the
URI. The client will then download the packs referred to by those URIs
before performing the connectivity check.

Signed-off-by: Jonathan Tan 
---
 builtin/pack-objects.c | 63 ++
 fetch-pack.c   | 58 +++
 t/t5702-protocol-v2.sh | 54 +
 upload-pack.c  | 78 ++
 4 files changed, 247 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index a9fac7c128..73309821e2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -110,6 +110,8 @@ static unsigned long window_memory_limit = 0;
 
 static struct list_objects_filter_options filter_options;
 
+static struct string_list uri_protocols = STRING_LIST_INIT_NODUP;
+
 enum missing_action {
MA_ERROR = 0,  /* fail if any missing objects are encountered */
MA_ALLOW_ANY,  /* silently allow ALL missing objects */
@@ -118,6 +120,14 @@ enum missing_action {
 static enum missing_action arg_missing_action;
 static show_object_fn fn_show_object;
 
+struct configured_exclusion {
+   struct oidmap_entry e;
+   char *uri;
+};
+static struct oidmap configured_exclusions;
+
+static struct oidset excluded_by_config;
+
 /*
  * stats
  */
@@ -832,6 +842,23 @@ static off_t write_reused_pack(struct hashfile *f)
return reuse_packfile_offset - sizeof(struct pack_header);
 }
 
+static void write_excluded_by_configs(void)
+{
+   struct oidset_iter iter;
+   const struct object_id *oid;
+
+   oidset_iter_init(&excluded_by_config, &iter);
+   while ((oid = oidset_iter_next(&iter))) {
+   struct configured_exclusion *ex =
+   oidmap_get(&configured_exclusions, oid);
+
+   if (!ex)
+   BUG("configured exclusion wasn't configured");
+   write_in_full(1, ex->uri, strlen(ex->uri));
+   write_in_full(1, "\n", 1);
+   }
+}
+
 static const char no_split_warning[] = N_(
 "disabling bitmap writing, packs are split due to pack.packSizeLimit"
 );
@@ -1125,6 +1152,25 @@ static int want_object_in_pack(const struct object_id *oid,
}
}
 
+   if (uri_protocols.nr) {
+   struct configured_exclusion *ex =
+   oidmap_get(&configured_exclusions, oid);
+   int i;
+   const char *p;
+
+   if (ex) {
+   for (i = 0; i < uri_protocols.nr; i++) {
+   if (skip_prefix(ex->uri,
+   uri_protocols.items[i].string,
+   &p) &&
+   *p == ':') {
+   oidset_insert(&excluded_by_config, oid);
+   return 0;
+   }
+   }
+   }
+   }
+
return 1;
 }
 
@@ -2726,6 +2772,19 @@ static int git_pack_config(const char *k, const char *v, void *cb)
pack_idx_opts.version);
return 0;
}
+   if (!strcmp(k, "uploadpack.blobpackfileuri")) {
+   struct configured_exclusion *ex = xmalloc(sizeof(*ex));
+   const char *end;
+
+   if (parse_oid_hex(v, &ex->e.oid, &end) || *end != ' ')
+   die(_("value of uploadpack.blobpackfileuri must be "
+ "of the form '<oid> <uri>' (got '%s')"), v);
+   if (oidmap_get(&configured_exclusions, &ex->e.oid))
+   die(_("object already configured in another "
+ "uploadpack.blobpackfileuri (got '%s')"), v);
+   ex->uri = xstrdup(end + 1);
+   oidmap_put(&configured_exclusions, ex);
+   }
return git_default_config(k, v, cb);
 }
 
@@ -3318,6 +3377,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 N_("do not pack objects in promisor packfiles")),
OPT_BOOL(0, "delta-islands", &use_delta_islands,
 N_("

[WIP 0/7] CDN offloading of fetch response

2019-02-23 Thread Jonathan Tan
It's been a while, so here is an updated version of what I previously
sent [1]. The main difference is that fetch-pack now actually downloads
whatever the server tells it to. The second main difference is that
we no longer buffer progress messages and suspend keepalives - we
no longer need to, because the sideband-all patches have been merged.

I think I've addressed all the technical comments in [1], except one
comment of Junio's [2] that I still don't understand:

> And the "fix" I was alluding to was to update that "else" clause to
> make it a no-op that keeps os->used non-zero, which would not call
> send-client-data.
>
> When that fix happens, the part that early in the function this
> patch added "now we know we will call send-client-data, so let's say
> 'here comes packdata' unless we have already said that" is making
> the decision too early.  Depending on the value of os->used when we
> enter the code and the number of bytes xread() reads from the
> upstream, we might not call send-client-data yet (namely, when we
> have no buffered data and we happened to get only one byte).

With or without this fix, I don't think there is ever a time when we
say "here comes packdata" without calling send-client-data - we say
"here comes packdata" only when we see the string "PACK", which forms
part of the packfile, and thus we should have no problem sending any
client data. Having said that, maybe this is a moot point - Junio says
that this only happens when the fix is implemented, and the fix is not
implemented.

There are probably some more design discussions to be had:

 - Requirement that all pointed-to CDNs support byte ranges for
   resumability, and that they guarantee the packfiles will stay there
   permanently. After some thought, I agree it is a good idea for CDNs
   to support that, but I think we should also support CDNs that can
   only give temporal guarantees (e.g. if/when we start implementing
   resumption, we could read the Cache headers). I didn't add any
   mention of this issue in the documentation.

 - Client-side whitelist of protocols and hostnames. I've implemented
   the protocol whitelist, but not the hostname one.

 - Any sort of follow-up fetch - for example, if the download from the
   CDN fails or if we allow the server to tell us of best-effort
   packfiles (but the client still must check and formulate the correct
   request to the server to fetch the rest). This protocol seems like a
   prerequisite to all those, and is independently useful, so maybe all
   of those can be future work.

Please take a look. Feel free to comment on anything, but I prefer
comments on the major things first (e.g. my usage of a separate process
(http-fetch) to fetch packfiles, since as far as I know, Git doesn't
link to libcurl; any of the design decisions I described above). I know
that there are some implementation details that could be improved (e.g.
parallelization of the CDN downloads, starting CDN downloads *after*
closing the first HTTP request, holding on to the .keep locks until
after the refs are set), but will work on those once the overall design
is more or less finalized.

Note that the first patch is exactly the same as one I've previously
sent [3].

[1] https://public-inbox.org/git/cover.1543879256.git.jonathanta...@google.com/
[2] https://public-inbox.org/git/xmqqmupi89ub@gitster-ct.c.googlers.com/
[3] https://public-inbox.org/git/20190221001447.124088-1-jonathanta...@google.com/

Jonathan Tan (7):
  http: use --stdin and --keep when downloading pack
  http: improve documentation of http_pack_request
  http-fetch: support fetching packfiles by URL
  Documentation: order protocol v2 sections
  Documentation: add Packfile URIs design doc
  upload-pack: refactor reading of pack-objects out
  upload-pack: send part of packfile response as uri

 Documentation/git-http-fetch.txt |   7 +-
 Documentation/technical/packfile-uri.txt |  79 
 Documentation/technical/protocol-v2.txt  |  22 ++--
 builtin/pack-objects.c   |  63 +
 fetch-pack.c |  58 +
 http-fetch.c |  65 --
 http-push.c  |   7 +-
 http-walker.c|   5 +-
 http.c   |  83 +++-
 http.h   |  26 +++-
 t/t5550-http-fetch-dumb.sh   |  18 +++
 t/t5702-protocol-v2.sh   |  54 
 upload-pack.c| 155 +--
 13 files changed, 542 insertions(+), 100 deletions(-)
 create mode 100644 Documentation/technical/packfile-uri.txt

-- 
2.19.0.271.gfe8321ec05.dirty



Re: [PATCH v3 0/4] Sideband the whole fetch v2 response

2019-01-17 Thread Junio C Hamano
Jonathan Tan  writes:

>  /*
> + * Inspects a multiplexed packet read from the remote. If this packet is a
> + * progress packet and thus should not be processed by the caller, returns 0.
> + * Otherwise, returns 1, releases scratch, and sets sideband_type.
>   *
> + * If this packet is SIDEBAND_PROTOCOL_ERROR, SIDEBAND_REMOTE_ERROR, or a
> + * progress packet, also prints a message to stderr.
> + *
> + * scratch must be a struct strbuf allocated by the caller. It is used to store
> + * progress messages split across multiple packets.
>   */
> +int demultiplex_sideband(const char *me, char *buf, int len,
> +  int die_on_error,
> +  struct strbuf *scratch,
> +  enum sideband_type *sideband_type);

OK.  That sounds like a sensible way to keep the leftover across
calls.

Will queue.


[PATCH v3 3/4] {fetch,upload}-pack: sideband v2 fetch response

2019-01-16 Thread Jonathan Tan
Currently, a response to a fetch request has sideband support only while
the packfile is being sent, meaning that the server cannot send notices
until the start of the packfile.

Extend sideband support in protocol v2 fetch responses to the whole
response. upload-pack will advertise it if the
uploadpack.allowsidebandall configuration variable is set, and
fetch-pack will automatically request it if advertised.

If the sideband is to be used throughout the whole response, upload-pack
will use it to send errors instead of prefixing a PKT-LINE payload with
"ERR ".

This will be tested in a subsequent patch.
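For illustration, opting a server into this is a one-line config change (the
variable name is the one introduced above; treat the snippet as a sketch):

```shell
# Server side: advertise 'sideband-all' in the protocol v2 'fetch'
# capability listing.
git config uploadpack.allowsidebandall true

# Client side: nothing to configure; fetch-pack requests sideband-all
# automatically whenever the server advertises it.
```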

Signed-off-by: Jonathan Tan 
Signed-off-by: Junio C Hamano 
---
 Documentation/technical/protocol-v2.txt | 10 ++
 fetch-pack.c| 12 +--
 pkt-line.c  | 43 ++---
 pkt-line.h  |  4 +++
 sideband.c  |  5 +++
 sideband.h  |  1 +
 upload-pack.c   | 16 +
 7 files changed, 78 insertions(+), 13 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 09e4e0273f..39b40c0dc1 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -307,6 +307,16 @@ the 'wanted-refs' section in the server's response as explained below.
particular ref, where  is the full name of a ref on the
server.
 
+If the 'sideband-all' feature is advertised, the following argument can be
+included in the client's request:
+
+sideband-all
+   Instruct the server to send the whole response multiplexed, not just
+   the packfile section. All non-flush and non-delim PKT-LINE in the
+   response (not only in the packfile section) will then start with a byte
+   indicating its sideband (1, 2, or 3), and the server may send "0005\2"
+   (a PKT-LINE of sideband 2 with no payload) as a keepalive packet.
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
diff --git a/fetch-pack.c b/fetch-pack.c
index ee0847e6ae..86e9e18901 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1090,7 +1090,8 @@ static int add_haves(struct fetch_negotiator *negotiator,
 static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
  const struct fetch_pack_args *args,
  const struct ref *wants, struct oidset *common,
- int *haves_to_send, int *in_vain)
+ int *haves_to_send, int *in_vain,
+ int sideband_all)
 {
int ret = 0;
struct strbuf req_buf = STRBUF_INIT;
@@ -1116,6 +1117,8 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
packet_buf_write(&req_buf, "include-tag");
if (prefer_ofs_delta)
packet_buf_write(&req_buf, "ofs-delta");
+   if (sideband_all)
+   packet_buf_write(&req_buf, "sideband-all");
 
/* Add shallow-info and deepen request */
if (server_supports_feature("fetch", "shallow", 0))
@@ -1324,6 +1327,10 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
packet_reader_init(&reader, fd[0], NULL, 0,
   PACKET_READ_CHOMP_NEWLINE |
   PACKET_READ_DIE_ON_ERR_PACKET);
+   if (server_supports_feature("fetch", "sideband-all", 0)) {
+   reader.use_sideband = 1;
+   reader.me = "fetch-pack";
+   }
 
while (state != FETCH_DONE) {
switch (state) {
@@ -1357,7 +1364,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
case FETCH_SEND_REQUEST:
if (send_fetch_request(&negotiator, fd[1], args, ref,
   &common,
-  &haves_to_send, &in_vain))
+  &haves_to_send, &in_vain,
+  reader.use_sideband))
state = FETCH_GET_PACK;
else
state = FETCH_PROCESS_ACKS;
diff --git a/pkt-line.c b/pkt-line.c
index 321ff632a5..d4b71d3e82 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -449,7 +449,7 @@ int recv_sideband(const char *me, int in_stream, int out)
while (1) {
len = packet_read(in_stream, NULL, NULL, buf, LARGE_PACKET_MAX,
  0);
-   if (!demultiplex_sideband(me, buf, len, &scratch,
+   if (!demu

[PATCH v3 0/4] Sideband the whole fetch v2 response

2019-01-16 Thread Jonathan Tan
Like previous versions, this is on origin/ms/packet-err-check.

First of all, thanks to those who noticed the issue with t5409. I have
merged sg/stress-test and ran:

make DEVELOPER=1 && (cd t && sh ./t5409-col*.sh --stress)

with no issues.

demultiplex_sideband() now requires its caller to maintain a strbuf,
since progress messages can be split across packets. I have also changed
the API of demultiplex_sideband() to hopefully make things clearer.

As for keepalive packets, I think it's best if we state that the remote
should use 0005\2 instead of 0005\1, and have made the change
accordingly. This does not need to be treated specially by the client.
(I shouldn't have a problem updating my CDN offloading patches [1] to
use 0005\2 before the packfile starts and 0005\1 after the packfile
starts.)
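To make the "0005\2" notation concrete, here is a minimal shell sketch (mine,
not from the series) of the pkt-line layout being discussed: four hex digits
of length that count themselves, then a sideband designator byte, then the
payload — so a keepalive is a packet whose only content is the designator.

```shell
# A band-2 packet carrying "hello": 4 (header) + 1 (band) + 5 = 10 = 0x000a.
pkt=$(printf '000a\002hello')
len=$(printf '%d' "0x$(printf '%s' "$pkt" | cut -c1-4)")        # hex length
band=$(printf '%s' "$pkt" | cut -c5 | od -An -tu1 | tr -d ' \n') # designator
payload=$(printf '%s' "$pkt" | cut -c6-"$len")
echo "len=$len band=$band payload=$payload"

# The keepalive is just the length plus the band byte, nothing else:
keepalive=$(printf '0005\002')
echo "keepalive bytes=${#keepalive}"
```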

[1] Patches 5-8 of https://public-inbox.org/git/cover.1547244620.git.jonathanta...@google.com/

Jonathan Tan (4):
  pkt-line: introduce struct packet_writer
  sideband: reverse its dependency on pkt-line
  {fetch,upload}-pack: sideband v2 fetch response
  tests: define GIT_TEST_SIDEBAND_ALL

 Documentation/technical/protocol-v2.txt |  10 ++
 fetch-pack.c|  13 +-
 pkt-line.c  | 107 --
 pkt-line.h  |  34 +
 sideband.c  | 178 
 sideband.h  |  25 +++-
 t/README|   5 +
 t/lib-httpd/apache.conf |   1 +
 t/t5537-fetch-shallow.sh|   3 +-
 t/t5701-git-serve.sh|   2 +-
 t/t5702-protocol-v2.sh  |   4 +-
 upload-pack.c   | 131 ++---
 12 files changed, 346 insertions(+), 167 deletions(-)

Interdiff against v2:
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 1b0633f59f..39b40c0dc1 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -314,8 +314,8 @@ included in the client's request:
Instruct the server to send the whole response multiplexed, not just
the packfile section. All non-flush and non-delim PKT-LINE in the
response (not only in the packfile section) will then start with a byte
-   indicating its sideband (1, 2, or 3), and the server may send "0005\1"
-   (a PKT-LINE of sideband 1 with no payload) as a keepalive packet.
+   indicating its sideband (1, 2, or 3), and the server may send "0005\2"
+   (a PKT-LINE of sideband 2 with no payload) as a keepalive packet.
 
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
diff --git a/pkt-line.c b/pkt-line.c
index 69162f3990..d4b71d3e82 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -442,21 +442,22 @@ ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out)
 int recv_sideband(const char *me, int in_stream, int out)
 {
char buf[LARGE_PACKET_MAX + 1];
-   int retval = 0;
int len;
+   struct strbuf scratch = STRBUF_INIT;
+   enum sideband_type sideband_type;
 
while (1) {
-   len = packet_read(in_stream, NULL, NULL, buf, LARGE_PACKET_MAX, 0);
-   retval = demultiplex_sideband(me, buf, len, 0);
-   switch (retval) {
+   len = packet_read(in_stream, NULL, NULL, buf, LARGE_PACKET_MAX,
+ 0);
+   if (!demultiplex_sideband(me, buf, len, 0, &scratch,
+ &sideband_type))
+   continue;
+   switch (sideband_type) {
case SIDEBAND_PRIMARY:
write_or_die(out, buf + 1, len - 1);
break;
-   case SIDEBAND_PROGRESS:
-   /* already written by demultiplex_sideband() */
-   break;
default: /* errors: message already written */
-   return retval;
+   return sideband_type;
}
}
 }
@@ -479,17 +480,19 @@ void packet_reader_init(struct packet_reader *reader, int fd,
 
 enum packet_read_status packet_reader_read(struct packet_reader *reader)
 {
+   struct strbuf scratch = STRBUF_INIT;
+
if (reader->line_peeked) {
reader->line_peeked = 0;
return reader->status;
}
 
/*
-* Consume all progress and keepalive packets until a primary payload
-* packet is received
+* Consume all progress packets until a primary payload packet is
+* received
 */
while (1) {
-   int retval;
+   enum sideband_type sideband_type;

Re: [PATCH v2 0/4] Sideband the whole fetch v2 response

2019-01-15 Thread Junio C Hamano
Jonathan Tan  writes:

> We could make the caller of demultiplex_sideband() store the outbuf, but
> at this point, it might be better to refactor packet_reader into its own
> file and have it depend on both pkt-line.h and sideband.h.

I do not know or too deeply care about the pkt-line / sideband
distinction, as they are logically one thing.  With a packet-reader
abstraction in place, doesn't this "here is the remainder of the
packet from the  previous round" belong to each reader instance?


Re: [PATCH v2 0/4] Sideband the whole fetch v2 response

2019-01-15 Thread Jonathan Tan
> > Ah...yes, you're right. I forgot to build before running the tests. I'll
> > take a look.
> 
> Thanks.

Thanks once again for taking a look. It turns out that it's because
progress messages are sometimes split across PKT-LINEs depending on your
luck, and we need to retain the "leftover" on a \2 sideband in order to
combine it with the next one if necessary. So, for example, the
following fixup works:

diff --git a/sideband.c b/sideband.c
index c185c38637..d5da587d68 100644
--- a/sideband.c
+++ b/sideband.c
@@ -117,7 +117,7 @@ static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
 int demultiplex_sideband(const char *me, char *buf, int len, int die_on_error)
 {
static const char *suffix;
-   struct strbuf outbuf = STRBUF_INIT;
+   static struct strbuf outbuf = STRBUF_INIT;
int retval = 0;
const char *b, *brk;
int band;
@@ -187,8 +187,7 @@ int demultiplex_sideband(const char *me, char *buf, int len, int die_on_error)
"" : DISPLAY_PREFIX);
maybe_colorize_sideband(&outbuf, b, strlen(b));
}
-   retval = SIDEBAND_PROGRESS;
-   break;
+   return SIDEBAND_PROGRESS; /* skip cleanup */
case 1:
retval = SIDEBAND_PRIMARY;
break;

We could make the caller of demultiplex_sideband() store the outbuf, but
at this point, it might be better to refactor packet_reader into its own
file and have it depend on both pkt-line.h and sideband.h. If you (or
anyone else) have any ideas, let me know what you think. I'll think
further about this too.
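The problem can be shown with a few lines of shell standing in for the strbuf
logic (an illustrative sketch, not the C code): the scratch buffer has to
survive between calls, because one progress line may span two band-2 payloads.

```shell
nl=$(printf '\nX'); nl=${nl%X}       # a literal newline in a variable
scratch=""                           # persists across packets, like the strbuf
complete=""
consume() {
    scratch="$scratch$1"
    case $scratch in
    *"$nl")                          # terminator seen: the line is complete
        complete="$complete$scratch"
        scratch=""
        ;;
    esac                             # otherwise keep the partial line buffered
}
consume 'Resolving deltas:  50%'     # first packet ends mid-line
consume " (2/4)$nl"                  # second packet completes it
printf '%s' "$complete"
```

Making the buffer static (as in the fixup above) or caller-owned both give it
that persistence; the series went with the caller-owned strbuf.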


Re: [PATCH v2 0/4] Sideband the whole fetch v2 response

2019-01-15 Thread Junio C Hamano
Jonathan Tan  writes:

>> Jonathan Tan  writes:
>> 
>> >> Jonathan Tan  writes:
>> >> 
>> >> > Like v1, this is on origin/ms/packet-err-check.
>> >> 
>> >> By the way, when merged to 'pu' as one of the earlier topic, t5409
>> >> starts to fail under --stress.
>> >> 
>> >>   $ git checkout 'origin/pu^{/^Merge branch .jt/fetch-v2-sideband}'
>> >>   $ make
>> >>   $ cd t && sh ./t5409-col*.sh --stress
>> >> 
>> >> This is not new to this round; v1 exhibited the same symptom.
>> >> 
>> >> Thanks.
>> >
>> > Thanks for checking. I don't think this branch is the cause of this
>> > issue, though. I ran the same stress test on both:
>> >
>> >  - f3035d003e ("Merge branch 'sg/stress-test' into jch", 2019-01-14) and
>> >  - the result of merging sg/stress-test into master,
>> >
>> > and the test fails with the same result.
>> 
>> Interesting.  That is not what I am seeing (as I manually bisected
>> the first-parent chain between f3035d003e and the tip of pu).
>
> Ah...yes, you're right. I forgot to build before running the tests. I'll
> take a look.

Thanks.


Re: [PATCH v2 0/4] Sideband the whole fetch v2 response

2019-01-15 Thread Jonathan Tan
> Jonathan Tan  writes:
> 
> >> Jonathan Tan  writes:
> >> 
> >> > Like v1, this is on origin/ms/packet-err-check.
> >> 
> >> By the way, when merged to 'pu' as one of the earlier topic, t5409
> >> starts to fail under --stress.
> >> 
> >>$ git checkout 'origin/pu^{/^Merge branch .jt/fetch-v2-sideband}'
> >>$ make
> >>$ cd t && sh ./t5409-col*.sh --stress
> >> 
> >> This is not new to this round; v1 exhibited the same symptom.
> >> 
> >> Thanks.
> >
> > Thanks for checking. I don't think this branch is the cause of this
> > issue, though. I ran the same stress test on both:
> >
> >  - f3035d003e ("Merge branch 'sg/stress-test' into jch", 2019-01-14) and
> >  - the result of merging sg/stress-test into master,
> >
> > and the test fails with the same result.
> 
> Interesting.  That is not what I am seeing (as I manually bisected
> the first-parent chain between f3035d003e and the tip of pu).

Ah...yes, you're right. I forgot to build before running the tests. I'll
take a look.


Re: [PATCH v2 0/4] Sideband the whole fetch v2 response

2019-01-15 Thread Junio C Hamano
Jonathan Tan  writes:

>> Jonathan Tan  writes:
>> 
>> > Like v1, this is on origin/ms/packet-err-check.
>> 
>> By the way, when merged to 'pu' as one of the earlier topic, t5409
>> starts to fail under --stress.
>> 
>>  $ git checkout 'origin/pu^{/^Merge branch .jt/fetch-v2-sideband}'
>>  $ make
>>  $ cd t && sh ./t5409-col*.sh --stress
>> 
>> This is not new to this round; v1 exhibited the same symptom.
>> 
>> Thanks.
>
> Thanks for checking. I don't think this branch is the cause of this
> issue, though. I ran the same stress test on both:
>
>  - f3035d003e ("Merge branch 'sg/stress-test' into jch", 2019-01-14) and
>  - the result of merging sg/stress-test into master,
>
> and the test fails with the same result.

Interesting.  That is not what I am seeing (as I manually bisected
the first-parent chain between f3035d003e and the tip of pu).


Re: [PATCH v2 0/4] Sideband the whole fetch v2 response

2019-01-15 Thread Jonathan Tan
> Jonathan Tan  writes:
> 
> > Like v1, this is on origin/ms/packet-err-check.
> 
> By the way, when merged to 'pu' as one of the earlier topic, t5409
> starts to fail under --stress.
> 
>   $ git checkout 'origin/pu^{/^Merge branch .jt/fetch-v2-sideband}'
>   $ make
>   $ cd t && sh ./t5409-col*.sh --stress
> 
> This is not new to this round; v1 exhibited the same symptom.
> 
> Thanks.

Thanks for checking. I don't think this branch is the cause of this
issue, though. I ran the same stress test on both:

 - f3035d003e ("Merge branch 'sg/stress-test' into jch", 2019-01-14) and
 - the result of merging sg/stress-test into master,

and the test fails with the same result.


Re: [PATCH v2 0/4] Sideband the whole fetch v2 response

2019-01-15 Thread Junio C Hamano
Jonathan Tan  writes:

> Like v1, this is on origin/ms/packet-err-check.

By the way, when merged to 'pu' as one of the earlier topic, t5409
starts to fail under --stress.

$ git checkout 'origin/pu^{/^Merge branch .jt/fetch-v2-sideband}'
$ make
$ cd t && sh ./t5409-col*.sh --stress

This is not new to this round; v1 exhibited the same symptom.

Thanks.




Re: [PATCH v2 0/4] Sideband the whole fetch v2 response

2019-01-15 Thread Junio C Hamano
Jonathan Tan  writes:

> @@ -474,6 +474,7 @@ void packet_reader_init(struct packet_reader *reader, int fd,
>   reader->buffer = packet_buffer;
>   reader->buffer_size = sizeof(packet_buffer);
>   reader->options = options;
> + reader->me = "git";
>  }

This was somewhat unexpected.  I would have thought that an
interdiff would be more like

+   reader.me = "fetch-pack";
    if (using sideband-all) {
        reader.use_sideband = 1;
-       reader.me = "fetch-pack";
    }

> + case SIDEBAND_PRIMARY:
> + if (reader->pktlen != 1)
> + goto nonprogress_received;
> + /*
> +  * Since the packet contains nothing but the sideband
> +  * designator, this is a keepalive packet. Wait for the
> +  * next one.
> +  */
> + break;
> + default: /* SIDEBAND_PROGRESS */
> + ;

OK.

Will replace.  Thanks.


[PATCH v2 0/4] Sideband the whole fetch v2 response

2019-01-15 Thread Jonathan Tan
Like v1, this is on origin/ms/packet-err-check.

Thanks, Junio, for your reviews.

Jonathan Tan (4):
  pkt-line: introduce struct packet_writer
  sideband: reverse its dependency on pkt-line
  {fetch,upload}-pack: sideband v2 fetch response
  tests: define GIT_TEST_SIDEBAND_ALL

 Documentation/technical/protocol-v2.txt |  10 ++
 fetch-pack.c|  13 +-
 pkt-line.c  | 121 +++---
 pkt-line.h  |  34 +
 sideband.c  | 161 
 sideband.h  |  14 ++-
 t/README|   5 +
 t/lib-httpd/apache.conf |   1 +
 t/t5537-fetch-shallow.sh|   3 +-
 t/t5701-git-serve.sh|   2 +-
 t/t5702-protocol-v2.sh  |   4 +-
 upload-pack.c   | 131 +++
 12 files changed, 343 insertions(+), 156 deletions(-)

Interdiff against v1:
diff --git a/pkt-line.c b/pkt-line.c
index 71b17e6b93..69162f3990 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -447,16 +447,16 @@ int recv_sideband(const char *me, int in_stream, int out)
 
while (1) {
len = packet_read(in_stream, NULL, NULL, buf, LARGE_PACKET_MAX, 0);
-   retval = diagnose_sideband(me, buf, len, 0);
+   retval = demultiplex_sideband(me, buf, len, 0);
switch (retval) {
-   case SIDEBAND_PRIMARY:
-   write_or_die(out, buf + 1, len - 1);
-   break;
-   case SIDEBAND_PROGRESS:
-   /* already written by diagnose_sideband() */
-   break;
-   default: /* flush or error */
-   return retval;
+   case SIDEBAND_PRIMARY:
+   write_or_die(out, buf + 1, len - 1);
+   break;
+   case SIDEBAND_PROGRESS:
+   /* already written by demultiplex_sideband() */
+   break;
+   default: /* errors: message already written */
+   return retval;
}
}
 }
@@ -474,6 +474,7 @@ void packet_reader_init(struct packet_reader *reader, int fd,
reader->buffer = packet_buffer;
reader->buffer_size = sizeof(packet_buffer);
reader->options = options;
+   reader->me = "git";
 }
 
 enum packet_read_status packet_reader_read(struct packet_reader *reader)
@@ -483,6 +484,10 @@ enum packet_read_status packet_reader_read(struct packet_reader *reader)
return reader->status;
}
 
+   /*
+* Consume all progress and keepalive packets until a primary payload
+* packet is received
+*/
while (1) {
int retval;
reader->status = packet_read_with_status(reader->fd,
@@ -494,29 +499,31 @@ enum packet_read_status packet_reader_read(struct packet_reader *reader)
 reader->options);
if (!reader->use_sideband)
break;
-   retval = diagnose_sideband(reader->me, reader->buffer,
-  reader->pktlen, 1);
+   retval = demultiplex_sideband(reader->me, reader->buffer,
+ reader->pktlen, 1);
switch (retval) {
-   case SIDEBAND_PROTOCOL_ERROR:
-   case SIDEBAND_REMOTE_ERROR:
-   BUG("should have died in diagnose_sideband");
-   case SIDEBAND_FLUSH:
-   goto nonprogress;
-   case SIDEBAND_PRIMARY:
-   if (reader->pktlen != 1)
-   goto nonprogress;
-   /*
-* Since pktlen is 1, this is a keepalive
-* packet. Wait for the next one.
-*/
-   break;
-   default: /* SIDEBAND_PROGRESS */
-   ;
+   case SIDEBAND_PROTOCOL_ERROR:
+   case SIDEBAND_REMOTE_ERROR:
+   BUG("should have died in diagnose_sideband");
+   case SIDEBAND_FLUSH:
+   goto nonprogress_received;
+   case SIDEBAND_PRIMARY:
+   if (reader->pktlen != 1)
+   goto nonprogress_received;
+   /*
+* Since the packet contains nothing but the sideband
+* designa

[PATCH v2 3/4] {fetch,upload}-pack: sideband v2 fetch response

2019-01-15 Thread Jonathan Tan
Currently, a response to a fetch request has sideband support only while
the packfile is being sent, meaning that the server cannot send notices
until the start of the packfile.

Extend sideband support in protocol v2 fetch responses to the whole
response. upload-pack will advertise it if the
uploadpack.allowsidebandall configuration variable is set, and
fetch-pack will automatically request it if advertised.

If the sideband is to be used throughout the whole response, upload-pack
will use it to send errors instead of prefixing a PKT-LINE payload with
"ERR ".

This will be tested in a subsequent patch.

Signed-off-by: Jonathan Tan 
---
 Documentation/technical/protocol-v2.txt | 10 +
 fetch-pack.c| 12 -
 pkt-line.c  | 58 -
 pkt-line.h  |  4 ++
 sideband.c  |  7 ++-
 sideband.h  |  2 +-
 upload-pack.c   | 16 +++
 7 files changed, 94 insertions(+), 15 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 09e4e0273f..1b0633f59f 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -307,6 +307,16 @@ the 'wanted-refs' section in the server's response as explained below.
particular ref, where  is the full name of a ref on the
server.
 
+If the 'sideband-all' feature is advertised, the following argument can be
+included in the client's request:
+
+sideband-all
+   Instruct the server to send the whole response multiplexed, not just
+   the packfile section. All non-flush and non-delim PKT-LINE in the
+   response (not only in the packfile section) will then start with a byte
+   indicating its sideband (1, 2, or 3), and the server may send "0005\1"
+   (a PKT-LINE of sideband 1 with no payload) as a keepalive packet.
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
diff --git a/fetch-pack.c b/fetch-pack.c
index 3f9626dbf7..b89199891d 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1125,7 +1125,8 @@ static int add_haves(struct fetch_negotiator *negotiator,
 static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
  const struct fetch_pack_args *args,
  const struct ref *wants, struct oidset *common,
- int *haves_to_send, int *in_vain)
+ int *haves_to_send, int *in_vain,
+ int sideband_all)
 {
int ret = 0;
struct strbuf req_buf = STRBUF_INIT;
@@ -1151,6 +1152,8 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
packet_buf_write(&req_buf, "include-tag");
if (prefer_ofs_delta)
packet_buf_write(&req_buf, "ofs-delta");
+   if (sideband_all)
+   packet_buf_write(&req_buf, "sideband-all");
 
/* Add shallow-info and deepen request */
if (server_supports_feature("fetch", "shallow", 0))
@@ -1359,6 +1362,10 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
packet_reader_init(&reader, fd[0], NULL, 0,
   PACKET_READ_CHOMP_NEWLINE |
   PACKET_READ_DIE_ON_ERR_PACKET);
+   if (server_supports_feature("fetch", "sideband-all", 0)) {
+   reader.use_sideband = 1;
+   reader.me = "fetch-pack";
+   }
 
while (state != FETCH_DONE) {
switch (state) {
@@ -1392,7 +1399,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
case FETCH_SEND_REQUEST:
if (send_fetch_request(&negotiator, fd[1], args, ref,
   &common,
-  &haves_to_send, &in_vain))
+  &haves_to_send, &in_vain,
+  reader.use_sideband))
state = FETCH_GET_PACK;
else
state = FETCH_PROCESS_ACKS;
diff --git a/pkt-line.c b/pkt-line.c
index 5feb73ebec..69162f3990 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -447,7 +447,7 @@ int recv_sideband(const char *me, int in_stream, int out)
 
while (1) {
len = packet_read(in_stream, NULL, NULL, buf, LARGE_PACKET_MAX, 0);
-   retval = demultiplex_sideband(me, buf, len);
+   retval = demultiplex_sideband(me, buf, len, 0);
switch (retval) {
  

Re: [PATCH 3/4] {fetch,upload}-pack: sideband v2 fetch response

2019-01-15 Thread Jonathan Tan
> OK, so the "fetch" side enables sideband-all any time it knows that
> the other side supports it.
> 
> It feels a bit strange at the logical level that reader.me is set
> only when we are going to talk over "sideband-all".  I know that at
> the implementation level, reader.me is only looked at when sideband
> is use, but it still feels somewhat funny.  Future developers may
> want to use reader.me to identify the entity that found some error
> during a read, even when the sideband is not yet in use, and at that
> point, uninitialized .me would be a source of an error.  IOW, that
> assignment smells like a timebomb waiting to go off.

I think the best solution here is to initialize .me to "git", like how
we do it for trace. I've done that for now.

> > +   switch (retval) {
> > +   case SIDEBAND_PROTOCOL_ERROR:
> 
> Dedent (see previous step).

Done.

> > +   case SIDEBAND_PRIMARY:
> > +   if (reader->pktlen != 1)
> > +   goto nonprogress;
> > +   /*
> > +* Since pktlen is 1, this is a keepalive
> > +* packet. Wait for the next one.
> > +*/
> 
> What guarantees that a normal payload packet is never of length==1?

This is indeed assuming that we currently don't send 0004 (the
equivalent of 0005\1 without the sideband). I chose this because it is
currently what we use in create_pack_file() in upload-pack.c, but I'm OK
with alternatives (e.g. if we use 0005\2 instead).

> > +   break;
> > +   default: /* SIDEBAND_PROGRESS */
> > +   ;
> > +   }
> > +   }
> >  
> > +nonprogress:
> 
> It is totally unclear why this label is called nonprogress.  Is it
> that the primary purpose of the while loop above is to spin and eat
> the keep-alive packets from the other side?  If so, then it sort-of
> makes sense (i.e. "we are looking for progress-meter related
> packets, but now we found something else, so let's leave the loop").
> 
> It would have greatly helped reviewers to have a comment in front of
> that infinite loop, perhaps like
> 
>   /*
>* Spin and consume the keep-alive packets
>*/

[snip]

Yes, it's meant to spin and consume the progress and keepalive packets.
I've added a comment at the top of the loop and renamed the jump label
to "nonprogress_received".

> > if (reader->status == PACKET_READ_NORMAL)
> > -   reader->line = reader->buffer;
> > +   reader->line = reader->use_sideband ?
> > +   reader->buffer + 1 : reader->buffer;
> 
> Is the one byte the sideband designator?

Yes. Added comment to clarify that.

> >  void packet_writer_write(struct packet_writer *writer, const char *fmt, 
> > ...)
> > @@ -521,7 +548,8 @@ void packet_writer_write(struct packet_writer *writer, 
> > const char *fmt, ...)
> > va_list args;
> >  
> > va_start(args, fmt);
> > -   packet_write_fmt_1(writer->dest_fd, 0, "", fmt, args);
> > +   packet_write_fmt_1(writer->dest_fd, 0,
> > +  writer->use_sideband ? "\1" : "", fmt, args);
> > va_end(args);
> >  }
> 
> As I am superstitious, I'd prefer to see octal literals fully
> spelled out as 3-digit sequences, i.e. "\001".  Likewise for "\003".

Done.

> > @@ -530,7 +558,8 @@ void packet_writer_error(struct packet_writer *writer, 
> > const char *fmt, ...)
> > va_list args;
> >  
> > va_start(args, fmt);
> > -   packet_write_fmt_1(writer->dest_fd, 0, "ERR ", fmt, args);
> > +   packet_write_fmt_1(writer->dest_fd, 0,
> > +  writer->use_sideband ? "\3" : "ERR ", fmt, args);
> 
> OK, band#3 is an error message for emergency exit.  It is a bit
> curious that this patch does not show any line from the existing
> code in the context that reacts to the ERR string, but that is
> because the errors are noticed at a lot lower level when the
> sideband is in use than the code that currently checks for errors?

Yes - the branch that this patch set is on (ms/packet-err-check) handles
this. I have taken care in this patch to ensure that the error message
printed matches what Masaya used, so that in a future patch, when we
force sideband-all (using GIT_TEST_SIDEBAND_ALL), the same message is
printed so that the existing tests still pass.


Re: [PATCH 3/4] {fetch,upload}-pack: sideband v2 fetch response

2019-01-14 Thread Junio C Hamano
Jonathan Tan  writes:

> Currently, a response to a fetch request has sideband support only while
> the packfile is being sent, meaning that the server cannot send notices
> until the start of the packfile.
>
> Extend sideband support in protocol v2 fetch responses to the whole
> response. upload-pack will advertise it if the
> uploadpack.allowsidebandall configuration variable is set, and
> fetch-pack will automatically request it if advertised.
>
> If the sideband is to be used throughout the whole response, upload-pack
> will use it to send errors instead of prefixing a PKT-LINE payload with
> "ERR ".

Makes sense.

> diff --git a/fetch-pack.c b/fetch-pack.c
> index 3f9626dbf7..b89199891d 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1125,7 +1125,8 @@ static int add_haves(struct fetch_negotiator 
> *negotiator,
>  static int send_fetch_request(struct fetch_negotiator *negotiator, int 
> fd_out,
> const struct fetch_pack_args *args,
> const struct ref *wants, struct oidset *common,
> -   int *haves_to_send, int *in_vain)
> +   int *haves_to_send, int *in_vain,
> +   int sideband_all)
>  {
>   int ret = 0;
>   struct strbuf req_buf = STRBUF_INIT;
> @@ -1151,6 +1152,8 @@ static int send_fetch_request(struct fetch_negotiator 
> *negotiator, int fd_out,
>   packet_buf_write(&req_buf, "include-tag");
>   if (prefer_ofs_delta)
>   packet_buf_write(&req_buf, "ofs-delta");
> + if (sideband_all)
> + packet_buf_write(&req_buf, "sideband-all");
>  
>   /* Add shallow-info and deepen request */
>   if (server_supports_feature("fetch", "shallow", 0))
> @@ -1359,6 +1362,10 @@ static struct ref *do_fetch_pack_v2(struct 
> fetch_pack_args *args,
>   packet_reader_init(&reader, fd[0], NULL, 0,
>  PACKET_READ_CHOMP_NEWLINE |
>  PACKET_READ_DIE_ON_ERR_PACKET);
> + if (server_supports_feature("fetch", "sideband-all", 0)) {
> + reader.use_sideband = 1;
> + reader.me = "fetch-pack";
> + }
>  
>   while (state != FETCH_DONE) {
>   switch (state) {
> @@ -1392,7 +1399,8 @@ static struct ref *do_fetch_pack_v2(struct 
> fetch_pack_args *args,
>   case FETCH_SEND_REQUEST:
>   if (send_fetch_request(&negotiator, fd[1], args, ref,
>  &common,
> -&haves_to_send, &in_vain))
> +&haves_to_send, &in_vain,
> +reader.use_sideband))
>   state = FETCH_GET_PACK;
>   else
>   state = FETCH_PROCESS_ACKS;

OK, so the "fetch" side enables sideband-all any time it knows that
the other side supports it.

It feels a bit strange at the logical level that reader.me is set
only when we are going to talk over "sideband-all".  I know that at
the implementation level, reader.me is only looked at when sideband
is in use, but it still feels somewhat funny.  Future developers may
want to use reader.me to identify the entity that found some error
during a read, even when the sideband is not yet in use, and at that
point, uninitialized .me would be a source of an error.  IOW, that
assignment smells like a timebomb waiting to go off.

> @@ -483,16 +483,42 @@ enum packet_read_status packet_reader_read(struct 
> packet_reader *reader)
>   return reader->status;
>   }
>  
> - reader->status = packet_read_with_status(reader->fd,
> -  &reader->src_buffer,
> -  &reader->src_len,
> -  reader->buffer,
> -  reader->buffer_size,
> -  &reader->pktlen,
> -  reader->options);
> + while (1) {
> + int retval;
> + reader->status = packet_read_with_status(reader->fd,
> +  &reader->src_buffer,
> +  &reader->src_len,
> +  reader->buffer,
> +  reader->b

[PATCH 3/4] {fetch,upload}-pack: sideband v2 fetch response

2019-01-11 Thread Jonathan Tan
Currently, a response to a fetch request has sideband support only while
the packfile is being sent, meaning that the server cannot send notices
until the start of the packfile.

Extend sideband support in protocol v2 fetch responses to the whole
response. upload-pack will advertise it if the
uploadpack.allowsidebandall configuration variable is set, and
fetch-pack will automatically request it if advertised.

If the sideband is to be used throughout the whole response, upload-pack
will use it to send errors instead of prefixing a PKT-LINE payload with
"ERR ".

This will be tested in a subsequent patch.

Signed-off-by: Jonathan Tan 
---
 Documentation/technical/protocol-v2.txt | 10 +
 fetch-pack.c| 12 +-
 pkt-line.c  | 51 +++--
 pkt-line.h  |  4 ++
 sideband.c  |  7 +++-
 sideband.h  |  2 +-
 upload-pack.c   | 16 
 7 files changed, 87 insertions(+), 15 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
index 09e4e0273f..1b0633f59f 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -307,6 +307,16 @@ the 'wanted-refs' section in the server's response as 
explained below.
particular ref, where  is the full name of a ref on the
server.
 
+If the 'sideband-all' feature is advertised, the following argument can be
+included in the client's request:
+
+sideband-all
+   Instruct the server to send the whole response multiplexed, not just
+   the packfile section. All non-flush and non-delim PKT-LINE in the
+   response (not only in the packfile section) will then start with a byte
+   indicating its sideband (1, 2, or 3), and the server may send "0005\1"
+   (a PKT-LINE of sideband 1 with no payload) as a keepalive packet.
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
diff --git a/fetch-pack.c b/fetch-pack.c
index 3f9626dbf7..b89199891d 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1125,7 +1125,8 @@ static int add_haves(struct fetch_negotiator *negotiator,
 static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
  const struct fetch_pack_args *args,
  const struct ref *wants, struct oidset *common,
- int *haves_to_send, int *in_vain)
+ int *haves_to_send, int *in_vain,
+ int sideband_all)
 {
int ret = 0;
struct strbuf req_buf = STRBUF_INIT;
@@ -1151,6 +1152,8 @@ static int send_fetch_request(struct fetch_negotiator 
*negotiator, int fd_out,
packet_buf_write(&req_buf, "include-tag");
if (prefer_ofs_delta)
packet_buf_write(&req_buf, "ofs-delta");
+   if (sideband_all)
+   packet_buf_write(&req_buf, "sideband-all");
 
/* Add shallow-info and deepen request */
if (server_supports_feature("fetch", "shallow", 0))
@@ -1359,6 +1362,10 @@ static struct ref *do_fetch_pack_v2(struct 
fetch_pack_args *args,
packet_reader_init(&reader, fd[0], NULL, 0,
   PACKET_READ_CHOMP_NEWLINE |
   PACKET_READ_DIE_ON_ERR_PACKET);
+   if (server_supports_feature("fetch", "sideband-all", 0)) {
+   reader.use_sideband = 1;
+   reader.me = "fetch-pack";
+   }
 
while (state != FETCH_DONE) {
switch (state) {
@@ -1392,7 +1399,8 @@ static struct ref *do_fetch_pack_v2(struct 
fetch_pack_args *args,
case FETCH_SEND_REQUEST:
if (send_fetch_request(&negotiator, fd[1], args, ref,
   &common,
-  &haves_to_send, &in_vain))
+  &haves_to_send, &in_vain,
+  reader.use_sideband))
state = FETCH_GET_PACK;
else
state = FETCH_PROCESS_ACKS;
diff --git a/pkt-line.c b/pkt-line.c
index ebdc6c2530..71b17e6b93 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -447,7 +447,7 @@ int recv_sideband(const char *me, int in_stream, int out)
 
while (1) {
len = packet_read(in_stream, NULL, NULL, buf, LARGE_PACKET_MAX, 
0);
-   retval = diagnose_sideband(me, buf, len);
+   retval = diagnose_sideband(me, buf, len, 0);
switch (r

[WIP 8/4] upload-pack: send part of packfile response as uri

2019-01-11 Thread Jonathan Tan
This is a partial implementation of upload-pack sending part of its
packfile response as URIs.

The client is not fully implemented - it knows to ignore the
"packfile-uris" section, but because it does not actually fetch those
URIs, the returned packfile is incomplete. A test is included to show
that the appropriate URI is indeed transmitted, and that the returned
packfile is lacking exactly the expected object.

Signed-off-by: Jonathan Tan 
---
 builtin/pack-objects.c | 48 
 fetch-pack.c   |  9 +++
 t/t5702-protocol-v2.sh | 27 
 upload-pack.c  | 56 --
 4 files changed, 133 insertions(+), 7 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 411aefd687..448c42a666 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -117,6 +117,15 @@ enum missing_action {
 static enum missing_action arg_missing_action;
 static show_object_fn fn_show_object;
 
+struct configured_exclusion {
+   struct oidmap_entry e;
+   char *uri;
+};
+static struct oidmap configured_exclusions;
+
+static int exclude_configured_blobs;
+static struct oidset excluded_by_config;
+
 /*
  * stats
  */
@@ -831,6 +840,23 @@ static off_t write_reused_pack(struct hashfile *f)
return reuse_packfile_offset - sizeof(struct pack_header);
 }
 
+static void write_excluded_by_configs(void)
+{
+   struct oidset_iter iter;
+   const struct object_id *oid;
+
+   oidset_iter_init(&excluded_by_config, &iter);
+   while ((oid = oidset_iter_next(&iter))) {
+   struct configured_exclusion *ex =
+   oidmap_get(&configured_exclusions, oid);
+
+   if (!ex)
+   BUG("configured exclusion wasn't configured");
+   write_in_full(1, ex->uri, strlen(ex->uri));
+   write_in_full(1, "\n", 1);
+   }
+}
+
 static const char no_split_warning[] = N_(
 "disabling bitmap writing, packs are split due to pack.packSizeLimit"
 );
@@ -1124,6 +1150,12 @@ static int want_object_in_pack(const struct object_id 
*oid,
}
}
 
+   if (exclude_configured_blobs &&
+   oidmap_get(&configured_exclusions, oid)) {
+   oidset_insert(&excluded_by_config, oid);
+   return 0;
+   }
+
return 1;
 }
 
@@ -2728,6 +2760,19 @@ static int git_pack_config(const char *k, const char *v, 
void *cb)
pack_idx_opts.version);
return 0;
}
+   if (!strcmp(k, "uploadpack.blobpackfileuri")) {
+   struct configured_exclusion *ex = xmalloc(sizeof(*ex));
+   const char *end;
+
+   if (parse_oid_hex(v, &ex->e.oid, &end) || *end != ' ')
+   die(_("value of uploadpack.blobpackfileuri must be "
+ "of the form ' ' (got '%s')"), v);
+   if (oidmap_get(&configured_exclusions, &ex->e.oid))
+   die(_("object already configured in another "
+ "uploadpack.blobpackfileuri (got '%s')"), v);
+   ex->uri = xstrdup(end + 1);
+   oidmap_put(&configured_exclusions, ex);
+   }
return git_default_config(k, v, cb);
 }
 
@@ -3316,6 +3361,8 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
 N_("do not pack objects in promisor packfiles")),
OPT_BOOL(0, "delta-islands", &use_delta_islands,
 N_("respect islands during delta compression")),
+   OPT_BOOL(0, "exclude-configured-blobs", 
&exclude_configured_blobs,
+N_("respect uploadpack.blobpackfileuri")),
OPT_END(),
};
 
@@ -3489,6 +3536,7 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
return 0;
if (nr_result)
prepare_pack(window, depth);
+   write_excluded_by_configs();
write_pack_file();
if (progress)
fprintf_ln(stderr,
diff --git a/fetch-pack.c b/fetch-pack.c
index 4618568fee..79af87b2cf 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1429,6 +1429,15 @@ static struct ref *do_fetch_pack_v2(struct 
fetch_pack_args *args,
receive_wanted_refs(&reader, sought, nr_sought);
 
/* get the pack */
+   if (process_section_header(&reader, "packfile-uris", 
1)) {
+   /* skip the whole section */
+   process_section_header(&reader, 
"packfile-uris", 0);
+  

[PATCH 0/4] Sideband the whole fetch v2 response

2019-01-11 Thread Jonathan Tan
This is on ms/packet-err-check.

This is a follow-up of my earlier work to allow partial CDN offloading
of fetch v2 response packfiles, which had the issue of having to buffer
(or suspend) progress and keepalive messages because we didn't have
sideband until the packfile section started (as I wrote here [1]). These
patches expand the sideband to the whole response, eliminating that
problem.

There are 8 patches total. I'm most interested in review of the first 4
patches for inclusion into Git - I think it would be useful for servers
to be allowed to send progress messages and other notices whenever they
need to, especially if other sections are added to the response (like in
the case of CDN offloading).

The other 4 patches are from my CDN offloading work, included here to
show how this sideband-all feature could be used. But take note that
they are all WIP and that I still haven't incorporated some of the
design considerations from the earlier review [2].

[1] 
https://public-inbox.org/git/20181206232538.141378-1-jonathanta...@google.com/
[2] https://public-inbox.org/git/cover.1543879256.git.jonathanta...@google.com/

Jonathan Tan (8):
  pkt-line: introduce struct packet_writer
  sideband: reverse its dependency on pkt-line
  {fetch,upload}-pack: sideband v2 fetch response
  tests: define GIT_TEST_SIDEBAND_ALL
  Documentation: order protocol v2 sections
  Documentation: add Packfile URIs design doc
  upload-pack: refactor reading of pack-objects out
  upload-pack: send part of packfile response as uri

 Documentation/technical/packfile-uri.txt |  83 
 Documentation/technical/protocol-v2.txt  |  32 ++-
 builtin/pack-objects.c   |  48 +
 fetch-pack.c |  22 +-
 pkt-line.c   | 114 --
 pkt-line.h   |  34 +++
 sideband.c   | 161 +++---
 sideband.h   |  15 +-
 t/README |   5 +
 t/lib-httpd/apache.conf  |   1 +
 t/t5537-fetch-shallow.sh |   3 +-
 t/t5701-git-serve.sh |   2 +-
 t/t5702-protocol-v2.sh   |  31 ++-
 upload-pack.c| 259 +++
 14 files changed, 607 insertions(+), 203 deletions(-)
 create mode 100644 Documentation/technical/packfile-uri.txt

-- 
2.19.0.271.gfe8321ec05.dirty



Re: [PATCH v2 1/2] Change how HTTP response body is returned

2019-01-04 Thread Junio C Hamano
Jeff King  writes:

> The most robust thing would perhaps be:
>
>   fflush(dest->file);
>   ftruncate(fileno(dest->file), 0);
>
> which leaves the handle intact.

An added benefit of that approach is that there is no need for
the filename field in the dest structure.

Having a separate filename field could be a positive flexibility (it
could be used to open a file, store its FILE* in dest->file, while
storing the name of another file in dest->filename), but also a
negative flexibility (dest->file possibly pointing to a file
different from dest->filename is a source of confusion).  As I do
not think any current caller, or any caller in the immediate
future, wants such flexibility, it is probably an overall plus if
we do not have to add the dest->filename field.

> (I agree with the rest of your review, especially that it would be
> easier to read if this were split into separate refactor and
> change-behavior steps).
>
> -Peff


Re: [PATCH v2 1/2] Change how HTTP response body is returned

2019-01-04 Thread Jeff King
On Sat, Dec 29, 2018 at 11:44:46AM -0800, Masaya Suzuki wrote:

> +/*
> + * A callback for CURLOPT_WRITEFUNCTION. The return value is the bytes 
> consumed
> + * from ptr.
> + */
>  static size_t rpc_in(char *ptr, size_t eltsize,
>   size_t nmemb, void *buffer_)
>  {
>   size_t size = eltsize * nmemb;
> - struct rpc_state *rpc = buffer_;
> + struct rpc_in_data *data = buffer_;
> + long response_code;
> +
> + if (curl_easy_getinfo(data->slot->curl, CURLINFO_RESPONSE_CODE,
> +   &response_code) != CURLE_OK)
> + return size;

This hunk was unexpected to me. The function here is just writing out
the data, and I expected we'd handle the error after the whole transfer
is done. But we do that anyway eventually via run_slot() (which uses
handle_curl_result). I guess the goal here is to start throwing away
data when we see an error, rather than storing it?

That makes some sense, though I do wonder if there's any case where curl
would call our WRITEFUNCTION before it knows the HTTP status. That
implies a body before our header, which seems impossible, though.

> + if (response_code != 200)
> + return size;

The current behavior with CURLOPT_FAILONERROR treats codes >= 400 as an
error. And in handle_curl_result(), we treat >= 300 as an error (since
we only see 3xx for a disabled redirect). I suppose it's unlikely for us
to see any success code besides 200, but we probably ought to be
following the same rules here.

-Peff


Re: [PATCH v2 1/2] Change how HTTP response body is returned

2019-01-04 Thread Jeff King
On Thu, Jan 03, 2019 at 11:09:02AM -0800, Junio C Hamano wrote:

> > +   if (dest->file) {
> > +   /*
> > +* At this point, the file contains the response body of the
> > +* previous request. We need to truncate the file.
> > +*/
> > +   FILE *new_file = freopen(dest->filename, "w", dest->file);
> 
> Now freopen() lets us restart the file anew with a new "FILE *".
> 
> > +   if (new_file == NULL) {
> > +   error("Unable to open local file %s", dest->filename);
> 
> error_errno(), perhaps?
> 
> At this point, I presume that dest->file is closed by the failed
> freopen(), but dest->file is still non-NULL and causes further calls
> to http_request() with this dest would be a disaster?  As long as
> the caller of this function reacts to HTTP_ERROR and kill the dest,
> it would be fine.

I also wondered what timing guarantees freopen() gives us (i.e., is it
possible for it to open and truncate the file, and then close the old
handle, flushing some in-memory buffer). C99 says:

  The freopen function first attempts to close any file that is
  associated with the specified stream. Failure to close the file is
  ignored. The error and end-of-file indicators for the stream are
  cleared.

So I think the order is OK for my concern, but not for yours. I.e., on
an error, dest->file is now undefined.

It might be nice to set "dest->file == NULL" in that case. There's no
guarantee that the caller did not hold onto its own copy of the handle,
but since this struct is only exposed internally within http.c, that's
probably OK.

The most robust thing would perhaps be:

  fflush(dest->file);
  ftruncate(fileno(dest->file), 0);

which leaves the handle intact.

(I agree with the rest of your review, especially that it would be
easier to read if this were split into separate refactor and
change-behavior steps).

-Peff


Re: [PATCH v2 1/2] Change how HTTP response body is returned

2019-01-03 Thread Junio C Hamano
Masaya Suzuki  writes:

> Subject: Re: [PATCH v2 1/2] Change how HTTP response body is returned

Perhaps:

Subject: http: change the way response body is returned

but if we can state why we want to do so concisely, that would be
even better.

> This changes the way HTTP response body is returned in
> http_request_reauth and post_rpc.
>
> 1. http_request_reauth makes up to two requests; one without a
>credential and one with a credential. The first request can fail if
>it needs a credential. When the keep_error option is specified, the
>response to the first request can be written to the HTTP response
>    body destination. If the response body destination is a string
>buffer, it erases the buffer before making the second request. By
>introducing http_response_dest, it can handle the case that the
>destination is a file handle.
> 2. post_rpc makes an HTTP request and the response body is directly
>written to a file descriptor. This makes it check the HTTP status
>code before writing it, and not write the response body if it's an
>error response. It's ok without this check now because post_rpc makes
>a request with CURLOPT_FAILONERROR, and libcurl won't call the
>callback if the response has an error status code.

The above may be an accurate description of what the code will do
with this patch, but I cannot quite read the reason why we would
want the code to behave so in the first place.

>
> Signed-off-by: Masaya Suzuki 
> ---
>  http.c| 99 +--
>  remote-curl.c | 29 ---
>  2 files changed, 81 insertions(+), 47 deletions(-)
>
> diff --git a/http.c b/http.c
> index eacc2a75e..d23417670 100644
> --- a/http.c
> +++ b/http.c
> @@ -165,6 +165,19 @@ static int http_schannel_check_revoke = 1;
>   */
>  static int http_schannel_use_ssl_cainfo;
>  
> +/*
> + * Where to store the result of http_request.
> + *
> + * At most one of buffer or file can be non-NULL. The buffer and file are not
> + * allocated by http_request, and the caller is responsible for releasing 
> them.
> + */
> +struct http_response_dest {
> + struct strbuf *buffer;
> +
> + FILE *file;
> + const char *filename;
> +};
> +
>  size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
>  {
>   size_t size = eltsize * nmemb;
> @@ -1794,12 +1807,8 @@ static void http_opt_request_remainder(CURL *curl, 
> off_t pos)
>   curl_easy_setopt(curl, CURLOPT_RANGE, buf);
>  }
>  
> -/* http_request() targets */
> -#define HTTP_REQUEST_STRBUF  0
> -#define HTTP_REQUEST_FILE1
> -
>  static int http_request(const char *url,
> - void *result, int target,
> + struct http_response_dest *dest,
>   const struct http_get_options *options)
>  {
>   struct active_request_slot *slot;
> @@ -1812,21 +1821,23 @@ static int http_request(const char *url,
>   slot = get_active_slot();
>   curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
>  
> - if (result == NULL) {
> - curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1);
> - } else {
> + if (dest->file) {
> + off_t posn = ftello(dest->file);
>   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
> - curl_easy_setopt(slot->curl, CURLOPT_FILE, result);

OK, so it used to be that we can either discard the result
(i.e. NOBODY) or send the result to CURLOPT_FILE, and the latter has
two options (sent to a file, or sent to in-core buffer).  The way
these three choices are given were with the result pointer and the
target enum.

That is replaced by a struct that allows the same three choices.

Makes sense so far.

> @@ -1930,10 +1941,10 @@ static int update_url_from_redirect(struct strbuf 
> *base,
>  }
>  
>  static int http_request_reauth(const char *url,
> -void *result, int target,
> +struct http_response_dest *dest,
>  struct http_get_options *options)
>  {
> - int ret = http_request(url, result, target, options);
> + int ret = http_request(url, dest, options);

Just adjusting the calling convention to the change we saw above.

> @@ -1949,32 +1960,34 @@ static int http_request_reauth(const char *url,
>   if (ret != HTTP_REAUTH)
>   return ret;
>  
> - /*
> -  * If we are using KEEP_ERROR, the previous request may have
> -  * put cruft into our output stream; we should clear it out before
> -  * making our next request. We only know how to do this for
> -  * the strbuf case, but that is enough to satisfy

[PATCH v2 1/2] Change how HTTP response body is returned

2018-12-29 Thread Masaya Suzuki
This changes the way HTTP response body is returned in
http_request_reauth and post_rpc.

1. http_request_reauth makes up to two requests; one without a
   credential and one with a credential. The first request can fail if
   it needs a credential. When the keep_error option is specified, the
   response to the first request can be written to the HTTP response
   body destination. If the response body destination is a string
   buffer, it erases the buffer before making the second request. By
   introducing http_response_dest, it can handle the case that the
   destination is a file handle.
2. post_rpc makes an HTTP request and the response body is directly
   written to a file descriptor. This makes it check the HTTP status
   code before writing it, and not write the response body if it's an
   error response. It's ok without this check now because post_rpc makes
   a request with CURLOPT_FAILONERROR, and libcurl won't call the
   callback if the response has an error status code.

Signed-off-by: Masaya Suzuki 
---
 http.c| 99 +--
 remote-curl.c | 29 ---
 2 files changed, 81 insertions(+), 47 deletions(-)

diff --git a/http.c b/http.c
index eacc2a75e..d23417670 100644
--- a/http.c
+++ b/http.c
@@ -165,6 +165,19 @@ static int http_schannel_check_revoke = 1;
  */
 static int http_schannel_use_ssl_cainfo;
 
+/*
+ * Where to store the result of http_request.
+ *
+ * At most one of buffer or file can be non-NULL. The buffer and file are not
+ * allocated by http_request, and the caller is responsible for releasing them.
+ */
+struct http_response_dest {
+   struct strbuf *buffer;
+
+   FILE *file;
+   const char *filename;
+};
+
 size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
size_t size = eltsize * nmemb;
@@ -1794,12 +1807,8 @@ static void http_opt_request_remainder(CURL *curl, off_t 
pos)
curl_easy_setopt(curl, CURLOPT_RANGE, buf);
 }
 
-/* http_request() targets */
-#define HTTP_REQUEST_STRBUF0
-#define HTTP_REQUEST_FILE  1
-
 static int http_request(const char *url,
-   void *result, int target,
+   struct http_response_dest *dest,
const struct http_get_options *options)
 {
struct active_request_slot *slot;
@@ -1812,21 +1821,23 @@ static int http_request(const char *url,
slot = get_active_slot();
curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
 
-   if (result == NULL) {
-   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1);
-   } else {
+   if (dest->file) {
+   off_t posn = ftello(dest->file);
curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
-   curl_easy_setopt(slot->curl, CURLOPT_FILE, result);
-
-   if (target == HTTP_REQUEST_FILE) {
-   off_t posn = ftello(result);
-   curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
-fwrite);
-   if (posn > 0)
-   http_opt_request_remainder(slot->curl, posn);
-   } else
-   curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
-fwrite_buffer);
+   curl_easy_setopt(slot->curl, CURLOPT_FILE,
+dest->file);
+   curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
+fwrite);
+   if (posn > 0)
+   http_opt_request_remainder(slot->curl, posn);
+   } else if (dest->buffer) {
+   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
+   curl_easy_setopt(slot->curl, CURLOPT_FILE,
+dest->buffer);
+   curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
+fwrite_buffer);
+   } else {
+   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1);
}
 
accept_language = get_accept_language();
@@ -1930,10 +1941,10 @@ static int update_url_from_redirect(struct strbuf *base,
 }
 
 static int http_request_reauth(const char *url,
-  void *result, int target,
+  struct http_response_dest *dest,
   struct http_get_options *options)
 {
-   int ret = http_request(url, result, target, options);
+   int ret = http_request(url, dest, options);
 
if (ret != HTTP_OK && ret != HTTP_REAUTH)
return ret;
@@ -1949,32 +1960,34 @@ static int http_request_reauth(const char *url,
if (ret != HTTP_REAUTH)
return ret;
 
-   /*
-* If we are using KEEP_ERROR, the previous request may have
-* put cruft into our output stream; we should c

[PATCH 1/2] Change how HTTP response body is returned

2018-12-27 Thread Masaya Suzuki
This changes the way HTTP response body is returned in
http_request_reauth and post_rpc.

1. http_request_reauth makes up to two requests; one without a
   credential and one with a credential. The first request can fail if
   it needs a credential. When the keep_error option is specified, the
   response to the first request can be written to the HTTP response
   body destination. If the response body destination is a string
   buffer, it erases the buffer before making the second request. By
   introducing http_response_dest, it can handle the case that the
   destination is a file handle.
2. post_rpc makes an HTTP request and the response body is directly
   written to a file descriptor. This makes it check the HTTP status
   code before writing it, and not write the response body if it's an
   error response. It's ok without this check now because post_rpc makes
   a request with CURLOPT_FAILONERROR, and libcurl won't call the
   callback if the response has an error status code.

Signed-off-by: Masaya Suzuki 
---
 http.c| 99 +--
 remote-curl.c | 29 ---
 2 files changed, 81 insertions(+), 47 deletions(-)

diff --git a/http.c b/http.c
index eacc2a75e..d23417670 100644
--- a/http.c
+++ b/http.c
@@ -165,6 +165,19 @@ static int http_schannel_check_revoke = 1;
  */
 static int http_schannel_use_ssl_cainfo;
 
+/*
+ * Where to store the result of http_request.
+ *
+ * At most one of buffer or file can be non-NULL. The buffer and file are not
+ * allocated by http_request, and the caller is responsible for releasing them.
+ */
+struct http_response_dest {
+   struct strbuf *buffer;
+
+   FILE *file;
+   const char *filename;
+};
+
 size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
size_t size = eltsize * nmemb;
@@ -1794,12 +1807,8 @@ static void http_opt_request_remainder(CURL *curl, off_t 
pos)
curl_easy_setopt(curl, CURLOPT_RANGE, buf);
 }
 
-/* http_request() targets */
-#define HTTP_REQUEST_STRBUF0
-#define HTTP_REQUEST_FILE  1
-
 static int http_request(const char *url,
-   void *result, int target,
+   struct http_response_dest *dest,
const struct http_get_options *options)
 {
struct active_request_slot *slot;
@@ -1812,21 +1821,23 @@ static int http_request(const char *url,
slot = get_active_slot();
curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
 
-   if (result == NULL) {
-   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1);
-   } else {
+   if (dest->file) {
+   off_t posn = ftello(dest->file);
curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
-   curl_easy_setopt(slot->curl, CURLOPT_FILE, result);
-
-   if (target == HTTP_REQUEST_FILE) {
-   off_t posn = ftello(result);
-   curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
-fwrite);
-   if (posn > 0)
-   http_opt_request_remainder(slot->curl, posn);
-   } else
-   curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
-fwrite_buffer);
+   curl_easy_setopt(slot->curl, CURLOPT_FILE,
+dest->file);
+   curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
+fwrite);
+   if (posn > 0)
+   http_opt_request_remainder(slot->curl, posn);
+   } else if (dest->buffer) {
+   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
+   curl_easy_setopt(slot->curl, CURLOPT_FILE,
+dest->buffer);
+   curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
+fwrite_buffer);
+   } else {
+   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1);
}
 
accept_language = get_accept_language();
@@ -1930,10 +1941,10 @@ static int update_url_from_redirect(struct strbuf *base,
 }
 
 static int http_request_reauth(const char *url,
-  void *result, int target,
+  struct http_response_dest *dest,
   struct http_get_options *options)
 {
-   int ret = http_request(url, result, target, options);
+   int ret = http_request(url, dest, options);
 
if (ret != HTTP_OK && ret != HTTP_REAUTH)
return ret;
@@ -1949,32 +1960,34 @@ static int http_request_reauth(const char *url,
if (ret != HTTP_REAUTH)
return ret;
 
-   /*
-* If we are using KEEP_ERROR, the previous request may have
-* put cruft into our output stream; we should c

Re: [WIP RFC 5/5] upload-pack: send part of packfile response as uri

2018-12-04 Thread Stefan Beller
On Mon, Dec 3, 2018 at 3:38 PM Jonathan Tan  wrote:
>
> This is a partial implementation of upload-pack sending part of its
> packfile response as URIs.

It does so by implementing a new flag `--exclude-configured-blobs`
in pack-objects, which would change the output of pack-objects to
output a list of URLs (of the excluded blobs) followed by the
pack to be asked for.

This design seems easy to implement as then upload-pack
can just parse the output and only needs to insert
"packfile" and "packfile-uris\n" at the appropriate places
of the stream, otherwise it just passes through the information
obtained from pack-objects.

The design as-is would make pack-objects hard to document (its
output is no longer just a pack when that flag is given, but a
highly optimized byte stream).

Initially I did not anticipate this being one of the major design problems,
as I assumed we'd want to use this pack feature more broadly (e.g.
eventually by offloading most of the objects into a base pack that
is always included, since the likelihood of any object in there being
wanted is very high on initial clone), but it makes total sense to only
output the URIs that we actually need.

An alternative that comes very close to the current situation
would be to either pass a file path or file descriptor (that upload-pack
listens to in parallel) to pack-objects as an argument of the new flag.
Then we would not need to splice the protocol sections into the single
output stream, but we could announce the sections, then flush
the URIs and then flush the pack afterwards.

I looked at this quickly, but that would need either extensions in
run-command.c for setting up the new fd for us, as there we already
have OS specific code for these setups, or we'd have to duplicate
some of the logic here, which doesn't enthuse me either.

So maybe we'd create a temp file via mkstemp and pass
the file name to pack-objects for writing the URIs and then
we'd just need to stream that file?

> +   # NEEDSWORK: "git clone" fails here because it ignores the URI 
> provided
> +   # instead of fetching it.
> +   test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" \
> +   git -c protocol.version=2 clone \
> +   "$HTTPD_URL/smart/http_parent" http_child 2>err &&
> +   # Although "git clone" fails, we can still check that the server
> +   # provided the URI we requested and that the error message pinpoints
> +   # the object that is missing.
> +   grep "clone< uri https://example.com/a-uri" log &&
> +   test_i18ngrep "did not receive expected object $(cat h)" err

That is a nice test!


Re: [WIP RFC 0/5] Design for offloading part of packfile response to CDN

2018-12-03 Thread Stefan Beller
On Mon, Dec 3, 2018 at 3:37 PM Jonathan Tan  wrote:

>
> There is a potential issue: a server which produces both the URIs and
> the packfile at roughly the same time (like the implementation in this
> patch set) will not have sideband access until it has concluded sending
> the URIs. Among other things, this means that the server cannot send
> keepalive packets until quite late in the response. One solution to this
> might be to add a feature that allows the server to use a sideband
> throughout the whole response - and this has other benefits too like
> allowing servers to inform the client throughout the whole fetch, not
> just at the end.

While sideband sounds like the right thing to do, we could also
send (NULL)-URIs within this feature.


[WIP RFC 5/5] upload-pack: send part of packfile response as uri

2018-12-03 Thread Jonathan Tan
This is a partial implementation of upload-pack sending part of its
packfile response as URIs.

The client is not fully implemented - it knows to ignore the
"packfile-uris" section, but because it does not actually fetch those
URIs, the returned packfile is incomplete. A test is included to show
that the appropriate URI is indeed transmitted, and that the returned
packfile is lacking exactly the expected object.

Signed-off-by: Jonathan Tan 
---
 builtin/pack-objects.c | 48 ++
 fetch-pack.c   |  9 
 t/t5702-protocol-v2.sh | 25 ++
 upload-pack.c  | 37 
 4 files changed, 115 insertions(+), 4 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e7ea206c08..2abbddd3cb 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -117,6 +117,15 @@ enum missing_action {
 static enum missing_action arg_missing_action;
 static show_object_fn fn_show_object;
 
+struct configured_exclusion {
+   struct oidmap_entry e;
+   char *uri;
+};
+static struct oidmap configured_exclusions;
+
+static int exclude_configured_blobs;
+static struct oidset excluded_by_config;
+
 /*
  * stats
  */
@@ -831,6 +840,23 @@ static off_t write_reused_pack(struct hashfile *f)
return reuse_packfile_offset - sizeof(struct pack_header);
 }
 
+static void write_excluded_by_configs(void)
+{
+   struct oidset_iter iter;
+   const struct object_id *oid;
+
+   oidset_iter_init(&excluded_by_config, &iter);
+   while ((oid = oidset_iter_next(&iter))) {
+   struct configured_exclusion *ex =
+   oidmap_get(&configured_exclusions, oid);
+
+   if (!ex)
+   BUG("configured exclusion wasn't configured");
+   write_in_full(1, ex->uri, strlen(ex->uri));
+   write_in_full(1, "\n", 1);
+   }
+}
+
 static const char no_split_warning[] = N_(
 "disabling bitmap writing, packs are split due to pack.packSizeLimit"
 );
@@ -1124,6 +1150,12 @@ static int want_object_in_pack(const struct object_id 
*oid,
}
}
 
+   if (exclude_configured_blobs &&
+   oidmap_get(&configured_exclusions, oid)) {
+   oidset_insert(&excluded_by_config, oid);
+   return 0;
+   }
+
return 1;
 }
 
@@ -2728,6 +2760,19 @@ static int git_pack_config(const char *k, const char *v, 
void *cb)
pack_idx_opts.version);
return 0;
}
+   if (!strcmp(k, "uploadpack.blobpackfileuri")) {
+   struct configured_exclusion *ex = xmalloc(sizeof(*ex));
+   const char *end;
+
+   if (parse_oid_hex(v, &ex->e.oid, &end) || *end != ' ')
+   die(_("value of uploadpack.blobpackfileuri must be "
+ "of the form ' ' (got '%s')"), v);
+   if (oidmap_get(&configured_exclusions, &ex->e.oid))
+   die(_("object already configured in another "
+ "uploadpack.blobpackfileuri (got '%s')"), v);
+   ex->uri = xstrdup(end + 1);
+   oidmap_put(&configured_exclusions, ex);
+   }
return git_default_config(k, v, cb);
 }
 
@@ -3314,6 +3359,8 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
 N_("do not pack objects in promisor packfiles")),
OPT_BOOL(0, "delta-islands", &use_delta_islands,
 N_("respect islands during delta compression")),
+   OPT_BOOL(0, "exclude-configured-blobs", 
&exclude_configured_blobs,
+N_("respect uploadpack.blobpackfileuri")),
OPT_END(),
};
 
@@ -3487,6 +3534,7 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
return 0;
if (nr_result)
prepare_pack(window, depth);
+   write_excluded_by_configs();
write_pack_file();
if (progress)
fprintf_ln(stderr,
diff --git a/fetch-pack.c b/fetch-pack.c
index 9691046e64..6e1985ab55 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1413,6 +1413,15 @@ static struct ref *do_fetch_pack_v2(struct 
fetch_pack_args *args,
receive_wanted_refs(&reader, sought, nr_sought);
 
/* get the pack */
+   if (process_section_header(&reader, "packfile-uris", 
1)) {
+   /* skip the whole section */
+   process_section_header(&reader, 
"packfile-uris", 0);
+  

[WIP RFC 0/5] Design for offloading part of packfile response to CDN

2018-12-03 Thread Jonathan Tan
Some of us have been working on a design to improve the scalability of
Git servers by allowing them to offload part of the packfile response to
CDNs in this way: returning HTTP(S) URIs in fetch responses in addition
to packfiles.

This can reduce the load on individual Git servers and improves
proximity (by having data served from closer to the user).

I have included here a design document (patch 2) and a rough
implementation of the server (patch 5). Currently, the implementation
only allows replacing single blobs with URIs, but the protocol
improvement is designed in such a way as to allow independent
improvement of Git server implementations.

There is a potential issue: a server which produces both the URIs and
the packfile at roughly the same time (like the implementation in this
patch set) will not have sideband access until it has concluded sending
the URIs. Among other things, this means that the server cannot send
keepalive packets until quite late in the response. One solution to this
might be to add a feature that allows the server to use a sideband
throughout the whole response - and this has other benefits too like
allowing servers to inform the client throughout the whole fetch, not
just at the end.

Jonathan Tan (5):
  Documentation: order protocol v2 sections
  Documentation: add Packfile URIs design doc
  upload-pack: refactor reading of pack-objects out
  upload-pack: refactor writing of "packfile" line
  upload-pack: send part of packfile response as uri

 Documentation/technical/packfile-uri.txt |  83 +
 Documentation/technical/protocol-v2.txt  |  22 ++--
 builtin/pack-objects.c   |  48 
 fetch-pack.c |   9 ++
 t/t5702-protocol-v2.sh   |  25 
 upload-pack.c| 150 ---
 6 files changed, 285 insertions(+), 52 deletions(-)
 create mode 100644 Documentation/technical/packfile-uri.txt

-- 
2.19.0.271.gfe8321ec05.dirty



Re: Parsing a git HTTP protocol response

2018-11-30 Thread Bryan Turner
On Fri, Nov 30, 2018 at 6:58 PM Bryan Turner  wrote:
>
> Here's a (very ugly) patch I threw together on top of your code:
...snip

Gmail butchered my patch, so here it is as an attachment.

Bryan


short-size-reads.patch
Description: Binary data


Re: Parsing a git HTTP protocol response

2018-11-30 Thread Bryan Turner
On Fri, Nov 30, 2018 at 5:05 PM Farhan Khan  wrote:
>
> Hi all,
>
> I am writing an implementation of the git HTTP pack protocol in C. It
> just does a request to clone a repository. It works pretty well for
> small repositories, but seems to fail on larger repositories and I do
> not understand why.
>
> All that my code does is send a hard-coded "want" request. As I
> understand it, responses come with a four-byte size prefix, then the
> data of the size - 4. The first four bytes are the size of the object
> in ASCII and after that size, a new object should begin by specifying
> its size. The final "0000" string should signify the end.
>
> On small repositories my code works fine. But when use a large git
> repository, I hit an object size that cannot be interpreted by ASCII
> into a number. In particular, right before the crash I am pretty
> consistently getting a size starting with 0x32,0x00 or 0x32,0x30,0x00
> (note, it starts with 0x32 and has 0x00 in there). This is not a
> readable four byte ascii value, likely an erroneous size value, which
> causes the next object size calculation to land incorrectly and
> subsequently the program to malfunction.
>
> My questions:
> 1. Am I misunderstanding the protocol?
> 2. Does 0x32 signify something?
> 3. Also, where can I see in the source code git parses the data it
> receives from a "want" request?
>
> If you would like to see my code, it is located here:
> http://git.farhan.codes/farhan/post
> To compile to Linux run: gcc post.c main.c -lcurl -o post
> To compile on FreeBSD you can use the Makefile or: cc -L
> /usr/local/lib -I /usr/local/include -lcurl main.c post.c -o post
> In both cases you need to have libcurl installed.
>
> To run do: ./post [a git http url] [a commit value]
> ie: ./post http://git.farhan.codes/farhan/openbsd
> 373dd5ba116d42cddf1fdcb58b578a4274f6d589
>
> I have a lot of debugging printf() messages, but it eventually hits a
> point where you see this:
>
> Start=
> Current stats: Current state: 999 Received: 1448
> We have enough bytes, moving forward
> == New Object
> No Error, object is 32 bytes
> Size bytes: 32 30 00 00

I modified your debugging output a little bit, and right before the
failure happens I see this in my output:

Start=
Current stats: Current state: 999 Received: 1298
Read complete; still missing 1296 bytes (6901 read of 8197)
Offset: 1298

Start=
Current stats: Current state: 999 Received: 1298
Read complete; 2 more bytes in buffer
== New Object
Size bytes: 32 30 00 2b

Based on that, it appears what's happening is you're not handling the
case where you end up with fewer than 4 bytes in the buffer. As a
result, memcpy reads past the end of the response buffer and gets
whatever it gets, resulting in an incorrect parsed size.

The while loop in pack_protocol_line is checking +1, but I think it
should be checking +3 to ensure there's at least 4 bytes so it can get
a complete size. The "parseread" struct will need to grow a couple
more fields to allow buffering some additional bytes between
pack_protocol_line calls (because curl requires the write function to
always consume the full buffer) and, at the start of the loop, when
reading/parsing a size, the code will need to consider any buffered
bytes from the previous function call. That requires some knock-on
changes to how the offset is handled, as well as the initial
"psize" set when starting a new packet.

Once you start accounting for reads that cut off in 1, 2 or 3 bytes
into the 4 required to parse a size, your program should be able to
run to completion.
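Independent of the patch below, the carry-over idea looks roughly like this
(a simplified sketch, not the actual fix: it skips payloads rather than
storing them, and only handles a prefix split across calls):

```c
#include <stdlib.h>
#include <string.h>

/*
 * Simplified sketch: a libcurl write callback must consume every byte it
 * is given, so when a chunk ends 1-3 bytes into the 4-byte hex length
 * prefix, those bytes are saved and completed on the next call instead of
 * being read past the end of the buffer.
 */
struct carry_state {
	char len_buf[4];
	size_t have;      /* prefix bytes collected so far (0-3 between calls) */
	long lengths[32]; /* parsed pkt-line lengths, kept for inspection */
	size_t nlengths;
};

/* Feed one chunk; payload bytes are skipped, not stored. */
void feed(struct carry_state *st, const char *data, size_t n)
{
	size_t i = 0;

	while (i < n) {
		/* accumulate up to 4 length bytes, possibly across calls */
		while (st->have < 4 && i < n)
			st->len_buf[st->have++] = data[i++];
		if (st->have < 4)
			return; /* prefix split across chunks; wait for more */

		char hex[5];
		memcpy(hex, st->len_buf, 4);
		hex[4] = '\0';
		long len = strtol(hex, NULL, 16);

		st->lengths[st->nlengths++] = len;
		st->have = 0;

		/* length includes the 4 prefix bytes; "0000" is a flush */
		size_t payload = len > 4 ? (size_t)len - 4 : 0;
		size_t avail = n - i;
		i += payload < avail ? payload : avail;
		/* a full version would also carry leftover payload bytes */
	}
}
```

Feeding `"0009hello0000"` split 1 byte into the flush packet's prefix yields
the lengths 9 and 0, which is exactly the case the unpatched code mishandles.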

Here's a (very ugly) patch I threw together on top of your code:

diff --git a/main.c b/main.c
index 956362b..8873fd5 100644
--- a/main.c
+++ b/main.c
@@ -71,7 +71,7 @@ int main(int argc, char *argv[]) {
  curl_easy_setopt(curl, CURLOPT_HTTPHEADER, list);
  curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long)content_length);
  curl_easy_setopt(curl, CURLOPT_WRITEDATA, &parseread);
- curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, pack_protocol_line);
+ curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, &pack_protocol_line);

  res = curl_easy_perform(curl);
  if (curl != CURLE_OK)
diff --git a/post.c b/post.c
index cfffd5c..4f8512c 100644
--- a/post.c
+++ b/post.c
@@ -36,89 +36,83 @@ size_t pack_protocol_line(void *buffer, size_t
size, size_t nmemb, void *userp)
 {
  unsigned char *reply = buffer;
  struct parseread *parseread = userp;
- int offset = 0;
+ int offset = 0, pending = 0, remaining = 0;
  char tmp[5];
- int check;
-
- int pool = size * nmemb;

  printf("\n\nStart=\n");
- printf("Current stats: Current state: %d Received: %ld\n",
parseread->state, size*nmemb);
+ printf("Current stats: Curre

Parsing a git HTTP protocol response

2018-11-30 Thread Farhan Khan
Hi all,

I am writing an implementation of the git HTTP pack protocol in C. It
just does a request to clone a repository. It works pretty well for
small repositories, but seems to fail on larger repositories and I do
not understand why.

All that my code does is send a hard-coded "want" request. As I
understand it, responses come with a four-byte size prefix, then the
data of the size - 4. The first four bytes are the size of the object
in ASCII and after that size, a new object should begin by specifying
its size. The final "0000" string should signify the end.

On small repositories my code works fine. But when I use a large git
repository, I hit an object size that cannot be interpreted as an ASCII
number. In particular, right before the crash I am pretty
consistently getting a size starting with 0x32,0x00 or 0x32,0x30,0x00
(note, it starts with 0x32 and has 0x00 in there). This is not a
readable four byte ascii value, likely an erroneous size value, which
causes the next object size calculation to land incorrectly and
subsequently the program to malfunction.

My questions:
1. Am I misunderstanding the protocol?
2. Does 0x32 signify something?
3. Also, where can I see in the source code git parses the data it
receives from a "want" request?

If you would like to see my code, it is located here:
http://git.farhan.codes/farhan/post
To compile to Linux run: gcc post.c main.c -lcurl -o post
To compile on FreeBSD you can use the Makefile or: cc -L
/usr/local/lib -I /usr/local/include -lcurl main.c post.c -o post
In both cases you need to have libcurl installed.

To run do: ./post [a git http url] [a commit value]
ie: ./post http://git.farhan.codes/farhan/openbsd
373dd5ba116d42cddf1fdcb58b578a4274f6d589

I have a lot of debugging printf() messages, but it eventually hits a
point where you see this:

Start=
Current stats: Current state: 999 Received: 1448
We have enough bytes, moving forward
== New Object
No Error, object is 32 bytes
Size bytes: 32 30 00 00

The program interprets the size bytes {0x32,0x30,0x00,0x00} as "20",
which read as hexadecimal is 32 in decimal, causing the next read to fail.
This problem repeats on a few different repositories.

Any assistance is welcome, I am very stuck on how the HTTP git protocol works.
Thanks,
--
Farhan Khan
PGP Fingerprint: B28D 2726 E2BC A97E 3854 5ABE 9A9F 00BC D525 16EE


Re: [PATCH] fetch-pack: be more precise in parsing v2 response

2018-10-25 Thread Stefan Beller
On Thu, Oct 25, 2018 at 2:04 AM Junio C Hamano  wrote:
>
> Junio C Hamano  writes:
>
> > Jonathan Tan  writes:
> >
> >> +GIT_TRACE_PACKET="$(pwd)/log" test_must_fail git -C http_child \
> >> +-c protocol.version=2 \
> >> +fetch "$HTTPD_URL/one_time_sed/http_parent" 2> err &&
> >
> > Because test_must_fail is a shell function, the above is not a
> > correct way to say "I want GIT_TRACE_PACKET exported only while this
> > thing runs".
> >
> > I'll squash the following in.
> >
> >  t/t5702-protocol-v2.sh | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
> > index 51009ca391..d58fbfa9e5 100755
> > --- a/t/t5702-protocol-v2.sh
> > +++ b/t/t5702-protocol-v2.sh
> > @@ -555,7 +555,7 @@ test_expect_success 'when server does not send "ready", 
> > expect FLUSH' '
> >   printf "/acknowledgments/,$ s/0000/0001/" \
> >   >"$HTTPD_ROOT_PATH/one-time-sed" &&
> >
> > - GIT_TRACE_PACKET="$(pwd)/log" test_must_fail git -C http_child \
> > + test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" git -C http_child \
> >   -c protocol.version=2 \
> >   fetch "$HTTPD_URL/one_time_sed/http_parent" 2> err &&
> >   grep "fetch< acknowledgments" log &&
>
> I know it only has been a few days, but is there any other issue
> in the patch, anybody?

I have reviewed the patch and I think it is good with the squashed change above.

Thanks,
Stefan


Re: [PATCH] fetch-pack: be more precise in parsing v2 response

2018-10-25 Thread Junio C Hamano
Junio C Hamano  writes:

> Jonathan Tan  writes:
>
>> +GIT_TRACE_PACKET="$(pwd)/log" test_must_fail git -C http_child \
>> +-c protocol.version=2 \
>> +fetch "$HTTPD_URL/one_time_sed/http_parent" 2> err &&
>
> Because test_must_fail is a shell function, the above is not a
> correct way to say "I want GIT_TRACE_PACKET exported only while this
> thing runs".
>
> I'll squash the following in.
>
>  t/t5702-protocol-v2.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
> index 51009ca391..d58fbfa9e5 100755
> --- a/t/t5702-protocol-v2.sh
> +++ b/t/t5702-protocol-v2.sh
> @@ -555,7 +555,7 @@ test_expect_success 'when server does not send "ready", 
> expect FLUSH' '
>   printf "/acknowledgments/,$ s/0000/0001/" \
>   >"$HTTPD_ROOT_PATH/one-time-sed" &&
>  
> - GIT_TRACE_PACKET="$(pwd)/log" test_must_fail git -C http_child \
> + test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" git -C http_child \
>   -c protocol.version=2 \
>   fetch "$HTTPD_URL/one_time_sed/http_parent" 2> err &&
>   grep "fetch< acknowledgments" log &&

I know it only has been a few days, but is there any other issue
in the patch, anybody?

Otherwise, I am wondering if we can move this forwared after
squashing the above fix in.

Thanks.


Re: [PATCH] fetch-pack: be more precise in parsing v2 response

2018-10-21 Thread Junio C Hamano
Jonathan Tan  writes:

> + GIT_TRACE_PACKET="$(pwd)/log" test_must_fail git -C http_child \
> + -c protocol.version=2 \
> + fetch "$HTTPD_URL/one_time_sed/http_parent" 2> err &&

Because test_must_fail is a shell function, the above is not a
correct way to say "I want GIT_TRACE_PACKET exported only while this
thing runs".

I'll squash the following in.

 t/t5702-protocol-v2.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 51009ca391..d58fbfa9e5 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -555,7 +555,7 @@ test_expect_success 'when server does not send "ready", 
expect FLUSH' '
printf "/acknowledgments/,$ s/0000/0001/" \
>"$HTTPD_ROOT_PATH/one-time-sed" &&
 
-   GIT_TRACE_PACKET="$(pwd)/log" test_must_fail git -C http_child \
+   test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" git -C http_child \
-c protocol.version=2 \
fetch "$HTTPD_URL/one_time_sed/http_parent" 2> err &&
grep "fetch< acknowledgments" log &&


[PATCH] fetch-pack: be more precise in parsing v2 response

2018-10-19 Thread Jonathan Tan
Each section in a protocol v2 response is followed by either a DELIM
packet (indicating more sections to follow) or a FLUSH packet
(indicating none to follow). But when parsing the "acknowledgments"
section, do_fetch_pack_v2() is liberal in accepting both, and determines
whether to continue reading based solely on the contents of the
"acknowledgments" section, not on whether DELIM or FLUSH was read.

There is no issue with a protocol-compliant server, but this can result
in confusing error messages when communicating with a server that
serves unexpected additional sections. Consider a server that sends
"new-section" after "acknowledgments":

 - client writes request
 - client reads the "acknowledgments" section which contains no "ready",
   then DELIM
 - since there was no "ready", client needs to continue negotiation, and
   writes request
 - client reads "new-section", and reports to the end user "expected
   'acknowledgments', received 'new-section'"

For the person debugging the involved Git implementation(s), the error
message is confusing in that "new-section" was not received in response
to the latest request, but to the first one.

One solution is to always continue reading after DELIM, but in this
case, we can do better. We know from the protocol that "ready" means at
least the packfile section is coming (hence, DELIM) and that no "ready"
means that no sections are to follow (hence, FLUSH). So teach
process_acks() to enforce this.

Signed-off-by: Jonathan Tan 
---
 fetch-pack.c   | 12 ++
 t/t5702-protocol-v2.sh | 50 ++
 2 files changed, 62 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index b3ed7121bc..9691046e64 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1248,6 +1248,18 @@ static int process_acks(struct fetch_negotiator 
*negotiator,
reader->status != PACKET_READ_DELIM)
die(_("error processing acks: %d"), reader->status);
 
+   /*
+* If an "acknowledgments" section is sent, a packfile is sent if and
+* only if "ready" was sent in this section. The other sections
+* ("shallow-info" and "wanted-refs") are sent only if a packfile is
+* sent. Therefore, a DELIM is expected if "ready" is sent, and a FLUSH
+* otherwise.
+*/
+   if (received_ready && reader->status != PACKET_READ_DELIM)
+   die(_("expected packfile to be sent after 'ready'"));
+   if (!received_ready && reader->status != PACKET_READ_FLUSH)
+   die(_("expected no other sections to be sent after no 
'ready'"));
+
/* return 0 if no common, 1 if there are common, or 2 if ready */
return received_ready ? 2 : (received_ack ? 1 : 0);
 }
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 8360188c01..51009ca391 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -512,6 +512,56 @@ test_expect_success 'push with http:// and a config of v2 
does not request v2' '
! grep "git< version 2" log
 '
 
+test_expect_success 'when server sends "ready", expect DELIM' '
+   rm -rf "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" http_child &&
+
+   git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+   test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one &&
+
+   git clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+   test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+   # After "ready" in the acknowledgments section, pretend that a FLUSH
+   # (0000) was sent instead of a DELIM (0001).
+   printf "/ready/,$ s/0001/0000/" \
+   >"$HTTPD_ROOT_PATH/one-time-sed" &&
+
+   test_must_fail git -C http_child -c protocol.version=2 \
+   fetch "$HTTPD_URL/one_time_sed/http_parent" 2> err &&
+   test_i18ngrep "expected packfile to be sent after .ready." err
+'
+
+test_expect_success 'when server does not send "ready", expect FLUSH' '
+   rm -rf "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" http_child log &&
+
+   git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+   test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one &&
+
+   git clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+   test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+   # Create many commits to extend the negotiation phase across multipl


Re: git svn clone - Malformed network data: The XML response contains invalid XML: Malformed XML: no element found at /usr/share/perl5/vendor_perl/5.26/Git/SVN/Ra.pm line 312

2018-01-21 Thread Eric Wong
Jason Pyeron  wrote:
> 
> I am assuming that this is an issue caused by codeplex's svn
> from tfs implementation. Does anyone here have any insight?

Seems like it, even using svn(1) fails (see below)

> r27599 = 9e769d8327767a155d7b96b7cc28579cf0ed4c93 (refs/remotes/git-svn)
> Malformed network data: The XML response contains invalid XML: Malformed XML: 
> no element found at /usr/share/perl5/vendor_perl/5.26/Git/SVN/Ra.pm line 312.

OK, so let's find out what the next commit is after r27599:

  svn log -v -r 27599:HEAD https://smtp4dev.svn.codeplex.com/svn

  # which tells me r27864

  svn diff -r 27599:27864 https://smtp4dev.svn.codeplex.com/svn
  # which tells me:
  svn: E130003: The XML response contains invalid XML

strace only shows encrypted data, so maybe there's a flag or
debug option you can enable in SVN itself to see what was in the
bad XML or maybe contact the admin to see if it can be fixed.
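If the admin can't fix it, one way to narrow down which revision's data is bad is a binary search over the failing range, under the assumption (which may not hold) that `svn diff` succeeds for every revision up to the corrupt one and fails past it. A rough Python sketch — the URL and revision numbers come from the commands above, while `diff_ok` and `first_failing` are made-up names for illustration:

```python
import subprocess

def diff_ok(rev, url="https://smtp4dev.svn.codeplex.com/svn", base=27599):
    """Illustrative predicate: does `svn diff` against this revision succeed?
    Mirrors the `svn diff -r 27599:27864 ` command shown above."""
    res = subprocess.run(["svn", "diff", "-r", f"{base}:{rev}", url],
                         stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return res.returncode == 0

def first_failing(lo, hi, ok):
    """Binary-search for the first revision in (lo, hi] where ok(rev) is
    False, assuming ok(lo) holds, ok(hi) does not, and the failure is
    monotonic across the range."""
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if ok(mid):
            lo = mid
        else:
            hi = mid
    return hi
```

Something like `first_failing(27599, 27864, diff_ok)` would then point at the first revision whose diff the server can no longer serve.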


git svn clone - Malformed network data: The XML response contains invalid XML: Malformed XML: no element found at /usr/share/perl5/vendor_perl/5.26/Git/SVN/Ra.pm line 312

2018-01-21 Thread Jason Pyeron

I am assuming that this is an issue caused by codeplex's SVN-from-TFS 
implementation. Does anyone here have any insight?

$ git --version
git version 2.15.0

$ git svn clone https://smtp4dev.svn.codeplex.com/svn smtp4dev
Initialized empty Git repository in 
/cygdrive/c/Users/Public/Desktop/projects/smtp4dev/.git/
r20111 = 443d627cdfeeb240162b9f089ab50c6f058a1260 (refs/remotes/git-svn)
A   trunk/smtp4dev/smtp4dev.csproj
A   trunk/smtp4dev/Resources/Icon2.ico
A   trunk/smtp4dev/Resources/Icon1.ico
A   trunk/smtp4dev/RegistrySettings.cs
A   trunk/smtp4dev/Properties/Settings.settings
A   trunk/smtp4dev/Properties/Settings.Designer.cs
A   trunk/smtp4dev/Properties/Resources.resx
A   trunk/smtp4dev/Properties/Resources.Designer.cs
A   trunk/smtp4dev/Properties/AssemblyInfo.cs
A   trunk/smtp4dev/Program.cs
A   trunk/smtp4dev/OptionsForm.resx
A   trunk/smtp4dev/OptionsForm.Designer.cs
A   trunk/smtp4dev/OptionsForm.cs
A   
trunk/smtp4dev/obj/Debug/TempPE/Properties.Resources.Designer.cs.dll
A   trunk/smtp4dev/obj/Debug/smtp4dev.Properties.Resources.resources
A   trunk/smtp4dev/obj/Debug/smtp4dev.pdb
A   trunk/smtp4dev/obj/Debug/smtp4dev.OptionsForm.resources
A   trunk/smtp4dev/obj/Debug/smtp4dev.MainForm.resources
A   trunk/smtp4dev/obj/Debug/smtp4dev.exe
A   trunk/smtp4dev/obj/Debug/smtp4dev.csproj.GenerateResource.Cache
A   trunk/smtp4dev/obj/Debug/smtp4dev.csproj.FileListAbsolute.txt
A   trunk/smtp4dev/obj/Debug/ResolveAssemblyReference.cache
A   trunk/smtp4dev/MainForm.resx
A   trunk/smtp4dev/MainForm.Designer.cs
A   trunk/smtp4dev/MainForm.cs
A   trunk/smtp4dev/lib/log4net.dll
A   trunk/smtp4dev/lib/cses.smtp.server.xml
A   trunk/smtp4dev/lib/cses.smtp.server.dll
A   trunk/smtp4dev/Email.cs
A   trunk/smtp4dev/bin/Debug/smtp4dev.vshost.exe
A   trunk/smtp4dev/bin/Debug/smtp4dev.vshost.exe.manifest
A   trunk/smtp4dev/bin/Debug/smtp4dev.vshost.exe.config
A   trunk/smtp4dev/bin/Debug/smtp4dev.pdb
A   trunk/smtp4dev/bin/Debug/smtp4dev.exe
A   trunk/smtp4dev/bin/Debug/smtp4dev.exe.config
A   trunk/smtp4dev/bin/Debug/log4net.dll
A   trunk/smtp4dev/bin/Debug/cses.smtp.server.xml
A   trunk/smtp4dev/bin/Debug/cses.smtp.server.dll
A   trunk/smtp4dev/app.config
A   trunk/smtp4dev.suo
A   trunk/smtp4dev.sln
A   trunk/smtp4dev.4.5.resharper.user
A   trunk/Setup/Setup.vdproj
A   trunk/Setup/Debug/smtp4dev-1.0.0.0.zip
A   trunk/Setup/Debug/Setup.msi
A   trunk/Setup/Debug/setup.exe
A   trunk/Setup/Debug/LICENSE
A   trunk/Setup/Debug/LICENSE.log4net
A   trunk/Setup/Debug/LICENSE.cses-smtp-server
A   trunk/LICENSE
A   trunk/LICENSE.rtf
A   trunk/LICENSE.log4net
A   trunk/LICENSE.cses-smtp-server
A   trunk/_ReSharper.smtp4dev/Xaml/CacheProvider.dat
A   trunk/_ReSharper.smtp4dev/WordIndex.New/6/61e28584.dat
A   trunk/_ReSharper.smtp4dev/WordIndex.New/.version
A   trunk/_ReSharper.smtp4dev/WebsiteFileReferences/.version
A   trunk/_ReSharper.smtp4dev/TodoCache/9/3ace61b9.dat
A   trunk/_ReSharper.smtp4dev/TodoCache/.version
A   trunk/_ReSharper.smtp4dev/ProjectModel/ProjectModel.dat
A   trunk/_ReSharper.smtp4dev/CachesImage.bin
W: +empty_dir: trunk/Setup/Release
W: +empty_dir: trunk/smtp4dev/obj/Debug/Refactor
r20114 = 569cc523b14d6346f3198f37b27fccb8cb572ab1 (refs/remotes/git-svn)
... It chugs along then ...
W: -empty_dir: trunk/Rnwood.Smtp4dev/Behaviour.cs
W: -empty_dir: trunk/Rnwood.SmtpServer/SmtpRequest.cs
r27599 = 9e769d8327767a155d7b96b7cc28579cf0ed4c93 (refs/remotes/git-svn)
Malformed network data: The XML response contains invalid XML: Malformed XML: 
no element found at /usr/share/perl5/vendor_perl/5.26/Git/SVN/Ra.pm line 312.


I have tried to fiddle with --log-window-size but nothing seemed to work. 
(https://stackoverflow.com/questions/38071052/getting-error-while-migrating-code-from-svn-to-git-repository-malformed-network)


-Jason

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-   -
- Jason Pyeron  PD Inc. http://www.pdinc.us -
- Principal Consultant  10 West 24th Street #100-
- +1 (443) 269-1555 x333Baltimore, Maryland 21218   -
-   -
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



I wait for your prompt response.

2017-12-16 Thread SAM AZADA
Good day,

I am Mr. Sam Azada from Burkina Faso  a Minister confide on me to look
for foreign partner who will assist him to invest the sum of  Fifty
Million  Dollars  ($50,000,000) in your country.

He has investment interest in mining, exotic properties for commercial
resident, development properties, hotels and any other viable
investment opportunities in your country based on your recommendation
will be highly welcomed.

Hence your co -operation is highly needed to actualize this investment project

I wait for your prompt response.

Sincerely yours

Mr Sam Azada.




Urgent response !!!

2017-11-08 Thread Melisa Mehmet
Greetings,

I have a business proposal I would love to discuss with you. please reply me 
for more details
Yours Sincerely,
miss.melisa.mehmet


RE: STILL WAITING FOR YOUR RESPONSE

2017-10-28 Thread Kim Kyeong
Hello,

Did you received my previous message? 

Regards.



[PATCH v4 06/11] connect: teach client to recognize v1 server response

2017-10-16 Thread Brandon Williams
Teach a client to recognize that a server understands protocol v1 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 1" sent by upload-pack or
receive-pack.

Signed-off-by: Brandon Williams 
---
 connect.c | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/connect.c b/connect.c
index 8e2e276b6..a5e708a61 100644
--- a/connect.c
+++ b/connect.c
@@ -12,6 +12,7 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "protocol.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -129,9 +130,23 @@ static int read_remote_ref(int in, char **src_buf, size_t 
*src_len,
return len;
 }
 
-#define EXPECTING_FIRST_REF 0
-#define EXPECTING_REF 1
-#define EXPECTING_SHALLOW 2
+#define EXPECTING_PROTOCOL_VERSION 0
+#define EXPECTING_FIRST_REF 1
+#define EXPECTING_REF 2
+#define EXPECTING_SHALLOW 3
+
+/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
+static int process_protocol_version(void)
+{
+   switch (determine_protocol_version_client(packet_buffer)) {
+   case protocol_v1:
+   return 1;
+   case protocol_v0:
+   return 0;
+   default:
+   die("server is speaking an unknown protocol");
+   }
+}
 
 static void process_capabilities(int *len)
 {
@@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, 
size_t src_len,
 */
int responded = 0;
int len;
-   int state = EXPECTING_FIRST_REF;
+   int state = EXPECTING_PROTOCOL_VERSION;
 
*list = NULL;
 
while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
switch (state) {
+   case EXPECTING_PROTOCOL_VERSION:
+   if (process_protocol_version()) {
+   state = EXPECTING_FIRST_REF;
+   break;
+   }
+   state = EXPECTING_FIRST_REF;
+   /* fallthrough */
case EXPECTING_FIRST_REF:
process_capabilities(&len);
if (process_dummy_ref()) {
-- 
2.15.0.rc0.271.g36b669edcc-goog
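For readers following along, the "version 1" announcement rides in Git's pkt-line framing: a 4-hex-digit length prefix (which counts its own four bytes) followed by the payload, with "0000" serving as a flush packet. Below is a rough Python sketch of the detection the patch performs; the function names are illustrative, not Git's actual API:

```python
def parse_pkt_line(buf):
    """Split one pkt-line off the front of buf: 4 hex digits giving the
    total length (including the 4 length bytes), then the payload.
    Returns (payload, remainder); a "0000" flush packet yields None."""
    if len(buf) < 4:
        raise ValueError("short read")
    length = int(buf[:4], 16)
    if length == 0:                  # "0000" is a flush packet
        return None, buf[4:]
    return buf[4:length], buf[length:]

def protocol_version(first_payload):
    """Mimic what determine_protocol_version_client() decides from the
    first pkt-line: "version 1" selects v1; anything else is assumed to
    be the start of a v0 ref advertisement."""
    if first_payload.rstrip(b"\n") == b"version 1":
        return 1
    return 0
```

So a v1 server opens with the 14-byte pkt-line `000eversion 1\n`, and a v0 server's first pkt-line is simply its first advertised ref.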



[PATCH v3 06/10] connect: teach client to recognize v1 server response

2017-10-03 Thread Brandon Williams
Teach a client to recognize that a server understands protocol v1 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 1" sent by upload-pack or
receive-pack.

Signed-off-by: Brandon Williams 
---
 connect.c | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/connect.c b/connect.c
index 8e2e276b6..a5e708a61 100644
--- a/connect.c
+++ b/connect.c
@@ -12,6 +12,7 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "protocol.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -129,9 +130,23 @@ static int read_remote_ref(int in, char **src_buf, size_t 
*src_len,
return len;
 }
 
-#define EXPECTING_FIRST_REF 0
-#define EXPECTING_REF 1
-#define EXPECTING_SHALLOW 2
+#define EXPECTING_PROTOCOL_VERSION 0
+#define EXPECTING_FIRST_REF 1
+#define EXPECTING_REF 2
+#define EXPECTING_SHALLOW 3
+
+/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
+static int process_protocol_version(void)
+{
+   switch (determine_protocol_version_client(packet_buffer)) {
+   case protocol_v1:
+   return 1;
+   case protocol_v0:
+   return 0;
+   default:
+   die("server is speaking an unknown protocol");
+   }
+}
 
 static void process_capabilities(int *len)
 {
@@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, 
size_t src_len,
 */
int responded = 0;
int len;
-   int state = EXPECTING_FIRST_REF;
+   int state = EXPECTING_PROTOCOL_VERSION;
 
*list = NULL;
 
while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
switch (state) {
+   case EXPECTING_PROTOCOL_VERSION:
+   if (process_protocol_version()) {
+   state = EXPECTING_FIRST_REF;
+   break;
+   }
+   state = EXPECTING_FIRST_REF;
+   /* fallthrough */
case EXPECTING_FIRST_REF:
process_capabilities(&len);
if (process_dummy_ref()) {
-- 
2.14.2.920.gcf0c67979c-goog



Your response is highly appreciated

2017-09-29 Thread Mr.Adams Salem
Hello ,

I am specifically contacting you in respect of a business proposal that I
have for you as you appear very relevant in the proposal.

Please kindly reply back to me for further details.

Waiting to hear from you.

Regards,

Mr.Adams Salem


Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response

2017-09-28 Thread Brandon Williams
On 09/27, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. 
> > */
> > +static int process_protocol_version(void)
> > +{
> > +   switch (determine_protocol_version_client(packet_buffer)) {
> > +   case protocol_v1:
> > +   return 1;
> > +   case protocol_v0:
> > +   return 0;
> > +   default:
> > +   die("server is speaking an unknown protocol");
> > +   }
> > +}
> 
> For the purpose of "technology demonstration" v1 protocol, it is OK
> to discard the result of "determine_pvc()" like the above code, but
> in a real application, we would do a bit more than just ignoring an
> extra "version #" packet that appears at the beginning, no?
> 
> It would be sensible to design how the result of determine_pvc()
> call is propagated to the remainder of the program in this patch and
> implement it.  Perhaps add a new global (like server_capabilities
> already is) and store the value there, or something?  Or pass a
> pointer to enum protocol_version as a return-location parameter to
> this helper function so that the process_capabilities() can pass a
> pointer to its local variable?

Yes, once we actually implement a v2 we would need to stop throwing away
the result of 'determine_pvc()' and instead do control flow based on the
resultant version.  I was trying to implement 'v1' as simply as possible
so that I wouldn't have to do a large amount of refactoring when proposing
this transition (though it seems Jonathan ended up doing more than I
planned), since I figured we didn't really know what the code would need
to be refactored to in order to handle another protocol version.
I would suspect that we wouldn't want to determine which version a
server is speaking in 'get_remote_heads()', but rather at some point
before that, so we can branch off to do v2-like things, for example
capability discovery instead of ref discovery.

If you do think we need to do more of that refactoring now, before a v2,
I can most certainly work on that.


> 
> >  static void process_capabilities(int *len)
> >  {
> > @@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, 
> > size_t src_len,
> >  */
> > int responded = 0;
> > int len;
> > -   int state = EXPECTING_FIRST_REF;
> > +   int state = EXPECTING_PROTOCOL_VERSION;
> >  
> > *list = NULL;
> >  
> > while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> > switch (state) {
> > +   case EXPECTING_PROTOCOL_VERSION:
> > +   if (process_protocol_version()) {
> > +   state = EXPECTING_FIRST_REF;
> > +   break;
> > +   }
> > +   state = EXPECTING_FIRST_REF;
> > +   /* fallthrough */
> > case EXPECTING_FIRST_REF:
> > process_capabilities(&len);
> > if (process_dummy_ref()) {

-- 
Brandon Williams


Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response

2017-09-27 Thread Brandon Williams
On 09/27, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. 
> > */
> > +static int process_protocol_version(void)
> > +{
> > +   switch (determine_protocol_version_client(packet_buffer)) {
> > +   case protocol_v1:
> > +   return 1;
> > +   case protocol_v0:
> > +   return 0;
> > +   default:
> > +   die("server is speaking an unknown protocol");
> > +   }
> > +}
> 
> checkpatch.pl yells at me:
> 
> ERROR: switch and case should be at the same indent
> 
> and we would probably want to teach "make style" the same, if we
> already don't.

'make style' actually already understands this, I just forgot to run it
on this change :)

-- 
Brandon Williams


Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response

2017-09-26 Thread Junio C Hamano
Brandon Williams  writes:

> +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
> +static int process_protocol_version(void)
> +{
> + switch (determine_protocol_version_client(packet_buffer)) {
> + case protocol_v1:
> + return 1;
> + case protocol_v0:
> + return 0;
> + default:
> + die("server is speaking an unknown protocol");
> + }
> +}

For the purpose of "technology demonstration" v1 protocol, it is OK
to discard the result of "determine_pvc()" like the above code, but
in a real application, we would do a bit more than just ignoring an
extra "version #" packet that appears at the beginning, no?

It would be sensible to design how the result of determine_pvc()
call is propagated to the remainder of the program in this patch and
implement it.  Perhaps add a new global (like server_capabilities
already is) and store the value there, or something?  Or pass a
pointer to enum protocol_version as a return-location parameter to
this helper function so that the process_capabilities() can pass a
pointer to its local variable?
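The return-location idea could look roughly like this — a Python transliteration, purely illustrative (the C version would pass an `enum protocol_version *` out-parameter rather than a dict):

```python
# Instead of discarding the detected version, hand the caller a place to
# record it, so later code can branch on the negotiated protocol.
V0, V1 = 0, 1

def process_protocol_version(first_payload, state):
    """Record the detected version in `state` and report whether the
    first pkt-line was consumed as a version announcement."""
    if first_payload.rstrip(b"\n") == b"version 1":
        state["version"] = V1
        return True      # consumed: the next pkt-line starts the refs
    state["version"] = V0
    return False         # not consumed: this pkt-line is the first ref
```

The caller keeps the same two-way branch as the patch's state machine, but the version survives for the rest of the session instead of being thrown away.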

>  static void process_capabilities(int *len)
>  {
> @@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, 
> size_t src_len,
>*/
>   int responded = 0;
>   int len;
> - int state = EXPECTING_FIRST_REF;
> + int state = EXPECTING_PROTOCOL_VERSION;
>  
>   *list = NULL;
>  
>   while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
>   switch (state) {
> + case EXPECTING_PROTOCOL_VERSION:
> + if (process_protocol_version()) {
> + state = EXPECTING_FIRST_REF;
> + break;
> + }
> + state = EXPECTING_FIRST_REF;
> + /* fallthrough */
>   case EXPECTING_FIRST_REF:
>   process_capabilities(&len);
>   if (process_dummy_ref()) {


Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response

2017-09-26 Thread Junio C Hamano
Brandon Williams  writes:

> +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
> +static int process_protocol_version(void)
> +{
> + switch (determine_protocol_version_client(packet_buffer)) {
> + case protocol_v1:
> + return 1;
> + case protocol_v0:
> + return 0;
> + default:
> + die("server is speaking an unknown protocol");
> + }
> +}

checkpatch.pl yells at me:

ERROR: switch and case should be at the same indent

and we would probably want to teach "make style" the same, if we
already don't.


[PATCH v2 6/9] connect: teach client to recognize v1 server response

2017-09-26 Thread Brandon Williams
Teach a client to recognize that a server understands protocol v1 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 1" sent by upload-pack or
receive-pack.

Signed-off-by: Brandon Williams 
---
 connect.c | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/connect.c b/connect.c
index 8e2e276b6..1805debf3 100644
--- a/connect.c
+++ b/connect.c
@@ -12,6 +12,7 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "protocol.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -129,9 +130,23 @@ static int read_remote_ref(int in, char **src_buf, size_t 
*src_len,
return len;
 }
 
-#define EXPECTING_FIRST_REF 0
-#define EXPECTING_REF 1
-#define EXPECTING_SHALLOW 2
+#define EXPECTING_PROTOCOL_VERSION 0
+#define EXPECTING_FIRST_REF 1
+#define EXPECTING_REF 2
+#define EXPECTING_SHALLOW 3
+
+/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
+static int process_protocol_version(void)
+{
+   switch (determine_protocol_version_client(packet_buffer)) {
+   case protocol_v1:
+   return 1;
+   case protocol_v0:
+   return 0;
+   default:
+   die("server is speaking an unknown protocol");
+   }
+}
 
 static void process_capabilities(int *len)
 {
@@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, 
size_t src_len,
 */
int responded = 0;
int len;
-   int state = EXPECTING_FIRST_REF;
+   int state = EXPECTING_PROTOCOL_VERSION;
 
*list = NULL;
 
while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
switch (state) {
+   case EXPECTING_PROTOCOL_VERSION:
+   if (process_protocol_version()) {
+   state = EXPECTING_FIRST_REF;
+   break;
+   }
+   state = EXPECTING_FIRST_REF;
+   /* fallthrough */
case EXPECTING_FIRST_REF:
process_capabilities(&len);
if (process_dummy_ref()) {
-- 
2.14.1.992.g2c7b836f3a-goog



[PATCH 5/8] connect: teach client to recognize v1 server response

2017-09-13 Thread Brandon Williams
Teach a client to recognize that a server understands protocol v1 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 1" sent by upload-pack or
receive-pack.

Signed-off-by: Brandon Williams 
---
 connect.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/connect.c b/connect.c
index 49b28b83b..2702e1f2e 100644
--- a/connect.c
+++ b/connect.c
@@ -11,6 +11,7 @@
 #include "string-list.h"
 #include "sha1-array.h"
 #include "transport.h"
+#include "protocol.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -142,6 +143,27 @@ struct ref **get_remote_heads(int in, char *src_buf, 
size_t src_len,
if (len < 0)
die_initial_contact(saw_response);
 
+   /* Only check for version information on first response */
+   if (!saw_response) {
+   switch (determine_protocol_version_client(buffer)) {
+   case protocol_v1:
+   /*
+* First pkt-line contained the version string.
+* Continue on to process the ref advertisement.
+*/
+   continue;
+   case protocol_v0:
+   /*
+* Server is speaking protocol v0 and sent a
+* ref so we need to process it.
+*/
+   break;
+   default:
+   die("server is speaking an unknown protocol");
+   break;
+   }
+   }
+
if (!len)
break;
 
-- 
2.14.1.690.gbb1197296e-goog



Your response is required

2017-08-31 Thread Dean
Hello there, I sent you an email yesterday with a proposal that I think is 
going to be of mutual benefit but I did not receive a response from you yet so 
I am sending you this follow up to confirm if you actually received my email 
yesterday.

Please let me know if you did not receive my proposal so that I can resend it 
to you. Use the email below to contact me.

Regards,
Dean
dohn20...@secsuremailer.com


[RFC 6/7] transport: teach client to recognize v2 server response

2017-08-24 Thread Brandon Williams
Teach a client to recognize that a server understands protocol v2 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 2" sent by upload-pack.

Signed-off-by: Brandon Williams 
---
 builtin/fetch-pack.c |   4 +-
 builtin/send-pack.c  |   5 +-
 connect.c| 165 ++-
 remote-curl.c|   7 ++-
 remote.h |  22 ++-
 transport.c  |  60 ---
 6 files changed, 178 insertions(+), 85 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d13f..a2a5e1c73 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -52,6 +52,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
struct fetch_pack_args args;
struct oid_array shallow = OID_ARRAY_INIT;
struct string_list deepen_not = STRING_LIST_INIT_DUP;
+   struct remote_refs_scanner scanner;
 
packet_trace_identity("fetch-pack");
 
@@ -193,7 +194,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
if (!conn)
return args.diag_url ? 0 : 1;
}
-   get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
+   remote_refs_scanner_init(&scanner, &ref, 0, NULL, &shallow);
+   get_remote_heads(fd[0], NULL, 0, &scanner);
 
ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
 &shallow, pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index fc4f0bb5f..92ec1f871 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -154,6 +154,7 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
int progress = -1;
int from_stdin = 0;
struct push_cas_option cas = {0};
+   struct remote_refs_scanner scanner;
 
struct option options[] = {
OPT__VERBOSITY(&verbose),
@@ -256,8 +257,8 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
args.verbose ? CONNECT_VERBOSE : 0);
}
 
-   get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
-&extra_have, &shallow);
+   remote_refs_scanner_init(&scanner, &remote_refs, REF_NORMAL, 
&extra_have, &shallow);
+   get_remote_heads(fd[0], NULL, 0, &scanner);
 
transport_verify_remote_names(nr_refspecs, refspecs);
 
diff --git a/connect.c b/connect.c
index d609267be..732b651d9 100644
--- a/connect.c
+++ b/connect.c
@@ -107,97 +107,124 @@ static void annotate_refs_with_symref_info(struct ref 
*ref)
string_list_clear(&symref, 0);
 }
 
-/*
- * Read all the refs from the other end
- */
-struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+void remote_refs_scanner_init(struct remote_refs_scanner *scanner,
  struct ref **list, unsigned int flags,
  struct oid_array *extra_have,
  struct oid_array *shallow_points)
 {
-   struct ref **orig_list = list;
+   memset(scanner, 0, sizeof(*scanner));
+
+   scanner->orig_list = list;
+   *list = NULL;
+   scanner->list = list;
+   scanner->flags = flags;
+   scanner->extra_have = extra_have;
+   scanner->shallow_points = shallow_points;
+}
+
+int process_ref(struct remote_refs_scanner *scanner,
+   const char *buffer, int len)
+{
+   struct ref *ref;
+   struct object_id old_oid;
+   const char *name;
+   int name_len;
+   const char *arg;
+   int ret = 1;
+
+   if (len < 0)
+   die_initial_contact(scanner->seen_response);
+
+   if (!len) {
+   ret = 0;
+   goto out;
+   }
+
+   if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
+   die("remote error: %s", arg);
+
+   if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
+   skip_prefix(buffer, "shallow ", &arg)) {
+   if (get_oid_hex(arg, &old_oid))
+   die("protocol error: expected shallow sha-1, got '%s'", 
arg);
+   if (!scanner->shallow_points)
+   die("repository on the other end cannot be shallow");
+   oid_array_append(scanner->shallow_points, &old_oid);
+   goto out;
+   }
+
+   if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
+   buffer[GIT_SHA1_HEXSZ] != ' ')
+   die("protocol error: expected sha/ref, got '%s'", buffer);
+   name = buffer + GIT_SHA1_HEXSZ + 1;
+
+   name_len = strlen(name);
+   if (len != name_len + GIT_SHA1_HEXSZ + 1) {
+   



