[PATCH v1 1/1] test-lint: echo -e (or -E) is not portable

2017-09-16 Thread tboegi
From: Torsten Bögershausen 

Some implementations of `echo` support the '-e' option to enable
backslash interpretation of the following string.
As an addition, they support '-E' to turn it off.

However, none of these are portable, POSIX doesn't even mention them,
and many implementations don't support them.

A check for '-n' is already done in check-non-portable-shell.pl,
extend it to cover '-n', '-e' or '-E-'

Signed-off-by: Torsten Bögershausen 
---
 t/check-non-portable-shell.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
index b170cbc045..03dc9d2852 100755
--- a/t/check-non-portable-shell.pl
+++ b/t/check-non-portable-shell.pl
@@ -17,7 +17,7 @@ sub err {
 while (<>) {
chomp;
/\bsed\s+-i/ and err 'sed -i is not portable';
-   /\becho\s+-n/ and err 'echo -n is not portable (please use printf)';
+   /\becho\s+-[neE]/ and err 'echo with option is not portable (please use 
printf)';
/^\s*declare\s+/ and err 'arrays/declare not portable';
/^\s*[^#]\s*which\s/ and err 'which is not portable (please use type)';
/\btest\s+[^=]*==/ and err '"test a == b" is not portable (please use 
=)';
-- 
2.14.1.145.gb3622a4ee9



[PATCH] doc: update information about windows build

2017-09-16 Thread Kaartic Sivaraam
029aeeed5 (travis-ci: build and test Git on Windows, 2017-03-24) added
support for testing the git build for Windows.

So, update the documentation and the example used in it.

Signed-off-by: Kaartic Sivaraam 
---
 Documentation/SubmittingPatches | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 558d465b656a0..43c9b9ae49298 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -395,8 +395,9 @@ GitHub-Travis CI hints
 
 With an account at GitHub (you can get one for free to work on open
 source projects), you can use Travis CI to test your changes on Linux,
-Mac (and hopefully soon Windows).  You can find a successful example
-test build here: https://travis-ci.org/git/git/builds/120473209
+Mac.  Though Travis CI doesn't support Windows we have worked around
+that temporarily!  You can find a successful example test build
+here: https://travis-ci.org/git/git/builds/264560175
 
 Follow these steps for the initial setup:
 

--
https://github.com/git/git/pull/402


Behaviour of 'git stash show' when a stash has untracked files

2017-09-16 Thread Kaartic Sivaraam
Some time ago, I stashed a few changes along with untracked files. I
almost forgot it until recently. Then I wanted to see what I change I
had in the stash. So I did a 'git stash show '. It worked fine but
didn't say anything about the untracked files in that stash. That made
me wonder where the untracked files I added went. I then applied the
stash to see that they were still there but weren't listed in show.

I understand that they aren't listed because 'git stash show' is
typically a "diff between the stashed state and its original parent" as
the documentation says but shouldn't there be at least a message that
the stash contains untracked files? Those untracked files are "part of
the stash" and I see no way to get information about their presence
currently.

So, should this behaviour be changed?

-- 
Kaartic


Behaviour of 'git stash show' when a stash has untracked files

2017-09-16 Thread Kaartic Sivaraam
Some time ago, I stashed a few changes along with untracked files. I
almost forgot it until recently. Then I wanted to see what I change I
had in the stash. So I did a 'git stash show '. It worked fine but
didn't say anything about the untracked files in that stash. That made
me wonder where the untracked files I added went. I then applied the
stash to see that they were still there but weren't listed in show.

I understand that they aren't listed because 'git stash show' is
typically a "diff between the stashed state and its original parent" as
the documentation says but shouldn't there be at least a message that
the stash contains untracked files. Those untracked files are "part of
the stash" and I see no way to get information about their presence
currently.

-- 
Kaartic


Re: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension

2017-09-16 Thread Junio C Hamano
Ben Peart  writes:

> +write_integration_script() {
> + write_script .git/hooks/fsmonitor-test<<-\EOF
> + if [ "$#" -ne 2 ]; then
> + echo "$0: exactly 2 arguments expected"
> + exit 2
> + fi
> + if [ "$1" != 1 ]; then
> + echo -e "Unsupported core.fsmonitor hook version.\n" >&2
> + exit 1
> + fi

In addition to "echo -e" thing pointed out earlier, these look
somewhat unusual in our shell scripts, relative to what
Documentation/CodingGuidelines tells us to do:

 - We prefer a space between the function name and the parentheses,
   and no space inside the parentheses. The opening "{" should also
   be on the same line.

(incorrect)
my_function(){
...

(correct)
my_function () {
...

 - We prefer "test" over "[ ... ]".

 - Do not write control structures on a single line with semicolon.
   "then" should be on the next line for if statements, and "do"
   should be on the next line for "while" and "for".

(incorrect)
if test -f hello; then
do this
fi

(correct)
if test -f hello
then
do this
fi

> diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
> new file mode 100755
> index 00..aaee5d1fe3
> --- /dev/null
> +++ b/t/t7519/fsmonitor-watchman
> @@ -0,0 +1,128 @@
> +#!/usr/bin/perl
> +
> +use strict;
> +use warnings;
> +use IPC::Open2;
> + ...
> + open (my $fh, ">", ".git/watchman-query.json");
> + print $fh "[\"query\", \"$git_work_tree\", { \
> + \"since\": $time, \
> + \"fields\": [\"name\"], \
> + \"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], 
> [\"not\", \"exists\"]]] \
> + }]";
> + close $fh;
> +
> + print CHLD_IN "[\"query\", \"$git_work_tree\", { \
> + \"since\": $time, \
> + \"fields\": [\"name\"], \
> + \"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], 
> [\"not\", \"exists\"]]] \
> + }]";

This look painful to read, write and maintain.  IIRC, Perl supports
the < +}
> \ No newline at end of file

Oops.

Thanks.


Re: Git 2.14.1: t6500: error during test on musl libc

2017-09-16 Thread A. Wilcox
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

On 16/09/17 22:16, Junio C Hamano wrote:
> Subject: gc: call fscanf() with %s, not %c, when reading
> hostname
> 
> Earlier in this codepath, we (ab)used "%c" to read the
> hostname recorded in the lockfile into locking_host[HOST_NAME_MAX +
> 1] while substituting  with the actual value of
> HOST_NAME_MAX.
> 
> This turns out to be incorrect, as it an instruction to read
> exactly

it -> it is

> the specified number of bytes.  We are trying to read at most that 
> many bytes, we should be using "%s" instead.
> 
> Helped-by: A. Wilcox  Signed-off-by: Junio
> C Hamano  --- builtin/gc.c | 2 +- 1 file
> changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c index
> 3c78fcb9b1..bb2d6c1fb2 100644 --- a/builtin/gc.c +++
> b/builtin/gc.c @@ -258,7 +258,7 @@ static const char
> *lock_repo_for_gc(int force, pid_t* ret_pid) int should_exit;
> 
> if (!scan_fmt) -  scan_fmt = xstrfmt("%s %%%dc", 
> "%"SCNuMAX,
> HOST_NAME_MAX); + scan_fmt = xstrfmt("%s %%%ds", 
> "%"SCNuMAX,
> HOST_NAME_MAX); fp = fopen(pidfile_path, "r"); memset(locking_host,
> 0, sizeof(locking_host)); should_exit =
> 

Ack.  This is what I used in my testing; looks great.  Thanks so much
for your time and patience.


Sincerely,
- --arw

- -- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
http://adelielinux.org
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQIcBAEBCAAGBQJZve4WAAoJEMspy1GSK50UruQP/3rca4kZp/mootkcgJrNwlSc
5SvFETaBMYb9M6CewOIgDWtQVqdGmkX+vhlyz/fO1aMUzed9JNgoYD0Fj8S+8RL/
aan96+Om94znlWydSlU48ZaR69sbj012TSJBvQdAs9K9Nfi40lMVGi8BvI5vsAG0
PCMyAUB4N6b9FYUNb6zO73JjmQSYzYV2TFOvACFgHwZ7ailyeyGI3LIP5Yd4OiF1
ERyJIKDoBjf0ns95xjox+HYFzG3VFDriM6GdEG1w25sLG+nvWxy/XV1Dv/K1/LiV
VzSJ3FEdNdOoO5SLcX4uRMYzRKLt3ihwnwIS6SC44Xd7XaqaWpfpueGxilQCI3Yn
FNWB3mX9oeXICIvM6PscJTzRLJd+gp3RbyLfavaQ2cNakrL9z+Qm5v6a52ufCqvP
SGdruVLGLDCR0qPWokKa64+uSfi6QNNmVgzVx4fRIwbMSUm1+sEh0uIjqgTQPBTq
Jyn6og/T234punjBI+GuEXhsb3FEbJh2xGyOQbmhW4l/DPzerUpXJAgCPC2JT93+
Z1s0aeDqC0n/dMHofVd8ZRFKt/ImVT0ywg7A9bJahhaJwmtkh0Xgb2hLgO5FGeuI
zY+9FI+doH6Al+KgxqSduKfxDOsEoxYRLCRYO2QnNjE7iYLIdUYRXEucw3Z/VQA1
b44AES8+WthuxQKsRstE
=Zwpa
-END PGP SIGNATURE-


Re: Git 2.14.1: t6500: error during test on musl libc

2017-09-16 Thread Junio C Hamano
"A. Wilcox"  writes:

>> I did a quick scan for substring "scanf" and read through the 
>> output, and it seems that this is the only one that wants to do
>> the this many characters, e.g. "%42c", conversion.
>> So it seems to me that a real fix has to read the file ourselves
>> and parse up to our HOST_NAME_MAX+1 to see if the hostname refers
>> to us, and fscanf that cannot take "slurp up to this many bytes" is
>> not useful tool to implementing that parsing.
> ...
> Except that is *exactly* *what* *s* *does* (quoting C11 §7.21.6.2):
>
> 9   An input item is defined as the longest sequence of input
> characters which does not exceed any specified field width

Ah, sorry, I completely misread what you meant.

I thought you were suggesting to replace "%c" with just an
unadorned "%s".  You meant that we can use "%s" instead.
And that solution makes sense.  Yes, it is exactly what %s
does.

So something like the following would be a sufficient fix, I guess?

Thanks.

-- >8 --
Subject: gc: call fscanf() with %s, not %c, when reading hostname

Earlier in this codepath, we (ab)used "%c" to read the hostname
recorded in the lockfile into locking_host[HOST_NAME_MAX + 1] while
substituting  with the actual value of HOST_NAME_MAX.

This turns out to be incorrect, as it an instruction to read exactly
the specified number of bytes.  We are trying to read at most that
many bytes, we should be using "%s" instead.

Helped-by: A. Wilcox 
Signed-off-by: Junio C Hamano 
---
 builtin/gc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 3c78fcb9b1..bb2d6c1fb2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -258,7 +258,7 @@ static const char *lock_repo_for_gc(int force, pid_t* 
ret_pid)
int should_exit;
 
if (!scan_fmt)
-   scan_fmt = xstrfmt("%s %%%dc", "%"SCNuMAX, 
HOST_NAME_MAX);
+   scan_fmt = xstrfmt("%s %%%ds", "%"SCNuMAX, 
HOST_NAME_MAX);
fp = fopen(pidfile_path, "r");
memset(locking_host, 0, sizeof(locking_host));
should_exit =


Re: Git 2.14.1: t6500: error during test on musl libc

2017-09-16 Thread A. Wilcox
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

On 16/09/17 19:36, Junio C Hamano wrote:
> "A. Wilcox"  writes:
> 
>> While musl's reading is correct from an English grammar point of
>> view, it does not seem to be how any other implementation has
>> read the standard.
>> 
>> However!  It gets better.
>> 
>> The ISO C standard, committee draft version April 12, 2011,
>> states[4]:
>> 
>>> cMatches a sequence of characters of exactly the number
>>> specified by the field width (1 if no field width is present in
>>> the directive).
>> ... Since Git is specifically attempting to read in a host name,
>> there may be a solution: while 'c' guarantees that any byte will
>> be read, and 's' will skip whitespace, RFCs 952 and 1123 §2.1[5]
>> specify that a network host name must never contain whitespace.
>> IDNA2008 §2.3.2.1[6] (and IDNA2003 before it) specifically
>> removes ASCII whitespace characters from the valid set of Unicode
>> codepoints for an IDNA host name[7]. Additionally, the buffer
>> `locking_host` is already defined as an array of char of size
>> HOST_NAME_MAX + 1, and the width specifier in fscanf is specified
>> as HOST_NAME_MAX.  Therefore, it should be safe to change git to
>> use the 's' type character.  Additionally, I can confirm that
>> this change (patch attached) allows the Git test suite to pass on
>> musl.
> 
> I did a quick scan for substring "scanf" and read through the 
> output, and it seems that this is the only one that wants to do
> the this many characters, e.g. "%42c", conversion.
> 
> I am a bit worried about the correctness of your conclusion,
> though.


%[width]s means read up to [width] non-ws characters.  It is exactly
what Git wants out of %[width]c, with the difference that 's' will
stop at whitespace.  Since hostnames cannot legally contain
whitespace, the difference is negligible.


> As long as we are reading from the file written by us, because the 
> string we write as the hostname part comes from what we prepare in 
> my_host[HOST_NAME_MAX+1] using xgethostname(), we may know it
> would fit in locking_host[HOST_NAME_MAX+1].  But because
> HOST_NAME_MAX on my platform may be shorter than what your platform
> uses, I'll run over the end of my buffer if I am reading the
> lockfile you write to notice that the repository is in use from
> your host.  After all, the reason why we write hostname in the file
> is because we expect the filesystem is shared across different
> hosts, so relying on HOST_NAME_MAX to be the same across platforms
> would not be a good way to go.


You cannot run over the buffer in any scenario: if the hostname is
longer than [width], then [width] + \0 will be written to the buffer.
 Since the buffer is HOST_NAME_MAX+1 and the width is HOST_NAME_MAX,
it stands to reason that you cannot overflow the buffer.


> So it seems to me that a real fix has to read the file ourselves
> and parse up to our HOST_NAME_MAX+1 to see if the hostname refers
> to us, and fscanf that cannot take "slurp up to this many bytes" is
> not useful tool to implementing that parsing.


Except that is *exactly* *what* *s* *does* (quoting C11 §7.21.6.2):

9   An input item is defined as the longest sequence of input
characters which does not exceed any specified field width


s   Matches a sequence of non-white-space characters.



awilcox on elaine ~ $ cat -> myhost
elaine.foxkit.us
awilcox on elaine ~ $ cat -> test.c
#include 
int main(void) {
char test[9];
fscanf(fopen("myhost", "r"), "%8s", test);
printf("result: %s\n", test);
return 0;
}
awilcox on elaine ~ $ c99 -o test test.c
awilcox on elaine ~ $ ./test
result: elaine.f
awilcox on elaine ~ $ /usr/lib/libc.so
musl libc (powerpc)
Version 1.1.16


> The current scan_fmt variable comes from da25bdb7 ("use 
> HOST_NAME_MAX to size buffers for gethostname(2)", 2017-04-18),
> and before that, we used to use "%"SCNuMAX" %127c", which was
> already problematic.  The "%127c" part came from the very original
> of this codepath in 64a99eb4 ("gc: reject if another gc is running,
> unless --force is given", 2013-08-08), whose first appearance in
> released versions was in v1.8.5, it seems.  IOW, nobody tried to
> run Git with musl C in the past 4 years and you are the first one
> to notice?


Yes.  Yes I am.

Because this is the first version that has a *test* that *tests* this
behaviour.

The odds that someone:

* is using musl
* with a Git repository over a network file system
* with another platform with a different HOST_NAME_MAX
* both concurrently running `git gc`, or the non-musl one being killed
* with loose refs that they, for some reason, would notice not being
gc'd properly

is almost zero.  However, it is theoretically something that could
happen, and that is the point of a test suite: testing code paths that
are rarely used to ensure they are correct when needed.  And this test
has proven that this code is *not* correct and will *not* function as
intended when it is 

Re: [RFC PATCH 0/2] Add named reference to latest push cert

2017-09-16 Thread Junio C Hamano
Shikher Verma  writes:

> This might be a good starting point for a sample hook if we choose to go
> that way. As Junio suggested.
>> This would not deal with concurrency as it re-uses the
>> same worktree, but illustrates what I had in mind
>> for the git history of that special ref.

That's Stefan; I wouldn't have suggested any approach that uses the
blob whose sole purpose is to serve as a temporary storage area to
pass the information to the hook as part of the permanent record.




Re: [musl] Re: Git 2.14.1: t6500: error during test on musl libc

2017-09-16 Thread Szabolcs Nagy
* Junio C Hamano  [2017-09-17 09:36:26 +0900]:
> versions was in v1.8.5, it seems.  IOW, nobody tried to run Git with
> musl C in the past 4 years and you are the first one to notice?

git works fine on musl in practice, i use it for more than 4years now.



Re: [PATCH] for_each_string_list_item(): behave correctly for empty list

2017-09-16 Thread Junio C Hamano
Michael Haggerty  writes:

> If you pass a newly-initialized or newly-cleared `string_list` to
> `for_each_string_list_item()`, then the latter does
>
> for (
> item = (list)->items; /* note, this is NULL */
> item < (list)->items + (list)->nr; /* note: NULL + 0 */
> ++item)
>
> Even though this probably works almost everywhere, it is undefined
> behavior, and it could plausibly cause highly-optimizing compilers to
> misbehave.
> ...
> It would be a pain to have to change the signature of this macro, and
> we'd prefer not to add overhead to each iteration of the loop. So
> instead, whenever `list->items` is NULL, initialize `item` to point at
> a dummy `string_list_item` created for the purpose.
> ...
> -#define for_each_string_list_item(item,list) \
> - for (item = (list)->items; item < (list)->items + (list)->nr; ++item)
> +extern struct string_list_item dummy_string_list_item;
> +#define for_each_string_list_item(item,list) 
> \
> + for (item = (list)->items ? (list)->items : _string_list_item; \
> +  item < (list)->items + (list)->nr;  \
> +  ++item)

Sorry, but I am confused.

So when (list)->items is NULL, the loop termination condition that
used to be

NULL < NULL + 0

that was problematic because NULL + 0 is problematic now becomes

 < NULL + 0

in the new code?  What made NULL + 0 not problematic now?


Re: [PATCH v2] add test for bug in git-mv for recursive submodules

2017-09-16 Thread Junio C Hamano
Heiko Voigt  writes:

> When using git-mv with a submodule it will detect that and update the
> paths for its configurations (.gitmodules, worktree and gitfile). This
> does not work for recursive submodules where a user renames the root
> submodule.
>
> We discovered this fact when working on on-demand fetch for renamed
> submodules. Lets add a test to document.
>
> Signed-off-by: Heiko Voigt 
> ---
> On Fri, Aug 18, 2017 at 12:04:03PM -0700, Stefan Beller wrote:
>> > I just copied the shortcut that they were adding themselfes as submodule
>> > in 'setup submodule'. The whole setup of submodules in this test is like
>> > this. This way we already had a nested submodule structure which I could
>> > just add.
>> >
>> > I agree that this is unrealistic so I can change that in the test I am
>> > adding. But from what I have seen, this shortcut is taken in quite some
>> > places when dealing with submodules.
>> 
>> Please do not make it worse.
>> Once upon a time (late '16 IIRC) I had a series floating on the
>> list removing all occurrences, but there were issues with the
>> series and it did not land.
>
> Took a little while but here is a more clean patch creating individual
> submodules for the nesting.
>
> Cheers Heiko

Thanks.  Stefan, does this look good to you now?

It is not quite clear which step is expected to fail with the
current code by reading the test or the proposed log message.  Does
"mv" refuse to work and we do not get to run "status", or does
"status" report a failure, or do we fail well before that?

The log message that only says "This does not work when ..." is not
helpful in figuring it out, either.  Something like "This does not
work and fails to update the paths for its configurations" or
whatever that describes "what actually happens" (in contrast to
"what ought to happen", which you described clearly) should be
there.  

Description on how you happened to have discovered the issue feels a
lot less relevant compared to that, and it is totally useless if it
is unclear what the issue is in the first place.

>  t/t7001-mv.sh | 25 +
>  1 file changed, 25 insertions(+)
>
> diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
> index e365d1ff77..cbc5fb37fe 100755
> --- a/t/t7001-mv.sh
> +++ b/t/t7001-mv.sh
> @@ -491,4 +491,29 @@ test_expect_success 'moving a submodule in nested 
> directories' '
>   test_cmp actual expect
>  '
>  
> +test_expect_failure 'moving nested submodules' '
> + git commit -am "cleanup commit" &&
> + mkdir sub_nested_nested &&
> + (cd sub_nested_nested &&
> + touch nested_level2 &&
> + git init &&
> + git add . &&
> + git commit -m "nested level 2"
> + ) &&
> + mkdir sub_nested &&
> + (cd sub_nested &&
> + touch nested_level1 &&
> + git init &&
> + git add . &&
> + git commit -m "nested level 1"
> + git submodule add ../sub_nested_nested &&
> + git commit -m "add nested level 2"
> + ) &&
> + git submodule add ./sub_nested nested_move &&
> + git commit -m "add nested_move" &&
> + git submodule update --init --recursive &&
> + git mv nested_move sub_nested_moved &&
> + git status
> +'
> +
>  test_done


Re: Git 2.14.1: t6500: error during test on musl libc

2017-09-16 Thread Junio C Hamano
"A. Wilcox"  writes:

> While musl's reading is correct from an English grammar point of view,
> it does not seem to be how any other implementation has read the standard.
>
> However!  It gets better.
>
> The ISO C standard, committee draft version April 12, 2011, states[4]:
>
>> cMatches a sequence of characters of exactly the number specified
>> by the field width (1 if no field width is present in the directive).
> ...
> Since Git is specifically attempting to read in a host name, there may
> be a solution: while 'c' guarantees that any byte will be read, and 's'
> will skip whitespace, RFCs 952 and 1123 §2.1[5] specify that a network
> host name must never contain whitespace.  IDNA2008 §2.3.2.1[6] (and
> IDNA2003 before it) specifically removes ASCII whitespace characters
> from the valid set of Unicode codepoints for an IDNA host name[7].
> Additionally, the buffer `locking_host` is already defined as an array
> of char of size HOST_NAME_MAX + 1, and the width specifier in fscanf is
> specified as HOST_NAME_MAX.  Therefore, it should be safe to change git
> to use the 's' type character.  Additionally, I can confirm that this
> change (patch attached) allows the Git test suite to pass on musl.

I did a quick scan for substring "scanf" and read through the
output, and it seems that this is the only one that wants to do the
this many characters, e.g. "%42c", conversion.

I am a bit worried about the correctness of your conclusion, though.

As long as we are reading from the file written by us, because the
string we write as the hostname part comes from what we prepare in
my_host[HOST_NAME_MAX+1] using xgethostname(), we may know it would
fit in locking_host[HOST_NAME_MAX+1].  But because HOST_NAME_MAX on
my platform may be shorter than what your platform uses, I'll run
over the end of my buffer if I am reading the lockfile you write to
notice that the repository is in use from your host.  After all, the
reason why we write hostname in the file is because we expect the
filesystem is shared across different hosts, so relying on HOST_NAME_MAX
to be the same across platforms would not be a good way to go.

So it seems to me that a real fix has to read the file ourselves and
parse up to our HOST_NAME_MAX+1 to see if the hostname refers to us,
and fscanf that cannot take "slurp up to this many bytes" is not
useful tool to implementing that parsing.

The current scan_fmt variable comes from da25bdb7 ("use
HOST_NAME_MAX to size buffers for gethostname(2)", 2017-04-18), and
before that, we used to use "%"SCNuMAX" %127c", which was already
problematic.  The "%127c" part came from the very original of this
codepath in 64a99eb4 ("gc: reject if another gc is running, unless
--force is given", 2013-08-08), whose first appearance in released
versions was in v1.8.5, it seems.  IOW, nobody tried to run Git with
musl C in the past 4 years and you are the first one to notice?

Thanks.


Re: [musl] Re: Git 2.14.1: t6500: error during test on musl libc

2017-09-16 Thread Rich Felker
On Fri, Sep 15, 2017 at 11:58:41PM -0500, A. Wilcox wrote:
> On 15/09/17 06:30, Jeff King wrote:
> > On Fri, Sep 15, 2017 at 08:37:40AM +0200, Kevin Daudt wrote:
> > 
> >> On Thu, Sep 14, 2017 at 09:43:12PM -0500, A. Wilcox wrote:
> >>> -BEGIN PGP SIGNED MESSAGE-
> >>> Hash: SHA256
> >>>
> >>> Hi there,
> >>>
> >>> While bumping Git's version for our Linux distribution to 2.14.1, I've
> >>> run in to a new test failure in t6500-gc.sh.  This is the output of
> >>> the failing test with debug=t verbose=t:
> >>
> >> This is a new test introduced by c45af94db 
> >> (gc: run pre-detach operations under lock, 2017-07-11) which was
> >> included in v2.14.0.
> >>
> >> So it might be that this was already a problem for a longer time, only
> >> just recently uncovered.
> > 
> > The code change there is not all that big. Mostly we're just checking
> > that the lock is actually respected. The lock code doesn't exercise libc
> > all that much. It does use fscanf, which I guess is a little exotic for
> > us. It's also possible that hostname() doesn't behave quite as we
> > expect.
> > 
> > If you instrument gc like the patch below, what does it report when you
> > run:
> > 
> >   GIT_TRACE=1 ./t6500-gc.sh --verbose-only=8
> > 
> > I get:
> > 
> >   [...]
> >   trace: built-in: git 'gc' '--auto'
> >   Auto packing the repository in background for optimum performance.
> >   See "git help gc" for manual housekeeping.
> >   debug: gc lock already held by $my_hostname
> >   [...]
> > 
> > If you get "acquired gc lock", then the problem is in
> > lock_repo_for_gc(), and I'd suspect some problem with fscanf or
> > hostname.
> > 
> > -Peff
> 
> 
> Hey there Peff,
> 
> What a corner-y corner case we have here.  I believe the actual error is
> in the POSIX standard itself[1], as it is not clear what happens when
> there are not enough characters to 'fill' the width specified with %c in
> fscanf:

ISO C specifies very clearly what happens, in 7.21.6.2 The fscanf
function, paragraph 12: 

c
Matches a sequence of characters of exactly the number
specified by the field width...

Note the word "exactly". Thus a read of fewer characters is not a
match.

There is an open glibc bug for this with classic Drepper behavior
until his departure, followed by acknowledgement of the bug, but no
further action I'm aware of:

https://sourceware.org/bugzilla/show_bug.cgi?id=12701

Any applications depending on the buggy glibc behavior should be
fixed.

Rich


Re: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension

2017-09-16 Thread Torsten Bögershausen
On 2017-09-15 21:20, Ben Peart wrote:
> +if [ "$1" != 1 ]
> +then
> + echo -e "Unsupported core.fsmonitor hook version.\n" >&2
> + exit 1
> +fi

The echo -e not portable
(It was detected by a tighter version of the lint script,
 which I have here, but not yet send to the list :-(

This will do:
echo  "Unsupported core.fsmonitor hook version." >&2


GREETINGS FROM DUBAI

2017-09-16 Thread Mr.Austin Lizin
Hello,

Am Mr.Austin Lizin Managing Director of Global Trust Bank Dubai U.A.E.
I write you this proposal in good faith hoping that I will rely on you
in a business transaction that require absolute confidentiality and of
great interest and benefit to our both families.

In 2007,one Mr.Husson whose surname is same as yours and has  your
country in his file as his place of origin, made a fixed deposit for
36 months, valued at $26,700,000.00 with my bank. I was his account
officer before I rose to the position of Managing Director.

The maturity date for this deposit contract was 18th of June 2010.
Sadly Husson was among the death victims in the September 2009
earthquake in Indonesia that left over 1,200 people dead while he was
there on business trip. Since the last quarter of 2010 until today,
the management of my bank have been finding means to reach him so as
certain if he will roll over the Deposit or have the contract sum
withdrawn.

When I discovered that this will happen, I have tried to think up a
procedure to preserve this fund and use them proceeds for business.
Some directors here have been trying to find out from me the
information about this account and the owner, but I have kept it
closed because, I know that if they become aware that Mr. Husson is
late, they will corner the funds for themselves.

Therefore, am seeking your co-operation to present you as the one to
benefit from his fund at his death since you have the same name, so
that my bank headquarters will pay the funds to you. I have done
enough inside bank arrangement and you only have to put in your
details into the information network in the bank computers and reflect
you as his next of kin. If you concur with this proposal, I intend for
you to retain 50% of the funds while 50% shall be for me.Email {
mr.austinliz...@gmail.com } Kindly forward your response to my Email:


Regards,
Mr.Austin Lizin
Managing Director,Global Trust Bank Dubai U.A.E
(Wealth and investment Director)


Re: Commit dropped when swapping commits with rebase -i -p

2017-09-16 Thread Sebastian Schuberth
On Sat, Sep 16, 2017 at 12:41 PM, Andreas Heiduk  wrote:

> Therefore I would avoid "definitive wording" like "will drop" and use
> vague wording along "there are various dragons out there" like this:
>
> The todo list presented by `--preserve-merges --interactive` does
> not represent the topology of the revision graph.  Editing

I tried to avoid this introducing sentence from the original wording
as it reads like from a scientific research paper instead of from a
user's manual.

> commits and rewording their commit messages should work fine.
> But reordering, combining or dropping commits of a complex topology

There is no need for complex topology. If you reorder the two most
recent commits in a linear history, one gets dropped.

> can produce unexpected and useless results like missing commits,
> wrong merges, merges combining two unrelated histories and
> similar things.

"can produce" is much too soft, IMO. Reordering commits goes wrong,
period. Like wise "unexpected and useless results" is inappropriate.
The results are wrong in case of reordering, and wrong results are of
course unexpected and useless.

-- 
Sebastian Schuberth


Re: Are the 'How to' documents present as man pages?

2017-09-16 Thread Simon Ruderich
On Sat, Sep 16, 2017 at 05:17:35PM +0530, Kaartic Sivaraam wrote:
>> References to other man pages generally use round brackets, for
>> example git-merge(1).
>
> I didn't know they had different meanings for different brackets in man
> pages. [snip]

Man pages in general use only round brackets to refer to another
man page with the given section (like stat(1) or stat(2)).

Square brackets have no special meaning, but are useful for
references like URLs.

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9


Re: Are the 'How to' documents present as man pages?

2017-09-16 Thread Kaartic Sivaraam
On Sat, 2017-09-16 at 10:54 +0200, Simon Ruderich wrote:
> On Sat, Sep 16, 2017 at 10:30:43AM +0530, Kaartic Sivaraam wrote:
> > I was reading the 'git revert' documentation and found the following
> > line in it,
> > 
> > -m parent-number
> > --mainline parent-number
> > 
> > ...
> > 
> > See the revert-a-faulty-merge How-To[1] for more details.
> 
> Square brackets indicate links, you should have a NOTES section
> at the bottom of the man page which contains something like this:
> 

Yes there is, thanks!

> References to other man pages generally use round brackets, for
> example git-merge(1).
> 

I didn't know they had different meanings for different brackets in man
pages. The asciidoc source itself uses square-brackets to refer to man
page sections. May the link to notes should be something like [note 1].
Not sure if it's possible to do that.

---
Kaartic


Re: [PATCH] for_each_string_list_item(): behave correctly for empty list

2017-09-16 Thread SZEDER Gábor

> >> It would be a pain to have to change the signature of this macro, and
> >> we'd prefer not to add overhead to each iteration of the loop. So
> >> instead, whenever `list->items` is NULL, initialize `item` to point at
> >> a dummy `string_list_item` created for the purpose.
> > 
> > What signature change do you mean?  I don't understand what this
> > paragraph is alluding to.
> 
> I was thinking that one solution would be for the caller to provide a
> `size_t` variable for the macro's use as a counter (since I don't see a
> way for the macro to declare its own counter). The options are pretty
> limited because whatever the macro expands to has to play the same
> syntactic role as `for (...; ...; ...)`.

Another option to consider is to squeeze in an if-else before the for
loop header to handle the empty list case like this:

diff --git a/string-list.h b/string-list.h
index 29bfb7ae4..9eed47de0 100644
--- a/string-list.h
+++ b/string-list.h
@@ -32,8 +32,11 @@ void string_list_clear_func(struct string_list *list, 
string_list_clear_func_t c
 typedef int (*string_list_each_func_t)(struct string_list_item *, void *);
 int for_each_string_list(struct string_list *list,
 string_list_each_func_t, void *cb_data);
-#define for_each_string_list_item(item,list) \
-   for (item = (list)->items; item < (list)->items + (list)->nr; ++item)
+#define for_each_string_list_item(item,list)   \
+   if ((list)->items == NULL) {\
+   /* empty list, do nothing */\
+   } else  \
+   for (item = (list)->items; item < (list)->items + (list)->nr; 
++item)
 
 /*
  * Apply want to each item in list, retaining only the ones for which

This way there would be neither additional overhead in each iteration
nor a new global.

Alas, there is a catch.  We can't use curly braces in the macro's else
branch, because the macro would contain only the opening brace but not
the closing one, which must come after the end of the loop's body.
This means that the modified macro couldn't be used in if-else
branches which themselves don't have curly braces, because it causes
ambiguity:

  if (condition)
  for_each_string_list_item(item, list)
  a_simple_oneliner(item);

Our coding guidelines encourage this style for one-liner loop bodies,
and there is indeed one such place in our codebase, so the following
hunk is needed as well:

diff --git a/send-pack.c b/send-pack.c
index 11d6f3d98..00fa1622f 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -295,9 +295,10 @@ static int generate_push_cert(struct strbuf *req_buf,
}
if (push_cert_nonce[0])
strbuf_addf(, "nonce %s\n", push_cert_nonce);
-   if (args->push_options)
+   if (args->push_options) {
for_each_string_list_item(item, args->push_options)
strbuf_addf(, "push-option %s\n", item->string);
+   }
strbuf_addstr(, "\n");
 
for (ref = remote_refs; ref; ref = ref->next) {


Luckily, reasonably modern compilers warn about such ambiguity, so
perhaps this is an acceptable compromise?


> > [...]
> > Does the following alternate fix work?  I think I prefer it because
> > it doesn't require introducing a new global. [...]
> >  #define for_each_string_list_item(item,list) \
> > -   for (item = (list)->items; item < (list)->items + (list)->nr; ++item)
> > +   for (item = (list)->items; \
> > +(list)->items && item < (list)->items + (list)->nr; \
> > +++item)
> 
> This is the possibility that I was referring to as "add[ing] overhead to
> each iteration of the loop". I'd rather not add an extra test-and-branch
> to every iteration of a loop in which `list->items` is *not* NULL, which
> your solution appears to do. Or are compilers routinely able to optimize
> the check out?
> 
> The new global might be aesthetically unpleasant, but it only costs two
> words of memory, so I don't see it as a big disadvantage.
> 
> Another, more invasive change would be to initialize
> `string_list::items` to *always* point at `dummy_string_list_item`,
> rather similar to how `strbuf_slopbuf` is pointed at by empty `strbuf`s.
> But I really don't think the effort would be justified.



Re: Commit dropped when swapping commits with rebase -i -p

2017-09-16 Thread Andreas Heiduk
Am 15.09.2017 um 22:52 schrieb Junio C Hamano:
> Sebastian Schuberth  writes:
>>
>> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
>> index 6805a74aec..ccd0a04d54 100644
>> --- a/Documentation/git-rebase.txt
>> +++ b/Documentation/git-rebase.txt
>> @@ -782,10 +782,11 @@ case" recovery too!
>>  
>>  BUGS
>>  
>> -The todo list presented by `--preserve-merges --interactive` does not
>> -represent the topology of the revision graph.  Editing commits and
>> -rewording their commit messages should work fine, but attempts to
>> -reorder commits tend to produce counterintuitive results.
>> +Be careful when combining the `-i` / `--interactive` and `-p` /

"Be careful" is not necessary because the text is already in the "BUGS"
section.

>> +`--preserve-merges` options.  Reordering commits will drop commits from the
>> +main line. This is because the todo list does not represent the topology of 
>> the
>> +revision graph in this case.  However, editing commits and rewording their
>> +commit messages 'should' work fine.
>>  
>>  For example, an attempt to rearrange
>>  
> 
> 
> Anybody?  I personally feel that the updated text is not all that
> stronger but it is clearer by clarifying what "counterintuitive
> results" actually mean, but I am not the target audience this
> paragraph is trying to help, nor I am the one who is making excuse
> for a known bug, so...
> 

For me the proposed wording implies that the only bad effect are dropped
commits on the mainline. But I experienced something like this:


O--O--O--O---M--O==>   O--O--O--O---M--O
 \  /   \  /
  O--X--O--O O--X O


Where X was a commit without a ref and hence lost. Also the merge commit
seemed to combine two unrelated histories.

Therefore I would avoid "definitive wording" like "will drop" and use
vague wording along "there are various dragons out there" like this:

The todo list presented by `--preserve-merges --interactive` does
not represent the topology of the revision graph.  Editing
commits and rewording their commit messages should work fine.
But reordering, combining or dropping commits of a complex topology
can produce unexpected and useless results like missing commits,
wrong merges, merges combining two unrelated histories and
similar things.



Dear Talented

2017-09-16 Thread Blue Sky Studio
Dear Talented,

I am Talent Scout For BLUE SKY FILM STUDIO, Present Blue Sky Studio a Film 
Corporation Located in the United State, is Soliciting for the Right to use
Your Photo/Face and Personality as One of the Semi -Major Role/ Character in 
our Upcoming ANIMATED Stereoscope 3D Movie-The Story of Anubis (Anubis 2018)
The Movie is Currently Filming (In Production) Please Note That There Will Be 
No Auditions, Traveling or Any Special / Professional Acting Skills,
Since the Production of This Movie Will Be Done with our State of Art Computer 
-Generating Imagery Equipment. We Are Prepared to Pay the Total Sum
of $620,000.00 USD.For More Information/Understanding.

Talent Scout
Kim Sharma


Re: Are the 'How to' documents present as man pages?

2017-09-16 Thread Simon Ruderich
On Sat, Sep 16, 2017 at 10:30:43AM +0530, Kaartic Sivaraam wrote:
> I was reading the 'git revert' documentation and found the following
> line in it,
>
> -m parent-number
> --mainline parent-number
>
> ...
>
> See the revert-a-faulty-merge How-To[1] for more details.

Square brackets indicate links, you should have a NOTES section
at the bottom of the man page which contains something like this:

1. revert-a-faulty-merge How-To
   file:///usr/share/doc/git/html/howto/revert-a-faulty-merge.html

To my knowledge those are not available as man pages.

> It says that the 'How-To' is present in the first section of the man
> page. I tried to access it to get this,

References to other man pages generally use round brackets, for
example git-merge(1).

I checked the git-scm.com website [1] and interestingly they use
square brackets for these references which confused me a little.
Not sure if it's worth changing though.

Regards
Simon

[1]: https://git-scm.com/docs/git-revert
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9


[PATCH v6 02/40] t0021/rot13-filter: refactor packet reading functions

2017-09-16 Thread Christian Couder
To make it possible in a following commit to move packet
reading and writing functions into a Packet.pm module,
let's refactor these functions, so they don't handle
printing debug output and exiting.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index ad685d92f8..e4495a52f3 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -60,8 +60,7 @@ sub packet_bin_read {
my $bytes_read = read STDIN, $buffer, 4;
if ( $bytes_read == 0 ) {
# EOF - Git stopped talking to us!
-   print $debug "STOP\n";
-   exit();
+   return ( -1, "" );
}
elsif ( $bytes_read != 4 ) {
die "invalid packet: '$buffer'";
@@ -85,7 +84,7 @@ sub packet_bin_read {
 
 sub packet_txt_read {
my ( $res, $buf ) = packet_bin_read();
-   unless ( $buf eq '' or $buf =~ s/\n$// ) {
+   unless ( $res == -1 or $buf eq '' or $buf =~ s/\n$// ) {
die "A non-binary line MUST be terminated by an LF.";
}
return ( $res, $buf );
@@ -131,7 +130,12 @@ print $debug "init handshake complete\n";
 $debug->flush();
 
 while (1) {
-   my ( $command ) = packet_txt_read() =~ /^command=(.+)$/;
+   my ( $res, $command ) = packet_txt_read();
+   if ( $res == -1 ) {
+   print $debug "STOP\n";
+   exit();
+   }
+   $command =~ s/^command=//;
print $debug "IN: $command";
$debug->flush();
 
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 03/40] t0021/rot13-filter: improve 'if .. elsif .. else' style

2017-09-16 Thread Christian Couder
Before further refactoring the "t0021/rot13-filter.pl" script,
let's modernize the style of its 'if .. elsif .. else' clauses
to improve its readability by making it more similar to our
other perl scripts.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 39 +--
 1 file changed, 13 insertions(+), 26 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index e4495a52f3..82882392ae 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -61,23 +61,20 @@ sub packet_bin_read {
if ( $bytes_read == 0 ) {
# EOF - Git stopped talking to us!
return ( -1, "" );
-   }
-   elsif ( $bytes_read != 4 ) {
+   } elsif ( $bytes_read != 4 ) {
die "invalid packet: '$buffer'";
}
my $pkt_size = hex($buffer);
if ( $pkt_size == 0 ) {
return ( 1, "" );
-   }
-   elsif ( $pkt_size > 4 ) {
+   } elsif ( $pkt_size > 4 ) {
my $content_size = $pkt_size - 4;
$bytes_read = read STDIN, $buffer, $content_size;
if ( $bytes_read != $content_size ) {
die "invalid packet ($content_size bytes expected; 
$bytes_read bytes read)";
}
return ( 0, $buffer );
-   }
-   else {
+   } else {
die "invalid packet size: $pkt_size";
}
 }
@@ -165,8 +162,7 @@ while (1) {
$debug->flush();
packet_txt_write("status=success");
packet_flush();
-   }
-   else {
+   } else {
my ( $pathname ) = packet_txt_read() =~ /^pathname=(.+)$/;
print $debug " $pathname";
$debug->flush();
@@ -205,17 +201,13 @@ while (1) {
my $output;
if ( exists $DELAY{$pathname} and exists 
$DELAY{$pathname}{"output"} ) {
$output = $DELAY{$pathname}{"output"}
-   }
-   elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
+   } elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
$output = "";
-   }
-   elsif ( $command eq "clean" and grep( /^clean$/, @capabilities 
) ) {
+   } elsif ( $command eq "clean" and grep( /^clean$/, 
@capabilities ) ) {
$output = rot13($input);
-   }
-   elsif ( $command eq "smudge" and grep( /^smudge$/, 
@capabilities ) ) {
+   } elsif ( $command eq "smudge" and grep( /^smudge$/, 
@capabilities ) ) {
$output = rot13($input);
-   }
-   else {
+   } else {
die "bad command '$command'";
}
 
@@ -224,25 +216,21 @@ while (1) {
$debug->flush();
packet_txt_write("status=error");
packet_flush();
-   }
-   elsif ( $pathname eq "abort.r" ) {
+   } elsif ( $pathname eq "abort.r" ) {
print $debug "[ABORT]\n";
$debug->flush();
packet_txt_write("status=abort");
packet_flush();
-   }
-   elsif ( $command eq "smudge" and
+   } elsif ( $command eq "smudge" and
exists $DELAY{$pathname} and
-   $DELAY{$pathname}{"requested"} == 1
-   ) {
+   $DELAY{$pathname}{"requested"} == 1 ) {
print $debug "[DELAYED]\n";
$debug->flush();
packet_txt_write("status=delayed");
packet_flush();
$DELAY{$pathname}{"requested"} = 2;
$DELAY{$pathname}{"output"} = $output;
-   }
-   else {
+   } else {
packet_txt_write("status=success");
packet_flush();
 
@@ -262,8 +250,7 @@ while (1) {
print $debug ".";
if ( length($output) > $MAX_PACKET_CONTENT_SIZE 
) {
$output = substr( $output, 
$MAX_PACKET_CONTENT_SIZE );
-   }
-   else {
+   } else {
$output = "";
}
}
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 15/40] Add GIT_NO_EXTERNAL_ODB env variable

2017-09-16 Thread Christian Couder
This new environment variable will be used to perform git
commands without involving any external odb mechanism.

This makes it possible for example to create new blobs that
will not be sent to an external odb even if the external odb
supports "put_*" instructions.

Signed-off-by: Christian Couder 
---
 cache.h| 9 +
 environment.c  | 4 
 external-odb.c | 6 ++
 sha1_file.c| 3 +++
 4 files changed, 22 insertions(+)

diff --git a/cache.h b/cache.h
index 6c22bd0525..a2bd2090b3 100644
--- a/cache.h
+++ b/cache.h
@@ -429,6 +429,7 @@ static inline enum object_type object_type(unsigned int 
mode)
 #define CEILING_DIRECTORIES_ENVIRONMENT "GIT_CEILING_DIRECTORIES"
 #define NO_REPLACE_OBJECTS_ENVIRONMENT "GIT_NO_REPLACE_OBJECTS"
 #define GIT_REPLACE_REF_BASE_ENVIRONMENT "GIT_REPLACE_REF_BASE"
+#define NO_EXTERNAL_ODB_ENVIRONMENT "GIT_NO_EXTERNAL_ODB"
 #define GITATTRIBUTES_FILE ".gitattributes"
 #define INFOATTRIBUTES_FILE "info/attributes"
 #define ATTRIBUTE_MACRO_PREFIX "[attr]"
@@ -767,6 +768,14 @@ void reset_shared_repository(void);
 extern int check_replace_refs;
 extern char *git_replace_ref_base;
 
+/*
+ * Do external odbs need to be used this run?  This variable is
+ * initialized to true unless $GIT_NO_EXTERNAL_ODB is set, but it
+ * maybe set to false by some commands that do not want external
+ * odbs to be active.
+ */
+extern int use_external_odb;
+
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
diff --git a/environment.c b/environment.c
index 3fd4b10845..bbccabef6b 100644
--- a/environment.c
+++ b/environment.c
@@ -48,6 +48,7 @@ const char *excludes_file;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
 int check_replace_refs = 1;
 char *git_replace_ref_base;
+int use_external_odb = 1;
 enum eol core_eol = EOL_UNSET;
 enum safe_crlf safe_crlf = SAFE_CRLF_WARN;
 unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;
@@ -116,6 +117,7 @@ const char * const local_repo_env[] = {
INDEX_ENVIRONMENT,
NO_REPLACE_OBJECTS_ENVIRONMENT,
GIT_REPLACE_REF_BASE_ENVIRONMENT,
+   NO_EXTERNAL_ODB_ENVIRONMENT,
GIT_PREFIX_ENVIRONMENT,
GIT_SUPER_PREFIX_ENVIRONMENT,
GIT_SHALLOW_FILE_ENVIRONMENT,
@@ -154,6 +156,8 @@ void setup_git_env(void)
replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT);
git_replace_ref_base = xstrdup(replace_ref_base ? replace_ref_base
  : "refs/replace/");
+   if (getenv(NO_EXTERNAL_ODB_ENVIRONMENT))
+   use_external_odb = 0;
namespace = expand_namespace(getenv(GIT_NAMESPACE_ENVIRONMENT));
shallow_file = getenv(GIT_SHALLOW_FILE_ENVIRONMENT);
if (shallow_file)
diff --git a/external-odb.c b/external-odb.c
index a4f8c72e1c..52cb448d01 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -65,6 +65,9 @@ int external_odb_has_object(const unsigned char *sha1)
 {
struct odb_helper *o;
 
+   if (!use_external_odb)
+   return 0;
+
external_odb_init();
 
for (o = helpers; o; o = o->next) {
@@ -124,6 +127,9 @@ int external_odb_put_object(const void *buf, size_t len,
 {
struct odb_helper *o;
 
+   if (!use_external_odb)
+   return 1;
+
/* For now accept only blobs */
if (strcmp(type, "blob"))
return 1;
diff --git a/sha1_file.c b/sha1_file.c
index d0155e392f..7b2a0f64fa 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -619,6 +619,9 @@ void prepare_external_alt_odb(void)
static int linked_external;
const char *path;
 
+   if (!use_external_odb)
+   return;
+
if (linked_external)
return;
 
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 06/40] t0021/rot13-filter: add capability functions

2017-09-16 Thread Christian Couder
Add functions to help read and write capabilities.
These functions will be reused in following patches.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 40 
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 278fc6f534..ba18b207c6 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -116,20 +116,44 @@ sub packet_initialize {
packet_flush();
 }
 
+sub packet_read_capabilities {
+   my @cap;
+   while (1) {
+   my ( $res, $buf ) = packet_bin_read();
+   return ( $res, @cap ) if ( $res != 0 );
+   unless ( $buf =~ s/\n$// ) {
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
+   }
+   die "bad capability buf: '$buf'" unless ( $buf =~ 
s/capability=// );
+   push @cap, $buf;
+   }
+}
+
+sub packet_read_and_check_capabilities {
+   my @local_caps = @_;
+   my @remote_res_caps = packet_read_capabilities();
+   my $res = shift @remote_res_caps;
+   my %remote_caps = map { $_ => 1 } @remote_res_caps;
+   foreach (@local_caps) {
+   die "'$_' capability not available" unless 
(exists($remote_caps{$_}));
+   }
+   return $res;
+}
+
+sub packet_write_capabilities {
+   packet_txt_write( "capability=" . $_ ) foreach (@_);
+   packet_flush();
+}
+
 print $debug "START\n";
 $debug->flush();
 
 packet_initialize("git-filter", 2);
 
-( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
-( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
-( packet_txt_read() eq ( 0, "capability=delay" ) )  || die "bad capability";
-( packet_bin_read() eq ( 1, "" ) )  || die "bad capability 
end";
+packet_read_and_check_capabilities("clean", "smudge", "delay");
+packet_write_capabilities(@capabilities);
 
-foreach (@capabilities) {
-   packet_txt_write( "capability=" . $_ );
-}
-packet_flush();
 print $debug "init handshake complete\n";
 $debug->flush();
 
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 24/40] external-odb: add 'get_direct' support

2017-09-16 Thread Christian Couder
This implements the 'get_direct' capability/instruction that makes
it possible for external odb helper scripts to pass blobs to Git
by directly writing them as loose objects files.

It is better to call this a "direct" mode rather than a "fault-in"
mode as we could have the same kind of mechanism to "put" objects
into an external odb, where the odb helper would access blobs it
wants to send to an external odb directly from files, but it
would be strange to call that a fault-in mode too.

Signed-off-by: Christian Couder 
---
 external-odb.c | 21 -
 external-odb.h |  1 +
 odb-helper.c   | 27 +--
 odb-helper.h   |  2 ++
 4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 52cb448d01..31d21bfe04 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -96,7 +96,8 @@ int external_odb_get_object(const unsigned char *sha1)
int ret;
int fd;
 
-   if (!odb_helper_has_object(o, sha1))
+   if (!(o->supported_capabilities & ODB_HELPER_CAP_GET_RAW_OBJ) &&
+   !(o->supported_capabilities & ODB_HELPER_CAP_GET_GIT_OBJ))
continue;
 
fd = create_object_tmpfile(, path);
@@ -122,6 +123,24 @@ int external_odb_get_object(const unsigned char *sha1)
return -1;
 }
 
+int external_odb_get_direct(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   if (!external_odb_has_object(sha1))
+   return -1;
+
+   for (o = helpers; o; o = o->next) {
+   if (!(o->supported_capabilities & ODB_HELPER_CAP_GET_DIRECT))
+   continue;
+   if (odb_helper_get_direct(o, sha1) < 0)
+   continue;
+   return 0;
+   }
+
+   return -1;
+}
+
 int external_odb_put_object(const void *buf, size_t len,
const char *type, unsigned char *sha1)
 {
diff --git a/external-odb.h b/external-odb.h
index d369dfdf6f..1fda08c0fb 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -4,6 +4,7 @@
 extern const char *external_odb_root(void);
 extern int external_odb_has_object(const unsigned char *sha1);
 extern int external_odb_get_object(const unsigned char *sha1);
+extern int external_odb_get_direct(const unsigned char *sha1);
 extern int external_odb_put_object(const void *buf, size_t len,
   const char *type, unsigned char *sha1);
 
diff --git a/odb-helper.c b/odb-helper.c
index 1f4666b349..3d940a3171 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -425,14 +425,37 @@ static int odb_helper_get_git_object(struct odb_helper *o,
return 0;
 }
 
+int odb_helper_get_direct(struct odb_helper *o,
+ const unsigned char *sha1)
+{
+   struct odb_helper_object *obj;
+   struct odb_helper_cmd cmd;
+
+   obj = odb_helper_lookup(o, sha1);
+   if (!obj)
+   return -1;
+
+   if (odb_helper_start(o, , 0, "get_direct %s", sha1_to_hex(sha1)) < 
0)
+   return -1;
+
+   if (odb_helper_finish(o, ))
+   return -1;
+
+   return 0;
+}
+
 int odb_helper_get_object(struct odb_helper *o,
  const unsigned char *sha1,
  int fd)
 {
+   if (o->supported_capabilities & ODB_HELPER_CAP_GET_GIT_OBJ)
+   return odb_helper_get_git_object(o, sha1, fd);
if (o->supported_capabilities & ODB_HELPER_CAP_GET_RAW_OBJ)
return odb_helper_get_raw_object(o, sha1, fd);
-   else
-   return odb_helper_get_git_object(o, sha1, fd);
+   if (o->supported_capabilities & ODB_HELPER_CAP_GET_DIRECT)
+   return 0;
+
+   BUG("invalid get capability (capabilities: '%d')", 
o->supported_capabilities);
 }
 
 int odb_helper_put_object(struct odb_helper *o,
diff --git a/odb-helper.h b/odb-helper.h
index 0571ba09cb..fbb6d333ee 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -35,6 +35,8 @@ extern int odb_helper_has_object(struct odb_helper *o,
 extern int odb_helper_get_object(struct odb_helper *o,
 const unsigned char *sha1,
 int fd);
+extern int odb_helper_get_direct(struct odb_helper *o,
+const unsigned char *sha1);
 extern int odb_helper_put_object(struct odb_helper *o,
 const void *buf, size_t len,
 const char *type, unsigned char *sha1);
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 12/40] external odb: add 'put_raw_obj' support

2017-09-16 Thread Christian Couder
Add support for a 'put_raw_obj' capability/instruction to send new
objects to an external odb. Objects will be sent as they are (in
their 'raw' format). They will not be converted to Git objects.

For now any new Git object (blob, tree, commit, ...) would be sent
if 'put_raw_obj' is supported by an odb helper. This is not a great
default, but let's leave it to following commits to tweak that.

Signed-off-by: Christian Couder 
---
 external-odb.c | 15 +++
 external-odb.h |  2 ++
 odb-helper.c   | 43 ++-
 odb-helper.h   |  3 +++
 sha1_file.c|  2 ++
 5 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 0f0de170b8..82fac702e8 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -118,3 +118,18 @@ int external_odb_get_object(const unsigned char *sha1)
 
return -1;
 }
+
+int external_odb_put_object(const void *buf, size_t len,
+   const char *type, unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   external_odb_init();
+
+   for (o = helpers; o; o = o->next) {
+   int r = odb_helper_put_object(o, buf, len, type, sha1);
+   if (r <= 0)
+   return r;
+   }
+   return 1;
+}
diff --git a/external-odb.h b/external-odb.h
index dc5635f452..d369dfdf6f 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -4,5 +4,7 @@
 extern const char *external_odb_root(void);
 extern int external_odb_has_object(const unsigned char *sha1);
 extern int external_odb_get_object(const unsigned char *sha1);
+extern int external_odb_put_object(const void *buf, size_t len,
+  const char *type, unsigned char *sha1);
 
 #endif /* EXTERNAL_ODB_H */
diff --git a/odb-helper.c b/odb-helper.c
index 9375eca58f..39d20fdfd7 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -71,9 +71,10 @@ static void prepare_helper_command(struct argv_array *argv, 
const char *cmd,
strbuf_release();
 }
 
-__attribute__((format (printf,3,4)))
+__attribute__((format (printf,4,5)))
 static int odb_helper_start(struct odb_helper *o,
struct odb_helper_cmd *cmd,
+   int use_stdin,
const char *fmt, ...)
 {
va_list ap;
@@ -90,7 +91,10 @@ static int odb_helper_start(struct odb_helper *o,
 
cmd->child.argv = cmd->argv.argv;
cmd->child.use_shell = 1;
-   cmd->child.no_stdin = 1;
+   if (use_stdin)
+   cmd->child.in = -1;
+   else
+   cmd->child.no_stdin = 1;
cmd->child.out = -1;
 
if (start_command(>child) < 0) {
@@ -119,7 +123,7 @@ int odb_helper_init(struct odb_helper *o)
FILE *fh;
struct strbuf line = STRBUF_INIT;
 
-   if (odb_helper_start(o, , "init") < 0)
+   if (odb_helper_start(o, , 0, "init") < 0)
return -1;
 
fh = xfdopen(cmd.child.out, "r");
@@ -179,7 +183,7 @@ static void odb_helper_load_have(struct odb_helper *o)
return;
o->have_valid = 1;
 
-   if (odb_helper_start(o, , "have") < 0)
+   if (odb_helper_start(o, , 0, "have") < 0)
return;
 
fh = xfdopen(cmd.child.out, "r");
@@ -234,7 +238,7 @@ int odb_helper_get_object(struct odb_helper *o, const 
unsigned char *sha1,
if (!obj)
return -1;
 
-   if (odb_helper_start(o, , "get_git_obj %s", sha1_to_hex(sha1)) < 0)
+   if (odb_helper_start(o, , 0, "get_git_obj %s", sha1_to_hex(sha1)) < 
0)
return -1;
 
memset(, 0, sizeof(stream));
@@ -321,3 +325,32 @@ int odb_helper_get_object(struct odb_helper *o, const 
unsigned char *sha1,
 
return 0;
 }
+
+int odb_helper_put_object(struct odb_helper *o,
+ const void *buf, size_t len,
+ const char *type, unsigned char *sha1)
+{
+   struct odb_helper_cmd cmd;
+
+   if (odb_helper_start(o, , 1, "put_raw_obj %s %"PRIuMAX" %s",
+sha1_to_hex(sha1), (uintmax_t)len, type) < 0)
+   return -1;
+
+   do {
+   int w = xwrite(cmd.child.in, buf, len);
+   if (w < 0) {
+   error("unable to write to odb helper '%s': %s",
+ o->name, strerror(errno));
+   close(cmd.child.in);
+   close(cmd.child.out);
+   odb_helper_finish(o, );
+   return -1;
+   }
+   len -= w;
+   } while (len > 0);
+
+   close(cmd.child.in);
+   close(cmd.child.out);
+   odb_helper_finish(o, );
+   return 0;
+}
diff --git a/odb-helper.h b/odb-helper.h
index 5f28a6e512..0571ba09cb 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -35,5 +35,8 @@ extern int odb_helper_has_object(struct odb_helper *o,
 extern int odb_helper_get_object(struct odb_helper *o,
  

[PATCH v6 08/40] sha1_file: prepare for external odbs

2017-09-16 Thread Christian Couder
In the following commits we will need some functions that were
internal to sha1_file.c, so let's first make them non static
and declare them in "cache.h". While at it, let's rename
'create_tmpfile()' to 'create_object_tmpfile()' to make its
name less generic.

Let's also split out 'sha1_file_name_alt()' from
'sha1_file_name()' and 'open_sha1_file_alt()' from
'open_sha1_file()', as we will need both of these new
functions too.

Helped-by: Jeff King 
Signed-off-by: Christian Couder 
---
 cache.h |  8 
 sha1_file.c | 47 +--
 2 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/cache.h b/cache.h
index a916bc79e3..00d89568f3 100644
--- a/cache.h
+++ b/cache.h
@@ -902,6 +902,12 @@ extern void check_repository_format(void);
  */
 extern const char *sha1_file_name(const unsigned char *sha1);
 
+/*
+ * Like sha1_file_name, but return the filename within a specific alternate
+ * object directory. Shares the same static buffer with sha1_file_name.
+ */
+extern const char *sha1_file_name_alt(const char *objdir, const unsigned char 
*sha1);
+
 /*
  * Return an abbreviated sha1 unique within this repository's object database.
  * The result will be at least `len` characters long, and will be NUL
@@ -1189,6 +1195,8 @@ extern int parse_sha1_header(const char *hdr, unsigned 
long *sizep);
 
 extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned 
long size, const char *type);
 
+extern int create_object_tmpfile(struct strbuf *tmp, const char *filename);
+extern void close_sha1_file(int fd);
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
 /*
diff --git a/sha1_file.c b/sha1_file.c
index 5f71bbac3e..bea1ae6afb 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -251,17 +251,22 @@ static void fill_sha1_path(struct strbuf *buf, const 
unsigned char *sha1)
}
 }
 
-const char *sha1_file_name(const unsigned char *sha1)
+const char *sha1_file_name_alt(const char *objdir, const unsigned char *sha1)
 {
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   strbuf_addf(, "%s/", get_object_directory());
+   strbuf_addf(, "%s/", objdir);
 
fill_sha1_path(, sha1);
return buf.buf;
 }
 
+const char *sha1_file_name(const unsigned char *sha1)
+{
+   return sha1_file_name_alt(get_object_directory(), sha1);
+}
+
 struct strbuf *alt_scratch_buf(struct alternate_object_database *alt)
 {
strbuf_setlen(>scratch, alt->base_len);
@@ -822,24 +827,14 @@ static int stat_sha1_file(const unsigned char *sha1, 
struct stat *st,
return -1;
 }
 
-/*
- * Like stat_sha1_file(), but actually open the object and return the
- * descriptor. See the caveats on the "path" parameter above.
- */
-static int open_sha1_file(const unsigned char *sha1, const char **path)
+static int open_sha1_file_alt(const unsigned char *sha1, const char **path)
 {
-   int fd;
struct alternate_object_database *alt;
-   int most_interesting_errno;
-
-   *path = sha1_file_name(sha1);
-   fd = git_open(*path);
-   if (fd >= 0)
-   return fd;
-   most_interesting_errno = errno;
+   int most_interesting_errno = errno;
 
prepare_alt_odb();
for (alt = alt_odb_list; alt; alt = alt->next) {
+   int fd;
*path = alt_sha1_path(alt, sha1);
fd = git_open(*path);
if (fd >= 0)
@@ -851,6 +846,22 @@ static int open_sha1_file(const unsigned char *sha1, const 
char **path)
return -1;
 }
 
+/*
+ * Like stat_sha1_file(), but actually open the object and return the
+ * descriptor. See the caveats on the "path" parameter above.
+ */
+static int open_sha1_file(const unsigned char *sha1, const char **path)
+{
+   int fd;
+
+   *path = sha1_file_name(sha1);
+   fd = git_open(*path);
+   if (fd >= 0)
+   return fd;
+
+   return open_sha1_file_alt(sha1, path);
+}
+
 /*
  * Map the loose object at "path" if it is not NULL, or the path found by
  * searching for a loose object named "sha1".
@@ -1428,7 +1439,7 @@ int hash_sha1_file(const void *buf, unsigned long len, 
const char *type,
 }
 
 /* Finalize a file on disk, and close it. */
-static void close_sha1_file(int fd)
+void close_sha1_file(int fd)
 {
if (fsync_object_files)
fsync_or_die(fd, "sha1 file");
@@ -1452,7 +1463,7 @@ static inline int directory_size(const char *filename)
  * We want to avoid cross-directory filename renames, because those
  * can have problems on various filesystems (FAT, NFS, Coda).
  */
-static int create_tmpfile(struct strbuf *tmp, const char *filename)
+int create_object_tmpfile(struct strbuf *tmp, const char *filename)
 {
int fd, dirlen = directory_size(filename);
 
@@ -1492,7 +1503,7 @@ static int write_loose_object(const unsigned char *sha1, 
char *hdr, int hdrlen,
static struct strbuf tmp_file = 

[PATCH v6 13/40] external-odb: accept only blobs for now

2017-09-16 Thread Christian Couder
The mechanism to decide which blobs should be sent to which
external object database will be very simple for now.
If the external odb helper support any "put_*" instruction
all the new blobs will be sent to it.

Signed-off-by: Christian Couder 
---
 external-odb.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/external-odb.c b/external-odb.c
index 82fac702e8..a4f8c72e1c 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -124,6 +124,10 @@ int external_odb_put_object(const void *buf, size_t len,
 {
struct odb_helper *o;
 
+   /* For now accept only blobs */
+   if (strcmp(type, "blob"))
+   return 1;
+
external_odb_init();
 
for (o = helpers; o; o = o->next) {
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 00/40] Add initial experimental external ODB support

2017-09-16 Thread Christian Couder
Note: a lot of information about the goals, the design and how things
work is now in the followin patches:

  - [PATCH v6 34/40] Add Documentation/technical/external-odb.txt
  - [PATCH v6 40/40] Doc/external-odb: explain transfering objects and metadata

Goal


Git can store its objects only in the form of loose objects in
separate files or packed objects in a pack file.

To be able to better handle some kind of objects, for example big
blobs, it would be nice if Git could store its objects in other object
databases (ODB).

To do that, this patch series makes it possible to register commands,
also called "helpers", using "odb..scriptCommand" or
"odb..subprocessCommand" config variables, to access external
ODBs where objects can be stored and retrieved.

Design
~~

* The "helpers" (registered commands)

Each helper manages access to one external ODB.

There are 2 different modes for helper:

  - Helpers configured using "odb..scriptCommand" are
launched each time Git wants to communicate with the 
external ODB. This is called "script mode".

  - Helpers configured using "odb..subprocessCommand" are
launched launched once as a sub-process (using sub-process.h), and
Git communicates with them using packet lines. This is called
"process mode".

A helper can be given different instructions by Git. The instructions
that are supported are negociated at the beginning of the
communication using a capability mechanism.

See patch 34/40 and 40/40 (the documentation patchs) for more
information about the different instructions and their arguments.

* Performance

The process mode has been implemented using the refactoring that Ben
Peart did on top of Lars Schneider's work on using sub-processes and
packet lines in the smudge/clean filters for git-lfs.

This also uses further work from Ben Peart called "read object
process".

See:

https://public-inbox.org/git/20170113155253.1644-1-benpe...@microsoft.com/
https://public-inbox.org/git/20170322165220.5660-1-benpe...@microsoft.com/

Ben recently sent an update of this work but this update has not been
integrated into the current patch series. See:

https://public-inbox.org/git/20170714132651.170708-1-benpe...@microsoft.com/

Anyway thanks to this, the external ODB mechanism should in the end perform
as well as the git-lfs mechanism when many objects should be
transfered.

Implementation
~~

* Mechanism to call the registered commands

A set of function in external-odb.{c,h} are called by the rest of Git
to manage all the external ODBs.

These functions use 'struct odb_helper' and its associated functions
defined in odb-helper.{c,h} to talk to the different external ODBs by
launching the configured "odb..*command" commands and writing
to or reading from them.

* Transfering information

To tranfer information about the blobs stored in external ODB, some
special refs, called "odb ref", similar as replace refs, are used in
the tests of this series, but in general nothing forces the helper to
use that mechanism.

The external odb helper is responsible for using and creating the refs
in refs/odbs//, if it wants to do that. It is free for example
to just create one ref, as it is also free to create many refs. Git
would just transmit the refs that have been created by this helper, if
Git is asked to do so.

For now in the tests there is one odb ref per blob, as it is simple
and as it is similar to what git-lfs does. Each ref name is
refs/odbs// where  is the sha1 of the blob stored
in the external odb named .

These odb refs point to a blob that is stored in the Git
repository and contain information about the blob stored in the
external odb. This information can be specific to the external odb.
The repos can then share this information using commands like:

`git fetch origin "refs/odbs//*:refs/odbs//*"`

At the end of the current patch series, "git clone" is teached a
"--initial-refspec" option, that asks it to first fetch some specified
refs. This is used in the tests to fetch the odb refs first.

This way only one "git clone" command can setup a repo using the
external ODB mechanism as long as the right helper is installed on the
machine and as long as the following options are used:

  - "--initial-refspec " to fetch the odb refspec
  - "-c odb..command=" to configure the helper

There is also a test script (t0430) that shows that the
"--initial-refspec" option along with the external ODB mechanism can
be used to implement cloning using bundles.

* ODB refs

For now odb ref management is only implemented in a helper in t0410.

When a new blob is added to an external odb, its sha1, size and type
are writen in another new blob and the odb ref is created.

When the list of existing blobs is requested from the external odb,
the content of the blobs pointed to by the odb refs can also be used
by the odb to claim that it can get the objects.

When a blob is actually requested from the external odb, it can use
the content stored in the blobs 

[PATCH v6 17/40] lib-httpd: pass config file to start_httpd()

2017-09-16 Thread Christian Couder
This makes it possible to start an apache web server with different
config files.

This will be used in a later patch to pass a config file that makes
apache store external objects.

Signed-off-by: Christian Couder 
---
 t/lib-httpd.sh | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 435a37465a..2e659a8ee2 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -171,12 +171,14 @@ prepare_httpd() {
 }
 
 start_httpd() {
+   APACHE_CONF_FILE=${1-apache.conf}
+
prepare_httpd >&3 2>&4
 
trap 'code=$?; stop_httpd; (exit $code); die' EXIT
 
"$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \
-   -f "$TEST_PATH/apache.conf" $HTTPD_PARA \
+   -f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA \
-c "Listen 127.0.0.1:$LIB_HTTPD_PORT" -k start \
>&3 2>&4
if test $? -ne 0
@@ -191,7 +193,7 @@ stop_httpd() {
trap 'die' EXIT
 
"$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \
-   -f "$TEST_PATH/apache.conf" $HTTPD_PARA -k stop
+   -f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA -k stop
 }
 
 test_http_push_nonff () {
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 27/40] Add t0450 to test 'get_direct' mechanism

2017-09-16 Thread Christian Couder
From: Ben Peart 

Signed-off-by: Ben Peart 
Signed-off-by: Christian Couder 
---
 t/t0450-read-object.sh | 28 +
 t/t0450/read-object| 68 ++
 2 files changed, 96 insertions(+)
 create mode 100755 t/t0450-read-object.sh
 create mode 100755 t/t0450/read-object

diff --git a/t/t0450-read-object.sh b/t/t0450-read-object.sh
new file mode 100755
index 00..6b97305452
--- /dev/null
+++ b/t/t0450-read-object.sh
@@ -0,0 +1,28 @@
+#!/bin/sh
+
+test_description='tests for long running read-object process'
+
+. ./test-lib.sh
+
+PATH="$PATH:$TEST_DIRECTORY/t0450"
+
+test_expect_success 'setup host repo with a root commit' '
+   test_commit zero &&
+   hash1=$(git ls-tree HEAD | grep zero.t | cut -f1 | cut -d\  -f3)
+'
+
+HELPER="read-object"
+
+test_expect_success 'blobs can be retrieved from the host repo' '
+   git init guest-repo &&
+   (cd guest-repo &&
+git config odb.magic.subprocessCommand "$HELPER" &&
+git cat-file blob "$hash1" >/dev/null)
+'
+
+test_expect_success 'invalid blobs generate errors' '
+   cd guest-repo &&
+   test_must_fail git cat-file blob "invalid"
+'
+
+test_done
diff --git a/t/t0450/read-object b/t/t0450/read-object
new file mode 100755
index 00..cf22e2f581
--- /dev/null
+++ b/t/t0450/read-object
@@ -0,0 +1,68 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git read-object protocol version 1
+# See Documentation/technical/read-object-protocol.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand."  Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# A lazy clone can be simulated via the following commands from the host repo
+# you wish to create a lazy clone of:
+#
+# cd /host_repo
+# git rev-parse HEAD
+# git init /guest_repo
+# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
+#  cut -d' ' -f1 | git pack-objects /e/guest_repo/.git/objects/pack/noblobs
+# cd /guest_repo
+# git config core.virtualizeobjects true
+# git reset --hard 
+#
+# Please note, this sample is a minimal skeleton. No proper error handling 
+# was implemented.
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = "../.git/";
+
+packet_initialize("git-read-object", 1);
+
+packet_read_and_check_capabilities("get_direct");
+packet_write_capabilities("get_direct");
+
+while (1) {
+   my ($res, $command) = packet_txt_read();
+
+   if ( $res == -1 ) {
+   exit 0;
+   }
+
+   $command =~ s/^command=//;
+
+   if ( $command eq "init" ) {
+   packet_bin_read();
+
+   packet_txt_write("status=success");
+   packet_flush();
+   } elsif ( $command eq "get_direct" ) {
+   my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+   packet_bin_read();
+
+   system ('git --git-dir="' . $DIR . '" cat-file blob ' . $sha1 . 
' | GIT_NO_EXTERNAL_ODB=1 git hash-object -w --stdin >/dev/null 2>&1');
+
+   packet_txt_write(($?) ? "status=error" : "status=success");
+   packet_flush();
+   } else {
+   die "bad command '$command'";
+   }
+}
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 21/40] odb-helper: add odb_helper_get_raw_object()

2017-09-16 Thread Christian Couder
The existing odb_helper_get_object() is renamed
odb_helper_get_git_object() and a new odb_helper_get_raw_object()
is introduced to deal with external objects that are not in Git format.

Signed-off-by: Christian Couder 
---
 odb-helper.c | 113 +--
 1 file changed, 111 insertions(+), 2 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 39d20fdfd7..1f4666b349 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -221,8 +221,107 @@ int odb_helper_has_object(struct odb_helper *o, const 
unsigned char *sha1)
return !!odb_helper_lookup(o, sha1);
 }
 
-int odb_helper_get_object(struct odb_helper *o, const unsigned char *sha1,
-   int fd)
+static int odb_helper_get_raw_object(struct odb_helper *o,
+const unsigned char *sha1,
+int fd)
+{
+   struct odb_helper_object *obj;
+   struct odb_helper_cmd cmd;
+   unsigned long total_got = 0;
+
+   char hdr[32];
+   int hdrlen;
+
+   int ret = Z_STREAM_END;
+   unsigned char compressed[4096];
+   git_zstream stream;
+   git_SHA_CTX hash;
+   unsigned char real_sha1[20];
+
+   obj = odb_helper_lookup(o, sha1);
+   if (!obj)
+   return -1;
+
+   if (odb_helper_start(o, , 0, "get_raw_obj %s", sha1_to_hex(sha1)) < 
0)
+   return -1;
+
+   /* Set it up */
+   git_deflate_init(, zlib_compression_level);
+   stream.next_out = compressed;
+   stream.avail_out = sizeof(compressed);
+   git_SHA1_Init();
+
+   /* First header.. */
+   hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(obj->type), 
obj->size) + 1;
+   stream.next_in = (unsigned char *)hdr;
+   stream.avail_in = hdrlen;
+   while (git_deflate(, 0) == Z_OK)
+   ; /* nothing */
+   git_SHA1_Update(, hdr, hdrlen);
+
+   for (;;) {
+   unsigned char buf[4096];
+   int r;
+
+   r = xread(cmd.child.out, buf, sizeof(buf));
+   if (r < 0) {
+   error("unable to read from odb helper '%s': %s",
+ o->name, strerror(errno));
+   close(cmd.child.out);
+   odb_helper_finish(o, );
+   git_deflate_end();
+   return -1;
+   }
+   if (r == 0)
+   break;
+
+   total_got += r;
+
+   /* Then the data itself.. */
+   stream.next_in = (void *)buf;
+   stream.avail_in = r;
+   do {
+   unsigned char *in0 = stream.next_in;
+   ret = git_deflate(, Z_FINISH);
+   git_SHA1_Update(, in0, stream.next_in - in0);
+   write_or_die(fd, compressed, stream.next_out - 
compressed);
+   stream.next_out = compressed;
+   stream.avail_out = sizeof(compressed);
+   } while (ret == Z_OK);
+   }
+
+   close(cmd.child.out);
+   if (ret != Z_STREAM_END) {
+   warning("bad zlib data from odb helper '%s' for %s",
+   o->name, sha1_to_hex(sha1));
+   return -1;
+   }
+   ret = git_deflate_end_gently();
+   if (ret != Z_OK) {
+   warning("deflateEnd on object %s from odb helper '%s' failed 
(%d)",
+   sha1_to_hex(sha1), o->name, ret);
+   return -1;
+   }
+   git_SHA1_Final(real_sha1, );
+   if (hashcmp(sha1, real_sha1)) {
+   warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+   o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+   return -1;
+   }
+   if (odb_helper_finish(o, ))
+   return -1;
+   if (total_got != obj->size) {
+   warning("size mismatch from odb helper '%s' for %s (%lu != 
%lu)",
+   o->name, sha1_to_hex(sha1), total_got, obj->size);
+   return -1;
+   }
+
+   return 0;
+}
+
+static int odb_helper_get_git_object(struct odb_helper *o,
+const unsigned char *sha1,
+int fd)
 {
struct odb_helper_object *obj;
struct odb_helper_cmd cmd;
@@ -326,6 +425,16 @@ int odb_helper_get_object(struct odb_helper *o, const 
unsigned char *sha1,
return 0;
 }
 
+int odb_helper_get_object(struct odb_helper *o,
+ const unsigned char *sha1,
+ int fd)
+{
+   if (o->supported_capabilities & ODB_HELPER_CAP_GET_RAW_OBJ)
+   return odb_helper_get_raw_object(o, sha1, fd);
+   else
+   return odb_helper_get_git_object(o, sha1, fd);
+}
+
 int odb_helper_put_object(struct odb_helper *o,
  const void 

[PATCH v6 20/40] lib-httpd: add apache-e-odb.conf

2017-09-16 Thread Christian Couder
This is an apache config file to test external object databases.
It uses the upload.sh and list.sh cgi that have been added
previously to make apache store external objects.

Signed-off-by: Christian Couder 
---
 t/lib-httpd/apache-e-odb.conf | 214 ++
 1 file changed, 214 insertions(+)
 create mode 100644 t/lib-httpd/apache-e-odb.conf

diff --git a/t/lib-httpd/apache-e-odb.conf b/t/lib-httpd/apache-e-odb.conf
new file mode 100644
index 00..19a1540c82
--- /dev/null
+++ b/t/lib-httpd/apache-e-odb.conf
@@ -0,0 +1,214 @@
+ServerName dummy
+PidFile httpd.pid
+DocumentRoot www
+LogFormat "%h %l %u %t \"%r\" %>s %b" common
+CustomLog access.log common
+ErrorLog error.log
+
+   LoadModule log_config_module modules/mod_log_config.so
+
+
+   LoadModule alias_module modules/mod_alias.so
+
+
+   LoadModule cgi_module modules/mod_cgi.so
+
+
+   LoadModule env_module modules/mod_env.so
+
+
+   LoadModule rewrite_module modules/mod_rewrite.so
+
+
+   LoadModule version_module modules/mod_version.so
+
+
+   LoadModule headers_module modules/mod_headers.so
+
+
+
+LockFile accept.lock
+
+
+
+
+   LoadModule auth_module modules/mod_auth.so
+
+
+
+= 2.1>
+
+   LoadModule auth_basic_module modules/mod_auth_basic.so
+
+
+   LoadModule authn_file_module modules/mod_authn_file.so
+
+
+   LoadModule authz_user_module modules/mod_authz_user.so
+
+
+   LoadModule authz_host_module modules/mod_authz_host.so
+
+
+
+= 2.4>
+
+   LoadModule authn_core_module modules/mod_authn_core.so
+
+
+   LoadModule authz_core_module modules/mod_authz_core.so
+
+
+   LoadModule access_compat_module modules/mod_access_compat.so
+
+
+   LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
+
+
+   LoadModule unixd_module modules/mod_unixd.so
+
+
+
+PassEnv GIT_VALGRIND
+PassEnv GIT_VALGRIND_OPTIONS
+PassEnv GNUPGHOME
+PassEnv ASAN_OPTIONS
+PassEnv GIT_TRACE
+PassEnv GIT_CONFIG_NOSYSTEM
+
+Alias /dumb/ www/
+Alias /auth/dumb/ www/auth/dumb/
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+   SetEnv GIT_COMMITTER_NAME "Custom User"
+   SetEnv GIT_COMMITTER_EMAIL cus...@example.com
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+   SetEnv GIT_NAMESPACE ns
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+   Header set Set-Cookie name=value
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+
+ScriptAlias /upload/ upload.sh/
+ScriptAlias /list/ list.sh/
+
+   Options FollowSymlinks
+
+
+  Options ExecCGI
+
+
+  Options ExecCGI
+
+
+   Options ExecCGI
+
+
+RewriteEngine on
+RewriteRule ^/smart-redir-perm/(.*)$ /smart/$1 [R=301]
+RewriteRule ^/smart-redir-temp/(.*)$ /smart/$1 [R=302]
+RewriteRule ^/smart-redir-auth/(.*)$ /auth/smart/$1 [R=301]
+RewriteRule ^/smart-redir-limited/(.*)/info/refs$ /smart/$1/info/refs [R=301]
+RewriteRule ^/ftp-redir/(.*)$ ftp://localhost:1000/$1 [R=302]
+
+RewriteRule ^/loop-redir/x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-(.*) /$1 
[R=302]
+RewriteRule ^/loop-redir/(.*)$ /loop-redir/x-$1 [R=302]
+
+# Apache 2.2 does not understand , so we use RewriteCond.
+# And as RewriteCond does not allow testing for non-matches, we match
+# the desired case first (one has abra, two has cadabra), and let it
+# pass by marking the RewriteRule as [L], "last rule, do not process
+# any other matching RewriteRules after this"), and then have another
+# RewriteRule that matches all other cases and lets them fail via '[F]',
+# "fail the request".
+RewriteCond %{HTTP:x-magic-one} =abra
+RewriteCond %{HTTP:x-magic-two} =cadabra
+RewriteRule ^/smart_headers/.* - [L]
+RewriteRule ^/smart_headers/.* - [F]
+
+
+LoadModule ssl_module modules/mod_ssl.so
+
+SSLCertificateFile httpd.pem
+SSLCertificateKeyFile httpd.pem
+SSLRandomSeed startup file:/dev/urandom 512
+SSLRandomSeed connect file:/dev/urandom 512
+SSLSessionCache none
+SSLMutex file:ssl_mutex
+SSLEngine On
+
+
+
+   AuthType Basic
+   AuthName "git-auth"
+   AuthUserFile passwd
+   Require valid-user
+
+
+
+   AuthType Basic
+   AuthName "git-auth"
+   AuthUserFile passwd
+   Require valid-user
+
+
+
+   AuthType Basic
+   AuthName "git-auth"
+   AuthUserFile passwd
+   Require valid-user
+
+
+RewriteCond %{QUERY_STRING} service=git-receive-pack [OR]
+RewriteCond %{REQUEST_URI} /git-receive-pack$
+RewriteRule ^/half-auth-complete/ - [E=AUTHREQUIRED:yes]
+
+
+  Order Deny,Allow
+  Deny from env=AUTHREQUIRED
+
+  AuthType Basic
+  AuthName "Git Access"
+  AuthUserFile passwd
+  Require valid-user
+  Satisfy Any
+
+
+
+   LoadModule dav_module modules/mod_dav.so
+   LoadModule dav_fs_module 

[PATCH v6 09/40] Add initial external odb support

2017-09-16 Thread Christian Couder
The external-odb.{c,h} files contains the functions that are
called by the rest of Git from "sha1_file.c".

The odb-helper.{c,h} files contains the functions to
actually implement communication with the external scripts or
processes that will manage external git objects.

For now only script mode is supported, and only the 'have' and
'get_git_obj' instructions are supported.

Helped-by: Jeff King 
Signed-off-by: Christian Couder 
---
 Makefile|   2 +
 cache.h |   1 +
 external-odb.c  | 113 
 external-odb.h  |   8 ++
 odb-helper.c| 269 
 odb-helper.h|  27 +
 sha1_file.c |  31 +-
 t/t0400-external-odb.sh |  46 +
 8 files changed, 495 insertions(+), 2 deletions(-)
 create mode 100644 external-odb.c
 create mode 100644 external-odb.h
 create mode 100644 odb-helper.c
 create mode 100644 odb-helper.h
 create mode 100755 t/t0400-external-odb.sh

diff --git a/Makefile b/Makefile
index f2bb7f2f63..24aab8ace3 100644
--- a/Makefile
+++ b/Makefile
@@ -784,6 +784,7 @@ LIB_OBJS += ewah/ewah_bitmap.o
 LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
+LIB_OBJS += external-odb.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
 LIB_OBJS += gettext.o
@@ -816,6 +817,7 @@ LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
+LIB_OBJS += odb-helper.o
 LIB_OBJS += oidset.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pack-bitmap.o
diff --git a/cache.h b/cache.h
index 00d89568f3..6c22bd0525 100644
--- a/cache.h
+++ b/cache.h
@@ -1534,6 +1534,7 @@ extern void prepare_alt_odb(void);
 extern char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
 extern int foreach_alt_odb(alt_odb_fn, void*);
+extern void prepare_external_alt_odb(void);
 
 /*
  * Allocate a "struct alternate_object_database" but do _not_ actually
diff --git a/external-odb.c b/external-odb.c
new file mode 100644
index 00..e9c3f11666
--- /dev/null
+++ b/external-odb.c
@@ -0,0 +1,113 @@
+#include "cache.h"
+#include "external-odb.h"
+#include "odb-helper.h"
+
+static struct odb_helper *helpers;
+static struct odb_helper **helpers_tail = 
+
+static struct odb_helper *find_or_create_helper(const char *name, int len)
+{
+   struct odb_helper *o;
+
+   for (o = helpers; o; o = o->next)
+   if (!strncmp(o->name, name, len) && !o->name[len])
+   return o;
+
+   o = odb_helper_new(name, len);
+   *helpers_tail = o;
+   helpers_tail = >next;
+
+   return o;
+}
+
+static int external_odb_config(const char *var, const char *value, void *data)
+{
+   struct odb_helper *o;
+   const char *name;
+   int namelen;
+   const char *subkey;
+
+   if (parse_config_key(var, "odb", , , ) < 0)
+   return 0;
+
+   o = find_or_create_helper(name, namelen);
+
+   if (!strcmp(subkey, "scriptcommand"))
+   return git_config_string(>cmd, var, value);
+
+   return 0;
+}
+
+static void external_odb_init(void)
+{
+   static int initialized;
+
+   if (initialized)
+   return;
+   initialized = 1;
+
+   git_config(external_odb_config, NULL);
+}
+
+const char *external_odb_root(void)
+{
+   static const char *root;
+   if (!root)
+   root = git_pathdup("objects/external");
+   return root;
+}
+
+int external_odb_has_object(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   external_odb_init();
+
+   for (o = helpers; o; o = o->next)
+   if (odb_helper_has_object(o, sha1))
+   return 1;
+   return 0;
+}
+
+int external_odb_get_object(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+   const char *path;
+
+   if (!external_odb_has_object(sha1))
+   return -1;
+
+   path = sha1_file_name_alt(external_odb_root(), sha1);
+   safe_create_leading_directories_const(path);
+   prepare_external_alt_odb();
+
+   for (o = helpers; o; o = o->next) {
+   struct strbuf tmpfile = STRBUF_INIT;
+   int ret;
+   int fd;
+
+   if (!odb_helper_has_object(o, sha1))
+   continue;
+
+   fd = create_object_tmpfile(, path);
+   if (fd < 0) {
+   strbuf_release();
+   return -1;
+   }
+
+   if (odb_helper_get_object(o, sha1, fd) < 0) {
+   close(fd);
+   unlink(tmpfile.buf);
+   strbuf_release();
+   continue;
+   }
+
+   close_sha1_file(fd);
+   ret = finalize_object_file(tmpfile.buf, path);
+   strbuf_release();
+

[PATCH v6 23/40] Add t0420 to test transfer to HTTP external odb

2017-09-16 Thread Christian Couder
This tests that an apache web server can be used as an
external object database and store files in their native
format instead of converting them to a Git object.

Signed-off-by: Christian Couder 
---
 t/t0420-transfer-http-e-odb.sh | 142 +
 1 file changed, 142 insertions(+)
 create mode 100755 t/t0420-transfer-http-e-odb.sh

diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
new file mode 100755
index 00..f84fe950ec
--- /dev/null
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -0,0 +1,142 @@
+#!/bin/sh
+
+test_description='tests for transfering external objects to an HTTPD server'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+# odb helper script must see this
+export HTTPD_URL
+
+write_script odb-http-helper <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
+echo >&2 "odb-http-helper args:" "$@"
+case "$1" in
+init)
+   echo "capability=get_raw_obj"
+   echo "capability=put_raw_obj"
+   echo "capability=have"
+   ;;
+have)
+   list_url="$HTTPD_URL/list/"
+   curl "$list_url" ||
+   die "curl '$list_url' failed"
+   ;;
+get_raw_obj)
+   get_url="$HTTPD_URL/list/?sha1=$2"
+   curl "$get_url" ||
+   die "curl '$get_url' failed"
+   ;;
+put_raw_obj)
+   sha1="$2"
+   size="$3"
+   kind="$4"
+   upload_url="$HTTPD_URL/upload/?sha1=$sha1=$size=$kind"
+   curl --data-binary @- --include "$upload_url" >out ||
+   die "curl '$upload_url' failed"
+   ref_hash=$(echo "$sha1 $size $kind" | GIT_NO_EXTERNAL_ODB=1 git 
hash-object -w -t blob --stdin) || exit
+   git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
+esac
+EOF
+HELPER="\"$PWD\"/odb-http-helper"
+
+test_expect_success 'setup repo with a root commit and the helper' '
+   test_commit zero &&
+   git config odb.magic.scriptCommand "$HELPER"
+'
+
+test_expect_success 'setup another repo from the first one' '
+   git init other-repo &&
+   (cd other-repo &&
+git remote add origin .. &&
+git pull origin master &&
+git checkout master &&
+git log)
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME=123=blob"
+
+test_expect_success 'can upload a file' '
+   echo "Hello Apache World!" >hello_to_send.txt &&
+   echo "How are you?" >>hello_to_send.txt &&
+   curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" 
>out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+   curl --include "$LIST_URL" >out_list &&
+   grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+   curl --data "delete" --include "$UPLOAD_URL=1" >out_delete &&
+   curl --include "$LIST_URL" >out_list2 &&
+   ! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transfered to the http server' '
+   test_commit one &&
+   hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+   echo "$hash1-4-blob" >expected &&
+   ls "$FILES_DIR" >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+   git cat-file blob "$hash1" &&
+   git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+   (cd other-repo &&
+git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+test_must_fail git cat-file blob "$hash1" &&
+git config odb.magic.scriptCommand "$HELPER" &&
+git cat-file blob "$hash1" &&
+git pull origin master)
+'
+
+test_expect_success 'local clone from the first repo' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone .. . &&
+git cat-file blob "$hash1")
+'
+
+test_expect_success 'no-local clone from the first repo fails' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+test_must_fail git clone --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+test_expect_success 'no-local clone from the first repo with helper succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git clone -c odb.magic.scriptCommand="$HELPER" \
+   --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+stop_httpd
+
+test_done
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 18/40] lib-httpd: add upload.sh

2017-09-16 Thread Christian Couder
This cgi will be used to upload objects to, or to delete
objects from, an apache web server.

This way the apache server can work as an external object
database.

Signed-off-by: Christian Couder 
---
 t/lib-httpd.sh|  1 +
 t/lib-httpd/upload.sh | 45 +
 2 files changed, 46 insertions(+)
 create mode 100644 t/lib-httpd/upload.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 2e659a8ee2..d80b004549 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -132,6 +132,7 @@ prepare_httpd() {
cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
install_script broken-smart-http.sh
install_script error.sh
+   install_script upload.sh
 
ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/upload.sh b/t/lib-httpd/upload.sh
new file mode 100644
index 00..64d3f31c31
--- /dev/null
+++ b/t/lib-httpd/upload.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+# In part from 
http://codereview.stackexchange.com/questions/79549/bash-cgi-upload-file
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+   key=${1%%=*}
+   val=${1#*=}
+
+   case "$key" in
+   "sha1") sha1="$val" ;;
+   "type") type="$val" ;;
+   "size") size="$val" ;;
+   "delete") delete=1 ;;
+   *) echo >&2 "unknown key '$key'" ;;
+   esac
+
+   shift
+done
+
+case "$REQUEST_METHOD" in
+POST)
+   if test "$delete" = "1"
+   then
+   rm -f "$FILES_DIR/$sha1-$size-$type"
+   else
+   mkdir -p "$FILES_DIR"
+   cat >"$FILES_DIR/$sha1-$size-$type"
+   fi
+
+   echo 'Status: 204 No Content'
+   echo
+   ;;
+
+*)
+   echo 'Status: 405 Method Not Allowed'
+   echo
+esac
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 35/40] clone: add 'initial' param to write_remote_refs()

2017-09-16 Thread Christian Couder
We want to make it possible to separate fetching remote refs into
an initial part and a later part. To prepare for that, let's add
an 'initial' boolean parameter to write_remote_refs() to tell this
function if we are performing the initial part or not.

Signed-off-by: Christian Couder 
---
 builtin/clone.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index dcd5b878f1..2e5d60521d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -574,7 +574,7 @@ static struct ref *wanted_peer_refs(const struct ref *refs,
return local_refs;
 }
 
-static void write_remote_refs(const struct ref *local_refs)
+static void write_remote_refs(const struct ref *local_refs, int initial)
 {
const struct ref *r;
 
@@ -593,8 +593,13 @@ static void write_remote_refs(const struct ref *local_refs)
die("%s", err.buf);
}
 
-   if (initial_ref_transaction_commit(t, ))
-   die("%s", err.buf);
+   if (initial) {
+   if (initial_ref_transaction_commit(t, ))
+   die("%s", err.buf);
+   } else {
+   if (ref_transaction_commit(t, ))
+   die("%s", err.buf);
+   }
 
strbuf_release();
ref_transaction_free(t);
@@ -641,7 +646,8 @@ static void update_remote_refs(const struct ref *refs,
   const char *branch_top,
   const char *msg,
   struct transport *transport,
-  int check_connectivity)
+  int check_connectivity,
+  int initial)
 {
const struct ref *rm = mapped_refs;
 
@@ -656,7 +662,7 @@ static void update_remote_refs(const struct ref *refs,
}
 
if (refs) {
-   write_remote_refs(mapped_refs);
+   write_remote_refs(mapped_refs, initial);
if (option_single_branch && !option_no_tags)
write_followtags(refs, msg);
}
@@ -1168,7 +1174,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
transport_fetch_refs(transport, mapped_refs);
 
update_remote_refs(refs, mapped_refs, remote_head_points_at,
-  branch_top.buf, reflog_msg.buf, transport, 
!is_local);
+  branch_top.buf, reflog_msg.buf, transport,
+  !is_local, 0);
 
update_head(our_head_points_at, remote_head, reflog_msg.buf);
 
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 36/40] clone: add --initial-refspec option

2017-09-16 Thread Christian Couder
This option makes it possible to separate fetching refs when cloning
in two parts, an initial part and a later normal part.

This way after the initial part, mechanisms like the external odb
mechanism can be used to prefetch some objects using information
that has been made available during the initial fetch.

Signed-off-by: Christian Couder 
---
 builtin/clone.c | 55 ++-
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 2e5d60521d..57cecd194c 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -57,6 +57,7 @@ static enum transport_family family;
 static struct string_list option_config = STRING_LIST_INIT_NODUP;
 static struct string_list option_required_reference = STRING_LIST_INIT_NODUP;
 static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
+static struct string_list option_initial_refspec = STRING_LIST_INIT_NODUP;
 static int option_dissociate;
 static int max_jobs = -1;
 static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
@@ -107,6 +108,8 @@ static struct option builtin_clone_options[] = {
N_("reference repository")),
OPT_STRING_LIST(0, "reference-if-able", _optional_reference,
N_("repo"), N_("reference repository")),
+   OPT_STRING_LIST(0, "initial-refspec", _initial_refspec,
+   N_("refspec"), N_("fetch this refspec first")),
OPT_BOOL(0, "dissociate", _dissociate,
 N_("use --reference only while cloning")),
OPT_STRING('o', "origin", _origin, N_("name"),
@@ -869,6 +872,47 @@ static void dissociate_from_references(void)
free(alternates);
 }
 
+static struct refspec *parse_initial_refspecs(void)
+{
+   const char **refspecs;
+   struct refspec *initial_refspecs;
+   struct string_list_item *rs;
+   int i = 0;
+
+   if (!option_initial_refspec.nr)
+   return NULL;
+
+   refspecs = xcalloc(option_initial_refspec.nr, sizeof(const char *));
+
+   for_each_string_list_item(rs, _initial_refspec)
+   refspecs[i++] = rs->string;
+
+   initial_refspecs = parse_fetch_refspec(option_initial_refspec.nr, 
refspecs);
+
+   free(refspecs);
+
+   return initial_refspecs;
+}
+
+static void fetch_initial_refs(struct transport *transport,
+  const struct ref *refs,
+  struct refspec *initial_refspecs,
+  const char *branch_top,
+  const char *reflog_msg,
+  int is_local)
+{
+   int i;
+
+   for (i = 0; i < option_initial_refspec.nr; i++) {
+   struct ref *init_refs = NULL;
+   struct ref **tail = _refs;
+   get_fetch_map(refs, _refspecs[i], , 0);
+   transport_fetch_refs(transport, init_refs);
+   update_remote_refs(refs, init_refs, NULL, branch_top, 
reflog_msg,
+  transport, !is_local, 1);
+   }
+}
+
 int cmd_clone(int argc, const char **argv, const char *prefix)
 {
int is_bundle = 0, is_local;
@@ -892,6 +936,9 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
struct refspec *refspec;
const char *fetch_pattern;
 
+   struct refspec *initial_refspecs;
+   int is_initial;
+
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
 builtin_clone_usage, 0);
@@ -1059,6 +1106,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (option_required_reference.nr || option_optional_reference.nr)
setup_reference();
 
+   initial_refspecs = parse_initial_refspecs();
+
fetch_pattern = xstrfmt("+%s*:%s*", src_ref_prefix, branch_top.buf);
refspec = parse_fetch_refspec(1, _pattern);
free((char *)fetch_pattern);
@@ -1114,6 +1163,9 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
refs = transport_get_remote_refs(transport);
 
if (refs) {
+   fetch_initial_refs(transport, refs, initial_refspecs,
+  branch_top.buf, reflog_msg.buf, is_local);
+
mapped_refs = wanted_peer_refs(refs, refspec);
/*
 * transport_get_remote_refs() may return refs with null sha-1
@@ -1173,9 +1225,10 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
else if (refs && complete_refs_before_fetch)
transport_fetch_refs(transport, mapped_refs);
 
+   is_initial = !refs || option_initial_refspec.nr == 0;
update_remote_refs(refs, mapped_refs, remote_head_points_at,
   branch_top.buf, reflog_msg.buf, transport,
-  !is_local, 0);
+  

[PATCH v6 34/40] Add Documentation/technical/external-odb.txt

2017-09-16 Thread Christian Couder
This describes the external odb mechanism's purpose and
how it works.

Helped-by: Ben Peart 
Signed-off-by: Christian Couder 
---
 Documentation/technical/external-odb.txt | 342 +++
 1 file changed, 342 insertions(+)
 create mode 100644 Documentation/technical/external-odb.txt

diff --git a/Documentation/technical/external-odb.txt 
b/Documentation/technical/external-odb.txt
new file mode 100644
index 00..58ec8a8145
--- /dev/null
+++ b/Documentation/technical/external-odb.txt
@@ -0,0 +1,342 @@
+External ODBs
+^
+
+The External ODB mechanism makes it possible for Git objects, only
+blobs for now though, to be stored in an "external object database"
+(External ODB).
+
+An External ODB can be any object store as long as there is an helper
+program called an "odb helper" that can communicate with Git to
+transfer objects to/from the external odb and to retrieve information
+about available objects in the external odb.
+
+Purpose
+===
+
+The purpose of this mechanism is to make possible to handle Git
+objects, especially blobs, in much more flexible ways.
+
+Currently Git can store its objects only in the form of loose objects
+in separate files or packed objects in a pack file. These existing
+object stores cannot be easily optimized for many different kind of
+contents.
+
+So the current stores are not flexible enough for some important use
+cases like handling really big binary files or handling a really big
+number of files that are fetched only as needed. And it is not
+realistic to expect that Git could fully natively handle many of such
+use cases. Git would need to natively implement different internal
+stores which would be a huge burden and which could lead to
+re-implement things like HTTP servers, Docker registries or artifact
+stores that already exist outside Git.
+
+Furthermore many improvements that are dependent on specific setups
+could be implemented in the way Git objects are managed if it was
+possible to customize how the Git objects are handled. For example a
+restartable clone using the bundle mechanism has often been requested,
+but implementing that would go against the current strict rules under
+which the Git objects are currently handled.
+
+What Git needs is a mechanism to make it possible to customize in a
+lot of different ways how the Git objects are handled. Though this
+mechanism should try as much as possible to avoid interfering with the
+usual way in which Git handle its objects.
+
+Helpers
+===
+
+ODB helpers are commands that have to be registered using either the
+"odb..subprocessCommand" or the "odb..scriptCommand"
+config variables.
+
+Registering such a command tells Git that an external odb called
+ exists and that the registered command should be used to
+communicate with it.
+
+The communication happens through instructions that are sent by Git
+and that the commands should answer. If it makes sense, Git can send
+the same instruction to many commands in the order in which they are
+configured.
+
+There are 2 kinds of commands. Commands registered using the
+"odb..subprocessCommand" config variable are called "process
+commands" and the associated mode is called "process mode". Commands
+registered using the "odb..scriptCommand" config variables
+are called "script commands" and the associated mode is called "script
+mode".
+
+Early on git commands send an 'init' instruction to the registered
+commands. A capability negociation will take place during this
+request/response exchange which will let Git and the helpers know how
+they can further collaborate. The attribute system can also be used to
+tell Git which objects should be handled by which helper.
+
+Process Mode
+
+
+In process mode the command is started as a single process invocation
+that should last for the entire life of the single Git command that
+started it.
+
+A packet format (pkt-line, see technical/protocol-common.txt) based
+protocol over standard input and standard output is used for
+communication between Git and the helper command.
+
+After the process command is started, Git sends a welcome message
+("git-read-object-client"), a list of supported protocol version
+numbers, and a flush packet. Git expects to read a welcome response
+message ("git-read-object-server"), exactly one protocol version
+number from the previously sent list, and a flush packet. All further
+communication will be based on the selected version.
+
+The remaining protocol description below documents "version=1". Please
+note that "version=42" in the example below does not exist and is only
+there to illustrate how the protocol would look with more than one
+version.
+
+After the version negotiation Git sends a list of all capabilities
+that it supports and a flush packet. Git expects to read a list of
+desired capabilities, which must be a subset of the supported
+capabilities list, and a flush packet as 

[PATCH v6 25/40] odb-helper: add 'script_mode' to 'struct odb_helper'

2017-09-16 Thread Christian Couder
to prepare for having a long running odb helper sub-process
handling the communication between Git and an external odb.

We introduce "odb..subprocesscommand" to make it
possible to define such a sub-process, and we mark such odb
helpers with the new 'script_mode' field set to 0.

Helpers defined using the existing "odb..scriptcommand"
are marked with the 'script_mode' field set to 1.

Implementation of the different capabilities/instructions in
the new (sub-)process mode is left for following commits.

Signed-off-by: Christian Couder 
---
 external-odb.c |  8 +++-
 odb-helper.c   | 19 ++-
 odb-helper.h   |  1 +
 3 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 31d21bfe04..ccca67eff5 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -32,8 +32,14 @@ static int external_odb_config(const char *var, const char 
*value, void *data)
 
o = find_or_create_helper(name, namelen);
 
-   if (!strcmp(subkey, "scriptcommand"))
+   if (!strcmp(subkey, "scriptcommand")) {
+   o->script_mode = 1;
return git_config_string(>cmd, var, value);
+   }
+   if (!strcmp(subkey, "subprocesscommand")) {
+   o->script_mode = 0;
+   return git_config_string(>cmd, var, value);
+   }
 
return 0;
 }
diff --git a/odb-helper.c b/odb-helper.c
index 3d940a3171..22728200e3 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -123,6 +123,9 @@ int odb_helper_init(struct odb_helper *o)
FILE *fh;
struct strbuf line = STRBUF_INIT;
 
+   if (!o->script_mode)
+   return 0;
+
if (odb_helper_start(o, , 0, "init") < 0)
return -1;
 
@@ -173,16 +176,12 @@ static int odb_helper_object_cmp(const void *va, const 
void *vb)
return hashcmp(a->sha1, b->sha1);
 }
 
-static void odb_helper_load_have(struct odb_helper *o)
+static void have_object_script(struct odb_helper *o)
 {
struct odb_helper_cmd cmd;
FILE *fh;
struct strbuf line = STRBUF_INIT;
 
-   if (o->have_valid)
-   return;
-   o->have_valid = 1;
-
if (odb_helper_start(o, , 0, "have") < 0)
return;
 
@@ -194,6 +193,16 @@ static void odb_helper_load_have(struct odb_helper *o)
strbuf_release();
fclose(fh);
odb_helper_finish(o, );
+}
+
+static void odb_helper_load_have(struct odb_helper *o)
+{
+   if (o->have_valid)
+   return;
+   o->have_valid = 1;
+
+   if (o->script_mode)
+   have_object_script(o);
 
qsort(o->have, o->have_nr, sizeof(*o->have), odb_helper_object_cmp);
 }
diff --git a/odb-helper.h b/odb-helper.h
index fbb6d333ee..ee2b09b182 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -15,6 +15,7 @@ struct odb_helper {
const char *name;
const char *cmd;
unsigned int supported_capabilities;
+   int script_mode;
 
struct odb_helper_object {
unsigned char sha1[20];
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 28/40] Add t0460 to test passing git objects

2017-09-16 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t0460-read-object-git.sh | 28 +
 t/t0460/read-object-git| 78 ++
 2 files changed, 106 insertions(+)
 create mode 100755 t/t0460-read-object-git.sh
 create mode 100755 t/t0460/read-object-git

diff --git a/t/t0460-read-object-git.sh b/t/t0460-read-object-git.sh
new file mode 100755
index 00..2873b445f3
--- /dev/null
+++ b/t/t0460-read-object-git.sh
@@ -0,0 +1,28 @@
+#!/bin/sh
+
+test_description='tests for long running read-object process passing git 
objects'
+
+. ./test-lib.sh
+
+PATH="$PATH:$TEST_DIRECTORY/t0460"
+
+test_expect_success 'setup host repo with a root commit' '
+   test_commit zero &&
+   hash1=$(git ls-tree HEAD | grep zero.t | cut -f1 | cut -d\  -f3)
+'
+
+HELPER="read-object-git"
+
+test_expect_success 'blobs can be retrieved from the host repo' '
+   git init guest-repo &&
+   (cd guest-repo &&
+git config odb.magic.subprocessCommand "$HELPER" &&
+git cat-file blob "$hash1" >/dev/null)
+'
+
+test_expect_success 'invalid blobs generate errors' '
+   cd guest-repo &&
+   test_must_fail git cat-file blob "invalid"
+'
+
+test_done
diff --git a/t/t0460/read-object-git b/t/t0460/read-object-git
new file mode 100755
index 00..38529e622e
--- /dev/null
+++ b/t/t0460/read-object-git
@@ -0,0 +1,78 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git read-object protocol version 1
+# See Documentation/technical/read-object-protocol.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand."  Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# A lazy clone can be simulated via the following commands from the host repo
+# you wish to create a lazy clone of:
+#
+# cd /host_repo
+# git rev-parse HEAD
+# git init /guest_repo
+# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
+#  cut -d' ' -f1 | git pack-objects /e/guest_repo/.git/objects/pack/noblobs
+# cd /guest_repo
+# git config core.virtualizeobjects true
+# git reset --hard 
+#
+# Please note, this sample is a minimal skeleton. No proper error handling 
+# was implemented.
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = "../.git/";
+
+packet_initialize("git-read-object", 1);
+
+packet_read_and_check_capabilities("get_git_obj");
+packet_write_capabilities("get_git_obj");
+
+while (1) {
+   my ($res, $command) = packet_txt_read();
+
+   if ( $res == -1 ) {
+   exit 0;
+   }
+
+   $command =~ s/^command=//;
+
+   if ( $command eq "init" ) {
+   packet_bin_read();
+
+   packet_txt_write("status=success");
+   packet_flush();
+   } elsif ( $command eq "get_git_obj" ) {
+   my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+   packet_bin_read();
+
+   my $path = $sha1;
+   $path =~ s{..}{$&/};
+   $path = $DIR . "/objects/" . $path;
+
+   my $contents = do {
+   local $/;
+   open my $fh, $path or die "Can't open '$path': $!";
+   <$fh>
+   };
+
+   packet_bin_write($contents);
+   packet_flush();
+   packet_txt_write("status=success");
+   packet_flush();
+   } else {
+   die "bad command '$command'";
+   }
+}
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 39/40] Add t0430 to test cloning using bundles

2017-09-16 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t0430-clone-bundle-e-odb.sh | 85 +++
 1 file changed, 85 insertions(+)
 create mode 100755 t/t0430-clone-bundle-e-odb.sh

diff --git a/t/t0430-clone-bundle-e-odb.sh b/t/t0430-clone-bundle-e-odb.sh
new file mode 100755
index 00..ac38ae1be5
--- /dev/null
+++ b/t/t0430-clone-bundle-e-odb.sh
@@ -0,0 +1,85 @@
+#!/bin/sh
+
+test_description='tests for cloning using a bundle through e-odb'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+# odb helper script must see this
+export HTTPD_URL
+
+write_script odb-clone-bundle-helper <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
+echo >&2 "odb-clone-bundle-helper args:" "$@"
+case "$1" in
+init)
+   ref_hash=$(git rev-parse refs/odbs/magic/bundle) ||
+   die "couldn't find refs/odbs/magic/bundle"
+   GIT_NO_EXTERNAL_ODB=1 git cat-file blob "$ref_hash" >bundle_info ||
+   die "couldn't get blob $ref_hash"
+   bundle_url=$(sed -e 's/bundle url: //' bundle_info)
+   echo >&2 "bundle_url: '$bundle_url'"
+   curl "$bundle_url" -o bundle_file ||
+   die "curl '$bundle_url' failed"
+   GIT_NO_EXTERNAL_ODB=1 git bundle unbundle bundle_file >unbundling_info 
||
+   die "unbundling 'bundle_file' failed"
+   ;;
+get*)
+   die "odb-clone-bundle-helper '$1' called"
+   ;;
+put*)
+   die "odb-clone-bundle-helper '$1' called"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
+esac
+EOF
+HELPER="\"$PWD\"/odb-clone-bundle-helper"
+
+
+test_expect_success 'setup repo with a few commits' '
+   test_commit one &&
+   test_commit two &&
+   test_commit three &&
+   test_commit four
+'
+
+BUNDLE_FILE="file.bundle"
+FILES_DIR="httpd/www/files"
+GET_URL="$HTTPD_URL/files/$BUNDLE_FILE"
+
+test_expect_success 'create a bundle for this repo and check that it can be 
downloaded' '
+   git bundle create "$BUNDLE_FILE" master &&
+   mkdir "$FILES_DIR" &&
+   cp "$BUNDLE_FILE" "$FILES_DIR/" &&
+   curl "$GET_URL" --output actual &&
+   test_cmp "$BUNDLE_FILE" actual
+'
+
+test_expect_success 'create an e-odb ref for this bundle' '
+   ref_hash=$(echo "bundle url: $GET_URL" | GIT_NO_EXTERNAL_ODB=1 git 
hash-object -w -t blob --stdin) &&
+   git update-ref refs/odbs/magic/bundle "$ref_hash"
+'
+
+test_expect_success 'clone using the e-odb helper to download and install the 
bundle' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone --no-local \
+   -c odb.magic.scriptCommand="$HELPER" \
+   --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
+stop_httpd
+
+test_done
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 29/40] odb-helper: add put_object_process()

2017-09-16 Thread Christian Couder
This adds the infrastructure to send objects to a sub-process
handling the communication with an external odb.

For now we only handle sending raw blobs using the 'put_raw_obj'
instruction.

Signed-off-by: Christian Couder 
---
 odb-helper.c | 75 +---
 1 file changed, 72 insertions(+), 3 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 3148bcfa15..356f6172d8 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -433,6 +433,58 @@ static int get_object_process(struct odb_helper *o, const 
unsigned char *sha1, i
  o->cmd, cur_cap);
 }
 
+static int send_put_packets(struct object_process *entry,
+   const unsigned char *sha1,
+   const void *buf,
+   size_t len,
+   struct strbuf *status)
+{
+   struct child_process *process = >subprocess.process;
+   int err = packet_write_fmt_gently(process->in, "command=put_raw_obj\n");
+   if (err)
+   return err;
+
+   err = packet_write_fmt_gently(process->in, "sha1=%s\n", 
sha1_to_hex(sha1));
+   if (err)
+   return err;
+
+   err = packet_write_fmt_gently(process->in, "size=%"PRIuMAX"\n", len);
+   if (err)
+   return err;
+
+   err = packet_write_fmt_gently(process->in, "kind=blob\n");
+   if (err)
+   return err;
+
+   err = packet_flush_gently(process->in);
+   if (err)
+   return err;
+
+   err = write_packetized_from_buf(buf, len, process->in);
+   if (err)
+   return err;
+
+   return check_object_process_status(process->out, status);
+}
+
+static int put_object_process(struct odb_helper *o,
+ const void *buf, size_t len,
+ const char *type, unsigned char *sha1)
+{
+   int err;
+   struct object_process *entry;
+   struct strbuf status = STRBUF_INIT;
+
+   entry = launch_object_process(o, ODB_HELPER_CAP_PUT_RAW_OBJ);
+   if (!entry)
+   return -1;
+
+   err = send_put_packets(entry, sha1, buf, len, );
+
+   return check_object_process_error(err, status.buf, entry, o->cmd,
+ ODB_HELPER_CAP_PUT_RAW_OBJ);
+}
+
 struct odb_helper *odb_helper_new(const char *name, int namelen)
 {
struct odb_helper *o;
@@ -904,9 +956,9 @@ int odb_helper_get_object(struct odb_helper *o,
return res;
 }
 
-int odb_helper_put_object(struct odb_helper *o,
- const void *buf, size_t len,
- const char *type, unsigned char *sha1)
+static int put_raw_object_script(struct odb_helper *o,
+const void *buf, size_t len,
+const char *type, unsigned char *sha1)
 {
struct odb_helper_cmd cmd;
 
@@ -932,3 +984,20 @@ int odb_helper_put_object(struct odb_helper *o,
odb_helper_finish(o, );
return 0;
 }
+
+int odb_helper_put_object(struct odb_helper *o,
+ const void *buf, size_t len,
+ const char *type, unsigned char *sha1)
+{
+   int res;
+   uint64_t start = getnanotime();
+
+   if (o->script_mode)
+   res = put_raw_object_script(o, buf, len, type, sha1);
+   else
+   res = put_object_process(o, buf, len, type, sha1);
+
+   trace_performance_since(start, "odb_helper_put_object");
+
+   return res;
+}
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 40/40] Doc/external-odb: explain transfering objects and metadata

2017-09-16 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 Documentation/technical/external-odb.txt | 105 +++
 1 file changed, 105 insertions(+)

diff --git a/Documentation/technical/external-odb.txt 
b/Documentation/technical/external-odb.txt
index 58ec8a8145..76dd1e2e6c 100644
--- a/Documentation/technical/external-odb.txt
+++ b/Documentation/technical/external-odb.txt
@@ -340,3 +340,108 @@ can that contains:
 *.jpg   odb=magic
 
 
+Transfering objects
+===
+
+When an external odb helper is configured, the objects managed by the
+external odb are not put in the pack file that is sent (when pushing
+or answering clone and fetch requests), so the receiver should also
+have configured an external odb helper that can get the missing
+objects otherwise Git will error out complaining about missing
+objects.
+
+This has some drawbacks of course, but at least it makes sure that
+users' and admins' repositories are both properly configured to use a
+common external ODB before they can talk to each other.
+
+Transfering meta information and restartable clone
+==
+
+There are different ways to make it possible for the external odb
+helpers to know which services they should get the objects from (or
+put them into), for example the information could be hardcoded into
+the helpers, or the information could be computed from configuration
+information like the url of the "origin" remote.
+
+The external odb mechanism itself doesn't really take care of this, so
+helpers are free to do whatever they want.
+
+One interesting possibility though is to have this information as part
+of the repository in special refs, for example refs/odb/magic/*, where
+"magic" is the external odb name.
+
+This would especially make it possible to implement a restartable
+clone using Git bundles (and an external odb helper) like this:
+
+   1) At the very start of the clone, Git would fetch the refs
+   that contain "meta information", for example refs/odb/magic/*
+   (where "magic" is the odb name). These refs would point to
+   some blobs that contain lists of the bundles that are
+   available for fetching by the helper, along with enough
+   information for the helper to fetch them (for example HTTP
+   urls of the bundles).
+
+   2) After this first fetch of the refs/odb/magic/* refs, the
+   helper would be sent the 'init' instruction. At that time it
+   can read all the blobs pointed to by these refs and download
+   the bundles listed in the blobs.
+
+   If something goes wrong when the helper "fetches" a bundle,
+   the helper could force the clone to error out (after maybe
+   retrying), and when the user (or the helper itself) tries
+   again to clone, the helper would restart its bundle "fetch"
+   (using the restartable protocol, for example HTTP).
+
+   When this "fetch" eventually succeeds, then the helper will
+   unbundle what it received, and then give back control to the
+   second regular part of the clone.
+
+   3) This regular part of the clone will then try to fetch the
+   usual refs, but as the unbundling has already updated the
+   content of the usual refs as well as the object stores this
+   fetch will find that everything is up-to-date.
+
+   Or if everything is not quite up-to-date and there are still
+   things to fetch, another hopefully much small regular fetch
+   will happen.
+
+As this is an interesting use of the external odb mechanism, the
+`--initial-refspec` option has been implemented in `git clone`. This
+makes it possible to perform all the above steps using a single clone
+command like:
+
+
+$ git clone -c odb.magic.scriptCommand="$HELPER" \
+  --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" "$URL"
+
+
+But note that the above could also be performed using:
+
+
+$ git init
+$ git remote add origin "$URL"
+$ git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*"
+$ git config odb.magic.scriptCommand "$HELPER"
+$ git fetch origin
+
+
+So the `--initial-refspec` option can be seen as just a shortcut to
+simplify external odb helped clones for users.
+
+Also note that this `--initial-refspec` approach could be slower than
+a regular clone, so it is mostly interesting if one wants to fetch a
+big number of objects or many big objects, like for an initial clone
+of a big repo. In this use case a relatively small amount of time
+spent in the initial fetch is an acceptable trade-off if the clone is
+restartable.
+
+Though in some cases, as the `--initial-refspec` clone could alleviate
+resource usage of the Git server, it could be even faster than a
+regular clone.
+
+So admins and users should not blindly use the `--initial-refspec`
+option all the time when an external 

[PATCH v6 26/40] odb-helper: add init_object_process()

2017-09-16 Thread Christian Couder
From: Ben Peart 

This adds the infrastructure to launch and use long running
sub-processes as external odb helpers.

For now only the 'init' and 'get_direct' capabilities are
supported with sub-processes.

Signed-off-by: Christian Couder 
---
 external-odb.c |  52 ---
 odb-helper.c   | 469 +++--
 sha1_file.c|  56 +--
 3 files changed, 523 insertions(+), 54 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index ccca67eff5..084cd32e0b 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -67,32 +67,11 @@ const char *external_odb_root(void)
return root;
 }
 
-int external_odb_has_object(const unsigned char *sha1)
-{
-   struct odb_helper *o;
-
-   if (!use_external_odb)
-   return 0;
-
-   external_odb_init();
-
-   for (o = helpers; o; o = o->next) {
-   if (!(o->supported_capabilities & ODB_HELPER_CAP_HAVE))
-   return 1;
-   if (odb_helper_has_object(o, sha1))
-   return 1;
-   }
-   return 0;
-}
-
-int external_odb_get_object(const unsigned char *sha1)
+static int external_odb_do_get_object(const unsigned char *sha1)
 {
struct odb_helper *o;
const char *path;
 
-   if (!external_odb_has_object(sha1))
-   return -1;
-
path = sha1_file_name_alt(external_odb_root(), sha1);
safe_create_leading_directories_const(path);
prepare_external_alt_odb();
@@ -147,6 +126,35 @@ int external_odb_get_direct(const unsigned char *sha1)
return -1;
 }
 
+int external_odb_has_object(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   if (!use_external_odb)
+   return 0;
+
+   external_odb_init();
+
+   for (o = helpers; o; o = o->next) {
+   if (!(o->supported_capabilities & ODB_HELPER_CAP_HAVE)) {
+   if (o->supported_capabilities & 
ODB_HELPER_CAP_GET_DIRECT)
+   return 1;
+   return !external_odb_do_get_object(sha1);
+   }
+   if (odb_helper_has_object(o, sha1))
+   return 1;
+   }
+   return 0;
+}
+
+int external_odb_get_object(const unsigned char *sha1)
+{
+   if (!external_odb_has_object(sha1))
+   return -1;
+
+   return external_odb_do_get_object(sha1);
+}
+
 int external_odb_put_object(const void *buf, size_t len,
const char *type, unsigned char *sha1)
 {
diff --git a/odb-helper.c b/odb-helper.c
index 22728200e3..3148bcfa15 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -4,6 +4,22 @@
 #include "odb-helper.h"
 #include "run-command.h"
 #include "sha1-lookup.h"
+#include "sub-process.h"
+#include "pkt-line.h"
+#include "sigchain.h"
+
+struct object_process {
+   struct subprocess_entry subprocess;
+   unsigned int supported_capabilities;
+};
+
+static struct hashmap subprocess_map;
+
+static int check_object_process_status(int fd, struct strbuf *status)
+{
+   subprocess_read_status(fd, status);
+   return strcmp(status->buf, "success");
+}
 
 static void parse_capabilities(char *cap_buf,
   unsigned int *supported_capabilities,
@@ -39,6 +55,384 @@ static void parse_capabilities(char *cap_buf,
string_list_clear(_list, 0);
 }
 
+static int start_object_process_fn(struct subprocess_entry *subprocess)
+{
+   static int versions[] = {1, 0};
+   static struct subprocess_capability capabilities[] = {
+   { "get_git_obj", ODB_HELPER_CAP_GET_GIT_OBJ },
+   { "get_raw_obj", ODB_HELPER_CAP_GET_RAW_OBJ },
+   { "get_direct",  ODB_HELPER_CAP_GET_DIRECT  },
+   { "put_git_obj", ODB_HELPER_CAP_PUT_GIT_OBJ },
+   { "put_raw_obj", ODB_HELPER_CAP_PUT_RAW_OBJ },
+   { "put_direct",  ODB_HELPER_CAP_PUT_DIRECT  },
+   { "have",ODB_HELPER_CAP_HAVE },
+   { NULL, 0 }
+   };
+   struct object_process *entry = (struct object_process *)subprocess;
+   return subprocess_handshake(subprocess, "git-read-object", versions, 
NULL,
+   capabilities,
+   >supported_capabilities);
+}
+
+static struct object_process *launch_object_process(struct odb_helper *o,
+   unsigned int capability)
+{
+   struct object_process *entry = NULL;
+
+   if (!subprocess_map.tablesize)
+   hashmap_init(_map, (hashmap_cmp_fn) cmd2process_cmp, 
NULL, 0);
+   else
+   entry = (struct object_process 
*)subprocess_find_entry(_map, o->cmd);
+
+   fflush(NULL);
+
+   if (!entry) {
+   entry = xmalloc(sizeof(*entry));
+   entry->supported_capabilities = 0;
+
+   if (subprocess_start(_map, >subprocess, 
o->cmd, 

[PATCH v6 30/40] Add t0470 to test passing raw objects

2017-09-16 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t0470-read-object-http-e-odb.sh | 109 ++
 t/t0470/read-object-plain |  83 +
 2 files changed, 192 insertions(+)
 create mode 100755 t/t0470-read-object-http-e-odb.sh
 create mode 100755 t/t0470/read-object-plain

diff --git a/t/t0470-read-object-http-e-odb.sh 
b/t/t0470-read-object-http-e-odb.sh
new file mode 100755
index 00..774528c04f
--- /dev/null
+++ b/t/t0470-read-object-http-e-odb.sh
@@ -0,0 +1,109 @@
+#!/bin/sh
+
+test_description='tests for read-object process passing plain objects to an 
HTTPD server'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+PATH="$PATH:$TEST_DIRECTORY/t0470"
+
+# odb helper script must see this
+export HTTPD_URL
+
+HELPER="read-object-plain"
+
+test_expect_success 'setup repo with a root commit' '
+   test_commit zero
+'
+
+test_expect_success 'setup another repo from the first one' '
+   git init other-repo &&
+   (cd other-repo &&
+git remote add origin .. &&
+git pull origin master &&
+git checkout master &&
+git log)
+'
+
+test_expect_success 'setup the helper in the root repo' '
+   git config odb.magic.subprocessCommand "$HELPER"
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME=123=blob"
+
+test_expect_success 'can upload a file' '
+   echo "Hello Apache World!" >hello_to_send.txt &&
+   echo "How are you?" >>hello_to_send.txt &&
+   curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" 
>out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+   curl --include "$LIST_URL" >out_list &&
+   grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+   curl --data "delete" --include "$UPLOAD_URL=1" >out_delete &&
+   curl --include "$LIST_URL" >out_list2 &&
+   ! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transfered to the http server' '
+   test_commit one &&
+   hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+   echo "$hash1-4-blob" >expected &&
+   ls "$FILES_DIR" >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+   git cat-file blob "$hash1" &&
+   git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+   (cd other-repo &&
+git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+test_must_fail git cat-file blob "$hash1" &&
+git config odb.magic.subprocesscommand "$HELPER" &&
+git cat-file blob "$hash1" &&
+git pull origin master)
+'
+
+test_expect_success 'local clone from the first repo' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone .. . &&
+git cat-file blob "$hash1")
+'
+
+test_expect_success 'no-local clone from the first repo fails' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+test_must_fail git clone --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+test_expect_success 'no-local clone from the first repo with helper succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git clone -c odb.magic.subprocessCommand="$HELPER" --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+stop_httpd
+
+test_done
diff --git a/t/t0470/read-object-plain b/t/t0470/read-object-plain
new file mode 100755
index 00..918e7b00b5
--- /dev/null
+++ b/t/t0470/read-object-plain
@@ -0,0 +1,83 @@
+#!/usr/bin/perl
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+use LWP::UserAgent;
+use HTTP::Request::Common;
+
+packet_initialize("git-read-object", 1);
+
+packet_read_and_check_capabilities("get_raw_obj", "put_raw_obj");
+packet_write_capabilities("get_raw_obj", "put_raw_obj");
+
+my $http_url = $ENV{HTTPD_URL};
+
+while (1) {
+   my ($res, $command) = packet_txt_read();
+
+   if ( $res == -1 ) {
+   exit 0;
+   }
+
+   $command =~ s/^command=//;
+
+   if ( $command eq "init" ) {
+   packet_bin_read();
+
+   packet_txt_write("status=success");
+   packet_flush();
+   } elsif ( $command eq "get_raw_obj" ) {
+   my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+   packet_bin_read();
+
+   my $get_url = $http_url . "/list/?sha1=" . $sha1;
+
+   my $userAgent = LWP::UserAgent->new();
+
+   my $response = $userAgent->get( $get_url );
+
+   if 

[PATCH v6 31/40] odb-helper: add have_object_process()

2017-09-16 Thread Christian Couder
This adds the infrastructure to handle 'have' instructions in
process mode.

The answer from the helper sub-process should be like the
output in script mode, that is lines like this:

sha1 SPACE size SPACE type NEWLINE

Signed-off-by: Christian Couder 
---
 odb-helper.c | 73 
 1 file changed, 73 insertions(+)

diff --git a/odb-helper.c b/odb-helper.c
index 356f6172d8..7433a4bfc6 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -634,6 +634,71 @@ static int odb_helper_object_cmp(const void *va, const 
void *vb)
return hashcmp(a->sha1, b->sha1);
 }
 
+static int send_have_packets(struct odb_helper *o,
+struct object_process *entry,
+struct strbuf *status)
+{
+   char *line;
+   int packet_len;
+   int total_got = 0;
+   struct child_process *process = >subprocess.process;
+   int err = packet_write_fmt_gently(process->in, "command=have\n");
+
+   if (err)
+   return err;
+
+   err = packet_flush_gently(process->in);
+   if (err)
+   return err;
+
+   for (;;) {
+   /* packet_read() writes a '\0' extra byte at the end */
+   char buf[LARGE_PACKET_DATA_MAX + 1];
+   char *p = buf;
+   int more;
+
+   packet_len = packet_read(process->out, NULL, NULL,
+   buf, LARGE_PACKET_DATA_MAX + 1,
+   PACKET_READ_GENTLE_ON_EOF);
+
+   if (packet_len <= 0)
+   break;
+
+   total_got += packet_len;
+
+   /* 'have' packets should end with '\n' or '\0' */
+   do {
+   char *eol = strchrnul(p, '\n');
+   more = (*eol == '\n');
+   *eol = '\0';
+   if (add_have_entry(o, p))
+   break;
+   p = eol + 1;
+   } while (more && *p);
+   }
+
+   if (packet_len < 0)
+   return packet_len;
+
+   return check_object_process_status(process->out, status);
+}
+
+static int have_object_process(struct odb_helper *o)
+{
+   int err;
+   struct object_process *entry;
+   struct strbuf status = STRBUF_INIT;
+
+   entry = launch_object_process(o, ODB_HELPER_CAP_HAVE);
+   if (!entry)
+   return -1;
+
+   err = send_have_packets(o, entry, );
+
+   return check_object_process_error(err, status.buf, entry, o->cmd,
+ ODB_HELPER_CAP_HAVE);
+}
+
 static void have_object_script(struct odb_helper *o)
 {
struct odb_helper_cmd cmd;
@@ -655,12 +720,20 @@ static void have_object_script(struct odb_helper *o)
 
 static void odb_helper_load_have(struct odb_helper *o)
 {
+   uint64_t start;
+
if (o->have_valid)
return;
o->have_valid = 1;
 
+   start = getnanotime();
+
if (o->script_mode)
have_object_script(o);
+   else
+   have_object_process(o);
+
+   trace_performance_since(start, "odb_helper_load_have");
 
qsort(o->have, o->have_nr, sizeof(*o->have), odb_helper_object_cmp);
 }
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 22/40] pack-objects: don't pack objects in external odbs

2017-09-16 Thread Christian Couder
Objects managed by an external ODB should not be put into
pack files. They should be transfered using other mechanism
that can be specific to the external odb.

Signed-off-by: Christian Couder 
---
 builtin/pack-objects.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index a57b4f058d..db5e225d5a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -26,6 +26,7 @@
 #include "argv-array.h"
 #include "mru.h"
 #include "packfile.h"
+#include "external-odb.h"
 
 static const char *pack_usage[] = {
N_("git pack-objects --stdout [...] [<  | < 
]"),
@@ -1012,6 +1013,9 @@ static int want_object_in_pack(const unsigned char *sha1,
return want;
}
 
+   if (external_odb_has_object(sha1))
+   return 0;
+
for (entry = packed_git_mru->head; entry; entry = entry->next) {
struct packed_git *p = entry->item;
off_t offset;
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 37/40] clone: disable external odb before initial clone

2017-09-16 Thread Christian Couder
To make it possible to have the external odb mechanism only kick in
after the initial part of a clone, we should disable it during the
initial part of the clone.

Let's do that by saving and then restoring the value of the
'use_external_odb' global variable.

Signed-off-by: Christian Couder 
---
 builtin/clone.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index 57cecd194c..323b73016e 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -938,6 +938,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
 
struct refspec *initial_refspecs;
int is_initial;
+   int saved_use_external_odb;
 
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
@@ -1083,6 +1084,10 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
 
git_config(git_default_config, NULL);
 
+   /* Temporarily disable external ODB before initial clone */
+   saved_use_external_odb = use_external_odb;
+   use_external_odb = 0;
+
if (option_bare) {
if (option_mirror)
src_ref_prefix = "refs/";
@@ -1166,6 +1171,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
fetch_initial_refs(transport, refs, initial_refspecs,
   branch_top.buf, reflog_msg.buf, is_local);
 
+   use_external_odb = saved_use_external_odb;
+
mapped_refs = wanted_peer_refs(refs, refspec);
/*
 * transport_get_remote_refs() may return refs with null sha-1
@@ -1207,6 +1214,9 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
option_branch, option_origin);
 
warning(_("You appear to have cloned an empty repository."));
+
+   use_external_odb = saved_use_external_odb;
+
mapped_refs = NULL;
our_head_points_at = NULL;
remote_head_points_at = NULL;
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 32/40] Add t0480 to test "have" capability and raw objects

2017-09-16 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t0480-read-object-have-http-e-odb.sh | 109 +
 t/t0480/read-object-plain-have | 103 +++
 2 files changed, 212 insertions(+)
 create mode 100755 t/t0480-read-object-have-http-e-odb.sh
 create mode 100755 t/t0480/read-object-plain-have

diff --git a/t/t0480-read-object-have-http-e-odb.sh 
b/t/t0480-read-object-have-http-e-odb.sh
new file mode 100755
index 00..056a40f2bb
--- /dev/null
+++ b/t/t0480-read-object-have-http-e-odb.sh
@@ -0,0 +1,109 @@
+#!/bin/sh
+
+test_description='tests for read-object process with "have" cap and plain 
objects'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+PATH="$PATH:$TEST_DIRECTORY/t0480"
+
+# odb helper script must see this
+export HTTPD_URL
+
+HELPER="read-object-plain-have"
+
+test_expect_success 'setup repo with a root commit' '
+   test_commit zero
+'
+
+test_expect_success 'setup another repo from the first one' '
+   git init other-repo &&
+   (cd other-repo &&
+git remote add origin .. &&
+git pull origin master &&
+git checkout master &&
+git log)
+'
+
+test_expect_success 'setup the helper in the root repo' '
+   git config odb.magic.subprocessCommand "$HELPER"
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME=123=blob"
+
+test_expect_success 'can upload a file' '
+   echo "Hello Apache World!" >hello_to_send.txt &&
+   echo "How are you?" >>hello_to_send.txt &&
+   curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" 
>out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+   curl --include "$LIST_URL" >out_list &&
+   grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+   curl --data "delete" --include "$UPLOAD_URL=1" >out_delete &&
+   curl --include "$LIST_URL" >out_list2 &&
+   ! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transfered to the http server' '
+   test_commit one &&
+   hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+   echo "$hash1-4-blob" >expected &&
+   ls "$FILES_DIR" >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+   git cat-file blob "$hash1" &&
+   git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+   (cd other-repo &&
+git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+test_must_fail git cat-file blob "$hash1" &&
+git config odb.magic.subprocessCommand "$HELPER" &&
+git cat-file blob "$hash1" &&
+git pull origin master)
+'
+
+test_expect_success 'local clone from the first repo' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone .. . &&
+git cat-file blob "$hash1")
+'
+
+test_expect_success 'no-local clone from the first repo fails' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+test_must_fail git clone --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+test_expect_success 'no-local clone from the first repo with helper succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git clone -c odb.magic.subprocessCommand="$HELPER" --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+stop_httpd
+
+test_done
diff --git a/t/t0480/read-object-plain-have b/t/t0480/read-object-plain-have
new file mode 100755
index 00..d63e327f33
--- /dev/null
+++ b/t/t0480/read-object-plain-have
@@ -0,0 +1,103 @@
+#!/usr/bin/perl
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+use LWP::UserAgent;
+use HTTP::Request::Common;
+
+packet_initialize("git-read-object", 1);
+
+packet_read_and_check_capabilities("get_raw_obj", "put_raw_obj", "have");
+packet_write_capabilities("get_raw_obj", "put_raw_obj", "have");
+
+my $http_url = $ENV{HTTPD_URL};
+
+while (1) {
+   my ($res, $command) = packet_txt_read();
+
+   if ( $res == -1 ) {
+   exit 0;
+   }
+
+   $command =~ s/^command=//;
+
+   if ( $command eq "init" ) {
+   packet_bin_read();
+
+   packet_txt_write("status=success");
+   packet_flush();
+   } elsif ( $command eq "have" ) {
+   # read the flush after the command
+   packet_bin_read();
+
+   my $have_url = $http_url . "/list/";
+
+   my $userAgent = LWP::UserAgent->new();
+   my $response = $userAgent->get( $have_url );
+
+  

[PATCH v6 33/40] external-odb: use 'odb=magic' attribute to mark odb blobs

2017-09-16 Thread Christian Couder
To tell which blobs should be sent to the "magic" external odb,
let's require that the blobs be marked using the 'odb=magic'
attribute.

Signed-off-by: Christian Couder 
---
 external-odb.c | 22 --
 external-odb.h |  3 ++-
 sha1_file.c| 20 +++-
 t/t0400-external-odb.sh|  3 +++
 t/t0410-transfer-e-odb.sh  |  3 +++
 t/t0420-transfer-http-e-odb.sh |  3 +++
 t/t0470-read-object-http-e-odb.sh  |  3 +++
 t/t0480-read-object-have-http-e-odb.sh |  3 +++
 8 files changed, 52 insertions(+), 8 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 084cd32e0b..e103514a46 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "external-odb.h"
 #include "odb-helper.h"
+#include "attr.h"
 
 static struct odb_helper *helpers;
 static struct odb_helper **helpers_tail = 
@@ -155,8 +156,23 @@ int external_odb_get_object(const unsigned char *sha1)
return external_odb_do_get_object(sha1);
 }
 
+static int has_odb_attrs(struct odb_helper *o, const char *path)
+{
+   static struct attr_check *check;
+
+   if (!check)
+   check = attr_check_initl("odb", NULL);
+
+   if (!git_check_attr(path, check)) {
+   const char *value = check->items[0].value;
+   return value ? !strcmp(o->name, value) : 0;
+   }
+   return 0;
+}
+
 int external_odb_put_object(const void *buf, size_t len,
-   const char *type, unsigned char *sha1)
+   const char *type, unsigned char *sha1,
+   const char *path)
 {
struct odb_helper *o;
 
@@ -164,12 +180,14 @@ int external_odb_put_object(const void *buf, size_t len,
return 1;
 
/* For now accept only blobs */
-   if (strcmp(type, "blob"))
+   if (!path || strcmp(type, "blob"))
return 1;
 
external_odb_init();
 
for (o = helpers; o; o = o->next) {
+   if (!has_odb_attrs(o, path))
+   continue;
int r = odb_helper_put_object(o, buf, len, type, sha1);
if (r <= 0)
return r;
diff --git a/external-odb.h b/external-odb.h
index 1fda08c0fb..93a3b35a04 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -6,6 +6,7 @@ extern int external_odb_has_object(const unsigned char *sha1);
 extern int external_odb_get_object(const unsigned char *sha1);
 extern int external_odb_get_direct(const unsigned char *sha1);
 extern int external_odb_put_object(const void *buf, size_t len,
-  const char *type, unsigned char *sha1);
+  const char *type, unsigned char *sha1,
+  const char *path);
 
 #endif /* EXTERNAL_ODB_H */
diff --git a/sha1_file.c b/sha1_file.c
index c5b6d89b97..24fbc28eab 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1631,7 +1631,9 @@ static int freshen_packed_object(const unsigned char 
*sha1)
return 1;
 }
 
-int write_sha1_file(const void *buf, unsigned long len, const char *type, 
unsigned char *sha1)
+static int write_sha1_file_with_path(const void *buf, unsigned long len,
+const char *type, unsigned char *sha1,
+const char *path)
 {
char hdr[32];
int hdrlen = sizeof(hdr);
@@ -1640,13 +1642,19 @@ int write_sha1_file(const void *buf, unsigned long len, 
const char *type, unsign
 * it out into .git/objects/??/?{38} file.
 */
write_sha1_file_prepare(buf, len, type, sha1, hdr, );
-   if (!external_odb_put_object(buf, len, type, sha1))
+   if (!external_odb_put_object(buf, len, type, sha1, path))
return 0;
if (freshen_packed_object(sha1) || freshen_loose_object(sha1))
return 0;
return write_loose_object(sha1, hdr, hdrlen, buf, len, 0);
 }
 
+int write_sha1_file(const void *buf, unsigned long len,
+   const char *type, unsigned char *sha1)
+{
+   write_sha1_file_with_path(buf, len, type, sha1, NULL);
+}
+
 int hash_sha1_file_literally(const void *buf, unsigned long len, const char 
*type,
 struct object_id *oid, unsigned flags)
 {
@@ -1767,7 +1775,8 @@ static int index_mem(unsigned char *sha1, void *buf, 
size_t size,
}
 
if (write_object)
-   ret = write_sha1_file(buf, size, typename(type), sha1);
+   ret = write_sha1_file_with_path(buf, size, typename(type),
+   sha1, path);
else
ret = hash_sha1_file(buf, size, typename(type), sha1);
if (re_allocated)
@@ -1789,8 +1798,9 @@ static int index_stream_convert_blob(unsigned char *sha1, 
int fd,
 

[PATCH v6 38/40] Add tests for 'clone --initial-refspec'

2017-09-16 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t0420-transfer-http-e-odb.sh |  7 +
 t/t0470-read-object-http-e-odb.sh  |  7 +
 t/t0480-read-object-have-http-e-odb.sh |  7 +
 t/t5616-clone-initial-refspec.sh   | 48 ++
 4 files changed, 69 insertions(+)
 create mode 100755 t/t5616-clone-initial-refspec.sh

diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
index d307af0457..ed833850c3 100755
--- a/t/t0420-transfer-http-e-odb.sh
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -140,6 +140,13 @@ test_expect_success 'no-local clone from the first repo 
with helper succeeds' '
rm -rf my-other-clone
 '
 
+test_expect_success 'no-local initial-refspec clone succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git -c odb.magic.scriptCommand="$HELPER" \
+   clone --no-local --initial-refspec 
"refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
 stop_httpd
 
 test_done
diff --git a/t/t0470-read-object-http-e-odb.sh 
b/t/t0470-read-object-http-e-odb.sh
index d814a43d59..7355ca4d51 100755
--- a/t/t0470-read-object-http-e-odb.sh
+++ b/t/t0470-read-object-http-e-odb.sh
@@ -107,6 +107,13 @@ test_expect_success 'no-local clone from the first repo 
with helper succeeds' '
rm -rf my-other-clone
 '
 
+test_expect_success 'no-local initial-refspec clone succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git -c odb.magic.subprocessCommand="$HELPER" \
+   clone --no-local --initial-refspec 
"refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
 stop_httpd
 
 test_done
diff --git a/t/t0480-read-object-have-http-e-odb.sh 
b/t/t0480-read-object-have-http-e-odb.sh
index fe1fac5ef3..c451d269a7 100755
--- a/t/t0480-read-object-have-http-e-odb.sh
+++ b/t/t0480-read-object-have-http-e-odb.sh
@@ -107,6 +107,13 @@ test_expect_success 'no-local clone from the first repo 
with helper succeeds' '
rm -rf my-other-clone
 '
 
+test_expect_success 'no-local initial-refspec clone succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git -c odb.magic.subprocessCommand="$HELPER" \
+   clone --no-local --initial-refspec 
"refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
 stop_httpd
 
 test_done
diff --git a/t/t5616-clone-initial-refspec.sh b/t/t5616-clone-initial-refspec.sh
new file mode 100755
index 00..ccbc27f83f
--- /dev/null
+++ b/t/t5616-clone-initial-refspec.sh
@@ -0,0 +1,48 @@
+#!/bin/sh
+
+test_description='test clone with --initial-refspec option'
+. ./test-lib.sh
+
+
+test_expect_success 'setup regular repo' '
+   # Make two branches, "master" and "side"
+   echo one >file &&
+   git add file &&
+   git commit -m one &&
+   echo two >file &&
+   git commit -a -m two &&
+   git tag two &&
+   echo three >file &&
+   git commit -a -m three &&
+   git checkout -b side &&
+   echo four >file &&
+   git commit -a -m four &&
+   git checkout master
+'
+
+test_expect_success 'add a special ref pointing to a blob' '
+   hash=$(echo "Hello world!" | git hash-object -w -t blob --stdin) &&
+   git update-ref refs/special/hello "$hash"
+'
+
+test_expect_success 'no-local clone from the first repo' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone --no-local .. . &&
+test_must_fail git cat-file blob "$hash") &&
+   rm -rf my-clone
+'
+
+test_expect_success 'no-local clone with --initial-refspec' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone --no-local --initial-refspec "refs/special/*:refs/special/*" 
.. . &&
+git cat-file blob "$hash" &&
+git rev-parse refs/special/hello >actual &&
+echo "$hash" >expected &&
+test_cmp expected actual) &&
+   rm -rf my-clone
+'
+
+test_done
+
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 01/40] builtin/clone: get rid of 'value' strbuf

2017-09-16 Thread Christian Couder
This makes the code simpler by removing a few lines, and getting
rid of one variable.

Signed-off-by: Christian Couder 
---
 builtin/clone.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 8d11b570a1..dcd5b878f1 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -875,7 +875,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
const struct ref *our_head_points_at;
struct ref *mapped_refs;
const struct ref *ref;
-   struct strbuf key = STRBUF_INIT, value = STRBUF_INIT;
+   struct strbuf key = STRBUF_INIT;
struct strbuf branch_top = STRBUF_INIT, reflog_msg = STRBUF_INIT;
struct transport *transport = NULL;
const char *src_ref_prefix = "refs/heads/";
@@ -1040,7 +1040,6 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
strbuf_addf(_top, "refs/remotes/%s/", option_origin);
}
 
-   strbuf_addf(, "+%s*:%s*", src_ref_prefix, branch_top.buf);
strbuf_addf(, "remote.%s.url", option_origin);
git_config_set(key.buf, repo);
strbuf_reset();
@@ -1054,10 +1053,9 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (option_required_reference.nr || option_optional_reference.nr)
setup_reference();
 
-   fetch_pattern = value.buf;
+   fetch_pattern = xstrfmt("+%s*:%s*", src_ref_prefix, branch_top.buf);
refspec = parse_fetch_refspec(1, _pattern);
-
-   strbuf_reset();
+   free((char *)fetch_pattern);
 
remote = remote_get(option_origin);
transport = transport_get(remote, remote->url[0]);
@@ -1196,7 +1194,6 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
strbuf_release(_msg);
strbuf_release(_top);
strbuf_release();
-   strbuf_release();
junk_mode = JUNK_LEAVE_ALL;
 
free(refspec);
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 05/40] t0021/rot13-filter: add packet_initialize()

2017-09-16 Thread Christian Couder
Let's refactor the code to initialize communication into its own
packet_initialize() function, so that we can reuse this
functionality in following patches.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 3b3da8a03d..278fc6f534 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -104,16 +104,22 @@ sub packet_flush {
STDOUT->flush();
 }
 
+sub packet_initialize {
+   my ($name, $version) = @_;
+
+   ( packet_txt_read() eq ( 0, $name . "-client" ) )   || die "bad 
initialize";
+   ( packet_txt_read() eq ( 0, "version=" . $version ) )   || die "bad 
version";
+   ( packet_bin_read() eq ( 1, "" ) )  || die "bad 
version end";
+
+   packet_txt_write( $name . "-server" );
+   packet_txt_write( "version=" . $version );
+   packet_flush();
+}
+
 print $debug "START\n";
 $debug->flush();
 
-( packet_txt_read() eq ( 0, "git-filter-client" ) ) || die "bad initialize";
-( packet_txt_read() eq ( 0, "version=2" ) ) || die "bad version";
-( packet_bin_read() eq ( 1, "" ) )  || die "bad version end";
-
-packet_txt_write("git-filter-server");
-packet_txt_write("version=2");
-packet_flush();
+packet_initialize("git-filter", 2);
 
 ( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
 ( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 14/40] t0400: add test for external odb write support

2017-09-16 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t0400-external-odb.sh | 8 
 1 file changed, 8 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index f9e6ea1015..03df030461 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -71,4 +71,12 @@ test_expect_success 'helper can add objects to alt repo' '
test "$size" -eq "$alt_size"
 '
 
+test_expect_success 'commit adds objects to alt repo' '
+   test_config odb.magic.scriptCommand "$HELPER" &&
+   test_commit three &&
+   hash3=$(git ls-tree HEAD | grep three.t | cut -f1 | cut -d\  -f3) &&
+   content=$(cd alt-repo && git show "$hash3") &&
+   test "$content" = "three"
+'
+
 test_done
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 11/40] t0400: add 'put_raw_obj' instruction to odb-helper script

2017-09-16 Thread Christian Couder
To properly test passing objects from Git to an external odb
we need an odb-helper script that supports a 'put'
capability/instruction.

For now we will support only sending raw blobs, so the
supported capability/instruction will be 'put_raw_obj'.

While at it let's add a test to check that our odb-helper
script works well.

Signed-off-by: Christian Couder 
---
 t/t0400-external-odb.sh | 24 
 1 file changed, 24 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index ed89f3ab40..f9e6ea1015 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -7,10 +7,15 @@ test_description='basic tests for external object databases'
 ALT_SOURCE="$PWD/alt-repo/.git"
 export ALT_SOURCE
 write_script odb-helper <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
 GIT_DIR=$ALT_SOURCE; export GIT_DIR
 case "$1" in
 init)
echo "capability=get_git_obj"
+   echo "capability=put_raw_obj"
echo "capability=have"
;;
 have)
@@ -20,6 +25,16 @@ have)
 get_git_obj)
cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
;;
+put_raw_obj)
+   sha1="$2"
+   size="$3"
+   kind="$4"
+   written=$(git hash-object -w -t "$kind" --stdin)
+   test "$written" = "$sha1" || die "bad sha1 passed '$sha1' vs written 
'$written'"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
 esac
 EOF
 HELPER="\"$PWD\"/odb-helper"
@@ -47,4 +62,13 @@ test_expect_success 'helper can retrieve alt objects' '
test_cmp expect actual
 '
 
+test_expect_success 'helper can add objects to alt repo' '
+   hash=$(echo "Hello odb!" | git hash-object -w -t blob --stdin) &&
+   test -f .git/objects/$(echo $hash | sed "s#..#&/#") &&
+   size=$(git cat-file -s "$hash") &&
+   git cat-file blob "$hash" | ./odb-helper put_raw_obj "$hash" "$size" 
blob &&
+   alt_size=$(cd alt-repo && git cat-file -s "$hash") &&
+   test "$size" -eq "$alt_size"
+'
+
 test_done
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 16/40] Add t0410 to test external ODB transfer

2017-09-16 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t0410-transfer-e-odb.sh | 144 ++
 1 file changed, 144 insertions(+)
 create mode 100755 t/t0410-transfer-e-odb.sh

diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh
new file mode 100755
index 00..065ec7d759
--- /dev/null
+++ b/t/t0410-transfer-e-odb.sh
@@ -0,0 +1,144 @@
+#!/bin/sh
+
+test_description='basic tests for transfering external ODBs'
+
+. ./test-lib.sh
+
+ORIG_SOURCE="$PWD/.git"
+export ORIG_SOURCE
+
+ALT_SOURCE1="$PWD/alt-repo1/.git"
+export ALT_SOURCE1
+write_script odb-helper1 <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
+GIT_DIR=$ALT_SOURCE1; export GIT_DIR
+case "$1" in
+init)
+   echo "capability=get_git_obj"
+   echo "capability=have"
+   ;;
+have)
+   git cat-file --batch-check --batch-all-objects |
+   awk '{print $1 " " $3 " " $2}'
+   ;;
+get_git_obj)
+   cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+   ;;
+put_raw_obj)
+   sha1="$2"
+   size="$3"
+   kind="$4"
+   writen=$(git hash-object -w -t "$kind" --stdin)
+   test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen 
'$writen'"
+   ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$ORIG_SOURCE 
GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+   GIT_DIR=$ORIG_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
+esac
+EOF
+HELPER1="\"$PWD\"/odb-helper1"
+
+OTHER_SOURCE="$PWD/.git"
+export OTHER_SOURCE
+
+ALT_SOURCE2="$PWD/alt-repo2/.git"
+export ALT_SOURCE2
+write_script odb-helper2 <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
+GIT_DIR=$ALT_SOURCE2; export GIT_DIR
+case "$1" in
+init)
+   echo "capability=get_git_obj"
+   echo "capability=have"
+   ;;
+have)
+   GIT_DIR=$OTHER_SOURCE git for-each-ref --format='%(objectname)' 
refs/odbs/magic/ | GIT_DIR=$OTHER_SOURCE xargs git show
+   ;;
+get_git_obj)
+   OBJ_FILE="$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+   if ! test -f "$OBJ_FILE"
+   then
+   # "Download" the missing object by copying it from alt-repo1
+   OBJ_DIR=$(echo $2 | sed 's/\(..\).*/\1/')
+   OBJ_BASE=$(basename "$OBJ_FILE")
+   ALT_OBJ_DIR1="$ALT_SOURCE1/objects/$OBJ_DIR"
+   ALT_OBJ_DIR2="$ALT_SOURCE2/objects/$OBJ_DIR"
+   mkdir -p "$ALT_OBJ_DIR2" || die "Could not mkdir 
'$ALT_OBJ_DIR2'"
+   OBJ_SRC="$ALT_OBJ_DIR1/$OBJ_BASE"
+   cp "$OBJ_SRC" "$ALT_OBJ_DIR2" ||
+   die "Could not cp '$OBJ_SRC' into '$ALT_OBJ_DIR2'"
+   fi
+   cat "$OBJ_FILE" || die "Could not cat '$OBJ_FILE'"
+   ;;
+put_raw_obj)
+   sha1="$2"
+   size="$3"
+   kind="$4"
+   writen=$(git hash-object -w -t "$kind" --stdin)
+   test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen 
'$writen'"
+   ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$OTHER_SOURCE 
GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+   GIT_DIR=$OTHER_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
+esac
+EOF
+HELPER2="\"$PWD\"/odb-helper2"
+
+test_expect_success 'setup first alternate repo' '
+   git init alt-repo1 &&
+   test_commit zero &&
+   git config odb.magic.scriptCommand "$HELPER1"
+'
+
+test_expect_success 'setup other repo and its alternate repo' '
+   git init other-repo &&
+   git init alt-repo2 &&
+   (cd other-repo &&
+git remote add origin .. &&
+git pull origin master &&
+git checkout master &&
+git log)
+'
+
+test_expect_success 'new blobs are put in first object store' '
+   test_commit one &&
+   hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+   content=$(cd alt-repo1 && git show "$hash1") &&
+   test "$content" = "one" &&
+   test_commit two &&
+   hash2=$(git ls-tree HEAD | grep two.t | cut -f1 | cut -d\  -f3) &&
+   content=$(cd alt-repo1 && git show "$hash2") &&
+   test "$content" = "two"
+'
+
+test_expect_success 'other repo gets the blobs from object store' '
+   (cd other-repo &&
+git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+test_must_fail git cat-file blob "$hash1" &&
+test_must_fail git cat-file blob "$hash2" &&
+git config odb.magic.scriptCommand "$HELPER2" &&
+git cat-file blob "$hash1" &&
+git cat-file blob "$hash2"
+   )
+'
+
+test_expect_success 'other repo gets everything else' '
+   (cd other-repo &&
+git fetch origin &&
+content=$(git show "$hash1") &&
+test "$content" = "one" &&
+content=$(git show "$hash2") &&
+test "$content" = "two")
+'
+
+test_done
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 10/40] odb-helper: add odb_helper_init() to send 'init' instruction

2017-09-16 Thread Christian Couder
Let's add an odb_helper_init() function to send an 'init'
instruction to the helpers. This 'init' instruction is
especially useful to get the capabilities that are supported
by the helpers.

So while at it, let's also add a parse_capabilities()
function to parse them and a supported_capabilities
variable in struct odb_helper to store them.

Signed-off-by: Christian Couder 
---
 external-odb.c  |  9 -
 odb-helper.c| 54 +
 odb-helper.h| 12 +++
 t/t0400-external-odb.sh |  4 
 4 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/external-odb.c b/external-odb.c
index e9c3f11666..0f0de170b8 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -41,12 +41,16 @@ static int external_odb_config(const char *var, const char 
*value, void *data)
 static void external_odb_init(void)
 {
static int initialized;
+   struct odb_helper *o;
 
if (initialized)
return;
initialized = 1;
 
git_config(external_odb_config, NULL);
+
+   for (o = helpers; o; o = o->next)
+   odb_helper_init(o);
 }
 
 const char *external_odb_root(void)
@@ -63,9 +67,12 @@ int external_odb_has_object(const unsigned char *sha1)
 
external_odb_init();
 
-   for (o = helpers; o; o = o->next)
+   for (o = helpers; o; o = o->next) {
+   if (!(o->supported_capabilities & ODB_HELPER_CAP_HAVE))
+   return 1;
if (odb_helper_has_object(o, sha1))
return 1;
+   }
return 0;
 }
 
diff --git a/odb-helper.c b/odb-helper.c
index 5e91044872..9375eca58f 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -5,6 +5,40 @@
 #include "run-command.h"
 #include "sha1-lookup.h"
 
+static void parse_capabilities(char *cap_buf,
+  unsigned int *supported_capabilities,
+  const char *process_name)
+{
+   struct string_list cap_list = STRING_LIST_INIT_NODUP;
+
+   string_list_split_in_place(_list, cap_buf, '=', 1);
+
+   if (cap_list.nr == 2 && !strcmp(cap_list.items[0].string, 
"capability")) {
+   const char *cap_name = cap_list.items[1].string;
+
+   if (!strcmp(cap_name, "get_git_obj")) {
+   *supported_capabilities |= ODB_HELPER_CAP_GET_GIT_OBJ;
+   } else if (!strcmp(cap_name, "get_raw_obj")) {
+   *supported_capabilities |= ODB_HELPER_CAP_GET_RAW_OBJ;
+   } else if (!strcmp(cap_name, "get_direct")) {
+   *supported_capabilities |= ODB_HELPER_CAP_GET_DIRECT;
+   } else if (!strcmp(cap_name, "put_git_obj")) {
+   *supported_capabilities |= ODB_HELPER_CAP_PUT_GIT_OBJ;
+   } else if (!strcmp(cap_name, "put_raw_obj")) {
+   *supported_capabilities |= ODB_HELPER_CAP_PUT_RAW_OBJ;
+   } else if (!strcmp(cap_name, "put_direct")) {
+   *supported_capabilities |= ODB_HELPER_CAP_PUT_DIRECT;
+   } else if (!strcmp(cap_name, "have")) {
+   *supported_capabilities |= ODB_HELPER_CAP_HAVE;
+   } else {
+   warning("external process '%s' requested unsupported 
read-object capability '%s'",
+   process_name, cap_name);
+   }
+   }
+
+   string_list_clear(_list, 0);
+}
+
 struct odb_helper *odb_helper_new(const char *name, int namelen)
 {
struct odb_helper *o;
@@ -79,6 +113,26 @@ static int odb_helper_finish(struct odb_helper *o,
return 0;
 }
 
+int odb_helper_init(struct odb_helper *o)
+{
+   struct odb_helper_cmd cmd;
+   FILE *fh;
+   struct strbuf line = STRBUF_INIT;
+
+   if (odb_helper_start(o, , "init") < 0)
+   return -1;
+
+   fh = xfdopen(cmd.child.out, "r");
+   while (strbuf_getline(, fh) != EOF)
+   parse_capabilities(line.buf, >supported_capabilities, 
o->name);
+
+   strbuf_release();
+   fclose(fh);
+   odb_helper_finish(o, );
+
+   return 0;
+}
+
 static int parse_object_line(struct odb_helper_object *o, const char *line)
 {
char *end;
diff --git a/odb-helper.h b/odb-helper.h
index fb25ad579e..5f28a6e512 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -1,9 +1,20 @@
 #ifndef ODB_HELPER_H
 #define ODB_HELPER_H
 
+#include "external-odb.h"
+
+#define ODB_HELPER_CAP_GET_GIT_OBJ(1u<<0)
+#define ODB_HELPER_CAP_GET_RAW_OBJ(1u<<1)
+#define ODB_HELPER_CAP_GET_DIRECT (1u<<2)
+#define ODB_HELPER_CAP_PUT_GIT_OBJ(1u<<3)
+#define ODB_HELPER_CAP_PUT_RAW_OBJ(1u<<4)
+#define ODB_HELPER_CAP_PUT_DIRECT (1u<<5)
+#define ODB_HELPER_CAP_HAVE   (1u<<6)
+
 struct odb_helper {
const char *name;
const char *cmd;
+   unsigned int supported_capabilities;
 
struct odb_helper_object {
   

[PATCH v6 07/40] Add Git/Packet.pm from parts of t0021/rot13-filter.pl

2017-09-16 Thread Christian Couder
And while at it let's simplify t0021/rot13-filter.pl by
using Git/Packet.pm.

This will make it possible to reuse packet related
functions in other test scripts.

Signed-off-by: Christian Couder 
---
 perl/Git/Packet.pm  | 118 
 t/t0021/rot13-filter.pl |  94 ++
 2 files changed, 121 insertions(+), 91 deletions(-)
 create mode 100644 perl/Git/Packet.pm

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
new file mode 100644
index 00..b1e67477a0
--- /dev/null
+++ b/perl/Git/Packet.pm
@@ -0,0 +1,118 @@
+package Git::Packet;
+use 5.008;
+use strict;
+use warnings;
+BEGIN {
+   require Exporter;
+   if ($] < 5.008003) {
+   *import = \::import;
+   } else {
+   # Exporter 5.57 which supports this invocation was
+   # released with perl 5.8.3
+   Exporter->import('import');
+   }
+}
+
+our @EXPORT = qw(
+   packet_bin_read
+   packet_txt_read
+   packet_bin_write
+   packet_txt_write
+   packet_flush
+   packet_initialize
+   packet_read_capabilities
+   packet_write_capabilities
+   packet_read_and_check_capabilities
+   );
+our @EXPORT_OK = @EXPORT;
+
+sub packet_bin_read {
+   my $buffer;
+   my $bytes_read = read STDIN, $buffer, 4;
+   if ( $bytes_read == 0 ) {
+   # EOF - Git stopped talking to us!
+   return ( -1, "" );
+   } elsif ( $bytes_read != 4 ) {
+   die "invalid packet: '$buffer'";
+   }
+   my $pkt_size = hex($buffer);
+   if ( $pkt_size == 0 ) {
+   return ( 1, "" );
+   } elsif ( $pkt_size > 4 ) {
+   my $content_size = $pkt_size - 4;
+   $bytes_read = read STDIN, $buffer, $content_size;
+   if ( $bytes_read != $content_size ) {
+   die "invalid packet ($content_size bytes expected; 
$bytes_read bytes read)";
+   }
+   return ( 0, $buffer );
+   } else {
+   die "invalid packet size: $pkt_size";
+   }
+}
+
+sub packet_txt_read {
+   my ( $res, $buf ) = packet_bin_read();
+   unless ( $res == -1 or $buf eq '' or $buf =~ s/\n$// ) {
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
+   }
+   return ( $res, $buf );
+}
+
+sub packet_bin_write {
+   my $buf = shift;
+   print STDOUT sprintf( "%04x", length($buf) + 4 );
+   print STDOUT $buf;
+   STDOUT->flush();
+}
+
+sub packet_txt_write {
+   packet_bin_write( $_[0] . "\n" );
+}
+
+sub packet_flush {
+   print STDOUT sprintf( "%04x", 0 );
+   STDOUT->flush();
+}
+
+sub packet_initialize {
+   my ($name, $version) = @_;
+
+   ( packet_txt_read() eq ( 0, $name . "-client" ) )   || die "bad 
initialize";
+   ( packet_txt_read() eq ( 0, "version=" . $version ) )   || die "bad 
version";
+   ( packet_bin_read() eq ( 1, "" ) )  || die "bad 
version end";
+
+   packet_txt_write( $name . "-server" );
+   packet_txt_write( "version=" . $version );
+   packet_flush();
+}
+
+sub packet_read_capabilities {
+   my @cap;
+   while (1) {
+   my ( $res, $buf ) = packet_bin_read();
+   return ( $res, @cap ) if ( $res != 0 );
+   unless ( $buf =~ s/\n$// ) {
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
+   }
+   die "bad capability buf: '$buf'" unless ( $buf =~ 
s/capability=// );
+   push @cap, $buf;
+   }
+}
+
+sub packet_read_and_check_capabilities {
+   my @local_caps = @_;
+   my @remote_res_caps = packet_read_capabilities();
+   my $res = shift @remote_res_caps;
+   my %remote_caps = map { $_ => 1 } @remote_res_caps;
+   foreach (@local_caps) {
+   die "'$_' capability not available" unless 
(exists($remote_caps{$_}));
+   }
+   return $res;
+}
+
+sub packet_write_capabilities {
+   packet_txt_write( "capability=" . $_ ) foreach (@_);
+   packet_flush();
+}
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index ba18b207c6..2e8ad4d496 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -30,9 +30,12 @@
 # to the "list_available_blobs" response.
 #
 
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
 use strict;
 use warnings;
 use IO::File;
+use Git::Packet;
 
 my $MAX_PACKET_CONTENT_SIZE = 65516;
 my $log_file= shift @ARGV;
@@ -55,97 +58,6 @@ sub rot13 {
return $str;
 }
 
-sub packet_bin_read {
-   my $buffer;
-   my $bytes_read = read STDIN, $buffer, 4;
-   if ( 

[PATCH v6 19/40] lib-httpd: add list.sh

2017-09-16 Thread Christian Couder
This cgi script can list Git objects that have been uploaded as
files to an apache web server. This script can also retrieve
the content of each of these files.

This will help make apache work as an external object database.

Signed-off-by: Christian Couder 
---
 t/lib-httpd.sh  |  1 +
 t/lib-httpd/list.sh | 41 +
 2 files changed, 42 insertions(+)
 create mode 100644 t/lib-httpd/list.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index d80b004549..f31ea261f5 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -133,6 +133,7 @@ prepare_httpd() {
install_script broken-smart-http.sh
install_script error.sh
install_script upload.sh
+   install_script list.sh
 
ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/list.sh b/t/lib-httpd/list.sh
new file mode 100644
index 00..b6d6c29a2f
--- /dev/null
+++ b/t/lib-httpd/list.sh
@@ -0,0 +1,41 @@
+#!/bin/sh
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+   key=${1%%=*}
+   val=${1#*=}
+
+   case "$key" in
+   "sha1") sha1="$val" ;;
+   *) echo >&2 "unknown key '$key'" ;;
+   esac
+
+   shift
+done
+
+if test -d "$FILES_DIR"
+then
+   if test -z "$sha1"
+   then
+   echo 'Status: 200 OK'
+   echo
+   ls "$FILES_DIR" | tr '-' ' '
+   else
+   if test -f "$FILES_DIR/$sha1"-*
+   then
+   echo 'Status: 200 OK'
+   echo
+   cat "$FILES_DIR/$sha1"-*
+   else
+   echo 'Status: 404 Not Found'
+   echo
+   fi
+   fi
+fi
-- 
2.14.1.576.g3f707d88cd



[PATCH v6 04/40] t0021/rot13-filter: improve error message

2017-09-16 Thread Christian Couder
If there is no new line at the end of something it receives,
the packet_txt_read() function die()s, but it's difficult to
debug without much context.

Let's give a bit more information when that happens.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 82882392ae..3b3da8a03d 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -82,7 +82,8 @@ sub packet_bin_read {
 sub packet_txt_read {
my ( $res, $buf ) = packet_bin_read();
unless ( $res == -1 or $buf eq '' or $buf =~ s/\n$// ) {
-   die "A non-binary line MUST be terminated by an LF.";
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
}
return ( $res, $buf );
 }
-- 
2.14.1.576.g3f707d88cd



Re: [PATCH] describe: fix matching to actually match all patterns

2017-09-16 Thread Jacob Keller
On Fri, Sep 15, 2017 at 10:53 PM, Max Kirillov  wrote:
> `git describe --match` with multiple patterns matches only first pattern.
> If it fails, next patterns are not tried.
>
> Fix it, add test cases and update existing test which has wrong
> expectation.
>
> Signed-off-by: Max Kirillov 
> ---
>  builtin/describe.c  | 9 ++---
>  t/t6120-describe.sh | 6 +-
>  2 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/describe.c b/builtin/describe.c
> index 89ea1cdd60..94ff2fba0b 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -155,18 +155,21 @@ static int get_name(const char *path, const struct 
> object_id *oid, int flag, voi
>  * pattern.
>  */
> if (patterns.nr) {
> +   int found = 0;
> struct string_list_item *item;
>
> if (!is_tag)
> return 0;
>
> for_each_string_list_item(item, ) {
> -   if (!wildmatch(item->string, path + 10, 0))
> +   if (!wildmatch(item->string, path + 10, 0)) {
> +   found = 1;
> break;

I see what was wrong. The "if we got here" check is inside the loop,
so after the first wildmatch we never loop again. The fix is to add an
additional variable to store when we found something and exit the
loop, and ensure that we actually did the whole loop without finding a
match.

Thanks for the fix and proper tests!

Regards,
Jake


> +   }
> +   }
>
> -   /* If we get here, no pattern matched. */
> +   if (!found)
> return 0;
> -   }
> }
>
> /* Is it annotated? */
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index aa74eb8f0d..25110ea55d 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -182,10 +182,14 @@ check_describe "test2-lightweight-*" --tags 
> --match="test2-*"
>
>  check_describe "test2-lightweight-*" --long --tags --match="test2-*" HEAD^
>
> -check_describe "test1-lightweight-*" --long --tags --match="test1-*" 
> --match="test2-*" HEAD^
> +check_describe "test2-lightweight-*" --long --tags --match="test1-*" 
> --match="test2-*" HEAD^
>
>  check_describe "test2-lightweight-*" --long --tags --match="test1-*" 
> --no-match --match="test2-*" HEAD^
>
> +check_describe "test1-lightweight-*" --long --tags --match="test1-*" 
> --match="test3-*" HEAD
> +
> +check_describe "test1-lightweight-*" --long --tags --match="test3-*" 
> --match="test1-*" HEAD
> +
>  test_expect_success 'name-rev with exact tags' '
> echo A >expect &&
> tag_object=$(git rev-parse refs/tags/A) &&
> --
> 2.11.0.1122.gc3fec58.dirty
>


Re: [RFC PATCH 0/2] Add named reference to latest push cert

2017-09-16 Thread Shikher Verma
On Thu, Sep 07, 2017 at 05:43:19PM +, Stefan Beller wrote:
> On Thu, Sep 7, 2017 at 2:11 AM, Shikher Verma  wrote:
> > On Wed, Sep 06, 2017 at 02:31:49PM -0700, Stefan Beller wrote:
> >> On Wed, Sep 6, 2017 at 2:39 AM, Shikher Verma  
> >> wrote:
> >> > Currently, git only stores push certificates if there is a receive hook
> >> > present. This may violate the principle of least surprise (e.g., I
> >> > pushed with --signed, and I don't see anything in upstream).
> >> > Additionally, push certificates could be more versatile if they are not
> >> > tightly bound to git hooks. Finally, it would be useful to verify the
> >> > signed pushes at later points of time with ease.
> >> >
> >> > A named ref is added for ease of access/tooling around push
> >> > certificates. If the last push was signed, ref/PUSH_CERT stores the
> >> > ref of the latest push cert otherwise it is empty.
> >> >
> >> > Sending patches as RFC since the documentation would have to be
> >> > updated and git gc might have to be patched to not garbage collect
> >> > the latest push certificate.
> >> >
> >> > This patch applies on master (3ec7d702a)
> >>
> >> What are performance implications for busy repositories at busy hosts?
> >> (think kernel.org / github) They may want to disable this new feature
> >> for performance reasons or because they don't want to clutter the
> >> object store. So at least a config option to turn it off would be useful.
> >
> > Any typical git push would write several objects to disk,
> 
> (or just one pack file, [which may be renamed eventually, see 722ff7f8])
> 
> > this patch
> > would only add one more object per push so I think the performance
> > penalty is not that high. But I agree that we can have a config to turn
> > it off.
> 
> I personally do not run a high performance server, so I do not terribly mind,
> but thought it would be nice for them to have at least an option ready made
> instead of a potential performance regression.
> 
> >> On the ref to store the push certs:
> >> (a) Currently the ref points at the blob, I wonder if we'd rather want to
> >> point at a commit? (Then we can build up a history of
> >> push certs, instead of relying on the reflog to show all
> >> push certs. Also the gc issue might be easier to solve using this)
> >
> > I am not sure how that would work. The ref points at the blob of push
> > certificate. Since each push can update multiple refs, each push
> > certificate can point to mutiple commits (tip of the updated refs).
> > Also if the named ref points at the commit then how will we get the
> > corresponding push certificate?
> >
> > I did think about keeping a history of push certificates but the problem
> > is new pushes can delete refs and commits which were pointed to by
> > previous push certificates. This makes it really difficult to decide
> > which push certificates to keep and which to gc. Also this history would
> > be different for different clones of the same repo. Since push
> > certificate are only meta data of the git workflow I think its best to
> > just keep the latest push certificate and gc the old ones. People can
> > use the recieve hook if they want to do advance things like logging a
> > history of push certificates. I think git should provide a builtin
> > solution for the simple case.
> 
> What I had in mind was what would be achieved with a
> hook like this (untested):
> 
> #!/bin/sh
> if test -z GIT_PUSH_CERT ; then
> exit 0
> fi
> 
> # add a new worktree 'tmp', checking out the magic ref:
> git worktree add tmp refs/PUSH_CERT
> 
> cp $GIT_PUSH_CERT tmp/cert
> git -C tmp add cert
> git -C tmp commit -m "new push cert"
> # maybe include GIT_PUSH_CERT_[NONCE_]STATUS
> # in commit message?
> 
> # clean up, command doesn't exist yet:
> git worktree delete tmp
>

This might be a good starting point for a sample hook if we choose to go
that way. As Junio suggested.
> This would not deal with concurrency as it re-uses the
> same worktree, but illustrates what I had in mind
> for the git history of that special ref.

I personally feel that we should decouple push certificates from hooks
and serve the push certificate whenever someone does
`git pull --signed important-ref`. That way we remove trust from
services like Github, Gitlab, Bitbucket. This could be done if git
stores a map of refs to last push certificate that updated that ref.

Push certificates are preventing MiTM attack between pusher and server.
If we start serving push certificates on pull, it would prevent MiTM
attack between pusher and puller! Compromise of server wouldn't mean
vulnerabilities in your project.

A sample hook solves the immediate problem of making push certs more
accessible. But going with `git pull --signed` make push certs
tremendously more accessible and useful. What are your thoughts on
having git signed pull?

What do you guys think I should do in regards to 

Re: [PATCHv3] builtin/merge: honor commit-msg hook for merges

2017-09-16 Thread Kaartic Sivaraam
Seems 'Documentation/githooks.txt' needs an update related to this
change. Previously it said(note the **s) that 'commit-msg' is invoked
only by 'git commit',

commit-msg
   This hook is invoked by git commit**, and can be bypassed with the
   --no-verify option. It takes a single parameter, the name of the file
   that holds the proposed commit log message. Exiting with a non-zero
   status causes the git commit** to abort.

---
Kaartic


[PATCH] describe: fix matching to actually match all patterns

2017-09-16 Thread Max Kirillov
`git describe --match` with multiple patterns matches only first pattern.
If it fails, next patterns are not tried.

Fix it, add test cases and update existing test which has wrong
expectation.

Signed-off-by: Max Kirillov 
---
 builtin/describe.c  | 9 ++---
 t/t6120-describe.sh | 6 +-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 89ea1cdd60..94ff2fba0b 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -155,18 +155,21 @@ static int get_name(const char *path, const struct 
object_id *oid, int flag, voi
 * pattern.
 */
if (patterns.nr) {
+   int found = 0;
struct string_list_item *item;
 
if (!is_tag)
return 0;
 
for_each_string_list_item(item, ) {
-   if (!wildmatch(item->string, path + 10, 0))
+   if (!wildmatch(item->string, path + 10, 0)) {
+   found = 1;
break;
+   }
+   }
 
-   /* If we get here, no pattern matched. */
+   if (!found)
return 0;
-   }
}
 
/* Is it annotated? */
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index aa74eb8f0d..25110ea55d 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -182,10 +182,14 @@ check_describe "test2-lightweight-*" --tags 
--match="test2-*"
 
 check_describe "test2-lightweight-*" --long --tags --match="test2-*" HEAD^
 
-check_describe "test1-lightweight-*" --long --tags --match="test1-*" 
--match="test2-*" HEAD^
+check_describe "test2-lightweight-*" --long --tags --match="test1-*" 
--match="test2-*" HEAD^
 
 check_describe "test2-lightweight-*" --long --tags --match="test1-*" 
--no-match --match="test2-*" HEAD^
 
+check_describe "test1-lightweight-*" --long --tags --match="test1-*" 
--match="test3-*" HEAD
+
+check_describe "test1-lightweight-*" --long --tags --match="test3-*" 
--match="test1-*" HEAD
+
 test_expect_success 'name-rev with exact tags' '
echo A >expect &&
tag_object=$(git rev-parse refs/tags/A) &&
-- 
2.11.0.1122.gc3fec58.dirty