[PATCH v1 4/4] perf/run: add calls to get_var_from_env_or_config()

2017-07-13 Thread Christian Couder
These calls make it possible to have the make command or the
make options in a config file, instead of in environment
variables.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/perf/run | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/t/perf/run b/t/perf/run
index ad442fe64a..6bd15e7017 100755
--- a/t/perf/run
+++ b/t/perf/run
@@ -116,6 +116,9 @@ export GIT_PERF_REPEAT_COUNT
 get_var_from_env_or_config "GIT_PERF_DIRS_OR_REVS" "perf.dirsOrRevs"
 set -- $GIT_PERF_DIRS_OR_REVS "$@"
 
+get_var_from_env_or_config "GIT_PERF_MAKE_COMMAND" "perf.makeCommand"
+get_var_from_env_or_config "GIT_PERF_MAKE_OPTS" "perf.makeOpts"
+
 GIT_PERF_AGGREGATING_LATER=t
 export GIT_PERF_AGGREGATING_LATER
 
-- 
2.13.2.647.g8b2efe2a0f



[PATCH v1 3/4] perf/run: add GIT_PERF_DIRS_OR_REVS

2017-07-13 Thread Christian Couder
This environment variable can be set to some revisions or
directories whose Git versions should be tested, in addition
to the revisions or directories passed as arguments to the
'run' script.

This enables a "perf.dirsOrRevs" configuration variable to
be used to set revisions or directories whose Git versions
should be tested.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/perf/run | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/t/perf/run b/t/perf/run
index 41580ac6df..ad442fe64a 100755
--- a/t/perf/run
+++ b/t/perf/run
@@ -113,6 +113,9 @@ get_var_from_env_or_config () {
 get_var_from_env_or_config "GIT_PERF_REPEAT_COUNT" "perf.repeatCount" 3
 export GIT_PERF_REPEAT_COUNT
 
+get_var_from_env_or_config "GIT_PERF_DIRS_OR_REVS" "perf.dirsOrRevs"
+set -- $GIT_PERF_DIRS_OR_REVS "$@"
+
 GIT_PERF_AGGREGATING_LATER=t
 export GIT_PERF_AGGREGATING_LATER
 
-- 
2.13.2.647.g8b2efe2a0f
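
As an illustration of the `set --` line in the patch above, here is a minimal sketch (not part of the patch) of how an unquoted variable is split into words and placed in front of the existing positional parameters. The function name prepend_args, the variable DIRS_OR_REVS and the argument values are assumptions made up for the example.

```shell
#!/bin/sh
# Sketch only: DIRS_OR_REVS stands in for GIT_PERF_DIRS_OR_REVS and
# prepend_args is a hypothetical wrapper so "$@" can be controlled.

prepend_args () {
	DIRS_OR_REVS="v2.12.0 v2.13.0"
	# Left unquoted on purpose: the value is split on whitespace and
	# each word becomes a separate argument before the original ones.
	set -- $DIRS_OR_REVS "$@"
	echo "$# args: $*"
}

result=$(prepend_args . p0001-rev-list.sh)
echo "$result"
```

Because the variable is expanded unquoted, a value containing spaces always becomes several arguments, which is what the 'run' script relies on here.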



[PATCH v1 1/4] perf/run: add '--config' option to the 'run' script

2017-07-13 Thread Christian Couder
It is error prone and tiring to use many long environment
variables to give parameters to the 'run' script.

Let's make it easy to store some parameters in a config
file and to pass them to the run script.

The GIT_PERF_CONFIG_FILE variable will be set to the
argument of the '--config' option. This variable is not
used yet. It will be used in a following commit.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/perf/run | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/t/perf/run b/t/perf/run
index beb4acc0e4..1e7c2a59e4 100755
--- a/t/perf/run
+++ b/t/perf/run
@@ -2,9 +2,14 @@
 
 case "$1" in
--help)
-   echo "usage: $0 [other_git_tree...] [--] [test_scripts]"
+   echo "usage: $0 [--config file] [other_git_tree...] [--] [test_scripts]"
exit 0
;;
+   --config)
+   shift
+   GIT_PERF_CONFIG_FILE=$(cd "$(dirname "$1")"; pwd)/$(basename "$1")
+   export GIT_PERF_CONFIG_FILE
+   shift ;;
 esac
 
 die () {
-- 
2.13.2.647.g8b2efe2a0f
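
The path handling for '--config' in the patch above can be sketched on its own as follows. This is only an illustration under assumptions: to_abs_path is a hypothetical helper name, the file name conf/perf.conf is made up, and the sketch uses `&&` where the patch uses `;` between `cd` and `pwd`.

```shell
#!/bin/sh
# Sketch only: turn a possibly relative file name into an absolute one
# by resolving its directory with cd/pwd inside a command substitution.

to_abs_path () {
	# The file itself does not need to exist, only its directory.
	echo "$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
}

tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/conf"
cd "$tmpdir"

abs=$(to_abs_path conf/perf.conf)
expected="$(pwd)/conf/perf.conf"
echo "$abs"

# Clean up the scratch directory used for the demonstration.
cd / && rm -rf "$tmpdir"
```

Storing an absolute path matters because the 'run' script changes directories later, so a relative config file name would stop resolving.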



[PATCH v1 2/4] perf/run: add get_var_from_env_or_config()

2017-07-13 Thread Christian Couder
Add get_var_from_env_or_config() to easily set variables
from a config file if they are defined there and not already set.

This can also set them to a default value if one is provided.

As an example, use this function to set GIT_PERF_REPEAT_COUNT
from the perf.repeatCount config option or from the default
value.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/perf/perf-lib.sh |  3 ---
 t/perf/run         | 21 +++++++++++++++++++++
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index b50211b259..2f88fc12a9 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -59,9 +59,6 @@ perf_results_dir=$TEST_OUTPUT_DIRECTORY/test-results
 mkdir -p "$perf_results_dir"
 rm -f "$perf_results_dir"/$(basename "$0" .sh).subtests
 
-if test -z "$GIT_PERF_REPEAT_COUNT"; then
-   GIT_PERF_REPEAT_COUNT=3
-fi
 die_if_build_dir_not_repo () {
if ! ( cd "$TEST_DIRECTORY/.." &&
git rev-parse --build-dir >/dev/null 2>&1 ); then
diff --git a/t/perf/run b/t/perf/run
index 1e7c2a59e4..41580ac6df 100755
--- a/t/perf/run
+++ b/t/perf/run
@@ -34,6 +34,7 @@ unpack_git_rev () {
(cd "$(git rev-parse --show-cdup)" && git archive --format=tar $rev) |
(cd build/$rev && tar x)
 }
+
 build_git_rev () {
rev=$1
for config in config.mak config.mak.autogen config.status
@@ -92,6 +93,26 @@ run_dirs () {
done
 }
 
+get_var_from_env_or_config () {
+   env_var="$1"
+   conf_var="$2"
+   # $3 can be set to a default value
+
+   # Do nothing if the env variable is already set
+   eval "test -z \"\${$env_var+x}\"" || return
+
+   # Check if the variable is in the config file
+   test -n "$GIT_PERF_CONFIG_FILE" &&
+   conf_value=$(git config -f "$GIT_PERF_CONFIG_FILE" "$conf_var") &&
+   eval "$env_var=\"$conf_value\"" || {
+   test -n "${3+x}" &&
+   eval "$env_var=\"$3\""
+   }
+}
+
+get_var_from_env_or_config "GIT_PERF_REPEAT_COUNT" "perf.repeatCount" 3
+export GIT_PERF_REPEAT_COUNT
+
 GIT_PERF_AGGREGATING_LATER=t
 export GIT_PERF_AGGREGATING_LATER
 
-- 
2.13.2.647.g8b2efe2a0f
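
The precedence implemented by get_var_from_env_or_config() (environment first, then config file, then default) can be tried without Git using the sketch below. It is not the function from the patch: lookup_config is a hypothetical stand-in for `git config -f "$GIT_PERF_CONFIG_FILE"`, get_var is a simplified rewrite, and the REPEAT_* variable names are made up.

```shell
#!/bin/sh
# Sketch only: env > config > default, with a stubbed config lookup.

lookup_config () {
	# Hypothetical stub: pretend the config file only defines one key.
	case "$1" in
	perf.repeatcount) echo 10 ;;
	*) return 1 ;;
	esac
}

get_var () {
	env_var=$1 conf_var=$2
	# ${VAR+x} expands to "x" when VAR is set, even to an empty string,
	# so an explicitly empty environment variable is left alone.
	eval "test -z \"\${$env_var+x}\"" || return 0
	if conf_value=$(lookup_config "$conf_var")
	then
		eval "$env_var=\$conf_value"
	elif test -n "${3+x}"
	then
		eval "$env_var=\$3"
	fi
}

unset REPEAT_A REPEAT_B REPEAT_C REPEAT_D
REPEAT_A=5
REPEAT_D=
get_var REPEAT_A perf.repeatcount 3	# already set: stays 5
get_var REPEAT_B perf.repeatcount 3	# taken from the stub config: 10
get_var REPEAT_C perf.missing 3		# not in config: default 3
get_var REPEAT_D perf.repeatcount 3	# set but empty: stays empty
echo "$REPEAT_A $REPEAT_B $REPEAT_C [$REPEAT_D]"
```

The sketch assigns through `\$conf_value` rather than interpolating the value into the eval string, which is a deliberate difference from the patch to keep special characters in values from being re-evaluated.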



[PATCH v1 0/4] Teach 'run' perf script to read config files

2017-07-13 Thread Christian Couder
Goal
~~~~

Using many long environment variables to give parameters to the 'run'
script is error prone and tiring.

We want to make it possible to store the parameters to the 'run'
script in a config file. This will make it easier to store, reuse,
share and compare parameters.

In the future it could also make it easy to run series of tests. 

Design
~~~~~~

We use the same config format as the ".git/config" file as Git users
and developers are familiar with this nice format and have great tools
to change, query and manage it.

We use values from the config file to set the environment variables
that are used by the scripts if they are not already set.

High-level view of the patches in the series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Patch 1/4 teaches the 'run' script a '--config' option that takes a
config file as its argument, but this file name is just put into the
GIT_PERF_CONFIG_FILE variable, which is not used yet.

Patch 2/4 adds the get_var_from_env_or_config() function to read config
options from the config file and to set some variables from these
config options or from default values.

Patches 3/4 and 4/4 use the get_var_from_env_or_config() function to
make it possible to set parameters used by the 'run' script.

Future work
~~~~~~~~~~~

We want to make it possible to run series of tests by passing only a
config file to the 'run' script.

For example a config file like the following could be used to run perf
tests with Git compiled both with and without libpcre:

[perf]
dirsOrRevs = v2.12.0 v2.13.0
repeatCount = 10
[perf "with libpcre"]
makeOpts = DEVELOPER=1 CFLAGS='-g -O0' USE_LIBPCRE=YesPlease
[perf "without libpcre"]
makeOpts = DEVELOPER=1 CFLAGS='-g -O0'

This will make it easy to see what changes between the different runs.

But this can be added later, and the current series can already be
useful as is.

Links
~~~~~

This patch series is also available here:

  https://github.com/chriscool/git/commits/perf-conf


Christian Couder (4):
  perf/run: add '--config' option to the 'run' script
  perf/run: add get_var_from_env_or_config()
  perf/run: add GIT_PERF_DIRS_OR_REVS
  perf/run: add calls to get_var_from_env_or_config()

 t/perf/perf-lib.sh |  3 ---
 t/perf/run         | 34 +++++++++++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 4 deletions(-)

-- 
2.13.2.647.g8b2efe2a0f



Re: [PATCH v2 2/3] trailers: export action enums and corresponding lookup functions

2017-07-13 Thread Christian Couder
On Thu, Jul 13, 2017 at 12:21 AM, Paolo Bonzini wrote:

> diff --git a/trailer.h b/trailer.h
> index e90ba1270..f306bf059 100644
> --- a/trailer.h
> +++ b/trailer.h
> @@ -1,11 +1,33 @@
>  #ifndef TRAILER_H
>  #define TRAILER_H
>
> +enum action_where {
> +   WHERE_END,
> +   WHERE_AFTER,
> +   WHERE_BEFORE,
> +   WHERE_START
> +};
> +enum action_if_exists {
> +   EXISTS_ADD_IF_DIFFERENT_NEIGHBOR,
> +   EXISTS_ADD_IF_DIFFERENT,
> +   EXISTS_ADD,
> +   EXISTS_REPLACE,
> +   EXISTS_DO_NOTHING
> +};
> +enum action_if_missing {
> +   MISSING_ADD,
> +   MISSING_DO_NOTHING
> +};

As these enums are now in trailer.h, maybe more specific names like
"trailer_action_where" instead of "action_where" would be better.

>  struct trailer_opts {
> int in_place;
> int trim_empty;
>  };
>
> +int set_where(enum action_where *item, const char *value);
> +int set_if_exists(enum action_if_exists *item, const char *value);
> +int set_if_missing(enum action_if_missing *item, const char *value);

"trailer_" should perhaps be added at the beginning of the names of
the above functions too.


Re: [PATCH 0/3] interpret-trailers: add --where, --if-exists, --if-missing

2017-07-12 Thread Christian Couder
On Wed, Jul 12, 2017 at 3:46 PM, Paolo Bonzini wrote:
>
> These options are useful to experiment with "git interpret-trailers"
> without having to tinker with .gitconfig.  It can also be useful in the
> oddball case where you want a different placement for the trailer.
>
> The case that stimulated the creation of the patches was configuring
>
>  trailer.signed-off-by.where = end
>
> and then wanting "--where before" when a patch author forgets his
> Signed-off-by and provides it in a separate email.

Maybe you could have used the following to temporarily override the config:

git -c trailer.signed-off-by.where=before interpret-trailers ...

But it could be helpful and more straightforward to provide the
options you implemented.

I am also not sure whether --where should override both "trailer.where" and
"trailer..where", or whether it should just override the former.


Re: [PATCH v5 7/7] fsmonitor: add a performance test

2017-07-08 Thread Christian Couder
On Fri, Jul 7, 2017 at 8:35 PM, Junio C Hamano wrote:
> Ben Peart writes:
>
>> On 6/14/2017 2:36 PM, Junio C Hamano wrote:
>>> Ben Peart writes:
>>>
>>>>> Having said all that, I think you are using this ONLY on windows;
>>>>> perhaps it is better to drop #ifdef GIT_WINDOWS_NATIVE from all of
>>>>> the above and arrange Makefile to build test-drop-cache only on that
>>>>> platform, or something?
>>>>
>>>> I didn't find any other examples of Windows only tools.  I'll update
>>>> the #ifdef to properly dump the file system cache on Linux as well and
>>>> only error out on other platforms.
>>>
>>> If this will become Windows-only, then I have no problem with
>>> platform specific typedef ;-) I have no problem with CamelCase,
>>> either, as that follows the local convention on the platform
>>> (similar to those in compat/* that are only for Windows).
>>>
>>> Having said all that.
>>>
>>> Another approach is to build this helper on all platforms, ...
>
> ... and having said all that, I think it is perfectly fine to do
> such a clean-up long after the series gets more exposure to wider
> audiences as a follow-up patch.  Let's get the primary part that
> affects people's everyday use of Git right and then worry about the
> test details later.
>
> A quick show of hands to the list audiences.  How many of you guys
> actually tried this series on 'pu' and checked to see its
> performance (and correctness ;-) characteristics?

As you can guess from my previous replies to this thread (and the
previous version of this patch series), I lightly tried it and checked
its performance for Booking.com.

> Do you folks like it?  Rather not have such complexity in the core
> part of the system?  A good first step to start adding more
> performance improvements?  No opinion?

I already gave my opinion which I think is shared with Ævar. In short
I don't think it should be a hook, as that limits the performance and
is not necessary, but it is going in the right direction.


Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support

2017-07-01 Thread Christian Couder
On Sat, Jul 1, 2017 at 10:33 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Christian Couder <christian.cou...@gmail.com> writes:
>
>>> I think it would be good to ensure the
>>> interface is robust and performant enough to actually replace the current
>>> object store interface (even if we don't actually do that just yet).
>>
>> I agree that it should be robust and performant, but I don't think it
>> needs to be as performant in all cases as the current object store
>> right now.
>
> That sounds like starting from a defeatist position.  Is there a
> reason why you think using an external interface could never perform
> well enough to be usable in everyday work?

Perhaps in the future we will be able to make it as performant as, or
perhaps even more performant than, the current object store, but in
the current implementation the following issues mean that it will be
less performant:

- The external object stores are searched for an object after the
object has not been found in the current object store. This means that
searching for an object will be slower if the object is in an external
object store. To overcome this the "have" information (when the
external helper implements it) could be merged with information about
what objects are in the current object store, for example in a big
table or bitmap, so that only one lookup in this table or bitmap would
be needed to know if an object is available and in which object store
it is. But I really don't want to get into this right now.

- When an external odb helper retrieves an object and passes it to
Git, Git (or the helper itself in "fault in" mode) then stores the
object in the current object store. This is because we assume that it
will be faster to retrieve it again if it is cached in the current
object store. There could be a capability that asks Git to not cache
the objects that are retrieved from the external odb, but again I
don't think it is necessary at all to implement this right now.

I still think though that in some cases, like when the external odb is
used to implement a bundle clone, using the external odb mechanism can
already be more performant.


Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support

2017-07-01 Thread Christian Couder
On Sat, Jul 1, 2017 at 9:41 PM, Christian Couder
<christian.cou...@gmail.com> wrote:
> On Fri, Jun 23, 2017 at 8:24 PM, Ben Peart <peart...@gmail.com> wrote:

>> The fact that "git clone is taught a --initial-refspec option" indicates
>> this isn't just an ODB implementation detail.  Is there a general capability
>> that is missing from the ODB interface that needs to be addressed here?
>
> Technically you don't need to teach `git clone` the --initial-refspec
> option to make it work.
> It can work like this:
>
> $ git init
> $ git remote add origin 
> $ git fetch origin 
> $ git config odb..command 
> $ git fetch origin
>
> But it is much simpler for the user to instead just do:
>
> $ git clone -c odb..command= --initial-refspec
>  
>
> I also think that the --initial-refspec option could perhaps be useful
> for other kinds of refs for example tags, notes or replace refs, to
> make sure that those refs are fetched first and that hooks can use
> them when fetching other refs like branches in the later part of the
> clone.

Actually I am not sure that it's possible to set up hooks per se before
or while cloning, but perhaps there are other kinds of scripts or git
commands that could trigger and use the refs that have been fetched
first.


Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support

2017-07-01 Thread Christian Couder
On Fri, Jun 23, 2017 at 8:24 PM, Ben Peart <peart...@gmail.com> wrote:
>
>
> On 6/20/2017 3:54 AM, Christian Couder wrote:

>> To be able to better handle some kind of objects, for example big
>> blobs, it would be nice if Git could store its objects in other object
>> databases (ODB).
>>
>> To do that, this patch series makes it possible to register commands,
>> also called "helpers", using "odb..command" config variables,
>> to access external ODBs where objects can be stored and retrieved.
>>
>> External ODBs should be able to transfer information about the blobs
>> they store. This patch series shows how this is possible using a kind
>> of replace refs.
>
> Great to see this making progress!
>
> My thoughts and questions are mostly about the overall design tradeoffs.
>
> Is your intention to enable the ODB to completely replace the regular object
> store or just to supplement it?

It is to supplement it, as I think the regular object store works very
well most of the time.

> I think it would be good to ensure the
> interface is robust and performant enough to actually replace the current
> object store interface (even if we don't actually do that just yet).

I agree that it should be robust and performant, but I don't think it
needs to be as performant in all cases as the current object store
right now.

> Another way of asking this is: do the 3 verbs (have, get, put) and the 3
> types of "get" enable you to wrap the current loose object and pack file
> code as ODBs and run completely via the external ODB interface?  If not,
> what is missing and can it be added?

Right now the "put" verb only sends plain blobs, so the most logical
way to run completely via the external ODB interface would be to use
it to send and receive plain blobs. There are test scripts (t0420,
t0470 and t0480) that use an http server as the external ODB and all
the blobs are stored in it.

And yeah for now it works only for blobs. There is a temporary patch
in the series that limits it to blobs. For the non RFC patch series, I
think it should either use the attribute system to tell which objects
should be run via the external ODB interface, or perhaps there should
be a way to ask each external ODB helper which kind of objects and
blobs it can handle. I should add that in the future work part.

> _Eventually_ it would be great to see the current object store(s) moved
> behind the new ODB interface.

This is not one of my goals and I think it could be a problem if we
want to keep the "fault in" mode.
In this mode the helper writes or reads directly to or from the
current object store, so it needs the current object store to be
available.

Also I think compatibility with other git implementations is important
and it is a good thing that they can all work on a common repository
format.

> When there are multiple ODB providers, what is the order they are called?

The external_odb_config() function creates the helpers for the
external ODBs in the order they are found in the config file, and then
these helpers are called in turn in the same order.

> If one fails a request (get, have, put) are the others called to see if they
> can fulfill the request?

Yes, but there are no tests to check that it works well. I will need
to add some.

> Can the order they are called for various verb be configured explicitly?

Right now, you can configure the order by changing the config file,
but the order will be the same for all the verbs.

> For
> example, it would be nice to have a "large object ODB handler" configured to
> get first try at all "put" verbs.  Then if it meets its size requirements,
> it will handle the verb, otherwise it fails and git will try the other ODBs.

This can work if the "large object ODB handler" is configured first.

Also this is linked with how you define which objects are handled by
which helper. For example if the attribute system is used to describe
which external ODB is used for which files, there could be a way to
tell for example that blobs larger than 1MB are handled by the "large
object ODB handler" while those that are smaller are handled by
another helper.

>> Design
>> ~~~~~~
>>
>> * The "helpers" (registered commands)
>>
>> Each helper manages access to one external ODB.
>>
>> There are now 2 different modes for helper:
>>
>>- When "odb..scriptMode" is set to "true", the helper is
>>  launched each time Git wants to communicate with the 
>>  external ODB.
>>
>>- When "odb..scriptMode" is not set or set to "false", then
>>  the helper is launched once as a sub-process (using
>>  sub-process.h), 

Re: [PATCH v5 0/7] Fast git status via a file system watcher

2017-06-27 Thread Christian Couder
On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart wrote:
> Changes from V4 include:
...

I took a look at this patch series except the last patch ([PATCH v5
7/7] fsmonitor: add a performance test) as Junio reviewed it already,
and had only a few comments on patches 3/7 and 4/7.

I am still not convinced by the discussions following v2
(http://public-inbox.org/git/20170518201333.13088-1-benpe...@microsoft.com/)
about using a hook instead of for example a "core.fsmonitorcommand".

I think using a hook is not necessary and might not be a good match
for later optimizations. For example people might want to use a
library or some OS specific system calls to do what the hook does.

Ævar previously reported some not so great performance numbers on
some big Booking.com boxes with a big monorepo and it seems that using
taskset for example to make sure that the hook is run on the same CPU
improves these numbers significantly. So avoiding running a separate
process can be important in some cases.


Re: [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension

2017-06-27 Thread Christian Couder
On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart wrote:

> +# fsmonitor works correctly with or without the untracked cache
> +# but if it is available, we'll turn it on to ensure we test that
> +# codepath as well.
> +
> +test_lazy_prereq UNTRACKED_CACHE '
> +   { git update-index --test-untracked-cache; ret=$?; } &&
> +   test $ret -ne 1
> +'
> +
> +if test_have_prereq UNTRACKED_CACHE; then
> +   git config core.untrackedcache true
> +else
> +   git config core.untrackedcache false
> +fi

I wonder if it would be better to just do something like:

=

test_expect_success 'setup' '

'

uc_values="false"
test_have_prereq UNTRACKED_CACHE && uc_values="false true"

for uc_val in $uc_values
do

test_expect_success "setup untracked cache to $uc_val" '
 git config core.untrackedcache $uc_val
'

test_expect_success 'refresh_index() invalidates fsmonitor cache' '
  ...
'

test_expect_success "status doesn't detect unreported modifications" '
  ...
'

...

done

=


Re: [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-06-27 Thread Christian Couder
On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart wrote:

> +int read_fsmonitor_extension(struct index_state *istate, const void *data,
> +   unsigned long sz)
> +{
> +   const char *index = data;
> +   uint32_t hdr_version;
> +   uint32_t ewah_size;
> +   int ret;
> +
> +   if (sz < sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t))
> +   return error("corrupt fsmonitor extension (too short)");
> +
> +   hdr_version = get_be32(index);

Here we use get_be32(), ...

> +   index += sizeof(uint32_t);
> +   if (hdr_version != INDEX_EXTENSION_VERSION)
> +   return error("bad fsmonitor version %d", hdr_version);
> +
> +   istate->fsmonitor_last_update = get_be64(index);

...get_be64(), ...

> +   index += sizeof(uint64_t);
> +
> +   ewah_size = get_be32(index);

... and get_be32 again, ...

> +   index += sizeof(uint32_t);
> +
> +   istate->fsmonitor_dirty = ewah_new();
> +   ret = ewah_read_mmap(istate->fsmonitor_dirty, index, ewah_size);
> +   if (ret != ewah_size) {
> +   ewah_free(istate->fsmonitor_dirty);
> +   istate->fsmonitor_dirty = NULL;
> +   return error("failed to parse ewah bitmap reading fsmonitor index extension");
> +   }
> +
> +   return 0;
> +}
> +
> +void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
> +{
> +   uint32_t hdr_version;
> +   uint64_t tm;
> +   struct ewah_bitmap *bitmap;
> +   int i;
> +   uint32_t ewah_start;
> +   uint32_t ewah_size = 0;
> +   int fixup = 0;
> +
> +   hdr_version = htonl(INDEX_EXTENSION_VERSION);

... but here we use htonl() instead of put_be32(), ...

> +   strbuf_add(sb, &hdr_version, sizeof(uint32_t));
> +
> +   tm = htonll((uint64_t)istate->fsmonitor_last_update);

... htonll(), ...

> +   strbuf_add(sb, &tm, sizeof(uint64_t));
> +   fixup = sb->len;
> +   strbuf_add(sb, &ewah_size, sizeof(uint32_t)); /* we'll fix this up later */
> +
> +   ewah_start = sb->len;
> +   bitmap = ewah_new();
> +   for (i = 0; i < istate->cache_nr; i++)
> +   if (istate->cache[i]->ce_flags & CE_FSMONITOR_DIRTY)
> +   ewah_set(bitmap, i);
> +   ewah_serialize_strbuf(bitmap, sb);
> +   ewah_free(bitmap);
> +
> +   /* fix up size field */
> +   ewah_size = htonl(sb->len - ewah_start);

... and htonl() again.

It would be more consistent (and perhaps more correct) to use
put_beXX() functions, instead of the htonl() and htonll() functions.

> +   memcpy(sb->buf + fixup, &ewah_size, sizeof(uint32_t));
> +}

> +/*
> + * Call the query-fsmonitor hook passing the time of the last saved results.
> + */
> +static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *query_result)
> +{
> +   struct child_process cp = CHILD_PROCESS_INIT;
> +   char ver[64];
> +   char date[64];
> +   const char *argv[4];
> +
> +   if (!(argv[0] = find_hook("query-fsmonitor")))
> +   return -1;
> +
> +   snprintf(ver, sizeof(version), "%d", version);
> +   snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
> +   argv[1] = ver;
> +   argv[2] = date;
> +   argv[3] = NULL;
> +   cp.argv = argv;

Maybe it would be nicer using the argv_array_pushX() functions.

> +   cp.out = -1;
> +
> +   return capture_command(&cp, query_result, 1024);
> +}
> +
> +static void mark_file_dirty(struct index_state *istate, const char *name)
> +{
> +   struct untracked_cache_dir *dir;
> +   int pos;
> +
> +   /* find it in the index and mark that entry as dirty */
> +   pos = index_name_pos(istate, name, strlen(name));
> +   if (pos >= 0) {
> +   if (!(istate->cache[pos]->ce_flags & CE_FSMONITOR_DIRTY)) {
> +   istate->cache[pos]->ce_flags |= CE_FSMONITOR_DIRTY;
> +   istate->cache_changed |= FSMONITOR_CHANGED;
> +   }
> +   }
> +
> +   /*
> +* Find the corresponding directory in the untracked cache
> +* and mark it as invalid
> +*/
> +   if (!istate->untracked || !istate->untracked->root)
> +   return;
> +
> +   dir = find_untracked_cache_dir(istate->untracked, istate->untracked->root, name);
> +   if (dir) {
> +   if (dir->valid) {
> +   dir->valid = 0;
> +   istate->cache_changed |= FSMONITOR_CHANGED;
> +   }
> +   }

The code above is quite similar to what is in mark_fsmonitor_dirty(),
so I wonder if a refactoring is possible.

> +}
> +
> +void refresh_by_fsmonitor(struct index_state *istate)
> +{
> +   static int has_run_once = 0;
> +   struct strbuf query_result = STRBUF_INIT;
> +   int query_success = 0;
> +   size_t bol = 0; /* beginning of line */
> +   uint64_t last_update;
> +   char *buf, *entry;
> +   int i;
> +

Re: [GSoC][PATCH 6/6 v2] submodule: port submodule subcommand 'deinit' from shell to C

2017-06-27 Thread Christian Couder
On Tue, Jun 27, 2017 at 1:11 AM, Prathamesh Chavan wrote:

> +static void deinit_submodule(const struct cache_entry *list_item,
> +void *cb_data)
> +{
> +   struct deinit_cb *info = cb_data;
> +   const struct submodule *sub;
> +   char *displaypath = NULL;
> +   struct child_process cp_config = CHILD_PROCESS_INIT;
> +   struct strbuf sb_config = STRBUF_INIT;
> +   char *sm_path = xstrdup(list_item->name);
> +   char *sub_git_dir = xstrfmt("%s/.git", sm_path);
> +
> +   sub = submodule_from_path(null_sha1, sm_path);
> +
> +   if (!sub->name)

In the previous patch "!sub" is used before "!sub->url", so we might
want to check "!sub" here too.

> +   goto cleanup;
> +
> +   displaypath = get_submodule_displaypath(sm_path, info->prefix);
> +
> +   /* remove the submodule work tree (unless the user already did it) */
> +   if (is_directory(sm_path)) {
> +   struct child_process cp = CHILD_PROCESS_INIT;
> +
> +   /* protect submodules containing a .git directory */
> +   if (is_git_directory(sub_git_dir))
> +   die(_("Submodule work tree '%s' contains a .git "
> + "directory use 'rm -rf' if you really want "
> + "to remove it including all of its history"),
> + displaypath);
> +
> +   if (!info->force) {
> +   struct child_process cp_rm = CHILD_PROCESS_INIT;
> +   cp_rm.git_cmd = 1;
> +   argv_array_pushl(&cp_rm.args, "rm", "-qn", sm_path,
> +NULL);
> +
> +   /* list_item->name is changed by cmd_rm() below */

It looks like cmd_rm() is not used anymore below, so this comment
could go and the sm_path variable might not be needed any more.

> +   if (run_command(&cp_rm))
> +   die(_("Submodule work tree '%s' contains local "
> + "modifications; use '-f' to discard them"),
> + displaypath);
> +   }
> +
> +   cp.use_shell = 1;

Do we really need a shell here?

> +   argv_array_pushl(&cp.args, "rm", "-rf", sm_path, NULL);
> +   if (!run_command(&cp)) {
> +   if (!info->quiet)
> +   printf(_("Cleared directory '%s'\n"),
> +displaypath);
> +   } else {
> +   if (!info->quiet)
> +   printf(_("Could not remove submodule work tree '%s'\n"),
> +displaypath);
> +   }
> +   }
> +
> +   if (mkdir(sm_path, 0700))

Are you sure about the 0700 mode?
Shouldn't this depend on the shared repository settings?

> +   die(_("could not create empty submodule directory %s"),
> + displaypath);


Re: [GSoC][PATCH 5/6 v2] submodule: port submodule subcommand sync from shell to C

2017-06-27 Thread Christian Couder
On Tue, Jun 27, 2017 at 1:11 AM, Prathamesh Chavan wrote:

> +static char *get_up_path(const char *path)
> +{
> +   int i = count_slashes(path);
> +   int l = strlen(path);

Nit: "l" is quite similar to "i" in many fonts, so maybe use "len"
instead of "l", but see below.

> +   struct strbuf sb = STRBUF_INIT;
> +
> +   while (i--)
> +   strbuf_addstr(&sb, "../");

Nit: a regular "for" loop like the following might be easier to review:

for (i = count_slashes(path); i; i--)
        strbuf_addstr(&sb, "../");

> +   /*
> +*Check if 'path' ends with slash or not
> +*for having the same output for dir/sub_dir
> +*and dir/sub_dir/
> +*/
> +   if (!is_dir_sep(path[l - 1]))

As "l" is used only here, maybe we could get rid of this variable
altogether with something like:

  if (!is_dir_sep(path[strlen(path) - 1]))

> +   strbuf_addstr(&sb, "../");
> +
> +   return strbuf_detach(&sb, NULL);
> +}

> +static void sync_submodule(const struct cache_entry *list_item, void *cb_data)
> +{
> +   struct sync_cb *info = cb_data;
> +   const struct submodule *sub;
> +   char *sub_key, *remote_key;
> +   char *sub_origin_url, *super_config_url, *displaypath;
> +   struct strbuf sb = STRBUF_INIT;
> +   struct child_process cp = CHILD_PROCESS_INIT;
> +
> +   if (!is_submodule_initialized(list_item->name))
> +   return;
> +
> +   sub = submodule_from_path(null_sha1, list_item->name);
> +
> +   if (!sub && !sub->url)

I think it should be "(!sub || !sub->url)".

> +   die(_("no url found for submodule path '%s' in .gitmodules"),
> + list_item->name);


Re: [GSoC][PATCH 2/6 v2] submodule: port subcommand foreach from shell to C

2017-06-27 Thread Christian Couder
On Tue, Jun 27, 2017 at 1:11 AM, Prathamesh Chavan wrote:
> +
> +   if (!is_submodule_populated_gently(list_item->name, NULL))
> +   goto cleanup;
> +
> +   prepare_submodule_repo_env(&cp.env_array);
> +   /* For the purpose of executing  in the submodule,
> +* separate shell is used for the purpose of running the
> +* child process.
> +*/

As this comment spans over more than one line, it should be like:

/*
 * first line of comment
 * second line of comment
 * more stuff ...
 */

Also please explain WHY a shell is needed; we can already see from the
code that we will use a shell.
So it should be something like:

/*
 * Use a shell because ...
 * and ...
 */

> +   cp.use_shell = 1;
> +   cp.dir = list_item->name;


Re: [PATCH v2 1/3] read-cache: use shared perms when writing shared index

2017-06-24 Thread Christian Couder
On Sat, Jun 24, 2017 at 12:02 AM, Junio C Hamano <gits...@pobox.com> wrote:
> Christian Couder <christian.cou...@gmail.com> writes:

>> Because of that, on repositories created with `git init --shared=all`
>> and using the split index feature, one gets an error like:
>>
>> fatal: .git/sharedindex.a52f910b489bc462f187ab572ba0086f7b5157de: index file open failed: Permission denied
>>
>> when another user performs any operation that reads the shared index.
>>
>> We could use create_tempfile() that calls adjust_shared_perm(), but
>> unfortunately create_tempfile() doesn't replace the XXXXXX at the end
>> of the path it is passed. So let's just call adjust_shared_perm() by
>> ourselves.
>
> Because create_tempfile() is not even a viable alternative, the
> above sounds just as silly as saying "We could use X, but
> unfortunately that X doesn't create a temporary file and return its
> file descriptor" with X replaced with any one of about a dozen
> functions that happen to call adjust_shared_perm().
>
> Call adjust_shared_perm() on the temporary file created by
> mks_tempfile() ourselves to adjust the permission bits.
>
> should be sufficient.

Ok, the v3 has the above in the commit message and also uses
get_tempfile_path().
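
For illustration, the create, adjust, rename order can be sketched in
plain shell. This is only an analogy under stated assumptions, not the
read-cache.c change: mktemp(1) plays the role of mks_tempfile() (both
create a file that only its owner can access), chmod plays the role of
adjust_shared_perm(), and publish_shared is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: create a temporary file, widen its permissions, then rename it.
# publish_shared is a hypothetical helper; chmod stands in for
# adjust_shared_perm().
publish_shared () {
	# $1: destination path, $2: octal mode to apply (for example 0660)
	tmp=$(mktemp "$1.XXXXXX") || return 1
	# mktemp creates the file with owner-only permissions, so fix the
	# bits before the file becomes visible under its final name.
	if ! chmod "$2" "$tmp"
	then
		echo "cannot fix permission bits on $tmp" >&2
		rm -f "$tmp"
		return 1
	fi
	mv -f "$tmp" "$1"
}

work=$(mktemp -d) &&
publish_shared "$work/sharedindex.demo" 0660 &&
# On a POSIX filesystem this should print: -rw-rw----
ls -l "$work/sharedindex.demo" | cut -c1-10
```

Fixing the bits before the rename means the file never shows up under
its final name with the wrong permissions.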


Re: [PATCH v2 3/3] t1700: make sure split-index respects core.sharedrepository

2017-06-24 Thread Christian Couder
On Sat, Jun 24, 2017 at 12:20 AM, Junio C Hamano <gits...@pobox.com> wrote:
> Christian Couder <christian.cou...@gmail.com> writes:
>
>> Add a few tests to check that both the split-index file and the
>> shared-index file are created using the right permissions when
>> core.sharedrepository is set.
>>
>> Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
>> ---
>>  t/t1700-split-index.sh | 17 +
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
>> index af3ec0da5a..2c5be732e4 100755
>> --- a/t/t1700-split-index.sh
>> +++ b/t/t1700-split-index.sh
>> @@ -370,4 +370,21 @@ test_expect_success 'check splitIndex.sharedIndexExpire 
>> set to "never" and "now"
>>   test $(ls .git/sharedindex.* | wc -l) -le 2
>>  '
>>
>> +while read -r mode modebits filename; do
>
> Style.

Fixed in the version (v3) I just sent.

> Running this twice in a loop would create two .git/sharedindex.*
> files in quick succession.  I do not think we want to assume that
> the filesystem timestamp can keep up with us to allow "ls -t" to
> work reliably in the second round (if there is a leftover shared
> index from previous test, even the first round may not catch the
> latest one).

Yeah, it might be a problem on some systems.

> How about doing each iteration this way instead?  Which might be a
> better solution to work around that.
>
> - with core.sharedrepository set to false, force the index to be
>   unsplit; "index" will have the default unshared permission
>   bits (but we do not care what it is and no need to check it).
>
> - remove any leftover sharedindex.*, if any.
>
> - with core.sharedrepository set to whatever mode being tested,
>   do the adding to force split.
>
> - test the permission of index file.
>
> - test the permission of sharedindex.* file; there should be
>   only one instance, so erroring out when we see two or more is
>   also a good test.
>
> The last two steps may look like:
>
> test_modebits .git/index >actual && test_cmp expect actual &&
> shared=$(ls .git/sharedindex.*) &&
> case "$shared" in
> *" "*)
> # we have more than one???
> false ;;
> *)
> test_modebits "$shared" >actual &&
> test_cmp expect actual ;;
> esac

OK, v3 does what you suggest.

Thanks.
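
As a side note, the "there should be only one instance" check can also
be written by letting the shell expand the glob and counting the
results, instead of looking at what ls prints. A minimal sketch,
assuming paths without whitespace; single_match is a hypothetical
helper name:

```shell
#!/bin/sh
# Sketch: succeed (and print the path) only when a glob pattern matches
# exactly one existing file. single_match is a hypothetical helper.
single_match () {
	# Expand the unquoted pattern into this function's positional
	# parameters. Assumes the pattern contains no whitespace.
	set -- $1
	# An unmatched glob stays as the literal pattern, so also check
	# that the single result really exists.
	test $# -eq 1 && test -e "$1" && printf '%s\n' "$1"
}

repo=$(mktemp -d)
: >"$repo/sharedindex.aaa"
single_match "$repo/sharedindex.*"
: >"$repo/sharedindex.bbb"
single_match "$repo/sharedindex.*" || echo "more than one shared index"
```

The first call prints the single matching path; the second call fails
because the glob now expands to two names.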


[PATCH v3 2/3] t1301: move modebits() to test-lib-functions.sh

2017-06-24 Thread Christian Couder
As the modebits() function can be useful outside t1301,
let's move it into test-lib-functions.sh, and while at
it let's rename it test_modebits().

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t1301-shared-repo.sh  | 18 +++---
 t/test-lib-functions.sh |  5 +
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/t/t1301-shared-repo.sh b/t/t1301-shared-repo.sh
index 1312004f8c..dfece751b5 100755
--- a/t/t1301-shared-repo.sh
+++ b/t/t1301-shared-repo.sh
@@ -19,10 +19,6 @@ test_expect_success 'shared = 0400 (faulty permission u-w)' '
)
 '
 
-modebits () {
-   ls -l "$1" | sed -e 's|^\(..........\).*|\1|'
-}
-
 for u in 002 022
 do
test_expect_success POSIXPERM "shared=1 does not clear bits preset by 
umask $u" '
@@ -88,7 +84,7 @@ do
 
rm -f .git/info/refs &&
git update-server-info &&
-   actual="$(modebits .git/info/refs)" &&
+   actual="$(test_modebits .git/info/refs)" &&
verbose test "x$actual" = "x-$y"
 
'
@@ -98,7 +94,7 @@ do
 
rm -f .git/info/refs &&
git update-server-info &&
-   actual="$(modebits .git/info/refs)" &&
+   actual="$(test_modebits .git/info/refs)" &&
verbose test "x$actual" = "x-$x"
 
'
@@ -111,7 +107,7 @@ test_expect_success POSIXPERM 'info/refs respects umask in 
unshared repo' '
umask 002 &&
git update-server-info &&
echo "-rw-rw-r--" >expect &&
-   modebits .git/info/refs >actual &&
+   test_modebits .git/info/refs >actual &&
test_cmp expect actual
 '
 
@@ -177,7 +173,7 @@ test_expect_success POSIXPERM 'remote init does not use 
config from cwd' '
umask 0022 &&
git init --bare child.git &&
echo "-rw-r--r--" >expect &&
-   modebits child.git/config >actual &&
+   test_modebits child.git/config >actual &&
test_cmp expect actual
 '
 
@@ -187,7 +183,7 @@ test_expect_success POSIXPERM 're-init respects 
core.sharedrepository (local)' '
echo whatever >templates/foo &&
git init --template=templates &&
echo "-rw-rw-rw-" >expect &&
-   modebits .git/foo >actual &&
+   test_modebits .git/foo >actual &&
test_cmp expect actual
 '
 
@@ -198,7 +194,7 @@ test_expect_success POSIXPERM 're-init respects 
core.sharedrepository (remote)'
test_path_is_missing child.git/foo &&
git init --bare --template=../templates child.git &&
echo "-rw-rw-rw-" >expect &&
-   modebits child.git/foo >actual &&
+   test_modebits child.git/foo >actual &&
test_cmp expect actual
 '
 
@@ -209,7 +205,7 @@ test_expect_success POSIXPERM 'template can set 
core.sharedrepository' '
cp .git/config templates/config &&
git init --bare --template=../templates child.git &&
echo "-rw-rw-rw-" >expect &&
-   modebits child.git/HEAD >actual &&
+   test_modebits child.git/HEAD >actual &&
test_cmp expect actual
 '
 
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 5ee124332a..db622c3555 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -216,6 +216,11 @@ test_chmod () {
git update-index --add "--chmod=$@"
 }
 
+# Get the modebits from a file.
+test_modebits () {
+   ls -l "$1" | sed -e 's|^\(..........\).*|\1|'
+}
+
 # Unset a configuration variable, but don't fail if it doesn't exist.
 test_unconfig () {
config_dir=
-- 
2.13.1.517.gf6399a5ea5
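
For readers unfamiliar with the helper, here is a standalone sketch of
what it computes: the first ten characters of an "ls -l" line, that is
the file type plus the nine permission characters. file_modebits is a
hypothetical stand-in name so that the sketch runs outside the test
suite.

```shell
#!/bin/sh
# Sketch: print the ten-character type-and-permission field of "ls -l".
# file_modebits is a hypothetical stand-in for test_modebits.
file_modebits () {
	# Keep only the first ten characters of the line; anything some
	# systems append after the permission field is dropped as well.
	ls -l "$1" | sed -e 's|^\(..........\).*|\1|'
}

demo_dir=$(mktemp -d) &&
: >"$demo_dir/sample" &&
chmod 0640 "$demo_dir/sample" &&
# On a POSIX filesystem this should print: -rw-r-----
file_modebits "$demo_dir/sample"
```

The result can then be compared with an expected string such as
"-rw-rw-rw-" using test_cmp, as the tests in t1301 do.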



[PATCH v3 3/3] t1700: make sure split-index respects core.sharedrepository

2017-06-24 Thread Christian Couder
Add a few tests to check that both the split-index file and the
shared-index file are created using the right permissions when
core.sharedrepository is set.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t1700-split-index.sh | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index af3ec0da5a..22f69a410b 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -370,4 +370,34 @@ test_expect_success 'check splitIndex.sharedIndexExpire 
set to "never" and "now"
test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
+while read -r mode modebits
+do
+   test_expect_success POSIXPERM "split index respects 
core.sharedrepository $mode" '
+   # Remove existing shared index files
+   git config core.splitIndex false &&
+   git update-index --force-remove one &&
+   rm -f .git/sharedindex.* &&
+   # Create one new shared index file
+   git config core.sharedrepository "$mode" &&
+   git config core.splitIndex true &&
+   : >one &&
+   git update-index --add one &&
+   echo "$modebits" >expect &&
+   test_modebits .git/index >actual &&
+   test_cmp expect actual &&
+   shared=$(ls .git/sharedindex.*) &&
+   case "$shared" in
+   *" "*)
+   # we have more than one???
+   false ;;
+   *)
+   test_modebits "$shared" >actual &&
+   test_cmp expect actual ;;
+   esac
+   '
+done <<\EOF
+0666 -rw-rw-rw-
+0642 -rw-r---w-
+EOF
+
 test_done
-- 
2.13.1.517.gf6399a5ea5
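
The table-driven shape of the loop above can be tried outside the test
suite with a small sketch. It makes one check per "mode modebits" row of
a here-document; chmod is only a stand-in for the effect that
core.sharedrepository has on files Git writes, and check_mode is a
hypothetical helper name.

```shell
#!/bin/sh
# Sketch: run one check per "mode modebits" row of a here-document.
# check_mode is a hypothetical helper; chmod stands in for the effect of
# core.sharedrepository on files written by Git.
check_mode () {
	# $1: file, $2: octal mode, $3: expected "ls -l" permission field
	chmod "$2" "$1" &&
	actual=$(ls -l "$1" | cut -c1-10) &&
	test "$actual" = "$3"
}

scratch=$(mktemp -d) &&
: >"$scratch/file" &&
while read -r mode modebits
do
	if check_mode "$scratch/file" "$mode" "$modebits"
	then
		echo "ok $mode $modebits"
	else
		echo "not ok $mode $modebits"
	fi
done <<\EOF
0666 -rw-rw-rw-
0642 -rw-r---w-
EOF
```

Using two different rows means a pass cannot come from the default
permissions alone, and the quoted \EOF keeps the shell from expanding
anything inside the table.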



[PATCH v3 1/3] read-cache: use shared perms when writing shared index

2017-06-24 Thread Christian Couder
Since f6ecc62dbf (write_shared_index(): use tempfile module, 2015-08-10)
write_shared_index() has been using mks_tempfile() to create the
temporary file that will become the shared index.

But even before that, it looks like the functions used to create this
file didn't call adjust_shared_perm(), which means that the shared
index file has always been created with 600 permissions regardless
of the shared permission settings.

Because of that, on repositories created with `git init --shared=all`
and using the split index feature, one gets an error like:

fatal: .git/sharedindex.a52f910b489bc462f187ab572ba0086f7b5157de: index file 
open failed: Permission denied

when another user performs any operation that reads the shared index.

Call adjust_shared_perm() on the temporary file created by
mks_tempfile() ourselves to adjust the permission bits.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 read-cache.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index bc156a133e..1f4ec1b022 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2425,6 +2425,14 @@ static int write_shared_index(struct index_state *istate,
delete_tempfile(&temporary_sharedindex);
return ret;
}
+   ret = adjust_shared_perm(get_tempfile_path(&temporary_sharedindex));
+   if (ret) {
+   int save_errno = errno;
+   error("cannot fix permission bits on %s", get_tempfile_path(&temporary_sharedindex));
+   delete_tempfile(&temporary_sharedindex);
+   errno = save_errno;
+   return ret;
+   }
ret = rename_tempfile(&temporary_sharedindex,
  git_path("sharedindex.%s", sha1_to_hex(si->base->sha1)));
if (!ret) {
-- 
2.13.1.517.gf6399a5ea5



Crash in t3507-cherry-pick-conflict.sh with GIT_TEST_SPLIT_INDEX set

2017-06-24 Thread Christian Couder
Git from the master branch currently segfaults when running
t3507-cherry-pick-conflict.sh with GIT_TEST_SPLIT_INDEX set:

expecting success:
pristine_detach initial &&
test_must_fail git cherry-pick -s picked-signed &&
git commit -a -s &&
test $(git show -s |grep -c "Signed-off-by") = 1

HEAD is now at df2a63d... initial
Auto-merging foo
CONFLICT (content): Merge conflict in foo
error: could not apply e4ca149... picked-signed
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add ' or 'git rm '
hint: and commit the result with 'git commit'
Segmentation fault
not ok 29 - commit after failed cherry-pick does not add duplicated -s
#
#   pristine_detach initial &&
#   test_must_fail git cherry-pick -s picked-signed &&
#   git commit -a -s &&
#   test $(git show -s |grep -c "Signed-off-by") = 1
#

The crash happens during the `git commit -a -s` with a backtrace like this:

Program received signal SIGSEGV, Segmentation fault.
0x0050bcf1 in entry_equals (map=0x8cef80 ,
e1=0xa2d2d2d2d2d2d00, e2=0x7fffce10,
keydata=0x0) at hashmap.c:98
98  return (e1 == e2) || (e1->hash == e2->hash &&
!map->cmpfn(e1, e2, keydata));
(gdb) bt
#0  0x0050bcf1 in entry_equals (map=0x8cef80 ,
e1=0xa2d2d2d2d2d2d00, e2=0x7fffce10,
keydata=0x0) at hashmap.c:98
#1  0x0050bec6 in find_entry_ptr (map=0x8cef80 ,
key=0x7fffce10, keydata=0x0)
at hashmap.c:138
#2  0x0050c044 in hashmap_get (map=0x8cef80 ,
key=0x7fffce10, keydata=0x0)
at hashmap.c:182
#3  0x00525a1d in hashmap_get_from_hash (map=0x8cef80
, hash=1625022057, keydata=0x0)
at hashmap.h:78
#4  0x00526edd in index_file_exists (istate=0x8cef40
, name=0x8f19d0 "unrelated",
namelen=9, icase=0) at name-hash.c:701
#5  0x004f55ba in treat_one_path (dir=0x7fffd0b0,
untracked=0x0, istate=0x8cef40 ,
path=0x7fffcf80, baselen=0, pathspec=0x88c2b8 , dtype=8,
de=0x8f8a00) at dir.c:1550
#6  0x004f5914 in treat_path (dir=0x7fffd0b0,
untracked=0x0, cdir=0x7fffcfa0,
istate=0x8cef40 , path=0x7fffcf80, baselen=0,
pathspec=0x88c2b8 ) at dir.c:1660
#7  0x004f6006 in read_directory_recursive
(dir=0x7fffd0b0, istate=0x8cef40 ,
base=0x61561b "", baselen=0, untracked=0x0, check_only=0,
pathspec=0x88c2b8 ) at dir.c:1809

It bisects to f9d7abec2a (split-index: add and use
unshare_split_index(), 2017-05-05), which fixes memory leaks when
discarding the index.
It looks like we are freeing some cache entries that we shouldn't free.


[PATCH v2 2/3] t1301: move modebits() to test-lib-functions.sh

2017-06-23 Thread Christian Couder
As the modebits() function can be useful outside t1301,
let's move it into test-lib-functions.sh, and while at
it let's rename it test_modebits().

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t1301-shared-repo.sh  | 18 +++---
 t/test-lib-functions.sh |  5 +
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/t/t1301-shared-repo.sh b/t/t1301-shared-repo.sh
index 1312004f8c..dfece751b5 100755
--- a/t/t1301-shared-repo.sh
+++ b/t/t1301-shared-repo.sh
@@ -19,10 +19,6 @@ test_expect_success 'shared = 0400 (faulty permission u-w)' '
)
 '
 
-modebits () {
-   ls -l "$1" | sed -e 's|^\(..........\).*|\1|'
-}
-
 for u in 002 022
 do
test_expect_success POSIXPERM "shared=1 does not clear bits preset by 
umask $u" '
@@ -88,7 +84,7 @@ do
 
rm -f .git/info/refs &&
git update-server-info &&
-   actual="$(modebits .git/info/refs)" &&
+   actual="$(test_modebits .git/info/refs)" &&
verbose test "x$actual" = "x-$y"
 
'
@@ -98,7 +94,7 @@ do
 
rm -f .git/info/refs &&
git update-server-info &&
-   actual="$(modebits .git/info/refs)" &&
+   actual="$(test_modebits .git/info/refs)" &&
verbose test "x$actual" = "x-$x"
 
'
@@ -111,7 +107,7 @@ test_expect_success POSIXPERM 'info/refs respects umask in 
unshared repo' '
umask 002 &&
git update-server-info &&
echo "-rw-rw-r--" >expect &&
-   modebits .git/info/refs >actual &&
+   test_modebits .git/info/refs >actual &&
test_cmp expect actual
 '
 
@@ -177,7 +173,7 @@ test_expect_success POSIXPERM 'remote init does not use 
config from cwd' '
umask 0022 &&
git init --bare child.git &&
echo "-rw-r--r--" >expect &&
-   modebits child.git/config >actual &&
+   test_modebits child.git/config >actual &&
test_cmp expect actual
 '
 
@@ -187,7 +183,7 @@ test_expect_success POSIXPERM 're-init respects 
core.sharedrepository (local)' '
echo whatever >templates/foo &&
git init --template=templates &&
echo "-rw-rw-rw-" >expect &&
-   modebits .git/foo >actual &&
+   test_modebits .git/foo >actual &&
test_cmp expect actual
 '
 
@@ -198,7 +194,7 @@ test_expect_success POSIXPERM 're-init respects 
core.sharedrepository (remote)'
test_path_is_missing child.git/foo &&
git init --bare --template=../templates child.git &&
echo "-rw-rw-rw-" >expect &&
-   modebits child.git/foo >actual &&
+   test_modebits child.git/foo >actual &&
test_cmp expect actual
 '
 
@@ -209,7 +205,7 @@ test_expect_success POSIXPERM 'template can set 
core.sharedrepository' '
cp .git/config templates/config &&
git init --bare --template=../templates child.git &&
echo "-rw-rw-rw-" >expect &&
-   modebits child.git/HEAD >actual &&
+   test_modebits child.git/HEAD >actual &&
test_cmp expect actual
 '
 
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 5ee124332a..db622c3555 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -216,6 +216,11 @@ test_chmod () {
git update-index --add "--chmod=$@"
 }
 
+# Get the modebits from a file.
+test_modebits () {
+   ls -l "$1" | sed -e 's|^\(..........\).*|\1|'
+}
+
 # Unset a configuration variable, but don't fail if it doesn't exist.
 test_unconfig () {
config_dir=
-- 
2.13.1.519.g0a0746bea4



[PATCH v2 3/3] t1700: make sure split-index respects core.sharedrepository

2017-06-23 Thread Christian Couder
Add a few tests to check that both the split-index file and the
shared-index file are created using the right permissions when
core.sharedrepository is set.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t1700-split-index.sh | 17 +
 1 file changed, 17 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index af3ec0da5a..2c5be732e4 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -370,4 +370,21 @@ test_expect_success 'check splitIndex.sharedIndexExpire 
set to "never" and "now"
test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
+while read -r mode modebits filename; do
+   test_expect_success POSIXPERM "split index respects 
core.sharedrepository $mode" '
+   git config core.sharedrepository "$mode" &&
+   : >"$filename" &&
+   git update-index --add "$filename" &&
+   echo "$modebits" >expect &&
+   test_modebits .git/index >actual &&
+   test_cmp expect actual &&
+   newest_shared_index=$(ls -t .git/sharedindex.* | head -1) &&
+   test_modebits "$newest_shared_index" >actual &&
+   test_cmp expect actual
+   '
+done <<\EOF
+0666 -rw-rw-rw- seventeen
+0642 -rw-r---w- eightteen
+EOF
+
 test_done
-- 
2.13.1.519.g0a0746bea4



[PATCH v2 1/3] read-cache: use shared perms when writing shared index

2017-06-23 Thread Christian Couder
Since f6ecc62dbf (write_shared_index(): use tempfile module, 2015-08-10)
write_shared_index() has been using mks_tempfile() to create the
temporary file that will become the shared index.

But even before that, it looks like the functions used to create this
file didn't call adjust_shared_perm(), which means that the shared
index file has always been created with 600 permissions regardless
of the shared permission settings.

Because of that, on repositories created with `git init --shared=all`
and using the split index feature, one gets an error like:

fatal: .git/sharedindex.a52f910b489bc462f187ab572ba0086f7b5157de: index file 
open failed: Permission denied

when another user performs any operation that reads the shared index.

We could use create_tempfile() that calls adjust_shared_perm(), but
unfortunately create_tempfile() doesn't replace the XXXXXX at the end
of the path it is passed. So let's just call adjust_shared_perm() by
ourselves.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 read-cache.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index bc156a133e..66f85f8d58 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2425,6 +2425,14 @@ static int write_shared_index(struct index_state *istate,
delete_tempfile(&temporary_sharedindex);
return ret;
}
+   ret = adjust_shared_perm(temporary_sharedindex.filename.buf);
+   if (ret) {
+   int save_errno = errno;
+   error("cannot fix permission bits on %s", temporary_sharedindex.filename.buf);
+   delete_tempfile(&temporary_sharedindex);
+   errno = save_errno;
+   return ret;
+   }
ret = rename_tempfile(&temporary_sharedindex,
  git_path("sharedindex.%s", sha1_to_hex(si->base->sha1)));
if (!ret) {
-- 
2.13.1.519.g0a0746bea4



Re: [PATCH 1/3] read-cache: use shared perms when writing shared index

2017-06-23 Thread Christian Couder
On Thu, Jun 22, 2017 at 9:51 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Christian Couder <christian.cou...@gmail.com> writes:

>> Let's fix that by using create_tempfile() instead of mks_tempfile()
>> to create the shared index file.
>>
>> ...
>> - fd = mks_tempfile(&temporary_sharedindex, git_path("sharedindex_XXXXXX"));
>> + fd = create_tempfile(&temporary_sharedindex, git_path("sharedindex_XXXXXX"));
>
> So we used to create a temporary file that made sure its name is
> unique but now we create sharedindex_XXXXXX with 6 X's literally
> at the end?
>
> Doesn't mks_tempfile() family include a variant where you can give
> custom mode?  Better yet, perhaps you can call adjust_shared_perm()
> on the path _after_ seeing that mks_tempfile() succeeds (you can ask
> get_tempfile_path() which path to adjust, I presume)?

Yeah, adjust_shared_perm() is called after mks_tempfile() succeeds, in
the next version.


Re: [PATCH 2/3] t1301: move movebits() to test-lib-functions.sh

2017-06-23 Thread Christian Couder
On Fri, Jun 23, 2017 at 1:09 AM, Ramsay Jones
<ram...@ramsayjones.plus.com> wrote:
>
>
> On 22/06/17 20:52, Junio C Hamano wrote:
>> Christian Couder <christian.cou...@gmail.com> writes:
>>
>>> As the movebits() function can be useful outside t1301,
>>> let's move it into test-lib-functions.sh, and while at
>>> it let's rename it test_movebits().
>>
>> Good thinking, especially on the renaming.
>
> Err, except for the commit message! :-D
>
> Both the commit message subject and the commit message body
> refer to _move_bits() rather than _mode_bits() etc.
> (So, three instances of s/move/mode/).

Yeah, sorry about that. This is fixed in the version I will send
really soon now.


Re: [PATCH 3/3] t1700: make sure split-index respects core.sharedrepository

2017-06-22 Thread Christian Couder
On Thu, Jun 22, 2017 at 10:25 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Christian Couder <christian.cou...@gmail.com> writes:
>
>> We use "git config core.sharedrepository 0666" at the beginning of
>> this test, so it will only apply to the shared index files that are
>> created after that.
>>
>> Do you suggest that we test before setting core.sharedrepository that
>> the existing shared index files all have the default permissions?
>
> I think it would be sensible to see at least two values appear.
> Otherwise we cannot tell if the right value is coming by accident
> (because it was the default) or by design (because the configuration
> is correctly read).

Ok, I think I will use something like this then:

while read -r mode modebits filename; do
test_expect_success POSIXPERM "split index respects
core.sharedrepository $mode" '
git config core.sharedrepository "$mode" &&
: >"$filename" &&
git update-index --add "$filename" &&
echo "$modebits" >expect &&
test_modebits .git/index >actual &&
test_cmp expect actual &&
newest_shared_index=$(ls -t .git/sharedindex.* | head -1) &&
test_modebits "$newest_shared_index" >actual &&
test_cmp expect actual
'
done <<\EOF
0666 -rw-rw-rw- seventeen
0642 -rw-r---w- eightteen
EOF


Re: [PATCH 3/3] t1700: make sure split-index respects core.sharedrepository

2017-06-22 Thread Christian Couder
On Thu, Jun 22, 2017 at 9:53 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Christian Couder <christian.cou...@gmail.com> writes:
>
>> Add a test to check that both the split-index file and the
>> shared-index file are created using the right permissions
>> when core.sharedrepository is set.
>>
>> Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
>> ---
>>  t/t1700-split-index.sh | 12 
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
>> index af3ec0da5a..a52b92e82b 100755
>> --- a/t/t1700-split-index.sh
>> +++ b/t/t1700-split-index.sh
>> @@ -370,4 +370,16 @@ test_expect_success 'check splitIndex.sharedIndexExpire 
>> set to "never" and "now"
>>   test $(ls .git/sharedindex.* | wc -l) -le 2
>>  '
>>
>> +test_expect_success POSIXPERM 'split index respects core.sharedrepository' '
>> + git config core.sharedrepository 0666 &&
>> + : >seventeen &&
>> + git update-index --add seventeen &&
>> + echo "-rw-rw-rw-" >expect &&
>> + test_modebits .git/index >actual &&
>> + test_cmp expect actual &&
>> + newest_shared_index=$(ls -t .git/sharedindex.* | head -1) &&
>
> Hmph.  Don't you want to make sure all of them, not just the latest
> one, have the expected mode bits?

We use "git config core.sharedrepository 0666" at the beginning of
this test, so it will only apply to the shared index files that are
created after that.

Do you suggest that we test before setting core.sharedrepository that
the existing shared index files all have the default permissions?


[PATCH 3/3] t1700: make sure split-index respects core.sharedrepository

2017-06-22 Thread Christian Couder
Add a test to check that both the split-index file and the
shared-index file are created using the right permissions
when core.sharedrepository is set.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t1700-split-index.sh | 12 
 1 file changed, 12 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index af3ec0da5a..a52b92e82b 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -370,4 +370,16 @@ test_expect_success 'check splitIndex.sharedIndexExpire 
set to "never" and "now"
test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
+test_expect_success POSIXPERM 'split index respects core.sharedrepository' '
+   git config core.sharedrepository 0666 &&
+   : >seventeen &&
+   git update-index --add seventeen &&
+   echo "-rw-rw-rw-" >expect &&
+   test_modebits .git/index >actual &&
+   test_cmp expect actual &&
+   newest_shared_index=$(ls -t .git/sharedindex.* | head -1) &&
+   test_modebits "$newest_shared_index" >actual &&
+   test_cmp expect actual
+'
+
 test_done
-- 
2.13.1.516.g05ec6e13aa



[PATCH 1/3] read-cache: use shared perms when writing shared index

2017-06-22 Thread Christian Couder
Since f6ecc62dbf (write_shared_index(): use tempfile module, 2015-08-10)
write_shared_index() has been using mks_tempfile() to create the
temporary file that will become the shared index.

But even before that, it looks like the functions used to create this
file didn't call adjust_shared_perm(), which means that the shared
index file has always been created with 600 permissions regardless
of the shared permission settings.

This means that on repositories created with `git init --shared=all`
and using the split index feature one gets an error like:

fatal: .git/sharedindex.a52f910b489bc462f187ab572ba0086f7b5157de: index file 
open failed: Permission denied

when another user performs any operation that reads the shared index.

Let's fix that by using create_tempfile() instead of mks_tempfile()
to create the shared index file.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index bc156a133e..eb71e93aa4 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2414,7 +2414,7 @@ static int write_shared_index(struct index_state *istate,
struct split_index *si = istate->split_index;
int fd, ret;
 
-   fd = mks_tempfile(&temporary_sharedindex, git_path("sharedindex_XXXXXX"));
+   fd = create_tempfile(&temporary_sharedindex, git_path("sharedindex_XXXXXX"));
if (fd < 0) {
hashclr(si->base_sha1);
return do_write_locked_index(istate, lock, flags);
-- 
2.13.1.516.g05ec6e13aa



[PATCH 2/3] t1301: move movebits() to test-lib-functions.sh

2017-06-22 Thread Christian Couder
As the movebits() function can be useful outside t1301,
let's move it into test-lib-functions.sh, and while at
it let's rename it test_movebits().

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t1301-shared-repo.sh  | 18 +++---
 t/test-lib-functions.sh |  5 +
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/t/t1301-shared-repo.sh b/t/t1301-shared-repo.sh
index 1312004f8c..dfece751b5 100755
--- a/t/t1301-shared-repo.sh
+++ b/t/t1301-shared-repo.sh
@@ -19,10 +19,6 @@ test_expect_success 'shared = 0400 (faulty permission u-w)' '
)
 '
 
-modebits () {
-   ls -l "$1" | sed -e 's|^\(..........\).*|\1|'
-}
-
 for u in 002 022
 do
test_expect_success POSIXPERM "shared=1 does not clear bits preset by 
umask $u" '
@@ -88,7 +84,7 @@ do
 
rm -f .git/info/refs &&
git update-server-info &&
-   actual="$(modebits .git/info/refs)" &&
+   actual="$(test_modebits .git/info/refs)" &&
verbose test "x$actual" = "x-$y"
 
'
@@ -98,7 +94,7 @@ do
 
rm -f .git/info/refs &&
git update-server-info &&
-   actual="$(modebits .git/info/refs)" &&
+   actual="$(test_modebits .git/info/refs)" &&
verbose test "x$actual" = "x-$x"
 
'
@@ -111,7 +107,7 @@ test_expect_success POSIXPERM 'info/refs respects umask in 
unshared repo' '
umask 002 &&
git update-server-info &&
echo "-rw-rw-r--" >expect &&
-   modebits .git/info/refs >actual &&
+   test_modebits .git/info/refs >actual &&
test_cmp expect actual
 '
 
@@ -177,7 +173,7 @@ test_expect_success POSIXPERM 'remote init does not use 
config from cwd' '
umask 0022 &&
git init --bare child.git &&
echo "-rw-r--r--" >expect &&
-   modebits child.git/config >actual &&
+   test_modebits child.git/config >actual &&
test_cmp expect actual
 '
 
@@ -187,7 +183,7 @@ test_expect_success POSIXPERM 're-init respects 
core.sharedrepository (local)' '
echo whatever >templates/foo &&
git init --template=templates &&
echo "-rw-rw-rw-" >expect &&
-   modebits .git/foo >actual &&
+   test_modebits .git/foo >actual &&
test_cmp expect actual
 '
 
@@ -198,7 +194,7 @@ test_expect_success POSIXPERM 're-init respects 
core.sharedrepository (remote)'
test_path_is_missing child.git/foo &&
git init --bare --template=../templates child.git &&
echo "-rw-rw-rw-" >expect &&
-   modebits child.git/foo >actual &&
+   test_modebits child.git/foo >actual &&
test_cmp expect actual
 '
 
@@ -209,7 +205,7 @@ test_expect_success POSIXPERM 'template can set 
core.sharedrepository' '
cp .git/config templates/config &&
git init --bare --template=../templates child.git &&
echo "-rw-rw-rw-" >expect &&
-   modebits child.git/HEAD >actual &&
+   test_modebits child.git/HEAD >actual &&
test_cmp expect actual
 '
 
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 5ee124332a..db622c3555 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -216,6 +216,11 @@ test_chmod () {
git update-index --add "--chmod=$@"
 }
 
+# Get the modebits from a file.
+test_modebits () {
+   ls -l "$1" | sed -e 's|^\(..........\).*|\1|'
+}
+
 # Unset a configuration variable, but don't fail if it doesn't exist.
 test_unconfig () {
config_dir=
-- 
2.13.1.516.g05ec6e13aa



Re: [GSoC][PATCH 2/6] submodule--helper: introduce get_submodule_displaypath and for_each_submodule_list

2017-06-22 Thread Christian Couder
On Mon, Jun 19, 2017 at 11:50 PM, Prathamesh Chavan  wrote:

> +static char *get_submodule_displaypath(const char *path, const char *prefix)
> +{
> +   const char *super_prefix = get_super_prefix();
> +
> +   if (prefix && super_prefix) {
> +   BUG("cannot have prefix '%s' and superprefix '%s'",
> +   prefix, super_prefix);
> +   } else if (prefix) {
> +   struct strbuf sb = STRBUF_INIT;
> +   char *displaypath = xstrdup(relative_path(path, prefix, &sb));
> +   strbuf_release(&sb);
> +   return displaypath;
> +   } else if (super_prefix) {
> +   int len = strlen(super_prefix);
> +   const char *format = is_dir_sep(super_prefix[len-1]) ? "%s%s" 
> : "%s/%s";

Style nit: please add spaces around "-", so "len - 1" instead of "len-1".

> +   return xstrfmt(format, super_prefix, path);
> +   } else {
> +   return xstrdup(path);
> +   }
> +}


Re: [GSoC][PATCH 5/6] submodule: port submodule subcommand sync from shell to C

2017-06-22 Thread Christian Couder
On Mon, Jun 19, 2017 at 11:50 PM, Prathamesh Chavan  wrote:

> +static char *get_up_path(const char *path)
> +{
> +   int i = count_slashes(path);
> +   struct strbuf sb = STRBUF_INIT;
> +
> +   while (i--)
> +   strbuf_addstr(&sb, "../");
> +
> +   /*
> +*Check if 'path' ends with slash or not
> +*for having the same output for dir/sub_dir
> +*and dir/sub_dir/
> +*/
> +   if (!is_dir_sep(path[i - 1]))

i is always -1 here, as the loop above decrements it once more after the
test sees 0, so path[i - 1] reads before the start of 'path'.

> +   strbuf_addstr(&sb, "../");
> +
> +   return strbuf_detach(&sb, NULL);
> +}


Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support

2017-06-20 Thread Christian Couder
On Tue, Jun 20, 2017 at 9:54 AM, Christian Couder
<christian.cou...@gmail.com> wrote:
>
> Future work
> ~~~
>
> First, sorry about the state of this patch series. It is not as clean
> as I would have liked, but I think it is worth getting feedback
> from the mailing list at this point, because the previous RFC was sent
> a long time ago and a lot of things have changed.
>
> So a big part of the future work will be about cleaning this patch series.
>
> Other things I think I am going to do:
>
>   -

Oops, I had not saved the Emacs buffer where I wrote this when I sent
the patch series.

This should have been:

Other things I think I may work on:

  - Remove the "odb..scriptMode" and "odb..command"
options and instead have just "odb..scriptCommand" and
"odb..subprocessCommand".

  - Use capabilities instead of "odb..fetchKind" to decide
which kind of "get" will be used.

  - Better test all the combinations of the above modes with and
without "have" and "put" instructions.

  - Maybe also have different kinds of "put" so that Git could pass
either a Git object or a plain object, or ask the helper to retrieve
it directly from Git's object database.

  - Maybe add an "init" instruction. The script mode already has
something like this called "get_cap", and it would help the
sub-process mode too, as it would let Git know the capabilities
before trying to send any instruction (that might not be supported
by the helper). The "init" instruction would be the only required
instruction for any helper to implement.

  - Add more long running tests and improve tests in general.


[RFC/PATCH v4 04/49] Add Git/Packet.pm from parts of t0021/rot13-filter.pl

2017-06-20 Thread Christian Couder
This will make it possible to reuse packet reading and writing
functions in other test scripts.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 perl/Git/Packet.pm | 71 ++
 1 file changed, 71 insertions(+)
 create mode 100644 perl/Git/Packet.pm

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
new file mode 100644
index 00..aaffecbe2a
--- /dev/null
+++ b/perl/Git/Packet.pm
@@ -0,0 +1,71 @@
+package Git::Packet;
+use 5.008;
+use strict;
+use warnings;
+BEGIN {
+   require Exporter;
+   if ($] < 5.008003) {
+   *import = \&Exporter::import;
+   } else {
+   # Exporter 5.57 which supports this invocation was
+   # released with perl 5.8.3
+   Exporter->import('import');
+   }
+}
+
+our @EXPORT = qw(
+   packet_bin_read
+   packet_txt_read
+   packet_bin_write
+   packet_txt_write
+   packet_flush
+   );
+our @EXPORT_OK = @EXPORT;
+
+sub packet_bin_read {
+   my $buffer;
+   my $bytes_read = read STDIN, $buffer, 4;
+   if ( $bytes_read == 0 ) {
+   # EOF - Git stopped talking to us!
+   return ( -1, "" );
+   } elsif ( $bytes_read != 4 ) {
+   die "invalid packet: '$buffer'";
+   }
+   my $pkt_size = hex($buffer);
+   if ( $pkt_size == 0 ) {
+   return ( 1, "" );
+   } elsif ( $pkt_size > 4 ) {
+   my $content_size = $pkt_size - 4;
+   $bytes_read = read STDIN, $buffer, $content_size;
+   if ( $bytes_read != $content_size ) {
+   die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
+   }
+   return ( 0, $buffer );
+   } else {
+   die "invalid packet size: $pkt_size";
+   }
+}
+
+sub packet_txt_read {
+   my ( $res, $buf ) = packet_bin_read();
+   unless ( $res == -1 || $buf =~ s/\n$// ) {
+   die "A non-binary line MUST be terminated by an LF.";
+   }
+   return ( $res, $buf );
+}
+
+sub packet_bin_write {
+   my $buf = shift;
+   print STDOUT sprintf( "%04x", length($buf) + 4 );
+   print STDOUT $buf;
+   STDOUT->flush();
+}
+
+sub packet_txt_write {
+   packet_bin_write( $_[0] . "\n" );
+}
+
+sub packet_flush {
+   print STDOUT sprintf( "%04x", 0 );
+   STDOUT->flush();
+}
-- 
2.13.1.565.gbfcd7a9048
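
For readers unfamiliar with the framing handled by packet_bin_read()
and packet_bin_write() above: each packet starts with four hex digits
giving the total length (payload plus the 4 header bytes), and "0000"
is the flush packet. A minimal C sketch of the same framing follows.
pkt_line_format() and pkt_line_payload_len() are hypothetical names
used only for illustration, and the 0xffff bound is simply what four
hex digits can encode, not necessarily the limit Git enforces.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write a 4-hex-digit length header followed by the payload into
 * 'out'. Returns the total number of bytes written, or -1 if the
 * packet does not fit. The result is not NUL-terminated.
 */
int pkt_line_format(char *out, size_t outsz, const char *payload, size_t len)
{
    char hdr[5];
    size_t total = len + 4;

    if (total > 0xffff || total > outsz)
        return -1;
    snprintf(hdr, sizeof(hdr), "%04x", (unsigned int)total);
    memcpy(out, hdr, 4);
    memcpy(out + 4, payload, len);
    return (int)total;
}

static int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Decode a 4-byte header. Returns the payload length, -1 for a
 * flush packet ("0000"), or -2 for a malformed header or a size
 * that is not larger than the header itself.
 */
int pkt_line_payload_len(const char *hdr)
{
    int size = 0;
    int i;

    for (i = 0; i < 4; i++) {
        int v = hex_val(hdr[i]);
        if (v < 0)
            return -2;
        size = size * 16 + v;
    }
    if (size == 0)
        return -1;
    if (size <= 4)
        return -2;
    return size - 4;
}
```

For example, the 6-byte payload "hello\n" is framed as "000ahello\n".
Unlike the Perl code, this sketch rejects non-hex headers explicitly
instead of relying on hex().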



[RFC/PATCH v4 07/49] Git/Packet.pm: add packet_initialize()

2017-06-20 Thread Christian Couder
Add a function to initialize the communication. And use this
function in 't/t0021/rot13-filter.pl'.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 perl/Git/Packet.pm  | 13 +
 t/t0021/rot13-filter.pl |  8 +---
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
index 2ad6b00d6c..b0233caf37 100644
--- a/perl/Git/Packet.pm
+++ b/perl/Git/Packet.pm
@@ -19,6 +19,7 @@ our @EXPORT = qw(
packet_bin_write
packet_txt_write
packet_flush
+   packet_initialize
);
 our @EXPORT_OK = @EXPORT;
 
@@ -70,3 +71,15 @@ sub packet_flush {
print STDOUT sprintf( "%04x", 0 );
STDOUT->flush();
 }
+
+sub packet_initialize {
+   my ($name, $version) = @_;
+
+   ( packet_txt_read() eq ( 0, $name . "-client" ) )   || die "bad initialize";
+   ( packet_txt_read() eq ( 0, "version=" . $version ) )   || die "bad version";
+   ( packet_bin_read() eq ( 1, "" ) )  || die "bad version end";
+
+   packet_txt_write( $name . "-server" );
+   packet_txt_write( "version=" . $version );
+   packet_flush();
+}
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 36a9eb3608..5b05518640 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -40,13 +40,7 @@ sub rot13 {
 print $debug "START\n";
 $debug->flush();
 
-( packet_txt_read() eq ( 0, "git-filter-client" ) ) || die "bad initialize";
-( packet_txt_read() eq ( 0, "version=2" ) ) || die "bad version";
-( packet_bin_read() eq ( 1, "" ) )  || die "bad version end";
-
-packet_txt_write("git-filter-server");
-packet_txt_write("version=2");
-packet_flush();
+packet_initialize("git-filter", 2);
 
 ( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
 ( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
-- 
2.13.1.565.gbfcd7a9048
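
The reply half of packet_initialize() above writes two text packets
followed by a flush packet. The bytes that end up on the wire can be
sketched in C as below. build_server_hello() and its helpers are
hypothetical names for illustration only, and the sketch writes into a
buffer instead of stdout.

```c
#include <stdio.h>
#include <string.h>

/* Append "<4 hex digits><text>\n" to buf at *off. Returns 0 or -1. */
static int append_txt_pkt(char *buf, size_t bufsz, size_t *off,
                          const char *text)
{
    size_t total = strlen(text) + 1 + 4; /* text + LF + header */

    /* Keep room for the NUL that snprintf() appends. */
    if (total > 0xffff || *off + total + 1 > bufsz)
        return -1;
    snprintf(buf + *off, bufsz - *off, "%04x%s\n",
             (unsigned int)total, text);
    *off += total;
    return 0;
}

/* Append a flush packet ("0000") to buf at *off. Returns 0 or -1. */
static int append_flush(char *buf, size_t bufsz, size_t *off)
{
    if (*off + 5 > bufsz)
        return -1;
    memcpy(buf + *off, "0000", 5); /* copies the trailing NUL too */
    *off += 4;
    return 0;
}

/*
 * Build "<name>-server", "version=<version>" and a flush packet.
 * Returns the number of bytes written (excluding the final NUL),
 * or -1 if something does not fit.
 */
int build_server_hello(char *buf, size_t bufsz, const char *name,
                       int version)
{
    char line[128];
    size_t off = 0;
    int n;

    n = snprintf(line, sizeof(line), "%s-server", name);
    if (n < 0 || (size_t)n >= sizeof(line))
        return -1;
    if (append_txt_pkt(buf, bufsz, &off, line))
        return -1;

    n = snprintf(line, sizeof(line), "version=%d", version);
    if (n < 0 || (size_t)n >= sizeof(line))
        return -1;
    if (append_txt_pkt(buf, bufsz, &off, line))
        return -1;

    if (append_flush(buf, bufsz, &off))
        return -1;
    return (int)off;
}
```

Calling build_server_hello(buf, sizeof(buf), "git-filter", 2) yields
"0016git-filter-server\n000eversion=2\n0000", 40 bytes in total.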



[RFC/PATCH v4 08/49] Git/Packet: add capability functions

2017-06-20 Thread Christian Couder
Add functions to help read and write capabilities.
Use these functions in 't/t0021/rot13-filter.pl'.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 perl/Git/Packet.pm  | 33 +
 t/t0021/rot13-filter.pl |  9 ++---
 2 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
index b0233caf37..4443b67724 100644
--- a/perl/Git/Packet.pm
+++ b/perl/Git/Packet.pm
@@ -20,6 +20,9 @@ our @EXPORT = qw(
packet_txt_write
packet_flush
packet_initialize
+   packet_read_capabilities
+   packet_write_capabilities
+   packet_read_and_check_capabilities
);
 our @EXPORT_OK = @EXPORT;
 
@@ -83,3 +86,33 @@ sub packet_initialize {
packet_txt_write( "version=" . $version );
packet_flush();
 }
+
+sub packet_read_capabilities {
+   my @cap;
+   while (1) {
+   my ( $res, $buf ) = packet_bin_read();
+   return ( $res, @cap ) if ( $res != 0 );
+   unless ( $buf =~ s/\n$// ) {
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
+   }
+   die "bad capability buf: '$buf'" unless ( $buf =~ s/capability=// );
+   push @cap, $buf;
+   }
+}
+
+sub packet_read_and_check_capabilities {
+   my @local_caps = @_;
+   my @remote_res_caps = packet_read_capabilities();
+   my $res = shift @remote_res_caps;
+   my %remote_caps = map { $_ => 1 } @remote_res_caps;
+   foreach (@local_caps) {
+   die "'$_' capability not available" unless (exists($remote_caps{$_}));
+   }
+   return $res;
+}
+
+sub packet_write_capabilities {
+   packet_txt_write( "capability=" . $_ ) foreach (@_);
+   packet_flush();
+}
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 5b05518640..bbfd52619d 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -42,14 +42,9 @@ $debug->flush();
 
 packet_initialize("git-filter", 2);
 
-( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
-( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
-( packet_bin_read() eq ( 1, "" ) )  || die "bad capability end";
+packet_read_and_check_capabilities("clean", "smudge");
+packet_write_capabilities(@capabilities);
 
-foreach (@capabilities) {
-   packet_txt_write( "capability=" . $_ );
-}
-packet_flush();
 print $debug "init handshake complete\n";
 $debug->flush();
 
-- 
2.13.1.565.gbfcd7a9048
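
The check done by packet_read_and_check_capabilities() above amounts
to: strip the "capability=" prefix from each line received, then make
sure every capability we need was offered. A rough C sketch of that
logic follows. Both function names are hypothetical, and unlike the
Perl substitution the prefix is only accepted at the start of the
line.

```c
#include <stddef.h>
#include <string.h>

/* Return the name part of "capability=<name>", or NULL otherwise. */
const char *capability_name(const char *line)
{
    static const char prefix[] = "capability=";
    size_t n = sizeof(prefix) - 1;

    if (strncmp(line, prefix, n) != 0)
        return NULL;
    return line + n;
}

/*
 * Return the first entry of 'required' that is absent from
 * 'offered', or NULL when every required capability was offered.
 */
const char *first_missing_capability(const char **required, size_t nr_required,
                                     const char **offered, size_t nr_offered)
{
    size_t i, j;

    for (i = 0; i < nr_required; i++) {
        int found = 0;

        for (j = 0; j < nr_offered; j++) {
            if (!strcmp(required[i], offered[j])) {
                found = 1;
                break;
            }
        }
        if (!found)
            return required[i];
    }
    return NULL;
}
```

A caller would die with something like "'%s' capability not available"
when the second function returns non-NULL.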



[RFC/PATCH v4 10/49] external odb foreach

2017-06-20 Thread Christian Couder
From: Jeff King 

---
 external-odb.c | 14 ++
 external-odb.h |  6 ++
 odb-helper.c   | 15 +++
 odb-helper.h   |  4 
 4 files changed, 39 insertions(+)

diff --git a/external-odb.c b/external-odb.c
index 1ccfa99a01..42978a3298 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -113,3 +113,17 @@ int external_odb_fetch_object(const unsigned char *sha1)
 
return -1;
 }
+
+int external_odb_for_each_object(each_external_object_fn fn, void *data)
+{
+   struct odb_helper *o;
+
+   external_odb_init();
+
+   for (o = helpers; o; o = o->next) {
+   int r = odb_helper_for_each_object(o, fn, data);
+   if (r)
+   return r;
+   }
+   return 0;
+}
diff --git a/external-odb.h b/external-odb.h
index 2397477684..cea8570a49 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -5,4 +5,10 @@ const char *external_odb_root(void);
 int external_odb_has_object(const unsigned char *sha1);
 int external_odb_fetch_object(const unsigned char *sha1);
 
+typedef int (*each_external_object_fn)(const unsigned char *sha1,
+  enum object_type type,
+  unsigned long size,
+  void *data);
+int external_odb_for_each_object(each_external_object_fn, void *);
+
 #endif /* EXTERNAL_ODB_H */
diff --git a/odb-helper.c b/odb-helper.c
index de5562da9c..d8ef5cbf4b 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -243,3 +243,18 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
 
return 0;
 }
+
+int odb_helper_for_each_object(struct odb_helper *o,
+  each_external_object_fn fn,
+  void *data)
+{
+   int i;
+   for (i = 0; i < o->have_nr; i++) {
+   struct odb_helper_object *obj = &o->have[i];
+   int r = fn(obj->sha1, obj->type, obj->size, data);
+   if (r)
+   return r;
+   }
+
+   return 0;
+}
diff --git a/odb-helper.h b/odb-helper.h
index 0f704f9452..8c3916d215 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -1,6 +1,8 @@
 #ifndef ODB_HELPER_H
 #define ODB_HELPER_H
 
+#include "external-odb.h"
+
 struct odb_helper {
const char *name;
const char *cmd;
@@ -21,5 +23,7 @@ struct odb_helper *odb_helper_new(const char *name, int namelen);
 int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
 int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
int fd);
+int odb_helper_for_each_object(struct odb_helper *o,
+  each_external_object_fn, void *);
 
 #endif /* ODB_HELPER_H */
-- 
2.13.1.565.gbfcd7a9048
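
The iteration contract used by external_odb_for_each_object() and
odb_helper_for_each_object() above is the usual one for "for each"
callbacks: the callback gets an opaque 'data' pointer, and the first
non-zero return value stops the loop and is passed back to the caller.
The stand-alone sketch below shows that contract with simplified,
hypothetical types (struct obj_sketch is not 'struct
odb_helper_object', and the budget callback is made up for the
example).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an object entry. */
struct obj_sketch {
    const char *id;
    unsigned long size;
};

typedef int (*each_obj_fn)(const struct obj_sketch *obj, void *data);

/* Call fn on each object; stop at and return the first non-zero result. */
int for_each_obj_sketch(const struct obj_sketch *objs, size_t nr,
                        each_obj_fn fn, void *data)
{
    size_t i;

    for (i = 0; i < nr; i++) {
        int r = fn(&objs[i], data);
        if (r)
            return r;
    }
    return 0;
}

/* Example callback state: running total and an upper bound. */
struct size_budget {
    unsigned long total;
    unsigned long limit;
};

/* Add the object size to the total; return 1 once the limit would be passed. */
int add_size_within_budget(const struct obj_sketch *obj, void *data)
{
    struct size_budget *b = data;

    if (b->total + obj->size > b->limit)
        return 1;
    b->total += obj->size;
    return 0;
}
```

The 'data' pointer is what lets a callback keep state without global
variables.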



[RFC/PATCH v4 11/49] t0400: add 'put' command to odb-helper script

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0400-external-odb.sh | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index fe85413725..6c6da5cf4f 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -7,6 +7,10 @@ test_description='basic tests for external object databases'
 ALT_SOURCE="$PWD/alt-repo/.git"
 export ALT_SOURCE
 write_script odb-helper <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
 GIT_DIR=$ALT_SOURCE; export GIT_DIR
 case "$1" in
 have)
@@ -16,6 +20,16 @@ have)
 get)
cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
;;
+put)
+   sha1="$2"
+   size="$3"
+   kind="$4"
+   writen=$(git hash-object -w -t "$kind" --stdin)
+   test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
 esac
 EOF
 HELPER="\"$PWD\"/odb-helper"
@@ -43,4 +57,13 @@ test_expect_success 'helper can retrieve alt objects' '
test_cmp expect actual
 '
 
+test_expect_success 'helper can add objects to alt repo' '
+   hash=$(echo "Hello odb!" | git hash-object -w -t blob --stdin) &&
+   test -f .git/objects/$(echo $hash | sed "s#..#&/#") &&
+   size=$(git cat-file -s "$hash") &&
+   git cat-file blob "$hash" | ./odb-helper put "$hash" "$size" blob &&
+   alt_size=$(cd alt-repo && git cat-file -s "$hash") &&
+   test "$size" -eq "$alt_size"
+'
+
 test_done
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 17/49] lib-httpd: pass config file to start_httpd()

2017-06-20 Thread Christian Couder
This makes it possible to start an apache web server with different
config files.

This will be used in a later patch to pass a config file that makes
apache store external objects.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/lib-httpd.sh | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 435a37465a..2e659a8ee2 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -171,12 +171,14 @@ prepare_httpd() {
 }
 
 start_httpd() {
+   APACHE_CONF_FILE=${1-apache.conf}
+
prepare_httpd >&3 2>&4
 
trap 'code=$?; stop_httpd; (exit $code); die' EXIT
 
"$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \
-   -f "$TEST_PATH/apache.conf" $HTTPD_PARA \
+   -f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA \
-c "Listen 127.0.0.1:$LIB_HTTPD_PORT" -k start \
>&3 2>&4
if test $? -ne 0
@@ -191,7 +193,7 @@ stop_httpd() {
trap 'die' EXIT
 
"$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \
-   -f "$TEST_PATH/apache.conf" $HTTPD_PARA -k stop
+   -f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA -k stop
 }
 
 test_http_push_nonff () {
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 44/49] odb-helper: add have_object_process()

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 odb-helper.c | 103 ---
 1 file changed, 91 insertions(+), 12 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 2e5d8af526..01cd6a713c 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -579,27 +579,106 @@ static int odb_helper_object_cmp(const void *va, const void *vb)
return hashcmp(a->sha1, b->sha1);
 }
 
+static int have_object_process(struct odb_helper *o)
+{
+   int err;
+   struct read_object_process *entry;
+   struct child_process *process;
+   struct strbuf status = STRBUF_INIT;
+   const char *cmd = o->cmd;
+   uint64_t start;
+   char *line;
+   int packet_len;
+   int total_got = 0;
+
+   start = getnanotime();
+
+   entry = launch_read_object_process(cmd);
+   process = &entry->subprocess.process;
+   o->supported_capabilities = entry->supported_capabilities;
+
+   if (!(ODB_HELPER_CAP_HAVE & entry->supported_capabilities))
+   return -1;
+
+   sigchain_push(SIGPIPE, SIG_IGN);
+
+   err = packet_write_fmt_gently(process->in, "command=have\n");
+   if (err)
+   goto done;
+
+   err = packet_flush_gently(process->in);
+   if (err)
+   goto done;
+
+   for (;;) {
+   /* packet_read() writes a '\0' extra byte at the end */
+   char buf[LARGE_PACKET_DATA_MAX + 1];
+   char *p = buf;
+   int more;
+
+   packet_len = packet_read(process->out, NULL, NULL,
+   buf, LARGE_PACKET_DATA_MAX + 1,
+   PACKET_READ_GENTLE_ON_EOF);
+
+   if (packet_len <= 0)
+   break;
+
+   total_got += packet_len;
+
+   do {
+   char *eol = strchrnul(p, '\n');
+   more = (*eol == '\n');
+   *eol = '\0';
+   if (add_have_entry(o, p))
+   break;
+   p = eol + 1;
+   } while (more);
+   }
+
+   if (packet_len < 0) {
+   err = packet_len;
+   goto done;
+   }
+
+   subprocess_read_status(process->out, &status);
+   err = strcmp(status.buf, "success");
+
+done:
+   sigchain_pop(SIGPIPE);
+
+   err = check_object_process_error(err, status.buf, entry, cmd, ODB_HELPER_CAP_HAVE);
+
+   trace_performance_since(start, "have_object_process");
+
+   return err;
+}
+
 static void odb_helper_load_have(struct odb_helper *o)
 {
-   struct odb_helper_cmd cmd;
-   FILE *fh;
-   struct strbuf line = STRBUF_INIT;
 
if (o->have_valid)
return;
o->have_valid = 1;
 
-   if (odb_helper_start(o, &cmd, 0, "have") < 0)
-   return;
+   if (o->script_mode) {
+   struct odb_helper_cmd cmd;
+   FILE *fh;
+   struct strbuf line = STRBUF_INIT;
 
-   fh = xfdopen(cmd.child.out, "r");
-   while (strbuf_getline(&line, fh) != EOF)
-   if (add_have_entry(o, line.buf))
-   break;
+   if (odb_helper_start(o, &cmd, 0, "have") < 0)
+   return;
 
-   strbuf_release(&line);
-   fclose(fh);
-   odb_helper_finish(o, &cmd);
+   fh = xfdopen(cmd.child.out, "r");
+   while (strbuf_getline(&line, fh) != EOF)
+   if (add_have_entry(o, line.buf))
+   break;
+
+   strbuf_release(&line);
+   fclose(fh);
+   odb_helper_finish(o, &cmd);
+   } else {
+   have_object_process(o);
+   }
 
qsort(o->have, o->have_nr, sizeof(*o->have), odb_helper_object_cmp);
 }
-- 
2.13.1.565.gbfcd7a9048
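
The inner loop of have_object_process() above splits each packet
payload into LF-separated lines before handing them to
add_have_entry(). A portable sketch of that kind of in-place splitting
is shown below. split_lines() is a hypothetical helper, it uses
strchr() instead of strchrnul(), and it collects pointers instead of
calling a callback.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split the NUL-terminated 'buf' in place at each '\n' and store
 * a pointer to the start of every line in 'lines'. Returns the
 * number of lines stored (at most max_lines). A trailing LF does
 * not produce an extra empty entry.
 */
size_t split_lines(char *buf, char **lines, size_t max_lines)
{
    size_t n = 0;
    char *p = buf;

    while (*p && n < max_lines) {
        char *eol = strchr(p, '\n');

        lines[n++] = p;
        if (!eol)
            break;
        *eol = '\0';
        p = eol + 1;
    }
    return n;
}
```

Because the buffer is modified in place, the returned pointers stay
valid only as long as the buffer itself does.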



[RFC/PATCH v4 23/49] t0420: add test with HTTP external odb

2017-06-20 Thread Christian Couder
This tests that an apache web server can be used as an
external object database and store files in their native
format instead of converting them to a Git object.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0420-transfer-http-e-odb.sh | 150 +
 1 file changed, 150 insertions(+)
 create mode 100755 t/t0420-transfer-http-e-odb.sh

diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
new file mode 100755
index 00..716d722e97
--- /dev/null
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -0,0 +1,150 @@
+#!/bin/sh
+
+test_description='tests for transfering external objects to an HTTPD server'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+# odb helper script must see this
+export HTTPD_URL
+
+write_script odb-http-helper <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
+echo >&2 "odb-http-helper args:" "$@"
+case "$1" in
+have)
+   list_url="$HTTPD_URL/list/"
+   curl "$list_url" ||
+   die "curl '$list_url' failed"
+   ;;
+get)
+   get_url="$HTTPD_URL/list/?sha1=$2"
+   curl "$get_url" ||
+   die "curl '$get_url' failed"
+   ;;
+put)
+   sha1="$2"
+   size="$3"
+   kind="$4"
+   upload_url="$HTTPD_URL/upload/?sha1=$sha1=$size=$kind"
+   curl --data-binary @- --include "$upload_url" >out ||
+   die "curl '$upload_url' failed"
+   ref_hash=$(echo "$sha1 $size $kind" | GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+   git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
+esac
+EOF
+HELPER="\"$PWD\"/odb-http-helper"
+
+
+test_expect_success 'setup repo with a root commit and the helper' '
+   test_commit zero &&
+   git config odb.magic.command "$HELPER" &&
+   git config odb.magic.plainObjects "true"
+'
+
+test_expect_success 'setup another repo from the first one' '
+   git init other-repo &&
+   (cd other-repo &&
+git remote add origin .. &&
+git pull origin master &&
+git checkout master &&
+git log)
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME=123=blob"
+
+test_expect_success 'can upload a file' '
+   echo "Hello Apache World!" >hello_to_send.txt &&
+   echo "How are you?" >>hello_to_send.txt &&
+   curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" >out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+   curl --include "$LIST_URL" >out_list &&
+   grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+   curl --data "delete" --include "$UPLOAD_URL=1" >out_delete &&
+   curl --include "$LIST_URL" >out_list2 &&
+   ! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transfered to the http server' '
+   test_commit one &&
+   hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+   echo "$hash1-4-blob" >expected &&
+   ls "$FILES_DIR" >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+   git cat-file blob "$hash1" &&
+   git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+   (cd other-repo &&
+git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+test_must_fail git cat-file blob "$hash1" &&
+git config odb.magic.command "$HELPER" &&
+git config odb.magic.plainObjects "true" &&
+git cat-file blob "$hash1" &&
+git pull origin master)
+'
+
+test_expect_success 'local clone from the first repo' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone .. . &&
+git cat-file blob "$hash1")
+'
+
+test_expect_success 'no-local clone from the first repo fails' '
+   mkdir my-other-clon

[RFC/PATCH v4 36/49] odb-helper: add read_packetized_git_object_to_fd()

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 odb-helper.c | 84 +++-
 1 file changed, 78 insertions(+), 6 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 0017faa36e..a27208463c 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -142,6 +142,82 @@ static int check_object_process_error(int err,
return err;
 }
 
+static ssize_t read_packetized_git_object_to_fd(struct odb_helper *o,
+   const unsigned char *sha1,
+   int fd_in, int fd_out)
+{
+   ssize_t total_read = 0;
+   unsigned long total_got = 0;
+   int packet_len;
+   git_zstream stream;
+   int zret = Z_STREAM_END;
+   git_SHA_CTX hash;
+   unsigned char real_sha1[20];
+
+   memset(&stream, 0, sizeof(stream));
+   git_inflate_init(&stream);
+   git_SHA1_Init(&hash);
+
+   for (;;) {
+   /* packet_read() writes a '\0' extra byte at the end */
+   char buf[LARGE_PACKET_DATA_MAX + 1];
+
+   packet_len = packet_read(fd_in, NULL, NULL,
+   buf, LARGE_PACKET_DATA_MAX + 1,
+   PACKET_READ_GENTLE_ON_EOF);
+
+   if (packet_len <= 0)
+   break;
+
+   write_or_die(fd_out, buf, packet_len);
+
+   stream.next_in = (unsigned char *)buf;
+   stream.avail_in = packet_len;
+   do {
+   unsigned char inflated[4096];
+   unsigned long got;
+
+   stream.next_out = inflated;
+   stream.avail_out = sizeof(inflated);
+   zret = git_inflate(&stream, Z_SYNC_FLUSH);
+   got = sizeof(inflated) - stream.avail_out;
+
+   git_SHA1_Update(&hash, inflated, got);
+   /* skip header when counting size */
+   if (!total_got) {
+   const unsigned char *p = memchr(inflated, '\0', got);
+   if (p)
+   got -= p - inflated + 1;
+   else
+   got = 0;
+   }
+   total_got += got;
+   } while (stream.avail_in && zret == Z_OK);
+
+   total_read += packet_len;
+   }
+
+   git_inflate_end(&stream);
+
+   if (packet_len < 0)
+   return packet_len;
+
+   git_SHA1_Final(real_sha1, &hash);
+
+   if (zret != Z_STREAM_END) {
+   warning("bad zlib data from odb helper '%s' for %s",
+   o->name, sha1_to_hex(sha1));
+   return -1;
+   }
+   if (hashcmp(real_sha1, sha1)) {
+   warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+   o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+   return -1;
+   }
+
+   return total_read;
+}
+
 static int read_object_process(struct odb_helper *o, const unsigned char *sha1, int fd)
 {
int err;
@@ -174,12 +250,8 @@ static int read_object_process(struct odb_helper *o, const unsigned char *sha1,
if (err)
goto done;
 
-   if (o->fetch_kind != ODB_FETCH_KIND_FAULT_IN) {
-   struct strbuf buf;
-   read_packetized_to_strbuf(process->out, &buf);
-   if (err)
-   goto done;
-   }
+   if (o->fetch_kind != ODB_FETCH_KIND_FAULT_IN)
+   err = read_packetized_git_object_to_fd(o, sha1, process->out, fd) < 0;
 
subprocess_read_status(process->out, &status);
err = strcmp(status.buf, "success");
-- 
2.13.1.565.gbfcd7a9048
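
The "skip header when counting size" step in
read_packetized_git_object_to_fd() above relies on the inflated data
starting with a "<type> <size>" header terminated by a NUL byte. Only
the bytes after that NUL count as object content. A stand-alone sketch
of that computation for the first inflated chunk is given below.
content_bytes_after_header() is a hypothetical name, and the sketch
ignores the case where the header spans more than one chunk.

```c
#include <stddef.h>
#include <string.h>

/*
 * Given the first 'got' inflated bytes of an object, return how
 * many of them belong to the content, that is, how many follow
 * the NUL that terminates the header. Returns 0 when no NUL is
 * found in the chunk.
 */
size_t content_bytes_after_header(const unsigned char *chunk, size_t got)
{
    const unsigned char *nul = memchr(chunk, '\0', got);

    if (!nul)
        return 0;
    return got - (size_t)(nul - chunk) - 1;
}
```

For the 11 bytes "blob 4\0abcd" this gives 4, which matches the size
recorded in the header.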



[RFC/PATCH v4 21/49] odb-helper: add 'store_plain_objects' to 'struct odb_helper'

2017-06-20 Thread Christian Couder
This adds a configuration option odb.<odbname>.plainObjects and the
corresponding boolean variable called 'store_plain_objects' in
'struct odb_helper' to make it possible for external object
databases to store objects as plain objects instead of Git objects.

The existing odb_helper_fetch_object() is renamed to
odb_helper_fetch_git_object() and a new odb_helper_fetch_plain_object()
is introduced to deal with external objects that are not in Git format.
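
To turn plain content into a Git object, the new function in the patch
below first deflates and hashes a header of the form "<type> <size>"
followed by a NUL byte, and then the content itself. The header part
can be sketched in plain C as follows. format_object_header() is a
hypothetical name, and snprintf() stands in for Git's xsnprintf().

```c
#include <stdio.h>

/*
 * Write "<type> <size>" plus its terminating NUL into 'hdr'.
 * Returns the header length including the NUL (the number of
 * bytes that would be deflated and hashed), or -1 if 'hdr' is
 * too small.
 */
int format_object_header(char *hdr, size_t hdrsz, const char *type,
                         unsigned long size)
{
    int n = snprintf(hdr, hdrsz, "%s %lu", type, size);

    if (n < 0 || (size_t)n >= hdrsz)
        return -1;
    return n + 1;
}
```

For a 4-byte blob the header is "blob 4" plus the NUL, so 7 bytes.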

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 external-odb.c |   2 +
 odb-helper.c   | 113 -
 odb-helper.h   |   1 +
 3 files changed, 114 insertions(+), 2 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index a88837feda..d11fc98719 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -36,6 +36,8 @@ static int external_odb_config(const char *var, const char *value, void *data)
 
if (!strcmp(key, "command"))
return git_config_string(&o->cmd, var, value);
+   if (!strcmp(key, "plainobjects"))
+   o->store_plain_objects = git_config_bool(var, value);
 
return 0;
 }
diff --git a/odb-helper.c b/odb-helper.c
index af7cc55ca2..b33ee81c97 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -159,8 +159,107 @@ int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1)
return !!odb_helper_lookup(o, sha1);
 }
 
-int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
-   int fd)
+static int odb_helper_fetch_plain_object(struct odb_helper *o,
+const unsigned char *sha1,
+int fd)
+{
+   struct odb_helper_object *obj;
+   struct odb_helper_cmd cmd;
+   unsigned long total_got = 0;
+
+   char hdr[32];
+   int hdrlen;
+
+   int ret = Z_STREAM_END;
+   unsigned char compressed[4096];
+   git_zstream stream;
+   git_SHA_CTX hash;
+   unsigned char real_sha1[20];
+
+   obj = odb_helper_lookup(o, sha1);
+   if (!obj)
+   return -1;
+
+   if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
+   return -1;
+
+   /* Set it up */
+   git_deflate_init(&stream, zlib_compression_level);
+   stream.next_out = compressed;
+   stream.avail_out = sizeof(compressed);
+   git_SHA1_Init(&hash);
+
+   /* First header.. */
+   hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(obj->type), obj->size) + 1;
+   stream.next_in = (unsigned char *)hdr;
+   stream.avail_in = hdrlen;
+   while (git_deflate(&stream, 0) == Z_OK)
+   ; /* nothing */
+   git_SHA1_Update(&hash, hdr, hdrlen);
+
+   for (;;) {
+   unsigned char buf[4096];
+   int r;
+
+   r = xread(cmd.child.out, buf, sizeof(buf));
+   if (r < 0) {
+   error("unable to read from odb helper '%s': %s",
+ o->name, strerror(errno));
+   close(cmd.child.out);
+   odb_helper_finish(o, &cmd);
+   git_deflate_end(&stream);
+   return -1;
+   }
+   if (r == 0)
+   break;
+
+   total_got += r;
+
+   /* Then the data itself.. */
+   stream.next_in = (void *)buf;
+   stream.avail_in = r;
+   do {
+   unsigned char *in0 = stream.next_in;
+   ret = git_deflate(&stream, Z_FINISH);
+   git_SHA1_Update(&hash, in0, stream.next_in - in0);
+   write_or_die(fd, compressed, stream.next_out - compressed);
+   stream.next_out = compressed;
+   stream.avail_out = sizeof(compressed);
+   } while (ret == Z_OK);
+   }
+
+   close(cmd.child.out);
+   if (ret != Z_STREAM_END) {
+   warning("bad zlib data from odb helper '%s' for %s",
+   o->name, sha1_to_hex(sha1));
+   return -1;
+   }
+   ret = git_deflate_end_gently(&stream);
+   if (ret != Z_OK) {
+   warning("deflateEnd on object %s from odb helper '%s' failed (%d)",
+   sha1_to_hex(sha1), o->name, ret);
+   return -1;
+   }
+   git_SHA1_Final(real_sha1, &hash);
+   if (hashcmp(sha1, real_sha1)) {
+   warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+   o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+   return -1;
+   }
+   if (odb_helper_finish(o, &cmd))
+   return -1;
+   if (total_got != obj->size) {
+   warning("size mismatch from odb helper '%s' for %s (%lu != %lu)",
+   o->name, sha1_to_hex

[RFC/PATCH v4 25/49] external-odb: add external_odb_fault_in_object()

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 external-odb.c | 21 -
 external-odb.h |  1 +
 odb-helper.c   |  7 +++
 odb-helper.h   |  1 +
 4 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 0b6e443372..502380cac2 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -113,7 +113,8 @@ int external_odb_fetch_object(const unsigned char *sha1)
int ret;
int fd;
 
-   if (!odb_helper_has_object(o, sha1))
+   if (o->fetch_kind != ODB_FETCH_KIND_PLAIN_OBJECT &&
+   o->fetch_kind != ODB_FETCH_KIND_GIT_OBJECT)
continue;
 
fd = create_object_tmpfile(, path);
@@ -139,6 +140,24 @@ int external_odb_fetch_object(const unsigned char *sha1)
return -1;
 }
 
+int external_odb_fault_in_object(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   if (!external_odb_has_object(sha1))
+   return -1;
+
+   for (o = helpers; o; o = o->next) {
+   if (o->fetch_kind != ODB_FETCH_KIND_FAULT_IN)
+   continue;
+   if (odb_helper_fault_in_object(o, sha1) < 0)
+   continue;
+   return 0;
+   }
+
+   return -1;
+}
+
 int external_odb_for_each_object(each_external_object_fn fn, void *data)
 {
struct odb_helper *o;
diff --git a/external-odb.h b/external-odb.h
index 53879e900d..1b46c49e25 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -4,6 +4,7 @@
 const char *external_odb_root(void);
 int external_odb_has_object(const unsigned char *sha1);
 int external_odb_fetch_object(const unsigned char *sha1);
+int external_odb_fault_in_object(const unsigned char *sha1);
 
 typedef int (*each_external_object_fn)(const unsigned char *sha1,
   enum object_type type,
diff --git a/odb-helper.c b/odb-helper.c
index 24dc5375cb..5fb56c6135 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -347,9 +347,8 @@ static int odb_helper_fetch_git_object(struct odb_helper *o,
return 0;
 }
 
-static int odb_helper_fetch_fault_in(struct odb_helper *o,
-const unsigned char *sha1,
-int fd)
+int odb_helper_fault_in_object(struct odb_helper *o,
+  const unsigned char *sha1)
 {
struct odb_helper_object *obj;
struct odb_helper_cmd cmd;
@@ -377,7 +376,7 @@ int odb_helper_fetch_object(struct odb_helper *o,
case ODB_FETCH_KIND_GIT_OBJECT:
return odb_helper_fetch_git_object(o, sha1, fd);
case ODB_FETCH_KIND_FAULT_IN:
-   return odb_helper_fetch_fault_in(o, sha1, fd);
+   return 0;
default:
BUG("invalid fetch kind '%d'", o->fetch_kind);
}
diff --git a/odb-helper.h b/odb-helper.h
index e3ad8e3316..2dc6d96c40 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -30,6 +30,7 @@ struct odb_helper *odb_helper_new(const char *name, int namelen);
 int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
 int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
int fd);
+int odb_helper_fault_in_object(struct odb_helper *o, const unsigned char *sha1);
 int odb_helper_for_each_object(struct odb_helper *o,
   each_external_object_fn, void *);
 int odb_helper_write_object(struct odb_helper *o,
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 09/49] Add initial external odb support

2017-06-20 Thread Christian Couder
From: Jeff King <p...@peff.net>

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 Makefile|   2 +
 cache.h |   9 ++
 external-odb.c  | 115 +++
 external-odb.h  |   8 ++
 odb-helper.c| 245 
 odb-helper.h|  25 +
 sha1_file.c |  79 +++-
 t/t0400-external-odb.sh |  46 +
 8 files changed, 507 insertions(+), 22 deletions(-)
 create mode 100644 external-odb.c
 create mode 100644 external-odb.h
 create mode 100644 odb-helper.c
 create mode 100644 odb-helper.h
 create mode 100755 t/t0400-external-odb.sh

diff --git a/Makefile b/Makefile
index f484801638..b488874d60 100644
--- a/Makefile
+++ b/Makefile
@@ -776,6 +776,7 @@ LIB_OBJS += ewah/ewah_bitmap.o
 LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
+LIB_OBJS += external-odb.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
 LIB_OBJS += gettext.o
@@ -808,6 +809,7 @@ LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
+LIB_OBJS += odb-helper.o
 LIB_OBJS += oidset.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-bitmap-write.o
diff --git a/cache.h b/cache.h
index d6ba8a2f11..391a69e9c5 100644
--- a/cache.h
+++ b/cache.h
@@ -954,6 +954,12 @@ const char *git_path_shallow(void);
  */
 extern const char *sha1_file_name(const unsigned char *sha1);
 
+/*
+ * Like sha1_file_name, but return the filename within a specific alternate
+ * object directory. Shares the same static buffer with sha1_file_name.
+ */
+extern const char *sha1_file_name_alt(const char *objdir, const unsigned char *sha1);
+
 /*
  * Return the name of the (local) packfile with the specified sha1 in
  * its name.  The return value is a pointer to memory that is
@@ -1265,6 +1271,8 @@ extern int do_check_packed_object_crc;
 
 extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type);
 
+extern int create_object_tmpfile(struct strbuf *tmp, const char *filename);
+extern void close_sha1_file(int fd);
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
 extern int has_sha1_pack(const unsigned char *sha1);
@@ -1600,6 +1608,7 @@ extern void read_info_alternates(const char * relative_base, int depth);
 extern char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
 extern int foreach_alt_odb(alt_odb_fn, void*);
+extern void prepare_external_alt_odb(void);
 
 /*
  * Allocate a "struct alternate_object_database" but do _not_ actually
diff --git a/external-odb.c b/external-odb.c
new file mode 100644
index 00..1ccfa99a01
--- /dev/null
+++ b/external-odb.c
@@ -0,0 +1,115 @@
+#include "cache.h"
+#include "external-odb.h"
+#include "odb-helper.h"
+
+static struct odb_helper *helpers;
+static struct odb_helper **helpers_tail = &helpers;
+
+static struct odb_helper *find_or_create_helper(const char *name, int len)
+{
+   struct odb_helper *o;
+
+   for (o = helpers; o; o = o->next)
+   if (!strncmp(o->name, name, len) && !o->name[len])
+   return o;
+
+   o = odb_helper_new(name, len);
+   *helpers_tail = o;
+   helpers_tail = &o->next;
+
+   return o;
+}
+
+static int external_odb_config(const char *var, const char *value, void *data)
+{
+   struct odb_helper *o;
+   const char *key, *dot;
+
+   if (!skip_prefix(var, "odb.", &key))
+   return 0;
+   dot = strrchr(key, '.');
+   if (!dot)
+   return 0;
+
+   o = find_or_create_helper(key, dot - key);
+   key = dot + 1;
+
+   if (!strcmp(key, "command"))
+   return git_config_string(&o->cmd, var, value);
+
+   return 0;
+}
+
+static void external_odb_init(void)
+{
+   static int initialized;
+
+   if (initialized)
+   return;
+   initialized = 1;
+
+   git_config(external_odb_config, NULL);
+}
+
+const char *external_odb_root(void)
+{
+   static const char *root;
+   if (!root)
+   root = git_pathdup("objects/external");
+   return root;
+}
+
+int external_odb_has_object(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   external_odb_init();
+
+   for (o = helpers; o; o = o->next)
+   if (odb_helper_has_object(o, sha1))
+   return 1;
+   return 0;
+}
+
+int external_odb_fetch_object(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+   const char *path;
+
+   if (!external_odb_has_object(sha1))
+   return -1;
+
+   path = sha1_file_name_alt(external_odb_root(), sha1);
+   safe_create_leading_directories_const(path);
+   prepare_external_alt_odb();
+
+   for 

[RFC/PATCH v4 22/49] pack-objects: don't pack objects in external odbs

2017-06-20 Thread Christian Couder
Objects managed by an external ODB should not be put into
pack files.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 builtin/pack-objects.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f672225def..e423f685ff 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -24,6 +24,7 @@
 #include "sha1-array.h"
 #include "argv-array.h"
 #include "mru.h"
+#include "external-odb.h"
 
 static const char *pack_usage[] = {
N_("git pack-objects --stdout [<options>...] [< <ref-list> | < <object-list>]"),
@@ -1011,6 +1012,9 @@ static int want_object_in_pack(const unsigned char *sha1,
return want;
}
 
+   if (external_odb_has_object(sha1))
+   return 0;
+
for (entry = packed_git_mru->head; entry; entry = entry->next) {
struct packed_git *p = entry->item;
off_t offset;
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 39/49] odb-helper: add write_object_process()

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 odb-helper.c | 76 +---
 1 file changed, 73 insertions(+), 3 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index b2d86a7928..e21113c0b8 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -383,6 +383,65 @@ static int read_object_process(struct odb_helper *o, const unsigned char *sha1,
return err;
 }
 
+static int write_object_process(struct odb_helper *o,
+   const void *buf, size_t len,
+   const char *type, unsigned char *sha1)
+{
+   int err;
+   struct read_object_process *entry;
+   struct child_process *process;
+   struct strbuf status = STRBUF_INIT;
+   const char *cmd = o->cmd;
+   uint64_t start;
+
+   start = getnanotime();
+
+   entry = launch_read_object_process(cmd);
+   process = &entry->subprocess.process;
+   o->supported_capabilities = entry->supported_capabilities;
+
+   if (!(ODB_HELPER_CAP_PUT & entry->supported_capabilities))
+   return -1;
+
+   sigchain_push(SIGPIPE, SIG_IGN);
+
+   err = packet_write_fmt_gently(process->in, "command=put\n");
+   if (err)
+   goto done;
+
+   err = packet_write_fmt_gently(process->in, "sha1=%s\n", sha1_to_hex(sha1));
+   if (err)
+   goto done;
+
+   err = packet_write_fmt_gently(process->in, "size=%"PRIuMAX"\n", len);
+   if (err)
+   goto done;
+
+   err = packet_write_fmt_gently(process->in, "kind=blob\n");
+   if (err)
+   goto done;
+
+   err = packet_flush_gently(process->in);
+   if (err)
+   goto done;
+
+   err = write_packetized_from_buf(buf, len, process->in);
+   if (err)
+   goto done;
+
+   subprocess_read_status(process->out, &status);
+   err = strcmp(status.buf, "success");
+
+done:
+   sigchain_pop(SIGPIPE);
+
+   err = check_object_process_error(err, status.buf, entry, cmd, ODB_HELPER_CAP_PUT);
+
+   trace_performance_since(start, "write_object_process");
+
+   return err;
+}
+
 struct odb_helper *odb_helper_new(const char *name, int namelen)
 {
struct odb_helper *o;
@@ -804,9 +863,9 @@ int odb_helper_for_each_object(struct odb_helper *o,
return 0;
 }
 
-int odb_helper_write_object(struct odb_helper *o,
-   const void *buf, size_t len,
-   const char *type, unsigned char *sha1)
+int odb_helper_write_plain_object(struct odb_helper *o,
+ const void *buf, size_t len,
+ const char *type, unsigned char *sha1)
 {
struct odb_helper_cmd cmd;
 
@@ -832,3 +891,14 @@ int odb_helper_write_object(struct odb_helper *o,
odb_helper_finish(o, &cmd);
return 0;
 }
+
+int odb_helper_write_object(struct odb_helper *o,
+   const void *buf, size_t len,
+   const char *type, unsigned char *sha1)
+{
+   if (o->script_mode) {
+   return odb_helper_write_plain_object(o, buf, len, type, sha1);
+   } else {
+   return write_object_process(o, buf, len, type, sha1);
+   }
+}
-- 
2.13.1.565.gbfcd7a9048
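
A note on the request that write_object_process() sends: the header is a run of text packets (command, sha1, size, kind), closed by a flush packet, and only then does the object content follow. The sketch below rebuilds that header in a plain buffer. It is an illustration under assumptions, not code from the series: append_pkt() and build_put_header() are made-up names, the length limit is simply "fits in four hex digits", and "kind=blob" is hardcoded only because the patch hardcodes it.

```c
#include <stdio.h>
#include <string.h>

/* Append one text pkt-line (4 hex length digits + text) to buf. */
static int append_pkt(char *buf, size_t bufsz, size_t *pos, const char *text)
{
	size_t total = strlen(text) + 4;	/* the length counts its own 4 bytes */

	if (total > 0xffff || *pos + total + 1 > bufsz)
		return -1;
	snprintf(buf + *pos, 5, "%04x", (unsigned int)total);
	memcpy(buf + *pos + 4, text, total - 4);
	*pos += total;
	buf[*pos] = '\0';	/* convenience terminator, not part of the stream */
	return 0;
}

/*
 * Build the header of a "put" request in the order used by
 * write_object_process(): command, sha1, size, kind, then a flush
 * packet. The object content would follow as binary packets.
 * Returns the header length, or -1 if buf is too small.
 */
int build_put_header(char *buf, size_t bufsz, const char *sha1_hex,
		     unsigned long size)
{
	char line[128];
	size_t pos = 0;

	if (append_pkt(buf, bufsz, &pos, "command=put\n"))
		return -1;
	snprintf(line, sizeof(line), "sha1=%s\n", sha1_hex);
	if (append_pkt(buf, bufsz, &pos, line))
		return -1;
	snprintf(line, sizeof(line), "size=%lu\n", size);
	if (append_pkt(buf, bufsz, &pos, line))
		return -1;
	if (append_pkt(buf, bufsz, &pos, "kind=blob\n"))
		return -1;
	if (pos + 5 > bufsz)
		return -1;
	memcpy(buf + pos, "0000", 5);	/* flush packet ends the header */
	return (int)(pos + 4);
}
```

Keeping the header in one buffer makes the packet order easy to inspect. The patch itself writes each packet to the helper's stdin as it goes and bails out on the first error.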



[RFC/PATCH v4 28/49] contrib: add long-running-read-object/example.pl

2017-06-20 Thread Christian Couder
From: Ben Peart <benpe...@microsoft.com>

Signed-off-by: Ben Peart <benpe...@microsoft.com>
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 contrib/long-running-read-object/example.pl | 114 
 1 file changed, 114 insertions(+)
 create mode 100644 contrib/long-running-read-object/example.pl

diff --git a/contrib/long-running-read-object/example.pl b/contrib/long-running-read-object/example.pl
new file mode 100644
index 00..6587333b87
--- /dev/null
+++ b/contrib/long-running-read-object/example.pl
@@ -0,0 +1,114 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git read-object protocol version 1
+# See Documentation/technical/read-object-protocol.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand."  Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# A lazy clone can be simulated via the following commands from the host repo
+# you wish to create a lazy clone of:
+#
+# cd /host_repo
+# git rev-parse HEAD
+# git init /guest_repo
+# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
+#  cut -d' ' -f1 | git pack-objects /e/guest_repo/.git/objects/pack/noblobs
+# cd /guest_repo
+# git config core.virtualizeobjects true
+# git reset --hard 
+#
+# Please note, this sample is a minimal skeleton. No proper error handling
+# was implemented.
+#
+
+use strict;
+use warnings;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = "/host_repo/.git/";
+
+sub packet_bin_read {
+   my $buffer;
+   my $bytes_read = read STDIN, $buffer, 4;
+   if ( $bytes_read == 0 ) {
+
+   # EOF - Git stopped talking to us!
+   exit();
+   }
+   elsif ( $bytes_read != 4 ) {
+   die "invalid packet: '$buffer'";
+   }
+   my $pkt_size = hex($buffer);
+   if ( $pkt_size == 0 ) {
+   return ( 1, "" );
+   }
+   elsif ( $pkt_size > 4 ) {
+   my $content_size = $pkt_size - 4;
+   $bytes_read = read STDIN, $buffer, $content_size;
+   if ( $bytes_read != $content_size ) {
+   die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
+   }
+   return ( 0, $buffer );
+   }
+   else {
+   die "invalid packet size: $pkt_size";
+   }
+}
+
+sub packet_txt_read {
+   my ( $res, $buf ) = packet_bin_read();
+   unless ( $buf =~ s/\n$// ) {
+   die "A non-binary line MUST be terminated by an LF.";
+   }
+   return ( $res, $buf );
+}
+
+sub packet_bin_write {
+   my $buf = shift;
+   print STDOUT sprintf( "%04x", length($buf) + 4 );
+   print STDOUT $buf;
+   STDOUT->flush();
+}
+
+sub packet_txt_write {
+   packet_bin_write( $_[0] . "\n" );
+}
+
+sub packet_flush {
+   print STDOUT sprintf( "%04x", 0 );
+   STDOUT->flush();
+}
+
+( packet_txt_read() eq ( 0, "git-read-object-client" ) ) || die "bad initialize";
+( packet_txt_read() eq ( 0, "version=1" ) ) || die "bad version";
+( packet_bin_read() eq ( 1, "" ) )   || die "bad version end";
+
+packet_txt_write("git-read-object-server");
+packet_txt_write("version=1");
+packet_flush();
+
+( packet_txt_read() eq ( 0, "capability=get" ) )|| die "bad capability";
+( packet_bin_read() eq ( 1, "" ) )  || die "bad capability end";
+
+packet_txt_write("capability=get");
+packet_flush();
+
+while (1) {
+   my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+   if ( $command eq "get" ) {
+   my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+   packet_bin_read();
+
+   system ('git --git-dir="' . $DIR . '" cat-file blob ' . $sha1 . ' | git -c core.virtualizeobjects=false hash-object -w --stdin >/dev/null 2>&1');
+   packet_txt_write(($?) ? "status=error" : "status=success");
+   packet_flush();
+   } else {
+   die "bad command '$command'";
+   }
+}
-- 
2.13.1.565.gbfcd7a9048
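
For readers following packet_bin_write() and packet_flush() in example.pl: each packet is a four-digit lowercase hex length, which counts those four digits plus the payload, followed by the payload. A flush packet is the bare length "0000". The C sketch below shows the same framing. pkt_encode() and pkt_flush() are illustrative names rather than Git's API, and the only size limit applied is "fits in four hex digits", which is an assumption of this sketch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Encode one pkt-line into out: a 4-digit lowercase hex length that
 * counts the 4 header bytes plus the payload, followed by the payload.
 * Returns the number of bytes written, or -1 if it does not fit.
 */
int pkt_encode(char *out, size_t outsz, const char *payload, size_t len)
{
	size_t total = len + 4;

	if (total > 0xffff || total + 1 > outsz)
		return -1;
	snprintf(out, 5, "%04x", (unsigned int)total);
	memcpy(out + 4, payload, len);
	out[total] = '\0';	/* convenience terminator, not part of the packet */
	return (int)total;
}

/* A flush packet is the bare length "0000" with no payload. */
int pkt_flush(char *out, size_t outsz)
{
	if (outsz < 5)
		return -1;
	memcpy(out, "0000", 5);
	return 4;
}
```

With this framing the text packet "version=1\n" goes out as "000eversion=1\n": ten payload bytes plus the four length digits.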



[RFC/PATCH v4 20/49] lib-httpd: add apache-e-odb.conf

2017-06-20 Thread Christian Couder
This is an apache config file to test external object databases.
It uses the upload.sh and list.sh cgi that have been added
previously to make apache store external objects.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/lib-httpd/apache-e-odb.conf | 214 ++
 1 file changed, 214 insertions(+)
 create mode 100644 t/lib-httpd/apache-e-odb.conf

diff --git a/t/lib-httpd/apache-e-odb.conf b/t/lib-httpd/apache-e-odb.conf
new file mode 100644
index 00..19a1540c82
--- /dev/null
+++ b/t/lib-httpd/apache-e-odb.conf
@@ -0,0 +1,214 @@
+ServerName dummy
+PidFile httpd.pid
+DocumentRoot www
+LogFormat "%h %l %u %t \"%r\" %>s %b" common
+CustomLog access.log common
+ErrorLog error.log
+
+   LoadModule log_config_module modules/mod_log_config.so
+
+
+   LoadModule alias_module modules/mod_alias.so
+
+
+   LoadModule cgi_module modules/mod_cgi.so
+
+
+   LoadModule env_module modules/mod_env.so
+
+
+   LoadModule rewrite_module modules/mod_rewrite.so
+
+
+   LoadModule version_module modules/mod_version.so
+
+
+   LoadModule headers_module modules/mod_headers.so
+
+
+
+LockFile accept.lock
+
+
+
+
+   LoadModule auth_module modules/mod_auth.so
+
+
+
+= 2.1>
+
+   LoadModule auth_basic_module modules/mod_auth_basic.so
+
+
+   LoadModule authn_file_module modules/mod_authn_file.so
+
+
+   LoadModule authz_user_module modules/mod_authz_user.so
+
+
+   LoadModule authz_host_module modules/mod_authz_host.so
+
+
+
+= 2.4>
+
+   LoadModule authn_core_module modules/mod_authn_core.so
+
+
+   LoadModule authz_core_module modules/mod_authz_core.so
+
+
+   LoadModule access_compat_module modules/mod_access_compat.so
+
+
+   LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
+
+
+   LoadModule unixd_module modules/mod_unixd.so
+
+
+
+PassEnv GIT_VALGRIND
+PassEnv GIT_VALGRIND_OPTIONS
+PassEnv GNUPGHOME
+PassEnv ASAN_OPTIONS
+PassEnv GIT_TRACE
+PassEnv GIT_CONFIG_NOSYSTEM
+
+Alias /dumb/ www/
+Alias /auth/dumb/ www/auth/dumb/
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+   SetEnv GIT_COMMITTER_NAME "Custom User"
+   SetEnv GIT_COMMITTER_EMAIL cus...@example.com
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+   SetEnv GIT_NAMESPACE ns
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+   Header set Set-Cookie name=value
+
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+
+ScriptAlias /upload/ upload.sh/
+ScriptAlias /list/ list.sh/
+
+   Options FollowSymlinks
+
+
+  Options ExecCGI
+
+
+  Options ExecCGI
+
+
+   Options ExecCGI
+
+
+RewriteEngine on
+RewriteRule ^/smart-redir-perm/(.*)$ /smart/$1 [R=301]
+RewriteRule ^/smart-redir-temp/(.*)$ /smart/$1 [R=302]
+RewriteRule ^/smart-redir-auth/(.*)$ /auth/smart/$1 [R=301]
+RewriteRule ^/smart-redir-limited/(.*)/info/refs$ /smart/$1/info/refs [R=301]
+RewriteRule ^/ftp-redir/(.*)$ ftp://localhost:1000/$1 [R=302]
+
+RewriteRule ^/loop-redir/x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-(.*) /$1 [R=302]
+RewriteRule ^/loop-redir/(.*)$ /loop-redir/x-$1 [R=302]
+
+# Apache 2.2 does not understand <RequireAll>, so we use RewriteCond.
+# And as RewriteCond does not allow testing for non-matches, we match
+# the desired case first (one has abra, two has cadabra), and let it
+# pass by marking the RewriteRule as [L], "last rule, do not process
+# any other matching RewriteRules after this"), and then have another
+# RewriteRule that matches all other cases and lets them fail via '[F]',
+# "fail the request".
+RewriteCond %{HTTP:x-magic-one} =abra
+RewriteCond %{HTTP:x-magic-two} =cadabra
+RewriteRule ^/smart_headers/.* - [L]
+RewriteRule ^/smart_headers/.* - [F]
+
+
+LoadModule ssl_module modules/mod_ssl.so
+
+SSLCertificateFile httpd.pem
+SSLCertificateKeyFile httpd.pem
+SSLRandomSeed startup file:/dev/urandom 512
+SSLRandomSeed connect file:/dev/urandom 512
+SSLSessionCache none
+SSLMutex file:ssl_mutex
+SSLEngine On
+
+
+
+   AuthType Basic
+   AuthName "git-auth"
+   AuthUserFile passwd
+   Require valid-user
+
+
+
+   AuthType Basic
+   AuthName "git-auth"
+   AuthUserFile passwd
+   Require valid-user
+
+
+
+   AuthType Basic
+   AuthName "git-auth"
+   AuthUserFile passwd
+   Require valid-user
+
+
+RewriteCond %{QUERY_STRING} service=git-receive-pack [OR]
+RewriteCond %{REQUEST_URI} /git-receive-pack$
+RewriteRule ^/half-auth-complete/ - [E=AUTHREQUIRED:yes]
+
+
+  Order Deny,Allow
+  Deny from env=AUTHREQUIRED
+
+  AuthType Basic
+  AuthName "Git Access"
+  AuthUserFile passwd
+  Require valid-user
+  Sa

[RFC/PATCH v4 35/49] Add t0460 to test passing git objects

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0460-read-object-git.sh | 29 
 t/t0460/read-object-git| 67 ++
 2 files changed, 96 insertions(+)
 create mode 100755 t/t0460-read-object-git.sh
 create mode 100755 t/t0460/read-object-git

diff --git a/t/t0460-read-object-git.sh b/t/t0460-read-object-git.sh
new file mode 100755
index 00..d08b44cdce
--- /dev/null
+++ b/t/t0460-read-object-git.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+
+test_description='tests for long running read-object process passing git objects'
+
+. ./test-lib.sh
+
+PATH="$PATH:$TEST_DIRECTORY/t0460"
+
+test_expect_success 'setup host repo with a root commit' '
+   test_commit zero &&
+   hash1=$(git ls-tree HEAD | grep zero.t | cut -f1 | cut -d\  -f3)
+'
+
+HELPER="read-object-git"
+
+test_expect_success 'blobs can be retrieved from the host repo' '
+   git init guest-repo &&
+   (cd guest-repo &&
+git config odb.magic.command "$HELPER" &&
+git config odb.magic.fetchKind "gitObject" &&
+git cat-file blob "$hash1")
+'
+
+test_expect_success 'invalid blobs generate errors' '
+   cd guest-repo &&
+   test_must_fail git cat-file blob "invalid"
+'
+
+test_done
diff --git a/t/t0460/read-object-git b/t/t0460/read-object-git
new file mode 100755
index 00..356a22cd4c
--- /dev/null
+++ b/t/t0460/read-object-git
@@ -0,0 +1,67 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git read-object protocol version 1
+# See Documentation/technical/read-object-protocol.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand."  Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# A lazy clone can be simulated via the following commands from the host repo
+# you wish to create a lazy clone of:
+#
+# cd /host_repo
+# git rev-parse HEAD
+# git init /guest_repo
+# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
+#  cut -d' ' -f1 | git pack-objects /e/guest_repo/.git/objects/pack/noblobs
+# cd /guest_repo
+# git config core.virtualizeobjects true
+# git reset --hard 
+#
+# Please note, this sample is a minimal skeleton. No proper error handling 
+# was implemented.
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = "../.git/";
+
+packet_initialize("git-read-object", 1);
+
+packet_read_and_check_capabilities("get");
+packet_write_capabilities("get");
+
+while (1) {
+   my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+   if ( $command eq "get" ) {
+   my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+   packet_bin_read();
+
+   my $path = $sha1;
+   $path =~ s{..}{$&/};
+   $path = $DIR . "/objects/" . $path;
+
+   my $contents = do {
+   local $/;
+   open my $fh, $path or die "Can't open '$path': $!";
+   <$fh>
+   };
+
+   packet_bin_write($contents);
+   packet_flush();
+   packet_txt_write("status=success");
+   packet_flush();
+   } else {
+   die "bad command '$command'";
+   }
+}
-- 
2.13.1.565.gbfcd7a9048
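
The helper in t0460/read-object-git finds a loose object by inserting a "/" after the first two hex digits of the hash (the `s{..}{$&/}` substitution) and prefixing the result with the objects directory. A small C sketch of that mapping follows. loose_object_path() is a hypothetical name, and unlike the script the sketch expects the git directory without a trailing slash.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write "<git_dir>/objects/<first 2 hex digits>/<remaining 38>" to out.
 * Returns 0 on success, -1 on a malformed hash or a too-small buffer.
 */
int loose_object_path(char *out, size_t outsz, const char *git_dir,
		      const char *sha1_hex)
{
	int n;

	if (strlen(sha1_hex) != 40)
		return -1;
	/* %.2s prints only the two-digit fan-out directory name */
	n = snprintf(out, outsz, "%s/objects/%.2s/%s",
		     git_dir, sha1_hex, sha1_hex + 2);
	if (n < 0 || (size_t)n >= outsz)
		return -1;
	return 0;
}
```

The file at that path is the loose object exactly as stored on disk. The test passes it through unchanged, which matches the "gitObject" fetch kind it configures.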



[RFC/PATCH v4 24/49] odb-helper: start fault in implementation

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 external-odb.c | 24 ++--
 odb-helper.c   | 30 --
 odb-helper.h   |  8 +++-
 t/t0400-external-odb.sh|  2 ++
 t/t0410-transfer-e-odb.sh  |  4 +++-
 t/t0420-transfer-http-e-odb.sh |  6 +++---
 6 files changed, 65 insertions(+), 9 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index d11fc98719..0b6e443372 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -20,6 +20,19 @@ static struct odb_helper *find_or_create_helper(const char *name, int len)
return o;
 }
 
+static enum odb_helper_fetch_kind parse_fetch_kind(const char *key,
+  const char *value)
+{
+   if (!strcasecmp(value, "plainobject"))
+   return ODB_FETCH_KIND_PLAIN_OBJECT;
+   else if (!strcasecmp(value, "gitobject"))
+   return ODB_FETCH_KIND_GIT_OBJECT;
+   else if (!strcasecmp(value, "faultin"))
+   return ODB_FETCH_KIND_FAULT_IN;
+
+   die("unknown value for config '%s': %s", key, value);
+}
+
 static int external_odb_config(const char *var, const char *value, void *data)
 {
struct odb_helper *o;
@@ -36,8 +49,15 @@ static int external_odb_config(const char *var, const char *value, void *data)
 
if (!strcmp(key, "command"))
return git_config_string(&o->cmd, var, value);
-   if (!strcmp(key, "plainobjects"))
-   o->store_plain_objects = git_config_bool(var, value);
+   if (!strcmp(key, "fetchkind")) {
+   const char *fetch_kind;
+   int ret = git_config_string(&fetch_kind, var, value);
+   if (!ret) {
+   o->fetch_kind = parse_fetch_kind(var, fetch_kind);
+   free((char *)fetch_kind);
+   }
+   return ret;
+   }
 
return 0;
 }
diff --git a/odb-helper.c b/odb-helper.c
index b33ee81c97..24dc5375cb 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -347,14 +347,40 @@ static int odb_helper_fetch_git_object(struct odb_helper *o,
return 0;
 }
 
+static int odb_helper_fetch_fault_in(struct odb_helper *o,
+const unsigned char *sha1,
+int fd)
+{
+   struct odb_helper_object *obj;
+   struct odb_helper_cmd cmd;
+
+   obj = odb_helper_lookup(o, sha1);
+   if (!obj)
+   return -1;
+
+   if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
+   return -1;
+
+   if (odb_helper_finish(o, &cmd))
+   return -1;
+
+   return 0;
+}
+
 int odb_helper_fetch_object(struct odb_helper *o,
const unsigned char *sha1,
int fd)
 {
-   if (o->store_plain_objects)
+   switch(o->fetch_kind) {
+   case ODB_FETCH_KIND_PLAIN_OBJECT:
return odb_helper_fetch_plain_object(o, sha1, fd);
-   else
+   case ODB_FETCH_KIND_GIT_OBJECT:
return odb_helper_fetch_git_object(o, sha1, fd);
+   case ODB_FETCH_KIND_FAULT_IN:
+   return odb_helper_fetch_fault_in(o, sha1, fd);
+   default:
+   BUG("invalid fetch kind '%d'", o->fetch_kind);
+   }
 }
 
 int odb_helper_for_each_object(struct odb_helper *o,
diff --git a/odb-helper.h b/odb-helper.h
index 3953b9bbaf..e3ad8e3316 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -3,10 +3,16 @@
 
 #include "external-odb.h"
 
+enum odb_helper_fetch_kind {
+   ODB_FETCH_KIND_PLAIN_OBJECT = 0,
+   ODB_FETCH_KIND_GIT_OBJECT,
+   ODB_FETCH_KIND_FAULT_IN
+};
+
 struct odb_helper {
const char *name;
const char *cmd;
-   int store_plain_objects;
+   enum odb_helper_fetch_kind fetch_kind;
 
struct odb_helper_object {
unsigned char sha1[20];
diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index 3c868cad4c..c3cb0fdc84 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -49,6 +49,7 @@ test_expect_success 'alt objects are missing' '
 
 test_expect_success 'helper can retrieve alt objects' '
test_config odb.magic.command "$HELPER" &&
+   test_config odb.magic.fetchKind "gitObject" &&
cat >expect <<-\EOF &&
two
one
@@ -68,6 +69,7 @@ test_expect_success 'helper can add objects to alt repo' '
 
 test_expect_success 'commit adds objects to alt repo' '
test_config odb.magic.command "$HELPER" &&
+   test_config odb.magic.fetchKind "gitObject" &&
test_commit three &&
hash3=$(git ls-tree HEAD | grep three.t | cut -f1 | cut -d\  -f3) &&
content=$(cd alt-repo
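
On the config side of this patch: external_odb_config() receives names of the form "odb.<name>.<key>" and splits them at the last dot, so a helper name may itself contain dots. Git hands the section and key parts to config callbacks in lowercase, which is why the tests can set "odb.magic.fetchKind" while the code compares against "fetchkind". A standalone sketch of the split follows. split_odb_var() is a made-up name, and plain strncmp() stands in for Git's skip_prefix().

```c
#include <string.h>

/*
 * Split "odb.<name>.<key>" at the last dot. On success, *name points at
 * the helper name (not NUL-terminated, *name_len bytes long) and *key at
 * the final component. Returns 0 on success, -1 if var does not match.
 */
int split_odb_var(const char *var, const char **name, size_t *name_len,
		  const char **key)
{
	const char *rest, *dot;

	if (strncmp(var, "odb.", 4))
		return -1;
	rest = var + 4;
	dot = strrchr(rest, '.');	/* last dot, so names may contain dots */
	if (!dot)
		return -1;
	*name = rest;
	*name_len = (size_t)(dot - rest);
	*key = dot + 1;
	return 0;
}
```

Because the name is returned as a pointer plus a length, the caller can look up or create the helper without copying first, which is the shape find_or_create_helper(key, dot - key) uses.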

[RFC/PATCH v4 45/49] clone: add initial param to write_remote_refs()

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 builtin/clone.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 370a233d22..bd690576e6 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -572,7 +572,7 @@ static struct ref *wanted_peer_refs(const struct ref *refs,
return local_refs;
 }
 
-static void write_remote_refs(const struct ref *local_refs)
+static void write_remote_refs(const struct ref *local_refs, int initial)
 {
const struct ref *r;
 
@@ -591,8 +591,13 @@ static void write_remote_refs(const struct ref *local_refs)
die("%s", err.buf);
}
 
-   if (initial_ref_transaction_commit(t, &err))
-   die("%s", err.buf);
+   if (initial) {
+   if (initial_ref_transaction_commit(t, &err))
+   die("%s", err.buf);
+   } else {
+   if (ref_transaction_commit(t, &err))
+   die("%s", err.buf);
+   }
 
strbuf_release(&err);
ref_transaction_free(t);
@@ -639,7 +644,8 @@ static void update_remote_refs(const struct ref *refs,
   const char *branch_top,
   const char *msg,
   struct transport *transport,
-  int check_connectivity)
+  int check_connectivity,
+  int initial)
 {
const struct ref *rm = mapped_refs;
 
@@ -654,7 +660,7 @@ static void update_remote_refs(const struct ref *refs,
}
 
if (refs) {
-   write_remote_refs(mapped_refs);
+   write_remote_refs(mapped_refs, initial);
if (option_single_branch && !option_no_tags)
write_followtags(refs, msg);
}
@@ -1163,7 +1169,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
transport_fetch_refs(transport, mapped_refs);
 
update_remote_refs(refs, mapped_refs, remote_head_points_at,
-  branch_top.buf, reflog_msg.buf, transport, !is_local);
+  branch_top.buf, reflog_msg.buf, transport,
+  !is_local, 0);
 
update_head(our_head_points_at, remote_head, reflog_msg.buf);
 
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 46/49] clone: add --initial-refspec option

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 builtin/clone.c | 55 ++-
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index bd690576e6..dda0ad360b 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -55,6 +55,7 @@ static enum transport_family family;
 static struct string_list option_config = STRING_LIST_INIT_NODUP;
 static struct string_list option_required_reference = STRING_LIST_INIT_NODUP;
 static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
+static struct string_list option_initial_refspec = STRING_LIST_INIT_NODUP;
 static int option_dissociate;
 static int max_jobs = -1;
 static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
@@ -105,6 +106,8 @@ static struct option builtin_clone_options[] = {
N_("reference repository")),
OPT_STRING_LIST(0, "reference-if-able", &option_optional_reference,
N_("repo"), N_("reference repository")),
+   OPT_STRING_LIST(0, "initial-refspec", &option_initial_refspec,
+   N_("refspec"), N_("fetch this refspec first")),
OPT_BOOL(0, "dissociate", &option_dissociate,
 N_("use --reference only while cloning")),
OPT_STRING('o', "origin", &option_origin, N_("name"),
@@ -864,6 +867,47 @@ static void dissociate_from_references(void)
free(alternates);
 }
 
+static struct refspec *parse_initial_refspecs(void)
+{
+   const char **refspecs;
+   struct refspec *initial_refspecs;
+   struct string_list_item *rs;
+   int i = 0;
+
+   if (!option_initial_refspec.nr)
+   return NULL;
+
+   refspecs = xcalloc(option_initial_refspec.nr, sizeof(const char *));
+
+   for_each_string_list_item(rs, &option_initial_refspec)
+   refspecs[i++] = rs->string;
+
+   initial_refspecs = parse_fetch_refspec(option_initial_refspec.nr, refspecs);
+
+   free(refspecs);
+
+   return initial_refspecs;
+}
+
+static void fetch_initial_refs(struct transport *transport,
+  const struct ref *refs,
+  struct refspec *initial_refspecs,
+  const char *branch_top,
+  const char *reflog_msg,
+  int is_local)
+{
+   int i;
+
+   for (i = 0; i < option_initial_refspec.nr; i++) {
+   struct ref *init_refs = NULL;
+   struct ref **tail = &init_refs;
+   get_fetch_map(refs, &initial_refspecs[i], &tail, 0);
+   transport_fetch_refs(transport, init_refs);
+   update_remote_refs(refs, init_refs, NULL, branch_top, reflog_msg,
+  transport, !is_local, 1);
+   }
+}
+
 int cmd_clone(int argc, const char **argv, const char *prefix)
 {
int is_bundle = 0, is_local;
@@ -887,6 +931,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
struct refspec *refspec;
const char *fetch_pattern;
 
+   struct refspec *initial_refspecs;
+   int is_initial;
+
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
 builtin_clone_usage, 0);
@@ -1054,6 +1101,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
if (option_required_reference.nr || option_optional_reference.nr)
setup_reference();
 
+   initial_refspecs = parse_initial_refspecs();
+
fetch_pattern = xstrfmt("+%s*:%s*", src_ref_prefix, branch_top.buf);
refspec = parse_fetch_refspec(1, &fetch_pattern);
free((char *)fetch_pattern);
@@ -1109,6 +1158,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
refs = transport_get_remote_refs(transport);
 
if (refs) {
+   fetch_initial_refs(transport, refs, initial_refspecs,
+  branch_top.buf, reflog_msg.buf, is_local);
+
mapped_refs = wanted_peer_refs(refs, refspec);
/*
 * transport_get_remote_refs() may return refs with null sha-1
@@ -1168,9 +1220,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
else if (refs && complete_refs_before_fetch)
transport_fetch_refs(transport, mapped_refs);
 
+   is_initial = !refs || option_initial_refspec.nr == 0;
update_remote_refs(refs, mapped_refs, remote_head_points_at,
   branch_top.buf, reflog_msg.buf, transport,
-  !is_local, 0);
+  !is_local, is_initial);
 
update_head(our_head_points_at, remote_head, reflog_msg.buf);
 
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 43/49] odb-helper: advertise 'put' capability

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 odb-helper.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/odb-helper.c b/odb-helper.c
index 2cd1f25e83..2e5d8af526 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -67,7 +67,11 @@ static int start_read_object_fn(struct subprocess_entry *subprocess)
if (err)
goto done;
 
-   err = packet_writel(process->in, "capability=get", "capability=have", NULL);
+   err = packet_writel(process->in,
+   "capability=get",
+   "capability=put",
+   "capability=have",
+   NULL);
if (err)
goto done;
 
-- 
2.13.1.565.gbfcd7a9048
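
Elsewhere in the series the negotiated capabilities are tested as bits, for example `ODB_HELPER_CAP_PUT & entry->supported_capabilities`. The sketch below shows one way "capability=<name>" lines could be folded into such a mask. The CAP_* values and both function names are assumptions made for illustration, since the series' own definitions are not shown in this message.

```c
#include <string.h>

/* Bit values are assumptions for this sketch; the series defines its own. */
#define CAP_GET  (1u << 0)
#define CAP_PUT  (1u << 1)
#define CAP_HAVE (1u << 2)

/* Map one "capability=<name>" line to its bit, or 0 if unrecognized. */
unsigned int parse_capability(const char *line)
{
	static const char prefix[] = "capability=";
	const char *name;

	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return 0;
	name = line + sizeof(prefix) - 1;
	if (!strcmp(name, "get"))
		return CAP_GET;
	if (!strcmp(name, "put"))
		return CAP_PUT;
	if (!strcmp(name, "have"))
		return CAP_HAVE;
	return 0;
}

/* OR together the bits for every line the helper echoed back. */
unsigned int collect_capabilities(const char *const *lines, size_t nr)
{
	unsigned int caps = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		caps |= parse_capability(lines[i]);
	return caps;
}
```

A helper that answers only "capability=get" and "capability=have" ends up without the put bit, which is the case write_object_process() rejects with -1.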



[RFC/PATCH v4 48/49] Add test for 'clone --initial-refspec'

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t5616-clone-initial-refspec.sh | 48 
 1 file changed, 48 insertions(+)
 create mode 100755 t/t5616-clone-initial-refspec.sh

diff --git a/t/t5616-clone-initial-refspec.sh b/t/t5616-clone-initial-refspec.sh
new file mode 100755
index 00..ccbc27f83f
--- /dev/null
+++ b/t/t5616-clone-initial-refspec.sh
@@ -0,0 +1,48 @@
+#!/bin/sh
+
+test_description='test clone with --initial-refspec option'
+. ./test-lib.sh
+
+
+test_expect_success 'setup regular repo' '
+   # Make two branches, "master" and "side"
+   echo one >file &&
+   git add file &&
+   git commit -m one &&
+   echo two >file &&
+   git commit -a -m two &&
+   git tag two &&
+   echo three >file &&
+   git commit -a -m three &&
+   git checkout -b side &&
+   echo four >file &&
+   git commit -a -m four &&
+   git checkout master
+'
+
+test_expect_success 'add a special ref pointing to a blob' '
+   hash=$(echo "Hello world!" | git hash-object -w -t blob --stdin) &&
+   git update-ref refs/special/hello "$hash"
+'
+
+test_expect_success 'no-local clone from the first repo' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone --no-local .. . &&
+test_must_fail git cat-file blob "$hash") &&
+   rm -rf my-clone
+'
+
+test_expect_success 'no-local clone with --initial-refspec' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone --no-local --initial-refspec "refs/special/*:refs/special/*" .. . &&
+git cat-file blob "$hash" &&
+git rev-parse refs/special/hello >actual &&
+echo "$hash" >expected &&
+test_cmp expected actual) &&
+   rm -rf my-clone
+'
+
+test_done
+
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 41/49] external-odb: add external_odb_do_fetch_object()

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 external-odb.c | 52 ++--
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 8c2570b2e7..c39f207dd3 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -95,32 +95,11 @@ const char *external_odb_root(void)
return root;
 }
 
-int external_odb_has_object(const unsigned char *sha1)
-{
-   struct odb_helper *o;
-
-   if (!use_external_odb)
-   return 0;
-
-   external_odb_init();
-
-   for (o = helpers; o; o = o->next) {
-   if (!(o->supported_capabilities & ODB_HELPER_CAP_HAVE))
-   return 1;
-   if (odb_helper_has_object(o, sha1))
-   return 1;
-   }
-   return 0;
-}
-
-int external_odb_fetch_object(const unsigned char *sha1)
+static int external_odb_do_fetch_object(const unsigned char *sha1)
 {
struct odb_helper *o;
const char *path;
 
-   if (!external_odb_has_object(sha1))
-   return -1;
-
path = sha1_file_name_alt(external_odb_root(), sha1);
safe_create_leading_directories_const(path);
prepare_external_alt_odb();
@@ -175,6 +154,35 @@ int external_odb_fault_in_object(const unsigned char *sha1)
return -1;
 }
 
+int external_odb_has_object(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   if (!use_external_odb)
+   return 0;
+
+   external_odb_init();
+
+   for (o = helpers; o; o = o->next) {
+   if (!(o->supported_capabilities & ODB_HELPER_CAP_HAVE)) {
+   if (o->fetch_kind == ODB_FETCH_KIND_FAULT_IN)
+   return 1;
+   return !external_odb_do_fetch_object(sha1);
+   }
+   if (odb_helper_has_object(o, sha1))
+   return 1;
+   }
+   return 0;
+}
+
+int external_odb_fetch_object(const unsigned char *sha1)
+{
+   if (!external_odb_has_object(sha1))
+   return -1;
+
+   return external_odb_do_fetch_object(sha1);
+}
+
 int external_odb_for_each_object(each_external_object_fn fn, void *data)
 {
struct odb_helper *o;
-- 
2.13.1.565.gbfcd7a9048
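
Note for readers of this patch: the reworked loop in
external_odb_has_object() can be modeled in isolation. What follows is
a minimal sketch, not code from the series. struct toy_helper, the
TOY_* constants and the has_object/fetch_ok fields are hypothetical
stand-ins for struct odb_helper, ODB_HELPER_CAP_HAVE,
odb_helper_has_object() and external_odb_do_fetch_object(); the
use_external_odb check is left out.

```c
#include <stddef.h>

#define TOY_CAP_HAVE (1u << 2)	/* hypothetical bit value */

enum toy_fetch_kind { TOY_FETCH_PLAIN, TOY_FETCH_GIT, TOY_FETCH_FAULT_IN };

struct toy_helper {
	unsigned int caps;		/* advertised capabilities */
	enum toy_fetch_kind fetch_kind;
	int has_object;			/* answer a "have" query would give */
	int fetch_ok;			/* 1 if fetching the object would succeed */
	struct toy_helper *next;
};

/* Returns 1 if the object is considered available, 0 otherwise. */
static int toy_has_object(const struct toy_helper *helpers)
{
	const struct toy_helper *o;

	for (o = helpers; o; o = o->next) {
		if (!(o->caps & TOY_CAP_HAVE)) {
			/* No "have": fault-in helpers are trusted blindly ... */
			if (o->fetch_kind == TOY_FETCH_FAULT_IN)
				return 1;
			/* ... the others are probed by actually fetching. */
			return o->fetch_ok;
		}
		if (o->has_object)
			return 1;
	}
	return 0;
}
```

In this model, as in the patch, the first helper that lacks the "have"
capability decides the result, so helpers later in the list are never
asked.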



[RFC/PATCH v4 49/49] t: add t0430 to test cloning using bundles

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0430-clone-bundle-e-odb.sh | 91 +++
 1 file changed, 91 insertions(+)
 create mode 100755 t/t0430-clone-bundle-e-odb.sh

diff --git a/t/t0430-clone-bundle-e-odb.sh b/t/t0430-clone-bundle-e-odb.sh
new file mode 100755
index 00..8934bea006
--- /dev/null
+++ b/t/t0430-clone-bundle-e-odb.sh
@@ -0,0 +1,91 @@
+#!/bin/sh
+
+test_description='tests for cloning using a bundle through e-odb'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+# odb helper script must see this
+export HTTPD_URL
+
+write_script odb-clone-bundle-helper <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
+echo >&2 "odb-clone-bundle-helper args:" "$@"
+case "$1" in
+get_cap)
+   echo "capability=get"
+   echo "capability=have"
+   ;;
+have)
+   ref_hash=$(git rev-parse refs/odbs/magic/bundle) ||
+   die "couldn't find refs/odbs/magic/bundle"
+   GIT_NO_EXTERNAL_ODB=1 git cat-file blob "$ref_hash" >bundle_info ||
+   die "couldn't get blob $ref_hash"
+   bundle_url=$(sed -e 's/bundle url: //' bundle_info)
+   echo >&2 "bundle_url: '$bundle_url'"
+   curl "$bundle_url" -o bundle_file ||
+   die "curl '$bundle_url' failed"
+   GIT_NO_EXTERNAL_ODB=1 git bundle unbundle bundle_file >unbundling_info ||
+   die "unbundling 'bundle_file' failed"
+   ;;
+get)
+   die "odb-clone-bundle-helper 'get' called"
+   ;;
+put)
+   die "odb-clone-bundle-helper 'put' called"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
+esac
+EOF
+HELPER="\"$PWD\"/odb-clone-bundle-helper"
+
+
+test_expect_success 'setup repo with a few commits' '
+   test_commit one &&
+   test_commit two &&
+   test_commit three &&
+   test_commit four
+'
+
+BUNDLE_FILE="file.bundle"
+FILES_DIR="httpd/www/files"
+GET_URL="$HTTPD_URL/files/$BUNDLE_FILE"
+
+test_expect_success 'create a bundle for this repo and check that it can be downloaded' '
+   git bundle create "$BUNDLE_FILE" master &&
+   mkdir "$FILES_DIR" &&
+   cp "$BUNDLE_FILE" "$FILES_DIR/" &&
+   curl "$GET_URL" --output actual &&
+   test_cmp "$BUNDLE_FILE" actual
+'
+
+test_expect_success 'create an e-odb ref for this bundle' '
+   ref_hash=$(echo "bundle url: $GET_URL" | GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) &&
+   git update-ref refs/odbs/magic/bundle "$ref_hash"
+'
+
+test_expect_success 'clone using the e-odb helper to download and install the bundle' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone --no-local \
+   -c odb.magic.command="$HELPER" \
+   -c odb.magic.fetchKind="faultin" \
+   -c odb.magic.scriptMode="true" \
+   --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
+stop_httpd
+
+test_done
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 42/49] odb-helper: advertise 'have' capability

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 odb-helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/odb-helper.c b/odb-helper.c
index e21113c0b8..2cd1f25e83 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -67,7 +67,7 @@ static int start_read_object_fn(struct subprocess_entry *subprocess)
if (err)
goto done;
 
-   err = packet_writel(process->in, "capability=get", NULL);
+   err = packet_writel(process->in, "capability=get", "capability=have", NULL);
if (err)
goto done;
 
-- 
2.13.1.565.gbfcd7a9048
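
Note for readers of this patch: each string handed to packet_writel()
goes out as one pkt-line, that is, a 4-digit hexadecimal length (which
counts the 4 length bytes themselves) followed by the text and a LF,
and the list is closed by a "0000" flush packet. The sketch below
shows that framing for a single text line. It is an illustration under
those assumptions, not Git's pkt-line.c; toy_pkt_line() is a
hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Frame `line` as a text pkt-line in `out` (NUL-terminated for
 * convenience). Returns the frame length in bytes, or -1 if the line
 * is too long for a pkt-line or does not fit in `out`.
 */
static int toy_pkt_line(char *out, size_t outsz, const char *line)
{
	/* 4 length digits + payload + trailing LF */
	size_t len = 4 + strlen(line) + 1;

	/* 65520 is the largest pkt-line Git accepts */
	if (len > 65520 || len + 1 > outsz)
		return -1;

	snprintf(out, outsz, "%04x%s\n", (unsigned int)len, line);
	return (int)len;
}
```

With this framing "capability=get" becomes "0013capability=get" plus a
LF (19 bytes), and "capability=have" becomes "0014capability=have"
plus a LF (20 bytes).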



[RFC/PATCH v4 47/49] clone: disable external odb before initial clone

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 builtin/clone.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index dda0ad360b..a0d7b2bd2f 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -933,6 +933,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
struct refspec *initial_refspecs;
int is_initial;
+   int saved_use_external_odb;
 
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
@@ -1078,6 +1079,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
git_config(git_default_config, NULL);
 
+   /* Temporarily disable external ODB before initial clone */
+   saved_use_external_odb = use_external_odb;
+   use_external_odb = 0;
+
if (option_bare) {
if (option_mirror)
src_ref_prefix = "refs/";
@@ -1161,6 +1166,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
fetch_initial_refs(transport, refs, initial_refspecs,
   branch_top.buf, reflog_msg.buf, is_local);
 
+   use_external_odb = saved_use_external_odb;
+
mapped_refs = wanted_peer_refs(refs, refspec);
/*
 * transport_get_remote_refs() may return refs with null sha-1
@@ -1202,6 +1209,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
option_branch, option_origin);
 
warning(_("You appear to have cloned an empty repository."));
+
+   use_external_odb = saved_use_external_odb;
+
mapped_refs = NULL;
our_head_points_at = NULL;
remote_head_points_at = NULL;
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 34/49] odb-helper: fix odb_helper_fetch_object() for read_object

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 odb-helper.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 910c87a482..0017faa36e 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -584,15 +584,19 @@ int odb_helper_fetch_object(struct odb_helper *o,
const unsigned char *sha1,
int fd)
 {
-   switch(o->fetch_kind) {
-   case ODB_FETCH_KIND_PLAIN_OBJECT:
-   return odb_helper_fetch_plain_object(o, sha1, fd);
-   case ODB_FETCH_KIND_GIT_OBJECT:
-   return odb_helper_fetch_git_object(o, sha1, fd);
-   case ODB_FETCH_KIND_FAULT_IN:
-   return 0;
-   default:
-   BUG("invalid fetch kind '%d'", o->fetch_kind);
+   if (o->script_mode) {
+   switch(o->fetch_kind) {
+   case ODB_FETCH_KIND_PLAIN_OBJECT:
+   return odb_helper_fetch_plain_object(o, sha1, fd);
+   case ODB_FETCH_KIND_GIT_OBJECT:
+   return odb_helper_fetch_git_object(o, sha1, fd);
+   case ODB_FETCH_KIND_FAULT_IN:
+   return 0;
+   default:
+   BUG("invalid fetch kind '%d'", o->fetch_kind);
+   }
+   } else {
+   return read_object_process(o, sha1, fd);
}
 }
 
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 31/49] external-odb: add external_odb_get_capabilities()

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 external-odb.c | 15 ++-
 odb-helper.c   | 23 +++
 odb-helper.h   |  1 +
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/external-odb.c b/external-odb.c
index 2efa805d12..8c2570b2e7 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -66,6 +66,14 @@ static int external_odb_config(const char *var, const char *value, void *data)
return 0;
 }
 
+static void external_odb_get_capabilities(void)
+{
+   struct odb_helper *o;
+
+   for (o = helpers; o; o = o->next)
+   odb_helper_get_capabilities(o);
+}
+
 static void external_odb_init(void)
 {
static int initialized;
@@ -75,6 +83,8 @@ static void external_odb_init(void)
initialized = 1;
 
git_config(external_odb_config, NULL);
+
+   external_odb_get_capabilities();
 }
 
 const char *external_odb_root(void)
@@ -94,9 +104,12 @@ int external_odb_has_object(const unsigned char *sha1)
 
external_odb_init();
 
-   for (o = helpers; o; o = o->next)
+   for (o = helpers; o; o = o->next) {
+   if (!(o->supported_capabilities & ODB_HELPER_CAP_HAVE))
+   return 1;
if (odb_helper_has_object(o, sha1))
return 1;
+   }
return 0;
 }
 
diff --git a/odb-helper.c b/odb-helper.c
index 20e83cb55a..a6bf81af8d 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -260,6 +260,29 @@ static int odb_helper_finish(struct odb_helper *o,
return 0;
 }
 
+int odb_helper_get_capabilities(struct odb_helper *o)
+{
+   struct odb_helper_cmd cmd;
+   FILE *fh;
+   struct strbuf line = STRBUF_INIT;
+
+   if (!o->script_mode)
+   return 0;
+
+   if (odb_helper_start(o, &cmd, 0, "get_cap") < 0)
+   return -1;
+
+   fh = xfdopen(cmd.child.out, "r");
+   while (strbuf_getline(&line, fh) != EOF)
+   parse_capabilities(line.buf, &o->supported_capabilities, o->name);
+
+   strbuf_release(&line);
+   fclose(fh);
+   odb_helper_finish(o, &cmd);
+
+   return 0;
+}
+
 static int parse_object_line(struct odb_helper_object *o, const char *line)
 {
char *end;
diff --git a/odb-helper.h b/odb-helper.h
index b23544aa4a..8e0b0fc781 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -33,6 +33,7 @@ struct odb_helper {
 };
 
 struct odb_helper *odb_helper_new(const char *name, int namelen);
+int odb_helper_get_capabilities(struct odb_helper *o);
 int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1);
 int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1,
int fd);
-- 
2.13.1.565.gbfcd7a9048
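
Note for readers of this patch: odb_helper_get_capabilities() feeds
each line printed by the helper's "get_cap" command to
parse_capabilities(), which turns "capability=<name>" into a bit in
supported_capabilities. A self-contained sketch of that mapping
follows. The TOY_CAP_* bit values and the toy_parse_capability() name
are assumptions made for illustration. Unlike the real function, which
warns about unknown capabilities and silently skips other lines, this
version reports both cases with -1.

```c
#include <string.h>

#define TOY_CAP_GET  (1u << 0)	/* hypothetical bit values */
#define TOY_CAP_PUT  (1u << 1)
#define TOY_CAP_HAVE (1u << 2)

/*
 * Set the bit matching a "capability=<name>" line in *caps.
 * Returns 0 if the line named a known capability, -1 otherwise.
 */
static int toy_parse_capability(const char *line, unsigned int *caps)
{
	static const char prefix[] = "capability=";
	const char *name;

	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return -1;	/* not a capability line */
	name = line + sizeof(prefix) - 1;

	if (!strcmp(name, "get"))
		*caps |= TOY_CAP_GET;
	else if (!strcmp(name, "put"))
		*caps |= TOY_CAP_PUT;
	else if (!strcmp(name, "have"))
		*caps |= TOY_CAP_HAVE;
	else
		return -1;	/* unknown capability */
	return 0;
}
```

A helper that prints "capability=get" and "capability=have", like the
one in t0430, would end up with TOY_CAP_GET | TOY_CAP_HAVE set.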



[RFC/PATCH v4 30/49] odb-helper: add read_object_process()

2017-06-20 Thread Christian Couder
From: Ben Peart <benpe...@microsoft.com>

Signed-off-by: Ben Peart <benpe...@microsoft.com>
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 odb-helper.c | 202 ---
 odb-helper.h |   5 ++
 sha1_file.c  |  33 +-
 3 files changed, 227 insertions(+), 13 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index 5fb56c6135..20e83cb55a 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -4,6 +4,187 @@
 #include "odb-helper.h"
 #include "run-command.h"
 #include "sha1-lookup.h"
+#include "sub-process.h"
+#include "pkt-line.h"
+#include "sigchain.h"
+
+struct read_object_process {
+   struct subprocess_entry subprocess;
+   unsigned int supported_capabilities;
+};
+
+static int subprocess_map_initialized;
+static struct hashmap subprocess_map;
+
+static void parse_capabilities(char *cap_buf,
+  unsigned int *supported_capabilities,
+  const char *process_name)
+{
+   struct string_list cap_list = STRING_LIST_INIT_NODUP;
+
+   string_list_split_in_place(&cap_list, cap_buf, '=', 1);
+
+   if (cap_list.nr == 2 && !strcmp(cap_list.items[0].string, "capability")) {
+   const char *cap_name = cap_list.items[1].string;
+
+   if (!strcmp(cap_name, "get")) {
+   *supported_capabilities |= ODB_HELPER_CAP_GET;
+   } else if (!strcmp(cap_name, "put")) {
+   *supported_capabilities |= ODB_HELPER_CAP_PUT;
+   } else if (!strcmp(cap_name, "have")) {
+   *supported_capabilities |= ODB_HELPER_CAP_HAVE;
+   } else {
+   warning("external process '%s' requested unsupported read-object capability '%s'",
+   process_name, cap_name);
+   }
+   }
+
+   string_list_clear(&cap_list, 0);
+}
+
+static int start_read_object_fn(struct subprocess_entry *subprocess)
+{
+   int err;
+   struct read_object_process *entry = (struct read_object_process *)subprocess;
+   struct child_process *process = &subprocess->process;
+   char *cap_buf;
+
+   sigchain_push(SIGPIPE, SIG_IGN);
+
+   err = packet_writel(process->in, "git-read-object-client", "version=1", NULL);
+   if (err)
+   goto done;
+
+   err = strcmp(packet_read_line(process->out, NULL), "git-read-object-server");
+   if (err) {
+   error("external process '%s' does not support read-object protocol version 1", subprocess->cmd);
+   goto done;
+   }
+   err = strcmp(packet_read_line(process->out, NULL), "version=1");
+   if (err)
+   goto done;
+   err = packet_read_line(process->out, NULL) != NULL;
+   if (err)
+   goto done;
+
+   err = packet_writel(process->in, "capability=get", NULL);
+   if (err)
+   goto done;
+
+   while ((cap_buf = packet_read_line(process->out, NULL)))
+   parse_capabilities(cap_buf, &entry->supported_capabilities, subprocess->cmd);
+
+done:
+   sigchain_pop(SIGPIPE);
+
+   return err;
+}
+
+static struct read_object_process *launch_read_object_process(const char *cmd)
+{
+   struct read_object_process *entry;
+
+   if (!subprocess_map_initialized) {
+   subprocess_map_initialized = 1;
+   hashmap_init(&subprocess_map, (hashmap_cmp_fn) cmd2process_cmp, 0);
+   entry = NULL;
+   } else {
+   entry = (struct read_object_process *)subprocess_find_entry(&subprocess_map, cmd);
+   }
+
+   fflush(NULL);
+
+   if (!entry) {
+   entry = xmalloc(sizeof(*entry));
+   entry->supported_capabilities = 0;
+
+   if (subprocess_start(&subprocess_map, &entry->subprocess, cmd, start_read_object_fn)) {
+   free(entry);
+   return 0;
+   }
+   }
+
+   return entry;
+}
+
+static int check_object_process_error(int err,
+ const char *status,
+ struct read_object_process *entry,
+ const char *cmd,
+ unsigned int capability)
+{
+   if (!err)
+   return 0;
+
+   if (!strcmp(status, "error")) {
+   /* The process signaled a problem with the file. */
+   } else if (!strcmp(status, "notfound")) {
+   /* Object was not found */
+   err = -1;
+   } else if (!strcmp(status, "abort")) {
+   /*
+* The process signaled a permanent problem. Don't try to read
+* objects with the same command

[RFC/PATCH v4 19/49] lib-httpd: add list.sh

2017-06-20 Thread Christian Couder
This cgi script can list Git objects that have been uploaded as
files to an apache web server. This script can also retrieve
the content of each of these files.

This will help make apache work as an external object database.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/lib-httpd.sh  |  1 +
 t/lib-httpd/list.sh | 41 +
 2 files changed, 42 insertions(+)
 create mode 100644 t/lib-httpd/list.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index d80b004549..f31ea261f5 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -133,6 +133,7 @@ prepare_httpd() {
install_script broken-smart-http.sh
install_script error.sh
install_script upload.sh
+   install_script list.sh
 
ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/list.sh b/t/lib-httpd/list.sh
new file mode 100644
index 00..7e520e507a
--- /dev/null
+++ b/t/lib-httpd/list.sh
@@ -0,0 +1,41 @@
+#!/bin/sh
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+key=${1%=*}
+val=${1#*=}
+
+case "$key" in
+   "sha1") sha1="$val" ;;
+   *) echo >&2 "unknown key '$key'" ;;
+esac
+
+shift
+done
+
+if test -d "$FILES_DIR"
+then
+if test -z "$sha1"
+then
+   echo 'Status: 200 OK'
+   echo
+   ls "$FILES_DIR" | tr '-' ' '
+else
+   if test -f "$FILES_DIR/$sha1"-*
+   then
+   echo 'Status: 200 OK'
+   echo
+   cat "$FILES_DIR/$sha1"-*
+   else
+   echo 'Status: 404 Not Found'
+   echo
+   fi
+fi
+fi
-- 
2.13.1.565.gbfcd7a9048
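
Note for readers of this patch: list.sh splits QUERY_STRING on '&' and
then takes the key and value of each pair around the '='. For anyone
who wants the same lookup outside of a shell, here is a rough C
equivalent. It is only a sketch: toy_query_value() is a hypothetical
helper, it does no URL decoding, and it splits every pair at its first
'=' (the script strips the key at the last '=' instead).

```c
#include <string.h>

/*
 * Copy the value of `key` from an '&'-separated query string into
 * `out`. Returns 0 if the key was found and its value fits, -1
 * otherwise.
 */
static int toy_query_value(const char *query, const char *key,
			   char *out, size_t outsz)
{
	size_t klen = strlen(key);
	const char *p = query;

	while (*p) {
		const char *end = strchr(p, '&');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		/* the pair must look like "<key>=<value>" */
		if (len > klen && !strncmp(p, key, klen) && p[klen] == '=') {
			size_t vlen = len - klen - 1;

			if (vlen >= outsz)
				return -1;
			memcpy(out, p + klen + 1, vlen);
			out[vlen] = '\0';
			return 0;
		}

		/* move on to the next pair, skipping the separator */
		p += len;
		if (*p == '&')
			p++;
	}
	return -1;
}
```

Looking up "sha1" in a query string such as "sha1=<hex>" yields the
hex string that list.sh matches against the files in www/files.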



[RFC/PATCH v4 40/49] Add t0480 to test "have" capability and plain objects

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0480-read-object-have-http-e-odb.sh | 123 +
 t/t0480/read-object-plain-have | 116 +++
 2 files changed, 239 insertions(+)
 create mode 100755 t/t0480-read-object-have-http-e-odb.sh
 create mode 100755 t/t0480/read-object-plain-have

diff --git a/t/t0480-read-object-have-http-e-odb.sh b/t/t0480-read-object-have-http-e-odb.sh
new file mode 100755
index 00..52fb4d46c9
--- /dev/null
+++ b/t/t0480-read-object-have-http-e-odb.sh
@@ -0,0 +1,123 @@
+#!/bin/sh
+
+test_description='tests for read-object process with "have" cap and plain objects'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+PATH="$PATH:$TEST_DIRECTORY/t0480"
+
+# odb helper script must see this
+export HTTPD_URL
+
+HELPER="read-object-plain-have"
+
+test_expect_success 'setup repo with a root commit' '
+   test_commit zero
+'
+
+test_expect_success 'setup another repo from the first one' '
+   git init other-repo &&
+   (cd other-repo &&
+git remote add origin .. &&
+git pull origin master &&
+git checkout master &&
+git log)
+'
+
+test_expect_success 'setup the helper in the root repo' '
+   git config odb.magic.command "$HELPER" &&
+   git config odb.magic.fetchKind "plainObject"
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME&size=123&type=blob"
+
+test_expect_success 'can upload a file' '
+   echo "Hello Apache World!" >hello_to_send.txt &&
+   echo "How are you?" >>hello_to_send.txt &&
+   curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" >out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+   curl --include "$LIST_URL" >out_list &&
+   grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+   curl --data "delete" --include "$UPLOAD_URL&delete=1" >out_delete &&
+   curl --include "$LIST_URL" >out_list2 &&
+   ! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transferred to the http server' '
+   test_commit one &&
+   hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+   echo "$hash1-4-blob" >expected &&
+   ls "$FILES_DIR" >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+   git cat-file blob "$hash1" &&
+   git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+   (cd other-repo &&
+git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+test_must_fail git cat-file blob "$hash1" &&
+git config odb.magic.command "$HELPER" &&
+git config odb.magic.fetchKind "plainObject" &&
+git cat-file blob "$hash1" &&
+git pull origin master)
+'
+
+test_expect_success 'local clone from the first repo' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone .. . &&
+git cat-file blob "$hash1")
+'
+
+test_expect_success 'no-local clone from the first repo fails' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+test_must_fail git clone --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+test_expect_success 'no-local clone from the first repo with helper succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git clone -c odb.magic.command="$HELPER" \
+   -c odb.magic.plainObjects="true" \
+   --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+test_expect_success 'no-local initial-refspec clone succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git config odb.magic.command "$HELPER" &&
+git config odb.magic.fetchKind "plainObject" &&
+git -c odb.magic.command="$HELPER" \
+   -c odb.magic.plainObjects="true" \
+   clone --no-local --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'

[RFC/PATCH v4 37/49] odb-helper: add read_packetized_plain_object_to_fd()

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 odb-helper.c | 119 ++-
 1 file changed, 118 insertions(+), 1 deletion(-)

diff --git a/odb-helper.c b/odb-helper.c
index a27208463c..b2d86a7928 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -142,6 +142,121 @@ static int check_object_process_error(int err,
return err;
 }
 
+static struct odb_helper_object *odb_helper_lookup(struct odb_helper *o,
+  const unsigned char *sha1);
+
+static ssize_t read_packetized_plain_object_to_fd(struct odb_helper *o,
+ const unsigned char *sha1,
+ int fd_in, int fd_out)
+{
+   ssize_t total_read = 0;
+   unsigned long total_got = 0;
+   int packet_len;
+
+   char hdr[32];
+   int hdrlen;
+
+   int ret = Z_STREAM_END;
+   unsigned char compressed[4096];
+   git_zstream stream;
+   git_SHA_CTX hash;
+   unsigned char real_sha1[20];
+
+   off_t size;
+   enum object_type type;
+   const char *s;
+   int pkt_size;
+   char *size_buf;
+
+   size_buf = packet_read_line(fd_in, &pkt_size);
+   if (!skip_prefix(size_buf, "size=", &s))
+   return error("odb helper '%s' did not send size of plain object", o->name);
+   size = strtoumax(s, NULL, 10);
+   if (!skip_prefix(packet_read_line(fd_in, NULL), "kind=", &s))
+   return error("odb helper '%s' did not send kind of plain object", o->name);
+   /* Check if the object is not available */
+   if (!strcmp(s, "none"))
+   return -1;
+   type = type_from_string_gently(s, strlen(s), 1);
+   if (type < 0)
+   return error("odb helper '%s' sent bad type '%s'", o->name, s);
+
+   /* Set it up */
+   git_deflate_init(&stream, zlib_compression_level);
+   stream.next_out = compressed;
+   stream.avail_out = sizeof(compressed);
+   git_SHA1_Init(&hash);
+
+   /* First header.. */
+   hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), size) + 1;
+   stream.next_in = (unsigned char *)hdr;
+   stream.avail_in = hdrlen;
+   while (git_deflate(&stream, 0) == Z_OK)
+   ; /* nothing */
+   git_SHA1_Update(&hash, hdr, hdrlen);
+
+   for (;;) {
+   /* packet_read() writes a '\0' extra byte at the end */
+   char buf[LARGE_PACKET_DATA_MAX + 1];
+
+   packet_len = packet_read(fd_in, NULL, NULL,
+   buf, LARGE_PACKET_DATA_MAX + 1,
+   PACKET_READ_GENTLE_ON_EOF);
+
+   if (packet_len <= 0)
+   break;
+
+   total_got += packet_len;
+
+   /* Then the data itself.. */
+   stream.next_in = (void *)buf;
+   stream.avail_in = packet_len;
+   do {
+   unsigned char *in0 = stream.next_in;
+   ret = git_deflate(&stream, Z_FINISH);
+   git_SHA1_Update(&hash, in0, stream.next_in - in0);
+   write_or_die(fd_out, compressed, stream.next_out - compressed);
+   stream.next_out = compressed;
+   stream.avail_out = sizeof(compressed);
+   } while (ret == Z_OK);
+
+   total_read += packet_len;
+   }
+
+   if (packet_len < 0) {
+   error("unable to read from odb helper '%s': %s",
+ o->name, strerror(errno));
+   git_deflate_end(&stream);
+   return packet_len;
+   }
+
+   if (ret != Z_STREAM_END) {
+   warning("bad zlib data from odb helper '%s' for %s",
+   o->name, sha1_to_hex(sha1));
+   return -1;
+   }
+
+   ret = git_deflate_end_gently(&stream);
+   if (ret != Z_OK) {
+   warning("deflateEnd on object %s from odb helper '%s' failed (%d)",
+   sha1_to_hex(sha1), o->name, ret);
+   return -1;
+   }
+   git_SHA1_Final(real_sha1, &hash);
+   if (hashcmp(sha1, real_sha1)) {
+   warning("sha1 mismatch from odb helper '%s' for %s (got %s)",
+   o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1));
+   return -1;
+   }
+   if (total_got != size) {
+   warning("size mismatch from odb helper '%s' for %s (%lu != %lu)",
+   o->name, sha1_to_hex(sha1), total_got, size);
+   return -1;
+   }
+
+   return total_read;
+}
+
 static ssize_t read_packetized_git_object_to_fd(struct odb_helper *o,
const unsigned char *sha1,
int fd_

[RFC/PATCH v4 29/49] Add t0410 to test read object mechanism

2017-06-20 Thread Christian Couder
From: Ben Peart <benpe...@microsoft.com>

Signed-off-by: Ben Peart <benpe...@microsoft.com>
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0450-read-object.sh | 30 +++
 t/t0450/read-object| 56 ++
 2 files changed, 86 insertions(+)
 create mode 100755 t/t0450-read-object.sh
 create mode 100755 t/t0450/read-object

diff --git a/t/t0450-read-object.sh b/t/t0450-read-object.sh
new file mode 100755
index 00..18d726fe28
--- /dev/null
+++ b/t/t0450-read-object.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+test_description='tests for long running read-object process'
+
+. ./test-lib.sh
+
+PATH="$PATH:$TEST_DIRECTORY/t0450"
+
+test_expect_success 'setup host repo with a root commit' '
+   test_commit zero &&
+   hash1=$(git ls-tree HEAD | grep zero.t | cut -f1 | cut -d\  -f3)
+'
+
+HELPER="read-object"
+
+test_expect_success 'blobs can be retrieved from the host repo' '
+   git init guest-repo &&
+   (cd guest-repo &&
+git config odb.magic.command "$HELPER" &&
+git config odb.magic.fetchKind "faultin" &&
+git cat-file blob "$hash1")
+'
+
+test_expect_success 'invalid blobs generate errors' '
+   cd guest-repo &&
+   test_must_fail git cat-file blob "invalid"
+'
+
+
+test_done
diff --git a/t/t0450/read-object b/t/t0450/read-object
new file mode 100755
index 00..bf5fa2652b
--- /dev/null
+++ b/t/t0450/read-object
@@ -0,0 +1,56 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git read-object protocol version 1
+# See Documentation/technical/read-object-protocol.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand."  Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# A lazy clone can be simulated via the following commands from the host repo
+# you wish to create a lazy clone of:
+#
+# cd /host_repo
+# git rev-parse HEAD
+# git init /guest_repo
+# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
+#  cut -d' ' -f1 | git pack-objects /e/guest_repo/.git/objects/pack/noblobs
+# cd /guest_repo
+# git config core.virtualizeobjects true
+# git reset --hard 
+#
+# Please note, this sample is a minimal skeleton. No proper error handling 
+# was implemented.
+#
+
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
+use strict;
+use warnings;
+use Git::Packet;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = "../.git/";
+
+packet_initialize("git-read-object", 1);
+
+packet_read_and_check_capabilities("get");
+packet_write_capabilities("get");
+
+while (1) {
+   my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+   if ( $command eq "get" ) {
+   my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+   packet_bin_read();
+
+   system ('git --git-dir="' . $DIR . '" cat-file blob ' . $sha1 . ' | GIT_NO_EXTERNAL_ODB=1 git hash-object -w --stdin >/dev/null 2>&1');
+   packet_txt_write(($?) ? "status=error" : "status=success");
+   packet_flush();
+   } else {
+   die "bad command '$command'";
+   }
+}
-- 
2.13.1.565.gbfcd7a9048
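
Note for readers of this patch: the Perl helper above answers every
"get" with a "status=..." line followed by a flush packet. On the
receiving side that line has to be mapped to an outcome. The sketch
below covers the three statuses named in read-object-protocol.txt
(success, error, abort). The enum and the toy_parse_status() name are
made up for this illustration and are not part of the series.

```c
#include <string.h>

enum toy_status {
	TOY_STATUS_SUCCESS,
	TOY_STATUS_ERROR,
	TOY_STATUS_ABORT,
	TOY_STATUS_UNKNOWN
};

/* Map a "status=<value>" response line to an enum value. */
static enum toy_status toy_parse_status(const char *line)
{
	static const char prefix[] = "status=";
	const char *value;

	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return TOY_STATUS_UNKNOWN;	/* not a status line */
	value = line + sizeof(prefix) - 1;

	if (!strcmp(value, "success"))
		return TOY_STATUS_SUCCESS;
	if (!strcmp(value, "error"))
		return TOY_STATUS_ERROR;
	if (!strcmp(value, "abort"))
		return TOY_STATUS_ABORT;
	return TOY_STATUS_UNKNOWN;
}
```

A caller would typically retry later objects after TOY_STATUS_ERROR
and stop using the helper after TOY_STATUS_ABORT, matching the
distinction the protocol document draws between the two.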



[RFC/PATCH v4 38/49] Add t0470 to test passing plain objects

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0470-read-object-http-e-odb.sh | 123 ++
 t/t0470/read-object-plain |  93 
 2 files changed, 216 insertions(+)
 create mode 100755 t/t0470-read-object-http-e-odb.sh
 create mode 100755 t/t0470/read-object-plain

diff --git a/t/t0470-read-object-http-e-odb.sh b/t/t0470-read-object-http-e-odb.sh
new file mode 100755
index 00..3360a98ec3
--- /dev/null
+++ b/t/t0470-read-object-http-e-odb.sh
@@ -0,0 +1,123 @@
+#!/bin/sh
+
+test_description='tests for read-object process passing plain objects to an HTTPD server'
+
+. ./test-lib.sh
+
+# If we don't specify a port, the current test number will be used
+# which will not work as it is less than 1024, so it can only be used by root.
+LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000)
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+
+start_httpd apache-e-odb.conf
+
+PATH="$PATH:$TEST_DIRECTORY/t0470"
+
+# odb helper script must see this
+export HTTPD_URL
+
+HELPER="read-object-plain"
+
+test_expect_success 'setup repo with a root commit' '
+   test_commit zero
+'
+
+test_expect_success 'setup another repo from the first one' '
+   git init other-repo &&
+   (cd other-repo &&
+git remote add origin .. &&
+git pull origin master &&
+git checkout master &&
+git log)
+'
+
+test_expect_success 'setup the helper in the root repo' '
+   git config odb.magic.command "$HELPER" &&
+   git config odb.magic.fetchKind "plainObject"
+'
+
+UPLOADFILENAME="hello_apache_upload.txt"
+
+UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME&size=123&type=blob"
+
+test_expect_success 'can upload a file' '
+   echo "Hello Apache World!" >hello_to_send.txt &&
+   echo "How are you?" >>hello_to_send.txt &&
+   curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" >out_upload
+'
+
+LIST_URL="$HTTPD_URL/list/"
+
+test_expect_success 'can list uploaded files' '
+   curl --include "$LIST_URL" >out_list &&
+   grep "$UPLOADFILENAME" out_list
+'
+
+test_expect_success 'can delete uploaded files' '
+   curl --data "delete" --include "$UPLOAD_URL&delete=1" >out_delete &&
+   curl --include "$LIST_URL" >out_list2 &&
+   ! grep "$UPLOADFILENAME" out_list2
+'
+
+FILES_DIR="httpd/www/files"
+
+test_expect_success 'new blobs are transferred to the http server' '
+   test_commit one &&
+   hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+   echo "$hash1-4-blob" >expected &&
+   ls "$FILES_DIR" >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'blobs can be retrieved from the http server' '
+   git cat-file blob "$hash1" &&
+   git log -p >expected
+'
+
+test_expect_success 'update other repo from the first one' '
+   (cd other-repo &&
+git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+test_must_fail git cat-file blob "$hash1" &&
+git config odb.magic.command "$HELPER" &&
+git config odb.magic.fetchKind "plainObject" &&
+git cat-file blob "$hash1" &&
+git pull origin master)
+'
+
+test_expect_success 'local clone from the first repo' '
+   mkdir my-clone &&
+   (cd my-clone &&
+git clone .. . &&
+git cat-file blob "$hash1")
+'
+
+test_expect_success 'no-local clone from the first repo fails' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+test_must_fail git clone --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+test_expect_success 'no-local clone from the first repo with helper succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git clone -c odb.magic.command="$HELPER" \
+   -c odb.magic.plainObjects="true" \
+   --no-local .. .) &&
+   rm -rf my-other-clone
+'
+
+test_expect_success 'no-local initial-refspec clone succeeds' '
+   mkdir my-other-clone &&
+   (cd my-other-clone &&
+git config odb.magic.command "$HELPER" &&
+git config odb.magic.fetchKind "plainObject" &&
+git -c odb.magic.command="$HELPER" \
+   -c odb.magic.plainObjects="true" \
+   clone --no-local --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" .. .)
+'
+
+stop_httpd
+
+test_done
diff --git 

[RFC/PATCH v4 27/49] Documentation: add read-object-protocol.txt

2017-06-20 Thread Christian Couder
From: Ben Peart <benpe...@microsoft.com>

Signed-off-by: Ben Peart <benpe...@microsoft.com>
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 Documentation/technical/read-object-protocol.txt | 102 +++
 1 file changed, 102 insertions(+)
 create mode 100644 Documentation/technical/read-object-protocol.txt

diff --git a/Documentation/technical/read-object-protocol.txt b/Documentation/technical/read-object-protocol.txt
new file mode 100644
index 00..a893b46e7c
--- /dev/null
+++ b/Documentation/technical/read-object-protocol.txt
@@ -0,0 +1,102 @@
+Read Object Process
+^^^^^^^^^^^^^^^^^^^
+
+The read-object process enables Git to read all missing blobs with a
+single process invocation for the entire life of a single Git command.
+This is achieved by using a packet format (pkt-line, see technical/
+protocol-common.txt) based protocol over standard input and standard
+output as follows. All packets, except for the "*CONTENT" packets and
+the "0000" flush packet, are considered text and therefore are
+terminated by a LF.
+
+Git starts the process when it encounters the first missing object that
+needs to be retrieved. After the process is started, Git sends a welcome
+message ("git-read-object-client"), a list of supported protocol version
+numbers, and a flush packet. Git expects to read a welcome response
+message ("git-read-object-server"), exactly one protocol version number
+from the previously sent list, and a flush packet. All further
+communication will be based on the selected version.
+
+The remaining protocol description below documents "version=1". Please
+note that "version=42" in the example below does not exist and is only
+there to illustrate how the protocol would look with more than one
+version.
+
+After the version negotiation Git sends a list of all capabilities that
+it supports and a flush packet. Git expects to read a list of desired
+capabilities, which must be a subset of the supported capabilities list,
+and a flush packet as response:
+
+packet: git> git-read-object-client
+packet: git> version=1
+packet: git> version=42
+packet: git> 0000
+packet: git< git-read-object-server
+packet: git< version=1
+packet: git< 0000
+packet: git> capability=get
+packet: git> capability=have
+packet: git> capability=put
+packet: git> capability=not-yet-invented
+packet: git> 0000
+packet: git< capability=get
+packet: git< 0000
+
+The only supported capability in version 1 is "get".
+
+Afterwards Git sends a list of "key=value" pairs terminated with a flush
+packet. The list will contain at least the command (based on the
+supported capabilities) and the sha1 of the object to retrieve. Please
+note that the process must not send any response before it has received
+the final flush packet.
+
+When the process receives the "get" command, it should make the requested
+object available in the git object store and then return success. Git will
+then check the object store again and this time find it and proceed.
+
+packet: git> command=get
+packet: git> sha1=0a214a649e1b3d5011e14a3dc227753f2bd2be05
+packet: git> 0000
+
+
+The process is expected to respond with a list of "key=value" pairs
+terminated with a flush packet. If the process does not experience
+problems then the list must contain a "success" status.
+
+packet: git< status=success
+packet: git< 0000
+
+
+In case the process cannot or does not want to process the content, it
+is expected to respond with an "error" status.
+
+packet: git< status=error
+packet: git< 0000
+
+
+In case the process cannot or does not want to process the content as
+well as any future content for the lifetime of the Git process, then it
+is expected to respond with an "abort" status at any point in the
+protocol.
+
+packet: git< status=abort
+packet: git< 0000
+
+
+Git neither stops nor restarts the process in case the "error"/"abort"
+status is set.
+
+If the process dies during the communication or does not adhere to the
+protocol then Git will stop the process and restart it with the next
+object that needs to be processed.
+
+After the read-object process has processed an object it is expected to
+wait for the next "key=value" list containing a command. Git will close
+the command pipe on exit. The process is expected to detect EOF and exit
+gracefully on its own. Git will wait until the process has stopped.
+
+A long running read-object process demo implementation can be found in
+`contrib/long-running-read-object/example.pl` located in the Git core
+repository. If you develop your own long running process then the
+`GIT_TRACE_PACKET` environment variable can be very helpful for
+debugging (see linkgit:git[1]).
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 33/49] odb-helper: call odb_helper_lookup() with 'have' capability

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 odb-helper.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/odb-helper.c b/odb-helper.c
index a6bf81af8d..910c87a482 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -142,19 +142,20 @@ static int check_object_process_error(int err,
return err;
 }
 
-static int read_object_process(const unsigned char *sha1)
+static int read_object_process(struct odb_helper *o, const unsigned char 
*sha1, int fd)
 {
int err;
struct read_object_process *entry;
struct child_process *process;
struct strbuf status = STRBUF_INIT;
-   const char *cmd = "read-object";
+   const char *cmd = o->cmd;
uint64_t start;
 
start = getnanotime();
 
entry = launch_read_object_process(cmd);
process = &entry->subprocess.process;
+   o->supported_capabilities = entry->supported_capabilities;
 
if (!(ODB_HELPER_CAP_GET & entry->supported_capabilities))
return -1;
@@ -173,6 +174,13 @@ static int read_object_process(const unsigned char *sha1)
if (err)
goto done;
 
+   if (o->fetch_kind != ODB_FETCH_KIND_FAULT_IN) {
+   struct strbuf buf;
+   read_packetized_to_strbuf(process->out, &buf);
+   if (err)
+   goto done;
+   }
+
subprocess_read_status(process->out, &status);
err = strcmp(status.buf, "success");
 
@@ -554,10 +562,11 @@ static int odb_helper_fetch_git_object(struct odb_helper 
*o,
 int odb_helper_fault_in_object(struct odb_helper *o,
   const unsigned char *sha1)
 {
-   struct odb_helper_object *obj = odb_helper_lookup(o, sha1);
-
-   if (!obj)
-   return -1;
+   if (o->supported_capabilities & ODB_HELPER_CAP_HAVE) {
+   struct odb_helper_object *obj = odb_helper_lookup(o, sha1);
+   if (!obj)
+   return -1;
+   }
 
if (o->script_mode) {
struct odb_helper_cmd cmd;
@@ -567,7 +576,7 @@ int odb_helper_fault_in_object(struct odb_helper *o,
return -1;
return 0;
} else {
-   return read_object_process(sha1);
+   return read_object_process(o, sha1, -1);
}
 }
 
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 26/49] odb-helper: add script_mode

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 external-odb.c | 4 
 odb-helper.h   | 1 +
 t/t0400-external-odb.sh| 2 ++
 t/t0410-transfer-e-odb.sh  | 2 ++
 t/t0420-transfer-http-e-odb.sh | 7 ++-
 5 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/external-odb.c b/external-odb.c
index 502380cac2..2efa805d12 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -49,6 +49,10 @@ static int external_odb_config(const char *var, const char 
*value, void *data)
 
if (!strcmp(key, "command"))
return git_config_string(&o->cmd, var, value);
+   if (!strcmp(key, "scriptmode")) {
+   o->script_mode = git_config_bool(var, value);
+   return 0;
+   }
if (!strcmp(key, "fetchkind")) {
const char *fetch_kind;
int ret = git_config_string(&fetch_kind, var, value);
diff --git a/odb-helper.h b/odb-helper.h
index 2dc6d96c40..44c98bbf56 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -13,6 +13,7 @@ struct odb_helper {
const char *name;
const char *cmd;
enum odb_helper_fetch_kind fetch_kind;
+   int script_mode;
 
struct odb_helper_object {
unsigned char sha1[20];
diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index c3cb0fdc84..18d8c38862 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -49,6 +49,7 @@ test_expect_success 'alt objects are missing' '
 
 test_expect_success 'helper can retrieve alt objects' '
test_config odb.magic.command "$HELPER" &&
+   test_config odb.magic.scriptMode true &&
test_config odb.magic.fetchKind "gitObject" &&
cat >expect <<-\EOF &&
two
@@ -69,6 +70,7 @@ test_expect_success 'helper can add objects to alt repo' '
 
 test_expect_success 'commit adds objects to alt repo' '
test_config odb.magic.command "$HELPER" &&
+   test_config odb.magic.scriptMode true &&
test_config odb.magic.fetchKind "gitObject" &&
test_commit three &&
hash3=$(git ls-tree HEAD | grep three.t | cut -f1 | cut -d\  -f3) &&
diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh
index cba89866e2..8de9a08d7c 100755
--- a/t/t0410-transfer-e-odb.sh
+++ b/t/t0410-transfer-e-odb.sh
@@ -90,6 +90,7 @@ test_expect_success 'setup first alternate repo' '
git init alt-repo1 &&
test_commit zero &&
git config odb.magic.command "$HELPER1" &&
+   git config odb.magic.scriptMode true &&
git config odb.magic.fetchKind "gitObject"
 '
 
@@ -120,6 +121,7 @@ test_expect_success 'other repo gets the blobs from object 
store' '
 test_must_fail git cat-file blob "$hash1" &&
 test_must_fail git cat-file blob "$hash2" &&
 git config odb.magic.command "$HELPER2" &&
+git config odb.magic.scriptMode true &&
 git config odb.magic.fetchKind "gitObject"
 git cat-file blob "$hash1" &&
 git cat-file blob "$hash2"
diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
index 8a5f3adaa7..b8062d14c0 100755
--- a/t/t0420-transfer-http-e-odb.sh
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -53,6 +53,7 @@ HELPER="\"$PWD\"/odb-http-helper"
 test_expect_success 'setup repo with a root commit and the helper' '
test_commit zero &&
git config odb.magic.command "$HELPER" &&
+   git config odb.magic.scriptMode true &&
git config odb.magic.fetchKind "plainObject"
 '
 
@@ -108,6 +109,7 @@ test_expect_success 'update other repo from the first one' '
 git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
 test_must_fail git cat-file blob "$hash1" &&
 git config odb.magic.command "$HELPER" &&
+git config odb.magic.scriptMode true &&
 git config odb.magic.fetchKind "plainObject" &&
 git cat-file blob "$hash1" &&
 git pull origin master)
@@ -131,6 +133,7 @@ test_expect_success 'no-local clone from the first repo 
with helper succeeds' '
mkdir my-other-clone &&
(cd my-other-clone &&
 git clone -c odb.magic.command="$HELPER" \
+   -c odb.magic.scriptMode="true" \
-c odb.magic.plainObjects="true" \
--no-local .. .) &&
rm -rf my-other-clone
@@ -141,7 +144,9 @@ test_expect_success 'no-local initial-refspec clone 
succeeds' '
(cd my-other-clone &&
 git config odb.magic.command "$HELPER" &&
 git config odb.magic.fetchKind "plainObject" &&
-git -c odb.magic.command="$HELPER" -c odb.magic.plainObjects="true" \
+git -c odb.magic.command="$HELPER" \
+   -c odb.magic.plainObjects="true" \
+   -c odb.magic.scriptMode="true" \
clone --no-local --initial-refspec 
"refs/odbs/magic/*:refs/odbs/magic/*" .. .)
 '
 
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 32/49] t04*: add 'get_cap' support to helpers

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0400-external-odb.sh| 4 
 t/t0410-transfer-e-odb.sh  | 8 
 t/t0420-transfer-http-e-odb.sh | 4 
 3 files changed, 16 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index 18d8c38862..efabf90a8b 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -13,6 +13,10 @@ die() {
 }
 GIT_DIR=$ALT_SOURCE; export GIT_DIR
 case "$1" in
+get_cap)
+   echo "capability=get"
+   echo "capability=have"
+   ;;
 have)
git cat-file --batch-check --batch-all-objects |
awk '{print $1 " " $3 " " $2}'
diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh
index 8de9a08d7c..0c9cc3af7d 100755
--- a/t/t0410-transfer-e-odb.sh
+++ b/t/t0410-transfer-e-odb.sh
@@ -16,6 +16,10 @@ die() {
 }
 GIT_DIR=$ALT_SOURCE1; export GIT_DIR
 case "$1" in
+get_cap)
+   echo "capability=get"
+   echo "capability=have"
+   ;;
 have)
git cat-file --batch-check --batch-all-objects |
awk '{print $1 " " $3 " " $2}'
@@ -51,6 +55,10 @@ die() {
 }
 GIT_DIR=$ALT_SOURCE2; export GIT_DIR
 case "$1" in
+get_cap)
+   echo "capability=get"
+   echo "capability=have"
+   ;;
 have)
GIT_DIR=$OTHER_SOURCE git for-each-ref --format='%(objectname)' 
refs/odbs/magic/ | GIT_DIR=$OTHER_SOURCE xargs git show
;;
diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh
index b8062d14c0..45e66e355c 100755
--- a/t/t0420-transfer-http-e-odb.sh
+++ b/t/t0420-transfer-http-e-odb.sh
@@ -22,6 +22,10 @@ die() {
 }
 echo >&2 "odb-http-helper args:" "$@"
 case "$1" in
+get_cap)
+   echo "capability=get"
+   echo "capability=have"
+   ;;
 have)
list_url="$HTTPD_URL/list/"
curl "$list_url" ||
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 12/49] external odb: add write support

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 external-odb.c | 15 +++
 external-odb.h |  2 ++
 odb-helper.c   | 41 +
 odb-helper.h   |  3 +++
 sha1_file.c|  2 ++
 5 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/external-odb.c b/external-odb.c
index 42978a3298..893937a7d4 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -127,3 +127,18 @@ int external_odb_for_each_object(each_external_object_fn 
fn, void *data)
}
return 0;
 }
+
+int external_odb_write_object(const void *buf, size_t len,
+ const char *type, unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   external_odb_init();
+
+   for (o = helpers; o; o = o->next) {
+   int r = odb_helper_write_object(o, buf, len, type, sha1);
+   if (r <= 0)
+   return r;
+   }
+   return 1;
+}
diff --git a/external-odb.h b/external-odb.h
index cea8570a49..53879e900d 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -10,5 +10,7 @@ typedef int (*each_external_object_fn)(const unsigned char 
*sha1,
   unsigned long size,
   void *data);
 int external_odb_for_each_object(each_external_object_fn, void *);
+int external_odb_write_object(const void *buf, size_t len,
+ const char *type, unsigned char *sha1);
 
 #endif /* EXTERNAL_ODB_H */
diff --git a/odb-helper.c b/odb-helper.c
index d8ef5cbf4b..af7cc55ca2 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -33,9 +33,10 @@ static void prepare_helper_command(struct argv_array *argv, 
const char *cmd,
strbuf_release();
 }
 
-__attribute__((format (printf,3,4)))
+__attribute__((format (printf,4,5)))
 static int odb_helper_start(struct odb_helper *o,
struct odb_helper_cmd *cmd,
+   int use_stdin,
const char *fmt, ...)
 {
va_list ap;
@@ -52,7 +53,10 @@ static int odb_helper_start(struct odb_helper *o,
 
cmd->child.argv = cmd->argv.argv;
cmd->child.use_shell = 1;
-   cmd->child.no_stdin = 1;
+   if (use_stdin)
+   cmd->child.in = -1;
+   else
+   cmd->child.no_stdin = 1;
cmd->child.out = -1;
 
if (start_command(&cmd->child) < 0) {
@@ -121,7 +125,7 @@ static void odb_helper_load_have(struct odb_helper *o)
return;
o->have_valid = 1;
 
-   if (odb_helper_start(o, &cmd, "have") < 0)
+   if (odb_helper_start(o, &cmd, 0, "have") < 0)
return;
 
fh = xfdopen(cmd.child.out, "r");
@@ -170,7 +174,7 @@ int odb_helper_fetch_object(struct odb_helper *o, const 
unsigned char *sha1,
if (!obj)
return -1;
 
-   if (odb_helper_start(o, &cmd, "get %s", sha1_to_hex(sha1)) < 0)
+   if (odb_helper_start(o, &cmd, 0, "get %s", sha1_to_hex(sha1)) < 0)
return -1;
 
memset(&stream, 0, sizeof(stream));
@@ -258,3 +262,32 @@ int odb_helper_for_each_object(struct odb_helper *o,
 
return 0;
 }
+
+int odb_helper_write_object(struct odb_helper *o,
+   const void *buf, size_t len,
+   const char *type, unsigned char *sha1)
+{
+   struct odb_helper_cmd cmd;
+
+   if (odb_helper_start(o, &cmd, 1, "put %s %"PRIuMAX" %s",
+sha1_to_hex(sha1), (uintmax_t)len, type) < 0)
+   return -1;
+
+   do {
+   int w = xwrite(cmd.child.in, buf, len);
+   if (w < 0) {
+   error("unable to write to odb helper '%s': %s",
+ o->name, strerror(errno));
+   close(cmd.child.in);
+   close(cmd.child.out);
+   odb_helper_finish(o, &cmd);
+   return -1;
+   }
+   len -= w;
+   } while (len > 0);
+
+   close(cmd.child.in);
+   close(cmd.child.out);
+   odb_helper_finish(o, &cmd);
+   return 0;
+}
diff --git a/odb-helper.h b/odb-helper.h
index 8c3916d215..4e321195e8 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -25,5 +25,8 @@ int odb_helper_fetch_object(struct odb_helper *o, const 
unsigned char *sha1,
int fd);
 int odb_helper_for_each_object(struct odb_helper *o,
   each_external_object_fn, void *);
+int odb_helper_write_object(struct odb_helper *o,
+   const void *buf, size_t len,
+   const char *type, unsigned char *sha1);
 
 #endif /* ODB_HELPER_H */
diff --git a/sha1_file.c b/sha1_file.c
index f87c59d711..8dd09334cf 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3450,6 +3450,8 @@ int write_sha1_file(const void *buf, un

[RFC/PATCH v4 18/49] lib-httpd: add upload.sh

2017-06-20 Thread Christian Couder
This CGI script will be used to upload objects to, or to delete
objects from, an Apache web server.

This way the Apache server can work as an external object
database.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/lib-httpd.sh|  1 +
 t/lib-httpd/upload.sh | 45 +
 2 files changed, 46 insertions(+)
 create mode 100644 t/lib-httpd/upload.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 2e659a8ee2..d80b004549 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -132,6 +132,7 @@ prepare_httpd() {
cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
install_script broken-smart-http.sh
install_script error.sh
+   install_script upload.sh
 
ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/upload.sh b/t/lib-httpd/upload.sh
new file mode 100644
index 0000000000..172be0f73f
--- /dev/null
+++ b/t/lib-httpd/upload.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+# In part from 
http://codereview.stackexchange.com/questions/79549/bash-cgi-upload-file
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+key=${1%=*}
+val=${1#*=}
+
+case "$key" in
+   "sha1") sha1="$val" ;;
+   "type") type="$val" ;;
+   "size") size="$val" ;;
+   "delete") delete=1 ;;
+   *) echo >&2 "unknown key '$key'" ;;
+esac
+
+shift
+done
+
+case "$REQUEST_METHOD" in
+  POST)
+if test "$delete" = "1"
+then
+   rm -f "$FILES_DIR/$sha1-$size-$type"
+else
+   mkdir -p "$FILES_DIR"
+   cat >"$FILES_DIR/$sha1-$size-$type"
+fi
+
+echo 'Status: 204 No Content'
+echo
+;;
+
+  *)
+echo 'Status: 405 Method Not Allowed'
+echo
+esac
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 14/49] t0400: add test for external odb write support

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0400-external-odb.sh | 8 
 1 file changed, 8 insertions(+)

diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh
index 6c6da5cf4f..3c868cad4c 100755
--- a/t/t0400-external-odb.sh
+++ b/t/t0400-external-odb.sh
@@ -66,4 +66,12 @@ test_expect_success 'helper can add objects to alt repo' '
test "$size" -eq "$alt_size"
 '
 
+test_expect_success 'commit adds objects to alt repo' '
+   test_config odb.magic.command "$HELPER" &&
+   test_commit three &&
+   hash3=$(git ls-tree HEAD | grep three.t | cut -f1 | cut -d\  -f3) &&
+   content=$(cd alt-repo && git show "$hash3") &&
+   test "$content" = "three"
+'
+
 test_done
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 16/49] Add t0410 to test external ODB transfer

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0410-transfer-e-odb.sh | 136 ++
 1 file changed, 136 insertions(+)
 create mode 100755 t/t0410-transfer-e-odb.sh

diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh
new file mode 100755
index 0000000000..868b55db94
--- /dev/null
+++ b/t/t0410-transfer-e-odb.sh
@@ -0,0 +1,136 @@
+#!/bin/sh
+
+test_description='basic tests for transfering external ODBs'
+
+. ./test-lib.sh
+
+ORIG_SOURCE="$PWD/.git"
+export ORIG_SOURCE
+
+ALT_SOURCE1="$PWD/alt-repo1/.git"
+export ALT_SOURCE1
+write_script odb-helper1 <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
+GIT_DIR=$ALT_SOURCE1; export GIT_DIR
+case "$1" in
+have)
+   git cat-file --batch-check --batch-all-objects |
+   awk '{print $1 " " $3 " " $2}'
+   ;;
+get)
+   cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+   ;;
+put)
+   sha1="$2"
+   size="$3"
+   kind="$4"
+   writen=$(git hash-object -w -t "$kind" --stdin)
+   test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen 
'$writen'"
+   ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$ORIG_SOURCE 
GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+   GIT_DIR=$ORIG_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
+esac
+EOF
+HELPER1="\"$PWD\"/odb-helper1"
+
+OTHER_SOURCE="$PWD/.git"
+export OTHER_SOURCE
+
+ALT_SOURCE2="$PWD/alt-repo2/.git"
+export ALT_SOURCE2
+write_script odb-helper2 <<\EOF
+die() {
+   printf >&2 "%s\n" "$@"
+   exit 1
+}
+GIT_DIR=$ALT_SOURCE2; export GIT_DIR
+case "$1" in
+have)
+   GIT_DIR=$OTHER_SOURCE git for-each-ref --format='%(objectname)' 
refs/odbs/magic/ | GIT_DIR=$OTHER_SOURCE xargs git show
+   ;;
+get)
+   OBJ_FILE="$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#')
+   if ! test -f "$OBJ_FILE"
+   then
+   # "Download" the missing object by copying it from alt-repo1
+   OBJ_DIR=$(echo $2 | sed 's/\(..\).*/\1/')
+   OBJ_BASE=$(basename "$OBJ_FILE")
+   ALT_OBJ_DIR1="$ALT_SOURCE1/objects/$OBJ_DIR"
+   ALT_OBJ_DIR2="$ALT_SOURCE2/objects/$OBJ_DIR"
+   mkdir -p "$ALT_OBJ_DIR2" || die "Could not mkdir 
'$ALT_OBJ_DIR2'"
+   OBJ_SRC="$ALT_OBJ_DIR1/$OBJ_BASE"
+   cp "$OBJ_SRC" "$ALT_OBJ_DIR2" ||
+   die "Could not cp '$OBJ_SRC' into '$ALT_OBJ_DIR2'"
+   fi
+   cat "$OBJ_FILE" || die "Could not cat '$OBJ_FILE'"
+   ;;
+put)
+   sha1="$2"
+   size="$3"
+   kind="$4"
+   writen=$(git hash-object -w -t "$kind" --stdin)
+   test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen 
'$writen'"
+   ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$OTHER_SOURCE 
GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit
+   GIT_DIR=$OTHER_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash"
+   ;;
+*)
+   die "unknown command '$1'"
+   ;;
+esac
+EOF
+HELPER2="\"$PWD\"/odb-helper2"
+
+test_expect_success 'setup first alternate repo' '
+   git init alt-repo1 &&
+   test_commit zero &&
+   git config odb.magic.command "$HELPER1"
+'
+
+test_expect_success 'setup other repo and its alternate repo' '
+   git init other-repo &&
+   git init alt-repo2 &&
+   (cd other-repo &&
+git remote add origin .. &&
+git pull origin master &&
+git checkout master &&
+git log)
+'
+
+test_expect_success 'new blobs are put in first object store' '
+   test_commit one &&
+   hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\  -f3) &&
+   content=$(cd alt-repo1 && git show "$hash1") &&
+   test "$content" = "one" &&
+   test_commit two &&
+   hash2=$(git ls-tree HEAD | grep two.t | cut -f1 | cut -d\  -f3) &&
+   content=$(cd alt-repo1 && git show "$hash2") &&
+   test "$content" = "two"
+'
+
+test_expect_success 'other repo gets the blobs from object store' '
+   (cd other-repo &&
+git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" &&
+test_must_fail git cat-file blob "$hash1" &&
+test_must_fail git cat-file blob "$hash2" &&
+git config odb.magic.command "$HELPER2" &&
+git cat-file blob "$hash1" &&
+git cat-file blob "$hash2"
+   )
+'
+
+test_expect_success 'other repo gets everything else' '
+   (cd other-repo &&
+git fetch origin &&
+content=$(git show "$hash1") &&
+test "$content" = "one" &&
+content=$(git show "$hash2") &&
+test "$content" = "two")
+'
+
+test_done
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 01/49] builtin/clone: get rid of 'value' strbuf

2017-06-20 Thread Christian Couder
This makes the code simpler by removing a few lines, and getting
rid of one variable.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 builtin/clone.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index a2ea019c59..370a233d22 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -870,7 +870,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
const struct ref *our_head_points_at;
struct ref *mapped_refs;
const struct ref *ref;
-   struct strbuf key = STRBUF_INIT, value = STRBUF_INIT;
+   struct strbuf key = STRBUF_INIT;
struct strbuf branch_top = STRBUF_INIT, reflog_msg = STRBUF_INIT;
struct transport *transport = NULL;
const char *src_ref_prefix = "refs/heads/";
@@ -1035,7 +1035,6 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
strbuf_addf(&branch_top, "refs/remotes/%s/", option_origin);
}
 
-   strbuf_addf(&value, "+%s*:%s*", src_ref_prefix, branch_top.buf);
strbuf_addf(&key, "remote.%s.url", option_origin);
git_config_set(key.buf, repo);
strbuf_reset(&key);
@@ -1049,10 +1048,9 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (option_required_reference.nr || option_optional_reference.nr)
setup_reference();
 
-   fetch_pattern = value.buf;
+   fetch_pattern = xstrfmt("+%s*:%s*", src_ref_prefix, branch_top.buf);
refspec = parse_fetch_refspec(1, &fetch_pattern);
-
-   strbuf_reset(&value);
+   free((char *)fetch_pattern);
 
remote = remote_get(option_origin);
transport = transport_get(remote, remote->url[0]);
@@ -1191,7 +1189,6 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
strbuf_release(&reflog_msg);
strbuf_release(&branch_top);
strbuf_release(&key);
-   strbuf_release(&value);
junk_mode = JUNK_LEAVE_ALL;
 
free(refspec);
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 03/49] t0021/rot13-filter: improve 'if .. elsif .. else' style

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0021/rot13-filter.pl | 27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index d6411ca523..1fc581c814 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -40,23 +40,20 @@ sub packet_bin_read {
if ( $bytes_read == 0 ) {
# EOF - Git stopped talking to us!
return ( -1, "" );
-   }
-   elsif ( $bytes_read != 4 ) {
+   } elsif ( $bytes_read != 4 ) {
die "invalid packet: '$buffer'";
}
my $pkt_size = hex($buffer);
if ( $pkt_size == 0 ) {
return ( 1, "" );
-   }
-   elsif ( $pkt_size > 4 ) {
+   } elsif ( $pkt_size > 4 ) {
my $content_size = $pkt_size - 4;
$bytes_read = read STDIN, $buffer, $content_size;
if ( $bytes_read != $content_size ) {
die "invalid packet ($content_size bytes expected; 
$bytes_read bytes read)";
}
return ( 0, $buffer );
-   }
-   else {
+   } else {
die "invalid packet size: $pkt_size";
}
 }
@@ -144,14 +141,11 @@ while (1) {
my $output;
if ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
$output = "";
-   }
-   elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
+   } elsif ( $command eq "clean" and grep( /^clean$/, @capabilities ) ) {
$output = rot13($input);
-   }
-   elsif ( $command eq "smudge" and grep( /^smudge$/, @capabilities ) ) {
+   } elsif ( $command eq "smudge" and grep( /^smudge$/, @capabilities ) ) {
$output = rot13($input);
-   }
-   else {
+   } else {
die "bad command '$command'";
}
 
@@ -163,14 +157,12 @@ while (1) {
$debug->flush();
packet_txt_write("status=error");
packet_flush();
-   }
-   elsif ( $pathname eq "abort.r" ) {
+   } elsif ( $pathname eq "abort.r" ) {
print $debug "[ABORT]\n";
$debug->flush();
packet_txt_write("status=abort");
packet_flush();
-   }
-   else {
+   } else {
packet_txt_write("status=success");
packet_flush();
 
@@ -187,8 +179,7 @@ while (1) {
print $debug ".";
if ( length($output) > $MAX_PACKET_CONTENT_SIZE ) {
$output = substr( $output, 
$MAX_PACKET_CONTENT_SIZE );
-   }
-   else {
+   } else {
$output = "";
}
}
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 00/49] Add initial experimental external ODB support

2017-06-20 Thread Christian Couder
lled "read object process".

See:

http://public-inbox.org/git/20170113155253.1644-1-benpe...@microsoft.com/
http://public-inbox.org/git/20170322165220.5660-1-benpe...@microsoft.com/

Thanks to this, the external ODB mechanism should in the end perform
as well as the git-lfs mechanism when many objects should be
transferred.
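
As a rough illustration of what the read object process exchanges, the
opening handshake from read-object-protocol.txt can be framed with
pkt-lines as in the sketch below. The function names pkt_line,
flush_pkt and client_hello are made up for this example and are not
part of Git; only the packet contents come from the protocol document.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of Git.

# Emit one text pkt-line: 4 hex digits giving the total length
# (4-byte header + payload + trailing LF), then the payload and LF.
# Assumes an ASCII payload, so ${#1} equals the byte count.
pkt_line () {
	printf '%04x%s\n' $(( ${#1} + 5 )) "$1"
}

# Emit a flush packet, which has no payload and no LF.
flush_pkt () {
	printf '0000'
}

# What Git sends first: welcome message, supported version, flush.
client_hello () {
	pkt_line "git-read-object-client"
	pkt_line "version=1"
	flush_pkt
}
```

Piping client_hello through "od -c" shows the raw bytes, which can be
compared with what GIT_TRACE_PACKET reports for a real helper.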

Implementation
~~~~~~~~~~~~~~

* Mechanism to call the registered commands

This series adds a set of functions in external-odb.{c,h} that are
called by the rest of Git to manage all the external ODBs.

These functions use 'struct odb_helper' and its associated functions
defined in odb-helper.{c,h} to talk to the different external ODBs by
launching the configured "odb.<name>.command" commands and writing
to or reading from them.
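
To make the script mode more concrete, here is a minimal sketch of a
helper that answers the 'have', 'get' and 'put' commands. It is not one
of the helpers from the tests: it assumes objects are kept as plain
files named "<sha1>-<size>-<type>" (the naming used by upload.sh in
this series) in a directory given by a made-up ODB_STORE variable, and
it returns the raw content on 'get'.

```shell
#!/bin/sh
# Sketch only: ODB_STORE and odb_helper are hypothetical names.
# Objects live in $ODB_STORE as files named <sha1>-<size>-<type>.
odb_helper () {
	store=${ODB_STORE:-./odb-store}
	case "$1" in
	have)
		# One "<sha1> <size> <type>" line per stored object.
		for f in "$store"/*
		do
			test -f "$f" || continue
			basename "$f" | sed 's/-/ /g'
		done
		;;
	get)
		# $2 is the sha1 of the requested object.
		cat "$store"/"$2"-*
		;;
	put)
		# $2, $3 and $4 are sha1, size and type; content is on stdin.
		mkdir -p "$store" && cat >"$store/$2-$3-$4"
		;;
	*)
		echo >&2 "unknown command '$1'"
		return 1
		;;
	esac
}
```

For example, "printf hello | odb_helper put abc123 5 blob" stores one
object, after which "odb_helper have" lists it and "odb_helper get
abc123" prints its content.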

* ODB refs

For now odb ref management is only implemented in a helper in t0410.

When a new blob is added to an external odb, its sha1, size and type
are written in another new blob and the odb ref is created.

When the list of existing blobs is requested from the external odb,
the content of the blobs pointed to by the odb refs can also be used
by the odb to claim that it can get the objects.

When a blob is actually requested from the external odb, it can use
the content stored in the blobs pointed to by the odb refs to get the
actual blobs and then pass them.
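
The ref creation step can be sketched as follows, modelled on the 'put'
branch of the t0410 helper. The function names are invented for this
example; create_odb_ref has to be run inside a Git repository, and the
odb name ("magic" in the tests) is passed as the first argument.

```shell
#!/bin/sh
# Sketch only: these function names are hypothetical.

# The content of the blob that an odb ref points to.
odb_ref_record () {
	printf '%s %s %s\n' "$1" "$2" "$3"
}

# The name of the odb ref for object $2 in the odb named $1.
odb_ref_name () {
	printf 'refs/odbs/%s/%s\n' "$1" "$2"
}

# Store the record as a regular blob (bypassing external odbs) and
# point the odb ref at it. Arguments: odb name, sha1, size, type.
create_odb_ref () {
	ref_hash=$(odb_ref_record "$2" "$3" "$4" |
		GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) ||
	return
	git update-ref "$(odb_ref_name "$1" "$2")" "$ref_hash"
}
```

Another repository can then fetch "refs/odbs/magic/*" and read these
small blobs to learn which objects the external odb can provide.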

High-level view of the patches in the series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Patch 1/49 is a small code cleanup that I already sent to the
  mailing list but will probably be removed in the end due to
  ongoing work on "git clone"

- Patches 02/49 to 08/49 create a Git/Packet.pm module by
  refactoring "t0021/rot13-filter.pl". Functions from this new
  module will be used later in test scripts.

- Patches 09/49 to 16/49 create the external ODB infrastructure
  in external-odb.{c,h} and odb-helper.{c,h} for the script mode.

- Patches 17/49 to 23/49 improve lib-httpd to make it possible to
  use it as an external ODB to test storing blobs in an HTTP
  server.

- Patches 24/49 to 44/49 improve the external ODB infrastructure
  to support sub-processes and make everything work using them.

- Patches 45/49 to 49/49 add the --initial-refspec option to git
  clone along with tests.

Future work
~~~~~~~~~~~

First, sorry about the state of this patch series. It is not as clean
as I would have liked, but I think it is worth getting feedback from
the mailing list at this point, because the previous RFC was sent a
long time ago and a lot of things have changed.

So a big part of the future work will be about cleaning this patch series.

Other things I think I am going to do:

  -   

Previous work and discussions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(Sorry for the old Gmane links, I will try to replace them with
public-inbox.org at one point.)

Peff started to work on this and discuss this some years ago:

http://thread.gmane.org/gmane.comp.version-control.git/206886/focus=207040
http://thread.gmane.org/gmane.comp.version-control.git/247171
http://thread.gmane.org/gmane.comp.version-control.git/202902/focus=203020

His work, which is not compile-tested any more, is still there:

https://github.com/peff/git/commits/jk/external-odb-wip

Initial discussions about this new series are there:

http://thread.gmane.org/gmane.comp.version-control.git/288151/focus=295160

Versions 1, 2 and 3 of this RFC/PATCH series are here:

https://public-inbox.org/git/20160613085546.11784-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20160628181933.24620-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20161130210420.15982-1-chrisc...@tuxfamily.org/

Some of the discussions related to Ben Peart's work that is used by
this series are here:

http://public-inbox.org/git/20170113155253.1644-1-benpe...@microsoft.com/
http://public-inbox.org/git/20170322165220.5660-1-benpe...@microsoft.com/

Links
~~~~~

This patch series is available here:

https://github.com/chriscool/git/commits/external-odb

Versions 1, 2 and 3 are here:

https://github.com/chriscool/git/commits/gl-external-odb12
https://github.com/chriscool/git/commits/gl-external-odb22
https://github.com/chriscool/git/commits/gl-external-odb61


Ben Peart (4):
  Documentation: add read-object-protocol.txt
  contrib: add long-running-read-object/example.pl
  Add t0410 to test read object mechanism
  odb-helper: add read_object_process()

Christian Couder (43):
  builtin/clone: get rid of 'value' strbuf
  t0021/rot13-filter: refactor packet reading functions
  t0021/rot13-filter: improve 'if .. elsif .. else' style
  Add Git/Packet.pm from parts of t0021/rot13-filter.pl
  t0021/rot13-filter: use Git/Packet.pm
  Git/Packet.pm: improve error message
  Git/Packet.pm: add packet_initialize()
  Git/Packet: add capability functions
  t0400: add 'put' command to odb-helper script
  external odb: add write support
  external-odb: accept only blobs for now
  t0400: add test for external odb write 

[RFC/PATCH v4 02/49] t0021/rot13-filter: refactor packet reading functions

2017-06-20 Thread Christian Couder
To make it possible in a following commit to move packet
reading and writing functions into a Packet.pm module,
let's refactor these functions, so they don't handle
printing debug output and exiting.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0021/rot13-filter.pl | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 617f581e56..d6411ca523 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -39,8 +39,7 @@ sub packet_bin_read {
my $bytes_read = read STDIN, $buffer, 4;
if ( $bytes_read == 0 ) {
# EOF - Git stopped talking to us!
-   print $debug "STOP\n";
-   exit();
+   return ( -1, "" );
}
elsif ( $bytes_read != 4 ) {
die "invalid packet: '$buffer'";
@@ -64,7 +63,7 @@ sub packet_bin_read {
 
 sub packet_txt_read {
my ( $res, $buf ) = packet_bin_read();
-   unless ( $buf =~ s/\n$// ) {
+   unless ( $res == -1 || $buf =~ s/\n$// ) {
die "A non-binary line MUST be terminated by an LF.";
}
return ( $res, $buf );
@@ -109,7 +108,12 @@ print $debug "init handshake complete\n";
 $debug->flush();
 
 while (1) {
-   my ($command) = packet_txt_read() =~ /^command=(.+)$/;
+   my ($res, $command) = packet_txt_read();
+   if ( $res == -1 ) {
+   print $debug "STOP\n";
+   exit();
+   }
+   $command =~ s/^command=//;
print $debug "IN: $command";
$debug->flush();
 
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 06/49] Git/Packet.pm: improve error message

2017-06-20 Thread Christian Couder
Try to give a bit more information when we die()
because there is no new line at the end of something
we receive.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 perl/Git/Packet.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
index aaffecbe2a..2ad6b00d6c 100644
--- a/perl/Git/Packet.pm
+++ b/perl/Git/Packet.pm
@@ -49,7 +49,8 @@ sub packet_bin_read {
 sub packet_txt_read {
my ( $res, $buf ) = packet_bin_read();
unless ( $res == -1 || $buf =~ s/\n$// ) {
-   die "A non-binary line MUST be terminated by an LF.";
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
}
return ( $res, $buf );
 }
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 15/49] Add GIT_NO_EXTERNAL_ODB env variable

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 cache.h| 9 +
 environment.c  | 4 
 external-odb.c | 6 ++
 sha1_file.c| 3 +++
 4 files changed, 22 insertions(+)

diff --git a/cache.h b/cache.h
index 391a69e9c5..6047755629 100644
--- a/cache.h
+++ b/cache.h
@@ -428,6 +428,7 @@ static inline enum object_type object_type(unsigned int mode)
 #define CEILING_DIRECTORIES_ENVIRONMENT "GIT_CEILING_DIRECTORIES"
 #define NO_REPLACE_OBJECTS_ENVIRONMENT "GIT_NO_REPLACE_OBJECTS"
 #define GIT_REPLACE_REF_BASE_ENVIRONMENT "GIT_REPLACE_REF_BASE"
+#define NO_EXTERNAL_ODB_ENVIRONMENT "GIT_NO_EXTERNAL_ODB"
 #define GITATTRIBUTES_FILE ".gitattributes"
 #define INFOATTRIBUTES_FILE "info/attributes"
 #define ATTRIBUTE_MACRO_PREFIX "[attr]"
@@ -760,6 +761,14 @@ void reset_shared_repository(void);
 extern int check_replace_refs;
 extern char *git_replace_ref_base;
 
+/*
+ * Do external odbs need to be used this run?  This variable is
+ * initialized to true unless $GIT_NO_EXTERNAL_ODB is set, but it
+ * may be set to false by some commands that do not want external
+ * odbs to be active.
+ */
+extern int use_external_odb;
+
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
diff --git a/environment.c b/environment.c
index aa478e71de..8c4f52635c 100644
--- a/environment.c
+++ b/environment.c
@@ -46,6 +46,7 @@ const char *excludes_file;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
 int check_replace_refs = 1;
 char *git_replace_ref_base;
+int use_external_odb = 1;
 enum eol core_eol = EOL_UNSET;
 enum safe_crlf safe_crlf = SAFE_CRLF_WARN;
 unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;
@@ -120,6 +121,7 @@ const char * const local_repo_env[] = {
INDEX_ENVIRONMENT,
NO_REPLACE_OBJECTS_ENVIRONMENT,
GIT_REPLACE_REF_BASE_ENVIRONMENT,
+   NO_EXTERNAL_ODB_ENVIRONMENT,
GIT_PREFIX_ENVIRONMENT,
GIT_SUPER_PREFIX_ENVIRONMENT,
GIT_SHALLOW_FILE_ENVIRONMENT,
@@ -188,6 +190,8 @@ static void setup_git_env(void)
replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT);
git_replace_ref_base = xstrdup(replace_ref_base ? replace_ref_base
  : "refs/replace/");
+   if (getenv(NO_EXTERNAL_ODB_ENVIRONMENT))
+   use_external_odb = 0;
namespace = expand_namespace(getenv(GIT_NAMESPACE_ENVIRONMENT));
namespace_len = strlen(namespace);
shallow_file = getenv(GIT_SHALLOW_FILE_ENVIRONMENT);
diff --git a/external-odb.c b/external-odb.c
index 6d4fdd0bc1..a88837feda 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -63,6 +63,9 @@ int external_odb_has_object(const unsigned char *sha1)
 {
struct odb_helper *o;
 
+   if (!use_external_odb)
+   return 0;
+
external_odb_init();
 
for (o = helpers; o; o = o->next)
@@ -133,6 +136,9 @@ int external_odb_write_object(const void *buf, size_t len,
 {
struct odb_helper *o;
 
+   if (!use_external_odb)
+   return 1;
+
/* For now accept only blobs */
if (strcmp(type, "blob"))
return 1;
diff --git a/sha1_file.c b/sha1_file.c
index 8dd09334cf..9d8e37432e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -642,6 +642,9 @@ void prepare_external_alt_odb(void)
static int linked_external;
const char *path;
 
+   if (!use_external_odb)
+   return;
+
if (linked_external)
return;
 
-- 
2.13.1.565.gbfcd7a9048
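
Since this patch carries no description, here is a rough shell sketch (not part of the series) of the check that setup_git_env() gains above. getenv() returns non-NULL for any variable that is set, so even an empty GIT_NO_EXTERNAL_ODB would turn external odbs off. The function name use_external_odb_enabled is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper mirroring `if (getenv(NO_EXTERNAL_ODB_ENVIRONMENT))`:
# succeeds when external odbs would be used, fails when they are disabled.
use_external_odb_enabled () {
	# ${VAR+set} expands to "set" whenever VAR is set, even to "".
	if test -n "${GIT_NO_EXTERNAL_ODB+set}"
	then
		return 1
	fi
	return 0
}
```

With this reading, something like `GIT_NO_EXTERNAL_ODB=1 git <command>` should be enough to run one command without the external odb helpers.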



[RFC/PATCH v4 13/49] external-odb: accept only blobs for now

2017-06-20 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 external-odb.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/external-odb.c b/external-odb.c
index 893937a7d4..6d4fdd0bc1 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -133,6 +133,10 @@ int external_odb_write_object(const void *buf, size_t len,
 {
struct odb_helper *o;
 
+   /* For now accept only blobs */
+   if (strcmp(type, "blob"))
+   return 1;
+
external_odb_init();
 
for (o = helpers; o; o = o->next) {
-- 
2.13.1.565.gbfcd7a9048



[RFC/PATCH v4 05/49] t0021/rot13-filter: use Git/Packet.pm

2017-06-20 Thread Christian Couder
After creating Git/Packet.pm from part of t0021/rot13-filter.pl,
we can now simplify this script by using Git/Packet.pm.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 t/t0021/rot13-filter.pl | 51 +++--
 1 file changed, 3 insertions(+), 48 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 1fc581c814..36a9eb3608 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -19,9 +19,12 @@
 # same command.
 #
 
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
 use strict;
 use warnings;
 use IO::File;
+use Git::Packet;
 
 my $MAX_PACKET_CONTENT_SIZE = 65516;
 my @capabilities= @ARGV;
@@ -34,54 +37,6 @@ sub rot13 {
return $str;
 }
 
-sub packet_bin_read {
-   my $buffer;
-   my $bytes_read = read STDIN, $buffer, 4;
-   if ( $bytes_read == 0 ) {
-   # EOF - Git stopped talking to us!
-   return ( -1, "" );
-   } elsif ( $bytes_read != 4 ) {
-   die "invalid packet: '$buffer'";
-   }
-   my $pkt_size = hex($buffer);
-   if ( $pkt_size == 0 ) {
-   return ( 1, "" );
-   } elsif ( $pkt_size > 4 ) {
-   my $content_size = $pkt_size - 4;
-   $bytes_read = read STDIN, $buffer, $content_size;
-   if ( $bytes_read != $content_size ) {
-   die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
-   }
-   return ( 0, $buffer );
-   } else {
-   die "invalid packet size: $pkt_size";
-   }
-}
-
-sub packet_txt_read {
-   my ( $res, $buf ) = packet_bin_read();
-   unless ( $res == -1 || $buf =~ s/\n$// ) {
-   die "A non-binary line MUST be terminated by an LF.";
-   }
-   return ( $res, $buf );
-}
-
-sub packet_bin_write {
-   my $buf = shift;
-   print STDOUT sprintf( "%04x", length($buf) + 4 );
-   print STDOUT $buf;
-   STDOUT->flush();
-}
-
-sub packet_txt_write {
-   packet_bin_write( $_[0] . "\n" );
-}
-
-sub packet_flush {
-   print STDOUT sprintf( "%04x", 0 );
-   STDOUT->flush();
-}
-
 print $debug "START\n";
 $debug->flush();
 
-- 
2.13.1.565.gbfcd7a9048
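
For readers who do not have the packet format in mind, the framing handled by the helpers removed above can be sketched in shell as follows. This is only an illustration, not code from the series: the pkt_* names are made up, and payloads are assumed to be ASCII so that the character count equals the byte count.

```shell
#!/bin/sh
# Sketch of pkt-line framing; the pkt_* names are illustrative only.
# Assumes payloads are ASCII, so ${#1} (characters) equals the byte count.

# Write one text packet: 4 hex digits of total length, payload, LF.
pkt_txt_write () {
	# +4 for the length header itself, +1 for the trailing LF
	printf '%04x%s\n' $((${#1} + 5)) "$1"
}

# Write a flush packet.
pkt_flush () {
	printf '0000'
}

# Given a 4-digit hex header, print how many content bytes follow.
pkt_content_size () {
	size=$((0x$1))
	if test "$size" -eq 0
	then
		echo 0		# flush packet, no content
	elif test "$size" -gt 4
	then
		echo $((size - 4))
	else
		echo "invalid packet size: $size" >&2
		return 1
	fi
}
```

As in packet_bin_read(), a header of 0000 is a flush packet and any other size of 4 or less is rejected.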



[PATCH] sub-process: fix comment about api-sub-process.txt

2017-06-14 Thread Christian Couder
Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 sub-process.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sub-process.h b/sub-process.h
index 7d451e1cde..d9a45cd359 100644
--- a/sub-process.h
+++ b/sub-process.h
@@ -7,7 +7,7 @@
 
 /*
  * Generic implementation of background process infrastructure.
- * See Documentation/technical/api-background-process.txt.
+ * See: Documentation/technical/api-sub-process.txt
  */
 
  /* data structures */
-- 
2.13.1.453.g04e95ab038



[PATCH] builtin/clone: get rid of 'value' strbuf

2017-06-14 Thread Christian Couder
This makes the code simpler by removing a few lines, and getting
rid of one variable.

Signed-off-by: Christian Couder <chrisc...@tuxfamily.org>
---
 builtin/clone.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index a2ea019c59..370a233d22 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -870,7 +870,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
const struct ref *our_head_points_at;
struct ref *mapped_refs;
const struct ref *ref;
-   struct strbuf key = STRBUF_INIT, value = STRBUF_INIT;
+   struct strbuf key = STRBUF_INIT;
struct strbuf branch_top = STRBUF_INIT, reflog_msg = STRBUF_INIT;
struct transport *transport = NULL;
const char *src_ref_prefix = "refs/heads/";
@@ -1035,7 +1035,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
strbuf_addf(&branch_top, "refs/remotes/%s/", option_origin);
}
 
-   strbuf_addf(&value, "+%s*:%s*", src_ref_prefix, branch_top.buf);
strbuf_addf(&key, "remote.%s.url", option_origin);
git_config_set(key.buf, repo);
strbuf_reset(&key);
@@ -1049,10 +1048,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
if (option_required_reference.nr || option_optional_reference.nr)
setup_reference();
 
-   fetch_pattern = value.buf;
+   fetch_pattern = xstrfmt("+%s*:%s*", src_ref_prefix, branch_top.buf);
refspec = parse_fetch_refspec(1, &fetch_pattern);
-
-   strbuf_reset(&value);
+   free((char *)fetch_pattern);
 
remote = remote_get(option_origin);
transport = transport_get(remote, remote->url[0]);
@@ -1191,7 +1189,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
strbuf_release(&reflog_msg);
strbuf_release(&branch_top);
strbuf_release(&key);
-   strbuf_release(&value);
junk_mode = JUNK_LEAVE_ALL;
 
free(refspec);
-- 
2.13.1.453.g04e95ab038



[ANNOUNCE] Git Rev News edition 28

2017-06-14 Thread Christian Couder
Hi everyone,

The 28th edition of Git Rev News is now published:

  https://git.github.io/rev_news/2017/06/14/edition-28/

Thanks a lot to all the contributors and helpers!

Enjoy,
Christian, Thomas, Jakub and Markus.


Draft of Git Rev News edition 28

2017-06-12 Thread Christian Couder
Hi,

A draft of a new Git Rev News edition is available here:

  https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-28.md

Everyone is welcome to contribute in any section either by editing the
above page on GitHub and sending a pull request, or by commenting on
this GitHub issue:

  https://github.com/git/git.github.io/issues/246

You can also reply to this email.

In general all kinds of contribution, for example proofreading,
suggestions for articles or links, help on the issues in GitHub, and
so on, are very much appreciated.

I tried to cc everyone who appears in this edition, but maybe I missed
some people, sorry about that.

Thomas, Jakub, Markus and myself plan to publish this edition on
Wednesday June 14th.

Thanks,
Christian.


Re: 'pu' broken at t5304 tonight

2017-06-10 Thread Christian Couder
On Sat, Jun 10, 2017 at 9:05 PM, Kevin Daudt  wrote:
> On Sat, Jun 10, 2017 at 02:48:36PM +0200, Kevin Daudt wrote:

>> For me, this bisects to the latest merge:
>>
>> 2047eebd3 (Merge branch 'bw/repo-object' into pu, 2017-06-10), but
>> neither of the parent of the merge break this test, so it looks like
>> it's because of an interaction between the repo-object topic and another
>> topic.
>
> Merging the repo-object with different other topic branches reveals this
> topic to cause the bad interaction:
>
> b56c91004 (Merge branch 'nd/prune-in-worktree' into pu, 2017-06-10)
>
> Still investigating why it happens.

Yeah, 9570b25a97 (revision.c: --indexed-objects add objects from all
worktrees, 2017-04-19) adds the following test to t5304-prune.sh but
this fails if nd/prune-in-worktree is rebased on top of
bw/repo-object:

test_expect_success 'prune: handle index in multiple worktrees' '
   git worktree add second-worktree &&
   echo "new blob for second-worktree" >second-worktree/blob &&
   git -C second-worktree add blob &&
   git prune --expire=now &&
   git -C second-worktree show :blob >actual &&
   test_cmp second-worktree/blob actual
'


Re: [PATCH 1/2] git-compat-util: add a freez() wrapper around free(x); x = NULL

2017-06-09 Thread Christian Couder
On Fri, Jun 9, 2017 at 10:53 AM, Ævar Arnfjörð Bjarmason
 wrote:

> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -787,6 +787,7 @@ extern char *xstrdup(const char *str);
>  extern void *xmalloc(size_t size);
>  extern void *xmallocz(size_t size);
>  extern void *xmallocz_gently(size_t size);
> +#define freez(p) do { if (p) { free(p); (p) = NULL; } } while (0)

I think we already rely on free(NULL) working, see
http://public-inbox.org/git/alpine.DEB.2.20.1608301948310.129229@virtualbox/
for example, so this could just be:

#define freez(p) do { free(p); (p) = NULL; } while (0)

and yeah FREEZ() would be even better than freez().


Re: [GSoC][PATCH v2 2/2] submodule: port submodule subcommand status

2017-06-05 Thread Christian Couder
On Mon, Jun 5, 2017 at 10:25 PM, Prathamesh Chavan  wrote:

> ---
> In this new version of patch, function print_status is introduced.
>
> The functions for_each_submodule_list and get_submodule_displaypath
> are found to be the same as those in the ported submodule subcommand
> foreach's patches. The reason for doing so is to keep both the patches
> independent and on separate branches. Also this patch is built on
> the branch gitster/jk/bug-to-abort for utilizing its BUG() macro.

The BUG() macro is now in master.


Re: [PATCH] perf: work around the tested repo having an index.lock

2017-06-05 Thread Christian Couder
On Mon, Jun 5, 2017 at 4:02 AM, Junio C Hamano <gits...@pobox.com> wrote:
> Christian Couder <christian.cou...@gmail.com> writes:
>
>> On Sun, Jun 4, 2017 at 2:00 AM, Junio C Hamano <gits...@pobox.com> wrote:
>>> Ævar Arnfjörð Bjarmason <ava...@gmail.com> writes:
>>>
>>>>> My feeling exactly.  Diagnosing and failing upfront saying "well you
>>>>> made a copy but it is not suitable for testing" sounds more sensible
>>>>> at least to me.
>>>>
>>>> This change makes the repo suitable for testing when it wasn't before.
>>>
>>> Perhaps "not suitable" was a bit too vague.
>>>
>>> The copy you made is not in a consistent state that is good for
>>> testing.  This change may declare that it is now in a consistent
>>> state, but removal of a single *.lock file does not make it so.  We
>>> do not know what other transient inconsistency the resulting copy
>>> has; it is inherent to git-unaware copy---that is why we discouraged
>>> and removed rsync transport after all.
>>
>> If we don't like git-unaware copies, maybe we should go back to the
>> reasons why we are making one here.
>
> We do need git-unaware bit-for-bit copy for testing, because you may
> want to see the effect of unreachable objects, for example.

I think there might be different kinds of people interested in performance tests.

Users with existing repositories might want to see how the different
Git versions perform on their real life repos.
Developers might want to test Git on different repos with different
characteristics.

For example some developers might want to test on repos with and
without a lot of unreachable objects, to make sure that the latest
changes they made improve perf in both cases. While some users might
only be interested in testing on their actual repositories to see how
the latest Git versions improve things (or not) in practice.

In this example the needs of developers would perhaps be better suited
if they could control the amount of unreachable objects in the tests,
while the needs of the users would be better suited if the tests just
used actual repos as is.

So I wonder what changes would be needed to the perf framework and the
perf tests to accommodate both of these kinds of needs.



> It's just that git-unaware copies, because it cannot be an atomic
> snapshot, can introduce inconsistencies the original repository did
> not have, rendering the result ineffective.


Re: [PATCH] test-lib: add ability to cap the runtime of tests

2017-06-04 Thread Christian Couder
On Mon, Jun 5, 2017 at 3:55 AM, Junio C Hamano  wrote:
> Ævar Arnfjörð Bjarmason  writes:
>
>> Realistically I'm going to submit this patch, I'm not going to take
>> the much bigger project of refactoring the entire test suite so that
>> no test runs under N second, and of course any such refactoring can
>> only aim for a fixed instead of dynamic N.
>
> I do not expect any single person to tackle the splitting.  I just
> wished that a patch inspired by this patch (or better yet, a new
> version of this patch) made the tail end of "make test" output to
> read like this:
>
>...
>[18:32:44] t9400-git-cvsserver-server.sh .. ok    18331 ms
>[18:32:49] t9402-git-cvsserver-refs.sh .... ok    22902 ms
>[18:32:49] t9200-git-cvsexportcommit.sh ... ok    25163 ms
>[18:32:51]
>All tests successful.
>Files=785, Tests=16928, 122 wallclock secs ( ...
>Result: PASS
>
>* The following tests took longer than 15 seconds to run.  We
>  may want to look into splitting them into smaller files.
>
>t3404-rebase-interactive.sh ... 19 secs
>t9001-send-email.sh ... 22 secs
>t9402-git-cvsserver-refs.sh ... 22 secs
>t9200-git-cvsexportcommit.sh .. 25 secs
>
> when the hidden feature is _not_ used, so that wider set of people
> will be forced to see that some tests take inordinate amount of
> time, and entice at least some of them to look into it.
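
To make the idea a bit more concrete, the summary above could be produced by a small filter along these lines. This is only a sketch under assumptions: the function name report_slow_tests is made up, and the "<script> <milliseconds>" input format is something I picked for illustration, not what "make test" or prove actually emits.

```shell
#!/bin/sh
# Hypothetical filter: reads "<script> <milliseconds>" pairs on stdin and
# lists the scripts that ran longer than the given number of seconds.
report_slow_tests () {
	awk -v limit="$1" '
		$2 > limit * 1000 {
			printf "%s %d secs\n", $1, $2 / 1000
		}
	'
}
```

It would be fed the per-script timings and called as `report_slow_tests 15` to get the list of scripts worth splitting.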

I wonder if splitting tests would make a good GSoC microproject for next year.


Re: Git Merge 2017 Videos

2017-06-04 Thread Christian Couder
On Sun, Jun 4, 2017 at 7:36 PM, Kevin Daudt  wrote:
> On Sun, Jun 04, 2017 at 11:24:17AM +0100, Philip Oakley wrote:
>> While looking at the recent .gitignore issue (the need to use `**`) I came
>> up against a comment in
>> https://public-inbox.org/git/cagz79kzqsaubfotjyqm+-+ljyyec2ykj5exuy5kderezfh0...@mail.gmail.com/
>> noting that the Git Merge 2017 videos were not available at that time.
>>
>> Well, a search found them on Youtube on the GitHub channel :
>> https://www.youtube.com/results?search_query=git+merge+2017+videos
>>
>> With a playlist : 
>> https://www.youtube.com/watch?v=tvymSWfvkjw&list=PL0lo9MOBetEGRAJzoTCdco_fOKDfhqaOY
>>
>> Enjoy the viewing. The first few have been good.
>
> Thanks for sharing this.

This was in the last edition of Git Rev News:
https://git.github.io/rev_news/2017/05/17/edition-27/


Re: [PATCH] perf: work around the tested repo having an index.lock

2017-06-04 Thread Christian Couder
On Sun, Jun 4, 2017 at 2:00 AM, Junio C Hamano  wrote:
> Ævar Arnfjörð Bjarmason  writes:
>
>>> My feeling exactly.  Diagnosing and failing upfront saying "well you
>>> made a copy but it is not suitable for testing" sounds more sensible
>>> at least to me.
>>
>> This change makes the repo suitable for testing when it wasn't before.
>
> Perhaps "not suitable" was a bit too vague.
>
> The copy you made is not in a consistent state that is good for
> testing.  This change may declare that it is now in a consistent
> state, but removal of a single *.lock file does not make it so.  We
> do not know what other transient inconsistency the resulting copy
> has; it is inherent to git-unaware copy---that is why we discouraged
> and removed rsync transport after all.

If we don't like git-unaware copies, maybe we should go back to the
reasons why we are making one here.
In 342e9ef2d9 (Introduce a performance testing framework, 2012-02-17),
Thomas wrote:

3. Creating test repos from scratch in every test is extremely
   time-consuming, and shipping or downloading such large/weird repos
   is out of the question.

   We leave this decision to the user.  Two different sizes of test
   repos can be configured, and the scripts just copy one or more of
   those (using hardlinks for the object store).  By default it tries
   to use the build tree's git.git repository.

   This is fairly fast and versatile.  Using a copy instead of a clone
   preserves many properties that the user may want to test for, such
   as lots of loose objects, unpacked refs, etc.

Is a local clone really much slower these days? Wouldn't it use
hard links too?
By the way the many properties that are preserved might not be worth
preserving as they could make results depend a lot on the current
state of the original repo.


Re: [PATCH v3 6/6] fsmonitor: add a sample query-fsmonitor hook script for Watchman

2017-05-31 Thread Christian Couder
On Thu, May 25, 2017 at 8:36 PM, Ben Peart  wrote:

>  templates/hooks--query-fsmonitor.sample | 37 
> +
>  1 file changed, 37 insertions(+)
>  create mode 100644 templates/hooks--query-fsmonitor.sample

Please make this file executable like the other sample hook scripts.


Re: [PATCH v2 0/6] Fast git status via a file system watcher

2017-05-31 Thread Christian Couder
On Thu, May 25, 2017 at 3:55 PM, Ben Peart <peart...@gmail.com> wrote:
>
> On 5/24/2017 6:54 AM, Christian Couder wrote:
>>>
>>> A new git hook (query-fsmonitor) must exist and be enabled
>>> (core.fsmonitor=true) that takes a time_t formatted as a string and
>>> outputs to stdout all files that have been modified since the requested
>>> time.
>>
>> Is there a reason why there is a new hook, instead of a
>> "core.fsmonitorquery" config option to which you could pass whatever
>> command line with options?
>
> A hook is a simple and well defined way to integrate git with another
> process.  If there is some fixed set of arguments that need to be passed to
> a file system monitor (beyond the timestamp stored in the index extension),
> they can be encoded in the integration script like I've done in the Watchman
> integration sample hook.

Yeah, but a hook must also be called every time git wants to
communicate with the file system monitor. And we could perhaps get a
speed up if we could have only one long running process to communicate
with the file system monitor.


Re: [PATCH v2 0/6] Fast git status via a file system watcher

2017-05-31 Thread Christian Couder
>>>> Yeah, they could be encoded in the integration script, but it could be
>>>> better if it was possible to just configure a generic command line.
>>>>
>>>> For example if the directory that should be watched for filesystem
>>>> changes could be passed as well as the time since the last changes,
>>>> perhaps only a generic command line would be needed.
>>>
>>> Maybe I'm not understanding what you have in mind but I haven't found
>>> this to be the case in the two integrations I've done with file system
>>> watchers (one internal and Watchman).  They require you download,
>>> install, and configure them by telling them about the folders you want
>>> monitored.  Then you can start querying them for changes and processing
>>> the output to match what git expects.  While the download and install
>>> steps vary, having that query + process and return results wrapped up
>>> in an integration hook has worked well.
>>
>> It looks like one can also just ask watchman to monitor a directory with:
>>
>> watchman watch /path/to/dir
>>
>> or:
>>
>> echo '["watch", "/path/to/dir"]' | watchman -j
>>
>> Also for example on Linux people might want to use command line tools
>> like:
>>
>> https://linux.die.net/man/1/inotifywait
>>
>> and you can pass the directories you want to be watched as arguments
>> to this kind of tools.
>>
>> So it would be nice, if we didn't require the user to configure
>> anything and we could just configure the watching of what we need in
>> the hook (or a command configured using a config option). If the hook
>> (or configured command) could be passed the directory by git, it could
>> also be generic.
>
> OK, I think I understand what you're attempting to accomplish now. Often,
> Watchman (and other similar tools) are used to do much more than speed up
> git (in fact, _all_ use cases today are not used for that since this patch
> series hasn't been accepted yet :)).  They trigger builds, run verification
> tools, test passes, or other tasks.

Yeah, but some people might only be interested in installing Watchman
or similar tools to speed up "git status".

> I'm afraid that attempting to have the user configure git to configure the
> tool "automatically" is just adding an extra layer of complexity rather than
> making it simpler.

I think that for the user it makes things simpler, as the user
wouldn't have to configure anything.

For example if the hook does something like the following:

# Check that watchman is already watching the worktree
if ! watchman watch-list | grep "\"$GIT_WORK_TREE\""
then
   # Ask watchman to watch the worktree
   watchman watch "$GIT_WORK_TREE"
   exit 1
fi

# Query Watchman for all the changes since the requested time
echo "[\"query\", \"$GIT_WORK_TREE\", {\"since\": $1,
\"fields\":[\"name\"]}]" | \
watchman -j | \
...

Then users might not need to configure Watchman in the first place,
and if they move their repo they might not need to reconfigure
Watchman.
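
The query part of the hook sketch above could be factored into a small helper, roughly like this. The name watchman_query_json is made up for illustration, and the helper does no JSON escaping, so a worktree path containing a double quote or a backslash would break it.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON query used in the hook sketch above.
# $1 is the worktree path, $2 the "since" value passed to the hook.
# No JSON escaping is done, so paths containing '"' or '\' would break it.
watchman_query_json () {
	printf '["query", "%s", {"since": %s, "fields": ["name"]}]\n' "$1" "$2"
}
```

The hook would then run `watchman_query_json "$GIT_WORK_TREE" "$1" | watchman -j` and post-process the output as before.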

> I'll leave that to a future patch series to work out.

Yeah, the above improvement can be done later.

>>>> I am also wondering about sparse checkout, as we might want to pass
>>>> all the directories we are interested in.
>>>> How is it supposed to work with sparse checkout?
>>>
>>>
>>> The fsmonitor code works well with or without a sparse-checkout.  The
>>> file system monitor is unaware of the sparse checkout so will notify git
>>> about any change irrespective of whether git will eventually ignore it
>>> because the skip worktree bit is set.
>>
>> I was wondering if it could ease the job for the monitoring service
>> and perhaps improve performance to just ask to watch the directories
>> we are interested in when using sparse checkout.
>> On Linux it looks like a separate inotify watch is created for every
>> subdirectory and there is maximum amount of inotify watches per user.
>> This can be increased by writing in
>> /proc/sys/fs/inotify/max_user_watches, but it is not nice to have to
>> ask admins to increase this.
>
> Having a single instance that watches the root of the working directory is
> the simplest model and minimizes use of system resources like inotify as
> there is only one needed per clone.

From https://linux.die.net/man/1/inotifywait:

-r, --recursive

Watch all subdirectories of any directories passed as arguments.
Watches will be set up recursively to an unlimited depth. Symbolic
links are not traversed. Newly created subdirectories will also be
watched.

Warning: If you use this option while watching the root directory of a
large tree, it may take quite a while until all inotify watches are
established, and events will not be received in this time. Also, since
one inotify watch will be established per subdirectory, it is possible
that the maximum amount of inotify watches per user will be reached.
The default maximum is 8192; it can be increased by writing to
/proc/sys/fs/inotify/max_user_watches.
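
To get a feeling for whether a given worktree would hit that limit, something like the following rough sketch could be used. The function names are made up for illustration, and the check is only an approximation: it ignores watches already held by the same user for other trees, and directory names containing newlines would be miscounted.

```shell
#!/bin/sh
# Count the directories under $1; a recursive watch needs one inotify
# watch per directory according to the man page quoted above.
count_dirs () {
	find "$1" -type d | wc -l | tr -d ' '
}

# Succeed if the tree at $1 fits within the per-user watch limit.
# Falls back to the documented default of 8192 when /proc is unavailable.
fits_inotify_limit () {
	limit_file=/proc/sys/fs/inotify/max_user_watches
	if test -r "$limit_file"
	then
		limit=$(cat "$limit_file")
	else
		limit=8192
	fi
	test "$(count_dirs "$1")" -le "$limit"
}
```

With sparse checkout, running count_dirs on only the directories of interest would show how many watches could be saved.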

> In addition, when the sparse-checkout file is 
