Re: git patch format for rename

2016-07-06 Thread Stefan Beller
On Wed, Jul 6, 2016 at 9:00 AM, nadya.zabrod...@jetbrains.com
 wrote:
> Hi,
>
> could you please explain to me why  ‘rename from & rename to’ information is 
> added to git patch?
> Is there any reason why git can’t recognize and apply rename-change which 
> looks like this:
>
> diff --git a/before.txt b/after.txt
> --- a/before.txt
> +++ b/after.txt

See https://github.com/git/git/blob/master/Documentation/diff-generate-patch.txt
for the status quo. I think most of it grew over time and made sense
at the time of implementation,
so the alternatives were maybe considered, you'd need to look for the
old discussions around
the diff format.

>
> I’m implementing a program (with an IDE) to create patches compatible with 
> Git and want to have better understanding of the format and possible issues 
> with it.
>
> Thanks a lot in advance.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv3 0/4] Push options in C Git

2016-07-06 Thread Stefan Beller
This is not marked for RFC any more, as I do not recall any open points
left for discussion. This addresses the only reply from Eric Wong on patch 3:

diff to v2:
diff --git a/send-pack.c b/send-pack.c
index e328276..c943560 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -536,7 +536,8 @@ int send_pack(struct send_pack_args *args,
 
for_each_string_list_item(item, args->push_options)
packet_buf_write(&sb, "%s", item->string);
-   write_or_die(out, sb.buf, sb.len);
+
+   write_or_die(out, sb.buf, sb.len);
packet_flush(out);
strbuf_release(&sb);
}
diff --git a/transport.c b/transport.c
index 598bd1f..0298be1 100644
--- a/transport.c
+++ b/transport.c
@@ -641,7 +641,6 @@ struct transport *transport_get(struct remote *remote, 
const char *url)
struct transport *ret = xcalloc(1, sizeof(*ret));
 
ret->progress = isatty(2);
-   ret->push_options = NULL;
 
if (!remote)
die("No remote provided to transport_get()");

Cover letter v2:


Allow a user to pass information along a push to the pre/post-receive hook
on the remote.

Jeff writes on v1:
> Whereas in Dennis's patches, it was about specific information directly
> related to the act of pushing.

This allows to transmit arbitrary information as the backends of $VENDOR
may have different options available related to the direct act of pushing.

Thanks,
Stefan

Cover letter v1:


Allow a user to pass information along a push to the pre/post-receive hook
on the remote.

When using a remote that is more than just a plain Git host (e.g. Gerrit,
Git{hub/lab}, etc) this may become more obvious: The (server backend specific)
push options can instruct the server to:
* open a pull request
* send out emails asking for review
* (un)trigger continuous integration
* set priority for continuous integration (i.e. bots pushing may ask to be
  treated with lower priority compared to humans)
* ... 

Most of these actions can be done on the client side as well,
but in these remote-centric workflows it is easier to do that on the remote,
which is why we need to transport the information there.

More concrete examples:
* When you want a change in Gerrit to be submitted to refs/heads/master, you
  push instead to a magic branch refs/for/master and Gerrit will create a change
  for you (similar to a pull request). Instead we could imagine that you push
  to a magical refs/heads/master with a push option "create-change".
  
* When pushing to Gerrit you can already attach some of this information by
  adding a '%' followed by the parameter to the ref, i.e. when interacting with
  Gerrit it is possible to do things like[1]:

git push origin 
HEAD:refs/for/master%draft%topic=example%cc=jon@example.org
  
  This is not appealing to our users as it looks like hacks upon hacks to make
  it work. It would read better if it was spelled as:
  
  git push origin HEAD:refs/for/master \
  --push-option draft \
  --push-option topic=example \
  --push-option cc=jon@example.org
  
  (with a short form that is even easier to type,
   but this is is more intuitive already)

This is a patch series to Git core, which is developed at the same time
as a change is proposed to JGit by Dan Wang, see [2].

This code is also available at [3].

Thanks,
Stefan

[1] Not all Gerrit '%' options are documented, so here is a link to source code 
instead :(
https://gerrit.googlesource.com/gerrit/+/refs/heads/master/gerrit-server/src/main/java/com/google/gerrit/server/git/ReceiveCommits.java#1141

[2] https://git.eclipse.org/r/#/c/74570/ 
 
[3] https://github.com/stefanbeller/git/tree/pushoptions

Stefan Beller (4):
  push options: {pre,post}-receive hook learns about push options
  receive-pack: implement advertising and receiving push options
  push: accept push options
  add a test for push options

 Documentation/config.txt  |  7 +-
 Documentation/git-push.txt|  8 ++-
 Documentation/githooks.txt|  4 ++
 Documentation/technical/pack-protocol.txt | 10 +--
 Documentation/technical/protocol-capabilities.txt |  8 +++
 builtin/push.c| 16 -
 builtin/receive-pack.c| 85 +++
 send-pack.c   | 29 
 send-pack.h   |  3 +
 t/t5544-push-options.sh   | 85 +++
 transport.c   |  2 +
 transport.h   |  7 ++
 12 files changed, 242 insertions(+), 22 deletions(-)
 create mode 100755 t/t5544-push-options.sh

-- 
2.9.0.141.gdd65b60

--
To unsubscribe from this list: send the line "unsu

[PATCH 3/4] push: accept push options

2016-07-06 Thread Stefan Beller
This implements everything that is required on the client side to make use
of push options from the porcelain push command.

Signed-off-by: Stefan Beller 
---
 Documentation/git-push.txt |  8 +++-
 builtin/push.c | 16 +---
 send-pack.c| 30 ++
 send-pack.h|  3 +++
 transport.c|  1 +
 transport.h|  7 +++
 6 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 19f46b6..b0b1273 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git push' [--all | --mirror | --tags] [--follow-tags] [--atomic] [-n | 
--dry-run] [--receive-pack=]
   [--repo=] [-f | --force] [-d | --delete] [--prune] [-v | 
--verbose]
-  [-u | --set-upstream]
+  [-u | --set-upstream] [--push-option=]
   [--[no-]signed|--sign=(true|false|if-asked)]
   [--force-with-lease[=[:]]]
   [--no-verify] [ [...]]
@@ -156,6 +156,12 @@ already exists on the remote side.
Either all refs are updated, or on error, no refs are updated.
If the server does not support atomic pushes the push will fail.
 
+-L::
+--push-option::
+   Transmit the given string to the server, which passes them to
+   the pre-receive as well as the post-receive hook. Only C strings
+   containing no new lines are allowed.
+
 --receive-pack=::
 --exec=::
Path to the 'git-receive-pack' program on the remote
diff --git a/builtin/push.c b/builtin/push.c
index 4e9e4db..1b5d205 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -353,7 +353,8 @@ static int push_with_options(struct transport *transport, 
int flags)
return 1;
 }
 
-static int do_push(const char *repo, int flags)
+static int do_push(const char *repo, int flags,
+  const struct string_list *push_options)
 {
int i, errs;
struct remote *remote = pushremote_get(repo);
@@ -376,6 +377,9 @@ static int do_push(const char *repo, int flags)
if (remote->mirror)
flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
 
+   if (push_options->nr)
+   flags |= TRANSPORT_PUSH_OPTIONS;
+
if ((flags & TRANSPORT_PUSH_ALL) && refspec) {
if (!strcmp(*refspec, "refs/tags/*"))
return error(_("--all and --tags are incompatible"));
@@ -406,13 +410,16 @@ static int do_push(const char *repo, int flags)
for (i = 0; i < url_nr; i++) {
struct transport *transport =
transport_get(remote, url[i]);
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
} else {
struct transport *transport =
transport_get(remote, NULL);
-
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
@@ -500,6 +507,8 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
int push_cert = -1;
int rc;
const char *repo = NULL;/* default repository */
+   static struct string_list push_options = STRING_LIST_INIT_DUP;
+
struct option options[] = {
OPT__VERBOSITY(&verbosity),
OPT_STRING( 0 , "repo", &repo, N_("repository"), 
N_("repository")),
@@ -533,6 +542,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
  0, "signed", &push_cert, "yes|no|if-asked", N_("GPG sign the 
push"),
  PARSE_OPT_OPTARG, option_parse_push_signed },
OPT_BIT(0, "atomic", &flags, N_("request atomic transaction on 
remote side"), TRANSPORT_PUSH_ATOMIC),
+   OPT_STRING_LIST('o', "push-option", &push_options, 
N_("server-specific"), N_("option to transmit")),
OPT_SET_INT('4', "ipv4", &family, N_("use IPv4 addresses only"),
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", &family, N_("use IPv6 addresses only"),
@@ -563,7 +573,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
set_refspecs(argv + 1, argc - 1, repo);
}
 
-   rc = do_push(repo, flags);
+   rc = do_push(repo, flags, &push_options);
if (rc == -1)
usa

[PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-06 Thread Stefan Beller
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller 
---
 Documentation/config.txt  |  7 ++-
 Documentation/technical/pack-protocol.txt | 10 ++--
 Documentation/technical/protocol-capabilities.txt |  8 +++
 builtin/receive-pack.c| 59 +++
 4 files changed, 79 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e208af1..df1b314 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2410,7 +2410,12 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
By default, git-receive-pack will advertise the atomic push
-   capability to its clients. If you don't want to this capability
+   capability to its clients. If you don't want this capability
+   to be advertised, set this variable to false.
+
+receive.advertisePushOptions::
+   By default, git-receive-pack will advertise the push options capability
+   to its clients. If you don't want this capability
to be advertised, set this variable to false.
 
 receive.autogc::
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way 
as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 --
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the 
obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 
   update-request=  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..b71eda9 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,14 @@ atomic pushes. If the pushing client requests this 
capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+
+
+If the server sends the 'push-options' capability it is capable to accept
+push options after the update commands have been sent. If the pushing client
+requests this capability, the server will pass the options to the pre and post
+receive hooks that process this push request.
+
 allow-tip-sha1-in-want
 --
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index edbf81e..e71041a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options = 1;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char 
*value, void *cb)
return 0;
}
 
+   if (strcmp(var, "receive.advertisepushoptions") == 0) {
+   advertise_push_options = git_config_bool(var, value);
+   return 0;
+   }
+
return git_default_config(var, value, cb);
 }
 
@@ -207,6 +214,8 @@ static void show_ref(const char *path, const unsig

[PATCH 4/4] add a test for push options

2016-07-06 Thread Stefan Beller
The functions `mk_repo_pair` as well as `test_refs` are borrowed from
t5543-atomic-push, with additional hooks installed.

Signed-off-by: Stefan Beller 
---
 t/t5544-push-options.sh | 101 
 1 file changed, 101 insertions(+)
 create mode 100755 t/t5544-push-options.sh

diff --git a/t/t5544-push-options.sh b/t/t5544-push-options.sh
new file mode 100755
index 000..8dd3c8e
--- /dev/null
+++ b/t/t5544-push-options.sh
@@ -0,0 +1,101 @@
+#!/bin/sh
+
+test_description='pushing to a repository using push options'
+
+. ./test-lib.sh
+
+mk_repo_pair () {
+   rm -rf workbench upstream &&
+   test_create_repo upstream &&
+   test_create_repo workbench &&
+   (
+   cd upstream &&
+   git config receive.denyCurrentBranch warn &&
+   mkdir -p .git/hooks &&
+   cat >.git/hooks/pre-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/pre-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/pre-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/pre-receive
+
+   cat >.git/hooks/post-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/post-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/post-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/post-receive
+   ) &&
+   (
+   cd workbench &&
+   git remote add up ../upstream
+   )
+}
+
+# Compare the ref ($1) in upstream with a ref value from workbench ($2)
+# i.e. test_refs second HEAD@{2}
+test_refs () {
+   test $# = 2 &&
+   git -C upstream rev-parse --verify "$1" >expect &&
+   git -C workbench rev-parse --verify "$2" >actual &&
+   test_cmp expect actual
+}
+
+test_expect_success 'one push option works for a single branch' '
+   mk_repo_pair &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf up master
+   ) &&
+   test_refs master master &&
+   echo "asdf" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'push option denied by remote' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions false &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   test_must_fail git push --push-option=asdf up master
+   ) &&
+   test_refs master HEAD@{1}
+'
+
+test_expect_success 'two push options work' '
+   mk_repo_pair &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf --push-option="more structured 
text" up master
+   ) &&
+   test_refs master master &&
+   printf "asdf\nmore structured text\n" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_done
-- 
2.9.0.141.gd59d3e9.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-06 Thread Stefan Beller
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable

+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
==
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
==
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

We limit the push options for now
* to not exceed an arbitrary count, and
* to not exceed an arbitrary size.

This serves two purposes:
* DoS protection (i.e. one connection can add no more than 32kB
  now)
* We need to figure out how to handle large (>64kB). Jeff wrote:
  > Yes, but people are also happy when they can use a flexible and
  > standardized tool to do a thing. I'd be more frustrated when I found out
  > that Git's data-pushing protocol has arbitrary limitations (like, say, I
  > can't push a data item larger than a single 64K pkt-line), which would
  > easily just work with something like HTTP POSTs.
  So to keep a way open in the future to deal with large pay loads,
  the size is restricted for now.

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller 
---
 Documentation/githooks.txt  |  6 +
 builtin/receive-pack.c  | 48 +++--
 templates/hooks--pre-receive.sample | 22 +
 3 files changed, 63 insertions(+), 13 deletions(-)
 create mode 100644 templates/hooks--pre-receive.sample

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index d82e912..c875cde 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -247,6 +247,9 @@ Both standard output and standard error output are 
forwarded to
 'git send-pack' on the other end, so you can simply `echo` messages
 for the user.
 
+The number of push options are available in the variable GIT_PUSH_OPTION_COUNT
+and the options itself are in GIT_PUSH_OPTION_{0,1,...}.
+
 [[update]]
 update
 ~~
@@ -322,6 +325,9 @@ a sample script `post-receive-email` provided in the 
`contrib/hooks`
 directory in Git distribution, which implements sending commit
 emails.
 
+The number of push options are available in the variable GIT_PUSH_OPTION_COUNT
+and the options itself are in GIT_PUSH_OPTION_{0,1,...}.
+
 [[post-update]]
 post-update
 ~~~
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 15c323a..edbf81e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -550,8 +550,16 @@ static void prepare_push_cert_sha1(struct child_process 
*proc)
}
 }
 
+struct receive_hook_feed_state {
+   struct command *cmd;
+   int skip_broken;
+   struct strbuf buf;
+   const struct string_list *push_options;
+};
+
 typedef int (*feed_fn)(void *, const char **, size_t *);
-static int run_and_fee

Re: bug: depth 1 and recursive update of submodules broke in 2.9.0

2016-07-07 Thread Stefan Beller
See the discussion and bug fix at
http://thread.gmane.org/gmane.comp.version-control.git/297687/focus=297719

This is a known regression in 2.9.0 and the fix is in the master branch already.
(I think Junio will also roll it into 2.9.1)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Questions for "Git User's Survey 2016"

2016-07-07 Thread Stefan Beller
On Thu, Jul 7, 2016 at 8:03 AM, Jakub Narębski  wrote:
> I am thinking about returning to doing the Git User's Survey, and
> I'd like to ask for feedback.

Thanks for doing the survey!

>
> What I'd like to see is how people use Git, which features do they
> use and which tools (maybe there is some great unknown tool there?),
> where they go to find help about Git.
>
> What types of questions should there be in this year survey?
>

Additionally to the categories you listed, I'd like to propose

4. TEMPORAL. We may get feedback if and how much
some pain points were addressed since the last survey.
This can be done by using the very same question we used in
the last questionnaire, and see how much the answers shifted
or by doing more suggestive questions ("Last survey showed,
feature X is not easy to use, how easy is it now? Think of X
in v1.7.5")

AFAICT the last survey was 2012?

As for the content, I think submodules improved a lot
since then, however I do think they could be improved even more.
e.g. one question could be:

What would you think would improve submodules most
if you could only pick one?

* inclusion of submodules into more commands, e.g.
  "git checkout --recurse-submodules"
* support for direct clones of submodules from a host
  (i.e. not going through the .gitmodules file, but trying
  to obtain the submodule from that superprojects
  .git/modules/. This would only work for non
  bare repos)
* more fault tolerance for miss-configured submodules
* more descriptive error messages for miss-configured
  submodules
* ...

This would help to gauge how to best spend developer time
when trying to improve a feature (like submodules)

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-07 Thread Stefan Beller
On Thu, Jul 7, 2016 at 1:37 PM, Junio C Hamano  wrote:
>
>> @@ -207,6 +214,8 @@ static void show_ref(const char *path, const unsigned 
>> char *sha1)
>> "report-status delete-refs side-band-64k quiet");
>>   if (advertise_atomic_push)
>>   strbuf_addstr(&cap, " atomic");
>> + if (advertise_push_options)
>> + strbuf_addstr(&cap, " push-options");
>>   if (prefer_ofs_delta)
>>   strbuf_addstr(&cap, " ofs-delta");
>>   if (push_cert_nonce)
>
> Hmph, was there a good reason to add it in the middle (contrast to
> the previous addition to the "only possible values are..."
> enumeration)?

No, there is no good objective reason. I added it just after the atomic
flag as that is what I implemented.

Is there a reason for a particular order of capabilities? I always considered
it a set of strings, i.e. any order is valid and there is no preference in
which way to put it.

>
>> +static struct string_list *read_push_options()
>
> static struct string_list *read_push_options(void)
>
>> +{
>> + int i;
>> + struct string_list *ret = xmalloc(sizeof(*ret));
>> + string_list_init(ret, 1);
>> +
>> + /* NEEDSWORK: expose the limitations to be configurable. */
>> + int max_options = 32;
>> +
>> + /*
>> +  * NEEDSWORK: expose the limitations to be configurable;
>> +  * Once the limit can be lifted, include a way for payloads
>> +  * larger than one pkt, e.g allow a payload of up to
>> +  * LARGE_PACKET_MAX - 1 only, and reserve the last byte
>> +  * to indicate whether the next pkt continues with this
>> +  * push option.
>> +  */
>> + int max_size = 1024;
>
> Good NEEDSWORK comments; perhaps also hint that the configuration
> must not come from the repository level configuration file (i.e.
> Peff's "scoped configuration" from jk/upload-pack-hook topic)?

Ok, I reviewed that series. It is unclear to me how the attack would
actually look like in that case.

In 20b20a22f8f Jeff writes:
> Because we promise that
> upload-pack is safe to run in an untrusted repository, we
> cannot execute arbitrary code or commands found in the
> repository (neither in hooks/, nor in the config).

I agree on this for all content that can be modified by the user
(e.g. files in the work tree such as .gitmodules), but the .git/config
file cannot be changed remotely. So I wonder how an attack would
look like for a hosting provider or anyone else?
We still rely on a sane system and trust /etc/gitconfig
so we do trust the host/admin but not the user?

>
>> + for (i = 0; i < max_options; i++) {
>> + char *line;
>> + int len;
>> +
>> + line = packet_read_line(0, &len);
>> +
>> + if (!line)
>> + break;
>> +
>> + if (len > max_size)
>> + die("protocol error: server configuration allows push "
>> + "options of size up to %d bytes", max_size);
>> +
>> + len = strcspn(line, "\n");
>> + line[len] = '\0';
>> +
>> + string_list_append(ret, line);
>> + }
>> + if (i == max_options)
>> + die("protocol error: server configuration only allows up "
>> + "to %d push options", max_options);
>
> When not going over ssh://, does the user sees these messages?
>
> More importantly, if we plan to make this configurable and not make
> the limit a hardwired constant of the wire protocol, it may be
> better to advertise push-options capability with the limit, e.g.
> "push-options=32" (or even "push-options=1024/32"), so that the
> client side can count and abort early?

Yeah we may want to start out with a strict format here indicating
the parameters used for evaluating the size.
So what do these numbers mean?

I assume (and hence I should document that,) that the first (1024)
is the maximum number of bytes per push option. The second
number (32) is the number of push options (not the number of pkts,
as one push option may take more than one pkt if the first number is
larger than 65k, see the NEEDSWORK comment.)

Do we really need 2 numbers, or could we just have one number
describing the maximum total size in bytes before the remote rejects
the connection?

>
> I wondered how well the extra flush works with the extra framing
> smart-http does to wrap the wire protocol; as I do not see any
> change to the http side, I'd assume that there is no issue.

That's a dangerous assumption of yours, as I did not test the
https side, yet.

>
>> +
>> + return ret;
>> +}
>> +
>>  static const char *parse_pack_header(struct pack_header *hdr)
>>  {
>>   switch (read_pack_header(0, hdr)) {
>> @@ -1773,6 +1829,9 @@ int cmd_receive_pack(int argc, const char **argv, 
>> const char *prefix)
>>   const char *unpack_status = NULL;
>>   struct string_list *push_options = NULL;
>>
>> +  

Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-07 Thread Stefan Beller
On Thu, Jul 7, 2016 at 1:20 PM, Junio C Hamano  wrote:
>
> What is suboptimal about the structure of the series is that we
> won't bisect down to any of the potential bugs in the above code
> even if we ever see any bug in the future.
> It also does not hint
> where push_options is expected to be read in the code in the
> subsequent patches in the series.  If I were doing this series, I
> would probably have done 2/4 first without plumbing it through
> (i.e. it is sent and accumulated in a string list at the receiver,
> and then cleared and freed without being used), and then added the
> processing (i.e. this step) as the second patch.

But your first patch (2/4) would not yet advertise the capability?
Or advertise and then just ignoring it?

That shadows other bugs that would not properly bisect, I'd imagine?

It is better for documentation purposes in this patch though. It makes
the other patch harder as "it allows transmitting push options, but
in that patch nothing of value is done with them."

So I'll see if I can reorder easily.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] add a test for push options

2016-07-07 Thread Stefan Beller
On Thu, Jul 7, 2016 at 1:01 PM, Junio C Hamano  wrote:
> On Thu, Jul 7, 2016 at 12:51 PM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
>>> The functions `mk_repo_pair` as well as `test_refs` are borrowed from
>>> t5543-atomic-push, with additional hooks installed.
>>>
>>> Signed-off-by: Stefan Beller 
>>> ---
>>>  t/t5544-push-options.sh | 101 
>>> 
>>>  1 file changed, 101 insertions(+)
>>>  create mode 100755 t/t5544-push-options.sh
>>
>> FYI:
>>
>> Applying: add a test for push options
>> Test number t5544 already taken
>>
>
> I'll just move it over to t5545; no need to resend.

I'd resend because of the the first three patches anyway,
so I can include a renamed version of tests, too.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-07 Thread Stefan Beller
On Thu, Jul 7, 2016 at 2:56 PM, Jeff King  wrote:
> On Thu, Jul 07, 2016 at 02:41:37PM -0700, Stefan Beller wrote:
>
>> >> + /* NEEDSWORK: expose the limitations to be configurable. */
>> >> + int max_options = 32;
>> >> +
>> >> + /*
>> >> +  * NEEDSWORK: expose the limitations to be configurable;
>> >> +  * Once the limit can be lifted, include a way for payloads
>> >> +  * larger than one pkt, e.g allow a payload of up to
>> >> +  * LARGE_PACKET_MAX - 1 only, and reserve the last byte
>> >> +  * to indicate whether the next pkt continues with this
>> >> +  * push option.
>> >> +  */
>> >> + int max_size = 1024;
>> >
>> > Good NEEDSWORK comments; perhaps also hint that the configuration
>> > must not come from the repository level configuration file (i.e.
>> > Peff's "scoped configuration" from jk/upload-pack-hook topic)?
>>
>> Ok, I reviewed that series. It is unclear to me how the attack would
>> actually look like in that case.
>>
>> In 20b20a22f8f Jeff writes:
>> > Because we promise that
>> > upload-pack is safe to run in an untrusted repository, we
>> > cannot execute arbitrary code or commands found in the
>> > repository (neither in hooks/, nor in the config).
>>
>> I agree on this for all content that can be modified by the user
>> (e.g. files in the work tree such as .gitmodules), but the .git/config
>> file cannot be changed remotely. So I wonder how an attack would
>> look like for a hosting provider or anyone else?
>> We still rely on a sane system and trust /etc/gitconfig
>> so we do trust the host/admin but not the user?
>
> The problem is for hosting sites which serve repositories via git-daemon
> from untrusted users who have real shell accounts (e.g., you set up
> git-daemon to run as the "daemon" user serving repositories out of
> people's home directories; you don't want users to escalate their shell
> access into running arbitrary code as "daemon").

I think you would want to lock down the
hosting site as much as possible and not put untrusted users home
directories on there? So it is hard for me to imagine you'd go for such a setup
in practice.

>
> But I don't think that case applies here. That is about running
> upload-pack on an untrusted repository, but your changes here are part
> of receive-pack. In such a scenario, users should be pushing as
> themselves via ssh. And if they are not (e.g., the admin set up
> push-over-smart-http centrally), they are already screwed, as a
> malicious user could just set up a pre-receive hook.

I hear that as: "The pre-receive hook itself can do much more
damage than an oversized push option payload".

OK.

>
> IOW, we promise only that upload-pack is safe to run an untrusted repo,
> but not receive-pack.
>
> -Peff

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-08 Thread Stefan Beller
On Fri, Jul 8, 2016 at 11:39 AM, Junio C Hamano  wrote:
> Jonathan Nieder  writes:
>
>> Hi,
>>
>> Junio C Hamano wrote:
>>> Stefan Beller  writes:
>>
>>>>> More importantly, if we plan to make this configurable and not make
>>>>> the limit a hardwired constant of the wire protocol, it may be
>>>>> better to advertise push-options capability with the limit, e.g.
>>>>> "push-options=32" (or even "push-options=1024/32"), so that the
>>>>> client side can count and abort early?
>>
>> Sorry to butt into the conversation late, but: I am not yet convinced.
>>
>> Is the idea that if the push options were very large, this would save
>> the client from the cost of sending them?
>
> Not really.  I have no strong opinion on the benefit of limiting
> number/size.  Stefan limited the number/size at the receiving end
> and made receiving end die with its message.

Jeff claimed we'd need some sort of DoS protection for this feature,
so I considered just die-ing enough for an initial implementation.

> I was merely trying to
> tweak the arrangement so that the sending end can complain with its
> own message, possibly in its own language, especially because it was
> unclear to me if the die() message on the receiving end would always
> go back to the sending end correctly.

I'm currently implementing Jonathans suggestion as it seems to be a reasonable
trade off (client hasn't sent a lot of data when it is decided it
doesn't go through,
the server can complain with a reasonable error message, only downside: no
i18n localisation support on the client side as the server will
currently report the
error in English).

That method will make heavy use of rp_error that uses the side band for
communicating the actual error message.

>
>> But this comes with a
>> downside: the server doesn't get to send an error message about
>> where
>> the maximum number of push options can come from (e.g., with a link to
>> a page where the limit can be adjusted, or with an explanation of when
>> clients tend to run into this problem and what they should do
>> instead).
>
> Hmm, interesting point.  That would be better told by the receiving
> end, as the way to configure it (if offered) would be different from
> installation to installation.
>
>> So I'd like to propose an alternative. What if the client tells the
>> server the number of push options early on (and possibly also a cap on
>> the length of those push options)?  That way, the client doesn't have
>> to waste bit sending the push options but the server gets an
>> opportunity to send a helpful error message on sideband 3.
>>
>>   server> HEAD\0push-options ...
>>   client> ... commands ...
>>   client> push-options 2
>>   client> my-first-option
>>   client> my-second-option
>

Another (slightly offtopic) observation:
If in the future we'll need to transmit push options >64k, instead of
splitting the push option to multiple packets, we could invent "large
packets". The current upper bound for packets is artificailly low, such
that the server is able to interleave sideband information with the actual
data in a fetch and have the client display the progress in a timely manner.
When pushing to the server we'd not need to have progress information
(the server doesn't care, and the client knows the size it is pushing).

As of today a packet consists of 4 bytes (hex characters) to indicate
the length and then the payload follows. So instead we could transmit
"v" (that is not a hex character) followed by a variable length integer for
the length and then the payload which has no upper bound.

In the release notes for 2.3 you wrote:
> * "git push" and "git fetch" did not communicate an overlong refname
>   correctly.  Now it uses 64kB sideband to accommodate longer ones.

That could also make use of these "large packets" instead.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-08 Thread Stefan Beller
On Fri, Jul 8, 2016 at 2:46 PM, Jeff King  wrote:
> On Fri, Jul 08, 2016 at 11:57:20AM -0700, Stefan Beller wrote:
>
>> >> Sorry to butt into the conversation late, but: I am not yet convinced.
>> >>
>> >> Is the idea that if the push options were very large, this would save
>> >> the client from the cost of sending them?
>> >
>> > Not really.  I have no strong opinion on the benefit of limiting
>> > number/size.  Stefan limited the number/size at the receiving end
>> > and made receiving end die with its message.
>>
>> Jeff claimed we'd need some sort of DoS protection for this feature,
>> so I considered just die-ing enough for an initial implementation.
>
> I do not think we need to worry too much about niceties for these
> limits. The point is to protect servers from malicious nonsense, like
> somebody sending gigabytes of push options, or trying to overflow a
> buffer in a hook with a large value.

Agreed. This would speak for keeping the implementation as is.

>If people are seeing these in
> routine use, then the limits are set too low, and this should happen
> roughly as often as a BUG assertion, and IMHO should be treated roughly
> the same: don't bother with translation, and don't worry about
> optimizing wasted bandwidth for this case. It won't happen enough to
> matter.

Well the wasted band width is part of the server protection, no?
This would favor the idea Jonathan came up with:

server: I advertise push options
client: ok I want to use push options
client: I'll send you 1000 push options with upper bound of 1000M
server: It's a bit too much, eh?
* server quits

So this case only occurs for the (malicious?) corner case, where I
do not bother a translation.

But having the size announcement not in
the capability advertisement, but in the actual push options phase makes
sense to me as we do not want to clutter the capabilities with data that can
come later. We would only waste a little bit of band width, (the
initial ls-remote
and command list of the client).


Speaking of this, I can craft a malicious client that sends the
following command list



refs/heads/loong-ref


refs/heads/loong-ref


refs/heads/loong-ref


refs/heads/loong-ref


refs/heads/loong-ref



IIUC in the receive-pack code we would queue that up and the error checking
(two times null sha1? update of the same ref more than once?), is
done just after we send out the flush packet, i.e. when all commands
are received.

This would also result in sending gigabytes of junk as well as a
memory issue on the server
side?

The new push options design is actually neat in the way that the
client exactly says what it wants
and the server can reject early, but not cluttering the capability
advertisement.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-08 Thread Stefan Beller
On Fri, Jul 8, 2016 at 3:21 PM, Jeff King  wrote:
> On Fri, Jul 08, 2016 at 03:17:13PM -0700, Stefan Beller wrote:
>
>> >If people are seeing these in
>> > routine use, then the limits are set too low, and this should happen
>> > roughly as often as a BUG assertion, and IMHO should be treated roughly
>> > the same: don't bother with translation, and don't worry about
>> > optimizing wasted bandwidth for this case. It won't happen enough to
>> > matter.
>>
>> Well the wasted band width is part of the server protection, no?
>
> Not if you stop receiving as soon as you hit the limits. Then of course
> they can send up to the limit each time, but that is not a DoS. That is
> things working as advertised.
>
>> This would favor the idea Jonathan came up with:
>>
>> server: I advertise push options
>> client: ok I want to use push options
>> client: I'll send you 1000 push options with upper bound of 1000M
>> server: It's a bit too much, eh?
>> * server quits
>>
>> So this case only occurs for the (malicious?) corner case, where I
>> do not bother a translation.
>
> In the malicious case, the client says "I'll send you 10 push option
> with an upper bound of 1024K", and then sends gigabytes anyway. Either
> way the server has to react to what is sent, not what is promised.

Well that is what the initial patch did via:

+   for (i = 0; i < max_options; i++) {
+   char *line;
+   int len;
+
+   line = packet_read_line(0, &len);
+
+   if (!line)
+   break;
+
+   if (len > max_size)
+   die("protocol error: server configuration allows push "
+   "options of size up to %d bytes", max_size);
+
+   len = strcspn(line, "\n");
+   line[len] = '\0';
+
+   string_list_append(ret, line);
+   }
+   if (i == max_options)
+   die("protocol error: server configuration only allows up "
+   "to %d push options", max_options);

I assume the die is an effective way to "stop receiving".

Thinking further about what you said, I think the initial selections of
max_size and max_options is sufficient and we only see those bounds in
the malicious case.

This discussion rather makes me wonder if we want to stick to the initial design
as it was easy and not overcomplicating things as we assume the abort case
doesn't occur often.


>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-08 Thread Stefan Beller
On Fri, Jul 8, 2016 at 3:35 PM, Jeff King  wrote:
>
> Yes. I haven't been following the intermediate discussion and patches,
> but I don't see anything wrong with the general design above. I think
> you do need to use rp_error() to get the die message to the client for
> non-ssh cases, though (that is a problem with other protocol-error
> messages, too; I wonder if we should install a custom die handler, or
> convert them all to some kind of rp_die()).

Some of the rp_error messages do not want to die(), but most seem to
be ok when the rp_error would die.

I'll look into that.

>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-08 Thread Stefan Beller
On Fri, Jul 8, 2016 at 3:46 PM, Jeff King  wrote:
> Sorry, I meant converting die() into:
>
>   rp_error(...);
>   die(...);
>
> possibly via an rp_die() helper.  The existing rp_error() cases would
> remain untouched.

Oh I see!
That makes a lot of sense.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] push: accept push options

2016-07-08 Thread Stefan Beller
On Thu, Jul 7, 2016 at 1:52 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> +-L::
>> +--push-option::
>> + Transmit the given string to the server, which passes them to
>> + the pre-receive as well as the post-receive hook. Only C strings
>> + containing no new lines are allowed.
>
> This is to affect what happens at the remote end, so I would have
> understood "-R".  I also would have understood "-P" as a short-hand
> for "--push-option".  What is the justification of "-L"?

It was made up. The actual code took -o for option. Changed that.

>
> What does "C strings" mean?  Did you mean to say "A sequence of
> bytes excluding NUL is passed verbatim"?



>
> I do not think I saw anything in the code I reviewed so far that
> requires "no LF" limitation.

It is enforced server side, but an additional
client side enforcement may be better indeed.

The rationale for no enforcement on the client side is an easier way
forward if we allow it on the server as the client would "just work"
and it's up to the server to complain.

That makes me wonder if we want to document that, i.e.:

-o::
--push-option::
Transmit the given argument to the server, which passes them to
the pre-receive as well as the post-receive hook. As it is up to the
server to react on these push options, it may reject push options
that contain new line or NUL characters. .


>
> ... Ahh, OK, you want to make sure that push-options are
> one-per-line in the push certificate.  While I do not think it is
> absolutely necessary, starting with a possibly tighter than
> necessary limitation is much better than starting loose and having
> to tighten it later.

This is not what I had in mind, but rather the pain of dealing with multi line
environment variables.

>>   transport_get(remote, NULL);
>> -
>> + if (flags & TRANSPORT_PUSH_OPTIONS)
>> + transport->push_options = push_options;
>
> The result would be easier to read without the removal of a blank
> that separates decl/defn and stmt here.

ok

>> + OPT_STRING_LIST('o', "push-option", &push_options, 
>> N_("server-specific"), N_("option to transmit")),
>
> Here it seems to expect "-o".  If we really want a short option,
> "-o" would probably be OK, as I do not think "git push" wants to
> have "send the output to this file" option.
>

Ok, will update the documentation.


>
> by adding this between the two lines in the pre-context of this
> hunk, i.e.
>
> if (push_cert_nonce[0])
> strbuf_addf(&cert, "nonce %s\n", push_cert_nonce);
> if (args->push_options)
> for_each_string_list_item(item, args->push_options)
> strbuf_addf(&cert, "push-option %s\n", item->string);
> strbuf_addstr(&cert, "\n");
>

makes sense.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-08 Thread Stefan Beller
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller 
---
 Documentation/config.txt  |  9 ++-
 Documentation/technical/pack-protocol.txt | 10 ++--
 Documentation/technical/protocol-capabilities.txt |  9 +++
 builtin/receive-pack.c| 69 +++
 4 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e208af1..25b5db1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2410,8 +2410,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
By default, git-receive-pack will advertise the atomic push
-   capability to its clients. If you don't want to this capability
-   to be advertised, set this variable to false.
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
+
+receive.advertisePushOptions::
+   By default, git-receive-pack will advertise the push options
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
 
 receive.autogc::
By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way 
as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 --
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the 
obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 
   update-request=  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..4c28d3a 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,15 @@ atomic pushes. If the pushing client requests this 
capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+
+
+If the server sends the 'push-options' capability it is able to accept
+push options after the update commands have been sent, but before the
+packfile is streamed. If the pushing client requests this capability,
+the server will pass the options to the pre- and post- receive hooks
+that process this push request.
+
 allow-tip-sha1-in-want
 --
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 236417e..754db6e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options = 1;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char 
*value, void *cb)
return 0;
}
 
+   if (strcmp(var, "receive.advertisepushoptions") == 0) {
+   advertise_push_options = git_config_bool(var, value);

[PATCHv4 0/4] Push options

2016-07-08 Thread Stefan Beller
aster \
  --push-option draft \
  --push-option topic=example \
  --push-option cc=jon@example.org
  
  (with a short form that is even easier to type,
   but this is is more intuitive already)

This is a patch series to Git core, which is developed at the same time
as a change is proposed to JGit by Dan Wang, see [2].

This code is also available at [3].

Thanks,
Stefan

[1] Not all Gerrit '%' options are documented, so here is a link to source code 
instead :(
https://gerrit.googlesource.com/gerrit/+/refs/heads/master/gerrit-server/src/main/java/com/google/gerrit/server/git/ReceiveCommits.java#1141

[2] https://git.eclipse.org/r/#/c/74570/ 
 
[3] https://github.com/stefanbeller/git/tree/pushoptions

Stefan Beller (4):
  push options: {pre,post}-receive hook learns about push options
  receive-pack: implement advertising and receiving push options
  push: accept push options
  add a test for push options

 Documentation/config.txt  |  7 +-
 Documentation/git-push.txt|  8 ++-
 Documentation/githooks.txt|  4 ++
 Documentation/technical/pack-protocol.txt | 10 +--
 Documentation/technical/protocol-capabilities.txt |  8 +++
 builtin/push.c| 16 -
 builtin/receive-pack.c| 85 +++
 send-pack.c   | 29 
 send-pack.h   |  3 +
 t/t5544-push-options.sh   | 85 +++
 transport.c   |  2 +
 transport.h   |  7 ++
 12 files changed, 242 insertions(+), 22 deletions(-)
 create mode 100755 t/t5544-push-options.sh

-- 
2.9.0.141.gdd65b60

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] add a test for push options

2016-07-08 Thread Stefan Beller
The functions `mk_repo_pair` as well as `test_refs` are borrowed from
t5543-atomic-push, with additional hooks installed.

Signed-off-by: Stefan Beller 
---
 t/t5545-push-options.sh | 101 
 1 file changed, 101 insertions(+)
 create mode 100755 t/t5545-push-options.sh

diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
new file mode 100755
index 000..8dd3c8e
--- /dev/null
+++ b/t/t5545-push-options.sh
@@ -0,0 +1,101 @@
+#!/bin/sh
+
+test_description='pushing to a repository using push options'
+
+. ./test-lib.sh
+
+mk_repo_pair () {
+   rm -rf workbench upstream &&
+   test_create_repo upstream &&
+   test_create_repo workbench &&
+   (
+   cd upstream &&
+   git config receive.denyCurrentBranch warn &&
+   mkdir -p .git/hooks &&
+   cat >.git/hooks/pre-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/pre-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/pre-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/pre-receive
+
+   cat >.git/hooks/post-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/post-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/post-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/post-receive
+   ) &&
+   (
+   cd workbench &&
+   git remote add up ../upstream
+   )
+}
+
+# Compare the ref ($1) in upstream with a ref value from workbench ($2)
+# i.e. test_refs second HEAD@{2}
+test_refs () {
+   test $# = 2 &&
+   git -C upstream rev-parse --verify "$1" >expect &&
+   git -C workbench rev-parse --verify "$2" >actual &&
+   test_cmp expect actual
+}
+
+test_expect_success 'one push option works for a single branch' '
+   mk_repo_pair &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf up master
+   ) &&
+   test_refs master master &&
+   echo "asdf" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'push option denied by remote' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions false &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   test_must_fail git push --push-option=asdf up master
+   ) &&
+   test_refs master HEAD@{1}
+'
+
+test_expect_success 'two push options work' '
+   mk_repo_pair &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf --push-option="more structured 
text" up master
+   ) &&
+   test_refs master master &&
+   printf "asdf\nmore structured text\n" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_done
-- 
2.9.0.247.g176c4f7

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-08 Thread Stefan Beller
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable

+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
==
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
==
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

We limit the push options for now
* to not exceed an arbitrary count, and
* to not exceed an arbitrary size.

This serves two purposes:
* DoS protection (i.e. one connection can add no more than 32kB
  now)
* We need to figure out how to handle large (>64kB). Jeff wrote:
  > Yes, but people are also happy when they can use a flexible and
  > standardized tool to do a thing. I'd be more frustrated when I found out
  > that Git's data-pushing protocol has arbitrary limitations (like, say, I
  > can't push a data item larger than a single 64K pkt-line), which would
  > easily just work with something like HTTP POSTs.
  So to keep a way open in the future to deal with large pay loads,
  the size is restricted for now.

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller 
---
 Documentation/githooks.txt  | 18 ++
 builtin/receive-pack.c  | 49 +++--
 templates/hooks--pre-receive.sample | 24 ++
 3 files changed, 78 insertions(+), 13 deletions(-)
 create mode 100644 templates/hooks--pre-receive.sample

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index d82e912..9565dc3 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -247,6 +247,15 @@ Both standard output and standard error output are 
forwarded to
 'git send-pack' on the other end, so you can simply `echo` messages
 for the user.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client selects
+to use push options, but doesn't transmit any, the count variable
+will be set to zero, `GIT_PUSH_OPTION_COUNT=0`.
+
 [[update]]
 update
 ~~
@@ -322,6 +331,15 @@ a sample script `post-receive-email` provided in the 
`contrib/hooks`
 directory in Git distribution, which implements sending commit
 emails.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client 

[PATCH 3/4] push: accept push options

2016-07-08 Thread Stefan Beller
This implements everything that is required on the client side to make use
of push options from the porcelain push command.

Signed-off-by: Stefan Beller 
---
 Documentation/git-push.txt |  8 +++-
 builtin/push.c | 21 ++---
 send-pack.c| 27 +++
 send-pack.h|  3 +++
 transport.c|  1 +
 transport.h|  7 +++
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 19f46b6..e960258 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git push' [--all | --mirror | --tags] [--follow-tags] [--atomic] [-n | 
--dry-run] [--receive-pack=]
   [--repo=] [-f | --force] [-d | --delete] [--prune] [-v | 
--verbose]
-  [-u | --set-upstream]
+  [-u | --set-upstream] [--push-option=]
   [--[no-]signed|--sign=(true|false|if-asked)]
   [--force-with-lease[=[:]]]
   [--no-verify] [ [...]]
@@ -156,6 +156,12 @@ already exists on the remote side.
Either all refs are updated, or on error, no refs are updated.
If the server does not support atomic pushes the push will fail.
 
+-o::
+--push-option::
+   Transmit the given string to the server, which passes them to
+   the pre-receive as well as the post-receive hook. The given string
+   must not contain a NUL or LF character.
+
 --receive-pack=::
 --exec=::
Path to the 'git-receive-pack' program on the remote
diff --git a/builtin/push.c b/builtin/push.c
index 4e9e4db..3bb9d6b 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -353,7 +353,8 @@ static int push_with_options(struct transport *transport, 
int flags)
return 1;
 }
 
-static int do_push(const char *repo, int flags)
+static int do_push(const char *repo, int flags,
+  const struct string_list *push_options)
 {
int i, errs;
struct remote *remote = pushremote_get(repo);
@@ -376,6 +377,9 @@ static int do_push(const char *repo, int flags)
if (remote->mirror)
flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
 
+   if (push_options->nr)
+   flags |= TRANSPORT_PUSH_OPTIONS;
+
if ((flags & TRANSPORT_PUSH_ALL) && refspec) {
if (!strcmp(*refspec, "refs/tags/*"))
return error(_("--all and --tags are incompatible"));
@@ -406,13 +410,16 @@ static int do_push(const char *repo, int flags)
for (i = 0; i < url_nr; i++) {
struct transport *transport =
transport_get(remote, url[i]);
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
} else {
struct transport *transport =
transport_get(remote, NULL);
-
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
@@ -500,6 +507,9 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
int push_cert = -1;
int rc;
const char *repo = NULL;/* default repository */
+   static struct string_list push_options = STRING_LIST_INIT_DUP;
+   static struct string_list_item *item;
+
struct option options[] = {
OPT__VERBOSITY(&verbosity),
OPT_STRING( 0 , "repo", &repo, N_("repository"), 
N_("repository")),
@@ -533,6 +543,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
  0, "signed", &push_cert, "yes|no|if-asked", N_("GPG sign the 
push"),
  PARSE_OPT_OPTARG, option_parse_push_signed },
OPT_BIT(0, "atomic", &flags, N_("request atomic transaction on 
remote side"), TRANSPORT_PUSH_ATOMIC),
+   OPT_STRING_LIST('o', "push-option", &push_options, 
N_("server-specific"), N_("option to transmit")),
OPT_SET_INT('4', "ipv4", &family, N_("use IPv4 addresses only"),
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", &family, N_("use IPv6 addresses only"),
@@ -563,7 +574,11 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
set_refspecs(argv + 1, argc - 1, repo);
}
 
-   rc = do_push(repo, flags);
+   for_each_string_list_item(item, &p

Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-10 Thread Stefan Beller
On Sun, Jul 10, 2016 at 10:06 AM, Shawn Pearce  wrote:
> On Fri, Jul 8, 2016 at 5:31 PM, Stefan Beller  wrote:
>> +
>> +   /* NEEDSWORK: expose the limitations to be configurable. */
>> +   int max_options = 32;
>> +
>> +   /*
>> +* NEEDSWORK: expose the limitations to be configurable;
>> +* Once the limit can be lifted, include a way for payloads
>> +* larger than one pkt, e.g allow a payload of up to
>> +* LARGE_PACKET_MAX - 1 only, and reserve the last byte
>> +* to indicate whether the next pkt continues with this
>> +* push option.
>> +*/
>> +   int max_size = 1024;
>
> Instead of this, what about a new config variable
> receive.maxCommandBytes[1] that places a limit on the number of bytes
> of pkt-line data the client can supply in both the command list ("old
> new ref"), push signature framing, and option list?

Including the whole command list is pretty smart as it actually tackles the
DoS problem as a whole. We shortly discussed having just one upper bound
limit for the push options alone, but we were distracted by the discussion
on whether to advertise this number or just reject it on the server side
after it filled up so much data.

The design here with two bounds was used to not care about the oversized
push options for now. (I mean we can still just reject larger push
options even when
having a receive.maxCommandBytes setting.)

>
> Memory demands for the server are proportional to the data sent. A
> simple byte limit lets the user make the decision about how this gets
> used. Longer ref names or option values means fewer refs or options
> can be sent. Shorter ref names or option values means more values or
> options can be sent.
>
> I studied a lot of repositories[2] and most use ref names under 200
> bytes in length. A 3 MiB default for receive.maxCommandBytes gives
> users something like 11,115 references in a single git push invocation
> if they used all 200 bytes in every name. Most users don't have ref
> names this long. Unlike a cap on each ref, it allows users to use the
> full 65449 bytes in a reference name available in pkt-line, but you
> can only send 48 such references. Likewise for options. :)

In an earlier discussion Jeff said roughly "either make it work well,
or don't make it work at all, i.e. why are git push options better
than a `git push .. && curl /REST-API` thing?"

And by having this design we could punt on the corner cases with
transmitting arbitrary large push options/binaries for now and claim
it's another next step that needs to be done when adding the config
option for it. By having a single receive.maxCommandBytes setting
we would sweep that problem under the rug and people could wonder
why it fails with the large push option.

As said in an earlier email as a side note, we could think about introducing
a v2 pkt line format which starts with a variable int to indicate the packet
size, such that the payload is not bound up to 64k.

I think 3MiB is a bit much for everyday use though and not enough for
corner cases?

>
>
> [1] I may propose this to JGit.
> [2] More than 3M, but maybe Peff has access to more.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


verbose fetches not overly verbose

2016-07-11 Thread Stefan Beller
$ git clone https://github.com/gitster/git tmpgit2 -v -v -v
Cloning into 'tmpgit2'...
Server supports multi_ack_detailed
Server supports no-done
Server supports side-band-64k
Server supports ofs-delta
Server version is git/2:2.6.5~dgit-checksum-v5-1484-gfc2423c
want 867ad08a2610526edb5723804723d371136fc643 (refs/heads/ab/hooks)
want 4a6ada32cb64b0eba8b6f995e4230f0c722fd580
(refs/heads/ad/bisect-cleanup)
want 06e6a745064c4f2f827177f6d92f4b9adb018200
(refs/heads/ad/bisect-terms)
...
[a lot more wants, cut]
want 47e8b7c56a5504d463ee624f6ffeeef1b6d007c1 (refs/tags/v2.9.1)
done
POST git-upload-pack (gzip 63203 to 31673 bytes)
remote: Counting objects: 201614, done.
remote: Total 201614 (delta 0), reused 0 (delta 0), pack-reused 201614
Receiving objects: 100% (201614/201614), 88.76 MiB | 32.86 MiB/s, done.
Resolving deltas: 100% (147442/147442), done.
Checking connectivity... done.

Now the same for fetch:

$ cd tmpgit2
$ git fetch -v -v -v
From https://github.com/gitster/git
 = [up to date]  master -> origin/master
 = [up to date]  ab/hooks   -> origin/ab/hooks
 = [up to date]  ad/bisect-cleanup -> origin/ad/bisect-cleanup
 = [up to date]  ad/bisect-terms -> origin/ad/bisect-terms
...
 [a lot more wants, cut]

It is missing the want lines completely. I guess that is not that intentional,
but rather we forgot to add the verbosity option correctly to the
fetch in the transport layer.
Looking at the transport code, I am confused on both how the wants are
printed in the clone
case as well as how they are not printed in the fetch case.

Any hints?

(I could use GIT_TRACE_PACKET as a work around, but it's just a hack.)

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Submodule's .git file contains absolute path when created using 'git clone --recursive'

2016-07-12 Thread Stefan Beller
On Tue, Jul 12, 2016 at 4:22 AM, Ricardo Sánchez-Sáez
 wrote:
> Stefan Beller  google.com> writes:
>
> Hi,
>
> sorry to awake an old thread. Has this been fixed? In which git version?
> It's hitting me on git version  2.7.4 (Apple Git-66) (default git client on
>  OS X 10.11.5 (15F34)).
>
> I think all submodule .git files should contain relative paths. Otherwise,
>  duplicating or moving the cloned  repository folder breaks the submodules.
>
> Best,
> Ricardo
>

The commit I referenced earlier:

$ git tag --contains f8eaa0ba98b3bd9cb9035eba184a2d9806d30b27
v2.8.3
v2.8.4
v2.9.0
v2.9.0-rc0
v2.9.0-rc1
v2.9.0-rc2
v2.9.1
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jul 2016, #05; Wed, 13)

2016-07-13 Thread Stefan Beller
> * sb/push-options (2016-07-12) 5 commits
>  - add a test for push options
>  - push: accept push options
>  - SQUASH???

Squash? I do not find a squashable commit in what you pushed,
do you intend to squash the first 2 patches instead?

>  - receive-pack: implement advertising and receiving push options
>  - push options: {pre,post}-receive hook learns about push options
>
>  "git push" learned to accept and pass extra options to the
>  receiving end so that hooks can read and react to them.
>
>  Discussion continues, expecting a further reroll.
>  ($gmane/299156)

Yeah there were some late comments, so I did not reroll right away.
I think Shawns proposal to have a receive.maxCommandBytes is a
good way for an overall upper bound, but how does it stop us from
going forward with this series? It seems like a good series to implement
on top of this?

(We also have no code for limiting the number of refs pushed, so I
do not quite understand why this DoS paranoia comes up with this
series all of a sudden.)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jul 2016, #05; Wed, 13)

2016-07-13 Thread Stefan Beller
On Wed, Jul 13, 2016 at 10:32 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>>> * sb/push-options (2016-07-12) 5 commits
>>>  - add a test for push options
>>>  - push: accept push options
>>>  - SQUASH???
>>
>> Squash? I do not find a squashable commit in what you pushed,
>> do you intend to squash the first 2 patches instead?

Oh I pulled a few minutes before you sent this email, and forgot
that you likely have pushed again when sending this email. :/
Thanks!

>> Yeah there were some late comments, so I did not reroll right away.
>> I think Shawns proposal to have a receive.maxCommandBytes is a
>> good way for an overall upper bound, but how does it stop us from
>> going forward with this series?
>
> If we were to do maxcommandbytes, then max_options would become
> irrelevant, no?

Maybe?

I do not know what kind of safety measures we want in place here, and
if we want to go for overlapping things?

Currently there are none at all in your upstream code, although you cannot
push arbitrary large things to either Shawns or Peffs $Dayjob servers, so
I wonder if we want to either agree on one format or on many overlapping
things, as some different hosts may perceive different things as DoS threats,
so they can fine tune as they want?

In the Gerrit world, you have a ref per code review, such that it is easy to
have 50k refs or more, similar to the repo Jeff pointed out [1], that has 40k
tags (and getting a new tag every 2 hours apparently).

So I could understand if different services care about the different loads
(refs vs push options) differently (one would want to allow unlimited refs
pushing for mirroring such repos as pointed out above, while another
one might care about the total load of the server for a single rogue user)

Thanks,
Stefan

[1] https://github.com/JetBrains/intellij-community
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Server-side preventing some files from being overwritten

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 8:31 AM, Thorsten Glaser  wrote:
> Hi *,
>
> is there a way, for example with some sort of pre-receive hook,
> to prevent some files from being overwritten by a push?

pre-receive hooks are a thing!

 pre-receive
   This hook is invoked by git-receive-pack on the remote
repository, which happens when a git push is done on a local
repository. Just before starting to update refs on the remote
repository,
   the pre-receive hook is invoked. Its exit status determines the
success or failure of the update.

   This hook executes once for the receive operation. It takes no
arguments, but for each ref to be updated it receives on standard
input a line of the format:

SP  SP  LF

   where  is the old object name stored in the ref,
 is the new object name to be stored in the ref and
 is the full name of the ref. When creating a new ref,
is 40 0.

   If the hook exits with non-zero status, none of the refs will
be updated. If the hook exits with zero, updating of individual refs
can still be prevented by the update hook.

   Both standard output and standard error output are forwarded to
git send-pack on the other end, so you can simply echo messages for
the user.


As you have access to the old and new value, just check them out to
/tmp and inspect the files?
(That is slower than optimal I'd assume, so to get it faster you could
go roughly like

git ls-tree -r  | grep   >old_interest
git ls-tree -r  | grep  >new_interest
if diff old_interest new_interest
then
echo "There is a difference in  
exit 1
fi



>
> Use case:
>
> In some project, we use Flyway to manage the database schema,
> and Jenkins to automatically build master’s HEAD after each
> push (and install it, thus activating the schema files almost
> immediately). Now, I wish to prevent coworkers from changing
> anything in the SQL subdirectory that has ever been pushed,
> forcing them to push new SQL files with ALTER statements instead.
> (Yes, I am aware this will likely make me even less liked. No,
> this is not an issue.)
>
> As for the comparison, only the changes between the previous
> HEAD of master and the new HEAD of master after the push would
> have been accepted need to be taken into account; any intermediate
> commits, merges, etc. are no problem (because Jenkins does not
> build them, and because, once a push fails, the developer will
> have to add a commit reverting their change and moving it to
> another file on top, I’m no friend of rewriting).
>
> File matching would be “any files under a certain subdirectory”,
> nothing fancier than that.
>
> I’ve tried a web search (with two different search engines) for
> “git prevent pushed files from modification”, but this seems to
> only show people who want to ignore local changes or somesuch…
>
> I’ve asked in IRC, but with no answer for hours I thought that
> maybe this was the better place to ask for it.
>
> Thanks in advance,
> //mirabilos
> --
> tarent solutions GmbH
> Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
> Tel: +49 228 54881-393 • Fax: +49 228 54881-235
> HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
> Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller 
---
 Documentation/config.txt  |  9 +++-
 Documentation/technical/pack-protocol.txt | 10 ++--
 Documentation/technical/protocol-capabilities.txt |  9 
 builtin/receive-pack.c| 59 +++
 4 files changed, 81 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e208af1..25b5db1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2410,8 +2410,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
By default, git-receive-pack will advertise the atomic push
-   capability to its clients. If you don't want to this capability
-   to be advertised, set this variable to false.
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
+
+receive.advertisePushOptions::
+   By default, git-receive-pack will advertise the push options
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
 
 receive.autogc::
By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way 
as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 --
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the 
obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 
   update-request=  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..4c28d3a 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,15 @@ atomic pushes. If the pushing client requests this 
capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+
+
+If the server sends the 'push-options' capability it is able to accept
+push options after the update commands have been sent, but before the
+packfile is streamed. If the pushing client requests this capability,
+the server will pass the options to the pre- and post- receive hooks
+that process this push request.
+
 allow-tip-sha1-in-want
 --
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 236417e..917ac18 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char 
*value, void *cb)
return 0;
}
 
+   if (strcmp(var, "receive.advertisepushoptions") == 0) {
+   advertise_push_options = git_config_bool(var, value);
+

[PATCH 3/4] push: accept push options

2016-07-14 Thread Stefan Beller
This implements everything that is required on the client side to make use
of push options from the porcelain push command.

Signed-off-by: Stefan Beller 
---
 Documentation/git-push.txt |  8 +++-
 builtin/push.c | 21 ++---
 send-pack.c| 27 +++
 send-pack.h|  3 +++
 transport.c|  1 +
 transport.h|  7 +++
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 19f46b6..e960258 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git push' [--all | --mirror | --tags] [--follow-tags] [--atomic] [-n | 
--dry-run] [--receive-pack=]
   [--repo=] [-f | --force] [-d | --delete] [--prune] [-v | 
--verbose]
-  [-u | --set-upstream]
+  [-u | --set-upstream] [--push-option=]
   [--[no-]signed|--sign=(true|false|if-asked)]
   [--force-with-lease[=[:]]]
   [--no-verify] [ [...]]
@@ -156,6 +156,12 @@ already exists on the remote side.
Either all refs are updated, or on error, no refs are updated.
If the server does not support atomic pushes the push will fail.
 
+-o::
+--push-option::
+   Transmit the given string to the server, which passes them to
+   the pre-receive as well as the post-receive hook. The given string
+   must not contain a NUL or LF character.
+
 --receive-pack=::
 --exec=::
Path to the 'git-receive-pack' program on the remote
diff --git a/builtin/push.c b/builtin/push.c
index 4e9e4db..3bb9d6b 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -353,7 +353,8 @@ static int push_with_options(struct transport *transport, 
int flags)
return 1;
 }
 
-static int do_push(const char *repo, int flags)
+static int do_push(const char *repo, int flags,
+  const struct string_list *push_options)
 {
int i, errs;
struct remote *remote = pushremote_get(repo);
@@ -376,6 +377,9 @@ static int do_push(const char *repo, int flags)
if (remote->mirror)
flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
 
+   if (push_options->nr)
+   flags |= TRANSPORT_PUSH_OPTIONS;
+
if ((flags & TRANSPORT_PUSH_ALL) && refspec) {
if (!strcmp(*refspec, "refs/tags/*"))
return error(_("--all and --tags are incompatible"));
@@ -406,13 +410,16 @@ static int do_push(const char *repo, int flags)
for (i = 0; i < url_nr; i++) {
struct transport *transport =
transport_get(remote, url[i]);
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
} else {
struct transport *transport =
transport_get(remote, NULL);
-
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
@@ -500,6 +507,9 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
int push_cert = -1;
int rc;
const char *repo = NULL;/* default repository */
+   static struct string_list push_options = STRING_LIST_INIT_DUP;
+   static struct string_list_item *item;
+
struct option options[] = {
OPT__VERBOSITY(&verbosity),
OPT_STRING( 0 , "repo", &repo, N_("repository"), 
N_("repository")),
@@ -533,6 +543,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
  0, "signed", &push_cert, "yes|no|if-asked", N_("GPG sign the 
push"),
  PARSE_OPT_OPTARG, option_parse_push_signed },
OPT_BIT(0, "atomic", &flags, N_("request atomic transaction on 
remote side"), TRANSPORT_PUSH_ATOMIC),
+   OPT_STRING_LIST('o', "push-option", &push_options, 
N_("server-specific"), N_("option to transmit")),
OPT_SET_INT('4', "ipv4", &family, N_("use IPv4 addresses only"),
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", &family, N_("use IPv6 addresses only"),
@@ -563,7 +574,11 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
set_refspecs(argv + 1, argc - 1, repo);
}
 
-   rc = do_push(repo, flags);
+   for_each_string_list_item(item, &p

[PATCH 4/4] add a test for push options

2016-07-14 Thread Stefan Beller
The functions `mk_repo_pair` as well as `test_refs` are borrowed from
t5543-atomic-push, with additional hooks installed.

Signed-off-by: Stefan Beller 
---
 t/t5545-push-options.sh | 103 
 1 file changed, 103 insertions(+)
 create mode 100755 t/t5545-push-options.sh

diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
new file mode 100755
index 000..ea813b9
--- /dev/null
+++ b/t/t5545-push-options.sh
@@ -0,0 +1,103 @@
+#!/bin/sh
+
+test_description='pushing to a repository using push options'
+
+. ./test-lib.sh
+
+mk_repo_pair () {
+   rm -rf workbench upstream &&
+   test_create_repo upstream &&
+   test_create_repo workbench &&
+   (
+   cd upstream &&
+   git config receive.denyCurrentBranch warn &&
+   mkdir -p .git/hooks &&
+   cat >.git/hooks/pre-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/pre-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/pre-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/pre-receive
+
+   cat >.git/hooks/post-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/post-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/post-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/post-receive
+   ) &&
+   (
+   cd workbench &&
+   git remote add up ../upstream
+   )
+}
+
+# Compare the ref ($1) in upstream with a ref value from workbench ($2)
+# i.e. test_refs second HEAD@{2}
+test_refs () {
+   test $# = 2 &&
+   git -C upstream rev-parse --verify "$1" >expect &&
+   git -C workbench rev-parse --verify "$2" >actual &&
+   test_cmp expect actual
+}
+
+test_expect_success 'one push option works for a single branch' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf up master
+   ) &&
+   test_refs master master &&
+   echo "asdf" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'push option denied by remote' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions false &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   test_must_fail git push --push-option=asdf up master
+   ) &&
+   test_refs master HEAD@{1}
+'
+
+test_expect_success 'two push options work' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf --push-option="more structured 
text" up master
+   ) &&
+   test_refs master master &&
+   printf "asdf\nmore structured text\n" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_done
-- 
2.9.0.247.gf748855.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-14 Thread Stefan Beller
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable

+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
==
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
==
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

We limit the push options for now
* to not exceed an arbitrary count, and
* to not exceed an arbitrary size.

This serves two purposes:
* DoS protection (i.e. one connection can add no more than 32kB
  now)
* We need to figure out how to handle large (>64kB). Jeff wrote:
  > Yes, but people are also happy when they can use a flexible and
  > standardized tool to do a thing. I'd be more frustrated when I found out
  > that Git's data-pushing protocol has arbitrary limitations (like, say, I
  > can't push a data item larger than a single 64K pkt-line), which would
  > easily just work with something like HTTP POSTs.
  So to keep a way open in the future to deal with large pay loads,
  the size is restricted for now.

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller 
---
 Documentation/githooks.txt  | 18 ++
 builtin/receive-pack.c  | 49 +++--
 templates/hooks--pre-receive.sample | 24 ++
 3 files changed, 78 insertions(+), 13 deletions(-)
 create mode 100644 templates/hooks--pre-receive.sample

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index d82e912..9565dc3 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -247,6 +247,15 @@ Both standard output and standard error output are 
forwarded to
 'git send-pack' on the other end, so you can simply `echo` messages
 for the user.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client selects
+to use push options, but doesn't transmit any, the count variable
+will be set to zero, `GIT_PUSH_OPTION_COUNT=0`.
+
 [[update]]
 update
 ~~
@@ -322,6 +331,15 @@ a sample script `post-receive-email` provided in the 
`contrib/hooks`
 directory in Git distribution, which implements sending commit
 emails.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client 

[PATCHv5 0/4] Push options

2016-07-14 Thread Stefan Beller
  return ret;
 }
diff --git a/send-pack.c b/send-pack.c
index c943560..3a842ac 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -277,13 +277,10 @@ static int generate_push_cert(struct strbuf *req_buf,
}
if (push_cert_nonce[0])
strbuf_addf(&cert, "nonce %s\n", push_cert_nonce);
-   strbuf_addstr(&cert, "\n");
-
-   if (args->push_options) {
+   if (args->push_options)
for_each_string_list_item(item, args->push_options)
strbuf_addf(&cert, "push-option %s\n", item->string);
-   strbuf_addstr(&cert, "\n");
-   }
+   strbuf_addstr(&cert, "\n");
 
for (ref = remote_refs; ref; ref = ref->next) {
if (check_to_send_update(ref, args) < 0)
diff --git a/templates/hooks--pre-receive.sample 
b/templates/hooks--pre-receive.sample
index e4d3edc..a1fd29e 100644
--- a/templates/hooks--pre-receive.sample
+++ b/templates/hooks--pre-receive.sample
@@ -6,13 +6,15 @@
 #
 # To enable this hook, rename this file to "pre-receive".
 
-if test -n "$GIT_PUSH_OPTION_COUNT"; then
+if test -n "$GIT_PUSH_OPTION_COUNT"
+then
i=0
-   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"
+   do
eval "value=\$GIT_PUSH_OPTION_$i"
case "$value" in
echoback=*)
-   echo "echo from the pre-receive-hook ${value#*=}" >&2
+   echo "echo from the pre-receive-hook: ${value#*=}" >&2
;;
reject)
exit 1


Cover letter v3:


This is not marked for RFC any more, as I do not recall any open points
left for discussion. This addresses the only reply from Eric Wong on patch 3:

diff to v2:
diff --git a/send-pack.c b/send-pack.c
index e328276..c943560 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -536,7 +536,8 @@ int send_pack(struct send_pack_args *args,
 
for_each_string_list_item(item, args->push_options)
packet_buf_write(&sb, "%s", item->string);
-   write_or_die(out, sb.buf, sb.len);
+
+   write_or_die(out, sb.buf, sb.len);
packet_flush(out);
strbuf_release(&sb);
}
diff --git a/transport.c b/transport.c
index 598bd1f..0298be1 100644
--- a/transport.c
+++ b/transport.c
@@ -641,7 +641,6 @@ struct transport *transport_get(struct remote *remote, 
const char *url)
struct transport *ret = xcalloc(1, sizeof(*ret));
 
ret->progress = isatty(2);
-   ret->push_options = NULL;
 
if (!remote)
die("No remote provided to transport_get()");

Cover letter v2:


Allow a user to pass information along a push to the pre/post-receive hook
on the remote.

Jeff writes on v1:
> Whereas in Dennis's patches, it was about specific information directly
> related to the act of pushing.

This allows to transmit arbitrary information as the backends of $VENDOR
may have different options available related to the direct act of pushing.

Thanks,
Stefan

Cover letter v1:


Allow a user to pass information along a push to the pre/post-receive hook
on the remote.

When using a remote that is more than just a plain Git host (e.g. Gerrit,
Git{hub/lab}, etc) this may become more obvious: The (server backend specific)
push options can instruct the server to:
* open a pull request
* send out emails asking for review
* (un)trigger continuous integration
* set priority for continuous integration (i.e. bots pushing may ask to be
  treated with lower priority compared to humans)
* ... 

Most of these actions can be done on the client side as well,
but in these remote-centric workflows it is easier to do that on the remote,
which is why we need to transport the information there.

More concrete examples:
* When you want a change in Gerrit to be submitted to refs/heads/master, you
  push instead to a magic branch refs/for/master and Gerrit will create a change
  for you (similar to a pull request). Instead we could imagine that you push
  to a magical refs/heads/master with a push option "create-change".
  
* When pushing to Gerrit you can already attach some of this information by
  adding a '%' followed by the parameter to the ref, i.e. when interacting with
  Gerrit it is possible to do things like[1]:

git push origin 
HEAD:refs/for/master%draft%topic=example%cc=jon@example.org
  
  This is not appealing to our users as it looks like hacks upon hacks to make
  it work. It would read better if it was spelled as:
  
  git push origin HEAD:refs/for/master \
  --

Re: [PATCHv5 0/4] Push options

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 11:41 AM, Jeff King  wrote:
> On Thu, Jul 14, 2016 at 10:39:16AM -0700, Stefan Beller wrote:
>
>> Jeff wrote:
>> > Junio wrote:
>> >> I think those extra knobs can come later.  If we are not going to
>> >> limit with max_options in the end, however, wouldn't it be more
>> >> natural for the initial iteration without any configuration not to
>> >> have hard-coded max_options at all?
>> >
>> > Yeah, I am OK with adding restrictive knobs later as a separate topic.
>> > As Stefan notes, upstream does not have the other knobs anyway, and IIRC
>> > the push-options feature is not even enabled by default.
>>
>> * now it actually is not a default. ;)
>
> Hmm. So that is a downside for people who have implemented separate DoS
> protection only in that upgrading git will open a new "hole" that they
> need to plug. But that is their problem, not upstream git's.
>
> -Peff

But this is not specific to DoS protection, but like most features.
Just look at the vendors
using linux that think carrying patches out of tree for their special
snowflake hardware
is cheaper than getting it upstream (This is not to be read as a
snarky comment, but
rather as a pointer to the similarity of the mechanisms involved).
Although I cannot tell
offhand another feature that would easily break downstream customization.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Server-side preventing some files from being overwritten

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 11:44 AM, Junio C Hamano  wrote:
> On Thu, Jul 14, 2016 at 11:27 AM, Junio C Hamano  wrote:
>> Thorsten Glaser  writes:
>>
>>>   if test x"0" != x"$(comm -23z \
>>>   <(git ls-tree -r -z "$old" "$subdir" | sort -z) \
>>>   <(git ls-tree -r -z "$new" "$subdir" | sort -z) | wc -c)"; then
>>>   echo >&2 'Untouchable files touched, commit rejected!'
>>>   exit 1
>>>   fi
>>
>> Can't this become simpler, e.g.
>>
>> if ! git diff-tree --quiet "$old" "$new" -- "$subdir"
>> then
>> echo >&2 "Ooh, $subdir is touched"

No need to go for >&2 here, as it makes no difference to
the client.

>> exit 1
>> fi
>
> Ehh, you need to tell diff-tree to recurse, i.e. "diff-tree -r".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
> packet_read_line() calls packet_read_line_generic() that calls
> packet_read() with the fixed packet_buffer[] that is sized to be
> LARGE_PACKET_MAX.

>
> Can this check even trigger?

I thought when len == LARGE_PACKET_MAX,
this could trigger. Though on inspection of packet_read,
we already reject packets that have size LARGE_PACKET_MAX,
and the largest size allowed is LARGE_PACKET_MAX - 1.

I guess we can remove it altogether and still be future proof.
If we ever want to allow larger push options we either need to
have larger (variable sized) packets or a capability push-options-v2,
so I'll rip this check out.

>
>> + lf = strchr(line, '\n');
>> + if (lf)
>> + *lf = '\0';
>
> packet_read_line() -> packet_read_line_generic() calls packet_read()
> with PACKET_READ_CHOMP_NEWLINE flag bit; do we need this check?

This check was not about "option with lf at end\n", but rather we want to chop
off "option\nover\nmultiple\nlines" ?

Although as you remarked in another email, this would not pose a problem for
the shell variable, so we could also drop it to allow multi line
options. will do.

>
>> +
>> + string_list_append(ret, line);
>> + }
>> +
>> + return ret;
>> +}
>
> Other than that, looks good to me.
>
> Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: document diff-filter exclusion

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 12:20 PM, Junio C Hamano  wrote:
> In v1.8.5 days, 7f2ea5f0 (diff: allow lowercase letter to specify
> what change class to exclude, 2013-07-17) taught the "--diff-filter"
> mechanism to take lowercase letters as exclusion, but we forgot to
> document it.
>
> Signed-off-by: Junio C Hamano 
> ---
>  Documentation/diff-options.txt | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 3ad6404..073c7e5 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -399,6 +399,9 @@ ifndef::git-format-patch[]
> paths are selected if there is any file that matches
> other criteria in the comparison; if there is no file
> that matches other criteria, nothing is selected.
> ++
> +Also, these upper-case letters can be downcased to exclude.  E.g.
> +`--diff-filter=ad` excludes added and deleted paths.
>

This looks good for the neighboring thread. :)
http://thread.gmane.org/gmane.comp.version-control.git/299512
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv6 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller 
---

 Junio,
 please replace v5 2/4 with this patch (I only resend this single patch
 as the other 3 remain as is).

 This only changes read_push_options to not care at all about any
 limitations.
 
 Thanks,
 Stefan

# interdiff to v5:
# diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
# index 917ac18..9bb9afc 100644
# --- a/builtin/receive-pack.c
# +++ b/builtin/receive-pack.c
# @@ -1505,7 +1505,7 @@ static struct string_list *read_push_options(void)
#   string_list_init(ret, 1);
#  
#   while (1) {
# - char *line, *lf;
# + char *line;
#   int len;
#  
#   line = packet_read_line(0, &len);
# @@ -1513,30 +1513,6 @@ static struct string_list *read_push_options(void)
#   if (!line)
#   break;
#  
# - /*
# - * NEEDSWORK: expose the limitations to be configurable;
# - * Once the limit can be lifted, include a way for payloads
# - * larger than one pkt, e.g use last byte to indicate if
# - * the push option continues in the next packet or implement
# - * larger packets.
# - */
# - if (len > LARGE_PACKET_MAX - 1) {
# - /*
# -  * NEEDSWORK: The error message in die(..) is not
# -  * transmitted in call cases, so ideally all die(..)
# -  * calls are prefixed with rp_error and then we can
# -  * combine rp_error && die into one helper function.
# -  */
# - rp_error("protocol error: server configuration allows 
push "
# -  "options of size up to %d bytes",
# -  LARGE_PACKET_MAX - 1);
# - die("protocol error: push options too large");
# - }
# -
# - lf = strchr(line, '\n');
# - if (lf)
# - *lf = '\0';
# -
#   string_list_append(ret, line);
#   }


 Documentation/config.txt  |  9 --
 Documentation/technical/pack-protocol.txt | 10 ---
 Documentation/technical/protocol-capabilities.txt |  9 ++
 builtin/receive-pack.c| 35 +++
 4 files changed, 57 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 16dc22d..0bb6daa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2417,8 +2417,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
By default, git-receive-pack will advertise the atomic push
-   capability to its clients. If you don't want to this capability
-   to be advertised, set this variable to false.
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
+
+receive.advertisePushOptions::
+   By default, git-receive-pack will advertise the push options
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
 
 receive.autogc::
By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way 
as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 --
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the 
obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need t

Re: [PATCHv6 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 2:35 PM, Jeff King  wrote:
> On Thu, Jul 14, 2016 at 02:24:54PM -0700, Stefan Beller wrote:
>
>> # interdiff to v5:
>> [...giant deletion...]
>
> Much nicer. :)
>
>> +static struct string_list *read_push_options(void)
>> +{
>> + struct string_list *ret = xmalloc(sizeof(*ret));
>
> This struck me as a little non-idiomatic for our code base. The usual
> technique is to take a pointer to a stack-allocated struct, and write
> into that.

Oh, right! :(

That's what you get when you grow up with object orientation all along.
read_push_options() mentally mapped to "create a push options object if any"
such that I can see if it is NULL or not.

I'll reroll a more idiomatic thing. Thanks,
Stefan

>
> I guess that here:
>
>> @@ -1774,6 +1806,9 @@ int cmd_receive_pack(int argc, const char **argv, 
>> const char *prefix)
>>   const char *unpack_status = NULL;
>>   struct string_list *push_options = NULL;
>>
>> + if (use_push_options)
>> + push_options = read_push_options();
>> +
>
> You will want to later check whether push_options is NULL. But you can
> also just check push_options.nr.
>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv7 0/4] Push options

2016-07-14 Thread Stefan Beller
send-pack.c
+++ b/send-pack.c
@@ -277,13 +277,10 @@ static int generate_push_cert(struct strbuf *req_buf,
}
if (push_cert_nonce[0])
strbuf_addf(&cert, "nonce %s\n", push_cert_nonce);
-   strbuf_addstr(&cert, "\n");
-
-   if (args->push_options) {
+   if (args->push_options)
for_each_string_list_item(item, args->push_options)
strbuf_addf(&cert, "push-option %s\n", item->string);
-   strbuf_addstr(&cert, "\n");
-   }
+   strbuf_addstr(&cert, "\n");
 
for (ref = remote_refs; ref; ref = ref->next) {
if (check_to_send_update(ref, args) < 0)
diff --git a/templates/hooks--pre-receive.sample 
b/templates/hooks--pre-receive.sample
index e4d3edc..a1fd29e 100644
--- a/templates/hooks--pre-receive.sample
+++ b/templates/hooks--pre-receive.sample
@@ -6,13 +6,15 @@
 #
 # To enable this hook, rename this file to "pre-receive".
 
-if test -n "$GIT_PUSH_OPTION_COUNT"; then
+if test -n "$GIT_PUSH_OPTION_COUNT"
+then
i=0
-   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"
+   do
eval "value=\$GIT_PUSH_OPTION_$i"
case "$value" in
echoback=*)
-   echo "echo from the pre-receive-hook ${value#*=}" >&2
+   echo "echo from the pre-receive-hook: ${value#*=}" >&2
;;
reject)
exit 1


Cover letter v3:


This is not marked for RFC any more, as I do not recall any open points
left for discussion. This addresses the only reply from Eric Wong on patch 3:

diff to v2:
diff --git a/send-pack.c b/send-pack.c
index e328276..c943560 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -536,7 +536,8 @@ int send_pack(struct send_pack_args *args,
 
for_each_string_list_item(item, args->push_options)
packet_buf_write(&sb, "%s", item->string);
-   write_or_die(out, sb.buf, sb.len);
+
+   write_or_die(out, sb.buf, sb.len);
packet_flush(out);
strbuf_release(&sb);
}
diff --git a/transport.c b/transport.c
index 598bd1f..0298be1 100644
--- a/transport.c
+++ b/transport.c
@@ -641,7 +641,6 @@ struct transport *transport_get(struct remote *remote, 
const char *url)
struct transport *ret = xcalloc(1, sizeof(*ret));
 
ret->progress = isatty(2);
-   ret->push_options = NULL;
 
if (!remote)
die("No remote provided to transport_get()");

Cover letter v2:


Allow a user to pass information along a push to the pre/post-receive hook
on the remote.

Jeff writes on v1:
> Whereas in Dennis's patches, it was about specific information directly
> related to the act of pushing.

This allows to transmit arbitrary information as the backends of $VENDOR
may have different options available related to the direct act of pushing.

Thanks,
Stefan

Cover letter v1:


Allow a user to pass information along a push to the pre/post-receive hook
on the remote.

When using a remote that is more than just a plain Git host (e.g. Gerrit,
Git{hub/lab}, etc) this may become more obvious: The (server backend specific)
push options can instruct the server to:
* open a pull request
* send out emails asking for review
* (un)trigger continuous integration
* set priority for continuous integration (i.e. bots pushing may ask to be
  treated with lower priority compared to humans)
* ... 

Most of these actions can be done on the client side as well,
but in these remote-centric workflows it is easier to do that on the remote,
which is why we need to transport the information there.

More concrete examples:
* When you want a change in Gerrit to be submitted to refs/heads/master, you
  push instead to a magic branch refs/for/master and Gerrit will create a change
  for you (similar to a pull request). Instead we could imagine that you push
  to a magical refs/heads/master with a push option "create-change".
  
* When pushing to Gerrit you can already attach some of this information by
  adding a '%' followed by the parameter to the ref, i.e. when interacting with
  Gerrit it is possible to do things like[1]:

git push origin 
HEAD:refs/for/master%draft%topic=example%cc=jon@example.org
  
  This is not appealing to our users as it looks like hacks upon hacks to make
  it work. It would read better if it was spelled as:
  
  git push origin HEAD:refs/for/master \
  --push-option draft \
  --push-option topic=example \
  --push-optio

[PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-14 Thread Stefan Beller
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable

+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
==
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
==
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

We limit the push options for now
* to not exceed an arbitrary count, and
* to not exceed an arbitrary size.

This serves two purposes:
* DoS protection (i.e. one connection can add no more than 32kB
  now)
* We need to figure out how to handle large (>64kB). Jeff wrote:
  > Yes, but people are also happy when they can use a flexible and
  > standardized tool to do a thing. I'd be more frustrated when I found out
  > that Git's data-pushing protocol has arbitrary limitations (like, say, I
  > can't push a data item larger than a single 64K pkt-line), which would
  > easily just work with something like HTTP POSTs.
  So to keep a way open in the future to deal with large pay loads,
  the size is restricted for now.

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller 
---
 Documentation/githooks.txt  | 18 ++
 builtin/receive-pack.c  | 47 +++--
 templates/hooks--pre-receive.sample | 24 +++
 3 files changed, 76 insertions(+), 13 deletions(-)
 create mode 100644 templates/hooks--pre-receive.sample

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index d82e912..9565dc3 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -247,6 +247,15 @@ Both standard output and standard error output are 
forwarded to
 'git send-pack' on the other end, so you can simply `echo` messages
 for the user.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client selects
+to use push options, but doesn't transmit any, the count variable
+will be set to zero, `GIT_PUSH_OPTION_COUNT=0`.
+
 [[update]]
 update
 ~~
@@ -322,6 +331,15 @@ a sample script `post-receive-email` provided in the 
`contrib/hooks`
 directory in Git distribution, which implements sending commit
 emails.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client 

[PATCH 4/4] add a test for push options

2016-07-14 Thread Stefan Beller
The functions `mk_repo_pair` as well as `test_refs` are borrowed from
t5543-atomic-push, with additional hooks installed.

Signed-off-by: Stefan Beller 
---
 t/t5545-push-options.sh | 103 
 1 file changed, 103 insertions(+)
 create mode 100755 t/t5545-push-options.sh

diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
new file mode 100755
index 000..ea813b9
--- /dev/null
+++ b/t/t5545-push-options.sh
@@ -0,0 +1,103 @@
+#!/bin/sh
+
+test_description='pushing to a repository using push options'
+
+. ./test-lib.sh
+
+mk_repo_pair () {
+   rm -rf workbench upstream &&
+   test_create_repo upstream &&
+   test_create_repo workbench &&
+   (
+   cd upstream &&
+   git config receive.denyCurrentBranch warn &&
+   mkdir -p .git/hooks &&
+   cat >.git/hooks/pre-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/pre-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/pre-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/pre-receive
+
+   cat >.git/hooks/post-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/post-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/post-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/post-receive
+   ) &&
+   (
+   cd workbench &&
+   git remote add up ../upstream
+   )
+}
+
+# Compare the ref ($1) in upstream with a ref value from workbench ($2)
+# i.e. test_refs second HEAD@{2}
+test_refs () {
+   test $# = 2 &&
+   git -C upstream rev-parse --verify "$1" >expect &&
+   git -C workbench rev-parse --verify "$2" >actual &&
+   test_cmp expect actual
+}
+
+test_expect_success 'one push option works for a single branch' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf up master
+   ) &&
+   test_refs master master &&
+   echo "asdf" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'push option denied by remote' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions false &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   test_must_fail git push --push-option=asdf up master
+   ) &&
+   test_refs master HEAD@{1}
+'
+
+test_expect_success 'two push options work' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf --push-option="more structured 
text" up master
+   ) &&
+   test_refs master master &&
+   printf "asdf\nmore structured text\n" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_done
-- 
2.9.0.247.gf748855.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] push: accept push options

2016-07-14 Thread Stefan Beller
This implements everything that is required on the client side to make use
of push options from the porcelain push command.

Signed-off-by: Stefan Beller 
---
 Documentation/git-push.txt |  8 +++-
 builtin/push.c | 21 ++---
 send-pack.c| 27 +++
 send-pack.h|  3 +++
 transport.c|  1 +
 transport.h|  7 +++
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 93c3527..ec514f6 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git push' [--all | --mirror | --tags] [--follow-tags] [--atomic] [-n | 
--dry-run] [--receive-pack=]
   [--repo=] [-f | --force] [-d | --delete] [--prune] [-v | 
--verbose]
-  [-u | --set-upstream]
+  [-u | --set-upstream] [--push-option=]
   [--[no-]signed|--sign=(true|false|if-asked)]
   [--force-with-lease[=[:]]]
   [--no-verify] [ [...]]
@@ -156,6 +156,12 @@ already exists on the remote side.
Either all refs are updated, or on error, no refs are updated.
If the server does not support atomic pushes the push will fail.
 
+-o::
+--push-option::
+   Transmit the given string to the server, which passes them to
+   the pre-receive as well as the post-receive hook. The given string
+   must not contain a NUL or LF character.
+
 --receive-pack=::
 --exec=::
Path to the 'git-receive-pack' program on the remote
diff --git a/builtin/push.c b/builtin/push.c
index 4e9e4db..3bb9d6b 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -353,7 +353,8 @@ static int push_with_options(struct transport *transport, 
int flags)
return 1;
 }
 
-static int do_push(const char *repo, int flags)
+static int do_push(const char *repo, int flags,
+  const struct string_list *push_options)
 {
int i, errs;
struct remote *remote = pushremote_get(repo);
@@ -376,6 +377,9 @@ static int do_push(const char *repo, int flags)
if (remote->mirror)
flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
 
+   if (push_options->nr)
+   flags |= TRANSPORT_PUSH_OPTIONS;
+
if ((flags & TRANSPORT_PUSH_ALL) && refspec) {
if (!strcmp(*refspec, "refs/tags/*"))
return error(_("--all and --tags are incompatible"));
@@ -406,13 +410,16 @@ static int do_push(const char *repo, int flags)
for (i = 0; i < url_nr; i++) {
struct transport *transport =
transport_get(remote, url[i]);
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
} else {
struct transport *transport =
transport_get(remote, NULL);
-
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
@@ -500,6 +507,9 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
int push_cert = -1;
int rc;
const char *repo = NULL;/* default repository */
+   static struct string_list push_options = STRING_LIST_INIT_DUP;
+   static struct string_list_item *item;
+
struct option options[] = {
OPT__VERBOSITY(&verbosity),
OPT_STRING( 0 , "repo", &repo, N_("repository"), 
N_("repository")),
@@ -533,6 +543,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
  0, "signed", &push_cert, "yes|no|if-asked", N_("GPG sign the 
push"),
  PARSE_OPT_OPTARG, option_parse_push_signed },
OPT_BIT(0, "atomic", &flags, N_("request atomic transaction on 
remote side"), TRANSPORT_PUSH_ATOMIC),
+   OPT_STRING_LIST('o', "push-option", &push_options, 
N_("server-specific"), N_("option to transmit")),
OPT_SET_INT('4', "ipv4", &family, N_("use IPv4 addresses only"),
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", &family, N_("use IPv6 addresses only"),
@@ -563,7 +574,11 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
set_refspecs(argv + 1, argc - 1, repo);
}
 
-   rc = do_push(repo, flags);
+   for_each_string_list_item(item, &p

[PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller 
---
 Documentation/config.txt  |  9 +--
 Documentation/technical/pack-protocol.txt | 10 +---
 Documentation/technical/protocol-capabilities.txt |  9 +++
 builtin/receive-pack.c| 30 +++
 4 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 16dc22d..0bb6daa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2417,8 +2417,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
By default, git-receive-pack will advertise the atomic push
-   capability to its clients. If you don't want to this capability
-   to be advertised, set this variable to false.
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
+
+receive.advertisePushOptions::
+   By default, git-receive-pack will advertise the push options
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
 
 receive.autogc::
By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way 
as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 --
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the 
obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 
   update-request=  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..4c28d3a 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,15 @@ atomic pushes. If the pushing client requests this 
capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+
+
+If the server sends the 'push-options' capability it is able to accept
+push options after the update commands have been sent, but before the
+packfile is streamed. If the pushing client requests this capability,
+the server will pass the options to the pre- and post- receive hooks
+that process this push request.
+
 allow-tip-sha1-in-want
 --
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 6cdd2c6..3c9360a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char 
*value, void *cb)
return 0;
}
 
+   if (strcmp(var, "receive.advertisepushoptions") == 0) {
+   advertise_push_options = git_config_bool(var, value);

Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 3:46 PM, Jeff King  wrote:
> On Thu, Jul 14, 2016 at 02:49:45PM -0700, Stefan Beller wrote:
>
>> We limit the push options for now
>> * to not exceed an arbitrary count, and
>> * to not exceed an arbitrary size.
>>
>> This serves two purposes:
>> * DoS protection (i.e. one connection can add no more than 32kB
>>   now)
>> * We need to figure out how to handle large (>64kB). Jeff wrote:
>>   > Yes, but people are also happy when they can use a flexible and
>>   > standardized tool to do a thing. I'd be more frustrated when I found out
>>   > that Git's data-pushing protocol has arbitrary limitations (like, say, I
>>   > can't push a data item larger than a single 64K pkt-line), which would
>>   > easily just work with something like HTTP POSTs.
>>   So to keep a way open in the future to deal with large pay loads,
>>   the size is restricted for now.
>
> Should this bit get dropped from the commit message?
>
> -Peff

Right. :/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 12/12] receive-pack: send keepalives during quiet periods

2016-07-15 Thread Stefan Beller
On Fri, Jul 15, 2016 at 3:43 AM, Jeff King  wrote:

>
> Signed-off-by: Jeff King 

Read-entirely-by Stefan ;)
Thanks!

> @@ -319,10 +331,60 @@ static void rp_error(const char *err, ...)
>  static int copy_to_sideband(int in, int out, void *arg)
>  {
> char data[128];

While looking at this code, do you think it is feasible to increase the
size of data[] to 1024 ? (The largest that is possible when
side-band, but no side-band-64k is given).

> +   int keepalive_active = 0;
> +
> +   if (keepalive_in_sec <= 0)
> +   use_keepalive = KEEPALIVE_NEVER;
> +   if (use_keepalive == KEEPALIVE_ALWAYS)
> +   keepalive_active = 1;
> +
> while (1) {
> -   ssize_t sz = xread(in, data, sizeof(data));
> +   ssize_t sz;
> +
> +   if (keepalive_active) {
> +   struct pollfd pfd;
> +   int ret;
> +
> +   pfd.fd = in;
> +   pfd.events = POLLIN;
> +   ret = poll(&pfd, 1, 1000 * keepalive_in_sec);
> +
> +   if (ret < 0) {
> +   if (errno == EINTR)
> +   continue;
> +   else
> +   break;

The method was short and concise, this adds a lot of lines.
Remembering d751dd11 (2016-07-10, hoist out handle_nonblock
function for xread and xwrite), do you think it would be reasonable to
put the whole poll handling into a dedicated function, maybe even reuse the
that function?

if (keepalive_active) {
if (wrapper_around_poll(&data_in) < 0) // handles EINTR internally
break;
if (!data_in)
send_keep_alive();
}

I am not sure if that makes this function more legible, just food for thought.


> +   } else if (ret == 0) {
> +   /* no data; send a keepalive packet */
> +   static const char buf[] = "0005\1";

and the \1 is the first sideband. Why do we choose that sideband?

> +   write_or_die(1, buf, sizeof(buf) - 1);
> +   continue;
> +   } /* else there is actual data to read */

"If there is data to read, we need to break the while(1), to actually
read the data?"
I got confused and needed to go back and read the actual code again,
would it make sense to rather have a loop here?

while (1) {
   while(keepalive_active) {
if (wrapper_around_poll(&data_in) < 0) // handles EINTR internally
break;
if (!data_in)
send_keep_alive();
else
break;
}

   sz = xread(in, data, sizeof(data));
 if (sz <= 0)
 break;

turn_on_keepalive_on_NUL(&data);
 }

> +   }
> +
> +   sz = xread(in, data, sizeof(data));
> if (sz <= 0)
> break;
> +
> +   if (use_keepalive == KEEPALIVE_AFTER_NUL && 
> !keepalive_active) {
> +   const char *p = memchr(data, '\0', sz);
> +   if (p) {
> +   /*
> +* The NUL tells us to start sending 
> keepalives. Make
> +* sure we send any other data we read along
> +* with it.
> +*/
> +   keepalive_active = 1;
> +   send_sideband(1, 2, data, p - data, 
> use_sideband);
> +   send_sideband(1, 2, p + 1, sz - (p - data + 
> 1), use_sideband);
> +   continue;

Oh, I see why the turn_on_keepalive_on_NUL doesn't work as well as I thought.
I wonder if we can use a better read function, that would stop reading at a NUL,
and return early instead?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 12/12] receive-pack: send keepalives during quiet periods

2016-07-18 Thread Stefan Beller
On Sat, Jul 16, 2016 at 12:56 AM, Jeff King  wrote:
>> > +   if (use_keepalive == KEEPALIVE_AFTER_NUL && 
>> > !keepalive_active) {
>> > +   const char *p = memchr(data, '\0', sz);
>> > +   if (p) {
>> > +   /*
>> > +* The NUL tells us to start sending 
>> > keepalives. Make
>> > +* sure we send any other data we read 
>> > along
>> > +* with it.
>> > +*/
>> > +   keepalive_active = 1;
>> > +   send_sideband(1, 2, data, p - data, 
>> > use_sideband);
>> > +   send_sideband(1, 2, p + 1, sz - (p - data 
>> > + 1), use_sideband);
>> > +   continue;
>>
>> Oh, I see why the turn_on_keepalive_on_NUL doesn't work as well as I thought.
>> I wonder if we can use a better read function, that would stop reading at a 
>> NUL,
>> and return early instead?
>
> How would you do that, if not by read()ing a byte at a time (which is
> inefficient)? Otherwise you have to deal with the leftovers (after the
> NUL) in your buffer. It's one of the reasons I went with a single-byte
> signal, because otherwise it gets rather complicated to do robustly.

I do not question the concept of a single NUL byte, but rather the
implementation,
i.e. if we had an xread_until_nul you would not need to have a double
send_sideband
here?

>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 12/12] receive-pack: send keepalives during quiet periods

2016-07-19 Thread Stefan Beller
On Tue, Jul 19, 2016 at 3:07 AM, Jeff King  wrote:
> On Mon, Jul 18, 2016 at 10:28:25PM -0700, Stefan Beller wrote:
>
>> On Sat, Jul 16, 2016 at 12:56 AM, Jeff King  wrote:
>> >> > +   if (use_keepalive == KEEPALIVE_AFTER_NUL && 
>> >> > !keepalive_active) {
>> >> > +   const char *p = memchr(data, '\0', sz);
>> >> > +   if (p) {
>> >> > +   /*
>> >> > +* The NUL tells us to start sending 
>> >> > keepalives. Make
>> >> > +* sure we send any other data we read 
>> >> > along
>> >> > +* with it.
>> >> > +*/
>> >> > +   keepalive_active = 1;
>> >> > +   send_sideband(1, 2, data, p - data, 
>> >> > use_sideband);
>> >> > +   send_sideband(1, 2, p + 1, sz - (p - 
>> >> > data + 1), use_sideband);
>> >> > +   continue;
>> >>
>> >> Oh, I see why the turn_on_keepalive_on_NUL doesn't work as well as I 
>> >> thought.
>> >> I wonder if we can use a better read function, that would stop reading at 
>> >> a NUL,
>> >> and return early instead?
>> >
>> > How would you do that, if not by read()ing a byte at a time (which is
>> > inefficient)? Otherwise you have to deal with the leftovers (after the
>> > NUL) in your buffer. It's one of the reasons I went with a single-byte
>> > signal, because otherwise it gets rather complicated to do robustly.
>>
>> I do not question the concept of a single NUL byte, but rather the
>> implementation, i.e. if we had an xread_until_nul you would not need
>> to have a double send_sideband here?
>
> What would xread_until_nul() look like?
>
> The only reasonable implementation I can think of is:
>
>   ssize_t xread_until_nul(int fd, char *out, size_t len)
>   {
> ssize_t total = 0;
> while (total < len) {
> ssize_t ret = xread(fd, out + total, 1);
> if (ret < 0) {
> /* Oops, anything in out[0..total] is lost, but
>  * we have no way of signaling both a partial
>  * read and an error at the end; fixable by
>  * changing the interface, but doesn't really
>  * matter in practice for this application. */
> return -1;
> }
> if (ret == 0)
> break; /* EOF, stop reading */
> if (out[total] == '\0')
> break; /* got our NUL, stop reading */
> total++;
> }
> return total;
>   }
>
> There are two problems with this function:
>
>   1. Until we see the NUL, we're making an excessive number of read()
>  syscalls looking for it. You could make larger reads, but then what
>  do you do with the residual bytes (i.e., the ones after the NUL in
>  the buffer you read)? You'd have to somehow save it to return at
>  the next xread_until_nul(). Which implie some kind of internal
>  buffering.
>
>  The reason there are two send_sidebands is to cover the case where
>  we have some real data, then the signal byte, then some more data.
>  So instead of buffering, we just handle the data immediately.
>
>   2. How does it know when to return?
>
>  We want to send the data as soon as it is available, in as large a
>  chunk as possible. Using a single xread() as we do now, we get
>  whatever the OS has for us, up to our buffer size.
>
>  But after each 1-byte read above, we would not want to return;
>  there might be more data. So it keeps reading until NUL or we fill
>  our buffer. But that may mean the xread() blocks, even though we
>  have data that _could_ be shipped over the sideband.
>
>  To fix that, you'd have to poll() for each xread(), and return when
>  it says nothing's ready.
>
> I know that writing the function myself and then critiquing is the very
> definition of a strawman. :) But it's the best I could think of.  Maybe
> you had something more clever in mind?

Actually no, I had not. I was hoping you could come up with a clever thing.
My original point was the perceived added complexity to a simple
seemingly simple function (copy_to_sideband in your original patch),
so my gut reaction was to shove the complexity away into a helper function.
But no matter how it is done, there is always this one function that looks
complex for this problem. So I think your original approach is fine then?

Thanks,
Stefan


>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] fetch-pack: grow stateless RPC windows exponentially

2016-07-19 Thread Stefan Beller
On Mon, Jul 18, 2016 at 3:21 PM, Jonathan Tan  wrote:
> When updating large repositories, the LARGE_FLUSH limit (that is, the
> limit at which the window growth strategy switches from exponential to
> linear) is reached quite quickly. Use a conservative exponential growth
> strategy when that limit is reached instead (and increase LARGE_FLUSH so
> that there is no regression in window size).

Care to elaborate on why you choose 11/10 as growth factor?

(As someone who has a tick in micro optimizing:
9/8 is roughly the same exponent, but the division
by 8 is easier as it is just a shift by 3. Similar 17/16)

I guess one design criterion was 10 being a round number?
Does it make sense to experiment with the factor at all?
Digging into that, LARGE_FLUSH originates from 6afca450c3f,
(2011-03-20, fetch-pack: progressively use larger handshake windows),
and before we only had a linear growth.

So I guess what I do not understand is why we need to slow down the
exponential growth at all?

Thanks,
Stefan



>
> This optimization is only applied during stateless RPCs to avoid the
> issue raised and fixed in commit
> 44d8dc54e73e8010c4bdf57a422fc8d5ce709029.
>
> Signed-off-by: Jonathan Tan 
> ---
>  fetch-pack.c | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index b501d5c..85e77af 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -243,16 +243,21 @@ static void insert_one_alternate_ref(const struct ref 
> *ref, void *unused)
>
>  #define INITIAL_FLUSH 16
>  #define PIPESAFE_FLUSH 32
> -#define LARGE_FLUSH 1024
> +#define LARGE_FLUSH 16384
>
>  static int next_flush(struct fetch_pack_args *args, int count)
>  {
> -   int flush_limit = args->stateless_rpc ? LARGE_FLUSH : PIPESAFE_FLUSH;
> -
> -   if (count < flush_limit)
> -   count <<= 1;
> -   else
> -   count += flush_limit;
> +   if (args->stateless_rpc) {
> +   if (count < LARGE_FLUSH)
> +   count <<= 1;
> +   else
> +   count = count * 11 / 10;
> +   } else {
> +   if (count < PIPESAFE_FLUSH)
> +   count <<= 1;
> +   else
> +   count += PIPESAFE_FLUSH;
> +   }
> return count;
>  }
>
> --
> 2.8.0.rc3.226.g39d4020
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] fetch-pack: grow stateless RPC windows exponentially

2016-07-19 Thread Stefan Beller
On Tue, Jul 19, 2016 at 12:03 PM, Jonathan Tan  wrote:
> On Tue, Jul 19, 2016 at 9:46 AM, Stefan Beller  wrote:
>> Care to elaborate on why you choose 11/10 as growth factor?
>>
>> (As someone who has a tick in micro optimizing:
>> 9/8 is roughly the same exponent, but the division
>> by 8 is easier as it is just a shift by 3. Similar 17/16)
>
> I don't have a specific reason for 11/10 as opposed to, say, 9/8 - I
> think that the time taken to execute this line is negligible compared
> to what's done in the calling code, but I'll change it to 9/8 if there
> is another reason for me to send another patch.
>
>> I guess one design criterion was 10 being a round number?
>> Does it make sense to experiment with the factor at all?
>> Digging into that, LARGE_FLUSH originates from 6afca450c3f,
>> (2011-03-20, fetch-pack: progressively use larger handshake windows),
>> and before we only had a linear growth.
>>
>> So I guess what I do not understand is why we need to slow down the
>> exponential growth at all?
>
> The current code has an exponential (a' = a * 2) then a linear (a' = a
> + 1024) growth. I'm not slowing down the exponential growth - that
> part is retained. I'm replacing the linear growth with another
> conservative exponential growth (a' = a * 11 / 10).

Sorry for the miss understanding. Once we have the new conservative
exponential, we'd have a fast exponential first (a=2*a) and then after a while
a slower exponential (a=1.1*a). So we have 2 exponential curves with different
growth properties.

So my question could be reworded as: Why do we need two different exponential
growth phases (as they are both exponential)? And I answered myself with: Oh,
the exponents are different, so that's why.

But then I tried to understand why we choose the 2 exponents as of this patch.
(Why start with 2 and drop back to 1.1?) Could one exponential phase be
sufficient with an exponent in between (e.g. a=1.3*a with a larger starting a to
compensate for the reduced early growth) ?

Or to reword it another way: Using just one exponential growth is simpler than
2 different exponential growth phases, so how do we explain the added
complexity?

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] submodule-config: use explicit empty string instead of strbuf in config_from()

2016-07-19 Thread Stefan Beller
On Tue, Jul 19, 2016 at 12:15 PM, Junio C Hamano  wrote:
> René Scharfe  writes:
>
>> Use a string constant instead of an empty strbuf to shorten the code
>> and make it easier to read.
>>
>> Signed-off-by: Rene Scharfe 
>> ---
>> ... unless someone can come up with a suitable non-empty string to feed
>> to git_config_from_mem() as its name parameter.

Thanks!

> I wondered if we used to do something useful with rev and later that
> useful thing was removed leaving an always-empty strbuf, but it
> appears that this strbuf was introduced to always hold an empty
> string and nothing else, which shows the (lack of) quality of
> reviews in this area X-<.

I am to blame for this. :(

>
> Will apply.  Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Documentation, git submodule: Note about --reference

2016-07-19 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---

 Is it too obvious?
 I was approached off list and this was only obvious after some discussion,
 so I think it is a valid warning.
 
 On the other hand this might show that we want to get worktree working with
 submodules.

 Documentation/git-submodule.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index bf3bb37..dcbd460 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -373,6 +373,10 @@ the submodule itself.
 +
 *NOTE*: Do *not* use this option unless you have read the note
 for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
++
+*NOTE*: This gives the same reference to all submodules, so it is only useful
+if you are tracking different versions of a project in submodules instead
+of different projects.
 
 --recursive::
This option is only valid for foreach, update, status and sync commands.
-- 
2.9.2.800.g213104a

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git and SHA-1 security (again)

2016-07-20 Thread Stefan Beller
On Wed, Jul 20, 2016 at 7:44 AM, Duy Nguyen  wrote:
> On Wed, Jul 20, 2016 at 2:28 PM, Johannes Schindelin
>  wrote:
>> But that strategy *still* ignores the distributed nature of Git. Just
>> because *you* make that merge at a certain point does not necessarily mean
>> that I make it at that point, too.
>>
>> Any approach that tries to have one single point of conversion will most
>> likely fall short of a solution.
>
> OK I see the difference in our views now. To me an sha256 repo would
> see an sha1 repo as a _foreign_ DVCS, pretty much like git sees
> mercurial now. So a transition from sha1 to sha256 is not that
> different from cvs -> svn -> a dvcs bubble -> git.
>
>> To be honest, I am less concerned about the GPG-signed commits (after all,
>> after switching to a more secure hash algorithm, a maintainer could
>> cross-sign all signed commits, or only the branch tips or tags, as new
>> tags, to reinstitute trust).
>>
>> I am much more concerned about references to commits, both inside and
>> outside the repository. That is, if I read anywhere on the internet about
>> Git having added support for `git add --chmod=+x ` in 4e55ed3 (add:
>> add --chmod=+x / --chmod=-x options, 2016-05-31), I want to find that
>> commit by that reference.
>>
>> And I am of course concerned what should happen if a user wants to fetch
>> from, or push to, a SHA-1-hashed remote repository into, or from, a
>> SHA-256-hashed local one.
>
> to follow the above, in my view, interaction with sha1 repos go
> through some conversion bridges like what we have with hg and svn. I
> don't know if we are going this route. It's certainly simpler and
> people already have experiences (from previous migration) to prepare
> for it.

When treating the SHA1 version as a foreign dvcs and the SHA256
as the real deal, we could introduce "pointer objects", and during the
conversion
create a 4e55ed3 pointer that points to the SHA256 commit of (add:
add --chmod=+x / --chmod=-x options, 2016-05-31). Ideally we would
not even expose this sort of object a lot, e.g. git show  would just
redirect automatically. Instead of a new class of "pointer objects" we could
also solve this via a lot of refs. (refs/old-sha1/4e55ed3 pointing to
the converted
commit; Though we would need to accept partial refs names then :/)

> --
> Duy
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/4] submodule: update core.worktree using git-config

2016-07-20 Thread Stefan Beller
On Wed, Jul 20, 2016 at 10:24 AM, Nguyễn Thái Ngọc Duy
 wrote:
> To access a separate repository, the first step should be read its
> config file to determine if this repository layout is supported or
> not, or if we understand all repo extensions, of it is a linked
> worktree. Only then should know where to update the config file.
>
> Unfortunately, our C code base is not ready for doing all that in the
> same process. The repo detection is not meant to be used for peeking
> in other repository, and config code would read config.worktree that
> is in _current_ $GIT_DIR.
>
> For now, let's spawn a new process and let all that done separately.
>
> PS. submodule-helper also updates core.worktree. But in there, we
> create a new clone, we know what is the initial repository layout, so
> we know we can simply update "config" file without risks.

Right, the submodule--helper update_clone should not be required to convert.

>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  submodule.c | 16 +++-
>  1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/submodule.c b/submodule.c
> index abc2ac2..b912871 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1128,7 +1128,9 @@ void connect_work_tree_and_git_dir(const char 
> *work_tree, const char *git_dir)
>  {
> struct strbuf file_name = STRBUF_INIT;
> struct strbuf rel_path = STRBUF_INIT;
> +   struct strbuf path = STRBUF_INIT;
> const char *real_work_tree = xstrdup(real_path(work_tree));
> +   struct child_process cp = CHILD_PROCESS_INIT;
>
> /* Update gitfile */
> strbuf_addf(&file_name, "%s/.git", work_tree);
> @@ -1136,13 +1138,17 @@ void connect_work_tree_and_git_dir(const char 
> *work_tree, const char *git_dir)
>relative_path(git_dir, real_work_tree, &rel_path));
>
> /* Update core.worktree setting */
> -   strbuf_reset(&file_name);
> -   strbuf_addf(&file_name, "%s/config", git_dir);
> -   git_config_set_in_file(file_name.buf, "core.worktree",
> -  relative_path(real_work_tree, git_dir,
> -&rel_path));
> +   strbuf_addstr(&path, relative_path(real_work_tree, git_dir,
> +  &rel_path));
> +   cp.git_cmd = 1;
> +   argv_array_pushl(&cp.args, "-C", work_tree, NULL);
> +   argv_array_pushl(&cp.args, "--work-tree", ".", NULL);
> +   argv_array_pushl(&cp.args, "config", "core.worktree", path.buf, NULL);
> +   if (run_command(&cp) < 0)
> +   die(_("failed to update core.worktree for %s"), git_dir);

Do we need to make this conditional on the extensions.worktreeConfig
variable, though? When I just run

git config --worktree . foo bar
fatal: Per-worktree configuration requires extensions.worktreeConfig
Please read section CONFIGURATION in `git help worktree` before
enabling it.

which would trigger the failure here?

>
> strbuf_release(&file_name);
> +   strbuf_release(&path);
> strbuf_release(&rel_path);
> free((void *)real_work_tree);
>  }
> --
> 2.9.1.566.gbd532d4
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-20 Thread Stefan Beller
On Wed, Jul 20, 2016 at 10:24 AM, Nguyễn Thái Ngọc Duy
 wrote:
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  Documentation/git-worktree.txt | 8 
>  git-submodule.sh   | 8 
>  2 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> index 41350db..2a5661d 100644
> --- a/Documentation/git-worktree.txt
> +++ b/Documentation/git-worktree.txt
> @@ -142,6 +142,14 @@ to share to all working directories:
> you are sure you always use sparse checkout for all working
> directories.
>
> + - `submodule.*` in current state should not be shared because the
> +   information is tied to a particular version of .gitmodules in a
> +   working directory.

While the submodule.* settings are copied from the .gitmodules file initially,
they can be changed in the config later. (That was actually the whole
point of it,
so you can change the submodule remotes URL without having to change history.)

And I would think that most submodule related settings (such as remote URL,
name, path, even depth recommendation) should be the same for all worktrees,
and a different value for one worktree is a carefully crafted
exception by the user.

So while the .gitmodules file can diverge in the work trees I do not
think that the
actual remotes for the submodules in the different worktrees differ
though. The change
of the .gitmodule files may be because you checked out an old commit, that
has outdated information on where to get the submodule from.

> +
> + - `remote.*` added by submodules may be per working directory as
> +   well, unless you are sure remotes from all possible submodules in
> +   history are consistent.
> +
>  DETAILS
>  ---
>  Each linked working tree has a private sub-directory in the repository's
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 4ec7546..7b576f5 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -261,7 +261,7 @@ or you are unsure what this means choose another name 
> with the '--name' option."
> esac
> ) || die "$(eval_gettext "Unable to checkout submodule 
> '\$sm_path'")"
> fi
> -   git config submodule."$sm_name".url "$realrepo"
> +   git config --worktree submodule."$sm_name".url "$realrepo"

This is in cmd_add. Actually I think this should be --not-worktree
(i.e. --local)
as when you add a submodule in one worktree, and then in another,
you may want to have the same URL. However if another worktree
already configured it you want to keep the option.
so rather:

  if git config  submodule."$sm_name".url then
  # it exists, do nothing
  else
# it does not exist
git config --local ...

>
> git add $force "$sm_path" ||
> die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
> @@ -461,7 +461,7 @@ Submodule work tree '\$displaypath' contains a .git 
> directory
> # Remove the whole section so we have a clean state 
> when
> # the user later decides to init this submodule again
> url=$(git config submodule."$name".url)
> -   git config --remove-section submodule."$name" 
> 2>/dev/null &&
> +   git config --worktree --remove-section 
> submodule."$name" 2>/dev/null &&
> say "$(eval_gettext "Submodule '\$name' (\$url) 
> unregistered for path '\$displaypath'")"

This is in cmd_deinit, which is documented as:
   Unregister the given submodules, i.e. remove the whole
   submodule.$name section from .git/config together with their work
   tree. Further calls to git submodule update, git submodule foreach
   and git submodule sync will skip any unregistered submodules until
   they are initialized again, so use this command if you don’t want
   to have a local checkout of the submodule in your work tree
   anymore. If you really want to remove a submodule from the
   repository and commit that use git-rm(1) instead.

So one might wonder if the unregister should be a global unregister
or a worktree specific unregister.

>From a users POV there are:
* non existent submodules (no gitlink recorded, no config set,
  no repo in place)
* not initialized submodules (gitlink is recorded, no config set,
  and an empty repo is put in the working tree as a place holder).
* initialized submodules (gitlink is recorded, the config
  submodule ..url is copied from the .gitmodules file to .git/config.
  an empty dir in the working tree as a place holder)
  A user may change the configuration before the next step as the url in
  the .gitmodules file may be wrong and the user doesn't want to
  rewrite history
* existing submodules (gitlink is recorded, the config option is set
  and instead of an empty placeholder dir, we actually have a git
  repo there.)
* matching submodules (the recorded git link matches
  the 

[PATCH] submodule: do no re-read name in shell script

2016-07-20 Thread Stefan Beller
Instead of making another call to a submodule helper (name), just
propagate the value when we know it (in the update-clone helper) already.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 4 ++--
 git-submodule.sh| 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index b22352b..494e088 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -683,9 +683,9 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
needs_cloning = !file_exists(sb.buf);
 
strbuf_reset(&sb);
-   strbuf_addf(&sb, "%06o %s %d %d\t%s\n", ce->ce_mode,
+   strbuf_addf(&sb, "%06o %s %d %d %s\t%s\n", ce->ce_mode,
sha1_to_hex(ce->sha1), ce_stage(ce),
-   needs_cloning, ce->name);
+   needs_cloning, sub->name, sub->path);
string_list_append(&suc->projectlines, sb.buf);
 
if (!needs_cloning)
diff --git a/git-submodule.sh b/git-submodule.sh
index 4ec7546..e23aada 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -584,11 +584,10 @@ cmd_update()
"$@" || echo "#unmatched"
} | {
err=
-   while read mode sha1 stage just_cloned sm_path
+   while read mode sha1 stage just_cloned name sm_path
do
die_if_unmatched "$mode"
 
-   name=$(git submodule--helper name "$sm_path") || exit
url=$(git config submodule."$name".url)
branch=$(get_submodule_config "$name" branch master)
if ! test -z "$update"
-- 
2.9.2.369.g0e67330

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] submodule--helper: correct index computation

2016-07-20 Thread Stefan Beller
In 665b35eccd39 (2016-06-09, submodule--helper: initial clone learns
retry logic), the index computation for the second round is wrong; it
should make use of the index that was handed in via the idx_task_cb
pointer. When that commit was introduced, I wrote:

> I wonder how we can test this properly as the git binary we have here
> is too reliable, but I observed the correctness of this patch when
> cloning a repo with lots of submodules and one of them failing.

This was only partially true as I did not observe the second try failing,
only the first retry succeeding. Testing however is really easy, when the
url of one submodule is bogus, so add a test for that.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c |  4 +++-
 t/t7406-submodule-update.sh | 16 
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 494e088..5d54885 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -795,7 +795,9 @@ static int update_clone_task_finished(int result,
suc->failed_clones[suc->failed_clones_nr++] = ce;
return 0;
} else {
-   idx = suc->current - suc->list.nr;
+   idx -= suc->list.nr;
+   if (idx >= suc->failed_clones_nr)
+   die("BUG: idx too large???");
ce  = suc->failed_clones[idx];
strbuf_addf(err, _("Failed to clone '%s' a second time, 
aborting"),
ce->name);
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 88e9750..ffb329f 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -889,4 +889,20 @@ test_expect_success 'git clone passes the parallel jobs 
config on to submodules'
rm -rf super4
 '
 
+cat << EOF >expect
+Submodule 'deeper/submodule' (bogus-url) registered for path 'deeper/submodule'
+fatal: clone of 'bogus-url' into submodule path '$pwd/super4/deeper/submodule' 
failed
+Failed to clone 'deeper/submodule'. Retry scheduled
+fatal: clone of 'bogus-url' into submodule path '$pwd/super4/deeper/submodule' 
failed
+Failed to clone 'deeper/submodule' a second time, aborting
+EOF
+
+test_expect_success 'correct message for retries' '
+   test_when_finished "rm -rf super4" &&
+   git -C super config -f .gitmodules submodule."deeper/submodule".url 
bogus-url &&
+   git -C super commit -a -m "bogus url for one submodule" &&
+   test_must_fail git clone --recurse-submodules -j 1 super super4 2>&1 | 
grep deeper >actual &&
+   test_cmp expect actual
+'
+
 test_done
-- 
2.9.2.370.g4a59376.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] submodule: do no re-read name in shell script

2016-07-20 Thread Stefan Beller
On Wed, Jul 20, 2016 at 6:00 PM, Jonathan Nieder  wrote:

>> - needs_cloning, ce->name);
>> + needs_cloning, sub->name, sub->path);
>
> Are there any restrictions on what characters a submodule name can
> contain?  Does this need e.g. to be quoted with sq_quote?

Oh good point, this ought to fail with weird names. So instead of quoting it,
let's just drop this patch.

>
> Thanks,
> Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] status: suggest 'git merge --abort' when appropriate

2016-07-21 Thread Stefan Beller
On Thu, Jul 21, 2016 at 5:58 AM, Matthieu Moy  wrote:
> We already suggest 'git rebase --abort' during a conflicted rebase.
> Similarly, suggest 'git merge --abort' during conflict resolution on
> 'git merge'.

I think this is a good addition.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-21 Thread Stefan Beller
FYI: I started working on a series that decouples existence of a
submodule from the URL
as a preparatory series to this one. Then we can have the same URL in
all working trees, but
the existence is configured differently for each working tree.

I'll try to send it out tomorrow.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 12:32 AM, Jens Lehmann  wrote:
> Am 21.07.2016 um 01:22 schrieb Stefan Beller:
>>
>> So maybe we want to drop that series and first talk about a migration plan
>> from
>> the current state to a world where we have the existence depending not
>> on the url
>> parameter, but a boolean variable submodule...
>> Depending on  a submodule would be ignored or tried to checkout
>> in e.g. `submodule update`
>
>
> Whoa, that's a very intrusive change with a ton of compatibility
> problems waiting to happen. Maybe its simpler to make "git submodule
> sync" aware of worktrees and error out with an "you cannot use
> submodules with different URLs in a worktree scenario" error when
> the URL is going to change? That should make most use cases work
> while avoiding the problematic ones.

I think fixing sync alone is just a drop of water on the oven.
Actually I can think of scenarios that have different URLs for different
worktrees (think of the automatic CI thing that should only fetch from
an internal server, whereas the dev-checkout fetches from upstream)
Actually each config variable (including the update strategy as you
mention below, but also the depth, branch, path) may be different in
one work tree.

I do not want to forbid the existence of different settings (URLs)
per worktree. Rather I think a different setting is a user decision,
hence they will want to run "git config --worktree ..."

And one of the unfortunate things is the coupling of existence of a
submodule and the URL. If that were to be decoupled, you could do
a "git config --worktree submodule..exists true" (or it is wrapped
fancily in "git submodule init") and the URL would not have to be copied
from the .gitmodules file.

I agree that this is a breaking change, which is why I'd guard it with
a config option such that the user can make the choice if they want
to go with the old behavior or the new behavior.


>
>> If we want to move the current behavior of submodules forward, we
>> would want to have
>> anything but the url as shared variables and then use the url variable
>> as a per-worktree
>> existence flag.
>
>
> Without having though deeply about all submodule variables, I see
> them as worktree specific. E.g. "update=none" is used on our CI-
> Server to avoid the disk space cost on some checkouts of a certain
> superproject while using "update=checkout" on others where their
> content is needed.

But this is a conscious user choice, so you would have configured
that on a per-worktree basis anyway?
i.e. it seems to me as if "update=checkout" is a default that is good
for all but one worktree, so why would you want to configure that n times
instead of just once as default?
The non default behavior is then overwritten in the specific worktree.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 10:09 AM, Duy Nguyen  wrote:
>
> I just quickly glanced through the rest of this mail because, as a
> submodule ignorant, it's just mumbo jumbo to me. But what I see here
> is, there may be problems if we choose to share some submodule info,
> but I haven't seen any good thing from sharing any submodule info at
> all.

Okay. :(

I assume the sharing is beneficial. (As a work-tree ignorant) I thought
we had this main work tree, which also holds the repository, whereas
the other working trees have a light weight implementation (e.g. just
a .git file pointing back to the main working tree/git dir).

So in a way my mental model is more like the config sharing here:
You can configure things in ~/.gitconfig for example that have effects
on more than one repo. Similarly you would want to configure things
in one repo, that has effect on more than one working tree?

And my assumption was to have the repository specific parts be shared,
whereas the working tree specific things should not be shared.

By working tree specific I strongly mean:

* existence in the working tree
* the checked out sha1
* submodule.$name.path

By repository specific I strongly mean:

* the submodule URL

I am not sure about:

* submodule.$name.update, submodule.$name.ignore,
   submodule.$name.branch,
  These have to be able to be different across working trees, but do we
  require them to be set for each working tree individually?  I thought a
  repo wide setup with defaults may be ok?

>
> I can imagine long term you may want to just clone a submodule repo
> once and share it across worktrees that use it, so maybe it's just me
> not seeing things and this may be a step towards that.

Just as Junio put it:
> I agree that when a top-level superproject has multiple worktrees
> these multiple worktrees may want to have the same submodule in
> different states, but I'd imagine that they want to share the same
> physical repository (i.e. $GIT_DIR/modules/$name of the primary
> worktree of the superproject)---is everybody involved in the
> discussion share this assumption?

I agree with that as well.

>
> Anyway, I assume some people will be working on the submodule side.

Once the discussion comes to a rough agreement, I'll give it a shot.

> And because I have not heard any bad thing about the new config
> design, I'm going to drop submodule patches from this series and focus
> on polishing config stuff.

Oh, sorry for not focusing on that part. The design of git config --worktree
is sound IMO.

> --
> Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 9:55 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> From a users POV there are:
>> * non existent submodules (no gitlink recorded, no config set,
>>   no repo in place)
>> * not initialized submodules (gitlink is recorded, no config set,
>>   and an empty repo is put in the working tree as a place holder).

I meant empty directory, not empty repo.

>
> This is no different from what you later call "embedded".  The only
> difference is that embedded thing hasn't seen its initial commit.

That did not occur to me.
The "not initialized" is what you'd get via

git clone --no-recurse repo-with-submodules

whereas the "embedded" could come from

   git clone  tmp
   cd tmp && git clone 

>
>> * initialized submodules (gitlink is recorded, the config
>>   submodule ..url is copied from the .gitmodules file to .git/config.
>>   an empty dir in the working tree as a place holder)
>>   A user may change the configuration before the next step as the url in
>>   the .gitmodules file may be wrong and the user doesn't want to
>>   rewrite history
>
> i.e. what "submodule init" gives you.

Right.

>
>> * existing submodules (gitlink is recorded, the config option is set
>>   and instead of an empty placeholder dir, we actually have a git
>>   repo there.)
>
> i.e. what "submodule update" after "submodule init" gives you.

Right.

>
>> * matching submodules (the recorded git link matches
>>   the actual checked out state of the repo!, config option and repo exist)
>
> Is this any different from "existing" case for the purpose of
> discussing the interaction between a submodule (and its checkout)
> and having possibly multiple worktrees of its superproject?

I don't think so.

>
> I agree that when a top-level superproject has multiple worktrees
> these multiple worktrees may want to have the same submodule in
> different states, but I'd imagine that they want to share the same
> physical repository (i.e. $GIT_DIR/modules/$name of the primary
> worktree of the superproject)---is everybody involved in the
> discussion share this assumption?

At least me agrees.

>
> Assuming that everybody is on the same page, that means "do we have
> the repository for that submodule, and if so where in our local
> filesystem?" is a bit of information shared across the worktrees of
> the superproject.  And the "name" used to identify the submodule is
> also shared across these worktrees of the superproject, as it is
> meant to be a unique (within the superproject) identifier for that
> "other" project it uses, no matter where in the superproject's
> working tree (note: this is "working tree", not "worktree") it would
> be checked out, and where the upstream URL to get further updates to
> the submodule is (i.e. that URL may change over time if they relocate,
> or it may even change when the user who works on the superproject
> decides to use a different mirror).

I agree.

>
> What can be different between the instantiation of the same
> submodule in these multiple worktrees, and how they should be
> recorded?
>
>  * submodule.$name.URL?  I am not sure if we want to have different
>"upstreams" depending on the worktree of the superproject.  While
>there is no fundamental reason to forbid it, having to maintain a
>single local repository for a submodule would mean they would
>need to be treated as separate "remotes" in the submodule
>repository.

You can only have a remote if the the submodule repo exists already.
I guess that can be made a requirement.

So setting up the worktrees and submodule URLs in the config and
then doing the clone of said submodule is maybe not encouraged.

>
>  * submodule.$name.path of course can be different depending on
>which commit of the superproject is checked out in the worktree,
>as the superproject may move the submodule binding site across
>its versions.

Right.

>
>  * submodule.$name.update, submodule.$name.ignore,
>submodule.$name.branch, etc. would need to be all different among
>worktrees of the superproject, as that is the whole point of
>being able to work on separate branches of the superproject in
>separate worktrees.

What do you mean by "would need". The ability to be different or rather
the veto of an 'inheritance' of defaults from the repository configuration?

>
> Somewhere in this discussion thread, you present the conclusion of
> your discussion with Jonathan Nieder that there needs a separate
> "should the submodule directory 

Re: [PATCH 1/2] git-submodule: forward exit code of git-submodule--helper more faithfully

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 12:14 PM, Johannes Sixt  wrote:
> git-submodule--helper is invoked as the upstream of a pipe in several
> places. Usually, the failure of a program in this position is not
> detected by the shell. For this reason, the code inserts a token in the
> output stream when git-submodule--helper fails that is detected
> downstream, where the shell script is quit with exit code 1.
>
> There happens to be a bug in git-submodule--helper that leads to a
> segmentation fault. The test suite triggers the crash in several places,
> all of which are protected by 'test_must_fail'. But due to the inspecific
> exit code 1, the crash remains undiagnosed.
>
> Extend the failure protocol such that git-submodule--helper's exit code
> is passed downstream (only in the case of failure). This enables the
> downstream to use it as its own exit code, and 'test_must_fail' to
> identify the segmentation fault as an unexpected failure.
>
> The bug itself is fixed in the next commit.
>
> Signed-off-by: Johannes Sixt 
> ---
> When you run ./t7400-submodule-basic.sh -v, you will notice this output:
>
> fatal: destination path '/home/jsixt/Src/git/git/t/trash 
> directory.t7400-submodule-basic/init' already exists and is not an empty 
> directory.
> fatal: clone of './.subrepo' into submodule path 
> '/home/jsixt/Src/git/git/t/trash directory.t7400-submodule-basic/init' failed
> Failed to clone 'init'. Retry scheduled
> fatal: destination path '/home/jsixt/Src/git/git/t/trash 
> directory.t7400-submodule-basic/init' already exists and is not an empty 
> directory.
> fatal: clone of './.subrepo' into submodule path 
> '/home/jsixt/Src/git/git/t/trash directory.t7400-submodule-basic/init' failed
> /home/jsixt/Src/git/git/git-submodule: line 494: 21757 Segmentation fault 
>  git submodule--helper update-clone ${GIT_QUIET:+--quiet} 
> ${wt_prefix:+--prefix "$wt_prefix"} ${prefix:+--recursive-prefix "$prefix"} 
> ${update:+--update "$update"} ${reference:+--reference "$reference"} 
> ${depth:+--depth "$depth"} ${recommend_shallow:+"$recommend_shallow"} 
> ${jobs:+$jobs} "$@"
> ok 32 - update should fail when path is used by a file
>
> Note the segmentation fault. This mini-series addresses the issue.

The segfault has been addressed in
http://thread.gmane.org/gmane.comp.version-control.git/25
but received no attention yet.
The propagation of the exit code makes sense nevertheless.

Thanks!


>
> Noticed on Windows because it "visualizes" segfaults even for
> command line programs.
>
>  git-submodule.sh| 22 +++---
>  t/t5815-submodule-protos.sh |  4 ++--
>  t/t7400-submodule-basic.sh  |  4 ++--
>  3 files changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 4ec7546..0a0e12d 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -49,7 +49,7 @@ die_if_unmatched ()
>  {
> if test "$1" = "#unmatched"
> then
> -   exit 1
> +   exit ${2:-1}
> fi
>  }
>
> @@ -312,11 +312,11 @@ cmd_foreach()
>
> {
> git submodule--helper list --prefix "$wt_prefix" ||
> -   echo "#unmatched"
> +   echo "#unmatched" $?
> } |
> while read mode sha1 stage sm_path
> do
> -   die_if_unmatched "$mode"
> +   die_if_unmatched "$mode" "$sha1"
> if test -e "$sm_path"/.git
> then
> displaypath=$(git submodule--helper relative-path 
> "$prefix$sm_path" "$wt_prefix")
> @@ -423,11 +423,11 @@ cmd_deinit()
>
> {
> git submodule--helper list --prefix "$wt_prefix" "$@" ||
> -   echo "#unmatched"
> +   echo "#unmatched" $?
> } |
> while read mode sha1 stage sm_path
> do
> -   die_if_unmatched "$mode"
> +   die_if_unmatched "$mode" "$sha1"
> name=$(git submodule--helper name "$sm_path") || exit
>
> displaypath=$(git submodule--helper relative-path "$sm_path" 
> "$wt_prefix")
> @@ -581,12 +581,12 @@ cmd_update()
> ${depth:+--depth "$depth"} \
> ${recommend_shallow:+"$recommend_shallow"} \
> ${jobs:+$jobs} \
> -   "$@" || echo "#unmatched"
> +   "$@" || echo "#unmatched" $?
> } | {
> err=
> while read mode sha1 stage just_cloned sm_path
> do
> -   die_if_unmatched "$mode"
> +   die_if_unmatched "$mode" "$sha1"
>
> name=$(git submodule--helper name "$sm_path") || exit
> url=$(git config submodule."$name".url)
> @@ -994,11 +994,11 @@ cmd_status()
>
> {
> git submodule--helper list --prefix "$wt_prefix" "$@" ||
> -   echo "#unmatched"
> +   echo "#unmatched" $?
> } |
> while read mode sha1 stage sm_path
> do
> -   die_i

Re: [PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 12:15 PM, Johannes Sixt  wrote:
> 'git submodule--helper update-clone' has logic to retry failed clones
> a second time. For this purpose, there is a list of submodules to clone,
> and a second list that is filled with the submodules to retry. Within
> these lists, the submodules are identified by an index as if both lists
> were just appended.
>
> This works nicely except when the second clone attempt fails as well. To
> report an error, the identifying index must be adjusted by an offset so
> that it can be used as an index into the second list. However, the
> calculation uses the logical total length of the lists so that the result
> always points one past the end of the second list.
>
> Pick the correct index.
>
> Signed-off-by: Johannes Sixt 
> ---
>  builtin/submodule--helper.c | 2 +-
>  t/t5815-submodule-protos.sh | 4 ++--
>  t/t7400-submodule-basic.sh  | 4 ++--
>  3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index b22352b..6f6d67a 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -795,7 +795,7 @@ static int update_clone_task_finished(int result,
> suc->failed_clones[suc->failed_clones_nr++] = ce;
> return 0;
> } else {
> -   idx = suc->current - suc->list.nr;
> +   idx -= suc->list.nr;

The fix is the same as in
http://thread.gmane.org/gmane.comp.version-control.git/25
There we have an additional check, which may make sense to use here as well,
specifically when having the patch 1 which propagates the exit code.

The approach to tests is different though. I like yours better than mine,
as it doesn't add more tests, but strengthens existing tests.

> ce  = suc->failed_clones[idx];
> strbuf_addf(err, _("Failed to clone '%s' a second time, 
> aborting"),
> ce->name);
> diff --git a/t/t5815-submodule-protos.sh b/t/t5815-submodule-protos.sh
> index 112cf40..06f55a1 100755
> --- a/t/t5815-submodule-protos.sh
> +++ b/t/t5815-submodule-protos.sh
> @@ -18,7 +18,7 @@ test_expect_success 'setup repository with submodules' '
> git commit -m "add submodules"
>  '
>
> -test_expect_failure 'clone with recurse-submodules fails' '
> +test_expect_success 'clone with recurse-submodules fails' '
> test_must_fail git clone --recurse-submodules . dst
>  '
>
> @@ -32,7 +32,7 @@ test_expect_success 'update of ssh allowed' '
> git -C dst submodule update ssh-module
>  '
>
> -test_expect_failure 'update of ext not allowed' '
> +test_expect_success 'update of ext not allowed' '
> test_must_fail git -C dst submodule update ext-module
>  '
>
> diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
> index 7c8b90b..b77cce8 100755
> --- a/t/t7400-submodule-basic.sh
> +++ b/t/t7400-submodule-basic.sh
> @@ -352,7 +352,7 @@ test_expect_success 'sync should fail with unknown 
> submodule' '
> test_failure_with_unknown_submodule sync
>  '
>
> -test_expect_failure 'update should fail when path is used by a file' '
> +test_expect_success 'update should fail when path is used by a file' '
> echo hello >expect &&
>
> echo "hello" >init &&
> @@ -361,7 +361,7 @@ test_expect_failure 'update should fail when path is used 
> by a file' '
> test_cmp expect init
>  '
>
> -test_expect_failure 'update should fail when path is used by a nonempty 
> directory' '
> +test_expect_success 'update should fail when path is used by a nonempty 
> directory' '
> echo hello >expect &&
>
> rm -fr init &&
> --
> 2.9.0.443.ga8520ad
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 12:39 PM, Junio C Hamano  wrote:
> Johannes Sixt  writes:
>
>> 'git submodule--helper update-clone' has logic to retry failed clones
>> a second time. For this purpose, there is a list of submodules to clone,
>> and a second list that is filled with the submodules to retry. Within
>> these lists, the submodules are identified by an index as if both lists
>> were just appended.
>>
>> This works nicely except when the second clone attempt fails as well. To
>> report an error, the identifying index must be adjusted by an offset so
>> that it can be used as an index into the second list. However, the
>> calculation uses the logical total length of the lists so that the result
>> always points one past the end of the second list.
>>
>> Pick the correct index.
>>
>> Signed-off-by: Johannes Sixt 
>> ---
>
> Looks very similar to
>
>   http://public-inbox.org/git/20160721013923.17435-1-sbeller%40google.com/
>
> but these two patch series looks more thorough.
>
> I expect I'd queue these two instead, after seeing Acks from Stefan.
>
> Thanks.

Sure. Please take these 2 patches instead of mine.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 12:52 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> The approach to tests is different though. I like yours better than mine,
>> as it doesn't add more tests, but strengthens existing tests.
>
> So... are you retracting
> http://thread.gmane.org/gmane.comp.version-control.git/25 and
> instead giving an Ack to these two?
>

I like this series better
* for the approach
* for the tests
* for the commit message

So I do think this should be applied instead of what I sent.

I am tempted to send a squash proposal like:
(broken whitespaces)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6f6d67a..77be97e 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -796,6 +796,8 @@ static int update_clone_task_finished(int result,
  return 0;
  } else {
  idx -= suc->list.nr;
+ if (idx >= suc->failed_clones_nr)
+ die("BUG: idx too large???");
  ce  = suc->failed_clones[idx];
  strbuf_addf(err, _("Failed to clone '%s' a second time, aborting"),
 ce->name);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Documentation: pack-protocol correct NAK response

2016-07-22 Thread Stefan Beller
In the transport protocol we use NAK to signal the non existence of a
common base, so fix the documentation. This helps readers of the document,
as they don't have to wonder about the difference between NAK and NACK.
As NACK is used in git archive and upload-archive, this is easy to get
wrong.

Signed-off-by: Stefan Beller 
---
 Documentation/technical/pack-protocol.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..d40ab65 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -307,7 +307,7 @@ In multi_ack mode:
 ready to make a packfile, it will blindly ACK all 'have' obj-ids
 back to the client.
 
-  * the server will then send a 'NACK' and then wait for another response
+  * the server will then send a 'NAK' and then wait for another response
 from the client - either a 'done' or another list of 'have' lines.
 
 In multi_ack_detailed mode:
-- 
2.9.2.370.g4a59376.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-25 Thread Stefan Beller
On Fri, Jul 22, 2016 at 10:42 AM, Duy Nguyen  wrote:
> On Fri, Jul 22, 2016 at 7:25 PM, Stefan Beller  wrote:
>> On Fri, Jul 22, 2016 at 10:09 AM, Duy Nguyen  wrote:
>>>
>>> I just quickly glanced through the rest of this mail because, as a
>>> submodule ignorant, it's just mumbo jumbo to me. But what I see here
>>> is, there may be problems if we choose to share some submodule info,
>>> but I haven't seen any good thing from sharing any submodule info at
>>> all.
>>
>> Okay. :(
>
> Didn't mean to make you feel sad :)

I was using the :( a bit carelessly here. I was quite surprised that you
"haven't seen any good thing from sharing any submodule info at all."

So what is the design philosophy in worktrees? How much independence does
one working tree have?

So here is what I did:
 *  s/git submodule init/git submodule update --init/
 * added a test_pause to the last test on the last line
 * Then:

$ find . |grep da5e6058
./addtest/.git/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
./addtest/.git/worktrees/super-elsewhere/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
./addtest/.git/worktrees/super-elsewhere/modules/submod2/objects/08/da5e6058267d6be703ae058d173ce38ed53066
./.git/objects/08/da5e6058267d6be703ae058d173ce38ed53066

The last entry is the "upstream" for the addtest clone, so that is fine.
However inside the ./addtest/ (and its worktrees, which currently are
embedded in there?) we only want to have one object store for a given
submodule?

>
>> I assume the sharing is beneficial. (As a work-tree ignorant) I thought
>> we had this main work tree, which also holds the repository, whereas
>> the other working trees have a light weight implementation (e.g. just
>> a .git file pointing back to the main working tree/git dir).
>
> The main worktree is special for historical reason. But from the user
> point of view (and even developer's at a certain level) they should be
> treated equally. Think of it like cloning the same repo multiple
> times. Only now you save disk space because there's only one object
> database.

That's what we want for submodules too, see above?

>
>> So in a way my mental model is more like the config sharing here
>> You can configure things in ~/.gitconfig for example that have effects
>> on more than one repo. Similarly you would want to configure things
>> in one repo, that has effect on more than one working tree?
>>
>> And my assumption was to have the repository specific parts be shared,
>> whereas the working tree specific things should not be shared.
>
> I think that's a good assumption. Although I would rather be not
> sharing by default and let the user initiate it when they want to
> share something. Like ~/..gitconfig, we never write anything there
> unless the user asks us to explicitly (with git config --user).
> Accidental share could have negative effect.

Okay, got it.

>
>>> I can imagine long term you may want to just clone a submodule repo
>>> once and share it across worktrees that use it, so maybe it's just me
>>> not seeing things and this may be a step towards that.
>>
>> Just as Junio put it:
>>> I agree that when a top-level superproject has multiple worktrees
>>> these multiple worktrees may want to have the same submodule in
>>> different states, but I'd imagine that they want to share the same
>>> physical repository (i.e. $GIT_DIR/modules/$name of the primary
>>> worktree of the superproject)---is everybody involved in the
>>> discussion share this assumption?
>>
>> I agree with that as well.
>
> Yeah. We have a long way to go though. As I see it, you may need ref
> namespace as well (so they look like separate clones), which has never
> been used on the client side before. Either that or odb alternates...
>
>>> And because I have not heard any bad thing about the new config
>>> design, I'm going to drop submodule patches from this series and focus
>>> on polishing config stuff.
>>
>> Oh, sorry for not focusing on that part. The design of git config --worktree
>> is sound IMO.
>
> This makes me happy (I know other people can still find flaws in it,
> and I'm ok with that). This config split thing has been wrecking my
> brain for a long time, find the the "right" way to do with minimum
> impacts :)

After playing with this series a bit more, I actually like the UI as it is an
easy mental model "submodules behave completely independent".

However in 3/4 you said:

+ - `submodule.*` in current state should not be shared because the
+   information is t

Re: [PATCH] Documentation, git submodule: Note about --reference

2016-07-25 Thread Stefan Beller
On Mon, Jul 25, 2016 at 5:43 PM, Junio C Hamano  wrote:
> On Mon, Jul 25, 2016 at 5:38 PM, Stefan Beller  wrote:
>> On Tue, Jul 19, 2016 at 4:45 PM, Stefan Beller  wrote:
>>> Signed-off-by: Stefan Beller 
>>> ---
>>
>> Any comment here?
>
> I personally found it Meh in the sense that those who read
> and followed the previous note would find it adding no new
> information, and to those who don't bother the additional
> note would not help very much because they would not
> understand (and more importantly, would not care) where
> that "this affects ALL submodules" comes from, or why
> that leads to "so it is only useful...".
>
> But it may be just me.

Ok, I guess we can just drop it then.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH]submodule deinit: remove outdated comment

2016-07-25 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---

  This is logically part of origin/sb/submodule-deinit-all, but this change
  failed to be there on time.
  
  As I am touching code in the near vincinity, I noticed.
  I do not include this in that series to come as I want to keep that
  series focused, so I send this separately.
  
  Thanks,
  Stefan

 git-submodule.sh | 2 --
 1 file changed, 2 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 42868df..a5e0ad6 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -391,8 +391,6 @@ cmd_init()
 #
 # Unregister submodules from .git/config and remove their work tree
 #
-# $@ = requested paths (use '.' to deinit all submodules)
-#
 cmd_deinit()
 {
# parse $args after "submodule ... deinit".
-- 
2.9.2.368.g08bb350

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation, git submodule: Note about --reference

2016-07-25 Thread Stefan Beller
On Tue, Jul 19, 2016 at 4:45 PM, Stefan Beller  wrote:
> Signed-off-by: Stefan Beller 
> ---

Any comment here?

>
>  Is it too obvious?
>  I was approached off list and this was only obvious after some discussion,
>  so I think it is a valid warning.
>
>  On the other hand this might show that we want to get worktree working with
>  submodules.
>
>  Documentation/git-submodule.txt | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index bf3bb37..dcbd460 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -373,6 +373,10 @@ the submodule itself.
>  +
>  *NOTE*: Do *not* use this option unless you have read the note
>  for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
> ++
> +*NOTE*: This gives the same reference to all submodules, so it is only useful
> +if you are tracking different versions of a project in submodules instead
> +of different projects.
>
>  --recursive::
> This option is only valid for foreach, update, status and sync 
> commands.
> --
> 2.9.2.800.g213104a
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/4] worktree: add per-worktree config files

2016-07-25 Thread Stefan Beller
On Wed, Jul 20, 2016 at 10:24 AM, Nguyễn Thái Ngọc Duy
 wrote:
> A new repo extension is added, worktreeConfig. When it is present:
>
>  - Repository config reading by default includes $GIT_DIR/config _and_
>$GIT_DIR/config.worktree. "config" file remains shared in multiple
>worktree setup.
>
>  - The special treatment for core.bare and core.worktree, to stay
>effective only in main worktree, is gone. These config files are
>supposed to be in config.worktree.
>
> This extension is most useful in multiple worktree setup because you
> now have an option to store per-worktree config (which is either
> .git/config.worktree for main worktree, or
> .git/worktrees/xx/config.worktree for linked ones).
>
> This extension can be used in single worktree mode, even though it's
> pretty much useless (but this can happen after you remove all linked
> worktrees and move back to single worktree).
>
> "git config" reads from both "config" and "config.worktree" by default
> (i.e. without either --user, --file...) when this extension is
> present. Default writes still go to "config", not "config.worktree". A
> new option --worktree is added for that (*).
>
> Since a new repo extension is introduced, existing git binaries should
> refuse to access to the repo (both from main and linked worktrees). So
> they will not misread the config file (i.e. skip the config.worktree
> part). They may still accidentally write to the config file anyway if
> they use with "git config --file ".
>
> This design places a bet on the assumption that the majority of config
> variables are shared so it is the default mode. A safer move would be
> default writes go to per-worktree file, so that accidental changes are
> isolated.
>
> (*) "git config --worktree" points back to "config" file when this
> extension is not present so that it works in any setup.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 

I like the user facing design, but how am I supposed to use it internally?

Say I want to read a value preferably from the worktree I'd do a
/*
 * maybe I don't even have to set it to 1 as
 * the user is supposed to do that?
 */
repository_format_worktree_config = 1;
git_config_get_{string,bool,int} (... as usual ...)

and if I want to read the value globally I would set the variable to 0
and read? (I would need to restore it, so I'll have a temporary variable
to keep the original value of repository_format_worktree_config)

Thanks,
Stefan


> ---
>  Documentation/config.txt   | 11 -
>  Documentation/git-config.txt   | 26 
>  Documentation/git-worktree.txt | 31 ++
>  Documentation/gitrepository-layout.txt |  8 
>  builtin/config.c   | 18 +++-
>  cache.h|  2 +
>  config.c   |  7 
>  environment.c  |  1 +
>  setup.c|  5 ++-
>  t/t2028-worktree-config.sh (new +x)| 77 
> ++
>  10 files changed, 175 insertions(+), 11 deletions(-)
>  create mode 100755 t/t2028-worktree-config.sh
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 16dc22d..7d64da0 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2,8 +2,9 @@ CONFIGURATION FILE
>  --
>
>  The Git configuration file contains a number of variables that affect
> -the Git commands' behavior. The `.git/config` file in each repository
> -is used to store the configuration for that repository, and
> +the Git commands' behavior. The files `.git/config` and optionally
> +`config.worktree` (see `extensions.worktreeConfig` below) are each
> +repository is used to store the configuration for that repository, and
>  `$HOME/.gitconfig` is used to store a per-user configuration as
>  fallback values for the `.git/config` file. The file `/etc/gitconfig`
>  can be used to store a system-wide default configuration.
> @@ -264,6 +265,12 @@ advice.*::
> show directions on how to proceed from the current state.
>  --
>
> +extensions.worktreeConfig::
> +   If set, by default "git config" reads from both "config" and
> +   "config.worktree" file in that order. In multiple working
> +   directory mode, "config" file is shared while
> +   "config.worktree" is per-working directory.
> +
>  core.fileMode::
> Tells Git if the executable bit of files in the working tree
> is to be honored.
> diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
> index f163113..9dfdb6a 100644
> --- a/Documentation/git-config.txt
> +++ b/Documentation/git-config.txt
> @@ -47,13 +47,15 @@ checks or transformations are performed on the value.
>
>  When reading, the values are read from the system, global and
>  repository local configuration files by default, and options
> -`--system`, `--global`, `--local` and `--file ` can be
> -used to tell the com

Re: [PATCH 1/2] fix passing a name for config from submodules

2016-07-26 Thread Stefan Beller
On Tue, Jul 26, 2016 at 2:49 AM, Heiko Voigt  wrote:

Thanks for continuing on the submodule cache!

> In commit 959b5455 we implemented the initial version of the submodule

Usually we refer to the commit by a triple of "abbrev. sha1 (date, subject).
See d201a1ecd (2015-05-21, test_bitmap_walk: free bitmap with bitmap_free)
for an example. Or ce41720ca (2015-04-02, blame, log: format usage strings
similarly to those in documentation).

Apparently we put the subject first and then the date. I always did it
the other way
round, to there is no strict coding guide line, though it helps a lot to have an
understanding for a) how long are we in the "broken" state already as well as
b) what was the rationale for introducing it.



> @@ -397,8 +397,10 @@ static const struct submodule *config_from(struct 
> submodule_cache *cache,
> return entry->config;
> }
>
> -   if (!gitmodule_sha1_from_commit(commit_sha1, sha1))
> +   if (!gitmodule_sha1_from_commit(commit_sha1, sha1, &rev)) {
> +   strbuf_release(&rev);
> return NULL;

This is a reoccuring pattern below. Maybe it might make sense to
just do a s/return.../ goto out/ and at that label we cleanup `rev` and `config`
and return a result value?
There are currently 6 early returns (not counting the 3 from the last switch),
4 of them return NULL, so that would result in just a "goto out", whereas 2
return an actual value, they would need to assign the result value first before
jumping out of the logic. I dunno, just food for though.



> @@ -425,8 +432,9 @@ static const struct submodule *config_from(struct 
> submodule_cache *cache,
> parameter.commit_sha1 = commit_sha1;
> parameter.gitmodules_sha1 = sha1;
> parameter.overwrite = 0;
> -   git_config_from_mem(parse_config, "submodule-blob", "",
> +   git_config_from_mem(parse_config, "submodule-blob", rev.buf,
> config, config_size, ¶meter);

Ok, this is the actual fix. Do you want to demonstrate its impact by adding
one or two tests that failed before and now work?
(As I was using the submodule config API most of the time with null_sha1
to indicate we'd be looking at the current .gitmodules file in the worktree,
the actual bug may have not manifested in the users of this API.
But still, it would be nice to see what was broken?)

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] submodule-config: combine error checking if clauses

2016-07-26 Thread Stefan Beller
On Tue, Jul 26, 2016 at 2:49 AM, Heiko Voigt  wrote:
> So we have less return handling code.
>
> Signed-off-by: Heiko Voigt 

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-26 Thread Stefan Beller
On Tue, Jul 26, 2016 at 10:20 AM, Duy Nguyen  wrote:
> On Tue, Jul 26, 2016 at 1:25 AM, Stefan Beller  wrote:
>> So what is the design philosophy in worktrees? How much independence does
>> one working tree have?
>
> git-worktree started out as an alternative for git-stash: hmm.. i need
> to make some changes in another branch, okay let's leave this worktree
> (with all its messy stuff) as-is, create another worktree, make those
> changes, then delete the worktree and go back here. There's already
> another way of doing that without git-stash: you clone the repo, fix
> your stuff, push back and delete the new repo.
>
> I know I have not really answered your questions. But I think it gives
> an idea what are the typical use cases for multiple worktrees. How
> much independence would need to be decided case-by-case, I think.

Thanks!


>
>> So here is what I did:
>>  *  s/git submodule init/git submodule update --init/
>>  * added a test_pause to the last test on the last line
>>  * Then:
>>
>> $ find . |grep da5e6058
>> ./addtest/.git/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
>> ./addtest/.git/worktrees/super-elsewhere/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
>> ./addtest/.git/worktrees/super-elsewhere/modules/submod2/objects/08/da5e6058267d6be703ae058d173ce38ed53066
>> ./.git/objects/08/da5e6058267d6be703ae058d173ce38ed53066
>>
>> The last entry is the "upstream" for the addtest clone, so that is fine.
>> However inside the ./addtest/ (and its worktrees, which currently are
>> embedded in there?) we only want to have one object store for a given
>> submodule?
>
> How to store stuff in .git is the implementation details that the user
> does not care about.

They do unfortunately. :(
Some teams here are trying to migrate from the repo[1] tool to submodules,
and they usually have large code bases. (e.g. The Android Open Source
Project[2], put into a superproject has a .git dir size of 17G. The
17G are partitioned as follows:

.../.git$ du --max-depth=1 -h
44K ./hooks
32K ./refs
36K ./logs
17G ./modules
4.0K ./branches
8.0K ./info
4.7M ./objects
17G .

i.e. roughly all in submodules.

So our users do care about both what is on disk, as well
as what goes over the wire (network traffic).

My sudden interest in worktrees came up when I learned the
`--reference` flag for submodule operations is broken for
our use case, and instead of fixing the `--reference` flag,
I think the worktree approach is generally saner (i.e. with the
references you may have nasty gc issues IIUC, but in the
worktree world gc knows about all the working trees, detached
heads and branches.)

[1] https://source.android.com/source/developing.html
[2] https://android.googlesource.com/

> As long as we keep the behavior the same (they
> can still "git submodule init" and stuff in the new worktree), sharing
> the same object store makes sense (pros: lower disk consumption, cons:
> none).

So I think the current workflow for submodules
may need some redesign anyway as the submodule
commands were designed with a strict "one working
tree only" assumption.

Submodule URLs  are stored in 3 places:
 A) In the .gitmodules file of the superproject
 B) In the option submodule..URL in the superproject
 C) In the remote.origin.URL in the submodule

A) is a recommendation from the superproject to make life
of downstream easier to find and setup the whole thing.
You can ignore that if you want, though generally a caring
upstream provides good URLs here.

C) is where we actually fetch from (and hope it has all
the sha1s that are recorded as gitlinks in the superproejct)

B) seems like a hack to enable the workflow as below:

Current workflow for handling submodule URLs:
 1) Clone the superproject
 2) Run git submodule init on desired submodules
 3) Inspect .git/config to see if any submodule URL needs adaption
 4) Run git submodule update to obtain the submodules from
the configured place
 5) In case of superproject adapting the URL
-> git submodule sync, which overwrites the submodule..URL in the
superprojects .git/config as well as configuring the
remote."$remote".url in the submodule
 6) In case of users desire to change the URL
-> No one command to solve it; possible workaround: edit
.gitmodules and git submodule sync, or configure  the submodule..URL
in the superprojects .git/config as well as configuring the
remote."$remote".url in
the submodule separately. Although just changing the submodules remote works
just as well (until you remove and re-clone the submodule)

One could imagine another workflow:
 1) clone the superproject, which creates empty repositories for the
submodules
 (2) from the prior workflow i

Re: git cherry-pick conflict error message is deceptive when cherry-picking multiple commits

2016-07-26 Thread Stefan Beller
> Would it be possible to expand the hint message to tell users to run
> 'git cherry-pick --continue'

Instead of expanding I'd go for replacing?

I'd say the user is tempted for 2 choices,
a) aborting (for various reasons)
b) fix and continue.

So we'd want to point out the way for those two ways and not
give an additional arbitrary way that is not quite (b) but goes that
direction?

I realize this is what the patch does, but I wanted to point out the difference
on expanding vs replacing. I think a 4 line hint is "enough", so I'd object
adding more lines, but changing existing lines is great!
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-27 Thread Stefan Beller
Jakub wrote:
> I think the problem with `--reference` is that it does not
> setup backreferences to prevent gc removing borrowed objects;
> which is a hard problem to solve, except for limited cases...
> like git-worktree.

Right. And instead of solving the reference problem, I'd
rather solve the worktree problem as I think it yields more?

>
>> So I think the current workflow for submodules
>> may need some redesign anyway as the submodule
>> commands were designed with a strict "one working
>> tree only" assumption.
>>
>> Submodule URLs  are stored in 3 places:
>>  A) In the .gitmodules file of the superproject
>>  B) In the option submodule..URL in the superproject
>>  C) In the remote.origin.URL in the submodule
>>
>> A) is a recommendation from the superproject to make life
>> of downstream easier to find and setup the whole thing.
>> You can ignore that if you want, though generally a caring
>> upstream provides good URLs here.
>
> Also, this URL might have change if the repository moves
> to other server; even when checking out ancient version
> we usually want to use current URL, not the one in currently
> checked-out .gitmodules file.

Right.

>
>> C) is where we actually fetch from (and hope it has all
>> the sha1s that are recorded as gitlinks in the superproject)
>
> Is it? Or is it only the case if you do `git fetch` or
> equivalent from within inside of submodule? You can fetch
> updates using `git submodule ...` from supermodule, isn't it?
> But I might be wrong here.

If you call `submodule update` in the  superproject
it actually just does a `(cd $submodule && git fetch)`.

And in the submodule we have a .git file pointing to
the superprojects ".git/modules//" which is a full
blown git dir, i.e. it has its own config, HEAD etc.

>
> Also: if .git file is gitfile link, do submodule even has
> it's own configuration file?

Yes they do.

>
>>
>> B) seems like a hack to enable the workflow as below:
>
> It has overloaded meaning, being used both for current URL
> of submodule as seen in supermodule, AND that submodule
> is checked out / needs to be checked out in the worktree
> of a supermodule.  There might be the case when you check
> out (in given worktree) a version of a supermodule that
> do not include submodule at all, but you want to know that
> when going back, this submodule is to be checked out (or not).

I am currently working on solving that with a patch series, that
allows 2 settings. The URL will be used only to overwrite the
URL from the .gitmodules file and another setting will be used
to determine if we want to checkout the submodule.

>
> The second information needs to be per-worktree. How to
> solve it, be it per-worktree configuration (not shared),
> or a special configuration variable, or worktree having
> unshared copy of configuration -- this what is discussed.

>
>> Current workflow for handling submodule URLs:
>>  1) Clone the superproject
>>  2) Run git submodule init on desired submodules
>
> Or 1-2) clone the superproject recursively, with all its
> submodules.

Only if the URLs are setup properly.

>
>>  3) Inspect .git/config to see if any submodule URL needs adaption
>
> Which is usually not needed.

Yeah, I should have added the assertion that the .gitmodules
may be out of date or such for this workflow to make sense.
Usually just go with recursive clone.

>>
>> This long lived stuff probably doesn't make sense for the a single
>> repository, but in combination with submodules (which is another way
>> to approach the "sparse/narrow" desire of a large project), I think
>> that makes sense, because the "continuous integration" shares a lot
>> of submodules with my "regular everyday hacking" or the "I need to
>> test my colleague work now" worktree.
>
> One thing that git-worktree would be very useful, if it could work
> with submodules: you could use separate worktrees to easily test
> if the supermodule works with and without its submodules present.

Oh! Yeah that makes sense!

>
> [...]
>> If you switch a branch (or to any sha1), the submodule currently stays
>> "as-is" and may be updated using "submodule update", which goes through
>> the list of existing (checked out) submodules and checks them out to the
>> sha1 pointed to by the superprojects gitlink.
>
> Which might be simply a problem that submodule UI is not mature enough.
> I would like to see automatic switch of submodule contents, if
> configured so.

Me too. Once upon a time Jens pushed for that with a series found at:
https://github.com/jlehmann/git-submod-enhancements/tree/git-checkout-recurse-submodules
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv1] completion: add option '--recurse-submodules' to 'git clone'

2016-07-27 Thread Stefan Beller
On Wed, Jul 27, 2016 at 10:32 AM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>>> I noticed --recurse-submodules was missing from the bash completion. This 
>>> adds
>>> it. I went for '--recurse-submodules' instead of '--recursive' as I seem to
>>> recall the former being agreed upon as the better (or least ambiguous) of 
>>> the
>>> two terms.
>>
>> Yup, that position is consistent with what 3446a54c (clone: fixup
>> recurse_submodules option, 2011-02-11) gave us.
>
> Silly me. The blame actually goes down to ccdd3da6 (clone: Add the
> --recurse-submodules option as alias for --recursive, 2010-11-04),
> but the conclusion is the same.
>
>> Perhaps we should think about deprecating "--recursive"?  I dunno.
>
> Anyway, I'll apply the "addition to the completion" patch.
>
> Thanks.

Thanks for this patch!

Note: if we ever decide to resurrect sb/submodule-default-path,
we run into a merge conflict. The reasoning for using
"--recurse-submodules" instead of a plain "--recurse" makes sense
as well, so that merge conflict will be resolved in favor of this patch.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Find a topic branch containing a commit

2016-07-27 Thread Stefan Beller
On Wed, Jul 27, 2016 at 10:42 AM, Duy Nguyen  wrote:
> Before I start doing anything silly because I don't know it can
> already be done without waving my C wand like a mad man...
>
> I often do this: find a commit of interest, the commit itself is not
> enough so I need a full patch series to figure out what's going, so I
> fire up "git log --graph --oneline" and manually search that commit
> and trace back to the merge point, then I can "git log --patch". Is
> there an automatic way to accomplish that? Something like "git branch
> --contains" (or "git merge --contains")?

https://github.com/mhagger/git-when-merged ?

>
> PS. Sometimes I wish we could optionally save cover letter in the
> merge commit. Sometimes the "big plan" is hard to see by reading
> individual commit messages.
> --
> Duy
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Find a topic branch containing a commit

2016-07-27 Thread Stefan Beller
On Wed, Jul 27, 2016 at 10:50 AM, Stefan Beller  wrote:
>>
>> PS. Sometimes I wish we could optionally save cover letter in the
>> merge commit. Sometimes the "big plan" is hard to see by reading
>> individual commit messages.
>> --

I had this wish, too. That makes a lot of sense for merge commits,
although it doesn't work well with our current workflow I would presume,
because the merge commit happens at a different time than when Junio
picks up a patch series. And in the mean time, where would we
store the cover letter information to not get lost?

Note there are some external recordings instead of adding
it to the history itself, e.g. https://github.com/trast/git/tree/notes/gmane
annotated each commit with pointers to the mailing list archive.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] document how to reference previous commits

2016-07-28 Thread Stefan Beller
On Thu, Jul 28, 2016 at 8:38 AM, Junio C Hamano  wrote:
> Heiko Voigt  writes:
>
>> @@ -121,6 +121,11 @@ its behaviour.  Try to make sure your explanation can 
>> be understood
>>  without external resources. Instead of giving a URL to a mailing list
>>  archive, summarize the relevant points of the discussion.
>>
>> +If you want to reference a previous commit in the history of a stable
>> +branch use the format "abbreviated sha1 (subject, date)". So for example
>> +like this: "Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30)
>> +noticed [...]".
>> +
>

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] submodule-config: passing name reference for .gitmodule blobs

2016-07-28 Thread Stefan Beller
On Thu, Jul 28, 2016 at 5:49 AM, Heiko Voigt  wrote:
> Commit 959b5455 (submodule: implement a config API for lookup of
> .gitmodules values, 2015-08-18) implemented the initial version of the
> submodule config cache. During development of that initial version we
> extracted the function gitmodule_sha1_from_commit(). During that process
> we missed that the strbuf rev was still used in config_from() and now is
> left empty. Lets fix this by also returning this string.
>
> This means that now when reading .gitmodules from revisions, the error
> messages also contain a reference to the blob they are from.
>
> Signed-off-by: Heiko Voigt 
> ---
> Here you go. Now including a test.

All 3 patches look good to me, thanks!

>
> +test_expect_success 'error message contains blob reference' '
> +   (cd super &&
> +   sha1=$(git rev-parse HEAD) &&
> +   test-submodule-config \
> +   HEAD b \
> +   HEAD submodule \
> +   2>actual_err &&
> +   grep "submodule-blob $sha1:.gitmodules" actual_err >/dev/null

Makes sense!
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] submodule update: allow '.' for branch value

2016-07-28 Thread Stefan Beller
Gerrit has a "superproject subscription" feature[1], that triggers a
commit in a superproject that is subscribed to its submodules.
Conceptually this Gerrit feature can be done on the client side with
Git via:

git -C  submodule update --remote --force
git -C  commit -a -m "Update submodules"
git -C  push

for each branch in the superproject. To ease the configuration in Gerrit
a special value of "." has been introduced for the submodule..branch
to mean the same branch as the superproject[2], such that you can create a
new branch on both superproject and the submodule and this feature
continues to work on that new branch.

Now we have find projects in the wild with such a .gitmodules file.
To have Git working well with these, we imitate the behavior and
look up the superprojects branch name if the submodules branch is
configured to ".". In projects that do not use Gerrit, this value
whould be never configured as "." is not a valid branch name.

[1] introduced around 2012-01, e.g.
https://gerrit-review.googlesource.com/#/c/30810/
[2] excerpt from [1]:
 > The branch value could be '.' if the submodule project branch
 > has the same name as the destination branch of the commit having
 > gitlinks/.gitmodules file.

CC: Avery Pennarun 
Signed-off-by: Stefan Beller 
---
 git-submodule.sh|  7 +++
 t/t7406-submodule-update.sh | 32 +++-
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 4ec7546..12bb9e8 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -591,6 +591,13 @@ cmd_update()
name=$(git submodule--helper name "$sm_path") || exit
url=$(git config submodule."$name".url)
branch=$(get_submodule_config "$name" branch master)
+   if test "$branch" = "."
+   then
+   if ! branch=$(git symbolic-ref --short -q HEAD)
+   then
+   die "$(eval_gettext "submodule branch 
configured to inherit branch from superproject, but it's not on any branch")"
+   fi
+   fi
if ! test -z "$update"
then
update_module=$update
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index bd261ac..41d69e4 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -209,9 +209,39 @@ test_expect_success 'submodule update --remote should 
fetch upstream changes' '
)
 '
 
-test_expect_success 'local config should override .gitmodules branch' '
+test_expect_success 'submodule update --remote should fetch upstream changes 
with .' '
+   (cd super &&
+git config -f .gitmodules submodule."submodule".branch "." &&
+git add .gitmodules &&
+git commit -m "submodules: update from the respective superproject 
branch"
+   ) &&
(cd submodule &&
+echo line4a >> file &&
+git add file &&
+test_tick &&
+git commit -m "upstream line4a" &&
 git checkout -b test-branch &&
+test_commit on-test-branch
+   ) &&
+   (cd super &&
+git submodule update --remote --force submodule &&
+(cd submodule &&
+ test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git 
log -1 --oneline master)"
+) &&
+git checkout -b test-branch &&
+git submodule update --remote --force submodule &&
+(cd submodule &&
+ test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git 
log -1 --oneline test-branch)"
+) &&
+git checkout master &&
+git branch -d test-branch &&
+git revert HEAD
+   )
+'
+
+test_expect_success 'local config should override .gitmodules branch' '
+   (cd submodule &&
+git checkout test-branch &&
 echo line5 >> file &&
 git add file &&
 test_tick &&
-- 
2.9.2.466.g8c6d1f9.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] submodule update: allow '.' for branch value

2016-07-28 Thread Stefan Beller
The meat is in patch 2 and helps Git and Gerrit work well together.

patch 1 looks unrelated but it is not, as when left out the broken test, breaks
with patch 2. This is because we add more commits in the submodule.

Thanks,
Stefan

Stefan Beller (2):
  t7406: correct depth test in shallow test
  submodule update: allow '.' for branch value

 git-submodule.sh|  7 +++
 t/t7406-submodule-update.sh | 37 ++---
 2 files changed, 41 insertions(+), 3 deletions(-)

-- 
2.9.2.466.g8c6d1f9.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] t7406: correct depth test in shallow test

2016-07-28 Thread Stefan Beller
We used to ask for 3 changes and tested for having 1, so the test
seems broken.

Correct the test by using test_line_count that exists in the test suite.

Signed-off-by: Stefan Beller 
---
 t/t7406-submodule-update.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 88e9750..bd261ac 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -846,9 +846,10 @@ test_expect_success 'submodule update clone shallow 
submodule' '
(cd super3 &&
 sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >.gitmodules.tmp 
&&
 mv -f .gitmodules.tmp .gitmodules &&
-git submodule update --init --depth=3
+git submodule update --init --depth=2
 (cd submodule &&
- test 1 = $(git log --oneline | wc -l)
+ git log --oneline >lines
+ test_line_count = 2 lines
 )
 )
 '
-- 
2.9.2.466.g8c6d1f9.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 2/2] submodule update: allow '.' for branch value

2016-07-28 Thread Stefan Beller
Gerrit has a "superproject subscription" feature[1], that triggers a
commit in a superproject that is subscribed to its submodules.
Conceptually this Gerrit feature can be done on the client side with
Git via:

git -C  submodule update --remote --force
git -C  commit -a -m "Update submodules"
git -C  push

for each branch in the superproject. To ease the configuration in Gerrit
a special value of "." has been introduced for the submodule..branch
to mean the same branch as the superproject[2], such that you can create a
new branch on both superproject and the submodule and this feature
continues to work on that new branch.

Now we have find projects in the wild with such a .gitmodules file.
To have Git working well with these, we imitate the behavior and
look up the superprojects branch name if the submodules branch is
configured to ".". In projects that do not use Gerrit, this value
whould be never configured as "." is not a valid branch name.

[1] introduced around 2012-01, e.g.
https://gerrit-review.googlesource.com/#/c/30810/
[2] excerpt from [1]:
 > The branch value could be '.' if the submodule project branch
 > has the same name as the destination branch of the commit having
 > gitlinks/.gitmodules file.

CC: Avery Pennarun 
Signed-off-by: Stefan Beller 
---

 This comes with another test that I run into while using this code.
 Please replace patch 2 with this v2.
 
 Thanks,
 Stefan

 git-submodule.sh|  9 -
 t/t7406-submodule-update.sh | 42 +-
 2 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 4ec7546..1eb33ad 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -590,7 +590,6 @@ cmd_update()
 
name=$(git submodule--helper name "$sm_path") || exit
url=$(git config submodule."$name".url)
-   branch=$(get_submodule_config "$name" branch master)
if ! test -z "$update"
then
update_module=$update
@@ -616,6 +615,14 @@ cmd_update()
 
if test -n "$remote"
then
+   branch=$(get_submodule_config "$name" branch master)
+   if test "$branch" = "."
+   then
+   if ! branch=$(git symbolic-ref --short -q HEAD)
+   then
+   die "$(eval_gettext "submodule branch 
configured to inherit branch from superproject, but it's not on any branch")"
+   fi
+   fi
if test -z "$nofetch"
then
# Fetch remote before determining tracking $sha1
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index bd261ac..953c486 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -209,9 +209,49 @@ test_expect_success 'submodule update --remote should 
fetch upstream changes' '
)
 '
 
-test_expect_success 'local config should override .gitmodules branch' '
+test_expect_success 'submodule update --remote should fetch upstream changes 
with .' '
+   (cd super &&
+git config -f .gitmodules submodule."submodule".branch "." &&
+git add .gitmodules &&
+git commit -m "submodules: update from the respective superproject 
branch"
+   ) &&
(cd submodule &&
+echo line4a >> file &&
+git add file &&
+test_tick &&
+git commit -m "upstream line4a" &&
+git checkout -b test-branch &&
+test_commit on-test-branch
+   ) &&
+   (cd super &&
+git submodule update --remote --force submodule &&
+(cd submodule &&
+ test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git 
log -1 --oneline master)"
+) &&
 git checkout -b test-branch &&
+git submodule update --remote --force submodule &&
+(cd submodule &&
+ test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git 
log -1 --oneline test-branch)"
+) &&
+git checkout master &&
+git branch -d test-branch
+   )
+'
+
+test_expect_success 'branch = . does not confuse the rest of update' '
+   (cd super &&
+git checkout --detach &&
+# update is not confused by branch="." even if the the superproje

Re: [PATCH 1/2] t7406: correct depth test in shallow test

2016-07-28 Thread Stefan Beller
On Thu, Jul 28, 2016 at 11:39 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> We used to ask for 3 changes and tested for having 1, so the test
>> seems broken.
>
> I am not sure what to think of "seems broken".

When asking for depth 3, I would expect a result of 1,2, or 3 commits.

But when testing the depth argument with a history less than 3, and then
implying: "I got 1, which is less than 3, so the depth works", seems
to be a logical shortcut to me.

I would have expected a history of >3, then ask for 3 and confirm we did not
get 4 or 5 or 6, but 3 only.

>
> Asking for 3 and having 1 is broken in what way?  Should we be
> expecting for 3 because we asked for that many?  Should we expect
> less than three even though we asked for three because the upstream
> side does not even have that many?  If the current test that asks
> for 3 and gets only 1 is not failing, why should we expect that
> asking for 2 would get 2?  In other words, why is it sane that
> asking for fewer number of commits gives us more?

I think there is a subtle thing going on, that I did not examine properly but
it is hidden in the modernization from

test 1 = $(something)
 to test_line_count = 2

I'll investigate the actual reason.

>
> Also most of the lines in this subshell seem to be breaking
> &&-chain.

Thanks for pointing that out, will fix while at it.

>
>
>
>> Correct the test by using test_line_count that exists in the test suite.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>>  t/t7406-submodule-update.sh | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
>> index 88e9750..bd261ac 100755
>> --- a/t/t7406-submodule-update.sh
>> +++ b/t/t7406-submodule-update.sh
>> @@ -846,9 +846,10 @@ test_expect_success 'submodule update clone shallow 
>> submodule' '
>>   (cd super3 &&
>>sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules 
>> >.gitmodules.tmp &&
>>mv -f .gitmodules.tmp .gitmodules &&
>> -  git submodule update --init --depth=3
>> +  git submodule update --init --depth=2
>>(cd submodule &&
>> -   test 1 = $(git log --oneline | wc -l)
>> +   git log --oneline >lines
>> +   test_line_count = 2 lines
>>)
>>  )
>>  '
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] t7406: correct depth test in shallow test

2016-07-28 Thread Stefan Beller
On Thu, Jul 28, 2016 at 11:47 AM, Stefan Beller  wrote:
> On Thu, Jul 28, 2016 at 11:39 AM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
>>> We used to ask for 3 changes and tested for having 1, so the test
>>> seems broken.
>>
>> I am not sure what to think of "seems broken".
>
> When asking for depth 3, I would expect a result of 1,2, or 3 commits.
>
> But when testing the depth argument with a history less than 3, and then
> implying: "I got 1, which is less than 3, so the depth works", seems
> to be a logical shortcut to me.
>
> I would have expected a history of >3, then ask for 3 and confirm we did not
> get 4 or 5 or 6, but 3 only.
>
>>
>> Asking for 3 and having 1 is broken in what way?  Should we be
>> expecting for 3 because we asked for that many?  Should we expect
>> less than three even though we asked for three because the upstream
>> side does not even have that many?  If the current test that asks
>> for 3 and gets only 1 is not failing, why should we expect that
>> asking for 2 would get 2?  In other words, why is it sane that
>> asking for fewer number of commits gives us more?
>
> I think there is a subtle thing going on, that I did not examine properly but
> it is hidden in the modernization from
>
> test 1 = $(something)
>  to test_line_count = 2
>
> I'll investigate the actual reason.

So when I place a test_pause just before that check for the lines
we have the upstream submodule:
$ git log --oneline
6355002 upstream line4
820877d upstream line3
4301fd3 Commit 2
0c90624 upstream

which then allows fetching
6355002 upstream line4
820877d upstream line3
4301fd3 Commit 2

and "Commit 2" is recorded as the sha1, i.e.
when checking out "Commit 2" and running
(git log --oneline | wc -l) we get 1 as it cuts
off after that.

When adding a test (in the next patch) that adds
more commits to the submodule upstream, we
will fetch with depth 3 but will no longer see the sha1,
so we try a different approach. Current approach:
try fetching again with no depth, and then try again with sha1
given.

So one could say there is a bug as the fetching should
use the depth argument as well.

The depth of 2 I chose originally turns out to be a lucky
accident too, as the depth from "Commit 2" is 2,
so that we would observe the same depth no matter if
a --depth 2 was given and working or not.

I'll redo this test (as 2 tests, one is testing the situation as
we have now, i.e. the desired tip is reachable from within
the depth argument, the second test will test if it is not
reachable.)

>
>>
>> Also most of the lines in this subshell seem to be breaking
>> &&-chain.
>
> Thanks for pointing that out, will fix while at it.
>
>>
>>
>>
>>> Correct the test by using test_line_count that exists in the test suite.
>>>
>>> Signed-off-by: Stefan Beller 
>>> ---
>>>  t/t7406-submodule-update.sh | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
>>> index 88e9750..bd261ac 100755
>>> --- a/t/t7406-submodule-update.sh
>>> +++ b/t/t7406-submodule-update.sh
>>> @@ -846,9 +846,10 @@ test_expect_success 'submodule update clone shallow 
>>> submodule' '
>>>   (cd super3 &&
>>>sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules 
>>> >.gitmodules.tmp &&
>>>mv -f .gitmodules.tmp .gitmodules &&
>>> -  git submodule update --init --depth=3
>>> +  git submodule update --init --depth=2
>>>(cd submodule &&
>>> -   test 1 = $(git log --oneline | wc -l)
>>> +   git log --oneline >lines
>>> +   test_line_count = 2 lines
>>>)
>>>  )
>>>  '
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 2/2] submodule update: allow '.' for branch value

2016-07-28 Thread Stefan Beller
On Thu, Jul 28, 2016 at 12:10 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Gerrit has a "superproject subscription" feature[1], that triggers a
>> commit in a superproject that is subscribed to its submodules.
>> Conceptually this Gerrit feature can be done on the client side with
>> Git via:
>>
>> git -C  submodule update --remote --force
>> git -C  commit -a -m "Update submodules"
>> git -C  push
>>
>> for each branch in the superproject.
>
> Which I imagine would be useful if you only have one submodule.  If
> you work on submodules A and B and then want to give the updated
> superproject that binds the latest in both A and B as an atomic
> update, the three lines above would not quite work, no?

When using Gerrit you can submit A,B together bound by a "topic"
and that will trigger a superproject update that contains one
atomic commit updating A and B at the same time.

To fully emulate that Gerrit feature you would need to put
the 3 lines above in an infinite loop that polls the remote
submodules all the time.

If you wanted to do that without that Gerrit feature,
this becomes racy as "submodule update --remote"
fetches the submodules racily (by design) and it may end
up putting A and B in the same commit or not.


>
>> To ease the configuration in Gerrit
>> a special value of "." has been introduced for the submodule..branch
>> to mean the same branch as the superproject[2], such that you can create a
>> new branch on both superproject and the submodule and this feature
>> continues to work on that new branch.
>>
>> Now we have find projects in the wild with such a .gitmodules file.

I'll fix the typo. (delete have or find)

>
> That's annoying.
>
>> To have Git working well with these, we imitate the behavior and
>> look up the superprojects branch name if the submodules branch is
>> configured to ".". In projects that do not use Gerrit, this value
>> whould be never configured as "." is not a valid branch name.
>
> I find that the last sentence is somewhat misleading.  I agree it is
> justifiable that using "." as the name to trigger a new (to us)
> feature is safe, because such a setting wouldn't have meant anything
> useful without this change, but I initially misread it and thought
> you added "are we using Gerrit?  Error out if we are not" logic,
> which is not the case here.

Well I wanted to express:

  The .gitmodules used in these Gerrit projects do not conform
  to Gits understanding of how .gitmodules should look like.
  Let's make Git understand this Gerrit corner case (which you
  would only need when using Gerrit).

  Adding special treatment of the "." value is safe as this is an
  invalid branch name in Git.


>
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> index 4ec7546..1eb33ad 100755
>> --- a/git-submodule.sh
>> +++ b/git-submodule.sh
>> @@ -590,7 +590,6 @@ cmd_update()
>>
>>   name=$(git submodule--helper name "$sm_path") || exit
>>   url=$(git config submodule."$name".url)
>> - branch=$(get_submodule_config "$name" branch master)
>>   if ! test -z "$update"
>>   then
>>   update_module=$update
>> @@ -616,6 +615,14 @@ cmd_update()
>>
>>   if test -n "$remote"
>>   then
>> + branch=$(get_submodule_config "$name" branch master)
>> + if test "$branch" = "."
>> + then
>> + if ! branch=$(git symbolic-ref --short -q HEAD)
>> + then
>> + die "$(eval_gettext "submodule branch 
>> configured to inherit branch from superproject, but it's not on any branch")"
>> + fi
>> + fi
>
> I see that you narrowed the scope of "$branch" (which is only used
> when $remote exists), but it is a bit annoying to see that change
> mixed with "now a dot means something different" change.
>
> I wonder if the above 8-line block wants to be encapsulated to
> become a part of "git submodule--helper" interface, though.  IOW,
> branch=$(git submodule--helper branch "$name") or something?

There is already get_submodule_config that we implement in shell,
which is also used in cmd_summary for reading the .ignore
field.

So having the "." check in that method (whe

[PATCHv3 4/7] submodule--helper: fix usage string for relative-path

2016-07-28 Thread Stefan Beller
Internally we call the underscore version of relative_path, but externally
we present an API with no underscores.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index b22352b..fb90c64 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -892,7 +892,7 @@ static int resolve_relative_path(int argc, const char 
**argv, const char *prefix
 {
struct strbuf sb = STRBUF_INIT;
if (argc != 3)
-   die("submodule--helper relative_path takes exactly 2 arguments, 
got %d", argc);
+   die("submodule--helper relative-path takes exactly 2 arguments, 
got %d", argc);
 
printf("%s", relative_path(argv[1], argv[2], &sb));
strbuf_release(&sb);
-- 
2.9.2.472.g1ffb07c.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv3 0/7] submodule update: allow '.' for branch value

2016-07-28 Thread Stefan Beller
This got bigger than expected, but I am happier with the results.

The meat is found in the last patch. (At least what I am interested in; others
may be more interested in the second patch which could be argued to be a real
bug fix to be merged down to maint.)

Thanks Junio for the thourough review of the first patches,
Stefan

Stefan Beller (7):
  t7406: future proof tests with hard coded depth
  submodule update: respect depth in subsequent fetches
  submodule update: narrow scope of local variable
  submodule--helper: fix usage string for relative-path
  submodule-config: keep configured branch around
  submodule--helper: add remote-branch helper
  submodule update: allow '.' for branch value

 builtin/submodule--helper.c | 48 --
 git-submodule.sh| 11 +++
 submodule-config.c  | 11 ++-
 submodule-config.h  |  1 +
 t/t7406-submodule-update.sh | 71 +++--
 5 files changed, 125 insertions(+), 17 deletions(-)

-- 
2.9.2.472.g1ffb07c.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv3 1/7] t7406: future proof tests with hard coded depth

2016-07-28 Thread Stefan Beller
The prior hard coded depth was chosen to be exactly the length from the
recorded gitlink to the tip of the remote, so if you add more commits
to the remote before, this test will not test its intention any more.

Signed-off-by: Stefan Beller 
---
 t/t7406-submodule-update.sh | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 88e9750..8fc3a25 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -841,16 +841,19 @@ test_expect_success SYMLINKS 'submodule update can handle 
symbolic links in pwd'
 '
 
 test_expect_success 'submodule update clone shallow submodule' '
+   test_when_finished "rm -rf super3" &&
+   first=$(git -C cloned submodule status submodule |cut -c2-41) &&
+   second=$(git -C submodule rev-parse HEAD) &&
+   commit_count=$(git -C submodule rev-list $first^..$second | wc -l) &&
git clone cloned super3 &&
pwd=$(pwd) &&
-   (cd super3 &&
-sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >.gitmodules.tmp 
&&
-mv -f .gitmodules.tmp .gitmodules &&
-git submodule update --init --depth=3
-(cd submodule &&
- test 1 = $(git log --oneline | wc -l)
-)
-)
+   (
+   cd super3 &&
+   sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules 
>.gitmodules.tmp &&
+   mv -f .gitmodules.tmp .gitmodules &&
+   git submodule update --init --depth=$commit_count &&
+   test 1 = $(git -C submodule log --oneline | wc -l)
+   )
 '
 
 test_expect_success 'submodule update --recursive drops module name before 
recursing' '
-- 
2.9.2.472.g1ffb07c.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv3 2/7] submodule update: respect depth in subsequent fetches

2016-07-28 Thread Stefan Beller
When depth is given the user may have a reasonable expectation that
any remote operation is using the given depth. Add a test to demonstrate
we still get the desired sha1 even if the depth is too short to
include the actual commit.

Signed-off-by: Stefan Beller 
---
 git-submodule.sh|  9 +
 t/t7406-submodule-update.sh | 16 
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 4ec7546..174f4ea 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -481,7 +481,8 @@ fetch_in_submodule () (
'')
git fetch ;;
*)
-   git fetch $(get_default_remote) "$2" ;;
+   shift
+   git fetch $(get_default_remote) "$@" ;;
esac
 )
 
@@ -619,7 +620,7 @@ cmd_update()
if test -z "$nofetch"
then
# Fetch remote before determining tracking $sha1
-   fetch_in_submodule "$sm_path" ||
+   fetch_in_submodule "$sm_path" $depth ||
die "$(eval_gettext "Unable to fetch in 
submodule path '\$sm_path'")"
fi
remote_name=$(sanitize_submodule_env; cd "$sm_path" && 
get_default_remote)
@@ -642,13 +643,13 @@ cmd_update()
# Run fetch only if $sha1 isn't present or it
# is not reachable from a ref.
is_tip_reachable "$sm_path" "$sha1" ||
-   fetch_in_submodule "$sm_path" ||
+   fetch_in_submodule "$sm_path" $depth ||
die "$(eval_gettext "Unable to fetch in 
submodule path '\$displaypath'")"
 
# Now we tried the usual fetch, but $sha1 may
# not be reachable from any of the refs
is_tip_reachable "$sm_path" "$sha1" ||
-   fetch_in_submodule "$sm_path" "$sha1" ||
+   fetch_in_submodule "$sm_path" $depth "$sha1" ||
die "$(eval_gettext "Fetched in submodule path 
'\$displaypath', but it did not contain \$sha1. Direct fetching of that commit 
failed.")"
fi
 
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 8fc3a25..1bb1f43 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -856,6 +856,22 @@ test_expect_success 'submodule update clone shallow 
submodule' '
)
 '
 
+test_expect_success 'submodule update clone shallow submodule outside of 
depth' '
+   test_when_finished "rm -rf super3" &&
+   git clone cloned super3 &&
+   pwd=$(pwd) &&
+   (
+   cd super3 &&
+   sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules 
>.gitmodules.tmp &&
+   mv -f .gitmodules.tmp .gitmodules &&
+   test_must_fail git submodule update --init --depth=1 2>actual &&
+   test_i18ngrep "Direct fetching of that commit failed." actual &&
+   git -C ../submodule config uploadpack.allowReachableSHA1InWant 
true &&
+   git submodule update --init --depth=1 >actual &&
+   test 1 = $(git -C submodule log --oneline | wc -l)
+   )
+'
+
 test_expect_success 'submodule update --recursive drops module name before 
recursing' '
(cd super2 &&
 (cd deeper/submodule/subsubmodule &&
-- 
2.9.2.472.g1ffb07c.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv3 6/7] submodule--helper: add remote-branch helper

2016-07-28 Thread Stefan Beller
In a later patch we want to enhance the logic for the branch selection.
Rewrite the current logic to be in C, so we can directly use C when
we enhance the logic.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 28 +++-
 git-submodule.sh|  2 +-
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index fb90c64..710048f 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -899,6 +899,31 @@ static int resolve_relative_path(int argc, const char 
**argv, const char *prefix
return 0;
 }
 
+static const char *remote_submodule_branch(const char *path)
+{
+   const struct submodule *sub;
+   gitmodules_config();
+   git_config(submodule_config, NULL);
+
+   sub = submodule_from_path(null_sha1, path);
+   if (!sub->branch)
+   return "master";
+
+   return sub->branch;
+}
+
+static int resolve_remote_submodule_branch(int argc, const char **argv,
+   const char *prefix)
+{
+   struct strbuf sb = STRBUF_INIT;
+   if (argc != 2)
+   die("submodule--helper remote-branch takes exactly one 
arguments, got %d", argc);
+
+   printf("%s", remote_submodule_branch(argv[1]));
+   strbuf_release(&sb);
+   return 0;
+}
+
 struct cmd_struct {
const char *cmd;
int (*fn)(int, const char **, const char *);
@@ -912,7 +937,8 @@ static struct cmd_struct commands[] = {
{"relative-path", resolve_relative_path},
{"resolve-relative-url", resolve_relative_url},
{"resolve-relative-url-test", resolve_relative_url_test},
-   {"init", module_init}
+   {"init", module_init},
+   {"remote-branch", resolve_remote_submodule_branch}
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index 0d4021f..1e493a8 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -616,7 +616,7 @@ cmd_update()
 
if test -n "$remote"
then
-   branch=$(get_submodule_config "$name" branch master)
+   branch=$(git submodule--helper remote-branch "$sm_path")
if test -z "$nofetch"
then
# Fetch remote before determining tracking $sha1
-- 
2.9.2.472.g1ffb07c.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv3 3/7] submodule update: narrow scope of local variable

2016-07-28 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 git-submodule.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 174f4ea..0d4021f 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -591,7 +591,6 @@ cmd_update()
 
name=$(git submodule--helper name "$sm_path") || exit
url=$(git config submodule."$name".url)
-   branch=$(get_submodule_config "$name" branch master)
if ! test -z "$update"
then
update_module=$update
@@ -617,6 +616,7 @@ cmd_update()
 
if test -n "$remote"
then
+   branch=$(get_submodule_config "$name" branch master)
if test -z "$nofetch"
then
# Fetch remote before determining tracking $sha1
-- 
2.9.2.472.g1ffb07c.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv3 5/7] submodule-config: keep configured branch around

2016-07-28 Thread Stefan Beller
The branch field will be used in a later patch by `submodule update`.

Signed-off-by: Stefan Beller 
---
 submodule-config.c | 11 ++-
 submodule-config.h |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/submodule-config.c b/submodule-config.c
index 077db40..ebee1e4 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -59,6 +59,7 @@ static void free_one_config(struct submodule_entry *entry)
 {
free((void *) entry->config->path);
free((void *) entry->config->name);
+   free((void *) entry->config->branch);
free((void *) entry->config->update_strategy.command);
free(entry->config);
 }
@@ -199,6 +200,7 @@ static struct submodule *lookup_or_create_by_name(struct 
submodule_cache *cache,
submodule->update_strategy.command = NULL;
submodule->fetch_recurse = RECURSE_SUBMODULES_NONE;
submodule->ignore = NULL;
+   submodule->branch = NULL;
submodule->recommend_shallow = -1;
 
hashcpy(submodule->gitmodules_sha1, gitmodules_sha1);
@@ -358,9 +360,16 @@ static int parse_config(const char *var, const char 
*value, void *data)
if (!me->overwrite && submodule->recommend_shallow != -1)
warn_multiple_config(me->commit_sha1, submodule->name,
 "shallow");
-   else {
+   else
submodule->recommend_shallow =
git_config_bool(var, value);
+   } else if (!strcmp(item.buf, "branch")) {
+   if (!me->overwrite && submodule->branch)
+   warn_multiple_config(me->commit_sha1, submodule->name,
+"branch");
+   else {
+   free((void *)submodule->branch);
+   submodule->branch = xstrdup(value);
}
}
 
diff --git a/submodule-config.h b/submodule-config.h
index b1fdcc0..d05c542 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -15,6 +15,7 @@ struct submodule {
const char *url;
int fetch_recurse;
const char *ignore;
+   const char *branch;
struct submodule_update_strategy update_strategy;
/* the sha1 blob id of the responsible .gitmodules file */
unsigned char gitmodules_sha1[20];
-- 
2.9.2.472.g1ffb07c.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv3 7/7] submodule update: allow '.' for branch value

2016-07-28 Thread Stefan Beller
Gerrit has a "superproject subscription" feature[1], that triggers a
commit in a superproject that is subscribed to its submodules.
Conceptually this Gerrit feature can be done on the client side with
Git via (except for raciness, error handling etc):

  while [ true ]; do
git -C  submodule update --remote --force
git -C  commit -a -m "Update submodules"
git -C  push
  done

for each branch in the superproject. To ease the configuration in Gerrit
a special value of "." has been introduced for the submodule..branch
to mean the same branch as the superproject[2], such that you can create a
new branch on both superproject and the submodule and this feature
continues to work on that new branch.

Now we find projects in the wild with such a .gitmodules file.
The .gitmodules used in these Gerrit projects do not conform
to Gits understanding of how .gitmodules should look like.
This teaches Git to deal gracefully with this syntax as well.

The redefinition of "." does no harm to existing projects unaware of
this change, as "." is an invalid branch name in Git, so we do not
expect such projects to exist.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 18 ++
 t/t7406-submodule-update.sh | 36 +++-
 2 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 710048f..ae88eff 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -909,6 +909,24 @@ static const char *remote_submodule_branch(const char 
*path)
if (!sub->branch)
return "master";
 
+   if (!strcmp(sub->branch, ".")) {
+   unsigned char sha1[20];
+   const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, NULL);
+
+   if (!refname)
+   die(_("No such ref: %s"), "HEAD");
+
+   /* detached HEAD */
+   if (!strcmp(refname, "HEAD"))
+   die(_("Submodule (%s) branch configured to inherit "
+ "branch from superproject, but the superproject "
+ "is not on any branch"), sub->name);
+
+   if (!skip_prefix(refname, "refs/heads/", &refname))
+   die(_("Expecting a full ref name, got %s"), refname);
+   return refname;
+   }
+
return sub->branch;
 }
 
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 1bb1f43..1c4c1f2 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -209,9 +209,43 @@ test_expect_success 'submodule update --remote should 
fetch upstream changes' '
)
 '
 
+test_expect_success 'submodule update --remote should fetch upstream changes 
with .' '
+   (
+   cd super &&
+   git config -f .gitmodules submodule."submodule".branch "." &&
+   git add .gitmodules &&
+   git commit -m "submodules: update from the respective 
superproject branch"
+   ) &&
+   (
+   cd submodule &&
+   echo line4a >> file &&
+   git add file &&
+   test_tick &&
+   git commit -m "upstream line4a" &&
+   git checkout -b test-branch &&
+   test_commit on-test-branch
+   ) &&
+   (
+   cd super &&
+   git submodule update --remote --force submodule &&
+   git -C submodule log -1 --oneline >actual
+   git -C ../submodule log -1 --oneline master >expect
+   test_cmp expect actual &&
+   git checkout -b test-branch &&
+   git submodule update --remote --force submodule &&
+   git -C submodule log -1 --oneline >actual
+   git -C ../submodule log -1 --oneline test-branch >expect
+   test_cmp expect actual &&
+   git checkout master &&
+   git branch -d test-branch &&
+   #~ git -C ../submodule branch -d test-branch &&
+   git reset --hard HEAD^
+   )
+'
+
 test_expect_success 'local config should override .gitmodules branch' '
(cd submodule &&
-git checkout -b test-branch &&
+git checkout test-branch &&
 echo line5 >> file &&
 git add file &&
 test_tick &&
-- 
2.9.2.472.g1ffb07c.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv1] completion: add option '--recurse-submodules' to 'git clone'

2016-07-28 Thread Stefan Beller
On Thu, Jul 28, 2016 at 9:22 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>>> Anyway, I'll apply the "addition to the completion" patch.
>>>
>>> Thanks.
>>
>> Thanks for this patch!
>>
>> Note: if we ever decide to resurrect sb/submodule-default-path,
>> we run into a merge conflict. The reasoning for using
>> "--recurse-submodules" instead of a plain "--recurse" makes sense
>> as well, so that merge conflict will be resolved in favor of this patch.
>
> Thanks for an advance warning.  My rerere database has already been
> taught about this conflict ;-)
>
> As to sb/submodule-default-path topic, which has been blocked on
> still-in-flux attribute work, I am tempted to declare that the
> attribute work is not yet thread-ready but it is in a good enough
> shape to base other works on, and have them advance to 'next'.

I am tempted to ask for delaying sb/submodule-default-path
a bit more and see if there is an alternative solution.

Inspired by Duys series on submodules and worktree I found a new
way how to do the submodule management (separation of the submodule
URL as a flag indicating the existence and the actual URL), and that
will be very similar to what is implemented in sb/submodule-default-path
just even more streamlined. (That said the state of that new series
is still vapor and not yet solid, so ... it's hopes and dreams.)

>
> The traditional pattern of allowing the callers to randomly allocate
> an array of "struct git_attr_check" and passing the pointer to its
> first element to git_check_attr() function was impossible to extend
> without having to update the callers, but we have migrated away from
> the pattern and the attribute subsystem can be enhanced without
> impacting the callers too much.
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


<    1   2   3   4   5   6   7   8   9   10   >