Re: [Bug] commit-tree shouldn't append an extra newline to commit messages

2017-09-06 Thread Jeff King
Thanks for following up. A few minor comments:

On Tue, Sep 05, 2017 at 04:57:24PM -0400, Ross Kabus wrote:

> From: Ross Kabus 
> Date: Tue, 5 Sep 2017 13:54:52 -0400
> Subject: [PATCH] commit-tree: don't append a newline with -F

Usually you'd just omit these in favor of the email headers (and replace
the email subject with this one).

> This change makes it such that commit-tree -F never appends a newline
> character to the supplied commit message (either from file or stdin).
> 
> Previously, commit-tree -F would always append a newline character to
> the text brought in from file or stdin. This has caused confusion in a
> number of ways:
>   - This is a plumbing command and it is generally expected not to do
> text cleanup or other niceties.
>   - stdin piping with "-F -" appends a newline but stdin piping without
> -F does not append a newline (inconsistent).
>   - git-commit has the --cleanup=verbatim option that prevents appending
> a newline with its -F argument. There is no verbatim counterpart to
> commit-tree -F (inconsistent).

This explanation all makes sense to me except for the last bit, which is
a bit subtle. I'd have said it more like:

  - git-commit does not specifically append a newline to the "-F"
input. The issue is somewhat muddled by the fact that git-commit
does pass the message through its --cleanup option, which may add
such a newline. But for commit-tree to match "commit --cleanup=verbatim",
we should not do so here.

> ---
>  builtin/commit-tree.c | 1 -
>  1 file changed, 1 deletion(-)

Your patch needs to be signed-off; see the Developer's Certificate of
Origin section in Documentation/SubmittingPatches.

> diff --git a/builtin/commit-tree.c b/builtin/commit-tree.c
> index 19e898fa4..2177251e2 100644
> --- a/builtin/commit-tree.c
> +++ b/builtin/commit-tree.c
> @@ -102,7 +102,6 @@ int cmd_commit_tree(int argc, const char **argv,
> const char *prefix)
>   if (fd && close(fd))
>   die_errno("git commit-tree: failed to close '%s'",
>argv[i]);
> - strbuf_complete_line();
>   continue;

Aside from the whitespace damage, the patch itself looks good.

-Peff


Re: Branch name support & character

2017-09-06 Thread Andrew Ardill
Hi,

On 7 September 2017 at 13:27, 夏大雨  wrote:
> I want to merge branch a to branch b so I rename branch b to a, but
> when I push a to remote repo, '&' is took as & operator for shell,
> and git push fails. So it would be better that branch name support &
> character for me.

Have you tried quoting the branch name?

Using git checkout -b 'my' works for me (single quotes around
the branch name).

Pushing to a local remote also works: git push --set-upstream
../git-test-2 'my'

That being said, I would advise not using & in branch names,
specifically because it is a special character in shells.

Regards,

Andrew Ardill


Re: [PATCH 27/34] shortlog: release strbuf after use in insert_one_record()

2017-09-06 Thread Jeff King
On Thu, Sep 07, 2017 at 04:51:12AM +0900, Junio C Hamano wrote:

> > diff --git a/builtin/shortlog.c b/builtin/shortlog.c
> > index 43c4799ea9..48af16c681 100644
> > --- a/builtin/shortlog.c
> > +++ b/builtin/shortlog.c
> > @@ -50,66 +50,67 @@ static int compare_by_list(const void *a1, const void 
> > *a2)
> >  static void insert_one_record(struct shortlog *log,
> >   const char *author,
> >   const char *oneline)
> >  {
> > ...
> > item = string_list_insert(>list, namemailbuf.buf);
> > +   strbuf_release();
> 
> As log->list.strdup_strings is set to true in shortlog_init(),
> namemailbuf.buf will leak without this.
> 
> An alterative, as this is the only place we add to log->list, could
> be to make log->list take ownership of the string by not adding a
> _release() here but instead _detach(), I guess.

I agree that would be better, but I think it's a little tricky. The
string_list_insert() call may make a new entry, or it may return an
existing one. We'd still need to free in the latter case. I'm not sure
the string_list interface makes it easy to tell the difference.

> It is also curious that at the end of shortlog_output(), we set the
> log->list.strdup_strings again to true immediately before calling
> string_list_clear() on it.  I suspect that is unnecessary?

I think so. I was curious whether I might have introduced this weirdness
when I refactored this code a year or so ago. But no, it looks like it
dates all the way back to the very first version of shortlog.c.

-Peff


Branch name support & character

2017-09-06 Thread 夏大雨
I want to merge branch a to branch b so I rename branch b to a, but
when I push a to remote repo, '&' is took as & operator for shell,
and git push fails. So it would be better that branch name support &
character for me.


Re: gitmodules below root directory

2017-09-06 Thread Jacob Keller
On Wed, Sep 6, 2017 at 12:58 PM, Junio C Hamano  wrote:
> The current gitlink implementation records only the "what commit
> from the subproject history is to be checked out at this path?" and
> nothing else, by storing a single SHA-1 that happens to be the name
> of the commit object (but the superproject does not even care the
> fact that it is a commit or a random string).  We could substitute
> that with the name of a blob object that belongs to the superproject
> history and records the information about the submodule at the path
> (e.g. "which repository the upstream project recommends to clone the
> subproject from?", "what commit object is to be checked out").
>
> When you see a single tree of a superproject, you need to see what
> commit is to be checked out from the tree object and everything else
> needs to be read from the .gitmodules file in that tree in the
> current system, but it does not have to be that way.
>
>

IMO, this approach described here, (point the gitlink at a blob which
describes the full contents, URL, etc) would make more sense. The
trickiest parts I think are (a) it really requires tooling to change
the git module vs just editing a file, and (b) we'd need to prevent
the blobs from getting garbage collected.

I think it makes each individual submodule a bit more robust, since
the actual submodule pointer always points directly to the full data
about that submodule (it's recommended URL, it's path, etc), and
changes to those things *are* changes to the submodule pointer.

Thanks,
Jake


Re: [RFC PATCH 0/2] Add named reference to latest push cert

2017-09-06 Thread Junio C Hamano
Stefan Beller  writes:

> On the ref to store the push certs:
> (a) Currently the ref points at the blob, I wonder if we'd rather want to
> point at a commit? (Then we can build up a history of
> push certs, instead of relying on the reflog to show all
> push certs. Also the gc issue might be easier to solve using this)
> (b) When going with (a), we might want to change the name. Most
> refs are 3 directories deep:
>   refs/heads/
>   refs/pr/ # at github IIUC
>   refs/changes/ # Gerrit
>   refs/meta/config # Gerrit to e.g. configure ACLs of the repo
> "refs" indicates it is a ref, whereas the second part can be seen
> as a "namespace". Currently Git only uses the "heads" and "tags"
> namespace, "meta" is used by more than just Gerrit, so maybe it is
> not wise to use "refs/meta/push_cert", but go with refs/gitmeta/pushcert
> instead?

You also need to worry about concurrent pushes.  The resulting
"history" may not have to be sequencial, but two pushes that affect
the same ref must be serialized in the resulting push-cert store.

The original design deliberately punts all the complexity to hook
exactly because we do not want to have a half-baked "built-in"
implementation that would only get in the way of those who wants to
do high-performance servers.  It is very likely that they want to
have a long-running daemon that listens to a port or a named pipe,
where the only thing the hook would do is to write the value of
GIT_PUSH_CERT to that daemon process, which acts as a serialization
point, can read from the object store that is used as a short-term
temporary area, and write the push cert to a more permanent store.

Having said all that, I am sympathetic to a wish to have an
easy-to-enable-without-thinking example that is not so involved
(e.g. no portability concern, does not have to perform very well but
must be correct).  If Shikher wants to add one, I think the right
approach to do so would be to add and ship a sample hook.

Thanks.


Re: [PATCHv2] pull: honor submodule.recurse config option

2017-09-06 Thread Junio C Hamano
Nicolas Morey-Chaisemartin  writes:

>> If it is not the latter, perhaps we may want to flip the order of
>> config parsing and option parsing around?  That will allow us to fix
>> the handling of autostash thing to use only one variable, and also
>> fix your patch to do the right thing.
>
> I see what you mean.

> It looks like switching the code around works but I think there
> still needs to be 2 variables for autstash for this piece of code:
>
> if (!opt_rebase && opt_autostash != -1)
> die(_("--[no-]autostash option is only valid with --rebase."));
>
> The config option should not cause git pull to die when not using
> --rebase, the CLI option should.

Ah, OK.  That is a worthwhile observation that needs to be recorded
in the log message of a commit that flips the order of option/config
parsing.

Thanks.


Re: [PATCH] config: remove git_config_maybe_bool

2017-09-06 Thread Junio C Hamano
Martin Ågren  writes:

> The function was deprecated in commit 89576613 ("treewide: deprecate
> git_config_maybe_bool, use git_parse_maybe_bool", 2017-08-07) and has no
> users.
>
> Signed-off-by: Martin Ågren 
> ---

Thanks.  Will queue.


Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list

2017-09-06 Thread Jeff King
On Wed, Sep 06, 2017 at 08:12:24PM +0200, Martin Ågren wrote:

> On 5 September 2017 at 23:26, Junio C Hamano  wrote:
> > Jeff King  writes:
> >
> >> I noticed the HEAD funniness, too, when looking at this earlier. I agree
> >> with Junio that it's not quite consistent with the general rule of
> >> "string list items point to their refnames", but I don't think it
> >> matters in practice.
> >
> > Yup, we are on the same page; the "fix" I was alluding to would look
> > exactly like what you wrote below, but I agree the distinction does
> > not matter in practice.  IOW, I do not think the code after Martin's
> > fix is wrong per-se.
> 
> Well, "not wrong per-se" tells me you'd feel slightly more comfortable
> about the state of things if we did this. ;)
> 
> I'll take Peff's hint, tweak/add comments for correctness and symmetry
> with the previous patch and add an if-BUG for symmetry. Peff: Do I have
> your sign-off? (Do I need it?)

Yes, you have my sign-off. Probably it is not necessary for such a
trivial patch, but it never hurts to be sure.

> If we re-roll, would you prefer Peff's much smaller take on patch 2
> (strbuf_release where it matters, instead of sprinkling "goto out" all
> over)? I think me and him agreed that we'd be fine either way. I'd reuse
> my commit message, if I get his sign-off and "From:".

You are welcome to forge my sign-off there (but I really am OK with
either approach).

-Peff


Re: [PATCH v2] merge-recursive: change current file dir string_lists to hashmap

2017-09-06 Thread Junio C Hamano
Kevin Willford  writes:

> The code was using two string_lists, one for the directories and
> one for the files.  The code never checks the lists independently
> so we should be able to only use one list.  The string_list also
> is a O(log n) for lookup and insertion.  Switching this to use a
> hashmap will give O(1) which will save some time when there are
> millions of paths that will be checked.
>
> Also cleaned up a memory leak and method where the return was not
> being used.
>
> Signed-off-by: Kevin Willford 
> ---
>  merge-recursive.c | 76 
> ---
>  merge-recursive.h |  3 +--
>  2 files changed, 57 insertions(+), 22 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 1494ffdb82..ebfe01017f 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -24,6 +24,31 @@
>  #include "dir.h"
>  #include "submodule.h"
>  
> +struct path_hashmap_entry {
> + struct hashmap_entry;

You seem to have lost the squash you privately agreed to that is
needed in order to make it compile?


Re: [PATCHv2] builtin/merge: honor commit-msg hook for merges

2017-09-06 Thread Junio C Hamano
Stefan Beller  writes:

>> I also thought that we were hunting calls of cmd_foo() from outside
>> the git.c command dispatcher as grave errors and want to clean up
>> the codebase to get rid of them.
>
> ... but I did not account for this fact. (I was not aware of these being
> called grave errors, but assumed this is a good state. And why change
> a good state?)

https://public-inbox.org/git/20170830053108.g2xsn43rwulnw...@sigill.intra.peff.net/

gives a good explanation why it is not a good state.


Re: [PATCH] parse-options: warn developers on negated options

2017-09-06 Thread Junio C Hamano
Stefan Beller  writes:

>> Ahh, I was an idiot (call it vacation-induced-brain-disfunction).  I
>> forgot about 0f1930c5 ("parse-options: allow positivation of options
>> starting, with no-", 2012-02-25), which may have already made your
>> new use of "--no-verify" in builtin/merge.c and existing one in
>> commit.c OK long time ago.  A quick check to see how your version of
>>
>> git merge --verify
>> git merge --no-verify
>>
>> behaves with respect to the commit-msg hook is veriy much
>> appreciated, as my tree is in no shape to apply and try a patch
>> while trying to absorb the patches sent to the list the past week.
>>
>> Thanks, and sorry for a possible false alarm.
>>
>>> Having said that, because the existing parse_options_check() is all
>>> about catching the programming mistake (the end user cannot fix an
>>> error from it by tweaking the command line option s/he gives to the
>>> program), I do not think a conditional compilation like you added
>>> mixes well.  Either make the whole thing, not just your new test,
>>> conditional to -DDEVELOPER (which would make it possible for you to
>>> build and ship a binary with broken options[] array to the end-users
>>> that does not die in this function), which is undesirable, or add a
>>> new test that catches a definite error unconditionally.
>>
>> This part still is valid.  If René's work 2 years ago is sufficient
>> to address "--no-foo" thing, then there is nothing we need to add to
>> this test, but if we later need to add new sanity check, we should
>> add it without -DDEVELOPER, or we should make the whole thing inside
>> it.
>
> As far as the code is concerned it is only inside the -DDEVELOPER ?
> The intent of this patch is to have a developers aid to remind them
> that too many negations might be a sign of trouble.

I understand that.  What I was saying is that there may be no point
"reminding" them with René's "positivation" thing in effect and that
is why I asked you to try the simple two commands out to see if that
is the case.

I did that myself with "git commit --[no-]verify" and they are
indeed OK, so there is no reason to force developers to do this:

int distim = 1; /* default is to distim */
struct option options[] = {
...
OPT_BOOL(0, "distim", , N_("distim")),
...
};
...
if (distim)
do_the_distim_thing();

if/when the following is more natural in the context of the command:

int no_distim = 1; /* default is to distim */
struct option options[] = {
...
OPT_BOOL(0, "no-distim", , N_("bypass distimming")),
...
};
...
if (distim)
do_the_distim_thing();

whether it is inside -DDEVELOPER or not.




[PATCH v2] merge-recursive: change current file dir string_lists to hashmap

2017-09-06 Thread Kevin Willford
The code was using two string_lists, one for the directories and
one for the files.  The code never checks the lists independently
so we should be able to only use one list.  The string_list also
is a O(log n) for lookup and insertion.  Switching this to use a
hashmap will give O(1) which will save some time when there are
millions of paths that will be checked.

Also cleaned up a memory leak and method where the return was not
being used.

Signed-off-by: Kevin Willford 
---
 merge-recursive.c | 76 ---
 merge-recursive.h |  3 +--
 2 files changed, 57 insertions(+), 22 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1494ffdb82..ebfe01017f 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -24,6 +24,31 @@
 #include "dir.h"
 #include "submodule.h"
 
+struct path_hashmap_entry {
+   struct hashmap_entry;
+   char path[FLEX_ARRAY];
+};
+
+static int path_hashmap_cmp(const void *cmp_data,
+   const void *entry,
+   const void *entry_or_key,
+   const void *keydata)
+{
+   const struct path_hashmap_entry *a = entry;
+   const struct path_hashmap_entry *b = entry_or_key;
+   const char *key = keydata;
+
+   if (ignore_case)
+   return strcasecmp(a->path, key ? key : b->path);
+   else
+   return strcmp(a->path, key ? key : b->path);
+}
+
+static unsigned int path_hash(const char *path)
+{
+   return ignore_case ? strihash(path) : strhash(path);
+}
+
 static void flush_output(struct merge_options *o)
 {
if (o->buffer_output < 2 && o->obuf.len) {
@@ -314,29 +339,25 @@ static int save_files_dirs(const unsigned char *sha1,
struct strbuf *base, const char *path,
unsigned int mode, int stage, void *context)
 {
+   struct path_hashmap_entry *entry;
int baselen = base->len;
struct merge_options *o = context;
 
strbuf_addstr(base, path);
 
-   if (S_ISDIR(mode))
-   string_list_insert(>current_directory_set, base->buf);
-   else
-   string_list_insert(>current_file_set, base->buf);
+   FLEX_ALLOC_MEM(entry, path, base->buf, base->len);
+   hashmap_entry_init(entry, path_hash(entry->path));
+   hashmap_add(>current_file_dir_set, entry);
 
strbuf_setlen(base, baselen);
return (S_ISDIR(mode) ? READ_TREE_RECURSIVE : 0);
 }
 
-static int get_files_dirs(struct merge_options *o, struct tree *tree)
+static void get_files_dirs(struct merge_options *o, struct tree *tree)
 {
-   int n;
struct pathspec match_all;
memset(_all, 0, sizeof(match_all));
-   if (read_tree_recursive(tree, "", 0, 0, _all, save_files_dirs, o))
-   return 0;
-   n = o->current_file_set.nr + o->current_directory_set.nr;
-   return n;
+   read_tree_recursive(tree, "", 0, 0, _all, save_files_dirs, o);
 }
 
 /*
@@ -646,6 +667,7 @@ static void add_flattened_path(struct strbuf *out, const 
char *s)
 
 static char *unique_path(struct merge_options *o, const char *path, const char 
*branch)
 {
+   struct path_hashmap_entry *entry;
struct strbuf newpath = STRBUF_INIT;
int suffix = 0;
size_t base_len;
@@ -654,14 +676,16 @@ static char *unique_path(struct merge_options *o, const 
char *path, const char *
add_flattened_path(, branch);
 
base_len = newpath.len;
-   while (string_list_has_string(>current_file_set, newpath.buf) ||
-  string_list_has_string(>current_directory_set, newpath.buf) ||
+   while (hashmap_get_from_hash(>current_file_dir_set,
+path_hash(newpath.buf), newpath.buf) ||
   (!o->call_depth && file_exists(newpath.buf))) {
strbuf_setlen(, base_len);
strbuf_addf(, "_%d", suffix++);
}
 
-   string_list_insert(>current_file_set, newpath.buf);
+   FLEX_ALLOC_MEM(entry, path, newpath.buf, newpath.len);
+   hashmap_entry_init(entry, path_hash(entry->path));
+   hashmap_add(>current_file_dir_set, entry);
return strbuf_detach(, NULL);
 }
 
@@ -1945,8 +1969,14 @@ int merge_trees(struct merge_options *o,
if (unmerged_cache()) {
struct string_list *entries, *re_head, *re_merge;
int i;
-   string_list_clear(>current_file_set, 1);
-   string_list_clear(>current_directory_set, 1);
+   /*
+* Only need the hashmap while processing entries, so
+* initialize it here and free it when we are done running
+* through the entries. Keeping it in the merge_options as
+* opposed to decaring a local hashmap is for convenience
+* so that we don't have to pass it to around.
+*/
+   hashmap_init(>current_file_dir_set, 

Re: [PATCHv2] builtin/merge: honor commit-msg hook for merges

2017-09-06 Thread Stefan Beller
On Tue, Sep 5, 2017 at 6:57 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Junio writes:
>>> I didn't check how "merge --continue" is implemented, but we need to
>>> behave exactly the same way over there, too.  Making sure that it is
>>> a case in t7504 may be a good idea, in addition to the test you
>>> added.
>>
>> After inspection of the code I do not think it is a good idea, because
>> (a) it clutters the test suite with something "obvious" for now,
>> the call to cmd_commit will be the same as git-commit on the
>> command line and
>> (b) piping through --[no-]verify would either introduce irregularities
>> ("Why do we pipe through --no-verify, when --sign-off is more 
>> important?")
>> or miss important options to pipe through:
>>
>>   static int continue_current_merge;
>> ...
>>   OPT_BOOL(0, "continue", _current_merge,
>>   N_("continue the current in-progress merge")),
>> ...
>>   if (continue_current_merge) {
>>   int nargc = 1;
>>   const char *nargv[] = {"commit", NULL};
>>
>>   if (orig_argc != 2)
>>   usage_msg_opt(_("--continue expects no arguments"),
>> builtin_merge_usage, builtin_merge_options);
>>
>>   /* Invoke 'git commit' */
>>   ret = cmd_commit(nargc, nargv, prefix);
>>   goto done;
>>   }
>
> That line of thought is backwards.  'something "obvious" for now'
> talks about the present.  tests are all about future-proofing.

I agree, but I did not think a call to cmd_commit would need to
be future-proofed as we already test git-commit, and these
are equal

>
> I also thought that we were hunting calls of cmd_foo() from outside
> the git.c command dispatcher as grave errors and want to clean up
> the codebase to get rid of them.

... but I did not account for this fact. (I was not aware of these being
called grave errors, but assumed this is a good state. And why change
a good state?)

> So the above is the worst example
> to use when you are trying to convince why it needs no test---the
> above is a good example of the code that would need to change soon
> when we have enough volunteers willing to keep the codebase clean
> and healthy, and we would benefit from future-proofing tests.

Given that new fact, I agree with the reasoning to add a new test
for future proofing. In the current form

git merge --continue --no-verify

would trigger to usage_msg_opt(..), so all I'd offer is a test_must_fail
for now?


Re: [PATCH 25/39] sha1_file: allow alt_odb_usable to handle arbitrary repositories

2017-09-06 Thread Stefan Beller
On Wed, Sep 6, 2017 at 1:01 PM, Junio C Hamano  wrote:
> Jonathan Nieder  writes:
>
>> Signed-off-by: Jonathan Nieder 
>> Signed-off-by: Stefan Beller 
>> ---
>>  sha1_file.c | 16 +---
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/sha1_file.c b/sha1_file.c
>> index e7c86e5363..b854cad970 100644
>> --- a/sha1_file.c
>> +++ b/sha1_file.c
>> @@ -25,6 +25,7 @@
>>  #include "repository.h"
>>  #include "object-store.h"
>>  #include "streaming.h"
>> +#include "path.h"
>>  #include "dir.h"
>>  #include "mru.h"
>>  #include "list.h"
>> @@ -281,17 +282,18 @@ static const char *alt_sha1_path(struct 
>> alternate_object_database *alt,
>>  /*
>>   * Return non-zero iff the path is usable as an alternate object database.
>>   */
>> -#define alt_odb_usable(r, p, n) alt_odb_usable_##r(p, n)
>> -static int alt_odb_usable_the_repository(struct strbuf *path,
>> -  const char *normalized_objdir)
>> +static int alt_odb_usable(struct repository *r, struct strbuf *path,
>> +   const char *normalized_objdir)
>
> These token-pasting macros introduced in 07-24/39 were certainly
> cute but they may be rather misleading and useless.  e.g. a natural
> expectaion would be
>
> struct repository *r = _repository;
> prepare_alt_odb(r);
>
> to work, but obviously it wouldn't.  And this step and later in
> 25-39/39 corrects them one by one.

Yes. The intent was to deliver patches that are easiest to review as
that is usually the crux with long series. To do that we came up with this
way:
* patches  07-24 can be reviewed easily by only looking at
   the textual replacement, no need to think about implications.
   It should be boring(in a good way) to review these.
* patches 25-39, that allow arbitrary repositories to be handled
   only need to be reviewed inside its function bounds, because
   each function call that takes a repository argument will be verified
   by the compiler, precisely because the above won't work!
   That way the reviewer can rely on the compile output
   that the conversion was done in the correct order and no
   small function was missed. All review attention can go into
   just looking at the one function that is converted in such a
   patch. no need to double check with other parts of the code.
   We discussed we'd send the patches 25-39 with --function-context.
   The follow ups will do so.

> I suspect that you used the token-pasting cuteness because you
> thought that the callsite would not have to change in the later step
> like 25/39 that removes the trick.

No. We did these two different phases to ease review specifically.

Note how the call site is the function itself, ( --function-context would
have been really neat here)


>  I also suspect that you thought
> that it may be a good thing that prepare_alt_odb(r) does not work
> immediately after 07/39, as the callee is not prepared.  But both of
> these are misguided.
>
> If you have two functions, A and B, that we want to update to
> eventually take a "struct repository *" argument, and if A uses B in
> its implementation, the endgame would be
>
> A(struct repository *r, ...)
> {
> ...
> B(r, ...);
> ...
> }
>
> but with the ##r approach, the call to B in A in the initial
> token-pasting phase would say B(the_repository, ...) so the call
> will need to be changed in the step you update A's implementation to
> take a caller-supplied respotory object anyway.  And the fact that a
> function's first parameter is not the_repository (i.e. leaving B()'s
> signature unchanged while it is not ready to take a repository) is
> sufficient to mark that a function is not yet prepared to take a
> caller-supplied repository object.

The absence of a repository argument is not sufficient that a function
(or any function called by it) does not rely on some global state, which
needs to be put into the repository struct.

So when converting function A (above), the careful reviewer needs to
inspect B, C, D... if they touch global state. Using the current approach,
most functions (maybe just B and C here) would take a repository struct
already, and the compiler would yell if these were not ready for anything
except the_repository. So the reviewer only has to worry about D.

> I found that this aspect of the structure of the series was somewhat
> irritating in that (1) combining the earlier ##r thing with its fix
> would have made the code that needs to be reviewed much cleaner,

agreed.

> (2) it wouldn't have made the patch that longer,

but more complicated to review (and write actually; as it was easy
to reason about functions to be ready for conversion, not using
global state)

> and most importantly
> (3) it is unclear why 24-7 != 39-25, i.e. it is hard to answer "is
> there any ##r hack still remaining after this 

Re: [PATCH] parse-options: warn developers on negated options

2017-09-06 Thread Stefan Beller
On Tue, Sep 5, 2017 at 8:16 PM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> Stefan Beller  writes:
>>
>>>   This patch disallows all no- options, but we could be more open and allow
>>>   --no-options that have the NO_NEG bit set.
>>
>> "--no-foo" that does not take "--foo" is perhaps OK so should not
>> trigger an error.
>>
>> A ("--no-foo", "--foo") pair is better spelled as ("--foo",
>> "--no-foo") pair whose default is "--foo", but making it an error is
>> probably a bit too much.
>>
>> Compared to that, ("--no-foo", "--no-no-foo") pair feels nonsense.
>
> Ahh, I was an idiot (call it vacation-induced-brain-disfunction).  I
> forgot about 0f1930c5 ("parse-options: allow positivation of options
> starting, with no-", 2012-02-25), which may have already made your
> new use of "--no-verify" in builtin/merge.c and existing one in
> commit.c OK long time ago.  A quick check to see how your version of
>
> git merge --verify
> git merge --no-verify
>
> behaves with respect to the commit-msg hook is veriy much
> appreciated, as my tree is in no shape to apply and try a patch
> while trying to absorb the patches sent to the list the past week.
>
> Thanks, and sorry for a possible false alarm.
>
>> Having said that, because the existing parse_options_check() is all
>> about catching the programming mistake (the end user cannot fix an
>> error from it by tweaking the command line option s/he gives to the
>> program), I do not think a conditional compilation like you added
>> mixes well.  Either make the whole thing, not just your new test,
>> conditional to -DDEVELOPER (which would make it possible for you to
>> build and ship a binary with broken options[] array to the end-users
>> that does not die in this function), which is undesirable, or add a
>> new test that catches a definite error unconditionally.
>
> This part still is valid.  If René's work 2 years ago is sufficient
> to address "--no-foo" thing, then there is nothing we need to add to
> this test, but if we later need to add new sanity check, we should
> add it without -DDEVELOPER, or we should make the whole thing inside
> it.

As far as the code is concerned it is only inside the -DDEVELOPER ?
The intent of this patch is to have a developers aid to remind them
that too many negations might be a sign of trouble.

Thanks,
Stefan

>
> Thanks.


Re: [RFC PATCH 0/2] Add named reference to latest push cert

2017-09-06 Thread Stefan Beller
On Wed, Sep 6, 2017 at 2:39 AM, Shikher Verma  wrote:
> Currently, git only stores push certificates if there is a receive hook
> present. This may violate the principle of least surprise (e.g., I
> pushed with --signed, and I don't see anything in upstream).
> Additionally, push certificates could be more versatile if they are not
> tightly bound to git hooks. Finally, it would be useful to verify the
> signed pushes at later points of time with ease.
>
> A named ref is added for ease of access/tooling around push
> certificates. If the last push was signed, ref/PUSH_CERT stores the
> ref of the latest push cert otherwise it is empty.
>
> Sending patches as RFC since the documentation would have to be
> updated and git gc might have to be patched to not garbage collect
> the latest push certificate.
>
> This patch applies on master (3ec7d702a)

What are performance implications for busy repositories at busy hosts?
(think kernel.org / github) They may want to disable this new feature
for performance reasons or because they don't want to clutter the
object store. So at least a config option to turn it off would be useful.

On the ref to store the push certs:
(a) Currently the ref points at the blob, I wonder if we'd rather want to
point at a commit? (Then we can build up a history of
push certs, instead of relying on the reflog to show all
push certs. Also the gc issue might be easier to solve using this)
(b) When going with (a), we might want to change the name. Most
refs are 3 directories deep:
  refs/heads/
  refs/pr/ # at github IIUC
  refs/changes/ # Gerrit
  refs/meta/config # Gerrit to e.g. configure ACLs of the repo
"refs" indicates it is a ref, whereas the second part can be seen
as a "namespace". Currently Git only uses the "heads" and "tags"
namespace, "meta" is used by more than just Gerrit, so maybe it is
not wise to use "refs/meta/push_cert", but go with refs/gitmeta/pushcert
instead?


Re: [PATCH] refs: make sure we never pass NULL to hashcpy

2017-09-06 Thread Thomas Gummerer
On Wed, Sep 6, 2017 at 2:26 AM, Junio C Hamano  wrote:
> Thomas Gummerer  writes:
>
>> gcc on arch linux (version 7.1.1) warns that a NULL argument is passed
>> as the second parameter of memcpy.
>>
>> In file included from refs.c:5:0:
>> refs.c: In function ‘ref_transaction_verify’:
>> cache.h:948:2: error: argument 2 null where non-null expected 
>> [-Werror=nonnull]
>>   memcpy(sha_dst, sha_src, GIT_SHA1_RAWSZ);
>>   ^~~~
>> In file included from git-compat-util.h:165:0,
>>  from cache.h:4,
>>  from refs.c:5:
>> /usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared 
>> here
>>  extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
>>   ^~
>> ...
>> diff --git a/refs.c b/refs.c
>> index ba22f4acef..d8c12a9c44 100644
>> --- a/refs.c
>> +++ b/refs.c
>> @@ -896,10 +896,14 @@ struct ref_update *ref_transaction_add_update(
>>
>>   update->flags = flags;
>>
>> - if (flags & REF_HAVE_NEW)
>> + if (flags & REF_HAVE_NEW) {
>> + assert(new_sha1);
>>   hashcpy(update->new_oid.hash, new_sha1);
>> - if (flags & REF_HAVE_OLD)
>> + }
>> + if (flags & REF_HAVE_OLD) {
>> + assert(old_sha1);
>>   hashcpy(update->old_oid.hash, old_sha1);
>> + }
>
> It is hugely annoying to see a halfway-intelligent compiler forces
> you to add such pointless asserts.
>
> The only way the compiler could error on this is by inferring the
> fact that new_sha1/old_sha1 could be NULL by looking at the callsite
> in ref_transaction_update() where these are used as conditionals to
> set HAVE_NEW/HAVE_OLD that are passed.  Even if the compiler were
> doing the whole-program analysis, the other two callsites of the
> function pass the address of oid.hash[] in an oid structure so it
> should know these won't be NULL.
>
> Or is the compiler being really dumb and triggering an error for
> every use of
>
> memcpy(dst, src, size);
>
> that must now be written as
>
> assert(src);
> memcpy(dst, src, size);
>
> ???  That would be doubly annoying

No, I think it can't quite deal with the flags that are passed in.
I'm on a different
machine today, so I can't actually check, but that's what I would
expect at least.

> I wonder if REF_HAVE_NEW/REF_HAVE_OLD are really needed in these
> codepaths, though.  Perhaps we can instead declare !!new_sha1 means
> we have the new side and rewrite the above part to
>
> if (new_sha1)
> hashcpy(update->new_oid.hash, new_sha1);
>
> without an extra and totally pointless assert()?

Yeah, that seems much nicer.  I'll try that and send a new a patch
(though I won't
get to it before tomorrow).  Thanks for the review.

>>   update->msg = xstrdup_or_null(msg);
>>   return update;
>>  }


Re: [PATCH] read-cache: fix index corruption with index v4

2017-09-06 Thread Thomas Gummerer
On Tue, Sep 5, 2017 at 8:37 PM, Kevin Willford  wrote:
>> From: Thomas Gummerer [mailto:t.gumme...@gmail.com]
>> Sent: Monday, September 4, 2017 4:58 PM

[..]

>> I unfortunately didn't have more time to dig so
>>
>> > As ce->name is however not nul terminated
>>
>> just comes from my memory and from the patch below actually fixing the
>> corruption, so it's really the most likely cause.  Would be great if
>> someone who can remember more about the index could confirm that this
>> is indeed the case.
>>
>
> Digging into this and ce->name IS nul terminated.  The issue comes in when
> the CE_STRIP_NAME is set, which is only set when using a split index.
> This sets the ce->ce_namelen = 0 without changing the actual ce->name buffer.
> When writing the entry for the split index version 4 it was using the first 
> character
> in the ce->name buffer because of the + 1, which obviously isn't correct.
> Before
> it was using a newly allocated name buffer from the ondisk struct which was
> allocated based on the ce_namelen of zero.

Thank you very much for digging into this.  That also explains why
only t1700 was
affected, but none of the other tests.  Will update the commit message.

>>  read-cache.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/read-cache.c b/read-cache.c
>> index 40da87ea71..80830ddcfc 100644
>> --- a/read-cache.c
>> +++ b/read-cache.c
>> @@ -2103,7 +2103,9 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, 
>> struct
>> cache_entry *ce,
>>   if (!result)
>>   result = ce_write(c, fd, to_remove_vi, prefix_size);
>>   if (!result)
>> - result = ce_write(c, fd, ce->name + common,
>> ce_namelen(ce) - common + 1);
>> + result = ce_write(c, fd, ce->name + common,
>> ce_namelen(ce) - common);
>> + if (!result)
>> + result = ce_write(c, fd, "\0", 1);
>
> You could use the padding variable here as well which is used in the < 
> version 4
> ce_write.

Thanks, will do that.

>>
>>   strbuf_splice(previous_name, common, to_remove,
>> ce->name + common, ce_namelen(ce) - common);
>> --
>> 2.14.1.480.gb18f417b89
>
> While looking at the code I was wondering if we could get around the
> whole setting ce->ce_namelen to zero bit but that would be much bigger
> patch and possibly introduce other bugs so this seems the appropriate
> fix for now.
>
> Thanks for finding this!

Thanks for the review! Will send an updated patch in a bit.

> Kevin


Re: [PATCH 25/39] sha1_file: allow alt_odb_usable to handle arbitrary repositories

2017-09-06 Thread Junio C Hamano
Jonathan Nieder  writes:

> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Stefan Beller 
> ---
>  sha1_file.c | 16 +---
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index e7c86e5363..b854cad970 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -25,6 +25,7 @@
>  #include "repository.h"
>  #include "object-store.h"
>  #include "streaming.h"
> +#include "path.h"
>  #include "dir.h"
>  #include "mru.h"
>  #include "list.h"
> @@ -281,17 +282,18 @@ static const char *alt_sha1_path(struct 
> alternate_object_database *alt,
>  /*
>   * Return non-zero iff the path is usable as an alternate object database.
>   */
> -#define alt_odb_usable(r, p, n) alt_odb_usable_##r(p, n)
> -static int alt_odb_usable_the_repository(struct strbuf *path,
> -  const char *normalized_objdir)
> +static int alt_odb_usable(struct repository *r, struct strbuf *path,
> +   const char *normalized_objdir)

These token-pasting macros introduced in 07-24/39 were certainly
cute but they may be rather misleading and useless.  e.g. a natural
expectaion would be

struct repository *r = _repository;
prepare_alt_odb(r);

to work, but obviously it wouldn't.  And this step and later in
25-39/39 corrects them one by one.

I suspect that you used the token-pasting cuteness because you
thought that the callsite would not have to change in the later step
like 25/39 that removes the trick.  I also suspect that you thought
that it may be a good thing that prepare_alt_odb(r) does not work
immediately after 07/39, as the callee is not prepared.  But both of
these are misguided.

If you have two functions, A and B, that we want to update to
eventually take a "struct repository *" argument, and if A uses B in
its implementation, the endgame would be

A(struct repository *r, ...)
{
...
B(r, ...);
...
}

but with the ##r approach, the call to B in A in the initial
token-pasting phase would say B(the_repository, ...) so the call
will need to be changed in the step you update A's implementation to
take a caller-supplied respotory object anyway.  And the fact that a
function's first parameter is not the_repository (i.e. leaving B()'s
signature unchanged while it is not ready to take a repository) is
sufficient to mark that a function is not yet prepared to take a
caller-supplied repository object.

I found that this aspect of the structure of the series was somewhat
irritating in that (1) combining the earlier ##r thing with its fix
would have made the code that needs to be reviewed much cleaner, (2)
it wouldn't have made the patch that longer, and most importantly
(3) it is unclear why 24-7 != 39-25, i.e. it is hard to answer "is
there any ##r hack still remaining after this series?  if so why?"

The end-result of the whole series makes me think that it is going
in the right direction, though.

Thanks.


Re: gitmodules below root directory

2017-09-06 Thread Junio C Hamano
Robert Dailey  writes:

> The gitmodules documentation[1] states that the .gitmodules file is at
> the root. However, it would be nice if this could be supported in any
> directory similar to how .gitignore works.

I have a mild suspicion that there would be a huge impedance
mismatch between what gitmodules file is meant to do and the way
ignore/attribute setting is done.

When the mechanism is primarily about expressing a few generic
traits that are shared by things that can be grouped by paths
(e.g. "all paths whose pathnames match '*.py' pattern contain text",
"all paths in sub/ directory are ignored"), it may make sense to
spread the information across multiple .gitignore files and make the
closest one take precedence over the further ones.  Even though
allowing multiple sources of information spread over the tree leads
to end-user confusion (e.g. "why is this path ignored?", which
triggered the debugging aid "git check-ignore"), such a grouping by
pattern matching on paths (which is what makes "closest file take
precedence" meaningful) to assign generic traits (e.g. "it's text")
makes it worthwhile by allowing to express the rules more concisely.

Compared to that, what .gitmodules file expresses is more specific
to each submodule---no two submodules in your single superproject
would share the same URL, unless you are doing something quite
unusual, for example.  Having a single file also means that updating
is much simpler---"git submodule add" and other things do not have
to choose among .gitmodules, a/.gitmodules and a/b/.gitmodules when
they update an entry for the submodule at path "a/b/c".

Having said that, I do not think the current ".gitmodules must be at
the top and nothing else matters" is ideal.  A possible change that
I suspect may make more sense is to get rid of .gitmodules file,
instead of spreading more of them all over the tree.

The current gitlink implementation records only the "what commit
from the subproject history is to be checked out at this path?" and
nothing else, by storing a single SHA-1 that happens to be the name
of the commit object (but the superproject does not even care the
fact that it is a commit or a random string).  We could substitute
that with the name of a blob object that belongs to the superproject
history and records the information about the submodule at the path
(e.g. "which repository the upstream project recommends to clone the
subproject from?", "what commit object is to be checked out").  

When you see a single tree of a superproject, you need to see what
commit is to be checked out from the tree object and everything else
needs to be read from the .gitmodules file in that tree in the
current system, but it does not have to be that way.







Re: [PATCH 06/34] clone: release strbuf after use in remove_junk()

2017-09-06 Thread Junio C Hamano
Rene Scharfe  writes:

> Signed-off-by: Rene Scharfe 
> ---
>  builtin/clone.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 8d11b570a1..dbddd98f80 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -487,28 +487,28 @@ N_("Clone succeeded, but checkout failed.\n"
> ...
>   if (junk_git_dir) {
>   strbuf_addstr(, junk_git_dir);
>   remove_dir_recursively(, 0);
>   strbuf_reset();
>   }
>   if (junk_work_tree) {
>   strbuf_addstr(, junk_work_tree);
>   remove_dir_recursively(, 0);
> - strbuf_reset();
>   }
> + strbuf_release();
>  }

The code definitely needs a _release() at the end, but I feel
lukewarm about the "if we are about to _release(), do not bother to
_reset()" micro-optimization.  Keeping the existing two users that
use sb as a (shared and reused) temporary similar would help those
who add the third one or reuse the pattern in their code elsewhere.

We could flip the "be nice to the next user by clearing after use"
pattern to "clear any potential cruft before you use", i.e.

if (...) {
strbuf_reset();
strbuf_addstr(, ...);
}
if (...) {
strbuf_reset();
strbuf_addstr(, ...);
}
strbuf_release();

but that still tempts us for the same micro-optimization at the
beginning where sb hasn't been used since STRBUF_INIT, so it is not
a real "solution".

So, I dunno.


Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list

2017-09-06 Thread Junio C Hamano
Martin Ågren  writes:

> Junio: The topic is in pu as ma/split-symref-update-fix. Re-roll or new
> topic or as a separate "patch 4/3" for you to place on top, any
> preference?

Until a topic hits 'next', you solely own it and it is mostly up to
you how to go about improving it.  My preference would be much less
relevant than what the end result would require, i.e

> If we re-roll, would you prefer Peff's much smaller take on patch 2
> (strbuf_release where it matters, instead of sprinkling "goto out" all
> over)? I think me and him agreed that we'd be fine either way. I'd reuse
> my commit message, if I get his sign-off and "From:".

... if we take Peff's approach, then rerolling the whole thing would
be the only approach that makes sense---incremental update would
make the resulting history less readable.  Between two approaches I
do not have a strong preference either way---it's a choice that can
be made between you and Peff.

> Obviously, if Michael or anyone else has any input, I'm open to that as
> well..

Sure.


Re: [PATCH 00/34] plug strbuf memory leaks

2017-09-06 Thread Junio C Hamano
Rene Scharfe  writes:

> Release allocated strbufs in functions that are at least potentionally
> library-like; cmd_*() functions are out of scope because the process
> ends with them and the OS cleans up for us anyway.  The patches are
> split by function and were generated with --function-context for easier
> reviewing.

I read through all of them and they looked sensible.  I had a few
comments on individual patches, though.



Re: [PATCH 34/34] wt-status: release strbuf after use in wt_longstatus_print_tracking()

2017-09-06 Thread Junio C Hamano
Rene Scharfe  writes:

> If format_tracking_info() returns 0 only if it didn't touch its strbuf
> parameter, so it's OK to exit early in that case.  Clean up sb in the
> other case.

These two "if"s confuse me; perhaps the first one is not needed?



Re: [PATCH 01/39] pack: make packed_git_mru global a value instead of a pointer

2017-09-06 Thread Junio C Hamano
Jeff King  writes:

> As an aside, the mru code could probably be simplified a bit by reusing
> the list implementation from list.h (both were added around the same
> time, and it wasn't worth creating a dependency then, but I think list.h
> is useful and here to stay at this point).
>
> It's definitely not critical to put that into this already-large series,
> though.  Maybe we can use Junio's #leftoverbits tag. :)

I had the same reaction while reading it; perhaps I should read the
responses from trusted reviewers first before reading the patches
myself ;-)

Thanks.


Re: [PATCH 27/34] shortlog: release strbuf after use in insert_one_record()

2017-09-06 Thread Junio C Hamano
Rene Scharfe  writes:

> Signed-off-by: Rene Scharfe 
> ---
>  builtin/shortlog.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/builtin/shortlog.c b/builtin/shortlog.c
> index 43c4799ea9..48af16c681 100644
> --- a/builtin/shortlog.c
> +++ b/builtin/shortlog.c
> @@ -50,66 +50,67 @@ static int compare_by_list(const void *a1, const void *a2)
>  static void insert_one_record(struct shortlog *log,
> const char *author,
> const char *oneline)
>  {
> ...
>   item = string_list_insert(>list, namemailbuf.buf);
> + strbuf_release();

As log->list.strdup_strings is set to true in shortlog_init(),
namemailbuf.buf will leak without this.

An alterative, as this is the only place we add to log->list, could
be to make log->list take ownership of the string by not adding a
_release() here but instead _detach(), I guess.

It is also curious that at the end of shortlog_output(), we set the
log->list.strdup_strings again to true immediately before calling
string_list_clear() on it.  I suspect that is unnecessary?



Your Attention Needed please

2017-09-06 Thread Mr. patrick joseph
Dear Friend,

It's my pleasure to write you today, I am Mr. Patrick Joseph, I work
with the African Development Bank, I wish to place your name as the
beneficiary to $17.5 million US dollars only due to the death of the
depositor who died years ago along with her family.

However, I assure you that this transaction will be executed under a
legitimate arrangement that will protect you from any breach of the
law both in your country and here in Ouagadougou Burkina Faso, West
Africa once the fund is transferred to your bank account.The funds I
am referring to is right here in this bank floating in suspense
account, therefore I solicit for your to be in collaboration with me
to have this done, it will be transferred into an account you will
provide any where in the world. When you receive this letter, kindly
send me an e-mail sigifying your decision including your full name and
country, your home & office address, your phone / fax number urgently
to enable the attorney start the processing of the all the relevant
legal documents for the fund remittance.

Waiting for your reply soon.

Thanks,
Mr. Patrick Joseph.


Re: gitmodules below root directory

2017-09-06 Thread Stefan Beller
On Wed, Sep 6, 2017 at 6:53 AM, Robert Dailey  wrote:
> The gitmodules documentation[1] states that the .gitmodules file is at
> the root. However, it would be nice if this could be supported in any
> directory similar to how .gitignore works. Right now git-subrepo does
> not support submodules inside of a subrepo[2] (I suspect subtrees
> would have the same problem, but I did not verify). I think this is a
> limitation of git, rather than subrepo itself. Perhaps there are
> reasons why .gitmodules must be at the root, but I at least wanted to
> point it out and see if this could be supported.
>
> [1]: https://git-scm.com/docs/gitmodules
> [2]: https://github.com/ingydotnet/git-subrepo/issues/262

I agree that subtree likely suffers the same problem.
And at first it seems reasonable to want to have .gitmodules
at deeper trees supported, as that would fix subtree and subrepo
(and others) with ease.

Historically the need to store submodule URLs were the motivation
for having the .gitmodules file. An absolute URL for a submodule would
work fine no matter where the .gitmodules file would be located.
Relative URLs are currently defined as relative to the top level
of the project, which we would need to inspect if the anchor
is chosen well at the root or if we would want to allow anchoring
the relative URL within the tree. (This is no reason against
.gitmodules in deep trees, just pointing out the work required).

But does the URL still make sense? For absolute URLs this is likely
the case, for relative URLs my bets are off. Maybe?

It turned out that people want to e.g. move, delete and re-introduce
submodules, which is why the location of a submodule git directory
was moved to be either inside the tree (to keep supporting existing
git repos with submodules) as well as interned in the superproject.

In the example given in [2], the git dir of the submodule
("folder B") may be located at .git/modules/nameB as seen
from the root of RepoX:

RepoX
+ folder A
+ folder B (submodule)
+ .gitmodules
+ .git # regular RepoX git dir
   + modules/

An important mechanism of the .gitmodules file is
the resolution of the "name" and the "path" of the
submodule. (Given the path of a gitlink entry, where do
I find the git repository for the submodule? vice versa is slightly
less relevant: Given this git repository deep inside my own git
directory, where is the working tree)

So in the example we'd have

RepoY
+ RepoX (subrepo)
  + folder A
  + folder B (submodule)
 +.gitmodules

The path entry in the .gitmodules file would not change via
subtree/subrepo merge, such that Git would need to know
that the actual path to the submodule is the
concatenation of 'path to tree in which the .gitmodules file is'
and the given path inside the .gitmodules file. Seems doable so far.

What about the name of a submodule? The .gitmodules file
follows the syntax of git config files, such that names cannot
occur twice as the names are stored as the section name:

  [submodule "nameB"]
path = "folder B"

And I would think the property of having unique names
is important, such that each submodule has its unique
place to put its git dir inside the superprojects
"$GIT_DIR/modules/".

With multiple .gitmodules files, we would loose the
uniqueness property. (It may not be too bad, maybe
even a clever hack, haven't thought about it deeply,
but it seems ugly at first)

As said above, the name<->path resolution is
important, (and shall be unique, deterministic and simple),
so how do we do it? What about the case where we have

  .gitmodules "name" -> dir/path
  dir/.gitmodules "name" -> ./path

In this case we'd have the same mapping, but using this
mechanism we can map multiple names at the same path,
and we could choose to resolve a given path in different
.gitmodules files, which is cumbersome.

  anotherdir/.gitmodules "name" -> ../dir/path

seems crazy, too.

What about moving submodules?
Consider the example as in [2] again:

  $ git mv RepoX/folderB dir/sub
  $ git commit -m "move submodule"
  # ok fine, we can come up with a plan
  # where to put the submodule configuration,
  # maybe in dir/.gitmodules?

  $ git rm RepoX
  $ git commit -m "don't need the rest of RepoX"
  # observation: we would not want
  # RepoX/.gitmodules to still have impact on
  # the submodule.

  $ git revert HEAD^^ # undo the initial move
  # we'd move the .gitmodules file back to RepoX/.

tl;dr: I think this idea produces lots of interesting
corner cases in the data model, let's not go there
without having an idea how to solve them.

>From an implementation stand point:
The submodule-config API could easily enhanced
to support reading multiple .gitmodules files (in case
their location is well defined, we would not want to
walk the whole tree recursively). This API is only
easily accessible from within C, such that current
implementing this idea in git-submodule.sh would
be a hassle to do. 

Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list

2017-09-06 Thread Martin Ågren
On 5 September 2017 at 23:26, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> I noticed the HEAD funniness, too, when looking at this earlier. I agree
>> with Junio that it's not quite consistent with the general rule of
>> "string list items point to their refnames", but I don't think it
>> matters in practice.
>
> Yup, we are on the same page; the "fix" I was alluding to would look
> exactly like what you wrote below, but I agree the distinction does
> not matter in practice.  IOW, I do not think the code after Martin's
> fix is wrong per-se.

Well, "not wrong per-se" tells me you'd feel slightly more comfortable
about the state of things if we did this. ;)

I'll take Peff's hint, tweak/add comments for correctness and symmetry
with the previous patch and add an if-BUG for symmetry. Peff: Do I have
your sign-off? (Do I need it?)

Junio: The topic is in pu as ma/split-symref-update-fix. Re-roll or new
topic or as a separate "patch 4/3" for you to place on top, any
preference?

If we re-roll, would you prefer Peff's much smaller take on patch 2
(strbuf_release where it matters, instead of sprinkling "goto out" all
over)? I think me and him agreed that we'd be fine either way. I'd reuse
my commit message, if I get his sign-off and "From:".

Obviously, if Michael or anyone else has any input, I'm open to that as
well..

Martin


Re: [PATCH v4 11/16] revision.c: --all adds HEAD from all worktrees

2017-09-06 Thread Stefan Beller
On Wed, Sep 6, 2017 at 4:19 AM, Duy Nguyen  wrote:
>
> So, probably no worktree iterator (yet).

Ok, thanks for considering it.


Re: [PATCH v4 02/16] refs.c: use is_dir_sep() in resolve_gitlink_ref()

2017-09-06 Thread Stefan Beller
On Wed, Sep 6, 2017 at 4:08 AM, Duy Nguyen  wrote:
> On Thu, Aug 24, 2017 at 2:14 AM, Stefan Beller  wrote:
>> On Wed, Aug 23, 2017 at 5:36 AM, Nguyễn Thái Ngọc Duy  
>> wrote:
>>> The "submodule" argument in this function is a path, which can have
>>> either '/' or '\\' as a separator. Use is_dir_sep() to support both.
>>>
>>> Noticed-by: Johannes Sixt 
>>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>>
>> Immediate questions that come to mind:
>> * Could this go in as an independent bug fix?
>
> It probably could. But it may depend on other submodule changes in this 
> series.
>
>>   -> If so do we have any test that fails or stops failing on Windows?
>
> It was spotted during the review [1] so I guess no test fails on
> Windows (not that I would know because I can't run tests on Windows)
>
>>   -> If not, do we need one?
>
> Since I don't use Windows, I don't particularly care. And I can't add
> one anyway because I can't run it.
>
> [1] 
> https://public-inbox.org/git/%3ca74cf309-fb16-2f45-8189-d1d0c655d...@kdbg.org%3E/

IIRC I asked these questions as I was genuinely confused,
I do not mind too much either.

>
>> * Assuming this is not an independent bug fix:
>>   Why do we need this patch in this series here?
>>   (I thought this is about worktrees, not submodules.
>>   So does this fix a potential bug that will be exposed
>>   later in this series, but was harmless up to now?)
>
> The series could probably be split in two. The first part is about
> using the new refs_* API in revision.c. This helps clean up refs API a
> bit, all *_submodule() functions will be one. Not strictly needed to
> be part of this, it's just a continuation of my previous series that
> introduces *_refs. Since submodule code is touched, this is found.
>
> The second part is actually fixing the prune bug.
>
> Should I split the series in two? I think I separated two parts in the
> past (maybe I misremember) at least in the description...

I had to reread the coverletter
https://public-inbox.org/git/20170823123704.16518-1-pclo...@gmail.com/
to get that part. I would not be opposed to splitting the series, but
I'll review an unsplit series as well.

Thanks,
Stefan

> --
> Duy


Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives

2017-09-06 Thread Martin Ågren
On 5 September 2017 at 15:05, Jeff King  wrote:
>   1. It can be compiled conditionally. There's no need in
>  normal runs to do this free(), and it just wastes time.
>  By using a macro, we can get the benefit for leak-check
>  builds with zero cost for normal builds (this patch
>  uses a compile-time check, though we could clearly also
>  make it a run-time check at very low cost).
>
>  Of course one could also hide free() behind a macro, so
>  this is really just arguing for having UNLEAK(), not
>  for its particular implementation.

Like Stefan, I didn't quite follow 1. until after I had read the points
below it. But it's still a very good commit message (as always).

> diff --git a/builtin/commit.c b/builtin/commit.c
> index b3b04f5dd3..de775d906c 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1819,5 +1819,6 @@ int cmd_commit(int argc, const char **argv, const char 
> *prefix)
> print_summary(prefix, , !current_head);
>
> strbuf_release();
> +   UNLEAK(sb);
> return 0;
>  }

These are both strbufs, so this ends up being a bit inconsistent. What
would be the ideal end state for these two and all other such
structures? My guess is "always UNLEAK", as opposed to carefully judging
whether foo_release() would/could add any significant overhead.

In other words, it would be ok/wanted with changes such as "let's UNLEAK
bar, because ..., and while at it, convert the existing foo_release to
UNLEAK for consistency" (or per policy, for smaller binary, whatever).
Or "if it ain't broken, don't fix it"? Did you think about this, or was
it more a random choice?

Martin


[PATCH v2] hashmap: address ThreadSanitizer concerns

2017-09-06 Thread Jeff Hostetler
From: Jeff Hostetler 

Version 2 addresses the comments and suggestions on version 1.
It removes the explicit disable/enable rehash and just relies
on the state of hashmap counting.
It changes the declaration of the hashmap_get_size() to be
static to avoid issues seen on some compilers.
It uses BUG() rather than die() for an error condition.
It adds a comment describing why lazy-init needs to disable
couting.
It fixes line length problems.
It add details from TSan in the commit message.

Jeff Hostetler (1):
  hashmap: add API to disable item counting when threaded

 attr.c  | 15 ++-
 builtin/describe.c  |  2 +-
 hashmap.c   | 26 +++---
 hashmap.h   | 72 ++---
 name-hash.c | 10 +--
 t/helper/test-hashmap.c |  3 ++-
 6 files changed, 88 insertions(+), 40 deletions(-)

-- 
2.9.3



[PATCH v2] hashmap: add API to disable item counting when threaded

2017-09-06 Thread Jeff Hostetler
From: Jeff Hostetler 

This is to address concerns raised by ThreadSanitizer on the mailing list
about threaded unprotected R/W access to map.size with my previous "disallow
rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4).

See:
https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.ag...@gmail.com/

Add API to hashmap to disable item counting and thus automatic rehashing.
Also include API to later re-enable them.

When item counting is disabled, the map.size field is invalid.  So to
prevent accidents, the field has been renamed and an accessor function
hashmap_get_size() has been added.  All direct references to this
field have been been updated.  And the name of the field changed
to map.private_size to communicate this.

Here is the relevant output from ThreadSanitizer showing the problem:

WARNING: ThreadSanitizer: data race (pid=10554)
  Read of size 4 at 0x0082d488 by thread T2 (mutexes: write M16):
#0 hashmap_add hashmap.c:209
#1 hash_dir_entry_with_parent_and_prefix name-hash.c:302
#2 handle_range_dir name-hash.c:347
#3 handle_range_1 name-hash.c:415
#4 lazy_dir_thread_proc name-hash.c:471
#5  

  Previous write of size 4 at 0x0082d488 by thread T1 (mutexes: write M31):
#0 hashmap_add hashmap.c:209
#1 hash_dir_entry_with_parent_and_prefix name-hash.c:302
#2 handle_range_dir name-hash.c:347
#3 handle_range_1 name-hash.c:415
#4 handle_range_dir name-hash.c:380
#5 handle_range_1 name-hash.c:415
#6 lazy_dir_thread_proc name-hash.c:471
#7  

Martin gives instructions for running TSan on test t3008 in this post:
https://public-inbox.org/git/CAN0heSoJDL9pWELD6ciLTmWf-a=oyxe4exxomckvsg5msuz...@mail.gmail.com/

Signed-off-by: Jeff Hostetler 
---
 attr.c  | 15 ++-
 builtin/describe.c  |  2 +-
 hashmap.c   | 26 +++---
 hashmap.h   | 72 ++---
 name-hash.c | 10 +--
 t/helper/test-hashmap.c |  3 ++-
 6 files changed, 88 insertions(+), 40 deletions(-)

diff --git a/attr.c b/attr.c
index 56961f0..195aa6d 100644
--- a/attr.c
+++ b/attr.c
@@ -149,10 +149,12 @@ struct all_attrs_item {
 static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
 {
int i;
+   unsigned int size;
 
hashmap_lock(map);
 
-   if (map->map.size < check->all_attrs_nr)
+   size = hashmap_get_size(>map);
+   if (size < check->all_attrs_nr)
die("BUG: interned attributes shouldn't be deleted");
 
/*
@@ -161,13 +163,13 @@ static void all_attrs_init(struct attr_hashmap *map, 
struct attr_check *check)
 * field), reallocate the provided attr_check instance's all_attrs
 * field and fill each entry with its corresponding git_attr.
 */
-   if (map->map.size != check->all_attrs_nr) {
+   if (size != check->all_attrs_nr) {
struct attr_hash_entry *e;
struct hashmap_iter iter;
hashmap_iter_init(>map, );
 
-   REALLOC_ARRAY(check->all_attrs, map->map.size);
-   check->all_attrs_nr = map->map.size;
+   REALLOC_ARRAY(check->all_attrs, size);
+   check->all_attrs_nr = size;
 
while ((e = hashmap_iter_next())) {
const struct git_attr *a = e->value;
@@ -235,10 +237,11 @@ static const struct git_attr *git_attr_internal(const 
char *name, int namelen)
 
if (!a) {
FLEX_ALLOC_MEM(a, name, name, namelen);
-   a->attr_nr = g_attr_hashmap.map.size;
+   a->attr_nr = hashmap_get_size(_attr_hashmap.map);
 
attr_hashmap_add(_attr_hashmap, a->name, namelen, a);
-   assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
+   assert(a->attr_nr ==
+  (hashmap_get_size(_attr_hashmap.map) - 1));
}
 
hashmap_unlock(_attr_hashmap);
diff --git a/builtin/describe.c b/builtin/describe.c
index 89ea1cd..1e3cbc7 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -505,7 +505,7 @@ int cmd_describe(int argc, const char **argv, const char 
*prefix)
 
hashmap_init(, (hashmap_cmp_fn) commit_name_cmp, NULL, 0);
for_each_rawref(get_name, NULL);
-   if (!names.size && !always)
+   if (!hashmap_get_size() && !always)
die(_("No names found, cannot describe anything."));
 
if (argc == 0) {
diff --git a/hashmap.c b/hashmap.c
index 9b6a128..d42f01f 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -116,9 +116,6 @@ static void rehash(struct hashmap *map, unsigned int 
newsize)
unsigned int i, oldsize = map->tablesize;
struct hashmap_entry **oldtable = map->table;
 
-   if (map->disallow_rehash)
-   return;
-
alloc_table(map, newsize);
for (i = 0; i < oldsize; i++) {
 

Re: [PATCH] hashmap: add API to disable item counting when threaded

2017-09-06 Thread Jeff Hostetler



On 9/5/2017 9:24 PM, Junio C Hamano wrote:

Jeff Hostetler  writes:


From: Jeff Hostetler 



I feel somewhat stupid to say this, especially after seeing many
people applaud this patch, but I do not seem to be able to even
build Git with this patch.  I am getting:

 common-main.o: In function `hashmap_get_size':
 /home/gitster/w/git.git/hashmap.h:260: multiple definition of 
`hashmap_get_size'
 fast-import.o:/home/gitster/w/git.git/hashmap.h:260: first defined here
 libgit.a(argv-array.o): In function `hashmap_get_size':
 /home/gitster/w/git.git/hashmap.h:260: multiple definition of 
`hashmap_get_size'
 fast-import.o:/home/gitster/w/git.git/hashmap.h:260: first defined here
 libgit.a(attr.o): In function `hashmap_get_size':
 ...

and wonder if others are building with different options or something..


That's odd. I'm not seeing that.  My Ubuntu VM is reporting:
$ gcc --version
gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005

I'll change it to static in the next version.  Hopefully that will
take care of it.

Thanks,
Jeff



Re: [PATCH] rev-parse: don't trim bisect refnames

2017-09-06 Thread Michael Haggerty
On Wed, Sep 6, 2017 at 1:53 PM, Jeff King  wrote:
> [...]
> Subject: [PATCH] rev-parse: don't trim bisect refnames
>
> Using for_each_ref_in() with a full refname has always been
> a questionable practice, but it became an error with
> b9c8e7f2fb (prefix_ref_iterator: don't trim too much,
> 2017-05-22), making "git rev-parse --bisect" pretty reliably
> show a BUG.
>
> Commit 03df567fbf (for_each_bisect_ref(): don't trim
> refnames, 2017-06-18) fixed this case for revision.c, but
> rev-parse handles this option on its own. We can use the
> same solution here (and piggy-back on its test).

It looks good to me.

Michael


gitmodules below root directory

2017-09-06 Thread Robert Dailey
The gitmodules documentation[1] states that the .gitmodules file is at
the root. However, it would be nice if this could be supported in any
directory similar to how .gitignore works. Right now git-subrepo does
not support submodules inside of a subrepo[2] (I suspect subtrees
would have the same problem, but I did not verify). I think this is a
limitation of git, rather than subrepo itself. Perhaps there are
reasons why .gitmodules must be at the root, but I at least wanted to
point it out and see if this could be supported.

[1]: https://git-scm.com/docs/gitmodules
[2]: https://github.com/ingydotnet/git-subrepo/issues/262


Re: [PATCH] name-rev: change ULONG_MAX to TIME_MAX

2017-09-06 Thread Jeff King
On Wed, Sep 06, 2017 at 01:59:31PM +0200, Michael J Gruber wrote:

> BTW, there's more fallout from those name-rev changes: In connection
> with that other thread about surprising describe results for emacs.git I
> noticed that I can easily get "git name-rev --stdin" to segfault there.
> As easy as
> 
> echo bc5d96a0b2a1dccf7c459e40d21b54c977f4 | git name-rev --stdin
> 
> for example.
> 
> That's unfortunate for the use-case of name-rev to amend git log output.
> 
> The reason seems to be that with "--stdin" or "--all", "name-rev" walks
> and names all commits before beginning to use that those names for even
> a single commit as above.
> 
> That segfault bisects to the logic changing commit in
> jc/name-rev-lw-tag, but I think the changed logic simply leads to more
> xmallocs() the segfault sooner now. Or something that I dind't spot even
> after a few hours.

The segfault seems to be due to running out of stack space. The problem
is that name_rev() is recursive over the history graph.  That topic
added a parameter to the function, which increased the memory used for
each level of the recursion. But the fundamental problem has always been
there. The right solution is to switch to iteration (with our own stack
structure if necessary).

We had similar problems with the recursive --contains traversal in tag,
and ended up with cbc60b6720 (git tag --contains: avoid stack overflow,
2014-04-24).

-Peff


Re: [PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function

2017-09-06 Thread Jeff King
On Wed, Sep 06, 2017 at 03:23:42PM +0200, Johannes Schindelin wrote:

> On Wed, 6 Sep 2017, Jeff King wrote:
> 
> > I find the diff hard to read because it shows the opposite of what I did
> > (moving the big block to its own function, rather than moving the little
> > bits to a new copy of the function).
> 
> I guess --patience would have made it slightly more readable. I did not
> find any commit in your fork on GitHub with the commit subject, else I
> would have pasted the output of `git diff --patience` here.

I hadn't pushed it yet (but it's there now, 130217f4a). Neither
--patience nor --histogram produces markedly different results. I think
the issue is just that what I actually did (moving the big inner block)
is not nearly as minimal a diff as moving what surrounds it.

-Peff


Re: [PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function

2017-09-06 Thread Johannes Schindelin
Hi Peff,

On Wed, 6 Sep 2017, Jeff King wrote:

> I find the diff hard to read because it shows the opposite of what I did
> (moving the big block to its own function, rather than moving the little
> bits to a new copy of the function).

I guess --patience would have made it slightly more readable. I did not
find any commit in your fork on GitHub with the commit subject, else I
would have pasted the output of `git diff --patience` here.

Ciao,
Dscho


Re: [PATCH] config: use a static lock_file struct

2017-09-06 Thread Jeff King
On Wed, Sep 06, 2017 at 12:59:46PM +0900, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > In the long run we may want to drop the "tempfiles must remain forever"
> > rule. This is certainly not the first time it has caused confusion or
> > leaks. And I don't think it's a fundamental issue, just the way the code
> > is written. But in the interim, this fix is probably worth doing.
> 
> I notice you also have a series to rework the "must remain forever"
> on the list, but nevertheless let's take this one first.
> 
> Will queue.

Yes, this will conflict with that series (but the resolution should just
be to throw this one away). I'm fine with taking this in the meantime,
though it's certainly not urgent (my main goal was getting the pre-test
bits of test-lib.sh to run with leak detection on; I don't think this
was actually affecting anybody in the real world).

-Peff


Re: [PATCH 0/10] towards clean leak-checker output

2017-09-06 Thread Jeff King
On Tue, Sep 05, 2017 at 10:41:51PM +0200, Martin Ågren wrote:

> FWIW, this series (combined with the other series you mentioned) makes
> t and t0001 pass for me with gcc/clang. There are actually some
> leaks in t, they're just silently being reported to stderr, since
> the exit statuses from git are hidden by pipes. Maybe you're already
> aware of it. Depending on your definition of "running clean" it might be
> out of scope for this series, which is of course still very interesting
> and enlightening as it stands.

Yeah, I saw those. I don't think they're worth hunting down at this
point. It's quite probable that the same leaks are exercised elsewhere,
and we'll catch them later.

Once the whole test suite runs without reporting any leaks via abort(),
I'll feel more compelled to start grepping stderr for other cases. :)

> One is in cmd_show (you did mention git show) where we leak data in rev.
> The other is some use of strdup. I can't immediately figure out how to
> get a useful stacktrace (you mentioned this as well) and it's past
> bed-time here. I'll try to play more with this tomorrow.

I think I got funny truncated stack traces with just LSAN that went away
with ASAN, but I didn't dig. If you try that, remember that:

  - Technically just SANITIZE=address turns on the leak detector, but
you need "leak" in the string to turn on the UNLEAK() magic. So use
"SANITIZE=address,leak".

  - You'll have to set ASAN_OPTIONS=abort_on_error=1 in the environment
manually to enable leak detection.

Once the test suite passes with leak detection on, I think we'd probably
do both of those automatically (but until then they make ASAN by itself
unusable).

> Note for self: getting rid of all pipes would probably also help flush
> out a few leaks (or "introduce" them, depending on your viewpoint).

Agreed. These leaks are really just a special form of bug. Those pipes
could be hiding other bugs, too. But again, I'm not too concerned. The
places that pipe git are doing so because they are not meant to be
testing that particular command (and it should generally be covered
elsewhere).

-Peff


[PATCH 2/2] git_extract_argv0_path: do nothing without RUNTIME_PREFIX

2017-09-06 Thread Jeff King
When the RUNTIME_PREFIX compile-time knob isn't set, we
never look at the argv0_path we extract. We can push its
declaration inside the #ifdef to make it more clear that the
extract code is effectively a noop.

This also un-confuses leak-checking of the argv0_path
variable when RUNTIME_PREFIX isn't set. The compiler is free
to drop this static variable that we set but never look at
(and "gcc -O2" does so).  But the compiler still must call
strbuf_detach(), since it doesn't know whether that function
has side effects; it just throws away the result rather than
putting it into the global.

Leak-checkers which work by scanning the data segment for
pointers to heap blocks would normally consider the block
as reachable at program end. But if the compiler removes the
variable entirely, there's nothing to find.

Signed-off-by: Jeff King 
---
 exec_cmd.c | 33 +++--
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index 61092e9715..ce192a2d64 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -5,9 +5,9 @@
 #define MAX_ARGS   32
 
 static const char *argv_exec_path;
-static const char *argv0_path;
 
 #ifdef RUNTIME_PREFIX
+static const char *argv0_path;
 
 static const char *system_prefix(void)
 {
@@ -27,6 +27,20 @@ static const char *system_prefix(void)
}
return prefix;
 }
+
+void git_extract_argv0_path(const char *argv0)
+{
+   const char *slash;
+
+   if (!argv0 || !*argv0)
+   return;
+
+   slash = find_last_dir_sep(argv0);
+
+   if (slash)
+   argv0_path = xstrndup(argv0, slash - argv0);
+}
+
 #else
 
 static const char *system_prefix(void)
@@ -34,6 +48,10 @@ static const char *system_prefix(void)
return PREFIX;
 }
 
+void git_extract_argv0_path(const char *argv0)
+{
+}
+
 #endif /* RUNTIME_PREFIX */
 
 char *system_path(const char *path)
@@ -47,19 +65,6 @@ char *system_path(const char *path)
return strbuf_detach(, NULL);
 }
 
-void git_extract_argv0_path(const char *argv0)
-{
-   const char *slash;
-
-   if (!argv0 || !*argv0)
-   return;
-
-   slash = find_last_dir_sep(argv0);
-
-   if (slash)
-   argv0_path = xstrndup(argv0, slash - argv0);
-}
-
 void git_set_argv_exec_path(const char *exec_path)
 {
argv_exec_path = exec_path;
-- 
2.14.1.757.g8fad538cea


[PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function

2017-09-06 Thread Jeff King
The system_path() function has an #ifdef in the middle of
it. Let's move the conditional logic into a sub-function.
This isolates it more, which will make it easier to change
and add to.

Signed-off-by: Jeff King 
---
I find the diff hard to read because it shows the opposite of what I did
(moving the big block to its own function, rather than moving the little
bits to a new copy of the function).

 exec_cmd.c | 33 +
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index fb94aeba9c..61092e9715 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -7,19 +7,12 @@
 static const char *argv_exec_path;
 static const char *argv0_path;
 
-char *system_path(const char *path)
-{
 #ifdef RUNTIME_PREFIX
-   static const char *prefix;
-#else
-   static const char *prefix = PREFIX;
-#endif
-   struct strbuf d = STRBUF_INIT;
 
-   if (is_absolute_path(path))
-   return xstrdup(path);
+static const char *system_prefix(void)
+{
+   static const char *prefix;
 
-#ifdef RUNTIME_PREFIX
assert(argv0_path);
assert(is_absolute_path(argv0_path));
 
@@ -32,9 +25,25 @@ char *system_path(const char *path)
"but prefix computation failed.  "
"Using static fallback '%s'.\n", prefix);
}
-#endif
+   return prefix;
+}
+#else
+
+static const char *system_prefix(void)
+{
+   return PREFIX;
+}
+
+#endif /* RUNTIME_PREFIX */
+
+char *system_path(const char *path)
+{
+   struct strbuf d = STRBUF_INIT;
+
+   if (is_absolute_path(path))
+   return xstrdup(path);
 
-   strbuf_addf(, "%s/%s", prefix, path);
+   strbuf_addf(, "%s/%s", system_prefix(), path);
return strbuf_detach(, NULL);
 }
 
-- 
2.14.1.757.g8fad538cea



[PATCH 0/2] simplifying !RUNTIME_PREFIX

2017-09-06 Thread Jeff King
On Wed, Sep 06, 2017 at 10:42:07AM +0900, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > In fact, the whole extract_argv0_path thing is pointless without
> > RUNTIME_PREFIX. So I think one reasonable fix is just:
> [...]
> Yeah, I kind of like this (I would have reduced the ifdef noise by
> leaving a totally-unused argv0_path in the BSS, though).

I wanted to drop it to make sure that I didn't miss any references to it
(and that we didn't add any in the future). But -Wunused also complains
if you leave it in. :)

I think we can reorganize the code a little to make the end result more
readable. Like this.

  [1/2]: system_path: move RUNTIME_PREFIX to a sub-function
  [2/2]: git_extract_argv0_path: do nothing without RUNTIME_PREFIX

 exec_cmd.c | 42 --
 1 file changed, 28 insertions(+), 14 deletions(-)

-Peff


Re: [PATCH] name-rev: change ULONG_MAX to TIME_MAX

2017-09-06 Thread Michael J Gruber
Junio C Hamano venit, vidit, dixit 06.09.2017 05:35:
> Michael J Gruber  writes:
> 
>> Earlier, dddbad728c ("timestamp_t: a new data type for timestamps",
>> 2017-04-26) changed several types to timestamp_t.
>>
>> 5589e87fd8 ("name-rev: change a "long" variable to timestamp_t",
>> 2017-05-20) cleaned up a missed variable, but both missed a _MAX
>> constant.
>>
>> Change the remaining constant to the one appropriate for the current
>> type
>>
>> Signed-off-by: Michael J Gruber 
>> ---
> 
> Thanks.
> 
> I think this (and the earlier 5589e8) was caused by an unnoticed
> semantic conflict at 78089b71 ("Merge branch 'jc/name-rev-lw-tag'",
> 2017-05-30).  Merging is sometimes hard ;-)

Simple merges and semi-simple merges...

BTW, there's more fallout from those name-rev changes: In connection
with that other thread about surprising describe results for emacs.git I
noticed that I can easily get "git name-rev --stdin" to segfault there.
As easy as

echo bc5d96a0b2a1dccf7c459e40d21b54c977f4 | git name-rev --stdin

for example.

That's unfortunate for the use-case of name-rev to amend git log output.

The reason seems to be that with "--stdin" or "--all", "name-rev" walks
and names all commits before beginning to use that those names for even
a single commit as above.

That segfault bisects to the logic changing commit in
jc/name-rev-lw-tag, but I think the changed logic simply leads to more
xmallocs() the segfault sooner now. Or something that I dind't spot even
after a few hours.

On the other hand, nearly every time that I try to understand describe
or name-rev I want get rid of insert_commit_by_date() and the like and
replace this by generations, and maybe a simple rev-walk (per ref)...

> Will queue.
> 
>>  builtin/name-rev.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
>> index c41ea7c2a6..598da6c8bc 100644
>> --- a/builtin/name-rev.c
>> +++ b/builtin/name-rev.c
>> @@ -253,7 +253,7 @@ static int name_ref(const char *path, const struct 
>> object_id *oid, int flags, vo
>>  struct commit *commit = (struct commit *)o;
>>  int from_tag = starts_with(path, "refs/tags/");
>>  
>> -if (taggerdate == ULONG_MAX)
>> +if (taggerdate == TIME_MAX)
>>  taggerdate = ((struct commit *)o)->date;
>>  path = name_ref_abbrev(path, can_abbreviate_output);
>>  name_rev(commit, xstrdup(path), taggerdate, 0, 0,


[PATCH] rev-parse: don't trim bisect refnames

2017-09-06 Thread Jeff King
On Tue, Sep 05, 2017 at 03:13:41PM -0700, Linus Torvalds wrote:

> On Tue, Sep 5, 2017 at 3:03 PM, Jeff King  wrote:
> >
> > This probably fixes it:
> 
> Yup. Thanks.

Thanks for confirming. Here it is with a commit message and test.

-- >8 --
Subject: [PATCH] rev-parse: don't trim bisect refnames

Using for_each_ref_in() with a full refname has always been
a questionable practice, but it became an error with
b9c8e7f2fb (prefix_ref_iterator: don't trim too much,
2017-05-22), making "git rev-parse --bisect" pretty reliably
show a BUG.

Commit 03df567fbf (for_each_bisect_ref(): don't trim
refnames, 2017-06-18) fixed this case for revision.c, but
rev-parse handles this option on its own. We can use the
same solution here (and piggy-back on its test).

Signed-off-by: Jeff King 
---
 builtin/rev-parse.c|  4 ++--
 t/t6002-rev-list-bisect.sh | 18 --
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 2bd28d3c08..9f24004c0a 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -757,8 +757,8 @@ int cmd_rev_parse(int argc, const char **argv, const char 
*prefix)
continue;
}
if (!strcmp(arg, "--bisect")) {
-   for_each_ref_in("refs/bisect/bad", 
show_reference, NULL);
-   for_each_ref_in("refs/bisect/good", 
anti_reference, NULL);
+   for_each_fullref_in("refs/bisect/bad", 
show_reference, NULL, 0);
+   for_each_fullref_in("refs/bisect/good", 
anti_reference, NULL, 0);
continue;
}
if (opt_with_value(arg, "--branches", )) {
diff --git a/t/t6002-rev-list-bisect.sh b/t/t6002-rev-list-bisect.sh
index 534903bbd2..a661408038 100755
--- a/t/t6002-rev-list-bisect.sh
+++ b/t/t6002-rev-list-bisect.sh
@@ -236,17 +236,31 @@ test_sequence "--bisect"
 #
 #
 
-test_expect_success '--bisect can default to good/bad refs' '
+test_expect_success 'set up fake --bisect refs' '
git update-ref refs/bisect/bad c3 &&
good=$(git rev-parse b1) &&
git update-ref refs/bisect/good-$good $good &&
good=$(git rev-parse c1) &&
-   git update-ref refs/bisect/good-$good $good &&
+   git update-ref refs/bisect/good-$good $good
+'
 
+test_expect_success 'rev-list --bisect can default to good/bad refs' '
# the only thing between c3 and c1 is c2
git rev-parse c2 >expect &&
git rev-list --bisect >actual &&
test_cmp expect actual
 '
 
+test_expect_success 'rev-parse --bisect can default to good/bad refs' '
+   git rev-parse c3 ^b1 ^c1 >expect &&
+   git rev-parse --bisect >actual &&
+
+   # output order depends on the refnames, which in turn depends on
+   # the exact sha1s. We just want to make sure we have the same set
+   # of lines in any order.
+   sort expect.sorted &&
+   sort actual.sorted &&
+   test_cmp expect.sorted actual.sorted
+'
+
 test_done
-- 
2.14.1.757.g8fad538cea



Re: [PATCH v4 07/16] refs: add refs_head_ref()

2017-09-06 Thread Duy Nguyen
On Fri, Aug 25, 2017 at 4:52 AM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
> ... which does what?
>
> Unlike refs_for_each_ref() and friends, this does not iterate.
> It just uses the same function signature to make a single call
> of fn on the "HEAD" ref.
>
> Did I capture what it does right?

It's basically head_ref() but can take any ref store (while head_ref()
always uses the main ref store). I'll update the commit message. What
you describe is correct, but I think I'll leave that out and assume
people know what head_ref() does.
-- 
Duy


Re: [PATCH v4 11/16] revision.c: --all adds HEAD from all worktrees

2017-09-06 Thread Duy Nguyen
On Thu, Aug 24, 2017 at 2:54 AM, Stefan Beller  wrote:
>> +int other_head_refs(each_ref_fn fn, void *cb_data)
>> +{
>> +   struct worktree **worktrees, **p;
>> +   int ret = 0;
>> +
>> +   worktrees = get_worktrees(0);
>> +   for (p = worktrees; *p; p++) {
>> +   struct worktree *wt = *p;
>> +   struct ref_store *refs;
>> +
>> +   if (wt->is_current)
>> +   continue;
>
> As said in an earlier patch (and now this pattern
> coming up twice in this patch series alone), the lines
> of this function up to here, could become
> part of a worktree iterator function?

There are a couple "loop through all worktrees" code so far, but the
filter condition is not always this.

While I like the idea of iterator function (especially if it does
"struct worktree" memory management internally), I think it's a bit
overkill to go for_each_worktree() with a callback function when the
equivalent for(;;) is so short. We would need to declare struct to
pass callback data, and the reader may have to got read
for_each_worktree() code then come back here.

So, probably no worktree iterator (yet).

>> +   refs = get_worktree_ref_store(wt);
>> +   ret = refs_head_ref(refs, fn, cb_data);
>> +   if (ret)
>> +   break;
>
> with these 4 lines in the callback function.
>
>> +/*
>> + * Similar to head_ref() for all HEADs _except_ one from the current
>> + * worktree, which is covered by head_ref().
>> + */
>> +int other_head_refs(each_ref_fn fn, void *cb_data);
>
> This is already such an iterator function, just at another
> abstraction level.

yeah.. but we can't mix and match (or combine) ref/worktree iterator callbacks

-- 
Duy


Re: [PATCH v4 02/16] refs.c: use is_dir_sep() in resolve_gitlink_ref()

2017-09-06 Thread Duy Nguyen
On Thu, Aug 24, 2017 at 2:14 AM, Stefan Beller  wrote:
> On Wed, Aug 23, 2017 at 5:36 AM, Nguyễn Thái Ngọc Duy  
> wrote:
>> The "submodule" argument in this function is a path, which can have
>> either '/' or '\\' as a separator. Use is_dir_sep() to support both.
>>
>> Noticed-by: Johannes Sixt 
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>
> Immediate questions that come to mind:
> * Could this go in as an independent bug fix?

It probably could. But it may depend on other submodule changes in this series.

>   -> If so do we have any test that fails or stops failing on Windows?

It was spotted during the review [1] so I guess no test fails on
Windows (not that I would know because I can't run tests on Windows)

>   -> If not, do we need one?

Since I don't use Windows, I don't particularly care. And I can't add
one anyway because I can't run it.

[1] 
https://public-inbox.org/git/%3ca74cf309-fb16-2f45-8189-d1d0c655d...@kdbg.org%3E/

> * Assuming this is not an independent bug fix:
>   Why do we need this patch in this series here?
>   (I thought this is about worktrees, not submodules.
>   So does this fix a potential bug that will be exposed
>   later in this series, but was harmless up to now?)

The series could probably be split in two. The first part is about
using the new refs_* API in revision.c. This helps clean up refs API a
bit, all *_submodule() functions will be one. Not strictly needed to
be part of this, it's just a continuation of my previous series that
introduces *_refs. Since submodule code is touched, this is found.

The second part is actually fixing the prune bug.

Should I split the series in two? I think I separated two parts in the
past (maybe I misremember) at least in the description...
-- 
Duy


Re: [PATCH v4 00/16] Fix git-gc losing objects in multi worktree

2017-09-06 Thread Duy Nguyen
On Fri, Aug 25, 2017 at 6:21 PM, Michael J Gruber  wrote:
> I suggest we think about the UI exposure a bit when it
> comes to including all heads or naming options, though:
>
> * HEAD is "the current head"
> * refs/heads is where all local branch heads are
>
> * --branches is the rev-list/log option for refs/heads/*
> * --all is the rev-list/log option for refs/* plus HEAD
> * HEAD is the rev-list/log argument for HEAD

It also covers object references from the index file aka --indexed-objects

> * --heads is the show-ref option limiting to refs/heads/*
> * --head is the show-ref option which adds HEAD
>
> * refs/heads is the for-each-ref-pattern for refs/heads/*
> * HEAD is not the for-each-ref-pattern for HEAD
> [I'll suggest a patch to change the latter, shortly.]
>
> I would hope that the result of this series and other efforts will be:
>
> * consistent way to specify "all local branch heads"
> * consistent way to specify "the head" aka HEAD
> * consistent way to specify "all linked worktree heads"
> [* maybe something for submodules...]

Hmm.. I admit that I completely overlooked 'git show-ref'.

> This may require changing the misnamed show-ref option, but also
> thinking twice before changing the meaning of "--all" for the
> rev-list/log family: it's easy to say "--all --linked" or "--all
> --heads" to get everything plus all linked worktree heads when "--all"
> == "--branches --tags HEAD", but it's more cumbersome with a changed
> --all that is "really everything". I gues my suggestion would be:
>
> --all as it is now (refs/* plus HEAD)
>
> --head alternative way to say HEAD (as it is now for show-ref)
>
> --heads HEAD for all linked worktrees (incompatible change for show-ref)
>
> And all of them should work the same for the rev-list/log family as well
> as for-each-ref/show-ref.

How about: show-ref learns a new option to let it list HEAD (and other
per-worktree refs) of one/current worktree, or all worktrees. This is
what the --single-worktree option is for, which is added by this
series (but I need to make sure if's exposed in show-ref as well). For
showing refs as viewed by another worktree, we could have the global
option similar to --git-dir to select that worktree, e.g. "git
--work-tree-id=XXX show-ref ..."?

Since this seems a good thing to do, but not necessary to fix the
"prune" bug, I'll do it separately instead of adding in this series. I
may need to look at "git for-each-ref" too for that matter.

> I thinking that changing show-ref (porcelain, not quite as commonly
> used) should do the least harm compared to all other options.
-- 
Duy


[RFC PATCH 1/2] Always write push cert to disk

2017-09-06 Thread Shikher Verma
receive-pack: store push cert irrespective of hook

Push certificates are being saved to disk only when hook was
attached on the server side, which may complicate tooling and
interaction with the certificates. Add write_push_cert_sha1() to
save push cert to disk regardless of this.

Signed-off-by: Shikher Verma 
Helped-by: Santiago Torres-Arias 
---
 builtin/receive-pack.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 52c63ebfd..79195005f 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -619,7 +619,15 @@ static int check_cert_push_options(const struct 
string_list *push_options)
return retval;
 }
 
-static void prepare_push_cert_sha1(struct child_process *proc)
+static void write_push_cert_sha1()
+{
+   if (!push_cert.len)
+   return;
+   if (write_sha1_file(push_cert.buf, push_cert.len, "blob", 
push_cert_sha1))
+   hashclr(push_cert_sha1);
+}
+
+static void prepare_push_cert(struct child_process *proc)
 {
static int already_done;
 
@@ -632,9 +640,6 @@ static void prepare_push_cert_sha1(struct child_process 
*proc)
int bogs /* beginning_of_gpg_sig */;
 
already_done = 1;
-   if (write_sha1_file(push_cert.buf, push_cert.len, "blob", 
push_cert_sha1))
-   hashclr(push_cert_sha1);
-
memset(, '\0', sizeof(sigcheck));
sigcheck.result = 'N';
 
@@ -727,7 +732,7 @@ static int run_and_feed_hook(const char *hook_name, feed_fn 
feed,
proc.err = muxer.in;
}
 
-   prepare_push_cert_sha1();
+   prepare_push_cert();
 
code = start_command();
if (code) {
@@ -1980,6 +1985,7 @@ int cmd_receive_pack(int argc, const char **argv, const 
char *prefix)
for (cmd = commands; cmd; cmd = cmd->next)
cmd->error_string = "inconsistent push options";
}
+   write_push_cert_sha1();
 
prepare_shallow_info(, );
if (!si.nr_ours && !si.nr_theirs)
-- 
2.14.1




[RFC PATCH 0/2] Add named reference to latest push cert

2017-09-06 Thread Shikher Verma
Currently, git only stores push certificates if there is a receive hook 
present. This may violate the principle of least surprise (e.g., I 
pushed with --signed, and I don't see anything in upstream). 
Additionally, push certificates could be more versatile if they are not 
tightly bound to git hooks. Finally, it would be useful to verify the 
signed pushes at later points of time with ease.

A named ref is added for ease of access/tooling around push 
certificates. If the last push was signed, ref/PUSH_CERT stores the 
ref of the latest push cert otherwise it is empty.
 
Sending patches as RFC since the documentation would have to be 
updated and git gc might have to be patched to not garbage collect 
the latest push certificate.

This patch applies on master (3ec7d702a) 

Shikher Verma (2):
  Always write push cert to disk
  Store latest push cert ref in PUSH_CERT

 builtin/receive-pack.c | 25 -
 path.c |  1 +
 path.h |  1 +
 3 files changed, 22 insertions(+), 5 deletions(-)

-- 
2.14.1




[RFC PATCH 2/2] Store latest push cert ref in PUSH_CERT

2017-09-06 Thread Shikher Verma
path: Add named reference to push cert
builtin/receive-pack: update named push cert ref

Push certificates are "detached" objects on the git objects tree.
Add a named reference so the latest push certificate can be easily
fetched, verified and stored for later use. This eases tooling
around the push certificate feature.

Signed-off-by: Shikher Verma 
Helped-by: Santiago Torres-Arias 
---
 builtin/receive-pack.c | 9 +
 path.c | 1 +
 path.h | 1 +
 3 files changed, 11 insertions(+)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 79195005f..2fdeafe63 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1530,6 +1530,15 @@ static void execute_commands(struct command *commands,
 
if (shallow_update)
warn_if_skipped_connectivity_check(commands, si);
+
+   for (struct command *cmd = commands; cmd; cmd = cmd->next)
+   if (cmd->error_string)
+   return;
+   struct strbuf buf = STRBUF_INIT;
+   if (push_cert.len)
+   strbuf_addstr(, sha1_to_hex(push_cert_sha1));
+   strbuf_addch(, '\n');
+   write_file_buf(git_path_push_cert(), buf.buf, buf.len);
 }
 
 static struct command **queue_command(struct command **tail,
diff --git a/path.c b/path.c
index b533ec938..5dca2a7bf 100644
--- a/path.c
+++ b/path.c
@@ -1366,3 +1366,4 @@ GIT_PATH_FUNC(git_path_merge_mode, "MERGE_MODE")
 GIT_PATH_FUNC(git_path_merge_head, "MERGE_HEAD")
 GIT_PATH_FUNC(git_path_fetch_head, "FETCH_HEAD")
 GIT_PATH_FUNC(git_path_shallow, "shallow")
+GIT_PATH_FUNC(git_path_push_cert, "refs/PUSH_CERT")
diff --git a/path.h b/path.h
index 9541620c7..4bdeb1f07 100644
--- a/path.h
+++ b/path.h
@@ -78,5 +78,6 @@ const char *git_path_merge_mode(void);
 const char *git_path_merge_head(void);
 const char *git_path_fetch_head(void);
 const char *git_path_shallow(void);
+const char *git_path_push_cert(void);
 
 #endif /* PATH_H */
-- 
2.14.1




[PATCHv3 2/2] pull: honor submodule.recurse config option

2017-09-06 Thread Nicolas Morey-Chaisemartin
"git pull" supports a --recurse-submodules option but does not parse the
submodule.recurse configuration item to set the default for that option.
Meanwhile "git fetch" does support submodule.recurse, producing
confusing behavior: when submodule.recurse is enabled, "git pull"
recursively fetches submodules but does not update them after fetch.

Handle submodule.recurse in "git pull" to fix this.

Reported-by: Magnus Homann 
Signed-off-by: Nicolas Morey-Chaisemartin 
---
 builtin/pull.c|  4 
 t/t5572-pull-submodule.sh | 32 
 2 files changed, 36 insertions(+)

diff --git a/builtin/pull.c b/builtin/pull.c
index 9ef1ab501..6f772e8a2 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -325,6 +325,10 @@ static int git_pull_config(const char *var, const char 
*value, void *cb)
if (!strcmp(var, "rebase.autostash")) {
config_autostash = git_config_bool(var, value);
return 0;
+   } else if (!strcmp(var, "submodule.recurse")) {
+   recurse_submodules = git_config_bool(var, value) ?
+   RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
+   return 0;
}
return git_default_config(var, value, cb);
 }
diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
index 077eb07e1..321bd37de 100755
--- a/t/t5572-pull-submodule.sh
+++ b/t/t5572-pull-submodule.sh
@@ -65,6 +65,38 @@ test_expect_success 'recursive pull updates working tree' '
test_path_is_file super/sub/merge_strategy.t
 '
 
+test_expect_success "submodule.recurse option triggers recursive pull" '
+   test_commit -C child merge_strategy_2 &&
+   git -C parent submodule update --remote &&
+   git -C parent add sub &&
+   git -C parent commit -m "update submodule" &&
+
+   git -C super -c submodule.recurse pull --no-rebase &&
+   test_path_is_file super/sub/merge_strategy_2.t
+'
+
+test_expect_success " --[no-]recurse-submodule and submodule.recurse" '
+   test_commit -C child merge_strategy_3 &&
+   git -C parent submodule update --remote &&
+   git -C parent add sub &&
+   git -C parent commit -m "update submodule" &&
+
+   git -C super -c submodule.recurse pull --no-recurse-submodules 
--no-rebase &&
+   test_path_is_missing super/sub/merge_strategy_3.t &&
+   git -C super -c submodule.recurse=false pull --recurse-submodules 
--no-rebase &&
+   test_path_is_file super/sub/merge_strategy_3.t &&
+
+   test_commit -C child merge_strategy_4 &&
+   git -C parent submodule update --remote &&
+   git -C parent add sub &&
+   git -C parent commit -m "update submodule" &&
+
+   git -C super -c submodule.recurse=false pull --no-recurse-submodules 
--no-rebase &&
+   test_path_is_missing super/sub/merge_strategy_4.t &&
+   git -C super -c submodule.recurse=true pull --recurse-submodules 
--no-rebase &&
+   test_path_is_file super/sub/merge_strategy_4.t
+'
+
 test_expect_success 'recursive rebasing pull' '
# change upstream
test_commit -C child rebase_strategy &&
-- 
2.14.1.461.g503560879



[PATCHv3 0/2] fix recurse.submodule config for git pull

2017-09-06 Thread Nicolas Morey-Chaisemartin
Changes since v2:
- Add a patch that fixes the option parsing order (parse config before cli, not 
the other way around)
- Enhance the tests to check --recurse-submodule and submodule.recurse 
combinations

Nicolas Morey-Chaisemartin (2):
  pull: fix cli and config option parsing order
  pull: honor submodule.recurse config option

 builtin/pull.c|  8 ++--
 t/t5572-pull-submodule.sh | 32 
 2 files changed, 38 insertions(+), 2 deletions(-)

-- 
2.14.1.461.g503560879



Re: RFC v3: Another proposed hash function transition plan

2017-09-06 Thread Junio C Hamano
Jonathan Nieder  writes:

> Linus Torvalds wrote:
>> On Fri, Mar 3, 2017 at 5:12 PM, Jonathan Nieder  wrote:
>
>>> This document is still in flux but I thought it best to send it out
>>> early to start getting feedback.
>>
>> This actually looks very reasonable if you can implement it cleanly
>> enough.
>
> Thanks for the kind words on what had quite a few flaws still.  Here's
> a new draft.  I think the next version will be a patch against
> Documentation/technical/.

Can we reboot the discussion and advance this to v4 state?

> As before, comments welcome, both here and inline at
>
>   https://goo.gl/gh2Mzc

I think what you have over there looks pretty-much ready as the
final outline.

One thing I still do not know how I feel about after re-reading the
thread, and I didn't find the above doc, is Linus's suggestion to
use the objects themselves as NewHash-to-SHA-1 mapper [*1*].  

It does not help the reverse mapping that is needed while pushing
things out (the SHA-1 receiver tells us what they have in terms of
SHA-1 names; we need to figure out where we stop sending based on
that).  While it does help maintaining itself (while constructing
SHA3-content, we'd be required to find out its SHA1 name but the
SHA3 objects that we refer to all know their SHA-1 names), if it is
not useful otherwise, then that does not count as a plus.  Also
having to bake corresponding SHA-1 name in the object would mean
mistakes can easily propagate and cannot be corrected without
rewriting the history, which would be a huge downside.  So perhaps
we are better off without it, I guess.


[Reference]

*1* 




Re: [PATCHv2] pull: honor submodule.recurse config option

2017-09-06 Thread Nicolas Morey-Chaisemartin


Le 06/09/2017 à 03:17, Junio C Hamano a écrit :
> Nicolas Morey-Chaisemartin  writes:
>
>> "git pull" supports a --recurse-submodules option but does not parse the
>> submodule.recurse configuration item to set the default for that option.
>> Meanwhile "git fetch" does support submodule.recurse, producing
>> confusing behavior: when submodule.recurse is enabled, "git pull"
>> recursively fetches submodules but does not update them after fetch.
>>
>> Handle submodule.recurse in "git pull" to fix this.
>>
>> Reported-by: Magnus Homann 
>> Signed-off-by: Nicolas Morey-Chaisemartin 
>> ---
>>
>> Changes since v1:
>>  * Cleanup commit message
>>  * Add test
>>  * Remove extra var in code and fallthrough to git_default_config
>>
>>  builtin/pull.c|  4 
>>  t/t5572-pull-submodule.sh | 10 ++
>>  2 files changed, 14 insertions(+)
>>
>> diff --git a/builtin/pull.c b/builtin/pull.c
>> index 7fe281414..ce8ccb15b 100644
>> --- a/builtin/pull.c
>> +++ b/builtin/pull.c
>> @@ -325,6 +325,10 @@ static int git_pull_config(const char *var, const char 
>> *value, void *cb)
>>  if (!strcmp(var, "rebase.autostash")) {
>>  config_autostash = git_config_bool(var, value);
>>  return 0;
>> +} else if (!strcmp(var, "submodule.recurse")) {
>> +recurse_submodules = git_config_bool(var, value) ?
>> +RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
>> +return 0;
>>  }
>>  return git_default_config(var, value, cb);
>>  }
> If I am reading the existing code correctly, things happen in
> cmd_pull() in this order:
>
>  - recurse_submodules is a file-scope static that is initialized to
>RECURSE_SUBMODULES_DEFAULT
>
>  - pull_options[] is given to parse_options() so that
>submodule-config.c::option_fetch_parse_recurse_submodules() can
>read "--recurse-submodules=" from the command line to
>update recurse_submodules.
>
>  - git_pull_config() is given to git_config() so that settings in
>the configuration files are read.
>
> Care must be taken to make sure that values given from the command
> line is never overriden by the default value specified in the
> configuration system because the order of the second and third items
> in the above are backwards from the usual flow.  This patch does not
> seem to have any such provision.
>
> Existing handling of "--autostash" vs "rebase.autostash" solves this
> issue by having opt_autostash and config_autostash as two separate
> variables, so I suspect that something similar to it must be there,
> at least, for this new configuration.
>
> If we want to keep the current code structure, that is.  I do not
> recall if we did not notice the fact that the order of options and
> config parsing is backwards and unknowingly worked it around with
> two variables when we added the rebase.autostash thing, or we knew
> the order was unusual but there was a good reason to keep that
> unusual order (iow, if we simply swapped the order of
> parse_options() and git_config() calls, there are things that will
> break).  
>
> If it is not the latter, perhaps we may want to flip the order of
> config parsing and option parsing around?  That will allow us to fix
> the handling of autostash thing to use only one variable, and also
> fix your patch to do the right thing.

I see what you mean.
It looks like switching the code around works but I think there still needs to 
be 2variables for autstash for this piece of code:

    if (!opt_rebase && opt_autostash != -1)
        die(_("--[no-]autostash option is only valid with --rebase."));

The config option should not cause git pull to die when not using --rebase, the 
CLI option should.

>> diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
>> index 077eb07e1..1b3a3f445 100755
>> --- a/t/t5572-pull-submodule.sh
>> +++ b/t/t5572-pull-submodule.sh
>> @@ -65,6 +65,16 @@ test_expect_success 'recursive pull updates working tree' 
>> '
>>  test_path_is_file super/sub/merge_strategy.t
>>  '
>>  
>> +test_expect_success "submodule.recurse option triggers recursive pull" '
>> +test_commit -C child merge_strategy_2 &&
>> +git -C parent submodule update --remote &&
>> +git -C parent add sub &&
>> +git -C parent commit -m "update submodule" &&
>> +
>> +git -C super -c submodule.recurse pull --no-rebase &&
>> +test_path_is_file super/sub/merge_strategy_2.t
>> +'
> This new test does not test interactions with submodule.recurse
> configuration and --recurse-submodules= from the command
> line.  It would be necessary to add tests to cover the permutations
> in addition to the basic test we see above.

Will fix

Thanks

Nicolas