Re: [PATCH] config: preserve case for one-shot config on the command line

2017-02-20 Thread Jeff King
On Mon, Feb 20, 2017 at 01:58:07AM -0800, Junio C Hamano wrote:

> Since nothing seems to have happened in the meantime, here is what
> I'll queue so that we won't forget for now.  Lars's tests based on
> how the scripted "git submodule" uses "git config" may still be
> valid, but it is somewhat a roundabout way to demonstrate the
> breakage and not very effective way to protect the fix, so I added a
> new test that directly tests "git -c = ".

Yeah, I agree that is the best way to cover this code.

> I am not sure if this updated one is worth doing, or the previous
> "strchr and strrchr" I originally wrote was easier to understand.

I think either is OK. I would actually have written it halfway in
between, like:

  static void canonicalize_config_variable_name(char *varname)
  {
  const char *p;

  /* downcase the first segment */
  for (p = varname; *p; p++) {
  if (*p == '.')
  break;
  *p = tolower(*p);
  }

  /* locate end of middle segment, if there is one */
  p = strrchr(p, '.');
  if (!p)
  return; /* invalid single-level var, but let it pass */
  for (; *p; p++)
  *p = tolower(*p);
  }

You can toss that on the bikeshed heap. :)

The other change from yours is that it flips the order of the strrchr
and return. Yours is more efficient in the sense that we know there is
no dot, so we do not have to keep searching at all (though of course
this case is invalid and we would not expect to see it in practice).

But it did take me a minute in yours to figure out why last_dot was
always set to a dot (the answer is that it starts at "cp", which must
itself be a dot, because we would already have returned otherwise).

> One thing I noticed is that "git config --get X" will correctly
> diagnose that a dot-less X is not a valid variable name, but we do
> not seem to diagnose "git -c X=V " as invalid.

I don't think that was intentional. We just never caught the case. It
might be reasonable to do so (and it's easy to catch here while
canonicalizing). I think there are probably some other cases we _could_
catch, too (e.g., invalid characters for keynames). But I'm not excited
about duplicating the logic from the file-parser.

> Perhaps we should, but it is not the focus of this topic.

Definitely.

> diff --git a/t/t1351-config-cmdline.sh b/t/t1351-config-cmdline.sh
> new file mode 100755
> index 00..acb8dc3b15
> --- /dev/null
> +++ b/t/t1351-config-cmdline.sh
> @@ -0,0 +1,48 @@
> +#!/bin/sh
> +
> +test_description='git -c var=val'
> +
> +. ./test-lib.sh

There are a bunch of other "git -c" tests inside t1300. I don't know if
it would be better to put them all together.

Arguably, those other ones should get pulled out here to the new script.
More scripts means that the tests have fewer hidden dependencies, and we
can parallelize the run more. I admit I've shied away from new scripts
in the past because the number allocation is such a pain.

> +test_expect_success 'last one wins: two level vars' '
> + echo VAL >expect &&
> +
> + # sec.var and sec.VAR are the same variable, as the first
> + # and the last level of a configuration variable name is
> + # case insensitive.  Test both setting and querying.

I assume by "setting and querying" here you mean "setting via -c,
querying via git-config".

> + git -c sec.var=val -c sec.VAR=VAL config --get sec.var >actual &&
> + test_cmp expect actual &&
> + git -c SEC.var=val -c sec.var=VAL config --get sec.var >actual &&
> + test_cmp expect actual &&
> +
> + git -c sec.var=val -c sec.VAR=VAL config --get SEC.var >actual &&
> + test_cmp expect actual &&
> + git -c SEC.var=val -c sec.var=VAL config --get sec.VAR >actual &&
> + test_cmp expect actual

Looks good.

> +test_expect_success 'last one wins: three level vars' '
> + echo val >expect &&
> +
> + # v.a.r and v.A.r are not the same variable, as the middle
> + # level of a three-level configuration variable name is
> + # case sensitive.  Test both setting and querying.
> +
> + git -c v.a.r=val -c v.A.r=VAL config --get v.a.r >actual &&
> + test_cmp expect actual &&
> + git -c v.a.r=val -c v.A.r=VAL config --get V.a.R >actual &&
> + test_cmp expect actual &&
> +
> + echo VAL >expect &&
> + git -c v.a.r=val -c v.a.R=VAL config --get v.a.r >actual &&
> + test_cmp expect actual &&
> + git -c v.a.r=val -c V.a.r=VAL config --get v.a.r >actual &&
> + test_cmp expect actual &&
> + git -c v.a.r=val -c v.a.R=VAL config --get V.a.R >actual &&
> + test_cmp expect actual &&
> + git -c v.a.r=val -c V.a.r=VAL config --get V.a.R >actual &&
> + test_cmp expect actual
> +'

There are two blocks of tests, each with their own "expect" value.
Should the top "expect" here be moved down with its block to make that
more clear?

Other than that nit, the tests look good to 

Re: [PATCH v2 0/4] delete_ref: support reflog messages

2017-02-20 Thread Junio C Hamano
Kyle Meyer  writes:

> Kyle Meyer (4):
>   delete_ref: accept a reflog message argument
>   update-ref: pass reflog message to delete_ref()
>   rename_ref: replace empty message in HEAD's log
>   branch: record creation of renamed branch in HEAD's log

These looked reasonable.  I had to resolve conflicts with another
topic in flight that removed set_worktree_head_symref() function
while queuing, and I think I resolved it correctly, but please
double check what is pushed out on 'pu'.

Thanks.


Re: [PATCH v2 0/4] delete_ref: support reflog messages

2017-02-20 Thread Jeff King
On Mon, Feb 20, 2017 at 08:10:31PM -0500, Kyle Meyer wrote:

> Kyle Meyer (4):
>   delete_ref: accept a reflog message argument
>   update-ref: pass reflog message to delete_ref()
>   rename_ref: replace empty message in HEAD's log
>   branch: record creation of renamed branch in HEAD's log

These look good to me. I think you did a nice job of summarizing the
discussion in the commit messages.

-Peff


Re: url..insteadOf vs. submodules

2017-02-20 Thread Jeff King
On Tue, Feb 21, 2017 at 06:11:51AM +0100, Toolforger wrote:

> > I'm not sure I understand. You have a project policy to use certain
> > URLs. But you, the user, want to override that. Why isn't the
> > user-specific config file the right place to put that?
> 
> Ah right, I mistook ~/ for "project root" instead of "home dir".
> Sorry for the confusion.

Ah, OK, that makes more sense.

> > (I think there _is_ a mismatch, in that the change is specific not just
> > to your user, but to the repo. So you would not want to rewrite other
> > references to the same URL in other repos.
> 
> Indeed, and that's actually a problem.
> 
> The setup I'm aiming for is
>   github -> local bare repo -> local clones with worktrees
> 
> If I place insteadOf rules in ~/.gitconfig, I will be unable to pull from
> github to my local bare repos.
> Mmm... I could try to undo the insteadOf configuration from ~/.gitconfig in
> the local bare repos. Not sure whether I have to redirect from the github
> URL to itself.

Yeah, I think you would probably have to do a redirect-to-self to
override the global one.

At one point we discussed having conditional-config that would kick in
based on path-matching. I think it would be another way to do what you
want, but there's nothing merged.

I think anything involving ~/.gitconfig is basically a hack, though.
What you really want is for submodules to better support your
URL-rewriting case, and that's not an unreasonable thing to want.

We'll see if the submodule folks have any ideas on how to implement
that.

-Peff


Re: [PATCH] fetch: print an error when declining to request an unadvertised object

2017-02-20 Thread Junio C Hamano
Matt McCutchen  writes:

> Currently "git fetch REMOTE SHA1" silently exits 1 if the server doesn't
> allow requests for unadvertised objects by sha1.  The more common case
> of requesting a nonexistent ref normally triggers a die() in
> get_fetch_map, so "git fetch" wasn't bothering to check after the fetch
> that it got all the refs it sought, like "git fetch-pack" does near the
> end of cmd_fetch_pack.
>
> Move the code from cmd_fetch_pack to a new function,
> report_unmatched_refs, that is called by fetch_refs_via_pack as part of
> "git fetch".  Also, change filter_refs (which checks whether a request
> for an unadvertised object should be sent to the server) to set a new
> match status on the "struct ref" when the request is not allowed, and
> have report_unmatched_refs check for this status and print a special
> error message, "Server does not allow request for unadvertised object".
>
> Finally, add a simple test case for "git fetch REMOTE SHA1".
>
> Signed-off-by: Matt McCutchen 
> ---

Hmph, I would have expected this to be done as a three-patch series,

 * move the loop at the end of cmd_fetch_pack() to a separate helper
   function report_unmatched_refs() and call it;

 * add a call to report_unmatched_refs() to the transport layer;

 * enhance report_unmatched_refs() by introducing match_status
   field and adding new code to filter_refs() to diagnose other
   kinds of errors.

The result looks reasonable from a cursory read, though.

Thanks for following it up to the completion.


Re: Inconsistent results of git blame --porcelain when detecting copies from other files

2017-02-20 Thread Junio C Hamano
Jeff King  writes:

> The simplest way (IMHO) to parse --porcelain output is:
>
>   - maintain a mapping of commit sha1s to the commit's details
>
>   - whenever you see a "   []"
> line, any key-value fields which follow impact _only_ that sha1, and
> you should update the details for that map entry
>
>   - when you see the actual tab-indented line content, you have gotten
> all of the key-value updates for that sha1. You can now safely do
> what you like with the line entry.

Yup, that was how the output was meant to be read.  At least in the
mind of the person who designed the output format ;-)



Re: [PATCH] git-svn: escape backslashes in refnames

2017-02-20 Thread Junio C Hamano
Eric Wong  writes:

> Junio: ping?
>
> https://public-inbox.org/git/20161223014202.GA8327@starla/raw
>
> Thanks.

Thanks for reminding.  This indeed fell thru cracks.


Re: Sending informational messages from upload-pack

2017-02-20 Thread Lukas Fleischer
On Mon, 20 Feb 2017 at 20:21:03, Jeff King wrote:
> On Mon, Feb 20, 2017 at 07:38:02PM +0100, Lukas Fleischer wrote:
> 
> > It would be handy to be able to show a message to the user when
> > cloning/fetching from a repository (e.g. to show a warning if a
> > repository is deprecated). This should technically already be possible
> > using the current pack protocol and sidebands. However, to my knowledge,
> > there is no easy way to configure this on the server side; writing a
> > wrapper around git-upload-pack(1) or replacing git-upload-pack(1) seem
> > to be the only options.
> 
> I wouldn't recommend wrapping upload-pack. You don't know you have a
> sideband until partway through the upload-pack conversation. And clients
> do not expect sideband at all until we get to the pack-sending part of
> the protocol (I think; I just quickly verified the location of the
> demuxer async code in fetch-pack.c, but I didn't dig into it in depth).

By wrapper I meant something that understands the pack protocol itself,
intercepts the traffic, forwards most of it to git-upload-pack(1) and
injects the message at the right time. I agree that it is a fairly ugly
workaround, though.

> [...]
> If my fetch-pack assertion above is right, technically the hook added by
> 20b20a22f is sufficient for your purposes, if your hook looks like:
> 
>   echo >&2 "pre-pack message"
>   git pack-objects "$@"
>   echo >72 "post-pack message"
> 
> but I would not be opposed to having pre-/post- hooks that run
> separately, if only for the convenience of the admin.
> [...]

I will give it a try. And I agree that it would still be convenient to
have pre-upload-pack and post-upload-pack hooks.

Regards,
Lukas


Re: url..insteadOf vs. submodules

2017-02-20 Thread Toolforger

On 20.02.2017 21:52, Jeff King wrote:
> I think if there is a doc bug, it is that the repo boundary between the
> submodule and the super-project is not made more clear.

It's not mentioned anywhere I'm aware of, particularly not on the 
insteadOf docs.


> That said, I do think it would be a useful feature for the super-project
> to rewrite URLs before handing them off to the submodule. But I do not
> really work on submodules nor use them myself, so there may be
> complications.

Agreed.

> I suppose you could argue that failing to rewrite violates the "any" in
> the quoted text. It doesn't say when the rewriting occurs, but it is
> essentially "when the URL is accessed". So the super-project feeds the
> raw URL to the submodule `git clone`, which then applies any URL
> rewriting.



>>> but one workaround is to set the config in ~/.gitconfig.
>>
>> No can do - that's under version control.
>> My personal setup does not belong there I think ;-)
>
> I'm not sure I understand. You have a project policy to use certain
> URLs. But you, the user, want to override that. Why isn't the
> user-specific config file the right place to put that?

Ah right, I mistook ~/ for "project root" instead of "home dir".
Sorry for the confusion.

> (I think there _is_ a mismatch, in that the change is specific not just
> to your user, but to the repo. So you would not want to rewrite other
> references to the same URL in other repos.

Indeed, and that's actually a problem.

The setup I'm aiming for is
  github -> local bare repo -> local clones with worktrees

If I place insteadOf rules in ~/.gitconfig, I will be unable to pull 
from github to my local bare repos.
Mmm... I could try to undo the insteadOf configuration from ~/.gitconfig 
in the local bare repos. Not sure whether I have to redirect from the 
github URL to itself.


Downside is that I'll have to remember to modify ~/.gitconfig whenever 
the upstream project changes its dependencies. Or whenever I want to 
reorganize my local project directory structure.
It's not totally out of the window, but right now it does not seem very 
attractive to me, and it's certainly not a good solution for everyone.


Regards,
Jo


Re: [PATCH] git-svn: escape backslashes in refnames

2017-02-20 Thread Eric Wong
Junio: ping?

https://public-inbox.org/git/20161223014202.GA8327@starla/raw

Thanks.


git svn find-rev -- search for nearest SVN rev

2017-02-20 Thread Craig McQueen
I'm using git svn with a project that uses SubWCRev.exe to incorporate the SVN 
revision into the build number.

I've updated it to use 'git svn find-rev HEAD' in git svn usage, to achieve the 
same effect.

This works until I have a local commit that hasn't yet been pushed to SVN with 
'git svn dcommit'. Then, 'git svn find-rev HEAD' returns nothing.

It would be really great if the '--before' and '--after' switches would work in 
this case. Then I could use 'git svn find-rev --before HEAD'.

-- 
Craig McQueen



Re: slightly confusing message

2017-02-20 Thread Junio C Hamano
Leon George  writes:

> On 20/02/17 21:19, Junio C Hamano wrote:
>> This sounds vaguely familiar and indeed I think it is this one:
>> https://public-inbox.org/git/CAEnOLdvG=sokfxej_plmamgj_8osc+28tsg+pbflltr+zlc...@mail.gmail.com/
>> which was from late last year.
> Which means I should have found it before bothering you.

It means no such thing, though.  It just means that I happen to be
more familiar with what has gone on the development recently than
you were ;-)

>> I suspect that the issue may already be fixed at the tip of 'master'
>> (aka Git 2.12-rc2).
> You're absolutely right. Took a while to build, but here goes:
> ...
> Splendid!

Thanks for confirming.


Re: slightly confusing message

2017-02-20 Thread Leon George
Hello
and thank you for your time.

On 20/02/17 21:19, Junio C Hamano wrote:
> This sounds vaguely familiar and indeed I think it is this one:
> https://public-inbox.org/git/CAEnOLdvG=sokfxej_plmamgj_8osc+28tsg+pbflltr+zlc...@mail.gmail.com/
> which was from late last year.
Which means I should have found it before bothering you.

> I suspect that the issue may already be fixed at the tip of 'master'
> (aka Git 2.12-rc2).
You're absolutely right. Took a while to build, but here goes:


£ git gitss
?? test
£ git --version
git version 2.11.0
£ git add -p .
warning: empty strings as pathspecs will be made invalid in upcoming
releases. please use . instead if you meant to match all paths
No changes.
£ git --version
git version 2.12.0-rc2
£ git add -p .
No changes.


Splendid!

have a wonderfull day :-)


[PATCH v2 1/4] delete_ref: accept a reflog message argument

2017-02-20 Thread Kyle Meyer
When the current branch is renamed with 'git branch -m/-M' or deleted
with 'git update-ref -m -d', the event is recorded in HEAD's log
with an empty message.  In preparation for adding a more meaningful
message to HEAD's log in these cases, update delete_ref() to take a
message argument and pass it along to ref_transaction_delete().
Modify all callers to pass NULL for the new message argument; no
change in behavior is intended.

Note that this is relevant for HEAD's log but not for the deleted
ref's log, which is currently deleted along with the ref.  Even if it
were not, an entry for the deletion wouldn't be present in the deleted
ref's log.  files_transaction_commit() writes to the log if
REF_NEEDS_COMMIT or REF_LOG_ONLY are set, but lock_ref_for_update()
doesn't set REF_NEEDS_COMMIT for the deleted ref because REF_DELETING
is set.  In contrast, the update for HEAD has REF_LOG_ONLY set by
split_head_update(), resulting in the deletion being logged.

Signed-off-by: Kyle Meyer 
---
 builtin/am.c   | 4 ++--
 builtin/branch.c   | 2 +-
 builtin/notes.c| 4 ++--
 builtin/remote.c   | 4 ++--
 builtin/replace.c  | 2 +-
 builtin/reset.c| 2 +-
 builtin/symbolic-ref.c | 2 +-
 builtin/tag.c  | 2 +-
 builtin/update-ref.c   | 2 +-
 fast-import.c  | 2 +-
 refs.c | 6 +++---
 refs.h | 4 ++--
 refs/files-backend.c   | 6 +++---
 transport.c| 2 +-
 14 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 31fb60578..f7a7a971f 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1049,7 +1049,7 @@ static void am_setup(struct am_state *state, enum 
patch_format patch_format,
} else {
write_state_text(state, "abort-safety", "");
if (!state->rebasing)
-   delete_ref("ORIG_HEAD", NULL, 0);
+   delete_ref(NULL, "ORIG_HEAD", NULL, 0);
}
 
/*
@@ -2172,7 +2172,7 @@ static void am_abort(struct am_state *state)
has_curr_head ? _head : NULL, 0,
UPDATE_REFS_DIE_ON_ERR);
else if (curr_branch)
-   delete_ref(curr_branch, NULL, REF_NODEREF);
+   delete_ref(NULL, curr_branch, NULL, REF_NODEREF);
 
free(curr_branch);
am_destroy(state);
diff --git a/builtin/branch.c b/builtin/branch.c
index 9d30f55b0..8f8242e07 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -251,7 +251,7 @@ static int delete_branches(int argc, const char **argv, int 
force, int kinds,
goto next;
}
 
-   if (delete_ref(name, is_null_sha1(sha1) ? NULL : sha1,
+   if (delete_ref(NULL, name, is_null_sha1(sha1) ? NULL : sha1,
   REF_NODEREF)) {
error(remote_branch
  ? _("Error deleting remote-tracking branch '%s'")
diff --git a/builtin/notes.c b/builtin/notes.c
index 5248a9bad..4b492abd4 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -681,9 +681,9 @@ static int merge_abort(struct notes_merge_options *o)
 * notes_merge_abort() to remove .git/NOTES_MERGE_WORKTREE.
 */
 
-   if (delete_ref("NOTES_MERGE_PARTIAL", NULL, 0))
+   if (delete_ref(NULL, "NOTES_MERGE_PARTIAL", NULL, 0))
ret += error(_("failed to delete ref NOTES_MERGE_PARTIAL"));
-   if (delete_ref("NOTES_MERGE_REF", NULL, REF_NODEREF))
+   if (delete_ref(NULL, "NOTES_MERGE_REF", NULL, REF_NODEREF))
ret += error(_("failed to delete ref NOTES_MERGE_REF"));
if (notes_merge_abort(o))
ret += error(_("failed to remove 'git notes merge' worktree"));
diff --git a/builtin/remote.c b/builtin/remote.c
index 5339ed6ad..2b415911b 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -691,7 +691,7 @@ static int mv(int argc, const char **argv)
read_ref_full(item->string, RESOLVE_REF_READING, oid.hash, 
);
if (!(flag & REF_ISSYMREF))
continue;
-   if (delete_ref(item->string, NULL, REF_NODEREF))
+   if (delete_ref(NULL, item->string, NULL, REF_NODEREF))
die(_("deleting '%s' failed"), item->string);
}
for (i = 0; i < remote_branches.nr; i++) {
@@ -1248,7 +1248,7 @@ static int set_head(int argc, const char **argv)
head_name = xstrdup(states.heads.items[0].string);
free_remote_ref_states();
} else if (opt_d && !opt_a && argc == 1) {
-   if (delete_ref(buf.buf, NULL, REF_NODEREF))
+   if (delete_ref(NULL, buf.buf, NULL, REF_NODEREF))
result |= error(_("Could not delete %s"), buf.buf);
} else
usage_with_options(builtin_remote_sethead_usage, options);
diff --git a/builtin/replace.c b/builtin/replace.c
index 

Re: [PATCH v4 03/19] builtin/diff-tree: convert to struct object_id

2017-02-20 Thread brian m. carlson
On Mon, Feb 20, 2017 at 08:08:36PM -0500, Jeff King wrote:
> On Tue, Feb 21, 2017 at 12:25:19AM +, brian m. carlson wrote:
> 
> > On Mon, Feb 20, 2017 at 03:09:02AM -0500, Jeff King wrote:
> > > It's a little disturbing that we do not seem to have even a basic test
> > > of:
> > > 
> > >   git rev-list --parents HEAD | git diff-tree --stdin
> > > 
> > > which would exercise this code.
> > 
> > I'm unsure, so I'll add a test.  The only way to know if it works is to
> > test it.
> 
> Not to spoil the ending, but I did test and it does not work. :)

Well, then I suppose I'll also end up sending out a new patch series. :)
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


[PATCH v2 0/4] delete_ref: support reflog messages

2017-02-20 Thread Kyle Meyer
Thanks to Junio and Peff for their feedback on v1.

  https://public-inbox.org/git/20170217035800.13214-1-k...@kyleam.com/T/#u

Changes from v1:

 * "msg" is now positioned as the first argument to delete_ref() to
   match update_ref().

   20170217081205.zn7j6d5cffgdv...@sigill.intra.peff.net

 * A "delete by head" test case has been added for the update-ref
   change.

   20170217082253.kxjezkxfqkfxj...@sigill.intra.peff.net

 * The added tests no longer send grep's output to /dev/null.

   20170217082253.kxjezkxfqkfxj...@sigill.intra.peff.net

 * Renaming the current branch is represented by two entries in HEAD's
   log, both which reuse the log message passed to rename_ref().
   
   20170217083112.vn7m4udsopmlv...@sigill.intra.peff.net,
   20170217195549.z6uyy7hbbhj5a...@sigill.intra.peff.net

 * Corrected a few places in the commit messages where "delete_refs"
   was written instead of "delete_ref".


Kyle Meyer (4):
  delete_ref: accept a reflog message argument
  update-ref: pass reflog message to delete_ref()
  rename_ref: replace empty message in HEAD's log
  branch: record creation of renamed branch in HEAD's log

 branch.c   |  5 +++--
 branch.h   |  3 ++-
 builtin/am.c   |  4 ++--
 builtin/branch.c   |  7 ---
 builtin/notes.c|  4 ++--
 builtin/remote.c   |  4 ++--
 builtin/replace.c  |  2 +-
 builtin/reset.c|  2 +-
 builtin/symbolic-ref.c |  2 +-
 builtin/tag.c  |  2 +-
 builtin/update-ref.c   |  2 +-
 fast-import.c  |  2 +-
 refs.c |  6 +++---
 refs.h |  7 ---
 refs/files-backend.c   | 10 +-
 t/t1400-update-ref.sh  | 18 ++
 t/t3200-branch.sh  |  6 ++
 transport.c|  2 +-
 18 files changed, 58 insertions(+), 30 deletions(-)

-- 
2.11.1



[PATCH v2 3/4] rename_ref: replace empty message in HEAD's log

2017-02-20 Thread Kyle Meyer
When the current branch is renamed, the deletion of the old ref is
recorded in HEAD's log with an empty message.  Now that delete_ref()
accepts a reflog message, provide a more descriptive message by
passing along the log message that is given to rename_ref().

The next step will be to extend HEAD's log to also include the second
part of the rename, the creation of the new branch.

Helped-by: Jeff King 
Signed-off-by: Kyle Meyer 
---
 refs/files-backend.c | 2 +-
 t/t3200-branch.sh| 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 299eb4d8a..f6e7c192c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2616,7 +2616,7 @@ static int files_rename_ref(struct ref_store *ref_store,
return error("unable to move logfile logs/%s to 
"TMP_RENAMED_LOG": %s",
oldrefname, strerror(errno));
 
-   if (delete_ref(NULL, oldrefname, orig_sha1, REF_NODEREF)) {
+   if (delete_ref(logmsg, oldrefname, orig_sha1, REF_NODEREF)) {
error("unable to delete old %s", oldrefname);
goto rollback;
}
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 8a833f354..0dbc54003 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -139,6 +139,11 @@ test_expect_success 'git branch -M baz bam should succeed 
when baz is checked ou
test $(git rev-parse --abbrev-ref HEAD) = bam
 '
 
+test_expect_success 'git branch -M baz bam should add entry to .git/logs/HEAD' 
'
+msg="Branch: renamed refs/heads/baz to refs/heads/bam" &&
+   grep " 0\{40\}.*$msg$" .git/logs/HEAD
+'
+
 test_expect_success 'git branch -M baz bam should succeed when baz is checked 
out as linked working tree' '
git checkout master &&
git worktree add -b baz bazdir &&
-- 
2.11.1



[PATCH v2 4/4] branch: record creation of renamed branch in HEAD's log

2017-02-20 Thread Kyle Meyer
Renaming the current branch adds an event to the current branch's log
and to HEAD's log.  However, the logged entries differ.  The entry in
the branch's log represents the entire renaming operation (the old and
new hash are identical), whereas the entry in HEAD's log represents
the deletion only (the new sha1 is null).

Extend replace_each_worktree_head_symref(), whose only caller is
branch_rename(), to take a reflog message argument.  This allows the
creation of the new ref to be recorded in HEAD's log.  As a result,
the renaming event is represented by two entries (a deletion and a
creation entry) in HEAD's log.

It's a bit unfortunate that the branch's log and HEAD's log now
represent the renaming event in different ways.  Given that the
renaming operation is not atomic, the two-entry form is a more
accurate representation of the operation and is more useful for
debugging purposes if a failure occurs between the deletion and
creation events.  It would make sense to move the branch's log to the
two-entry form, but this would involve changes to how the rename is
carried out and to how the update flags and reflogs are processed for
deletions, so it may not be worth the effort.

Based-on-patch-by: Jeff King 
Signed-off-by: Kyle Meyer 
---
 branch.c | 5 +++--
 branch.h | 3 ++-
 builtin/branch.c | 5 +++--
 refs.h   | 3 ++-
 refs/files-backend.c | 4 ++--
 t/t3200-branch.sh| 5 +++--
 6 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/branch.c b/branch.c
index b955d4f31..5c12036b0 100644
--- a/branch.c
+++ b/branch.c
@@ -345,7 +345,8 @@ void die_if_checked_out(const char *branch, int 
ignore_current_worktree)
branch, wt->path);
 }
 
-int replace_each_worktree_head_symref(const char *oldref, const char *newref)
+int replace_each_worktree_head_symref(const char *oldref, const char *newref,
+ const char *logmsg)
 {
int ret = 0;
struct worktree **worktrees = get_worktrees(0);
@@ -358,7 +359,7 @@ int replace_each_worktree_head_symref(const char *oldref, 
const char *newref)
continue;
 
if (set_worktree_head_symref(get_worktree_git_dir(worktrees[i]),
-newref)) {
+newref, logmsg)) {
ret = -1;
error(_("HEAD of working tree %s is not updated"),
  worktrees[i]->path);
diff --git a/branch.h b/branch.h
index 3103eb9ad..b07788558 100644
--- a/branch.h
+++ b/branch.h
@@ -71,6 +71,7 @@ extern void die_if_checked_out(const char *branch, int 
ignore_current_worktree);
  * This will be used when renaming a branch. Returns 0 if successful, non-zero
  * otherwise.
  */
-extern int replace_each_worktree_head_symref(const char *oldref, const char 
*newref);
+extern int replace_each_worktree_head_symref(const char *oldref, const char 
*newref,
+const char *logmsg);
 
 #endif
diff --git a/builtin/branch.c b/builtin/branch.c
index 8f8242e07..e1f97dcfc 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -579,14 +579,15 @@ static void rename_branch(const char *oldname, const char 
*newname, int force)
 
if (rename_ref(oldref.buf, newref.buf, logmsg.buf))
die(_("Branch rename failed"));
-   strbuf_release();
 
if (recovery)
warning(_("Renamed a misnamed branch '%s' away"), oldref.buf + 
11);
 
-   if (replace_each_worktree_head_symref(oldref.buf, newref.buf))
+   if (replace_each_worktree_head_symref(oldref.buf, newref.buf, 
logmsg.buf))
die(_("Branch renamed to %s, but HEAD is not updated!"), 
newname);
 
+   strbuf_release();
+
strbuf_addf(, "branch.%s", oldref.buf + 11);
strbuf_release();
strbuf_addf(, "branch.%s", newref.buf + 11);
diff --git a/refs.h b/refs.h
index 5880886a7..e529f4c3a 100644
--- a/refs.h
+++ b/refs.h
@@ -334,7 +334,8 @@ int create_symref(const char *refname, const char *target, 
const char *logmsg);
  * $GIT_DIR points to.
  * Return 0 if successful, non-zero otherwise.
  * */
-int set_worktree_head_symref(const char *gitdir, const char *target);
+int set_worktree_head_symref(const char *gitdir, const char *target,
+const char *logmsg);
 
 enum action_on_err {
UPDATE_REFS_MSG_ON_ERR,
diff --git a/refs/files-backend.c b/refs/files-backend.c
index f6e7c192c..42b137bb1 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3055,7 +3055,7 @@ static int files_create_symref(struct ref_store 
*ref_store,
return ret;
 }
 
-int set_worktree_head_symref(const char *gitdir, const char *target)
+int set_worktree_head_symref(const char *gitdir, const char *target, const 
char *logmsg)
 {
static struct lock_file head_lock;
struct ref_lock *lock;
@@ -3083,7 

[PATCH v2 2/4] update-ref: pass reflog message to delete_ref()

2017-02-20 Thread Kyle Meyer
Now that delete_ref() accepts a reflog message, pass the user-provided
message to delete_ref() rather than silently dropping it.

Signed-off-by: Kyle Meyer 
---
 builtin/update-ref.c  |  2 +-
 t/t1400-update-ref.sh | 18 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/builtin/update-ref.c b/builtin/update-ref.c
index 86d006d36..0b2ecf41a 100644
--- a/builtin/update-ref.c
+++ b/builtin/update-ref.c
@@ -433,7 +433,7 @@ int cmd_update_ref(int argc, const char **argv, const char 
*prefix)
 * For purposes of backwards compatibility, we treat
 * NULL_SHA1 as "don't care" here:
 */
-   return delete_ref(NULL, refname,
+   return delete_ref(msg, refname,
  (oldval && !is_null_sha1(oldsha1)) ? oldsha1 
: NULL,
  flags);
else
diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index b0ffc0b57..6e112fb5f 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -85,6 +85,24 @@ test_expect_success "delete $m (by HEAD)" '
 '
 rm -f .git/$m
 
+test_expect_success "deleting current branch adds message to HEAD's log" '
+   git update-ref $m $A &&
+   git symbolic-ref HEAD $m &&
+   git update-ref -m delete-$m -d $m &&
+   ! test -f .git/$m &&
+   grep "delete-$m$" .git/logs/HEAD
+'
+rm -f .git/$m
+
+test_expect_success "deleting by HEAD adds message to HEAD's log" '
+   git update-ref $m $A &&
+   git symbolic-ref HEAD $m &&
+   git update-ref -m delete-by-head -d HEAD &&
+   ! test -f .git/$m &&
+   grep "delete-by-head$" .git/logs/HEAD
+'
+rm -f .git/$m
+
 test_expect_success 'update-ref does not create reflogs by default' '
test_when_finished "git update-ref -d $outside" &&
git update-ref $outside $A &&
-- 
2.11.1



Re: [PATCH v4 03/19] builtin/diff-tree: convert to struct object_id

2017-02-20 Thread Jeff King
On Tue, Feb 21, 2017 at 12:25:19AM +, brian m. carlson wrote:

> On Mon, Feb 20, 2017 at 03:09:02AM -0500, Jeff King wrote:
> > It's a little disturbing that we do not seem to have even a basic test
> > of:
> > 
> >   git rev-list --parents HEAD | git diff-tree --stdin
> > 
> > which would exercise this code.
> 
> I'm unsure, so I'll add a test.  The only way to know if it works is to
> test it.

Not to spoil the ending, but I did test and it does not work. :)

-Peff


Re: [PATCH v4 03/19] builtin/diff-tree: convert to struct object_id

2017-02-20 Thread brian m. carlson
On Mon, Feb 20, 2017 at 03:09:02AM -0500, Jeff King wrote:
> It's a little disturbing that we do not seem to have even a basic test
> of:
> 
>   git rev-list --parents HEAD | git diff-tree --stdin
> 
> which would exercise this code.

I'm unsure, so I'll add a test.  The only way to know if it works is to
test it.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: Git bisect does not find commit introducing the bug

2017-02-20 Thread Philip Oakley

From: "Jakub Narębski" 

W dniu 20.02.2017 o 21:31, Alex Hoffman pisze:

I see two different problems each with a different assumption (see the
definition of "bisectable" in the email of Junio C Hamano):

1. (Current) Assume the entire history graph is bisectable. DO: Search
where in the entire graph the first 'trait'/transition occurs.

2. (New) Assume only the graph between one good commit and one bad
commit is bisectable. DO: Search where the first transition occurs in
the graph between these two commits.


If `git bisect` is/would be affected by `git log` history-related options
then this is what `--strict-ancestor` option gives/would give.


isn't that spelt `--ancestry-path` ?
(--ancestry-path has it's own issues such as needing an --first-parent-show 
option, but that's possibly a by the by)

--
Philip 



Re: Inconsistent results of git blame --porcelain when detecting copies from other files

2017-02-20 Thread Jeff King
On Mon, Feb 20, 2017 at 01:30:29PM -0800, Junio C Hamano wrote:

> "Sokolov, Konstantin"  writes:
> 
> > However, when using --porcelain DirectoryReader.java is reported as the 
> > origin of lines 502-504:
> > ...
> > This is not only inconsistent with the other outputs but the output is also 
> > inconsistent in itself because lines 496 -498 do not even exist in a 
> > previous version of DirectoryReader.java.
> 
> Hmph, this sounds vaguely familiar with
> 
> 
> http://public-inbox.org/git/20170106042051.nwjiuyyp7ljhs...@sigill.intra.peff.net
> 
> which is part of Git 2.12-rc0

Yeah, I had the same thought while reading Konstantin's report.

I'm not sure Git is wrong, though. I think it's just the way the
porcelain output works.

Here's a minimal reproduction; the interesting thing is when a commit is
mentioned twice (as happens on lines 1 and 5 here):

  git init repo
  cd repo
  
  # use long lines to make sure we trigger content-movement detection
  for i in $(seq 1 5); do
echo this is really long line number $i
  done >file
  git add file
  git commit -m initial
  
  sed 's/1/one/; s/5/five/' renamed
  git rm file
  git add renamed
  git commit -m 'rename and use english'
  
  git blame renamed
  git blame --line-porcelain renamed
  git blame --porcelain renamed

The first blame output looks something like this:

  bab03701 renamed ... line number 1
  ^dda1349 file... line number 2
  ^dda1349 file... line number 3
  ^dda1349 file... line number 4
  bab03701 renamed ... line number 5

so we can see it's the same case. The --line-porcelain similarly matches
the commits and filenames.

But the --porcelain output is:

  bab037010dcabaf0509db27bf232d25659b180fa 1 1 1
  ...
  filename renamed
  this is really long line number one
  dda1349d41da859f4c37e018dbed714ba6c1aa18 2 2 3
  ...
  filename file
  this is really long line number 2
  dda1349d41da859f4c37e018dbed714ba6c1aa18 3 3
  this is really long line number 3
  dda1349d41da859f4c37e018dbed714ba6c1aa18 4 4
  this is really long line number 4
  bab037010dcabaf0509db27bf232d25659b180fa 5 5 1
  this is really long line number five

You might be tempted to say that the fifth line comes from "filename
file", because that was the last "filename" entry we saw. But that's
_not_ how the porcelain output works. That "filename" entry was
associated with dda1349, but the line comes from bab0370 here.

The simplest way (IMHO) to parse --porcelain output is:

  - maintain a mapping of commit sha1s to the commit's details

  - whenever you see a "   []"
line, any key-value fields which follow impact _only_ that sha1, and
you should update the details for that map entry

  - when you see the actual tab-indented line content, you have gotten
all of the key-value updates for that sha1. You can now safely do
what you like with the line entry.

Another way, if you don't want to update your mapping, is to actually
pay attention to the size-of-hunk headers. In this case the middle three
lines come in their own hunk (which you can see from the "2 2 3" header
on the second line). The "filename" field we get applies to that hunk,
but once we switch to a different one, the filename field needs to be
looked up in the commit mapping.

But it's definitely not correct to blindly apply one "filename" field to
subsequent lines in other hunks.

And yes, I do think this is probably more complex than it needs to be.
I didn't write it. And I don't think it is worth the backwards
compatibility headache of trying to change it now. It's possible this
could be better documented (I didn't look at the documentation to write
that explanation; I happened to puzzle it out for somebody else
recently who had a similar case. That's what led to the bug-fix in the
message you linked).

-Peff


Fast Loans

2017-02-20 Thread Service Loans
Do you need a loan to pay up bill or to start a business? Email Us


Re: Inconsistent results of git blame --porcelain when detecting copies from other files

2017-02-20 Thread Junio C Hamano
"Sokolov, Konstantin"  writes:

> However, when using --porcelain DirectoryReader.java is reported as the 
> origin of lines 502-504:
> ...
> This is not only inconsistent with the other outputs but the output is also 
> inconsistent in itself because lines 496 -498 do not even exist in a previous 
> version of DirectoryReader.java.

Hmph, this sounds vaguely familiar with


http://public-inbox.org/git/20170106042051.nwjiuyyp7ljhs...@sigill.intra.peff.net

which is part of Git 2.12-rc0






Re: git diff --quiet exits with 1 on clean tree with CRLF conversions

2017-02-20 Thread Junio C Hamano
Mike Crowe  writes:

> I think that if there's a risk that file contents will undergo conversion
> then this should force the diff to check the file contents. Something like:
>
> diff --git a/diff.c b/diff.c
> index 051761b..bee1662 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -3413,13 +3413,14 @@ void diff_setup_done(struct diff_options *options)
>   /*
>* Most of the time we can say "there are changes"
>* only by checking if there are changed paths, but
> -  * --ignore-whitespace* options force us to look
> -  * inside contents.
> +  * --ignore-whitespace* options or text conversion
> +  * force us to look inside contents.
>*/
>  
>   if (DIFF_XDL_TST(options, IGNORE_WHITESPACE) ||
>   DIFF_XDL_TST(options, IGNORE_WHITESPACE_CHANGE) ||
> - DIFF_XDL_TST(options, IGNORE_WHITESPACE_AT_EOL))
> + DIFF_XDL_TST(options, IGNORE_WHITESPACE_AT_EOL) ||
> + DIFF_OPT_TST(options, ALLOW_TEXTCONV))
>   DIFF_OPT_SET(options, DIFF_FROM_CONTENTS);
>   else
>   DIFF_OPT_CLR(options, DIFF_FROM_CONTENTS);

Thanks.

You may be on the right track to find FROM_CONTENTS bit, but
I think TEXTCONV bit is a red-herring.

This part of diff.c caught my attention, while thinking about this
topic:

if (output_format & DIFF_FORMAT_NO_OUTPUT &&
DIFF_OPT_TST(options, EXIT_WITH_STATUS) &&
DIFF_OPT_TST(options, DIFF_FROM_CONTENTS)) {
/*
 * run diff_flush_patch for the exit status. setting
 * options->file to /dev/null should be safe, because we
 * aren't supposed to produce any output anyway.
 */

and the body of this "if" statement loops over q->queue[].  It is
about the "even though we prefer not having to format the patch
because we are doing --quiet, we need to see if contents of one and
two that we _know_ are different are made into the same thing when
we compare them while ignoring various forms of whitespace changes".
So one and two that are removed in earlier step because they were
truly identical may not be penalized if you flip FROM_CONTENTS bit
early on.

Having said that, DIFF_FROM_CONTENTS is about all paths this options
structure governs, not just paths that have eol conversion defined.
When you say "diff --ignore-whitespace-change", that applies to all
paths.  The eol conversion is specified per-path, so ideally the
FROM_CONTENTS bit should be flipped if and only if one or more of
the paths would need the conversion (i.e. does the helper function
would_convert_to_git() say "yes" to the path?).  I however suspect
that necessary information to do so (i.e. "which paths we are
looking at?") has not been generated yet at the point of the code
you quoted.  setup comes (and must come) very early, and then
q->queue[] is populated by different front-end functions that
compare trees, the index, and the working tree, depending on the
"git diff" option or "git diff-{tree,index,files}" plumbing command,
and you can ask "would one of these paths need conversion?" only
after q->queue[] is populated.  Hmm.

Another thing is that ALLOW_TEXTCONV is not a right bit to check for
your example.  It is "do we use textconv filters to turn binary
files into a (phony) text representation before comparing?".  People
use the mechanism to throw JPEG photos in Git and have textconv
filter to extract only EXIF data, and "diff --textconv" would let us
textually compare only the EXIF data (which is the only human
readable part of the contents anyway).  

It might be a good idea to also flip FROM_CONTENTS bit under "diff
--textconv", and ...

> This solves the problem for me and my test case now passes.

... but I suspect that you were misled to think it fixes your issue,
only because "--textconv" is by default enabled for "git diff" and
"git log" (see "git diff --help").  I think you are saying that if
you always set FROM_CONTENTS bit on, you get what you want.  But
that is to be expected and it unfortunately penalizes everybody by
turning an obvious optimization permanently off.

Also "--textconv" is not on by default for the plumbing "git
diff-index" command and its friends, so it is likely that "git
diff-index HEAD" with your change will still not work as you expect.

A cheap (from code-change point of view) band-aid might be to flip
FROM_CONTENTS on if we know the repository has _some_ paths that
need eol conversion, even when the particular "diff" we are taking
does not involve any eol conversion (e.g. "is core.crlf set?").
While it may be better than "if we are porcelain (aka ALLOW_TEXTCONV
is set), unconditionally flip FROM_CONTENTS on", it is not ideal,
either.

This almost makes me suspect that the place that checks lengths of
one and two in order to refrain from running more expensive content
comparison you found earlier need to ask would_convert_to_git()
before taking the short-cut, something along the lines of 

Re: url..insteadOf vs. submodules

2017-02-20 Thread Jeff King
On Mon, Feb 20, 2017 at 09:31:40PM +0100, Toolforger wrote:

> > The submodule operations happen in their own processes, and do not look
> > at the config of the parent repo.
> 
> Ah, then we have a docbug.
> git help config has this to say:
> 
> url..insteadOf
> Any URL that starts with this value will be rewritten to start,
> instead, with .
> 
> The "Any" here is wrong, it would be "any except submodule" (possibly other
> exceptions).

I'm not sure that "any" is wrong here. Repository-specific config does
not cross repository boundaries. That applies to this config value, and
to all the others, too (e.g., if you set "diff.renames" in the
super-project, it would not have an effect in the submodule).

I think if there is a doc bug, it is that the repo boundary between the
submodule and the super-project is not made more clear.

That said, I do think it would be a useful feature for the super-project
to rewrite URLs before handing them off to the submodule. But I do not
really work on submodules nor use them myself, so there may be
complications.

I suppose you could argue that failing to rewrite violates the "any" in
the quoted text. It doesn't say when the rewriting occurs, but it is
essentially "when the URL is accessed". So the super-project feeds the
raw URL to the submodule `git clone`, which then applies any URL
rewriting.

> > but one workaround is to set the config in ~/.gitconfig.
> 
> No can do - that's under version control.
> My personal setup does not belong there I think ;-)

I'm not sure I understand. You have a project policy to use certain
URLs. But you, the user, want to override that. Why isn't the
user-specific config file the right place to put that?

(I think there _is_ a mismatch, in that the change is specific not just
to your user, but to the repo. So you would not want to rewrite other
references to the same URL in other repos. But that does not seem to be
your objection).

-Peff


Re: Git bisect does not find commit introducing the bug

2017-02-20 Thread Alex Hoffman
> If `git bisect` is/would be affected by `git log` history-related options
> then this is what `--strict-ancestor` option gives/would give.

Exactly my thoughts. All that needs to be changed in the 2nd problem
is the graph where to search.

But first we must agree about the usefulness of the 2nd problem. Any
thoughts/comments/votes for/against?


Re: Git bisect does not find commit introducing the bug

2017-02-20 Thread Jakub Narębski
W dniu 20.02.2017 o 21:31, Alex Hoffman pisze:
> I see two different problems each with a different assumption (see the
> definition of "bisectable" in the email of Junio C Hamano):
> 
> 1. (Current) Assume the entire history graph is bisectable. DO: Search
> where in the entire graph the first 'trait'/transition occurs.
>
> 2. (New) Assume only the graph between one good commit and one bad
> commit is bisectable. DO: Search where the first transition occurs in
> the graph between these two commits.

If `git bisect` is/would be affected by `git log` history-related options
then this is what `--strict-ancestor` option gives/would give.

-- 
Jakub Narębski



Re: url..insteadOf vs. submodules

2017-02-20 Thread Toolforger

On 20.02.2017 10:01, Jeff King wrote:

On Sun, Feb 19, 2017 at 10:12:28PM +0100, Toolforger wrote:


I am trying to make url..insteadOf work on the URLs inside
.gitmodules, but it won't work (applying it to the repo itself works fine,
to the config setting seems to be fine).


The submodule operations happen in their own processes, and do not look
at the config of the parent repo.


Ah, then we have a docbug.
git help config has this to say:

url..insteadOf
Any URL that starts with this value will be rewritten to start,
instead, with .

The "Any" here is wrong, it would be "any except submodule" (possibly 
other exceptions).


> Are you setting the config in

.git/config of the super-project?


Exactly.
My thinking was that since the submodule URLs are specified in the super 
project's .gitmodules, that setting should apply.



I don't know if there plans to make that work,


It would certainly help me out, though I guess it's going to be too late 
for my current project :-)


> but one workaround is to set the config in ~/.gitconfig.

No can do - that's under version control.
My personal setup does not belong there I think ;-)

I am currently trying to write a shell script that
- does git submodule init
- pulls submodule configuration out of git config -l
- configures each submodule with insteadOf
It fits with my workflow because setting up the repositories is going to 
be done via script anyway.

I'm neither a shell nor a git expert, so any advice still appreciated.

Regards,
Jo


Re: Git bisect does not find commit introducing the bug

2017-02-20 Thread Alex Hoffman
I see two different problems each with a different assumption (see the
definition of "bisectable" in the email of Junio C Hamano):

1. (Current) Assume the entire history graph is bisectable. DO: Search
where in the entire graph the first 'trait'/transition occurs.
2. (New) Assume only the graph between one good commit and one bad
commit is bisectable. DO: Search where the first transition occurs in
the graph between these two commits.

It seems that the real world needs a solution also for the second
problem (example if the good commit is the FIRST good commit of a
feature or if the good commit is not the first good commit, but you
definitely know, that it broke first somewhere in between the good and
bad commit).

I find the way to go as Oleg proposed is gittish enough (with a new
parameter --strategy). Beside I would underline that also the second
problem is a bisect problem, just for another graph, thus it makes
perfect sense to extend 'git bisect' for this.

Does that look reasonably?


Re: [PATCH 4/4 v4] sha1_name.c: teach get_sha1_1 "-" shorthand for "@{-1}"

2017-02-20 Thread Junio C Hamano
Siddharth Kannan  writes:

> On 17 February 2017 at 00:38, Junio C Hamano  wrote:
>> Having said all that, I do not think the remainder of the code is
>> prepared to take "-", not yet anyway [*1*], so turning "-" into
>> "@{-1}" this patch does before it calls get_sha1_basic(), while it
>> is not an ideal final state, is probably an acceptable milestone to
>> stop at.
>
> So, is it okay to stop with just supporting "-" and not support things
> like "-@{yesterday}"?

If the approach to turn "-" into "@{-1}" at that spot you did will
cause "-@{yesterday}" to barf, then I'd say so be it for now ;-).
We can later spread the understanding of "-" to functions deeper in
the callchain and add support for that, no?

>> It is a separate matter if this patch is sufficient to produce
>> correct results, though.  I haven't studied the callers of this
>> change to make sure yet, and may find bugs in this approach later.


Re: [PATCH 2/5] hashmap: allow memihash computation to be continued

2017-02-20 Thread Junio C Hamano
Johannes Schindelin  writes:

>> If an extra call level really matters, its "inline" equivalent in
>> the header would probably be good.
>
> Well, the hashing is supposed to be as fast as possible, so I would like
> to avoid that extra call level. However, the end result is not so pretty
> because FNV32_BASE needs to be made public (OTOH it removes more lines
> than it adds):

I think our usual answer is "can we measure the difference to
demonstrate that the overhead for an extra call matter?"

As two functions sit next to each other in a single file, the code
duplication does not bother me _that_ much.  A single liner 

/* keep implementations of these two in sync */

in front of these two functions would not hurt, but whoever attempts
to come up with a better hash needs to stare at this file carefully
anyway, so lack of such carefulness probably wouldn't be too big an
issue, either.

But the above 8 lines are something we need to worry about after we
definitely know that we MUST have two independent functions that are
supposed to be kept in sync; a patch that makes us worry them before
we know is a premature optimization, and that bothers me even more
than the actual code duplication that can drift apart.



Re: slightly confusing message

2017-02-20 Thread Junio C Hamano
Leon George  writes:

> Every once in a while this:
>
> £ git add -p .
> warning: empty strings as pathspecs will be made invalid in upcoming
> releases. please use . instead if you meant to match all paths
> No changes.
>
> It seems to happen only when there are no more modified files and git
> still works wonderfully otherwise - just wanted to let you know.

This sounds vaguely familiar and indeed I think it is this one:

https://public-inbox.org/git/CAEnOLdvG=sokfxej_plmamgj_8osc+28tsg+pbflltr+zlc...@mail.gmail.com/

which was from late last year.  

I suspect that the issue may already be fixed at the tip of 'master'
(aka Git 2.12-rc2).

Thanks.


Re: Cross-referencing the Git mailing list archive with their corresponding commits in `pu`

2017-02-20 Thread Junio C Hamano
Junio C Hamano  writes:

> Johannes Schindelin  writes:
>
>> There is a third category, and this one *does* come as a surprise to me.
>> It appears that at least *some* patches' Date: lines are either ignored or
>> overridden or changed on their way from the mailing list into Git's commit
>> history. There was only one commit in that commit range:
>>
>> 3c0cb0c (read_loose_refs(): read refs using resolve_ref_recursively(),
>>  Michael Haggerty 2017-02-09)
>>
>> This one was committed with an author date "Thu, 09 Feb 2017 21:53:52
>> +0100" but it appears that there was no mail sent to the Git mailing list
>
> I think this is this one:
>
> 
>
> Recent "What's cooking" lists the topic this one is part with this
> comment:
>
>  The tip one is newer than the one posted to the list but was sent
>  privately by the author via his GitHub repository.

We didn't have any pull from sub-maintainers during the period you
checked, but when we do, those could also fall into the category.
Even though I see some l10n patches Cc'ed to the list, I won't be
surprised if not everything that is sent to Jiang Xin (i18n/l10n
coordinator) is, for example.  It also is OK for sub-maintainers to
have their own commit to describe or otherwise improve their area
and without sending a patch before doing so if they deem it
appropriate [*1*].

I actually think automation like yours would help another category:
There is a newer version of the series or an entirely new series on
the list, but the project's tree has not picked them up (yet).

I from time to time sweep my inbox in an attempt to find and pick up
leftover bits.  Sometimes the authors remind me by pinging [*2*],
which greatly helps.  But another set of eyeballs that may be
enhanced with a mechanised filter that catches "messages without
corresponding commits", which is the opposite of this "third"
category, would be of great help, too [*3*].


[Footnote]

*1* ... like trivial fixes, for example, at their discretion.  After
all we entrusted their own area and we should give them the
flexibility they can exercise with good taste ;-).

*2* e.g. <2f67fc21-92f9-a03e-1b09-a237af6db...@alum.mit.edu>

*3* ... even if a mechanised filter alone might strike too many
false positives.


Re: [PATCH v6 0/6] stash: support pathspec argument

2017-02-20 Thread Junio C Hamano
Thomas Gummerer  writes:

> @@ -55,10 +53,13 @@ push [-p|--patch] [-k|--[no-]keep-index] 
> [-u|--include-untracked] [-a|--all] [-q
>  
>   Save your local modifications to a new 'stash', and run `git reset
>   --hard` to revert them.  The  part is optional and gives

I didn't notice this during v5 review, but the above seems to be
based on the codebase before your documentation update (which used
to be part of the series in older iteration).  I had to tweak the
series to apply on top of your 20a7e06172 ("Documentation/stash:
remove mention of git reset --hard", 2017-02-12) while queuing, so
please double check the result when it is pushed out to 'pu'.

Thanks.


Re: Cross-referencing the Git mailing list archive with their corresponding commits in `pu`

2017-02-20 Thread Junio C Hamano
Johannes Schindelin  writes:

> There is a third category, and this one *does* come as a surprise to me.
> It appears that at least *some* patches' Date: lines are either ignored or
> overridden or changed on their way from the mailing list into Git's commit
> history. There was only one commit in that commit range:
>
> 3c0cb0c (read_loose_refs(): read refs using resolve_ref_recursively(),
>   Michael Haggerty 2017-02-09)
>
> This one was committed with an author date "Thu, 09 Feb 2017 21:53:52
> +0100" but it appears that there was no mail sent to the Git mailing list

I think this is this one:



Recent "What's cooking" lists the topic this one is part with this
comment:

 The tip one is newer than the one posted to the list but was sent
 privately by the author via his GitHub repository.



Re: Sending informational messages from upload-pack

2017-02-20 Thread Jeff King
On Mon, Feb 20, 2017 at 07:38:02PM +0100, Lukas Fleischer wrote:

> It would be handy to be able to show a message to the user when
> cloning/fetching from a repository (e.g. to show a warning if a
> repository is deprecated). This should technically already be possible
> using the current pack protocol and sidebands. However, to my knowledge,
> there is no easy way to configure this on the server side; writing a
> wrapper around git-upload-pack(1) or replacing git-upload-pack(1) seem
> to be the only options.

I wouldn't recommend wrapping upload-pack. You don't know you have a
sideband until partway through the upload-pack conversation. And clients
do not expect sideband at all until we get to the pack-sending part of
the protocol (I think; I just quickly verified the location of the
demuxer async code in fetch-pack.c, but I didn't dig into it in depth).

So I don't think you can do a MOTD or similar in a backwards-compatible
way. You're only allowed to talk if the conversation results in an
actual pack being sent.

> What I have in mind is something like a post-upload hook whose stdout
> and stderr are redirected to sideband 2 and 3, respectively. The commit
> message of 20b20a22f (upload-pack: provide a hook for running
> pack-objects, 2016-05-18) suggests that such a hook should be
> implemented as a "config variable hook" rather than a regular hook.

Yeah, because of the "upload-pack is special and untrusted" rule, this
can't be a regular hook. I think the config mechanism used by 20b20a22f
would be the right approach.

If my fetch-pack assertion above is right, technically the hook added by
20b20a22f is sufficient for your purposes, if your hook looks like:

  echo >&2 "pre-pack message"
  git pack-objects "$@"
  echo >72 "post-pack message"

but I would not be opposed to having pre-/post- hooks that run
separately, if only for the convenience of the admin.

> One could think of additional parameters passed to such a hook. For the
> purposes I intend to use this, no parameters are needed. However, a
> fixed per-repository MOTD would be too inflexible since we are using
> namespaces and database accesses to determine whether a repository is
> "deprecated".

There was a proposed post-upload-pack hook a long time ago that
collected clone/fetch stats, and we used it at GitHub for many years.
These days we use something much more invasive that dumps stats from
every git invocation over a Unix socket.

> Am I missing any "easy" already supported way to add such messages
> without patching Git or writing a git-upload-pack(1) wrapper? If not,
> does this sound general and useful enough to become an official feature?
> Are there any alternative suggestions on how to display such messages?

I don't think there's any other mechanism to do what you're asking,
aside from the hook in 20b20a22f.

-Peff


Sending informational messages from upload-pack

2017-02-20 Thread Lukas Fleischer
Hi,

It would be handy to be able to show a message to the user when
cloning/fetching from a repository (e.g. to show a warning if a
repository is deprecated). This should technically already be possible
using the current pack protocol and sidebands. However, to my knowledge,
there is no easy way to configure this on the server side; writing a
wrapper around git-upload-pack(1) or replacing git-upload-pack(1) seem
to be the only options.

What I have in mind is something like a post-upload hook whose stdout
and stderr are redirected to sideband 2 and 3, respectively. The commit
message of 20b20a22f (upload-pack: provide a hook for running
pack-objects, 2016-05-18) suggests that such a hook should be
implemented as a "config variable hook" rather than a regular hook.

One could think of additional parameters passed to such a hook. For the
purposes I intend to use this, no parameters are needed. However, a
fixed per-repository MOTD would be too inflexible since we are using
namespaces and database accesses to determine whether a repository is
"deprecated".

Am I missing any "easy" already supported way to add such messages
without patching Git or writing a git-upload-pack(1) wrapper? If not,
does this sound general and useful enough to become an official feature?
Are there any alternative suggestions on how to display such messages?

Regards,
Lukas


Inconsistent results of git blame --porcelain when detecting copies from other files

2017-02-20 Thread Sokolov, Konstantin
Hi Folks!

The issue is best explained on an example. You can reproduce it using the 
Lucene repo https://github.com/apache/lucene-solr.git. Tested with the 
following versions:  1.8.1.6 (Ubuntu), 2.11.0.windows.1, 2.11.1.windows.1.

First, let's produce the correct results without using --procelain:

> git blame --show-name --show-number -s -w --abbrev=40 -C -C -C 
> d1165b19726fa0cd13a539827a7cd43237a4feef..10ba9abeb208d37df985e95a742f756de067353f
>  --not f5dba8b76709ff0ef8715b8b288a4b64d4993fa3 -- 
> lucene/src/java/org/apache/lucene/index/DirectoryReader.java

The following excerpt shows lines 501-505 from the output. In particular we can 
see that lines 502-503 originate from IndexReader.java.

10ba9abeb208d37df985e95a742f756de067353f 
lucene/src/java/org/apache/lucene/index/DirectoryReader.java 501 501)* 
This method
^d1165b19726fa0cd13a539827a7cd43237a4fee 
lucene/src/java/org/apache/lucene/index/IndexReader.java 496 502)* 
returns the version recorded in the commit that the
^d1165b19726fa0cd13a539827a7cd43237a4fee 
lucene/src/java/org/apache/lucene/index/IndexReader.java 497 503)* 
reader opened.  This version is advanced every time
^d1165b19726fa0cd13a539827a7cd43237a4fee 
lucene/src/java/org/apache/lucene/index/IndexReader.java 498 504)* a 
change is made with {@link IndexWriter}.
10ba9abeb208d37df985e95a742f756de067353f 
lucene/src/java/org/apache/lucene/index/DirectoryReader.java 505 505)*/

The same information can be obtained as well by using --line-porcelain:

> git blame --show-name --show-number --line-porcelain -s -w --abbrev=40 -C -C 
> -C 
> d1165b19726fa0cd13a539827a7cd43237a4feef..10ba9abeb208d37df985e95a742f756de067353f
>  --not f5dba8b76709ff0ef8715b8b288a4b64d4993fa3 -- 
> lucene/src/java/org/apache/lucene/index/DirectoryReader.java

Here is the output for line 502:

d1165b19726fa0cd13a539827a7cd43237a4feef 496 502 3
author Michael McCandless
author-mail 
author-time 1327877325
author-tz +
committer Michael McCandless
committer-mail 
committer-time 1327877325
committer-tz +
summary LUCENE-3725: add optional packing to FSTs
boundary
filename lucene/src/java/org/apache/lucene/index/IndexReader.java
* returns the version recorded in the commit that the

However, when using --porcelain DirectoryReader.java is reported as the origin 
of lines 502-504:

> git blame --show-name --show-number --porcelain -s -w --abbrev=40 -C -C -C 
> d1165b19726fa0cd13a539827a7cd43237a4feef..10ba9abeb208d37df985e95a742f756de067353f
>  --not f5dba8b76709ff0ef8715b8b288a4b64d4993fa3 -- 
> lucene/src/java/org/apache/lucene/index/DirectoryReader.java

10ba9abeb208d37df985e95a742f756de067353f 501 501 1
author Uwe Schindler
author-mail 
author-time 1327879145
author-tz +
committer Uwe Schindler
committer-mail 
committer-time 1327879145
committer-tz +
summary Reverse merged revision(s) from lucene/dev/trunk up to 1237502
previous f5dba8b76709ff0ef8715b8b288a4b64d4993fa3 
lucene/src/java/org/apache/lucene/index/DirectoryReader.java
filename lucene/src/java/org/apache/lucene/index/DirectoryReader.java
* This method
d1165b19726fa0cd13a539827a7cd43237a4feef 496 502 3
* returns the version recorded in the commit that the
d1165b19726fa0cd13a539827a7cd43237a4feef 497 503
* reader opened.  This version is advanced every time
d1165b19726fa0cd13a539827a7cd43237a4feef 498 504

This is not only inconsistent with the other outputs but the output is also 
inconsistent in itself because lines 496 -498 do not even exist in a previous 
version of DirectoryReader.java.

Thanks for any feedback.

Kind Regards
Konstantin Sokolov


Re: [PATCH] config: preserve case for one-shot config on the command line

2017-02-20 Thread Lars Schneider

> On 20 Feb 2017, at 10:58, Junio C Hamano  wrote:
> 
> Junio C Hamano  writes:
> 
>> I still haven't queued any of the variants I posted (and I do not
>> think other people sent their own versions, either).  I need to pick
>> one and queue, with a test or two.  Perhaps after -rc2.  
>> 
>> Others are welcome to work on it while I cut -rc2 tomorrow, so that
>> by the time I see their patch all that is left for me to do is to
>> apply it ;-)
> 
> Since nothing seems to have happened in the meantime, here is what
> I'll queue so that we won't forget for now.  Lars's tests based on
> how the scripted "git submodule" uses "git config" may still be
> valid, but it is somewhat a roundabout way to demonstrate the
> breakage and not very effective way to protect the fix, so I added a
> new test that directly tests "git -c = ".

Agreed. Please ignore my tests.
If you want to you could queue this tiny cleanup, though:
http://public-inbox.org/git/20170215113325.14393-1-larsxschnei...@gmail.com/

> ...
> 
> +/*
> + * downcase the  and  in . or
> + * .. and do so in place.  
> + * is left intact.
> + */
> +static void canonicalize_config_variable_name(char *varname)
> +{
> + char *cp, *last_dot;

What does cp stand for? "char pointer"?

> +
> + /* downcase the first segment */
> + for (cp = varname; *cp; cp++) {
> + if (*cp == '.')
> + break;
> + *cp = tolower(*cp);
> + }
> + if (!*cp)
> + return;
> +
> + /* scan for the last dot */
> + for (last_dot = cp; *cp; cp++)
> + if (*cp == '.')
> + last_dot = cp;
> +
> + /* downcase the last segment */
> + for (cp = last_dot; *cp; cp++)
> + *cp = tolower(*cp);
> +}
> +
> int git_config_parse_parameter(const char *text,
>  config_fn_t fn, void *data)
> {
> @@ -221,7 +249,7 @@ int git_config_parse_parameter(const char *text,
>   strbuf_list_free(pair);
>   return error("bogus config parameter: %s", text);
>   }
> - strbuf_tolower(pair[0]);
> + canonicalize_config_variable_name(pair[0]->buf);
>   if (fn(pair[0]->buf, value, data) < 0) {
>   strbuf_list_free(pair);
>   return -1;
> diff --git a/t/t1351-config-cmdline.sh b/t/t1351-config-cmdline.sh
> new file mode 100755
> index 00..acb8dc3b15
> --- /dev/null
> +++ b/t/t1351-config-cmdline.sh
> @@ -0,0 +1,48 @@
> +#!/bin/sh
> +
> +test_description='git -c var=val'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'last one wins: two level vars' '
> + echo VAL >expect &&
> +
> + # sec.var and sec.VAR are the same variable, as the first
> + # and the last level of a configuration variable name is
> + # case insensitive.  Test both setting and querying.
> +
> + git -c sec.var=val -c sec.VAR=VAL config --get sec.var >actual &&
> + test_cmp expect actual &&
> + git -c SEC.var=val -c sec.var=VAL config --get sec.var >actual &&
> + test_cmp expect actual &&
> +
> + git -c sec.var=val -c sec.VAR=VAL config --get SEC.var >actual &&
> + test_cmp expect actual &&
> + git -c SEC.var=val -c sec.var=VAL config --get sec.VAR >actual &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'last one wins: three level vars' '
> + echo val >expect &&
> +
> + # v.a.r and v.A.r are not the same variable, as the middle
> + # level of a three-level configuration variable name is
> + # case sensitive.  Test both setting and querying.
> +
> + git -c v.a.r=val -c v.A.r=VAL config --get v.a.r >actual &&
> + test_cmp expect actual &&
> + git -c v.a.r=val -c v.A.r=VAL config --get V.a.R >actual &&
> + test_cmp expect actual &&
> +
> + echo VAL >expect &&
> + git -c v.a.r=val -c v.a.R=VAL config --get v.a.r >actual &&
> + test_cmp expect actual &&
> + git -c v.a.r=val -c V.a.r=VAL config --get v.a.r >actual &&
> + test_cmp expect actual &&
> + git -c v.a.r=val -c v.a.R=VAL config --get V.a.R >actual &&
> + test_cmp expect actual &&
> + git -c v.a.r=val -c V.a.r=VAL config --get V.a.R >actual &&
> + test_cmp expect actual
> +'
> +
> +test_done
> -- 
> 2.12.0-rc2-221-g8fa194a99f
> 

Looks good to me!

Thank you,
Lars



Re: git diff --quiet exits with 1 on clean tree with CRLF conversions

2017-02-20 Thread Mike Crowe
On Friday 17 February 2017 at 22:19:58 +, Mike Crowe wrote:
> On Friday 17 February 2017 at 14:05:17 -0800, Junio C Hamano wrote:
> > Mike Crowe  writes:
> > 
> > > If "git diff --quiet" finds it necessary to compare actual file contents,
> > > and a file requires CRLF conversion, then it incorrectly exits with an 
> > > exit
> > > code of 1 even if there have been no changes.
> > >
> > > The patch below adds a test file that shows the problem.
> > 
> > If "git diff" does not show any output and "git diff --exit-code" or
> > "git diff --quiet" says there are differences, then it is a bug.
> > 
> > I would however have expected that any culprit would involve a code
> > that says "under QUICK option, we do not have to bother doing
> > this".  The part you quoted:
> > 
> > >   if (!DIFF_FILE_VALID(p->one) || /* (1) */
> > >   !DIFF_FILE_VALID(p->two) ||
> > >   (p->one->oid_valid && p->two->oid_valid) ||
> > >   (p->one->mode != p->two->mode) ||
> > >   diff_populate_filespec(p->one, CHECK_SIZE_ONLY) ||
> > >   diff_populate_filespec(p->two, CHECK_SIZE_ONLY) ||
> > >   (p->one->size != p->two->size) ||
> > >   !diff_filespec_is_identical(p->one, p->two)) /* (2) */
> > >   p->skip_stat_unmatch_result = 1;
> > 
> > is used by "git diff" with and without "--quiet", afacr, so I
> > suspect that the bug lies somewhere else.
> 
> I can't say that I really understand the code fully, but it appears that
> the first pass generates a queue of files that contain differences. The
> result of the quiet diff comes from the size of that queue,
> diff_queued_diff.nr, being non-zero in diffcore_std. I'm assuming that the
> result of the noisy diff comes from actually comparing the files.
> 
> But, I've only spent a short while looking so I may have got the wrong end
> of the stick.

Tricking Git into checking the actual file contents (by passing
--ignore-space-change for example) is sufficient to ensure that the exit
status is never 1 when it should be zero. (Of course that option has other
unwanted effects in this case.)

I think that if there's a risk that file contents will undergo conversion
then this should force the diff to check the file contents. Something like:

diff --git a/diff.c b/diff.c
index 051761b..bee1662 100644
--- a/diff.c
+++ b/diff.c
@@ -3413,13 +3413,14 @@ void diff_setup_done(struct diff_options *options)
/*
 * Most of the time we can say "there are changes"
 * only by checking if there are changed paths, but
-* --ignore-whitespace* options force us to look
-* inside contents.
+* --ignore-whitespace* options or text conversion
+* force us to look inside contents.
 */
 
if (DIFF_XDL_TST(options, IGNORE_WHITESPACE) ||
DIFF_XDL_TST(options, IGNORE_WHITESPACE_CHANGE) ||
-   DIFF_XDL_TST(options, IGNORE_WHITESPACE_AT_EOL))
+   DIFF_XDL_TST(options, IGNORE_WHITESPACE_AT_EOL) ||
+   DIFF_OPT_TST(options, ALLOW_TEXTCONV))
DIFF_OPT_SET(options, DIFF_FROM_CONTENTS);
else
DIFF_OPT_CLR(options, DIFF_FROM_CONTENTS);

This solves the problem for me and my test case now passes. Unfortunately
it breaks the 'removing and adding subproject' test case in
t3040-subprojects-basic at the line:

 test_expect_code 1 git diff -M --name-status --exit-code HEAD^ HEAD

presumably because after the rename has been detected the file contents are
identical. :( A rename of a single file appears to still be handled
correctly.

Mike.


Re: [PATCH 4/4 v4] sha1_name.c: teach get_sha1_1 "-" shorthand for "@{-1}"

2017-02-20 Thread Siddharth Kannan
On 17 February 2017 at 00:38, Junio C Hamano  wrote:
> Having said all that, I do not think the remainder of the code is
> prepared to take "-", not yet anyway [*1*], so turning "-" into
> "@{-1}" this patch does before it calls get_sha1_basic(), while it
> is not an ideal final state, is probably an acceptable milestone to
> stop at.

So, is it okay to stop with just supporting "-" and not support things
like "-@{yesterday}"?

Matthieu's comments on the matter:

Siddharth Kannan  writes:

> As per Matthieu's comments, I have updated the tests, but there
is still one
> thing that is not working: log -@{yesterday} or log -@{2.days.ago}

Note that I did not request that these things work, just that they seem
to be relevant tests: IMHO it's OK to reject them, but for example we
don't want them to segfault. And having a test is a good hint that you
thought about what could happen and to document it.

[Quoted from email ]


>
> It is a separate matter if this patch is sufficient to produce
> correct results, though.  I haven't studied the callers of this
> change to make sure yet, and may find bugs in this approach later.
>

-- 

Best Regards,

- Siddharth Kannan.


Re: Git bisect does not find commit introducing the bug

2017-02-20 Thread Oleg Taranenko
> Anyway, I don't think it is feasible to weaken the assumption that there
> is only one transition from 'old' ('good') to 'new' ('bad'); this is
> what allows O(log(N)) operations.  See bisection method of root finding,
> that is finding zeros of a continuous function.

I don't intended to change default behaviour of git bisect, I can
estimate what drawback it could bring.
I'd rather implement something like

git bisect start --bisect-strategy=missing-feature

by default current state is being used.

git config value would be also useful.

Oleg


Re: [PATCH v4 00/15] Remove submodule from files-backend.c

2017-02-20 Thread Duy Nguyen
On Mon, Feb 20, 2017 at 7:42 PM, Michael Haggerty  wrote:
> On 02/18/2017 02:32 PM, Nguyễn Thái Ngọc Duy wrote:
>> v4:
>
> I very much like the direction of this patch series. I reviewed the
> design pretty carefully and left some comments about it, and skimmed
> through the code changes. But I haven't yet reviewed the code in detail.
> I'll wait for your reaction to my design comments before doing so.

Unless there are some mails in transit, thanks for the review. The
comments I haven't replied to usually mean "I agree" (but not writing
unless I could add anything after).
-- 
Duy


Re: [PATCH v4 00/15] Remove submodule from files-backend.c

2017-02-20 Thread Michael Haggerty
On 02/18/2017 02:32 PM, Nguyễn Thái Ngọc Duy wrote:
> v4:

I very much like the direction of this patch series. I reviewed the
design pretty carefully and left some comments about it, and skimmed
through the code changes. But I haven't yet reviewed the code in detail.
I'll wait for your reaction to my design comments before doing so.

Thanks,
Michael



Re: [PATCH 2/5] hashmap: allow memihash computation to be continued

2017-02-20 Thread Johannes Schindelin
Hi Junio,

On Fri, 17 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > diff --git a/hashmap.c b/hashmap.c
> > index b10b642229c..061b7d61da6 100644
> > --- a/hashmap.c
> > +++ b/hashmap.c
> > @@ -50,6 +50,20 @@ unsigned int memihash(const void *buf, size_t len)
> > return hash;
> >  }
> >  
> > +/* Incoporate another chunk of data into a memihash computation. */
> > +unsigned int memihash_continue(unsigned int hash,
> > +  const void *buf, size_t len)
> > +{
> > +   const unsigned char *p = buf;
> > +   while (len--) {
> > +   unsigned int c = *p++;
> > +   if (c >= 'a' && c <= 'z')
> > +   c -= 'a' - 'A';
> > +   hash = (hash * FNV32_PRIME) ^ c;
> > +   }
> > +   return hash;
> > +}
> 
> This makes me wonder if we want to reduce the duplication (primarily
> to avoid risking the loop body to go out of sync) by doing:
> 
>   unsigned int memihash(const void *buf, size_t len)
>   {
>   return memihash_continue(buf, len, FNV32_BASE);
>   }
> 
> If an extra call level really matters, its "inline" equivalent in
> the header would probably be good.

Well, the hashing is supposed to be as fast as possible, so I would like
to avoid that extra call level. However, the end result is not so pretty
because FNV32_BASE needs to be made public (OTOH it removes more lines
than it adds):

-- snipsnap --
diff --git a/hashmap.c b/hashmap.c
index 061b7d61da6..470a0832688 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -4,7 +4,6 @@
 #include "cache.h"
 #include "hashmap.h"
 
-#define FNV32_BASE ((unsigned int) 0x811c9dc5)
 #define FNV32_PRIME ((unsigned int) 0x01000193)
 
 unsigned int strhash(const char *str)
@@ -37,19 +36,6 @@ unsigned int memhash(const void *buf, size_t len)
return hash;
 }
 
-unsigned int memihash(const void *buf, size_t len)
-{
-   unsigned int hash = FNV32_BASE;
-   unsigned char *ucbuf = (unsigned char *) buf;
-   while (len--) {
-   unsigned int c = *ucbuf++;
-   if (c >= 'a' && c <= 'z')
-   c -= 'a' - 'A';
-   hash = (hash * FNV32_PRIME) ^ c;
-   }
-   return hash;
-}
-
 /* Incoporate another chunk of data into a memihash computation. */
 unsigned int memihash_continue(unsigned int hash,
   const void *buf, size_t len)
diff --git a/hashmap.h b/hashmap.h
index 78e14dfde71..a1a8fd7dc54 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -8,12 +8,17 @@
 
 /* FNV-1 functions */
 
+#define FNV32_BASE ((unsigned int) 0x811c9dc5)
+
 extern unsigned int strhash(const char *buf);
 extern unsigned int strihash(const char *buf);
 extern unsigned int memhash(const void *buf, size_t len);
-extern unsigned int memihash(const void *buf, size_t len);
 extern unsigned int memihash_continue(unsigned int hash_seed,
  const void *buf, size_t len);
+static inline unsigned int memihash(const void *buf, size_t len)
+{
+   return memihash_continue(FNV32_BASE, buf, len);
+}
 
 static inline unsigned int sha1hash(const unsigned char *sha1)
 {


slightly confusing message

2017-02-20 Thread Leon George
Hey, you lovely people <3

Every once in a while this:

£ git add -p .
warning: empty strings as pathspecs will be made invalid in upcoming
releases. please use . instead if you meant to match all paths
No changes.

It seems to happen only when there are no more modified files and git
still works wonderfully otherwise - just wanted to let you know.

enjoy your weeks :-)


Re: [PATCH v4 11/15] refs.c: make get_main_ref_store() public and use it

2017-02-20 Thread Michael Haggerty
On 02/18/2017 02:32 PM, Nguyễn Thái Ngọc Duy wrote:
> get_ref_store() will soon be renamed to get_submodule_ref_store().
> Together with future get_worktree_ref_store(), the three functions
> provide an appropriate ref store for different operation modes. New APIs
> will be added to operate directly on ref stores.

I see where you're going with this, but as of the end of this patch
series, there is still nothing that a caller outside of the refs module
can do with a `struct ref_store *`. This means that it would be enough
to put this declaration (and that of `get_submodule_ref_store()`, added
in a later patch) in refs/refs-internal.h for now.

If you want to move the declarations straight to `refs.h` now to avoid
code churn in some later patch series, then please mention that fact in
the commit message.

Michael



Re: [PATCH v4 14/15] files-backend: remove submodule_allowed from files_downcast()

2017-02-20 Thread Michael Haggerty
On 02/20/2017 01:33 PM, Duy Nguyen wrote:
> On Mon, Feb 20, 2017 at 7:30 PM, Michael Haggerty  
> wrote:
>> On 02/20/2017 01:21 PM, Duy Nguyen wrote:
>>> On Mon, Feb 20, 2017 at 7:11 PM, Michael Haggerty  
>>> wrote:
 [...]
 These limitations, I think, imply that submodule references have to be
 treated as read-only.
>>>
>>> Behind the scene submodule does add_submodule_odb(), which basically
>>> makes the submodule's odb an alternate of in-core odb. So odb access
>>> works. I was puzzled how submodules could by pass odb access at the
>>> beginning only to find that out. technical/api-ref-iteration.txt also
>>> mentions that you need to add_submodule_odb(), so I think it's
>>> deliberate (albeit hacky) design.
>>
>> That's interesting. I didn't know it before.
>>
>> But I still don't think that means that reference writing can work
>> correctly. If I try to set a submodule branch to an SHA-1 and I verify
>> that the SHA-1 points to a valid commit, how do I know that the commit
>> is in the same submodule that holds the reference?
> 
> Good point. So will the new flag be "read_only" (no reference to
> submodule), or "submodule"? This flag will be passed in at ref-store
> init time and kept in files_ref_store.

I haven't checked carefully whether there are other operations that
don't work and/or don't make sense for submodules. If not, then
`read_only` would also be a fine name for the flag.

Michael



Re: [PATCH v4 14/15] files-backend: remove submodule_allowed from files_downcast()

2017-02-20 Thread Duy Nguyen
On Mon, Feb 20, 2017 at 7:30 PM, Michael Haggerty  wrote:
> On 02/20/2017 01:21 PM, Duy Nguyen wrote:
>> On Mon, Feb 20, 2017 at 7:11 PM, Michael Haggerty  
>> wrote:
>>> On 02/18/2017 02:33 PM, Nguyễn Thái Ngọc Duy wrote:
 Since submodule or not is irrelevant to files-backend after the last
 patch, remove the parameter from files_downcast() and kill
 files_assert_main_repository().

 PS. This implies that all ref operations are allowed for submodules. But
 we may need to look more closely to see if that's really true...
>>>
>>> I think you are jumping the gun here.
>>>
>>> Even after your changes, there is still a significant difference between
>>> the main repository and submodules: we have access to the object
>>> database for the main repository, but not for submodules.
>>>
>>> So, for example, the following don't work for submodules:
>>>
>>> * `parse_object()` is used to ensure that references are only pointed at
>>> valid objects, and that branches are only pointed at commit objects.
>>>
>>> * `peel_object()` is used to write the peeled version of references in
>>> `packed-refs`, and to peel references while they are being iterated
>>> over. Since this doesn't work, I think you can't write `packed-refs` in
>>> a submodule.
>>>
>>> These limitations, I think, imply that submodule references have to be
>>> treated as read-only.
>>
>> Behind the scene submodule does add_submodule_odb(), which basically
>> makes the submodule's odb an alternate of in-core odb. So odb access
>> works. I was puzzled how submodules could by pass odb access at the
>> beginning only to find that out. technical/api-ref-iteration.txt also
>> mentions that you need to add_submodule_odb(), so I think it's
>> deliberate (albeit hacky) design.
>
> That's interesting. I didn't know it before.
>
> But I still don't think that means that reference writing can work
> correctly. If I try to set a submodule branch to an SHA-1 and I verify
> that the SHA-1 points to a valid commit, how do I know that the commit
> is in the same submodule that holds the reference?

Good point. So will the new flag be "read_only" (no reference to
submodule), or "submodule"? This flag will be passed in at ref-store
init time and kept in files_ref_store.
-- 
Duy


Re: [PATCH v4 06/15] files-backend: remove the use of git_path()

2017-02-20 Thread Duy Nguyen
On Mon, Feb 20, 2017 at 6:34 PM, Michael Haggerty  wrote:
> On 02/18/2017 02:32 PM, Nguyễn Thái Ngọc Duy wrote:
>> Given $GIT_DIR and $GIT_COMMON_DIR, files-backend is now in charge of
>> deciding what goes where. The end goal is to pass $GIT_DIR only. A
>> refs "view" of a linked worktree is a logical ref store that combines
>> two files backends together.
>>
>> (*) Not entirely true since strbuf_git_path_submodule() still does path
>> translation underneath. But that's for another patch.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  refs/files-backend.c | 37 +
>>  1 file changed, 33 insertions(+), 4 deletions(-)
>>
>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index b599ddf92..dbcaf9bda 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> @@ -924,6 +924,9 @@ struct files_ref_store {
>>*/
>>   const char *submodule;
>>
>> + struct strbuf gitdir;
>> + struct strbuf gitcommondir;
>
> Is there a reason for these to be `strbuf`s rather than `const char *`?
> (One reason would be if you planned to use the `len` field, but I don't
> think you do so.)

Nope. I just didn't think about char *. It may have to lose "const"
though because in submodule case we may need a new allocation.

>
>> @@ -937,15 +940,33 @@ static void files_path(struct files_ref_store *refs, 
>> struct strbuf *sb,
>>  {
>>   struct strbuf tmp = STRBUF_INIT;
>>   va_list vap;
>> + const char *ref;
>>
>>   va_start(vap, fmt);
>>   strbuf_vaddf(, fmt, vap);
>>   va_end(vap);
>> - if (refs->submodule)
>> + if (refs->submodule) {
>>   strbuf_git_path_submodule(sb, refs->submodule,
>> "%s", tmp.buf);
>> - else
>> - strbuf_git_path(sb, "%s", tmp.buf);
>> + } else if (!strcmp(tmp.buf, "packed-refs") ||
>> +!strcmp(tmp.buf, "logs")) { /* non refname path */
>> + strbuf_addf(sb, "%s/%s", refs->gitcommondir.buf, tmp.buf);
>> + } else if (skip_prefix(tmp.buf, "logs/", )) { /* reflog */
>> + if (is_per_worktree_ref(ref))
>> + strbuf_addf(sb, "%s/%s", refs->gitdir.buf, tmp.buf);
>> + else
>> + strbuf_addf(sb, "%s/%s", refs->gitcommondir.buf, 
>> tmp.buf);
>
> This code would also be simpler if there were separate functions for
> packed-refs, loose references, and reflogs.

And maybe keep the path to packed-refs, the base path up to "logs" in
struct files_ref_store too (they will be calculated at ref store
init)? That way the files_packed_refs_path() does no calculation.
files_reflog_path() and files_ref_path() will just do string
concatenation, no fancy addf.
-- 
Duy


Re: [PATCH v4 14/15] files-backend: remove submodule_allowed from files_downcast()

2017-02-20 Thread Michael Haggerty
On 02/20/2017 01:21 PM, Duy Nguyen wrote:
> On Mon, Feb 20, 2017 at 7:11 PM, Michael Haggerty  
> wrote:
>> On 02/18/2017 02:33 PM, Nguyễn Thái Ngọc Duy wrote:
>>> Since submodule or not is irrelevant to files-backend after the last
>>> patch, remove the parameter from files_downcast() and kill
>>> files_assert_main_repository().
>>>
>>> PS. This implies that all ref operations are allowed for submodules. But
>>> we may need to look more closely to see if that's really true...
>>
>> I think you are jumping the gun here.
>>
>> Even after your changes, there is still a significant difference between
>> the main repository and submodules: we have access to the object
>> database for the main repository, but not for submodules.
>>
>> So, for example, the following don't work for submodules:
>>
>> * `parse_object()` is used to ensure that references are only pointed at
>> valid objects, and that branches are only pointed at commit objects.
>>
>> * `peel_object()` is used to write the peeled version of references in
>> `packed-refs`, and to peel references while they are being iterated
>> over. Since this doesn't work, I think you can't write `packed-refs` in
>> a submodule.
>>
>> These limitations, I think, imply that submodule references have to be
>> treated as read-only.
> 
> Behind the scene submodule does add_submodule_odb(), which basically
> makes the submodule's odb an alternate of in-core odb. So odb access
> works. I was puzzled how submodules could by pass odb access at the
> beginning only to find that out. technical/api-ref-iteration.txt also
> mentions that you need to add_submodule_odb(), so I think it's
> deliberate (albeit hacky) design.

That's interesting. I didn't know it before.

But I still don't think that means that reference writing can work
correctly. If I try to set a submodule branch to an SHA-1 and I verify
that the SHA-1 points to a valid commit, how do I know that the commit
is in the same submodule that holds the reference?

> [...]

Michael


Re: Git bisect does not find commit introducing the bug

2017-02-20 Thread Jakub Narębski
W dniu 20.02.2017 o 08:38, Oleg Taranenko pisze:

 Then you must adjust your definition of "good": All commits that do not 
 have
 the feature, yet, are "good": since they do not have the feature in the
 first place, they cannot have the breakage that you found in the feature.

 That is exactly the situation in your original example! But you constructed
 the condition of goodness in such a simplistic way (depending on the
 presence of a string), that it was impossible to distinguish between "does
 not have the feature at all" and "has the feature, but it is broken".
>>>
>>> Johannes, thank you for correctly identifying the error in my logic.
>>> Indeed I was using the term 'bad' also for the commit without the
>>> feature. In order to find the commit introducing the bug in my example
>>> a new state is needed, which would make 'git bisect' a bit more
>>> complicated than the user 'most of the time' probably needs. Or do you
>>> think, it would make sense to ask the user for this state (if e.g 'git
>>> bisect' would be started with a new parameter)?
> 
>> If a commit doesn't have the feature, then it is by definition, not
>> containing a broken feature, and you can simply use the "good" state.
>> There is no need for a different state. If you can't test the commit
>> because it's broken in some other way, you can use "git bisect skip"
>> but that isn't what you want in this case.
> 
> Commits missing feature == 'good' commit is a very confusing one.

Nowadays you can change the names for 'old' and 'new' with
`git bisect terms`.  HTH.
 
> Looks like in real life it happens much often, then git developers can
> imagine. For multi-branch/multi-feature workflow it's pretty easy not
> to recognize whether it is missing or not developed yet, especially on
> retrospective view where cherry-picking/squashing/merging is being
> used. My experience shows most annoying bugs are generating after a
> heavy merge (evil merge) with conflicts resolutions, where developer
> is not involved in the knowing what happens on counterpart changes.
> Then feature can be disappeared after it was worked & tested in its
> own branches.

Good to know about this problem.

> @Alex, I'm pretty interesting in fixing this weird bisect behaviour as
> well, as far as I struggled on it last summer and continue struggling
> so far :) If you want we can join to your efforts on fixing.

Anyway, I don't think it is feasible to weaken the assumption that there
is only one transition from 'old' ('good') to 'new' ('bad'); this is
what allows O(log(N)) operations.  See bisection method of root finding,
that is finding zeros of a continuous function.

Best,
-- 
Jakub Narębski



Re: [PATCH v2 04/16] files-backend: replace *git_path*() with files_path()

2017-02-20 Thread Duy Nguyen
On Mon, Feb 20, 2017 at 6:23 PM, Michael Haggerty  wrote:
> On 02/16/2017 12:48 PM, Nguyễn Thái Ngọc Duy wrote:
>> This centralizes all path rewriting of files-backend.c in one place so
>> we have easier time removing the path rewriting later. There could be
>> some hidden indirect git_path() though, I didn't audit the code to the
>> bottom.
>
> Almost all of the calls to `files_path()` [1] take one of the following
> forms:
>
> * `files_path(refs, , "packed-refs")`
> * `files_path(refs, , "%s", refname)`
> * `files_path(refs, , "logs/%s", refname)`
>
> (though sometimes `refname` is not the name of a reference but rather
> the name of a directory containing references, like "refs/heads").
>
> This suggests to me that it would be more natural to have three
> functions, `files_packed_refs_path()`, `files_loose_ref_path()`, and
> `files_reflog_path()`, with no `fmt` arguments. Aside from making the
> callers a bit simpler and the implementation of each of the three
> functions simpler (they wouldn't have to deal with variable argument
> lists), at the cost of needing three similar functions.
>
> But I think the split would also be advantageous from a design point of
> view. The relative path locations of loose reference files, reflog
> files, and the packed-refs file is kind of a coincidence, and it would
> be advantageous to encode that policy in three functions rather than
> continuing to spread knowledge of those assumptions around.
>
> It would also make it easier to switch to a new system of encoding
> reference names, for example so that reference names that differ only in
> case can be stored on a case-insensitive filesystem, or to store reflogs
> using a naming scheme that is not susceptible to D/F conflicts so that
> we can retain reflogs for deleted references.

I agree. I didn't see it clearly at the beginning (and made several
mistakes in earlier iterations) but as you have seen files_path() the
separation is pretty clear there. I was going to do it when
introducing the "linked worktree" backend. But since you pointed it
out, I'll update it in this series too.

> Michael
>
> [1] The only exception I see is one call `files_path(refs, ,
> "logs")`, which is a prelude to iterating over the reflog files. This is
> actually an example of the code giving us a hint that the design is
> wrong: if you iterate only over the directories under `git_path(logs)`,
> you miss the reflogs for worktree references.

Oh yes, I had to fix the reflog iterator [1] exactly for that :)

[1] https://public-inbox.org/git/20170217141908.18012-14-pclo...@gmail.com/T/#u
-- 
Duy


Re: [PATCH v4 14/15] files-backend: remove submodule_allowed from files_downcast()

2017-02-20 Thread Duy Nguyen
On Mon, Feb 20, 2017 at 7:11 PM, Michael Haggerty  wrote:
> On 02/18/2017 02:33 PM, Nguyễn Thái Ngọc Duy wrote:
>> Since submodule or not is irrelevant to files-backend after the last
>> patch, remove the parameter from files_downcast() and kill
>> files_assert_main_repository().
>>
>> PS. This implies that all ref operations are allowed for submodules. But
>> we may need to look more closely to see if that's really true...
>
> I think you are jumping the gun here.
>
> Even after your changes, there is still a significant difference between
> the main repository and submodules: we have access to the object
> database for the main repository, but not for submodules.
>
> So, for example, the following don't work for submodules:
>
> * `parse_object()` is used to ensure that references are only pointed at
> valid objects, and that branches are only pointed at commit objects.
>
> * `peel_object()` is used to write the peeled version of references in
> `packed-refs`, and to peel references while they are being iterated
> over. Since this doesn't work, I think you can't write `packed-refs` in
> a submodule.
>
> These limitations, I think, imply that submodule references have to be
> treated as read-only.

Behind the scene submodule does add_submodule_odb(), which basically
makes the submodule's odb an alternate of in-core odb. So odb access
works. I was puzzled how submodules could by pass odb access at the
beginning only to find that out. technical/api-ref-iteration.txt also
mentions that you need to add_submodule_odb(), so I think it's
deliberate (albeit hacky) design.

> When I was working on a patch series similar to yours, I introduced a
> boolean "main_repository" flag in `struct ref_store`, and use that
> member to implement `files_assert_main_repository()`. That way
> `files_ref_store::submodule` can still be removed, which is the more
> important goal from a design standpoint.

I could keep the submodule check back (and replace the submodule
string in files_ref_store with just a flag). But I really think all
backend functions work with submodule. Perhaps add some tests to
exercise/verify that files-backend-on-submodule works?
-- 
Duy


Re: [PATCH v4 14/15] files-backend: remove submodule_allowed from files_downcast()

2017-02-20 Thread Michael Haggerty
On 02/18/2017 02:33 PM, Nguyễn Thái Ngọc Duy wrote:
> Since submodule or not is irrelevant to files-backend after the last
> patch, remove the parameter from files_downcast() and kill
> files_assert_main_repository().
> 
> PS. This implies that all ref operations are allowed for submodules. But
> we may need to look more closely to see if that's really true...

I think you are jumping the gun here.

Even after your changes, there is still a significant difference between
the main repository and submodules: we have access to the object
database for the main repository, but not for submodules.

So, for example, the following don't work for submodules:

* `parse_object()` is used to ensure that references are only pointed at
valid objects, and that branches are only pointed at commit objects.

* `peel_object()` is used to write the peeled version of references in
`packed-refs`, and to peel references while they are being iterated
over. Since this doesn't work, I think you can't write `packed-refs` in
a submodule.

These limitations, I think, imply that submodule references have to be
treated as read-only.

When I was working on a patch series similar to yours, I introduced a
boolean "main_repository" flag in `struct ref_store`, and use that
member to implement `files_assert_main_repository()`. That way
`files_ref_store::submodule` can still be removed, which is the more
important goal from a design standpoint.

Michael



Re: [PATCH v2] send-email: only allow one address per body tag

2017-02-20 Thread Matthieu Moy
Johan Hovold  writes:

> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -1563,7 +1563,7 @@ foreach my $t (@files) {
>   # Now parse the message body
>   while(<$fh>) {
>   $message .=  $_;
> - if (/^(Signed-off-by|Cc): (.*)$/i) {
> + if (/^(Signed-off-by|Cc): ([^>]*>?)/i) {

I think this is acceptable, but this doesn't work with trailers like

Cc: "Some > Body" 

A proper management of this kind of weird address should be doable by
reusing the regexp parsing "..." in parse_mailbox:

my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;

So the final regex would look like

if (/^(Signed-off-by|Cc): (([^>]*|"(?:[^\"\\]|\\.)*")>?)/i) {

I don't think that should block the patch inclusion, but it may be worth
considering.

Anyway, thanks for the patch!

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


[PATCH v2] send-email: only allow one address per body tag

2017-02-20 Thread Johan Hovold
Adding comments after a tag in the body is a common practise (e.g. in
the Linux kernel) and git-send-email has been supporting this for years
by removing any trailing cruft after the address.

After some recent changes, any trailing comment is now instead appended
to the recipient name (with some random white space inserted) resulting
in undesirable noise in the headers, for example:

CC: "# 3 . 3 . x : 1b9508f : sched : Rate-limit newidle" 


Revert to the earlier behaviour of discarding anything after the (first)
address in a tag while parsing the body.

Note that multiple addresses after are still allowed after a command
line switch (and in a CC header field).

Also note that --suppress-cc=self was never honoured when using multiple
addresses in a tag.

Fixes: b1c8a11c8024 ("send-email: allow multiple emails using --cc, --to
and --bcc")
Fixes: e3fdbcc8e164 ("parse_mailboxes: accept extra text after <...>
address")
Signed-off-by: Johan Hovold 
---

v2:
 - update the cc-trailer test
 - amend commit message and mention the broken --suppress-cc=self


 git-send-email.perl   | 2 +-
 t/t9001-send-email.sh | 7 +++
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 068d60b3e698..eea0a517f71b 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1563,7 +1563,7 @@ foreach my $t (@files) {
# Now parse the message body
while(<$fh>) {
$message .=  $_;
-   if (/^(Signed-off-by|Cc): (.*)$/i) {
+   if (/^(Signed-off-by|Cc): ([^>]*>?)/i) {
chomp;
my ($what, $c) = ($1, $2);
chomp $c;
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 0f398dd1603d..60a80f60b268 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -148,7 +148,6 @@ cat >expected-cc <<\EOF
 !t...@example.com!
 !th...@example.com!
 !f...@example.com!
-!f...@example.com!
 EOF
 "
 
@@ -159,9 +158,9 @@ test_expect_success $PREREQ 'cc trailer with various 
syntax' '
Test Cc: trailers.
 
Cc: o...@example.com
-   Cc:  # this is part of the name
-   Cc: ,  # not.f...@example.com
-   Cc: "Some # Body"  [part.of.name.too]
+   Cc:  # trailing comments are ignored
+   Cc: ,  one address per line
+   Cc: "Some # Body"  [  ]
EOF
clean_fake_sendmail &&
git send-email -1 --to=recipi...@example.com \
-- 
2.11.1



Re: [PATCH v4 06/15] files-backend: remove the use of git_path()

2017-02-20 Thread Michael Haggerty
On 02/18/2017 02:32 PM, Nguyễn Thái Ngọc Duy wrote:
> Given $GIT_DIR and $GIT_COMMON_DIR, files-backend is now in charge of
> deciding what goes where. The end goal is to pass $GIT_DIR only. A
> refs "view" of a linked worktree is a logical ref store that combines
> two files backends together.
> 
> (*) Not entirely true since strbuf_git_path_submodule() still does path
> translation underneath. But that's for another patch.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  refs/files-backend.c | 37 +
>  1 file changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index b599ddf92..dbcaf9bda 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -924,6 +924,9 @@ struct files_ref_store {
>*/
>   const char *submodule;
>  
> + struct strbuf gitdir;
> + struct strbuf gitcommondir;

Is there a reason for these to be `strbuf`s rather than `const char *`?
(One reason would be if you planned to use the `len` field, but I don't
think you do so.)

> @@ -937,15 +940,33 @@ static void files_path(struct files_ref_store *refs, 
> struct strbuf *sb,
>  {
>   struct strbuf tmp = STRBUF_INIT;
>   va_list vap;
> + const char *ref;
>  
>   va_start(vap, fmt);
>   strbuf_vaddf(, fmt, vap);
>   va_end(vap);
> - if (refs->submodule)
> + if (refs->submodule) {
>   strbuf_git_path_submodule(sb, refs->submodule,
> "%s", tmp.buf);
> - else
> - strbuf_git_path(sb, "%s", tmp.buf);
> + } else if (!strcmp(tmp.buf, "packed-refs") ||
> +!strcmp(tmp.buf, "logs")) { /* non refname path */
> + strbuf_addf(sb, "%s/%s", refs->gitcommondir.buf, tmp.buf);
> + } else if (skip_prefix(tmp.buf, "logs/", )) { /* reflog */
> + if (is_per_worktree_ref(ref))
> + strbuf_addf(sb, "%s/%s", refs->gitdir.buf, tmp.buf);
> + else
> + strbuf_addf(sb, "%s/%s", refs->gitcommondir.buf, 
> tmp.buf);

This code would also be simpler if there were separate functions for
packed-refs, loose references, and reflogs.

> [...]

Michael



Re: [PATCH v2 04/16] files-backend: replace *git_path*() with files_path()

2017-02-20 Thread Michael Haggerty
On 02/16/2017 12:48 PM, Nguyễn Thái Ngọc Duy wrote:
> This centralizes all path rewriting of files-backend.c in one place so
> we have easier time removing the path rewriting later. There could be
> some hidden indirect git_path() though, I didn't audit the code to the
> bottom.

Almost all of the calls to `files_path()` [1] take one of the following
forms:

* `files_path(refs, , "packed-refs")`
* `files_path(refs, , "%s", refname)`
* `files_path(refs, , "logs/%s", refname)`

(though sometimes `refname` is not the name of a reference but rather
the name of a directory containing references, like "refs/heads").

This suggests to me that it would be more natural to have three
functions, `files_packed_refs_path()`, `files_loose_ref_path()`, and
`files_reflog_path()`, with no `fmt` arguments. Aside from making the
callers a bit simpler and the implementation of each of the three
functions simpler (they wouldn't have to deal with variable argument
lists), at the cost of needing three similar functions.

But I think the split would also be advantageous from a design point of
view. The relative path locations of loose reference files, reflog
files, and the packed-refs file is kind of a coincidence, and it would
be advantageous to encode that policy in three functions rather than
continuing to spread knowledge of those assumptions around.

It would also make it easier to switch to a new system of encoding
reference names, for example so that reference names that differ only in
case can be stored on a case-insensitive filesystem, or to store reflogs
using a naming scheme that is not susceptible to D/F conflicts so that
we can retain reflogs for deleted references.

Michael

[1] The only exception I see is one call `files_path(refs, ,
"logs")`, which is a prelude to iterating over the reflog files. This is
actually an example of the code giving us a hint that the design is
wrong: if you iterate only over the directories under `git_path(logs)`,
you miss the reflogs for worktree references.



Re: Cross-referencing the Git mailing list archive with their corresponding commits in `pu`

2017-02-20 Thread Johannes Schindelin
Hi Josh,

On Mon, 6 Feb 2017, Johannes Schindelin wrote:

> as discussed at the GitMerge, I am trying to come up with tooling that
> will allow for substantially less tedious navigation between the local
> repository, the mailing list, and what ends up in the `pu` branch.

I found a little bit more time last Friday to play with the
cross-correlation between commits in `pu` and mails in
public-inbox/git.git and it is worse than I previously assumed.

Just as a reminder: my plan was to start developing tools that will
ultimately help me as well as other contributors with the arcane mailing
list model of patch submission. And my first target was the seemingly
simple task of figuring out the mail corresponding to any given commit in
`pu` (i.e. the mail that contained the patch, and whose mail thread is
hence expected to have the entire patch review, and to which I would be
expected to respond if I find a problem with that commit).

And since it is all-too-common that the oneline is adjusted before
applying the patch, the Subject:/oneline pair is not a good candidate to
find matches.

My next best guess was that the author date would not be touched, so the
pair of Date: and authordate should make a good candidate.

My initial finding was that this is not without problems, as some mails
were sent with identical Date: lines (most likely due to bugs in the
tools, e.g. the well-known and already fixed bug in git-am, and hence
git-rebase, where it would apply all patches using the first patch's
author date), and worse: some of those mails contained actual patch series
that actually made it into Git's commit history.

But those are not the only problems.

For starters, I tried to cross-correlate *just* the commits that entered
`pu` since one week ago (git rev-list --since=1.week.ago upstream/pu) with
mails of the past month in the mailing list archive.

One obvious caveat is that RFC 2822 is ambiguous when it comes to the date
format. While it seems nice that you *can* write single-digit day numbers
as single digit if you want, or with a leading zero, or with a leading
space, it makes it impossible to get away with exact matching. I did not
really want to complicate my research by parsing the dates and normalizing
them to epoch + timezone, also because I wanted results quick, so I simply
normalized the dates to have leading zeroes for single-digit day numbers,
that seems to work for the moment).

The first category of problematic commits come as no surprise: merges. We
do not even have a way to represent them as mails. I simply excluded them
from the remainder of this study.

The second category should not be all that surprising, too: Junio often
adjusts the release notes without sending those patches out for review.
Those commits are:

363588f (### match next, Junio C Hamano 2017-02-17)
2076907 (Git 2.12-rc2, Junio C Hamano 2017-02-17)
076c053 (Hopefully the final batch of mini-topics before the final,
Junio C Hamano 2017-02-16)
ae86372 (Revert "reset: add an example of how to split a commit into two",
Junio C Hamano 2017-02-16)
d09b692 (A bit more for -rc2, Junio C Hamano 2017-02-15)

There is a third category, and this one *does* come as a surprise to me.
It appears that at least *some* patches' Date: lines are either ignored or
overridden or changed on their way from the mailing list into Git's commit
history. There was only one commit in that commit range:

3c0cb0c (read_loose_refs(): read refs using resolve_ref_recursively(),
Michael Haggerty 2017-02-09)

This one was committed with an author date "Thu, 09 Feb 2017 21:53:52
+0100" but it appears that there was no mail sent to the Git mailing list
with that particular Date: header and the *actual* mail containing the
patch was sent with a Date: header "Fri, 10 Feb 2017 12:16:19 +0100"
(Message-ID:
d8e906d969700acbca8dc717673d0a9cdc910f62.1486724698.git.mhag...@alum.mit.edu).

It is labor-intensive, but possible to find the correlation manually in
this case because the Subject: line has been left intact.

However, this points to a serious problem with my approach: I try to
re-create information that is actually not available (which Message-ID
corresponds to a given commit name). Since that information is not
available, it is quite possible that this information cannot be retrieved
accurately (and Michael's commit demonstrates that this is not a merely
theoretic consideration). I do not know that I can fix this on my side.

> P.S.: I used public-inbox.org links instead of commit references to the
> Git repository containing the mailing list archive, because the format
> of said Git repository is so unfavorable that it was determined very
> quickly in a discussion between Patrick Reynolds (GitHub) and myself
> that it would put totally undue burden on GitHub to mirror it there
> (compare also Carlos Nieto's talk at GitMerge titled "Top Ten Worst
> Repositories to host on GitHub").

Since the main problem was the unfavorable commit history 

Re: [PATCH] config: preserve case for one-shot config on the command line

2017-02-20 Thread Junio C Hamano
Junio C Hamano  writes:

> I still haven't queued any of the variants I posted (and I do not
> think other people sent their own versions, either).  I need to pick
> one and queue, with a test or two.  Perhaps after -rc2.  
>
> Others are welcome to work on it while I cut -rc2 tomorrow, so that
> by the time I see their patch all that is left for me to do is to
> apply it ;-)

Since nothing seems to have happened in the meantime, here is what
I'll queue so that we won't forget for now.  Lars's tests based on
how the scripted "git submodule" uses "git config" may still be
valid, but it is somewhat a roundabout way to demonstrate the
breakage and not very effective way to protect the fix, so I added a
new test that directly tests "git -c = ".

I am not sure if this updated one is worth doing, or the previous
"strchr and strrchr" I originally wrote was easier to understand.

One thing I noticed is that "git config --get X" will correctly
diagnose that a dot-less X is not a valid variable name, but we do
not seem to diagnose "git -c X=V " as invalid.

Perhaps we should, but it is not the focus of this topic.

-- >8 --
From: Junio C Hamano 
Date: Wed, 15 Feb 2017 15:48:44 -0800
Subject: [PATCH] config: preserve  case for one-shot config on the
 command line

The "git -c = cmd" mechanism is to pretend that a
configuration variable  is set to  while the cmd is
running.  The code to do so however downcased  in its entirety,
which is wrong for a three-level ...

The  part needs to stay as-is.

Reported-by: Lars Schneider 
Diagnosed-by: Jonathan Tan 
Signed-off-by: Junio C Hamano 
---
 config.c  | 30 -
 t/t1351-config-cmdline.sh | 48 +++
 2 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100755 t/t1351-config-cmdline.sh

diff --git a/config.c b/config.c
index 0dfed682b8..ba9a5911b0 100644
--- a/config.c
+++ b/config.c
@@ -199,6 +199,34 @@ void git_config_push_parameter(const char *text)
strbuf_release();
 }
 
+/*
+ * downcase the  and  in . or
+ * .. and do so in place.  
+ * is left intact.
+ */
+static void canonicalize_config_variable_name(char *varname)
+{
+   char *cp, *last_dot;
+
+   /* downcase the first segment */
+   for (cp = varname; *cp; cp++) {
+   if (*cp == '.')
+   break;
+   *cp = tolower(*cp);
+   }
+   if (!*cp)
+   return;
+
+   /* scan for the last dot */
+   for (last_dot = cp; *cp; cp++)
+   if (*cp == '.')
+   last_dot = cp;
+
+   /* downcase the last segment */
+   for (cp = last_dot; *cp; cp++)
+   *cp = tolower(*cp);
+}
+
 int git_config_parse_parameter(const char *text,
   config_fn_t fn, void *data)
 {
@@ -221,7 +249,7 @@ int git_config_parse_parameter(const char *text,
strbuf_list_free(pair);
return error("bogus config parameter: %s", text);
}
-   strbuf_tolower(pair[0]);
+   canonicalize_config_variable_name(pair[0]->buf);
if (fn(pair[0]->buf, value, data) < 0) {
strbuf_list_free(pair);
return -1;
diff --git a/t/t1351-config-cmdline.sh b/t/t1351-config-cmdline.sh
new file mode 100755
index 00..acb8dc3b15
--- /dev/null
+++ b/t/t1351-config-cmdline.sh
@@ -0,0 +1,48 @@
+#!/bin/sh
+
+test_description='git -c var=val'
+
+. ./test-lib.sh
+
+test_expect_success 'last one wins: two level vars' '
+   echo VAL >expect &&
+
+   # sec.var and sec.VAR are the same variable, as the first
+   # and the last level of a configuration variable name is
+   # case insensitive.  Test both setting and querying.
+
+   git -c sec.var=val -c sec.VAR=VAL config --get sec.var >actual &&
+   test_cmp expect actual &&
+   git -c SEC.var=val -c sec.var=VAL config --get sec.var >actual &&
+   test_cmp expect actual &&
+
+   git -c sec.var=val -c sec.VAR=VAL config --get SEC.var >actual &&
+   test_cmp expect actual &&
+   git -c SEC.var=val -c sec.var=VAL config --get sec.VAR >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'last one wins: three level vars' '
+   echo val >expect &&
+
+   # v.a.r and v.A.r are not the same variable, as the middle
+   # level of a three-level configuration variable name is
+   # case sensitive.  Test both setting and querying.
+
+   git -c v.a.r=val -c v.A.r=VAL config --get v.a.r >actual &&
+   test_cmp expect actual &&
+   git -c v.a.r=val -c v.A.r=VAL config --get V.a.R >actual &&
+   test_cmp expect actual &&
+
+   echo VAL >expect &&
+   git -c v.a.r=val -c v.a.R=VAL config --get v.a.r >actual &&
+   test_cmp expect actual &&
+   git -c v.a.r=val -c V.a.r=VAL config --get v.a.r >actual &&
+   

Re: Git bisect does not find commit introducing the bug

2017-02-20 Thread Junio C Hamano
Christian Couder  writes:

> git bisect makes some assumptions that are true most of the time, so
> in practice it works well most of the time.

I think we need to clarify the documentation and ask you to stop
using such fuzzy phrases like "assumptions" and "most of the time"
;-).

For bisection to work, "git bisect" requires a very simple thing,
which is that your history is bisectable wrt one particular "trait",
i.e. what you are interested in (e.g. the trait may be "the commit
has this feature in a broken form").

What does it mean for a history to be "bisectable", then?

A bisectable history has commits with the "trait" and commits
without the "trait".  Given any commit with the "trait", all commits
that are decendant of such commit must have the "trait".  Also given
any commit without the "trait", all commits that are ancestor of
such a commit must not have the "trait".

And your goal is to find one commit with the "trait" whose direct
parent commits all lack the "trait".

If and only if you can formulate your problem into the above form,
"git bisect" can help you by not requiring you to check each and
every commit in the history.

Depending on the way you define the "trait", your history may not be
bisectable, but by formulating the "trait" carefully, many such
history can be made bisectable.  In the "Recently some commit broke
feature X.  Before then the feature used to work.  In an ancient
past, the feature did not even exist" example, if you set "trying to
use feature X breaks" as the "trait", your history is not bisectable
unless you ignore the ancient part that did not even have the
feature.  If you restate your "trait" to "feature X does not exist
in a broken form", however, the history becomes bisectable.

Historically, we called commits with "trait" BAD and others GOOD,
because bisection was used primarily to hunt for bugs.  It may be
easier to understand if the user thinks of commits without "trait"
as OLD (i.e. commits in the older part of the history are not yet
contaminated), and commits with "trait" as NEW (i.e. at some point,
there is an event to introduce the "trait" and newer part of the
history is contaminated by that event ever since).


Read

2017-02-20 Thread Darren Yaseen
Hello, I work with a private security vault firm. I will be happy to discuss a 
very important proposal with you.
Kindly reply if you are interested for further details.
Regards,
Darren 


Re: [PATCH v2] git-check-ref-format: clarify man for --normalize

2017-02-20 Thread Junio C Hamano
Jeff King  writes:

> On Sun, Feb 19, 2017 at 11:32:32PM +0100, Damien Regad wrote:
>
>> Use of 'iff' may be confusing to people not familiar with this term.
>> 
>> Improving the --normalize option's documentation to remove the use of
>> 'iff', and clearly describe what happens when the condition is not met.
>
> Looks good to me. Thanks for following up.

Looks good except that there does not seem to be sign-off.

Damien, no need to resend the patch but I need to hear you say that
you have read Documentation/SubmittingPatches and you want your
sign-off added.

Thanks.


Re: url..insteadOf vs. submodules

2017-02-20 Thread Jeff King
On Sun, Feb 19, 2017 at 10:12:28PM +0100, Toolforger wrote:

> I am trying to make url..insteadOf work on the URLs inside
> .gitmodules, but it won't work (applying it to the repo itself works fine,
> to the config setting seems to be fine).

The submodule operations happen in their own processes, and do not look
at the config of the parent repo. Are you setting the config in
.git/config of the super-project?

I don't know if there plans to make that work, but one workaround is to
set the config in ~/.gitconfig.

-Peff


Re: [PATCH v6 0/6] stash: support pathspec argument

2017-02-20 Thread Junio C Hamano
Thomas Gummerer  writes:

> @@ -55,10 +53,13 @@ push [-p|--patch] [-k|--[no-]keep-index] 
> [-u|--include-untracked] [-a|--all] [-q
>  
>   Save your local modifications to a new 'stash', and run `git reset
>   --hard` to revert them.  The  part is optional and gives

I didn't notice this during v5 review, but the above seems to be
based on the codebase before your documentation update (which used
to be part of the series in older iteration).  I had to tweak the
series to apply on top of your 20a7e06172 ("Documentation/stash:
remove mention of git reset --hard", 2017-02-12) while queuing, so
please double check the result when it is pushed out to 'pu'.

Thanks.


Re: [PATCH v4 03/19] builtin/diff-tree: convert to struct object_id

2017-02-20 Thread Jeff King
On Mon, Feb 20, 2017 at 12:10:15AM +, brian m. carlson wrote:

>  /* Diff one or more commits. */
> -static int stdin_diff_commit(struct commit *commit, char *line, int len)
> +static int stdin_diff_commit(struct commit *commit, const char *p)
>  {
> - unsigned char sha1[20];
> - if (isspace(line[40]) && !get_sha1_hex(line+41, sha1)) {
> + struct object_id oid;
> + if (isspace(*p++) && !parse_oid_hex(p, , )) {
>   /* Graft the fake parents locally to the commit */
> - int pos = 41;
>   struct commit_list **pptr;
>  
>   /* Free the real parent list */
>   free_commit_list(commit->parents);
>   commit->parents = NULL;
>   pptr = &(commit->parents);
> - while (line[pos] && !get_sha1_hex(line + pos, sha1)) {
> - struct commit *parent = lookup_commit(sha1);
> + while (isspace(*p++) && !parse_oid_hex(p, , )) {
> + struct commit *parent = lookup_commit(oid.hash);
>   if (parent) {
>   pptr = _list_insert(parent, pptr)->next;
>   }
> - pos += 41;
>   }
>   }

Are you sure this is right? The first "if" will advance the "p" pointer,
and we'll miss it in the inner loop.

IOW, the original looked something like:

  1. see if we have any parents after the initial commit sha1

  2. if so, then free the original parent list, so we can parse the new
 ones

  3. starting at pos 41 (the same one we parsed in the conditional!),
 loop and parse each parent sha1

The conditional in step 1 can't advance our pointer, or we miss the
first parent in step 3.

It's silly to parse the same sha1 twice, though. You could solve it by
adding the first "oid" from the conditional to the new parent list. In
my "something like this" patch, I solved it by dropping the conditional,
and just having the inner loop. It lazily drops the old parent list on
the first iteration.

It's a little disturbing that we do not seem to have even a basic test
of:

  git rev-list --parents HEAD | git diff-tree --stdin

which would exercise this code.

-Peff


Re: missing handling of "No newline at end of file" in git am

2017-02-20 Thread Eric Wong
Olaf Hering  wrote:
> On Tue, Feb 14, Olaf Hering wrote:
> 
> > How would I debug it?
> 
> One line is supposed to be longer than 998 chars, but something along
> the way truncated it and corrupted the patch.

998 sounds like the SMTP limit.

Perhaps git format-patch should emit binary diffs in that case?
I doubt any human would bother reading excessively long lines as
text...