Re: [PATCH v6 05/15] sequencer: introduce the `merge` command

2018-04-13 Thread Johannes Schindelin
Hi Phillip,

On Fri, 13 Apr 2018, Phillip Wood wrote:

> On 13/04/18 11:12, Phillip Wood wrote:
> > @@ -3030,7 +3029,8 @@ static int pick_commits(struct todo_list *todo_list, 
> > struct replay_opts *opts)
> > return error(_("unknown command %d"), item->command);
> >  
> > if (res < 0 && (item->command == TODO_LABEL ||
> > -   item->command == TODO_RESET)) {
> > +   item->command == TODO_RESET ||
> > +   item->command == TODO_MERGE)) {
> 
> Unfortunately it's not as simple as that - we only want to reschedule if
> merge_recursive() fails, not if run_git_commit() does.

Correct. How about introducing a flag `reschedule` that is passed to
do_label(), do_reset() and do_merge()?

Seeing as do_reset() and do_merge() already have a replay_opts parameter,
we could add a field `needs_rescheduling` and pass the replay_opts also to
do_label().

Ciao,
Dscho


[RFC PATCH] tests: fix PATH for GIT_TEST_INSTALLED tests

2018-04-13 Thread Guillaume Maudoux
From: Guillaume Maudoux 

When running tests on an existing git installation with
GIT_TEST_INSTALLED (as described in t/README), the test helpers are
missing in the PATH.

This fixes the test suite in a way that allows all the tests to pass.

Signed-off-by: Guillaume Maudoux 
---

This is more a bug report than a real patch. The issue is described
above and this patch does solve it. I however think that someone with
more knowledge should refactor all that chunck of code that was last
changed in 2010.

In particular, it seems that the GIT_TEST_INSTALLED path does not use
bin-wrappers at all. This may imply that --with-dashes also breaks
tests.

 t/test-lib.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git t/test-lib.sh t/test-lib.sh
index 7740d511d..0d51261f7 100644
--- t/test-lib.sh
+++ t/test-lib.sh
@@ -923,7 +923,7 @@ elif test -n "$GIT_TEST_INSTALLED"
 then
GIT_EXEC_PATH=$($GIT_TEST_INSTALLED/git --exec-path)  ||
error "Cannot run git from $GIT_TEST_INSTALLED."
-   PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR:$PATH
+   PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR/t/helper:$GIT_BUILD_DIR:$PATH
GIT_EXEC_PATH=${GIT_TEST_EXEC_PATH:-$GIT_EXEC_PATH}
 else # normal case, use ../bin-wrappers only unless $with_dashes:
git_bin_dir="$GIT_BUILD_DIR/bin-wrappers"
-- 
2.17.0



Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology

2018-04-13 Thread Johannes Schindelin
Hi Phillip,

On Fri, 13 Apr 2018, Phillip Wood wrote:

> On 12/04/18 23:02, Johannes Schindelin wrote:
> > 
> > [...]
> > 
> > So: the order of the 3-way merges does matter.
> >
> > [...]
> 
> Those conflicts certainly look intimidating (and the ones in your later
> reply with the N way merge example still look quite complicated). One
> option would be just to stop and have the user resolve the conflicts
> after each conflicting 3-way merge rather than at the end of all the
> merges. There are some downsides: there would need to be a way to
> explain to the user that this is an intermediate step (and what that
> step was); the code would have to do some book keeping to know where it
> had got to; and it would stop and prompt the user to resolve conflicts
> more often which could be annoying but hopefully they'd be clearer to
> resolve because they weren't nested.

I thought about that. But as I pointed out: the order of the merges *does*
matter. Otherwise we force the user to resolve conflicts that they
*already* resolved during this rebase...

Ciao,
Dscho


Re: [PATCH] Deprecate support for .git/info/grafts

2018-04-13 Thread Stefan Beller
Hi Johannes,

On Fri, Apr 13, 2018 at 3:35 PM, Johannes Schindelin
 wrote:
> Hi Stefan,

>> I wonder if we want to offer a migration tool or just leave it
>> at this hint.
>
> There is contrib/convert-grafts-to-replace-refs.sh.

Oh cool! I wonder if we want to expose it more such that people
discover it.

> I wonder whether we have to care enough to implement a `git replace
> --convert-graft-file`...

I don't think so.


Re: [PATCH] Deprecate support for .git/info/grafts

2018-04-13 Thread Johannes Schindelin
Hi Stefan,

On Fri, 13 Apr 2018, Stefan Beller wrote:

> On Fri, Apr 13, 2018 at 4:11 AM, Johannes Schindelin
>  wrote:
> > The grafts feature was a convenient way to "stich together" ancient
> > history to the fresh start of linux.git.
> 
> Did you mean: stitch?

Yes ;-)

> > Its implementation is, however, not up to Git's standards, as there are
> > too many ways where it can lead to surprising and unwelcome behavior.
> >
> > For example, when pushing from a repository with active grafts, it is
> > possible to miss commits that have been "grafted out", resulting in a
> > broken state on the other side.
> >
> > Also, the grafts feature is limited to "rewriting" commits' list of
> > parents, it cannot replace anything else.
> >
> > The much younger feature implemented as `git replace` set out to remedy
> > those limitations and dangerous bugs.
> >
> > Seeing as `git replace` is pretty mature by now, it is time to deprecate
> > support for the graft file, and to retire it eventually.
> 
> It seems that the maturity needed for this commit was reached in
> 4228e8bc98 (replace: add --graft option, 2014-07-19)

Right. I'll add that to the commit message.

> Reviewed-by: Stefan Beller 
> > Signed-off-by: Johannes Schindelin 
> > ---
> 
> > return -1;
> > +   if (advice_graft_file_deprecated)
> > +   advise(_("Support for /info/grafts is deprecated\n"
> > +"and will be removed in a future Git version.\n"
> > +"\n"
> > +"Please use \"git replace --graft [...]\" 
> > instead.\n"
> > +"\n"
> > +"Turn this message off by running\n"
> > +"\"git config advice.graftFileDeprecated 
> > false\""));
> 
> So the user would have to run:
> 
>   for line in /info/grafts:
>   git replace --graft $line
>   # The order in the grafts file is the same as the arguments,
>   # but we'd have to pass each as its own argument
>   rm /info/grafts
> 
> I wonder if we want to offer a migration tool or just leave it
> at this hint.

There is contrib/convert-grafts-to-replace-refs.sh.

I wonder whether we have to care enough to implement a `git replace
--convert-graft-file`...

Ciao,
Dscho


Re: Optimizing writes to unchanged files during merges?

2018-04-13 Thread Junio C Hamano
Elijah Newren  writes:

> Yes, precisely.  Checking the *current* index is not reliable in the
> presence of renames.
>
> Trying to use the current index as a proxy for what was in the index
> before the merge started is a problem.  But we had a copy of the index
> before the merge started; we just discarded it at the end of
> unpack_trees().  We could keep it around instead.  That would also
> have the benefits of making the was_dirty() checks more accurate too,
> as using the mtime's in the current index as a proxy for what was in
> the original index has the potential for the same kinds of problems.

Yeah, I agree that you are going in the right direction with the
above.

> It's certainly tempting as an interim solution.  I have an alternative
> interim solution that I think explains well why the code here had been
> fragile, and how to just implement what we want to know rather than
> making approximations to it, which I just posted at [2].  But I can
> see the draw of just gutting the code and replacing with simple brute
> force.

Thanks; I'd love to reserve a block of time to think this through
myself, but I am a bit occupied otherwise this weekend, so I'll try
to catch up after that.



Re: [PATCH] completion: reduce overhead of clearing cached --options

2018-04-13 Thread SZEDER Gábor
On Fri, Apr 13, 2018 at 11:44 PM, Jakub Narebski  wrote:
> SZEDER Gábor  writes:
>> In Bash we can do better: run the 'compgen -v __gitcomp_builtin_'
>> builtin command, which lists the same variables, but without a
>> pipeline and 'sed' it can do so with lower overhead.
>
> What about ZSH?

Nothing, ZSH is unaffected by this patch.


Re: [PATCH v8 03/14] commit-graph: add format document

2018-04-13 Thread Jakub Narebski
Derrick Stolee  writes:
> On 4/11/2018 4:58 PM, Jakub Narebski wrote:
>> Derrick Stolee  writes:
>>
>>> +CHUNK DATA:
>>> +
>>> +  OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
>>> +  The ith entry, F[i], stores the number of OIDs with first
>>> +  byte at most i. Thus F[255] stores the total
>>> +  number of commits (N).
>>> +
>>> +  OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
>>> +  The OIDs for all commits in the graph, sorted in ascending order.
>>> +
>>> +  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
>> I think it is a typo, and it should be CDAT, not CGET
>> (CDAT seem to me to stand for Commit DATa):
>>
>>+  Commit Data (ID: {'C', 'D', 'A', 'T' }) (N * (H + 16) bytes)
>>
>> This is what you use in actual implementation, in PATCH v8 06/14
>>
>> DS> +#define GRAPH_SIGNATURE 0x43475048 /* "CGPH" */
>> DS> +#define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */
>> DS> +#define GRAPH_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */
>> DS> +#define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */
>> DS> +#define GRAPH_CHUNKID_LARGEEDGES 0x45444745 /* "EDGE" */
>>
>
> Documentation bugs are hard to diagnose. Thanks for finding this. I
> double checked that the hex int "0x43444154" matches "CDAT".

Another possible way of checking the correctness would be to run
`hexdump -C` or equivalent on generated commit-graph file.  File and
chunk headers should be visible in its output.

> Here is a diff to make it match.
>
> diff --git a/Documentation/technical/commit-graph-format.txt
> b/Documentation/technical/commit-graph-format.txt
> index ad6af8105c..af03501834 100644
> --- a/Documentation/technical/commit-graph-format.txt
> +++ b/Documentation/technical/commit-graph-format.txt
> @@ -70,7 +70,7 @@ CHUNK DATA:
>    OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
>    The OIDs for all commits in the graph, sorted in ascending order.
>
> -  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
> +  Commit Data (ID: {'C', 'D', 'A', 'T' }) (N * (H + 16) bytes)
>  * The first H bytes are for the OID of the root tree.
>  * The next 8 bytes are for the positions of the first two parents
>    of the ith commit. Stores value 0x if no parent in that


Re: File versioning based on shallow Git repositories?

2018-04-13 Thread Jakub Narebski
Hello Johannes,

Johannes Schindelin  writes:
> On Fri, 13 Apr 2018, Jakub Narebski wrote:
>> Hallvard Breien Furuseth  writes:
>> 
>>> Also maybe it'll be worthwhile to generate .git/info/grafts in a local
>>> clone of the repo to get back easily visible history.  No grafts in
>>> the original repo, grafts mess things up.
>> 
>> Just a reminder: modern Git has "git replace", a modern and safe
>> alternative to the grafts file.
>
> Right!
>
> Maybe it is time to start deprecating grafts? They *do* cause problems,
> such as weird "missing objects" problems when trying to fetch into, or
> push from, a repository with grafts. These problems are not shared by the
> `git replace` method.

Also you can propagate "git replace" info with clone / fetch / push.

> I just sent out a patch to add a deprecation warning.

Thank you for this.

-- 
Jakub Narębski


Re: [PATCH] completion: reduce overhead of clearing cached --options

2018-04-13 Thread Jakub Narebski
SZEDER Gábor  writes:

> To get the names of all '$__git_builtin_*' variables caching --options
> of builtin commands in order to unset them, 8b0eaa41f2 (completion:
> clear cached --options when sourcing the completion script,
> 2018-03-22) runs a 'set |sed s///' pipeline.  This works both in Bash
> and in ZSH, but has a higher than necessasry overhead with the extra
> processes.

Typo: necessasry -> necessary

>
> In Bash we can do better: run the 'compgen -v __gitcomp_builtin_'
> builtin command, which lists the same variables, but without a
> pipeline and 'sed' it can do so with lower overhead.

What about ZSH?

-- 
Jakub Narębski


[PATCH v2 6/9] gpg-interface: extract gpg line matching helper

2018-04-13 Thread Ben Toews
From: Jeff King 

Let's separate the actual line-by-line parsing of signatures
from the notion of "is this a gpg signature line". That will
make it easier to do more refactoring of this loop in future
patches.

Signed-off-by: Jeff King 
Signed-off-by: Ben Toews 
---
 gpg-interface.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index 3414af38b9..79333c1ee8 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -101,11 +101,16 @@ void print_signature_buffer(const struct signature_check 
*sigc, unsigned flags)
fputs(output, stderr);
 }

+static int is_gpg_start(const char *line)
+{
+   return starts_with(line, PGP_SIGNATURE) ||
+   starts_with(line, PGP_MESSAGE);
+}
+
 size_t parse_signature(const char *buf, size_t size)
 {
size_t len = 0;
-   while (len < size && !starts_with(buf + len, PGP_SIGNATURE) &&
-   !starts_with(buf + len, PGP_MESSAGE)) {
+   while (len < size && !is_gpg_start(buf + len)) {
const char *eol = memchr(buf + len, '\n', size - len);
len += eol ? eol - (buf + len) + 1 : size - len;
}
--
2.15.1 (Apple Git-101)


[PATCH v2 7/9] gpg-interface: find the last gpg signature line

2018-04-13 Thread Ben Toews
From: Jeff King 

A signed tag has a detached signature like this:

  object ...
  [...more header...]

  This is the tag body.

  -BEGIN PGP SIGNATURE-
  [opaque gpg data]
  -END PGP SIGNATURE-

Our parser finds the _first_ line that appears to start a
PGP signature block, meaning we may be confused by a
signature (or a signature-like line) in the actual body.
Let's keep parsing and always find the final block, which
should be the detached signature over all of the preceding
content.

Signed-off-by: Jeff King 
Signed-off-by: Ben Toews 
---
 gpg-interface.c | 12 +---
 t/t7004-tag.sh  | 11 +++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index 79333c1ee8..0647bd6348 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -110,11 +110,17 @@ static int is_gpg_start(const char *line)
 size_t parse_signature(const char *buf, size_t size)
 {
size_t len = 0;
-   while (len < size && !is_gpg_start(buf + len)) {
-   const char *eol = memchr(buf + len, '\n', size - len);
+   size_t match = size;
+   while (len < size) {
+   const char *eol;
+
+   if (is_gpg_start(buf + len))
+   match = len;
+
+   eol = memchr(buf + len, '\n', size - len);
len += eol ? eol - (buf + len) + 1 : size - len;
}
-   return len;
+   return match;
 }

 void set_signing_key(const char *key)
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index ee093b393d..e3f1e014aa 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1059,6 +1059,17 @@ test_expect_success GPG \
git tag -v blanknonlfile-signed-tag
 '

+test_expect_success GPG 'signed tag with embedded PGP message' '
+   cat >msg <<-\EOF &&
+   -BEGIN PGP MESSAGE-
+
+   this is not a real PGP message
+   -END PGP MESSAGE-
+   EOF
+   git tag -s -F msg confusing-pgp-message &&
+   git tag -v confusing-pgp-message
+'
+
 # messages with commented lines for signed tags:

 cat >sigcommentsfile <

[PATCH v2 9/9] gpg-interface: handle alternative signature types

2018-04-13 Thread Ben Toews
Currently you can only sign commits and tags using "gpg".
You can _almost_ plug in a related tool like "gpgsm" (which
uses S/MIME-style signatures instead of PGP) using
gpg.program, as it has command-line compatibility. But there
are a few rough edges:

  1. gpgsm generates a slightly different PEM format than
 gpg. It says:

   -BEGIN SIGNED MESSAGE-

 instead of:

   -BEGIN PGP SIGNATURE-

 This actually works OK for signed commits, where we
 just dump the gpgsig header to gpg.program regardless.

 But for tags, we actually have to parse out the
 detached signature, and we fail to recognize the gpgsm
 version.

  2. You can't mix the two types of signatures easily, as
 we'd send the output to whatever tool you've
 configured. For verification, we'd ideally dispatch gpg
 signatures to gpg, gpgsm ones to gpgsm, and so on. For
 signing, you'd obviously have to pick a tool to use.

This patch introduces a set of configuration options for
defining a "signing tool", of which gpg may be just one.
With this patch you can:

  - define a new tool "foo" with signingtool.foo.program

  - map PEM strings to it for verification using
signingtool.foo.pemtype

  - set it as your default tool for creating signatures
using signingtool.default

This subsumes the existing gpg config, as we have baked-in
config for signingtool.gpg that works just like the current
hard-coded behavior. And setting gpg.program becomes an
alias for signingtool.gpg.program.

This is enough to plug in gpgsm like:

  [signingtool "gpgsm"]
  program = gpgsm
  pemtype = "SIGNED MESSAGE"

In the future, this config scheme gives us options for
extending to other tools:

  - the tools all have to accept gpg-like options for now,
but we could add signingtool.interface to meet other
standards

  - we only match PEM headers now, but we could add other
config for matching non-PEM types

The implementation is still in gpg-interface.c, and most of
the code internally refers to this as "gpg". I've left it
this way to keep the diff sane, and to make it clear that
we're not breaking anything gpg-related. This is probably OK
for now, as our tools must be gpg-like anyway. But renaming
everything would be an obvious next-step refactoring if we
want to offer support for more exotic tools (e.g., people
have asked before on the list about using OpenBSD signify).

Signed-off-by: Ben Toews 
---
 Documentation/config.txt |  42 +---
 builtin/fmt-merge-msg.c  |   6 +-
 builtin/receive-pack.c   |   7 +-
 builtin/tag.c|   2 +-
 commit.c |   2 +-
 gpg-interface.c  | 168 ++-
 gpg-interface.h  |  24 ++-
 log-tree.c   |   7 +-
 ref-filter.c |   2 +-
 t/lib-gpg.sh |  30 +
 t/t7510-signed-commit.sh |  34 +-
 tag.c|   2 +-
 12 files changed, 283 insertions(+), 43 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4e0cff87f6..691b309306 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1727,16 +1727,40 @@ grep.fallbackToNoIndex::
If set to true, fall back to git grep --no-index if git grep
is executed outside of a git repository.  Defaults to false.

-gpg.program::
-   Use this custom program instead of "`gpg`" found on `$PATH` when
-   making or verifying a PGP signature. The program must support the
-   same command-line interface as GPG, namely, to verify a detached
-   signature, "`gpg --verify $file - <$signature`" is run, and the
-   program is expected to signal a good signature by exiting with
-   code 0, and to generate an ASCII-armored detached signature, the
-   standard input of "`gpg -bsau $key`" is fed with the contents to be
+signingtool..program::
+   The name or path of the program to execute when making or
+   verifying a signature. This program will be used for making
+   signatures if `` is configured as `signingtool.default`.
+   This program will be used for verifying signatures whose PEM
+   block type matches `signingtool..pemtype` (see below). The
+   program must support the same command-line interface as GPG.
+   To verify a detached signature,
+   "`gpg --verify $file - <$signature`" is run, and the program is
+   expected to signal a good signature by exiting with code 0.
+   To generate an ASCII-armored detached signature, the standard
+   input of "`gpg -bsau $key`" is fed with the contents to be
signed, and the program is expected to send the result to its
-   standard output.
+   standard output. By default, `signingtool.gpg.program` is set to
+   `gpg`.
+
+signingtool..pemtype::
+   The PEM block type associated with the signing tool named
+   ``. For example, the block type of a GPG signature
+   

[PATCH v2 2/9] gpg-interface: handle bool user.signingkey

2018-04-13 Thread Ben Toews
From: Jeff King 

The config handler for user.signingkey does not check for a
boolean value, and thus:

  git -c user.signingkey tag

will segfault. We could fix this and even shorten the code
by using git_config_string(). But our set_signing_key()
helper is used by other code outside of gpg-interface.c, so
we must keep it (and we may as well use it, because unlike
git_config_string() it does not leak when we overwrite an
old value).

Ironically, the handler for gpg.program just below _could_
use git_config_string() but doesn't. But since we're going
to touch that in a future patch, we'll leave it alone for
now. We will add some whitespace and returns in preparation
for adding more config keys, though.

Signed-off-by: Jeff King 
Signed-off-by: Ben Toews 
---
 gpg-interface.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gpg-interface.c b/gpg-interface.c
index 4feacf16e5..61c0690e12 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -128,13 +128,19 @@ void set_signing_key(const char *key)
 int git_gpg_config(const char *var, const char *value, void *cb)
 {
if (!strcmp(var, "user.signingkey")) {
+   if (!value)
+   return config_error_nonbool(var);
set_signing_key(value);
+   return 0;
}
+
if (!strcmp(var, "gpg.program")) {
if (!value)
return config_error_nonbool(var);
gpg_program = xstrdup(value);
+   return 0;
}
+
return 0;
 }

--
2.15.1 (Apple Git-101)


[PATCH v2 3/9] gpg-interface: modernize function declarations

2018-04-13 Thread Ben Toews
From: Jeff King 

Let's drop "extern" from our declarations, which brings us
in line with our modern style guidelines. While we're
here, let's wrap some of the overly long lines, and move
docstrings for public functions to their declarations, since
they document the interface.

Signed-off-by: Jeff King 
Signed-off-by: Ben Toews 
---
 gpg-interface.c | 17 -
 gpg-interface.h | 49 ++---
 2 files changed, 38 insertions(+), 28 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index 61c0690e12..08de0daa41 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -101,12 +101,6 @@ void print_signature_buffer(const struct signature_check 
*sigc, unsigned flags)
fputs(output, stderr);
 }

-/*
- * Look at GPG signed content (e.g. a signed tag object), whose
- * payload is followed by a detached signature on it.  Return the
- * offset where the embedded detached signature begins, or the end of
- * the data when there is no such signature.
- */
 size_t parse_signature(const char *buf, unsigned long size)
 {
char *eol;
@@ -151,12 +145,6 @@ const char *get_signing_key(void)
return git_committer_info(IDENT_STRICT|IDENT_NO_DATE);
 }

-/*
- * Create a detached signature for the contents of "buffer" and append
- * it after "signature"; "buffer" and "signature" can be the same
- * strbuf instance, which would cause the detached signature appended
- * at the end.
- */
 int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const char 
*signing_key)
 {
struct child_process gpg = CHILD_PROCESS_INIT;
@@ -198,11 +186,6 @@ int sign_buffer(struct strbuf *buffer, struct strbuf 
*signature, const char *sig
return 0;
 }

-/*
- * Run "gpg" to see if the payload matches the detached signature.
- * gpg_output, when set, receives the diagnostic output from GPG.
- * gpg_status, when set, receives the status output from GPG.
- */
 int verify_signed_buffer(const char *payload, size_t payload_size,
 const char *signature, size_t signature_size,
 struct strbuf *gpg_output, struct strbuf *gpg_status)
diff --git a/gpg-interface.h b/gpg-interface.h
index d2d4fd3a65..2c40a9175f 100644
--- a/gpg-interface.h
+++ b/gpg-interface.h
@@ -23,16 +23,43 @@ struct signature_check {
char *key;
 };

-extern void signature_check_clear(struct signature_check *sigc);
-extern size_t parse_signature(const char *buf, unsigned long size);
-extern void parse_gpg_output(struct signature_check *);
-extern int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const 
char *signing_key);
-extern int verify_signed_buffer(const char *payload, size_t payload_size, 
const char *signature, size_t signature_size, struct strbuf *gpg_output, struct 
strbuf *gpg_status);
-extern int git_gpg_config(const char *, const char *, void *);
-extern void set_signing_key(const char *);
-extern const char *get_signing_key(void);
-extern int check_signature(const char *payload, size_t plen,
-   const char *signature, size_t slen, struct signature_check *sigc);
-void print_signature_buffer(const struct signature_check *sigc, unsigned 
flags);
+void signature_check_clear(struct signature_check *sigc);
+
+/*
+ * Look at GPG signed content (e.g. a signed tag object), whose
+ * payload is followed by a detached signature on it.  Return the
+ * offset where the embedded detached signature begins, or the end of
+ * the data when there is no such signature.
+ */
+size_t parse_signature(const char *buf, unsigned long size);
+
+void parse_gpg_output(struct signature_check *);
+
+/*
+ * Create a detached signature for the contents of "buffer" and append
+ * it after "signature"; "buffer" and "signature" can be the same
+ * strbuf instance, which would cause the detached signature appended
+ * at the end.
+ */
+int sign_buffer(struct strbuf *buffer, struct strbuf *signature,
+   const char *signing_key);
+
+/*
+ * Run "gpg" to see if the payload matches the detached signature.
+ * gpg_output, when set, receives the diagnostic output from GPG.
+ * gpg_status, when set, receives the status output from GPG.
+ */
+int verify_signed_buffer(const char *payload, size_t payload_size,
+const char *signature, size_t signature_size,
+struct strbuf *gpg_output, struct strbuf *gpg_status);
+
+int git_gpg_config(const char *, const char *, void *);
+void set_signing_key(const char *);
+const char *get_signing_key(void);
+int check_signature(const char *payload, size_t plen,
+   const char *signature, size_t slen,
+   struct signature_check *sigc);
+void print_signature_buffer(const struct signature_check *sigc,
+   unsigned flags);

 #endif
--
2.15.1 (Apple Git-101)


[PATCH v2 5/9] gpg-interface: fix const-correctness of "eol" pointer

2018-04-13 Thread Ben Toews
From: Jeff King 

We accidentally shed the "const" of our buffer by passing it
through memchr. Let's fix that, and while we're at it, move
our variable declaration inside the loop, which is the only
place that uses it.

Signed-off-by: Jeff King 
Signed-off-by: Ben Toews 
---
 gpg-interface.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index ac852ad4b9..3414af38b9 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -103,11 +103,10 @@ void print_signature_buffer(const struct signature_check 
*sigc, unsigned flags)

 size_t parse_signature(const char *buf, size_t size)
 {
-   char *eol;
size_t len = 0;
while (len < size && !starts_with(buf + len, PGP_SIGNATURE) &&
!starts_with(buf + len, PGP_MESSAGE)) {
-   eol = memchr(buf + len, '\n', size - len);
+   const char *eol = memchr(buf + len, '\n', size - len);
len += eol ? eol - (buf + len) + 1 : size - len;
}
return len;
--
2.15.1 (Apple Git-101)


[PATCH v2 4/9] gpg-interface: use size_t for signature buffer size

2018-04-13 Thread Ben Toews
From: Jeff King 

Even though our object sizes (from which these buffers would
come) are typically "unsigned long", this is something we'd
like to eventually fix (since it's only 32-bits even on
64-bit Windows). It makes more sense to use size_t when
taking an in-memory buffer.

Signed-off-by: Jeff King 
Signed-off-by: Ben Toews 
---
 gpg-interface.c | 2 +-
 gpg-interface.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index 08de0daa41..ac852ad4b9 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -101,7 +101,7 @@ void print_signature_buffer(const struct signature_check 
*sigc, unsigned flags)
fputs(output, stderr);
 }

-size_t parse_signature(const char *buf, unsigned long size)
+size_t parse_signature(const char *buf, size_t size)
 {
char *eol;
size_t len = 0;
diff --git a/gpg-interface.h b/gpg-interface.h
index 2c40a9175f..a5e6517ae6 100644
--- a/gpg-interface.h
+++ b/gpg-interface.h
@@ -31,7 +31,7 @@ void signature_check_clear(struct signature_check *sigc);
  * offset where the embedded detached signature begins, or the end of
  * the data when there is no such signature.
  */
-size_t parse_signature(const char *buf, unsigned long size);
+size_t parse_signature(const char *buf, size_t size);

 void parse_gpg_output(struct signature_check *);

--
2.15.1 (Apple Git-101)


[PATCH v2 8/9] gpg-interface: prepare for parsing arbitrary PEM blocks

2018-04-13 Thread Ben Toews
From: Jeff King 

In preparation for handling more PEM blocks besides "PGP
SIGNATURE" and "PGP MESSAGE', let's break up the parsing to
parameterize the actual block type.

Signed-off-by: Jeff King 
Signed-off-by: Ben Toews 
---
 gpg-interface.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index 0647bd6348..0ba4a8ac3b 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -9,9 +9,6 @@
 static char *configured_signing_key;
 static const char *gpg_program = "gpg";

-#define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-"
-#define PGP_MESSAGE "-BEGIN PGP MESSAGE-"
-
 void signature_check_clear(struct signature_check *sigc)
 {
FREE_AND_NULL(sigc->payload);
@@ -101,10 +98,17 @@ void print_signature_buffer(const struct signature_check 
*sigc, unsigned flags)
fputs(output, stderr);
 }

-static int is_gpg_start(const char *line)
+static int is_pem_start(const char *line)
 {
-   return starts_with(line, PGP_SIGNATURE) ||
-   starts_with(line, PGP_MESSAGE);
+   if (!skip_prefix(line, "-BEGIN ", ))
+   return 0;
+   if (!skip_prefix(line, "PGP SIGNATURE", ) &&
+   !skip_prefix(line, "PGP MESSAGE", ))
+   return 0;
+   if (!starts_with(line, "-"))
+   return 0;
+
+   return 1;
 }

 size_t parse_signature(const char *buf, size_t size)
@@ -114,7 +118,7 @@ size_t parse_signature(const char *buf, size_t size)
while (len < size) {
const char *eol;

-   if (is_gpg_start(buf + len))
+   if (is_pem_start(buf + len))
match = len;

eol = memchr(buf + len, '\n', size - len);
--
2.15.1 (Apple Git-101)


[PATCH v2 0/9] gpg-interface: Multiple signing tools

2018-04-13 Thread Ben Toews
Updated to incorporate feedback from v1. In addition to changes to the patches
from v1, I added the missing `t7004: fix mistaken tag name` patch, which had
caused some confusion (sorry about that). Thanks for everyone's feedback on v1.

### Interdiff (v1..v2):

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 7906123a59..691b309306 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1728,7 +1728,7 @@ grep.fallbackToNoIndex::
is executed outside of a git repository.  Defaults to false.

 signingtool..program::
-   The name of the program on `$PATH` to execute when making or
+   The name or path of the program to execute when making or
verifying a signature. This program will be used for making
signatures if `` is configured as `signingtool.default`.
This program will be used for verifying signatures whose PEM
@@ -1750,7 +1750,9 @@ signingtool..pemtype::
SIGNATURE`. When verifying a signature with this PEM block type
the program specified in `signingtool..program` will be
used. By default `signingtool.gpg.pemtype` contains `PGP
-   SIGNATURE` and `PGP MESSAGE`.
+   SIGNATURE` and `PGP MESSAGE`. Multiple PEM types may be specified
+   for a single signing tool by including the `pemtype` directive
+   multiple times within the `signingtool` configuration.

 signingtool.default::
The `` of the signing tool to use when creating
diff --git a/gpg-interface.c b/gpg-interface.c
index 0e2a82e8e5..5d4ae2a7ed 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -22,7 +22,8 @@ static struct signing_tool *alloc_signing_tool(void)
  * Our default tool config is too complicated to specify as a constant
  * initializer, so we lazily create it as needed.
  */
-static void init_signing_tool_defaults(void) {
+static void init_signing_tool_defaults(void)
+{
struct signing_tool *tool;

if (signing_tool_config)
@@ -38,7 +39,8 @@ static void init_signing_tool_defaults(void) {
signing_tool_config = tool;
 }

-static struct signing_tool *get_signing_tool(const char *name) {
+static struct signing_tool *get_signing_tool(const char *name)
+{
struct signing_tool *tool;

init_signing_tool_defaults();
@@ -216,11 +218,12 @@ int git_gpg_config(const char *var, const char *value, 
void *cb)
}

if (!strcmp(var, "gpg.program")) {
-   struct signing_tool *tool = get_or_create_signing_tool("gpg");
+   struct signing_tool *tool;

if (!value)
return config_error_nonbool(var);

+   tool = get_or_create_signing_tool("gpg");
free(tool->program);
tool->program = xstrdup(value);
return 0;
@@ -331,7 +334,7 @@ int verify_signed_buffer(const char *payload, size_t 
payload_size,
/*
 * The caller didn't tell us which tool to use, and we
 * didn't recognize the format. Historically we've fed
-* these cases to blindly to gpg, so let's continue to
+* these cases blindly to gpg, so let's continue to
 * do so.
 */
tool = get_signing_tool("gpg");
diff --git a/gpg-interface.h b/gpg-interface.h
index cee0dfe401..8e22e67b6f 100644
--- a/gpg-interface.h
+++ b/gpg-interface.h
@@ -42,7 +42,7 @@ void signature_check_clear(struct signature_check *sigc);
  * pointed at the signing_tool that corresponds to the found
  * signature type.
  */
-size_t parse_signature(const char *buf, unsigned long size,
+size_t parse_signature(const char *buf, size_t size,
   const struct signing_tool **out_tool);

 void parse_gpg_output(struct signature_check *);
@@ -61,7 +61,7 @@ int sign_buffer(struct strbuf *buffer, struct strbuf 
*signature,
  * gpg_output, when set, receives the diagnostic output from GPG.
  * gpg_status, when set, receives the status output from GPG.
  *
- * Typically the "tool" argument should come from a previous call to
+ * Typically, the "tool" argument should come from a previous call to
  * parse_signature(). If it's NULL, then verify_signed_buffer() will
  * try to choose the appropriate tool based on the contents of the
  * "signature" buffer.
diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
index 08d23b0cf5..b9ba47057f 100755
--- a/t/lib-gpg.sh
+++ b/t/lib-gpg.sh
@@ -59,20 +59,24 @@ sanitize_pgp() {

 create_fake_signer () {
write_script fake-signer <<-\EOF
-   if [ "$1 $2" = "--status-fd=2 -bsau" ]; then
+   if test "$1 $2" = "--status-fd=2 -bsau"
+   then
echo >&2 "[GNUPG:] BEGIN_SIGNING"
echo >&2 "[GNUPG:] SIG_CREATED D 1 SHA256 0 1513792449 
4A7FF9E2330D22B19213A4E9E9C423BE17EFEE70"
-   # avoid "-" in echo arguments
-   printf "%s\n" \
- "-BEGIN FAKE 

[PATCH v2 1/9] t7004: fix mistaken tag name

2018-04-13 Thread Ben Toews
From: Jeff King 

We have a series of tests which create signed tags with
various properties, but one test accidentally verifies a tag
from much earlier in the series.

Signed-off-by: Jeff King 
Signed-off-by: Ben Toews 
---
 t/t7004-tag.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 2aac77af70..ee093b393d 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1056,7 +1056,7 @@ test_expect_success GPG \
git tag -s -F sigblanknonlfile blanknonlfile-signed-tag &&
get_tag_msg blanknonlfile-signed-tag >actual &&
test_cmp expect actual &&
-   git tag -v signed-tag
+   git tag -v blanknonlfile-signed-tag
 '

 # messages with commented lines for signed tags:
--
2.15.1 (Apple Git-101)


Re: [PATCH] support: git show --follow-symlinks HEAD:symlink

2018-04-13 Thread Ævar Arnfjörð Bjarmason

On Fri, Apr 13 2018, Michael Vogt wrote:

> Add support for the `--follow-symlinks` options to git-show. This
> allows to write:
>
> git show --follow-symlink HEAD:path-a-symlink
>
> to get the content of the symlinked file.

Thanks. Commit message would be better as something like:

show: add --follow-symlinks option for :

Add a --follow-symlinks option that'll resolve symlinks to their
targets when the target is of the form :.

Without it, git will show the path of the link itself if the symlink
is the leaf node of , or otherwise an error if some component
of  is a symlink to another location in the repository. With
the new --follow-symlinks option both will be resolved to their
target, and its content shown instead.

I.e. start with ": " ("show" in this case), and explain how it
impacts the dirlink case.

> Signed-off-by: Michael Vogt 
> ---
>  Documentation/git-show.txt |  6 +
>  builtin/log.c  |  7 --
>  revision.c |  2 ++
>  revision.h |  1 +
>  t/t1800-git-show.sh| 46 ++
>  5 files changed, 60 insertions(+), 2 deletions(-)
>  create mode 100755 t/t1800-git-show.sh
>
> diff --git a/Documentation/git-show.txt b/Documentation/git-show.txt
> index e73ef5401..d9f7fd90c 100644
> --- a/Documentation/git-show.txt
> +++ b/Documentation/git-show.txt
> @@ -39,6 +39,12 @@ OPTIONS
>   For a more complete list of ways to spell object names, see
>   "SPECIFYING REVISIONS" section in linkgit:gitrevisions[7].
>
> +--follow-symlinks::
> + Follow symlinks inside the repository when requesting objects
> + in extended revision syntax of the form tree-ish:path-in-tree.
> + Instead of output about the link itself, provide output about
> + the linked-to object.
> +
>  include::pretty-options.txt[]

This needs to document the dirlink case I noted above.

> diff --git a/builtin/log.c b/builtin/log.c
> index 94ee177d5..e92af4fc7 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -142,7 +142,7 @@ static void cmd_log_init_finish(int argc, const char 
> **argv, const char *prefix,
>struct rev_info *rev, struct setup_revision_opt *opt)
>  {
>   struct userformat_want w;
> - int quiet = 0, source = 0, mailmap = 0;
> + int quiet = 0, source = 0, mailmap = 0, follow_symlinks = 0;
>   static struct line_opt_callback_data line_cb = {NULL, NULL, 
> STRING_LIST_INIT_DUP};
>   static struct string_list decorate_refs_exclude = 
> STRING_LIST_INIT_NODUP;
>   static struct string_list decorate_refs_include = 
> STRING_LIST_INIT_NODUP;
> @@ -162,6 +162,8 @@ static void cmd_log_init_finish(int argc, const char 
> **argv, const char *prefix,
>   OPT_CALLBACK('L', NULL, _cb, "n,m:file",
>N_("Process line range n,m in file, counting from 
> 1"),
>log_line_range_callback),
> + OPT_BOOL(0, "follow-symlinks", _symlinks,
> +  N_("follow in-tree symlinks (used when showing file 
> content)")),
>   OPT_END()
>   };
>
> @@ -173,9 +175,10 @@ static void cmd_log_init_finish(int argc, const char 
> **argv, const char *prefix,
>builtin_log_options, builtin_log_usage,
>PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN |
>PARSE_OPT_KEEP_DASHDASH);
> -

Stray line deletion here.

>   if (quiet)
>   rev->diffopt.output_format |= DIFF_FORMAT_NO_OUTPUT;
> + if (follow_symlinks)
> + rev->follow_symlinks = 1;
>   argc = setup_revisions(argc, argv, rev, opt);
>
>   /* Any arguments at this point are not recognized */
> diff --git a/revision.c b/revision.c
> index b42c836d7..4ab22313f 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1678,6 +1678,8 @@ int handle_revision_arg(const char *arg_, struct 
> rev_info *revs, int flags, unsi
>
>   if (revarg_opt & REVARG_COMMITTISH)
>   get_sha1_flags |= GET_OID_COMMITTISH;
> + if (revs && revs->follow_symlinks)
> + get_sha1_flags |= GET_OID_FOLLOW_SYMLINKS;
>
>   if (get_oid_with_context(arg, get_sha1_flags, , ))
>   return revs->ignore_missing ? 0 : -1;
> diff --git a/revision.h b/revision.h
> index b8c47b98e..060f1038a 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -122,6 +122,7 @@ struct rev_info {
>   first_parent_only:1,
>   line_level_traverse:1,
>   tree_blobs_in_commit_order:1,
> + follow_symlinks:1,
>
>   /* for internal use only */
>   exclude_promisor_objects:1;
> diff --git a/t/t1800-git-show.sh b/t/t1800-git-show.sh
> new file mode 100755
> index 0..85541b4db
> --- /dev/null
> +++ b/t/t1800-git-show.sh
> @@ -0,0 +1,46 @@
> +#!/bin/sh
> +
> +test_description='Test git show 

Re: [PATCH v9 29.50/30] t6046: testcases checking whether updates can be skipped in a merge

2018-04-13 Thread Stefan Beller
On Fri, Apr 13, 2018 at 12:56 PM, Elijah Newren  wrote:
> Add several tests checking whether updates can be skipped in a merge.
> Also add several similar testcases for where updates cannot be skipped in
> a merge to make sure that we skip if and only if we should.
>
> In particular:
>
>   * Testcase 1a (particularly 1a-check-L) would have pointed out the
> problem Linus has been dealing with for year with his merges[1].
>
>   * Testcase 2a (particularly 2a-check-L) would have pointed out the
> problem with my directory-rename-series before it broke master[2].
>
>   * Testcases 3[ab] (particularly 3a-check-L) provide a simpler testcase
> than 12b of t6043 making that one easier to understand.
>
>   * There are several complementary testcases to make sure we're not just
> fixing those particular issues while regressing in the opposite
> direction.
>
> [1] 
> https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fxjrkvu1m_rhbg54nooa...@mail.gmail.com/
> [2] https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
>
> Signed-off-by: Elijah Newren 

This is
Reviewed-by: Stefan Beller 


> +   test_tick &&
> +   git commit -m "O" &&

minor nit: I think you can omit the test_tick before creating O
for each setup.

Stefan


Re: [RFC 07/10] FIXME: wrap-for-bin.sh: set 'core.submodulesFile' for each git invocation

2018-04-13 Thread Stefan Beller
Hi Antonio,

On Fri, Apr 13, 2018 at 1:07 AM, Antonio Ospite  wrote:
> This is to test custom gitmodules file paths. The default path can be
> overridden using the 'GIT_MODULES_FILE' environmental variable.
>
> Maybe In the final patch the option should be set only when running
> tests and not unconditionally in the wrapper script, but as a proof of
> concept the wrapper script was a convenient location.
>
> Also, in the final patch a fixed custom file path could be used instead
> of the environmental variable: to exercise the code it should be enough
> to have a value different from the default one.
>
> The change to 't0001-init.sh' is needed to make the test pass, since now
> a config is set on the command line.

Missing sign off.

So you'd think we'd have to rerun the test suite with GIT_MODULES_FILE set?
That makes for an expensive test. Can we have just a few tests for a few
commands to see that the basics are working correctly?


> ---
>  Makefile| 3 ++-
>  t/t0001-init.sh | 1 +
>  wrap-for-bin.sh | 2 ++
>  3 files changed, 5 insertions(+), 1 deletion(-)
>  mode change 100644 => 100755 wrap-for-bin.sh
>
> diff --git a/Makefile b/Makefile
> index f18168725..38ee1f6a2 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2480,7 +2480,8 @@ bin-wrappers/%: wrap-for-bin.sh
> @mkdir -p bin-wrappers
> $(QUIET_GEN)sed -e '1s|#!.*/sh|#!$(SHELL_PATH_SQ)|' \
>  -e 's|@@BUILD_DIR@@|$(shell pwd)|' \
> --e 's|@@PROG@@|$(patsubst test-%,t/helper/test-%,$(@F))|' < $< > 
> $@ && \
> +-e 's|@@PROG@@|$(patsubst test-%,t/helper/test-%,$(@F))|' \
> +-e 's|git\"|git\" $$GIT_OPTIONS|' < $< > $@ && \
> chmod +x $@
>
>  # GNU make supports exporting all variables by "export" without parameters.
> diff --git a/t/t0001-init.sh b/t/t0001-init.sh
> index c413bff9c..6fa3fd24e 100755
> --- a/t/t0001-init.sh
> +++ b/t/t0001-init.sh
> @@ -93,6 +93,7 @@ test_expect_success 'No extra GIT_* on alias scripts' '
> sed -n \
> -e "/^GIT_PREFIX=/d" \
> -e "/^GIT_TEXTDOMAINDIR=/d" \
> +   -e "/^GIT_CONFIG_PARAMETERS=/d" \
> -e "/^GIT_/s/=.*//p" |
> sort
> EOF
> diff --git a/wrap-for-bin.sh b/wrap-for-bin.sh
> old mode 100644
> new mode 100755
> index 584240881..02bf41cbd
> --- a/wrap-for-bin.sh
> +++ b/wrap-for-bin.sh
> @@ -20,6 +20,8 @@ PATH='@@BUILD_DIR@@/bin-wrappers:'"$PATH"
>
>  export GIT_EXEC_PATH GITPERLLIB PATH GIT_TEXTDOMAINDIR
>
> +GIT_OPTIONS="-c core.submodulesfile=${GITMODULES_FILE:-.gitmodules}"
> +
>  if test -n "$GIT_TEST_GDB"
>  then
> unset GIT_TEST_GDB
> --
> 2.17.0
>


Re: Optimizing writes to unchanged files during merges?

2018-04-13 Thread Elijah Newren
On Fri, Apr 13, 2018 at 10:14 AM, Linus Torvalds
 wrote:
> On Fri, Apr 13, 2018 at 12:02 AM, Elijah Newren  wrote:

>> However, it turns out we have this awesome function called
>> "was_tracked(const char *path)" that was intended for answering this
>> exact question.  So, assuming was_tracked() isn't buggy, the correct
>> patch for this problem would look like:
>
> Apparently that causes problems, for some odd reason.
>
> I like the notion of checking the index, but it's not clear that the
> index is reliable in the presence of renames either.

Yes, precisely.  Checking the *current* index is not reliable in the
presence of renames.

Trying to use the current index as a proxy for what was in the index
before the merge started is a problem.  But we had a copy of the index
before the merge started; we just discarded it at the end of
unpack_trees().  We could keep it around instead.  That would also
have the benefits of making the was_dirty() checks more accurate too,
as using the mtime's in the current index as a proxy for what was in
the original index has the potential for the same kinds of problems.

>>   A big series
>> including that patch was merged to master two days ago, but
>> unfortunately that exact patch was the one that caused some
>> impressively awful fireworks[1].
>
> Yeah, so this code is fragile.
>
> How about we take a completely different approach? Instead of relying
> on fragile (but clever) tests, why not rely on stupid brute force?
>
> Yeah, yeah, it's bad to be stupid, but sometimes simple and stupid
> really does work.
>

> Comments? Because considering the problems this code has had, maybe
> "stupid" really is the right approach...

It's certainly tempting as an interim solution.  I have an alternative
interim solution that I think explains well why the code here had been
fragile, and how to just implement what we want to know rather than
making approximations to it, which I just posted at [2].  But I can
see the draw of just gutting the code and replacing with simple brute
force.  Long term, I think the correct solution is still Junio's
suggested rewrite[1].  My alternative is slightly closer to that
end-state, so I favor it over simple brute-force, but if others have
strong preferences here, I can be happy with either.


Elijah

[1] https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/
[2] https://public-inbox.org/git/20180413195607.18091-1-new...@gmail.com/


[PATCH v9 29.50/30] t6046: testcases checking whether updates can be skipped in a merge

2018-04-13 Thread Elijah Newren
Add several tests checking whether updates can be skipped in a merge.
Also add several similar testcases for where updates cannot be skipped in
a merge to make sure that we skip if and only if we should.

In particular:

  * Testcase 1a (particularly 1a-check-L) would have pointed out the
problem Linus has been dealing with for year with his merges[1].

  * Testcase 2a (particularly 2a-check-L) would have pointed out the
problem with my directory-rename-series before it broke master[2].

  * Testcases 3[ab] (particularly 3a-check-L) provide a simpler testcase
than 12b of t6043 making that one easier to understand.

  * There are several complementary testcases to make sure we're not just
fixing those particular issues while regressing in the opposite
direction.

[1] 
https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fxjrkvu1m_rhbg54nooa...@mail.gmail.com/
[2] https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/

Signed-off-by: Elijah Newren 
---
 t/t6046-merge-skip-unneeded-updates.sh | 497 +
 1 file changed, 497 insertions(+)
 create mode 100755 t/t6046-merge-skip-unneeded-updates.sh

diff --git a/t/t6046-merge-skip-unneeded-updates.sh 
b/t/t6046-merge-skip-unneeded-updates.sh
new file mode 100755
index 00..89c3a953ae
--- /dev/null
+++ b/t/t6046-merge-skip-unneeded-updates.sh
@@ -0,0 +1,497 @@
+#!/bin/sh
+
+test_description="merge cases"
+
+# The setup for all of them, pictorially, is:
+#
+#  A
+#  o
+# / \
+#  O o   ?
+# \ /
+#  o
+#  B
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits O, A, and B contain.
+#
+# Notation:
+#z/{b,c}   means  files z/b and z/c both exist
+#x/d_1 means  file x/d exists with content d1.  (Purpose of the
+# underscore notation is to differentiate different
+# files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###
+# SECTION 1: Cases involving no renames (one side has subset of changes of
+#the other side)
+###
+
+# Testcase 1a, Changes on A, subset of changes on B
+#   Commit O: b_1
+#   Commit A: b_2
+#   Commit B: b_3
+#   Expected: b_2
+
+test_expect_success '1a-setup: Modify(A)/Modify(B), change on B subset of A' '
+   test_create_repo 1a &&
+   (
+   cd 1a &&
+
+   test_write_lines 1 2 3 4 5 6 7 8 9 10 >b
+   git add b &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 10.5 >b &&
+   git add b &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 >b &&
+   git add b &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '1a-check-L: Modify(A)/Modify(B), change on B subset of A' 
'
+   test_when_finished "git -C 1a reset --hard" &&
+   (
+   cd 1a &&
+
+   git checkout A^0 &&
+
+   GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+   test_i18ngrep "Skipped" out &&
+   test_must_be_empty err &&
+
+   git ls-files -s >index_files &&
+   test_line_count = 1 index_files &&
+
+   git rev-parse >actual HEAD:b &&
+   git rev-parse >expect A:b &&
+   test_cmp expect actual &&
+
+   git hash-object b   >actual &&
+   git rev-parse   A:b >expect &&
+   test_cmp expect actual
+   )
+'
+
+test_expect_success '1a-check-R: Modify(A)/Modify(B), change on B subset of A' 
'
+   test_when_finished "git -C 1a reset --hard" &&
+   (
+   cd 1a &&
+
+   git checkout B^0 &&
+
+   GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+   test_i18ngrep "Auto-merging" out &&
+   test_must_be_empty err &&
+
+   git ls-files -s >index_files &&
+   test_line_count = 1 index_files &&
+
+   git rev-parse >actual HEAD:b &&
+   git rev-parse >expect A:b &&
+   test_cmp expect actual &&
+
+   git hash-object b   >actual &&
+   git rev-parse   A:b >expect &&
+   test_cmp expect actual
+   )
+'
+
+###
+# SECTION 2: Cases involving basic renames

[PATCH v9 29.25/30] merge-recursive: improve output precision around skipping updates

2018-04-13 Thread Elijah Newren
When a merge results in contents that already existed in HEAD (because the
changes on a side branch were a subset of what was changed by HEAD), and
those contents were already at the right path, the working directory
updates can be skipped.  However, the relevant code would sometimes claim
the working directory update could be skipped despite the fact that the
contents were at the wrong path.  Make the output reflect the situation
more precisely, including an additional message that tests can check for
to make sure we are getting correct behavior.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c   | 11 ++-
 t/t6022-merge-rename.sh |  2 +-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 4a1ecdea03..05250939c7 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2761,21 +2761,22 @@ static int merge_content(struct merge_options *o,
   o->branch2, path2, ))
return -1;
 
-   if (mfi.clean && !df_conflict_remains &&
-   oid_eq(, a_oid) && mfi.mode == a_mode) {
+   if (mfi.clean && oid_eq(, a_oid) && mfi.mode == a_mode) {
int path_renamed_outside_HEAD;
-   output(o, 3, _("Skipped %s (merged same as existing)"), path);
/*
 * The content merge resulted in the same file contents we
 * already had.  We can return early if those file contents
 * are recorded at the correct path (which may not be true
-* if the merge involves a rename).
+* if the merge involves a rename or there's a D/F conflict).
 */
path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
-   if (!path_renamed_outside_HEAD) {
+   if (!df_conflict_remains && !path_renamed_outside_HEAD) {
+   output(o, 3, _("Skipped %s (merged same as existing)"), 
path);
add_cacheinfo(o, mfi.mode, , path,
  0, (!o->call_depth), 0);
return mfi.clean;
+   } else {
+   output(o, 3, _("Had correct contents for %s, but not at 
right path"), path);
}
} else
output(o, 2, _("Auto-merging %s"), path);
diff --git a/t/t6022-merge-rename.sh b/t/t6022-merge-rename.sh
index 05ebba7afa..574088bfaf 100755
--- a/t/t6022-merge-rename.sh
+++ b/t/t6022-merge-rename.sh
@@ -245,7 +245,7 @@ test_expect_success 'merge of identical changes in a 
renamed file' '
GIT_MERGE_VERBOSITY=3 git merge change | test_i18ngrep "^Skipped B" &&
git reset --hard HEAD^ &&
git checkout change &&
-   GIT_MERGE_VERBOSITY=3 git merge change+rename | test_i18ngrep "^Skipped 
B"
+   GIT_MERGE_VERBOSITY=3 git merge change+rename | test_i18ngrep "^Had 
correct contents for B, but not at right path"
 '
 
 test_expect_success 'setup for rename + d/f conflicts' '
-- 
2.16.0.35.g6dd7ede834



[RFC PATCH v9 0/30] Add directory rename detection to git

2018-04-13 Thread Elijah Newren
This series builds on commit febb3a86098f ("merge-recursive: avoid
spurious rename/rename conflict from dir renames", 2018-02-14), also known
as patch 29/30 of en/rename-directory-detection.  That patch series has
been reverted from master[1], due to a bug with patch 30/30, so does not
apply to current master.  But I didn't want to resend the whole series for
an RFC.

These four patches replace that patch and:
  - fixes Linus' rewriting-of-unchanged-files bug[1]
  - fixes the problems that broke Junio's merges after my series[2]
  - fixes the problem the original patch 30/30 was intended to solve
  - adds lots of testcases to make sure this doesn't regress.

Linus' alternative of stupid-brute-force[3], would also work here,
though I feel the first three patches are useful even if we take some
form of his patch.  Long term, the most correct solution would involve
a rewrite to merge-recursive that would simplify this code[4], though
I think the changes in this series brings this part of the code closer
to that end state.

The big questions here are:
  1) The last time my rename-directory-detection series was merged into
 master it bit Junio badly.  I'm planning to redo all merges of
 git.git and linux.git and comparing v2.17.0 to what I get after
 my changes.  What else should I test?
  2) What do folks thing about stupid-brute-force vs. the explanation in
 my final patch?


[1] 
https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fxjrkvu1m_rhbg54nooa...@mail.gmail.com/
[2] https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
[3] 
https://public-inbox.org/git/CA+55aFwi9pTAJT_qtv=vHLgu=B1fdXBoD96i8Y5xnbS=zrf...@mail.gmail.com/
[4] https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/


Elijah Newren (4):
  merge-recursive: improve output precision around skipping updates
  t6046: testcases checking whether updates can be skipped in a merge
  merge-recursive: Fix was_tracked() to quit lying with some renamed
paths
  merge-recursive: fix check for skipability of working tree updates

 merge-recursive.c  | 109 +---
 merge-recursive.h  |   1 +
 t/t6022-merge-rename.sh|   2 +-
 t/t6043-merge-rename-directories.sh|   2 +-
 t/t6046-merge-skip-unneeded-updates.sh | 497 +
 5 files changed, 575 insertions(+), 36 deletions(-)
 create mode 100755 t/t6046-merge-skip-unneeded-updates.sh

-- 
2.16.0.35.g6dd7ede834



[PATCH v9 29.75/30] merge-recursive: Fix was_tracked() to quit lying with some renamed paths

2018-04-13 Thread Elijah Newren
In commit aacb82de3ff8 ("merge-recursive: Split was_tracked() out of
would_lose_untracked()", 2011-08-11), was_tracked() was split out of
would_lose_untracked() with the intent to provide a function that could
answer whether a path was tracked in the index before the merge.  Sadly,
it instead returned whether the path was in the working tree due to having
been tracked in the index before the merge OR having been written there by
unpack_trees().  The distinction is important when renames are involved,
e.g. for a merge where:

   HEAD:  modifies path b
   other: renames b->c

In this case, c was not tracked in the index before the merge, but would
have been added to the index at stage 0 and written to the working tree by
unpack_trees().  would_lose_untracked() is more interested in the
in-working-copy-for-either-reason behavior, while all other uses of
was_tracked() want just was-it-tracked-in-index-before-merge behavior.

Unsplit would_lose_untracked() and write a new was_tracked() function
which answers whether a path was tracked in the index before the merge
started.

This will also make was_dirty() return better results, as it will be based
off the original index rather than an index that possibly only copied over
some of the stat information.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 82 +++
 merge-recursive.h |  1 +
 2 files changed, 59 insertions(+), 24 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 05250939c7..adc800f188 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -344,6 +344,7 @@ static int git_merge_trees(struct merge_options *o,
 {
int rc;
struct tree_desc t[3];
+   struct index_state tmp_index = { NULL };
 
memset(>unpack_opts, 0, sizeof(o->unpack_opts));
if (o->call_depth)
@@ -354,7 +355,7 @@ static int git_merge_trees(struct merge_options *o,
o->unpack_opts.head_idx = 2;
o->unpack_opts.fn = threeway_merge;
o->unpack_opts.src_index = _index;
-   o->unpack_opts.dst_index = _index;
+   o->unpack_opts.dst_index = _index;
setup_unpack_trees_porcelain(>unpack_opts, "merge");
 
init_tree_desc_from_tree(t+0, common);
@@ -362,13 +363,17 @@ static int git_merge_trees(struct merge_options *o,
init_tree_desc_from_tree(t+2, merge);
 
rc = unpack_trees(3, t, >unpack_opts);
+   cache_tree_free(_cache_tree);
+
+   o->orig_index = the_index;
+   the_index = tmp_index;
+
/*
-* unpack_trees NULLifies src_index, but it's used in verify_uptodate,
-* so set to the new index which will usually have modification
-* timestamp info copied over.
+* src_index is used in verify_uptodate, but was NULLified in
+* unpack_trees, so we need to set it back to the original index.
 */
-   o->unpack_opts.src_index = _index;
-   cache_tree_free(_cache_tree);
+   o->unpack_opts.src_index = >orig_index;
+
return rc;
 }
 
@@ -773,31 +778,51 @@ static int dir_in_way(const char *path, int 
check_working_copy, int empty_ok)
!(empty_ok && is_empty_dir(path));
 }
 
-static int was_tracked(const char *path)
+/*
+ * Returns whether path was tracked in the index before the merge started
+ */
+static int was_tracked(struct merge_options *o, const char *path)
 {
-   int pos = cache_name_pos(path, strlen(path));
+   int pos = index_name_pos(>orig_index, path, strlen(path));
 
if (0 <= pos)
-   /* we have been tracking this path */
+   /* we were tracking this path before the merge */
return 1;
 
-   /*
-* Look for an unmerged entry for the path,
-* specifically stage #2, which would indicate
-* that "our" side before the merge started
-* had the path tracked (and resulted in a conflict).
-*/
-   for (pos = -1 - pos;
-pos < active_nr && !strcmp(path, active_cache[pos]->name);
-pos++)
-   if (ce_stage(active_cache[pos]) == 2)
-   return 1;
return 0;
 }
 
 static int would_lose_untracked(const char *path)
 {
-   return !was_tracked(path) && file_exists(path);
+   /*
+* This may look like it can be simplified to:
+*   return !was_tracked(o, path) && file_exists(path)
+* but it can't.  This function needs to know whether path was
+* in the working tree due to EITHER having been tracked in the
+* index before the merge OR having been put into the working copy
+* and index by unpack_trees().  Due to that either-or requirement,
+* we check the current index instead of the original one.
+*/
+   int pos = cache_name_pos(path, strlen(path));
+
+   if (pos < 0)
+   pos = -1 - pos;
+   while (pos < active_nr &&
+  !strcmp(path, active_cache[pos]->name)) {
+ 

[PATCH v9 30/30] merge-recursive: fix check for skipability of working tree updates

2018-04-13 Thread Elijah Newren
The can-working-tree-updates-be-skipped check has had a long and blemished
history.  The update can be skipped iff:
  a) The merged contents match what was in HEAD
  b) The merged mode matches what was in HEAD
  c) The target path is usable and matches what was in HEAD

Steps a & b are easy to check; we have always gotten those right.  Step c
is just:
  c1) Nothing else is in the way of putting the content at the target path
  (i.e. it isn't involved in any D/F conflicts)
  c2) target path was tracked in the index before the merge

While it is easy to overlook c1, this was fixed seven years ago with
commit 4ab9a157d069 ("merge_content(): Check whether D/F conflicts are
still present", 2010-09-20).  merge-recursive didn't have a readily
available way to directly check c2, so various approximations were
used:

  * In commit b2c8c0a76274 ("merge-recursive: When we detect we can skip
an update, actually skip it", 2011-02-28), it was noted that although
the code claimed it was skipping the update, it did not actually skip
the update.  The code was made to skip it, but used lstat(path, ...)
as an approximation to path-was-tracked-in-index-before-merge.

  * In commit 5b448b853030 ("merge-recursive: When we detect we can skip
an update, actually skip it", 2011-08-11), the problem with using
lstat was noted.  It was changed to the approximation
   path2 && strcmp(path, path2)
which is also wrong.  !path2 || strcmp(path, path2) would have been
better, but would have fallen short with directory renames.

  * In c5b761fb2711 ("merge-recursive: ensure we write updates for
directory-renamed file", 2018-02-14), the problem with the previous
approximation was noted and changed to
   was_tracked(path)
That looks like exactly what we were trying to answer, and is what
was_tracked() was *intended* for, but not what was_tracked()
actually returned.

Since the previous commit made was_tracked(path) actually mean "path was
tracked in the index before the merge", we can now use it instead of
other approximations to answer the question "was path tracked in the
index before the merge?"  So, although the code change in this commit is
the same one made in c5b761fb2711, it now safe and correct due to the
prior fix to was_tracked().

Signed-off-by: Elijah Newren 
---
 merge-recursive.c  | 20 +---
 t/t6043-merge-rename-directories.sh|  2 +-
 t/t6046-merge-skip-unneeded-updates.sh |  4 ++--
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index adc800f188..1b71d00fdb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2786,16 +2786,22 @@ static int merge_content(struct merge_options *o,
   o->branch2, path2, ))
return -1;
 
+   /*
+* We can skip updating the working tree file iff:
+*   a) The merged contents match what was in HEAD
+*   b) The merged mode matches what was in HEAD
+*   c) The target path is usable and matches what was in HEAD
+* We test (a) & (b) here.
+*/
if (mfi.clean && oid_eq(, a_oid) && mfi.mode == a_mode) {
-   int path_renamed_outside_HEAD;
/*
-* The content merge resulted in the same file contents we
-* already had.  We can return early if those file contents
-* are recorded at the correct path (which may not be true
-* if the merge involves a rename or there's a D/F conflict).
+* Case c has two pieces:
+*   c1) Nothing else is in the way of writing the merged
+*   results to path (i.e. it isn't involved in any
+*   D/F conflict)
+*   c2) path was tracked in the index before the merge
 */
-   path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
-   if (!df_conflict_remains && !path_renamed_outside_HEAD) {
+   if (!df_conflict_remains && was_tracked(o, path)) {
output(o, 3, _("Skipped %s (merged same as existing)"), 
path);
add_cacheinfo(o, mfi.mode, , path,
  0, (!o->call_depth), 0);
diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 45f620633f..2e28f2908d 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3884,7 +3884,7 @@ test_expect_success '12b-setup: Moving one directory 
hierarchy into another' '
)
 '
 
-test_expect_failure '12b-check: Moving one directory hierarchy into another' '
+test_expect_success '12b-check: Moving one directory hierarchy into another' '
(
cd 12b &&
 
diff --git a/t/t6046-merge-skip-unneeded-updates.sh 
b/t/t6046-merge-skip-unneeded-updates.sh
index 

Re: [RFC PATCH] Add "git show --follow-symlinks HEAD:symlink"

2018-04-13 Thread Michael Vogt
On Fri, Apr 13, 2018 at 09:33:00PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Apr 13 2018, Michael Vogt wrote:
> 
> > The update patch is attached as an inline attachement.
> 
> Your patch still just shows up as a straight-up attachment in many
> E-Mail clients. Note the difference between what your patch
[..]
> This is why Documentation/SubmittingPatches suggests using format-patch
> & send-email. You don't *have* to use those tools, and can use something
> that's compatible with what's expected on-list, but what you're doing
> isn't that.

My apologizes, I resend it using "git send-email".

Cheers,
 Michael


[PATCH] support: git show --follow-symlinks HEAD:symlink

2018-04-13 Thread Michael Vogt
Add support for the `--follow-symlinks` options to git-show. This
allows to write:

git show --follow-symlink HEAD:path-a-symlink

to get the content of the symlinked file.

Signed-off-by: Michael Vogt 
---
 Documentation/git-show.txt |  6 +
 builtin/log.c  |  7 --
 revision.c |  2 ++
 revision.h |  1 +
 t/t1800-git-show.sh| 46 ++
 5 files changed, 60 insertions(+), 2 deletions(-)
 create mode 100755 t/t1800-git-show.sh

diff --git a/Documentation/git-show.txt b/Documentation/git-show.txt
index e73ef5401..d9f7fd90c 100644
--- a/Documentation/git-show.txt
+++ b/Documentation/git-show.txt
@@ -39,6 +39,12 @@ OPTIONS
For a more complete list of ways to spell object names, see
"SPECIFYING REVISIONS" section in linkgit:gitrevisions[7].
 
+--follow-symlinks::
+   Follow symlinks inside the repository when requesting objects
+   in extended revision syntax of the form tree-ish:path-in-tree.
+   Instead of output about the link itself, provide output about
+   the linked-to object.
+
 include::pretty-options.txt[]
 
 
diff --git a/builtin/log.c b/builtin/log.c
index 94ee177d5..e92af4fc7 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -142,7 +142,7 @@ static void cmd_log_init_finish(int argc, const char 
**argv, const char *prefix,
 struct rev_info *rev, struct setup_revision_opt *opt)
 {
struct userformat_want w;
-   int quiet = 0, source = 0, mailmap = 0;
+   int quiet = 0, source = 0, mailmap = 0, follow_symlinks = 0;
static struct line_opt_callback_data line_cb = {NULL, NULL, 
STRING_LIST_INIT_DUP};
static struct string_list decorate_refs_exclude = 
STRING_LIST_INIT_NODUP;
static struct string_list decorate_refs_include = 
STRING_LIST_INIT_NODUP;
@@ -162,6 +162,8 @@ static void cmd_log_init_finish(int argc, const char 
**argv, const char *prefix,
OPT_CALLBACK('L', NULL, _cb, "n,m:file",
 N_("Process line range n,m in file, counting from 
1"),
 log_line_range_callback),
+   OPT_BOOL(0, "follow-symlinks", _symlinks,
+N_("follow in-tree symlinks (used when showing file 
content)")),
OPT_END()
};
 
@@ -173,9 +175,10 @@ static void cmd_log_init_finish(int argc, const char 
**argv, const char *prefix,
 builtin_log_options, builtin_log_usage,
 PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN |
 PARSE_OPT_KEEP_DASHDASH);
-
if (quiet)
rev->diffopt.output_format |= DIFF_FORMAT_NO_OUTPUT;
+   if (follow_symlinks)
+   rev->follow_symlinks = 1;
argc = setup_revisions(argc, argv, rev, opt);
 
/* Any arguments at this point are not recognized */
diff --git a/revision.c b/revision.c
index b42c836d7..4ab22313f 100644
--- a/revision.c
+++ b/revision.c
@@ -1678,6 +1678,8 @@ int handle_revision_arg(const char *arg_, struct rev_info 
*revs, int flags, unsi
 
if (revarg_opt & REVARG_COMMITTISH)
get_sha1_flags |= GET_OID_COMMITTISH;
+   if (revs && revs->follow_symlinks)
+   get_sha1_flags |= GET_OID_FOLLOW_SYMLINKS;
 
if (get_oid_with_context(arg, get_sha1_flags, , ))
return revs->ignore_missing ? 0 : -1;
diff --git a/revision.h b/revision.h
index b8c47b98e..060f1038a 100644
--- a/revision.h
+++ b/revision.h
@@ -122,6 +122,7 @@ struct rev_info {
first_parent_only:1,
line_level_traverse:1,
tree_blobs_in_commit_order:1,
+   follow_symlinks:1,
 
/* for internal use only */
exclude_promisor_objects:1;
diff --git a/t/t1800-git-show.sh b/t/t1800-git-show.sh
new file mode 100755
index 0..85541b4db
--- /dev/null
+++ b/t/t1800-git-show.sh
@@ -0,0 +1,46 @@
+#!/bin/sh
+
+test_description='Test git show works'
+
+. ./test-lib.sh
+
+test_expect_success 'verify git show HEAD:foo works' '
+printf "foo content\n" >foo &&
+git add foo &&
+git commit -m "added foo" &&
+git show HEAD:foo >actual &&
+printf "foo content\n" >expected &&
+test_cmp expected actual
+'
+
+test_expect_success 'verify git show HEAD:symlink shows symlink points to foo' 
'
+printf "foo content\n" >foo &&
+ln -s foo symlink &&
+git add foo symlink &&
+git commit -m "added foo and a symlink to foo" &&
+git show HEAD:foo >actual &&
+printf "foo content\n" >expected &&
+test_cmp expected actual &&
+git show HEAD:symlink >actual &&
+printf "foo" >expected &&
+test_cmp expected actual
+'
+
+test_expect_success 'verify git show --follow-symlinks HEAD:symlink shows foo' 
'
+git show --follow-symlinks HEAD:symlink >actual &&
+printf 

Re: [RFC PATCH] Add "git show --follow-symlinks HEAD:symlink"

2018-04-13 Thread Ævar Arnfjörð Bjarmason

On Fri, Apr 13 2018, Michael Vogt wrote:

> The update patch is attached as an inline attachement.

Your patch still just shows up as a straight-up attachment in many
E-Mail clients. Note the difference between what your patch
(https://public-inbox.org/git/20180413174819.GA19030@bod/raw) and a
patch that's not an attachment
(https://public-inbox.org/git/0f0942043678fe76f8d654306482ee26fac643f0.1523617836.git.johannes.schinde...@gmx.de/raw)
look like.

Try to "wget" both of those and apply them with "git am" on top of
master, and note how what you're doing results in a broken patch.

This is why Documentation/SubmittingPatches suggests using format-patch
& send-email. You don't *have* to use those tools, and can use something
that's compatible with what's expected on-list, but what you're doing
isn't that.


Re: [PATCH] Deprecate support for .git/info/grafts

2018-04-13 Thread Eric Sunshine
On Fri, Apr 13, 2018 at 7:11 AM, Johannes Schindelin
 wrote:
> The grafts feature was a convenient way to "stich together" ancient
> history to the fresh start of linux.git.
> [...]
> The much younger feature implemented as `git replace` set out to remedy
> those limitations and dangerous bugs.
>
> Seeing as `git replace` is pretty mature by now, it is time to deprecate
> support for the graft file, and to retire it eventually.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  advice.c  | 2 ++
>  advice.h  | 1 +
>  commit.c  | 9 +
>  t/t6001-rev-list-graft.sh | 9 +
>  4 files changed, 21 insertions(+)

Perhaps, as part of this deprecation, the example in
Documentation/git-filter-branch.txt should be updated to suggest
git-replace instead of .git/info/grafts.

Maybe, also, Documentation/shallow.txt should talk about replace-refs
rather than .git/info/grafts.


Re: [PATCH] Deprecate support for .git/info/grafts

2018-04-13 Thread Stefan Beller
Hi Johannes,

On Fri, Apr 13, 2018 at 4:11 AM, Johannes Schindelin
 wrote:
> The grafts feature was a convenient way to "stich together" ancient
> history to the fresh start of linux.git.

Did you mean: stitch?

> Its implementation is, however, not up to Git's standards, as there are
> too many ways where it can lead to surprising and unwelcome behavior.
>
> For example, when pushing from a repository with active grafts, it is
> possible to miss commits that have been "grafted out", resulting in a
> broken state on the other side.
>
> Also, the grafts feature is limited to "rewriting" commits' list of
> parents, it cannot replace anything else.
>
> The much younger feature implemented as `git replace` set out to remedy
> those limitations and dangerous bugs.
>
> Seeing as `git replace` is pretty mature by now, it is time to deprecate
> support for the graft file, and to retire it eventually.

It seems that the maturity needed for this commit was reached in
4228e8bc98 (replace: add --graft option, 2014-07-19)

Reviewed-by: Stefan Beller 
> Signed-off-by: Johannes Schindelin 
> ---

> return -1;
> +   if (advice_graft_file_deprecated)
> +   advise(_("Support for /info/grafts is deprecated\n"
> +"and will be removed in a future Git version.\n"
> +"\n"
> +"Please use \"git replace --graft [...]\" instead.\n"
> +"\n"
> +"Turn this message off by running\n"
> +"\"git config advice.graftFileDeprecated false\""));

So the user would have to run:

  for line in /info/grafts:
  git replace --graft $line
  # The order in the grafts file is the same as the arguments,
  # but we'd have to pass each as its own argument
  rm /info/grafts

I wonder if we want to offer a migration tool or just leave it
at this hint.

Thanks,
Stefan


Re: [PATCH v6 05/15] sequencer: introduce the `merge` command

2018-04-13 Thread Phillip Wood
On 13/04/18 11:12, Phillip Wood wrote:
> On 10/04/18 13:29, Johannes Schindelin wrote:
>> +static int do_merge(struct commit *commit, const char *arg, int arg_len,
>> +int flags, struct replay_opts *opts)
>> +{
>> +int run_commit_flags = (flags & TODO_EDIT_MERGE_MSG) ?
>> +EDIT_MSG | VERIFY_MSG : 0;
>> +struct strbuf ref_name = STRBUF_INIT;
>> +struct commit *head_commit, *merge_commit, *i;
>> +struct commit_list *bases, *j, *reversed = NULL;
>> +struct merge_options o;
>> +int merge_arg_len, oneline_offset, ret;
>> +static struct lock_file lock;
>> +const char *p;
>> +
>> +oneline_offset = arg_len;
>> +merge_arg_len = strcspn(arg, " \t\n");
>> +p = arg + merge_arg_len;
>> +p += strspn(p, " \t\n");
>> +if (*p == '#' && (!p[1] || isspace(p[1]))) {
>> +p += 1 + strspn(p + 1, " \t\n");
>> +oneline_offset = p - arg;
>> +} else if (p - arg < arg_len)
>> +BUG("octopus merges are not supported yet: '%s'", p);
>> +
>> +strbuf_addf(_name, "refs/rewritten/%.*s", merge_arg_len, arg);
>> +merge_commit = lookup_commit_reference_by_name(ref_name.buf);
>> +if (!merge_commit) {
>> +/* fall back to non-rewritten ref or commit */
>> +strbuf_splice(_name, 0, strlen("refs/rewritten/"), "", 0);
>> +merge_commit = lookup_commit_reference_by_name(ref_name.buf);
>> +}
>> +if (!merge_commit) {
>> +error(_("could not resolve '%s'"), ref_name.buf);
>> +strbuf_release(_name);
>> +return -1;
>> +}
>> +
>> +if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
>> +return -1;
>> +
>> +head_commit = lookup_commit_reference_by_name("HEAD");
>> +if (!head_commit) {
>> +rollback_lock_file();
>> +return error(_("cannot merge without a current revision"));
>> +}
>> +
>> +if (commit) {
>> +const char *message = get_commit_buffer(commit, NULL);
>> +const char *body;
>> +int len;
>> +
>> +if (!message) {
>> +rollback_lock_file();
>> +return error(_("could not get commit message of '%s'"),
>> + oid_to_hex(>object.oid));
>> +}
>> +write_author_script(message);
>> +find_commit_subject(message, );
>> +len = strlen(body);
>> +if (write_message(body, len, git_path_merge_msg(), 0) < 0) {
>> +error_errno(_("could not write '%s'"),
>> +git_path_merge_msg());
>> +unuse_commit_buffer(commit, message);
>> +rollback_lock_file();
>> +return -1;
>> +}
>> +unuse_commit_buffer(commit, message);
>> +} else {
>> +struct strbuf buf = STRBUF_INIT;
>> +int len;
>> +
>> +strbuf_addf(, "author %s", git_author_info(0));
>> +write_author_script(buf.buf);
>> +strbuf_reset();
>> +
>> +if (oneline_offset < arg_len) {
>> +p = arg + oneline_offset;
>> +len = arg_len - oneline_offset;
>> +} else {
>> +strbuf_addf(, "Merge branch '%.*s'",
>> +merge_arg_len, arg);
>> +p = buf.buf;
>> +len = buf.len;
>> +}
>> +
>> +if (write_message(p, len, git_path_merge_msg(), 0) < 0) {
>> +error_errno(_("could not write '%s'"),
>> +git_path_merge_msg());
>> +strbuf_release();
>> +rollback_lock_file();
>> +return -1;
>> +}
>> +strbuf_release();
>> +}
>> +
>> +write_message(oid_to_hex(_commit->object.oid), GIT_SHA1_HEXSZ,
>> +  git_path_merge_head(), 0);
>> +write_message("no-ff", 5, git_path_merge_mode(), 0);
>> +
>> +bases = get_merge_bases(head_commit, merge_commit);
>> +for (j = bases; j; j = j->next)
>> +commit_list_insert(j->item, );
>> +free_commit_list(bases);
>> +
>> +read_cache();
>> +init_merge_options();
>> +o.branch1 = "HEAD";
>> +o.branch2 = ref_name.buf;
>> +o.buffer_output = 2;
>> +
>> +ret = merge_recursive(, head_commit, merge_commit, reversed, );
>> +if (!ret)
>> +rerere(opts->allow_rerere_auto);
>> +if (ret <= 0)
>> +fputs(o.obuf.buf, stdout);
>> +strbuf_release();
>> +if (ret < 0) {
>> +strbuf_release(_name);
>> +rollback_lock_file();
>> +return error(_("conflicts while merging '%.*s'"),
>> + merge_arg_len, arg);
>> +}
> 
> If there are conflicts then ret == 0 rather than -1
> 
>> +
>> +if (active_cache_changed &&
>> +

Re: Optimizing writes to unchanged files during merges?

2018-04-13 Thread Linus Torvalds
On Fri, Apr 13, 2018 at 10:39 AM, Stefan Beller  wrote:
>
> Would s/read/xread/ make sense in working_tree_matches ?

Makes sense, yes.

That patch was really more of a RFD than anything that should be applied.

I would like to see the "might be same" flag pushed down so that we
don't do this comparison when it makes no sense, even if the stat info
makes that less of an issue than it might otherwise be,

And maybe that whole function should be taught to do the "verify CE
against file", and do the proper full cache state check (ie time etc)
instead of doing the read.

So maybe the best option is a combination of the two patches: my
"verify the working tree state" _together_ with the was_tracked()
logic that looks up a cache entry.

Again, the problem with "look up the index state" is about files
possibly being renamed as part of the merge, but if we then check the
index state against the actual contents of the working directory, that
isn't an issue.

  Linus


Re: [RFC PATCH] Add "git show --follow-symlinks HEAD:symlink"

2018-04-13 Thread Michael Vogt
On Fri, Apr 13, 2018 at 10:28:13AM -0700, Stefan Beller wrote:
> Hi Michael,
Hi Stefan,

> thanks for the patch,
Thanks for the review.

[..]
> The patch seems reasonable, apart from minor nits:
> In the test we'd prefer no whitespace on the right side of the redirection,
> i.e. echo content >foo

Sure, updated.

> Instead of evaluating git commands in shell and assigning it to a variable,
> we'd prefer to dump it to files:
[..]

Makes sense, updated.

> There is a typo &.

Ups, sorry! Fixed.

> Can we reword the documentation, such that we do not have
> an occurrence of "extended SHA-1" ?
[..]
> Maybe
> 
> Follow symlinks inside the repository when requesting
> objects in extended revision syntax of the form tree-ish:path-in-tree.

This looks very reasonable, I updated the documentation accordingly.

The update patch is attached as an inline attachement.

Cheers,
 Michael
>From dab10f5e5aea8a31cbee0ab1d5a78204c8c9832a Mon Sep 17 00:00:00 2001
From: Michael Vogt 
Date: Mon, 9 Apr 2018 10:38:13 +0200
Subject: [PATCH] support: git show --follow-symlinks HEAD:symlink

Add support for the `--follow-symlinks` options to git-show. This
allows to write:

git show --follow-symlink HEAD:path-a-symlink

to get the content of the symlinked file.

Signed-off-by: Michael Vogt 
---
 Documentation/git-show.txt |  6 +
 builtin/log.c  |  7 --
 revision.c |  2 ++
 revision.h |  1 +
 t/t1800-git-show.sh| 46 ++
 5 files changed, 60 insertions(+), 2 deletions(-)
 create mode 100755 t/t1800-git-show.sh

diff --git a/Documentation/git-show.txt b/Documentation/git-show.txt
index e73ef5401..d9f7fd90c 100644
--- a/Documentation/git-show.txt
+++ b/Documentation/git-show.txt
@@ -39,6 +39,12 @@ OPTIONS
 	For a more complete list of ways to spell object names, see
 	"SPECIFYING REVISIONS" section in linkgit:gitrevisions[7].
 
+--follow-symlinks::
+	Follow symlinks inside the repository when requesting objects
+	in extended revision syntax of the form tree-ish:path-in-tree.
+	Instead of output about the link itself, provide output about
+	the linked-to object.
+
 include::pretty-options.txt[]
 
 
diff --git a/builtin/log.c b/builtin/log.c
index 94ee177d5..e92af4fc7 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -142,7 +142,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 			 struct rev_info *rev, struct setup_revision_opt *opt)
 {
 	struct userformat_want w;
-	int quiet = 0, source = 0, mailmap = 0;
+	int quiet = 0, source = 0, mailmap = 0, follow_symlinks = 0;
 	static struct line_opt_callback_data line_cb = {NULL, NULL, STRING_LIST_INIT_DUP};
 	static struct string_list decorate_refs_exclude = STRING_LIST_INIT_NODUP;
 	static struct string_list decorate_refs_include = STRING_LIST_INIT_NODUP;
@@ -162,6 +162,8 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 		OPT_CALLBACK('L', NULL, _cb, "n,m:file",
 			 N_("Process line range n,m in file, counting from 1"),
 			 log_line_range_callback),
+		OPT_BOOL(0, "follow-symlinks", _symlinks,
+			 N_("follow in-tree symlinks (used when showing file content)")),
 		OPT_END()
 	};
 
@@ -173,9 +175,10 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 			 builtin_log_options, builtin_log_usage,
 			 PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN |
 			 PARSE_OPT_KEEP_DASHDASH);
-
 	if (quiet)
 		rev->diffopt.output_format |= DIFF_FORMAT_NO_OUTPUT;
+	if (follow_symlinks)
+		rev->follow_symlinks = 1;
 	argc = setup_revisions(argc, argv, rev, opt);
 
 	/* Any arguments at this point are not recognized */
diff --git a/revision.c b/revision.c
index b42c836d7..4ab22313f 100644
--- a/revision.c
+++ b/revision.c
@@ -1678,6 +1678,8 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi
 
 	if (revarg_opt & REVARG_COMMITTISH)
 		get_sha1_flags |= GET_OID_COMMITTISH;
+	if (revs && revs->follow_symlinks)
+		get_sha1_flags |= GET_OID_FOLLOW_SYMLINKS;
 
 	if (get_oid_with_context(arg, get_sha1_flags, , ))
 		return revs->ignore_missing ? 0 : -1;
diff --git a/revision.h b/revision.h
index b8c47b98e..060f1038a 100644
--- a/revision.h
+++ b/revision.h
@@ -122,6 +122,7 @@ struct rev_info {
 			first_parent_only:1,
 			line_level_traverse:1,
 			tree_blobs_in_commit_order:1,
+			follow_symlinks:1,
 
 			/* for internal use only */
 			exclude_promisor_objects:1;
diff --git a/t/t1800-git-show.sh b/t/t1800-git-show.sh
new file mode 100755
index 0..85541b4db
--- /dev/null
+++ b/t/t1800-git-show.sh
@@ -0,0 +1,46 @@
+#!/bin/sh
+
+test_description='Test git show works'
+
+. ./test-lib.sh
+
+test_expect_success 'verify git show HEAD:foo works' '
+printf "foo content\n" >foo &&
+git add foo &&
+git commit -m "added foo" &&
+git show HEAD:foo >actual &&
+printf "foo content\n" >expected &&
+test_cmp 

Re: Optimizing writes to unchanged files during merges?

2018-04-13 Thread Stefan Beller
Hi Linus,

On Fri, Apr 13, 2018 at 10:14 AM, Linus Torvalds
 wrote:
>
> Comments? Because considering the problems this code has had, maybe
> "stupid" really is the right approach...

Would s/read/xread/ make sense in working_tree_matches ?
If read returns unexpectedly due to EINTR or EAGAIN,
this is no correctness issue, but you'd have to recompile the kernel.

Stefan


Re: [RFC PATCH] Add "git show --follow-symlinks HEAD:symlink"

2018-04-13 Thread Stefan Beller
Hi Michael,

thanks for the patch,

> Thanks for the intial reivew. I updated the patch with a test and
> documentation for the new option. Happy to merge the test into one of
> the existing test files, I read t/README and greping around I did not
> find a place that looked like a good fit.

I think keeping tests as separate as possible is a good idea.
Looking at the patch https://public-inbox.org/git/20180413094314.GA2404@bod/

The patch seems reasonable, apart from minor nits:
In the test we'd prefer no whitespace on the right side of the redirection,
i.e. echo content >foo

Instead of evaluating git commands in shell and assigning it to a variable,
we'd prefer to dump it to files:

  git show HEAD:symlink >actual &&
  echo foo >expect &&
  test_cmp expect actual

(instead of content=$(git show HEAD:foo) && test $content == ...)

The reason for this is that the &&-chain will inspect the return code
of the git command.

There is a typo &.

Can we reword the documentation, such that we do not have
an occurrence of "extended SHA-1" ?
(By now the Git community came up with a plan to migrate
away from SHA-1, hence we'd not want to introduce more
dependencies even in the form of documentation for that)

Maybe

Follow symlinks inside the repository when requesting
objects in extended revision syntax of the form tree-ish:path-in-tree.

Thanks,
Stefan


Re: Optimizing writes to unchanged files during merges?

2018-04-13 Thread Linus Torvalds
On Fri, Apr 13, 2018 at 12:02 AM, Elijah Newren  wrote:
>
> I hope you don't mind me barging into your conversation

I was getting tired of my own rambling anyway..

> However, it turns out we have this awesome function called
> "was_tracked(const char *path)" that was intended for answering this
> exact question.  So, assuming was_tracked() isn't buggy, the correct
> patch for this problem would look like:

Apparently that causes problems, for some odd reason.

I like the notion of checking the index, but it's not clear that the
index is reliable in the presence of renames either.

>   A big series
> including that patch was merged to master two days ago, but
> unfortunately that exact patch was the one that caused some
> impressively awful fireworks[1].

Yeah, so this code is fragile.

How about we take a completely different approach? Instead of relying
on fragile (but clever) tests, why not rely on stupid brute force?

Yeah, yeah, it's bad to be stupid, but sometimes simple and stupid
really does work.

See the attached patch. It gets rid of all the subtle "has this been
renamed" tests entirely, and just _always_ does that final
update_file().

But instead, it makes update_file() a bit smarter, and says "before we
write this file out, let's see if it's already there and already has
the expected contents"?

Now, it really shouldn't be _quite_ as stupid as that: we should
probably set a flag in the "hey, the oid matches, maybe it's worth
checking", so that it doesn't do the check in the cases where we know
the merge has done things.

But it's actually *fairly* low cost, because before it reads the file
it at least checks that file length matches the expected length (and
that the user permission bits match the expected mode).

So if the file doesn't match, most of the time the real cost will just
be an extra 'open/fstat/close()' sequence. That's pretty cheap.

So even the completely stupid approach is probably not too bad, and it
could be made smarter.

NOTE! I have *NOT* tested this on anything interesting. I tested the
patch on my stupid test-case, but that'[s it. I didn't even bother
re-doing the kernel merge that started this.

Comments? Because considering the problems this code has had, maybe
"stupid" really is the right approach...

[ Ok, I lied. I just tested this on the kernel merge. It worked fine,
and avoided modifying  ]

 Linus
 merge-recursive.c | 42 +-
 1 file changed, 29 insertions(+), 13 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 0c0d48624..ed2200065 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -815,6 +815,32 @@ static int make_room_for_path(struct merge_options *o, const char *path)
 	return err(o, msg, path, _(": perhaps a D/F conflict?"));
 }
 
+static int working_tree_matches(const char *path, const char *buf, unsigned long size, unsigned mode)
+{
+	int fd, matches;
+	struct stat st;
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		return 0;
+	matches = 0;
+	if (!fstat(fd, ) && st.st_size == size && S_ISREG(st.st_mode) && !(0700 & (st.st_mode ^ mode))) {
+		char tmpbuf[1024];
+		while (size) {
+			int n = read(fd, tmpbuf, sizeof(tmpbuf));
+			if (n <= 0 || n > size)
+break;
+			if (memcmp(tmpbuf, buf, n))
+break;
+			buf += n;
+			size -= n;
+		}
+		matches = !size;
+	}
+	close(fd);
+	return matches;
+}
+
 static int update_file_flags(struct merge_options *o,
 			 const struct object_id *oid,
 			 unsigned mode,
@@ -856,6 +882,8 @@ static int update_file_flags(struct merge_options *o,
 size = strbuf.len;
 buf = strbuf_detach(, NULL);
 			}
+			if (working_tree_matches(path, buf, size, mode))
+goto free_buf;
 		}
 
 		if (make_room_for_path(o, path) < 0) {
@@ -1782,20 +1810,8 @@ static int merge_content(struct merge_options *o,
 
 	if (mfi.clean && !df_conflict_remains &&
 	oid_eq(, a_oid) && mfi.mode == a_mode) {
-		int path_renamed_outside_HEAD;
 		output(o, 3, _("Skipped %s (merged same as existing)"), path);
-		/*
-		 * The content merge resulted in the same file contents we
-		 * already had.  We can return early if those file contents
-		 * are recorded at the correct path (which may not be true
-		 * if the merge involves a rename).
-		 */
-		path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
-		if (!path_renamed_outside_HEAD) {
-			add_cacheinfo(o, mfi.mode, , path,
-  0, (!o->call_depth), 0);
-			return mfi.clean;
-		}
+		/* We could set a flag here and pass it to "update_file()" */
 	} else
 		output(o, 2, _("Auto-merging %s"), path);
 


[RFC PATCH] checkout: Force matching mtime between files

2018-04-13 Thread Michał Górny
Currently git does not control mtimes of files being checked out.  This
means that the only assumption you could make is that all files created
or modified within a single checkout action will have mtime between
start time and end time of this checkout.  The relations between mtimes
of different files depend on the order in which they are checked out,
filesystem speed and timestamp precision.

Git repositories sometimes contain both generated and respective source
files.  For example, the Gentoo 'user syncing' repository combines
source ebuild files with generated metadata cache for user convenience.
Ideally, the 'git checkout' would be fast enough that (combined with low
timestamp precision) all files created or modified within a single
checkout would have matching timestamp.  However, in reality the cache
files may end up being wrongly 'older' than source file, requiring
unnecessary recheck.

The opposite problem (cache files not being regenerated when they should
have been) might also occur.  However, it could not be solved without
preserving timestamps, therefore it is out of scope of this change.

In order to avoid unnecessary cache mismatches, force a matching mtime
between all files created by a single checkout action.  This seems to be
the best course of action.  Matching mtimes do not trigger cache
updates.  They also match the concept of 'checkout' being an atomic
action.  Finally, this change does not break backwards compatibility
as the new result is a subset of the possible previous results.

For example, let's say that 'git checkout' creates two files in order,
with respective timestamps T1 and T2.  Before this patch, T1 <= T2.
After this patch, T1 == T2 which also satisfies T1 <= T2.

A similar problem was previously being addressed via git-restore-mtime
tool.  However, that solution is unnecessarily complex for Gentoo's use
case and does not seem to be suitable for 'seamless' integration.

The patch integrates mtime forcing via adding an additional member of
'struct checkout'.  This seemed the simplest way of adding it without
having to modify prototypes and adjust multiple call sites.  The member
is set to the current time in check_updates() function and is afterwards
enforced by checkout_entry().

The patch uses utime() rather than futimes() as that seems to be
the function used everywhere else in git and provided some MinGW
compatibility code.

Signed-off-by: Michał Górny 
---
 cache.h|  1 +
 entry.c| 12 +++-
 unpack-trees.c |  1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index bbaf5c349..9f0a7c867 100644
--- a/cache.h
+++ b/cache.h
@@ -1526,6 +1526,7 @@ struct checkout {
const char *base_dir;
int base_dir_len;
struct delayed_checkout *delayed_checkout;
+   time_t checkout_mtime;
unsigned force:1,
 quiet:1,
 not_new:1,
diff --git a/entry.c b/entry.c
index 2101201a1..7ee5a6d28 100644
--- a/entry.c
+++ b/entry.c
@@ -411,6 +411,7 @@ int checkout_entry(struct cache_entry *ce,
 {
static struct strbuf path = STRBUF_INIT;
struct stat st;
+   int ret;
 
if (topath)
return write_entry(ce, topath, state, 1);
@@ -473,5 +474,14 @@ int checkout_entry(struct cache_entry *ce,
return 0;
 
create_directories(path.buf, path.len, state);
-   return write_entry(ce, path.buf, state, 0);
+   ret = write_entry(ce, path.buf, state, 0);
+
+   if (ret == 0 && state->checkout_mtime != 0) {
+   struct utimbuf t;
+   t.modtime = state->checkout_mtime;
+   if (utime(path.buf, ) < 0)
+   warning_errno("failed utime() on %s", path.buf);
+   }
+
+   return ret;
 }
diff --git a/unpack-trees.c b/unpack-trees.c
index e73745051..e1efefb68 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -346,6 +346,7 @@ static int check_updates(struct unpack_trees_options *o)
state.quiet = 1;
state.refresh_cache = 1;
state.istate = index;
+   state.checkout_mtime = time(NULL);
 
progress = get_progress(o);
 
-- 
2.17.0



Bug: rebase -i creates committer time inversions on 'reword'

2018-04-13 Thread Johannes Sixt
I just noticed that all commits in a 70-commit branch have the same
committer timestamp. This is very unusual on Windows, where rebase -i of
such a long branch takes more than one second (but not more than 3 or
so thanks to the builtin nature of the command!).

And, in fact, if you mark some commits with 'reword' to delay the quick
processing of the patches, then the reworded commits have later time
stamps, but subsequent not reworded commits receive the earlier time
stamp. This is clearly not intended.

Perhaps something like this below is needed.

diff --git a/ident.c b/ident.c
index 327abe557f..2c6bff7b9d 100644
--- a/ident.c
+++ b/ident.c
@@ -178,8 +178,8 @@ const char *ident_default_email(void)
 
 static const char *ident_default_date(void)
 {
-   if (!git_default_date.len)
-   datestamp(_default_date);
+   strbuf_reset(_default_date);
+   datestamp(_default_date);
return git_default_date.buf;
 }
 



Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology

2018-04-13 Thread Phillip Wood

On 12/04/18 23:02, Johannes Schindelin wrote:

Hi Jake,

On Thu, 12 Apr 2018, Jacob Keller wrote:


On Wed, Apr 11, 2018 at 10:42 PM, Sergey Organov  wrote:


Jacob Keller  writes:

On Wed, Apr 11, 2018 at 6:13 AM, Sergey Organov  wrote:

It was rather --recreate-merges just a few weeks ago, and I've seen
nobody actually commented either in favor or against the
--rebase-merges.

git rebase --rebase-merges


I'm going to jump in here and say that *I* prefer --rebase-merges, as
it clearly mentions merge commits (which is the thing that changes).


OK, thanks, it's fair and the first argument in favor of
--rebase-merges I see.


I'd be ok with "--keep-merges" also.


My main argument against --keep-merges is that there is no good short
option for it: -k and -m are already taken. And I really like my `git
rebase -kir` now...

A minor argument in favor of `--rebase-merges` vs `--keep-merges` is that
we do not really keep the merge commits, we rewrite them. In the version
as per this here patch series, we really create recursive merges from
scratch.

In the later patch series on which I am working, we use a variation of
Phillip's strategy which can be construed as a generalization of the
cherry-pick to include merges: for a cherry-pick, we perform a 3-way merge
between the commit and HEAD, with the commit's parent commit as merge
base. With Phillip's strategy, we perform a 3-way merge between the merge
commit and HEAD (i.e. the rebased first parent), with the merge commit's
first parent as merge base, followed by a 3-way merge with the rebased
2nd parent (with the original 2nd parent as merge base), etc

However. This strategy, while it performed well in my initial tests (and
in Buga's initial tests, too), *does* involve more than one 3-way merge,
and therefore it risks something very, very nasty: *nested* merge
conflicts.

Now, I did see nested merge conflicts in the past, very rarely, but that
can happen, when two developers criss-cross merge each others' `master`
branch and are really happy to perform invasive changes that our merge
does not deal well with, such as indentation changes.

When rebasing a merge conflict, however, such nested conflicts can happen
relatively easily. Not rare at all.

I found out about this by doing what I keep preaching in this thred:
theory is often very nice *right* until the point where it hits reality,
and then frequently turns really useless, real quickly. Theoretical
musings can therefore be an utter waste of time, unless accompanied by
concrete examples.


Exactly (that's one reason I've been keeping a low profile on this 
thread since my initial suggestion - I haven't had time to test out any 
examples). Thanks for taking the time to test out the theory



To start, I built on the example for an "evil merge" that I gave already
in the very beginning of this insanely chatty thread: if one branch
changes the signature of a function, and a second branch adds a caller to
that function, then by necessity a merge between those two branches has to
change the caller to accommodate the signature change. Otherwise it would
end up in a broken state.

In my `sequencer-shears` branch at https://github.com/dscho/git, I added
this as a test case, where I start out with a main.c containing a single
function called core(). I then create one branch where this function is
renamed to hi(), and another branch where the function caller() is added
that calls core(). Then I merge both, amending the merge commit so that
caller() now calls hi(). So this is the main.c after merging:

int hi(void) {
printf("Hello, world!\n");
}
/* caller */
void caller(void) {
hi();
}

To create the kind of problems that are all too common in my daily work
(seemingly every time some stable patch in Git for Windows gets
upstreamed, it changes into an incompatible version, causing merge
conflicts, and sometimes not only that... but I digress...), I then added
an "upstream" where some maintainer decided that core() is better called
greeting(), and also that a placeholder function for an event loop should
be added. So in upstream, main.c looks like this:

int greeting(void) {
printf("Hello, world!\n");
}
/* main event loop */
void event_loop(void) {
/* TODO: place holder for now */
}

Keep in mind: while this is a minimal example of disagreeing changes that
may look unrealistic, in practice this is the exact type of problem I am
dealing with on a daily basis, in Git for Windows as well as in GVFS Git
(which adds a thicket of branches on top of Git for Windows) and with the
MSYS2 runtime (where Git for Windows stacks patches on top of MSYs2, which
in turn maintains their set of patches on top of the Cygwin runtime), and
with BusyBox, and probably other projects I forgot spontaneously. This
makes me convinced that 

Re: legal consent to use logo in talk

2018-04-13 Thread Lukas Puehringer
Hi,

I just realized that the logo's licensing information is available
online, and suits our needs.

You can disregard my prior email, and I apologize for the noise on the
mailing list.

Thanks,
Lukas Puehringer

On 4/13/18 5:18 PM, Lukas Puehringer wrote:
> Dear git community,
> 
> I'd like to use the git logo in the slides for a talk about software
> supply chain security at KubeCon + CloudNativeCon 2018 [1].
> 
> The talk will present in-toto [2], a framework to secure the software
> supply chain, developed at New York University, and Grafeas [3], an open
> artifact metadata API to audit and govern software supply chains,
> developed at Google.
> 
> The logo will serve to demonstrate an exemplary software supply chain.
> 
> The legal department of my co-lecturer, mandates to acquire legal
> consent when using logos, which I hereby request.
> 
> Please let me know if you need any additional information, or if you
> would like me to share the slide deck.
> 
> Thanks,
> Lukas Puehringer
> 
> 
> [1] https://kccnceu18.sched.com/event/d5ccae5373cef50d11d502901b1b7eb9
> [2] https://in-toto.io/
> [3] https://grafeas.io/
> 

-- 
lukas.puehrin...@nyu.edu
PGP fingerprint: 8BA6 9B87 D43B E294 F23E  8120 89A2 AD3C 07D9 62E8


Re: [PATCH v6 15/15] rebase -i --rebase-merges: add a section to the man page

2018-04-13 Thread Phillip Wood

On 12/04/18 10:30, Johannes Schindelin wrote:

Hi Phillip,

On Wed, 11 Apr 2018, Phillip Wood wrote:


On 10/04/18 13:30, Johannes Schindelin wrote:

Firstly let me say that I think expanding the documentation and having an
example is an excellent idea.


Thanks! At first, I meant to leave this for others to contribute, but I
think it makes sense for me to describe it, as I do have a little bit of
experience with rebasing merges.


+
+
+label onto
+
+# Branch: refactor-button
+reset onto
+pick 123456 Extract a generic Button class from the DownloadButton one
+pick 654321 Use the Button class for all buttons
+label refactor-button
+
+# Branch: report-a-bug
+reset refactor-button # Use the Button class for all buttons
+pick abcdef Add the feedback button
+label report-a-bug
+
+reset onto
+merge -C a1b2c3 refactor-button # Merge 'refactor-button'
+merge -C 6f5e4d report-a-bug # Merge 'report-a-bug'
+
+
+In contrast to a regular interactive rebase, there are `label`, `reset` and
+`merge` commands in addition to `pick` ones.
+
+The `label` command puts a label to whatever will be the current


s/puts a label to/associates a label with/ would be clearer I think. Maybe
s/whatever will be the current revision/the current HEAD/ an well?


Thanks, I incorporated both changes here.


+revision when that command is executed. Internally, these labels are
+worktree-local refs that will be deleted when the rebase finishes or
+when it is aborted.


I agree they should be deleted when the rebase is aborted but I cannot see any
changes to git-rebase.sh to make that happen. I think they should also be
deleted by 'rebase --quit'.


Oh right! For some reason I thought I already hooked up rebase--helper
--abort when rebase was called with --abort or quit, but I had not managed
yet. I think I will leave this for later, or for GSoC, or something.

In the meantime, I'll just drop the "or when it is aborted.".


That way, rebase operations in multiple worktrees
+linked to the same repository do not interfere with one another.
+
+The `reset` command is essentially a `git reset --hard` to the specified
+revision (typically a previously-labeled one).


s/labeled/labelled/


As Eric pointed out, I am using 'murricane spelling here (or is it
speling? Ya never know these days).


:-)


I think it would be worthwhile to point out that unlike the other commands
this will not preserve untracked files. Maybe something like
"Note that unlike the `pick` or `merge` commands or initial checkout when the
rebase starts the `reset` command will overwrite any untracked files."


You know what? You just pointed out a bug in my thinking. Previously, I
thought that this is impossible, that you cannot overwrite untracked files
because we labeled this revision previously, so the only new files to
write by `reset` were tracked files previous. But that forgets `exec` and
`reset` with unlabeled revisions (e.g. for cousins).

So I changed the `reset` command to refuse overwriting untracked files...


That sounds like the safest plan

Thanks

Phillip


Thank you for improving this patch series!
Dscho





Re: [PATCH v6 15/15] rebase -i --rebase-merges: add a section to the man page

2018-04-13 Thread Phillip Wood

On 11/04/18 20:10, Eric Sunshine wrote:

On Wed, Apr 11, 2018 at 11:35 AM, Phillip Wood
 wrote:

On 10/04/18 13:30, Johannes Schindelin wrote:

+The `reset` command is essentially a `git reset --hard` to the specified
+revision (typically a previously-labeled one).


s/labeled/labelled/


American vs. British English spelling.


Ah, I'd forgotten that the American version only had one 'l'

Thanks

Phillip


CodingGuidelines and SubmittingPatches talk about this. Junio
summarizes the issue well in [1]. The TL;DR is to lean toward the
American English spelling.

[1]: https://public-inbox.org/git/xmqq4m9gpebm@gitster.mtv.corp.google.com/





legal consent to use logo in talk

2018-04-13 Thread Lukas Puehringer
Dear git community,

I'd like to use the git logo in the slides for a talk about software
supply chain security at KubeCon + CloudNativeCon 2018 [1].

The talk will present in-toto [2], a framework to secure the software
supply chain, developed at New York University, and Grafeas [3], an open
artifact metadata API to audit and govern software supply chains,
developed at Google.

The logo will serve to demonstrate an exemplary software supply chain.

The legal department of my co-lecturer, mandates to acquire legal
consent when using logos, which I hereby request.

Please let me know if you need any additional information, or if you
would like me to share the slide deck.

Thanks,
Lukas Puehringer


[1] https://kccnceu18.sched.com/event/d5ccae5373cef50d11d502901b1b7eb9
[2] https://in-toto.io/
[3] https://grafeas.io/
-- 
lukas.puehrin...@nyu.edu
PGP fingerprint: 8BA6 9B87 D43B E294 F23E  8120 89A2 AD3C 07D9 62E8


Re: [PATCH 2/2] daemon: graceful shutdown of client connection

2018-04-13 Thread Johannes Schindelin
Hi Kim,

On Thu, 12 Apr 2018, Kim Gybels wrote:

> On Windows, a connection is shutdown when the last open handle to it is
> closed. When that last open handle is stdout of our child process, an
> abortive shutdown is triggered when said process exits. Ensure a
> graceful shutdown of the client connection by keeping an open handle
> until we detect our child process has finished. This allows all the data
> to be sent to the client, instead of being discarded.

Nice explanation!

> @@ -928,13 +931,13 @@ static void handle(int incoming, struct sockaddr *addr, 
> socklen_t addrlen)
>   }
>  
>   cld.argv = cld_argv.argv;
> - cld.in = incoming;
> + cld.in = dup(incoming);

At first I was worried that somebody might want to remove this in the
future, but then I saw this line (which also calls dup()):

>   cld.out = dup(incoming);
>  
>   if (start_command())
>   logerror("unable to fork");
>   else
> - add_child(, addr, addrlen);
> + add_child(, addr, addrlen, incoming);
>  }
>  
>  static void child_handler(int signo)

Nice work!

I wonder whether you found a reliable way to trigger this? It would be
nice to have a regression test for this.

Ciao,
Dscho


Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll

2018-04-13 Thread Johannes Schindelin
Hi Kim,

On Thu, 12 Apr 2018, Kim Gybels wrote:

> The poll provided in compat/poll.c is not interrupted by receiving
> SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.

Maybe say "When using this poll emulation, use a timeout ..."?

> diff --git a/daemon.c b/daemon.c
> index fe833ea7de..6dc95c1b2f 100644
> --- a/daemon.c
> +++ b/daemon.c
> @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist)
>  {
>   struct pollfd *pfd;
>   int i;
> + int poll_timeout = -1;

Just reuse the line above:

int poll_timeout = -1, i;

> @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
>   int i;
>  
>   check_dead_children();
> -
> - if (poll(pfd, socklist->nr, -1) < 0) {
> +#ifdef NO_POLL
> + poll_timeout = live_children ? 100 : -1;
> +#endif
> + int ret = poll(pfd, socklist->nr, poll_timeout);
> + if  (ret == 0) {
> + continue;
> + } else if (ret < 0) {

I would find it a bit easier on the eyes if this did not use curlies, and
dropped the unnecessary `else` (`continue` will take care of that):

if (!ret)
continue;
if (ret < 0)
[...]

Thank you for working on this!

Ciao,
Dscho


[PATCH v3 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic

2018-04-13 Thread Ben Peart
Update fsmonitor to utilize the new fsexcludes based logic for excluding paths
that do not need to be scaned for new or modified files.  Remove the old logic
in dir.c that utilized the untracked cache (if enabled) to accomplish the same
goal.

Signed-off-by: Ben Peart 
---
 dir.c   | 23 ---
 dir.h   |  2 --
 fsmonitor.c | 21 ++---
 fsmonitor.h | 10 +++---
 t/t7519-status-fsmonitor.sh | 14 +++---
 5 files changed, 16 insertions(+), 54 deletions(-)

diff --git a/dir.c b/dir.c
index 47a073efe1..b1859f4311 100644
--- a/dir.c
+++ b/dir.c
@@ -19,7 +19,6 @@
 #include "varint.h"
 #include "ewah/ewok.h"
 #include "fsexcludes.h"
-#include "fsmonitor.h"
 
 /*
  * Tells read_directory_recursive how a file or directory should be treated.
@@ -1827,20 +1826,14 @@ static int valid_cached_dir(struct dir_struct *dir,
if (!untracked)
return 0;
 
-   /*
-* With fsmonitor, we can trust the untracked cache's valid field.
-*/
-   refresh_fsmonitor(istate);
-   if (!(dir->untracked->use_fsmonitor && untracked->valid)) {
-   if (lstat(path->len ? path->buf : ".", )) {
-   memset(>stat_data, 0, 
sizeof(untracked->stat_data));
-   return 0;
-   }
-   if (!untracked->valid ||
-   match_stat_data_racy(istate, >stat_data, 
)) {
-   fill_stat_data(>stat_data, );
-   return 0;
-   }
+   if (stat(path->len ? path->buf : ".", )) {
+   memset(>stat_data, 0, sizeof(untracked->stat_data));
+   return 0;
+   }
+   if (!untracked->valid ||
+   match_stat_data_racy(istate, >stat_data, )) {
+   fill_stat_data(>stat_data, );
+   return 0;
}
 
if (untracked->check_only != !!check_only)
diff --git a/dir.h b/dir.h
index b0758b82a2..e67ccfbb29 100644
--- a/dir.h
+++ b/dir.h
@@ -139,8 +139,6 @@ struct untracked_cache {
int gitignore_invalidated;
int dir_invalidated;
int dir_opened;
-   /* fsmonitor invalidation data */
-   unsigned int use_fsmonitor : 1;
 };
 
 struct dir_struct {
diff --git a/fsmonitor.c b/fsmonitor.c
index 6d7bcd5d0e..dd67eef851 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -2,6 +2,7 @@
 #include "config.h"
 #include "dir.h"
 #include "ewah/ewok.h"
+#include "fsexcludes.h"
 #include "fsmonitor.h"
 #include "run-command.h"
 #include "strbuf.h"
@@ -125,12 +126,7 @@ static void fsmonitor_refresh_callback(struct index_state 
*istate, const char *n
ce->ce_flags &= ~CE_FSMONITOR_VALID;
}
 
-   /*
-* Mark the untracked cache dirty even if it wasn't found in the index
-* as it could be a new untracked file.
-*/
trace_printf_key(_fsmonitor, "fsmonitor_refresh_callback '%s'", 
name);
-   untracked_cache_invalidate_path(istate, name, 0);
 }
 
 void refresh_fsmonitor(struct index_state *istate)
@@ -184,11 +180,8 @@ void refresh_fsmonitor(struct index_state *istate)
/* Mark all entries invalid */
for (i = 0; i < istate->cache_nr; i++)
istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
-
-   if (istate->untracked)
-   istate->untracked->use_fsmonitor = 0;
}
-   strbuf_release(_result);
+   fsexcludes_init(_result);
 
/* Now that we've updated istate, save the last_update time */
istate->fsmonitor_last_update = last_update;
@@ -207,12 +200,6 @@ void add_fsmonitor(struct index_state *istate)
for (i = 0; i < istate->cache_nr; i++)
istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
 
-   /* reset the untracked cache */
-   if (istate->untracked) {
-   add_untracked_cache(istate);
-   istate->untracked->use_fsmonitor = 1;
-   }
-
/* Update the fsmonitor state */
refresh_fsmonitor(istate);
}
@@ -241,10 +228,6 @@ void tweak_fsmonitor(struct index_state *istate)
 
/* Mark all previously saved entries as dirty */
ewah_each_bit(istate->fsmonitor_dirty, 
fsmonitor_ewah_callback, istate);
-
-   /* Now mark the untracked cache for fsmonitor usage */
-   if (istate->untracked)
-   istate->untracked->use_fsmonitor = 1;
}
 
ewah_free(istate->fsmonitor_dirty);
diff --git a/fsmonitor.h b/fsmonitor.h
index 65f3743636..f7adfc1f7c 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -35,8 +35,7 @@ extern void tweak_fsmonitor(struct index_state *istate);
 
 /*
  * Run the configured fsmonitor integration script and clear the
- * 

[PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files

2018-04-13 Thread Ben Peart
Only minor changes from V2: 

Switched to using get_dtype() instead of DTYPE() for platform independence.
Cleaned up reverting of fsmonitor code in the untracked cache.

Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/709470f33f
Checkout: git fetch https://github.com/benpeart/git fsexcludes-v3 && git 
checkout 709470f33f



### Patches

Ben Peart (2):
  fsexcludes: add a programmatic way to exclude files from git's working
directory traversal logic
  fsmonitor: switch to use new fsexcludes logic and remove unused
untracked cache based logic

 Makefile|   1 +
 dir.c   |  47 +---
 dir.h   |   2 -
 fsexcludes.c| 211 
 fsexcludes.h|  29 +
 fsmonitor.c |  21 +---
 fsmonitor.h |  10 +-
 t/t7519-status-fsmonitor.sh |  14 +--
 8 files changed, 279 insertions(+), 56 deletions(-)
 create mode 100644 fsexcludes.c
 create mode 100644 fsexcludes.h


base-commit: fe0a9eaf31dd0c349ae4308498c33a5c3794b293
-- 
2.17.0.windows.1




[PATCH v3 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic

2018-04-13 Thread Ben Peart
The File System Excludes module is a new programmatic way to exclude files and
folders from git's traversal of the working directory.  fsexcludes_init() should
be called with a string buffer that contains a NUL separated list of path names
of the files and/or directories that should be included.  Any path not listed
will be excluded. The paths should be relative to the root of the working
directory and be separated by a single NUL.

The excludes logic in dir.c has been updated to honor the results of
fsexcludes_is_excluded_from().  If fsexcludes does not exclude the file, the
normal excludes logic is also checked as it could further reduce the set of
files that should be included.

Signed-off-by: Ben Peart 
---
 Makefile |   1 +
 dir.c|  24 +-
 fsexcludes.c | 211 +++
 fsexcludes.h |  29 +++
 4 files changed, 263 insertions(+), 2 deletions(-)
 create mode 100644 fsexcludes.c
 create mode 100644 fsexcludes.h

diff --git a/Makefile b/Makefile
index f181687250..a4f1471272 100644
--- a/Makefile
+++ b/Makefile
@@ -822,6 +822,7 @@ LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-object.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsexcludes.o
 LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
diff --git a/dir.c b/dir.c
index 63a917be45..47a073efe1 100644
--- a/dir.c
+++ b/dir.c
@@ -18,6 +18,7 @@
 #include "utf8.h"
 #include "varint.h"
 #include "ewah/ewok.h"
+#include "fsexcludes.h"
 #include "fsmonitor.h"
 
 /*
@@ -1102,6 +1103,12 @@ int is_excluded_from_list(const char *pathname,
  struct exclude_list *el, struct index_state *istate)
 {
struct exclude *exclude;
+
+   if (*dtype == DT_UNKNOWN)
+   *dtype = get_dtype(NULL, istate, pathname, pathlen);
+   if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype) > 0)
+   return 1;
+
exclude = last_exclude_matching_from_list(pathname, pathlen, basename,
  dtype, el, istate);
if (exclude)
@@ -1317,8 +1324,15 @@ struct exclude *last_exclude_matching(struct dir_struct 
*dir,
 int is_excluded(struct dir_struct *dir, struct index_state *istate,
const char *pathname, int *dtype_p)
 {
-   struct exclude *exclude =
-   last_exclude_matching(dir, istate, pathname, dtype_p);
+   struct exclude *exclude;
+   int pathlen = strlen(pathname);
+
+   if (*dtype_p == DT_UNKNOWN)
+   *dtype_p = get_dtype(NULL, istate, pathname, pathlen);
+   if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype_p) > 
0)
+   return 1;
+
+   exclude = last_exclude_matching(dir, istate, pathname, dtype_p);
if (exclude)
return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
return 0;
@@ -1671,6 +1685,9 @@ static enum path_treatment treat_one_path(struct 
dir_struct *dir,
if (dtype != DT_DIR && has_path_in_index)
return path_none;
 
+   if (fsexcludes_is_excluded_from(istate, path->buf, path->len, dtype) > 
0)
+   return path_excluded;
+
/*
 * When we are looking at a directory P in the working tree,
 * there are three cases:
@@ -2011,6 +2028,9 @@ static enum path_treatment 
read_directory_recursive(struct dir_struct *dir,
/* add the path to the appropriate result list */
switch (state) {
case path_excluded:
+   if (fsexcludes_is_excluded_from(istate, path.buf, 
path.len,
+   get_dtype(cdir.de, istate, path.buf, 
path.len)) > 0)
+   break;
if (dir->flags & DIR_SHOW_IGNORED)
dir_add_name(dir, istate, path.buf, path.len);
else if ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
diff --git a/fsexcludes.c b/fsexcludes.c
new file mode 100644
index 00..0ef57f107b
--- /dev/null
+++ b/fsexcludes.c
@@ -0,0 +1,211 @@
+#include "cache.h"
+#include "fsexcludes.h"
+#include "hashmap.h"
+#include "strbuf.h"
+
+static int fsexcludes_initialized = 0;
+static struct strbuf fsexcludes_data = STRBUF_INIT;
+static struct hashmap fsexcludes_hashmap;
+static struct hashmap parent_directory_hashmap;
+
+struct fsexcludes {
+   struct hashmap_entry ent; /* must be the first member! */
+   const char *pattern;
+   int patternlen;
+};
+
+static unsigned int(*fsexcludeshash)(const void *buf, size_t len);
+static int(*fsexcludescmp)(const char *a, const char *b, size_t len);
+
+static int fsexcludes_hashmap_cmp(const void *unused_cmp_data,
+   const void *a, const void *b, const void *key)
+{
+   const struct fsexcludes *fse1 = a;
+   const struct fsexcludes *fse2 = b;
+
+   return fsexcludescmp(fse1->pattern, fse2->pattern, fse1->patternlen);
+}
+
+static int 

Re: [PATCH v1] fsmonitor: fix incorrect buffer size when printing version number

2018-04-13 Thread Ben Peart



On 4/10/2018 4:17 PM, Eric Sunshine wrote:

On Tue, Apr 10, 2018 at 2:43 PM, Ben Peart  wrote:

This is a trivial bug fix for passing the incorrect size to snprintf() when
outputing the version.  It should be passing the size of the destination buffer


s/outputing/outputting/


rather than the size of the value being printed.

Signed-off-by: Ben Peart 


Junio, do you want a re-roll of this simple fix or can you fix the 
spelling of the commit message?


Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology

2018-04-13 Thread Johannes Schindelin
Hi Jake,

On Thu, 12 Apr 2018, Jacob Keller wrote:

> On Thu, Apr 12, 2018 at 3:02 PM, Johannes Schindelin
>  wrote:
> 
> > [... talking about nested merge conflicts ...]
> >
> > The only way out I can see is to implement some sort of "W merge" or
> > "chandelier merge" that can perform an N-way merge between one revision
> > and N-1 other revisions (each of the N-1 bringing its own merge base). I
> > call them "W" or "chandelier" because such a merge can be visualized by
> > the original merge commit being the center of a chandelier, and each arm
> > representing one of the N-1 merge heads with their own merge bases.
> >
> 
> I think this approach sounds reasonable.

... and it would incidentally also offer a saner way to do octopus merges
(so far, an octopus merge that causes merge conflicts causes... huge
pains, as it usually stops in the middle of everything, without a UI to
help with concluding the merge).

> > Similar to the 3-way merge we have implemented in xdiff/xmerge.c, this
> > "chandelier merge" would then generate the two diffs between merge base
> > and both merge heads, except not only one time, but N-1 times. It would
> > then iterate through all hunks ordered by file name and line range. Any
> > hunk without conflicting changes would be applied as-is, and the remaining
> > ones be turned into conflicts (handling those chandelier arms first where
> > both diffs' hunks look identical).
> >
> > Have I missed any simpler alternative?
> 
> I *think* this would work well if I understand it, but it's difficult
> to process without examples.

Well, I am fairly certain about the implementation details (it's been a
while since I contributed xdiff/xmerge.c, and if you ever want to hear the
horrible story how I wrote the initial version in a stopped train in the
middle of the night, just buy me a beer or three, my memory is fresh on
the "simultaneous walking" of the diff hunks).

So it goes somewhat like this. You have two diffs, and for the matter of
the discussion, let's just look at the hunk headers (with 0 context lines,
i.e. -U0):

diff base..HEAD
@@ -10,1 +10,2 @@
@@ -40,2 +41,0 @@

diff base..branch
@@ -8,4 +8,3 @@

So on one side of the merge, we changed line 10 (e.g. wrapping a long
line), and we removed lines 40 and 41.

In the branch we want to merge, lines 8--11 were edited (removing one
line).

The 3-way merge as implemented in xdiff/xmerge.c handles only one file,
and first uses the diff machinery to figure out the hunk headers of both
diffs, then iterates through both diffs. This is the `while (xscr1 &&
xscr2)` loop in `xdl_do_merge()`, and the "scr" stands for "script" as in
"edit script". In other words, `xscr1` refers to the current hunk in the
first diff, and `xscr2` to the one in the second diff.

Inside the loop, we look whether they overlap. If not, the one with the
smaller line numbers is "applied" and we iterate to the next hunk after
that.

If the hunks overlap, we have a look at the respective post images to see
whether both sides of the merge modified that part identically; if they
don't, we create a conflict (and later, we will try to reduce the conflict
by trimming identially-changed lines at both ends of the line range).

Lather, rinse & repeat.

Now, what I have in mind is that we will have not only two diffs' hunks to
look through, but (N-1)*2 (note that if N == 2, it is the exact same thing
as before).

Again, at each iteration, we look for the next hunk among all available
ones, then determine whether it overlaps with any other hunk. If it does
not, we apply it. If it does, we first look whether all overlapping hunks
agree on the post image and if they do: apply the change, otherwise create
a conflict.

How to present such conflicts to the user, though?

The worst case, I think, would be N diverging changes with N-1 agreeing on
a large part of the post image and the remaining post image being
completely different. Imagine, for example, that the original merge
contains a long function hi() that was renamed to greeting() in HEAD, but
replaced by a completely different implementation in the rebased
branch-to-merge. In such a case, this nested conflict would be most
intuitive, methinks:

<<< intermediate merge
 HEAD
greeting()

hi()
 original merge
... /* original function body */
===
hi()
... /* complete rewrite */
>>> branch

But now that I look at it, it is still hard to parse. *Is* there any good
way to present this conflict?

And then there is the problem that our index really is only prepared for
*three* stages, but we would need N*2-1.

So maybe I am overthinking this and we should stick with the
implementation I have right now (try to merge HEAD and the original merge
first, then merge the rebased 2nd parent if there are no conflicts,
otherwise try the other way round), and simply come up with a *very good*
message to the unfortunate 

Re: [PATCH v2 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic

2018-04-13 Thread Ben Peart



On 4/11/2018 7:52 PM, Junio C Hamano wrote:

@@ -2011,6 +2028,8 @@ static enum path_treatment 
read_directory_recursive(struct dir_struct *dir,
/* add the path to the appropriate result list */
switch (state) {
case path_excluded:
+   if (fsexcludes_is_excluded_from(istate, path.buf, 
path.len, DTYPE(cdir.de)) > 0)
+   break;


Then the use of DTYPE() looks a bit odd here.  On
NO_D_TYPE_IN_DIRENT platforms, we would get DT_UNKNOWN out of it and
then end up passing DT_UNKNOWN to the function.



Good catch.  I was trying to optimize this path and didn't realize the 
platform implications of using DTYPE().  I'll update it to match the others.


Re: File versioning based on shallow Git repositories?

2018-04-13 Thread Johannes Schindelin
Hi Kuba,

On Fri, 13 Apr 2018, Jakub Narebski wrote:

> Hallvard Breien Furuseth  writes:
> 
> > Also maybe it'll be worthwhile to generate .git/info/grafts in a local
> > clone of the repo to get back easily visible history.  No grafts in
> > the original repo, grafts mess things up.
> 
> Just a reminder: modern Git has "git replace", a modern and safe
> alternative to the grafts file.

Right!

Maybe it is time to start deprecating grafts? They *do* cause problems,
such as weird "missing objects" problems when trying to fetch into, or
push from, a repository with grafts. These problems are not shared by the
`git replace` method.

I just sent out a patch to add a deprecation warning.

Ciao,
Dscho


[PATCH] Deprecate support for .git/info/grafts

2018-04-13 Thread Johannes Schindelin
The grafts feature was a convenient way to "stich together" ancient
history to the fresh start of linux.git.

Its implementation is, however, not up to Git's standards, as there are
too many ways where it can lead to surprising and unwelcome behavior.

For example, when pushing from a repository with active grafts, it is
possible to miss commits that have been "grafted out", resulting in a
broken state on the other side.

Also, the grafts feature is limited to "rewriting" commits' list of
parents, it cannot replace anything else.

The much younger feature implemented as `git replace` set out to remedy
those limitations and dangerous bugs.

Seeing as `git replace` is pretty mature by now, it is time to deprecate
support for the graft file, and to retire it eventually.

Signed-off-by: Johannes Schindelin 
---
 advice.c  | 2 ++
 advice.h  | 1 +
 commit.c  | 9 +
 t/t6001-rev-list-graft.sh | 9 +
 4 files changed, 21 insertions(+)

diff --git a/advice.c b/advice.c
index 406efc183ba..4411704fd45 100644
--- a/advice.c
+++ b/advice.c
@@ -19,6 +19,7 @@ int advice_rm_hints = 1;
 int advice_add_embedded_repo = 1;
 int advice_ignored_hook = 1;
 int advice_waiting_for_editor = 1;
+int advice_graft_file_deprecated = 1;
 
 static struct {
const char *name;
@@ -42,6 +43,7 @@ static struct {
{ "addembeddedrepo", _add_embedded_repo },
{ "ignoredhook", _ignored_hook },
{ "waitingforeditor", _waiting_for_editor },
+   { "graftfiledeprecated", _graft_file_deprecated },
 
/* make this an alias for backward compatibility */
{ "pushnonfastforward", _push_update_rejected }
diff --git a/advice.h b/advice.h
index 70568fa7922..9f5064e82a8 100644
--- a/advice.h
+++ b/advice.h
@@ -21,6 +21,7 @@ extern int advice_rm_hints;
 extern int advice_add_embedded_repo;
 extern int advice_ignored_hook;
 extern int advice_waiting_for_editor;
+extern int advice_graft_file_deprecated;
 
 int git_default_advice_config(const char *var, const char *value);
 __attribute__((format (printf, 1, 2)))
diff --git a/commit.c b/commit.c
index ca474a7c112..a96b0a27154 100644
--- a/commit.c
+++ b/commit.c
@@ -12,6 +12,7 @@
 #include "prio-queue.h"
 #include "sha1-lookup.h"
 #include "wt-status.h"
+#include "advice.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char 
*buf, size_t len, const char **);
 
@@ -176,6 +177,14 @@ static int read_graft_file(const char *graft_file)
struct strbuf buf = STRBUF_INIT;
if (!fp)
return -1;
+   if (advice_graft_file_deprecated)
+   advise(_("Support for /info/grafts is deprecated\n"
+"and will be removed in a future Git version.\n"
+"\n"
+"Please use \"git replace --graft [...]\" instead.\n"
+"\n"
+"Turn this message off by running\n"
+"\"git config advice.graftFileDeprecated false\""));
while (!strbuf_getwholeline(, fp, '\n')) {
/* The format is just "Commit Parent1 Parent2 ...\n" */
struct commit_graft *graft = read_graft_line();
diff --git a/t/t6001-rev-list-graft.sh b/t/t6001-rev-list-graft.sh
index 05ddc69cf2a..7504ba47511 100755
--- a/t/t6001-rev-list-graft.sh
+++ b/t/t6001-rev-list-graft.sh
@@ -110,4 +110,13 @@ do
"
 
 done
+
+test_expect_success 'show advice that grafts are deprecated' '
+   git show HEAD 2>err &&
+   test_i18ngrep "git replace" err &&
+   test_config advice.graftFileDeprecated false &&
+   git show HEAD 2>err &&
+   test_i18ngrep ! "git replace" err
+'
+
 test_done

base-commit: fe0a9eaf31dd0c349ae4308498c33a5c3794b293
-- 
2.17.0.windows.1.4.g7e4058d72e3

Published-As: https://github.com/dscho/git/releases/tag/deprecate-grafts-v1
Fetch-It-Via: git fetch https://github.com/dscho/git deprecate-grafts-v1


[PATCH] completion: reduce overhead of clearing cached --options

2018-04-13 Thread SZEDER Gábor
To get the names of all '$__git_builtin_*' variables caching --options
of builtin commands in order to unset them, 8b0eaa41f2 (completion:
clear cached --options when sourcing the completion script,
2018-03-22) runs a 'set |sed s///' pipeline.  This works both in Bash
and in ZSH, but has a higher than necessasry overhead with the extra
processes.

In Bash we can do better: run the 'compgen -v __gitcomp_builtin_'
builtin command, which lists the same variables, but without a
pipeline and 'sed' it can do so with lower overhead.

This change also happens to work around an issue reported by users of
the Powerline shell prompt on macOS, which was triggered by the same
commit 8b0eaa41f2 as well.  Powerline uses several Unicode Private Use
Area code points to represent some of its pretty text UI elements
(arrows and what not), and these are stored in the $PS1 variable.
Apparently the 'set' builtin command of the default Bash version shipped
in macOS (3.2.57) has issues with these code points, and produces
garbled output where Powerline's special symbols should be in the $PS1
variable.  This, in turn, triggers the following error message in the
downstream 'sed' process:

  sed: RE error: illegal byte sequence

Other Bash versions, notably 4.4.19 on macOS (via homebrew) and 3.2.25
on CentOS don't seem to be affected.

With this patch neither the 'set' builtin is invoked to print garbage,
nor 'sed' to choke on it.

Issue-on-macOS-reported-by: Stephon Harris 
Issue-on-macOS-explained-by: Matthew Coleman 
Signed-off-by: SZEDER Gábor 
---
 contrib/completion/git-completion.bash | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index b09c8a2362..4ef59a51be 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -282,7 +282,11 @@ __gitcomp ()
 
 # Clear the variables caching builtins' options when (re-)sourcing
 # the completion script.
-unset $(set |sed -ne 
's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null
+if [[ -n ${ZSH_VERSION-} ]]; then
+   unset $(set |sed -ne 
's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null
+else
+   unset $(compgen -v __gitcomp_builtin_)
+fi
 
 # This function is equivalent to
 #
-- 
2.17.0.366.gbe216a3084



Re: [PATCH v6 05/15] sequencer: introduce the `merge` command

2018-04-13 Thread Phillip Wood
On 10/04/18 13:29, Johannes Schindelin wrote:
> +static int do_merge(struct commit *commit, const char *arg, int arg_len,
> + int flags, struct replay_opts *opts)
> +{
> + int run_commit_flags = (flags & TODO_EDIT_MERGE_MSG) ?
> + EDIT_MSG | VERIFY_MSG : 0;
> + struct strbuf ref_name = STRBUF_INIT;
> + struct commit *head_commit, *merge_commit, *i;
> + struct commit_list *bases, *j, *reversed = NULL;
> + struct merge_options o;
> + int merge_arg_len, oneline_offset, ret;
> + static struct lock_file lock;
> + const char *p;
> +
> + oneline_offset = arg_len;
> + merge_arg_len = strcspn(arg, " \t\n");
> + p = arg + merge_arg_len;
> + p += strspn(p, " \t\n");
> + if (*p == '#' && (!p[1] || isspace(p[1]))) {
> + p += 1 + strspn(p + 1, " \t\n");
> + oneline_offset = p - arg;
> + } else if (p - arg < arg_len)
> + BUG("octopus merges are not supported yet: '%s'", p);
> +
> + strbuf_addf(_name, "refs/rewritten/%.*s", merge_arg_len, arg);
> + merge_commit = lookup_commit_reference_by_name(ref_name.buf);
> + if (!merge_commit) {
> + /* fall back to non-rewritten ref or commit */
> + strbuf_splice(_name, 0, strlen("refs/rewritten/"), "", 0);
> + merge_commit = lookup_commit_reference_by_name(ref_name.buf);
> + }
> + if (!merge_commit) {
> + error(_("could not resolve '%s'"), ref_name.buf);
> + strbuf_release(_name);
> + return -1;
> + }
> +
> + if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
> + return -1;
> +
> + head_commit = lookup_commit_reference_by_name("HEAD");
> + if (!head_commit) {
> + rollback_lock_file();
> + return error(_("cannot merge without a current revision"));
> + }
> +
> + if (commit) {
> + const char *message = get_commit_buffer(commit, NULL);
> + const char *body;
> + int len;
> +
> + if (!message) {
> + rollback_lock_file();
> + return error(_("could not get commit message of '%s'"),
> +  oid_to_hex(>object.oid));
> + }
> + write_author_script(message);
> + find_commit_subject(message, );
> + len = strlen(body);
> + if (write_message(body, len, git_path_merge_msg(), 0) < 0) {
> + error_errno(_("could not write '%s'"),
> + git_path_merge_msg());
> + unuse_commit_buffer(commit, message);
> + rollback_lock_file();
> + return -1;
> + }
> + unuse_commit_buffer(commit, message);
> + } else {
> + struct strbuf buf = STRBUF_INIT;
> + int len;
> +
> + strbuf_addf(, "author %s", git_author_info(0));
> + write_author_script(buf.buf);
> + strbuf_reset();
> +
> + if (oneline_offset < arg_len) {
> + p = arg + oneline_offset;
> + len = arg_len - oneline_offset;
> + } else {
> + strbuf_addf(, "Merge branch '%.*s'",
> + merge_arg_len, arg);
> + p = buf.buf;
> + len = buf.len;
> + }
> +
> + if (write_message(p, len, git_path_merge_msg(), 0) < 0) {
> + error_errno(_("could not write '%s'"),
> + git_path_merge_msg());
> + strbuf_release();
> + rollback_lock_file();
> + return -1;
> + }
> + strbuf_release();
> + }
> +
> + write_message(oid_to_hex(_commit->object.oid), GIT_SHA1_HEXSZ,
> +   git_path_merge_head(), 0);
> + write_message("no-ff", 5, git_path_merge_mode(), 0);
> +
> + bases = get_merge_bases(head_commit, merge_commit);
> + for (j = bases; j; j = j->next)
> + commit_list_insert(j->item, );
> + free_commit_list(bases);
> +
> + read_cache();
> + init_merge_options();
> + o.branch1 = "HEAD";
> + o.branch2 = ref_name.buf;
> + o.buffer_output = 2;
> +
> + ret = merge_recursive(, head_commit, merge_commit, reversed, );
> + if (!ret)
> + rerere(opts->allow_rerere_auto);
> + if (ret <= 0)
> + fputs(o.obuf.buf, stdout);
> + strbuf_release();
> + if (ret < 0) {
> + strbuf_release(_name);
> + rollback_lock_file();
> + return error(_("conflicts while merging '%.*s'"),
> +  merge_arg_len, arg);
> + }

If there are conflicts then ret == 0 rather than -1

> +
> + if (active_cache_changed &&
> + write_locked_index(_index, , COMMIT_LOCK)) {
> + strbuf_release(_name);

Re: [PATCH v6 04/15] sequencer: introduce new commands to reset the revision

2018-04-13 Thread Phillip Wood
On 10/04/18 13:29, Johannes Schindelin wrote:
> In the upcoming commits, we will teach the sequencer to rebase merges.
> This will be done in a very different way from the unfortunate design of
> `git rebase --preserve-merges` (which does not allow for reordering
> commits, or changing the branch topology).
> 
> The main idea is to introduce new todo list commands, to support
> labeling the current revision with a given name, resetting the current
> revision to a previous state, and  merging labeled revisions.
> 
> This idea was developed in Git for Windows' Git garden shears (that are
> used to maintain Git for Windows' "thicket of branches" on top of
> upstream Git), and this patch is part of the effort to make it available
> to a wider audience, as well as to make the entire process more robust
> (by implementing it in a safe and portable language rather than a Unix
> shell script).
> 
> This commit implements the commands to label, and to reset to, given
> revisions. The syntax is:
> 
>   label 
>   reset 
> 
> Internally, the `label ` command creates the ref
> `refs/rewritten/`. This makes it possible to work with the labeled
> revisions interactively, or in a scripted fashion (e.g. via the todo
> list command `exec`).
> 
> These temporary refs are removed upon sequencer_remove_state(), so that
> even a `git rebase --abort` cleans them up.
> 
> We disallow '#' as label because that character will be used as separator
> in the upcoming `merge` command.
> 
> Later in this patch series, we will mark the `refs/rewritten/` refs as
> worktree-local, to allow for interactive rebases to be run in parallel in
> worktrees linked to the same repository.
> 
> Signed-off-by: Johannes Schindelin 

If a label or reset command fails it is likely to be due to a
typo. Rescheduling the command would make it easier for the user to fix
the problem as they can just run 'git rebase --edit-todo'. It also
ensures that the problem has actually been fixed when the rebase
continues. I think you could do it like this

--->8---
From: Phillip Wood 
Subject: [PATCH] fixup! sequencer: introduce new commands to reset the revision

Signed-off-by: Phillip Wood 
---
 sequencer.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index 809df1ce48..e1b9be7327 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3029,6 +3029,13 @@ static int pick_commits(struct todo_list *todo_list, 
struct replay_opts *opts)
} else if (!is_noop(item->command))
return error(_("unknown command %d"), item->command);
 
+   if (res < 0 && (item->command == TODO_LABEL ||
+   item->command == TODO_RESET)) {
+   /* Reschedule */
+   todo_list->current--;
+   save_todo(todo_list, opts);
+   return res;
+   }
todo_list->current++;
if (res)
return res;
-- 
2.17.0


Re: [PATCH v2 07/10] commit-graph.txt: update future work

2018-04-13 Thread Jakub Narebski
Derrick Stolee  writes:

> On 4/12/2018 5:12 AM, Junio C Hamano wrote:
>> Derrick Stolee  writes:
>>
>>> +Here is a diagram to visualize the shape of the full commit graph, and
>>> +how different generation numbers relate:
>>> +
>>> ++-+
>>> +| GENERATION_NUMBER_INFINITY = 0x |
>>> ++-+
>>> +   ||  ^
>>> +   ||  |
>>> +   |+--+
>>> +   | [gen(A) = gen(B)]
>>> +   V
>>> ++-+
>>> +| 0 < commit->generation < 0x4000 |
>>> ++-+
>>> +   ||  ^
>>> +   ||  |
>>> +   |+--+
>>> +   |[gen(A) > gen(B)]
>>> +   V
>>> ++-+
>>> +| GENERATION_NUMBER_ZERO = 0  |
>>> ++-+
>>> +|  ^
>>> +|  |
>>> ++--+
>>> +[gen(A) = gen(B)]
>>
>> It may be just me but all I can read out of the above is that

It's not just you.

>> commit->generation may store 0x, a value between 0 and
>> 0x4000, or 0.  I cannot quite tell what the notation [gen(A)
>>  gen(B)] is trying to say.  I am guessing "Two generation
>> numbers within the 'valid' range can be compared" is what the second
>> one is trying to say, but it is much less interesting to know that
>> two infinities compare equal than how generation numbers from
>> different classes compare, which cannot be depicted in the above
>> notation, I am afraid.  For example, don't we want to say that a
>> commit with INF can never be reached by a commit with a valid
>> generation number, or something like that?
>
> My intention with the arrows was to demonstrate where parent
> relationships can go, and the generation-number relation between a
> commit A with parent B. Clearly, this diagram is less than helpful.

Perhaps the following table would make the information clearer (perhaps
in addition to the above graph, but without "gen(A) {cmp} gen(B)"
arrows).

I assume that it is possible to have both GENERATION_NUMBER_ZERO and non
zero generation numbers in one repo, perhaps via alternates.  I also
assume that A != B, and that generation numbers (both set, and 0s) are
transitivelu closed under reachability.

gen(A) \   commit B ->   | gen(B)
\-\  |
commit A   \ | 0x | larger   | smaller | 0x
\++--+-+
0x   | =>  > >
0 < larger  < 0x4000 | < N  = n> >
0 < smaller < 0x4000 | < N  < N= n   >
0x   | < N  < N< N   =

The "<", "=", ">" denotes result of comparison between gen(A) and gen(B).

Generation numbers create a negative-cut filter: "N" and "n" denote
situation where we know from gen(A) and gen(B) that B is not reachable
from A.

As can be seen if we use gen(A) < gen(B) as cutoff, we don't need to
treat "infinity" and "zero" in a special way.


Best,
-- 
Jakub Narębski


Re: [RFC PATCH] Add "git show --follow-symlinks HEAD:symlink"

2018-04-13 Thread Michael Vogt
Hi Ævar,

thanks for your quick reply!

On Mon, Apr 09, 2018 at 11:28:45AM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Mon, Apr 09 2018, Michael Vogt wrote:
[..]
> > Subject: [PATCH] support: git show --follow-symlinks HEAD:symlink
> >
> > Add support for the `--follow-symlinks` options to git-show. This
> > allows to write:
> >
> > git show --follow-symlink HEAD:path-a-symlink
> 
> The patch looks reasonable, but please submit it as described in
> Documentation/SubmittingPatches, i.e. inline instead of as an
> attachment, and with a signed-off-by line etc. We'd also need some tests
> for this.

Thanks for the intial reivew. I updated the patch with a test and
documentation for the new option. Happy to merge the test into one of
the existing test files, I read t/README and greping around I did not
find a place that looked like a good fit. 

I added the updated patch as an mutt inline attachment now.

Cheers,
 Michael
>From 5a9faa9eff00f316fc654c8e3bc85c3ba56ea659 Mon Sep 17 00:00:00 2001
From: Michael Vogt 
Date: Mon, 9 Apr 2018 10:38:13 +0200
Subject: [PATCH] support: git show --follow-symlinks HEAD:symlink

Add support for the `--follow-symlinks` options to git-show. This
allows to write:

git show --follow-symlink HEAD:path-a-symlink

to get the content of the symlinked file.

Signed-off-by: Michael Vogt 
---
 Documentation/git-show.txt |  6 ++
 builtin/log.c  |  7 +--
 revision.c |  2 ++
 revision.h |  1 +
 t/t1800-git-show.sh| 41 ++
 5 files changed, 55 insertions(+), 2 deletions(-)
 create mode 100755 t/t1800-git-show.sh

diff --git a/Documentation/git-show.txt b/Documentation/git-show.txt
index e73ef5401..fa751c35d 100644
--- a/Documentation/git-show.txt
+++ b/Documentation/git-show.txt
@@ -39,6 +39,12 @@ OPTIONS
 	For a more complete list of ways to spell object names, see
 	"SPECIFYING REVISIONS" section in linkgit:gitrevisions[7].
 
+--follow-symlinks::
+	Follow symlinks inside the repository when requesting objects
+	with extended SHA-1 expressions of the form tree-ish:path-in-tree.
+	Instead of output about the link itself, provide output about
+	the linked-to object.
+
 include::pretty-options.txt[]
 
 
diff --git a/builtin/log.c b/builtin/log.c
index 94ee177d5..e92af4fc7 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -142,7 +142,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 			 struct rev_info *rev, struct setup_revision_opt *opt)
 {
 	struct userformat_want w;
-	int quiet = 0, source = 0, mailmap = 0;
+	int quiet = 0, source = 0, mailmap = 0, follow_symlinks = 0;
 	static struct line_opt_callback_data line_cb = {NULL, NULL, STRING_LIST_INIT_DUP};
 	static struct string_list decorate_refs_exclude = STRING_LIST_INIT_NODUP;
 	static struct string_list decorate_refs_include = STRING_LIST_INIT_NODUP;
@@ -162,6 +162,8 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 		OPT_CALLBACK('L', NULL, _cb, "n,m:file",
 			 N_("Process line range n,m in file, counting from 1"),
 			 log_line_range_callback),
+		OPT_BOOL(0, "follow-symlinks", _symlinks,
+			 N_("follow in-tree symlinks (used when showing file content)")),
 		OPT_END()
 	};
 
@@ -173,9 +175,10 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 			 builtin_log_options, builtin_log_usage,
 			 PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN |
 			 PARSE_OPT_KEEP_DASHDASH);
-
 	if (quiet)
 		rev->diffopt.output_format |= DIFF_FORMAT_NO_OUTPUT;
+	if (follow_symlinks)
+		rev->follow_symlinks = 1;
 	argc = setup_revisions(argc, argv, rev, opt);
 
 	/* Any arguments at this point are not recognized */
diff --git a/revision.c b/revision.c
index b42c836d7..4ab22313f 100644
--- a/revision.c
+++ b/revision.c
@@ -1678,6 +1678,8 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi
 
 	if (revarg_opt & REVARG_COMMITTISH)
 		get_sha1_flags |= GET_OID_COMMITTISH;
+	if (revs && revs->follow_symlinks)
+		get_sha1_flags |= GET_OID_FOLLOW_SYMLINKS;
 
 	if (get_oid_with_context(arg, get_sha1_flags, , ))
 		return revs->ignore_missing ? 0 : -1;
diff --git a/revision.h b/revision.h
index b8c47b98e..060f1038a 100644
--- a/revision.h
+++ b/revision.h
@@ -122,6 +122,7 @@ struct rev_info {
 			first_parent_only:1,
 			line_level_traverse:1,
 			tree_blobs_in_commit_order:1,
+			follow_symlinks:1,
 
 			/* for internal use only */
 			exclude_promisor_objects:1;
diff --git a/t/t1800-git-show.sh b/t/t1800-git-show.sh
new file mode 100755
index 0..86fe8ee02
--- /dev/null
+++ b/t/t1800-git-show.sh
@@ -0,0 +1,41 @@
+#!/bin/sh
+
+test_description='Test git show works'
+
+. ./test-lib.sh
+
+test_expect_success 'verify git show HEAD:foo works' '
+echo "foo content" > foo &&
+git add foo &&
+git commit -m "added foo" &&
+content=$(git show HEAD:foo) &&
+

Re: File versioning based on shallow Git repositories?

2018-04-13 Thread Jakub Narebski
Hallvard Breien Furuseth  writes:

> Also maybe it'll be worthwhile to generate .git/info/grafts in a local
> clone of the repo to get back easily visible history.  No grafts in
> the original repo, grafts mess things up.

Just a reminder: modern Git has "git replace", a modern and safe
alternative to the grafts file.

Best,
-- 
Jakub Narębski


[RFC 07/10] FIXME: wrap-for-bin.sh: set 'core.submodulesFile' for each git invocation

2018-04-13 Thread Antonio Ospite
This is to test custom gitmodules file paths. The default path can be
overridden using the 'GIT_MODULES_FILE' environmental variable.

Maybe In the final patch the option should be set only when running
tests and not unconditionally in the wrapper script, but as a proof of
concept the wrapper script was a convenient location.

Also, in the final patch a fixed custom file path could be used instead
of the environmental variable: to exercise the code it should be enough
to have a value different from the default one.

The change to 't0001-init.sh' is needed to make the test pass, since now
a config is set on the command line.
---
 Makefile| 3 ++-
 t/t0001-init.sh | 1 +
 wrap-for-bin.sh | 2 ++
 3 files changed, 5 insertions(+), 1 deletion(-)
 mode change 100644 => 100755 wrap-for-bin.sh

diff --git a/Makefile b/Makefile
index f18168725..38ee1f6a2 100644
--- a/Makefile
+++ b/Makefile
@@ -2480,7 +2480,8 @@ bin-wrappers/%: wrap-for-bin.sh
@mkdir -p bin-wrappers
$(QUIET_GEN)sed -e '1s|#!.*/sh|#!$(SHELL_PATH_SQ)|' \
 -e 's|@@BUILD_DIR@@|$(shell pwd)|' \
--e 's|@@PROG@@|$(patsubst test-%,t/helper/test-%,$(@F))|' < $< > 
$@ && \
+-e 's|@@PROG@@|$(patsubst test-%,t/helper/test-%,$(@F))|' \
+-e 's|git\"|git\" $$GIT_OPTIONS|' < $< > $@ && \
chmod +x $@
 
 # GNU make supports exporting all variables by "export" without parameters.
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index c413bff9c..6fa3fd24e 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -93,6 +93,7 @@ test_expect_success 'No extra GIT_* on alias scripts' '
sed -n \
-e "/^GIT_PREFIX=/d" \
-e "/^GIT_TEXTDOMAINDIR=/d" \
+   -e "/^GIT_CONFIG_PARAMETERS=/d" \
-e "/^GIT_/s/=.*//p" |
sort
EOF
diff --git a/wrap-for-bin.sh b/wrap-for-bin.sh
old mode 100644
new mode 100755
index 584240881..02bf41cbd
--- a/wrap-for-bin.sh
+++ b/wrap-for-bin.sh
@@ -20,6 +20,8 @@ PATH='@@BUILD_DIR@@/bin-wrappers:'"$PATH"
 
 export GIT_EXEC_PATH GITPERLLIB PATH GIT_TEXTDOMAINDIR
 
+GIT_OPTIONS="-c core.submodulesfile=${GITMODULES_FILE:-.gitmodules}"
+
 if test -n "$GIT_TEST_GDB"
 then
unset GIT_TEST_GDB
-- 
2.17.0



[RFC 09/10] FIXME: submodule: pass custom gitmodules file to 'test-tool submodule-config'

2018-04-13 Thread Antonio Ospite
Add a new option to 't/helper/test-submodule-config.c' to set a custom
path for the gitmodules file.

In particular this is needed to make 't/t7411-submodule-config.sh' pass.

The option is actually put in use by the script that patches the test
suite.
---
 t/helper/test-submodule-config.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/t/helper/test-submodule-config.c b/t/helper/test-submodule-config.c
index 5c6e4b010..30b4ea3da 100644
--- a/t/helper/test-submodule-config.c
+++ b/t/helper/test-submodule-config.c
@@ -25,6 +25,13 @@ int cmd__submodule_config(int argc, const char **argv)
output_url = 1;
if (!strcmp(arg[0], "--name"))
lookup_name = 1;
+   if (!strcmp(arg[0], "--submodules_file")) {
+   arg++;
+   my_argc--;
+   submodules_file = expand_user_path(arg[0], 0);
+   if (!submodules_file)
+   die(_("failed to expand user dir in: '%s'"), 
arg[0]);
+   }
arg++;
my_argc--;
}
-- 
2.17.0



[RFC 08/10] FIXME: submodule: fix t1300-repo-config.sh to take into account the new config

2018-04-13 Thread Antonio Ospite
Tests are run with the 'core.submodulesfile' config set, so
't/t1300-repo-config.sh' needs to be fixed to account for that.

The changes to the HEREDOC lines are temporary and only needed to
support the environmental variable expansion, they could go away
eventually is using a fixed value is good enough.
---
 t/t1300-repo-config.sh | 26 ++
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index e95b1e67d..f672a6c37 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -338,6 +338,7 @@ beta.noindent=sillyValue
 nextsection.nonewline=wow2 for me
 123456.a123=987
 version.1.2.3eX.alpha=beta
+core.submodulesfile=${GITMODULES_FILE:-.gitmodules}
 EOF
 
 test_expect_success 'working --list' '
@@ -345,6 +346,7 @@ test_expect_success 'working --list' '
test_cmp expect output
 '
 cat > expect << EOF
+core.submodulesfile=${GITMODULES_FILE:-.gitmodules}
 EOF
 
 test_expect_success '--list without repo produces empty output' '
@@ -357,6 +359,7 @@ beta.noindent
 nextsection.nonewline
 123456.a123
 version.1.2.3eX.alpha
+core.submodulesfile
 EOF
 
 test_expect_success '--name-only --list' '
@@ -964,10 +967,11 @@ inued
 inued"
 EOF
 
-cat > expect <<\EOF
+cat > expect << EOF
 section.continued=continued
 section.noncont=not continued
 section.quotecont=cont;inued
+core.submodulesfile=${GITMODULES_FILE:-.gitmodules}
 EOF
 
 test_expect_success 'value continued on next line' '
@@ -984,7 +988,7 @@ cat > .git/config <<\EOF
val5
 EOF
 
-cat > expect <<\EOF
+cat > expect << EOF
 section.sub=section.val1
 foo=barQsection.sub=section.val2
 foo
@@ -992,7 +996,8 @@ barQsection.sub=section.val3
 
 
 Qsection.sub=section.val4
-Qsection.sub=section.val5Q
+Qsection.sub=section.val5Qcore.submodulesfile
+${GITMODULES_FILE:-.gitmodules}Q
 EOF
 test_expect_success '--null --list' '
git config --null --list >result.raw &&
@@ -1001,6 +1006,17 @@ test_expect_success '--null --list' '
test_cmp expect result
 '
 
+cat > expect << EOF
+section.sub=section.val1
+foo=barQsection.sub=section.val2
+foo
+barQsection.sub=section.val3
+
+
+Qsection.sub=section.val4
+Qsection.sub=section.val5Q
+EOF
+
 test_expect_success '--null --get-regexp' '
git config --null --get-regexp "val[0-9]" >result.raw &&
nul_to_q result &&
@@ -1495,6 +1511,7 @@ test_expect_success '--show-origin with --list' '
file:.git/configuser.override=local
file:.git/configinclude.path=../include/relative.include
file:.git/../include/relative.include   user.relative=include
+   command line:   
core.submodulesfile=${GITMODULES_FILE:-.gitmodules}
command line:   user.cmdline=true
EOF
git -c user.cmdline=true config --list --show-origin >output &&
@@ -1511,7 +1528,8 @@ test_expect_success '--show-origin with --list --null' '
trueQfile:.git/configQuser.override
localQfile:.git/configQinclude.path

../include/relative.includeQfile:.git/../include/relative.includeQuser.relative
-   includeQcommand line:Quser.cmdline
+   includeQcommand line:Qcore.submodulesfile
+   ${GITMODULES_FILE:-.gitmodules}Qcommand line:Quser.cmdline
trueQ
EOF
git -c user.cmdline=true config --null --list --show-origin >output.raw 
&&
-- 
2.17.0



[RFC 10/10] FIXME: add a hacky script to test the changes with a patched test suite

2018-04-13 Thread Antonio Ospite
Patch all the hardcoded occurrences of '.gitmodules' and make the file
configurable via an environment variable.

This is only for demonstration purposes, the final patch to the test
suite could just use a fixed path for the custom gitmodules file,
matching the path passed in the wrapper script via the
'core.submodulesfile' option.

The rationale would be that testing with a custom gitmodules file covers
also the case of the default gitmodules file path.

Execute 'test-custom-gitmodules-file.sh' to check that the test pass
with either the default anda custom gitmodules file.
---
 test-custom-gitmodules-file.sh | 75 ++
 1 file changed, 75 insertions(+)
 create mode 100755 test-custom-gitmodules-file.sh

diff --git a/test-custom-gitmodules-file.sh b/test-custom-gitmodules-file.sh
new file mode 100755
index 0..d59b1e40d
--- /dev/null
+++ b/test-custom-gitmodules-file.sh
@@ -0,0 +1,75 @@
+#!/bin/sh
+#
+# This is a test script to demonstrate that the changes about
+# 'core.submodulesfile' do not cause regressions and even work as intended
+# when a custom gitmodules file is specified O_O.
+#
+# The script patches the hardcoded '.gitmodules' occurrences to be overridden
+# by an environment variable.
+#
+# Note that 'core.submodulesfile' affects no only the 'submodule' git command
+# but many other git commands that directly or or *indirecly* use the
+# submodules_file variable.
+#
+# In particular all the commands that use unpack_trees(), like 'am', 'clone',
+# etc. are affected.
+#
+# Anyways this is not a problem because the option is passed to all git
+# commands in the "./bin-wrappers/git" wrapper script.
+
+set -e
+
+if [ ! -e .patched_test_suite ];
+then
+
+  echo "Patching the test suite..."
+
+  find t/ -type f -name t7506-status-submodule.sh -exec sed -i \
+-e "s/^cat \(.*\)\\\EOF/cat \1EOF/g" \
+-e "s/\(.*\)\.gitmodules$/\1\${GITMODULES_FILE:-.gitmodules}/g" \
+-e "s/^AA \.gitmodules$/AA \${GITMODULES_FILE:-.gitmodules}/g" \
+{} \;
+
+  find t/ -type f -name t7508-status.sh -exec sed -i \
+-e "s/touch \(.*\)\.gitmodules/touch 
\1\\\"\${GITMODULES_FILE:-.gitmodules}\\\"/g" \
+-e "s/\t\.gitmodules$/\t\${GITMODULES_FILE:-.gitmodules}/g" \
+{} \;
+
+  find t/ -type f -exec sed -i \
+-e "s/git \(.*\)config \(.*\)-f ..\/\.gitmodules/git \1config \2-f 
..\/\'\"\${GITMODULES_FILE:-.gitmodules}\"\'/g" \
+-e "s/git \(.*\)config \(.*\)-f \.gitmodules/git \1config \2-f 
\"\${GITMODULES_FILE:-.gitmodules}\"/g" \
+-e "s/git \(.*\)config \(.*\)--file=\.gitmodules/git \1config 
\2--file=\"\${GITMODULES_FILE:-.gitmodules}\"/g" \
+-e "s/git add \(.*\)\.gitmodules\(.*\)/git add 
\1\"\${GITMODULES_FILE:-.gitmodules}\"\2/g" \
+-e "s/test_cmp expect \.gitmodules/test_cmp expect 
\"\${GITMODULES_FILE:-.gitmodules}\"/g" \
+-e "s/cp .gitmodules pristine-\.gitmodules/cp 
\"\${GITMODULES_FILE:-.gitmodules}\" pristine-gitmodules/g" \
+-e "s/cp pristine-\.gitmodules .gitmodules/cp pristine-gitmodules 
\"\${GITMODULES_FILE:-.gitmodules}\"/g" \
+-e "s/cp \.gitmodules \.gitmodules.bak/cp 
\'\"\${GITMODULES_FILE:-.gitmodules}\"\' .gitmodules.bak/g" \
+-e "s/test-tool submodule-config/test-tool submodule-config 
--submodules_file \'\"\${GITMODULES_FILE:-.gitmodules}\"\'/g" \
+-e 
"s/\$sha1:\.gitmodules/\$sha1:\'\"\${GITMODULES_FILE:-.gitmodules}\"\'/g" \
+-e "s/cat >\.gitmodules/cat >\"\${GITMODULES_FILE:-.gitmodules}\"/g" \
+-e "s/sed \(.*\)<\.gitmodules/sed 
\1<\"\${GITMODULES_FILE:-.gitmodules}\"/g" \
+-e "s/rm \(.*\)\.gitmodules/rm \1\"\${GITMODULES_FILE:-.gitmodules}\"/g" \
+-e "s/echo \(.*\)\.gitmodules/echo 
\1\\\"\${GITMODULES_FILE:-.gitmodules}\\\"/g" \
+-e 
"s/\t\(.*\)\.gitmodules$/\t\1\'\"\${GITMODULES_FILE:-.gitmodules}\"\'/g" \
+-e 
"s/\t\(.*\)\.gitmodules:/\t\1\'\"\${GITMODULES_FILE:-.gitmodules}\"\':/g" \
+-e "s/\t\(.*\)\.gitmodules 
\&\&$/\t\1\'\"\${GITMODULES_FILE:-.gitmodules}\"\' \&\&/g" \
+-e "s/\t\(.*\)\.gitmodules) 
\&\&$/\t\1\'\"\${GITMODULES_FILE:-.gitmodules}\"\') \&\&/g" \
+-e "s/repo\/\.gitmodules /repo\/\'\"\${GITMODULES_FILE:-.gitmodules}\"\' 
/g" \
+-e "s/Submodule sm2 a5a65c9..280969a:/Submodule sm2 a5a65c9..\'\$(git -C 
sm2 rev-list -1 HEAD | cut -c1-7)\':/g" \
+-e "s/diff --git a\/sm2\/\.gitmodules /diff --git 
a\/sm2\/\'\"\${GITMODULES_FILE:-.gitmodules}\"\' /g" \
+-e "s/^M\([ ]\{1,2\}\)\.gitmodules$/M\1\${GITMODULES_FILE:-.gitmodules}/g" 
\
+-e "s/^D\([ ]\{1,2\}\)\.gitmodules$/D\1\${GITMODULES_FILE:-.gitmodules}/g" 
\
+{} \;
+
+make clean
+make
+touch .patched_test_suite
+fi
+
+echo "Running the tests with the default gitmodules file..."
+make test
+
+echo "Running the tests with a custom gitmodules file..."
+GITMODULES_FILE=.gitmodules_custom make test
+
+echo "Done."
-- 
2.17.0



Re: [RFC 00/10] Make .the gitmodules file path configurable

2018-04-13 Thread Antonio Ospite
On Fri, 13 Apr 2018 00:20:37 +0200
Antonio Ospite  wrote:

[...]
> Antonio Ospite (10):
>   submodule: add 'core.submodulesFile' to override the '.gitmodules'
> path
>   submodule: fix getting custom gitmodule file in fetch command
>   submodule: use the 'submodules_file' variable in output messages
>   submodule: document 'core.submodulesFile' and fix references to
> '.gitmodules'
>   submodule: adjust references to '.gitmodules' in comments
>   completion: add 'core.submodulesfile' to the git-completion.bash file
>   XXX: wrap-for-bin.sh: set 'core.submodulesFile' for each git
> invocation
>   XXX: submodule: fix t1300-repo-config.sh to take into account the new
> config
>   XXX: submodule: pass custom gitmodules file to 'test-tool
> submodule-config'
>   XXX: add a hacky script to test the changes with a patched test suite
> 

I am re-sending the last four highly experimental patches changing the
"XXX" into "FIXME" in the subject lines because vger.kernel.org refused
the original ones with the following message:

  The capital Triple-X in subject is way too often associated with junk
  email, please rephrase.

Sorry for the inconvenience.

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?


Re: Optimizing writes to unchanged files during merges?

2018-04-13 Thread Elijah Newren
On Thu, Apr 12, 2018 at 5:01 PM, Linus Torvalds
 wrote:
> [ Still talking to myself. Very soothing. ]
>
> On Thu, Apr 12, 2018 at 4:55 PM, Linus Torvalds
>  wrote:
>> [ Talking to myself ]
>>
>> Did it perhaps mean to say
>>
>> path_renamed_outside_HEAD = path2 && !strcmp(path, path2);
>>
>> instead?
>
> Probably not correct, but making that change makes my test-case work.
>
> It probably breaks something important, though. I didn't look at why
> that code happened in the first place.
>
> But it's nice to know I'm at least looking at the right code while I'm
> talking to myself.

I hope you don't mind me barging into your conversation, but I figured
out some interesting details.

What we want here is that if there are no content conflicts and the
contents match the version of the file from HEAD, and there are no
mode conflicts and the final mode matches what was in HEAD, and there
are no path conflicts (e.g. a rename/rename(1to2) issue or a D/F
conflict putting a directory in the way) and the final path matches
what was already in HEAD, then we can skip the update.

What does this look like in code?  Well, most of it was already there:

if (mfi.clean && !df_conflict_remains &&
oid_eq(, a_oid) && mfi.mode == a_mode) {

That covers everything except "the final path matches what was already
in HEAD".  How do we check for that?  The current code is just making
an approximation; your modification improves the approximation.
However, it turns out we have this awesome function called
"was_tracked(const char *path)" that was intended for answering this
exact question.  So, assuming was_tracked() isn't buggy, the correct
patch for this problem would look like:

-   path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
-   if (!path_renamed_outside_HEAD) {
+   if (was_tracked(path)) {

Turns out that patch was already submitted as c5b761fb
("merge-recursive: ensure we write updates for directory-renamed
file", 2018-02-14).  While that patch was for a different bug, it is
identical to what I would have proposed to fix this bug.  A big series
including that patch was merged to master two days ago, but
unfortunately that exact patch was the one that caused some
impressively awful fireworks[1].  So it, along with the rest of the
series it was part of, was reverted just a few hours later.  As it
turns out, the cause of the problem is that was_tracked() can lie when
renames are involved.  It just hadn't ever bit us yet.

I have a fix for was_tracked(), and 10 additional testcases covering
interesting should-be-skipped and
should-not-be-skipped-but-is-close-to-a-should-be-skipped scenarios,
verifying the skipping actually happens if and only if it should
happen.  That should fix your bug, and the bug with my series.  Rough
WIP can be seen at the testme branch in github.com/newren/git for the
curious, but I need to sleep and have a bunch of other things on my
plate for the next few days.  But I thought I'd at least mention what
I've found so far.

Elijah

[1] https://public-inbox.org/git/xmqqefjm3w1h@gitster-ct.c.googlers.com/