date:20170421

Re: [PATCH v3 09/12] revision.c: --all adds HEAD from all worktrees

2017-04-21 Thread Michael Haggerty

On 04/19/2017 01:01 PM, Nguyễn Thái Ngọc Duy wrote:
> Unless single_worktree is set, --all now adds HEAD from all worktrees.
> 
> Since reachable.c code does not use setup_revisions(), we need to call
> other_head_refs_submodule() explicitly there to have the same effect on
> "git prune", so that we won't accidentally delete objects needed by some
> other HEADs.
> 
> A new FIXME is added because we would need something like
> 
> int refs_other_head_refs(struct ref_store *, each_ref_fn, cb_data);
> 
> in addition to other_head_refs() to handle it, which might require
> 
> int get_submodule_worktrees(const char *submodule, int flags);
> 
> It could be a separate topic to reduce the scope of this one.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  reachable.c  |  1 +
>  refs.c   | 22 ++
>  refs.h   |  1 +
>  revision.c   | 13 +
>  submodule.c  |  2 ++
>  t/t5304-prune.sh | 12 
>  6 files changed, 51 insertions(+)
> 
> diff --git a/reachable.c b/reachable.c
> index a8a979bd4f..a3b938b46c 100644
> --- a/reachable.c
> +++ b/reachable.c
> @@ -177,6 +177,7 @@ void mark_reachable_objects(struct rev_info *revs, int 
> mark_reflog,
>  
>   /* detached HEAD is not included in the list above */
>   head_ref(add_one_ref, revs);
> + other_head_refs(add_one_ref, revs);
>  
>   /* Add all reflog info */
>   if (mark_reflog)
> diff --git a/refs.c b/refs.c
> index 537052f7ba..23e3607674 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1780,3 +1780,25 @@ int rename_ref(const char *oldref, const char *newref, 
> const char *logmsg)
>  {
>   return refs_rename_ref(get_main_ref_store(), oldref, newref, logmsg);
>  }
> +
> +int other_head_refs(each_ref_fn fn, void *cb_data)
> +{
> + struct worktree **worktrees, **p;
> + int ret = 0;
> +
> + worktrees = get_worktrees(0);
> + for (p = worktrees; *p; p++) {
> + struct worktree *wt = *p;
> + struct ref_store *refs;
> +
> + if (wt->is_current)
> + continue;
> +
> + refs = get_worktree_ref_store(wt);
> + ret = refs_head_ref(refs, fn, cb_data);
> + if (ret)
> + break;
> + }
> + free_worktrees(worktrees);
> + return ret;
> +}

This function is mainly about iterating through all worktrees. Therefore
it feels out of place in the refs module. But I don't have a strong
feeling about it.

> diff --git a/refs.h b/refs.h
> index e06db37118..cc71b6c7a0 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -247,6 +247,7 @@ int refs_for_each_remote_ref(struct ref_store *refs,
>each_ref_fn fn, void *cb_data);
>  
>  int head_ref(each_ref_fn fn, void *cb_data);
> +int other_head_refs(each_ref_fn fn, void *cb_data);
>  int for_each_ref(each_ref_fn fn, void *cb_data);
>  int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data);
>  int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data,
> diff --git a/revision.c b/revision.c
> index c329070c89..040a0064f6 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2105,6 +2105,13 @@ static int handle_revision_pseudo_opt(const char 
> *submodule,
>   int argcount;
>  
>   if (submodule) {
> + /*
> +  * We need some something like get_submodule_worktrees()
> +  * before we can go through all worktrees of a submodule,
> +  * .e.g with adding all HEADs from --all, which is not
> +  * supported right now, so stick to single worktree.
> +  */
> + assert(revs->single_worktree != 0);

You don't need `!= 0` above.

We usually don't use `assert(foo)` but rather `if (!foo)
die("BUG:...")`, which gives a better error message and isn't switched
off if a distribution compiles with NDEBUG.

But here I'm confused about whether this check failing really indicates
a bug within git, or whether it could indicate a situation that the user
has set up that git just can't handle right now. For example, maybe the
user will set up a submodule that itself uses worktrees (even if that's
not possible now, maybe they will have done it with some future version
of git or with another tool). If the problem is only that this version
of git is too stupid to handle pruning safely in that situation, it
would be more appropriate to use something more like

if (!refs->single_worktree)
die("error: git is currently unable to handle submodules that 
use
linked worktrees");

>   refs = get_submodule_ref_store(submodule);
>   } else
>   refs = get_main_ref_store();
> [...]

Michael

[PATCH v2] status: add color config slots for branch info in "--short --branch"

2017-04-21 Thread Stephen Kent

Add color config slots to be used in the status short-format when
displaying local and remote tracking branch information.

Signed-off-by: Stephen Kent 
---
 Documentation/config.txt   | 5 -
 builtin/commit.c   | 4 
 contrib/completion/git-completion.bash | 2 ++
 t/t7508-status.sh  | 5 +++--
 4 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874..96e9cf8 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1137,7 +1137,10 @@ color.status.::
`untracked` (files which are not tracked by Git),
`branch` (the current branch),
`nobranch` (the color the 'no branch' warning is shown in, defaulting
-   to red), or
+   to red),
+   `localBranch` or `remoteBranch` (the local and remote branch names,
+   respectively, when branch and tracking information is displayed in the
+   status short-format), or
`unmerged` (files which have unmerged changes).
 
 color.ui::
diff --git a/builtin/commit.c b/builtin/commit.c
index 4e288bc..43846d5 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1263,6 +1263,10 @@ static int parse_status_slot(const char *slot)
return WT_STATUS_NOBRANCH;
if (!strcasecmp(slot, "unmerged"))
return WT_STATUS_UNMERGED;
+   if (!strcasecmp(slot, "localBranch"))
+   return WT_STATUS_LOCAL_BRANCH;
+   if (!strcasecmp(slot, "remoteBranch"))
+   return WT_STATUS_REMOTE_BRANCH;
return -1;
 }
 
diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 1150164..f0542b6 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2377,7 +2377,9 @@ _git_config ()
color.status.added
color.status.changed
color.status.header
+   color.status.localBranch
color.status.nobranch
+   color.status.remoteBranch
color.status.unmerged
color.status.untracked
color.status.updated
diff --git a/t/t7508-status.sh b/t/t7508-status.sh
index fb00e6d..7d42085 100755
--- a/t/t7508-status.sh
+++ b/t/t7508-status.sh
@@ -610,7 +610,8 @@ test_expect_success 'status --porcelain ignores relative 
paths setting' '
 test_expect_success 'setup unique colors' '
 
git config status.color.untracked blue &&
-   git config status.color.branch green
+   git config status.color.branch green &&
+   git config status.color.localBranch yellow
 
 '
 
@@ -675,7 +676,7 @@ test_expect_success 'status -s with color.status' '
 '
 
 cat >expect <<\EOF
-## master
+## master
  M dir1/modified
 A  dir2/added
 ?? dir1/untracked
-- 
2.10.2

Re: [PATCH] Add color slots for branch names in "git status --short --branch"

2017-04-21 Thread Stephen Kent


Hi Peff and Junio,

I've updated the commit message and updated one of the existing unit 
tests for this feature. Patch version 2 will follow shortly after this 
email.


Responses to both of your comments:

I wondered if this "short-format" was accurate. But indeed, we do not 
seem to color the local/remote branch specially in long-format mode, 
so it really is only the short format that is affected.


Right, the hardcoded red and green colors only seem to be used for the 
branch and remote tracking branch names (and commit counts) in the 
status short-format.


There is an existing color slot "color.status.branch" for the branch 
name in the default (long) status format which is different than the new 
color config slots this patch adds.


I'm wondering if it makes sense to also use color.status.branch for the 
local branch color in short-format. On the other hand, I have configured 
different colors in the short and long status format for the local 
branch name and I find it useful for them to be separate color slots.


Normally we match config names in the code as all lowercase, since the 
key names we get from the config parser will be normalized. Here it 
works with your mixed-case because you're using strcasecmp(). 
Obviously that was picked up from the surrounding code, but I think 
those existing strcasecmp() calls could (and perhaps should) just be 
strcmp().


I don't know if it's worth converting them or not. If we leave them 
all as strcasecmp(), I don't mind your camelCase names, for 
readability.


I chose the localBranch and remoteBranch camel case names for 
consistency with the existing "color.decorate.remoteBranch" color config 
slot in log-tree.c. The documentation for color.decorate.remoteBranch 
uses that camel case name, but the config option is case-insensitive. 

I'm happy to use whatever names are best for the short status branch 
name color slots. Let me know if I should change them and what I should 
replace them with.


I know we do not test color.status. at all (other than t4026 
that makes sure a configuration from future version of Git that 
specifies a slot that is not yet known to our version of Git is safely 
ignored without triggering an error), but perhaps we would want a new 
test or two at the end of t7508?


I see the existing tests for git status in t7508. The unit tests set up 
some mock repository modifications to test git status output, so I've 
modified one of the tests to include a custom color for the local branch 
in git status -sb.


t7508 doesn't seem to contain any tests that include an ahead or behind 
commit count, so I didn't make any test changes for the remote tracking 
branch color. What's the best course of action here?


Thanks!

Stephen



On Wed, Apr 19, 2017 at 23:30, Junio C Hamano wrote:

Stephen Kent  writes:

Subject: Re: [PATCH] Add color slots for branch names in "git status 
--short --branch"


We spell one-liner title of our commits as ": " 
typically.  In this case, this is about the output from the status 
command, so


status: make the color used "--shrot --branch" output configurable

or something, perhaps?


Signed-off-by: Stephen Kent 
---
 Documentation/config.txt   | 5 -
 builtin/commit.c   | 4  
 contrib/completion/git-completion.bash | 2 ++

 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt 
index 475e874..96e9cf8 100644

--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1137,7 +1137,10 @@ color.status.::
`untracked` (files which are not tracked by Git),
`branch` (the current branch),
 	`nobranch` (the color the 'no branch' warning is shown in, 
 	defaulting -	to red), or

+   to red),
+   `localBranch` or `remoteBranch` (the local and remote branch
names, +	respectively, when branch and tracking information is 
displayed in the +	status short-format), or

`unmerged` (files which have unmerged changes).


OK.


 color.ui::
diff --git a/builtin/commit.c b/builtin/commit.c
index 4e288bc..43846d5 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1263,6 +1263,10 @@ static int parse_status_slot(const char *slot) 
return WT_STATUS_NOBRANCH;

if (!strcasecmp(slot, "unmerged"))
return WT_STATUS_UNMERGED;
+   if (!strcasecmp(slot, "localBranch"))
+   return WT_STATUS_LOCAL_BRANCH;
+   if (!strcasecmp(slot, "remoteBranch"))
+   return WT_STATUS_REMOTE_BRANCH;
return -1;
 }


OK.

I know we do not test color.status. at all (other than t4026 
that makes sure a configuration from future version of Git that 
specifies a slot that is not yet known to our version of Git is

safely ignored without triggering an error), but perhaps we would
want a new test or two at the end of t7508?

Thanks.

Re: [PATCH v3 08/12] refs: remove dead for_each_*_submodule()

2017-04-21 Thread Michael Haggerty

On 04/19/2017 01:01 PM, Nguyễn Thái Ngọc Duy wrote:
> These are used in revision.c. After the last patch they are replaced
> with the refs_ version. Delete them (except for_each_remote_ref_submodule
> which is still used by submodule.c)

❤❤

I love the way this is going.

Michael

Re: [PATCH v3 05/12] refs: move submodule slash stripping code to get_submodule_ref_store

2017-04-21 Thread Michael Haggerty

On 04/19/2017 01:01 PM, Nguyễn Thái Ngọc Duy wrote:
> This is a better place that will benefit all submodule callers instead
> of just resolve_gitlink_ref()
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  refs.c | 33 +
>  1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 5902a3d9e5..26474cb62a 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1422,25 +1422,10 @@ const char *resolve_ref_unsafe(const char *refname, 
> int resolve_flags,
>  int resolve_gitlink_ref(const char *submodule, const char *refname,
>   unsigned char *sha1)
>  {
> - size_t len = strlen(submodule);
>   struct ref_store *refs;
>   int flags;
>  
> - while (len && submodule[len - 1] == '/')
> - len--;
> -
> - if (!len)
> - return -1;
> -
> - if (submodule[len]) {
> - /* We need to strip off one or more trailing slashes */
> - char *stripped = xmemdupz(submodule, len);
> -
> - refs = get_submodule_ref_store(stripped);
> - free(stripped);
> - } else {
> - refs = get_submodule_ref_store(submodule);
> - }
> + refs = get_submodule_ref_store(submodule);
>  
>   if (!refs)
>   return -1;
> @@ -1558,7 +1543,17 @@ struct ref_store *get_submodule_ref_store(const char 
> *submodule)
>  {
>   struct strbuf submodule_sb = STRBUF_INIT;
>   struct ref_store *refs;
> + char *to_free = NULL;
>   int ret;
> + size_t len;
> +
> + if (submodule) {
> + len = strlen(submodule);
> + while (len && submodule[len - 1] == '/')
> + len--;
> + if (!len)
> + submodule = NULL;
> + }

Ugh. Should a submodule named "///" *really* be considered to refer to
the main ref_store? I understand that's what the code did before this
patch, but it seems to me more like an accident of the old design rather
than something worth supporting. In other words, if a caller would
really pass us such a string, it seems like we could declare the caller
buggy, no?

> [...]

Otherwise, looks good and makes a lot of sense.

Michael

Re: [PATCH v3 04/12] refs.c: refactor get_submodule_ref_store(), share common free block

2017-04-21 Thread Michael Haggerty

On 04/19/2017 01:01 PM, Nguyễn Thái Ngọc Duy wrote:
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  refs.c | 12 +---
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 5d31fb6bcf..5902a3d9e5 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1570,19 +1570,16 @@ struct ref_store *get_submodule_ref_store(const char 
> *submodule)
>  
>   refs = lookup_ref_store_map(_ref_stores, submodule);
>   if (refs)
> - return refs;
> + goto done;
>  
>   strbuf_addstr(_sb, submodule);
>   ret = is_nonbare_repository_dir(_sb);
> - strbuf_release(_sb);
>   if (!ret)
> - return NULL;
> + goto done;
>  
>   ret = submodule_to_gitdir(_sb, submodule);
> - if (ret) {
> - strbuf_release(_sb);
> - return NULL;
> - }
> + if (ret)
> + goto done;

After this change, the temporary variable `ret` could be eliminated.

> [...]

Michael

Re: [PATCH v4 0/5] Kill manual ref parsing code in worktree.c

2017-04-21 Thread Michael Haggerty

On 04/14/2017 03:40 AM, Junio C Hamano wrote:
> Nguyễn Thái Ngọc Duy   writes:
> 
>> v4 adds a new patch, 2/5, which makes the hashmap related functions in
>> refs.c generic, so I could add a new map for worktree ref stores and
>> not abuse submodule hashmap.
>>
>> I mentioned about moving these hashmap back to submodule.c and
>> worktree.c where it can map more than just ref stores (in worktree
>> case, we can cache 'struct worktree', for example). But I'm not doing
>> it now to keep things small.
>>
>> The commit message in 1/5 is rephrased a bit, hopefully clearer.
> 
> Michael, does this look good to replace what has been queued?

I finally reviewed this patch series. The refs-related changes look
fine, and the submodule-related changes (which I'm not so familiar with)
looks plausible.

It's a nice cleanup :-)

Michael

Re: [PATCH v4 3/5] refs: introduce get_worktree_ref_store()

2017-04-21 Thread Michael Haggerty

On 04/04/2017 12:21 PM, Nguyễn Thái Ngọc Duy wrote:
> files-backend at this point is still aware of the per-repo/worktree
> separation in refs, so it can handle a linked worktree.
> 
> Some refs operations are known not working when current files-backend is
> used in a linked worktree (e.g. reflog). Tests will be written when
> refs_* functions start to be called with worktree backend to verify that
> they work as expected.
> 
> Note: accessing a worktree of a submodule remains unaddressed. Perhaps
> after get_worktrees() can access submodule (or rather a new function
> get_submodule_worktrees(), that lists worktrees of a submodule), we can
> update this function to work with submodules as well.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  refs.c | 31 +++
>  refs.h |  2 ++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/refs.c b/refs.c
> index 875e30a0b8..a4083caf6a 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -10,6 +10,7 @@
>  #include "object.h"
>  #include "tag.h"
>  #include "submodule.h"
> +#include "worktree.h"
>  
>  /*
>   * List of all available backends
> @@ -1486,6 +1487,9 @@ static struct ref_store *main_ref_store;
>  /* A hashmap of ref_stores, stored by submodule name: */
>  static struct hashmap submodule_ref_stores;
>  
> +/* A hashmap of ref_stores, stored by worktree id: */
> +static struct hashmap worktree_ref_stores;
> +
>  /*
>   * Look up a ref store by name. If that ref_store hasn't been
>   * registered yet, return NULL.
> @@ -1590,6 +1594,33 @@ struct ref_store *get_submodule_ref_store(const char 
> *submodule)
>   return refs;
>  }
>  
> +struct ref_store *get_worktree_ref_store(const struct worktree *wt)
> +{
> + struct ref_store *refs;
> + unsigned int refs_all_capabilities =
> + REF_STORE_READ | REF_STORE_WRITE |
> + REF_STORE_ODB | REF_STORE_MAIN;

This constant appears another place, too. It might make sense to define
a constant `REF_STORE_ALL_CAPABILITIES` in `refs-internal.h` alongside
the individual bit values.

If you prefer not to, please at least declare this variable `const` to
spare the reader the trouble of looking to see whether it is modified
before it is used.

Otherwise, looks fine to me.

> [...]

Michael

[PATCH] git_fopen: fix a sparse 'not declared' warning

2017-04-21 Thread Ramsay Jones


Commit 8f4f6e53d2 ("config.mak.uname: set FREAD_READS_DIRECTORIES for
Linux and FreeBSD", 20-04-2017) caused sparse to issue a 'not declared,
should it be static?' warning on Linux. This is a result of the method
employed by 'compat/fopen.c' to suppress the (possible) redefinition of
the (system) fopen macro, which also removes the extern declaration of
the git_fopen function.

In order to suppress the warning, introduce a new macro to suppress the
definition (or possibly the re-definition) of the fopen symbol as a
macro override. This new macro (SUPPRESS_FOPEN_REDEFINITION) is only
defined in 'compat/fopen.c', just prior to the inclusion of the
'git-compat-util.h' header file.

Signed-off-by: Ramsay Jones 
---

Hi Duy,

Could you (or Junio) please add this to your 'nd/fopen-errors' branch, either
as a separate patch or squash it into commit 8f4f6e53d2 ("config.mak.uname: set 
FREAD_READS_DIRECTORIES for Linux and FreeBSD", 20-04-2017). I think it would
be better as a separate commit, but I will leave you to decide on that. ;-)

Thanks!

ATB,
Ramsay Jones

 compat/fopen.c|  4 ++--
 git-compat-util.h | 10 ++
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/compat/fopen.c b/compat/fopen.c
index b5ca142fe..107b3e818 100644
--- a/compat/fopen.c
+++ b/compat/fopen.c
@@ -1,14 +1,14 @@
 /*
  *  The order of the following two lines is important.
  *
- *  FREAD_READS_DIRECTORIES is undefined before including git-compat-util.h
+ *  SUPPRESS_FOPEN_REDEFINITION is defined before including git-compat-util.h
  *  to avoid the redefinition of fopen within git-compat-util.h. This is
  *  necessary since fopen is a macro on some platforms which may be set
  *  based on compiler options. For example, on AIX fopen is set to fopen64
  *  when _LARGE_FILES is defined. The previous technique of merely undefining
  *  fopen after including git-compat-util.h is inadequate in this case.
  */
-#undef FREAD_READS_DIRECTORIES
+#define SUPPRESS_FOPEN_REDEFINITION
 #include "../git-compat-util.h"
 
 FILE *git_fopen(const char *path, const char *mode)
diff --git a/git-compat-util.h b/git-compat-util.h
index b1e48e5e9..691ebf370 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -698,10 +698,12 @@ char *gitstrdup(const char *s);
 #endif
 
 #ifdef FREAD_READS_DIRECTORIES
-#ifdef fopen
-#undef fopen
-#endif
-#define fopen(a,b) git_fopen(a,b)
+# if !defined(SUPPRESS_FOPEN_REDEFINITION)
+#  ifdef fopen
+#   undef fopen
+#  endif
+#  define fopen(a,b) git_fopen(a,b)
+# endif
 extern FILE *git_fopen(const char*, const char*);
 #endif
 
-- 
2.12.0

[PATCH] worktree: fix an 'Using integer as a NULL pointer' warning

2017-04-21 Thread Ramsay Jones


Signed-off-by: Ramsay Jones 
---

Hi Duy,

If you need to re-roll your 'nd/worktree-move' branch could you please
squash this into the relevant patch (commit 8a1dc92e91 ("worktree move:
refuse to move worktrees with submodules", 20-04-2017)).

Thanks!

ATB,
Ramsay Jones

 builtin/worktree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index ff9ccb57f..c54abd5b8 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -534,7 +534,7 @@ static int unlock_worktree(int ac, const char **av, const 
char *prefix)
 
 static void validate_no_submodules(const struct worktree *wt)
 {
-   struct index_state istate = {0};
+   struct index_state istate = { NULL };
int i, found_submodules = 0;
 
if (read_index_from(, worktree_git_path(wt, "index")) > 0) {
-- 
2.12.0

Re: [GIT 2.12.2 REGRESSION] git cherry-pick -x

2017-04-21 Thread Jeff King

On Fri, Apr 21, 2017 at 03:10:03PM -0700, Stefan Beller wrote:

> +Jonathan, who worked on trailers
> [...]
> > I haven't tried bisecting precisely, but somewhere along the way git
> > cherry-pick -x regressed between 2.9.3 and 2.12.2.

Yeah, it bisects to 967dfd4d5 (sequencer: use trailer's trailer layout,
2016-11-02).

-Peff

Re: [GIT 2.12.2 REGRESSION] git cherry-pick -x

2017-04-21 Thread Stefan Beller

+Jonathan, who worked on trailers

On Fri, Apr 21, 2017 at 3:01 PM, Brian Norris
 wrote:
> Hi all,
>
> I haven't tried bisecting precisely, but somewhere along the way git
> cherry-pick -x regressed between 2.9.3 and 2.12.2.
>
> Looking at upstream linux.git, I can do:
>
> git fetch git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
> git cherry-pick -x 7b309aef0463340d3ad5449d1f605d14e10a4225
>
> And see the following ending to the new commit:
>
> Acked-by: Arnd Bergmann (cherry picked from commit 
> 7b309aef0463340d3ad5449d1f605d14e10a4225)
>
> On 2.9.3, this worked fine:
>
> Acked-by: Arnd Bergmann 
> (cherry picked from commit 7b309aef0463340d3ad5449d1f605d14e10a4225)
>
> This issue doesn't happen on all commits; maybe it's bad parsing looking
> at 'Signed-off' and the like?
>
> Regards,
> Brian

Re: [PATCH] t/perf: correctly align non-ASCII descriptions in output

2017-04-21 Thread Jeff King

On Sat, Apr 22, 2017 at 12:02:02AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Yeah, I know "use utf8" doesn't work for that, but I was thinking there
> > was some other trick. Digging...ah, here it is:
> >
> >   use open ':encoding(utf8)'
> >
> > No clue how portable that is. For such a small script it may be better
> > to just stick with vanilla binmode().
> 
> Yeah that would work, but doesn't work on 5.8.0, which is the lowest
> version we support.

Thanks for looking that up.

For the benefit of the maintainer, I'll summarize: Ævar's original patch
is fine as-is.

-Peff

Re: [PATCH] t/perf: correctly align non-ASCII descriptions in output

2017-04-21 Thread Ævar Arnfjörð Bjarmason

On Fri, Apr 21, 2017 at 11:35 PM, Jeff King  wrote:
> On Fri, Apr 21, 2017 at 11:28:42PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > I thought there was some "use" flag we could set to just make all of our
>> > handles utf8. But all I could come up with was stuff like PERLIO and
>> > "perl -C". Using binmode isn't too bad, though (I think you could
>> > just do it as part of the open, too, but I'm not sure if antique
>> > versions of perl support that).
>>
>> [Debugging perl encoding issues is one of the many perks of my dayjob]
>>
>> Using binmode like this is about as straightforward as you can get,
>> the former occurrence could be equivalently replaced by:
>>
>> utf8::decode(my $line = <$fh>);
>>
>> But better just to mark the handle as utf8. There's a fancier way to
>> do it as part of the three-arg-open syntax, but I couldn't remember
>> whether all the perl versions we support have it.
>
> Yeah, I agree marking the handle is better. binmode is pretty
> straightforward, but we'd have to remember to manually set it if we add
> any other handles. That's probably not a big deal in this particular
> script, though, which is pretty short.
>
>> About the "use" flag, you're probably thinking of the confusingly
>> named "use utf8", but that's to set your source code to utf8, not your
>> handles, e.g.:
>>
>> $ perl -CA -MDevel::Peek -wE 'use utf8; my $日本語 = shift; Dump $日本語' æ
>> SV = PV(0x12cc090) at 0x12cded8
>>   REFCNT = 1
>>   FLAGS = (PADMY,POK,pPOK,UTF8)
>>   PV = 0x12de460 "\303\246"\0 [UTF8 "\x{e6}"]
>>   CUR = 2
>>   LEN = 16
>>
>> As you can see people got a bit overexcited about Unicode in the 90s.
>
> Yeah, I know "use utf8" doesn't work for that, but I was thinking there
> was some other trick. Digging...ah, here it is:
>
>   use open ':encoding(utf8)'
>
> No clue how portable that is. For such a small script it may be better
> to just stick with vanilla binmode().

Yeah that would work, but doesn't work on 5.8.0, which is the lowest
version we support.

[GIT 2.12.2 REGRESSION] git cherry-pick -x

2017-04-21 Thread Brian Norris

Hi all,

I haven't tried bisecting precisely, but somewhere along the way git
cherry-pick -x regressed between 2.9.3 and 2.12.2.

Looking at upstream linux.git, I can do:

git fetch git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
git cherry-pick -x 7b309aef0463340d3ad5449d1f605d14e10a4225

And see the following ending to the new commit:

Acked-by: Arnd Bergmann (cherry picked from commit 
7b309aef0463340d3ad5449d1f605d14e10a4225)

On 2.9.3, this worked fine:

Acked-by: Arnd Bergmann 
(cherry picked from commit 7b309aef0463340d3ad5449d1f605d14e10a4225)

This issue doesn't happen on all commits; maybe it's bad parsing looking
at 'Signed-off' and the like?

Regards,
Brian

Re: [PATCH] t/perf: correctly align non-ASCII descriptions in output

2017-04-21 Thread Jeff King

On Fri, Apr 21, 2017 at 11:28:42PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I thought there was some "use" flag we could set to just make all of our
> > handles utf8. But all I could come up with was stuff like PERLIO and
> > "perl -C". Using binmode isn't too bad, though (I think you could
> > just do it as part of the open, too, but I'm not sure if antique
> > versions of perl support that).
> 
> [Debugging perl encoding issues is one of the many perks of my dayjob]
> 
> Using binmode like this is about as straightforward as you can get,
> the former occurrence could be equivalently replaced by:
> 
> utf8::decode(my $line = <$fh>);
> 
> But better just to mark the handle as utf8. There's a fancier way to
> do it as part of the three-arg-open syntax, but I couldn't remember
> whether all the perl versions we support have it.

Yeah, I agree marking the handle is better. binmode is pretty
straightforward, but we'd have to remember to manually set it if we add
any other handles. That's probably not a big deal in this particular
script, though, which is pretty short.

> About the "use" flag, you're probably thinking of the confusingly
> named "use utf8", but that's to set your source code to utf8, not your
> handles, e.g.:
> 
> $ perl -CA -MDevel::Peek -wE 'use utf8; my $日本語 = shift; Dump $日本語' æ
> SV = PV(0x12cc090) at 0x12cded8
>   REFCNT = 1
>   FLAGS = (PADMY,POK,pPOK,UTF8)
>   PV = 0x12de460 "\303\246"\0 [UTF8 "\x{e6}"]
>   CUR = 2
>   LEN = 16
> 
> As you can see people got a bit overexcited about Unicode in the 90s.

Yeah, I know "use utf8" doesn't work for that, but I was thinking there
was some other trick. Digging...ah, here it is:

  use open ':encoding(utf8)'

No clue how portable that is. For such a small script it may be better
to just stick with vanilla binmode().

-Peff

Re: [PATCH] t/perf: correctly align non-ASCII descriptions in output

2017-04-21 Thread Ævar Arnfjörð Bjarmason

On Fri, Apr 21, 2017 at 10:41 PM, Jeff King  wrote:
> On Fri, Apr 21, 2017 at 07:44:28PM +, Ævar Arnfjörð Bjarmason wrote:
>
>> Change the test descriptions from being treated as binary blobs by
>> perl to being treated as UTF-8. This ensures that e.g. a test
>> description like "æ" is counted as 1 character, not 2.
>>
>> I have WIP performance tests for non-ASCII grep patterns on another
>> topic that are affected by this.
>
> Makes sense. As this is purely about test titles in our project,
> choosing utf8 as the only encoding is quite sensible.

*Nod*

>> diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
>> index 924b19dab4..1dbc85b214 100755
>> --- a/t/perf/aggregate.perl
>> +++ b/t/perf/aggregate.perl
>> @@ -88,6 +88,7 @@ for my $t (@tests) {
>>  sub read_descr {
>>   my $name = shift;
>>   open my $fh, "<", $name or return "";
>> + binmode $fh, ":utf8" or die "PANIC on binmode: $!";
>
> I thought there was some "use" flag we could set to just make all of our
> handles utf8. But all I could come up with was stuff like PERLIO and
> "perl -C". Using binmode isn't too bad, though (I think you could
> just do it as part of the open, too, but I'm not sure if antique
> versions of perl support that).

[Debugging perl encoding issues is one of the many perks of my dayjob]

Using binmode like this is about as straightforward as you can get,
the former occurrence could be equivalently replaced by:

utf8::decode(my $line = <$fh>);

But better just to mark the handle as utf8. There's a fancier way to
do it as part of the three-arg-open syntax, but I couldn't remember
whether all the perl versions we support have it.

About the "use" flag, you're probably thinking of the confusingly
named "use utf8", but that's to set your source code to utf8, not your
handles, e.g.:

$ perl -CA -MDevel::Peek -wE 'use utf8; my $日本語 = shift; Dump $日本語' æ
SV = PV(0x12cc090) at 0x12cded8
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x12de460 "\303\246"\0 [UTF8 "\x{e6}"]
  CUR = 2
  LEN = 16

As you can see people got a bit overexcited about Unicode in the 90s.

Git archive doesn't fully support zip64

2017-04-21 Thread Keith Goldfarb

Dear git,

git archive, when writing a zip file, has a silent 4GB file size limit (on the 
inputs as well as the output), as it doesn’t fully support zip64.

Although a zip archive written by git which is larger than 4GB can be often 
still be unzipped, it won’t be fully useable and some tools (e.g. zipdetails) 
can’t read it at all.

I suggest either git should be changed to fully support zip64 or it should give 
a proper error when either an input or the output is too large.

Thanks,

K.

Re: [PATCH] t/perf: correctly align non-ASCII descriptions in output

2017-04-21 Thread Jeff King

On Fri, Apr 21, 2017 at 07:44:28PM +, Ævar Arnfjörð Bjarmason wrote:

> Change the test descriptions from being treated as binary blobs by
> perl to being treated as UTF-8. This ensures that e.g. a test
> description like "æ" is counted as 1 character, not 2.
> 
> I have WIP performance tests for non-ASCII grep patterns on another
> topic that are affected by this.

Makes sense. As this is purely about test titles in our project,
choosing utf8 as the only encoding is quite sensible.

> diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
> index 924b19dab4..1dbc85b214 100755
> --- a/t/perf/aggregate.perl
> +++ b/t/perf/aggregate.perl
> @@ -88,6 +88,7 @@ for my $t (@tests) {
>  sub read_descr {
>   my $name = shift;
>   open my $fh, "<", $name or return "";
> + binmode $fh, ":utf8" or die "PANIC on binmode: $!";

I thought there was some "use" flag we could set to just make all of our
handles utf8. But all I could come up with was stuff like PERLIO and
"perl -C". Using binmode isn't too bad, though (I think you could
just do it as part of the open, too, but I'm not sure if antique
versions of perl support that).

-Peff

[PATCH v2] completion: optionally disable checkout DWIM

2017-04-21 Thread Jeff King


When we complete branch names for "git checkout", we also
complete remote branch names that could trigger the DWIM
behavior. Depending on your workflow and project, this can
be either convenient or annoying.

For instance, my clone of gitster.git contains 74 local
"jk/*" branches, but origin contains another 147. When I
want to checkout a local branch but can't quite remember the
name, tab completion shows me 251 entries. And worse, for a
topic that has been picked up for pu, the upstream branch
name is likely to be similar to mine, leading to a high
probability that I pick the wrong one and accidentally
create a new branch.

This patch adds a way for the user to tell the completion
code not to include DWIM suggestions for checkout. This can
already be done by typing:

  git checkout --no-guess jk/

but that's rather cumbersome. The downside, of course, is
that you no longer get completion support when you _do_ want
to invoke the DWIM behavior. But depending on your workflow,
that may not be a big loss (for instance, in git.git I am
much more likely to want to detach, so I'd type "git
checkout origin/jk/" anyway).

Signed-off-by: Jeff King 
---
Compared to v1, this version requires the new variable be set to "1" (to
leave more room for later extension), and uses short-circuiting || for
the logic.

 contrib/completion/git-completion.bash | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 1150164d5..f977b54d8 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -28,6 +28,14 @@
 # completion style.  For example '!f() { : git commit ; ... }; f' will
 # tell the completion to use commit completion.  This also works with aliases
 # of form "!sh -c '...'".  For example, "!sh -c ': git commit ; ... '".
+#
+# You can set the following environment variables to influence the behavior of
+# the completion routines:
+#
+#   GIT_COMPLETION_CHECKOUT_NO_GUESS
+#
+# When set to "1", do not include "DWIM" suggestions in git-checkout
+# completion (e.g., completing "foo" when "origin/foo" exists).
 
 case "$COMP_WORDBREAKS" in
 *:*) : great ;;
@@ -1248,7 +1256,8 @@ _git_checkout ()
# check if --track, --no-track, or --no-guess was specified
# if so, disable DWIM mode
local flags="--track --no-track --no-guess" track_opt="--track"
-   if [ -n "$(__git_find_on_cmdline "$flags")" ]; then
+   if [ "$GIT_COMPLETION_CHECKOUT_NO_GUESS" = "1" ] ||
+  [ -n "$(__git_find_on_cmdline "$flags")" ]; then
track_opt=''
fi
__git_complete_refs $track_opt
-- 
2.13.0.rc0.364.g36b4d8031

Re: [PATCH] completion: optionally disable checkout DWIM

2017-04-21 Thread Jeff King

On Fri, Apr 21, 2017 at 10:14:48PM +0200, SZEDER Gábor wrote:

> >> This is flexible enough for me, but it's possible somebody would want
> >> this on a per-repo basis. I don't know that we want to read from `git
> >> config`, though, because it's relatively expensive to do so. People who
> >> want per-repo settings are probably better off with a hook that triggers
> >> when they "cd" around, and sets up their preferences.
> 
> We could discern between more than just empty vs. non-empty state of
> the environment variable, e.g.:
> 
>   - if empty/unset, then include "DWIM" suggestions.
>   - if set to 'config', then query the 'completion.checkoutNoGuess'
> configuration variable, and omit "DWIM" suggestions if its true.
>   - if set to something else, then omit "DWIM" suggestions.
> 
> Then users can themselves decide, whether the per-repo configurability
> is worth the overhead of running 'git config'.

Yep, that would work. I wasn't going to bother with that complexity
unless somebody really wanted it. The important thing is not to paint
ourselves into a corner. By the rules you gave above, it would probably
be fine to extend my patch later to match. But we could also be more
specific (e.g., look for some positive value like "1").

> >> +# You can set the following environment variables to influence the 
> >> behavior of
> >> +# the completion routines:
> >> +#
> >> +#   GIT_COMPLETION_CHECKOUT_NO_GUESS
> 
> That's one long variable name :)
> Of course it has to start with the 'GIT_COMPLETION_' prefix, and you
> can't win from there...

Yeah, I had the same thought. I also considered using "DWIM", which is
the name by which I know the feature. But since "--no-guess" is the
matching option, I went with that.

> >> @@ -1248,7 +1256,8 @@ _git_checkout ()
> >>   # check if --track, --no-track, or --no-guess was specified
> >>   # if so, disable DWIM mode
> >>   local flags="--track --no-track --no-guess" 
> >> track_opt="--track"
> >> - if [ -n "$(__git_find_on_cmdline "$flags")" ]; then
> >> + if [ -n "$GIT_COMPLETION_CHECKOUT_NO_GUESS" -o \
> >> +  -n "$(__git_find_on_cmdline "$flags")" ]; then
> 
> || would be better than '-o', because the former short-circuits when
> the first condition is true, but the latter doesn't.

Ah, I didn't know that. Usually I use "||", but I thought "-o" was
generally preferred in bash-specific scripts. We definitely want to
short circuit here.

-Peff

Re: [PATCH] test-lib: abort when can't remove trash directory

2017-04-21 Thread Jeff King

On Thu, Apr 20, 2017 at 06:52:30PM +0200, SZEDER Gábor wrote:

> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 13b569682..e9e6f677d 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -761,9 +761,12 @@ test_done () {
>   say "1..$test_count$skip_all"
>   fi
>  
> - test -d "$remove_trash" &&
> + test -d "$remove_trash" ||
> + error "Tests passed but trash directory already removed before 
> test cleanup; aborting"

I think I found out why this "test -d" was here in the first place:

  $ ./t-basic.sh --debug
  [...]
  # passed all 77 test(s)
  1..77
  error: Tests passed but trash directory already removed before test cleanup; 
aborting

When --debug is in use, we do not set $remove_trash. The original was
relying on 'test -d ""' to return false.

I think this whole removal block should probably be moved inside a
conditional like:

  if test -n "$remove_trash"
  then
 ...
  fi

I also wonder if we should come up with a better name than
$remove_trash. A script which unknowingly overwrites that variable would
be disastrous.

Perhaps we should drop it entirely and just do:

  if test -z "$debug"
  then
 test -d "$TRASH_DIRECTORY" ||
 error "Tests passed but..."
  [and so forth...]
  fi

-Peff

Re: [PATCH] completion: optionally disable checkout DWIM

2017-04-21 Thread SZEDER Gábor

On Fri, Apr 21, 2017 at 7:01 AM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> When we complete branch names for "git checkout", we also
>> complete remote branch names that could trigger the DWIM
>> behavior. Depending on your workflow and project, this can
>> be either convenient or annoying.
>> ...
>> This is flexible enough for me, but it's possible somebody would want
>> this on a per-repo basis. I don't know that we want to read from `git
>> config`, though, because it's relatively expensive to do so. People who
>> want per-repo settings are probably better off with a hook that triggers
>> when they "cd" around, and sets up their preferences.

We could discern between more than just empty vs. non-empty state of
the environment variable, e.g.:

  - if empty/unset, then include "DWIM" suggestions.
  - if set to 'config', then query the 'completion.checkoutNoGuess'
configuration variable, and omit "DWIM" suggestions if its true.
  - if set to something else, then omit "DWIM" suggestions.

Then users can themselves decide, whether the per-repo configurability
is worth the overhead of running 'git config'.


> Sounds OK.  I am kind of surprised that --no-guess is the only way
> to turn off this dwimming (not in the completion side, but there
> does not seem to be a way to tell "git checkout" that you do not
> need that create-missing-branch-out-of-remote-tracking).
>
>>  contrib/completion/git-completion.bash | 11 ++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/contrib/completion/git-completion.bash 
>> b/contrib/completion/git-completion.bash
>> index 1150164d5..f53b18fae 100644
>> --- a/contrib/completion/git-completion.bash
>> +++ b/contrib/completion/git-completion.bash
>> @@ -28,6 +28,14 @@
>>  # completion style.  For example '!f() { : git commit ; ... }; f' will
>>  # tell the completion to use commit completion.  This also works with 
>> aliases
>>  # of form "!sh -c '...'".  For example, "!sh -c ': git commit ; ... '".
>> +#
>> +# You can set the following environment variables to influence the behavior 
>> of
>> +# the completion routines:
>> +#
>> +#   GIT_COMPLETION_CHECKOUT_NO_GUESS

That's one long variable name :)
Of course it has to start with the 'GIT_COMPLETION_' prefix, and you
can't win from there...

>> +# When non-empty, do not include "DWIM" suggestions in git-checkout
>> +# completion (e.g., completing "foo" when "origin/foo" exists).
>>
>>  case "$COMP_WORDBREAKS" in
>>  *:*) : great ;;
>> @@ -1248,7 +1256,8 @@ _git_checkout ()
>>   # check if --track, --no-track, or --no-guess was specified
>>   # if so, disable DWIM mode
>>   local flags="--track --no-track --no-guess" track_opt="--track"
>> - if [ -n "$(__git_find_on_cmdline "$flags")" ]; then
>> + if [ -n "$GIT_COMPLETION_CHECKOUT_NO_GUESS" -o \
>> +  -n "$(__git_find_on_cmdline "$flags")" ]; then

|| would be better than '-o', because the former short-circuits when
the first condition is true, but the latter doesn't.

>>   track_opt=''
>>   fi
>>   __git_complete_refs $track_opt

On Fri, Apr 21, 2017 at 7:01 AM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> When we complete branch names for "git checkout", we also
>> complete remote branch names that could trigger the DWIM
>> behavior. Depending on your workflow and project, this can
>> be either convenient or annoying.
>> ...
>> This is flexible enough for me, but it's possible somebody would want
>> this on a per-repo basis. I don't know that we want to read from `git
>> config`, though, because it's relatively expensive to do so. People who
>> want per-repo settings are probably better off with a hook that triggers
>> when they "cd" around, and sets up their preferences.
>
> Sounds OK.  I am kind of surprised that --no-guess is the only way
> to turn off this dwimming (not in the completion side, but there
> does not seem to be a way to tell "git checkout" that you do not
> need that create-missing-branch-out-of-remote-tracking).
>
>>  contrib/completion/git-completion.bash | 11 ++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/contrib/completion/git-completion.bash 
>> b/contrib/completion/git-completion.bash
>> index 1150164d5..f53b18fae 100644
>> --- a/contrib/completion/git-completion.bash
>> +++ b/contrib/completion/git-completion.bash
>> @@ -28,6 +28,14 @@
>>  # completion style.  For example '!f() { : git commit ; ... }; f' will
>>  # tell the completion to use commit completion.  This also works with 
>> aliases
>>  # of form "!sh -c '...'".  For example, "!sh -c ': git commit ; ... '".
>> +#
>> +# You can set the following environment variables to influence the behavior 
>> of
>> +# the completion routines:
>> +#
>> +#   GIT_COMPLETION_CHECKOUT_NO_GUESS
>> +#
>> +# When non-empty, do not

Re: [PATCH] test-lib: abort when can't remove trash directory

2017-04-21 Thread SZEDER Gábor

On Fri, Apr 21, 2017 at 2:48 AM, Junio C Hamano  wrote:
> SZEDER Gábor  writes:
>
>> We had two similar bugs in the tests sporadically triggering error
>> messages during the removal of the trash directory, see commits
>> bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) and
>> ef09036cf (t6500: wait for detached auto gc at the end of the test
>> script, 2017-04-13).  The test script succeeded nonetheless, because
>> these errors are ignored during housekeeping in 'test_done'.
>>
>> However, such an error is a sign that something is fishy in the test
>> script.  Print an error message and abort the test script when the
>> trash directory can't be removed successfully or is already removed,
>> because that's unexpected and we woud prefer somebody notice and

s/woud/would/

>> figure out why.


> Note, that the commit message references ef09036cf (t6500: wait for
>> detached auto gc at the end of the test script, 2017-04-13), which
>> is still only in 'pu'.
>
> I think that one is already part of 2.13-rc0 ;-)

Yeah, I should have fetched before submitting.

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Jeff King

On Fri, Apr 21, 2017 at 05:23:51PM +0200, Christian Couder wrote:

> > I just tried on "pu" and only the first test
> > (t7009-filter-branch-null-sha1.sh) fails there.
> 
> I bisected this test's failure (when using
> GIT_TEST_SPLIT_INDEX=YesPlease) to e6a1dd77e1 (read-cache: regenerate
> shared index if necessary, 2017-02-27).
> 
> The failing test is the following:
> 
> test_expect_success 'filter commands are still checked' '
> test_must_fail git filter-branch \
> --force --prune-empty \
> --index-filter "git rm --cached --ignore-unmatch three.t"
> '
> 
> And if I add the following at the beginning of the test:
> 
> git config splitIndex.maxPercentChange 100 &&
> 
> the test then passes.
> 
> So It looks like in split index mode the test doesn't expect the
> shared index to be regenerated.
> Maybe Peff, as he is the author of this test, or Duy have an idea about this?

Right. The test has a broken tree with a null sha1, and filter-branch
will do:

  GIT_ALLOW_NULL_SHA1=1 git read-tree $broken

to allow it to enter the index. We expect that further commands that
write out the index will not allow it (so you can run commands that
remove the broken entry, but not ones that leave it).

So without split index, this command:

  git rm --cached three.t

will fail. But in split-index mode, the broken entry is in another
index, and is left untouched. I'm not sure there's a way to reconcile
the split-index behavior with what the test is expecting; it's
inherently optimizing out the thing that the test wants to check.

Probably we should catch the broken index entry when we write out the
tree, too. We usually do catch missing objects, but this one is a
gitlink, so it's OK for it to be missing. I think we should catch the
null sha1 specifically, though, as that was the intent of the commit
that added t7009.

So the patch below _almost_ works. It fixes the failing test. But note
the new test I added, with a noop index-filter. That checks that we
don't retain the cache-tree extension from the bogus tree when reading.
But for some reason it fails on split-index mode. Does cache-tree stuff
somehow work differently there?

diff --git a/cache-tree.c b/cache-tree.c
index 345ea3596..34baa6d85 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -354,7 +354,9 @@ static int update_one(struct cache_tree *it,
entlen = pathlen - baselen;
i++;
}
-   if (mode != S_IFGITLINK && !missing_ok && !has_sha1_file(sha1)) 
{
+
+   if (is_null_sha1(sha1) ||
+   (mode != S_IFGITLINK && !missing_ok && 
!has_sha1_file(sha1))) {
strbuf_release();
if (expected_missing)
return -1;
diff --git a/read-cache.c b/read-cache.c
index b3d0f3c30..15a4779f2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2197,6 +2197,7 @@ static int do_write_index(struct index_state *istate, int 
newfd,
int entries = istate->cache_nr;
struct stat st;
struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
+   int drop_cache_tree = 0;
 
for (i = removed = extended = 0; i < entries; i++) {
if (cache[i]->ce_flags & CE_REMOVE)
@@ -2247,6 +2248,8 @@ static int do_write_index(struct index_state *istate, int 
newfd,
warning(msg, ce->name);
else
return error(msg, ce->name);
+
+   drop_cache_tree = 1;
}
if (ce_write_entry(, newfd, ce, previous_name) < 0)
return -1;
@@ -2265,7 +2268,7 @@ static int do_write_index(struct index_state *istate, int 
newfd,
if (err)
return -1;
}
-   if (!strip_extensions && istate->cache_tree) {
+   if (!strip_extensions && !drop_cache_tree && istate->cache_tree) {
struct strbuf sb = STRBUF_INIT;
 
cache_tree_write(, istate->cache_tree);
diff --git a/t/t7009-filter-branch-null-sha1.sh 
b/t/t7009-filter-branch-null-sha1.sh
index c27f90f28..a8d9ec498 100755
--- a/t/t7009-filter-branch-null-sha1.sh
+++ b/t/t7009-filter-branch-null-sha1.sh
@@ -31,6 +31,12 @@ test_expect_success 'setup: bring HEAD and index in sync' '
git commit -a -m "back to normal"
 '
 
+test_expect_success 'noop filter-branch complains' '
+   test_must_fail git filter-branch \
+   --force --prune-empty \
+   --index-filter "true"
+'
+
 test_expect_success 'filter commands are still checked' '
test_must_fail git filter-branch \
--force --prune-empty \

[PATCH] t/perf: correctly align non-ASCII descriptions in output

2017-04-21 Thread Ævar Arnfjörð Bjarmason

Change the test descriptions from being treated as binary blobs by
perl to being treated as UTF-8. This ensures that e.g. a test
description like "æ" is counted as 1 character, not 2.

I have WIP performance tests for non-ASCII grep patterns on another
topic that are affected by this.

Now instead of:

$ ./run p-perf-lib-sanity.sh
[...]
.4: export a weird var
0.00(0.00+0.00)
.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś   0.00(0.00+0.00)
.7: important variables available in subshells
0.00(0.00+0.00)
[...]

We emit:

[...]
.4: export a weird var 0.00(0.00+0.00)
.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś  0.00(0.00+0.00)
.7: important variables available in subshells 0.00(0.00+0.00)
[...]

Fixes code originally added in 342e9ef2d9 ("Introduce a performance
testing framework", 2012-02-17).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/perf/aggregate.perl   | 3 +++
 t/perf/p-perf-lib-sanity.sh | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
index 924b19dab4..1dbc85b214 100755
--- a/t/perf/aggregate.perl
+++ b/t/perf/aggregate.perl
@@ -88,6 +88,7 @@ for my $t (@tests) {
 sub read_descr {
my $name = shift;
open my $fh, "<", $name or return "";
+   binmode $fh, ":utf8" or die "PANIC on binmode: $!";
my $line = <$fh>;
close $fh or die "cannot close $name";
chomp $line;
@@ -147,6 +148,8 @@ for my $t (@subtests) {
 my $totalwidth = 3*@dirs+$descrlen;
 $totalwidth += $_ for (@colwidth);
 
+binmode STDOUT, ":utf8" or die "PANIC on binmode: $!";
+
 printf "%-${descrlen}s", "Test";
 for my $i (0..$#dirs) {
my $d = $dirs[$i];
diff --git a/t/perf/p-perf-lib-sanity.sh b/t/perf/p-perf-lib-sanity.sh
index cf8e1efce7..002c21e52a 100755
--- a/t/perf/p-perf-lib-sanity.sh
+++ b/t/perf/p-perf-lib-sanity.sh
@@ -33,6 +33,8 @@ test_perf 'export a weird var' '
test_export bar
 '
 
+test_perf 'éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś' 'true'
+
 test_expect_success 'test_export works with weird vars' '
echo "$bar" &&
test "$bar" = "weird # variable"
-- 
2.11.0

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Jeff King

On Fri, Apr 21, 2017 at 07:25:21PM +0700, Duy Nguyen wrote:

> >> Yeah, you are right.
> >> It looks like we have GIT_TEST_OPTS to pass options like --debug,
> >> --valgrind, --verbose, but we don't have an environment variable to
> >> set config options.
> >
> > Or maybe GIT_CONFIG_PARAMETERS works for this purpose?
> 
> It has to be set inside test-lib.sh, not from outside because
> environment variables from outside are filtered if I remember
> correctly and only a few specials plus those GIT_TEST_ can survive.
> Some tests override GIT_CONFIG_PARAMETERS themselves to pass config
> vars to certain command (I know because I just did a couple days ago
> ;).which loses core.splitIndex.

It does seem reasonable to have a GIT_TEST_REPO_CONFIG that sets some
variables by default in the trash repository created by
test_create_repo.

That doesn't quite catch everything, because a few tests make their own
sub-repos. But those cases are usually testing something very specific
anyway.

-Peff

[PATCH] cache-tree: reject entries with null sha1

2017-04-21 Thread Jeff King

We generally disallow null sha1s from entering the index,
due to 4337b5856 (do not write null sha1s to on-disk index,
2012-07-28). However, we loosened that in 83bd7437c
(write_index: optionally allow broken null sha1s,
2013-08-27) so that tools like filter-branch could be used
to repair broken history.

However, we should make sure that these broken entries do
not get propagated into new trees. For most entries, we'd
catch them with the missing-object check (since presumably
the null sha1 does not exist in our object database). But
gitlink entries do not need reachability, so we may blindly
copy the entry into a bogus tree.

This patch rejects all null sha1s (with the same "invalid
entry" message that missing objects get) when building trees
from the index. It does so even for non-gitlinks, and even
when "write-tree" is given the --missing-ok flag. The null
sha1 is a special sentinel value that is already rejected in
trees by fsck; whether the object exists or not, it is an
error to put it in a tree.

Note that for this to work, we must also avoid reusing an
existing cache-tree that contains the null sha1. This patch
does so by just refusing to write out any cache tree when
the index contains a null sha1. This is blunter than we need
to be; we could just reject the subtree that contains the
offending entry. But it's not worth the complexity. The
behavior is unchanged unless you have a broken index entry,
and even then we'd refuse the whole index write unless the
emergency GIT_ALLOW_NULL_SHA1 is in use. And even then the
end result is only a performance drop (any write-tree will
have to generate the whole cache-tree from scratch).

The tests bear some explanation.

The existing test in t7009 doesn't catch this problem,
because our index-filter runs "git rm --cached", which will
try to rewrite the updated index and barf on the bogus
entry. So we never even make it to write-tree.  The new test
there adds a noop index-filter, which does show the problem.

The new tests in t1601 are slightly redundant with what
filter-branch is doing under the hood in t7009. But as
they're much more direct, they're easier to reason about.
And should filter-branch ever change or go away, we'd want
to make sure that these plumbing commands behave sanely.

Signed-off-by: Jeff King 
---
When merged to pu, this fixes the existing test breakage in t7009 when
GIT_TEST_SPLIT_INDEX is used (because the split index didn't rewrite the
whole index, "git rm --cached" didn't always barf). But I think it's
worth doing on its own merits, as demonstrated by the new tests.

The one thing I haven't figured out it is why the new test in t7009
fails with the split-index. And even more curiously, the new tests in
t1601 _don't_ fail with it, even if I instrument the fake index to have
more entries (making it more likely to split).

 cache-tree.c   |  4 +++-
 read-cache.c   |  5 -
 t/t1601-index-bogus.sh | 22 ++
 t/t7009-filter-branch-null-sha1.sh |  6 ++
 4 files changed, 35 insertions(+), 2 deletions(-)
 create mode 100755 t/t1601-index-bogus.sh

diff --git a/cache-tree.c b/cache-tree.c
index 345ea3596..34baa6d85 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -354,7 +354,9 @@ static int update_one(struct cache_tree *it,
entlen = pathlen - baselen;
i++;
}
-   if (mode != S_IFGITLINK && !missing_ok && !has_sha1_file(sha1)) 
{
+
+   if (is_null_sha1(sha1) ||
+   (mode != S_IFGITLINK && !missing_ok && 
!has_sha1_file(sha1))) {
strbuf_release();
if (expected_missing)
return -1;
diff --git a/read-cache.c b/read-cache.c
index e44775182..6851de892 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2054,6 +2054,7 @@ static int do_write_index(struct index_state *istate, int 
newfd,
int entries = istate->cache_nr;
struct stat st;
struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
+   int drop_cache_tree = 0;
 
for (i = removed = extended = 0; i < entries; i++) {
if (cache[i]->ce_flags & CE_REMOVE)
@@ -2104,6 +2105,8 @@ static int do_write_index(struct index_state *istate, int 
newfd,
warning(msg, ce->name);
else
return error(msg, ce->name);
+
+   drop_cache_tree = 1;
}
if (ce_write_entry(, newfd, ce, previous_name) < 0)
return -1;
@@ -2122,7 +2125,7 @@ static int do_write_index(struct index_state *istate, int 
newfd,
if (err)
return -1;
}
-   if (!strip_extensions && istate->cache_tree) {
+   if (!strip_extensions && !drop_cache_tree && istate->cache_tree) {
struct strbuf sb = STRBUF_INIT;

[PATCH v6 2/8] convert: move packet_write_list() into pkt-line as packet_writel()

2017-04-21 Thread Ben Peart

Add packet_writel() which writes multiple lines in a single call and
then calls packet_flush_gently(). Update convert.c to use the new
packet_writel() function from pkt-line.

Signed-off-by: Ben Peart 
---
 convert.c  | 23 ++-
 pkt-line.c | 19 +++
 pkt-line.h |  1 +
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/convert.c b/convert.c
index 8d652bf27c..793c29ebfd 100644
--- a/convert.c
+++ b/convert.c
@@ -521,25 +521,6 @@ static struct cmd2process 
*find_multi_file_filter_entry(struct hashmap *hashmap,
return hashmap_get(hashmap, , NULL);
 }
 
-static int packet_write_list(int fd, const char *line, ...)
-{
-   va_list args;
-   int err;
-   va_start(args, line);
-   for (;;) {
-   if (!line)
-   break;
-   if (strlen(line) > LARGE_PACKET_DATA_MAX)
-   return -1;
-   err = packet_write_fmt_gently(fd, "%s\n", line);
-   if (err)
-   return err;
-   line = va_arg(args, const char*);
-   }
-   va_end(args);
-   return packet_flush_gently(fd);
-}
-
 static void read_multi_file_filter_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
@@ -616,7 +597,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
 
sigchain_push(SIGPIPE, SIG_IGN);
 
-   err = packet_write_list(process->in, "git-filter-client", "version=2", 
NULL);
+   err = packet_writel(process->in, "git-filter-client", "version=2", 
NULL);
if (err)
goto done;
 
@@ -632,7 +613,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
if (err)
goto done;
 
-   err = packet_write_list(process->in, "capability=clean", 
"capability=smudge", NULL);
+   err = packet_writel(process->in, "capability=clean", 
"capability=smudge", NULL);
 
for (;;) {
cap_buf = packet_read_line(process->out, NULL);
diff --git a/pkt-line.c b/pkt-line.c
index bfdb177b34..756410d2f6 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -171,6 +171,25 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...)
return status;
 }
 
+int packet_writel(int fd, const char *line, ...)
+{
+   va_list args;
+   int err;
+   va_start(args, line);
+   for (;;) {
+   if (!line)
+   break;
+   if (strlen(line) > LARGE_PACKET_DATA_MAX)
+   return -1;
+   err = packet_write_fmt_gently(fd, "%s\n", line);
+   if (err)
+   return err;
+   line = va_arg(args, const char*);
+   }
+   va_end(args);
+   return packet_flush_gently(fd);
+}
+
 static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 {
static char packet_write_buffer[LARGE_PACKET_MAX];
diff --git a/pkt-line.h b/pkt-line.h
index ad30db101a..957b9a5fa3 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -25,6 +25,7 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
+int packet_writel(int fd, const char *line, ...);
 int write_packetized_from_fd(int fd_in, int fd_out);
 int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 
-- 
2.12.0.windows.1.33.g243d9b384c

[PATCH] tests: fix tests broken under GETTEXT_POISON=YesPlease

2017-04-21 Thread Ævar Arnfjörð Bjarmason

The GETTEXT_POISON=YesPlease compile-time testing option added in my
bb946bba76 ("i18n: add GETTEXT_POISON to simulate unfriendly
translator", 2011-02-22) has been slowly bitrotting as strings have
been marked for translation, and new tests have been added without
running it.

I brought this up on the list ("[BUG] test suite broken with
GETTEXT_POISON=YesPlease", [1]) asking whether this mode was useful at
all anymore. At least one person occasionally uses it, and Lars
Schneider offered to change one of the the Travis builds to run in
this mode, so fix up the failing ones.

My test setup runs most of the tests, with the notable exception of
skipping all the p4 tests, so it's possible that there's still some
lurking regressions I haven't fixed.

1. 

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t1309-early-config.sh  |  2 +-
 t/t1430-bad-ref-name.sh  |  2 +-
 t/t3203-branch-output.sh |  2 +-
 t/t3404-rebase-interactive.sh| 14 +++---
 t/t3415-rebase-autosquash.sh | 10 +-
 t/t3903-stash.sh |  4 ++--
 t/t4205-log-pretty-formats.sh|  4 ++--
 t/t5316-pack-delta-depth.sh  |  8 ++--
 t/t6134-pathspec-in-submodule.sh |  4 ++--
 t/t7004-tag.sh   |  4 ++--
 t/t7406-submodule-update.sh  |  2 +-
 t/t7509-commit.sh|  4 ++--
 t/t7800-difftool.sh  |  4 ++--
 13 files changed, 34 insertions(+), 30 deletions(-)

diff --git a/t/t1309-early-config.sh b/t/t1309-early-config.sh
index b97357b8ab..016e43b8d5 100755
--- a/t/t1309-early-config.sh
+++ b/t/t1309-early-config.sh
@@ -59,7 +59,7 @@ test_with_config () {
 
 test_expect_success 'ignore .git/ with incompatible repository version' '
test_with_config "[core]repositoryformatversion = 99" 2>err &&
-   grep "warning:.* Expected git repo version <= [1-9]" err
+   test_i18ngrep "warning:.* Expected git repo version <= [1-9]" err
 '
 
 test_expect_failure 'ignore .git/ with invalid repository version' '
diff --git a/t/t1430-bad-ref-name.sh b/t/t1430-bad-ref-name.sh
index 8937e25e49..2003ec7907 100755
--- a/t/t1430-bad-ref-name.sh
+++ b/t/t1430-bad-ref-name.sh
@@ -122,7 +122,7 @@ test_expect_success 'push cannot create a badly named ref' '
! grep -e "broken\.\.\.ref" output
 '
 
-test_expect_failure 'push --mirror can delete badly named ref' '
+test_expect_failure !GETTEXT_POISON 'push --mirror can delete badly named ref' 
'
top=$(pwd) &&
git init src &&
git init dest &&
diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
index 5778c0afe1..a428ae6703 100755
--- a/t/t3203-branch-output.sh
+++ b/t/t3203-branch-output.sh
@@ -236,7 +236,7 @@ test_expect_success 'git branch --format option' '
Refname is refs/heads/ref-to-remote
EOF
git branch --format="Refname is %(refname)" >actual &&
-   test_cmp expect actual
+   test_i18ncmp expect actual
 '
 
 test_done
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 33d392ba11..e07d6d8cd2 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -366,7 +366,7 @@ test_expect_success 'verbose flag is heeded, even after 
--continue' '
grep "^ file1 | 2 +-$" output
 '
 
-test_expect_success 'multi-squash only fires up editor once' '
+test_expect_success !GETTEXT_POISON 'multi-squash only fires up editor once' '
base=$(git rev-parse HEAD~4) &&
set_fake_editor &&
FAKE_COMMIT_AMEND="ONCE" FAKE_LINES="1 squash 2 squash 3 squash 4" \
@@ -376,7 +376,7 @@ test_expect_success 'multi-squash only fires up editor 
once' '
test 1 = $(git show | grep ONCE | wc -l)
 '
 
-test_expect_success 'multi-fixup does not fire up editor' '
+test_expect_success !GETTEXT_POISON 'multi-fixup does not fire up editor' '
git checkout -b multi-fixup E &&
base=$(git rev-parse HEAD~4) &&
set_fake_editor &&
@@ -426,7 +426,7 @@ D
 ONCE
 EOF
 
-test_expect_success 'squash and fixup generate correct log messages' '
+test_expect_success !GETTEXT_POISON 'squash and fixup generate correct log 
messages' '
git checkout -b squash-fixup E &&
base=$(git rev-parse HEAD~4) &&
set_fake_editor &&
@@ -439,7 +439,7 @@ test_expect_success 'squash and fixup generate correct log 
messages' '
git branch -D squash-fixup
 '
 
-test_expect_success 'squash ignores comments' '
+test_expect_success !GETTEXT_POISON 'squash ignores comments' '
git checkout -b skip-comments E &&
base=$(git rev-parse HEAD~4) &&
set_fake_editor &&
@@ -452,7 +452,7 @@ test_expect_success 'squash ignores comments' '
git branch -D skip-comments
 '
 
-test_expect_success 'squash ignores blank lines' '
+test_expect_success !GETTEXT_POISON 'squash ignores blank lines' '
git checkout -b skip-blank-lines E &&
base=$(git rev-parse HEAD~4)

[PATCH v6 8/8] convert: Update subprocess_read_status to not die on EOF

2017-04-21 Thread Ben Peart

Enable sub-processes to gracefully handle when the process dies by
updating subprocess_read_status to return an error on EOF instead of
dying.

Update apply_multi_file_filter to take advantage of the revised
subprocess_read_status.

Signed-off-by: Ben Peart 
---
 convert.c | 10 --
 sub-process.c | 10 +++---
 sub-process.h |  2 +-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/convert.c b/convert.c
index 79d04907a5..f1e168bc30 100644
--- a/convert.c
+++ b/convert.c
@@ -635,7 +635,10 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
if (err)
goto done;
 
-   subprocess_read_status(process->out, _status);
+   err = subprocess_read_status(process->out, _status);
+   if (err)
+   goto done;
+
err = strcmp(filter_status.buf, "success");
if (err)
goto done;
@@ -644,7 +647,10 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
if (err)
goto done;
 
-   subprocess_read_status(process->out, _status);
+   err = subprocess_read_status(process->out, _status);
+   if (err)
+   goto done;
+
err = strcmp(filter_status.buf, "success");
 
 done:
diff --git a/sub-process.c b/sub-process.c
index 536b60cced..92f8aea70a 100644
--- a/sub-process.c
+++ b/sub-process.c
@@ -21,13 +21,15 @@ struct subprocess_entry *subprocess_find_entry(struct 
hashmap *hashmap, const ch
return hashmap_get(hashmap, , NULL);
 }
 
-void subprocess_read_status(int fd, struct strbuf *status)
+int subprocess_read_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
char *line;
+   int len;
+
for (;;) {
-   line = packet_read_line(fd, NULL);
-   if (!line)
+   len = packet_read_line_gently(fd, NULL, );
+   if ((len < 0) || !line)
break;
pair = strbuf_split_str(line, '=', 2);
if (pair[0] && pair[0]->len && pair[1]) {
@@ -39,6 +41,8 @@ void subprocess_read_status(int fd, struct strbuf *status)
}
strbuf_list_free(pair);
}
+
+   return (len < 0) ? len : 0;
 }
 
 void subprocess_stop(struct hashmap *hashmap, struct subprocess_entry *entry)
diff --git a/sub-process.h b/sub-process.h
index a88e782bfc..7d451e1cde 100644
--- a/sub-process.h
+++ b/sub-process.h
@@ -44,6 +44,6 @@ static inline struct child_process 
*subprocess_get_child_process(
  * key/value pairs and return the value from the last "status" packet
  */
 
-void subprocess_read_status(int fd, struct strbuf *status);
+int subprocess_read_status(int fd, struct strbuf *status);
 
 #endif
-- 
2.12.0.windows.1.33.g243d9b384c

Re: [PATCH 00/15] Handle fopen() errors

2017-04-21 Thread Jeff King

On Fri, Apr 21, 2017 at 07:27:20PM +0700, Duy Nguyen wrote:

> On Fri, Apr 21, 2017 at 6:52 PM, Junio C Hamano  wrote:
> > Yes, but (1) we'd need to be careful about --quiet
> 
> Yeah. It's a real pain point for making changes like this. At some
> point we should just have a global (maybe multi-level) quiet flag.

I don't think it's too bad here. Isn't it just:

  FILE *fh = quiet ? fopen(x, y) : fopen_or_warn(x, y);

It is a little annoying to have to repeat "x", though (which is
sometimes a git_path() invocation).

-Peff

[PATCH v6 5/8] convert: Update generic functions to only use generic data structures

2017-04-21 Thread Ben Peart

Update all functions that are going to be moved into a reusable module
so that they only work with the reusable data structures.  Move code
that is specific to the filter out into the filter specific functions.

Signed-off-by: Ben Peart 
---
 convert.c | 41 +++--
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/convert.c b/convert.c
index bfb19beed5..d15b10a3c5 100644
--- a/convert.c
+++ b/convert.c
@@ -556,7 +556,6 @@ static void kill_multi_file_filter(struct hashmap *hashmap, 
struct subprocess_en
finish_command(>process);
 
hashmap_remove(hashmap, entry, NULL);
-   free(entry);
 }
 
 static void stop_multi_file_filter(struct child_process *process)
@@ -570,14 +569,15 @@ static void stop_multi_file_filter(struct child_process 
*process)
finish_command(process);
 }
 
-static int start_multi_file_filter_fn(struct cmd2process *entry)
+static int start_multi_file_filter_fn(struct subprocess_entry *subprocess)
 {
int err;
+   struct cmd2process *entry = (struct cmd2process *)subprocess;
struct string_list cap_list = STRING_LIST_INIT_NODUP;
char *cap_buf;
const char *cap_name;
-   struct child_process *process = >subprocess.process;
-   const char *cmd = entry->subprocess.cmd;
+   struct child_process *process = >process;
+   const char *cmd = subprocess->cmd;
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -629,17 +629,16 @@ static int start_multi_file_filter_fn(struct cmd2process 
*entry)
return err;
 }
 
-static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, 
const char *cmd)
+typedef int(*subprocess_start_fn)(struct subprocess_entry *entry);
+int start_multi_file_filter(struct hashmap *hashmap, struct subprocess_entry 
*entry, const char *cmd,
+   subprocess_start_fn startfn)
 {
int err;
-   struct cmd2process *entry;
struct child_process *process;
const char *argv[] = { cmd, NULL };
 
-   entry = xmalloc(sizeof(*entry));
-   entry->subprocess.cmd = cmd;
-   entry->supported_capabilities = 0;
-   process = >subprocess.process;
+   entry->cmd = cmd;
+   process = >process;
 
child_process_init(process);
process->argv = argv;
@@ -649,22 +648,23 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
process->clean_on_exit = 1;
process->clean_on_exit_handler = stop_multi_file_filter;
 
-   if (start_command(process)) {
+   err = start_command(process);
+   if (err) {
error("cannot fork to run external filter '%s'", cmd);
-   return NULL;
+   return err;
}
 
hashmap_entry_init(entry, strhash(cmd));
 
-   err = start_multi_file_filter_fn(entry);
+   err = startfn(entry);
if (err) {
error("initialization for external filter '%s' failed", cmd);
-   kill_multi_file_filter(hashmap, >subprocess);
-   return NULL;
+   kill_multi_file_filter(hashmap, entry);
+   return err;
}
 
hashmap_add(hashmap, entry);
-   return entry;
+   return 0;
 }
 
 static int apply_multi_file_filter(const char *path, const char *src, size_t 
len,
@@ -689,9 +689,13 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
fflush(NULL);
 
if (!entry) {
-   entry = start_multi_file_filter(_process_map, cmd);
-   if (!entry)
+   entry = xmalloc(sizeof(*entry));
+   entry->supported_capabilities = 0;
+
+   if (start_multi_file_filter(_process_map, 
>subprocess, cmd, start_multi_file_filter_fn)) {
+   free(entry);
return 0;
+   }
}
process = >subprocess.process;
 
@@ -765,6 +769,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
 */
error("external filter '%s' failed", cmd);
kill_multi_file_filter(_process_map, 
>subprocess);
+   free(entry);
}
} else {
strbuf_swap(dst, );
-- 
2.12.0.windows.1.33.g243d9b384c

[PATCH v6 6/8] convert: rename reusable sub-process functions

2017-04-21 Thread Ben Peart

Do a mechanical rename of the functions that will become the reusable
sub-process module.

Signed-off-by: Ben Peart 
---
 convert.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/convert.c b/convert.c
index d15b10a3c5..cfae45aeee 100644
--- a/convert.c
+++ b/convert.c
@@ -507,8 +507,8 @@ struct cmd2process {
unsigned int supported_capabilities;
 };
 
-static int cmd_process_map_initialized;
-static struct hashmap cmd_process_map;
+static int subprocess_map_initialized;
+static struct hashmap subprocess_map;
 
 static int cmd2process_cmp(const struct subprocess_entry *e1,
   const struct subprocess_entry *e2,
@@ -517,7 +517,7 @@ static int cmd2process_cmp(const struct subprocess_entry 
*e1,
return strcmp(e1->cmd, e2->cmd);
 }
 
-static struct subprocess_entry *find_multi_file_filter_entry(struct hashmap 
*hashmap, const char *cmd)
+static struct subprocess_entry *subprocess_find_entry(struct hashmap *hashmap, 
const char *cmd)
 {
struct subprocess_entry key;
 
@@ -526,7 +526,7 @@ static struct subprocess_entry 
*find_multi_file_filter_entry(struct hashmap *has
return hashmap_get(hashmap, , NULL);
 }
 
-static void read_multi_file_filter_status(int fd, struct strbuf *status)
+static void subprocess_read_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
char *line;
@@ -546,7 +546,7 @@ static void read_multi_file_filter_status(int fd, struct 
strbuf *status)
}
 }
 
-static void kill_multi_file_filter(struct hashmap *hashmap, struct 
subprocess_entry *entry)
+static void subprocess_stop(struct hashmap *hashmap, struct subprocess_entry 
*entry)
 {
if (!entry)
return;
@@ -558,10 +558,10 @@ static void kill_multi_file_filter(struct hashmap 
*hashmap, struct subprocess_en
hashmap_remove(hashmap, entry, NULL);
 }
 
-static void stop_multi_file_filter(struct child_process *process)
+static void subprocess_exit_handler(struct child_process *process)
 {
sigchain_push(SIGPIPE, SIG_IGN);
-   /* Closing the pipe signals the filter to initiate a shutdown. */
+   /* Closing the pipe signals the subprocess to initiate a shutdown. */
close(process->in);
close(process->out);
sigchain_pop(SIGPIPE);
@@ -630,7 +630,7 @@ static int start_multi_file_filter_fn(struct 
subprocess_entry *subprocess)
 }
 
 typedef int(*subprocess_start_fn)(struct subprocess_entry *entry);
-int start_multi_file_filter(struct hashmap *hashmap, struct subprocess_entry 
*entry, const char *cmd,
+int subprocess_start(struct hashmap *hashmap, struct subprocess_entry *entry, 
const char *cmd,
subprocess_start_fn startfn)
 {
int err;
@@ -646,11 +646,11 @@ int start_multi_file_filter(struct hashmap *hashmap, 
struct subprocess_entry *en
process->in = -1;
process->out = -1;
process->clean_on_exit = 1;
-   process->clean_on_exit_handler = stop_multi_file_filter;
+   process->clean_on_exit_handler = subprocess_exit_handler;
 
err = start_command(process);
if (err) {
-   error("cannot fork to run external filter '%s'", cmd);
+   error("cannot fork to run subprocess '%s'", cmd);
return err;
}
 
@@ -658,8 +658,8 @@ int start_multi_file_filter(struct hashmap *hashmap, struct 
subprocess_entry *en
 
err = startfn(entry);
if (err) {
-   error("initialization for external filter '%s' failed", cmd);
-   kill_multi_file_filter(hashmap, entry);
+   error("initialization for subprocess '%s' failed", cmd);
+   subprocess_stop(hashmap, entry);
return err;
}
 
@@ -678,12 +678,12 @@ static int apply_multi_file_filter(const char *path, 
const char *src, size_t len
struct strbuf filter_status = STRBUF_INIT;
const char *filter_type;
 
-   if (!cmd_process_map_initialized) {
-   cmd_process_map_initialized = 1;
-   hashmap_init(_process_map, (hashmap_cmp_fn) 
cmd2process_cmp, 0);
+   if (!subprocess_map_initialized) {
+   subprocess_map_initialized = 1;
+   hashmap_init(_map, (hashmap_cmp_fn) cmd2process_cmp, 
0);
entry = NULL;
} else {
-   entry = (struct cmd2process 
*)find_multi_file_filter_entry(_process_map, cmd);
+   entry = (struct cmd2process 
*)subprocess_find_entry(_map, cmd);
}
 
fflush(NULL);
@@ -692,7 +692,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
entry = xmalloc(sizeof(*entry));
entry->supported_capabilities = 0;
 
-   if (start_multi_file_filter(_process_map, 
>subprocess, cmd, start_multi_file_filter_fn)) {
+   if (subprocess_start(_map, >subprocess, cmd, 
start_multi_file_filter_fn)) {

[PATCH v6 7/8] sub-process: move sub-process functions into separate files

2017-04-21 Thread Ben Peart

Move the sub-proces functions into sub-process.h/c.  Add documentation
for the new module in Documentation/technical/api-sub-process.txt

Signed-off-by: Ben Peart 
---
 Documentation/technical/api-sub-process.txt |  59 
 Makefile|   1 +
 convert.c   | 104 +---
 sub-process.c   | 102 +++
 sub-process.h   |  49 +
 5 files changed, 212 insertions(+), 103 deletions(-)
 create mode 100644 Documentation/technical/api-sub-process.txt
 create mode 100644 sub-process.c
 create mode 100644 sub-process.h

diff --git a/Documentation/technical/api-sub-process.txt 
b/Documentation/technical/api-sub-process.txt
new file mode 100644
index 00..793508cf3e
--- /dev/null
+++ b/Documentation/technical/api-sub-process.txt
@@ -0,0 +1,59 @@
+sub-process API
+===
+
+The sub-process API makes it possible to run background sub-processes
+for the entire lifetime of a Git invocation. If Git needs to communicate
+with an external process multiple times, then this can reduces the process
+invocation overhead. Git and the sub-process communicate through stdin and
+stdout.
+
+The sub-processes are kept in a hashmap by command name and looked up
+via the subprocess_find_entry function.  If an existing instance can not
+be found then a new process should be created and started.  When the
+parent git command terminates, all sub-processes are also terminated.
+
+This API is based on the run-command API.
+
+Data structures
+---
+
+* `struct subprocess_entry`
+
+The sub-process structure.  Members should not be accessed directly.
+
+Types
+-
+
+'int(*subprocess_start_fn)(struct subprocess_entry *entry)'::
+
+   User-supplied function to initialize the sub-process.  This is
+   typically used to negotiate the interface version and capabilities.
+
+
+Functions
+-
+
+`cmd2process_cmp`::
+
+   Function to test two subprocess hashmap entries for equality.
+
+`subprocess_start`::
+
+   Start a subprocess and add it to the subprocess hashmap.
+
+`subprocess_stop`::
+
+   Kill a subprocess and remove it from the subprocess hashmap.
+
+`subprocess_find_entry`::
+
+   Find a subprocess in the subprocess hashmap.
+
+`subprocess_get_child_process`::
+
+   Get the underlying `struct child_process` from a subprocess.
+
+`subprocess_read_status`::
+
+   Helper function to read packets looking for the last "status="
+   key/value pair.
diff --git a/Makefile b/Makefile
index eb1a1a7cff..97c87fb647 100644
--- a/Makefile
+++ b/Makefile
@@ -840,6 +840,7 @@ LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
 LIB_OBJS += submodule.o
 LIB_OBJS += submodule-config.o
+LIB_OBJS += sub-process.o
 LIB_OBJS += symlinks.o
 LIB_OBJS += tag.o
 LIB_OBJS += tempfile.o
diff --git a/convert.c b/convert.c
index cfae45aeee..79d04907a5 100644
--- a/convert.c
+++ b/convert.c
@@ -4,6 +4,7 @@
 #include "quote.h"
 #include "sigchain.h"
 #include "pkt-line.h"
+#include "sub-process.h"
 
 /*
  * convert.c - convert a file when checking it out and checking it in.
@@ -496,12 +497,6 @@ static int apply_single_file_filter(const char *path, 
const char *src, size_t le
 #define CAP_CLEAN(1u<<0)
 #define CAP_SMUDGE   (1u<<1)
 
-struct subprocess_entry {
-   struct hashmap_entry ent; /* must be the first member! */
-   const char *cmd;
-   struct child_process process;
-};
-
 struct cmd2process {
struct subprocess_entry subprocess; /* must be the first member! */
unsigned int supported_capabilities;
@@ -510,65 +505,6 @@ struct cmd2process {
 static int subprocess_map_initialized;
 static struct hashmap subprocess_map;
 
-static int cmd2process_cmp(const struct subprocess_entry *e1,
-  const struct subprocess_entry *e2,
-  const void *unused)
-{
-   return strcmp(e1->cmd, e2->cmd);
-}
-
-static struct subprocess_entry *subprocess_find_entry(struct hashmap *hashmap, 
const char *cmd)
-{
-   struct subprocess_entry key;
-
-   hashmap_entry_init(, strhash(cmd));
-   key.cmd = cmd;
-   return hashmap_get(hashmap, , NULL);
-}
-
-static void subprocess_read_status(int fd, struct strbuf *status)
-{
-   struct strbuf **pair;
-   char *line;
-   for (;;) {
-   line = packet_read_line(fd, NULL);
-   if (!line)
-   break;
-   pair = strbuf_split_str(line, '=', 2);
-   if (pair[0] && pair[0]->len && pair[1]) {
-   /* the last "status=" line wins */
-   if (!strcmp(pair[0]->buf, "status=")) {
-   strbuf_reset(status);
-   strbuf_addbuf(status, pair[1]);
-   }
-   }
-   strbuf_list_free(pair);
-

[PATCH v6 0/8] refactor the filter process code into a reusable module

2017-04-21 Thread Ben Peart

Refactor the filter..process code into a separate sub-process
module that can be used to reduce the cost of starting up a sub-process
for multiple commands.  It does this by keeping the external process
running and processing all commands by communicating over standard input
and standard output using the packet format (pkt-line) based protocol.
Full documentation is in Documentation/technical/api-sub-process.txt.

This code is refactored from:

Commit edcc85814c ("convert: add filter..process option", 
2016-10-16)
keeps the external process running and processes all commands

Ben Peart (8):
  pkt-line: add packet_read_line_gently()
  convert: move packet_write_list() into pkt-line as packet_writel()
  convert: Split start_multi_file_filter into two separate functions
  convert: Separate generic structures and variables from the filter
specific ones
  convert: Update generic functions to only use generic data structures
  convert: rename reusable sub-process functions
  sub-process: move sub-process functions into separate files
  convert: Update subprocess_read_status to not die on EOF

 Documentation/technical/api-sub-process.txt |  59 ++
 Makefile|   1 +
 convert.c   | 161 ++--
 pkt-line.c  |  33 +-
 pkt-line.h  |  11 ++
 sub-process.c   | 106 ++
 sub-process.h   |  49 +
 7 files changed, 291 insertions(+), 129 deletions(-)
 create mode 100644 Documentation/technical/api-sub-process.txt
 create mode 100644 sub-process.c
 create mode 100644 sub-process.h

-- 
2.12.0.windows.1.33.g243d9b384c

Re: [BUG] test suite broken with GETTEXT_POISON=YesPlease

2017-04-21 Thread Ævar Arnfjörð Bjarmason

On Fri, Apr 21, 2017 at 4:47 PM, Michael J Gruber  wrote:
> Ævar Arnfjörð Bjarmason venit, vidit, dixit 20.04.2017 23:58:
>> As a refresh of everyone's memory (because mine needed it). This is a
>> feature I added back in 2011 when the i18n support was initially
>> added.
>>
>> There was concern at the time that we would inadvertently mark
>> plumbing messages for translation, particularly something in a shared
>> code path, and this was a way to hopefully smoke out those issues with
>> the test suite.
>>
>> However compiling with it breaks a couple of dozen tests, I stopped
>> digging when I saw some broke back in 2014.
>>
>> What should be done about this? I think if we're going to keep them
>> they need to be run regularly by something like Travis (Lars CC'd),
>> however empirical evidence suggests that not running them is just fine
>> too, so should we just remove support for this test mode?
>>
>> I don't care, but I can come up with the patch either way, but would
>> only be motivated to write the one-time fix for it if some CI system
>> is actually running them regularly, otherwise they'll just be subject
>> to bitrotting again.
>>
>
> I use that switch when I change something that involves l10n, but
> usually I run specific tests only. To be honest: I have to make sure not
> to get confused by (nor forget one of) the build flag GETTEXT_POISON and
> the environment variable GIT_GETTEXT_POISON. I'm not sure I always
> tested what I meant to test...

For any of the built-in tests, you just need to compile git with
GETTEXT_POISON=YesPlease, the env var is set by test-lib.sh for you,
you only need to set GIT_GETTEXT_POISON=1 if you're ad-hoc running
some git command yourself, e.g.:

$ ./git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working tree clean
$ GIT_GETTEXT_POISON=1 ./git status
# GETTEXT POISON #master
# GETTEXT POISON #

[PATCH v6 4/8] convert: Separate generic structures and variables from the filter specific ones

2017-04-21 Thread Ben Peart

To enable future reuse of the filter..process infrastructure,
split the cmd2process structure into two separate parts.

subprocess_entry will now contain the generic data required to manage
the creation and tracking of the child process in a hashmap. Also move
all knowledge of the hashmap into the generic functions.

cmd2process is a filter protocol specific structure that is used to
track the negotiated capabilities of the filter.

Signed-off-by: Ben Peart 
---
 convert.c | 35 ---
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/convert.c b/convert.c
index 36401fe087..bfb19beed5 100644
--- a/convert.c
+++ b/convert.c
@@ -496,26 +496,31 @@ static int apply_single_file_filter(const char *path, 
const char *src, size_t le
 #define CAP_CLEAN(1u<<0)
 #define CAP_SMUDGE   (1u<<1)
 
-struct cmd2process {
+struct subprocess_entry {
struct hashmap_entry ent; /* must be the first member! */
-   unsigned int supported_capabilities;
const char *cmd;
struct child_process process;
 };
 
+struct cmd2process {
+   struct subprocess_entry subprocess; /* must be the first member! */
+   unsigned int supported_capabilities;
+};
+
 static int cmd_process_map_initialized;
 static struct hashmap cmd_process_map;
 
-static int cmd2process_cmp(const struct cmd2process *e1,
-  const struct cmd2process *e2,
+static int cmd2process_cmp(const struct subprocess_entry *e1,
+  const struct subprocess_entry *e2,
   const void *unused)
 {
return strcmp(e1->cmd, e2->cmd);
 }
 
-static struct cmd2process *find_multi_file_filter_entry(struct hashmap 
*hashmap, const char *cmd)
+static struct subprocess_entry *find_multi_file_filter_entry(struct hashmap 
*hashmap, const char *cmd)
 {
-   struct cmd2process key;
+   struct subprocess_entry key;
+
hashmap_entry_init(, strhash(cmd));
key.cmd = cmd;
return hashmap_get(hashmap, , NULL);
@@ -541,7 +546,7 @@ static void read_multi_file_filter_status(int fd, struct 
strbuf *status)
}
 }
 
-static void kill_multi_file_filter(struct hashmap *hashmap, struct cmd2process 
*entry)
+static void kill_multi_file_filter(struct hashmap *hashmap, struct 
subprocess_entry *entry)
 {
if (!entry)
return;
@@ -571,8 +576,8 @@ static int start_multi_file_filter_fn(struct cmd2process 
*entry)
struct string_list cap_list = STRING_LIST_INIT_NODUP;
char *cap_buf;
const char *cap_name;
-   struct child_process *process = >process;
-   const char *cmd = entry->cmd;
+   struct child_process *process = >subprocess.process;
+   const char *cmd = entry->subprocess.cmd;
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -632,9 +637,9 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
const char *argv[] = { cmd, NULL };
 
entry = xmalloc(sizeof(*entry));
-   entry->cmd = cmd;
+   entry->subprocess.cmd = cmd;
entry->supported_capabilities = 0;
-   process = >process;
+   process = >subprocess.process;
 
child_process_init(process);
process->argv = argv;
@@ -654,7 +659,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
err = start_multi_file_filter_fn(entry);
if (err) {
error("initialization for external filter '%s' failed", cmd);
-   kill_multi_file_filter(hashmap, entry);
+   kill_multi_file_filter(hashmap, >subprocess);
return NULL;
}
 
@@ -678,7 +683,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
hashmap_init(_process_map, (hashmap_cmp_fn) 
cmd2process_cmp, 0);
entry = NULL;
} else {
-   entry = find_multi_file_filter_entry(_process_map, cmd);
+   entry = (struct cmd2process 
*)find_multi_file_filter_entry(_process_map, cmd);
}
 
fflush(NULL);
@@ -688,7 +693,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
if (!entry)
return 0;
}
-   process = >process;
+   process = >subprocess.process;
 
if (!(wanted_capability & entry->supported_capabilities))
return 0;
@@ -759,7 +764,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
 * Force shutdown and restart if another blob requires 
filtering.
 */
error("external filter '%s' failed", cmd);
-   kill_multi_file_filter(_process_map, entry);
+   kill_multi_file_filter(_process_map, 
>subprocess);
}
} else {
strbuf_swap(dst, );
-- 
2.12.0.windows.1.33.g243d9b384c

[PATCH v6 3/8] convert: Split start_multi_file_filter into two separate functions

2017-04-21 Thread Ben Peart

To enable future reuse of the filter..process infrastructure,
split start_multi_file_filter into two separate parts.

start_multi_file_filter will now only contain the generic logic to
manage the creation and tracking of the child process in a hashmap.

start_multi_file_filter_fn is a protocol specific initialization
function that will negotiate the multi-file-filter interface version
and capabilities.

Signed-off-by: Ben Peart 
---
 convert.c | 62 --
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/convert.c b/convert.c
index 793c29ebfd..36401fe087 100644
--- a/convert.c
+++ b/convert.c
@@ -565,35 +565,14 @@ static void stop_multi_file_filter(struct child_process 
*process)
finish_command(process);
 }
 
-static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, 
const char *cmd)
+static int start_multi_file_filter_fn(struct cmd2process *entry)
 {
int err;
-   struct cmd2process *entry;
-   struct child_process *process;
-   const char *argv[] = { cmd, NULL };
struct string_list cap_list = STRING_LIST_INIT_NODUP;
char *cap_buf;
const char *cap_name;
-
-   entry = xmalloc(sizeof(*entry));
-   entry->cmd = cmd;
-   entry->supported_capabilities = 0;
-   process = >process;
-
-   child_process_init(process);
-   process->argv = argv;
-   process->use_shell = 1;
-   process->in = -1;
-   process->out = -1;
-   process->clean_on_exit = 1;
-   process->clean_on_exit_handler = stop_multi_file_filter;
-
-   if (start_command(process)) {
-   error("cannot fork to run external filter '%s'", cmd);
-   return NULL;
-   }
-
-   hashmap_entry_init(entry, strhash(cmd));
+   struct child_process *process = >process;
+   const char *cmd = entry->cmd;
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -642,7 +621,38 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
 done:
sigchain_pop(SIGPIPE);
 
-   if (err || errno == EPIPE) {
+   return err;
+}
+
+static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, 
const char *cmd)
+{
+   int err;
+   struct cmd2process *entry;
+   struct child_process *process;
+   const char *argv[] = { cmd, NULL };
+
+   entry = xmalloc(sizeof(*entry));
+   entry->cmd = cmd;
+   entry->supported_capabilities = 0;
+   process = >process;
+
+   child_process_init(process);
+   process->argv = argv;
+   process->use_shell = 1;
+   process->in = -1;
+   process->out = -1;
+   process->clean_on_exit = 1;
+   process->clean_on_exit_handler = stop_multi_file_filter;
+
+   if (start_command(process)) {
+   error("cannot fork to run external filter '%s'", cmd);
+   return NULL;
+   }
+
+   hashmap_entry_init(entry, strhash(cmd));
+
+   err = start_multi_file_filter_fn(entry);
+   if (err) {
error("initialization for external filter '%s' failed", cmd);
kill_multi_file_filter(hashmap, entry);
return NULL;
@@ -733,7 +743,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
 done:
sigchain_pop(SIGPIPE);
 
-   if (err || errno == EPIPE) {
+   if (err) {
if (!strcmp(filter_status.buf, "error")) {
/* The filter signaled a problem with the file. */
} else if (!strcmp(filter_status.buf, "abort")) {
-- 
2.12.0.windows.1.33.g243d9b384c

[PATCH v6 1/8] pkt-line: add packet_read_line_gently()

2017-04-21 Thread Ben Peart

Add packet_read_line_gently() to enable reading a line without dying on
EOF.

Signed-off-by: Ben Peart 
---
 pkt-line.c | 14 +-
 pkt-line.h | 10 ++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/pkt-line.c b/pkt-line.c
index d4b6bfe076..bfdb177b34 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -315,7 +315,7 @@ static char *packet_read_line_generic(int fd,
  PACKET_READ_CHOMP_NEWLINE);
if (dst_len)
*dst_len = len;
-   return len ? packet_buffer : NULL;
+   return (len > 0) ? packet_buffer : NULL;
 }
 
 char *packet_read_line(int fd, int *len_p)
@@ -323,6 +323,18 @@ char *packet_read_line(int fd, int *len_p)
return packet_read_line_generic(fd, NULL, NULL, len_p);
 }
 
+int packet_read_line_gently(int fd, int *dst_len, char** dst_line)
+{
+   int len = packet_read(fd, NULL, NULL,
+ packet_buffer, sizeof(packet_buffer),
+ 
PACKET_READ_CHOMP_NEWLINE|PACKET_READ_GENTLE_ON_EOF);
+   if (dst_len)
+   *dst_len = len;
+   if (dst_line)
+   *dst_line = (len > 0) ? packet_buffer : NULL;
+   return len;
+}
+
 char *packet_read_line_buf(char **src, size_t *src_len, int *dst_len)
 {
return packet_read_line_generic(-1, src, src_len, dst_len);
diff --git a/pkt-line.h b/pkt-line.h
index 18eac64830..ad30db101a 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -74,6 +74,16 @@ int packet_read(int fd, char **src_buffer, size_t *src_len, 
char
 char *packet_read_line(int fd, int *size);
 
 /*
+ * Convenience wrapper for packet_read that sets the PACKET_READ_GENTLE_ON_EOF
+ * and CHOMP_NEWLINE options. The return value specifies the number of bytes
+ * read into the buffer or -1 on truncated input. If the *dst_line parameter
+ * is not NULL it will return NULL for a flush packet and otherwise points to
+ * a static buffer (that may be overwritten by subsequent calls). If the size
+ * parameter is not NULL, the length of the packet is written to it.
+ */
+int packet_read_line_gently(int fd, int *size, char** dst_line);
+
+/*
  * Same as packet_read_line, but read from a buf rather than a descriptor;
  * see packet_read for details on how src_* is used.
  */
-- 
2.12.0.windows.1.33.g243d9b384c

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Christian Couder

On Fri, Apr 21, 2017 at 4:25 PM, Christian Couder
 wrote:
> On Thu, Apr 20, 2017 at 11:24 PM, Thomas Gummerer  
> wrote:
>> On 04/20, Christian Couder wrote:
>>>
>>> Could you try with the following patch:
>>>
>>> http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/
>>
>> Yeah, I tried with and without that patch with the same result.
>> Unless I'm screwing something up when testing I don't think this fixes
>> the issue unfortunately.
>
> I just tried on "pu" and only the first test
> (t7009-filter-branch-null-sha1.sh) fails there.

I bisected this test's failure (when using
GIT_TEST_SPLIT_INDEX=YesPlease) to e6a1dd77e1 (read-cache: regenerate
shared index if necessary, 2017-02-27).

The failing test is the following:

test_expect_success 'filter commands are still checked' '
test_must_fail git filter-branch \
--force --prune-empty \
--index-filter "git rm --cached --ignore-unmatch three.t"
'

And if I add the following at the beginning of the test:

git config splitIndex.maxPercentChange 100 &&

the test then passes.

So It looks like in split index mode the test doesn't expect the
shared index to be regenerated.
Maybe Peff, as he is the author of this test, or Duy have an idea about this?

Re: [BUG] test suite broken with GETTEXT_POISON=YesPlease

2017-04-21 Thread Ævar Arnfjörð Bjarmason

On Fri, Apr 21, 2017 at 4:54 PM, Lars Schneider
 wrote:
>
>> Am 20.04.2017 um 23:58 schrieb Ævar Arnfjörð Bjarmason :
>>
>> As a refresh of everyone's memory (because mine needed it). This is a
>> feature I added back in 2011 when the i18n support was initially
>> added.
>>
>> There was concern at the time that we would inadvertently mark
>> plumbing messages for translation, particularly something in a shared
>> code path, and this was a way to hopefully smoke out those issues with
>> the test suite.
>>
>> However compiling with it breaks a couple of dozen tests, I stopped
>> digging when I saw some broke back in 2014.
>>
>> What should be done about this? I think if we're going to keep them
>> they need to be run regularly by something like Travis (Lars CC'd),
>> however empirical evidence suggests that not running them is just fine
>> too, so should we just remove support for this test mode?
>
> Right now we are building and testing Git in the following configurations:
>
> 1. Linux, gcc, stable Perforce an GitLFS version (used by git-p4 tests)
> 2. Linux, gcc, stable Perforce an GitLFS version (used by git-p4 tests) *
> 3. OSX, clang, latest Perforce an GitLFS version (used by git-p4 tests)
> 4. OSX, clang, latest Perforce an GitLFS version (used by git-p4 tests) *
> 5. Linux32, gcc, no git-p4 tests
> 6. Windows, gcc, no git-p4 tests
>
> 1-4 run the same tests right now. This was especially useful in the beginning 
> to identify flaky tests (t0025 is still flaky!).
>
> We could easily run the tests in 1-4 with different configurations. E.g. 
> enable GETTEXT_POISON in 2.
>
> Cheers,
> Lars
>
> *) 2 and 4 use the wrong compiler right now. 2 should use clang on Linux and 
> 4 should use gcc. A patch is on my todo list.

Great, thanks. I'll fixup the poison tests then.

[PATCHv3 4/4] builtin/reset: add --recurse-submodules switch

2017-04-21 Thread Stefan Beller

git-reset is yet another working tree manipulator, which should
be taught about submodules.

When a user uses git-reset and requests to recurse into submodules,
this will reset the submodules to the object name as recorded in the
superproject, detaching the HEADs.

Signed-off-by: Stefan Beller 
---

  This replaces the topmost patch in sb/reset-recurse-submodules.
  The only difference is the rewording of the commit message.
  
  Thanks,
  Stefan

 builtin/reset.c| 30 ++
 t/t7112-reset-submodule.sh |  8 
 2 files changed, 38 insertions(+)

diff --git a/builtin/reset.c b/builtin/reset.c
index fc3b906c47..5ce27fcaed 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -21,6 +21,27 @@
 #include "parse-options.h"
 #include "unpack-trees.h"
 #include "cache-tree.h"
+#include "submodule.h"
+#include "submodule-config.h"
+
+static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
+
+static int option_parse_recurse_submodules(const struct option *opt,
+  const char *arg, int unset)
+{
+   if (unset) {
+   recurse_submodules = RECURSE_SUBMODULES_OFF;
+   return 0;
+   }
+   if (arg)
+   recurse_submodules =
+   parse_update_recurse_submodules_arg(opt->long_name,
+   arg);
+   else
+   recurse_submodules = RECURSE_SUBMODULES_ON;
+
+   return 0;
+}
 
 static const char * const git_reset_usage[] = {
N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] 
[]"),
@@ -283,6 +304,9 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
N_("reset HEAD, index and working tree"), 
MERGE),
OPT_SET_INT(0, "keep", _type,
N_("reset HEAD but keep local changes"), KEEP),
+   { OPTION_CALLBACK, 0, "recurse-submodules", _submodules,
+   "reset", "control recursive updating of submodules",
+   PARSE_OPT_OPTARG, option_parse_recurse_submodules },
OPT_BOOL('p', "patch", _mode, N_("select hunks 
interactively")),
OPT_BOOL('N', "intent-to-add", _to_add,
N_("record only the fact that removed paths 
will be added later")),
@@ -295,6 +319,12 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
PARSE_OPT_KEEP_DASHDASH);
parse_args(, argv, prefix, patch_mode, );
 
+   if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT) {
+   gitmodules_config();
+   git_config(submodule_config, NULL);
+   set_config_update_recurse_submodules(RECURSE_SUBMODULES_ON);
+   }
+
unborn = !strcmp(rev, "HEAD") && get_sha1("HEAD", oid.hash);
if (unborn) {
/* reset on unborn branch: treat as reset to empty tree */
diff --git a/t/t7112-reset-submodule.sh b/t/t7112-reset-submodule.sh
index 2eda6adeb1..f86ccdf215 100755
--- a/t/t7112-reset-submodule.sh
+++ b/t/t7112-reset-submodule.sh
@@ -5,6 +5,14 @@ test_description='reset can handle submodules'
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-submodule-update.sh
 
+KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED=1
+KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS=1
+KNOWN_FAILURE_SUBMODULE_OVERWRITE_IGNORED_UNTRACKED=1
+
+test_submodule_switch_recursing "git reset --recurse-submodules --keep"
+
+test_submodule_forced_switch_recursing "git reset --hard --recurse-submodules"
+
 test_submodule_switch "git reset --keep"
 
 test_submodule_switch "git reset --merge"
-- 
2.13.0.rc0.2.g0d1ae48b0e

Re: [BUG] test suite broken with GETTEXT_POISON=YesPlease

2017-04-21 Thread Michael J Gruber

Ævar Arnfjörð Bjarmason venit, vidit, dixit 20.04.2017 23:58:
> As a refresh of everyone's memory (because mine needed it). This is a
> feature I added back in 2011 when the i18n support was initially
> added.
> 
> There was concern at the time that we would inadvertently mark
> plumbing messages for translation, particularly something in a shared
> code path, and this was a way to hopefully smoke out those issues with
> the test suite.
> 
> However compiling with it breaks a couple of dozen tests, I stopped
> digging when I saw some broke back in 2014.
> 
> What should be done about this? I think if we're going to keep them
> they need to be run regularly by something like Travis (Lars CC'd),
> however empirical evidence suggests that not running them is just fine
> too, so should we just remove support for this test mode?
> 
> I don't care, but I can come up with the patch either way, but would
> only be motivated to write the one-time fix for it if some CI system
> is actually running them regularly, otherwise they'll just be subject
> to bitrotting again.
> 

I use that switch when I change something that involves l10n, but
usually I run specific tests only. To be honest: I have to make sure not
to get confused by (nor forget one of) the build flag GETTEXT_POISON and
the environment variable GIT_GETTEXT_POISON. I'm not sure I always
tested what I meant to test...

Michael

RE: Proposal for "fetch-any-blob Git protocol" and server design

2017-04-21 Thread Kevin David

Hi Jonathan,

Sorry for the delayed response - other work got in the way, unfortunately!

Kevin

-Original Message-
From: Jonathan Tan [mailto:jonathanta...@google.com] 
Sent: Thursday, April 13, 2017 4:13 PM

>> I know we're considering server behavior here, but how large do you generally
>> expect these blob-want requests to be? I ask because we took an initial 
>> approach
>> very similar to this, however, we had a hard time being clever about 
>> figuring out
>> what set of blobs to request for those clients that didn't want the entire 
>> set, and
>> ended up falling back to single-blob requests.
>>
>> Obviously, this could be due to thenature of our 
>> filesystem-virtualization-based client,
>> but I also suspect that the teams attacking this problem are more often than 
>> not dealing
>> with very large blob objects, so the cost of a round-trip becomes lower 
>> relative to sending
>> the object content itself.
>
> I am envisioning (1a) as described in Jeff Hostetler's e-mail [1] ("a 
> pre-command or hook to identify needed blobs and pre-fetch them before 
> allowing the actual command to start"), so a Git command would typically 
> make a single request that contains all the blobs required, but my 
> proposal can also handle (1c) ('"fault" them in as necessary in 
> read_object() while the command is running and without any pre-fetch 
> (either synchronously or asynchronously and with/without a helper 
> process)').
> 
> Even if we decided to go with single-blob requests and responses, it is 
> still important to send them as packfiles, so that the server can serve 
> them directly from its compressed storage without first having to 
> uncompress them.
> 
> [1] 
> https://public-inbox.org/git/1488999039-37631-1-git-send-email-...@jeffhostetler.com/

Ah, I missed this. If we think we can build meaningfully-sized requests via 
(1a) and (1b), 
then I agree - packs are optimal.

However, if single-blob requests/responses dominate, I have a few concerns, 
mostly from 
experience playing with something similar: 
* Regardless of the size of the object or number returned, the client will need 
to 
  `index-pack` the result and create a corresponding `.idx` file, which 
requires 
  decompression to construct (right?)

* Unless the client's repository is repacked aggressively, we'll pollute the 
  `.git\objects\pack` directory with little indexes (somewhere around 4KiB 
minimum) 
  and packfiles rapidly, which would degrade performance. One rudimentary 
workaround 
  would be to loosen these packs on the client if they were under a certain 
  size/object count. I think fetch does this already?

In either case, shifting the pack decompression/loose object recompression 
problem 
to the client instead of the server is probably a good principle, but in our 
case it 
simply wasn't fast enough to serve single blobs interactively (e.g. opening a 
source 
file you don't have locally). I'm hopeful that the proposed partial clone 
solutions you 
referenced would reduce the frequency of this being required.

>> Being a bit more clever about packing objects (e.g. splitting blobs out from 
>> commits
>> and trees) improved this a bit, but we still hit a bottlenecks from what 
>> appeared to
>> be a large number of memory-mapping operations on a ~140GiB packfile of 
>> blobs.
>> 
>> Each `pack-objects` process would consume approximately one CPU core for the
>> duration of the request. It's possible that further splitting of these large 
>> blob packs
>> would have improved performance in some scenarios, but that would increase 
>> the
>> amount of pack-index lookups necessary to find a single object.
>
> I'm not very experienced with mmap, but I thought that memory-mapping a 
> large file in itself does not incur much of a performance penalty (if 
> any) - it is the accesses that count. I experimented with 15,000 and 
> 150,000 MiB files and mmap and they seem to be handled quite well. Also, 
> how many objects are "pack-objects" packing here?

Back when we took this approach, it was ~4000 blob objects at a time. 
Perhaps we were being bitten by the Windows implementation of git_mmap[3]?. 
When I profiled the 4k blob scenario, the majority of CPU and wall time was 
spent in MapViewOfFileEx, which looks like it could mean accesses as well.

[3] https://github.com/git/git/blob/master/compat/win32mmap.c

>> This seems like a clever way to avoid the canonical 
>> `/info/refs?service=git-upload-pack`
>> capability negotiation on every call. However, using error handling to 
>> fallback seems
>> slightly wonky to me. Hopefully users are incentivized to upgrade their 
>> clients.
>
> By "error handling to fallback", do you mean in my proposal or in a 
> possible future one (assuming my proposal is implemented)? I don't think 
> my proposal requires any error handling to fallback (since only new 
> clients can clone partially - old clients will just clone totally and 
> obliviously), but I acknowledge that this

Re: [BUG] test suite broken with GETTEXT_POISON=YesPlease

2017-04-21 Thread Lars Schneider


> Am 20.04.2017 um 23:58 schrieb Ævar Arnfjörð Bjarmason :
> 
> As a refresh of everyone's memory (because mine needed it). This is a
> feature I added back in 2011 when the i18n support was initially
> added.
> 
> There was concern at the time that we would inadvertently mark
> plumbing messages for translation, particularly something in a shared
> code path, and this was a way to hopefully smoke out those issues with
> the test suite.
> 
> However compiling with it breaks a couple of dozen tests, I stopped
> digging when I saw some broke back in 2014.
> 
> What should be done about this? I think if we're going to keep them
> they need to be run regularly by something like Travis (Lars CC'd),
> however empirical evidence suggests that not running them is just fine
> too, so should we just remove support for this test mode?

Right now we are building and testing Git in the following configurations:

1. Linux, gcc, stable Perforce an GitLFS version (used by git-p4 tests)
2. Linux, gcc, stable Perforce an GitLFS version (used by git-p4 tests) *
3. OSX, clang, latest Perforce an GitLFS version (used by git-p4 tests)
4. OSX, clang, latest Perforce an GitLFS version (used by git-p4 tests) *
5. Linux32, gcc, no git-p4 tests
6. Windows, gcc, no git-p4 tests

1-4 run the same tests right now. This was especially useful in the beginning 
to identify flaky tests (t0025 is still flaky!).

We could easily run the tests in 1-4 with different configurations. E.g. enable 
GETTEXT_POISON in 2.

Cheers,
Lars

*) 2 and 4 use the wrong compiler right now. 2 should use clang on Linux and 4 
should use gcc. A patch is on my todo list.

Re: [PATCH 0/6] nd/worktree-move update

2017-04-21 Thread Jeff King

On Thu, Apr 20, 2017 at 05:10:18PM +0700, Nguyễn Thái Ngọc Duy wrote:

>  - fixes the compile problem on latest master (because prefix_filename
>takes one argument less)

It also now returns an allocated buffer.

So:

> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -561,9 +561,7 @@ static int move_worktree(int ac, const char **av, const 
> char *prefix)
>   if (ac != 2)
>   usage_with_options(worktree_usage, options);
>  
> - strbuf_addstr(, prefix_filename(prefix,
> - strlen(prefix),
> - av[1]));
> + strbuf_addstr(, prefix_filename(prefix, av[1]));

...this is now a leak. Probably:

  const char *filename = prefix_filename(prefix, av[1]);
  strbuf_attach(, filename, strlen(filename), strlen(filename));

is what you want. That would be less awkward if we had a
strbuf_attach_str().

Or if we had a strbuf variant of prefix_filename(), you could do:

  prefix_filename_buf(, prefix, av[1]);

I almost added that when I did the prefix_filename() work, since it uses
a strbuf internally. But there were no callers that would have used it.
Maybe it's worth doing now.

-- >8 --
>From d0b933fc023023a017df9268360aa327c28b90f0 Mon Sep 17 00:00:00 2001
From: Jeff King 
Date: Fri, 21 Apr 2017 10:53:07 -0400
Subject: [PATCH] prefix_filename: add strbuf variant

Now that prefix_filename() always allocates, it's awkward to
put its value directly into a strbuf. You have to either
free:

  const char *filename = prefix_filename(prefix, arg);
  strbuf_addstr(, filename);
  free(filename);

or you have to attach:

  const char *filename = prefix_filename(prefix, arg);
  strbuf_attach(, filename, strlen(filename), strlen(filename));

Since we're already using a strbuf internally, it's easy to
provide a variant that lets you write directly into one:

  prefix_filename_buf(, prefix, arg);

For consistency with git_path_buf(), the function overwrites
the strbuf rather than appending.

Signed-off-by: Jeff King 
---
 abspath.c | 17 -
 cache.h   |  6 ++
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/abspath.c b/abspath.c
index 7f1cfe979..7594f62a5 100644
--- a/abspath.c
+++ b/abspath.c
@@ -246,21 +246,28 @@ char *absolute_pathdup(const char *path)
return strbuf_detach(, NULL);
 }
 
-char *prefix_filename(const char *pfx, const char *arg)
+void prefix_filename_buf(struct strbuf *path, const char *pfx, const char *arg)
 {
-   struct strbuf path = STRBUF_INIT;
size_t pfx_len = pfx ? strlen(pfx) : 0;
 
+   strbuf_reset(path);
+
if (!pfx_len)
; /* nothing to prefix */
else if (is_absolute_path(arg))
pfx_len = 0;
else
-   strbuf_add(, pfx, pfx_len);
+   strbuf_add(path, pfx, pfx_len);
 
-   strbuf_addstr(, arg);
+   strbuf_addstr(path, arg);
 #ifdef GIT_WINDOWS_NATIVE
-   convert_slashes(path.buf + pfx_len);
+   convert_slashes(path->buf + pfx_len);
 #endif
+}
+
+char *prefix_filename(const char *pfx, const char *arg)
+{
+   struct strbuf path = STRBUF_INIT;
+   prefix_filename_buf(, pfx, arg);
return strbuf_detach(, NULL);
 }
diff --git a/cache.h b/cache.h
index ba27595d5..209039a98 100644
--- a/cache.h
+++ b/cache.h
@@ -548,6 +548,12 @@ extern char *prefix_path_gently(const char *prefix, int 
len, int *remaining, con
  */
 extern char *prefix_filename(const char *prefix, const char *path);
 
+/*
+ * Like prefix_filename, but write into "buf", overwriting any
+ * previous contents of the strbuf.
+ */
+extern void prefix_filename_buf(struct strbuf *out, const char *prefix, const 
char *path);
+
 extern int check_filename(const char *prefix, const char *name);
 extern void verify_filename(const char *prefix,
const char *name,
-- 
2.13.0.rc0.364.g36b4d8031

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Christian Couder

On Thu, Apr 20, 2017 at 11:24 PM, Thomas Gummerer  wrote:
> On 04/20, Christian Couder wrote:
>>
>> Could you try with the following patch:
>>
>> http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/
>
> Yeah, I tried with and without that patch with the same result.
> Unless I'm screwing something up when testing I don't think this fixes
> the issue unfortunately.

I just tried on "pu" and only the first test
(t7009-filter-branch-null-sha1.sh) fails there.

Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)

2017-04-21 Thread Christian Couder

On Fri, Apr 21, 2017 at 11:50 AM, Johannes Schindelin
 wrote:
>
> Since that "let's aggregate everything and only push out the final result
> at the end of the day" approach does not really allow the Continuous
> Testing to identify problems associated with individual topic branches, I
> have another job that bisects the breakages

Great!

> (with all associated problems
> I reported earlier, as you apply some patches on top of really ancient
> commits and bisect wants to test all merge bases first)

First bisect should ask you to test merge bases only if there are
"good" commits that are not ancestors of the "bad" commit.
So I don't know how you want to bisect, but it seems to me that you
could just go back to a previous know "good" commit when the one you
have is not an ancestor of the "bad" commit.
This would get rid of the whole merge base problem.

Second yeah there is probably an old bug in bisect there. In theory in
most cases bisect should ask you to test only one merge base, as:

- if the merge base is "bad", it means that the bug has been fixed
between the merge base and your "good" commit, and bisecting will
stop,
- if the merge base is "good", it means that all the merge bases that
are ancestor of this merge base are also good, so there is no need to
test them

But that theory can work only if bisect is asking you to test a merge
base that is a descendant of many (if not all the other merge bases),
but unfortunately I don't think bisect is smart enough for that yet.

> because the
> required time *definitely* would let Travis time out all the time. Those
> bisect results are even less visible than the Travis results, see e.g.
> https://github.com/git/git/commit/2e3a8b9035a#commitcomment-21836854.

Nice that they exists though!

Re: [PATCH] completion: expand "push --delete " for refs on that

2017-04-21 Thread SZEDER Gábor

On Tue, Apr 18, 2017 at 3:31 PM, Ævar Arnfjörð Bjarmason  
wrote:
> Change the completion of "push --delete  " to complete
> refs on that , not all refs.

Good.

> Before this e.g. cloning git.git
> and doing "git push --delete origin p" will complete nothing,

Well, it will complete all local branches starting with 'p', but
perhaps you don't happen to have any.

> whereas origin/p will uselessly complete origin/pu.
> 
> Now p will complete as "pu". The completion of giving --delete
> later, e.g. "git push origin --delete p" remains unchanged, this
> is a bug, but is a general existing limitation of the bash completion,
> and not how git-push is documented, so I'm not fixing that case.
> 
> I looked over t9902-completion.sh but couldn't quickly find out how to
> add a test for this,

Yeah, this helper function has to look at the whole command line to do
its thing, and we don't have other unit test-like tests doing
something like that.

One option would be something like this:

@@ -1162,6 +1162,19 @@ test_expect_success '__git_complete_fetch_refspecs - 
fully qualified & prefix' '
test_cmp expected out
 '

+test_expect_success '__git_complete_remote_or_refspec - push -d' '
+   sed -e "s/Z$//" >expected <<-EOF &&
+   master-in-other Z
+   EOF
+   (
+   words=(git push -d other ma) &&
+   cword=${#words[@]} cur=${words[cword-1]} &&
+   __git_complete_remote_or_refspec &&
+   print_comp
+   ) &&
+   test_cmp expected out
+'
+
 test_expect_success 'teardown after ref completion' '
git branch -d matching-branch &&
git tag -d matching-tag &&

This is chatty, no question about that, but it only excercises
__git_complete_remote_or_refspec() (and __git_refs() behind its back),
and thus fits right in there with other refs completion tests.

The other option would be something like this:

@@ -1348,6 +1361,10 @@ test_expect_success 'complete tree filename with 
metacharacters' '
EOF
 '

+test_expect_success 'complete remote refs for git push -d' '
+   test_completion "git push -d other ma" "master-in-other "
+'
+
 test_expect_success 'send-email' '
test_completion "git send-email --cov" "--cover-letter " &&
test_completion "git send-email ma" "master "

While this is much more compact, it does excercise the whole shebang,
therefore it has to go to the integration tests.  However, at that
point in the test script there aren't any remote refs in the
repository (and, consequently this test will fail as it is), so you
would need to add a few, which in turn would most likely require
adjustments in other tests.

I'm partial to the former, even if it's more lines of code, because if
it were to fail, then it already narrowed down the scope where we'd
need to look for the cause of the failure.

Take your pick :)

> but all the existing tests pass, and all my
> manual testing of "git push --delete  ..." does the right
> thing now.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  contrib/completion/git-completion.bash | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 1150164d5c..2e5b3ed776 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -701,7 +701,7 @@ __git_complete_revlist ()
>  __git_complete_remote_or_refspec ()
>  {
>   local cur_="$cur" cmd="${words[1]}"
> - local i c=2 remote="" pfx="" lhs=1 no_complete_refspec=0
> + local i c=2 remote="" pfx="" lhs=1 no_complete_refspec=0 delete=0
>   if [ "$cmd" = "remote" ]; then
>   ((c++))
>   fi
> @@ -709,6 +709,7 @@ __git_complete_remote_or_refspec ()
>   i="${words[c]}"
>   case "$i" in
>   --mirror) [ "$cmd" = "push" ] && no_complete_refspec=1 ;;
> + --delete) delete=1 ;;

I noticed the two identical __git_complete_refs() calls in the hunk
below.  How about:

  -d|--delete) [ "$cmd" = "push" ] && lhs=0 ;;

First, it recognizes the short option, too.
Second, with 'push -d' any ref is interpreted as the right hand side
of a refspec whose left hand side is empty (i.e. '-d pu' means ':pu').
That 'lhs=0' tells the rest of the function to complete the right hand
side of a refspec, i.e. in case of 'push' to list remote refs, which
is exactly what you want.  And you won't need the extra if branch in
the hunk below, or the new local variable.
In this case, however, we should check that the command is 'push' as
well, just in case the other commands whose completion is driven by
this helper function get these options in the future.

>   --all)
>   case "$cmd" in
>   push) no_complete_refspec=1 ;;
> @@ -761,7 +762,9 @@ __git_complete_remote_or_refspec ()
>   fi
>   ;;
>   push)
> -

Re: [PATCH 00/15] Handle fopen() errors

2017-04-21 Thread Duy Nguyen

On Fri, Apr 21, 2017 at 6:52 PM, Junio C Hamano  wrote:
> Yes, but (1) we'd need to be careful about --quiet

Yeah. It's a real pain point for making changes like this. At some
point we should just have a global (maybe multi-level) quiet flag.
-- 
Duy

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Duy Nguyen

On Fri, Apr 21, 2017 at 6:57 PM, Christian Couder
 wrote:
> On Fri, Apr 21, 2017 at 1:46 PM, Christian Couder
>  wrote:
>> On Fri, Apr 21, 2017 at 11:53 AM, Duy Nguyen  wrote:
>>> On Fri, Apr 21, 2017 at 2:10 PM, Christian Couder
>>>  wrote:
 On Thu, Apr 20, 2017 at 11:24 PM, Thomas Gummerer  
 wrote:
> On 04/20, Christian Couder wrote:
>>
>> Could you try with the following patch:
>>
>> http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/
>
> Yeah, I tried with and without that patch with the same result.
> Unless I'm screwing something up when testing I don't think this fixes
> the issue unfortunately.

 Ok, I will take a look soon.

 By the way I think that GIT_TEST_SPLIT_INDEX has become redundant now
 that there is core.splitIndex.
 So perhaps in the long run it will be best to deprecate
 GIT_TEST_SPLIT_INDEX and eventually remove it.
>>>
>>> I think you can't, at least the way I understand this variable. It's a
>>> _test_ variable to force exercise split index code path a whole lot
>>> more, by running the entire test suite with split index always
>>> enabled, instead of just a couple in  t-split-index.sh. We can't
>>> achieve the same with core.splitIndex because that's more about user
>>> control and you can't just set core.splitIndex for the entire test
>>> suite (can we?).
>>
>> Yeah, you are right.
>> It looks like we have GIT_TEST_OPTS to pass options like --debug,
>> --valgrind, --verbose, but we don't have an environment variable to
>> set config options.
>
> Or maybe GIT_CONFIG_PARAMETERS works for this purpose?

It has to be set inside test-lib.sh, not from outside because
environment variables from outside are filtered if I remember
correctly and only a few specials plus those GIT_TEST_ can survive.
Some tests override GIT_CONFIG_PARAMETERS themselves to pass config
vars to certain command (I know because I just did a couple days ago
;).which loses core.splitIndex.
-- 
Duy

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Christian Couder

On Fri, Apr 21, 2017 at 1:46 PM, Christian Couder
 wrote:
> On Fri, Apr 21, 2017 at 11:53 AM, Duy Nguyen  wrote:
>> On Fri, Apr 21, 2017 at 2:10 PM, Christian Couder
>>  wrote:
>>> On Thu, Apr 20, 2017 at 11:24 PM, Thomas Gummerer  
>>> wrote:
 On 04/20, Christian Couder wrote:
>
> Could you try with the following patch:
>
> http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/

 Yeah, I tried with and without that patch with the same result.
 Unless I'm screwing something up when testing I don't think this fixes
 the issue unfortunately.
>>>
>>> Ok, I will take a look soon.
>>>
>>> By the way I think that GIT_TEST_SPLIT_INDEX has become redundant now
>>> that there is core.splitIndex.
>>> So perhaps in the long run it will be best to deprecate
>>> GIT_TEST_SPLIT_INDEX and eventually remove it.
>>
>> I think you can't, at least the way I understand this variable. It's a
>> _test_ variable to force exercise split index code path a whole lot
>> more, by running the entire test suite with split index always
>> enabled, instead of just a couple in  t-split-index.sh. We can't
>> achieve the same with core.splitIndex because that's more about user
>> control and you can't just set core.splitIndex for the entire test
>> suite (can we?).
>
> Yeah, you are right.
> It looks like we have GIT_TEST_OPTS to pass options like --debug,
> --valgrind, --verbose, but we don't have an environment variable to
> set config options.

Or maybe GIT_CONFIG_PARAMETERS works for this purpose?

Re: [PATCH 00/15] Handle fopen() errors

2017-04-21 Thread Junio C Hamano

On Fri, Apr 21, 2017 at 8:04 PM, Duy Nguyen  wrote:
> On Fri, Apr 21, 2017 at 1:29 PM, Jeff King  wrote:
>>
>> I had a similar thought while reading through it. I think it would be
>> shorter still with:
>>
>>   FILE *fopen_or_warn(const char *path, const char *mode)
>>   {
>> FILE *fh = fopen(path, mode);
>> if (!fh)
>> warn_failure_to_read_open_optional_path(path);
>> return fh;
>>   }
>>
>> And then quite a few of the patches could just be
>> s/fopen/fopen_or_warn/.
>
> Jeff.. oh Jeff.. you have made it _way_ too convenient that after a
> quick grep at fopen( again, I found a couple more places that I would
> have just ignored last time (too much work), but now all I need to do
> is Alt-f to the end of fopen and Alt-/ a few times. Too tempting.. :)

Yes, but (1) we'd need to be careful about --quiet, and (2) we would also need
a wrapper for open(2).

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Christian Couder

On Fri, Apr 21, 2017 at 11:53 AM, Duy Nguyen  wrote:
> On Fri, Apr 21, 2017 at 2:10 PM, Christian Couder
>  wrote:
>> On Thu, Apr 20, 2017 at 11:24 PM, Thomas Gummerer  
>> wrote:
>>> On 04/20, Christian Couder wrote:

 Could you try with the following patch:

 http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/
>>>
>>> Yeah, I tried with and without that patch with the same result.
>>> Unless I'm screwing something up when testing I don't think this fixes
>>> the issue unfortunately.
>>
>> Ok, I will take a look soon.
>>
>> By the way I think that GIT_TEST_SPLIT_INDEX has become redundant now
>> that there is core.splitIndex.
>> So perhaps in the long run it will be best to deprecate
>> GIT_TEST_SPLIT_INDEX and eventually remove it.
>
> I think you can't, at least the way I understand this variable. It's a
> _test_ variable to force exercise split index code path a whole lot
> more, by running the entire test suite with split index always
> enabled, instead of just a couple in  t-split-index.sh. We can't
> achieve the same with core.splitIndex because that's more about user
> control and you can't just set core.splitIndex for the entire test
> suite (can we?).

Yeah, you are right.
It looks like we have GIT_TEST_OPTS to pass options like --debug,
--valgrind, --verbose, but we don't have an environment variable to
set config options.

Re: [PATCH 00/15] Handle fopen() errors

2017-04-21 Thread Duy Nguyen

On Fri, Apr 21, 2017 at 1:29 PM, Jeff King  wrote:
> On Thu, Apr 20, 2017 at 08:41:32PM -0700, Junio C Hamano wrote:
>
>> Junio C Hamano  writes:
>>
>> > I wonder if it is OK to only special case ENOENT for !fp cases,
>> > where existing code silently returns.  Perhaps it is trying to read
>> > an optional file, and it returns silently because lack of it is
>> > perfectly OK for the purpose of the code.  Are there cases where
>> > this optional file is inside an optional directory, leading to
>> > ENOTDIR, instead of ENOENT, observed and reported by your check?
>>
>> "git grep -B1 warn_on_inaccessible" is enlightening.  I wonder if we
>> want to wrap the two lines into a hard to misuse helper function,
>> something along this line.  Would having this patch as a preparatory
>> step shrink your series?  The patch count would be the same, but you
>> wouldn't be writing "if (errno != ENOENT)" lines yourself.
>
> I had a similar thought while reading through it. I think it would be
> shorter still with:
>
>   FILE *fopen_or_warn(const char *path, const char *mode)
>   {
> FILE *fh = fopen(path, mode);
> if (!fh)
> warn_failure_to_read_open_optional_path(path);
> return fh;
>   }
>
> And then quite a few of the patches could just be
> s/fopen/fopen_or_warn/.

Jeff.. oh Jeff.. you have made it _way_ too convenient that after a
quick grep at fopen( again, I found a couple more places that I would
have just ignored last time (too much work), but now all I need to do
is Alt-f to the end of fopen and Alt-/ a few times. Too tempting.. :)
-- 
Duy

[PATCH v4 3/9] t0006 & t5000: skip "far in the future" test when time_t is too limited

2017-04-21 Thread Johannes Schindelin

Git's source code refers to timestamps as unsigned long, which is
ill-defined, as there is no guarantee about the number of bits that
data type has.

In preparation of switching to another data type that is large enough
to hold "far in the future" dates, we need to prepare the t0006-date.sh
script for the case where we *still* cannot format those dates if the
system library uses 32-bit time_t.

Signed-off-by: Johannes Schindelin 
---
 t/helper/test-date.c | 5 -
 t/t0006-date.sh  | 4 ++--
 t/t5000-tar-tree.sh  | 2 +-
 t/test-lib.sh| 1 +
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 4727bea255c..ac7c66c733b 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -5,7 +5,8 @@ static const char *usage_msg = "\n"
 "  test-date show: [time_t]...\n"
 "  test-date parse [date]...\n"
 "  test-date approxidate [date]...\n"
-"  test-date is64bit\n";
+"  test-date is64bit\n"
+"  test-date time_t-is64bit\n";
 
 static void show_relative_dates(const char **argv, struct timeval *now)
 {
@@ -96,6 +97,8 @@ int cmd_main(int argc, const char **argv)
parse_approxidate(argv+1, );
else if (!strcmp(*argv, "is64bit"))
return sizeof(unsigned long) == 8 ? 0 : 1;
+   else if (!strcmp(*argv, "time_t-is64bit"))
+   return sizeof(time_t) == 8 ? 0 : 1;
else
usage(usage_msg);
return 0;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 9539b425ffb..42d4ea61ef5 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -53,8 +53,8 @@ check_show unix-local "$TIME" '146600'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT
-check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT
+check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" 
TIME_IS_64BIT,TIME_T_IS_64BIT
+check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" 
TIME_IS_64BIT,TIME_T_IS_64BIT
 
 check_parse() {
echo "$1 -> $2" >expect
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 997aa9dea28..fe2d4f15a73 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -402,7 +402,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future 
mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_success TAR_HUGE,TIME_IS_64BIT 'system tar can read our future 
mtime' '
+test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can 
read our future mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
diff --git a/t/test-lib.sh b/t/test-lib.sh
index beee1d847ff..8d25cb7c183 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1166,3 +1166,4 @@ test_lazy_prereq LONG_IS_64BIT '
 '
 
 test_lazy_prereq TIME_IS_64BIT 'test-date is64bit'
+test_lazy_prereq TIME_T_IS_64BIT 'test-date time_t-is64bit'
-- 
2.12.2.windows.2.406.gd14a8f8640f

[PATCH v4 2/9] t0006 & t5000: prepare for 64-bit timestamps

2017-04-21 Thread Johannes Schindelin

Git's source code refers to timestamps as unsigned longs. On 32-bit
platforms, as well as on Windows, unsigned long is not large enough to
capture dates that are "absurdly far in the future".

It is perfectly valid by the C standard, of course, for the `long` data
type to refer to 32-bit integers. That is why the `time_t` data type
exists: so that it can be 64-bit even if `long` is 32-bit. Git's source
code simply uses an incorrect data type for timestamps, is all.

The earlier quick fix 6b9c38e14cd (t0006: skip "far in the future" test
when unsigned long is not long enough, 2016-07-11) papered over this
issue simply by skipping the respective test cases on platforms where
they would fail due to the data type in use.

This quick fix, however, tests for *long* to be 64-bit or not. What we
need, though, is a test that says whether *whatever data type we use for
timestamps* is 64-bit or not.

The same quick fix was used to handle the similar problem where Git's
source code uses `unsigned long` to represent size, instead of `size_t`,
conflating the two issues.

So let's just add another prerequisite to test specifically whether
timestamps are represented by a 64-bit data type or not. Later, after we
switch to a larger data type, we can flip that prerequisite to test
`time_t` instead of `long`.

Signed-off-by: Johannes Schindelin 
---
 t/helper/test-date.c | 5 -
 t/t0006-date.sh  | 4 ++--
 t/t5000-tar-tree.sh  | 6 +++---
 t/test-lib.sh| 2 ++
 4 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 506054bcd5d..4727bea255c 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -4,7 +4,8 @@ static const char *usage_msg = "\n"
 "  test-date relative [time_t]...\n"
 "  test-date show: [time_t]...\n"
 "  test-date parse [date]...\n"
-"  test-date approxidate [date]...\n";
+"  test-date approxidate [date]...\n"
+"  test-date is64bit\n";
 
 static void show_relative_dates(const char **argv, struct timeval *now)
 {
@@ -93,6 +94,8 @@ int cmd_main(int argc, const char **argv)
parse_dates(argv+1, );
else if (!strcmp(*argv, "approxidate"))
parse_approxidate(argv+1, );
+   else if (!strcmp(*argv, "is64bit"))
+   return sizeof(unsigned long) == 8 ? 0 : 1;
else
usage(usage_msg);
return 0;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index c0c910867d7..9539b425ffb 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -53,8 +53,8 @@ check_show unix-local "$TIME" '146600'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" LONG_IS_64BIT
-check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" LONG_IS_64BIT
+check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT
+check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT
 
 check_parse() {
echo "$1 -> $2" >expect
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 886b6953e40..997aa9dea28 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -390,7 +390,7 @@ test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can 
read our huge size' '
test_cmp expect actual
 '
 
-test_expect_success LONG_IS_64BIT 'set up repository with far-future commit' '
+test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
rm -f .git/index &&
echo content >file &&
git add file &&
@@ -398,11 +398,11 @@ test_expect_success LONG_IS_64BIT 'set up repository with 
far-future commit' '
git commit -m "tempori parendum"
 '
 
-test_expect_success LONG_IS_64BIT 'generate tar with future mtime' '
+test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our future 
mtime' '
+test_expect_success TAR_HUGE,TIME_IS_64BIT 'system tar can read our future 
mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 13b5696822d..beee1d847ff 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1164,3 +1164,5 @@ build_option () {
 test_lazy_prereq LONG_IS_64BIT '
test 8 -le "$(build_option sizeof-long)"
 '
+
+test_lazy_prereq TIME_IS_64BIT 'test-date is64bit'
-- 
2.12.2.windows.2.406.gd14a8f8640f

[PATCH v4 4/9] Specify explicitly where we parse timestamps

2017-04-21 Thread Johannes Schindelin

Currently, Git's source code represents all timestamps as `unsigned
long`. In preparation for using a more appropriate data type, let's
introduce a symbol `parse_timestamp` (currently being defined to
`strtoul`) where appropriate, so that we can later easily switch to,
say, use `strtoull()` instead.

Signed-off-by: Johannes Schindelin 
---
 builtin/am.c   | 2 +-
 builtin/receive-pack.c | 4 ++--
 bundle.c   | 2 +-
 commit.c   | 6 +++---
 date.c | 6 +++---
 fsck.c | 2 +-
 git-compat-util.h  | 2 ++
 pretty.c   | 2 +-
 ref-filter.c   | 2 +-
 refs/files-backend.c   | 2 +-
 t/helper/test-date.c   | 2 +-
 tag.c  | 4 ++--
 upload-pack.c  | 2 +-
 13 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index f7a7a971fbe..2c93adc69c3 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -882,7 +882,7 @@ static int hg_patch_to_mail(FILE *out, FILE *in, int 
keep_cr)
char *end;
 
errno = 0;
-   timestamp = strtoul(str, , 10);
+   timestamp = parse_timestamp(str, , 10);
if (errno)
return error(_("invalid timestamp"));
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 3cba3fd278f..9a4c2a7ade4 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -534,7 +534,7 @@ static const char *check_nonce(const char *buf, size_t len)
retval = NONCE_BAD;
goto leave;
}
-   stamp = strtoul(nonce, , 10);
+   stamp = parse_timestamp(nonce, , 10);
if (bohmac == nonce || bohmac[0] != '-') {
retval = NONCE_BAD;
goto leave;
@@ -552,7 +552,7 @@ static const char *check_nonce(const char *buf, size_t len)
 * would mean it was issued by another server with its clock
 * skewed in the future.
 */
-   ostamp = strtoul(push_cert_nonce, NULL, 10);
+   ostamp = parse_timestamp(push_cert_nonce, NULL, 10);
nonce_stamp_slop = (long)ostamp - (long)stamp;
 
if (nonce_stamp_slop_limit &&
diff --git a/bundle.c b/bundle.c
index bbf4efa0a0a..f43bfcf5ff3 100644
--- a/bundle.c
+++ b/bundle.c
@@ -227,7 +227,7 @@ static int is_tag_in_date_range(struct object *tag, struct 
rev_info *revs)
line = memchr(line, '>', lineend ? lineend - line : buf + size - line);
if (!line++)
goto out;
-   date = strtoul(line, NULL, 10);
+   date = parse_timestamp(line, NULL, 10);
result = (revs->max_age == -1 || revs->max_age < date) &&
(revs->min_age == -1 || revs->min_age > date);
 out:
diff --git a/commit.c b/commit.c
index 73c78c2b80c..0d2d0fa1984 100644
--- a/commit.c
+++ b/commit.c
@@ -89,8 +89,8 @@ static unsigned long parse_commit_date(const char *buf, const 
char *tail)
/* nada */;
if (buf >= tail)
return 0;
-   /* dateptr < buf && buf[-1] == '\n', so strtoul will stop at buf-1 */
-   return strtoul(dateptr, NULL, 10);
+   /* dateptr < buf && buf[-1] == '\n', so parsing will stop at buf-1 */
+   return parse_timestamp(dateptr, NULL, 10);
 }
 
 static struct commit_graft **commit_graft;
@@ -607,7 +607,7 @@ static void record_author_date(struct author_date_slab 
*author_date,
!ident.date_begin || !ident.date_end)
goto fail_exit; /* malformed "author" line */
 
-   date = strtoul(ident.date_begin, _end, 10);
+   date = parse_timestamp(ident.date_begin, _end, 10);
if (date_end != ident.date_end)
goto fail_exit; /* malformed date */
*(author_date_slab_at(author_date, commit)) = date;
diff --git a/date.c b/date.c
index a996331f5b3..495c207c64f 100644
--- a/date.c
+++ b/date.c
@@ -510,7 +510,7 @@ static int match_digit(const char *date, struct tm *tm, int 
*offset, int *tm_gmt
char *end;
unsigned long num;
 
-   num = strtoul(date, , 10);
+   num = parse_timestamp(date, , 10);
 
/*
 * Seconds since 1970? We trigger on that for any numbers with
@@ -658,7 +658,7 @@ static int match_object_header_date(const char *date, 
unsigned long *timestamp,
 
if (*date < '0' || '9' < *date)
return -1;
-   stamp = strtoul(date, , 10);
+   stamp = parse_timestamp(date, , 10);
if (*end != ' ' || stamp == ULONG_MAX || (end[1] != '+' && end[1] != 
'-'))
return -1;
date = end + 2;
@@ -1066,7 +1066,7 @@ static const char *approxidate_digit(const char *date, 
struct tm *tm, int *num,
 time_t now)
 {
char *end;
-   unsigned long number = strtoul(date, , 10);
+   unsigned long number = parse_timestamp(date, , 10);
 
switch (*end) {
case ':':
diff --git a/fsck.c b/fsck.c
index

[PATCH v4 6/9] Introduce a new data type for timestamps

2017-04-21 Thread Johannes Schindelin

Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).

So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.

By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.

As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.

Signed-off-by: Johannes Schindelin 
---
 Documentation/technical/api-parse-options.txt |  8 ++--
 archive-tar.c |  5 +-
 archive-zip.c | 12 -
 archive.h |  2 +-
 builtin/am.c  |  2 +-
 builtin/blame.c   |  8 ++--
 builtin/fsck.c|  4 +-
 builtin/gc.c  |  2 +-
 builtin/log.c |  2 +-
 builtin/merge-base.c  |  2 +-
 builtin/name-rev.c|  6 +--
 builtin/pack-objects.c|  4 +-
 builtin/prune.c   |  4 +-
 builtin/receive-pack.c|  6 +--
 builtin/reflog.c  | 24 +-
 builtin/show-branch.c |  4 +-
 builtin/worktree.c|  4 +-
 bundle.c  |  2 +-
 cache.h   | 14 +++---
 commit.c  | 12 ++---
 commit.h  |  2 +-
 config.c  |  2 +-
 credential-cache--daemon.c| 12 ++---
 date.c| 66 +--
 fetch-pack.c  |  6 +--
 git-compat-util.h |  2 +
 http-backend.c|  4 +-
 parse-options-cb.c|  4 +-
 pretty.c  |  2 +-
 reachable.c   |  9 ++--
 reachable.h   |  4 +-
 ref-filter.c  |  4 +-
 reflog-walk.c |  8 ++--
 refs.c| 14 +++---
 refs.h|  8 ++--
 refs/files-backend.c  |  4 +-
 revision.c|  6 +--
 revision.h|  4 +-
 sha1_name.c   |  6 +--
 t/helper/test-date.c  |  8 ++--
 t/helper/test-parse-options.c |  2 +-
 t/helper/test-ref-store.c |  2 +-
 tag.c |  2 +-
 tag.h |  2 +-
 upload-pack.c |  4 +-
 vcs-svn/fast_export.c |  4 +-
 vcs-svn/fast_export.h |  4 +-
 vcs-svn/svndump.c |  2 +-
 wt-status.c   |  2 +-
 49 files changed, 169 insertions(+), 157 deletions(-)

diff --git a/Documentation/technical/api-parse-options.txt 
b/Documentation/technical/api-parse-options.txt
index 36768b479e1..829b5581105 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -183,13 +183,13 @@ There are some macros to easily define options:
scale the provided value by 1024, 1024^2 or 1024^3 respectively.
The scaled value is put into `unsigned_long_var`.
 
-`OPT_DATE(short, long, _var, description)`::
+`OPT_DATE(short, long, _t_var, description)`::
Introduce an option with date argument, see `approxidate()`.
-   The timestamp is put into `int_var`.
+   The timestamp is put into `timestamp_t_var`.
 
-`OPT_EXPIRY_DATE(short, long, _var, description)`::
+`OPT_EXPIRY_DATE(short, long, _t_var, description)`::
Introduce an option with expiry date argument, see 
`parse_expiry_date()`.
-   The timestamp is put into `int_var`.
+   The timestamp is put into `timestamp_t_var`.
 
 `OPT_CALLBACK(short, long, , arg_str, description, func_ptr)`::
Introduce an option with argument.
diff --git a/archive-tar.c b/archive-tar.c
index 380e3aedd23..695339a2369 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -27,9 +27,12 @@ static int write_tar_filter_archive(const struct archiver 
*ar,
  */
 #if ULONG_MAX == 0x
 #define USTAR_MAX_SIZE ULONG_MAX
-#define USTAR_MAX_MTIME ULONG_MAX
 #else
 #define

[PATCH v4 9/9] show_date_ident(): defer date overflow check

2017-04-21 Thread Johannes Schindelin

Now that we use uintmax_t for timestamps, we can represent timestamps
that would not fit inside the time_t data type. As long as we do not
have to use the system functions, we can even display them, e.g. as Unix
epoch.

Signed-off-by: Johannes Schindelin 
---
 pretty.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/pretty.c b/pretty.c
index 587d48371b0..35fd290096a 100644
--- a/pretty.c
+++ b/pretty.c
@@ -410,14 +410,10 @@ const char *show_ident_date(const struct ident_split 
*ident,
 
if (ident->date_begin && ident->date_end)
date = parse_timestamp(ident->date_begin, NULL, 10);
-   if (date_overflows(date))
-   date = 0;
-   else {
-   if (ident->tz_begin && ident->tz_end)
-   tz = strtol(ident->tz_begin, NULL, 10);
-   if (tz >= INT_MAX || tz <= INT_MIN)
-   tz = 0;
-   }
+   if (ident->tz_begin && ident->tz_end)
+   tz = strtol(ident->tz_begin, NULL, 10);
+   if (tz >= INT_MAX || tz <= INT_MIN)
+   tz = 0;
return show_date(date, tz, mode);
 }
 
-- 
2.12.2.windows.2.406.gd14a8f8640f

[PATCH v4 5/9] Introduce a new "printf format" for timestamps

2017-04-21 Thread Johannes Schindelin

Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.

There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
away from the ill-specified `unsigned long` data type.

So let's introduce the pseudo format "PRItime" (currently simply being
defined to "lu") to make it easier to change the data type used for
timestamps.

Signed-off-by: Johannes Schindelin 
---
 builtin/blame.c   |  6 +++---
 builtin/fsck.c|  2 +-
 builtin/log.c |  2 +-
 builtin/receive-pack.c|  4 ++--
 builtin/rev-list.c|  2 +-
 builtin/rev-parse.c   |  2 +-
 date.c| 26 +-
 fetch-pack.c  |  2 +-
 git-compat-util.h |  1 +
 refs/files-backend.c  |  2 +-
 t/helper/test-date.c  |  2 +-
 t/helper/test-parse-options.c |  2 +-
 t/helper/test-ref-store.c |  2 +-
 upload-pack.c |  2 +-
 vcs-svn/fast_export.c |  4 ++--
 15 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 07506a3e457..e4b3c7b0ebf 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1727,11 +1727,11 @@ static int emit_one_suspect_detail(struct origin 
*suspect, int repeat)
get_commit_info(suspect->commit, , 1);
printf("author %s\n", ci.author.buf);
printf("author-mail %s\n", ci.author_mail.buf);
-   printf("author-time %lu\n", ci.author_time);
+   printf("author-time %"PRItime"\n", ci.author_time);
printf("author-tz %s\n", ci.author_tz.buf);
printf("committer %s\n", ci.committer.buf);
printf("committer-mail %s\n", ci.committer_mail.buf);
-   printf("committer-time %lu\n", ci.committer_time);
+   printf("committer-time %"PRItime"\n", ci.committer_time);
printf("committer-tz %s\n", ci.committer_tz.buf);
printf("summary %s\n", ci.summary.buf);
if (suspect->commit->object.flags & UNINTERESTING)
@@ -1844,7 +1844,7 @@ static const char *format_time(unsigned long time, const 
char *tz_str,
 
strbuf_reset(_buf);
if (show_raw_time) {
-   strbuf_addf(_buf, "%lu %s", time, tz_str);
+   strbuf_addf(_buf, "%"PRItime" %s", time, tz_str);
}
else {
const char *time_str;
diff --git a/builtin/fsck.c b/builtin/fsck.c
index f76e4163abb..af7b985c6eb 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -407,7 +407,7 @@ static void fsck_handle_reflog_oid(const char *refname, 
struct object_id *oid,
if (timestamp && name_objects)
add_decoration(fsck_walk_options.object_names,
obj,
-   xstrfmt("%s@{%ld}", refname, 
timestamp));
+   xstrfmt("%s@{%"PRItime"}", refname, 
timestamp));
obj->used = 1;
mark_object_reachable(obj);
} else {
diff --git a/builtin/log.c b/builtin/log.c
index b3b10cc1edb..f93ef6c7100 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -910,7 +910,7 @@ static void get_patch_ids(struct rev_info *rev, struct 
patch_ids *ids)
 static void gen_message_id(struct rev_info *info, char *base)
 {
struct strbuf buf = STRBUF_INIT;
-   strbuf_addf(, "%s.%lu.git.%s", base,
+   strbuf_addf(, "%s.%"PRItime".git.%s", base,
(unsigned long) time(NULL),

git_committer_info(IDENT_NO_NAME|IDENT_NO_DATE|IDENT_STRICT));
info->message_id = strbuf_detach(, NULL);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 9a4c2a7ade4..c49333c9b66 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -459,12 +459,12 @@ static char *prepare_push_cert_nonce(const char *path, 
unsigned long stamp)
struct strbuf buf = STRBUF_INIT;
unsigned char sha1[20];
 
-   strbuf_addf(, "%s:%lu", path, stamp);
+   strbuf_addf(, "%s:%"PRItime, path, stamp);
hmac_sha1(sha1, buf.buf, buf.len, cert_nonce_seed, 
strlen(cert_nonce_seed));;
strbuf_release();
 
/* RFC 2104 5. HMAC-SHA1-80 */
-   strbuf_addf(, "%lu-%.*s", stamp, 20, sha1_to_hex(sha1));
+   strbuf_addf(, "%"PRItime"-%.*s", stamp, 20, sha1_to_hex(sha1));
return strbuf_detach(, NULL);
 }
 
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index bcf77f0b8a2..3b292c99bda 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -80,7 +80,7 @@ static void show_commit(struct commit *commit, void *data)
}
 
if (info->show_timestamp)
-   printf("%lu ", commit->date);
+   printf("%"PRItime" ", commit->date);
if (info->header_prefix)

[PATCH v4 7/9] Abort if the system time cannot handle one of our timestamps

2017-04-21 Thread Johannes Schindelin

We are about to switch to a new data type for time stamps that is
definitely not smaller or equal, but larger or equal to time_t.

So before using the system functions to process or format timestamps,
let's make extra certain that they can handle what we feed them.

Signed-off-by: Johannes Schindelin 
---
 date.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/date.c b/date.c
index 92ab31aa441..75f6335cd09 100644
--- a/date.c
+++ b/date.c
@@ -46,7 +46,10 @@ static time_t gm_time_t(timestamp_t time, int tz)
minutes = tz < 0 ? -tz : tz;
minutes = (minutes / 100)*60 + (minutes % 100);
minutes = tz < 0 ? -minutes : minutes;
-   return time + minutes * 60;
+
+   if (date_overflows(time + minutes * 60))
+   die("Timestamp too large for this system: %"PRItime, time);
+   return (time_t)time + minutes * 60;
 }
 
 /*
@@ -70,7 +73,10 @@ static int local_tzoffset(timestamp_t time)
struct tm tm;
int offset, eastwest;
 
-   t = time;
+   if (date_overflows(time))
+   die("Timestamp too large for this system: %"PRItime, time);
+
+   t = (time_t)time;
localtime_r(, );
t_local = tm_to_time_t();
 
-- 
2.12.2.windows.2.406.gd14a8f8640f

[PATCH v4 8/9] Use uintmax_t for timestamps

2017-04-21 Thread Johannes Schindelin

Previously, we used `unsigned long` for timestamps. This was only a good
choice on Linux, where we know implicitly that `unsigned long` is what is
used for `time_t`.

However, we want to use a different data type for timestamps for two
reasons:

- there is nothing that says that `unsigned long` should be the same data
  type as `time_t`, and indeed, on 64-bit Windows for example, it is not:
  `unsigned long` is 32-bit but `time_t` is 64-bit.

- even on 32-bit Linux, where `unsigned long` (and thereby `time_t`) is
  32-bit, we *want* to be able to encode timestamps in Git that are
  currently absurdly far in the future, *even if* the system library is
  not able to format those timestamps into date strings.

So let's just switch to the maximal integer type available, which should
be at least 64-bit for all practical purposes these days. It certainly
cannot be worse than `unsigned long`, so...

Signed-off-by: Johannes Schindelin 
---
 git-compat-util.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 72c12173a14..c678ca94b8f 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -319,10 +319,14 @@ extern char *gitdirname(char *);
 #define PRIo32 "o"
 #endif
 
-typedef unsigned long timestamp_t;
-#define PRItime "lu"
-#define parse_timestamp strtoul
+typedef uintmax_t timestamp_t;
+#define PRItime PRIuMAX
+#define parse_timestamp strtoumax
+#ifdef ULLONG_MAX
+#define TIME_MAX ULLONG_MAX
+#else
 #define TIME_MAX ULONG_MAX
+#endif
 
 #ifndef PATH_SEP
 #define PATH_SEP ':'
-- 
2.12.2.windows.2.406.gd14a8f8640f

[PATCH v4 0/9] Introduce timestamp_t for timestamps

2017-04-21 Thread Johannes Schindelin

Git v2.9.2 was released in a hurry to accomodate for platforms like
Windows, where the `unsigned long` data type is 32-bit even for 64-bit
setups.

The quick fix was to simply disable all the testing with "absurd" future
dates.

However, we can do much better than that, as we already make use of
64-bit data types internally. There is no good reason why we should not
use the same for timestamps. Hence, let's use uintmax_t for timestamps.

Note: while the `time_t` data type exists and is meant to be used for
timestamps, on 32-bit Linux it is *still* 32-bit. An earlier iteration
used `time_t` for that reason, but it came with a few serious downsides:
as `time_t` can be signed (and indeed, on Windows it is an int64_t),
Git's expectation that 0 is the minimal value does no longer hold true,
introducing its own set of interesting challenges. Besides, if we *can*
handle far in the future timestamps (except for formatting them using
the system libraries), it is more consistent to do so.

The upside of using `uintmax_t` for timestamps is that we do a much
better job to support far in the future timestamps across all platforms,
including 32-bit ones. The downside is that those platforms that use a
32-bit `time_t` will barf when parsing or formatting those timestamps.

This iteration has two fixes for issues reported by Travis CI (thank
deity for Continuous Testing, eh?), one by the MacOSX nodes (but not by
the Linux nor the Windows nodes), the other by the windows node (which
reminds me to run tests after rebasing to `master` not only in a Linux
VM but also on Windows, even if it takes ages there due to the heavy
shell scripting).

After reading Junio's comments on the cover letter of v2, I had another
look at the usage of date_overflows() and was able to relax two of them.

Changes since v3:

- fixed the fix in archive-zip.c that tried to report a too large
  timestamp (and would have reported the uninitialized time_t instead)

- adjusted the so-far forgotten each_reflog() function (that was
  introduced after v1, in 80f2a6097c4 (t/helper: add test-ref-store to
  test ref-store functions, 2017-03-26)) to use timestamp_t and PRItime,
  too

- removed the date_overflows() check from time_to_tm(), as it calls
  gm_time_t() which already performs that check

- the date_overflows() check in show_ident_date() was removed, as we do
  not know at that point yet whether we use the system functions to
  render the date or not (and there would not be a problem in the latter
  case)


Johannes Schindelin (9):
  ref-filter: avoid using `unsigned long` for catch-all data type
  t0006 & t5000: prepare for 64-bit timestamps
  t0006 & t5000: skip "far in the future" test when time_t is too
limited
  Specify explicitly where we parse timestamps
  Introduce a new "printf format" for timestamps
  Introduce a new data type for timestamps
  Abort if the system time cannot handle one of our timestamps
  Use uintmax_t for timestamps
  show_date_ident(): defer date overflow check

 Documentation/technical/api-parse-options.txt |   8 +-
 archive-tar.c |   5 +-
 archive-zip.c |  12 ++-
 archive.h |   2 +-
 builtin/am.c  |   4 +-
 builtin/blame.c   |  14 ++--
 builtin/fsck.c|   6 +-
 builtin/gc.c  |   2 +-
 builtin/log.c |   4 +-
 builtin/merge-base.c  |   2 +-
 builtin/name-rev.c|   6 +-
 builtin/pack-objects.c|   4 +-
 builtin/prune.c   |   4 +-
 builtin/receive-pack.c|  14 ++--
 builtin/reflog.c  |  24 +++---
 builtin/rev-list.c|   2 +-
 builtin/rev-parse.c   |   2 +-
 builtin/show-branch.c |   4 +-
 builtin/worktree.c|   4 +-
 bundle.c  |   4 +-
 cache.h   |  14 ++--
 commit.c  |  18 ++---
 commit.h  |   2 +-
 config.c  |   2 +-
 credential-cache--daemon.c|  12 +--
 date.c| 106 ++
 fetch-pack.c  |   8 +-
 fsck.c|   2 +-
 git-compat-util.h |   9 +++
 http-backend.c|   4 +-
 parse-options-cb.c|   4 +-
 pretty.c  |  16 ++--
 reachable.c   |   9 +--
 reachable.h   |   4 +-
 ref-filter.c

[PATCH v4 1/9] ref-filter: avoid using `unsigned long` for catch-all data type

2017-04-21 Thread Johannes Schindelin

In its `atom_value` struct, the ref-filter source code wants to store
different values in a field called `ul` (for `unsigned long`), e.g.
timestamps.

However, as we are about to switch the data type of timestamps away from
`unsigned long` (because it may be 32-bit even when `time_t` is 64-bit),
that data type is not large enough.

Simply change that field to use `uintmax_t` instead.

This patch is a bit larger than the mere change of the data type
because the field's name was tied to its data type, which has been fixed
at the same time.

Signed-off-by: Johannes Schindelin 
---
 ref-filter.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 3a640448fd8..92871266001 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -351,7 +351,7 @@ struct ref_formatting_state {
 struct atom_value {
const char *s;
void (*handler)(struct atom_value *atomv, struct ref_formatting_state 
*state);
-   unsigned long ul; /* used for sorting when not FIELD_STR */
+   uintmax_t value; /* used for sorting when not FIELD_STR */
struct used_atom *atom;
 };
 
@@ -723,7 +723,7 @@ static void grab_common_values(struct atom_value *val, int 
deref, struct object
if (!strcmp(name, "objecttype"))
v->s = typename(obj->type);
else if (!strcmp(name, "objectsize")) {
-   v->ul = sz;
+   v->value = sz;
v->s = xstrfmt("%lu", sz);
}
else if (deref)
@@ -770,8 +770,8 @@ static void grab_commit_values(struct atom_value *val, int 
deref, struct object
v->s = xstrdup(oid_to_hex(>tree->object.oid));
}
else if (!strcmp(name, "numparent")) {
-   v->ul = commit_list_count(commit->parents);
-   v->s = xstrfmt("%lu", v->ul);
+   v->value = commit_list_count(commit->parents);
+   v->s = xstrfmt("%lu", (unsigned long)v->value);
}
else if (!strcmp(name, "parent")) {
struct commit_list *parents;
@@ -875,11 +875,11 @@ static void grab_date(const char *buf, struct atom_value 
*v, const char *atomnam
if ((tz == LONG_MIN || tz == LONG_MAX) && errno == ERANGE)
goto bad;
v->s = xstrdup(show_date(timestamp, tz, _mode));
-   v->ul = timestamp;
+   v->value = timestamp;
return;
  bad:
v->s = "";
-   v->ul = 0;
+   v->value = 0;
 }
 
 /* See grab_values */
@@ -1941,9 +1941,9 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct 
ref_array_item *a, stru
else if (cmp_type == FIELD_STR)
cmp = cmp_fn(va->s, vb->s);
else {
-   if (va->ul < vb->ul)
+   if (va->value < vb->value)
cmp = -1;
-   else if (va->ul == vb->ul)
+   else if (va->value == vb->value)
cmp = cmp_fn(a->refname, b->refname);
else
cmp = 1;
-- 
2.12.2.windows.2.406.gd14a8f8640f

Re: [PATCH v3 0/8] Introduce timestamp_t for timestamps

2017-04-21 Thread Johannes Schindelin

Hi Junio,

On Thu, 20 Apr 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > Note: while the `time_t` data type exists and is meant to be used for
> > timestamps, on 32-bit Linux it is *still* 32-bit. An earlier iteration
> > used `time_t` for that reason, but it came with a few serious
> > downsides: as `time_t` can be signed (and indeed, on Windows it is an
> > int64_t), Git's expectation that 0 is the minimal value does no longer
> > hold true, introducing its own set of interesting challenges. Besides,
> > if we *can* handle far in the future timestamps (except for formatting
> > them using the system libraries), it is more consistent to do so.
> 
> I somehow had an impression that the list consensus during the
> discussion on an earlier round of this series was that time_t is
> more appropriate, as platforms with time_t with inadequent range
> will be updated before it gets too late at around 2038 (or they will
> die off).

There was this sentiment, but that would require a change in Git's source
code where it can handle unsigned *and* signed data types for timestamps,
and that was too much of a change to bring in that late in the patch
series.

Besides, supporting a signed timestamp data type is a separate issue from
trying to support dates that are insanely far in the future.

> After all, at some point we need to interact with the
> platform functions that expect time_t as their interface and they do
> not take our own timestamp_t without casting.

No, not necessarily. When we generate a .zip archive, for example, we may
never need to parse or format the timestamps. When we log with a format
that does not require the system routines to format the date (e.g. Unix
epoch). And when we do not parse nor format the timestamps to begin with,
e.g. fetching and pushing or committing.

It is *only* when we try to parse or format timestamps, *and* the system's
time_t is not large enough, that we run into trouble.

This is an improvement from before, where we would run into trouble
whenever time_t was larger than unsigned long, *or* whenever time_t is too
small for those timestamps.

Your comment, and my reaction, reminded me that I had planned to look into
all calls to date_overflow() and remove those that are now unnecessary, or
at least move them to the place where they are absolutely necessary.

> But that is provided if not introducing timestamp_t and using time_t
> results in a simpler code.  I do not know if that is the case.

As I had pointed out in my reply to Peff: removing the assumption that 0
is the minimal timestamp is something I am unwilling to tackle, it looks
way too fragile/dangerous to me to do this in the time I could allocate
for that.

> For timestamps in the distant past, even though time_t could be
> unsigned, I do not think anybody came up with a concrete example of a
> platform with such a problem during the previous discussions, while I do
> recall people wanting to use Git to store historical documents with
> timestamps before 1970.  We do expect 0 can be used as a sentinel, which
> needs to be updated once we seriously start supporting such use cases.
> I think that (i.e. should the timestamp be signed?) is more or less
> unrelated to the focus of the discussion that stemed from this topic,
> which was "ulong that is often 32-bit does not necessarily fit the
> notion of time on a platform, which is time_t, and we need to widen it",
> which all involved in the discussion agreed.

If anybody feels strongly enough about representing timestamps earlier
than the Jan 1 1970, they should feel very welcome to work on this.

My patch series would make that task slightly easier, even.

> In any case, when merged to 'pu', this had a slight conflict with
> topics in flight in builtin/name-rev.c and I think I resolved it
> correctly, but please double check.

I double-checked, there were just two conversions from `unsigned long
taggerdate` to `timestamp_t taggerdate`. I repeated the merge with my
latest iteration and resolved them the same way as you did.

Apart from the date_overflows() changes I mentioned above, I also fixed
the problems identified by Travis CI and VSTS Build.

v4 is coming,
Dscho

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Duy Nguyen

On Fri, Apr 21, 2017 at 2:10 PM, Christian Couder
 wrote:
> On Thu, Apr 20, 2017 at 11:24 PM, Thomas Gummerer  
> wrote:
>> On 04/20, Christian Couder wrote:
>>>
>>> Could you try with the following patch:
>>>
>>> http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/
>>
>> Yeah, I tried with and without that patch with the same result.
>> Unless I'm screwing something up when testing I don't think this fixes
>> the issue unfortunately.
>
> Ok, I will take a look soon.
>
> By the way I think that GIT_TEST_SPLIT_INDEX has become redundant now
> that there is core.splitIndex.
> So perhaps in the long run it will be best to deprecate
> GIT_TEST_SPLIT_INDEX and eventually remove it.

I think you can't, at least the way I understand this variable. It's a
_test_ variable to force exercise split index code path a whole lot
more, by running the entire test suite with split index always
enabled, instead of just a couple in  t-split-index.sh. We can't
achieve the same with core.splitIndex because that's more about user
control and you can't just set core.splitIndex for the entire test
suite (can we?).

I would understand if the failed tests are about core.splitIndex (they
both try to control how split index is created), but failed tests look
totally unrelated. We may have discovered some bug (or that git_path
one Jeff mentioned).
-- 
Duy

Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)

2017-04-21 Thread Johannes Schindelin

Hi Junio,

On Thu, 20 Apr 2017, Junio C Hamano wrote:

> Lars Schneider  writes:
> 
> >> * bw/forking-and-threading (2017-04-19) 11 commits
> >> - run-command: block signals between fork and execve
> >> - run-command: add note about forking and threading
> >> - run-command: handle dup2 and close errors in child
> >> - run-command: eliminate calls to error handling functions in child
> >> - run-command: don't die in child when duping /dev/null
> >> - run-command: prepare child environment before forking
> >> - string-list: add string_list_remove function
> >> - run-command: use the async-signal-safe execv instead of execvp
> >> - run-command: prepare command before forking
> >> - t0061: run_command executes scripts without a #! line
> >> - t5550: use write_script to generate post-update hook
> >> 
> >> The "run-command" APIimplementation has been made more robust
> >> against dead-locking in a threaded environment.
> >> 
> >> Will merge to 'next'.
> >
> > There might be a problem on Windows with this (that's just a hunch, i can't 
> > test this right now):
> > https://travis-ci.org/git/git/jobs/223830474
> 
> Thanks for keeping an eye on Travis output. My eyes learned to
> ignore the Windows section as its failures often seem to be due to
> timing out.

Part of the reason is that you push out all of the branches in one go,
typically at the very end of your work day. The idea of Continuous
Integration is a little orthogonal to that style, suggesting to build &
test whenever new changes come into the integration branch.

As a consequence, my original setup was a little overloaded: the VM sat
idle most of the time, and when you pushed, it was overloaded.

To accommodate even that use case, I managed to pry away some resources
that should be mostly idle at the time you push, and that should be able
to run up to 4 builds in parallel (the number "4" is not really magic, it
is the number of integration branches of git.git).

Since that "let's aggregate everything and only push out the final result
at the end of the day" approach does not really allow the Continuous
Testing to identify problems associated with individual topic branches, I
have another job that bisects the breakages (with all associated problems
I reported earlier, as you apply some patches on top of really ancient
commits and bisect wants to test all merge bases first) because the
required time *definitely* would let Travis time out all the time. Those
bisect results are even less visible than the Travis results, see e.g.
https://github.com/git/git/commit/2e3a8b9035a#commitcomment-21836854.

Having said that, I do not think that it makes sense for you to change
your habits, as proper Continuous Integration (as opposed to a variation
of Continuous Testing that we use, really) would take a lot more of a
change than you are comfortable with: it would look a lot more Pull
Request centric than the current mailing list centered process.

Ciao,
Dscho

Re: [PATCH] refs.h: rename submodule arguments to submodule_path

2017-04-21 Thread Duy Nguyen

On Fri, Apr 21, 2017 at 1:42 PM, Michael Haggerty  wrote:
> On 04/21/2017 08:32 AM, Michael Haggerty wrote:
>> [...]
>> I've CCed Duy because I don't know whether he has more plans regarding
>> submodule references [...] get rid of the
>> `for_each_ref_submodule()` family of functions entirely.
>>
>> So perhaps the code that this patch touches won't be around long anyway.
>
> Oh yeah, he has done exactly that in his nd/prune-in-worktree patch
> series. (I knew I'd seen that somewhere...)
>
> So it seems that the argument renaming has mostly been overtaken by
> events, though even after Duy's patch series there are a few `const char
> *submodule` arguments that could be renamed.

Yeah. After that series, the only place that takes a submodule (path)
is get_submodule_ref_store() (other functions are just helpers).
Renaming it to submodule_path makes perfect sense. Johannes Sixt when
reviewing that series also noticed the "path" nature of this
"submodule" argument and suggested converting submodule[x] == '/' to
is_dir_sep(submodule[i]) for that reason.

At this point, I think Stefan even has the opportunity to
reference/look up a submodule ref store by something other than a path
 (like "struct submodule *", or by name)  if he wants to. But if you
do that, maybe rename the current function to
get_submodule_ref_store_by_path() before you add a new
get_submodule_ref_store_by_whatever().

About ".. because I don't know whether he (Duy) has more plans
regarding submodule references", I plan to convert "submodule[x] ==
'/'" to is_dir_sep(submodule[x]), but I think that's about it. I'm not
involved much in submodule to see the direction it's heading. As far
as refs code is concerned, a "struct ref_store *" is all it needs,
regardless of how you obtain it.
-- 
Duy

RE: [PATCH] Increase core.packedGitLimit

2017-04-21 Thread Johannes Schindelin

Hi Dave,

On Thu, 20 Apr 2017, David Turner wrote:
> 
> We might want to think about replacing git_config_ulong with 
> git_config_size_t in nearly all cases. "long" has ceased to be 
> useful.  More modern versions of C prefer uint64_t, but I
> think that we'll usually want size_t because these values will
> be used as memory limits of various sorts.

Indeed, `unsigned long` has ceased to be useful ever since the
introduction of inttypes.h. It is just too undefined.

We probably also need git_config_off_t(), though, and maybe even
git_config_ssize_t().

I would definitely welcome a change in direction where we use datatypes
also as a documentation of the variables' purpose.

Ciao,
Dscho

Re: [PATCH] completion: optionally disable checkout DWIM

2017-04-21 Thread Jacob Keller

On Thu, Apr 20, 2017 at 1:12 PM, Jeff King  wrote:
> When we complete branch names for "git checkout", we also
> complete remote branch names that could trigger the DWIM
> behavior. Depending on your workflow and project, this can
> be either convenient or annoying.
>
> For instance, my clone of gitster.git contains 74 local
> "jk/*" branches, but origin contains another 147. When I
> want to checkout a local branch but can't quite remember the
> name, tab completion shows me 251 entries. And worse, for a
> topic that has been picked up for pu, the upstream branch
> name is likely to be similar to mine, leading to a high
> probability that I pick the wrong one and accidentally
> create a new branch.
>
> This patch adds a way for the user to tell the completion
> code not to include DWIM suggestions for checkout. This can
> already be done by typing:
>
>   git checkout --no-guess jk/
>
> but that's rather cumbersome. The downside, of course, is
> that you no longer get completion support when you _do_ want
> to invoke the DWIM behavior. But depending on your workflow,
> that may not be a big loss (for instance, in git.git I am
> much more likely to want to detach, so I'd type "git
> checkout origin/jk/" anyway).
>
> Signed-off-by: Jeff King 
> ---
> This is flexible enough for me, but it's possible somebody would want
> this on a per-repo basis. I don't know that we want to read from `git
> config`, though, because it's relatively expensive to do so. People who
> want per-repo settings are probably better off with a hook that triggers
> when they "cd" around, and sets up their preferences.
>

I would use this. Completing these can get quite cumbersome to use
when I have only a few local branches but many remote ones like in
git.git

Thanks,
Jake

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Christian Couder

On Thu, Apr 20, 2017 at 11:24 PM, Thomas Gummerer  wrote:
> On 04/20, Christian Couder wrote:
>>
>> Could you try with the following patch:
>>
>> http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/
>
> Yeah, I tried with and without that patch with the same result.
> Unless I'm screwing something up when testing I don't think this fixes
> the issue unfortunately.

Ok, I will take a look soon.

By the way I think that GIT_TEST_SPLIT_INDEX has become redundant now
that there is core.splitIndex.
So perhaps in the long run it will be best to deprecate
GIT_TEST_SPLIT_INDEX and eventually remove it.

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-21 Thread Christian Couder

On Fri, Apr 21, 2017 at 6:01 AM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> Could you try with the following patch:
>>
>> http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/
>
> Ah, this reminds me.  The patch has been in the stalled state for
> quite some time due to confusing description.  How about explaining
> it like so and merge it to 'next'?

Yeah, I am ok with this description. Thanks!

Re: [PATCH] refs.h: rename submodule arguments to submodule_path

2017-04-21 Thread Michael Haggerty

On 04/21/2017 08:32 AM, Michael Haggerty wrote:
> [...]
> I've CCed Duy because I don't know whether he has more plans regarding
> submodule references [...] get rid of the
> `for_each_ref_submodule()` family of functions entirely.
> 
> So perhaps the code that this patch touches won't be around long anyway.

Oh yeah, he has done exactly that in his nd/prune-in-worktree patch
series. (I knew I'd seen that somewhere...)

So it seems that the argument renaming has mostly been overtaken by
events, though even after Duy's patch series there are a few `const char
*submodule` arguments that could be renamed.

Michael

Re: [PATCH] refs.h: rename submodule arguments to submodule_path

2017-04-21 Thread Michael Haggerty

On 04/21/2017 03:12 AM, Junio C Hamano wrote:
> Stefan Beller  writes:
> 
>> + Junio
> 
> Just like Michael, I do not have strong enough opinion for or
> against this patch to comment on it.
> 
> I do agree with you that it would be a good longer-term direction to
> use "submodule" for a "struct submodule" (i.e. submodule object),
> and call a string that names a submodule either "submodule_name" or
> "submodule_path" depending on how it names one, for maintainability
> of the code.  
> 
> However I am not convinced that this patch is an improvement.  Even
> though parameter names in decls only serve documentation purpose and
> it is even OK to only have type without name there, if we are going
> to _have_ names, it would make sense to match them to the parameter
> names actually used in the implementation.  
> 
> Updating these names used in refs.c would make a very noisy patch,
> of course.  But I am not sure if it is a good middle ground to avoid
> that and to update only refs.h.

One should never infer too much from my silence. As often as not it's
because I'm simply busy with other things.

But in this case Junio's right. I think it is a good idea to use
argument names in declarations as documentation, and I also agree that
it is a minus for the names in the declarations not to agree with the
names in the definition. But the code that would have to be touched
already has a lot of work going on in it, so conflicts would be likely.

I've CCed Duy because I don't know whether he has more plans regarding
submodule references. A natural followup to his recent work would be to
add a feature to the `submodule` module that allows a caller to look up
the `ref_store` object for the submodule. Then client code could use the
`refs_for_each_ref(struct ref_store *, ...)` family of functions to
access such references, and we could get rid of the
`for_each_ref_submodule()` family of functions entirely.

So perhaps the code that this patch touches won't be around long anyway.

Michael

Re: [PATCH 05/15] log: report errno on failure to fopen() a file

2017-04-21 Thread Jeff King

On Thu, Apr 20, 2017 at 06:25:59PM +0700, Nguyễn Thái Ngọc Duy wrote:

> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  builtin/log.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/log.c b/builtin/log.c
> index b3b10cc1ed..26d6a3cf14 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -858,7 +858,8 @@ static int open_next_file(struct commit *commit, const 
> char *subject,
>   printf("%s\n", filename.buf + outdir_offset);
>  
>   if ((rev->diffopt.file = fopen(filename.buf, "w")) == NULL)
> - return error(_("Cannot open patch file %s"), filename.buf);
> + return error_errno(_("Cannot open patch file %s"),
> +filename.buf);
>  
>   strbuf_release();
>   return 0;

Not a new problem with your patch, but just looking at the context it
seems clear that "filename" is leaked in the error case.

-Peff

Re: [PATCH 00/15] Handle fopen() errors

2017-04-21 Thread Jeff King

On Thu, Apr 20, 2017 at 08:41:32PM -0700, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > I wonder if it is OK to only special case ENOENT for !fp cases,
> > where existing code silently returns.  Perhaps it is trying to read
> > an optional file, and it returns silently because lack of it is
> > perfectly OK for the purpose of the code.  Are there cases where
> > this optional file is inside an optional directory, leading to
> > ENOTDIR, instead of ENOENT, observed and reported by your check?
> 
> "git grep -B1 warn_on_inaccessible" is enlightening.  I wonder if we
> want to wrap the two lines into a hard to misuse helper function,
> something along this line.  Would having this patch as a preparatory
> step shrink your series?  The patch count would be the same, but you
> wouldn't be writing "if (errno != ENOENT)" lines yourself.

I had a similar thought while reading through it. I think it would be
shorter still with:

  FILE *fopen_or_warn(const char *path, const char *mode)
  {
FILE *fh = fopen(path, mode);
if (!fh)
warn_failure_to_read_open_optional_path(path);
return fh;
  }

And then quite a few of the patches could just be
s/fopen/fopen_or_warn/.

-Peff

Re: [PATCH v3 0/8] Introduce timestamp_t for timestamps

2017-04-21 Thread Junio C Hamano

Johannes Schindelin  writes:

> Note: while the `time_t` data type exists and is meant to be used for
> timestamps, on 32-bit Linux it is *still* 32-bit. An earlier iteration
> used `time_t` for that reason, but it came with a few serious downsides:
> as `time_t` can be signed (and indeed, on Windows it is an int64_t),
> Git's expectation that 0 is the minimal value does no longer hold true,
> introducing its own set of interesting challenges. Besides, if we *can*
> handle far in the future timestamps (except for formatting them using
> the system libraries), it is more consistent to do so.

I somehow had an impression that the list consensus during the
discussion on an earlier round of this series was that time_t is
more appropriate, as platforms with time_t with inadequent range
will be updated before it gets too late at around 2038 (or they will
die off).  After all, at some point we need to interact with the
platform functions that expect time_t as their interface and they do
not take our own timestamp_t without casting.

But that is provided if not introducing timestamp_t and using time_t
results in a simpler code.  I do not know if that is the case.

For timestamps in the distant past, even though time_t could be
unsigned, I do not think anybody came up with a concrete example of
a platform with such a problem during the previous discussions,
while I do recall people wanting to use Git to store historical
documents with timestamps before 1970.  We do expect 0 can be used
as a sentinel, which needs to be updated once we seriously start
supporting such use cases.  I think that (i.e. should the timestamp
be signed?) is more or less unrelated to the focus of the discussion
that stemed from this topic, which was "ulong that is often 32-bit
does not necessarily fit the notion of time on a platform, which is
time_t, and we need to widen it", which all involved in the discussion
agreed.

In any case, when merged to 'pu', this had a slight conflict with
topics in flight in builtin/name-rev.c and I think I resolved it
correctly, but please double check.  I will have to revamp the
resolution rerere remembered when the next version of this series
starts using time_t instead of timestamp_t anyway, but it is better
to always get conflict resolution right.  That is what maintainers
do and what they are for ;-)

Thanks.

79 matches

Mail list logo