Re: [PATCH v4 2/5] dir_iterator: iterate over dir after its contents

2017-03-29 Thread Michael Haggerty
On 03/29/2017 06:46 PM, Junio C Hamano wrote:
> Michael Haggerty  writes:
> 
>> I also realize that I made a goof in my comments about v3 of this patch
>> series. Your new option is not choosing between "depth-first" and
>> "breadth-first". Both types of iteration are depth-first. Really it is
>> choosing between pre-order and post-order traversal. So I think it would
>> be better to name the option `DIR_ITERATOR_POST_ORDER`. Sorry about that.
> 
> That solicits a natural reaction from a bystander.  Would an
> IN_ORDER option also be useful?  I am not demanding it to be added
> to this series, especially if there is no immediate need, but if we
> foresee that it would also make sense for some other callers, we
> would at least want to make sure that the code after this addition
> of POST_ORDER is in a shape that is easy to add such an option
> later.

I think IN_ORDER really only applies to *binary* trees, not arbitrary
trees like a filesystem.

Michael



[PATCH v5 5/6] remove_subtree(): reimplement using iterators

2017-03-29 Thread Daniel Ferreira
From: Daniel Ferreira 

Use dir_iterator to traverse through remove_subtree()'s directory tree,
avoiding the need for recursive calls to readdir(). Simplify
remove_subtree()'s code.

A conversion similar in purpose was previously done at 46d092a
("for_each_reflog(): reimplement using iterators", 2016-05-21).

Signed-off-by: Daniel Ferreira 
---
 entry.c | 41 +++--
 1 file changed, 15 insertions(+), 26 deletions(-)

diff --git a/entry.c b/entry.c
index c6eea24..30197b2 100644
--- a/entry.c
+++ b/entry.c
@@ -2,6 +2,8 @@
 #include "blob.h"
 #include "dir.h"
 #include "streaming.h"
+#include "iterator.h"
+#include "dir-iterator.h"

 static void create_directories(const char *path, int path_len,
   const struct checkout *state)
@@ -44,33 +46,20 @@ static void create_directories(const char *path, int 
path_len,
free(buf);
 }

-static void remove_subtree(struct strbuf *path)
+static void remove_subtree(const char *path)
 {
-   DIR *dir = opendir(path->buf);
-   struct dirent *de;
-   int origlen = path->len;
-
-   if (!dir)
-   die_errno("cannot opendir '%s'", path->buf);
-   while ((de = readdir(dir)) != NULL) {
-   struct stat st;
-
-   if (is_dot_or_dotdot(de->d_name))
-   continue;
-
-   strbuf_addch(path, '/');
-   strbuf_addstr(path, de->d_name);
-   if (lstat(path->buf, ))
-   die_errno("cannot lstat '%s'", path->buf);
-   if (S_ISDIR(st.st_mode))
-   remove_subtree(path);
-   else if (unlink(path->buf))
-   die_errno("cannot unlink '%s'", path->buf);
-   strbuf_setlen(path, origlen);
+   struct dir_iterator *diter = dir_iterator_begin(path, 
DIR_ITERATOR_POST_ORDER_TRAVERSAL);
+
+   while (dir_iterator_advance(diter) == ITER_OK) {
+   if (S_ISDIR(diter->st.st_mode)) {
+   if (rmdir(diter->path.buf))
+   die_errno("cannot rmdir '%s'", diter->path.buf);
+   } else if (unlink(diter->path.buf))
+   die_errno("cannot unlink '%s'", diter->path.buf);
}
-   closedir(dir);
-   if (rmdir(path->buf))
-   die_errno("cannot rmdir '%s'", path->buf);
+
+   if (rmdir(path))
+   die_errno("cannot rmdir '%s'", path);
 }

 static int create_file(const char *path, unsigned int mode)
@@ -282,7 +271,7 @@ int checkout_entry(struct cache_entry *ce,
return 0;
if (!state->force)
return error("%s is a directory", path.buf);
-   remove_subtree();
+   remove_subtree(path.buf);
} else if (unlink(path.buf))
return error_errno("unable to unlink old '%s'", 
path.buf);
} else if (state->not_new)
--
2.7.4 (Apple Git-66)



[PATCH v5 6/6] remove_subtree(): test removing nested directories

2017-03-29 Thread Daniel Ferreira
Test removing a nested directory when an attempt is made to restore the
index to a state where it does not exist. A similar test could be found
previously in t/t2000-checkout-cache-clash.sh, but it did not check for
nested directories, which could allow a faulty implementation of
remove_subtree() pass the tests.

Signed-off-by: Daniel Ferreira 
---
 t/t2000-checkout-cache-clash.sh | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/t/t2000-checkout-cache-clash.sh b/t/t2000-checkout-cache-clash.sh
index de3edb5..ac10ba3 100755
--- a/t/t2000-checkout-cache-clash.sh
+++ b/t/t2000-checkout-cache-clash.sh
@@ -57,4 +57,15 @@ test_expect_success SYMLINKS 'checkout-index -f twice with 
--prefix' '
git checkout-index -a -f --prefix=there/
 '
 
+test_expect_success 'git checkout-index -f should remove nested subtrees' '
+   echo content >path &&
+   git update-index --add path &&
+   rm path &&
+   mkdir -p path/with/nested/paths &&
+   echo content >path/file1 &&
+   echo content >path/with/nested/paths/file2 &&
+   git checkout-index -f -a &&
+   test ! -d path
+'
+
 test_done
-- 
2.7.4 (Apple Git-66)



[PATCH v5 4/6] dir_iterator: add tests for dir_iterator API

2017-03-29 Thread Daniel Ferreira
Create t/helper/test-dir-iterator.c, which prints relevant information
about a directory tree iterated over with dir_iterator.

Create t/t0065-dir-iterator.sh, which tests that dir_iterator does
iterate through a whole directory tree and that post-order directory
iteration is correctly implemented.

Signed-off-by: Daniel Ferreira 
---
 Makefile |  1 +
 t/helper/test-dir-iterator.c | 32 +++
 t/t0065-dir-iterator.sh  | 45 
 3 files changed, 78 insertions(+)
 create mode 100644 t/helper/test-dir-iterator.c
 create mode 100755 t/t0065-dir-iterator.sh

diff --git a/Makefile b/Makefile
index a5a11e7..d0245f3 100644
--- a/Makefile
+++ b/Makefile
@@ -607,6 +607,7 @@ TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
+TEST_PROGRAMS_NEED_X += test-dir-iterator
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-dump-split-index
 TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
diff --git a/t/helper/test-dir-iterator.c b/t/helper/test-dir-iterator.c
new file mode 100644
index 000..b4a148f
--- /dev/null
+++ b/t/helper/test-dir-iterator.c
@@ -0,0 +1,32 @@
+#include "cache.h"
+#include "blob.h"
+#include "dir.h"
+#include "streaming.h"
+#include "iterator.h"
+#include "dir-iterator.h"
+
+int cmd_main(int argc, const char **argv) {
+   if (argc < 2) {
+   return 1;
+   }
+
+   struct strbuf path = STRBUF_INIT;
+   strbuf_add(, argv[1], strlen(argv[1]));
+
+   unsigned flag = 0;
+   if (argc == 3 && strcmp(argv[2], "--post-order") == 0)
+   flag = DIR_ITERATOR_POST_ORDER_TRAVERSAL;
+
+   struct dir_iterator *diter = dir_iterator_begin(()->buf, flag);
+
+   while (dir_iterator_advance(diter) == ITER_OK) {
+   if (S_ISDIR(diter->st.st_mode))
+   printf("[d] ");
+   else
+   printf("[f] ");
+
+   printf("(%s) %s\n", diter->relative_path, diter->path.buf);
+   }
+
+   return 0;
+}
diff --git a/t/t0065-dir-iterator.sh b/t/t0065-dir-iterator.sh
new file mode 100755
index 000..3c8ea9a
--- /dev/null
+++ b/t/t0065-dir-iterator.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+test_description='Test directory iteration.'
+
+. ./test-lib.sh
+
+ITER_SORTED_OUTPUT='[d] (a) ./dir/a
+[d] (a/b) ./dir/a/b
+[d] (a/b/c) ./dir/a/b/c
+[d] (d) ./dir/d
+[d] (d/e) ./dir/d/e
+[d] (d/e/d) ./dir/d/e/d
+[f] (a/b/c/d) ./dir/a/b/c/d
+[f] (a/e) ./dir/a/e
+[f] (b) ./dir/b
+[f] (c) ./dir/c
+[f] (d/e/d/a) ./dir/d/e/d/a'
+
+test_expect_success 'dir-iterator should iterate through all files' '
+   mkdir -p dir &&
+   mkdir -p dir/a/b/c/ &&
+   date >dir/b &&
+   date >dir/c &&
+   mkdir -p dir/d/e/d/ &&
+   date >dir/a/b/c/d &&
+   date >dir/a/e &&
+   date >dir/d/e/d/a &&
+
+   test-dir-iterator ./dir >it &&
+   test "$(sort it)" == "$ITER_SORTED_OUTPUT"
+'
+
+ITER_POST_ORDER_OUTPUT='[f] (a/b/c/d) ./dir2/a/b/c/d
+[d] (a/b/c) ./dir2/a/b/c
+[d] (a/b) ./dir2/a/b
+[d] (a) ./dir2/a'
+
+test_expect_success 'dir-iterator should list files properly on post-order 
mode' '
+   mkdir -p dir2/a/b/c/ &&
+   date >dir2/a/b/c/d &&
+
+   test "$(test-dir-iterator ./dir2 --post-order)" == 
"$ITER_POST_ORDER_OUTPUT"
+'
+
+test_done
-- 
2.7.4 (Apple Git-66)



[PATCH v5 0/6] [GSoC] remove_subtree(): reimplement using iterators

2017-03-29 Thread Daniel Ferreira
This is the fifth version of a patch series that implements the GSoC
microproject of converting a recursive call to readdir() to use dir_iterator.

v1: 
https://public-inbox.org/git/CAGZ79kZwT-9mHTiOJ5CEjk2wDFkn6+NcogjX0=vjhsah16a...@mail.gmail.com/T/#t
v2: 
https://public-inbox.org/git/cacsjy8dxh-qpbblfafwpawusba9gvxa7x+mxljevykhk1zo...@mail.gmail.com/T/#t
v3: 
https://public-inbox.org/git/CAGZ79kYtpmURSQWPumobA=e3jbfjkhwcdv_lphkcd71zrwm...@mail.gmail.com/T/#t
v4: 
https://public-inbox.org/git/1490747533-89143-1-git-send-email-bnm...@gmail.com/T/#e437a63e0c22c00c69b5d92977c9b438ed2b9fd3a

I would like to really thank Michael for the incredibly thorough review of
the last version of this series. I never expected anyone to give that
level of attention to this change, and it's really, really appreciated.

All of the points he addressed are fixed in this version. As always, more
feedback is greatly appreciated.

Thanks,
Daniel.

Daniel Ferreira (6):
  dir_iterator: add helpers to dir_iterator_advance
  dir_iterator: refactor state machine model
  dir_iterator: iterate over dir after its contents
  dir_iterator: add tests for dir_iterator API
  remove_subtree(): reimplement using iterators
  remove_subtree(): test removing nested directories

 Makefile|   1 +
 dir-iterator.c  | 123 ++--
 dir-iterator.h  |  17 --
 entry.c |  41 +-
 refs/files-backend.c|   2 +-
 t/helper/test-dir-iterator.c|  32 +++
 t/t0065-dir-iterator.sh |  45 +++
 t/t2000-checkout-cache-clash.sh |  11 
 8 files changed, 210 insertions(+), 62 deletions(-)
 create mode 100644 t/helper/test-dir-iterator.c
 create mode 100755 t/t0065-dir-iterator.sh

--
2.7.4 (Apple Git-66)



[PATCH v5 3/6] dir_iterator: iterate over dir after its contents

2017-03-29 Thread Daniel Ferreira
Create an option for the dir_iterator API to iterate over subdirectories
only after having iterated through their contents. This feature was
predicted, although not implemented by 0fe5043 ("dir_iterator: new API
for iterating over a directory tree", 2016-06-18).

Add the "flags" parameter to dir_iterator_create, allowing for the
aforementioned "depth-first" iteration mode to be enabled. Currently,
the only acceptable flag is DIR_ITERATOR_DEPTH_FIRST.

This is useful for recursively removing a directory and calling rmdir()
on a directory only after all of its contents have been wiped.

Amend a call to dir_iterator_begin() to pass the flags parameter
introduced.

Signed-off-by: Daniel Ferreira 
---
 dir-iterator.c   | 53 
 dir-iterator.h   | 17 -
 refs/files-backend.c |  2 +-
 3 files changed, 62 insertions(+), 10 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index 3ac984b..05d53d2 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -47,6 +47,9 @@ struct dir_iterator_int {
 * that will be included in this iteration.
 */
struct dir_iterator_level *levels;
+
+   /* Holds the flags passed to dir_iterator_begin(). */
+   unsigned flags;
 };
 
 static inline void push_dir_level(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
@@ -113,12 +116,14 @@ int dir_iterator_advance(struct dir_iterator 
*dir_iterator)
iter->base.path.buf, strerror(errno));
/* Popping the level is handled below */
}
-   } else if (S_ISDIR(iter->base.st.st_mode)) {
+   } else if (S_ISDIR(iter->base.st.st_mode) &&
+   !(iter->flags & DIR_ITERATOR_POST_ORDER_TRAVERSAL)) {
if (level->dir_state == DIR_STATE_ITER) {
/*
 * The directory was just iterated
 * over; now prepare to iterate into
-* it.
+* it (unless an option is set for us
+* to do otherwise).
 */
push_dir_level(iter, level);
continue;
@@ -152,7 +157,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
de = readdir(level->dir);
 
if (!de) {
-   /* This level is exhausted; pop up a level. */
+   /* This level is exhausted  */
if (errno) {
warning("error reading directory %s: 
%s",
iter->base.path.buf, 
strerror(errno));
@@ -160,6 +165,32 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
warning("error closing directory %s: 
%s",
iter->base.path.buf, 
strerror(errno));
 
+   if (iter->flags & 
DIR_ITERATOR_POST_ORDER_TRAVERSAL) {
+   /* If we are handling dirpaths after 
their contents,
+* we have to iterate over the 
directory now that we'll
+* have finished iterating into it. */
+   level->dir = NULL;
+
+   if (pop_dir_level(iter) == 0)
+   return 
dir_iterator_abort(dir_iterator);
+
+   level = >levels[iter->levels_nr - 
1];
+   /* Since we are iterating through the 
dirpath
+* after we have gone through it, we 
still need
+* to get rid of the trailing slash we 
appended.
+*
+* This may generate issues if we ever 
want to
+* iterate through the root directory 
AND have
+* post-order traversal enabled.
+*/
+   strbuf_strip_suffix(>base.path, 
"/");
+
+   if (set_iterator_data(iter, level))
+   continue;
+
+   return ITER_OK;
+   }
+
level->dir = NULL;
if (pop_dir_level(iter) == 0)
return dir_iterator_abort(dir_iterator);
@@ -174,6 +205,18 @@ int dir_iterator_advance(struct 

[PATCH v5 1/6] dir_iterator: add helpers to dir_iterator_advance

2017-03-29 Thread Daniel Ferreira
Create inline helpers to dir_iterator_advance(). Make
dir_iterator_advance()'s code more legible and allow some behavior to
be reusable.

Signed-off-by: Daniel Ferreira 
---
 dir-iterator.c | 65 +-
 1 file changed, 42 insertions(+), 23 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index 34182a9..ce8bf81 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -50,6 +50,43 @@ struct dir_iterator_int {
struct dir_iterator_level *levels;
 };

+static inline void push_dir_level(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
+{
+   level->dir_state = DIR_STATE_RECURSE;
+   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
+  iter->levels_alloc);
+   level = >levels[iter->levels_nr++];
+   level->initialized = 0;
+}
+
+static inline int pop_dir_level(struct dir_iterator_int *iter)
+{
+   return --iter->levels_nr;
+}
+
+static inline int set_iterator_data(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
+{
+   if (lstat(iter->base.path.buf, >base.st) < 0) {
+   if (errno != ENOENT)
+   warning("error reading path '%s': %s",
+   iter->base.path.buf,
+   strerror(errno));
+   return -1;
+   }
+
+   /*
+* We have to set these each time because
+* the path strbuf might have been realloc()ed.
+*/
+   iter->base.relative_path =
+   iter->base.path.buf + iter->levels[0].prefix_len;
+   iter->base.basename =
+   iter->base.path.buf + level->prefix_len;
+   level->dir_state = DIR_STATE_ITER;
+
+   return 0;
+}
+
 int dir_iterator_advance(struct dir_iterator *dir_iterator)
 {
struct dir_iterator_int *iter =
@@ -84,11 +121,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 * over; now prepare to iterate into
 * it.
 */
-   level->dir_state = DIR_STATE_RECURSE;
-   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
-  iter->levels_alloc);
-   level = >levels[iter->levels_nr++];
-   level->initialized = 0;
+   push_dir_level(iter, level);
continue;
} else {
/*
@@ -104,7 +137,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 * This level is exhausted (or wasn't opened
 * successfully); pop up a level.
 */
-   if (--iter->levels_nr == 0)
+   if (pop_dir_level(iter) == 0)
return dir_iterator_abort(dir_iterator);

continue;
@@ -129,7 +162,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
iter->base.path.buf, 
strerror(errno));

level->dir = NULL;
-   if (--iter->levels_nr == 0)
+   if (pop_dir_level(iter) == 0)
return dir_iterator_abort(dir_iterator);
break;
}
@@ -138,23 +171,9 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
continue;

strbuf_addstr(>base.path, de->d_name);
-   if (lstat(iter->base.path.buf, >base.st) < 0) {
-   if (errno != ENOENT)
-   warning("error reading path '%s': %s",
-   iter->base.path.buf,
-   strerror(errno));
-   continue;
-   }

-   /*
-* We have to set these each time because
-* the path strbuf might have been realloc()ed.
-*/
-   iter->base.relative_path =
-   iter->base.path.buf + 
iter->levels[0].prefix_len;
-   iter->base.basename =
-   iter->base.path.buf + level->prefix_len;
-   level->dir_state = DIR_STATE_ITER;
+   if (set_iterator_data(iter, level))
+   continue;

return ITER_OK;
}
--
2.7.4 (Apple Git-66)



[PATCH v5 2/6] dir_iterator: refactor state machine model

2017-03-29 Thread Daniel Ferreira
Remove the "initialized" member of dir_iterator_level. Replace its
functionality with a DIR_STATE_PUSH state in the
dir_iterator_level.dir_state enum.

This serves to remove a redundant property in the dir_iterator_level
struct and ease comprehension of the state machine's behavior.

Signed-off-by: Daniel Ferreira 
---
 dir-iterator.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index ce8bf81..3ac984b 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -4,8 +4,6 @@
 #include "dir-iterator.h"

 struct dir_iterator_level {
-   int initialized;
-
DIR *dir;

/*
@@ -20,6 +18,7 @@ struct dir_iterator_level {
 * iteration and also iterated into):
 */
enum {
+   DIR_STATE_PUSH,
DIR_STATE_ITER,
DIR_STATE_RECURSE
} dir_state;
@@ -55,8 +54,10 @@ static inline void push_dir_level(struct dir_iterator_int 
*iter, struct dir_iter
level->dir_state = DIR_STATE_RECURSE;
ALLOC_GROW(iter->levels, iter->levels_nr + 1,
   iter->levels_alloc);
+
+   /* Push a new level */
level = >levels[iter->levels_nr++];
-   level->initialized = 0;
+   level->dir_state = DIR_STATE_PUSH;
 }

 static inline int pop_dir_level(struct dir_iterator_int *iter)
@@ -97,7 +98,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>levels[iter->levels_nr - 1];
struct dirent *de;

-   if (!level->initialized) {
+   if (level->dir_state == DIR_STATE_PUSH) {
/*
 * Note: dir_iterator_begin() ensures that
 * path is not the empty string.
@@ -112,8 +113,6 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
iter->base.path.buf, strerror(errno));
/* Popping the level is handled below */
}
-
-   level->initialized = 1;
} else if (S_ISDIR(iter->base.st.st_mode)) {
if (level->dir_state == DIR_STATE_ITER) {
/*
@@ -215,7 +214,7 @@ struct dir_iterator *dir_iterator_begin(const char *path)
ALLOC_GROW(iter->levels, 10, iter->levels_alloc);

iter->levels_nr = 1;
-   iter->levels[0].initialized = 0;
+   iter->levels[0].dir_state = DIR_STATE_PUSH;

return dir_iterator;
 }
--
2.7.4 (Apple Git-66)



Re: [PATCH v8 04/11] update-index: add untracked cache notifications

2017-03-29 Thread Jeff King
On Wed, Jan 27, 2016 at 07:58:00AM +0100, Christian Couder wrote:

> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index 6dd..369c207 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -130,7 +130,7 @@ static int test_if_untracked_cache_is_supported(void)
>   if (!mkdtemp(mtime_dir.buf))
>   die_errno("Could not make temporary directory");
>  
> - fprintf(stderr, _("Testing "));
> + fprintf(stderr, _("Testing mtime in '%s' "), xgetcwd());

Coverity points out that this is a leak (xgetcwd returns an allocated
buffer).

-Peff


Re: [PATCH 3/3] WIP - Allow custom printf function for column printing

2017-03-29 Thread Jeff King
On Wed, Mar 29, 2017 at 06:42:38PM -0700, Stefan Beller wrote:

> Ever wondered why column.ui applies the untracked files in git-status,
> but not for the help text comment in git-commit? Nobody wrote the code!
> 
> This is marked as WIP, as it barely demonstrates how the code may look
> like. No tests, no documentation.

I'm confused about what this patch is trying to do. Is it just to turn
on column.status support for the template shown by git-commit? Or is it
columnizing more than the untracked files list?

If the former, why isn't just setting s.colopts enough? I guess because
we write into a strbuf?

If the latter, then isn't that a separate logical patch from "should
commit use columns"?

I don't have an opinion on the whole thing myself, as I do not use
columns nor really know how they are supposed to work. You found my
4d2292e9a9, but I was explicitly trying _not_ to get involved in any
behavior changes there due to my cluelessness. :)

I think Duy, who wrote all of the column code, is a better person to cc.

-Peff


Re: [PATCH 2/3] column: allow for custom printf

2017-03-29 Thread Jeff King
On Wed, Mar 29, 2017 at 06:42:37PM -0700, Stefan Beller wrote:

> Signed-off-by: Stefan Beller 

No justification?

I assume it will be used in a future patch.

> diff --git a/column.h b/column.h
> index 0a61917fa7..c44a1525a9 100644
> --- a/column.h
> +++ b/column.h
> @@ -24,6 +24,9 @@ struct column_options {
>   int padding;
>   const char *indent;
>   const char *nl;
> +
> + /* when non-NULL, use this printing function, fallback to printf */
> + int (*_printf)(const char *__format, ...);
>  };

Avoid names with leading underscores. They're reserved by the C
standard.

I wonder if gcc is smart enough to let us mark this function pointer
with a "format" attribute so we can get compile-time checking of the
format string.

-Peff


Re: Re: [PATCH] userdiff: add build-in pattern for shell

2017-03-29 Thread Pickfire
Junio C Hamano  wrote:

> Ivan Tham  writes:
> 
> > Shell are widely used but comes with lots of different patterns. The
> > build-in pattern aim for POSIX-compatible shells with some additions:
> >
> > - Notably ${g//re/s} and ${g#cut}
> > - "function" from bash
> >
> > Signed-off-by: Ivan Tham 
> > ---
> >  Documentation/gitattributes.txt |  2 ++
> >  t/t4034-diff-words.sh   |  1 +
> >  t/t4034/sh/expect   | 14 ++
> >  t/t4034/sh/post |  7 +++
> >  t/t4034/sh/pre  |  7 +++
> >  userdiff.c  |  5 +
> >  6 files changed, 36 insertions(+)
> >  create mode 100644 t/t4034/sh/expect
> >  create mode 100644 t/t4034/sh/post
> >  create mode 100644 t/t4034/sh/pre
> >
> > diff --git a/Documentation/gitattributes.txt 
> > b/Documentation/gitattributes.txt
> > index a53d093ca..1bad72df2 100644
> > --- a/Documentation/gitattributes.txt
> > +++ b/Documentation/gitattributes.txt
> > @@ -706,6 +706,8 @@ patterns are available:
> >  
> >  - `ruby` suitable for source code in the Ruby language.
> >  
> > +- `sh` suitable for source code in POSIX-compatible shells.
> 
> The new test you added seems to show that this is not limited to
> POSIX shells but also understands bashisms like ${x//x/x}.  Perhaps
> drop "POSIX-compatible" from here

Those shells are still POSIX-compatible so I think it is true to put
that or otherwise, something like fish shell will break since it is
as well a shell but the syntax is totally different.

> > diff --git a/userdiff.c b/userdiff.c
> > index 8b732e40b..8d5127fb6 100644
> > --- a/userdiff.c
> > +++ b/userdiff.c
> > @@ -148,6 +148,11 @@ PATTERNS("csharp",
> >  "[a-zA-Z_][a-zA-Z0-9_]*"
> >  "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
> >  "|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->"),
> > +PATTERNS("sh",
> > +"^[ \t]*(function )?[A-Za-z_][A-Za-z_0-9]*[ \t]*()[\t]*\\{?$",
> 
> There is something funky going on around parentheses on this line.
> The ones around "function " is meant to be syntactic metacharacters
> to produce a group in the regexp so that you can apply '?'
> (i.e. zero or one occurrence) to it.  But I think the second pair of
> parentheses that appears later on the line, which enclose nothing,
> are meant to be literal?  E.g. "hello (){\n\techo world;\n}\n"  They
> would need some quoting, perhaps like
> 
>   ...[ \t]*\\(\\)[\t]*

Ah, I think I forgot to escape the quoting of ( and ). I will send in another
patch for that.

> > +/* -- */
> > +"(\\$|--?)?([a-zA-Z_][a-zA-Z0-9._]*|[0-9]+|#)|--" /* command/param */
> 
> TBH, I have no idea what this line-noise is doing.

That breaks word into "a", "$a" and "-a" as well as "$1" and "$#". I tried
supporting $? by adding +|#|\\?)--" but it doesn't seemed like it is working.

> $foobar, $4, --foobar, foobar, 123 and -- can be seen easily out of
> these patterns.  I am not sure what --# would be (perhaps you meant
> to only catch $# and --# is included by accident, in which case it
> is understandable).  It feels a bit strange to see that $# is
> supported but not $?; --foo but not --foo=bar; foobar but not "foo
> bar" inside a dq-pair.

Yes, getting --# will be very rare in shell. I think it is better to seperate
the --foo=bar into --foo and bar. I don't get what you man by the dq-pair.

> > +"|\\$[({]|[)}]|[-+*/=!]=?|[\\]&%#/|]{1,2}|[<>]{1,3}|[ \t]#.*"),
> 
> And this one is even more dense.

Yes, that takes care of the operators, special symbols and stuff.


[PATCH 1/3] column.c: pass column_options to down to display_plain

2017-03-29 Thread Stefan Beller
In a later patch we want to use more of the column_options members at
places, where we do actual output, so it will be handy to have the whole
struct around in `display_plain`.

Signed-off-by: Stefan Beller 
---
 column.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/column.c b/column.c
index d55ead18ef..4851b9aa04 100644
--- a/column.c
+++ b/column.c
@@ -109,12 +109,12 @@ static void shrink_columns(struct column_data *data)
 
 /* Display without layout when not enabled */
 static void display_plain(const struct string_list *list,
- const char *indent, const char *nl)
+ const struct column_options *opts)
 {
int i;
 
for (i = 0; i < list->nr; i++)
-   printf("%s%s%s", indent, list->items[i].string, nl);
+   printf("%s%s%s", opts->indent, list->items[i].string, opts->nl);
 }
 
 /* Print a cell to stdout with all necessary leading/traling space */
@@ -201,12 +201,14 @@ void print_columns(const struct string_list *list, 
unsigned int colopts,
nopts.padding = opts ? opts->padding : 1;
nopts.width = opts && opts->width ? opts->width : term_columns() - 1;
if (!column_active(colopts)) {
-   display_plain(list, "", "\n");
+   nopts.indent = "";
+   nopts.nl = "\n";
+   display_plain(list, );
return;
}
switch (COL_LAYOUT(colopts)) {
case COL_PLAIN:
-   display_plain(list, nopts.indent, nopts.nl);
+   display_plain(list, );
break;
case COL_ROW:
case COL_COLUMN:
-- 
2.12.2.511.g2abb8caf66



[PATCH 3/3] WIP - Allow custom printf function for column printing

2017-03-29 Thread Stefan Beller
Ever wondered why column.ui applies the untracked files in git-status,
but not for the help text comment in git-commit? Nobody wrote the code!

This is marked as WIP, as it barely demonstrates how the code may look
like. No tests, no documentation.

Signed-off-by: Stefan Beller 
---
 builtin/commit.c |  1 -
 wt-status.c  | 29 -
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 4e288bc513..f482150df8 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1649,7 +1649,6 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
 
status_init_config(, git_commit_config);
status_format = STATUS_FORMAT_NONE; /* Ignore status.short */
-   s.colopts = 0;
 
if (get_sha1("HEAD", oid.hash))
current_head = NULL;
diff --git a/wt-status.c b/wt-status.c
index 308cf3779e..cfba352683 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -43,12 +43,13 @@ static const char *color(int slot, struct wt_status *s)
return c;
 }
 
-static void status_vprintf(struct wt_status *s, int at_bol, const char *color,
+static int status_vprintf(struct wt_status *s, int at_bol, const char *color,
const char *fmt, va_list ap, const char *trail)
 {
struct strbuf sb = STRBUF_INIT;
struct strbuf linebuf = STRBUF_INIT;
const char *line, *eol;
+   int ret = 0;
 
strbuf_vaddf(, fmt, ap);
if (!sb.len) {
@@ -59,9 +60,9 @@ static void status_vprintf(struct wt_status *s, int at_bol, 
const char *color,
}
color_print_strbuf(s->fp, color, );
if (trail)
-   fprintf(s->fp, "%s", trail);
+   ret += fprintf(s->fp, "%s", trail);
strbuf_release();
-   return;
+   return ret;
}
for (line = sb.buf; *line; line = eol + 1) {
eol = strchr(line, '\n');
@@ -78,15 +79,16 @@ static void status_vprintf(struct wt_status *s, int at_bol, 
const char *color,
strbuf_addstr(, line);
color_print_strbuf(s->fp, color, );
if (eol)
-   fprintf(s->fp, "\n");
+   ret += fprintf(s->fp, "\n");
else
break;
at_bol = 1;
}
if (trail)
-   fprintf(s->fp, "%s", trail);
+   ret += fprintf(s->fp, "%s", trail);
strbuf_release();
strbuf_release();
+   return ret;
 }
 
 void status_printf_ln(struct wt_status *s, const char *color,
@@ -834,6 +836,20 @@ static void wt_longstatus_print_submodule_summary(struct 
wt_status *s, int uncom
strbuf_release();
 }
 
+static struct wt_status *global_wt_status_hack;
+static int column_status_printf(const char *fmt, ...)
+{
+   va_list ap;
+   struct wt_status *s = global_wt_status_hack;
+   int ret;
+
+   va_start(ap, fmt);
+   ret = status_vprintf(s, 0, "", fmt, ap, NULL);
+   va_end(ap);
+
+   return ret;
+}
+
 static void wt_longstatus_print_other(struct wt_status *s,
  struct string_list *l,
  const char *what,
@@ -856,6 +872,7 @@ static void wt_longstatus_print_other(struct wt_status *s,
path = quote_path(it->string, s->prefix, );
if (column_active(s->colopts)) {
string_list_append(, path);
+   global_wt_status_hack = s;
continue;
}
status_printf(s, color(WT_STATUS_HEADER, s), "\t");
@@ -876,6 +893,8 @@ static void wt_longstatus_print_other(struct wt_status *s,
copts.indent = buf.buf;
if (want_color(s->use_color))
copts.nl = GIT_COLOR_RESET "\n";
+
+   copts._printf = column_status_printf;
print_columns(, s->colopts, );
string_list_clear(, 0);
strbuf_release();
-- 
2.12.2.511.g2abb8caf66



[PATCH 2/3] column: allow for custom printf

2017-03-29 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 column.c | 13 -
 column.h |  3 +++
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/column.c b/column.c
index 4851b9aa04..522e554f29 100644
--- a/column.c
+++ b/column.c
@@ -114,7 +114,9 @@ static void display_plain(const struct string_list *list,
int i;
 
for (i = 0; i < list->nr; i++)
-   printf("%s%s%s", opts->indent, list->items[i].string, opts->nl);
+   opts->_printf("%s%s%s", opts->indent,
+   list->items[i].string,
+   opts->nl);
 }
 
 /* Print a cell to stdout with all necessary leading/traling space */
@@ -143,10 +145,10 @@ static int display_cell(struct column_data *data, int 
initial_width,
else
newline = x == data->cols - 1 || i == data->list->nr - 1;
 
-   printf("%s%s%s",
-  x == 0 ? data->opts.indent : "",
-  data->list->items[i].string,
-  newline ? data->opts.nl : empty_cell + len);
+   data->opts._printf("%s%s%s",
+  x == 0 ? data->opts.indent : "",
+  data->list->items[i].string,
+  newline ? data->opts.nl : empty_cell + len);
return 0;
 }
 
@@ -200,6 +202,7 @@ void print_columns(const struct string_list *list, unsigned 
int colopts,
nopts.nl = opts && opts->nl ? opts->nl : "\n";
nopts.padding = opts ? opts->padding : 1;
nopts.width = opts && opts->width ? opts->width : term_columns() - 1;
+   nopts._printf = opts && opts->_printf ? opts->_printf : printf;
if (!column_active(colopts)) {
nopts.indent = "";
nopts.nl = "\n";
diff --git a/column.h b/column.h
index 0a61917fa7..c44a1525a9 100644
--- a/column.h
+++ b/column.h
@@ -24,6 +24,9 @@ struct column_options {
int padding;
const char *indent;
const char *nl;
+
+   /* when non-NULL, use this printing function, fallback to printf */
+   int (*_printf)(const char *__format, ...);
 };
 
 struct option;
-- 
2.12.2.511.g2abb8caf66



[PATCH 0/3] Respect column.ui in commented status in git-commit

2017-03-29 Thread Stefan Beller
When studying the code, I was nerd-sniped by the commit message of
4d2292e9a9 (status: refactor colopts handling, 2012-05-07)
as I did not understand why it was so important to reset the s.colopts to 0
in builtin/commit.c.

In my adolescent hybris I nearly sent out a patch claiming that line to be
useless and wrong, but then I studied a bit more. After the background story
became clear, I decided to "just write the missing piece", how hard can it be?

I would consider the following three patches a hack, but they work. You can
have untracked files in column mode in the commented text for a commit.

Thanks,
Stefan

Stefan Beller (3):
  column.c: pass column_options to down to display_plain
  column: allow for custom printf
  WIP - Allow custom printf function for column printing

 builtin/commit.c |  1 -
 column.c | 21 +
 column.h |  3 +++
 wt-status.c  | 29 -
 4 files changed, 40 insertions(+), 14 deletions(-)

-- 
2.12.2.511.g2abb8caf66



Re: [PATCH v2 07/21] builtin/receive-pack: convert portions to struct object_id

2017-03-29 Thread Jeff King
On Wed, Mar 29, 2017 at 11:21:52PM +, brian m. carlson wrote:

> On Tue, Mar 28, 2017 at 03:07:12AM -0400, Jeff King wrote:
> > It took me a while to find it. This is the switch from "len == 48" to
> > "len > 8" when matching "shallow" lines. I think this makes sense.
> > 
> > > Note that in queue_command we are guaranteed to have a NUL-terminated
> > > buffer or at least one byte of overflow that we can safely read, so the
> > > linelen check can be elided.  We would die in such a case, but not read
> > > invalid memory.
> > 
> > I think linelen is always just strlen(line). Since the queue_command
> > function no longer cares about it, perhaps we can just omit it?
> 
> I've just looked at this to put in a fix, and I don't agree.  We are
> guaranteed that we'll have overflow, but linelen can point to a newline,
> not just a NUL, so it isn't always strlen(line).  This is the case in
> queue_commands_from_cert.
> 
> We do use this value, in computing the reflen value, so I think it's
> better to keep it for now.  I agree that a future refactor could convert
> it to be NUL-terminated, but I'd rather not make that change here.

Yeah, you might have missed my followup:

  
http://public-inbox.org/git/20170328173536.ylwesrj7jbrez...@sigill.intra.peff.net/

Basically yes, I agree it's probably not worth trying to refactor
further.

-Peff


Re: [RFC PATCH] change default for status.submoduleSummary to true

2017-03-29 Thread Jacob Keller
On Wed, Mar 29, 2017 at 6:20 PM, Stefan Beller  wrote:
> A user complained about the workflow with submodules:
>> Re submodules pain, I've seen a lot of people get confused about
>> how and when to commit submodule changes. The main thing missing
>> in the related UIs is some way to summarize the subproject commit
>> diff in a human readable way. Maybe last log message would be better
>> than just sha?
>
> We could advise all the confused users to turn on
> status.submoduleSummary.  However there is no downside from turning
> it on by default apart from a slight change in behavior and bit
> longer output of git-status and the help in git-commit.
>

Makes sense to me.

Thanks,
Jake

> Signed-off-by: Stefan Beller 
> ---
>
>  Maybe we can merge this early after 2.13, so we have a longer time frame
>  in which people may react to this change of a default?
>
>  Thanks,
>  Stefan
>
>  Documentation/config.txt | 2 +-
>  builtin/commit.c | 2 ++
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 1df1965457..34d4735414 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2984,7 +2984,7 @@ This variable can be overridden with the 
> -u|--untracked-files option
>  of linkgit:git-status[1] and linkgit:git-commit[1].
>
>  status.submoduleSummary::
> -   Defaults to false.
> +   Defaults to true.
> If this is set to a non zero number or true (identical to -1 or an
> unlimited number), the submodule summary will be enabled and a
> summary of commits for modified submodules will be shown (see
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 4e288bc513..833a651013 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1388,6 +1388,7 @@ int cmd_status(int argc, const char **argv, const char 
> *prefix)
>
> s.ignore_submodule_arg = ignore_submodule_arg;
> s.status_format = status_format;
> +   s.submodule_summary = -1;
> s.verbose = verbose;
>
> wt_status_collect();
> @@ -1650,6 +1651,7 @@ int cmd_commit(int argc, const char **argv, const char 
> *prefix)
> status_init_config(, git_commit_config);
> status_format = STATUS_FORMAT_NONE; /* Ignore status.short */
> s.colopts = 0;
> +   s.submodule_summary = -1;
>
> if (get_sha1("HEAD", oid.hash))
> current_head = NULL;
> --
> 2.12.2.511.g2abb8caf66
>


[RFC PATCH] change default for status.submoduleSummary to true

2017-03-29 Thread Stefan Beller
A user complained about the workflow with submodules:
> Re submodules pain, I've seen a lot of people get confused about
> how and when to commit submodule changes. The main thing missing
> in the related UIs is some way to summarize the subproject commit
> diff in a human readable way. Maybe last log message would be better
> than just sha?

We could advise all the confused users to turn on
status.submoduleSummary.  However there is no downside from turning
it on by default apart from a slight change in behavior and bit
longer output of git-status and the help in git-commit.

Signed-off-by: Stefan Beller 
---

 Maybe we can merge this early after 2.13, so we have a longer time frame
 in which people may react to this change of a default?
 
 Thanks,
 Stefan

 Documentation/config.txt | 2 +-
 builtin/commit.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1df1965457..34d4735414 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2984,7 +2984,7 @@ This variable can be overridden with the 
-u|--untracked-files option
 of linkgit:git-status[1] and linkgit:git-commit[1].
 
 status.submoduleSummary::
-   Defaults to false.
+   Defaults to true.
If this is set to a non zero number or true (identical to -1 or an
unlimited number), the submodule summary will be enabled and a
summary of commits for modified submodules will be shown (see
diff --git a/builtin/commit.c b/builtin/commit.c
index 4e288bc513..833a651013 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1388,6 +1388,7 @@ int cmd_status(int argc, const char **argv, const char 
*prefix)
 
s.ignore_submodule_arg = ignore_submodule_arg;
s.status_format = status_format;
+   s.submodule_summary = -1;
s.verbose = verbose;
 
wt_status_collect();
@@ -1650,6 +1651,7 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
status_init_config(, git_commit_config);
status_format = STATUS_FORMAT_NONE; /* Ignore status.short */
s.colopts = 0;
+   s.submodule_summary = -1;
 
if (get_sha1("HEAD", oid.hash))
current_head = NULL;
-- 
2.12.2.511.g2abb8caf66



Re: [PATCH 0/7] PREVIEW: Introduce DC_AND_OPENSSL_SHA1 make flag

2017-03-29 Thread Junio C Hamano
Johannes Schindelin  writes:

> - After the SHAttered blog post became public, Linus first made the case
>   that it matters not all that much: the development of the Linux kernel
>   is based on trust, and nobody would pull from a person they do not trust.
>   This approach does obviously not extend to most other projects.
>
> - By switching the default to DC_SHA1 already in `master`, you now took
>   the exact opposite position: it *always* matters, even when you trust
>   people, and the 6x slowdown demonstrated by my perf test is something that
>   everybody needs to accept, even if it is spent for no benefit in return.

You seem to be trying very hard to make it look like as if Linus
said one thing and I decided to do something else, but if one digs
into the archive, it should be clear that that is not how the
current line of development has happened.  Even from the same
person, you can hear that "the social aspect of how the project is
structured makes the kernel less susceptible to the shattered
attack" [*1*] and that "mitigation at Git level is easy" [*2*].

The kernel as a project may not have to do much, but the world of
projects using Git is wider than the kernel alone, and mitigation
that applies to all was available and is easy to make it the
default.

> The approach I chose instead was to make the switch global, per command.
> Obviously, the next step is to identify the Git commands which accept
> objects from external sources (`clone`, `fetch`, `fast-import` and all the
> remote helpers come to mind) and let them set a global flag before asking
> git_default_config() to parse the core.enableSHA1DC setting, so that the
> special value `external` could trigger the collision detecting code for
> those, and only those, commands. That would still err on the safe side if
> these commands are used to hash objects originating from the same machine,
> but that is a price I would be willing to pay for the simplicity of this
> approach.
>
> Does my explanation manage to illustrate in a plausible way why I chose
> the approach that I did?

I agree that there may be rooms to tweak the degree of paranoia per
codepath (I said that already in the message you are responding to),
but as Linus and Peff already said in the old discussion thread
[*3*], I doubt that it needs to be runtime configurable.

In any case, I do not think you should blindly trust what end-users
add to the repository locally.  A vendor may send in a binary driver
update via a pull-request, and you would want to protect "clone" and
"fetch" client because of that.  The same driver update however may
come as a patch that is accepted by running "git am".  Or you may
get a vendor tarball ftp'ed over and "git add ." the whole thing.

These "local" operations still need the same degree of paranoia as
your "git fetch"; "per command" may not be such a good criterion
because of that.

That is why I mentioned "you want to be paranoid any time you add
new data to the object database" as a rule of thumb in the message
you are responding to.  It may be overly broad, but would be a
better starting point than "'git add' is always safe as it is
local".

So in short, I do agree with your idea of using "faster" hashing for
some codepaths and "paranoid" ones for others.  I think "slower but
collision-attack detecting" one (aka DC_SHA1) plus a choice of
"faster" one at compile time would be a sensible way to go, and if
one is truly paranoid, the "faster" one _could_ be sha1dc.c with
collision detection disabled.  On the other hand, if one is in a
closed environment, one may want both hashes configurable and let
OpenSSL implementation be chosen for both "slower but DC" and
"faster" hash.  And I am OK with that, too.


[Reference]

*1* 
https://public-inbox.org/git/CA+55aFz98r7NC_3BW_1HU9-0C-HcrFou3=0gmRcS38=-x8d...@mail.gmail.com/

*2* 
https://public-inbox.org/git/ca+55afxmr6ntwgbjda8toyxxdx3h-yd4tqthgv_tn1u91yy...@mail.gmail.com/

*3* 
https://public-inbox.org/git/20170323174750.xyucxmfhuc6db...@sigill.intra.peff.net/


Re: [GSoC] Proposal: turn git-add--interactive.perl into a builtin

2017-03-29 Thread Johannes Schindelin
Hi Stefan & Daniel,

On Tue, 28 Mar 2017, Stefan Beller wrote:

> On Sat, Mar 25, 2017 at 8:15 PM, Daniel Ferreira (theiostream)
>  wrote:
> 
> > SYNOPSIS
> > There are many advantages to converting parts of git that are still
> > scripts to C builtins, among which execution speed, improved
> > compatibility and code deduplication.
> 
> agreed.

I would even add portability. But yeah, speed is a big thing. I am an
extensive user of `git add -p` (which is backed by
git-add--interactive.perl) and it is slow as molasses on Windows, just
because it is a Perl script (and the Perl interpreter needs to emulate
POSIX functionality that is frequently not even needed, such as: copying
all memory and reopening all file descriptors in a fork() call only to
exec() git.exe right away, tossing all of the diligently work into the
dustbin).

> > git-add--interactive, one of the most useful features of Git.
> 
> knee jerk reaction: I never used it, so it cannot be that important ;)
> (I use git-gui, which is essentially the same workflow. There are tons
> of ways to accomplish a given goal using Git, so I guess we don't
> want to get in an argument here).

Well, I make up for your lack of `git add -i` usage.

Of course, since you use git-gui, you are simply using another dependency
that bloats Git for Windows: Tcl/Tk.

> > FEASIBILITY
> >
> > There was only one discussion regarding the feasibility of its porting
> > (https://public-inbox.org/git/CAP8UFD2PcBsU6=FK4OHVrB7E98ycohS_0pYcbCBar=of1hl...@mail.gmail.com/).
> > It resulted in a consensus that doing it would be a task too large –
> > although interesting – for GSoC 2015 based on the amount of its lines
> > of code. It is, however, only a few lines larger than
> > git-rebase--interactive, which has been considered an appropriate
> > idea. As such, it looks like a possible project for three months of
> > full-time work.
> 
> ok, it sounds a challenging project. (currently counting 1750 lines of
> code). Scrolling over the source code, there are quite a couple of
> functions, where the direct equivalent in C springs to mind.
> 
> run_cmd_pipe -> see run-command.h
> unquote_path -> unquote_c_style ?
> refresh -> update_index_if_able()
> list_modified -> iterate over "const struct cache_entry *ce = 
> active_cache[i];"

Yes, I think it would be more important to acquaint oneself with the
idiosynchracies of Git's internal "API" than to get familiar with Perl:
interpreting what obscure Perl code does is something I would gladly do as
a mentor.

> > PROJECTED TIMELINE
> > - Prior to May 4
> > -- Refine my basic knowledge of Perl
> > -- Craft one or two small patches to some of Git's Perl components
> > (preferentially to git-add--interactive itself) to improve my
> > understanding of the language and of how Git's Perl scripts actually
> > work

As I mentioned above, the Perl code should be fairly intuitive for the
most part, with maybe a couple of pieces of code using more advanced Perl
techniques. Given the scope of the project, I would recommend working
closely together with the mentor(s) to clarify what those code parts do.

> > - May 4 - May 30
> > -- Clarify implementation details with my mentor, and work on a more
> > detailed roadmap for the project
> > -- Investigate roughly how to replace command invocations from the
> > script with actual builtin functions; which Git APIs in Perl already
> > have functional equivalents in C; which parts will require a full
> > rewrite.
> 
> There are different approaches for replacing functionality in another
> language. Examples:
> * Implement the functionality in C and then have a "flag-day" commit
>   783d7e865e (builtin-am: remove redirection to git-am.sh, 2015-08-04)
>   This only works when the whole functionality was replaced in prior commits
> * Implement partial functionality in C and call it via a helper function.
>   3604242f08 (submodule: port init from shell to C, 2016-04-15)
>   This works well for only partial conversions (the larger the thing to
>   convert the more appealing this is, as it gets code shipped early.)
>   When choosing this strategy, this part of the Project would be to
>   identify parts that could be ported on its own without much
>   additional glue-code.

To offer my perspective: I strongly prefer the latter approach. Not only
does it yield earlier results, it also makes it substantially easier to
handle the project even if it should turn out to be a little larger than
just 3 months.

> > - May 30 - June 30 (start of coding period)
> > -- Define the architecture of the builtin within git (which
> > functions/interfaces will it have? where will its code reside?).
> > -- Implement a small subset of the builtin (to be defined with my
> > mentor) and glue it into the existing Perl script. Present this as a
> > first patch to get feedback early regarding the implementation and
> > avoid piling up mistakes early.
> > -- Do necessary changes based on this initial review.
> > -- Have 

Re: [PATCH v2] travis-ci: build and test Git on Windows

2017-03-29 Thread Johannes Schindelin
Hi,

On Fri, 24 Mar 2017, Sebastian Schuberth wrote:

> On Fri, Mar 24, 2017 at 1:35 PM, Lars Schneider
>  wrote:
> 
> >> 1. use appveyor.com, as that is a Travis-like service for Windows. We do 
> >> our
> >>   windows-builds in the curl project using that.
> >
> > The Git for Windows build and tests are *really* resources intensive and 
> > they
> > take a lot of setup time. AFAIK we would run into timeouts with AppVeyor.
> > Maybe Sebastian or Dscho know details?
> 
> At lot of setup time, in terms of installing the Git for Windows SDK,
> could probably be saved by only installing the required packages on
> top of MSYS2 that's already installed in AppVeyor nodes [1].
> 
> [1] https://www.appveyor.com/docs/build-environment/#mingw-msys-cygwin

And that is indeed what Git for Windows' Git fork has configured. Sadly,
the job timed out too often for me, even if I disabled the tests.

As of recent, things turned more green, with the build taking only about 3
minutes. Of course, re-enabling the tests would let us run smack into the
20 minutes time out, *and* we still would have to disable the tests that
fail because MSYS2's maintainer ignored the PRs trying to integrate Git
for Windows' patches to the MSYS2 runtime.

Example for a green job:
https://ci.appveyor.com/project/dscho/git/build/1.0.676

Ciao,
Johannes


Re: [PATCH v2 07/21] builtin/receive-pack: convert portions to struct object_id

2017-03-29 Thread brian m. carlson
On Tue, Mar 28, 2017 at 03:07:12AM -0400, Jeff King wrote:
> It took me a while to find it. This is the switch from "len == 48" to
> "len > 8" when matching "shallow" lines. I think this makes sense.
> 
> > Note that in queue_command we are guaranteed to have a NUL-terminated
> > buffer or at least one byte of overflow that we can safely read, so the
> > linelen check can be elided.  We would die in such a case, but not read
> > invalid memory.
> 
> I think linelen is always just strlen(line). Since the queue_command
> function no longer cares about it, perhaps we can just omit it?

I've just looked at this to put in a fix, and I don't agree.  We are
guaranteed that we'll have overflow, but linelen can point to a newline,
not just a NUL, so it isn't always strlen(line).  This is the case in
queue_commands_from_cert.

We do use this value, in computing the reflen value, so I think it's
better to keep it for now.  I agree that a future refactor could convert
it to be NUL-terminated, but I'd rather not make that change here.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 2/2] submodule.c: correctly handle nested submodules in is_submodule_modified

2017-03-29 Thread Jonathan Nieder
Stefan Beller wrote:

> This bug fix also affects the default output (non-short, non-porcelain)
> of git-status, which is not tested here.

Do you have an example?  (In just the commit message would be fine, in
tests would be even better.)

> Signed-off-by: Stefan Beller 
> ---
>  Documentation/git-status.txt |  2 ++
>  submodule.c  | 21 +++--
>  t/t3600-rm.sh|  2 +-
>  t/t7506-status-submodule.sh  |  4 ++--
>  4 files changed, 24 insertions(+), 5 deletions(-)

Reviewed-by: Jonathan Nieder 

but I would be a lot more comfortable after looking at the change to
"git status" output.  (E.g. a test demonstrating it can happen in a
followup change if that's simpler.)

Thanks for your patient work on this.


Re: [PATCH 1/2] short status: improve reporting for submodule changes

2017-03-29 Thread Jonathan Nieder
Stefan Beller wrote:

> Signed-off-by: Stefan Beller 
> Reviewed-by: Jonathan Nieder 
> ---
>  Documentation/git-status.txt |  11 
>  t/t3600-rm.sh|  18 +--
>  t/t7506-status-submodule.sh  | 117 
> +++
>  wt-status.c  |  17 ++-
>  4 files changed, 156 insertions(+), 7 deletions(-)

Yes, this looks good.

Thank you,
Jonathan


Re: [PATCH] unpack-trees.c: align submodule error message to the other error messages

2017-03-29 Thread Junio C Hamano
Stefan Beller  writes:

> As the place holder in the error message is for multiple submodules,
> we don't want to encapsulate the string place holder in single quotes.
>
> Signed-off-by: Stefan Beller 
> ---
>
>> Nitpicking about wording: unless the user has adopted a strongly
>> object-oriented point of view, it is Git that cannot checkout a new
>> HEAD, not the submodule.
>> 
>> How about:
>> 
>> _("Cannot update submodule:\n%s")
>
>> That's vague, but if I understand correctly the way this error gets
>> used is equally vague --- i.e., a clearer message would involve
>> finer-grained error codes.
>
> Makes sense. Here is the patch.
> Let's roll this as its own instead of waiting for the discussion on the other
> patch to settle.
>
> Thanks,
> Stefan
>
>  unpack-trees.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 8333da2cc9..0d82452f7f 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -167,7 +167,7 @@ void setup_unpack_trees_porcelain(struct 
> unpack_trees_options *opts,
>   msgs[ERROR_WOULD_LOSE_ORPHANED_REMOVED] =
>   _("The following working tree files would be removed by sparse 
> checkout update:\n%s");
>   msgs[ERROR_WOULD_LOSE_SUBMODULE] =
> - _("Submodule '%s' cannot checkout new HEAD");
> + _("Cannot update submodule:\n%s")

Missing ';'.  I'll fix locally, but the final integration result
won't be pushed out until later tonight, as I need to redo jch and
pu branches with a fixed version.


>  
>   opts->show_all_errors = 1;
>   /* rejected paths may not have a static buffer */


[PATCH] unpack-trees.c: align submodule error message to the other error messages

2017-03-29 Thread Stefan Beller
As the place holder in the error message is for multiple submodules,
we don't want to encapsulate the string place holder in single quotes.

Signed-off-by: Stefan Beller 
---

> Nitpicking about wording: unless the user has adopted a strongly
> object-oriented point of view, it is Git that cannot checkout a new
> HEAD, not the submodule.
> 
> How about:
> 
> _("Cannot update submodule:\n%s")

> That's vague, but if I understand correctly the way this error gets
> used is equally vague --- i.e., a clearer message would involve
> finer-grained error codes.

Makes sense. Here is the patch.
Let's roll this as its own instead of waiting for the discussion on the other
patch to settle.

Thanks,
Stefan

 unpack-trees.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 8333da2cc9..0d82452f7f 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -167,7 +167,7 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
msgs[ERROR_WOULD_LOSE_ORPHANED_REMOVED] =
_("The following working tree files would be removed by sparse 
checkout update:\n%s");
msgs[ERROR_WOULD_LOSE_SUBMODULE] =
-   _("Submodule '%s' cannot checkout new HEAD");
+   _("Cannot update submodule:\n%s")
 
opts->show_all_errors = 1;
/* rejected paths may not have a static buffer */
-- 
2.12.1.442.ge9452a8fbc



Re: [PATCH v2 16/21] Make sha1_array_append take a struct object_id *

2017-03-29 Thread brian m. carlson
On Wed, Mar 29, 2017 at 08:14:19AM -0700, Junio C Hamano wrote:
> This change is about dropping the need for ".hash", and I think a
> faithful, boring and mechanical conversion that tries to preserve
> the intent of the original author would be more appropriate.  It is
> entirely possible that some places where the original said E2[E3]
> were easier to understand if it were *(E2 + E3), thus we may want to
> further rewrite such a place to (E2 + E3) instead of [E3] after
> the mechanical conversion.

You've convinced me that [E3] is a better choice in this situation,
so I'll reroll and fix the other issues mentioned.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


[PATCH 1/2] short status: improve reporting for submodule changes

2017-03-29 Thread Stefan Beller
If I add an untracked file to a submodule or modify a tracked file,
currently "git status --short" treats the change in the same way as
changes to the current HEAD of the submodule:

$ git clone --quiet --recurse-submodules 
https://gerrit.googlesource.com/gerrit
$ echo hello >gerrit/plugins/replication/stray-file
$ sed -i -e 's/.*//' gerrit/plugins/replication/.mailmap
$ git -C gerrit status --short
 M plugins/replication

This is by analogy with ordinary files, where "M" represents a change
that has not been added yet to the index.  But this change cannot be
added to the index without entering the submodule, "git add"-ing it,
and running "git commit", so the analogy is counterproductive.

Introduce new status letters " ?" and " m" for this.  These are similar
to the existing "??" and " M" but mean that the submodule (not the
parent project) has new untracked files and modified files, respectively.
The user can use "git add" and "git commit" from within the submodule to
add them.

Changes to the submodule's HEAD commit can be recorded in the index with
a plain "git add -u" and are shown with " M", like today.

To avoid excessive clutter, show at most one of " ?", " m", and " M" for
the submodule.  They represent increasing levels of change --- the last
one that applies is shown (e.g., " m" if there are both modified files
and untracked files in the submodule, or " M" if the submodule's HEAD
has been modified and it has untracked files).

While making these changes, we need to make sure to not break porcelain
level 1, which shares code with "status --short".  We only change
"git status --short".

Non-short "git status" and "git status --porcelain=2" already handle
these cases by showing more detail:

$ git -C gerrit status --porcelain=2
1 .M S.MU 16 16 16 305c864db28eb0c77c8499bc04c87de3f849cf3c 
305c864db28eb0c77c8499bc04c87de3f849cf3c plugins/replication
$ git -C gerrit status
[...]
modified:   plugins/replication (modified content, untracked content)

Scripts caring about these distinctions should use --porcelain=2.

Helped-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
Reviewed-by: Jonathan Nieder 
---
 Documentation/git-status.txt |  11 
 t/t3600-rm.sh|  18 +--
 t/t7506-status-submodule.sh  | 117 +++
 wt-status.c  |  17 ++-
 4 files changed, 156 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index ba873657cf..67f1a910f3 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -181,6 +181,15 @@ in which case `XY` are `!!`.
 !   !ignored
 -
 
+Submodules have more state and instead report
+   Mthe submodule has a different HEAD than
+recorded in the index
+   mthe submodule has modified content
+   ?the submodule has untracked files
+since modified content or untracked files in a submodule cannot be added
+via `git add` in the superproject to prepare a commit.
+
+
 If -b is used the short-format status is preceded by a line
 
 ## branchname tracking info
@@ -210,6 +219,8 @@ field from the first filename).  Third, filenames 
containing special
 characters are not specially formatted; no quoting or
 backslash-escaping is performed.
 
+Any submodule changes are reported as modified `M` instead of `m` or single 
`?`.
+
 Porcelain Format Version 2
 ~~
 
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index 5aa6db584c..a6e5c5bd56 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -268,6 +268,14 @@ cat >expect.modified actual &&
@@ -436,7 +444,7 @@ test_expect_success 'rm of a populated submodule with 
untracked files fails unle
test -d submod &&
test -f submod/.git &&
git status -s -uno --ignore-submodules=none >actual &&
-   test_cmp expect.modified actual &&
+   test_cmp expect.modified_untracked actual &&
git rm -f submod &&
test ! -d submod &&
git status -s -uno --ignore-submodules=none >actual &&
@@ -621,7 +629,7 @@ test_expect_success 'rm of a populated nested submodule 
with different nested HE
test -d submod &&
test -f submod/.git &&
git status -s -uno --ignore-submodules=none >actual &&
-   test_cmp expect.modified actual &&
+   test_cmp expect.modified_inside actual &&
git rm -f submod &&
test ! -d submod &&
git status -s -uno --ignore-submodules=none 

[PATCH 2/2] submodule.c: correctly handle nested submodules in is_submodule_modified

2017-03-29 Thread Stefan Beller
Suppose I have a superproject 'super', with two submodules 'super/sub'
and 'super/sub1'. 'super/sub' itself contains a submodule
'super/sub/subsub'. Now suppose I run, from within 'super':

echo hi >sub/subsub/stray-file
echo hi >sub1/stray-file

Currently we get would see the following output in git-status:

git status --short
 m sub
 ? sub1

With this patch applied, the untracked file in the nested submodule is
displayed as an untracked file on the 'super' level as well.

git status --short
 ? sub
 ? sub1

This doesn't change the output of 'git status --porcelain=1' for nested
submodules, because its output is always ' M' for either untracked files
or local modifications no matter the nesting level of the submodule.

'git status --porcelain=2' is affected by this change in a nested
submodule, though. Without this patch it would report the direct submodule
as modified and having no untracked files. With this patch it would report
untracked files. Chalk this up as a bug fix.

This bug fix also affects the default output (non-short, non-porcelain)
of git-status, which is not tested here.

Signed-off-by: Stefan Beller 
---
 Documentation/git-status.txt |  2 ++
 submodule.c  | 21 +++--
 t/t3600-rm.sh|  2 +-
 t/t7506-status-submodule.sh  |  4 ++--
 4 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index 67f1a910f3..d70abc6afe 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -189,6 +189,8 @@ Submodules have more state and instead report
 since modified content or untracked files in a submodule cannot be added
 via `git add` in the superproject to prepare a commit.
 
+'m' and '?' are applied recursively. For example if a nested submodule
+in a submodule contains an untracked file, this is reported as '?' as well.
 
 If -b is used the short-format status is preceded by a line
 
diff --git a/submodule.c b/submodule.c
index fa21c7bb72..3da65100e3 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1078,8 +1078,25 @@ unsigned is_submodule_modified(const char *path, int 
ignore_untracked)
/* regular untracked files */
if (buf.buf[0] == '?')
dirty_submodule |= DIRTY_SUBMODULE_UNTRACKED;
-   else
-   dirty_submodule |= DIRTY_SUBMODULE_MODIFIED;
+
+   if (buf.buf[0] == 'u' ||
+   buf.buf[0] == '1' ||
+   buf.buf[0] == '2') {
+   /* T = line type, XY = status,  = submodule state */
+   if (buf.len < strlen("T XY "))
+   die("BUG: invalid status --porcelain=2 line %s",
+   buf.buf);
+
+   if (buf.buf[5] == 'S' && buf.buf[8] == 'U')
+   /* nested untracked file */
+   dirty_submodule |= DIRTY_SUBMODULE_UNTRACKED;
+
+   if (buf.buf[0] == 'u' ||
+   buf.buf[0] == '2' ||
+   memcmp(buf.buf + 5, "S..U", 4))
+   /* other change */
+   dirty_submodule |= DIRTY_SUBMODULE_MODIFIED;
+   }
 
if ((dirty_submodule & DIRTY_SUBMODULE_MODIFIED) &&
((dirty_submodule & DIRTY_SUBMODULE_UNTRACKED) ||
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index a6e5c5bd56..b58793448b 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -659,7 +659,7 @@ test_expect_success 'rm of a populated nested submodule 
with nested untracked fi
test -d submod &&
test -f submod/.git &&
git status -s -uno --ignore-submodules=none >actual &&
-   test_cmp expect.modified_inside actual &&
+   test_cmp expect.modified_untracked actual &&
git rm -f submod &&
test ! -d submod &&
git status -s -uno --ignore-submodules=none >actual &&
diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
index 1fa2ff2909..055c90736e 100755
--- a/t/t7506-status-submodule.sh
+++ b/t/t7506-status-submodule.sh
@@ -356,7 +356,7 @@ test_expect_success 'status with untracked file in nested 
submodule (porcelain=2
git -C super status --porcelain=2 >output &&
sanitize_output output &&
diff output - <<-\EOF
-   1 .M S.M. 16 16 16 HASH HASH sub1
+   1 .M S..U 16 16 16 HASH HASH sub1
1 .M S..U 16 16 16 HASH HASH sub2
1 .M S..U 16 16 16 HASH HASH sub3
EOF
@@ -365,7 +365,7 @@ test_expect_success 'status with untracked file in nested 
submodule (porcelain=2
 test_expect_success 'status with untracked file in nested submodule (short)' '
git -C super status --short >output &&
diff output - <<-\EOF
-m sub1
+? sub1
 ? sub2
   

[PATCHv9 (6,7)/7] short status: improve reporting for submodule changes

2017-03-29 Thread Stefan Beller
v9:
* This is a resend of the last two patches, i.e. these two patches apply
  at 5c896f7c3ec (origin/sb/submodule-short-status^^)
* below is a diff of this patch series against origin/sb/submodule-short-status
* better tests, refined documentation, thanks for the review, Jonathan!

Thanks,
Stefan

previous work:
https://public-inbox.org/git/20170328230938.9887-1-sbel...@google.com/

Stefan Beller (2):
  short status: improve reporting for submodule changes
  submodule.c: correctly handle nested submodules in
is_submodule_modified

 Documentation/git-status.txt |  13 +
 submodule.c  |  21 +++-
 t/t3600-rm.sh|  18 +--
 t/t7506-status-submodule.sh  | 117 +++
 wt-status.c  |  17 ++-
 5 files changed, 177 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index 01b457c322..d70abc6afe 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -186,7 +186,11 @@ Submodules have more state and instead report
 recorded in the index
mthe submodule has modified content
?the submodule has untracked files
+since modified content or untracked files in a submodule cannot be added
+via `git add` in the superproject to prepare a commit.
 
+'m' and '?' are applied recursively. For example if a nested submodule
+in a submodule contains an untracked file, this is reported as '?' as well.
 
 If -b is used the short-format status is preceded by a line
 
diff --git a/submodule.c b/submodule.c
index 730cc9513a..3da65100e3 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1082,20 +1082,18 @@ unsigned is_submodule_modified(const char *path, int 
ignore_untracked)
if (buf.buf[0] == 'u' ||
buf.buf[0] == '1' ||
buf.buf[0] == '2') {
-   /*
-* T XY :
-* T = line type, XY = status,  = submodule state
-*/
-   if (buf.len < 1 + 1 + 2 + 1 + 4)
+   /* T = line type, XY = status,  = submodule state */
+   if (buf.len < strlen("T XY "))
die("BUG: invalid status --porcelain=2 line %s",
buf.buf);
 
-   /* regular unmerged and renamed files */
if (buf.buf[5] == 'S' && buf.buf[8] == 'U')
/* nested untracked file */
dirty_submodule |= DIRTY_SUBMODULE_UNTRACKED;
 
-   if (memcmp(buf.buf + 5, "S..U", 4))
+   if (buf.buf[0] == 'u' ||
+   buf.buf[0] == '2' ||
+   memcmp(buf.buf + 5, "S..U", 4))
/* other change */
dirty_submodule |= DIRTY_SUBMODULE_MODIFIED;
}
diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
index ab822c79e6..055c90736e 100755
--- a/t/t7506-status-submodule.sh
+++ b/t/t7506-status-submodule.sh
@@ -17,6 +17,12 @@ test_create_repo_with_commit () {
)
 }
 
+sanitize_output () {
+   sed -e "s/$_x40/HASH/" -e "s/$_x40/HASH/" output >output2 &&
+   mv output2 output
+}
+
+
 test_expect_success 'setup' '
test_create_repo_with_commit sub &&
echo output > .gitignore &&
@@ -311,6 +317,10 @@ test_expect_success 'diff --submodule with merge conflict 
in .gitmodules' '
test_cmp diff_submodule_actual diff_submodule_expect
 '
 
+# We'll setup different cases for further testing:
+# sub1 will contain a nested submodule,
+# sub2 will have an untracked file
+# sub3 will have an untracked repository
 test_expect_success 'setup superproject with untracked file in nested 
submodule' '
(
cd super &&
@@ -318,6 +328,7 @@ test_expect_success 'setup superproject with untracked file 
in nested submodule'
rm .gitmodules &&
git submodule add -f ./sub1 &&
git submodule add -f ./sub2 &&
+   git submodule add -f ./sub1 sub3 &&
git commit -a -m "messy merge in superproject" &&
(
cd sub1 &&
@@ -327,13 +338,27 @@ test_expect_success 'setup superproject with untracked 
file in nested submodule'
git add sub1 &&
git commit -a -m "update sub1 to contain nested sub"
) &&
-   echo untracked >super/sub1/sub2/untracked
+   echo content >super/sub1/sub2/file &&
+   echo content >super/sub2/file &&
+   git -C super/sub3 clone ../../sub2 untracked_repository
 '
 
 test_expect_success 'status with untracked file in nested submodule 
(porcelain)' '
git -C super status --porcelain >output &&
diff output - <<-\EOF
 M 

Re: [PATCH v3] perl: regenerate perl.mak if perl -V changes

2017-03-29 Thread Jeffrey Walton
>>> Now the logic added in commit ee9be06770 ("perl: detect new files in
>>> MakeMaker builds", 2012-07-27) is extended to regenerate
>>> perl/perl.mak if there's any change to "perl -V".
>>
>> Nice. This fix is way simpler than I feared.
>>
>>> This will in some cases redundantly trigger perl/perl.mak to be
>>> re-made, e.g. if @INC is modified in ways the build process doesn't
>>> care about through sitecustomize.pl, but the common case is that we
>>> just do the right thing and re-generate perl/perl.mak when needed.
>>
>> I think that's fine. There's a related bug that the generation of
>> perl/perl.mak via recursive-make is sometimes racy. So that _might_
>> trigger more often as a result of this, but I think the solution is to
>> fix that race, not try to pretend it won't happen. :)
>
> We'll also redundantly trigger if you upgrade to a minor new perl
> version, but I think that's squarely in "who cares" territory. This'll
> only impact people working on git, and *occasionally* they might get a
> 100 ms hit when running make, as opposed to a cryptic error where
> they'll likely stare at it for a bit before running "make clean".

+1, I don't mind extra config or build times as long as things "just
work" for the common case.

I was trying to figure out the use case that I was seeing. I was
envisioning someone with Perl 4 in /usr/local who complained it would
break some one-off setup. In the common case, the guy running Perl 4
should do the extra work, not the majority of users operating under
the common case.

Jeff


Re: What's cooking in git.git (Mar 2017, #12; Wed, 29)

2017-03-29 Thread Junio C Hamano
Junio C Hamano  writes:

> * mg/name-rev-debug (2017-03-29) 3 commits
>  - name-rev: provide debug output
>  - name-rev: favor describing with tags and use committer date to tiebreak
>  - name-rev: refactor logic to see if a new candidate is a better name
>  (this branch uses mg/describe-debug-l10n.)
>
>  "git describe --debug --contains" did not add any meaningful
>  information, even though without "--contains" it did.
>
>  Waiting for jc/p4-current-branch-fix to settle.
>  This replaces jc/name-rev.

Regarding this "p4 fix", which is this one:

> * jc/p4-current-branch-fix (2017-03-27) 2 commits
>  - DONTMERGE git-p4: "name-rev HEAD" is not a way to find the current branch
>  - git-p4: add failing test for name-rev rather than symbolic-ref
>
>  "git p4" used "name-rev HEAD" when it wants to learn what branch is
>  checked out; it should use "symbolic-ref HEAD".
>
>  The tip one (i.e. real fix) needs to be redone by somebody
>  competent with Python and git-p4 codebase.

does anybody want to volunteer rewriting the "DONTMERGE" one, which
is shown below, in a better way?  Two things that I find disturbing
are

 * there is another "git symbolic-ref HEAD" invocation immediately
   before the context we see in the patch, which feels redundant;

 * the [11:] is to strip the leading "refs/heads/"; there should be
   a more Pythoninc way (which may even be used in existing code in
   git-p4.py) to do so.

If I hear from nobody in a few days, I'll have to merge the crappy
one below to 'next' to allow Michael's "name-rev" updates to move
forward.

Thanks.

-- >8 --
From: Junio C Hamano 
Date: Thu, 16 Mar 2017 22:56:22 -0700
Subject: [PATCH] git-p4: "name-rev HEAD" is not a way to find the current branch

This function seems to want to learn which branch we are on, and
running "name-rev HEAD" is *NEVER* the right way to do so.  If you
are on branch B which happens to point at the same commit as branch
A, "name-rev HEAD" can say either A or B (and it is likely it would
say A simply because it sorts earlier, and the logic seems to favor
the one that was discovered earlier when all else being equal).  If
you are on branch B which happens to be pointed by an annotated tag
T, "name-rev HEAD" will say T, not B.

Use "symbolic-ref HEAD" instead.

Signed-off-by: Junio C Hamano 
---
 git-p4.py| 2 +-
 t/t9807-git-p4-submit.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 0cfc8668d6..6a448a573b 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -549,7 +549,7 @@ def currentGitBranch():
 # on a detached head
 return None
 else:
-return read_pipe(["git", "name-rev", "HEAD"]).split(" ")[1].strip()
+return read_pipe(["git", "symbolic-ref", "HEAD"]).strip()[11:]
 
 def isValidGitDir(path):
 if (os.path.exists(path + "/HEAD")
diff --git a/t/t9807-git-p4-submit.sh b/t/t9807-git-p4-submit.sh
index 4e625fad07..643258500b 100755
--- a/t/t9807-git-p4-submit.sh
+++ b/t/t9807-git-p4-submit.sh
@@ -139,7 +139,7 @@ test_expect_success 'submit with master branch name from 
argv' '
)
 '
 
-test_expect_failure 'allow submit from branch with same revision but different 
name' '
+test_expect_success 'allow submit from branch with same revision but different 
name' '
test_when_finished cleanup_git &&
git p4 clone --dest="$git" //depot &&
(
-- 
2.12.2-510-ge1104a5ee5



What's cooking in git.git (Mar 2017, #12; Wed, 29)

2017-03-29 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/branch-list-doc (2017-03-24) 2 commits
  (merged to 'next' on 2017-03-27 at 4bb47907ee)
 + branch doc: update description for `--list`
 + branch doc: change `git branch ` to use ``

 Doc update.


* bw/grep-recurse-submodules (2017-03-18) 2 commits
  (merged to 'next' on 2017-03-21 at a57e2f0129)
 + grep: fix builds with with no thread support
 + grep: set default output method

 Build fix for NO_PTHREADS build.


* jh/memihash-opt (2017-03-24) 8 commits
  (merged to 'next' on 2017-03-24 at f1aa0c4d37)
 + name-hash: add test-lazy-init-name-hash to .gitignore
 + name-hash: add perf test for lazy_init_name_hash
 + name-hash: add test-lazy-init-name-hash
 + name-hash: perf improvement for lazy_init_name_hash
 + hashmap: document memihash_cont, hashmap_disallow_rehash api
 + hashmap: add disallow_rehash setting
 + hashmap: allow memihash computation to be continued
 + name-hash: specify initial size for istate.dir_hash table

 The name-hash used for detecting paths that are different only in
 cases (which matter on case insensitive filesystems) has been
 optimized to take advantage of multi-threading when it makes sense.


* jk/fast-import-cleanup (2017-03-24) 4 commits
  (merged to 'next' on 2017-03-27 at 9f6058007f)
 + pack.h: define largest possible encoded object size
 + encode_in_pack_object_header: respect output buffer length
 + fast-import: use xsnprintf for formatting headers
 + fast-import: use xsnprintf for writing sha1s

 Code clean-up.


* jk/pager-in-use (2017-03-24) 1 commit
  (merged to 'next' on 2017-03-27 at 513f007025)
 + pager_in_use: use git_env_bool()

 Code clean-up.


* jk/sha1dc (2017-03-26) 1 commit
  (merged to 'next' on 2017-03-27 at 91bf9f06b4)
 + sha1dc: avoid CPP macro collisions

 sha1dc/sha1.c wanted to check the endianness of the target platform
 at compilation time and used a CPP macro with a rather overly
 generic name, "BIGENDIAN", to pass the result of the check around
 in the file.  It wasn't prepared for the same macro set to 0
 (false) by the platform to signal that the target is _not_ a big
 endian box, and assumed that the endianness detection logic it has
 alone would be the one that is setting the macro, resulting in a
 breakage on Windows.  This has been fixed by using a bit less
 generic name for the same purpose.


* sb/checkout-recurse-submodules (2017-03-16) 19 commits
  (merged to 'next' on 2017-03-22 at 48b49d572c)
 + builtin/read-tree: add --recurse-submodules switch
 + builtin/checkout: add --recurse-submodules switch
 + entry.c: create submodules when interesting
 + unpack-trees: check if we can perform the operation for submodules
 + unpack-trees: pass old oid to verify_clean_submodule
 + update submodules: add submodule_move_head
 + submodule.c: get_super_prefix_or_empty
 + update submodules: move up prepare_submodule_repo_env
 + submodules: introduce check to see whether to touch a submodule
 + update submodules: add a config option to determine if submodules are updated
 + update submodules: add submodule config parsing
 + make is_submodule_populated gently
 + lib-submodule-update.sh: define tests for recursing into submodules
 + lib-submodule-update.sh: replace sha1 by hash
 + lib-submodule-update: teach test_submodule_content the -C  flag
 + lib-submodule-update.sh: do not use ./. as submodule remote
 + lib-submodule-update.sh: reorder create_lib_submodule_repo
 + submodule--helper.c: remove duplicate code
 + connect_work_tree_and_git_dir: safely create leading directories

 "git checkout" is taught the "--recurse-submodules" option.


* sg/skip-prefix-in-prettify-refname (2017-03-23) 1 commit
  (merged to 'next' on 2017-03-27 at f7d0c115f9)
 + refs.c: use skip_prefix() in prettify_refname()

 Code cleanup.


* tg/stash-push-fixup (2017-03-22) 3 commits
  (merged to 'next' on 2017-03-24 at e6b9e04213)
 + stash: keep untracked files intact in stash -k
 + stash: pass the pathspec argument to git reset
 + stash: don't show internal implementation details

 Recent enhancement to "git stash push" command to support pathspec
 to allow only a subset of working tree changes to be stashed away
 was found to be too chatty and exposed the internal implementation
 detail (e.g. when it uses reset to match the index to HEAD before
 doing other things, output from reset seeped out).  These, and
 other chattyness has been fixed.

--
[New Topics]

* bc/push-cert-receive-fix (2017-03-28) 1 commit
 - 

Re: [PATCH 2/2] submodule.c: correctly handle nested submodules in is_submodule_modified

2017-03-29 Thread Stefan Beller
> sanity check: What does this do for a "2" line indicating a sub-submodule
> that has been renamed that contains an untracked file?  Do we need to
> rely on some other indication to show this as a change?

Oh. :(

In case of 'u' and '2' we need to set DIRTY_SUBMODULE_MODIFIED
additionally. will fix in a reroll.

>
> Enumerating some more cases, since I keep finding myself getting lost:
>
>  - if the HEAD commit of "sub" changes, we show this as " M sub".
>What should we show if the HEAD commit of "sub/subsub" changes?
>I think this should be " m".
>
>  - if "sub" is renamed, we show this as "R  sub -> newname".
>What should we show if "sub/subsub" is renamed?  It is tempting
>to show this as " m".
>
>  - if "sub" is deleted, we show this as "D  sub". What should we
>show if "sub/subsub" is deleted? I think this is " m".

All these cases are ' m', which I agree with, as it is a "modification
that cannot be git-add'ed in the superproject".

We might be inclined to later come up with  ' d' for a deleted nested
submodule, but I do not think it is worth the effort.

Thanks,
Stefan


Re: [ANNOUNCE] Git for Windows 2.12.2

2017-03-29 Thread Johannes Schindelin
Hi Andrew,

On Wed, 29 Mar 2017, Andrew Witte wrote:

> The git 2.12 GCM for Windows is broken. I tried doing a git clone and
> got "*remote: HTTP Basic: Access denied*".
> I downgraded to git 2.11.0 and everything worked fine.

Could you test v2.12.1, too, and open a bug report at:
https://github.com/git-for-windows/git/issues/new ?

I am particularly interested in any details you can share that would help
other developers like me to reproduce the issue.

Thank you,
Johannes


Re: [PATCH 2/2] unpack-trees.c: align submodule error message to the other error messages

2017-03-29 Thread Junio C Hamano
Jonathan Nieder  writes:

> Stefan Beller wrote:
>
>> As the place holder in the error message is for multiple submodules,
>> we don't want to encapsulate the string place holder in single quotes.
>
> Makes sense.
>
>> Signed-off-by: Stefan Beller 
>> ---
>>  unpack-trees.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/unpack-trees.c b/unpack-trees.c
>> index 8333da2cc9..9f386cc174 100644
>> --- a/unpack-trees.c
>> +++ b/unpack-trees.c
>> @@ -167,7 +167,7 @@ void setup_unpack_trees_porcelain(struct 
>> unpack_trees_options *opts,
>>  msgs[ERROR_WOULD_LOSE_ORPHANED_REMOVED] =
>>  _("The following working tree files would be removed by sparse 
>> checkout update:\n%s");
>>  msgs[ERROR_WOULD_LOSE_SUBMODULE] =
>> -_("Submodule '%s' cannot checkout new HEAD");
>> +_("The following submodules cannot checkout a new HEAD:\n%s");
>
> Nitpicking about wording: unless the user has adopted a strongly
> object-oriented point of view, it is Git that cannot checkout a new
> HEAD, not the submodule.
>
> How about:
>
>   _("Cannot update submodule:\n%s")
>
> That's vague, but if I understand correctly the way this error gets
> used is equally vague --- i.e., a clearer message would involve
> finer-grained error codes.

Makes sense to me.

Thanks for helping.


Re: [PATCH] Correct compile errors when DEBUG_BISECT=1 after supporting other hash algorithms

2017-03-29 Thread Stephan Beyer
Hi,

On 03/29/2017 10:02 PM, Alex Hoffman wrote:
> Any news about this patch?

Haha nice, your initial patch is the same as mine (but mine was part of
a bigger patch series and the v3 is probably going to have one less commit):
https://public-inbox.org/git/1456452282-10325-4-git-send-email-s-be...@gmx.net/

>> for (pp = commit->parents; pp; pp = pp->next)
>> fprintf(stderr, " %.*s", 8,
>> -   sha1_to_hex(pp->item->object.sha1));
>> +   oid_to_hex(>item->object.oid));

I guess your change in continued indentation is intentional, but is it
just my mail client or do you f*ck up tabs? (I haven't tried to apply
the patch but it looks like it is not possible due to broken tabs.)

Stephan


Re: [PATCH 1/2] entry.c: submodule recursing: respect force flag correctly

2017-03-29 Thread Junio C Hamano
Jonathan Nieder  writes:

> Stefan Beller wrote:
>
>> In case of a non-forced worktree update, the submodule movement is tested
>> in a dry run first, such that it doesn't matter if the actual update is
>> done via the force flag. However for correctness, we want to give the
>> flag is specified by the user.
>
> "for correctness" means "to avoid races"?

Sorry, but neither explanation makes much sense to me.

The codepath the patch touches says "if the submodule is not
populated, then checkout the submodule by switching from NULL
(nothing checked out) to the commit bound to the index of the
superproject; otherwise, checkout the submodule by switching from
HEAD (what is currently checked out) to the commit in the index".

Where does that "tested in a dry run first" come into play?
Whatever code calls checkout_entry(), does it call it twice, first
with a "--dry-run" option and then without one?  How does this
codepath respond differently to these two invocations, and how does
this change affect the way these two invocations behave?



>
>> Signed-off-by: Stefan Beller 
>> ---
>>  entry.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/entry.c b/entry.c
>> index d2b512da90..645121f828 100644
>> --- a/entry.c
>> +++ b/entry.c
>> @@ -287,7 +287,7 @@ int checkout_entry(struct cache_entry *ce,
>>  } else
>>  return submodule_move_head(ce->name,
>>  "HEAD", oid_to_hex(>oid),
>> -SUBMODULE_MOVE_HEAD_FORCE);
>> +state->force ? 
>> SUBMODULE_MOVE_HEAD_FORCE : 0);
>
> Looks like a good change.
>
> This moves past the 80-column margin.  I wish there were a tool like
> gofmt or clang-format that would take care of formatting for us.
>
> This isn't the only place SUBMODULE_MOVE_HEAD_FORCE is used in the
> file.  Do they need the same treatment?
>
> Thanks,
> Jonathan


Re: [PATCH v3] perl: regenerate perl.mak if perl -V changes

2017-03-29 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> We'll also redundantly trigger if you upgrade to a minor new perl
> version, but I think that's squarely in "who cares" territory.
> ...
> But I think overall leaning on the side of busting the cache more
> often to avoid cryptic errors is the right choice, and we should use
> "perl -V".

I'd throw it into "better safe than sorry" category.  I think we all
like the approach this patch takes.  Let's queue it and merge it
down soonish.

Thanks.



Re: [PATCH v3] perl: regenerate perl.mak if perl -V changes

2017-03-29 Thread Ævar Arnfjörð Bjarmason
On Wed, Mar 29, 2017 at 8:12 PM, Jeff King  wrote:
> On Wed, Mar 29, 2017 at 01:57:03PM +, Ævar Arnfjörð Bjarmason wrote:
>
>> Change the perl/perl.mak build process so that the file is regenerated
>> if the output of "perl -V" changes.
>>
>> Before this change updating e.g. /usr/bin/perl to a new major version
>> would cause the next "make" command to fail, since perl.mak has
>> hardcoded paths to perl library paths retrieved from its first run.
>
> This is one of those things that has been bugging me for years, but it
> comes up so rarely that I have never dug into it.

Glad to help. I've only run into this once a couple of days ago, made
a mental note to fix it, and then I saw that thread...

>> Now the logic added in commit ee9be06770 ("perl: detect new files in
>> MakeMaker builds", 2012-07-27) is extended to regenerate
>> perl/perl.mak if there's any change to "perl -V".
>
> Nice. This fix is way simpler than I feared.
>
>> This will in some cases redundantly trigger perl/perl.mak to be
>> re-made, e.g. if @INC is modified in ways the build process doesn't
>> care about through sitecustomize.pl, but the common case is that we
>> just do the right thing and re-generate perl/perl.mak when needed.
>
> I think that's fine. There's a related bug that the generation of
> perl/perl.mak via recursive-make is sometimes racy. So that _might_
> trigger more often as a result of this, but I think the solution is to
> fix that race, not try to pretend it won't happen. :)

We'll also redundantly trigger if you upgrade to a minor new perl
version, but I think that's squarely in "who cares" territory. This'll
only impact people working on git, and *occasionally* they might get a
100 ms hit when running make, as opposed to a cryptic error where
they'll likely stare at it for a bit before running "make clean".

If we were being more pedantic we could only bust the cache on major
perl version upgrades:

perl -e 'print substr($], 0, 5), "\n"' >>PM.stamp+

Or use Config.pm:

perl -MConfig -e 'print @Config{qw(api_revision api_version)},
"\n"' >>PM.stamp+

But I think overall leaning on the side of busting the cache more
often to avoid cryptic errors is the right choice, and we should use
"perl -V".


Re: [PATCH v3 7/8] sub-process: move sub-process functions into separate files

2017-03-29 Thread Junio C Hamano
Ben Peart  writes:

> +Types
> +-
> +
> +'int(*subprocess_start_fn)(struct subprocess_entry *entry)'::
> +
> + User-supplied function to initialize the sub-process.  This is
> + typically used to negoiate the interface version and capabilities.
> +
> +
> +Functions
> +-
> +
> +`subprocess_start`::
> +
> + Start a subprocess and add it to the subprocess hashmap.
> +
> +`subprocess_stop`::
> +
> + Kill a subprocess and remove it from the subprocess hashmap.
> +
> +`subprocess_find_entry`::
> +
> + Find a subprocess in the subprocess hashmap.
> +
> +`subprocess_get_child_process`::
> +
> + Get the underlying `struct child_process` from a subprocess.
> +
> +`subprocess_read_status`::
> +
> + Helper function to read packets looking for the last "status="
> + key/value pair.

OK.

> diff --git a/sub-process.c b/sub-process.c
> new file mode 100644
> index 00..2c4d27c193
> --- /dev/null
> +++ b/sub-process.c
> @@ -0,0 +1,116 @@
> +/*
> + * Generic implementation of background process infrastructure.
> + */
> +#include "sub-process.h"
> +#include "sigchain.h"
> +#include "pkt-line.h"
> + ...
> +void subprocess_exit_handler(struct child_process *process)
> +{

This is not only undocumented in the above, but it does not seem to
be necessary to be a public function.  The only thing that uses this
is subprocess_start(), which is in this file.  Perhaps make it static?


Re: [PATCH] Correct compile errors when DEBUG_BISECT=1 after supporting other hash algorithms

2017-03-29 Thread Alex Hoffman
Any news about this patch?

2017-03-21 22:24 GMT+01:00 Alex Hoffman :
> Hi, Brian,
>
> We definitely prefer the wrapper function oid_to_hex() to
> sha1_to_hex(). Thanks for feedback.
> Below is the updated patch:
>
> ---
>  bisect.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/bisect.c b/bisect.c
> index 30808cadf..7b65acbcd 100644
> --- a/bisect.c
> +++ b/bisect.c
> @@ -131,7 +131,7 @@ static void show_list(const char *debug, int
> counted, int nr,
> unsigned flags = commit->object.flags;
> enum object_type type;
> unsigned long size;
> -   char *buf = read_sha1_file(commit->object.sha1, , );
> +   char *buf = read_sha1_file(commit->object.oid.hash,
> , );
> const char *subject_start;
> int subject_len;
>
> @@ -143,10 +143,10 @@ static void show_list(const char *debug, int
> counted, int nr,
> fprintf(stderr, "%3d", weight(p));
> else
> fprintf(stderr, "---");
> -   fprintf(stderr, " %.*s", 8, sha1_to_hex(commit->object.sha1));
> +   fprintf(stderr, " %.*s", 8, oid_to_hex(>object.oid));
> for (pp = commit->parents; pp; pp = pp->next)
> fprintf(stderr, " %.*s", 8,
> -   sha1_to_hex(pp->item->object.sha1));
> +   oid_to_hex(>item->object.oid));
>
> subject_len = find_commit_subject(buf, _start);
> if (subject_len)
> --
> 2.12.0.400.g54ad2d445.dirty


Re: [PATCH 0/7] PREVIEW: Introduce DC_AND_OPENSSL_SHA1 make flag

2017-03-29 Thread Johannes Schindelin
Hi Junio,

On Fri, 24 Mar 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > - the most important part will be the patch turning core.enableSHA1DC
> > into a tristate: "externalOnly" or "smart" or "auto" or something
> > indicating that it switches on collision detection only for commands
> > that accept objects from an outside source into the local repository,
> > such as fetch, fast-import, etc
> 
> There are different uses of SHA-1 hashing in Git, and I do agree
> that depending on the use, some of them do not need the overhead for
> the collision-attack detection.

Indeed.

I guess I should have clarified very clearly what I intended to accomplish
with this preview.

Let me first summarize the background:

- Git originally used SHA-1 as a convenient catch-all hashing algorithm,
  with the security of the hash merely being a nice afterthought.
  Unfortunately, SHA-1 was hardcoded. Mistakes were made. They always are.

- After the SHAttered blog post became public, Linus first made the case
  that it matters not all that much: the development of the Linux kernel
  is based on trust, and nobody would pull from a person they do not trust.
  This approach does obviously not extend to most other projects.

- By switching the default to DC_SHA1 already in `master`, you now took
  the exact opposite position: it *always* matters, even when you trust
  people, and the 6x slowdown demonstrated by my perf test is something that
  everybody needs to accept, even if it is spent for no benefit in return.

Between these two extremes ("collision attacks do not matter if you run
your project based on trust" vs "we always mistrust everybody, including
the user's own source code that they stage for commit"), I think there are
many shades of green, and I think it would be delusional to believe that
we can predict the trust model for each and every Git user [*1*], baking
it into a single compile time setting.

That is why I wanted to implement a tristate config so that users can
adapt Git to their particular scenario. That way, maintainers of
precompiled Git packages do not have to dictate to Git users what trust
model they should use.

One scenario seems to be common, and it is certainly one that I have a
direct interest in supporting: inside a company, where the server as well
as the developers are implicitly trusted not to fiddle with collision
attacks (because there are much easier ways to hide malicious code, let's
be frank). And in this scenario, slowing down the SHA-1 computation half
an order of magnitude by trying to detect collision attacks is simply
unacceptable, because there is no benefit, only cost to it.

An even more common scenario is when a developer works on a local
repository, adds a couple of files, then runs rebase, then merges, etc. In
*none* of these cases does the developer distrust any of the objects
flying about. Like above, forcing the developer to accept half an order of
magnitude slow down of the SHA-1 computation is something I would consider
disrespectful of those developers' time. Note: in this scenario, any
object coming from elsewhere would most likely be subject to the collision
detection, as the developer may not trust anybody but themselves.

In other words: I disagree that, say, `git add` should use the collision
detecting SHA-1.

I also suspect that you had a much more elaborate (maybe even fragile)
strategy than mine in mind when you tried to determine which code paths
would need collision detection and which ones would not: we have *no*
context in the object-oriented sense whenever we call the object hashing
functions. Meaning that you would have to introduce such a context, or to
add some sort of thread local state. I have to admit that I do not like
either way.

The approach I chose instead was to make the switch global, per command.
Obviously, the next step is to identify the Git commands which accept
objects from external sources (`clone`, `fetch`, `fast-import` and all the
remote helpers come to mind) and let them set a global flag before asking
git_default_config() to parse the core.enableSHA1DC setting, so that the
special value `external` could trigger the collision detecting code for
those, and only those, commands. That would still err on the safe side if
these commands are used to hash objects originating from the same machine,
but that is a price I would be willing to pay for the simplicity of this
approach.

Does my explanation manage to illustrate in a plausible way why I chose
the approach that I did?

Ciao,
Johannes

Footnote *1*: It is equally delusional, of course, to claim that every Git
user can and should configure and compile Git for themselves.


Re: [PATCH v3 0/8] refactor the filter process code into a reusable module

2017-03-29 Thread Junio C Hamano
Ben Peart  writes:

> Ben Peart (8):
>   pkt-line: add packet_writel() and packet_read_line_gently()
>   convert: Update convert to use new packet_writel() function
>   convert: Split start_multi_file_filter into two separate functions
>   convert: Separate generic structures and variables from the filter
> specific ones
>   convert: Update generic functions to only use generic data structures
>   convert: rename reusable sub-process functions
>   sub-process: move sub-process functions into separate files
>   convert: Update subprocess_read_status to not die on EOF

This presentation is much easier to digest, compared to the large
ball of wax we saw previously.  It highlights the key modification
that cmd2process is now "subclassed" from subprocess_entry which is
a more generic structure by embedding the latter at the beginning,
and have its user start_multi_file_filter_fn() explicitly downcast
the latter to the former around patches 4/8 and 5/8.

If I were doing this series, I would organize the first two slightly
differently, namely:

 * 1/8 just adds packet_read_line_gently().

 * 2/8 moves packet_write_line() from convert.c to pkt-line.c while
   renaming it, with the justification that this function must be
   made more widely available.  It would naturally involves
   adjusting existing callers.

because write and read done in your 1/8 are independent and
orthogonal changes, and doing it that way also avoids needless
temporary duplication of the same function.

I may later have further comments on 3-8/8 after giving them another
read, but I haven't seen anything questionable in them so far.

Thanks.



Re: [PATCH 2/2] unpack-trees.c: align submodule error message to the other error messages

2017-03-29 Thread Jonathan Nieder
Stefan Beller wrote:

> As the place holder in the error message is for multiple submodules,
> we don't want to encapsulate the string place holder in single quotes.

Makes sense.

> Signed-off-by: Stefan Beller 
> ---
>  unpack-trees.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 8333da2cc9..9f386cc174 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -167,7 +167,7 @@ void setup_unpack_trees_porcelain(struct 
> unpack_trees_options *opts,
>   msgs[ERROR_WOULD_LOSE_ORPHANED_REMOVED] =
>   _("The following working tree files would be removed by sparse 
> checkout update:\n%s");
>   msgs[ERROR_WOULD_LOSE_SUBMODULE] =
> - _("Submodule '%s' cannot checkout new HEAD");
> + _("The following submodules cannot checkout a new HEAD:\n%s");

Nitpicking about wording: unless the user has adopted a strongly
object-oriented point of view, it is Git that cannot checkout a new
HEAD, not the submodule.

How about:

_("Cannot update submodule:\n%s")

That's vague, but if I understand correctly the way this error gets
used is equally vague --- i.e., a clearer message would involve
finer-grained error codes.

Thanks,
Jonathan


Re: [PATCH 1/2] entry.c: submodule recursing: respect force flag correctly

2017-03-29 Thread Jonathan Nieder
Stefan Beller wrote:

> In case of a non-forced worktree update, the submodule movement is tested
> in a dry run first, such that it doesn't matter if the actual update is
> done via the force flag. However for correctness, we want to give the
> flag is specified by the user.

"for correctness" means "to avoid races"?

> Signed-off-by: Stefan Beller 
> ---
>  entry.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/entry.c b/entry.c
> index d2b512da90..645121f828 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -287,7 +287,7 @@ int checkout_entry(struct cache_entry *ce,
>   } else
>   return submodule_move_head(ce->name,
>   "HEAD", oid_to_hex(>oid),
> - SUBMODULE_MOVE_HEAD_FORCE);
> + state->force ? 
> SUBMODULE_MOVE_HEAD_FORCE : 0);

Looks like a good change.

This moves past the 80-column margin.  I wish there were a tool like
gofmt or clang-format that would take care of formatting for us.

This isn't the only place SUBMODULE_MOVE_HEAD_FORCE is used in the
file.  Do they need the same treatment?

Thanks,
Jonathan


Re: BUG: Renaming a branch checked out in a different work tree

2017-03-29 Thread Andreas Schwab
On Mär 29 2017, Eyal Lotem  wrote:

> git version: 2.7.4
> installed from Ubuntu repos:
> Ubuntu Version: 1:2.7.4-0ubuntu1
>
> When renaming a branch checked out in a different work tree, that work
> tree's state is corrupted. Git status in that work tree then reports
> itself being on the "initial commit" with all files being in the
> staging area.

Fixed since 2.8.3.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: BUG: Renaming a branch checked out in a different work tree

2017-03-29 Thread Jonathan Nieder
(+cc: Kazuki Yamaguchi --- thanks for writing the fix!)
Hi Eyal,

Eyal Lotem wrote:

> git version: 2.7.4
> installed from Ubuntu repos:
> Ubuntu Version: 1:2.7.4-0ubuntu1
>
> When renaming a branch checked out in a different work tree, that work
> tree's state is corrupted. Git status in that work tree then reports
> itself being on the "initial commit" with all files being in the
> staging area.

I believe this is fixed by v2.8.3~42^2~1 (branch -m: update all
per-worktree HEADs, 2016-03-27).

Thanks and hope that helps,
Jonathan


[PATCH 2/2] unpack-trees.c: align submodule error message to the other error messages

2017-03-29 Thread Stefan Beller
As the place holder in the error message is for multiple submodules,
we don't want to encapsulate the string place holder in single quotes.

Signed-off-by: Stefan Beller 
---
 unpack-trees.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 8333da2cc9..9f386cc174 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -167,7 +167,7 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
msgs[ERROR_WOULD_LOSE_ORPHANED_REMOVED] =
_("The following working tree files would be removed by sparse 
checkout update:\n%s");
msgs[ERROR_WOULD_LOSE_SUBMODULE] =
-   _("Submodule '%s' cannot checkout new HEAD");
+   _("The following submodules cannot checkout a new HEAD:\n%s");
 
opts->show_all_errors = 1;
/* rejected paths may not have a static buffer */
-- 
2.12.0.rc1.52.g2de7d24de9.dirty



[PATCH 1/2] entry.c: submodule recursing: respect force flag correctly

2017-03-29 Thread Stefan Beller
In case of a non-forced worktree update, the submodule movement is tested
in a dry run first, such that it doesn't matter if the actual update is
done via the force flag. However for correctness, we want to give the
flag is specified by the user.

Signed-off-by: Stefan Beller 
---
 entry.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/entry.c b/entry.c
index d2b512da90..645121f828 100644
--- a/entry.c
+++ b/entry.c
@@ -287,7 +287,7 @@ int checkout_entry(struct cache_entry *ce,
} else
return submodule_move_head(ce->name,
"HEAD", oid_to_hex(>oid),
-   SUBMODULE_MOVE_HEAD_FORCE);
+   state->force ? 
SUBMODULE_MOVE_HEAD_FORCE : 0);
}
 
if (!changed)
-- 
2.12.0.rc1.52.g2de7d24de9.dirty



Re: [PATCH v3] perl: regenerate perl.mak if perl -V changes

2017-03-29 Thread Jeff King
On Wed, Mar 29, 2017 at 01:57:03PM +, Ævar Arnfjörð Bjarmason wrote:

> Change the perl/perl.mak build process so that the file is regenerated
> if the output of "perl -V" changes.
> 
> Before this change updating e.g. /usr/bin/perl to a new major version
> would cause the next "make" command to fail, since perl.mak has
> hardcoded paths to perl library paths retrieved from its first run.

This is one of those things that has been bugging me for years, but it
comes up so rarely that I have never dug into it.

> Now the logic added in commit ee9be06770 ("perl: detect new files in
> MakeMaker builds", 2012-07-27) is extended to regenerate
> perl/perl.mak if there's any change to "perl -V".

Nice. This fix is way simpler than I feared.

> This will in some cases redundantly trigger perl/perl.mak to be
> re-made, e.g. if @INC is modified in ways the build process doesn't
> care about through sitecustomize.pl, but the common case is that we
> just do the right thing and re-generate perl/perl.mak when needed.

I think that's fine. There's a related bug that the generation of
perl/perl.mak via recursive-make is sometimes racy. So that _might_
trigger more often as a result of this, but I think the solution is to
fix that race, not try to pretend it won't happen. :)

-Peff


BUG: Renaming a branch checked out in a different work tree

2017-03-29 Thread Eyal Lotem
git version: 2.7.4
installed from Ubuntu repos:
Ubuntu Version: 1:2.7.4-0ubuntu1

When renaming a branch checked out in a different work tree, that work
tree's state is corrupted. Git status in that work tree then reports
itself being on the "initial commit" with all files being in the
staging area.

--
Eyal


Re: [PATCH] read-cache: avoid git_path() race in freshen_shared_index()

2017-03-29 Thread Jeff King
On Wed, Mar 29, 2017 at 10:06:52AM -0700, Junio C Hamano wrote:

> > This shows that we should be careful not to use git_path() in
> > freshen_shared_index(). It is using a shared buffer that can
> > too easily lead to races.
> 
> The impression I get from the symptom is that after git_path() is
> called here, before check_and_freshen_file() uses that result, it
> (or functions it calls) uses git_path(), and the number of times it
> does so has changed since cc/split-index-config was written on the
> mainline, and the rotating 4-element buffer get_pathname() gives is
> now exhausted, leading to the failure you observed.  By the way,
> that does not sound a race to me.
> 
> In any case, that explains why bisect says the merge is the first
> bad one, and cures the confused reader ;-) The use of git_path() on
> the topic was still safe; it was a timebomb waiting to go off.  The
> mainline started using more calls and the merge result was unsafe.

Yeah, it looks like that is what happened. I see that Christian bisected
the rebase to find the commit in the series that introduces the problem.
I'm mildly curious which commit upstream created the problem[1].
There's a reasonable chance it's some innocent-looking cleanup (possibly
one of my recent "stop using a fixed buffer" ones).

But in the end it doesn't really matter. I think code like:

  const char *filename = git_path(...);

or

  nontrivial_function(git_path(...));

is an anti-pattern. It _might_ be safe, but it's really hard to tell
without following the complete lifetime of the return value. I've been
tempted to suggest we should abolish git_path() entirely. But it's so
darn useful for things like unlink(git_path(...)), or other direct
system calls.

As an aside, this kind of static-buffer reuse _used_ to mean you might
see somebody else's buffer. Which is bad enough. But since the move to
use strbufs underneath the hood of git_path(), it may produce that
effect or it may be a use-after-free (if the strbuf had to reallocate to
grow in the meantime).

Anyway. The fix in the patch is obviously the right thing.

-Peff

[1] I think we could pinpoint the upstream change that caused the bad
interaction by bisecting between the merge-base and the first-parent
of the broken merge. For each commit, cherry-pick the complete
series on top of it, and test the result.


Re: [PATCH v2 0/3] name-rev sanity

2017-03-29 Thread Junio C Hamano
Junio C Hamano  writes:

> The first two applies cleanly to the same base as jc/name-rev that
> the first two of these patches are meant to replace, but the third
> one doesn't apply on top.  Are you depending on something newer?

Ah, of course, you are depending on your other topic ;-)
I'll wiggle these in.

Thanks.


Re: [PATCH] userdiff: add build-in pattern for shell

2017-03-29 Thread Junio C Hamano
Ivan Tham  writes:

> Shell are widely used but comes with lots of different patterns. The
> build-in pattern aim for POSIX-compatible shells with some additions:
>
> - Notably ${g//re/s} and ${g#cut}
> - "function" from bash
>
> Signed-off-by: Ivan Tham 
> ---
>  Documentation/gitattributes.txt |  2 ++
>  t/t4034-diff-words.sh   |  1 +
>  t/t4034/sh/expect   | 14 ++
>  t/t4034/sh/post |  7 +++
>  t/t4034/sh/pre  |  7 +++
>  userdiff.c  |  5 +
>  6 files changed, 36 insertions(+)
>  create mode 100644 t/t4034/sh/expect
>  create mode 100644 t/t4034/sh/post
>  create mode 100644 t/t4034/sh/pre
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index a53d093ca..1bad72df2 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -706,6 +706,8 @@ patterns are available:
>  
>  - `ruby` suitable for source code in the Ruby language.
>  
> +- `sh` suitable for source code in POSIX-compatible shells.

The new test you added seems to show that this is not limited to
POSIX shells but also understands bashisms like ${x//x/x}.  Perhaps
drop "POSIX-compatible" from here.

> diff --git a/userdiff.c b/userdiff.c
> index 8b732e40b..8d5127fb6 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -148,6 +148,11 @@ PATTERNS("csharp",
>"[a-zA-Z_][a-zA-Z0-9_]*"
>"|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
>"|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->"),
> +PATTERNS("sh",
> +  "^[ \t]*(function )?[A-Za-z_][A-Za-z_0-9]*[ \t]*()[\t]*\\{?$",

There is something funky going on around parentheses on this line.
The ones around "function " is meant to be syntactic metacharacters
to produce a group in the regexp so that you can apply '?'
(i.e. zero or one occurrence) to it.  But I think the second pair of
parentheses that appears later on the line, which enclose nothing,
are meant to be literal?  E.g. "hello (){\n\techo world;\n}\n"  They
would need some quoting, perhaps like

...[ \t]*\\(\\)[\t]*

> +  /* -- */
> +  "(\\$|--?)?([a-zA-Z_][a-zA-Z0-9._]*|[0-9]+|#)|--" /* command/param */

TBH, I have no idea what this line-noise is doing.

$foobar, $4, --foobar, foobar, 123 and -- can be seen easily out of
these patterns.  I am not sure what --# would be (perhaps you meant
to only catch $# and --# is included by accident, in which case it
is understandable).  It feels a bit strange to see that $# is
supported but not $?; --foo but not --foo=bar; foobar but not "foo
bar" inside a dq-pair.

> +  "|\\$[({]|[)}]|[-+*/=!]=?|[\\]&%#/|]{1,2}|[<>]{1,3}|[ \t]#.*"),

And this one is even more dense.


Re: [PATCH v2 0/3] name-rev sanity

2017-03-29 Thread Junio C Hamano
Michael J Gruber  writes:

> So here is v2 of the name-rev series, the result of our discussions being:
>
> Junio C Hamano (2):
>   name-rev: refactor logic to see if a new candidate is a better name
>   name-rev: favor describing with tags and use committer date to
> tiebreak
>
> That second patch is slighty changed as discussed, but still mostly Junio's
>
> Michael J Gruber (1):
>   name-rev: provide debug output
>
> This replaces the patch which documented that --debug does not work with 
> --contains :)
>
>  builtin/describe.c |   2 +
>  builtin/name-rev.c | 117 
> +
>  t/t4202-log.sh |   2 +-
>  3 files changed, 103 insertions(+), 18 deletions(-)

The first two applies cleanly to the same base as jc/name-rev that
the first two of these patches are meant to replace, but the third
one doesn't apply on top.  Are you depending on something newer?


Re: git-compat-util.h:735:13: error: conflicting types for 'inet_ntop'

2017-03-29 Thread Junio C Hamano
Jeffrey Walton  writes:

> Some more 2.12.2 testing on Solaris 11.3 x86_64:
>
> $ make V=1
> gcc -o credential-store.o -c -MF ./.depend/credential-store.o.d -MQ
> credential-store.o -MMD -MP -I/usr/local/include -m64 -m64 -I.
> -D__EXTENSIONS__ -D__sun__ -DUSE_LIBPCRE -I/usr/local/include
> -DHAVE_ALLOCA_H -I/usr/local/include -DUSE_CURL_FOR_IMAP_SEND
> -I/usr/local/include -I/usr/local/include -DNO_D_TYPE_IN_DIRENT
> -DNO_INET_NTOP -DNO_INET_PTON  -DHAVE_PATHS_H -DHAVE_LIBCHARSET_H

Looking at config.mak.uname, nothing in SunOS section seems to set
NO_INET_NTOP or NO_INET_PTON.  Why is your build setting them?


Re: [PATCH] read-cache: avoid git_path() race in freshen_shared_index()

2017-03-29 Thread Junio C Hamano
Christian Couder  writes:

> When performing an interactive rebase in split-index mode,
> the commit message that one should rework when squashing commits
> can contain some garbage instead of the usual concatenation of
> both of the commit messages.

OK, that is an understandable explanation of what problem you are
trying to fix.

>
> When bisecting it appears that 94c9b5af70 (Merge branch
> 'cc/split-index-config', 2017-03-17) is the first bad commit.
>
> But when rebasing cc/split-index-config on top of the commit it
> was merged with, the first bad commit is then c3a0082502
> (read-cache: use freshen_shared_index() in read_index_from(),
> 2017-03-06).

This part however doesn't help understanding the issue.  "When X but
when Y" sounds as if you found a botched merge, but that does not
seem to be the case.  The resulting tree after rebasing (with
conflict resolution) is the same as the recorded merge result.  It
could be saying that "git bisect" is buggy and does not pinpoint the
broken commit, but this is not a commit to fix "bisect".

That leaves the reader confused.

> This shows that we should be careful not to use git_path() in
> freshen_shared_index(). It is using a shared buffer that can
> too easily lead to races.

The impression I get from the symptom is that after git_path() is
called here, before check_and_freshen_file() uses that result, it
(or functions it calls) uses git_path(), and the number of times it
does so has changed since cc/split-index-config was written on the
mainline, and the rotating 4-element buffer get_pathname() gives is
now exhausted, leading to the failure you observed.  By the way,
that does not sound a race to me.

In any case, that explains why bisect says the merge is the first
bad one, and cures the confused reader ;-) The use of git_path() on
the topic was still safe; it was a timebomb waiting to go off.  The
mainline started using more calls and the merge result was unsafe.

If you meant to summarise the whole two paragraphs above that I
needed to think it through with "This shows that", I'd have to say
that you are expecting too much from your readers.  Please be a bit
more gentle to them.

Thanks.

> Signed-off-by: Christian Couder 
> ---
>  read-cache.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/read-cache.c b/read-cache.c
> index e447751823..2f10242c24 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1682,9 +1682,10 @@ int do_read_index(struct index_state *istate, const 
> char *path, int must_exist)
>   */
>  static void freshen_shared_index(char *base_sha1_hex, int warn)
>  {
> - const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
> + char *shared_index = git_pathdup("sharedindex.%s", base_sha1_hex);
>   if (!check_and_freshen_file(shared_index, 1) && warn)
>   warning("could not freshen shared index '%s'", shared_index);
> + free(shared_index);
>  }
>  
>  int read_index_from(struct index_state *istate, const char *path)


[PATCH] userdiff: add build-in pattern for shell

2017-03-29 Thread Ivan Tham
Shell are widely used but comes with lots of different patterns. The
build-in pattern aim for POSIX-compatible shells with some additions:

- Notably ${g//re/s} and ${g#cut}
- "function" from bash

Signed-off-by: Ivan Tham 
---
 Documentation/gitattributes.txt |  2 ++
 t/t4034-diff-words.sh   |  1 +
 t/t4034/sh/expect   | 14 ++
 t/t4034/sh/post |  7 +++
 t/t4034/sh/pre  |  7 +++
 userdiff.c  |  5 +
 6 files changed, 36 insertions(+)
 create mode 100644 t/t4034/sh/expect
 create mode 100644 t/t4034/sh/post
 create mode 100644 t/t4034/sh/pre

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index a53d093ca..1bad72df2 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -706,6 +706,8 @@ patterns are available:
 
 - `ruby` suitable for source code in the Ruby language.
 
+- `sh` suitable for source code in POSIX-compatible shells.
+
 - `tex` suitable for source code for LaTeX documents.
 
 
diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh
index 912df9122..2eb662f89 100755
--- a/t/t4034-diff-words.sh
+++ b/t/t4034-diff-words.sh
@@ -313,6 +313,7 @@ test_language_driver perl
 test_language_driver php
 test_language_driver python
 test_language_driver ruby
+test_language_driver sh
 test_language_driver tex
 
 test_expect_success 'word-diff with diff.sbe' '
diff --git a/t/t4034/sh/expect b/t/t4034/sh/expect
new file mode 100644
index 0..e7b0a9ae3
--- /dev/null
+++ b/t/t4034/sh/expect
@@ -0,0 +1,14 @@
+diff --git a/pre b/post
+index 7bb0d15..df3845b 100644
+--- a/pre
 b/post
+@@ -1,7 +1,7 @@
+echo "Hello world!
+bomb?"
+fork(){ 
bombfork|bombfork& }
+; bomb
+ax=1 a2 
x=$((ax+12)) 
ax=$((a-1x-2)) 
ax=$((ax*12))
 
ax=$((ax/12))
+ax=$(ax) 
ax=`ax` 
ax=${ax#ax*}
 
ax=${ax%ax*}
 
ax=${ax//ax/ax}
+command -h -v--help=all -q | xargs -- echo 
2>&1 &/dev/null
+[ $a -eq $b$x -ne $y ]& 
aaxx||echo bbyy
+[ "$a$x"!=12 ] && echo 
ax || echo by
diff --git a/t/t4034/sh/post b/t/t4034/sh/post
new file mode 100644
index 0..df3845b4f
--- /dev/null
+++ b/t/t4034/sh/post
@@ -0,0 +1,7 @@
+echo "Hello world?"
+fork(){ fork|fork& }
+x=2 x=$((x+2)) x=$((x-2)) x=$((x*2)) x=$((x/2))
+x=$(x) x=`x` x=${x#x*} x=${x%x*} x=${x//x/x}
+command --help=all -q | xargs -- echo 2>/dev/null
+[ $x -ne $y ]& xx||echo yy
+[ "$x"!=2 ] && echo x || echo y
diff --git a/t/t4034/sh/pre b/t/t4034/sh/pre
new file mode 100644
index 0..7bb0d1562
--- /dev/null
+++ b/t/t4034/sh/pre
@@ -0,0 +1,7 @@
+echo Hello world!
+bomb(){ bomb|bomb& }; bomb
+a=1 a=$((a+1)) a=$((a-1)) a=$((a*1)) a=$((a/1))
+a=$(a) a=`a` a=${a#a*} a=${a%a*} a=${a//a/a}
+command -h -v | xargs -- echo >&1 &
+[ $a -eq $b ]& aa||echo bb
+[ "$a"!=1 ] && echo a || echo b
diff --git a/userdiff.c b/userdiff.c
index 8b732e40b..8d5127fb6 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -148,6 +148,11 @@ PATTERNS("csharp",
 "[a-zA-Z_][a-zA-Z0-9_]*"
 "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
 "|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->"),
+PATTERNS("sh",
+"^[ \t]*(function )?[A-Za-z_][A-Za-z_0-9]*[ \t]*()[\t]*\\{?$",
+/* -- */
+"(\\$|--?)?([a-zA-Z_][a-zA-Z0-9._]*|[0-9]+|#)|--" /* command/param */
+"|\\$[({]|[)}]|[-+*/=!]=?|[\\]&%#/|]{1,2}|[<>]{1,3}|[ \t]#.*"),
 IPATTERN("css",
 "![:;][[:space:]]*$\n"
 "^[_a-z0-9].*$",
-- 
2.12.2.609.gf7d0c115f



Re: [PATCH v4 2/5] dir_iterator: iterate over dir after its contents

2017-03-29 Thread Junio C Hamano
Michael Haggerty  writes:

> I also realize that I made a goof in my comments about v3 of this patch
> series. Your new option is not choosing between "depth-first" and
> "breadth-first". Both types of iteration are depth-first. Really it is
> choosing between pre-order and post-order traversal. So I think it would
> be better to name the option `DIR_ITERATOR_POST_ORDER`. Sorry about that.

That solicits a natural reaction from a bystander.  Would an
IN_ORDER option also be useful?  I am not demanding it to be added
to this series, especially if there is no immediate need, but if we
foresee that it would also make sense for some other callers, we
would at least want to make sure that the code after this addition
of POST_ORDER is in a shape that is easy to add such an option
later.


Re: [PATCH 0/18] snprintf cleanups

2017-03-29 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Mar 28, 2017 at 03:33:48PM -0700, Junio C Hamano wrote:
>
>> Jeff King  writes:
>> 
>> > It's a lot of patches, but hopefully they're all pretty straightforward
>> > to read.
>> 
>> Yes, quite a lot of changes.  I didn't see anything questionable in
>> there.
>> 
>> As to the "patch-id" thing, I find the alternate one slightly easier
>> to read.  Also, exactly because this is not a performance critical
>> codepath, it may be better if patch_id_add_string() filtered out
>> whitespaces; that would allow the source to express things in more
>> natural way, e.g.
>> 
>>  patch_id_addf(, "new file mode");
>>  patch_id_addf(, "%06o", p->two->mode);
>>  patch_id_addf(, "--- /dev/null");
>>  patch_id_addf(, "+++ b/%.*s", len2, p->two->path);
>> 
>> Or I may be going overboard by bringing "addf" into the mix X-<.
>
> I think there are two things going on in your example.
>
> One is that obviously patch_id_addf() removes the spaces from the
> result. But we could do that now by keeping the big strbuf_addf(), and
> then just walking the result and feeding non-spaces.
>
> The second is that your addf means we are back to formatting everything
> into a buffer again

You are right to point out that I was blinded by the ugliness of
words stuck together without spaces in between, which was inherited
from the original code, and failed to see the sole point of this
series, which is to remove truncation without adding unnecessary
allocation and freeing.

Thanks for straighten my thinking out.  I think the seeming
ugliness, if it ever becomes a real problem, should be handled
outside this series after the dust settles.





[PATCH v3 0/8] refactor the filter process code into a reusable module

2017-03-29 Thread Ben Peart
Refactor the filter..process code into a separate sub-process
module that can be used to reduce the cost of starting up a sub-process
for multiple commands.  It does this by keeping the external process
running and processing all commands by communicating over standard input
and standard output using the packet format (pkt-line) based protocol.
Full documentation is in Documentation/technical/api-sub-process.txt.

This code is refactored from:

Commit edcc85814c ("convert: add filter..process option", 
2016-10-16)
keeps the external process running and processes all commands


Ben Peart (8):
  pkt-line: add packet_writel() and packet_read_line_gently()
  convert: Update convert to use new packet_writel() function
  convert: Split start_multi_file_filter into two separate functions
  convert: Separate generic structures and variables from the filter
specific ones
  convert: Update generic functions to only use generic data structures
  convert: rename reusable sub-process functions
  sub-process: move sub-process functions into separate files
  convert: Update subprocess_read_status to not die on EOF

 Documentation/technical/api-sub-process.txt |  54 ++
 Makefile|   1 +
 convert.c   | 159 +---
 pkt-line.c  |  31 ++
 pkt-line.h  |  11 ++
 sub-process.c   | 120 +
 sub-process.h   |  46 
 7 files changed, 292 insertions(+), 130 deletions(-)
 create mode 100644 Documentation/technical/api-sub-process.txt
 create mode 100644 sub-process.c
 create mode 100644 sub-process.h

-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v3 5/8] convert: Update generic functions to only use generic data structures

2017-03-29 Thread Ben Peart
Update all functions that are going to be moved into a reusable module
so that they only work with the reusable data structures.  Move code
that is specific to the filter out into the filter functions.

Signed-off-by: Ben Peart 
---
 convert.c | 48 +---
 1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/convert.c b/convert.c
index f569026511..77966e246b 100644
--- a/convert.c
+++ b/convert.c
@@ -562,7 +562,6 @@ static void kill_multi_file_filter(struct subprocess_entry 
*entry)
finish_command(>process);
 
hashmap_remove(_process_map, entry, NULL);
-   free(entry);
 }
 
 static void stop_multi_file_filter(struct child_process *process)
@@ -576,14 +575,15 @@ static void stop_multi_file_filter(struct child_process 
*process)
finish_command(process);
 }
 
-static int start_multi_file_filter_fn(struct cmd2process *entry)
+static int start_multi_file_filter_fn(struct subprocess_entry *subprocess)
 {
int err;
+   struct cmd2process *entry = (struct cmd2process *)subprocess;
struct string_list cap_list = STRING_LIST_INIT_NODUP;
char *cap_buf;
const char *cap_name;
-   struct child_process *process = >subprocess.process;
-   const char *cmd = entry->subprocess.cmd;
+   struct child_process *process = >process;
+   const char *cmd = subprocess->cmd;
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -638,17 +638,21 @@ static int start_multi_file_filter_fn(struct cmd2process 
*entry)
return err;
 }
 
-static struct cmd2process *start_multi_file_filter(const char *cmd)
+typedef int(*subprocess_start_fn)(struct subprocess_entry *entry);
+int start_multi_file_filter(struct subprocess_entry *entry, const char *cmd,
+   subprocess_start_fn startfn)
 {
int err;
-   struct cmd2process *entry;
struct child_process *process;
const char *argv[] = { cmd, NULL };
 
-   entry = xmalloc(sizeof(*entry));
-   entry->subprocess.cmd = cmd;
-   entry->supported_capabilities = 0;
-   process = >subprocess.process;
+   if (!cmd_process_map_initialized) {
+   cmd_process_map_initialized = 1;
+   hashmap_init(_process_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
+   }
+
+   entry->cmd = cmd;
+   process = >process;
 
child_process_init(process);
process->argv = argv;
@@ -658,22 +662,23 @@ static struct cmd2process *start_multi_file_filter(const 
char *cmd)
process->clean_on_exit = 1;
process->clean_on_exit_handler = stop_multi_file_filter;
 
-   if (start_command(process)) {
+   err = start_command(process);
+   if (err) {
error("cannot fork to run external filter '%s'", cmd);
-   return NULL;
+   return err;
}
 
hashmap_entry_init(entry, strhash(cmd));
 
-   err = start_multi_file_filter_fn(entry);
+   err = startfn(entry);
if (err) {
error("initialization for external filter '%s' failed", cmd);
-   kill_multi_file_filter(>subprocess);
-   return NULL;
+   kill_multi_file_filter(entry);
+   return err;
}
 
hashmap_add(_process_map, entry);
-   return entry;
+   return 0;
 }
 
 static int apply_multi_file_filter(const char *path, const char *src, size_t 
len,
@@ -692,9 +697,13 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
fflush(NULL);
 
if (!entry) {
-   entry = start_multi_file_filter(cmd);
-   if (!entry)
+   entry = xmalloc(sizeof(*entry));
+   entry->supported_capabilities = 0;
+
+   if (start_multi_file_filter(>subprocess, cmd, 
start_multi_file_filter_fn)) {
+   free(entry);
return 0;
+   }
}
process = >subprocess.process;
 
@@ -767,7 +776,8 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
 * Force shutdown and restart if another blob requires 
filtering.
 */
error("external filter '%s' failed", cmd);
-   kill_multi_file_filter(>subprocess);
+   kill_multi_file_filter((struct subprocess_entry 
*)entry);
+   free(entry);
}
} else {
strbuf_swap(dst, );
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v3 4/8] convert: Separate generic structures and variables from the filter specific ones

2017-03-29 Thread Ben Peart
To enable future reuse of the filter..process infrastructure,
split the cmd2process structure into two separate parts.

subprocess_entry will now contain the generic data required to manage
the creation and tracking of the child process in a hashmap. Also move
all knowledge of the hashmap into the generic functions.

cmd2process is a filter protocol specific structure that is used to
track the negotiated capabilities of the filter.

Signed-off-by: Ben Peart 
---
 convert.c | 57 +++--
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/convert.c b/convert.c
index 404757eac9..f569026511 100644
--- a/convert.c
+++ b/convert.c
@@ -496,29 +496,40 @@ static int apply_single_file_filter(const char *path, 
const char *src, size_t le
 #define CAP_CLEAN(1u<<0)
 #define CAP_SMUDGE   (1u<<1)
 
-struct cmd2process {
+struct subprocess_entry {
struct hashmap_entry ent; /* must be the first member! */
-   unsigned int supported_capabilities;
const char *cmd;
struct child_process process;
 };
 
+struct cmd2process {
+   struct subprocess_entry subprocess; /* must be the first member! */
+   unsigned int supported_capabilities;
+};
+
 static int cmd_process_map_initialized;
 static struct hashmap cmd_process_map;
 
-static int cmd2process_cmp(const struct cmd2process *e1,
-  const struct cmd2process *e2,
+static int cmd2process_cmp(const struct subprocess_entry *e1,
+  const struct subprocess_entry *e2,
   const void *unused)
 {
return strcmp(e1->cmd, e2->cmd);
 }
 
-static struct cmd2process *find_multi_file_filter_entry(struct hashmap 
*hashmap, const char *cmd)
+static struct subprocess_entry *find_multi_file_filter_entry(const char *cmd)
 {
-   struct cmd2process key;
+   struct subprocess_entry key;
+
+   if (!cmd_process_map_initialized) {
+   cmd_process_map_initialized = 1;
+   hashmap_init(_process_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
+   return NULL;
+   }
+
hashmap_entry_init(, strhash(cmd));
key.cmd = cmd;
-   return hashmap_get(hashmap, , NULL);
+   return hashmap_get(_process_map, , NULL);
 }
 
 static void read_multi_file_filter_status(int fd, struct strbuf *status)
@@ -541,7 +552,7 @@ static void read_multi_file_filter_status(int fd, struct 
strbuf *status)
}
 }
 
-static void kill_multi_file_filter(struct hashmap *hashmap, struct cmd2process 
*entry)
+static void kill_multi_file_filter(struct subprocess_entry *entry)
 {
if (!entry)
return;
@@ -550,7 +561,7 @@ static void kill_multi_file_filter(struct hashmap *hashmap, 
struct cmd2process *
kill(entry->process.pid, SIGTERM);
finish_command(>process);
 
-   hashmap_remove(hashmap, entry, NULL);
+   hashmap_remove(_process_map, entry, NULL);
free(entry);
 }
 
@@ -571,8 +582,8 @@ static int start_multi_file_filter_fn(struct cmd2process 
*entry)
struct string_list cap_list = STRING_LIST_INIT_NODUP;
char *cap_buf;
const char *cap_name;
-   struct child_process *process = >process;
-   const char *cmd = entry->cmd;
+   struct child_process *process = >subprocess.process;
+   const char *cmd = entry->subprocess.cmd;
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -627,7 +638,7 @@ static int start_multi_file_filter_fn(struct cmd2process 
*entry)
return err;
 }
 
-static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, 
const char *cmd)
+static struct cmd2process *start_multi_file_filter(const char *cmd)
 {
int err;
struct cmd2process *entry;
@@ -635,9 +646,9 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
const char *argv[] = { cmd, NULL };
 
entry = xmalloc(sizeof(*entry));
-   entry->cmd = cmd;
+   entry->subprocess.cmd = cmd;
entry->supported_capabilities = 0;
-   process = >process;
+   process = >subprocess.process;
 
child_process_init(process);
process->argv = argv;
@@ -657,11 +668,11 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
err = start_multi_file_filter_fn(entry);
if (err) {
error("initialization for external filter '%s' failed", cmd);
-   kill_multi_file_filter(hashmap, entry);
+   kill_multi_file_filter(>subprocess);
return NULL;
}
 
-   hashmap_add(hashmap, entry);
+   hashmap_add(_process_map, entry);
return entry;
 }
 
@@ -676,22 +687,16 @@ static int apply_multi_file_filter(const char *path, 
const char *src, size_t len
struct strbuf filter_status = STRBUF_INIT;
const char *filter_type;
 
-   if (!cmd_process_map_initialized) {
-   cmd_process_map_initialized 

[PATCH v3 2/8] convert: Update convert to use new packet_writel() function

2017-03-29 Thread Ben Peart
convert.c had it's own packet_write_list() function that can now be
replaced with the new packet_writel() function from pkt-line.

Signed-off-by: Ben Peart 
---
 convert.c | 23 ++-
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/convert.c b/convert.c
index 8d652bf27c..793c29ebfd 100644
--- a/convert.c
+++ b/convert.c
@@ -521,25 +521,6 @@ static struct cmd2process 
*find_multi_file_filter_entry(struct hashmap *hashmap,
return hashmap_get(hashmap, , NULL);
 }
 
-static int packet_write_list(int fd, const char *line, ...)
-{
-   va_list args;
-   int err;
-   va_start(args, line);
-   for (;;) {
-   if (!line)
-   break;
-   if (strlen(line) > LARGE_PACKET_DATA_MAX)
-   return -1;
-   err = packet_write_fmt_gently(fd, "%s\n", line);
-   if (err)
-   return err;
-   line = va_arg(args, const char*);
-   }
-   va_end(args);
-   return packet_flush_gently(fd);
-}
-
 static void read_multi_file_filter_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
@@ -616,7 +597,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
 
sigchain_push(SIGPIPE, SIG_IGN);
 
-   err = packet_write_list(process->in, "git-filter-client", "version=2", 
NULL);
+   err = packet_writel(process->in, "git-filter-client", "version=2", 
NULL);
if (err)
goto done;
 
@@ -632,7 +613,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
if (err)
goto done;
 
-   err = packet_write_list(process->in, "capability=clean", 
"capability=smudge", NULL);
+   err = packet_writel(process->in, "capability=clean", 
"capability=smudge", NULL);
 
for (;;) {
cap_buf = packet_read_line(process->out, NULL);
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v3 6/8] convert: rename reusable sub-process functions

2017-03-29 Thread Ben Peart
Do a mechanical rename of the functions that will become the reusable
sub-process module.

Signed-off-by: Ben Peart 
---
 convert.c | 46 +++---
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/convert.c b/convert.c
index 77966e246b..8e9223f48a 100644
--- a/convert.c
+++ b/convert.c
@@ -507,8 +507,8 @@ struct cmd2process {
unsigned int supported_capabilities;
 };
 
-static int cmd_process_map_initialized;
-static struct hashmap cmd_process_map;
+static int subprocess_map_initialized;
+static struct hashmap subprocess_map;
 
 static int cmd2process_cmp(const struct subprocess_entry *e1,
   const struct subprocess_entry *e2,
@@ -517,22 +517,22 @@ static int cmd2process_cmp(const struct subprocess_entry 
*e1,
return strcmp(e1->cmd, e2->cmd);
 }
 
-static struct subprocess_entry *find_multi_file_filter_entry(const char *cmd)
+static struct subprocess_entry *subprocess_find_entry(const char *cmd)
 {
struct subprocess_entry key;
 
-   if (!cmd_process_map_initialized) {
-   cmd_process_map_initialized = 1;
-   hashmap_init(_process_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
+   if (!subprocess_map_initialized) {
+   subprocess_map_initialized = 1;
+   hashmap_init(_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
return NULL;
}
 
hashmap_entry_init(, strhash(cmd));
key.cmd = cmd;
-   return hashmap_get(_process_map, , NULL);
+   return hashmap_get(_map, , NULL);
 }
 
-static void read_multi_file_filter_status(int fd, struct strbuf *status)
+static void subprocess_read_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
char *line;
@@ -552,7 +552,7 @@ static void read_multi_file_filter_status(int fd, struct 
strbuf *status)
}
 }
 
-static void kill_multi_file_filter(struct subprocess_entry *entry)
+static void subprocess_stop(struct subprocess_entry *entry)
 {
if (!entry)
return;
@@ -561,10 +561,10 @@ static void kill_multi_file_filter(struct 
subprocess_entry *entry)
kill(entry->process.pid, SIGTERM);
finish_command(>process);
 
-   hashmap_remove(_process_map, entry, NULL);
+   hashmap_remove(_map, entry, NULL);
 }
 
-static void stop_multi_file_filter(struct child_process *process)
+static void subprocess_exit_handler(struct child_process *process)
 {
sigchain_push(SIGPIPE, SIG_IGN);
/* Closing the pipe signals the filter to initiate a shutdown. */
@@ -639,16 +639,16 @@ static int start_multi_file_filter_fn(struct 
subprocess_entry *subprocess)
 }
 
 typedef int(*subprocess_start_fn)(struct subprocess_entry *entry);
-int start_multi_file_filter(struct subprocess_entry *entry, const char *cmd,
+int subprocess_start(struct subprocess_entry *entry, const char *cmd,
subprocess_start_fn startfn)
 {
int err;
struct child_process *process;
const char *argv[] = { cmd, NULL };
 
-   if (!cmd_process_map_initialized) {
-   cmd_process_map_initialized = 1;
-   hashmap_init(_process_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
+   if (!subprocess_map_initialized) {
+   subprocess_map_initialized = 1;
+   hashmap_init(_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
}
 
entry->cmd = cmd;
@@ -660,7 +660,7 @@ int start_multi_file_filter(struct subprocess_entry *entry, 
const char *cmd,
process->in = -1;
process->out = -1;
process->clean_on_exit = 1;
-   process->clean_on_exit_handler = stop_multi_file_filter;
+   process->clean_on_exit_handler = subprocess_exit_handler;
 
err = start_command(process);
if (err) {
@@ -673,11 +673,11 @@ int start_multi_file_filter(struct subprocess_entry 
*entry, const char *cmd,
err = startfn(entry);
if (err) {
error("initialization for external filter '%s' failed", cmd);
-   kill_multi_file_filter(entry);
+   subprocess_stop(entry);
return err;
}
 
-   hashmap_add(_process_map, entry);
+   hashmap_add(_map, entry);
return 0;
 }
 
@@ -692,7 +692,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
struct strbuf filter_status = STRBUF_INIT;
const char *filter_type;
 
-   entry = (struct cmd2process *)find_multi_file_filter_entry(cmd);
+   entry = (struct cmd2process *)subprocess_find_entry(cmd);
 
fflush(NULL);
 
@@ -700,7 +700,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
entry = xmalloc(sizeof(*entry));
entry->supported_capabilities = 0;
 
-   if (start_multi_file_filter(>subprocess, cmd, 
start_multi_file_filter_fn)) {
+   if (subprocess_start(>subprocess, cmd, 

[PATCH v3 3/8] convert: Split start_multi_file_filter into two separate functions

2017-03-29 Thread Ben Peart
To enable future reuse of the filter..process infrastructure,
split start_multi_file_filter into two separate parts.

start_multi_file_filter will now only contain the generic logic to
manage the creation and tracking of the child process in a hashmap.

start_multi_file_filter_fn is a protocol specific initialization
function that will negotiate the multi-file-filter interface version
and capabilities.

Signed-off-by: Ben Peart 
---
 convert.c | 63 ++-
 1 file changed, 38 insertions(+), 25 deletions(-)

diff --git a/convert.c b/convert.c
index 793c29ebfd..404757eac9 100644
--- a/convert.c
+++ b/convert.c
@@ -565,35 +565,14 @@ static void stop_multi_file_filter(struct child_process 
*process)
finish_command(process);
 }
 
-static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, 
const char *cmd)
+static int start_multi_file_filter_fn(struct cmd2process *entry)
 {
int err;
-   struct cmd2process *entry;
-   struct child_process *process;
-   const char *argv[] = { cmd, NULL };
struct string_list cap_list = STRING_LIST_INIT_NODUP;
char *cap_buf;
const char *cap_name;
-
-   entry = xmalloc(sizeof(*entry));
-   entry->cmd = cmd;
-   entry->supported_capabilities = 0;
-   process = >process;
-
-   child_process_init(process);
-   process->argv = argv;
-   process->use_shell = 1;
-   process->in = -1;
-   process->out = -1;
-   process->clean_on_exit = 1;
-   process->clean_on_exit_handler = stop_multi_file_filter;
-
-   if (start_command(process)) {
-   error("cannot fork to run external filter '%s'", cmd);
-   return NULL;
-   }
-
-   hashmap_entry_init(entry, strhash(cmd));
+   struct child_process *process = >process;
+   const char *cmd = entry->cmd;
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -642,7 +621,41 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
 done:
sigchain_pop(SIGPIPE);
 
-   if (err || errno == EPIPE) {
+   if (err || errno == EPIPE)
+   err = err ? err : errno;
+
+   return err;
+}
+
+static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, 
const char *cmd)
+{
+   int err;
+   struct cmd2process *entry;
+   struct child_process *process;
+   const char *argv[] = { cmd, NULL };
+
+   entry = xmalloc(sizeof(*entry));
+   entry->cmd = cmd;
+   entry->supported_capabilities = 0;
+   process = >process;
+
+   child_process_init(process);
+   process->argv = argv;
+   process->use_shell = 1;
+   process->in = -1;
+   process->out = -1;
+   process->clean_on_exit = 1;
+   process->clean_on_exit_handler = stop_multi_file_filter;
+
+   if (start_command(process)) {
+   error("cannot fork to run external filter '%s'", cmd);
+   return NULL;
+   }
+
+   hashmap_entry_init(entry, strhash(cmd));
+
+   err = start_multi_file_filter_fn(entry);
+   if (err) {
error("initialization for external filter '%s' failed", cmd);
kill_multi_file_filter(hashmap, entry);
return NULL;
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v3 7/8] sub-process: move sub-process functions into separate files

2017-03-29 Thread Ben Peart
Move the sub-proces functions into sub-process.h/c.  Add documentation
for the new module in Documentation/technical/api-sub-process.txt

Signed-off-by: Ben Peart 
---
 Documentation/technical/api-sub-process.txt |  54 +
 Makefile|   1 +
 convert.c   | 118 +---
 sub-process.c   | 116 +++
 sub-process.h   |  46 +++
 5 files changed, 218 insertions(+), 117 deletions(-)
 create mode 100644 Documentation/technical/api-sub-process.txt
 create mode 100644 sub-process.c
 create mode 100644 sub-process.h

diff --git a/Documentation/technical/api-sub-process.txt 
b/Documentation/technical/api-sub-process.txt
new file mode 100644
index 00..eb5005aa72
--- /dev/null
+++ b/Documentation/technical/api-sub-process.txt
@@ -0,0 +1,54 @@
+sub-process API
+===
+
+The sub-process API makes it possible to run background sub-processes
+that should run until the git command exits and communicate with it
+through stdin and stdout.  This reduces the overhead of having to fork
+a new process each time it needs to be communicated with.
+
+The sub-processes are kept in a hashmap by command name and looked up
+via the subprocess_find_entry function.  If an existing instance can not
+be found then a new process should be created and started.  When the
+parent git command terminates, all sub-processes are also terminated.
+
+This API is based on the run-command API.
+
+Data structures
+---
+
+* `struct subprocess_entry`
+
+The sub-process structure.  Members should not be accessed directly.
+
+Types
+-
+
+'int(*subprocess_start_fn)(struct subprocess_entry *entry)'::
+
+   User-supplied function to initialize the sub-process.  This is
+   typically used to negoiate the interface version and capabilities.
+
+
+Functions
+-
+
+`subprocess_start`::
+
+   Start a subprocess and add it to the subprocess hashmap.
+
+`subprocess_stop`::
+
+   Kill a subprocess and remove it from the subprocess hashmap.
+
+`subprocess_find_entry`::
+
+   Find a subprocess in the subprocess hashmap.
+
+`subprocess_get_child_process`::
+
+   Get the underlying `struct child_process` from a subprocess.
+
+`subprocess_read_status`::
+
+   Helper function to read packets looking for the last "status="
+   key/value pair.
diff --git a/Makefile b/Makefile
index 9f8b35ad41..add945b560 100644
--- a/Makefile
+++ b/Makefile
@@ -838,6 +838,7 @@ LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
 LIB_OBJS += submodule.o
 LIB_OBJS += submodule-config.o
+LIB_OBJS += sub-process.o
 LIB_OBJS += symlinks.o
 LIB_OBJS += tag.o
 LIB_OBJS += tempfile.o
diff --git a/convert.c b/convert.c
index 8e9223f48a..baa41da760 100644
--- a/convert.c
+++ b/convert.c
@@ -4,6 +4,7 @@
 #include "quote.h"
 #include "sigchain.h"
 #include "pkt-line.h"
+#include "sub-process.h"
 
 /*
  * convert.c - convert a file when checking it out and checking it in.
@@ -496,85 +497,11 @@ static int apply_single_file_filter(const char *path, 
const char *src, size_t le
 #define CAP_CLEAN(1u<<0)
 #define CAP_SMUDGE   (1u<<1)
 
-struct subprocess_entry {
-   struct hashmap_entry ent; /* must be the first member! */
-   const char *cmd;
-   struct child_process process;
-};
-
 struct cmd2process {
struct subprocess_entry subprocess; /* must be the first member! */
unsigned int supported_capabilities;
 };
 
-static int subprocess_map_initialized;
-static struct hashmap subprocess_map;
-
-static int cmd2process_cmp(const struct subprocess_entry *e1,
-  const struct subprocess_entry *e2,
-  const void *unused)
-{
-   return strcmp(e1->cmd, e2->cmd);
-}
-
-static struct subprocess_entry *subprocess_find_entry(const char *cmd)
-{
-   struct subprocess_entry key;
-
-   if (!subprocess_map_initialized) {
-   subprocess_map_initialized = 1;
-   hashmap_init(_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
-   return NULL;
-   }
-
-   hashmap_entry_init(, strhash(cmd));
-   key.cmd = cmd;
-   return hashmap_get(_map, , NULL);
-}
-
-static void subprocess_read_status(int fd, struct strbuf *status)
-{
-   struct strbuf **pair;
-   char *line;
-   for (;;) {
-   line = packet_read_line(fd, NULL);
-   if (!line)
-   break;
-   pair = strbuf_split_str(line, '=', 2);
-   if (pair[0] && pair[0]->len && pair[1]) {
-   /* the last "status=" line wins */
-   if (!strcmp(pair[0]->buf, "status=")) {
-   strbuf_reset(status);
-   strbuf_addbuf(status, pair[1]);
-   }
-   }
-   strbuf_list_free(pair);

[PATCH v3 8/8] convert: Update subprocess_read_status to not die on EOF

2017-03-29 Thread Ben Peart
Enable sub-processes to gracefully handle when the process dies by
updating subprocess_read_status to return an error on EOF instead of
dying.

Update apply_multi_file_filter to take advantage of the revised
subprocess_read_status.

Signed-off-by: Ben Peart 
---
 convert.c | 10 --
 sub-process.c | 10 +++---
 sub-process.h |  2 +-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/convert.c b/convert.c
index baa41da760..9e181e27ad 100644
--- a/convert.c
+++ b/convert.c
@@ -629,7 +629,10 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
if (err)
goto done;
 
-   subprocess_read_status(process->out, _status);
+   err = subprocess_read_status(process->out, _status);
+   if (err)
+   goto done;
+
err = strcmp(filter_status.buf, "success");
if (err)
goto done;
@@ -638,7 +641,10 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
if (err)
goto done;
 
-   subprocess_read_status(process->out, _status);
+   err = subprocess_read_status(process->out, _status);
+   if (err)
+   goto done;
+
err = strcmp(filter_status.buf, "success");
 
 done:
diff --git a/sub-process.c b/sub-process.c
index 2c4d27c193..202d7d0255 100644
--- a/sub-process.c
+++ b/sub-process.c
@@ -30,13 +30,15 @@ struct subprocess_entry *subprocess_find_entry(const char 
*cmd)
return hashmap_get(_map, , NULL);
 }
 
-void subprocess_read_status(int fd, struct strbuf *status)
+int subprocess_read_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
char *line;
+   int len;
+
for (;;) {
-   line = packet_read_line(fd, NULL);
-   if (!line)
+   len = packet_read_line_gently(fd, NULL, );
+   if ((len == -1) || !line)
break;
pair = strbuf_split_str(line, '=', 2);
if (pair[0] && pair[0]->len && pair[1]) {
@@ -48,6 +50,8 @@ void subprocess_read_status(int fd, struct strbuf *status)
}
strbuf_list_free(pair);
}
+
+   return len == -1 ? len : 0;
 }
 
 void subprocess_stop(struct subprocess_entry *entry)
diff --git a/sub-process.h b/sub-process.h
index 0cf1760a0a..5a1ce0 100644
--- a/sub-process.h
+++ b/sub-process.h
@@ -41,6 +41,6 @@ static inline struct child_process 
*subprocess_get_child_process(
  * key/value pairs and return the value from the last "status" packet
  */
 
-void subprocess_read_status(int fd, struct strbuf *status);
+int subprocess_read_status(int fd, struct strbuf *status);
 
 #endif
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v3 1/8] pkt-line: add packet_writel() and packet_read_line_gently()

2017-03-29 Thread Ben Peart
Add packet_writel() which writes multiple lines in a single call and
then calls packet_flush_gently(). Add packet_read_line_gently() to
enable reading a line without dying on EOF.

Signed-off-by: Ben Peart 
---
 pkt-line.c | 31 +++
 pkt-line.h | 11 +++
 2 files changed, 42 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index d4b6bfe076..2788aa1af6 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -171,6 +171,25 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...)
return status;
 }
 
+int packet_writel(int fd, const char *line, ...)
+{
+   va_list args;
+   int err;
+   va_start(args, line);
+   for (;;) {
+   if (!line)
+   break;
+   if (strlen(line) > LARGE_PACKET_DATA_MAX)
+   return -1;
+   err = packet_write_fmt_gently(fd, "%s\n", line);
+   if (err)
+   return err;
+   line = va_arg(args, const char*);
+   }
+   va_end(args);
+   return packet_flush_gently(fd);
+}
+
 static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 {
static char packet_write_buffer[LARGE_PACKET_MAX];
@@ -323,6 +342,18 @@ char *packet_read_line(int fd, int *len_p)
return packet_read_line_generic(fd, NULL, NULL, len_p);
 }
 
+int packet_read_line_gently(int fd, int *dst_len, char** dst_line)
+{
+   int len = packet_read(fd, NULL, NULL,
+   packet_buffer, sizeof(packet_buffer),
+   PACKET_READ_CHOMP_NEWLINE|PACKET_READ_GENTLE_ON_EOF);
+   if (dst_len)
+   *dst_len = len;
+   if (dst_line)
+   *dst_line = (len > 0) ? packet_buffer : NULL;
+   return len;
+}
+
 char *packet_read_line_buf(char **src, size_t *src_len, int *dst_len)
 {
return packet_read_line_generic(-1, src, src_len, dst_len);
diff --git a/pkt-line.h b/pkt-line.h
index 18eac64830..cb3eda9695 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -25,6 +25,7 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
+int packet_writel(int fd, const char *line, ...);
 int write_packetized_from_fd(int fd_in, int fd_out);
 int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 
@@ -74,6 +75,16 @@ int packet_read(int fd, char **src_buffer, size_t *src_len, 
char
 char *packet_read_line(int fd, int *size);
 
 /*
+ * Convenience wrapper for packet_read that sets the PACKET_READ_GENTLE_ON_EOF
+ * and CHOMP_NEWLINE options. The return value specifies the number of bytes
+ * read into the buffer or -1 on truncated input. if the *dst_line parameter
+ * is not NULL it will return NULL for a flush packet and otherwise points to
+ * a static buffer (that may be overwritten by subsequent calls). If the size
+ * parameter is not NULL, the length of the packet is written to it.
+ */
+int packet_read_line_gently(int fd, int *size, char** dst_line);
+
+/*
  * Same as packet_read_line, but read from a buf rather than a descriptor;
  * see packet_read for details on how src_* is used.
  */
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v3 0/8] refactor the filter process code into a reusable module

2017-03-29 Thread Ben Peart
Refactor the filter..process code into a separate sub-process
module that can be used to reduce the cost of starting up a sub-process
for multiple commands.  It does this by keeping the external process
running and processing all commands by communicating over standard input
and standard output using the packet format (pkt-line) based protocol.
Full documentation is in Documentation/technical/api-sub-process.txt.

This code is refactored from:

Commit edcc85814c ("convert: add filter..process option", 
2016-10-16)
keeps the external process running and processes all commands


Ben Peart (8):
  pkt-line: add packet_writel() and packet_read_line_gently()
  convert: Update convert to use new packet_writel() function
  convert: Split start_multi_file_filter into two separate functions
  convert: Separate generic structures and variables from the filter
specific ones
  convert: Update generic functions to only use generic data structures
  convert: rename reusable sub-process functions
  sub-process: move sub-process functions into separate files
  convert: Update subprocess_read_status to not die on EOF

 Documentation/technical/api-sub-process.txt |  54 ++
 Makefile|   1 +
 convert.c   | 159 +---
 pkt-line.c  |  31 ++
 pkt-line.h  |  11 ++
 sub-process.c   | 120 +
 sub-process.h   |  46 
 7 files changed, 292 insertions(+), 130 deletions(-)
 create mode 100644 Documentation/technical/api-sub-process.txt
 create mode 100644 sub-process.c
 create mode 100644 sub-process.h

-- 
2.12.1.gvfs.1.18.ge47db72



Re: [PATCH v2 16/21] Make sha1_array_append take a struct object_id *

2017-03-29 Thread Junio C Hamano
"brian m. carlson"  writes:

> On Tue, Mar 28, 2017 at 10:27:41AM -0700, Junio C Hamano wrote:
>> ... 
>> After all, the original written by a human said E2[E3].hash (or
>> array->sha1[i]) because to the human's mind, E2 is a series of
>> things that can be indexed with an int E3, and even though 
>> 
>> *(E2 + E3)
>> E2[E3]
>> E3[E2]
>> 
>> all mean the same thing, the human decided that E2[E3] is the most
>> natural way to express this particular reference to an item in the
>> array.  [E3] would keep that intention by the original author
>> better than E2 + E3.
>
> I'm happy to make that change.  I'm an experienced C programmer, so a
> pointer addition seems very readable and natural to me, but if you think
> it's better or more readable as [E3], I can certainly reroll.

The obvious three possible variants I listed above are equivalent to
the language lawyers and the compiler ;-).

This change is about dropping the need for ".hash", and I think a
faithful, boring and mechanical conversion that tries to preserve
the intent of the original author would be more appropriate.  It is
entirely possible that some places where the original said E2[E3]
were easier to understand if it were *(E2 + E3), thus we may want to
further rewrite such a place to (E2 + E3) instead of [E3] after
the mechanical conversion.

Thanks.


[PATCH v2 0/3] name-rev sanity

2017-03-29 Thread Michael J Gruber
So here is v2 of the name-rev series, the result of our discussions being:

Junio C Hamano (2):
  name-rev: refactor logic to see if a new candidate is a better name
  name-rev: favor describing with tags and use committer date to
tiebreak

That second patch is slighty changed as discussed, but still mostly Junio's

Michael J Gruber (1):
  name-rev: provide debug output

This replaces the patch which documented that --debug does not work with 
--contains :)

 builtin/describe.c |   2 +
 builtin/name-rev.c | 117 +
 t/t4202-log.sh |   2 +-
 3 files changed, 103 insertions(+), 18 deletions(-)

-- 
2.12.2.712.gba4c48c431



[PATCH v2 2/3] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-29 Thread Michael J Gruber
From: Junio C Hamano 

"git name-rev" assigned a phony "far in the future" date to tips of
refs that are not pointing at tag objects, and favored names based
on a ref with the oldest date.  This made it almost impossible for
an unannotated tags and branches to be counted as a viable base,
which was especially problematic when the command is run with the
"--tags" option.  If an unannotated tag that points at an ancient
commit and an annotated tag that points at a much newer commit
reaches the commit that is being named, the old unannotated tag was
ignored.

Update the "taggerdate" field of the rev-name structure, which is
initialized from the tip of ref, to have the committer date if the
object at the tip of ref is a commit, not a tag, so that we can
optionally take it into account when doing "is this name better?"
comparison logic.

When "name-rev" is run without the "--tags" option, the general
expectation is still to name the commit based on a tag if possible,
but use non-tag refs as fallback, and tiebreak among these non-tag
refs by favoring names with shorter hops from the tip.  The use of a
phony "far in the future" date in the original code was an effective
way to ensure this expectation is held: a non-tag tip gets the same
"far in the future" timestamp, giving precedence to tags, and among
non-tag tips, names with shorter hops are preferred over longer
hops, without taking the "taggerdate" into account.  As we are
taking over the "taggerdate" field to store the committer date for
tips with commits:

 (1) keep the original logic when comparing names based on two refs
 both of which are from refs/tags/;

 (2) favoring a name based on a ref in refs/tags/ hierarchy over
 a ref outside the hierarchy;

 (3) between two names based on a ref both outside refs/tags/, give
 precedence to a name with shorter hops and use "taggerdate"
 only to tie-break.

A change to t4202 is a natural consequence.  The test creates a
commit on a branch "side" and points at it with an unannotated tag
"refs/tags/side-2".  The original code couldn't decide which one to
favor at all, and gave a name based on a branch (simply because
refs/heads/side sorts earlier than refs/tags/side-2).  Because the
updated logic is taught to favor refs in refs/tags/ hierarchy, the
the test is updated to expect to see tags/side-2 instead.

Signed-off-by: Junio C Hamano 
---

 * I am sure others can add better heurisitics in is_better_name()
   for comparing names based on non-tag refs, and I do not mind
   people disagreeing with the logic that this patch happens to
   implement.  This is done primarily to illustrate the value of
   using a separate helper function is_better_name() instead of
   open-coding the selection logic in name_rev() function.
Signed-off-by: Michael J Gruber 
---
 builtin/name-rev.c | 53 -
 t/t4202-log.sh |  2 +-
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 7890f826ce..bf7ed015ae 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -13,6 +13,7 @@ typedef struct rev_name {
unsigned long taggerdate;
int generation;
int distance;
+   int from_tag;
 } rev_name;
 
 static long cutoff = LONG_MAX;
@@ -24,16 +25,43 @@ static int is_better_name(struct rev_name *name,
  const char *tip_name,
  unsigned long taggerdate,
  int generation,
- int distance)
+ int distance,
+ int from_tag)
 {
-   return (name->taggerdate > taggerdate ||
-   (name->taggerdate == taggerdate &&
-name->distance > distance));
+   /*
+* When comparing names based on tags, prefer names
+* based on the older tag, even if it is farther away.
+*/
+   if (from_tag && name->from_tag)
+   return (name->taggerdate > taggerdate ||
+   (name->taggerdate == taggerdate &&
+name->distance > distance));
+
+   /*
+* We know that at least one of them is a non-tag at this point.
+* favor a tag over a non-tag.
+*/
+   if (name->from_tag != from_tag)
+   return from_tag;
+
+   /*
+* We are now looking at two non-tags.  Tiebreak to favor
+* shorter hops.
+*/
+   if (name->distance != distance)
+   return name->distance > distance;
+
+   /* ... or tiebreak to favor older date */
+   if (name->taggerdate != taggerdate)
+   return name->taggerdate > taggerdate;
+
+   /* keep the current one if we cannot decide */
+   return 0;
 }
 
 static void name_rev(struct commit *commit,
const char *tip_name, unsigned long taggerdate,
-   int generation, int distance,
+

[PATCH v2 3/3] name-rev: provide debug output

2017-03-29 Thread Michael J Gruber
Currently, `git describe --contains --debug` does not create any debug
output because it does not pass the flag down to `git name-rev`, which
does not know that flag.

Teach the latter that flag and the former to pass it down.

The output is patterned after that of `git describe --debug`, with the
following differences:

describe loops over all args to describe, then over all possible
descriptions; name-rev does it the other way round. Therefore, we need
to amend each possible description by the arg that it is for (and we
leave out the "searching to describe" header).

The date cut-off for name-rev kicks in way more often than the candidate
number cut-off of describe, so we do not clutter the output with the
cut-off.

Signed-off-by: Michael J Gruber 
---
 builtin/describe.c |  2 ++
 builtin/name-rev.c | 64 +++---
 2 files changed, 53 insertions(+), 13 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index a5cd8c513f..30196793f0 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -486,6 +486,8 @@ int cmd_describe(int argc, const char **argv, const char 
*prefix)
 NULL);
if (always)
argv_array_push(, "--always");
+   if (debug)
+   argv_array_push(, "--debug");
if (!all) {
argv_array_push(, "--tags");
for_each_string_list_item(item, )
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index bf7ed015ae..51302a79ba 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -18,6 +18,10 @@ typedef struct rev_name {
 
 static long cutoff = LONG_MAX;
 
+static const char *prio_names[] = {
+   N_("head"), N_("lightweight"), N_("annotated"),
+};
+
 /* How many generations are maximally preferred over _one_ merge traversal? */
 #define MERGE_TRAVERSAL_WEIGHT 65535
 
@@ -59,10 +63,19 @@ static int is_better_name(struct rev_name *name,
return 0;
 }
 
+struct name_ref_data {
+   int tags_only;
+   int name_only;
+   int debug;
+   struct string_list ref_filters;
+   struct string_list exclude_filters;
+   struct object_array *revs;
+};
+
 static void name_rev(struct commit *commit,
const char *tip_name, unsigned long taggerdate,
int generation, int distance, int from_tag,
-   int deref)
+   int deref, struct name_ref_data *data)
 {
struct rev_name *name = (struct rev_name *)commit->util;
struct commit_list *parents;
@@ -75,6 +88,7 @@ static void name_rev(struct commit *commit,
 
if (deref) {
tip_name = xstrfmt("%s^0", tip_name);
+   from_tag += 1;
 
if (generation)
die("generation: %d, but deref?", generation);
@@ -87,6 +101,36 @@ static void name_rev(struct commit *commit,
} else if (is_better_name(name, tip_name, taggerdate,
  generation, distance, from_tag)) {
 copy_data:
+   if (data->debug) {
+   int i;
+   static int label_width = -1;
+   static int name_width = -1;
+   if (label_width < 0) {
+   int w;
+   for (i = 0; i < ARRAY_SIZE(prio_names); i++) {
+   w = strlen(_(prio_names[i]));
+   if (label_width < w)
+   label_width = w;
+   }
+   }
+   if (name_width < 0) {
+   int w;
+   for (i = 0; i < data->revs->nr; i++) {
+   w = strlen(data->revs->objects[i].name);
+   if (name_width < w)
+   name_width = w;
+   }
+   }
+   for (i = 0; i < data->revs->nr; i++)
+   if (!oidcmp(>object.oid,
+   >revs->objects[i].item->oid))
+   fprintf(stderr, " %-*s %8d %-*s 
%s~%d\n",
+   label_width,
+   _(prio_names[from_tag]),
+   distance, name_width,
+   data->revs->objects[i].name,
+   tip_name, generation);
+   }
name->tip_name = tip_name;
name->taggerdate = taggerdate;
name->generation = generation;
@@ -112,11 +156,11 @@ static void name_rev(struct commit *commit,
 

[PATCH v2 1/3] name-rev: refactor logic to see if a new candidate is a better name

2017-03-29 Thread Michael J Gruber
From: Junio C Hamano 

When we encounter a new ref that could describe the commit we are
looking at, we compare the name that is formed using that ref and
the name we found so far and pick a better one.

Factor the comparison logic out to a separate helper function, while
keeping the current logic the same (i.e. a name that is based on an
older tag is better, and if two tags of the same age can reach the
commit, the one with fewer number of hops to reach the commit is
better).

Signed-off-by: Junio C Hamano 
Signed-off-by: Michael J Gruber 
---
 builtin/name-rev.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 8bdc3eaa6f..7890f826ce 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -20,6 +20,17 @@ static long cutoff = LONG_MAX;
 /* How many generations are maximally preferred over _one_ merge traversal? */
 #define MERGE_TRAVERSAL_WEIGHT 65535
 
+static int is_better_name(struct rev_name *name,
+ const char *tip_name,
+ unsigned long taggerdate,
+ int generation,
+ int distance)
+{
+   return (name->taggerdate > taggerdate ||
+   (name->taggerdate == taggerdate &&
+name->distance > distance));
+}
+
 static void name_rev(struct commit *commit,
const char *tip_name, unsigned long taggerdate,
int generation, int distance,
@@ -45,9 +56,8 @@ static void name_rev(struct commit *commit,
name = xmalloc(sizeof(rev_name));
commit->util = name;
goto copy_data;
-   } else if (name->taggerdate > taggerdate ||
-   (name->taggerdate == taggerdate &&
-name->distance > distance)) {
+   } else if (is_better_name(name, tip_name, taggerdate,
+ generation, distance)) {
 copy_data:
name->tip_name = tip_name;
name->taggerdate = taggerdate;
-- 
2.12.2.712.gba4c48c431



[PATCH v3] perl: regenerate perl.mak if perl -V changes

2017-03-29 Thread Ævar Arnfjörð Bjarmason
Change the perl/perl.mak build process so that the file is regenerated
if the output of "perl -V" changes.

Before this change updating e.g. /usr/bin/perl to a new major version
would cause the next "make" command to fail, since perl.mak has
hardcoded paths to perl library paths retrieved from its first run.

Now the logic added in commit ee9be06770 ("perl: detect new files in
MakeMaker builds", 2012-07-27) is extended to regenerate
perl/perl.mak if there's any change to "perl -V".

This will in some cases redundantly trigger perl/perl.mak to be
re-made, e.g. if @INC is modified in ways the build process doesn't
care about through sitecustomize.pl, but the common case is that we
just do the right thing and re-generate perl/perl.mak when needed.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

On Wed, Mar 29, 2017 at 3:36 PM,   wrote:
> Am 29.03.2017 um 15:33 schrieb Ævar Arnfjörð Bjarmason:
> [...]
>> Now the logic added in commit ee9be06770 ("perl: detect new files in
>> MakeMaker builds", 2012-07-27) is extended to regeneratio
>
> s/regeneratio/regenerate/
>
>> [...]
>
>
> /S

Thanks!


 Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Makefile b/Makefile
index 9f8b35ad41..485c453ca2 100644
--- a/Makefile
+++ b/Makefile
@@ -1851,6 +1851,7 @@ perl/perl.mak: perl/PM.stamp
 
 perl/PM.stamp: FORCE
@$(FIND) perl -type f -name '*.pm' | sort >$@+ && \
+   $(PERL_PATH) -V >>$@+ && \
{ cmp $@+ $@ >/dev/null 2>/dev/null || mv $@+ $@; } && \
$(RM) $@+
 
-- 
2.11.0



Re: [PATCH v2] perl: regenerate perl.mak if perl -V changes

2017-03-29 Thread stefan.naewe
Am 29.03.2017 um 15:33 schrieb Ævar Arnfjörð Bjarmason:
> Change the perl/perl.mak build process so that the file is re-made if
> the output of "perl -V" changes.
> 
> Before this change updating e.g. /usr/bin/perl to a new major version
> would cause the next "make" command to fail, since perl.mak has
> hardcoded paths to perl library paths retrieved from its first run.
> 
> Now the logic added in commit ee9be06770 ("perl: detect new files in
> MakeMaker builds", 2012-07-27) is extended to regeneratio

s/regeneratio/regenerate/

> [...]


/S
-- 

/dev/random says: HELP! I need a tagline. HELP! Not just any tagline.
python -c "print 
'73746566616e2e6e616577654061746c61732d656c656b74726f6e696b2e636f6d'.decode('hex')"
 
GPG Key fingerprint = 2DF5 E01B 09C3 7501 BCA9  9666 829B 49C5 9221 27AF

[PATCH v2] perl: regenerate perl.mak if perl -V changes

2017-03-29 Thread Ævar Arnfjörð Bjarmason
Change the perl/perl.mak build process so that the file is re-made if
the output of "perl -V" changes.

Before this change updating e.g. /usr/bin/perl to a new major version
would cause the next "make" command to fail, since perl.mak has
hardcoded paths to perl library paths retrieved from its first run.

Now the logic added in commit ee9be06770 ("perl: detect new files in
MakeMaker builds", 2012-07-27) is extended to regeneratio
perl/perl.mak if there's any change to "perl -V".

This will in some cases redundantly trigger perl/perl.mak to be
re-made, e.g. if @INC is modified in ways the build process doesn't
care about through sitecustomize.pl, but the common case is that we
just do the right thing and re-generate perl/perl.mak when needed.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

Maybe this'll set some sort of record for a v2 submission, but anyway,
this should clearly be >> not >, we don't want to overwrite the list
of *.pm files we just added.

 Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Makefile b/Makefile
index 9f8b35ad41..485c453ca2 100644
--- a/Makefile
+++ b/Makefile
@@ -1851,6 +1851,7 @@ perl/perl.mak: perl/PM.stamp
 
 perl/PM.stamp: FORCE
@$(FIND) perl -type f -name '*.pm' | sort >$@+ && \
+   $(PERL_PATH) -V >>$@+ && \
{ cmp $@+ $@ >/dev/null 2>/dev/null || mv $@+ $@; } && \
$(RM) $@+
 
-- 
2.11.0



[PATCH] perl: regenerate perl.mak if perl -V changes

2017-03-29 Thread Ævar Arnfjörð Bjarmason
Change the perl/perl.mak build process so that the file is re-made if
the output of "perl -V" changes.

Before this change updating e.g. /usr/bin/perl to a new major version
would cause the next "make" command to fail, since perl.mak has
hardcoded paths to perl library paths retrieved from its first run.

Now the logic added in commit ee9be06770 ("perl: detect new files in
MakeMaker builds", 2012-07-27) is extended to regeneratio
perl/perl.mak if there's any change to "perl -V".

This will in some cases redundantly trigger perl/perl.mak to be
re-made, e.g. if @INC is modified in ways the build process doesn't
care about through sitecustomize.pl, but the common case is that we
just do the right thing and re-generate perl/perl.mak when needed.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

On Wed, Mar 29, 2017 at 4:18 AM, Jeff King  wrote:
> On Tue, Mar 28, 2017 at 09:03:43PM -0400, Jeffrey Walton wrote:
>[...]

At first I thought Jeffrey was running into this longstanding issue
with the perl Makefile. Looks like not, and he just wasn't passing
PERL_PATH correctly, but fix this related issue while it's fresh in my
mind.

 Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Makefile b/Makefile
index c80fec2920..c0c5510238 100644
--- a/Makefile
+++ b/Makefile
@@ -1850,6 +1850,7 @@ perl/perl.mak: perl/PM.stamp
 
 perl/PM.stamp: FORCE
@$(FIND) perl -type f -name '*.pm' | sort >$@+ && \
+   $(PERL_PATH) -V >$@+ && \
{ cmp $@+ $@ >/dev/null 2>/dev/null || mv $@+ $@; } && \
$(RM) $@+
 
-- 
2.11.0



Re: Git hackathon New York / London - call for mentors

2017-03-29 Thread Christian Couder
Hi Charles,

I maybe available to mentor. I am in Paris so closer to London.
I have no date restriction for the weekends after April 9th yet.

Thanks,
Christian.

On Wed, Mar 29, 2017 at 12:18 AM, Charles Bailey  wrote:
> Bloomberg would like to host a Git hackathon over a weekend in both New
> York and London, towards the end of April or the beginning of May.
>
> Crucial to the success of the weekend will be having mentors available
> in both locations who can guide people on the project. Mentors should
> have some experience with developing for Git and should be familiar with
> the process and guidelines around contributing.
>
> If you are interested in being a mentor or have further questions, then
> please get in contact with me via email (either to this address or to
> cbaile...@bloomberg.net) letting me know whether you are closer to New
> York or London and if you have any date restrictions.
>
> Charles.
>
> ---
>
> Git was the first project for which we hosted an "Open Source Day" and
> since then we've learned a lot and would like to revisit Git again.
>
> The event will involve volunteers who are usually competent programmers
> but who don't necessarily have experience with contributing to Git,
> working to contribute to the project over two days. Typically the type
> of tasks tackled would include documentation improvements, test case
> improvements and very simple bug fixes that have previously been
> identified as "low hanging fruit".


Re: [RFC] should these two topics graduate to 'master' soon?

2017-03-29 Thread Jon Loeliger
So, like, Junio C Hamano said:
> There are two topics that are marked as "Will cook in 'next'" for
> practically forever in the "What's cooking" reports.  The world may
> have become ready for one or both of them, in which case we should
> do the merge not too late in the cycle.
> 
> * jc/merge-drop-old-syntax (2015-04-29) 1 commit
> 
> 
> * jk/no-looking-at-dotgit-outside-repo-final (2016-10-26) 1 commit
> 
> Opinions?

While I have no technical data on these issues, I think
they should go in.

jdl


Re: [PATCH v2 2/3] sequencer: make commit options more extensible

2017-03-29 Thread Johannes Schindelin
Hi Junio,

On Mon, 27 Mar 2017, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > As this thing is about fixing a regression visible to end users, I
> > may get around to fixing things up myself, but I have other topics
> > to attend to, so...
> 
> So I ended up with this version before merging it to 'next'.  

Thanks,
Dscho


Re: [PATCH v4 2/5] dir_iterator: iterate over dir after its contents

2017-03-29 Thread Michael Haggerty
On 03/29/2017 11:56 AM, Michael Haggerty wrote:
> On 03/29/2017 02:32 AM, Daniel Ferreira wrote:
>> [...]
> [...]
> The disagreement is not a surprise, because there isn't a corresponding
> coding error in the code below that returns the directory itself in a
> post-order iteration. The net result appears to be that there is no
> recursion at all into subdirectories when `DIR_ITERATOR_DEPTH_FIRST` is
> set. So due to this bug, we get neither a correct post-order iteration
> nor a correct pre-order iteration with the new option.

Correction: the second-to-last sentence should read "there is no
recursion at all into subdirectories when `DIR_ITERATOR_DEPTH_FIRST` is
*NOT* set.

Michael



Re: [PATCH v6 18/27] files-backend: replace submodule_allowed check in files_downcast()

2017-03-29 Thread Michael Haggerty
On 03/26/2017 04:16 AM, Duy Nguyen wrote:
> On Mon, Mar 20, 2017 at 4:18 AM, Michael Haggerty  
> wrote:
>>> +/* ref_store_init flags */
>>> +#define REF_STORE_READ   (1 << 0)
>>
>> I asked [1] in reply to v5 whether `REF_STORE_READ` is really necessary
>> but I don't think you replied. Surely a reference store that we can't
>> read would be useless? Can't we just assume that any `ref_store` is
>> readable and drop this constant?
> 
> I deleted it then I realized many lines like these
> 
> files_downcast(ref_store, REF_STORE_READ, "read_raw_ref");
> 
> would become
> 
> files_downcast(ref_store, 0, "read_raw_ref");
> 
> I think for newcomers, the former gives a better hint what the second
> argument is than a plain zero in the latter, so I'm inclined to keep
> it.

That's OK with me; I don't feel strongly about it.

Michael



Re: [PATCH v4 5/5] files_reflog_iterator: amend use of dir_iterator

2017-03-29 Thread Michael Haggerty
On 03/29/2017 02:32 AM, Daniel Ferreira wrote:
> Amend a call to dir_iterator_begin() to pass the flags parameter
> introduced in 3efb5c0 ("dir_iterator: iterate over dir after its
> contents", 2017-28-03).
> 
> Signed-off-by: Daniel Ferreira 
> ---
>  refs/files-backend.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 50188e9..b4bba74 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -3346,7 +3346,7 @@ static struct ref_iterator 
> *files_reflog_iterator_begin(struct ref_store *ref_st
>   files_downcast(ref_store, 0, "reflog_iterator_begin");
> 
>   base_ref_iterator_init(ref_iterator, _reflog_iterator_vtable);
> - iter->dir_iterator = dir_iterator_begin(git_path("logs"));
> + iter->dir_iterator = dir_iterator_begin(git_path("logs"), 0);
>   return ref_iterator;
>  }

As mentioned earlier, this patch should be squashed into patch [2/5].

Michael



Re: [PATCH v4 3/5] remove_subtree(): reimplement using iterators

2017-03-29 Thread Michael Haggerty
On 03/29/2017 02:32 AM, Daniel Ferreira wrote:
> Use dir_iterator to traverse through remove_subtree()'s directory tree,
> avoiding the need for recursive calls to readdir(). Simplify
> remove_subtree()'s code.
> 
> A conversion similar in purpose was previously done at 46d092a
> ("for_each_reflog(): reimplement using iterators", 2016-05-21).
> 
> Signed-off-by: Daniel Ferreira 
> ---
>  entry.c | 31 ++-
>  1 file changed, 10 insertions(+), 21 deletions(-)
> 
> diff --git a/entry.c b/entry.c
> index c6eea24..bbebd16 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -2,6 +2,8 @@
>  #include "blob.h"
>  #include "dir.h"
>  #include "streaming.h"
> +#include "iterator.h"
> +#include "dir-iterator.h"
> 
>  static void create_directories(const char *path, int path_len,
>  const struct checkout *state)
> @@ -46,29 +48,16 @@ static void create_directories(const char *path, int 
> path_len,
> 
>  static void remove_subtree(struct strbuf *path)

The reason that this function took a `strbuf` as argument was that it
used to modify the strbuf while it was working. Now that `dir_iterator`
does all of that work, you could simplify it to take a `const char *`
argument, and change the caller to pass `path.buf`.

>  {
> - DIR *dir = opendir(path->buf);
> - struct dirent *de;
> - int origlen = path->len;
> -
> - if (!dir)
> - die_errno("cannot opendir '%s'", path->buf);
> - while ((de = readdir(dir)) != NULL) {
> - struct stat st;
> -
> - if (is_dot_or_dotdot(de->d_name))
> - continue;
> -
> - strbuf_addch(path, '/');
> - strbuf_addstr(path, de->d_name);
> - if (lstat(path->buf, ))
> - die_errno("cannot lstat '%s'", path->buf);
> - if (S_ISDIR(st.st_mode))
> - remove_subtree(path);
> - else if (unlink(path->buf))
> + struct dir_iterator *diter = dir_iterator_begin(path->buf, 
> DIR_ITERATOR_DEPTH_FIRST);
> +
> + while (dir_iterator_advance(diter) == ITER_OK) {
> + if (S_ISDIR(diter->st.st_mode)) {
> + if (rmdir(diter->path.buf))
> + die_errno("cannot rmdir '%s'", path->buf);
> + } else if (unlink(diter->path.buf))
>   die_errno("cannot unlink '%s'", path->buf);

The two `die_errno()` calls must be changed to use `diter->path.buf`
rather than `path->buf`.

> - strbuf_setlen(path, origlen);
>   }
> - closedir(dir);
> +
>   if (rmdir(path->buf))
>   die_errno("cannot rmdir '%s'", path->buf);
>  }

Michael



Re: [PATCH v4 2/5] dir_iterator: iterate over dir after its contents

2017-03-29 Thread Michael Haggerty
On 03/29/2017 02:32 AM, Daniel Ferreira wrote:
> Create an option for the dir_iterator API to iterate over subdirectories
> only after having iterated through their contents. This feature was
> predicted, although not implemented by 0fe5043 ("dir_iterator: new API
> for iterating over a directory tree", 2016-06-18).

I admire your ambition at taking on this project. Even though the
`dir_iterator` code is fairly short, it is quite intricate and it took
me quite a bit of thought to get it right. Don't be discouraged by how
many iterations it takes to get a change like this accepted.

When reviewing your changes, I realized that
`dir_iterator_level::initialized` and `dir_iterator_level::dir_state`
are somewhat redundant with each other in the pre-existing code. It
would be possible to add a third state, say `DIR_STATE_PUSH`, to the
enum, and use that to replace the state that we now call `!initialized`.

I'm not demanding that change, but it might make the bookkeeping in the
new code a little bit easier to understand, because the logic that
decides what to do based on setting of the new option would either
transition the level from `DIR_STATE_PUSH -> DIR_STATE_ITER ->
DIR_STATE_RECURSE` (for pre-order traversal) or `DIR_STATE_PUSH ->
DIR_STATE_RECURSE -> DIR_STATE_ITER` (for post-order traversal). I think
this could make the state machine clearer and thereby make it easier to
reason about the code.

> Add the "flags" parameter to dir_iterator_create, allowing for the
> aforementioned "depth-first" iteration mode to be enabled. Currently,
> the only acceptable flag is DIR_ITERATOR_DEPTH_FIRST.
> 
> This is useful for recursively removing a directory and calling rmdir()
> on a directory only after all of its contents have been wiped.

This patch changes the signature of `ref_iterator_begin()` without
adjusting the caller in `refs/files-backend.c`. This means that the code
doesn't even compile after this patch is applied.

The Git project insists that the code compile and all tests pass after
each and every commit. Among other things, this keeps the code
"bisectable" using `git bisect`, which is a very useful property.

I see that you adjust the caller later in the patch series, in
"files_reflog_iterator: amend use of dir_iterator". Please squash that
patch into this one.

I also realize that I made a goof in my comments about v3 of this patch
series. Your new option is not choosing between "depth-first" and
"breadth-first". Both types of iteration are depth-first. Really it is
choosing between pre-order and post-order traversal. So I think it would
be better to name the option `DIR_ITERATOR_POST_ORDER`. Sorry about that.

> ---
>  dir-iterator.c | 46 ++
>  dir-iterator.h | 14 +++---
>  2 files changed, 53 insertions(+), 7 deletions(-)
> 
> diff --git a/dir-iterator.c b/dir-iterator.c
> index 853c040..545d333 100644
> --- a/dir-iterator.c
> +++ b/dir-iterator.c
> @@ -48,6 +48,9 @@ struct dir_iterator_int {
>* that will be included in this iteration.
>*/
>   struct dir_iterator_level *levels;
> +
> + /* Holds the flags passed to dir_iterator_begin(). */
> + unsigned flags;
>  };
> 
>  static inline void push_dir_level(struct dir_iterator_int *iter, struct 
> dir_iterator_level *level)
> @@ -114,12 +117,14 @@ int dir_iterator_advance(struct dir_iterator 
> *dir_iterator)
>   }
> 
>   level->initialized = 1;
> - } else if (S_ISDIR(iter->base.st.st_mode)) {
> + } else if (S_ISDIR(iter->base.st.st_mode) &&
> + !iter->flags & DIR_ITERATOR_DEPTH_FIRST) {

You need parentheses on the previous line:

!(iter->flags & DIR_ITERATOR_DEPTH_FIRST)) {

because otherwise it is interpreted as

(!iter->flags) & DIR_ITERATOR_DEPTH_FIRST) {

BTW, you should compile with stricter compiler options so that your
compiler warns you about problems like this. This is documented in
`Documentation/CodingGuidelines` (along with lots more useful information):

 - As a Git developer we assume you have a reasonably modern compiler
   and we recommend you to enable the DEVELOPER makefile knob to
   ensure your patch is clear of all compiler warnings we care about,
   by e.g. "echo DEVELOPER=1 >>config.mak".

Also, this line is not indented far enough. It should probably line up
with the inside of the opening parenthesis of the preceding line, like

> + } else if (S_ISDIR(iter->base.st.st_mode) &&
> +!(iter->flags & DIR_ITERATOR_DEPTH_FIRST)) {

But let's dig a little bit deeper. Why don't the tests catch this coding
error? Currently there is only one option, `DIR_ITERATOR_DEPTH_FIRST`,
so if that constant were defined to be 1 as expected, the code would
accidentally work anyway (though it would break if another flag is ever
added). But in fact, you define `DIR_ITERATOR_DEPTH_FIRST` to be 2
(probably unintentionally; see 

FROM AISHA GADDAFI

2017-03-29 Thread aishaga...@ono.com
for more details,contact me at my private email : agaddafi752(at)gmail.
com
Dr. Aisha Gaddafi, the daughter of late Libyan president, I need a 
partner or an investor that will help me in investing the sum of $87.5 
million USD in his or her country, the funds is deposited here in 
Burkina Faso where I am staying for the moment with my family, I would 
like to know your mind tows the investment  of the total sum mentioned 
above, note you will received  30% of the total sum after the transfer 
is completed into your account, 
for more details,contact me at my private email : agaddafi752(at)gmail.
com
Yours friend Dr. Aisha Gaddafi
 
FROM AISHA GADDAFI


git-compat-util.h:735:13: error: conflicting types for 'inet_ntop'

2017-03-29 Thread Jeffrey Walton
Some more 2.12.2 testing on Solaris 11.3 x86_64:

$ make V=1
gcc -o credential-store.o -c -MF ./.depend/credential-store.o.d -MQ
credential-store.o -MMD -MP -I/usr/local/include -m64 -m64 -I.
-D__EXTENSIONS__ -D__sun__ -DUSE_LIBPCRE -I/usr/local/include
-DHAVE_ALLOCA_H -I/usr/local/include -DUSE_CURL_FOR_IMAP_SEND
-I/usr/local/include -I/usr/local/include -DNO_D_TYPE_IN_DIRENT
-DNO_INET_NTOP -DNO_INET_PTON  -DHAVE_PATHS_H -DHAVE_LIBCHARSET_H
-DHAVE_STRINGS_H -DHAVE_DEV_TTY -DHAVE_CLOCK_GETTIME
-DHAVE_CLOCK_MONOTONIC -DHAVE_GETDELIM -DSHA1_HEADER=''
 -Icompat/regex -DSHELL_PATH='"/bin/bash"' -DPAGER_ENV='"LESS=FRX
LV=-c"'  credential-store.c
In file included from cache.h:4:0,
 from credential-store.c:1:
git-compat-util.h:735:13: error: conflicting types for 'inet_ntop'
 const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 ^
In file included from git-compat-util.h:209:0,
 from cache.h:4,
 from credential-store.c:1:
/usr/include/arpa/inet.h:43:20: note: previous declaration of
'inet_ntop' was here
 extern const char *inet_ntop(int, const void *_RESTRICT_KYWD,
^
make: *** [credential-store.o] Error 1



Looking at git-compat-util.h around line 730:

#ifdef NO_INET_PTON
int inet_pton(int af, const char *src, void *dst);
#endif

#ifdef NO_INET_NTOP
const char *inet_ntop(int af, const void *src, char *dst, size_t size);
#endif



When I grep config.log, I don't see a test that results in NO_INET_*:

$ grep NO_INET config.log
$



According to Solaris' man page for inet_ntop:

inet(3SOCKET)  Sockets Library Functions inet(3SOCKET)

NAME
   inet,  inet6, inet_ntop, inet_pton, inet_aton, inet_addr, inet_network,
   inet_makeaddr, inet_lnaof, inet_netof,  inet_ntoa  -  Internet  address
   manipulation

SYNOPSIS
   cc [ flag... ] file... -lsocket  -lnsl  [ library... ]
   #include 
   #include 
   #include 

   const char *inet_ntop(int af, const void *addr, char *cp,
socklen_t size);

   int inet_pton(int af, const char *cp, void *addr);

   int inet_aton(const char *cp, struct in_addr *addr);

   in_addr_t inet_addr(const char *cp);

   in_addr_t inet_network(const char *cp);
   ...

Jeff


[PATCH] read-cache: avoid git_path() race in freshen_shared_index()

2017-03-29 Thread Christian Couder
When performing an interactive rebase in split-index mode,
the commit message that one should rework when squashing commits
can contain some garbage instead of the usual concatenation of
both of the commit messages.

When bisecting it appears that 94c9b5af70 (Merge branch
'cc/split-index-config', 2017-03-17) is the first bad commit.

But when rebasing cc/split-index-config on top of the commit it
was merged with, the first bad commit is then c3a0082502
(read-cache: use freshen_shared_index() in read_index_from(),
2017-03-06).

This shows that we should be careful not to use git_path() in
freshen_shared_index(). It is using a shared buffer that can
too easily lead to races.

Signed-off-by: Christian Couder 
---
 read-cache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index e447751823..2f10242c24 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1682,9 +1682,10 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
  */
 static void freshen_shared_index(char *base_sha1_hex, int warn)
 {
-   const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
+   char *shared_index = git_pathdup("sharedindex.%s", base_sha1_hex);
if (!check_and_freshen_file(shared_index, 1) && warn)
warning("could not freshen shared index '%s'", shared_index);
+   free(shared_index);
 }
 
 int read_index_from(struct index_state *istate, const char *path)
-- 
2.12.0.339.g3df8399f7e.dirty



Re: [PATCH] contrib/subtree: add "--no-commit" flag for merge and pull

2017-03-29 Thread Mike Lewis
I had defaulted it to "--commit" just to make it absolutely clear what the 
default was going to be, since passing the "--commit" flag to `git-merge` in 
this case doesn’t change the existing behavior. But you’re right that having 
the default be blank matches the existing style better, so I’ll make a new 
patch.

Regarding the "--no-no-commit"  I definitely agree it should be that way, and 
my first attempt at this had it that way. However, `git rev-parse --parseopt` 
seems to rewrite "--commit" to "--no-no-commit" when it detects the "no-commit" 
option in the $OPTS_SPEC variable. Essentially, you still type the argument as 
"--commit", but it’s rewritten to "--no-no-commit" before it gets to the 
argument case block, as "no-commit" is the only documented option. Documenting 
both the "--commit" and "--no-commit" options in $OPTS_SPEC leads to the 
expected resulting argument, but I wasn't sure if both arguments should be 
documented, as none of the other "default" options (for instance, 
"--no-squash") are documented in the help output. But please let me know what 
your thoughts are, and I'll gladly update my patch.

Mike Lewis

> On 29 Mar 2017, at 03:37, David Aguilar  wrote:
> 
> On Sun, Mar 26, 2017 at 03:02:38AM -0400, Mike Lewis wrote:
>> Allows the user to verify and/or change the contents of the merge
>> before committing as necessary
>> 
>> Signed-off-by: Mike Lewis 
>> ---
>> contrib/subtree/git-subtree.sh | 17 +
>> 1 file changed, 13 insertions(+), 4 deletions(-)
>> 
>> diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
>> index dec085a23..c30087485 100755
>> --- a/contrib/subtree/git-subtree.sh
>> +++ b/contrib/subtree/git-subtree.sh
>> @@ -29,6 +29,8 @@ onto= try connecting new tree to an existing one
>> rejoinmerge the new branch back into HEAD
>>  options for 'add', 'merge', and 'pull'
>> squashmerge subtree changes as a single commit
>> + options for 'merge' and 'pull'
>> +no-commit perform the merge, but don't commit
>> "
>> eval "$(echo "$OPTS_SPEC" | git rev-parse --parseopt -- "$@" || echo exit 
>> $?)"
>> 
>> @@ -48,6 +50,7 @@ annotate=
>> squash=
>> message=
>> prefix=
>> +commit_option="--commit"
> 
> It might be simpler to default commit_option= empty like the others, and
> remove the "" double quotes around "$commit_option" indicated below so
> that the shell ignores it when it's empty.
> 
>> 
>> debug () {
>>  if test -n "$debug"
>> @@ -137,6 +140,12 @@ do
>>  --no-squash)
>>  squash=
>>  ;;
>> +--no-commit)
>> +commit_option="--no-commit"
>> +;;
>> +--no-no-commit)
>> +commit_option="--commit"
>> +;;
> 
> "--no-no-commit" should just be "--commit" instead.
> The real flag is called "--commit" (git help merge), so subtree
> should follow suite by supporting "--commit" and "--no-commit" only.
> 
> 
>> @@ -815,17 +824,17 @@ cmd_merge () {
>>  then
>>  if test -n "$message"
>>  then
>> -git merge -s subtree --message="$message" "$rev"
>> +git merge -s subtree --message="$message" 
>> "$commit_option" "$rev"
>  ^
>   ^
>>  else
>> -git merge -s subtree "$rev"
>> +git merge -s subtree "$commit_option" "$rev"
> ^  ^
>>  fi
>> [...]
> 
> -- 
> David



Re: [PATCH] contrib/subtree: add "--no-commit" flag for merge and pull

2017-03-29 Thread David Aguilar
On Sun, Mar 26, 2017 at 03:02:38AM -0400, Mike Lewis wrote:
> Allows the user to verify and/or change the contents of the merge
> before committing as necessary
> 
> Signed-off-by: Mike Lewis 
> ---
>  contrib/subtree/git-subtree.sh | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
> index dec085a23..c30087485 100755
> --- a/contrib/subtree/git-subtree.sh
> +++ b/contrib/subtree/git-subtree.sh
> @@ -29,6 +29,8 @@ onto= try connecting new tree to an existing one
>  rejoinmerge the new branch back into HEAD
>   options for 'add', 'merge', and 'pull'
>  squashmerge subtree changes as a single commit
> + options for 'merge' and 'pull'
> +no-commit perform the merge, but don't commit
>  "
>  eval "$(echo "$OPTS_SPEC" | git rev-parse --parseopt -- "$@" || echo exit 
> $?)"
>  
> @@ -48,6 +50,7 @@ annotate=
>  squash=
>  message=
>  prefix=
> +commit_option="--commit"

It might be simpler to default commit_option= empty like the others, and
remove the "" double quotes around "$commit_option" indicated below so
that the shell ignores it when it's empty.

>  
>  debug () {
>   if test -n "$debug"
> @@ -137,6 +140,12 @@ do
>   --no-squash)
>   squash=
>   ;;
> + --no-commit)
> + commit_option="--no-commit"
> + ;;
> + --no-no-commit)
> + commit_option="--commit"
> + ;;

"--no-no-commit" should just be "--commit" instead.
The real flag is called "--commit" (git help merge), so subtree
should follow suite by supporting "--commit" and "--no-commit" only.


> @@ -815,17 +824,17 @@ cmd_merge () {
>   then
>   if test -n "$message"
>   then
> - git merge -s subtree --message="$message" "$rev"
> + git merge -s subtree --message="$message" 
> "$commit_option" "$rev"
  ^ 
 ^
>   else
> - git merge -s subtree "$rev"
> + git merge -s subtree "$commit_option" "$rev"
 ^  ^
>   fi
>  [...]

-- 
David


[PATCH] Makefile: detect errors in running spatch

2017-03-29 Thread Jeff King
The "make coccicheck" target runs spatch against each source
file. But it does so in a for loop, so "make" never sees the
exit code of spatch. Worse, it redirects stderr to a log
file, so the user has no indication of any failure. And then
to top it all off, because we touched the patch file's
mtime, make will refuse to repeat the command because it
think the target is up-to-date.

So for example:

  $ make coccicheck SPATCH=does-not-exist
  SPATCH contrib/coccinelle/free.cocci
  SPATCH contrib/coccinelle/qsort.cocci
  SPATCH contrib/coccinelle/xstrdup_or_null.cocci
  SPATCH contrib/coccinelle/swap.cocci
  SPATCH contrib/coccinelle/strbuf.cocci
  SPATCH contrib/coccinelle/object_id.cocci
  SPATCH contrib/coccinelle/array.cocci
  $ make coccicheck SPATCH=does-not-exist
  make: Nothing to be done for 'coccicheck'.

With this patch, you get:

  $ make coccicheck SPATCH=does-not-exist
   SPATCH contrib/coccinelle/free.cocci
  /bin/sh: 4: does-not-exist: not found
  Makefile:2338: recipe for target 'contrib/coccinelle/free.cocci.patch' failed
  make: *** [contrib/coccinelle/free.cocci.patch] Error 1

It also dumps the log on failure, so any errors from spatch
itself (like syntax errors in our .cocci files) will be seen
by the user.

Signed-off-by: Jeff King 
---
This is a verbatim repost of:

  
http://public-inbox.org/git/20170310083117.cbflqx7zbe4s7...@sigill.intra.peff.net/

I think this is a strict improvement over the status quo. The
conversation in that thread turned to possible refactorings, but since
those didn't materialize, I think we should apply this in the meantime.

 Makefile | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 9f8b35ad4..9b36068ac 100644
--- a/Makefile
+++ b/Makefile
@@ -2348,9 +2348,17 @@ check: common-cmds.h
 C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
 %.cocci.patch: %.cocci $(C_SOURCES)
@echo '' SPATCH $<; \
+   ret=0; \
for f in $(C_SOURCES); do \
-   $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \
-   done >$@ 2>$@.log; \
+   $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
+   { ret=$$?; break; }; \
+   done >$@+ 2>$@.log; \
+   if test $$ret != 0; \
+   then \
+   cat $@.log; \
+   exit 1; \
+   fi; \
+   mv $@+ $@; \
if test -s $@; \
then \
echo '' SPATCH result: $@; \
-- 
2.12.2.920.gc31091ce4


Re: should git-subtree ignore submodules?

2017-03-29 Thread David Aguilar
On Tue, Nov 01, 2016 at 12:41:50PM -0700, Kirill Katsnelson wrote:
> "git-subtree add" fails to add the subtree into a repository with
> submodules, reporting that the worktree is not clean, while `git status`
> says that everything is clean (including submodules). I tracked the problem
> down to the command "git index HEAD" returning all submodules as having the
> M status:
> 
> $ git diff-index HEAD
> :16 16 d3812c9318c4d0336897fd2d666be908fa1a7953
> d3812c9318c4d0336897fd2d666be908fa1a7953 M  ext/grpc
> 
> $ git --version
> git version 2.9.2.windows.1
> 
> I worked around the problem in my local copy of git-subtree shell script by
> adding "--ignore-submodules=all" to the two invocations of `git diff-index`
> in the ensure_clean() function (direct link
> ).
> 
> I am wondering, is this a defect in git-subtree? To my understanding, the
> command should not care about submodules more than ensuring their worktree
> is not in the way of new prefix, and that's a separate check. So *even if*
> the submodule is modified, this should not be a show stopper for
> "git-subtree add". Or am I missing some subtleties?

That sounds correct to me; subtree should ignore submodules.
-- 
David