From Lyrix Gold Mining Sarl

2017-01-27 Thread Raw Gold Aavailable
Dearest Gold Purchaser,

We are Local Miners of Kenyeba and Tabakoto region in Mali who can visit
us here for the TTM via face to face inspect of the product and
confirmation of purity, We have gold in raw bars form, and we are looking
for a serious buyers to buy our gold also sign a long term contract
relationship with us, Immediately Buyer come down to the origin of the
gold seller, Seller is willing to provide any amount of quantity of raw
gold bars and deliver it together direct to buyer's destination.

--
Let us know if you are interested to purchase raw gold business with us
officially with this trial order of small quantity to enable us know the
buyer capability in business. I appreciate your cooperation and waiting
your soonest comment in this future business.

Best Regards.
LYRIX GOLD SARL & EXPORT -IMPORT

Dr Joseph Camara & Sacko Mamadou Camara (End Seller/ Ceo.

Tell:  +223 683 089 50
Fax :    +223 683 089 56
Skype:    rawgold.available
Email:     i...@lyrixgoldsarl.com
Email:       office.consultant7...@gmail.com
Web:           www.lyrixgoldsarl.com

  ADDRESS: Rue 11 7 Jicoron Para :Port 16 MALI-BAMAKO WEST AFRICA.

  GESCHAFTSFIHRER/PROF. ID KURT VR-BK /73 VR 5293 BKO T-ID-DE1833


Re: show all merge conflicts

2017-01-27 Thread G. Sylvie Davies
On Fri, Jan 27, 2017 at 9:51 AM, Jeff King  wrote:
> On Fri, Jan 27, 2017 at 11:56:08AM -0500, Michael Spiegel wrote:
>
>> I'm trying to determine whether a merge required a conflict to resolve
>> after the merge has occurred. The git book has some advice
>> (https://git-scm.com/book/en/v2/Git-Tools-Advanced-Merging) to use
>> `git show` on the merge commit or use `git log --cc -p -1`. These
>> strategies work when the merge conflict was resolved with a change
>> that is different from either parent. When the conflict is resolved
>> with a change that is the same as one of the parents, then these
>> commands are indistinguishable from a merge that did not conflict. Is
>> it possible to distinguish between a conflict-free merge and a merge
>> conflict that is resolved by with the changes from one the parents?
>
> No. You'd have to replay the merge to know if it would have had
> conflicts.
>


Aside from the usual "git log -cc", I think this should work (replace
HEAD with whichever commit you are analyzing):

git diff --name-only HEAD^2...HEAD^1 > m1
git diff --name-only HEAD^1...HEAD^2 > b1
git diff --name-only HEAD^1..HEAD> m2
git diff --name-only HEAD^2..HEAD> b2

If files listed between m1 and b2 differ, then the merge is dirty.
Similarly for m2 and b1.

More information here:

http://stackoverflow.com/questions/27683077/how-do-you-detect-an-evil-merge-in-git/41356308#41356308


- Sylvie



> There was a patch series a few years ago that added a new diff-mode to
> do exactly that, and show the diff against what was resolved. It had a
> few issues (I don't remember exactly what) and never got merged.
>
> Certainly one complication is that you don't know exactly _how_ the
> merge was done in the first place (e.g., which merge strategy, which
> custom merge drivers were in effect, etc). But in general, replaying
> with a standard merge-recursive would get you most of what you want to
> know.
>
> I've done this manually sometimes when digging into erroneous merges
> (e.g., somebody accidentally runs "git reset -- " in the middle
> of a merge and throws away some changes.
>
> You should be able to do:
>
>   git checkout $merge^1
>   git merge $merge^2
>   git diff -R $merge
>
> to see what the original resolution did.
>
> -Peff


Re: [PATCH v5 1/3] color.c: fix color_parse_mem() with value_len == 0

2017-01-27 Thread Jeff King
On Thu, Jan 19, 2017 at 11:38:41AM -0500, Jeff King wrote:

> On Thu, Jan 19, 2017 at 06:41:21PM +0700, Nguyễn Thái Ngọc Duy wrote:
> 
> > In this code we want to match the word "reset". If len is zero,
> > strncasecmp() will return zero and we incorrectly assume it's "reset" as
> > a result.
> 
> This is probably a good idea. This _is_ user-visible, so it's possible
> somebody was using empty config as a synonym for "reset". But since it
> was never documented, I feel like relying on that is somewhat crazy.

Hrm. This seems to break the add--interactive script if you do not have
color.diff.plain set:

  $ GIT_TRACE=1 git add -p
  ...
  22:58:12.568990 [pid=11401] git.c:387  trace: built-in: git 'config' 
'--get-color' 'color.diff.plain' ''
  fatal: unable to parse default color value
  config --get-color color.diff.plain : command returned error: 128

As you can see, the default value the empty string, which is now an
error.

The default in the C code for that value is GIT_COLOR_NORMAL, which
really is the empty string. So I think the old code was buggy to choose
"reset", but the new one is worse because it fails entirely. :)

We probably want something like this instead:

diff --git a/color.c b/color.c
index 190b2da96..dee61557e 100644
--- a/color.c
+++ b/color.c
@@ -212,8 +212,10 @@ int color_parse_mem(const char *value, int value_len, char 
*dst)
len--;
}
 
-   if (!len)
-   return -1;
+   if (!len) {
+   dst[0] = '\0';
+   return 0;
+   }
 
if (!strncasecmp(ptr, "reset", len)) {
xsnprintf(dst, end - dst, GIT_COLOR_RESET);

The breakage is in 'next' (it looks like it went out a few days ago; I'm
surprised I didn't notice until now).

-Peff


Re: difflame

2017-01-27 Thread Jeff King
On Tue, Jan 17, 2017 at 11:24:02PM -0600, Edmundo Carmona Antoranz wrote:

> For a very long time I had wanted to get the output of diff to include
> blame information as well (to see when something was added/removed).

This is something I've wanted, too. The trickiest part, though, is
blaming deletions, because git-blame only tracks the origin of content,
not the origin of a change.

For example, try this case:

  git init
  for i in $(seq 1 10); do
echo $i >>file
git add file
git commit -m "add $i"
  done
  sed 4d tmp && mv tmp file
  git commit -am "drop 4"

Running "difflame HEAD~5 HEAD" produces this output:

  diff --git a/file b/file
  index b414108..051c298 100644
  --- a/file
  +++ b/file
  @@ -1,6 +1,9 @@
   ^2b028ff (Jeff King 2017-01-27 22:44:10 -0500 1) 1
   ed056366 (Jeff King 2017-01-27 22:44:10 -0500 2) 2
   771030d8 (Jeff King 2017-01-27 22:44:10 -0500 3) 3
  -89c09c82 (Jeff King 2017-01-27 22:44:10 -0500 4) 4
   b619039c (Jeff King 2017-01-27 22:44:10 -0500 4) 5
   6a7aa0e5 (Jeff King 2017-01-27 22:44:10 -0500 5) 6
  +39bc9dc4 (Jeff King 2017-01-27 22:44:10 -0500 6) 7
  +f253cc8f (Jeff King 2017-01-27 22:44:10 -0500 7) 8
  +85c10f46 (Jeff King 2017-01-27 22:44:10 -0500 8) 9
  +89c09c82 (Jeff King 2017-01-27 22:44:10 -0500 9) 10

The last 4 lines are right; they correspond to the addition commits. But
the line taking away 4 is wrong. You can see even without looking at its
patch, because it is blamed to the same commit that added "10", which
is wrong.

Sorry I don't have a solution. I think it's an open problem with
git-blame, though you could probably script something around "git blame
--reverse". See the commit message of 85af7929e (git-blame --reverse,
2008-04-02) for some discussion.

-Peff


[PATCH v3 13/27] attr.c: outline the future plans by heavily commenting

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 40 +++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 8026d68bd..50e5ee393 100644
--- a/attr.c
+++ b/attr.c
@@ -30,6 +30,11 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 #define DEBUG_ATTR 0
 #endif
 
+/*
+ * NEEDSWORK: the global dictionary of the interned attributes
+ * must stay a singleton even after we become thread-ready.
+ * Access to these must be surrounded with mutex when it happens.
+ */
 struct git_attr {
struct git_attr *next;
unsigned h;
@@ -39,10 +44,19 @@ struct git_attr {
char name[FLEX_ARRAY];
 };
 static int attr_nr;
+static struct git_attr *(git_attr_hash[HASHSIZE]);
+
+/*
+ * NEEDSWORK: maybe-real, maybe-macro are not property of
+ * an attribute, as it depends on what .gitattributes are
+ * read.  Once we introduce per git_attr_check attr_stack
+ * and check_all_attr, the optimization based on them will
+ * become unnecessary and can go away.  So is this variable.
+ */
 static int cannot_trust_maybe_real;
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct git_attr_check *check_all_attr;
-static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -102,6 +116,11 @@ static struct git_attr *git_attr_internal(const char 
*name, int len)
a->maybe_real = 0;
git_attr_hash[pos] = a;
 
+   /*
+* NEEDSWORK: per git_attr_check check_all_attr
+* will be initialized a lot more lazily, not
+* like this, and not here.
+*/
REALLOC_ARRAY(check_all_attr, attr_nr);
check_all_attr[a->attr_nr].attr = a;
check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
@@ -318,6 +337,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct attr_stack {
struct attr_stack *prev;
char *origin;
@@ -382,6 +402,24 @@ static struct attr_stack *read_attr_from_array(const char 
**list)
return res;
 }
 
+/*
+ * NEEDSWORK: these two are tricky.  The callers assume there is a
+ * single, system-wide global state "where we read attributes from?"
+ * and when the state is flipped by calling git_attr_set_direction(),
+ * attr_stack is discarded so that subsequent attr_check will lazily
+ * read from the right place.  And they do not know or care who called
+ * by them uses the attribute subsystem, hence have no knowledge of
+ * existing git_attr_check instances or future ones that will be
+ * created).
+ *
+ * Probably we need a thread_local that holds these two variables,
+ * and a list of git_attr_check instances (which need to be maintained
+ * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and
+ * git_attr_check_clear().  Then git_attr_set_direction() updates the
+ * fields in that thread_local for these two variables, iterate over
+ * all the active git_attr_check instances and discard the attr_stack
+ * they hold.  Yuck, but it sounds doable.
+ */
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 26/27] attr: push the bare repo check into read_attr()

2017-01-27 Thread Brandon Williams
Push the bare repository check into the 'read_attr()' function.  This
avoids needing to have extra logic which creates an empty stack frame
when inside a bare repo as a similar bit of logic already exists in the
'read_attr()' function.

Signed-off-by: Brandon Williams 
---
 attr.c | 114 +++--
 1 file changed, 54 insertions(+), 60 deletions(-)

diff --git a/attr.c b/attr.c
index bcee0921d..62298ec2f 100644
--- a/attr.c
+++ b/attr.c
@@ -747,25 +747,28 @@ static struct attr_stack *read_attr_from_index(const char 
*path, int macro_ok)
 
 static struct attr_stack *read_attr(const char *path, int macro_ok)
 {
-   struct attr_stack *res;
+   struct attr_stack *res = NULL;
 
-   if (direction == GIT_ATTR_CHECKOUT) {
+   if (direction == GIT_ATTR_INDEX) {
res = read_attr_from_index(path, macro_ok);
-   if (!res)
-   res = read_attr_from_file(path, macro_ok);
-   }
-   else if (direction == GIT_ATTR_CHECKIN) {
-   res = read_attr_from_file(path, macro_ok);
-   if (!res)
-   /*
-* There is no checked out .gitattributes file there, 
but
-* we might have it in the index.  We allow operation 
in a
-* sparsely checked out work tree, so read from it.
-*/
+   } else if (!is_bare_repository()) {
+   if (direction == GIT_ATTR_CHECKOUT) {
res = read_attr_from_index(path, macro_ok);
+   if (!res)
+   res = read_attr_from_file(path, macro_ok);
+   } else if (direction == GIT_ATTR_CHECKIN) {
+   res = read_attr_from_file(path, macro_ok);
+   if (!res)
+   /*
+* There is no checked out .gitattributes file
+* there, but we might have it in the index.
+* We allow operation in a sparsely checked out
+* work tree, so read from it.
+*/
+   res = read_attr_from_index(path, macro_ok);
+   }
}
-   else
-   res = read_attr_from_index(path, macro_ok);
+
if (!res)
res = xcalloc(1, sizeof(*res));
return res;
@@ -857,10 +860,7 @@ static void bootstrap_attr_stack(struct attr_stack **stack)
}
 
/* root directory */
-   if (!is_bare_repository() || direction == GIT_ATTR_INDEX)
-   e = read_attr(GITATTRIBUTES_FILE, 1);
-   else
-   e = xcalloc(1, sizeof(struct attr_stack));
+   e = read_attr(GITATTRIBUTES_FILE, 1);
push_stack(stack, e, xstrdup(""), 0);
 
/* info frame */
@@ -877,6 +877,7 @@ static void prepare_attr_stack(const char *path, int dirlen,
   struct attr_stack **stack)
 {
struct attr_stack *info;
+   struct strbuf pathbuf = STRBUF_INIT;
 
/*
 * At the bottom of the attribute stack is the built-in
@@ -923,54 +924,47 @@ static void prepare_attr_stack(const char *path, int 
dirlen,
}
 
/*
-* Read from parent directories and push them down
+* bootstrap_attr_stack() should have added, and the
+* above loop should have stopped before popping, the
+* root element whose attr_stack->origin is set to an
+* empty string.
 */
-   if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
-   /*
-* bootstrap_attr_stack() should have added, and the
-* above loop should have stopped before popping, the
-* root element whose attr_stack->origin is set to an
-* empty string.
-*/
-   struct strbuf pathbuf = STRBUF_INIT;
-
-   assert((*stack)->origin);
-   strbuf_addstr(, (*stack)->origin);
-   /* Build up to the directory 'path' is in */
-   while (pathbuf.len < dirlen) {
-   size_t len = pathbuf.len;
-   struct attr_stack *next;
-   char *origin;
-
-   /* Skip path-separator */
-   if (len < dirlen && is_dir_sep(path[len]))
-   len++;
-   /* Find the end of the next component */
-   while (len < dirlen && !is_dir_sep(path[len]))
-   len++;
-
-   if (pathbuf.len > 0)
-   strbuf_addch(, '/');
-   strbuf_add(, path + pathbuf.len,
-  (len - pathbuf.len));
-   strbuf_addf(, "/%s", GITATTRIBUTES_FILE);
-
-

[PATCH v3 11/27] attr.c: add push_stack() helper

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

There are too many repetitious "I have this new attr_stack element;
push it at the top of the stack" sequence.  The new helper function
push_stack() gives us a way to express what is going on at these
places, and as a side effect, halves the number of times we mention
the attr_stack global variable.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 71 +++---
 1 file changed, 33 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index e1c630f79..8026d68bd 100644
--- a/attr.c
+++ b/attr.c
@@ -510,6 +510,18 @@ static int git_attr_system(void)
 
 static GIT_PATH_FUNC(git_path_info_attributes, INFOATTRIBUTES_FILE)
 
+static void push_stack(struct attr_stack **attr_stack_p,
+  struct attr_stack *elem, char *origin, size_t originlen)
+{
+   if (elem) {
+   elem->origin = origin;
+   if (origin)
+   elem->originlen = originlen;
+   elem->prev = *attr_stack_p;
+   *attr_stack_p = elem;
+   }
+}
+
 static void bootstrap_attr_stack(void)
 {
struct attr_stack *elem;
@@ -517,37 +529,23 @@ static void bootstrap_attr_stack(void)
if (attr_stack)
return;
 
-   elem = read_attr_from_array(builtin_attr);
-   elem->origin = NULL;
-   elem->prev = attr_stack;
-   attr_stack = elem;
-
-   if (git_attr_system()) {
-   elem = read_attr_from_file(git_etc_gitattributes(), 1);
-   if (elem) {
-   elem->origin = NULL;
-   elem->prev = attr_stack;
-   attr_stack = elem;
-   }
-   }
+   push_stack(_stack, read_attr_from_array(builtin_attr), NULL, 0);
+
+   if (git_attr_system())
+   push_stack(_stack,
+  read_attr_from_file(git_etc_gitattributes(), 1),
+  NULL, 0);
 
if (!git_attributes_file)
git_attributes_file = xdg_config_home("attributes");
-   if (git_attributes_file) {
-   elem = read_attr_from_file(git_attributes_file, 1);
-   if (elem) {
-   elem->origin = NULL;
-   elem->prev = attr_stack;
-   attr_stack = elem;
-   }
-   }
+   if (git_attributes_file)
+   push_stack(_stack,
+  read_attr_from_file(git_attributes_file, 1),
+  NULL, 0);
 
if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
elem = read_attr(GITATTRIBUTES_FILE, 1);
-   elem->origin = xstrdup("");
-   elem->originlen = 0;
-   elem->prev = attr_stack;
-   attr_stack = elem;
+   push_stack(_stack, elem, xstrdup(""), 0);
debug_push(elem);
}
 
@@ -558,15 +556,12 @@ static void bootstrap_attr_stack(void)
 
if (!elem)
elem = xcalloc(1, sizeof(*elem));
-   elem->origin = NULL;
-   elem->prev = attr_stack;
-   attr_stack = elem;
+   push_stack(_stack, elem, NULL, 0);
 }
 
 static void prepare_attr_stack(const char *path, int dirlen)
 {
struct attr_stack *elem, *info;
-   int len;
const char *cp;
 
/*
@@ -626,20 +621,21 @@ static void prepare_attr_stack(const char *path, int 
dirlen)
 
assert(attr_stack->origin);
while (1) {
-   len = strlen(attr_stack->origin);
+   size_t len = strlen(attr_stack->origin);
+   char *origin;
+
if (dirlen <= len)
break;
cp = memchr(path + len + 1, '/', dirlen - len - 1);
if (!cp)
cp = path + dirlen;
-   strbuf_add(, path, cp - path);
-   strbuf_addch(, '/');
-   strbuf_addstr(, GITATTRIBUTES_FILE);
+   strbuf_addf(,
+   "%.*s/%s", (int)(cp - path), path,
+   GITATTRIBUTES_FILE);
elem = read_attr(pathbuf.buf, 0);
strbuf_setlen(, cp - path);
-   elem->origin = strbuf_detach(, 
>originlen);
-   elem->prev = attr_stack;
-   attr_stack = elem;
+   origin = strbuf_detach(, );
+   push_stack(_stack, elem, origin, len);
debug_push(elem);
}
 
@@ -649,8 +645,7 @@ static void prepare_attr_stack(const char *path, int dirlen)
/*
 * Finally push the "info" one at the top of the stack.
   

[PATCH v3 14/27] attr: rename function and struct related to checking attributes

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

The traditional API to check attributes is to prepare an N-element
array of "struct git_attr_check" and pass N and the array to the
function "git_check_attr()" as arguments.

In preparation to revamp the API to pass a single structure, in
which these N elements are held, rename the type used for these
individual array elements to "struct attr_check_item" and rename
the function to "git_check_attrs()".

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 archive.c  |  6 +++---
 attr.c | 12 ++--
 attr.h |  8 
 builtin/check-attr.c   | 19 ++-
 builtin/pack-objects.c |  6 +++---
 convert.c  | 12 ++--
 ll-merge.c | 10 +-
 userdiff.c |  4 ++--
 ws.c   |  6 +++---
 9 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/archive.c b/archive.c
index 01751e574..b76bd4691 100644
--- a/archive.c
+++ b/archive.c
@@ -87,7 +87,7 @@ void *sha1_file_to_archive(const struct archiver_args *args,
return buffer;
 }
 
-static void setup_archive_check(struct git_attr_check *check)
+static void setup_archive_check(struct attr_check_item *check)
 {
static struct git_attr *attr_export_ignore;
static struct git_attr *attr_export_subst;
@@ -123,7 +123,7 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
struct archiver_context *c = context;
struct archiver_args *args = c->args;
write_archive_entry_fn_t write_entry = c->write_entry;
-   struct git_attr_check check[2];
+   struct attr_check_item check[2];
const char *path_without_prefix;
int err;
 
@@ -138,7 +138,7 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
path_without_prefix = path.buf + args->baselen;
 
setup_archive_check(check);
-   if (!git_check_attr(path_without_prefix, ARRAY_SIZE(check), check)) {
+   if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
if (ATTR_TRUE(check[0].value))
return 0;
args->convert = ATTR_TRUE(check[1].value);
diff --git a/attr.c b/attr.c
index 50e5ee393..2f180d609 100644
--- a/attr.c
+++ b/attr.c
@@ -56,7 +56,7 @@ static struct git_attr *(git_attr_hash[HASHSIZE]);
 static int cannot_trust_maybe_real;
 
 /* NEEDSWORK: This will become per git_attr_check */
-static struct git_attr_check *check_all_attr;
+static struct attr_check_item *check_all_attr;
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -713,7 +713,7 @@ static int macroexpand_one(int attr_nr, int rem);
 
 static int fill_one(const char *what, struct match_attr *a, int rem)
 {
-   struct git_attr_check *check = check_all_attr;
+   struct attr_check_item *check = check_all_attr;
int i;
 
for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
@@ -778,7 +778,7 @@ static int macroexpand_one(int nr, int rem)
  * collected. Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, int num,
-  struct git_attr_check *check)
+  struct attr_check_item *check)
 
 {
struct attr_stack *stk;
@@ -806,7 +806,7 @@ static void collect_some_attrs(const char *path, int num,
rem = 0;
for (i = 0; i < num; i++) {
if (!check[i].attr->maybe_real) {
-   struct git_attr_check *c;
+   struct attr_check_item *c;
c = check_all_attr + check[i].attr->attr_nr;
c->value = ATTR__UNSET;
rem++;
@@ -821,7 +821,7 @@ static void collect_some_attrs(const char *path, int num,
rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-int git_check_attr(const char *path, int num, struct git_attr_check *check)
+int git_check_attrs(const char *path, int num, struct attr_check_item *check)
 {
int i;
 
@@ -837,7 +837,7 @@ int git_check_attr(const char *path, int num, struct 
git_attr_check *check)
return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
+int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
 {
int i, count, j;
 
diff --git a/attr.h b/attr.h
index 00d7a662c..efc7bb3b3 100644
--- a/attr.h
+++ b/attr.h
@@ -20,11 +20,11 @@ extern const char git_attr__false[];
 #define ATTR_UNSET(v) ((v) == NULL)
 
 /*
- * Send one or more git_attr_check to git_check_attr(), and
+ * Send one or more git_attr_check to git_check_attrs(), and
  * each 'value' member tells what its value is.
  * Unset one is returned as NULL.
  */
-struct git_attr_check {
+struct attr_check_item {
  

[PATCH v3 23/27] attr: remove maybe-real, maybe-macro from git_attr

2017-01-27 Thread Brandon Williams
Whether or not a git attribute is real or a macro isn't a property of
the attribute but rather it depends on the attribute stack (which
.gitattribute files were read).

This patch removes the 'maybe_real' and 'maybe_macro' fields in a
git_attr and instead adds the 'macro' field to a attr_check_item.  The
'macro' indicates (if non-NULL) that a particular attribute is a macro
for the given attribute stack.  It's populated, through a quick scan of
the attribute stack, with the match_attr that corresponds to the macro's
definition.  This way the attribute stack only needs to be scanned a
single time prior to attribute collection instead of each time a macro
needs to be expanded.

Signed-off-by: Brandon Williams 
---
 attr.c | 75 +-
 1 file changed, 37 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index 2637804b1..8f4402ef3 100644
--- a/attr.c
+++ b/attr.c
@@ -30,20 +30,9 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 
 struct git_attr {
int attr_nr; /* unique attribute number */
-   int maybe_macro;
-   int maybe_real;
char name[FLEX_ARRAY]; /* attribute name */
 };
 
-/*
- * NEEDSWORK: maybe-real, maybe-macro are not property of
- * an attribute, as it depends on what .gitattributes are
- * read.  Once we introduce per git_attr_check attr_stack
- * and check_all_attr, the optimization based on them will
- * become unnecessary and can go away.  So is this variable.
- */
-static int cannot_trust_maybe_real;
-
 const char *git_attr_name(const struct git_attr *attr)
 {
return attr->name;
@@ -142,6 +131,12 @@ static void attr_hashmap_add(struct attr_hashmap *map,
 struct all_attrs_item {
const struct git_attr *attr;
const char *value;
+   /*
+* If 'macro' is non-NULL, indicates that 'attr' is a macro based on
+* the current attribute stack and contains a pointer to the match_attr
+* definition of the macro
+*/
+   const struct match_attr *macro;
 };
 
 /*
@@ -187,6 +182,7 @@ static void all_attrs_init(struct attr_hashmap *map, struct 
attr_check *check)
 */
for (i = 0; i < check->all_attrs_nr; i++) {
check->all_attrs[i].value = ATTR__UNKNOWN;
+   check->all_attrs[i].macro = NULL;
}
 }
 
@@ -238,8 +234,6 @@ static struct git_attr *git_attr_internal(const char *name, 
int namelen)
if (!a) {
FLEX_ALLOC_MEM(a, name, name, namelen);
a->attr_nr = g_attr_hashmap.map.size;
-   a->maybe_real = 0;
-   a->maybe_macro = 0;
 
attr_hashmap_add(_attr_hashmap, a->name, namelen, a);
assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
@@ -402,7 +396,6 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
  (is_macro ? 0 : namelen + 1));
if (is_macro) {
res->u.attr = git_attr_internal(name, namelen);
-   res->u.attr->maybe_macro = 1;
} else {
char *p = (char *)&(res->state[num_attr]);
memcpy(p, name, namelen);
@@ -423,10 +416,6 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
/* Second pass to fill the attr_states */
for (cp = states, i = 0; *cp; i++) {
cp = parse_attr(src, lineno, cp, &(res->state[i]));
-   if (!is_macro)
-   res->state[i].attr->maybe_real = 1;
-   if (res->state[i].attr->maybe_macro)
-   cannot_trust_maybe_real = 1;
}
 
strbuf_release();
@@ -904,7 +893,7 @@ static int path_matches(const char *pathname, int pathlen,
 static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem);
 
 static int fill_one(const char *what, struct all_attrs_item *all_attrs,
-   struct match_attr *a, int rem)
+   const struct match_attr *a, int rem)
 {
int i;
 
@@ -945,24 +934,34 @@ static int fill(const char *path, int pathlen, int 
basename_offset,
 
 static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem)
 {
-   struct attr_stack *stk;
-   int i;
+   const struct all_attrs_item *item = _attrs[nr];
 
-   if (all_attrs[nr].value != ATTR__TRUE ||
-   !all_attrs[nr].attr->maybe_macro)
+   if (item->macro && item->value == ATTR__TRUE)
+   return fill_one("expand", all_attrs, item->macro, rem);
+   else
return rem;
+}
 
-   for (stk = attr_stack; stk; stk = stk->prev) {
-   for (i = stk->num_matches - 1; 0 <= i; i--) {
-   struct match_attr *ma = stk->attrs[i];
-   if (!ma->is_macro)
-   continue;
-   if (ma->u.attr->attr_nr == nr)
-   return fill_one("expand", 

[PATCH v3 21/27] attr: use hashmap for attribute dictionary

2017-01-27 Thread Brandon Williams
The current implementation of the attribute dictionary uses a custom
hashtable.  This modernizes the dictionary by converting it to the builtin
'hashmap' structure.

Also, in order to enable a threaded API in the future add an
accompanying mutex which must be acquired prior to accessing the
dictionary of interned attributes.

Signed-off-by: Brandon Williams 
---
 attr.c| 173 +++---
 attr.h|   2 +
 common-main.c |   3 +
 3 files changed, 133 insertions(+), 45 deletions(-)

diff --git a/attr.c b/attr.c
index 9fe848f59..e008f3026 100644
--- a/attr.c
+++ b/attr.c
@@ -14,6 +14,7 @@
 #include "dir.h"
 #include "utf8.h"
 #include "quote.h"
+#include "thread-utils.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -23,28 +24,17 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 #define ATTR__UNSET NULL
 #define ATTR__UNKNOWN git_attr__unknown
 
-/* This is a randomly chosen prime. */
-#define HASHSIZE 257
-
 #ifndef DEBUG_ATTR
 #define DEBUG_ATTR 0
 #endif
 
-/*
- * NEEDSWORK: the global dictionary of the interned attributes
- * must stay a singleton even after we become thread-ready.
- * Access to these must be surrounded with mutex when it happens.
- */
 struct git_attr {
-   struct git_attr *next;
-   unsigned h;
-   int attr_nr;
+   int attr_nr; /* unique attribute number */
int maybe_macro;
int maybe_real;
-   char name[FLEX_ARRAY];
+   char name[FLEX_ARRAY]; /* attribute name */
 };
 static int attr_nr;
-static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 /*
  * NEEDSWORK: maybe-real, maybe-macro are not property of
@@ -63,15 +53,94 @@ const char *git_attr_name(const struct git_attr *attr)
return attr->name;
 }
 
-static unsigned hash_name(const char *name, int namelen)
+struct attr_hashmap {
+   struct hashmap map;
+#ifndef NO_PTHREADS
+   pthread_mutex_t mutex;
+#endif
+};
+
+static inline void hashmap_lock(struct attr_hashmap *map)
+{
+#ifndef NO_PTHREADS
+   pthread_mutex_lock(>mutex);
+#endif
+}
+
+static inline void hashmap_unlock(struct attr_hashmap *map)
 {
-   unsigned val = 0, c;
+#ifndef NO_PTHREADS
+   pthread_mutex_unlock(>mutex);
+#endif
+}
 
-   while (namelen--) {
-   c = *name++;
-   val = ((val << 7) | (val >> 22)) ^ c;
-   }
-   return val;
+/*
+ * The global dictionary of all interned attributes.  This
+ * is a singleton object which is shared between threads.
+ * Access to this dictionary must be surrounded with a mutex.
+ */
+static struct attr_hashmap g_attr_hashmap;
+
+/* The container for objects stored in "struct attr_hashmap" */
+struct attr_hash_entry {
+   struct hashmap_entry ent; /* must be the first member! */
+   const char *key; /* the key; memory should be owned by value */
+   size_t keylen; /* length of the key */
+   void *value; /* the stored value */
+};
+
+/* attr_hashmap comparison function */
+static int attr_hash_entry_cmp(const struct attr_hash_entry *a,
+  const struct attr_hash_entry *b,
+  void *unused)
+{
+   return (a->keylen != b->keylen) || strncmp(a->key, b->key, a->keylen);
+}
+
+/* Initialize an 'attr_hashmap' object */
+static void attr_hashmap_init(struct attr_hashmap *map)
+{
+   hashmap_init(>map, (hashmap_cmp_fn) attr_hash_entry_cmp, 0);
+}
+
+/*
+ * Retrieve the 'value' stored in a hashmap given the provided 'key'.
+ * If there is no matching entry, return NULL.
+ */
+static void *attr_hashmap_get(struct attr_hashmap *map,
+ const char *key, size_t keylen)
+{
+   struct attr_hash_entry k;
+   struct attr_hash_entry *e;
+
+   if (!map->map.tablesize)
+   attr_hashmap_init(map);
+
+   hashmap_entry_init(, memhash(key, keylen));
+   k.key = key;
+   k.keylen = keylen;
+   e = hashmap_get(>map, , NULL);
+
+   return e ? e->value : NULL;
+}
+
+/* Add 'value' to a hashmap based on the provided 'key'. */
+static void attr_hashmap_add(struct attr_hashmap *map,
+const char *key, size_t keylen,
+void *value)
+{
+   struct attr_hash_entry *e;
+
+   if (!map->map.tablesize)
+   attr_hashmap_init(map);
+
+   e = xmalloc(sizeof(struct attr_hash_entry));
+   hashmap_entry_init(e, memhash(key, keylen));
+   e->key = key;
+   e->keylen = keylen;
+   e->value = value;
+
+   hashmap_add(>map, e);
 }
 
 static int attr_name_valid(const char *name, size_t namelen)
@@ -103,37 +172,44 @@ static void report_invalid_attr(const char *name, size_t 
len,
strbuf_release();
 }
 
-static struct git_attr *git_attr_internal(const char *name, int len)
+/*
+ * Given a 'name', lookup and return the corresponding attribute in the global
+ * dictionary.  If no entry is 

[PATCH v3 25/27] attr: store attribute stack in attr_check structure

2017-01-27 Thread Brandon Williams
The last big hurdle towards a thread-safe API for the attribute system
is the reliance on a global attribute stack that is modified during each
call into the attribute system.

This patch removes this global stack and instead a stack is stored
locally in each attr_check instance.  This opens up the opportunity for
future optimizations to customize the attribute stack for the attributes
that a particular attr_check struct is interested in.

One caveat with pushing the attribute stack into the attr_check
structure is that the attribute system now needs to keep track of all
active attr_check instances.  Due to the direction mechanism the stack
needs to be dropped when the direction is switched.  In order to ensure
correctness when the direction is changed the attribute system needs to
iterate through all active attr_check instances and drop each of their
stacks.

Signed-off-by: Brandon Williams 
---
 attr.c | 284 +
 attr.h |   4 +-
 2 files changed, 199 insertions(+), 89 deletions(-)

diff --git a/attr.c b/attr.c
index 69643ae77..bcee0921d 100644
--- a/attr.c
+++ b/attr.c
@@ -445,17 +445,16 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_stack {
+struct attr_stack {
struct attr_stack *prev;
char *origin;
size_t originlen;
unsigned num_matches;
unsigned alloc;
struct match_attr **attrs;
-} *attr_stack;
+};
 
-static void free_attr_elem(struct attr_stack *e)
+static void attr_stack_free(struct attr_stack *e)
 {
int i;
free(e->origin);
@@ -478,9 +477,96 @@ static void free_attr_elem(struct attr_stack *e)
free(e);
 }
 
+static void drop_attr_stack(struct attr_stack **stack)
+{
+   while (*stack) {
+   struct attr_stack *elem = *stack;
+   *stack = elem->prev;
+   attr_stack_free(elem);
+   }
+}
+
+/* List of all attr_check structs; access should be surrounded by mutex */
+static struct check_vector {
+   size_t nr;
+   size_t alloc;
+   struct attr_check **checks;
+#ifndef NO_PTHREADS
+   pthread_mutex_t mutex;
+#endif
+} check_vector;
+
+static inline void vector_lock(void)
+{
+#ifndef NO_PTHREADS
+   pthread_mutex_lock(_vector.mutex);
+#endif
+}
+
+static inline void vector_unlock(void)
+{
+#ifndef NO_PTHREADS
+   pthread_mutex_unlock(_vector.mutex);
+#endif
+}
+
+static void check_vector_add(struct attr_check *c)
+{
+   vector_lock();
+
+   ALLOC_GROW(check_vector.checks,
+  check_vector.nr + 1,
+  check_vector.alloc);
+   check_vector.checks[check_vector.nr++] = c;
+
+   vector_unlock();
+}
+
+static void check_vector_remove(struct attr_check *check)
+{
+   int i;
+
+   vector_lock();
+
+   /* Find entry */
+   for (i = 0; i < check_vector.nr; i++)
+   if (check_vector.checks[i] == check)
+   break;
+
+   if (i >= check_vector.nr)
+   die("BUG: no entry found");
+
+   /* shift entries over */
+   for (; i < check_vector.nr - 1; i++)
+   check_vector.checks[i] = check_vector.checks[i + 1];
+
+   check_vector.nr--;
+
+   vector_unlock();
+}
+
+/* Iterate through all attr_check instances and drop their stacks */
+static void drop_all_attr_stacks(void)
+{
+   int i;
+
+   vector_lock();
+
+   for (i = 0; i < check_vector.nr; i++) {
+   drop_attr_stack(_vector.checks[i]->stack);
+   }
+
+   vector_unlock();
+}
+
 struct attr_check *attr_check_alloc(void)
 {
-   return xcalloc(1, sizeof(struct attr_check));
+   struct attr_check *c = xcalloc(1, sizeof(struct attr_check));
+
+   /* save pointer to the check struct */
+   check_vector_add(c);
+
+   return c;
 }
 
 struct attr_check *attr_check_initl(const char *one, ...)
@@ -543,12 +629,19 @@ void attr_check_clear(struct attr_check *check)
free(check->all_attrs);
check->all_attrs = NULL;
check->all_attrs_nr = 0;
+
+   drop_attr_stack(>stack);
 }
 
 void attr_check_free(struct attr_check *check)
 {
-   attr_check_clear(check);
-   free(check);
+   if (check) {
+   /* Remove check from the check vector */
+   check_vector_remove(check);
+
+   attr_check_clear(check);
+   free(check);
+   }
 }
 
 static const char *builtin_attr[] = {
@@ -705,15 +798,6 @@ static void debug_set(const char *what, const char *match, 
struct git_attr *attr
 #define debug_set(a,b,c,d) do { ; } while (0)
 #endif /* DEBUG_ATTR */
 
-static void drop_attr_stack(void)
-{
-   while (attr_stack) {
-   struct attr_stack *elem = attr_stack;
-   attr_stack = elem->prev;
-   free_attr_elem(elem);

[PATCH v3 16/27] attr: convert git_all_attrs() to use "struct attr_check"

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

This updates the other two ways the attribute check is done via an
array of "struct attr_check_item" elements.  These two niches
appear only in "git check-attr".

 * The caller does not know offhand what attributes it wants to ask
   about and cannot use attr_check_initl() to prepare the
   attr_check structure.

 * The caller may not know what attributes it wants to ask at all,
   and instead wants to learn everything that the given path has.

Such a caller can call attr_check_alloc() to allocate an empty
attr_check, and then call attr_check_append() to add attribute names
one by one.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c   | 168 ---
 attr.h   |   9 +--
 builtin/check-attr.c |  60 +-
 3 files changed, 112 insertions(+), 125 deletions(-)

diff --git a/attr.c b/attr.c
index de8bf35a3..40818246f 100644
--- a/attr.c
+++ b/attr.c
@@ -132,75 +132,6 @@ struct git_attr *git_attr(const char *name)
return git_attr_internal(name, strlen(name));
 }
 
-struct attr_check *attr_check_alloc(void)
-{
-   return xcalloc(1, sizeof(struct attr_check));
-}
-
-struct attr_check *attr_check_initl(const char *one, ...)
-{
-   struct attr_check *check;
-   int cnt;
-   va_list params;
-   const char *param;
-
-   va_start(params, one);
-   for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
-   ;
-   va_end(params);
-
-   check = attr_check_alloc();
-   check->nr = cnt;
-   check->alloc = cnt;
-   check->items = xcalloc(cnt, sizeof(struct attr_check_item));
-
-   check->items[0].attr = git_attr(one);
-   va_start(params, one);
-   for (cnt = 1; cnt < check->nr; cnt++) {
-   const struct git_attr *attr;
-   param = va_arg(params, const char *);
-   if (!param)
-   die("BUG: counted %d != ended at %d",
-   check->nr, cnt);
-   attr = git_attr(param);
-   if (!attr)
-   die("BUG: %s: not a valid attribute name", param);
-   check->items[cnt].attr = attr;
-   }
-   va_end(params);
-   return check;
-}
-
-struct attr_check_item *attr_check_append(struct attr_check *check,
- const struct git_attr *attr)
-{
-   struct attr_check_item *item;
-
-   ALLOC_GROW(check->items, check->nr + 1, check->alloc);
-   item = >items[check->nr++];
-   item->attr = attr;
-   return item;
-}
-
-void attr_check_reset(struct attr_check *check)
-{
-   check->nr = 0;
-}
-
-void attr_check_clear(struct attr_check *check)
-{
-   free(check->items);
-   check->items = NULL;
-   check->alloc = 0;
-   check->nr = 0;
-}
-
-void attr_check_free(struct attr_check *check)
-{
-   attr_check_clear(check);
-   free(check);
-}
-
 /* What does a matched pattern decide? */
 struct attr_state {
struct git_attr *attr;
@@ -439,6 +370,75 @@ static void free_attr_elem(struct attr_stack *e)
free(e);
 }
 
+struct attr_check *attr_check_alloc(void)
+{
+   return xcalloc(1, sizeof(struct attr_check));
+}
+
+struct attr_check *attr_check_initl(const char *one, ...)
+{
+   struct attr_check *check;
+   int cnt;
+   va_list params;
+   const char *param;
+
+   va_start(params, one);
+   for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
+   ;
+   va_end(params);
+
+   check = attr_check_alloc();
+   check->nr = cnt;
+   check->alloc = cnt;
+   check->items = xcalloc(cnt, sizeof(struct attr_check_item));
+
+   check->items[0].attr = git_attr(one);
+   va_start(params, one);
+   for (cnt = 1; cnt < check->nr; cnt++) {
+   const struct git_attr *attr;
+   param = va_arg(params, const char *);
+   if (!param)
+   die("BUG: counted %d != ended at %d",
+   check->nr, cnt);
+   attr = git_attr(param);
+   if (!attr)
+   die("BUG: %s: not a valid attribute name", param);
+   check->items[cnt].attr = attr;
+   }
+   va_end(params);
+   return check;
+}
+
+struct attr_check_item *attr_check_append(struct attr_check *check,
+ const struct git_attr *attr)
+{
+   struct attr_check_item *item;
+
+   ALLOC_GROW(check->items, check->nr + 1, check->alloc);
+   item = >items[check->nr++];
+   item->attr = attr;
+   return item;
+}
+
+void attr_check_reset(struct attr_check *check)
+{
+   check->nr = 0;
+}
+
+void attr_check_clear(struct attr_check *check)
+{
+   free(check->items);
+   check->items = 

[PATCH v3 22/27] attr: eliminate global check_all_attr array

2017-01-27 Thread Brandon Williams
Currently there is a reliance on 'check_all_attr' which is a global
array of 'attr_check_item' items which is used to store the value of
each attribute during the collection process.

This patch eliminates this global and instead creates an array per
'attr_check' instance which is then used in the attribute collection
process.  This brings the attribute system one step closer to being
thread-safe.

Signed-off-by: Brandon Williams 
---
 attr.c | 121 -
 attr.h |   5 +++
 2 files changed, 87 insertions(+), 39 deletions(-)

diff --git a/attr.c b/attr.c
index e008f3026..2637804b1 100644
--- a/attr.c
+++ b/attr.c
@@ -34,7 +34,6 @@ struct git_attr {
int maybe_real;
char name[FLEX_ARRAY]; /* attribute name */
 };
-static int attr_nr;
 
 /*
  * NEEDSWORK: maybe-real, maybe-macro are not property of
@@ -45,9 +44,6 @@ static int attr_nr;
  */
 static int cannot_trust_maybe_real;
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_check_item *check_all_attr;
-
 const char *git_attr_name(const struct git_attr *attr)
 {
return attr->name;
@@ -143,6 +139,57 @@ static void attr_hashmap_add(struct attr_hashmap *map,
hashmap_add(>map, e);
 }
 
+struct all_attrs_item {
+   const struct git_attr *attr;
+   const char *value;
+};
+
+/*
+ * Reallocate and reinitialize the array of all attributes (which is used in
+ * the attribute collection process) in 'check' based on the global dictionary
+ * of attributes.
+ */
+static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check)
+{
+   int i;
+
+   hashmap_lock(map);
+
+   if (map->map.size < check->all_attrs_nr)
+   die("BUG: interned attributes shouldn't be deleted");
+
+   /*
+* If the number of attributes in the global dictionary has increased
+* (or this attr_check instance doesn't have an initialized all_attrs
+* field), reallocate the provided attr_check instance's all_attrs
+* field and fill each entry with its corresponding git_attr.
+*/
+   if (map->map.size != check->all_attrs_nr) {
+   struct attr_hash_entry *e;
+   struct hashmap_iter iter;
+   hashmap_iter_init(>map, );
+
+   REALLOC_ARRAY(check->all_attrs, map->map.size);
+   check->all_attrs_nr = map->map.size;
+
+   while ((e = hashmap_iter_next())) {
+   const struct git_attr *a = e->value;
+   check->all_attrs[a->attr_nr].attr = a;
+   }
+   }
+
+   hashmap_unlock(map);
+
+   /*
+* Re-initialize every entry in check->all_attrs.
+* This re-initialization can live outside of the locked region since
+* the attribute dictionary is no longer being accessed.
+*/
+   for (i = 0; i < check->all_attrs_nr; i++) {
+   check->all_attrs[i].value = ATTR__UNKNOWN;
+   }
+}
+
 static int attr_name_valid(const char *name, size_t namelen)
 {
/*
@@ -196,16 +243,6 @@ static struct git_attr *git_attr_internal(const char 
*name, int namelen)
 
attr_hashmap_add(_attr_hashmap, a->name, namelen, a);
assert(a->attr_nr == (g_attr_hashmap.map.size - 1));
-
-   /*
-* NEEDSWORK: per git_attr_check check_all_attr
-* will be initialized a lot more lazily, not
-* like this, and not here.
-*/
-   REALLOC_ARRAY(check_all_attr, ++attr_nr);
-   check_all_attr[a->attr_nr].attr = a;
-   check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
-   assert(a->attr_nr == (attr_nr - 1));
}
 
hashmap_unlock(_attr_hashmap);
@@ -513,6 +550,10 @@ void attr_check_clear(struct attr_check *check)
check->items = NULL;
check->alloc = 0;
check->nr = 0;
+
+   free(check->all_attrs);
+   check->all_attrs = NULL;
+   check->all_attrs_nr = 0;
 }
 
 void attr_check_free(struct attr_check *check)
@@ -860,16 +901,16 @@ static int path_matches(const char *pathname, int pathlen,
  pattern, prefix, pat->patternlen, pat->flags);
 }
 
-static int macroexpand_one(int attr_nr, int rem);
+static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem);
 
-static int fill_one(const char *what, struct match_attr *a, int rem)
+static int fill_one(const char *what, struct all_attrs_item *all_attrs,
+   struct match_attr *a, int rem)
 {
-   struct attr_check_item *check = check_all_attr;
int i;
 
-   for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
+   for (i = a->num_attr - 1; rem > 0 && i >= 0; i--) {
struct git_attr *attr = a->state[i].attr;
-   const char **n = &(check[attr->attr_nr].value);
+   const char **n = 

[PATCH v3 20/27] attr: change validity check for attribute names to use positive logic

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

Convert 'invalid_attr_name()' to 'attr_name_valid()' and use positive
logic for the return value.  In addition create a helper function that
prints out an error message when an invalid attribute name is used.

We could later update the message to exactly spell out what the
rules for a good attribute name are, etc.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 34 --
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/attr.c b/attr.c
index 81a3c74d8..9fe848f59 100644
--- a/attr.c
+++ b/attr.c
@@ -74,23 +74,33 @@ static unsigned hash_name(const char *name, int namelen)
return val;
 }
 
-static int invalid_attr_name(const char *name, int namelen)
+static int attr_name_valid(const char *name, size_t namelen)
 {
/*
 * Attribute name cannot begin with '-' and must consist of
 * characters from [-A-Za-z0-9_.].
 */
if (namelen <= 0 || *name == '-')
-   return -1;
+   return 0;
while (namelen--) {
char ch = *name++;
if (! (ch == '-' || ch == '.' || ch == '_' ||
   ('0' <= ch && ch <= '9') ||
   ('a' <= ch && ch <= 'z') ||
   ('A' <= ch && ch <= 'Z')) )
-   return -1;
+   return 0;
}
-   return 0;
+   return 1;
+}
+
+static void report_invalid_attr(const char *name, size_t len,
+   const char *src, int lineno)
+{
+   struct strbuf err = STRBUF_INIT;
+   strbuf_addf(, _("%.*s is not a valid attribute name"),
+   (int) len, name);
+   fprintf(stderr, "%s: %s:%d\n", err.buf, src, lineno);
+   strbuf_release();
 }
 
 static struct git_attr *git_attr_internal(const char *name, int len)
@@ -105,7 +115,7 @@ static struct git_attr *git_attr_internal(const char *name, 
int len)
return a;
}
 
-   if (invalid_attr_name(name, len))
+   if (!attr_name_valid(name, len))
return NULL;
 
FLEX_ALLOC_MEM(a, name, name, len);
@@ -196,17 +206,15 @@ static const char *parse_attr(const char *src, int 
lineno, const char *cp,
cp++;
len--;
}
-   if (invalid_attr_name(cp, len)) {
-   fprintf(stderr,
-   "%.*s is not a valid attribute name: %s:%d\n",
-   len, cp, src, lineno);
+   if (!attr_name_valid(cp, len)) {
+   report_invalid_attr(cp, len, src, lineno);
return NULL;
}
} else {
/*
 * As this function is always called twice, once with
 * e == NULL in the first pass and then e != NULL in
-* the second pass, no need for invalid_attr_name()
+* the second pass, no need for attr_name_valid()
 * check here.
 */
if (*cp == '-' || *cp == '!') {
@@ -258,10 +266,8 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
name += strlen(ATTRIBUTE_MACRO_PREFIX);
name += strspn(name, blank);
namelen = strcspn(name, blank);
-   if (invalid_attr_name(name, namelen)) {
-   fprintf(stderr,
-   "%.*s is not a valid attribute name: %s:%d\n",
-   namelen, name, src, lineno);
+   if (!attr_name_valid(name, namelen)) {
+   report_invalid_attr(name, namelen, src, lineno);
goto fail_return;
}
}
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 24/27] attr: tighten const correctness with git_attr and match_attr

2017-01-27 Thread Brandon Williams
Signed-off-by: Brandon Williams 
---
 attr.c   | 12 ++--
 attr.h   |  2 +-
 builtin/check-attr.c |  3 ++-
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/attr.c b/attr.c
index 8f4402ef3..69643ae77 100644
--- a/attr.c
+++ b/attr.c
@@ -220,7 +220,7 @@ static void report_invalid_attr(const char *name, size_t 
len,
  * dictionary.  If no entry is found, create a new attribute and store it in
  * the dictionary.
  */
-static struct git_attr *git_attr_internal(const char *name, int namelen)
+static const struct git_attr *git_attr_internal(const char *name, int namelen)
 {
struct git_attr *a;
 
@@ -244,14 +244,14 @@ static struct git_attr *git_attr_internal(const char 
*name, int namelen)
return a;
 }
 
-struct git_attr *git_attr(const char *name)
+const struct git_attr *git_attr(const char *name)
 {
return git_attr_internal(name, strlen(name));
 }
 
 /* What does a matched pattern decide? */
 struct attr_state {
-   struct git_attr *attr;
+   const struct git_attr *attr;
const char *setto;
 };
 
@@ -278,7 +278,7 @@ struct pattern {
 struct match_attr {
union {
struct pattern pat;
-   struct git_attr *attr;
+   const struct git_attr *attr;
} u;
char is_macro;
unsigned num_attr;
@@ -898,7 +898,7 @@ static int fill_one(const char *what, struct all_attrs_item 
*all_attrs,
int i;
 
for (i = a->num_attr - 1; rem > 0 && i >= 0; i--) {
-   struct git_attr *attr = a->state[i].attr;
+   const struct git_attr *attr = a->state[i].attr;
const char **n = &(all_attrs[attr->attr_nr].value);
const char *v = a->state[i].setto;
 
@@ -922,7 +922,7 @@ static int fill(const char *path, int pathlen, int 
basename_offset,
const char *base = stk->origin ? stk->origin : "";
 
for (i = stk->num_matches - 1; 0 < rem && 0 <= i; i--) {
-   struct match_attr *a = stk->attrs[i];
+   const struct match_attr *a = stk->attrs[i];
if (a->is_macro)
continue;
if (path_matches(path, pathlen, basename_offset,
diff --git a/attr.h b/attr.h
index 5aaf55c3e..abebbc19c 100644
--- a/attr.h
+++ b/attr.h
@@ -11,7 +11,7 @@ struct all_attrs_item;
  * Given a string, return the gitattribute object that
  * corresponds to it.
  */
-struct git_attr *git_attr(const char *);
+const struct git_attr *git_attr(const char *);
 
 /* Internal use */
 extern const char git_attr__true[];
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 40cdff13e..4d01ca0c8 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -166,7 +166,8 @@ int cmd_check_attr(int argc, const char **argv, const char 
*prefix)
check = attr_check_alloc();
if (!all_attrs) {
for (i = 0; i < cnt; i++) {
-   struct git_attr *a = git_attr(argv[i]);
+   const struct git_attr *a = git_attr(argv[i]);
+
if (!a)
return error("%s: not a valid attribute name",
 argv[i]);
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 27/27] attr: reformat git_attr_set_direction() function

2017-01-27 Thread Brandon Williams
Move the 'git_attr_set_direction()' up to be closer to the variables
that it modifies as well as a small formatting by renaming the variable
'new' to 'new_direction' so that it is more descriptive.

Update the comment about how 'direction' is used to read the state of
the world.  It should be noted that callers of
'git_attr_set_direction()' should ensure that other threads are not
making calls into the attribute system until after the call to
'git_attr_set_direction()' completes.  This function essentially acts as
reset button for the attribute system and should be handled with care.

Signed-off-by: Brandon Williams 
---
 attr.c | 49 -
 attr.h |  3 ++-
 2 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/attr.c b/attr.c
index 62298ec2f..5493bff22 100644
--- a/attr.c
+++ b/attr.c
@@ -677,26 +677,30 @@ static struct attr_stack *read_attr_from_array(const char 
**list)
 }
 
 /*
- * NEEDSWORK: these two are tricky.  The callers assume there is a
- * single, system-wide global state "where we read attributes from?"
- * and when the state is flipped by calling git_attr_set_direction(),
- * attr_stack is discarded so that subsequent attr_check will lazily
- * read from the right place.  And they do not know or care who called
- * by them uses the attribute subsystem, hence have no knowledge of
- * existing git_attr_check instances or future ones that will be
- * created).
- *
- * Probably we need a thread_local that holds these two variables,
- * and a list of git_attr_check instances (which need to be maintained
- * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and
- * git_attr_check_clear().  Then git_attr_set_direction() updates the
- * fields in that thread_local for these two variables, iterate over
- * all the active git_attr_check instances and discard the attr_stack
- * they hold.  Yuck, but it sounds doable.
+ * Callers into the attribute system assume there is a single, system-wide
+ * global state where attributes are read from and when the state is flipped by
+ * calling git_attr_set_direction(), the stack frames that have been
+ * constructed need to be discarded so so that subsequent calls into the
+ * attribute system will lazily read from the right place.  Since changing
+ * direction causes a global paradigm shift, it should not ever be called while
+ * another thread could potentially be calling into the attribute system.
  */
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
+void git_attr_set_direction(enum git_attr_direction new_direction,
+   struct index_state *istate)
+{
+   if (is_bare_repository() && new_direction != GIT_ATTR_INDEX)
+   die("BUG: non-INDEX attr direction in a bare repo");
+
+   if (new_direction != direction)
+   drop_all_attr_stacks();
+
+   direction = new_direction;
+   use_index = istate;
+}
+
 static struct attr_stack *read_attr_from_file(const char *path, int macro_ok)
 {
FILE *fp = fopen(path, "r");
@@ -1148,19 +1152,6 @@ void git_all_attrs(const char *path, struct attr_check 
*check)
}
 }
 
-void git_attr_set_direction(enum git_attr_direction new, struct index_state 
*istate)
-{
-   enum git_attr_direction old = direction;
-
-   if (is_bare_repository() && new != GIT_ATTR_INDEX)
-   die("BUG: non-INDEX attr direction in a bare repo");
-
-   direction = new;
-   if (new != old)
-   drop_all_attr_stacks();
-   use_index = istate;
-}
-
 void attr_start(void)
 {
 #ifndef NO_PTHREADS
diff --git a/attr.h b/attr.h
index 6f4961fdb..48ab3e1c2 100644
--- a/attr.h
+++ b/attr.h
@@ -72,7 +72,8 @@ enum git_attr_direction {
GIT_ATTR_CHECKOUT,
GIT_ATTR_INDEX
 };
-void git_attr_set_direction(enum git_attr_direction, struct index_state *);
+void git_attr_set_direction(enum git_attr_direction new_direction,
+   struct index_state *istate);
 
 extern void attr_start(void);
 
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 18/27] attr: retire git_check_attrs() API

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

Since nobody uses the old API, make it file-scope static, and update
the documentation to describe the new API.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 Documentation/technical/api-gitattributes.txt | 86 +--
 attr.c|  3 +-
 attr.h|  1 -
 3 files changed, 58 insertions(+), 32 deletions(-)

diff --git a/Documentation/technical/api-gitattributes.txt 
b/Documentation/technical/api-gitattributes.txt
index 260266867..e7cbb7c13 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -16,10 +16,15 @@ Data Structure
of no interest to the calling programs.  The name of the
attribute can be retrieved by calling `git_attr_name()`.
 
-`struct git_attr_check`::
+`struct attr_check_item`::
 
-   This structure represents a set of attributes to check in a call
-   to `git_check_attr()` function, and receives the results.
+   This structure represents one attribute and its value.
+
+`struct attr_check`::
+
+   This structure represents a collection of `attr_check_item`.
+   It is passed to `git_check_attr()` function, specifying the
+   attributes to check, and receives their values.
 
 
 Attribute Values
@@ -27,7 +32,7 @@ Attribute Values
 
 An attribute for a path can be in one of four states: Set, Unset,
 Unspecified or set to a string, and `.value` member of `struct
-git_attr_check` records it.  There are three macros to check these:
+attr_check_item` records it.  There are three macros to check these:
 
 `ATTR_TRUE()`::
 
@@ -48,49 +53,51 @@ value of the attribute for the path.
 Querying Specific Attributes
 
 
-* Prepare an array of `struct git_attr_check` to define the list of
-  attributes you would want to check.  To populate this array, you would
-  need to define necessary attributes by calling `git_attr()` function.
+* Prepare `struct attr_check` using attr_check_initl()
+  function, enumerating the names of attributes whose values you are
+  interested in, terminated with a NULL pointer.  Alternatively, an
+  empty `struct attr_check` can be prepared by calling
+  `attr_check_alloc()` function and then attributes you want to
+  ask about can be added to it with `attr_check_append()`
+  function.
 
 * Call `git_check_attr()` to check the attributes for the path.
 
-* Inspect `git_attr_check` structure to see how each of the attribute in
-  the array is defined for the path.
+* Inspect `attr_check` structure to see how each of the
+  attribute in the array is defined for the path.
 
 
 Example
 ---
 
-To see how attributes "crlf" and "indent" are set for different paths.
+To see how attributes "crlf" and "ident" are set for different paths.
 
-. Prepare an array of `struct git_attr_check` with two elements (because
-  we are checking two attributes).  Initialize their `attr` member with
-  pointers to `struct git_attr` obtained by calling `git_attr()`:
+. Prepare a `struct attr_check` with two elements (because
+  we are checking two attributes):
 
 
-static struct git_attr_check check[2];
+static struct attr_check *check;
 static void setup_check(void)
 {
-   if (check[0].attr)
+   if (check)
return; /* already done */
-   check[0].attr = git_attr("crlf");
-   check[1].attr = git_attr("ident");
+   check = attr_check_initl("crlf", "ident", NULL);
 }
 
 
-. Call `git_check_attr()` with the prepared array of `struct git_attr_check`:
+. Call `git_check_attr()` with the prepared `struct attr_check`:
 
 
const char *path;
 
setup_check();
-   git_check_attr(path, ARRAY_SIZE(check), check);
+   git_check_attr(path, check);
 
 
-. Act on `.value` member of the result, left in `check[]`:
+. Act on `.value` member of the result, left in `check->items[]`:
 
 
-   const char *value = check[0].value;
+   const char *value = check->items[0].value;
 
if (ATTR_TRUE(value)) {
The attribute is Set, by listing only the name of the
@@ -109,20 +116,39 @@ static void setup_check(void)
}
 
 
+To see how attributes in argv[] are set for different paths, only
+the first step in the above would be different.
+
+
+static struct attr_check *check;
+static void setup_check(const char **argv)
+{
+   check = attr_check_alloc();
+   while (*argv) {
+   struct git_attr *attr = git_attr(*argv);
+   attr_check_append(check, attr);
+   argv++;
+   }
+}
+
+
 
 Querying All Attributes
 ---
 
 To get the values of all attributes associated with a file:
 
-* Call `git_all_attrs()`, which returns an array 

[PATCH v3 17/27] attr: convert git_check_attrs() callers to use the new API

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

The remaining callers are all simple "I have N attributes I am
interested in.  I'll ask about them with various paths one by one".

After this step, no caller to git_check_attrs() remains.  After
removing it, we can extend "struct attr_check" struct with data
that can be used in optimizing the query for the specific N
attributes it contains.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 archive.c  | 24 ++--
 builtin/pack-objects.c | 19 +--
 convert.c  | 17 ++---
 ll-merge.c | 33 ++---
 userdiff.c | 19 ---
 ws.c   | 19 ++-
 6 files changed, 45 insertions(+), 86 deletions(-)

diff --git a/archive.c b/archive.c
index b76bd4691..60b889198 100644
--- a/archive.c
+++ b/archive.c
@@ -87,19 +87,6 @@ void *sha1_file_to_archive(const struct archiver_args *args,
return buffer;
 }
 
-static void setup_archive_check(struct attr_check_item *check)
-{
-   static struct git_attr *attr_export_ignore;
-   static struct git_attr *attr_export_subst;
-
-   if (!attr_export_ignore) {
-   attr_export_ignore = git_attr("export-ignore");
-   attr_export_subst = git_attr("export-subst");
-   }
-   check[0].attr = attr_export_ignore;
-   check[1].attr = attr_export_subst;
-}
-
 struct directory {
struct directory *up;
struct object_id oid;
@@ -120,10 +107,10 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
void *context)
 {
static struct strbuf path = STRBUF_INIT;
+   static struct attr_check *check;
struct archiver_context *c = context;
struct archiver_args *args = c->args;
write_archive_entry_fn_t write_entry = c->write_entry;
-   struct attr_check_item check[2];
const char *path_without_prefix;
int err;
 
@@ -137,11 +124,12 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
strbuf_addch(, '/');
path_without_prefix = path.buf + args->baselen;
 
-   setup_archive_check(check);
-   if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
-   if (ATTR_TRUE(check[0].value))
+   if (!check)
+   check = attr_check_initl("export-ignore", "export-subst", NULL);
+   if (!git_check_attr(path_without_prefix, check)) {
+   if (ATTR_TRUE(check->items[0].value))
return 0;
-   args->convert = ATTR_TRUE(check[1].value);
+   args->convert = ATTR_TRUE(check->items[1].value);
}
 
if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8b8fbd814..181e4a198 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -894,24 +894,15 @@ static void write_pack_file(void)
written, nr_result);
 }
 
-static void setup_delta_attr_check(struct attr_check_item *check)
-{
-   static struct git_attr *attr_delta;
-
-   if (!attr_delta)
-   attr_delta = git_attr("delta");
-
-   check[0].attr = attr_delta;
-}
-
 static int no_try_delta(const char *path)
 {
-   struct attr_check_item check[1];
+   static struct attr_check *check;
 
-   setup_delta_attr_check(check);
-   if (git_check_attrs(path, ARRAY_SIZE(check), check))
+   if (!check)
+   check = attr_check_initl("delta", NULL);
+   if (git_check_attr(path, check))
return 0;
-   if (ATTR_FALSE(check->value))
+   if (ATTR_FALSE(check->items[0].value))
return 1;
return 0;
 }
diff --git a/convert.c b/convert.c
index 1b9829279..8d652bf27 100644
--- a/convert.c
+++ b/convert.c
@@ -1085,24 +1085,19 @@ struct conv_attrs {
int ident;
 };
 
-static const char *conv_attr_name[] = {
-   "crlf", "ident", "filter", "eol", "text",
-};
-#define NUM_CONV_ATTRS ARRAY_SIZE(conv_attr_name)
-
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
-   int i;
-   static struct attr_check_item ccheck[NUM_CONV_ATTRS];
+   static struct attr_check *check;
 
-   if (!ccheck[0].attr) {
-   for (i = 0; i < NUM_CONV_ATTRS; i++)
-   ccheck[i].attr = git_attr(conv_attr_name[i]);
+   if (!check) {
+   check = attr_check_initl("crlf", "ident", "filter",
+"eol", "text", NULL);
user_convert_tail = _convert;
git_config(read_convert_config, NULL);
}
 
-   if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) {
+   if (!git_check_attr(path, check)) {
+   struct attr_check_item *ccheck = check->items;
   

[PATCH v3 12/27] Documentation: fix a typo

2017-01-27 Thread Brandon Williams
From: Stefan Beller 

Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 Documentation/gitattributes.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 3173dee7e..a53d093ca 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -88,7 +88,7 @@ is either not set or empty, $HOME/.config/git/attributes is 
used instead.
 Attributes for all users on a system should be placed in the
 `$(prefix)/etc/gitattributes` file.
 
-Sometimes you would need to override an setting of an attribute
+Sometimes you would need to override a setting of an attribute
 for a path to `Unspecified` state.  This can be done by listing
 the name of the attribute prefixed with an exclamation point `!`.
 
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 19/27] attr: pass struct attr_check to collect_some_attrs

2017-01-27 Thread Brandon Williams
The old callchain used to take an array of attr_check_item items.
Instead pass the 'attr_check' container object to 'collect_some_attrs()'
and access the fields in the data structure directly.

Signed-off-by: Brandon Williams 
---
 attr.c | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/attr.c b/attr.c
index c0e7893b5..81a3c74d8 100644
--- a/attr.c
+++ b/attr.c
@@ -846,9 +846,7 @@ static int macroexpand_one(int nr, int rem)
  * check_all_attr. If num is non-zero, only attributes in check[] are
  * collected. Otherwise all attributes are collected.
  */
-static void collect_some_attrs(const char *path, int num,
-  struct attr_check_item *check)
-
+static void collect_some_attrs(const char *path, struct attr_check *check)
 {
struct attr_stack *stk;
int i, pathlen, rem, dirlen;
@@ -871,17 +869,18 @@ static void collect_some_attrs(const char *path, int num,
prepare_attr_stack(path, dirlen);
for (i = 0; i < attr_nr; i++)
check_all_attr[i].value = ATTR__UNKNOWN;
-   if (num && !cannot_trust_maybe_real) {
+   if (check->nr && !cannot_trust_maybe_real) {
rem = 0;
-   for (i = 0; i < num; i++) {
-   if (!check[i].attr->maybe_real) {
+   for (i = 0; i < check->nr; i++) {
+   const struct git_attr *a = check->items[i].attr;
+   if (!a->maybe_real) {
struct attr_check_item *c;
-   c = check_all_attr + check[i].attr->attr_nr;
+   c = check_all_attr + a->attr_nr;
c->value = ATTR__UNSET;
rem++;
}
}
-   if (rem == num)
+   if (rem == check->nr)
return;
}
 
@@ -890,18 +889,17 @@ static void collect_some_attrs(const char *path, int num,
rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-static int git_check_attrs(const char *path, int num,
-  struct attr_check_item *check)
+int git_check_attr(const char *path, struct attr_check *check)
 {
int i;
 
-   collect_some_attrs(path, num, check);
+   collect_some_attrs(path, check);
 
-   for (i = 0; i < num; i++) {
-   const char *value = 
check_all_attr[check[i].attr->attr_nr].value;
+   for (i = 0; i < check->nr; i++) {
+   const char *value = 
check_all_attr[check->items[i].attr->attr_nr].value;
if (value == ATTR__UNKNOWN)
value = ATTR__UNSET;
-   check[i].value = value;
+   check->items[i].value = value;
}
 
return 0;
@@ -912,7 +910,7 @@ void git_all_attrs(const char *path, struct attr_check 
*check)
int i;
 
attr_check_reset(check);
-   collect_some_attrs(path, check->nr, check->items);
+   collect_some_attrs(path, check);
 
for (i = 0; i < attr_nr; i++) {
const char *name = check_all_attr[i].attr->name;
@@ -925,11 +923,6 @@ void git_all_attrs(const char *path, struct attr_check 
*check)
}
 }
 
-int git_check_attr(const char *path, struct attr_check *check)
-{
-   return git_check_attrs(path, check->nr, check->items);
-}
-
 void git_attr_set_direction(enum git_attr_direction new, struct index_state 
*istate)
 {
enum git_attr_direction old = direction;
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 15/27] attr: (re)introduce git_check_attr() and struct attr_check

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

A common pattern to check N attributes for many paths is to

 (1) prepare an array A of N attr_check_item items;
 (2) call git_attr() to intern the N attribute names and fill A;
 (3) repeatedly call git_check_attrs() for path with N and A;

A look-up for these N attributes for a single path P scans the
entire attr_stack, starting from the .git/info/attributes file and
then .gitattributes file in the directory the path P is in, going
upwards to find .gitattributes file found in parent directories.

An earlier commit 06a604e6 (attr: avoid heavy work when we know the
specified attr is not defined, 2014-12-28) tried to optimize out
this scanning for one trivial special case: when the attribute being
sought is known not to exist, we do not have to scan for it.  While
this may be a cheap and effective heuristic, it would not work well
when N is (much) more than 1.

What we would want is a more customized way to skip irrelevant
entries in the attribute stack, and the definition of irrelevance
is tied to the set of attributes passed to git_check_attrs() call,
i.e. the set of attributes being sought.  The data necessary for
this optimization needs to live alongside the set of attributes, but
a simple array of git_attr_check_elem simply does not have any place
for that.

Introduce "struct attr_check" that contains N, the number of
attributes being sought, and A, the array that holds N
attr_check_item items, and a function git_check_attr() that
takes a path P and this structure as its parameters.  This structure
can later be extended to hold extra data necessary for optimization.

Also, to make it easier to write the first two steps in common
cases, introduce git_attr_check_initl() helper function, which takes
a NULL-terminated list of attribute names and initialize this
structure.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 74 ++
 attr.h | 17 +++
 2 files changed, 91 insertions(+)

diff --git a/attr.c b/attr.c
index 2f180d609..de8bf35a3 100644
--- a/attr.c
+++ b/attr.c
@@ -132,6 +132,75 @@ struct git_attr *git_attr(const char *name)
return git_attr_internal(name, strlen(name));
 }
 
+struct attr_check *attr_check_alloc(void)
+{
+   return xcalloc(1, sizeof(struct attr_check));
+}
+
+struct attr_check *attr_check_initl(const char *one, ...)
+{
+   struct attr_check *check;
+   int cnt;
+   va_list params;
+   const char *param;
+
+   va_start(params, one);
+   for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
+   ;
+   va_end(params);
+
+   check = attr_check_alloc();
+   check->nr = cnt;
+   check->alloc = cnt;
+   check->items = xcalloc(cnt, sizeof(struct attr_check_item));
+
+   check->items[0].attr = git_attr(one);
+   va_start(params, one);
+   for (cnt = 1; cnt < check->nr; cnt++) {
+   const struct git_attr *attr;
+   param = va_arg(params, const char *);
+   if (!param)
+   die("BUG: counted %d != ended at %d",
+   check->nr, cnt);
+   attr = git_attr(param);
+   if (!attr)
+   die("BUG: %s: not a valid attribute name", param);
+   check->items[cnt].attr = attr;
+   }
+   va_end(params);
+   return check;
+}
+
+struct attr_check_item *attr_check_append(struct attr_check *check,
+ const struct git_attr *attr)
+{
+   struct attr_check_item *item;
+
+   ALLOC_GROW(check->items, check->nr + 1, check->alloc);
+   item = >items[check->nr++];
+   item->attr = attr;
+   return item;
+}
+
+void attr_check_reset(struct attr_check *check)
+{
+   check->nr = 0;
+}
+
+void attr_check_clear(struct attr_check *check)
+{
+   free(check->items);
+   check->items = NULL;
+   check->alloc = 0;
+   check->nr = 0;
+}
+
+void attr_check_free(struct attr_check *check)
+{
+   attr_check_clear(check);
+   free(check);
+}
+
 /* What does a matched pattern decide? */
 struct attr_state {
struct git_attr *attr;
@@ -865,6 +934,11 @@ int git_all_attrs(const char *path, int *num, struct 
attr_check_item **check)
return 0;
 }
 
+int git_check_attr(const char *path, struct attr_check *check)
+{
+   return git_check_attrs(path, check->nr, check->items);
+}
+
 void git_attr_set_direction(enum git_attr_direction new, struct index_state 
*istate)
 {
enum git_attr_direction old = direction;
diff --git a/attr.h b/attr.h
index efc7bb3b3..e611b139a 100644
--- a/attr.h
+++ b/attr.h
@@ -29,6 +29,22 @@ struct attr_check_item {
const char *value;
 };
 
+struct attr_check {
+   int nr;
+   int alloc;
+   struct attr_check_item *items;
+};
+

[PATCH v3 10/27] attr: support quoting pathname patterns in C style

2017-01-27 Thread Brandon Williams
From: Nguyễn Thái Ngọc Duy 

Full pattern must be quoted. So 'pat"t"ern attr' will give exactly
'pat"t"ern', not 'pattern'. Also clarify that leading whitespaces are
not part of the pattern and document comment syntax.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 Documentation/gitattributes.txt |  8 +---
 attr.c  | 15 +--
 t/t0003-attributes.sh   | 26 ++
 3 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index e0b66c122..3173dee7e 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -21,9 +21,11 @@ Each line in `gitattributes` file is of form:
pattern attr1 attr2 ...
 
 That is, a pattern followed by an attributes list,
-separated by whitespaces.  When the pattern matches the
-path in question, the attributes listed on the line are given to
-the path.
+separated by whitespaces. Leading and trailing whitespaces are
+ignored. Lines that begin with '#' are ignored. Patterns
+that begin with a double quote are quoted in C style.
+When the pattern matches the path in question, the attributes
+listed on the line are given to the path.
 
 Each attribute can be in one of these states for a given path:
 
diff --git a/attr.c b/attr.c
index d180c7833..e1c630f79 100644
--- a/attr.c
+++ b/attr.c
@@ -13,6 +13,7 @@
 #include "attr.h"
 #include "dir.h"
 #include "utf8.h"
+#include "quote.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -212,12 +213,21 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
const char *cp, *name, *states;
struct match_attr *res = NULL;
int is_macro;
+   struct strbuf pattern = STRBUF_INIT;
 
cp = line + strspn(line, blank);
if (!*cp || *cp == '#')
return NULL;
name = cp;
-   namelen = strcspn(name, blank);
+
+   if (*cp == '"' && !unquote_c_style(, name, )) {
+   name = pattern.buf;
+   namelen = pattern.len;
+   } else {
+   namelen = strcspn(name, blank);
+   states = name + namelen;
+   }
+
if (strlen(ATTRIBUTE_MACRO_PREFIX) < namelen &&
starts_with(name, ATTRIBUTE_MACRO_PREFIX)) {
if (!macro_ok) {
@@ -239,7 +249,6 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
else
is_macro = 0;
 
-   states = name + namelen;
states += strspn(states, blank);
 
/* First pass to count the attr_states */
@@ -282,9 +291,11 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
cannot_trust_maybe_real = 1;
}
 
+   strbuf_release();
return res;
 
 fail_return:
+   strbuf_release();
free(res);
return NULL;
 }
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index f0fbb4255..f19ae4f8c 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -13,10 +13,31 @@ attr_check () {
test_line_count = 0 err
 }
 
+attr_check_quote () {
+
+   path="$1"
+   quoted_path="$2"
+   expect="$3"
+
+   git check-attr test -- "$path" >actual &&
+   echo "\"$quoted_path\": test: $expect" >expect &&
+   test_cmp expect actual
+
+}
+
+test_expect_success 'open-quoted pathname' '
+   echo "\"a test=a" >.gitattributes &&
+   test_must_fail attr_check a a
+'
+
+
 test_expect_success 'setup' '
mkdir -p a/b/d a/c b &&
(
echo "[attr]notest !test"
+   echo "\" d \"   test=d"
+   echo " etest=e"
+   echo " e\"  test=e"
echo "f test=f"
echo "a/i test=a/i"
echo "onoff test -test"
@@ -69,6 +90,11 @@ test_expect_success 'command line checks' '
 '
 
 test_expect_success 'attribute test' '
+
+   attr_check " d " d &&
+   attr_check e e &&
+   attr_check_quote e\" e\\\" e &&
+
attr_check f f &&
attr_check a/f f &&
attr_check a/c/f f &&
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 07/27] attr.c: simplify macroexpand_one()

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

The double-loop wants to do an early return immediately when one
matching macro is found.  Eliminate the extra variable 'a' used for
that purpose and rewrite the "assign the found item to 'a' to make
it non-NULL and force the loop(s) to terminate" with a direct return
from there.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/attr.c b/attr.c
index 17297fffe..e42f931b3 100644
--- a/attr.c
+++ b/attr.c
@@ -705,24 +705,21 @@ static int fill(const char *path, int pathlen, int 
basename_offset,
 static int macroexpand_one(int nr, int rem)
 {
struct attr_stack *stk;
-   struct match_attr *a = NULL;
int i;
 
if (check_all_attr[nr].value != ATTR__TRUE ||
!check_all_attr[nr].attr->maybe_macro)
return rem;
 
-   for (stk = attr_stack; !a && stk; stk = stk->prev)
-   for (i = stk->num_matches - 1; !a && 0 <= i; i--) {
+   for (stk = attr_stack; stk; stk = stk->prev) {
+   for (i = stk->num_matches - 1; 0 <= i; i--) {
struct match_attr *ma = stk->attrs[i];
if (!ma->is_macro)
continue;
if (ma->u.attr->attr_nr == nr)
-   a = ma;
+   return fill_one("expand", ma, rem);
}
-
-   if (a)
-   rem = fill_one("expand", a, rem);
+   }
 
return rem;
 }
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 06/27] attr.c: mark where #if DEBUG ends more clearly

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 9bdf87a6f..17297fffe 100644
--- a/attr.c
+++ b/attr.c
@@ -469,7 +469,7 @@ static void debug_set(const char *what, const char *match, 
struct git_attr *attr
 #define debug_push(a) do { ; } while (0)
 #define debug_pop(a) do { ; } while (0)
 #define debug_set(a,b,c,d) do { ; } while (0)
-#endif
+#endif /* DEBUG_ATTR */
 
 static void drop_attr_stack(void)
 {
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 08/27] attr.c: tighten constness around "git_attr" structure

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

It holds an interned string, and git_attr_name() is a way to peek
into it.  Make sure the involved pointer types are pointer-to-const.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 2 +-
 attr.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index e42f931b3..f7cf7ae30 100644
--- a/attr.c
+++ b/attr.c
@@ -43,7 +43,7 @@ static int cannot_trust_maybe_real;
 static struct git_attr_check *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
 
-char *git_attr_name(struct git_attr *attr)
+const char *git_attr_name(const struct git_attr *attr)
 {
return attr->name;
 }
diff --git a/attr.h b/attr.h
index 8b08d33af..00d7a662c 100644
--- a/attr.h
+++ b/attr.h
@@ -25,7 +25,7 @@ extern const char git_attr__false[];
  * Unset one is returned as NULL.
  */
 struct git_attr_check {
-   struct git_attr *attr;
+   const struct git_attr *attr;
const char *value;
 };
 
@@ -34,7 +34,7 @@ struct git_attr_check {
  * return value is a pointer to a null-delimited string that is part
  * of the internal data structure; it should not be modified or freed.
  */
-char *git_attr_name(struct git_attr *);
+extern const char *git_attr_name(const struct git_attr *);
 
 int git_check_attr(const char *path, int, struct git_attr_check *);
 
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr()

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/attr.c b/attr.c
index 007f1a299..6b55a57ef 100644
--- a/attr.c
+++ b/attr.c
@@ -183,6 +183,12 @@ static const char *parse_attr(const char *src, int lineno, 
const char *cp,
return NULL;
}
} else {
+   /*
+* As this function is always called twice, once with
+* e == NULL in the first pass and then e != NULL in
+* the second pass, no need for invalid_attr_name()
+* check here.
+*/
if (*cp == '-' || *cp == '!') {
e->setto = (*cp == '-') ? ATTR__FALSE : ATTR__UNSET;
cp++;
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 03/27] attr.c: update a stale comment on "struct match_attr"

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

When 82dce998 (attr: more matching optimizations from .gitignore,
2012-10-15) changed a pointer to a string "*pattern" into an
embedded "struct pattern" in struct match_attr, it forgot to update
the comment that describes the structure.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index 04d24334e..007f1a299 100644
--- a/attr.c
+++ b/attr.c
@@ -131,9 +131,8 @@ struct pattern {
  * If is_macro is true, then u.attr is a pointer to the git_attr being
  * defined.
  *
- * If is_macro is false, then u.pattern points at the filename pattern
- * to which the rule applies.  (The memory pointed to is part of the
- * memory block allocated for the match_attr instance.)
+ * If is_macro is false, then u.pat is the filename pattern to which the
+ * rule applies.
  *
  * In either case, num_attr is the number of attributes affected by
  * this rule, and state is an array listing them.  The attributes are
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 09/27] attr.c: plug small leak in parse_attr_line()

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

If any error is noticed after the match_attr structure is allocated,
we shouldn't just return NULL from this function.

Add a fail_return label that frees the allocated structure and
returns NULL, and consistently jump there when we want to return
NULL after cleaning up.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/attr.c b/attr.c
index f7cf7ae30..d180c7833 100644
--- a/attr.c
+++ b/attr.c
@@ -223,7 +223,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
if (!macro_ok) {
fprintf(stderr, "%s not allowed: %s:%d\n",
name, src, lineno);
-   return NULL;
+   goto fail_return;
}
is_macro = 1;
name += strlen(ATTRIBUTE_MACRO_PREFIX);
@@ -233,7 +233,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
fprintf(stderr,
"%.*s is not a valid attribute name: %s:%d\n",
namelen, name, src, lineno);
-   return NULL;
+   goto fail_return;
}
}
else
@@ -246,7 +246,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
for (cp = states, num_attr = 0; *cp; num_attr++) {
cp = parse_attr(src, lineno, cp, NULL);
if (!cp)
-   return NULL;
+   goto fail_return;
}
 
res = xcalloc(1,
@@ -267,7 +267,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
if (res->u.pat.flags & EXC_FLAG_NEGATIVE) {
warning(_("Negative patterns are ignored in git 
attributes\n"
  "Use '\\!' for literal leading 
exclamation."));
-   return NULL;
+   goto fail_return;
}
}
res->is_macro = is_macro;
@@ -283,6 +283,10 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
}
 
return res;
+
+fail_return:
+   free(res);
+   return NULL;
 }
 
 /*
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 00/27] Revamp the attribute system; another round

2017-01-27 Thread Brandon Williams
Per some of the discussion online and off I locally broke up up the question
and answer and I wasn't very thrilled with the outcome for a number of reasons.

1. The API is more complex.  Callers needs to have two structures allocated
instead of one, one can be shared read-only while the other can't.  While this
many not be that big of a deal, it was more confusing to me.

2. Performance hit.  The allocation churn with creating/freeing a
scoreboard and the results struct adds up.  It even looks like the
cost of looking up a stack frame in a hashmap isn't very cheap.

  Here are some very rough performance measurements I made on my machine
  on linux.git by: `perf stat -r 50 git grep "asdfghjkl"`

  master: 0.302176063 seconds
  v1: 0.324243806 seconds
  v2: 0.304339636 seconds
  split:  0.349892023 seconds (hashtable of stacks, all_attr scoreboard
   allocated per git_attr_check() call, split
   question/answer)

After looking at this, I'm of the opinion that the API in v2 is the best route
to take.  Its a step-up from what it is currently (at master) and there isn't a
performance degradation (ok there's a small bit but it seems within the margin
of error).  It also allows for easier adaptation of the API if we wanted to do
a change in the future since the primary functionality remains intact, or to do
optimizations like stack pruning (if we decided to go down that route).

Given the above, v3 is a reroll of the same design as in v2.  This is a good
milestone in improving the attribute system as it achieves the goal of making
the attribute subsystem thread-safe (ie multiple callers can be executing
inside the attribute system at the same time) and will enable a future series
to allow pathspec code to call into the attribute system.

Most of the changes in this revision are cosmetic (variable renames, code
movement, etc) but there was a memory leak that was also fixed.

Brandon Williams (8):
  attr: pass struct attr_check to collect_some_attrs
  attr: use hashmap for attribute dictionary
  attr: eliminate global check_all_attr array
  attr: remove maybe-real, maybe-macro from git_attr
  attr: tighten const correctness with git_attr and match_attr
  attr: store attribute stack in attr_check structure
  attr: push the bare repo check into read_attr()
  attr: reformat git_attr_set_direction() function

Junio C Hamano (17):
  commit.c: use strchrnul() to scan for one line
  attr.c: use strchrnul() to scan for one line
  attr.c: update a stale comment on "struct match_attr"
  attr.c: explain the lack of attr-name syntax check in parse_attr()
  attr.c: complete a sentence in a comment
  attr.c: mark where #if DEBUG ends more clearly
  attr.c: simplify macroexpand_one()
  attr.c: tighten constness around "git_attr" structure
  attr.c: plug small leak in parse_attr_line()
  attr.c: add push_stack() helper
  attr.c: outline the future plans by heavily commenting
  attr: rename function and struct related to checking attributes
  attr: (re)introduce git_check_attr() and struct attr_check
  attr: convert git_all_attrs() to use "struct attr_check"
  attr: convert git_check_attrs() callers to use the new API
  attr: retire git_check_attrs() API
  attr: change validity check for attribute names to use positive logic

Nguyễn Thái Ngọc Duy (1):
  attr: support quoting pathname patterns in C style

Stefan Beller (1):
  Documentation: fix a typo

 Documentation/gitattributes.txt   |  10 +-
 Documentation/technical/api-gitattributes.txt |  86 ++-
 archive.c |  24 +-
 attr.c| 878 ++
 attr.h|  49 +-
 builtin/check-attr.c  |  66 +-
 builtin/pack-objects.c|  19 +-
 commit.c  |   3 +-
 common-main.c |   3 +
 convert.c |  25 +-
 ll-merge.c|  33 +-
 t/t0003-attributes.sh |  26 +
 userdiff.c|  19 +-
 ws.c  |  19 +-
 14 files changed, 816 insertions(+), 444 deletions(-)

-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 01/27] commit.c: use strchrnul() to scan for one line

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 commit.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 2cf85158b..0c4ee3de4 100644
--- a/commit.c
+++ b/commit.c
@@ -415,8 +415,7 @@ int find_commit_subject(const char *commit_buffer, const 
char **subject)
p++;
if (*p) {
p = skip_blank_lines(p + 2);
-   for (eol = p; *eol && *eol != '\n'; eol++)
-   ; /* do nothing */
+   eol = strchrnul(p, '\n');
} else
eol = p;
 
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 02/27] attr.c: use strchrnul() to scan for one line

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/attr.c b/attr.c
index 1fcf042b8..04d24334e 100644
--- a/attr.c
+++ b/attr.c
@@ -402,8 +402,8 @@ static struct attr_stack *read_attr_from_index(const char 
*path, int macro_ok)
for (sp = buf; *sp; ) {
char *ep;
int more;
-   for (ep = sp; *ep && *ep != '\n'; ep++)
-   ;
+
+   ep = strchrnul(sp, '\n');
more = (*ep == '\n');
*ep = '\0';
handle_attr_line(res, sp, path, ++lineno, macro_ok);
-- 
2.11.0.483.g087da7b7c-goog



[PATCH v3 05/27] attr.c: complete a sentence in a comment

2017-01-27 Thread Brandon Williams
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Brandon Williams 
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 6b55a57ef..9bdf87a6f 100644
--- a/attr.c
+++ b/attr.c
@@ -300,7 +300,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
  * directory (again, reading the file from top to bottom) down to the
  * current directory, and then scan the list backwards to find the first match.
  * This is exactly the same as what is_excluded() does in dir.c to deal with
- * .gitignore
+ * .gitignore file and info/excludes file as a fallback.
  */
 
 static struct attr_stack {
-- 
2.11.0.483.g087da7b7c-goog



Re: [PATCH 1/2] pack-objects: enforce --depth limit in reused deltas

2017-01-27 Thread Junio C Hamano
Jeff King  writes:

>> > +#On the receiving end, "index-pack --fix-thin" will
>> > +#complete the pack with a base copy of tree X-1.
>> 
>> blob? tree? I think the argument would work the same way for either
>> type of objects, but the previous paragraph is using blob as the
>> example, so s/tree/blob/ here?
>
> Oops, yes. I had originally sketched out the example to use trees, but
> it was easier to assemble with blobs so I switched (I never actually dug
> into my "wild" case to see what it was segfaulting on, but I agree it
> applies equally well to either).

OK, then I'll squash in s/tree/blob/ here, in addition to the "my"
thing.  Thanks.


Re: [PATCH v2 1/1] reset: support the --stdin option

2017-01-27 Thread Junio C Hamano
Jeff King  writes:

> I think a lot of the documentation uses  to refer to pathspecs
> (e.g., git-log(1), git-diff(1), etc).  As long as we're consistent with
> that convention, I don't think it's that big a deal.
>
> This spot needs a specific mention because it violates the convention.

Yup, I think we are in agreement.

> I don't know if the are other spots where it might be unclear, but I
> think we're probably better to tighten those as they come up, rather
> than switch to saying "" everywhere.

Oh, I do not think I would disagree.  As I think this change is an
instancethat makes the resulting text unclear, it would set a good
example to tighten existing one as part of its clean-up.

It can be done as a follow-up bugfix patch (i.e. "previous one made
the resulting document uncleasr and here is to fix it"), but that
would not serve as good ra ole model to mentor newer contributor as
doing the other way around.




Re: [PATCH 1/2] pack-objects: enforce --depth limit in reused deltas

2017-01-27 Thread Jeff King
On Fri, Jan 27, 2017 at 03:31:36PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Since 898b14c (pack-objects: rework check_delta_limit usage,
> > 2007-04-16), we check the delta depth limit only when
> > figuring out whether we should make a new delta. We don't
> > consider it at all when reusing deltas, which means that
> > packing once with --depth=250, and then again with
> > --depth=50, the second pack my still contain chains larger
> > than 50.
> 
> "may still contain", methinks.

Oops, yes. There was another typo there that I fixed while proofreading,
and clearly I made it worse. :-/

> > This patch bounds the length of delta chains in the output
> > pack based on --depth, regardless of whether they are caused
> > by cross-pack deltas or existed in the input packs. This
> > fixes the problem, but does have two possible downsides:
> >
> >   1. High-depth aggressive repacks followed by "normal"
> >  repacks will throw away the high-depth chains.
> 
> I actually think it is a feature that the normal one that runs later
> is allowed to fix an over-deep delta chain.

Yeah, I saw you expressing that sentiment in the earlier thread, and I
think I agree with it.

The big problem to me is mostly that people may be surprised by the
change of behavior if they have a complicated setup. But those people
read the release notes, right? ;)

Arguably setting gc.aggressiveDepth is a mistake after this patch (you
should just set pack.depth).  Possibly we should ignore it with a
warning.

> > +#On the receiving end, "index-pack --fix-thin" will
> > +#complete the pack with a base copy of tree X-1.
> 
> blob? tree? I think the argument would work the same way for either
> type of objects, but the previous paragraph is using blob as the
> example, so s/tree/blob/ here?

Oops, yes. I had originally sketched out the example to use trees, but
it was easier to assemble with blobs so I switched (I never actually dug
into my "wild" case to see what it was segfaulting on, but I agree it
applies equally well to either).

> > +# Note that this whole setup is pretty reliant on the current
> > +# packing heuristics. We double-check that our test case
> > +# actually produces a long chain. If it doesn't, it should be
> > +# adjusted (or scrapped if the heuristics have become too unreliable)
> 
> IOW, we want something that says "we first check X and if X still
> holds, then we expect Y to also hold; if X no longer hold, do not
> bother to test that Y holds".  Nice food for thought.  Perhaps we
> want a way to express that in our test framework, or is X in the
> above merely another way to say "prerequisite"?

It _is_ a prerequisite, but I think unlike our normal prerequisites, we
wouldn't want to just quietly skip the test if it fails. Because it's
not "oops, this system doesn't support this test" so much as "something
in Git changed, and a human needs to evaluate whether this test can
still be performed".

So I hoped that if that prerequisite test ever broke due to a change in
pack-objects, the person making the change would find the comment and
decide the appropriate next step.

I'll include a version below fixing the typos you found, in case you did
not just mark them up already.

-- >8 --
Subject: [PATCH] pack-objects: enforce --depth limit in reused deltas

Since 898b14c (pack-objects: rework check_delta_limit usage,
2007-04-16), we check the delta depth limit only when
figuring out whether we should make a new delta. We don't
consider it at all when reusing deltas, which means that
packing once with --depth=250, and then again with
--depth=50, the second pack may still contain chains larger
than 50.

This is generally considered a feature, as the results of
earlier high-depth repacks are carried forward, used for
serving fetches, etc. However, since we started using
cross-pack deltas in c9af708b1 (pack-objects: use mru list
when iterating over packs, 2016-08-11), we are no longer
bounded by the length of an existing delta chain in a single
pack.

Here's one particular pathological case: a sequence of N
packs, each with 2 objects, the base of which is stored as a
delta in a previous pack. If we chain all the deltas
together, we have a cycle of length N. We break the cycle,
but the tip delta is still at depth N-1.

This is less unlikely than it might sound. See the included
test for a reconstruction based on real-world actions.  I
ran into such a case in the wild, where a client was rapidly
sending packs, and we had accumulated 10,000 before doing a
server-side repack.  The pack that "git repack" tried to
generate had a very deep chain, which caused pack-objects to
run out of stack space in the recursive write_one().

This patch bounds the length of delta chains in the output
pack based on --depth, regardless of whether they are caused
by cross-pack deltas or existed in the input packs. This
fixes the problem, but does have two possible downsides:

  1. High-depth aggressive repacks 

Re: Deadlock between git-remote-http and git fetch-pack

2017-01-27 Thread Junio C Hamano
tsuna  writes:

> While investigating a hung job in our CI system today, I think I found
> a deadlock in git-remote-http
> ...
> Here PID 27319 (git fetch-pack) is stuck reading on stdin, while its
> parent, PID 27317 (git-remote-http) is stuck reading on its child’s
> stdout.  Nothing has moved for like 2h, it’s deadlocked.

Hmph, would this be related to 296b847c0d ("remote-curl: don't hang
when a server dies before any output", 2016-11-18) I wonder...


Re: [PATCH 1/2] pack-objects: enforce --depth limit in reused deltas

2017-01-27 Thread Junio C Hamano
Jeff King  writes:

> Since 898b14c (pack-objects: rework check_delta_limit usage,
> 2007-04-16), we check the delta depth limit only when
> figuring out whether we should make a new delta. We don't
> consider it at all when reusing deltas, which means that
> packing once with --depth=250, and then again with
> --depth=50, the second pack my still contain chains larger
> than 50.

"may still contain", methinks.

> ...
>
> This patch bounds the length of delta chains in the output
> pack based on --depth, regardless of whether they are caused
> by cross-pack deltas or existed in the input packs. This
> fixes the problem, but does have two possible downsides:
>
>   1. High-depth aggressive repacks followed by "normal"
>  repacks will throw away the high-depth chains.

I actually think it is a feature that the normal one that runs later
is allowed to fix an over-deep delta chain.

>   2. If you really do want to store high-depth deltas on
>  disk, they may be discarded and new delta computed when
>  serving a fetch, unless you set pack.depth to match
>  your high-depth size.

Likewise.

> ... But we may
> visit an object through multiple delta paths, ...

Yes, good thinking.

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 8841f8b36..2b08c8121 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1539,6 +1539,8 @@ static int pack_offset_sort(const void *_a, const void 
> *_b)
>   *   2. Updating our size/type to the non-delta representation. These were
>   *  either not recorded initially (size) or overwritten with the delta 
> type
>   *  (type) when check_object() decided to reuse the delta.
> + *
> + *   3. Resetting our delta depth, as we are now a base object.
>   */
>  static void drop_reused_delta(struct object_entry *entry)
>  {
> @@ -1552,6 +1554,7 @@ static void drop_reused_delta(struct object_entry 
> *entry)
>   p = &(*p)->delta_sibling;
>   }
>   entry->delta = NULL;
> + entry->depth = 0;
>  
>   oi.sizep = >size;
>   oi.typep = >type;

Makes sense.

>  static void break_delta_chains(struct object_entry *entry)
>  {
> @@ -1587,6 +1593,18 @@ static void break_delta_chains(struct object_entry 
> *entry)
>*/
>   entry->dfs_state = DFS_ACTIVE;
>   break_delta_chains(entry->delta);
> +
> + /*
> +  * Once we've recursed, our base (if we still have one) knows
> +  * its depth, so we can compute ours (and check it against
> +  * the limit).
> +  */
> + if (entry->delta) {
> + entry->depth = entry->delta->depth + 1;
> + if (entry->depth > depth)
> + drop_reused_delta(entry);
> + }

;-)  Surprisingly simpple and effective.

> diff --git a/t/t5316-pack-delta-depth.sh b/t/t5316-pack-delta-depth.sh
> new file mode 100755
> index 0..236d60fe6
> --- /dev/null
> +++ b/t/t5316-pack-delta-depth.sh
> @@ -0,0 +1,93 @@
> +#!/bin/sh
> +
> +test_description='pack-objects breaks long cross-pack delta chains'
> +. ./test-lib.sh
> +
> +# This mirrors a repeated push setup:
> +#
> +# 1. A client repeatedly modifies some files, makes a
> +#  commit, and pushes the result. It does this N times
> +#  before we get around to repacking.
> +#
> +# 2. Each push generates a thin pack with the new version of
> +#various objects. Let's consider some file in the root tree
> +#which is updated in each commit.
> +#
> +#When generating push number X, we feed commit X-1 (and
> +#thus blob X-1) as a preferred base. The resulting pack has
> +#blob X as a thin delta against blob X-1.
> +#
> +#On the receiving end, "index-pack --fix-thin" will
> +#complete the pack with a base copy of tree X-1.

blob? tree? I think the argument would work the same way for either
type of objects, but the previous paragraph is using blob as the
example, so s/tree/blob/ here?

> +# 3. In older versions of git, if we used the delta from
> +#pack X, then we'd always find blob X-1 as a base in the
> +#same pack (and generate a fresh delta).
> +#
> +#But with the pack mru, we jump from delta to delta
> +#following the traversal order:
> +# ...
> +# Now we have blob X as a delta against X-1, which is a delta
> +# against X-2, and so forth.

> +# Note that this whole setup is pretty reliant on the current
> +# packing heuristics. We double-check that our test case
> +# actually produces a long chain. If it doesn't, it should be
> +# adjusted (or scrapped if the heuristics have become too unreliable)

IOW, we want something that says "we first check X and if X still
holds, then we expect Y to also hold; if X no longer hold, do not
bother to test that Y holds".  Nice food for thought.  Perhaps we
want a way to express that in our test framework, or is X in the
above merely another way to say 

Re: Deadlock between git-remote-http and git fetch-pack

2017-01-27 Thread Jonathan Tan

On 01/27/2017 02:31 PM, tsuna wrote:

Hi there,
While investigating a hung job in our CI system today, I think I found
a deadlock in git-remote-http

Git version: 2.9.3
Linux (amd64) kernel 4.9.0

Excerpt from the process list:

jenkins  27316  0.0  0.0  18508  6024 ?S19:30   0:00  |
   \_ git -C ../../../arista fetch --unshallow
jenkins  27317  0.0  0.0 169608 10916 ?S19:30   0:00  |
   \_ git-remote-http origin http://gerrit/arista
jenkins  27319  0.0  0.0  24160  8260 ?S19:30   0:00  |
   \_ git fetch-pack --stateless-rpc --stdin
--lock-pack --include-tag --thin --no-progress --depth=2147483647
http://gerrit/arista/

Here PID 27319 (git fetch-pack) is stuck reading on stdin, while its
parent, PID 27317 (git-remote-http) is stuck reading on its child’s
stdout.  Nothing has moved for like 2h, it’s deadlocked.


strace -fp 27319

strace: Process 27319 attached
read(0,

Here FD 0 is a pipe:

~ @8a33a534e2f7> lsof -np 27319 | grep 0r
git 27319 jenkins0r  FIFO   0,10  0t0 354519158 pipe

The writing end of which is owned by the parent process:

~ @8a33a534e2f7> lsof -n 2>/dev/null | fgrep 354519158
git-remot 27317jenkins4w FIFO   0,10  0t0
354519158 pipe
git   27319jenkins0r FIFO   0,10  0t0
354519158 pipe

And the parent process (git-remote-http) is stuck reading from another FD:


strace -fp 27317

strace: Process 27317 attached
read(5,

And here FD 5 is another pipe:

~ @8a33a534e2f7> lsof -np 27317 | grep 5r
git-remot 27317 jenkins5r  FIFO   0,10  0t0 354519159 pipe

Which is the child’s stdout:


lsof -n 2>/dev/null | fgrep 354519159

git-remot 27317jenkins5r FIFO   0,10  0t0
354519159 pipe
git   27319jenkins1w FIFO   0,10  0t0
354519159 pipe

Hence the deadlock.

Stack trace in git-remote-http:

(gdb) bt
#0  0x7f04f1e1363d in read () from target:/lib64/libpthread.so.0
#1  0x562417472d73 in xread ()
#2  0x562417472f2b in read_in_full ()
#3  0x562417438a6e in get_packet_data ()
#4  0x562417439129 in packet_read ()
#5  0x5624174245e0 in rpc_service ()
#6  0x562417424f10 in fetch_git ()
#7  0x5624174233fd in main ()

Stack trace in git fetch-pack:

(gdb) bt
#0  0x7fb3ab478620 in __read_nocancel () from target:/lib64/libpthread.so.0
#1  0x55f688827283 in xread ()
#2  0x55f68882743b in read_in_full ()
#3  0x55f6887ce35e in get_packet_data ()
#4  0x55f6887cea19 in packet_read ()
#5  0x55f6887ceb90 in packet_read_line ()
#6  0x55f68879dd05 in get_ack ()
#7  0x55f68879f6b4 in fetch_pack ()
#8  0x55f688710619 in cmd_fetch_pack ()
#9  0x55f6886dff7b in handle_builtin ()
#10 0x55f6886df026 in main ()

I looked at the diff between v2.9.3 and HEAD on fetch-pack.c and
remote-curl.c and didn’t see anything noteworthy in that area of the
code, so I presume the bug is still there in master.



I haven't looked into this in detail, but this might be related to 
something I discovered while writing my patch set. I noticed that 
upload-pack (the process on the "other side" of fetch-pack) can die 
without first writing any notification, causing fetch-pack to block 
forever on a read. A fix would probably look like that patch [1].


[1] 



Deadlock between git-remote-http and git fetch-pack

2017-01-27 Thread tsuna
Hi there,
While investigating a hung job in our CI system today, I think I found
a deadlock in git-remote-http

Git version: 2.9.3
Linux (amd64) kernel 4.9.0

Excerpt from the process list:

jenkins  27316  0.0  0.0  18508  6024 ?S19:30   0:00  |
   \_ git -C ../../../arista fetch --unshallow
jenkins  27317  0.0  0.0 169608 10916 ?S19:30   0:00  |
   \_ git-remote-http origin http://gerrit/arista
jenkins  27319  0.0  0.0  24160  8260 ?S19:30   0:00  |
   \_ git fetch-pack --stateless-rpc --stdin
--lock-pack --include-tag --thin --no-progress --depth=2147483647
http://gerrit/arista/

Here PID 27319 (git fetch-pack) is stuck reading on stdin, while its
parent, PID 27317 (git-remote-http) is stuck reading on its child’s
stdout.  Nothing has moved for like 2h, it’s deadlocked.

> strace -fp 27319
strace: Process 27319 attached
read(0,

Here FD 0 is a pipe:

~ @8a33a534e2f7> lsof -np 27319 | grep 0r
git 27319 jenkins0r  FIFO   0,10  0t0 354519158 pipe

The writing end of which is owned by the parent process:

~ @8a33a534e2f7> lsof -n 2>/dev/null | fgrep 354519158
git-remot 27317jenkins4w FIFO   0,10  0t0
354519158 pipe
git   27319jenkins0r FIFO   0,10  0t0
354519158 pipe

And the parent process (git-remote-http) is stuck reading from another FD:

> strace -fp 27317
strace: Process 27317 attached
read(5,

And here FD 5 is another pipe:

~ @8a33a534e2f7> lsof -np 27317 | grep 5r
git-remot 27317 jenkins5r  FIFO   0,10  0t0 354519159 pipe

Which is the child’s stdout:

> lsof -n 2>/dev/null | fgrep 354519159
git-remot 27317jenkins5r FIFO   0,10  0t0
354519159 pipe
git   27319jenkins1w FIFO   0,10  0t0
354519159 pipe

Hence the deadlock.

Stack trace in git-remote-http:

(gdb) bt
#0  0x7f04f1e1363d in read () from target:/lib64/libpthread.so.0
#1  0x562417472d73 in xread ()
#2  0x562417472f2b in read_in_full ()
#3  0x562417438a6e in get_packet_data ()
#4  0x562417439129 in packet_read ()
#5  0x5624174245e0 in rpc_service ()
#6  0x562417424f10 in fetch_git ()
#7  0x5624174233fd in main ()

Stack trace in git fetch-pack:

(gdb) bt
#0  0x7fb3ab478620 in __read_nocancel () from target:/lib64/libpthread.so.0
#1  0x55f688827283 in xread ()
#2  0x55f68882743b in read_in_full ()
#3  0x55f6887ce35e in get_packet_data ()
#4  0x55f6887cea19 in packet_read ()
#5  0x55f6887ceb90 in packet_read_line ()
#6  0x55f68879dd05 in get_ack ()
#7  0x55f68879f6b4 in fetch_pack ()
#8  0x55f688710619 in cmd_fetch_pack ()
#9  0x55f6886dff7b in handle_builtin ()
#10 0x55f6886df026 in main ()

I looked at the diff between v2.9.3 and HEAD on fetch-pack.c and
remote-curl.c and didn’t see anything noteworthy in that area of the
code, so I presume the bug is still there in master.

-- 
Benoit "tsuna" Sigoure


Re: [PATCH v2 1/1] reset: support the --stdin option

2017-01-27 Thread Jeff King
On Fri, Jan 27, 2017 at 10:30:48AM -0800, Junio C Hamano wrote:

> > Is it worth clarifying that these are paths, not pathspecs? The word
> > "paths" is used to refer to the pathspecs on the command-line elsewhere
> > in the document.
> 
> If the code forces literal pathspecs, then what the user feeds to
> the command is no longer pathspecs from the user's point of view,
> and I agree that the distinction is important.  
> 
> If the remainder of the documentation is loose in terminology and
> the reason why we were able to get away with it was because we
> consistently used "paths" when we meant "pathspec", these existing
> mention of "paths" have to be tightened, either in this patch or a
> preparatory step patch before this one, because the addition of
> "this thing reads paths not pathspecs" is what makes them ambiguous.

I think a lot of the documentation uses  to refer to pathspecs
(e.g., git-log(1), git-diff(1), etc).  As long as we're consistent with
that convention, I don't think it's that big a deal.

This spot needs a specific mention because it violates the convention.

I don't know if the are other spots where it might be unclear, but I
think we're probably better to tighten those as they come up, rather
than switch to saying "" everywhere.

That's outside the scope of this series, though, I would think.

-Peff


[PATCH 2/2] pack-objects: convert recursion to iteration in break_delta_chain()

2017-01-27 Thread Jeff King
The break_delta_chain() function is recursive over the depth
of a given delta chain, which can lead to possibly running
out of stack space. Normally delta depth is quite small, but
if there _is_ a pathological case, this is where we would
find and fix it, so we should be more careful.

We can do it without recursion at all, but there's a little
bit of cleverness needed to do so. It's easiest to explain
by covering the less-clever strategies first.

The obvious thing to try is just keeping our own stack on
the heap. Whenever we would recurse, push the new entry onto
the stack and loop instead. But this gets tricky; when we
see an ACTIVE entry, we need to care if we just pushed it
(in which case it's a cycle) or if we just popped it (in
which case we dealt with its bases, and no we need to clear
the ACTIVE flag and compute its depth).

You can hack around that in various ways, like keeping a
"just pushed" flag, but the logic gets muddled. However, we
can observe that we do all of our pushes first, and then all
of our pops afterwards. In other words, we can do this in
two passes. First dig down to the base, stopping when we see
a cycle, and pushing each item onto our stack.  Then pop the
stack elements, clearing the ACTIVE flag and computing the
depth for each.

This works, and is reasonably elegant. However, why do we
need the stack for the second pass? We can just walk the
delta pointers again. There's one complication. Popping the
stack went over our list in reverse, so we could compute the
depth of each entry by incrementing the depth of its base,
which we will have just computed.  To go forward in the
second pass, we have to compute the total depth on the way
down, and then assign it as we go.

This patch implements this final strategy, because it not
only keeps the memory off the stack, but it eliminates it
entirely. Credit for the cleverness in that approach goes to
Michael Haggerty; bugs are mine.

Signed-off-by: Jeff King 
---
The diff is nearly impossible to read, so I'd recommend just looking at
the end result. I tried to document the tricky parts with comments.
There are a few parts that could be made more terse, but I screwed it up
so many times while writing it that I decided to do it in a way that
carefully documents all of the assumptions.

 builtin/pack-objects.c | 129 +
 1 file changed, 99 insertions(+), 30 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 2b08c8121..c7af47548 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1579,48 +1579,117 @@ static void drop_reused_delta(struct object_entry 
*entry)
  */
 static void break_delta_chains(struct object_entry *entry)
 {
-   /* If it's not a delta, it can't be part of a cycle. */
-   if (!entry->delta) {
-   entry->dfs_state = DFS_DONE;
-   return;
-   }
+   /*
+* The actual depth of each object we will write is stored as an int,
+* as it cannot exceed our int "depth" limit. But before we break
+* changes based no that limit, we may potentially go as deep as the
+* number of objects, which is elsewhere bounded to a uint32_t.
+*/
+   uint32_t total_depth;
+   struct object_entry *cur, *next;
+
+   for (cur = entry, total_depth = 0;
+cur;
+cur = cur->delta, total_depth++) {
+   if (cur->dfs_state == DFS_DONE) {
+   /*
+* We've already seen this object and know it isn't
+* part of a cycle. We do need to append its depth
+* to our count.
+*/
+   total_depth += cur->depth;
+   break;
+   }
 
-   switch (entry->dfs_state) {
-   case DFS_NONE:
/*
-* This is the first time we've seen the object. We mark it as
-* part of the active potential cycle and recurse.
+* We break cycles before looping, so an ACTIVE state (or any
+* other cruft which made its way into the state variable)
+* is a bug.
 */
-   entry->dfs_state = DFS_ACTIVE;
-   break_delta_chains(entry->delta);
+   if (cur->dfs_state != DFS_NONE)
+   die("BUG: confusing delta dfs state in first pass: %d",
+   cur->dfs_state);
 
/*
-* Once we've recursed, our base (if we still have one) knows
-* its depth, so we can compute ours (and check it against
-* the limit).
+* Now we know this is the first time we've seen the object. If
+* it's not a delta, we're done traversing, but we'll mark it
+* done to save time on future traversals.
 */
-   if (entry->delta) {
- 

[PATCH 0/2] limit reused delta chains based on --depth

2017-01-27 Thread Jeff King
Back when we switched pack-objects to visiting packs in
most-recently-used order last August, we realized that this could reuse
cross-pack deltas, and that the result could have longer delta chains
than any single pack contains.

I produced a patch back then[1], but we decided not to follow through
with it. Two things happened to make me revive that patch:

  1. I hit a case in the wild with a really long delta chain that caused
 pack-objects' write_one() to run out of stack space.

  2. We dropped the --aggressive depth to match the normal one. So
 there's less concern about mismatches throwing out on-disk deltas
 from a previous aggressive repack (but see the caveats in the first
 patch's commit message).

So here it is, plus another patch that converts the recursion to
iteration (the stack space needed by the function is small enough that
just the first patch was enough to fix my problem case, but it seemed
like tempting fate to leave it recursive).

  [1/2]: pack-objects: enforce --depth limit in reused deltas
  [2/2]: pack-objects: convert recursion to iteration in break_delta_chain()

 builtin/pack-objects.c  | 133 
 pack-objects.h  |   4 ++
 t/t5316-pack-delta-depth.sh |  93 +++
 3 files changed, 207 insertions(+), 23 deletions(-)
 create mode 100755 t/t5316-pack-delta-depth.sh

-Peff

[1] 
http://public-inbox.org/git/20160811095710.p2bffympjlwmv...@sigill.intra.peff.net/


[PATCH 1/2] pack-objects: enforce --depth limit in reused deltas

2017-01-27 Thread Jeff King
Since 898b14c (pack-objects: rework check_delta_limit usage,
2007-04-16), we check the delta depth limit only when
figuring out whether we should make a new delta. We don't
consider it at all when reusing deltas, which means that
packing once with --depth=250, and then again with
--depth=50, the second pack my still contain chains larger
than 50.

This is generally considered a feature, as the results of
earlier high-depth repacks are carried forward, used for
serving fetches, etc. However, since we started using
cross-pack deltas in c9af708b1 (pack-objects: use mru list
when iterating over packs, 2016-08-11), we are no longer
bounded by the length of an existing delta chain in a single
pack.

Here's one particular pathological case: a sequence of N
packs, each with 2 objects, the base of which is stored as a
delta in a previous pack. If we chain all the deltas
together, we have a cycle of length N. We break the cycle,
but the tip delta is still at depth N-1.

This is less unlikely than it might sound. See the included
test for a reconstruction based on real-world actions.  I
ran into such a case in the wild, where a client was rapidly
sending packs, and we had accumulated 10,000 before doing a
server-side repack.  The pack that "git repack" tried to
generate had a very deep chain, which caused pack-objects to
run out of stack space in the recursive write_one().

This patch bounds the length of delta chains in the output
pack based on --depth, regardless of whether they are caused
by cross-pack deltas or existed in the input packs. This
fixes the problem, but does have two possible downsides:

  1. High-depth aggressive repacks followed by "normal"
 repacks will throw away the high-depth chains.

 In the long run this is probably OK; investigation
 showed that high-depth repacks aren't actually
 beneficial, and we dropped the aggressive depth default
 to match the normal case in 07e7dbf0d (gc: default
 aggressive depth to 50, 2016-08-11).

  2. If you really do want to store high-depth deltas on
 disk, they may be discarded and new delta computed when
 serving a fetch, unless you set pack.depth to match
 your high-depth size.

The implementation uses the existing search for delta
cycles.  That lets us compute the depth of any node based on
the depth of its base, because we know the base is DFS_DONE
by the time we look at it (modulo any cycles in the graph,
but we know there cannot be any because we break them as we
see them).

There is some subtlety worth mentioning, though. We record
the depth of each object as we compute it. It might seem
like we could save the per-object storage space by just
keeping track of the depth of our traversal (i.e., have
break_delta_chains() report how deep it went). But we may
visit an object through multiple delta paths, and on
subsequent paths we want to know its depth immediately,
without having to walk back down to its final base (doing so
would make our graph walk quadratic rather than linear).

Likewise, one could try to record the depth not from the
base, but from our starting point (i.e., start
recursion_depth at 0, and pass "recursion_depth + 1" to each
invocation of break_delta_chains()). And then when
recursion_depth gets too big, we know that we must cut the
delta chain.  But that technique is wrong if we do not visit
the nodes in topological order. In a chain A->B->C, it
if we visit "C", then "B", then "A", we will never recurse
deeper than 1 link (because we see at each node that we have
already visited it).

Signed-off-by: Jeff King 
---
 builtin/pack-objects.c  | 18 +
 pack-objects.h  |  4 ++
 t/t5316-pack-delta-depth.sh | 93 +
 3 files changed, 115 insertions(+)
 create mode 100755 t/t5316-pack-delta-depth.sh

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8841f8b36..2b08c8121 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1539,6 +1539,8 @@ static int pack_offset_sort(const void *_a, const void 
*_b)
  *   2. Updating our size/type to the non-delta representation. These were
  *  either not recorded initially (size) or overwritten with the delta type
  *  (type) when check_object() decided to reuse the delta.
+ *
+ *   3. Resetting our delta depth, as we are now a base object.
  */
 static void drop_reused_delta(struct object_entry *entry)
 {
@@ -1552,6 +1554,7 @@ static void drop_reused_delta(struct object_entry *entry)
p = &(*p)->delta_sibling;
}
entry->delta = NULL;
+   entry->depth = 0;
 
oi.sizep = >size;
oi.typep = >type;
@@ -1570,6 +1573,9 @@ static void drop_reused_delta(struct object_entry *entry)
  * Follow the chain of deltas from this entry onward, throwing away any links
  * that cause us to hit a cycle (as determined by the DFS state flags in
  * the entries).
+ *
+ * We also detect too-long reused chains that would 

Re: octopus merge --no-ff claims to fast-forward even though it doesn't

2017-01-27 Thread Junio C Hamano
Samuel Lijin  writes:

> I was doing an octopus merge earlier and noticed that it claims to
> fast-forward when you specify --no-ff, even though it does actually
> abide by --no-ff.

This was intentional and hasn't changed since it was first designed;
the octopus was to be used only for the simple and obvious merges.

If anything, I think it should error out when the --no-ff option is
given, issuing the same error message as the one given when any step
other than the last one needs manual resolution.


Re: [PATCH] doc: clarify distinction between sign-off and pgp-signing

2017-01-27 Thread Junio C Hamano
Cornelius Weig  writes:

> Sorry, I forgot to mark this patch as follow-up to message
> 

I appreciate that you are very considerate, but in practice, if you
do not have too many topics in flight and your response time is less
than 48 hours, we can tell which new message is about which older
discussion thread.  Don't worry about it too much.

Thanks.


Re: [PATCH] doc: clarify distinction between sign-off and pgp-signing

2017-01-27 Thread Philip Oakley

From: "Stefan Beller" 

On Fri, Jan 27, 2017 at 12:01 PM,   wrote:

From: Cornelius Weig 

The documentation for submission discourages pgp-signing, but demands
a proper sign-off by contributors. However, when skimming the headings,
the wording of the section for sign-off could mistakenly be understood
as concerning pgp-signing. Thus, new contributors could oversee the
necessary sign-off.

This commit improves the wording such that the section about sign-off
cannot be misunderstood as pgp-signing. In addition, the paragraph about
pgp-signing is changed such that it avoids the impression that
pgp-signing could be relevant at later stages of the submission.

Signed-off-by: Cornelius Weig 
Signed-off-by: Junio C Hamano 
Signed-off-by: Philip Oakley 
Signed-off-by: Stefan Beller 
---

Notes:
This patch summarizes the suggested changes.

As I don't know what is appropriate, I took the liberty to add 
everybody's

sign-off who was involved in the discussion in alphabetic order.


Heh, my first though was to conclude you haven't read the
sign off part, yet apart from the changed header.
/me goes back and actually reads the DCO again.
And actually these sign offs were there in other patches in this area,
so you'd claim (a) that yours was just created partly by you and having
other patches that were also signed off (b), whose sign offs you
merely copy over to here.

And then after reading I realized I slightly confused the signing
myself as the sign offs are also used to track the flow of a patch.
These sign offs suggest that you made the patch initially, then
passed it to Junio, then to Philip and then to me.
And Junio will sign it again when applying the patch.

So maybe s/signed-off-by/helped-by/?


Helped-by: Philip Oakley 

is sufficient for me (if that).


The patch with the aggregation looks good to me.

Thanks,
Stefan



 Documentation/SubmittingPatches | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/Documentation/SubmittingPatches 
b/Documentation/SubmittingPatches

index 08352de..3faf7eb 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -216,12 +216,11 @@ that it will be postponed.
 Exception:  If your mailer is mangling patches then someone may ask
 you to re-send them using MIME, that is OK.

-Do not PGP sign your patch, at least for now.  Most likely, your
-maintainer or other people on the list would not have your PGP
-key and would not bother obtaining it anyway.  Your patch is not
-judged by who you are; a good patch from an unknown origin has a
-far better chance of being accepted than a patch from a known,
-respected origin that is done poorly or does incorrect things.
+Do not PGP sign your patch. Most likely, your maintainer or other people 
on the
+list would not have your PGP key and would not bother obtaining it 
anyway.
+Your patch is not judged by who you are; a good patch from an unknown 
origin
+has a far better chance of being accepted than a patch from a known, 
respected

+origin that is done poorly or does incorrect things.

 If you really really really really want to do a PGP signed
 patch, format it as "multipart/signed", not a text/plain message
@@ -246,7 +245,7 @@ patch.
  *2* The mailing list: git@vger.kernel.org


-(5) Sign your work
+(5) Certify your work by adding your "Signed-off-by: " line

 To improve tracking of who did what, we've borrowed the
 "sign-off" procedure from the Linux kernel project on patches
--
2.10.2







[PATCH v2 0/7] completion: recognize more long-options

2017-01-27 Thread cornelius . weig
From: Cornelius Weig 

This revision addresses Johannes' concerns. Changes wrt v1:

 - fixed the commit message: two of the "dangerous" options erroneously ended
   up in the commit message. These options were already in the list of
   auto-completable options.
 - removed the possibly dangerous option '--unsafe-paths' from git-apply.
 - added my sign-off

Patches 1-6 are not resent, because they have not changed (other than my added 
sign-off).

Also, I added further people to CC, because nobody actually has looked at the 
code yet.


Cornelius Weig (7):
  completion: recognize more long-options

 contrib/completion/git-completion.bash | 132 +++--
 1 file changed, 110 insertions(+), 22 deletions(-)

-- 
2.10.2



[PATCH v2 7/7] completion: recognize more long-options

2017-01-27 Thread cornelius . weig
From: Cornelius Weig 

Recognize several new long-options for bash completion in the following
commands:

 - apply: --recount --directory=
 - archive: --output
 - branch: --column --no-column --sort= --points-at
 - clone: --no-single-branch --shallow-submodules
 - commit: --patch --short --date --allow-empty
 - describe: --first-parent
 - fetch, pull: --unshallow --update-shallow
 - fsck: --name-objects
 - grep: --break --heading --show-function --function-context
 --untracked --no-index
 - mergetool: --prompt --no-prompt
 - reset: --keep
 - revert: --strategy= --strategy-option=
 - rm: --force
 - shortlog: --email
 - tag: --merged --no-merged --create-reflog

Signed-off-by: Cornelius Weig 
Helped-by: Johannes Sixt 
---
 contrib/completion/git-completion.bash | 31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 0e09519..933bb6e 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -936,6 +936,7 @@ _git_apply ()
--apply --no-add --exclude=
--ignore-whitespace --ignore-space-change
--whitespace= --inaccurate-eof --verbose
+   --recount --directory=
"
return
esac
@@ -974,7 +975,7 @@ _git_archive ()
--*)
__gitcomp "
--format= --list --verbose
-   --prefix= --remote= --exec=
+   --prefix= --remote= --exec= --output
"
return
;;
@@ -1029,6 +1030,7 @@ _git_branch ()
--track --no-track --contains --merged --no-merged
--set-upstream-to= --edit-description --list
--unset-upstream --delete --move --remotes
+   --column --no-column --sort= --points-at
"
;;
*)
@@ -1142,6 +1144,8 @@ _git_clone ()
--single-branch
--branch
--recurse-submodules
+   --no-single-branch
+   --shallow-submodules
"
return
;;
@@ -1183,6 +1187,7 @@ _git_commit ()
--reset-author --file= --message= --template=
--cleanup= --untracked-files --untracked-files=
--verbose --quiet --fixup= --squash=
+   --patch --short --date --allow-empty
"
return
esac
@@ -1201,7 +1206,7 @@ _git_describe ()
--*)
__gitcomp "
--all --tags --contains --abbrev= --candidates=
-   --exact-match --debug --long --match --always
+   --exact-match --debug --long --match --always 
--first-parent
"
return
esac
@@ -1284,6 +1289,7 @@ __git_fetch_recurse_submodules="yes on-demand no"
 __git_fetch_options="
--quiet --verbose --append --upload-pack --force --keep --depth=
--tags --no-tags --all --prune --dry-run --recurse-submodules=
+   --unshallow --update-shallow
 "
 
 _git_fetch ()
@@ -1333,7 +1339,7 @@ _git_fsck ()
--*)
__gitcomp "
--tags --root --unreachable --cache --no-reflogs --full
-   --strict --verbose --lost-found
+   --strict --verbose --lost-found --name-objects
"
return
;;
@@ -1377,6 +1383,8 @@ _git_grep ()
--max-depth
--count
--and --or --not --all-match
+   --break --heading --show-function --function-context
+   --untracked --no-index
"
return
;;
@@ -1576,7 +1584,7 @@ _git_mergetool ()
return
;;
--*)
-   __gitcomp "--tool="
+   __gitcomp "--tool= --prompt --no-prompt"
return
;;
esac
@@ -2456,7 +2464,7 @@ _git_reset ()
 
case "$cur" in
--*)
-   __gitcomp "--merge --mixed --hard --soft --patch"
+   __gitcomp "--merge --mixed --hard --soft --patch --keep"
return
;;
esac
@@ -2472,7 +2480,10 @@ _git_revert ()
fi
case "$cur" in
--*)
-   __gitcomp "--edit --mainline --no-edit --no-commit --signoff"
+   __gitcomp "
+   --edit --mainline --no-edit --no-commit --signoff
+   

Re: [PATCH] doc: clarify distinction between sign-off and pgp-signing

2017-01-27 Thread Stefan Beller
On Fri, Jan 27, 2017 at 12:48 PM, Cornelius Weig
 wrote:
>>
>> So maybe s/signed-off-by/helped-by/?
>>
>
> This is getting real complex :-/

uh; sorry for that. I do not mind the patch as posted,
just in case you reroll for another reason, this is worth thinking about.

In fact, as said before I like that patch.

>
> As I said in the notes for the patch:
>
>>> As I don't know what is appropriate, I took the liberty to add 
>>> everybody's
>>> sign-off who was involved in the discussion in alphabetic order.
>
> With your explanation, I guess the most accurate sign-off chain would be:
>
> Signed-off-by: Stefan Beller  (as you sent a patch)

...and here we could continue arguing. ;)
Is the patch I sent note-worthy enough to be deriving work from?
My gut reaction would be "no".

> Helped-by: Philip Oakley  (no patch, but helpful)
> Signed-off-by: Cornelius Weig  (this patch)
> Signed-off-by: Junio C Hamano  (once he decides it's good)


Re: [PATCH] test-lib: on FreeBSD, look for unzip(1) in /usr/local/bin/

2017-01-27 Thread Junio C Hamano
Junio C Hamano  writes:

> Johannes Schindelin  writes:
>> On Thu, 21 Jul 2016, Eric Wong wrote:
>>
>>> Thanks, t5003 now works out-of-the-box.
>>> Tested with Info-ZIP unzip installed and uninstalled.
>>> 
>>> Tested-by: Eric Wong 
>>
>> Did you forget about this?
>
> This fell off the radar.  
>
> You could have resent with Eric's Tested-by: added, to make it
> easier to apply.  I'll see if I can find the original but it won't
> happen until later this afternoon.

The errand I needed to run earlier in the day turned out to be not
so time consuming---I found the exchange and the patch is now
queued, and will be part of today's pushout.

Thanks, both.


Re: [PATCH] doc: clarify distinction between sign-off and pgp-signing

2017-01-27 Thread Cornelius Weig
> 
> So maybe s/signed-off-by/helped-by/?
> 

This is getting real complex :-/

As I said in the notes for the patch:

>> As I don't know what is appropriate, I took the liberty to add 
>> everybody's
>> sign-off who was involved in the discussion in alphabetic order.

With your explanation, I guess the most accurate sign-off chain would be:

Signed-off-by: Stefan Beller  (as you sent a patch)
Helped-by: Philip Oakley  (no patch, but helpful)
Signed-off-by: Cornelius Weig  (this patch)
Signed-off-by: Junio C Hamano  (once he decides it's good)


octopus merge --no-ff claims to fast-forward even though it doesn't

2017-01-27 Thread Samuel Lijin
I was doing an octopus merge earlier and noticed that it claims to
fast-forward when you specify --no-ff, even though it does actually
abide by --no-ff.

I can consistently reproduce as follows:

$ git clone https://github.com/sxlijin/merge-octopus-experiment
$ cd merge-octopus-experiment
$ git merge --no-ff origin/A origin/B --no-edit
Fast-forwarding to: origin/A
Trying simple merge with origin/B
Merge made by the 'octopus' strategy.
 anotherA | 0
 anotherB | 0
 otherA   | 0
 otherB   | 0
 4 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 anotherA
 create mode 100644 anotherB
 create mode 100644 otherA
 create mode 100644 otherB

$ git log --graph --pretty=oneline --decorate

I've reproduced the issue with 2.11.0 on both a Windows box (MSYS) and
Linux (Arch).

The issue seems to live in git-merge-octopus.sh, specifically in that
--no-ff does not affect the initial value of NON_FF_MERGE. I'm happy
to submit a patch if someone can point me in the right direction.

Sam


Re: [PATCH 11/12] receive-pack: treat namespace .have lines like alternates

2017-01-27 Thread Junio C Hamano
Jeff King  writes:

> On Fri, Jan 27, 2017 at 06:45:26PM +0100, Lukas Fleischer wrote:
>
>> I think this is already possible using receive.hideRefs (which causes
>> the ref_is_hidden() branch above to return if applicable).
>> ...
>
> Thanks for the pointers. I think a "turn off namespace .have lines"
> option would be easier for some cases, but what you've implemented is
> much more flexible. So if people using namespaces are happy with it, I
> don't see any need to add another way to do the same thing.

Yeah, I agree.  Thanks, both.


Re: [PATCH] doc: clarify distinction between sign-off and pgp-signing

2017-01-27 Thread Stefan Beller
On Fri, Jan 27, 2017 at 12:01 PM,   wrote:
> From: Cornelius Weig 
>
> The documentation for submission discourages pgp-signing, but demands
> a proper sign-off by contributors. However, when skimming the headings,
> the wording of the section for sign-off could mistakenly be understood
> as concerning pgp-signing. Thus, new contributors could oversee the
> necessary sign-off.
>
> This commit improves the wording such that the section about sign-off
> cannot be misunderstood as pgp-signing. In addition, the paragraph about
> pgp-signing is changed such that it avoids the impression that
> pgp-signing could be relevant at later stages of the submission.
>
> Signed-off-by: Cornelius Weig 
> Signed-off-by: Junio C Hamano 
> Signed-off-by: Philip Oakley 
> Signed-off-by: Stefan Beller 
> ---
>
> Notes:
> This patch summarizes the suggested changes.
>
> As I don't know what is appropriate, I took the liberty to add everybody's
> sign-off who was involved in the discussion in alphabetic order.

Heh, my first though was to conclude you haven't read the
sign off part, yet apart from the changed header.
/me goes back and actually reads the DCO again.
And actually these sign offs were there in other patches in this area,
so you'd claim (a) that yours was just created partly by you and having
other patches that were also signed off (b), whose sign offs you
merely copy over to here.

And then after reading I realized I slightly confused the signing
myself as the sign offs are also used to track the flow of a patch.
These sign offs suggest that you made the patch initially, then
passed it to Junio, then to Philip and then to me.
And Junio will sign it again when applying the patch.

So maybe s/signed-off-by/helped-by/?

The patch with the aggregation looks good to me.

Thanks,
Stefan

>
>  Documentation/SubmittingPatches | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
> index 08352de..3faf7eb 100644
> --- a/Documentation/SubmittingPatches
> +++ b/Documentation/SubmittingPatches
> @@ -216,12 +216,11 @@ that it will be postponed.
>  Exception:  If your mailer is mangling patches then someone may ask
>  you to re-send them using MIME, that is OK.
>
> -Do not PGP sign your patch, at least for now.  Most likely, your
> -maintainer or other people on the list would not have your PGP
> -key and would not bother obtaining it anyway.  Your patch is not
> -judged by who you are; a good patch from an unknown origin has a
> -far better chance of being accepted than a patch from a known,
> -respected origin that is done poorly or does incorrect things.
> +Do not PGP sign your patch. Most likely, your maintainer or other people on 
> the
> +list would not have your PGP key and would not bother obtaining it anyway.
> +Your patch is not judged by who you are; a good patch from an unknown origin
> +has a far better chance of being accepted than a patch from a known, 
> respected
> +origin that is done poorly or does incorrect things.
>
>  If you really really really really want to do a PGP signed
>  patch, format it as "multipart/signed", not a text/plain message
> @@ -246,7 +245,7 @@ patch.
>   *2* The mailing list: git@vger.kernel.org
>
>
> -(5) Sign your work
> +(5) Certify your work by adding your "Signed-off-by: " line
>
>  To improve tracking of who did what, we've borrowed the
>  "sign-off" procedure from the Linux kernel project on patches
> --
> 2.10.2
>


[PATCH] doc: clarify distinction between sign-off and pgp-signing

2017-01-27 Thread cornelius . weig
From: Cornelius Weig 

The documentation for submission discourages pgp-signing, but demands
a proper sign-off by contributors. However, when skimming the headings,
the wording of the section for sign-off could mistakenly be understood
as concerning pgp-signing. Thus, new contributors could oversee the
necessary sign-off.

This commit improves the wording such that the section about sign-off
cannot be misunderstood as pgp-signing. In addition, the paragraph about
pgp-signing is changed such that it avoids the impression that
pgp-signing could be relevant at later stages of the submission.

Signed-off-by: Cornelius Weig 
Signed-off-by: Junio C Hamano 
Signed-off-by: Philip Oakley 
Signed-off-by: Stefan Beller 
---

Notes:
This patch summarizes the suggested changes.

As I don't know what is appropriate, I took the liberty to add everybody's
sign-off who was involved in the discussion in alphabetic order.

 Documentation/SubmittingPatches | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 08352de..3faf7eb 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -216,12 +216,11 @@ that it will be postponed.
 Exception:  If your mailer is mangling patches then someone may ask
 you to re-send them using MIME, that is OK.
 
-Do not PGP sign your patch, at least for now.  Most likely, your
-maintainer or other people on the list would not have your PGP
-key and would not bother obtaining it anyway.  Your patch is not
-judged by who you are; a good patch from an unknown origin has a
-far better chance of being accepted than a patch from a known,
-respected origin that is done poorly or does incorrect things.
+Do not PGP sign your patch. Most likely, your maintainer or other people on the
+list would not have your PGP key and would not bother obtaining it anyway.
+Your patch is not judged by who you are; a good patch from an unknown origin
+has a far better chance of being accepted than a patch from a known, respected
+origin that is done poorly or does incorrect things.
 
 If you really really really really want to do a PGP signed
 patch, format it as "multipart/signed", not a text/plain message
@@ -246,7 +245,7 @@ patch.
  *2* The mailing list: git@vger.kernel.org
 
 
-(5) Sign your work
+(5) Certify your work by adding your "Signed-off-by: " line
 
 To improve tracking of who did what, we've borrowed the
 "sign-off" procedure from the Linux kernel project on patches
-- 
2.10.2



Re: [PATCH] doc: clarify distinction between sign-off and pgp-signing

2017-01-27 Thread Cornelius Weig
Sorry, I forgot to mark this patch as follow-up to message



Re: [PATCH v4] t/Makefile: add a rule to re-run previously-failed tests

2017-01-27 Thread Junio C Hamano
Johannes Schindelin  writes:

> This patch automates the process of determining which tests failed
> previously and re-running them.
> ...
>
> Signed-off-by: Johannes Schindelin 

I stored both versions in files and compared them, and it seems the
single word change in the proposed commit log message is the only
difference.  I would have written "Automate the process...", though.

If you are resending, touching up to cover all points raised by a
reviewer and doing nothing else, having "Reviewed-by: Jeff King
" would have been nicer.  

Will queue.  Thanks.

> ---
> Published-As: https://github.com/dscho/git/releases/tag/failing-tests-v4
> Fetch-It-Via: git fetch https://github.com/dscho/git failing-tests-v4
>
>  t/Makefile | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/t/Makefile b/t/Makefile
> index d613935f14..1bb06c36f2 100644
> --- a/t/Makefile
> +++ b/t/Makefile
> @@ -35,6 +35,12 @@ all: $(DEFAULT_TEST_TARGET)
>  test: pre-clean $(TEST_LINT)
>   $(MAKE) aggregate-results-and-cleanup
>  
> +failed:
> + @failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \
> + grep -l '^failed [1-9]' *.counts | \
> + sed -n 's/\.counts$$/.sh/p') && \
> + test -z "$$failed" || $(MAKE) $$failed
> +
>  prove: pre-clean $(TEST_LINT)
>   @echo "*** prove ***"; $(PROVE) --exec '$(SHELL_PATH_SQ)' 
> $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)
>   $(MAKE) clean-except-prove-cache
>
> base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe


Re: [PATCH] test-lib: on FreeBSD, look for unzip(1) in /usr/local/bin/

2017-01-27 Thread Junio C Hamano
Johannes Schindelin  writes:

> Hi Junio,
>
> On Thu, 21 Jul 2016, Eric Wong wrote:
>
>> Johannes Schindelin  wrote:
>> > The common work-around is to install Info-Zip on FreeBSD, into
>> > /usr/local/bin/.
>> > 
>> > Signed-off-by: Johannes Schindelin 
>> 
>> Thanks, t5003 now works out-of-the-box.
>> Tested with Info-ZIP unzip installed and uninstalled.
>> 
>> Tested-by: Eric Wong 
>
> Did you forget about this?

This fell off the radar.  

You could have resent with Eric's Tested-by: added, to make it
easier to apply.  I'll see if I can find the original but it won't
happen until later this afternoon.


Re: [PATCH] help: correct behavior for is_executable on Windows

2017-01-27 Thread Junio C Hamano
Johannes Schindelin  writes:

> From: Heiko Voigt 
>
> The previous implementation said that the filesystem information on
> Windows is not reliable to determine whether a file is executable. To
> gather this information it was peeking into the first two bytes of a
> file to see whether it looks executable.
>
> Apart from the fact that on Windows executables are defined as such by
> their extension (and Git has special code to support "executing" scripts
> when they have a she-bang line) it leads to serious performance
> problems: not only do we have to open many files now, it gets even
> slower when a virus scanner is running.

Heiko, around here (before going into the details of how severe the
problem is and how wonderful the result applying of this change is)
is the best place to summarize the solution.  E.g.

Because the definition of "executable" on Windows is based
on the file extension, update the function to declare that a
file with ".exe" extension without opening and reading the
early bytes from it.  This avoids serious performance issues.

I paraphrased the rest only so that the description of the solution
(i.e. "instead of opening and peeking, we trust .exe suffix") fits
well in the surrounding text; the important part is to say what the
change does clearly.

I agree with the reasoning and the execution of the patch, except
that 

 - "correct behaviour" in the title makes it appear that this is a
   correctness thing, but this is primarily a performance fix.

 - It is a bit strange that "MZ" is dropped in the same patch
   without any mention.

The latter may be "correctness" fix, in that earlier we treated any
file that happens to begin with MZ as an executable, regardless of
the filename, which may have misidentified a non-executable file as
an executable.  If that is what is going on, it deserves to be
described in the log message.

> diff --git a/help.c b/help.c
> index 53e2a67e00..bc6cd19cf3 100644
> --- a/help.c
> +++ b/help.c
> @@ -105,7 +105,22 @@ static int is_executable(const char *name)
>   return 0;
>  
>  #if defined(GIT_WINDOWS_NATIVE)
> -{/* cannot trust the executable bit, peek into the file instead */
> + /*
> +  * On Windows there is no executable bit. The file extension
> +  * indicates whether it can be run as an executable, and Git
> +  * has special-handling to detect scripts and launch them
> +  * through the indicated script interpreter. We test for the
> +  * file extension first because virus scanners may make
> +  * it quite expensive to open many files.
> +  */
> + if (ends_with(name, ".exe"))
> + return S_IXUSR;
> +
> +{
> + /*
> +  * Now that we know it does not have an executable extension,
> +  * peek into the file instead.
> +  */
>   char buf[3] = { 0 };
>   int n;
>   int fd = open(name, O_RDONLY);
> @@ -113,8 +128,8 @@ static int is_executable(const char *name)
>   if (fd >= 0) {
>   n = read(fd, buf, 2);
>   if (n == 2)
> - /* DOS executables start with "MZ" */
> - if (!strcmp(buf, "#!") || !strcmp(buf, "MZ"))
> + /* look for a she-bang */
> + if (!strcmp(buf, "#!"))
>   st.st_mode |= S_IXUSR;
>   close(fd);
>   }
>
> base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe


Re: [PATCH v2 1/1] reset: support the --stdin option

2017-01-27 Thread Junio C Hamano
Jeff King  writes:

> A few minor suggestions:
>
>> +--stdin::
>> +Instead of taking list of paths from the command line,
>> +read list of paths from the standard input.  Paths are
>> +separated by LF (i.e. one path per line) by default.
>> +
>> +-z::
>> +Only meaningful with `--stdin`; paths are separated with
>> +NUL character instead of LF.
>
> Is it worth clarifying that these are paths, not pathspecs? The word
> "paths" is used to refer to the pathspecs on the command-line elsewhere
> in the document.

If the code forces literal pathspecs, then what the user feeds to
the command is no longer pathspecs from the user's point of view,
and I agree that the distinction is important.  

If the remainder of the documentation is loose in terminology and
the reason why we were able to get away with it was because we
consistently used "paths" when we meant "pathspec", these existing
mention of "paths" have to be tightened, either in this patch or a
preparatory step patch before this one, because the addition of
"this thing reads paths not pathspecs" is what makes them ambiguous.

Thanks.  I re-read the part of the code that reads and unquotes as
necessary in the patch and they looked correct.






Re: [PATCH] mingw: allow hooks to be .exe files

2017-01-27 Thread Stefan Beller
On Fri, Jan 27, 2017 at 2:29 AM, Johannes Schindelin
 wrote:
> Hi Junio,
>
> On Thu, 26 Jan 2017, Junio C Hamano wrote:
>
>> Johannes Schindelin  writes:
>>
>> > On Wed, 25 Jan 2017, Jeff King wrote:
>> >
>> >> On Wed, Jan 25, 2017 at 05:58:42PM +0100, Johannes Schindelin wrote:
>> >>
>> >> > -if (access(path.buf, X_OK) < 0)
>> >> > +if (access(path.buf, X_OK) < 0) {
>> >> > +#ifdef STRIP_EXTENSION
>> >> > +strbuf_addstr(, ".exe");
>> >>
>> >> I think STRIP_EXTENSION is a string.  Should this line be:
>> >>
>> >>   strbuf_addstr(, STRIP_EXTENSION);
>> >
>> > Yep.
>> >
>> > v2 coming,
>> > Johannes
>>
>> I think I've already tweaked it out when I queued the original one.
>
> After digging, I found your SQUASH commit. I had not known about that.
>
> In any case, I much rather prefer to have the final version of any patch
> or patch series I contribute to be identical between what you commit and
> what I sent to the mailing list. We do disagree from time to time, and I
> would like to have the opportunity of reviewing how you tweak my changes.
>
> Ciao,
> Johannes

I saw the queued up commit and that lead me to this thread here.
The commit message, while technically correct, seems a bit off. This is because
the commit message only talks about .exe extensions, but the change in code
doesn't even have the string "exe" mentioned once.

With this discussion here, it is easy for me to connect the dots, but
it would be nice
to have the full picture in the commit message. This is what I would write:

mingw: allow hooks to be .exe files

Executable files in Windows need to have the extension '.exe', otherwise
they do not work. Extend the hooks to not just look at the hard coded names,
but also at the extended names via the custom STRIP_EXTENSION, which
is defined as '.exe' in Windows.

Stefan


Re: [PATCH 11/12] receive-pack: treat namespace .have lines like alternates

2017-01-27 Thread Jeff King
On Fri, Jan 27, 2017 at 06:45:26PM +0100, Lukas Fleischer wrote:

> > This is an unrelated tangent, but there may want to be a knob to
> > make the code return here without even showing, to make the
> > advertisement even smaller and also to stop miniscule information
> > leakage?  If the namespaced multiple projects are totally unrelated
> > (i.e. "My sysadmin gave me a write access only to this single
> > directory, so I am using the namespace feature to host these three
> > projects that have nothing to do with each other"), showing objects
> > of other namespaces will buy us nothing and the user is better off
> > without this code showing these refs as ".have".
> 
> I think this is already possible using receive.hideRefs (which causes
> the ref_is_hidden() branch above to return if applicable).
> 
> Having support for suppressing .have lines corresponding to different
> namespaces was actually the reason I implemented 78a766ab6 (hideRefs:
> add support for matching full refs, 2015-11-03). We have been using
> namespaces for hosting the package Git repositories of the Arch Linux
> User Repository [1] with a shared object storage for several months now.
> See [2] for *some* technical details on how things are implemented; the
> last section explains how the hideRefs mechanism can be used to limit
> ref advertisement to the "active" namespace.

Thanks for the pointers. I think a "turn off namespace .have lines"
option would be easier for some cases, but what you've implemented is
much more flexible. So if people using namespaces are happy with it, I
don't see any need to add another way to do the same thing.

-Peff


Re: [PATCH v2 3/3] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-01-27 Thread Junio C Hamano
Just to save us extra round-trip.

Junio C Hamano  writes:

>> +`GIT_SSH_VARIANT`::
>> +If this environment variable is set, it overrides the autodetection
>> +of plink/tortoiseplink in the SSH command that 'git fetch' and 'git
>> +push' use. It can be set to either `ssh`, `plink`, `putty` or
>> +`tortoiseplink`. Any other value will be treated as normal ssh. This
>> +is useful in case that Git gets this wrong.
>
> Similarly this should mention GIT_SSH_COMMAND at least.  It is crazy
> to set something that will cause misdetection to core.sshCommand and
> use this environment variable to fix it (instead of using ssh.variant),
> so I think it is OK not to mention core.sshCommand (but it would not
> hurt to do so).
>
>> diff --git a/connect.c b/connect.c
>> index 9f750eacb6..7b4437578b 100644
>> --- a/connect.c
>> +++ b/connect.c
>> @@ -691,6 +691,24 @@ static const char *get_ssh_command(void)
>>  return NULL;
>>  }
>>  
>> +static int handle_ssh_variant(int *port_option, int *needs_batch)
>> +{
>> ...
>> +}
> ...
>
> The string that came from the configuration must be freed, no?  That
> is what I recall in Peff's comment yesterday.

The leak needs plugging in some way.  

Because this thing is merely an escape hatch that somebody who needs
it needs to use it always (as opposed to one-shot basis), we do not
need to support the environment variable and go with only the
configuration, which may make it easier to plug the leak.

>> @@ -817,7 +835,8 @@ struct child_process *git_connect(int fd[2], const char 
>> *url,
>>  ssh_argv0 = xstrdup(ssh);
>>  }
>>  
>> -if (ssh_argv0) {
>> +if (!handle_ssh_variant(_option, _batch) &&
>> +ssh_argv0) {
>
> I like the placement of this "if the user explicitly told us" much
> better.
> ...
> And that reasoning will lead to a realization "there is no reason to
> even do the split_cmdline() if the user explicitly told us".  While
> that reasoning is correct and we _should_ further refactor, I didn't
> want to spend braincycles on that myself.

This _should_ above is a lot weaker than _must_.  

IOW, I think it is acceptable to always split GIT_SSH_COMMAND into
tokens before we realize that the user used the escape hatch and the
splitting was a wasted effort.  This is exactly because this thing
is an escape hatch that is expected to be rarely used.  Of course,
if the "wasted effort" can be eliminated without sacrificing the
simplicity of the code, that is fine as well.






Re: [PATCH 2/2] use absolute_pathdup()

2017-01-27 Thread Junio C Hamano
René Scharfe  writes:

> Hi Dscho,
>
> Am 27.01.2017 um 11:21 schrieb Johannes Schindelin:
>> On Thu, 26 Jan 2017, René Scharfe wrote:
>>> Apply the symantic patch for converting callers that duplicate the
>>
>> s/symantic/semantic/
>
> thank you!  I wonder where this came from.  And where my spellchecker
> went without as much as a farewell.  Reinstalled it now..
>
> René

Thanks.  I've already locally amended this.


Re: [PATCH v3] t/Makefile: add a rule to re-run previously-failed tests

2017-01-27 Thread Johannes Schindelin
Hi Peff,

On Fri, 27 Jan 2017, Jeff King wrote:

> On Fri, Jan 27, 2017 at 03:17:36PM +0100, Johannes Schindelin wrote:
> 
> > This patch automates the process of determinig which tests failed
> > previously and re-running them.
> 
> s/determinig/determining/

Fixed in v4,
Johannes


Re: [PATCH v2 1/1] reset: support the --stdin option

2017-01-27 Thread Jeff King
On Fri, Jan 27, 2017 at 06:34:46PM +0100, Johannes Schindelin wrote:

> > Is it worth clarifying that these are paths, not pathspecs? The word
> > "paths" is used to refer to the pathspecs on the command-line elsewhere
> > in the document.
> > 
> > It might also be worth mentioning the quoting rules for the non-z case.
> 
> I think this would be overkill. In reality, --stdin without -z does not
> make much sense, anyway.

I think with most Unix tools people tend to use the non-z forms until
they break, and then switch to the z form. :)

And in that sense, this transparently Just Works because the output will
often come from another git tool, which will quote as appropriate.

> If you feel strongly about it, I encourage you to submit a follow-up
> patch.
> 
> The rest of your suggestions have been implemented in v3.

Thanks. I think the path/pathspec thing was the more important of the
two suggestions.

-Peff


Re: [PATCH 11/12] receive-pack: treat namespace .have lines like alternates

2017-01-27 Thread Lukas Fleischer
On Wed, 25 Jan 2017 at 20:51:17, Junio C Hamano wrote:
> [...]
> > diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> > index 8f8762e4a..c55e2f993 100644
> > --- a/builtin/receive-pack.c
> > +++ b/builtin/receive-pack.c
> > @@ -251,8 +251,9 @@ static void show_ref(const char *path, const unsigned 
> > char *sha1)
> [...]
> >   if (ref_is_hidden(path, path_full))
> [...]
> This is an unrelated tangent, but there may want to be a knob to
> make the code return here without even showing, to make the
> advertisement even smaller and also to stop miniscule information
> leakage?  If the namespaced multiple projects are totally unrelated
> (i.e. "My sysadmin gave me a write access only to this single
> directory, so I am using the namespace feature to host these three
> projects that have nothing to do with each other"), showing objects
> of other namespaces will buy us nothing and the user is better off
> without this code showing these refs as ".have".

I think this is already possible using receive.hideRefs (which causes
the ref_is_hidden() branch above to return if applicable).

Having support for suppressing .have lines corresponding to different
namespaces was actually the reason I implemented 78a766ab6 (hideRefs:
add support for matching full refs, 2015-11-03). We have been using
namespaces for hosting the package Git repositories of the Arch Linux
User Repository [1] with a shared object storage for several months now.
See [2] for *some* technical details on how things are implemented; the
last section explains how the hideRefs mechanism can be used to limit
ref advertisement to the "active" namespace.

Regards,
Lukas

[1] https://aur.archlinux.org/
[2] https://git.archlinux.org/aurweb.git/plain/doc/git-interface.txt


Re: [PATCH] fixup! worktree move: new command

2017-01-27 Thread Junio C Hamano
Johannes Schindelin  writes:

> always been discussed on the mailing list, I would like to kindly ask you
> to please add this patch to the nd/worktree-move branch for the time being
> (i.e. until Duy responds),

The tip of 'pu' (or anything beyond the tip of 'jch') is not always
expected to pass test or even build, and unless I know the OP is
leaky or usually slow, a "fixup!" that directly addresses the OP
tends to be left for the OP to pick up [*1*] to avoid "ah, you sent
a reroll but I already had one squashed locally", which is confusing
to both myself and OP (and adds burden to me).

It seems that OP is slower to respond than his usual, so I'll add it
to the tip of the topic so that it won't be lost.


[Footnote]

*1* It still is used on my end to leave a mental note to myself that
the topic is expected/waiting to be rerolled, but that is not
something you can read in "git log --first-parent master..pu".


Re: show all merge conflicts

2017-01-27 Thread Jeff King
On Fri, Jan 27, 2017 at 11:56:08AM -0500, Michael Spiegel wrote:

> I'm trying to determine whether a merge required a conflict to resolve
> after the merge has occurred. The git book has some advice
> (https://git-scm.com/book/en/v2/Git-Tools-Advanced-Merging) to use
> `git show` on the merge commit or use `git log --cc -p -1`. These
> strategies work when the merge conflict was resolved with a change
> that is different from either parent. When the conflict is resolved
> with a change that is the same as one of the parents, then these
> commands are indistinguishable from a merge that did not conflict. Is
> it possible to distinguish between a conflict-free merge and a merge
> conflict that is resolved by with the changes from one the parents?

No. You'd have to replay the merge to know if it would have had
conflicts.

There was a patch series a few years ago that added a new diff-mode to
do exactly that, and show the diff against what was resolved. It had a
few issues (I don't remember exactly what) and never got merged.

Certainly one complication is that you don't know exactly _how_ the
merge was done in the first place (e.g., which merge strategy, which
custom merge drivers were in effect, etc). But in general, replaying
with a standard merge-recursive would get you most of what you want to
know.

I've done this manually sometimes when digging into erroneous merges
(e.g., somebody accidentally runs "git reset -- " in the middle
of a merge and throws away some changes.

You should be able to do:

  git checkout $merge^1
  git merge $merge^2
  git diff -R $merge

to see what the original resolution did.

-Peff


Re: [PATCH v3 4/4] urlmatch: allow globbing for the URL host part

2017-01-27 Thread Junio C Hamano
Patrick Steinhardt  writes:

>> This is probably a useful improvement.
>> 
>> Having said that, when I mentioned "glob", I meant to also support
>> something like this:
>> 
>>  https://www[1-4].ibm.com/
>
> The problem with additional extended syntax like proposed by you
> is that we would indeed need an escaping mechanism here.

True.  I think a true shell globbing is overkill (so is regexp) and
just a simple wildcarding with '*' would be a good first step that
is easy to explain and later extend as needed.

Thanks.


Re: SubmittingPatches: drop temporal reference for PGP signing

2017-01-27 Thread Junio C Hamano
Cornelius Weig  writes:

> -Do not PGP sign your patch, -at least for now-. Most likely, your (...)
> +Do not PGP sign your patch. Most likely, your maintainer or other (...)

It has been quite a while since we wrote that "at least for now", so
it probably makes sense to drop it.

>> Maybe even s/by signing off/by adding your Signed-off-by:/ to be sure
>> that the reader knows that it is _their certification_ that is being
>> sought. Even if it does double up on the 'your'.
>
> I don't think doubling on the 'your' will make the heading clearer. The
> main intention of this change is to avoid mixups with pgp-signing and
> that would IMHO not improve by that.
> Besides, I find the colon in the heading a bit awkward. Is the following
> version as expressive as with the colon?
>
> -(5) Sign your work
> +(5) Certify your work by adding Signed-off-by

I am leaning towards agreeing with Philip, and my gut feeling says
the presense of the colon, would help a lot.  It would visually
click to readers what we are talking about.

At the same time, I think ending a section header with a colon would
be seen as "awkward".  I would have written it more like this to
avoid it:

... by adding "Signed-off-by: " line

I personally do not mind having "your" there, either, but I can see
that a section header wants to be kept shorter.


[PATCH v3 0/1] Support `git reset --stdin`

2017-01-27 Thread Johannes Schindelin
This feature was missing, and made it cumbersome for third-party
tools to reset a lot of paths in one go.

Support for --stdin has been added, following builtin/checkout-index.c's
example.

Changes since v2:

- the documentation clarifies that --stdin does not treat the input as
  pathspecs

- the code now uses struct argv_array instead of rolling its own


Johannes Schindelin (1):
  reset: support the --stdin option

 Documentation/git-reset.txt | 10 ++
 builtin/reset.c | 47 -
 t/t7107-reset-stdin.sh  | 33 +++
 3 files changed, 89 insertions(+), 1 deletion(-)
 create mode 100755 t/t7107-reset-stdin.sh


base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe
Published-As: https://github.com/dscho/git/releases/tag/reset-stdin-v3
Fetch-It-Via: git fetch https://github.com/dscho/git reset-stdin-v3

Interdiff vs v2:

 diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt
 index abb71bb805..d319ed9b20 100644
 --- a/Documentation/git-reset.txt
 +++ b/Documentation/git-reset.txt
 @@ -100,7 +100,8 @@ OPTIONS
  
  --stdin::
Instead of taking list of paths from the command line,
 -  read list of paths from the standard input.  Paths are
 +  read list of paths from the standard input.  The paths are
 +  read verbatim, i.e. not handled as pathspecs.  Paths are
separated by LF (i.e. one path per line) by default.
  
  -z::
 diff --git a/builtin/reset.c b/builtin/reset.c
 index 1d3075b7ee..fe7723c179 100644
 --- a/builtin/reset.c
 +++ b/builtin/reset.c
 @@ -23,6 +23,7 @@
  #include "cache-tree.h"
  #include "strbuf.h"
  #include "quote.h"
 +#include "argv-array.h"
  
  static const char * const git_reset_usage[] = {
N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] 
[]"),
 @@ -271,8 +272,7 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
  {
int reset_type = NONE, update_ref_status = 0, quiet = 0;
int patch_mode = 0, nul_term_line = 0, read_from_stdin = 0, unborn;
 -  char **stdin_paths = NULL;
 -  int stdin_nr = 0, stdin_alloc = 0;
 +  struct argv_array stdin_paths = ARGV_ARRAY_INIT;
const char *rev;
struct object_id oid;
struct pathspec pathspec;
 @@ -325,18 +325,15 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
die(_("line is badly quoted"));
strbuf_swap(, );
}
 -  ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc);
 -  stdin_paths[stdin_nr++] = xstrdup(buf.buf);
 +  argv_array_push(_paths, buf.buf);
strbuf_reset();
}
strbuf_release();
strbuf_release();
  
 -  ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc);
 -  stdin_paths[stdin_nr++] = NULL;
flags |= PATHSPEC_LITERAL_PATH;
parse_pathspec(, 0, flags, prefix,
 - (const char **)stdin_paths);
 + stdin_paths.argv);
  
} else if (nul_term_line)
die(_("-z requires --stdin"));
 @@ -431,11 +428,7 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
if (!pathspec.nr)
remove_branch_state();
  
 -  if (stdin_paths) {
 -  while (stdin_nr)
 -  free(stdin_paths[--stdin_nr]);
 -  free(stdin_paths);
 -  }
 +  argv_array_clear(_paths);
  
return update_ref_status;
  }

-- 
2.11.1.windows.prerelease.2.9.g3014b57



Urgent Assistance

2017-01-27 Thread Tony Manssa
Meine Liebste,

Ich bin Barrister Tony Manssa Rechtsanwalt, ich suche Ihr Vertrauen,
um für die Hinterlegung Klage Erbschaftsgeld von meinem verstorbenen
Klienten eine Staatsangehörigkeit Ihres Landes zu beantragen, die hier
in Lome, Republik Togo lebte viele Jahre, bis er leider mit seinem
gestorben Ganze Familie in Auto-Crash, er tragen den gleichen
Nachnamen mit Ihnen.Ich habe für seine Familienangehörigen für seine
Heimat repatriieren, aber erfolglos. Daher ist mein Kontakt zu Ihnen
für Sie zu stehen, wie der Erbe offensichtlich (Next von Kin) zu
seinem Nachlass im Wert von US $ 4,500,000.00 erhalten, die er
hinterlegt in einer Bank vor seinem Tod. Könnten Sie die oben genannte
Summe auf Ihr Bankkonto erhalten?

Freundliche Grüße
Rechtsanwalt Tony Manssa
Hauptrechtsanwalt.
Lome - Togo
***
My Dear,

I am Barrister Tony Manssa attorney at Law, I am seeking your
confidence to apply for the deposit claim inheritance fund of my late
client a nationality of your country, who lived here in Lome, Republic
of Togo for many years until he unfortunately died with his entire
family in auto crash, he bear the same surname with you.I have
searched for his family relatives to repatriate his estate, but to be
unsuccessful. Hence my contact to you is for you to stand as the heir
apparent (Next of Kin) to receive his estate valued at US$4,500,000.00
which he deposited in a bank before his death. Could you be able to
receive the above sum into your bank account?

Best Regards
Barrister Tony Manssa
Principal Attorney.
Lome - Togo


[PATCH v3 1/1] reset: support the --stdin option

2017-01-27 Thread Johannes Schindelin
Just like with other Git commands, this option makes it read the paths
from the standard input. It comes in handy when resetting many, many
paths at once and wildcards are not an option (e.g. when the paths are
generated by a tool).

Note: we first parse the entire list and perform the actual reset action
only in a second phase. Not only does this make things simpler, it also
helps performance, as do_diff_cache() traverses the index and the
(sorted) pathspecs in simultaneously to avoid unnecessary lookups.

Signed-off-by: Johannes Schindelin 
---
 Documentation/git-reset.txt | 10 ++
 builtin/reset.c | 47 -
 t/t7107-reset-stdin.sh  | 33 +++
 3 files changed, 89 insertions(+), 1 deletion(-)
 create mode 100755 t/t7107-reset-stdin.sh

diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt
index 25432d9257..d319ed9b20 100644
--- a/Documentation/git-reset.txt
+++ b/Documentation/git-reset.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 [verse]
 'git reset' [-q] [] [--] ...
 'git reset' (--patch | -p) [] [--] [...]
+'git reset' [-q] [--stdin [-z]] []
 'git reset' [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] []
 
 DESCRIPTION
@@ -97,6 +98,15 @@ OPTIONS
 --quiet::
Be quiet, only report errors.
 
+--stdin::
+   Instead of taking list of paths from the command line,
+   read list of paths from the standard input.  The paths are
+   read verbatim, i.e. not handled as pathspecs.  Paths are
+   separated by LF (i.e. one path per line) by default.
+
+-z::
+   Only meaningful with `--stdin`; paths are separated with
+   NUL character instead of LF.
 
 EXAMPLES
 
diff --git a/builtin/reset.c b/builtin/reset.c
index 8ab915bfcb..fe7723c179 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -21,10 +21,14 @@
 #include "parse-options.h"
 #include "unpack-trees.h"
 #include "cache-tree.h"
+#include "strbuf.h"
+#include "quote.h"
+#include "argv-array.h"
 
 static const char * const git_reset_usage[] = {
N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] 
[]"),
N_("git reset [-q] [] [--] ..."),
+   N_("git reset [-q] [--stdin [-z]] []"),
N_("git reset --patch [] [--] [...]"),
NULL
 };
@@ -267,7 +271,8 @@ static int reset_refs(const char *rev, const struct 
object_id *oid)
 int cmd_reset(int argc, const char **argv, const char *prefix)
 {
int reset_type = NONE, update_ref_status = 0, quiet = 0;
-   int patch_mode = 0, unborn;
+   int patch_mode = 0, nul_term_line = 0, read_from_stdin = 0, unborn;
+   struct argv_array stdin_paths = ARGV_ARRAY_INIT;
const char *rev;
struct object_id oid;
struct pathspec pathspec;
@@ -286,6 +291,10 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
OPT_BOOL('p', "patch", _mode, N_("select hunks 
interactively")),
OPT_BOOL('N', "intent-to-add", _to_add,
N_("record only the fact that removed paths 
will be added later")),
+   OPT_BOOL('z', NULL, _term_line,
+   N_("paths are separated with NUL character")),
+   OPT_BOOL(0, "stdin", _from_stdin,
+   N_("read paths from ")),
OPT_END()
};
 
@@ -295,6 +304,40 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
PARSE_OPT_KEEP_DASHDASH);
parse_args(, argv, prefix, patch_mode, );
 
+   if (read_from_stdin) {
+   strbuf_getline_fn getline_fn = nul_term_line ?
+   strbuf_getline_nul : strbuf_getline_lf;
+   int flags = PATHSPEC_PREFER_FULL |
+   PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP;
+   struct strbuf buf = STRBUF_INIT;
+   struct strbuf unquoted = STRBUF_INIT;
+
+   if (patch_mode)
+   die(_("--stdin is incompatible with --patch"));
+
+   if (pathspec.nr)
+   die(_("--stdin is incompatible with path arguments"));
+
+   while (getline_fn(, stdin) != EOF) {
+   if (!nul_term_line && buf.buf[0] == '"') {
+   strbuf_reset();
+   if (unquote_c_style(, buf.buf, NULL))
+   die(_("line is badly quoted"));
+   strbuf_swap(, );
+   }
+   argv_array_push(_paths, buf.buf);
+   strbuf_reset();
+   }
+   strbuf_release();
+   strbuf_release();
+
+   flags |= PATHSPEC_LITERAL_PATH;
+   parse_pathspec(, 0, flags, prefix,
+  stdin_paths.argv);
+
+   } else if (nul_term_line)
+   die(_("-z requires 

show all merge conflicts

2017-01-27 Thread Michael Spiegel
Hi folks,

I'm trying to determine whether a merge required a conflict to resolve
after the merge has occurred. The git book has some advice
(https://git-scm.com/book/en/v2/Git-Tools-Advanced-Merging) to use
`git show` on the merge commit or use `git log --cc -p -1`. These
strategies work when the merge conflict was resolved with a change
that is different from either parent. When the conflict is resolved
with a change that is the same as one of the parents, then these
commands are indistinguishable from a merge that did not conflict. Is
it possible to distinguish between a conflict-free merge and a merge
conflict that is resolved by with the changes from one the parents?

Thanks,
--Michael


[PATCH v4] t/Makefile: add a rule to re-run previously-failed tests

2017-01-27 Thread Johannes Schindelin
This patch automates the process of determining which tests failed
previously and re-running them.

While developing patch series, it is a good practice to run the test
suite from time to time, just to make sure that obvious bugs are caught
early.  With complex patch series, it is common to run `make -j15 -k
test`, i.e.  run the tests in parallel and *not* stop at the first
failing test but continue. This has the advantage of identifying
possibly multiple problems in one big test run.

It is particularly important to reduce the turn-around time thusly on
Windows, where the test suite spends 45 minutes on the computer on which
this patch was developed.

It is the most convenient way to determine which tests failed after
running the entire test suite, in parallel, to look for left-over "trash
directory.t*" subdirectories in the t/ subdirectory. However, those
directories might live outside t/ when overridden using the
--root= option, to which the Makefile has no access. The next
best method is to grep explicitly for failed tests in the test-results/
directory, which the Makefile *can* access.

Please note that the often-recommended `prove` tool requires Perl, and
that opens a whole new can of worms on Windows. As no native Windows Perl
comes with Subversion bindings, we have to use a Perl in Git for Windows
that uses the POSIX emulation layer named MSYS2 (which is a portable
version of Cygwin). When using this emulation layer under stress, e.g.
when running massively-parallel tests, unexplicable crashes occur quite
frequently, and instead of having a solution to the original problem, the
developer now has an additional, quite huge problem. For that reason, this
developer rejected `prove` as a solution and went with this patch instead.

Signed-off-by: Johannes Schindelin 
---
Published-As: https://github.com/dscho/git/releases/tag/failing-tests-v4
Fetch-It-Via: git fetch https://github.com/dscho/git failing-tests-v4

 t/Makefile | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/t/Makefile b/t/Makefile
index d613935f14..1bb06c36f2 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -35,6 +35,12 @@ all: $(DEFAULT_TEST_TARGET)
 test: pre-clean $(TEST_LINT)
$(MAKE) aggregate-results-and-cleanup
 
+failed:
+   @failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \
+   grep -l '^failed [1-9]' *.counts | \
+   sed -n 's/\.counts$$/.sh/p') && \
+   test -z "$$failed" || $(MAKE) $$failed
+
 prove: pre-clean $(TEST_LINT)
@echo "*** prove ***"; $(PROVE) --exec '$(SHELL_PATH_SQ)' 
$(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)
$(MAKE) clean-except-prove-cache

base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe
-- 
2.11.1.windows.prerelease.2.9.g3014b57


Re: [PATCH v2] git-p4: Fix git-p4.mapUser on Windows

2017-01-27 Thread Junio C Hamano
George Vanburgh  writes:

> From: George Vanburgh 
>
> When running git-p4 on Windows, with multiple git-p4.mapUser entries in
> git config - no user mappings are applied to the generated repository.
> ...
> Using splitlines solves this issue, by splitting config on all
> typical delimiters ('\n', '\r\n' etc.)

Luke, Lars, this version seems to be in line with the conclusion of
your earlier reviews, e.g.



Even though it looks OK to my eyes, I'll wait for Acks or further
refinement suggestions from either of you two before acting on this
patch.

Thanks.

> Signed-off-by: George Vanburgh 
> ---
>  git-p4.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index f427bf6..b66f68b 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -656,7 +656,7 @@ def gitConfigInt(key):
>  def gitConfigList(key):
>  if not _gitConfig.has_key(key):
>  s = read_pipe(["git", "config", "--get-all", key], ignore_error=True)
> -_gitConfig[key] = s.strip().split(os.linesep)
> +_gitConfig[key] = s.strip().splitlines()
>  if _gitConfig[key] == ['']:
>  _gitConfig[key] = []
>  return _gitConfig[key]
>
> --
> https://github.com/git/git/pull/319


Re: [PATCH v2 1/1] reset: support the --stdin option

2017-01-27 Thread Johannes Schindelin
Hi Peff,

On Fri, 27 Jan 2017, Jeff King wrote:

> On Fri, Jan 27, 2017 at 01:38:55PM +0100, Johannes Schindelin wrote:
> 
> A few minor suggestions:
> 
> > +--stdin::
> > +   Instead of taking list of paths from the command line,
> > +   read list of paths from the standard input.  Paths are
> > +   separated by LF (i.e. one path per line) by default.
> > +
> > +-z::
> > +   Only meaningful with `--stdin`; paths are separated with
> > +   NUL character instead of LF.
> 
> Is it worth clarifying that these are paths, not pathspecs? The word
> "paths" is used to refer to the pathspecs on the command-line elsewhere
> in the document.
> 
> It might also be worth mentioning the quoting rules for the non-z case.

I think this would be overkill. In reality, --stdin without -z does not
make much sense, anyway.

If you feel strongly about it, I encourage you to submit a follow-up
patch.

The rest of your suggestions have been implemented in v3.

Ciao,
Johannes


Re: [PATCH v3] t/Makefile: add a rule to re-run previously-failed tests

2017-01-27 Thread Jeff King
On Fri, Jan 27, 2017 at 03:17:36PM +0100, Johannes Schindelin wrote:

> This patch automates the process of determinig which tests failed
> previously and re-running them.

s/determinig/determining/

Patch otherwise looks good, and I'm happy to be rid of the sed
complexity from v2.

-Peff


Re: [PATCH v2 1/1] reset: support the --stdin option

2017-01-27 Thread Jeff King
On Fri, Jan 27, 2017 at 01:38:55PM +0100, Johannes Schindelin wrote:

> Just like with other Git commands, this option makes it read the paths
> from the standard input. It comes in handy when resetting many, many
> paths at once and wildcards are not an option (e.g. when the paths are
> generated by a tool).
> 
> Note: we first parse the entire list and perform the actual reset action
> only in a second phase. Not only does this make things simpler, it also
> helps performance, as do_diff_cache() traverses the index and the
> (sorted) pathspecs in simultaneously to avoid unnecessary lookups.

This looks OK to me. At first I wondered if using PATHSPEC_LITERAL_PATH
was consistent with other "--stdin" tools. I think it mostly is (or at
least consistent with checkout-index). The exception is "rev-list
--stdin", but that's probably fine. It is taking rev-list arguments in
the first place, not a list of paths.

A few minor suggestions:

> +--stdin::
> + Instead of taking list of paths from the command line,
> + read list of paths from the standard input.  Paths are
> + separated by LF (i.e. one path per line) by default.
> +
> +-z::
> + Only meaningful with `--stdin`; paths are separated with
> + NUL character instead of LF.

Is it worth clarifying that these are paths, not pathspecs? The word
"paths" is used to refer to the pathspecs on the command-line elsewhere
in the document.

It might also be worth mentioning the quoting rules for the non-z case.

> @@ -267,7 +270,9 @@ static int reset_refs(const char *rev, const struct 
> object_id *oid)
>  int cmd_reset(int argc, const char **argv, const char *prefix)
>  {
>   int reset_type = NONE, update_ref_status = 0, quiet = 0;
> - int patch_mode = 0, unborn;
> + int patch_mode = 0, nul_term_line = 0, read_from_stdin = 0, unborn;
> + char **stdin_paths = NULL;
> + int stdin_nr = 0, stdin_alloc = 0;

This list is a good candidate for an argv_array, I think. So:

  struct argv_array stdin_paths = ARGV_ARRAY_INIT;

> + ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc);
> + stdin_paths[stdin_nr++] = xstrdup(buf.buf);

  argv_array_push(_paths, buf.buf);

> + ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc);
> + stdin_paths[stdin_nr++] = NULL;

  This goes away because argv_array handles it for you.

> + if (stdin_paths) {
> + while (stdin_nr)
> + free(stdin_paths[--stdin_nr]);
> + free(stdin_paths);
> + }

  argv_array_clear(_paths);

> @@ -295,6 +304,43 @@ int cmd_reset(int argc, const char **argv, const char 
> *prefix)
>   PARSE_OPT_KEEP_DASHDASH);
>   parse_args(, argv, prefix, patch_mode, );
>  
> + if (read_from_stdin) {
> + strbuf_getline_fn getline_fn = nul_term_line ?
> + strbuf_getline_nul : strbuf_getline_lf;
> + int flags = PATHSPEC_PREFER_FULL |
> + PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP;

You set two flags here...

> + ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc);
> + stdin_paths[stdin_nr++] = NULL;
> + flags |= PATHSPEC_LITERAL_PATH;
> + parse_pathspec(, 0, flags, prefix,
> +(const char **)stdin_paths);

...and then one more here. They all seem to be set unconditionally, and
we never look at "flags" between the two lines. I think it would be more
obvious to set them all in the same place.

> + } else if (nul_term_line)
> + die(_("-z requires --stdin"));
> +

Hmm, there's our brace question coming up again. :)

> diff --git a/t/t7107-reset-stdin.sh b/t/t7107-reset-stdin.sh
> new file mode 100755
> index 00..997dc42dd2
> --- /dev/null
> +++ b/t/t7107-reset-stdin.sh
> @@ -0,0 +1,33 @@
> +#!/bin/sh
> +
> +test_description='reset --stdin'

Here are a few things we might want to test that I didn't see covered:

  - feeding path X does not reset path Y

  - de-quoting in the non-z case

-Peff


Re: [PATCH 2/2] use absolute_pathdup()

2017-01-27 Thread René Scharfe

Hi Dscho,

Am 27.01.2017 um 11:21 schrieb Johannes Schindelin:

On Thu, 26 Jan 2017, René Scharfe wrote:

Apply the symantic patch for converting callers that duplicate the


s/symantic/semantic/


thank you!  I wonder where this came from.  And where my spellchecker 
went without as much as a farewell.  Reinstalled it now..


René


[PATCH v3] t/Makefile: add a rule to re-run previously-failed tests

2017-01-27 Thread Johannes Schindelin
This patch automates the process of determinig which tests failed
previously and re-running them.

While developing patch series, it is a good practice to run the test
suite from time to time, just to make sure that obvious bugs are caught
early.  With complex patch series, it is common to run `make -j15 -k
test`, i.e.  run the tests in parallel and *not* stop at the first
failing test but continue. This has the advantage of identifying
possibly multiple problems in one big test run.

It is particularly important to reduce the turn-around time thusly on
Windows, where the test suite spends 45 minutes on the computer on which
this patch was developed.

It is the most convenient way to determine which tests failed after
running the entire test suite, in parallel, to look for left-over "trash
directory.t*" subdirectories in the t/ subdirectory. However, those
directories might live outside t/ when overridden using the
--root= option, to which the Makefile has no access. The next
best method is to grep explicitly for failed tests in the test-results/
directory, which the Makefile *can* access.

Please note that the often-recommended `prove` tool requires Perl, and
that opens a whole new can of worms on Windows. As no native Windows Perl
comes with Subversion bindings, we have to use a Perl in Git for Windows
that uses the POSIX emulation layer named MSYS2 (which is a portable
version of Cygwin). When using this emulation layer under stress, e.g.
when running massively-parallel tests, unexplicable crashes occur quite
frequently, and instead of having a solution to the original problem, the
developer now has an additional, quite huge problem. For that reason, this
developer rejected `prove` as a solution and went with this patch instead.

Signed-off-by: Johannes Schindelin 
---
Published-As: https://github.com/dscho/git/releases/tag/failing-tests-v3
Fetch-It-Via: git fetch https://github.com/dscho/git failing-tests-v3
Interdiff vs v2:

 diff --git a/t/Makefile b/t/Makefile
 index 8aa6a72a70..1bb06c36f2 100644
 --- a/t/Makefile
 +++ b/t/Makefile
 @@ -37,9 +37,8 @@ test: pre-clean $(TEST_LINT)
  
  failed:
@failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \
 -  grep -l '^failed [1-9]' $$(ls -t *.counts | \
 -  sed 'G;h;/^\(t[^.]*\)-[0-9]*\..*\n\1-[0-9]*\./d;P;d') | 
\
 -  sed -n 's/-[0-9]*\.counts$$/.sh/p') && \
 +  grep -l '^failed [1-9]' *.counts | \
 +  sed -n 's/\.counts$$/.sh/p') && \
test -z "$$failed" || $(MAKE) $$failed
  
  prove: pre-clean $(TEST_LINT)


 t/Makefile | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/t/Makefile b/t/Makefile
index d613935f14..1bb06c36f2 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -35,6 +35,12 @@ all: $(DEFAULT_TEST_TARGET)
 test: pre-clean $(TEST_LINT)
$(MAKE) aggregate-results-and-cleanup
 
+failed:
+   @failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \
+   grep -l '^failed [1-9]' *.counts | \
+   sed -n 's/\.counts$$/.sh/p') && \
+   test -z "$$failed" || $(MAKE) $$failed
+
 prove: pre-clean $(TEST_LINT)
@echo "*** prove ***"; $(PROVE) --exec '$(SHELL_PATH_SQ)' 
$(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)
$(MAKE) clean-except-prove-cache

base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe
-- 
2.11.1.windows.prerelease.2.9.g3014b57


Re: [PATCH] test-lib: on FreeBSD, look for unzip(1) in /usr/local/bin/

2017-01-27 Thread Johannes Schindelin
Hi Junio,

On Thu, 21 Jul 2016, Eric Wong wrote:

> Johannes Schindelin  wrote:
> > The common work-around is to install Info-Zip on FreeBSD, into
> > /usr/local/bin/.
> > 
> > Signed-off-by: Johannes Schindelin 
> 
> Thanks, t5003 now works out-of-the-box.
> Tested with Info-ZIP unzip installed and uninstalled.
> 
> Tested-by: Eric Wong 

Did you forget about this?

Ciao,
Johannes


[PATCH] help: correct behavior for is_executable on Windows

2017-01-27 Thread Johannes Schindelin
From: Heiko Voigt 

The previous implementation said that the filesystem information on
Windows is not reliable to determine whether a file is executable. To
gather this information it was peeking into the first two bytes of a
file to see whether it looks executable.

Apart from the fact that on Windows executables are defined as such by
their extension (and Git has special code to support "executing" scripts
when they have a she-bang line) it leads to serious performance
problems: not only do we have to open many files now, it gets even
slower when a virus scanner is running.

See the following measurements (in seconds) as an example, where we
execute a simple program that simply lists the directory contents and
calls open() on every listed file:

With virus scanner running (coldcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.00
after open (git-add.exe): 0.412873
before open (git-annotate.exe): 0.000175
after open (git-annotate.exe): 0.397925
before open (git-apply.exe): 0.000243
after open (git-apply.exe): 0.36
before open (git-archive.exe): 0.000147
after open (git-archive.exe): 0.397783
before open (git-bisect--helper.exe): 0.000160
after open (git-bisect--helper.exe): 0.397700
before open (git-blame.exe): 0.000160
after open (git-blame.exe): 0.399136
...

With virus scanner running (hotcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.00
after open (git-add.exe): 0.000325
before open (git-annotate.exe): 0.000229
after open (git-annotate.exe): 0.000177
before open (git-apply.exe): 0.000167
after open (git-apply.exe): 0.000150
before open (git-archive.exe): 0.000154
after open (git-archive.exe): 0.000156
before open (git-bisect--helper.exe): 0.000132
after open (git-bisect--helper.exe): 0.000180
before open (git-blame.exe): 0.000718
after open (git-blame.exe): 0.000724
...

With this patch I get:

$ time git help git
Launching default browser to display HTML ...

real0m8.723s
user0m0.000s
sys 0m0.000s

and without

$ time git help git
Launching default browser to display HTML ...

real1m37.734s
user0m0.000s
sys 0m0.031s

both tests with cold cache and giving the machine some time to settle
down after restart.

Signed-off-by: Heiko Voigt 
Signed-off-by: Johannes Schindelin 
---
Published-As: https://github.com/dscho/git/releases/tag/help-is-exe-v1
Fetch-It-Via: git fetch https://github.com/dscho/git help-is-exe-v1

 help.c | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/help.c b/help.c
index 53e2a67e00..bc6cd19cf3 100644
--- a/help.c
+++ b/help.c
@@ -105,7 +105,22 @@ static int is_executable(const char *name)
return 0;
 
 #if defined(GIT_WINDOWS_NATIVE)
-{  /* cannot trust the executable bit, peek into the file instead */
+   /*
+* On Windows there is no executable bit. The file extension
+* indicates whether it can be run as an executable, and Git
+* has special-handling to detect scripts and launch them
+* through the indicated script interpreter. We test for the
+* file extension first because virus scanners may make
+* it quite expensive to open many files.
+*/
+   if (ends_with(name, ".exe"))
+   return S_IXUSR;
+
+{
+   /*
+* Now that we know it does not have an executable extension,
+* peek into the file instead.
+*/
char buf[3] = { 0 };
int n;
int fd = open(name, O_RDONLY);
@@ -113,8 +128,8 @@ static int is_executable(const char *name)
if (fd >= 0) {
n = read(fd, buf, 2);
if (n == 2)
-   /* DOS executables start with "MZ" */
-   if (!strcmp(buf, "#!") || !strcmp(buf, "MZ"))
+   /* look for a she-bang */
+   if (!strcmp(buf, "#!"))
st.st_mode |= S_IXUSR;
close(fd);
}

base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe
-- 
2.11.1.windows.prerelease.2.9.g3014b57


[PATCH v2 0/1] Support `git reset --stdin`

2017-01-27 Thread Johannes Schindelin
This feature was missing, and made it cumbersome for third-party
tools to reset a lot of paths in one go.

Support for --stdin has been added, following builtin/checkout-index.c's
example.

Changes since v1:

- adjusted commit message to explain why we read everything before
  resetting

- fixed synopsis (`--stdin` does not work with `-- ...`)

- avoid handling patch_mode twice

- use PATHSPEC_LITERAL_PATH to avoid interpreting input as wildcards


Johannes Schindelin (1):
  reset: support the --stdin option

 Documentation/git-reset.txt |  9 
 builtin/reset.c | 54 -
 t/t7107-reset-stdin.sh  | 33 +++
 3 files changed, 95 insertions(+), 1 deletion(-)
 create mode 100755 t/t7107-reset-stdin.sh


base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe
Published-As: https://github.com/dscho/git/releases/tag/reset-stdin-v2
Fetch-It-Via: git fetch https://github.com/dscho/git reset-stdin-v2

Interdiff vs v1:

 diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt
 index 533ef69f91..abb71bb805 100644
 --- a/Documentation/git-reset.txt
 +++ b/Documentation/git-reset.txt
 @@ -8,8 +8,9 @@ git-reset - Reset current HEAD to the specified state
  SYNOPSIS
  
  [verse]
 -'git reset' [-q] [--stdin [-z]] [] [--] ...
 +'git reset' [-q] [] [--] ...
  'git reset' (--patch | -p) [] [--] [...]
 +'git reset' [-q] [--stdin [-z]] []
  'git reset' [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] 
[]
  
  DESCRIPTION
 diff --git a/builtin/reset.c b/builtin/reset.c
 index 6de3936aed..1d3075b7ee 100644
 --- a/builtin/reset.c
 +++ b/builtin/reset.c
 @@ -26,7 +26,8 @@
  
  static const char * const git_reset_usage[] = {
N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] 
[]"),
 -  N_("git reset [-q] [--stdin [-z]] [] [--] ..."),
 +  N_("git reset [-q] [] [--] ..."),
 +  N_("git reset [-q] [--stdin [-z]] []"),
N_("git reset --patch [] [--] [...]"),
NULL
  };
 @@ -317,9 +318,6 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
if (pathspec.nr)
die(_("--stdin is incompatible with path arguments"));
  
 -  if (patch_mode)
 -  flags |= PATHSPEC_PREFIX_ORIGIN;
 -
while (getline_fn(, stdin) != EOF) {
if (!nul_term_line && buf.buf[0] == '"') {
strbuf_reset();
 @@ -336,8 +334,10 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
  
ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc);
stdin_paths[stdin_nr++] = NULL;
 +  flags |= PATHSPEC_LITERAL_PATH;
parse_pathspec(, 0, flags, prefix,
   (const char **)stdin_paths);
 +
} else if (nul_term_line)
die(_("-z requires --stdin"));
  

-- 
2.11.1.windows.prerelease.2.9.g3014b57



[PATCH v2 1/1] reset: support the --stdin option

2017-01-27 Thread Johannes Schindelin
Just like with other Git commands, this option makes it read the paths
from the standard input. It comes in handy when resetting many, many
paths at once and wildcards are not an option (e.g. when the paths are
generated by a tool).

Note: we first parse the entire list and perform the actual reset action
only in a second phase. Not only does this make things simpler, it also
helps performance, as do_diff_cache() traverses the index and the
(sorted) pathspecs in simultaneously to avoid unnecessary lookups.

Signed-off-by: Johannes Schindelin 
---
 Documentation/git-reset.txt |  9 
 builtin/reset.c | 54 -
 t/t7107-reset-stdin.sh  | 33 +++
 3 files changed, 95 insertions(+), 1 deletion(-)
 create mode 100755 t/t7107-reset-stdin.sh

diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt
index 25432d9257..abb71bb805 100644
--- a/Documentation/git-reset.txt
+++ b/Documentation/git-reset.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 [verse]
 'git reset' [-q] [] [--] ...
 'git reset' (--patch | -p) [] [--] [...]
+'git reset' [-q] [--stdin [-z]] []
 'git reset' [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] []
 
 DESCRIPTION
@@ -97,6 +98,14 @@ OPTIONS
 --quiet::
Be quiet, only report errors.
 
+--stdin::
+   Instead of taking list of paths from the command line,
+   read list of paths from the standard input.  Paths are
+   separated by LF (i.e. one path per line) by default.
+
+-z::
+   Only meaningful with `--stdin`; paths are separated with
+   NUL character instead of LF.
 
 EXAMPLES
 
diff --git a/builtin/reset.c b/builtin/reset.c
index 8ab915bfcb..1d3075b7ee 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -21,10 +21,13 @@
 #include "parse-options.h"
 #include "unpack-trees.h"
 #include "cache-tree.h"
+#include "strbuf.h"
+#include "quote.h"
 
 static const char * const git_reset_usage[] = {
N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] 
[]"),
N_("git reset [-q] [] [--] ..."),
+   N_("git reset [-q] [--stdin [-z]] []"),
N_("git reset --patch [] [--] [...]"),
NULL
 };
@@ -267,7 +270,9 @@ static int reset_refs(const char *rev, const struct 
object_id *oid)
 int cmd_reset(int argc, const char **argv, const char *prefix)
 {
int reset_type = NONE, update_ref_status = 0, quiet = 0;
-   int patch_mode = 0, unborn;
+   int patch_mode = 0, nul_term_line = 0, read_from_stdin = 0, unborn;
+   char **stdin_paths = NULL;
+   int stdin_nr = 0, stdin_alloc = 0;
const char *rev;
struct object_id oid;
struct pathspec pathspec;
@@ -286,6 +291,10 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
OPT_BOOL('p', "patch", _mode, N_("select hunks 
interactively")),
OPT_BOOL('N', "intent-to-add", _to_add,
N_("record only the fact that removed paths 
will be added later")),
+   OPT_BOOL('z', NULL, _term_line,
+   N_("paths are separated with NUL character")),
+   OPT_BOOL(0, "stdin", _from_stdin,
+   N_("read paths from ")),
OPT_END()
};
 
@@ -295,6 +304,43 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
PARSE_OPT_KEEP_DASHDASH);
parse_args(, argv, prefix, patch_mode, );
 
+   if (read_from_stdin) {
+   strbuf_getline_fn getline_fn = nul_term_line ?
+   strbuf_getline_nul : strbuf_getline_lf;
+   int flags = PATHSPEC_PREFER_FULL |
+   PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP;
+   struct strbuf buf = STRBUF_INIT;
+   struct strbuf unquoted = STRBUF_INIT;
+
+   if (patch_mode)
+   die(_("--stdin is incompatible with --patch"));
+
+   if (pathspec.nr)
+   die(_("--stdin is incompatible with path arguments"));
+
+   while (getline_fn(, stdin) != EOF) {
+   if (!nul_term_line && buf.buf[0] == '"') {
+   strbuf_reset();
+   if (unquote_c_style(, buf.buf, NULL))
+   die(_("line is badly quoted"));
+   strbuf_swap(, );
+   }
+   ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc);
+   stdin_paths[stdin_nr++] = xstrdup(buf.buf);
+   strbuf_reset();
+   }
+   strbuf_release();
+   strbuf_release();
+
+   ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc);
+   stdin_paths[stdin_nr++] = NULL;
+   flags |= PATHSPEC_LITERAL_PATH;
+   parse_pathspec(, 0, flags, prefix,
+ 

Re: [PATCH] fixup! worktree move: new command

2017-01-27 Thread Johannes Schindelin
Hi Junio,

On Wed, 25 Jan 2017, Johannes Schindelin wrote:

> This is required for the test to pass on Windows, where $TRASH_DIRECTORY
> is a POSIX path, while Git works with Windows paths instead. Using
> `$(pwd)` is the common workaround: it reports a Windows path (while `$PWD`
> would report the POSIX equivalent which, however, would only be understood
> by shell and Perl scripts).
> 
> Duy, if you re-roll the `worktree-move` patch series, would you terribly
> mind squashing this in?
> 
> Signed-off-by: Johannes Schindelin 
> ---

Given that you add all kinds of SQUASH patches in `pu` that have not even
always been discussed on the mailing list, I would like to kindly ask you
to please add this patch to the nd/worktree-move branch for the time being
(i.e. until Duy responds), so that I am not continuously pestered by
failing Continuous Integration builds about a problem for which I have
provided a fix already.

Thanks,
Johannes


Re: [PATCH v2 1/1] status: be prepared for not-yet-started interactive rebase

2017-01-27 Thread Johannes Schindelin
Hi Stefan,

On Thu, 26 Jan 2017, Stefan Beller wrote:

> On Thu, Jan 26, 2017 at 8:08 AM, Johannes Schindelin
>  wrote:
> > -   if (!f)
> > +   if (!f) {
> > +   if (errno == ENOENT)
> > +   return -1;
> > die_errno("Could not open file %s for reading",
> >   git_path("%s", fname));
> 
> While at it, fix the translation with die_errno(_(..),..) ?

That is not the purpose of my patch. But feel free to offer a follow-up
patch!

Ciao,
Johannes


Re: SubmittingPatches: drop temporal reference for PGP signing

2017-01-27 Thread Cornelius Weig


On 01/26/2017 09:58 PM, Philip Oakley wrote:
> From: "Junio C Hamano" 
>> Cornelius Weig  writes:
>>
>>> How about something along these lines? Does the forward reference
>>> break the main line of thought too severly?
>>
>> I find it a bit distracting for those who know PGP signing has
>> nothing to do with signing off your patch, but I think that is OK
>> because they are not the primary target audience of this part of the
>> document.
> 
> Agreed. I this case the target audience was those weren't aware of that.

Yes, maybe the forward reference does give a wrong hint about a
connection between sign-off and pgp-signing. However, I would still vote
for the following change suggested by sbel...@google.com:

-Do not PGP sign your patch, -at least for now-. Most likely, your (...)
+Do not PGP sign your patch. Most likely, your maintainer or other (...)


> 
> Maybe even s/by signing off/by adding your Signed-off-by:/ to be sure
> that the reader knows that it is _their certification_ that is being
> sought. Even if it does double up on the 'your'.
> 

I don't think doubling on the 'your' will make the heading clearer. The
main intention of this change is to avoid mixups with pgp-signing and
that would IMHO not improve by that.
Besides, I find the colon in the heading a bit awkward. Is the following
version as expressive as with the colon?

-(5) Sign your work
+(5) Certify your work by adding Signed-off-by


Re: [PATCH v2 3/3] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-01-27 Thread Johannes Schindelin
Hi Junio,

On Thu, 26 Jan 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > diff --git a/connect.c b/connect.c
> > index 9f750eacb6..7b4437578b 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -691,6 +691,24 @@ static const char *get_ssh_command(void)
> > return NULL;
> >  }
> >  
> > +static int handle_ssh_variant(int *port_option, int *needs_batch)
> > +{
> > +   const char *variant;
> > +
> > +   if (!(variant = getenv("GIT_SSH_VARIANT")) &&
> > +   git_config_get_string_const("ssh.variant", ))
> > +   return 0;
> > +
> > +   if (!strcmp(variant, "plink") || !strcmp(variant, "putty"))
> > +   *port_option = 'P';
> > +   else if (!strcmp(variant, "tortoiseplink")) {
> > +   *port_option = 'P';
> > +   *needs_batch = 1;
> > +   }
> > +
> > +   return 1;
> > +}
> 
> Between handle and get I do not think there is strong reason to
> favor one over the other.

That is correct. "handle" and "get" are two very beautiful words, and none
of them deserves to take a back seat behind the other.

In this case, "get" is inappropriate, though, as the function does not
return the ssh variant, nor does it assign the ssh variant to any variable
to which any of its argument points.

Will try to find the time to address the other issues soon,
Johannes


[PATCH v4 5/5] urlmatch: allow globbing for the URL host part

2017-01-27 Thread Patrick Steinhardt
The URL matching function computes for two URLs whether they match not.
The match is performed by splitting up the URL into different parts and
then doing an exact comparison with the to-be-matched URL.

The main user of `urlmatch` is the configuration subsystem. It allows to
set certain configurations based on the URL which is being connected to
via keys like `http..*`. A common use case for this is to set
proxies for only some remotes which match the given URL. Unfortunately,
having exact matches for all parts of the URL can become quite tedious
in some setups. Imagine for example a corporate network where there are
dozens or even hundreds of subdomains, which would have to be configured
individually.

Allow users to write an asterisk '*' in place of any 'host' or
'subdomain' label as part of the host name.  For example,
"http.https://*.example.com.proxy; sets "http.proxy" for all direct
subdomains of "https://example.com;, e.g. "https://foo.example.com;, but
not "https://foo.bar.example.com;.

Signed-off-by: Patrick Steinhardt 
Helped-by: Junio C Hamano 
---
 Documentation/config.txt |  5 +++-
 t/t1300-repo-config.sh   | 72 
 urlmatch.c   | 49 +---
 3 files changed, 121 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 506431267..078e9b490 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1914,7 +1914,10 @@ http..*::
   must match exactly between the config key and the URL.
 
 . Host/domain name (e.g., `example.com` in `https://example.com/`).
-  This field must match exactly between the config key and the URL.
+  This field must match between the config key and the URL. It is
+  possible to specify a `*` as part of the host name to match all subdomains
+  at this level. `https://*.example.com/` for example would match
+  `https://foo.example.com/`, but not `https://foo.bar.example.com/`.
 
 . Port number (e.g., `8080` in `http://example.com:8080/`).
   This field must match exactly between the config key and the URL.
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 6c844d519..052f12021 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1187,6 +1187,18 @@ test_expect_success 'urlmatch favors more specific URLs' 
'
cookieFile = /tmp/user.txt
[http "https://averylongu...@example.com/;]
cookieFile = /tmp/averylonguser.txt
+   [http "https://preceding.example.com;]
+   cookieFile = /tmp/preceding.txt
+   [http "https://*.example.com;]
+   cookieFile = /tmp/wildcard.txt
+   [http "https://*.example.com/wildcardwithsubdomain;]
+   cookieFile = /tmp/wildcardwithsubdomain.txt
+   [http "https://trailing.example.com;]
+   cookieFile = /tmp/trailing.txt
+   [http "https://user@*.example.com/;]
+   cookieFile = /tmp/wildcardwithuser.txt
+   [http "https://sub.example.com/;]
+   cookieFile = /tmp/sub.txt
EOF
 
echo http.cookiefile /tmp/root.txt >expect &&
@@ -1207,6 +1219,66 @@ test_expect_success 'urlmatch favors more specific URLs' 
'
 
echo http.cookiefile /tmp/subdirectory.txt >expect &&
git config --get-urlmatch HTTP 
https://averylongu...@example.com/subdirectory >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/preceding.txt >expect &&
+   git config --get-urlmatch HTTP https://preceding.example.com >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/wildcard.txt >expect &&
+   git config --get-urlmatch HTTP https://wildcard.example.com >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/sub.txt >expect &&
+   git config --get-urlmatch HTTP 
https://sub.example.com/wildcardwithsubdomain >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/trailing.txt >expect &&
+   git config --get-urlmatch HTTP https://trailing.example.com >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/sub.txt >expect &&
+   git config --get-urlmatch HTTP https://u...@sub.example.com >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'urlmatch with wildcard' '
+   cat >.git/config <<-\EOF &&
+   [http]
+   sslVerify
+   [http "https://*.example.com;]
+   sslVerify = false
+   cookieFile = /tmp/cookie.txt
+   EOF
+
+   test_expect_code 1 git config --bool --get-urlmatch doesnt.exist 
https://good.example.com >actual &&
+   test_must_be_empty actual &&
+
+   echo true >expect &&
+   git config --bool --get-urlmatch http.SSLverify https://example.com 
>actual &&
+   test_cmp expect actual &&
+
+   echo true >expect &&
+   git config --bool --get-urlmatch http.SSLverify 

[PATCH v4 2/5] urlmatch: enable normalization of URLs with globs

2017-01-27 Thread Patrick Steinhardt
The `url_normalize` function is used to validate and normalize URLs. As
such, it does not allow for some special characters to be part of the
URLs that are to be normalized. As we want to allow using globs in some
configuration keys making use of URLs, namely `http..`, but
still normalize them, we need to somehow enable some additional allowed
characters.

To do this without having to change all callers of `url_normalize`,
where most do not actually want globbing at all, we split off another
function `url_normalize_1`. This function accepts an additional
parameter `allow_globs`, which is subsequently called by `url_normalize`
with `allow_globs=0`.

As of now, this function is not used with globbing enabled. A caller
will be added in the following commit.

Signed-off-by: Patrick Steinhardt 
---
 urlmatch.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/urlmatch.c b/urlmatch.c
index 132d342bc..d350478c0 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -63,7 +63,7 @@ static int append_normalized_escapes(struct strbuf *buf,
return 1;
 }
 
-char *url_normalize(const char *url, struct url_info *out_info)
+static char *url_normalize_1(const char *url, struct url_info *out_info, char 
allow_globs)
 {
/*
 * Normalize NUL-terminated url using the following rules:
@@ -191,7 +191,12 @@ char *url_normalize(const char *url, struct url_info 
*out_info)
strbuf_release();
return NULL;
}
-   spanned = strspn(url, URL_HOST_CHARS);
+
+   if (allow_globs)
+   spanned = strspn(url, URL_HOST_CHARS "*");
+   else
+   spanned = strspn(url, URL_HOST_CHARS);
+
if (spanned < colon_ptr - url) {
/* Host name has invalid characters */
if (out_info) {
@@ -380,6 +385,11 @@ char *url_normalize(const char *url, struct url_info 
*out_info)
return result;
 }
 
+char *url_normalize(const char *url, struct url_info *out_info)
+{
+   return url_normalize_1(url, out_info, 0);
+}
+
 static size_t url_match_prefix(const char *url,
   const char *url_prefix,
   size_t url_prefix_len)
-- 
2.11.0



[PATCH v4 4/5] urlmatch: include host and port in urlmatch length

2017-01-27 Thread Patrick Steinhardt
In order to be able to rank positive matches by `urlmatch`, we inspect
the path length and user part to decide whether a match is better than
another match. As all other parts are matched exactly between both URLs,
this is the right thing to do right now.

In the future, though, we want to introduce wild cards for the domain
part. When doing this, though, it does not make sense anymore to only
compare the path lengths. Instead, we also want to compare the domain
lengths to determine which of both URLs matches the host part more
closely.

Signed-off-by: Patrick Steinhardt 
---
 t/t1300-repo-config.sh | 33 
 urlmatch.c | 59 +-
 urlmatch.h |  3 ++-
 3 files changed, 69 insertions(+), 26 deletions(-)

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 923bfc5a2..6c844d519 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1177,6 +1177,39 @@ test_expect_success 'urlmatch' '
test_cmp expect actual
 '
 
+test_expect_success 'urlmatch favors more specific URLs' '
+   cat >.git/config <<-\EOF &&
+   [http "https://example.com/;]
+   cookieFile = /tmp/root.txt
+   [http "https://example.com/subdirectory;]
+   cookieFile = /tmp/subdirectory.txt
+   [http "https://u...@example.com/;]
+   cookieFile = /tmp/user.txt
+   [http "https://averylongu...@example.com/;]
+   cookieFile = /tmp/averylonguser.txt
+   EOF
+
+   echo http.cookiefile /tmp/root.txt >expect &&
+   git config --get-urlmatch HTTP https://example.com >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/subdirectory.txt >expect &&
+   git config --get-urlmatch HTTP https://example.com/subdirectory >actual 
&&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/subdirectory.txt >expect &&
+   git config --get-urlmatch HTTP https://example.com/subdirectory/nested 
>actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/user.txt >expect &&
+   git config --get-urlmatch HTTP https://u...@example.com/ >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/subdirectory.txt >expect &&
+   git config --get-urlmatch HTTP 
https://averylongu...@example.com/subdirectory >actual &&
+   test_cmp expect actual
+'
+
 # good section hygiene
 test_expect_failure 'unsetting the last key in a section removes header' '
cat >.git/config <<-\EOF &&
diff --git a/urlmatch.c b/urlmatch.c
index e328905eb..f35d00a6e 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -426,7 +426,7 @@ static size_t url_match_prefix(const char *url,
 
 static int match_urls(const struct url_info *url,
  const struct url_info *url_prefix,
- int *exactusermatch)
+ struct urlmatch_item *match)
 {
/*
 * url_prefix matches url if the scheme, host and port of url_prefix
@@ -445,8 +445,8 @@ static int match_urls(const struct url_info *url,
 * contained a user name or false if url_prefix did not have a
 * user name.  If there is no match *exactusermatch is left untouched.
 */
-   int usermatched = 0;
-   int pathmatchlen;
+   char usermatched = 0;
+   size_t pathmatchlen;
 
if (!url || !url_prefix || !url->url || !url_prefix->url)
return 0;
@@ -483,22 +483,38 @@ static int match_urls(const struct url_info *url,
url->url + url->path_off,
url_prefix->url + url_prefix->path_off,
url_prefix->url_len - url_prefix->path_off);
+   if (!pathmatchlen)
+   return 0; /* paths do not match */
 
-   if (pathmatchlen && exactusermatch)
-   *exactusermatch = usermatched;
-   return pathmatchlen;
+   if (match) {
+   match->hostmatch_len = url_prefix->host_len;
+   match->pathmatch_len = pathmatchlen;
+   match->user_matched = usermatched;
+   }
+
+   return 1;
+}
+
+static int cmp_matches(const struct urlmatch_item *a,
+  const struct urlmatch_item *b)
+{
+   if (a->hostmatch_len != b->hostmatch_len)
+   return a->hostmatch_len < b->hostmatch_len ? -1 : 1;
+   if (a->pathmatch_len != b->pathmatch_len)
+   return a->pathmatch_len < b->pathmatch_len ? -1 : 1;
+   if (a->user_matched != b->user_matched)
+   return b->user_matched ? -1 : 1;
+   return 0;
 }
 
 int urlmatch_config_entry(const char *var, const char *value, void *cb)
 {
struct string_list_item *item;
struct urlmatch_config *collect = cb;
-   struct urlmatch_item *matched;
+   struct urlmatch_item matched;
struct url_info *url = >url;
const char *key, *dot;
struct strbuf synthkey = STRBUF_INIT;
-   size_t 

[PATCH v4 3/5] urlmatch: split host and port fields in `struct url_info`

2017-01-27 Thread Patrick Steinhardt
The `url_info` structure contains information about a normalized URL
with the URL's components being represented by different fields. The
host and port part though are to be accessed by the same `host` field,
so that getting the host and/or port separately becomes more involved
than really necessary.

To make the port more readily accessible, split up the host and port
fields. Namely, the `host_len` will not include the port length anymore
and a new `port_off` field has been added which includes the offset to
the port, if available.

The only user of these fields is `url_normalize_1`. This change makes it
easier later on to treat host and port differently when introducing
globs for domains.

Signed-off-by: Patrick Steinhardt 
---
 urlmatch.c | 16 
 urlmatch.h |  9 +
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/urlmatch.c b/urlmatch.c
index d350478c0..e328905eb 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -104,7 +104,7 @@ static char *url_normalize_1(const char *url, struct 
url_info *out_info, char al
struct strbuf norm;
size_t spanned;
size_t scheme_len, user_off=0, user_len=0, passwd_off=0, passwd_len=0;
-   size_t host_off=0, host_len=0, port_len=0, path_off, path_len, 
result_len;
+   size_t host_off=0, host_len=0, port_off=0, port_len=0, path_off, 
path_len, result_len;
const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
char *result;
 
@@ -263,6 +263,7 @@ static char *url_normalize_1(const char *url, struct 
url_info *out_info, char al
return NULL;
}
strbuf_addch(, ':');
+   port_off = norm.len;
strbuf_add(, url, slash_ptr - url);
port_len = slash_ptr - url;
}
@@ -270,7 +271,7 @@ static char *url_normalize_1(const char *url, struct 
url_info *out_info, char al
url = slash_ptr;
}
if (host_off)
-   host_len = norm.len - host_off;
+   host_len = norm.len - host_off - (port_len ? port_len + 1 : 0);
 
 
/*
@@ -378,6 +379,7 @@ static char *url_normalize_1(const char *url, struct 
url_info *out_info, char al
out_info->passwd_len = passwd_len;
out_info->host_off = host_off;
out_info->host_len = host_len;
+   out_info->port_off = port_off;
out_info->port_len = port_len;
out_info->path_off = path_off;
out_info->path_len = path_len;
@@ -464,11 +466,17 @@ static int match_urls(const struct url_info *url,
usermatched = 1;
}
 
-   /* check the host and port */
+   /* check the host */
if (url_prefix->host_len != url->host_len ||
strncmp(url->url + url->host_off,
url_prefix->url + url_prefix->host_off, url->host_len))
-   return 0; /* host names and/or ports do not match */
+   return 0; /* host names do not match */
+
+   /* check the port */
+   if (url_prefix->port_len != url->port_len ||
+   strncmp(url->url + url->port_off,
+   url_prefix->url + url_prefix->port_off, url->port_len))
+   return 0; /* ports do not match */
 
/* check the path */
pathmatchlen = url_match_prefix(
diff --git a/urlmatch.h b/urlmatch.h
index 528862adc..0ea812b03 100644
--- a/urlmatch.h
+++ b/urlmatch.h
@@ -18,11 +18,12 @@ struct url_info {
size_t passwd_len;  /* length of passwd; if passwd_off != 0 but
   passwd_len == 0, an empty passwd was given */
size_t host_off;/* offset into url to start of host name (0 => 
none) */
-   size_t host_len;/* length of host name; this INCLUDES any 
':portnum';
+   size_t host_len;/* length of host name;
 * file urls may have host_len == 0 */
-   size_t port_len;/* if a portnum is present (port_len != 0), it 
has
-* this length (excluding the leading ':') at 
the
-* end of the host name (always 0 for file 
urls) */
+   size_t port_off;/* offset into url to start of port number (0 
=> none) */
+   size_t port_len;/* if a portnum is present (port_off != 0), it 
has
+* this length (excluding the leading ':') 
starting
+* from port_off (always 0 for file urls) */
size_t path_off;/* offset into url to the start of the url path;
 * this will always point to a '/' character
 * after the url has been normalized */
-- 
2.11.0



[PATCH v4 0/5] urlmatch: allow wildcard-based matches

2017-01-27 Thread Patrick Steinhardt
Hi,

so this is part four of my patch series. The previous version can
be found at [1]. The use case is to be able to configure an HTTP
proxy for all subdomains of a domain where there are hundreds of
subdomains.

Changes to the previous version:

 - applied Junio's proposed patch to replace `strtok_r` with a
   `memchr`-based loop
 - applied Junio's proposed rewrite of the commit message of
   patch 5
 - I realized that with my patches, "ranking" of URLs was broken.
   Previously, we've always taken the longest matching URL. As
   previously, only the user and path could actually differ, only
   these two components were used for the comparison. I've
   changed this now to also include the host part so that URLs
   with a longer host will take precedence. This resulted in a
   the patch 4.
 - New tests are included which examine if the precedence-rules
   are actually honored correctly. The tests are part of patches
   4 and 5.

You can find the interdiff below.

Regards
Patrick

[1]: 
http://public-inbox.org/git/20170125095648.4116-1-patrick.steinha...@elego.de/T/#t

Patrick Steinhardt (5):
  mailmap: add Patrick Steinhardt's work address
  urlmatch: enable normalization of URLs with globs
  urlmatch: split host and port fields in `struct url_info`
  urlmatch: include host and port in urlmatch length
  urlmatch: allow globbing for the URL host part

 .mailmap |   1 +
 Documentation/config.txt |   5 +-
 t/t1300-repo-config.sh   | 105 
 urlmatch.c   | 138 +++
 urlmatch.h   |  12 +++--
 5 files changed, 220 insertions(+), 41 deletions(-)

-- 
2.11.0

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index ec545e092..052f12021 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1177,7 +1177,72 @@ test_expect_success 'urlmatch' '
test_cmp expect actual
 '
 
-test_expect_success 'glob-based urlmatch' '
+test_expect_success 'urlmatch favors more specific URLs' '
+   cat >.git/config <<-\EOF &&
+   [http "https://example.com/;]
+   cookieFile = /tmp/root.txt
+   [http "https://example.com/subdirectory;]
+   cookieFile = /tmp/subdirectory.txt
+   [http "https://u...@example.com/;]
+   cookieFile = /tmp/user.txt
+   [http "https://averylongu...@example.com/;]
+   cookieFile = /tmp/averylonguser.txt
+   [http "https://preceding.example.com;]
+   cookieFile = /tmp/preceding.txt
+   [http "https://*.example.com;]
+   cookieFile = /tmp/wildcard.txt
+   [http "https://*.example.com/wildcardwithsubdomain;]
+   cookieFile = /tmp/wildcardwithsubdomain.txt
+   [http "https://trailing.example.com;]
+   cookieFile = /tmp/trailing.txt
+   [http "https://user@*.example.com/;]
+   cookieFile = /tmp/wildcardwithuser.txt
+   [http "https://sub.example.com/;]
+   cookieFile = /tmp/sub.txt
+   EOF
+
+   echo http.cookiefile /tmp/root.txt >expect &&
+   git config --get-urlmatch HTTP https://example.com >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/subdirectory.txt >expect &&
+   git config --get-urlmatch HTTP https://example.com/subdirectory >actual 
&&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/subdirectory.txt >expect &&
+   git config --get-urlmatch HTTP https://example.com/subdirectory/nested 
>actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/user.txt >expect &&
+   git config --get-urlmatch HTTP https://u...@example.com/ >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/subdirectory.txt >expect &&
+   git config --get-urlmatch HTTP 
https://averylongu...@example.com/subdirectory >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/preceding.txt >expect &&
+   git config --get-urlmatch HTTP https://preceding.example.com >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/wildcard.txt >expect &&
+   git config --get-urlmatch HTTP https://wildcard.example.com >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/sub.txt >expect &&
+   git config --get-urlmatch HTTP 
https://sub.example.com/wildcardwithsubdomain >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/trailing.txt >expect &&
+   git config --get-urlmatch HTTP https://trailing.example.com >actual &&
+   test_cmp expect actual &&
+
+   echo http.cookiefile /tmp/sub.txt >expect &&
+   git config --get-urlmatch HTTP https://u...@sub.example.com >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'urlmatch with wildcard' '
cat >.git/config <<-\EOF &&
[http]
sslVerify
@@ -1210,6 +1275,10 @@ test_expect_success 'glob-based 

[PATCH v4 1/5] mailmap: add Patrick Steinhardt's work address

2017-01-27 Thread Patrick Steinhardt
Signed-off-by: Patrick Steinhardt 
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index 9c87a3840..ea59205b9 100644
--- a/.mailmap
+++ b/.mailmap
@@ -177,6 +177,7 @@ Paolo Bonzini  
 Pascal Obry  
 Pascal Obry  
 Pat Notz  
+Patrick Steinhardt  
 Paul Mackerras  
 Paul Mackerras  
 Peter Baumann  

-- 
2.11.0



Re: [PATCH] mingw: allow hooks to be .exe files

2017-01-27 Thread Johannes Schindelin
Hi Junio,

On Thu, 26 Jan 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > On Wed, 25 Jan 2017, Jeff King wrote:
> >
> >> On Wed, Jan 25, 2017 at 05:58:42PM +0100, Johannes Schindelin wrote:
> >> 
> >> > -if (access(path.buf, X_OK) < 0)
> >> > +if (access(path.buf, X_OK) < 0) {
> >> > +#ifdef STRIP_EXTENSION
> >> > +strbuf_addstr(, ".exe");
> >> 
> >> I think STRIP_EXTENSION is a string.  Should this line be:
> >> 
> >>   strbuf_addstr(, STRIP_EXTENSION);
> >
> > Yep.
> >
> > v2 coming,
> > Johannes
> 
> I think I've already tweaked it out when I queued the original one.

After digging, I found your SQUASH commit. I had not known about that.

In any case, I much rather prefer to have the final version of any patch
or patch series I contribute to be identical between what you commit and
what I sent to the mailing list. We do disagree from time to time, and I
would like to have the opportunity of reviewing how you tweak my changes.

Ciao,
Johannes


[PATCH v3 1/3] config: add markup to core.logAllRefUpdates doc

2017-01-27 Thread cornelius . weig
From: Cornelius Weig 

Signed-off-by: Cornelius Weig 
---

Notes:
Changes wrt v2:
Remove duplicated line.

 Documentation/config.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index af2ae4c..c7d8a01 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -517,10 +517,10 @@ core.logAllRefUpdates::
"`$GIT_DIR/logs/`", by appending the new and old
SHA-1, the date/time and the reason of the update, but
only when the file exists.  If this configuration
-   variable is set to true, missing "`$GIT_DIR/logs/`"
+   variable is set to `true`, missing "`$GIT_DIR/logs/`"
file is automatically created for branch heads (i.e. under
-   refs/heads/), remote refs (i.e. under refs/remotes/),
-   note refs (i.e. under refs/notes/), and the symbolic ref HEAD.
+   `refs/heads/`), remote refs (i.e. under `refs/remotes/`),
+   note refs (i.e. under `refs/notes/`), and the symbolic ref `HEAD`.
 +
 This information can be used to determine what commit
 was the tip of a branch "2 days ago".
-- 
2.10.2



[PATCH v3 2/3] refs: add option core.logAllRefUpdates = always

2017-01-27 Thread cornelius . weig
From: Cornelius Weig 

When core.logallrefupdates is true, we only create a new reflog for refs
that are under certain well-known hierarchies. The reason is that we
know that some hierarchies (like refs/tags) are not meant to change, and
that unknown hierarchies might not want reflogs at all (e.g., a
hypothetical refs/foo might be meant to change often and drop old
history immediately).

However, sometimes it is useful to override this decision and simply log
for all refs, because the safety and audit trail is more important than
the performance implications of keeping the log around.

This patch introduces a new "always" mode for the core.logallrefupdates
option which will log updates to everything under refs/, regardless
where in the hierarchy it is (we still will not log things like
ORIG_HEAD and FETCH_HEAD, which are known to be transient).

Based-on-patch-by: Jeff King 
Signed-off-by: Cornelius Weig 
Reviewed-by: Jeff King 
---

Notes:
Changes wrt v2:

 - change wording in commit message s/do not typically/are not meant to/;
 - in update_refs_for_switch move refname to the enclosing block, so that
   should_autocreate_reflog has access. Thanks Junio for spotting this
   potential bug early :)
 - add test that asserts reflogs are created for tags if
   logAllRefUpdates=always. The case with logAllRefUpdates=true is IMHO 
already
   covered by the default case. To make that clearer, I explicitly added
   logAllRefUpdates=true.

When writing the test for git-tag, I realized that the option
--no-create-reflog to git-tag does not take precedence over
logAllRefUpdate=always. IOW the setting cannot be overridden on the command
line. Do you think this is a defect or would it not be desirable to have 
this
feature anyway?

 Documentation/config.txt  |  2 ++
 Documentation/git-tag.txt |  3 ++-
 branch.c  |  2 +-
 builtin/checkout.c|  7 +++
 builtin/init-db.c |  2 +-
 cache.h   |  9 -
 config.c  |  7 ++-
 environment.c |  2 +-
 refs.c| 15 ++-
 refs.h|  2 ++
 refs/files-backend.c  |  6 +++---
 refs/refs-internal.h  |  2 --
 t/t1400-update-ref.sh | 37 +
 t/t7004-tag.sh|  8 
 14 files changed, 84 insertions(+), 20 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index c7d8a01..d1fab67 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -521,6 +521,8 @@ core.logAllRefUpdates::
file is automatically created for branch heads (i.e. under
`refs/heads/`), remote refs (i.e. under `refs/remotes/`),
note refs (i.e. under `refs/notes/`), and the symbolic ref `HEAD`.
+   If it is set to `always`, then a missing reflog is automatically
+   created for any ref under `refs/`.
 +
 This information can be used to determine what commit
 was the tip of a branch "2 days ago".
diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 5055a96..2ac25a9 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -150,7 +150,8 @@ This option is only applicable when listing tags without 
annotation lines.
'strip' removes both whitespace and commentary.
 
 --create-reflog::
-   Create a reflog for the tag.
+   Create a reflog for the tag. To globally enable reflogs for tags, see
+   `core.logAllRefUpdates` in linkgit:git-config[1].
 
 ::
The name of the tag to create, delete, or describe.
diff --git a/branch.c b/branch.c
index c431cbf..b955d4f 100644
--- a/branch.c
+++ b/branch.c
@@ -298,7 +298,7 @@ void create_branch(const char *name, const char *start_name,
 start_name);
 
if (reflog)
-   log_all_ref_updates = 1;
+   log_all_ref_updates = LOG_REFS_NORMAL;
 
if (!dont_change_ref) {
struct ref_transaction *transaction;
diff --git a/builtin/checkout.c b/builtin/checkout.c
index bfe685c..81ea2ed 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -612,14 +612,12 @@ static void update_refs_for_switch(const struct 
checkout_opts *opts,
const char *old_desc, *reflog_msg;
if (opts->new_branch) {
if (opts->new_orphan_branch) {
-   if (opts->new_branch_log && !log_all_ref_updates) {
+   const char *refname = mkpathdup("refs/heads/%s", 
opts->new_orphan_branch);
+   if (opts->new_branch_log && 
should_autocreate_reflog(refname)) {
int ret;
-   char *refname;
struct strbuf err = STRBUF_INIT;
 
-   refname = mkpathdup("refs/heads/%s", 
opts->new_orphan_branch);

Re: [PATCH 2/2] use absolute_pathdup()

2017-01-27 Thread Johannes Schindelin
Hi René,

On Thu, 26 Jan 2017, René Scharfe wrote:

> Apply the symantic patch for converting callers that duplicate the

s/symantic/semantic/

Ciao,
Dscho

  1   2   >