Re: worktrees vs. alternates

2018-05-18 Thread Duy Nguyen
On Thu, May 17, 2018 at 5:31 AM, Jeff King  wrote:
> On Thu, May 17, 2018 at 06:13:55AM +0530, Sitaram Chamarty wrote:
>
>> I may have missed a few of the earlier messages, but in the last
>> 20 or so in this thread, I did not see namespaces mentioned by
>> anyone. (I.e., apologies if it was addressed and discarded
>> earlier!)
>>
>> I was under the impression that, as long as "read" access need
>> not be controlled (Konstantin's situation, at least, and maybe
>> Peff's too, for public repos), namespaces are a good way to
>> create and manage that "mother repo".
>>
>> Is that not true anymore?  Mind, I have not actually used them
>> in anger anywhere, so I could be missing some really big point
>> here.
>
> The biggest problem with namespaces as they are currently implemented is
> that they do not apply universally to all commands. If you only access
> the repo via push/fetch, they may be fine. But as soon as you start
> doing other operations (e.g., showing the history of a branch in a web
> interface), you don't get to use the namespaced names anymore.
>
> I think a different implementation of namespaces could do this better.
> E.g., by controlling the view of the refs at the refs.c layer (or
> perhaps as a filtering backend).

Yeah. Namespaces (that work for all commands) + worktree was my plan
for centralizing repos (for one user). But I never got that far to
look into making ref namespaces work for everything.
-- 
Duy


Re: [PATCH v2 14/14] commit.h: delete 'util' field in struct commit

2018-05-18 Thread Duy Nguyen
On Mon, May 14, 2018 at 8:14 PM, Derrick Stolee  wrote:
> I disagree with the removal of "generation". My intention is to make the
> commit-graph feature a standard feature that most repos (of reasonable size)
> want to have. In that case, 'graph_pos' and 'generation' will be set during
> every parse and used by every commit walk. This is just my gut reaction.

My 'often used' is a poor choice of words. The problem here is memory
consumption in full-repo walk like 'git clone' or 'git gc', where more
memory use may equal more swapping and that will slow things down.

commit-slab also has a few advantage over packing everything in struct
commit besides the performance argument. It makes it easier to see
what field is used for what. And if an operation only uses a field in
a short time, then it's possible to free data after it's done
(impossible to free these struct commits without a lot more thinking
and checking; whatever you add here will stay until the end of the
process, which could be a long time for things like pack-objects)

> As I investigate these changes, I'll try to see what performance hit is
> caused by converting the graph_pos and/or generation to commit slabs.
> (Again, I'm assuming the slabs will make this slower. I may be wrong here.)

The slab allocation code is very similar to memory-pool, which was
made common code also by Microsoft to make reading cache faster so I
don't think it will slow things down for you (again, guessing, no real
measurement from my side). This does make me think it may be a good
idea to try unify commit-slab and memory-pool, let commit-slab be a
thin layer on top of memory-pool or something like that (the fewer
custom mem allocators we have, the better)
-- 
Duy


[PATCH v3 10/15] name-rev: use commit-slab for rev-name instead of commit->util

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 builtin/name-rev.c | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 387ddf85d2..0eb440359d 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -6,6 +6,7 @@
 #include "refs.h"
 #include "parse-options.h"
 #include "sha1-lookup.h"
+#include "commit-slab.h"
 
 #define CUTOFF_DATE_SLOP 86400 /* one day */
 
@@ -17,11 +18,26 @@ typedef struct rev_name {
int from_tag;
 } rev_name;
 
+define_commit_slab(commit_rev_name, struct rev_name *);
+
 static timestamp_t cutoff = TIME_MAX;
+static struct commit_rev_name rev_names;
 
 /* How many generations are maximally preferred over _one_ merge traversal? */
 #define MERGE_TRAVERSAL_WEIGHT 65535
 
+static struct rev_name *get_commit_rev_name(struct commit *commit)
+{
+   struct rev_name **slot = commit_rev_name_peek(_names, commit);
+
+   return slot ? *slot : NULL;
+}
+
+static void set_commit_rev_name(struct commit *commit, struct rev_name *name)
+{
+   *commit_rev_name_at(_names, commit) = name;
+}
+
 static int is_better_name(struct rev_name *name,
  const char *tip_name,
  timestamp_t taggerdate,
@@ -65,7 +81,7 @@ static void name_rev(struct commit *commit,
int generation, int distance, int from_tag,
int deref)
 {
-   struct rev_name *name = (struct rev_name *)commit->util;
+   struct rev_name *name = get_commit_rev_name(commit);
struct commit_list *parents;
int parent_number = 1;
char *to_free = NULL;
@@ -84,7 +100,7 @@ static void name_rev(struct commit *commit,
 
if (name == NULL) {
name = xmalloc(sizeof(rev_name));
-   commit->util = name;
+   set_commit_rev_name(commit, name);
goto copy_data;
} else if (is_better_name(name, tip_name, taggerdate,
  generation, distance, from_tag)) {
@@ -296,7 +312,7 @@ static const char *get_rev_name(const struct object *o, 
struct strbuf *buf)
if (o->type != OBJ_COMMIT)
return get_exact_ref_match(o);
c = (struct commit *) o;
-   n = c->util;
+   n = get_commit_rev_name(c);
if (!n)
return NULL;
 
@@ -413,6 +429,7 @@ int cmd_name_rev(int argc, const char **argv, const char 
*prefix)
OPT_END(),
};
 
+   init_commit_rev_name(_names);
git_config(git_default_config, NULL);
argc = parse_options(argc, argv, prefix, opts, name_rev_usage, 0);
if (all + transform_stdin + !!argc > 1) {
-- 
2.17.0.705.g3525833791



[PATCH v3 03/15] blame: use commit-slab for blame suspects instead of commit->util

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 blame.c | 42 +++---
 blame.h |  2 ++
 builtin/blame.c |  2 +-
 3 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/blame.c b/blame.c
index 78c9808bd1..18e8bd996a 100644
--- a/blame.c
+++ b/blame.c
@@ -6,6 +6,24 @@
 #include "diffcore.h"
 #include "tag.h"
 #include "blame.h"
+#include "commit-slab.h"
+
+define_commit_slab(blame_suspects, struct blame_origin *);
+static struct blame_suspects blame_suspects;
+
+struct blame_origin *get_blame_suspects(struct commit *commit)
+{
+   struct blame_origin **result;
+
+   result = blame_suspects_peek(_suspects, commit);
+
+   return result ? *result : NULL;
+}
+
+static void set_blame_suspects(struct commit *commit, struct blame_origin 
*origin)
+{
+   *blame_suspects_at(_suspects, commit) = origin;
+}
 
 void blame_origin_decref(struct blame_origin *o)
 {
@@ -15,12 +33,12 @@ void blame_origin_decref(struct blame_origin *o)
blame_origin_decref(o->previous);
free(o->file.ptr);
/* Should be present exactly once in commit chain */
-   for (p = o->commit->util; p; l = p, p = p->next) {
+   for (p = get_blame_suspects(o->commit); p; l = p, p = p->next) {
if (p == o) {
if (l)
l->next = p->next;
else
-   o->commit->util = p->next;
+   set_blame_suspects(o->commit, p->next);
free(o);
return;
}
@@ -41,8 +59,8 @@ static struct blame_origin *make_origin(struct commit 
*commit, const char *path)
FLEX_ALLOC_STR(o, path, path);
o->commit = commit;
o->refcnt = 1;
-   o->next = commit->util;
-   commit->util = o;
+   o->next = get_blame_suspects(commit);
+   set_blame_suspects(commit, o);
return o;
 }
 
@@ -54,13 +72,13 @@ static struct blame_origin *get_origin(struct commit 
*commit, const char *path)
 {
struct blame_origin *o, *l;
 
-   for (o = commit->util, l = NULL; o; l = o, o = o->next) {
+   for (o = get_blame_suspects(commit), l = NULL; o; l = o, o = o->next) {
if (!strcmp(o->path, path)) {
/* bump to front */
if (l) {
l->next = o->next;
-   o->next = commit->util;
-   commit->util = o;
+   o->next = get_blame_suspects(commit);
+   set_blame_suspects(commit, o);
}
return blame_origin_incref(o);
}
@@ -478,7 +496,7 @@ static void queue_blames(struct blame_scoreboard *sb, 
struct blame_origin *porig
porigin->suspects = blame_merge(porigin->suspects, sorted);
else {
struct blame_origin *o;
-   for (o = porigin->commit->util; o; o = o->next) {
+   for (o = get_blame_suspects(porigin->commit); o; o = o->next) {
if (o->suspects) {
porigin->suspects = sorted;
return;
@@ -525,7 +543,7 @@ static struct blame_origin *find_origin(struct commit 
*parent,
const char *paths[2];
 
/* First check any existing origins */
-   for (porigin = parent->util; porigin; porigin = porigin->next)
+   for (porigin = get_blame_suspects(parent); porigin; porigin = 
porigin->next)
if (!strcmp(porigin->path, origin->path)) {
/*
 * The same path between origin and its parent
@@ -1550,7 +1568,7 @@ void assign_blame(struct blame_scoreboard *sb, int opt)
 
while (commit) {
struct blame_entry *ent;
-   struct blame_origin *suspect = commit->util;
+   struct blame_origin *suspect = get_blame_suspects(commit);
 
/* find one suspect to break down */
while (suspect && !suspect->suspects)
@@ -1752,6 +1770,8 @@ void setup_scoreboard(struct blame_scoreboard *sb, const 
char *path, struct blam
struct commit *final_commit = NULL;
enum object_type type;
 
+   init_blame_suspects(_suspects);
+
if (sb->reverse && sb->contents_from)
die(_("--contents and --reverse do not blend well."));
 
@@ -1815,7 +1835,7 @@ void setup_scoreboard(struct blame_scoreboard *sb, const 
char *path, struct blam
}
 
if (is_null_oid(>final->object.oid)) {
-   o = sb->final->util;
+   o = get_blame_suspects(sb->final);
sb->final_buf = 

[PATCH v3 05/15] shallow.c: use commit-slab for commit depth instead of commit->util

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.

While at there, plug a leak for keeping track of depth in this code.
---
 shallow.c | 40 
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/shallow.c b/shallow.c
index df4d44ea7a..0301049781 100644
--- a/shallow.c
+++ b/shallow.c
@@ -12,6 +12,7 @@
 #include "commit-slab.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "commit-slab.h"
 
 static int is_shallow = -1;
 static struct stat_validity shallow_stat;
@@ -74,6 +75,11 @@ int is_repository_shallow(void)
return is_shallow;
 }
 
+/*
+ * TODO: use "int" elemtype instead of "int *" when/if commit-slab
+ * supports a "valid" flag.
+ */
+define_commit_slab(commit_depth, int *);
 struct commit_list *get_shallow_commits(struct object_array *heads, int depth,
int shallow_flag, int not_shallow_flag)
 {
@@ -82,25 +88,29 @@ struct commit_list *get_shallow_commits(struct object_array 
*heads, int depth,
struct object_array stack = OBJECT_ARRAY_INIT;
struct commit *commit = NULL;
struct commit_graft *graft;
+   struct commit_depth depths;
 
+   init_commit_depth();
while (commit || i < heads->nr || stack.nr) {
struct commit_list *p;
if (!commit) {
if (i < heads->nr) {
+   int **depth_slot;
commit = (struct commit *)
deref_tag(heads->objects[i++].item, 
NULL, 0);
if (!commit || commit->object.type != 
OBJ_COMMIT) {
commit = NULL;
continue;
}
-   if (!commit->util)
-   commit->util = xmalloc(sizeof(int));
-   *(int *)commit->util = 0;
+   depth_slot = commit_depth_at(, commit);
+   if (!*depth_slot)
+   *depth_slot = xmalloc(sizeof(int));
+   **depth_slot = 0;
cur_depth = 0;
} else {
commit = (struct commit *)
object_array_pop();
-   cur_depth = *(int *)commit->util;
+   cur_depth = **commit_depth_at(, commit);
}
}
parse_commit_or_die(commit);
@@ -116,25 +126,31 @@ struct commit_list *get_shallow_commits(struct 
object_array *heads, int depth,
}
commit->object.flags |= not_shallow_flag;
for (p = commit->parents, commit = NULL; p; p = p->next) {
-   if (!p->item->util) {
-   int *pointer = xmalloc(sizeof(int));
-   p->item->util = pointer;
-   *pointer =  cur_depth;
+   int **depth_slot = commit_depth_at(, p->item);
+   if (!*depth_slot) {
+   *depth_slot = xmalloc(sizeof(int));
+   **depth_slot = cur_depth;
} else {
-   int *pointer = p->item->util;
-   if (cur_depth >= *pointer)
+   if (cur_depth >= **depth_slot)
continue;
-   *pointer = cur_depth;
+   **depth_slot = cur_depth;
}
if (p->next)
add_object_array(>item->object,
NULL, );
else {
commit = p->item;
-   cur_depth = *(int *)commit->util;
+   cur_depth = **commit_depth_at(, commit);
}
}
}
+   for (i = 0; i < depths.slab_count; i++) {
+   int j;
+
+   for (j = 0; j < depths.slab_size; j++)
+   free(depths.slab[i][j]);
+   }
+   clear_commit_depth();
 
return result;
 }
-- 
2.17.0.705.g3525833791



[PATCH v3 06/15] sequencer.c: use commit-slab to mark seen commits

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 sequencer.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 4ce5120e77..3af296db3b 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -23,6 +23,7 @@
 #include "hashmap.h"
 #include "notes-utils.h"
 #include "sigchain.h"
+#include "commit-slab.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -3160,6 +3161,7 @@ static enum check_level 
get_missing_commit_check_level(void)
return CHECK_IGNORE;
 }
 
+define_commit_slab(commit_seen, unsigned char);
 /*
  * Check if the user dropped some commits by mistake
  * Behaviour determined by rebase.missingCommitsCheck.
@@ -3173,6 +3175,9 @@ int check_todo_list(void)
struct todo_list todo_list = TODO_LIST_INIT;
struct strbuf missing = STRBUF_INIT;
int advise_to_edit_todo = 0, res = 0, i;
+   struct commit_seen commit_seen;
+
+   init_commit_seen(_seen);
 
strbuf_addstr(_file, rebase_path_todo());
if (strbuf_read_file_or_whine(_list.buf, todo_file.buf) < 0) {
@@ -3189,7 +3194,7 @@ int check_todo_list(void)
for (i = 0; i < todo_list.nr; i++) {
struct commit *commit = todo_list.items[i].commit;
if (commit)
-   commit->util = (void *)1;
+   *commit_seen_at(_seen, commit) = 1;
}
 
todo_list_release(_list);
@@ -3205,11 +3210,11 @@ int check_todo_list(void)
for (i = todo_list.nr - 1; i >= 0; i--) {
struct todo_item *item = todo_list.items + i;
struct commit *commit = item->commit;
-   if (commit && !commit->util) {
+   if (commit && !*commit_seen_at(_seen, commit)) {
strbuf_addf(, " - %s %.*s\n",
short_commit_name(commit),
item->arg_len, item->arg);
-   commit->util = (void *)1;
+   *commit_seen_at(_seen, commit) = 1;
}
}
 
@@ -3235,6 +3240,7 @@ int check_todo_list(void)
"The possible behaviours are: ignore, warn, error.\n\n"));
 
 leave_check:
+   clear_commit_seen(_seen);
strbuf_release(_file);
todo_list_release(_list);
 
-- 
2.17.0.705.g3525833791



[PATCH v3 14/15] merge: use commit-slab in merge remote desc instead of commit->util

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 builtin/merge.c   | 25 +
 commit.c  | 12 ++--
 commit.h  |  2 +-
 merge-recursive.c |  8 +---
 4 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 9db5a2cf16..fc55bc264b 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -443,6 +443,7 @@ static void merge_name(const char *remote, struct strbuf 
*msg)
struct object_id branch_head;
struct strbuf buf = STRBUF_INIT;
struct strbuf bname = STRBUF_INIT;
+   struct merge_remote_desc *desc;
const char *ptr;
char *found_ref;
int len, early;
@@ -515,16 +516,13 @@ static void merge_name(const char *remote, struct strbuf 
*msg)
strbuf_release();
}
 
-   if (remote_head->util) {
-   struct merge_remote_desc *desc;
-   desc = merge_remote_util(remote_head);
-   if (desc && desc->obj && desc->obj->type == OBJ_TAG) {
-   strbuf_addf(msg, "%s\t\t%s '%s'\n",
-   oid_to_hex(>obj->oid),
-   type_name(desc->obj->type),
-   remote);
-   goto cleanup;
-   }
+   desc = merge_remote_util(remote_head);
+   if (desc && desc->obj && desc->obj->type == OBJ_TAG) {
+   strbuf_addf(msg, "%s\t\t%s '%s'\n",
+   oid_to_hex(>obj->oid),
+   type_name(desc->obj->type),
+   remote);
+   goto cleanup;
}
 
strbuf_addf(msg, "%s\t\tcommit '%s'\n",
@@ -932,8 +930,11 @@ static void write_merge_heads(struct commit_list 
*remoteheads)
for (j = remoteheads; j; j = j->next) {
struct object_id *oid;
struct commit *c = j->item;
-   if (c->util && merge_remote_util(c)->obj) {
-   oid = _remote_util(c)->obj->oid;
+   struct merge_remote_desc *desc;
+
+   desc = merge_remote_util(c);
+   if (desc && desc->obj) {
+   oid = >obj->oid;
} else {
oid = >object.oid;
}
diff --git a/commit.c b/commit.c
index 57049118a5..e63a8dfeaa 100644
--- a/commit.c
+++ b/commit.c
@@ -1574,13 +1574,21 @@ int commit_tree_extended(const char *msg, size_t 
msg_len,
return result;
 }
 
+define_commit_slab(merge_desc_slab, struct merge_remote_desc *);
+static struct merge_desc_slab merge_desc_slab = COMMIT_SLAB_INIT(1, 
merge_desc_slab);
+
+struct merge_remote_desc *merge_remote_util(struct commit *commit)
+{
+   return *merge_desc_slab_at(_desc_slab, commit);
+}
+
 void set_merge_remote_desc(struct commit *commit,
   const char *name, struct object *obj)
 {
struct merge_remote_desc *desc;
FLEX_ALLOC_STR(desc, name, name);
desc->obj = obj;
-   commit->util = desc;
+   *merge_desc_slab_at(_desc_slab, commit) = desc;
 }
 
 struct commit *get_merge_parent(const char *name)
@@ -1592,7 +1600,7 @@ struct commit *get_merge_parent(const char *name)
return NULL;
obj = parse_object();
commit = (struct commit *)peel_to_type(name, 0, obj, OBJ_COMMIT);
-   if (commit && !commit->util)
+   if (commit && !merge_remote_util(commit))
set_merge_remote_desc(commit, name, obj);
return commit;
 }
diff --git a/commit.h b/commit.h
index e57ae4b583..838f6a6b26 100644
--- a/commit.h
+++ b/commit.h
@@ -303,7 +303,7 @@ struct merge_remote_desc {
struct object *obj; /* the named object, could be a tag */
char name[FLEX_ARRAY];
 };
-#define merge_remote_util(commit) ((struct merge_remote_desc 
*)((commit)->util))
+extern struct merge_remote_desc *merge_remote_util(struct commit *);
 extern void set_merge_remote_desc(struct commit *commit,
  const char *name, struct object *obj);
 
diff --git a/merge-recursive.c b/merge-recursive.c
index 0c0d48624d..5537f01f8e 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -223,10 +223,12 @@ static void output(struct merge_options *o, int v, const 
char *fmt, ...)
 
 static void output_commit_title(struct merge_options *o, struct commit *commit)
 {
+   struct merge_remote_desc *desc;
+
strbuf_addchars(>obuf, ' ', o->call_depth * 2);
-   if (commit->util)
-   strbuf_addf(>obuf, "virtual %s\n",
-   merge_remote_util(commit)->name);
+   desc = merge_remote_util(commit);
+   if (desc)
+   strbuf_addf(>obuf, "virtual %s\n", desc->name);
else {
strbuf_add_unique_abbrev(>obuf, >object.oid,
 DEFAULT_ABBREV);
-- 
2.17.0.705.g3525833791



[PATCH v3 01/15] commit-slab.h: code split

2018-05-18 Thread Nguyễn Thái Ngọc Duy
The struct declaration and implementation macros are moved to
commit-slab-hdr.h and commit-slab-impl.h respectively.

This right now is not needed for current users but if we make a public
commit-slab type, we may want to avoid including the slab
implementation in a header file which gets replicated in every c file
that includes it.
---
 commit-slab-decl.h |  30 
 commit-slab-impl.h |  91 +++
 commit-slab.h  | 115 +++--
 3 files changed, 127 insertions(+), 109 deletions(-)
 create mode 100644 commit-slab-decl.h
 create mode 100644 commit-slab-impl.h

diff --git a/commit-slab-decl.h b/commit-slab-decl.h
new file mode 100644
index 00..fb5220fb7d
--- /dev/null
+++ b/commit-slab-decl.h
@@ -0,0 +1,30 @@
+#ifndef COMMIT_SLAB_HDR_H
+#define COMMIT_SLAB_HDR_H
+
+/* allocate ~512kB at once, allowing for malloc overhead */
+#ifndef COMMIT_SLAB_SIZE
+#define COMMIT_SLAB_SIZE (512*1024-32)
+#endif
+
+#define declare_commit_slab(slabname, elemtype)\
+   \
+struct slabname {  \
+   unsigned slab_size; \
+   unsigned stride;\
+   unsigned slab_count;\
+   elemtype **slab;\
+}
+
+/*
+ * Statically initialize a commit slab named "var". Note that this
+ * evaluates "stride" multiple times! Example:
+ *
+ *   struct indegree indegrees = COMMIT_SLAB_INIT(1, indegrees);
+ *
+ */
+#define COMMIT_SLAB_INIT(stride, var) { \
+   COMMIT_SLAB_SIZE / sizeof(**((var).slab)) / (stride), \
+   (stride), 0, NULL \
+}
+
+#endif /* COMMIT_SLAB_HDR_H */
diff --git a/commit-slab-impl.h b/commit-slab-impl.h
new file mode 100644
index 00..234d9ee5f0
--- /dev/null
+++ b/commit-slab-impl.h
@@ -0,0 +1,91 @@
+#ifndef COMMIT_SLAB_IMPL_H
+#define COMMIT_SLAB_IMPL_H
+
+#define MAYBE_UNUSED __attribute__((__unused__))
+
+#define implement_commit_slab(slabname, elemtype)  \
+   \
+static int stat_ ##slabname## realloc; \
+   \
+static MAYBE_UNUSED void init_ ##slabname## _with_stride(struct slabname *s, \
+  unsigned stride) \
+{  \
+   unsigned int elem_size; \
+   if (!stride)\
+   stride = 1; \
+   s->stride = stride; \
+   elem_size = sizeof(elemtype) * stride;  \
+   s->slab_size = COMMIT_SLAB_SIZE / elem_size;\
+   s->slab_count = 0;  \
+   s->slab = NULL; \
+}  \
+   \
+static MAYBE_UNUSED void init_ ##slabname(struct slabname *s)  \
+{  \
+   init_ ##slabname## _with_stride(s, 1);  \
+}  \
+   \
+static MAYBE_UNUSED void clear_ ##slabname(struct slabname *s) \
+{  \
+   unsigned int i; \
+   for (i = 0; i < s->slab_count; i++) \
+   free(s->slab[i]);   \
+   s->slab_count = 0;  \
+   FREE_AND_NULL(s->slab); \
+}  \
+   \
+static MAYBE_UNUSED elemtype *slabname## _at_peek(struct slabname *s,  \
+ const struct commit *c, \
+ int add_if_missing)   \
+{  \
+   unsigned int nth_slab, nth_slot;\
+   \
+   nth_slab = c->index / s->slab_size; \
+   nth_slot = 

[PATCH v3 15/15] commit.h: delete 'util' field in struct commit

2018-05-18 Thread Nguyễn Thái Ngọc Duy
If you have come this far, you probably have seen that this 'util'
pointer is used for many different purposes. Some are not even
contained in a command code, but buried deep in common code with no
clue who will use it and how.

The move to using commit-slab gives us a much better picture of how
some piece of data is associated with a commit and what for. Since
nobody uses 'util' pointer anymore, we can retire so that nobody will
abuse it again. commit-slab will be the way forward for associating
data to a commit.

As a side benefit, this shrinks struct commit by 8 bytes (on 64-bit
architecture) which should help reduce memory usage for reachability
test a bit. This is also what commit-slab is invented for [1].

[1] 96c4f4a370 (commit: allow associating auxiliary info on-demand -
2013-04-09)
---
 commit.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/commit.h b/commit.h
index 838f6a6b26..4432458367 100644
--- a/commit.h
+++ b/commit.h
@@ -16,9 +16,13 @@ struct commit_list {
struct commit_list *next;
 };
 
+/*
+ * The size of this struct matters in full repo walk operations like
+ * 'git clone' or 'git gc'. Consider using commit-slab to attach data
+ * to a commit instead of adding new fields here.
+ */
 struct commit {
struct object object;
-   void *util;
unsigned int index;
timestamp_t date;
struct commit_list *parents;
-- 
2.17.0.705.g3525833791



[PATCH v3 02/15] commit-slab: support shared commit-slab

2018-05-18 Thread Nguyễn Thái Ngọc Duy
define_shared_commit_slab() could be used in a header file to define a
commit-slab. One of these C files must include commit-slab-impl.h and
"call" implement_shared_commit_slab().
---
 commit-slab-decl.h | 13 +
 commit-slab-impl.h | 22 ++
 commit-slab.h  |  2 +-
 3 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/commit-slab-decl.h b/commit-slab-decl.h
index fb5220fb7d..adc7b46c83 100644
--- a/commit-slab-decl.h
+++ b/commit-slab-decl.h
@@ -27,4 +27,17 @@ struct slabname {
\
(stride), 0, NULL \
 }
 
+#define declare_commit_slab_prototypes(slabname, elemtype) \
+   \
+void init_ ##slabname## _with_stride(struct slabname *s, unsigned stride); \
+void init_ ##slabname(struct slabname *s); \
+void clear_ ##slabname(struct slabname *s);\
+elemtype *slabname## _at_peek(struct slabname *s, const struct commit *c, int 
add_if_missing); \
+elemtype *slabname## _at(struct slabname *s, const struct commit *c);  \
+elemtype *slabname## _peek(struct slabname *s, const struct commit *c)
+
+#define define_shared_commit_slab(slabname, elemtype) \
+   declare_commit_slab(slabname, elemtype); \
+   declare_commit_slab_prototypes(slabname, elemtype)
+
 #endif /* COMMIT_SLAB_HDR_H */
diff --git a/commit-slab-impl.h b/commit-slab-impl.h
index 234d9ee5f0..87a9cadfcc 100644
--- a/commit-slab-impl.h
+++ b/commit-slab-impl.h
@@ -3,11 +3,17 @@
 
 #define MAYBE_UNUSED __attribute__((__unused__))
 
-#define implement_commit_slab(slabname, elemtype)  \
+#define implement_static_commit_slab(slabname, elemtype) \
+   implement_commit_slab(slabname, elemtype, static MAYBE_UNUSED)
+
+#define implement_shared_commit_slab(slabname, elemtype) \
+   implement_commit_slab(slabname, elemtype, )
+
+#define implement_commit_slab(slabname, elemtype, scope)   \
\
 static int stat_ ##slabname## realloc; \
\
-static MAYBE_UNUSED void init_ ##slabname## _with_stride(struct slabname *s, \
+scope void init_ ##slabname## _with_stride(struct slabname *s, \
   unsigned stride) \
 {  \
unsigned int elem_size; \
@@ -20,12 +26,12 @@ static MAYBE_UNUSED void init_ ##slabname## 
_with_stride(struct slabname *s, \
s->slab = NULL; \
 }  \
\
-static MAYBE_UNUSED void init_ ##slabname(struct slabname *s)  \
+scope void init_ ##slabname(struct slabname *s)
\
 {  \
init_ ##slabname## _with_stride(s, 1);  \
 }  \
\
-static MAYBE_UNUSED void clear_ ##slabname(struct slabname *s) \
+scope void clear_ ##slabname(struct slabname *s)   \
 {  \
unsigned int i; \
for (i = 0; i < s->slab_count; i++) \
@@ -34,7 +40,7 @@ static MAYBE_UNUSED void clear_ ##slabname(struct slabname 
*s)\
FREE_AND_NULL(s->slab); \
 }  \
\
-static MAYBE_UNUSED elemtype *slabname## _at_peek(struct slabname *s,  \
+scope elemtype *slabname## _at_peek(struct slabname *s,
\
  const struct commit *c, \
  int add_if_missing)   \
 {  \
@@ -62,13 +68,13 @@ static MAYBE_UNUSED elemtype *slabname## _at_peek(struct 
slabname *s,   \
return >slab[nth_slab][nth_slot * s->stride];\
 }  \
\
-static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s,   \
+scope elemtype *slabname## _at(struct slabname *s, \
 

[PATCH v3 11/15] show-branch: use commit-slab for commit-name instead of commit->util

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 builtin/show-branch.c | 39 +++
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 6c2148b71d..29d15d16d2 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -7,6 +7,7 @@
 #include "argv-array.h"
 #include "parse-options.h"
 #include "dir.h"
+#include "commit-slab.h"
 
 static const char* show_branch_usage[] = {
 N_("git show-branch [-a | --all] [-r | --remotes] [--topo-order | 
--date-order]\n"
@@ -59,15 +60,27 @@ struct commit_name {
int generation; /* how many parents away from head_name */
 };
 
+define_commit_slab(commit_name_slab, struct commit_name *);
+static struct commit_name_slab name_slab;
+
+static struct commit_name *commit_to_name(struct commit *commit)
+{
+   return *commit_name_slab_at(_slab, commit);
+}
+
+
 /* Name the commit as nth generation ancestor of head_name;
  * we count only the first-parent relationship for naming purposes.
  */
 static void name_commit(struct commit *commit, const char *head_name, int nth)
 {
struct commit_name *name;
-   if (!commit->util)
-   commit->util = xmalloc(sizeof(struct commit_name));
-   name = commit->util;
+
+   name = *commit_name_slab_at(_slab, commit);
+   if (!name) {
+   name = xmalloc(sizeof(*name));
+   *commit_name_slab_at(_slab, commit) = name;
+   }
name->head_name = head_name;
name->generation = nth;
 }
@@ -79,8 +92,8 @@ static void name_commit(struct commit *commit, const char 
*head_name, int nth)
  */
 static void name_parent(struct commit *commit, struct commit *parent)
 {
-   struct commit_name *commit_name = commit->util;
-   struct commit_name *parent_name = parent->util;
+   struct commit_name *commit_name = commit_to_name(commit);
+   struct commit_name *parent_name = commit_to_name(parent);
if (!commit_name)
return;
if (!parent_name ||
@@ -94,12 +107,12 @@ static int name_first_parent_chain(struct commit *c)
int i = 0;
while (c) {
struct commit *p;
-   if (!c->util)
+   if (!commit_to_name(c))
break;
if (!c->parents)
break;
p = c->parents->item;
-   if (!p->util) {
+   if (!commit_to_name(p)) {
name_parent(c, p);
i++;
}
@@ -122,7 +135,7 @@ static void name_commits(struct commit_list *list,
/* First give names to the given heads */
for (cl = list; cl; cl = cl->next) {
c = cl->item;
-   if (c->util)
+   if (commit_to_name(c))
continue;
for (i = 0; i < num_rev; i++) {
if (rev[i] == c) {
@@ -148,9 +161,9 @@ static void name_commits(struct commit_list *list,
struct commit_name *n;
int nth;
c = cl->item;
-   if (!c->util)
+   if (!commit_to_name(c))
continue;
-   n = c->util;
+   n = commit_to_name(c);
parents = c->parents;
nth = 0;
while (parents) {
@@ -158,7 +171,7 @@ static void name_commits(struct commit_list *list,
struct strbuf newname = STRBUF_INIT;
parents = parents->next;
nth++;
-   if (p->util)
+   if (commit_to_name(p))
continue;
switch (n->generation) {
case 0:
@@ -271,7 +284,7 @@ static void show_one_commit(struct commit *commit, int 
no_name)
 {
struct strbuf pretty = STRBUF_INIT;
const char *pretty_str = "(unavailable)";
-   struct commit_name *name = commit->util;
+   struct commit_name *name = commit_to_name(commit);
 
if (commit->object.parsed) {
pp_commit_easy(CMIT_FMT_ONELINE, commit, );
@@ -660,6 +673,8 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
OPT_END()
};
 
+   init_commit_name_slab(_slab);
+
git_config(git_show_branch_config, NULL);
 
/* If nothing is specified, try the default first */
-- 
2.17.0.705.g3525833791



[PATCH v3 13/15] log: use commit-slab in prepare_bases() instead of commit->util

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 builtin/log.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index d017e57475..967fbc5caa 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -28,6 +28,7 @@
 #include "mailmap.h"
 #include "gpg-interface.h"
 #include "progress.h"
+#include "commit-slab.h"
 
 #define MAIL_DEFAULT_WRAP 72
 
@@ -1340,6 +1341,8 @@ static struct commit *get_base_commit(const char 
*base_commit,
return base;
 }
 
+define_commit_slab(commit_base, int);
+
 static void prepare_bases(struct base_tree_info *bases,
  struct commit *base,
  struct commit **list,
@@ -1348,11 +1351,13 @@ static void prepare_bases(struct base_tree_info *bases,
struct commit *commit;
struct rev_info revs;
struct diff_options diffopt;
+   struct commit_base commit_base;
int i;
 
if (!base)
return;
 
+   init_commit_base(_base);
diff_setup();
diffopt.flags.recursive = 1;
diff_setup_done();
@@ -1365,7 +1370,7 @@ static void prepare_bases(struct base_tree_info *bases,
for (i = 0; i < total; i++) {
list[i]->object.flags &= ~UNINTERESTING;
add_pending_object(, [i]->object, "rev_list");
-   list[i]->util = (void *)1;
+   *commit_base_at(_base, list[i]) = 1;
}
base->object.flags |= UNINTERESTING;
add_pending_object(, >object, "base");
@@ -1379,7 +1384,7 @@ static void prepare_bases(struct base_tree_info *bases,
while ((commit = get_revision()) != NULL) {
struct object_id oid;
struct object_id *patch_id;
-   if (commit->util)
+   if (*commit_base_at(_base, commit))
continue;
if (commit_patch_id(commit, , , 0))
die(_("cannot get patch id"));
@@ -1388,6 +1393,7 @@ static void prepare_bases(struct base_tree_info *bases,
oidcpy(patch_id, );
bases->nr_patch_id++;
}
+   clear_commit_base(_base);
 }
 
 static void print_bases(struct base_tree_info *bases, FILE *file)
-- 
2.17.0.705.g3525833791



[PATCH v3 00/15] Die commit->util, die!

2018-05-18 Thread Nguyễn Thái Ngọc Duy
v3 fixes Junio's comments on v2:

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index b08e5ea0e3..5aaf5c8e59 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -592,7 +592,7 @@ static void handle_commit(struct commit *commit, struct 
rev_info *rev,
if (!S_ISGITLINK(diff_queued_diff.queue[i]->two->mode))
export_blob(_queued_diff.queue[i]->two->oid);
 
-   refname = *revision_sources_peek(_sources, commit);
+   refname = *revision_sources_at(_sources, commit);
if (anonymize) {
refname = anonymize_refname(refname);
anonymize_ident_line(, _end);
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 29d15d16d2..f2e985c00a 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -22,6 +22,11 @@ static int showbranch_use_color = -1;
 
 static struct argv_array default_args = ARGV_ARRAY_INIT;
 
+/*
+ * TODO: convert this use of commit->object.flags to commit-slab
+ * instead to store a pointer to ref name directly. Then use the same
+ * UNINTERESTING definition from revision.h here.
+ */
 #define UNINTERESTING  01
 
 #define REV_SHIFT   2
diff --git a/commit-slab-hdr.h b/commit-slab-decl.h
similarity index 100%
rename from commit-slab-hdr.h
rename to commit-slab-decl.h
diff --git a/commit-slab-impl.h b/commit-slab-impl.h
index 19a88d7d8f..87a9cadfcc 100644
--- a/commit-slab-impl.h
+++ b/commit-slab-impl.h
@@ -87,7 +87,7 @@ struct slabname
  * to allow a terminating semicolon, which makes instantiations look
  * like function declarations.  I.e., the expansion of
  *
- *implement_commit_slab(indegree, int);
+ *implement_commit_slab(indegree, int, static);
  *
  * ends in 'struct indegree;'.  This would otherwise
  * be a syntax error according (at least) to ISO C.  It's hard to
diff --git a/commit-slab.h b/commit-slab.h
index dc029acc66..69bf0c807c 100644
--- a/commit-slab.h
+++ b/commit-slab.h
@@ -1,7 +1,7 @@
 #ifndef COMMIT_SLAB_H
 #define COMMIT_SLAB_H
 
-#include "commit-slab-hdr.h"
+#include "commit-slab-decl.h"
 #include "commit-slab-impl.h"
 
 /*
diff --git a/commit.c b/commit.c
index 8202067cd5..e63a8dfeaa 100644
--- a/commit.c
+++ b/commit.c
@@ -1575,7 +1575,7 @@ int commit_tree_extended(const char *msg, size_t msg_len,
 }
 
 define_commit_slab(merge_desc_slab, struct merge_remote_desc *);
-struct merge_desc_slab merge_desc_slab = COMMIT_SLAB_INIT(1, merge_desc_slab);
+static struct merge_desc_slab merge_desc_slab = COMMIT_SLAB_INIT(1, 
merge_desc_slab);
 
 struct merge_remote_desc *merge_remote_util(struct commit *commit)
 {
diff --git a/commit.h b/commit.h
index 70371e111e..4432458367 100644
--- a/commit.h
+++ b/commit.h
@@ -16,6 +16,11 @@ struct commit_list {
struct commit_list *next;
 };
 
+/*
+ * The size of this struct matters in full repo walk operations like
+ * 'git clone' or 'git gc'. Consider using commit-slab to attach data
+ * to a commit instead of adding new fields here.
+ */
 struct commit {
struct object object;
unsigned int index;
@@ -23,11 +28,6 @@ struct commit {
struct commit_list *parents;
struct tree *tree;
uint32_t graph_pos;
-   /*
-* Do not add more fields here unless it's _very_ often
-* used. Use commit-slab to associate more data with a commit
-* instead.
-*/
 };
 
 extern int save_commit_buffer;
diff --git a/object.h b/object.h
index b8e70e5519..caf36529f3 100644
--- a/object.h
+++ b/object.h
@@ -43,6 +43,7 @@ struct object_array {
  * builtin/index-pack.c: 2021
  * builtin/pack-objects.c:   20
  * builtin/reflog.c:   10--12
+ * builtin/show-branch.c:0---26
  * builtin/unpack-objects.c: 2021
  */
 #define FLAG_BITS  27
diff --git a/revision.h b/revision.h
index f3dc5f9740..bf2239f876 100644
--- a/revision.h
+++ b/revision.h
@@ -6,7 +6,7 @@
 #include "notes.h"
 #include "pretty.h"
 #include "diff.h"
-#include "commit-slab-hdr.h"
+#include "commit-slab-decl.h"
 
 /* Remember to update object flag allocation in object.h */
 #define SEEN   (1u<<0)
diff --git a/sequencer.c b/sequencer.c
index dd4993fd99..3b6d56d085 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3161,7 +3161,7 @@ static enum check_level 
get_missing_commit_check_level(void)
return CHECK_IGNORE;
 }
 
-define_commit_slab(commit_seen, uint8_t);
+define_commit_slab(commit_seen, unsigned char);
 /*
  * Check if the user dropped some commits by mistake
  * Behaviour determined by rebase.missingCommitsCheck.
diff --git a/shallow.c b/shallow.c
index daf60a9391..0301049781 100644
--- a/shallow.c
+++ b/shallow.c
@@ -110,7 +110,7 @@ struct commit_list *get_shallow_commits(struct object_array 
*heads, int depth,
} else {
commit = (struct commit *)
 

[PATCH v3 04/15] describe: use commit-slab for commit names instead of commit->util

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 builtin/describe.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index b5afc45846..1b6ca42553 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -15,9 +15,12 @@
 #include "run-command.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "commit-slab.h"
 
 #define MAX_TAGS   (FLAG_BITS - 1)
 
+define_commit_slab(commit_names, struct commit_name *);
+
 static const char * const describe_usage[] = {
N_("git describe [] [...]"),
N_("git describe [] --dirty"),
@@ -37,6 +40,7 @@ static struct string_list patterns = STRING_LIST_INIT_NODUP;
 static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
 static int always;
 static const char *suffix, *dirty, *broken;
+static struct commit_names commit_names;
 
 /* diff-index command arguments to check if working tree is dirty. */
 static const char *diff_index_args[] = {
@@ -321,11 +325,14 @@ static void describe_commit(struct object_id *oid, struct 
strbuf *dst)
if (!have_util) {
struct hashmap_iter iter;
struct commit *c;
-   struct commit_name *n = hashmap_iter_first(, );
+   struct commit_name *n;
+
+   init_commit_names(_names);
+   n = hashmap_iter_first(, );
for (; n; n = hashmap_iter_next()) {
c = lookup_commit_reference_gently(>peeled, 1);
if (c)
-   c->util = n;
+   *commit_names_at(_names, c) = n;
}
have_util = 1;
}
@@ -336,8 +343,11 @@ static void describe_commit(struct object_id *oid, struct 
strbuf *dst)
while (list) {
struct commit *c = pop_commit();
struct commit_list *parents = c->parents;
+   struct commit_name **slot;
+
seen_commits++;
-   n = c->util;
+   slot = commit_names_peek(_names, c);
+   n = slot ? *slot : NULL;
if (n) {
if (!tags && !all && n->prio < 2) {
unannotated_cnt++;
-- 
2.17.0.705.g3525833791



[PATCH v3 09/15] bisect.c: use commit-slab for commit weight instead of commit->util

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 bisect.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/bisect.c b/bisect.c
index a579b50884..6de1abd407 100644
--- a/bisect.c
+++ b/bisect.c
@@ -12,6 +12,7 @@
 #include "bisect.h"
 #include "sha1-array.h"
 #include "argv-array.h"
+#include "commit-slab.h"
 
 static struct oid_array good_revs;
 static struct oid_array skipped_revs;
@@ -70,16 +71,19 @@ static void clear_distance(struct commit_list *list)
}
 }
 
+define_commit_slab(commit_weight, int *);
+static struct commit_weight commit_weight;
+
 #define DEBUG_BISECT 0
 
 static inline int weight(struct commit_list *elem)
 {
-   return *((int*)(elem->item->util));
+   return **commit_weight_at(_weight, elem->item);
 }
 
 static inline void weight_set(struct commit_list *elem, int weight)
 {
-   *((int*)(elem->item->util)) = weight;
+   **commit_weight_at(_weight, elem->item) = weight;
 }
 
 static int count_interesting_parents(struct commit *commit)
@@ -265,7 +269,7 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
struct commit *commit = p->item;
unsigned flags = commit->object.flags;
 
-   p->item->util = [n++];
+   *commit_weight_at(_weight, p->item) = [n++];
switch (count_interesting_parents(commit)) {
case 0:
if (!(flags & TREESAME)) {
@@ -372,6 +376,7 @@ void find_bisection(struct commit_list **commit_list, int 
*reaches,
int *weights;
 
show_list("bisection 2 entry", 0, 0, *commit_list);
+   init_commit_weight(_weight);
 
/*
 * Count the number of total and tree-changing items on the
@@ -412,6 +417,7 @@ void find_bisection(struct commit_list **commit_list, int 
*reaches,
}
free(weights);
*commit_list = best;
+   clear_commit_weight(_weight);
 }
 
 static int register_ref(const char *refname, const struct object_id *oid,
-- 
2.17.0.705.g3525833791



[PATCH v3 07/15] sequencer.c: use commit-slab to associate todo items to commits

2018-05-18 Thread Nguyễn Thái Ngọc Duy
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 sequencer.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 3af296db3b..3b6d56d085 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3362,6 +3362,8 @@ static int subject2item_cmp(const void *fndata,
return key ? strcmp(a->subject, key) : strcmp(a->subject, b->subject);
 }
 
+define_commit_slab(commit_todo_item, struct todo_item *);
+
 /*
  * Rearrange the todo list that has both "pick commit-id msg" and "pick
  * commit-id fixup!/squash! msg" in it so that the latter is put immediately
@@ -3378,6 +3380,7 @@ int rearrange_squash(void)
struct hashmap subject2item;
int res = 0, rearranged = 0, *next, *tail, i;
char **subjects;
+   struct commit_todo_item commit_todo;
 
if (strbuf_read_file_or_whine(_list.buf, todo_file) < 0)
return -1;
@@ -3386,6 +3389,7 @@ int rearrange_squash(void)
return -1;
}
 
+   init_commit_todo_item(_todo);
/*
 * The hashmap maps onelines to the respective todo list index.
 *
@@ -3416,10 +3420,11 @@ int rearrange_squash(void)
 
if (is_fixup(item->command)) {
todo_list_release(_list);
+   clear_commit_todo_item(_todo);
return error(_("the script was already rearranged."));
}
 
-   item->commit->util = item;
+   *commit_todo_item_at(_todo, item->commit) = item;
 
parse_commit(item->commit);
commit_buffer = get_commit_buffer(item->commit, NULL);
@@ -3446,9 +3451,9 @@ int rearrange_squash(void)
else if (!strchr(p, ' ') &&
 (commit2 =
  lookup_commit_reference_by_name(p)) &&
-commit2->util)
+*commit_todo_item_at(_todo, commit2))
/* found by commit name */
-   i2 = (struct todo_item *)commit2->util
+   i2 = *commit_todo_item_at(_todo, commit2)
- todo_list.items;
else {
/* copy can be a prefix of the commit subject */
@@ -3527,5 +3532,6 @@ int rearrange_squash(void)
hashmap_free(, 1);
todo_list_release(_list);
 
+   clear_commit_todo_item(_todo);
return res;
 }
-- 
2.17.0.705.g3525833791



[PATCH v3 08/15] revision.c: use commit-slab for show_source

2018-05-18 Thread Nguyễn Thái Ngọc Duy
Instead of relying on commit->util to store the source string, let the
user provide a commit-slab to store the source strings in.

It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
---
 builtin/fast-export.c | 14 +-
 builtin/log.c |  7 +--
 log-tree.c|  8 ++--
 revision.c| 19 +++
 revision.h|  5 -
 5 files changed, 39 insertions(+), 14 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 530df12f05..5aaf5c8e59 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -21,6 +21,7 @@
 #include "quote.h"
 #include "remote.h"
 #include "blob.h"
+#include "commit-slab.h"
 
 static const char *fast_export_usage[] = {
N_("git fast-export [rev-list-opts]"),
@@ -38,6 +39,7 @@ static struct string_list extra_refs = STRING_LIST_INIT_NODUP;
 static struct refspec *refspecs;
 static int refspecs_nr;
 static int anonymize;
+static struct revision_sources revision_sources;
 
 static int parse_opt_signed_tag_mode(const struct option *opt,
 const char *arg, int unset)
@@ -590,7 +592,7 @@ static void handle_commit(struct commit *commit, struct 
rev_info *rev,
if (!S_ISGITLINK(diff_queued_diff.queue[i]->two->mode))
export_blob(_queued_diff.queue[i]->two->oid);
 
-   refname = commit->util;
+   refname = *revision_sources_at(_sources, commit);
if (anonymize) {
refname = anonymize_refname(refname);
anonymize_ident_line(, _end);
@@ -862,10 +864,11 @@ static void get_tags_and_duplicates(struct 
rev_cmdline_info *info)
 * This ref will not be updated through a commit, lets make
 * sure it gets properly updated eventually.
 */
-   if (commit->util || commit->object.flags & SHOWN)
+   if (*revision_sources_at(_sources, commit) ||
+   commit->object.flags & SHOWN)
string_list_append(_refs, full_name)->util = 
commit;
-   if (!commit->util)
-   commit->util = full_name;
+   if (!*revision_sources_at(_sources, commit))
+   *revision_sources_at(_sources, commit) = 
full_name;
}
 }
 
@@ -1029,8 +1032,9 @@ int cmd_fast_export(int argc, const char **argv, const 
char *prefix)
git_config(git_default_config, NULL);
 
init_revisions(, prefix);
+   init_revision_sources(_sources);
revs.topo_order = 1;
-   revs.show_source = 1;
+   revs.sources = _sources;
revs.rewrite_parents = 1;
argc = parse_options(argc, argv, prefix, options, fast_export_usage,
PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN);
diff --git a/builtin/log.c b/builtin/log.c
index 71f68a3e4f..d017e57475 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -148,6 +148,7 @@ static void cmd_log_init_finish(int argc, const char 
**argv, const char *prefix,
static struct string_list decorate_refs_include = 
STRING_LIST_INIT_NODUP;
struct decoration_filter decoration_filter = {_refs_include,
  _refs_exclude};
+   static struct revision_sources revision_sources;
 
const struct option builtin_log_options[] = {
OPT__QUIET(, N_("suppress diff output")),
@@ -194,8 +195,10 @@ static void cmd_log_init_finish(int argc, const char 
**argv, const char *prefix,
rev->diffopt.filter || rev->diffopt.flags.follow_renames)
rev->always_show_header = 0;
 
-   if (source)
-   rev->show_source = 1;
+   if (source) {
+   init_revision_sources(_sources);
+   rev->sources = _sources;
+   }
 
if (mailmap) {
rev->mailmap = xcalloc(1, sizeof(struct string_list));
diff --git a/log-tree.c b/log-tree.c
index d1c0bedf24..0b41ee3235 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -295,8 +295,12 @@ void show_decorations(struct rev_info *opt, struct commit 
*commit)
 {
struct strbuf sb = STRBUF_INIT;
 
-   if (opt->show_source && commit->util)
-   fprintf(opt->diffopt.file, "\t%s", (char *) commit->util);
+   if (opt->sources) {
+   char **slot = revision_sources_peek(opt->sources, commit);
+
+   if (slot && *slot)
+   fprintf(opt->diffopt.file, "\t%s", *slot);
+   }
if (!opt->show_decorations)
return;
format_decorations(, commit, opt->diffopt.use_color);
diff --git a/revision.c b/revision.c
index 1cff11833e..be8fe7d67b 100644
--- a/revision.c
+++ b/revision.c
@@ -29,6 +29,8 @@ volatile show_early_output_fn_t show_early_output;
 static const char *term_bad;
 static const char *term_good;
 
+implement_shared_commit_slab(revision_sources, char *);
+
 void 

[PATCH v3 12/15] show-branch: note about its object flags usage

2018-05-18 Thread Nguyễn Thái Ngọc Duy
This is another candidate for commit-slab. Keep Junio's observation in
code so we can search it later on when somebody wants to improve the
code.
---
 builtin/show-branch.c | 5 +
 object.h  | 1 +
 2 files changed, 6 insertions(+)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 29d15d16d2..f2e985c00a 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -22,6 +22,11 @@ static int showbranch_use_color = -1;
 
 static struct argv_array default_args = ARGV_ARRAY_INIT;
 
+/*
+ * TODO: convert this use of commit->object.flags to commit-slab
+ * instead to store a pointer to ref name directly. Then use the same
+ * UNINTERESTING definition from revision.h here.
+ */
 #define UNINTERESTING  01
 
 #define REV_SHIFT   2
diff --git a/object.h b/object.h
index b8e70e5519..caf36529f3 100644
--- a/object.h
+++ b/object.h
@@ -43,6 +43,7 @@ struct object_array {
  * builtin/index-pack.c: 2021
  * builtin/pack-objects.c:   20
  * builtin/reflog.c:   10--12
+ * builtin/show-branch.c:0---26
  * builtin/unpack-objects.c: 2021
  */
 #define FLAG_BITS  27
-- 
2.17.0.705.g3525833791



Re: [PATCH v2 11/14] show-branch: use commit-slab for commit-name instead of commit->util

2018-05-18 Thread Duy Nguyen
On Mon, May 14, 2018 at 8:45 AM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
>> It's done so that commit->util can be removed. See more explanation in
>> the commit that removes commit->util.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  builtin/show-branch.c | 39 +++
>>  1 file changed, 27 insertions(+), 12 deletions(-)
>
> Looks obviously correct.
>
> Another place we could use commit-slab in this program, which I
> think is a more interesting application, is to use it to store a
> bitmask with runtime-computed width to replace those object->flags
> bits, which will allow us to lift the MAX_REVS limitation.

Interesting. I have a feeling paint_down() from shallow.c might be
reusable. Anyway I don't have enough interest in this command to
actually fix this so I'll just make a TODO note in case people need
something to do and grep for them in the code.
-- 
Duy


Re: [PATCH v5 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)'

2018-05-18 Thread Taylor Blau
On Fri, May 18, 2018 at 02:50:21PM -0700, Taylor Blau wrote:
> [...]
>
> > Another might be to pick "foo" in the first and "bar" in the second
> > line, as that is the "first hit" on the line, which is consistent
> > with how "git grep -e foo" would say about "a foo b foo c foo" (I
> > expect that the leftmost "foo" would be the first hit).  So there
> > may be multiple, equally plausible answer to the question.
>
> This is the largest fact in my mind pertaining to this discussion: there
> are probably many different interpretations of semantics for this, all
> equally valid in their own way. I am partial to the minimum substring
> interpretation (which follows naturally from the minimum-start,
> maximum-end idea), accepting the shortcoming that `--or` sometimes
> doesn't ``do the right thing.''

Ignore this last part. `--or` _does_ do the right thing, as noted below.

Thanks,
Taylor


[PATCH v2] travis-ci: run gcc-8 on linux-gcc jobs

2018-05-18 Thread Nguyễn Thái Ngọc Duy
Switch from gcc-4.8 to gcc-8. Newer compilers come with more warning
checks (usually in -Wextra).  Since -Wextra is enabled in developer
mode (which is also enabled in travis), this lets travis report more
warnings before other people do it.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 v2 switches from gcc-7 to gcc-8. I initially ignored gcc 8 because it
 was too new. But it built 'pu' ok and 8.1 was already out so the
 first wave of bugs should have been gone by now.

 .travis.yml| 3 +++
 ci/lib-travisci.sh | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index 5f5ee4f3bd..4d4e26c9df 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -16,10 +16,13 @@ compiler:
 
 addons:
   apt:
+sources:
+- ubuntu-toolchain-r-test
 packages:
 - language-pack-is
 - git-svn
 - apache2
+- gcc-8
 
 matrix:
   include:
diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 109ef280da..ceecc889ca 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -99,6 +99,9 @@ export DEFAULT_TEST_TARGET=prove
 export GIT_PROVE_OPTS="--timer --jobs 3 --state=failed,slow,save"
 export GIT_TEST_OPTS="--verbose-log -x"
 export GIT_TEST_CLONE_2GB=YesPlease
+if [ "$jobname" = linux-gcc ]; then
+   export CC=gcc-8
+fi
 
 case "$jobname" in
 linux-clang|linux-gcc)
-- 
2.17.0.705.g3525833791



[PATCH 12/14] completion: let git provide the completable command list

2018-05-18 Thread Nguyễn Thái Ngọc Duy
Instead of maintaining a separate list of command classification,
which often could go out of date, let's centralize the information
back in git.

While the function in git-completion.bash implies "list porcelain
commands", that's not exactly what it does. It gets all commands (aka
--list-cmds=main,others) then exclude certain non-porcelain ones. We
could almost recreate this list two lists list-mainporcelain and
others. The non-porcelain-but-included-anyway is added by the third
category list-complete.

list-complete does not recreate exactly the command list before this
patch though. The following commands will disappear from the complete
list because they are not in command-list.txt and it's not worth
adding them back: lost-found, peek-remote and tar-tree.

Note that the current completion script incorrectly classifies
filter-branch as porcelain and t9902 tests this behavior. We keep it
this way in t9902 because this test does not really care which
particular command is porcelain or plubmbing, they're just names.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 command-list.txt   |  53 +--
 contrib/completion/git-completion.bash | 119 ++---
 t/t9902-completion.sh  |   5 +-
 3 files changed, 58 insertions(+), 119 deletions(-)

diff --git a/command-list.txt b/command-list.txt
index a2f360eab9..dcf1907a54 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -47,12 +47,12 @@
 # command name  category [category] [category]
 git-add mainporcelain   worktree
 git-am  mainporcelain
-git-annotateancillaryinterrogators
-git-apply   plumbingmanipulators
+git-annotateancillaryinterrogators  
complete
+git-apply   plumbingmanipulators
complete
 git-archimport  foreignscminterface
 git-archive mainporcelain
 git-bisect  mainporcelain   info
-git-blame   ancillaryinterrogators
+git-blame   ancillaryinterrogators  
complete
 git-branch  mainporcelain   history
 git-bundle  mainporcelain
 git-cat-fileplumbinginterrogators
@@ -62,7 +62,7 @@ git-check-mailmap   purehelpers
 git-checkoutmainporcelain   history
 git-checkout-index  plumbingmanipulators
 git-check-ref-formatpurehelpers
-git-cherry  ancillaryinterrogators
+git-cherry  ancillaryinterrogators  
complete
 git-cherry-pick mainporcelain
 git-citool  mainporcelain
 git-clean   mainporcelain
@@ -70,7 +70,7 @@ git-clone   mainporcelain 
  init
 git-column  purehelpers
 git-commit  mainporcelain   history
 git-commit-tree plumbingmanipulators
-git-config  ancillarymanipulators
+git-config  ancillarymanipulators   
complete
 git-count-objects   ancillaryinterrogators
 git-credential  purehelpers
 git-credential-cachepurehelpers
@@ -84,30 +84,30 @@ git-diffmainporcelain   
history
 git-diff-files  plumbinginterrogators
 git-diff-index  plumbinginterrogators
 git-diff-tree   plumbinginterrogators
-git-difftoolancillaryinterrogators
+git-difftoolancillaryinterrogators  
complete
 git-fast-export ancillarymanipulators
 git-fast-import ancillarymanipulators
 git-fetch   mainporcelain   remote
 git-fetch-pack  synchingrepositories
-git-filter-branch   ancillarymanipulators
+git-filter-branch   ancillarymanipulators   
complete
 git-fmt-merge-msg   purehelpers
 git-for-each-refplumbinginterrogators
 git-format-patchmainporcelain
-git-fsckancillaryinterrogators
+git-fsckancillaryinterrogators  
complete
 git-gc  mainporcelain
-git-get-tar-commit-id   ancillaryinterrogators
+git-get-tar-commit-id   ancillaryinterrogators

[PATCH 11/14] command-list.txt: documentation and guide line

2018-05-18 Thread Nguyễn Thái Ngọc Duy
This is intended to help anybody who needs to update command-list.txt.
It gives a brief introduction of all attributes a command can take.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 command-list.txt | 45 +
 1 file changed, 45 insertions(+)

diff --git a/command-list.txt b/command-list.txt
index 99ddc231c1..a2f360eab9 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -1,3 +1,48 @@
+# Command classification list
+# ---
+# All supported commands, builtin or external, must be described in
+# here. This info is used to list commands in various places. Each
+# command is on one line followed by one or more attributes.
+#
+# The first attribute group is mandatory and indicates the command
+# type. This group includes:
+#
+#   mainporcelain
+#   ancillarymanipulators
+#   ancillaryinterrogators
+#   foreignscminterface
+#   plumbingmanipulators
+#   plumbinginterrogators
+#   synchingrepositories
+#   synchelpers
+#   purehelpers
+#
+# The type names are self explanatory. But if you want to see what
+# command belongs to what group to get a better picture, have a look
+# at "git" man page, "GIT COMMANDS" section.
+#
+# Commands of type mainporcelain can also optionally have one of these
+# attributes:
+#
+#   init
+#   worktree
+#   info
+#   history
+#   remote
+#
+# These commands are considered "common" and will show up in "git
+# help" output in groups. Uncommon porcelain commands must not
+# specify any of these attributes.
+#
+# "complete" attribute is used to mark that the command should be
+# completable by git-completion.bash. Note that by default,
+# mainporcelain commands are completable so you don't need this
+# attribute.
+#
+# As part of the Git man page list, the man(5/7) guides are also
+# specified here, which can only have "guide" attribute and nothing
+# else.
+#
 ### command list (do not change this line, also do not change alignment)
 # command name  category [category] [category]
 git-add mainporcelain   worktree
-- 
2.17.0.705.g3525833791



[PATCH 13/14] completion: reduce completable command list

2018-05-18 Thread Nguyễn Thái Ngọc Duy
The following commands are removed from the complete list:

- annotate obsolete, discouraged to use
- filter-branchnot often used
- get-tar-commit-idnot often used
- imap-sendnot often used
- interpreter-trailers not for interactive use
- p4   too short and probably not often used (*)
- svn  same category as p4 (*)
- verify-commitnot often used

(*) to be fair, send-email command which is in the same foreignscminterface
group as svn and p4 does get completion, just because it's used by git
and kernel development. So maybe we should include them.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 command-list.txt | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/command-list.txt b/command-list.txt
index dcf1907a54..8462ea475f 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -47,7 +47,7 @@
 # command name  category [category] [category]
 git-add mainporcelain   worktree
 git-am  mainporcelain
-git-annotateancillaryinterrogators  
complete
+git-annotateancillaryinterrogators
 git-apply   plumbingmanipulators
complete
 git-archimport  foreignscminterface
 git-archive mainporcelain
@@ -89,13 +89,13 @@ git-fast-export 
ancillarymanipulators
 git-fast-import ancillarymanipulators
 git-fetch   mainporcelain   remote
 git-fetch-pack  synchingrepositories
-git-filter-branch   ancillarymanipulators   
complete
+git-filter-branch   ancillarymanipulators
 git-fmt-merge-msg   purehelpers
 git-for-each-refplumbinginterrogators
 git-format-patchmainporcelain
 git-fsckancillaryinterrogators  
complete
 git-gc  mainporcelain
-git-get-tar-commit-id   ancillaryinterrogators  
complete
+git-get-tar-commit-id   ancillaryinterrogators
 git-grepmainporcelain   info
 git-gui mainporcelain
 git-hash-object plumbingmanipulators
@@ -103,11 +103,11 @@ git-help
ancillaryinterrogators  complete
 git-http-backendsynchingrepositories
 git-http-fetch  synchelpers
 git-http-push   synchelpers
-git-imap-send   foreignscminterface 
complete
+git-imap-send   foreignscminterface
 git-index-pack  plumbingmanipulators
 git-initmainporcelain   init
 git-instawebancillaryinterrogators  
complete
-git-interpret-trailers  purehelpers 
complete
+git-interpret-trailers  purehelpers
 gitkmainporcelain
 git-log mainporcelain   info
 git-ls-filesplumbinginterrogators
@@ -127,7 +127,7 @@ git-mktree  plumbingmanipulators
 git-mv  mainporcelain   worktree
 git-name-revplumbinginterrogators   
complete
 git-notes   mainporcelain
-git-p4  foreignscminterface 
complete
+git-p4  foreignscminterface
 git-pack-objectsplumbingmanipulators
 git-pack-redundant  plumbinginterrogators
 git-pack-refs   ancillarymanipulators
@@ -167,7 +167,7 @@ git-stage   
complete
 git-status  mainporcelain   info
 git-stripspace  purehelpers
 git-submodule   mainporcelain
-git-svn foreignscminterface 
complete
+git-svn foreignscminterface
 git-symbolic-refplumbingmanipulators
 git-tag mainporcelain   history
 git-unpack-file plumbinginterrogators
@@ -178,7 +178,7 @@ git-update-server-info  synchingrepositories
 git-upload-archive  synchelpers
 git-upload-pack synchelpers
 git-var plumbinginterrogators

[PATCH 04/14] Remove common-cmds.h

2018-05-18 Thread Nguyễn Thái Ngọc Duy
After the last patch, common-cmds.h is no longer used (and it was
actually broken). Remove all related code. command-list.h will take
its place from now on.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 .gitignore  |  1 -
 Makefile| 17 ++---
 generate-cmdlist.sh | 46 +++--
 3 files changed, 9 insertions(+), 55 deletions(-)

diff --git a/.gitignore b/.gitignore
index d4c3914167..0836083992 100644
--- a/.gitignore
+++ b/.gitignore
@@ -179,7 +179,6 @@
 /gitweb/gitweb.cgi
 /gitweb/static/gitweb.js
 /gitweb/static/gitweb.min.*
-/common-cmds.h
 /command-list.h
 *.tar.gz
 *.dsc
diff --git a/Makefile b/Makefile
index 5c58b0b692..a60a78ee67 100644
--- a/Makefile
+++ b/Makefile
@@ -757,7 +757,7 @@ LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
 VCSSVN_LIB = vcs-svn/lib.a
 
-GENERATED_H += common-cmds.h command-list.h
+GENERATED_H += command-list.h
 
 LIB_H = $(shell $(FIND) . \
-name .git -prune -o \
@@ -1914,9 +1914,9 @@ git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS)
$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
$(filter %.o,$^) $(LIBS)
 
-help.sp help.s help.o: common-cmds.h command-list.h
+help.sp help.s help.o: command-list.h
 
-builtin/help.sp builtin/help.s builtin/help.o: common-cmds.h command-list.h 
GIT-PREFIX
+builtin/help.sp builtin/help.s builtin/help.o: command-list.h GIT-PREFIX
 builtin/help.sp builtin/help.s builtin/help.o: EXTRA_CPPFLAGS = \
'-DGIT_HTML_PATH="$(htmldir_relative_SQ)"' \
'-DGIT_MAN_PATH="$(mandir_relative_SQ)"' \
@@ -1935,11 +1935,6 @@ $(BUILT_INS): git$X
ln -s $< $@ 2>/dev/null || \
cp $< $@
 
-common-cmds.h: generate-cmdlist.sh command-list.txt
-
-common-cmds.h: $(wildcard Documentation/git-*.txt)
-   $(QUIET_GEN)$(SHELL_PATH) ./generate-cmdlist.sh command-list.txt COMMON 
>$@+ && mv $@+ $@
-
 command-list.h: generate-cmdlist.sh command-list.txt
 
 command-list.h: $(wildcard Documentation/git-*.txt)
@@ -2153,7 +2148,7 @@ else
 # Dependencies on header files, for platforms that do not support
 # the gcc -MMD option.
 #
-# Dependencies on automatically generated headers such as common-cmds.h or 
command-list.h
+# Dependencies on automatically generated headers such as command-list.h
 # should _not_ be included here, since they are necessary even when
 # building an object for the first time.
 
@@ -2532,7 +2527,7 @@ sparse: $(SP_OBJ)
 style:
git clang-format --style file --diff --extensions c,h
 
-check: common-cmds.h command-list.h
+check: command-list.h
@if sparse; \
then \
echo >&2 "Use 'make sparse' instead"; \
@@ -2780,7 +2775,7 @@ clean: profile-clean coverage-clean
$(RM) $(TEST_PROGRAMS) $(NO_INSTALL)
$(RM) -r bin-wrappers $(dep_dirs)
$(RM) -r po/build/
-   $(RM) *.pyc *.pyo */*.pyc */*.pyo common-cmds.h command-list.h 
$(ETAGS_TARGET) tags cscope*
+   $(RM) *.pyc *.pyo */*.pyc */*.pyo command-list.h $(ETAGS_TARGET) tags 
cscope*
$(RM) -r $(GIT_TARNAME) .doc-tmp-dir
$(RM) $(GIT_TARNAME).tar.gz git-core_$(GIT_VERSION)-*.tar.gz
$(RM) $(htmldocs).tar.gz $(manpages).tar.gz
diff --git a/generate-cmdlist.sh b/generate-cmdlist.sh
index 9eb22c4ef1..3bcc1ee57d 100755
--- a/generate-cmdlist.sh
+++ b/generate-cmdlist.sh
@@ -68,46 +68,6 @@ struct cmdname_help {
uint32_t category;
 };
 "
-if test -z "$2"
-then
-   define_categories "$1"
-   echo
-   print_command_list "$1"
-   exit 0
-fi
-
-echo "static const char *common_cmd_groups[] = {"
-
-grps=grps$$.tmp
-match=match$$.tmp
-trap "rm -f '$grps' '$match'" 0 1 2 3 15
-
-sed -n '
-   1,/^### common groups/b
-   /^### command list/q
-   /^#/b
-   /^[ ]*$/b
-   h;s/^[^ ][^ ]*[ ][  ]*\(.*\)/   N_("\1"),/p
-   g;s/^\([^   ][^ ]*\)[   ].*/\1/w '$grps'
-   ' "$1"
-printf '};\n\n'
-
-n=0
-substnum=
-while read grp
-do
-   echo "^git-..*[ ]$grp"
-   substnum="$substnum${substnum:+;}s/[]$grp/$n/"
-   n=$(($n+1))
-done <"$grps" >"$match"
-
-printf 'static struct cmdname_help common_cmds[] = {\n'
-grep -f "$match" "$1" |
-sed 's/^git-//' |
-sort |
-while read cmd tags
-do
-   tag=$(echo "$tags" | sed "$substnum; s/[^0-9]//g")
-   echo "  {\"$cmd\", $(get_synopsis git-$cmd), $tag},"
-done
-echo "};"
+define_categories "$1"
+echo
+print_command_list "$1"
-- 
2.17.0.705.g3525833791



[PATCH 05/14] git.c: convert --list-* to --list-cmds=*

2018-05-18 Thread Nguyễn Thái Ngọc Duy
Even if these are hidden options, let's make them a bit more generic
since we're introducing more listing types shortly. The code is
structured to allow combining multiple listing types together because
we will soon add more types the 'builtins'.

'parseopt' remains separate because it has separate (SPC) to match
git-completion.bash needs and will not combine with others.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash |  2 +-
 git.c  | 37 +-
 t/t0012-help.sh|  2 +-
 3 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index a757073945..3556838759 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -3049,7 +3049,7 @@ __git_complete_common () {
 __git_cmds_with_parseopt_helper=
 __git_support_parseopt_helper () {
test -n "$__git_cmds_with_parseopt_helper" ||
-   __git_cmds_with_parseopt_helper="$(__git 
--list-parseopt-builtins)"
+   __git_cmds_with_parseopt_helper="$(__git --list-cmds=parseopt)"
 
case " $__git_cmds_with_parseopt_helper " in
*" $1 "*)
diff --git a/git.c b/git.c
index 3a89893712..cd85355d81 100644
--- a/git.c
+++ b/git.c
@@ -38,6 +38,30 @@ static int use_pager = -1;
 
 static void list_builtins(unsigned int exclude_option, char sep);
 
+static int match_token(const char *spec, int len, const char *token)
+{
+   int token_len = strlen(token);
+
+   return len == token_len && !strncmp(spec, token, token_len);
+}
+
+static int list_cmds(const char *spec)
+{
+   while (*spec) {
+   const char *sep = strchrnul(spec, ',');
+   int len = sep - spec;
+
+   if (match_token(spec, len, "builtins"))
+   list_builtins(0, '\n');
+   else
+   die(_("unsupported command listing type '%s'"), spec);
+   spec += len;
+   if (*spec == ',')
+   spec++;
+   }
+   return 0;
+}
+
 static void commit_pager_choice(void) {
switch (use_pager) {
case 0:
@@ -223,12 +247,13 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
}
(*argv)++;
(*argc)--;
-   } else if (!strcmp(cmd, "--list-builtins")) {
-   list_builtins(0, '\n');
-   exit(0);
-   } else if (!strcmp(cmd, "--list-parseopt-builtins")) {
-   list_builtins(NO_PARSEOPT, ' ');
-   exit(0);
+   } else if (skip_prefix(cmd, "--list-cmds=", )) {
+   if (!strcmp(cmd, "parseopt")) {
+   list_builtins(NO_PARSEOPT, ' ');
+   exit(0);
+   } else {
+   exit(list_cmds(cmd));
+   }
} else {
fprintf(stderr, _("unknown option: %s\n"), cmd);
usage(git_usage_string);
diff --git a/t/t0012-help.sh b/t/t0012-help.sh
index c096f33505..3c91a9024a 100755
--- a/t/t0012-help.sh
+++ b/t/t0012-help.sh
@@ -59,7 +59,7 @@ test_expect_success 'git help' '
 '
 
 test_expect_success 'generate builtin list' '
-   git --list-builtins >builtins
+   git --list-cmds=builtins >builtins
 '
 
 while read builtin
-- 
2.17.0.705.g3525833791



[PATCH 01/14] generate-cmds.sh: factor out synopsis extract code

2018-05-18 Thread Nguyễn Thái Ngọc Duy
This makes it easier to reuse the same code in another place (very
soon).

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 generate-cmdlist.sh | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/generate-cmdlist.sh b/generate-cmdlist.sh
index eeea4b67ea..31b6d886cb 100755
--- a/generate-cmdlist.sh
+++ b/generate-cmdlist.sh
@@ -1,5 +1,15 @@
 #!/bin/sh
 
+get_synopsis () {
+   sed -n '
+   /^NAME/,/'"$1"'/H
+   ${
+   x
+   s/.*'"$1"' - \(.*\)/N_("\1")/
+   p
+   }' "Documentation/$1.txt"
+}
+
 echo "/* Automatically generated by generate-cmdlist.sh */
 struct cmdname_help {
char name[16];
@@ -39,12 +49,6 @@ sort |
 while read cmd tags
 do
tag=$(echo "$tags" | sed "$substnum; s/[^0-9]//g")
-   sed -n '
-   /^NAME/,/git-'"$cmd"'/H
-   ${
-   x
-   s/.*git-'"$cmd"' - \(.*\)/  {"'"$cmd"'", N_("\1"), 
'$tag'},/
-   p
-   }' "Documentation/git-$cmd.txt"
+   echo "  {\"$cmd\", $(get_synopsis git-$cmd), $tag},"
 done
 echo "};"
-- 
2.17.0.705.g3525833791



[PATCH 06/14] git --list-cmds: collect command list in a string_list

2018-05-18 Thread Nguyễn Thái Ngọc Duy
Instead of printing the command directly one by one, keep them in a
list and print at the end. This allows more modification before we
print out (e.g. sorting, removing duplicates or even excluding some
items).

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 git.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/git.c b/git.c
index cd85355d81..376a59b97f 100644
--- a/git.c
+++ b/git.c
@@ -36,7 +36,7 @@ const char git_more_info_string[] =
 
 static int use_pager = -1;
 
-static void list_builtins(unsigned int exclude_option, char sep);
+static void list_builtins(struct string_list *list, unsigned int 
exclude_option);
 
 static int match_token(const char *spec, int len, const char *token)
 {
@@ -47,18 +47,24 @@ static int match_token(const char *spec, int len, const 
char *token)
 
 static int list_cmds(const char *spec)
 {
+   struct string_list list = STRING_LIST_INIT_DUP;
+   int i;
+
while (*spec) {
const char *sep = strchrnul(spec, ',');
int len = sep - spec;
 
if (match_token(spec, len, "builtins"))
-   list_builtins(0, '\n');
+   list_builtins(, 0);
else
die(_("unsupported command listing type '%s'"), spec);
spec += len;
if (*spec == ',')
spec++;
}
+   for (i = 0; i < list.nr; i++)
+   puts(list.items[i].string);
+   string_list_clear(, 0);
return 0;
 }
 
@@ -249,7 +255,13 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
(*argc)--;
} else if (skip_prefix(cmd, "--list-cmds=", )) {
if (!strcmp(cmd, "parseopt")) {
-   list_builtins(NO_PARSEOPT, ' ');
+   struct string_list list = STRING_LIST_INIT_DUP;
+   int i;
+
+   list_builtins(, NO_PARSEOPT);
+   for (i = 0; i < list.nr; i++)
+   printf("%s ", list.items[i].string);
+   string_list_clear(, 0);
exit(0);
} else {
exit(list_cmds(cmd));
@@ -533,14 +545,14 @@ int is_builtin(const char *s)
return !!get_builtin(s);
 }
 
-static void list_builtins(unsigned int exclude_option, char sep)
+static void list_builtins(struct string_list *out, unsigned int exclude_option)
 {
int i;
for (i = 0; i < ARRAY_SIZE(commands); i++) {
if (exclude_option &&
(commands[i].option & exclude_option))
continue;
-   printf("%s%c", commands[i].cmd, sep);
+   string_list_append(out, commands[i].cmd);
}
 }
 
-- 
2.17.0.705.g3525833791



[PATCH 02/14] generate-cmds.sh: export all commands to command-list.h

2018-05-18 Thread Nguyễn Thái Ngọc Duy
The current generate-cmds.sh generates just enough to print "git help"
output. That is, it only extracts help text for common commands.

The script is now updated to extract help text for all commands and
keep command classification a new file, command-list.h. This will be
useful later:

- "git help -a" could print a short summary of all commands instead of
  just the common ones.

- "git" could produce a list of commands of one or more category. One
  of its use is to reduce another command classification embedded in
  git-completion.bash.

The new file can be generated but is not used anywhere yet. The plan
is we migrate away from common-cmds.h. Then we can kill off
common-cmds.h build rules and generation code (and also delete
duplicate content in command-list.h which we keep for now to not mess
generate-cmds.sh up too much).

PS. The new fixed column requirement on command-list.txt is
technically not needed. But it helps simplify the code a bit at this
stage. We could lift this restriction later if we want to.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 .gitignore  |  1 +
 Makefile| 13 ++---
 command-list.txt|  4 +--
 generate-cmdlist.sh | 67 ++---
 4 files changed, 75 insertions(+), 10 deletions(-)

diff --git a/.gitignore b/.gitignore
index 833ef3b0b7..d4c3914167 100644
--- a/.gitignore
+++ b/.gitignore
@@ -180,6 +180,7 @@
 /gitweb/static/gitweb.js
 /gitweb/static/gitweb.min.*
 /common-cmds.h
+/command-list.h
 *.tar.gz
 *.dsc
 *.deb
diff --git a/Makefile b/Makefile
index f181687250..2a8913ea21 100644
--- a/Makefile
+++ b/Makefile
@@ -757,7 +757,7 @@ LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
 VCSSVN_LIB = vcs-svn/lib.a
 
-GENERATED_H += common-cmds.h
+GENERATED_H += common-cmds.h command-list.h
 
 LIB_H = $(shell $(FIND) . \
-name .git -prune -o \
@@ -1938,6 +1938,11 @@ $(BUILT_INS): git$X
 common-cmds.h: generate-cmdlist.sh command-list.txt
 
 common-cmds.h: $(wildcard Documentation/git-*.txt)
+   $(QUIET_GEN)$(SHELL_PATH) ./generate-cmdlist.sh command-list.txt COMMON 
>$@+ && mv $@+ $@
+
+command-list.h: generate-cmdlist.sh command-list.txt
+
+command-list.h: $(wildcard Documentation/git-*.txt)
$(QUIET_GEN)$(SHELL_PATH) ./generate-cmdlist.sh command-list.txt >$@+ 
&& mv $@+ $@
 
 SCRIPT_DEFINES = $(SHELL_PATH_SQ):$(DIFF_SQ):$(GIT_VERSION):\
@@ -2148,7 +2153,7 @@ else
 # Dependencies on header files, for platforms that do not support
 # the gcc -MMD option.
 #
-# Dependencies on automatically generated headers such as common-cmds.h
+# Dependencies on automatically generated headers such as common-cmds.h or 
command-list.h
 # should _not_ be included here, since they are necessary even when
 # building an object for the first time.
 
@@ -2527,7 +2532,7 @@ sparse: $(SP_OBJ)
 style:
git clang-format --style file --diff --extensions c,h
 
-check: common-cmds.h
+check: common-cmds.h command-list.h
@if sparse; \
then \
echo >&2 "Use 'make sparse' instead"; \
@@ -2775,7 +2780,7 @@ clean: profile-clean coverage-clean
$(RM) $(TEST_PROGRAMS) $(NO_INSTALL)
$(RM) -r bin-wrappers $(dep_dirs)
$(RM) -r po/build/
-   $(RM) *.pyc *.pyo */*.pyc */*.pyo common-cmds.h $(ETAGS_TARGET) tags 
cscope*
+   $(RM) *.pyc *.pyo */*.pyc */*.pyo common-cmds.h command-list.h 
$(ETAGS_TARGET) tags cscope*
$(RM) -r $(GIT_TARNAME) .doc-tmp-dir
$(RM) $(GIT_TARNAME).tar.gz git-core_$(GIT_VERSION)-*.tar.gz
$(RM) $(htmldocs).tar.gz $(manpages).tar.gz
diff --git a/command-list.txt b/command-list.txt
index a1fad28fd8..786536aba0 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -8,8 +8,8 @@ info examine the history and state (see also: git help 
revisions)
 history  grow, mark and tweak your common history
 remote   collaborate (see also: git help workflows)
 
-### command list (do not change this line)
-# command name  category [deprecated] [common]
+### command list (do not change this line, also do not change alignment)
+# command name  category [category] [category]
 git-add mainporcelain   worktree
 git-am  mainporcelain
 git-annotateancillaryinterrogators
diff --git a/generate-cmdlist.sh b/generate-cmdlist.sh
index 31b6d886cb..870d3b626a 100755
--- a/generate-cmdlist.sh
+++ b/generate-cmdlist.sh
@@ -1,5 +1,27 @@
 #!/bin/sh
 
+die () {
+   echo "$@" >&2
+   exit 1
+}
+
+command_list () {
+   sed '1,/^### command list/d;/^#/d' "$1"
+}
+
+get_categories () {
+   tr ' ' '\n'|
+   grep -v '^$' |
+   sort |
+   uniq
+}
+
+category_list () {
+   command_list "$1" |
+   cut -c 40- |
+   get_categories
+}
+
 get_synopsis () {
sed -n '
/^NAME/,/'"$1"'/H
@@ -10,14 +32,51 @@ get_synopsis () {
}' 

[PATCH 07/14] completion: implement and use --list-cmds=main,others

2018-05-18 Thread Nguyễn Thái Ngọc Duy
This is part of the effort to break down and provide commands by
category in machine-readable form. This could be helpful later on when
completion script switches to use --list-cmds for selecting
completable commands. It would be much easier for the user to choose
to complete _all_ commands instead of the default selection by passing
different values to --list-cmds in git-completino.bash.

While at there, replace "git help -a" in git-completion.bash with
--list-cmds since it's better suited for this task.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash |  2 +-
 git.c  |  4 
 help.c | 32 ++
 help.h |  4 
 4 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 3556838759..62ca8641f4 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -839,7 +839,7 @@ __git_commands () {
then
printf "%s" "${GIT_TESTING_COMMAND_COMPLETION}"
else
-   git help -a|egrep '^  [a-zA-Z0-9]'
+   git --list-cmds=main,others
fi
 }
 
diff --git a/git.c b/git.c
index 376a59b97f..10907f7266 100644
--- a/git.c
+++ b/git.c
@@ -56,6 +56,10 @@ static int list_cmds(const char *spec)
 
if (match_token(spec, len, "builtins"))
list_builtins(, 0);
+   else if (match_token(spec, len, "main"))
+   list_all_main_cmds();
+   else if (match_token(spec, len, "others"))
+   list_all_other_cmds();
else
die(_("unsupported command listing type '%s'"), spec);
spec += len;
diff --git a/help.c b/help.c
index 2d6a3157f8..d5ce9dfcbb 100644
--- a/help.c
+++ b/help.c
@@ -297,6 +297,38 @@ void list_common_cmds_help(void)
print_cmd_by_category(common_categories);
 }
 
+void list_all_main_cmds(struct string_list *list)
+{
+   struct cmdnames main_cmds, other_cmds;
+   int i;
+
+   memset(_cmds, 0, sizeof(main_cmds));
+   memset(_cmds, 0, sizeof(other_cmds));
+   load_command_list("git-", _cmds, _cmds);
+
+   for (i = 0; i < main_cmds.cnt; i++)
+   string_list_append(list, main_cmds.names[i]->name);
+
+   clean_cmdnames(_cmds);
+   clean_cmdnames(_cmds);
+}
+
+void list_all_other_cmds(struct string_list *list)
+{
+   struct cmdnames main_cmds, other_cmds;
+   int i;
+
+   memset(_cmds, 0, sizeof(main_cmds));
+   memset(_cmds, 0, sizeof(other_cmds));
+   load_command_list("git-", _cmds, _cmds);
+
+   for (i = 0; i < other_cmds.cnt; i++)
+   string_list_append(list, other_cmds.names[i]->name);
+
+   clean_cmdnames(_cmds);
+   clean_cmdnames(_cmds);
+}
+
 int is_in_cmdlist(struct cmdnames *c, const char *s)
 {
int i;
diff --git a/help.h b/help.h
index b21d7c94e8..97e6c0965e 100644
--- a/help.h
+++ b/help.h
@@ -1,6 +1,8 @@
 #ifndef HELP_H
 #define HELP_H
 
+struct string_list;
+
 struct cmdnames {
int alloc;
int cnt;
@@ -17,6 +19,8 @@ static inline void mput_char(char c, unsigned int num)
 }
 
 extern void list_common_cmds_help(void);
+extern void list_all_main_cmds(struct string_list *list);
+extern void list_all_other_cmds(struct string_list *list);
 extern const char *help_unknown_cmd(const char *cmd);
 extern void load_command_list(const char *prefix,
  struct cmdnames *main_cmds,
-- 
2.17.0.705.g3525833791



[PATCH 10/14] help: use command-list.txt for the source of guides

2018-05-18 Thread Nguyễn Thái Ngọc Duy
The help command currently hard codes the list of guides and their
summary in C. Let's move this list to command-list.txt. This lets us
extract summary lines from Documentation/git*.txt. This also
potentially lets us list guides in git.txt, but I'll leave that for
now.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/gitattributes.txt|  2 +-
 Documentation/gitmodules.txt   |  2 +-
 Documentation/gitrevisions.txt |  2 +-
 Makefile   |  2 +-
 builtin/help.c | 32 --
 command-list.txt   | 16 +
 contrib/completion/git-completion.bash | 15 
 help.c | 21 +
 help.h |  1 +
 t/t0012-help.sh|  6 +
 10 files changed, 54 insertions(+), 45 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 1094fe2b5b..083c2f380d 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -3,7 +3,7 @@ gitattributes(5)
 
 NAME
 
-gitattributes - defining attributes per path
+gitattributes - Defining attributes per path
 
 SYNOPSIS
 
diff --git a/Documentation/gitmodules.txt b/Documentation/gitmodules.txt
index db5d47eb19..4d63def206 100644
--- a/Documentation/gitmodules.txt
+++ b/Documentation/gitmodules.txt
@@ -3,7 +3,7 @@ gitmodules(5)
 
 NAME
 
-gitmodules - defining submodule properties
+gitmodules - Defining submodule properties
 
 SYNOPSIS
 
diff --git a/Documentation/gitrevisions.txt b/Documentation/gitrevisions.txt
index 27dec5b91d..1f6cceaefb 100644
--- a/Documentation/gitrevisions.txt
+++ b/Documentation/gitrevisions.txt
@@ -3,7 +3,7 @@ gitrevisions(7)
 
 NAME
 
-gitrevisions - specifying revisions and ranges for Git
+gitrevisions - Specifying revisions and ranges for Git
 
 SYNOPSIS
 
diff --git a/Makefile b/Makefile
index a60a78ee67..1efb751e46 100644
--- a/Makefile
+++ b/Makefile
@@ -1937,7 +1937,7 @@ $(BUILT_INS): git$X
 
 command-list.h: generate-cmdlist.sh command-list.txt
 
-command-list.h: $(wildcard Documentation/git-*.txt)
+command-list.h: $(wildcard Documentation/git*.txt)
$(QUIET_GEN)$(SHELL_PATH) ./generate-cmdlist.sh command-list.txt >$@+ 
&& mv $@+ $@
 
 SCRIPT_DEFINES = $(SHELL_PATH_SQ):$(DIFF_SQ):$(GIT_VERSION):\
diff --git a/builtin/help.c b/builtin/help.c
index 0e0af8426a..5727fb5e51 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -402,38 +402,6 @@ static void show_html_page(const char *git_cmd)
open_html(page_path.buf);
 }
 
-static struct {
-   const char *name;
-   const char *help;
-} common_guides[] = {
-   { "attributes", N_("Defining attributes per path") },
-   { "everyday", N_("Everyday Git With 20 Commands Or So") },
-   { "glossary", N_("A Git glossary") },
-   { "ignore", N_("Specifies intentionally untracked files to ignore") },
-   { "modules", N_("Defining submodule properties") },
-   { "revisions", N_("Specifying revisions and ranges for Git") },
-   { "tutorial", N_("A tutorial introduction to Git (for version 1.5.1 or 
newer)") },
-   { "workflows", N_("An overview of recommended workflows with Git") },
-};
-
-static void list_common_guides_help(void)
-{
-   int i, longest = 0;
-
-   for (i = 0; i < ARRAY_SIZE(common_guides); i++) {
-   if (longest < strlen(common_guides[i].name))
-   longest = strlen(common_guides[i].name);
-   }
-
-   puts(_("The common Git guides are:\n"));
-   for (i = 0; i < ARRAY_SIZE(common_guides); i++) {
-   printf("   %s   ", common_guides[i].name);
-   mput_char(' ', longest - strlen(common_guides[i].name));
-   puts(_(common_guides[i].help));
-   }
-   putchar('\n');
-}
-
 static const char *check_git_cmd(const char* cmd)
 {
char *alias;
diff --git a/command-list.txt b/command-list.txt
index 3bd23201a6..99ddc231c1 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -139,3 +139,19 @@ gitweb  
ancillaryinterrogators
 git-whatchanged ancillaryinterrogators
 git-worktreemainporcelain
 git-write-tree  plumbingmanipulators
+gitattributes   guide
+gitcli  guide
+gitcore-tutorialguide
+gitcvs-migrationguide
+gitdiffcore guide
+giteveryday guide
+gitglossary guide
+githooksguide
+gitignore   guide
+gitmodules  guide
+gitnamespaces   guide
+gitrepository-layoutguide
+gitrevisions

[PATCH 03/14] help: use command-list.h for common command list

2018-05-18 Thread Nguyễn Thái Ngọc Duy
The previous commit added code generation for all_cmd_desc[] which
includes almost everything we need to generate common command list.
Convert help code to use that array instead and drop common_cmds[] array.

The description of each common command group is removed from
command-list.txt. This keeps this file format simpler. common-cmds.h
will not be generated correctly after this change due to the
command-list.txt format change. But it does not matter and
common-cmds.h will be removed.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Makefile|   4 +-
 command-list.txt|  10 ---
 generate-cmdlist.sh |   4 +-
 help.c  | 145 +---
 t/t0012-help.sh |   9 +++
 5 files changed, 122 insertions(+), 50 deletions(-)

diff --git a/Makefile b/Makefile
index 2a8913ea21..5c58b0b692 100644
--- a/Makefile
+++ b/Makefile
@@ -1914,9 +1914,9 @@ git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS)
$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
$(filter %.o,$^) $(LIBS)
 
-help.sp help.s help.o: common-cmds.h
+help.sp help.s help.o: common-cmds.h command-list.h
 
-builtin/help.sp builtin/help.s builtin/help.o: common-cmds.h GIT-PREFIX
+builtin/help.sp builtin/help.s builtin/help.o: common-cmds.h command-list.h 
GIT-PREFIX
 builtin/help.sp builtin/help.s builtin/help.o: EXTRA_CPPFLAGS = \
'-DGIT_HTML_PATH="$(htmldir_relative_SQ)"' \
'-DGIT_MAN_PATH="$(mandir_relative_SQ)"' \
diff --git a/command-list.txt b/command-list.txt
index 786536aba0..3bd23201a6 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -1,13 +1,3 @@
-# common commands are grouped by themes
-# these groups are output by 'git help' in the order declared here.
-# map each common command in the command list to one of these groups.
-### common groups (do not change this line)
-init start a working area (see also: git help tutorial)
-worktree work on the current change (see also: git help everyday)
-info examine the history and state (see also: git help revisions)
-history  grow, mark and tweak your common history
-remote   collaborate (see also: git help workflows)
-
 ### command list (do not change this line, also do not change alignment)
 # command name  category [category] [category]
 git-add mainporcelain   worktree
diff --git a/generate-cmdlist.sh b/generate-cmdlist.sh
index 870d3b626a..9eb22c4ef1 100755
--- a/generate-cmdlist.sh
+++ b/generate-cmdlist.sh
@@ -6,7 +6,7 @@ die () {
 }
 
 command_list () {
-   sed '1,/^### command list/d;/^#/d' "$1"
+   grep -v '^#' "$1"
 }
 
 get_categories () {
@@ -65,7 +65,7 @@ echo "/* Automatically generated by generate-cmdlist.sh */
 struct cmdname_help {
const char *name;
const char *help;
-   uint32_t group;
+   uint32_t category;
 };
 "
 if test -z "$2"
diff --git a/help.c b/help.c
index 60071a9bea..2d6a3157f8 100644
--- a/help.c
+++ b/help.c
@@ -5,13 +5,114 @@
 #include "run-command.h"
 #include "levenshtein.h"
 #include "help.h"
-#include "common-cmds.h"
+#include "command-list.h"
 #include "string-list.h"
 #include "column.h"
 #include "version.h"
 #include "refs.h"
 #include "parse-options.h"
 
+struct category_description {
+   uint32_t category;
+   const char *desc;
+};
+static uint32_t common_mask =
+   CAT_init | CAT_worktree | CAT_info |
+   CAT_history | CAT_remote;
+static struct category_description common_categories[] = {
+   { CAT_init, N_("start a working area (see also: git help tutorial)") },
+   { CAT_worktree, N_("work on the current change (see also: git help 
everyday)") },
+   { CAT_info, N_("examine the history and state (see also: git help 
revisions)") },
+   { CAT_history, N_("grow, mark and tweak your common history") },
+   { CAT_remote, N_("collaborate (see also: git help workflows)") },
+   { 0, NULL }
+};
+
+static const char *drop_prefix(const char *name)
+{
+   const char *new_name;
+
+   if (skip_prefix(name, "git-", _name))
+   return new_name;
+   return name;
+
+}
+
+static void extract_cmds(struct cmdname_help **p_cmds, uint32_t mask)
+{
+   int i, nr = 0;
+   struct cmdname_help *cmds;
+
+   if (ARRAY_SIZE(command_list) == 0)
+   BUG("empty command_list[] is a sign of broken 
generate-cmdlist.sh");
+
+   ALLOC_ARRAY(cmds, ARRAY_SIZE(command_list) + 1);
+
+   for (i = 0; i < ARRAY_SIZE(command_list); i++) {
+   const struct cmdname_help *cmd = command_list + i;
+
+   if (!(cmd->category & mask))
+   continue;
+
+   cmds[nr] = *cmd;
+   cmds[nr].name = drop_prefix(cmd->name);
+
+   nr++;
+   }
+   cmds[nr].name = NULL;
+   *p_cmds = cmds;
+}
+
+static void print_command_list(const struct cmdname_help *cmds,
+  uint32_t 

[PATCH 14/14] completion: allow to customize the completable command list

2018-05-18 Thread Nguyễn Thái Ngọc Duy
By default we show porcelain, external commands and a couple others
that are also popular. If you are not happy with this list, you can
now customize it. See the big comment block for details.

PS. perhaps I should make aliases a group too, which makes it possible
to _not_ complete aliases by omitting this special group in
$GIT_COMPLETION_CMD_GROUPS

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt   | 10 
 contrib/completion/git-completion.bash | 28 +-
 git.c  |  2 ++
 help.c | 33 ++
 help.h |  1 +
 5 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..91f7eaed7b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1343,6 +1343,16 @@ credential..*::
 credentialCache.ignoreSIGHUP::
Tell git-credential-cache--daemon to ignore SIGHUP, instead of quitting.
 
+completion.commands::
+   This is only used by git-completion.bash to add or remove
+   commands from the complete list. Normally only porcelain
+   commands and a few select others are in the complete list. You
+   can add more commands, separated by space, in this
+   variable. Prefixing the command with '-' will remove it from
+   the existing list.
++
+This variable should not be per-repository.
+
 include::diff-config.txt[]
 
 difftool..path::
diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index cd1d8e553f..f237eb0ff4 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -38,6 +38,29 @@
 #
 # When set to "1", do not include "DWIM" suggestions in git-checkout
 # completion (e.g., completing "foo" when "origin/foo" exists).
+#
+#   GIT_COMPLETION_CMD_GROUPS
+#
+# When set, "git --list-cmds=$GIT_COMPLETION_CMD_GROUPS" will be
+# used to get the list of completable commands. The default is
+# "mainporcelain,others,list-complete" (in English: all porcelain
+# commands and external ones are included. Certain non-porcelain
+# commands are also marked for completion in command-list.txt).
+# You could for example complete all commands with
+#
+# GIT_COMPLETION_CMD_GROUPS=main,others
+#
+# Or you could go with main porcelain only and extra commands in
+# the configuration variable completion.commands with
+#
+# GIT_COMPLETION_CMD_GROUPS=mainporcelain,config
+#
+# Or go completely custom group with
+#
+# GIT_COMPLETION_CMD_GROUPS=config
+#
+# Or you could even play with other command categories found in
+# command-list.txt.
 
 case "$COMP_WORDBREAKS" in
 *:*) : great ;;
@@ -842,8 +865,11 @@ __git_commands () {
if test -n "$GIT_TESTING_PORCELAIN_COMMAND_LIST"
then
printf "%s" "$GIT_TESTING_PORCELAIN_COMMAND_LIST"
+   elif test -n "$GIT_COMPLETION_CMD_GROUPS"
+   then
+   git --list-cmds="$GIT_COMPLETION_CMD_GROUPS"
else
-   git --list-cmds=list-mainporcelain,others,list-complete
+   git 
--list-cmds=list-mainporcelain,others,list-complete,config
fi
;;
all)
diff --git a/git.c b/git.c
index 4d5b8a9931..ea4feedd0b 100644
--- a/git.c
+++ b/git.c
@@ -60,6 +60,8 @@ static int list_cmds(const char *spec)
list_all_main_cmds();
else if (match_token(spec, len, "others"))
list_all_other_cmds();
+   else if (match_token(spec, len, "config"))
+   list_cmds_by_config();
else if (len > 5 && !strncmp(spec, "list-", 5)) {
struct strbuf sb = STRBUF_INIT;
 
diff --git a/help.c b/help.c
index 23924dd300..abf87205b2 100644
--- a/help.c
+++ b/help.c
@@ -366,6 +366,39 @@ void list_cmds_by_category(struct string_list *list,
}
 }
 
+void list_cmds_by_config(struct string_list *list)
+{
+   const char *cmd_list;
+
+   /*
+* There's no actual repository setup at this point (and even
+* if there is, we don't really care; only global config
+* matters). If we accidentally set up a repository, it's ok
+* too since the caller (git --list-cmds=) should exit shortly
+* anyway.
+*/
+   if (git_config_get_string_const("completion.commands", _list))
+   return;
+
+   string_list_sort(list);
+   string_list_remove_duplicates(list, 0);
+
+   while (*cmd_list) {
+   struct strbuf sb = STRBUF_INIT;
+   const char *p = strchrnul(cmd_list, ' ');
+
+   strbuf_add(, cmd_list, p - cmd_list);
+   if (*cmd_list == '-')
+   

[PATCH 09/14] help: add "-a --verbose" to list all commands with synopsis

2018-05-18 Thread Nguyễn Thái Ngọc Duy
This lists all recognized commands [1] by category. The group order
follows closely git.txt.

[1] We may actually show commands that are not built (e.g. if you set
NO_PERL you don't have git-instaweb but it's still listed here). I
ignore the problem because on Linux a git package could be split
anyway. The "git-core" package may not contain git-instaweb even if
it's built because it may end up in a separate package. We can't know
anyway.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-help.txt |  4 +++-
 builtin/help.c |  7 +++
 help.c | 16 
 help.h |  2 ++
 t/t0012-help.sh|  9 +
 5 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-help.txt b/Documentation/git-help.txt
index 40d328a4b3..a40fc38d8b 100644
--- a/Documentation/git-help.txt
+++ b/Documentation/git-help.txt
@@ -8,7 +8,7 @@ git-help - Display help information about Git
 SYNOPSIS
 
 [verse]
-'git help' [-a|--all] [-g|--guide]
+'git help' [-a|--all [--verbose]] [-g|--guide]
   [-i|--info|-m|--man|-w|--web] [COMMAND|GUIDE]
 
 DESCRIPTION
@@ -42,6 +42,8 @@ OPTIONS
 --all::
Prints all the available commands on the standard output. This
option overrides any given command or guide name.
+   When used with `--verbose` print description for all recognized
+   commands.
 
 -g::
 --guides::
diff --git a/builtin/help.c b/builtin/help.c
index 598867cfea..0e0af8426a 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -36,6 +36,7 @@ static const char *html_path;
 
 static int show_all = 0;
 static int show_guides = 0;
+static int verbose;
 static unsigned int colopts;
 static enum help_format help_format = HELP_FORMAT_NONE;
 static int exclude_guides;
@@ -48,6 +49,7 @@ static struct option builtin_help_options[] = {
HELP_FORMAT_WEB),
OPT_SET_INT('i', "info", _format, N_("show info page"),
HELP_FORMAT_INFO),
+   OPT__VERBOSE(, N_("print command description")),
OPT_END(),
 };
 
@@ -463,6 +465,11 @@ int cmd_help(int argc, const char **argv, const char 
*prefix)
 
if (show_all) {
git_config(git_help_config, NULL);
+   if (verbose) {
+   setup_pager();
+   list_all_cmds_help();
+   return 0;
+   }
printf(_("usage: %s%s"), _(git_usage_string), "\n\n");
load_command_list("git-", _cmds, _cmds);
list_commands(colopts, _cmds, _cmds);
diff --git a/help.c b/help.c
index 1117f7d1d1..c7df1d2338 100644
--- a/help.c
+++ b/help.c
@@ -27,6 +27,17 @@ static struct category_description common_categories[] = {
{ CAT_remote, N_("collaborate (see also: git help workflows)") },
{ 0, NULL }
 };
+static struct category_description main_categories[] = {
+   { CAT_mainporcelain, N_("Main Porcelain Commands") },
+   { CAT_ancillarymanipulators, N_("Ancillary Commands / Manipulators") },
+   { CAT_ancillaryinterrogators, N_("Ancillary Commands / Interrogators") 
},
+   { CAT_foreignscminterface, N_("Interacting with Others") },
+   { CAT_plumbingmanipulators, N_("Low-level Commands / Manipulators") },
+   { CAT_plumbinginterrogators, N_("Low-level Commands / Interrogators") },
+   { CAT_synchingrepositories, N_("Low-level Commands / Synching 
Repositories") },
+   { CAT_purehelpers, N_("Low-level Commands / Internal Helpers") },
+   { 0, NULL }
+};
 
 static const char *drop_prefix(const char *name)
 {
@@ -352,6 +363,11 @@ void list_cmds_by_category(struct string_list *list,
}
 }
 
+void list_all_cmds_help(void)
+{
+   print_cmd_by_category(main_categories);
+}
+
 int is_in_cmdlist(struct cmdnames *c, const char *s)
 {
int i;
diff --git a/help.h b/help.h
index 734bba32d3..40917fc38c 100644
--- a/help.h
+++ b/help.h
@@ -19,6 +19,8 @@ static inline void mput_char(char c, unsigned int num)
 }
 
 extern void list_common_cmds_help(void);
+extern void list_all_cmds_help(void);
+
 extern void list_all_main_cmds(struct string_list *list);
 extern void list_all_other_cmds(struct string_list *list);
 extern void list_cmds_by_category(struct string_list *list,
diff --git a/t/t0012-help.sh b/t/t0012-help.sh
index 3c91a9024a..060df24c2d 100755
--- a/t/t0012-help.sh
+++ b/t/t0012-help.sh
@@ -25,6 +25,15 @@ test_expect_success "setup" '
EOF
 '
 
+# make sure to exercise these code paths, the output is a bit tricky
+# to verify
+test_expect_success 'basic help commands' '
+   git help >/dev/null &&
+   git help -a >/dev/null &&
+   git help -g >/dev/null &&
+   git help -av >/dev/null
+'
+
 test_expect_success "works for commands and guides by default" '
configure_help &&
git help status &&
-- 
2.17.0.705.g3525833791



[PATCH 00/14] nd/command-list updates

2018-05-18 Thread Nguyễn Thái Ngọc Duy
This should be the final update before nd/command-list hits 'next',
hopefully.

Besides the interdiff below, the old 12/13 is split into two: 12/14
keeps the complete output as close as possible to the base version and
13/14 removes some commands from the completion list.

interdiff

diff --git a/command-list.txt b/command-list.txt
index 3e21ddfcfb..8462ea475f 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -39,8 +39,9 @@
 # mainporcelain commands are completable so you don't need this
 # attribute.
 #
-# While not true commands, guides are also specified here, which can
-# only have "guide" attribute and nothing else.
+# As part of the Git man page list, the man(5/7) guides are also
+# specified here, which can only have "guide" attribute and nothing
+# else.
 #
 ### command list (do not change this line, also do not change alignment)
 # command name  category [category] [category]
diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index f7ca9a4d4f..f237eb0ff4 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -857,6 +857,8 @@ __git_complete_strategy ()
return 1
 }
 
+# __git_commands requires 1 argument:
+# 1: the command group, either "all" or "porcelain"
 __git_commands () {
case "$1" in
porcelain)


Nguyễn Thái Ngọc Duy (14):
  generate-cmds.sh: factor out synopsis extract code
  generate-cmds.sh: export all commands to command-list.h
  help: use command-list.h for common command list
  Remove common-cmds.h
  git.c: convert --list-* to --list-cmds=*
  git --list-cmds: collect command list in a string_list
  completion: implement and use --list-cmds=main,others
  git: support --list-cmds=list-
  help: add "-a --verbose" to list all commands with synopsis
  help: use command-list.txt for the source of guides
  command-list.txt: documentation and guide line
  completion: let git provide the completable command list
  completion: reduce completable command list
  completion: allow to customize the completable command list

 .gitignore |   2 +-
 Documentation/config.txt   |  10 +
 Documentation/git-help.txt |   4 +-
 Documentation/gitattributes.txt|   2 +-
 Documentation/gitmodules.txt   |   2 +-
 Documentation/gitrevisions.txt |   2 +-
 Makefile   |  16 +-
 builtin/help.c |  39 +---
 command-list.txt   | 112 +---
 contrib/completion/git-completion.bash | 162 +++-
 generate-cmdlist.sh| 126 -
 git.c  |  68 ++-
 help.c | 244 ++---
 help.h |  10 +
 t/t0012-help.sh|  26 ++-
 t/t9902-completion.sh  |   5 +-
 16 files changed, 576 insertions(+), 254 deletions(-)

-- 
2.17.0.705.g3525833791



[PATCH 08/14] git: support --list-cmds=list-

2018-05-18 Thread Nguyễn Thái Ngọc Duy
This allows us to select any group of commands by a category defined
in command-list.txt. This is an internal/hidden option so we don't
have to be picky about the category name or worried about exposing too
much.

This will be used later by git-completion.bash to retrieve certain
command groups.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 generate-cmdlist.sh | 17 +
 git.c   |  7 +++
 help.c  | 23 +++
 help.h  |  2 ++
 4 files changed, 49 insertions(+)

diff --git a/generate-cmdlist.sh b/generate-cmdlist.sh
index 3bcc1ee57d..8d6d8b45ce 100755
--- a/generate-cmdlist.sh
+++ b/generate-cmdlist.sh
@@ -45,6 +45,21 @@ define_categories () {
test "$bit" -gt 32 && die "Urgh.. too many categories?"
 }
 
+define_category_names () {
+   echo
+   echo "/* Category names */"
+   echo "static const char *category_names[] = {"
+   bit=0
+   category_list "$1" |
+   while read cat
+   do
+   echo "  \"$cat\", /* (1UL << $bit) */"
+   bit=$(($bit+1))
+   done
+   echo "  NULL"
+   echo "};"
+}
+
 print_command_list () {
echo "static struct cmdname_help command_list[] = {"
 
@@ -70,4 +85,6 @@ struct cmdname_help {
 "
 define_categories "$1"
 echo
+define_category_names "$1"
+echo
 print_command_list "$1"
diff --git a/git.c b/git.c
index 10907f7266..4d5b8a9931 100644
--- a/git.c
+++ b/git.c
@@ -60,6 +60,13 @@ static int list_cmds(const char *spec)
list_all_main_cmds();
else if (match_token(spec, len, "others"))
list_all_other_cmds();
+   else if (len > 5 && !strncmp(spec, "list-", 5)) {
+   struct strbuf sb = STRBUF_INIT;
+
+   strbuf_add(, spec + 5, len - 5);
+   list_cmds_by_category(, sb.buf);
+   strbuf_release();
+   }
else
die(_("unsupported command listing type '%s'"), spec);
spec += len;
diff --git a/help.c b/help.c
index d5ce9dfcbb..1117f7d1d1 100644
--- a/help.c
+++ b/help.c
@@ -329,6 +329,29 @@ void list_all_other_cmds(struct string_list *list)
clean_cmdnames(_cmds);
 }
 
+void list_cmds_by_category(struct string_list *list,
+  const char *cat)
+{
+   int i, n = ARRAY_SIZE(command_list);
+   uint32_t cat_id = 0;
+
+   for (i = 0; category_names[i]; i++) {
+   if (!strcmp(cat, category_names[i])) {
+   cat_id = 1UL << i;
+   break;
+   }
+   }
+   if (!cat_id)
+   die(_("unsupported command listing type '%s'"), cat);
+
+   for (i = 0; i < n; i++) {
+   struct cmdname_help *cmd = command_list + i;
+
+   if (cmd->category & cat_id)
+   string_list_append(list, drop_prefix(cmd->name));
+   }
+}
+
 int is_in_cmdlist(struct cmdnames *c, const char *s)
 {
int i;
diff --git a/help.h b/help.h
index 97e6c0965e..734bba32d3 100644
--- a/help.h
+++ b/help.h
@@ -21,6 +21,8 @@ static inline void mput_char(char c, unsigned int num)
 extern void list_common_cmds_help(void);
 extern void list_all_main_cmds(struct string_list *list);
 extern void list_all_other_cmds(struct string_list *list);
+extern void list_cmds_by_category(struct string_list *list,
+ const char *category);
 extern const char *help_unknown_cmd(const char *cmd);
 extern void load_command_list(const char *prefix,
  struct cmdnames *main_cmds,
-- 
2.17.0.705.g3525833791



Fun puzzle: successful merge's contents depends on...commit timestamps?

2018-05-18 Thread Elijah Newren
Check this out; I merge another branch successfully:
  $ git merge -m F $othersha; echo $?
  0

This is just a simple test repo:
  $ git log --oneline
  18f5e65 (HEAD -> master) F
  7472b46 E
  ea0d801 D
  4486f96 C
  7727e3b B
  ee47c13 A

Below it is important to note that one commit was made at 1526194997,
and this value doesn't appear anywhere else in fast-export output:
  $ git fast-export --no-data master | grep 1526194997
  author Elijah Newren  1526194997 -0700
  committer Elijah Newren  1526194997 -0700

Let's create a new branch and rewrite history ONLY changing that one timestamp:
  $ git checkout -b redo
  $ git fast-export --no-data redo | \
   sed -e s/1526194997/15/ | \
   git fast-import --quiet --force

At this point, the trees for 'master' and 'redo' match, as you'd expect:
  $ git diff --quiet master HEAD; echo $?
  0

Let's redo that merge, shall we?:
  $ git checkout --quiet 'redo^1'
  $ git merge -m newF 'redo^2'; echo $?
  $ 0

Did we get the same merge result?  Nope:
  $ git diff --quiet master HEAD; echo $?
  1


I believe there are about half a dozen solutions to this puzzle.  Can you find
at least one?


[PATCH 4/5] merge-recursive: rename conflict_rename_*() family of functions

2018-05-18 Thread Elijah Newren
These functions were added because processing of these conflicts needed
to be deferred until process_entry() in order to get D/F conflicts and
such right.  The number of these has grown over time, and now include
some whose name is misleading:
  * conflict_rename_normal() is for handling normal file renames; a
typical rename may need content merging, but we expect conflicts
from that to be more the exception than the rule.
  * conflict_rename_via_dir() will not be a conflict; it was just an
add that turned into a move due to directory rename detection.
(If there was a file in the way of the move, that would have been
detected and reported earlier.)
  * conflict_rename_rename_2to1 and conflict_rename_add (the latter
of which doesn't exist yet but has been submitted before and I
intend to resend) technically might not be conflicts if the
colliding paths happen to match exactly.
Rename this family of functions to handle_rename_*().

Also rename handle_renames() to detect_and_process_renames() both to make
it clearer what it does, and to differentiate it as a pre-processing step
from all the handle_rename_*() functions which are called from
process_entry().

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 86 +++
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index d30085d9c7..273ee79afa 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1224,10 +1224,10 @@ static int merge_file_one(struct merge_options *o,
return merge_file_1(o, , , , path, branch1, branch2, mfi);
 }
 
-static int conflict_rename_via_dir(struct merge_options *o,
-  struct diff_filepair *pair,
-  const char *rename_branch,
-  const char *other_branch)
+static int handle_rename_via_dir(struct merge_options *o,
+struct diff_filepair *pair,
+const char *rename_branch,
+const char *other_branch)
 {
/*
 * Handle file adds that need to be renamed due to directory rename
@@ -1329,10 +1329,10 @@ static int handle_change_delete(struct merge_options *o,
return ret;
 }
 
-static int conflict_rename_delete(struct merge_options *o,
- struct diff_filepair *pair,
- const char *rename_branch,
- const char *delete_branch)
+static int handle_rename_delete(struct merge_options *o,
+   struct diff_filepair *pair,
+   const char *rename_branch,
+   const char *delete_branch)
 {
const struct diff_filespec *orig = pair->one;
const struct diff_filespec *dest = pair->two;
@@ -1434,8 +1434,8 @@ static int handle_file(struct merge_options *o,
return ret;
 }
 
-static int conflict_rename_rename_1to2(struct merge_options *o,
-  struct rename_conflict_info *ci)
+static int handle_rename_rename_1to2(struct merge_options *o,
+struct rename_conflict_info *ci)
 {
/* One file was renamed in both branches, but to different names. */
struct diff_filespec *one = ci->pair1->one;
@@ -1496,8 +1496,8 @@ static int conflict_rename_rename_1to2(struct 
merge_options *o,
return 0;
 }
 
-static int conflict_rename_rename_2to1(struct merge_options *o,
-  struct rename_conflict_info *ci)
+static int handle_rename_rename_2to1(struct merge_options *o,
+struct rename_conflict_info *ci)
 {
/* Two files, a & b, were renamed to the same thing, c. */
struct diff_filespec *a = ci->pair1->one;
@@ -2231,7 +2231,7 @@ static void apply_directory_rename_modifications(struct 
merge_options *o,
 * "NOTE" in update_stages(), doing so will modify the current
 * in-memory index which will break calls to would_lose_untracked()
 * that we need to make.  Instead, we need to just make sure that
-* the various conflict_rename_*() functions update the index
+* the various handle_rename_*() functions update the index
 * explicitly rather than relying on unpack_trees() to have done it.
 */
get_tree_entry(>object.oid,
@@ -2635,12 +2635,12 @@ static void initial_cleanup_rename(struct 
diff_queue_struct *pairs,
free(pairs);
 }
 
-static int handle_renames(struct merge_options *o,
- struct tree *common,
- struct tree *head,
- struct tree *merge,
- struct string_list *entries,
- struct rename_info *ri)
+static int detect_and_process_renames(struct merge_options *o,
+  

[PATCH 3/5] merge-recursive: clarify the rename_dir/RENAME_DIR meaning

2018-05-18 Thread Elijah Newren
We had an enum of rename types which included RENAME_DIR; this name felt
misleading since it was not about an entire directory but was a status for
each individual file add that occurred within a renamed directory.  Since
this type is for signifying that the files in question were being renamed
due to directory rename detection, rename this enum value to
RENAME_VIA_DIR.  Make a similar change to the conflict_rename_dir()
function, and add a comment to the top of that function explaining its
purpose (it may not be quite as obvious as for the other
conflict_rename_*() functions).

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 01306c87eb..d30085d9c7 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -180,7 +180,7 @@ static int oid_eq(const struct object_id *a, const struct 
object_id *b)
 
 enum rename_type {
RENAME_NORMAL = 0,
-   RENAME_DIR,
+   RENAME_VIA_DIR,
RENAME_DELETE,
RENAME_ONE_FILE_TO_ONE,
RENAME_ONE_FILE_TO_TWO,
@@ -1224,11 +1224,17 @@ static int merge_file_one(struct merge_options *o,
return merge_file_1(o, , , , path, branch1, branch2, mfi);
 }
 
-static int conflict_rename_dir(struct merge_options *o,
-  struct diff_filepair *pair,
-  const char *rename_branch,
-  const char *other_branch)
+static int conflict_rename_via_dir(struct merge_options *o,
+  struct diff_filepair *pair,
+  const char *rename_branch,
+  const char *other_branch)
 {
+   /*
+* Handle file adds that need to be renamed due to directory rename
+* detection.  This differs from handle_rename_normal, because
+* there is no content merge to do; just move the path into the
+* desired final location.
+*/
const struct diff_filespec *dest = pair->two;
 
if (!o->call_depth && would_lose_untracked(dest->path)) {
@@ -2498,7 +2504,7 @@ static int process_renames(struct merge_options *o,
 
if (oid_eq(_other.oid, _oid) &&
ren1->add_turned_into_rename) {
-   setup_rename_conflict_info(RENAME_DIR,
+   setup_rename_conflict_info(RENAME_VIA_DIR,
   ren1->pair,
   NULL,
   branch1,
@@ -2944,12 +2950,12 @@ static int process_entry(struct merge_options *o,
 b_oid, b_mode,
 conflict_info);
break;
-   case RENAME_DIR:
+   case RENAME_VIA_DIR:
clean_merge = 1;
-   if (conflict_rename_dir(o,
-   conflict_info->pair1,
-   conflict_info->branch1,
-   conflict_info->branch2))
+   if (conflict_rename_via_dir(o,
+   conflict_info->pair1,
+   conflict_info->branch1,
+   conflict_info->branch2))
clean_merge = -1;
break;
case RENAME_DELETE:
-- 
2.17.0.847.g20b8963732



[PATCH 1/5] merge-recursive: fix miscellaneous grammar error in comment

2018-05-18 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 35df695fa4..c430fd72bc 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -531,7 +531,7 @@ static void record_df_conflict_files(struct merge_options 
*o,
 struct string_list *entries)
 {
/* If there is a D/F conflict and the file for such a conflict
-* currently exist in the working tree, we want to allow it to be
+* currently exists in the working tree, we want to allow it to be
 * removed to make room for the corresponding directory if needed.
 * The files underneath the directories of such D/F conflicts will
 * be processed before the corresponding file involved in the D/F
-- 
2.17.0.847.g20b8963732



[PATCH 2/5] merge-recursive: fix numerous argument alignment issues

2018-05-18 Thread Elijah Newren
Various refactorings throughout the code have left lots of alignment
issues that were driving me crazy; fix them.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 47 ---
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index c430fd72bc..01306c87eb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -308,8 +308,8 @@ static void output_commit_title(struct merge_options *o, 
struct commit *commit)
 }
 
 static int add_cacheinfo(struct merge_options *o,
-   unsigned int mode, const struct object_id *oid,
-   const char *path, int stage, int refresh, int options)
+unsigned int mode, const struct object_id *oid,
+const char *path, int stage, int refresh, int options)
 {
struct cache_entry *ce;
int ret;
@@ -409,8 +409,8 @@ struct tree *write_tree_from_memory(struct merge_options *o)
 }
 
 static int save_files_dirs(const struct object_id *oid,
-   struct strbuf *base, const char *path,
-   unsigned int mode, int stage, void *context)
+  struct strbuf *base, const char *path,
+  unsigned int mode, int stage, void *context)
 {
struct path_hashmap_entry *entry;
int baselen = base->len;
@@ -1257,13 +1257,13 @@ static int conflict_rename_dir(struct merge_options *o,
 }
 
 static int handle_change_delete(struct merge_options *o,
-const char *path, const char *old_path,
-const struct object_id *o_oid, int o_mode,
-const struct object_id *changed_oid,
-int changed_mode,
-const char *change_branch,
-const char *delete_branch,
-const char *change, const char *change_past)
+   const char *path, const char *old_path,
+   const struct object_id *o_oid, int o_mode,
+   const struct object_id *changed_oid,
+   int changed_mode,
+   const char *change_branch,
+   const char *delete_branch,
+   const char *change, const char *change_past)
 {
char *alt_path = NULL;
const char *update_path = path;
@@ -1324,9 +1324,9 @@ static int handle_change_delete(struct merge_options *o,
 }
 
 static int conflict_rename_delete(struct merge_options *o,
-  struct diff_filepair *pair,
-  const char *rename_branch,
-  const char *delete_branch)
+ struct diff_filepair *pair,
+ const char *rename_branch,
+ const char *delete_branch)
 {
const struct diff_filespec *orig = pair->one;
const struct diff_filespec *dest = pair->two;
@@ -1429,7 +1429,7 @@ static int handle_file(struct merge_options *o,
 }
 
 static int conflict_rename_rename_1to2(struct merge_options *o,
-   struct rename_conflict_info *ci)
+  struct rename_conflict_info *ci)
 {
/* One file was renamed in both branches, but to different names. */
struct diff_filespec *one = ci->pair1->one;
@@ -1491,7 +1491,7 @@ static int conflict_rename_rename_1to2(struct 
merge_options *o,
 }
 
 static int conflict_rename_rename_2to1(struct merge_options *o,
-   struct rename_conflict_info *ci)
+  struct rename_conflict_info *ci)
 {
/* Two files, a & b, were renamed to the same thing, c. */
struct diff_filespec *a = ci->pair1->one;
@@ -2710,7 +2710,8 @@ static struct object_id *stage_oid(const struct object_id 
*oid, unsigned mode)
 }
 
 static int read_oid_strbuf(struct merge_options *o,
-   const struct object_id *oid, struct strbuf *dst)
+  const struct object_id *oid,
+  struct strbuf *dst)
 {
void *buf;
enum object_type type;
@@ -2763,10 +2764,10 @@ static int blob_unchanged(struct merge_options *opt,
 }
 
 static int handle_modify_delete(struct merge_options *o,
-const char *path,
-struct object_id *o_oid, int o_mode,
-struct object_id *a_oid, int a_mode,
-struct object_id *b_oid, int b_mode)
+   const char *path,
+   struct object_id *o_oid, int o_mode,
+   struct object_id *a_oid, int a_mode,
+   

[PATCH 0/5] merge-recursive code cleanups

2018-05-18 Thread Elijah Newren
This patch series contains several small code cleanups for merge-recursive.
I have removed a couple small cleanup chunks in order to avoid conflicts
with any other in-flight topics in pu (namely, nd/commit-util-to-slab and
sb/submodule-merge-in-merge-recursive).  I may resend those later
separately.

The series was made on top of next (specifically commit c95db04db ("Merge
branch 'sb/object-store-replace' into next")); it will not apply to
master.

Elijah Newren (5):
  merge-recursive: fix miscellaneous grammar error in comment
  merge-recursive: fix numerous argument alignment issues
  merge-recursive: clarify the rename_dir/RENAME_DIR meaning
  merge-recursive: rename conflict_rename_*() family of functions
  merge-recursive: simplify handle_change_delete

 merge-recursive.c | 173 ++
 1 file changed, 83 insertions(+), 90 deletions(-)

-- 
2.17.0.847.g20b8963732



[PATCH 5/5] merge-recursive: simplify handle_change_delete

2018-05-18 Thread Elijah Newren
There is really no need for four branches of nearly identical messages
when we can store the differences into small variables before printing.
It does require a few allocations this way, but makes the code much
easier to parse for human readers.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 36 +++-
 1 file changed, 11 insertions(+), 25 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 273ee79afa..3bd727995b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1290,31 +1290,17 @@ static int handle_change_delete(struct merge_options *o,
if (!ret)
ret = update_file(o, 0, o_oid, o_mode, update_path);
} else {
-   if (!alt_path) {
-   if (!old_path) {
-   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
-  "and %s in %s. Version %s of %s left in 
tree."),
-  change, path, delete_branch, change_past,
-  change_branch, change_branch, path);
-   } else {
-   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
-  "and %s to %s in %s. Version %s of %s 
left in tree."),
-  change, old_path, delete_branch, 
change_past, path,
-  change_branch, change_branch, path);
-   }
-   } else {
-   if (!old_path) {
-   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
-  "and %s in %s. Version %s of %s left in 
tree at %s."),
-  change, path, delete_branch, change_past,
-  change_branch, change_branch, path, 
alt_path);
-   } else {
-   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
-  "and %s to %s in %s. Version %s of %s 
left in tree at %s."),
-  change, old_path, delete_branch, 
change_past, path,
-  change_branch, change_branch, path, 
alt_path);
-   }
-   }
+   const char *deleted_path = old_path ? old_path : path;
+   char *supp1 = xstrfmt(old_path ? " to %s" : "", path);
+   char *supp2 = xstrfmt(alt_path ? " at %s" : "", alt_path);
+
+   output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
+  "and %s%s in %s. Version %s of %s left in tree%s."),
+  change, deleted_path, delete_branch, change_past,
+  supp1, change_branch, change_branch, path, supp2);
+   free(supp1);
+   free(supp2);
+
/*
 * No need to call update_file() on path when change_branch ==
 * o->branch1 && !alt_path, since that would needlessly touch
-- 
2.17.0.847.g20b8963732



[PATCH 5/5] fmt_with_err: add a comment that truncation is OK

2018-05-18 Thread Jeff King
Functions like die_errno() use fmt_with_err() to combine the
caller-provided format with the strerror() string. We use a
fixed stack buffer because we're already handling an error
and don't have any way to report another one. Our buffer
should generally be big enough to fit this, but if it's not,
truncation is our best option. Let's add a comment to that
effect, so that anybody auditing the code for truncation
bugs knows that this is fine.

Signed-off-by: Jeff King 
---
 usage.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/usage.c b/usage.c
index cdd534c9df..b3c78931ad 100644
--- a/usage.c
+++ b/usage.c
@@ -148,6 +148,7 @@ static const char *fmt_with_err(char *buf, int n, const 
char *fmt)
}
}
str_error[j] = 0;
+   /* Truncation is acceptable here */
snprintf(buf, n, "%s: %s", fmt, str_error);
return buf;
 }
-- 
2.17.0.1052.g7d69f75dbf


[PATCH 4/5] shorten_unambiguous_ref: use xsnprintf

2018-05-18 Thread Jeff King
We convert the ref_rev_parse_rules array into scanf formats
on the fly, and use snprintf() to write into each string. We
should have enough memory to hold everything because of the
earlier total_len computation. Let's use xsnprintf() to
give runtime confirmation that this is the case, and to make
it easy for people auditing the code to know there's no
truncation bug.

Signed-off-by: Jeff King 
---
 refs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 64aadd14c9..73c446628f 100644
--- a/refs.c
+++ b/refs.c
@@ -1147,8 +1147,8 @@ char *shorten_unambiguous_ref(const char *refname, int 
strict)
for (i = 0; i < nr_rules; i++) {
assert(offset < total_len);
scanf_fmts[i] = (char *)_fmts[nr_rules] + offset;
-   offset += snprintf(scanf_fmts[i], total_len - offset,
-  ref_rev_parse_rules[i], 2, "%s") + 1;
+   offset += xsnprintf(scanf_fmts[i], total_len - offset,
+   ref_rev_parse_rules[i], 2, "%s") + 
1;
}
}
 
-- 
2.17.0.1052.g7d69f75dbf



[PATCH 3/5] query_fsmonitor: use xsnprintf for formatting integers

2018-05-18 Thread Jeff King
These formatted integers should always fit into their
64-byte buffers. Let's use xsnprintf() to add an assertion
that this is the case, which makes auditing for other
unchecked snprintfs() easier.

Signed-off-by: Jeff King 
---
 fsmonitor.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fsmonitor.c b/fsmonitor.c
index ed3d1a074d..cc19b27e1d 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -104,8 +104,8 @@ static int query_fsmonitor(int version, uint64_t 
last_update, struct strbuf *que
if (!(argv[0] = core_fsmonitor))
return -1;
 
-   snprintf(ver, sizeof(ver), "%d", version);
-   snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
+   xsnprintf(ver, sizeof(ver), "%d", version);
+   xsnprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
argv[1] = ver;
argv[2] = date;
argv[3] = NULL;
-- 
2.17.0.1052.g7d69f75dbf



[PATCH 2/5] log_write_email_headers: use strbufs

2018-05-18 Thread Jeff King
When we write a MIME attachment, we write the mime headers
into fixed-size buffers. These are likely to be big enough
in practice, but technically the input could be arbitrarily
large (e.g., if the caller provided a lot of content in the
extra_headers string), in which case we'd quietly truncate
it and generate bogus output. Let's convert these buffers to
strbufs.

The memory ownership here is a bit funny. The original fixed
buffers were static, and we merely pass out pointers to them
to be used by the caller (and in one case, we even just
stuff our value into the opt->diffopt.stat_sep value).
Ideally we'd actually pass back heap buffers, and the caller
would be responsible for freeing them.

This patch punts on that cleanup for now, and instead just
marks the strbufs as static. That means we keep ownership in
this function, making it not a complete leak. This also
takes us one step closer to fixing it in the long term
(since we can eventually use strbuf_detach() to hand
ownership to the caller, once it's ready).

Signed-off-by: Jeff King 
---
The rest of that cleanup is a possible #leftoverbits.

 log-tree.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index d1c0bedf24..1173fdb057 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -386,11 +386,15 @@ void log_write_email_headers(struct rev_info *opt, struct 
commit *commit,
graph_show_oneline(opt->graph);
}
if (opt->mime_boundary) {
-   static char subject_buffer[1024];
-   static char buffer[1024];
+   static struct strbuf subject_buffer = STRBUF_INIT;
+   static struct strbuf buffer = STRBUF_INIT;
struct strbuf filename =  STRBUF_INIT;
*need_8bit_cte_p = -1; /* NEVER */
-   snprintf(subject_buffer, sizeof(subject_buffer) - 1,
+
+   strbuf_reset(_buffer);
+   strbuf_reset();
+
+   strbuf_addf(_buffer,
 "%s"
 "MIME-Version: 1.0\n"
 "Content-Type: multipart/mixed;"
@@ -405,13 +409,13 @@ void log_write_email_headers(struct rev_info *opt, struct 
commit *commit,
 extra_headers ? extra_headers : "",
 mime_boundary_leader, opt->mime_boundary,
 mime_boundary_leader, opt->mime_boundary);
-   extra_headers = subject_buffer;
+   extra_headers = subject_buffer.buf;
 
if (opt->numbered_files)
strbuf_addf(, "%d", opt->nr);
else
fmt_output_commit(, commit, opt);
-   snprintf(buffer, sizeof(buffer) - 1,
+   strbuf_addf(,
 "\n--%s%s\n"
 "Content-Type: text/x-patch;"
 " name=\"%s\"\n"
@@ -422,7 +426,7 @@ void log_write_email_headers(struct rev_info *opt, struct 
commit *commit,
 filename.buf,
 opt->no_inline ? "attachment" : "inline",
 filename.buf);
-   opt->diffopt.stat_sep = buffer;
+   opt->diffopt.stat_sep = buffer.buf;
strbuf_release();
}
*extra_headers_p = extra_headers;
-- 
2.17.0.1052.g7d69f75dbf



[PATCH 1/5] http: use strbufs instead of fixed buffers

2018-05-18 Thread Jeff King
We keep the names of incoming packs and objects in fixed
PATH_MAX-size buffers, and snprintf() into them. This is
unlikely to end up with truncated filenames, but it is
possible (especially on systems where PATH_MAX is shorter
than actual paths can be). Let's switch to using strbufs,
which makes the question go away entirely.

Signed-off-by: Jeff King 
---
The diff is a little noisy due to the s/tmpfile/&.buf/ everywhere. The
interesting lines to look at are initialization and release, which I
think I got right.

 http.c | 66 --
 http.h |  4 ++--
 2 files changed, 38 insertions(+), 32 deletions(-)

diff --git a/http.c b/http.c
index fed13b2169..260bd10f95 100644
--- a/http.c
+++ b/http.c
@@ -2082,6 +2082,7 @@ void release_http_pack_request(struct http_pack_request 
*preq)
preq->packfile = NULL;
}
preq->slot = NULL;
+   strbuf_release(>tmpfile);
free(preq->url);
free(preq);
 }
@@ -2104,19 +2105,19 @@ int finish_http_pack_request(struct http_pack_request 
*preq)
lst = &((*lst)->next);
*lst = (*lst)->next;
 
-   if (!strip_suffix(preq->tmpfile, ".pack.temp", ))
+   if (!strip_suffix(preq->tmpfile.buf, ".pack.temp", ))
die("BUG: pack tmpfile does not end in .pack.temp?");
-   tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile);
+   tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile.buf);
 
argv_array_push(, "index-pack");
argv_array_pushl(, "-o", tmp_idx, NULL);
-   argv_array_push(, preq->tmpfile);
+   argv_array_push(, preq->tmpfile.buf);
ip.git_cmd = 1;
ip.no_stdin = 1;
ip.no_stdout = 1;
 
if (run_command()) {
-   unlink(preq->tmpfile);
+   unlink(preq->tmpfile.buf);
unlink(tmp_idx);
free(tmp_idx);
return -1;
@@ -2124,7 +2125,7 @@ int finish_http_pack_request(struct http_pack_request 
*preq)
 
unlink(sha1_pack_index_name(p->sha1));
 
-   if (finalize_object_file(preq->tmpfile, sha1_pack_name(p->sha1))
+   if (finalize_object_file(preq->tmpfile.buf, sha1_pack_name(p->sha1))
 || finalize_object_file(tmp_idx, sha1_pack_index_name(p->sha1))) {
free(tmp_idx);
return -1;
@@ -2143,6 +2144,7 @@ struct http_pack_request *new_http_pack_request(
struct http_pack_request *preq;
 
preq = xcalloc(1, sizeof(*preq));
+   strbuf_init(>tmpfile, 0);
preq->target = target;
 
end_url_with_slash(, base_url);
@@ -2150,12 +2152,11 @@ struct http_pack_request *new_http_pack_request(
sha1_to_hex(target->sha1));
preq->url = strbuf_detach(, NULL);
 
-   snprintf(preq->tmpfile, sizeof(preq->tmpfile), "%s.temp",
-   sha1_pack_name(target->sha1));
-   preq->packfile = fopen(preq->tmpfile, "a");
+   strbuf_addf(>tmpfile, "%s.temp", sha1_pack_name(target->sha1));
+   preq->packfile = fopen(preq->tmpfile.buf, "a");
if (!preq->packfile) {
error("Unable to open local file %s for pack",
- preq->tmpfile);
+ preq->tmpfile.buf);
goto abort;
}
 
@@ -2182,6 +2183,7 @@ struct http_pack_request *new_http_pack_request(
return preq;
 
 abort:
+   strbuf_release(>tmpfile);
free(preq->url);
free(preq);
return NULL;
@@ -2232,7 +2234,7 @@ struct http_object_request *new_http_object_request(const 
char *base_url,
 {
char *hex = sha1_to_hex(sha1);
struct strbuf filename = STRBUF_INIT;
-   char prevfile[PATH_MAX];
+   struct strbuf prevfile = STRBUF_INIT;
int prevlocal;
char prev_buf[PREV_BUF_SIZE];
ssize_t prev_read = 0;
@@ -2240,40 +2242,41 @@ struct http_object_request 
*new_http_object_request(const char *base_url,
struct http_object_request *freq;
 
freq = xcalloc(1, sizeof(*freq));
+   strbuf_init(>tmpfile, 0);
hashcpy(freq->sha1, sha1);
freq->localfile = -1;
 
sha1_file_name(the_repository, , sha1);
-   snprintf(freq->tmpfile, sizeof(freq->tmpfile),
-"%s.temp", filename.buf);
+   strbuf_addf(>tmpfile, "%s.temp", filename.buf);
 
-   snprintf(prevfile, sizeof(prevfile), "%s.prev", filename.buf);
-   unlink_or_warn(prevfile);
-   rename(freq->tmpfile, prevfile);
-   unlink_or_warn(freq->tmpfile);
+   strbuf_addf(, "%s.prev", filename.buf);
+   unlink_or_warn(prevfile.buf);
+   rename(freq->tmpfile.buf, prevfile.buf);
+   unlink_or_warn(freq->tmpfile.buf);
strbuf_release();
 
if (freq->localfile != -1)
error("fd leakage in start: %d", freq->localfile);
-   freq->localfile = open(freq->tmpfile,
+   freq->localfile = open(freq->tmpfile.buf,
   O_WRONLY | O_CREAT | 

[PATCH 0/5] snprintf truncation fixes

2018-05-18 Thread Jeff King
I happened today to be looking at a piece of code that used a bare
snprintf() without checking for truncation, and I got annoyed enough to
root out the last few cases in our codebase. After this series, looking
over the results of:

  git grep '[^vxn]snprintf' '*.c' :^compat

is pretty pleasant.

This series also gets rid of some uses of PATH_MAX, which is another pet
peeve of mine. :)

  [1/5]: http: use strbufs instead of fixed buffers
  [2/5]: log_write_email_headers: use strbufs
  [3/5]: query_fsmonitor: use xsnprintf for formatting integers
  [4/5]: shorten_unambiguous_ref: use xsnprintf
  [5/5]: fmt_with_err: add a comment that truncation is OK

 fsmonitor.c |  4 ++--
 http.c  | 66 +
 http.h  |  4 ++--
 log-tree.c  | 16 -
 refs.c  |  4 ++--
 usage.c |  1 +
 6 files changed, 53 insertions(+), 42 deletions(-)

-Peff


FROM LEONARD PAKKER

2018-05-18 Thread Mr.Leonard Pakker
Attention

I am an American citizen, My name is Leonard Pakker, a member of the
U.S. ARMY medical team apart from being surprise you may be skeptical
to reply me based on what is happening around world on the internet. I
have just deployed to Syria.

I just discover One trunks of consignment boxes containing One Hundred
Notes in $100 US bills the total amount withheld. I have deposited the
consignments with a security firm here in Syria. Presently the Fund
has been Transferred in there corresponding bank in United Kingdom for
safety  am looking for a trust worthy individual who will assist me in
receiving the funds.

To prove my sincerity, you are not sending me any money, because most
of this scam is about sending money. Though, I would like to hold back
some information's for security reasons pending your response.I send
you this mail not without a measure of fear as to the consequences,
but I know within me that nothing ventured is nothing gained and that
success and riches never come easy or on a platter of gold. Please get
back to me immediately and let me know if you intend to assist or not.

If your also do not want to be involved kindly let me know so I would
sever further communication with you.

Regards

Leonard Pakker


Re: [PATCH v3 3/3] unpack_trees_options: free messages when done

2018-05-18 Thread Jeff King
On Fri, May 18, 2018 at 03:30:44PM -0700, Elijah Newren wrote:

> > would become:
> >
> >   msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOUPTODATE_FILE] =
> > string_list_appendf(>msgs_to_free, msg, cmd, cmd)->string;
> >
> > I don't know if that's worth it or not (I suspect that there are other
> > places where appendf would be handy, but I didn't poke around).
> 
> The strdup_strings=1 immediately before calling string_list_clear()
> has been used in one other place in merge-recursive.c, and tripped up
> the reviewer requiring a big code comment to explain it. (See the very
> end of 
> https://public-inbox.org/git/cabpp-bgh7qttfu3kgh4ko5drrxiqjtrnhx_uaqsb6fhxt+9...@mail.gmail.com/
> ).  So there's already one other place in merge-recursive.c that might
> benefit from such a change.

Thanks. I knew I had seen such hackery before, but it's nice to have a
specific site that would benefit.

IMHO the "nodup" variant of string_list is quite often a sign that
things are more complicated than they need to be. Even in cases that are
truly pointing to existing strings, is the complication really worth
saving a few strdups? Perhaps sometimes, but I have a suspicion it's
mostly premature optimization.

> Maybe someone wants to tackle that as a separate patch series?  (Maybe
> we make it a micro-project for future GSoC'ers?)

Yeah, I'm fine with these patches if somebody wants to do it separately.
It would be a good micro-project, but I'd also be just as happy if
somebody did it before next year. :)

-Peff


Re: [PATCH 0/2] fix a segfault in get_main_ref_store()

2018-05-18 Thread Johannes Schindelin
Hi Peff,

On Fri, 18 May 2018, Jeff King wrote:

> I stumbled across a BUG() today. But interestingly, in the current tip
> of master it actually segfaults instead! This fixes the segfault (back
> into a BUG(), and then fixes the caller to avoid the BUG() in the first
> place).
> 
>   [1/2]: get_main_ref_store: BUG() when outside a repository
>   [2/2]: config: die when --blob is used outside a repository
> 
>  builtin/config.c   | 3 +++
>  refs.c | 3 +++
>  t/t1307-config-blob.sh | 4 
>  3 files changed, 10 insertions(+)

Both patches look obviously correct to me.

Thanks,
Dscho


Re: [PATCH 1/2] get_main_ref_store: BUG() when outside a repository

2018-05-18 Thread Stefan Beller
On Fri, May 18, 2018 at 3:25 PM, Jeff King  wrote:
> If we don't have a repository, then we can't initialize the
> ref store.  Prior to 64a741619d (refs: store the main ref
> store inside the repository struct, 2018-04-11), we'd try to
> access get_git_dir(), and outside a repository that would
> trigger a BUG(). After that commit, though, we directly use
> the_repository->git_dir; if it's NULL we'll just segfault.
>
> Let's catch this case and restore the BUG() behavior.
> Obviously we don't ever want to hit this code, but a BUG()
> is a lot more helpful than a segfault if we do.

That is true and an immediate bandaid; an alternative would
be to do:

  if (!r->gitdir)
/* not in a git directory ? */
return NULL;
/* We signal the caller the absence of a git repo by NULL ness
  of the ref store */

However that we would need to catch at all callers of
get_main_ref_store and error out accordingly.

Then the trade off would be, how many callers to the main ref
store do we have compared to options that rely on a ref store
present? (I assume a patch like the second patch would show
up in more cases now for all BUGs that we discover via this patch.
The tradeoff is just if we want to report all these early by checking
the config and system state, or wait until we get NULL ref store
and then bail)

I think checking early makes sense; I am not so sure about this
patch; for the time being it makes sense, though in the long run,
we rather want to catch this at a higher level:

r->gitdir is supposedly never NULL, so we shall not
produce this state. Maybe we want to set the_repository to NULL
if set_git_dir fails (via repo_set_gitdir in setup_git_env, which both
return void today).

Enough of my rambling, the patches look good!
Stefan


Re: [PATCH 2/2] config: die when --blob is used outside a repository

2018-05-18 Thread Taylor Blau
On Fri, May 18, 2018 at 03:27:04PM -0700, Jeff King wrote:
> If you run "config --blob" outside of a repository, then we
> eventually try to resolve the blob name and hit a BUG().
> Let's catch this earlier and provide a useful message.

I think that this approach is sensible. Let's (1) fix a SIGSEGV that
should be a BUG(), and (2) make sure that we never get there in the
first place.

This looks fine to me.

> Note that we could also catch this much lower in the stack,
> in git_config_from_blob_ref(). That might cover other
> callsites, too, but it's unclear whether those ones would
> actually be bugs or not. So let's leave the low-level
> functions to assume the caller knows what it's doing (and
> BUG() if it turns out it doesn't).
>
> Signed-off-by: Jeff King 

Thanks,
Taylor


Re: [PATCH v3 3/3] unpack_trees_options: free messages when done

2018-05-18 Thread Elijah Newren
On Fri, May 18, 2018 at 2:33 PM, Jeff King  wrote:
> On Fri, May 18, 2018 at 11:23:27PM +0200, Martin Ågren wrote:
>
>> diff --git a/unpack-trees.c b/unpack-trees.c
>> index 79fd97074e..60293ff536 100644
>> --- a/unpack-trees.c
>> +++ b/unpack-trees.c
>> @@ -103,6 +103,8 @@ void setup_unpack_trees_porcelain(struct 
>> unpack_trees_options *opts,
>>   const char **msgs = opts->msgs;
>>   const char *msg;
>>
>> + opts->msgs_to_free.strdup_strings = 0;
>> +
>> [...]
>> +void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
>> +{
>> + opts->msgs_to_free.strdup_strings = 1;
>> + string_list_clear(>msgs_to_free, 0);
>
> I like this string_list approach much better, but it's too bad we have
> to go through these contortions with the strdup flag to get the memory
> ownership right.
>
> If we had a string_list_appendf(), then we could just leave that flag
> alone and this:
>
>> @@ -118,8 +120,9 @@ void setup_unpack_trees_porcelain(struct 
>> unpack_trees_options *opts,
>> ? _("Your local changes to the following files would be 
>> overwritten by %s:\n%%s"
>> "Please commit your changes or stash them before you 
>> %s.")
>> : _("Your local changes to the following files would be 
>> overwritten by %s:\n%%s");
>> - msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] =
>> - xstrfmt(msg, cmd, cmd);
>> + msg = xstrfmt(msg, cmd, cmd);
>> + msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] = msg;
>> + string_list_append(>msgs_to_free, msg);
>
> would become:
>
>   msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOUPTODATE_FILE] =
> string_list_appendf(>msgs_to_free, msg, cmd, cmd)->string;
>
> I don't know if that's worth it or not (I suspect that there are other
> places where appendf would be handy, but I didn't poke around).

The strdup_strings=1 immediately before calling string_list_clear()
has been used in one other place in merge-recursive.c, and tripped up
the reviewer requiring a big code comment to explain it. (See the very
end of 
https://public-inbox.org/git/cabpp-bgh7qttfu3kgh4ko5drrxiqjtrnhx_uaqsb6fhxt+9...@mail.gmail.com/
).  So there's already one other place in merge-recursive.c that might
benefit from such a change.


A quick search shows about half a dozen other sites throughout the
code that are doing something similar:

$ git grep -3 strdup_strings | grep -B 1 string_list_clear
bisect.c: refs_for_removal.strdup_strings = 1;
bisect.c- string_list_clear(_for_removal, 0);
--
builtin/shortlog.c: onelines->strdup_strings = 1;
builtin/shortlog.c- string_list_clear(onelines, 0);
--
builtin/shortlog.c: log->list.strdup_strings = 1;
builtin/shortlog.c- string_list_clear(>list, 1);
--
mailmap.c: me->namemap.strdup_strings = 1;
mailmap.c- string_list_clear_func(>namemap, free_mailmap_info);
--
mailmap.c: map->strdup_strings = 1;
mailmap.c- string_list_clear_func(map, free_mailmap_entry);
--
merge-recursive.c: entry->possible_new_dirs.strdup_strings = 1;
merge-recursive.c- string_list_clear(>possible_new_dirs, 1);
--
revision.c: revs->notes_opt.extra_notes_refs.strdup_strings = 1;
revision.c- string_list_clear(>notes_opt.extra_notes_refs, 0);


Maybe someone wants to tackle that as a separate patch series?  (Maybe
we make it a micro-project for future GSoC'ers?)


[PATCH 2/2] config: die when --blob is used outside a repository

2018-05-18 Thread Jeff King
If you run "config --blob" outside of a repository, then we
eventually try to resolve the blob name and hit a BUG().
Let's catch this earlier and provide a useful message.

Note that we could also catch this much lower in the stack,
in git_config_from_blob_ref(). That might cover other
callsites, too, but it's unclear whether those ones would
actually be bugs or not. So let's leave the low-level
functions to assume the caller knows what it's doing (and
BUG() if it turns out it doesn't).

Signed-off-by: Jeff King 
---
 builtin/config.c   | 3 +++
 t/t1307-config-blob.sh | 4 
 2 files changed, 7 insertions(+)

diff --git a/builtin/config.c b/builtin/config.c
index 69e7270356..4155734f4a 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -602,6 +602,9 @@ int cmd_config(int argc, const char **argv, const char 
*prefix)
if (use_local_config && nongit)
die(_("--local can only be used inside a git repository"));
 
+   if (given_config_source.blob && nongit)
+   die(_("--blob can only be used inside a git repository"));
+
if (given_config_source.file &&
!strcmp(given_config_source.file, "-")) {
given_config_source.file = NULL;
diff --git a/t/t1307-config-blob.sh b/t/t1307-config-blob.sh
index eed31ffa30..37dc689d8c 100755
--- a/t/t1307-config-blob.sh
+++ b/t/t1307-config-blob.sh
@@ -73,4 +73,8 @@ test_expect_success 'can parse blob ending with CR' '
test_cmp expect actual
 '
 
+test_expect_success 'config --blob outside of a repository is an error' '
+   test_must_fail nongit git config --blob=foo --list
+'
+
 test_done
-- 
2.17.0.1052.g7d69f75dbf


[PATCH 1/2] get_main_ref_store: BUG() when outside a repository

2018-05-18 Thread Jeff King
If we don't have a repository, then we can't initialize the
ref store.  Prior to 64a741619d (refs: store the main ref
store inside the repository struct, 2018-04-11), we'd try to
access get_git_dir(), and outside a repository that would
trigger a BUG(). After that commit, though, we directly use
the_repository->git_dir; if it's NULL we'll just segfault.

Let's catch this case and restore the BUG() behavior.
Obviously we don't ever want to hit this code, but a BUG()
is a lot more helpful than a segfault if we do.

Signed-off-by: Jeff King 
---
 refs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/refs.c b/refs.c
index 64aadd14c9..2e4a42f459 100644
--- a/refs.c
+++ b/refs.c
@@ -1668,6 +1668,9 @@ struct ref_store *get_main_ref_store(struct repository *r)
if (r->refs)
return r->refs;
 
+   if (!r->gitdir)
+   BUG("attempting to get main_ref_store outside of repository");
+
r->refs = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
return r->refs;
 }
-- 
2.17.0.1052.g7d69f75dbf



[PATCH 0/2] fix a segfault in get_main_ref_store()

2018-05-18 Thread Jeff King
I stumbled across a BUG() today. But interestingly, in the current tip
of master it actually segfaults instead! This fixes the segfault (back
into a BUG(), and then fixes the caller to avoid the BUG() in the first
place).

  [1/2]: get_main_ref_store: BUG() when outside a repository
  [2/2]: config: die when --blob is used outside a repository

 builtin/config.c   | 3 +++
 refs.c | 3 +++
 t/t1307-config-blob.sh | 4 
 3 files changed, 10 insertions(+)

-Peff


Re: [PATCH v5 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)'

2018-05-18 Thread Taylor Blau
Thanks for your thoughtful response. I answered in detail below, but I
think that we're in agreement about the semantics, with a few
corrections on my part. I'd like to push forward with this series,
including the proposed changes below, but feel that sending it as v7
would be asking too much of a reviewer. Would it be OK if I sent this a
new series entirely and we abandon this thread?

On Fri, May 18, 2018 at 03:27:44PM +0900, Junio C Hamano wrote:
> Taylor Blau  writes:
>
> >   * `git grep --and -e foo -e bar`. This binary operation should recur
> > on its sub-expressions and take the minimum of the starting offset
> > and the maximum of the ending offset.
>
> We use infix notation, so the above is "git grep -e foo --and -e
> bar" actually ;-).

Thanks for catching that :-).

> But you raise an interesting point.  A line with "hello foo bar baz"
> on it would match, so does a line with "goodbye bar baz foo", as
> both of them hits pattern "foo" *and* pattern "bar".  It is not
> quite clear what it means to "show the first hit on the line".  One
> interpretation would be to take the minimum span that makes both
> sides of "--and" happy (your "minimum of start, maximum of end").

It's funny you should mention: this was nearly the exact phrase I used
when speaking with Peff.

> Another might be to pick "foo" in the first and "bar" in the second
> line, as that is the "first hit" on the line, which is consistent
> with how "git grep -e foo" would say about "a foo b foo c foo" (I
> expect that the leftmost "foo" would be the first hit).  So there
> may be multiple, equally plausible answer to the question.

This is the largest fact in my mind pertaining to this discussion: there
are probably many different interpretations of semantics for this, all
equally valid in their own way. I am partial to the minimum substring
interpretation (which follows naturally from the minimum-start,
maximum-end idea), accepting the shortcoming that `--or` sometimes
doesn't ``do the right thing.''

> > For inputs of the form "foobar" and "foo bar", it will do the right
> > thing (give the starting and ending offset for "foobar" and give no
> > match, respectively).
>
> I think I agree with "foobar", but I do not understand why there is
> no match for "foo bar".

Ah, I think this is my mistake -- when I wrote this note last night. The
case of `-e foo --and -e bar` should clearly match both `foo bar` _and_
`foobar`.

> >   * `git grep --or -e foo -e bar`. This is the most complicated case, in
> > my opinion. In going with the min/max idea in the and case above, I
> > think that `--or` should also min/max its sub-expressions, but in
> > fact we short-circuit evaluating the second sub-expression when we
> > find a match for the first.
>
> I am not sure I follow.  "git grep -e foo --or -e bar" is just a
> longhand for "git grep -e foo -e bar".  Shouldn't it highlight
> whichever between foo and bar that appears leftmost on the line?

I don't believe that the two are treated the same, but I think that this
is another case where I was incorrect in my judgement of the
implementation last night. In fact, the only time when we _don't_ recur
on both sub-expressions of `--or` is when 'collect_hits' is zero. That's
fine, I believe.

> > So, in cases like matching `--or -e foo -e bar` with "foo baz bar",
> > we'll do the right thing, since `foo` is the first sub-expression
> > and happens to be the left-most match. In other words, we __adhere
> > to our answer with the left-most match first__ semantics, but only
> > because __the first sub-expression is the left-most match__.
> >
> > In the other case where we try and match the same expression against
> > "bar baz foo", we'll return the starting offset of "foo", even
> > though it isn't the left-most match, violating our semantics.
>
> I am not sure why you think your min-starting/max-ending would lead
> to such a conclusion.  'foo baz bar' would be covered in its
> entirety, 'bar baz foo' would also, as starting of hits with pattern
> 'foo' and pattern 'bar' would be 'b' in 'bar' on that three-word
> line, and ending of hits with these two patterns would be the last
> 'o' in 'foo' on the line.

Right, I think with the understanding in my last stanza of this response
("I don't believe that ...") this issue is resolved, and the
min-starting/max-ending _will_ do the right thing.

> I'd expect that a line 'foo baz bar' matched against "-e foo --or -e
> bar" would say "among three words on me, 'f' in foo is the first
> location of the match", though.

I would, too.

Thanks,
Taylor


Re: [PATCH] t7005-editor: get rid of the SPACES_IN_FILENAMES prereq

2018-05-18 Thread Jeff King
On Mon, May 14, 2018 at 12:28:12PM +0200, SZEDER Gábor wrote:

> The last two tests 'editor with a space' and 'core.editor with a
> space' in 't7005-editor.sh' need the SPACES_IN_FILENAMES prereq to
> ensure that they are only run on filesystems that allow, well, spaces
> in filesnames.  However, we have been putting a space in the name of
> the trash directory for just over a decade now, so we wouldn't be able
> to run any of our tests on such a filesystem in the first place.
> 
> This prereq is therefore unnecessary, remove it.

I wondered if there might be some platforms that tweak the test
environment in such a way that we skip the space in "trash directory".
But looking at test-lib.sh, I don't think there is any easy way to do so
(e.g., you can't just set $TEST_DIRECTORY_NAME or something). So this
does seem safe, and it's nice to reduce unnecessary complexity.

-Peff


Re: [PATCH v3 3/3] unpack_trees_options: free messages when done

2018-05-18 Thread Jeff King
On Fri, May 18, 2018 at 11:23:27PM +0200, Martin Ågren wrote:

> diff --git a/unpack-trees.c b/unpack-trees.c
> index 79fd97074e..60293ff536 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -103,6 +103,8 @@ void setup_unpack_trees_porcelain(struct 
> unpack_trees_options *opts,
>   const char **msgs = opts->msgs;
>   const char *msg;
>  
> + opts->msgs_to_free.strdup_strings = 0;
> +
> [...]
> +void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
> +{
> + opts->msgs_to_free.strdup_strings = 1;
> + string_list_clear(>msgs_to_free, 0);

I like this string_list approach much better, but it's too bad we have
to go through these contortions with the strdup flag to get the memory
ownership right.

If we had a string_list_appendf(), then we could just leave that flag
alone and this:

> @@ -118,8 +120,9 @@ void setup_unpack_trees_porcelain(struct 
> unpack_trees_options *opts,
> ? _("Your local changes to the following files would be 
> overwritten by %s:\n%%s"
> "Please commit your changes or stash them before you 
> %s.")
> : _("Your local changes to the following files would be 
> overwritten by %s:\n%%s");
> - msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] =
> - xstrfmt(msg, cmd, cmd);
> + msg = xstrfmt(msg, cmd, cmd);
> + msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] = msg;
> + string_list_append(>msgs_to_free, msg);

would become:

  msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOUPTODATE_FILE] =
string_list_appendf(>msgs_to_free, msg, cmd, cmd)->string;

I don't know if that's worth it or not (I suspect that there are other
places where appendf would be handy, but I didn't poke around).

-Peff


Re: [PATCH 1/1] Inform about fast-forwarding of submodules during merge

2018-05-18 Thread Elijah Newren
Hi Leif,

On Fri, May 18, 2018 at 12:48 PM, Leif Middelschulte
 wrote:
> From: Leif Middelschulte 
>
> Silent fast-forwarding might lead to inconveniences in cases where
> submodules are expected to have a certain revision, because 'more recent'
> (therefore fast-forwardable) versions might break behavior/contain 
> regressions.
>
> A use-case is the integration (merge) phase as part of the feature-centric
> 'git-flow' workflow [0]. I.e. a feature might be well-tested with a certain
> submodule revision, but break because of regressions (or changes in general)
> within an updated version of the sourced submodule.
>
> This change tries to support the integrator by telling her about another 
> possible
> source of unexpected behavior (differing submodule versions) she might see
> during integration tests.

Thanks for continuing to push on this.  This looks good so far (to
me), but I was also hoping to see the analogy between these messages
and "Auto-merging $FILE" for regular files mentioned.  Both Junio[1]
and I[2] pointed out this similarity, and I think this
similarity/analogy is useful additional motivation for making this
change.

[1] https://public-inbox.org/git/xmqqo9hg7554@gitster-ct.c.googlers.com/
[2] 
https://public-inbox.org/git/CABPp-BGaibCPWuCnaX5Af=sv-2zvyhncupt+-pkxhdfjbg_...@mail.gmail.com/


> +   } else if (show(o, 2))
> +   output(o, 2, _("Fast-forwarding submodule %s to %s"), 
> path, oid_to_hex(b));
...
> +   } else if (show(o, 2))
> +   output(o, 2, _("Fast-forwarding submodule %s to %s"), 
> path, oid_to_hex(a));

Also, by analogy to the "Auto-merging $FILE" comparison, the "to %s"
on these two lines feels out of place.  Users can just look at the
submodule to see what it was updated to.  In a sea of output from
merging, this extra detail feels like noise for the standard use-case,
unless I'm misunderstanding how submodules are special.  Junio also
commented on this in the same email referenced above (at [1]).  Is
there a reason this is an important piece of the message for you to be
shown at the standard merge verbosity?


[PATCH v3 3/3] unpack_trees_options: free messages when done

2018-05-18 Thread Martin Ågren
The strings allocated in `setup_unpack_trees_porcelain()` are never
freed. Provide a function `clear_unpack_trees_porcelain()` to do so and
call it where we use `setup_unpack_trees_porcelain()`. The only
non-trivial user is `unpack_trees_start()`, where we should place the
new call in `unpack_trees_finish()`.

We keep the string pointers in an array, mixing pointers to static
memory and memory that we allocate on the heap. We also keep several
copies of the individual pointers. So we need to make sure that we do
not free what we must not free and that we do not double-free. Keep the
unique, heap-allocated pointers in a separate string list, to make the
freeing safe and future-proof.

Zero the whole array of string pointers to make sure that we do not
leave any dangling pointers.

Note that we only take responsibility for the memory allocated in
`setup_unpack_trees_porcelain()` and not any other members of the
`struct unpack_trees_options`.

Helped-by: Junio C Hamano 
Signed-off-by: Martin Ågren 
---
 unpack-trees.h |  6 ++
 builtin/checkout.c |  1 +
 merge-recursive.c  |  1 +
 merge.c|  3 +++
 unpack-trees.c | 23 +++
 5 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/unpack-trees.h b/unpack-trees.h
index 41178ada94..5a84123a40 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -33,6 +33,11 @@ enum unpack_trees_error_types {
 void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
  const char *cmd);
 
+/*
+ * Frees resources allocated by setup_unpack_trees_porcelain().
+ */
+void clear_unpack_trees_porcelain(struct unpack_trees_options *opts);
+
 struct unpack_trees_options {
unsigned int reset,
 merge,
@@ -57,6 +62,7 @@ struct unpack_trees_options {
struct pathspec *pathspec;
merge_fn_t fn;
const char *msgs[NB_UNPACK_TREES_ERROR_TYPES];
+   struct string_list msgs_to_free;
/*
 * Store error messages in an array, each case
 * corresponding to a error message type
diff --git a/builtin/checkout.c b/builtin/checkout.c
index b49b582071..5cebe170fc 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -526,6 +526,7 @@ static int merge_working_tree(const struct checkout_opts 
*opts,
init_tree_desc([1], tree->buffer, tree->size);
 
ret = unpack_trees(2, trees, );
+   clear_unpack_trees_porcelain();
if (ret == -1) {
/*
 * Unpack couldn't do a trivial merge; either
diff --git a/merge-recursive.c b/merge-recursive.c
index ddb0fa7369..338f63a952 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -382,6 +382,7 @@ static int unpack_trees_start(struct merge_options *o,
 static void unpack_trees_finish(struct merge_options *o)
 {
discard_index(>orig_index);
+   clear_unpack_trees_porcelain(>unpack_opts);
 }
 
 struct tree *write_tree_from_memory(struct merge_options *o)
diff --git a/merge.c b/merge.c
index f123658e58..b433291d0c 100644
--- a/merge.c
+++ b/merge.c
@@ -130,8 +130,11 @@ int checkout_fast_forward(const struct object_id *head,
 
if (unpack_trees(nr_trees, t, )) {
rollback_lock_file(_file);
+   clear_unpack_trees_porcelain();
return -1;
}
+   clear_unpack_trees_porcelain();
+
if (write_locked_index(_index, _file, COMMIT_LOCK))
return error(_("unable to write new index file"));
return 0;
diff --git a/unpack-trees.c b/unpack-trees.c
index 79fd97074e..60293ff536 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -103,6 +103,8 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
const char **msgs = opts->msgs;
const char *msg;
 
+   opts->msgs_to_free.strdup_strings = 0;
+
if (!strcmp(cmd, "checkout"))
msg = advice_commit_before_merge
  ? _("Your local changes to the following files would be 
overwritten by checkout:\n%%s"
@@ -118,8 +120,9 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
  ? _("Your local changes to the following files would be 
overwritten by %s:\n%%s"
  "Please commit your changes or stash them before you 
%s.")
  : _("Your local changes to the following files would be 
overwritten by %s:\n%%s");
-   msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] =
-   xstrfmt(msg, cmd, cmd);
+   msg = xstrfmt(msg, cmd, cmd);
+   msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] = msg;
+   string_list_append(>msgs_to_free, msg);
 
msgs[ERROR_NOT_UPTODATE_DIR] =
_("Updating the following directories would lose untracked 
files in them:\n%s");
@@ -139,7 +142,9 @@ void setup_unpack_trees_porcelain(struct 

[PATCH v3 2/3] merge-recursive: provide pair of `unpack_trees_{start,finish}()`

2018-05-18 Thread Martin Ågren
From: Elijah Newren 

Rename `git_merge_trees()` to `unpack_trees_start()` and extract the
call to `discard_index()` into a new function `unpack_trees_finish()`.
As a result, these are called early resp. late in `merge_trees()`,
making the resource handling clearer. The next commit will expand on
that, teaching `..._finish()` to free more memory. (So rather than
moving the FIXME-comment, just drop it, since it will be addressed soon
enough.)

Also call `..._finish()` when `merge_trees()` returns early.

Signed-off-by: Elijah Newren 
Signed-off-by: Martin Ågren 
Signed-off-by: Junio C Hamano 
---
 merge-recursive.c | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 680e01226b..ddb0fa7369 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -337,10 +337,10 @@ static void init_tree_desc_from_tree(struct tree_desc 
*desc, struct tree *tree)
init_tree_desc(desc, tree->buffer, tree->size);
 }
 
-static int git_merge_trees(struct merge_options *o,
-  struct tree *common,
-  struct tree *head,
-  struct tree *merge)
+static int unpack_trees_start(struct merge_options *o,
+ struct tree *common,
+ struct tree *head,
+ struct tree *merge)
 {
int rc;
struct tree_desc t[3];
@@ -379,6 +379,11 @@ static int git_merge_trees(struct merge_options *o,
return rc;
 }
 
+static void unpack_trees_finish(struct merge_options *o)
+{
+   discard_index(>orig_index);
+}
+
 struct tree *write_tree_from_memory(struct merge_options *o)
 {
struct tree *result = NULL;
@@ -3088,13 +3093,14 @@ int merge_trees(struct merge_options *o,
return 1;
}
 
-   code = git_merge_trees(o, common, head, merge);
+   code = unpack_trees_start(o, common, head, merge);
 
if (code != 0) {
if (show(o, 4) || o->call_depth)
err(o, _("merging of trees %s and %s failed"),
oid_to_hex(>object.oid),
oid_to_hex(>object.oid));
+   unpack_trees_finish(o);
return -1;
}
 
@@ -3147,20 +3153,15 @@ int merge_trees(struct merge_options *o,
 
hashmap_free(>current_file_dir_set, 1);
 
-   if (clean < 0)
+   if (clean < 0) {
+   unpack_trees_finish(o);
return clean;
+   }
}
else
clean = 1;
 
-   /* Free the extra index left from git_merge_trees() */
-   /*
-* FIXME: Need to also free data allocated by
-* setup_unpack_trees_porcelain() tucked away in o->unpack_opts.msgs,
-* but the problem is that only half of it refers to dynamically
-* allocated data, while the other half points at static strings.
-*/
-   discard_index(>orig_index);
+   unpack_trees_finish(o);
 
if (o->call_depth && !(*result = write_tree_from_memory(o)))
return -1;
-- 
2.17.0.840.g5d83f92caf



[PATCH v3 1/3] merge: setup `opts` later in `checkout_fast_forward()`

2018-05-18 Thread Martin Ågren
After we initialize the various fields in `opts` but before we actually
use them, we might return early. Move the initialization further down,
to immediately before we use `opts`.

This limits the scope of `opts` and will help a later commit fix a
memory leak without having to worry about those early returns.

This patch is best viewed using something like this (note the tab!):
--color-moved --anchored="  trees[nr_trees] = parse_tree_indirect"

Signed-off-by: Martin Ågren 
Signed-off-by: Junio C Hamano 
---
 merge.c | 32 +---
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/merge.c b/merge.c
index f06a4773d4..f123658e58 100644
--- a/merge.c
+++ b/merge.c
@@ -94,8 +94,24 @@ int checkout_fast_forward(const struct object_id *head,
return -1;
 
memset(, 0, sizeof(trees));
-   memset(, 0, sizeof(opts));
memset(, 0, sizeof(t));
+
+   trees[nr_trees] = parse_tree_indirect(head);
+   if (!trees[nr_trees++]) {
+   rollback_lock_file(_file);
+   return -1;
+   }
+   trees[nr_trees] = parse_tree_indirect(remote);
+   if (!trees[nr_trees++]) {
+   rollback_lock_file(_file);
+   return -1;
+   }
+   for (i = 0; i < nr_trees; i++) {
+   parse_tree(trees[i]);
+   init_tree_desc(t+i, trees[i]->buffer, trees[i]->size);
+   }
+
+   memset(, 0, sizeof(opts));
if (overwrite_ignore) {
memset(, 0, sizeof(dir));
dir.flags |= DIR_SHOW_IGNORED;
@@ -112,20 +128,6 @@ int checkout_fast_forward(const struct object_id *head,
opts.fn = twoway_merge;
setup_unpack_trees_porcelain(, "merge");
 
-   trees[nr_trees] = parse_tree_indirect(head);
-   if (!trees[nr_trees++]) {
-   rollback_lock_file(_file);
-   return -1;
-   }
-   trees[nr_trees] = parse_tree_indirect(remote);
-   if (!trees[nr_trees++]) {
-   rollback_lock_file(_file);
-   return -1;
-   }
-   for (i = 0; i < nr_trees; i++) {
-   parse_tree(trees[i]);
-   init_tree_desc(t+i, trees[i]->buffer, trees[i]->size);
-   }
if (unpack_trees(nr_trees, t, )) {
rollback_lock_file(_file);
return -1;
-- 
2.17.0.840.g5d83f92caf



[PATCH v3 0/3] unpack_trees_options: free messages when done

2018-05-18 Thread Martin Ågren
This is a reroll of my attempt at freeing the memory allocated by
`setup_unpack_trees_porcelain()`. The first two patches are identical to
v2. The third patch no longer relies on rather intimate knowledge of
which strings are on the heap and which pointers are duplicates.
Instead, as suggested by Junio, I keep a separate string-list of strings
to free. That should make things more future-proof.

v2: https://public-inbox.org/git/cover.1526488122.git.martin.ag...@gmail.com/

Martin

Elijah Newren (1):
  merge-recursive: provide pair of `unpack_trees_{start,finish}()`

Martin Ågren (2):
  merge: setup `opts` later in `checkout_fast_forward()`
  unpack_trees_options: free messages when done

 unpack-trees.h |  6 ++
 builtin/checkout.c |  1 +
 merge-recursive.c  | 30 --
 merge.c| 35 ---
 unpack-trees.c | 23 +++
 5 files changed, 62 insertions(+), 33 deletions(-)

-- 
2.17.0.840.g5d83f92caf



Re: error(?) in "man git-stash" regarding "--keep-index"

2018-05-18 Thread Robert P. J. Day
On Fri, 18 May 2018, Sybille Peters wrote:

> My 2c on this:
>
> 1) "If the --keep-index option is used, all changes already added to
>the index are left intact" (manpage git stash)
>
> That appears to be correct and clear
>
>
> 2) "$ git stash push --keep-index # save *all other* changes to the
>stash"  (manpage git stash)
>
> That is either not correct or misleading. "All other" implies in my
> opinion all changes except the ones that were already added. *"All
> changes including already staged changes"* might be a better choice.
>
> Please also see open question on StackOverflow:
>
> https://stackoverflow.com/questions/50242489/how-to-ignore-added-hunks-in-git-stash-p/

  hilariously, that SO piece refers to an issue posted to github here:

https://github.com/progit/progit2/issues/822

which was, in fact, posted by me. :-) in any event, let me summarize a
couple things here.

  when i first read up on git stash, it was just that section in the
man page that threw me, and once i figured out how it worked, i
thought of how *i* would have explained it, and it would have gone
like this (assuming i do, in fact, now understand it correctly, which
is by no means guaranteed).

  first, when you do "git stash push", what *always* happens is that
what is stashed is, in two parts, changes in the working directory,
and what is staged in the index. clearly, the staged changes are a
subset of the overall working directory changes, but what's important
is that the stash contains *all* of those changes and, more
importantly, distinguishes between the two categories. that's the
crucial part -- what is stashed (regardless of "--keep-index" or not)
is:

  1) staged changes
  2) unstaged changes

can we agree on that? the only difference made by "--keep-index" is
that the staged changes, in addition to being stashed, are left in the
index. but that makes no difference to what is stashed, do i have that
right?

  now, when popping (or applying), what is popped are all of the
changes in the stash, back into the working directory. period. that
always happens, correct? the only difference introduced by "--index"
is that the pop/apply *also* tries to restage what was staged before.

  is all of the above correct? if it had been explained that way, i
wouldn't have spent several confused hours trying to figure out why i
wasn't getting the results i was expecting.

rday



[PATCH v3 0/1] rebased: inform about auto submodule ff

2018-05-18 Thread Leif Middelschulte
From: Leif Middelschulte 

This is a follow-up on Junio C Hamano's comments [0] and Stefan Beller's request
[1] for a more explainatory/elaborate commit message.

[0] https://public-inbox.org/git/xmqqk1s474vx@gitster-ct.c.googlers.com/
[1] https://public-inbox.org/git/20180517184008.25445-1-sbel...@google.com/

Leif Middelschulte (1):
  Inform about fast-forwarding of submodules during merge

 merge-recursive.c | 16 
 1 file changed, 16 insertions(+)

-- 
2.15.1 (Apple Git-101)



[PATCH 1/1] Inform about fast-forwarding of submodules during merge

2018-05-18 Thread Leif Middelschulte
From: Leif Middelschulte 

Silent fast-forwarding might lead to inconveniences in cases where
submodules are expected to have a certain revision, because 'more recent'
(therefore fast-forwardable) versions might break behavior/contain regressions.

A use-case is the integration (merge) phase as part of the feature-centric
'git-flow' workflow [0]. I.e. a feature might be well-tested with a certain
submodule revision, but break because of regressions (or changes in general)
within an updated version of the sourced submodule.

This change tries to support the integrator by telling her about another 
possible
source of unexpected behavior (differing submodule versions) she might see
during integration tests.

[0] http://nvie.com/posts/a-successful-git-branching-model/

Signed-off-by: Leif Middelschulte 
---
 merge-recursive.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index a4b91d17f..e2c99924d 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1093,10 +1093,26 @@ static int merge_submodule(struct merge_options *o,
/* Case #1: a is contained in b or vice versa */
if (in_merge_bases(commit_a, commit_b)) {
oidcpy(result, b);
+   if (show(o, 3)) {
+   output(o, 3, _("Fast-forwarding submodule %s to the 
following commit:"), path);
+   output_commit_title(o, commit_b);
+   } else if (show(o, 2))
+   output(o, 2, _("Fast-forwarding submodule %s to %s"), 
path, oid_to_hex(b));
+   else
+   ; /* no output */
+
return 1;
}
if (in_merge_bases(commit_b, commit_a)) {
oidcpy(result, a);
+   if (show(o, 3)) {
+   output(o, 3, _("Fast-forwarding submodule %s to the 
following commit:"), path);
+   output_commit_title(o, commit_a);
+   } else if (show(o, 2))
+   output(o, 2, _("Fast-forwarding submodule %s to %s"), 
path, oid_to_hex(a));
+   else
+   ; /* no output */
+
return 1;
}
 
-- 
2.15.1 (Apple Git-101)



[PATCH v2 0/1] rebased: inform about auto submodule ff

2018-05-18 Thread Leif Middelschulte
From: Leif Middelschulte 

This is a follow-up on Junio C Hamano's comments [0] and Stefan Beller's request
[1] for a more explainatory/elaborate commit message.

[0] https://public-inbox.org/git/xmqqk1s474vx@gitster-ct.c.googlers.com/
[1] https://public-inbox.org/git/20180517184008.25445-1-sbel...@google.com/

Leif Middelschulte (1):
  Inform about fast-forwarding of submodules during merge

 merge-recursive.c | 4 
 1 file changed, 4 insertions(+)

-- 
2.15.1 (Apple Git-101)



Re: [PATCH 2/2] t9902-completion: exercise __git_complete_index_file() directly

2018-05-18 Thread Eric Sunshine
On Fri, May 18, 2018 at 10:17 AM, SZEDER Gábor  wrote:
> The tests added in 2f271cd9cf (t9902-completion: add tests
> demonstrating issues with quoted pathnames, 2018-05-08) and in
> 2ab6eab4fe (completion: improve handling quoted paths in 'git
> ls-files's output, 2018-03-28) have a few shortcomings:
>
>   - All these test use the 'test_completion' helper function, thus

s/these test//

> they are exercising the whole completion machinery, although they
> are only interested in how git-aware path completion, specifically
> the __git_complete_index_file() function deals with unusual
> characters in pathnames and on the command line.
>
>   - These tests can't satisfactorily test the case of pathnames
> containing spaces, because 'test_completion' gets the words on the
> command line as a single argument and it uses space as word
> separator.
>
>   - Some of the tests are protected by different FUNNYNAMES_* prereqs
> depending on whether they put backslashes and double quotes or
> separator characters (FS, GS, RS, US) in pathnames, although a
> filesystem not allowing one likely doesn't allow the others
> either.
>
>   - One of the tests operates on paths containing '|' and '&'
> characters without being protected by a FUNNYNAMES prereq, but
> some filesystems (notably on Windows) don't allow these characters
> in pathnames, either.
>
> Replace these tests with basically equivalent, more focused tests that
> call __git_complete_index_file() directly.  Since this function only
> looks at the current word to be completed, i.e. the $cur variable, we
> can easily include pathnames containing spaces in the tests, so use
> such pathnames instead of pathnames containing '|' and '&'.  Finally,
> use only a single FUNNYNAMES prereq for all kinds of special
> characters.
>
> Signed-off-by: SZEDER Gábor 


Re: [PATCH 6/8] diff.c: decouple white space treatment from move detection algorithm

2018-05-18 Thread Stefan Beller
On Thu, May 17, 2018 at 9:00 PM, Simon Ruderich  wrote:
> On Thu, May 17, 2018 at 12:46:51PM -0700, Stefan Beller wrote:
>> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
>> index bb9f1b7cd82..7b2527b9a19 100644
>> --- a/Documentation/diff-options.txt
>> +++ b/Documentation/diff-options.txt
>> @@ -292,6 +292,19 @@ dimmed_zebra::
>>   blocks are considered interesting, the rest is uninteresting.
>>  --
>>
>> +--color-moved-[no-]ignore-space-at-eol::
>> + Ignore changes in whitespace at EOL when performing the move
>> + detection for --color-moved.
>> +--color-moved-[no-]ignore-space-change::
>> + Ignore changes in amount of whitespace when performing the move
>> + detection for --color-moved.  This ignores whitespace
>> + at line end, and considers all other sequences of one or
>> + more whitespace characters to be equivalent.
>> +--color-moved-[no-]ignore-all-space::
>> + Ignore whitespace when comparing lines when performing the move
>> + detection for --color-moved.  This ignores differences even if
>> + one line has whitespace where the other line has none.
>> +
>>  --word-diff[=]::
>>   Show a word diff, using the  to delimit changed words.
>>   By default, words are delimited by whitespace; see
>
> Hello,
>
> I think it would be better to specify the options unabbreviated.
> Not being able to search the man page for
> "--color-moved-ignore-space-at-eol" or
> "--color-moved-no-ignore-space-at-eol" can be a major pain when
> looking for documentation. So maybe something like this instead:
>
>> +--color-moved-ignore-space-at-eol::
>> +--color-moved-no-ignore-space-at-eol::
>> + Ignore changes in whitespace at EOL when performing the move
>> + detection for --color-moved.

That makes sense.

Stepping back a bit, looking for similar precedents, we have lots of
"[no-]" strings in our documentation. But that is ok, as we the prefix
is at the beginning of the option ("--[no-]foo-bar"), such that searching
for the whole option without the leading dashes ("foo-bar") will still find
the option.

So maybe another option would be rename the negative options
to "--no-color-moved-ignore-space-at-eol", such that the documentation
could fall back to the old pattern of "--[no-]long-name".

Initially I was tempted on not choosing such names, as I viewed all
this white space options specific to the color-move feature, such that
a prefix of "--color-moved" might be desirable. Turning off one sub-feature
in that feature would naturally be --feature-no-subfeature instead of
negating the whole feature.

I also cannot find a good existing example for subfeatures in features,
they would usually come as --feature=

Undecided,
Stefan


благодаря

2018-05-18 Thread Michelle Goodman
Здравствуйте, дорогой, мне нужен ваш срочный ответ. У меня есть
хорошая новость для вас.
Мишель
благодаря


--diff-filter could use option for pre-rename files

2018-05-18 Thread Nathan Bush

I've been trying to hook together a build script and a GIT repo.

At one points it makes the following calls:

git diff --find-copies-harder --find-renames=101% --name-only 
--diff-filter=ACMRT 'refs/tags/0.0.25' 'refs/tags/0.0.27' > modified.txt
git diff --find-copies-harder --find-renames=101% --name-only 
--diff-filter=D 'refs/tags/0.0.25' 'refs/tags/0.0.27' > deleted.txt


If a file was renamed, but otherwise unchanged (and needs to be deleted 
by the build script), the files' original name does not appear on either 
list. Or, as far as I can tell, at all.


My current scripted workaround is something like

older = `git ls-tree --name-only --full-tree -r 'refs/tags/0.0.25'`
newer = `git ls-tree --name-only --full-tree -r 'refs/tags/0.0.27'`
deleted = older - newer
(puts deleted in deleted.txt)

Ideally this would just be a diff-filter option, say, --diff-filter=DE

-Nathan Bush



Re: Add option to git to ignore binary files unless force added

2018-05-18 Thread Anmol Sethi
How about a hook to ignore certain files? Then you could ignore based on the 
contents of the fail instead of just the extension. It’d be very flexible.

> On May 18, 2018, at 2:09 PM, Jacob Keller  wrote:
> 
> On Fri, May 18, 2018 at 4:31 AM, Anmol Sethi  wrote:
>> This definitely works but it would be more clean to just have git ignore the 
>> binary files from the get go.
>> 
> 
> Sure it'd be more convenient for you. But there are loads of possible
> combinations, and the idea of what constitutes unwanted files is
> hugely variable to each user. We don't really want to end up
> supporting a million different ways to do this, and the hooks
> interface provides a reasonable method for rejecting commits with
> unwanted contents.
> 
> 
> Thanks,
> Jake

-- 
Best,
Anmol



Re: Add option to git to ignore binary files unless force added

2018-05-18 Thread Jacob Keller
On Fri, May 18, 2018 at 4:31 AM, Anmol Sethi  wrote:
> This definitely works but it would be more clean to just have git ignore the 
> binary files from the get go.
>

Sure it'd be more convenient for you. But there are loads of possible
combinations, and the idea of what constitutes unwanted files is
hugely variable to each user. We don't really want to end up
supporting a million different ways to do this, and the hooks
interface provides a reasonable method for rejecting commits with
unwanted contents.


Thanks,
Jake


Re: [PATCH v2 13/14] merge: use commit-slab in merge remote desc instead of commit->util

2018-05-18 Thread Ramsay Jones


On 18/05/18 03:16, Junio C Hamano wrote:
> Nguyễn Thái Ngọc Duy   writes:
> 
>> diff --git a/commit.c b/commit.c
>> index 57049118a5..8202067cd5 100644
>> --- a/commit.c
>> +++ b/commit.c
>> @@ -1574,13 +1574,21 @@ int commit_tree_extended(const char *msg, size_t 
>> msg_len,
>>  return result;
>>  }
>>  
>> +define_commit_slab(merge_desc_slab, struct merge_remote_desc *);
>> +struct merge_desc_slab merge_desc_slab = COMMIT_SLAB_INIT(1, 
>> merge_desc_slab);
> 
> s/^/static /; otherwise sparse would complain?

Indeed, it already did (see [1]). ;-)

Your fixup, commit dc2172daed, on the 'nd/commit-util-to-slab'
branch has indeed fixed it up.

Thanks!

[1] 
https://public-inbox.org/git/e2c4276f-bcfd-faaa-f9ee-cb50e99da...@ramsayjones.plus.com/

ATB,
Ramsay Jones




Re: error(?) in "man git-stash" regarding "--keep-index"

2018-05-18 Thread Martin Ågren
On 18 May 2018 at 17:43, Robert P. J. Day  wrote:
> On Fri, 18 May 2018, Sybille Peters wrote:
>
>> My 2c on this:
>>
>> 1) "If the --keep-index option is used, all changes already added to
>>the index are left intact" (manpage git stash)
>>
>> That appears to be correct and clear
>
>   yup, that's not the issue.
>
>> 2) "$ git stash push --keep-index # save *all other* changes to the
>>stash"  (manpage git stash)
>>
>> That is either not correct or misleading. "All other" implies in my
>> opinion all changes except the ones that were already added. *"All
>> changes including already staged changes"* might be a better choice.

Thank you Sybille for formulating these two points.

>   yup, that is *exactly* the point i was trying to make.

Ah, this is about saving to the stash vs stashing away. The latter is
what `git stash` is all about -- stashing changes *away*. At least
according to my mental model and the top of the man-page.

But testing this -- not only from the point of view of the example, by
pushing and popping, but by actually investigating what is in the stash
-- finally makes me see what you mean. Yes, the whole lot gets saved to
the stash.

So there is a difference between what gets saved to the stash and what
gets removed from the working directory. The comment in the example
places itself somewhere in the middle by using the word "save" but
really talking about what gets dropped from the working tree.

The proposed wording does not really address that. It could be taken to
mean "all changes are saved to the stash; none are removed from the
working tree".

The work flow in the example is about temporarily stashing a few changes
(changes B) to test a couple of others (changes A). Whether the stash
entry contains changes A or not is practically irrelevant to the use
case. At pop-time, auto-merging will do the correct thing.

So how about "All changes are saved to the stash. Those that have been
added to the index are left intact in the working tree, all others are
removed from the working tree."? That's quite a lot of text. Maybe
"save all changes to the stash, make the working tree match the index"?

Or more to the point: "make the working directory match the index" or
"keep only what is in the index"?

Martin


Re: Troubles with picking an editor during Git update

2018-05-18 Thread Bartosz Konikiewicz
Git guidelines states that I should leave cc intact. I have altered
it, as I wanted to reply to both of you. I hope that my approach can
be considered acceptable.

Johannes, I see the following line in the piece of code you quoted:
  
EditorAvailable[GE_NotepadPlusPlus]:=RegQueryStringValue(HKEY_LOCAL_MACHINE,'SOFTWARE\Microsoft\Windows\CurrentVersion\App
Paths\notepad++.exe','',NotepadPlusPlusPath);

It mentions the following registry key, which was missing from my registry:
  HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\App
Paths\notepad++.exe

I reinstalled Notepad++ 6.6.9, which added the key and made the
installer acknowledge that Notepad++ is installed on my system.

Thanks for encouraging me to further investigate the issue. I have dig
deeper into the file that you linked for me. As far as my
comprehension goes, an EditorSelectionChanged procedure (line 1084) is
responsible for toggling the "next" button. I noticed that it gets
invoked in at least 2 situations:
  - when an user picks a text editor (an OnChange event, line 1119)
  - once the custom page for configuring the default Git editor is
initialized (the InitializeWizard procedure, line 1199)

My reasoning tells me that it would be applicable in the
NextButtonClick procedure as well. I'd add the following piece of code
to it:
  if (EditorPage<>NIL) and (CurPageID=EditorPage.ID) then begin
  EditorSelectionChanged(NIL);
  end;

I'd also remove the line 1199, as I think the change that I proposed
would render it redundant.

Philip, I feel that composing a pull request is beyond me. I came to a
conclusion that it would drain me of too much resources. Thanks for
the suggestion, as it would be my first pull request ever. I didn't
manage to achieve it, although I believe that I would be capable of
doing this, which already fills me with utter joy. I am willing to
post my report to Git for Windows issue tracker. Thanks for the
suggestion!


Re: error(?) in "man git-stash" regarding "--keep-index"

2018-05-18 Thread Robert P. J. Day
On Fri, 18 May 2018, Sybille Peters wrote:

> My 2c on this:
>
> 1) "If the --keep-index option is used, all changes already added to
>the index are left intact" (manpage git stash)
>
> That appears to be correct and clear

  yup, that's not the issue.

> 2) "$ git stash push --keep-index # save *all other* changes to the
>stash"  (manpage git stash)
>
> That is either not correct or misleading. "All other" implies in my
> opinion all changes except the ones that were already added. *"All
> changes including already staged changes"* might be a better choice.

  yup, that is *exactly* the point i was trying to make.

rday


RE: Add option to git to ignore binary files unless force added

2018-05-18 Thread Randall S. Becker
On May 18, 2018 7:31 AM, Anmol Sethi 
> That works but most binaries do not have a file extension. Its just not
> standard on linux.
> 
> > On May 17, 2018, at 8:37 AM, Randall S. Becker 
> wrote:
> >
> > On May 16, 2018 11:18 PM, Jacob Keller
> >> On Wed, May 16, 2018 at 5:45 PM, Anmol Sethi  wrote:
> >>> I think it’d be great to have an option to have git ignore binary
> >>> files. My
> >> repositories are always source only, committing a binary is always a
> mistake.
> >> At the moment, I have to configure the .gitignore to ignore every
> >> binary file and that gets tedious. Having git ignore all binary files 
> >> would be
> great.
> >>>
> >>> This could be achieved via an option in .gitconfig or maybe a
> >>> special line in
> >> .gitignore.
> >>>
> >>> I just want to never accidentally commit a binary again.
> >>
> >> I believe you can do a couple things. There should be a hook which
> >> you can modify to validate that there are no binary files on
> >> pre-commit[1], or pre- push[2] to verify that you never push commits
> with binaries in them.
> >>
> >> You could also implement the update hook on the server if you control
> >> it, to allow it to block pushes which contain binary files.
> >
> > What about configuring ${HOME}/.config/git/ignore instead (described at
> https://git-scm.com/docs/gitignore). Inside, put:
> >
> > *.o
> > *.exe
> > *.bin
> > *.dat
> > Etc

I have a similar problem on my platform, with a different solution. My builds 
involve GCC binaries, NonStop L-series binaries (x86), and a NonStop J-series 
binaries (itanium). To keep me sane, I have all build targets going to separate 
directories, like Build/GCC, Build/Lbin, Build/Jbin away from the sources. This 
allows me to ignore Build/ regardless of extension and also to build different 
targets without link collisions. This is similar to how Java works (a.k.a. 
bin/). Much more workable, IMHO, than trying to manage individual binaries name 
by name or even by extension. I also have a mix of jpg and UTF-16 HTML that 
would end up in false-positives on a pure binary match and I do want to manage 
those.

What helps me is that I do most of my work in ECLIPSE, so derived resources 
(objects, generated sources) get auto-ignored by EGit, if you can make your 
compiler arrange that - but that's an ECLIPSE thing not a file system thing.

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.





Re: error(?) in "man git-stash" regarding "--keep-index"

2018-05-18 Thread Sybille Peters

My 2c on this:

1) "If the --keep-index option is used, all changes already added to the 
index are left intact" (manpage git stash)


That appears to be correct and clear


2) "$ git stash push --keep-index    # save *all other* changes to the 
stash" (manpage git stash)


That is either not correct or misleading. "All other" implies in my 
opinion all changes


except the ones that were already added. *"All changes including already 
staged changes"*


might be a better choice.


Please also see open question on StackOverflow:

https://stackoverflow.com/questions/50242489/how-to-ignore-added-hunks-in-git-stash-p/




Re: git diff: meaning of ^M at line ends ?

2018-05-18 Thread Torsten Bögershausen
On 15.05.18 20:04, Frank Schäfer wrote:
> Am 14.05.2018 um 20:13 schrieb Torsten Bögershausen:
>> ^M is the representation of a "Carriage Return" or CR.
>> Under Linux/Unix/Mac OS X a line is terminated with a single
>> "line feed", LF.
>>
>> Windows typically uses CRLF at the end of the line.
>> "git diff" uses the LF to detect the end of line,
>> leaving the CR alone.
>>
>> Nothing to worry about.
> Thanks, I already suspected something like that.
> Has this behavior been changed/added recently ?

That is a good question.
There is, to my knowledge, no intentional change.

> I didn't observe it before, although the project I'm currently looking
> into has always been using CR+LF...

Do you mean that older versions did behave differently ?
Do you have a version number for the "old" handling ?

> 
> Why does the ^M only show up in '+' lines ?
> When changing the line end from CR+LF to LF, the diff looks like this:

> 
> -blahblah
> +blahblah
> 
> But I would expect it to be
> 
> -blahblah^M
> +blahblah

May be this helps (I haven't tested it) ?
git config  core.whitespace cr-at-eol



[PATCH 1/2] completion: don't return with error from __gitcomp_file_direct()

2018-05-18 Thread SZEDER Gábor
In __gitcomp_file_direct() we tell Bash that it should handle our
possible completion words as filenames with the following piece of
cleverness:

  # use a hack to enable file mode in bash < 4
  compopt -o filenames +o nospace 2>/dev/null ||
  compgen -f /non-existing-dir/ > /dev/null

Unfortunately, this makes this function always return with error
when it is not invoked in real completion, but e.g. in tests of
't9902-completion.sh':

  - First the 'compopt' line errors out
- either because in Bash v3.x there is no such command,
- or because in Bash v4.x it complains about "not currently
  executing completion function",

  - then 'compgen' just silently returns with error because of the
non-existing directory.

Since __gitcomp_file_direct() is now the last command executed in
__git_complete_index_file(), that function returns with error as well,
which prevents it from being invoked in tests directly as is, and
would require extra steps in test to hide its error code.

So let's make sure that __gitcomp_file_direct() doesn't return with
error, because in the tests coming in the following patch we do want
to exercise __git_complete_index_file() directly,

__gitcomp_file() contains the same construct, and thus it, too, always
returns with error.  Update that function accordingly as well.

While at it, also remove the space from between the redirection
operator and the filename in both functions.

Signed-off-by: SZEDER Gábor 
---
 contrib/completion/git-completion.bash | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 816901f0f0..8bc79a5226 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -420,7 +420,8 @@ __gitcomp_file_direct ()
 
# use a hack to enable file mode in bash < 4
compopt -o filenames +o nospace 2>/dev/null ||
-   compgen -f /non-existing-dir/ > /dev/null
+   compgen -f /non-existing-dir/ >/dev/null ||
+   true
 }
 
 # Generates completion reply with compgen from newline-separated possible
@@ -442,7 +443,8 @@ __gitcomp_file ()
 
# use a hack to enable file mode in bash < 4
compopt -o filenames +o nospace 2>/dev/null ||
-   compgen -f /non-existing-dir/ > /dev/null
+   compgen -f /non-existing-dir/ >/dev/null ||
+   true
 }
 
 # Execute 'git ls-files', unless the --committable option is specified, in
-- 
2.17.0.799.gd371044c7c



[PATCH 0/2] Test improvements for 'sg/complete-paths'

2018-05-18 Thread SZEDER Gábor
> > So, I think for v2 I will rewrite these tests to call
> > __git_complete_index_file() directly instead of using
> > 'test_completion', and will include a test with spaces in path names.
> 
> Quite well thought-out reasoning.  Thanks.

Unfortunately I couldn't get around to it soon enough, and now the
topic 'sg/complete-paths' is already in next, so here are those test
improvements on top.


SZEDER Gábor (2):
  completion: don't return with error from __gitcomp_file_direct()
  t9902-completion: exercise __git_complete_index_file() directly

 contrib/completion/git-completion.bash |   6 +-
 t/t9902-completion.sh  | 225 +
 2 files changed, 122 insertions(+), 109 deletions(-)

-- 
2.17.0.799.gd371044c7c



[PATCH 2/2] t9902-completion: exercise __git_complete_index_file() directly

2018-05-18 Thread SZEDER Gábor
The tests added in 2f271cd9cf (t9902-completion: add tests
demonstrating issues with quoted pathnames, 2018-05-08) and in
2ab6eab4fe (completion: improve handling quoted paths in 'git
ls-files's output, 2018-03-28) have a few shortcomings:

  - All these test use the 'test_completion' helper function, thus
they are exercising the whole completion machinery, although they
are only interested in how git-aware path completion, specifically
the __git_complete_index_file() function deals with unusual
characters in pathnames and on the command line.

  - These tests can't satisfactorily test the case of pathnames
containing spaces, because 'test_completion' gets the words on the
command line as a single argument and it uses space as word
separator.

  - Some of the tests are protected by different FUNNYNAMES_* prereqs
depending on whether they put backslashes and double quotes or
separator characters (FS, GS, RS, US) in pathnames, although a
filesystem not allowing one likely doesn't allow the others
either.

  - One of the tests operates on paths containing '|' and '&'
characters without being protected by a FUNNYNAMES prereq, but
some filesystems (notably on Windows) don't allow these characters
in pathnames, either.

Replace these tests with basically equivalent, more focused tests that
call __git_complete_index_file() directly.  Since this function only
looks at the current word to be completed, i.e. the $cur variable, we
can easily include pathnames containing spaces in the tests, so use
such pathnames instead of pathnames containing '|' and '&'.  Finally,
use only a single FUNNYNAMES prereq for all kinds of special
characters.

Signed-off-by: SZEDER Gábor 
---
 t/t9902-completion.sh | 225 ++
 1 file changed, 118 insertions(+), 107 deletions(-)

diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 955932174c..1b6d275254 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1209,6 +1209,124 @@ test_expect_success 'teardown after ref completion' '
git remote remove other
 '
 
+
+test_path_completion ()
+{
+   test $# = 2 || error "bug in the test script: not 2 parameters to 
test_path_completion"
+
+   local cur="$1" expected="$2"
+   echo "$expected" >expected &&
+   (
+   # In the following tests calling this function we only
+   # care about how __git_complete_index_file() deals with
+   # unusual characters in path names.  By requesting only
+   # untracked files we dont have to bother adding any
+   # paths to the index in those tests.
+   __git_complete_index_file --others &&
+   print_comp
+   ) &&
+   test_cmp expected out
+}
+
+test_expect_success 'setup for path completion tests' '
+   mkdir simple-dir \
+ "spaces in dir" \
+ árvíztűrő &&
+   touch simple-dir/simple-file \
+ "spaces in dir/spaces in file" \
+ "árvíztűrő/Сайн яваарай" &&
+   if test_have_prereq !MINGW &&
+  mkdir BS\\dir \
+'$'separators\034in\035dir'' &&
+  touch BS\\dir/DQ\"file \
+'$'separators\034in\035dir/sep\036in\037file''
+   then
+   test_set_prereq FUNNYNAMES
+   else
+   rm -rf BS\\dir '$'separators\034in\035dir''
+   fi
+'
+
+test_expect_success '__git_complete_index_file - simple' '
+   test_path_completion simple simple-dir &&  # Bash is supposed to
+  # add the trailing /.
+   test_path_completion simple-dir/simple simple-dir/simple-file
+'
+
+test_expect_success \
+'__git_complete_index_file - escaped characters on cmdline' '
+   test_path_completion spac "spaces in dir" &&  # Bash will turn this
+ # into "spaces\ in\ dir"
+   test_path_completion "spaces\\ i" \
+"spaces in dir" &&
+   test_path_completion "spaces\\ in\\ dir/s" \
+"spaces in dir/spaces in file" &&
+   test_path_completion "spaces\\ in\\ dir/spaces\\ i" \
+"spaces in dir/spaces in file"
+'
+
+test_expect_success \
+'__git_complete_index_file - quoted characters on cmdline' '
+   # Testing with an opening but without a corresponding closing
+   # double quote is important.
+   test_path_completion \"spac "spaces in dir" &&
+   test_path_completion "\"spaces i" \
+"spaces in dir" &&
+   test_path_completion "\"spaces in dir/s" \
+"spaces in dir/spaces in file" &&
+   test_path_completion "\"spaces in dir/spaces i" \
+"spaces in dir/spaces in file"
+'
+
+test_expect_success '__git_complete_index_file - UTF-8 in ls-files output' '
+   

Re: error(?) in "man git-stash" regarding "--keep-index"

2018-05-18 Thread Martin Ågren
On 18 May 2018 at 12:51, Robert P. J. Day  wrote:
> On Fri, 18 May 2018, Martin Ågren wrote:
>
>> On 18 May 2018 at 11:37, Robert P. J. Day  wrote:
>> >
>> >   toward the bottom of "man git-stash", one reads part of an example:
>> >
>> >   # ... hack hack hack ...
>> >   $ git add --patch foo# add just first part to the index
>> >   $ git stash push --keep-index# save all other changes to the stash
>> >   ^ ???
>> >
>> > i thought that, even if "--keep-index" left staged changes in the
>> > index, it still included those staged changes in the stash. that's
>> > not the impression one gets from the above.

If I try to follow the example, it does exactly what it purports to do
and I understand why I would want to use this exact technique with
`--keep-index` in the situation outlined: "you want to make two or more
commits out of the changes in the work tree, and you want to test each
change before committing"

So I claim that the example is correct in the sense that it describes
what happens when one uses git in the real world. Another question is
whether the example and the real-world behavior match the impression
you have from elsewhere, e.g., from earlier in the man-page.

That's why I asked this:

>> So would the error be in the part of the man-page quoted below?
>>
>>   If the --keep-index option is used, all changes already added to
>>   the index are left intact.
>
>   no, that part is correct, it clearly(?) states that staged changes
> are left where they are, in the index. i submit that the misleading
> part is in the example i quoted, which suggests that only the "other"
> changes are saved to the stash -- that is, the changes other than what
> is already in the index.

Could you try the example? I think it is important that we find whether
the source of confusion is the example or the earlier explanation
quoted just here.

>> That is, this doesn't say *where* things are left intact (in the
>> index? in the working tree?).
>
>   in that case, that's something that could obviously be clarified.

Would you suggest adding something like "both in the index and in the
working tree" to what I quoted above?

>> The man-page does start with
>>
>>   git-stash - Stash the changes in a dirty working directory away
>>
>> which to me suggests that "leaving something intact" refers to
>> precisely this -- the working directory.
>>
>> Or is it the name of the option that trips you up? That is, you read
>> the name as `--keep-the-index-as-is-but-stash-as-usual`, as opposed
>> to `--keep-what-is-already-in-the-index-around`?
>>
>> While I'm sure that some clarification could be provided, I'm tempted to
>> argue that is exactly what the example provides that you quoted from.
>
>   i guess we can agree to disagree, since i think the snippet of the
> example i provided gives a misleading explanation.

We can disagree on many things, but I would very much like us to agree
on whether the example correctly describes what the command does, or
not.

Then, if the error or source of confusion is not there, then let's
identify it and see if we can find out how to address it.

I suggest that the source of confusion is here:

  If the --keep-index option is used, all changes already added to
  the index are left intact.

Martin


Re: [PATCH] git-p4: add options --commit and --disable-rebase

2018-05-18 Thread Luke Diamand
On 9 May 2018 at 16:32, Romain Merland  wrote:
> On a daily work with multiple local git branches, the usual way to submit 
> only a
> specified commit was to cherry-pick the commit on master then run git-p4 
> submit.
> It can be very annoying to switch between local branches and master, only to
> submit one commit.
> The proposed new way is to select directly the commit you want to submit.
>
> add option --commit to command 'git-p4 submit' in order to submit only 
> specified commit(s) in p4.
>
> On a daily work developping software with big compilation time, one may not 
> want
> to rebase on his local git tree, in order to avoid long recompilation.
>
> add option --disable-rebase to command 'git-p4 submit' in order to disable 
> rebase after submission.

I've been using this for real and it works well for me. Ack.

Because of the way I'm using git-p4, the --disable-rebase option
doesn't really help me - I really need a --disable-sync option but
that's a different feature.

Thanks
Luke



> ---
>  Documentation/git-p4.txt | 14 ++
>  git-p4.py| 29 +++--
>  t/t9807-git-p4-submit.sh | 40 
>  3 files changed, 77 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
> index d8c8f11c9..88d109deb 100644
> --- a/Documentation/git-p4.txt
> +++ b/Documentation/git-p4.txt
> @@ -149,6 +149,12 @@ To specify a branch other than the current one, use:
>  $ git p4 submit topicbranch
>  
>
> +To specify a single commit or a range of commits, use:
> +
> +$ git p4 submit --commit 
> +$ git p4 submit --commit 
> +
> +
>  The upstream reference is generally 'refs/remotes/p4/master', but can
>  be overridden using the `--origin=` command-line option.
>
> @@ -330,6 +336,14 @@ These options can be used to modify 'git p4 submit' 
> behavior.
> p4/master.  See the "Sync options" section above for more
> information.
>
> +--commit |::
> +Submit only the specified commit or range of commits, instead of the full
> +list of changes that are in the current Git branch.
> +
> +--disable-rebase::
> +Disable the automatic rebase after all commits have been successfully
> +submitted.
> +
>  Rebase options
>  ~~
>  These options can be used to modify 'git p4 rebase' behavior.
> diff --git a/git-p4.py b/git-p4.py
> index 7bb9cadc6..f4a6f3b4c 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -1352,7 +1352,12 @@ class P4Submit(Command, P4UserMap):
>  optparse.make_option("--update-shelve", 
> dest="update_shelve", action="append", type="int",
>   metavar="CHANGELIST",
>   help="update an existing shelved 
> changelist, implies --shelve, "
> -   "repeat in-order for multiple 
> shelved changelists")
> +   "repeat in-order for multiple 
> shelved changelists"),
> +optparse.make_option("--commit", dest="commit", 
> metavar="COMMIT",
> + help="submit only the specified 
> commit(s), one commit or xxx..xxx"),
> +optparse.make_option("--disable-rebase", 
> dest="disable_rebase", action="store_true",
> + help="Disable rebase after submit is 
> completed. Can be useful if you "
> + "work from a local git branch that is 
> not master")
>  ]
>  self.description = "Submit changes from git to the perforce depot."
>  self.usage += " [name of git branch to submit into perforce depot]"
> @@ -1362,6 +1367,8 @@ class P4Submit(Command, P4UserMap):
>  self.dry_run = False
>  self.shelve = False
>  self.update_shelve = list()
> +self.commit = ""
> +self.disable_rebase = False
>  self.prepare_p4_only = False
>  self.conflict_behavior = None
>  self.isWindows = (platform.system() == "Windows")
> @@ -2103,9 +2110,18 @@ class P4Submit(Command, P4UserMap):
>  else:
>  commitish = 'HEAD'
>
> -for line in read_pipe_lines(["git", "rev-list", "--no-merges", 
> "%s..%s" % (self.origin, commitish)]):
> -commits.append(line.strip())
> -commits.reverse()
> +if self.commit != "":
> +if self.commit.find("..") != -1:
> +limits_ish = self.commit.split("..")
> +for line in read_pipe_lines(["git", "rev-list", 
> "--no-merges", "%s..%s" % (limits_ish[0], limits_ish[1])]):
> +commits.append(line.strip())
> +commits.reverse()
> +else:
> +commits.append(self.commit)
> +else:
> +for line in read_pipe_lines(["git", "rev-list", "--no-merges", 
> "%s..%s" % (self.origin, commitish)]):
> +   

Re: Add option to git to ignore binary files unless force added

2018-05-18 Thread Anmol Sethi
This definitely works but it would be more clean to just have git ignore the 
binary files from the get go.

> On May 16, 2018, at 11:18 PM, Jacob Keller  wrote:
> 
> On Wed, May 16, 2018 at 5:45 PM, Anmol Sethi  wrote:
>> I think it’d be great to have an option to have git ignore binary files. My 
>> repositories are always source only, committing a binary is always a 
>> mistake. At the moment, I have to configure the .gitignore to ignore every 
>> binary file and that gets tedious. Having git ignore all binary files would 
>> be great.
>> 
>> This could be achieved via an option in .gitconfig or maybe a special line 
>> in .gitignore.
>> 
>> I just want to never accidentally commit a binary again.
>> 
>> --
>> Best,
>> Anmol
>> 
> 
> I believe you can do a couple things. There should be a hook which you
> can modify to validate that there are no binary files on
> pre-commit[1], or pre-push[2] to verify that you never push commits
> with binaries in them.
> 
> You could also implement the update hook on the server if you control
> it, to allow it to block pushes which contain binary files.
> 
> Thanks,
> Jake
> 
> [1]https://git-scm.com/docs/githooks#_pre_commit
> [2]https://git-scm.com/docs/githooks#_pre_push
> [3]https://git-scm.com/docs/githooks#update

-- 
Best,
Anmol



Re: Add option to git to ignore binary files unless force added

2018-05-18 Thread Anmol Sethi
That works but most binaries do not have a file extension. Its just not 
standard on linux.

> On May 17, 2018, at 8:37 AM, Randall S. Becker  wrote:
> 
> On May 16, 2018 11:18 PM, Jacob Keller
>> On Wed, May 16, 2018 at 5:45 PM, Anmol Sethi  wrote:
>>> I think it’d be great to have an option to have git ignore binary files. My
>> repositories are always source only, committing a binary is always a mistake.
>> At the moment, I have to configure the .gitignore to ignore every binary file
>> and that gets tedious. Having git ignore all binary files would be great.
>>> 
>>> This could be achieved via an option in .gitconfig or maybe a special line 
>>> in
>> .gitignore.
>>> 
>>> I just want to never accidentally commit a binary again.
>> 
>> I believe you can do a couple things. There should be a hook which you can
>> modify to validate that there are no binary files on pre-commit[1], or pre-
>> push[2] to verify that you never push commits with binaries in them.
>> 
>> You could also implement the update hook on the server if you control it, to
>> allow it to block pushes which contain binary files.
> 
> What about configuring ${HOME}/.config/git/ignore instead (described at 
> https://git-scm.com/docs/gitignore). Inside, put:
> 
> *.o
> *.exe
> *.bin
> *.dat
> Etc
> 
> Cheers,
> Randall
> 
> 

-- 
Best,
Anmol



Re: error(?) in "man git-stash" regarding "--keep-index"

2018-05-18 Thread Robert P. J. Day
On Fri, 18 May 2018, Martin Ågren wrote:

> On 18 May 2018 at 11:37, Robert P. J. Day  wrote:
> >
> >   toward the bottom of "man git-stash", one reads part of an example:
> >
> >   # ... hack hack hack ...
> >   $ git add --patch foo# add just first part to the index
> >   $ git stash push --keep-index# save all other changes to the stash
> >   ^ ???
> >
> > i thought that, even if "--keep-index" left staged changes in the
> > index, it still included those staged changes in the stash. that's
> > not the impression one gets from the above.
>
> So would the error be in the part of the man-page quoted below?
>
>   If the --keep-index option is used, all changes already added to
>   the index are left intact.

  no, that part is correct, it clearly(?) states that staged changes
are left where they are, in the index. i submit that the misleading
part is in the example i quoted, which suggests that only the "other"
changes are saved to the stash -- that is, the changes other than what
is already in the index.

> That is, this doesn't say *where* things are left intact (in the
> index? in the working tree?).

  in that case, that's something that could obviously be clarified.

> The man-page does start with
>
>   git-stash - Stash the changes in a dirty working directory away
>
> which to me suggests that "leaving something intact" refers to
> precisely this -- the working directory.
>
> Or is it the name of the option that trips you up? That is, you read
> the name as `--keep-the-index-as-is-but-stash-as-usual`, as opposed
> to `--keep-what-is-already-in-the-index-around`?
>
> While I'm sure that some clarification could be provided, I'm tempted to
> argue that is exactly what the example provides that you quoted from.

  i guess we can agree to disagree, since i think the snippet of the
example i provided gives a misleading explanation.

rday

Re: error(?) in "man git-stash" regarding "--keep-index"

2018-05-18 Thread Martin Ågren
On 18 May 2018 at 11:37, Robert P. J. Day  wrote:
>
>   toward the bottom of "man git-stash", one reads part of an example:
>
>   # ... hack hack hack ...
>   $ git add --patch foo# add just first part to the index
>   $ git stash push --keep-index# save all other changes to the stash
>   ^ ???
>
> i thought that, even if "--keep-index" left staged changes in the
> index, it still included those staged changes in the stash. that's not
> the impression one gets from the above.

So would the error be in the part of the man-page quoted below?

  If the --keep-index option is used, all changes already added to
  the index are left intact.

That is, this doesn't say *where* things are left intact (in the index?
in the working tree?). The man-page does start with

  git-stash - Stash the changes in a dirty working directory away

which to me suggests that "leaving something intact" refers to precisely
this -- the working directory.

Or is it the name of the option that trips you up? That is, you read the
name as `--keep-the-index-as-is-but-stash-as-usual`, as opposed to
`--keep-what-is-already-in-the-index-around`?

While I'm sure that some clarification could be provided, I'm tempted to
argue that is exactly what the example provides that you quoted from.

Martin


благодаря

2018-05-18 Thread MICHELLE GOODMAN
Здравствуйте, дорогой, мне нужен ваш срочный ответ. У меня есть
хорошая новость для вас.
Мишель
благодаря


error(?) in "man git-stash" regarding "--keep-index"

2018-05-18 Thread Robert P. J. Day

  toward the bottom of "man git-stash", one reads part of an example:

  # ... hack hack hack ...
  $ git add --patch foo# add just first part to the index
  $ git stash push --keep-index# save all other changes to the stash
  ^ ???

i thought that, even if "--keep-index" left staged changes in the
index, it still included those staged changes in the stash. that's not
the impression one gets from the above.

rday


[PATCH RFC v2 1/4] ref-filter: start using oid_object_info

2018-05-18 Thread Olga Telezhnaya
Start using oid_object_info_extended(). So, if info from this function
is enough, we do not need to get and parse whole object (as it was before).
It's good for 3 reasons:
1. Some Git commands potentially will work faster.
2. It's much easier to add support for objectsize:disk and deltabase.
   (I have plans to add this support further)
3. It's easier to move formatting from cat-file command to this logic
   (It pretends to be unified formatting logic in the end)

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 47 +++
 ref-filter.h | 21 +
 2 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 39e2744c949bb..4008351553391 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -100,6 +100,7 @@ static struct used_atom {
} u;
 } *used_atom;
 static int used_atom_cnt, need_tagged, need_symref;
+static struct expand_data oi_data;
 
 /*
  * Expand string, append it to strbuf *sb, then return error code ret.
@@ -267,6 +268,22 @@ static int contents_atom_parser(const struct ref_format 
*format, struct used_ato
return 0;
 }
 
+static int objecttype_atom_parser(const struct ref_format *format, struct 
used_atom *atom,
+ const char *arg, struct strbuf *unused_err)
+{
+   oi_data.use_data = 1;
+   oi_data.info.typep = _data.type;
+   return 0;
+}
+
+static int objectsize_atom_parser(const struct ref_format *format, struct 
used_atom *atom,
+ const char *arg, struct strbuf *unused_err)
+{
+   oi_data.use_data = 1;
+   oi_data.info.sizep = _data.size;
+   return 0;
+}
+
 static int objectname_atom_parser(const struct ref_format *format, struct 
used_atom *atom,
  const char *arg, struct strbuf *err)
 {
@@ -383,9 +400,9 @@ static struct {
int (*parser)(const struct ref_format *format, struct used_atom *atom,
  const char *arg, struct strbuf *err);
 } valid_atom[] = {
-   { "refname" , FIELD_STR, refname_atom_parser },
-   { "objecttype" },
-   { "objectsize", FIELD_ULONG },
+   { "refname", FIELD_STR, refname_atom_parser },
+   { "objecttype", FIELD_STR, objecttype_atom_parser },
+   { "objectsize", FIELD_ULONG, objectsize_atom_parser },
{ "objectname", FIELD_STR, objectname_atom_parser },
{ "tree" },
{ "parent" },
@@ -1207,7 +1224,8 @@ static void fill_missing_values(struct atom_value *val)
  */
 static void grab_values(struct atom_value *val, int deref, struct object *obj, 
void *buf, unsigned long sz)
 {
-   grab_common_values(val, deref, obj, buf, sz);
+   if (deref || !oi_data.use_data)
+   grab_common_values(val, deref, obj, buf, sz);
switch (obj->type) {
case OBJ_TAG:
grab_tag_values(val, deref, obj, buf, sz);
@@ -1536,6 +1554,13 @@ static int populate_value(struct ref_array_item *ref, 
struct strbuf *err)
continue;
} else if (!deref && grab_objectname(name, >objectname, v, 
atom)) {
continue;
+   } else if (!deref && !strcmp(name, "objecttype") && 
oi_data.use_data) {
+   v->s = type_name(oi_data.type);
+   continue;
+   } else if (!deref && !strcmp(name, "objectsize") && 
oi_data.use_data) {
+   v->value = oi_data.size;
+   v->s = xstrfmt("%lu", oi_data.size);
+   continue;
} else if (!strcmp(name, "HEAD")) {
if (atom->u.head && !strcmp(ref->refname, atom->u.head))
v->s = "*";
@@ -2156,8 +2181,17 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct 
ref_array_item *a, stru
int (*cmp_fn)(const char *, const char *);
struct strbuf err = STRBUF_INIT;
 
+   oi_data.oid = a->objectname;
+   if (oi_data.use_data &&
+   oid_object_info_extended(_data.oid, _data.info, 
OBJECT_INFO_LOOKUP_REPLACE) < 0)
+   die(_("missing object %s for %s"), oid_to_hex(>objectname), 
a->refname);
if (get_ref_atom_value(a, s->atom, , ))
die("%s", err.buf);
+
+   oi_data.oid = b->objectname;
+   if (oi_data.use_data &&
+   oid_object_info_extended(_data.oid, _data.info, 
OBJECT_INFO_LOOKUP_REPLACE) < 0)
+   die(_("missing object %s for %s"), oid_to_hex(>objectname), 
b->refname);
if (get_ref_atom_value(b, s->atom, , ))
die("%s", err.buf);
strbuf_release();
@@ -2226,6 +2260,11 @@ int format_ref_array_item(struct ref_array_item *info,
 {
const char *cp, *sp, *ep;
struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
+   oi_data.oid = info->objectname;
+   if (oi_data.use_data &&
+   oid_object_info_extended(_data.oid, 

[PATCH RFC v2 2/4] ref-filter: add objectsize:disk formatting option

2018-05-18 Thread Olga Telezhnaya
Add %(objectsize:disk) support. It is still not working for deref:
I am thinking how to support it in a more elegant way.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4008351553391..c00de58455301 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -277,10 +277,15 @@ static int objecttype_atom_parser(const struct ref_format 
*format, struct used_a
 }
 
 static int objectsize_atom_parser(const struct ref_format *format, struct 
used_atom *atom,
- const char *arg, struct strbuf *unused_err)
+ const char *arg, struct strbuf *err)
 {
+   if (!arg)
+   oi_data.info.sizep = _data.size;
+   else if (!strcmp(arg, "disk"))
+   oi_data.info.disk_sizep = _data.disk_size;
+   else
+   return strbuf_addf_ret(err, -1, _("unrecognized %%(objectsize) 
argument: %s"), arg);
oi_data.use_data = 1;
-   oi_data.info.sizep = _data.size;
return 0;
 }
 
@@ -1557,9 +1562,15 @@ static int populate_value(struct ref_array_item *ref, 
struct strbuf *err)
} else if (!deref && !strcmp(name, "objecttype") && 
oi_data.use_data) {
v->s = type_name(oi_data.type);
continue;
-   } else if (!deref && !strcmp(name, "objectsize") && 
oi_data.use_data) {
-   v->value = oi_data.size;
-   v->s = xstrfmt("%lu", oi_data.size);
+   } else if (!deref && starts_with(name, "objectsize") && 
oi_data.use_data) {
+   if (!strcmp(name, "objectsize")) {
+   v->value = oi_data.size;
+   v->s = xstrfmt("%lu", oi_data.size);
+   } else {
+   /* It can be only objectsize:disk, we checked 
it in parser */
+   v->value = oi_data.disk_size;
+   v->s = xstrfmt("%lu", oi_data.disk_size);
+   }
continue;
} else if (!strcmp(name, "HEAD")) {
if (atom->u.head && !strcmp(ref->refname, atom->u.head))

--
https://github.com/git/git/pull/493


[PATCH RFC v2 3/4] ref-filter: add tests for objectsize:disk

2018-05-18 Thread Olga Telezhnaya
Add tests for %(objectsize:disk) atom.

Signed-off-by: Olga Telezhnaia 
---
 t/t6300-for-each-ref.sh | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 295d1475bde01..570bb606045d7 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -83,6 +83,7 @@ test_atom head push:strip=1 remotes/myfork/master
 test_atom head push:strip=-1 master
 test_atom head objecttype commit
 test_atom head objectsize 171
+test_atom head objectsize:disk 138
 test_atom head objectname $(git rev-parse refs/heads/master)
 test_atom head objectname:short $(git rev-parse --short refs/heads/master)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
@@ -124,6 +125,7 @@ test_atom tag upstream ''
 test_atom tag push ''
 test_atom tag objecttype tag
 test_atom tag objectsize 154
+test_atom tag objectsize:disk 138
 test_atom tag objectname $(git rev-parse refs/tags/testtag)
 test_atom tag objectname:short $(git rev-parse --short refs/tags/testtag)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)

--
https://github.com/git/git/pull/493


[PATCH RFC v2 4/4] ref-filter: add deltabase formatting option

2018-05-18 Thread Olga Telezhnaya
Add %(deltabase) support. It is still not working for deref:
I am thinking how to support it in a more elegant way.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/ref-filter.c b/ref-filter.c
index c00de58455301..989ccdb356a32 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -276,6 +276,14 @@ static int objecttype_atom_parser(const struct ref_format 
*format, struct used_a
return 0;
 }
 
+static int deltabase_atom_parser(const struct ref_format *format, struct 
used_atom *atom,
+ const char *arg, struct strbuf *unused_err)
+{
+   oi_data.use_data = 1;
+   oi_data.info.delta_base_sha1 = oi_data.delta_base_oid.hash;
+   return 0;
+}
+
 static int objectsize_atom_parser(const struct ref_format *format, struct 
used_atom *atom,
  const char *arg, struct strbuf *err)
 {
@@ -409,6 +417,7 @@ static struct {
{ "objecttype", FIELD_STR, objecttype_atom_parser },
{ "objectsize", FIELD_ULONG, objectsize_atom_parser },
{ "objectname", FIELD_STR, objectname_atom_parser },
+   { "deltabase", FIELD_STR, deltabase_atom_parser },
{ "tree" },
{ "parent" },
{ "numparent", FIELD_ULONG },
@@ -1572,6 +1581,9 @@ static int populate_value(struct ref_array_item *ref, 
struct strbuf *err)
v->s = xstrfmt("%lu", oi_data.disk_size);
}
continue;
+   } else if (!deref && !strcmp(name, "deltabase") && 
oi_data.use_data) {
+   v->s = xstrdup(oid_to_hex(_data.delta_base_oid));
+   continue;
} else if (!strcmp(name, "HEAD")) {
if (atom->u.head && !strcmp(ref->refname, atom->u.head))
v->s = "*";

--
https://github.com/git/git/pull/493


[PATCH RFC v2 0/4] ref-filter: start using oid_object_info

2018-05-18 Thread Оля Тележная
Hello,
Discussion starts here: [1], [2].

Updates:

I fixed memory leaks, now we use data from oid_object_info() OR from
grab_commom_values() - not both as it was before.

I added support for "%(objectsize:disk)" atom (with tests) and for
"%(deltabase)" atom (without tests yet).
I didn't support "%(*objectsize:disk)" and "%(*deltabase)" because I
can't see the possibility to fill these fields at the same time with
other processes.
I really want to discuss this moment with someone. The best idea that
I have is to fully change grab_common_values() function so that it
would collect and fill all info from oid_object_info_extended(), and
we will invoke it before we actually get and parse full object.

I am not sure with this proposed solution at all: we would still have
the scenario when we call both oid_object_info_extended() and
get_object(), it means the performance will be worse in some cases. I
don't know if it's crucial.

By the way, do we have an official way to measure performance?

I didn't add tests for %(deltabase) atom because I am not sure that it
has sense for for-each-ref at all: I have a guess that I added this
atom only for future using in cat-file command, and also for deref
option %(*deltabase).

I was also thinking about adding new parameter to atom, something like
enum "need oid_object_info", "need get_object" etc, so that we could
collect all needed data in a shortest way. The problem is that the
code will be even more complicated, and I am not sure our goal worth
it. And, anyway, there would be the scenario when we have to call both
oid_object_info_extended() and get_object(): for example, if we have
format="%(objectsize:disk) %(author)".

Thanks a lot,
Waiting for any ideas/comments!

[1] 
https://public-inbox.org/git/CAL21Bm=6Z54-zsUq0DJqmqhSciHCDLUNXR8inDMAd-b-=qj...@mail.gmail.com/
[2] https://public-inbox.org/git/xmqq8t8la81a@gitster-ct.c.googlers.com/


Re: [PATCH v5 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)'

2018-05-18 Thread Junio C Hamano
Taylor Blau  writes:

>   * `git grep --and -e foo -e bar`. This binary operation should recur
> on its sub-expressions and take the minimum of the starting offset
> and the maximum of the ending offset.

We use infix notation, so the above is "git grep -e foo --and -e
bar" actually ;-).

But you raise an interesting point.  A line with "hello foo bar baz"
on it would match, so does a line with "goodbye bar baz foo", as
both of them hits pattern "foo" *and* pattern "bar".  It is not
quite clear what it means to "show the first hit on the line".  One
interpretation would be to take the minimum span that makes both
sides of "--and" happy (your "minimum of start, maximum of end").
Another might be to pick "foo" in the first and "bar" in the second
line, as that is the "first hit" on the line, which is consistent
with how "git grep -e foo" would say about "a foo b foo c foo" (I
expect that the leftmost "foo" would be the first hit).  So there
may be multiple, equally plausible answer to the question.

> For inputs of the form "foobar" and "foo bar", it will do the right
> thing (give the starting and ending offset for "foobar" and give no
> match, respectively).

I think I agree with "foobar", but I do not understand why there is
no match for "foo bar".

>   * `git grep --or -e foo -e bar`. This is the most complicated case, in
> my opinion. In going with the min/max idea in the and case above, I
> think that `--or` should also min/max its sub-expressions, but in
> fact we short-circuit evaluating the second sub-expression when we
> find a match for the first.

I am not sure I follow.  "git grep -e foo --or -e bar" is just a
longhand for "git grep -e foo -e bar".  Shouldn't it highlight
whichever between foo and bar that appears leftmost on the line?

> So, in cases like matching `--or -e foo -e bar` with "foo baz bar",
> we'll do the right thing, since `foo` is the first sub-expression
> and happens to be the left-most match. In other words, we __adhere
> to our answer with the left-most match first__ semantics, but only
> because __the first sub-expression is the left-most match__.
>
> In the other case where we try and match the same expression against
> "bar baz foo", we'll return the starting offset of "foo", even
> though it isn't the left-most match, violating our semantics.

I am not sure why you think your min-starting/max-ending would lead
to such a conclusion.  'foo baz bar' would be covered in its
entirety, 'bar baz foo' would also, as starting of hits with pattern
'foo' and pattern 'bar' would be 'b' in 'bar' on that three-word
line, and ending of hits with these two patterns would be the last
'o' in 'foo' on the line.  

I'd expect that a line 'foo baz bar' matched against "-e foo --or -e
bar" would say "among three words on me, 'f' in foo is the first
location of the match", though.


  1   2   >