date:20180628

[PATCH 1/3] .mailmap: merge different spellings of names

2018-06-28 Thread Stefan Beller

This is a continuation of 94b410bba86 (.mailmap: Map email
addresses to names, 2013-07-12), merging names that are
spelled differently but have the same author email to the
same person.

Most spellings differed in accents or the order of names.

Signed-off-by: Stefan Beller 
---
 .mailmap | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/.mailmap b/.mailmap
index df7cf6313c7..f165222a782 100644
--- a/.mailmap
+++ b/.mailmap
@@ -35,11 +35,13 @@ Chris Wright  
 Cord Seele  
 Christian Couder  
 Christian Stimming  
+Christopher Díaz Riveros  Christopher Diaz Riveros
 Csaba Henk  
 Dan Johnson 
 Dana L. How  
 Dana L. How  Dana How
 Daniel Barkalow 
+Daniel Knittl-Frank  knittl
 Daniel Trstenjak  
 Daniel Trstenjak  
 David Brown  
@@ -57,6 +59,7 @@ Eric S. Raymond 
 Eric Wong  
 Erik Faye-Lund  
 Eyvind Bernhardsen  
+Fangyi Zhou  Zhou Fangyi
 Florian Achleitner  

 Franck Bui-Huu  
 Frank Lichtenheld  
@@ -86,6 +89,8 @@ Jason McMullan 
 Jason Riedy  
 Jason Riedy  
 Jay Soffian  
+Jean-Noël Avila  Jean-Noel Avila
+Jean-Noël Avila  Jean-Noël AVILA
 Jeff King  
 Jeff Muizelaar  
 Jens Axboe  
@@ -149,6 +154,7 @@ Matt Draisey  
 Matt Kraai  
 Matt McCutchen  
 Matthias Kestenholz  
+Matthias Rüster  Matthias Ruester
 Matthias Urlichs  
 Matthias Urlichs  
 Michael Coleman 
@@ -213,6 +219,8 @@ Sean Estabrooks 
 Sebastian Schuberth  
 Seth Falcon  
 Shawn O. Pearce 
+Wei Shuyu  Shuyu Wei
+Sidhant Sharma  Sidhant Sharma [:tk]
 Simon Hausmann  
 Simon Hausmann  
 Stefan Beller  
@@ -253,7 +261,8 @@ Uwe Kleine-König  
 
 Uwe Kleine-König  

 Ville Skyttä  
-Vitaly "_Vi" Shukela 
+Vitaly "_Vi" Shukela  
+Vitaly "_Vi" Shukela  Vitaly _Vi Shukela
 W. Trevor King  
 William Pursell 
 YONETANI Tomokazu  
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH 2/3] .mailmap: assume Jason McMullan to be the same person

2018-06-28 Thread Stefan Beller

Over the years of contributing to open source, I realized the world of
open source is smaller than I originally thought and a name is still
a pretty unique thing. So let's assume these two author idents are the
same person.

In 10813e0d3c7 (.mailmap: update long-lost friends with multiple defunct
addresses, 2013-08-12) we learned both their email bounce, so asking is
no option.

Signed-off-by: Stefan Beller 
---
 .mailmap | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/.mailmap b/.mailmap
index f165222a782..ff96ef7401f 100644
--- a/.mailmap
+++ b/.mailmap
@@ -82,10 +82,7 @@ J. Bruce Fields  

 J. Bruce Fields  
 Jakub Narębski 
 James Y Knight  
-# The 2 following authors are probably the same person,
-# but both emails bounce.
-Jason McMullan 
-Jason McMullan 
+Jason McMullan  Jason McMullan 

 Jason Riedy  
 Jason Riedy  
 Jay Soffian  
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH 3/3] .mailmap: map names with multiple emails to the same author identity

2018-06-28 Thread Stefan Beller

There are multiple author idents who have different email addresses, but
the same name; assume they are the same person, as the world of open source
is actually rather small.

Sift through the history via:

git shortlog -sne origin/pu |awk '{ NF--; $1=""; print }' |sort |uniq -d 
|cut -c 2-  >names
while read name; do
current_name=$(git log --author "$name" --format="%an <%ae>" -1)
git log --author "$name" --format="%an <%ae>" |sort |uniq >all_names

while read one_name; do
if [ "$one_name" != "$current_name" ]
then
echo $current_name $one_name >> out
fi
done 
---
 .mailmap | 185 ---
 1 file changed, 134 insertions(+), 51 deletions(-)

diff --git a/.mailmap b/.mailmap
index ff96ef7401f..2607846582a 100644
--- a/.mailmap
+++ b/.mailmap
@@ -5,54 +5,86 @@
 # same person appearing not to be so.
 #
 
- 
+Adam Simpkins  Adam Simpkins 
 Alejandro R. Sedeño  
+Alexander Gavrilov 
+Alexander Kuleshov 
+Alex Bennée  Alex Bennée 
 Alex Bennée 
+Alexey Shumkin  
+Alexey Shumkin  
+Alex Riesen  Alex Riesen 
+Alex Riesen  Alex Riesen 
+Alex Riesen  Alex Riesen 
+Alex Riesen  Alex Riesen 
 Alex Riesen  
 Alex Riesen  
 Alex Riesen  
 Alex Vandiver  
-Alexander Gavrilov 
-Alexander Kuleshov 
-Alexey Shumkin  
-Alexey Shumkin  
+Alex Vandiver  Alex Vandiver 
+Alex Vandiver  Alex Vandiver 
+Amos Waterland  
+Amos Waterland  
 Anders Kaseorg  
 Anders Kaseorg  
+Andreas Ericsson  Andreas Ericsson 
+Andreas Schwab  Andreas Schwab 
 Aneesh Kumar K.V 
-Amos Waterland  
-Amos Waterland  
-Ben Walton  
+# the two anonymous contributors are different persons:
+anonymous 
+anonymous 
+Antonio Ospite  Antonio Ospite 
+Avery Pennarun  Avery Pennarun 
+Beat Bolli  Beat Bolli 
+Benoit Person  Benoit Person 

 Benoit Sigoure  
+Ben Peart  Ben Peart 
+Ben Peart  Ben Peart 
+Ben Walton  
 Bernt Hansen  
 Brandon Casey  
+Brian Gernhardt  Brian Gernhardt 

+Brian Gernhardt  Brian Gernhardt 

 brian m. carlson  Brian M. Carlson 

 brian m. carlson  

 Bryan Larsen  
 Bryan Larsen  
+Carlos Martín Nieto  Carlos Martín Nieto 
+Charles Bailey  Charles Bailey 
 Cheng Renquan 
 Chris Shoemaker 
-Chris Wright  
-Cord Seele  
 Christian Couder  
+Christian Ludwig  Christian Ludwig 

 Christian Stimming  
 Christopher Díaz Riveros  Christopher Diaz Riveros
+Chris Wright  
+Clemens Buchacher  Clemens Buchacher 

+Clemens Buchacher  Clemens Buchacher 
+Cord Seele  
 Csaba Henk  
-Dan Johnson 
-Dana L. How  
 Dana L. How  Dana How
+Dana L. How  
 Daniel Barkalow 
+Daniel Knittl-Frank  Daniel Knittl-Frank 

 Daniel Knittl-Frank  knittl
 Daniel Trstenjak  
 Daniel Trstenjak  
+Dan Johnson 
+David Barr  David Barr 
 David Brown  
 David D. Kilzer 
 David Kågedal 
 David Reiss  
 David S. Miller 
+David Turner  David Turner 
+David Turner  David Turner 
+David Turner  David Turner 
 David Turner  
 David Turner  
 Deskin Miller 
 Dirk Süsserott 
+Dotan Barak  Dotan Barak 
+Edward Thomson  Edward Thomson 

 Eric Blake  
 Eric Hanchrow  
 Eric S. Raymond 
@@ -64,28 +96,33 @@ Florian Achleitner  
 
 Frank Lichtenheld  
 Frank Lichtenheld  
-Fredrik Kuivinen  
 Frédéric Heitzmann 
+Fredrik Kuivinen  
 Garry Dolley  
-Greg Price  
+Grégoire Paris  Grégoire Paris 

 Greg Price  
+Greg Price  
+Han-Wen Nienhuys  Han-Wen Nienhuys 
+Hartmut Henkel  Hartmut Henkel 
 Heiko Voigt  
 H. Merijn Brand  H.Merijn Brand 
+Horst H. von Brand 
 H. Peter Anvin  
 H. Peter Anvin  
 H. Peter Anvin  
 H. Peter Anvin  
-Han-Wen Nienhuys  Han-Wen Nienhuys 
-Horst H. von Brand 
-J. Bruce Fields  
-J. Bruce Fields  
-J. Bruce Fields  
+İsmail Dönmez 
+Jacob Keller  Jacob Keller 
 Jakub Narębski 
 James Y Knight  
 Jason McMullan  Jason McMullan 

-Jason Riedy  
+Jason McMullan  Jason McMullan 

 Jason Riedy  
+Jason Riedy  
 Jay Soffian  
+J. Bruce Fields  
+J. Bruce Fields  
+J. Bruce Fields  
 Jean-Noël Avila  Jean-Noel Avila
 Jean-Noël Avila  Jean-Noël AVILA
 Jeff King  
@@ -93,18 +130,22 @@ Jeff Muizelaar  
 Jens Axboe  
 Jens Axboe  
 Jens Lindström  Jens Lindstrom 
+Jiang Xin  Jiang Xin 
 Jim Meyering  
 Joachim Berdal Haga 
+Joachim Jablon  Joachim Jablon 

+Joey Hess  Joey Hess 
 Johannes Schindelin  
+Johannes Sixt  
 Johannes Sixt  
 Johannes Sixt  
-Johannes Sixt  
 John 'Warthog9' Hawley  
+Jonathan del Strother  
+Jonathan Nieder  
+Jonathon Mah  Jonathon Mah 
 Jon Loeliger  
 Jon Loeliger  
 Jon Seymour  
-Jonathan Nieder  
-Jonathan del Strother  
 Josh Triplett  
 Josh Triplett  
 Julian Phillips  
@@ -116,73 +157,107 @@ Junio C Hamano  
 Junio C Hamano  
 Junio C Hamano  
 Kaartic Sivaraam  
+Kacper Kornet  Kacper Kornet 
 Karl Wiberg  Karl  Hasselström
 Karl Wiberg  
 Karsten Blees  
 Karsten Blees  
-Kay Sievers  
 Kay Sievers  
+Kay Sievers  
 Kazuki Saitoh  kazuki saitoh 
 Keith Cascio  
 Kent Engstrom 
 Kevin Leung 
+Kirill A. Shutemov  Kirill A. Shutemov 

 Kirill Smelkov  
 Kirill Smelkov  
+Kirill Smelkov  Kirill

[PATCH v3 13/32] tag: add repository argument to lookup_tag

2018-06-28 Thread Stefan Beller

Add a repository argument to allow the callers of lookup_tag
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 builtin/describe.c | 6 +++---
 builtin/pack-objects.c | 2 +-
 builtin/replace.c  | 2 +-
 log-tree.c | 2 +-
 object.c   | 2 +-
 sha1-name.c| 2 +-
 tag.c  | 4 ++--
 tag.h  | 4 ++--
 8 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index c8ff64766d0..41606c8a900 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -93,13 +93,13 @@ static int replace_name(struct commit_name *e,
struct tag *t;
 
if (!e->tag) {
-   t = lookup_tag(>oid);
+   t = lookup_tag(the_repository, >oid);
if (!t || parse_tag(t))
return 1;
e->tag = t;
}
 
-   t = lookup_tag(oid);
+   t = lookup_tag(the_repository, oid);
if (!t || parse_tag(t))
return 0;
*tag = t;
@@ -267,7 +267,7 @@ static unsigned long finish_depth_computation(
 static void append_name(struct commit_name *n, struct strbuf *dst)
 {
if (n->prio == 2 && !n->tag) {
-   n->tag = lookup_tag(>oid);
+   n->tag = lookup_tag(the_repository, >oid);
if (!n->tag || parse_tag(n->tag))
die(_("annotated tag %s not available"), n->path);
}
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 69d3d7b82af..6565c800ac3 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2474,7 +2474,7 @@ static void add_tag_chain(const struct object_id *oid)
if (packlist_find(_pack, oid->hash, NULL))
return;
 
-   tag = lookup_tag(oid);
+   tag = lookup_tag(the_repository, oid);
while (1) {
if (!tag || parse_tag(tag) || !tag->tagged)
die("unable to pack objects reachable from tag %s",
diff --git a/builtin/replace.c b/builtin/replace.c
index 0232f98f020..0351b7c62cf 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -402,7 +402,7 @@ static int check_one_mergetag(struct commit *commit,
int i;
 
hash_object_file(extra->value, extra->len, type_name(OBJ_TAG), 
_oid);
-   tag = lookup_tag(_oid);
+   tag = lookup_tag(the_repository, _oid);
if (!tag)
return error(_("bad mergetag in commit '%s'"), ref);
if (parse_tag_buffer(tag, extra->value, extra->len))
diff --git a/log-tree.c b/log-tree.c
index abe67e8b2e4..840423ca149 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -498,7 +498,7 @@ static int show_one_mergetag(struct commit *commit,
size_t payload_size, gpg_message_offset;
 
hash_object_file(extra->value, extra->len, type_name(OBJ_TAG), );
-   tag = lookup_tag();
+   tag = lookup_tag(the_repository, );
if (!tag)
return -1; /* error message already given */
 
diff --git a/object.c b/object.c
index f08a8874de3..bcfcfd38760 100644
--- a/object.c
+++ b/object.c
@@ -223,7 +223,7 @@ struct object *parse_object_buffer_the_repository(const 
struct object_id *oid, e
obj = >object;
}
} else if (type == OBJ_TAG) {
-   struct tag *tag = lookup_tag(oid);
+   struct tag *tag = lookup_tag(the_repository, oid);
if (tag) {
if (parse_tag_buffer(tag, buffer, size))
   return NULL;
diff --git a/sha1-name.c b/sha1-name.c
index 98480ade12d..5854bc75fe2 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -358,7 +358,7 @@ static int show_ambiguous_object(const struct object_id 
*oid, void *data)
format_commit_message(commit, " %ad - %s", , );
}
} else if (type == OBJ_TAG) {
-   struct tag *tag = lookup_tag(oid);
+   struct tag *tag = lookup_tag(the_repository, oid);
if (!parse_tag(tag) && tag->tag)
strbuf_addf(, " %s", tag->tag);
}
diff --git a/tag.c b/tag.c
index 5dcdf7bf6f9..5b41fc71fad 100644
--- a/tag.c
+++ b/tag.c
@@ -92,7 +92,7 @@ struct object *deref_tag_noverify(struct object *o)
return o;
 }
 
-struct tag *lookup_tag(const struct object_id *oid)
+struct tag *lookup_tag_the_repository(const struct object_id *oid)
 {
struct object *obj = lookup_object(the_repository, oid->hash);
if (!obj)
@@ -160,7 +160,7 @@ int parse_tag_buffer(struct tag *item, const void

[PATCH v3 07/32] commit: add repository argument to lookup_commit_reference_gently

2018-06-28 Thread Stefan Beller

Add a repository argument to allow callers of
lookup_commit_reference_gently to be more specific about which
repository to handle. This is a small mechanical change; it doesn't
change the implementation to handle repositories other than
the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
---
 archive.c |  2 +-
 blame.c   |  3 ++-
 builtin/checkout.c|  6 +++---
 builtin/describe.c|  5 +++--
 builtin/fetch.c   |  9 ++---
 builtin/reflog.c  | 10 ++
 builtin/show-branch.c |  3 ++-
 bundle.c  |  2 +-
 commit-graph.c|  2 +-
 commit.c  |  6 +++---
 commit.h  |  5 -
 fast-import.c |  6 --
 notes-cache.c |  3 ++-
 ref-filter.c  |  6 --
 remote.c  |  9 +
 sequencer.c   |  2 +-
 sha1-name.c   |  4 ++--
 shallow.c |  9 ++---
 walker.c  |  3 ++-
 wt-status.c   |  2 +-
 20 files changed, 59 insertions(+), 38 deletions(-)

diff --git a/archive.c b/archive.c
index 875dab64b60..78b0a398a0f 100644
--- a/archive.c
+++ b/archive.c
@@ -380,7 +380,7 @@ static void parse_treeish_arg(const char **argv,
if (get_oid(name, ))
die("Not a valid object name");
 
-   commit = lookup_commit_reference_gently(, 1);
+   commit = lookup_commit_reference_gently(the_repository, , 1);
if (commit) {
commit_sha1 = commit->object.oid.hash;
archive_time = commit->date;
diff --git a/blame.c b/blame.c
index 0c4490a35bb..5b022cc2254 100644
--- a/blame.c
+++ b/blame.c
@@ -1712,7 +1712,8 @@ static struct commit *dwim_reverse_initial(struct 
rev_info *revs,
/* Do we have HEAD? */
if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, _oid, NULL))
return NULL;
-   head_commit = lookup_commit_reference_gently(_oid, 1);
+   head_commit = lookup_commit_reference_gently(the_repository,
+_oid, 1);
if (!head_commit)
return NULL;
 
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 28627650cd6..40c27bf54d7 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -379,7 +379,7 @@ static int checkout_paths(const struct checkout_opts *opts,
die(_("unable to write new index file"));
 
read_ref_full("HEAD", 0, , NULL);
-   head = lookup_commit_reference_gently(, 1);
+   head = lookup_commit_reference_gently(the_repository, , 1);
 
errs |= post_checkout_hook(head, head, 0);
return errs;
@@ -830,7 +830,7 @@ static int switch_branches(const struct checkout_opts *opts,
memset(_branch_info, 0, sizeof(old_branch_info));
old_branch_info.path = path_to_free = resolve_refdup("HEAD", 0, , 
);
if (old_branch_info.path)
-   old_branch_info.commit = lookup_commit_reference_gently(, 
1);
+   old_branch_info.commit = 
lookup_commit_reference_gently(the_repository, , 1);
if (!(flag & REF_ISSYMREF))
old_branch_info.path = NULL;
 
@@ -1004,7 +1004,7 @@ static int parse_branchname_arg(int argc, const char 
**argv,
else
new_branch_info->path = NULL; /* not an existing branch */
 
-   new_branch_info->commit = lookup_commit_reference_gently(rev, 1);
+   new_branch_info->commit = 
lookup_commit_reference_gently(the_repository, rev, 1);
if (!new_branch_info->commit) {
/* not a commit */
*source_tree = parse_tree_indirect(rev);
diff --git a/builtin/describe.c b/builtin/describe.c
index 1e87f68d5ee..0b5d8562331 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -331,7 +331,8 @@ static void describe_commit(struct object_id *oid, struct 
strbuf *dst)
init_commit_names(_names);
n = hashmap_iter_first(, );
for (; n; n = hashmap_iter_next()) {
-   c = lookup_commit_reference_gently(>peeled, 1);
+   c = lookup_commit_reference_gently(the_repository,
+  >peeled, 1);
if (c)
*commit_names_at(_names, c) = n;
}
@@ -509,7 +510,7 @@ static void describe(const char *arg, int last_one)
 
if (get_oid(arg, ))
die(_("Not a valid object name %s"), arg);
-   cmit = lookup_commit_reference_gently(, 1);
+   cmit = lookup_commit_reference_gently(the_repository, , 1);
 
if (cmit)
describe_commit(, );
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 83f36d7cdeb..f5d960baecf 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -684,8 +684,10 @@ static int update_local_ref(struct ref *ref,
return r;
}
 
-

[PATCH v3 09/32] commit: add repository argument to lookup_commit

2018-06-28 Thread Stefan Beller

Add a repository argument to allow callers of lookup_commit to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
---
 builtin/am.c|  3 ++-
 builtin/commit-tree.c   |  4 +++-
 builtin/diff-tree.c |  2 +-
 builtin/fast-export.c   |  2 +-
 builtin/fmt-merge-msg.c |  2 +-
 builtin/merge-base.c|  2 +-
 builtin/verify-commit.c |  4 +++-
 commit-graph.c  | 10 +-
 commit.c|  7 ---
 commit.h|  3 ++-
 fetch-pack.c|  5 +++--
 log-tree.c  |  2 +-
 notes-utils.c   |  4 +++-
 object.c|  2 +-
 sequencer.c |  4 ++--
 sha1-name.c |  2 +-
 shallow.c   | 17 ++---
 tag.c   |  2 +-
 tree.c  |  2 +-
 19 files changed, 46 insertions(+), 33 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 72e928cee79..b6eeb46c4b6 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1633,7 +1633,8 @@ static void do_commit(const struct am_state *state)
 
if (!get_oid_commit("HEAD", )) {
old_oid = 
-   commit_list_insert(lookup_commit(), );
+   commit_list_insert(lookup_commit(the_repository, ),
+  );
} else {
old_oid = NULL;
say(state, stderr, _("applying to an empty history"));
diff --git a/builtin/commit-tree.c b/builtin/commit-tree.c
index 9fbd3529fb1..9ec36a82b63 100644
--- a/builtin/commit-tree.c
+++ b/builtin/commit-tree.c
@@ -6,6 +6,7 @@
 #include "cache.h"
 #include "config.h"
 #include "object-store.h"
+#include "repository.h"
 #include "commit.h"
 #include "tree.h"
 #include "builtin.h"
@@ -60,7 +61,8 @@ int cmd_commit_tree(int argc, const char **argv, const char 
*prefix)
if (get_oid_commit(argv[i], ))
die("Not a valid object name %s", argv[i]);
assert_oid_type(, OBJ_COMMIT);
-   new_parent(lookup_commit(), );
+   new_parent(lookup_commit(the_repository, ),
+);
continue;
}
 
diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index a5718d96ee2..91ba67070e2 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -25,7 +25,7 @@ static int stdin_diff_commit(struct commit *commit, const 
char *p)
 
/* Graft the fake parents locally to the commit */
while (isspace(*p++) && !parse_oid_hex(p, , )) {
-   struct commit *parent = lookup_commit();
+   struct commit *parent = lookup_commit(the_repository, );
if (!pptr) {
/* Free the real parent list */
free_commit_list(commit->parents);
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 7d6b1d8aea2..223499d7ca4 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -963,7 +963,7 @@ static void import_marks(char *input_file)
/* only commits */
continue;
 
-   commit = lookup_commit();
+   commit = lookup_commit(the_repository, );
if (!commit)
die("not a commit? can't happen: %s", oid_to_hex());
 
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index 5e44589b545..36318ef46e7 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -572,7 +572,7 @@ static void find_merge_parents(struct merge_parents *result,
commit_list_insert(parent, );
add_merge_parent(result, >oid, >object.oid);
}
-   head_commit = lookup_commit(head);
+   head_commit = lookup_commit(the_repository, head);
if (head_commit)
commit_list_insert(head_commit, );
reduce_heads_replace();
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index bbead6f33e5..08d91b1f0c0 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -124,7 +124,7 @@ static void add_one_commit(struct object_id *oid, struct 
rev_collect *revs)
if (is_null_oid(oid))
return;
 
-   commit = lookup_commit(oid);
+   commit = lookup_commit(the_repository, oid);
if (!commit ||
(commit->object.flags & TMP_MARK) ||
parse_commit(commit))
diff --git a/builtin/verify-commit.c b/builtin/verify-commit.c
index f6922da16d6..7772c07ed7a 100644
--- a/builtin/verify-commit.c
+++ b/builtin/verify-commit.c
@@ -9,6 +9,7 @@
 #include "config.h"
 #include "builtin.h"
 #include "object-store.h"
+#include "repository.h"
 #include "commit.h"
 #include

[PATCH v3 27/32] commit.c: allow get_cached_commit_buffer to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 commit.c | 4 ++--
 commit.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/commit.c b/commit.c
index dd8c9c15b14..15b044331a1 100644
--- a/commit.c
+++ b/commit.c
@@ -283,10 +283,10 @@ void set_commit_buffer(struct repository *r, struct 
commit *commit, void *buffer
v->size = size;
 }
 
-const void *get_cached_commit_buffer_the_repository(const struct commit 
*commit, unsigned long *sizep)
+const void *get_cached_commit_buffer(struct repository *r, const struct commit 
*commit, unsigned long *sizep)
 {
struct commit_buffer *v = buffer_slab_peek(
-   the_repository->parsed_objects->buffer_slab, commit);
+   r->parsed_objects->buffer_slab, commit);
if (!v) {
if (sizep)
*sizep = 0;
diff --git a/commit.h b/commit.h
index 7297af467b9..d61585df5bc 100644
--- a/commit.h
+++ b/commit.h
@@ -103,8 +103,7 @@ void set_commit_buffer(struct repository *r, struct commit 
*, void *buffer, unsi
  * Get any cached object buffer associated with the commit. Returns NULL
  * if none. The resulting memory should not be freed.
  */
-#define get_cached_commit_buffer(r, c, s) get_cached_commit_buffer_##r(c, s)
-const void *get_cached_commit_buffer_the_repository(const struct commit *, 
unsigned long *size);
+const void *get_cached_commit_buffer(struct repository *, const struct commit 
*, unsigned long *size);
 
 /*
  * Get the commit's object contents, either from cache or by reading the object
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 26/32] commit.c: allow set_commit_buffer to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 commit.c | 4 ++--
 commit.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/commit.c b/commit.c
index 1baac77861f..dd8c9c15b14 100644
--- a/commit.c
+++ b/commit.c
@@ -275,10 +275,10 @@ void free_commit_buffer_slab(struct buffer_slab *bs)
free(bs);
 }
 
-void set_commit_buffer_the_repository(struct commit *commit, void *buffer, 
unsigned long size)
+void set_commit_buffer(struct repository *r, struct commit *commit, void 
*buffer, unsigned long size)
 {
struct commit_buffer *v = buffer_slab_at(
-   the_repository->parsed_objects->buffer_slab, commit);
+   r->parsed_objects->buffer_slab, commit);
v->buffer = buffer;
v->size = size;
 }
diff --git a/commit.h b/commit.h
index bea5e015b28..7297af467b9 100644
--- a/commit.h
+++ b/commit.h
@@ -97,8 +97,7 @@ void free_commit_buffer_slab(struct buffer_slab *bs);
  * Associate an object buffer with the commit. The ownership of the
  * memory is handed over to the commit, and must be free()-able.
  */
-#define set_commit_buffer(r, c, b, s) set_commit_buffer_##r(c, b, s)
-void set_commit_buffer_the_repository(struct commit *, void *buffer, unsigned 
long size);
+void set_commit_buffer(struct repository *r, struct commit *, void *buffer, 
unsigned long size);
 
 /*
  * Get any cached object buffer associated with the commit. Returns NULL
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 31/32] commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 commit.c | 8 
 commit.h | 4 +---
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/commit.c b/commit.c
index 15b044331a1..08b4602f43f 100644
--- a/commit.c
+++ b/commit.c
@@ -24,16 +24,16 @@ int save_commit_buffer = 1;
 
 const char *commit_type = "commit";
 
-struct commit *lookup_commit_reference_gently_the_repository(
+struct commit *lookup_commit_reference_gently(struct repository *r,
const struct object_id *oid, int quiet)
 {
-   struct object *obj = deref_tag(the_repository,
-  parse_object(the_repository, oid),
+   struct object *obj = deref_tag(r,
+  parse_object(r, oid),
   NULL, 0);
 
if (!obj)
return NULL;
-   return object_as_type(the_repository, obj, OBJ_COMMIT, quiet);
+   return object_as_type(r, obj, OBJ_COMMIT, quiet);
 }
 
 struct commit *lookup_commit_reference_the_repository(const struct object_id 
*oid)
diff --git a/commit.h b/commit.h
index d61585df5bc..f1f25957de4 100644
--- a/commit.h
+++ b/commit.h
@@ -67,9 +67,7 @@ struct commit *lookup_commit(struct repository *r, const 
struct object_id *oid);
 #define lookup_commit_reference(r, o) \
lookup_commit_reference_##r(o)
 struct commit *lookup_commit_reference_the_repository(const struct object_id 
*oid);
-#define lookup_commit_reference_gently(r, o, q) \
-   lookup_commit_reference_gently_##r(o, q)
-struct commit *lookup_commit_reference_gently_the_repository(
+struct commit *lookup_commit_reference_gently(struct repository *r,
  const struct object_id *oid,
  int quiet);
 struct commit *lookup_commit_reference_by_name(const char *name);
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 29/32] object.c: allow parse_object to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 object.c | 14 +++---
 object.h |  3 +--
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/object.c b/object.c
index cd870fee22b..b0faab85d40 100644
--- a/object.c
+++ b/object.c
@@ -245,28 +245,28 @@ struct object *parse_object_or_die(const struct object_id 
*oid,
die(_("unable to parse object: %s"), name ? name : oid_to_hex(oid));
 }
 
-struct object *parse_object_the_repository(const struct object_id *oid)
+struct object *parse_object(struct repository *r, const struct object_id *oid)
 {
unsigned long size;
enum object_type type;
int eaten;
-   const struct object_id *repl = lookup_replace_object(the_repository, 
oid);
+   const struct object_id *repl = lookup_replace_object(r, oid);
void *buffer;
struct object *obj;
 
-   obj = lookup_object(the_repository, oid->hash);
+   obj = lookup_object(r, oid->hash);
if (obj && obj->parsed)
return obj;
 
if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
(!obj && has_object_file(oid) &&
-oid_object_info(the_repository, oid, NULL) == OBJ_BLOB)) {
+oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
if (check_object_signature(repl, NULL, 0, NULL) < 0) {
error("sha1 mismatch %s", oid_to_hex(oid));
return NULL;
}
-   parse_blob_buffer(lookup_blob(the_repository, oid), NULL, 0);
-   return lookup_object(the_repository, oid->hash);
+   parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
+   return lookup_object(r, oid->hash);
}
 
buffer = read_object_file(oid, , );
@@ -277,7 +277,7 @@ struct object *parse_object_the_repository(const struct 
object_id *oid)
return NULL;
}
 
-   obj = parse_object_buffer(the_repository, oid, type, size,
+   obj = parse_object_buffer(r, oid, type, size,
  buffer, );
if (!eaten)
free(buffer);
diff --git a/object.h b/object.h
index 38198bb73a1..fa5ca975678 100644
--- a/object.h
+++ b/object.h
@@ -124,8 +124,7 @@ void *object_as_type(struct repository *r, struct object 
*obj, enum object_type
  *
  * Returns NULL if the object is missing or corrupt.
  */
-#define parse_object(r, oid) parse_object_##r(oid)
-struct object *parse_object_the_repository(const struct object_id *oid);
+struct object *parse_object(struct repository *r, const struct object_id *oid);
 
 /*
  * Like parse_object, but will die() instead of returning NULL. If the
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 32/32] commit.c: allow lookup_commit_reference to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 commit.c | 4 ++--
 commit.h | 5 ++---
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/commit.c b/commit.c
index 08b4602f43f..b88ced5b026 100644
--- a/commit.c
+++ b/commit.c
@@ -36,9 +36,9 @@ struct commit *lookup_commit_reference_gently(struct 
repository *r,
return object_as_type(r, obj, OBJ_COMMIT, quiet);
 }
 
-struct commit *lookup_commit_reference_the_repository(const struct object_id 
*oid)
+struct commit *lookup_commit_reference(struct repository *r, const struct 
object_id *oid)
 {
-   return lookup_commit_reference_gently(the_repository, oid, 0);
+   return lookup_commit_reference_gently(r, oid, 0);
 }
 
 struct commit *lookup_commit_or_die(const struct object_id *oid, const char 
*ref_name)
diff --git a/commit.h b/commit.h
index f1f25957de4..8b2cf9692de 100644
--- a/commit.h
+++ b/commit.h
@@ -64,9 +64,8 @@ void add_name_decoration(enum decoration_type type, const 
char *name, struct obj
 const struct name_decoration *get_name_decoration(const struct object *obj);
 
 struct commit *lookup_commit(struct repository *r, const struct object_id 
*oid);
-#define lookup_commit_reference(r, o) \
-   lookup_commit_reference_##r(o)
-struct commit *lookup_commit_reference_the_repository(const struct object_id 
*oid);
+struct commit *lookup_commit_reference(struct repository *r,
+  const struct object_id *oid);
 struct commit *lookup_commit_reference_gently(struct repository *r,
  const struct object_id *oid,
  int quiet);
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 30/32] tag.c: allow deref_tag to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 tag.c | 5 ++---
 tag.h | 3 +--
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/tag.c b/tag.c
index 682e7793059..94a89b21cb5 100644
--- a/tag.c
+++ b/tag.c
@@ -64,12 +64,11 @@ int gpg_verify_tag(const struct object_id *oid, const char 
*name_to_report,
return ret;
 }
 
-struct object *deref_tag_the_repository(struct object *o, const char *warn, 
int warnlen)
+struct object *deref_tag(struct repository *r, struct object *o, const char 
*warn, int warnlen)
 {
while (o && o->type == OBJ_TAG)
if (((struct tag *)o)->tagged)
-   o = parse_object(the_repository,
-&((struct tag *)o)->tagged->oid);
+   o = parse_object(r, &((struct tag *)o)->tagged->oid);
else
o = NULL;
if (!o && warn) {
diff --git a/tag.h b/tag.h
index efd4c7da67c..e669c3e497a 100644
--- a/tag.h
+++ b/tag.h
@@ -15,8 +15,7 @@ extern struct tag *lookup_tag(struct repository *r, const 
struct object_id *oid)
 extern int parse_tag_buffer(struct repository *r, struct tag *item, const void 
*data, unsigned long size);
 extern int parse_tag(struct tag *item);
 extern void release_tag_memory(struct tag *t);
-#define deref_tag(r, o, w, l) deref_tag_##r(o, w, l)
-extern struct object *deref_tag_the_repository(struct object *, const char *, 
int);
+extern struct object *deref_tag(struct repository *r, struct object *, const 
char *, int);
 extern struct object *deref_tag_noverify(struct object *);
 extern int gpg_verify_tag(const struct object_id *oid,
const char *name_to_report, unsigned flags);
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 28/32] object.c: allow parse_object_buffer to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 object.c | 18 +-
 object.h |  3 +--
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/object.c b/object.c
index 9d588448192..cd870fee22b 100644
--- a/object.c
+++ b/object.c
@@ -185,21 +185,21 @@ struct object *lookup_unknown_object(const unsigned char 
*sha1)
return obj;
 }
 
-struct object *parse_object_buffer_the_repository(const struct object_id *oid, 
enum object_type type, unsigned long size, void *buffer, int *eaten_p)
+struct object *parse_object_buffer(struct repository *r, const struct 
object_id *oid, enum object_type type, unsigned long size, void *buffer, int 
*eaten_p)
 {
struct object *obj;
*eaten_p = 0;
 
obj = NULL;
if (type == OBJ_BLOB) {
-   struct blob *blob = lookup_blob(the_repository, oid);
+   struct blob *blob = lookup_blob(r, oid);
if (blob) {
if (parse_blob_buffer(blob, buffer, size))
return NULL;
obj = >object;
}
} else if (type == OBJ_TREE) {
-   struct tree *tree = lookup_tree(the_repository, oid);
+   struct tree *tree = lookup_tree(r, oid);
if (tree) {
obj = >object;
if (!tree->buffer)
@@ -211,20 +211,20 @@ struct object *parse_object_buffer_the_repository(const 
struct object_id *oid, e
}
}
} else if (type == OBJ_COMMIT) {
-   struct commit *commit = lookup_commit(the_repository, oid);
+   struct commit *commit = lookup_commit(r, oid);
if (commit) {
-   if (parse_commit_buffer(the_repository, commit, buffer, 
size, 1))
+   if (parse_commit_buffer(r, commit, buffer, size, 1))
return NULL;
-   if (!get_cached_commit_buffer(the_repository, commit, 
NULL)) {
-   set_commit_buffer(the_repository, commit, 
buffer, size);
+   if (!get_cached_commit_buffer(r, commit, NULL)) {
+   set_commit_buffer(r, commit, buffer, size);
*eaten_p = 1;
}
obj = >object;
}
} else if (type == OBJ_TAG) {
-   struct tag *tag = lookup_tag(the_repository, oid);
+   struct tag *tag = lookup_tag(r, oid);
if (tag) {
-   if (parse_tag_buffer(the_repository, tag, buffer, size))
+   if (parse_tag_buffer(r, tag, buffer, size))
   return NULL;
obj = >object;
}
diff --git a/object.h b/object.h
index f54a892bd10..38198bb73a1 100644
--- a/object.h
+++ b/object.h
@@ -138,8 +138,7 @@ struct object *parse_object_or_die(const struct object_id 
*oid, const char *name
  * parsing it.  eaten_p indicates if the object has a borrowed copy
  * of buffer and the caller should not free() it.
  */
-#define parse_object_buffer(r, o, t, s, b, e) parse_object_buffer_##r(o, t, s, 
b, e)
-struct object *parse_object_buffer_the_repository(const struct object_id *oid, 
enum object_type type, unsigned long size, void *buffer, int *eaten_p);
+struct object *parse_object_buffer(struct repository *r, const struct 
object_id *oid, enum object_type type, unsigned long size, void *buffer, int 
*eaten_p);
 
 /** Returns the object, with potentially excess memory allocated. **/
 struct object *lookup_unknown_object(const unsigned  char *sha1);
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 22/32] tag: allow parse_tag_buffer to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 tag.c | 10 +-
 tag.h |  3 +--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/tag.c b/tag.c
index 46b5882ee12..682e7793059 100644
--- a/tag.c
+++ b/tag.c
@@ -126,7 +126,7 @@ void release_tag_memory(struct tag *t)
t->date = 0;
 }
 
-int parse_tag_buffer_the_repository(struct tag *item, const void *data, 
unsigned long size)
+int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, 
unsigned long size)
 {
struct object_id oid;
char type[20];
@@ -154,13 +154,13 @@ int parse_tag_buffer_the_repository(struct tag *item, 
const void *data, unsigned
bufptr = nl + 1;
 
if (!strcmp(type, blob_type)) {
-   item->tagged = (struct object *)lookup_blob(the_repository, 
);
+   item->tagged = (struct object *)lookup_blob(r, );
} else if (!strcmp(type, tree_type)) {
-   item->tagged = (struct object *)lookup_tree(the_repository, 
);
+   item->tagged = (struct object *)lookup_tree(r, );
} else if (!strcmp(type, commit_type)) {
-   item->tagged = (struct object *)lookup_commit(the_repository, 
);
+   item->tagged = (struct object *)lookup_commit(r, );
} else if (!strcmp(type, tag_type)) {
-   item->tagged = (struct object *)lookup_tag(the_repository, 
);
+   item->tagged = (struct object *)lookup_tag(r, );
} else {
error("Unknown type %s", type);
item->tagged = NULL;
diff --git a/tag.h b/tag.h
index 6a160c91875..efd4c7da67c 100644
--- a/tag.h
+++ b/tag.h
@@ -12,8 +12,7 @@ struct tag {
timestamp_t date;
 };
 extern struct tag *lookup_tag(struct repository *r, const struct object_id 
*oid);
-#define parse_tag_buffer(r, i, d, s) parse_tag_buffer_##r(i, d, s)
-extern int parse_tag_buffer_the_repository(struct tag *item, const void *data, 
unsigned long size);
+extern int parse_tag_buffer(struct repository *r, struct tag *item, const void 
*data, unsigned long size);
 extern int parse_tag(struct tag *item);
 extern void release_tag_memory(struct tag *t);
 #define deref_tag(r, o, w, l) deref_tag_##r(o, w, l)
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 23/32] commit.c: allow parse_commit_buffer to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 commit.c | 10 +-
 commit.h |  3 +--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/commit.c b/commit.c
index 8749e151451..41d23352098 100644
--- a/commit.c
+++ b/commit.c
@@ -364,7 +364,7 @@ const void *detach_commit_buffer(struct commit *commit, 
unsigned long *sizep)
return ret;
 }
 
-int parse_commit_buffer_the_repository(struct commit *item, const void 
*buffer, unsigned long size, int check_graph)
+int parse_commit_buffer(struct repository *r, struct commit *item, const void 
*buffer, unsigned long size, int check_graph)
 {
const char *tail = buffer;
const char *bufptr = buffer;
@@ -384,11 +384,11 @@ int parse_commit_buffer_the_repository(struct commit 
*item, const void *buffer,
if (get_oid_hex(bufptr + 5, ) < 0)
return error("bad tree pointer in commit %s",
 oid_to_hex(>object.oid));
-   item->maybe_tree = lookup_tree(the_repository, );
+   item->maybe_tree = lookup_tree(r, );
bufptr += tree_entry_len + 1; /* "tree " + "hex sha1" + "\n" */
pptr = >parents;
 
-   graft = lookup_commit_graft(the_repository, >object.oid);
+   graft = lookup_commit_graft(r, >object.oid);
while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 
7)) {
struct commit *new_parent;
 
@@ -403,7 +403,7 @@ int parse_commit_buffer_the_repository(struct commit *item, 
const void *buffer,
 */
if (graft && (graft->nr_parent < 0 || grafts_replace_parents))
continue;
-   new_parent = lookup_commit(the_repository, );
+   new_parent = lookup_commit(r, );
if (new_parent)
pptr = _list_insert(new_parent, pptr)->next;
}
@@ -411,7 +411,7 @@ int parse_commit_buffer_the_repository(struct commit *item, 
const void *buffer,
int i;
struct commit *new_parent;
for (i = 0; i < graft->nr_parent; i++) {
-   new_parent = lookup_commit(the_repository,
+   new_parent = lookup_commit(r,
   >parent[i]);
if (!new_parent)
continue;
diff --git a/commit.h b/commit.h
index 27888d82468..e9cb5e9 100644
--- a/commit.h
+++ b/commit.h
@@ -81,8 +81,7 @@ struct commit *lookup_commit_reference_by_name(const char 
*name);
  */
 struct commit *lookup_commit_or_die(const struct object_id *oid, const char 
*ref_name);
 
-#define parse_commit_buffer(r, i, b, s, g) parse_commit_buffer_##r(i, b, s, g)
-int parse_commit_buffer_the_repository(struct commit *item, const void 
*buffer, unsigned long size, int check_graph);
+int parse_commit_buffer(struct repository *r, struct commit *item, const void 
*buffer, unsigned long size, int check_graph);
 int parse_commit_gently(struct commit *item, int quiet_on_missing);
 static inline int parse_commit(struct commit *item)
 {
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 24/32] commit-slabs: remove realloc counter outside of slab struct

2018-06-28 Thread Stefan Beller

The realloc counter is declared outside the struct for the given slabname,
which makes it harder for a follow up patch to move the declaration of the
struct around as then the counter variable would need special treatment.

As the reallocation counter is currently unused we can just remove it.
If we ever need to count the reallocations again, we can reintroduce
the counter as part of 'struct slabname' in commit-slab-decl.h.

Signed-off-by: Stefan Beller 
---
 commit-slab-impl.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/commit-slab-impl.h b/commit-slab-impl.h
index 87a9cadfcca..ac1e6d409ad 100644
--- a/commit-slab-impl.h
+++ b/commit-slab-impl.h
@@ -11,8 +11,6 @@
 
 #define implement_commit_slab(slabname, elemtype, scope)   \
\
-static int stat_ ##slabname## realloc; \
-   \
 scope void init_ ##slabname## _with_stride(struct slabname *s, \
   unsigned stride) \
 {  \
@@ -54,7 +52,6 @@ scope elemtype *slabname## _at_peek(struct slabname *s,   
\
if (!add_if_missing)\
return NULL;\
REALLOC_ARRAY(s->slab, nth_slab + 1);   \
-   stat_ ##slabname## realloc++;   \
for (i = s->slab_count; i <= nth_slab; i++) \
s->slab[i] = NULL;  \
s->slab_count = nth_slab + 1;   \
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 25/32] commit.c: migrate the commit buffer to the parsed object store

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 commit.c | 29 +++--
 commit.h |  4 
 object.c |  5 +
 object.h |  4 
 4 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/commit.c b/commit.c
index 41d23352098..1baac77861f 100644
--- a/commit.c
+++ b/commit.c
@@ -261,18 +261,32 @@ struct commit_buffer {
unsigned long size;
 };
 define_commit_slab(buffer_slab, struct commit_buffer);
-static struct buffer_slab buffer_slab = COMMIT_SLAB_INIT(1, buffer_slab);
+
+struct buffer_slab *allocate_commit_buffer_slab(void)
+{
+   struct buffer_slab *bs = xmalloc(sizeof(*bs));
+   init_buffer_slab(bs);
+   return bs;
+}
+
+void free_commit_buffer_slab(struct buffer_slab *bs)
+{
+   clear_buffer_slab(bs);
+   free(bs);
+}
 
 void set_commit_buffer_the_repository(struct commit *commit, void *buffer, 
unsigned long size)
 {
-   struct commit_buffer *v = buffer_slab_at(_slab, commit);
+   struct commit_buffer *v = buffer_slab_at(
+   the_repository->parsed_objects->buffer_slab, commit);
v->buffer = buffer;
v->size = size;
 }
 
 const void *get_cached_commit_buffer_the_repository(const struct commit 
*commit, unsigned long *sizep)
 {
-   struct commit_buffer *v = buffer_slab_peek(_slab, commit);
+   struct commit_buffer *v = buffer_slab_peek(
+   the_repository->parsed_objects->buffer_slab, commit);
if (!v) {
if (sizep)
*sizep = 0;
@@ -304,14 +318,16 @@ const void *get_commit_buffer(const struct commit 
*commit, unsigned long *sizep)
 
 void unuse_commit_buffer(const struct commit *commit, const void *buffer)
 {
-   struct commit_buffer *v = buffer_slab_peek(_slab, commit);
+   struct commit_buffer *v = buffer_slab_peek(
+   the_repository->parsed_objects->buffer_slab, commit);
if (!(v && v->buffer == buffer))
free((void *)buffer);
 }
 
 void free_commit_buffer(struct commit *commit)
 {
-   struct commit_buffer *v = buffer_slab_peek(_slab, commit);
+   struct commit_buffer *v = buffer_slab_peek(
+   the_repository->parsed_objects->buffer_slab, commit);
if (v) {
FREE_AND_NULL(v->buffer);
v->size = 0;
@@ -347,7 +363,8 @@ void release_commit_memory(struct commit *c)
 
 const void *detach_commit_buffer(struct commit *commit, unsigned long *sizep)
 {
-   struct commit_buffer *v = buffer_slab_peek(_slab, commit);
+   struct commit_buffer *v = buffer_slab_peek(
+   the_repository->parsed_objects->buffer_slab, commit);
void *ret;
 
if (!v) {
diff --git a/commit.h b/commit.h
index e9cb5e9..bea5e015b28 100644
--- a/commit.h
+++ b/commit.h
@@ -89,6 +89,10 @@ static inline int parse_commit(struct commit *item)
 }
 void parse_commit_or_die(struct commit *item);
 
+struct buffer_slab;
+struct buffer_slab *allocate_commit_buffer_slab(void);
+void free_commit_buffer_slab(struct buffer_slab *bs);
+
 /*
  * Associate an object buffer with the commit. The ownership of the
  * memory is handed over to the commit, and must be free()-able.
diff --git a/object.c b/object.c
index 9d74de95f5b..9d588448192 100644
--- a/object.c
+++ b/object.c
@@ -467,6 +467,8 @@ struct parsed_object_pool *parsed_object_pool_new(void)
o->is_shallow = -1;
o->shallow_stat = xcalloc(1, sizeof(*o->shallow_stat));
 
+   o->buffer_slab = allocate_commit_buffer_slab();
+
return o;
 }
 
@@ -541,6 +543,9 @@ void parsed_object_pool_clear(struct parsed_object_pool *o)
FREE_AND_NULL(o->obj_hash);
o->obj_hash_size = 0;
 
+   free_commit_buffer_slab(o->buffer_slab);
+   o->buffer_slab = NULL;
+
clear_alloc_state(o->blob_state);
clear_alloc_state(o->tree_state);
clear_alloc_state(o->commit_state);
diff --git a/object.h b/object.h
index 0d7d74129b6..f54a892bd10 100644
--- a/object.h
+++ b/object.h
@@ -1,6 +1,8 @@
 #ifndef OBJECT_H
 #define OBJECT_H
 
+struct buffer_slab;
+
 struct parsed_object_pool {
struct object **obj_hash;
int nr_objs, obj_hash_size;
@@ -22,6 +24,8 @@ struct parsed_object_pool {
char *alternate_shallow_file;
 
int commit_graft_prepared;
+
+   struct buffer_slab *buffer_slab;
 };
 
 struct parsed_object_pool *parsed_object_pool_new(void);
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 00/32] object-store: lookup_commit

2018-06-28 Thread Stefan Beller

This continues the elimination of global variables in the object store and
teaches lookup_commit[_reference] and alike to handle a_repository.

This is also available as
https://github.com/stefanbeller/git/tree/object-store-lookup-commit
or applies on top of 02f70d63027 (Merge branch 'sb/object-store-grafts'
into next, 2018-06-28).

Picking a base for this one is really hard, as nearly all series currently
cooking or in flight collide with it on one or two lines. (lookup_* is used
heavily, who would have thought?); I really needed 'sb/object-store-grafts'
to build on; and there is only one other series before that in current next,
which this series would have conflicted with, if not based on top of it.

In "The state of the object store series" [1], this was tentatively named
"sb/object-store-lookup".

Thanks,
Stefan

[1] 
https://public-inbox.org/git/cagz79kzteza1rvgfscs+m4dsrb86cf-xiepwqmeu-kcnxp_...@mail.gmail.com/


v2 https://public-inbox.org/git/20180613230522.55335-1-sbel...@google.com/
* removed mentions of cooci patches
* added forward declaration of commit buffer slabs.
* dropped 3 patches that add the repository to lookup_unkonwn_object,
  parse_commit and parse_commit_gently, but were not converting those
  functions. We'll convert these in the next series, as this series is
  growing big already.
* This series can be found as branch 'object-store-lookup-commit' on github,
  it applies on top of nd/commit-util-to-slab merged with sb/object-store-grafts

v1, https://public-inbox.org/git/20180530004810.30076-1-sbel...@google.com/

This applies on the merge of nd/commit-util-to-slab and sb/object-store-grafts,
and is available at http://github.com/stefanbeller/ as branch 
object-store-lookup-commit
as the merge has some merge conflicts as well as syntactical conflicts 
(upload-pack.c
and fetch-pack.c introduce new calls of functions that would want to take a 
repository struct
in the object-store-grafts series)

As layed out in 
https://public-inbox.org/git/20180517225154.9200-1-sbel...@google.com/
this is getting close to finishing the set of object store series though the 
last
unfinished part of this RFC hints at new work on the plate:
* To give this series a nice polish, we'd want to convert parse_commit, too.
  But that requires the conversion of the new commit graph. Maybe we need
  to split this series into 2. 
* Once this is in good shape we can talk about converting parts of the revision
  walking code,
* which then can be used by the submodule code as the end goal for the
  object store series.

Thanks,
Stefan


Stefan Beller (32):
  object: add repository argument to parse_object
  object: add repository argument to lookup_object
  object: add repository argument to parse_object_buffer
  object: add repository argument to object_as_type
  blob: add repository argument to lookup_blob
  tree: add repository argument to lookup_tree
  commit: add repository argument to lookup_commit_reference_gently
  commit: add repository argument to lookup_commit_reference
  commit: add repository argument to lookup_commit
  commit: add repository argument to parse_commit_buffer
  commit: add repository argument to set_commit_buffer
  commit: add repository argument to get_cached_commit_buffer
  tag: add repository argument to lookup_tag
  tag: add repository argument to parse_tag_buffer
  tag: add repository argument to deref_tag
  object: allow object_as_type to handle arbitrary repositories
  object: allow lookup_object to handle arbitrary repositories
  blob: allow lookup_blob to handle arbitrary repositories
  tree: allow lookup_tree to handle arbitrary repositories
  commit: allow lookup_commit to handle arbitrary repositories
  tag: allow lookup_tag to handle arbitrary repositories
  tag: allow parse_tag_buffer to handle arbitrary repositories
  commit.c: allow parse_commit_buffer to handle arbitrary repositories
  commit-slabs: remove realloc counter outside of slab struct
  commit.c: migrate the commit buffer to the parsed object store
  commit.c: allow set_commit_buffer to handle arbitrary repositories
  commit.c: allow get_cached_commit_buffer to handle arbitrary
repositories
  object.c: allow parse_object_buffer to handle arbitrary repositories
  object.c: allow parse_object to handle arbitrary repositories
  tag.c: allow deref_tag to handle arbitrary repositories
  commit.c: allow lookup_commit_reference_gently to handle arbitrary
repositories
  commit.c: allow lookup_commit_reference to handle arbitrary
repositories

 archive.c|  2 +-
 bisect.c |  2 +-
 blame.c  | 13 ---
 blob.c   | 10 ++---
 blob.h   |  2 +-
 branch.c |  2 +-
 builtin/am.c |  9 +++--
 builtin/branch.c |  7 ++--
 builtin/checkout.c   |  6 +--
 builtin/clone.c  |  3 +-
 builtin/commit-tree.c|  4 +-
 builtin/describe.c   | 13 ---
 builtin/diff-tree.c  |  9 +++--

[PATCH v3 12/32] commit: add repository argument to get_cached_commit_buffer

2018-06-28 Thread Stefan Beller

Add a repository argument to allow callers of get_cached_commit_buffer to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
---
 commit.c | 4 ++--
 commit.h | 3 ++-
 object.c | 2 +-
 pretty.c | 2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/commit.c b/commit.c
index cdfb1a025b6..9e2899bd5ad 100644
--- a/commit.c
+++ b/commit.c
@@ -269,7 +269,7 @@ void set_commit_buffer_the_repository(struct commit 
*commit, void *buffer, unsig
v->size = size;
 }
 
-const void *get_cached_commit_buffer(const struct commit *commit, unsigned 
long *sizep)
+const void *get_cached_commit_buffer_the_repository(const struct commit 
*commit, unsigned long *sizep)
 {
struct commit_buffer *v = buffer_slab_peek(_slab, commit);
if (!v) {
@@ -284,7 +284,7 @@ const void *get_cached_commit_buffer(const struct commit 
*commit, unsigned long
 
 const void *get_commit_buffer(const struct commit *commit, unsigned long 
*sizep)
 {
-   const void *ret = get_cached_commit_buffer(commit, sizep);
+   const void *ret = get_cached_commit_buffer(the_repository, commit, 
sizep);
if (!ret) {
enum object_type type;
unsigned long size;
diff --git a/commit.h b/commit.h
index 7c14dfdc54b..237607d64cb 100644
--- a/commit.h
+++ b/commit.h
@@ -102,7 +102,8 @@ void set_commit_buffer_the_repository(struct commit *, void 
*buffer, unsigned lo
  * Get any cached object buffer associated with the commit. Returns NULL
  * if none. The resulting memory should not be freed.
  */
-const void *get_cached_commit_buffer(const struct commit *, unsigned long 
*size);
+#define get_cached_commit_buffer(r, c, s) get_cached_commit_buffer_##r(c, s)
+const void *get_cached_commit_buffer_the_repository(const struct commit *, 
unsigned long *size);
 
 /*
  * Get the commit's object contents, either from cache or by reading the object
diff --git a/object.c b/object.c
index d1f77565af6..f08a8874de3 100644
--- a/object.c
+++ b/object.c
@@ -216,7 +216,7 @@ struct object *parse_object_buffer_the_repository(const 
struct object_id *oid, e
if (commit) {
if (parse_commit_buffer(the_repository, commit, buffer, 
size, 1))
return NULL;
-   if (!get_cached_commit_buffer(commit, NULL)) {
+   if (!get_cached_commit_buffer(the_repository, commit, 
NULL)) {
set_commit_buffer(the_repository, commit, 
buffer, size);
*eaten_p = 1;
}
diff --git a/pretty.c b/pretty.c
index cbd25b6ceae..cde4fe07db0 100644
--- a/pretty.c
+++ b/pretty.c
@@ -630,7 +630,7 @@ const char *logmsg_reencode(const struct commit *commit,
 * the cached copy from get_commit_buffer, we need to duplicate 
it
 * to avoid munging the cached copy.
 */
-   if (msg == get_cached_commit_buffer(commit, NULL))
+   if (msg == get_cached_commit_buffer(the_repository, commit, 
NULL))
out = xstrdup(msg);
else
out = (char *)msg;
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 11/32] commit: add repository argument to set_commit_buffer

2018-06-28 Thread Stefan Beller

Add a repository argument to allow callers of set_commit_buffer to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
---
 blame.c  | 2 +-
 commit.c | 4 ++--
 commit.h | 3 ++-
 object.c | 2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/blame.c b/blame.c
index 8a0655a5997..cf102276bea 100644
--- a/blame.c
+++ b/blame.c
@@ -158,7 +158,7 @@ static void set_commit_buffer_from_strbuf(struct commit *c, 
struct strbuf *sb)
 {
size_t len;
void *buf = strbuf_detach(sb, );
-   set_commit_buffer(c, buf, len);
+   set_commit_buffer(the_repository, c, buf, len);
 }
 
 /*
diff --git a/commit.c b/commit.c
index 75d0bdede84..cdfb1a025b6 100644
--- a/commit.c
+++ b/commit.c
@@ -262,7 +262,7 @@ struct commit_buffer {
 define_commit_slab(buffer_slab, struct commit_buffer);
 static struct buffer_slab buffer_slab = COMMIT_SLAB_INIT(1, buffer_slab);
 
-void set_commit_buffer(struct commit *commit, void *buffer, unsigned long size)
+void set_commit_buffer_the_repository(struct commit *commit, void *buffer, 
unsigned long size)
 {
struct commit_buffer *v = buffer_slab_at(_slab, commit);
v->buffer = buffer;
@@ -450,7 +450,7 @@ int parse_commit_gently(struct commit *item, int 
quiet_on_missing)
}
ret = parse_commit_buffer(the_repository, item, buffer, size, 0);
if (save_commit_buffer && !ret) {
-   set_commit_buffer(item, buffer, size);
+   set_commit_buffer(the_repository, item, buffer, size);
return 0;
}
free(buffer);
diff --git a/commit.h b/commit.h
index f326c13622b..7c14dfdc54b 100644
--- a/commit.h
+++ b/commit.h
@@ -95,7 +95,8 @@ void parse_commit_or_die(struct commit *item);
  * Associate an object buffer with the commit. The ownership of the
  * memory is handed over to the commit, and must be free()-able.
  */
-void set_commit_buffer(struct commit *, void *buffer, unsigned long size);
+#define set_commit_buffer(r, c, b, s) set_commit_buffer_##r(c, b, s)
+void set_commit_buffer_the_repository(struct commit *, void *buffer, unsigned 
long size);
 
 /*
  * Get any cached object buffer associated with the commit. Returns NULL
diff --git a/object.c b/object.c
index 5494c0cbaa1..d1f77565af6 100644
--- a/object.c
+++ b/object.c
@@ -217,7 +217,7 @@ struct object *parse_object_buffer_the_repository(const 
struct object_id *oid, e
if (parse_commit_buffer(the_repository, commit, buffer, 
size, 1))
return NULL;
if (!get_cached_commit_buffer(commit, NULL)) {
-   set_commit_buffer(commit, buffer, size);
+   set_commit_buffer(the_repository, commit, 
buffer, size);
*eaten_p = 1;
}
obj = >object;
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 17/32] object: allow lookup_object to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 object.c | 15 +++
 object.h |  3 +--
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/object.c b/object.c
index f41f82c6725..9d74de95f5b 100644
--- a/object.c
+++ b/object.c
@@ -84,21 +84,20 @@ static void insert_obj_hash(struct object *obj, struct 
object **hash, unsigned i
  * Look up the record for the given sha1 in the hash map stored in
  * obj_hash.  Return NULL if it was not found.
  */
-struct object *lookup_object_the_repository(const unsigned char *sha1)
+struct object *lookup_object(struct repository *r, const unsigned char *sha1)
 {
unsigned int i, first;
struct object *obj;
 
-   if (!the_repository->parsed_objects->obj_hash)
+   if (!r->parsed_objects->obj_hash)
return NULL;
 
-   first = i = hash_obj(sha1,
-the_repository->parsed_objects->obj_hash_size);
-   while ((obj = the_repository->parsed_objects->obj_hash[i]) != NULL) {
+   first = i = hash_obj(sha1, r->parsed_objects->obj_hash_size);
+   while ((obj = r->parsed_objects->obj_hash[i]) != NULL) {
if (!hashcmp(sha1, obj->oid.hash))
break;
i++;
-   if (i == the_repository->parsed_objects->obj_hash_size)
+   if (i == r->parsed_objects->obj_hash_size)
i = 0;
}
if (obj && i != first) {
@@ -107,8 +106,8 @@ struct object *lookup_object_the_repository(const unsigned 
char *sha1)
 * that we do not need to walk the hash table the next
 * time we look for it.
 */
-   SWAP(the_repository->parsed_objects->obj_hash[i],
-the_repository->parsed_objects->obj_hash[first]);
+   SWAP(r->parsed_objects->obj_hash[i],
+r->parsed_objects->obj_hash[first]);
}
return obj;
 }
diff --git a/object.h b/object.h
index 6f3271eb228..0d7d74129b6 100644
--- a/object.h
+++ b/object.h
@@ -109,8 +109,7 @@ extern struct object *get_indexed_object(unsigned int);
  * half-initialised objects, the caller is expected to initialize them
  * by calling parse_object() on them.
  */
-#define lookup_object(r, s) lookup_object_##r(s)
-struct object *lookup_object_the_repository(const unsigned char *sha1);
+struct object *lookup_object(struct repository *r, const unsigned char *sha1);
 
 extern void *create_object(struct repository *r, const unsigned char *sha1, 
void *obj);
 
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 08/32] commit: add repository argument to lookup_commit_reference

2018-06-28 Thread Stefan Beller

Add a repository argument to allow callers of lookup_commit_reference
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
---
 bisect.c  |  2 +-
 blame.c   |  2 +-
 branch.c  |  2 +-
 builtin/branch.c  |  7 ---
 builtin/clone.c   |  3 ++-
 builtin/describe.c|  2 +-
 builtin/diff-tree.c   |  2 +-
 builtin/log.c |  7 ---
 builtin/merge-base.c  |  5 +++--
 builtin/notes.c   |  3 ++-
 builtin/pull.c| 15 ++-
 builtin/replace.c |  4 ++--
 builtin/reset.c   |  4 ++--
 builtin/rev-parse.c   |  6 +++---
 builtin/show-branch.c |  2 +-
 builtin/tag.c |  2 +-
 bundle.c  |  3 ++-
 commit.c  |  6 +++---
 commit.h  |  4 +++-
 merge-recursive.c |  6 +++---
 notes-merge.c |  5 +++--
 parse-options-cb.c|  2 +-
 remote.c  |  4 ++--
 revision.c|  4 ++--
 sequencer.c   |  6 +++---
 sha1-name.c   |  4 ++--
 submodule.c   |  4 ++--
 27 files changed, 65 insertions(+), 51 deletions(-)

diff --git a/bisect.c b/bisect.c
index 6de1abd407b..e1275ba79e8 100644
--- a/bisect.c
+++ b/bisect.c
@@ -724,7 +724,7 @@ static int bisect_checkout(const struct object_id 
*bisect_rev, int no_checkout)
 
 static struct commit *get_commit_reference(const struct object_id *oid)
 {
-   struct commit *r = lookup_commit_reference(oid);
+   struct commit *r = lookup_commit_reference(the_repository, oid);
if (!r)
die(_("Not a valid commit name %s"), oid_to_hex(oid));
return r;
diff --git a/blame.c b/blame.c
index 5b022cc2254..8a0655a5997 100644
--- a/blame.c
+++ b/blame.c
@@ -119,7 +119,7 @@ static struct commit_list **append_parent(struct 
commit_list **tail, const struc
 {
struct commit *parent;
 
-   parent = lookup_commit_reference(oid);
+   parent = lookup_commit_reference(the_repository, oid);
if (!parent)
die("no such commit %s", oid_to_hex(oid));
return _list_insert(parent, tail)->next;
diff --git a/branch.c b/branch.c
index 6a35dd31f2a..ecd710d7308 100644
--- a/branch.c
+++ b/branch.c
@@ -302,7 +302,7 @@ void create_branch(const char *name, const char *start_name,
break;
}
 
-   if ((commit = lookup_commit_reference()) == NULL)
+   if ((commit = lookup_commit_reference(the_repository, )) == NULL)
die(_("Not a valid branch point: '%s'."), start_name);
oidcpy(, >object.oid);
 
diff --git a/builtin/branch.c b/builtin/branch.c
index 1876ca9e796..a50632fb23f 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -121,7 +121,8 @@ static int branch_merged(int kind, const char *name,
(reference_name = reference_name_to_free =
 resolve_refdup(upstream, RESOLVE_REF_READING,
, NULL)) != NULL)
-   reference_rev = lookup_commit_reference();
+   reference_rev = lookup_commit_reference(the_repository,
+   );
}
if (!reference_rev)
reference_rev = head_rev;
@@ -154,7 +155,7 @@ static int check_branch_commit(const char *branchname, 
const char *refname,
   const struct object_id *oid, struct commit 
*head_rev,
   int kinds, int force)
 {
-   struct commit *rev = lookup_commit_reference(oid);
+   struct commit *rev = lookup_commit_reference(the_repository, oid);
if (!rev) {
error(_("Couldn't look up commit object for '%s'"), refname);
return -1;
@@ -208,7 +209,7 @@ static int delete_branches(int argc, const char **argv, int 
force, int kinds,
}
 
if (!force) {
-   head_rev = lookup_commit_reference(_oid);
+   head_rev = lookup_commit_reference(the_repository, _oid);
if (!head_rev)
die(_("Couldn't look up commit object for HEAD"));
}
diff --git a/builtin/clone.c b/builtin/clone.c
index 1d939af9d8c..4b3b48ee84a 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -696,7 +696,8 @@ static void update_head(const struct ref *our, const struct 
ref *remote,
install_branch_config(0, head, option_origin, 
our->name);
}
} else if (our) {
-   struct commit *c = lookup_commit_reference(>old_oid);
+   struct commit *c = lookup_commit_reference(the_repository,
+  >old_oid);
/* --branch specifies a non-branch (i.e. tags),

[PATCH v3 16/32] object: allow object_as_type to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 object.c | 4 ++--
 object.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/object.c b/object.c
index e095d49b379..f41f82c6725 100644
--- a/object.c
+++ b/object.c
@@ -158,13 +158,13 @@ void *create_object(struct repository *r, const unsigned 
char *sha1, void *o)
return obj;
 }
 
-void *object_as_type_the_repository(struct object *obj, enum object_type type, 
int quiet)
+void *object_as_type(struct repository *r, struct object *obj, enum 
object_type type, int quiet)
 {
if (obj->type == type)
return obj;
else if (obj->type == OBJ_NONE) {
if (type == OBJ_COMMIT)
-   ((struct commit *)obj)->index = 
alloc_commit_index(the_repository);
+   ((struct commit *)obj)->index = alloc_commit_index(r);
obj->type = type;
return obj;
}
diff --git a/object.h b/object.h
index 3faa89578fc..6f3271eb228 100644
--- a/object.h
+++ b/object.h
@@ -114,8 +114,7 @@ struct object *lookup_object_the_repository(const unsigned 
char *sha1);
 
 extern void *create_object(struct repository *r, const unsigned char *sha1, 
void *obj);
 
-#define object_as_type(r, o, t, q) object_as_type_##r(o, t, q)
-void *object_as_type_the_repository(struct object *obj, enum object_type type, 
int quiet);
+void *object_as_type(struct repository *r, struct object *obj, enum 
object_type type, int quiet);
 
 /*
  * Returns the object, having parsed it to find out what it is.
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 19/32] tree: allow lookup_tree to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 tree.c | 10 +-
 tree.h |  3 +--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/tree.c b/tree.c
index 45e89ff08eb..78d440a9c8f 100644
--- a/tree.c
+++ b/tree.c
@@ -195,13 +195,13 @@ int read_tree(struct tree *tree, int stage, struct 
pathspec *match,
return 0;
 }
 
-struct tree *lookup_tree_the_repository(const struct object_id *oid)
+struct tree *lookup_tree(struct repository *r, const struct object_id *oid)
 {
-   struct object *obj = lookup_object(the_repository, oid->hash);
+   struct object *obj = lookup_object(r, oid->hash);
if (!obj)
-   return create_object(the_repository, oid->hash,
-alloc_tree_node(the_repository));
-   return object_as_type(the_repository, obj, OBJ_TREE, 0);
+   return create_object(r, oid->hash,
+alloc_tree_node(r));
+   return object_as_type(r, obj, OBJ_TREE, 0);
 }
 
 int parse_tree_buffer(struct tree *item, void *buffer, unsigned long size)
diff --git a/tree.h b/tree.h
index 2ea21ed174b..d4807dc8058 100644
--- a/tree.h
+++ b/tree.h
@@ -12,8 +12,7 @@ struct tree {
unsigned long size;
 };
 
-#define lookup_tree(r, oid) lookup_tree_##r(oid)
-struct tree *lookup_tree_the_repository(const struct object_id *oid);
+struct tree *lookup_tree(struct repository *r, const struct object_id *oid);
 
 int parse_tree_buffer(struct tree *item, void *buffer, unsigned long size);
 
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 15/32] tag: add repository argument to deref_tag

2018-06-28 Thread Stefan Beller

Add a repository argument to allow the callers of deref_tag
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 blame.c |  6 +++---
 builtin/diff.c  |  2 +-
 builtin/fmt-merge-msg.c |  3 ++-
 builtin/grep.c  |  3 ++-
 builtin/name-rev.c  |  3 ++-
 commit.c|  3 ++-
 fetch-pack.c|  9 ++---
 http-backend.c  |  2 +-
 http-push.c |  2 +-
 line-log.c  |  2 +-
 merge-recursive.c   |  3 ++-
 remote.c|  6 --
 server-info.c   |  2 +-
 sha1-name.c | 11 +++
 shallow.c   |  4 +++-
 tag.c   |  2 +-
 tag.h   |  3 ++-
 upload-pack.c   |  2 +-
 18 files changed, 42 insertions(+), 26 deletions(-)

diff --git a/blame.c b/blame.c
index cf102276bea..4229bcd384c 100644
--- a/blame.c
+++ b/blame.c
@@ -1674,7 +1674,7 @@ static struct commit *find_single_final(struct rev_info 
*revs,
struct object *obj = revs->pending.objects[i].item;
if (obj->flags & UNINTERESTING)
continue;
-   obj = deref_tag(obj, NULL, 0);
+   obj = deref_tag(the_repository, obj, NULL, 0);
if (obj->type != OBJ_COMMIT)
die("Non commit %s?", revs->pending.objects[i].name);
if (found)
@@ -1705,7 +1705,7 @@ static struct commit *dwim_reverse_initial(struct 
rev_info *revs,
 
/* Is that sole rev a committish? */
obj = revs->pending.objects[0].item;
-   obj = deref_tag(obj, NULL, 0);
+   obj = deref_tag(the_repository, obj, NULL, 0);
if (obj->type != OBJ_COMMIT)
return NULL;
 
@@ -1741,7 +1741,7 @@ static struct commit *find_single_initial(struct rev_info 
*revs,
struct object *obj = revs->pending.objects[i].item;
if (!(obj->flags & UNINTERESTING))
continue;
-   obj = deref_tag(obj, NULL, 0);
+   obj = deref_tag(the_repository, obj, NULL, 0);
if (obj->type != OBJ_COMMIT)
die("Non commit %s?", revs->pending.objects[i].name);
if (found)
diff --git a/builtin/diff.c b/builtin/diff.c
index 7971530b9b3..361a3c3ed38 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -402,7 +402,7 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix)
int flags = (obj->flags & UNINTERESTING);
if (!obj->parsed)
obj = parse_object(the_repository, >oid);
-   obj = deref_tag(obj, NULL, 0);
+   obj = deref_tag(the_repository, obj, NULL, 0);
if (!obj)
die(_("invalid object '%s' given."), name);
if (obj->type == OBJ_COMMIT)
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index 36318ef46e7..ff165c0fcd2 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -344,7 +344,8 @@ static void shortlog(const char *name,
const struct object_id *oid = _data->oid;
int limit = opts->shortlog_len;
 
-   branch = deref_tag(parse_object(the_repository, oid), oid_to_hex(oid),
+   branch = deref_tag(the_repository, parse_object(the_repository, oid),
+  oid_to_hex(oid),
   GIT_SHA1_HEXSZ);
if (!branch || branch->type != OBJ_COMMIT)
return;
diff --git a/builtin/grep.c b/builtin/grep.c
index ee753a403eb..538a818e6dd 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -647,7 +647,8 @@ static int grep_objects(struct grep_opt *opt, const struct 
pathspec *pathspec,
 
for (i = 0; i < nr; i++) {
struct object *real_obj;
-   real_obj = deref_tag(list->objects[i].item, NULL, 0);
+   real_obj = deref_tag(the_repository, list->objects[i].item,
+NULL, 0);
 
/* load the gitmodules file for this rev */
if (recurse_submodules) {
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index f6eb419a029..f1cb45c2274 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -455,7 +455,8 @@ int cmd_name_rev(int argc, const char **argv, const char 
*prefix)
commit = NULL;
object = parse_object(the_repository, );
if (object) {
-   struct object *peeled = deref_tag(object, *argv, 0);
+   struct object *peeled = deref_tag(the_repository,
+ object, *argv, 0);
if

[PATCH v3 14/32] tag: add repository argument to parse_tag_buffer

2018-06-28 Thread Stefan Beller

Add a repository argument to allow the callers of parse_tag_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 builtin/replace.c | 2 +-
 log-tree.c| 2 +-
 object.c  | 2 +-
 sha1-file.c   | 2 +-
 tag.c | 4 ++--
 tag.h | 3 ++-
 6 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/builtin/replace.c b/builtin/replace.c
index 0351b7c62cf..ef22d724bbc 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -405,7 +405,7 @@ static int check_one_mergetag(struct commit *commit,
tag = lookup_tag(the_repository, _oid);
if (!tag)
return error(_("bad mergetag in commit '%s'"), ref);
-   if (parse_tag_buffer(tag, extra->value, extra->len))
+   if (parse_tag_buffer(the_repository, tag, extra->value, extra->len))
return error(_("malformed mergetag in commit '%s'"), ref);
 
/* iterate over new parents */
diff --git a/log-tree.c b/log-tree.c
index 840423ca149..76475d0136b 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -503,7 +503,7 @@ static int show_one_mergetag(struct commit *commit,
return -1; /* error message already given */
 
strbuf_init(_message, 256);
-   if (parse_tag_buffer(tag, extra->value, extra->len))
+   if (parse_tag_buffer(the_repository, tag, extra->value, extra->len))
strbuf_addstr(_message, "malformed mergetag\n");
else if (is_common_merge(commit) &&
 !oidcmp(>tagged->oid,
diff --git a/object.c b/object.c
index bcfcfd38760..e095d49b379 100644
--- a/object.c
+++ b/object.c
@@ -225,7 +225,7 @@ struct object *parse_object_buffer_the_repository(const 
struct object_id *oid, e
} else if (type == OBJ_TAG) {
struct tag *tag = lookup_tag(the_repository, oid);
if (tag) {
-   if (parse_tag_buffer(tag, buffer, size))
+   if (parse_tag_buffer(the_repository, tag, buffer, size))
   return NULL;
obj = >object;
}
diff --git a/sha1-file.c b/sha1-file.c
index 75ba30b4ab1..c75ef771f8b 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1809,7 +1809,7 @@ static void check_tag(const void *buf, size_t size)
 {
struct tag t;
memset(, 0, sizeof(t));
-   if (parse_tag_buffer(, buf, size))
+   if (parse_tag_buffer(the_repository, , buf, size))
die("corrupt tag");
 }
 
diff --git a/tag.c b/tag.c
index 5b41fc71fad..4971fd4dfc9 100644
--- a/tag.c
+++ b/tag.c
@@ -126,7 +126,7 @@ void release_tag_memory(struct tag *t)
t->date = 0;
 }
 
-int parse_tag_buffer(struct tag *item, const void *data, unsigned long size)
+int parse_tag_buffer_the_repository(struct tag *item, const void *data, 
unsigned long size)
 {
struct object_id oid;
char type[20];
@@ -203,7 +203,7 @@ int parse_tag(struct tag *item)
return error("Object %s not a tag",
 oid_to_hex(>object.oid));
}
-   ret = parse_tag_buffer(item, data, size);
+   ret = parse_tag_buffer(the_repository, item, data, size);
free(data);
return ret;
 }
diff --git a/tag.h b/tag.h
index 276c448cd55..149959c81ba 100644
--- a/tag.h
+++ b/tag.h
@@ -13,7 +13,8 @@ struct tag {
 };
 #define lookup_tag(r, o) lookup_tag_##r(o)
 extern struct tag *lookup_tag_the_repository(const struct object_id *oid);
-extern int parse_tag_buffer(struct tag *item, const void *data, unsigned long 
size);
+#define parse_tag_buffer(r, i, d, s) parse_tag_buffer_##r(i, d, s)
+extern int parse_tag_buffer_the_repository(struct tag *item, const void *data, 
unsigned long size);
 extern int parse_tag(struct tag *item);
 extern void release_tag_memory(struct tag *t);
 extern struct object *deref_tag(struct object *, const char *, int);
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 06/32] tree: add repository argument to lookup_tree

2018-06-28 Thread Stefan Beller

Add a repository argument to allow the callers of lookup_tree
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 builtin/am.c| 6 --
 builtin/diff-tree.c | 2 +-
 builtin/diff.c  | 3 ++-
 builtin/reflog.c| 2 +-
 cache-tree.c| 3 ++-
 commit-graph.c  | 2 +-
 commit.c| 2 +-
 fsck.c  | 2 +-
 http-push.c | 3 ++-
 list-objects.c  | 2 +-
 merge-recursive.c   | 6 +++---
 object.c| 2 +-
 reachable.c | 2 +-
 revision.c  | 4 ++--
 sequencer.c | 2 +-
 tag.c   | 2 +-
 tree.c  | 4 ++--
 tree.h  | 3 ++-
 walker.c| 3 ++-
 19 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 6273ea5195b..72e928cee79 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -32,6 +32,7 @@
 #include "apply.h"
 #include "string-list.h"
 #include "packfile.h"
+#include "repository.h"
 
 /**
  * Returns 1 if the file is empty or does not exist, 0 otherwise.
@@ -1400,9 +1401,10 @@ static void write_index_patch(const struct am_state 
*state)
FILE *fp;
 
if (!get_oid_tree("HEAD", ))
-   tree = lookup_tree();
+   tree = lookup_tree(the_repository, );
else
-   tree = lookup_tree(the_hash_algo->empty_tree);
+   tree = lookup_tree(the_repository,
+  the_repository->hash_algo->empty_tree);
 
fp = xfopen(am_path(state, "patch"), "w");
init_revisions(_info, NULL);
diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index d8db8f682f0..29901515a13 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -46,7 +46,7 @@ static int stdin_diff_trees(struct tree *tree1, const char *p)
struct tree *tree2;
if (!isspace(*p++) || parse_oid_hex(p, , ) || *p)
return error("Need exactly two trees, separated by a space");
-   tree2 = lookup_tree();
+   tree2 = lookup_tree(the_repository, );
if (!tree2 || parse_tree(tree2))
return -1;
printf("%s %s\n", oid_to_hex(>object.oid),
diff --git a/builtin/diff.c b/builtin/diff.c
index d0421c90e29..7971530b9b3 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -386,7 +386,8 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix)
add_head_to_pending();
if (!rev.pending.nr) {
struct tree *tree;
-   tree = 
lookup_tree(the_hash_algo->empty_tree);
+   tree = lookup_tree(the_repository,
+  
the_repository->hash_algo->empty_tree);
add_pending_object(, >object, 
"HEAD");
}
break;
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 948002b81ea..5e12c856049 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -66,7 +66,7 @@ static int tree_is_complete(const struct object_id *oid)
int complete;
struct tree *tree;
 
-   tree = lookup_tree(oid);
+   tree = lookup_tree(the_repository, oid);
if (!tree)
return 0;
if (tree->object.flags & SEEN)
diff --git a/cache-tree.c b/cache-tree.c
index 6b467119960..181d5919f0f 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -671,7 +671,8 @@ static void prime_cache_tree_rec(struct cache_tree *it, 
struct tree *tree)
cnt++;
else {
struct cache_tree_sub *sub;
-   struct tree *subtree = lookup_tree(entry.oid);
+   struct tree *subtree = lookup_tree(the_repository,
+  entry.oid);
if (!subtree->object.parsed)
parse_tree(subtree);
sub = cache_tree_sub(it, entry.path);
diff --git a/commit-graph.c b/commit-graph.c
index b63a1fc85ea..d1a68f0128f 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -344,7 +344,7 @@ static struct tree *load_tree_for_commit(struct 
commit_graph *g, struct commit *
   GRAPH_DATA_WIDTH * (c->graph_pos);
 
hashcpy(oid.hash, commit_data);
-   c->maybe_tree = lookup_tree();
+   c->maybe_tree = lookup_tree(the_repository, );
 
return c->maybe_tree;
 }
diff --git a/commit.c b/commit.c
index 0d55600e643..2fa4220ac86 100644
--- a/commit.c
+++ b/commit.c
@@ -383,7 +383,7 @@ int

[PATCH v3 01/32] object: add repository argument to parse_object

2018-06-28 Thread Stefan Beller

Add a repository argument to allow the callers of parse_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 builtin/diff-tree.c |  3 ++-
 builtin/diff.c  |  2 +-
 builtin/fast-export.c   |  2 +-
 builtin/fmt-merge-msg.c |  6 --
 builtin/fsck.c  |  4 ++--
 builtin/log.c   |  3 ++-
 builtin/name-rev.c  |  7 ---
 builtin/receive-pack.c  |  6 +++---
 builtin/reflog.c|  3 ++-
 builtin/rev-list.c  |  2 +-
 bundle.c|  5 +++--
 commit.c|  5 +++--
 fetch-pack.c| 18 ++
 fsck.c  |  3 ++-
 http-backend.c  |  2 +-
 http-push.c |  6 --
 log-tree.c  |  7 ---
 merge-recursive.c   |  4 +++-
 object.c|  4 ++--
 object.h|  3 ++-
 packfile.c  |  2 +-
 pretty.c|  2 +-
 ref-filter.c|  3 ++-
 reflog-walk.c   |  3 ++-
 refs/files-backend.c|  2 +-
 remote.c|  4 ++--
 revision.c  | 14 +++---
 server-info.c   |  2 +-
 sha1-name.c | 14 +++---
 tag.c   |  5 +++--
 tree.c  |  5 +++--
 upload-pack.c   | 13 +++--
 walker.c|  3 ++-
 33 files changed, 95 insertions(+), 72 deletions(-)

diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index 473615117e0..d8db8f682f0 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -5,6 +5,7 @@
 #include "log-tree.h"
 #include "builtin.h"
 #include "submodule.h"
+#include "repository.h"
 
 static struct rev_info log_tree_opt;
 
@@ -68,7 +69,7 @@ static int diff_tree_stdin(char *line)
line[len-1] = 0;
if (parse_oid_hex(line, , ))
return -1;
-   obj = parse_object();
+   obj = parse_object(the_repository, );
if (!obj)
return -1;
if (obj->type == OBJ_COMMIT)
diff --git a/builtin/diff.c b/builtin/diff.c
index b709b6e9842..d0421c90e29 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -400,7 +400,7 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix)
const char *name = entry->name;
int flags = (obj->flags & UNINTERESTING);
if (!obj->parsed)
-   obj = parse_object(>oid);
+   obj = parse_object(the_repository, >oid);
obj = deref_tag(obj, NULL, 0);
if (!obj)
die(_("invalid object '%s' given."), name);
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 9ee6a4d2e8f..a16aeaa826f 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -801,7 +801,7 @@ static struct commit *get_commit(struct rev_cmdline_entry 
*e, char *full_name)
 
/* handle nested tags */
while (tag && tag->object.type == OBJ_TAG) {
-   parse_object(>object.oid);
+   parse_object(the_repository, >object.oid);
string_list_append(_refs, full_name)->util = tag;
tag = (struct tag *)tag->tagged;
}
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index 1b526adb3a9..5e44589b545 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -11,6 +11,7 @@
 #include "branch.h"
 #include "fmt-merge-msg.h"
 #include "gpg-interface.h"
+#include "repository.h"
 
 static const char * const fmt_merge_msg_usage[] = {
N_("git fmt-merge-msg [-m ] [--log[=] | --no-log] [--file 
]"),
@@ -343,7 +344,8 @@ static void shortlog(const char *name,
const struct object_id *oid = _data->oid;
int limit = opts->shortlog_len;
 
-   branch = deref_tag(parse_object(oid), oid_to_hex(oid), GIT_SHA1_HEXSZ);
+   branch = deref_tag(parse_object(the_repository, oid), oid_to_hex(oid),
+  GIT_SHA1_HEXSZ);
if (!branch || branch->type != OBJ_COMMIT)
return;
 
@@ -563,7 +565,7 @@ static void find_merge_parents(struct merge_parents *result,
 * "name" here and we do not want to contaminate its
 * util field yet.
 */
-   obj = parse_object();
+   obj = parse_object(the_repository, );
parent = (struct commit *)peel_to_type(NULL, 0, obj, 
OBJ_COMMIT);
if (!parent)
continue;
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 3ad4f160f99..2b0930101d3 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -452,7 +452,7 @@ static int fsck_handle_ref(const char *refname, const 
struct

[PATCH v3 18/32] blob: allow lookup_blob to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 blob.c | 10 +-
 blob.h |  3 +--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/blob.c b/blob.c
index 17b9314f0a0..342bdbb1bbe 100644
--- a/blob.c
+++ b/blob.c
@@ -5,13 +5,13 @@
 
 const char *blob_type = "blob";
 
-struct blob *lookup_blob_the_repository(const struct object_id *oid)
+struct blob *lookup_blob(struct repository *r, const struct object_id *oid)
 {
-   struct object *obj = lookup_object(the_repository, oid->hash);
+   struct object *obj = lookup_object(r, oid->hash);
if (!obj)
-   return create_object(the_repository, oid->hash,
-alloc_blob_node(the_repository));
-   return object_as_type(the_repository, obj, OBJ_BLOB, 0);
+   return create_object(r, oid->hash,
+alloc_blob_node(r));
+   return object_as_type(r, obj, OBJ_BLOB, 0);
 }
 
 int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size)
diff --git a/blob.h b/blob.h
index 08bc34487a0..16648720557 100644
--- a/blob.h
+++ b/blob.h
@@ -9,8 +9,7 @@ struct blob {
struct object object;
 };
 
-#define lookup_blob(r, o) lookup_blob_##r(o)
-struct blob *lookup_blob_the_repository(const struct object_id *oid);
+struct blob *lookup_blob(struct repository *r, const struct object_id *oid);
 
 int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size);
 
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 20/32] commit: allow lookup_commit to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 commit.c | 10 +-
 commit.h |  3 +--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/commit.c b/commit.c
index aa5557dee84..8749e151451 100644
--- a/commit.c
+++ b/commit.c
@@ -53,13 +53,13 @@ struct commit *lookup_commit_or_die(const struct object_id 
*oid, const char *ref
return c;
 }
 
-struct commit *lookup_commit_the_repository(const struct object_id *oid)
+struct commit *lookup_commit(struct repository *r, const struct object_id *oid)
 {
-   struct object *obj = lookup_object(the_repository, oid->hash);
+   struct object *obj = lookup_object(r, oid->hash);
if (!obj)
-   return create_object(the_repository, oid->hash,
-alloc_commit_node(the_repository));
-   return object_as_type(the_repository, obj, OBJ_COMMIT, 0);
+   return create_object(r, oid->hash,
+alloc_commit_node(r));
+   return object_as_type(r, obj, OBJ_COMMIT, 0);
 }
 
 struct commit *lookup_commit_reference_by_name(const char *name)
diff --git a/commit.h b/commit.h
index 237607d64cb..27888d82468 100644
--- a/commit.h
+++ b/commit.h
@@ -63,8 +63,7 @@ enum decoration_type {
 void add_name_decoration(enum decoration_type type, const char *name, struct 
object *obj);
 const struct name_decoration *get_name_decoration(const struct object *obj);
 
-#define lookup_commit(r, o) lookup_commit_##r(o)
-struct commit *lookup_commit_the_repository(const struct object_id *oid);
+struct commit *lookup_commit(struct repository *r, const struct object_id 
*oid);
 #define lookup_commit_reference(r, o) \
lookup_commit_reference_##r(o)
 struct commit *lookup_commit_reference_the_repository(const struct object_id 
*oid);
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 02/32] object: add repository argument to lookup_object

2018-06-28 Thread Stefan Beller

Add a repository argument to allow callers of lookup_object to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
---
 blob.c   |  2 +-
 builtin/fast-export.c|  5 +++--
 builtin/fsck.c   |  5 +++--
 builtin/name-rev.c   |  3 ++-
 builtin/prune.c  |  2 +-
 builtin/unpack-objects.c |  2 +-
 commit.c |  2 +-
 fetch-pack.c | 13 +++--
 http-push.c  |  2 +-
 object.c |  8 
 object.h |  3 ++-
 reachable.c  |  4 ++--
 tag.c|  2 +-
 tree.c   |  2 +-
 upload-pack.c|  2 +-
 15 files changed, 31 insertions(+), 26 deletions(-)

diff --git a/blob.c b/blob.c
index 458dafa811e..75b737a761e 100644
--- a/blob.c
+++ b/blob.c
@@ -7,7 +7,7 @@ const char *blob_type = "blob";
 
 struct blob *lookup_blob(const struct object_id *oid)
 {
-   struct object *obj = lookup_object(oid->hash);
+   struct object *obj = lookup_object(the_repository, oid->hash);
if (!obj)
return create_object(the_repository, oid->hash,
 alloc_blob_node(the_repository));
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index a16aeaa826f..e39c4e2c1d0 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -230,7 +230,7 @@ static void export_blob(const struct object_id *oid)
if (is_null_oid(oid))
return;
 
-   object = lookup_object(oid->hash);
+   object = lookup_object(the_repository, oid->hash);
if (object && object->flags & SHOWN)
return;
 
@@ -402,7 +402,8 @@ static void show_filemodify(struct diff_queue_struct *q,
   anonymize_sha1(>oid) :
   spec->oid.hash));
else {
-   struct object *object = 
lookup_object(spec->oid.hash);
+   struct object *object = 
lookup_object(the_repository,
+ 
spec->oid.hash);
printf("M %06o :%d ", spec->mode,
   get_object_mark(object));
}
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2b0930101d3..12d01e91747 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -410,7 +410,7 @@ static void fsck_handle_reflog_oid(const char *refname, 
struct object_id *oid,
struct object *obj;
 
if (!is_null_oid(oid)) {
-   obj = lookup_object(oid->hash);
+   obj = lookup_object(the_repository, oid->hash);
if (obj && (obj->flags & HAS_OBJ)) {
if (timestamp && name_objects)
add_decoration(fsck_walk_options.object_names,
@@ -763,7 +763,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
const char *arg = argv[i];
struct object_id oid;
if (!get_oid(arg, )) {
-   struct object *obj = lookup_object(oid.hash);
+   struct object *obj = lookup_object(the_repository,
+  oid.hash);
 
if (!obj || !(obj->flags & HAS_OBJ)) {
if (is_promisor_object())
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index de54fa93e4f..f6eb419a029 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -379,7 +379,8 @@ static void name_rev_line(char *p, struct name_ref_data 
*data)
*(p+1) = 0;
if (!get_oid(p - (GIT_SHA1_HEXSZ - 1), )) {
struct object *o =
-   lookup_object(oid.hash);
+   lookup_object(the_repository,
+ oid.hash);
if (o)
name = get_rev_name(o, );
}
diff --git a/builtin/prune.c b/builtin/prune.c
index 70ec35aa058..72b0621b768 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -40,7 +40,7 @@ static int prune_object(const struct object_id *oid, const 
char *fullpath,
 * Do we know about this object?
 * It must have been reachable
 */
-   if (lookup_object(oid->hash))
+   if (lookup_object(the_repository, oid->hash))
return 0;
 
if (lstat(fullpath, )) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index cf585fcc5ee..335b5ed9a0f

[PATCH v3 21/32] tag: allow lookup_tag to handle arbitrary repositories

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 tag.c | 10 +-
 tag.h |  3 +--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/tag.c b/tag.c
index fbb4659325b..46b5882ee12 100644
--- a/tag.c
+++ b/tag.c
@@ -92,13 +92,13 @@ struct object *deref_tag_noverify(struct object *o)
return o;
 }
 
-struct tag *lookup_tag_the_repository(const struct object_id *oid)
+struct tag *lookup_tag(struct repository *r, const struct object_id *oid)
 {
-   struct object *obj = lookup_object(the_repository, oid->hash);
+   struct object *obj = lookup_object(r, oid->hash);
if (!obj)
-   return create_object(the_repository, oid->hash,
-alloc_tag_node(the_repository));
-   return object_as_type(the_repository, obj, OBJ_TAG, 0);
+   return create_object(r, oid->hash,
+alloc_tag_node(r));
+   return object_as_type(r, obj, OBJ_TAG, 0);
 }
 
 static timestamp_t parse_tag_date(const char *buf, const char *tail)
diff --git a/tag.h b/tag.h
index 45b0b08b1f6..6a160c91875 100644
--- a/tag.h
+++ b/tag.h
@@ -11,8 +11,7 @@ struct tag {
char *tag;
timestamp_t date;
 };
-#define lookup_tag(r, o) lookup_tag_##r(o)
-extern struct tag *lookup_tag_the_repository(const struct object_id *oid);
+extern struct tag *lookup_tag(struct repository *r, const struct object_id 
*oid);
 #define parse_tag_buffer(r, i, d, s) parse_tag_buffer_##r(i, d, s)
 extern int parse_tag_buffer_the_repository(struct tag *item, const void *data, 
unsigned long size);
 extern int parse_tag(struct tag *item);
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 10/32] commit: add repository argument to parse_commit_buffer

2018-06-28 Thread Stefan Beller

Add a repository argument to allow the callers of parse_commit_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 commit.c| 4 ++--
 commit.h| 3 ++-
 object.c| 2 +-
 sha1-file.c | 2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/commit.c b/commit.c
index 4803c8be1df..75d0bdede84 100644
--- a/commit.c
+++ b/commit.c
@@ -363,7 +363,7 @@ const void *detach_commit_buffer(struct commit *commit, 
unsigned long *sizep)
return ret;
 }
 
-int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long 
size, int check_graph)
+int parse_commit_buffer_the_repository(struct commit *item, const void 
*buffer, unsigned long size, int check_graph)
 {
const char *tail = buffer;
const char *bufptr = buffer;
@@ -448,7 +448,7 @@ int parse_commit_gently(struct commit *item, int 
quiet_on_missing)
return error("Object %s not a commit",
 oid_to_hex(>object.oid));
}
-   ret = parse_commit_buffer(item, buffer, size, 0);
+   ret = parse_commit_buffer(the_repository, item, buffer, size, 0);
if (save_commit_buffer && !ret) {
set_commit_buffer(item, buffer, size);
return 0;
diff --git a/commit.h b/commit.h
index cd80dab59c1..f326c13622b 100644
--- a/commit.h
+++ b/commit.h
@@ -82,7 +82,8 @@ struct commit *lookup_commit_reference_by_name(const char 
*name);
  */
 struct commit *lookup_commit_or_die(const struct object_id *oid, const char 
*ref_name);
 
-int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long 
size, int check_graph);
+#define parse_commit_buffer(r, i, b, s, g) parse_commit_buffer_##r(i, b, s, g)
+int parse_commit_buffer_the_repository(struct commit *item, const void 
*buffer, unsigned long size, int check_graph);
 int parse_commit_gently(struct commit *item, int quiet_on_missing);
 static inline int parse_commit(struct commit *item)
 {
diff --git a/object.c b/object.c
index 530c55e41e4..5494c0cbaa1 100644
--- a/object.c
+++ b/object.c
@@ -214,7 +214,7 @@ struct object *parse_object_buffer_the_repository(const 
struct object_id *oid, e
} else if (type == OBJ_COMMIT) {
struct commit *commit = lookup_commit(the_repository, oid);
if (commit) {
-   if (parse_commit_buffer(commit, buffer, size, 1))
+   if (parse_commit_buffer(the_repository, commit, buffer, 
size, 1))
return NULL;
if (!get_cached_commit_buffer(commit, NULL)) {
set_commit_buffer(commit, buffer, size);
diff --git a/sha1-file.c b/sha1-file.c
index de4839e634c..75ba30b4ab1 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1801,7 +1801,7 @@ static void check_commit(const void *buf, size_t size)
 {
struct commit c;
memset(, 0, sizeof(c));
-   if (parse_commit_buffer(, buf, size, 0))
+   if (parse_commit_buffer(the_repository, , buf, size, 0))
die("corrupt commit");
 }
 
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 04/32] object: add repository argument to object_as_type

2018-06-28 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 blob.c | 2 +-
 builtin/fsck.c | 2 +-
 commit.c   | 4 ++--
 object.c   | 2 +-
 object.h   | 3 ++-
 refs.c | 2 +-
 tag.c  | 2 +-
 tree.c | 2 +-
 8 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/blob.c b/blob.c
index 75b737a761e..dada295698c 100644
--- a/blob.c
+++ b/blob.c
@@ -11,7 +11,7 @@ struct blob *lookup_blob(const struct object_id *oid)
if (!obj)
return create_object(the_repository, oid->hash,
 alloc_blob_node(the_repository));
-   return object_as_type(obj, OBJ_BLOB, 0);
+   return object_as_type(the_repository, obj, OBJ_BLOB, 0);
 }
 
 int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 09cf5333444..a906fe4a82b 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -70,7 +70,7 @@ static const char *printable_type(struct object *obj)
enum object_type type = oid_object_info(the_repository,
>oid, NULL);
if (type > 0)
-   object_as_type(obj, type, 0);
+   object_as_type(the_repository, obj, type, 0);
}
 
ret = type_name(obj->type);
diff --git a/commit.c b/commit.c
index b4dbfd889a3..0d55600e643 100644
--- a/commit.c
+++ b/commit.c
@@ -32,7 +32,7 @@ struct commit *lookup_commit_reference_gently(const struct 
object_id *oid,
 
if (!obj)
return NULL;
-   return object_as_type(obj, OBJ_COMMIT, quiet);
+   return object_as_type(the_repository, obj, OBJ_COMMIT, quiet);
 }
 
 struct commit *lookup_commit_reference(const struct object_id *oid)
@@ -58,7 +58,7 @@ struct commit *lookup_commit(const struct object_id *oid)
if (!obj)
return create_object(the_repository, oid->hash,
 alloc_commit_node(the_repository));
-   return object_as_type(obj, OBJ_COMMIT, 0);
+   return object_as_type(the_repository, obj, OBJ_COMMIT, 0);
 }
 
 struct commit *lookup_commit_reference_by_name(const char *name)
diff --git a/object.c b/object.c
index 49719694c14..404919043d4 100644
--- a/object.c
+++ b/object.c
@@ -158,7 +158,7 @@ void *create_object(struct repository *r, const unsigned 
char *sha1, void *o)
return obj;
 }
 
-void *object_as_type(struct object *obj, enum object_type type, int quiet)
+void *object_as_type_the_repository(struct object *obj, enum object_type type, 
int quiet)
 {
if (obj->type == type)
return obj;
diff --git a/object.h b/object.h
index 2ba23c07a72..3faa89578fc 100644
--- a/object.h
+++ b/object.h
@@ -114,7 +114,8 @@ struct object *lookup_object_the_repository(const unsigned 
char *sha1);
 
 extern void *create_object(struct repository *r, const unsigned char *sha1, 
void *obj);
 
-void *object_as_type(struct object *obj, enum object_type type, int quiet);
+#define object_as_type(r, o, t, q) object_as_type_##r(o, t, q)
+void *object_as_type_the_repository(struct object *obj, enum object_type type, 
int quiet);
 
 /*
  * Returns the object, having parsed it to find out what it is.
diff --git a/refs.c b/refs.c
index 3b4508a97aa..fcfd3171e83 100644
--- a/refs.c
+++ b/refs.c
@@ -305,7 +305,7 @@ enum peel_status peel_object(const struct object_id *name, 
struct object_id *oid
 
if (o->type == OBJ_NONE) {
int type = oid_object_info(the_repository, name, NULL);
-   if (type < 0 || !object_as_type(o, type, 0))
+   if (type < 0 || !object_as_type(the_repository, o, type, 0))
return PEEL_INVALID;
}
 
diff --git a/tag.c b/tag.c
index 1b95eb9f07f..a14a4f23037 100644
--- a/tag.c
+++ b/tag.c
@@ -98,7 +98,7 @@ struct tag *lookup_tag(const struct object_id *oid)
if (!obj)
return create_object(the_repository, oid->hash,
 alloc_tag_node(the_repository));
-   return object_as_type(obj, OBJ_TAG, 0);
+   return object_as_type(the_repository, obj, OBJ_TAG, 0);
 }
 
 static timestamp_t parse_tag_date(const char *buf, const char *tail)
diff --git a/tree.c b/tree.c
index 73e8a8a948d..f31afb81be0 100644
--- a/tree.c
+++ b/tree.c
@@ -201,7 +201,7 @@ struct tree *lookup_tree(const struct object_id *oid)
if (!obj)
return create_object(the_repository, oid->hash,
 alloc_tree_node(the_repository));
-   return object_as_type(obj, OBJ_TREE, 0);
+   return object_as_type(the_repository, obj, OBJ_TREE, 0);
 }
 
 int parse_tree_buffer(struct tree *item, void *buffer, unsigned long size)
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v3 05/32] blob: add repository argument to lookup_blob

2018-06-28 Thread Stefan Beller

Add a repository argument to allow the callers of lookup_blob
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 blob.c   | 2 +-
 blob.h   | 3 ++-
 builtin/fast-export.c| 2 +-
 builtin/fsck.c   | 3 ++-
 builtin/index-pack.c | 2 +-
 builtin/merge-tree.c | 3 ++-
 builtin/unpack-objects.c | 2 +-
 fsck.c   | 4 ++--
 http-push.c  | 3 ++-
 list-objects.c   | 2 +-
 object.c | 4 ++--
 reachable.c  | 2 +-
 revision.c   | 4 ++--
 tag.c| 2 +-
 walker.c | 3 ++-
 15 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/blob.c b/blob.c
index dada295698c..17b9314f0a0 100644
--- a/blob.c
+++ b/blob.c
@@ -5,7 +5,7 @@
 
 const char *blob_type = "blob";
 
-struct blob *lookup_blob(const struct object_id *oid)
+struct blob *lookup_blob_the_repository(const struct object_id *oid)
 {
struct object *obj = lookup_object(the_repository, oid->hash);
if (!obj)
diff --git a/blob.h b/blob.h
index 44606168310..08bc34487a0 100644
--- a/blob.h
+++ b/blob.h
@@ -9,7 +9,8 @@ struct blob {
struct object object;
 };
 
-struct blob *lookup_blob(const struct object_id *oid);
+#define lookup_blob(r, o) lookup_blob_##r(o)
+struct blob *lookup_blob_the_repository(const struct object_id *oid);
 
 int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size);
 
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 03a2e4b79e6..7d6b1d8aea2 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -236,7 +236,7 @@ static void export_blob(const struct object_id *oid)
 
if (anonymize) {
buf = anonymize_blob();
-   object = (struct object *)lookup_blob(oid);
+   object = (struct object *)lookup_blob(the_repository, oid);
eaten = 0;
} else {
buf = read_object_file(oid, , );
diff --git a/builtin/fsck.c b/builtin/fsck.c
index a906fe4a82b..263191942d4 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -810,7 +810,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
mode = active_cache[i]->ce_mode;
if (S_ISGITLINK(mode))
continue;
-   blob = lookup_blob(_cache[i]->oid);
+   blob = lookup_blob(the_repository,
+  _cache[i]->oid);
if (!blob)
continue;
obj = >object;
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 9c08cf3cead..de311febe30 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -832,7 +832,7 @@ static void sha1_object(const void *data, struct 
object_entry *obj_entry,
if (strict || do_fsck_object) {
read_lock();
if (type == OBJ_BLOB) {
-   struct blob *blob = lookup_blob(oid);
+   struct blob *blob = lookup_blob(the_repository, oid);
if (blob)
blob->object.flags |= FLAG_CHECKED;
else
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index 8a8d5797520..f8023bae1e2 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -2,6 +2,7 @@
 #include "tree-walk.h"
 #include "xdiff-interface.h"
 #include "object-store.h"
+#include "repository.h"
 #include "blob.h"
 #include "exec-cmd.h"
 #include "merge-blobs.h"
@@ -170,7 +171,7 @@ static struct merge_list *create_entry(unsigned stage, 
unsigned mode, const stru
res->stage = stage;
res->path = path;
res->mode = mode;
-   res->blob = lookup_blob(oid);
+   res->blob = lookup_blob(the_repository, oid);
return res;
 }
 
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 75d1d5ea0b8..716408e3a9d 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -254,7 +254,7 @@ static void write_object(unsigned nr, enum object_type type,
added_object(nr, type, buf, size);
free(buf);
 
-   blob = lookup_blob(_list[nr].oid);
+   blob = lookup_blob(the_repository, _list[nr].oid);
if (blob)
blob->object.flags |= FLAG_WRITTEN;
else
diff --git a/fsck.c b/fsck.c
index bb3d622fb93..ea00f7228df 100644
--- a/fsck.c
+++ b/fsck.c
@@ -414,7 +414,7 @@ static int fsck_walk_tree(struct tree *tree, void *data, 
struct fsck_options *op
result =

[PATCH v3 03/32] object: add repository argument to parse_object_buffer

2018-06-28 Thread Stefan Beller

Add a repository argument to allow the callers of parse_object_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 builtin/fast-export.c| 3 ++-
 builtin/fsck.c   | 7 +--
 builtin/index-pack.c | 3 ++-
 builtin/unpack-objects.c | 3 ++-
 object.c | 5 +++--
 object.h | 3 ++-
 ref-filter.c | 3 ++-
 7 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index e39c4e2c1d0..03a2e4b79e6 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -244,7 +244,8 @@ static void export_blob(const struct object_id *oid)
die ("Could not read blob %s", oid_to_hex(oid));
if (check_object_signature(oid, buf, size, type_name(type)) < 0)
die("sha1 mismatch in blob %s", oid_to_hex(oid));
-   object = parse_object_buffer(oid, type, size, buf, );
+   object = parse_object_buffer(the_repository, oid, type,
+size, buf, );
}
 
if (!object)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 12d01e91747..09cf5333444 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -392,7 +392,8 @@ static int fsck_obj_buffer(const struct object_id *oid, 
enum object_type type,
 * verify_packfile(), data_valid variable for details.
 */
struct object *obj;
-   obj = parse_object_buffer(oid, type, size, buffer, eaten);
+   obj = parse_object_buffer(the_repository, oid, type, size, buffer,
+ eaten);
if (!obj) {
errors_found |= ERROR_OBJECT;
return error("%s: object corrupt or missing", oid_to_hex(oid));
@@ -525,7 +526,9 @@ static int fsck_loose(const struct object_id *oid, const 
char *path, void *data)
if (!contents && type != OBJ_BLOB)
BUG("read_loose_object streamed a non-blob");
 
-   obj = parse_object_buffer(oid, type, size, contents, );
+   obj = parse_object_buffer(the_repository, oid, type, size,
+ contents, );
+
if (!obj) {
errors_found |= ERROR_OBJECT;
error("%s: object could not be parsed: %s",
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 74fe2973e12..9c08cf3cead 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -851,7 +851,8 @@ static void sha1_object(const void *data, struct 
object_entry *obj_entry,
 * we do not need to free the memory here, as the
 * buf is deleted by the caller.
 */
-   obj = parse_object_buffer(oid, type, size, buf,
+   obj = parse_object_buffer(the_repository, oid, type,
+ size, buf,
  );
if (!obj)
die(_("invalid %s"), type_name(type));
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 335b5ed9a0f..75d1d5ea0b8 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -265,7 +265,8 @@ static void write_object(unsigned nr, enum object_type type,
int eaten;
hash_object_file(buf, size, type_name(type), _list[nr].oid);
added_object(nr, type, buf, size);
-   obj = parse_object_buffer(_list[nr].oid, type, size, buf,
+   obj = parse_object_buffer(the_repository, _list[nr].oid,
+ type, size, buf,
  );
if (!obj)
die("invalid %s", type_name(type));
diff --git a/object.c b/object.c
index 002ebb69e3b..49719694c14 100644
--- a/object.c
+++ b/object.c
@@ -186,7 +186,7 @@ struct object *lookup_unknown_object(const unsigned char 
*sha1)
return obj;
 }
 
-struct object *parse_object_buffer(const struct object_id *oid, enum 
object_type type, unsigned long size, void *buffer, int *eaten_p)
+struct object *parse_object_buffer_the_repository(const struct object_id *oid, 
enum object_type type, unsigned long size, void *buffer, int *eaten_p)
 {
struct object *obj;
*eaten_p = 0;
@@ -278,7 +278,8 @@ struct object *parse_object_the_repository(const struct 
object_id *oid)
return NULL;
}
 
-   obj = parse_object_buffer(oid, type, size, buffer, );
+   obj = parse_object_buffer(the_repository, oid, type, size,
+

Re: [PATCH 4/4] fsck: silence stderr when parsing .gitmodules

2018-06-28 Thread Ramsay Jones




On 28/06/18 23:12, Jeff King wrote:
> On Thu, Jun 28, 2018 at 06:06:03PM -0400, Jeff King wrote:
> 
>> Note that we didn't test this case at all, so I've added
>> coverage in t7415. We may end up toning down or removing
>> this fsck check in the future. So take this test as checking
>> what happens now with a focus on stderr, and not any
>> ironclad guarantee that we must detect and report parse
>> failures in the future.
> 
> And such a patch _could_ look something like this. Though we could also
> perhaps leave it in place and tone it down to "ignore" by default.
> 
> There's another case that triggers gitmodulesParse, too, which is a blob
> so gigantic that we try not to hold it in memory. Technically that could
> also happen due to somebody using .gitmodules for something unrelated.
> But that seems even more far-fetched. And it _is_ dangerous to leave,
> because I think existing vulnerable clients will try to load a 500MB
> .gitmodules file in memory and parse it.

I also applied and tested the patch below. I think this patch
must be included in the series.

ATB,
Ramsay Jones

> ---
> diff --git a/fsck.c b/fsck.c
> index 87b0e228bd..296e8a8a7c 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -1013,11 +1013,9 @@ static int fsck_blob(struct blob *blob, const char 
> *buf,
>   data.options = options;
>   data.ret = 0;
>   config_opts.error_action = CONFIG_ERROR_SILENT;
> - if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB,
> - ".gitmodules", buf, size, , _opts))
> - data.ret |= report(options, >object,
> -FSCK_MSG_GITMODULES_PARSE,
> -"could not parse gitmodules blob");
> + /* ignore errors */
> + git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB,
> + ".gitmodules", buf, size, , _opts);
>  
>   return data.ret;
>  }
> diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh
> index ba8af785a5..9a7dae88a5 100755
> --- a/t/t7415-submodule-names.sh
> +++ b/t/t7415-submodule-names.sh
> @@ -176,7 +176,7 @@ test_expect_success 'fsck detects non-blob .gitmodules' '
>   )
>  '
>  
> -test_expect_success 'fsck detects corrupt .gitmodules' '
> +test_expect_success 'fsck does not complain about corrupt .gitmodules' '
>   git init corrupt &&
>   (
>   cd corrupt &&
> @@ -185,9 +185,8 @@ test_expect_success 'fsck detects corrupt .gitmodules' '
>   git add .gitmodules &&
>   git commit -m "broken gitmodules" &&
>  
> - test_must_fail git fsck 2>output &&
> - grep gitmodulesParse output &&
> - test_i18ngrep ! "bad config" output
> + git fsck 2>output &&
> + ! test -s output
>   )
>  '
>  
>

Re: [PATCH] fsck: check skiplist for object in fsck_blob()

2018-06-28 Thread Ramsay Jones

On 28/06/18 23:03, Jeff King wrote:
> On Thu, Jun 28, 2018 at 07:53:27PM +0100, Ramsay Jones wrote:
[snip]
> Yes, it can go in quickly. But I'd prefer not to keep it in the long
> term if it's literally doing nothing.

Hmm, I don't think you can say its doing nothing!

"Yeah, this solution seems sensible. Given that we would
 never report any error for that blob, there is no point
 in even looking at it."

... is no less true, with or without additional patches! ;-)

> I have some patches which I think solve your problem. They apply on
> v2.18.0, but not on v2.17.1 (because they rely on Dscho's increased
> passing of config_options in v2.18). Is that good enough?

Heh, I was also writing patches to address this tonight (but
I was also watching the football, so I was somewhat behind you).
My patches were not too dissimilar to yours, except I was aiming
to allow even do_config_from_file() etc., to suppress errors.

Your patches were cleaner and more focused than mine. (Instead of
turning die_on_error into an enum, I added an additional 'quiet'
flag. When pushing the stack (eg. for include files), I had to
copy the quiet flag from the parent struct, etc, ... ;-) ).

> Yes, it would include any syntax error. I also have a slight worry about
> that, but nobody seems to have screamed _yet_. :)

Hmm, I don't think we can ignore this. :(

> Here are the patches I came up with.

Yes, I applied these locally and tested them. All OK here.

So, FWIW, Ack!

[I still think my original patch, with the 'to_be_skipped'
function name changed to 'object_on_skiplist', should be
the first patch of the series!]

ATB,
Ramsay Jones

[PATCH v4 9/9] diff.c: add white space mode to move detection that allows indent changes

2018-06-28 Thread Stefan Beller

The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.

To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough.  However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".

As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.

This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.

As this is a white space mode, it is made exclusive to other white space
modes in the move detection.

This patch brings some challenges, related to the detection of blocks.
We need a white net the catch the possible moved lines, but then need to
narrow down to check if the blocks are still in tact. Consider this
example (ignoring block sizes):

 - A
 - B
 - C
 +A
 +B
 +C

At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.

Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:

 -  A
 ...
 + A 
 ...

As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:

 - A
 - B
 -   A
 -   B
 +A
 +B
 +  A
 +  B

When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is
indented.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/diff-options.txt |   5 ++
 diff.c | 158 -
 diff.h |   3 +
 t/t4015-diff-whitespace.sh |  98 ++--
 4 files changed, 256 insertions(+), 8 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 80e29e39854..143acd9417e 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -307,6 +307,11 @@ ignore-space-change::
 ignore-all-space::
Ignore whitespace when comparing lines. This ignores differences
even if one line has whitespace where the other line has none.
+allow-indentation-change::
+   Initially ignore any white spaces in the move detection, then
+   group the moved code blocks only into a block if the change in
+   whitespace is the same per line. This is incompatible with the
+   other modes.
 --
 
 --word-diff[=]::
diff --git a/diff.c b/diff.c
index 4963819e530..f51f0ac32f4 100644
--- a/diff.c
+++ b/diff.c
@@ -302,12 +302,18 @@ static int parse_color_moved_ws(const char *arg)
ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
else if (!strcmp(sb.buf, "ignore-all-space"))
ret |= XDF_IGNORE_WHITESPACE;
+   else if (!strcmp(sb.buf, "allow-indentation-change"))
+   ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
else
error(_("ignoring unknown color-moved-ws mode '%s'"), 
sb.buf);
 
strbuf_release();
}
 
+   if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) &&
+   (ret & XDF_WHITESPACE_FLAGS))
+   die(_("color-moved-ws: allow-indentation-change cannot be 
combined with other white space modes"));
+
string_list_clear(, 0);
 
return ret;
@@ -737,7 +743,91 @@ struct moved_entry {
struct hashmap_entry ent;
const struct emitted_diff_symbol *es;
struct moved_entry *next_line;
+   struct ws_delta *wsd;
+};
+
+/**
+ * The struct ws_delta holds white space differences between moved lines, i.e.
+ * between '+' and '-' lines that have been detected to be a move.
+ * The string contains the difference in leading white spaces, before the
+ * rest of

[PATCH v4 8/9] diff.c: factor advance_or_nullify out of mark_color_as_moved

2018-06-28 Thread Stefan Beller

This moves the part of code that checks if we're still in a block
into its own function.  We'll need a different approach on advancing
the blocks in a later patch, so having it as a separate function will
prove useful.

While at it rename the variable `p` to `prev` to indicate that it refers
to the previous line. This is as pmb[i] was assigned in the last iteration
of the outmost for loop.

Further rename `pnext` to `cur` to indicate that this should match up with
the current line of the outmost for loop.

Also replace the advancement of pmb[i] to reuse `cur` instead of
using `p->next` (which is how the name for pnext could be explained.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 diff.c | 32 
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/diff.c b/diff.c
index 70eeb40c5fd..4963819e530 100644
--- a/diff.c
+++ b/diff.c
@@ -801,6 +801,25 @@ static void add_lines_to_move_detection(struct 
diff_options *o,
}
 }
 
+static void pmb_advance_or_null(struct diff_options *o,
+   struct moved_entry *match,
+   struct hashmap *hm,
+   struct moved_entry **pmb,
+   int pmb_nr)
+{
+   int i;
+   for (i = 0; i < pmb_nr; i++) {
+   struct moved_entry *prev = pmb[i];
+   struct moved_entry *cur = (prev && prev->next_line) ?
+   prev->next_line : NULL;
+   if (cur && !hm->cmpfn(o, cur, match, NULL)) {
+   pmb[i] = cur;
+   } else {
+   pmb[i] = NULL;
+   }
+   }
+}
+
 static int shrink_potential_moved_blocks(struct moved_entry **pmb,
 int pmb_nr)
 {
@@ -875,7 +894,6 @@ static void mark_color_as_moved(struct diff_options *o,
struct moved_entry *key;
struct moved_entry *match = NULL;
struct emitted_diff_symbol *l = >emitted_symbols->buf[n];
-   int i;
 
switch (l->s) {
case DIFF_SYMBOL_PLUS:
@@ -906,17 +924,7 @@ static void mark_color_as_moved(struct diff_options *o,
if (o->color_moved == COLOR_MOVED_PLAIN)
continue;
 
-   /* Check any potential block runs, advance each or nullify */
-   for (i = 0; i < pmb_nr; i++) {
-   struct moved_entry *p = pmb[i];
-   struct moved_entry *pnext = (p && p->next_line) ?
-   p->next_line : NULL;
-   if (pnext && !hm->cmpfn(o, pnext, match, NULL)) {
-   pmb[i] = p->next_line;
-   } else {
-   pmb[i] = NULL;
-   }
-   }
+   pmb_advance_or_null(o, match, hm, pmb, pmb_nr);
 
pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr);
 
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v4 1/9] xdiff/xdiff.h: remove unused flags

2018-06-28 Thread Stefan Beller

These flags were there since the beginning (3443546f6e (Use a *real*
built-in diff generator, 2006-03-24), but were never used. Remove them.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 xdiff/xdiff.h | 8 
 1 file changed, 8 deletions(-)

diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index c1937a29112..2356da5f784 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -52,14 +52,6 @@ extern "C" {
 #define XDL_EMIT_FUNCNAMES (1 << 0)
 #define XDL_EMIT_FUNCCONTEXT (1 << 2)
 
-#define XDL_MMB_READONLY (1 << 0)
-
-#define XDL_MMF_ATOMIC (1 << 0)
-
-#define XDL_BDOP_INS 1
-#define XDL_BDOP_CPY 2
-#define XDL_BDOP_INSB 3
-
 /* merge simplification levels */
 #define XDL_MERGE_MINIMAL 0
 #define XDL_MERGE_EAGER 1
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v4 5/9] diff.c: adjust hash function signature to match hashmap expectation

2018-06-28 Thread Stefan Beller

This makes the follow up patch easier.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 diff.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/diff.c b/diff.c
index ce7bedc1b92..d1bae900cdc 100644
--- a/diff.c
+++ b/diff.c
@@ -707,11 +707,15 @@ struct moved_entry {
struct moved_entry *next_line;
 };
 
-static int moved_entry_cmp(const struct diff_options *diffopt,
-  const struct moved_entry *a,
-  const struct moved_entry *b,
+static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
+  const void *entry,
+  const void *entry_or_key,
   const void *keydata)
 {
+   const struct diff_options *diffopt = hashmap_cmp_fn_data;
+   const struct moved_entry *a = entry;
+   const struct moved_entry *b = entry_or_key;
+
return !xdiff_compare_lines(a->es->line, a->es->len,
b->es->line, b->es->len,
diffopt->xdl_opts);
@@ -5534,10 +5538,8 @@ static void diff_flush_patch_all_file_pairs(struct 
diff_options *o)
if (o->color_moved) {
struct hashmap add_lines, del_lines;
 
-   hashmap_init(_lines,
-(hashmap_cmp_fn)moved_entry_cmp, o, 0);
-   hashmap_init(_lines,
-(hashmap_cmp_fn)moved_entry_cmp, o, 0);
+   hashmap_init(_lines, moved_entry_cmp, o, 0);
+   hashmap_init(_lines, moved_entry_cmp, o, 0);
 
add_lines_to_move_detection(o, _lines, _lines);
mark_color_as_moved(o, _lines, _lines);
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v4 2/9] xdiff/xdiffi.c: remove unneeded function declarations

2018-06-28 Thread Stefan Beller

There is no need to forward-declare these functions, as they are used
after their implementation only.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 xdiff/xdiffi.c | 17 -
 1 file changed, 17 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 0de1ef463bf..3e8aff92bc4 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -22,34 +22,17 @@
 
 #include "xinclude.h"
 
-
-
 #define XDL_MAX_COST_MIN 256
 #define XDL_HEUR_MIN_COST 256
 #define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1)
 #define XDL_SNAKE_CNT 20
 #define XDL_K_HEUR 4
 
-
-
 typedef struct s_xdpsplit {
long i1, i2;
int min_lo, min_hi;
 } xdpsplit_t;
 
-
-
-
-static long xdl_split(unsigned long const *ha1, long off1, long lim1,
- unsigned long const *ha2, long off2, long lim2,
- long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
- xdalgoenv_t *xenv);
-static xdchange_t *xdl_add_change(xdchange_t *xscr, long i1, long i2, long 
chg1, long chg2);
-
-
-
-
-
 /*
  * See "An O(ND) Difference Algorithm and its Variations", by Eugene Myers.
  * Basically considers a "box" (off1, off2, lim1, lim2) and scan from both
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v4 0/9] Moved code detection: ignore space on uniform indentation

2018-06-28 Thread Stefan Beller

v4:
 * see range diff below
 * brought best practices to t4015 and have git not upstream of a pipe
   (new patch 3)
 * squashed in the SQUASH patches
 * fixed the translation as well.

v3:
 This is a complete rewrite of the actual patch, with slight modifications]
 on the refactoring how to decouple the white space treatment from the
 move detection. See range-diff below.
 https://public-inbox.org/git/20180622015725.219575-1-sbel...@google.com/

v2: https://public-inbox.org/git/20180424210330.87861-1-sbel...@google.com/

v1: https://public-inbox.org/git/20180402224854.86922-1-sbel...@google.com/

Stefan Beller (9):
  xdiff/xdiff.h: remove unused flags
  xdiff/xdiffi.c: remove unneeded function declarations
  t4015: avoid git as a pipe input
  diff.c: do not pass diff options as keydata to hashmap
  diff.c: adjust hash function signature to match hashmap expectation
  diff.c: add a blocks mode for moved code detection
  diff.c: decouple white space treatment from move detection algorithm
  diff.c: factor advance_or_nullify out of mark_color_as_moved
  diff.c: add white space mode to move detection that allows indent
changes

 Documentation/diff-options.txt |  30 +++-
 diff.c | 253 +
 diff.h |   9 +-
 t/t4015-diff-whitespace.sh | 243 ++-
 xdiff/xdiff.h  |   8 --
 xdiff/xdiffi.c |  17 ---
 6 files changed, 472 insertions(+), 88 deletions(-)

-- 
2.18.0.399.gad0ab374a1-goog

1:  ace2dc2bc11 = 1:  ace2dc2bc11 xdiff/xdiff.h: remove unused flags
2:  53b3574564e = 2:  53b3574564e xdiff/xdiffi.c: remove unneeded function 
declarations
-:  --- > 3:  9b58621e0d8 t4015: avoid git as a pipe input
3:  34850b565df = 4:  be0ea0717e1 diff.c: do not pass diff options as keydata 
to hashmap
4:  8148e51178f = 5:  ff7e8721afa diff.c: adjust hash function signature to 
match hashmap expectation
5:  9d1de6a208e ! 6:  73c2801a5e3 diff.c: add a blocks mode for moved code 
detection
@@ -19,7 +19,7 @@
moved line, but it is not very useful in a review to determine
if a block of code was moved without permutation.
 -zebra::
-+blocks:
++blocks::
Blocks of moved text of at least 20 alphanumeric characters
are detected greedily. The detected blocks are
 -  painted using either the 'color.diff.{old,new}Moved' color or
@@ -95,10 +95,8 @@
test_config color.diff.newMovedDimmed "normal cyan" &&
test_config color.diff.oldMovedAlternativeDimmed "normal blue" &&
test_config color.diff.newMovedAlternativeDimmed "normal yellow" &&
-+
-+  git diff HEAD --no-renames --color-moved=blocks --color |
-+  grep -v "index" |
-+  test_decode_color >actual &&
++  git diff HEAD --no-renames --color-moved=blocks --color >actual.raw &&
++  grep -v "index" actual.raw | test_decode_color >actual &&
 +  cat <<-\EOF >expected &&
 +  diff --git a/lines.txt b/lines.txt
 +  --- a/lines.txt
@@ -141,6 +139,16 @@
 +  test_config color.diff.newMovedDimmed "normal cyan" &&
 +  test_config color.diff.oldMovedAlternativeDimmed "normal blue" &&
 +  test_config color.diff.newMovedAlternativeDimmed "normal yellow" &&
-   git diff HEAD --no-renames --color-moved=dimmed_zebra --color |
-   grep -v "index" |
-   test_decode_color >actual &&
+   git diff HEAD --no-renames --color-moved=dimmed_zebra --color 
>actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
+   cat <<-\EOF >expected &&
+@@
+   7charsA
+   EOF
+ 
+-  git diff HEAD --color-moved=zebra --color --no-renames | grep -v 
"index" | test_decode_color >actual &&
++  git diff HEAD --color-moved=zebra --color --no-renames >actual.raw &&
++  grep -v "index" actual.raw | test_decode_color >actual &&
+   cat >expected <<-\EOF &&
+   diff --git a/bar b/bar
+   --- a/bar
6:  e37bb7b1fc8 ! 7:  87a8919c260 diff.c: decouple white space treatment from 
move detection algorithm
@@ -55,6 +55,7 @@
 +ignore-all-space::
 +  Ignore whitespace when comparing lines. This ignores differences
 +  even if one line has whitespace where the other line has none.
++--
 +
  --word-diff[=]::
Show a word diff, using the  to delimit changed words.
@@ -154,32 +155,32 @@
EOF
test_cmp expected actual &&
  
--  git diff HEAD --no-renames -w --color-moved --color |
+-  git diff HEAD --no-renames -w --color-moved --color >actual.raw &&
 +  git diff HEAD --no-renames --color-moved --color \
-+  --color-moved-ws=ignore-all-space |
-   grep -v "index" |
-   test_decode_color >actual &&
++  --color-moved-ws=ignore-all-space >actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
cat <<-\EOF >expected &&
+

[PATCH v4 7/9] diff.c: decouple white space treatment from move detection algorithm

2018-06-28 Thread Stefan Beller

In the original implementation of the move detection logic the choice for
ignoring white space changes is the same for the move detection as it is
for the regular diff.  Some cases came up where different treatment would
have been nice.

Allow the user to specify that white space should be ignored differently
during detection of moved lines than during generation of added and removed
lines. This is done by providing analogs to the --ignore-space-at-eol,
-b, and -w options by introducing the option --color-moved-ws=
with the modes named "ignore-space-at-eol", "ignore-space-change" and
"ignore-all-space", which is used only during the move detection phase.

As we change the default, we'll adjust the tests.

For now we do not infer any options to treat white spaces in the move
detection from the generic white space options given to diff.
This can be tuned later to reasonable default.

As we plan on adding more white space related options in a later patch,
that interferes with the current white space options, use a flag field
and clamp it down to  XDF_WHITESPACE_FLAGS, as that (a) allows to easily
check at parse time if we give invalid combinations and (b) can reuse
parts of this patch.

By having the white space treatment in its own option, we'll also
make it easier for a later patch to have an config option for
spaces in the move detection.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/diff-options.txt | 17 +
 diff.c | 39 +++--
 diff.h |  1 +
 t/t4015-diff-whitespace.sh | 64 +++---
 4 files changed, 115 insertions(+), 6 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index ba56169de31..80e29e39854 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -292,6 +292,23 @@ dimmed_zebra::
blocks are considered interesting, the rest is uninteresting.
 --
 
+--color-moved-ws=::
+   This configures how white spaces are ignored when performing the
+   move detection for `--color-moved`. These modes can be given
+   as a comma separated list:
++
+--
+ignore-space-at-eol::
+   Ignore changes in whitespace at EOL.
+ignore-space-change::
+   Ignore changes in amount of whitespace.  This ignores whitespace
+   at line end, and considers all other sequences of one or
+   more whitespace characters to be equivalent.
+ignore-all-space::
+   Ignore whitespace when comparing lines. This ignores differences
+   even if one line has whitespace where the other line has none.
+--
+
 --word-diff[=]::
Show a word diff, using the  to delimit changed words.
By default, words are delimited by whitespace; see
diff --git a/diff.c b/diff.c
index 95c51c0b7df..70eeb40c5fd 100644
--- a/diff.c
+++ b/diff.c
@@ -283,6 +283,36 @@ static int parse_color_moved(const char *arg)
return error(_("color moved setting must be one of 'no', 
'default', 'blocks', 'zebra', 'dimmed_zebra', 'plain'"));
 }
 
+static int parse_color_moved_ws(const char *arg)
+{
+   int ret = 0;
+   struct string_list l = STRING_LIST_INIT_DUP;
+   struct string_list_item *i;
+
+   string_list_split(, arg, ',', -1);
+
+   for_each_string_list_item(i, ) {
+   struct strbuf sb = STRBUF_INIT;
+   strbuf_addstr(, i->string);
+   strbuf_trim();
+
+   if (!strcmp(sb.buf, "ignore-space-change"))
+   ret |= XDF_IGNORE_WHITESPACE_CHANGE;
+   else if (!strcmp(sb.buf, "ignore-space-at-eol"))
+   ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
+   else if (!strcmp(sb.buf, "ignore-all-space"))
+   ret |= XDF_IGNORE_WHITESPACE;
+   else
+   error(_("ignoring unknown color-moved-ws mode '%s'"), 
sb.buf);
+
+   strbuf_release();
+   }
+
+   string_list_clear(, 0);
+
+   return ret;
+}
+
 int git_diff_ui_config(const char *var, const char *value, void *cb)
 {
if (!strcmp(var, "diff.color") || !strcmp(var, "color.diff")) {
@@ -717,10 +747,12 @@ static int moved_entry_cmp(const void 
*hashmap_cmp_fn_data,
const struct diff_options *diffopt = hashmap_cmp_fn_data;
const struct moved_entry *a = entry;
const struct moved_entry *b = entry_or_key;
+   unsigned flags = diffopt->color_moved_ws_handling
+& XDF_WHITESPACE_FLAGS;
 
return !xdiff_compare_lines(a->es->line, a->es->len,
b->es->line, b->es->len,
-   diffopt->xdl_opts);
+   flags);
 }
 
 static struct moved_entry *prepare_entry(struct diff_options *o,
@@ -728,8 +760,9 @@ static struct moved_entry *prepare_entry(struct 
diff_options *o,
 {
struct moved_entry *ret = xmalloc(sizeof(*ret));

[PATCH v4 4/9] diff.c: do not pass diff options as keydata to hashmap

2018-06-28 Thread Stefan Beller

When we initialize the hashmap, we give it a pointer to the
diff_options, which it then passes along to each call of the
hashmap_cmp_fn function. There's no need to pass it a second time as
the "keydata" parameter, and our comparison functions never look at
keydata.

This was a mistake left over from an earlier round of 2e2d5ac184
(diff.c: color moved lines differently, 2017-06-30), before hashmap
learned to pass the data pointer for us.

Explanation-by: Jeff King 
Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 diff.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index 1289df4b1f9..ce7bedc1b92 100644
--- a/diff.c
+++ b/diff.c
@@ -842,13 +842,13 @@ static void mark_color_as_moved(struct diff_options *o,
case DIFF_SYMBOL_PLUS:
hm = del_lines;
key = prepare_entry(o, n);
-   match = hashmap_get(hm, key, o);
+   match = hashmap_get(hm, key, NULL);
free(key);
break;
case DIFF_SYMBOL_MINUS:
hm = add_lines;
key = prepare_entry(o, n);
-   match = hashmap_get(hm, key, o);
+   match = hashmap_get(hm, key, NULL);
free(key);
break;
default:
-- 
2.18.0.399.gad0ab374a1-goog

[PATCH v4 3/9] t4015: avoid git as a pipe input

2018-06-28 Thread Stefan Beller

In t4015 we have a pattern of

git diff [] |
grep -v "index" |
test_decode_color >actual &&

to produce output that we want to test against. This pattern was introduced
in 86b452e2769 (diff.c: add dimming to moved line detection, 2017-06-30)
as then the focus on getting the colors right. However the pattern used
is not best practice as we do care about the exit code of Git. So let's
not have Git as the upstream of a pipe. Piping the output of grep to
some function is fine as we assume grep to be un-flawed in our test suite.

Signed-off-by: Stefan Beller 
---
 t/t4015-diff-whitespace.sh | 50 +++---
 1 file changed, 20 insertions(+), 30 deletions(-)

diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 17df491a3ab..ddbc3901385 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1271,9 +1271,8 @@ test_expect_success 'detect permutations inside moved 
code -- dimmed_zebra' '
test_config color.diff.newMovedDimmed "normal cyan" &&
test_config color.diff.oldMovedAlternativeDimmed "normal blue" &&
test_config color.diff.newMovedAlternativeDimmed "normal yellow" &&
-   git diff HEAD --no-renames --color-moved=dimmed_zebra --color |
-   grep -v "index" |
-   test_decode_color >actual &&
+   git diff HEAD --no-renames --color-moved=dimmed_zebra --color 
>actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
cat <<-\EOF >expected &&
diff --git a/lines.txt b/lines.txt
--- a/lines.txt
@@ -1315,9 +1314,8 @@ test_expect_success 'cmd option assumes configured 
colored-moved' '
test_config color.diff.oldMovedAlternativeDimmed "normal blue" &&
test_config color.diff.newMovedAlternativeDimmed "normal yellow" &&
test_config diff.colorMoved zebra &&
-   git diff HEAD --no-renames --color-moved --color |
-   grep -v "index" |
-   test_decode_color >actual &&
+   git diff HEAD --no-renames --color-moved --color >actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
cat <<-\EOF >expected &&
diff --git a/lines.txt b/lines.txt
--- a/lines.txt
@@ -1395,9 +1393,8 @@ test_expect_success 'move detection ignoring whitespace ' 
'
line 4
line 5
EOF
-   git diff HEAD --no-renames --color-moved --color |
-   grep -v "index" |
-   test_decode_color >actual &&
+   git diff HEAD --no-renames --color-moved --color >actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
cat <<-\EOF >expected &&
diff --git a/lines.txt b/lines.txt
--- a/lines.txt
@@ -1419,9 +1416,8 @@ test_expect_success 'move detection ignoring whitespace ' 
'
EOF
test_cmp expected actual &&
 
-   git diff HEAD --no-renames -w --color-moved --color |
-   grep -v "index" |
-   test_decode_color >actual &&
+   git diff HEAD --no-renames -w --color-moved --color >actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
cat <<-\EOF >expected &&
diff --git a/lines.txt b/lines.txt
--- a/lines.txt
@@ -1459,9 +1455,8 @@ test_expect_success 'move detection ignoring whitespace 
changes' '
line 5
EOF
 
-   git diff HEAD --no-renames --color-moved --color |
-   grep -v "index" |
-   test_decode_color >actual &&
+   git diff HEAD --no-renames --color-moved --color >actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
cat <<-\EOF >expected &&
diff --git a/lines.txt b/lines.txt
--- a/lines.txt
@@ -1483,9 +1478,8 @@ test_expect_success 'move detection ignoring whitespace 
changes' '
EOF
test_cmp expected actual &&
 
-   git diff HEAD --no-renames -b --color-moved --color |
-   grep -v "index" |
-   test_decode_color >actual &&
+   git diff HEAD --no-renames -b --color-moved --color >actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
cat <<-\EOF >expected &&
diff --git a/lines.txt b/lines.txt
--- a/lines.txt
@@ -1526,9 +1520,8 @@ test_expect_success 'move detection ignoring whitespace 
at eol' '
# avoid cluttering the output with complaints about our eol whitespace
test_config core.whitespace -blank-at-eol &&
 
-   git diff HEAD --no-renames --color-moved --color |
-   grep -v "index" |
-   test_decode_color >actual &&
+   git diff HEAD --no-renames --color-moved --color >actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
cat <<-\EOF >expected &&
diff --git a/lines.txt b/lines.txt
--- a/lines.txt
@@ -1550,9 +1543,8 @@ test_expect_success 'move detection ignoring whitespace 
at eol' '
EOF

[PATCH v4 6/9] diff.c: add a blocks mode for moved code detection

2018-06-28 Thread Stefan Beller

The new "blocks" mode provides a middle ground between plain and zebra.
It is as intuitive (few colors) as plain, but still has the requirement
for a minimum of lines/characters to count a block as moved.

Suggested-by: Ævar Arnfjörð Bjarmason 
 (https://public-inbox.org/git/87o9j0uljo@evledraar.gmail.com/)
Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/diff-options.txt |  8 --
 diff.c |  6 +++--
 diff.h |  5 ++--
 t/t4015-diff-whitespace.sh | 49 --
 4 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index e3a44f03cdc..ba56169de31 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -276,10 +276,14 @@ plain::
that are added somewhere else in the diff. This mode picks up any
moved line, but it is not very useful in a review to determine
if a block of code was moved without permutation.
-zebra::
+blocks::
Blocks of moved text of at least 20 alphanumeric characters
are detected greedily. The detected blocks are
-   painted using either the 'color.diff.{old,new}Moved' color or
+   painted using either the 'color.diff.{old,new}Moved' color.
+   Adjacent blocks cannot be told apart.
+zebra::
+   Blocks of moved text are detected as in 'blocks' mode. The blocks
+   are painted using either the 'color.diff.{old,new}Moved' color or
'color.diff.{old,new}MovedAlternative'. The change between
the two colors indicates that a new block was detected.
 dimmed_zebra::
diff --git a/diff.c b/diff.c
index d1bae900cdc..95c51c0b7df 100644
--- a/diff.c
+++ b/diff.c
@@ -271,6 +271,8 @@ static int parse_color_moved(const char *arg)
return COLOR_MOVED_NO;
else if (!strcmp(arg, "plain"))
return COLOR_MOVED_PLAIN;
+   else if (!strcmp(arg, "blocks"))
+   return COLOR_MOVED_BLOCKS;
else if (!strcmp(arg, "zebra"))
return COLOR_MOVED_ZEBRA;
else if (!strcmp(arg, "default"))
@@ -278,7 +280,7 @@ static int parse_color_moved(const char *arg)
else if (!strcmp(arg, "dimmed_zebra"))
return COLOR_MOVED_ZEBRA_DIM;
else
-   return error(_("color moved setting must be one of 'no', 
'default', 'zebra', 'dimmed_zebra', 'plain'"));
+   return error(_("color moved setting must be one of 'no', 
'default', 'blocks', 'zebra', 'dimmed_zebra', 'plain'"));
 }
 
 int git_diff_ui_config(const char *var, const char *value, void *cb)
@@ -903,7 +905,7 @@ static void mark_color_as_moved(struct diff_options *o,
 
block_length++;
 
-   if (flipped_block)
+   if (flipped_block && o->color_moved != COLOR_MOVED_BLOCKS)
l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT;
}
adjust_last_block(o, n, block_length);
diff --git a/diff.h b/diff.h
index d29560f822c..7bd4f182c33 100644
--- a/diff.h
+++ b/diff.h
@@ -208,8 +208,9 @@ struct diff_options {
enum {
COLOR_MOVED_NO = 0,
COLOR_MOVED_PLAIN = 1,
-   COLOR_MOVED_ZEBRA = 2,
-   COLOR_MOVED_ZEBRA_DIM = 3,
+   COLOR_MOVED_BLOCKS = 2,
+   COLOR_MOVED_ZEBRA = 3,
+   COLOR_MOVED_ZEBRA_DIM = 4,
} color_moved;
#define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
#define COLOR_MOVED_MIN_ALNUM_COUNT 20
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index ddbc3901385..e54529f026d 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1223,7 +1223,7 @@ test_expect_success 'plain moved code, inside file' '
test_cmp expected actual
 '
 
-test_expect_success 'detect permutations inside moved code -- dimmed_zebra' '
+test_expect_success 'detect blocks of moved code' '
git reset --hard &&
cat <<-\EOF >lines.txt &&
long line 1
@@ -1271,6 +1271,50 @@ test_expect_success 'detect permutations inside moved 
code -- dimmed_zebra' '
test_config color.diff.newMovedDimmed "normal cyan" &&
test_config color.diff.oldMovedAlternativeDimmed "normal blue" &&
test_config color.diff.newMovedAlternativeDimmed "normal yellow" &&
+   git diff HEAD --no-renames --color-moved=blocks --color >actual.raw &&
+   grep -v "index" actual.raw | test_decode_color >actual &&
+   cat <<-\EOF >expected &&
+   diff --git a/lines.txt b/lines.txt
+   --- a/lines.txt
+   +++ b/lines.txt
+   @@ -1,16 +1,16 @@
+   -long line 1
+   -long line 2
+   -long line 3
+line 4
+line 5
+line 6
+line 7
+line 8
+line 9
+   +long line 1
+   +long line 2
+   +long line 3
+   +long line 14
+   +long line 15
+   +long line 16
+line 10
+

Re: [PATCH v6 0/4] stash: Convert some `git stash` commands to a builtin

2018-06-28 Thread Paul-Sebastian Ungureanu


Hello,

On 27.06.2018 01:12, Johannes Schindelin wrote:


Joel Teichroeb (4):
   stash: convert apply to builtin
   stash: convert drop and clear to builtin
   stash: convert branch to builtin
   stash: convert pop to builtin


Were there any changes since v5 in these patches?



There are some changes since v5 based on the reviews.

There are also some changes which were not suggested in the reviews, but 
which I considered to be good (I hope they really are). For example, in 
the previous version of the patch that introduces the `stash--helper`, 
there was no `assert_stash_like()` function (it was part of 
`get_stash_info()`). In v6, it was implemented as a separate function in 
order to keep it more alike to `git-stash.sh` script. Another difference 
was to change `assert_stash_ref()` function type from `int` to `void` 
and call `exit()` inside. This was introduced in order to make it more 
similar with `assert()` from `assert.h`. I do not think of these changes 
to be something of great importance, but I hope they might make the code 
easier to read.


Best,
Paul

Re: [PATCH 0/1] Makefile: fix the "built from commit" code

2018-06-28 Thread brian m. carlson

On Thu, Jun 28, 2018 at 12:53:15PM +, Johannes Schindelin via GitGitGadget 
wrote:
> Let's fix that quoting, and while at it, also suppress the unhelpful
> message
> 
> fatal: not a git repository (or any of the parent directories): .git
> 
> that gets printed to stderr if no current commit could be determined,
> and might scare the occasional developer who simply tries to build Git
> from scratch.

I saw that building Git 2.18.0 for $DAYJOB.  Thanks for fixing it.

The series looked good to me, too.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: [PATCH v6 2/4] stash: convert drop and clear to builtin

2018-06-28 Thread Paul-Sebastian Ungureanu


Hi,

On 27.06.2018 01:17, Johannes Schindelin wrote:

I thought you had introduced `get_oidf()` specifically so you could avoid
the `rev-parse` call... `get_oidf(_oid, "%s@{0}", ref_stash)` should
do this, right?


We discussed this over the IRC channel, but since not everybody might 
follow the chat, I would like to put it also here.


The main reason why `get_oidf()` was introduced is to make 
`assert_stash_like()` less tedious. I am sure that this is not the only 
place where this function could be useful.


Over the last weeks, I have been working in parallel on adding a new 
flag (`GET_OID_GENTLY`) for `get_oid_with_context()`. By using this 
flag, the `rev-parse` call could be replaced with an internal API call. 
I hope that I will be able to send this patch soon enough.


Best regards,
Paul

Re: What's cooking in git.git (Jun 2018, #07; Thu, 28)

2018-06-28 Thread Stefan Beller

On Thu, Jun 28, 2018 at 2:40 PM Junio C Hamano  wrote:
> The tip of 'next' has been rewound and it currently has only 4
> topics.  Quite a many topics are cooking in 'pu' and need to be
> sifted into good bins (for 'next') and the remainder.  Help is
> appreciated in that area ;-)

Which branches are needing review most? Or should I start
commenting on random stuff? ;-)

>
> * bw/ref-in-want (2018-06-28) 8 commits
>  - fetch-pack: implement ref-in-want
>  - fetch-pack: put shallow info in output parameter
>  - fetch: refactor to make function args narrower
>  - fetch: refactor fetch_refs into two functions
>  - fetch: refactor the population of peer ref OIDs
>  - upload-pack: test negotiation with changing repository
>  - upload-pack: implement ref-in-want
>  - test-pkt-line: add unpack-sideband subcommand
>
>  Protocol v2 has been updated to allow slightly out-of-sync set of
>  servers to work together to serve a single client, which would be
>  useful with load-balanced servers that talk smart HTTP transport.
>

I think another selling point, which should be emphasized more is
the ease of ACL checking on the server side.
Even when the read permissions are per repository (e.g. githubs
model AFAICT), the serving side still has to do a lookup "Can
a wanted sha1 be reached from all refs?", which now can be
optimized away ("Can I access the master branch?") and it
is cheaper to run the server.
Specifically if read permissions are per ref (Gerrits model), I'd
expect some CPU savings on the serving side.

> * jt/remove-pack-bitmap-global (2018-06-21) 2 commits
>  - pack-bitmap: add free function
>  - pack-bitmap: remove bitmap_git global variable
>
>  The effort to move globals to per-repository in-core structure
>  continues.

This is mostly done, though Peff seems to expect a reroll with
clarification on how the series is structured?
https://public-inbox.org/git/20180611211033.gb26...@sigill.intra.peff.net/

> * sb/submodule-move-head-error-msg (2018-06-25) 1 commit
>  - submodule.c: report the submodule that an error occurs in
>
>  Needs a reroll.
>  cf. <20180622081713.5360-1-szeder@gmail.com>

https://public-inbox.org/git/xmqqmuviq2n7@gitster-ct.c.googlers.com/

suggests that you applied that change and a reroll would not be needed.


>
> * ds/multi-pack-index (2018-06-25) 24 commits
>  - midx: clear midx on repack
>  - packfile: skip loading index if in multi-pack-index
>  - midx: prevent duplicate packfile loads
>  - midx: use midx in approximate_object_count
>  - midx: use existing midx when writing new one
>  - midx: use midx in abbreviation calculations
>  - midx: read objects from multi-pack-index
>  - midx: prepare midxed_git struct
>  - config: create core.multiPackIndex setting
>  - midx: write object offsets
>  - midx: write object id fanout chunk
>  - midx: write object ids in a chunk
>  - midx: sort and deduplicate objects from packfiles
>  - midx: read pack names into array
>  - multi-pack-index: write pack names in chunk
>  - multi-pack-index: read packfile list
>  - packfile: generalize pack directory list
>  - multi-pack-index: expand test data
>  - multi-pack-index: load into memory
>  - midx: write header information to lockfile
>  - multi-pack-index: add 'write' verb
>  - multi-pack-index: add builtin
>  - multi-pack-index: add format details
>  - multi-pack-index: add design document
>
>  When there are too many packfiles in a repository (which is not
>  recommended), looking up an object in these would require
>  consulting many pack .idx files; a new mechanism to have a single
>  file that consolidates all of these .idx files is introduced.
>
>
> * nd/use-the-index-compat-less (2018-06-25) 13 commits
>  - wt-status.c: stop using index compat macros
>  - sha1-name.c: stop using index compat macros
>  - sequencer.c: stop using index compat macros
>  - revision.c: stop using index compat macros
>  - rerere.c: stop using index compat macros
>  - merge.c: stop using index compat macros
>  - merge-recursive.c: stop using index compat macros
>  - entry.c: stop using index compat macros
>  - diff.c: stop using index compat macros
>  - diff-lib.c: stop using index compat macros
>  - check-racy.c: stop using index compat macros
>  - blame.c: stop using index compat macros
>  - apply.c: stop using index compat macros
>
>  Use of many convenience functions that operate on the_index
>  "primary in-core index instance" have been rewritten to explicitly
>  call the coutnerpart functions that take arbitrary index_state and
>  pass the_index.
>
>  I'd say that alone is a useless code churn, but people seem to be
>  excited about the change, so it is queued here.

As someone who also does lots of refactoring lately, I am excited
to see the code health moving in the right direction.

It is easy to quantify how often we are bitten by code churn
(that you call useless here); and very hard to quantify bugs
introduced or features not written/upstreamed due to clunky
API (as we don't see those or do

Re: [PATCH v6 0/4] stash: add new tests and introduce a new helper function

2018-06-28 Thread Paul-Sebastian Ungureanu


Hello,


Heh, what is more useful than apology is to tell us which order
these three (apparent) series build on top of each other ;-)

The answer, IIUC, is that

  * oidf+tests come first, then
  * apply/drop/branch/pop (as these rely on oidf) on top, and finally
  * list (as it wants to add to stash--helper that is a new file in the middle)

When there is clear dependency like that, I agree that it would help
readers to emphasize that these cannot be applied in an arbitrary
order.  It is especially true as the second part of the above _will_
apply more-or-less cleanly without the first one, and then fail to
compile due to lack of oidf.

Thanks.



I would actually say that it was 100% my fault (I could have tested 
before sending patches or ask if I was not completely sure) and I would 
like to apologize for this.


The order of the commits is actually good. The only thing that is not 
right are the cover letters, which are missing. Every 1/N marks the 
beginning of a series of patches. Just to be more clear, this is the 
right order:


First patch:
sha1-name.c: added 'get_oidf', which acts like 'get_oid'
stash: improve option parsing test coverage
stash: update test cases conform to coding guidelines
stash: renamed test cases to be more descriptive

Second patch:
stash: convert apply to builtin
stash: convert drop and clear to builtin
stash: convert branch to builtin
stash: convert pop to builtin

Third patch:
stash: implement the "list" command in the builtin
stash: convert show to builtin
stash: change `git stash show` usage text and documentation
stash: refactor `show_stash()` to use the diff API
stash: update `git stash show` documentation
stash: convert store to builtin

The only thing which was in the cover letters and might be worth 
mentioning here is related to `git stash show`. Although it might not be 
efficient, a 1:1 conversion is more easily to follow and review. Because 
of that, the first commit related to `git stash show` approaches a 1:1 
conversion to C. There is a subsequent patch (`refactor `) which 
makes the `show` subcommand use the existent API. Any change of behavior 
is noted in the commit message which introduces that change.


Best,
Paul

Re: [PATCH v5 3/8] block alloc: add lifecycle APIs for cache_entry structs

2018-06-28 Thread SZEDER Gábor



> diff --git a/read-cache.c b/read-cache.c
> index 9624ce1784..116fd51680 100644
> --- a/read-cache.c
> +++ b/read-cache.c

> +struct cache_entry *make_transient_cache_entry(unsigned int mode, const 
> struct object_id *oid,
> +const char *path, int stage)
> +{
> + struct cache_entry *ce;
> + int len;
> +
> + if (!verify_path(path, mode)) {
> + error("Invalid path '%s'", path);
> + return NULL;
> + }
> +
> + len = strlen(path);
> + ce = make_empty_transient_cache_entry(len);
> +
> + hashcpy(ce->oid.hash, oid->hash);

Please use oidcpy() here, too.

Re: [PATCH v5 2/8] read-cache: make_cache_entry should take object_id struct

2018-06-28 Thread SZEDER Gábor

> The make_cache_entry function should take an object_id struct instead
> of sha.


> diff --git a/read-cache.c b/read-cache.c
> index fa8366ecab..9624ce1784 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -746,8 +746,10 @@ int add_file_to_index(struct index_state *istate, const 
> char *path, int flags)
>  }
> 
>  struct cache_entry *make_cache_entry(unsigned int mode,
> - const unsigned char *sha1, const char *path, int stage,
> - unsigned int refresh_options)
> +  const struct object_id *oid,
> +  const char *path,
> +  int stage,
> +  unsigned int refresh_options)
>  {
>   int size, len;
>   struct cache_entry *ce, *ret;
> @@ -761,7 +763,7 @@ struct cache_entry *make_cache_entry(unsigned int mode,
>   size = cache_entry_size(len);
>   ce = xcalloc(1, size);
> 
> - hashcpy(ce->oid.hash, sha1);
> + hashcpy(ce->oid.hash, oid->hash);

Speaking of using struct object_id instead of sha, please use oidcpy()
here.

Re: [PATCH v3] fetch-pack: support negotiation tip whitelist

2018-06-28 Thread Brandon Williams

On 06/28, Jonathan Tan wrote:
> During negotiation, fetch-pack eventually reports as "have" lines all
> commits reachable from all refs. Allow the user to restrict the commits
> sent in this way by providing a whitelist of tips; only the tips
> themselves and their ancestors will be sent.
> 
> Both globs and single objects are supported.
> 
> This feature is only supported for protocols that support connect or
> stateless-connect (such as HTTP with protocol v2).
> 
> This will speed up negotiation when the repository has multiple
> relatively independent branches (for example, when a repository
> interacts with multiple repositories, such as with linux-next [1] and
> torvalds/linux [2]), and the user knows which local branch is likely to
> have commits in common with the upstream branch they are fetching.
> 
> [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/
> [2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/
> 
> Signed-off-by: Jonathan Tan 
> ---
> This is on jt/fetch-pack-negotiator.
> 
> > I don't think that would be strange at all, and no where in git do we
> > handle heads/* but we do already handle refs/heads/* as well as DWIM
> > master.
> >
> > > and (2) I can't think of anywhere in Git
> > > where you can provide either one - it's either SHA-1 and DWIM name, or
> > > SHA-1 and refspec, but not all three.
> >
> > fetch is a perfect example of supporting all three.  I can do
> >
> >   git fetch origin SHA1
> >   git fetch origin master
> >   git fetch origin refs/heads/*:refs/heads/*
> 
> OK, Brandon managed to convince me that this is fine. I've included glob
> support, supporting the same globs that git notes supports.
> ---
>  Documentation/fetch-options.txt | 16 +++
>  builtin/fetch.c | 41 +
>  fetch-pack.c| 19 +++-
>  fetch-pack.h|  7 +++
>  t/t5510-fetch.sh| 78 +
>  transport-helper.c  |  3 ++
>  transport.c |  1 +
>  transport.h | 10 +
>  8 files changed, 173 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
> index 97d3217df..6e4db1738 100644
> --- a/Documentation/fetch-options.txt
> +++ b/Documentation/fetch-options.txt
> @@ -42,6 +42,22 @@ the current repository has the same history as the source 
> repository.
>   .git/shallow. This option updates .git/shallow and accept such
>   refs.
>  
> +--negotiation-tip=::
> + By default, Git will report, to the server, commits reachable
> + from all local refs to find common commits in an attempt to
> + reduce the size of the to-be-received packfile. If specified,
> + Git will only report commits reachable from the given tips.
> + This is useful to speed up fetches when the user knows which
> + local ref is likely to have commits in common with the
> + upstream ref being fetched.
> ++
> +This option may be specified more than once; if so, Git will report
> +commits reachable from any of the given commits.
> ++
> +The argument to this option may be a glob on ref names, a ref, or the 
> (possibly
> +abbreviated SHA-1 of a commit. Specifying a glob is equivalent to specifying
> +this option multiple times, one for each matching ref name.

I think you're missing a closing ')'

Aside from that nit this patch looks good, thanks!


-- 
Brandon Williams

[PATCH v3] fetch-pack: support negotiation tip whitelist

2018-06-28 Thread Jonathan Tan

During negotiation, fetch-pack eventually reports as "have" lines all
commits reachable from all refs. Allow the user to restrict the commits
sent in this way by providing a whitelist of tips; only the tips
themselves and their ancestors will be sent.

Both globs and single objects are supported.

This feature is only supported for protocols that support connect or
stateless-connect (such as HTTP with protocol v2).

This will speed up negotiation when the repository has multiple
relatively independent branches (for example, when a repository
interacts with multiple repositories, such as with linux-next [1] and
torvalds/linux [2]), and the user knows which local branch is likely to
have commits in common with the upstream branch they are fetching.

[1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/
[2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/

Signed-off-by: Jonathan Tan 
---
This is on jt/fetch-pack-negotiator.

> I don't think that would be strange at all, and no where in git do we
> handle heads/* but we do already handle refs/heads/* as well as DWIM
> master.
>
> > and (2) I can't think of anywhere in Git
> > where you can provide either one - it's either SHA-1 and DWIM name, or
> > SHA-1 and refspec, but not all three.
>
> fetch is a perfect example of supporting all three.  I can do
>
>   git fetch origin SHA1
>   git fetch origin master
>   git fetch origin refs/heads/*:refs/heads/*

OK, Brandon managed to convince me that this is fine. I've included glob
support, supporting the same globs that git notes supports.
---
 Documentation/fetch-options.txt | 16 +++
 builtin/fetch.c | 41 +
 fetch-pack.c| 19 +++-
 fetch-pack.h|  7 +++
 t/t5510-fetch.sh| 78 +
 transport-helper.c  |  3 ++
 transport.c |  1 +
 transport.h | 10 +
 8 files changed, 173 insertions(+), 2 deletions(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 97d3217df..6e4db1738 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -42,6 +42,22 @@ the current repository has the same history as the source 
repository.
.git/shallow. This option updates .git/shallow and accept such
refs.
 
+--negotiation-tip=::
+   By default, Git will report, to the server, commits reachable
+   from all local refs to find common commits in an attempt to
+   reduce the size of the to-be-received packfile. If specified,
+   Git will only report commits reachable from the given tips.
+   This is useful to speed up fetches when the user knows which
+   local ref is likely to have commits in common with the
+   upstream ref being fetched.
++
+This option may be specified more than once; if so, Git will report
+commits reachable from any of the given commits.
++
+The argument to this option may be a glob on ref names, a ref, or the (possibly
+abbreviated SHA-1 of a commit. Specifying a glob is equivalent to specifying
+this option multiple times, one for each matching ref name.
+
 ifndef::git-pull[]
 --dry-run::
Show what would be done, without making any changes.
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ea5b9669a..1a7ef305d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -63,6 +63,7 @@ static int shown_url = 0;
 static struct refspec refmap = REFSPEC_INIT_FETCH;
 static struct list_objects_filter_options filter_options;
 static struct string_list server_options = STRING_LIST_INIT_DUP;
+static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
 
 static int git_fetch_config(const char *k, const char *v, void *cb)
 {
@@ -174,6 +175,8 @@ static struct option builtin_fetch_options[] = {
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
TRANSPORT_FAMILY_IPV6),
+   OPT_STRING_LIST(0, "negotiation-tip", _tip, N_("revision"),
+   N_("report that we have only objects reachable from 
this object")),
OPT_PARSE_LIST_OBJECTS_FILTER(_options),
OPT_END()
 };
@@ -1049,6 +1052,38 @@ static void set_option(struct transport *transport, 
const char *name, const char
name, transport->url);
 }
 
+
+static int add_oid(const char *refname, const struct object_id *oid, int flags,
+  void *cb_data)
+{
+   struct oid_array *oids = cb_data;
+   oid_array_append(oids, oid);
+   return 0;
+}
+
+static void add_negotiation_tips(struct git_transport_options *smart_options)
+{
+   struct oid_array *oids = xcalloc(1, sizeof(*oids));
+   int i;
+   for (i = 0; i < negotiation_tip.nr; i++) {
+   const char *s = negotiation_tip.items[i].string;
+   int old_nr;
+   if (!has_glob_specials(s))

Re: [PATCH 4/4] fsck: silence stderr when parsing .gitmodules

2018-06-28 Thread Jeff King

On Thu, Jun 28, 2018 at 06:06:03PM -0400, Jeff King wrote:

> Note that we didn't test this case at all, so I've added
> coverage in t7415. We may end up toning down or removing
> this fsck check in the future. So take this test as checking
> what happens now with a focus on stderr, and not any
> ironclad guarantee that we must detect and report parse
> failures in the future.

And such a patch _could_ look something like this. Though we could also
perhaps leave it in place and tone it down to "ignore" by default.

There's another case that triggers gitmodulesParse, too, which is a blob
so gigantic that we try not to hold it in memory. Technically that could
also happen due to somebody using .gitmodules for something unrelated.
But that seems even more far-fetched. And it _is_ dangerous to leave,
because I think existing vulnerable clients will try to load a 500MB
.gitmodules file in memory and parse it.

---
diff --git a/fsck.c b/fsck.c
index 87b0e228bd..296e8a8a7c 100644
--- a/fsck.c
+++ b/fsck.c
@@ -1013,11 +1013,9 @@ static int fsck_blob(struct blob *blob, const char *buf,
data.options = options;
data.ret = 0;
config_opts.error_action = CONFIG_ERROR_SILENT;
-   if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB,
-   ".gitmodules", buf, size, , _opts))
-   data.ret |= report(options, >object,
-  FSCK_MSG_GITMODULES_PARSE,
-  "could not parse gitmodules blob");
+   /* ignore errors */
+   git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB,
+   ".gitmodules", buf, size, , _opts);
 
return data.ret;
 }
diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh
index ba8af785a5..9a7dae88a5 100755
--- a/t/t7415-submodule-names.sh
+++ b/t/t7415-submodule-names.sh
@@ -176,7 +176,7 @@ test_expect_success 'fsck detects non-blob .gitmodules' '
)
 '
 
-test_expect_success 'fsck detects corrupt .gitmodules' '
+test_expect_success 'fsck does not complain about corrupt .gitmodules' '
git init corrupt &&
(
cd corrupt &&
@@ -185,9 +185,8 @@ test_expect_success 'fsck detects corrupt .gitmodules' '
git add .gitmodules &&
git commit -m "broken gitmodules" &&
 
-   test_must_fail git fsck 2>output &&
-   grep gitmodulesParse output &&
-   test_i18ngrep ! "bad config" output
+   git fsck 2>output &&
+   ! test -s output
)
 '

[PATCH 4/4] fsck: silence stderr when parsing .gitmodules

2018-06-28 Thread Jeff King

If there's a parsing error we'll already report it via the
usual fsck report() function (or not, if the user has asked
to skip this object or warning type). The error message from
the config parser just adds confusion. Let's suppress it.

Note that we didn't test this case at all, so I've added
coverage in t7415. We may end up toning down or removing
this fsck check in the future. So take this test as checking
what happens now with a focus on stderr, and not any
ironclad guarantee that we must detect and report parse
failures in the future.

Signed-off-by: Jeff King 
---
 fsck.c |  4 +++-
 t/t7415-submodule-names.sh | 15 +++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index aa7a52cc80..87b0e228bd 100644
--- a/fsck.c
+++ b/fsck.c
@@ -992,6 +992,7 @@ static int fsck_blob(struct blob *blob, const char *buf,
 unsigned long size, struct fsck_options *options)
 {
struct fsck_gitmodules_data data;
+   struct config_options config_opts = { 0 };
 
if (!oidset_contains(_found, >object.oid))
return 0;
@@ -1011,8 +1012,9 @@ static int fsck_blob(struct blob *blob, const char *buf,
data.obj = >object;
data.options = options;
data.ret = 0;
+   config_opts.error_action = CONFIG_ERROR_SILENT;
if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB,
-   ".gitmodules", buf, size, , NULL))
+   ".gitmodules", buf, size, , _opts))
data.ret |= report(options, >object,
   FSCK_MSG_GITMODULES_PARSE,
   "could not parse gitmodules blob");
diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh
index b68c5f5e85..ba8af785a5 100755
--- a/t/t7415-submodule-names.sh
+++ b/t/t7415-submodule-names.sh
@@ -176,4 +176,19 @@ test_expect_success 'fsck detects non-blob .gitmodules' '
)
 '
 
+test_expect_success 'fsck detects corrupt .gitmodules' '
+   git init corrupt &&
+   (
+   cd corrupt &&
+
+   echo "[broken" >.gitmodules &&
+   git add .gitmodules &&
+   git commit -m "broken gitmodules" &&
+
+   test_must_fail git fsck 2>output &&
+   grep gitmodulesParse output &&
+   test_i18ngrep ! "bad config" output
+   )
+'
+
 test_done
-- 
2.18.0.604.g8c4f59c959

[PATCH 3/4] config: add options parameter to git_config_from_mem

2018-06-28 Thread Jeff King

The underlying config parser knows how to handle a
config_options struct, but git_config_from_mem() always
passes NULL. Let's allow our callers to specify the options
struct.

We could add a "_with_options" variant, but since there are
only a handful of callers, let's just update them to pass
NULL.

Signed-off-by: Jeff King 
---
 config.c   | 11 +++
 config.h   |  7 +--
 fsck.c |  2 +-
 submodule-config.c |  2 +-
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/config.c b/config.c
index 0797e284f4..4548edb1e5 100644
--- a/config.c
+++ b/config.c
@@ -1569,8 +1569,10 @@ int git_config_from_file(config_fn_t fn, const char 
*filename, void *data)
return git_config_from_file_with_options(fn, filename, data, NULL);
 }
 
-int git_config_from_mem(config_fn_t fn, const enum config_origin_type 
origin_type,
-   const char *name, const char *buf, size_t len, void 
*data)
+int git_config_from_mem(config_fn_t fn,
+   const enum config_origin_type origin_type,
+   const char *name, const char *buf, size_t len,
+   void *data, const struct config_options *opts)
 {
struct config_source top;
 
@@ -1585,7 +1587,7 @@ int git_config_from_mem(config_fn_t fn, const enum 
config_origin_type origin_typ
top.do_ungetc = config_buf_ungetc;
top.do_ftell = config_buf_ftell;
 
-   return do_config_from(, fn, data, NULL);
+   return do_config_from(, fn, data, opts);
 }
 
 int git_config_from_blob_oid(config_fn_t fn,
@@ -1606,7 +1608,8 @@ int git_config_from_blob_oid(config_fn_t fn,
return error("reference '%s' does not point to a blob", name);
}
 
-   ret = git_config_from_mem(fn, CONFIG_ORIGIN_BLOB, name, buf, size, 
data);
+   ret = git_config_from_mem(fn, CONFIG_ORIGIN_BLOB, name, buf, size,
+ data, NULL);
free(buf);
 
return ret;
diff --git a/config.h b/config.h
index c02809ffdc..f2063ceb86 100644
--- a/config.h
+++ b/config.h
@@ -68,8 +68,11 @@ extern int git_config_from_file(config_fn_t fn, const char 
*, void *);
 extern int git_config_from_file_with_options(config_fn_t fn, const char *,
 void *,
 const struct config_options *);
-extern int git_config_from_mem(config_fn_t fn, const enum config_origin_type,
-   const char *name, const char *buf, 
size_t len, void *data);
+extern int git_config_from_mem(config_fn_t fn,
+  const enum config_origin_type,
+  const char *name,
+  const char *buf, size_t len,
+  void *data, const struct config_options *opts);
 extern int git_config_from_blob_oid(config_fn_t fn, const char *name,
const struct object_id *oid, void *data);
 extern void git_config_push_parameter(const char *text);
diff --git a/fsck.c b/fsck.c
index 0b8b20b6c4..aa7a52cc80 100644
--- a/fsck.c
+++ b/fsck.c
@@ -1012,7 +1012,7 @@ static int fsck_blob(struct blob *blob, const char *buf,
data.options = options;
data.ret = 0;
if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB,
-   ".gitmodules", buf, size, ))
+   ".gitmodules", buf, size, , NULL))
data.ret |= report(options, >object,
   FSCK_MSG_GITMODULES_PARSE,
   "could not parse gitmodules blob");
diff --git a/submodule-config.c b/submodule-config.c
index 388ef1f892..2ca3272dd1 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -561,7 +561,7 @@ static const struct submodule *config_from(struct 
submodule_cache *cache,
parameter.gitmodules_oid = 
parameter.overwrite = 0;
git_config_from_mem(parse_config, CONFIG_ORIGIN_SUBMODULE_BLOB, rev.buf,
-   config, config_size, );
+   config, config_size, , NULL);
strbuf_release();
free(config);
 
-- 
2.18.0.604.g8c4f59c959

[PATCH 2/4] config: add CONFIG_ERROR_SILENT handler

2018-06-28 Thread Jeff King

We can currently die() or error(), but there's not yet any
way for callers to ask us just to quietly return an error.
Let's give them one.

Signed-off-by: Jeff King 
---
 config.c | 3 +++
 config.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/config.c b/config.c
index b5c23369d0..0797e284f4 100644
--- a/config.c
+++ b/config.c
@@ -818,6 +818,9 @@ static int git_parse_source(config_fn_t fn, void *data,
case CONFIG_ERROR_ERROR:
error_return = error("%s", error_msg);
break;
+   case CONFIG_ERROR_SILENT:
+   error_return = -1;
+   break;
case CONFIG_ERROR_UNSET:
BUG("config error action unset");
}
diff --git a/config.h b/config.h
index ce75bf1e5c..c02809ffdc 100644
--- a/config.h
+++ b/config.h
@@ -58,6 +58,7 @@ struct config_options {
CONFIG_ERROR_UNSET = 0, /* use source-specific default */
CONFIG_ERROR_DIE, /* die() on error */
CONFIG_ERROR_ERROR, /* error() on error, return -1 */
+   CONFIG_ERROR_SILENT, /* return -1 */
} error_action;
 };
 
-- 
2.18.0.604.g8c4f59c959

[PATCH 1/4] config: turn die_on_error into caller-facing enum

2018-06-28 Thread Jeff King

The config code has a die_on_error flag, which lets us emit
an error() instead of dying when we see a bogus config file.
But there's no way for a caller of the config code to set
this: it's auto-set based on whether we're reading a file or
a blob.

Instead, let's add it to the config_options struct. When
it's not set (or we have no options) we'll continue to fall
back to the existing file/blob behavior.

Signed-off-by: Jeff King 
---
 config.c | 18 +-
 config.h |  5 +
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/config.c b/config.c
index a0a6ae1980..b5c23369d0 100644
--- a/config.c
+++ b/config.c
@@ -31,7 +31,7 @@ struct config_source {
enum config_origin_type origin_type;
const char *name;
const char *path;
-   int die_on_error;
+   enum config_error_action default_error_action;
int linenr;
int eof;
struct strbuf value;
@@ -809,10 +809,18 @@ static int git_parse_source(config_fn_t fn, void *data,
  cf->linenr, cf->name);
}
 
-   if (cf->die_on_error)
+   switch (opts && opts->error_action ?
+   opts->error_action :
+   cf->default_error_action) {
+   case CONFIG_ERROR_DIE:
die("%s", error_msg);
-   else
+   break;
+   case CONFIG_ERROR_ERROR:
error_return = error("%s", error_msg);
+   break;
+   case CONFIG_ERROR_UNSET:
+   BUG("config error action unset");
+   }
 
free(error_msg);
return error_return;
@@ -1520,7 +1528,7 @@ static int do_config_from_file(config_fn_t fn,
top.origin_type = origin_type;
top.name = name;
top.path = path;
-   top.die_on_error = 1;
+   top.default_error_action = CONFIG_ERROR_DIE;
top.do_fgetc = config_file_fgetc;
top.do_ungetc = config_file_ungetc;
top.do_ftell = config_file_ftell;
@@ -1569,7 +1577,7 @@ int git_config_from_mem(config_fn_t fn, const enum 
config_origin_type origin_typ
top.origin_type = origin_type;
top.name = name;
top.path = NULL;
-   top.die_on_error = 0;
+   top.default_error_action = CONFIG_ERROR_ERROR;
top.do_fgetc = config_buf_fgetc;
top.do_ungetc = config_buf_ungetc;
top.do_ftell = config_buf_ftell;
diff --git a/config.h b/config.h
index 626d4654bd..ce75bf1e5c 100644
--- a/config.h
+++ b/config.h
@@ -54,6 +54,11 @@ struct config_options {
const char *git_dir;
config_parser_event_fn_t event_fn;
void *event_fn_data;
+   enum config_error_action {
+   CONFIG_ERROR_UNSET = 0, /* use source-specific default */
+   CONFIG_ERROR_DIE, /* die() on error */
+   CONFIG_ERROR_ERROR, /* error() on error, return -1 */
+   } error_action;
 };
 
 typedef int (*config_fn_t)(const char *, const char *, void *);
-- 
2.18.0.604.g8c4f59c959

Re: [PATCH] fsck: check skiplist for object in fsck_blob()

2018-06-28 Thread Jeff King

On Thu, Jun 28, 2018 at 07:53:27PM +0100, Ramsay Jones wrote:

> > Yes, though I don't think it's too bad. We already have a "die_on_error"
> > flag in the config code. I think it just needs to become a tristate:
> > die/error/silent (and probably get passed in via config_options, since I
> > think we tie it right now to the file/blob source).
> 
> Yes, but this code is already very crufty - and I'm just
> waiting for someone to want to have per-repo/submodule
> config parsing i... ;-)

It is crufty, but I think we actually handle that part OK; the flag gets
attached to the stack.

> > Hmm, if we end up doing the config thing above, then this patch would
> > become unnecessary.
> 
> I was thinking of timing - the current patch could go
> in quickly to solve the immediate problem (eg. in maint).
> 
> Also, it does not hurt to do this _as well as_ suppress
> the config errors.

Yes, it can go in quickly. But I'd prefer not to keep it in the long
term if it's literally doing nothing.

I have some patches which I think solve your problem. They apply on
v2.18.0, but not on v2.17.1 (because they rely on Dscho's increased
passing of config_options in v2.18). Is that good enough?

> > And I think doing that would help _all_ cases, even ones without a
> > skiplist. They don't need to see random config error messages either,
> > even if we do eventually report an fsck error.
> 
> Oh, yes, I agree. You will have noticed that it was my
> first suggested solution. (I have started writing that
> patch a few times, but it just makes me want to throw
> the current code away and start again)!
> 
> Hmm, BTW, the 'rejected push' problem would include *any*
> '.gitmodules' blob that contained a syntax error, right?
> 
> Maybe it won't be as rare as all that! (Imagine not being
> able to push due to a compiler error/warning in source files.
> How irritating would that be! :-P ).

Yes, it would include any syntax error. I also have a slight worry about
that, but nobody seems to have screamed _yet_. :)

> (if we fix this, you could hide some nefarious settings
> after an obvious syntax error - then get the victim to
> fix the syntax error ...)

You can also usually get the victim to type "make", which is even
simpler. :)

Here are the patches I came up with.

Note that the config_options struct has a bit of a dual-nature. Some
options are respected only via config_from_options(), and some only from
git_config_from_file(). I think this should probably be remedied in the
long run, but I stopped here in the interest of keeping this
maint-worthy.

  [1/4]: config: turn die_on_error into caller-facing enum
  [2/4]: config: add CONFIG_ERROR_SILENT handler
  [3/4]: config: add options parameter to git_config_from_mem
  [4/4]: fsck: silence stderr when parsing .gitmodules

 config.c   | 32 +++-
 config.h   | 13 +++--
 fsck.c |  4 +++-
 submodule-config.c |  2 +-
 t/t7415-submodule-names.sh | 15 +++
 5 files changed, 53 insertions(+), 13 deletions(-)

-Peff

Re: [PATCH 5/5] builtin/rebase: support running "git rebase "

2018-06-28 Thread Stefan Beller

On Thu, Jun 28, 2018 at 12:48 AM Pratik Karki  wrote:
>
> This patch gives life to the skeleton added in the previous patch.
> This patch makes real operation happen i.e. by using
> `git -c rebase.usebuiltin=true rebase `.
> With this patch, the basic operation of rebase can be done.

Would it make sense to add this config option to some basic test in the
test suite to show off one case in there? (Otherwise it is hard to keep this
code correct for the future (even if it is just a few days/weeks) as other
series on the list may collide with it in subtle ways, so a test would be
fast signal to catch these subtleties).

Maybe setting this in one of the early tests in t3400 would be good?

> These backends use Unix shell functions defined both by git-sh-setup.sh
> and git-rebase.sh (we move the latter's into git-rebase--common.sh to

s/move/moved in a previous patch/ ? But then again we already know about
the earlier patch, I am on the fence whether this is worth mentioning. But
it sure is fine to leave it here.

> accommodate for that), so we not only have to source the backend file
> before calling the respective Unix shell script function, but we have
> to source git-sh-setup and git-rebase--common before that.
> And since this is all done in a Unix shell script snippet, all of this
> is in argv[0]. There never will be a non-NULL argv[1].

No double negatives are never harder to read than simple forms. ;)
So you are saying, there are no further arguments to that shell
invocation?

> This patch does the *bare* minimum to get `git rebase ` to
> work: there is still no option parsing, and only the bare minimum set
> of environment variables are set (in particular, the current revision
> would be susceptible to bugs where e.g. `rebase_root` could be set by
> mistake before running `git rebase` and the `git-rebase--am` backend
> would pick up that variable and use it).
>
> It still calls original `git-legacy-rebase.sh` unless the config
> setting rebase.useBuiltin is set to true. This patch uses the
> detach_head_to() function from checkout.c introduced by a previous
> commit to perform initial checkout.
>
> Signed-off-by: Pratik Karki 
> ---
>  builtin/rebase.c | 231 ++-
>  1 file changed, 229 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 1152b7229..2f90389c2 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -9,6 +9,19 @@
>  #include "exec-cmd.h"
>  #include "argv-array.h"
>  #include "dir.h"
> +#include "packfile.h"
> +#include "checkout.h"
> +#include "refs.h"
> +
> +static GIT_PATH_FUNC(apply_dir, "rebase-apply");
> +static GIT_PATH_FUNC(merge_dir, "rebase-merge");
> +
> +enum rebase_type {
> +   REBASE_AM,
> +   REBASE_MERGE,
> +   REBASE_INTERACTIVE,
> +   REBASE_PRESERVE_MERGES
> +};
>
>  static int use_builtin_rebase(void)
>  {
> @@ -28,8 +41,129 @@ static int use_builtin_rebase(void)
> return ret;
>  }
>
> +static int apply_autostash(void)
> +{
> +   warning("TODO");

This comes up unconditionally here, so the automated testing
idea from above might not be as good as I thought after all.

> +static struct commit *peel_committish(const char *name)

The -ish suffix is to indicate that a wide range of notations
that describe commits are accepted. Another way of naming this
function would be by its output, i.e. peel_to_commit, the name
similar to peel_to_type. But I guess emphasizing the input
to be anything that describes a commit is also important here,
as we pass in the arguments eventually provided by users
(e.g. "master^^") so this name sounds fine; I cannot think of
a better suggestion for now.

What's cooking in git.git (Jun 2018, #07; Thu, 28)

2018-06-28 Thread Junio C Hamano

Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

The tip of 'next' has been rewound and it currently has only 4
topics.  Quite a many topics are cooking in 'pu' and need to be
sifted into good bins (for 'next') and the remainder.  Help is
appreciated in that area ;-)

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/refspec-init-fix (2018-06-11) 3 commits
  (merged to 'next' on 2018-06-13 at 91d71d8435)
 + refspec: initalize `refspec_item` in `valid_fetch_refspec()`
 + refspec: add back a refspec_item_init() function
 + refspec: s/refspec_item_init/&_or_die/g

 Make refspec parsing codepath more robust.


* as/safecrlf-quiet-fix (2018-06-11) 1 commit
  (merged to 'next' on 2018-06-13 at b163674843)
 + config.c: fix regression for core.safecrlf false

 Fix for 2.17-era regression around `core.safecrlf`.


* jc/clean-after-sanity-tests (2018-06-15) 1 commit
  (merged to 'next' on 2018-06-22 at 2df77b8af9)
 + tests: clean after SANITY tests

 test cleanup.


* jh/partial-clone (2018-06-12) 1 commit
  (merged to 'next' on 2018-06-13 at 818f864b0c)
 + list-objects: check if filter is NULL before using

 The recent addition of "partial clone" experimental feature kicked
 in when it shouldn't, namely, when there is no partial-clone filter
 defined even if extensions.partialclone is set.


* jk/fetch-all-peeled-fix (2018-06-13) 2 commits
  (merged to 'next' on 2018-06-13 at 1333bb9d90)
 + fetch-pack: test explicitly that --all can fetch tag references pointing to 
non-commits
 + fetch-pack: don't try to fetch peel values with --all

 "git fetch-pack --all" used to unnecessarily fail upon seeing an
 annotated tag that points at an object other than a commit.


* ms/send-pack-honor-config (2018-06-12) 1 commit
  (merged to 'next' on 2018-06-13 at e2cd933715)
 + builtin/send-pack: populate the default configs

 "git send-pack --signed" (hence "git push --signed" over the http
 transport) did not read user ident from the config mechanism to
 determine whom to sign the push certificate as, which has been
 corrected.


* nd/completion-negation (2018-06-11) 3 commits
  (merged to 'next' on 2018-06-19 at a3be59b450)
 + completion: collapse extra --no-.. options
 + completion: suppress some -no- options
 + parse-options: option to let --git-completion-helper show negative form

 Continuing with the idea to programmatically enumerate various
 pieces of data required for command line completion, the codebase
 has been taught to enumerate options prefixed with "--no-" to
 negate them.


* pw/add-p-recount (2018-06-11) 1 commit
  (merged to 'next' on 2018-06-19 at 1880ecc3f1)
 + add -p: fix counting empty context lines in edited patches

 When user edits the patch in "git add -p" and the user's editor is
 set to strip trailing whitespaces indiscriminately, an empty line
 that is unchanged in the patch would become completely empty
 (instead of a line with a sole SP on it).  The code introduced in
 Git 2.17 timeframe failed to parse such a patch, but now it learned
 to notice the situation and cope with it.


* sb/fix-fetching-moved-submodules (2018-06-14) 2 commits
  (merged to 'next' on 2018-06-22 at 16039dc62a)
 + t5526: test recursive submodules when fetching moved submodules
 + submodule: fix NULL correctness in renamed broken submodules

 The code to try seeing if a fetch is necessary in a submodule
 during a fetch with --recurse-submodules got confused when the path
 to the submodule was changed in the range of commits in the
 superproject, sometimes showing "(null)".  This has been corrected.


* sg/gpg-tests-fix (2018-06-11) 2 commits
  (merged to 'next' on 2018-06-13 at f3a05f1c41)
 + tests: make forging GPG signed commits and tags more robust
 + t7510-signed-commit: use 'test_must_fail'

 Some flaky tests have been fixed.


* tz/cred-netrc-cleanup (2018-06-18) 4 commits
  (merged to 'next' on 2018-06-22 at a471dd714c)
 + git-credential-netrc: make "all" default target of Makefile
 + git-credential-netrc: fix exit status when tests fail
 + git-credential-netrc: use in-tree Git.pm for tests
 + git-credential-netrc: minor whitespace cleanup in test script

 Build and test procedure for netrc credential helper (in contrib/)
 has been updated.

--
[New Topics]

* ao/config-from-gitmodules (2018-06-26) 6 commits
 - submodule-config: reuse config_from_gitmodules in repo_read_gitmodules
 - submodule-config: pass repository as argument to config_from_gitmodules
 - submodule-config: make 'config_from_gitmodules' private
 - submodule-config: add helper to get 'update-clone'

Re: [PATCH 4/5] sequencer: refactor the code to detach HEAD to checkout.c

2018-06-28 Thread Stefan Beller

Hi Pratik,
On Thu, Jun 28, 2018 at 12:48 AM Pratik Karki  wrote:
>
> The motivation behind this commit is to extract the core part of
> do_reset() from sequencer.c and move it to a new detach_head_to()
> function in checkout.c.
>
[...]
>
> The new function will be used in the next commit by the builtin rebase,
> to perform the initial checkout.

This sounds like the actual motivation, which is fine.

> Here the index only gets locked after performing the first part of
> `do_reset()` rather than before which essentially derives the `oid`
> from the specified label/name passed to the `do_reset()` function.
> It also fixes two bugs: there were two `return error()` statements in
> the `[new root]` case that would have failed to unlock the index.

This sounds as if this fixes a problem? If so it would be nice to have
a test that demonstrates that these specific problems go away.
(but I think we could just argue based on the motivation above that this
is a good change on its own, with or without demonstrating these
additional issues)

> Signed-off-by: Pratik Karki 
> ---
>  checkout.c  | 64 +
>  checkout.h  |  3 +++
>  sequencer.c | 58 +---
>  3 files changed, 72 insertions(+), 53 deletions(-)
>
> diff --git a/checkout.c b/checkout.c
> index bdefc888b..da68915fd 100644
> --- a/checkout.c
> +++ b/checkout.c
> @@ -2,6 +2,11 @@
>  #include "remote.h"
>  #include "refspec.h"
>  #include "checkout.h"
> +#include "unpack-trees.h"
> +#include "lockfile.h"
> +#include "refs.h"
> +#include "tree.h"
> +#include "cache-tree.h"
>
>  struct tracking_name_data {
> /* const */ char *src_ref;
> @@ -42,3 +47,62 @@ const char *unique_tracking_name(const char *name, struct 
> object_id *oid)
> free(cb_data.dst_ref);
> return NULL;
>  }
> +
> +int detach_head_to(struct object_id *oid, const char *action,
> +  const char *reflog_message)
> +{
> +   struct strbuf ref_name = STRBUF_INIT;
> +   struct tree_desc desc;
> +   struct lock_file lock = LOCK_INIT;
> +   struct unpack_trees_options unpack_tree_opts;
> +   struct tree *tree;
> +   int ret = 0;
> +
> +   if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
> +   return -1;
> +
> +   memset(_tree_opts, 0, sizeof(unpack_tree_opts));
> +   setup_unpack_trees_porcelain(_tree_opts, action);
> +   unpack_tree_opts.head_idx = 1;
> +   unpack_tree_opts.src_index = _index;
> +   unpack_tree_opts.dst_index = _index;
> +   unpack_tree_opts.fn = oneway_merge;
> +   unpack_tree_opts.merge = 1;
> +   unpack_tree_opts.update = 1;
> +
> +   if (read_cache_unmerged()) {
> +   rollback_lock_file();
> +   strbuf_release(_name);
> +   return error_resolve_conflict(_(action));
> +   }
> +
> +   if (!fill_tree_descriptor(, oid)) {
> +   error(_("failed to find tree of %s"), oid_to_hex(oid));
> +   rollback_lock_file();
> +   free((void *)desc.buffer);
> +   strbuf_release(_name);

These lines are repeated as a very similar pattern after each
failing function. Maybe we can make it more readable by moving
all these to the end and then using goto to jump there.

For example see "write_pseudoref" in refs.c, that has some interesting
patterns to learn from, e.g. how the return code is constructed
(start off with setting it -1 and only if we go through the whole
function, just before the jump label, we'd set it to 0) and
how all the free/strbuf_releases are at the end (no need to
repeat them).

> +   return -1;
> +   }
> +
> +   if (unpack_trees(1, , _tree_opts)) {
> +   rollback_lock_file();
> +   free((void *)desc.buffer);
> +   strbuf_release(_name);
> +   return -1;
> +   }
> +
> +   tree = parse_tree_indirect(oid);

Awesome, the _indirect function can take commits/tags or trees.

> +   prime_cache_tree(_index, tree);

As there is a larger movement to get rid of globals, and the_index is one
of them[1]. So maybe just use the_repository->index already (the_repository
suffers a similar problem, but I think that is more futureproof for
the time being
as we'd want to kill off the_repository in library code eventually as
well and pass
through a repository struct. But for now I'd just use the_repository instead of
having a repository argument)

[1] c.f. https://public-inbox.org/git/20180616054157.32433-1-pclo...@gmail.com/

> -   struct lock_file lock = LOCK_INIT;
> -   struct tree_desc desc;
> -   struct tree *tree;
> -   struct unpack_trees_options unpack_tree_opts;
> -   int ret = 0, i;

[...]

Oh I misspoke above, this is moving code (I should have understood
the hint with 'extracting' by the commit message), so in this case we'd
rather want to move code most verbatim to make review easier, which
it is. So

Re: [PATCH] sequencer: use configured comment character

2018-06-28 Thread Junio C Hamano

Aaron Schrab  writes:

> Use configured comment character when generating comments about branches
> in an instruction sheet.  Failure to honor this configuration causes a
> failure to parse the resulting instruction sheet.
>
> Signed-off-by: Aaron Schrab 
> ---
>  sequencer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 4034c0461b..caf91af29d 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -3991,7 +3991,7 @@ static int make_script_with_merges(struct 
> pretty_print_context *pp,
>   entry = oidmap_get(, >object.oid);
>  
>   if (entry)
> - fprintf(out, "\n# Branch %s\n", entry->string);
> + fprintf(out, "\n%c Branch %s\n", comment_line_char, 
> entry->string);
>   else
>   fprintf(out, "\n");

Would this interact OK with core.commentchar set to "auto"?

Re: [PATCH 3/5] rebase: refactor common shell functions into their own file

2018-06-28 Thread Stefan Beller

On Thu, Jun 28, 2018 at 12:48 AM Pratik Karki  wrote:
>
> The function present in `git-legacy-rebase.sh` are used by backends
> so, this refactor tries to extract the functions out so that, the

it not only tries to, it actually does. :)

> `git-legacy-rebase.sh` can be retired easily as the
> `git-rebase--common.sh` will provide the functions for now.
>
> The motivation behind this is to call the backend functions
> *directly* from C, bypassing `git-rebase.sh`. Therefore those functions
> need to live in a separate file: we need to be able to call
> `.git-rebase--common` in that script snippet so that those functions
> are defined.

Makes sense.

I applied the patch (and checked the move via the --color-moved option
to see if there are discrepancies that slip in easily via rebases as there is
more work currently going on in the rebase area) and the found the functions
were moved as-is, just reordered. Can you give a hint on why you choose a
different order for the moved functions (not as an email reply, but as part
of the commit message, later on people may ask the same question only
to find this commit via git-blame or git-log for example)

Thanks,
Stefan

Re: [PATCH v5 6/8] mem-pool: fill out functionality

2018-06-28 Thread Junio C Hamano

Jameson Miller  writes:

> +void mem_pool_combine(struct mem_pool *dst, struct mem_pool *src)
> +{
> + struct mp_block *p;
> +
> + /* Append the blocks from src to dst */
> + if (dst->mp_block && src->mp_block) {
> + /*
> +  * src and dst have blocks, append
> +  * blocks from src to dst.
> +  */
> + p = dst->mp_block;
> + while (p->next_block)
> + p = p->next_block;
> +
> + p->next_block = src->mp_block;

Just being curious, but does this interact with the "we carve out
only from the first block" done in step 4/8?  The remaining unused
space in the first block in the src pool would be wasted, which may
not be such a big deal and may not even be worth comparing the
available space in two blocks and picking a larger one.  But we do
want to decide _after_ thinking things through nevertheless.

Re: [PATCH] fsck: check skiplist for object in fsck_blob()

2018-06-28 Thread Ramsay Jones

On 28/06/18 18:45, Jeff King wrote:
> On Thu, Jun 28, 2018 at 05:56:18PM +0100, Ramsay Jones wrote:
[snip]
>>> One thing we could do is turn the parse failure into a noop. The main
>>> point of the check is to protect people against the malicious
>>> .gitmodules bug. If the file can't be parsed, then it also can't be an
>>> attack vector (assuming the parser used for the fsck check and the
>>> parser used by the victim behave identically).
>>
>> Hmm, yeah, but I would have to provide a means of suppressing
>> the config parser error messages. Something I wanted to avoid. :(
> 
> Yes, though I don't think it's too bad. We already have a "die_on_error"
> flag in the config code. I think it just needs to become a tristate:
> die/error/silent (and probably get passed in via config_options, since I
> think we tie it right now to the file/blob source).

Yes, but this code is already very crufty - and I'm just
waiting for someone to want to have per-repo/submodule
config parsing i... ;-)

>> Junio, do you want me to address the above 'rejected push'
>> issue in this patch, or with a follow-up patch? (It should
>> be a pretty rare problem - famous last words!)
> 
> Hmm, if we end up doing the config thing above, then this patch would
> become unnecessary.

I was thinking of timing - the current patch could go
in quickly to solve the immediate problem (eg. in maint).

Also, it does not hurt to do this _as well as_ suppress
the config errors.

> And I think doing that would help _all_ cases, even ones without a
> skiplist. They don't need to see random config error messages either,
> even if we do eventually report an fsck error.

Oh, yes, I agree. You will have noticed that it was my
first suggested solution. (I have started writing that
patch a few times, but it just makes me want to throw
the current code away and start again)!

Hmm, BTW, the 'rejected push' problem would include *any*
'.gitmodules' blob that contained a syntax error, right?

Maybe it won't be as rare as all that! (Imagine not being
able to push due to a compiler error/warning in source files.
How irritating would that be! :-P ).

(if we fix this, you could hide some nefarious settings
after an obvious syntax error - then get the victim to
fix the syntax error ...)

ATB,
Ramsay Jones

Re: [PATCH 2/5] rebase: start implementing it as a builtin

2018-06-28 Thread Stefan Beller

On Thu, Jun 28, 2018 at 12:48 AM Pratik Karki  wrote:
>
> This commit imitates the strategy that was used to convert the
> difftool to a builtin, see be8a90e (difftool: add a skeleton for the
> upcoming builtin, 2017-01-17) for details: This commit renames the
> shell script `git-rebase.sh` to `git-legacy-rebase.sh` and hands off to
> it by default.

That is a good way to start, imitating Johannes approach on rewriting
the difftool. Thanks for pointing this out.

> The current version of the builtin rebase does not, however, make full
> use of the internals but instead chooses to spawn a couple of Git
> processes to find out if we run the builtin or legacy rebase as that
> keeps the directory that we are in correct. There remains a lot
> of room for improvement, left for a later date. The following commits
> will recreate the functionality of the shell script, in pure C.
>
> We intentionally avoid reading the config directly to avoid
> messing up the GIT_* environment variables when we need to fall back to
> exec()ing the shell script.

Thanks for calling this out!

The test of builtin rebase can be done by
> `git -c rebase.useBuiltin=true rebase ...`
>
> Signed-off-by: Pratik Karki 
> ---
>  .gitignore|  1 +
>  Makefile  |  3 +-
>  builtin.h |  1 +
>  builtin/rebase.c  | 55 +++
>  git-rebase.sh => git-legacy-rebase.sh |  0
>  git.c |  6 +++
>  6 files changed, 65 insertions(+), 1 deletion(-)
>  create mode 100644 builtin/rebase.c
>  rename git-rebase.sh => git-legacy-rebase.sh (100%)
>
> diff --git a/.gitignore b/.gitignore
> index 3284a1e9b..ec2395901 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -78,6 +78,7 @@
>  /git-init-db
>  /git-interpret-trailers
>  /git-instaweb
> +/git-legacy-rebase
>  /git-log
>  /git-ls-files
>  /git-ls-remote
> diff --git a/Makefile b/Makefile
> index 0cb6590f2..e88fe2e5f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -609,7 +609,7 @@ SCRIPT_SH += git-merge-one-file.sh
>  SCRIPT_SH += git-merge-resolve.sh
>  SCRIPT_SH += git-mergetool.sh
>  SCRIPT_SH += git-quiltimport.sh
> -SCRIPT_SH += git-rebase.sh
> +SCRIPT_SH += git-legacy-rebase.sh
>  SCRIPT_SH += git-remote-testgit.sh
>  SCRIPT_SH += git-request-pull.sh
>  SCRIPT_SH += git-stash.sh
> @@ -1059,6 +1059,7 @@ BUILTIN_OBJS += builtin/prune.o
>  BUILTIN_OBJS += builtin/pull.o
>  BUILTIN_OBJS += builtin/push.o
>  BUILTIN_OBJS += builtin/read-tree.o
> +BUILTIN_OBJS += builtin/rebase.o
>  BUILTIN_OBJS += builtin/rebase--helper.o
>  BUILTIN_OBJS += builtin/receive-pack.o
>  BUILTIN_OBJS += builtin/reflog.o
> diff --git a/builtin.h b/builtin.h
> index 0362f1ce2..44651a447 100644
> --- a/builtin.h
> +++ b/builtin.h
> @@ -202,6 +202,7 @@ extern int cmd_prune_packed(int argc, const char **argv, 
> const char *prefix);
>  extern int cmd_pull(int argc, const char **argv, const char *prefix);
>  extern int cmd_push(int argc, const char **argv, const char *prefix);
>  extern int cmd_read_tree(int argc, const char **argv, const char *prefix);
> +extern int cmd_rebase(int argc, const char **argv, const char *prefix);
>  extern int cmd_rebase__helper(int argc, const char **argv, const char 
> *prefix);
>  extern int cmd_receive_pack(int argc, const char **argv, const char *prefix);
>  extern int cmd_reflog(int argc, const char **argv, const char *prefix);
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> new file mode 100644
> index 0..1152b7229
> --- /dev/null
> +++ b/builtin/rebase.c
> @@ -0,0 +1,55 @@
> +/*
> + * "git rebase" builtin command
> + *
> + * Copyright (c) 2018 Pratik Karki
> + */
> +
> +#include "builtin.h"
> +#include "run-command.h"
> +#include "exec-cmd.h"
> +#include "argv-array.h"
> +#include "dir.h"
> +
> +static int use_builtin_rebase(void)
> +{
> +   struct child_process cp = CHILD_PROCESS_INIT;
> +   struct strbuf out = STRBUF_INIT;
> +   int ret;
> +
> +   argv_array_pushl(,
> +"config", "--bool", "rebase.usebuiltin", NULL);

--bool is documented as "Historical options for selecting
a type specifier. Prefer instead --type, (see: above)." in the
man page of git-config. But as this code will go away once
the conversion is done, this is not kept around for long.
So we should be fine using the --bool option.

> +   cp.git_cmd = 1;
> +   if (capture_command(, , 6))
> +   return 0;
> +
> +   strbuf_trim();
> +   ret = !strcmp("true", out.buf);

As --bool will make sure that the config command
prints "true" or "false", even when the user configured
0 or 1 instead, this is fine.

> +   if (argc != 2)
> +   die("Usage: %s ", argv[0]);
> +   prefix = setup_git_directory();
> +   trace_repo_setup(prefix);
> +   setup_work_tree();
> +
> +   die("TODO");

When reading the last sentence in the commit message
("This can be tested ...") I shortly wondered how

Re: [PATCH v5 3/8] block alloc: add lifecycle APIs for cache_entry structs

2018-06-28 Thread Junio C Hamano

Jameson Miller  writes:

> Add an API around managing the lifetime of cache_entry
> structs. Abstracting memory management details behind this API will
> allow for alternative memory management strategies without affecting
> all the call sites.  This commit does not change how memory is
> allocated or freed. A later commit in this series will allocate cache
> entries from memory pools as appropriate.
>
> Motivation:
> It has been observed that the time spent loading an index with a large
> number of entries is partly dominated by malloc() calls. This change
> is in preparation for using memory pools to reduce the number of
> malloc() calls made when loading an index.

Not worth a reroll, but having these four lines at the very
beginning, dropping the line "Motivation:", and then following that
with "Add an API around ..." as the second paragraph, would make the
above easier to read, without stutter-causing "Motivation:" in the
middle.

> diff --git a/apply.c b/apply.c
> index 8ef975a32d..8a4a4439bc 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -4092,12 +4092,12 @@ static int build_fake_ancestor(struct apply_state 
> *state, struct patch *list)
>   return error(_("sha1 information is lacking or useless "
>  "(%s)."), name);
>  
> - ce = make_cache_entry(patch->old_mode, , name, 0, 0);
> + ce = make_cache_entry(, patch->old_mode, , name, 0, 
> 0);
>   if (!ce)
>   return error(_("make_cache_entry failed for path '%s'"),
>name);
>   if (add_index_entry(, ce, ADD_CACHE_OK_TO_ADD)) {
> - free(ce);
> + discard_cache_entry(ce);
>   return error(_("could not add %s to temporary index"),
>name);
>   }

So..., even though it wasn't clear in the proposed log message, two
large part of the lifecycle management API is (1) make_cache_entry()
knows for which istate it is creating the entry and (2) discarding
the entry may not be just a simple matter of free()ing.  Both of
which makes perfect sense, but if the changes are that easily
enumeratable, it would have been nicer for readers if the commit did
so in the proposed log message.

> @@ -4424,27 +4423,26 @@ static int add_conflicted_stages_file(struct 
> apply_state *state,
>  struct patch *patch)
>  {
>   int stage, namelen;
> - unsigned ce_size, mode;
> + unsigned mode;
>   struct cache_entry *ce;
>  
>   if (!state->update_index)
>   return 0;
>   namelen = strlen(patch->new_name);
> - ce_size = cache_entry_size(namelen);
>   mode = patch->new_mode ? patch->new_mode : (S_IFREG | 0644);
>  
>   remove_file_from_cache(patch->new_name);
>   for (stage = 1; stage < 4; stage++) {
>   if (is_null_oid(>threeway_stage[stage - 1]))
>   continue;
> - ce = xcalloc(1, ce_size);
> + ce = make_empty_cache_entry(_index, namelen);

... and another one in the enumeration is make_empty_cache_entry()
which is somehow different.  And the readers are forced to read its
implementation to find out how it is different, but the use of the
same discard_cache_entry() suggests that the liftime rule of an
entry allcoated by it may be similar to those created by
make_cache_entry().

> ...
>   if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD) < 0) {
> - free(ce);
> + discard_cache_entry(ce);
>   return error(_("unable to add cache entry for %s"),
>patch->new_name);
>   }

> @@ -230,11 +230,11 @@ static int checkout_merged(int pos, const struct 
> checkout *state)
>   if (write_object_file(result_buf.ptr, result_buf.size, blob_type, ))
>   die(_("Unable to add merge result for '%s'"), path);
>   free(result_buf.ptr);
> - ce = make_cache_entry(mode, , path, 2, 0);
> + ce = make_transient_cache_entry(mode, , path, 2);

... and then yet another, which is "transient".  An intelligent
reader can guess from the lack of istate parameter (and from the
word "transient") that the resulting one would not belong to any
in-core index.

>   if (!ce)
>   die(_("make_cache_entry failed for path '%s'"), path);
>   status = checkout_entry(ce, state, NULL);
> - free(ce);
> + discard_cache_entry(ce);

... but discovers that it is discarded the same way, realizes that
ce knows how it was allocated to allow discard() different way to
discard it, and his/her earlier conjecture about make_empty() does
not hold at all and gets somewhat disappointed.

> diff --git a/cache.h b/cache.h
> index 3fbf24771a..035a627bea 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -339,6 +339,40 @@ extern void remove_name_hash(struct index_state *istate, 
> struct cache_entry

Re: [PATCH 2/2] grep.c: teach 'git grep --only-matching'

2018-06-28 Thread Jeff King

On Mon, Jun 25, 2018 at 04:26:00PM -0500, Taylor Blau wrote:

> For instance, a line containing the following (taken from README.md:27):
> 
>   (`man gitcvs-migration` or `git help cvs-migration` if git is
> 
> Is printed as follows:
> 
>   $ git grep -no -e git -- README.md | grep ":27"
>   README.md:27:7:git
>   README.md:27:16:git
>   README.md:27:38:git

This is with "--column", too, right?

> Like GNU grep, this patch ignores --only-matching when --invert (-v) is
> given. There is a sensible answer here, but parity with the behavior of
> other tools is preferred.

Yeah, after all of our discussion about inverted matching and columns,
I'm sure we could come up with _some_ answer. But I agree that what you
have here is quite sensible, and matching GNU grep seems like a good
idea.

> ---
>  builtin/grep.c  |  6 ++
>  grep.c  | 48 +---
>  grep.h  |  1 +
>  t/t7810-grep.sh | 15 +++
>  4 files changed, 55 insertions(+), 15 deletions(-)

The patch itself looks pretty straightforward to me (especially with
"-w"). I didn't hit the compiler warning that Junio did (I have gcc
7.3.0). But I agree it's better to avoid even passing an uninitialized
variable to another function.

-Peff

Re: curious about wording in "man git-config", ENVIRONMENT

2018-06-28 Thread Jeff King

On Tue, Jun 26, 2018 at 12:51:45PM -0400, Robert P. J. Day wrote:

> > I agree it's weird. I think it's trying to mean "behaves as if it
> > was set to", but with the additional notion that the command-line
> > argument would take precedence over the environment (which is our
> > usual rule). But then we should just say those things explicitly.
> >
> > Just looking at mentions of GIT_CONFIG in that manpage and knowing
> > the history, I think:
> 
>   ... snip ...
> 
> i'm just going to admit that i don't quite have the background to know
> how to submit a patch to tidy things up based on Jeff's analysis, so
> I'm going to leave this to someone higher up the food chain.

There's some related discussion going on in:

  https://public-inbox.org/git/20180627045637.13818-1-...@pobox.com/

I think it makes sense to wait for that to settle and then I may try to
build on top.

-Peff

Re: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs

2018-06-28 Thread Jeff King

On Thu, Jun 28, 2018 at 12:36:06PM -0400, Todd Zullinger wrote:

> >> It might be enough if the default values are relatively sane
> >> and consistent.  That would be a slight improvement over the
> >> current situation still.
> > 
> > Yeah. Taking a step back from the implementation questions, I think
> > "$(prefix)/etc/gitconfig" is not very helpful text. I'd be happy to see
> > us come up with a generic way of saying that which is more
> > comprehensible to end-users. Your patches side-step that by filling in
> > the real value, but unfortunately we can't do that everywhere. :)
> > 
> > There may not be a good "token" value, though. I.e., we may need to have
> > two sets of verbiage: the specific one, and the generic one.
> 
> Yeah.  About the best generic term I can come up with is
> using '$sysconfdir'.  But I have no idea if that's something
> most readers will recognize as a placeholder for something
> like /etc.

I don't that's much better than $(prefix). It's at least _correct_ more
often, but unless you're used to autotools conventions, it's pretty
obscure. Can we just say "the system configuration direction (e.g.,
/etc)" or something like that?

That definitely requires doing some kind of ifdef in the asciidoc
source, but I think the end result will be much more comprehensible to
end users. And IMHO the readability hit to the source is not too bad (at
least I don't find the DEFAULT_PAGER one to be).

Something like this:

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 18ddc78f42..ab20dd5235 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -113,13 +113,16 @@ For reading options: read only from global `~/.gitconfig` 
and from
 See also <>.

 --system::
-   For writing options: write to system-wide
-   `$(prefix)/etc/gitconfig` rather than the repository
-   `.git/config`.
+   For writing options: write to the system-wide gitconfig file
+   rather then the repository config.
 +
-For reading options: read only from system-wide `$(prefix)/etc/gitconfig`
+For reading options: read only from system-wide gitconfig file
 rather than from all available files.
 +
+The exact location of the system-wide file is configured when Git is
+built.
+ifdef::git-doc-generic[In many builds, it is `/etc/gitconfig`.]
+ifndef::git-doc-generic[In your build, it is +{etc-gitconfig}+.]
 See also <>.

 --local::

Though I would also be happy if we simply said "system-wide gitconfig"
here and then had the conditional part in FILES.

I'd actually argue that all of the "source" options should be grouped,
like:

  --local::
  --global::
  --system::
  --file=::
When writing, write to the repository-local, user-global, or
system-wide configuration file (or any `` you specify).
When reading, read from a specific file rather than from all
available files. See < below for more details.

And then let <> describe in prose the various files, where they
might be, etc. That also cleans up the `.git/config` thing, which is a
mild fiction (it is really `$GIT_DIR/config`).

> A number of the path references are in the FILES sections of
> the man pages.  It might not be much of an improvement if we
> try to stuff too much text in those references.  Perhaps if
> those used '$sysconfdir/gitconfig' a subsequent note could
> expand on that?  It could even be wrapped in an ifdef
> similar to that used for the default editor/pager.

So yeah, I think I am arguing along the same lines, but just saying
"system-wide gitconfig" or similar instead of $sysconfdir/gitconfig. I
think the English is a little less daunting.

> > I think it shouldn't go into config.mak.uname, since the idea there was
> > to keep the long list of platform defaults separate from other logic
> > (the Makefile is already long enough!). So I'm basically proposing the
> > same thing but in its own file. :)
> 
> Ahh, something that the top-level Makefile would create and
> then subdir Makefiles would include, perhaps similar to the
> way GIT-VERSION-FILE is updated and included?  That could
> prove useful to some of the tools in contrib which duplicate
> logic.

No, I had just envisioned it would be a static file, like
config.mak.paths or something. I'm literally just trying to break out
bits of our Makefile into bite-sized files so they're easier look at.

> Skipping that larger de-duplication cleanup, here's a stab
> at implementing the DOC_GENERIC knob (and the DOC_ overrides
> for ETC_GIT(CONFIG|ATTRIBUTES).  The defaults with
> '/GENERIC-SYSCONFDIR' in them are just placeholders waiting
> for a better name. :)

So obviously I like the direction of this patch, but if you agree with
my line of thinking above you'd need to turn DOC_GENERIC into an
attribute and use in-source conditionals. Hopefully you do agree. :)

> Thanks for thinking this through and providing some good
> directions to work on!

Thank you for working on it! I was thinking about this because of

Re: [GSoC][PATCH v5 0/2] rebase -i: rewrite append_todo_help() in C

2018-06-28 Thread Junio C Hamano

Alban Gruin  writes:

> This patch rewrites append_todo_help() from shell to C. The C version
> covers a bit more than the old shell version. To achieve that, some
> parameters were added to rebase--helper.
>
> This also introduce a new source file, rebase-interactive.c.
>
> This is part of the effort to rewrite interactive rebase in C.
>
> This is based on next, as of 2018-06-28.

Please do not base new development on 'next'.  You typically do not
require the whole of 'next'; there is *NO* reason why this topic
must wait until say ds/ewah-cleanup topic graduates to 'master', for
example.

You may still depend on just a handful of selected topics that you
build on that are not yet in 'master'.  Identify them and build on
top of the merge of them on top of whatever integration branch you
are aiming for (typically 'master').

And list these topics explicitly, instead of saying 'based on next'.

Re: [PATCH 1/1] Makefile: fix the "built from commit" code

2018-06-28 Thread Jeff King

On Thu, Jun 28, 2018 at 06:23:56PM +0200, Johannes Schindelin wrote:

> On Thu, 28 Jun 2018, Jeff King wrote:
> 
> > On Wed, Jun 27, 2018 at 09:35:23PM +0200, Johannes Schindelin via 
> > GitGitGadget wrote:
> > 
> > > To prevent erroneous commits from being reported (e.g. when unpacking
> > > Git's source code from a .tar.gz file into a subdirectory of a different
> > > Git project, as e.g. git_osx_installer does), we painstakingly set
> > > GIT_CEILING_DIRECTORIES when trying to determine the current commit.
> > > 
> > > Except that we got the quoting wrong, and that variable therefore does
> > > not have the desired effect.
> > > 
> > > Let's fix that quoting, and while at it, also suppress the unhelpful
> > > message
> > 
> > I had to stare at the code for a bit to figure out what was wrong:
> 
> Do you want me to update the commit message?

I'm OK either way. Probably not worth a re-roll unless you want to be
completionist about commit messages (personally I do not mind
occasionally jumping to the list archive to get historical context and
reviews).

-Peff

Re: [PATCH 1/1] Makefile: fix the "built from commit" code

2018-06-28 Thread Jeff King

On Thu, Jun 28, 2018 at 10:27:32AM -0700, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> I.e.:
> >> 
> >>   FOO='with spaces'
> >>   BAR=$FOO sh -c 'echo $BAR'
> >> 
> >> works just fine.
> >
> > $ x="two  spaces"
> >
> > $ echo $x
> > two spaces
> >
> > Maybe we should quote a little bit more religiously.
> 
> Both of you are wrong ;-)
> 
> Of course, the lack of dq around echo's argument makes shell split
> two and spaces into two args and feed them separately to echo, and
> causes echo to show them with a single SP in between.  Peff's
> exampel should have been
> 
>   BAR=$FOO sh -c 'echo "$BAR"'

Yes, that's a better example. I was primarily trying to show that the
outer shell did not barf with "spaces: command not found".

> But that does not have much to do with the primary point Peff was
> talking about, which is that in this sequence:
> 
>   $ x="two  spaces"
>   $ y="$x"
>   $ z=$x
>   $ echo "x=<$x>" "y=<$y>" "z=<$z>"
> 
> assignment to y and z behave identically, i.e. dq around "$x" when
> assigning to y is not needed.

I actually had to test it to convince myself that one-shot assignments
behaved the same way, but they do.

-Peff

Re: [PATCH] fsck: check skiplist for object in fsck_blob()

2018-06-28 Thread Jeff King

On Thu, Jun 28, 2018 at 05:56:18PM +0100, Ramsay Jones wrote:

> > Yeah, this solution seems sensible. Given that we would never report any
> > error for that blob, there is no point in even looking at it. I wonder
> > if we ought to do the same for other types, too. Is there any point in
> > opening a tree that is in the skiplist?
> 
> Note that the 'blob' object has already been 'loaded' at this
> point anyway (and the basic 'object' checks have been done).

Yeah, you're right. If we wanted to avoid that, we should prevent it
from entering the gitmodules_found list in the first place.

Of course, we'd generally still load it once anyway in order to check
the sha1. So really, the most we could do is prevent it from being
loaded a _second_ time for no reason during the fsck_finish() stage.

But for the reasons I gave in the fsck_finish() patches (like pack
ordering), we will quite often end up hitting it in the first pass
anyway. So it's probably not worth spending too much time trying to
optimize it.

And thinking on that, my "should we do this for trees" is pretty dumb,
too. We load them and compute their sha1 anyway (I _think_ bitrot checks
like that are totally independent of skiplist). So all we'd be saving is
walking the buffer entries.

> I did think about this, briefly, but decided that we should
> only 'skip' the leaf nodes (blobs). (So, when processing
> commits, trees and tags, we should not report an error for
> that object-id, but that should not stop us from checking
> the tree object associated with a commit, just because of
> a problem with the commit message).
> 
> [Oh, wait - Hmm, each object could be checked independently
> of all others and not used for any object traversal right?
> (e.g. using packfile index). I saw fsck_walk() and didn't
> look any further ... Ah, broken link check, ... I obviously
> need to read the code some more ... :-D ]

Yes, I've been confused by this code before. I'm still not sure I
totally understand it. ;)

> > One thing we could do is turn the parse failure into a noop. The main
> > point of the check is to protect people against the malicious
> > .gitmodules bug. If the file can't be parsed, then it also can't be an
> > attack vector (assuming the parser used for the fsck check and the
> > parser used by the victim behave identically).
> 
> Hmm, yeah, but I would have to provide a means of suppressing
> the config parser error messages. Something I wanted to avoid. :(

Yes, though I don't think it's too bad. We already have a "die_on_error"
flag in the config code. I think it just needs to become a tristate:
die/error/silent (and probably get passed in via config_options, since I
think we tie it right now to the file/blob source).

> Junio, do you want me to address the above 'rejected push'
> issue in this patch, or with a follow-up patch? (It should
> be a pretty rare problem - famous last words!)

Hmm, if we end up doing the config thing above, then this patch would
become unnecessary.

And I think doing that would help _all_ cases, even ones without a
skiplist. They don't need to see random config error messages either,
even if we do eventually report an fsck error.

-Peff

Re: [PATCH] fsck: check skiplist for object in fsck_blob()

2018-06-28 Thread Jeff King

On Thu, Jun 28, 2018 at 09:39:39AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Yeah, this solution seems sensible. Given that we would never report any
> > error for that blob, there is no point in even looking at it. I wonder
> > if we ought to do the same for other types, too. Is there any point in
> > opening a tree that is in the skiplist?
> 
> Suppose the tree is listed there only because it has one entry for a
> subtree with leading 0 in its mode.  We do want to ignore that
> format violation, but we still want to learn the fact that the
> subtree it points at and its contents are connected and not
> dangling, so we do need to open it.  Is that open done in a separate
> phase?

To be honest, I'm not sure. There _is_ a separate phase for checking
reachability, but I think there may be some dependencies between the
phases. Once upon a time they were communicated by the existence of
entries in obj_hash (blech!) but I think these days it happens using a
a bit in object->flags.

There is at least one case of interest just in this phase, though: we
have to read the surrounding tree to find out that a particular blob is
a .gitmodules file. So if you skiplist'd a tree, that would also mean we
fail to mark its .gitmodules (if any) as such. I'm not sure if that
would be a bug or a feature, though.

-Peff

Re: [PATCH] fsck: check skiplist for object in fsck_blob()

2018-06-28 Thread Junio C Hamano

Ramsay Jones  writes:

> Junio, do you want me to address the above 'rejected push'
> issue in this patch, or with a follow-up patch? (It should
> be a pretty rare problem - famous last words!)

If you feel the need to say "famous last words", it is an indication
that it is better done as a follow-up, I would think ;-)

Thanks for spotting and addressing this issue.

Re: [PATCH 1/1] Makefile: fix the "built from commit" code

2018-06-28 Thread Junio C Hamano

Johannes Schindelin  writes:

>> I.e.:
>> 
>>   FOO='with spaces'
>>   BAR=$FOO sh -c 'echo $BAR'
>> 
>> works just fine.
>
>   $ x="two  spaces"
>
>   $ echo $x
>   two spaces
>
> Maybe we should quote a little bit more religiously.

Both of you are wrong ;-)

Of course, the lack of dq around echo's argument makes shell split
two and spaces into two args and feed them separately to echo, and
causes echo to show them with a single SP in between.  Peff's
exampel should have been

BAR=$FOO sh -c 'echo "$BAR"'

But that does not have much to do with the primary point Peff was
talking about, which is that in this sequence:

$ x="two  spaces"
$ y="$x"
$ z=$x
$ echo "x=<$x>" "y=<$y>" "z=<$z>"

assignment to y and z behave identically, i.e. dq around "$x" when
assigning to y is not needed.

Re: Use of new .gitattributes working-tree-encoding attribute across different platform types

2018-06-28 Thread Jeff King

On Thu, Jun 28, 2018 at 07:21:18PM +0200, Lars Schneider wrote:

> How about this:
> 
> 1) We allow users to set the encoding "auto". Example:
> 
>   *.txt working-tree-encoding=auto
> 
> 2) We define a new variable `core.autoencoding`. By default the value is 
> UTF-8 (== no re-encoding) but user can set to any value in their Git config. 
> Example:
> 
> git config --global core.autoencoding UTF-16
> 
> All files marked with the value "auto" will use the encoding defined in
> `core.autoencoding`.
> 
> Would that work?

Yeah, that was along the lines that I was thinking. I wonder if anybody
would ever need two such auto-encodings, though. Probably not. But
another way to think about it would be to allow something like:

  working-tree-encoding=foo

and then in your config "foo" to map to some encoding.

But that may be over-engineering, I dunno. utf8 has always been enough
for me. :)

-Peff

Re: Use of new .gitattributes working-tree-encoding attribute across different platform types

2018-06-28 Thread Lars Schneider




> On Jun 28, 2018, at 4:34 PM, Jeff King  wrote:
> 
> On Thu, Jun 28, 2018 at 02:44:47AM +, brian m. carlson wrote:
> 
>> On Wed, Jun 27, 2018 at 07:54:52AM +, Steve Groeger wrote:
>>> We have common code that is supposed to be usable across different 
>>> platforms and hence different file encodings. With the full support of the 
>>> working-tree-encoding in the latest version of git on all platforms, how do 
>>> we have files converted to different encodings on different platforms?
>>> I could not find anything that would allow us to say 'if platform = z/OS 
>>> then encoding=EBCDIC else encoding=ASCII'.   Is there a way this can be 
>>> done?
>> 
>> I don't believe there is such functionality.  Git doesn't have
>> attributes that are conditional on the platform in that sort of way.
>> You could use a smudge/clean filter and adjust the filter for the
>> platform you're on, which might meet your needs.
> 
> We do have prior art in the line-ending code, though. There the
> attributes say either that a file needs a specific line-ending type
> (which is relatively rare), or that it should follow the system type,
> which is then set separately in the config.
> 
> I have the impression that the working-tree-encoding stuff was made to
> handle the first case, but not the second. It doesn't seem like an
> outrageous thing to eventually add.
> 
> (Though I agree that clean/smudge filters would work, and can even
> implement the existing working-tree-encoding feature, albeit less
> efficiently and conveniently).

Thanks for the suggestion Peff! 
How about this:

1) We allow users to set the encoding "auto". Example:

*.txt working-tree-encoding=auto

2) We define a new variable `core.autoencoding`. By default the value is 
UTF-8 (== no re-encoding) but user can set to any value in their Git config. 
Example:

git config --global core.autoencoding UTF-16

All files marked with the value "auto" will use the encoding defined in
`core.autoencoding`.

Would that work?

@steve: Would that fix your problem?

- Lars

Re: [PATCH v5 5/8] mem-pool: add life cycle management functions

2018-06-28 Thread Junio C Hamano

Jameson Miller  writes:

> Add initialization and discard functions to mem_pool type. As the
> memory allocated by mem_pool can now be freed, we also track the large
> allocations.
>
> If the there are existing mp_blocks in the mem_poo's linked list of

mem_poo?

> mp_blocksl, then the mp_block for a large allocation is inserted

mp_blocksl?

Re: [PATCH v5 2/8] read-cache: make_cache_entry should take object_id struct

2018-06-28 Thread Junio C Hamano

Jameson Miller  writes:

> The make_cache_entry function should take an object_id struct instead
> of sha.

The name of the hash is SHA-1, not sha ;-)

It is not worth a reroll, but I do not think "should" is a
particularly good thing to say in the title or justification in the
log message in this case.  It is more like you (or somebody else who
commented) _want_ to make it take an oid for _some_ reason.  "teach
make_cache_entry() to take object_id" is probably a better title
that admits that we do not explicitly say _why_ we are doing so,
than saying "it should", which equally is not explicit ;-)

> diff --git a/read-cache.c b/read-cache.c
> index fa8366ecab..9624ce1784 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -746,8 +746,10 @@ int add_file_to_index(struct index_state *istate, const 
> char *path, int flags)
>  }
>  
>  struct cache_entry *make_cache_entry(unsigned int mode,
> - const unsigned char *sha1, const char *path, int stage,
> - unsigned int refresh_options)
> +  const struct object_id *oid,
> +  const char *path,
> +  int stage,
> +  unsigned int refresh_options)
>  {
>   int size, len;
>   struct cache_entry *ce, *ret;
> @@ -761,7 +763,7 @@ struct cache_entry *make_cache_entry(unsigned int mode,
>   size = cache_entry_size(len);
>   ce = xcalloc(1, size);
>  
> - hashcpy(ce->oid.hash, sha1);
> + hashcpy(ce->oid.hash, oid->hash);
>   memcpy(ce->name, path, len);
>   ce->ce_flags = create_ce_flags(stage);
>   ce->ce_namelen = len;

The patch itself looks good.

Thanks.

Re: [PATCH v4] Documentation: declare "core.ignoreCase" as internal variable

2018-06-28 Thread Torsten Bögershausen

On 28.06.18 13:21, Marc Strapetz wrote:
> The current description of "core.ignoreCase" reads like an option which
> is intended to be changed by the user while it's actually expected to
> be set by Git on initialization only. Subsequently, Git relies on the
> proper configuration of this variable, as noted by Bryan Turner [1]:
> 
> Git on a case-insensitive filesystem (APFS, HFS+, FAT32, exFAT,
> vFAT, NTFS, etc.) is not designed to be run with anything other
> than core.ignoreCase=true.
> 
> [1] https://marc.info/?l=git=152998665813997=2
> mid:cagyf7-gee8jrgpkme9rhkptheq6p1+ebpmmwatmh01uo3bf...@mail.gmail.com
> 
> Signed-off-by: Marc Strapetz 
> ---
>  Documentation/config.txt | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 1cc18a828..c70cfe956 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -390,16 +390,19 @@ core.hideDotFiles::
>   default mode is 'dotGitOnly'.
>  
>  core.ignoreCase::
> - If true, this option enables various workarounds to enable
> + Internal variable which enables various workarounds to enable
>   Git to work better on filesystems that are not case sensitive,
> - like FAT. For example, if a directory listing finds
> - "makefile" when Git expects "Makefile", Git will assume
> + like APFS, HFS+, FAT, NTFS, etc. For example, if a directory listing
> + finds "makefile" when Git expects "Makefile", Git will assume
>   it is really the same file, and continue to remember it as
>   "Makefile".
>  +
>  The default is false, except linkgit:git-clone[1] or linkgit:git-init[1]
>  will probe and set core.ignoreCase true if appropriate when the repository
>  is created.
> ++
> +Git relies on the proper configuration of this variable for your operating
> +and file system. Modifying this value may result in unexpected behavior.
>  
>  core.precomposeUnicode::
>   This option is only used by Mac OS implementation of Git.
> 

Looks good to me

Re: [PATCH] fsck: check skiplist for object in fsck_blob()

2018-06-28 Thread Ramsay Jones




On 28/06/18 12:49, Jeff King wrote:
> On Wed, Jun 27, 2018 at 07:39:53PM +0100, Ramsay Jones wrote:
> 
>> Since commit ed8b10f631 ("fsck: check .gitmodules content", 2018-05-02),
>> fsck will issue an error message for '.gitmodules' content that cannot
>> be parsed correctly. This is the case, even when the corresponding blob
>> object has been included on the skiplist. For example, using the cgit
>> repository, we see the following:
>>
>>   $ git fsck
>>   Checking object directories: 100% (256/256), done.
>>   error: bad config line 5 in blob .gitmodules
>>   error in blob 51dd1eff1edc663674df9ab85d2786a40f7ae3a5: gitmodulesParse: 
>> could not parse gitmodules blob
>>   Checking objects: 100% (6626/6626), done.
>>   $
>>
>>   $ git config fsck.skiplist '.git/skip'
>>   $ echo 51dd1eff1edc663674df9ab85d2786a40f7ae3a5 >.git/skip
>>   $
>>
>>   $ git fsck
>>   Checking object directories: 100% (256/256), done.
>>   error: bad config line 5 in blob .gitmodules
>>   Checking objects: 100% (6626/6626), done.
>>   $
>>
>> Note that the error message issued by the config parser is still
>> present, despite adding the object-id of the blob to the skiplist.
>>
>> One solution would be to provide a means of suppressing the messages
>> issued by the config parser. However, given that (logically) we are
>> asking fsck to ignore this object, a simpler approach is to just not
>> call the config parser if the object is to be skipped. Add a check to
>> the 'fsck_blob()' processing function, to determine if the object is
>> on the skiplist and, if so, exit the function early.
> 
> Yeah, this solution seems sensible. Given that we would never report any
> error for that blob, there is no point in even looking at it. I wonder
> if we ought to do the same for other types, too. Is there any point in
> opening a tree that is in the skiplist?

Note that the 'blob' object has already been 'loaded' at this
point anyway (and the basic 'object' checks have been done).

I did think about this, briefly, but decided that we should
only 'skip' the leaf nodes (blobs). (So, when processing
commits, trees and tags, we should not report an error for
that object-id, but that should not stop us from checking
the tree object associated with a commit, just because of
a problem with the commit message).

[Oh, wait - Hmm, each object could be checked independently
of all others and not used for any object traversal right?
(e.g. using packfile index). I saw fsck_walk() and didn't
look any further ... Ah, broken link check, ... I obviously
need to read the code some more ... :-D ]

>> I noticed recently that the 'cgit.git' repo was complaining when
>> doing an 'git fsck' ...
>>
>> Note that 'cgit' had a '.gitmodules' file and a 'submodule.sh'
>> script back in 2007, which had nothing to do with the current
>> git submodule facilities, ... ;-)
> 
> Yikes. I worried about that sort of regression when adding the
> .gitmodules checks. But this _is_ a pretty unique case: somebody was
> implementing their own version of submodules (pre-git-submodule) and
> decided to use the same name. So I'm not sure if this is a sign that we
> need to think through the regression, or a sign that it really is rare.

I don't have any numbers, but my gut tells me that this would
be very rare indeed. Of course, my gut has been wrong before ... :-D

> One thing we could do is turn the parse failure into a noop. The main
> point of the check is to protect people against the malicious
> .gitmodules bug. If the file can't be parsed, then it also can't be an
> attack vector (assuming the parser used for the fsck check and the
> parser used by the victim behave identically).

Hmm, yeah, but I would have to provide a means of suppressing
the config parser error messages. Something I wanted to avoid. :(

> That wouldn't help with your stray message, of course, but it would
> eliminate the need to deal with the skiplist here (and skiplists aren't
> always easy to do -- for instance, pushing up a non-fork of cgit to
> GitHub would now be rejected because of this old file, and you'd have to
> contact support to resolve it).

Good point.

>> I just remembered that I had intended to review the name of the
>> function that this patch introduces before sending, but totally
>> forgot! :(
>>
>> [Hmm, 'to_be_skipped' -> object_to_be_skipped, object_on_skiplist,
>> etc., ... suggestions welcome!]
> 
> The current name is OK to be, but object_on_skiplist() also seems quite
> obvious.

object_on_skiplist() it is!

Junio, do you want me to address the above 'rejected push'
issue in this patch, or with a follow-up patch? (It should
be a pretty rare problem - famous last words!)

ATB,
Ramsay Jones

Re: Inconsistencies in commit-graph technical docs.

2018-06-28 Thread Grant Welch

Thanks for the quick response and for the patch.

I started writing such a long process document because I _thought_
that I found a major issue with the Large Edge List. But, in the end,
it was user error. It turned out that I was looking at index '11' when
I should have been looking at index '0x11' ('17'). Whoops!
On Thu, Jun 28, 2018 at 5:49 AM Derrick Stolee  wrote:
>
> On 6/28/2018 1:11 AM, Grant Welch wrote:
> > I recently read the "Supercharging the Git Commit Graph blog by
> > Derrick Stolee. I found the article interesting and wanted to verify
> > the performance numbers for myself. Then that led me to want to know
> > more about the implementation, so I read the technical design notes in
> > commit-graph.txt, and then I jumped into the format documentation in
> > commit-graph-format.txt.
> >
> > Along the way, I noticed a few issues. They might just be errors in
> > the documentation, but I figured it was worth documenting my entire
> > process just to be sure.
> >
> > "Supercharging the Git Commit Graph", by Derrick Stolee:
> >
> > https://blogs.msdn.microsoft.com/devops/2018/06/25/supercharging-the-git-commit-graph/
> >
> > # TL;DR
> >
> > I found a few discrepencies between the documentation in
> > commit-graph-format.txt and the results that I observed for myself.
> >
> > 1. The "Commit Data" chunk signature is documented to be 'CGET', but
> > it should be 'CDAT'.
> >
> > commit-graph.c:18
> >#define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */
>
> Thanks for catching this! Thankfully, this is an easy fix, as we only
> need to update the documentation.
>
> > 2. The "no parent" value is documented to be 0x, but is
> > actually 0x7000.
> >
> > commit-graph.c:34
> >#define GRAPH_PARENT_NONE 0x7000
>
> This is a more important mistake, as it affects the data that was
> written in the commit-graph file.
>
> Part of the problem is that leading hex digit of 0x7 which in binary is
> 0b0111. We already designed a limit of at most 2^{31}-1 (~2.1 billion)
> commits in the commit-graph because of the way we track octopus edges,
> but this mistake has cost us more: we cannot store more than ~1.8
> billion commits.
>
> I'm sorry for this mixup, mostly because it is aesthetically unpleasant.
> Those extra 300 million commits mean less to me than having a clean file
> format.
>
> > 3. The "generation" field isn't set on any of the commits. (I don't
> > know what to make of this.)
>
> This is a difference between 2.18 and current 'master', which merged
> ds/generation-numbers. Commit-graphs written with Git 2.18 have all
> generation numbers listed as GENERATION_NUMBER_ZERO (0), which lets
> future versions know that the generation number is not computed yet, so
> the next commit-graph write will compute the correct generation number.
>
> I'll send a patch soon fixing these doc issues.
>
> Thanks,
> -Stolee



--
-grant welch
-gwelch...@gmail.com

Re: [PATCH v4] Documentation: declare "core.ignoreCase" as internal variable

2018-06-28 Thread Junio C Hamano

Thanks.

Re: [PATCH] fsck: check skiplist for object in fsck_blob()

2018-06-28 Thread Junio C Hamano

Jeff King  writes:

> Yeah, this solution seems sensible. Given that we would never report any
> error for that blob, there is no point in even looking at it. I wonder
> if we ought to do the same for other types, too. Is there any point in
> opening a tree that is in the skiplist?

Suppose the tree is listed there only because it has one entry for a
subtree with leading 0 in its mode.  We do want to ignore that
format violation, but we still want to learn the fact that the
subtree it points at and its contents are connected and not
dangling, so we do need to open it.  Is that open done in a separate
phase?

Re: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs

2018-06-28 Thread Todd Zullinger

Jeff King wrote:
> We could get rid around that by using $(DOC_ETC_GITCONFIG) or something,
> with:
> 
>   DOC_ETC_GITCONFIG ?= $(ETC_GITCONFIG)
> 
> in the Makefile. But that still leaves the choice of which generic text
> to use up to the caller. Maybe we should provide more guidance.
> 
> I.e., provide a knob like DOC_GENERIC that fills in generic values for
> these values (and maybe some others; it sounds like we have some
> existing problem cases).

That sounds pretty reasonable.  I have something
implementing this below.

>> It might be enough if the default values are relatively sane
>> and consistent.  That would be a slight improvement over the
>> current situation still.
> 
> Yeah. Taking a step back from the implementation questions, I think
> "$(prefix)/etc/gitconfig" is not very helpful text. I'd be happy to see
> us come up with a generic way of saying that which is more
> comprehensible to end-users. Your patches side-step that by filling in
> the real value, but unfortunately we can't do that everywhere. :)
> 
> There may not be a good "token" value, though. I.e., we may need to have
> two sets of verbiage: the specific one, and the generic one.

Yeah.  About the best generic term I can come up with is
using '$sysconfdir'.  But I have no idea if that's something
most readers will recognize as a placeholder for something
like /etc.

A number of the path references are in the FILES sections of
the man pages.  It might not be much of an improvement if we
try to stuff too much text in those references.  Perhaps if
those used '$sysconfdir/gitconfig' a subsequent note could
expand on that?  It could even be wrapped in an ifdef
similar to that used for the default editor/pager.

I don't want to make the plain .txt files significantly less
readable in the process, of course.

>>> It's a shame we have to repeat this logic from the Makefile, though I
>>> guess we already do so for prefix, bindir, etc, so far.
>>> 
>>> Should we factor the path logic from the top-level Makefile into an
>>> include that can be used from either?
>> 
>> Yeah, maybe.  I didn't like the duplication either, but as
>> you noticed, we do it already for many of the other
>> variables.  I suspect we could put these defaults into
>> config.mak.uname which Documentation/Makefile includes and
>> then you could run 'make -C Documentation' in a fresh clone
>> or tarball and get docs built with the defaults set for each
>> platform.
> 
> I think it shouldn't go into config.mak.uname, since the idea there was
> to keep the long list of platform defaults separate from other logic
> (the Makefile is already long enough!). So I'm basically proposing the
> same thing but in its own file. :)

Ahh, something that the top-level Makefile would create and
then subdir Makefiles would include, perhaps similar to the
way GIT-VERSION-FILE is updated and included?  That could
prove useful to some of the tools in contrib which duplicate
logic.

Skipping that larger de-duplication cleanup, here's a stab
at implementing the DOC_GENERIC knob (and the DOC_ overrides
for ETC_GIT(CONFIG|ATTRIBUTES).  The defaults with
'/GENERIC-SYSCONFDIR' in them are just placeholders waiting
for a better name. :)

Similarly, the use of {etc-git(config|attributes)} as the
attribute for the replacements could likely be improved for
readers of the .txt files.  {system-wide-gitconfig} is
likely better.  Maybe the default for the generic paths
could be /system-wide/git(config|attributes) too (or in
CAPS to make it more obviously a placeholder)?

Thanks for thinking this through and providing some good
directions to work on!

-- >8 --
Subject: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs

Replace `$(prefix)/etc/gitconfig` and `$(prefix)/etc/gitattributes` in
generated documentation with the paths chosen when building.  Readers of
the documentation should not need to know how `$(prefix)` was defined.

It's also more consistent than sometimes using `$(prefix)/etc/gitconfig`
and other times using `/etc/gitconfig` to refer to the system-wide
config file.

Update the SYNOPSIS of gitattributes(5) to include the system-wide
config file as well.

Support building generic documentation with a DOC_GENERIC Makefile knob.
The default generic paths may be further customized via the
DOC_ETC_GITCONFIG and DOC_ETC_GITATTRIBUTES variables.

Define DOC_GENERIC in dist-doc make target to ensure generic paths are
used in the generated html and manpage tarballs.

Helped-by: Jeff King 
Signed-off-by: Todd Zullinger 
---
 Documentation/Makefile  | 22 ++
 Documentation/config.txt|  4 ++--
 Documentation/git-config.txt| 10 +-
 Documentation/git.txt   |  4 ++--
 Documentation/gitattributes.txt |  4 ++--
 Makefile| 10 --
 6 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/Documentation/Makefile b/Documentation/Makefile
index d079d7c73..2b3540a6c 100644
---

Re: [PATCH 1/1] Makefile: fix the "built from commit" code

2018-06-28 Thread Johannes Schindelin

Hi Peff,

On Thu, 28 Jun 2018, Jeff King wrote:

> On Wed, Jun 27, 2018 at 09:35:23PM +0200, Johannes Schindelin via 
> GitGitGadget wrote:
> 
> > To prevent erroneous commits from being reported (e.g. when unpacking
> > Git's source code from a .tar.gz file into a subdirectory of a different
> > Git project, as e.g. git_osx_installer does), we painstakingly set
> > GIT_CEILING_DIRECTORIES when trying to determine the current commit.
> > 
> > Except that we got the quoting wrong, and that variable therefore does
> > not have the desired effect.
> > 
> > Let's fix that quoting, and while at it, also suppress the unhelpful
> > message
> 
> I had to stare at the code for a bit to figure out what was wrong:

Do you want me to update the commit message?

> > -   '-DGIT_BUILT_FROM_COMMIT="$(shell 
> > GIT_CEILING_DIRECTORIES=\"$(CURDIR)/..\" \
> > -   git rev-parse -q --verify HEAD || :)"'
> > +   '-DGIT_BUILT_FROM_COMMIT="$(shell \
> > +   GIT_CEILING_DIRECTORIES="$(CURDIR)/.." \
> > +   git rev-parse -q --verify HEAD 2>/dev/null)"'
> 
> The issue is that the $(shell) is resolved before the output is stuffed
> into the command-line with -DGIT_BUILT_FROM_COMMIT, and therefore is
> _not_ inside quotes. And thus backslashing the quotes is wrong, as the
> quote gets literally inserted into the CEILING_DIRECTORIES variable.
> 
> I thought at first we could not need the quotes in the post-image
> either, because shell variable assignments do not do word-splitting.

> I.e.:
> 
>   FOO='with spaces'
>   BAR=$FOO sh -c 'echo $BAR'
> 
> works just fine.

$ x="two  spaces"

$ echo $x
two spaces

Maybe we should quote a little bit more religiously.

> But $(CURDIR) here is not a shell variable, but rather a Makefile
> variable, so it's expanded before we hit the shell. So we need the
> quotes. And unfortunately it also breaks if $(CURDIR) contains exotic
> metacharacters. If we cared we could use single quotes and $(CURDIR_SQ),
> but I suspect it would be far from the first thing to break in such a
> case.
> 
> Which is a long-winded way of saying the patch looks correct to me.

Thanks ;-)

Ciao,
Dscho

Re: [PATCH v6 3/4] stash: convert branch to builtin

2018-06-28 Thread Paul-Sebastian Ungureanu


Hello,

On 27.06.2018 21:39, Junio C Hamano wrote:

This is primarily because cmd_$foo() is designed to be replacement
of "main()" in usual programs---it is allowed to assume the global
variables it uses have their initial values and nobody cares the
state it leaves behind when it returns.  Argument parser may flip
bits in file-scope static variables to record which options are
seen, assuming that these variables are initialized to all-zero, and
that assumption would not hold for the second call to cmd_$foo(),
etc.  cmd_$foo() also is allowed to call die() when there is an
unrecoverable error condition in the context of carrying out $foo; a
scripted Porcelain that used $foo as a building block to implement a
larger operation (e.g. "stash" that used "checkout" to switch to a
different branch) may want to react to the failure and do something
extra (i.e. "git checkout || do something more").

Using run_command() interface would not cause any of these problems;
the subcommand will run in a clean environment, and the caller can
react to its failure.



Thank you a lot for this great explanation. When I will submit a new 
iteration, I will make sure to replace calls of `cmd_$foo()` with 
`run_command()` (or with API, if possible).


Best regards,
Paul Ungureanu

Re: [PATCH v2] fetch-pack: support negotiation tip whitelist

2018-06-28 Thread Brandon Williams

On 06/28, Jonathan Tan wrote:
> > This seems like a pretty difficult to use feature, requiring that I
> > provide the actual OIDs.  I think a much better UI would probably be to
> > accept a number of different things ranging from exact OIDs to actual
> > ref names or even better, allowing for ref-patterns which include globs.
> > That way I can do the following:
> >   
> >   git fetch --negotiation-tip=refs/remotes/my-remote/* my-remote
> > 
> > in order to easily limit the tips to all the refs I have from that
> > particular remote.
> 
> Actual ref names are already supported - it uses the same DWIM as
> others. As for globs, I thought of supporting both DWIM objects and the
> LHS of refspecs, but (1) it might be strange to support master and
> refs/heads/* but not heads/*,

I don't think that would be strange at all, and no where in git do we
handle heads/* but we do already handle refs/heads/* as well as DWIM
master.


> and (2) I can't think of anywhere in Git
> where you can provide either one - it's either SHA-1 and DWIM name, or
> SHA-1 and refspec, but not all three.

fetch is a perfect example of supporting all three.  I can do

  git fetch origin SHA1
  git fetch origin master
  git fetch origin refs/heads/*:refs/heads/*

-- 
Brandon Williams

Re: [RFC PATCH v5] Implement --first-parent for git rev-list --bisect

2018-06-28 Thread Junio C Hamano

Johannes Schindelin  writes:

> On Wed, 27 Jun 2018, Junio C Hamano wrote:
>
>> Johannes Schindelin  writes:
>> 
>> >git rev-list --bisect-all --first-parent F..E >revs &&
>> ># only E, e1..e8 should be listed, nothing else
>> >test_line_count = 9 revs &&
>> >for rev in E e1 e2 e3 e4 e5 e6 e7 e8
>> >do
>> >grep "^$(git rev-parse $rev) " revs || return
>> >done
>> >
>> > I am faster by... a lot. Like, seconds instead of minutes.
>> 
>> I'm fine either way.  I just thought you would not want 9 separate
>> invocations of grep ;-)
>
> I don't.
>
> I like the unnecessary test_commit calls even less ;-)
>
> And yes, we could avoid those `grep` calls, I know. By something as
> convoluted as

I now think you are reading me wrong ;-)  

I think you already established pretty well that it is a good idea
to avoid introducing different history to run the new test on, when
there is perfectly usable one available.  That part, i.e. test_commit,
I do not think we have anything to disagree about.

What I meant by "many separte grep calls" was to contrast these two
approaches:

 * Have one typical output spelled out as "expect", take an output
   from an actual run into "actual", make them comparable and then
   do a compare (which does not use grep; it uses sort in the
   "making comparable" phase)

 * Not have any typical output, take an output from an actual run,
   and have _code_ that inspects the output encode the rule over the
   output (e.g. "these must exist" is inspected with many grep
   invocations)

Two things the "output must have 9 entries, and these 9 must be
mentioned" we see at the beginning of this message does not verify
are (1) exact dist value given to each of these entries and (2) the
order in which these entries appear in the output.  The latter is
something we document, and the test should cover.  The former does
not have to be cast in stone (i.e. I do not think it does not make a
difference to label the edge commits with dist=1 or dist=0 as long
as everything is consistent), but if there is no strong reason to
keep it possible for us to later change how the numbers are assigned,
I am OK if the test cast it in stone.

Another reason (other than "many separate invocation of grep") to
favor the former approach (i.e. use real-looking expected output,
munge it and the real output into comparable forms and then compare)
is that it is easier to see what aspect of the output we care (and
we do not care) about.

It is harder to see the omission of exact dist value and ordering of
entries in the outpu in the latter approach, and more importantly,
know if the omission was deliberate (e.g. it was merely an example)
or a mere mistake.

With "using a real-looking expected output, make it and the actual
output comparable and then compare" approach, the aspects in the
output we choose not to care about will show in the "make them
comparable" munging.  If we do not care the exact dist values, there
would be something like s/dist=[0-9]*/dist=X/ to mask the exact
value before making the two comparable to see that the expect and
the actual files have the same entries.  If we still do care about
the entries are ordered by the dist values, there would be something
that sorts the entries with the actual dist values before doing that
masking to ensure if the order is correct.

Re: [PATCH v2] fetch-pack: support negotiation tip whitelist

2018-06-28 Thread Jonathan Tan

> This seems like a pretty difficult to use feature, requiring that I
> provide the actual OIDs.  I think a much better UI would probably be to
> accept a number of different things ranging from exact OIDs to actual
> ref names or even better, allowing for ref-patterns which include globs.
> That way I can do the following:
>   
>   git fetch --negotiation-tip=refs/remotes/my-remote/* my-remote
> 
> in order to easily limit the tips to all the refs I have from that
> particular remote.

Actual ref names are already supported - it uses the same DWIM as
others. As for globs, I thought of supporting both DWIM objects and the
LHS of refspecs, but (1) it might be strange to support master and
refs/heads/* but not heads/*, and (2) I can't think of anywhere in Git
where you can provide either one - it's either SHA-1 and DWIM name, or
SHA-1 and refspec, but not all three.

Re: [PATCH v2] fetch-pack: support negotiation tip whitelist

2018-06-28 Thread Brandon Williams

On 06/27, Jonathan Tan wrote:
> During negotiation, fetch-pack eventually reports as "have" lines all
> commits reachable from all refs. Allow the user to restrict the commits
> sent in this way by providing a whitelist of tips; only the tips
> themselves and their ancestors will be sent.
> 
> This feature is only supported for protocols that support connect or
> stateless-connect (such as HTTP with protocol v2).
> 
> This will speed up negotiation when the repository has multiple
> relatively independent branches (for example, when a repository
> interacts with multiple repositories, such as with linux-next [1] and
> torvalds/linux [2]), and the user knows which local branch is likely to
> have commits in common with the upstream branch they are fetching.
> 
> [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/
> [2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/
> 
> Signed-off-by: Jonathan Tan 
> ---
> v2 is exactly the same as the original, except with user-facing
> documentation in Documentation/fetch-options.txt.
> 
> > What's the plan to expose this "feature" to end-users?  There is no
> > end-user facing documentation added by this patch, and in-code
> > comments only talk about what (mechanical) effect the option has,
> > but not when a user may want to use the feature, or how the user
> > would best decide the set of commits to pass to this new option.
> 
> Jonathan Nieder also mentioned this. Lack of documentation was an
> oversight, sorry. I've added it in this version.
> 
> > Would something like this
> >
> > git fetch $(git for-each-ref \
> > --format=--nego-tip="%(objectname)" \
> > refs/remotes/linux-next/) \
> > linux-next
> >
> > be an expected typical way to pull from one remote, exposing only
> > the tips of refs we got from that remote and not the ones we
> > obtained from other places?
> 
> Yes, that is one way. Alternatively, if the user is only fetching one
> branch, they may also want to specify a single branch.
> ---
>  Documentation/fetch-options.txt | 12 +++
>  builtin/fetch.c | 21 +
>  fetch-pack.c| 19 ++--
>  fetch-pack.h|  7 +
>  t/t5510-fetch.sh| 55 +
>  transport-helper.c  |  3 ++
>  transport.c |  1 +
>  transport.h | 10 ++
>  8 files changed, 126 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
> index 97d3217df9..80c4c94595 100644
> --- a/Documentation/fetch-options.txt
> +++ b/Documentation/fetch-options.txt
> @@ -42,6 +42,18 @@ the current repository has the same history as the source 
> repository.
>   .git/shallow. This option updates .git/shallow and accept such
>   refs.
>  
> +--negotiation-tip::
> + By default, Git will report, to the server, commits reachable
> + from all local refs to find common commits in an attempt to
> + reduce the size of the to-be-received packfile. If specified,
> + Git will only report commits reachable from the given commit.
> + This is useful to speed up fetches when the user knows which
> + local ref is likely to have commits in common with the
> + upstream ref being fetched.

This seems like a pretty difficult to use feature, requiring that I
provide the actual OIDs.  I think a much better UI would probably be to
accept a number of different things ranging from exact OIDs to actual
ref names or even better, allowing for ref-patterns which include globs.
That way I can do the following:
  
  git fetch --negotiation-tip=refs/remotes/my-remote/* my-remote

in order to easily limit the tips to all the refs I have from that
particular remote.

> ++
> +This option may be specified more than once; if so, Git will report
> +commits reachable from any of the given commits.
> +
>  ifndef::git-pull[]
>  --dry-run::
>   Show what would be done, without making any changes.
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index ea5b9669ad..12daec0f3b 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -63,6 +63,7 @@ static int shown_url = 0;
>  static struct refspec refmap = REFSPEC_INIT_FETCH;
>  static struct list_objects_filter_options filter_options;
>  static struct string_list server_options = STRING_LIST_INIT_DUP;
> +static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
>  
>  static int git_fetch_config(const char *k, const char *v, void *cb)
>  {
> @@ -174,6 +175,8 @@ static struct option builtin_fetch_options[] = {
>   TRANSPORT_FAMILY_IPV4),
>   OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
>   TRANSPORT_FAMILY_IPV6),
> + OPT_STRING_LIST(0, "negotiation-tip", _tip, N_("revision"),
> + N_("report that we have only objects reachable from 
> this object")),
>

Re: [PATCH v1 0/9] Introducing remote ODBs

2018-06-28 Thread Christian Couder

On Tue, Jun 26, 2018 at 2:37 AM, Eric Sunshine  wrote:

> In addition to the t5702 failures, I'm also seeing failures of
> t0410.1, t5616.6 and t5616.7 at the tip of 'pu' as of [1], all of
> which seem to be related to these changes.

Yeah but only s/core.partialclonefilter/odb.origin.partialclonefilter/
is needed on top of the fix for for the master branch.

Thanks for the report!

Re: [PATCH 29/29] t/test-lib: teach --chain-lint to detect broken &&-chains in subshells

2018-06-28 Thread Jeff King

On Tue, Jun 26, 2018 at 07:15:45PM -0700, Elijah Newren wrote:

> Crazy idea: maybe we could defang it a little more thoroughly with
> something like the following (apologies in advance if gmail whitespace
> damages this):
> 
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 28315706be..7fda08a90a 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -675,7 +675,7 @@ test_run_ () {
> trace=
> # 117 is magic because it is unlikely to match the exit
> # code of other programs
> -   if test "OK-117" != "$(test_eval_ "(exit 117) &&
> $1${LF}${LF}echo OK-\$?" 3>&1)"
> +   if test "OK-117" != "$(test_eval_ "cd() { return 0; }
> && PATH=/dev/null && export PATH && (exit 117) && $1${LF}${LF}echo
> OK-\$?" 3>&1)"

Clever. We'd still run shell builtins, which is why you need the cd()
above. There may be others, but at least it narrows things down. Unless
the shell is busybox or something, and implements everything as a
builtin. :)

I agree on the point elsewhere of returning non-zero (and the items
missing from PATH should do that, which is good).

-Peff

1 2 >

1 - 100 of 141 matches

Mail list logo