[PATCH 1/3] .mailmap: merge different spellings of names
This is a continuation of 94b410bba86 (.mailmap: Map email addresses to names, 2013-07-12), merging names that are spelled differently but have the same author email to the same person. Most spellings differed in accents or the order of names. Signed-off-by: Stefan Beller --- .mailmap | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/.mailmap b/.mailmap index df7cf6313c7..f165222a782 100644 --- a/.mailmap +++ b/.mailmap @@ -35,11 +35,13 @@ Chris Wright Cord Seele Christian Couder Christian Stimming +Christopher Díaz Riveros Christopher Diaz Riveros Csaba Henk Dan Johnson Dana L. How Dana L. How Dana How Daniel Barkalow +Daniel Knittl-Frank knittl Daniel Trstenjak Daniel Trstenjak David Brown @@ -57,6 +59,7 @@ Eric S. Raymond Eric Wong Erik Faye-Lund Eyvind Bernhardsen +Fangyi Zhou Zhou Fangyi Florian Achleitner Franck Bui-Huu Frank Lichtenheld @@ -86,6 +89,8 @@ Jason McMullan Jason Riedy Jason Riedy Jay Soffian +Jean-Noël Avila Jean-Noel Avila +Jean-Noël Avila Jean-Noël AVILA Jeff King Jeff Muizelaar Jens Axboe @@ -149,6 +154,7 @@ Matt Draisey Matt Kraai Matt McCutchen Matthias Kestenholz +Matthias Rüster Matthias Ruester Matthias Urlichs Matthias Urlichs Michael Coleman @@ -213,6 +219,8 @@ Sean Estabrooks Sebastian Schuberth Seth Falcon Shawn O. Pearce +Wei Shuyu Shuyu Wei +Sidhant Sharma Sidhant Sharma [:tk] Simon Hausmann Simon Hausmann Stefan Beller @@ -253,7 +261,8 @@ Uwe Kleine-König Uwe Kleine-König Ville Skyttä -Vitaly "_Vi" Shukela +Vitaly "_Vi" Shukela +Vitaly "_Vi" Shukela Vitaly _Vi Shukela W. Trevor King William Pursell YONETANI Tomokazu -- 2.18.0.399.gad0ab374a1-goog
[PATCH 2/3] .mailmap: assume Jason McMullan to be the same person
Over the years of contributing to open source, I realized the world of open source is smaller than I originally thought and a name is still a pretty unique thing. So let's assume these two author idents are the same person. In 10813e0d3c7 (.mailmap: update long-lost friends with multiple defunct addresses, 2013-08-12) we learned both their email bounce, so asking is no option. Signed-off-by: Stefan Beller --- .mailmap | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/.mailmap b/.mailmap index f165222a782..ff96ef7401f 100644 --- a/.mailmap +++ b/.mailmap @@ -82,10 +82,7 @@ J. Bruce Fields J. Bruce Fields Jakub Narębski James Y Knight -# The 2 following authors are probably the same person, -# but both emails bounce. -Jason McMullan -Jason McMullan +Jason McMullan Jason McMullan Jason Riedy Jason Riedy Jay Soffian -- 2.18.0.399.gad0ab374a1-goog
[PATCH 3/3] .mailmap: map names with multiple emails to the same author identity
There are multiple author idents who have different email addresses, but the same name; assume they are the same person, as the world of open source is actually rather small. Sift through the history via: git shortlog -sne origin/pu |awk '{ NF--; $1=""; print }' |sort |uniq -d |cut -c 2- >names while read name; do current_name=$(git log --author "$name" --format="%an <%ae>" -1) git log --author "$name" --format="%an <%ae>" |sort |uniq >all_names while read one_name; do if [ "$one_name" != "$current_name" ] then echo $current_name $one_name >> out fi done --- .mailmap | 185 --- 1 file changed, 134 insertions(+), 51 deletions(-) diff --git a/.mailmap b/.mailmap index ff96ef7401f..2607846582a 100644 --- a/.mailmap +++ b/.mailmap @@ -5,54 +5,86 @@ # same person appearing not to be so. # - +Adam Simpkins Adam Simpkins Alejandro R. Sedeño +Alexander Gavrilov +Alexander Kuleshov +Alex Bennée Alex Bennée Alex Bennée +Alexey Shumkin +Alexey Shumkin +Alex Riesen Alex Riesen +Alex Riesen Alex Riesen +Alex Riesen Alex Riesen +Alex Riesen Alex Riesen Alex Riesen Alex Riesen Alex Riesen Alex Vandiver -Alexander Gavrilov -Alexander Kuleshov -Alexey Shumkin -Alexey Shumkin +Alex Vandiver Alex Vandiver +Alex Vandiver Alex Vandiver +Amos Waterland +Amos Waterland Anders Kaseorg Anders Kaseorg +Andreas Ericsson Andreas Ericsson +Andreas Schwab Andreas Schwab Aneesh Kumar K.V -Amos Waterland -Amos Waterland -Ben Walton +# the two anonymous contributors are different persons: +anonymous +anonymous +Antonio Ospite Antonio Ospite +Avery Pennarun Avery Pennarun +Beat Bolli Beat Bolli +Benoit Person Benoit Person Benoit Sigoure +Ben Peart Ben Peart +Ben Peart Ben Peart +Ben Walton Bernt Hansen Brandon Casey +Brian Gernhardt Brian Gernhardt +Brian Gernhardt Brian Gernhardt brian m. carlson Brian M. Carlson brian m. carlson Bryan Larsen Bryan Larsen +Carlos Martín Nieto Carlos Martín Nieto +Charles Bailey Charles Bailey Cheng Renquan Chris Shoemaker -Chris Wright -Cord Seele Christian Couder +Christian Ludwig Christian Ludwig Christian Stimming Christopher Díaz Riveros Christopher Diaz Riveros +Chris Wright +Clemens Buchacher Clemens Buchacher +Clemens Buchacher Clemens Buchacher +Cord Seele Csaba Henk -Dan Johnson -Dana L. How Dana L. How Dana How +Dana L. How Daniel Barkalow +Daniel Knittl-Frank Daniel Knittl-Frank Daniel Knittl-Frank knittl Daniel Trstenjak Daniel Trstenjak +Dan Johnson +David Barr David Barr David Brown David D. Kilzer David Kågedal David Reiss David S. Miller +David Turner David Turner +David Turner David Turner +David Turner David Turner David Turner David Turner Deskin Miller Dirk Süsserott +Dotan Barak Dotan Barak +Edward Thomson Edward Thomson Eric Blake Eric Hanchrow Eric S. Raymond @@ -64,28 +96,33 @@ Florian Achleitner Frank Lichtenheld Frank Lichtenheld -Fredrik Kuivinen Frédéric Heitzmann +Fredrik Kuivinen Garry Dolley -Greg Price +Grégoire Paris Grégoire Paris Greg Price +Greg Price +Han-Wen Nienhuys Han-Wen Nienhuys +Hartmut Henkel Hartmut Henkel Heiko Voigt H. Merijn Brand H.Merijn Brand +Horst H. von Brand H. Peter Anvin H. Peter Anvin H. Peter Anvin H. Peter Anvin -Han-Wen Nienhuys Han-Wen Nienhuys -Horst H. von Brand -J. Bruce Fields -J. Bruce Fields -J. Bruce Fields +İsmail Dönmez +Jacob Keller Jacob Keller Jakub Narębski James Y Knight Jason McMullan Jason McMullan -Jason Riedy +Jason McMullan Jason McMullan Jason Riedy +Jason Riedy Jay Soffian +J. Bruce Fields +J. Bruce Fields +J. Bruce Fields Jean-Noël Avila Jean-Noel Avila Jean-Noël Avila Jean-Noël AVILA Jeff King @@ -93,18 +130,22 @@ Jeff Muizelaar Jens Axboe Jens Axboe Jens Lindström Jens Lindstrom +Jiang Xin Jiang Xin Jim Meyering Joachim Berdal Haga +Joachim Jablon Joachim Jablon +Joey Hess Joey Hess Johannes Schindelin +Johannes Sixt Johannes Sixt Johannes Sixt -Johannes Sixt John 'Warthog9' Hawley +Jonathan del Strother +Jonathan Nieder +Jonathon Mah Jonathon Mah Jon Loeliger Jon Loeliger Jon Seymour -Jonathan Nieder -Jonathan del Strother Josh Triplett Josh Triplett Julian Phillips @@ -116,73 +157,107 @@ Junio C Hamano Junio C Hamano Junio C Hamano Kaartic Sivaraam +Kacper Kornet Kacper Kornet Karl Wiberg Karl Hasselström Karl Wiberg Karsten Blees Karsten Blees -Kay Sievers Kay Sievers +Kay Sievers Kazuki Saitoh kazuki saitoh Keith Cascio Kent Engstrom Kevin Leung +Kirill A. Shutemov Kirill A. Shutemov Kirill Smelkov Kirill Smelkov +Kirill Smelkov Kirill
[PATCH v3 13/32] tag: add repository argument to lookup_tag
Add a repository argument to allow the callers of lookup_tag to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- builtin/describe.c | 6 +++--- builtin/pack-objects.c | 2 +- builtin/replace.c | 2 +- log-tree.c | 2 +- object.c | 2 +- sha1-name.c| 2 +- tag.c | 4 ++-- tag.h | 4 ++-- 8 files changed, 12 insertions(+), 12 deletions(-) diff --git a/builtin/describe.c b/builtin/describe.c index c8ff64766d0..41606c8a900 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -93,13 +93,13 @@ static int replace_name(struct commit_name *e, struct tag *t; if (!e->tag) { - t = lookup_tag(>oid); + t = lookup_tag(the_repository, >oid); if (!t || parse_tag(t)) return 1; e->tag = t; } - t = lookup_tag(oid); + t = lookup_tag(the_repository, oid); if (!t || parse_tag(t)) return 0; *tag = t; @@ -267,7 +267,7 @@ static unsigned long finish_depth_computation( static void append_name(struct commit_name *n, struct strbuf *dst) { if (n->prio == 2 && !n->tag) { - n->tag = lookup_tag(>oid); + n->tag = lookup_tag(the_repository, >oid); if (!n->tag || parse_tag(n->tag)) die(_("annotated tag %s not available"), n->path); } diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 69d3d7b82af..6565c800ac3 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2474,7 +2474,7 @@ static void add_tag_chain(const struct object_id *oid) if (packlist_find(_pack, oid->hash, NULL)) return; - tag = lookup_tag(oid); + tag = lookup_tag(the_repository, oid); while (1) { if (!tag || parse_tag(tag) || !tag->tagged) die("unable to pack objects reachable from tag %s", diff --git a/builtin/replace.c b/builtin/replace.c index 0232f98f020..0351b7c62cf 100644 --- a/builtin/replace.c +++ b/builtin/replace.c @@ -402,7 +402,7 @@ static int check_one_mergetag(struct commit *commit, int i; hash_object_file(extra->value, extra->len, type_name(OBJ_TAG), _oid); - tag = lookup_tag(_oid); + tag = lookup_tag(the_repository, _oid); if (!tag) return error(_("bad mergetag in commit '%s'"), ref); if (parse_tag_buffer(tag, extra->value, extra->len)) diff --git a/log-tree.c b/log-tree.c index abe67e8b2e4..840423ca149 100644 --- a/log-tree.c +++ b/log-tree.c @@ -498,7 +498,7 @@ static int show_one_mergetag(struct commit *commit, size_t payload_size, gpg_message_offset; hash_object_file(extra->value, extra->len, type_name(OBJ_TAG), ); - tag = lookup_tag(); + tag = lookup_tag(the_repository, ); if (!tag) return -1; /* error message already given */ diff --git a/object.c b/object.c index f08a8874de3..bcfcfd38760 100644 --- a/object.c +++ b/object.c @@ -223,7 +223,7 @@ struct object *parse_object_buffer_the_repository(const struct object_id *oid, e obj = >object; } } else if (type == OBJ_TAG) { - struct tag *tag = lookup_tag(oid); + struct tag *tag = lookup_tag(the_repository, oid); if (tag) { if (parse_tag_buffer(tag, buffer, size)) return NULL; diff --git a/sha1-name.c b/sha1-name.c index 98480ade12d..5854bc75fe2 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -358,7 +358,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data) format_commit_message(commit, " %ad - %s", , ); } } else if (type == OBJ_TAG) { - struct tag *tag = lookup_tag(oid); + struct tag *tag = lookup_tag(the_repository, oid); if (!parse_tag(tag) && tag->tag) strbuf_addf(, " %s", tag->tag); } diff --git a/tag.c b/tag.c index 5dcdf7bf6f9..5b41fc71fad 100644 --- a/tag.c +++ b/tag.c @@ -92,7 +92,7 @@ struct object *deref_tag_noverify(struct object *o) return o; } -struct tag *lookup_tag(const struct object_id *oid) +struct tag *lookup_tag_the_repository(const struct object_id *oid) { struct object *obj = lookup_object(the_repository, oid->hash); if (!obj) @@ -160,7 +160,7 @@ int parse_tag_buffer(struct tag *item, const void
[PATCH v3 07/32] commit: add repository argument to lookup_commit_reference_gently
Add a repository argument to allow callers of lookup_commit_reference_gently to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller --- archive.c | 2 +- blame.c | 3 ++- builtin/checkout.c| 6 +++--- builtin/describe.c| 5 +++-- builtin/fetch.c | 9 ++--- builtin/reflog.c | 10 ++ builtin/show-branch.c | 3 ++- bundle.c | 2 +- commit-graph.c| 2 +- commit.c | 6 +++--- commit.h | 5 - fast-import.c | 6 -- notes-cache.c | 3 ++- ref-filter.c | 6 -- remote.c | 9 + sequencer.c | 2 +- sha1-name.c | 4 ++-- shallow.c | 9 ++--- walker.c | 3 ++- wt-status.c | 2 +- 20 files changed, 59 insertions(+), 38 deletions(-) diff --git a/archive.c b/archive.c index 875dab64b60..78b0a398a0f 100644 --- a/archive.c +++ b/archive.c @@ -380,7 +380,7 @@ static void parse_treeish_arg(const char **argv, if (get_oid(name, )) die("Not a valid object name"); - commit = lookup_commit_reference_gently(, 1); + commit = lookup_commit_reference_gently(the_repository, , 1); if (commit) { commit_sha1 = commit->object.oid.hash; archive_time = commit->date; diff --git a/blame.c b/blame.c index 0c4490a35bb..5b022cc2254 100644 --- a/blame.c +++ b/blame.c @@ -1712,7 +1712,8 @@ static struct commit *dwim_reverse_initial(struct rev_info *revs, /* Do we have HEAD? */ if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, _oid, NULL)) return NULL; - head_commit = lookup_commit_reference_gently(_oid, 1); + head_commit = lookup_commit_reference_gently(the_repository, +_oid, 1); if (!head_commit) return NULL; diff --git a/builtin/checkout.c b/builtin/checkout.c index 28627650cd6..40c27bf54d7 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -379,7 +379,7 @@ static int checkout_paths(const struct checkout_opts *opts, die(_("unable to write new index file")); read_ref_full("HEAD", 0, , NULL); - head = lookup_commit_reference_gently(, 1); + head = lookup_commit_reference_gently(the_repository, , 1); errs |= post_checkout_hook(head, head, 0); return errs; @@ -830,7 +830,7 @@ static int switch_branches(const struct checkout_opts *opts, memset(_branch_info, 0, sizeof(old_branch_info)); old_branch_info.path = path_to_free = resolve_refdup("HEAD", 0, , ); if (old_branch_info.path) - old_branch_info.commit = lookup_commit_reference_gently(, 1); + old_branch_info.commit = lookup_commit_reference_gently(the_repository, , 1); if (!(flag & REF_ISSYMREF)) old_branch_info.path = NULL; @@ -1004,7 +1004,7 @@ static int parse_branchname_arg(int argc, const char **argv, else new_branch_info->path = NULL; /* not an existing branch */ - new_branch_info->commit = lookup_commit_reference_gently(rev, 1); + new_branch_info->commit = lookup_commit_reference_gently(the_repository, rev, 1); if (!new_branch_info->commit) { /* not a commit */ *source_tree = parse_tree_indirect(rev); diff --git a/builtin/describe.c b/builtin/describe.c index 1e87f68d5ee..0b5d8562331 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -331,7 +331,8 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst) init_commit_names(_names); n = hashmap_iter_first(, ); for (; n; n = hashmap_iter_next()) { - c = lookup_commit_reference_gently(>peeled, 1); + c = lookup_commit_reference_gently(the_repository, + >peeled, 1); if (c) *commit_names_at(_names, c) = n; } @@ -509,7 +510,7 @@ static void describe(const char *arg, int last_one) if (get_oid(arg, )) die(_("Not a valid object name %s"), arg); - cmit = lookup_commit_reference_gently(, 1); + cmit = lookup_commit_reference_gently(the_repository, , 1); if (cmit) describe_commit(, ); diff --git a/builtin/fetch.c b/builtin/fetch.c index 83f36d7cdeb..f5d960baecf 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -684,8 +684,10 @@ static int update_local_ref(struct ref *ref, return r; } -
[PATCH v3 09/32] commit: add repository argument to lookup_commit
Add a repository argument to allow callers of lookup_commit to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller --- builtin/am.c| 3 ++- builtin/commit-tree.c | 4 +++- builtin/diff-tree.c | 2 +- builtin/fast-export.c | 2 +- builtin/fmt-merge-msg.c | 2 +- builtin/merge-base.c| 2 +- builtin/verify-commit.c | 4 +++- commit-graph.c | 10 +- commit.c| 7 --- commit.h| 3 ++- fetch-pack.c| 5 +++-- log-tree.c | 2 +- notes-utils.c | 4 +++- object.c| 2 +- sequencer.c | 4 ++-- sha1-name.c | 2 +- shallow.c | 17 ++--- tag.c | 2 +- tree.c | 2 +- 19 files changed, 46 insertions(+), 33 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index 72e928cee79..b6eeb46c4b6 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -1633,7 +1633,8 @@ static void do_commit(const struct am_state *state) if (!get_oid_commit("HEAD", )) { old_oid = - commit_list_insert(lookup_commit(), ); + commit_list_insert(lookup_commit(the_repository, ), + ); } else { old_oid = NULL; say(state, stderr, _("applying to an empty history")); diff --git a/builtin/commit-tree.c b/builtin/commit-tree.c index 9fbd3529fb1..9ec36a82b63 100644 --- a/builtin/commit-tree.c +++ b/builtin/commit-tree.c @@ -6,6 +6,7 @@ #include "cache.h" #include "config.h" #include "object-store.h" +#include "repository.h" #include "commit.h" #include "tree.h" #include "builtin.h" @@ -60,7 +61,8 @@ int cmd_commit_tree(int argc, const char **argv, const char *prefix) if (get_oid_commit(argv[i], )) die("Not a valid object name %s", argv[i]); assert_oid_type(, OBJ_COMMIT); - new_parent(lookup_commit(), ); + new_parent(lookup_commit(the_repository, ), +); continue; } diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c index a5718d96ee2..91ba67070e2 100644 --- a/builtin/diff-tree.c +++ b/builtin/diff-tree.c @@ -25,7 +25,7 @@ static int stdin_diff_commit(struct commit *commit, const char *p) /* Graft the fake parents locally to the commit */ while (isspace(*p++) && !parse_oid_hex(p, , )) { - struct commit *parent = lookup_commit(); + struct commit *parent = lookup_commit(the_repository, ); if (!pptr) { /* Free the real parent list */ free_commit_list(commit->parents); diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 7d6b1d8aea2..223499d7ca4 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -963,7 +963,7 @@ static void import_marks(char *input_file) /* only commits */ continue; - commit = lookup_commit(); + commit = lookup_commit(the_repository, ); if (!commit) die("not a commit? can't happen: %s", oid_to_hex()); diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c index 5e44589b545..36318ef46e7 100644 --- a/builtin/fmt-merge-msg.c +++ b/builtin/fmt-merge-msg.c @@ -572,7 +572,7 @@ static void find_merge_parents(struct merge_parents *result, commit_list_insert(parent, ); add_merge_parent(result, >oid, >object.oid); } - head_commit = lookup_commit(head); + head_commit = lookup_commit(the_repository, head); if (head_commit) commit_list_insert(head_commit, ); reduce_heads_replace(); diff --git a/builtin/merge-base.c b/builtin/merge-base.c index bbead6f33e5..08d91b1f0c0 100644 --- a/builtin/merge-base.c +++ b/builtin/merge-base.c @@ -124,7 +124,7 @@ static void add_one_commit(struct object_id *oid, struct rev_collect *revs) if (is_null_oid(oid)) return; - commit = lookup_commit(oid); + commit = lookup_commit(the_repository, oid); if (!commit || (commit->object.flags & TMP_MARK) || parse_commit(commit)) diff --git a/builtin/verify-commit.c b/builtin/verify-commit.c index f6922da16d6..7772c07ed7a 100644 --- a/builtin/verify-commit.c +++ b/builtin/verify-commit.c @@ -9,6 +9,7 @@ #include "config.h" #include "builtin.h" #include "object-store.h" +#include "repository.h" #include "commit.h" #include
[PATCH v3 27/32] commit.c: allow get_cached_commit_buffer to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c | 4 ++-- commit.h | 3 +-- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/commit.c b/commit.c index dd8c9c15b14..15b044331a1 100644 --- a/commit.c +++ b/commit.c @@ -283,10 +283,10 @@ void set_commit_buffer(struct repository *r, struct commit *commit, void *buffer v->size = size; } -const void *get_cached_commit_buffer_the_repository(const struct commit *commit, unsigned long *sizep) +const void *get_cached_commit_buffer(struct repository *r, const struct commit *commit, unsigned long *sizep) { struct commit_buffer *v = buffer_slab_peek( - the_repository->parsed_objects->buffer_slab, commit); + r->parsed_objects->buffer_slab, commit); if (!v) { if (sizep) *sizep = 0; diff --git a/commit.h b/commit.h index 7297af467b9..d61585df5bc 100644 --- a/commit.h +++ b/commit.h @@ -103,8 +103,7 @@ void set_commit_buffer(struct repository *r, struct commit *, void *buffer, unsi * Get any cached object buffer associated with the commit. Returns NULL * if none. The resulting memory should not be freed. */ -#define get_cached_commit_buffer(r, c, s) get_cached_commit_buffer_##r(c, s) -const void *get_cached_commit_buffer_the_repository(const struct commit *, unsigned long *size); +const void *get_cached_commit_buffer(struct repository *, const struct commit *, unsigned long *size); /* * Get the commit's object contents, either from cache or by reading the object -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 26/32] commit.c: allow set_commit_buffer to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c | 4 ++-- commit.h | 3 +-- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/commit.c b/commit.c index 1baac77861f..dd8c9c15b14 100644 --- a/commit.c +++ b/commit.c @@ -275,10 +275,10 @@ void free_commit_buffer_slab(struct buffer_slab *bs) free(bs); } -void set_commit_buffer_the_repository(struct commit *commit, void *buffer, unsigned long size) +void set_commit_buffer(struct repository *r, struct commit *commit, void *buffer, unsigned long size) { struct commit_buffer *v = buffer_slab_at( - the_repository->parsed_objects->buffer_slab, commit); + r->parsed_objects->buffer_slab, commit); v->buffer = buffer; v->size = size; } diff --git a/commit.h b/commit.h index bea5e015b28..7297af467b9 100644 --- a/commit.h +++ b/commit.h @@ -97,8 +97,7 @@ void free_commit_buffer_slab(struct buffer_slab *bs); * Associate an object buffer with the commit. The ownership of the * memory is handed over to the commit, and must be free()-able. */ -#define set_commit_buffer(r, c, b, s) set_commit_buffer_##r(c, b, s) -void set_commit_buffer_the_repository(struct commit *, void *buffer, unsigned long size); +void set_commit_buffer(struct repository *r, struct commit *, void *buffer, unsigned long size); /* * Get any cached object buffer associated with the commit. Returns NULL -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 31/32] commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c | 8 commit.h | 4 +--- 2 files changed, 5 insertions(+), 7 deletions(-) diff --git a/commit.c b/commit.c index 15b044331a1..08b4602f43f 100644 --- a/commit.c +++ b/commit.c @@ -24,16 +24,16 @@ int save_commit_buffer = 1; const char *commit_type = "commit"; -struct commit *lookup_commit_reference_gently_the_repository( +struct commit *lookup_commit_reference_gently(struct repository *r, const struct object_id *oid, int quiet) { - struct object *obj = deref_tag(the_repository, - parse_object(the_repository, oid), + struct object *obj = deref_tag(r, + parse_object(r, oid), NULL, 0); if (!obj) return NULL; - return object_as_type(the_repository, obj, OBJ_COMMIT, quiet); + return object_as_type(r, obj, OBJ_COMMIT, quiet); } struct commit *lookup_commit_reference_the_repository(const struct object_id *oid) diff --git a/commit.h b/commit.h index d61585df5bc..f1f25957de4 100644 --- a/commit.h +++ b/commit.h @@ -67,9 +67,7 @@ struct commit *lookup_commit(struct repository *r, const struct object_id *oid); #define lookup_commit_reference(r, o) \ lookup_commit_reference_##r(o) struct commit *lookup_commit_reference_the_repository(const struct object_id *oid); -#define lookup_commit_reference_gently(r, o, q) \ - lookup_commit_reference_gently_##r(o, q) -struct commit *lookup_commit_reference_gently_the_repository( +struct commit *lookup_commit_reference_gently(struct repository *r, const struct object_id *oid, int quiet); struct commit *lookup_commit_reference_by_name(const char *name); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 29/32] object.c: allow parse_object to handle arbitrary repositories
Signed-off-by: Stefan Beller --- object.c | 14 +++--- object.h | 3 +-- 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/object.c b/object.c index cd870fee22b..b0faab85d40 100644 --- a/object.c +++ b/object.c @@ -245,28 +245,28 @@ struct object *parse_object_or_die(const struct object_id *oid, die(_("unable to parse object: %s"), name ? name : oid_to_hex(oid)); } -struct object *parse_object_the_repository(const struct object_id *oid) +struct object *parse_object(struct repository *r, const struct object_id *oid) { unsigned long size; enum object_type type; int eaten; - const struct object_id *repl = lookup_replace_object(the_repository, oid); + const struct object_id *repl = lookup_replace_object(r, oid); void *buffer; struct object *obj; - obj = lookup_object(the_repository, oid->hash); + obj = lookup_object(r, oid->hash); if (obj && obj->parsed) return obj; if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) || (!obj && has_object_file(oid) && -oid_object_info(the_repository, oid, NULL) == OBJ_BLOB)) { +oid_object_info(r, oid, NULL) == OBJ_BLOB)) { if (check_object_signature(repl, NULL, 0, NULL) < 0) { error("sha1 mismatch %s", oid_to_hex(oid)); return NULL; } - parse_blob_buffer(lookup_blob(the_repository, oid), NULL, 0); - return lookup_object(the_repository, oid->hash); + parse_blob_buffer(lookup_blob(r, oid), NULL, 0); + return lookup_object(r, oid->hash); } buffer = read_object_file(oid, , ); @@ -277,7 +277,7 @@ struct object *parse_object_the_repository(const struct object_id *oid) return NULL; } - obj = parse_object_buffer(the_repository, oid, type, size, + obj = parse_object_buffer(r, oid, type, size, buffer, ); if (!eaten) free(buffer); diff --git a/object.h b/object.h index 38198bb73a1..fa5ca975678 100644 --- a/object.h +++ b/object.h @@ -124,8 +124,7 @@ void *object_as_type(struct repository *r, struct object *obj, enum object_type * * Returns NULL if the object is missing or corrupt. */ -#define parse_object(r, oid) parse_object_##r(oid) -struct object *parse_object_the_repository(const struct object_id *oid); +struct object *parse_object(struct repository *r, const struct object_id *oid); /* * Like parse_object, but will die() instead of returning NULL. If the -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 32/32] commit.c: allow lookup_commit_reference to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c | 4 ++-- commit.h | 5 ++--- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/commit.c b/commit.c index 08b4602f43f..b88ced5b026 100644 --- a/commit.c +++ b/commit.c @@ -36,9 +36,9 @@ struct commit *lookup_commit_reference_gently(struct repository *r, return object_as_type(r, obj, OBJ_COMMIT, quiet); } -struct commit *lookup_commit_reference_the_repository(const struct object_id *oid) +struct commit *lookup_commit_reference(struct repository *r, const struct object_id *oid) { - return lookup_commit_reference_gently(the_repository, oid, 0); + return lookup_commit_reference_gently(r, oid, 0); } struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref_name) diff --git a/commit.h b/commit.h index f1f25957de4..8b2cf9692de 100644 --- a/commit.h +++ b/commit.h @@ -64,9 +64,8 @@ void add_name_decoration(enum decoration_type type, const char *name, struct obj const struct name_decoration *get_name_decoration(const struct object *obj); struct commit *lookup_commit(struct repository *r, const struct object_id *oid); -#define lookup_commit_reference(r, o) \ - lookup_commit_reference_##r(o) -struct commit *lookup_commit_reference_the_repository(const struct object_id *oid); +struct commit *lookup_commit_reference(struct repository *r, + const struct object_id *oid); struct commit *lookup_commit_reference_gently(struct repository *r, const struct object_id *oid, int quiet); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 30/32] tag.c: allow deref_tag to handle arbitrary repositories
Signed-off-by: Stefan Beller --- tag.c | 5 ++--- tag.h | 3 +-- 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/tag.c b/tag.c index 682e7793059..94a89b21cb5 100644 --- a/tag.c +++ b/tag.c @@ -64,12 +64,11 @@ int gpg_verify_tag(const struct object_id *oid, const char *name_to_report, return ret; } -struct object *deref_tag_the_repository(struct object *o, const char *warn, int warnlen) +struct object *deref_tag(struct repository *r, struct object *o, const char *warn, int warnlen) { while (o && o->type == OBJ_TAG) if (((struct tag *)o)->tagged) - o = parse_object(the_repository, -&((struct tag *)o)->tagged->oid); + o = parse_object(r, &((struct tag *)o)->tagged->oid); else o = NULL; if (!o && warn) { diff --git a/tag.h b/tag.h index efd4c7da67c..e669c3e497a 100644 --- a/tag.h +++ b/tag.h @@ -15,8 +15,7 @@ extern struct tag *lookup_tag(struct repository *r, const struct object_id *oid) extern int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, unsigned long size); extern int parse_tag(struct tag *item); extern void release_tag_memory(struct tag *t); -#define deref_tag(r, o, w, l) deref_tag_##r(o, w, l) -extern struct object *deref_tag_the_repository(struct object *, const char *, int); +extern struct object *deref_tag(struct repository *r, struct object *, const char *, int); extern struct object *deref_tag_noverify(struct object *); extern int gpg_verify_tag(const struct object_id *oid, const char *name_to_report, unsigned flags); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 28/32] object.c: allow parse_object_buffer to handle arbitrary repositories
Signed-off-by: Stefan Beller --- object.c | 18 +- object.h | 3 +-- 2 files changed, 10 insertions(+), 11 deletions(-) diff --git a/object.c b/object.c index 9d588448192..cd870fee22b 100644 --- a/object.c +++ b/object.c @@ -185,21 +185,21 @@ struct object *lookup_unknown_object(const unsigned char *sha1) return obj; } -struct object *parse_object_buffer_the_repository(const struct object_id *oid, enum object_type type, unsigned long size, void *buffer, int *eaten_p) +struct object *parse_object_buffer(struct repository *r, const struct object_id *oid, enum object_type type, unsigned long size, void *buffer, int *eaten_p) { struct object *obj; *eaten_p = 0; obj = NULL; if (type == OBJ_BLOB) { - struct blob *blob = lookup_blob(the_repository, oid); + struct blob *blob = lookup_blob(r, oid); if (blob) { if (parse_blob_buffer(blob, buffer, size)) return NULL; obj = >object; } } else if (type == OBJ_TREE) { - struct tree *tree = lookup_tree(the_repository, oid); + struct tree *tree = lookup_tree(r, oid); if (tree) { obj = >object; if (!tree->buffer) @@ -211,20 +211,20 @@ struct object *parse_object_buffer_the_repository(const struct object_id *oid, e } } } else if (type == OBJ_COMMIT) { - struct commit *commit = lookup_commit(the_repository, oid); + struct commit *commit = lookup_commit(r, oid); if (commit) { - if (parse_commit_buffer(the_repository, commit, buffer, size, 1)) + if (parse_commit_buffer(r, commit, buffer, size, 1)) return NULL; - if (!get_cached_commit_buffer(the_repository, commit, NULL)) { - set_commit_buffer(the_repository, commit, buffer, size); + if (!get_cached_commit_buffer(r, commit, NULL)) { + set_commit_buffer(r, commit, buffer, size); *eaten_p = 1; } obj = >object; } } else if (type == OBJ_TAG) { - struct tag *tag = lookup_tag(the_repository, oid); + struct tag *tag = lookup_tag(r, oid); if (tag) { - if (parse_tag_buffer(the_repository, tag, buffer, size)) + if (parse_tag_buffer(r, tag, buffer, size)) return NULL; obj = >object; } diff --git a/object.h b/object.h index f54a892bd10..38198bb73a1 100644 --- a/object.h +++ b/object.h @@ -138,8 +138,7 @@ struct object *parse_object_or_die(const struct object_id *oid, const char *name * parsing it. eaten_p indicates if the object has a borrowed copy * of buffer and the caller should not free() it. */ -#define parse_object_buffer(r, o, t, s, b, e) parse_object_buffer_##r(o, t, s, b, e) -struct object *parse_object_buffer_the_repository(const struct object_id *oid, enum object_type type, unsigned long size, void *buffer, int *eaten_p); +struct object *parse_object_buffer(struct repository *r, const struct object_id *oid, enum object_type type, unsigned long size, void *buffer, int *eaten_p); /** Returns the object, with potentially excess memory allocated. **/ struct object *lookup_unknown_object(const unsigned char *sha1); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 22/32] tag: allow parse_tag_buffer to handle arbitrary repositories
Signed-off-by: Stefan Beller --- tag.c | 10 +- tag.h | 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/tag.c b/tag.c index 46b5882ee12..682e7793059 100644 --- a/tag.c +++ b/tag.c @@ -126,7 +126,7 @@ void release_tag_memory(struct tag *t) t->date = 0; } -int parse_tag_buffer_the_repository(struct tag *item, const void *data, unsigned long size) +int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, unsigned long size) { struct object_id oid; char type[20]; @@ -154,13 +154,13 @@ int parse_tag_buffer_the_repository(struct tag *item, const void *data, unsigned bufptr = nl + 1; if (!strcmp(type, blob_type)) { - item->tagged = (struct object *)lookup_blob(the_repository, ); + item->tagged = (struct object *)lookup_blob(r, ); } else if (!strcmp(type, tree_type)) { - item->tagged = (struct object *)lookup_tree(the_repository, ); + item->tagged = (struct object *)lookup_tree(r, ); } else if (!strcmp(type, commit_type)) { - item->tagged = (struct object *)lookup_commit(the_repository, ); + item->tagged = (struct object *)lookup_commit(r, ); } else if (!strcmp(type, tag_type)) { - item->tagged = (struct object *)lookup_tag(the_repository, ); + item->tagged = (struct object *)lookup_tag(r, ); } else { error("Unknown type %s", type); item->tagged = NULL; diff --git a/tag.h b/tag.h index 6a160c91875..efd4c7da67c 100644 --- a/tag.h +++ b/tag.h @@ -12,8 +12,7 @@ struct tag { timestamp_t date; }; extern struct tag *lookup_tag(struct repository *r, const struct object_id *oid); -#define parse_tag_buffer(r, i, d, s) parse_tag_buffer_##r(i, d, s) -extern int parse_tag_buffer_the_repository(struct tag *item, const void *data, unsigned long size); +extern int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, unsigned long size); extern int parse_tag(struct tag *item); extern void release_tag_memory(struct tag *t); #define deref_tag(r, o, w, l) deref_tag_##r(o, w, l) -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 23/32] commit.c: allow parse_commit_buffer to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c | 10 +- commit.h | 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/commit.c b/commit.c index 8749e151451..41d23352098 100644 --- a/commit.c +++ b/commit.c @@ -364,7 +364,7 @@ const void *detach_commit_buffer(struct commit *commit, unsigned long *sizep) return ret; } -int parse_commit_buffer_the_repository(struct commit *item, const void *buffer, unsigned long size, int check_graph) +int parse_commit_buffer(struct repository *r, struct commit *item, const void *buffer, unsigned long size, int check_graph) { const char *tail = buffer; const char *bufptr = buffer; @@ -384,11 +384,11 @@ int parse_commit_buffer_the_repository(struct commit *item, const void *buffer, if (get_oid_hex(bufptr + 5, ) < 0) return error("bad tree pointer in commit %s", oid_to_hex(>object.oid)); - item->maybe_tree = lookup_tree(the_repository, ); + item->maybe_tree = lookup_tree(r, ); bufptr += tree_entry_len + 1; /* "tree " + "hex sha1" + "\n" */ pptr = >parents; - graft = lookup_commit_graft(the_repository, >object.oid); + graft = lookup_commit_graft(r, >object.oid); while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) { struct commit *new_parent; @@ -403,7 +403,7 @@ int parse_commit_buffer_the_repository(struct commit *item, const void *buffer, */ if (graft && (graft->nr_parent < 0 || grafts_replace_parents)) continue; - new_parent = lookup_commit(the_repository, ); + new_parent = lookup_commit(r, ); if (new_parent) pptr = _list_insert(new_parent, pptr)->next; } @@ -411,7 +411,7 @@ int parse_commit_buffer_the_repository(struct commit *item, const void *buffer, int i; struct commit *new_parent; for (i = 0; i < graft->nr_parent; i++) { - new_parent = lookup_commit(the_repository, + new_parent = lookup_commit(r, >parent[i]); if (!new_parent) continue; diff --git a/commit.h b/commit.h index 27888d82468..e9cb5e9 100644 --- a/commit.h +++ b/commit.h @@ -81,8 +81,7 @@ struct commit *lookup_commit_reference_by_name(const char *name); */ struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref_name); -#define parse_commit_buffer(r, i, b, s, g) parse_commit_buffer_##r(i, b, s, g) -int parse_commit_buffer_the_repository(struct commit *item, const void *buffer, unsigned long size, int check_graph); +int parse_commit_buffer(struct repository *r, struct commit *item, const void *buffer, unsigned long size, int check_graph); int parse_commit_gently(struct commit *item, int quiet_on_missing); static inline int parse_commit(struct commit *item) { -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 24/32] commit-slabs: remove realloc counter outside of slab struct
The realloc counter is declared outside the struct for the given slabname, which makes it harder for a follow up patch to move the declaration of the struct around as then the counter variable would need special treatment. As the reallocation counter is currently unused we can just remove it. If we ever need to count the reallocations again, we can reintroduce the counter as part of 'struct slabname' in commit-slab-decl.h. Signed-off-by: Stefan Beller --- commit-slab-impl.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/commit-slab-impl.h b/commit-slab-impl.h index 87a9cadfcca..ac1e6d409ad 100644 --- a/commit-slab-impl.h +++ b/commit-slab-impl.h @@ -11,8 +11,6 @@ #define implement_commit_slab(slabname, elemtype, scope) \ \ -static int stat_ ##slabname## realloc; \ - \ scope void init_ ##slabname## _with_stride(struct slabname *s, \ unsigned stride) \ { \ @@ -54,7 +52,6 @@ scope elemtype *slabname## _at_peek(struct slabname *s, \ if (!add_if_missing)\ return NULL;\ REALLOC_ARRAY(s->slab, nth_slab + 1); \ - stat_ ##slabname## realloc++; \ for (i = s->slab_count; i <= nth_slab; i++) \ s->slab[i] = NULL; \ s->slab_count = nth_slab + 1; \ -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 25/32] commit.c: migrate the commit buffer to the parsed object store
Signed-off-by: Stefan Beller --- commit.c | 29 +++-- commit.h | 4 object.c | 5 + object.h | 4 4 files changed, 36 insertions(+), 6 deletions(-) diff --git a/commit.c b/commit.c index 41d23352098..1baac77861f 100644 --- a/commit.c +++ b/commit.c @@ -261,18 +261,32 @@ struct commit_buffer { unsigned long size; }; define_commit_slab(buffer_slab, struct commit_buffer); -static struct buffer_slab buffer_slab = COMMIT_SLAB_INIT(1, buffer_slab); + +struct buffer_slab *allocate_commit_buffer_slab(void) +{ + struct buffer_slab *bs = xmalloc(sizeof(*bs)); + init_buffer_slab(bs); + return bs; +} + +void free_commit_buffer_slab(struct buffer_slab *bs) +{ + clear_buffer_slab(bs); + free(bs); +} void set_commit_buffer_the_repository(struct commit *commit, void *buffer, unsigned long size) { - struct commit_buffer *v = buffer_slab_at(_slab, commit); + struct commit_buffer *v = buffer_slab_at( + the_repository->parsed_objects->buffer_slab, commit); v->buffer = buffer; v->size = size; } const void *get_cached_commit_buffer_the_repository(const struct commit *commit, unsigned long *sizep) { - struct commit_buffer *v = buffer_slab_peek(_slab, commit); + struct commit_buffer *v = buffer_slab_peek( + the_repository->parsed_objects->buffer_slab, commit); if (!v) { if (sizep) *sizep = 0; @@ -304,14 +318,16 @@ const void *get_commit_buffer(const struct commit *commit, unsigned long *sizep) void unuse_commit_buffer(const struct commit *commit, const void *buffer) { - struct commit_buffer *v = buffer_slab_peek(_slab, commit); + struct commit_buffer *v = buffer_slab_peek( + the_repository->parsed_objects->buffer_slab, commit); if (!(v && v->buffer == buffer)) free((void *)buffer); } void free_commit_buffer(struct commit *commit) { - struct commit_buffer *v = buffer_slab_peek(_slab, commit); + struct commit_buffer *v = buffer_slab_peek( + the_repository->parsed_objects->buffer_slab, commit); if (v) { FREE_AND_NULL(v->buffer); v->size = 0; @@ -347,7 +363,8 @@ void release_commit_memory(struct commit *c) const void *detach_commit_buffer(struct commit *commit, unsigned long *sizep) { - struct commit_buffer *v = buffer_slab_peek(_slab, commit); + struct commit_buffer *v = buffer_slab_peek( + the_repository->parsed_objects->buffer_slab, commit); void *ret; if (!v) { diff --git a/commit.h b/commit.h index e9cb5e9..bea5e015b28 100644 --- a/commit.h +++ b/commit.h @@ -89,6 +89,10 @@ static inline int parse_commit(struct commit *item) } void parse_commit_or_die(struct commit *item); +struct buffer_slab; +struct buffer_slab *allocate_commit_buffer_slab(void); +void free_commit_buffer_slab(struct buffer_slab *bs); + /* * Associate an object buffer with the commit. The ownership of the * memory is handed over to the commit, and must be free()-able. diff --git a/object.c b/object.c index 9d74de95f5b..9d588448192 100644 --- a/object.c +++ b/object.c @@ -467,6 +467,8 @@ struct parsed_object_pool *parsed_object_pool_new(void) o->is_shallow = -1; o->shallow_stat = xcalloc(1, sizeof(*o->shallow_stat)); + o->buffer_slab = allocate_commit_buffer_slab(); + return o; } @@ -541,6 +543,9 @@ void parsed_object_pool_clear(struct parsed_object_pool *o) FREE_AND_NULL(o->obj_hash); o->obj_hash_size = 0; + free_commit_buffer_slab(o->buffer_slab); + o->buffer_slab = NULL; + clear_alloc_state(o->blob_state); clear_alloc_state(o->tree_state); clear_alloc_state(o->commit_state); diff --git a/object.h b/object.h index 0d7d74129b6..f54a892bd10 100644 --- a/object.h +++ b/object.h @@ -1,6 +1,8 @@ #ifndef OBJECT_H #define OBJECT_H +struct buffer_slab; + struct parsed_object_pool { struct object **obj_hash; int nr_objs, obj_hash_size; @@ -22,6 +24,8 @@ struct parsed_object_pool { char *alternate_shallow_file; int commit_graft_prepared; + + struct buffer_slab *buffer_slab; }; struct parsed_object_pool *parsed_object_pool_new(void); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 00/32] object-store: lookup_commit
This continues the elimination of global variables in the object store and teaches lookup_commit[_reference] and alike to handle a_repository. This is also available as https://github.com/stefanbeller/git/tree/object-store-lookup-commit or applies on top of 02f70d63027 (Merge branch 'sb/object-store-grafts' into next, 2018-06-28). Picking a base for this one is really hard, as nearly all series currently cooking or in flight collide with it on one or two lines. (lookup_* is used heavily, who would have thought?); I really needed 'sb/object-store-grafts' to build on; and there is only one other series before that in current next, which this series would have conflicted with, if not based on top of it. In "The state of the object store series" [1], this was tentatively named "sb/object-store-lookup". Thanks, Stefan [1] https://public-inbox.org/git/cagz79kzteza1rvgfscs+m4dsrb86cf-xiepwqmeu-kcnxp_...@mail.gmail.com/ v2 https://public-inbox.org/git/20180613230522.55335-1-sbel...@google.com/ * removed mentions of cooci patches * added forward declaration of commit buffer slabs. * dropped 3 patches that add the repository to lookup_unkonwn_object, parse_commit and parse_commit_gently, but were not converting those functions. We'll convert these in the next series, as this series is growing big already. * This series can be found as branch 'object-store-lookup-commit' on github, it applies on top of nd/commit-util-to-slab merged with sb/object-store-grafts v1, https://public-inbox.org/git/20180530004810.30076-1-sbel...@google.com/ This applies on the merge of nd/commit-util-to-slab and sb/object-store-grafts, and is available at http://github.com/stefanbeller/ as branch object-store-lookup-commit as the merge has some merge conflicts as well as syntactical conflicts (upload-pack.c and fetch-pack.c introduce new calls of functions that would want to take a repository struct in the object-store-grafts series) As layed out in https://public-inbox.org/git/20180517225154.9200-1-sbel...@google.com/ this is getting close to finishing the set of object store series though the last unfinished part of this RFC hints at new work on the plate: * To give this series a nice polish, we'd want to convert parse_commit, too. But that requires the conversion of the new commit graph. Maybe we need to split this series into 2. * Once this is in good shape we can talk about converting parts of the revision walking code, * which then can be used by the submodule code as the end goal for the object store series. Thanks, Stefan Stefan Beller (32): object: add repository argument to parse_object object: add repository argument to lookup_object object: add repository argument to parse_object_buffer object: add repository argument to object_as_type blob: add repository argument to lookup_blob tree: add repository argument to lookup_tree commit: add repository argument to lookup_commit_reference_gently commit: add repository argument to lookup_commit_reference commit: add repository argument to lookup_commit commit: add repository argument to parse_commit_buffer commit: add repository argument to set_commit_buffer commit: add repository argument to get_cached_commit_buffer tag: add repository argument to lookup_tag tag: add repository argument to parse_tag_buffer tag: add repository argument to deref_tag object: allow object_as_type to handle arbitrary repositories object: allow lookup_object to handle arbitrary repositories blob: allow lookup_blob to handle arbitrary repositories tree: allow lookup_tree to handle arbitrary repositories commit: allow lookup_commit to handle arbitrary repositories tag: allow lookup_tag to handle arbitrary repositories tag: allow parse_tag_buffer to handle arbitrary repositories commit.c: allow parse_commit_buffer to handle arbitrary repositories commit-slabs: remove realloc counter outside of slab struct commit.c: migrate the commit buffer to the parsed object store commit.c: allow set_commit_buffer to handle arbitrary repositories commit.c: allow get_cached_commit_buffer to handle arbitrary repositories object.c: allow parse_object_buffer to handle arbitrary repositories object.c: allow parse_object to handle arbitrary repositories tag.c: allow deref_tag to handle arbitrary repositories commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories commit.c: allow lookup_commit_reference to handle arbitrary repositories archive.c| 2 +- bisect.c | 2 +- blame.c | 13 --- blob.c | 10 ++--- blob.h | 2 +- branch.c | 2 +- builtin/am.c | 9 +++-- builtin/branch.c | 7 ++-- builtin/checkout.c | 6 +-- builtin/clone.c | 3 +- builtin/commit-tree.c| 4 +- builtin/describe.c | 13 --- builtin/diff-tree.c | 9 +++--
[PATCH v3 12/32] commit: add repository argument to get_cached_commit_buffer
Add a repository argument to allow callers of get_cached_commit_buffer to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller --- commit.c | 4 ++-- commit.h | 3 ++- object.c | 2 +- pretty.c | 2 +- 4 files changed, 6 insertions(+), 5 deletions(-) diff --git a/commit.c b/commit.c index cdfb1a025b6..9e2899bd5ad 100644 --- a/commit.c +++ b/commit.c @@ -269,7 +269,7 @@ void set_commit_buffer_the_repository(struct commit *commit, void *buffer, unsig v->size = size; } -const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) +const void *get_cached_commit_buffer_the_repository(const struct commit *commit, unsigned long *sizep) { struct commit_buffer *v = buffer_slab_peek(_slab, commit); if (!v) { @@ -284,7 +284,7 @@ const void *get_cached_commit_buffer(const struct commit *commit, unsigned long const void *get_commit_buffer(const struct commit *commit, unsigned long *sizep) { - const void *ret = get_cached_commit_buffer(commit, sizep); + const void *ret = get_cached_commit_buffer(the_repository, commit, sizep); if (!ret) { enum object_type type; unsigned long size; diff --git a/commit.h b/commit.h index 7c14dfdc54b..237607d64cb 100644 --- a/commit.h +++ b/commit.h @@ -102,7 +102,8 @@ void set_commit_buffer_the_repository(struct commit *, void *buffer, unsigned lo * Get any cached object buffer associated with the commit. Returns NULL * if none. The resulting memory should not be freed. */ -const void *get_cached_commit_buffer(const struct commit *, unsigned long *size); +#define get_cached_commit_buffer(r, c, s) get_cached_commit_buffer_##r(c, s) +const void *get_cached_commit_buffer_the_repository(const struct commit *, unsigned long *size); /* * Get the commit's object contents, either from cache or by reading the object diff --git a/object.c b/object.c index d1f77565af6..f08a8874de3 100644 --- a/object.c +++ b/object.c @@ -216,7 +216,7 @@ struct object *parse_object_buffer_the_repository(const struct object_id *oid, e if (commit) { if (parse_commit_buffer(the_repository, commit, buffer, size, 1)) return NULL; - if (!get_cached_commit_buffer(commit, NULL)) { + if (!get_cached_commit_buffer(the_repository, commit, NULL)) { set_commit_buffer(the_repository, commit, buffer, size); *eaten_p = 1; } diff --git a/pretty.c b/pretty.c index cbd25b6ceae..cde4fe07db0 100644 --- a/pretty.c +++ b/pretty.c @@ -630,7 +630,7 @@ const char *logmsg_reencode(const struct commit *commit, * the cached copy from get_commit_buffer, we need to duplicate it * to avoid munging the cached copy. */ - if (msg == get_cached_commit_buffer(commit, NULL)) + if (msg == get_cached_commit_buffer(the_repository, commit, NULL)) out = xstrdup(msg); else out = (char *)msg; -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 11/32] commit: add repository argument to set_commit_buffer
Add a repository argument to allow callers of set_commit_buffer to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller --- blame.c | 2 +- commit.c | 4 ++-- commit.h | 3 ++- object.c | 2 +- 4 files changed, 6 insertions(+), 5 deletions(-) diff --git a/blame.c b/blame.c index 8a0655a5997..cf102276bea 100644 --- a/blame.c +++ b/blame.c @@ -158,7 +158,7 @@ static void set_commit_buffer_from_strbuf(struct commit *c, struct strbuf *sb) { size_t len; void *buf = strbuf_detach(sb, ); - set_commit_buffer(c, buf, len); + set_commit_buffer(the_repository, c, buf, len); } /* diff --git a/commit.c b/commit.c index 75d0bdede84..cdfb1a025b6 100644 --- a/commit.c +++ b/commit.c @@ -262,7 +262,7 @@ struct commit_buffer { define_commit_slab(buffer_slab, struct commit_buffer); static struct buffer_slab buffer_slab = COMMIT_SLAB_INIT(1, buffer_slab); -void set_commit_buffer(struct commit *commit, void *buffer, unsigned long size) +void set_commit_buffer_the_repository(struct commit *commit, void *buffer, unsigned long size) { struct commit_buffer *v = buffer_slab_at(_slab, commit); v->buffer = buffer; @@ -450,7 +450,7 @@ int parse_commit_gently(struct commit *item, int quiet_on_missing) } ret = parse_commit_buffer(the_repository, item, buffer, size, 0); if (save_commit_buffer && !ret) { - set_commit_buffer(item, buffer, size); + set_commit_buffer(the_repository, item, buffer, size); return 0; } free(buffer); diff --git a/commit.h b/commit.h index f326c13622b..7c14dfdc54b 100644 --- a/commit.h +++ b/commit.h @@ -95,7 +95,8 @@ void parse_commit_or_die(struct commit *item); * Associate an object buffer with the commit. The ownership of the * memory is handed over to the commit, and must be free()-able. */ -void set_commit_buffer(struct commit *, void *buffer, unsigned long size); +#define set_commit_buffer(r, c, b, s) set_commit_buffer_##r(c, b, s) +void set_commit_buffer_the_repository(struct commit *, void *buffer, unsigned long size); /* * Get any cached object buffer associated with the commit. Returns NULL diff --git a/object.c b/object.c index 5494c0cbaa1..d1f77565af6 100644 --- a/object.c +++ b/object.c @@ -217,7 +217,7 @@ struct object *parse_object_buffer_the_repository(const struct object_id *oid, e if (parse_commit_buffer(the_repository, commit, buffer, size, 1)) return NULL; if (!get_cached_commit_buffer(commit, NULL)) { - set_commit_buffer(commit, buffer, size); + set_commit_buffer(the_repository, commit, buffer, size); *eaten_p = 1; } obj = >object; -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 17/32] object: allow lookup_object to handle arbitrary repositories
Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- object.c | 15 +++ object.h | 3 +-- 2 files changed, 8 insertions(+), 10 deletions(-) diff --git a/object.c b/object.c index f41f82c6725..9d74de95f5b 100644 --- a/object.c +++ b/object.c @@ -84,21 +84,20 @@ static void insert_obj_hash(struct object *obj, struct object **hash, unsigned i * Look up the record for the given sha1 in the hash map stored in * obj_hash. Return NULL if it was not found. */ -struct object *lookup_object_the_repository(const unsigned char *sha1) +struct object *lookup_object(struct repository *r, const unsigned char *sha1) { unsigned int i, first; struct object *obj; - if (!the_repository->parsed_objects->obj_hash) + if (!r->parsed_objects->obj_hash) return NULL; - first = i = hash_obj(sha1, -the_repository->parsed_objects->obj_hash_size); - while ((obj = the_repository->parsed_objects->obj_hash[i]) != NULL) { + first = i = hash_obj(sha1, r->parsed_objects->obj_hash_size); + while ((obj = r->parsed_objects->obj_hash[i]) != NULL) { if (!hashcmp(sha1, obj->oid.hash)) break; i++; - if (i == the_repository->parsed_objects->obj_hash_size) + if (i == r->parsed_objects->obj_hash_size) i = 0; } if (obj && i != first) { @@ -107,8 +106,8 @@ struct object *lookup_object_the_repository(const unsigned char *sha1) * that we do not need to walk the hash table the next * time we look for it. */ - SWAP(the_repository->parsed_objects->obj_hash[i], -the_repository->parsed_objects->obj_hash[first]); + SWAP(r->parsed_objects->obj_hash[i], +r->parsed_objects->obj_hash[first]); } return obj; } diff --git a/object.h b/object.h index 6f3271eb228..0d7d74129b6 100644 --- a/object.h +++ b/object.h @@ -109,8 +109,7 @@ extern struct object *get_indexed_object(unsigned int); * half-initialised objects, the caller is expected to initialize them * by calling parse_object() on them. */ -#define lookup_object(r, s) lookup_object_##r(s) -struct object *lookup_object_the_repository(const unsigned char *sha1); +struct object *lookup_object(struct repository *r, const unsigned char *sha1); extern void *create_object(struct repository *r, const unsigned char *sha1, void *obj); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 08/32] commit: add repository argument to lookup_commit_reference
Add a repository argument to allow callers of lookup_commit_reference to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller --- bisect.c | 2 +- blame.c | 2 +- branch.c | 2 +- builtin/branch.c | 7 --- builtin/clone.c | 3 ++- builtin/describe.c| 2 +- builtin/diff-tree.c | 2 +- builtin/log.c | 7 --- builtin/merge-base.c | 5 +++-- builtin/notes.c | 3 ++- builtin/pull.c| 15 ++- builtin/replace.c | 4 ++-- builtin/reset.c | 4 ++-- builtin/rev-parse.c | 6 +++--- builtin/show-branch.c | 2 +- builtin/tag.c | 2 +- bundle.c | 3 ++- commit.c | 6 +++--- commit.h | 4 +++- merge-recursive.c | 6 +++--- notes-merge.c | 5 +++-- parse-options-cb.c| 2 +- remote.c | 4 ++-- revision.c| 4 ++-- sequencer.c | 6 +++--- sha1-name.c | 4 ++-- submodule.c | 4 ++-- 27 files changed, 65 insertions(+), 51 deletions(-) diff --git a/bisect.c b/bisect.c index 6de1abd407b..e1275ba79e8 100644 --- a/bisect.c +++ b/bisect.c @@ -724,7 +724,7 @@ static int bisect_checkout(const struct object_id *bisect_rev, int no_checkout) static struct commit *get_commit_reference(const struct object_id *oid) { - struct commit *r = lookup_commit_reference(oid); + struct commit *r = lookup_commit_reference(the_repository, oid); if (!r) die(_("Not a valid commit name %s"), oid_to_hex(oid)); return r; diff --git a/blame.c b/blame.c index 5b022cc2254..8a0655a5997 100644 --- a/blame.c +++ b/blame.c @@ -119,7 +119,7 @@ static struct commit_list **append_parent(struct commit_list **tail, const struc { struct commit *parent; - parent = lookup_commit_reference(oid); + parent = lookup_commit_reference(the_repository, oid); if (!parent) die("no such commit %s", oid_to_hex(oid)); return _list_insert(parent, tail)->next; diff --git a/branch.c b/branch.c index 6a35dd31f2a..ecd710d7308 100644 --- a/branch.c +++ b/branch.c @@ -302,7 +302,7 @@ void create_branch(const char *name, const char *start_name, break; } - if ((commit = lookup_commit_reference()) == NULL) + if ((commit = lookup_commit_reference(the_repository, )) == NULL) die(_("Not a valid branch point: '%s'."), start_name); oidcpy(, >object.oid); diff --git a/builtin/branch.c b/builtin/branch.c index 1876ca9e796..a50632fb23f 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -121,7 +121,8 @@ static int branch_merged(int kind, const char *name, (reference_name = reference_name_to_free = resolve_refdup(upstream, RESOLVE_REF_READING, , NULL)) != NULL) - reference_rev = lookup_commit_reference(); + reference_rev = lookup_commit_reference(the_repository, + ); } if (!reference_rev) reference_rev = head_rev; @@ -154,7 +155,7 @@ static int check_branch_commit(const char *branchname, const char *refname, const struct object_id *oid, struct commit *head_rev, int kinds, int force) { - struct commit *rev = lookup_commit_reference(oid); + struct commit *rev = lookup_commit_reference(the_repository, oid); if (!rev) { error(_("Couldn't look up commit object for '%s'"), refname); return -1; @@ -208,7 +209,7 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, } if (!force) { - head_rev = lookup_commit_reference(_oid); + head_rev = lookup_commit_reference(the_repository, _oid); if (!head_rev) die(_("Couldn't look up commit object for HEAD")); } diff --git a/builtin/clone.c b/builtin/clone.c index 1d939af9d8c..4b3b48ee84a 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -696,7 +696,8 @@ static void update_head(const struct ref *our, const struct ref *remote, install_branch_config(0, head, option_origin, our->name); } } else if (our) { - struct commit *c = lookup_commit_reference(>old_oid); + struct commit *c = lookup_commit_reference(the_repository, + >old_oid); /* --branch specifies a non-branch (i.e. tags),
[PATCH v3 16/32] object: allow object_as_type to handle arbitrary repositories
Signed-off-by: Stefan Beller --- object.c | 4 ++-- object.h | 3 +-- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/object.c b/object.c index e095d49b379..f41f82c6725 100644 --- a/object.c +++ b/object.c @@ -158,13 +158,13 @@ void *create_object(struct repository *r, const unsigned char *sha1, void *o) return obj; } -void *object_as_type_the_repository(struct object *obj, enum object_type type, int quiet) +void *object_as_type(struct repository *r, struct object *obj, enum object_type type, int quiet) { if (obj->type == type) return obj; else if (obj->type == OBJ_NONE) { if (type == OBJ_COMMIT) - ((struct commit *)obj)->index = alloc_commit_index(the_repository); + ((struct commit *)obj)->index = alloc_commit_index(r); obj->type = type; return obj; } diff --git a/object.h b/object.h index 3faa89578fc..6f3271eb228 100644 --- a/object.h +++ b/object.h @@ -114,8 +114,7 @@ struct object *lookup_object_the_repository(const unsigned char *sha1); extern void *create_object(struct repository *r, const unsigned char *sha1, void *obj); -#define object_as_type(r, o, t, q) object_as_type_##r(o, t, q) -void *object_as_type_the_repository(struct object *obj, enum object_type type, int quiet); +void *object_as_type(struct repository *r, struct object *obj, enum object_type type, int quiet); /* * Returns the object, having parsed it to find out what it is. -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 19/32] tree: allow lookup_tree to handle arbitrary repositories
Signed-off-by: Stefan Beller --- tree.c | 10 +- tree.h | 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/tree.c b/tree.c index 45e89ff08eb..78d440a9c8f 100644 --- a/tree.c +++ b/tree.c @@ -195,13 +195,13 @@ int read_tree(struct tree *tree, int stage, struct pathspec *match, return 0; } -struct tree *lookup_tree_the_repository(const struct object_id *oid) +struct tree *lookup_tree(struct repository *r, const struct object_id *oid) { - struct object *obj = lookup_object(the_repository, oid->hash); + struct object *obj = lookup_object(r, oid->hash); if (!obj) - return create_object(the_repository, oid->hash, -alloc_tree_node(the_repository)); - return object_as_type(the_repository, obj, OBJ_TREE, 0); + return create_object(r, oid->hash, +alloc_tree_node(r)); + return object_as_type(r, obj, OBJ_TREE, 0); } int parse_tree_buffer(struct tree *item, void *buffer, unsigned long size) diff --git a/tree.h b/tree.h index 2ea21ed174b..d4807dc8058 100644 --- a/tree.h +++ b/tree.h @@ -12,8 +12,7 @@ struct tree { unsigned long size; }; -#define lookup_tree(r, oid) lookup_tree_##r(oid) -struct tree *lookup_tree_the_repository(const struct object_id *oid); +struct tree *lookup_tree(struct repository *r, const struct object_id *oid); int parse_tree_buffer(struct tree *item, void *buffer, unsigned long size); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 15/32] tag: add repository argument to deref_tag
Add a repository argument to allow the callers of deref_tag to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- blame.c | 6 +++--- builtin/diff.c | 2 +- builtin/fmt-merge-msg.c | 3 ++- builtin/grep.c | 3 ++- builtin/name-rev.c | 3 ++- commit.c| 3 ++- fetch-pack.c| 9 ++--- http-backend.c | 2 +- http-push.c | 2 +- line-log.c | 2 +- merge-recursive.c | 3 ++- remote.c| 6 -- server-info.c | 2 +- sha1-name.c | 11 +++ shallow.c | 4 +++- tag.c | 2 +- tag.h | 3 ++- upload-pack.c | 2 +- 18 files changed, 42 insertions(+), 26 deletions(-) diff --git a/blame.c b/blame.c index cf102276bea..4229bcd384c 100644 --- a/blame.c +++ b/blame.c @@ -1674,7 +1674,7 @@ static struct commit *find_single_final(struct rev_info *revs, struct object *obj = revs->pending.objects[i].item; if (obj->flags & UNINTERESTING) continue; - obj = deref_tag(obj, NULL, 0); + obj = deref_tag(the_repository, obj, NULL, 0); if (obj->type != OBJ_COMMIT) die("Non commit %s?", revs->pending.objects[i].name); if (found) @@ -1705,7 +1705,7 @@ static struct commit *dwim_reverse_initial(struct rev_info *revs, /* Is that sole rev a committish? */ obj = revs->pending.objects[0].item; - obj = deref_tag(obj, NULL, 0); + obj = deref_tag(the_repository, obj, NULL, 0); if (obj->type != OBJ_COMMIT) return NULL; @@ -1741,7 +1741,7 @@ static struct commit *find_single_initial(struct rev_info *revs, struct object *obj = revs->pending.objects[i].item; if (!(obj->flags & UNINTERESTING)) continue; - obj = deref_tag(obj, NULL, 0); + obj = deref_tag(the_repository, obj, NULL, 0); if (obj->type != OBJ_COMMIT) die("Non commit %s?", revs->pending.objects[i].name); if (found) diff --git a/builtin/diff.c b/builtin/diff.c index 7971530b9b3..361a3c3ed38 100644 --- a/builtin/diff.c +++ b/builtin/diff.c @@ -402,7 +402,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix) int flags = (obj->flags & UNINTERESTING); if (!obj->parsed) obj = parse_object(the_repository, >oid); - obj = deref_tag(obj, NULL, 0); + obj = deref_tag(the_repository, obj, NULL, 0); if (!obj) die(_("invalid object '%s' given."), name); if (obj->type == OBJ_COMMIT) diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c index 36318ef46e7..ff165c0fcd2 100644 --- a/builtin/fmt-merge-msg.c +++ b/builtin/fmt-merge-msg.c @@ -344,7 +344,8 @@ static void shortlog(const char *name, const struct object_id *oid = _data->oid; int limit = opts->shortlog_len; - branch = deref_tag(parse_object(the_repository, oid), oid_to_hex(oid), + branch = deref_tag(the_repository, parse_object(the_repository, oid), + oid_to_hex(oid), GIT_SHA1_HEXSZ); if (!branch || branch->type != OBJ_COMMIT) return; diff --git a/builtin/grep.c b/builtin/grep.c index ee753a403eb..538a818e6dd 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -647,7 +647,8 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec, for (i = 0; i < nr; i++) { struct object *real_obj; - real_obj = deref_tag(list->objects[i].item, NULL, 0); + real_obj = deref_tag(the_repository, list->objects[i].item, +NULL, 0); /* load the gitmodules file for this rev */ if (recurse_submodules) { diff --git a/builtin/name-rev.c b/builtin/name-rev.c index f6eb419a029..f1cb45c2274 100644 --- a/builtin/name-rev.c +++ b/builtin/name-rev.c @@ -455,7 +455,8 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix) commit = NULL; object = parse_object(the_repository, ); if (object) { - struct object *peeled = deref_tag(object, *argv, 0); + struct object *peeled = deref_tag(the_repository, + object, *argv, 0); if
[PATCH v3 14/32] tag: add repository argument to parse_tag_buffer
Add a repository argument to allow the callers of parse_tag_buffer to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- builtin/replace.c | 2 +- log-tree.c| 2 +- object.c | 2 +- sha1-file.c | 2 +- tag.c | 4 ++-- tag.h | 3 ++- 6 files changed, 8 insertions(+), 7 deletions(-) diff --git a/builtin/replace.c b/builtin/replace.c index 0351b7c62cf..ef22d724bbc 100644 --- a/builtin/replace.c +++ b/builtin/replace.c @@ -405,7 +405,7 @@ static int check_one_mergetag(struct commit *commit, tag = lookup_tag(the_repository, _oid); if (!tag) return error(_("bad mergetag in commit '%s'"), ref); - if (parse_tag_buffer(tag, extra->value, extra->len)) + if (parse_tag_buffer(the_repository, tag, extra->value, extra->len)) return error(_("malformed mergetag in commit '%s'"), ref); /* iterate over new parents */ diff --git a/log-tree.c b/log-tree.c index 840423ca149..76475d0136b 100644 --- a/log-tree.c +++ b/log-tree.c @@ -503,7 +503,7 @@ static int show_one_mergetag(struct commit *commit, return -1; /* error message already given */ strbuf_init(_message, 256); - if (parse_tag_buffer(tag, extra->value, extra->len)) + if (parse_tag_buffer(the_repository, tag, extra->value, extra->len)) strbuf_addstr(_message, "malformed mergetag\n"); else if (is_common_merge(commit) && !oidcmp(>tagged->oid, diff --git a/object.c b/object.c index bcfcfd38760..e095d49b379 100644 --- a/object.c +++ b/object.c @@ -225,7 +225,7 @@ struct object *parse_object_buffer_the_repository(const struct object_id *oid, e } else if (type == OBJ_TAG) { struct tag *tag = lookup_tag(the_repository, oid); if (tag) { - if (parse_tag_buffer(tag, buffer, size)) + if (parse_tag_buffer(the_repository, tag, buffer, size)) return NULL; obj = >object; } diff --git a/sha1-file.c b/sha1-file.c index 75ba30b4ab1..c75ef771f8b 100644 --- a/sha1-file.c +++ b/sha1-file.c @@ -1809,7 +1809,7 @@ static void check_tag(const void *buf, size_t size) { struct tag t; memset(, 0, sizeof(t)); - if (parse_tag_buffer(, buf, size)) + if (parse_tag_buffer(the_repository, , buf, size)) die("corrupt tag"); } diff --git a/tag.c b/tag.c index 5b41fc71fad..4971fd4dfc9 100644 --- a/tag.c +++ b/tag.c @@ -126,7 +126,7 @@ void release_tag_memory(struct tag *t) t->date = 0; } -int parse_tag_buffer(struct tag *item, const void *data, unsigned long size) +int parse_tag_buffer_the_repository(struct tag *item, const void *data, unsigned long size) { struct object_id oid; char type[20]; @@ -203,7 +203,7 @@ int parse_tag(struct tag *item) return error("Object %s not a tag", oid_to_hex(>object.oid)); } - ret = parse_tag_buffer(item, data, size); + ret = parse_tag_buffer(the_repository, item, data, size); free(data); return ret; } diff --git a/tag.h b/tag.h index 276c448cd55..149959c81ba 100644 --- a/tag.h +++ b/tag.h @@ -13,7 +13,8 @@ struct tag { }; #define lookup_tag(r, o) lookup_tag_##r(o) extern struct tag *lookup_tag_the_repository(const struct object_id *oid); -extern int parse_tag_buffer(struct tag *item, const void *data, unsigned long size); +#define parse_tag_buffer(r, i, d, s) parse_tag_buffer_##r(i, d, s) +extern int parse_tag_buffer_the_repository(struct tag *item, const void *data, unsigned long size); extern int parse_tag(struct tag *item); extern void release_tag_memory(struct tag *t); extern struct object *deref_tag(struct object *, const char *, int); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 06/32] tree: add repository argument to lookup_tree
Add a repository argument to allow the callers of lookup_tree to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- builtin/am.c| 6 -- builtin/diff-tree.c | 2 +- builtin/diff.c | 3 ++- builtin/reflog.c| 2 +- cache-tree.c| 3 ++- commit-graph.c | 2 +- commit.c| 2 +- fsck.c | 2 +- http-push.c | 3 ++- list-objects.c | 2 +- merge-recursive.c | 6 +++--- object.c| 2 +- reachable.c | 2 +- revision.c | 4 ++-- sequencer.c | 2 +- tag.c | 2 +- tree.c | 4 ++-- tree.h | 3 ++- walker.c| 3 ++- 19 files changed, 31 insertions(+), 24 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index 6273ea5195b..72e928cee79 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -32,6 +32,7 @@ #include "apply.h" #include "string-list.h" #include "packfile.h" +#include "repository.h" /** * Returns 1 if the file is empty or does not exist, 0 otherwise. @@ -1400,9 +1401,10 @@ static void write_index_patch(const struct am_state *state) FILE *fp; if (!get_oid_tree("HEAD", )) - tree = lookup_tree(); + tree = lookup_tree(the_repository, ); else - tree = lookup_tree(the_hash_algo->empty_tree); + tree = lookup_tree(the_repository, + the_repository->hash_algo->empty_tree); fp = xfopen(am_path(state, "patch"), "w"); init_revisions(_info, NULL); diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c index d8db8f682f0..29901515a13 100644 --- a/builtin/diff-tree.c +++ b/builtin/diff-tree.c @@ -46,7 +46,7 @@ static int stdin_diff_trees(struct tree *tree1, const char *p) struct tree *tree2; if (!isspace(*p++) || parse_oid_hex(p, , ) || *p) return error("Need exactly two trees, separated by a space"); - tree2 = lookup_tree(); + tree2 = lookup_tree(the_repository, ); if (!tree2 || parse_tree(tree2)) return -1; printf("%s %s\n", oid_to_hex(>object.oid), diff --git a/builtin/diff.c b/builtin/diff.c index d0421c90e29..7971530b9b3 100644 --- a/builtin/diff.c +++ b/builtin/diff.c @@ -386,7 +386,8 @@ int cmd_diff(int argc, const char **argv, const char *prefix) add_head_to_pending(); if (!rev.pending.nr) { struct tree *tree; - tree = lookup_tree(the_hash_algo->empty_tree); + tree = lookup_tree(the_repository, + the_repository->hash_algo->empty_tree); add_pending_object(, >object, "HEAD"); } break; diff --git a/builtin/reflog.c b/builtin/reflog.c index 948002b81ea..5e12c856049 100644 --- a/builtin/reflog.c +++ b/builtin/reflog.c @@ -66,7 +66,7 @@ static int tree_is_complete(const struct object_id *oid) int complete; struct tree *tree; - tree = lookup_tree(oid); + tree = lookup_tree(the_repository, oid); if (!tree) return 0; if (tree->object.flags & SEEN) diff --git a/cache-tree.c b/cache-tree.c index 6b467119960..181d5919f0f 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -671,7 +671,8 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree) cnt++; else { struct cache_tree_sub *sub; - struct tree *subtree = lookup_tree(entry.oid); + struct tree *subtree = lookup_tree(the_repository, + entry.oid); if (!subtree->object.parsed) parse_tree(subtree); sub = cache_tree_sub(it, entry.path); diff --git a/commit-graph.c b/commit-graph.c index b63a1fc85ea..d1a68f0128f 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -344,7 +344,7 @@ static struct tree *load_tree_for_commit(struct commit_graph *g, struct commit * GRAPH_DATA_WIDTH * (c->graph_pos); hashcpy(oid.hash, commit_data); - c->maybe_tree = lookup_tree(); + c->maybe_tree = lookup_tree(the_repository, ); return c->maybe_tree; } diff --git a/commit.c b/commit.c index 0d55600e643..2fa4220ac86 100644 --- a/commit.c +++ b/commit.c @@ -383,7 +383,7 @@ int
[PATCH v3 01/32] object: add repository argument to parse_object
Add a repository argument to allow the callers of parse_object to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- builtin/diff-tree.c | 3 ++- builtin/diff.c | 2 +- builtin/fast-export.c | 2 +- builtin/fmt-merge-msg.c | 6 -- builtin/fsck.c | 4 ++-- builtin/log.c | 3 ++- builtin/name-rev.c | 7 --- builtin/receive-pack.c | 6 +++--- builtin/reflog.c| 3 ++- builtin/rev-list.c | 2 +- bundle.c| 5 +++-- commit.c| 5 +++-- fetch-pack.c| 18 ++ fsck.c | 3 ++- http-backend.c | 2 +- http-push.c | 6 -- log-tree.c | 7 --- merge-recursive.c | 4 +++- object.c| 4 ++-- object.h| 3 ++- packfile.c | 2 +- pretty.c| 2 +- ref-filter.c| 3 ++- reflog-walk.c | 3 ++- refs/files-backend.c| 2 +- remote.c| 4 ++-- revision.c | 14 +++--- server-info.c | 2 +- sha1-name.c | 14 +++--- tag.c | 5 +++-- tree.c | 5 +++-- upload-pack.c | 13 +++-- walker.c| 3 ++- 33 files changed, 95 insertions(+), 72 deletions(-) diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c index 473615117e0..d8db8f682f0 100644 --- a/builtin/diff-tree.c +++ b/builtin/diff-tree.c @@ -5,6 +5,7 @@ #include "log-tree.h" #include "builtin.h" #include "submodule.h" +#include "repository.h" static struct rev_info log_tree_opt; @@ -68,7 +69,7 @@ static int diff_tree_stdin(char *line) line[len-1] = 0; if (parse_oid_hex(line, , )) return -1; - obj = parse_object(); + obj = parse_object(the_repository, ); if (!obj) return -1; if (obj->type == OBJ_COMMIT) diff --git a/builtin/diff.c b/builtin/diff.c index b709b6e9842..d0421c90e29 100644 --- a/builtin/diff.c +++ b/builtin/diff.c @@ -400,7 +400,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix) const char *name = entry->name; int flags = (obj->flags & UNINTERESTING); if (!obj->parsed) - obj = parse_object(>oid); + obj = parse_object(the_repository, >oid); obj = deref_tag(obj, NULL, 0); if (!obj) die(_("invalid object '%s' given."), name); diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 9ee6a4d2e8f..a16aeaa826f 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -801,7 +801,7 @@ static struct commit *get_commit(struct rev_cmdline_entry *e, char *full_name) /* handle nested tags */ while (tag && tag->object.type == OBJ_TAG) { - parse_object(>object.oid); + parse_object(the_repository, >object.oid); string_list_append(_refs, full_name)->util = tag; tag = (struct tag *)tag->tagged; } diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c index 1b526adb3a9..5e44589b545 100644 --- a/builtin/fmt-merge-msg.c +++ b/builtin/fmt-merge-msg.c @@ -11,6 +11,7 @@ #include "branch.h" #include "fmt-merge-msg.h" #include "gpg-interface.h" +#include "repository.h" static const char * const fmt_merge_msg_usage[] = { N_("git fmt-merge-msg [-m ] [--log[=] | --no-log] [--file ]"), @@ -343,7 +344,8 @@ static void shortlog(const char *name, const struct object_id *oid = _data->oid; int limit = opts->shortlog_len; - branch = deref_tag(parse_object(oid), oid_to_hex(oid), GIT_SHA1_HEXSZ); + branch = deref_tag(parse_object(the_repository, oid), oid_to_hex(oid), + GIT_SHA1_HEXSZ); if (!branch || branch->type != OBJ_COMMIT) return; @@ -563,7 +565,7 @@ static void find_merge_parents(struct merge_parents *result, * "name" here and we do not want to contaminate its * util field yet. */ - obj = parse_object(); + obj = parse_object(the_repository, ); parent = (struct commit *)peel_to_type(NULL, 0, obj, OBJ_COMMIT); if (!parent) continue; diff --git a/builtin/fsck.c b/builtin/fsck.c index 3ad4f160f99..2b0930101d3 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -452,7 +452,7 @@ static int fsck_handle_ref(const char *refname, const struct
[PATCH v3 18/32] blob: allow lookup_blob to handle arbitrary repositories
Signed-off-by: Stefan Beller --- blob.c | 10 +- blob.h | 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/blob.c b/blob.c index 17b9314f0a0..342bdbb1bbe 100644 --- a/blob.c +++ b/blob.c @@ -5,13 +5,13 @@ const char *blob_type = "blob"; -struct blob *lookup_blob_the_repository(const struct object_id *oid) +struct blob *lookup_blob(struct repository *r, const struct object_id *oid) { - struct object *obj = lookup_object(the_repository, oid->hash); + struct object *obj = lookup_object(r, oid->hash); if (!obj) - return create_object(the_repository, oid->hash, -alloc_blob_node(the_repository)); - return object_as_type(the_repository, obj, OBJ_BLOB, 0); + return create_object(r, oid->hash, +alloc_blob_node(r)); + return object_as_type(r, obj, OBJ_BLOB, 0); } int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size) diff --git a/blob.h b/blob.h index 08bc34487a0..16648720557 100644 --- a/blob.h +++ b/blob.h @@ -9,8 +9,7 @@ struct blob { struct object object; }; -#define lookup_blob(r, o) lookup_blob_##r(o) -struct blob *lookup_blob_the_repository(const struct object_id *oid); +struct blob *lookup_blob(struct repository *r, const struct object_id *oid); int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 20/32] commit: allow lookup_commit to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c | 10 +- commit.h | 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/commit.c b/commit.c index aa5557dee84..8749e151451 100644 --- a/commit.c +++ b/commit.c @@ -53,13 +53,13 @@ struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref return c; } -struct commit *lookup_commit_the_repository(const struct object_id *oid) +struct commit *lookup_commit(struct repository *r, const struct object_id *oid) { - struct object *obj = lookup_object(the_repository, oid->hash); + struct object *obj = lookup_object(r, oid->hash); if (!obj) - return create_object(the_repository, oid->hash, -alloc_commit_node(the_repository)); - return object_as_type(the_repository, obj, OBJ_COMMIT, 0); + return create_object(r, oid->hash, +alloc_commit_node(r)); + return object_as_type(r, obj, OBJ_COMMIT, 0); } struct commit *lookup_commit_reference_by_name(const char *name) diff --git a/commit.h b/commit.h index 237607d64cb..27888d82468 100644 --- a/commit.h +++ b/commit.h @@ -63,8 +63,7 @@ enum decoration_type { void add_name_decoration(enum decoration_type type, const char *name, struct object *obj); const struct name_decoration *get_name_decoration(const struct object *obj); -#define lookup_commit(r, o) lookup_commit_##r(o) -struct commit *lookup_commit_the_repository(const struct object_id *oid); +struct commit *lookup_commit(struct repository *r, const struct object_id *oid); #define lookup_commit_reference(r, o) \ lookup_commit_reference_##r(o) struct commit *lookup_commit_reference_the_repository(const struct object_id *oid); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 02/32] object: add repository argument to lookup_object
Add a repository argument to allow callers of lookup_object to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller --- blob.c | 2 +- builtin/fast-export.c| 5 +++-- builtin/fsck.c | 5 +++-- builtin/name-rev.c | 3 ++- builtin/prune.c | 2 +- builtin/unpack-objects.c | 2 +- commit.c | 2 +- fetch-pack.c | 13 +++-- http-push.c | 2 +- object.c | 8 object.h | 3 ++- reachable.c | 4 ++-- tag.c| 2 +- tree.c | 2 +- upload-pack.c| 2 +- 15 files changed, 31 insertions(+), 26 deletions(-) diff --git a/blob.c b/blob.c index 458dafa811e..75b737a761e 100644 --- a/blob.c +++ b/blob.c @@ -7,7 +7,7 @@ const char *blob_type = "blob"; struct blob *lookup_blob(const struct object_id *oid) { - struct object *obj = lookup_object(oid->hash); + struct object *obj = lookup_object(the_repository, oid->hash); if (!obj) return create_object(the_repository, oid->hash, alloc_blob_node(the_repository)); diff --git a/builtin/fast-export.c b/builtin/fast-export.c index a16aeaa826f..e39c4e2c1d0 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -230,7 +230,7 @@ static void export_blob(const struct object_id *oid) if (is_null_oid(oid)) return; - object = lookup_object(oid->hash); + object = lookup_object(the_repository, oid->hash); if (object && object->flags & SHOWN) return; @@ -402,7 +402,8 @@ static void show_filemodify(struct diff_queue_struct *q, anonymize_sha1(>oid) : spec->oid.hash)); else { - struct object *object = lookup_object(spec->oid.hash); + struct object *object = lookup_object(the_repository, + spec->oid.hash); printf("M %06o :%d ", spec->mode, get_object_mark(object)); } diff --git a/builtin/fsck.c b/builtin/fsck.c index 2b0930101d3..12d01e91747 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -410,7 +410,7 @@ static void fsck_handle_reflog_oid(const char *refname, struct object_id *oid, struct object *obj; if (!is_null_oid(oid)) { - obj = lookup_object(oid->hash); + obj = lookup_object(the_repository, oid->hash); if (obj && (obj->flags & HAS_OBJ)) { if (timestamp && name_objects) add_decoration(fsck_walk_options.object_names, @@ -763,7 +763,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) const char *arg = argv[i]; struct object_id oid; if (!get_oid(arg, )) { - struct object *obj = lookup_object(oid.hash); + struct object *obj = lookup_object(the_repository, + oid.hash); if (!obj || !(obj->flags & HAS_OBJ)) { if (is_promisor_object()) diff --git a/builtin/name-rev.c b/builtin/name-rev.c index de54fa93e4f..f6eb419a029 100644 --- a/builtin/name-rev.c +++ b/builtin/name-rev.c @@ -379,7 +379,8 @@ static void name_rev_line(char *p, struct name_ref_data *data) *(p+1) = 0; if (!get_oid(p - (GIT_SHA1_HEXSZ - 1), )) { struct object *o = - lookup_object(oid.hash); + lookup_object(the_repository, + oid.hash); if (o) name = get_rev_name(o, ); } diff --git a/builtin/prune.c b/builtin/prune.c index 70ec35aa058..72b0621b768 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -40,7 +40,7 @@ static int prune_object(const struct object_id *oid, const char *fullpath, * Do we know about this object? * It must have been reachable */ - if (lookup_object(oid->hash)) + if (lookup_object(the_repository, oid->hash)) return 0; if (lstat(fullpath, )) { diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index cf585fcc5ee..335b5ed9a0f
[PATCH v3 21/32] tag: allow lookup_tag to handle arbitrary repositories
Signed-off-by: Stefan Beller --- tag.c | 10 +- tag.h | 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/tag.c b/tag.c index fbb4659325b..46b5882ee12 100644 --- a/tag.c +++ b/tag.c @@ -92,13 +92,13 @@ struct object *deref_tag_noverify(struct object *o) return o; } -struct tag *lookup_tag_the_repository(const struct object_id *oid) +struct tag *lookup_tag(struct repository *r, const struct object_id *oid) { - struct object *obj = lookup_object(the_repository, oid->hash); + struct object *obj = lookup_object(r, oid->hash); if (!obj) - return create_object(the_repository, oid->hash, -alloc_tag_node(the_repository)); - return object_as_type(the_repository, obj, OBJ_TAG, 0); + return create_object(r, oid->hash, +alloc_tag_node(r)); + return object_as_type(r, obj, OBJ_TAG, 0); } static timestamp_t parse_tag_date(const char *buf, const char *tail) diff --git a/tag.h b/tag.h index 45b0b08b1f6..6a160c91875 100644 --- a/tag.h +++ b/tag.h @@ -11,8 +11,7 @@ struct tag { char *tag; timestamp_t date; }; -#define lookup_tag(r, o) lookup_tag_##r(o) -extern struct tag *lookup_tag_the_repository(const struct object_id *oid); +extern struct tag *lookup_tag(struct repository *r, const struct object_id *oid); #define parse_tag_buffer(r, i, d, s) parse_tag_buffer_##r(i, d, s) extern int parse_tag_buffer_the_repository(struct tag *item, const void *data, unsigned long size); extern int parse_tag(struct tag *item); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 10/32] commit: add repository argument to parse_commit_buffer
Add a repository argument to allow the callers of parse_commit_buffer to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- commit.c| 4 ++-- commit.h| 3 ++- object.c| 2 +- sha1-file.c | 2 +- 4 files changed, 6 insertions(+), 5 deletions(-) diff --git a/commit.c b/commit.c index 4803c8be1df..75d0bdede84 100644 --- a/commit.c +++ b/commit.c @@ -363,7 +363,7 @@ const void *detach_commit_buffer(struct commit *commit, unsigned long *sizep) return ret; } -int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long size, int check_graph) +int parse_commit_buffer_the_repository(struct commit *item, const void *buffer, unsigned long size, int check_graph) { const char *tail = buffer; const char *bufptr = buffer; @@ -448,7 +448,7 @@ int parse_commit_gently(struct commit *item, int quiet_on_missing) return error("Object %s not a commit", oid_to_hex(>object.oid)); } - ret = parse_commit_buffer(item, buffer, size, 0); + ret = parse_commit_buffer(the_repository, item, buffer, size, 0); if (save_commit_buffer && !ret) { set_commit_buffer(item, buffer, size); return 0; diff --git a/commit.h b/commit.h index cd80dab59c1..f326c13622b 100644 --- a/commit.h +++ b/commit.h @@ -82,7 +82,8 @@ struct commit *lookup_commit_reference_by_name(const char *name); */ struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref_name); -int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long size, int check_graph); +#define parse_commit_buffer(r, i, b, s, g) parse_commit_buffer_##r(i, b, s, g) +int parse_commit_buffer_the_repository(struct commit *item, const void *buffer, unsigned long size, int check_graph); int parse_commit_gently(struct commit *item, int quiet_on_missing); static inline int parse_commit(struct commit *item) { diff --git a/object.c b/object.c index 530c55e41e4..5494c0cbaa1 100644 --- a/object.c +++ b/object.c @@ -214,7 +214,7 @@ struct object *parse_object_buffer_the_repository(const struct object_id *oid, e } else if (type == OBJ_COMMIT) { struct commit *commit = lookup_commit(the_repository, oid); if (commit) { - if (parse_commit_buffer(commit, buffer, size, 1)) + if (parse_commit_buffer(the_repository, commit, buffer, size, 1)) return NULL; if (!get_cached_commit_buffer(commit, NULL)) { set_commit_buffer(commit, buffer, size); diff --git a/sha1-file.c b/sha1-file.c index de4839e634c..75ba30b4ab1 100644 --- a/sha1-file.c +++ b/sha1-file.c @@ -1801,7 +1801,7 @@ static void check_commit(const void *buf, size_t size) { struct commit c; memset(, 0, sizeof(c)); - if (parse_commit_buffer(, buf, size, 0)) + if (parse_commit_buffer(the_repository, , buf, size, 0)) die("corrupt commit"); } -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 04/32] object: add repository argument to object_as_type
Signed-off-by: Stefan Beller --- blob.c | 2 +- builtin/fsck.c | 2 +- commit.c | 4 ++-- object.c | 2 +- object.h | 3 ++- refs.c | 2 +- tag.c | 2 +- tree.c | 2 +- 8 files changed, 10 insertions(+), 9 deletions(-) diff --git a/blob.c b/blob.c index 75b737a761e..dada295698c 100644 --- a/blob.c +++ b/blob.c @@ -11,7 +11,7 @@ struct blob *lookup_blob(const struct object_id *oid) if (!obj) return create_object(the_repository, oid->hash, alloc_blob_node(the_repository)); - return object_as_type(obj, OBJ_BLOB, 0); + return object_as_type(the_repository, obj, OBJ_BLOB, 0); } int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size) diff --git a/builtin/fsck.c b/builtin/fsck.c index 09cf5333444..a906fe4a82b 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -70,7 +70,7 @@ static const char *printable_type(struct object *obj) enum object_type type = oid_object_info(the_repository, >oid, NULL); if (type > 0) - object_as_type(obj, type, 0); + object_as_type(the_repository, obj, type, 0); } ret = type_name(obj->type); diff --git a/commit.c b/commit.c index b4dbfd889a3..0d55600e643 100644 --- a/commit.c +++ b/commit.c @@ -32,7 +32,7 @@ struct commit *lookup_commit_reference_gently(const struct object_id *oid, if (!obj) return NULL; - return object_as_type(obj, OBJ_COMMIT, quiet); + return object_as_type(the_repository, obj, OBJ_COMMIT, quiet); } struct commit *lookup_commit_reference(const struct object_id *oid) @@ -58,7 +58,7 @@ struct commit *lookup_commit(const struct object_id *oid) if (!obj) return create_object(the_repository, oid->hash, alloc_commit_node(the_repository)); - return object_as_type(obj, OBJ_COMMIT, 0); + return object_as_type(the_repository, obj, OBJ_COMMIT, 0); } struct commit *lookup_commit_reference_by_name(const char *name) diff --git a/object.c b/object.c index 49719694c14..404919043d4 100644 --- a/object.c +++ b/object.c @@ -158,7 +158,7 @@ void *create_object(struct repository *r, const unsigned char *sha1, void *o) return obj; } -void *object_as_type(struct object *obj, enum object_type type, int quiet) +void *object_as_type_the_repository(struct object *obj, enum object_type type, int quiet) { if (obj->type == type) return obj; diff --git a/object.h b/object.h index 2ba23c07a72..3faa89578fc 100644 --- a/object.h +++ b/object.h @@ -114,7 +114,8 @@ struct object *lookup_object_the_repository(const unsigned char *sha1); extern void *create_object(struct repository *r, const unsigned char *sha1, void *obj); -void *object_as_type(struct object *obj, enum object_type type, int quiet); +#define object_as_type(r, o, t, q) object_as_type_##r(o, t, q) +void *object_as_type_the_repository(struct object *obj, enum object_type type, int quiet); /* * Returns the object, having parsed it to find out what it is. diff --git a/refs.c b/refs.c index 3b4508a97aa..fcfd3171e83 100644 --- a/refs.c +++ b/refs.c @@ -305,7 +305,7 @@ enum peel_status peel_object(const struct object_id *name, struct object_id *oid if (o->type == OBJ_NONE) { int type = oid_object_info(the_repository, name, NULL); - if (type < 0 || !object_as_type(o, type, 0)) + if (type < 0 || !object_as_type(the_repository, o, type, 0)) return PEEL_INVALID; } diff --git a/tag.c b/tag.c index 1b95eb9f07f..a14a4f23037 100644 --- a/tag.c +++ b/tag.c @@ -98,7 +98,7 @@ struct tag *lookup_tag(const struct object_id *oid) if (!obj) return create_object(the_repository, oid->hash, alloc_tag_node(the_repository)); - return object_as_type(obj, OBJ_TAG, 0); + return object_as_type(the_repository, obj, OBJ_TAG, 0); } static timestamp_t parse_tag_date(const char *buf, const char *tail) diff --git a/tree.c b/tree.c index 73e8a8a948d..f31afb81be0 100644 --- a/tree.c +++ b/tree.c @@ -201,7 +201,7 @@ struct tree *lookup_tree(const struct object_id *oid) if (!obj) return create_object(the_repository, oid->hash, alloc_tree_node(the_repository)); - return object_as_type(obj, OBJ_TREE, 0); + return object_as_type(the_repository, obj, OBJ_TREE, 0); } int parse_tree_buffer(struct tree *item, void *buffer, unsigned long size) -- 2.18.0.399.gad0ab374a1-goog
[PATCH v3 05/32] blob: add repository argument to lookup_blob
Add a repository argument to allow the callers of lookup_blob to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- blob.c | 2 +- blob.h | 3 ++- builtin/fast-export.c| 2 +- builtin/fsck.c | 3 ++- builtin/index-pack.c | 2 +- builtin/merge-tree.c | 3 ++- builtin/unpack-objects.c | 2 +- fsck.c | 4 ++-- http-push.c | 3 ++- list-objects.c | 2 +- object.c | 4 ++-- reachable.c | 2 +- revision.c | 4 ++-- tag.c| 2 +- walker.c | 3 ++- 15 files changed, 23 insertions(+), 18 deletions(-) diff --git a/blob.c b/blob.c index dada295698c..17b9314f0a0 100644 --- a/blob.c +++ b/blob.c @@ -5,7 +5,7 @@ const char *blob_type = "blob"; -struct blob *lookup_blob(const struct object_id *oid) +struct blob *lookup_blob_the_repository(const struct object_id *oid) { struct object *obj = lookup_object(the_repository, oid->hash); if (!obj) diff --git a/blob.h b/blob.h index 44606168310..08bc34487a0 100644 --- a/blob.h +++ b/blob.h @@ -9,7 +9,8 @@ struct blob { struct object object; }; -struct blob *lookup_blob(const struct object_id *oid); +#define lookup_blob(r, o) lookup_blob_##r(o) +struct blob *lookup_blob_the_repository(const struct object_id *oid); int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size); diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 03a2e4b79e6..7d6b1d8aea2 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -236,7 +236,7 @@ static void export_blob(const struct object_id *oid) if (anonymize) { buf = anonymize_blob(); - object = (struct object *)lookup_blob(oid); + object = (struct object *)lookup_blob(the_repository, oid); eaten = 0; } else { buf = read_object_file(oid, , ); diff --git a/builtin/fsck.c b/builtin/fsck.c index a906fe4a82b..263191942d4 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -810,7 +810,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) mode = active_cache[i]->ce_mode; if (S_ISGITLINK(mode)) continue; - blob = lookup_blob(_cache[i]->oid); + blob = lookup_blob(the_repository, + _cache[i]->oid); if (!blob) continue; obj = >object; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 9c08cf3cead..de311febe30 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -832,7 +832,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, if (strict || do_fsck_object) { read_lock(); if (type == OBJ_BLOB) { - struct blob *blob = lookup_blob(oid); + struct blob *blob = lookup_blob(the_repository, oid); if (blob) blob->object.flags |= FLAG_CHECKED; else diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c index 8a8d5797520..f8023bae1e2 100644 --- a/builtin/merge-tree.c +++ b/builtin/merge-tree.c @@ -2,6 +2,7 @@ #include "tree-walk.h" #include "xdiff-interface.h" #include "object-store.h" +#include "repository.h" #include "blob.h" #include "exec-cmd.h" #include "merge-blobs.h" @@ -170,7 +171,7 @@ static struct merge_list *create_entry(unsigned stage, unsigned mode, const stru res->stage = stage; res->path = path; res->mode = mode; - res->blob = lookup_blob(oid); + res->blob = lookup_blob(the_repository, oid); return res; } diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index 75d1d5ea0b8..716408e3a9d 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -254,7 +254,7 @@ static void write_object(unsigned nr, enum object_type type, added_object(nr, type, buf, size); free(buf); - blob = lookup_blob(_list[nr].oid); + blob = lookup_blob(the_repository, _list[nr].oid); if (blob) blob->object.flags |= FLAG_WRITTEN; else diff --git a/fsck.c b/fsck.c index bb3d622fb93..ea00f7228df 100644 --- a/fsck.c +++ b/fsck.c @@ -414,7 +414,7 @@ static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *op result =
[PATCH v3 03/32] object: add repository argument to parse_object_buffer
Add a repository argument to allow the callers of parse_object_buffer to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- builtin/fast-export.c| 3 ++- builtin/fsck.c | 7 +-- builtin/index-pack.c | 3 ++- builtin/unpack-objects.c | 3 ++- object.c | 5 +++-- object.h | 3 ++- ref-filter.c | 3 ++- 7 files changed, 18 insertions(+), 9 deletions(-) diff --git a/builtin/fast-export.c b/builtin/fast-export.c index e39c4e2c1d0..03a2e4b79e6 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -244,7 +244,8 @@ static void export_blob(const struct object_id *oid) die ("Could not read blob %s", oid_to_hex(oid)); if (check_object_signature(oid, buf, size, type_name(type)) < 0) die("sha1 mismatch in blob %s", oid_to_hex(oid)); - object = parse_object_buffer(oid, type, size, buf, ); + object = parse_object_buffer(the_repository, oid, type, +size, buf, ); } if (!object) diff --git a/builtin/fsck.c b/builtin/fsck.c index 12d01e91747..09cf5333444 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -392,7 +392,8 @@ static int fsck_obj_buffer(const struct object_id *oid, enum object_type type, * verify_packfile(), data_valid variable for details. */ struct object *obj; - obj = parse_object_buffer(oid, type, size, buffer, eaten); + obj = parse_object_buffer(the_repository, oid, type, size, buffer, + eaten); if (!obj) { errors_found |= ERROR_OBJECT; return error("%s: object corrupt or missing", oid_to_hex(oid)); @@ -525,7 +526,9 @@ static int fsck_loose(const struct object_id *oid, const char *path, void *data) if (!contents && type != OBJ_BLOB) BUG("read_loose_object streamed a non-blob"); - obj = parse_object_buffer(oid, type, size, contents, ); + obj = parse_object_buffer(the_repository, oid, type, size, + contents, ); + if (!obj) { errors_found |= ERROR_OBJECT; error("%s: object could not be parsed: %s", diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 74fe2973e12..9c08cf3cead 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -851,7 +851,8 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, * we do not need to free the memory here, as the * buf is deleted by the caller. */ - obj = parse_object_buffer(oid, type, size, buf, + obj = parse_object_buffer(the_repository, oid, type, + size, buf, ); if (!obj) die(_("invalid %s"), type_name(type)); diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index 335b5ed9a0f..75d1d5ea0b8 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -265,7 +265,8 @@ static void write_object(unsigned nr, enum object_type type, int eaten; hash_object_file(buf, size, type_name(type), _list[nr].oid); added_object(nr, type, buf, size); - obj = parse_object_buffer(_list[nr].oid, type, size, buf, + obj = parse_object_buffer(the_repository, _list[nr].oid, + type, size, buf, ); if (!obj) die("invalid %s", type_name(type)); diff --git a/object.c b/object.c index 002ebb69e3b..49719694c14 100644 --- a/object.c +++ b/object.c @@ -186,7 +186,7 @@ struct object *lookup_unknown_object(const unsigned char *sha1) return obj; } -struct object *parse_object_buffer(const struct object_id *oid, enum object_type type, unsigned long size, void *buffer, int *eaten_p) +struct object *parse_object_buffer_the_repository(const struct object_id *oid, enum object_type type, unsigned long size, void *buffer, int *eaten_p) { struct object *obj; *eaten_p = 0; @@ -278,7 +278,8 @@ struct object *parse_object_the_repository(const struct object_id *oid) return NULL; } - obj = parse_object_buffer(oid, type, size, buffer, ); + obj = parse_object_buffer(the_repository, oid, type, size, +
Re: [PATCH 4/4] fsck: silence stderr when parsing .gitmodules
On 28/06/18 23:12, Jeff King wrote: > On Thu, Jun 28, 2018 at 06:06:03PM -0400, Jeff King wrote: > >> Note that we didn't test this case at all, so I've added >> coverage in t7415. We may end up toning down or removing >> this fsck check in the future. So take this test as checking >> what happens now with a focus on stderr, and not any >> ironclad guarantee that we must detect and report parse >> failures in the future. > > And such a patch _could_ look something like this. Though we could also > perhaps leave it in place and tone it down to "ignore" by default. > > There's another case that triggers gitmodulesParse, too, which is a blob > so gigantic that we try not to hold it in memory. Technically that could > also happen due to somebody using .gitmodules for something unrelated. > But that seems even more far-fetched. And it _is_ dangerous to leave, > because I think existing vulnerable clients will try to load a 500MB > .gitmodules file in memory and parse it. I also applied and tested the patch below. I think this patch must be included in the series. ATB, Ramsay Jones > --- > diff --git a/fsck.c b/fsck.c > index 87b0e228bd..296e8a8a7c 100644 > --- a/fsck.c > +++ b/fsck.c > @@ -1013,11 +1013,9 @@ static int fsck_blob(struct blob *blob, const char > *buf, > data.options = options; > data.ret = 0; > config_opts.error_action = CONFIG_ERROR_SILENT; > - if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB, > - ".gitmodules", buf, size, , _opts)) > - data.ret |= report(options, >object, > -FSCK_MSG_GITMODULES_PARSE, > -"could not parse gitmodules blob"); > + /* ignore errors */ > + git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB, > + ".gitmodules", buf, size, , _opts); > > return data.ret; > } > diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh > index ba8af785a5..9a7dae88a5 100755 > --- a/t/t7415-submodule-names.sh > +++ b/t/t7415-submodule-names.sh > @@ -176,7 +176,7 @@ test_expect_success 'fsck detects non-blob .gitmodules' ' > ) > ' > > -test_expect_success 'fsck detects corrupt .gitmodules' ' > +test_expect_success 'fsck does not complain about corrupt .gitmodules' ' > git init corrupt && > ( > cd corrupt && > @@ -185,9 +185,8 @@ test_expect_success 'fsck detects corrupt .gitmodules' ' > git add .gitmodules && > git commit -m "broken gitmodules" && > > - test_must_fail git fsck 2>output && > - grep gitmodulesParse output && > - test_i18ngrep ! "bad config" output > + git fsck 2>output && > + ! test -s output > ) > ' > >
Re: [PATCH] fsck: check skiplist for object in fsck_blob()
On 28/06/18 23:03, Jeff King wrote: > On Thu, Jun 28, 2018 at 07:53:27PM +0100, Ramsay Jones wrote: [snip] > Yes, it can go in quickly. But I'd prefer not to keep it in the long > term if it's literally doing nothing. Hmm, I don't think you can say its doing nothing! "Yeah, this solution seems sensible. Given that we would never report any error for that blob, there is no point in even looking at it." ... is no less true, with or without additional patches! ;-) > I have some patches which I think solve your problem. They apply on > v2.18.0, but not on v2.17.1 (because they rely on Dscho's increased > passing of config_options in v2.18). Is that good enough? Heh, I was also writing patches to address this tonight (but I was also watching the football, so I was somewhat behind you). My patches were not too dissimilar to yours, except I was aiming to allow even do_config_from_file() etc., to suppress errors. Your patches were cleaner and more focused than mine. (Instead of turning die_on_error into an enum, I added an additional 'quiet' flag. When pushing the stack (eg. for include files), I had to copy the quiet flag from the parent struct, etc, ... ;-) ). > Yes, it would include any syntax error. I also have a slight worry about > that, but nobody seems to have screamed _yet_. :) Hmm, I don't think we can ignore this. :( > Here are the patches I came up with. Yes, I applied these locally and tested them. All OK here. So, FWIW, Ack! [I still think my original patch, with the 'to_be_skipped' function name changed to 'object_on_skiplist', should be the first patch of the series!] ATB, Ramsay Jones
[PATCH v4 9/9] diff.c: add white space mode to move detection that allows indent changes
The option of --color-moved has proven to be useful as observed on the mailing list. However when refactoring sometimes the indentation changes, for example when partitioning a functions into smaller helper functions the code usually mostly moved around except for a decrease in indentation. To just review the moved code ignoring the change in indentation, a mode to ignore spaces in the move detection as implemented in a previous patch would be enough. However the whole move coloring as motivated in commit 2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought up the notion of the reviewer being able to trust the move of a "block". As there are languages such as python, which depend on proper relative indentation for the control flow of the program, ignoring any white space change in a block would not uphold the promises of 2e2d5ac that allows reviewers to pay less attention to the inside of a block, as inside the reviewer wants to assume the same program flow. This new mode of white space ignorance will take this into account and will only allow the same white space changes per line in each block. This patch even allows only for the same change at the beginning of the lines. As this is a white space mode, it is made exclusive to other white space modes in the move detection. This patch brings some challenges, related to the detection of blocks. We need a white net the catch the possible moved lines, but then need to narrow down to check if the blocks are still in tact. Consider this example (ignoring block sizes): - A - B - C +A +B +C At the beginning of a block when checking if there is a counterpart for A, we have to ignore all space changes. However at the following lines we have to check if the indent change stayed the same. Checking if the indentation change did stay the same, is done by computing the indentation change by the difference in line length, and then assume the change is only in the beginning of the longer line, the common tail is the same. That is why the test contains lines like: - A ... + A ... As the first line starting a block is caught using a compare function that ignores white spaces unlike the rest of the block, where the white space delta is taken into account for the comparison, we also have to think about the following situation: - A - B - A - B +A +B + A + B When checking if the first A (both in the + and - lines) is a start of a block, we have to check all 'A' and record all the white space deltas such that we can find the example above to be just one block that is indented. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- Documentation/diff-options.txt | 5 ++ diff.c | 158 - diff.h | 3 + t/t4015-diff-whitespace.sh | 98 ++-- 4 files changed, 256 insertions(+), 8 deletions(-) diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 80e29e39854..143acd9417e 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -307,6 +307,11 @@ ignore-space-change:: ignore-all-space:: Ignore whitespace when comparing lines. This ignores differences even if one line has whitespace where the other line has none. +allow-indentation-change:: + Initially ignore any white spaces in the move detection, then + group the moved code blocks only into a block if the change in + whitespace is the same per line. This is incompatible with the + other modes. -- --word-diff[=]:: diff --git a/diff.c b/diff.c index 4963819e530..f51f0ac32f4 100644 --- a/diff.c +++ b/diff.c @@ -302,12 +302,18 @@ static int parse_color_moved_ws(const char *arg) ret |= XDF_IGNORE_WHITESPACE_AT_EOL; else if (!strcmp(sb.buf, "ignore-all-space")) ret |= XDF_IGNORE_WHITESPACE; + else if (!strcmp(sb.buf, "allow-indentation-change")) + ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE; else error(_("ignoring unknown color-moved-ws mode '%s'"), sb.buf); strbuf_release(); } + if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) && + (ret & XDF_WHITESPACE_FLAGS)) + die(_("color-moved-ws: allow-indentation-change cannot be combined with other white space modes")); + string_list_clear(, 0); return ret; @@ -737,7 +743,91 @@ struct moved_entry { struct hashmap_entry ent; const struct emitted_diff_symbol *es; struct moved_entry *next_line; + struct ws_delta *wsd; +}; + +/** + * The struct ws_delta holds white space differences between moved lines, i.e. + * between '+' and '-' lines that have been detected to be a move. + * The string contains the difference in leading white spaces, before the + * rest of
[PATCH v4 8/9] diff.c: factor advance_or_nullify out of mark_color_as_moved
This moves the part of code that checks if we're still in a block into its own function. We'll need a different approach on advancing the blocks in a later patch, so having it as a separate function will prove useful. While at it rename the variable `p` to `prev` to indicate that it refers to the previous line. This is as pmb[i] was assigned in the last iteration of the outmost for loop. Further rename `pnext` to `cur` to indicate that this should match up with the current line of the outmost for loop. Also replace the advancement of pmb[i] to reuse `cur` instead of using `p->next` (which is how the name for pnext could be explained. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 32 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/diff.c b/diff.c index 70eeb40c5fd..4963819e530 100644 --- a/diff.c +++ b/diff.c @@ -801,6 +801,25 @@ static void add_lines_to_move_detection(struct diff_options *o, } } +static void pmb_advance_or_null(struct diff_options *o, + struct moved_entry *match, + struct hashmap *hm, + struct moved_entry **pmb, + int pmb_nr) +{ + int i; + for (i = 0; i < pmb_nr; i++) { + struct moved_entry *prev = pmb[i]; + struct moved_entry *cur = (prev && prev->next_line) ? + prev->next_line : NULL; + if (cur && !hm->cmpfn(o, cur, match, NULL)) { + pmb[i] = cur; + } else { + pmb[i] = NULL; + } + } +} + static int shrink_potential_moved_blocks(struct moved_entry **pmb, int pmb_nr) { @@ -875,7 +894,6 @@ static void mark_color_as_moved(struct diff_options *o, struct moved_entry *key; struct moved_entry *match = NULL; struct emitted_diff_symbol *l = >emitted_symbols->buf[n]; - int i; switch (l->s) { case DIFF_SYMBOL_PLUS: @@ -906,17 +924,7 @@ static void mark_color_as_moved(struct diff_options *o, if (o->color_moved == COLOR_MOVED_PLAIN) continue; - /* Check any potential block runs, advance each or nullify */ - for (i = 0; i < pmb_nr; i++) { - struct moved_entry *p = pmb[i]; - struct moved_entry *pnext = (p && p->next_line) ? - p->next_line : NULL; - if (pnext && !hm->cmpfn(o, pnext, match, NULL)) { - pmb[i] = p->next_line; - } else { - pmb[i] = NULL; - } - } + pmb_advance_or_null(o, match, hm, pmb, pmb_nr); pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v4 1/9] xdiff/xdiff.h: remove unused flags
These flags were there since the beginning (3443546f6e (Use a *real* built-in diff generator, 2006-03-24), but were never used. Remove them. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- xdiff/xdiff.h | 8 1 file changed, 8 deletions(-) diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h index c1937a29112..2356da5f784 100644 --- a/xdiff/xdiff.h +++ b/xdiff/xdiff.h @@ -52,14 +52,6 @@ extern "C" { #define XDL_EMIT_FUNCNAMES (1 << 0) #define XDL_EMIT_FUNCCONTEXT (1 << 2) -#define XDL_MMB_READONLY (1 << 0) - -#define XDL_MMF_ATOMIC (1 << 0) - -#define XDL_BDOP_INS 1 -#define XDL_BDOP_CPY 2 -#define XDL_BDOP_INSB 3 - /* merge simplification levels */ #define XDL_MERGE_MINIMAL 0 #define XDL_MERGE_EAGER 1 -- 2.18.0.399.gad0ab374a1-goog
[PATCH v4 5/9] diff.c: adjust hash function signature to match hashmap expectation
This makes the follow up patch easier. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/diff.c b/diff.c index ce7bedc1b92..d1bae900cdc 100644 --- a/diff.c +++ b/diff.c @@ -707,11 +707,15 @@ struct moved_entry { struct moved_entry *next_line; }; -static int moved_entry_cmp(const struct diff_options *diffopt, - const struct moved_entry *a, - const struct moved_entry *b, +static int moved_entry_cmp(const void *hashmap_cmp_fn_data, + const void *entry, + const void *entry_or_key, const void *keydata) { + const struct diff_options *diffopt = hashmap_cmp_fn_data; + const struct moved_entry *a = entry; + const struct moved_entry *b = entry_or_key; + return !xdiff_compare_lines(a->es->line, a->es->len, b->es->line, b->es->len, diffopt->xdl_opts); @@ -5534,10 +5538,8 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o) if (o->color_moved) { struct hashmap add_lines, del_lines; - hashmap_init(_lines, -(hashmap_cmp_fn)moved_entry_cmp, o, 0); - hashmap_init(_lines, -(hashmap_cmp_fn)moved_entry_cmp, o, 0); + hashmap_init(_lines, moved_entry_cmp, o, 0); + hashmap_init(_lines, moved_entry_cmp, o, 0); add_lines_to_move_detection(o, _lines, _lines); mark_color_as_moved(o, _lines, _lines); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v4 2/9] xdiff/xdiffi.c: remove unneeded function declarations
There is no need to forward-declare these functions, as they are used after their implementation only. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- xdiff/xdiffi.c | 17 - 1 file changed, 17 deletions(-) diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c index 0de1ef463bf..3e8aff92bc4 100644 --- a/xdiff/xdiffi.c +++ b/xdiff/xdiffi.c @@ -22,34 +22,17 @@ #include "xinclude.h" - - #define XDL_MAX_COST_MIN 256 #define XDL_HEUR_MIN_COST 256 #define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1) #define XDL_SNAKE_CNT 20 #define XDL_K_HEUR 4 - - typedef struct s_xdpsplit { long i1, i2; int min_lo, min_hi; } xdpsplit_t; - - - -static long xdl_split(unsigned long const *ha1, long off1, long lim1, - unsigned long const *ha2, long off2, long lim2, - long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl, - xdalgoenv_t *xenv); -static xdchange_t *xdl_add_change(xdchange_t *xscr, long i1, long i2, long chg1, long chg2); - - - - - /* * See "An O(ND) Difference Algorithm and its Variations", by Eugene Myers. * Basically considers a "box" (off1, off2, lim1, lim2) and scan from both -- 2.18.0.399.gad0ab374a1-goog
[PATCH v4 0/9] Moved code detection: ignore space on uniform indentation
v4: * see range diff below * brought best practices to t4015 and have git not upstream of a pipe (new patch 3) * squashed in the SQUASH patches * fixed the translation as well. v3: This is a complete rewrite of the actual patch, with slight modifications] on the refactoring how to decouple the white space treatment from the move detection. See range-diff below. https://public-inbox.org/git/20180622015725.219575-1-sbel...@google.com/ v2: https://public-inbox.org/git/20180424210330.87861-1-sbel...@google.com/ v1: https://public-inbox.org/git/20180402224854.86922-1-sbel...@google.com/ Stefan Beller (9): xdiff/xdiff.h: remove unused flags xdiff/xdiffi.c: remove unneeded function declarations t4015: avoid git as a pipe input diff.c: do not pass diff options as keydata to hashmap diff.c: adjust hash function signature to match hashmap expectation diff.c: add a blocks mode for moved code detection diff.c: decouple white space treatment from move detection algorithm diff.c: factor advance_or_nullify out of mark_color_as_moved diff.c: add white space mode to move detection that allows indent changes Documentation/diff-options.txt | 30 +++- diff.c | 253 + diff.h | 9 +- t/t4015-diff-whitespace.sh | 243 ++- xdiff/xdiff.h | 8 -- xdiff/xdiffi.c | 17 --- 6 files changed, 472 insertions(+), 88 deletions(-) -- 2.18.0.399.gad0ab374a1-goog 1: ace2dc2bc11 = 1: ace2dc2bc11 xdiff/xdiff.h: remove unused flags 2: 53b3574564e = 2: 53b3574564e xdiff/xdiffi.c: remove unneeded function declarations -: --- > 3: 9b58621e0d8 t4015: avoid git as a pipe input 3: 34850b565df = 4: be0ea0717e1 diff.c: do not pass diff options as keydata to hashmap 4: 8148e51178f = 5: ff7e8721afa diff.c: adjust hash function signature to match hashmap expectation 5: 9d1de6a208e ! 6: 73c2801a5e3 diff.c: add a blocks mode for moved code detection @@ -19,7 +19,7 @@ moved line, but it is not very useful in a review to determine if a block of code was moved without permutation. -zebra:: -+blocks: ++blocks:: Blocks of moved text of at least 20 alphanumeric characters are detected greedily. The detected blocks are - painted using either the 'color.diff.{old,new}Moved' color or @@ -95,10 +95,8 @@ test_config color.diff.newMovedDimmed "normal cyan" && test_config color.diff.oldMovedAlternativeDimmed "normal blue" && test_config color.diff.newMovedAlternativeDimmed "normal yellow" && -+ -+ git diff HEAD --no-renames --color-moved=blocks --color | -+ grep -v "index" | -+ test_decode_color >actual && ++ git diff HEAD --no-renames --color-moved=blocks --color >actual.raw && ++ grep -v "index" actual.raw | test_decode_color >actual && + cat <<-\EOF >expected && + diff --git a/lines.txt b/lines.txt + --- a/lines.txt @@ -141,6 +139,16 @@ + test_config color.diff.newMovedDimmed "normal cyan" && + test_config color.diff.oldMovedAlternativeDimmed "normal blue" && + test_config color.diff.newMovedAlternativeDimmed "normal yellow" && - git diff HEAD --no-renames --color-moved=dimmed_zebra --color | - grep -v "index" | - test_decode_color >actual && + git diff HEAD --no-renames --color-moved=dimmed_zebra --color >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && + cat <<-\EOF >expected && +@@ + 7charsA + EOF + +- git diff HEAD --color-moved=zebra --color --no-renames | grep -v "index" | test_decode_color >actual && ++ git diff HEAD --color-moved=zebra --color --no-renames >actual.raw && ++ grep -v "index" actual.raw | test_decode_color >actual && + cat >expected <<-\EOF && + diff --git a/bar b/bar + --- a/bar 6: e37bb7b1fc8 ! 7: 87a8919c260 diff.c: decouple white space treatment from move detection algorithm @@ -55,6 +55,7 @@ +ignore-all-space:: + Ignore whitespace when comparing lines. This ignores differences + even if one line has whitespace where the other line has none. ++-- + --word-diff[=]:: Show a word diff, using the to delimit changed words. @@ -154,32 +155,32 @@ EOF test_cmp expected actual && -- git diff HEAD --no-renames -w --color-moved --color | +- git diff HEAD --no-renames -w --color-moved --color >actual.raw && + git diff HEAD --no-renames --color-moved --color \ -+ --color-moved-ws=ignore-all-space | - grep -v "index" | - test_decode_color >actual && ++ --color-moved-ws=ignore-all-space >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && cat <<-\EOF >expected && +
[PATCH v4 7/9] diff.c: decouple white space treatment from move detection algorithm
In the original implementation of the move detection logic the choice for ignoring white space changes is the same for the move detection as it is for the regular diff. Some cases came up where different treatment would have been nice. Allow the user to specify that white space should be ignored differently during detection of moved lines than during generation of added and removed lines. This is done by providing analogs to the --ignore-space-at-eol, -b, and -w options by introducing the option --color-moved-ws= with the modes named "ignore-space-at-eol", "ignore-space-change" and "ignore-all-space", which is used only during the move detection phase. As we change the default, we'll adjust the tests. For now we do not infer any options to treat white spaces in the move detection from the generic white space options given to diff. This can be tuned later to reasonable default. As we plan on adding more white space related options in a later patch, that interferes with the current white space options, use a flag field and clamp it down to XDF_WHITESPACE_FLAGS, as that (a) allows to easily check at parse time if we give invalid combinations and (b) can reuse parts of this patch. By having the white space treatment in its own option, we'll also make it easier for a later patch to have an config option for spaces in the move detection. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- Documentation/diff-options.txt | 17 + diff.c | 39 +++-- diff.h | 1 + t/t4015-diff-whitespace.sh | 64 +++--- 4 files changed, 115 insertions(+), 6 deletions(-) diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index ba56169de31..80e29e39854 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -292,6 +292,23 @@ dimmed_zebra:: blocks are considered interesting, the rest is uninteresting. -- +--color-moved-ws=:: + This configures how white spaces are ignored when performing the + move detection for `--color-moved`. These modes can be given + as a comma separated list: ++ +-- +ignore-space-at-eol:: + Ignore changes in whitespace at EOL. +ignore-space-change:: + Ignore changes in amount of whitespace. This ignores whitespace + at line end, and considers all other sequences of one or + more whitespace characters to be equivalent. +ignore-all-space:: + Ignore whitespace when comparing lines. This ignores differences + even if one line has whitespace where the other line has none. +-- + --word-diff[=]:: Show a word diff, using the to delimit changed words. By default, words are delimited by whitespace; see diff --git a/diff.c b/diff.c index 95c51c0b7df..70eeb40c5fd 100644 --- a/diff.c +++ b/diff.c @@ -283,6 +283,36 @@ static int parse_color_moved(const char *arg) return error(_("color moved setting must be one of 'no', 'default', 'blocks', 'zebra', 'dimmed_zebra', 'plain'")); } +static int parse_color_moved_ws(const char *arg) +{ + int ret = 0; + struct string_list l = STRING_LIST_INIT_DUP; + struct string_list_item *i; + + string_list_split(, arg, ',', -1); + + for_each_string_list_item(i, ) { + struct strbuf sb = STRBUF_INIT; + strbuf_addstr(, i->string); + strbuf_trim(); + + if (!strcmp(sb.buf, "ignore-space-change")) + ret |= XDF_IGNORE_WHITESPACE_CHANGE; + else if (!strcmp(sb.buf, "ignore-space-at-eol")) + ret |= XDF_IGNORE_WHITESPACE_AT_EOL; + else if (!strcmp(sb.buf, "ignore-all-space")) + ret |= XDF_IGNORE_WHITESPACE; + else + error(_("ignoring unknown color-moved-ws mode '%s'"), sb.buf); + + strbuf_release(); + } + + string_list_clear(, 0); + + return ret; +} + int git_diff_ui_config(const char *var, const char *value, void *cb) { if (!strcmp(var, "diff.color") || !strcmp(var, "color.diff")) { @@ -717,10 +747,12 @@ static int moved_entry_cmp(const void *hashmap_cmp_fn_data, const struct diff_options *diffopt = hashmap_cmp_fn_data; const struct moved_entry *a = entry; const struct moved_entry *b = entry_or_key; + unsigned flags = diffopt->color_moved_ws_handling +& XDF_WHITESPACE_FLAGS; return !xdiff_compare_lines(a->es->line, a->es->len, b->es->line, b->es->len, - diffopt->xdl_opts); + flags); } static struct moved_entry *prepare_entry(struct diff_options *o, @@ -728,8 +760,9 @@ static struct moved_entry *prepare_entry(struct diff_options *o, { struct moved_entry *ret = xmalloc(sizeof(*ret));
[PATCH v4 4/9] diff.c: do not pass diff options as keydata to hashmap
When we initialize the hashmap, we give it a pointer to the diff_options, which it then passes along to each call of the hashmap_cmp_fn function. There's no need to pass it a second time as the "keydata" parameter, and our comparison functions never look at keydata. This was a mistake left over from an earlier round of 2e2d5ac184 (diff.c: color moved lines differently, 2017-06-30), before hashmap learned to pass the data pointer for us. Explanation-by: Jeff King Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 1289df4b1f9..ce7bedc1b92 100644 --- a/diff.c +++ b/diff.c @@ -842,13 +842,13 @@ static void mark_color_as_moved(struct diff_options *o, case DIFF_SYMBOL_PLUS: hm = del_lines; key = prepare_entry(o, n); - match = hashmap_get(hm, key, o); + match = hashmap_get(hm, key, NULL); free(key); break; case DIFF_SYMBOL_MINUS: hm = add_lines; key = prepare_entry(o, n); - match = hashmap_get(hm, key, o); + match = hashmap_get(hm, key, NULL); free(key); break; default: -- 2.18.0.399.gad0ab374a1-goog
[PATCH v4 3/9] t4015: avoid git as a pipe input
In t4015 we have a pattern of git diff [] | grep -v "index" | test_decode_color >actual && to produce output that we want to test against. This pattern was introduced in 86b452e2769 (diff.c: add dimming to moved line detection, 2017-06-30) as then the focus on getting the colors right. However the pattern used is not best practice as we do care about the exit code of Git. So let's not have Git as the upstream of a pipe. Piping the output of grep to some function is fine as we assume grep to be un-flawed in our test suite. Signed-off-by: Stefan Beller --- t/t4015-diff-whitespace.sh | 50 +++--- 1 file changed, 20 insertions(+), 30 deletions(-) diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index 17df491a3ab..ddbc3901385 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -1271,9 +1271,8 @@ test_expect_success 'detect permutations inside moved code -- dimmed_zebra' ' test_config color.diff.newMovedDimmed "normal cyan" && test_config color.diff.oldMovedAlternativeDimmed "normal blue" && test_config color.diff.newMovedAlternativeDimmed "normal yellow" && - git diff HEAD --no-renames --color-moved=dimmed_zebra --color | - grep -v "index" | - test_decode_color >actual && + git diff HEAD --no-renames --color-moved=dimmed_zebra --color >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && cat <<-\EOF >expected && diff --git a/lines.txt b/lines.txt --- a/lines.txt @@ -1315,9 +1314,8 @@ test_expect_success 'cmd option assumes configured colored-moved' ' test_config color.diff.oldMovedAlternativeDimmed "normal blue" && test_config color.diff.newMovedAlternativeDimmed "normal yellow" && test_config diff.colorMoved zebra && - git diff HEAD --no-renames --color-moved --color | - grep -v "index" | - test_decode_color >actual && + git diff HEAD --no-renames --color-moved --color >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && cat <<-\EOF >expected && diff --git a/lines.txt b/lines.txt --- a/lines.txt @@ -1395,9 +1393,8 @@ test_expect_success 'move detection ignoring whitespace ' ' line 4 line 5 EOF - git diff HEAD --no-renames --color-moved --color | - grep -v "index" | - test_decode_color >actual && + git diff HEAD --no-renames --color-moved --color >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && cat <<-\EOF >expected && diff --git a/lines.txt b/lines.txt --- a/lines.txt @@ -1419,9 +1416,8 @@ test_expect_success 'move detection ignoring whitespace ' ' EOF test_cmp expected actual && - git diff HEAD --no-renames -w --color-moved --color | - grep -v "index" | - test_decode_color >actual && + git diff HEAD --no-renames -w --color-moved --color >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && cat <<-\EOF >expected && diff --git a/lines.txt b/lines.txt --- a/lines.txt @@ -1459,9 +1455,8 @@ test_expect_success 'move detection ignoring whitespace changes' ' line 5 EOF - git diff HEAD --no-renames --color-moved --color | - grep -v "index" | - test_decode_color >actual && + git diff HEAD --no-renames --color-moved --color >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && cat <<-\EOF >expected && diff --git a/lines.txt b/lines.txt --- a/lines.txt @@ -1483,9 +1478,8 @@ test_expect_success 'move detection ignoring whitespace changes' ' EOF test_cmp expected actual && - git diff HEAD --no-renames -b --color-moved --color | - grep -v "index" | - test_decode_color >actual && + git diff HEAD --no-renames -b --color-moved --color >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && cat <<-\EOF >expected && diff --git a/lines.txt b/lines.txt --- a/lines.txt @@ -1526,9 +1520,8 @@ test_expect_success 'move detection ignoring whitespace at eol' ' # avoid cluttering the output with complaints about our eol whitespace test_config core.whitespace -blank-at-eol && - git diff HEAD --no-renames --color-moved --color | - grep -v "index" | - test_decode_color >actual && + git diff HEAD --no-renames --color-moved --color >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && cat <<-\EOF >expected && diff --git a/lines.txt b/lines.txt --- a/lines.txt @@ -1550,9 +1543,8 @@ test_expect_success 'move detection ignoring whitespace at eol' ' EOF
[PATCH v4 6/9] diff.c: add a blocks mode for moved code detection
The new "blocks" mode provides a middle ground between plain and zebra. It is as intuitive (few colors) as plain, but still has the requirement for a minimum of lines/characters to count a block as moved. Suggested-by: Ævar Arnfjörð Bjarmason (https://public-inbox.org/git/87o9j0uljo@evledraar.gmail.com/) Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- Documentation/diff-options.txt | 8 -- diff.c | 6 +++-- diff.h | 5 ++-- t/t4015-diff-whitespace.sh | 49 -- 4 files changed, 60 insertions(+), 8 deletions(-) diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index e3a44f03cdc..ba56169de31 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -276,10 +276,14 @@ plain:: that are added somewhere else in the diff. This mode picks up any moved line, but it is not very useful in a review to determine if a block of code was moved without permutation. -zebra:: +blocks:: Blocks of moved text of at least 20 alphanumeric characters are detected greedily. The detected blocks are - painted using either the 'color.diff.{old,new}Moved' color or + painted using either the 'color.diff.{old,new}Moved' color. + Adjacent blocks cannot be told apart. +zebra:: + Blocks of moved text are detected as in 'blocks' mode. The blocks + are painted using either the 'color.diff.{old,new}Moved' color or 'color.diff.{old,new}MovedAlternative'. The change between the two colors indicates that a new block was detected. dimmed_zebra:: diff --git a/diff.c b/diff.c index d1bae900cdc..95c51c0b7df 100644 --- a/diff.c +++ b/diff.c @@ -271,6 +271,8 @@ static int parse_color_moved(const char *arg) return COLOR_MOVED_NO; else if (!strcmp(arg, "plain")) return COLOR_MOVED_PLAIN; + else if (!strcmp(arg, "blocks")) + return COLOR_MOVED_BLOCKS; else if (!strcmp(arg, "zebra")) return COLOR_MOVED_ZEBRA; else if (!strcmp(arg, "default")) @@ -278,7 +280,7 @@ static int parse_color_moved(const char *arg) else if (!strcmp(arg, "dimmed_zebra")) return COLOR_MOVED_ZEBRA_DIM; else - return error(_("color moved setting must be one of 'no', 'default', 'zebra', 'dimmed_zebra', 'plain'")); + return error(_("color moved setting must be one of 'no', 'default', 'blocks', 'zebra', 'dimmed_zebra', 'plain'")); } int git_diff_ui_config(const char *var, const char *value, void *cb) @@ -903,7 +905,7 @@ static void mark_color_as_moved(struct diff_options *o, block_length++; - if (flipped_block) + if (flipped_block && o->color_moved != COLOR_MOVED_BLOCKS) l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT; } adjust_last_block(o, n, block_length); diff --git a/diff.h b/diff.h index d29560f822c..7bd4f182c33 100644 --- a/diff.h +++ b/diff.h @@ -208,8 +208,9 @@ struct diff_options { enum { COLOR_MOVED_NO = 0, COLOR_MOVED_PLAIN = 1, - COLOR_MOVED_ZEBRA = 2, - COLOR_MOVED_ZEBRA_DIM = 3, + COLOR_MOVED_BLOCKS = 2, + COLOR_MOVED_ZEBRA = 3, + COLOR_MOVED_ZEBRA_DIM = 4, } color_moved; #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA #define COLOR_MOVED_MIN_ALNUM_COUNT 20 diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index ddbc3901385..e54529f026d 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -1223,7 +1223,7 @@ test_expect_success 'plain moved code, inside file' ' test_cmp expected actual ' -test_expect_success 'detect permutations inside moved code -- dimmed_zebra' ' +test_expect_success 'detect blocks of moved code' ' git reset --hard && cat <<-\EOF >lines.txt && long line 1 @@ -1271,6 +1271,50 @@ test_expect_success 'detect permutations inside moved code -- dimmed_zebra' ' test_config color.diff.newMovedDimmed "normal cyan" && test_config color.diff.oldMovedAlternativeDimmed "normal blue" && test_config color.diff.newMovedAlternativeDimmed "normal yellow" && + git diff HEAD --no-renames --color-moved=blocks --color >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && + cat <<-\EOF >expected && + diff --git a/lines.txt b/lines.txt + --- a/lines.txt + +++ b/lines.txt + @@ -1,16 +1,16 @@ + -long line 1 + -long line 2 + -long line 3 +line 4 +line 5 +line 6 +line 7 +line 8 +line 9 + +long line 1 + +long line 2 + +long line 3 + +long line 14 + +long line 15 + +long line 16 +line 10 +
Re: [PATCH v6 0/4] stash: Convert some `git stash` commands to a builtin
Hello, On 27.06.2018 01:12, Johannes Schindelin wrote: Joel Teichroeb (4): stash: convert apply to builtin stash: convert drop and clear to builtin stash: convert branch to builtin stash: convert pop to builtin Were there any changes since v5 in these patches? There are some changes since v5 based on the reviews. There are also some changes which were not suggested in the reviews, but which I considered to be good (I hope they really are). For example, in the previous version of the patch that introduces the `stash--helper`, there was no `assert_stash_like()` function (it was part of `get_stash_info()`). In v6, it was implemented as a separate function in order to keep it more alike to `git-stash.sh` script. Another difference was to change `assert_stash_ref()` function type from `int` to `void` and call `exit()` inside. This was introduced in order to make it more similar with `assert()` from `assert.h`. I do not think of these changes to be something of great importance, but I hope they might make the code easier to read. Best, Paul
Re: [PATCH 0/1] Makefile: fix the "built from commit" code
On Thu, Jun 28, 2018 at 12:53:15PM +, Johannes Schindelin via GitGitGadget wrote: > Let's fix that quoting, and while at it, also suppress the unhelpful > message > > fatal: not a git repository (or any of the parent directories): .git > > that gets printed to stderr if no current commit could be determined, > and might scare the occasional developer who simply tries to build Git > from scratch. I saw that building Git 2.18.0 for $DAYJOB. Thanks for fixing it. The series looked good to me, too. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH v6 2/4] stash: convert drop and clear to builtin
Hi, On 27.06.2018 01:17, Johannes Schindelin wrote: I thought you had introduced `get_oidf()` specifically so you could avoid the `rev-parse` call... `get_oidf(_oid, "%s@{0}", ref_stash)` should do this, right? We discussed this over the IRC channel, but since not everybody might follow the chat, I would like to put it also here. The main reason why `get_oidf()` was introduced is to make `assert_stash_like()` less tedious. I am sure that this is not the only place where this function could be useful. Over the last weeks, I have been working in parallel on adding a new flag (`GET_OID_GENTLY`) for `get_oid_with_context()`. By using this flag, the `rev-parse` call could be replaced with an internal API call. I hope that I will be able to send this patch soon enough. Best regards, Paul
Re: What's cooking in git.git (Jun 2018, #07; Thu, 28)
On Thu, Jun 28, 2018 at 2:40 PM Junio C Hamano wrote: > The tip of 'next' has been rewound and it currently has only 4 > topics. Quite a many topics are cooking in 'pu' and need to be > sifted into good bins (for 'next') and the remainder. Help is > appreciated in that area ;-) Which branches are needing review most? Or should I start commenting on random stuff? ;-) > > * bw/ref-in-want (2018-06-28) 8 commits > - fetch-pack: implement ref-in-want > - fetch-pack: put shallow info in output parameter > - fetch: refactor to make function args narrower > - fetch: refactor fetch_refs into two functions > - fetch: refactor the population of peer ref OIDs > - upload-pack: test negotiation with changing repository > - upload-pack: implement ref-in-want > - test-pkt-line: add unpack-sideband subcommand > > Protocol v2 has been updated to allow slightly out-of-sync set of > servers to work together to serve a single client, which would be > useful with load-balanced servers that talk smart HTTP transport. > I think another selling point, which should be emphasized more is the ease of ACL checking on the server side. Even when the read permissions are per repository (e.g. githubs model AFAICT), the serving side still has to do a lookup "Can a wanted sha1 be reached from all refs?", which now can be optimized away ("Can I access the master branch?") and it is cheaper to run the server. Specifically if read permissions are per ref (Gerrits model), I'd expect some CPU savings on the serving side. > * jt/remove-pack-bitmap-global (2018-06-21) 2 commits > - pack-bitmap: add free function > - pack-bitmap: remove bitmap_git global variable > > The effort to move globals to per-repository in-core structure > continues. This is mostly done, though Peff seems to expect a reroll with clarification on how the series is structured? https://public-inbox.org/git/20180611211033.gb26...@sigill.intra.peff.net/ > * sb/submodule-move-head-error-msg (2018-06-25) 1 commit > - submodule.c: report the submodule that an error occurs in > > Needs a reroll. > cf. <20180622081713.5360-1-szeder@gmail.com> https://public-inbox.org/git/xmqqmuviq2n7@gitster-ct.c.googlers.com/ suggests that you applied that change and a reroll would not be needed. > > * ds/multi-pack-index (2018-06-25) 24 commits > - midx: clear midx on repack > - packfile: skip loading index if in multi-pack-index > - midx: prevent duplicate packfile loads > - midx: use midx in approximate_object_count > - midx: use existing midx when writing new one > - midx: use midx in abbreviation calculations > - midx: read objects from multi-pack-index > - midx: prepare midxed_git struct > - config: create core.multiPackIndex setting > - midx: write object offsets > - midx: write object id fanout chunk > - midx: write object ids in a chunk > - midx: sort and deduplicate objects from packfiles > - midx: read pack names into array > - multi-pack-index: write pack names in chunk > - multi-pack-index: read packfile list > - packfile: generalize pack directory list > - multi-pack-index: expand test data > - multi-pack-index: load into memory > - midx: write header information to lockfile > - multi-pack-index: add 'write' verb > - multi-pack-index: add builtin > - multi-pack-index: add format details > - multi-pack-index: add design document > > When there are too many packfiles in a repository (which is not > recommended), looking up an object in these would require > consulting many pack .idx files; a new mechanism to have a single > file that consolidates all of these .idx files is introduced. > > > * nd/use-the-index-compat-less (2018-06-25) 13 commits > - wt-status.c: stop using index compat macros > - sha1-name.c: stop using index compat macros > - sequencer.c: stop using index compat macros > - revision.c: stop using index compat macros > - rerere.c: stop using index compat macros > - merge.c: stop using index compat macros > - merge-recursive.c: stop using index compat macros > - entry.c: stop using index compat macros > - diff.c: stop using index compat macros > - diff-lib.c: stop using index compat macros > - check-racy.c: stop using index compat macros > - blame.c: stop using index compat macros > - apply.c: stop using index compat macros > > Use of many convenience functions that operate on the_index > "primary in-core index instance" have been rewritten to explicitly > call the coutnerpart functions that take arbitrary index_state and > pass the_index. > > I'd say that alone is a useless code churn, but people seem to be > excited about the change, so it is queued here. As someone who also does lots of refactoring lately, I am excited to see the code health moving in the right direction. It is easy to quantify how often we are bitten by code churn (that you call useless here); and very hard to quantify bugs introduced or features not written/upstreamed due to clunky API (as we don't see those or do
Re: [PATCH v6 0/4] stash: add new tests and introduce a new helper function
Hello, Heh, what is more useful than apology is to tell us which order these three (apparent) series build on top of each other ;-) The answer, IIUC, is that * oidf+tests come first, then * apply/drop/branch/pop (as these rely on oidf) on top, and finally * list (as it wants to add to stash--helper that is a new file in the middle) When there is clear dependency like that, I agree that it would help readers to emphasize that these cannot be applied in an arbitrary order. It is especially true as the second part of the above _will_ apply more-or-less cleanly without the first one, and then fail to compile due to lack of oidf. Thanks. I would actually say that it was 100% my fault (I could have tested before sending patches or ask if I was not completely sure) and I would like to apologize for this. The order of the commits is actually good. The only thing that is not right are the cover letters, which are missing. Every 1/N marks the beginning of a series of patches. Just to be more clear, this is the right order: First patch: sha1-name.c: added 'get_oidf', which acts like 'get_oid' stash: improve option parsing test coverage stash: update test cases conform to coding guidelines stash: renamed test cases to be more descriptive Second patch: stash: convert apply to builtin stash: convert drop and clear to builtin stash: convert branch to builtin stash: convert pop to builtin Third patch: stash: implement the "list" command in the builtin stash: convert show to builtin stash: change `git stash show` usage text and documentation stash: refactor `show_stash()` to use the diff API stash: update `git stash show` documentation stash: convert store to builtin The only thing which was in the cover letters and might be worth mentioning here is related to `git stash show`. Although it might not be efficient, a 1:1 conversion is more easily to follow and review. Because of that, the first commit related to `git stash show` approaches a 1:1 conversion to C. There is a subsequent patch (`refactor `) which makes the `show` subcommand use the existent API. Any change of behavior is noted in the commit message which introduces that change. Best, Paul
Re: [PATCH v5 3/8] block alloc: add lifecycle APIs for cache_entry structs
> diff --git a/read-cache.c b/read-cache.c > index 9624ce1784..116fd51680 100644 > --- a/read-cache.c > +++ b/read-cache.c > +struct cache_entry *make_transient_cache_entry(unsigned int mode, const > struct object_id *oid, > +const char *path, int stage) > +{ > + struct cache_entry *ce; > + int len; > + > + if (!verify_path(path, mode)) { > + error("Invalid path '%s'", path); > + return NULL; > + } > + > + len = strlen(path); > + ce = make_empty_transient_cache_entry(len); > + > + hashcpy(ce->oid.hash, oid->hash); Please use oidcpy() here, too.
Re: [PATCH v5 2/8] read-cache: make_cache_entry should take object_id struct
> The make_cache_entry function should take an object_id struct instead > of sha. > diff --git a/read-cache.c b/read-cache.c > index fa8366ecab..9624ce1784 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -746,8 +746,10 @@ int add_file_to_index(struct index_state *istate, const > char *path, int flags) > } > > struct cache_entry *make_cache_entry(unsigned int mode, > - const unsigned char *sha1, const char *path, int stage, > - unsigned int refresh_options) > + const struct object_id *oid, > + const char *path, > + int stage, > + unsigned int refresh_options) > { > int size, len; > struct cache_entry *ce, *ret; > @@ -761,7 +763,7 @@ struct cache_entry *make_cache_entry(unsigned int mode, > size = cache_entry_size(len); > ce = xcalloc(1, size); > > - hashcpy(ce->oid.hash, sha1); > + hashcpy(ce->oid.hash, oid->hash); Speaking of using struct object_id instead of sha, please use oidcpy() here.
Re: [PATCH v3] fetch-pack: support negotiation tip whitelist
On 06/28, Jonathan Tan wrote: > During negotiation, fetch-pack eventually reports as "have" lines all > commits reachable from all refs. Allow the user to restrict the commits > sent in this way by providing a whitelist of tips; only the tips > themselves and their ancestors will be sent. > > Both globs and single objects are supported. > > This feature is only supported for protocols that support connect or > stateless-connect (such as HTTP with protocol v2). > > This will speed up negotiation when the repository has multiple > relatively independent branches (for example, when a repository > interacts with multiple repositories, such as with linux-next [1] and > torvalds/linux [2]), and the user knows which local branch is likely to > have commits in common with the upstream branch they are fetching. > > [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/ > [2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/ > > Signed-off-by: Jonathan Tan > --- > This is on jt/fetch-pack-negotiator. > > > I don't think that would be strange at all, and no where in git do we > > handle heads/* but we do already handle refs/heads/* as well as DWIM > > master. > > > > > and (2) I can't think of anywhere in Git > > > where you can provide either one - it's either SHA-1 and DWIM name, or > > > SHA-1 and refspec, but not all three. > > > > fetch is a perfect example of supporting all three. I can do > > > > git fetch origin SHA1 > > git fetch origin master > > git fetch origin refs/heads/*:refs/heads/* > > OK, Brandon managed to convince me that this is fine. I've included glob > support, supporting the same globs that git notes supports. > --- > Documentation/fetch-options.txt | 16 +++ > builtin/fetch.c | 41 + > fetch-pack.c| 19 +++- > fetch-pack.h| 7 +++ > t/t5510-fetch.sh| 78 + > transport-helper.c | 3 ++ > transport.c | 1 + > transport.h | 10 + > 8 files changed, 173 insertions(+), 2 deletions(-) > > diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt > index 97d3217df..6e4db1738 100644 > --- a/Documentation/fetch-options.txt > +++ b/Documentation/fetch-options.txt > @@ -42,6 +42,22 @@ the current repository has the same history as the source > repository. > .git/shallow. This option updates .git/shallow and accept such > refs. > > +--negotiation-tip=:: > + By default, Git will report, to the server, commits reachable > + from all local refs to find common commits in an attempt to > + reduce the size of the to-be-received packfile. If specified, > + Git will only report commits reachable from the given tips. > + This is useful to speed up fetches when the user knows which > + local ref is likely to have commits in common with the > + upstream ref being fetched. > ++ > +This option may be specified more than once; if so, Git will report > +commits reachable from any of the given commits. > ++ > +The argument to this option may be a glob on ref names, a ref, or the > (possibly > +abbreviated SHA-1 of a commit. Specifying a glob is equivalent to specifying > +this option multiple times, one for each matching ref name. I think you're missing a closing ')' Aside from that nit this patch looks good, thanks! -- Brandon Williams
[PATCH v3] fetch-pack: support negotiation tip whitelist
During negotiation, fetch-pack eventually reports as "have" lines all commits reachable from all refs. Allow the user to restrict the commits sent in this way by providing a whitelist of tips; only the tips themselves and their ancestors will be sent. Both globs and single objects are supported. This feature is only supported for protocols that support connect or stateless-connect (such as HTTP with protocol v2). This will speed up negotiation when the repository has multiple relatively independent branches (for example, when a repository interacts with multiple repositories, such as with linux-next [1] and torvalds/linux [2]), and the user knows which local branch is likely to have commits in common with the upstream branch they are fetching. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/ [2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/ Signed-off-by: Jonathan Tan --- This is on jt/fetch-pack-negotiator. > I don't think that would be strange at all, and no where in git do we > handle heads/* but we do already handle refs/heads/* as well as DWIM > master. > > > and (2) I can't think of anywhere in Git > > where you can provide either one - it's either SHA-1 and DWIM name, or > > SHA-1 and refspec, but not all three. > > fetch is a perfect example of supporting all three. I can do > > git fetch origin SHA1 > git fetch origin master > git fetch origin refs/heads/*:refs/heads/* OK, Brandon managed to convince me that this is fine. I've included glob support, supporting the same globs that git notes supports. --- Documentation/fetch-options.txt | 16 +++ builtin/fetch.c | 41 + fetch-pack.c| 19 +++- fetch-pack.h| 7 +++ t/t5510-fetch.sh| 78 + transport-helper.c | 3 ++ transport.c | 1 + transport.h | 10 + 8 files changed, 173 insertions(+), 2 deletions(-) diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt index 97d3217df..6e4db1738 100644 --- a/Documentation/fetch-options.txt +++ b/Documentation/fetch-options.txt @@ -42,6 +42,22 @@ the current repository has the same history as the source repository. .git/shallow. This option updates .git/shallow and accept such refs. +--negotiation-tip=:: + By default, Git will report, to the server, commits reachable + from all local refs to find common commits in an attempt to + reduce the size of the to-be-received packfile. If specified, + Git will only report commits reachable from the given tips. + This is useful to speed up fetches when the user knows which + local ref is likely to have commits in common with the + upstream ref being fetched. ++ +This option may be specified more than once; if so, Git will report +commits reachable from any of the given commits. ++ +The argument to this option may be a glob on ref names, a ref, or the (possibly +abbreviated SHA-1 of a commit. Specifying a glob is equivalent to specifying +this option multiple times, one for each matching ref name. + ifndef::git-pull[] --dry-run:: Show what would be done, without making any changes. diff --git a/builtin/fetch.c b/builtin/fetch.c index ea5b9669a..1a7ef305d 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -63,6 +63,7 @@ static int shown_url = 0; static struct refspec refmap = REFSPEC_INIT_FETCH; static struct list_objects_filter_options filter_options; static struct string_list server_options = STRING_LIST_INIT_DUP; +static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP; static int git_fetch_config(const char *k, const char *v, void *cb) { @@ -174,6 +175,8 @@ static struct option builtin_fetch_options[] = { TRANSPORT_FAMILY_IPV4), OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"), TRANSPORT_FAMILY_IPV6), + OPT_STRING_LIST(0, "negotiation-tip", _tip, N_("revision"), + N_("report that we have only objects reachable from this object")), OPT_PARSE_LIST_OBJECTS_FILTER(_options), OPT_END() }; @@ -1049,6 +1052,38 @@ static void set_option(struct transport *transport, const char *name, const char name, transport->url); } + +static int add_oid(const char *refname, const struct object_id *oid, int flags, + void *cb_data) +{ + struct oid_array *oids = cb_data; + oid_array_append(oids, oid); + return 0; +} + +static void add_negotiation_tips(struct git_transport_options *smart_options) +{ + struct oid_array *oids = xcalloc(1, sizeof(*oids)); + int i; + for (i = 0; i < negotiation_tip.nr; i++) { + const char *s = negotiation_tip.items[i].string; + int old_nr; + if (!has_glob_specials(s))
Re: [PATCH 4/4] fsck: silence stderr when parsing .gitmodules
On Thu, Jun 28, 2018 at 06:06:03PM -0400, Jeff King wrote: > Note that we didn't test this case at all, so I've added > coverage in t7415. We may end up toning down or removing > this fsck check in the future. So take this test as checking > what happens now with a focus on stderr, and not any > ironclad guarantee that we must detect and report parse > failures in the future. And such a patch _could_ look something like this. Though we could also perhaps leave it in place and tone it down to "ignore" by default. There's another case that triggers gitmodulesParse, too, which is a blob so gigantic that we try not to hold it in memory. Technically that could also happen due to somebody using .gitmodules for something unrelated. But that seems even more far-fetched. And it _is_ dangerous to leave, because I think existing vulnerable clients will try to load a 500MB .gitmodules file in memory and parse it. --- diff --git a/fsck.c b/fsck.c index 87b0e228bd..296e8a8a7c 100644 --- a/fsck.c +++ b/fsck.c @@ -1013,11 +1013,9 @@ static int fsck_blob(struct blob *blob, const char *buf, data.options = options; data.ret = 0; config_opts.error_action = CONFIG_ERROR_SILENT; - if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB, - ".gitmodules", buf, size, , _opts)) - data.ret |= report(options, >object, - FSCK_MSG_GITMODULES_PARSE, - "could not parse gitmodules blob"); + /* ignore errors */ + git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB, + ".gitmodules", buf, size, , _opts); return data.ret; } diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh index ba8af785a5..9a7dae88a5 100755 --- a/t/t7415-submodule-names.sh +++ b/t/t7415-submodule-names.sh @@ -176,7 +176,7 @@ test_expect_success 'fsck detects non-blob .gitmodules' ' ) ' -test_expect_success 'fsck detects corrupt .gitmodules' ' +test_expect_success 'fsck does not complain about corrupt .gitmodules' ' git init corrupt && ( cd corrupt && @@ -185,9 +185,8 @@ test_expect_success 'fsck detects corrupt .gitmodules' ' git add .gitmodules && git commit -m "broken gitmodules" && - test_must_fail git fsck 2>output && - grep gitmodulesParse output && - test_i18ngrep ! "bad config" output + git fsck 2>output && + ! test -s output ) '
[PATCH 4/4] fsck: silence stderr when parsing .gitmodules
If there's a parsing error we'll already report it via the usual fsck report() function (or not, if the user has asked to skip this object or warning type). The error message from the config parser just adds confusion. Let's suppress it. Note that we didn't test this case at all, so I've added coverage in t7415. We may end up toning down or removing this fsck check in the future. So take this test as checking what happens now with a focus on stderr, and not any ironclad guarantee that we must detect and report parse failures in the future. Signed-off-by: Jeff King --- fsck.c | 4 +++- t/t7415-submodule-names.sh | 15 +++ 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/fsck.c b/fsck.c index aa7a52cc80..87b0e228bd 100644 --- a/fsck.c +++ b/fsck.c @@ -992,6 +992,7 @@ static int fsck_blob(struct blob *blob, const char *buf, unsigned long size, struct fsck_options *options) { struct fsck_gitmodules_data data; + struct config_options config_opts = { 0 }; if (!oidset_contains(_found, >object.oid)) return 0; @@ -1011,8 +1012,9 @@ static int fsck_blob(struct blob *blob, const char *buf, data.obj = >object; data.options = options; data.ret = 0; + config_opts.error_action = CONFIG_ERROR_SILENT; if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB, - ".gitmodules", buf, size, , NULL)) + ".gitmodules", buf, size, , _opts)) data.ret |= report(options, >object, FSCK_MSG_GITMODULES_PARSE, "could not parse gitmodules blob"); diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh index b68c5f5e85..ba8af785a5 100755 --- a/t/t7415-submodule-names.sh +++ b/t/t7415-submodule-names.sh @@ -176,4 +176,19 @@ test_expect_success 'fsck detects non-blob .gitmodules' ' ) ' +test_expect_success 'fsck detects corrupt .gitmodules' ' + git init corrupt && + ( + cd corrupt && + + echo "[broken" >.gitmodules && + git add .gitmodules && + git commit -m "broken gitmodules" && + + test_must_fail git fsck 2>output && + grep gitmodulesParse output && + test_i18ngrep ! "bad config" output + ) +' + test_done -- 2.18.0.604.g8c4f59c959
[PATCH 3/4] config: add options parameter to git_config_from_mem
The underlying config parser knows how to handle a config_options struct, but git_config_from_mem() always passes NULL. Let's allow our callers to specify the options struct. We could add a "_with_options" variant, but since there are only a handful of callers, let's just update them to pass NULL. Signed-off-by: Jeff King --- config.c | 11 +++ config.h | 7 +-- fsck.c | 2 +- submodule-config.c | 2 +- 4 files changed, 14 insertions(+), 8 deletions(-) diff --git a/config.c b/config.c index 0797e284f4..4548edb1e5 100644 --- a/config.c +++ b/config.c @@ -1569,8 +1569,10 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data) return git_config_from_file_with_options(fn, filename, data, NULL); } -int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_type, - const char *name, const char *buf, size_t len, void *data) +int git_config_from_mem(config_fn_t fn, + const enum config_origin_type origin_type, + const char *name, const char *buf, size_t len, + void *data, const struct config_options *opts) { struct config_source top; @@ -1585,7 +1587,7 @@ int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_typ top.do_ungetc = config_buf_ungetc; top.do_ftell = config_buf_ftell; - return do_config_from(, fn, data, NULL); + return do_config_from(, fn, data, opts); } int git_config_from_blob_oid(config_fn_t fn, @@ -1606,7 +1608,8 @@ int git_config_from_blob_oid(config_fn_t fn, return error("reference '%s' does not point to a blob", name); } - ret = git_config_from_mem(fn, CONFIG_ORIGIN_BLOB, name, buf, size, data); + ret = git_config_from_mem(fn, CONFIG_ORIGIN_BLOB, name, buf, size, + data, NULL); free(buf); return ret; diff --git a/config.h b/config.h index c02809ffdc..f2063ceb86 100644 --- a/config.h +++ b/config.h @@ -68,8 +68,11 @@ extern int git_config_from_file(config_fn_t fn, const char *, void *); extern int git_config_from_file_with_options(config_fn_t fn, const char *, void *, const struct config_options *); -extern int git_config_from_mem(config_fn_t fn, const enum config_origin_type, - const char *name, const char *buf, size_t len, void *data); +extern int git_config_from_mem(config_fn_t fn, + const enum config_origin_type, + const char *name, + const char *buf, size_t len, + void *data, const struct config_options *opts); extern int git_config_from_blob_oid(config_fn_t fn, const char *name, const struct object_id *oid, void *data); extern void git_config_push_parameter(const char *text); diff --git a/fsck.c b/fsck.c index 0b8b20b6c4..aa7a52cc80 100644 --- a/fsck.c +++ b/fsck.c @@ -1012,7 +1012,7 @@ static int fsck_blob(struct blob *blob, const char *buf, data.options = options; data.ret = 0; if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB, - ".gitmodules", buf, size, )) + ".gitmodules", buf, size, , NULL)) data.ret |= report(options, >object, FSCK_MSG_GITMODULES_PARSE, "could not parse gitmodules blob"); diff --git a/submodule-config.c b/submodule-config.c index 388ef1f892..2ca3272dd1 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -561,7 +561,7 @@ static const struct submodule *config_from(struct submodule_cache *cache, parameter.gitmodules_oid = parameter.overwrite = 0; git_config_from_mem(parse_config, CONFIG_ORIGIN_SUBMODULE_BLOB, rev.buf, - config, config_size, ); + config, config_size, , NULL); strbuf_release(); free(config); -- 2.18.0.604.g8c4f59c959
[PATCH 2/4] config: add CONFIG_ERROR_SILENT handler
We can currently die() or error(), but there's not yet any way for callers to ask us just to quietly return an error. Let's give them one. Signed-off-by: Jeff King --- config.c | 3 +++ config.h | 1 + 2 files changed, 4 insertions(+) diff --git a/config.c b/config.c index b5c23369d0..0797e284f4 100644 --- a/config.c +++ b/config.c @@ -818,6 +818,9 @@ static int git_parse_source(config_fn_t fn, void *data, case CONFIG_ERROR_ERROR: error_return = error("%s", error_msg); break; + case CONFIG_ERROR_SILENT: + error_return = -1; + break; case CONFIG_ERROR_UNSET: BUG("config error action unset"); } diff --git a/config.h b/config.h index ce75bf1e5c..c02809ffdc 100644 --- a/config.h +++ b/config.h @@ -58,6 +58,7 @@ struct config_options { CONFIG_ERROR_UNSET = 0, /* use source-specific default */ CONFIG_ERROR_DIE, /* die() on error */ CONFIG_ERROR_ERROR, /* error() on error, return -1 */ + CONFIG_ERROR_SILENT, /* return -1 */ } error_action; }; -- 2.18.0.604.g8c4f59c959
[PATCH 1/4] config: turn die_on_error into caller-facing enum
The config code has a die_on_error flag, which lets us emit an error() instead of dying when we see a bogus config file. But there's no way for a caller of the config code to set this: it's auto-set based on whether we're reading a file or a blob. Instead, let's add it to the config_options struct. When it's not set (or we have no options) we'll continue to fall back to the existing file/blob behavior. Signed-off-by: Jeff King --- config.c | 18 +- config.h | 5 + 2 files changed, 18 insertions(+), 5 deletions(-) diff --git a/config.c b/config.c index a0a6ae1980..b5c23369d0 100644 --- a/config.c +++ b/config.c @@ -31,7 +31,7 @@ struct config_source { enum config_origin_type origin_type; const char *name; const char *path; - int die_on_error; + enum config_error_action default_error_action; int linenr; int eof; struct strbuf value; @@ -809,10 +809,18 @@ static int git_parse_source(config_fn_t fn, void *data, cf->linenr, cf->name); } - if (cf->die_on_error) + switch (opts && opts->error_action ? + opts->error_action : + cf->default_error_action) { + case CONFIG_ERROR_DIE: die("%s", error_msg); - else + break; + case CONFIG_ERROR_ERROR: error_return = error("%s", error_msg); + break; + case CONFIG_ERROR_UNSET: + BUG("config error action unset"); + } free(error_msg); return error_return; @@ -1520,7 +1528,7 @@ static int do_config_from_file(config_fn_t fn, top.origin_type = origin_type; top.name = name; top.path = path; - top.die_on_error = 1; + top.default_error_action = CONFIG_ERROR_DIE; top.do_fgetc = config_file_fgetc; top.do_ungetc = config_file_ungetc; top.do_ftell = config_file_ftell; @@ -1569,7 +1577,7 @@ int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_typ top.origin_type = origin_type; top.name = name; top.path = NULL; - top.die_on_error = 0; + top.default_error_action = CONFIG_ERROR_ERROR; top.do_fgetc = config_buf_fgetc; top.do_ungetc = config_buf_ungetc; top.do_ftell = config_buf_ftell; diff --git a/config.h b/config.h index 626d4654bd..ce75bf1e5c 100644 --- a/config.h +++ b/config.h @@ -54,6 +54,11 @@ struct config_options { const char *git_dir; config_parser_event_fn_t event_fn; void *event_fn_data; + enum config_error_action { + CONFIG_ERROR_UNSET = 0, /* use source-specific default */ + CONFIG_ERROR_DIE, /* die() on error */ + CONFIG_ERROR_ERROR, /* error() on error, return -1 */ + } error_action; }; typedef int (*config_fn_t)(const char *, const char *, void *); -- 2.18.0.604.g8c4f59c959
Re: [PATCH] fsck: check skiplist for object in fsck_blob()
On Thu, Jun 28, 2018 at 07:53:27PM +0100, Ramsay Jones wrote: > > Yes, though I don't think it's too bad. We already have a "die_on_error" > > flag in the config code. I think it just needs to become a tristate: > > die/error/silent (and probably get passed in via config_options, since I > > think we tie it right now to the file/blob source). > > Yes, but this code is already very crufty - and I'm just > waiting for someone to want to have per-repo/submodule > config parsing i... ;-) It is crufty, but I think we actually handle that part OK; the flag gets attached to the stack. > > Hmm, if we end up doing the config thing above, then this patch would > > become unnecessary. > > I was thinking of timing - the current patch could go > in quickly to solve the immediate problem (eg. in maint). > > Also, it does not hurt to do this _as well as_ suppress > the config errors. Yes, it can go in quickly. But I'd prefer not to keep it in the long term if it's literally doing nothing. I have some patches which I think solve your problem. They apply on v2.18.0, but not on v2.17.1 (because they rely on Dscho's increased passing of config_options in v2.18). Is that good enough? > > And I think doing that would help _all_ cases, even ones without a > > skiplist. They don't need to see random config error messages either, > > even if we do eventually report an fsck error. > > Oh, yes, I agree. You will have noticed that it was my > first suggested solution. (I have started writing that > patch a few times, but it just makes me want to throw > the current code away and start again)! > > Hmm, BTW, the 'rejected push' problem would include *any* > '.gitmodules' blob that contained a syntax error, right? > > Maybe it won't be as rare as all that! (Imagine not being > able to push due to a compiler error/warning in source files. > How irritating would that be! :-P ). Yes, it would include any syntax error. I also have a slight worry about that, but nobody seems to have screamed _yet_. :) > (if we fix this, you could hide some nefarious settings > after an obvious syntax error - then get the victim to > fix the syntax error ...) You can also usually get the victim to type "make", which is even simpler. :) Here are the patches I came up with. Note that the config_options struct has a bit of a dual-nature. Some options are respected only via config_from_options(), and some only from git_config_from_file(). I think this should probably be remedied in the long run, but I stopped here in the interest of keeping this maint-worthy. [1/4]: config: turn die_on_error into caller-facing enum [2/4]: config: add CONFIG_ERROR_SILENT handler [3/4]: config: add options parameter to git_config_from_mem [4/4]: fsck: silence stderr when parsing .gitmodules config.c | 32 +++- config.h | 13 +++-- fsck.c | 4 +++- submodule-config.c | 2 +- t/t7415-submodule-names.sh | 15 +++ 5 files changed, 53 insertions(+), 13 deletions(-) -Peff
Re: [PATCH 5/5] builtin/rebase: support running "git rebase "
On Thu, Jun 28, 2018 at 12:48 AM Pratik Karki wrote: > > This patch gives life to the skeleton added in the previous patch. > This patch makes real operation happen i.e. by using > `git -c rebase.usebuiltin=true rebase `. > With this patch, the basic operation of rebase can be done. Would it make sense to add this config option to some basic test in the test suite to show off one case in there? (Otherwise it is hard to keep this code correct for the future (even if it is just a few days/weeks) as other series on the list may collide with it in subtle ways, so a test would be fast signal to catch these subtleties). Maybe setting this in one of the early tests in t3400 would be good? > These backends use Unix shell functions defined both by git-sh-setup.sh > and git-rebase.sh (we move the latter's into git-rebase--common.sh to s/move/moved in a previous patch/ ? But then again we already know about the earlier patch, I am on the fence whether this is worth mentioning. But it sure is fine to leave it here. > accommodate for that), so we not only have to source the backend file > before calling the respective Unix shell script function, but we have > to source git-sh-setup and git-rebase--common before that. > And since this is all done in a Unix shell script snippet, all of this > is in argv[0]. There never will be a non-NULL argv[1]. No double negatives are never harder to read than simple forms. ;) So you are saying, there are no further arguments to that shell invocation? > This patch does the *bare* minimum to get `git rebase ` to > work: there is still no option parsing, and only the bare minimum set > of environment variables are set (in particular, the current revision > would be susceptible to bugs where e.g. `rebase_root` could be set by > mistake before running `git rebase` and the `git-rebase--am` backend > would pick up that variable and use it). > > It still calls original `git-legacy-rebase.sh` unless the config > setting rebase.useBuiltin is set to true. This patch uses the > detach_head_to() function from checkout.c introduced by a previous > commit to perform initial checkout. > > Signed-off-by: Pratik Karki > --- > builtin/rebase.c | 231 ++- > 1 file changed, 229 insertions(+), 2 deletions(-) > > diff --git a/builtin/rebase.c b/builtin/rebase.c > index 1152b7229..2f90389c2 100644 > --- a/builtin/rebase.c > +++ b/builtin/rebase.c > @@ -9,6 +9,19 @@ > #include "exec-cmd.h" > #include "argv-array.h" > #include "dir.h" > +#include "packfile.h" > +#include "checkout.h" > +#include "refs.h" > + > +static GIT_PATH_FUNC(apply_dir, "rebase-apply"); > +static GIT_PATH_FUNC(merge_dir, "rebase-merge"); > + > +enum rebase_type { > + REBASE_AM, > + REBASE_MERGE, > + REBASE_INTERACTIVE, > + REBASE_PRESERVE_MERGES > +}; > > static int use_builtin_rebase(void) > { > @@ -28,8 +41,129 @@ static int use_builtin_rebase(void) > return ret; > } > > +static int apply_autostash(void) > +{ > + warning("TODO"); This comes up unconditionally here, so the automated testing idea from above might not be as good as I thought after all. > +static struct commit *peel_committish(const char *name) The -ish suffix is to indicate that a wide range of notations that describe commits are accepted. Another way of naming this function would be by its output, i.e. peel_to_commit, the name similar to peel_to_type. But I guess emphasizing the input to be anything that describes a commit is also important here, as we pass in the arguments eventually provided by users (e.g. "master^^") so this name sounds fine; I cannot think of a better suggestion for now.
What's cooking in git.git (Jun 2018, #07; Thu, 28)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. The tip of 'next' has been rewound and it currently has only 4 topics. Quite a many topics are cooking in 'pu' and need to be sifted into good bins (for 'next') and the remainder. Help is appreciated in that area ;-) You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * ab/refspec-init-fix (2018-06-11) 3 commits (merged to 'next' on 2018-06-13 at 91d71d8435) + refspec: initalize `refspec_item` in `valid_fetch_refspec()` + refspec: add back a refspec_item_init() function + refspec: s/refspec_item_init/&_or_die/g Make refspec parsing codepath more robust. * as/safecrlf-quiet-fix (2018-06-11) 1 commit (merged to 'next' on 2018-06-13 at b163674843) + config.c: fix regression for core.safecrlf false Fix for 2.17-era regression around `core.safecrlf`. * jc/clean-after-sanity-tests (2018-06-15) 1 commit (merged to 'next' on 2018-06-22 at 2df77b8af9) + tests: clean after SANITY tests test cleanup. * jh/partial-clone (2018-06-12) 1 commit (merged to 'next' on 2018-06-13 at 818f864b0c) + list-objects: check if filter is NULL before using The recent addition of "partial clone" experimental feature kicked in when it shouldn't, namely, when there is no partial-clone filter defined even if extensions.partialclone is set. * jk/fetch-all-peeled-fix (2018-06-13) 2 commits (merged to 'next' on 2018-06-13 at 1333bb9d90) + fetch-pack: test explicitly that --all can fetch tag references pointing to non-commits + fetch-pack: don't try to fetch peel values with --all "git fetch-pack --all" used to unnecessarily fail upon seeing an annotated tag that points at an object other than a commit. * ms/send-pack-honor-config (2018-06-12) 1 commit (merged to 'next' on 2018-06-13 at e2cd933715) + builtin/send-pack: populate the default configs "git send-pack --signed" (hence "git push --signed" over the http transport) did not read user ident from the config mechanism to determine whom to sign the push certificate as, which has been corrected. * nd/completion-negation (2018-06-11) 3 commits (merged to 'next' on 2018-06-19 at a3be59b450) + completion: collapse extra --no-.. options + completion: suppress some -no- options + parse-options: option to let --git-completion-helper show negative form Continuing with the idea to programmatically enumerate various pieces of data required for command line completion, the codebase has been taught to enumerate options prefixed with "--no-" to negate them. * pw/add-p-recount (2018-06-11) 1 commit (merged to 'next' on 2018-06-19 at 1880ecc3f1) + add -p: fix counting empty context lines in edited patches When user edits the patch in "git add -p" and the user's editor is set to strip trailing whitespaces indiscriminately, an empty line that is unchanged in the patch would become completely empty (instead of a line with a sole SP on it). The code introduced in Git 2.17 timeframe failed to parse such a patch, but now it learned to notice the situation and cope with it. * sb/fix-fetching-moved-submodules (2018-06-14) 2 commits (merged to 'next' on 2018-06-22 at 16039dc62a) + t5526: test recursive submodules when fetching moved submodules + submodule: fix NULL correctness in renamed broken submodules The code to try seeing if a fetch is necessary in a submodule during a fetch with --recurse-submodules got confused when the path to the submodule was changed in the range of commits in the superproject, sometimes showing "(null)". This has been corrected. * sg/gpg-tests-fix (2018-06-11) 2 commits (merged to 'next' on 2018-06-13 at f3a05f1c41) + tests: make forging GPG signed commits and tags more robust + t7510-signed-commit: use 'test_must_fail' Some flaky tests have been fixed. * tz/cred-netrc-cleanup (2018-06-18) 4 commits (merged to 'next' on 2018-06-22 at a471dd714c) + git-credential-netrc: make "all" default target of Makefile + git-credential-netrc: fix exit status when tests fail + git-credential-netrc: use in-tree Git.pm for tests + git-credential-netrc: minor whitespace cleanup in test script Build and test procedure for netrc credential helper (in contrib/) has been updated. -- [New Topics] * ao/config-from-gitmodules (2018-06-26) 6 commits - submodule-config: reuse config_from_gitmodules in repo_read_gitmodules - submodule-config: pass repository as argument to config_from_gitmodules - submodule-config: make 'config_from_gitmodules' private - submodule-config: add helper to get 'update-clone'
Re: [PATCH 4/5] sequencer: refactor the code to detach HEAD to checkout.c
Hi Pratik, On Thu, Jun 28, 2018 at 12:48 AM Pratik Karki wrote: > > The motivation behind this commit is to extract the core part of > do_reset() from sequencer.c and move it to a new detach_head_to() > function in checkout.c. > [...] > > The new function will be used in the next commit by the builtin rebase, > to perform the initial checkout. This sounds like the actual motivation, which is fine. > Here the index only gets locked after performing the first part of > `do_reset()` rather than before which essentially derives the `oid` > from the specified label/name passed to the `do_reset()` function. > It also fixes two bugs: there were two `return error()` statements in > the `[new root]` case that would have failed to unlock the index. This sounds as if this fixes a problem? If so it would be nice to have a test that demonstrates that these specific problems go away. (but I think we could just argue based on the motivation above that this is a good change on its own, with or without demonstrating these additional issues) > Signed-off-by: Pratik Karki > --- > checkout.c | 64 + > checkout.h | 3 +++ > sequencer.c | 58 +--- > 3 files changed, 72 insertions(+), 53 deletions(-) > > diff --git a/checkout.c b/checkout.c > index bdefc888b..da68915fd 100644 > --- a/checkout.c > +++ b/checkout.c > @@ -2,6 +2,11 @@ > #include "remote.h" > #include "refspec.h" > #include "checkout.h" > +#include "unpack-trees.h" > +#include "lockfile.h" > +#include "refs.h" > +#include "tree.h" > +#include "cache-tree.h" > > struct tracking_name_data { > /* const */ char *src_ref; > @@ -42,3 +47,62 @@ const char *unique_tracking_name(const char *name, struct > object_id *oid) > free(cb_data.dst_ref); > return NULL; > } > + > +int detach_head_to(struct object_id *oid, const char *action, > + const char *reflog_message) > +{ > + struct strbuf ref_name = STRBUF_INIT; > + struct tree_desc desc; > + struct lock_file lock = LOCK_INIT; > + struct unpack_trees_options unpack_tree_opts; > + struct tree *tree; > + int ret = 0; > + > + if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0) > + return -1; > + > + memset(_tree_opts, 0, sizeof(unpack_tree_opts)); > + setup_unpack_trees_porcelain(_tree_opts, action); > + unpack_tree_opts.head_idx = 1; > + unpack_tree_opts.src_index = _index; > + unpack_tree_opts.dst_index = _index; > + unpack_tree_opts.fn = oneway_merge; > + unpack_tree_opts.merge = 1; > + unpack_tree_opts.update = 1; > + > + if (read_cache_unmerged()) { > + rollback_lock_file(); > + strbuf_release(_name); > + return error_resolve_conflict(_(action)); > + } > + > + if (!fill_tree_descriptor(, oid)) { > + error(_("failed to find tree of %s"), oid_to_hex(oid)); > + rollback_lock_file(); > + free((void *)desc.buffer); > + strbuf_release(_name); These lines are repeated as a very similar pattern after each failing function. Maybe we can make it more readable by moving all these to the end and then using goto to jump there. For example see "write_pseudoref" in refs.c, that has some interesting patterns to learn from, e.g. how the return code is constructed (start off with setting it -1 and only if we go through the whole function, just before the jump label, we'd set it to 0) and how all the free/strbuf_releases are at the end (no need to repeat them). > + return -1; > + } > + > + if (unpack_trees(1, , _tree_opts)) { > + rollback_lock_file(); > + free((void *)desc.buffer); > + strbuf_release(_name); > + return -1; > + } > + > + tree = parse_tree_indirect(oid); Awesome, the _indirect function can take commits/tags or trees. > + prime_cache_tree(_index, tree); As there is a larger movement to get rid of globals, and the_index is one of them[1]. So maybe just use the_repository->index already (the_repository suffers a similar problem, but I think that is more futureproof for the time being as we'd want to kill off the_repository in library code eventually as well and pass through a repository struct. But for now I'd just use the_repository instead of having a repository argument) [1] c.f. https://public-inbox.org/git/20180616054157.32433-1-pclo...@gmail.com/ > - struct lock_file lock = LOCK_INIT; > - struct tree_desc desc; > - struct tree *tree; > - struct unpack_trees_options unpack_tree_opts; > - int ret = 0, i; [...] Oh I misspoke above, this is moving code (I should have understood the hint with 'extracting' by the commit message), so in this case we'd rather want to move code most verbatim to make review easier, which it is. So
Re: [PATCH] sequencer: use configured comment character
Aaron Schrab writes: > Use configured comment character when generating comments about branches > in an instruction sheet. Failure to honor this configuration causes a > failure to parse the resulting instruction sheet. > > Signed-off-by: Aaron Schrab > --- > sequencer.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/sequencer.c b/sequencer.c > index 4034c0461b..caf91af29d 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -3991,7 +3991,7 @@ static int make_script_with_merges(struct > pretty_print_context *pp, > entry = oidmap_get(, >object.oid); > > if (entry) > - fprintf(out, "\n# Branch %s\n", entry->string); > + fprintf(out, "\n%c Branch %s\n", comment_line_char, > entry->string); > else > fprintf(out, "\n"); Would this interact OK with core.commentchar set to "auto"?
Re: [PATCH 3/5] rebase: refactor common shell functions into their own file
On Thu, Jun 28, 2018 at 12:48 AM Pratik Karki wrote: > > The function present in `git-legacy-rebase.sh` are used by backends > so, this refactor tries to extract the functions out so that, the it not only tries to, it actually does. :) > `git-legacy-rebase.sh` can be retired easily as the > `git-rebase--common.sh` will provide the functions for now. > > The motivation behind this is to call the backend functions > *directly* from C, bypassing `git-rebase.sh`. Therefore those functions > need to live in a separate file: we need to be able to call > `.git-rebase--common` in that script snippet so that those functions > are defined. Makes sense. I applied the patch (and checked the move via the --color-moved option to see if there are discrepancies that slip in easily via rebases as there is more work currently going on in the rebase area) and the found the functions were moved as-is, just reordered. Can you give a hint on why you choose a different order for the moved functions (not as an email reply, but as part of the commit message, later on people may ask the same question only to find this commit via git-blame or git-log for example) Thanks, Stefan
Re: [PATCH v5 6/8] mem-pool: fill out functionality
Jameson Miller writes: > +void mem_pool_combine(struct mem_pool *dst, struct mem_pool *src) > +{ > + struct mp_block *p; > + > + /* Append the blocks from src to dst */ > + if (dst->mp_block && src->mp_block) { > + /* > + * src and dst have blocks, append > + * blocks from src to dst. > + */ > + p = dst->mp_block; > + while (p->next_block) > + p = p->next_block; > + > + p->next_block = src->mp_block; Just being curious, but does this interact with the "we carve out only from the first block" done in step 4/8? The remaining unused space in the first block in the src pool would be wasted, which may not be such a big deal and may not even be worth comparing the available space in two blocks and picking a larger one. But we do want to decide _after_ thinking things through nevertheless.
Re: [PATCH] fsck: check skiplist for object in fsck_blob()
On 28/06/18 18:45, Jeff King wrote: > On Thu, Jun 28, 2018 at 05:56:18PM +0100, Ramsay Jones wrote: [snip] >>> One thing we could do is turn the parse failure into a noop. The main >>> point of the check is to protect people against the malicious >>> .gitmodules bug. If the file can't be parsed, then it also can't be an >>> attack vector (assuming the parser used for the fsck check and the >>> parser used by the victim behave identically). >> >> Hmm, yeah, but I would have to provide a means of suppressing >> the config parser error messages. Something I wanted to avoid. :( > > Yes, though I don't think it's too bad. We already have a "die_on_error" > flag in the config code. I think it just needs to become a tristate: > die/error/silent (and probably get passed in via config_options, since I > think we tie it right now to the file/blob source). Yes, but this code is already very crufty - and I'm just waiting for someone to want to have per-repo/submodule config parsing i... ;-) >> Junio, do you want me to address the above 'rejected push' >> issue in this patch, or with a follow-up patch? (It should >> be a pretty rare problem - famous last words!) > > Hmm, if we end up doing the config thing above, then this patch would > become unnecessary. I was thinking of timing - the current patch could go in quickly to solve the immediate problem (eg. in maint). Also, it does not hurt to do this _as well as_ suppress the config errors. > And I think doing that would help _all_ cases, even ones without a > skiplist. They don't need to see random config error messages either, > even if we do eventually report an fsck error. Oh, yes, I agree. You will have noticed that it was my first suggested solution. (I have started writing that patch a few times, but it just makes me want to throw the current code away and start again)! Hmm, BTW, the 'rejected push' problem would include *any* '.gitmodules' blob that contained a syntax error, right? Maybe it won't be as rare as all that! (Imagine not being able to push due to a compiler error/warning in source files. How irritating would that be! :-P ). (if we fix this, you could hide some nefarious settings after an obvious syntax error - then get the victim to fix the syntax error ...) ATB, Ramsay Jones
Re: [PATCH 2/5] rebase: start implementing it as a builtin
On Thu, Jun 28, 2018 at 12:48 AM Pratik Karki wrote: > > This commit imitates the strategy that was used to convert the > difftool to a builtin, see be8a90e (difftool: add a skeleton for the > upcoming builtin, 2017-01-17) for details: This commit renames the > shell script `git-rebase.sh` to `git-legacy-rebase.sh` and hands off to > it by default. That is a good way to start, imitating Johannes approach on rewriting the difftool. Thanks for pointing this out. > The current version of the builtin rebase does not, however, make full > use of the internals but instead chooses to spawn a couple of Git > processes to find out if we run the builtin or legacy rebase as that > keeps the directory that we are in correct. There remains a lot > of room for improvement, left for a later date. The following commits > will recreate the functionality of the shell script, in pure C. > > We intentionally avoid reading the config directly to avoid > messing up the GIT_* environment variables when we need to fall back to > exec()ing the shell script. Thanks for calling this out! The test of builtin rebase can be done by > `git -c rebase.useBuiltin=true rebase ...` > > Signed-off-by: Pratik Karki > --- > .gitignore| 1 + > Makefile | 3 +- > builtin.h | 1 + > builtin/rebase.c | 55 +++ > git-rebase.sh => git-legacy-rebase.sh | 0 > git.c | 6 +++ > 6 files changed, 65 insertions(+), 1 deletion(-) > create mode 100644 builtin/rebase.c > rename git-rebase.sh => git-legacy-rebase.sh (100%) > > diff --git a/.gitignore b/.gitignore > index 3284a1e9b..ec2395901 100644 > --- a/.gitignore > +++ b/.gitignore > @@ -78,6 +78,7 @@ > /git-init-db > /git-interpret-trailers > /git-instaweb > +/git-legacy-rebase > /git-log > /git-ls-files > /git-ls-remote > diff --git a/Makefile b/Makefile > index 0cb6590f2..e88fe2e5f 100644 > --- a/Makefile > +++ b/Makefile > @@ -609,7 +609,7 @@ SCRIPT_SH += git-merge-one-file.sh > SCRIPT_SH += git-merge-resolve.sh > SCRIPT_SH += git-mergetool.sh > SCRIPT_SH += git-quiltimport.sh > -SCRIPT_SH += git-rebase.sh > +SCRIPT_SH += git-legacy-rebase.sh > SCRIPT_SH += git-remote-testgit.sh > SCRIPT_SH += git-request-pull.sh > SCRIPT_SH += git-stash.sh > @@ -1059,6 +1059,7 @@ BUILTIN_OBJS += builtin/prune.o > BUILTIN_OBJS += builtin/pull.o > BUILTIN_OBJS += builtin/push.o > BUILTIN_OBJS += builtin/read-tree.o > +BUILTIN_OBJS += builtin/rebase.o > BUILTIN_OBJS += builtin/rebase--helper.o > BUILTIN_OBJS += builtin/receive-pack.o > BUILTIN_OBJS += builtin/reflog.o > diff --git a/builtin.h b/builtin.h > index 0362f1ce2..44651a447 100644 > --- a/builtin.h > +++ b/builtin.h > @@ -202,6 +202,7 @@ extern int cmd_prune_packed(int argc, const char **argv, > const char *prefix); > extern int cmd_pull(int argc, const char **argv, const char *prefix); > extern int cmd_push(int argc, const char **argv, const char *prefix); > extern int cmd_read_tree(int argc, const char **argv, const char *prefix); > +extern int cmd_rebase(int argc, const char **argv, const char *prefix); > extern int cmd_rebase__helper(int argc, const char **argv, const char > *prefix); > extern int cmd_receive_pack(int argc, const char **argv, const char *prefix); > extern int cmd_reflog(int argc, const char **argv, const char *prefix); > diff --git a/builtin/rebase.c b/builtin/rebase.c > new file mode 100644 > index 0..1152b7229 > --- /dev/null > +++ b/builtin/rebase.c > @@ -0,0 +1,55 @@ > +/* > + * "git rebase" builtin command > + * > + * Copyright (c) 2018 Pratik Karki > + */ > + > +#include "builtin.h" > +#include "run-command.h" > +#include "exec-cmd.h" > +#include "argv-array.h" > +#include "dir.h" > + > +static int use_builtin_rebase(void) > +{ > + struct child_process cp = CHILD_PROCESS_INIT; > + struct strbuf out = STRBUF_INIT; > + int ret; > + > + argv_array_pushl(, > +"config", "--bool", "rebase.usebuiltin", NULL); --bool is documented as "Historical options for selecting a type specifier. Prefer instead --type, (see: above)." in the man page of git-config. But as this code will go away once the conversion is done, this is not kept around for long. So we should be fine using the --bool option. > + cp.git_cmd = 1; > + if (capture_command(, , 6)) > + return 0; > + > + strbuf_trim(); > + ret = !strcmp("true", out.buf); As --bool will make sure that the config command prints "true" or "false", even when the user configured 0 or 1 instead, this is fine. > + if (argc != 2) > + die("Usage: %s ", argv[0]); > + prefix = setup_git_directory(); > + trace_repo_setup(prefix); > + setup_work_tree(); > + > + die("TODO"); When reading the last sentence in the commit message ("This can be tested ...") I shortly wondered how
Re: [PATCH v5 3/8] block alloc: add lifecycle APIs for cache_entry structs
Jameson Miller writes: > Add an API around managing the lifetime of cache_entry > structs. Abstracting memory management details behind this API will > allow for alternative memory management strategies without affecting > all the call sites. This commit does not change how memory is > allocated or freed. A later commit in this series will allocate cache > entries from memory pools as appropriate. > > Motivation: > It has been observed that the time spent loading an index with a large > number of entries is partly dominated by malloc() calls. This change > is in preparation for using memory pools to reduce the number of > malloc() calls made when loading an index. Not worth a reroll, but having these four lines at the very beginning, dropping the line "Motivation:", and then following that with "Add an API around ..." as the second paragraph, would make the above easier to read, without stutter-causing "Motivation:" in the middle. > diff --git a/apply.c b/apply.c > index 8ef975a32d..8a4a4439bc 100644 > --- a/apply.c > +++ b/apply.c > @@ -4092,12 +4092,12 @@ static int build_fake_ancestor(struct apply_state > *state, struct patch *list) > return error(_("sha1 information is lacking or useless " > "(%s)."), name); > > - ce = make_cache_entry(patch->old_mode, , name, 0, 0); > + ce = make_cache_entry(, patch->old_mode, , name, 0, > 0); > if (!ce) > return error(_("make_cache_entry failed for path '%s'"), >name); > if (add_index_entry(, ce, ADD_CACHE_OK_TO_ADD)) { > - free(ce); > + discard_cache_entry(ce); > return error(_("could not add %s to temporary index"), >name); > } So..., even though it wasn't clear in the proposed log message, two large part of the lifecycle management API is (1) make_cache_entry() knows for which istate it is creating the entry and (2) discarding the entry may not be just a simple matter of free()ing. Both of which makes perfect sense, but if the changes are that easily enumeratable, it would have been nicer for readers if the commit did so in the proposed log message. > @@ -4424,27 +4423,26 @@ static int add_conflicted_stages_file(struct > apply_state *state, > struct patch *patch) > { > int stage, namelen; > - unsigned ce_size, mode; > + unsigned mode; > struct cache_entry *ce; > > if (!state->update_index) > return 0; > namelen = strlen(patch->new_name); > - ce_size = cache_entry_size(namelen); > mode = patch->new_mode ? patch->new_mode : (S_IFREG | 0644); > > remove_file_from_cache(patch->new_name); > for (stage = 1; stage < 4; stage++) { > if (is_null_oid(>threeway_stage[stage - 1])) > continue; > - ce = xcalloc(1, ce_size); > + ce = make_empty_cache_entry(_index, namelen); ... and another one in the enumeration is make_empty_cache_entry() which is somehow different. And the readers are forced to read its implementation to find out how it is different, but the use of the same discard_cache_entry() suggests that the liftime rule of an entry allcoated by it may be similar to those created by make_cache_entry(). > ... > if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD) < 0) { > - free(ce); > + discard_cache_entry(ce); > return error(_("unable to add cache entry for %s"), >patch->new_name); > } > @@ -230,11 +230,11 @@ static int checkout_merged(int pos, const struct > checkout *state) > if (write_object_file(result_buf.ptr, result_buf.size, blob_type, )) > die(_("Unable to add merge result for '%s'"), path); > free(result_buf.ptr); > - ce = make_cache_entry(mode, , path, 2, 0); > + ce = make_transient_cache_entry(mode, , path, 2); ... and then yet another, which is "transient". An intelligent reader can guess from the lack of istate parameter (and from the word "transient") that the resulting one would not belong to any in-core index. > if (!ce) > die(_("make_cache_entry failed for path '%s'"), path); > status = checkout_entry(ce, state, NULL); > - free(ce); > + discard_cache_entry(ce); ... but discovers that it is discarded the same way, realizes that ce knows how it was allocated to allow discard() different way to discard it, and his/her earlier conjecture about make_empty() does not hold at all and gets somewhat disappointed. > diff --git a/cache.h b/cache.h > index 3fbf24771a..035a627bea 100644 > --- a/cache.h > +++ b/cache.h > @@ -339,6 +339,40 @@ extern void remove_name_hash(struct index_state *istate, > struct cache_entry
Re: [PATCH 2/2] grep.c: teach 'git grep --only-matching'
On Mon, Jun 25, 2018 at 04:26:00PM -0500, Taylor Blau wrote: > For instance, a line containing the following (taken from README.md:27): > > (`man gitcvs-migration` or `git help cvs-migration` if git is > > Is printed as follows: > > $ git grep -no -e git -- README.md | grep ":27" > README.md:27:7:git > README.md:27:16:git > README.md:27:38:git This is with "--column", too, right? > Like GNU grep, this patch ignores --only-matching when --invert (-v) is > given. There is a sensible answer here, but parity with the behavior of > other tools is preferred. Yeah, after all of our discussion about inverted matching and columns, I'm sure we could come up with _some_ answer. But I agree that what you have here is quite sensible, and matching GNU grep seems like a good idea. > --- > builtin/grep.c | 6 ++ > grep.c | 48 +--- > grep.h | 1 + > t/t7810-grep.sh | 15 +++ > 4 files changed, 55 insertions(+), 15 deletions(-) The patch itself looks pretty straightforward to me (especially with "-w"). I didn't hit the compiler warning that Junio did (I have gcc 7.3.0). But I agree it's better to avoid even passing an uninitialized variable to another function. -Peff
Re: curious about wording in "man git-config", ENVIRONMENT
On Tue, Jun 26, 2018 at 12:51:45PM -0400, Robert P. J. Day wrote: > > I agree it's weird. I think it's trying to mean "behaves as if it > > was set to", but with the additional notion that the command-line > > argument would take precedence over the environment (which is our > > usual rule). But then we should just say those things explicitly. > > > > Just looking at mentions of GIT_CONFIG in that manpage and knowing > > the history, I think: > > ... snip ... > > i'm just going to admit that i don't quite have the background to know > how to submit a patch to tidy things up based on Jeff's analysis, so > I'm going to leave this to someone higher up the food chain. There's some related discussion going on in: https://public-inbox.org/git/20180627045637.13818-1-...@pobox.com/ I think it makes sense to wait for that to settle and then I may try to build on top. -Peff
Re: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs
On Thu, Jun 28, 2018 at 12:36:06PM -0400, Todd Zullinger wrote: > >> It might be enough if the default values are relatively sane > >> and consistent. That would be a slight improvement over the > >> current situation still. > > > > Yeah. Taking a step back from the implementation questions, I think > > "$(prefix)/etc/gitconfig" is not very helpful text. I'd be happy to see > > us come up with a generic way of saying that which is more > > comprehensible to end-users. Your patches side-step that by filling in > > the real value, but unfortunately we can't do that everywhere. :) > > > > There may not be a good "token" value, though. I.e., we may need to have > > two sets of verbiage: the specific one, and the generic one. > > Yeah. About the best generic term I can come up with is > using '$sysconfdir'. But I have no idea if that's something > most readers will recognize as a placeholder for something > like /etc. I don't that's much better than $(prefix). It's at least _correct_ more often, but unless you're used to autotools conventions, it's pretty obscure. Can we just say "the system configuration direction (e.g., /etc)" or something like that? That definitely requires doing some kind of ifdef in the asciidoc source, but I think the end result will be much more comprehensible to end users. And IMHO the readability hit to the source is not too bad (at least I don't find the DEFAULT_PAGER one to be). Something like this: diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt index 18ddc78f42..ab20dd5235 100644 --- a/Documentation/git-config.txt +++ b/Documentation/git-config.txt @@ -113,13 +113,16 @@ For reading options: read only from global `~/.gitconfig` and from See also <>. --system:: - For writing options: write to system-wide - `$(prefix)/etc/gitconfig` rather than the repository - `.git/config`. + For writing options: write to the system-wide gitconfig file + rather then the repository config. + -For reading options: read only from system-wide `$(prefix)/etc/gitconfig` +For reading options: read only from system-wide gitconfig file rather than from all available files. + +The exact location of the system-wide file is configured when Git is +built. +ifdef::git-doc-generic[In many builds, it is `/etc/gitconfig`.] +ifndef::git-doc-generic[In your build, it is +{etc-gitconfig}+.] See also <>. --local:: Though I would also be happy if we simply said "system-wide gitconfig" here and then had the conditional part in FILES. I'd actually argue that all of the "source" options should be grouped, like: --local:: --global:: --system:: --file=:: When writing, write to the repository-local, user-global, or system-wide configuration file (or any `` you specify). When reading, read from a specific file rather than from all available files. See < below for more details. And then let <> describe in prose the various files, where they might be, etc. That also cleans up the `.git/config` thing, which is a mild fiction (it is really `$GIT_DIR/config`). > A number of the path references are in the FILES sections of > the man pages. It might not be much of an improvement if we > try to stuff too much text in those references. Perhaps if > those used '$sysconfdir/gitconfig' a subsequent note could > expand on that? It could even be wrapped in an ifdef > similar to that used for the default editor/pager. So yeah, I think I am arguing along the same lines, but just saying "system-wide gitconfig" or similar instead of $sysconfdir/gitconfig. I think the English is a little less daunting. > > I think it shouldn't go into config.mak.uname, since the idea there was > > to keep the long list of platform defaults separate from other logic > > (the Makefile is already long enough!). So I'm basically proposing the > > same thing but in its own file. :) > > Ahh, something that the top-level Makefile would create and > then subdir Makefiles would include, perhaps similar to the > way GIT-VERSION-FILE is updated and included? That could > prove useful to some of the tools in contrib which duplicate > logic. No, I had just envisioned it would be a static file, like config.mak.paths or something. I'm literally just trying to break out bits of our Makefile into bite-sized files so they're easier look at. > Skipping that larger de-duplication cleanup, here's a stab > at implementing the DOC_GENERIC knob (and the DOC_ overrides > for ETC_GIT(CONFIG|ATTRIBUTES). The defaults with > '/GENERIC-SYSCONFDIR' in them are just placeholders waiting > for a better name. :) So obviously I like the direction of this patch, but if you agree with my line of thinking above you'd need to turn DOC_GENERIC into an attribute and use in-source conditionals. Hopefully you do agree. :) > Thanks for thinking this through and providing some good > directions to work on! Thank you for working on it! I was thinking about this because of
Re: [GSoC][PATCH v5 0/2] rebase -i: rewrite append_todo_help() in C
Alban Gruin writes: > This patch rewrites append_todo_help() from shell to C. The C version > covers a bit more than the old shell version. To achieve that, some > parameters were added to rebase--helper. > > This also introduce a new source file, rebase-interactive.c. > > This is part of the effort to rewrite interactive rebase in C. > > This is based on next, as of 2018-06-28. Please do not base new development on 'next'. You typically do not require the whole of 'next'; there is *NO* reason why this topic must wait until say ds/ewah-cleanup topic graduates to 'master', for example. You may still depend on just a handful of selected topics that you build on that are not yet in 'master'. Identify them and build on top of the merge of them on top of whatever integration branch you are aiming for (typically 'master'). And list these topics explicitly, instead of saying 'based on next'.
Re: [PATCH 1/1] Makefile: fix the "built from commit" code
On Thu, Jun 28, 2018 at 06:23:56PM +0200, Johannes Schindelin wrote: > On Thu, 28 Jun 2018, Jeff King wrote: > > > On Wed, Jun 27, 2018 at 09:35:23PM +0200, Johannes Schindelin via > > GitGitGadget wrote: > > > > > To prevent erroneous commits from being reported (e.g. when unpacking > > > Git's source code from a .tar.gz file into a subdirectory of a different > > > Git project, as e.g. git_osx_installer does), we painstakingly set > > > GIT_CEILING_DIRECTORIES when trying to determine the current commit. > > > > > > Except that we got the quoting wrong, and that variable therefore does > > > not have the desired effect. > > > > > > Let's fix that quoting, and while at it, also suppress the unhelpful > > > message > > > > I had to stare at the code for a bit to figure out what was wrong: > > Do you want me to update the commit message? I'm OK either way. Probably not worth a re-roll unless you want to be completionist about commit messages (personally I do not mind occasionally jumping to the list archive to get historical context and reviews). -Peff
Re: [PATCH 1/1] Makefile: fix the "built from commit" code
On Thu, Jun 28, 2018 at 10:27:32AM -0700, Junio C Hamano wrote: > Johannes Schindelin writes: > > >> I.e.: > >> > >> FOO='with spaces' > >> BAR=$FOO sh -c 'echo $BAR' > >> > >> works just fine. > > > > $ x="two spaces" > > > > $ echo $x > > two spaces > > > > Maybe we should quote a little bit more religiously. > > Both of you are wrong ;-) > > Of course, the lack of dq around echo's argument makes shell split > two and spaces into two args and feed them separately to echo, and > causes echo to show them with a single SP in between. Peff's > exampel should have been > > BAR=$FOO sh -c 'echo "$BAR"' Yes, that's a better example. I was primarily trying to show that the outer shell did not barf with "spaces: command not found". > But that does not have much to do with the primary point Peff was > talking about, which is that in this sequence: > > $ x="two spaces" > $ y="$x" > $ z=$x > $ echo "x=<$x>" "y=<$y>" "z=<$z>" > > assignment to y and z behave identically, i.e. dq around "$x" when > assigning to y is not needed. I actually had to test it to convince myself that one-shot assignments behaved the same way, but they do. -Peff
Re: [PATCH] fsck: check skiplist for object in fsck_blob()
On Thu, Jun 28, 2018 at 05:56:18PM +0100, Ramsay Jones wrote: > > Yeah, this solution seems sensible. Given that we would never report any > > error for that blob, there is no point in even looking at it. I wonder > > if we ought to do the same for other types, too. Is there any point in > > opening a tree that is in the skiplist? > > Note that the 'blob' object has already been 'loaded' at this > point anyway (and the basic 'object' checks have been done). Yeah, you're right. If we wanted to avoid that, we should prevent it from entering the gitmodules_found list in the first place. Of course, we'd generally still load it once anyway in order to check the sha1. So really, the most we could do is prevent it from being loaded a _second_ time for no reason during the fsck_finish() stage. But for the reasons I gave in the fsck_finish() patches (like pack ordering), we will quite often end up hitting it in the first pass anyway. So it's probably not worth spending too much time trying to optimize it. And thinking on that, my "should we do this for trees" is pretty dumb, too. We load them and compute their sha1 anyway (I _think_ bitrot checks like that are totally independent of skiplist). So all we'd be saving is walking the buffer entries. > I did think about this, briefly, but decided that we should > only 'skip' the leaf nodes (blobs). (So, when processing > commits, trees and tags, we should not report an error for > that object-id, but that should not stop us from checking > the tree object associated with a commit, just because of > a problem with the commit message). > > [Oh, wait - Hmm, each object could be checked independently > of all others and not used for any object traversal right? > (e.g. using packfile index). I saw fsck_walk() and didn't > look any further ... Ah, broken link check, ... I obviously > need to read the code some more ... :-D ] Yes, I've been confused by this code before. I'm still not sure I totally understand it. ;) > > One thing we could do is turn the parse failure into a noop. The main > > point of the check is to protect people against the malicious > > .gitmodules bug. If the file can't be parsed, then it also can't be an > > attack vector (assuming the parser used for the fsck check and the > > parser used by the victim behave identically). > > Hmm, yeah, but I would have to provide a means of suppressing > the config parser error messages. Something I wanted to avoid. :( Yes, though I don't think it's too bad. We already have a "die_on_error" flag in the config code. I think it just needs to become a tristate: die/error/silent (and probably get passed in via config_options, since I think we tie it right now to the file/blob source). > Junio, do you want me to address the above 'rejected push' > issue in this patch, or with a follow-up patch? (It should > be a pretty rare problem - famous last words!) Hmm, if we end up doing the config thing above, then this patch would become unnecessary. And I think doing that would help _all_ cases, even ones without a skiplist. They don't need to see random config error messages either, even if we do eventually report an fsck error. -Peff
Re: [PATCH] fsck: check skiplist for object in fsck_blob()
On Thu, Jun 28, 2018 at 09:39:39AM -0700, Junio C Hamano wrote: > Jeff King writes: > > > Yeah, this solution seems sensible. Given that we would never report any > > error for that blob, there is no point in even looking at it. I wonder > > if we ought to do the same for other types, too. Is there any point in > > opening a tree that is in the skiplist? > > Suppose the tree is listed there only because it has one entry for a > subtree with leading 0 in its mode. We do want to ignore that > format violation, but we still want to learn the fact that the > subtree it points at and its contents are connected and not > dangling, so we do need to open it. Is that open done in a separate > phase? To be honest, I'm not sure. There _is_ a separate phase for checking reachability, but I think there may be some dependencies between the phases. Once upon a time they were communicated by the existence of entries in obj_hash (blech!) but I think these days it happens using a a bit in object->flags. There is at least one case of interest just in this phase, though: we have to read the surrounding tree to find out that a particular blob is a .gitmodules file. So if you skiplist'd a tree, that would also mean we fail to mark its .gitmodules (if any) as such. I'm not sure if that would be a bug or a feature, though. -Peff
Re: [PATCH] fsck: check skiplist for object in fsck_blob()
Ramsay Jones writes: > Junio, do you want me to address the above 'rejected push' > issue in this patch, or with a follow-up patch? (It should > be a pretty rare problem - famous last words!) If you feel the need to say "famous last words", it is an indication that it is better done as a follow-up, I would think ;-) Thanks for spotting and addressing this issue.
Re: [PATCH 1/1] Makefile: fix the "built from commit" code
Johannes Schindelin writes: >> I.e.: >> >> FOO='with spaces' >> BAR=$FOO sh -c 'echo $BAR' >> >> works just fine. > > $ x="two spaces" > > $ echo $x > two spaces > > Maybe we should quote a little bit more religiously. Both of you are wrong ;-) Of course, the lack of dq around echo's argument makes shell split two and spaces into two args and feed them separately to echo, and causes echo to show them with a single SP in between. Peff's exampel should have been BAR=$FOO sh -c 'echo "$BAR"' But that does not have much to do with the primary point Peff was talking about, which is that in this sequence: $ x="two spaces" $ y="$x" $ z=$x $ echo "x=<$x>" "y=<$y>" "z=<$z>" assignment to y and z behave identically, i.e. dq around "$x" when assigning to y is not needed.
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On Thu, Jun 28, 2018 at 07:21:18PM +0200, Lars Schneider wrote: > How about this: > > 1) We allow users to set the encoding "auto". Example: > > *.txt working-tree-encoding=auto > > 2) We define a new variable `core.autoencoding`. By default the value is > UTF-8 (== no re-encoding) but user can set to any value in their Git config. > Example: > > git config --global core.autoencoding UTF-16 > > All files marked with the value "auto" will use the encoding defined in > `core.autoencoding`. > > Would that work? Yeah, that was along the lines that I was thinking. I wonder if anybody would ever need two such auto-encodings, though. Probably not. But another way to think about it would be to allow something like: working-tree-encoding=foo and then in your config "foo" to map to some encoding. But that may be over-engineering, I dunno. utf8 has always been enough for me. :) -Peff
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
> On Jun 28, 2018, at 4:34 PM, Jeff King wrote: > > On Thu, Jun 28, 2018 at 02:44:47AM +, brian m. carlson wrote: > >> On Wed, Jun 27, 2018 at 07:54:52AM +, Steve Groeger wrote: >>> We have common code that is supposed to be usable across different >>> platforms and hence different file encodings. With the full support of the >>> working-tree-encoding in the latest version of git on all platforms, how do >>> we have files converted to different encodings on different platforms? >>> I could not find anything that would allow us to say 'if platform = z/OS >>> then encoding=EBCDIC else encoding=ASCII'. Is there a way this can be >>> done? >> >> I don't believe there is such functionality. Git doesn't have >> attributes that are conditional on the platform in that sort of way. >> You could use a smudge/clean filter and adjust the filter for the >> platform you're on, which might meet your needs. > > We do have prior art in the line-ending code, though. There the > attributes say either that a file needs a specific line-ending type > (which is relatively rare), or that it should follow the system type, > which is then set separately in the config. > > I have the impression that the working-tree-encoding stuff was made to > handle the first case, but not the second. It doesn't seem like an > outrageous thing to eventually add. > > (Though I agree that clean/smudge filters would work, and can even > implement the existing working-tree-encoding feature, albeit less > efficiently and conveniently). Thanks for the suggestion Peff! How about this: 1) We allow users to set the encoding "auto". Example: *.txt working-tree-encoding=auto 2) We define a new variable `core.autoencoding`. By default the value is UTF-8 (== no re-encoding) but user can set to any value in their Git config. Example: git config --global core.autoencoding UTF-16 All files marked with the value "auto" will use the encoding defined in `core.autoencoding`. Would that work? @steve: Would that fix your problem? - Lars
Re: [PATCH v5 5/8] mem-pool: add life cycle management functions
Jameson Miller writes: > Add initialization and discard functions to mem_pool type. As the > memory allocated by mem_pool can now be freed, we also track the large > allocations. > > If the there are existing mp_blocks in the mem_poo's linked list of mem_poo? > mp_blocksl, then the mp_block for a large allocation is inserted mp_blocksl?
Re: [PATCH v5 2/8] read-cache: make_cache_entry should take object_id struct
Jameson Miller writes: > The make_cache_entry function should take an object_id struct instead > of sha. The name of the hash is SHA-1, not sha ;-) It is not worth a reroll, but I do not think "should" is a particularly good thing to say in the title or justification in the log message in this case. It is more like you (or somebody else who commented) _want_ to make it take an oid for _some_ reason. "teach make_cache_entry() to take object_id" is probably a better title that admits that we do not explicitly say _why_ we are doing so, than saying "it should", which equally is not explicit ;-) > diff --git a/read-cache.c b/read-cache.c > index fa8366ecab..9624ce1784 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -746,8 +746,10 @@ int add_file_to_index(struct index_state *istate, const > char *path, int flags) > } > > struct cache_entry *make_cache_entry(unsigned int mode, > - const unsigned char *sha1, const char *path, int stage, > - unsigned int refresh_options) > + const struct object_id *oid, > + const char *path, > + int stage, > + unsigned int refresh_options) > { > int size, len; > struct cache_entry *ce, *ret; > @@ -761,7 +763,7 @@ struct cache_entry *make_cache_entry(unsigned int mode, > size = cache_entry_size(len); > ce = xcalloc(1, size); > > - hashcpy(ce->oid.hash, sha1); > + hashcpy(ce->oid.hash, oid->hash); > memcpy(ce->name, path, len); > ce->ce_flags = create_ce_flags(stage); > ce->ce_namelen = len; The patch itself looks good. Thanks.
Re: [PATCH v4] Documentation: declare "core.ignoreCase" as internal variable
On 28.06.18 13:21, Marc Strapetz wrote: > The current description of "core.ignoreCase" reads like an option which > is intended to be changed by the user while it's actually expected to > be set by Git on initialization only. Subsequently, Git relies on the > proper configuration of this variable, as noted by Bryan Turner [1]: > > Git on a case-insensitive filesystem (APFS, HFS+, FAT32, exFAT, > vFAT, NTFS, etc.) is not designed to be run with anything other > than core.ignoreCase=true. > > [1] https://marc.info/?l=git=152998665813997=2 > mid:cagyf7-gee8jrgpkme9rhkptheq6p1+ebpmmwatmh01uo3bf...@mail.gmail.com > > Signed-off-by: Marc Strapetz > --- > Documentation/config.txt | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828..c70cfe956 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -390,16 +390,19 @@ core.hideDotFiles:: > default mode is 'dotGitOnly'. > > core.ignoreCase:: > - If true, this option enables various workarounds to enable > + Internal variable which enables various workarounds to enable > Git to work better on filesystems that are not case sensitive, > - like FAT. For example, if a directory listing finds > - "makefile" when Git expects "Makefile", Git will assume > + like APFS, HFS+, FAT, NTFS, etc. For example, if a directory listing > + finds "makefile" when Git expects "Makefile", Git will assume > it is really the same file, and continue to remember it as > "Makefile". > + > The default is false, except linkgit:git-clone[1] or linkgit:git-init[1] > will probe and set core.ignoreCase true if appropriate when the repository > is created. > ++ > +Git relies on the proper configuration of this variable for your operating > +and file system. Modifying this value may result in unexpected behavior. > > core.precomposeUnicode:: > This option is only used by Mac OS implementation of Git. > Looks good to me
Re: [PATCH] fsck: check skiplist for object in fsck_blob()
On 28/06/18 12:49, Jeff King wrote: > On Wed, Jun 27, 2018 at 07:39:53PM +0100, Ramsay Jones wrote: > >> Since commit ed8b10f631 ("fsck: check .gitmodules content", 2018-05-02), >> fsck will issue an error message for '.gitmodules' content that cannot >> be parsed correctly. This is the case, even when the corresponding blob >> object has been included on the skiplist. For example, using the cgit >> repository, we see the following: >> >> $ git fsck >> Checking object directories: 100% (256/256), done. >> error: bad config line 5 in blob .gitmodules >> error in blob 51dd1eff1edc663674df9ab85d2786a40f7ae3a5: gitmodulesParse: >> could not parse gitmodules blob >> Checking objects: 100% (6626/6626), done. >> $ >> >> $ git config fsck.skiplist '.git/skip' >> $ echo 51dd1eff1edc663674df9ab85d2786a40f7ae3a5 >.git/skip >> $ >> >> $ git fsck >> Checking object directories: 100% (256/256), done. >> error: bad config line 5 in blob .gitmodules >> Checking objects: 100% (6626/6626), done. >> $ >> >> Note that the error message issued by the config parser is still >> present, despite adding the object-id of the blob to the skiplist. >> >> One solution would be to provide a means of suppressing the messages >> issued by the config parser. However, given that (logically) we are >> asking fsck to ignore this object, a simpler approach is to just not >> call the config parser if the object is to be skipped. Add a check to >> the 'fsck_blob()' processing function, to determine if the object is >> on the skiplist and, if so, exit the function early. > > Yeah, this solution seems sensible. Given that we would never report any > error for that blob, there is no point in even looking at it. I wonder > if we ought to do the same for other types, too. Is there any point in > opening a tree that is in the skiplist? Note that the 'blob' object has already been 'loaded' at this point anyway (and the basic 'object' checks have been done). I did think about this, briefly, but decided that we should only 'skip' the leaf nodes (blobs). (So, when processing commits, trees and tags, we should not report an error for that object-id, but that should not stop us from checking the tree object associated with a commit, just because of a problem with the commit message). [Oh, wait - Hmm, each object could be checked independently of all others and not used for any object traversal right? (e.g. using packfile index). I saw fsck_walk() and didn't look any further ... Ah, broken link check, ... I obviously need to read the code some more ... :-D ] >> I noticed recently that the 'cgit.git' repo was complaining when >> doing an 'git fsck' ... >> >> Note that 'cgit' had a '.gitmodules' file and a 'submodule.sh' >> script back in 2007, which had nothing to do with the current >> git submodule facilities, ... ;-) > > Yikes. I worried about that sort of regression when adding the > .gitmodules checks. But this _is_ a pretty unique case: somebody was > implementing their own version of submodules (pre-git-submodule) and > decided to use the same name. So I'm not sure if this is a sign that we > need to think through the regression, or a sign that it really is rare. I don't have any numbers, but my gut tells me that this would be very rare indeed. Of course, my gut has been wrong before ... :-D > One thing we could do is turn the parse failure into a noop. The main > point of the check is to protect people against the malicious > .gitmodules bug. If the file can't be parsed, then it also can't be an > attack vector (assuming the parser used for the fsck check and the > parser used by the victim behave identically). Hmm, yeah, but I would have to provide a means of suppressing the config parser error messages. Something I wanted to avoid. :( > That wouldn't help with your stray message, of course, but it would > eliminate the need to deal with the skiplist here (and skiplists aren't > always easy to do -- for instance, pushing up a non-fork of cgit to > GitHub would now be rejected because of this old file, and you'd have to > contact support to resolve it). Good point. >> I just remembered that I had intended to review the name of the >> function that this patch introduces before sending, but totally >> forgot! :( >> >> [Hmm, 'to_be_skipped' -> object_to_be_skipped, object_on_skiplist, >> etc., ... suggestions welcome!] > > The current name is OK to be, but object_on_skiplist() also seems quite > obvious. object_on_skiplist() it is! Junio, do you want me to address the above 'rejected push' issue in this patch, or with a follow-up patch? (It should be a pretty rare problem - famous last words!) ATB, Ramsay Jones
Re: Inconsistencies in commit-graph technical docs.
Thanks for the quick response and for the patch. I started writing such a long process document because I _thought_ that I found a major issue with the Large Edge List. But, in the end, it was user error. It turned out that I was looking at index '11' when I should have been looking at index '0x11' ('17'). Whoops! On Thu, Jun 28, 2018 at 5:49 AM Derrick Stolee wrote: > > On 6/28/2018 1:11 AM, Grant Welch wrote: > > I recently read the "Supercharging the Git Commit Graph blog by > > Derrick Stolee. I found the article interesting and wanted to verify > > the performance numbers for myself. Then that led me to want to know > > more about the implementation, so I read the technical design notes in > > commit-graph.txt, and then I jumped into the format documentation in > > commit-graph-format.txt. > > > > Along the way, I noticed a few issues. They might just be errors in > > the documentation, but I figured it was worth documenting my entire > > process just to be sure. > > > > "Supercharging the Git Commit Graph", by Derrick Stolee: > > > > https://blogs.msdn.microsoft.com/devops/2018/06/25/supercharging-the-git-commit-graph/ > > > > # TL;DR > > > > I found a few discrepencies between the documentation in > > commit-graph-format.txt and the results that I observed for myself. > > > > 1. The "Commit Data" chunk signature is documented to be 'CGET', but > > it should be 'CDAT'. > > > > commit-graph.c:18 > >#define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */ > > Thanks for catching this! Thankfully, this is an easy fix, as we only > need to update the documentation. > > > 2. The "no parent" value is documented to be 0x, but is > > actually 0x7000. > > > > commit-graph.c:34 > >#define GRAPH_PARENT_NONE 0x7000 > > This is a more important mistake, as it affects the data that was > written in the commit-graph file. > > Part of the problem is that leading hex digit of 0x7 which in binary is > 0b0111. We already designed a limit of at most 2^{31}-1 (~2.1 billion) > commits in the commit-graph because of the way we track octopus edges, > but this mistake has cost us more: we cannot store more than ~1.8 > billion commits. > > I'm sorry for this mixup, mostly because it is aesthetically unpleasant. > Those extra 300 million commits mean less to me than having a clean file > format. > > > 3. The "generation" field isn't set on any of the commits. (I don't > > know what to make of this.) > > This is a difference between 2.18 and current 'master', which merged > ds/generation-numbers. Commit-graphs written with Git 2.18 have all > generation numbers listed as GENERATION_NUMBER_ZERO (0), which lets > future versions know that the generation number is not computed yet, so > the next commit-graph write will compute the correct generation number. > > I'll send a patch soon fixing these doc issues. > > Thanks, > -Stolee -- -grant welch -gwelch...@gmail.com
Re: [PATCH v4] Documentation: declare "core.ignoreCase" as internal variable
Thanks.
Re: [PATCH] fsck: check skiplist for object in fsck_blob()
Jeff King writes: > Yeah, this solution seems sensible. Given that we would never report any > error for that blob, there is no point in even looking at it. I wonder > if we ought to do the same for other types, too. Is there any point in > opening a tree that is in the skiplist? Suppose the tree is listed there only because it has one entry for a subtree with leading 0 in its mode. We do want to ignore that format violation, but we still want to learn the fact that the subtree it points at and its contents are connected and not dangling, so we do need to open it. Is that open done in a separate phase?
Re: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs
Jeff King wrote: > We could get rid around that by using $(DOC_ETC_GITCONFIG) or something, > with: > > DOC_ETC_GITCONFIG ?= $(ETC_GITCONFIG) > > in the Makefile. But that still leaves the choice of which generic text > to use up to the caller. Maybe we should provide more guidance. > > I.e., provide a knob like DOC_GENERIC that fills in generic values for > these values (and maybe some others; it sounds like we have some > existing problem cases). That sounds pretty reasonable. I have something implementing this below. >> It might be enough if the default values are relatively sane >> and consistent. That would be a slight improvement over the >> current situation still. > > Yeah. Taking a step back from the implementation questions, I think > "$(prefix)/etc/gitconfig" is not very helpful text. I'd be happy to see > us come up with a generic way of saying that which is more > comprehensible to end-users. Your patches side-step that by filling in > the real value, but unfortunately we can't do that everywhere. :) > > There may not be a good "token" value, though. I.e., we may need to have > two sets of verbiage: the specific one, and the generic one. Yeah. About the best generic term I can come up with is using '$sysconfdir'. But I have no idea if that's something most readers will recognize as a placeholder for something like /etc. A number of the path references are in the FILES sections of the man pages. It might not be much of an improvement if we try to stuff too much text in those references. Perhaps if those used '$sysconfdir/gitconfig' a subsequent note could expand on that? It could even be wrapped in an ifdef similar to that used for the default editor/pager. I don't want to make the plain .txt files significantly less readable in the process, of course. >>> It's a shame we have to repeat this logic from the Makefile, though I >>> guess we already do so for prefix, bindir, etc, so far. >>> >>> Should we factor the path logic from the top-level Makefile into an >>> include that can be used from either? >> >> Yeah, maybe. I didn't like the duplication either, but as >> you noticed, we do it already for many of the other >> variables. I suspect we could put these defaults into >> config.mak.uname which Documentation/Makefile includes and >> then you could run 'make -C Documentation' in a fresh clone >> or tarball and get docs built with the defaults set for each >> platform. > > I think it shouldn't go into config.mak.uname, since the idea there was > to keep the long list of platform defaults separate from other logic > (the Makefile is already long enough!). So I'm basically proposing the > same thing but in its own file. :) Ahh, something that the top-level Makefile would create and then subdir Makefiles would include, perhaps similar to the way GIT-VERSION-FILE is updated and included? That could prove useful to some of the tools in contrib which duplicate logic. Skipping that larger de-duplication cleanup, here's a stab at implementing the DOC_GENERIC knob (and the DOC_ overrides for ETC_GIT(CONFIG|ATTRIBUTES). The defaults with '/GENERIC-SYSCONFDIR' in them are just placeholders waiting for a better name. :) Similarly, the use of {etc-git(config|attributes)} as the attribute for the replacements could likely be improved for readers of the .txt files. {system-wide-gitconfig} is likely better. Maybe the default for the generic paths could be /system-wide/git(config|attributes) too (or in CAPS to make it more obviously a placeholder)? Thanks for thinking this through and providing some good directions to work on! -- >8 -- Subject: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs Replace `$(prefix)/etc/gitconfig` and `$(prefix)/etc/gitattributes` in generated documentation with the paths chosen when building. Readers of the documentation should not need to know how `$(prefix)` was defined. It's also more consistent than sometimes using `$(prefix)/etc/gitconfig` and other times using `/etc/gitconfig` to refer to the system-wide config file. Update the SYNOPSIS of gitattributes(5) to include the system-wide config file as well. Support building generic documentation with a DOC_GENERIC Makefile knob. The default generic paths may be further customized via the DOC_ETC_GITCONFIG and DOC_ETC_GITATTRIBUTES variables. Define DOC_GENERIC in dist-doc make target to ensure generic paths are used in the generated html and manpage tarballs. Helped-by: Jeff King Signed-off-by: Todd Zullinger --- Documentation/Makefile | 22 ++ Documentation/config.txt| 4 ++-- Documentation/git-config.txt| 10 +- Documentation/git.txt | 4 ++-- Documentation/gitattributes.txt | 4 ++-- Makefile| 10 -- 6 files changed, 41 insertions(+), 13 deletions(-) diff --git a/Documentation/Makefile b/Documentation/Makefile index d079d7c73..2b3540a6c 100644 ---
Re: [PATCH 1/1] Makefile: fix the "built from commit" code
Hi Peff, On Thu, 28 Jun 2018, Jeff King wrote: > On Wed, Jun 27, 2018 at 09:35:23PM +0200, Johannes Schindelin via > GitGitGadget wrote: > > > To prevent erroneous commits from being reported (e.g. when unpacking > > Git's source code from a .tar.gz file into a subdirectory of a different > > Git project, as e.g. git_osx_installer does), we painstakingly set > > GIT_CEILING_DIRECTORIES when trying to determine the current commit. > > > > Except that we got the quoting wrong, and that variable therefore does > > not have the desired effect. > > > > Let's fix that quoting, and while at it, also suppress the unhelpful > > message > > I had to stare at the code for a bit to figure out what was wrong: Do you want me to update the commit message? > > - '-DGIT_BUILT_FROM_COMMIT="$(shell > > GIT_CEILING_DIRECTORIES=\"$(CURDIR)/..\" \ > > - git rev-parse -q --verify HEAD || :)"' > > + '-DGIT_BUILT_FROM_COMMIT="$(shell \ > > + GIT_CEILING_DIRECTORIES="$(CURDIR)/.." \ > > + git rev-parse -q --verify HEAD 2>/dev/null)"' > > The issue is that the $(shell) is resolved before the output is stuffed > into the command-line with -DGIT_BUILT_FROM_COMMIT, and therefore is > _not_ inside quotes. And thus backslashing the quotes is wrong, as the > quote gets literally inserted into the CEILING_DIRECTORIES variable. > > I thought at first we could not need the quotes in the post-image > either, because shell variable assignments do not do word-splitting. > I.e.: > > FOO='with spaces' > BAR=$FOO sh -c 'echo $BAR' > > works just fine. $ x="two spaces" $ echo $x two spaces Maybe we should quote a little bit more religiously. > But $(CURDIR) here is not a shell variable, but rather a Makefile > variable, so it's expanded before we hit the shell. So we need the > quotes. And unfortunately it also breaks if $(CURDIR) contains exotic > metacharacters. If we cared we could use single quotes and $(CURDIR_SQ), > but I suspect it would be far from the first thing to break in such a > case. > > Which is a long-winded way of saying the patch looks correct to me. Thanks ;-) Ciao, Dscho
Re: [PATCH v6 3/4] stash: convert branch to builtin
Hello, On 27.06.2018 21:39, Junio C Hamano wrote: This is primarily because cmd_$foo() is designed to be replacement of "main()" in usual programs---it is allowed to assume the global variables it uses have their initial values and nobody cares the state it leaves behind when it returns. Argument parser may flip bits in file-scope static variables to record which options are seen, assuming that these variables are initialized to all-zero, and that assumption would not hold for the second call to cmd_$foo(), etc. cmd_$foo() also is allowed to call die() when there is an unrecoverable error condition in the context of carrying out $foo; a scripted Porcelain that used $foo as a building block to implement a larger operation (e.g. "stash" that used "checkout" to switch to a different branch) may want to react to the failure and do something extra (i.e. "git checkout || do something more"). Using run_command() interface would not cause any of these problems; the subcommand will run in a clean environment, and the caller can react to its failure. Thank you a lot for this great explanation. When I will submit a new iteration, I will make sure to replace calls of `cmd_$foo()` with `run_command()` (or with API, if possible). Best regards, Paul Ungureanu
Re: [PATCH v2] fetch-pack: support negotiation tip whitelist
On 06/28, Jonathan Tan wrote: > > This seems like a pretty difficult to use feature, requiring that I > > provide the actual OIDs. I think a much better UI would probably be to > > accept a number of different things ranging from exact OIDs to actual > > ref names or even better, allowing for ref-patterns which include globs. > > That way I can do the following: > > > > git fetch --negotiation-tip=refs/remotes/my-remote/* my-remote > > > > in order to easily limit the tips to all the refs I have from that > > particular remote. > > Actual ref names are already supported - it uses the same DWIM as > others. As for globs, I thought of supporting both DWIM objects and the > LHS of refspecs, but (1) it might be strange to support master and > refs/heads/* but not heads/*, I don't think that would be strange at all, and no where in git do we handle heads/* but we do already handle refs/heads/* as well as DWIM master. > and (2) I can't think of anywhere in Git > where you can provide either one - it's either SHA-1 and DWIM name, or > SHA-1 and refspec, but not all three. fetch is a perfect example of supporting all three. I can do git fetch origin SHA1 git fetch origin master git fetch origin refs/heads/*:refs/heads/* -- Brandon Williams
Re: [RFC PATCH v5] Implement --first-parent for git rev-list --bisect
Johannes Schindelin writes: > On Wed, 27 Jun 2018, Junio C Hamano wrote: > >> Johannes Schindelin writes: >> >> >git rev-list --bisect-all --first-parent F..E >revs && >> ># only E, e1..e8 should be listed, nothing else >> >test_line_count = 9 revs && >> >for rev in E e1 e2 e3 e4 e5 e6 e7 e8 >> >do >> >grep "^$(git rev-parse $rev) " revs || return >> >done >> > >> > I am faster by... a lot. Like, seconds instead of minutes. >> >> I'm fine either way. I just thought you would not want 9 separate >> invocations of grep ;-) > > I don't. > > I like the unnecessary test_commit calls even less ;-) > > And yes, we could avoid those `grep` calls, I know. By something as > convoluted as I now think you are reading me wrong ;-) I think you already established pretty well that it is a good idea to avoid introducing different history to run the new test on, when there is perfectly usable one available. That part, i.e. test_commit, I do not think we have anything to disagree about. What I meant by "many separte grep calls" was to contrast these two approaches: * Have one typical output spelled out as "expect", take an output from an actual run into "actual", make them comparable and then do a compare (which does not use grep; it uses sort in the "making comparable" phase) * Not have any typical output, take an output from an actual run, and have _code_ that inspects the output encode the rule over the output (e.g. "these must exist" is inspected with many grep invocations) Two things the "output must have 9 entries, and these 9 must be mentioned" we see at the beginning of this message does not verify are (1) exact dist value given to each of these entries and (2) the order in which these entries appear in the output. The latter is something we document, and the test should cover. The former does not have to be cast in stone (i.e. I do not think it does not make a difference to label the edge commits with dist=1 or dist=0 as long as everything is consistent), but if there is no strong reason to keep it possible for us to later change how the numbers are assigned, I am OK if the test cast it in stone. Another reason (other than "many separate invocation of grep") to favor the former approach (i.e. use real-looking expected output, munge it and the real output into comparable forms and then compare) is that it is easier to see what aspect of the output we care (and we do not care) about. It is harder to see the omission of exact dist value and ordering of entries in the outpu in the latter approach, and more importantly, know if the omission was deliberate (e.g. it was merely an example) or a mere mistake. With "using a real-looking expected output, make it and the actual output comparable and then compare" approach, the aspects in the output we choose not to care about will show in the "make them comparable" munging. If we do not care the exact dist values, there would be something like s/dist=[0-9]*/dist=X/ to mask the exact value before making the two comparable to see that the expect and the actual files have the same entries. If we still do care about the entries are ordered by the dist values, there would be something that sorts the entries with the actual dist values before doing that masking to ensure if the order is correct.
Re: [PATCH v2] fetch-pack: support negotiation tip whitelist
> This seems like a pretty difficult to use feature, requiring that I > provide the actual OIDs. I think a much better UI would probably be to > accept a number of different things ranging from exact OIDs to actual > ref names or even better, allowing for ref-patterns which include globs. > That way I can do the following: > > git fetch --negotiation-tip=refs/remotes/my-remote/* my-remote > > in order to easily limit the tips to all the refs I have from that > particular remote. Actual ref names are already supported - it uses the same DWIM as others. As for globs, I thought of supporting both DWIM objects and the LHS of refspecs, but (1) it might be strange to support master and refs/heads/* but not heads/*, and (2) I can't think of anywhere in Git where you can provide either one - it's either SHA-1 and DWIM name, or SHA-1 and refspec, but not all three.
Re: [PATCH v2] fetch-pack: support negotiation tip whitelist
On 06/27, Jonathan Tan wrote: > During negotiation, fetch-pack eventually reports as "have" lines all > commits reachable from all refs. Allow the user to restrict the commits > sent in this way by providing a whitelist of tips; only the tips > themselves and their ancestors will be sent. > > This feature is only supported for protocols that support connect or > stateless-connect (such as HTTP with protocol v2). > > This will speed up negotiation when the repository has multiple > relatively independent branches (for example, when a repository > interacts with multiple repositories, such as with linux-next [1] and > torvalds/linux [2]), and the user knows which local branch is likely to > have commits in common with the upstream branch they are fetching. > > [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/ > [2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/ > > Signed-off-by: Jonathan Tan > --- > v2 is exactly the same as the original, except with user-facing > documentation in Documentation/fetch-options.txt. > > > What's the plan to expose this "feature" to end-users? There is no > > end-user facing documentation added by this patch, and in-code > > comments only talk about what (mechanical) effect the option has, > > but not when a user may want to use the feature, or how the user > > would best decide the set of commits to pass to this new option. > > Jonathan Nieder also mentioned this. Lack of documentation was an > oversight, sorry. I've added it in this version. > > > Would something like this > > > > git fetch $(git for-each-ref \ > > --format=--nego-tip="%(objectname)" \ > > refs/remotes/linux-next/) \ > > linux-next > > > > be an expected typical way to pull from one remote, exposing only > > the tips of refs we got from that remote and not the ones we > > obtained from other places? > > Yes, that is one way. Alternatively, if the user is only fetching one > branch, they may also want to specify a single branch. > --- > Documentation/fetch-options.txt | 12 +++ > builtin/fetch.c | 21 + > fetch-pack.c| 19 ++-- > fetch-pack.h| 7 + > t/t5510-fetch.sh| 55 + > transport-helper.c | 3 ++ > transport.c | 1 + > transport.h | 10 ++ > 8 files changed, 126 insertions(+), 2 deletions(-) > > diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt > index 97d3217df9..80c4c94595 100644 > --- a/Documentation/fetch-options.txt > +++ b/Documentation/fetch-options.txt > @@ -42,6 +42,18 @@ the current repository has the same history as the source > repository. > .git/shallow. This option updates .git/shallow and accept such > refs. > > +--negotiation-tip:: > + By default, Git will report, to the server, commits reachable > + from all local refs to find common commits in an attempt to > + reduce the size of the to-be-received packfile. If specified, > + Git will only report commits reachable from the given commit. > + This is useful to speed up fetches when the user knows which > + local ref is likely to have commits in common with the > + upstream ref being fetched. This seems like a pretty difficult to use feature, requiring that I provide the actual OIDs. I think a much better UI would probably be to accept a number of different things ranging from exact OIDs to actual ref names or even better, allowing for ref-patterns which include globs. That way I can do the following: git fetch --negotiation-tip=refs/remotes/my-remote/* my-remote in order to easily limit the tips to all the refs I have from that particular remote. > ++ > +This option may be specified more than once; if so, Git will report > +commits reachable from any of the given commits. > + > ifndef::git-pull[] > --dry-run:: > Show what would be done, without making any changes. > diff --git a/builtin/fetch.c b/builtin/fetch.c > index ea5b9669ad..12daec0f3b 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -63,6 +63,7 @@ static int shown_url = 0; > static struct refspec refmap = REFSPEC_INIT_FETCH; > static struct list_objects_filter_options filter_options; > static struct string_list server_options = STRING_LIST_INIT_DUP; > +static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP; > > static int git_fetch_config(const char *k, const char *v, void *cb) > { > @@ -174,6 +175,8 @@ static struct option builtin_fetch_options[] = { > TRANSPORT_FAMILY_IPV4), > OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"), > TRANSPORT_FAMILY_IPV6), > + OPT_STRING_LIST(0, "negotiation-tip", _tip, N_("revision"), > + N_("report that we have only objects reachable from > this object")), >
Re: [PATCH v1 0/9] Introducing remote ODBs
On Tue, Jun 26, 2018 at 2:37 AM, Eric Sunshine wrote: > In addition to the t5702 failures, I'm also seeing failures of > t0410.1, t5616.6 and t5616.7 at the tip of 'pu' as of [1], all of > which seem to be related to these changes. Yeah but only s/core.partialclonefilter/odb.origin.partialclonefilter/ is needed on top of the fix for for the master branch. Thanks for the report!
Re: [PATCH 29/29] t/test-lib: teach --chain-lint to detect broken &&-chains in subshells
On Tue, Jun 26, 2018 at 07:15:45PM -0700, Elijah Newren wrote: > Crazy idea: maybe we could defang it a little more thoroughly with > something like the following (apologies in advance if gmail whitespace > damages this): > > diff --git a/t/test-lib.sh b/t/test-lib.sh > index 28315706be..7fda08a90a 100644 > --- a/t/test-lib.sh > +++ b/t/test-lib.sh > @@ -675,7 +675,7 @@ test_run_ () { > trace= > # 117 is magic because it is unlikely to match the exit > # code of other programs > - if test "OK-117" != "$(test_eval_ "(exit 117) && > $1${LF}${LF}echo OK-\$?" 3>&1)" > + if test "OK-117" != "$(test_eval_ "cd() { return 0; } > && PATH=/dev/null && export PATH && (exit 117) && $1${LF}${LF}echo > OK-\$?" 3>&1)" Clever. We'd still run shell builtins, which is why you need the cd() above. There may be others, but at least it narrows things down. Unless the shell is busybox or something, and implements everything as a builtin. :) I agree on the point elsewhere of returning non-zero (and the items missing from PATH should do that, which is good). -Peff