[PATCH v2] filter-branch: skip commits present on --state-branch
The commits in state:filter.map have already been processed, so don't filter them again. This makes incremental git filter-branch much faster. Also add tests for --state-branch option. Signed-off-by: Michael Barabanov --- git-filter-branch.sh | 1 + t/t7003-filter-branch.sh | 15 +++ 2 files changed, 16 insertions(+) diff --git a/git-filter-branch.sh b/git-filter-branch.sh index ccceaf19a..5c5afa2b9 100755 --- a/git-filter-branch.sh +++ b/git-filter-branch.sh @@ -372,6 +372,7 @@ while read commit parents; do git_filter_branch__commit_count=$(($git_filter_branch__commit_count+1)) report_progress + test -f "$workdir"/../map/$commit && continue case "$filter_subdir" in "") diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh index ec4b160dd..e23de7d0b 100755 --- a/t/t7003-filter-branch.sh +++ b/t/t7003-filter-branch.sh @@ -107,6 +107,21 @@ test_expect_success 'test that the directory was renamed' ' test dir/D = "$(cat diroh/D.t)" ' +V=$(git rev-parse HEAD) + +test_expect_success 'populate --state-branch' ' + git filter-branch --state-branch state -f --tree-filter "touch file || :" HEAD +' + +W=$(git rev-parse HEAD) + +test_expect_success 'using --state-branch to skip already rewritten commits' ' + test_when_finished git reset --hard $V && + git reset --hard $V && + git filter-branch --state-branch state -f --tree-filter "touch file || :" HEAD && + test_cmp_rev $W HEAD +' + git tag oldD HEAD~4 test_expect_success 'rewrite one branch, keeping a side branch' ' git branch modD oldD && -- 2.18.0
Re: [PATCH v1 0/9] Introducing remote ODBs
On Mon, Jun 25, 2018 at 5:49 PM Junio C Hamano wrote: > Christian Couder writes: > > This is a follow up from the patch series called "odb remote" [...] > > 5702.20 and 5702.21 seems to fail in standalone test, when these are > directly queued on top of Git v2.18.0; I haven't looked into the > failure myself (yet). In addition to the t5702 failures, I'm also seeing failures of t0410.1, t5616.6 and t5616.7 at the tip of 'pu' as of [1], all of which seem to be related to these changes. [1]: https://public-inbox.org/git/xmqqin66mql6@gitster-ct.c.googlers.com/
Re: [PATCH v2 11/24] midx: read pack names into array
On Mon, Jun 25, 2018 at 10:35 AM Derrick Stolee wrote: > diff --git a/midx.c b/midx.c > @@ -210,6 +227,20 @@ static void sort_packs_by_name(char **pack_names, > uint32_t nr_packs, uint32_t *p > +static size_t write_midx_pack_lookup(struct hashfile *f, > +char **pack_names, > +uint32_t nr_packs) > +{ > + uint32_t i, cur_len = 0; > + > + for (i = 0; i < nr_packs; i++) { > + hashwrite_be32(f, cur_len); > + cur_len += strlen(pack_names[i]) + 1; > + } > + > + return sizeof(uint32_t) * (size_t)nr_packs; > +} This static function is never used, thus breaks the build with DEVELOPER=1: midx.c:567:15: error: ‘write_midx_pack_lookup’ defined but not used [-Werror=unused-function] cc1: all warnings being treated as errors
Re: [PATCH v4 0/8] ref-in-want
> Changes in v4 are fairly minor. There are a few documentation changes, > commit message updates, as well as a few small style tweaks based on > reviewer comments. Patches 4 and 7, which I have commented on previously, look good. As for patch 2, it still has a missing period in the documentation that I remarked upon in [1], but I'm not too worried about that. Having said that, Jonathan Nieder suggested [2]: > Stefan mentioned that the spec says > > * The server MUST NOT send any refs which were not requested > using 'want-ref' lines. > > Can client enforce that? If not, can the spec say SHOULD NOT for the > server and add a MUST describing appropriate client behavior? I noticed that you did use "SHOULD NOT" instead of "MUST NOT" - in this case, you should probably also follow the second part about appropriate client behavior - it's probably best to document and implement that we ignore all unwanted refs. But considering this situation, though, I think it's better to just put "MUST NOT" and have the client enforce this. One more thing - I think that the fetch part needs to be tested more. In particular, test cases similar to that of the upload-pack tests (multiple ref names, ref name + exact SHA-1), and in addition, handling of wildcards (for example, a wildcard that expands to nothing and a wildcard that expands to 2 refs). [1] https://public-inbox.org/git/20180625174056.53053-1-jonathanta...@google.com/ [2] https://public-inbox.org/git/20180622230119.gl12...@aiede.svl.corp.google.com/
What's cooking in git.git (Jun 2018, #06; Mon, 25)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. Git 2.18 turned out to be an unusually large release in the recent history, with 900+ individual changes, compared to ~500 for a few releases before that. Thanks, everybody who participated. The tip of 'next' hasn't been rewound yet, but I've already reverted merges of a few topics from there to give them a fresh restart. I haven't annotated many topics still in 'pu' with short-term plans and my assessment on their done-ness. Comments and review summaries to help deciding are appreciated. I'd like to start accepting new topics only after getting the classification more-or-less right for the topics already in flight. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * ag/rebase-p (2018-06-01) 4 commits (merged to 'next' on 2018-06-13 at dd6f8a51d7) + rebase: remove -p code from git-rebase--interactive.sh + rebase: use the new git-rebase--preserve-merges.sh + rebase: strip unused code in git-rebase--preserve-merges.sh + rebase: introduce a dedicated backend for --preserve-merges (this branch is used by ag/rebase-i-append-todo-help and ag/rebase-i-rewrite-todo.) Separate "rebase -p" codepath out of "rebase -i" implementation to slim down the latter and make it easier to manage. * cc/tests-without-assuming-ref-files-backend (2018-06-04) 1 commit (merged to 'next' on 2018-06-13 at 7e2f74431c) + t9104: kosherly remove remote refs Instead of mucking with filesystem directly, use plumbing commands update-ref etc. to manipulate the refs in the tests. * ds/commit-graph-lockfile-fix (2018-05-22) 1 commit (merged to 'next' on 2018-05-24 at 3d12a02b0c) + commit-graph: fix UX issue when .lock file exists (this branch is used by ds/commit-graph-fsck; uses ds/generation-numbers.) Update to ds/generation-numbers topic. * ds/generation-numbers (2018-05-22) 11 commits (merged to 'next' on 2018-05-24 at 56fc38a1b6) + commit-graph.txt: update design document + merge: check config before loading commits + commit: use generation number in remove_redundant() + commit: add short-circuit to paint_down_to_common() + commit: use generation numbers for in_merge_bases() + ref-filter: use generation number for --contains + commit-graph: always load commit-graph information + commit: use generations in paint_down_to_common() + commit-graph: compute generation numbers + commit: add generation number to struct commit + ref-filter: fix outdated comment on in_commit_list (this branch is used by ds/commit-graph-fsck and ds/commit-graph-lockfile-fix.) A recently added "commit-graph" datafile has learned to store pre-computed generation numbers to speed up the decisions to stop history traversal. * en/merge-recursive-tests (2018-05-28) 5 commits (merged to 'next' on 2018-06-01 at 8490b560b4) + t6036: prefer test_when_finished to manual cleanup in following test + t6036, t6042: prefer test_cmp to sequences of test + t6036, t6042: prefer test_path_is_file, test_path_is_missing + t6036, t6042: use test_line_count instead of wc -l + t6036, t6042: use test_create_repo to keep tests independent Clean up tests in t6xxx series about 'merge' command. * en/rename-directory-detection-reboot (2018-06-18) 1 commit (merged to 'next' on 2018-06-18 at 95c454d3f4) + merge-recursive: use xstrdup() instead of fixed buffer Newly added codepath in merge-recursive had potential buffer overrun, which has been fixed. * jk/show-index (2018-05-29) 2 commits (merged to 'next' on 2018-06-01 at 4b3382d994) + show-index: update documentation for index v2 + make show-index a builtin Modernize a less often used command. * ls/complete-remote-update-names (2018-06-01) 1 commit (merged to 'next' on 2018-06-13 at 86b4d23278) + completion: complete remote names too "git remote update" can take both a single remote nickname and a nickname for remote groups, and the completion script (in contrib/) has been taught about it. * nd/commit-util-to-slab (2018-05-21) 15 commits (merged to 'next' on 2018-05-24 at bb5643d75c) + commit.h: delete 'util' field in struct commit + merge: use commit-slab in merge remote desc instead of commit->util + log: use commit-slab in prepare_bases() instead of commit->util + show-branch: note about its object flags usage + show-branch: use commit-slab for commit-name instead of commit->util + name-rev: use commit-slab for rev-name instead of commit->util + bisect.c: use commit-slab for commit weight instead of commit->util + revision.c: use commit-slab for show_source + sequencer.c: use commit-slab to
Re: [PATCH v4 6/8] fetch: refactor to make function args narrower
> Refactor find_non_local_tags and get_ref_map to only take the > information they need instead of the entire transport struct. Besides > improving code clarity, this also improves their flexibility, allowing > for a different set of refs to be used instead of relying on the ones > stored in the transport struct. I see that due to the narrowing of get_ref_map() to take the refs (and the remote) instead of the whole transport, the computation of the refs (including computation of the ref prefixes) is also moved from get_ref_map() to its caller, do_fetch(). As in a previous patch, get_ref_map() is only used once, so this movement is safe. Reviewed-by: Jonathan Tan
Re: [PATCH v4 3/8] upload-pack: test negotiation with changing repository
> --- /dev/null > +++ b/t/lib-httpd/one-time-sed.sh > @@ -0,0 +1,16 @@ > +#!/bin/sh > + > +if [ -e one-time-sed ]; then > + "$GIT_EXEC_PATH/git-http-backend" >out > + > + sed "$(cat one-time-sed)" out_modified > + > + if diff out out_modified >/dev/null; then > + cat out > + else > + cat out_modified > + rm one-time-sed > + fi > +else > + "$GIT_EXEC_PATH/git-http-backend" > +fi Add an explanatory comment somewhere (maybe, at the beginning), something like: If "one-time-sed" exists in $HTTPD_ROOT_PATH, run sed on the HTTP response, using the contents of "one-time-sed" as the sed command to be run. If the response was modified as a result, delete "one-time-sed" so that subsequent HTTP responses are no longer modified. This can be used to simulate the effects of the repository changing in between HTTP request-response pairs. > +test_expect_failure 'server is initially ahead - ref in want' ' [snip] > +test_expect_failure 'server is initially behind - ref in want' ' [snip] These are test_expect_failure, I assume because the fetch part has not been implemented yet. Can this be moved to the end of the patch set, once the fetch part has been implemented? There's also the case of when the server initially has a ref but later does not - can this be tested here too? The server part is already covered by the upload-pack test in which we craft a request with a non-existent ref, but it would be good to test the client part too.
Re: [PATCH] filter-branch: skip commits present on --state-branch
Michael Barabanov writes: > The commits in state:filter.map have already been processed, so don't > filter them again. This makes incremental git filter-branch much faster. > > Also add tests for --state-branch option. > > Signed-off-by: Michael Barabanov > --- > git-filter-branch.sh | 3 +++ > t/t7003-filter-branch.sh | 15 +++ > 2 files changed, 18 insertions(+) > > diff --git a/git-filter-branch.sh b/git-filter-branch.sh > index ccceaf19a..2df7ed107 100755 > --- a/git-filter-branch.sh > +++ b/git-filter-branch.sh > @@ -372,6 +372,9 @@ while read commit parents; do > git_filter_branch__commit_count=$(($git_filter_branch__commit_count+1)) > > report_progress > + if test -r "$workdir/../map/$commit"; then > + continue > + fi The original script is so much of a mess that I needed quite some time to find enough evidence to convince myself that this change is in line with what is already happening in the program. We have test -f "$workdir"/../map/$sha1 && continue in the codepath for remap-to-ancestor prostprocessing to do pretty-much the same skipping. I think the new code should follow suit, i.e. if test -f "$workdir/../map/$commit" then continue fi to check just the existence for consistency. It would have been reviewer friendly if the proposed commit log message said how this change does *not* break the progress output and count. A possible alternative optimization could be not to add these already mapped commits in ../revs file in the first place (so they are not even counted as part of $commits), and such a change would give different meaning to the progress output (which may or may not be a good change). Instead, the posted patch counts the commits to be filtered the same way as before, and merely pretends that it filtered those commits to their mapped counterparts without spending any cycle (simply because we already _know_ what they are mapped to), so the meaning of the numbers in the progress display does not change at all---just they appear to progress much faster, which is a welcome change ;-)
Re: [PATCH 4/5] commit-graph: store graph in struct object_store
> I was looking at semantic merge conflicts between this and your > e2838d85 ("commit-graph: always load commit-graph information", > 2018-05-01), the latter of which I planned to merge to 'master' as a > part of the first batch in this cycle, and noticed that there are > two very similar functions, without enough document the callers > would not know which one is the correct one to call. Needless to > say, such a code duplication would mean the work required for > resolving semantic conflict doubles needlessly X-<. > > > int parse_commit_in_graph(struct commit *item) > { > uint32_t pos; > > if (!core_commit_graph) > return 0; > if (item->object.parsed) > return 1; > prepare_commit_graph(); > if (commit_graph && find_commit_in_graph(item, commit_graph, )) > return fill_commit_in_graph(item, commit_graph, pos); > return 0; > } > > void load_commit_graph_info(struct commit *item) > { > uint32_t pos; > if (!core_commit_graph) > return; > prepare_commit_graph(); > if (commit_graph && find_commit_in_graph(item, commit_graph, )) > fill_commit_graph_info(item, commit_graph, pos); > } Thanks for letting me know - when I reroll, I'll ensure that I reroll on top of this change. As for whether both these functions are necessary in the first place, I think they are: we could have a load_commit_graph_info() that ignores the parsedness of the commit and just copies everything over, but that means that we can't have the optimization of returning immediately without consulting the graph when attempting to reparse a parsed object. So we can't delete parse_commit_in_graph(). And we can't delete load_commit_graph_info() either, because after reading the documentation in commit-graph.h, looking at the places where it is used, and observing some test failures when I make parse_object_buffer() not use the graph, I see that there are code paths in Git in which we use both parsed-from-buffer and parsed-without-buffer commits at the same time. So, I now look at the code duplication, and I see that it is mostly the check-prepare-check step; my patch set reduces it a little, but I'll try to reduce it nearly completely when I reroll too. Maybe when the object store becomes more pervasive, we can more clearly separate out the buffers that come from a repository (and maybe, when obtaining those buffers, we should obtain graph information at the same time) and buffers that we obtain from some arbitrary non-repository source, but that will take some time. In the meantime, I'll do what I suggested above when I reroll, maintaining these 2 functions.
Re: [PATCH v1 0/9] Introducing remote ODBs
Christian Couder writes: > This is a follow up from the patch series called "odb remote" that I > ... Just an early warning, as I haven't even complained on patch titles of these patches in the series ;-) 5702.20 and 5702.21 seems to fail in standalone test, when these are directly queued on top of Git v2.18.0; I haven't looked into the failure myself (yet).
[PATCH 1/2] grep.c: extract show_line_header()
The grep code invokes show_line() to display the contents of a matched or context line in its output. Part of this execution is to print a line header that includes information such as the kind, the line- and column-number and etc. of that match. To prepare for the addition of an option to print only the matching component(s) of a non-context line, we must prepare for the possibility that a single line may contain multiple matching parts, and thus will need multiple headers printed for a single line. Extracting show_line_header allows us to do just that. In the subsequent commit, it will be used within the colorization loop to print out only the matching parts of a line, optionally with LFs delimiting sub-matches. Signed-off-by: Taylor Blau --- grep.c | 44 +--- 1 file changed, 25 insertions(+), 19 deletions(-) diff --git a/grep.c b/grep.c index 992673fe7e..4ff8a73043 100644 --- a/grep.c +++ b/grep.c @@ -1410,26 +1410,9 @@ static int next_match(struct grep_opt *opt, char *bol, char *eol, return hit; } -static void show_line(struct grep_opt *opt, char *bol, char *eol, - const char *name, unsigned lno, ssize_t cno, char sign) +static void show_line_header(struct grep_opt *opt, const char *name, +unsigned lno, ssize_t cno, char sign) { - int rest = eol - bol; - const char *match_color, *line_color = NULL; - - if (opt->file_break && opt->last_shown == 0) { - if (opt->show_hunk_mark) - opt->output(opt, "\n", 1); - } else if (opt->pre_context || opt->post_context || opt->funcbody) { - if (opt->last_shown == 0) { - if (opt->show_hunk_mark) { - output_color(opt, "--", 2, opt->color_sep); - opt->output(opt, "\n", 1); - } - } else if (lno > opt->last_shown + 1) { - output_color(opt, "--", 2, opt->color_sep); - opt->output(opt, "\n", 1); - } - } if (opt->heading && opt->last_shown == 0) { output_color(opt, name, strlen(name), opt->color_filename); opt->output(opt, "\n", 1); @@ -1457,6 +1440,29 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol, output_color(opt, buf, strlen(buf), opt->color_columnno); output_sep(opt, sign); } +} + +static void show_line(struct grep_opt *opt, char *bol, char *eol, + const char *name, unsigned lno, ssize_t cno, char sign) +{ + int rest = eol - bol; + const char *match_color, *line_color = NULL; + + if (opt->file_break && opt->last_shown == 0) { + if (opt->show_hunk_mark) + opt->output(opt, "\n", 1); + } else if (opt->pre_context || opt->post_context || opt->funcbody) { + if (opt->last_shown == 0) { + if (opt->show_hunk_mark) { + output_color(opt, "--", 2, opt->color_sep); + opt->output(opt, "\n", 1); + } + } else if (lno > opt->last_shown + 1) { + output_color(opt, "--", 2, opt->color_sep); + opt->output(opt, "\n", 1); + } + } + show_line_header(opt, name, lno, cno, sign); if (opt->color) { regmatch_t match; enum grep_context ctx = GREP_CONTEXT_BODY; -- 2.18.0
[PATCH 0/2] grep.c: teach --only-matching to 'git-grep(1)'
Hi, Attached is a resurrection of my previous topic [1] to add '--only-matching' (in the style of GNU grep) to 'git grep'. The changes are described in detail in each of the attached patches. Similar to the series to add --column to 'git grep', I have restarted this thread in order to not clutter the old one. I rewrote the patches from scratch today, and have based them on tb/grep-colno, on top of which they should apply cleanly. Thanks in advance for your kind review :-). Thanks, Taylor [1]: https://public-inbox.org/git/cover.1525492696.git...@ttaylorr.com/ Taylor Blau (2): grep.c: extract show_line_header() grep.c: teach 'git grep --only-matching' builtin/grep.c | 6 grep.c | 90 +++-- grep.h | 1 + t/t7810-grep.sh | 15 + 4 files changed, 79 insertions(+), 33 deletions(-) -- 2.18.0
[PATCH 2/2] grep.c: teach 'git grep --only-matching'
Teach 'git grep --only-matching', a new option to only print the matching part(s) of a line. For instance, a line containing the following (taken from README.md:27): (`man gitcvs-migration` or `git help cvs-migration` if git is Is printed as follows: $ git grep -no -e git -- README.md | grep ":27" README.md:27:7:git README.md:27:16:git README.md:27:38:git The patch works mostly as one would expect, with the exception of a few considerations that are worth mentioning here. Like GNU grep, this patch ignores --only-matching when --invert (-v) is given. There is a sensible answer here, but parity with the behavior of other tools is preferred. Because a line might contain more than one match, there are special considerations pertaining to when to print line headers, newlines, and how to increment the match column offset. The line header and newlines are handled as a special case within the main loop to avoid polluting the surrounding code with conditionals that have large blocks. Signed-off-by: Taylor Blau --- builtin/grep.c | 6 ++ grep.c | 48 +--- grep.h | 1 + t/t7810-grep.sh | 15 +++ 4 files changed, 55 insertions(+), 15 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 61bcaf6e58..228b83990f 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -843,6 +843,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix) OPT_BOOL_F('z', "null", _following_name, N_("print NUL after filenames"), PARSE_OPT_NOCOMPLETE), + OPT_BOOL('o', "only-matching", _matching, + N_("show only matching parts of a line")), OPT_BOOL('c', "count", , N_("show the number of matches instead of matching lines")), OPT__COLOR(, N_("highlight matches")), @@ -962,6 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix) if (!opt.pattern_list) die(_("no pattern given.")); + /* --only-matching has no effect with --invert. */ + if (opt.invert) + opt.only_matching = 0; + /* * We have to find "--" in a separate pass, because its presence * influences how we will parse arguments that come before it. diff --git a/grep.c b/grep.c index 4ff8a73043..48cca6723e 100644 --- a/grep.c +++ b/grep.c @@ -51,6 +51,7 @@ void init_grep_defaults(void) color_set(opt->color_match_selected, GIT_COLOR_BOLD_RED); color_set(opt->color_selected, ""); color_set(opt->color_sep, GIT_COLOR_CYAN); + opt->only_matching = 0; opt->color = -1; opt->output = std_output; } @@ -158,6 +159,7 @@ void grep_init(struct grep_opt *opt, const char *prefix) opt->pattern_tail = >pattern_list; opt->header_tail = >header_list; + opt->only_matching = def->only_matching; opt->color = def->color; opt->extended_regexp_option = def->extended_regexp_option; opt->pattern_type_option = def->pattern_type_option; @@ -1462,39 +1464,55 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol, opt->output(opt, "\n", 1); } } - show_line_header(opt, name, lno, cno, sign); - if (opt->color) { + if (!opt->only_matching) { + /* +* In case the line we're being called with contains more than +* one match, leave printing each header to the loop below. +*/ + show_line_header(opt, name, lno, cno, sign); + } + if (opt->color || opt->only_matching) { regmatch_t match; enum grep_context ctx = GREP_CONTEXT_BODY; int ch = *eol; int eflags = 0; - if (sign == ':') - match_color = opt->color_match_selected; - else - match_color = opt->color_match_context; - if (sign == ':') - line_color = opt->color_selected; - else if (sign == '-') - line_color = opt->color_context; - else if (sign == '=') - line_color = opt->color_function; + if (opt->color) { + if (sign == ':') + match_color = opt->color_match_selected; + else + match_color = opt->color_match_context; + if (sign == ':') + line_color = opt->color_selected; + else if (sign == '-') + line_color = opt->color_context; + else if (sign == '=') + line_color = opt->color_function; + } *eol = '\0';
Re: [GSoC][PATCH v3 2/3] rebase -i: rewrite setup_reflog_action() in C
Hi, On Mon, 25 Jun 2018, Alban Gruin wrote: > Le 25/06/2018 à 17:34, Junio C Hamano a écrit : > > Alban Gruin writes: > > > >> Le 22/06/2018 à 18:27, Junio C Hamano a écrit : > >>> Alban Gruin writes: > This rewrites (the misnamed) setup_reflog_action() from shell to C. The > new version is called checkout_base_commit(). > >>> > >>> ;-) on the "misnamed" part. Indeed, setting up the comment for the > >>> reflog entry is secondary to what this function wants to do, which > >>> is to check out the branch to be rebased. > >>> > >>> I do not think "base_commit" is a good name, either, though. When I > >>> hear 'base' in the context of 'rebase', I would imagine that the > >>> speaker is talking about the bottom of the range of the commits to > >>> be rebased (i.e. "rebase --onto ONTO BASE BRANCH", which replays > >>> commits BASE..BRANCH on top of ONTO and then points BRANCH at the > >>> result), not the top of the range or the branch these commits are > >>> taken from. > >>> > >> > >> Perhaps should I name this function checkout_onto(), and rename > >> checkout_onto() to "detach_onto()"? And I would reorder those two commits > >> in > >> the series, as this would be very confusing. > > > > I may be misunderstanding what is happening in the function, but I > > think it is checking out neither the onto or the base commit. The > > function instead is about checking out the branch to be rebased > > before anything else happens when the optional argument is > > given (and when the optional argument is not given, then we rebase > > the current branch so there is no need to check it out upfront), no? > > > > > > Yes, you’re right. > > Now I really don’t know how to call this function. > checkout_top_of_range(), perhaps? Pratik refactored some code from sequencer.c into checkout.c/checkout.h today to do exactly that. It is not polished yet, but probably will be tomorrow. It provides a function `int detach_head_to(struct object_oid *oid, const char *action, const char *reflog_message)`. Maybe use that directly, once the commit is available? Ciao, Dscho
Re: [PATCH 4/5] commit-graph: store graph in struct object_store
Jonathan Tan writes: > Instead of storing commit graphs in static variables, store them in > struct object_store. There are no changes to the signatures of existing > functions - they all still only support the_repository, and support for > other instances of struct repository will be added in a subsequent > commit. > > Signed-off-by: Jonathan Tan > --- > commit-graph.c | 34 ++ > object-store.h | 3 +++ > object.c | 5 + > 3 files changed, 26 insertions(+), 16 deletions(-) > ... > @@ -293,7 +290,9 @@ int parse_commit_in_graph(struct commit *item) > return 1; > > prepare_commit_graph(); > - if (commit_graph) { > + if (the_repository->objects->commit_graph) { > + struct commit_graph *commit_graph = > + the_repository->objects->commit_graph; > uint32_t pos; > int found; > if (item->graph_pos != COMMIT_NOT_FROM_GRAPH) { > @@ -329,7 +328,8 @@ struct tree *get_commit_tree_in_graph(const struct commit > *c) > if (c->graph_pos == COMMIT_NOT_FROM_GRAPH) > BUG("get_commit_tree_in_graph called from non-commit-graph > commit"); > > - return load_tree_for_commit(commit_graph, (struct commit *)c); > + return load_tree_for_commit(the_repository->objects->commit_graph, > + (struct commit *)c); > } I was looking at semantic merge conflicts between this and your e2838d85 ("commit-graph: always load commit-graph information", 2018-05-01), the latter of which I planned to merge to 'master' as a part of the first batch in this cycle, and noticed that there are two very similar functions, without enough document the callers would not know which one is the correct one to call. Needless to say, such a code duplication would mean the work required for resolving semantic conflict doubles needlessly X-<. int parse_commit_in_graph(struct commit *item) { uint32_t pos; if (!core_commit_graph) return 0; if (item->object.parsed) return 1; prepare_commit_graph(); if (commit_graph && find_commit_in_graph(item, commit_graph, )) return fill_commit_in_graph(item, commit_graph, pos); return 0; } void load_commit_graph_info(struct commit *item) { uint32_t pos; if (!core_commit_graph) return; prepare_commit_graph(); if (commit_graph && find_commit_in_graph(item, commit_graph, )) fill_commit_graph_info(item, commit_graph, pos); }
Re: [PATCH] Makefile: tweak sed invocation
On 2018-06-25 16:15, Eric Sunshine wrote: On Mon, Jun 25, 2018 at 3:18 PM Alejandro R. Sedeño wrote: With GNU sed, the r command doesn't care if a space separates it and the filename it reads from. With SunOS sed, the space is required. MacOS and the various BSD's ship with BSD 'sed', not GNU 'sed', so it seemed prudent to check this change against them as well, which I did, and can report that it does not cause any regression on those platforms. Therefore, the patch looks good. Thanks. Thanks for checking on that, Eric. I tested MacOS locally before submitting as well. From a quick skim of the POSIX sed page, the space is expected, so this should be portable. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html -Alejandro Signed-off-by: Alejandro R. Sedeño --- diff --git a/Makefile b/Makefile @@ -2109,7 +2109,7 @@ $(SCRIPT_PERL_GEN): % : %.perl GIT-PERL-DEFINES GIT-PERL-HEADER GIT-VERSION-FILE $(QUIET_GEN)$(RM) $@ $@+ && \ sed -e '1{' \ -e 's|#!.*perl|#!$(PERL_PATH_SQ)|' \ - -e 'rGIT-PERL-HEADER' \ + -e 'r GIT-PERL-HEADER' \ -e 'G' \ -e '}' \ -e 's/@@GIT_VERSION@@/$(GIT_VERSION)/g' \
Re: [PATCH] Makefile: tweak sed invocation
On Mon, Jun 25, 2018 at 3:18 PM Alejandro R. Sedeño wrote: > With GNU sed, the r command doesn't care if a space separates it and > the filename it reads from. > > With SunOS sed, the space is required. MacOS and the various BSD's ship with BSD 'sed', not GNU 'sed', so it seemed prudent to check this change against them as well, which I did, and can report that it does not cause any regression on those platforms. Therefore, the patch looks good. Thanks. > Signed-off-by: Alejandro R. Sedeño > --- > diff --git a/Makefile b/Makefile > @@ -2109,7 +2109,7 @@ $(SCRIPT_PERL_GEN): % : %.perl GIT-PERL-DEFINES > GIT-PERL-HEADER GIT-VERSION-FILE > $(QUIET_GEN)$(RM) $@ $@+ && \ > sed -e '1{' \ > -e 's|#!.*perl|#!$(PERL_PATH_SQ)|' \ > - -e 'rGIT-PERL-HEADER' \ > + -e 'r GIT-PERL-HEADER' \ > -e 'G' \ > -e '}' \ > -e 's/@@GIT_VERSION@@/$(GIT_VERSION)/g' \
Re: [PATCH v2 08/24] packfile: generalize pack directory list
Derrick Stolee writes: > +struct prepare_pack_data > +{ ERROR: open brace '{' following struct go on the same line #88: FILE: packfile.c:777: +struct prepare_pack_data +{ > + struct repository *r; > + struct string_list *garbage; > + int local; > +}; > + > +static void prepare_pack(const char *full_name, size_t full_name_len, const > char *file_name, void *_data) > +{ > + struct prepare_pack_data *data = (struct prepare_pack_data *)_data; > + struct packed_git *p; > + size_t base_len = full_name_len; > + > + if (strip_suffix_mem(full_name, _len, ".idx")) { > + /* Don't reopen a pack we already have. */ > + for (p = data->r->objects->packed_git; p; p = p->next) { > + size_t len; > + if (strip_suffix(p->pack_name, ".pack", ) && > + len == base_len && > + !memcmp(p->pack_name, full_name, len)) > + break; > + } > + > + if (p == NULL && > + /* > + * See if it really is a valid .idx file with > + * corresponding .pack file that we can map. > + */ > + (p = add_packed_git(full_name, full_name_len, data->local)) > != NULL) > + install_packed_git(data->r, p); > + } This is merely a moved code and the issue was inherited from the original, but can we make it easier to read and at the same time remove that assignment inside if() condition (which generally makes the code harder to read)? The most naïve if (!p) { p = add_packed_git(full_name, full_name_len, data->local); if (p) install_packed_git(data->r, p); } isn't all that bad, but there may be even better ways. > + if (!report_garbage) > +return; This "return;" is indented with a run of SP not with HT?
Re: [PATCH v2 07/24] multi-pack-index: expand test data
Derrick Stolee writes: > test_expect_success 'write midx with no packs' ' > + test_when_finished rm pack/multi-pack-index && It is generally a good idea to give "-f" to "rm" used in test_when_finished. The main command sequence may have failed before it has a chance to create that file; even though an error will be ignored by the when_finished handler, it is a good code hygiene to mark expected condition (e.g. the file to be removed may not exist at this point) to signal to future readers that the author knew what s/he was writing. > git multi-pack-index --object-dir=. write && > test_path_is_file pack/multi-pack-index && > midx_read_expect > ' > > +test_expect_success 'create objects' ' > + for i in `test_seq 1 5` Please write it as "$(test_seq 1 5)" > + do > + iii=$(printf '%03i' $i) > + test-tool genrandom "bar" 200 > wide_delta_$iii && test-tool genrandom "bar" 200 >"wide_delta_$iii" && i.e. no SP between the redirection operator and the target file name. Also dq around target file name that depends on variable substitution to tell bash that we know what we are doing (some vintage of bash will throw warning at us unless we do so).
Re: [PATCH v2 06/24] multi-pack-index: load into memory
Derrick Stolee writes: > +#define MIDX_HASH_LEN 20 > +#define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + MIDX_HASH_LEN) > > static char *get_midx_filename(const char *object_dir) > { > return xstrfmt("%s/pack/multi-pack-index", object_dir); > } > > +struct multi_pack_index *load_multi_pack_index(const char *object_dir) > +{ > + struct multi_pack_index *m = NULL; > + int fd; > + struct stat st; > + size_t midx_size; > + void *midx_map = NULL; > + uint32_t hash_version; > + char *midx_name = get_midx_filename(object_dir); > + > + fd = git_open(midx_name); > + > + if (fd < 0) { > + error_errno(_("failed to read %s"), midx_name); > + FREE_AND_NULL(midx_name); > + return NULL; > + } > + if (fstat(fd, )) { > + error_errno(_("failed to read %s"), midx_name); > + FREE_AND_NULL(midx_name); > + close(fd); > + return NULL; > + } > + > + midx_size = xsize_t(st.st_size); > + > + if (midx_size < MIDX_MIN_SIZE) { > + close(fd); > + error(_("multi-pack-index file %s is too small"), midx_name); > + goto cleanup_fail; > + } > + > + FREE_AND_NULL(midx_name); Error handling in the above part looks a bit inconsistent. I first thought that the earlier ones manually clean up and leave because jumping to cleanup_fail would need a successfully opened fd and successfully mmapped midx_map, but the above "goto" forces cleanup_fail: to munmap NULL and close an already closed fd. I wonder if it is simpler to do cleanup_fail: /* no need to check for NULL when freeing */ free(m); free(midx_name); if (midx_map) munmap(midx_map, midx_size); if (0 <= fd) close(fd); return NULL; and have all of the above error codepath to jump there. > + midx_map = xmmap(NULL, midx_size, PROT_READ, MAP_PRIVATE, fd, 0); > + > + m = xcalloc(1, sizeof(*m) + strlen(object_dir) + 1); > + strcpy(m->object_dir, object_dir); > + m->data = midx_map; > + > + m->signature = get_be32(m->data); > + if (m->signature != MIDX_SIGNATURE) { > + error(_("multi-pack-index signature 0x%08x does not match > signature 0x%08x"), > + m->signature, MIDX_SIGNATURE); > + goto cleanup_fail; > + } > + > + m->version = m->data[4]; > + if (m->version != MIDX_VERSION) { > + error(_("multi-pack-index version %d not recognized"), > + m->version); > + goto cleanup_fail; > + } > + > + hash_version = m->data[5]; Is there a good existing example to show a better way to avoid these hard-coded constants that describe/define the file format? > + if (hash_version != MIDX_HASH_VERSION) { > + error(_("hash version %u does not match"), hash_version); > + goto cleanup_fail; > + } > + m->hash_len = MIDX_HASH_LEN; > + > + m->num_chunks = *(m->data + 6); By the way, this mixture of m->data[4] and *(m->data + 6) is even worse. You could do get_be32(&8[m->data]) if you want to irritate readers even more ;-) > + m->num_packs = get_be32(m->data + 8); > + > + return m; > + > +cleanup_fail: > + FREE_AND_NULL(m); > + FREE_AND_NULL(midx_name); > + munmap(midx_map, midx_size); > + close(fd); > + return NULL; > +} > + > diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh > index 8622a7cdce..0372704c96 100755 > --- a/t/t5319-multi-pack-index.sh > +++ b/t/t5319-multi-pack-index.sh > @@ -3,9 +3,19 @@ > test_description='multi-pack-indexes' > . ./test-lib.sh > > +midx_read_expect() { "midx_read_expect () {", i.e. SP on both sides of (), please. > + cat >expect <<- EOF "<<-\EOF", i.e. make it easy for readers to spot that there is no funny substitutions happening in the here-doc body. > + header: 4d494458 1 0 0 > + object_dir: . > + EOF > + test-tool read-midx . >actual && > + test_cmp expect actual > +} > + > test_expect_success 'write midx with no packs' ' > git multi-pack-index --object-dir=. write && > - test_path_is_file pack/multi-pack-index > + test_path_is_file pack/multi-pack-index && > + midx_read_expect > ' > > test_done
[PATCH] fetch-pack: support negotiation tip whitelist
During negotiation, fetch-pack eventually reports as "have" lines all commits reachable from all refs. Allow the user to restrict the commits sent in this way by providing a whitelist of tips; only the tips themselves and their ancestors will be sent. This feature is only supported for protocols that support connect or stateless-connect (such as HTTP with protocol v2). This will speed up negotiation when the repository has multiple relatively independent branches (for example, when a repository interacts with multiple repositories, such as with linux-next [1] and torvalds/linux [2]), and the user knows which local branch is likely to have commits in common with the upstream branch they are fetching. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/ [2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/ Signed-off-by: Jonathan Tan --- This is based on jt/fetch-pack-negotiator, but if that branch is undesirable for whatever reason, I can port this to master. --- builtin/fetch.c| 21 ++ fetch-pack.c | 19 ++-- fetch-pack.h | 7 ++ t/t5510-fetch.sh | 55 ++ transport-helper.c | 3 +++ transport.c| 1 + transport.h| 10 + 7 files changed, 114 insertions(+), 2 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index ea5b9669a..12daec0f3 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -63,6 +63,7 @@ static int shown_url = 0; static struct refspec refmap = REFSPEC_INIT_FETCH; static struct list_objects_filter_options filter_options; static struct string_list server_options = STRING_LIST_INIT_DUP; +static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP; static int git_fetch_config(const char *k, const char *v, void *cb) { @@ -174,6 +175,8 @@ static struct option builtin_fetch_options[] = { TRANSPORT_FAMILY_IPV4), OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"), TRANSPORT_FAMILY_IPV6), + OPT_STRING_LIST(0, "negotiation-tip", _tip, N_("revision"), + N_("report that we have only objects reachable from this object")), OPT_PARSE_LIST_OBJECTS_FILTER(_options), OPT_END() }; @@ -1075,6 +1078,24 @@ static struct transport *prepare_transport(struct remote *remote, int deepen) filter_options.filter_spec); set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); } + if (negotiation_tip.nr) { + struct oid_array *oids; + if (transport->smart_options) { + int i; + oids = xcalloc(1, sizeof(*oids)); + for (i = 0; i < negotiation_tip.nr; i++) { + struct object_id oid; + if (get_oid(negotiation_tip.items[i].string, + )) + die("%s is not a valid object", + negotiation_tip.items[i].string); + oid_array_append(oids, ); + } + transport->smart_options->negotiation_tips = oids; + } else { + warning("Ignoring --negotiation-tip because the protocol does not support it."); + } + } return transport; } diff --git a/fetch-pack.c b/fetch-pack.c index ba12085c4..c66bd49bd 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -213,6 +213,21 @@ static int next_flush(int stateless_rpc, int count) return count; } +static void mark_tips(struct fetch_negotiator *negotiator, + const struct oid_array *negotiation_tips) +{ + int i; + if (!negotiation_tips) { + for_each_ref(rev_list_insert_ref_oid, negotiator); + return; + } + + for (i = 0; i < negotiation_tips->nr; i++) + rev_list_insert_ref(negotiator, NULL, + _tips->oid[i]); + return; +} + static int find_common(struct fetch_negotiator *negotiator, struct fetch_pack_args *args, int fd[2], struct object_id *result_oid, @@ -230,7 +245,7 @@ static int find_common(struct fetch_negotiator *negotiator, if (args->stateless_rpc && multi_ack == 1) die(_("--stateless-rpc requires multi_ack_detailed")); - for_each_ref(rev_list_insert_ref_oid, negotiator); + mark_tips(negotiator, args->negotiation_tips); for_each_cached_alternate(negotiator, insert_one_alternate_object); fetching = 0; @@ -1295,7 +1310,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, else state = FETCH_SEND_REQUEST; -
Re: [PATCH v2 05/24] midx: write header information to lockfile
Derrick Stolee writes: > +#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ > +#define MIDX_VERSION 1 > +#define MIDX_HASH_VERSION 1 > +#define MIDX_HEADER_SIZE 12 > + > +static char *get_midx_filename(const char *object_dir) > +{ > + return xstrfmt("%s/pack/multi-pack-index", object_dir); > +} > + > +static size_t write_midx_header(struct hashfile *f, > + unsigned char num_chunks, > + uint32_t num_packs) > +{ > + unsigned char byte_values[4]; > + hashwrite_be32(f, MIDX_SIGNATURE); WARNING: Missing a blank line after declarations #48: FILE: midx.c:21: + unsigned char byte_values[4]; + hashwrite_be32(f, MIDX_SIGNATURE); > + byte_values[0] = MIDX_VERSION; > + byte_values[1] = MIDX_HASH_VERSION; > + byte_values[2] = num_chunks; > + byte_values[3] = 0; /* unused */ > + hashwrite(f, byte_values, sizeof(byte_values)); > + hashwrite_be32(f, num_packs); > + > + return MIDX_HEADER_SIZE; > +} > + > int write_midx_file(const char *object_dir) > { > + unsigned char num_chunks = 0; > + char *midx_name; > + struct hashfile *f; > + struct lock_file lk; > + > + midx_name = get_midx_filename(object_dir); > + if (safe_create_leading_directories(midx_name)) { > + UNLEAK(midx_name); > + die_errno(_("unable to create leading directories of %s"), > + midx_name); > + } > + > + hold_lock_file_for_update(, midx_name, LOCK_DIE_ON_ERROR); > + f = hashfd(lk.tempfile->fd, lk.tempfile->filename.buf); > + FREE_AND_NULL(midx_name); I am not sure why people prefer FREE_AND_NULL over free() for things like this. It is on stack; it's not like this is a static variable visible after this function returns or anything like that. > + write_midx_header(f, num_chunks, 0); > + > + finalize_hashfile(f, NULL, CSUM_FSYNC | CSUM_HASH_IN_STREAM); > + commit_lock_file(); > + > return 0; > } > diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh > index ec3ddbe79c..8622a7cdce 100755 > --- a/t/t5319-multi-pack-index.sh > +++ b/t/t5319-multi-pack-index.sh > @@ -4,7 +4,8 @@ test_description='multi-pack-indexes' > . ./test-lib.sh > > test_expect_success 'write midx with no packs' ' > - git multi-pack-index --object-dir=. write > + git multi-pack-index --object-dir=. write && > + test_path_is_file pack/multi-pack-index > ' > > test_done
[PATCH] Makefile: tweak sed invocation
With GNU sed, the r command doesn't care if a space separates it and the filename it reads from. With SunOS sed, the space is required. Signed-off-by: Alejandro R. Sedeño --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index e4b503d..5bac181 100644 --- a/Makefile +++ b/Makefile @@ -2109,7 +2109,7 @@ $(SCRIPT_PERL_GEN): % : %.perl GIT-PERL-DEFINES GIT-PERL-HEADER GIT-VERSION-FILE $(QUIET_GEN)$(RM) $@ $@+ && \ sed -e '1{' \ -e 's|#!.*perl|#!$(PERL_PATH_SQ)|' \ - -e 'rGIT-PERL-HEADER' \ + -e 'r GIT-PERL-HEADER' \ -e 'G' \ -e '}' \ -e 's/@@GIT_VERSION@@/$(GIT_VERSION)/g' \ -- 2.1.4
Re: [PATCH v2 03/24] multi-pack-index: add builtin
Derrick Stolee writes: > This new 'git multi-pack-index' builtin will be the plumbing access > for writing, reading, and checking multi-pack-index files. The > initial implementation is a no-op. > > Signed-off-by: Derrick Stolee > --- > .gitignore | 3 +- > Documentation/git-multi-pack-index.txt | 36 > Makefile | 1 + > builtin.h | 1 + > builtin/multi-pack-index.c | 38 ++ > command-list.txt | 1 + > git.c | 1 + > 7 files changed, 80 insertions(+), 1 deletion(-) > create mode 100644 Documentation/git-multi-pack-index.txt > create mode 100644 builtin/multi-pack-index.c > > diff --git a/.gitignore b/.gitignore > index 388cc4beee..25633bc515 100644 > --- a/.gitignore > +++ b/.gitignore > @@ -99,8 +99,9 @@ > /git-mergetool--lib > /git-mktag > /git-mktree > -/git-name-rev > +/git-multi-pack-index > /git-mv > +/git-name-rev Nice attention to the detail (even though the patch as an incremental change gets distracting, the result is better and future changes to the file will read cleaner). > diff --git a/Documentation/git-multi-pack-index.txt > b/Documentation/git-multi-pack-index.txt > new file mode 100644 > index 00..9877f9c441 > --- /dev/null > +++ b/Documentation/git-multi-pack-index.txt > @@ -0,0 +1,36 @@ > +git-multi-pack-index(1) > +== > + > +NAME > + > +git-multi-pack-index - Write and verify multi-pack-indexes > + > + > +SYNOPSIS > + > +[verse] > +'git multi-pack-index' [--object-dir ] > + > +DESCRIPTION > +--- > +Write or verify a multi-pack-index (MIDX) file. > + > +OPTIONS > +--- > + > +--object-dir :: > + Use given directory for the location of Git objects. We check > + /packs/multi-pack-index for the current MIDX file, and > + /packs for the pack-files to index. Do we want to `quote` these constant strings? > diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c > new file mode 100644 > index 00..f101873525 > --- /dev/null > +++ b/builtin/multi-pack-index.c > @@ -0,0 +1,38 @@ > +#include "builtin.h" > +#include "cache.h" > +#include "config.h" > +#include "parse-options.h" > + > +static char const * const builtin_multi_pack_index_usage[] ={ ERROR: spaces required around that '=' (ctx:WxV) #112: FILE: builtin/multi-pack-index.c:6: +static char const * const builtin_multi_pack_index_usage[] ={ ^
[GSoC] GSoC with git, week 8
Hi, I published my blog post about last week: https://blog.pa1ch.fr/posts/2018/06/25/en/gsoc2018-week-8.html Cheers, Alban
[PATCH v4 3/8] upload-pack: test negotiation with changing repository
Add tests to check the behavior of fetching from a repository which changes between rounds of negotiation (for example, when different servers in a load-balancing agreement participate in the same stateless RPC negotiation). This forms a baseline of comparison to the ref-in-want functionality (which will be introduced to the client in subsequent commits), and ensures that subsequent commits do not change existing behavior. As part of this effort, a mechanism to substitute strings in a single HTTP response is added. Signed-off-by: Brandon Williams --- t/lib-httpd.sh | 1 + t/lib-httpd/apache.conf| 8 +++ t/lib-httpd/one-time-sed.sh| 16 ++ t/t5703-upload-pack-ref-in-want.sh | 92 ++ 4 files changed, 117 insertions(+) create mode 100644 t/lib-httpd/one-time-sed.sh diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index 435a37465..84f8efdd4 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -132,6 +132,7 @@ prepare_httpd() { cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH" install_script broken-smart-http.sh install_script error.sh + install_script one-time-sed.sh ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules" diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf index 724d9ae46..fe68d37bb 100644 --- a/t/lib-httpd/apache.conf +++ b/t/lib-httpd/apache.conf @@ -111,9 +111,14 @@ Alias /auth/dumb/ www/auth/dumb/ SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} SetEnv GIT_HTTP_EXPORT_ALL + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + SetEnv GIT_HTTP_EXPORT_ALL + ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1 ScriptAlias /broken_smart/ broken-smart-http.sh/ ScriptAlias /error/ error.sh/ +ScriptAliasMatch /one_time_sed/(.*) one-time-sed.sh/$1 Options FollowSymlinks @@ -123,6 +128,9 @@ ScriptAlias /error/ error.sh/ Options ExecCGI + + Options ExecCGI + Options ExecCGI diff --git a/t/lib-httpd/one-time-sed.sh b/t/lib-httpd/one-time-sed.sh new file mode 100644 index 0..a9c4aa5f4 --- /dev/null +++ b/t/lib-httpd/one-time-sed.sh @@ -0,0 +1,16 @@ +#!/bin/sh + +if [ -e one-time-sed ]; then + "$GIT_EXEC_PATH/git-http-backend" >out + + sed "$(cat one-time-sed)" out_modified + + if diff out out_modified >/dev/null; then + cat out + else + cat out_modified + rm one-time-sed + fi +else + "$GIT_EXEC_PATH/git-http-backend" +fi diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh index 0ef182970..979ab6d03 100755 --- a/t/t5703-upload-pack-ref-in-want.sh +++ b/t/t5703-upload-pack-ref-in-want.sh @@ -150,4 +150,96 @@ test_expect_success 'want-ref with ref we already have commit for' ' check_output ' +. "$TEST_DIRECTORY"/lib-httpd.sh +start_httpd + +REPO="$HTTPD_DOCUMENT_ROOT_PATH/repo" +LOCAL_PRISTINE="$(pwd)/local_pristine" + +test_expect_success 'setup repos for change-while-negotiating test' ' + ( + git init "$REPO" && + cd "$REPO" && + >.git/git-daemon-export-ok && + test_commit m1 && + git tag -d m1 && + + # Local repo with many commits (so that negotiation will take + # more than 1 request/response pair) + git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo; "$LOCAL_PRISTINE" && + cd "$LOCAL_PRISTINE" && + git checkout -b side && + for i in $(seq 1 33); do test_commit s$i; done && + + # Add novel commits to upstream + git checkout master && + cd "$REPO" && + test_commit m2 && + test_commit m3 && + git tag -d m2 m3 + ) && + git -C "$LOCAL_PRISTINE" remote set-url origin "http://127.0.0.1:$LIB_HTTPD_PORT/one_time_sed/repo; && + git -C "$LOCAL_PRISTINE" config protocol.version 2 +' + +inconsistency() { + # Simulate that the server initially reports $2 as the ref + # corresponding to $1, and after that, $1 as the ref corresponding to + # $1. This corresponds to the real-life situation where the server's + # repository appears to change during negotiation, for example, when + # different servers in a load-balancing arrangement serve (stateless) + # RPCs during a single negotiation. + printf "s/%s/%s/" \ + $(git -C "$REPO" rev-parse $1 | tr -d "\n") \ + $(git -C "$REPO" rev-parse $2 | tr -d "\n") \ + >"$HTTPD_ROOT_PATH/one-time-sed" +} + +test_expect_success 'server is initially ahead - no ref in want' ' + git -C "$REPO" config uploadpack.allowRefInWant false && + rm -rf local && + cp -r "$LOCAL_PRISTINE" local && + inconsistency master 1234567890123456789012345678901234567890 && + test_must_fail git -C local fetch 2>err && +
[PATCH v4 7/8] fetch-pack: put shallow info in output parameter
Expand the transport fetch method signature, by adding an output parameter, to allow transports to return information about the refs they have fetched. Then communicate shallow status information through this mechanism instead of by modifying the input list of refs. This does require clients to sometimes generate the ref map twice: once from the list of refs provided by the remote (as is currently done) and potentially once from the new list of refs that the fetch mechanism provides. Signed-off-by: Brandon Williams --- builtin/clone.c | 4 ++-- builtin/fetch.c | 28 fetch-object.c | 2 +- fetch-pack.c | 15 --- transport-helper.c | 6 -- transport-internal.h | 9 - transport.c | 34 -- transport.h | 3 ++- 8 files changed, 77 insertions(+), 24 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index 99e73dae8..8f86d99c5 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -1155,7 +1155,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } if (!is_local && !complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs); + transport_fetch_refs(transport, mapped_refs, NULL); remote_head = find_ref_by_name(refs, "HEAD"); remote_head_points_at = @@ -1197,7 +1197,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) if (is_local) clone_local(path, git_dir); else if (refs && complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs); + transport_fetch_refs(transport, mapped_refs, NULL); update_remote_refs(refs, mapped_refs, remote_head_points_at, branch_top.buf, reflog_msg.buf, transport, diff --git a/builtin/fetch.c b/builtin/fetch.c index bda00e826..0347cf016 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -946,11 +946,13 @@ static int quickfetch(struct ref *ref_map) return check_connected(iterate_ref_map, , ); } -static int fetch_refs(struct transport *transport, struct ref *ref_map) +static int fetch_refs(struct transport *transport, struct ref *ref_map, + struct ref **updated_remote_refs) { int ret = quickfetch(ref_map); if (ret) - ret = transport_fetch_refs(transport, ref_map); + ret = transport_fetch_refs(transport, ref_map, + updated_remote_refs); if (!ret) /* * Keep the new pack's ".keep" file around to allow the caller @@ -1112,7 +1114,7 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL); transport_set_option(transport, TRANS_OPT_DEPTH, "0"); transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL); - if (!fetch_refs(transport, ref_map)) + if (!fetch_refs(transport, ref_map, NULL)) consume_refs(transport, ref_map); if (gsecondary) { @@ -1128,6 +1130,7 @@ static int do_fetch(struct transport *transport, int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; const struct ref *remote_refs; + struct ref *updated_remote_refs = NULL; struct argv_array ref_prefixes = ARGV_ARRAY_INIT; if (tags == TAGS_DEFAULT) { @@ -1178,7 +1181,24 @@ static int do_fetch(struct transport *transport, transport->url); } } - if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) { + + if (fetch_refs(transport, ref_map, _remote_refs)) { + free_refs(ref_map); + retcode = 1; + goto cleanup; + } + if (updated_remote_refs) { + /* +* Regenerate ref_map using the updated remote refs. This is +* to account for additional information which may be provided +* by the transport (e.g. shallow info). +*/ + free_refs(ref_map); + ref_map = get_ref_map(transport->remote, updated_remote_refs, rs, + tags, ); + free_refs(updated_remote_refs); + } + if (consume_refs(transport, ref_map)) { free_refs(ref_map); retcode = 1; goto cleanup; diff --git a/fetch-object.c b/fetch-object.c index 853624f81..48fe63dd6 100644 --- a/fetch-object.c +++ b/fetch-object.c @@ -19,7 +19,7 @@ static void fetch_refs(const char *remote_name, struct ref *ref) transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1"); -
[PATCH v4 8/8] fetch-pack: implement ref-in-want
Implement ref-in-want on the client side so that when a server supports the "ref-in-want" feature, a client will send "want-ref" lines for each reference the client wants to fetch. This feature allows clients to tolerate inconsistencies that exist when a remote repository's refs change during the course of negotiation. This allows a client to request to request a particular ref without specifying the OID of the ref. This means that instead of hitting an error when a ref no longer points at the OID it did at the beginning of negotiation, negotiation can continue and the value of that ref will be sent at the termination of negotiation, just before a packfile is sent. More information on the ref-in-want feature can be found in Documentation/technical/protocol-v2.txt. Signed-off-by: Brandon Williams --- fetch-pack.c | 35 +++--- remote.c | 1 + remote.h | 1 + t/t5703-upload-pack-ref-in-want.sh | 4 ++-- 4 files changed, 36 insertions(+), 5 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 73890b894..3a18f5bcd 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -1102,9 +1102,10 @@ static void add_shallow_requests(struct strbuf *req_buf, static void add_wants(const struct ref *wants, struct strbuf *req_buf) { + int use_ref_in_want = server_supports_feature("fetch", "ref-in-want", 0); + for ( ; wants ; wants = wants->next) { const struct object_id *remote = >old_oid; - const char *remote_hex; struct object *o; /* @@ -1122,8 +1123,10 @@ static void add_wants(const struct ref *wants, struct strbuf *req_buf) continue; } - remote_hex = oid_to_hex(remote); - packet_buf_write(req_buf, "want %s\n", remote_hex); + if (!use_ref_in_want || wants->exact_oid) + packet_buf_write(req_buf, "want %s\n", oid_to_hex(remote)); + else + packet_buf_write(req_buf, "want-ref %s\n", wants->name); } } @@ -1334,6 +1337,29 @@ static void receive_shallow_info(struct fetch_pack_args *args, args->deepen = 1; } +static void receive_wanted_refs(struct packet_reader *reader, struct ref *refs) +{ + process_section_header(reader, "wanted-refs", 0); + while (packet_reader_read(reader) == PACKET_READ_NORMAL) { + struct object_id oid; + const char *end; + struct ref *r = NULL; + + if (parse_oid_hex(reader->line, , ) || *end++ != ' ') + die("expected wanted-ref, got '%s'", reader->line); + + for (r = refs; r; r = r->next) { + if (!strcmp(end, r->name)) { + oidcpy(>old_oid, ); + break; + } + } + } + + if (reader->status != PACKET_READ_DELIM) + die("error processing wanted refs: %d", reader->status); +} + enum fetch_state { FETCH_CHECK_LOCAL = 0, FETCH_SEND_REQUEST, @@ -1408,6 +1434,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, if (process_section_header(, "shallow-info", 1)) receive_shallow_info(args, ); + if (process_section_header(, "wanted-refs", 1)) + receive_wanted_refs(, ref); + /* get the pack */ process_section_header(, "packfile", 0); if (get_pack(args, fd, pack_lockfile)) diff --git a/remote.c b/remote.c index abe80c139..2c2376fff 100644 --- a/remote.c +++ b/remote.c @@ -1735,6 +1735,7 @@ int get_fetch_map(const struct ref *remote_refs, if (refspec->exact_sha1) { ref_map = alloc_ref(name); get_oid_hex(name, _map->old_oid); + ref_map->exact_oid = 1; } else { ref_map = get_remote_ref(remote_refs, name); } diff --git a/remote.h b/remote.h index 45ecc6cef..976292152 100644 --- a/remote.h +++ b/remote.h @@ -73,6 +73,7 @@ struct ref { force:1, forced_update:1, expect_old_sha1:1, + exact_oid:1, deletion:1; enum { diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh index 979ab6d03..b94a51380 100755 --- a/t/t5703-upload-pack-ref-in-want.sh +++ b/t/t5703-upload-pack-ref-in-want.sh @@ -204,7 +204,7 @@ test_expect_success 'server is initially ahead - no ref in want' ' grep "ERR upload-pack: not our ref" err ' -test_expect_failure 'server is initially ahead - ref in want' ' +test_expect_success 'server is initially ahead - ref in want' ' git -C "$REPO"
[PATCH v4 6/8] fetch: refactor to make function args narrower
Refactor find_non_local_tags and get_ref_map to only take the information they need instead of the entire transport struct. Besides improving code clarity, this also improves their flexibility, allowing for a different set of refs to be used instead of relying on the ones stored in the transport struct. Signed-off-by: Brandon Williams --- builtin/fetch.c | 52 - 1 file changed, 26 insertions(+), 26 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 2fabfed0e..bda00e826 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -254,9 +254,9 @@ static int will_fetch(struct ref **head, const unsigned char *sha1) return 0; } -static void find_non_local_tags(struct transport *transport, - struct ref **head, - struct ref ***tail) +static void find_non_local_tags(const struct ref *refs, + struct ref **head, + struct ref ***tail) { struct string_list existing_refs = STRING_LIST_INIT_DUP; struct string_list remote_refs = STRING_LIST_INIT_NODUP; @@ -264,7 +264,7 @@ static void find_non_local_tags(struct transport *transport, struct string_list_item *item = NULL; for_each_ref(add_existing, _refs); - for (ref = transport_get_remote_refs(transport, NULL); ref; ref = ref->next) { + for (ref = refs; ref; ref = ref->next) { if (!starts_with(ref->name, "refs/tags/")) continue; @@ -338,7 +338,8 @@ static void find_non_local_tags(struct transport *transport, string_list_clear(_refs, 0); } -static struct ref *get_ref_map(struct transport *transport, +static struct ref *get_ref_map(struct remote *remote, + const struct ref *remote_refs, struct refspec *rs, int tags, int *autotags) { @@ -346,27 +347,11 @@ static struct ref *get_ref_map(struct transport *transport, struct ref *rm; struct ref *ref_map = NULL; struct ref **tail = _map; - struct argv_array ref_prefixes = ARGV_ARRAY_INIT; /* opportunistically-updated references: */ struct ref *orefs = NULL, **oref_tail = struct string_list existing_refs = STRING_LIST_INIT_DUP; - const struct ref *remote_refs; - - if (rs->nr) - refspec_ref_prefixes(rs, _prefixes); - else if (transport->remote && transport->remote->fetch.nr) - refspec_ref_prefixes(>remote->fetch, _prefixes); - - if (ref_prefixes.argc && - (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) { - argv_array_push(_prefixes, "refs/tags/"); - } - - remote_refs = transport_get_remote_refs(transport, _prefixes); - - argv_array_clear(_prefixes); if (rs->nr) { struct refspec *fetch_refspec; @@ -403,7 +388,7 @@ static struct ref *get_ref_map(struct transport *transport, if (refmap.nr) fetch_refspec = else - fetch_refspec = >remote->fetch; + fetch_refspec = >fetch; for (i = 0; i < fetch_refspec->nr; i++) get_fetch_map(ref_map, _refspec->items[i], _tail, 1); @@ -411,7 +396,6 @@ static struct ref *get_ref_map(struct transport *transport, die("--refmap option is only meaningful with command-line refspec(s)."); } else { /* Use the defaults */ - struct remote *remote = transport->remote; struct branch *branch = branch_get(NULL); int has_merge = branch_has_merge_config(branch); if (remote && @@ -450,7 +434,7 @@ static struct ref *get_ref_map(struct transport *transport, /* also fetch all tags */ get_fetch_map(remote_refs, tag_refspec, , 0); else if (tags == TAGS_DEFAULT && *autotags) - find_non_local_tags(transport, _map, ); + find_non_local_tags(remote_refs, _map, ); /* Now append any refs to be updated opportunistically: */ *tail = orefs; @@ -1143,6 +1127,8 @@ static int do_fetch(struct transport *transport, struct ref *ref_map; int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; + const struct ref *remote_refs; + struct argv_array ref_prefixes = ARGV_ARRAY_INIT; if (tags == TAGS_DEFAULT) { if (transport->remote->fetch_tags == 2) @@ -1158,7 +1144,21 @@ static int do_fetch(struct transport *transport, goto cleanup; } - ref_map = get_ref_map(transport, rs, tags, ); + if (rs->nr) + refspec_ref_prefixes(rs, _prefixes); + else if (transport->remote && transport->remote->fetch.nr) +
[PATCH v4 4/8] fetch: refactor the population of peer ref OIDs
Populate peer ref OIDs in get_ref_map instead of do_fetch. Besides tightening scopes of variables in the code, this also prepares for get_ref_map being able to be called multiple times within do_fetch. Signed-off-by: Brandon Williams --- builtin/fetch.c | 36 ++-- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index ea5b9669a..545635448 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -351,6 +351,7 @@ static struct ref *get_ref_map(struct transport *transport, /* opportunistically-updated references: */ struct ref *orefs = NULL, **oref_tail = + struct string_list existing_refs = STRING_LIST_INIT_DUP; const struct ref *remote_refs; if (rs->nr) @@ -458,7 +459,23 @@ static struct ref *get_ref_map(struct transport *transport, tail = >next; } - return ref_remove_duplicates(ref_map); + ref_map = ref_remove_duplicates(ref_map); + + for_each_ref(add_existing, _refs); + for (rm = ref_map; rm; rm = rm->next) { + if (rm->peer_ref) { + struct string_list_item *peer_item = + string_list_lookup(_refs, + rm->peer_ref->name); + if (peer_item) { + struct object_id *old_oid = peer_item->util; + oidcpy(>peer_ref->old_oid, old_oid); + } + } + } + string_list_clear(_refs, 1); + + return ref_map; } #define STORE_REF_ERROR_OTHER 1 @@ -1110,14 +1127,10 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) static int do_fetch(struct transport *transport, struct refspec *rs) { - struct string_list existing_refs = STRING_LIST_INIT_DUP; struct ref *ref_map; - struct ref *rm; int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; - for_each_ref(add_existing, _refs); - if (tags == TAGS_DEFAULT) { if (transport->remote->fetch_tags == 2) tags = TAGS_SET; @@ -1136,18 +1149,6 @@ static int do_fetch(struct transport *transport, if (!update_head_ok) check_not_current_branch(ref_map); - for (rm = ref_map; rm; rm = rm->next) { - if (rm->peer_ref) { - struct string_list_item *peer_item = - string_list_lookup(_refs, - rm->peer_ref->name); - if (peer_item) { - struct object_id *old_oid = peer_item->util; - oidcpy(>peer_ref->old_oid, old_oid); - } - } - } - if (tags == TAGS_DEFAULT && autotags) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1"); if (prune) { @@ -1183,7 +1184,6 @@ static int do_fetch(struct transport *transport, } cleanup: - string_list_clear(_refs, 1); return retcode; } -- 2.18.0.rc2.346.g013aa6912e-goog
[PATCH v4 5/8] fetch: refactor fetch_refs into two functions
Refactor the fetch_refs function into a function that does the fetching of refs and another function that stores them. This is in preparation for allowing additional processing of the fetched refs before updating the local ref store. Signed-off-by: Brandon Williams --- builtin/fetch.c | 23 ++- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 545635448..2fabfed0e 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -968,9 +968,21 @@ static int fetch_refs(struct transport *transport, struct ref *ref_map) if (ret) ret = transport_fetch_refs(transport, ref_map); if (!ret) - ret |= store_updated_refs(transport->url, - transport->remote->name, - ref_map); + /* +* Keep the new pack's ".keep" file around to allow the caller +* time to update refs to reference the new objects. +*/ + return 0; + transport_unlock_pack(transport); + return ret; +} + +/* Update local refs based on the ref values fetched from a remote */ +static int consume_refs(struct transport *transport, struct ref *ref_map) +{ + int ret = store_updated_refs(transport->url, +transport->remote->name, +ref_map); transport_unlock_pack(transport); return ret; } @@ -1116,7 +1128,8 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL); transport_set_option(transport, TRANS_OPT_DEPTH, "0"); transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL); - fetch_refs(transport, ref_map); + if (!fetch_refs(transport, ref_map)) + consume_refs(transport, ref_map); if (gsecondary) { transport_disconnect(gsecondary); @@ -1165,7 +1178,7 @@ static int do_fetch(struct transport *transport, transport->url); } } - if (fetch_refs(transport, ref_map)) { + if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) { free_refs(ref_map); retcode = 1; goto cleanup; -- 2.18.0.rc2.346.g013aa6912e-goog
[PATCH v4 2/8] upload-pack: implement ref-in-want
Currently, while performing packfile negotiation, clients are only allowed to specify their desired objects using object ids. This causes a vulnerability to failure when an object turns non-existent during negotiation, which may happen if, for example, the desired repository is provided by multiple Git servers in a load-balancing arrangement and there exists replication delay. In order to eliminate this vulnerability, implement the ref-in-want feature for the 'fetch' command in protocol version 2. This feature enables the 'fetch' command to support requests in the form of ref names through a new "want-ref " parameter. At the conclusion of negotiation, the server will send a list of all of the wanted references (as provided by "want-ref" lines) in addition to the generated packfile. Signed-off-by: Brandon Williams --- Documentation/config.txt| 7 ++ Documentation/technical/protocol-v2.txt | 28 - t/t5703-upload-pack-ref-in-want.sh | 153 upload-pack.c | 66 ++ 4 files changed, 253 insertions(+), 1 deletion(-) create mode 100755 t/t5703-upload-pack-ref-in-want.sh diff --git a/Documentation/config.txt b/Documentation/config.txt index ab641bf5a..fb1dd7428 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -3479,6 +3479,13 @@ Note that this configuration variable is ignored if it is seen in the repository-level config (this is a safety measure against fetching from untrusted repositories). +uploadpack.allowRefInWant:: + If this option is set, `upload-pack` will support the `ref-in-want` + feature of the protocol version 2 `fetch` command. This feature + is intended for the benefit of load-balanced servers which may + not have the same view of what OIDs their refs point to due to + replication delay. + url..insteadOf:: Any URL that starts with this value will be rewritten to start, instead, with . In cases where some site serves a diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt index 49bda76d2..d44ac10f1 100644 --- a/Documentation/technical/protocol-v2.txt +++ b/Documentation/technical/protocol-v2.txt @@ -299,12 +299,21 @@ included in the client's request: for use with partial clone and partial fetch operations. See `rev-list` for possible "filter-spec" values. +If the 'ref-in-want' feature is advertised, the following argument can +be included in the client's request as well as the potential addition of +the 'wanted-refs' section in the server's response as explained below. + +want-ref + Indicates to the server that the client wants to retrieve a + particular ref, where is the full name of a ref on the + server. + The response of `fetch` is broken into a number of sections separated by delimiter packets (0001), with each section beginning with its section header. output = *section -section = (acknowledgments | shallow-info | packfile) +section = (acknowledgments | shallow-info | wanted-refs | packfile) (flush-pkt | delim-pkt) acknowledgments = PKT-LINE("acknowledgments" LF) @@ -319,6 +328,10 @@ header. shallow = "shallow" SP obj-id unshallow = "unshallow" SP obj-id +wanted-refs = PKT-LINE("wanted-refs" LF) + *PKT-LINE(wanted-ref LF) +wanted-ref = obj-id SP refname + packfile = PKT-LINE("packfile" LF) *PKT-LINE(%x01-03 *%x00-ff) @@ -379,6 +392,19 @@ header. * This section is only included if a packfile section is also included in the response. +wanted-refs section + * This section is only included if the client has requested a + ref using a 'want-ref' line and if a packfile section is also + included in the response. + + * Always begins with the section header "wanted-refs" + + * The server will send a ref listing (" ") for + each reference requested using 'want-ref' lines. + + * The server SHOULD NOT send any refs which were not requested + using 'want-ref' lines. + packfile section * This section is only included if the client has sent 'want' lines in its request and either requested that no more diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh new file mode 100755 index 0..0ef182970 --- /dev/null +++ b/t/t5703-upload-pack-ref-in-want.sh @@ -0,0 +1,153 @@ +#!/bin/sh + +test_description='upload-pack ref-in-want' + +. ./test-lib.sh + +get_actual_refs() { + sed -n '/wanted-refs/,/0001/p' actual_refs +} + +get_actual_commits() { + sed -n '/packfile/,//p' o.pack && + git index-pack o.pack && + git verify-pack -v o.idx | grep commit | cut -c-40 | sort >actual_commits +} + +check_output() { + get_actual_refs && + test_cmp expected_refs actual_refs && + get_actual_commits &&
[PATCH v4 1/8] test-pkt-line: add unpack-sideband subcommand
Add an 'unpack-sideband' subcommand to the test-pkt-line helper to enable unpacking packet line data sent multiplexed using a sideband. Signed-off-by: Brandon Williams --- t/helper/test-pkt-line.c | 33 + 1 file changed, 33 insertions(+) diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c index 0f19e53c7..30775f986 100644 --- a/t/helper/test-pkt-line.c +++ b/t/helper/test-pkt-line.c @@ -1,3 +1,4 @@ +#include "cache.h" #include "pkt-line.h" static void pack_line(const char *line) @@ -48,6 +49,36 @@ static void unpack(void) } } +static void unpack_sideband(void) +{ + struct packet_reader reader; + packet_reader_init(, 0, NULL, 0, + PACKET_READ_GENTLE_ON_EOF | + PACKET_READ_CHOMP_NEWLINE); + + while (packet_reader_read() != PACKET_READ_EOF) { + int band; + int fd; + + switch (reader.status) { + case PACKET_READ_EOF: + break; + case PACKET_READ_NORMAL: + band = reader.line[0] & 0xff; + if (band < 1 || band > 2) + die("unexpected side band %d", band); + fd = band; + + write_or_die(fd, reader.line + 1, reader.pktlen - 1); + break; + case PACKET_READ_FLUSH: + return; + case PACKET_READ_DELIM: + break; + } + } +} + int cmd_main(int argc, const char **argv) { if (argc < 2) @@ -57,6 +88,8 @@ int cmd_main(int argc, const char **argv) pack(argc - 2, argv + 2); else if (!strcmp(argv[1], "unpack")) unpack(); + else if (!strcmp(argv[1], "unpack-sideband")) + unpack_sideband(); else die("invalid argument '%s'", argv[1]); -- 2.18.0.rc2.346.g013aa6912e-goog
[PATCH v4 0/8] ref-in-want
Changes in v4 are fairly minor. There are a few documentation changes, commit message updates, as well as a few small style tweaks based on reviewer comments. Brandon Williams (8): test-pkt-line: add unpack-sideband subcommand upload-pack: implement ref-in-want upload-pack: test negotiation with changing repository fetch: refactor the population of peer ref OIDs fetch: refactor fetch_refs into two functions fetch: refactor to make function args narrower fetch-pack: put shallow info in output parameter fetch-pack: implement ref-in-want Documentation/config.txt| 7 + Documentation/technical/protocol-v2.txt | 28 ++- builtin/clone.c | 4 +- builtin/fetch.c | 135 - fetch-object.c | 2 +- fetch-pack.c| 50 - remote.c| 1 + remote.h| 1 + t/helper/test-pkt-line.c| 33 t/lib-httpd.sh | 1 + t/lib-httpd/apache.conf | 8 + t/lib-httpd/one-time-sed.sh | 16 ++ t/t5703-upload-pack-ref-in-want.sh | 245 transport-helper.c | 6 +- transport-internal.h| 9 +- transport.c | 34 +++- transport.h | 3 +- upload-pack.c | 66 +++ 18 files changed, 574 insertions(+), 75 deletions(-) create mode 100644 t/lib-httpd/one-time-sed.sh create mode 100755 t/t5703-upload-pack-ref-in-want.sh -- 2.18.0.rc2.346.g013aa6912e-goog
Re: [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)'
On Mon, Jun 25, 2018 at 02:43:50PM -0400, Jeff King wrote: > On Fri, Jun 22, 2018 at 10:49:26AM -0500, Taylor Blau wrote: > > Since the last time, only a couple of things have changed at Peff's > > suggestions in [1]. The changes are summarized here, and an inter-diff > > is available below: > > > > - Change "%zu" to PRIuMAX (and an appropriate cast into uintmax_t). I > > plan to send a follow-up patch to convert this back to "%zu" to see > > how people feel about it, but I wanted to keep that out of the > > present series in order to not hold things up. > > > > - Don't short-circuit AND when given --column, since an earlier NOT > > higher in the tree may cause an AND to be converted into an OR via > > de Morgan's Law, in which case the problem is reduced to the OR case > > (and should not have been short-circuited in the first place). > > > > - Add a test in t7810 to cover this behavior (i.e., '--not \( -e x > > --and -e y \)'). > > Jinxes aside, this interdiff looks good to me. Thanks; I hope that I haven't jinxed anything :-). I'm going to avoid sending the PRIuMAX -> "%zu" patch, since dscho points out that it's not available on Windows [1]. Thanks, Taylor [1]: https://public-inbox.org/git/nycvar.qro.7.76.6.1806222344280.11...@tvgsbejvaqbjf.bet/
Re: [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)'
On Fri, Jun 22, 2018 at 10:49:26AM -0500, Taylor Blau wrote: > Attached is my third--anticipate the final--re-roll of my series to > teach 'git grep --column'. You know when you say that it jinxes it, right? :) > Since the last time, only a couple of things have changed at Peff's > suggestions in [1]. The changes are summarized here, and an inter-diff > is available below: > > - Change "%zu" to PRIuMAX (and an appropriate cast into uintmax_t). I > plan to send a follow-up patch to convert this back to "%zu" to see > how people feel about it, but I wanted to keep that out of the > present series in order to not hold things up. > > - Don't short-circuit AND when given --column, since an earlier NOT > higher in the tree may cause an AND to be converted into an OR via > de Morgan's Law, in which case the problem is reduced to the OR case > (and should not have been short-circuited in the first place). > > - Add a test in t7810 to cover this behavior (i.e., '--not \( -e x > --and -e y \)'). Jinxes aside, this interdiff looks good to me. -Peff
Re: [PATCH v3 0/7] Refactor fetch negotiation into its own API
On 06/14, Jonathan Tan wrote: > Thanks, Brandon and Junio, for your comments. > > Updates since v2: > - nothing new in patches 1 and 2 > - patch 3: now clear priority queue unconditionally once we are done >with it (as requested by Brandon in a comment for a later patch) > - patch 4: updated commit message to not mention everything_local() (as >pointed out by Brandon), updated test to not rely on the fact that >fetch-pack uses prefix matching (thanks, Junio, for the observation) > - patch 5: used a more descriptive name ("struct negotiation_state") >for the struct, instead of "struct data" > - squashed patch 8 into patch 7; this means that the comments are not >moved verbatim, but for the reviewer, verbatim-ness of comments is >probably not that important anyway Thanks this series looks good now. Reviewed-by: Brandon Williams -- Brandon Williams
Re: [PATCH 0/2] Object store refactoring: make bitmap_git not global
On 06/07, Jonathan Tan wrote: > This is a continuation of the object store refactoring effort. > > We cannot truly free an object store without ensuring that any generated > bitmaps are first freed, so here are patches to drastically reduce the > lifetime of any bitmaps generated. As a bonus, the API is also improved, > and global state reduced. I've reviewed this series and haven't found any issues. Reviewed-by: Brandon Williams -- Brandon Williams
Re: [GSoC][PATCH v3 2/3] rebase -i: rewrite setup_reflog_action() in C
Hi Junio, Le 25/06/2018 à 17:34, Junio C Hamano a écrit : > Alban Gruin writes: > >> Hi Junio, >> >> Le 22/06/2018 à 18:27, Junio C Hamano a écrit : >>> Alban Gruin writes: This rewrites (the misnamed) setup_reflog_action() from shell to C. The new version is called checkout_base_commit(). >>> >>> ;-) on the "misnamed" part. Indeed, setting up the comment for the >>> reflog entry is secondary to what this function wants to do, which >>> is to check out the branch to be rebased. >>> >>> I do not think "base_commit" is a good name, either, though. When I >>> hear 'base' in the context of 'rebase', I would imagine that the >>> speaker is talking about the bottom of the range of the commits to >>> be rebased (i.e. "rebase --onto ONTO BASE BRANCH", which replays >>> commits BASE..BRANCH on top of ONTO and then points BRANCH at the >>> result), not the top of the range or the branch these commits are >>> taken from. >>> >> >> Perhaps should I name this function checkout_onto(), and rename >> checkout_onto() to "detach_onto()"? And I would reorder those two commits >> in >> the series, as this would be very confusing. > > I may be misunderstanding what is happening in the function, but I > think it is checking out neither the onto or the base commit. The > function instead is about checking out the branch to be rebased > before anything else happens when the optional argument is > given (and when the optional argument is not given, then we rebase > the current branch so there is no need to check it out upfront), no? > > Yes, you’re right. Now I really don’t know how to call this function. checkout_top_of_range(), perhaps? Cheers, Alban
Re: [PATCH v3 2/8] upload-pack: implement ref-in-want
On 06/25, Jonathan Tan wrote: > > +static int parse_want_ref(const char *line, struct string_list > > *wanted_refs) > > +{ > > + const char *arg; > > + if (skip_prefix(line, "want-ref ", )) { > > + struct object_id oid; > > + struct string_list_item *item; > > + struct object *o; > > + > > + if (read_ref(arg, )) > > + die("unknown ref %s", arg); > > One more thing - if you're planning to "die" here, also write out an > error to the user, just like in parse_want(). Oh good idea, I'll add an ERR pkt here -- Brandon Williams
Re: [BUG] url schemes should be case-insensitive
Jeff King writes: > We seem to match url schemes case-sensitively: > > $ git clone SSH://example.com/repo.git > Cloning into 'repo'... > fatal: Unable to find remote helper for 'SSH' > > whereas rfc3986 is clear that the scheme portion is case-insensitive. > We probably ought to match at least our internal ones with strcasecmp. That may break if somebody at DevToolGroup@$BIGCOMPANY got cute and named their custom remote helper SSH:// that builds on top of the normal ssh:// protocol with something extra and gave it to their developers (and they named the http counterpart that has the same extra HTTP://, of course). If we probe for git-remote-SSH first and then fall back to git-remote-ssh, then we won't break these people, though. I agree that it may be a good bite-sized #leftoverbit material. > Possibly we should also normalize external helpers (so "FOO://bar" would > always call git-remote-foo, never git-remote-FOO). > We could probably also give an advise() message in the above output, > suggesting that the problem is likely one of: > > 1. They misspelled the scheme. > > 2. They need to install the appropriate helper. > > This may be a good topic for somebody looking for low-hanging fruit to > get involved in development (I'd maybe call it a #leftoverbits, but > since I didn't start on it, I'm not sure if it counts as "left over" ;)). Well, noticing an issue, analysing and discussing potential improvements and their ramifications is already half the work, so it does count as left-over, I would say. It may probably be a good idea to do an advice, but I'd think "Untable to find remote helper for 'SSH'" may be clear enough. If anything, perhaps saying "remote helper for 'SSH' protocol" would make it even clear? I dunno.
Re: [PATCH v3 7/8] fetch-pack: put shallow info in output parameter
On 06/25, Jonathan Tan wrote: > > static void update_shallow(struct fetch_pack_args *args, > > - struct ref **sought, int nr_sought, > > + struct ref *refs, > > update_shallow() now takes in a linked list of refs instead of an array. > I see that the translation of this function is straightforward - > occasionally, we need to iterate through the linked list and count up > from 0 at the same time, but that is not a problem. > > >struct shallow_info *si) > > { > > struct oid_array ref = OID_ARRAY_INIT; > > int *status; > > - int i; > > + int i = 0; > > Remove the " = 0" - I've verified that it does not need to be there, and > it might inhibit useful "unintialized variable" warnings if others were > to change the code later. > > Optional: I would also remove this declaration and declare "int i;" in > each of the blocks that need it. > > > static int fetch_refs_via_pack(struct transport *transport, > > - int nr_heads, struct ref **to_fetch) > > + int nr_heads, struct ref **to_fetch, > > + struct ref **fetched_refs) > > { > > int ret = 0; > > struct git_transport_data *data = transport->data; > > @@ -354,8 +356,12 @@ static int fetch_refs_via_pack(struct transport > > *transport, > > if (report_unmatched_refs(to_fetch, nr_heads)) > > ret = -1; > > > > + if (fetched_refs) > > + *fetched_refs = refs; > > + else > > + free_refs(refs); > > + > > free_refs(refs_tmp); > > - free_refs(refs); > > free(dest); > > return ret; > > } > > Instead of just freeing the linked list, we return it if requested by > the client. This makes sense. > > > -int transport_fetch_refs(struct transport *transport, struct ref *refs) > > +int transport_fetch_refs(struct transport *transport, struct ref *refs, > > +struct ref **fetched_refs) > > { > > int rc; > > int nr_heads = 0, nr_alloc = 0, nr_refs = 0; > > struct ref **heads = NULL; > > + struct ref *nop_head = NULL, **nop_tail = _head; > > struct ref *rm; > > > > for (rm = refs; rm; rm = rm->next) { > > nr_refs++; > > if (rm->peer_ref && > > !is_null_oid(>old_oid) && > > - !oidcmp(>peer_ref->old_oid, >old_oid)) > > + !oidcmp(>peer_ref->old_oid, >old_oid)) { > > + /* > > +* These need to be reported as fetched, but we don't > > +* actually need to fetch them. > > +*/ > > + if (fetched_refs) { > > + struct ref *nop_ref = copy_ref(rm); > > + *nop_tail = nop_ref; > > + nop_tail = _ref->next; > > + } > > continue; > > + } > > ALLOC_GROW(heads, nr_heads + 1, nr_alloc); > > heads[nr_heads++] = rm; > > } > > @@ -1245,7 +1263,11 @@ int transport_fetch_refs(struct transport > > *transport, struct ref *refs) > > heads[nr_heads++] = rm; > > } > > > > - rc = transport->vtable->fetch(transport, nr_heads, heads); > > + rc = transport->vtable->fetch(transport, nr_heads, heads, fetched_refs); > > + if (fetched_refs && nop_head) { > > + *nop_tail = *fetched_refs; > > + *fetched_refs = nop_head; > > + } > > > > free(heads); > > return rc; > > And sometimes, even if we are merely simulating the fetching of refs, we > still need to report those refs in fetched_refs. This is correct. > > I also see that t5703 now passes. > > Besides enabling the writing of subsequent patches, I see that this also > makes the API clearer in that the input refs to transport_fetch_refs() > are not overloaded to output shallow information. Other than the " = 0" > change above, this patch looks good to me. Perfect, I'll just drop the " = 0" part (making the diff slightly smaller) -- Brandon Williams
Re: [PATCH v3 2/8] upload-pack: implement ref-in-want
> +static int parse_want_ref(const char *line, struct string_list *wanted_refs) > +{ > + const char *arg; > + if (skip_prefix(line, "want-ref ", )) { > + struct object_id oid; > + struct string_list_item *item; > + struct object *o; > + > + if (read_ref(arg, )) > + die("unknown ref %s", arg); One more thing - if you're planning to "die" here, also write out an error to the user, just like in parse_want().
Re: [PATCH v3 8/8] fetch-pack: implement ref-in-want
On 06/22, Jonathan Nieder wrote: > Hi, > > Brandon Williams wrote: > > > Implement ref-in-want on the client side so that when a server supports > > the "ref-in-want" feature, a client will send "want-ref" lines for each > > reference the client wants to fetch. > > > > Signed-off-by: Brandon Williams > > --- > > fetch-pack.c | 35 +++--- > > remote.c | 1 + > > remote.h | 1 + > > t/t5703-upload-pack-ref-in-want.sh | 4 ++-- > > 4 files changed, 36 insertions(+), 5 deletions(-) > > This commit message doesn't tell me what ref-in-want is or is for. Could > it include > > A. a pointer to Documentation/technical/protocol-v2.txt, or > B. an example illustrating the effect e.g. using GIT_TRACE_PACKET > > or both? Yeah I can imporve the message here. > > + > > + for (r = refs; r; r = r->next) { > > + if (!strcmp(end, r->name)) { > > + oidcpy(>old_oid, ); > > + break; > > + } > > Stefan mentioned that the spec says > > * The server MUST NOT send any refs which were not requested > using 'want-ref' lines. > > Can client enforce that? If not, can the spec say SHOULD NOT for the > server and add a MUST describing appropriate client behavior? Yeah I can update the docs in an earlier patch. > > > + } > > + } > > + > > + if (reader->status != PACKET_READ_DELIM) > > The spec says > > * This section is only included if the client has requested a > ref using a 'want-ref' line and if a packfile section is also > included in the response. > > What should happen if the client already has all the relevant objects > (or in other words if there is no packfile to send in the packfile > section)? Is the idea that the client should already have known that > based on the ref advertisement? What if ref values change to put us > in that state between the ls-refs and fetch steps? I believe the current functionality is that if all wants are already satisfied by all haves then an empty packfile is sent, so that would fall under that case. -- Brandon Williams
Re: [PATCH v1 2/8] Add initial odb remote support
Christian Couder writes: > For each promisor remote I think it makes no sense to have more than > one remote odb pointing to it. So I am not sure what to do here. If it makes no sense, then detecting and erroring out would be a sensible thing to do, no?
Re: [PATCH v3 7/8] fetch-pack: put shallow info in output parameter
> static void update_shallow(struct fetch_pack_args *args, > -struct ref **sought, int nr_sought, > +struct ref *refs, update_shallow() now takes in a linked list of refs instead of an array. I see that the translation of this function is straightforward - occasionally, we need to iterate through the linked list and count up from 0 at the same time, but that is not a problem. > struct shallow_info *si) > { > struct oid_array ref = OID_ARRAY_INIT; > int *status; > - int i; > + int i = 0; Remove the " = 0" - I've verified that it does not need to be there, and it might inhibit useful "unintialized variable" warnings if others were to change the code later. Optional: I would also remove this declaration and declare "int i;" in each of the blocks that need it. > static int fetch_refs_via_pack(struct transport *transport, > -int nr_heads, struct ref **to_fetch) > +int nr_heads, struct ref **to_fetch, > +struct ref **fetched_refs) > { > int ret = 0; > struct git_transport_data *data = transport->data; > @@ -354,8 +356,12 @@ static int fetch_refs_via_pack(struct transport > *transport, > if (report_unmatched_refs(to_fetch, nr_heads)) > ret = -1; > > + if (fetched_refs) > + *fetched_refs = refs; > + else > + free_refs(refs); > + > free_refs(refs_tmp); > - free_refs(refs); > free(dest); > return ret; > } Instead of just freeing the linked list, we return it if requested by the client. This makes sense. > -int transport_fetch_refs(struct transport *transport, struct ref *refs) > +int transport_fetch_refs(struct transport *transport, struct ref *refs, > + struct ref **fetched_refs) > { > int rc; > int nr_heads = 0, nr_alloc = 0, nr_refs = 0; > struct ref **heads = NULL; > + struct ref *nop_head = NULL, **nop_tail = _head; > struct ref *rm; > > for (rm = refs; rm; rm = rm->next) { > nr_refs++; > if (rm->peer_ref && > !is_null_oid(>old_oid) && > - !oidcmp(>peer_ref->old_oid, >old_oid)) > + !oidcmp(>peer_ref->old_oid, >old_oid)) { > + /* > + * These need to be reported as fetched, but we don't > + * actually need to fetch them. > + */ > + if (fetched_refs) { > + struct ref *nop_ref = copy_ref(rm); > + *nop_tail = nop_ref; > + nop_tail = _ref->next; > + } > continue; > + } > ALLOC_GROW(heads, nr_heads + 1, nr_alloc); > heads[nr_heads++] = rm; > } > @@ -1245,7 +1263,11 @@ int transport_fetch_refs(struct transport *transport, > struct ref *refs) > heads[nr_heads++] = rm; > } > > - rc = transport->vtable->fetch(transport, nr_heads, heads); > + rc = transport->vtable->fetch(transport, nr_heads, heads, fetched_refs); > + if (fetched_refs && nop_head) { > + *nop_tail = *fetched_refs; > + *fetched_refs = nop_head; > + } > > free(heads); > return rc; And sometimes, even if we are merely simulating the fetching of refs, we still need to report those refs in fetched_refs. This is correct. I also see that t5703 now passes. Besides enabling the writing of subsequent patches, I see that this also makes the API clearer in that the input refs to transport_fetch_refs() are not overloaded to output shallow information. Other than the " = 0" change above, this patch looks good to me.
Re: [PATCH v3 4/8] fetch: refactor the population of peer ref OIDs
> Populate peer ref OIDs in get_ref_map instead of do_fetch. Besides > tightening scopes of variables in the code, this also prepares for > get_ref_map being able to be called multiple times within do_fetch. get_ref_map() is only called in one place in builtin/fetch.c, and that place is in do_fetch(), so moving functionality from do_fetch() to get_ref_map() is perfectly fine, and also allows tightening of the scope of the existing_refs variable. Reviewed-by: Jonathan Tan
Re: [PATCH v3 2/8] upload-pack: implement ref-in-want
> +wanted-refs section > + * This section is only included if the client has requested a > + ref using a 'want-ref' line and if a packfile section is also > + included in the response. > + > + * Always begins with the section header "wanted-refs" Add a period at the end to be consistent with the others. > + * The server will send a ref listing (" ") for > + each reference requested using 'want-ref' lines. > + > + * The server MUST NOT send any refs which were not requested > + using 'want-ref' lines. We might want tag following refs to be included here in the future, but at that time, I think we can amend this to say that if include-tag-ref is sent by the user, the server may send additional refs, otherwise the server must not do so. So this is fine. > +test_expect_success 'mix want and want-ref' ' > + cat >expected_refs <<-EOF && > + $(git rev-parse f) refs/heads/master > + EOF > + git rev-parse e f | sort >expected_commits && > + > + test-pkt-line pack >in <<-EOF && > + command=fetch > + 0001 > + no-progress > + want-ref refs/heads/master > + want $(git rev-parse e) > + have $(git rev-parse a) > + done > + > + EOF > + > + git serve --stateless-rpc >out + check_output > +' Overall the tests look good, although I might be a bit biased since they are based on what I wrote a while ago [1]. I was wondering about the behavior when the client mixes "want" and "want-ref" (as will happen if they fetch both a ref by name and an exact SHA-1), and this test indeed shows the expected behavior. [1] https://public-inbox.org/git/d0d42b3bb4cf755f122591e191354c53848f197d.1485381677.git.jonathanta...@google.com/ > +test_expect_success 'want-ref with ref we already have commit for' ' > + cat >expected_refs <<-EOF && > + $(git rev-parse c) refs/heads/o/foo > + EOF > + >expected_commits && > + > + test-pkt-line pack >in <<-EOF && > + command=fetch > + 0001 > + no-progress > + want-ref refs/heads/o/foo > + have $(git rev-parse c) > + done > + > + EOF > + > + git serve --stateless-rpc >out + check_output > +' Likewise for this test - the ref is still reported, but the packfile does not contain the requested object. > struct upload_pack_data { > struct object_array wants; > + struct string_list wanted_refs; Document here that this is a map from ref names to owned struct object_id instances. > +static int parse_want_ref(const char *line, struct string_list *wanted_refs) > +{ > + const char *arg; > + if (skip_prefix(line, "want-ref ", )) { > + struct object_id oid; > + struct string_list_item *item; > + struct object *o; > + > + if (read_ref(arg, )) > + die("unknown ref %s", arg); > + > + item = string_list_append(wanted_refs, arg); > + item->util = oiddup(); > + > + o = parse_object_or_die(, arg); > + if (!(o->flags & WANTED)) { > + o->flags |= WANTED; > + add_object_array(o, NULL, _obj); > + } Makes sense - besides adding it to wanted_refs, this adds the object to want_obj, just like how the other code paths for "want" adds it. > +static void send_wanted_ref_info(struct upload_pack_data *data) > +{ > + const struct string_list_item *item; > + > + if (!data->wanted_refs.nr) > + return; > + > + packet_write_fmt(1, "wanted-refs\n"); > + > + for_each_string_list_item(item, >wanted_refs) { > + packet_write_fmt(1, "%s %s\n", > + oid_to_hex(item->util), > + item->string); > + } > + > + packet_delim(1); > +} The documentation states that the "wanted-refs" section is only sent if there is at least one "want-ref" from the client, and each "want-ref" causes one entry to be added to data->wanted_refs, so this is correct. Thanks - besides adding the period in the documentation, this patch looks good to me.
Re: Unexpected ignorecase=false behavior on Windows
On Mon, Jun 25, 2018 at 9:34 AM Junio C Hamano wrote: > > Bryan Turner writes: > > > Git on Windows is not designed to run with anything other than > > core.ignoreCase=true, and attempting to do so will cause unexpected > > behavior. > > Even though I fully agree with your conclusion that the document > must make it crystal clear that core.ignoreCase must be set to > reflect the reality, I found the above statement misleading and do > not want it to be used as the basis of a documentation update. But > it is possible that I am misunderstanding the current state of > affairs. > > Is the case insensitivity that deeply ingrained in the Git for > Windows code? > > IOW, even if the code used to build Git for Windows were executed on > a case sensitive filesystem, is there a case-smashing code on _our_ > side that kicks in to cause unexpected behaviour, _even_ when > core.ignorecase is set to false to match (hypothetical) reality? > > To put it yet another way, if a case sensitive filesystem were > available, wouldn't running "git init" from Git for Windows in a > directory on such a filesytem set core.ignoreCase to false in the > resulting repository and from then on wouldn't everything work fine? > > If my suspicion (i.e. the code for Git for Windows is perfectly > fine---it is just the users are not running with case sensitive > filesystems and flipping core.ignoreCase to true does not make case > incapable filesystems suddenly capable) is correct, then it is not > "Git on Windows is not designed to run" from two angles. (1) it is > not just Git for Windows---Git running on UNIX that mounts VFAT, or > Git running on macOS with default HFS+, would exhibit the same > symptom, and (2) it is not "Git is not designed to run"---it is > core.ignoreCase that is not designed to be a way to make case > incapable filesystems suddenly capable of distinguishing cases in > filesystems. Apologies for the unclear word choice. Given Git was designed first to work with case-sensitive filesystems, certainly the obvious (and correct) conclusion is that Git itself is fine in a case-sensitive environment. It wasn't my intention to suggest otherwise. Note that my word choice was not "Git _for_ Windows", however; it was "Git _on_ Windows". (This still doesn't change the correctness of your clarification, let me be quick to add. I'm only pointing it out because it's relevant to what I intended the comment to say.) On Windows, the default filesystem is NTFS, and NTFS is not case sensitive. Hence, Git on Windows (by which I'm implying Git on NTFS), is not designed to run with anything other than core.ignoreCase=true, because that setting aligns Git's expectations with how the underlying filesystem actually works. In other words, Git on a case-insensitive filesystem (APFS, HFS+, FAT32, exFAT, vFAT, NTFS, etc.) is not designed to be run with anything other than core.ignoreCase=true. Bryan > > Thanks.
Re: [RFC PATCH v5] Implement --first-parent for git rev-list --bisect
Tiago Botelho writes: > +test_expect_success "--bisect-all --first-parent" ' > +cat >expect1 < +$(git rev-parse CC) (dist=2) > +$(git rev-parse EX) (dist=1) > +$(git rev-parse D) (dist=1) > +$(git rev-parse FX) (dist=0) > +EOF > + > +cat >expect2 < +$(git rev-parse CC) (dist=2) > +$(git rev-parse D) (dist=1) > +$(git rev-parse EX) (dist=1) > +$(git rev-parse FX) (dist=0) > +EOF > + > +git rev-list --bisect-all --first-parent FX ^A >actual && > + ( test_cmp expect1 actual || test_cmp expect2 actual ) > +' I hate to say this, but the above looks like a typical unmaintainable mess. What happens when you or somebody else later needs to update the graph to be tested to add one more commit (or even more)? Would it be enough to add another "rev-parse" plus "dist=X" line in both expects? Or do we see a trap for combinatorial explosion that requires us to add new expect$N?
Re: [PATCH 02/15] apply.c: stop using index compat macros
Nguyễn Thái Ngọc Duy writes: > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > apply.c | 34 +++--- > 1 file changed, 19 insertions(+), 15 deletions(-) Until all the codepaths that reach these callsites to index_* functions from cmd_main() are converted to pass "struct index_state *" throughout, a step like this patch does fundamentally changes nothing. The only two issues worth addressing in this area still remain. All the lines we see here still depend on the existence of "the_index" instance, and they can only operate on that single "the_index" instance and nothing else. It is true that the dependency has been made more explicit, but it already is explicit (to see them you just say "no-the-index-macros" and see what fails to compile). I see others are enthused with this series, but seeing the changes like these I am not all that impressed. Would it become hard to review if we combine this step *and* the next logical step (i.e. pass the "struct index_state" throughout the callflow) into a single patch? The functions in apply.c are fairly well isolated and there won't be all that heavy interaction with the outside world if we convert this file (and it alone) without touchning the other files. If a division in that direction is possible, it may make a better orgainzation (i.e. instead of doing whole-tree superficial conversion that needs to be fixed up again later, do a deep full conversion on selected files before going on to next set of files).
[PATCH v6 4/6] stash: refactor `show_stash()` to use the diff API
Currently, `show_stash()` uses `cmd_diff()` to generate the output. After this commit, the output will be generated using the internal API. Before this commit, `git stash show --quiet` would act like `git diff` and error out if the stash is not empty. Now, the `--quiet` option does not error out given an empty stash. Signed-off-by: Paul-Sebastian Ungureanu --- builtin/stash--helper.c | 72 + 1 file changed, 45 insertions(+), 27 deletions(-) diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c index 4f49fd04b..4589e12d6 100644 --- a/builtin/stash--helper.c +++ b/builtin/stash--helper.c @@ -10,6 +10,8 @@ #include "run-command.h" #include "dir.h" #include "rerere.h" +#include "revision.h" +#include "log-tree.h" static const char * const git_stash_helper_usage[] = { N_("git stash--helper list []"), @@ -650,55 +652,71 @@ static int git_stash_config(const char *var, const char *value, void *cb) static int show_stash(int argc, const char **argv, const char *prefix) { - int i, ret = 0; - struct argv_array args = ARGV_ARRAY_INIT; - struct argv_array args_refs = ARGV_ARRAY_INIT; + int i; + int flags = 0; struct stash_info info; + struct rev_info rev; + struct argv_array stash_args = ARGV_ARRAY_INIT; struct option options[] = { OPT_END() }; - argc = parse_options(argc, argv, prefix, options, -git_stash_helper_show_usage, -PARSE_OPT_KEEP_UNKNOWN); + init_diff_ui_defaults(); + git_config(git_diff_ui_config, NULL); - argv_array_push(, "diff"); + init_revisions(, prefix); - /* Push args which are not options into args_refs. */ - for (i = 0; i < argc; ++i) { - if (argv[i][0] == '-') - argv_array_push(, argv[i]); + /* Push args which are not options into stash_args. */ + for (i = 1; i < argc; ++i) { + if (argv[i][0] != '-') + argv_array_push(_args, argv[i]); else - argv_array_push(_refs, argv[i]); - } - - if (get_stash_info(, args_refs.argc, args_refs.argv)) { - argv_array_clear(); - argv_array_clear(_refs); - return -1; + flags++; } /* * The config settings are applied only if there are not passed * any flags. */ - if (args.argc == 1) { + if (!flags) { git_config(git_stash_config, NULL); if (show_stat) - argv_array_push(, "--stat"); + rev.diffopt.output_format |= DIFF_FORMAT_DIFFSTAT; + if (show_patch) { + rev.diffopt.output_format = ~DIFF_FORMAT_NO_OUTPUT; + rev.diffopt.output_format |= DIFF_FORMAT_PATCH; + } + } - if (show_patch) - argv_array_push(, "-p"); + if (get_stash_info(, stash_args.argc, stash_args.argv)) { + argv_array_clear(_args); + return -1; } - argv_array_pushl(, oid_to_hex(_commit), -oid_to_hex(_commit), NULL); + argc = setup_revisions(argc, argv, , NULL); + if (!rev.diffopt.output_format) + rev.diffopt.output_format = DIFF_FORMAT_PATCH; + diff_setup_done(); + rev.diffopt.flags.recursive = 1; + setup_diff_pager(); - ret = cmd_diff(args.argc, args.argv, prefix); + /* +* We can return early if there was any option not recognised by +* `diff_opt_parse()`, besides the word `stash`. +*/ + if (argc > 1) { + free_stash_info(); + argv_array_clear(_args); + usage_with_options(git_stash_helper_show_usage, options); + } + + /* Do the diff thing. */ + diff_tree_oid(_commit, _commit, "", ); + log_tree_diff_flush(); free_stash_info(); - argv_array_clear(); - return ret; + argv_array_clear(_args); + return 0; } int cmd_stash__helper(int argc, const char **argv, const char *prefix) -- 2.18.0.rc2.13.g506fc12fb
[PATCH v6 3/6] stash: change `git stash show` usage text and documentation
It is already stated in documentation that it will accept any option known to `git diff`, but not in the usage text and some parts of the documentation. Signed-off-by: Paul-Sebastian Ungureanu --- Documentation/git-stash.txt | 4 ++-- builtin/stash--helper.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt index 7ef8c4791..e31ea7d30 100644 --- a/Documentation/git-stash.txt +++ b/Documentation/git-stash.txt @@ -9,7 +9,7 @@ SYNOPSIS [verse] 'git stash' list [] -'git stash' show [] +'git stash' show [] [] 'git stash' drop [-q|--quiet] [] 'git stash' ( pop | apply ) [--index] [-q|--quiet] [] 'git stash' branch [] @@ -106,7 +106,7 @@ stash@{1}: On master: 9cc0589... Add git-stash The command takes options applicable to the 'git log' command to control what is shown and how. See linkgit:git-log[1]. -show []:: +show [] []:: Show the changes recorded in the stash entry as a diff between the stashed contents and the commit back when the stash entry was first diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c index a7eba3fba..4f49fd04b 100644 --- a/builtin/stash--helper.c +++ b/builtin/stash--helper.c @@ -13,7 +13,7 @@ static const char * const git_stash_helper_usage[] = { N_("git stash--helper list []"), - N_("git stash--helper show []"), + N_("git stash--helper show [] []"), N_("git stash--helper drop [-q|--quiet] []"), N_("git stash--helper ( pop | apply ) [--index] [-q|--quiet] []"), N_("git stash--helper branch []"), @@ -27,7 +27,7 @@ static const char * const git_stash_helper_list_usage[] = { }; static const char * const git_stash_helper_show_usage[] = { - N_("git stash--helper show []"), + N_("git stash--helper show [] []"), NULL }; -- 2.18.0.rc2.13.g506fc12fb
[PATCH v6 6/6] stash: convert store to builtin
Add stash store to the helper and delete the store_stash function from the shell script. Add the usage string which was forgotten in the shell script. Signed-off-by: Paul-Sebastian Ungureanu --- builtin/stash--helper.c | 48 + git-stash.sh| 43 ++-- 2 files changed, 50 insertions(+), 41 deletions(-) diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c index 4589e12d6..556d91b20 100644 --- a/builtin/stash--helper.c +++ b/builtin/stash--helper.c @@ -20,6 +20,7 @@ static const char * const git_stash_helper_usage[] = { N_("git stash--helper ( pop | apply ) [--index] [-q|--quiet] []"), N_("git stash--helper branch []"), N_("git stash--helper clear"), + N_("git stash--helper store [-m|--message ] [-q|--quiet] "), NULL }; @@ -58,6 +59,11 @@ static const char * const git_stash_helper_clear_usage[] = { NULL }; +static const char * const git_stash_helper_store_usage[] = { + N_("git stash--helper store [-m|--message ] [-q|--quiet] "), + NULL +}; + static const char *ref_stash = "refs/stash"; static int quiet; static struct strbuf stash_index_path = STRBUF_INIT; @@ -719,6 +725,46 @@ static int show_stash(int argc, const char **argv, const char *prefix) return 0; } +static int store_stash(int argc, const char **argv, const char *prefix) +{ + struct object_id obj; + int ret = 0; + const char *stash_msg = NULL; + const char *w_commit = NULL; + struct option options[] = { + OPT__QUIET(, N_("be quiet, only report errors")), + OPT_STRING('m', "message", _msg, "message", N_("stash message")), + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, options, +git_stash_helper_store_usage, +PARSE_OPT_KEEP_UNKNOWN); + + if (argc != 1) { + fprintf(stderr, _("\"git stash--helper store\" requires one argument\n")); + return -1; + } + + w_commit = argv[0]; + + if (!stash_msg) + stash_msg = xstrdup("Created via \"git stash--helper store\"."); + + ret = get_oid(w_commit, ); + if (!ret) { + ret = update_ref(stash_msg, ref_stash, , NULL, +REF_FORCE_CREATE_REFLOG, +quiet ? UPDATE_REFS_QUIET_ON_ERR : +UPDATE_REFS_MSG_ON_ERR); + } + + if (ret && !quiet) + fprintf_ln(stderr, _("Cannot update %s with %s"), ref_stash, w_commit); + + return ret; +} + int cmd_stash__helper(int argc, const char **argv, const char *prefix) { pid_t pid = getpid(); @@ -753,6 +799,8 @@ int cmd_stash__helper(int argc, const char **argv, const char *prefix) return !!list_stash(argc, argv, prefix); else if (!strcmp(argv[0], "show")) return !!show_stash(argc, argv, prefix); + else if (!strcmp(argv[0], "store")) + return !!store_stash(argc, argv, prefix); usage_msg_opt(xstrfmt(_("unknown subcommand: %s"), argv[0]), git_stash_helper_usage, options); diff --git a/git-stash.sh b/git-stash.sh index 0d05cbc1e..5739c5152 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -191,45 +191,6 @@ create_stash () { die "$(gettext "Cannot record working tree state")" } -store_stash () { - while test $# != 0 - do - case "$1" in - -m|--message) - shift - stash_msg="$1" - ;; - -m*) - stash_msg=${1#-m} - ;; - --message=*) - stash_msg=${1#--message=} - ;; - -q|--quiet) - quiet=t - ;; - *) - break - ;; - esac - shift - done - test $# = 1 || - die "$(eval_gettext "\"$dashless store\" requires one argument")" - - w_commit="$1" - if test -z "$stash_msg" - then - stash_msg="Created via \"git stash store\"." - fi - - git update-ref --create-reflog -m "$stash_msg" $ref_stash $w_commit - ret=$? - test $ret != 0 && test -z "$quiet" && - die "$(eval_gettext "Cannot update \$ref_stash with \$w_commit")" - return $ret -} - push_stash () { keep_index= patch_mode= @@ -308,7 +269,7 @@ push_stash () { clear_stash || die "$(gettext "Cannot initialize stash")" create_stash -m "$stash_msg" -u "$untracked" -- "$@" - store_stash -m "$stash_msg" -q $w_commit || + git stash--helper store -m "$stash_msg" -q $w_commit || die "$(gettext "Cannot save the
[PATCH v6 1/6] stash: implement the "list" command in the builtin
Add stash list to the helper and delete the list_stash function from the shell script. Signed-off-by: Paul-Sebastian Ungureanu --- builtin/stash--helper.c | 33 + git-stash.sh| 7 +-- 2 files changed, 34 insertions(+), 6 deletions(-) diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c index a38d6ae8a..2ed21f5d1 100644 --- a/builtin/stash--helper.c +++ b/builtin/stash--helper.c @@ -12,6 +12,7 @@ #include "rerere.h" static const char * const git_stash_helper_usage[] = { + N_("git stash--helper list []"), N_("git stash--helper drop [-q|--quiet] []"), N_("git stash--helper ( pop | apply ) [--index] [-q|--quiet] []"), N_("git stash--helper branch []"), @@ -19,6 +20,11 @@ static const char * const git_stash_helper_usage[] = { NULL }; +static const char * const git_stash_helper_list_usage[] = { + N_("git stash--helper list []"), + NULL +}; + static const char * const git_stash_helper_drop_usage[] = { N_("git stash--helper drop [-q|--quiet] []"), NULL @@ -595,6 +601,31 @@ static int branch_stash(int argc, const char **argv, const char *prefix) return ret; } +static int list_stash(int argc, const char **argv, const char *prefix) +{ + int ret; + struct argv_array args = ARGV_ARRAY_INIT; + struct option options[] = { + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, options, +git_stash_helper_list_usage, +PARSE_OPT_KEEP_UNKNOWN); + + if (!ref_exists(ref_stash)) + return 0; + + argv_array_pushl(, "log", "--format=%gd: %gs", "-g", +"--first-parent", "-m", NULL); + argv_array_pushv(, argv); + argv_array_push(, ref_stash); + argv_array_push(, "--"); + ret = cmd_log(args.argc, args.argv, prefix); + argv_array_clear(); + return ret; +} + int cmd_stash__helper(int argc, const char **argv, const char *prefix) { pid_t pid = getpid(); @@ -625,6 +656,8 @@ int cmd_stash__helper(int argc, const char **argv, const char *prefix) return !!pop_stash(argc, argv, prefix); else if (!strcmp(argv[0], "branch")) return !!branch_stash(argc, argv, prefix); + else if (!strcmp(argv[0], "list")) + return !!list_stash(argc, argv, prefix); usage_msg_opt(xstrfmt(_("unknown subcommand: %s"), argv[0]), git_stash_helper_usage, options); diff --git a/git-stash.sh b/git-stash.sh index 8f2640fe9..6052441aa 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -382,11 +382,6 @@ have_stash () { git rev-parse --verify --quiet $ref_stash >/dev/null } -list_stash () { - have_stash || return 0 - git log --format="%gd: %gs" -g --first-parent -m "$@" $ref_stash -- -} - show_stash () { ALLOW_UNKNOWN_FLAGS=t assert_stash_like "$@" @@ -574,7 +569,7 @@ test -n "$seen_non_option" || set "push" "$@" case "$1" in list) shift - list_stash "$@" + git stash--helper list "$@" ;; show) shift -- 2.18.0.rc2.13.g506fc12fb
[PATCH v6 5/6] stash: update `git stash show` documentation
Add in documentation about the change of behavior regarding the `--quiet` option, which was introduced in the last commit. (the `--quiet` option does not exit anymore with erorr if it is given an empty stash as argument) Signed-off-by: Paul-Sebastian Ungureanu --- Documentation/git-stash.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt index e31ea7d30..d60ebdb96 100644 --- a/Documentation/git-stash.txt +++ b/Documentation/git-stash.txt @@ -117,6 +117,9 @@ show [] []:: You can use stash.showStat and/or stash.showPatch config variables to change the default behavior. + It accepts any option known to `git diff`, but acts different on + `--quiet` option and exit with zero regardless of differences. + pop [--index] [-q|--quiet] []:: Remove a single stashed state from the stash list and apply it -- 2.18.0.rc2.13.g506fc12fb
[PATCH v6 2/6] stash: convert show to builtin
Add stash show to the helper and delete the show_stash, have_stash, assert_stash_like, is_stash_like and parse_flags_and_rev functions from the shell script now that they are no longer needed. Before this commit, `git stash show` would ignore `--index` and `--quiet` options. Now, `git stash show` errors out on `--index` and does not display any message on `--quiet`, but errors out if the stash is not empty. Signed-off-by: Paul-Sebastian Ungureanu --- builtin/stash--helper.c | 77 +++ git-stash.sh| 132 +--- 2 files changed, 78 insertions(+), 131 deletions(-) diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c index 2ed21f5d1..a7eba3fba 100644 --- a/builtin/stash--helper.c +++ b/builtin/stash--helper.c @@ -13,6 +13,7 @@ static const char * const git_stash_helper_usage[] = { N_("git stash--helper list []"), + N_("git stash--helper show []"), N_("git stash--helper drop [-q|--quiet] []"), N_("git stash--helper ( pop | apply ) [--index] [-q|--quiet] []"), N_("git stash--helper branch []"), @@ -25,6 +26,11 @@ static const char * const git_stash_helper_list_usage[] = { NULL }; +static const char * const git_stash_helper_show_usage[] = { + N_("git stash--helper show []"), + NULL +}; + static const char * const git_stash_helper_drop_usage[] = { N_("git stash--helper drop [-q|--quiet] []"), NULL @@ -626,6 +632,75 @@ static int list_stash(int argc, const char **argv, const char *prefix) return ret; } +static int show_stat = 1; +static int show_patch; + +static int git_stash_config(const char *var, const char *value, void *cb) +{ + if (!strcmp(var, "stash.showStat")) { + show_stat = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "stash.showPatch")) { + show_patch = git_config_bool(var, value); + return 0; + } + return git_default_config(var, value, cb); +} + +static int show_stash(int argc, const char **argv, const char *prefix) +{ + int i, ret = 0; + struct argv_array args = ARGV_ARRAY_INIT; + struct argv_array args_refs = ARGV_ARRAY_INIT; + struct stash_info info; + struct option options[] = { + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, options, +git_stash_helper_show_usage, +PARSE_OPT_KEEP_UNKNOWN); + + argv_array_push(, "diff"); + + /* Push args which are not options into args_refs. */ + for (i = 0; i < argc; ++i) { + if (argv[i][0] == '-') + argv_array_push(, argv[i]); + else + argv_array_push(_refs, argv[i]); + } + + if (get_stash_info(, args_refs.argc, args_refs.argv)) { + argv_array_clear(); + argv_array_clear(_refs); + return -1; + } + + /* +* The config settings are applied only if there are not passed +* any flags. +*/ + if (args.argc == 1) { + git_config(git_stash_config, NULL); + if (show_stat) + argv_array_push(, "--stat"); + + if (show_patch) + argv_array_push(, "-p"); + } + + argv_array_pushl(, oid_to_hex(_commit), +oid_to_hex(_commit), NULL); + + ret = cmd_diff(args.argc, args.argv, prefix); + + free_stash_info(); + argv_array_clear(); + return ret; +} + int cmd_stash__helper(int argc, const char **argv, const char *prefix) { pid_t pid = getpid(); @@ -658,6 +733,8 @@ int cmd_stash__helper(int argc, const char **argv, const char *prefix) return !!branch_stash(argc, argv, prefix); else if (!strcmp(argv[0], "list")) return !!list_stash(argc, argv, prefix); + else if (!strcmp(argv[0], "show")) + return !!show_stash(argc, argv, prefix); usage_msg_opt(xstrfmt(_("unknown subcommand: %s"), argv[0]), git_stash_helper_usage, options); diff --git a/git-stash.sh b/git-stash.sh index 6052441aa..0d05cbc1e 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -378,35 +378,6 @@ save_stash () { fi } -have_stash () { - git rev-parse --verify --quiet $ref_stash >/dev/null -} - -show_stash () { - ALLOW_UNKNOWN_FLAGS=t - assert_stash_like "$@" - - if test -z "$FLAGS" - then - if test "$(git config --bool stash.showStat || echo true)" = "true" - then - FLAGS=--stat - fi - - if test "$(git config --bool stash.showPatch || echo false)" = "true" - then - FLAGS=${FLAGS}${FLAGS:+ }-p - fi - - if test -z "$FLAGS" -
[PATCH v6 3/4] stash: convert branch to builtin
From: Joel Teichroeb Add stash branch to the helper and delete the apply_to_branch function from the shell script. Checkout does not currently provide a function for checking out a branch as cmd_checkout does a large amount of sanity checks first that we require here. Signed-off-by: Joel Teichroeb Signed-off-by: Paul-Sebastian Ungureanu --- builtin/stash--helper.c | 43 + git-stash.sh| 17 ++-- 2 files changed, 45 insertions(+), 15 deletions(-) diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c index 84a537f39..fbf78249c 100644 --- a/builtin/stash--helper.c +++ b/builtin/stash--helper.c @@ -14,6 +14,7 @@ static const char * const git_stash_helper_usage[] = { N_("git stash--helper drop [-q|--quiet] []"), N_("git stash--helper apply [--index] [-q|--quiet] []"), + N_("git stash--helper branch []"), N_("git stash--helper clear"), NULL }; @@ -28,6 +29,11 @@ static const char * const git_stash_helper_apply_usage[] = { NULL }; +static const char * const git_stash_helper_branch_usage[] = { + N_("git stash--helper branch []"), + NULL +}; + static const char * const git_stash_helper_clear_usage[] = { N_("git stash--helper clear"), NULL @@ -522,6 +528,41 @@ static int drop_stash(int argc, const char **argv, const char *prefix) return ret; } +static int branch_stash(int argc, const char **argv, const char *prefix) +{ + const char *branch = NULL; + int ret; + struct argv_array args = ARGV_ARRAY_INIT; + struct stash_info info; + struct option options[] = { + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, options, +git_stash_helper_branch_usage, 0); + + if (argc == 0) + return error(_("No branch name specified")); + + branch = argv[0]; + + if (get_stash_info(, argc - 1, argv + 1)) + return -1; + + argv_array_pushl(, "checkout", "-b", NULL); + argv_array_push(, branch); + argv_array_push(, oid_to_hex(_commit)); + ret = cmd_checkout(args.argc, args.argv, prefix); + if (!ret) + ret = do_apply_stash(prefix, , 1); + if (!ret && info.is_stash_ref) + ret = do_drop_stash(prefix, ); + + free_stash_info(); + + return ret; +} + int cmd_stash__helper(int argc, const char **argv, const char *prefix) { pid_t pid = getpid(); @@ -548,6 +589,8 @@ int cmd_stash__helper(int argc, const char **argv, const char *prefix) return !!clear_stash(argc, argv, prefix); else if (!strcmp(argv[0], "drop")) return !!drop_stash(argc, argv, prefix); + else if (!strcmp(argv[0], "branch")) + return !!branch_stash(argc, argv, prefix); usage_msg_opt(xstrfmt(_("unknown subcommand: %s"), argv[0]), git_stash_helper_usage, options); diff --git a/git-stash.sh b/git-stash.sh index a99d5dc9e..29d9f4425 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -598,20 +598,6 @@ drop_stash () { clear_stash } -apply_to_branch () { - test -n "$1" || die "$(gettext "No branch name specified")" - branch=$1 - shift 1 - - set -- --index "$@" - assert_stash_like "$@" - - git checkout -b $branch $REV^ && - apply_stash "$@" && { - test -z "$IS_STASH_REF" || drop_stash "$@" - } -} - test "$1" = "-p" && set "push" "$@" PARSE_CACHE='--not-parsed' @@ -673,7 +659,8 @@ pop) ;; branch) shift - apply_to_branch "$@" + cd "$START_DIR" + git stash--helper branch "$@" ;; *) case $# in -- 2.18.0.rc2.13.g506fc12fb
[PATCH v6 2/4] stash: convert drop and clear to builtin
From: Joel Teichroeb Add the drop and clear commands to the builtin helper. These two are each simple, but are being added together as they are quite related. We have to unfortunately keep the drop and clear functions in the shell script as functions are called with parameters internally that are not valid when the commands are called externally. Once pop is converted they can both be removed. Signed-off-by: Joel Teichroeb Signed-off-by: Paul-Sebastian Ungureanu --- builtin/stash--helper.c | 112 git-stash.sh| 4 +- 2 files changed, 114 insertions(+), 2 deletions(-) diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c index 1c4387b10..84a537f39 100644 --- a/builtin/stash--helper.c +++ b/builtin/stash--helper.c @@ -12,7 +12,14 @@ #include "rerere.h" static const char * const git_stash_helper_usage[] = { + N_("git stash--helper drop [-q|--quiet] []"), N_("git stash--helper apply [--index] [-q|--quiet] []"), + N_("git stash--helper clear"), + NULL +}; + +static const char * const git_stash_helper_drop_usage[] = { + N_("git stash--helper drop [-q|--quiet] []"), NULL }; @@ -21,6 +28,11 @@ static const char * const git_stash_helper_apply_usage[] = { NULL }; +static const char * const git_stash_helper_clear_usage[] = { + N_("git stash--helper clear"), + NULL +}; + static const char *ref_stash = "refs/stash"; static int quiet; static struct strbuf stash_index_path = STRBUF_INIT; @@ -139,6 +151,31 @@ static int get_stash_info(struct stash_info *info, int argc, const char **argv) return 0; } +static int do_clear_stash(void) +{ + struct object_id obj; + if (get_oid(ref_stash, )) + return 0; + + return delete_ref(NULL, ref_stash, , 0); +} + +static int clear_stash(int argc, const char **argv, const char *prefix) +{ + struct option options[] = { + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, options, +git_stash_helper_clear_usage, +PARSE_OPT_STOP_AT_NON_OPTION); + + if (argc != 0) + return error(_("git stash--helper clear with parameters is unimplemented")); + + return do_clear_stash(); +} + static int reset_tree(struct object_id *i_tree, int update, int reset) { struct unpack_trees_options opts; @@ -414,6 +451,77 @@ static int apply_stash(int argc, const char **argv, const char *prefix) return ret; } +static int do_drop_stash(const char *prefix, struct stash_info *info) +{ + struct argv_array args = ARGV_ARRAY_INIT; + struct child_process cp = CHILD_PROCESS_INIT; + int ret; + + /* +* reflog does not provide a simple function for deleting refs. One will +* need to be added to avoid implementing too much reflog code here +*/ + argv_array_pushl(, "reflog", "delete", "--updateref", "--rewrite", +NULL); + argv_array_push(, info->revision.buf); + ret = cmd_reflog(args.argc, args.argv, prefix); + if (!ret) { + if (!quiet) + printf(_("Dropped %s (%s)\n"), info->revision.buf, + oid_to_hex(>w_commit)); + } else { + return error(_("%s: Could not drop stash entry"), info->revision.buf); + } + + /* +* This could easily be replaced by get_oid, but currently it will throw a +* fatal error when a reflog is empty, which we can not recover from +*/ + cp.git_cmd = 1; + /* Even though --quiet is specified, rev-parse still outputs the hash */ + cp.no_stdout = 1; + argv_array_pushl(, "rev-parse", "--verify", "--quiet", NULL); + argv_array_pushf(, "%s@{0}", ref_stash); + ret = run_command(); + + /* do_clear_stash if we just dropped the last stash entry */ + if (ret) + do_clear_stash(); + + return 0; +} + +static void assert_stash_ref(struct stash_info *info) +{ + if (!info->is_stash_ref) { + free_stash_info(info); + error(_("'%s' is not a stash reference"), info->revision.buf); + exit(128); + } +} + +static int drop_stash(int argc, const char **argv, const char *prefix) +{ + struct stash_info info; + int ret; + struct option options[] = { + OPT__QUIET(, N_("be quiet, only report errors")), + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, options, +git_stash_helper_drop_usage, 0); + + if (get_stash_info(, argc, argv)) + return -1; + + assert_stash_ref(); + + ret = do_drop_stash(prefix, ); + free_stash_info(); + return ret; +} + int cmd_stash__helper(int argc, const char **argv, const char *prefix) { pid_t pid = getpid();
[PATCH v6 4/4] stash: convert pop to builtin
From: Joel Teichroeb Add stash pop to the helper and delete the pop_stash, drop_stash, assert_stash_ref functions from the shell script now that they are no longer needed. Signed-off-by: Joel Teichroeb Signed-off-by: Paul-Sebastian Ungureanu --- builtin/stash--helper.c | 36 ++- git-stash.sh| 47 ++--- 2 files changed, 37 insertions(+), 46 deletions(-) diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c index fbf78249c..a38d6ae8a 100644 --- a/builtin/stash--helper.c +++ b/builtin/stash--helper.c @@ -13,7 +13,7 @@ static const char * const git_stash_helper_usage[] = { N_("git stash--helper drop [-q|--quiet] []"), - N_("git stash--helper apply [--index] [-q|--quiet] []"), + N_("git stash--helper ( pop | apply ) [--index] [-q|--quiet] []"), N_("git stash--helper branch []"), N_("git stash--helper clear"), NULL @@ -24,6 +24,11 @@ static const char * const git_stash_helper_drop_usage[] = { NULL }; +static const char * const git_stash_helper_pop_usage[] = { + N_("git stash--helper pop [--index] [-q|--quiet] []"), + NULL +}; + static const char * const git_stash_helper_apply_usage[] = { N_("git stash--helper apply [--index] [-q|--quiet] []"), NULL @@ -528,6 +533,33 @@ static int drop_stash(int argc, const char **argv, const char *prefix) return ret; } +static int pop_stash(int argc, const char **argv, const char *prefix) +{ + int index = 0, ret; + struct stash_info info; + struct option options[] = { + OPT__QUIET(, N_("be quiet, only report errors")), + OPT_BOOL(0, "index", , + N_("attempt to recreate the index")), + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, options, +git_stash_helper_pop_usage, 0); + + if (get_stash_info(, argc, argv)) + return -1; + + assert_stash_ref(); + if ((ret = do_apply_stash(prefix, , index))) + printf_ln(_("The stash entry is kept in case you need it again.")); + else + ret = do_drop_stash(prefix, ); + + free_stash_info(); + return ret; +} + static int branch_stash(int argc, const char **argv, const char *prefix) { const char *branch = NULL; @@ -589,6 +621,8 @@ int cmd_stash__helper(int argc, const char **argv, const char *prefix) return !!clear_stash(argc, argv, prefix); else if (!strcmp(argv[0], "drop")) return !!drop_stash(argc, argv, prefix); + else if (!strcmp(argv[0], "pop")) + return !!pop_stash(argc, argv, prefix); else if (!strcmp(argv[0], "branch")) return !!branch_stash(argc, argv, prefix); diff --git a/git-stash.sh b/git-stash.sh index 29d9f4425..8f2640fe9 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -554,50 +554,6 @@ assert_stash_like() { } } -is_stash_ref() { - is_stash_like "$@" && test -n "$IS_STASH_REF" -} - -assert_stash_ref() { - is_stash_ref "$@" || { - args="$*" - die "$(eval_gettext "'\$args' is not a stash reference")" - } -} - -apply_stash () { - cd "$START_DIR" - git stash--helper apply "$@" - res=$? - cd_to_toplevel - return $res -} - -pop_stash() { - assert_stash_ref "$@" - - if apply_stash "$@" - then - drop_stash "$@" - else - status=$? - say "$(gettext "The stash entry is kept in case you need it again.")" - exit $status - fi -} - -drop_stash () { - assert_stash_ref "$@" - - git reflog delete --updateref --rewrite "${REV}" && - say "$(eval_gettext "Dropped \${REV} (\$s)")" || - die "$(eval_gettext "\${REV}: Could not drop stash entry")" - - # clear_stash if we just dropped the last stash entry - git rev-parse --verify --quiet "$ref_stash@{0}" >/dev/null || - clear_stash -} - test "$1" = "-p" && set "push" "$@" PARSE_CACHE='--not-parsed' @@ -655,7 +611,8 @@ drop) ;; pop) shift - pop_stash "$@" + cd "$START_DIR" + git stash--helper pop "$@" ;; branch) shift -- 2.18.0.rc2.13.g506fc12fb
[PATCH v6 1/4] stash: convert apply to builtin
From: Joel Teichroeb Add a builtin helper for performing stash commands. Converting all at once proved hard to review, so starting with just apply lets conversion get started without the other commands being finished. The helper is being implemented as a drop in replacement for stash so that when it is complete it can simply be renamed and the shell script deleted. Delete the contents of the apply_stash shell function and replace it with a call to stash--helper apply until pop is also converted. Signed-off-by: Joel Teichroeb Signed-off-by: Paul-Sebastian Ungureanu --- .gitignore | 1 + Makefile| 1 + builtin.h | 1 + builtin/stash--helper.c | 442 git-stash.sh| 78 +-- git.c | 1 + 6 files changed, 453 insertions(+), 71 deletions(-) create mode 100644 builtin/stash--helper.c diff --git a/.gitignore b/.gitignore index 388cc4bee..56e63dd24 100644 --- a/.gitignore +++ b/.gitignore @@ -155,6 +155,7 @@ /git-show-ref /git-stage /git-stash +/git-stash--helper /git-status /git-stripspace /git-submodule diff --git a/Makefile b/Makefile index 1d27f3636..ca6072053 100644 --- a/Makefile +++ b/Makefile @@ -1078,6 +1078,7 @@ BUILTIN_OBJS += builtin/serve.o BUILTIN_OBJS += builtin/shortlog.o BUILTIN_OBJS += builtin/show-branch.o BUILTIN_OBJS += builtin/show-ref.o +BUILTIN_OBJS += builtin/stash--helper.o BUILTIN_OBJS += builtin/stripspace.o BUILTIN_OBJS += builtin/submodule--helper.o BUILTIN_OBJS += builtin/symbolic-ref.o diff --git a/builtin.h b/builtin.h index 4e0f64723..ba63ce0f6 100644 --- a/builtin.h +++ b/builtin.h @@ -221,6 +221,7 @@ extern int cmd_shortlog(int argc, const char **argv, const char *prefix); extern int cmd_show(int argc, const char **argv, const char *prefix); extern int cmd_show_branch(int argc, const char **argv, const char *prefix); extern int cmd_status(int argc, const char **argv, const char *prefix); +extern int cmd_stash__helper(int argc, const char **argv, const char *prefix); extern int cmd_stripspace(int argc, const char **argv, const char *prefix); extern int cmd_submodule__helper(int argc, const char **argv, const char *prefix); extern int cmd_symbolic_ref(int argc, const char **argv, const char *prefix); diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c new file mode 100644 index 0..1c4387b10 --- /dev/null +++ b/builtin/stash--helper.c @@ -0,0 +1,442 @@ +#include "builtin.h" +#include "config.h" +#include "parse-options.h" +#include "refs.h" +#include "lockfile.h" +#include "cache-tree.h" +#include "unpack-trees.h" +#include "merge-recursive.h" +#include "argv-array.h" +#include "run-command.h" +#include "dir.h" +#include "rerere.h" + +static const char * const git_stash_helper_usage[] = { + N_("git stash--helper apply [--index] [-q|--quiet] []"), + NULL +}; + +static const char * const git_stash_helper_apply_usage[] = { + N_("git stash--helper apply [--index] [-q|--quiet] []"), + NULL +}; + +static const char *ref_stash = "refs/stash"; +static int quiet; +static struct strbuf stash_index_path = STRBUF_INIT; + +/* + * w_commit is set to the commit containing the working tree + * b_commit is set to the base commit + * i_commit is set to the commit containing the index tree + * u_commit is set to the commit containing the untracked files tree + * w_tree is set to the working tree + * b_tree is set to the base tree + * i_tree is set to the index tree + * u_tree is set to the untracked files tree + */ + +struct stash_info { + struct object_id w_commit; + struct object_id b_commit; + struct object_id i_commit; + struct object_id u_commit; + struct object_id w_tree; + struct object_id b_tree; + struct object_id i_tree; + struct object_id u_tree; + struct strbuf revision; + int is_stash_ref; + int has_u; +}; + +static void free_stash_info(struct stash_info *info) +{ + strbuf_release(>revision); +} + +static void assert_stash_like(struct stash_info *info, const char * revision) +{ + if (get_oidf(>b_commit, "%s^1", revision) || + get_oidf(>w_tree, "%s:", revision) || + get_oidf(>b_tree, "%s^1:", revision) || + get_oidf(>i_tree, "%s^2:", revision)) { + free_stash_info(info); + error(_("'%s' is not a stash-like commit"), revision); + exit(128); + } +} + +static int get_stash_info(struct stash_info *info, int argc, const char **argv) +{ + struct strbuf symbolic = STRBUF_INIT; + int ret; + const char *revision; + const char *commit = NULL; + char *end_of_rev; + char *expanded_ref; + struct object_id dummy; + + if (argc > 1) { + int i; + struct strbuf refs_msg = STRBUF_INIT; + for (i = 0; i < argc; ++i) + strbuf_addf(_msg, " '%s'",
[PATCH v6 1/4] sha1-name.c: added 'get_oidf', which acts like 'get_oid'
Compared to 'get_oid', 'get_oidf' has as parameters a printf format string and the additional arguments. This will help simplify the code in subsequent commits. Original-idea-by: Johannes Schindelin Signed-off-by: Paul-Sebastian Ungureanu --- cache.h | 1 + sha1-name.c | 19 +++ 2 files changed, 20 insertions(+) diff --git a/cache.h b/cache.h index 89a107a7f..9252a4546 100644 --- a/cache.h +++ b/cache.h @@ -1321,6 +1321,7 @@ struct object_context { GET_OID_BLOB) extern int get_oid(const char *str, struct object_id *oid); +extern int get_oidf(struct object_id *oid, const char *fmt, ...); extern int get_oid_commit(const char *str, struct object_id *oid); extern int get_oid_committish(const char *str, struct object_id *oid); extern int get_oid_tree(const char *str, struct object_id *oid); diff --git a/sha1-name.c b/sha1-name.c index 60d9ef3c7..80ee8f742 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -1466,6 +1466,25 @@ int get_oid(const char *name, struct object_id *oid) return get_oid_with_context(name, 0, oid, ); } +/* + * This returns a non-zero value if the string (built using printf + * format and the given arguments) is not a valid object. + */ +int get_oidf(struct object_id *oid, const char *fmt, ...) +{ + va_list ap; + int ret; + struct strbuf sb = STRBUF_INIT; + + va_start(ap, fmt); + strbuf_vaddf(, fmt, ap); + va_end(ap); + + ret = get_oid(sb.buf, oid); + strbuf_release(); + + return ret; +} /* * Many callers know that the user meant to name a commit-ish by -- 2.18.0.rc2.13.g506fc12fb
[PATCH v6 0/4] stash: add new tests and introduce a new helper function
Hello, This first series of patches does bring some changes and improvements to the test suite. One of the patches also introduces a new function `get_oidf()` which will be hepful for the incoming patches related to `git stash`. Thanks, Paul Joel Teichroeb (1): stash: improve option parsing test coverage Paul-Sebastian Ungureanu (3): sha1-name.c: added 'get_oidf', which acts like 'get_oid' stash: update test cases conform to coding guidelines stash: renamed test cases to be more descriptive cache.h | 1 + sha1-name.c | 19 ++ t/t3903-stash.sh | 169 +-- 3 files changed, 123 insertions(+), 66 deletions(-) -- 2.18.0.rc2.13.g506fc12fb
[PATCH v6 2/4] stash: improve option parsing test coverage
From: Joel Teichroeb In preparation for converting the stash command incrementally to a builtin command, this patch improves test coverage of the option parsing. Both for having too many parameters, or too few. Signed-off-by: Joel Teichroeb Signed-off-by: Paul-Sebastian Ungureanu --- t/t3903-stash.sh | 35 +++ 1 file changed, 35 insertions(+) diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index 1f871d3cc..af7586d43 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -444,6 +444,36 @@ test_expect_failure 'stash file to directory' ' test foo = "$(cat file/file)" ' +test_expect_success 'giving too many ref arguments does not modify files' ' + git stash clear && + test_when_finished "git reset --hard HEAD" && + echo foo >file2 && + git stash && + echo bar >file2 && + git stash && + test-tool chmtime =123456789 file2 && + for type in apply pop "branch stash-branch" + do + test_must_fail git stash $type stash@{0} stash@{1} 2>err && + test_i18ngrep "Too many revisions" err && + test 123456789 = $(test-tool chmtime -g file2) || return 1 + done +' + +test_expect_success 'drop: too many arguments errors out (does nothing)' ' + git stash list >expect && + test_must_fail git stash drop stash@{0} stash@{1} 2>err && + test_i18ngrep "Too many revisions" err && + git stash list >actual && + test_cmp expect actual +' + +test_expect_success 'show: too many arguments errors out (does nothing)' ' + test_must_fail git stash show stash@{0} stash@{1} 2>err 1>out && + test_i18ngrep "Too many revisions" err && + test_must_be_empty out +' + test_expect_success 'stash create - no changes' ' git stash clear && test_when_finished "git reset --hard HEAD" && @@ -479,6 +509,11 @@ test_expect_success 'stash branch - stashes on stack, stash-like argument' ' test $(git ls-files --modified | wc -l) -eq 1 ' +test_expect_success 'stash branch complains with no arguments' ' + test_must_fail git stash branch 2>err && + test_i18ngrep "No branch name specified" err +' + test_expect_success 'stash show format defaults to --stat' ' git stash clear && test_when_finished "git reset --hard HEAD" && -- 2.18.0.rc2.13.g506fc12fb
[PATCH v6 3/4] stash: update test cases conform to coding guidelines
Removed whitespaces after redirection operators. Signed-off-by: Paul-Sebastian Ungureanu --- t/t3903-stash.sh | 120 --- 1 file changed, 61 insertions(+), 59 deletions(-) diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index af7586d43..de6cab1fe 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -8,22 +8,22 @@ test_description='Test git stash' . ./test-lib.sh test_expect_success 'stash some dirty working directory' ' - echo 1 > file && + echo 1 >file && git add file && echo unrelated >other-file && git add other-file && test_tick && git commit -m initial && - echo 2 > file && + echo 2 >file && git add file && - echo 3 > file && + echo 3 >file && test_tick && git stash && git diff-files --quiet && git diff-index --cached --quiet HEAD ' -cat > expect << EOF +cat >expect < output && + git diff stash^2..stash >output && test_cmp output expect ' @@ -74,7 +74,7 @@ test_expect_success 'apply stashed changes' ' test_expect_success 'apply stashed changes (including index)' ' git reset --hard HEAD^ && - echo 6 > other-file && + echo 6 >other-file && git add other-file && test_tick && git commit -m other-file && @@ -99,12 +99,12 @@ test_expect_success 'stash drop complains of extra options' ' test_expect_success 'drop top stash' ' git reset --hard && - git stash list > stashlist1 && - echo 7 > file && + git stash list >expected && + echo 7 >file && git stash && git stash drop && - git stash list > stashlist2 && - test_cmp stashlist1 stashlist2 && + git stash list >actual && + test_cmp expected actual && git stash apply && test 3 = $(cat file) && test 1 = $(git show :file) && @@ -113,9 +113,9 @@ test_expect_success 'drop top stash' ' test_expect_success 'drop middle stash' ' git reset --hard && - echo 8 > file && + echo 8 >file && git stash && - echo 9 > file && + echo 9 >file && git stash && git stash drop stash@{1} && test 2 = $(git stash list | wc -l) && @@ -160,7 +160,7 @@ test_expect_success 'stash pop' ' test 0 = $(git stash list | wc -l) ' -cat > expect << EOF +cat >expect < expect1 << EOF +cat >expect1 < expect2 << EOF +cat >expect2 < file && + echo foo >file && git commit file -m first && - echo bar > file && - echo bar2 > file2 && + echo bar >file && + echo bar2 >file2 && git add file2 && git stash && - echo baz > file && + echo baz >file && git commit file -m second && git stash branch stashbranch && test refs/heads/stashbranch = $(git symbolic-ref HEAD) && test $(git rev-parse HEAD) = $(git rev-parse master^) && - git diff --cached > output && + git diff --cached >output && test_cmp output expect && - git diff > output && + git diff >output && test_cmp output expect1 && git add file && git commit -m alternate\ second && - git diff master..stashbranch > output && + git diff master..stashbranch >output && test_cmp output expect2 && test 0 = $(git stash list | wc -l) ' test_expect_success 'apply -q is quiet' ' - echo foo > file && + echo foo >file && git stash && - git stash apply -q > output.out 2>&1 && + git stash apply -q >output.out 2>&1 && test_must_be_empty output.out ' test_expect_success 'save -q is quiet' ' - git stash save --quiet > output.out 2>&1 && + git stash save --quiet >output.out 2>&1 && test_must_be_empty output.out ' test_expect_success 'pop -q is quiet' ' - git stash pop -q > output.out 2>&1 && + git stash pop -q >output.out 2>&1 && test_must_be_empty output.out ' test_expect_success 'pop -q --index works and is quiet' ' - echo foo > file && + echo foo >file && git add file && git stash save --quiet && - git stash pop -q --index > output.out 2>&1 && + git stash pop -q --index >output.out 2>&1 && test foo = "$(git show :file)" && test_must_be_empty output.out ' test_expect_success 'drop -q is quiet' ' git stash && - git stash drop -q > output.out 2>&1 && + git stash drop -q >output.out 2>&1 && test_must_be_empty output.out ' test_expect_success 'stash -k' ' - echo bar3 > file && - echo bar4 > file2 && + echo bar3 >file && + echo bar4 >file2 && git add file2 && git stash -k && test bar,bar4 = $(cat file),$(cat file2) ' test_expect_success 'stash --no-keep-index' ' - echo bar33 > file && - echo bar44 > file2 && + echo bar33 >file && + echo
[PATCH v6 4/4] stash: renamed test cases to be more descriptive
Renamed some test cases' labels to be more descriptive and under 80 characters per line. Signed-off-by: Paul-Sebastian Ungureanu --- t/t3903-stash.sh | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index de6cab1fe..8d002a7f2 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -604,7 +604,7 @@ test_expect_success 'stash show -p - no stashes on stack, stash-like argument' ' test_cmp expected actual ' -test_expect_success 'stash drop - fail early if specified stash is not a stash reference' ' +test_expect_success 'drop: fail early if specified stash is not a stash ref' ' git stash clear && test_when_finished "git reset --hard HEAD && git stash clear" && git reset --hard && @@ -618,7 +618,7 @@ test_expect_success 'stash drop - fail early if specified stash is not a stash r git reset --hard HEAD ' -test_expect_success 'stash pop - fail early if specified stash is not a stash reference' ' +test_expect_success 'pop: fail early if specified stash is not a stash ref' ' git stash clear && test_when_finished "git reset --hard HEAD && git stash clear" && git reset --hard && @@ -682,7 +682,7 @@ test_expect_success 'invalid ref of the form "n", n >= N' ' git stash drop ' -test_expect_success 'stash branch should not drop the stash if the branch exists' ' +test_expect_success 'branch: should not drop the stash if the branch exists' ' git stash clear && echo foo >file && git add file && @@ -693,7 +693,7 @@ test_expect_success 'stash branch should not drop the stash if the branch exists git rev-parse stash@{0} -- ' -test_expect_success 'stash branch should not drop the stash if the apply fails' ' +test_expect_success 'branch: should not drop the stash if the apply fails' ' git stash clear && git reset HEAD~1 --hard && echo foo >file && @@ -707,7 +707,7 @@ test_expect_success 'stash branch should not drop the stash if the apply fails' git rev-parse stash@{0} -- ' -test_expect_success 'stash apply shows status same as git status (relative to current directory)' ' +test_expect_success 'apply: shows same status as git status (relative to ./)' ' git stash clear && echo 1 >subdir/subfile1 && echo 2 >subdir/subfile2 && @@ -1048,7 +1048,7 @@ test_expect_success 'stash push -p with pathspec shows no changes only once' ' test_i18ncmp expect actual ' -test_expect_success 'stash push with pathspec shows no changes when there are none' ' +test_expect_success 'push: shows no changes when there are none' ' >foo && git add foo && git commit -m "tmp" && @@ -1058,7 +1058,7 @@ test_expect_success 'stash push with pathspec shows no changes when there are no test_i18ncmp expect actual ' -test_expect_success 'stash push with pathspec not in the repository errors out' ' +test_expect_success 'push: not in the repository errors out' ' >untracked && test_must_fail git stash push untracked && test_path_is_file untracked -- 2.18.0.rc2.13.g506fc12fb
Re: Unexpected ignorecase=false behavior on Windows
Bryan Turner writes: > Git on Windows is not designed to run with anything other than > core.ignoreCase=true, and attempting to do so will cause unexpected > behavior. Even though I fully agree with your conclusion that the document must make it crystal clear that core.ignoreCase must be set to reflect the reality, I found the above statement misleading and do not want it to be used as the basis of a documentation update. But it is possible that I am misunderstanding the current state of affairs. Is the case insensitivity that deeply ingrained in the Git for Windows code? IOW, even if the code used to build Git for Windows were executed on a case sensitive filesystem, is there a case-smashing code on _our_ side that kicks in to cause unexpected behaviour, _even_ when core.ignorecase is set to false to match (hypothetical) reality? To put it yet another way, if a case sensitive filesystem were available, wouldn't running "git init" from Git for Windows in a directory on such a filesytem set core.ignoreCase to false in the resulting repository and from then on wouldn't everything work fine? If my suspicion (i.e. the code for Git for Windows is perfectly fine---it is just the users are not running with case sensitive filesystems and flipping core.ignoreCase to true does not make case incapable filesystems suddenly capable) is correct, then it is not "Git on Windows is not designed to run" from two angles. (1) it is not just Git for Windows---Git running on UNIX that mounts VFAT, or Git running on macOS with default HFS+, would exhibit the same symptom, and (2) it is not "Git is not designed to run"---it is core.ignoreCase that is not designed to be a way to make case incapable filesystems suddenly capable of distinguishing cases in filesystems. Thanks.
Re: [BUG] A part of an edge from an octopus merge gets colored, even with --color=never
On Sat, Jun 23, 2018 at 05:45:19PM -0400, Noam Postavsky wrote: > On 20 May 2016 at 18:12, Noam Postavsky > wrote: My, this is a blast from the past. :) > Subject: [PATCH v1] log: Fix coloring of certain octupus merge shapes > > For octopus merges where the first parent edge immediately merges into > the next column to the left: > > | | *-. > | | |\ \ > | |/ / / > > then the number of columns should be one less than the usual case: > > | *-. > | |\ \ > | | | * These diagrams confused me for a minute, because I see two differences: 1. The first one has an extra apparently unrelated parallel branch on the far left. 2. The first has the first-parent of the "*" merge commit immediately join the branch. But if I understand correctly, we only care about the second property. So would it be accurate to show them as: | *-. | |\ \ |/ / / | *-. | |\ \ | | | * ? I think that makes it easier to compare them. I don't remember much about our prior discussion, so let me try to talk myself through the patch itself: > diff --git a/graph.c b/graph.c > index e1f6d3bdd..c919c86e8 100644 > --- a/graph.c > +++ b/graph.c > @@ -856,12 +856,16 @@ static int graph_draw_octopus_merge(struct git_graph > *graph, > int col_num, i; > int num_dashes = > ((graph->num_parents - dashless_commits) * 2) - 1; > - for (i = 0; i < num_dashes; i++) { > - col_num = (i / 2) + dashless_commits + graph->commit_index; OK, so the old code emitted num_dashes, and every pair was done with the same column. Our highest iteration of this loop would use the column at (num_dashes-1) / 2. We know that num_dashes is always odd, so: num_dashes = 1 puts our last column at 0 num_dashes = 3 puts our last column at 1 And so on. So far so good. > + int first_col = dashless_commits + graph->commit_index; This corresponds to the i=0 case, makes sense. > + int last_col = first_col + (num_dashes / 2); But here our last_col misses the "-1". I don't think it matters because we know num_dashes is always odd, and therefore due to integer truncation (num_dashes-1)/2 == (num_dashes/2). > + if (last_col >= graph->num_new_columns) { > + first_col--; > + last_col--; > + } The shifting of last_col I expect as part of the fix. I was surprised by shifting first_col, though. Wouldn't it always start at 0 (offset by the previous commits)? It definitely seems to be necessary, but I'm not sure I understand why. > + for (i = 0, col_num = first_col; i < num_dashes; i++, col_num++) { > strbuf_write_column(sb, >new_columns[col_num], '-'); > } > - col_num = (i / 2) + dashless_commits + graph->commit_index; > - strbuf_write_column(sb, >new_columns[col_num], '.'); > + strbuf_write_column(sb, >new_columns[last_col], '.'); In this new loop we count up our dashes and our columns. But now we have 1-to-1 correspondence as we increment! I don't think that can be right. And indeed, if I take your original problem report and add an extra "d" branch and make the octopus "a b d", then the problem comes back. You don't notice with a 3-parent merge because We need to increment col_num only half as much as num_dashes. Should we be doing: for (col_num = first_col; col_num < last_col; col_num++) { strbuf_write_column(sb, >new_columns[col_num], '-'); strbuf_write_column(sb, >new_columns[col_num], '-'); } strbuf_write_column(sb, >new_columns[last_col], '-'); strbuf_write_column(sb, >new_columns[last_col], '.'); I.e., write "--" for each interior column, and then "-." for the last one? -Peff
Re: [PATCH] t3401: add directory rename testcases for rebase and am
On Wed, Jun 6, 2018 at 10:04 PM, Elijah Newren wrote: > Add a simple directory rename testcase, in conjunction with each of the > types of rebases: > git-rebase--interactive > git-rebase--am > git-rebase--merge > and also use the same testcase for > git am --3way > > This demonstrates an inconsistency between the different rebase backends > and which can detect the directory rename. Due to changes in the en/rebase-consistency series, I'm no longer keeping this patch as a separate submission but have pulled it into v4 of that series[1]. [1] https://public-inbox.org/git/20180625161300.26060-9-new...@gmail.com/
[PATCH v4 6/9] directory-rename-detection.txt: technical docs on abilities and limitations
Signed-off-by: Elijah Newren --- .../technical/directory-rename-detection.txt | 92 +++ 1 file changed, 92 insertions(+) create mode 100644 Documentation/technical/directory-rename-detection.txt diff --git a/Documentation/technical/directory-rename-detection.txt b/Documentation/technical/directory-rename-detection.txt new file mode 100644 index 00..6e22920a39 --- /dev/null +++ b/Documentation/technical/directory-rename-detection.txt @@ -0,0 +1,92 @@ +Directory rename detection +== + +Rename detection logic in diffcore-rename that checks for renames of +individual files is aggregated and analyzed in merge-recursive for cases +where combinations of renames indicate that a full directory has been +renamed. + +Scope of abilities +-- + +It is perhaps easiest to start with an example: + + * When all of x/a, x/b and x/c have moved to z/a, z/b and z/c, it is +likely that x/d added in the meantime would also want to move to z/d by +taking the hint that the entire directory 'x' moved to 'z'. + +More interesting possibilities exist, though, such as: + + * one side of history renames x -> z, and the other renames some file to +x/e, causing the need for the merge to do a transitive rename. + + * one side of history renames x -> z, but also renames all files within +x. For example, x/a -> z/alpha, x/b -> z/bravo, etc. + + * both 'x' and 'y' being merged into a single directory 'z', with a +directory rename being detected for both x->z and y->z. + + * not all files in a directory being renamed to the same location; +i.e. perhaps most the files in 'x' are now found under 'z', but a few +are found under 'w'. + + * a directory being renamed, which also contained a subdirectory that was +renamed to some entirely different location. (And perhaps the inner +directory itself contained inner directories that were renamed to yet +other locations). + + * combinations of the above; see t/t6043-merge-rename-directories.sh for +various interesting cases. + +Limitations -- applicability of directory renames +- + +In order to prevent edge and corner cases resulting in either conflicts +that cannot be represented in the index or which might be too complex for +users to try to understand and resolve, a couple basic rules limit when +directory rename detection applies: + + 1) If a given directory still exists on both sides of a merge, we do + not consider it to have been renamed. + + 2) If a subset of to-be-renamed files have a file or directory in the + way (or would be in the way of each other), "turn off" the directory + rename for those specific sub-paths and report the conflict to the + user. + + 3) If the other side of history did a directory rename to a path that + your side of history renamed away, then ignore that particular + rename from the other side of history for any implicit directory + renames (but warn the user). + +Limitations -- detailed rules and testcases +--- + +t/t6043-merge-rename-directories.sh contains extensive tests and commentary +which generate and explore the rules listed above. It also lists a few +additional rules: + + a) If renames split a directory into two or more others, the directory + with the most renames, "wins". + + b) Avoid directory-rename-detection for a path, if that path is the + source of a rename on either side of a merge. + + c) Only apply implicit directory renames to directories if the other side + of history is the one doing the renaming. + +Limitations -- support in different commands + + +Directory rename detection is supported by 'merge' and 'cherry-pick'. +Other git commands which users might be surprised to see limited or no +directory rename detection support in: + + * diff + +Folks have requested in the past that `git diff` detect directory +renames and somehow simplify its output. It is not clear whether this +would be desirable or how the output should be simplified, so this was +simply not implemented. Further, to implement this, directory rename +detection logic would need to move from merge-recursive to +diffcore-rename. -- 2.18.0.9.g678597d97e
[PATCH v4 4/9] git-rebase: error out when incompatible options passed
git rebase has three different types: am, merge, and interactive, all of which are implemented in terms of separate scripts. am builds on git-am, merge builds on git-merge-recursive, and interactive builds on git-cherry-pick. We make use of features in those lower-level commands in the different rebase types, but those features don't exist in all of the lower level commands so we have a range of incompatibilities. Previously, we just accepted nearly any argument and silently ignored whichever ones weren't implemented for the type of rebase specified. Change this so the incompatibilities are documented, included in the testsuite, and tested for at runtime with an appropriate error message shown. Some exceptions I left out: * --merge and --interactive are technically incompatible since they are supposed to run different underlying scripts, but with a few small changes, --interactive can do everything that --merge can. In fact, I'll shortly be sending another patch to remove git-rebase--merge and reimplement it on top of git-rebase--interactive. * One could argue that --interactive and --quiet are incompatible since --interactive doesn't implement a --quiet mode (perhaps since cherry-pick itself does not implement one). However, the interactive mode is more quiet than the other modes in general with progress messages, so one could argue that it's already quiet. Signed-off-by: Elijah Newren --- git-rebase.sh | 35 ++ t/t3422-rebase-incompatible-options.sh | 16 ++-- 2 files changed, 43 insertions(+), 8 deletions(-) diff --git a/git-rebase.sh b/git-rebase.sh index bf71b7fa20..18ac8226c4 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -503,6 +503,24 @@ then git_format_patch_opt="$git_format_patch_opt --progress" fi +if test -n "$git_am_opt"; then + incompatible_opts=$(echo " $git_am_opt " | \ + sed -e 's/ -q / /g' -e 's/^ \(.*\) $/\1/') + if test -n "$interactive_rebase" + then + if test -n "$incompatible_opts" + then + die "$(gettext "error: cannot combine interactive options (--interactive, --exec, --rebase-merges, --preserve-merges, --keep-empty, --root + --onto) with am options ($incompatible_opts)")" + fi + fi + if test -n "$do_merge"; then + if test -n "$incompatible_opts" + then + die "$(gettext "error: cannot combine merge options (--merge, --strategy, --strategy-option) with am options ($incompatible_opts)")" + fi + fi +fi + if test -n "$signoff" then test -n "$preserve_merges" && @@ -511,6 +529,23 @@ then force_rebase=t fi +if test -n "$preserve_merges" +then + # Note: incompatibility with --signoff handled in signoff block above + # Note: incompatibility with --interactive is just a strong warning; + # git-rebase.txt caveats with "unless you know what you are doing" + test -n "$rebase_merges" && + die "$(gettext "error: cannot combine '--preserve_merges' with '--rebase-merges'")" +fi + +if test -n "$rebase_merges" +then + test -n "$strategy_opts" && + die "$(gettext "error: cannot combine '--rebase_merges' with '--strategy-option'")" + test -n "$strategy" && + die "$(gettext "error: cannot combine '--rebase_merges' with '--strategy'")" +fi + if test -z "$rebase_root" then case "$#" in diff --git a/t/t3422-rebase-incompatible-options.sh b/t/t3422-rebase-incompatible-options.sh index 2687a85254..fc0ce150d1 100755 --- a/t/t3422-rebase-incompatible-options.sh +++ b/t/t3422-rebase-incompatible-options.sh @@ -33,27 +33,27 @@ test_expect_success 'setup' ' test_rebase_am_only () { opt=$1 shift - test_expect_failure "$opt incompatible with --merge" " + test_expect_success "$opt incompatible with --merge" " git checkout B^0 && test_must_fail git rebase $opt --merge A " - test_expect_failure "$opt incompatible with --strategy=ours" " + test_expect_success "$opt incompatible with --strategy=ours" " git checkout B^0 && test_must_fail git rebase $opt --strategy=ours A " - test_expect_failure "$opt incompatible with --strategy-option=ours" " + test_expect_success "$opt incompatible with --strategy-option=ours" " git checkout B^0 && test_must_fail git rebase $opt --strategy=ours A " - test_expect_failure "$opt incompatible with --interactive" " + test_expect_success "$opt incompatible with --interactive" " git checkout B^0 && test_must_fail git rebase $opt --interactive A " - test_expect_failure "$opt incompatible with --exec" " + test_expect_success
[PATCH v4 7/9] git-rebase.txt: document behavioral differences between modes
There are a variety of aspects that are common to all rebases regardless of which backend is in use; however, the behavior for these different aspects varies in ways that could surprise users. (In fact, it's not clear -- to me at least -- that these differences were even desirable or intentional.) Document these differences. Signed-off-by: Elijah Newren --- Documentation/git-rebase.txt | 32 +++ .../technical/directory-rename-detection.txt | 23 + 2 files changed, 55 insertions(+) diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index 2f47495a4d..a67df4caba 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -546,6 +546,38 @@ Other incompatible flag pairs: * --rebase-merges and --strategy * --rebase-merges and --strategy-option +BEHAVIORAL DIFFERENCES +--- + + * empty commits: + +am-based rebase will drop any "empty" commits, whether the +commit started empty (had no changes relative to its parent to +start with) or ended empty (all changes were already applied +upstream in other commits). + +merge-based rebase does the same. + +interactive-based rebase will by default drop commits that +started empty and halt if it hits a commit that ended up empty. +The `--keep-empty` option exists for interactive rebases to allow +it to keep commits that started empty. + + * empty commit messages: + +am-based rebase will silently apply commits with empty commit +messages. + +merge-based and interactive-based rebases will by default halt +on any such commits. The `--allow-empty-message` option exists to +allow interactive-based rebases to apply such commits without +halting. + + * directory rename detection: + +merge-based and interactive-based rebases work fine with +directory rename detection. am-based rebases sometimes do not. + include::merge-strategies.txt[] NOTES diff --git a/Documentation/technical/directory-rename-detection.txt b/Documentation/technical/directory-rename-detection.txt index 6e22920a39..1c0086e287 100644 --- a/Documentation/technical/directory-rename-detection.txt +++ b/Documentation/technical/directory-rename-detection.txt @@ -90,3 +90,26 @@ directory rename detection support in: simply not implemented. Further, to implement this, directory rename detection logic would need to move from merge-recursive to diffcore-rename. + + * am + +git-am tries to avoid a full three way merge, instead calling +git-apply. That prevents us from detecting renames at all, which may +defeat the directory rename detection. There is a fallback, though; if +the initial git-apply fails and the user has specified the -3 option, +git-am will fall back to a three way merge. However, git-am lacks the +necessary information to do a "real" three way merge. Instead, it has +to use build_fake_ancestor() to get a merge base that is missing files +whose rename may have been important to detect for directory rename +detection to function. + + * rebase + +Since am-based rebases work by first generating a bunch of patches +(which no longer record what the original commits were and thus don't +have the necessary info from which we can find a real merge-base), and +then calling git-am, this implies that am-based rebases will not always +successfully detect directory renames either (see the 'am' section +above). merged-based rebases (rebase -m) and cherry-pick-based rebases +(rebase -i) are not affected by this shortcoming, and fully support +directory rename detection. -- 2.18.0.9.g678597d97e
[PATCH v4 1/9] git-rebase.txt: document incompatible options
git rebase has many options that only work with one of its three backends. It also has a few other pairs of incompatible options. Document these. Signed-off-by: Elijah Newren --- Documentation/git-rebase.txt | 85 1 file changed, 77 insertions(+), 8 deletions(-) diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index 0e20a66e73..b2d95e3fb9 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -243,11 +243,15 @@ leave out at most one of A and B, in which case it defaults to HEAD. --keep-empty:: Keep the commits that do not change anything from its parents in the result. ++ +See also INCOMPATIBLE OPTIONS below. --allow-empty-message:: By default, rebasing commits with an empty message will fail. This option overrides that behavior, allowing commits with empty messages to be rebased. ++ +See also INCOMPATIBLE OPTIONS below. --skip:: Restart the rebasing process by skipping the current patch. @@ -271,6 +275,8 @@ branch on top of the branch. Because of this, when a merge conflict happens, the side reported as 'ours' is the so-far rebased series, starting with , and 'theirs' is the working branch. In other words, the sides are swapped. ++ +See also INCOMPATIBLE OPTIONS below. -s :: --strategy=:: @@ -280,8 +286,10 @@ other words, the sides are swapped. + Because 'git rebase' replays each commit from the working branch on top of the branch using the given strategy, using -the 'ours' strategy simply discards all patches from the , +the 'ours' strategy simply empties all patches from the , which makes little sense. ++ +See also INCOMPATIBLE OPTIONS below. -X :: --strategy-option=:: @@ -289,6 +297,8 @@ which makes little sense. This implies `--merge` and, if no strategy has been specified, `-s recursive`. Note the reversal of 'ours' and 'theirs' as noted above for the `-m` option. ++ +See also INCOMPATIBLE OPTIONS below. -S[]:: --gpg-sign[=]:: @@ -324,6 +334,8 @@ which makes little sense. and after each change. When fewer lines of surrounding context exist they all must match. By default no context is ever ignored. ++ +See also INCOMPATIBLE OPTIONS below. -f:: --force-rebase:: @@ -355,19 +367,22 @@ default is `--no-fork-point`, otherwise the default is `--fork-point`. --whitespace=:: These flag are passed to the 'git apply' program (see linkgit:git-apply[1]) that applies the patch. - Incompatible with the --interactive option. ++ +See also INCOMPATIBLE OPTIONS below. --committer-date-is-author-date:: --ignore-date:: These flags are passed to 'git am' to easily change the dates of the rebased commits (see linkgit:git-am[1]). - Incompatible with the --interactive option. ++ +See also INCOMPATIBLE OPTIONS below. --signoff:: Add a Signed-off-by: trailer to all the rebased commits. Note that if `--interactive` is given then only commits marked to be - picked, edited or reworded will have the trailer added. Incompatible - with the `--preserve-merges` option. + picked, edited or reworded will have the trailer added. ++ +See also INCOMPATIBLE OPTIONS below. -i:: --interactive:: @@ -378,6 +393,8 @@ default is `--no-fork-point`, otherwise the default is `--fork-point`. The commit list format can be changed by setting the configuration option rebase.instructionFormat. A customized instruction format will automatically have the long commit hash prepended to the format. ++ +See also INCOMPATIBLE OPTIONS below. -r:: --rebase-merges[=(rebase-cousins|no-rebase-cousins)]:: @@ -404,7 +421,7 @@ It is currently only possible to recreate the merge commits using the `recursive` merge strategy; Different merge strategies can be used only via explicit `exec git merge -s [...]` commands. + -See also REBASING MERGES below. +See also REBASING MERGES and INCOMPATIBLE OPTIONS below. -p:: --preserve-merges:: @@ -415,6 +432,8 @@ See also REBASING MERGES below. This uses the `--interactive` machinery internally, but combining it with the `--interactive` option explicitly is generally not a good idea unless you know what you are doing (see BUGS below). ++ +See also INCOMPATIBLE OPTIONS below. -x :: --exec :: @@ -437,6 +456,8 @@ squash/fixup series. + This uses the `--interactive` machinery internally, but it can be run without an explicit `--interactive`. ++ +See also INCOMPATIBLE OPTIONS below. --root:: Rebase all commits reachable from , instead of @@ -447,6 +468,8 @@ without an explicit `--interactive`. When used together with both --onto and --preserve-merges, 'all' root commits will be rewritten to have as parent instead. ++ +See also INCOMPATIBLE OPTIONS below. --autosquash:: --no-autosquash:: @@ -461,11 +484,11 @@ without an explicit `--interactive`. too.
[PATCH v4 8/9] t3401: add directory rename testcases for rebase and am
Add a simple directory rename testcase, in conjunction with each of the types of rebases: git-rebase--interactive git-rebase--am git-rebase--merge and also use the same testcase for git am --3way This demonstrates a difference in behavior between the different rebase backends in regards to directory rename detection. Signed-off-by: Elijah Newren --- t/t3401-rebase-and-am-rename.sh | 105 1 file changed, 105 insertions(+) create mode 100755 t/t3401-rebase-and-am-rename.sh diff --git a/t/t3401-rebase-and-am-rename.sh b/t/t3401-rebase-and-am-rename.sh new file mode 100755 index 00..8f832957fc --- /dev/null +++ b/t/t3401-rebase-and-am-rename.sh @@ -0,0 +1,105 @@ +#!/bin/sh + +test_description='git rebase + directory rename tests' + +. ./test-lib.sh +. "$TEST_DIRECTORY"/lib-rebase.sh + +test_expect_success 'setup testcase' ' + test_create_repo dir-rename && + ( + cd dir-rename && + + mkdir x && + test_seq 1 10 >x/a && + test_seq 11 20 >x/b && + test_seq 21 30 >x/c && + test_write_lines a b c d e f g h i >l && + git add x l && + git commit -m "Initial" && + + git branch O && + git branch A && + git branch B && + + git checkout A && + git mv x y && + git mv l letters && + git commit -m "Rename x to y, l to letters" && + + git checkout B && + echo j >>l && + test_seq 31 40 >x/d && + git add l x/d && + git commit -m "Modify l, add x/d" + ) +' + +test_expect_success 'rebase --interactive: directory rename detected' ' + ( + cd dir-rename && + + git checkout B^0 && + + set_fake_editor && + FAKE_LINES="1" git rebase --interactive A && + + git ls-files -s >out && + test_line_count = 5 out && + + test_path_is_file y/d && + test_path_is_missing x/d + ) +' + +test_expect_failure 'rebase (am): directory rename detected' ' + ( + cd dir-rename && + + git checkout B^0 && + + git rebase A && + + git ls-files -s >out && + test_line_count = 5 out && + + test_path_is_file y/d && + test_path_is_missing x/d + ) +' + +test_expect_success 'rebase --merge: directory rename detected' ' + ( + cd dir-rename && + + git checkout B^0 && + + git rebase --merge A && + + git ls-files -s >out && + test_line_count = 5 out && + + test_path_is_file y/d && + test_path_is_missing x/d + ) +' + +test_expect_failure 'am: directory rename detected' ' + ( + cd dir-rename && + + git checkout A^0 && + + git format-patch -1 B && + + git am --3way 0001*.patch && + + git ls-files -s >out && + test_line_count = 5 out && + + test_path_is_file y/d && + test_path_is_missing x/d + ) +' + +test_done -- 2.18.0.9.g678597d97e
[RFC PATCH v4 9/9] git-rebase: make --allow-empty-message the default
rebase backends currently behave differently with empty commit messages, largely as a side-effect of the different underlying commands on which they are based. am-based rebases apply commits with an empty commit message without stopping or requiring the user to specify an extra flag. (It is interesting to note that am-based rebases are the default rebase type, and no one has ever requested a --no-allow-empty-message flag to change this behavior.) merge-based and interactive-based rebases (which are ultimately based on git-commit), will currently halt on any such commits and require the user to manually specify what to do with the commit and continue. One possible rationale for the difference in behavior is that the purpose of an "am" based rebase is solely to transplant an existing history, while an "interactive" rebase is one whose purpose is to polish a series before making it publishable. Thus, stopping and asking for confirmation for a possible problem is more appropriate in the latter case. However, there are two problems with this rationale: 1) merge-based rebases are also non-interactive and there are multiple types of rebases that use the interactive machinery but are not explicitly interactive (e.g. when either --rebase-merges or --keep-empty are specified without --interactive). These rebases are also used solely to transplant an existing history, and thus also should default to --allow-empty-message. 2) this rationale only says that the user is more accepting of stopping in the case of an explicitly interactive rebase, not that stopping for this particular reason actually makes sense. Exploring whether it makes sense, requires backing up and analyzing the underlying commands... If git-commit did not error out on empty commits by default, accidental creation of commits with empty messages would be a very common occurrence (this check has caught me many times). Further, nearly all such empty commit messages would be considered an accidental error (as evidenced by a huge amount of documentation across version control systems and in various blog posts explaining how important commit messages are). A simple check for what would otherwise be a common error thus made a lot of sense, and git-commit gained an --allow-empty-message flag for special case overrides. This has made commits with empty messages very rare. There are two sources for commits with empty messages for rebase (and cherry-pick): (a) commits created in git where the user previously specified --allow-empty-message to git-commit, and (b) commits imported into git from other version control systems. In case (a), the user has already explicitly specified that there is something special about this commit that makes them not want to specify a commit message; forcing them to re-specify with every cherry-pick or rebase seems more likely to be infuriating than helpful. In case (b), the commit is highly unlikely to have been authored by the person who has imported the history and is doing the rebase or cherry-pick, and thus the user is unlikely to be the appropriate person to write a commit message for it. Stopping and expecting the user to modify the commit before proceeding thus seems counter-productive. Further, note that while empty commit messages was a common error case for git-commit to deal with, it is a rare case for rebase (or cherry-pick). The fact that it is rare raises the question of why it would be worth checking and stopping on this particular condition and not others. For example, why doesn't an interactive rebase automatically stop if the commit message's first line is 2000 columns long, or is missing a blank line after the first line, or has every line indented with five spaces, or any number of other myriad problems? Finally, note that if a user doing an interactive rebase does have the necessary knowledge to add a message for any such commit and wants to do so, it is rather simple for them to change the appropriate line from 'pick' to 'reword'. The fact that the subject is empty in the todo list that the user edits should even serve as a way to notify them. As far as I can tell, the fact that merge-based and interactive-based rebases stop on commits with empty commit messages is solely a by-product of having been based on git-commit. It went without notice for a long time precisely because such cases are rare. The rareness of this situation made it difficult to reason about, so when folks did eventually notice this behavior, they assumed it was there for a good reason and just added an --allow-empty-message flag. In my opinion, stopping on such messages not desirable in any of these cases, even the (explicitly) interactive case. Signed-off-by: Elijah Newren --- My commit messsage seems like one of those things that someone else will instantly see how to shrink to less than a quarter of its size while still retaining all essential reasoning. I can't
[PATCH v4 5/9] git-rebase.txt: address confusion between --no-ff vs --force-rebase
rebase was taught the --force-rebase option in commit b2f82e05de ("Teach rebase to rebase even if upstream is up to date", 2009-02-13). This flag worked for the am and merge backends, but wasn't a valid option for the interactive backend. rebase was taught the --no-ff option for interactive rebases in commit b499549401cb ("Teach rebase the --no-ff option.", 2010-03-24), to do the exact same thing as --force-rebase does for non-interactive rebases. This commit explicitly documented the fact that --force-rebase was incompatible with --interactive, though it made --no-ff a synonym for --force-rebase for non-interactive rebases. The choice of a new option was based on the fact that "force rebase" didn't sound like an appropriate term for the interactive machinery. In commit 6bb4e485cff8 ("rebase: align variable names", 2011-02-06), the separate parsing of command line options in the different rebase scripts was removed, and whether on accident or because the author noticed that these options did the same thing, the options became synonyms and both were accepted by all three rebase types. In commit 2d26d533a012 ("Documentation/git-rebase.txt: -f forces a rebase that would otherwise be a no-op", 2014-08-12), which reworded the description of the --force-rebase option, the (no-longer correct) sentence stating that --force-rebase was incompatible with --interactive was finally removed. Finally, as explained at https://public-inbox.org/git/98279912-0f52-969d-44a6-222420393...@xiplink.com In the original discussion around this option [1], at one point I proposed teaching rebase--interactive to respect --force-rebase instead of adding a new option [2]. Ultimately --no-ff was chosen as the better user interface design [3], because an interactive rebase can't be "forced" to run. We have accepted both --no-ff and --force-rebase as full synonyms for all three rebase types for over seven years. Documenting them differently and in ways that suggest they might not be quite synonyms simply leads to confusion. Adjust the documentation to match reality. Signed-off-by: Elijah Newren --- Documentation/git-rebase.txt | 30 ++ 1 file changed, 10 insertions(+), 20 deletions(-) diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index b2d95e3fb9..2f47495a4d 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -337,16 +337,18 @@ See also INCOMPATIBLE OPTIONS below. + See also INCOMPATIBLE OPTIONS below. --f:: +--no-ff:: --force-rebase:: - Force a rebase even if the current branch is up to date and - the command without `--force` would return without doing anything. +-f:: + Individually replay all rebased commits instead of fast-forwarding + over the unchanged ones. This ensures that the entire history of + the rebased branch is composed of new commits. + -You may find this (or --no-ff with an interactive rebase) helpful after -reverting a topic branch merge, as this option recreates the topic branch with -fresh commits so it can be remerged successfully without needing to "revert -the reversion" (see the -link:howto/revert-a-faulty-merge.html[revert-a-faulty-merge How-To] for details). +You may find this helpful after reverting a topic branch merge, as this option +recreates the topic branch with fresh commits so it can be remerged +successfully without needing to "revert the reversion" (see the +link:howto/revert-a-faulty-merge.html[revert-a-faulty-merge How-To] for +details). --fork-point:: --no-fork-point:: @@ -498,18 +500,6 @@ See also INCOMPATIBLE OPTIONS below. with care: the final stash application after a successful rebase might result in non-trivial conflicts. ---no-ff:: - With --interactive, cherry-pick all rebased commits instead of - fast-forwarding over the unchanged ones. This ensures that the - entire history of the rebased branch is composed of new commits. -+ -Without --interactive, this is a synonym for --force-rebase. -+ -You may find this helpful after reverting a topic branch merge, as this option -recreates the topic branch with fresh commits so it can be remerged -successfully without needing to "revert the reversion" (see the -link:howto/revert-a-faulty-merge.html[revert-a-faulty-merge How-To] for details). - INCOMPATIBLE OPTIONS -- 2.18.0.9.g678597d97e
[PATCH v4 0/9] Document/fix/warn about rebase incompatibilities and inconsistencies
git-rebase has lots of options that are mutually incompatible. Even among aspects of its behavior that is common to all rebase types, it has a number of inconsistencies. This series tries to document, fix, and/or warn users about many of these. Changes since v3 (range-diff included at the end): * Fix many small issues mentioned by Junio and SZEDER, including a fix suggested by Eric. * Add a new patch (#6) which introduces technical documentation on directory rename detection, so that... * The more technical portion of the docs from patch 7 (used to be patch 5 in v3) can be moved from git-rebase.txt to the new technical/directory-rename-detection.txt, as suggested by Junio. * Also adds a new patch (#8) which adds testcases to demonstrate what the new technical doc discusses. Items of particular note for reviewers: * I have left patch 9 (used to be patch 7) in RFC state, but expanded the commit message with an in-depth usability rationale for the change. * It sounded like Junio was slightly unclear about the intent of the wording in Patch 1. Not sure if my answer (in email) was sufficient or if there are changes I should make to the patch. * On Patch 5 of v3 (which is now patch 7 of v4), Junio suggested a course of action to take with --keep-empty. I counter-proposed with a --empty={drop,halt,keep} idea. Is just deferring this issue to a later series okay, or are there changes I should make to the current series to handle this? (If so, what changes should those be?) Elijah Newren (9): git-rebase.txt: document incompatible options git-rebase.sh: update help messages a bit t3422: new testcases for checking when incompatible options passed git-rebase: error out when incompatible options passed git-rebase.txt: address confusion between --no-ff vs --force-rebase directory-rename-detection.txt: technical docs on abilities and limitations git-rebase.txt: document behavioral differences between modes t3401: add directory rename testcases for rebase and am git-rebase: make --allow-empty-message the default Documentation/git-rebase.txt | 135 ++ .../technical/directory-rename-detection.txt | 115 +++ git-rebase.sh | 43 +- t/t3401-rebase-and-am-rename.sh | 105 ++ t/t3404-rebase-interactive.sh | 7 +- t/t3405-rebase-malformed.sh | 11 +- t/t3422-rebase-incompatible-options.sh| 88 7 files changed, 462 insertions(+), 42 deletions(-) create mode 100644 Documentation/technical/directory-rename-detection.txt create mode 100755 t/t3401-rebase-and-am-rename.sh create mode 100755 t/t3422-rebase-incompatible-options.sh 1: 4cdf9130cc ! 1: 3f454ebc5e git-rebase.txt: document incompatible options @@ -163,7 +163,7 @@ +implementations: + + * one based on linkgit:git-am[1] (the default) -+ * one based on linkgit:git-merge-recursive[1] (merge backend) ++ * one based on git-merge-recursive (merge backend) + * one based on linkgit:git-cherry-pick[1] (interactive backend) + +Flags only understood by the am backend: 2: e336f76c5e ! 2: 31a5a071a6 git-rebase.sh: update help messages a bit @@ -6,6 +6,8 @@ to make like things (e.g. strategy and strategy-option) be near each other. +Signed-off-by: Elijah Newren + diff --git a/git-rebase.sh b/git-rebase.sh --- a/git-rebase.sh +++ b/git-rebase.sh 3: 4ab38d8a5f ! 3: 5a2b5eec79 t3422: new testcases for checking when incompatible options passed @@ -34,8 +34,7 @@ + git commit -m A && + + git checkout B && -+ # This is indented with HT SP HT. -+ echo " foo();" >>foo && ++ echo "q qfoo();" | q_to_tab >>foo && + git add foo && + git commit -m B +' @@ -48,7 +47,7 @@ +# being ignored. Make sure rebase warns the user and aborts instead. +# + -+test_run_rebase () { ++test_rebase_am_only () { + opt=$1 + shift + test_expect_failure "$opt incompatible with --merge" " @@ -78,10 +77,10 @@ + +} + -+test_run_rebase --whitespace=fix -+test_run_rebase --ignore-whitespace -+test_run_rebase --committer-date-is-author-date -+test_run_rebase -C4 ++test_rebase_am_only --whitespace=fix ++test_rebase_am_only --ignore-whitespace ++test_rebase_am_only --committer-date-is-author-date ++test_rebase_am_only -C4 + +test_expect_success '--preserve-merges incompatible with --signoff' ' + git checkout B^0 && 4: 5223954caf ! 4: 1e1c83724a git-rebase: error out when incompatible options passed @@ -37,7 +37,8 @@ fi +if test -n "$git_am_opt"; then -+ incompatible_opts=$(echo "$git_am_opt" | sed -e 's/ -q//') ++ incompatible_opts=$(echo " $git_am_opt " | \
[PATCH v4 3/9] t3422: new testcases for checking when incompatible options passed
git rebase is split into three types: am, merge, and interactive. Various options imply different types, and which mode we are using determine which sub-script (git-rebase--$type) is executed to finish the work. Not all options work with all types, so add tests for combinations where we expect to receive an error rather than having options be silently ignored. Signed-off-by: Elijah Newren --- t/t3422-rebase-incompatible-options.sh | 88 ++ 1 file changed, 88 insertions(+) create mode 100755 t/t3422-rebase-incompatible-options.sh diff --git a/t/t3422-rebase-incompatible-options.sh b/t/t3422-rebase-incompatible-options.sh new file mode 100755 index 00..2687a85254 --- /dev/null +++ b/t/t3422-rebase-incompatible-options.sh @@ -0,0 +1,88 @@ +#!/bin/sh + +test_description='test if rebase detects and aborts on incompatible options' +. ./test-lib.sh + +test_expect_success 'setup' ' + test_seq 2 9 >foo && + git add foo && + git commit -m orig && + + git branch A && + git branch B && + + git checkout A && + test_seq 1 9 >foo && + git add foo && + git commit -m A && + + git checkout B && + echo "q qfoo();" | q_to_tab >>foo && + git add foo && + git commit -m B +' + +# +# Rebase has lots of useful options like --whitepsace=fix, which are +# actually all built in terms of flags to git-am. Since neither +# --merge nor --interactive (nor any options that imply those two) use +# git-am, using them together will result in flags like --whitespace=fix +# being ignored. Make sure rebase warns the user and aborts instead. +# + +test_rebase_am_only () { + opt=$1 + shift + test_expect_failure "$opt incompatible with --merge" " + git checkout B^0 && + test_must_fail git rebase $opt --merge A + " + + test_expect_failure "$opt incompatible with --strategy=ours" " + git checkout B^0 && + test_must_fail git rebase $opt --strategy=ours A + " + + test_expect_failure "$opt incompatible with --strategy-option=ours" " + git checkout B^0 && + test_must_fail git rebase $opt --strategy=ours A + " + + test_expect_failure "$opt incompatible with --interactive" " + git checkout B^0 && + test_must_fail git rebase $opt --interactive A + " + + test_expect_failure "$opt incompatible with --exec" " + git checkout B^0 && + test_must_fail git rebase $opt --exec 'true' A + " + +} + +test_rebase_am_only --whitespace=fix +test_rebase_am_only --ignore-whitespace +test_rebase_am_only --committer-date-is-author-date +test_rebase_am_only -C4 + +test_expect_success '--preserve-merges incompatible with --signoff' ' + git checkout B^0 && + test_must_fail git rebase --preserve-merges --signoff A +' + +test_expect_failure '--preserve-merges incompatible with --rebase-merges' ' + git checkout B^0 && + test_must_fail git rebase --preserve-merges --rebase-merges A +' + +test_expect_failure '--rebase-merges incompatible with --strategy' ' + git checkout B^0 && + test_must_fail git rebase --rebase-merges -s resolve A +' + +test_expect_failure '--rebase-merges incompatible with --strategy-option' ' + git checkout B^0 && + test_must_fail git rebase --rebase-merges -Xignore-space-change A +' + +test_done -- 2.18.0.9.g678597d97e
[PATCH v4 2/9] git-rebase.sh: update help messages a bit
signoff is not specific to the am-backend. Also, re-order a few options to make like things (e.g. strategy and strategy-option) be near each other. Signed-off-by: Elijah Newren --- git-rebase.sh | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/git-rebase.sh b/git-rebase.sh index 40be59ecc4..bf71b7fa20 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -20,23 +20,23 @@ onto=! rebase onto given branch instead of upstream r,rebase-merges? try to rebase merges instead of skipping them p,preserve-merges! try to recreate merges instead of ignoring them s,strategy=! use the given merge strategy +X,strategy-option=! pass the argument through to the merge strategy no-ff! cherry-pick all commits, even if unchanged +f,force-rebase!cherry-pick all commits, even if unchanged m,merge! use merging strategies to rebase i,interactive! let the user edit the list of commits to rebase x,exec=! add exec lines after each commit of the editable list k,keep-empty preserve empty commits during rebase allow-empty-message allow rebasing commits with empty messages -f,force-rebase!force rebase even if branch is up to date -X,strategy-option=! pass the argument through to the merge strategy stat! display a diffstat of what changed upstream n,no-stat! do not show diffstat of what changed upstream verify allow pre-rebase hook to run rerere-autoupdate allow rerere to update index with resolved conflicts root! rebase all reachable commits up to the root(s) autosquash move commits that begin with squash!/fixup! under -i +signoffadd a Signed-off-by: line to each commit committer-date-is-author-date! passed to 'git am' ignore-date! passed to 'git am' -signoffpassed to 'git am' whitespace=! passed to 'git apply' ignore-whitespace! passed to 'git apply' C=!passed to 'git apply' -- 2.18.0.9.g678597d97e
Re: [PATCH] submodule.c: report the submodule that an error occurs in
SZEDER Gábor writes: >> When an error occurs in updating the working tree of a submodule in >> submodule_move_head, tell the user which submodule the error occurred in. >> >> The call to read-tree contains a super-prefix, such that the read-tree >> will correctly report any path related issues, but some error messages >> do not contain a path, for example: >> >> ~/gerrit$ git checkout --recurse-submodules origin/master >> ~/gerrit$ fatal: failed to unpack tree object >> 07672f31880ba80300b38492df9d0acfcd6ee00a >> >> Give the hint which submodule has a problem. >> >> Signed-off-by: Stefan Beller >> --- >> submodule.c | 2 +- >> t/lib-submodule-update.sh | 3 ++- >> 2 files changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/submodule.c b/submodule.c >> index 939d6870ecd..ebd092a14fd 100644 >> --- a/submodule.c >> +++ b/submodule.c >> @@ -1668,7 +1668,7 @@ int submodule_move_head(const char *path, >> argv_array_push(, new_head ? new_head : empty_tree_oid_hex()); >> >> if (run_command()) { >> -ret = -1; >> +ret = error(_("Submodule '%s' could not be updated."), path); > > This is a translated error message ... > >> goto out; >> } >> >> diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh >> index 1f38a85371a..e27f5d8541d 100755 >> --- a/t/lib-submodule-update.sh >> +++ b/t/lib-submodule-update.sh >> @@ -781,7 +781,8 @@ test_submodule_recursing_with_args_common() { >> ( >> cd submodule_update && >> git branch -t invalid_sub1 origin/invalid_sub1 && >> -test_must_fail $command invalid_sub1 && >> +test_must_fail $command invalid_sub1 2>err && >> +grep sub1 err && > > ... so the test should use 'test_i18ngrep' to check it. Thanks for being a careful reviewer, as always. Will tweak locally to skip one round-trip.
Re: [GSoC][PATCH v3 2/3] rebase -i: rewrite setup_reflog_action() in C
Alban Gruin writes: > Hi Junio, > > Le 22/06/2018 à 18:27, Junio C Hamano a écrit : >> Alban Gruin writes: >> > This rewrites (the misnamed) setup_reflog_action() from shell to C. The >> > new version is called checkout_base_commit(). >> >> ;-) on the "misnamed" part. Indeed, setting up the comment for the >> reflog entry is secondary to what this function wants to do, which >> is to check out the branch to be rebased. >> >> I do not think "base_commit" is a good name, either, though. When I >> hear 'base' in the context of 'rebase', I would imagine that the >> speaker is talking about the bottom of the range of the commits to >> be rebased (i.e. "rebase --onto ONTO BASE BRANCH", which replays >> commits BASE..BRANCH on top of ONTO and then points BRANCH at the >> result), not the top of the range or the branch these commits are >> taken from. >> > > Perhaps should I name this function checkout_onto(), and rename > checkout_onto() to "detach_onto()"? And I would reorder those two commits in > the series, as this would be very confusing. I may be misunderstanding what is happening in the function, but I think it is checking out neither the onto or the base commit. The function instead is about checking out the branch to be rebased before anything else happens when the optional argument is given (and when the optional argument is not given, then we rebase the current branch so there is no need to check it out upfront), no?
Re: [PATCH 2/2] i18n: bisect: mark two supplementary strings for translation
Hi, On Thu, 21 Jun 2018, Raphael Hertzog wrote: > On Thu, 21 Jun 2018, Duy Nguyen wrote: > > Nice. There's another string in bisect_common() that should also be > > translated: "revision walk setup failed". Maybe you can mark it too? > > Sure. A new version of the second patch is attached. I haven't had any other feedback. Is there anything else that I should do to get my changes merged? Cheers, -- Raphaël Hertzog ◈ Debian Developer Support Debian LTS: https://www.freexian.com/services/debian-lts.html Learn to master Debian: https://debian-handbook.info/get/
Re: [PATCH v6 20/21] gc: automatically write commit-graph files
On 6/8/2018 6:24 PM, SZEDER Gábor wrote: The commit-graph file is a very helpful feature for speeding up git operations. In order to make it more useful, make it possible to write the commit-graph file during standard garbage collection operations. Add a 'gc.commitGraph' config setting that triggers writing a commit-graph file after any non-trivial 'git gc' command. Defaults to false while the commit-graph feature matures. We specifically do not want to have this on by default until the commit-graph feature is fully integrated with history-modifying features like shallow clones. So I played around with an earlier version of this patch series a while ago... and as I looked into my gitconfig today I was surprised to have both 'core.commitGraph' and 'gc.commitGraph' config variables set. When I looked into it I came across this email from Ævar: https://public-inbox.org/git/87fu3peni2@evledraar.gmail.com/ > Other than the question if 'gc.commitGraph' and 'core.commitGraph' > should be independent config variables, and the exact wording of the > git-gc docs, it looks good to me. Sans doc errors you pointed out in other places (you need to set core.commitGraph so it's read at all), I think it's very useful to have these split up. It's simliar to pack.useBitmaps & pack.writeBitmaps. I think the comparison with pack bitmaps makes a lot of sense and I have to say that I really like those 'useBitmaps' and 'writeBitmaps' variable names, because it's clear right away which one is which, without consulting the documentation. I think having 'useCommitGraph' and 'writeCommitGraph' variables would be a lot better than the same variable name in two different sections, and I'm sure that then I wouldn't have been caught off guard. Sorry for the late reply. Maybe 'gc.writeCommitGraph' makes it clear that the setting enables writing the commit-graph during a 'git gc'. Is that more clear? Thanks, -Stolee
[PATCH v2 07/24] multi-pack-index: expand test data
As we build the multi-pack-index file format, we want to test the format on real repoasitories. Add tests to t5319-multi-pack-index.sh that create repository data including multiple packfiles with both version 1 and version 2 formats. The current 'git multi-pack-index write' command will always write the same file with no "real" data. This will be expanded in future commits, along with the test expectations. Signed-off-by: Derrick Stolee --- t/t5319-multi-pack-index.sh | 99 + 1 file changed, 99 insertions(+) diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 0372704c96..d533fd0dbc 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -13,9 +13,108 @@ midx_read_expect() { } test_expect_success 'write midx with no packs' ' + test_when_finished rm pack/multi-pack-index && git multi-pack-index --object-dir=. write && test_path_is_file pack/multi-pack-index && midx_read_expect ' +test_expect_success 'create objects' ' + for i in `test_seq 1 5` + do + iii=$(printf '%03i' $i) + test-tool genrandom "bar" 200 > wide_delta_$iii && + test-tool genrandom "baz $iii" 50 >> wide_delta_$iii && + test-tool genrandom "foo"$i 100 > deep_delta_$iii && + test-tool genrandom "foo"$(expr $i + 1) 100 >> deep_delta_$iii && + test-tool genrandom "foo"$(expr $i + 2) 100 >> deep_delta_$iii && + echo $iii >file_$iii && + test-tool genrandom "$iii" 8192 >>file_$iii && + git update-index --add file_$iii deep_delta_$iii wide_delta_$iii && + i=$(expr $i + 1) || return 1 + done && + { echo 101 && test-tool genrandom 100 8192; } >file_101 && + git update-index --add file_101 && + tree=$(git write-tree) && + commit=$(git commit-tree $tree obj-list && + git update-ref HEAD $commit +' + +test_expect_success 'write midx with one v1 pack' ' + pack=$(git pack-objects --index-version=1 pack/test wide_delta_$iii && + test-tool genrandom "baz $iii" 50 >> wide_delta_$iii && + test-tool genrandom "foo"$i 100 > deep_delta_$iii && + test-tool genrandom "foo"$(expr $i + 1) 100 >> deep_delta_$iii && + test-tool genrandom "foo"$(expr $i + 2) 100 >> deep_delta_$iii && + echo $iii >file_$iii && + test-tool genrandom "$iii" 8192 >>file_$iii && + git update-index --add file_$iii deep_delta_$iii wide_delta_$iii && + i=$(expr $i + 1) || return 1 + done && + { echo 101 && test-tool genrandom 100 8192; } >file_101 && + git update-index --add file_101 && + tree=$(git write-tree) && + commit=$(git commit-tree $tree -p HEADobj-list2 && + git update-ref HEAD $commit +' + +test_expect_success 'write midx with two packs' ' + git pack-objects --index-version=1 pack/test-2 wide_delta_$iii && + test-tool genrandom "baz $iii" 50 >> wide_delta_$iii && + test-tool genrandom "foo"$i 100 > deep_delta_$iii && + test-tool genrandom "foo"$(expr $i + 1) 100 >> deep_delta_$iii && + test-tool genrandom "foo"$(expr $i + 2) 100 >> deep_delta_$iii && + echo $iii >file_$iii && + test-tool genrandom "$iii" 8192 >>file_$iii && + git update-index --add file_$iii deep_delta_$iii wide_delta_$iii && + { echo 101 && test-tool genrandom 100 8192; } >file_101 && + git update-index --add file_101 && + tree=$(git write-tree) && + commit=$(git commit-tree $tree -p HEADobj-list && + git update-ref HEAD $commit && + git pack-objects --index-version=2 test-pack
[PATCH v2 21/24] midx: use midx in approximate_object_count
Signed-off-by: Derrick Stolee --- packfile.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/packfile.c b/packfile.c index 20b743da91..e72e8a685d 100644 --- a/packfile.c +++ b/packfile.c @@ -863,11 +863,14 @@ unsigned long approximate_object_count(void) { if (!the_repository->objects->approximate_object_count_valid) { unsigned long count; + struct multi_pack_index *m; struct packed_git *p; prepare_packed_git(the_repository); count = 0; - for (p = the_repository->objects->packed_git; p; p = p->next) { + for (m = get_multi_pack_index(the_repository); m; m = m->next) + count += m->num_objects; + for (p = get_packed_git(the_repository); p; p = p->next) { if (open_pack_index(p)) continue; count += p->num_objects; -- 2.18.0.24.g1b579a2ee9
[PATCH v2 22/24] midx: prevent duplicate packfile loads
The multi-pack-index, when present, tracks the existence of objects and their offsets within a list of packfiles. This allows us to use the multi-pack-index for object lookups, abbreviations, and object counts. When the multi-pack-index tracks a packfile, then we do not need to add that packfile to the packed_git linked list or the MRU list. We still need to load the packfiles that are not tracked by the multi-pack-index. Signed-off-by: Derrick Stolee --- packfile.c | 9 + 1 file changed, 9 insertions(+) diff --git a/packfile.c b/packfile.c index e72e8a685d..f2b8d6f8a7 100644 --- a/packfile.c +++ b/packfile.c @@ -796,6 +796,7 @@ struct prepare_pack_data struct repository *r; struct string_list *garbage; int local; + struct multi_pack_index *m; }; static void prepare_pack(const char *full_name, size_t full_name_len, const char *file_name, void *_data) @@ -805,6 +806,8 @@ static void prepare_pack(const char *full_name, size_t full_name_len, const char size_t base_len = full_name_len; if (strip_suffix_mem(full_name, _len, ".idx")) { + if (data->m && midx_contains_pack(data->m, file_name)) + return; /* Don't reopen a pack we already have. */ for (p = data->r->objects->packed_git; p; p = p->next) { size_t len; @@ -841,6 +844,12 @@ static void prepare_packed_git_one(struct repository *r, char *objdir, int local struct prepare_pack_data data; struct string_list garbage = STRING_LIST_INIT_DUP; + data.m = r->objects->multi_pack_index; + + /* look for the multi-pack-index for this object directory */ + while (data.m && strcmp(data.m->object_dir, objdir)) + data.m = data.m->next; + data.r = r; data.garbage = data.local = local; -- 2.18.0.24.g1b579a2ee9
[PATCH v2 20/24] midx: use existing midx when writing new one
Signed-off-by: Derrick Stolee --- midx.c | 107 ++--- midx.h | 1 + 2 files changed, 96 insertions(+), 12 deletions(-) diff --git a/midx.c b/midx.c index c258e3ebdf..02cbfc5bd5 100644 --- a/midx.c +++ b/midx.c @@ -47,7 +47,6 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) fd = git_open(midx_name); if (fd < 0) { - error_errno(_("failed to read %s"), midx_name); FREE_AND_NULL(midx_name); return NULL; } @@ -276,6 +275,29 @@ int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct mu return nth_midxed_pack_entry(m, e, pos); } +int midx_contains_pack(struct multi_pack_index *m, const char *idx_name) +{ + uint32_t first = 0, last = m->num_packs; + + while (first < last) { + uint32_t mid = first + (last - first) / 2; + const char *current; + int cmp; + + current = m->pack_names[mid]; + cmp = strcmp(idx_name, current); + if (!cmp) + return 1; + if (cmp > 0) { + first = mid + 1; + continue; + } + last = mid; + } + + return 0; +} + int prepare_multi_pack_index_one(struct repository *r, const char *object_dir) { struct multi_pack_index *m = r->objects->multi_pack_index; @@ -322,6 +344,7 @@ struct pack_list uint32_t alloc_list; uint32_t alloc_names; size_t pack_name_concat_len; + struct multi_pack_index *m; }; static void add_pack_to_midx(const char *full_path, size_t full_path_len, @@ -330,6 +353,9 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len, struct pack_list *packs = (struct pack_list *)data; if (ends_with(file_name, ".idx")) { + if (packs->m && midx_contains_pack(packs->m, file_name)) + return; + ALLOC_GROW(packs->list, packs->nr + 1, packs->alloc_list); ALLOC_GROW(packs->names, packs->nr + 1, packs->alloc_names); @@ -413,6 +439,23 @@ static int midx_oid_compare(const void *_a, const void *_b) return a->pack_int_id - b->pack_int_id; } +static int nth_midxed_pack_midx_entry(struct multi_pack_index *m, + uint32_t *pack_perm, + struct pack_midx_entry *e, + uint32_t pos) +{ + if (pos >= m->num_objects) + return 1; + + nth_midxed_object_oid(>oid, m, pos); + e->pack_int_id = pack_perm[nth_midxed_pack_int_id(m, pos)]; + e->offset = nth_midxed_offset(m, pos); + + /* consider objects in midx to be from "old" packs */ + e->pack_mtime = 0; + return 0; +} + static void fill_pack_entry(uint32_t pack_int_id, struct packed_git *p, uint32_t cur_object, @@ -438,7 +481,8 @@ static void fill_pack_entry(uint32_t pack_int_id, * Copy only the de-duplicated entries (selected by most-recent modified time * of a packfile containing the object). */ -static struct pack_midx_entry *get_sorted_entries(struct packed_git **p, +static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, + struct packed_git **p, uint32_t *perm, uint32_t nr_packs, uint32_t *nr_objects) @@ -447,8 +491,9 @@ static struct pack_midx_entry *get_sorted_entries(struct packed_git **p, uint32_t alloc_fanout, alloc_objects, total_objects = 0; struct pack_midx_entry *entries_by_fanout = NULL; struct pack_midx_entry *deduplicated_entries = NULL; + uint32_t start_pack = m ? m->num_packs : 0; - for (cur_pack = 0; cur_pack < nr_packs; cur_pack++) { + for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) { total_objects += p[cur_pack]->num_objects; } @@ -466,7 +511,23 @@ static struct pack_midx_entry *get_sorted_entries(struct packed_git **p, for (cur_fanout = 0; cur_fanout < 256; cur_fanout++) { uint32_t nr_fanout = 0; - for (cur_pack = 0; cur_pack < nr_packs; cur_pack++) { + if (m) { + uint32_t start = 0, end; + + if (cur_fanout) + start = ntohl(m->chunk_oid_fanout[cur_fanout - 1]); + end = ntohl(m->chunk_oid_fanout[cur_fanout]); + + for (cur_object = start; cur_object < end; cur_object++) { + ALLOC_GROW(entries_by_fanout, nr_fanout + 1, alloc_fanout); +
[PATCH v2 24/24] midx: clear midx on repack
If a 'git repack' command replaces existing packfiles, then we must clear the existing multi-pack-index before moving the packfiles it references. Signed-off-by: Derrick Stolee --- builtin/repack.c | 8 midx.c | 8 midx.h | 1 + 3 files changed, 17 insertions(+) diff --git a/builtin/repack.c b/builtin/repack.c index 6c636e159e..66a7d8e8ea 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -8,6 +8,7 @@ #include "strbuf.h" #include "string-list.h" #include "argv-array.h" +#include "midx.h" static int delta_base_offset = 1; static int pack_kept_objects = -1; @@ -174,6 +175,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) int no_update_server_info = 0; int quiet = 0; int local = 0; + int midx_cleared = 0; struct option builtin_repack_options[] = { OPT_BIT('a', NULL, _everything, @@ -340,6 +342,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix) continue; } + if (!midx_cleared) { + /* if we move a packfile, it will invalidated the midx */ + clear_midx_file(get_object_directory()); + midx_cleared = 1; + } + fname_old = mkpathdup("%s/old-%s%s", packdir, item->string, exts[ext].name); if (file_exists(fname_old)) diff --git a/midx.c b/midx.c index 02cbfc5bd5..ef9fb38610 100644 --- a/midx.c +++ b/midx.c @@ -890,3 +890,11 @@ int write_midx_file(const char *object_dir) FREE_AND_NULL(pack_perm); return 0; } + +void clear_midx_file(const char *object_dir) +{ + char *midx = get_midx_filename(object_dir); + + if (remove_path(midx)) + die(_("failed to clear multi-pack-index at %s"), midx); +} diff --git a/midx.h b/midx.h index 5faffb7bc6..5a42cbed1d 100644 --- a/midx.h +++ b/midx.h @@ -15,5 +15,6 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_name); int prepare_multi_pack_index_one(struct repository *r, const char *object_dir); int write_midx_file(const char *object_dir); +void clear_midx_file(const char *object_dir); #endif -- 2.18.0.24.g1b579a2ee9
[PATCH v2 23/24] packfile: skip loading index if in multi-pack-index
Signed-off-by: Derrick Stolee --- packfile.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/packfile.c b/packfile.c index f2b8d6f8a7..acd02430a8 100644 --- a/packfile.c +++ b/packfile.c @@ -469,8 +469,19 @@ static int open_packed_git_1(struct packed_git *p) ssize_t read_result; const unsigned hashsz = the_hash_algo->rawsz; - if (!p->index_data && open_pack_index(p)) - return error("packfile %s index unavailable", p->pack_name); + if (!p->index_data) { + struct multi_pack_index *m; + const char *pack_name = strrchr(p->pack_name, '/'); + + for (m = the_repository->objects->multi_pack_index; +m; m = m->next) { + if (midx_contains_pack(m, pack_name)) + break; + } + + if (!m && open_pack_index(p)) + return error("packfile %s index unavailable", p->pack_name); + } if (!pack_max_fds) { unsigned int max_fds = get_max_fd_limit(); @@ -521,6 +532,10 @@ static int open_packed_git_1(struct packed_git *p) " supported (try upgrading GIT to a newer version)", p->pack_name, ntohl(hdr.hdr_version)); + /* Skip index checking if in multi-pack-index */ + if (!p->index_data) + return 0; + /* Verify the pack matches its index. */ if (p->num_objects != ntohl(hdr.hdr_entries)) return error("packfile %s claims to have %"PRIu32" objects" -- 2.18.0.24.g1b579a2ee9
[PATCH v2 17/24] midx: prepare midxed_git struct
Signed-off-by: Derrick Stolee --- midx.c | 22 ++ midx.h | 3 +++ object-store.h | 9 + packfile.c | 6 +- 4 files changed, 39 insertions(+), 1 deletion(-) diff --git a/midx.c b/midx.c index 71ca493107..3dd5027dc6 100644 --- a/midx.c +++ b/midx.c @@ -176,6 +176,28 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) return NULL; } +int prepare_multi_pack_index_one(struct repository *r, const char *object_dir) +{ + struct multi_pack_index *m = r->objects->multi_pack_index; + struct multi_pack_index *m_search; + + if (!core_multi_pack_index) + return 0; + + for (m_search = m; m_search; m_search = m_search->next) + if (!strcmp(object_dir, m_search->object_dir)) + return 1; + + r->objects->multi_pack_index = load_multi_pack_index(object_dir); + + if (r->objects->multi_pack_index) { + r->objects->multi_pack_index->next = m; + return 1; + } + + return 0; +} + static size_t write_midx_header(struct hashfile *f, unsigned char num_chunks, uint32_t num_packs) diff --git a/midx.h b/midx.h index 2d83dd9ec1..731ad6f094 100644 --- a/midx.h +++ b/midx.h @@ -1,9 +1,12 @@ #ifndef __MIDX_H__ #define __MIDX_H__ +#include "repository.h" + struct multi_pack_index; struct multi_pack_index *load_multi_pack_index(const char *object_dir); +int prepare_multi_pack_index_one(struct repository *r, const char *object_dir); int write_midx_file(const char *object_dir); diff --git a/object-store.h b/object-store.h index 07bcc80e02..7d67ad7aa9 100644 --- a/object-store.h +++ b/object-store.h @@ -85,6 +85,8 @@ struct packed_git { }; struct multi_pack_index { + struct multi_pack_index *next; + int fd; const unsigned char *data; @@ -126,6 +128,13 @@ struct raw_object_store { */ struct oidmap *replace_map; + /* +* private data +* +* should only be accessed directly by packfile.c and midx.c +*/ + struct multi_pack_index *multi_pack_index; + /* * private data * diff --git a/packfile.c b/packfile.c index 39d6b66337..ff2df22a0b 100644 --- a/packfile.c +++ b/packfile.c @@ -15,6 +15,7 @@ #include "tree-walk.h" #include "tree.h" #include "object-store.h" +#include "midx.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -937,10 +938,13 @@ static void prepare_packed_git(struct repository *r) if (r->objects->packed_git_initialized) return; + prepare_multi_pack_index_one(r, r->objects->objectdir); prepare_packed_git_one(r, r->objects->objectdir, 1); prepare_alt_odb(r); - for (alt = r->objects->alt_odb_list; alt; alt = alt->next) + for (alt = r->objects->alt_odb_list; alt; alt = alt->next) { + prepare_multi_pack_index_one(r, alt->path); prepare_packed_git_one(r, alt->path, 0); + } rearrange_packed_git(r); prepare_packed_git_mru(r); r->objects->packed_git_initialized = 1; -- 2.18.0.24.g1b579a2ee9
[PATCH v2 10/24] multi-pack-index: write pack names in chunk
The multi-pack-index needs to track which packfiles it indexes. Store these in our first required chunk. Since filenames are not well structured, add padding to keep good alignment in later chunks. Modify the 'git multi-pack-index read' subcommand to output the existence of the pack-file name chunk. Modify t5319-multi-pack-index.sh to reflect this new output and the new expected number of chunks. Defense in depth: A pattern we are using in the multi-pack-index feature is to verify the data as we write it. We want to ensure we never write invalid data to the multi-pack-index. There are many checks that verify that the values we are writing fit the format definitions. This mainly helps developers while working on the feature, but it can also identify issues that only appear when dealing with very large data sets. These large sets are hard to encode into test cases. Signed-off-by: Derrick Stolee --- Documentation/technical/pack-format.txt | 6 + midx.c | 190 ++-- object-store.h | 2 + t/helper/test-read-midx.c | 7 + t/t5319-multi-pack-index.sh | 3 +- 5 files changed, 198 insertions(+), 10 deletions(-) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index e060e693f4..6c5a77475f 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -296,6 +296,12 @@ CHUNK LOOKUP: CHUNK DATA: + Packfile Names (ID: {'P', 'N', 'A', 'M'}) + Stores the packfile names as concatenated, null-terminated strings. + Packfiles must be listed in lexicographic order for fast lookups by + name. This is the only chunk not guaranteed to be a multiple of four + bytes in length, so should be the last chunk for alignment reasons. + (This section intentionally left incomplete.) TRAILER: diff --git a/midx.c b/midx.c index e79ffb5576..2fad99d1b8 100644 --- a/midx.c +++ b/midx.c @@ -13,6 +13,11 @@ #define MIDX_HASH_LEN 20 #define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + MIDX_HASH_LEN) +#define MIDX_MAX_CHUNKS 1 +#define MIDX_CHUNK_ALIGNMENT 4 +#define MIDX_CHUNKID_PACKNAMES 0x504e414d /* "PNAM" */ +#define MIDX_CHUNKLOOKUP_WIDTH (sizeof(uint32_t) + sizeof(uint64_t)) + static char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); @@ -27,6 +32,7 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) void *midx_map = NULL; uint32_t hash_version; char *midx_name = get_midx_filename(object_dir); + uint32_t i; fd = git_open(midx_name); @@ -83,6 +89,33 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) m->num_packs = get_be32(m->data + 8); + for (i = 0; i < m->num_chunks; i++) { + uint32_t chunk_id = get_be32(m->data + MIDX_HEADER_SIZE + +MIDX_CHUNKLOOKUP_WIDTH * i); + uint64_t chunk_offset = get_be64(m->data + MIDX_HEADER_SIZE + 4 + +MIDX_CHUNKLOOKUP_WIDTH * i); + + switch (chunk_id) { + case MIDX_CHUNKID_PACKNAMES: + m->chunk_pack_names = m->data + chunk_offset; + break; + + case 0: + die(_("terminating multi-pack-index chunk id appears earlier than expected")); + break; + + default: + /* +* Do nothing on unrecognized chunks, allowing future +* extensions to add optional chunks. +*/ + break; + } + } + + if (!m->chunk_pack_names) + die(_("multi-pack-index missing required pack-name chunk")); + return m; cleanup_fail: @@ -112,8 +145,11 @@ static size_t write_midx_header(struct hashfile *f, struct pack_list { struct packed_git **list; + char **names; uint32_t nr; - uint32_t alloc; + uint32_t alloc_list; + uint32_t alloc_names; + size_t pack_name_concat_len; }; static void add_pack_to_midx(const char *full_path, size_t full_path_len, @@ -122,27 +158,101 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len, struct pack_list *packs = (struct pack_list *)data; if (ends_with(file_name, ".idx")) { - ALLOC_GROW(packs->list, packs->nr + 1, packs->alloc); + ALLOC_GROW(packs->list, packs->nr + 1, packs->alloc_list); + ALLOC_GROW(packs->names, packs->nr + 1, packs->alloc_names); packs->list[packs->nr] = add_packed_git(full_path,
[PATCH v2 19/24] midx: use midx in abbreviation calculations
Signed-off-by: Derrick Stolee --- midx.c | 11 ++ midx.h | 3 ++ packfile.c | 6 packfile.h | 1 + sha1-name.c | 70 + t/t5319-multi-pack-index.sh | 3 +- 6 files changed, 93 insertions(+), 1 deletion(-) diff --git a/midx.c b/midx.c index 14514d6828..c258e3ebdf 100644 --- a/midx.c +++ b/midx.c @@ -201,6 +201,17 @@ int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32 MIDX_HASH_LEN, result); } +struct object_id *nth_midxed_object_oid(struct object_id *oid, + struct multi_pack_index *m, + uint32_t n) +{ + if (n >= m->num_objects) + return NULL; + + hashcpy(oid->hash, m->chunk_oid_lookup + m->hash_len * n); + return oid; +} + static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) { const unsigned char *offset_data; diff --git a/midx.h b/midx.h index 6b74a0640f..f7c2ec7893 100644 --- a/midx.h +++ b/midx.h @@ -7,6 +7,9 @@ struct multi_pack_index; struct multi_pack_index *load_multi_pack_index(const char *object_dir); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); +struct object_id *nth_midxed_object_oid(struct object_id *oid, + struct multi_pack_index *m, + uint32_t n); int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m); int prepare_multi_pack_index_one(struct repository *r, const char *object_dir); diff --git a/packfile.c b/packfile.c index 946d0c241f..20b743da91 100644 --- a/packfile.c +++ b/packfile.c @@ -963,6 +963,12 @@ struct packed_git *get_packed_git(struct repository *r) return r->objects->packed_git; } +struct multi_pack_index *get_multi_pack_index(struct repository *r) +{ + prepare_packed_git(r); + return r->objects->multi_pack_index; +} + struct list_head *get_packed_git_mru(struct repository *r) { prepare_packed_git(r); diff --git a/packfile.h b/packfile.h index b0eed44c0b..046280caf3 100644 --- a/packfile.h +++ b/packfile.h @@ -45,6 +45,7 @@ extern void install_packed_git(struct repository *r, struct packed_git *pack); struct packed_git *get_packed_git(struct repository *r); struct list_head *get_packed_git_mru(struct repository *r); +struct multi_pack_index *get_multi_pack_index(struct repository *r); /* * Give a rough count of objects in the repository. This sacrifices accuracy diff --git a/sha1-name.c b/sha1-name.c index 60d9ef3c7e..7dc71201e6 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -12,6 +12,7 @@ #include "packfile.h" #include "object-store.h" #include "repository.h" +#include "midx.h" static int get_oid_oneline(const char *, struct object_id *, struct commit_list *); @@ -149,6 +150,32 @@ static int match_sha(unsigned len, const unsigned char *a, const unsigned char * return 1; } +static void unique_in_midx(struct multi_pack_index *m, + struct disambiguate_state *ds) +{ + uint32_t num, i, first = 0; + const struct object_id *current = NULL; + num = m->num_objects; + + if (!num) + return; + + bsearch_midx(>bin_pfx, m, ); + + /* +* At this point, "first" is the location of the lowest object +* with an object name that could match "bin_pfx". See if we have +* 0, 1 or more objects that actually match(es). +*/ + for (i = first; i < num && !ds->ambiguous; i++) { + struct object_id oid; + current = nth_midxed_object_oid(, m, i); + if (!match_sha(ds->len, ds->bin_pfx.hash, current->hash)) + break; + update_candidates(ds, current); + } +} + static void unique_in_pack(struct packed_git *p, struct disambiguate_state *ds) { @@ -177,8 +204,12 @@ static void unique_in_pack(struct packed_git *p, static void find_short_packed_object(struct disambiguate_state *ds) { + struct multi_pack_index *m; struct packed_git *p; + for (m = get_multi_pack_index(the_repository); m && !ds->ambiguous; +m = m->next) + unique_in_midx(m, ds); for (p = get_packed_git(the_repository); p && !ds->ambiguous; p = p->next) unique_in_pack(p, ds); @@ -527,6 +558,42 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data) return 0; } +static void find_abbrev_len_for_midx(struct multi_pack_index *m, +struct min_abbrev_data *mad) +{ + int match = 0; + uint32_t num, first = 0; + struct object_id oid; + const struct object_id *mad_oid; + + if
[PATCH v2 18/24] midx: read objects from multi-pack-index
Signed-off-by: Derrick Stolee --- midx.c | 93 -- midx.h | 2 ++ object-store.h | 1 + packfile.c | 8 - 4 files changed, 101 insertions(+), 3 deletions(-) diff --git a/midx.c b/midx.c index 3dd5027dc6..14514d6828 100644 --- a/midx.c +++ b/midx.c @@ -4,7 +4,7 @@ #include "lockfile.h" #include "packfile.h" #include "object-store.h" -#include "packfile.h" +#include "sha1-lookup.h" #include "midx.h" #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ @@ -150,7 +150,8 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) m->num_objects = ntohl(m->chunk_oid_fanout[255]); - m->pack_names = xcalloc(m->num_packs, sizeof(const char *)); + m->packs = xcalloc(m->num_packs, sizeof(*m->packs)); + ALLOC_ARRAY(m->pack_names, m->num_packs); cur_pack_name = (const char *)m->chunk_pack_names; for (i = 0; i < m->num_packs; i++) { @@ -176,6 +177,94 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) return NULL; } +static int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id) +{ + struct strbuf pack_name = STRBUF_INIT; + + if (pack_int_id >= m->num_packs) + BUG("bad pack-int-id"); + + if (m->packs[pack_int_id]) + return 0; + + strbuf_addf(_name, "%s/pack/%s", m->object_dir, + m->pack_names[pack_int_id]); + + m->packs[pack_int_id] = add_packed_git(pack_name.buf, pack_name.len, 1); + strbuf_release(_name); + return !m->packs[pack_int_id]; +} + +int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result) +{ + return bsearch_hash(oid->hash, m->chunk_oid_fanout, m->chunk_oid_lookup, + MIDX_HASH_LEN, result); +} + +static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) +{ + const unsigned char *offset_data; + uint32_t offset32; + + offset_data = m->chunk_object_offsets + pos * MIDX_CHUNK_OFFSET_WIDTH; + offset32 = get_be32(offset_data + sizeof(uint32_t)); + + if (m->chunk_large_offsets && offset32 & MIDX_LARGE_OFFSET_NEEDED) { + if (sizeof(offset32) < sizeof(uint64_t)) + die(_("multi-pack-index stores a 64-bit offset, but off_t is too small")); + + offset32 ^= MIDX_LARGE_OFFSET_NEEDED; + return get_be64(m->chunk_large_offsets + sizeof(uint64_t) * offset32); + } + + return offset32; +} + +static uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos) +{ + return get_be32(m->chunk_object_offsets + pos * MIDX_CHUNK_OFFSET_WIDTH); +} + +static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *e, uint32_t pos) +{ + uint32_t pack_int_id; + struct packed_git *p; + + if (pos >= m->num_objects) + return 0; + + pack_int_id = nth_midxed_pack_int_id(m, pos); + + if (prepare_midx_pack(m, pack_int_id)) + die(_("error preparing packfile from multi-pack-index")); + p = m->packs[pack_int_id]; + + /* + * We are about to tell the caller where they can locate the + * requested object. We better make sure the packfile is + * still here and can be accessed before supplying that + * answer, as it may have been deleted since the MIDX was + * loaded! + */ + if (!is_pack_valid(p)) + return 0; + + e->offset = nth_midxed_offset(m, pos); + e->p = p; + + return 1; +} + +int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m) +{ + uint32_t pos; + + if (!bsearch_midx(oid, m, )) + return 0; + + return nth_midxed_pack_entry(m, e, pos); +} + int prepare_multi_pack_index_one(struct repository *r, const char *object_dir) { struct multi_pack_index *m = r->objects->multi_pack_index; diff --git a/midx.h b/midx.h index 731ad6f094..6b74a0640f 100644 --- a/midx.h +++ b/midx.h @@ -6,6 +6,8 @@ struct multi_pack_index; struct multi_pack_index *load_multi_pack_index(const char *object_dir); +int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); +int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m); int prepare_multi_pack_index_one(struct repository *r, const char *object_dir); int write_midx_file(const char *object_dir); diff --git a/object-store.h b/object-store.h index 7d67ad7aa9..03cc278758 100644 --- a/object-store.h +++ b/object-store.h @@ -106,6 +106,7 @@ struct multi_pack_index { const unsigned char *chunk_large_offsets; const char **pack_names; + struct packed_git **packs; char object_dir[FLEX_ARRAY]; }; diff --git a/packfile.c b/packfile.c index ff2df22a0b..946d0c241f 100644 --- a/packfile.c +++
[PATCH v2 15/24] midx: write object offsets
The final pair of chunks for the multi-pack-index file stores the object offsets. We default to using 32-bit offsets as in the pack-index version 1 format, but if there exists an offset larger than 32-bits, we use a trick similar to the pack-index version 2 format by storing all offsets at least 2^31 in a 64-bit table; we use the 32-bit table to point into that 64-bit table as necessary. We only store these 64-bit offsets if necessary, so create a test that manipulates a version 2 pack-index to fake a large offset. This allows us to test that the large offset table is created, but the data does not match the actual packfile offsets. The multi-pack-index offset does match the (corrupted) pack-index offset, so a future feature will compare these offsets during a 'verify' step. Signed-off-by: Derrick Stolee --- Documentation/technical/pack-format.txt | 15 +++- midx.c | 100 +++- object-store.h | 2 + t/helper/test-read-midx.c | 4 + t/t5319-multi-pack-index.sh | 45 --- 5 files changed, 151 insertions(+), 15 deletions(-) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 3215f7bfcd..cab5bdd2ff 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -311,7 +311,20 @@ CHUNK DATA: The OIDs for all objects in the MIDX are stored in lexicographic order in this chunk. - (This section intentionally left incomplete.) + Object Offsets (ID: {'O', 'O', 'F', 'F'}) + Stores two 4-byte values for every object. + 1: The pack-int-id for the pack storing this object. + 2: The offset within the pack. + If all offsets are less than 2^31, then the large offset chunk + will not exist and offsets are stored as in IDX v1. + If there is at least one offset value larger than 2^32-1, then + the large offset chunk must exist. If the large offset chunk + exists and the 31st bit is on, then removing that bit reveals + the row in the large offsets containing the 8-byte offset of + this object. + + [Optional] Object Large Offsets (ID: {'L', 'O', 'F', 'F'}) + 8-byte offsets into large packfiles. TRAILER: diff --git a/midx.c b/midx.c index 0f773e2585..71ca493107 100644 --- a/midx.c +++ b/midx.c @@ -14,13 +14,18 @@ #define MIDX_HASH_LEN 20 #define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + MIDX_HASH_LEN) -#define MIDX_MAX_CHUNKS 3 +#define MIDX_MAX_CHUNKS 5 #define MIDX_CHUNK_ALIGNMENT 4 #define MIDX_CHUNKID_PACKNAMES 0x504e414d /* "PNAM" */ #define MIDX_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */ #define MIDX_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */ +#define MIDX_CHUNKID_OBJECTOFFSETS 0x4f4f4646 /* "OOFF" */ +#define MIDX_CHUNKID_LARGEOFFSETS 0x4c4f4646 /* "LOFF" */ #define MIDX_CHUNKLOOKUP_WIDTH (sizeof(uint32_t) + sizeof(uint64_t)) #define MIDX_CHUNK_FANOUT_SIZE (sizeof(uint32_t) * 256) +#define MIDX_CHUNK_OFFSET_WIDTH (2 * sizeof(uint32_t)) +#define MIDX_CHUNK_LARGE_OFFSET_WIDTH (sizeof(uint64_t)) +#define MIDX_LARGE_OFFSET_NEEDED 0x8000 static char *get_midx_filename(const char *object_dir) { @@ -113,6 +118,14 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) m->chunk_oid_lookup = m->data + chunk_offset; break; + case MIDX_CHUNKID_OBJECTOFFSETS: + m->chunk_object_offsets = m->data + chunk_offset; + break; + + case MIDX_CHUNKID_LARGEOFFSETS: + m->chunk_large_offsets = m->data + chunk_offset; + break; + case 0: die(_("terminating multi-pack-index chunk id appears earlier than expected")); break; @@ -132,6 +145,8 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) die(_("multi-pack-index missing required OID fanout chunk")); if (!m->chunk_oid_lookup) die(_("multi-pack-index missing required OID lookup chunk")); + if (!m->chunk_object_offsets) + die(_("multi-pack-index missing required object offsets chunk")); m->num_objects = ntohl(m->chunk_oid_fanout[255]); @@ -466,6 +481,56 @@ static size_t write_midx_oid_lookup(struct hashfile *f, unsigned char hash_len, return written; } +static size_t write_midx_object_offsets(struct hashfile *f, int large_offset_needed, + struct pack_midx_entry *objects, uint32_t nr_objects) +{ + struct pack_midx_entry *list = objects; + uint32_t i, nr_large_offset = 0; + size_t written = 0; + + for (i
[PATCH v2 16/24] config: create core.multiPackIndex setting
The core.multiPackIndex config setting controls the multi-pack- index (MIDX) feature. If false, the setting will disable all reads from the multi-pack-index file. Add comparison commands in t5319-multi-pack-index.sh to check typical Git behavior remains the same as the config setting is turned on and off. This currently includes 'git rev-list' and 'git log' commands to trigger several object database reads. Signed-off-by: Derrick Stolee --- Documentation/config.txt| 4 +++ cache.h | 1 + config.c| 5 environment.c | 1 + t/t5319-multi-pack-index.sh | 55 +++-- 5 files changed, 57 insertions(+), 9 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index ab641bf5a9..ab895ebb32 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -908,6 +908,10 @@ core.commitGraph:: Enable git commit graph feature. Allows reading from the commit-graph file. +core.multiPackIndex:: + Use the multi-pack-index file to track multiple packfiles using a + single index. See linkgit:technical/multi-pack-index[1]. + core.sparseCheckout:: Enable "sparse checkout" feature. See section "Sparse checkout" in linkgit:git-read-tree[1] for more information. diff --git a/cache.h b/cache.h index 89a107a7f7..d12aa49710 100644 --- a/cache.h +++ b/cache.h @@ -814,6 +814,7 @@ extern char *git_replace_ref_base; extern int fsync_object_files; extern int core_preload_index; extern int core_commit_graph; +extern int core_multi_pack_index; extern int core_apply_sparse_checkout; extern int precomposed_unicode; extern int protect_hfs; diff --git a/config.c b/config.c index fbbf0f8e9f..95d8da4243 100644 --- a/config.c +++ b/config.c @@ -1313,6 +1313,11 @@ static int git_default_core_config(const char *var, const char *value) return 0; } + if (!strcmp(var, "core.multipackindex")) { + core_multi_pack_index = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "core.sparsecheckout")) { core_apply_sparse_checkout = git_config_bool(var, value); return 0; diff --git a/environment.c b/environment.c index 2a6de2330b..b9bc919cdb 100644 --- a/environment.c +++ b/environment.c @@ -67,6 +67,7 @@ enum object_creation_mode object_creation_mode = OBJECT_CREATION_MODE; char *notes_ref_name; int grafts_replace_parents = 1; int core_commit_graph; +int core_multi_pack_index; int core_apply_sparse_checkout; int merge_log_config = -1; int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */ diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index ccde83bca4..f7f55ea181 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -3,6 +3,8 @@ test_description='multi-pack-indexes' . ./test-lib.sh +objdir=.git/objects + midx_read_expect() { NUM_PACKS=$1 NUM_OBJECTS=$2 @@ -61,12 +63,42 @@ test_expect_success 'write midx with one v1 pack' ' midx_read_expect 1 17 4 . ' +midx_git_two_modes() { + git -c core.multiPackIndex=false $1 >expect && + git -c core.multiPackIndex=true $1 >actual && + test_cmp expect actual +} + +compare_results_with_midx() { + MSG=$1 + test_expect_success "check normal git operations: $MSG" ' + midx_git_two_modes "rev-list --objects --all" && + midx_git_two_modes "log --raw" + ' +} + test_expect_success 'write midx with one v2 pack' ' - git pack-objects --index-version=2,0x40 pack/test expect && + git -c core.multiPackIndex=true $1 >actual && + test_cmp expect actual +} + +compare_results_with_midx() { + MSG=$1 + test_expect_success "check normal git operations: $MSG" ' + midx_git_two_modes "rev-list --objects --all" && + midx_git_two_modes "log --raw" + ' +} + +compare_results_with_midx "one v2 pack" + test_expect_success 'Add more objects' ' for i in `test_seq 6 10` do @@ -92,11 +124,13 @@ test_expect_success 'Add more objects' ' ' test_expect_success 'write midx with two packs' ' - git pack-objects --index-version=1 pack/test-2 obj-list && git update-ref HEAD $commit && - git pack-objects --index-version=2 pack/test-pack [] corrupt_data() { -- 2.18.0.24.g1b579a2ee9
[PATCH v2 02/24] multi-pack-index: add format details
The multi-pack-index feature generalizes the existing pack-index feature by indexing objects across multiple pack-files. Describe the basic file format, using a 12-byte header followed by a lookup table for a list of "chunks" which will be described later. The file ends with a footer containing a checksum using the hash algorithm. The header allows later versions to create breaking changes by advancing the version number. We can also change the hash algorithm using a different version value. We will add the individual chunk format information as we introduce the code that writes that information. Signed-off-by: Derrick Stolee --- Documentation/technical/pack-format.txt | 49 + 1 file changed, 49 insertions(+) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 70a99fd142..e060e693f4 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -252,3 +252,52 @@ Pack file entry: <+ corresponding packfile. 20-byte SHA-1-checksum of all of the above. + +== multi-pack-index (MIDX) files have the following format: + +The multi-pack-index files refer to multiple pack-files and loose objects. + +In order to allow extensions that add extra data to the MIDX, we organize +the body into "chunks" and provide a lookup table at the beginning of the +body. The header includes certain length values, such as the number of packs, +the number of base MIDX files, hash lengths and types. + +All 4-byte numbers are in network order. + +HEADER: + + 4-byte signature: + The signature is: {'M', 'I', 'D', 'X'} + + 1-byte version number: + Git only writes or recognizes version 1. + + 1-byte Object Id Version + Git only writes or recognizes version 1 (SHA1). + + 1-byte number of "chunks" + + 1-byte number of base multi-pack-index files: + This value is currently always zero. + + 4-byte number of pack files + +CHUNK LOOKUP: + + (C + 1) * 12 bytes providing the chunk offsets: + First 4 bytes describe chunk id. Value 0 is a terminating label. + Other 8 bytes provide offset in current file for chunk to start. + (Chunks are provided in file-order, so you can infer the length + using the next chunk position if necessary.) + + The remaining data in the body is described one chunk at a time, and + these chunks may be given in any order. Chunks are required unless + otherwise specified. + +CHUNK DATA: + + (This section intentionally left incomplete.) + +TRAILER: + + 20-byte SHA1-checksum of the above contents. -- 2.18.0.24.g1b579a2ee9
[PATCH v2 13/24] midx: write object ids in a chunk
Signed-off-by: Derrick Stolee --- Documentation/technical/pack-format.txt | 4 ++ midx.c | 51 ++--- object-store.h | 1 + t/helper/test-read-midx.c | 2 + t/t5319-multi-pack-index.sh | 4 +- 5 files changed, 55 insertions(+), 7 deletions(-) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 6c5a77475f..78ee0489c6 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -302,6 +302,10 @@ CHUNK DATA: name. This is the only chunk not guaranteed to be a multiple of four bytes in length, so should be the last chunk for alignment reasons. + OID Lookup (ID: {'O', 'I', 'D', 'L'}) + The OIDs for all objects in the MIDX are stored in lexicographic + order in this chunk. + (This section intentionally left incomplete.) TRAILER: diff --git a/midx.c b/midx.c index 648a501d74..aec85b8181 100644 --- a/midx.c +++ b/midx.c @@ -14,9 +14,10 @@ #define MIDX_HASH_LEN 20 #define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + MIDX_HASH_LEN) -#define MIDX_MAX_CHUNKS 1 +#define MIDX_MAX_CHUNKS 2 #define MIDX_CHUNK_ALIGNMENT 4 #define MIDX_CHUNKID_PACKNAMES 0x504e414d /* "PNAM" */ +#define MIDX_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */ #define MIDX_CHUNKLOOKUP_WIDTH (sizeof(uint32_t) + sizeof(uint64_t)) static char *get_midx_filename(const char *object_dir) @@ -102,6 +103,10 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) m->chunk_pack_names = m->data + chunk_offset; break; + case MIDX_CHUNKID_OIDLOOKUP: + m->chunk_oid_lookup = m->data + chunk_offset; + break; + case 0: die(_("terminating multi-pack-index chunk id appears earlier than expected")); break; @@ -117,6 +122,8 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) if (!m->chunk_pack_names) die(_("multi-pack-index missing required pack-name chunk")); + if (!m->chunk_oid_lookup) + die(_("multi-pack-index missing required OID lookup chunk")); m->pack_names = xcalloc(m->num_packs, sizeof(const char *)); @@ -127,7 +134,7 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) cur_pack_name += strlen(cur_pack_name) + 1; if (i && strcmp(m->pack_names[i], m->pack_names[i - 1]) <= 0) { - error("MIDX pack names out of order: '%s' before '%s'", + error(_("multi-pack-index pack names out of order: '%s' before '%s'"), m->pack_names[i - 1], m->pack_names[i]); goto cleanup_fail; @@ -394,6 +401,32 @@ static size_t write_midx_pack_names(struct hashfile *f, return written; } +static size_t write_midx_oid_lookup(struct hashfile *f, unsigned char hash_len, + struct pack_midx_entry *objects, + uint32_t nr_objects) +{ + struct pack_midx_entry *list = objects; + uint32_t i; + size_t written = 0; + + for (i = 0; i < nr_objects; i++) { + struct pack_midx_entry *obj = list++; + + if (i < nr_objects - 1) { + struct pack_midx_entry *next = list; + if (oidcmp(>oid, >oid) >= 0) + BUG("OIDs not in order: %s >= %s", + oid_to_hex(>oid), + oid_to_hex(>oid)); + } + + hashwrite(f, obj->oid.hash, (int)hash_len); + written += hash_len; + } + + return written; +} + int write_midx_file(const char *object_dir) { unsigned char cur_chunk, num_chunks = 0; @@ -407,7 +440,7 @@ int write_midx_file(const char *object_dir) uint32_t chunk_ids[MIDX_MAX_CHUNKS + 1]; uint64_t chunk_offsets[MIDX_MAX_CHUNKS + 1]; uint32_t nr_entries; - struct pack_midx_entry *entries; + struct pack_midx_entry *entries = NULL; midx_name = get_midx_filename(object_dir); if (safe_create_leading_directories(midx_name)) { @@ -440,7 +473,7 @@ int write_midx_file(const char *object_dir) FREE_AND_NULL(midx_name); cur_chunk = 0; - num_chunks = 1; + num_chunks = 2; written = write_midx_header(f, num_chunks, packs.nr); @@ -448,9 +481,13 @@ int write_midx_file(const char *object_dir) chunk_offsets[cur_chunk] = written + (num_chunks + 1) * MIDX_CHUNKLOOKUP_WIDTH; cur_chunk++; - chunk_ids[cur_chunk] = 0; + chunk_ids[cur_chunk] =
[PATCH v2 14/24] midx: write object id fanout chunk
Signed-off-by: Derrick Stolee --- Documentation/technical/pack-format.txt | 5 +++ midx.c | 53 +++-- object-store.h | 1 + t/helper/test-read-midx.c | 4 +- t/t5319-multi-pack-index.sh | 18 + 5 files changed, 69 insertions(+), 12 deletions(-) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 78ee0489c6..3215f7bfcd 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -302,6 +302,11 @@ CHUNK DATA: name. This is the only chunk not guaranteed to be a multiple of four bytes in length, so should be the last chunk for alignment reasons. + OID Fanout (ID: {'O', 'I', 'D', 'F'}) + The ith entry, F[i], stores the number of OIDs with first + byte at most i. Thus F[255] stores the total + number of objects. + OID Lookup (ID: {'O', 'I', 'D', 'L'}) The OIDs for all objects in the MIDX are stored in lexicographic order in this chunk. diff --git a/midx.c b/midx.c index aec85b8181..0f773e2585 100644 --- a/midx.c +++ b/midx.c @@ -14,11 +14,13 @@ #define MIDX_HASH_LEN 20 #define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + MIDX_HASH_LEN) -#define MIDX_MAX_CHUNKS 2 +#define MIDX_MAX_CHUNKS 3 #define MIDX_CHUNK_ALIGNMENT 4 #define MIDX_CHUNKID_PACKNAMES 0x504e414d /* "PNAM" */ +#define MIDX_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */ #define MIDX_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */ #define MIDX_CHUNKLOOKUP_WIDTH (sizeof(uint32_t) + sizeof(uint64_t)) +#define MIDX_CHUNK_FANOUT_SIZE (sizeof(uint32_t) * 256) static char *get_midx_filename(const char *object_dir) { @@ -103,6 +105,10 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) m->chunk_pack_names = m->data + chunk_offset; break; + case MIDX_CHUNKID_OIDFANOUT: + m->chunk_oid_fanout = (uint32_t *)(m->data + chunk_offset); + break; + case MIDX_CHUNKID_OIDLOOKUP: m->chunk_oid_lookup = m->data + chunk_offset; break; @@ -122,9 +128,13 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir) if (!m->chunk_pack_names) die(_("multi-pack-index missing required pack-name chunk")); + if (!m->chunk_oid_fanout) + die(_("multi-pack-index missing required OID fanout chunk")); if (!m->chunk_oid_lookup) die(_("multi-pack-index missing required OID lookup chunk")); + m->num_objects = ntohl(m->chunk_oid_fanout[255]); + m->pack_names = xcalloc(m->num_packs, sizeof(const char *)); cur_pack_name = (const char *)m->chunk_pack_names; @@ -401,6 +411,35 @@ static size_t write_midx_pack_names(struct hashfile *f, return written; } +static size_t write_midx_oid_fanout(struct hashfile *f, + struct pack_midx_entry *objects, + uint32_t nr_objects) +{ + struct pack_midx_entry *list = objects; + struct pack_midx_entry *last = objects + nr_objects; + uint32_t count = 0; + uint32_t i; + + /* + * Write the first-level table (the list is sorted, + * but we use a 256-entry lookup to be able to avoid + * having to do eight extra binary search iterations). + */ + for (i = 0; i < 256; i++) { + struct pack_midx_entry *next = list; + + while (next < last && next->oid.hash[0] == i) { + count++; + next++; + } + + hashwrite_be32(f, count); + list = next; + } + + return MIDX_CHUNK_FANOUT_SIZE; +} + static size_t write_midx_oid_lookup(struct hashfile *f, unsigned char hash_len, struct pack_midx_entry *objects, uint32_t nr_objects) @@ -473,7 +512,7 @@ int write_midx_file(const char *object_dir) FREE_AND_NULL(midx_name); cur_chunk = 0; - num_chunks = 2; + num_chunks = 3; written = write_midx_header(f, num_chunks, packs.nr); @@ -481,9 +520,13 @@ int write_midx_file(const char *object_dir) chunk_offsets[cur_chunk] = written + (num_chunks + 1) * MIDX_CHUNKLOOKUP_WIDTH; cur_chunk++; - chunk_ids[cur_chunk] = MIDX_CHUNKID_OIDLOOKUP; + chunk_ids[cur_chunk] = MIDX_CHUNKID_OIDFANOUT; chunk_offsets[cur_chunk] = chunk_offsets[cur_chunk - 1] + packs.pack_name_concat_len; + cur_chunk++; + chunk_ids[cur_chunk] = MIDX_CHUNKID_OIDLOOKUP; + chunk_offsets[cur_chunk] = chunk_offsets[cur_chunk - 1] + MIDX_CHUNK_FANOUT_SIZE; +
[PATCH v2 12/24] midx: sort and deduplicate objects from packfiles
Before writing a list of objects and their offsets to a multi-pack-index, we need to collect the list of objects contained in the packfiles. There may be multiple copies of some objects, so this list must be deduplicated. It is possible to artificially get into a state where there are many duplicate copies of objects. That can create high memory pressure if we are to create a list of all objects before de-duplication. To reduce this memory pressure without a significant performance drop, automatically group objects by the first byte of their object id. Use the IDX fanout tables to group the data, copy to a local array, then sort. Copy only the de-duplicated entries. Select the duplicate based on the most-recent modified time of a packfile containing the object. Signed-off-by: Derrick Stolee --- midx.c | 127 + packfile.c | 17 +++ packfile.h | 2 + 3 files changed, 146 insertions(+) diff --git a/midx.c b/midx.c index 8bbc966f6b..648a501d74 100644 --- a/midx.c +++ b/midx.c @@ -4,6 +4,7 @@ #include "lockfile.h" #include "packfile.h" #include "object-store.h" +#include "packfile.h" #include "midx.h" #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ @@ -187,6 +188,13 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len, return; } + if (open_pack_index(packs->list[packs->nr])) { + warning(_("failed to open pack-index '%s'"), + full_path); + close_pack(packs->list[packs->nr]); + FREE_AND_NULL(packs->list[packs->nr]); + } + packs->names[packs->nr] = xstrdup(file_name); packs->pack_name_concat_len += strlen(file_name) + 1; packs->nr++; @@ -227,6 +235,120 @@ static void sort_packs_by_name(char **pack_names, uint32_t nr_packs, uint32_t *p FREE_AND_NULL(pairs); } +struct pack_midx_entry { + struct object_id oid; + uint32_t pack_int_id; + time_t pack_mtime; + uint64_t offset; +}; + +static int midx_oid_compare(const void *_a, const void *_b) +{ + const struct pack_midx_entry *a = (const struct pack_midx_entry *)_a; + const struct pack_midx_entry *b = (const struct pack_midx_entry *)_b; + int cmp = oidcmp(>oid, >oid); + + if (cmp) + return cmp; + + if (a->pack_mtime > b->pack_mtime) + return -1; + else if (a->pack_mtime < b->pack_mtime) + return 1; + + return a->pack_int_id - b->pack_int_id; +} + +static void fill_pack_entry(uint32_t pack_int_id, + struct packed_git *p, + uint32_t cur_object, + struct pack_midx_entry *entry) +{ + if (!nth_packed_object_oid(>oid, p, cur_object)) + die(_("failed to locate object %d in packfile"), cur_object); + + entry->pack_int_id = pack_int_id; + entry->pack_mtime = p->mtime; + + entry->offset = nth_packed_object_offset(p, cur_object); +} + +/* + * It is possible to artificially get into a state where there are many + * duplicate copies of objects. That can create high memory pressure if + * we are to create a list of all objects before de-duplication. To reduce + * this memory pressure without a significant performance drop, automatically + * group objects by the first byte of their object id. Use the IDX fanout + * tables to group the data, copy to a local array, then sort. + * + * Copy only the de-duplicated entries (selected by most-recent modified time + * of a packfile containing the object). + */ +static struct pack_midx_entry *get_sorted_entries(struct packed_git **p, + uint32_t *perm, + uint32_t nr_packs, + uint32_t *nr_objects) +{ + uint32_t cur_fanout, cur_pack, cur_object; + uint32_t alloc_fanout, alloc_objects, total_objects = 0; + struct pack_midx_entry *entries_by_fanout = NULL; + struct pack_midx_entry *deduplicated_entries = NULL; + + for (cur_pack = 0; cur_pack < nr_packs; cur_pack++) { + total_objects += p[cur_pack]->num_objects; + } + + /* +* As we de-duplicate by fanout value, we expect the fanout +* slices to be evenly distributed, with some noise. Hence, +* allocate slightly more than one 256th. +*/ + alloc_objects = alloc_fanout = total_objects > 3200 ? total_objects / 200 : 16; + + ALLOC_ARRAY(entries_by_fanout, alloc_fanout); + ALLOC_ARRAY(deduplicated_entries, alloc_objects); + *nr_objects = 0; + + for (cur_fanout = 0; cur_fanout < 256; cur_fanout++) { + uint32_t nr_fanout = 0; + + for (cur_pack = 0; cur_pack < nr_packs; cur_pack++) { +
[PATCH v2 09/24] multi-pack-index: read packfile list
When constructing a multi-pack-index file for a given object directory, read the files within the enclosed pack directory and find matches that end with ".idx" and find the correct paired packfile using add_packed_git(). Signed-off-by: Derrick Stolee --- midx.c | 45 - t/t5319-multi-pack-index.sh | 16 ++--- 2 files changed, 52 insertions(+), 9 deletions(-) diff --git a/midx.c b/midx.c index 0977397d6a..e79ffb5576 100644 --- a/midx.c +++ b/midx.c @@ -1,6 +1,8 @@ #include "cache.h" #include "csum-file.h" +#include "dir.h" #include "lockfile.h" +#include "packfile.h" #include "object-store.h" #include "midx.h" @@ -107,12 +109,40 @@ static size_t write_midx_header(struct hashfile *f, return MIDX_HEADER_SIZE; } +struct pack_list +{ + struct packed_git **list; + uint32_t nr; + uint32_t alloc; +}; + +static void add_pack_to_midx(const char *full_path, size_t full_path_len, +const char *file_name, void *data) +{ + struct pack_list *packs = (struct pack_list *)data; + + if (ends_with(file_name, ".idx")) { + ALLOC_GROW(packs->list, packs->nr + 1, packs->alloc); + + packs->list[packs->nr] = add_packed_git(full_path, +full_path_len, +0); + if (!packs->list[packs->nr]) + warning(_("failed to add packfile '%s'"), + full_path); + else + packs->nr++; + } +} + int write_midx_file(const char *object_dir) { unsigned char num_chunks = 0; char *midx_name; + uint32_t i; struct hashfile *f; struct lock_file lk; + struct pack_list packs; midx_name = get_midx_filename(object_dir); if (safe_create_leading_directories(midx_name)) { @@ -121,14 +151,27 @@ int write_midx_file(const char *object_dir) midx_name); } + packs.nr = 0; + packs.alloc = 16; + packs.list = NULL; + ALLOC_ARRAY(packs.list, packs.alloc); + + for_each_file_in_pack_dir(object_dir, add_pack_to_midx, ); + hold_lock_file_for_update(, midx_name, LOCK_DIE_ON_ERROR); f = hashfd(lk.tempfile->fd, lk.tempfile->filename.buf); FREE_AND_NULL(midx_name); - write_midx_header(f, num_chunks, 0); + write_midx_header(f, num_chunks, packs.nr); finalize_hashfile(f, NULL, CSUM_FSYNC | CSUM_HASH_IN_STREAM); commit_lock_file(); + for (i = 0; i < packs.nr; i++) { + close_pack(packs.list[i]); + FREE_AND_NULL(packs.list[i]); + } + + FREE_AND_NULL(packs.list); return 0; } diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index d533fd0dbc..4d4d6ca0a6 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -4,8 +4,9 @@ test_description='multi-pack-indexes' . ./test-lib.sh midx_read_expect() { + NUM_PACKS=$1 cat >expect <<- EOF - header: 4d494458 1 0 0 + header: 4d494458 1 0 $NUM_PACKS object_dir: . EOF test-tool read-midx . >actual && @@ -15,8 +16,7 @@ midx_read_expect() { test_expect_success 'write midx with no packs' ' test_when_finished rm pack/multi-pack-index && git multi-pack-index --object-dir=. write && - test_path_is_file pack/multi-pack-index && - midx_read_expect + midx_read_expect 0 ' test_expect_success 'create objects' ' @@ -47,13 +47,13 @@ test_expect_success 'write midx with one v1 pack' ' pack=$(git pack-objects --index-version=1 pack/test obj-list && git update-ref HEAD $commit && - git pack-objects --index-version=2 test-pack