Re: in case you want a use-case with lots of submodules
On Mon, Jun 19, 2017 at 1:20 PM, Yaroslav Halchenko wrote: > > On Mon, 19 Jun 2017, Stefan Beller wrote: > >> On Mon, Jun 19, 2017 at 8:59 AM, Yaroslav Halchenko >> wrote: >> > Hi All, > >> > On a recent trip I've listened to the git minutes podcast episode and >> > got excited to hear Stefan Beller (CCed just in case) describing >> > ongoing work on submodules mechanism. I got excited, since e.g. >> > performance improvements would be of great benefit to us too. > >> If you're mostly interested in performance improvements of the status >> quo (i.e. "make git-submodule fast"), then the work of Prathamesh >> Chavan (cc'd) might be more interesting to you than what I do. >> He is porting git-submodule (which is mostly a shell script nowadays) >> to C, such that we can save a lot of process invocations and can do >> processing within one process. > > ah -- cool. I would be eager to test it out, thanks! would be > interesting to see if it positively affects our overall performance. > Pointers to that development would be welcome! The latest from today: https://public-inbox.org/git/CAME+mvUQJFneV7b1G7zmAidP-5L=nimvy43v0ug-gtesr83...@mail.gmail.com/ > >> > http://datasets.datalad.org ATM provides quite a sizeable (ATM 370 >> > repositories, up to 4 levels deep) hierarchy of git/git-annex >> > repositories all tied together via git submodules mechanism. And as the >> > collection grows, interactions with it become slower, so additional >> > options (such as --ignore-submodules=dirty to status) become our >> > friends. > >> I am not as much concerned about the 370 number than about the >> 4 layers of nesting. In my experience the nested submodule case >> is a little bit error prone and the bug reports are not as frequent as >> there are not as many users of nesting, yet(?) > > well -- part of the story here is that we are forced to use/have full > blown .git/ directories (for git-annex symlinks to content files to > work) within submodules instead of .git file with a reference under > parent's .git/modules. So we can 'slice' at any level and I > guess that is why may be avoiding some possibly issues due to nesting > and the "parent has all .git/modules" approach. That sounds like you either want to configure to have the submodules git dirs in-place or you want to convince git-annex to learn about the gitdir pointer files. > >> In a neighboring thread on the mailing list we have a discussion >> on the usefulness of being on branches than in detached HEAD >> in the submodules. >> https://public-inbox.org/git/0092cdd27c5f9d418b0f3e9b5d05be0801028...@sbs2011.opfingen.plc2.de/ > >> This would not break non-ambiguously, rather it would add >> ease of use. > > that is indeed a common caveat... I am not sure if any heuristic > approach would provide a 'bullet proof' solution. I might even prefer a > hardcoded 'branch-name' to be listed/associated with each submodule > within .gitmodules. hardcoded as submodule.NAME.branch, maybe? https://git-scm.com/docs/gitmodules > In the datalad case, detached HEAD is common So you are accustomed to detached HEADs and would not gain much from being back on a branch? That's cool, too. > whenever someone installs "outdated" (branch of which progressed > forward) submodule. In this case we just check if the branch after "git > clone" (but before git submodule update) includes the pointed by > Subproject commit, and if so -- we announce that it must be the branch > (so far it is always "master" branch anyways ;) ) heh, having just one branch. That is retro-style. :)
[PATCH/RFC] Cleanup Documentation
Make following changes to the git-submodule documentation: * Remove redundancy * Remove unclear back reference * Use more appropriate word * Quote important word Suggestions-by: Stefan Beller Signed-off-by: Kaartic Sivaraam --- Currently used the word "canonical" instead of "humanish". If that word sounds more suitable then this is a [PATCH] and not a [PATCH/RFC]. Documentation/git-submodule.txt | 37 +++-- 1 file changed, 15 insertions(+), 22 deletions(-) diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index 74bc6200d..045fef417 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -63,14 +63,6 @@ add [-b ] [-f|--force] [--name ] [--reference ] [--dep to the changeset to be committed next to the current project: the current project is termed the "superproject". + -This requires at least one argument: . The optional -argument is the relative location for the cloned submodule -to exist in the superproject. If is not given, the -"humanish" part of the source repository is used ("repo" for -"/path/to/repo.git" and "foo" for "host.xz:foo/.git"). -The is also used as the submodule's logical name in its -configuration entries unless `--name` is used to specify a logical name. -+ is the URL of the new submodule's origin repository. This may be either an absolute URL, or (if it begins with ./ or ../), the location relative to the superproject's default remote @@ -87,21 +79,22 @@ If the superproject doesn't have a default remote configured the superproject is its own authoritative upstream and the current working directory is used instead. + - is the relative location for the cloned submodule to -exist in the superproject. If does not exist, then the -submodule is created by cloning from the named URL. If does -exist and is already a valid Git repository, then this is added -to the changeset without cloning. This second form is provided -to ease creating a new submodule from scratch, and presumes -the user will later push the submodule to the given URL. +The optional argument is the relative location for the cloned +submodule to exist in the superproject. If is not given, the +canonical part of the source repository is used ("repo" for +"/path/to/repo.git" and "foo" for "host.xz:foo/.git"). If +exists and is already a valid Git repository, then this is added +to the changeset without cloning. The is also used as the +submodule's logical name in its configuration entries unless `--name` +is used to specify a logical name. + -In either case, the given URL is recorded into .gitmodules for -use by subsequent users cloning the superproject. If the URL is -given relative to the superproject's repository, the presumption -is the superproject and submodule repositories will be kept -together in the same relative location, and only the -superproject's URL needs to be provided: git-submodule will correctly -locate the submodule using the relative URL in .gitmodules. +The given URL is recorded into `.gitmodules` for use by subsequent users +cloning the superproject. If the URL is given relative to the +superproject's repository, the presumption is the superproject and +submodule repositories will be kept together in the same relative +location, and only the superproject's URL needs to be provided. +git-submodule will correctly locate the submodule using the relative +URL in .gitmodules. status [--cached] [--recursive] [--] [...]:: Show the status of the submodules. This will print the SHA-1 of the -- 2.11.0
[PATCH 3/3] Add tests for the contextual initial status message
Also, fixed minor spacing issue Helped-by: Ævar Arnfjörð Bjarmason Helped-by: Junio C Hamano Signed-off-by: Kaartic Sivaraam --- t/t7508-status.sh | 32 +++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/t/t7508-status.sh b/t/t7508-status.sh index fb00e6d9b..6cf3af9d9 100755 --- a/t/t7508-status.sh +++ b/t/t7508-status.sh @@ -1494,9 +1494,39 @@ EOF test_expect_success 'git commit -m will commit a staged but ignored submodule' ' git commit -uno -m message && git status -s --ignore-submodules=dirty >output && -test_i18ngrep ! "^M. sm" output && + test_i18ngrep ! "^M. sm" output && git config --remove-section submodule.subname && git config -f .gitmodules --remove-section submodule.subname ' +test_expect_success '"No commits yet" should be noted in status output' ' + git checkout --orphan empty-branch-1 && + git status >output && + test_i18ngrep "No commits yet" output +' + +test_expect_success '"No commits yet" should not be noted in status output' ' + git checkout --orphan empty-branch-2 && + test_commit test-commit-1 && + git status >output && + test_i18ngrep ! "No commits yet" output +' + +test_expect_success '"Initial commit" should be noted in commit template' ' + git checkout --orphan empty-branch-3 && + touch to_be_committed_1 && + git add to_be_committed_1 && + git commit --dry-run >output && + test_i18ngrep "Initial commit" output +' + +test_expect_success '"Initial commit" should not be noted in commit template' ' + git checkout --orphan empty-branch-4 && + test_commit test-commit-2 && + touch to_be_committed_2 && + git add to_be_committed_2 && + git commit --dry-run >output && + test_i18ngrep ! "Initial commit" output +' + test_done -- 2.11.0
[PATCH 2/3] Update test(s) that used old status message
The tests that checked for old status message have been updated to check for the new status message Signed-off-by: Kaartic Sivaraam --- t/t7501-commit.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t7501-commit.sh b/t/t7501-commit.sh index 0b6da7ae1..fa61b1a4e 100755 --- a/t/t7501-commit.sh +++ b/t/t7501-commit.sh @@ -18,7 +18,7 @@ test_expect_success 'initial status' ' echo bongo bongo >file && git add file && git status >actual && - test_i18ngrep "Initial commit" actual + test_i18ngrep "No commits yet" actual ' test_expect_success 'fail initial amend' ' -- 2.11.0
[PATCH 1/3] Contextually notify user about an initial commit
"git status" indicated "Initial commit" when HEAD points at an unborn branch. This message is shared with the commit log template "git commit" prepares for the user when creating a commit (i.e. "You are about to create the initial commit"), and is OK as long as the reader is aware of the nature of the message (i.e. it guides the user working toward the next commit), but was confusing to new users, especially the ones who do "git commit -m message" without having a chance to pay attention to the commit log template. The "Initial commit" indication wasn't an issue in the commit template. Taking that into consideration, a good solution would be to contextually use different messages to indicate the user that there were no commits in this branch. A few alternatives considered were, * Waiting for initial commit * Your current branch does not have any commits * Current branch waiting for initial commit The most succint one, "No commits yet", among the alternatives was chosen. Helped-by: Junio C Hamano Signed-off-by: Kaartic Sivaraam --- builtin/commit.c | 1 + wt-status.c | 5 - wt-status.h | 1 + 3 files changed, 6 insertions(+), 1 deletion(-) diff --git a/builtin/commit.c b/builtin/commit.c index 1d805f5da..0f36d2ac3 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1648,6 +1648,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix) usage_with_options(builtin_commit_usage, builtin_commit_options); status_init_config(&s, git_commit_config); + s.commit_template = 1; status_format = STATUS_FORMAT_NONE; /* Ignore status.short */ s.colopts = 0; diff --git a/wt-status.c b/wt-status.c index 037548496..e6a9ee34f 100644 --- a/wt-status.c +++ b/wt-status.c @@ -1576,7 +1576,10 @@ static void wt_longstatus_print(struct wt_status *s) if (s->is_initial) { status_printf_ln(s, color(WT_STATUS_HEADER, s), "%s", ""); - status_printf_ln(s, color(WT_STATUS_HEADER, s), _("Initial commit")); + status_printf_ln(s, color(WT_STATUS_HEADER, s), +s->commit_template +? _("Initial commit") +: _("No commits yet")); status_printf_ln(s, color(WT_STATUS_HEADER, s), "%s", ""); } diff --git a/wt-status.h b/wt-status.h index 6018c627b..782b2997f 100644 --- a/wt-status.h +++ b/wt-status.h @@ -76,6 +76,7 @@ struct wt_status { char color_palette[WT_STATUS_MAXSLOT][COLOR_MAXLEN]; unsigned colopts; int null_termination; + int commit_template; int show_branch; int hints; -- 2.11.0
Re: [showing-off RFC/PATCH 26/26] diff.c: have a "machine parseable" move coloring
+ Ævar, who was not part of the email where I copied all recipients from for this series. On Mon, Jun 19, 2017 at 7:48 PM, Stefan Beller wrote: > Ævar asked for it, this is how you would do it. > (plus documentation, tests, CLI knobs, options) > > Signed-off-by: Stefan Beller > --- > diff.c | 15 +++ > diff.h | 2 ++ > 2 files changed, 13 insertions(+), 4 deletions(-) > > diff --git a/diff.c b/diff.c > index 7756f7610c..61caa057ff 100644 > --- a/diff.c > +++ b/diff.c > @@ -997,6 +997,7 @@ static void emit_diff_symbol_from_struct(struct > diff_options *o, > static const char *nneof = " No newline at end of file\n"; > const char *context, *reset, *set, *meta, *fraginfo; > struct strbuf sb = STRBUF_INIT; > + int sign; > > enum diff_symbol s = eds->s; > const char *line = eds->line; > @@ -1058,8 +1059,11 @@ static void emit_diff_symbol_from_struct(struct > diff_options *o, > default: > set = diff_get_color_opt(o, DIFF_FILE_NEW); > } > + sign = '+'; > + if (flags & DIFF_SYMBOL_MOVED_LINE && > o->machine_readable_moves) > + sign = '*'; > reset = diff_get_color_opt(o, DIFF_RESET); > - emit_line_ws_markup(o, set, reset, line, len, '+', > + emit_line_ws_markup(o, set, reset, line, len, sign, > flags & DIFF_SYMBOL_CONTENT_WS_MASK, > flags & > DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF); > break; > @@ -1086,8 +1090,11 @@ static void emit_diff_symbol_from_struct(struct > diff_options *o, > default: > set = diff_get_color_opt(o, DIFF_FILE_OLD); > } > + sign = '-'; > + if (flags & DIFF_SYMBOL_MOVED_LINE && > o->machine_readable_moves) > + sign = '_'; > reset = diff_get_color_opt(o, DIFF_RESET); > - emit_line_ws_markup(o, set, reset, line, len, '-', > + emit_line_ws_markup(o, set, reset, line, len, sign, > flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0); > break; > case DIFF_SYMBOL_WORDS_PORCELAIN: > @@ -5475,7 +5482,7 @@ static void diff_flush_patch_all_file_pairs(struct > diff_options *o) > static struct emitted_diff_symbols esm = EMITTED_DIFF_SYMBOLS_INIT; > struct diff_queue_struct *q = &diff_queued_diff; > > - if (o->color_moved) > + if (o->color_moved || o->machine_readable_moves) > o->emitted_symbols = &esm; > > for (i = 0; i < q->nr; i++) { > @@ -5485,7 +5492,7 @@ static void diff_flush_patch_all_file_pairs(struct > diff_options *o) > } > > if (o->emitted_symbols) { > - if (o->color_moved) { > + if (o->color_moved || o->machine_readable_moves) { > struct hashmap add_lines, del_lines; > unsigned ignore_ws = DIFF_XDL_TST(o, > IGNORE_WHITESPACE); > > diff --git a/diff.h b/diff.h > index 98abd75521..b6c4d7ab0f 100644 > --- a/diff.h > +++ b/diff.h > @@ -194,6 +194,8 @@ struct diff_options { > COLOR_MOVED_ZEBRA = 2, > COLOR_MOVED_ZEBRA_DIM = 3, > } color_moved; > + > + int machine_readable_moves; > }; > > void diff_emit_submodule_del(struct diff_options *o, const char *line); > -- > 2.12.2.575.gb14f27f917 >
[PATCH 23/26] diff.c: color moved lines differently, plain mode
Add the 'plain' mode for move detection of code. This omits the checking for adjacent blocks, so it is not as useful. If you have a lot of the same blocks moved in the same patch, the 'Zebra' would end up slow as it is O(n^2) (n is number of same blocks). So this may be useful there and is generally easy to add Signed-off-by: Stefan Beller --- diff.c | 5 + diff.h | 1 + t/t4015-diff-whitespace.sh | 51 +- 3 files changed, 56 insertions(+), 1 deletion(-) diff --git a/diff.c b/diff.c index 20c1f9b99f..0d41a53b76 100644 --- a/diff.c +++ b/diff.c @@ -246,6 +246,8 @@ static int parse_color_moved(const char *arg) { if (!strcmp(arg, "no")) return COLOR_MOVED_NO; + else if (!strcmp(arg, "plain")) + return COLOR_MOVED_PLAIN; else if (!strcmp(arg, "zebra")) return COLOR_MOVED_ZEBRA; else @@ -830,6 +832,9 @@ static void mark_color_as_moved(struct diff_options *o, l->flags |= DIFF_SYMBOL_MOVED_LINE; + if (o->color_moved == COLOR_MOVED_PLAIN) + continue; + /* Check any potential block runs, advance each or nullify */ for (i = 0; i < pmb_nr; i++) { struct moved_entry *p = pmb[i]; diff --git a/diff.h b/diff.h index 7726ad255c..1aae8738ca 100644 --- a/diff.h +++ b/diff.h @@ -190,6 +190,7 @@ struct diff_options { struct emitted_diff_symbols *emitted_symbols; enum { COLOR_MOVED_NO = 0, + COLOR_MOVED_PLAIN = 1, COLOR_MOVED_ZEBRA = 2, } color_moved; }; diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index 4a03766f1f..1ca16435d6 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -986,7 +986,7 @@ test_expect_success 'detect moved code, complete file' ' git mv test.c main.c && test_config color.diff.oldMoved "normal red" && test_config color.diff.newMoved "normal green" && - git diff HEAD --color-moved --no-renames | test_decode_color >actual && + git diff HEAD --color-moved=zebra --no-renames | test_decode_color >actual && cat >expected <<-\EOF && diff --git a/main.c b/main.c new file mode 100644 @@ -1130,6 +1130,55 @@ test_expect_success 'detect malicious moved code, inside file' ' test_cmp expected actual ' +test_expect_success 'plain moved code, inside file' ' + test_config color.diff.oldMoved "normal red" && + test_config color.diff.newMoved "normal green" && + test_config color.diff.oldMovedAlternative "blue" && + test_config color.diff.newMovedAlternative "yellow" && + # needs previous test as setup + git diff HEAD --no-renames --color-moved=plain| test_decode_color >actual && + cat <<-\EOF >expected && + diff --git a/main.c b/main.c + index 27a619c..7cf9336 100644 + --- a/main.c + +++ b/main.c + @@ -5,13 +5,6 @@ printf("Hello "); +printf("World\n"); +} + + -int secure_foo(struct user *u) + -{ + -if (!u->is_allowed_foo) + -return; + -foo(u); + -} + - +int main() +{ +foo(); + diff --git a/test.c b/test.c + index 1dc1d85..2bedec9 100644 + --- a/test.c + +++ b/test.c + @@ -4,6 +4,13 @@ int bar() +printf("Hello World, but different\n"); +} + + +int secure_foo(struct user *u) + +{ + +foo(u); + +if (!u->is_allowed_foo) + +return; + +} + + +int another_function() +{ +bar(); + EOF + + test_cmp expected actual +' + test_expect_success 'no effect from --color-moved with --word-diff' ' cat <<-\EOF >text.txt && Lorem Ipsum is simply dummy text of the printing and typesetting industry. -- 2.12.2.575.gb14f27f917
[PATCH 22/26] diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can be quite tedious to ensure that the blocks are moved verbatim, and not undesirably modified in the move. To that end, color blocks that are moved within the same patch differently. For example (OM, del, add, and NM are different colors): [OM] -void sensitive_stuff(void) [OM] -{ [OM] -if (!is_authorized_user()) [OM] -die("unauthorized"); [OM] -sensitive_stuff(spanning, [OM] -multiple, [OM] -lines); [OM] -} void another_function() { [del] -printf("foo"); [add] +printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NM] +if (!is_authorized_user()) [NM] +die("unauthorized"); [NM] +sensitive_stuff(spanning, [NM] +multiple, [NM] +lines); [NM] +} However adjacent blocks may be problematic. For example, in this potentially malicious patch, the swapping of blocks can be spotted: [OM] -void sensitive_stuff(void) [OM] -{ [OMA] -if (!is_authorized_user()) [OMA] -die("unauthorized"); [OM] -sensitive_stuff(spanning, [OM] -multiple, [OM] -lines); [OMA] -} void another_function() { [del] -printf("foo"); [add] +printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NMA] +sensitive_stuff(spanning, [NMA] +multiple, [NMA] +lines); [NM] +if (!is_authorized_user()) [NM] +die("unauthorized"); [NMA] +} If the moved code is larger, it is easier to hide some permutation in the code, which is why some alternative coloring is needed. This patch implements the first mode: * basic alternating 'Zebra' mode This conveys all information needed to the user. Defer customization to later patches. First I implemented an alternative design, which would try to fingerprint a line by its neighbors to detect if we are in a block or at the boundary. This idea iss error prone as it inspected each line and its neighboring lines to determine if the line was (a) moved and (b) if was deep inside a hunk by having matching neighboring lines. This is unreliable as the we can construct hunks which have equal neighbors that just exceed the number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that is permutated to AXYZCXYZBXYZD..'). Instead this provides a dynamic programming greedy algorithm that finds the largest moved hunk and then has several modes on highlighting bounds. A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. Helped-by: Jonathan Tan Signed-off-by: Stefan Beller --- diff.c | 304 ++--- diff.h | 10 +- t/t4015-diff-whitespace.sh | 196 + 3 files changed, 495 insertions(+), 15 deletions(-) diff --git a/diff.c b/diff.c index 35b5924ff2..20c1f9b99f 100644 --- a/diff.c +++ b/diff.c @@ -15,6 +15,7 @@ #include "userdiff.h" #include "submodule-config.h" #include "submodule.h" +#include "hashmap.h" #include "ll-merge.h" #include "string-list.h" #include "argv-array.h" @@ -31,6 +32,7 @@ static int diff_indent_heuristic; /* experimental */ static int diff_rename_limit_default = 400; static int diff_suppress_blank_empty; static int diff_use_color_default = -1; +static int diff_color_moved_default; static int diff_context_default = 3; static int diff_interhunk_context_default; static const char *diff_word_regex_cfg; @@ -55,6 +57,10 @@ static char diff_colors[][COLOR_MAXLEN] = { GIT_COLOR_YELLOW, /* COMMIT */ GIT_COLOR_BG_RED, /* WHITESPACE */ GIT_COLOR_NORMAL, /* FUNCINFO */ + GIT_COLOR_MAGENTA, /* OLD_MOVED */ + GIT_COLOR_BLUE, /* OLD_MOVED ALTERNATIVE */ + GIT_COLOR_CYAN, /* NEW_MOVED */ + GIT_COLOR_YELLOW, /* NEW_MOVED ALTERNATIVE */ }; static NORETURN void die_want_option(const char *option_name) @@ -80,6 +86,14 @@ static int parse_diff_color_slot(const char *var) return DIFF_WHITESPACE; if (!strcasecmp
[PATCH 13/26] diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES
Signed-off-by: Stefan Beller --- diff.c | 21 - 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/diff.c b/diff.c index 95f11ea86f..e56962b844 100644 --- a/diff.c +++ b/diff.c @@ -572,6 +572,7 @@ enum diff_symbol { DIFF_SYMBOL_WORDS, DIFF_SYMBOL_FILEPAIR, DIFF_SYMBOL_HEADER, + DIFF_SYMBOL_BINARY_FILES, }; /* * Flags for content lines: @@ -689,6 +690,10 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, case DIFF_SYMBOL_HEADER: fprintf(o->file, "%s", line); break; + case DIFF_SYMBOL_BINARY_FILES: + fprintf(o->file, "%sBinary files %s differ\n", + diff_line_prefix(o), line); + break; default: die("BUG: unknown diff symbol"); } @@ -2537,6 +2542,7 @@ static void builtin_diff(const char *name_a, } else if (!DIFF_OPT_TST(o, TEXT) && ( (!textconv_one && diff_filespec_is_binary(one)) || (!textconv_two && diff_filespec_is_binary(two)) )) { + struct strbuf sb = STRBUF_INIT; if (!one->data && !two->data && S_ISREG(one->mode) && S_ISREG(two->mode) && !DIFF_OPT_TST(o, BINARY)) { @@ -2549,8 +2555,10 @@ static void builtin_diff(const char *name_a, } emit_diff_symbol(o, DIFF_SYMBOL_HEADER, header.buf, header.len, 0); - fprintf(o->file, "%sBinary files %s and %s differ\n", - line_prefix, lbl[0], lbl[1]); + strbuf_addf(&sb, "%s and %s", lbl[0], lbl[1]); + emit_diff_symbol(o, DIFF_SYMBOL_BINARY_FILES, +sb.buf, sb.len, 0); + strbuf_release(&sb); goto free_ab_and_return; } if (fill_mmfile(&mf1, one) < 0 || fill_mmfile(&mf2, two) < 0) @@ -2567,9 +2575,12 @@ static void builtin_diff(const char *name_a, strbuf_reset(&header); if (DIFF_OPT_TST(o, BINARY)) emit_binary_diff(o->file, &mf1, &mf2, line_prefix); - else - fprintf(o->file, "%sBinary files %s and %s differ\n", - line_prefix, lbl[0], lbl[1]); + else { + strbuf_addf(&sb, "%s and %s", lbl[0], lbl[1]); + emit_diff_symbol(o, DIFF_SYMBOL_BINARY_FILES, +sb.buf, sb.len, 0); + strbuf_release(&sb); + } o->found_changes = 1; } else { /* Crazy xdl interfaces.. */ -- 2.12.2.575.gb14f27f917
[PATCH 01/26] diff.c: readability fix
We already have dereferenced 'p->two' into a local variable 'two'. Use that. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 74283d9001..3f5bf8b5a4 100644 --- a/diff.c +++ b/diff.c @@ -3283,8 +3283,8 @@ static void run_diff(struct diff_filepair *p, struct diff_options *o) const char *other; const char *attr_path; - name = p->one->path; - other = (strcmp(name, p->two->path) ? p->two->path : NULL); + name = one->path; + other = (strcmp(name, two->path) ? two->path : NULL); attr_path = name; if (o->prefix_length) strip_prefix(o->prefix_length, &name, &other); -- 2.12.2.575.gb14f27f917
[PATCH 16/26] diff.c: convert emit_binary_diff_body to use emit_diff_symbol
Signed-off-by: Stefan Beller --- diff.c | 66 +- 1 file changed, 49 insertions(+), 17 deletions(-) diff --git a/diff.c b/diff.c index 42a9020d95..e6ade5fde0 100644 --- a/diff.c +++ b/diff.c @@ -573,6 +573,11 @@ enum diff_symbol { DIFF_SYMBOL_FILEPAIR, DIFF_SYMBOL_HEADER, DIFF_SYMBOL_BINARY_FILES, + DIFF_SYMBOL_BINARY_DIFF_HEADER, + DIFF_SYMBOL_BINARY_DIFF_HEADER_DELTA, + DIFF_SYMBOL_BINARY_DIFF_HEADER_LITERAL, + DIFF_SYMBOL_BINARY_DIFF_BODY, + DIFF_SYMBOL_BINARY_DIFF_FOOTER, DIFF_SYMBOL_REWRITE_DIFF, DIFF_SYMBOL_SUBMODULE_ADD, DIFF_SYMBOL_SUBMODULE_DEL, @@ -581,6 +586,7 @@ enum diff_symbol { DIFF_SYMBOL_SUBMODULE_HEADER, DIFF_SYMBOL_SUBMODULE_ERROR, DIFF_SYMBOL_SUBMODULE_PIPETHROUGH, + }; /* * Flags for content lines: @@ -702,6 +708,22 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, fprintf(o->file, "%sBinary files %s differ\n", diff_line_prefix(o), line); break; + case DIFF_SYMBOL_BINARY_DIFF_HEADER: + fprintf(o->file, "%sGIT binary patch\n", diff_line_prefix(o)); + break; + case DIFF_SYMBOL_BINARY_DIFF_HEADER_DELTA: + fprintf(o->file, "%sdelta %s\n", diff_line_prefix(o), line); + break; + case DIFF_SYMBOL_BINARY_DIFF_HEADER_LITERAL: + fprintf(o->file, "%sliteral %s\n", diff_line_prefix(o), line); + break; + case DIFF_SYMBOL_BINARY_DIFF_BODY: + emit_line(o, "", "", line, len); + break; + case DIFF_SYMBOL_BINARY_DIFF_FOOTER: + fputs(diff_line_prefix(o), o->file); + fputc('\n', o->file); + break; case DIFF_SYMBOL_REWRITE_DIFF: fraginfo = diff_get_color(o->use_color, DIFF_FRAGINFO); reset = diff_get_color_opt(o, DIFF_RESET); @@ -2394,8 +2416,8 @@ static unsigned char *deflate_it(char *data, return deflated; } -static void emit_binary_diff_body(FILE *file, mmfile_t *one, mmfile_t *two, - const char *prefix) +static void emit_binary_diff_body(struct diff_options *o, + mmfile_t *one, mmfile_t *two) { void *cp; void *delta; @@ -2424,13 +2446,18 @@ static void emit_binary_diff_body(FILE *file, mmfile_t *one, mmfile_t *two, } if (delta && delta_size < deflate_size) { - fprintf(file, "%sdelta %lu\n", prefix, orig_size); + char *s = xstrfmt("%lu", orig_size); + emit_diff_symbol(o, DIFF_SYMBOL_BINARY_DIFF_HEADER_DELTA, +s, strlen(s), 0); + free(s); free(deflated); data = delta; data_size = delta_size; - } - else { - fprintf(file, "%sliteral %lu\n", prefix, two->size); + } else { + char *s = xstrfmt("%lu", two->size); + emit_diff_symbol(o, DIFF_SYMBOL_BINARY_DIFF_HEADER_LITERAL, +s, strlen(s), 0); + free(s); free(delta); data = deflated; data_size = deflate_size; @@ -2439,8 +2466,9 @@ static void emit_binary_diff_body(FILE *file, mmfile_t *one, mmfile_t *two, /* emit data encoded in base85 */ cp = data; while (data_size) { + int len; int bytes = (52 < data_size) ? 52 : data_size; - char line[70]; + char line[71]; data_size -= bytes; if (bytes <= 26) line[0] = bytes + 'A' - 1; @@ -2448,20 +2476,24 @@ static void emit_binary_diff_body(FILE *file, mmfile_t *one, mmfile_t *two, line[0] = bytes - 26 + 'a' - 1; encode_85(line + 1, cp, bytes); cp = (char *) cp + bytes; - fprintf(file, "%s", prefix); - fputs(line, file); - fputc('\n', file); + + len = strlen(line); + line[len++] = '\n'; + line[len] = '\0'; + + emit_diff_symbol(o, DIFF_SYMBOL_BINARY_DIFF_BODY, +line, len, 0); } - fprintf(file, "%s\n", prefix); + emit_diff_symbol(o, DIFF_SYMBOL_BINARY_DIFF_FOOTER, NULL, 0, 0); free(data); } -static void emit_binary_diff(FILE *file, mmfile_t *one, mmfile_t *two, -const char *prefix) +static void emit_binary_diff(struct diff_options *o, +mmfile_t *one, mmfile_t *two) { - fprintf(file, "%sGIT binary patch\n", prefix); - emit_binary_diff_body(file, one, two, prefix); - emit_binary_diff_body(file, two, one, prefix); + emit_diff_symbol(o,
[PATCH 10/26] diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE
Signed-off-by: Stefan Beller --- diff.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 7096457c8a..e6029e8213 100644 --- a/diff.c +++ b/diff.c @@ -565,6 +565,7 @@ enum diff_symbol { DIFF_SYMBOL_CONTEXT_FRAGINFO, DIFF_SYMBOL_NO_LF_EOF, DIFF_SYMBOL_CONTEXT, + DIFF_SYMBOL_CONTEXT_INCOMPLETE, DIFF_SYMBOL_PLUS, DIFF_SYMBOL_MINUS, DIFF_SYMBOL_WORDS_PORCELAIN, @@ -637,6 +638,11 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, emit_line_ws_markup(o, set, reset, line, len, ' ', flags & (DIFF_SYMBOL_CONTENT_WS_MASK), 0); break; + case DIFF_SYMBOL_CONTEXT_INCOMPLETE: + context = diff_get_color_opt(o, DIFF_CONTEXT); + reset = diff_get_color_opt(o, DIFF_RESET); + emit_line(o, context, reset, line, len); + break; case DIFF_SYMBOL_PLUS: set = diff_get_color_opt(o, DIFF_FILE_NEW); reset = diff_get_color_opt(o, DIFF_RESET); @@ -1448,8 +1454,8 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) default: /* incomplete line at the end */ ecbdata->lno_in_preimage++; - emit_line(o, diff_get_color(ecbdata->color_diff, DIFF_CONTEXT), - reset, line, len); + emit_diff_symbol(o, DIFF_SYMBOL_CONTEXT_INCOMPLETE, +line, len, 0); break; } } -- 2.12.2.575.gb14f27f917
[PATCH 09/26] diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS{_PORCELAIN}
Signed-off-by: Stefan Beller --- diff.c | 42 ++ 1 file changed, 26 insertions(+), 16 deletions(-) diff --git a/diff.c b/diff.c index 42c9f48dc2..7096457c8a 100644 --- a/diff.c +++ b/diff.c @@ -567,6 +567,8 @@ enum diff_symbol { DIFF_SYMBOL_CONTEXT, DIFF_SYMBOL_PLUS, DIFF_SYMBOL_MINUS, + DIFF_SYMBOL_WORDS_PORCELAIN, + DIFF_SYMBOL_WORDS, }; /* * Flags for content lines: @@ -648,6 +650,26 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, emit_line_ws_markup(o, set, reset, line, len, '-', flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0); break; + case DIFF_SYMBOL_WORDS_PORCELAIN: + context = diff_get_color_opt(o, DIFF_CONTEXT); + reset = diff_get_color_opt(o, DIFF_RESET); + emit_line(o, context, reset, line, len); + fputs("~\n", o->file); + break; + case DIFF_SYMBOL_WORDS: + context = diff_get_color_opt(o, DIFF_CONTEXT); + reset = diff_get_color_opt(o, DIFF_RESET); + /* +* Skip the prefix character, if any. With +* diff_suppress_blank_empty, there may be +* none. +*/ + if (line[0] != '\n') { + line++; + len--; + } + emit_line(o, context, reset, line, len); + break; default: die("BUG: unknown diff symbol"); } @@ -1342,7 +1364,6 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) { struct emit_callback *ecbdata = priv; const char *meta = diff_get_color(ecbdata->color_diff, DIFF_METAINFO); - const char *context = diff_get_color(ecbdata->color_diff, DIFF_CONTEXT); const char *reset = diff_get_color(ecbdata->color_diff, DIFF_RESET); struct diff_options *o = ecbdata->opt; const char *line_prefix = diff_line_prefix(o); @@ -1384,6 +1405,9 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) } if (ecbdata->diff_words) { + enum diff_symbol s = + ecbdata->diff_words->type == DIFF_WORDS_PORCELAIN ? + DIFF_SYMBOL_WORDS_PORCELAIN : DIFF_SYMBOL_WORDS; if (line[0] == '-') { diff_words_append(line, len, &ecbdata->diff_words->minus); @@ -1403,21 +1427,7 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) return; } diff_words_flush(ecbdata); - if (ecbdata->diff_words->type == DIFF_WORDS_PORCELAIN) { - emit_line(o, context, reset, line, len); - fputs("~\n", o->file); - } else { - /* -* Skip the prefix character, if any. With -* diff_suppress_blank_empty, there may be -* none. -*/ - if (line[0] != '\n') { - line++; - len--; - } - emit_line(o, context, reset, line, len); - } + emit_diff_symbol(o, s, line, len, 0); return; } -- 2.12.2.575.gb14f27f917
[showing-off RFC/PATCH 26/26] diff.c: have a "machine parseable" move coloring
Ævar asked for it, this is how you would do it. (plus documentation, tests, CLI knobs, options) Signed-off-by: Stefan Beller --- diff.c | 15 +++ diff.h | 2 ++ 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/diff.c b/diff.c index 7756f7610c..61caa057ff 100644 --- a/diff.c +++ b/diff.c @@ -997,6 +997,7 @@ static void emit_diff_symbol_from_struct(struct diff_options *o, static const char *nneof = " No newline at end of file\n"; const char *context, *reset, *set, *meta, *fraginfo; struct strbuf sb = STRBUF_INIT; + int sign; enum diff_symbol s = eds->s; const char *line = eds->line; @@ -1058,8 +1059,11 @@ static void emit_diff_symbol_from_struct(struct diff_options *o, default: set = diff_get_color_opt(o, DIFF_FILE_NEW); } + sign = '+'; + if (flags & DIFF_SYMBOL_MOVED_LINE && o->machine_readable_moves) + sign = '*'; reset = diff_get_color_opt(o, DIFF_RESET); - emit_line_ws_markup(o, set, reset, line, len, '+', + emit_line_ws_markup(o, set, reset, line, len, sign, flags & DIFF_SYMBOL_CONTENT_WS_MASK, flags & DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF); break; @@ -1086,8 +1090,11 @@ static void emit_diff_symbol_from_struct(struct diff_options *o, default: set = diff_get_color_opt(o, DIFF_FILE_OLD); } + sign = '-'; + if (flags & DIFF_SYMBOL_MOVED_LINE && o->machine_readable_moves) + sign = '_'; reset = diff_get_color_opt(o, DIFF_RESET); - emit_line_ws_markup(o, set, reset, line, len, '-', + emit_line_ws_markup(o, set, reset, line, len, sign, flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0); break; case DIFF_SYMBOL_WORDS_PORCELAIN: @@ -5475,7 +5482,7 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o) static struct emitted_diff_symbols esm = EMITTED_DIFF_SYMBOLS_INIT; struct diff_queue_struct *q = &diff_queued_diff; - if (o->color_moved) + if (o->color_moved || o->machine_readable_moves) o->emitted_symbols = &esm; for (i = 0; i < q->nr; i++) { @@ -5485,7 +5492,7 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o) } if (o->emitted_symbols) { - if (o->color_moved) { + if (o->color_moved || o->machine_readable_moves) { struct hashmap add_lines, del_lines; unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE); diff --git a/diff.h b/diff.h index 98abd75521..b6c4d7ab0f 100644 --- a/diff.h +++ b/diff.h @@ -194,6 +194,8 @@ struct diff_options { COLOR_MOVED_ZEBRA = 2, COLOR_MOVED_ZEBRA_DIM = 3, } color_moved; + + int machine_readable_moves; }; void diff_emit_submodule_del(struct diff_options *o, const char *line); -- 2.12.2.575.gb14f27f917
[PATCH 21/26] diff.c: buffer all output if asked to
Introduce a new option 'emitted_symbols' in the struct diff_options which controls whether all output is buffered up until all output is available. It is set internally in diff.c when necessary. We'll have a new struct 'emitted_string' in diff.c which will be used to buffer each line. The emitted_string will duplicate the memory of the line to buffer as that is easiest to reason about for now. In a future patch we may want to decrease the memory usage by not duplicating all output for buffering but rather we may want to store offsets into the file or in case of hunk descriptions such as the similarity score, we could just store the relevant number and reproduce the text later on. This approach was chosen as a first step because it is quite simple compared to the alternative with less memory footprint. emit_diff_symbol factors out the emission part and depending on the diff_options->emitted_symbols the emission will be performed directly when calling emit_diff_symbol or after the whole process is done, i.e. by buffering we have add the possibility for a second pass over the whole output before doing the actual output. In 6440d34 (2012-03-14, diff: tweak a _copy_ of diff_options with word-diff) we introduced a duplicate diff options struct for word emissions as we may have different regex settings in there. When buffering the output, we need to operate on just one buffer, so we have to copy back the emissions of the word buffer into the main buffer. Unconditionally enable output via buffer in this patch as it yields a great opportunity for testing, i.e. all the diff tests from the test suite pass without having reordering issues (i.e. only parts of the output got buffered, and we forgot to buffer other parts). The test suite passes, which gives confidence that we converted all functions to use emit_string for output. Signed-off-by: Stefan Beller --- diff.c | 109 +++-- diff.h | 2 ++ 2 files changed, 109 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index f2f7a4da79..35b5924ff2 100644 --- a/diff.c +++ b/diff.c @@ -603,6 +603,47 @@ enum diff_symbol { #define DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF (1<<16) #define DIFF_SYMBOL_CONTENT_WS_MASK (WSEH_NEW | WSEH_OLD | WSEH_CONTEXT | WS_RULE_MASK) +/* + * This struct is used when we need to buffer the output of the diff output. + * + * NEEDSWORK: Instead of storing a copy of the line, add an offset pointer + * into the pre/post image file. This pointer could be a union with the + * line pointer. By storing an offset into the file instead of the literal line, + * we can decrease the memory footprint for the buffered output. At first we + * may want to only have indirection for the content lines, but we could also + * enhance the state for emitting prefabricated lines, e.g. the similarity + * score line or hunk/file headers would only need to store a number or path + * and then the output can be constructed later on depending on state. + */ +struct emitted_diff_symbol { + const char *line; + int len; + int flags; + enum diff_symbol s; +}; +#define EMITTED_DIFF_SYMBOL_INIT {NULL} + +struct emitted_diff_symbols { + struct emitted_diff_symbol *buf; + int nr, alloc; +}; +#define EMITTED_DIFF_SYMBOLS_INIT {NULL, 0, 0} + +static void append_emitted_diff_symbol(struct diff_options *o, + struct emitted_diff_symbol *e) +{ + struct emitted_diff_symbol *f; + + ALLOC_GROW(o->emitted_symbols->buf, + o->emitted_symbols->nr + 1, + o->emitted_symbols->alloc); + f = &o->emitted_symbols->buf[o->emitted_symbols->nr++]; + + memcpy(f, e, sizeof(struct emitted_diff_symbol)); + f->line = e->line ? xmemdupz(e->line, e->len) : NULL; +} + + static void emit_line_ws_markup(struct diff_options *o, const char *set, const char *reset, const char *line, int len, char sign, @@ -629,12 +670,18 @@ static void emit_line_ws_markup(struct diff_options *o, } } -static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, -const char *line, int len, unsigned flags) +static void emit_diff_symbol_from_struct(struct diff_options *o, +struct emitted_diff_symbol *eds) { static const char *nneof = " No newline at end of file\n"; const char *context, *reset, *set, *meta, *fraginfo; struct strbuf sb = STRBUF_INIT; + + enum diff_symbol s = eds->s; + const char *line = eds->line; + int len = eds->len; + unsigned flags = eds->flags; + switch (s) { case DIFF_SYMBOL_SEPARATOR: fprintf(o->file, "%s%c", @@ -790,6 +837,17 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, strbuf_release(&sb); } +static void emit_diff_symbol(struct di
[PATCH 05/26] diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_MARKER
Signed-off-by: Stefan Beller --- diff.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/diff.c b/diff.c index 89466018e5..3af07fa659 100644 --- a/diff.c +++ b/diff.c @@ -561,17 +561,24 @@ static void emit_line(struct diff_options *o, const char *set, const char *reset enum diff_symbol { DIFF_SYMBOL_SEPARATOR, + DIFF_SYMBOL_CONTEXT_MARKER, }; static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, const char *line, int len) { + const char *context, *reset; switch (s) { case DIFF_SYMBOL_SEPARATOR: fprintf(o->file, "%s%c", diff_line_prefix(o), o->line_termination); break; + case DIFF_SYMBOL_CONTEXT_MARKER: + context = diff_get_color_opt(o, DIFF_CONTEXT); + reset = diff_get_color_opt(o, DIFF_RESET); + emit_line(o, context, reset, line, len); + break; default: die("BUG: unknown diff symbol"); } @@ -661,7 +668,8 @@ static void emit_hunk_header(struct emit_callback *ecbdata, if (len < 10 || memcmp(line, atat, 2) || !(ep = memmem(line + 2, len - 2, atat, 2))) { - emit_line(ecbdata->opt, context, reset, line, len); + emit_diff_symbol(ecbdata->opt, +DIFF_SYMBOL_CONTEXT_MARKER, line, len); return; } ep += 2; /* skip over @@ */ -- 2.12.2.575.gb14f27f917
[PATCH 20/26] diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY
Signed-off-by: Stefan Beller --- diff.c | 73 +++--- 1 file changed, 43 insertions(+), 30 deletions(-) diff --git a/diff.c b/diff.c index 18bfc4720b..f2f7a4da79 100644 --- a/diff.c +++ b/diff.c @@ -592,6 +592,7 @@ enum diff_symbol { DIFF_SYMBOL_STATS_LINE, DIFF_SYMBOL_WORD_DIFF, DIFF_SYMBOL_STAT_SEP, + DIFF_SYMBOL_SUMMARY, }; /* * Flags for content lines: @@ -780,6 +781,9 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, case DIFF_SYMBOL_STAT_SEP: fputs(o->stat_sep, o->file); break; + case DIFF_SYMBOL_SUMMARY: + emit_line(o, "", "", line, len); + break; default: die("BUG: unknown diff symbol"); } @@ -4727,67 +4731,76 @@ static void flush_one_pair(struct diff_filepair *p, struct diff_options *opt) } } -static void show_file_mode_name(FILE *file, const char *newdelete, struct diff_filespec *fs) +static void show_file_mode_name(struct diff_options *opt, const char *newdelete, struct diff_filespec *fs) { + struct strbuf sb = STRBUF_INIT; if (fs->mode) - fprintf(file, " %s mode %06o ", newdelete, fs->mode); + strbuf_addf(&sb, " %s mode %06o ", newdelete, fs->mode); else - fprintf(file, " %s ", newdelete); - write_name_quoted(fs->path, file, '\n'); -} + strbuf_addf(&sb, " %s ", newdelete); + quote_c_style(fs->path, &sb, NULL, 0); + strbuf_addch(&sb, '\n'); + emit_diff_symbol(opt, DIFF_SYMBOL_SUMMARY, +sb.buf, sb.len, 0); + strbuf_release(&sb); +} -static void show_mode_change(FILE *file, struct diff_filepair *p, int show_name, - const char *line_prefix) +static void show_mode_change(struct diff_options *opt, struct diff_filepair *p, + int show_name) { if (p->one->mode && p->two->mode && p->one->mode != p->two->mode) { - fprintf(file, "%s mode change %06o => %06o%c", line_prefix, p->one->mode, - p->two->mode, show_name ? ' ' : '\n'); + struct strbuf sb = STRBUF_INIT; + strbuf_addf(&sb, " mode change %06o => %06o", + p->one->mode, p->two->mode); if (show_name) { - write_name_quoted(p->two->path, file, '\n'); + strbuf_addch(&sb, ' '); + quote_c_style(p->two->path, &sb, NULL, 0); } + emit_diff_symbol(opt, DIFF_SYMBOL_SUMMARY, +sb.buf, sb.len, 0); + strbuf_release(&sb); } } -static void show_rename_copy(FILE *file, const char *renamecopy, struct diff_filepair *p, - const char *line_prefix) +static void show_rename_copy(struct diff_options *opt, const char *renamecopy, + struct diff_filepair *p) { + struct strbuf sb = STRBUF_INIT; char *names = pprint_rename(p->one->path, p->two->path); - - fprintf(file, " %s %s (%d%%)\n", renamecopy, names, similarity_index(p)); + strbuf_addf(&sb, " %s %s (%d%%)\n", + renamecopy, names, similarity_index(p)); free(names); - show_mode_change(file, p, 0, line_prefix); + emit_diff_symbol(opt, DIFF_SYMBOL_SUMMARY, +sb.buf, sb.len, 0); + show_mode_change(opt, p, 0); } static void diff_summary(struct diff_options *opt, struct diff_filepair *p) { - FILE *file = opt->file; - const char *line_prefix = diff_line_prefix(opt); - switch(p->status) { case DIFF_STATUS_DELETED: - fputs(line_prefix, file); - show_file_mode_name(file, "delete", p->one); + show_file_mode_name(opt, "delete", p->one); break; case DIFF_STATUS_ADDED: - fputs(line_prefix, file); - show_file_mode_name(file, "create", p->two); + show_file_mode_name(opt, "create", p->two); break; case DIFF_STATUS_COPIED: - fputs(line_prefix, file); - show_rename_copy(file, "copy", p, line_prefix); + show_rename_copy(opt, "copy", p); break; case DIFF_STATUS_RENAMED: - fputs(line_prefix, file); - show_rename_copy(file, "rename", p, line_prefix); + show_rename_copy(opt, "rename", p); break; default: if (p->score) { - fprintf(file, "%s rewrite ", line_prefix); - write_name_quoted(p->two->path, file, ' '); - fprintf(file, "(%d%%)\n", similarity_index(p)); + struct strbuf sb = STRBUF_INIT; + strbuf_addstr(&sb, " rewrite "); +
[PATCH 24/26] diff.c: add dimming to moved line detection
Any lines inside a moved block of code are not interesting. Boundaries of blocks are only interesting if they are next to another block of moved code. Signed-off-by: Stefan Beller --- color.h| 2 + diff.c | 139 - diff.h | 9 ++- t/t4015-diff-whitespace.sh | 124 4 files changed, 259 insertions(+), 15 deletions(-) diff --git a/color.h b/color.h index 90627650fc..0e091b0cf5 100644 --- a/color.h +++ b/color.h @@ -42,6 +42,8 @@ struct strbuf; #define GIT_COLOR_BG_BLUE "\033[44m" #define GIT_COLOR_BG_MAGENTA "\033[45m" #define GIT_COLOR_BG_CYAN "\033[46m" +#define GIT_COLOR_DI "\033[2m" +#define GIT_COLOR_DI_IT"\033[2;3m" /* A special value meaning "no color selected" */ #define GIT_COLOR_NIL "NIL" diff --git a/diff.c b/diff.c index 0d41a53b76..7756f7610c 100644 --- a/diff.c +++ b/diff.c @@ -57,10 +57,14 @@ static char diff_colors[][COLOR_MAXLEN] = { GIT_COLOR_YELLOW, /* COMMIT */ GIT_COLOR_BG_RED, /* WHITESPACE */ GIT_COLOR_NORMAL, /* FUNCINFO */ - GIT_COLOR_MAGENTA, /* OLD_MOVED */ - GIT_COLOR_BLUE, /* OLD_MOVED ALTERNATIVE */ - GIT_COLOR_CYAN, /* NEW_MOVED */ - GIT_COLOR_YELLOW, /* NEW_MOVED ALTERNATIVE */ + GIT_COLOR_BOLD_MAGENTA, /* OLD_MOVED */ + GIT_COLOR_BOLD_BLUE,/* OLD_MOVED ALTERNATIVE */ + GIT_COLOR_DI, /* OLD_MOVED_DIM */ + GIT_COLOR_DI_IT,/* OLD_MOVED_ALTERNATIVE_DIM */ + GIT_COLOR_BOLD_CYAN,/* NEW_MOVED */ + GIT_COLOR_BOLD_YELLOW, /* NEW_MOVED ALTERNATIVE */ + GIT_COLOR_DI, /* NEW_MOVED_DIM */ + GIT_COLOR_DI_IT,/* NEW_MOVED_ALTERNATIVE_DIM */ }; static NORETURN void die_want_option(const char *option_name) @@ -90,10 +94,18 @@ static int parse_diff_color_slot(const char *var) return DIFF_FILE_OLD_MOVED; if (!strcasecmp(var, "oldmovedalternative")) return DIFF_FILE_OLD_MOVED_ALT; + if (!strcasecmp(var, "oldmoveddimmed")) + return DIFF_FILE_OLD_MOVED_DIM; + if (!strcasecmp(var, "oldmovedalternativedimmed")) + return DIFF_FILE_OLD_MOVED_ALT_DIM; if (!strcasecmp(var, "newmoved")) return DIFF_FILE_NEW_MOVED; if (!strcasecmp(var, "newmovedalternative")) return DIFF_FILE_NEW_MOVED_ALT; + if (!strcasecmp(var, "newmoveddimmed")) + return DIFF_FILE_NEW_MOVED_DIM; + if (!strcasecmp(var, "newmovedalternativedimmed")) + return DIFF_FILE_NEW_MOVED_ALT_DIM; return -1; } @@ -250,6 +262,8 @@ static int parse_color_moved(const char *arg) return COLOR_MOVED_PLAIN; else if (!strcmp(arg, "zebra")) return COLOR_MOVED_ZEBRA; + else if (!strcmp(arg, "dimmed_zebra")) + return COLOR_MOVED_ZEBRA_DIM; else return -1; } @@ -636,6 +650,7 @@ enum diff_symbol { #define DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF (1<<16) #define DIFF_SYMBOL_MOVED_LINE (1<<17) #define DIFF_SYMBOL_MOVED_LINE_ZEBRA (1<<18) +#define DIFF_SYMBOL_MOVED_LINE_UNINTERESTING (1<<19) #define DIFF_SYMBOL_CONTENT_WS_MASK (WSEH_NEW | WSEH_OLD | WSEH_CONTEXT | WS_RULE_MASK) /* @@ -889,6 +904,67 @@ static void mark_color_as_moved(struct diff_options *o, free(pmb); } +#define DIFF_SYMBOL_MOVED_LINE_ZEBRA_MASK \ + (DIFF_SYMBOL_MOVED_LINE | DIFF_SYMBOL_MOVED_LINE_ZEBRA) +static void dim_moved_lines(struct diff_options *o) +{ + int n; + for (n = 0; n < o->emitted_symbols->nr; n++) { + struct emitted_diff_symbol *prev = (n != 0) ? + &o->emitted_symbols->buf[n - 1] : NULL; + struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n]; + struct emitted_diff_symbol *next = + (n < o->emitted_symbols->nr - 1) ? + &o->emitted_symbols->buf[n + 1] : NULL; + + /* Not a plus or minus line? */ + if (l->s != DIFF_SYMBOL_PLUS && l->s != DIFF_SYMBOL_MINUS) + continue; + + /* Not a moved line? */ + if (!(l->flags & DIFF_SYMBOL_MOVED_LINE)) + continue; + + /* +* If prev or next are not a plus or minus line, +* pretend they don't exist +*/ + if (prev && prev->s != DIFF_SYMBOL_PLUS && + prev->s != DIFF_SYMBOL_MINUS) + prev = NULL; + if (next && next->s != DIFF_SYMBOL_PLUS && + next->s != DIFF_SYMBOL_MINUS) + next = NULL; + + /* Inside a block? */ + if (
[PATCH 03/26] diff.c: factor out diff_flush_patch_all_file_pairs
In a later patch we want to do more things before and after all filepairs are flushed. So factor flushing out all file pairs into its own function that the new code can be plugged in easily. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/diff.c b/diff.c index c2ed605cd0..2f9722b382 100644 --- a/diff.c +++ b/diff.c @@ -4737,6 +4737,17 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc) warning(_(rename_limit_advice), varname, needed); } +static void diff_flush_patch_all_file_pairs(struct diff_options *o) +{ + int i; + struct diff_queue_struct *q = &diff_queued_diff; + for (i = 0; i < q->nr; i++) { + struct diff_filepair *p = q->queue[i]; + if (check_pair_status(p)) + diff_flush_patch(p, o); + } +} + void diff_flush(struct diff_options *options) { struct diff_queue_struct *q = &diff_queued_diff; @@ -4831,11 +4842,7 @@ void diff_flush(struct diff_options *options) } } - for (i = 0; i < q->nr; i++) { - struct diff_filepair *p = q->queue[i]; - if (check_pair_status(p)) - diff_flush_patch(p, options); - } + diff_flush_patch_all_file_pairs(options); } if (output_format & DIFF_FORMAT_CALLBACK) -- 2.12.2.575.gb14f27f917
[PATCH 11/26] diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR
Signed-off-by: Stefan Beller --- diff.c | 42 +- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/diff.c b/diff.c index e6029e8213..7c56150b56 100644 --- a/diff.c +++ b/diff.c @@ -570,6 +570,7 @@ enum diff_symbol { DIFF_SYMBOL_MINUS, DIFF_SYMBOL_WORDS_PORCELAIN, DIFF_SYMBOL_WORDS, + DIFF_SYMBOL_FILEPAIR, }; /* * Flags for content lines: @@ -610,7 +611,7 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, const char *line, int len, unsigned flags) { static const char *nneof = " No newline at end of file\n"; - const char *context, *reset, *set; + const char *context, *reset, *set, *meta; switch (s) { case DIFF_SYMBOL_SEPARATOR: fprintf(o->file, "%s%c", @@ -676,6 +677,14 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, } emit_line(o, context, reset, line, len); break; + case DIFF_SYMBOL_FILEPAIR: + meta = diff_get_color_opt(o, DIFF_METAINFO); + reset = diff_get_color_opt(o, DIFF_RESET); + fprintf(o->file, "%s%s%s%s%s%s\n", diff_line_prefix(o), meta, + flags ? "+++ " : "--- ", + line, reset, + strchr(line, ' ') ? "\t" : ""); + break; default: die("BUG: unknown diff symbol"); } @@ -847,8 +856,6 @@ static void emit_rewrite_diff(const char *name_a, struct diff_options *o) { int lc_a, lc_b; - const char *name_a_tab, *name_b_tab; - const char *metainfo = diff_get_color(o->use_color, DIFF_METAINFO); const char *fraginfo = diff_get_color(o->use_color, DIFF_FRAGINFO); const char *reset = diff_get_color(o->use_color, DIFF_RESET); static struct strbuf a_name = STRBUF_INIT, b_name = STRBUF_INIT; @@ -868,8 +875,6 @@ static void emit_rewrite_diff(const char *name_a, name_a += (*name_a == '/'); name_b += (*name_b == '/'); - name_a_tab = strchr(name_a, ' ') ? "\t" : ""; - name_b_tab = strchr(name_b, ' ') ? "\t" : ""; strbuf_reset(&a_name); strbuf_reset(&b_name); @@ -896,11 +901,11 @@ static void emit_rewrite_diff(const char *name_a, lc_a = count_lines(data_one, size_one); lc_b = count_lines(data_two, size_two); - fprintf(o->file, - "%s%s--- %s%s%s\n%s%s+++ %s%s%s\n%s%s@@ -", - line_prefix, metainfo, a_name.buf, name_a_tab, reset, - line_prefix, metainfo, b_name.buf, name_b_tab, reset, - line_prefix, fraginfo); + + emit_diff_symbol(o, DIFF_SYMBOL_FILEPAIR, a_name.buf, a_name.len, 0); + emit_diff_symbol(o, DIFF_SYMBOL_FILEPAIR, b_name.buf, b_name.len, 1); + + fprintf(o->file, "%s%s@@ -", line_prefix, fraginfo); if (!o->irreversible_delete) print_line_count(o->file, lc_a); else @@ -1369,10 +1374,8 @@ static void find_lno(const char *line, struct emit_callback *ecbdata) static void fn_out_consume(void *priv, char *line, unsigned long len) { struct emit_callback *ecbdata = priv; - const char *meta = diff_get_color(ecbdata->color_diff, DIFF_METAINFO); const char *reset = diff_get_color(ecbdata->color_diff, DIFF_RESET); struct diff_options *o = ecbdata->opt; - const char *line_prefix = diff_line_prefix(o); o->found_changes = 1; @@ -1383,15 +1386,12 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) } if (ecbdata->label_path[0]) { - const char *name_a_tab, *name_b_tab; - - name_a_tab = strchr(ecbdata->label_path[0], ' ') ? "\t" : ""; - name_b_tab = strchr(ecbdata->label_path[1], ' ') ? "\t" : ""; - - fprintf(o->file, "%s%s--- %s%s%s\n", - line_prefix, meta, ecbdata->label_path[0], reset, name_a_tab); - fprintf(o->file, "%s%s+++ %s%s%s\n", - line_prefix, meta, ecbdata->label_path[1], reset, name_b_tab); + emit_diff_symbol(o, DIFF_SYMBOL_FILEPAIR, +ecbdata->label_path[0], +strlen(ecbdata->label_path[0]), 0); + emit_diff_symbol(o, DIFF_SYMBOL_FILEPAIR, +ecbdata->label_path[1], +strlen(ecbdata->label_path[1]), 1); ecbdata->label_path[0] = ecbdata->label_path[1] = NULL; } -- 2.12.2.575.gb14f27f917
[PATCH 00/26] reroll of sb/diff-color-moved
This is a complete rewrite of the series. Highlights: * instead of buffering partial lines, we'll pretend all diff output follows a well defined grammar, and we emit symbols thereof. (The difference is mostly mental, though by this trick we reduce the memory footprint for storing one of these symbols from 7 variables (3 pointers, 3 ints, one state (also int) down to 4 variables (one pointer, 2 ints, one state). * The algorithm for color painting was detangled: -> different functions for block detection and dimming -> The last patch (not to be applied) is an RFC that shows how we would approach non-colored, but machine parseable highlighting of moved lines. Thanks, Stefan Stefan Beller (26): diff.c: readability fix diff.c: move line ending check into emit_hunk_header diff.c: factor out diff_flush_patch_all_file_pairs diff.c: introduce emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_MARKER diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF diff.c: migrate emit_line_checked to use emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS{_PORCELAIN} diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF submodule.c: migrate diff output to use emit_diff_symbol diff.c: convert emit_binary_diff_body to use emit_diff_symbol diff.c: convert show_stats to use emit_diff_symbol diff.c: convert word diffing to use emit_diff_symbol diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY diff.c: buffer all output if asked to diff.c: color moved lines differently diff.c: color moved lines differently, plain mode diff.c: add dimming to moved line detection diff: document the new --color-moved setting WIP/RFC: diff.c: have a "machine parseable" move coloring Documentation/config.txt | 12 +- Documentation/diff-options.txt | 27 + cache.h|1 + color.h|2 + diff.c | 1283 diff.h | 39 +- submodule.c| 85 ++- submodule.h| 13 +- t/t4015-diff-whitespace.sh | 369 9 files changed, 1515 insertions(+), 316 deletions(-) -- 2.12.2.575.gb14f27f917
[PATCH 15/26] submodule.c: migrate diff output to use emit_diff_symbol
As the submodule process is no longer attached to the same stdout as the superprojects process, we need to pass coloring explicitly. Remove the colors from the function signatures, as all the coloring decisions will be made either inside the child process or the final emit_diff_symbol. Signed-off-by: Stefan Beller --- diff.c | 89 - diff.h | 9 +++ submodule.c | 85 ++ submodule.h | 13 +++-- 4 files changed, 128 insertions(+), 68 deletions(-) diff --git a/diff.c b/diff.c index 96ce53c5cf..42a9020d95 100644 --- a/diff.c +++ b/diff.c @@ -574,6 +574,13 @@ enum diff_symbol { DIFF_SYMBOL_HEADER, DIFF_SYMBOL_BINARY_FILES, DIFF_SYMBOL_REWRITE_DIFF, + DIFF_SYMBOL_SUBMODULE_ADD, + DIFF_SYMBOL_SUBMODULE_DEL, + DIFF_SYMBOL_SUBMODULE_UNTRACKED, + DIFF_SYMBOL_SUBMODULE_MODIFIED, + DIFF_SYMBOL_SUBMODULE_HEADER, + DIFF_SYMBOL_SUBMODULE_ERROR, + DIFF_SYMBOL_SUBMODULE_PIPETHROUGH, }; /* * Flags for content lines: @@ -700,11 +707,77 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, reset = diff_get_color_opt(o, DIFF_RESET); emit_line(o, fraginfo, reset, line, len); break; + case DIFF_SYMBOL_SUBMODULE_ADD: + set = diff_get_color_opt(o, DIFF_FILE_NEW); + reset = diff_get_color_opt(o, DIFF_RESET); + emit_line(o, set, reset, line, len); + break; + case DIFF_SYMBOL_SUBMODULE_DEL: + set = diff_get_color_opt(o, DIFF_FILE_OLD); + reset = diff_get_color_opt(o, DIFF_RESET); + emit_line(o, set, reset, line, len); + break; + case DIFF_SYMBOL_SUBMODULE_UNTRACKED: + fprintf(o->file, "%sSubmodule %s contains untracked content\n", + diff_line_prefix(o), line); + break; + case DIFF_SYMBOL_SUBMODULE_MODIFIED: + fprintf(o->file, "%sSubmodule %s contains modified content\n", + diff_line_prefix(o), line); + break; + case DIFF_SYMBOL_SUBMODULE_HEADER: + fprintf(o->file, "%s%s", diff_line_prefix(o), line); + break; + case DIFF_SYMBOL_SUBMODULE_ERROR: + emit_line(o, "", "", line, len); + break; + case DIFF_SYMBOL_SUBMODULE_PIPETHROUGH: + emit_line(o, "", "", line, len); + break; default: die("BUG: unknown diff symbol"); } } +void diff_emit_submodule_del(struct diff_options *o, const char *line) +{ + emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_DEL, line, strlen(line), 0); +} + +void diff_emit_submodule_add(struct diff_options *o, const char *line) +{ + emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_ADD, line, strlen(line), 0); +} + +void diff_emit_submodule_untracked(struct diff_options *o, const char *path) +{ + emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_UNTRACKED, +path, strlen(path), 0); +} + +void diff_emit_submodule_modified(struct diff_options *o, const char *path) +{ + emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_MODIFIED, +path, strlen(path), 0); +} + +void diff_emit_submodule_header(struct diff_options *o, const char *header) +{ + emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_HEADER, +header, strlen(header), 0); +} + +void diff_emit_submodule_error(struct diff_options *o, const char *err) +{ + emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_ERROR, err, strlen(err), 0); +} + +void diff_emit_submodule_pipethrough(struct diff_options *o, +const char *line, int len) +{ + emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_PIPETHROUGH, line, len, 0); +} + static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len) { if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) && @@ -2465,24 +2538,16 @@ static void builtin_diff(const char *name_a, if (o->submodule_format == DIFF_SUBMODULE_LOG && (!one->mode || S_ISGITLINK(one->mode)) && (!two->mode || S_ISGITLINK(two->mode))) { - const char *del = diff_get_color_opt(o, DIFF_FILE_OLD); - const char *add = diff_get_color_opt(o, DIFF_FILE_NEW); - show_submodule_summary(o->file, one->path ? one->path : two->path, - line_prefix, + show_submodule_summary(o, one->path ? one->path : two->path, &one->oid, &two->oid, - two->dirty_submodule, - meta, del, add, reset); + two->dirty_submodule); return; } else if (o->submodule_format == DIFF_SUBMODULE_INLINE_DIFF
[PATCH 18/26] diff.c: convert word diffing to use emit_diff_symbol
The word diffing is not line oriented and would need some serious effort to be transformed into a line oriented approach, so just go with a symbol DIFF_SYMBOL_WORD_DIFF that is a partial line. Signed-off-by: Stefan Beller --- diff.c | 79 ++ 1 file changed, 46 insertions(+), 33 deletions(-) diff --git a/diff.c b/diff.c index 0a5ff2500f..5b46baa12a 100644 --- a/diff.c +++ b/diff.c @@ -590,6 +590,7 @@ enum diff_symbol { DIFF_SYMBOL_STATS_SUMMARY_ABBREV, DIFF_SYMBOL_STATS_SUMMARY_INSERTS_DELETES, DIFF_SYMBOL_STATS_LINE, + DIFF_SYMBOL_WORD_DIFF, }; /* * Flags for content lines: @@ -772,6 +773,9 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, case DIFF_SYMBOL_STATS_SUMMARY_INSERTS_DELETES: emit_line(o, "", "", line, len); break; + case DIFF_SYMBOL_WORD_DIFF: + fprintf(o->file, "%.*s", len, line); + break; default: die("BUG: unknown diff symbol"); } @@ -1100,37 +1104,49 @@ struct diff_words_data { struct diff_words_style *style; }; -static int fn_out_diff_words_write_helper(FILE *fp, +static int fn_out_diff_words_write_helper(struct diff_options *o, struct diff_words_style_elem *st_el, const char *newline, - size_t count, const char *buf, - const char *line_prefix) + size_t count, const char *buf) { int print = 0; + struct strbuf sb = STRBUF_INIT; while (count) { char *p = memchr(buf, '\n', count); if (print) - fputs(line_prefix, fp); + strbuf_addstr(&sb, diff_line_prefix(o)); + if (p != buf) { - if (st_el->color && fputs(st_el->color, fp) < 0) - return -1; - if (fputs(st_el->prefix, fp) < 0 || - fwrite(buf, p ? p - buf : count, 1, fp) != 1 || - fputs(st_el->suffix, fp) < 0) - return -1; - if (st_el->color && *st_el->color - && fputs(GIT_COLOR_RESET, fp) < 0) - return -1; + const char *reset = st_el->color && *st_el->color ? + GIT_COLOR_RESET : NULL; + if (st_el->color && *st_el->color) + strbuf_addstr(&sb, st_el->color); + strbuf_addstr(&sb, st_el->prefix); + strbuf_add(&sb, buf, p ? p - buf : count); + strbuf_addstr(&sb, st_el->suffix); + if (reset) + strbuf_addstr(&sb, reset); } if (!p) - return 0; - if (fputs(newline, fp) < 0) - return -1; + goto out; + + strbuf_addstr(&sb, newline); count -= p + 1 - buf; buf = p + 1; print = 1; + if (count) { + emit_diff_symbol(o, DIFF_SYMBOL_WORD_DIFF, +sb.buf, sb.len, 0); + strbuf_reset(&sb); + } } + +out: + if (sb.len) + emit_diff_symbol(o, DIFF_SYMBOL_WORD_DIFF, +sb.buf, sb.len, 0); + strbuf_release(&sb); return 0; } @@ -1212,24 +1228,20 @@ static void fn_out_diff_words_aux(void *priv, char *line, unsigned long len) fputs(line_prefix, diff_words->opt->file); } if (diff_words->current_plus != plus_begin) { - fn_out_diff_words_write_helper(diff_words->opt->file, + fn_out_diff_words_write_helper(diff_words->opt, &style->ctx, style->newline, plus_begin - diff_words->current_plus, - diff_words->current_plus, line_prefix); - if (*(plus_begin - 1) == '\n') - fputs(line_prefix, diff_words->opt->file); + diff_words->current_plus); } if (minus_begin != minus_end) { - fn_out_diff_words_write_helper(diff_words->opt->file, + fn_out_diff_words_write_helper(diff_words->opt, &style->old, style->newline, - minus_end - minus_begin, minus_begin, - line_prefix); + minus_end - minus_begin, minus_begin); } if (plus_begin !=
[PATCH 02/26] diff.c: move line ending check into emit_hunk_header
The emit_hunk_header() function is responsible for assembling a hunk header and calling emit_line() to send the hunk header to the output file. Its only caller fn_out_consume() needs to prepare for a case where the function emits an incomplete line and add the terminating LF. Instead make sure emit_hunk_header() to always send a completed line to emit_line(). Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 3f5bf8b5a4..c2ed605cd0 100644 --- a/diff.c +++ b/diff.c @@ -677,6 +677,8 @@ static void emit_hunk_header(struct emit_callback *ecbdata, } strbuf_add(&msgbuf, line + len, org_len - len); + strbuf_complete_line(&msgbuf); + emit_line(ecbdata->opt, "", "", msgbuf.buf, msgbuf.len); strbuf_release(&msgbuf); } @@ -1315,8 +1317,6 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) len = sane_truncate_line(ecbdata, line, len); find_lno(line, ecbdata); emit_hunk_header(ecbdata, line, len); - if (line[len-1] != '\n') - putc('\n', o->file); return; } -- 2.12.2.575.gb14f27f917
[PATCH 17/26] diff.c: convert show_stats to use emit_diff_symbol
We call print_stat_summary from builtin/apply, so we still need the version with a file pointer, so introduce print_stat_summary_0 that uses emit_string machinery and keep print_stat_summary with the same arguments around. Signed-off-by: Stefan Beller --- diff.c | 118 + diff.h | 4 +-- 2 files changed, 77 insertions(+), 45 deletions(-) diff --git a/diff.c b/diff.c index e6ade5fde0..0a5ff2500f 100644 --- a/diff.c +++ b/diff.c @@ -586,7 +586,10 @@ enum diff_symbol { DIFF_SYMBOL_SUBMODULE_HEADER, DIFF_SYMBOL_SUBMODULE_ERROR, DIFF_SYMBOL_SUBMODULE_PIPETHROUGH, - + DIFF_SYMBOL_STATS_SUMMARY_NO_FILES, + DIFF_SYMBOL_STATS_SUMMARY_ABBREV, + DIFF_SYMBOL_STATS_SUMMARY_INSERTS_DELETES, + DIFF_SYMBOL_STATS_LINE, }; /* * Flags for content lines: @@ -628,6 +631,7 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, { static const char *nneof = " No newline at end of file\n"; const char *context, *reset, *set, *meta, *fraginfo; + struct strbuf sb = STRBUF_INIT; switch (s) { case DIFF_SYMBOL_SEPARATOR: fprintf(o->file, "%s%c", @@ -756,9 +760,22 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, case DIFF_SYMBOL_SUBMODULE_PIPETHROUGH: emit_line(o, "", "", line, len); break; + case DIFF_SYMBOL_STATS_SUMMARY_NO_FILES: + fprintf(o->file, " 0 files changed\n"); + break; + case DIFF_SYMBOL_STATS_SUMMARY_ABBREV: + emit_line(o, "", "", " ...\n", strlen(" ...\n")); + break; + case DIFF_SYMBOL_STATS_LINE: + emit_line(o, "", "", line, len); + break; + case DIFF_SYMBOL_STATS_SUMMARY_INSERTS_DELETES: + emit_line(o, "", "", line, len); + break; default: die("BUG: unknown diff symbol"); } + strbuf_release(&sb); } void diff_emit_submodule_del(struct diff_options *o, const char *line) @@ -1712,20 +1729,14 @@ static int scale_linear(int it, int width, int max_change) return 1 + (it * (width - 1) / max_change); } -static void show_name(FILE *file, - const char *prefix, const char *name, int len) -{ - fprintf(file, " %s%-*s |", prefix, len, name); -} - -static void show_graph(FILE *file, char ch, int cnt, const char *set, const char *reset) +static void show_graph(struct strbuf *out, char ch, int cnt, + const char *set, const char *reset) { if (cnt <= 0) return; - fprintf(file, "%s", set); - while (cnt--) - putc(ch, file); - fprintf(file, "%s", reset); + strbuf_addstr(out, set); + strbuf_addchars(out, ch, cnt); + strbuf_addstr(out, reset); } static void fill_print_name(struct diffstat_file *file) @@ -1749,14 +1760,16 @@ static void fill_print_name(struct diffstat_file *file) file->print_name = pname; } -int print_stat_summary(FILE *fp, int files, int insertions, int deletions) +static void print_stat_summary_inserts_deletes(struct diff_options *options, + int files, int insertions, int deletions) { struct strbuf sb = STRBUF_INIT; - int ret; if (!files) { assert(insertions == 0 && deletions == 0); - return fprintf(fp, "%s\n", " 0 files changed"); + emit_diff_symbol(options, DIFF_SYMBOL_STATS_SUMMARY_NO_FILES, +NULL, 0, 0); + return; } strbuf_addf(&sb, @@ -1783,9 +1796,19 @@ int print_stat_summary(FILE *fp, int files, int insertions, int deletions) deletions); } strbuf_addch(&sb, '\n'); - ret = fputs(sb.buf, fp); + emit_diff_symbol(options, DIFF_SYMBOL_STATS_SUMMARY_INSERTS_DELETES, +sb.buf, sb.len, 0); strbuf_release(&sb); - return ret; +} + +void print_stat_summary(FILE *fp, int files, + int insertions, int deletions) +{ + struct diff_options o; + memset(&o, 0, sizeof(o)); + o.file = fp; + + print_stat_summary_inserts_deletes(&o, files, insertions, deletions); } static void show_stats(struct diffstat_t *data, struct diff_options *options) @@ -1795,13 +1818,13 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) int total_files = data->nr, count; int width, name_width, graph_width, number_width = 0, bin_width = 0; const char *reset, *add_c, *del_c; - const char *line_prefix = ""; int extra_shown = 0; + const char *line_prefix = diff_line_prefix(options); + struct strbuf out = STRBUF_INIT; if (data->nr == 0) return; - line_prefix = diff_line_prefix(options);
[PATCH 25/26] diff: document the new --color-moved setting
Signed-off-by: Stefan Beller --- Documentation/config.txt | 12 ++-- Documentation/diff-options.txt | 27 +++ 2 files changed, 37 insertions(+), 2 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 475e874d51..29e0b9fa69 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1051,14 +1051,22 @@ This does not affect linkgit:git-format-patch[1] or the 'git-diff-{asterisk}' plumbing commands. Can be overridden on the command line with the `--color[=]` option. +diff.colorMoved:: + If set moved lines in a diff are colored differently, + for details see '--color-moved' in linkgit:git-diff[1]. + color.diff.:: Use customized color for diff colorization. `` specifies which part of the patch to use the specified color, and is one of `context` (context text - `plain` is a historical synonym), `meta` (metainformation), `frag` (hunk header), 'func' (function in hunk header), `old` (removed lines), - `new` (added lines), `commit` (commit headers), or `whitespace` - (highlighting whitespace errors). + `new` (added lines), `commit` (commit headers), `whitespace` + (highlighting whitespace errors), `oldMoved` (deleted lines), + `newMoved` (added lines), `oldMovedDimmed`, `oldMovedAlternative`, + `oldMovedAlternativeDimmed`, `newMovedDimmed`, `newMovedAlternative` + and `newMovedAlternativeDimmed` (See the '' + setting of '--color-moved' in linkgit:git-diff[1] for details). color.decorate.:: Use customized color for 'git log --decorate' output. `` is one diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 89cc0f48de..058c8014ed 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -231,6 +231,33 @@ ifdef::git-diff[] endif::git-diff[] It is the same as `--color=never`. +--color-moved[=]:: + Moved lines of code are colored differently. +ifdef::git-diff[] + It can be changed by the `diff.colorMoved` configuration setting. +endif::git-diff[] + The defaults to 'no' if the option is not given + and to 'dimmed_zebra' if the option with no mode is given. + The mode must be one of: ++ +-- +no:: + Moved lines are not highlighted. +plain:: + Any line that is added in one location and was removed + in another location will be colored with 'color.diff.newMoved'. + Similarly 'color.diff.oldMoved' will be used for removed lines + that are added somewhere else in the diff. +zebra:: + Blocks of moved code are detected. The detected blocks are + painted using the 'color.diff.{old,new}Moved' alternating with + 'color.diff.{old,new}MovedAlternative'. +dimmed_zebra:: + Similar to 'zebra', but additional dimming of uninteresting parts + of moved code is performed. The bordering lines of two adjacent + blocks are considered interesting, the rest is uninteresting. +-- + --word-diff[=]:: Show a word diff, using the to delimit changed words. By default, words are delimited by whitespace; see -- 2.12.2.575.gb14f27f917
[PATCH 04/26] diff.c: introduce emit_diff_symbol
Signed-off-by: Stefan Beller --- diff.c | 22 +++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/diff.c b/diff.c index 2f9722b382..89466018e5 100644 --- a/diff.c +++ b/diff.c @@ -559,6 +559,24 @@ static void emit_line(struct diff_options *o, const char *set, const char *reset emit_line_0(o, set, reset, line[0], line+1, len-1); } +enum diff_symbol { + DIFF_SYMBOL_SEPARATOR, +}; + +static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, +const char *line, int len) +{ + switch (s) { + case DIFF_SYMBOL_SEPARATOR: + fprintf(o->file, "%s%c", + diff_line_prefix(o), + o->line_termination); + break; + default: + die("BUG: unknown diff symbol"); + } +} + static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len) { if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) && @@ -4833,9 +4851,7 @@ void diff_flush(struct diff_options *options) if (output_format & DIFF_FORMAT_PATCH) { if (separator) { - fprintf(options->file, "%s%c", - diff_line_prefix(options), - options->line_termination); + emit_diff_symbol(options, DIFF_SYMBOL_SEPARATOR, NULL, 0); if (options->stat_sep) { /* attach patch instead of inline */ fputs(options->stat_sep, options->file); -- 2.12.2.575.gb14f27f917
[PATCH 14/26] diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF
Signed-off-by: Stefan Beller --- diff.c | 35 +-- 1 file changed, 21 insertions(+), 14 deletions(-) diff --git a/diff.c b/diff.c index e56962b844..96ce53c5cf 100644 --- a/diff.c +++ b/diff.c @@ -573,6 +573,7 @@ enum diff_symbol { DIFF_SYMBOL_FILEPAIR, DIFF_SYMBOL_HEADER, DIFF_SYMBOL_BINARY_FILES, + DIFF_SYMBOL_REWRITE_DIFF, }; /* * Flags for content lines: @@ -613,7 +614,7 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, const char *line, int len, unsigned flags) { static const char *nneof = " No newline at end of file\n"; - const char *context, *reset, *set, *meta; + const char *context, *reset, *set, *meta, *fraginfo; switch (s) { case DIFF_SYMBOL_SEPARATOR: fprintf(o->file, "%s%c", @@ -694,6 +695,11 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, fprintf(o->file, "%sBinary files %s differ\n", diff_line_prefix(o), line); break; + case DIFF_SYMBOL_REWRITE_DIFF: + fraginfo = diff_get_color(o->use_color, DIFF_FRAGINFO); + reset = diff_get_color_opt(o, DIFF_RESET); + emit_line(o, fraginfo, reset, line, len); + break; default: die("BUG: unknown diff symbol"); } @@ -816,17 +822,17 @@ static void remove_tempfile(void) } } -static void print_line_count(FILE *file, int count) +static void add_line_count(struct strbuf *out, int count) { switch (count) { case 0: - fprintf(file, "0,0"); + strbuf_addstr(out, "0,0"); break; case 1: - fprintf(file, "1"); + strbuf_addstr(out, "1"); break; default: - fprintf(file, "1,%d", count); + strbuf_addf(out, "1,%d", count); break; } } @@ -865,14 +871,12 @@ static void emit_rewrite_diff(const char *name_a, struct diff_options *o) { int lc_a, lc_b; - const char *fraginfo = diff_get_color(o->use_color, DIFF_FRAGINFO); - const char *reset = diff_get_color(o->use_color, DIFF_RESET); static struct strbuf a_name = STRBUF_INIT, b_name = STRBUF_INIT; const char *a_prefix, *b_prefix; char *data_one, *data_two; size_t size_one, size_two; struct emit_callback ecbdata; - const char *line_prefix = diff_line_prefix(o); + struct strbuf out = STRBUF_INIT; if (diff_mnemonic_prefix && DIFF_OPT_TST(o, REVERSE_DIFF)) { a_prefix = o->b_prefix; @@ -914,14 +918,17 @@ static void emit_rewrite_diff(const char *name_a, emit_diff_symbol(o, DIFF_SYMBOL_FILEPAIR, a_name.buf, a_name.len, 0); emit_diff_symbol(o, DIFF_SYMBOL_FILEPAIR, b_name.buf, b_name.len, 1); - fprintf(o->file, "%s%s@@ -", line_prefix, fraginfo); + strbuf_addstr(&out, "@@ -"); if (!o->irreversible_delete) - print_line_count(o->file, lc_a); + add_line_count(&out, lc_a); else - fprintf(o->file, "?,?"); - fprintf(o->file, " +"); - print_line_count(o->file, lc_b); - fprintf(o->file, " @@%s\n", reset); + strbuf_addstr(&out, "?,?"); + strbuf_addstr(&out, " +"); + add_line_count(&out, lc_b); + strbuf_addstr(&out, " @@\n"); + emit_diff_symbol(o, DIFF_SYMBOL_REWRITE_DIFF, out.buf, out.len, 0); + strbuf_release(&out); + if (lc_a && !o->irreversible_delete) emit_rewrite_lines(&ecbdata, '-', data_one, size_one); if (lc_b) -- 2.12.2.575.gb14f27f917
[PATCH 12/26] diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER
Signed-off-by: Stefan Beller --- diff.c | 28 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/diff.c b/diff.c index 7c56150b56..95f11ea86f 100644 --- a/diff.c +++ b/diff.c @@ -571,6 +571,7 @@ enum diff_symbol { DIFF_SYMBOL_WORDS_PORCELAIN, DIFF_SYMBOL_WORDS, DIFF_SYMBOL_FILEPAIR, + DIFF_SYMBOL_HEADER, }; /* * Flags for content lines: @@ -685,6 +686,9 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, line, reset, strchr(line, ' ') ? "\t" : ""); break; + case DIFF_SYMBOL_HEADER: + fprintf(o->file, "%s", line); + break; default: die("BUG: unknown diff symbol"); } @@ -1380,7 +1384,8 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) o->found_changes = 1; if (ecbdata->header) { - fprintf(o->file, "%s", ecbdata->header->buf); + emit_diff_symbol(o, DIFF_SYMBOL_HEADER, +ecbdata->header->buf, ecbdata->header->len, 0); strbuf_reset(ecbdata->header); ecbdata->header = NULL; } @@ -2514,7 +2519,8 @@ static void builtin_diff(const char *name_a, if (complete_rewrite && (textconv_one || !diff_filespec_is_binary(one)) && (textconv_two || !diff_filespec_is_binary(two))) { - fprintf(o->file, "%s", header.buf); + emit_diff_symbol(o, DIFF_SYMBOL_HEADER, +header.buf, header.len, 0); strbuf_reset(&header); emit_rewrite_diff(name_a, name_b, one, two, textconv_one, textconv_two, o); @@ -2524,7 +2530,8 @@ static void builtin_diff(const char *name_a, } if (o->irreversible_delete && lbl[1][0] == '/') { - fprintf(o->file, "%s", header.buf); + emit_diff_symbol(o, DIFF_SYMBOL_HEADER, header.buf, +header.len, 0); strbuf_reset(&header); goto free_ab_and_return; } else if (!DIFF_OPT_TST(o, TEXT) && @@ -2535,10 +2542,13 @@ static void builtin_diff(const char *name_a, !DIFF_OPT_TST(o, BINARY)) { if (!oidcmp(&one->oid, &two->oid)) { if (must_show_header) - fprintf(o->file, "%s", header.buf); + emit_diff_symbol(o, DIFF_SYMBOL_HEADER, +header.buf, header.len, +0); goto free_ab_and_return; } - fprintf(o->file, "%s", header.buf); + emit_diff_symbol(o, DIFF_SYMBOL_HEADER, +header.buf, header.len, 0); fprintf(o->file, "%sBinary files %s and %s differ\n", line_prefix, lbl[0], lbl[1]); goto free_ab_and_return; @@ -2549,10 +2559,11 @@ static void builtin_diff(const char *name_a, if (mf1.size == mf2.size && !memcmp(mf1.ptr, mf2.ptr, mf1.size)) { if (must_show_header) - fprintf(o->file, "%s", header.buf); + emit_diff_symbol(o, DIFF_SYMBOL_HEADER, +header.buf, header.len, 0); goto free_ab_and_return; } - fprintf(o->file, "%s", header.buf); + emit_diff_symbol(o, DIFF_SYMBOL_HEADER, header.buf, header.len, 0); strbuf_reset(&header); if (DIFF_OPT_TST(o, BINARY)) emit_binary_diff(o->file, &mf1, &mf2, line_prefix); @@ -2570,7 +2581,8 @@ static void builtin_diff(const char *name_a, const struct userdiff_funcname *pe; if (must_show_header) { - fprintf(o->file, "%s", header.buf); + emit_diff_symbol(o, DIFF_SYMBOL_HEADER, +header.buf, header.len, 0); strbuf_reset(&header); } -- 2.12.2.575.gb14f27f917
[PATCH 07/26] diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF
Signed-off-by: Stefan Beller --- diff.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/diff.c b/diff.c index a1804a7785..28be3226c2 100644 --- a/diff.c +++ b/diff.c @@ -563,11 +563,13 @@ enum diff_symbol { DIFF_SYMBOL_SEPARATOR, DIFF_SYMBOL_CONTEXT_MARKER, DIFF_SYMBOL_CONTEXT_FRAGINFO, + DIFF_SYMBOL_NO_LF_EOF, }; static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, const char *line, int len) { + static const char *nneof = " No newline at end of file\n"; const char *context, *reset; switch (s) { case DIFF_SYMBOL_SEPARATOR: @@ -583,6 +585,13 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, case DIFF_SYMBOL_CONTEXT_FRAGINFO: emit_line(o, "", "", line, len); break; + case DIFF_SYMBOL_NO_LF_EOF: + context = diff_get_color_opt(o, DIFF_CONTEXT); + reset = diff_get_color_opt(o, DIFF_RESET); + putc('\n', o->file); + emit_line_0(o, context, reset, '\\', + nneof, strlen(nneof)); + break; default: die("BUG: unknown diff symbol"); } @@ -750,7 +759,6 @@ static void emit_rewrite_lines(struct emit_callback *ecb, int prefix, const char *data, int size) { const char *endp = NULL; - static const char *nneof = " No newline at end of file\n"; const char *reset = diff_get_color(ecb->color_diff, DIFF_RESET); while (0 < size) { @@ -768,13 +776,8 @@ static void emit_rewrite_lines(struct emit_callback *ecb, size -= len; data += len; } - if (!endp) { - const char *context = diff_get_color(ecb->color_diff, -DIFF_CONTEXT); - putc('\n', ecb->opt->file); - emit_line_0(ecb->opt, context, reset, '\\', - nneof, strlen(nneof)); - } + if (!endp) + emit_diff_symbol(ecb->opt, DIFF_SYMBOL_NO_LF_EOF, NULL, 0); } static void emit_rewrite_diff(const char *name_a, -- 2.12.2.575.gb14f27f917
[PATCH 06/26] diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO
Signed-off-by: Stefan Beller --- diff.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 3af07fa659..a1804a7785 100644 --- a/diff.c +++ b/diff.c @@ -562,6 +562,7 @@ static void emit_line(struct diff_options *o, const char *set, const char *reset enum diff_symbol { DIFF_SYMBOL_SEPARATOR, DIFF_SYMBOL_CONTEXT_MARKER, + DIFF_SYMBOL_CONTEXT_FRAGINFO, }; static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, @@ -579,6 +580,9 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, reset = diff_get_color_opt(o, DIFF_RESET); emit_line(o, context, reset, line, len); break; + case DIFF_SYMBOL_CONTEXT_FRAGINFO: + emit_line(o, "", "", line, len); + break; default: die("BUG: unknown diff symbol"); } @@ -704,8 +708,8 @@ static void emit_hunk_header(struct emit_callback *ecbdata, strbuf_add(&msgbuf, line + len, org_len - len); strbuf_complete_line(&msgbuf); - - emit_line(ecbdata->opt, "", "", msgbuf.buf, msgbuf.len); + emit_diff_symbol(ecbdata->opt, +DIFF_SYMBOL_CONTEXT_FRAGINFO, msgbuf.buf, msgbuf.len); strbuf_release(&msgbuf); } -- 2.12.2.575.gb14f27f917
[PATCH 08/26] diff.c: migrate emit_line_checked to use emit_diff_symbol
Add a new flags field to emit_diff_symbol, that will be used by context lines for: * white space rules that are applicable (The first 12 bits) * how the rules are evaluated (actually this double encodes the sign of the line, but the code is easier to keep this way, bits 13,14,15) * if the line a blank line at EOF (bit 16) Signed-off-by: Stefan Beller --- cache.h | 1 + diff.c | 112 diff.h | 6 ++-- 3 files changed, 75 insertions(+), 44 deletions(-) diff --git a/cache.h b/cache.h index e1f0e182ad..4d63c44f07 100644 --- a/cache.h +++ b/cache.h @@ -2168,6 +2168,7 @@ void shift_tree_by(const struct object_id *, const struct object_id *, struct ob #define WS_TRAILING_SPACE (WS_BLANK_AT_EOL|WS_BLANK_AT_EOF) #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB|8) #define WS_TAB_WIDTH_MASK077 +#define WS_RULE_MASK 0 extern unsigned whitespace_rule_cfg; extern unsigned whitespace_rule(const char *); extern unsigned parse_whitespace_rule(const char *); diff --git a/diff.c b/diff.c index 28be3226c2..42c9f48dc2 100644 --- a/diff.c +++ b/diff.c @@ -564,13 +564,50 @@ enum diff_symbol { DIFF_SYMBOL_CONTEXT_MARKER, DIFF_SYMBOL_CONTEXT_FRAGINFO, DIFF_SYMBOL_NO_LF_EOF, + DIFF_SYMBOL_CONTEXT, + DIFF_SYMBOL_PLUS, + DIFF_SYMBOL_MINUS, }; +/* + * Flags for content lines: + * 0..12 are whitespace rules + * 13-15 are WSEH_NEW | WSEH_OLD | WSEH_CONTEXT + * 16 is marking if the line is blank at EOF + */ +#define DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF (1<<16) +#define DIFF_SYMBOL_CONTENT_WS_MASK (WSEH_NEW | WSEH_OLD | WSEH_CONTEXT | WS_RULE_MASK) + +static void emit_line_ws_markup(struct diff_options *o, + const char *set, const char *reset, + const char *line, int len, char sign, + unsigned ws_rule, int blank_at_eof) +{ + const char *ws = NULL; + + if (o->ws_error_highlight & ws_rule) { + ws = diff_get_color_opt(o, DIFF_WHITESPACE); + if (!*ws) + ws = NULL; + } + + if (!ws) + emit_line_0(o, set, reset, sign, line, len); + else if (blank_at_eof) + /* Blank line at EOF - paint '+' as well */ + emit_line_0(o, ws, reset, sign, line, len); + else { + /* Emit just the prefix, then the rest. */ + emit_line_0(o, set, reset, sign, "", 0); + ws_check_emit(line, len, ws_rule, + o->file, set, reset, ws); + } +} static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, -const char *line, int len) +const char *line, int len, unsigned flags) { static const char *nneof = " No newline at end of file\n"; - const char *context, *reset; + const char *context, *reset, *set; switch (s) { case DIFF_SYMBOL_SEPARATOR: fprintf(o->file, "%s%c", @@ -592,6 +629,25 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, emit_line_0(o, context, reset, '\\', nneof, strlen(nneof)); break; + case DIFF_SYMBOL_CONTEXT: + set = diff_get_color_opt(o, DIFF_CONTEXT); + reset = diff_get_color_opt(o, DIFF_RESET); + emit_line_ws_markup(o, set, reset, line, len, ' ', + flags & (DIFF_SYMBOL_CONTENT_WS_MASK), 0); + break; + case DIFF_SYMBOL_PLUS: + set = diff_get_color_opt(o, DIFF_FILE_NEW); + reset = diff_get_color_opt(o, DIFF_RESET); + emit_line_ws_markup(o, set, reset, line, len, '+', + flags & DIFF_SYMBOL_CONTENT_WS_MASK, + flags & DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF); + break; + case DIFF_SYMBOL_MINUS: + set = diff_get_color_opt(o, DIFF_FILE_OLD); + reset = diff_get_color_opt(o, DIFF_RESET); + emit_line_ws_markup(o, set, reset, line, len, '-', + flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0); + break; default: die("BUG: unknown diff symbol"); } @@ -608,57 +664,31 @@ static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line return ws_blank_line(line, len, ecbdata->ws_rule); } -static void emit_line_checked(const char *reset, - struct emit_callback *ecbdata, - const char *line, int len, - enum color_diff color, - unsigned ws_error_highlight, - char sign) -{ - const char *set = diff_get_color(ecbdata->colo
[PATCH 19/26] diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP
Signed-off-by: Stefan Beller --- diff.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/diff.c b/diff.c index 5b46baa12a..18bfc4720b 100644 --- a/diff.c +++ b/diff.c @@ -591,6 +591,7 @@ enum diff_symbol { DIFF_SYMBOL_STATS_SUMMARY_INSERTS_DELETES, DIFF_SYMBOL_STATS_LINE, DIFF_SYMBOL_WORD_DIFF, + DIFF_SYMBOL_STAT_SEP, }; /* * Flags for content lines: @@ -776,6 +777,9 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, case DIFF_SYMBOL_WORD_DIFF: fprintf(o->file, "%.*s", len, line); break; + case DIFF_SYMBOL_STAT_SEP: + fputs(o->stat_sep, o->file); + break; default: die("BUG: unknown diff symbol"); } @@ -5085,10 +5089,10 @@ void diff_flush(struct diff_options *options) if (output_format & DIFF_FORMAT_PATCH) { if (separator) { emit_diff_symbol(options, DIFF_SYMBOL_SEPARATOR, NULL, 0, 0); - if (options->stat_sep) { + if (options->stat_sep) /* attach patch instead of inline */ - fputs(options->stat_sep, options->file); - } + emit_diff_symbol(options, DIFF_SYMBOL_STAT_SEP, +NULL, 0, 0); } diff_flush_patch_all_file_pairs(options); -- 2.12.2.575.gb14f27f917
Re: [PATCH v4 5/5] stash: implement builtin stash
On Sun, Jun 11, 2017 at 2:27 PM, Thomas Gummerer wrote: >> + >> +int cmd_stash(int argc, const char **argv, const char *prefix) >> +{ >> + int result = 0; >> + pid_t pid = getpid(); >> + >> + struct option options[] = { >> + OPT_END() >> + }; >> + >> + git_config(git_default_config, NULL); >> + >> + xsnprintf(stash_index_path, 64, ".git/index.stash.%d", pid); >> + >> + argc = parse_options(argc, argv, prefix, options, git_stash_usage, >> + PARSE_OPT_KEEP_UNKNOWN|PARSE_OPT_KEEP_DASHDASH); >> + >> + if (argc < 1) { >> + result = do_push_stash(NULL, prefix, 0, 0, 0, 0, NULL); >> + } else if (!strcmp(argv[0], "list")) >> + result = list_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "show")) >> + result = show_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "save")) >> + result = save_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "push")) >> + result = push_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "apply")) >> + result = apply_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "clear")) >> + result = clear_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "create")) >> + result = create_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "store")) >> + result = store_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "drop")) >> + result = drop_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "pop")) >> + result = pop_stash(argc, argv, prefix); >> + else if (!strcmp(argv[0], "branch")) >> + result = branch_stash(argc, argv, prefix); >> + else { >> + if (argv[0][0] == '-') { >> + struct argv_array args = ARGV_ARRAY_INIT; >> + argv_array_push(&args, "push"); >> + argv_array_pushv(&args, argv); >> + result = push_stash(args.argc, args.argv, prefix); > > This is a bit of a change in behaviour to what we currently have. > > The rules we decided on are as follows: > > - "git stash -p" is an alias for "git stash push -p". > - "git stash" with only option arguments is an alias for "git stash >push" with those same arguments. non-option arguments can be >specified after a "--" for disambiguation. > > The above makes "git stash -*" a alias for "git stash push -*". This > would result in a change of behaviour, for example in the case where > someone would use "git stash -this is a test-". In that case the > current behaviour is to create a stash with the message "-this is a > test-", while the above would end up making git stash error out. The > discussion on how we came up with those rules can be found at > http://public-inbox.org/git/20170206161432.zvpsqegjspaa2...@sigill.intra.peff.net/. I don't really like the "argv[0][0] == '-'" logic, but it doesn't seem to have the flaw you pointed out: $ ./git stash -this is a test- error: unknown switch `t' usage: git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet] [...] I'm not sure this is the best possible error message, but it's just as useful as the message from the old version. > >> + if (!result) >> + printf_ln(_("To restore them type \"git stash >> apply\"")); > > In the shell script this is only displayed when the stash_push in the > case where git stash is invoked with no arguments, not in the push > case if I read this correctly. So the two lines above should go in > the (argc < 1) case I think. I think it's correct as is. One of the tests checks for this string to be output, and if I move the line, the test fails. I agreed with all the other points you raised, and they will be fixed in my next revision.
Re: [PATCH v5 08/10] rebase -i: skip unnecessary picks using the rebase--helper
On 19/06/17 05:45 AM, Johannes Schindelin wrote: > Hi Liam, > > On Sat, 17 Jun 2017, Liam Beguin wrote: > >> On 16/06/17 09:56 AM, Johannes Schindelin wrote: >> >>> On Thu, 15 Jun 2017, Liam Beguin wrote: >>> On 14/06/17 09:08 AM, Johannes Schindelin wrote: > diff --git a/sequencer.c b/sequencer.c > index a697906d463..a0e020dab09 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -2640,3 +2640,110 @@ int check_todo_list(void) > > return res; > } > + > +/* skip picking commits whose parents are unchanged */ > +int skip_unnecessary_picks(void) > +{ > + const char *todo_file = rebase_path_todo(); > + struct strbuf buf = STRBUF_INIT; > + struct todo_list todo_list = TODO_LIST_INIT; > + struct object_id onto_oid, *oid = &onto_oid, *parent_oid; > + int fd, i; > + > + if (!read_oneliner(&buf, rebase_path_onto(), 0)) > + return error(_("could not read 'onto'")); > + if (get_sha1(buf.buf, onto_oid.hash)) { I missed this last time but we could also replace `get_sha1` with `get_oid` >>> >>> Good point! >>> >>> I replaced this locally and force-pushed, but there is actually little >>> chance of this patch series being integrated in a form with which I >>> would be comfortable. >> >> Is there any chance of this moving forward? I was hoping to resend my >> path to abbreviate command names on top of this. > > We are at an impasse right now. > > Junio wants me to change this code: > > revs.pretty_given = 1; > git_config_get_string("rebase.instructionFormat", &format); > if (!format || !*format) { > free(format); > format = xstrdup("%s"); > } > get_commit_format(format, &revs); > free(format); > pp.fmt = revs.commit_format; > pp.output_encoding = get_log_output_encoding(); > > if (setup_revisions(argc, argv, &revs, NULL) > 1) > ... > > which is reasonably compile-time safe, to something like this: > > struct strbuf userformat = STRBUF_INIT; > struct argv_array args = ARGV_ARRAY_INIT; > ... > for (i = 0; i < argc; i++) > argv_array_push(&args, argv[i]); > git_config_get_string("rebase.instructionFormat", &format); > if (!format || !*format) > argv_array_push(&args, "--format=%s"); > else { > strbuf_addf(&userformat, "--format=%s", format); > argv_array_push(&args, userformat.buf); > } > > if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1) > ... > argv_array_clear(&args); > strbuf_release(&userformat); > > which is not compile-time safe, harder to read, sloppy coding in my book. > > The reason for this suggestion is that one of the revision machinery's > implementation details is an ugly little semi-secret: the pretty-printing > machinery uses a global state, and that is why we need the "pretty_given" > flag in the first place. > > Junio thinks that it would be better to not use the pretty_given flag in > this patch series. I disagree: It would be better to use the pretty_given > flag, also as a good motivator to work on removing the global state in the > pretty-printing machinery. > > Junio wants to strong-arm me into accepting authorship for this sloppy > coding, and I simply won't do it. > > That's why we are at an impasse right now. > > I am really, really sorry that this affects your patch series, as I had > not foreseen Junio's insistence on the sloppy coding when I suggested that > you rebase your work on top of my patch series. I just was really > unprepared for this contention, and I am still surprised/shocked that this > is even an issue up for discussion. > > Be that as it may, I see that this is a very bad experience for a > motivated contributor such as yourself. I am very sorry about that! It's not such a bad experience, I've found the people on the list to be quite welcoming and supportive. > > So it may actually be better for you to go forward with your original > patch series targeting git-rebase--interactive.sh. I'll wait a bit longer to see how this goes and if it doesn't move, I'll try re-sending an update to that series. > > My apologies, > Dscho > Thanks, Liam
Re: [PATCH v4 5/5] stash: implement builtin stash
On Fri, Jun 16, 2017 at 3:47 PM, Junio C Hamano wrote: > Joel Teichroeb writes: >> +/* >> + * Untracked files are stored by themselves in a parentless commit, for >> + * ease of unpacking later. >> + */ >> +static int save_untracked(struct stash_info *info, const char *message, >> + int include_untracked, int include_ignored, const char **argv) >> +{ >> + struct child_process cp = CHILD_PROCESS_INIT; >> + struct strbuf out = STRBUF_INIT; >> + struct object_id orig_tree; >> + int ret; >> + const char *index_file = get_index_file(); >> + >> + set_alternate_index_output(stash_index_path); >> + untracked_files(&out, include_untracked, include_ignored, argv); >> + >> + cp.git_cmd = 1; >> + argv_array_pushl(&cp.args, "update-index", "-z", "--add", "--remove", >> + "--stdin", NULL); >> + argv_array_pushf(&cp.env_array, "GIT_INDEX_FILE=%s", stash_index_path); >> + >> + if (pipe_command(&cp, out.buf, out.len, NULL, 0, NULL, 0)) { >> + strbuf_release(&out); >> + return 1; >> + } >> + > > OK, that's a very straight-forward way of doing this, and as we do > not care too much about performance in this initial conversion to C, > it is even sensible. In a later update after this patch lands, you > may want to use dir.c's fill_directory() API to find the untracked > files and add them yourself internally, without running ls-files (in > untracked_files()) or update-index (here) as subprocesses, but that > is in the future. Let's get this round finished. > >> + strbuf_reset(&out); >> + >> + discard_cache(); >> + read_cache_from(stash_index_path); >> + >> + write_index_as_tree(orig_tree.hash, &the_index, stash_index_path, >> 0,NULL); > > SP before "NULL". > >> + discard_cache(); >> + >> + read_cache_from(stash_index_path); > > Hmph, what did anybody change in the on-disk stash_index (or > contents in the_index) since you read_cache_from()? > >> + write_cache_as_tree(info->u_tree.hash, 0, NULL); > > Then you write exactly the same index contents again, this time to > info->u_tree here. I am not sure why you need to do this twice, and > I do not see how orig_tree.hash you wrote earlier is used? > I'm not sure I understand what's happening here either. When I was writing this, it was essentially a lot of trial and error in order to get the index handling correct. Getting rid of any single one of these lines makes the test fail. At some point I'd like to redo all the index handling parts here, as I think I can do without an additional index, but I'd need to make sure the error handling is perfect first.
[PATCH v4 8/8] sha1_file: refactor has_sha1_file_with_flags
has_sha1_file_with_flags() implements many mechanisms in common with sha1_object_info_extended(). Make has_sha1_file_with_flags() a convenience function for sha1_object_info_extended() instead. Signed-off-by: Jonathan Tan --- builtin/fetch.c | 10 ++ builtin/index-pack.c | 3 ++- cache.h | 11 +++ sha1_file.c | 13 +++-- 4 files changed, 14 insertions(+), 23 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 47708451b..96d5146c4 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -242,9 +242,11 @@ static void find_non_local_tags(struct transport *transport, */ if (ends_with(ref->name, "^{}")) { if (item && - !has_object_file_with_flags(&ref->old_oid, HAS_SHA1_QUICK) && + !has_object_file_with_flags(&ref->old_oid, + OBJECT_INFO_QUICK) && !will_fetch(head, ref->old_oid.hash) && - !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) && + !has_sha1_file_with_flags(item->util, + OBJECT_INFO_QUICK) && !will_fetch(head, item->util)) item->util = NULL; item = NULL; @@ -258,7 +260,7 @@ static void find_non_local_tags(struct transport *transport, * fetch. */ if (item && - !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) && + !has_sha1_file_with_flags(item->util, OBJECT_INFO_QUICK) && !will_fetch(head, item->util)) item->util = NULL; @@ -279,7 +281,7 @@ static void find_non_local_tags(struct transport *transport, * checked to see if it needs fetching. */ if (item && - !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) && + !has_sha1_file_with_flags(item->util, OBJECT_INFO_QUICK) && !will_fetch(head, item->util)) item->util = NULL; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 04b9dcaf0..587bc80c9 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -794,7 +794,8 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, if (startup_info->have_repository) { read_lock(); - collision_test_needed = has_sha1_file_with_flags(oid->hash, HAS_SHA1_QUICK); + collision_test_needed = + has_sha1_file_with_flags(oid->hash, OBJECT_INFO_QUICK); read_unlock(); } diff --git a/cache.h b/cache.h index 2e1cc3fe2..387694b25 100644 --- a/cache.h +++ b/cache.h @@ -1268,15 +1268,10 @@ int read_loose_object(const char *path, void **contents); /* - * Return true iff we have an object named sha1, whether local or in - * an alternate object database, and whether packed or loose. This - * function does not respect replace references. - * - * If the QUICK flag is set, do not re-check the pack directory - * when we cannot find the object (this means we may give a false - * negative answer if another process is simultaneously repacking). + * Convenience for sha1_object_info_extended() with a blank struct + * object_info. OBJECT_INFO_SKIP_CACHED is automatically set; pass + * nonzero flags to also set other flags. */ -#define HAS_SHA1_QUICK 0x1 extern int has_sha1_file_with_flags(const unsigned char *sha1, int flags); static inline int has_sha1_file(const unsigned char *sha1) { diff --git a/sha1_file.c b/sha1_file.c index 68e3a3400..20db9b510 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -3491,18 +3491,11 @@ int has_sha1_pack(const unsigned char *sha1) int has_sha1_file_with_flags(const unsigned char *sha1, int flags) { - struct pack_entry e; - + static struct object_info blank; if (!startup_info->have_repository) return 0; - if (find_pack_entry(sha1, &e)) - return 1; - if (has_loose_object(sha1)) - return 1; - if (flags & HAS_SHA1_QUICK) - return 0; - reprepare_packed_git(); - return find_pack_entry(sha1, &e); + return !sha1_object_info_extended(sha1, &blank, + flags | OBJECT_INFO_SKIP_CACHED); } int has_object_file(const struct object_id *oid) -- 2.13.1.611.g7e3b11ae1-goog
[PATCH v4 6/8] sha1_file: improve sha1_object_info_extended
Improve sha1_object_info_extended() by supporting additional flags. This allows has_sha1_file_with_flags() to be modified to use sha1_object_info_extended() in a subsequent patch. Signed-off-by: Jonathan Tan --- cache.h | 4 sha1_file.c | 43 --- 2 files changed, 28 insertions(+), 19 deletions(-) diff --git a/cache.h b/cache.h index 48aea923b..7cf2ca466 100644 --- a/cache.h +++ b/cache.h @@ -1863,6 +1863,10 @@ struct object_info { #define OBJECT_INFO_LOOKUP_REPLACE 1 /* Allow reading from a loose object file of unknown/bogus type */ #define OBJECT_INFO_ALLOW_UNKNOWN_TYPE 2 +/* Do not check cached storage */ +#define OBJECT_INFO_SKIP_CACHED 4 +/* Do not retry packed storage after checking packed and loose storage */ +#define OBJECT_INFO_QUICK 8 extern int sha1_object_info_extended(const unsigned char *, struct object_info *, unsigned flags); extern int packed_object_info(struct packed_git *pack, off_t offset, struct object_info *); diff --git a/sha1_file.c b/sha1_file.c index 4d5033c48..24f7a146e 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2977,29 +2977,30 @@ static int sha1_loose_object_info(const unsigned char *sha1, int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags) { - struct cached_object *co; struct pack_entry e; int rtype; const unsigned char *real = (flags & OBJECT_INFO_LOOKUP_REPLACE) ? lookup_replace_object(sha1) : sha1; - co = find_cached_object(real); - if (co) { - if (oi->typep) - *(oi->typep) = co->type; - if (oi->sizep) - *(oi->sizep) = co->size; - if (oi->disk_sizep) - *(oi->disk_sizep) = 0; - if (oi->delta_base_sha1) - hashclr(oi->delta_base_sha1); - if (oi->typename) - strbuf_addstr(oi->typename, typename(co->type)); - if (oi->contentp) - *oi->contentp = xmemdupz(co->buf, co->size); - oi->whence = OI_CACHED; - return 0; + if (!(flags & OBJECT_INFO_SKIP_CACHED)) { + struct cached_object *co = find_cached_object(real); + if (co) { + if (oi->typep) + *(oi->typep) = co->type; + if (oi->sizep) + *(oi->sizep) = co->size; + if (oi->disk_sizep) + *(oi->disk_sizep) = 0; + if (oi->delta_base_sha1) + hashclr(oi->delta_base_sha1); + if (oi->typename) + strbuf_addstr(oi->typename, typename(co->type)); + if (oi->contentp) + *oi->contentp = xmemdupz(co->buf, co->size); + oi->whence = OI_CACHED; + return 0; + } } if (!find_pack_entry(real, &e)) { @@ -3010,9 +3011,13 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, } /* Not a loose object; someone else may have just packed it. */ - reprepare_packed_git(); - if (!find_pack_entry(real, &e)) + if (flags & OBJECT_INFO_QUICK) { return -1; + } else { + reprepare_packed_git(); + if (!find_pack_entry(real, &e)) + return -1; + } } rtype = packed_object_info(e.p, e.offset, oi); -- 2.13.1.611.g7e3b11ae1-goog
[PATCH v4 7/8] sha1_file: do not access pack if unneeded
Add an option to struct object_info to suppress population of additional information about a packed object if unneeded. This allows an optimization in which sha1_object_info_extended() does not even need to access the pack if no information besides provenance is requested. A subsequent patch will make use of this optimization. Signed-off-by: Jonathan Tan --- cache.h | 1 + sha1_file.c | 17 + streaming.c | 1 + 3 files changed, 15 insertions(+), 4 deletions(-) diff --git a/cache.h b/cache.h index 7cf2ca466..2e1cc3fe2 100644 --- a/cache.h +++ b/cache.h @@ -1828,6 +1828,7 @@ struct object_info { unsigned char *delta_base_sha1; struct strbuf *typename; void **contentp; + unsigned populate_u : 1; /* Response */ enum { diff --git a/sha1_file.c b/sha1_file.c index 24f7a146e..68e3a3400 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -3020,6 +3020,13 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, } } + if (!oi->typep && !oi->sizep && !oi->disk_sizep && + !oi->delta_base_sha1 && !oi->typename && !oi->contentp && + !oi->populate_u) { + oi->whence = OI_PACKED; + return 0; + } + rtype = packed_object_info(e.p, e.offset, oi); if (rtype < 0) { mark_bad_packed_object(e.p, real); @@ -3028,10 +3035,12 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, oi->whence = OI_DBCACHED; } else { oi->whence = OI_PACKED; - oi->u.packed.offset = e.offset; - oi->u.packed.pack = e.p; - oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA || -rtype == OBJ_OFS_DELTA); + if (oi->populate_u) { + oi->u.packed.offset = e.offset; + oi->u.packed.pack = e.p; + oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA || +rtype == OBJ_OFS_DELTA); + } } return 0; diff --git a/streaming.c b/streaming.c index 9afa66b8b..deebc18a8 100644 --- a/streaming.c +++ b/streaming.c @@ -113,6 +113,7 @@ static enum input_source istream_source(const unsigned char *sha1, oi->typep = type; oi->sizep = &size; + oi->populate_u = 1; status = sha1_object_info_extended(sha1, oi, 0); if (status < 0) return stream_error; -- 2.13.1.611.g7e3b11ae1-goog
[PATCH v4 4/8] sha1_file: move delta base cache code up
In a subsequent patch, packed_object_info() will be modified to use the delta base cache, so move the relevant code to before packed_object_info(). Signed-off-by: Jonathan Tan --- sha1_file.c | 220 ++-- 1 file changed, 110 insertions(+), 110 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index ae44b32f3..a7be45efe 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2239,116 +2239,6 @@ static enum object_type packed_to_object_type(struct packed_git *p, goto out; } -int packed_object_info(struct packed_git *p, off_t obj_offset, - struct object_info *oi) -{ - struct pack_window *w_curs = NULL; - unsigned long size; - off_t curpos = obj_offset; - enum object_type type; - - /* -* We always get the representation type, but only convert it to -* a "real" type later if the caller is interested. -*/ - type = unpack_object_header(p, &w_curs, &curpos, &size); - - if (oi->sizep) { - if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) { - off_t tmp_pos = curpos; - off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos, - type, obj_offset); - if (!base_offset) { - type = OBJ_BAD; - goto out; - } - *oi->sizep = get_size_from_delta(p, &w_curs, tmp_pos); - if (*oi->sizep == 0) { - type = OBJ_BAD; - goto out; - } - } else { - *oi->sizep = size; - } - } - - if (oi->disk_sizep) { - struct revindex_entry *revidx = find_pack_revindex(p, obj_offset); - *oi->disk_sizep = revidx[1].offset - obj_offset; - } - - if (oi->typep || oi->typename) { - enum object_type ptot; - ptot = packed_to_object_type(p, obj_offset, type, &w_curs, -curpos); - if (oi->typep) - *oi->typep = ptot; - if (oi->typename) { - const char *tn = typename(ptot); - if (tn) - strbuf_addstr(oi->typename, tn); - } - if (ptot < 0) { - type = OBJ_BAD; - goto out; - } - } - - if (oi->delta_base_sha1) { - if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) { - const unsigned char *base; - - base = get_delta_base_sha1(p, &w_curs, curpos, - type, obj_offset); - if (!base) { - type = OBJ_BAD; - goto out; - } - - hashcpy(oi->delta_base_sha1, base); - } else - hashclr(oi->delta_base_sha1); - } - -out: - unuse_pack(&w_curs); - return type; -} - -static void *unpack_compressed_entry(struct packed_git *p, - struct pack_window **w_curs, - off_t curpos, - unsigned long size) -{ - int st; - git_zstream stream; - unsigned char *buffer, *in; - - buffer = xmallocz_gently(size); - if (!buffer) - return NULL; - memset(&stream, 0, sizeof(stream)); - stream.next_out = buffer; - stream.avail_out = size + 1; - - git_inflate_init(&stream); - do { - in = use_pack(p, w_curs, curpos, &stream.avail_in); - stream.next_in = in; - st = git_inflate(&stream, Z_FINISH); - if (!stream.avail_out) - break; /* the payload is larger than it should be */ - curpos += stream.next_in - in; - } while (st == Z_OK || st == Z_BUF_ERROR); - git_inflate_end(&stream); - if ((st != Z_STREAM_END) || stream.total_out != size) { - free(buffer); - return NULL; - } - - return buffer; -} - static struct hashmap delta_base_cache; static size_t delta_base_cached; @@ -2486,6 +2376,116 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset, hashmap_add(&delta_base_cache, ent); } +int packed_object_info(struct packed_git *p, off_t obj_offset, + struct object_info *oi) +{ + struct pack_window *w_curs = NULL; + unsigned long size; + off_t curpos = obj_offset; + enum object_type type; + + /* +* We always get the representation type, but only convert i
[PATCH v4 5/8] sha1_file: refactor read_object
read_object() and sha1_object_info_extended() both implement mechanisms such as object replacement, retrying the packed store after failing to find the object in the packed store then the loose store, and being able to mark a packed object as bad and then retrying the whole process. Consolidating these mechanisms would be a great help to maintainability. Therefore, consolidate them by extending sha1_object_info_extended() to support the functionality needed, and then modifying read_object() to use sha1_object_info_extended(). Signed-off-by: Jonathan Tan --- cache.h | 1 + sha1_file.c | 84 ++--- 2 files changed, 43 insertions(+), 42 deletions(-) diff --git a/cache.h b/cache.h index a3631b237..48aea923b 100644 --- a/cache.h +++ b/cache.h @@ -1827,6 +1827,7 @@ struct object_info { off_t *disk_sizep; unsigned char *delta_base_sha1; struct strbuf *typename; + void **contentp; /* Response */ enum { diff --git a/sha1_file.c b/sha1_file.c index a7be45efe..4d5033c48 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2005,19 +2005,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep) return parse_sha1_header_extended(hdr, &oi, 0); } -static void *unpack_sha1_file(void *map, unsigned long mapsize, enum object_type *type, unsigned long *size, const unsigned char *sha1) -{ - int ret; - git_zstream stream; - char hdr[8192]; - - ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); - if (ret < Z_OK || (*type = parse_sha1_header(hdr, size)) < 0) - return NULL; - - return unpack_sha1_rest(&stream, hdr, *size, sha1); -} - unsigned long get_size_from_delta(struct packed_git *p, struct pack_window **w_curs, off_t curpos) @@ -2326,8 +2313,10 @@ static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset, if (!ent) return unpack_entry(p, base_offset, type, base_size); - *type = ent->type; - *base_size = ent->size; + if (type) + *type = ent->type; + if (base_size) + *base_size = ent->size; return xmemdupz(ent->data, ent->size); } @@ -2388,9 +2377,16 @@ int packed_object_info(struct packed_git *p, off_t obj_offset, * We always get the representation type, but only convert it to * a "real" type later if the caller is interested. */ - type = unpack_object_header(p, &w_curs, &curpos, &size); + if (oi->contentp) { + *oi->contentp = cache_or_unpack_entry(p, obj_offset, oi->sizep, + &type); + if (!*oi->contentp) + type = OBJ_BAD; + } else { + type = unpack_object_header(p, &w_curs, &curpos, &size); + } - if (oi->sizep) { + if (!oi->contentp && oi->sizep) { if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) { off_t tmp_pos = curpos; off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos, @@ -2679,8 +2675,10 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset, free(external_base); } - *final_type = type; - *final_size = size; + if (final_type) + *final_type = type; + if (final_size) + *final_size = size; unuse_pack(&w_curs); @@ -2914,6 +2912,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, git_zstream stream; char hdr[32]; struct strbuf hdrbuf = STRBUF_INIT; + unsigned long size_scratch; if (oi->delta_base_sha1) hashclr(oi->delta_base_sha1); @@ -2926,7 +2925,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, * return value implicitly indicates whether the * object even exists. */ - if (!oi->typep && !oi->typename && !oi->sizep) { + if (!oi->typep && !oi->typename && !oi->sizep && !oi->contentp) { const char *path; struct stat st; if (stat_sha1_file(sha1, &st, &path) < 0) @@ -2939,6 +2938,10 @@ static int sha1_loose_object_info(const unsigned char *sha1, map = map_sha1_file(sha1, &mapsize); if (!map) return -1; + + if (!oi->sizep) + oi->sizep = &size_scratch; + if (oi->disk_sizep) *oi->disk_sizep = mapsize; if ((flags & OBJECT_INFO_ALLOW_UNKNOWN_TYPE)) { @@ -2956,10 +2959,18 @@ static int sha1_loose_object_info(const unsigned char *sha1, sha1_to_hex(sha1)); } else if ((status = parse_sha1_header_extended(hdr, oi, flags)) < 0) status = error("unable to parse %s header", sha1_to_hex(sha1)); - git_inflate_end
[PATCH v4 0/8] Improvements to sha1_file
Thanks, Peff and Junio for your comments. Here's an updated version and some replies to comments. > I also found this quite subtle. However, I don't think that > has_sha1_file() actually freshens. It's a bit confusing because > has_loose_object() reuses the check_and_freshen() function to do the > lookup, but it actually sets the "freshen" flag to false. > > That's why in 33d4221c7 (write_sha1_file: freshen existing objects, > 2014-10-15), which introduced the freshening functions and converted > has_loose_object(), the actual write_sha1_file() function switched to > using the freshening functions directly (and obviously sets the freshen > parameter to true). Good catch. > I actually think all of that infrastructure could become part of > Jonathan's consolidated lookup, too. We would just need: > > 1. A QUICK flag to avoid re-reading objects/pack when we don't find > anything (which it looks like he already has). > > 2. A FRESHEN flag to update the mtime of any item that we do find. > > I suspect we may also need something like ONLY_LOOSE and ONLY_NONLOCAL > to meet all the callers (e.g., has_loose_object_nonlocal). Those should > be easy to implement, I'd think. For things like FRESHEN, ONLY_LOOSE, and ONLY_NONLOCAL, I was thinking that I would like to restrict these patches to only handle the cases that are agnostic to the type of storage (in preparation for missing blob handling patches). > I had the same thoughts (both on the name and the "vocabularies"). IMHO > we should consider allocating the bits from the same set. There's only > one HAS_SHA1 flag, and it has an exact match in OBJECT_INFO_QUICK. Agreed - in this patch set, I have also consolidated the relevant flags, including LOOKUP_REPLACE_OBJECT and LOOKUP_UNKNOWN_OBJECT. In addition, Junio has mentioned the potential confusion in behavior between a NULL and an empty struct passed to sha1_object_info_extended(). In this patch set, I require non-NULL, and have added an optimization that avoids accessing the pack in certain situations, but this optimization requires checking a lot of fields. Let me know what you think. Jonathan Tan (8): sha1_file: teach packed_object_info about typename sha1_file: rename LOOKUP_UNKNOWN_OBJECT sha1_file: rename LOOKUP_REPLACE_OBJECT sha1_file: move delta base cache code up sha1_file: refactor read_object sha1_file: improve sha1_object_info_extended sha1_file: do not access pack if unneeded sha1_file: refactor has_sha1_file_with_flags builtin/cat-file.c | 7 +- builtin/fetch.c | 10 +- builtin/index-pack.c | 3 +- cache.h | 37 +++-- sha1_file.c | 391 ++- streaming.c | 1 + 6 files changed, 228 insertions(+), 221 deletions(-) -- 2.13.1.611.g7e3b11ae1-goog
[PATCH v4 1/8] sha1_file: teach packed_object_info about typename
In commit 46f0344 ("sha1_file: support reading from a loose object of unknown type", 2015-05-06), "struct object_info" gained a "typename" field that could represent a type name from a loose object file, whether valid or invalid, as opposed to the existing "typep" which could only represent valid types. Some relatively complex manipulations were added to avoid breaking packed_object_info() without modifying it, but it is much easier to just teach packed_object_info() about the new field. Therefore, teach packed_object_info() as described above. Signed-off-by: Jonathan Tan --- sha1_file.c | 29 - 1 file changed, 12 insertions(+), 17 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 59a4ed2ed..a52b27541 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2277,9 +2277,18 @@ int packed_object_info(struct packed_git *p, off_t obj_offset, *oi->disk_sizep = revidx[1].offset - obj_offset; } - if (oi->typep) { - *oi->typep = packed_to_object_type(p, obj_offset, type, &w_curs, curpos); - if (*oi->typep < 0) { + if (oi->typep || oi->typename) { + enum object_type ptot; + ptot = packed_to_object_type(p, obj_offset, type, &w_curs, +curpos); + if (oi->typep) + *oi->typep = ptot; + if (oi->typename) { + const char *tn = typename(ptot); + if (tn) + strbuf_addstr(oi->typename, tn); + } + if (ptot < 0) { type = OBJ_BAD; goto out; } @@ -2960,7 +2969,6 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, struct cached_object *co; struct pack_entry e; int rtype; - enum object_type real_type; const unsigned char *real = lookup_replace_object_extended(sha1, flags); co = find_cached_object(real); @@ -2992,18 +3000,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, return -1; } - /* -* packed_object_info() does not follow the delta chain to -* find out the real type, unless it is given oi->typep. -*/ - if (oi->typename && !oi->typep) - oi->typep = &real_type; - rtype = packed_object_info(e.p, e.offset, oi); if (rtype < 0) { mark_bad_packed_object(e.p, real); - if (oi->typep == &real_type) - oi->typep = NULL; return sha1_object_info_extended(real, oi, 0); } else if (in_delta_base_cache(e.p, e.offset)) { oi->whence = OI_DBCACHED; @@ -3014,10 +3013,6 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA || rtype == OBJ_OFS_DELTA); } - if (oi->typename) - strbuf_addstr(oi->typename, typename(*oi->typep)); - if (oi->typep == &real_type) - oi->typep = NULL; return 0; } -- 2.13.1.611.g7e3b11ae1-goog
[PATCH v4 2/8] sha1_file: rename LOOKUP_UNKNOWN_OBJECT
The LOOKUP_UNKNOWN_OBJECT flag was introduced in commit 46f0344 ("sha1_file: support reading from a loose object of unknown type", 2015-05-03) in order to support a feature in cat-file subsequently introduced in commit 39e4ae3 ("cat-file: teach cat-file a '--allow-unknown-type' option", 2015-05-03). Despite its name and location in cache.h, this flag is used neither in read_sha1_file_extended() nor in any of the lookup functions, but used only in sha1_object_info_extended(). Therefore rename this flag to OBJECT_INFO_ALLOW_UNKNOWN_TYPE, taking the name of the cat-file flag that invokes this feature, and move it closer to the declaration of sha1_object_info_extended(). Also add documentation for this flag. Signed-off-by: Jonathan Tan --- builtin/cat-file.c | 2 +- cache.h| 3 ++- sha1_file.c| 4 ++-- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 4bffd7a2d..209374b3c 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -60,7 +60,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name, const char *path = force_path; if (unknown_type) - flags |= LOOKUP_UNKNOWN_OBJECT; + flags |= OBJECT_INFO_ALLOW_UNKNOWN_TYPE; if (get_sha1_with_context(obj_name, GET_SHA1_RECORD_PATH, oid.hash, &obj_context)) diff --git a/cache.h b/cache.h index 4d92aae0e..e2ec45dfe 100644 --- a/cache.h +++ b/cache.h @@ -1207,7 +1207,6 @@ extern char *xdg_cache_home(const char *filename); /* object replacement */ #define LOOKUP_REPLACE_OBJECT 1 -#define LOOKUP_UNKNOWN_OBJECT 2 extern void *read_sha1_file_extended(const unsigned char *sha1, enum object_type *type, unsigned long *size, unsigned flag); static inline void *read_sha1_file(const unsigned char *sha1, enum object_type *type, unsigned long *size) { @@ -1866,6 +1865,8 @@ struct object_info { */ #define OBJECT_INFO_INIT {NULL} +/* Allow reading from a loose object file of unknown/bogus type */ +#define OBJECT_INFO_ALLOW_UNKNOWN_TYPE 2 extern int sha1_object_info_extended(const unsigned char *, struct object_info *, unsigned flags); extern int packed_object_info(struct packed_git *pack, off_t offset, struct object_info *); diff --git a/sha1_file.c b/sha1_file.c index a52b27541..ad04ea8e0 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1964,7 +1964,7 @@ static int parse_sha1_header_extended(const char *hdr, struct object_info *oi, * we're obtaining the type using '--allow-unknown-type' * option. */ - if ((flags & LOOKUP_UNKNOWN_OBJECT) && (type < 0)) + if ((flags & OBJECT_INFO_ALLOW_UNKNOWN_TYPE) && (type < 0)) type = 0; else if (type < 0) die("invalid object type"); @@ -2941,7 +2941,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, return -1; if (oi->disk_sizep) *oi->disk_sizep = mapsize; - if ((flags & LOOKUP_UNKNOWN_OBJECT)) { + if ((flags & OBJECT_INFO_ALLOW_UNKNOWN_TYPE)) { if (unpack_sha1_header_to_strbuf(&stream, map, mapsize, hdr, sizeof(hdr), &hdrbuf) < 0) status = error("unable to unpack %s header with --allow-unknown-type", sha1_to_hex(sha1)); -- 2.13.1.611.g7e3b11ae1-goog
[PATCH v4 3/8] sha1_file: rename LOOKUP_REPLACE_OBJECT
The LOOKUP_REPLACE_OBJECT flag controls whether the lookup_replace_object() function is invoked by sha1_object_info_extended(), read_sha1_file_extended(), and lookup_replace_object_extended(), but it is not immediately clear which functions accept that flag. Therefore restrict this flag to only sha1_object_info_extended(), renaming it appropriately to OBJECT_INFO_LOOKUP_REPLACE and adding some documentation. Update read_sha1_file_extended() to have a boolean parameter instead, and delete lookup_replace_object_extended(). parse_sha1_header() also passes this flag to parse_sha1_header_extended() since commit 46f0344 ("sha1_file: support reading from a loose object of unknown type", 2015-05-03), but that has had no effect since that commit. Therefore this patch also removes this flag from that invocation. Signed-off-by: Jonathan Tan --- builtin/cat-file.c | 5 +++-- cache.h| 17 ++--- sha1_file.c| 13 - 3 files changed, 17 insertions(+), 18 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 209374b3c..923786c00 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -56,7 +56,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name, struct object_context obj_context; struct object_info oi = OBJECT_INFO_INIT; struct strbuf sb = STRBUF_INIT; - unsigned flags = LOOKUP_REPLACE_OBJECT; + unsigned flags = OBJECT_INFO_LOOKUP_REPLACE; const char *path = force_path; if (unknown_type) @@ -337,7 +337,8 @@ static void batch_object_write(const char *obj_name, struct batch_options *opt, struct strbuf buf = STRBUF_INIT; if (!data->skip_object_info && - sha1_object_info_extended(data->oid.hash, &data->info, LOOKUP_REPLACE_OBJECT) < 0) { + sha1_object_info_extended(data->oid.hash, &data->info, + OBJECT_INFO_LOOKUP_REPLACE)) { printf("%s missing\n", obj_name ? obj_name : oid_to_hex(&data->oid)); fflush(stdout); diff --git a/cache.h b/cache.h index e2ec45dfe..a3631b237 100644 --- a/cache.h +++ b/cache.h @@ -1205,12 +1205,12 @@ extern char *xdg_config_home(const char *filename); */ extern char *xdg_cache_home(const char *filename); -/* object replacement */ -#define LOOKUP_REPLACE_OBJECT 1 -extern void *read_sha1_file_extended(const unsigned char *sha1, enum object_type *type, unsigned long *size, unsigned flag); +extern void *read_sha1_file_extended(const unsigned char *sha1, +enum object_type *type, +unsigned long *size, int lookup_replace); static inline void *read_sha1_file(const unsigned char *sha1, enum object_type *type, unsigned long *size) { - return read_sha1_file_extended(sha1, type, size, LOOKUP_REPLACE_OBJECT); + return read_sha1_file_extended(sha1, type, size, 1); } /* @@ -1232,13 +1232,6 @@ static inline const unsigned char *lookup_replace_object(const unsigned char *sh return do_lookup_replace_object(sha1); } -static inline const unsigned char *lookup_replace_object_extended(const unsigned char *sha1, unsigned flag) -{ - if (!(flag & LOOKUP_REPLACE_OBJECT)) - return sha1; - return lookup_replace_object(sha1); -} - /* Read and unpack a sha1 file into memory, write memory to a sha1 file */ extern int sha1_object_info(const unsigned char *, unsigned long *); extern int hash_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *sha1); @@ -1865,6 +1858,8 @@ struct object_info { */ #define OBJECT_INFO_INIT {NULL} +/* Invoke lookup_replace_object() on the given hash */ +#define OBJECT_INFO_LOOKUP_REPLACE 1 /* Allow reading from a loose object file of unknown/bogus type */ #define OBJECT_INFO_ALLOW_UNKNOWN_TYPE 2 extern int sha1_object_info_extended(const unsigned char *, struct object_info *, unsigned flags); diff --git a/sha1_file.c b/sha1_file.c index ad04ea8e0..ae44b32f3 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2002,7 +2002,7 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep) struct object_info oi = OBJECT_INFO_INIT; oi.sizep = sizep; - return parse_sha1_header_extended(hdr, &oi, LOOKUP_REPLACE_OBJECT); + return parse_sha1_header_extended(hdr, &oi, 0); } static void *unpack_sha1_file(void *map, unsigned long mapsize, enum object_type *type, unsigned long *size, const unsigned char *sha1) @@ -2969,7 +2969,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, struct cached_object *co; struct pack_entry e; int rtype; - const unsigned char *real = lookup_replace_object_extended(sha1, flags); + const unsigned char *real = (flags & OBJECT_INFO_LOOKUP_REPLACE) ? + lookup_replace_object(sha1) : +
Re: [GSoC] Update: Week 5
On 06/20, Andrew Ardill wrote: > On 20 June 2017 at 07:41, Prathamesh Chavan wrote: > > >But as communicating between child_process is still an issue > >and so there was no simple was to current carry out the > >porting. And hence, a hack was used instead. But after > >discussing it, instead using the repository-object patch > >series will help to resolve these issues in this situation. > > Just wondering, does that mean that your patch series is dependent on > the repository-object one? I saw some discussion around it recently > but couldn't see it in the latest whats cooking so maybe I missed what > has happened to it. > Due to some of the discussion (and finding a bug with how git_path works with worktrees) I decided to break the series up into smaller bits since the original series was 30+ patches. Once 'bw/config-h' and 'bw/ls-files-sans-the-index' have been merged into next I'll probably send out v3 of the repository object series. -- Brandon Williams
Re: [GSoC] Update: Week 5
On 20 June 2017 at 07:41, Prathamesh Chavan wrote: >But as communicating between child_process is still an issue >and so there was no simple was to current carry out the >porting. And hence, a hack was used instead. But after >discussing it, instead using the repository-object patch >series will help to resolve these issues in this situation. Just wondering, does that mean that your patch series is dependent on the repository-object one? I saw some discussion around it recently but couldn't see it in the latest whats cooking so maybe I missed what has happened to it. Really enjoying your updates, by the way, they are very clear and show what looks like great progress! Regards, Andrew Ardill
Re: [PATCH] die routine: change recursion limit from 1 to 1024
On Mon, Jun 19, 2017 at 3:32 PM, Ævar Arnfjörð Bjarmason wrote: > > On Mon, Jun 19 2017, Stefan Beller jotted: > >>> Now, git-grep could make use of the pluggable error facility added in >>> commit c19a490e37 ("usage: allow pluggable die-recursion checks", >>> 2013-04-16). >> >> I think we should do that instead (though I have not looked at the downsides >> of this), because... > > It makes sense to do that in addition to what I'm doing here (or if the > approach I'm taking doesn't make sense, some other patch to fix the > general issue in the default handler). > > I'm going to try to get around to fixing the grep behavior in a > follow-up patch, this is a fix for the overzealous recursion detection > in the default handler needlessly causing other issues. > >>> >>> So let's just set the recursion limit to a number higher than the >>> number of threads we're ever likely to spawn. Now we won't lose >>> errors, and if we have a recursing die handler we'll still die within >>> microseconds. >> >> how are we handling access to that global variable? >> Do we need to hold a mutex to be correct? or rather hope that >> it works across threads, not counting on it, because each thread >> individually would count up to 1024? > > It's not guarded by a mutex and the ++ here and the reads from it are > both racy. > > However, for its stated purpose that's fine, even if we're racily > incrementing it and losing some updates some will get through, which is > good enough for an infinite recursion detection. We don't really care if > we die at exactly 1 or exactly 1024. > >> I would prefer if we kept the number as low as "at most >> one screen of lines". > > In practice this is the case in git, because the programs that would > encounter this are going to be spawning less than screenfull of threads, > assuming (as is the case) that each thread might print out one error. > > The semantics of that aren't changed with this patch, the difference is > that you're going to get e.g. N repeats of a meaningful error instead of > N repeats of either the meaningful error OR "recursion detected in die > handler", depending on your luck. > > I.e. in current git (after a few runs to get an unlucky one): > > $ git grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' > fatal: recursion detected in die handler > fatal: recursion detected in die handler > fatal: recursion detected in die handler > > Or if you're lucky at least one of these will be the actual error: > > $ git grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' > fatal: recursion detected in die handler > fatal: pcre_exec failed with error code -8 > fatal: recursion detected in die handler > fatal: recursion detected in die handler > fatal: recursion detected in die handler > > But with this change: > > $ ~/g/git/git-grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' > fatal: pcre2_jit_match failed with error code -47: match limit exceeded > fatal: pcre2_jit_match failed with error code -47: match limit exceeded > fatal: pcre2_jit_match failed with error code -47: match limit exceeded > > (The error message is different because I compiled with PCRE v2 locally, > instead of the system PCRE v1, but that doesn't matter for this example) > > Over 1000 runs thish is how that breaks down on my machine, without this > patch. I've replaced the recursion error with just "R" and the PCRE > error with "P", and shown them in descending order by occurrence, lines > without a "P" only printed out the recursion error: > > $ (for i in {1..1000}; do git grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: > r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort > -nr|head -n 10 > 247 R R > 136 P R R > 122 P R > 112 R R R > 108 R > 59 R P R > 54 R P > 54 P > 31 P R R R > 21 R R P > > There's a long tail I've omitted there of alterations to that. As this > shows in >10% of cases we don't print out any meaningful error at > all. But with this change: > > $ (for i in {1..1000}; do ~/g/git/git-grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: > r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort > -nr|head -n 10 > 377 P P > 358 P P P > 192 P > 63 P P P P > 8 P P P P P > 2 P P P P P P > > We will always show a meaningful error, but may of course do so multiple > times, which is a subject for a fix in git-grep in particular, but the > point is again, to fix the general case for the default handler. > > Something something sorry about the long mail didn't have time to write > a shorter one :) > Actually this convinced me (and it would be lovely to have such observations in the commit message). Thanks, Stefan
Re: [PATCH] die routine: change recursion limit from 1 to 1024
On Mon, Jun 19 2017, Stefan Beller jotted: >> Now, git-grep could make use of the pluggable error facility added in >> commit c19a490e37 ("usage: allow pluggable die-recursion checks", >> 2013-04-16). > > I think we should do that instead (though I have not looked at the downsides > of this), because... It makes sense to do that in addition to what I'm doing here (or if the approach I'm taking doesn't make sense, some other patch to fix the general issue in the default handler). I'm going to try to get around to fixing the grep behavior in a follow-up patch, this is a fix for the overzealous recursion detection in the default handler needlessly causing other issues. >> >> So let's just set the recursion limit to a number higher than the >> number of threads we're ever likely to spawn. Now we won't lose >> errors, and if we have a recursing die handler we'll still die within >> microseconds. > > how are we handling access to that global variable? > Do we need to hold a mutex to be correct? or rather hope that > it works across threads, not counting on it, because each thread > individually would count up to 1024? It's not guarded by a mutex and the ++ here and the reads from it are both racy. However, for its stated purpose that's fine, even if we're racily incrementing it and losing some updates some will get through, which is good enough for an infinite recursion detection. We don't really care if we die at exactly 1 or exactly 1024. > I would prefer if we kept the number as low as "at most > one screen of lines". In practice this is the case in git, because the programs that would encounter this are going to be spawning less than screenfull of threads, assuming (as is the case) that each thread might print out one error. The semantics of that aren't changed with this patch, the difference is that you're going to get e.g. N repeats of a meaningful error instead of N repeats of either the meaningful error OR "recursion detected in die handler", depending on your luck. I.e. in current git (after a few runs to get an unlucky one): $ git grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' fatal: recursion detected in die handler fatal: recursion detected in die handler fatal: recursion detected in die handler Or if you're lucky at least one of these will be the actual error: $ git grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' fatal: recursion detected in die handler fatal: pcre_exec failed with error code -8 fatal: recursion detected in die handler fatal: recursion detected in die handler fatal: recursion detected in die handler But with this change: $ ~/g/git/git-grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' fatal: pcre2_jit_match failed with error code -47: match limit exceeded fatal: pcre2_jit_match failed with error code -47: match limit exceeded fatal: pcre2_jit_match failed with error code -47: match limit exceeded (The error message is different because I compiled with PCRE v2 locally, instead of the system PCRE v1, but that doesn't matter for this example) Over 1000 runs thish is how that breaks down on my machine, without this patch. I've replaced the recursion error with just "R" and the PCRE error with "P", and shown them in descending order by occurrence, lines without a "P" only printed out the recursion error: $ (for i in {1..1000}; do git grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort -nr|head -n 10 247 R R 136 P R R 122 P R 112 R R R 108 R 59 R P R 54 R P 54 P 31 P R R R 21 R R P There's a long tail I've omitted there of alterations to that. As this shows in >10% of cases we don't print out any meaningful error at all. But with this change: $ (for i in {1..1000}; do ~/g/git/git-grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort -nr|head -n 10 377 P P 358 P P P 192 P 63 P P P P 8 P P P P P 2 P P P P P P We will always show a meaningful error, but may of course do so multiple times, which is a subject for a fix in git-grep in particular, but the point is again, to fix the general case for the default handler. Something something sorry about the long mail didn't have time to write a shorter one :) >> Signed-off-by: Ævar Arnfjörð Bjarmason >> --- >> usage.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/usage.c b/usage.c >> index 2f87ca69a8..1c198d4882 100644 >> --- a/usage.c >> +++ b/usage.c >> @@ -44,7 +44,9 @@ static void warn_builtin(const char *warn, va_list params) >> static int die_is_recursing_builtin(void) >> { >> static i
Re: Behavior of 'git fetch' for commit hashes
On Mon, 19 Jun 2017 10:49:36 -0700 Jonathan Tan wrote: > On Mon, 19 Jun 2017 12:09:28 + > wrote: > > > For version 2.13.3 Firstly, exactly which version of Git doesn't work? I'm assuming 2.13.1 (as written elsewhere in your e-mail), since 2.13.3 doesn't exist. > > However, the workaround for descbibed abot for git version 2.7.4 no > > longer works. The result is always > > fatal: Couldn't find remote ref > > I'll take a look into this. I tried to reproduce with this script, but it seems to pass even at 2.13.1: #!/bin/bash rm -rf ~/tmp/x && make --quiet && ./git init ~/tmp/x && ./git -C ~/tmp/x fetch --quiet ~/gitpristine/git master:foo && ./git -C ~/tmp/x fetch ~/gitpristine/git "$(git -C ~/gitpristine/git rev-parse master^)" exit $? Commenting out the first fetch line produces, as expected: error: Server does not allow request for unadvertised object And I have not seen the "fatal: Couldn't find remote ref" error you describe. As Stefan alluded to in his earlier e-mail [1], I made a change recently to how "git fetch" handles literal SHA-1s in commit fdb69d3 ("fetch-pack: always allow fetching of literal SHA1s", 2017-05-15). In this case, I don't think that's the cause of this issue (since that change, if anything, makes things more permissive, not less) but that might be a point of interest in an investigation. [1] https://public-inbox.org/git/CAGZ79kbFeptKOfpaZ23yK=zw9mj0_evqpsthkkd1hscap_p...@mail.gmail.com/
Re: Git Strange behaviour with remote deleted branch
On Mon, 2017-06-19 at 23:33 +0200, Reda Lyazidi wrote: > with git I noticed when I removed a remote branch with git push origin > --delete > in my clone when I used git branch -a I don't the deleted branch > but my colleagues still see it. > > I tried with two clones in my PC, with the first one delete branch and > the other still sees it > despite git pull . Normally Git will not remove a remote's branch refs on a fetch (or pull) operation. This is a safety measure, since you might still have a use for that branch (for example to run diff against or similar). In order to remove branches that have been deleted in your remote, you must use the --prune (or -p) option to git fetch or git pull: git pull -p will get rid of them.
Re: [PATCH] die routine: change recursion limit from 1 to 1024
> Now, git-grep could make use of the pluggable error facility added in > commit c19a490e37 ("usage: allow pluggable die-recursion checks", > 2013-04-16). I think we should do that instead (though I have not looked at the downsides of this), because... > > So let's just set the recursion limit to a number higher than the > number of threads we're ever likely to spawn. Now we won't lose > errors, and if we have a recursing die handler we'll still die within > microseconds. how are we handling access to that global variable? Do we need to hold a mutex to be correct? or rather hope that it works across threads, not counting on it, because each thread individually would count up to 1024? I would prefer if we kept the number as low as "at most one screen of lines". > Signed-off-by: Ævar Arnfjörð Bjarmason > --- > usage.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/usage.c b/usage.c > index 2f87ca69a8..1c198d4882 100644 > --- a/usage.c > +++ b/usage.c > @@ -44,7 +44,9 @@ static void warn_builtin(const char *warn, va_list params) > static int die_is_recursing_builtin(void) > { > static int dying; > - return dying++; > + static int recursion_limit = 1024; > + > + return dying++ > recursion_limit; > } > > /* If we are in a dlopen()ed .so write to a global variable would segfault > -- > 2.13.1.518.g3df882009 >
[PATCH] grep: fix erroneously copy/pasted variable in check/assert pattern
Fix an erroneously copy/pasted check for the pcre2_jit_stack variable to check pcre2_match_context instead. The former was already checked in the preceding "if" statement. This is a trivial and obvious error introduced in my commit 94da9193a6 ("grep: add support for PCRE v2", 2017-06-01). In practice if pcre2_match_context_create() returned NULL we were likely in a situation where malloc() was returning NULL, and were thus screwed anyway, but if only the pcre2_match_context_create() call returned NULL (through some transitory bug) PCRE v2 would just allocate and supply its own context object when matching, and we'd run normally at the trivial expense of not getting a slight speedup by sharing the context object between successive matches. Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/grep.c b/grep.c index d7ef21358e..06c4c24926 100644 --- a/grep.c +++ b/grep.c @@ -508,7 +508,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt if (!p->pcre2_jit_stack) die("Couldn't allocate PCRE2 JIT stack"); p->pcre2_match_context = pcre2_match_context_create(NULL); - if (!p->pcre2_jit_stack) + if (!p->pcre2_match_context) die("Couldn't allocate PCRE2 match context"); pcre2_jit_stack_assign(p->pcre2_match_context, NULL, p->pcre2_jit_stack); } else if (p->pcre2_jit_on != 0) { -- 2.13.1.518.g3df882009
[PATCH] die routine: change recursion limit from 1 to 1024
Change the recursion limit for the default die routine from a *very* low 1 to 1024. This ensures that infinite recursions are broken, but doesn't lose error messages. The intent of the existing code, as explained in commit cd163d4b4e ("usage.c: detect recursion in die routines and bail out immediately", 2012-11-14), is to break infinite recursion in cases where the die routine itself dies. However, doing that very aggressively by immediately printing out "recursion detected in die handler" if we've already called die() once means that threaded invocations of git can go through the following flow: 1. Start a bunch of threads 2. The threads start invoking die(), pretty much at the same time. 3. The first die() invocation will be in the middle of trying to print out its message by the time another thread dies, that other thread then runs into the recursion limit and dies with "recursion detected in die handler". 4. Due to a race condition the initial error may never get printed before the "recursion detected" thread calls exit() and aborts the program. An example of this is running a threaded grep against e.g. linux.git: git grep -P --threads=4 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' With the current version of git this will print some combination of multiple PCRE failures that caused the abort and multiple "recursion detected", some invocations will print out multiple "recursion detected" errors with no PCRE error at all! Now, git-grep could make use of the pluggable error facility added in commit c19a490e37 ("usage: allow pluggable die-recursion checks", 2013-04-16). That should be done for git-grep in particular because before this change (and after) it'll potentially print out the exact same error from the N threads it starts, that should be de-duplicated. But let's start by improving the default behavior shared across all of git. Right now the common case is not an infinite recursion in the handler, but us losing error messages by default because we're overly paranoid about our recursion check. So let's just set the recursion limit to a number higher than the number of threads we're ever likely to spawn. Now we won't lose errors, and if we have a recursing die handler we'll still die within microseconds. Signed-off-by: Ævar Arnfjörð Bjarmason --- usage.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/usage.c b/usage.c index 2f87ca69a8..1c198d4882 100644 --- a/usage.c +++ b/usage.c @@ -44,7 +44,9 @@ static void warn_builtin(const char *warn, va_list params) static int die_is_recursing_builtin(void) { static int dying; - return dying++; + static int recursion_limit = 1024; + + return dying++ > recursion_limit; } /* If we are in a dlopen()ed .so write to a global variable would segfault -- 2.13.1.518.g3df882009
[GSoC][PATCH 6/6] submodule: port submodule subcommand 'deinit' from shell to C
The same mechanism is used even for porting this submodule subcommand, as used in the ported subcommands till now. The function cmd_deinit in split up after porting into three functions: module_deinit, for_each_submodule_list and deinit_submodule. Mentored-by: Christian Couder Mentored-by: Stefan Beller Signed-off-by: Prathamesh Chavan --- builtin/submodule--helper.c | 140 git-submodule.sh| 55 + 2 files changed, 141 insertions(+), 54 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index e10cac462..f029f5fae 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -750,6 +750,145 @@ static int module_status(int argc, const char **argv, const char *prefix) return 0; } +struct deinit_cb { + const char *prefix; + unsigned int quiet: 1; + unsigned int force: 1; + unsigned int all: 1; +}; +#define DEINIT_CB_INIT { NULL, 0, 0, 0 } + +static void deinit_submodule(const struct cache_entry *list_item, +void *cb_data) +{ + struct deinit_cb *info = cb_data; + const struct submodule *sub; + char *displaypath = NULL; + struct child_process cp_config = CHILD_PROCESS_INIT; + struct strbuf sb_config = STRBUF_INIT; + char *sm_path = xstrdup(list_item->name); + char *sub_git_dir = xstrfmt("%s/.git", sm_path); + + sub = submodule_from_path(null_sha1, sm_path); + + if (!sub->name) + goto cleanup; + + displaypath = get_submodule_displaypath(sm_path, info->prefix); + + /* remove the submodule work tree (unless the user already did it) */ + if (is_directory(sm_path)) { + struct child_process cp = CHILD_PROCESS_INIT; + + /* protect submodules containing a .git directory */ + if (is_git_directory(sub_git_dir)) + die(_("Submodule work tree '%s' contains a .git " + "directory use 'rm -rf' if you really want " + "to remove it including all of its history"), + displaypath); + + if (!info->force) { + struct child_process cp_rm = CHILD_PROCESS_INIT; + cp_rm.git_cmd = 1; + argv_array_pushl(&cp_rm.args, "rm", "-qn", sm_path, +NULL); + + /* list_item->name is changed by cmd_rm() below */ + if (run_command(&cp_rm)) + die(_("Submodule work tree '%s' contains local " + "modifications; use '-f' to discard them"), + displaypath); + } + + cp.use_shell = 1; + argv_array_pushl(&cp.args, "rm", "-rf", sm_path, NULL); + if (!run_command(&cp)) { + if (!info->quiet) + printf(_("Cleared directory '%s'\n"), +displaypath); + } else { + if (!info->quiet) + printf(_("Could not remove submodule work tree '%s'\n"), +displaypath); + } + } + + if (mkdir(sm_path, 0700)) + die(_("could not create empty submodule directory %s"), + displaypath); + + cp_config.git_cmd = 1; + argv_array_pushl(&cp_config.args, "config", "--get-regexp", NULL); + argv_array_pushf(&cp_config.args, "submodule.%s\\.", sub->name); + + /* remove the .git/config entries (unless the user already did it) */ + if (!capture_command(&cp_config, &sb_config, 0) && sb_config.len) { + char *sub_key = xstrfmt("submodule.%s", sub->name); + /* +* remove the whole section so we have a clean state when +* the user later decides to init this submodule again +*/ + git_config_rename_section_in_file(NULL, sub_key, NULL); + if (!info->quiet) + printf(_("Submodule '%s' (%s) unregistered for path '%s'\n"), +sub->name, sub->url, displaypath); + free(sub_key); + } + +cleanup: + free(displaypath); + free(sub_git_dir); + free(sm_path); + strbuf_release(&sb_config); +} + +static int module_deinit(int argc, const char **argv, const char *prefix) +{ + struct deinit_cb info = DEINIT_CB_INIT; + struct pathspec pathspec; + struct module_list list = MODULE_LIST_INIT; + int quiet = 0; + int force = 0; + int all = 0; + + struct option module_deinit_options[] = { + OPT__QUIET(&quiet, N_("Suppress submodule status output")), + OPT__
[GSoC][PATCH 2/6] submodule--helper: introduce get_submodule_displaypath and for_each_submodule_list
Functions get_submodule_displaypath and for_each_submodule_list for using them in the later patches, related to porting submodule subcommands from shell to C. These new functions are also used in ported submodule subcommand init Mentored-by: Christian Couder Mentored-by: Stefan Beller Signed-off-by: Prathamesh Chavan --- builtin/submodule--helper.c | 69 - 1 file changed, 50 insertions(+), 19 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 8cc648d85..f7adca95b 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -13,6 +13,9 @@ #include "refs.h" #include "connect.h" +typedef void (*submodule_list_func_t)(const struct cache_entry *list_item, + void *cb_data); + static char *get_default_remote(void) { char *dest = NULL, *ret; @@ -219,6 +222,27 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr return 0; } +static char *get_submodule_displaypath(const char *path, const char *prefix) +{ + const char *super_prefix = get_super_prefix(); + + if (prefix && super_prefix) { + BUG("cannot have prefix '%s' and superprefix '%s'", + prefix, super_prefix); + } else if (prefix) { + struct strbuf sb = STRBUF_INIT; + char *displaypath = xstrdup(relative_path(path, prefix, &sb)); + strbuf_release(&sb); + return displaypath; + } else if (super_prefix) { + int len = strlen(super_prefix); + const char *format = is_dir_sep(super_prefix[len-1]) ? "%s%s" : "%s/%s"; + return xstrfmt(format, super_prefix, path); + } else { + return xstrdup(path); + } +} + struct module_list { const struct cache_entry **entries; int alloc, nr; @@ -330,26 +354,30 @@ static int module_list(int argc, const char **argv, const char *prefix) return 0; } -static void init_submodule(const char *path, const char *prefix, int quiet) +static void for_each_submodule_list(const struct module_list list, + submodule_list_func_t fn, void *cb_data) { + int i; + for (i = 0; i < list.nr; i++) + fn(list.entries[i], cb_data); +} + +struct init_cb { + const char *prefix; + unsigned int quiet: 1; +}; +#define INIT_CB_INIT { NULL, 0 } + +static void init_submodule(const struct cache_entry *list_item, void *cb_data) +{ + struct init_cb *info = cb_data; const struct submodule *sub; struct strbuf sb = STRBUF_INIT; char *upd = NULL, *url = NULL, *displaypath; - /* Only loads from .gitmodules, no overlay with .git/config */ - gitmodules_config(); - - if (prefix && get_super_prefix()) - die("BUG: cannot have prefix and superprefix"); - else if (prefix) - displaypath = xstrdup(relative_path(path, prefix, &sb)); - else if (get_super_prefix()) { - strbuf_addf(&sb, "%s%s", get_super_prefix(), path); - displaypath = strbuf_detach(&sb, NULL); - } else - displaypath = xstrdup(path); + displaypath = get_submodule_displaypath(list_item->name, info->prefix); - sub = submodule_from_path(null_sha1, path); + sub = submodule_from_path(null_sha1, list_item->name); if (!sub) die(_("No url found for submodule path '%s' in .gitmodules"), @@ -361,7 +389,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet) * * Set active flag for the submodule being initialized */ - if (!is_submodule_initialized(path)) { + if (!is_submodule_initialized(list_item->name)) { strbuf_reset(&sb); strbuf_addf(&sb, "submodule.%s.active", sub->name); git_config_set_gently(sb.buf, "true"); @@ -404,7 +432,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet) if (git_config_set_gently(sb.buf, url)) die(_("Failed to register url for submodule path '%s'"), displaypath); - if (!quiet) + if (!info->quiet) fprintf(stderr, _("Submodule '%s' (%s) registered for path '%s'\n"), sub->name, url, displaypath); @@ -433,10 +461,10 @@ static void init_submodule(const char *path, const char *prefix, int quiet) static int module_init(int argc, const char **argv, const char *prefix) { + struct init_cb info = INIT_CB_INIT; struct pathspec pathspec; struct module_list list = MODULE_LIST_INIT; int quiet = 0; - int i; struct option module_init_options[] = { OPT__QUIET(&quiet, N_("Suppress output
[GSoC][PATCH 5/6] submodule: port submodule subcommand sync from shell to C
The mechanism used for porting the submodule subcommand 'sync' is similar to that of 'foreach', where we split the function cmd_sync from shell into three functions in C, module_sync, for_each_submodule_list and sync_submodule. print_default_remote is introduced as a submodule--helper subcommand for getting the default remote as stdout. Mentored-by: Christian Couder Mentored-by: Stefan Beller Signed-off-by: Prathamesh Chavan --- builtin/submodule--helper.c | 180 git-submodule.sh| 56 +- 2 files changed, 181 insertions(+), 55 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 78b21ab22..e10cac462 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -43,6 +43,20 @@ static char *get_default_remote(void) return ret; } +static int print_default_remote(int argc, const char **argv, const char *prefix) +{ + const char *remote; + + if (argc != 1) + die(_("submodule--helper print-default-remote takes no arguments")); + + remote = get_default_remote(); + if (remote) + puts(remote); + + return 0; +} + static int starts_with_dot_slash(const char *str) { return str[0] == '.' && is_dir_sep(str[1]); @@ -311,6 +325,25 @@ static int print_name_rev(int argc, const char **argv, const char *prefix) return 0; } +static char *get_up_path(const char *path) +{ + int i = count_slashes(path); + struct strbuf sb = STRBUF_INIT; + + while (i--) + strbuf_addstr(&sb, "../"); + + /* +*Check if 'path' ends with slash or not +*for having the same output for dir/sub_dir +*and dir/sub_dir/ +*/ + if (!is_dir_sep(path[i - 1])) + strbuf_addstr(&sb, "../"); + + return strbuf_detach(&sb, NULL); +} + struct module_list { const struct cache_entry **entries; int alloc, nr; @@ -736,6 +769,151 @@ static int module_name(int argc, const char **argv, const char *prefix) return 0; } +struct sync_cb { + const char *prefix; + unsigned int quiet: 1; + unsigned int recursive: 1; +}; +#define SYNC_CB_INIT { NULL, 0, 0 } + +static void sync_submodule(const struct cache_entry *list_item, void *cb_data) +{ + struct sync_cb *info = cb_data; + const struct submodule *sub; + char *sub_key, *remote_key; + char *url, *sub_origin_url, *super_config_url, *displaypath; + struct strbuf sb = STRBUF_INIT; + struct child_process cp = CHILD_PROCESS_INIT; + + if (!is_submodule_initialized(list_item->name)) + return; + + sub = submodule_from_path(null_sha1, list_item->name); + + if (!sub->url) + die(_("no url found for submodule path '%s' in .gitmodules"), + list_item->name); + + url = xstrdup(sub->url); + + if (starts_with_dot_dot_slash(url) || starts_with_dot_slash(url)) { + char *remote_url, *up_path; + char *remote = get_default_remote(); + char *remote_key = xstrfmt("remote.%s.url", remote); + free(remote); + + if (git_config_get_string(remote_key, &remote_url)) + remote_url = xgetcwd(); + up_path = get_up_path(list_item->name); + sub_origin_url = relative_url(remote_url, url, up_path); + super_config_url = relative_url(remote_url, url, NULL); + free(remote_key); + free(up_path); + free(remote_url); + } else { + sub_origin_url = xstrdup(url); + super_config_url = xstrdup(url); + } + + displaypath = get_submodule_displaypath(list_item->name, info->prefix); + + if (!info->quiet) + printf(_("Synchronizing submodule url for '%s'\n"), +displaypath); + + sub_key = xstrfmt("submodule.%s.url", sub->name); + if (git_config_set_gently(sub_key, super_config_url)) + die(_("failed to register url for submodule path '%s'"), + displaypath); + + if (!is_submodule_populated_gently(list_item->name, NULL)) + goto cleanup; + + prepare_submodule_repo_env(&cp.env_array); + cp.git_cmd = 1; + cp.dir = list_item->name; + argv_array_pushl(&cp.args, "submodule--helper", +"print-default-remote", NULL); + if (capture_command(&cp, &sb, 0)) + die(_("failed to get the default remote for submodule '%s'"), + list_item->name); + + strbuf_strip_suffix(&sb, "\n"); + remote_key = xstrfmt("remote.%s.url", sb.buf); + strbuf_release(&sb); + + child_process_init(&cp); + prepare_submodule_repo_env(&cp.env_array); + cp.git_cmd = 1; + cp.dir = list_item->name; + argv_array_pus
[GSoC][PATCH 4/6] submodule: port submodule subcommand status
The mechanism used for porting submodule subcommand 'status' is similar to that used for subcommand 'foreach'. The function cmd_status from git-submodule is ported to three functions in the builtin submodule--helper namely: module_status, for_each_submodule_list and status_submodule. print_status is also introduced for handling the output of the subcommand and also to reduce the code size. Mentored-by: Christian Couder Mentored-by: Stefan Beller Signed-off-by: Prathamesh Chavan --- builtin/submodule--helper.c | 152 git-submodule.sh| 49 +- 2 files changed, 153 insertions(+), 48 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 6fd861e42..78b21ab22 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -566,6 +566,157 @@ static int module_init(int argc, const char **argv, const char *prefix) return 0; } +struct status_cb { + const char *prefix; + unsigned int quiet: 1; + unsigned int recursive: 1; + unsigned int cached: 1; +}; +#define STATUS_CB_INIT { NULL, 0, 0, 0 } + +static void print_status(struct status_cb *info, char state, const char *path, +char *sub_sha1, char *displaypath) +{ + if (info->quiet) + return; + + printf("%c%s %s", state, sub_sha1, displaypath); + + if (state == ' ' || state == '+') { + struct argv_array name_rev_args = ARGV_ARRAY_INIT; + + argv_array_pushl(&name_rev_args, "print-name-rev", +path, sub_sha1, NULL); + print_name_rev(name_rev_args.argc, name_rev_args.argv, + info->prefix); + } else { + printf("\n"); + } +} + +static void status_submodule(const struct cache_entry *list_item, void *cb_data) +{ + struct status_cb *info = cb_data; + char *sub_sha1 = xstrdup(oid_to_hex(&list_item->oid)); + char *displaypath; + struct argv_array diff_files_args = ARGV_ARRAY_INIT; + + if (!submodule_from_path(null_sha1, list_item->name)) + die(_("no submodule mapping found in .gitmodules for path '%s'"), + list_item->name); + + displaypath = get_submodule_displaypath(list_item->name, info->prefix); + + if (list_item->ce_flags) { + print_status(info, 'U', list_item->name, +sha1_to_hex(null_sha1), displaypath); + goto cleanup; + } + + if (!is_submodule_initialized(list_item->name)) { + print_status(info, '-', list_item->name, sub_sha1, displaypath); + goto cleanup; + } + + argv_array_pushl(&diff_files_args, "diff-files", +"--ignore-submodules=dirty", "--quiet", "--", +list_item->name, NULL); + + if (!cmd_diff_files(diff_files_args.argc, diff_files_args.argv, + info->prefix)) { + print_status(info, ' ', list_item->name, sub_sha1, displaypath); + } else { + if (!info->cached) { + struct child_process cp = CHILD_PROCESS_INIT; + struct strbuf sb = STRBUF_INIT; + + prepare_submodule_repo_env(&cp.env_array); + cp.git_cmd = 1; + cp.dir = list_item->name; + + argv_array_pushl(&cp.args, "rev-parse", +"--verify", "HEAD", NULL); + + if (capture_command(&cp, &sb, 0)) + die(_("could not run 'git rev-parse --verify" + "HEAD' in submodule %s"), + list_item->name); + + strbuf_strip_suffix(&sb, "\n"); + print_status(info, '+', list_item->name, sb.buf, +displaypath); + strbuf_release(&sb); + } else { + print_status(info, '+', list_item->name, sub_sha1, +displaypath); + } + } + + if (info->recursive) { + struct child_process cpr = CHILD_PROCESS_INIT; + + cpr.git_cmd = 1; + cpr.dir = list_item->name; + prepare_submodule_repo_env(&cpr.env_array); + + argv_array_pushl(&cpr.args, "--super-prefix", displaypath, +"submodule--helper", "status", "--recursive", +NULL); + + if (info->cached) + argv_array_push(&cpr.args, "--cached"); + + if (info->quiet) + argv_array_push(&cpr.args, "--quiet"); + + if (run_command(&cpr)) + die(_("failed to recurse into s
[GSoC][PATCH 3/6] submodule: port set_name_rev from shell to C
Since later on we want to port submodule subcommand status, and since set_name_rev is part of cmd_status, hence this function is ported. It has been ported to function print_name_rev in C, which calls get_name_rev to get the revname, and after formatting it, print_name_rev prints it. And hence in this way, the command `git submodule--helper print-name-rev "sm_path" "sha1"` sets value of revname in git-submodule.sh The function get_name_rev returns the stdout of the git describe commands. Since there are four different git-describe commands used for generating the name rev, four child_process are introduced, each successive child process running only when previous has no stdout. The order of these four git-describe commands is maintained the same as it was in the function set_name_rev() in shell script. Mentored-by: Christian Couder Mentored-by: Stefan Beller Signed-off-by: Prathamesh Chavan --- builtin/submodule--helper.c | 69 + git-submodule.sh| 16 ++- 2 files changed, 71 insertions(+), 14 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index f7adca95b..6fd861e42 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -243,6 +243,74 @@ static char *get_submodule_displaypath(const char *path, const char *prefix) } } +enum describe_step { + step_bare, + step_tags, + step_contains, + step_all_always, + step_end +}; + +static char *get_name_rev(const char *sub_path, const char* object_id) +{ + struct strbuf sb = STRBUF_INIT; + enum describe_step cur_step; + + for (cur_step = step_bare; cur_step < step_end; cur_step++) { + struct child_process cp = CHILD_PROCESS_INIT; + prepare_submodule_repo_env(&cp.env_array); + cp.dir = sub_path; + cp.git_cmd = 1; + cp.no_stderr = 1; + + switch (cur_step) { + case step_bare: + argv_array_pushl(&cp.args, "describe", +object_id, NULL); + break; + case step_tags: + argv_array_pushl(&cp.args, "describe", +"--tags", object_id, NULL); + break; + case step_contains: + argv_array_pushl(&cp.args, "describe", +"--contains", object_id, +NULL); + break; + case step_all_always: + argv_array_pushl(&cp.args, "describe", +"--all", "--always", +object_id, NULL); + break; + default: + BUG("unknown describe step '%d'", cur_step); + } + + if (!capture_command(&cp, &sb, 0) && sb.len) { + strbuf_strip_suffix(&sb, "\n"); + return strbuf_detach(&sb, NULL); + } + + } + + strbuf_release(&sb); + return NULL; +} + +static int print_name_rev(int argc, const char **argv, const char *prefix) +{ + char *namerev; + if (argc != 3) + die("print-name-rev only accepts two arguments: "); + + namerev = get_name_rev(argv[1], argv[2]); + if (namerev && namerev[0]) + printf(" (%s)", namerev); + printf("\n"); + + return 0; +} + struct module_list { const struct cache_entry **entries; int alloc, nr; @@ -1242,6 +1310,7 @@ static struct cmd_struct commands[] = { {"relative-path", resolve_relative_path, 0}, {"resolve-relative-url", resolve_relative_url, 0}, {"resolve-relative-url-test", resolve_relative_url_test, 0}, + {"print-name-rev", print_name_rev, 0}, {"init", module_init, SUPPORT_SUPER_PREFIX}, {"remote-branch", resolve_remote_submodule_branch, 0}, {"push-check", push_check, 0}, diff --git a/git-submodule.sh b/git-submodule.sh index c0d0e9a4c..091051891 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -758,18 +758,6 @@ cmd_update() } } -set_name_rev () { - revname=$( ( - sanitize_submodule_env - cd "$1" && { - git describe "$2" 2>/dev/null || - git describe --tags "$2" 2>/dev/null || - git describe --contains "$2" 2>/dev/null || - git describe --all --always "$2" - } - ) ) - test -z "$revname" || revname=" ($revname)" -} # # Show commit summary for submodules in index or working tree #
[GSoC][PATCH 1/6] dir: create function count_slashes
Similar functions exist in apply.c and builtin/show-branch.c for counting the number of slashes in a string. Also in the later patches, we introduce a third caller for the same. Hence, we unify it now by cleaning the existing functions and declaring a common function count_slashes in dir.h and implementing it in dir.c to remove this code duplication. Mentored-by: Christian Couder Mentored-by: Stefan Beller Signed-off-by: Junio C Hamano Signed-off-by: Prathamesh Chavan Signed-off-by: Junio C Hamano --- The complete build report of this is available at: https://travis-ci.org/pratham-pc/git/builds/ Branch: All-patch-series Build #111 apply.c | 11 --- builtin/show-branch.c | 13 +++-- dir.c | 9 + dir.h | 3 +++ 4 files changed, 15 insertions(+), 21 deletions(-) diff --git a/apply.c b/apply.c index c49cef063..121e53406 100644 --- a/apply.c +++ b/apply.c @@ -762,17 +762,6 @@ static char *find_name_traditional(struct apply_state *state, return find_name_common(state, line, def, p_value, line + len, 0); } -static int count_slashes(const char *cp) -{ - int cnt = 0; - char ch; - - while ((ch = *cp++)) - if (ch == '/') - cnt++; - return cnt; -} - /* * Given the string after "--- " or "+++ ", guess the appropriate * p_value for the given patch. diff --git a/builtin/show-branch.c b/builtin/show-branch.c index 4a6cc6f49..3636a0559 100644 --- a/builtin/show-branch.c +++ b/builtin/show-branch.c @@ -5,6 +5,7 @@ #include "color.h" #include "argv-array.h" #include "parse-options.h" +#include "dir.h" static const char* show_branch_usage[] = { N_("git show-branch [-a | --all] [-r | --remotes] [--topo-order | --date-order]\n" @@ -421,14 +422,6 @@ static int append_tag_ref(const char *refname, const struct object_id *oid, static const char *match_ref_pattern = NULL; static int match_ref_slash = 0; -static int count_slash(const char *s) -{ - int cnt = 0; - while (*s) - if (*s++ == '/') - cnt++; - return cnt; -} static int append_matching_ref(const char *refname, const struct object_id *oid, int flag, void *cb_data) @@ -438,7 +431,7 @@ static int append_matching_ref(const char *refname, const struct object_id *oid, * refs/tags/v0.99.9a and friends. */ const char *tail; - int slash = count_slash(refname); + int slash = count_slashes(refname); for (tail = refname; *tail && match_ref_slash < slash; ) if (*tail++ == '/') slash--; @@ -529,7 +522,7 @@ static void append_one_rev(const char *av) int saved_matches = ref_name_cnt; match_ref_pattern = av; - match_ref_slash = count_slash(av); + match_ref_slash = count_slashes(av); for_each_ref(append_matching_ref, NULL); if (saved_matches == ref_name_cnt && ref_name_cnt < MAX_REVS) diff --git a/dir.c b/dir.c index 9efcf1eab..4a953c16a 100644 --- a/dir.c +++ b/dir.c @@ -52,6 +52,15 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir, static int get_dtype(struct dirent *de, struct index_state *istate, const char *path, int len); +int count_slashes(const char *s) +{ + int cnt = 0; + while (*s) + if (*s++ == '/') + cnt++; + return cnt; +} + int fspathcmp(const char *a, const char *b) { return ignore_case ? strcasecmp(a, b) : strcmp(a, b); diff --git a/dir.h b/dir.h index a89c13e27..e3717055d 100644 --- a/dir.h +++ b/dir.h @@ -197,6 +197,9 @@ struct dir_struct { unsigned unmanaged_exclude_files; }; +/*Count the number of slashes for string s*/ +extern int count_slashes(const char *s); + /* * The ordering of these constants is significant, with * higher-numbered match types signifying "closer" (i.e. more -- 2.13.0
[GSoC] Update: Week 5
SUMMARY OF MY PROJECT: Git submodule subcommands are currently implemented by using shell script 'git-submodule.sh'. There are several reasons why we'll prefer not to use the shell script. My project intends to convert the subcommands into C code, thus making them builtins. This will increase Git's portability and hence the efficiency of working with the git-submodule commands. Link to the complete proposal: [1] Mentors: Stefan Beller Christian Couder UPDATES: Following are the updates about my ongoing project: 1. sync and status: The patches were discussed with the mentors and after that, are being posted with this patch. 2. deinit: The patch is finally debugged, and is ready to be discussed. It is also attached with this update. 3. summary: While porting the subcommand, I underwent certain issues. After getting them clarified from my mentors, I have resumed working on it. I'm aware of the time I have taken for porting this subcommand is more than the previous ones. Hence will try my best to finish this in this week. 4. foreach: As stated in the previous update, the subcommand was ported without resolving the bug, and simply translating the present code, and adding a NEEDSWORK tag to the comment for mentioning the reported bug as well. But as communicating between child_process is still an issue and so there was no simple was to current carry out the porting. And hence, a hack was used instead. But after discussing it, instead using the repository-object patch series will help to resolve these issues in this situation. PLAN FOR WEEK-6 (20 June 2017 to 26 June 2017): 1. summary: Mostly I'll be working on this and post the patch for discussion as soon as possible. 2. foreach: As it was decided that unblock the conversion of this submodule subcommand, the original cmd_foreach was ported without including the BUG-FIX patch here. Hence, for this week I will try to utilize the 'repository-object' series by Brandon Williams. 3. deinit: I will be working on improvising this patch as it was recently debugged and posted for discussion. [1]: https://docs.google.com/document/d/1krxVLooWl--75Pot3dazhfygR3wCUUWZWzTXtK1L-xU/ Thanks, Prathamesh Chavan
Git Strange behaviour with remote deleted branch
Hello, with git I noticed when I removed a remote branch with git push origin --delete in my clone when I used git branch -a I don't the deleted branch but my colleagues still see it. I tried with two clones in my PC, with the first one delete branch and the other still sees it despite git pull . I use git version 2.9.4 Kind regards.
Re: What's cooking in git.git (Jun 2017, #05; Mon, 19)
> > * sb/submodule-doc (2017-06-13) 1 commit > - submodules: overhaul documentation > > Doc update. > > Waiting for discussion to settle. Please hold back, this definitely needs another version. > * sb/diff-color-move (2017-06-01) 17 commits > - diff.c: color moved lines differently > ... > "git diff" has been taught to optionally paint new lines that are > the same as deleted lines elsewhere differently from genuinely new > lines. > > Is any more update coming? Yes. I do have a series locally that replaces "diff_line" with "emitted_string" (but the same structure) and also changed the algorithm to have 8 colors configurable. But then I got dragged away in "doing it right", which will be presented shortly with a different approach for the first patches and how they are refactored. So the refactoring of that is done and I need to apply the patches that bring in the new functionality on top. Thanks, Stefan
Re: Better usability of stash refs
On Mon, Jun 19, 2017 at 03:32:54PM -0500, Robert Dailey wrote: > To drop a stash, I have to do this (example): > > $ git stash drop stash@{3} > > Using the full "stash@{N}" seems superfluous since the documentation > states it must be a stash in the first place. It would make more sense > (and be quicker to type) to do: > > $ git stash drop 3 > > Is there a trick I can use to make this shorthand possible? I thought > about creating a "s" script for "stash" that intercepted the > parameters for only a couple of stash sub-commands and created the > ref, but that seems a lot of work. > > Any productivity tips here? Thanks in advance. Junio mentioned that this is already possible. I suspect the problem may be that your Git is not recent enough. It was added in a56c8f5aa (stash: allow stashes to be referenced by index only, 2016-10-24), which is in v2.11.0. -Peff
Re: Better usability of stash refs
Robert Dailey writes: > To drop a stash, I have to do this (example): > > $ git stash drop stash@{3} > > Using the full "stash@{N}" seems superfluous since the documentation > states it must be a stash in the first place. It would make more sense > (and be quicker to type) to do: > > $ git stash drop 3 > > Is there a trick I can use to make this shorthand possible? I thought > about creating a "s" script for "stash" that intercepted the > parameters for only a couple of stash sub-commands and created the > ref, but that seems a lot of work. > > Any productivity tips here? Thanks in advance. Isn't that already the case? git-stash.sh::drop_stash gives "$@" to assert_stash_ref, which in turn calls is_stash_ref, which in turn calls is_stash_like, and this callchain eventually cals into parse_flags_and_rev. Which has this bit (may be hard to read because it is incorrectly indented): case "$1" in *[!0-9]*) : ;; *) set -- "${ref_stash}@{$1}" ;; esac REV=$(git rev-parse --symbolic --verify --quiet "$1") || { reference="$1" die "$(eval_gettext "\$reference is not a valid reference")" }
Better usability of stash refs
To drop a stash, I have to do this (example): $ git stash drop stash@{3} Using the full "stash@{N}" seems superfluous since the documentation states it must be a stash in the first place. It would make more sense (and be quicker to type) to do: $ git stash drop 3 Is there a trick I can use to make this shorthand possible? I thought about creating a "s" script for "stash" that intercepted the parameters for only a couple of stash sub-commands and created the ref, but that seems a lot of work. Any productivity tips here? Thanks in advance.
Re: in case you want a use-case with lots of submodules
On Mon, 19 Jun 2017, Stefan Beller wrote: > On Mon, Jun 19, 2017 at 8:59 AM, Yaroslav Halchenko > wrote: > > Hi All, > > On a recent trip I've listened to the git minutes podcast episode and > > got excited to hear Stefan Beller (CCed just in case) describing > > ongoing work on submodules mechanism. I got excited, since e.g. > > performance improvements would be of great benefit to us too. > If you're mostly interested in performance improvements of the status > quo (i.e. "make git-submodule fast"), then the work of Prathamesh > Chavan (cc'd) might be more interesting to you than what I do. > He is porting git-submodule (which is mostly a shell script nowadays) > to C, such that we can save a lot of process invocations and can do > processing within one process. ah -- cool. I would be eager to test it out, thanks! would be interesting to see if it positively affects our overall performance. Pointers to that development would be welcome! > > http://datasets.datalad.org ATM provides quite a sizeable (ATM 370 > > repositories, up to 4 levels deep) hierarchy of git/git-annex > > repositories all tied together via git submodules mechanism. And as the > > collection grows, interactions with it become slower, so additional > > options (such as --ignore-submodules=dirty to status) become our > > friends. > I am not as much concerned about the 370 number than about the > 4 layers of nesting. In my experience the nested submodule case > is a little bit error prone and the bug reports are not as frequent as > there are not as many users of nesting, yet(?) well -- part of the story here is that we are forced to use/have full blown .git/ directories (for git-annex symlinks to content files to work) within submodules instead of .git file with a reference under parent's .git/modules. So we can 'slice' at any level and I guess that is why may be avoiding some possibly issues due to nesting and the "parent has all .git/modules" approach. > In a neighboring thread on the mailing list we have a discussion > on the usefulness of being on branches than in detached HEAD > in the submodules. > https://public-inbox.org/git/0092cdd27c5f9d418b0f3e9b5d05be0801028...@sbs2011.opfingen.plc2.de/ > This would not break non-ambiguously, rather it would add > ease of use. that is indeed a common caveat... I am not sure if any heuristic approach would provide a 'bullet proof' solution. I might even prefer a hardcoded 'branch-name' to be listed/associated with each submodule within .gitmodules. In the datalad case, detached HEAD is common whenever someone installs "outdated" (branch of which progressed forward) submodule. In this case we just check if the branch after "git clone" (but before git submodule update) includes the pointed by Subproject commit, and if so -- we announce that it must be the branch (so far it is always "master" branch anyways ;) ) > > So I thought to share this as a use-case happen you need more > > motivation or just a real-case test-bed for your work. And thank > > you again for making Git even Greater. > Thanks for the motivation. :) the least I could do ;) -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik
AW: Restoring detached HEADs after Git operations
Hello Stefan, I never have tapped into the DLL Hell trap. That's maybe I never did C++ development or I started with VB .NET / C# as .NET solved major parts of the DLL Hell :). That doesn't mean my new beloved language Python doesn't have a similar problem ... Thinking about DLL Hell is a thinking in big version numbers like 1.0, 2.0 oder even 2.1, 2.2, ... We are here talking about revisions in the build numbers which need to be synchronized between the parent repository and the sub modules (IP cores). Both sides are under heavy development and interfaces evolving from day to day because hardware design can't be planned as easy as software design. So by using Git submodules a developer - responsible for a submodule / IP core - can after he finished interface level 1 now go on and implement interface level 2. The parent project can finish it's integration and testing of the level 1 interface before proceeding with level 2. More over if the same IP core is used multiple time in different sub IP cores, it's possible to update one usage place to interface level 2 by a second developer so he can finish his IP core at level 2, which other usage places can still use the level 1 interface. Start situation: -- TOPLEVEL (developer A) o- IP_1 @level1 (developer B) o- IP_2 @level1 (developer C) o- IP_3 @level1 (developer D) o- IP_2 @level1 Developer C creates interface level 2, but all instances use level1 of IP_2: -- TOPLEVEL (developer A) o- IP_1 @level1 (developer B) o- IP_2 @level1 (developer C) o- IP_3 @level1 (developer D) o- IP_2 @level1 Developer D updates instance of IP_2 to level 2 and completes level 2 of IP_3: -- TOPLEVEL (developer A) o- IP_1 @level1 (developer B) o- IP_2 @level1 (developer C) o- IP_3 @level1 (developer D) o- IP_2 @level2 Developer A updates instance of IP_3 to level 2: -- TOPLEVEL (developer A) o- IP_1 @level1 (developer B) o- IP_2 @level1 (developer C) o- IP_3 @level2 (developer D) o- IP_2 @level2 Developer B has finished his testing for IP_1 and can now update the instance if IP_2: -- TOPLEVEL (developer A) o- IP_1 @level1 (developer B) o- IP_2 @level2 (developer C) o- IP_3 @level2 (developer D) o- IP_2 @level2 So now imaging 8 developers, whereof 6 are working remote on the project. There is one responsible developer per IP core (maintainer) and an overall maintainer overseeing all integration merges and test results (CI). Kind regards Patrick Von: Stefan Beller [sbel...@google.com] Gesendet: Montag, 19. Juni 2017 21:21 Bis: Patrick Lehmann Cc: Lars Schneider; Git Mailinglist Betreff: Re: Restoring detached HEADs after Git operations On Mon, Jun 19, 2017 at 11:09 AM, Patrick Lehmann wrote: > Hello Stefan, > > the use case is as follows: > > The projects consists of circa 18 IP cores. Each IP core is represented by a > Git repository. Think of an IP core as of a lonestanding DLL or SO file > project. Each IP core references 2 submodules, which bring the verification > environments for testing the IP core standalone. So phrased differently: You are using submodules to avoid "DLL hell" (sharing a lib, with ease of versioning as the submodules in the different IP cores may be pointing at different versions). > > These 18 IP cores are grouped to bigger IP cores, referencing the low-level > IP cores and each again the 2 verification submodules. Finally, the main > project references the bigger IP cores and again the 2 verification cores. > > TOPLEVEL > o- IP1 >o- UVVM >o- VUnit > o- IP2 >o- UVVM >o- VUnit > o- IP3 >o- UVVM >o- VUnit > o- IP4 >o- UVVM >o- VUnit >o- IP5 >o- UVVM >o- VUnit >o- IP6 >o- UVVM >o- VUnit >o- IP7 >o- UVVM >o- VUnit > o- IP8 >o- UVVM >o- VUnit >o- IP9 >o- UVVM >o- VUnit >o- IP10 >o- UVVM >o- VUnit > o- IP11 >o- UVVM >o- VUnit >o- IP9 >o- UVVM >o- VUnit >o- IP12 >o- UVVM >o- VUnit >o- UVVM >o- VUnit > > That's the simplified structure. I can't write more, because it's a closed > source project. You can find other usecases e.g. in my other open source > projects. E.g. The PoC-Library or The PicoBlaze-Library and the corresponding > PoC-Examples repository. > > Example: PoC > Pile of Cores includes 4 Git submodules and is itself an IP core library. > So PoC-Examples again references PoC. This looks like this tree: > > PoC-Examples > |- lib/ >o- PoC > |- lib >
What's cooking in git.git (Jun 2017, #05; Mon, 19)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * ab/pcre-v2 (2017-06-02) 8 commits (merged to 'next' on 2017-06-13 at 34bf49ee44) + grep: add support for PCRE v2 + grep: un-break building with PCRE >= 8.32 without --enable-jit + grep: un-break building with PCRE < 8.20 + grep: un-break building with PCRE < 8.32 + grep: add support for the PCRE v1 JIT API + log: add -P as a synonym for --perl-regexp + grep: skip pthreads overhead when using one thread + grep: don't redundantly compile throwaway patterns under threading Update "perl-compatible regular expression" support to enable JIT and also allow linking with the newer PCRE v2 library. * ab/perf-remove-index-lock (2017-06-05) 1 commit (merged to 'next' on 2017-06-13 at c532e59233) + perf: work around the tested repo having an index.lock When an existing repository is used for t/perf testing, we first create bit-for-bit copy of it, which may grab a transient state of the repository and freeze it into the repository used for testing, which then may cause Git operations to fail. Single out "the index being locked" case and forcibly drop the lock from the copy. * bw/object-id (2017-06-05) 33 commits (merged to 'next' on 2017-06-13 at 0582278759) + diff: rename diff_fill_sha1_info to diff_fill_oid_info + diffcore-rename: use is_empty_blob_oid + tree-diff: convert path_appendnew to object_id + tree-diff: convert diff_tree_paths to struct object_id + tree-diff: convert try_to_follow_renames to struct object_id + builtin/diff-tree: cleanup references to sha1 + diff-tree: convert diff_tree_sha1 to struct object_id + notes-merge: convert write_note_to_worktree to struct object_id + notes-merge: convert verify_notes_filepair to struct object_id + notes-merge: convert find_notes_merge_pair_ps to struct object_id + notes-merge: convert merge_from_diffs to struct object_id + notes-merge: convert notes_merge* to struct object_id + tree-diff: convert diff_root_tree_sha1 to struct object_id + combine-diff: convert find_paths_* to struct object_id + combine-diff: convert diff_tree_combined to struct object_id + diff: convert diff_flush_patch_id to struct object_id + patch-ids: convert to struct object_id + diff: finish conversion for prepare_temp_file to struct object_id + diff: convert reuse_worktree_file to struct object_id + diff: convert fill_filespec to struct object_id + diff: convert diff_change to struct object_id + diff: convert run_diff_files to struct object_id + diff: convert diff_addremove to struct object_id + diff: convert diff_index_show_file to struct object_id + diff: convert get_stat_data to struct object_id + grep: convert to struct object_id + notes: convert some accessor functions to struct object_id + builtin/notes: convert to struct object_id + notes: convert format_display_notes to struct object_id + notes: make get_note return pointer to struct object_id + notes: convert for_each_note to struct object_id + notes: convert internal parts to struct object_id + notes: convert internal structures to struct object_id Conversion from uchar[20] to struct object_id continues. * jk/consistent-h (2017-06-05) 8 commits (merged to 'next' on 2017-06-13 at e09c1fe968) + t0012: test "-h" with builtins + git: add hidden --list-builtins option + version: convert to parse-options + diff- and log- family: handle "git cmd -h" early + submodule--helper: show usage for "-h" + remote-{ext,fd}: print usage message on invalid arguments + upload-archive: handle "-h" option early + credential: handle invalid arguments earlier "git $cmd -h" for builtin commands calls the implementation of the command (i.e. cmd_$cmd() function) without doing any repository set-up, and the commands that expect RUN_SETUP is done by the Git potty needs to be prepared to show the help text without barfing. * jk/pathspec-magic-disambiguation (2017-05-29) 6 commits (merged to 'next' on 2017-06-13 at 088987f033) + verify_filename(): flip order of checks + verify_filename(): treat ":(magic)" as a pathspec + check_filename(): handle ":^" path magic + check_filename(): use skip_prefix + check_filename(): refactor ":/" handling + t4208: add check for ":/" without matching file The convention for a command line is to follow "git cmdname --options" with revisions followed by an optional "--" disambiguator and then finally pathspecs. When "--" is not there, we make sure early ones are all interpretable as revs (and do not look like paths) and later ones are the
Re: [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs
Jeff King writes: > If we forget the "storing it twice" argument, would it make sense to > convert the parallel arrays of items into a single array-of-struct? > I.e.: > > struct configured_refspec { > const char *string; > struct refspec refspec; > unsigned parsed:1; > } > > I guess that may run into problems where we really need an > array-of-refspec to pass into sub-functions. So going further, could we > just have "struct refspec" store the text form it was parsed from? I find this a sensible suggestion. I think the original "parallel" structure was anticipating that a textual input we get from the user (either from the command line or from the configuration) could expand to multiple parsed refspecs for easier use by the code, but it appears that we hadn't seen a need for that, so I think it is safe to unify them into a single struct. Of course, that still anticipates that a parsed refspec may not be unambiguously turned back to textual original input form; if that will never be the case, then the approach taken by the posted patches should also be OK.
Re: [PATCH v2 0/3] Add regression tests for rectent rebase -i fixes
Johannes Sixt writes: > Am 16.06.2017 um 20:43 schrieb Johannes Sixt: >> Am 16.06.2017 um 15:49 schrieb Johannes Schindelin: >>> On Thu, 15 Jun 2017, Junio C Hamano wrote: diff --git a/t/t3420-rebase-autostash.sh b/t/t3420-rebase-autostash.sh index 325ec75353..801bce25da 100755 --- a/t/t3420-rebase-autostash.sh +++ b/t/t3420-rebase-autostash.sh @@ -45,7 +45,7 @@ create_expected_success_am() { } create_expected_success_interactive() { -cr=$'\r' && +cr=$(echo . | tr '.' '\015') && cat >expected <<-EOF $(grep "^Created autostash: [0-9a-f][0-9a-f]*\$" actual) HEAD is now at $(git rev-parse --short feature-branch) third commit >>> >>> This is still incorrect, as the \r\n (which $(echo . | tr \.\ '\015') >>> would emit) is interpreted correctly as a line break on Windows, meaning >>> that cr is now *empty*. Not what we want. >>> >>> What I did is to replace the `cat` by `q_to_cr` (we have that lovely >>> function, might just as well use it), replace `${cr}` by `Q` and skip the >>> cr variable altogether. >> >> You beat me to it. I came up with the identical q_to_cr changes, but >> haven't dug the remaining failure regarding the swapped output >> lines. You seem to have nailed it. Will test your proposed changes >> tomorrow. > > As expected, the patches fix the observed test failures for me, too, > if that's still relevant. Thanks for double-checking.
Re: [PATCH 00/28] Create a reference backend for packed refs
On Mon, Jun 19, 2017 at 03:43:15PM -0400, Jeff King wrote: > > > Is the iterator over packed-refs correctly skipping over what are > > > covered by loose refs? The entries in the packed-refs file that are > > > superseded by loose refs should be allowed to point at an already > > > expired object. > > > > Here it is in a test form for easier diagnosis. > > Thanks, I was just starting to do that myself. The problem is in > ca6b06eb7 (packed_ref_store: support iteration, 2017-05-15) and is > pretty obvious: the packed_ref iterator checks whether the entry > resolves. > > I think that _neither_ of the loose and packed iterators should be > checking this. It's only the merged result (where loose trumps packed) > that should bother checking. It looks like this is mostly already the case. files_ref_iterator combines the two and does check. The loose iteration is done by cache_ref_iterator[1], and does not. So it's just this new packed_refs iterator that is wrong, and we just need: diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 311fd014c..ad1143e64 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -416,12 +416,6 @@ static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator) ref_type(iter->iter0->refname) != REF_TYPE_PER_WORKTREE) continue; - if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) && - !ref_resolves_to_object(iter->iter0->refname, - iter->iter0->oid, - iter->iter0->flags)) - continue; - iter->base.refname = iter->iter0->refname; iter->base.oid = iter->iter0->oid; iter->base.flags = iter->iter0->flags; @@ -473,8 +467,6 @@ static struct ref_iterator *packed_ref_iterator_begin( struct ref_iterator *ref_iterator; unsigned int required_flags = REF_STORE_READ; - if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN)) - required_flags |= REF_STORE_ODB; refs = packed_downcast(ref_store, required_flags, "ref_iterator_begin"); iter = xcalloc(1, sizeof(*iter)); At least that makes sense to me and passes the tests (including the new one). I haven't actually reviewed the patches yet. -Peff
Re: [PATCH v2 0/3] Add regression tests for rectent rebase -i fixes
Am 16.06.2017 um 20:43 schrieb Johannes Sixt: Am 16.06.2017 um 15:49 schrieb Johannes Schindelin: On Thu, 15 Jun 2017, Junio C Hamano wrote: diff --git a/t/t3420-rebase-autostash.sh b/t/t3420-rebase-autostash.sh index 325ec75353..801bce25da 100755 --- a/t/t3420-rebase-autostash.sh +++ b/t/t3420-rebase-autostash.sh @@ -45,7 +45,7 @@ create_expected_success_am() { } create_expected_success_interactive() { -cr=$'\r' && +cr=$(echo . | tr '.' '\015') && cat >expected <<-EOF $(grep "^Created autostash: [0-9a-f][0-9a-f]*\$" actual) HEAD is now at $(git rev-parse --short feature-branch) third commit This is still incorrect, as the \r\n (which $(echo . | tr \.\ '\015') would emit) is interpreted correctly as a line break on Windows, meaning that cr is now *empty*. Not what we want. What I did is to replace the `cat` by `q_to_cr` (we have that lovely function, might just as well use it), replace `${cr}` by `Q` and skip the cr variable altogether. You beat me to it. I came up with the identical q_to_cr changes, but haven't dug the remaining failure regarding the swapped output lines. You seem to have nailed it. Will test your proposed changes tomorrow. As expected, the patches fix the observed test failures for me, too, if that's still relevant. -- Hannes
Re: [PATCH 2/2] Add test for the new status message
On Mon, 2017-06-19 at 14:04 -0400, Jeff King wrote: > On Mon, Jun 19, 2017 at 11:29:49PM +0530, Kaartic Sivaraam wrote: > > > Is there a way to test for the "Initial commit" message in the > > commit > > template? > > You can do "git commit --dry-run", which produces the template on > stdout. Thanks for the help :) > > That should be good enough for our purposes here, as it's the same > code > that produces the text that goes into the editor template. If you > really > wanted to test what the editor sees, you could do something like: > > GIT_EDITOR='cat >editor-input' git commit > > but I don't think it's worth the trouble. > That's right. I thinks it's not worth the trouble too.
Re: [PATCH 00/28] Create a reference backend for packed refs
On Mon, Jun 19, 2017 at 12:25:07PM -0700, Junio C Hamano wrote: > Junio C Hamano writes: > > > Is the iterator over packed-refs correctly skipping over what are > > covered by loose refs? The entries in the packed-refs file that are > > superseded by loose refs should be allowed to point at an already > > expired object. > > Here it is in a test form for easier diagnosis. Thanks, I was just starting to do that myself. The problem is in ca6b06eb7 (packed_ref_store: support iteration, 2017-05-15) and is pretty obvious: the packed_ref iterator checks whether the entry resolves. I think that _neither_ of the loose and packed iterators should be checking this. It's only the merged result (where loose trumps packed) that should bother checking. -Peff
Re: in case you want a use-case with lots of submodules
On Mon, Jun 19, 2017 at 8:59 AM, Yaroslav Halchenko wrote: > Hi All, > > On a recent trip I've listened to the git minutes podcast episode and > got excited to hear Stefan Beller (CCed just in case) describing > ongoing work on submodules mechanism. I got excited, since e.g. > performance improvements would be of great benefit to us too. If you're mostly interested in performance improvements of the status quo (i.e. "make git-submodule fast"), then the work of Prathamesh Chavan (cc'd) might be more interesting to you than what I do. He is porting git-submodule (which is mostly a shell script nowadays) to C, such that we can save a lot of process invocations and can do processing within one process. > In our project, http://datalad.org, git submodules is the basic > mechanism to bring multiple "datasets" (mix of git and git-annex'ed > repositories) under the same roof so we could non-ambiguously > version them all at any level. Cool, glad to here submodules being useful. :) > http://datasets.datalad.org ATM provides quite a sizeable (ATM 370 > repositories, up to 4 levels deep) hierarchy of git/git-annex > repositories all tied together via git submodules mechanism. And as the > collection grows, interactions with it become slower, so additional > options (such as --ignore-submodules=dirty to status) become our > friends. I am not as much concerned about the 370 number than about the 4 layers of nesting. In my experience the nested submodule case is a little bit error prone and the bug reports are not as frequent as there are not as many users of nesting, yet(?) In a neighboring thread on the mailing list we have a discussion on the usefulness of being on branches than in detached HEAD in the submodules. https://public-inbox.org/git/0092cdd27c5f9d418b0f3e9b5d05be0801028...@sbs2011.opfingen.plc2.de/ This would not break non-ambiguously, rather it would add ease of use. > So I thought to share this as a use-case happen you need more > motivation or just a real-case test-bed for your work. And thank > you again for making Git even Greater. Thanks for the motivation. :) > P.S. Please CCme in your replies (if any), I am not on the list > > With best regards, Cheers, Stefan
Re: [PATCH 00/28] Create a reference backend for packed refs
Junio C Hamano writes: > Is the iterator over packed-refs correctly skipping over what are > covered by loose refs? The entries in the packed-refs file that are > superseded by loose refs should be allowed to point at an already > expired object. Here it is in a test form for easier diagnosis. t/t1408-packed-refs.sh | 46 ++ 1 file changed, 46 insertions(+) diff --git a/t/t1408-packed-refs.sh b/t/t1408-packed-refs.sh new file mode 100755 index 00..35533c8593 --- /dev/null +++ b/t/t1408-packed-refs.sh @@ -0,0 +1,46 @@ +#!/bin/sh + +test_description='packed-refs entries are covered by loose refs' + +. ./test-lib.sh + +test_expect_success setup ' + test_tick && + git commit --allow-empty -m one && + one=$(git rev-parse HEAD) && + git for-each-ref >actual && + echo "$one commit refs/heads/master" >expect && + test_cmp expect actual && + + git pack-refs --all && + git for-each-ref >actual && + echo "$one commit refs/heads/master" >expect && + test_cmp expect actual && + + cat .git/packed-refs && + + git checkout --orphan another && + test_tick && + git commit --allow-empty -m two && + two=$(git rev-parse HEAD) && + git checkout -B master && + git branch -D another && + + cat .git/packed-refs && + + git for-each-ref >actual && + echo "$two commit refs/heads/master" >expect && + test_cmp expect actual && + + git reflog expire --expire=now --all && + git prune && + git tag -m v1.0 v1.0 master +' + +test_expect_success 'no error from stale entry in packed-refs' ' + git describe master >actual 2>&1 && + echo "v1.0" >expect && + test_cmp expect actual +' + +test_done
Re: Restoring detached HEADs after Git operations
On Mon, Jun 19, 2017 at 11:09 AM, Patrick Lehmann wrote: > Hello Stefan, > > the use case is as follows: > > The projects consists of circa 18 IP cores. Each IP core is represented by a > Git repository. Think of an IP core as of a lonestanding DLL or SO file > project. Each IP core references 2 submodules, which bring the verification > environments for testing the IP core standalone. So phrased differently: You are using submodules to avoid "DLL hell" (sharing a lib, with ease of versioning as the submodules in the different IP cores may be pointing at different versions). > > These 18 IP cores are grouped to bigger IP cores, referencing the low-level > IP cores and each again the 2 verification submodules. Finally, the main > project references the bigger IP cores and again the 2 verification cores. > > TOPLEVEL > o- IP1 >o- UVVM >o- VUnit > o- IP2 >o- UVVM >o- VUnit > o- IP3 >o- UVVM >o- VUnit > o- IP4 >o- UVVM >o- VUnit >o- IP5 >o- UVVM >o- VUnit >o- IP6 >o- UVVM >o- VUnit >o- IP7 >o- UVVM >o- VUnit > o- IP8 >o- UVVM >o- VUnit >o- IP9 >o- UVVM >o- VUnit >o- IP10 >o- UVVM >o- VUnit > o- IP11 >o- UVVM >o- VUnit >o- IP9 >o- UVVM >o- VUnit >o- IP12 >o- UVVM >o- VUnit >o- UVVM >o- VUnit > > That's the simplified structure. I can't write more, because it's a closed > source project. You can find other usecases e.g. in my other open source > projects. E.g. The PoC-Library or The PicoBlaze-Library and the corresponding > PoC-Examples repository. > > Example: PoC > Pile of Cores includes 4 Git submodules and is itself an IP core library. > So PoC-Examples again references PoC. This looks like this tree: > > PoC-Examples > |- lib/ >o- PoC > |- lib > o- Cocotb > o- OSVVM > o- VUnit > o- OSVVM > o- UVVM > > The library VUnit itself already includes OSVVM as a library. > > -- > Forcast: > I'll write a new question / idea about multiple equal submodules and the > memory footprint soon... > Here is my original question posted on StackOverflow: > https://stackoverflow.com/questions/44585425/how-to-reduce-the-memory-footprint-for-multiple-submodules-of-the-same-source > -- > > Do you need more use cases? > Well this use case points out a different issue than I hoped for. ;) >From the stackoverflow post and from looking at the layout here, one of the major questions is how to deduplicate the submodule object store for example. By use case I rather meant a sales pitch for your initial email: I use this bash script because it fits in my workflow because I need branches instead of detached HEADS, because $REASONS and I'd be interested in these $REASONS, which I assumed to be * easier to work with branches than detached HEADS (it aids the workflow) * we're not challenging the underlying mental model of tracking sha1s in the superproject rather than branches. At least I gave these reasons in the "reattach HEAD" stuff that I wrote, but maybe there are others? (I know the code base of submodules very well, but I do not work with submodules on a day-to-day basis myself...)
Re: Restoring detached HEADs after Git operations
On Mon, Jun 19, 2017 at 10:55 AM, Junio C Hamano wrote: > Stefan Beller writes: > >> On Mon, Jun 19, 2017 at 2:52 AM, Patrick Lehmann >> wrote: >>> Hello Lars, >>> >>> for your questions: If there are multiple branches with the same hash then your script would pick the first one. Can you imagine a situation where this would be a problem? >>> >>> I can't think of a good solution to resolve it automatically. Maybe a >>> script could print that there are multiple possibilities and it choose the >>> first branch in the list. >>> >>> Plus, you are looking only at local branches. Wouldn't it make sense to look at remote branches, too? >>> >>> This is also related to restoring tags. If we go this way, we should have >>> this priority list: >>> - local branches >>> - remote branches >> >> For remote branches you would create a local branch of the same name >> (if such a branch would not exist, possibly setting it up to track that >> remote >> branch)? >> >>> - tags >> >> as said in the other email and similar to remote branches, we'd not want to >> have >> HEAD pointing to them directly but somehow have a local branch. > > Let's step back a bit. We detach the HEAD for a good reason, no? And the 'good reason' being that at the time git-submodule was written we did not know what would be best, and having a detached HEAD would be (a) easy to implement, and (b) removing one moving thing from the whole construction, hence making it a bit safer, (c) it sort of follows the mental model: the superproject said it had the submodule at X (and not at branch Y!) the submodule itself is a whole repo on its own (it doesn't need to be aware of the superproject) so in this world detaching at X is the best we can do. > Why is it a good idea to move them back on to a branch picked among > multiple ones that all happen to be pointing at the same commit? This (rhetorical?) question reads like 2 questions actually: (a) "Why is it a good idea to move them back on to a branch?" It makes working easier as the submodule is not detached, but on a proper branch (b) "picked among multiple ones that all ..." I think this is a bad idea and we'd rather want to follow some configuration instead of wild-guessing by Git. > The user may build on a history of a submodule, and then may push > the result out to a particular branch at the other side; that is > when being on a named branch in the submodule becomes useful, but > even then I do not think randomly picking one branch and be on it > is a good thing to do. so you provide one reason why it is useful, but then claiming it is 'not a good thing' (yet). Can you give a reason why this is a 'bad thing'? > I would understand the workflow would go more like so: > > - You do something at the superproject (e.g. create a new branch X >from an existing commit and check it out), which results in >submodules' HEADs getting detached at the commits bound to the >superproject's tree. And here we'd want to discuss if we *really* want to detach the HEADs or rather have a symbolic ref "following the superproject". > - Because you want to make changes to both submodules and the >superproject in a consistent way, you'd want to commit changes to >all of these repositories and the push the result out in an >atomic way. Committing and pushing are different things. You should not care if I commit atomically as you (in the general "upstream" sense) cannot observe my local commits. For pushing we would want to have an atomic push, but that is not the scope of this discussion. (As a Gerrit user, we implemented the submodule atomicity serverside, but in plain Git server you'd not need the atomicity either: git commit -a -m "update submodule pointers" git submodule foreach git push git push should be fine w.r.t. any non-atomic race condition.) > - Hence you tell "Hey, Git, I want all the submodules that I >modified to be on branch X" from the superproject. > >- This may succeed in a submodule where X is a new name, or the > current tip of branch X is an ancestor of the detached HEAD. so we'd allow fast forward for X. This seems arbitrary to me. I could also say "If X exists I allow a merge to be made between old X and the object name given by the superproject". (maybe as a config option) >- This may fail in a submodule where there is branch X that does > not want to move to the detached HEAD's state. In this latter > case, the user needs to deal with the situation (perhaps the > old X is expendable; perhaps the HEAD's commit may need to be > merged to old X; perhaps there are other cases). makes sense. > > though.
Re: [PATCH 00/28] Create a reference backend for packed refs
Michael Haggerty writes: > I've developed these patches on top of master plus the following > patches, which are followups to mh/packed-refs-store-prep: > > * lock_packed_refs(): fix cache validity check > * for_each_bisect_ref(): don't trim refnames > > The patches can also be obtained from my GitHub fork [2] as branch > "packed-ref-store". > > Michael > > [1] http://public-inbox.org/git/cover.1490026594.git.mhag...@alum.mit.edu/ > [2] https://github.com/mhagger/git Thanks. Both in the version I queued and a fresh fetching of c13b2fad ("read_packed_refs(): die if `packed-refs` contains bogus data", 2017-06-12) from your https://github.com/mhagger/git repository seems to exhibit an annoying error message in my local repository I use for the primary work: $ git fetch https://github.com/mhagger/git packed-ref-store $ git checkout FETCH_HEAD $ make $ ./git describe next error: refs/notes/amlog does not point to a valid object! v2.13.1-611-g7e3b11ae1b $ grep refs/notes/amlog .git/packed-refs ed07e83cff8e407464fb2f5e84bd311da9c87565 refs/notes/amlog $ git rev-parse refs/notes/amlog b3079212325398e406078585c785c892d6e572f0 $ git cat-file -t ed07e83cff8e407464fb2f5e84bd311da9c87565 fatal: git cat-file: could not get object info $ git cat-file -t b3079212325398e406078585c785c892d6e572f0 commit Is the iterator over packed-refs correctly skipping over what are covered by loose refs? The entries in the packed-refs file that are superseded by loose refs should be allowed to point at an already expired object.
Re: [RFC/PATCH] submodules: overhaul documentation
On 06/13, Stefan Beller wrote: > Adding two native speakers as we start word smithing. > > On Tue, Jun 13, 2017 at 12:29 PM, Junio C Hamano wrote: > > >> + > >> +A submodule is another Git repository tracked in a subdirectory of your > >> +repository. The tracked repository has its own history, which does not > >> +interfere with the history of the current repository. > > > > "tracked in a subdirectory" sounds as if your top-level superproject > > has a dedicated submodules/ directory and in it there live a bunch > > of submodules. Which obviously is not what you meant. If phrased > > "tracked as a subdirectory", I think the sentence makes sense. > > Given this explanation "as a" also sounds wrong[1], maybe we need to > separate (1) where it is put/mounted and (2) the fact that is tracked, > i.e. the superproject has an idea of what should be there at a given > revision. (I shortly thought about /s/as a/using/ in the above, but): > > A submodule is another Git repository at an arbitrary place inside > the working tree, and also tracked. The tracked repository has its > own history, which does not interfere with the history of the current > repository. I would probably change the first sentence to: A submodule is another Git repository tracked at an arbitrary place inside the working tree. > > [1] http://www.thesaurus.com/browse/as > > > > > While "which does not interfere" may be technically correct, I am > > not sure what the value of saying that is. > > I think we can drop it here. When writing I wanted to separate it from > subtrees, but this is the wrong place for that. > > > > >> +Submodules are composed from a so-called `gitlink` tree entry > >> +in the main repository that refers to a particular commit object > >> +within the inner repository. > > > > Correct, but it may be unclear to the readers why we do so. Perhaps > > > > ... and this way, the tree of each commit in the main repository > > "knows" which commit from the submodule's history is "tied" to it. > > > > or something like that? > > sounds good to me. > > > > >> +Additionally to the gitlink entry the `.gitmodules` file (see > >> +linkgit:gitmodules[5]) at the root of the source tree contains > >> +information needed for submodules. > > > > Is that really true? Each submodule do not *need* what is in > > .gitmodules; the top-level superproject needs to learn about > > its submodules from the contents of that file, though. > > Ha! The ediled words in my mind were: > > ... information needed for submodules [to work in the superproject]. > > But maybe we need to reword that as > > Additionally to the gitlink entry the `.gitmodules` file (see > linkgit:gitmodules[5]) at the root of the source tree contains > information on how to handle submodules. This sounds slightly awkward. Maybe: In addition to the gitlink entry, the `.gitmodules` file (see linkgit:gitmodules[5]) at the root of the source tree contains information on how to handle submodules. -- Brandon Williams
AW: Restoring detached HEADs after Git operations
Hello Stefan, the use case is as follows: The projects consists of circa 18 IP cores. Each IP core is represented by a Git repository. Think of an IP core as of a lonestanding DLL or SO file project. Each IP core references 2 submodules, which bring the verification environments for testing the IP core standalone. These 18 IP cores are grouped to bigger IP cores, referencing the low-level IP cores and each again the 2 verification submodules. Finally, the main project references the bigger IP cores and again the 2 verification cores. TOPLEVEL o- IP1 o- UVVM o- VUnit o- IP2 o- UVVM o- VUnit o- IP3 o- UVVM o- VUnit o- IP4 o- UVVM o- VUnit o- IP5 o- UVVM o- VUnit o- IP6 o- UVVM o- VUnit o- IP7 o- UVVM o- VUnit o- IP8 o- UVVM o- VUnit o- IP9 o- UVVM o- VUnit o- IP10 o- UVVM o- VUnit o- IP11 o- UVVM o- VUnit o- IP9 o- UVVM o- VUnit o- IP12 o- UVVM o- VUnit o- UVVM o- VUnit That's the simplified structure. I can't write more, because it's a closed source project. You can find other usecases e.g. in my other open source projects. E.g. The PoC-Library or The PicoBlaze-Library and the corresponding PoC-Examples repository. Example: PoC Pile of Cores includes 4 Git submodules and is itself an IP core library. So PoC-Examples again references PoC. This looks like this tree: PoC-Examples |- lib/ o- PoC |- lib o- Cocotb o- OSVVM o- VUnit o- OSVVM o- UVVM The library VUnit itself already includes OSVVM as a library. -- Forcast: I'll write a new question / idea about multiple equal submodules and the memory footprint soon... Here is my original question posted on StackOverflow: https://stackoverflow.com/questions/44585425/how-to-reduce-the-memory-footprint-for-multiple-submodules-of-the-same-source -- Do you need more use cases? Kind regards Patrick Von: git-ow...@vger.kernel.org [git-ow...@vger.kernel.org]" im Auftrag von "Stefan Beller [sbel...@google.com] Gesendet: Montag, 19. Juni 2017 19:47 Bis: Patrick Lehmann Cc: Lars Schneider; Git Mailinglist Betreff: Re: Restoring detached HEADs after Git operations On Mon, Jun 19, 2017 at 10:34 AM, Patrick Lehmann wrote: > Hello, > > I'm just an advanced Git user, not a Git developer. So I might find some time > to improve the suggested script, which I provided with the hints given on the > mailing list, but I have no time to do a complete feature release in your > patch based Git flow. ok, thanks for letting us know. I may re-prioritize the "reattach HEAD" patches that I referenced earlier. I would have hoped that additionally to the shell lines you'd have given a good use case/summary. > I have no experience with other shells then Bash. So if you rely on a Bash > with less features, please port the syntax to such a shell system. (I > personally do not support legacy programs or out-date programs). > > -- > We are talking about circa 50 submodules in total with a maximum depth of 4. > The platforms are: > - Mint OS with Git in Bash > - Windows 7 with Git-Bash > - Windows 10 with Git-Bash > - Windows 10 with Posh-Git Thanks, Stefan
Re: [PATCH 2/2] Add test for the new status message
On Mon, Jun 19, 2017 at 11:29:49PM +0530, Kaartic Sivaraam wrote: > Is there a way to test for the "Initial commit" message in the commit > template? You can do "git commit --dry-run", which produces the template on stdout. That should be good enough for our purposes here, as it's the same code that produces the text that goes into the editor template. If you really wanted to test what the editor sees, you could do something like: GIT_EDITOR='cat >editor-input' git commit but I don't think it's worth the trouble. -Peff
Re: [PATCH 2/2] Add test for the new status message
On Sun, 2017-06-18 at 21:32 -0700, Junio C Hamano wrote: > Kaartic Sivaraam writes: > > > +test_expect_success 'No commits yet should be noted in status > > output' ' > > + git init initial && > > + cd initial && > > + git status >output && > > + test_i18ngrep "No commits yet" output && > > + test_commit initial && > > + git status >output && > > + test_i18ngrep ! "No commits yet" output && > > + test_i18ngrep "nothing.*to commit" output > > +' > > + > > Do not "cd" in a test, without being in a subshell. When other > people in the future want to add new tests to the end of this > script, the new test will end up running in the new subdirectory, > which is not something they should have to worry about. > > git checkout --orphan empty-branch && > git status >output && > test_i18ngrep "No commits yet" output && > ... > > perhaps? > > > test_done Fixed it. I wasn't aware of the guide lines for writing tests when I used the patch from the thread blindly. I'll be careful to avoid that in future. Is there a way to test for the "Initial commit" message in the commit template? -- Regards, Kaartic Sivaraam
Re: How to git push mirror local refs only?
Robert Dailey writes: > On Fri, Jun 9, 2017 at 8:53 PM, Junio C Hamano wrote: >> Robert Dailey writes: >> >>> So I want to update my remote fork with all my local branches. >>> Normally I'd do this: >>> >>> $ git push --mirror fork >>> ... >> Something along this line in your .git/config: >> >> [remote "fork"] >> url = ... >> push = refs/heads/*:refs/heads/* >> >> and then >> >> $ git push --prune --follow-tags fork > > Sorry for the late reply, I was out of town all last week. Thanks for > your help. Does this serve as a good general default? I can't imagine > a case where I'd ever want to push something inside refs/remotes. It may, as a good general default to something other than --mirror, perhaps. When I ask for mirroring the state of my repository, I'd expect it to preserve as much state as possible, including the tips of histories I received from elsewhere.
[PATCH v3 4/4] rebase: Add more regression tests for console output
From: Phillip Wood Check the console output when using --autostash and the stash does not apply is what we expect. The test is quite strict but should catch any changes to the console output from the various rebase flavors. Thanks-to: Johannes Schindelin Signed-off-by: Phillip Wood --- t/t3420-rebase-autostash.sh | 71 +++-- 1 file changed, 69 insertions(+), 2 deletions(-) diff --git a/t/t3420-rebase-autostash.sh b/t/t3420-rebase-autostash.sh index cd1012798cb300f4f1ddeba6fdcad544ca9ea1d9..2c01ae6ad2a104940e388013984e7fa2a0d5aed5 100755 --- a/t/t3420-rebase-autostash.sh +++ b/t/t3420-rebase-autostash.sh @@ -88,6 +88,67 @@ create_expected_success_merge() { EOF } +create_expected_failure_am() { + cat >expected <<-EOF + $(grep "^Created autostash: [0-9a-f][0-9a-f]*\$" actual) + HEAD is now at $(git rev-parse --short feature-branch) third commit + First, rewinding head to replay your work on top of it... + Applying: second commit + Applying: third commit + Applying autostash resulted in conflicts. + Your changes are safe in the stash. + You can run "git stash pop" or "git stash drop" at any time. + EOF +} + +create_expected_failure_interactive() { + q_to_cr >expected <<-EOF + $(grep "^Created autostash: [0-9a-f][0-9a-f]*\$" actual) + HEAD is now at $(git rev-parse --short feature-branch) third commit + Rebasing (1/2)QRebasing (2/2)QApplying autostash resulted in conflicts. + Your changes are safe in the stash. + You can run "git stash pop" or "git stash drop" at any time. + Successfully rebased and updated refs/heads/rebased-feature-branch. + EOF +} + +create_expected_failure_merge() { + cat >expected <<-EOF + $(grep "^Created autostash: [0-9a-f][0-9a-f]*\$" actual) + HEAD is now at $(git rev-parse --short feature-branch) third commit + First, rewinding head to replay your work on top of it... + Merging unrelated-onto-branch with HEAD~1 + Merging: + $(git rev-parse --short unrelated-onto-branch) unrelated commit + $(git rev-parse --short feature-branch^) second commit + found 1 common ancestor: + $(git rev-parse --short feature-branch~2) initial commit + [detached HEAD $(git rev-parse --short rebased-feature-branch~1)] second commit +Author: A U Thor +Date: Thu Apr 7 15:14:13 2005 -0700 +2 files changed, 2 insertions(+) +create mode 100644 file1 +create mode 100644 file2 + Committed: 0001 second commit + Merging unrelated-onto-branch with HEAD~0 + Merging: + $(git rev-parse --short rebased-feature-branch~1) second commit + $(git rev-parse --short feature-branch) third commit + found 1 common ancestor: + $(git rev-parse --short feature-branch~1) second commit + [detached HEAD $(git rev-parse --short rebased-feature-branch)] third commit +Author: A U Thor +Date: Thu Apr 7 15:15:13 2005 -0700 +1 file changed, 1 insertion(+) +create mode 100644 file3 + Committed: 0002 third commit + All done. + Applying autostash resulted in conflicts. + Your changes are safe in the stash. + You can run "git stash pop" or "git stash drop" at any time. + EOF +} + testrebase() { type=$1 dotest=$2 @@ -198,10 +259,9 @@ testrebase() { test_config rebase.autostash true && git reset --hard && git checkout -b rebased-feature-branch feature-branch && - test_when_finished git branch -D rebased-feature-branch && echo dirty >file4 && git add file4 && - git rebase$type unrelated-onto-branch && + git rebase$type unrelated-onto-branch >actual 2>&1 && test_path_is_missing $dotest && git reset --hard && grep unrelated file4 && @@ -210,6 +270,13 @@ testrebase() { git stash pop && grep dirty file4 ' + + test_expect_success "rebase$type: check output with conflicting stash" ' + test_when_finished git branch -D rebased-feature-branch && + suffix=${type#\ --} && suffix=${suffix:-am} && + create_expected_failure_$suffix && + test_cmp expected actual + ' } test_expect_success "rebase: fast-forward rebase" ' -- 2.13.0
[PATCH v3 2/4] rebase -i: Add test for reflog message
From: Phillip Wood Check that the reflog message written to the branch reflog when the rebase is completed is correct Signed-off-by: Phillip Wood --- t/t3404-rebase-interactive.sh | 7 +++ 1 file changed, 7 insertions(+) diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh index 5bd0275930b715c25507fc9744c8946e7f73900b..37821d245433f757fa13f0a3e27da0312bebb7db 100755 --- a/t/t3404-rebase-interactive.sh +++ b/t/t3404-rebase-interactive.sh @@ -169,6 +169,13 @@ test_expect_success 'reflog for the branch shows state before rebase' ' test $(git rev-parse branch1@{1}) = $(git rev-parse original-branch1) ' +test_expect_success 'reflog for the branch shows correct finish message' ' + printf "rebase -i (finish): refs/heads/branch1 onto %s\n" \ + "$(git rev-parse branch2)" >expected && + git log -g --pretty=%gs -1 refs/heads/branch1 >actual && + test_cmp expected actual +' + test_expect_success 'exchange two commits' ' set_fake_editor && FAKE_LINES="2 1" git rebase -i HEAD~2 && -- 2.13.0
[PATCH v3 3/4] rebase: Add regression tests for console output
From: Phillip Wood Check the console output when using --autostash and the stash applies cleanly is what we expect. The test is quite strict but should catch any changes to the console output from the various rebase flavors. Thanks-to: Johannes Schindelin Signed-off-by: Phillip Wood --- t/t3420-rebase-autostash.sh | 65 +++-- 1 file changed, 63 insertions(+), 2 deletions(-) diff --git a/t/t3420-rebase-autostash.sh b/t/t3420-rebase-autostash.sh index ab8a63e8d6dc643b28eb0c74ba3f032b7532226f..cd1012798cb300f4f1ddeba6fdcad544ca9ea1d9 100755 --- a/t/t3420-rebase-autostash.sh +++ b/t/t3420-rebase-autostash.sh @@ -33,6 +33,61 @@ test_expect_success setup ' git commit -m "related commit" ' +create_expected_success_am() { + cat >expected <<-EOF + $(grep "^Created autostash: [0-9a-f][0-9a-f]*\$" actual) + HEAD is now at $(git rev-parse --short feature-branch) third commit + First, rewinding head to replay your work on top of it... + Applying: second commit + Applying: third commit + Applied autostash. + EOF +} + +create_expected_success_interactive() { + q_to_cr >expected <<-EOF + $(grep "^Created autostash: [0-9a-f][0-9a-f]*\$" actual) + HEAD is now at $(git rev-parse --short feature-branch) third commit + Rebasing (1/2)QRebasing (2/2)QApplied autostash. + Successfully rebased and updated refs/heads/rebased-feature-branch. + EOF +} + +create_expected_success_merge() { + cat >expected <<-EOF + $(grep "^Created autostash: [0-9a-f][0-9a-f]*\$" actual) + HEAD is now at $(git rev-parse --short feature-branch) third commit + First, rewinding head to replay your work on top of it... + Merging unrelated-onto-branch with HEAD~1 + Merging: + $(git rev-parse --short unrelated-onto-branch) unrelated commit + $(git rev-parse --short feature-branch^) second commit + found 1 common ancestor: + $(git rev-parse --short feature-branch~2) initial commit + [detached HEAD $(git rev-parse --short rebased-feature-branch~1)] second commit +Author: A U Thor +Date: Thu Apr 7 15:14:13 2005 -0700 +2 files changed, 2 insertions(+) +create mode 100644 file1 +create mode 100644 file2 + Committed: 0001 second commit + Merging unrelated-onto-branch with HEAD~0 + Merging: + $(git rev-parse --short rebased-feature-branch~1) second commit + $(git rev-parse --short feature-branch) third commit + found 1 common ancestor: + $(git rev-parse --short feature-branch~1) second commit + [detached HEAD $(git rev-parse --short rebased-feature-branch)] third commit +Author: A U Thor +Date: Thu Apr 7 15:15:13 2005 -0700 +1 file changed, 1 insertion(+) +create mode 100644 file3 + Committed: 0002 third commit + All done. + Applied autostash. + EOF +} + testrebase() { type=$1 dotest=$2 @@ -51,14 +106,20 @@ testrebase() { test_config rebase.autostash true && git reset --hard && git checkout -b rebased-feature-branch feature-branch && - test_when_finished git branch -D rebased-feature-branch && echo dirty >>file3 && - git rebase$type unrelated-onto-branch && + git rebase$type unrelated-onto-branch >actual 2>&1 && grep unrelated file4 && grep dirty file3 && git checkout feature-branch ' + test_expect_success "rebase$type --autostash: check output" ' + test_when_finished git branch -D rebased-feature-branch && + suffix=${type#\ --} && suffix=${suffix:-am} && + create_expected_success_$suffix && + test_cmp expected actual + ' + test_expect_success "rebase$type: dirty index, non-conflicting rebase" ' test_config rebase.autostash true && git reset --hard && -- 2.13.0
[PATCH v3 1/4] sequencer: print autostash messages to stderr
From: Johannes Schindelin The rebase messages are printed to stderr traditionally. However due to a bug introduced in 587947750bd (rebase: implement --[no-]autostash and rebase.autostash, 2013-05-12) which was faithfully copied when reimplementing parts of the interactive rebase in the sequencer the autostash messages are printed to stdout instead. It is time to fix that: let's print the autostash messages to stderr instead of stdout. Signed-off-by: Johannes Schindelin Signed-off-by: Phillip Wood --- git-rebase.sh | 4 ++-- sequencer.c | 11 ++- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/git-rebase.sh b/git-rebase.sh index db1deed8464f0643763ed6e3c5e54221cad8c985..2cf73b88e8e83ca34b9eb319dbc2b0a220139b0f 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -166,14 +166,14 @@ apply_autostash () { stash_sha1=$(cat "$state_dir/autostash") if git stash apply $stash_sha1 2>&1 >/dev/null then - echo "$(gettext 'Applied autostash.')" + echo "$(gettext 'Applied autostash.')" >&2 else git stash store -m "autostash" -q $stash_sha1 || die "$(eval_gettext "Cannot store \$stash_sha1")" gettext 'Applying autostash resulted in conflicts. Your changes are safe in the stash. You can run "git stash pop" or "git stash drop" at any time. -' +' >&2 fi fi } diff --git a/sequencer.c b/sequencer.c index a23b948ac148304dbebfe38955ec8b40cab3e1e5..606750b1e0c900a9fb43cea224d27ab36ca29ab9 100644 --- a/sequencer.c +++ b/sequencer.c @@ -1921,7 +1921,7 @@ static int apply_autostash(struct replay_opts *opts) argv_array_push(&child.args, "apply"); argv_array_push(&child.args, stash_sha1.buf); if (!run_command(&child)) - printf(_("Applied autostash.\n")); + fprintf(stderr, _("Applied autostash.\n")); else { struct child_process store = CHILD_PROCESS_INIT; @@ -1935,10 +1935,11 @@ static int apply_autostash(struct replay_opts *opts) if (run_command(&store)) ret = error(_("cannot store %s"), stash_sha1.buf); else - printf(_("Applying autostash resulted in conflicts.\n" - "Your changes are safe in the stash.\n" - "You can run \"git stash pop\" or" - " \"git stash drop\" at any time.\n")); + fprintf(stderr, + _("Applying autostash resulted in conflicts.\n" + "Your changes are safe in the stash.\n" + "You can run \"git stash pop\" or" + " \"git stash drop\" at any time.\n")); } strbuf_release(&stash_sha1); -- 2.13.0
Re: Restoring detached HEADs after Git operations
On Mon, Jun 19 2017, Patrick Lehmann jotted: > Hello, > > I wrote a Bash script to recover branch names after Git operations have > create detached HEADs in a Git repository containing lots of Git submodules. > The script works recursively. > > I would like to see: > a) that script or algorithm being integrated into Git by default > b) that as a default behavior for all Git operations creating detached HEADs > > That's the command: > > git submodule foreach --recursive 'HEAD=$(git branch --list | head -n 1); if > [[ "$HEAD" == *HEAD* ]]; then REF=$(git rev-parse HEAD); FOUND=0; for Branch > in $(git branch --list | grep "^ " | sed -e "s/ //" ); do if [[ "$(git > rev-parse "$Branch")" == $REF ]]; then echo -e " \e[36mCheckout > $Branch...\e[0m"; git checkout $Branch; FOUND=1; break; fi done; if [[ $FOUND > -eq 0 ]]; then echo -e " \e[31mNo matching branch found.\e[0m"; fi else echo > -e " \e[36mNothing to do.\e[0m"; fi' > > > How does it work: > 1. It uses git submodule foreach to dive into each Git submodule and execute > a series of Bash commands. > 2. It's reading the list of branches and checks if the submodule is in > detached mode. The first line contains the string HEAD. > 3. Retrieve the hash of the detached HEAD > 4. Iterate all local branches and get their hashes > 5. Compare the branch hashes with the detached HEAD's hash. If it matches do > a checkout. > 6. Report if no branch name was found or if a HEAD was not in detached mode. > > The Bash code with line breaks and indentation: > > HEAD=$(git branch --list | head -n 1) > if [[ "$HEAD" == *HEAD* ]]; then > REF=$(git rev-parse HEAD) > FOUND=0 > for Branch in $(git branch --list | grep "^ " | sed -e "s/ //" ); do > if [[ "$(git rev-parse "$Branch")" == $REF ]]; then > echo -e " \e[36mCheckout $Branch...\e[0m" > git checkout $Branch > FOUND=1 > break > fi > done > if [[ $FOUND -eq 0 ]]; then > echo -e " \e[31mNo matching branch found.\e[0m" > fi > else > echo -e " \e[36mNothing to do.\e[0m" > fi > > > Are their any chances to get it integrated into Git? > > I tried to register that code as a Git alias, but git config complains about > quote problem not showing where. It neither specifies if it's a single or > double quote problem. Any advice on how to register that piece of code as an > alias? > > If wished, I think I could expand the script to also recover hash values to > Git tags if no branch was found. I have something similar to this, this written before git-submodule learned --branch (or at least before I knew about it): $ git config alias.sm-pull-all !git submodule foreach 'git checkout $(NAME=$name git sm-mainbranch) && git pull' $ git config alias.sm-mainbranch !git config --file ../.gitmodules submodule.$NAME.branch || git describe --all --always | sed 's!^heads/!!' So with this I can run `git sm-pull-all` to update all the submodules in a superproject to update them all. This relies on the branch name being in the .gitmodules config.a Now if you add a module with e.g. 'git submodule add -b master ...' you'll have it checked out at the master branch, but I'm not familiar enough with all the workflows around them to say if there are any holes in that implementation. But that seems like a fundimentally better approach to me than what you're suggesting. Why would we try to work our way back from the SHA-1 and guess what branch it lives on when we can just encode the branch name we'd like to be on in the superproject?
[PATCH v3 0/4] Add regression tests for recent rebase -i fixes
From: Phillip Wood I've updated the second two tests to be portable using q_to_cr() as Johannes suggested and added his patch to fix the autostash messages going to stdout rather than stderr. The reflog message test is unchanged. Thanks to Johannes for his help and to Junio for picking up the bashism in the last iteration. Johannes Schindelin (1): sequencer: print autostash messages to stderr Phillip Wood (3): rebase -i: Add test for reflog message rebase: Add regression tests for console output rebase: Add more regression tests for console output git-rebase.sh | 4 +- sequencer.c | 11 ++-- t/t3404-rebase-interactive.sh | 7 +++ t/t3420-rebase-autostash.sh | 136 -- 4 files changed, 147 insertions(+), 11 deletions(-) -- 2.13.0
Re: [PATCH/RFC] Cleanup Documentation
On Mon, Jun 19, 2017 at 10:33 AM, Kaartic Sivaraam wrote: > >> Please markup the '.gitmodules' either via single quotes or `. >> (or even link to 'gitmodules(5)' ) >> > Marked it up using `. Help needed to link to 'gitmodules(5)', as I'm > not sure how to provide alternative text to 'linkgit:'. I did not mean 'gitmodules(5)' literally, I rather meant using linkgit: as you know it. :) > >> I am undecided if this is really removing (2) unclearness, but the >> (1) redundancy seems fine to me. >> > Sorry about that. The commit message should have been, > > ... > 2. Removed unclear back reference Oh that clears some confusion here. :) > Note: Will follow up with a patch, soon. Thanks. Stefan
Re: Restoring detached HEADs after Git operations
Stefan Beller writes: > On Mon, Jun 19, 2017 at 2:52 AM, Patrick Lehmann > wrote: >> Hello Lars, >> >> for your questions: >>> If there are multiple branches with the same hash then your script would >>> pick the first one. Can you imagine a situation where this would be a >>> problem? >> >> I can't think of a good solution to resolve it automatically. Maybe a script >> could print that there are multiple possibilities and it choose the first >> branch in the list. >> >> >>> Plus, you are looking only at local branches. Wouldn't it make sense to >>> look at remote branches, too? >> >> This is also related to restoring tags. If we go this way, we should have >> this priority list: >> - local branches >> - remote branches > > For remote branches you would create a local branch of the same name > (if such a branch would not exist, possibly setting it up to track that remote > branch)? > >> - tags > > as said in the other email and similar to remote branches, we'd not want to > have > HEAD pointing to them directly but somehow have a local branch. Let's step back a bit. We detach the HEAD for a good reason, no? Why is it a good idea to move them back on to a branch picked among multiple ones that all happen to be pointing at the same commit? The user may build on a history of a submodule, and then may push the result out to a particular branch at the other side; that is when being on a named branch in the submodule becomes useful, but even then I do not think randomly picking one branch and be on it is a good thing to do. I would understand the workflow would go more like so: - You do something at the superproject (e.g. create a new branch X from an existing commit and check it out), which results in submodules' HEADs getting detached at the commits bound to the superproject's tree. - Because you want to make changes to both submodules and the superproject in a consistent way, you'd want to commit changes to all of these repositories and the push the result out in an atomic way. - Hence you tell "Hey, Git, I want all the submodules that I modified to be on branch X" from the superproject. - This may succeed in a submodule where X is a new name, or the current tip of branch X is an ancestor of the detached HEAD. - This may fail in a submodule where there is branch X that does not want to move to the detached HEAD's state. In this latter case, the user needs to deal with the situation (perhaps the old X is expendable; perhaps the HEAD's commit may need to be merged to old X; perhaps there are other cases). though.
Re: Behavior of 'git fetch' for commit hashes
On Mon, 19 Jun 2017 12:09:28 + wrote: > For version 2.7.4 > = > Git exits with exit code 1. > > However, if I first do 'git fetch ', then 'git fetch will > also work > > * branch-> FETCH_HEAD I suspect that what is happening is that 'git fetch ' also downloads the commit referenced by , so the subsequent 'git fetch ' is a no-op (except setting FETCH_HEAD). > For version 2.13.3 > == > Git exits with exit code 128 and message > fatal: Couldn't find remote ref > > However, the workaround for descbibed abot for git version 2.7.4 no > longer works. The result is always > fatal: Couldn't find remote ref I'll take a look into this. > Desired result > == > Commit is in .git/FETCH_HEAD and can be checked out. > > > I want to checkout a specific commit without creating any extra named > remotes in the local git clone. > > Finally, > What is the expected behavior for 'git fetch' in this case? > Is there some other way I can achieve my goals? I'll take a look into the expected behavior, but if my assumptions are correct, you should be able to just checkout the commit you want after fetching the branch: git fetch git checkout
Re: Restoring detached HEADs after Git operations
On Mon, Jun 19, 2017 at 10:34 AM, Patrick Lehmann wrote: > Hello, > > I'm just an advanced Git user, not a Git developer. So I might find some time > to improve the suggested script, which I provided with the hints given on the > mailing list, but I have no time to do a complete feature release in your > patch based Git flow. ok, thanks for letting us know. I may re-prioritize the "reattach HEAD" patches that I referenced earlier. I would have hoped that additionally to the shell lines you'd have given a good use case/summary. > I have no experience with other shells then Bash. So if you rely on a Bash > with less features, please port the syntax to such a shell system. (I > personally do not support legacy programs or out-date programs). > > -- > We are talking about circa 50 submodules in total with a maximum depth of 4. > The platforms are: > - Mint OS with Git in Bash > - Windows 7 with Git-Bash > - Windows 10 with Git-Bash > - Windows 10 with Posh-Git Thanks, Stefan
Re: [PATCH v5 4/5] convert: move multiple file filter error handling to separate function
> On 19 Jun 2017, at 19:18, Torsten Bögershausen wrote: > > On 2017-06-18 13:47, Lars Schneider wrote: >> >>> On 18 Jun 2017, at 09:20, Torsten Bögershausen wrote: >>> >>> >>> On 2017-06-01 10:22, Lars Schneider wrote: This is useful for the subsequent patch 'convert: add "status=delayed" to filter process protocol'. >>> >>> May be >>> Collecting all filter error handling is useful for the subsequent patch >>> 'convert: add "status=delayed" to filter process protocol'. >> >> I think I get your point. However, I feel "Collecting" is not the right word. >> >> How about: >> "Refactoring filter error handling is useful..."? > > OK OK, I'll change it in the next round. Signed-off-by: Lars Schneider --- convert.c | 47 ++- 1 file changed, 26 insertions(+), 21 deletions(-) diff --git a/convert.c b/convert.c index f1e168bc30..a5e09bb0e8 100644 --- a/convert.c +++ b/convert.c @@ -565,6 +565,29 @@ static int start_multi_file_filter_fn(struct subprocess_entry *subprocess) return err; } +static void handle_filter_error(const struct strbuf *filter_status, + struct cmd2process *entry, + const unsigned int wanted_capability) { + if (!strcmp(filter_status->buf, "error")) { + /* The filter signaled a problem with the file. */ + } else if (!strcmp(filter_status->buf, "abort") && wanted_capability) { + /* + * The filter signaled a permanent problem. Don't try to filter + * files with the same command for the lifetime of the current + * Git process. + */ + entry->supported_capabilities &= ~wanted_capability; + } else { + /* + * Something went wrong with the protocol filter. + * Force shutdown and restart if another blob requires filtering. + */ + error("external filter '%s' failed", entry->subprocess.cmd); + subprocess_stop(&subprocess_map, &entry->subprocess); + free(entry); + } +} + static int apply_multi_file_filter(const char *path, const char *src, size_t len, int fd, struct strbuf *dst, const char *cmd, const unsigned int wanted_capability) @@ -656,28 +679,10 @@ static int apply_multi_file_filter(const char *path, const char *src, size_t len done: sigchain_pop(SIGPIPE); - if (err) { - if (!strcmp(filter_status.buf, "error")) { - /* The filter signaled a problem with the file. */ >>> Note1: Do we need a line with a single ";" here ? >> >> The single ";" wouldn't hurt but I don't think it is helpful either. >> I can't find anything in the coding guidelines about this... > > OK about that. I was thinking to remove the {}, and the we -need- the ";" True. However, I prefer the {} style here. Would that be OK with you? Plus, this commit is not supposed to change code. I just want to move the code to a different function. >>> Question: What should/will happen to the file ? >>> Will Git complain later ? Retry later ? >> >> The file is not filtered and Git does not try, again. >> That might be OK for certain filters: >> If "filter.foo.required = false" then this would be OK. >> If "filter.foo.required = true" then this would cause an error. > > Does it make sense to add it as a comment ? > I know, everything is documented otherwise, but when looking at the source > code only, the context may be useful, it may be not. I agree. I'll add a comment! >> - } else if (!strcmp(filter_status.buf, "abort")) { - /* - * The filter signaled a permanent problem. Don't try to filter - * files with the same command for the lifetime of the current - * Git process. - */ - entry->supported_capabilities &= ~wanted_capability; >>>Hm, could this be clarified somewhat ? >>>Mapping "abort" to "permanent problem" makes sense as >>>such, but the only action that is done is to reset >>>a capablity. >> >> I am not sure what you mean with "reset capability". We disable the >> capability here. >> That means Git will not use the capability for the active session. > > Sorry, my wrong - reset is a bad word here. > "Git will not use the capability for the active session" is perfect! OK :) >>> /* >>> * The filter signaled a missing capabilty. Don't try to filter >>> * files with the same command for the lifetime o
Re: [PATCH v5 08/10] rebase -i: skip unnecessary picks using the rebase--helper
Jeff King writes: > On Mon, Jun 19, 2017 at 11:45:50AM +0200, Johannes Schindelin wrote: > >> The reason for this suggestion is that one of the revision machinery's >> implementation details is an ugly little semi-secret: the pretty-printing >> machinery uses a global state, and that is why we need the "pretty_given" >> flag in the first place. > > I think that's mis-stating Junio's complaint. The point is not the > pretty_given flag itself, which we know about and can work around. The > point is that we don't know what other similar problems we have or will > have due to future changes in the revision code. > > In other words, there are two APIs: the one where C code manipulates > rev_info directly, and the one where revision.c responds to string > arguments. From a maintenance perspective, it is easy for somebody make > a change that works for the latter but not the former. There are (1) the API for users of revision traversal machinery and (2) the implementation detail of the API. Of course, the code that actually parses the plumbing and end-user supplied set of options needs to manipulate rev_info directly, but we should avoid direct manipulation when we can to future-proof the codebase. My main complaint is that Johannes's "compiler can catch mistakes" is true only for the most trivial kind of mistakes. The current implementation detail for reacting to --format="" given by the revision traversal machinery happens to be to set three things in rev_info structure: set verbose_header and pretty_given to 1 and then call get_commit_format() on the format string. Dscho is correct to say that the compiler can catch a mistake that misspells, say verbose_header as verbos_header. The compiler would not catch an equivalent mistake to misspell the option as --formta="", so from that point of view, his argument may seem to make sense. But the compiler would not help catching a copy-and-paste mistake to do only two out of the three things needed (e.g. forgetting to set pretty_given). If the code relies on setup_revisions() to react to options just the way "rev-list" plumbing does, such a mistake cannot happen in the first place---there is no need to worry about "catching". As you clarified correctly, the writer of the code that _uses_ the machinery, like the one in sequencer_make_script(), cannot anticipate how the internal implementation of reacting to the features will change, and more importantly, it should not have to know how it will change. By using the setup_revisions() API that uses the exactly the same parser as plumbing command "git rev-list" does, the forward compatibility is guaranteed. Copying and pasting the current internal implementation detail will break that. > I do agree that the lack of compile-time safety for obvious mistakes > like "--pertty" is a downside, though. On the other hand, there are > strong run-time checks there, so the tests would catch it. True. After setup_revisions() returns, if there are unrecognized options (e.g. misspelt "--pertty"), that can be used as the indication of a programming error, as the new code is not even parsing arbitrary set of options given by the end-user.
Re: [PATCH/RFC] Cleanup Documentation
On Sun, 2017-06-18 at 22:50 -0700, Stefan Beller wrote: > > > diff --git a/Documentation/git-submodule.txt b/Documentation/git- > > > submodule.txt > > > index 74bc6200d..9812b0655 100644 > > > --- a/Documentation/git-submodule.txt > > > +++ b/Documentation/git-submodule.txt > > > @@ -63,13 +63,7 @@ add [-b ] [-f|--force] [--name ] > > > [--reference ] [--dep > > > to the changeset to be committed next to the current > > > project: the current project is termed the "superproject". > > > + > > > -This requires at least one argument: . The optional > > > -argument is the relative location for the cloned > > > submodule > > > -to exist in the superproject. If is not given, the > > > -"humanish" part of the source repository is used ("repo" for > > > -"/path/to/repo.git" and "foo" for "host.xz:foo/.git"). > > > -The is also used as the submodule's logical name in its > > > -configuration entries unless `--name` is used to specify a > > > logical name. > > > +This requires at least one argument: . > > > + > > So we're losing the information how the submodule name is chosen. I just moved it. I don't think we're losing anything related to how the name is chosen. Please let me know if I misinterpreted your statement. > This may be fine as I plan (long term) to make the name an arbitrary > random > string (IMHO that reduces confusion as there will be less 'nearly the > same' > things) > > On the other hand the newly added line > 'This requires at least one argument: (actually moved, but) is sort of redundant. The notation in the > argument line > should make that clear, already? > Makes clear sense. Removed it. > This sounds good, it consolidates all information about [] > in one paragraph. While at it, maybe let's find another (better) > substitute for "humanish" as that can be anything(?). > > Maybe "the last part of the URL" (without any .git) > How about "meaningful"? Put in place it reads like, If is not given, the meaningful part of the source repository ... > Please markup the '.gitmodules' either via single quotes or `. > (or even link to 'gitmodules(5)' ) > Marked it up using `. Help needed to link to 'gitmodules(5)', as I'm not sure how to provide alternative text to 'linkgit:'. > I am undecided if this is really removing (2) unclearness, but the > (1) redundancy seems fine to me. > Sorry about that. The commit message should have been, ... 2. Removed unclear back reference ... by which I intend to denote the following removal, > -In either case, the given URL is recorded into .gitmodules for > -use by subsequent users cloning the superproject. Note: Will follow up with a patch, soon. -- Regards, Kaartic Sivaraam
AW: Restoring detached HEADs after Git operations
Hello, I'm just an advanced Git user, not a Git developer. So I might find some time to improve the suggested script, which I provided with the hints given on the mailing list, but I have no time to do a complete feature release in your patch based Git flow. I'm currently involved in 8 other open source projects. One can't improve the world alone by supplying patches to any open source project one is using... I have no experience with other shells then Bash. So if you rely on a Bash with less features, please port the syntax to such a shell system. (I personally do not support legacy programs or out-date programs). -- We are talking about circa 50 submodules in total with a maximum depth of 4. The platforms are: - Mint OS with Git in Bash - Windows 7 with Git-Bash - Windows 10 with Git-Bash - Windows 10 with Posh-Git Kind regards Patrick Von: Stefan Beller [sbel...@google.com] Gesendet: Montag, 19. Juni 2017 18:37 Bis: Patrick Lehmann Cc: Lars Schneider; Git Mailinglist Betreff: Re: Restoring detached HEADs after Git operations On Mon, Jun 19, 2017 at 2:52 AM, Patrick Lehmann wrote: > Hello Lars, > > for your questions: >> If there are multiple branches with the same hash then your script would >> pick the first one. Can you imagine a situation where this would be a >> problem? > > I can't think of a good solution to resolve it automatically. Maybe a script > could print that there are multiple possibilities and it choose the first > branch in the list. > > >> Plus, you are looking only at local branches. Wouldn't it make sense to look >> at remote branches, too? > > This is also related to restoring tags. If we go this way, we should have > this priority list: > - local branches > - remote branches For remote branches you would create a local branch of the same name (if such a branch would not exist, possibly setting it up to track that remote branch)? > - tags as said in the other email and similar to remote branches, we'd not want to have HEAD pointing to them directly but somehow have a local branch. >> Submodule processing is already quite slow if you have many of them. I >> wonder how much this approach would affect the performance. > > Yes. It takes a few seconds to iterate all the submodules. It could be > improved if the processing wouldn't be based on slow Bash scripts spawning > lot's of sub-shells to execute multiple Git commands. How many submodules are we talking about? (Are you on Windows to make shell even more fun?)
Re: [PATCH v5 4/5] convert: move multiple file filter error handling to separate function
On 2017-06-18 13:47, Lars Schneider wrote: > >> On 18 Jun 2017, at 09:20, Torsten Bögershausen wrote: >> >> >> On 2017-06-01 10:22, Lars Schneider wrote: >>> This is useful for the subsequent patch 'convert: add "status=delayed" to >>> filter process protocol'. >> >> May be >> Collecting all filter error handling is useful for the subsequent patch >> 'convert: add "status=delayed" to filter process protocol'. > > I think I get your point. However, I feel "Collecting" is not the right word. > > How about: > "Refactoring filter error handling is useful..."? OK > >>> >>> Signed-off-by: Lars Schneider >>> --- >>> convert.c | 47 ++- >>> 1 file changed, 26 insertions(+), 21 deletions(-) >>> >>> diff --git a/convert.c b/convert.c >>> index f1e168bc30..a5e09bb0e8 100644 >>> --- a/convert.c >>> +++ b/convert.c >>> @@ -565,6 +565,29 @@ static int start_multi_file_filter_fn(struct >>> subprocess_entry *subprocess) >>> return err; >>> } >>> >>> +static void handle_filter_error(const struct strbuf *filter_status, >>> + struct cmd2process *entry, >>> + const unsigned int wanted_capability) { >>> + if (!strcmp(filter_status->buf, "error")) { >>> + /* The filter signaled a problem with the file. */ >>> + } else if (!strcmp(filter_status->buf, "abort") && wanted_capability) { >>> + /* >>> +* The filter signaled a permanent problem. Don't try to filter >>> +* files with the same command for the lifetime of the current >>> +* Git process. >>> +*/ >>> +entry->supported_capabilities &= ~wanted_capability; >>> + } else { >>> + /* >>> +* Something went wrong with the protocol filter. >>> +* Force shutdown and restart if another blob requires >>> filtering. >>> +*/ >>> + error("external filter '%s' failed", entry->subprocess.cmd); >>> + subprocess_stop(&subprocess_map, &entry->subprocess); >>> + free(entry); >>> + } >>> +} >>> + >>> static int apply_multi_file_filter(const char *path, const char *src, >>> size_t len, >>>int fd, struct strbuf *dst, const char *cmd, >>>const unsigned int wanted_capability) >>> @@ -656,28 +679,10 @@ static int apply_multi_file_filter(const char *path, >>> const char *src, size_t len >>> done: >>> sigchain_pop(SIGPIPE); >>> >>> - if (err) { >>> - if (!strcmp(filter_status.buf, "error")) { >>> - /* The filter signaled a problem with the file. */ >>Note1: Do we need a line with a single ";" here ? > > The single ";" wouldn't hurt but I don't think it is helpful either. > I can't find anything in the coding guidelines about this... OK about that. I was thinking to remove the {}, and the we -need- the ";" > > >>Question: What should/will happen to the file ? >>Will Git complain later ? Retry later ? > > The file is not filtered and Git does not try, again. > That might be OK for certain filters: > If "filter.foo.required = false" then this would be OK. > If "filter.foo.required = true" then this would cause an error. Does it make sense to add it as a comment ? I know, everything is documented otherwise, but when looking at the source code only, the context may be useful, it may be not. > > >>> - } else if (!strcmp(filter_status.buf, "abort")) { >>> - /* >>> -* The filter signaled a permanent problem. Don't try >>> to filter >>> -* files with the same command for the lifetime of the >>> current >>> -* Git process. >>> -*/ >>> -entry->supported_capabilities &= ~wanted_capability; >> Hm, could this be clarified somewhat ? >> Mapping "abort" to "permanent problem" makes sense as >> such, but the only action that is done is to reset >> a capablity. > > I am not sure what you mean with "reset capability". We disable the > capability here. > That means Git will not use the capability for the active session. Sorry, my wrong - reset is a bad word here. "Git will not use the capability for the active session" is perfect! > > >> /* >> * The filter signaled a missing capabilty. Don't try to filter >> * files with the same command for the lifetime of the current >> * Git process. >> */ > > I like the original version better because the capability is not "missing". > There is "a permanent problem" and the filter wants to signal Git to not use > this capability for the current session anymore. Git and the filter are in a negotiation phase to find out which side is able to do what.So I don't thi