Re: [PATCH 0/5] make interpret-trailers useful for parsing
On Wed, Aug 9, 2017 at 10:19 AM, Junio C Hamano wrote: > Jeff King writes: > >> Parsing trailers out of a commit message is _mostly_ easy, but there >> area a lot of funny corner cases (e.g., heuristics for how many >> non-trailers must be present before a final paragraph isn't a trailer >> block anymore). The code in trailer.c already knows about these corner >> cases, but there's no way to access it from the command line. >> >> This series teaches interpret-trailers to parse and output just the >> trailers. So now you can do: >> >> $ git log --format=%B -1 8d44797cc91231cd44955279040dc4a1ee0a797f | >> git interpret-trailers --parse >> Signed-off-by: Hartmut Henkel >> Helped-by: Stefan Beller >> Signed-off-by: Ralf Thielow >> Acked-by: Matthias Rüster > > Thank-you, thank-you, thank-you. > > The above example made me wonder if we also want a format specifier > to do the above without piping, but it turns out that we already > have "log --format=%(trailers)", so we are good ;-) > I was going to say, I thought we had a way to get trailers for a commit via the pretty format, since that is what i used in the past. Thanks, Jake
[PATCH V2 1/2] Fix delta integer overflows
From: Martin Koegler The current delta code produces incorrect pack objects for files > 4GB. Signed-off-by: Martin Koegler --- For next. diff-delta.c | 24 +--- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/diff-delta.c b/diff-delta.c index 3797ce6..cd238c8 100644 --- a/diff-delta.c +++ b/diff-delta.c @@ -319,7 +319,9 @@ create_delta(const struct delta_index *index, const void *trg_buf, unsigned long trg_size, unsigned long *delta_size, unsigned long max_size) { - unsigned int i, outpos, outsize, moff, msize, val; + unsigned int i, val; + off_t outpos, moff; + size_t l, outsize, msize; int inscnt; const unsigned char *ref_data, *ref_top, *data, *top; unsigned char *out; @@ -336,20 +338,20 @@ create_delta(const struct delta_index *index, return NULL; /* store reference buffer size */ - i = index->src_size; - while (i >= 0x80) { - out[outpos++] = i | 0x80; - i >>= 7; + l = index->src_size; + while (l >= 0x80) { + out[outpos++] = l | 0x80; + l >>= 7; } - out[outpos++] = i; + out[outpos++] = l; /* store target buffer size */ - i = trg_size; - while (i >= 0x80) { - out[outpos++] = i | 0x80; - i >>= 7; + l = trg_size; + while (l >= 0x80) { + out[outpos++] = l | 0x80; + l >>= 7; } - out[outpos++] = i; + out[outpos++] = l; ref_data = index->src_buf; ref_top = ref_data + index->src_size; -- 2.1.4
[PATCH V2 2/2] Convert size datatype to size_t
From: Martin Koegler It changes the signature of the core object access function including any other functions to assure a clean compile if sizeof(size_t) != sizeof(unsigned long). Signed-off-by: Martin Koegler --- For next. As this touches core functions, it will likely produce conflicts with other changes. Please provide the commit you want to rebase the patch on and I'll produce a V3. Includes changes from Johannes Schindelin. apply.c | 6 +++--- archive-tar.c| 4 ++-- archive-zip.c| 2 +- archive.c| 2 +- archive.h| 2 +- blame.c | 4 ++-- blame.h | 2 +- builtin/cat-file.c | 14 +++--- builtin/difftool.c | 2 +- builtin/fast-export.c| 8 builtin/fmt-merge-msg.c | 2 +- builtin/fsck.c | 2 +- builtin/grep.c | 8 builtin/index-pack.c | 16 builtin/log.c| 4 ++-- builtin/ls-tree.c| 4 ++-- builtin/merge-tree.c | 6 +++--- builtin/mktag.c | 2 +- builtin/notes.c | 6 +++--- builtin/pack-objects.c | 24 --- builtin/reflog.c | 2 +- builtin/tag.c| 4 ++-- builtin/unpack-file.c| 2 +- builtin/unpack-objects.c | 14 +++--- builtin/verify-commit.c | 2 +- bundle.c | 2 +- cache.h | 22 ++--- combine-diff.c | 9 + commit.c | 6 +++--- config.c | 2 +- delta.h | 26 - diff-delta.c | 16 diff.c | 18 - diff.h | 2 +- diffcore.h | 2 +- dir.c| 2 +- entry.c | 4 ++-- fast-import.c| 24 +++ fsck.c | 2 +- grep.h | 2 +- http-push.c | 5 +++-- mailmap.c| 2 +- match-trees.c| 4 ++-- merge-blobs.c| 6 +++--- merge-blobs.h| 2 +- merge-recursive.c| 4 ++-- notes-cache.c| 2 +- notes-merge.c| 2 +- notes.c | 6 +++--- object.c | 2 +- pack-check.c | 2 +- pack-objects.h | 6 +++--- patch-delta.c| 11 ++- read-cache.c | 4 ++-- ref-filter.c | 4 ++-- remote-testsvn.c | 4 ++-- rerere.c | 2 +- sha1_file.c | 50 +--- streaming.c | 8 streaming.h | 2 +- submodule-config.c | 2 +- t/helper/test-delta.c| 2 +- tag.c| 4 ++-- tree-walk.c | 8 tree.c | 2 +- xdiff-interface.c| 2 +- 66 files changed, 219 insertions(+), 212 deletions(-) diff --git a/apply.c b/apply.c index 41ee63e..af9ffee 100644 --- a/apply.c +++ b/apply.c @@ -3082,7 +3082,7 @@ static int apply_binary_fragment(struct apply_state *state, struct patch *patch) { struct fragment *fragment = patch->fragments; - unsigned long len; + size_t len; void *dst; if (!fragment) @@ -3171,7 +3171,7 @@ static int apply_binary(struct apply_state *state, if (has_sha1_file(oid.hash)) { /* We already have the postimage */ enum object_type type; - unsigned long size; + size_t size; char *result; result = read_sha1_file(oid.hash, &type, &size); @@ -3233,7 +3233,7 @@ static int read_blob_object(struct strbuf *buf, const struct object_id *oid, uns strbuf_addf(buf, "Subproject commit %s\n", oid_to_hex(oid)); } else { enum object_type type; - unsigned long sz; + size_t sz; char *result; result = read_sha1_file(oid->hash, &type, &sz); diff --git a/archive-tar.c b/archive-tar.c index c6ed96e..719673d 100644 --- a/archive-tar.c +++ b/archive-tar.c @@ -115,7 +115,7 @@ static int stream_blocked(const unsigned char *sha1) { struct git_istream *st; enum object_type type; - unsigned long sz; + size_t sz; char buf[BLOCKSIZE]; ssize_t readlen; @@ -240,7 +240,7 @@ static int write_tar_entry(struct archiver_args *args, struct ustar_header header; struct strbuf ext_header = STRBUF_INIT; unsigned int old_mode = mode; - unsigned long size, size_in_header; + size_t size, size_in_header; void *buffer; int err = 0; diff --git a/archive-zip.c b/archive-zip.c index e8913e5..4492d64 100644 --- a/archive-zip.c +++ b/archive-zip.c @@ -295,7 +295,7 @@ static int write_zip_entr
Re: [PATCH 0/5] make interpret-trailers useful for parsing
On Thu, Aug 10, 2017 at 12:04:49AM -0700, Jacob Keller wrote: > >> $ git log --format=%B -1 8d44797cc91231cd44955279040dc4a1ee0a797f | > >> git interpret-trailers --parse > >> Signed-off-by: Hartmut Henkel > >> Helped-by: Stefan Beller > >> Signed-off-by: Ralf Thielow > >> Acked-by: Matthias Rüster > > > > Thank-you, thank-you, thank-you. > > > > The above example made me wonder if we also want a format specifier > > to do the above without piping, but it turns out that we already > > have "log --format=%(trailers)", so we are good ;-) > > I was going to say, I thought we had a way to get trailers for a > commit via the pretty format, since that is what i used in the past. I do like that you could get the trailers for many commits in a single invocation. That doesn't matter for my current use-case, but obviously piping through O(n) interpret-trailers invocations is a bad idea. But there are a few difficulties with using %(trailers) for this, as it shows everything between the start/end points of the trailer block. In particular: 1. You don't get any kind of normalization, so you're on your own for parsing things like whitespace continuation, extra spaces before separators, etc. 2. It prints non-trailers that fall inside the block. For instance: $ git commit --allow-empty -F - <<-\EOF subject body Signed-off-by: me this is not a trailer Signed-off-by: you EOF $ git log -1 --format=%B | git interpret-trailers --parse Signed-off-by: me Signed-off-by: you $ git log -1 --format='%(trailers)' Signed-off-by: me this is not a trailer Signed-off-by: you For (1) I think many callers would prefer to see the original formatting. Maybe we'd need a %(trailers:normalize) or something. I'm tempted to call (2) a bug, but I guess it's unclear whether callers would want to see the whole block, or if they really want just the individual trailers. -Peff
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
On Wed, Aug 09, 2017 at 11:18:19AM -0700, Stefan Beller wrote: > On Wed, Aug 9, 2017 at 5:24 AM, Jeff King wrote: > > It can be useful to invoke interpret-trailers for the > > primary purpose of parsing existing trailers. But in that > > case, we don't want to apply existing ifMissing or ifExists > > rules from the config. Let's add a special mode where we > > avoid applying those rules. Coupled with --only-trailers, > > this gives us a reasonable parsing tool. > > I have the impression that the name is slightly misleading > because 'only' just reduces the set. it does not enhance it. > (Do we have a configuration that says "remove this trailer > anytime"?) No, I think you can only add trailers via ifExists or ifMissing. I actually called this --no-config originally, because to me it meant "do not apply config". But the processing applies also to --trailer arguments no the command line, which is how I ended up with --only-existing. > So maybe this is rather worded as 'exact-trailers' ? I'm not fond of that, as it's vague about which exact trailers we're talking about. I also thought of something like --verbatim, but I'd worry that would seem to conflict with --normalize. I dunno. All of the names seem not quite descriptive enough to me. -Peff
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
On Wed, Aug 09, 2017 at 11:38:20AM -0700, Jonathan Tan wrote: > > diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh > > index e5b0718ef6..525fd53e5b 100755 > > --- a/t/t7513-interpret-trailers.sh > > +++ b/t/t7513-interpret-trailers.sh > > @@ -1312,4 +1312,19 @@ test_expect_success 'only-trailers omits non-trailer > > in middle of block' ' > > test_cmp expected actual > > ' > > > > +test_expect_success 'only existing' ' > > + cat >expected <<-\EOF && > > + existing: existing-value > > + EOF > > + git interpret-trailers \ > > + --only-trailers --only-existing >actual <<-\EOF && > > + my subject > > + > > + my body > > + > > + existing: existing-value > > + EOF > > + test_cmp expected actual > > Would it be worth asserting that the "sign" trailer is configured here > and would normally appear? Maybe through a grep on the output of "git > config". I'd much rather re-add it than assert it with grep hackery. Note that its presence is implied in all of the follow-on tests, too (so I had sort of assumed that its presence in the --only-trailers test would imply that it was carried through to the others). We can be more explicit, though, I guess. > > diff --git a/trailer.c b/trailer.c > > index a4ff99f98a..88f6efe523 100644 > > --- a/trailer.c > > +++ b/trailer.c > > @@ -991,9 +991,10 @@ void process_trailers(const char *file, struct > > process_trailer_options *opts, > > trailer_end = process_input_file(opts->only_trailers ? NULL : outfile, > > sb.buf, &head); > > > > - process_command_line_args(&arg_head, trailers); > > - > > - process_trailers_lists(&head, &arg_head); > > + if (!opts->only_existing) { > > + process_command_line_args(&arg_head, trailers); > > + process_trailers_lists(&head, &arg_head); > > + } > > This is a bit confusing, but it is correct, since > process_command_line_args() processes both configured trailers and > command-line trailers. Yes, it confused me, too. That combination is why "--trailer" is disallowed with --only-existing (which otherwise could work together). I didn't think it was worth refactoring for a case that I don't think anybody would care about. > Having said that, it might be worth declaring LIST_HEAD(arg_head) inside > the if block now. Agreed. -Peff
Re: [PATCH 2/5] interpret-trailers: add an option to show only the trailers
On Wed, Aug 09, 2017 at 11:35:27AM -0700, Jonathan Tan wrote: > > -static void print_all(FILE *outfile, struct list_head *head, int > > trim_empty) > > +static void print_all(FILE *outfile, struct list_head *head, > > + struct process_trailer_options *opts) > > This can be const, I think. (Same thing for patch 1.) OK. We often leave these kinds of option structs as non-const because they can sometimes grow to carry state between functions (e.g., diff_opt). But it's certainly const-able now, so we can let somebody undo it later if they want. > > @@ -897,9 +899,10 @@ static int process_input_file(FILE *outfile, > > trailer_info_get(&info, str); > > > > /* Print lines before the trailers as is */ > > - fwrite(str, 1, info.trailer_start - str, outfile); > > + if (outfile) > > Any reason why you expect outfile to possibly be NULL? > > > + fwrite(str, 1, info.trailer_start - str, outfile); > > > > - if (!info.blank_line_before_trailer) > > + if (outfile && !info.blank_line_before_trailer) > > Same comment here. Because of this hunk from later in the file where we pass in NULL: /* Print the lines before the trailers */ - trailer_end = process_input_file(outfile, sb.buf, &head); + trailer_end = process_input_file(opts->only_trailers ? NULL : outfile, +sb.buf, &head); -Peff
Re: [PATCH 5/5] interpret-trailers: add --parse convenience option
On Wed, Aug 09, 2017 at 11:20:12AM -0700, Stefan Beller wrote: > > +--parse:: > > + A convenience alias for `--only-trailers --only-existing > > + --normalize`. > > Somewhere in this series, we'd want to not just describe each > of the new knobs, but reword the initial description, too? > > git-interpret-trailers - help add structured information > into commit messages > > Maybe a s/add/handle/ s/into// is enough? Thanks, my initial attempt actually included that, when I thought I was going to have a unique "parsing mode". But then as it grew into individual options, I dropped it. And forgot how modification-centric the rest of the manpage description is. I'll try to include something in the final commit, which is where the parsing mode is all tied together. -Peff
Re: [PATCH 0/5] make interpret-trailers useful for parsing
On Wed, Aug 09, 2017 at 08:21:47AM -0400, Jeff King wrote: > This series teaches interpret-trailers to parse and output just the > trailers. So now you can do: > > $ git log --format=%B -1 8d44797cc91231cd44955279040dc4a1ee0a797f | > git interpret-trailers --parse > Signed-off-by: Hartmut Henkel > Helped-by: Stefan Beller > Signed-off-by: Ralf Thielow > Acked-by: Matthias Rüster And here's a v2 that addresses all of the comments except one: Stefan suggested that --only-existing wasn't a great name. I agree, but I like everything else less. Summary: - opts arguments are now const - tests that depend on the value of trailer.sign.command now set it explicitly - the arg_head variable is now moved into the conditional block whewre it's used - the interpret-trailers manpage has been updated in patch 5 to make it clear that "add" and "parse" are the two major modes [1/5]: trailer: put process_trailers() options into a struct [2/5]: interpret-trailers: add an option to show only the trailers [3/5]: interpret-trailers: add an option to show only existing trailers [4/5]: interpret-trailers: add an option to normalize output [5/5]: interpret-trailers: add --parse convenience option Documentation/git-interpret-trailers.txt | 34 +++--- builtin/interpret-trailers.c | 34 +++--- t/t7513-interpret-trailers.sh| 76 trailer.c| 68 ++-- trailer.h| 13 +- 5 files changed, 196 insertions(+), 29 deletions(-) -Peff
[PATCH 5/5] interpret-trailers: add --parse convenience option
The last few commits have added command line options that can turn interpret-trailers into a parsing tool. Since they'd most often be used together, let's provide a convenient single option for callers to invoke this mode. This is implemented as a callback rather than a boolean so that its effect is applied immediately, as if those options had been specified. Later options can then override them. E.g.: git interpret-trailers --parse --no-normalize would work. Let's also update the documentation to make clear that this parsing mode behaves quite differently than the normal "add trailers to the input" mode. Signed-off-by: Jeff King --- Documentation/git-interpret-trailers.txt | 21 ++--- builtin/interpret-trailers.c | 12 2 files changed, 26 insertions(+), 7 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 6537faf887..863cb4ec5d 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -3,24 +3,27 @@ git-interpret-trailers(1) NAME -git-interpret-trailers - help add structured information into commit messages +git-interpret-trailers - add or parse structured information in commit messages SYNOPSIS [verse] -'git interpret-trailers' [--in-place] [--trim-empty] [(--trailer [(=|:)])...] [...] +'git interpret-trailers' [options] [(--trailer [(=|:)])...] [...] +'git interpret-trailers' [options] [--parse] [...] DESCRIPTION --- -Help adding 'trailers' lines, that look similar to RFC 822 e-mail +Help parsing or adding 'trailers' lines, that look similar to RFC 822 e-mail headers, at the end of the otherwise free-form part of a commit message. This command reads some patches or commit messages from either the - arguments or the standard input if no is specified. Then -this command applies the arguments passed using the `--trailer` -option, if any, to the commit message part of each input file. The -result is emitted on the standard output. + arguments or the standard input if no is specified. If +`--parse` is specified, the output consists of the parsed trailers. + +Otherwise, the this command applies the arguments passed using the +`--trailer` option, if any, to the commit message part of each input +file. The result is emitted on the standard output. Some configuration variables control the way the `--trailer` arguments are applied to each commit message and the way any existing trailer in @@ -93,6 +96,10 @@ OPTIONS line, with any existing whitespace continuation folded into a single line. +--parse:: + A convenience alias for `--only-trailers --only-existing + --normalize`. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index ed2d893b4f..70ca855aa6 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -16,6 +16,16 @@ static const char * const git_interpret_trailers_usage[] = { NULL }; +static int parse_opt_parse(const struct option *opt, const char *arg, + int unset) +{ + struct process_trailer_options *v = opt->value; + v->only_trailers = 1; + v->only_existing = 1; + v->normalize = 1; + return 0; +} + int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) { struct process_trailer_options opts = PROCESS_TRAILER_OPTIONS_INIT; @@ -27,6 +37,8 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "only-trailers", &opts.only_trailers, N_("output only the trailers")), OPT_BOOL(0, "only-existing", &opts.only_existing, N_("output only existing trailers")), OPT_BOOL(0, "normalize", &opts.normalize, N_("normalize trailer formatting")), + { OPTION_CALLBACK, 0, "parse", &opts, NULL, N_("set parsing options"), + PARSE_OPT_NOARG | PARSE_OPT_NONEG, parse_opt_parse }, OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() -- 2.14.0.614.g0beb26d5e9
[PATCH 2/5] interpret-trailers: add an option to show only the trailers
In theory it's easy for any reader who wants to parse trailers to do so. But there are a lot of subtle corner cases around what counts as a trailer, when the trailer block begins and ends, etc. Since interpret-trailers already has our parsing logic, let's let callers ask it to just output the trailers. They still have to parse the "key: value" lines, but at least they can ignore all of the other corner cases. Signed-off-by: Jeff King --- Documentation/git-interpret-trailers.txt | 3 +++ builtin/interpret-trailers.c | 1 + t/t7513-interpret-trailers.sh| 39 trailer.c| 19 ++-- trailer.h| 1 + 5 files changed, 56 insertions(+), 7 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 31cdeaecdf..295dffbd21 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -80,6 +80,9 @@ OPTIONS trailer to the input messages. See the description of this command. +--only-trailers:: + Output only the trailers, not any other parts of the input. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index bb0d7b937a..afb12c11bc 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -24,6 +24,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) struct option options[] = { OPT_BOOL(0, "in-place", &opts.in_place, N_("edit files in place")), OPT_BOOL(0, "trim-empty", &opts.trim_empty, N_("trim empty trailers")), + OPT_BOOL(0, "only-trailers", &opts.only_trailers, N_("output only the trailers")), OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 0c6f91c433..90d30037b7 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -1275,4 +1275,43 @@ test_expect_success 'with cut line' ' test_cmp expected actual ' +test_expect_success 'only trailers' ' + git config trailer.sign.command "echo config-value" && + cat >expected <<-\EOF && + existing: existing-value + sign: config-value + added: added-value + EOF + git interpret-trailers \ + --trailer added:added-value \ + --only-trailers >actual <<-\EOF && + my subject + + my body + + existing: existing-value + EOF + test_cmp expected actual +' + +test_expect_success 'only-trailers omits non-trailer in middle of block' ' + git config trailer.sign.command "echo config-value" && + cat >expected <<-\EOF && + Signed-off-by: nobody + Signed-off-by: somebody + sign: config-value + EOF + git interpret-trailers --only-trailers >actual <<-\EOF && + subject + + it is important that the trailers below are signed-off-by + so that they meet the "25% trailers Git knows about" heuristic + + Signed-off-by: nobody + this is not a trailer + Signed-off-by: somebody + EOF + test_cmp expected actual +' + test_done diff --git a/trailer.c b/trailer.c index e21a0d1629..706351fc98 100644 --- a/trailer.c +++ b/trailer.c @@ -164,13 +164,15 @@ static void print_tok_val(FILE *outfile, const char *tok, const char *val) fprintf(outfile, "%s%c %s\n", tok, separators[0], val); } -static void print_all(FILE *outfile, struct list_head *head, int trim_empty) +static void print_all(FILE *outfile, struct list_head *head, + const struct process_trailer_options *opts) { struct list_head *pos; struct trailer_item *item; list_for_each(pos, head) { item = list_entry(pos, struct trailer_item, list); - if (!trim_empty || strlen(item->value) > 0) + if ((!opts->trim_empty || strlen(item->value) > 0) && + (!opts->only_trailers || item->token)) print_tok_val(outfile, item->token, item->value); } } @@ -897,9 +899,10 @@ static int process_input_file(FILE *outfile, trailer_info_get(&info, str); /* Print lines before the trailers as is */ - fwrite(str, 1, info.trailer_start - str, outfile); + if (outfile) + fwrite(str, 1, info.trailer_start - str, outfile); - if (!info.blank_line_before_trailer) + if (outfile && !info.blank_line_before_trailer) fprintf(outfile, "\n"); for (i = 0; i < info.trailer_nr; i++) { @@ -986,18 +989,20
[PATCH 4/5] interpret-trailers: add an option to normalize output
The point of "--only-trailers" is to give a caller an output that's easy for them to parse. Getting rid of the non-trailer material helps, but we still may see more complicated syntax like whitespace continuation. Let's add an option to normalize the output into one "key: value" line per trailer. As a bonus, this could be used even without --only-trailers to clean up unusual formatting in the incoming data. Signed-off-by: Jeff King --- Documentation/git-interpret-trailers.txt | 5 + builtin/interpret-trailers.c | 1 + t/t7513-interpret-trailers.sh| 21 trailer.c| 34 +++- trailer.h| 1 + 5 files changed, 61 insertions(+), 1 deletion(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index b2a8fad248..6537faf887 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -88,6 +88,11 @@ OPTIONS from the command-line or by following configured `trailer.*` rules. +--normalize:: + Normalize trailer output so that trailers appear exactly one per + line, with any existing whitespace continuation folded into a + single line. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index a49f94ba34..ed2d893b4f 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -26,6 +26,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "trim-empty", &opts.trim_empty, N_("trim empty trailers")), OPT_BOOL(0, "only-trailers", &opts.only_trailers, N_("output only the trailers")), OPT_BOOL(0, "only-existing", &opts.only_existing, N_("output only existing trailers")), + OPT_BOOL(0, "normalize", &opts.normalize, N_("normalize trailer formatting")), OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 97f4ac0fb3..ef34ee4e49 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -1330,4 +1330,25 @@ test_expect_success 'only existing' ' test_cmp expected actual ' +test_expect_success 'normalize' ' + cat >expected <<-\EOF && + foo: continued across several lines + EOF + # pass through tr to make leading and trailing whitespace more obvious + tr _ " " <<-\EOF | + my subject + + my body + + foo:_ + __continued + ___across + several + _lines + ___ + EOF + git interpret-trailers --only-trailers --only-existing --normalize >actual && + test_cmp expected actual +' + test_done diff --git a/trailer.c b/trailer.c index 3abac3db37..1ccf9996b1 100644 --- a/trailer.c +++ b/trailer.c @@ -887,7 +887,35 @@ static int ends_with_blank_line(const char *buf, size_t len) return is_blank_line(buf + ll); } +static void normalize_value(struct strbuf *val) +{ + struct strbuf out = STRBUF_INIT; + size_t i; + + strbuf_grow(&out, val->len); + i = 0; + while (i < val->len) { + char c = val->buf[i++]; + if (c == '\n') { + /* Collapse continuation down to a single space. */ + while (i < val->len && isspace(val->buf[i])) + i++; + strbuf_addch(&out, ' '); + } else { + strbuf_addch(&out, c); + } + } + + /* Empty lines may have left us with whitespace cruft at the edges */ + strbuf_trim(&out); + + /* output goes back to val as if we modified it in-place */ + strbuf_swap(&out, val); + strbuf_release(&out); +} + static int process_input_file(FILE *outfile, + int normalize, const char *str, struct list_head *head) { @@ -914,12 +942,16 @@ static int process_input_file(FILE *outfile, if (separator_pos >= 1) { parse_trailer(&tok, &val, NULL, trailer, separator_pos); + if (normalize) + normalize_value(&val); add_trailer_item(head, strbuf_detach(&tok, NULL), strbuf_detach(&val, NULL)); } else { strbuf_addstr(&val, trailer); strbuf_strip_suffix(&val, "\n"); + if
[PATCH 3/5] interpret-trailers: add an option to show only existing trailers
It can be useful to invoke interpret-trailers for the primary purpose of parsing existing trailers. But in that case, we don't want to apply existing ifMissing or ifExists rules from the config. Let's add a special mode where we avoid applying those rules. Coupled with --only-trailers, this gives us a reasonable parsing tool. Signed-off-by: Jeff King --- Documentation/git-interpret-trailers.txt | 5 + builtin/interpret-trailers.c | 7 +++ t/t7513-interpret-trailers.sh| 16 trailer.c| 9 + trailer.h| 1 + 5 files changed, 34 insertions(+), 4 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 295dffbd21..b2a8fad248 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -83,6 +83,11 @@ OPTIONS --only-trailers:: Output only the trailers, not any other parts of the input. +--only-existing:: + Output only trailers that exist in the input; do not add any + from the command-line or by following configured `trailer.*` + rules. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index afb12c11bc..a49f94ba34 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -25,6 +25,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "in-place", &opts.in_place, N_("edit files in place")), OPT_BOOL(0, "trim-empty", &opts.trim_empty, N_("trim empty trailers")), OPT_BOOL(0, "only-trailers", &opts.only_trailers, N_("output only the trailers")), + OPT_BOOL(0, "only-existing", &opts.only_existing, N_("output only existing trailers")), OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() @@ -33,6 +34,12 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, options, git_interpret_trailers_usage, 0); + if (opts.only_existing && trailers.nr) + usage_msg_opt( + _("--trailer with --only-existing does not make sense"), + git_interpret_trailers_usage, + options); + if (argc) { int i; for (i = 0; i < argc; i++) diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 90d30037b7..97f4ac0fb3 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -1314,4 +1314,20 @@ test_expect_success 'only-trailers omits non-trailer in middle of block' ' test_cmp expected actual ' +test_expect_success 'only existing' ' + git config trailer.sign.command "echo config-value" && + cat >expected <<-\EOF && + existing: existing-value + EOF + git interpret-trailers \ + --only-trailers --only-existing >actual <<-\EOF && + my subject + + my body + + existing: existing-value + EOF + test_cmp expected actual +' + test_done diff --git a/trailer.c b/trailer.c index 706351fc98..3abac3db37 100644 --- a/trailer.c +++ b/trailer.c @@ -976,7 +976,6 @@ void process_trailers(const char *file, struct string_list *trailers) { LIST_HEAD(head); - LIST_HEAD(arg_head); struct strbuf sb = STRBUF_INIT; int trailer_end; FILE *outfile = stdout; @@ -992,9 +991,11 @@ void process_trailers(const char *file, trailer_end = process_input_file(opts->only_trailers ? NULL : outfile, sb.buf, &head); - process_command_line_args(&arg_head, trailers); - - process_trailers_lists(&head, &arg_head); + if (!opts->only_existing) { + LIST_HEAD(arg_head); + process_command_line_args(&arg_head, trailers); + process_trailers_lists(&head, &arg_head); + } print_all(outfile, &head, opts); diff --git a/trailer.h b/trailer.h index 3cf35ced00..96dcbe801a 100644 --- a/trailer.h +++ b/trailer.h @@ -26,6 +26,7 @@ struct process_trailer_options { int in_place; int trim_empty; int only_trailers; + int only_existing; }; #define PROCESS_TRAILER_OPTIONS_INIT {0} -- 2.14.0.614.g0beb26d5e9
[PATCH 1/5] trailer: put process_trailers() options into a struct
We already have two options and are about to add a few more. To avoid having a huge number of boolean arguments, let's convert to an options struct which can be passed in. Signed-off-by: Jeff King --- builtin/interpret-trailers.c | 13 ++--- trailer.c| 10 ++ trailer.h| 10 +- 3 files changed, 21 insertions(+), 12 deletions(-) diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index 175f14797b..bb0d7b937a 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -18,13 +18,12 @@ static const char * const git_interpret_trailers_usage[] = { int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) { - int in_place = 0; - int trim_empty = 0; + struct process_trailer_options opts = PROCESS_TRAILER_OPTIONS_INIT; struct string_list trailers = STRING_LIST_INIT_NODUP; struct option options[] = { - OPT_BOOL(0, "in-place", &in_place, N_("edit files in place")), - OPT_BOOL(0, "trim-empty", &trim_empty, N_("trim empty trailers")), + OPT_BOOL(0, "in-place", &opts.in_place, N_("edit files in place")), + OPT_BOOL(0, "trim-empty", &opts.trim_empty, N_("trim empty trailers")), OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() @@ -36,11 +35,11 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) if (argc) { int i; for (i = 0; i < argc; i++) - process_trailers(argv[i], in_place, trim_empty, &trailers); + process_trailers(argv[i], &opts, &trailers); } else { - if (in_place) + if (opts.in_place) die(_("no input file given for in-place editing")); - process_trailers(NULL, in_place, trim_empty, &trailers); + process_trailers(NULL, &opts, &trailers); } string_list_clear(&trailers, 0); diff --git a/trailer.c b/trailer.c index 751b56c009..e21a0d1629 100644 --- a/trailer.c +++ b/trailer.c @@ -968,7 +968,9 @@ static FILE *create_in_place_tempfile(const char *file) return outfile; } -void process_trailers(const char *file, int in_place, int trim_empty, struct string_list *trailers) +void process_trailers(const char *file, + const struct process_trailer_options *opts, + struct string_list *trailers) { LIST_HEAD(head); LIST_HEAD(arg_head); @@ -980,7 +982,7 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str read_input_file(&sb, file); - if (in_place) + if (opts->in_place) outfile = create_in_place_tempfile(file); /* Print the lines before the trailers */ @@ -990,14 +992,14 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str process_trailers_lists(&head, &arg_head); - print_all(outfile, &head, trim_empty); + print_all(outfile, &head, opts->trim_empty); free_all(&head); /* Print the lines after the trailers as is */ fwrite(sb.buf + trailer_end, 1, sb.len - trailer_end, outfile); - if (in_place) + if (opts->in_place) if (rename_tempfile(&trailers_tempfile, file)) die_errno(_("could not rename temporary file to %s"), file); diff --git a/trailer.h b/trailer.h index 65cc5d79c6..9da00bedec 100644 --- a/trailer.h +++ b/trailer.h @@ -22,7 +22,15 @@ struct trailer_info { size_t trailer_nr; }; -void process_trailers(const char *file, int in_place, int trim_empty, +struct process_trailer_options { + int in_place; + int trim_empty; +}; + +#define PROCESS_TRAILER_OPTIONS_INIT {0} + +void process_trailers(const char *file, + const struct process_trailer_options *opts, struct string_list *trailers); void trailer_info_get(struct trailer_info *info, const char *str); -- 2.14.0.614.g0beb26d5e9
Re: [PATCH 0/4] dropping support for older curl
Hi Peff, On Wed, 9 Aug 2017, Jeff King wrote: > On Wed, Aug 09, 2017 at 11:42:12PM +0200, Johannes Schindelin wrote: > > > > This is a resurrection of the thread from April: > > > > > > > > > https://public-inbox.org/git/20170404025438.bgxz5sfmrawqs...@sigill.intra.peff.net/ > > > > As before, I would like to point out that people running with older cURL > > are most likely not at liberty to change the system libraries. > > > > I know that I didn't when I was working on a very expensive microscope > > whose only certified control computer ran a very old version of CentOS, > > and I really needed to install Git on it. > > > > In such a case, it is often preferable to be able to build against an old > > cURL -- even if some of the fancier features might be broken, and even if > > some minor compile errors need to be fixed. > > > > I know I was happy to compile Git against an ancient cURL back then. > > > > Just so you understand where I come from when I would like to caution > > against dropping support for older cURL unless it *really* adds an > > *enormous* amount of maintenance burden. > > > > I mean, if we even go out of our way to support the completely outdated > > and obsolete .git/branches/ for what is likely a single user, it may not > > be the worst to keep those couple of #ifdef guards to keep at least > > nominal support for older cURLs? > > You've totally ignored the argument I made back then[1], and which I > reiterated in this thread. So I'll say it one more time: the more > compelling reason is not the #ifdefs, but the fact that the older > versions are totally untested. In fact, they do not even compile, and > yet I have not seen any patches to fix that. Let me re-quote from above: > > In such a case, it is often preferable to be able to build against an > > old cURL -- even if some of the fancier features might be broken, and > > even if some minor compile errors need to be fixed. As far as I remember, I *did* have to fix a minor compile error. Took something like 15 minutes from first compile error to fully running test suite. Compare that effort to the effort of compiling a current cURL, possibly having to compile newer c-ares, spdylay, jansson, nghttp2, openssl, nettle, libunistring, libtasn1, libidn, libmetalink, rtmpdump and whatever else. > So IMHO this is about being honest with users about which versions we > _actually_ support. We will most likely never, ever have a fully 100% bug free system. That does not mean that we should rip out everything that is not totally, completely working. Instead, we try [*1*] to welcome patches. I can buy some argument like: this support is so invasive, so brittle, and nobody takes care of it, and if it breaks, it is hard to fix, and those who could, won't, so let's remove it. That argument is why Git for Windows dropped XP support. I cannot buy the argument: there are a dozen #ifdefs and I don't know whether they still work. I don't know whether anybody (who most likely has better things to do than read the Git mailing list) is still using those. So let's just remove them. That argument was what let us go overboard, and actually go too far by removing the fallback when REG_STARTEND is missing, even while fixing a very real bug. And that overzealous action hurt users. It also cost *us* time, having to deal with the ensuing conversation, but we deserved to be paying for this, the users didn't. I did not have time to look closely over your patches to remove cURL support for older versions. From a cursory look, I did not get the impression that there is a lot of maintenance burden there, though. Therefore, I currently believe that the downsides of removing the support outweigh the benefits. Mind, I agree that cURL should be upgrade to version 7.55.0 wherever possible. But it is a huge mistake to assume that everybody who wants to build, or just use, Git is at liberty to perform that upgrade in their setup. To make that assumption is really harmful to users who are stuck in a bad place out of no fault of their own. It is also not very nice. Hopefully I had better luck expressing my concerns this time? Ciao, Dscho Footnote *1*: It is no secret that I find our patch submission less than inviting. Granted, *I* use it. *I* did not have problems entering the mailing list. But then, my mails were not swallowed silently, because my mail program does not send HTML by default. And prepared by the German school system (I learned the term "sugar coating" only when exposed to some US culture), I had little emotional problems with being criticized and not thanked for my contribution, I persisted nevertheless. The opinion that the Git contribution process is a lot less inviting than it could be is not only my view, by the way. I hear this a lot. I give you that we are not quite as textbook "keep out from here unless you look like us, smell like us, talk like us, have the same genital setup like us" as the Linux kernel mailing list, but
Re: [RFC] clang-format: outline the git project's coding style
Hi Stefan, On Wed, 9 Aug 2017, Stefan Beller wrote: > > I am sure that something even better will be possible: a Continuous > > "Integration" that fixes the coding style automatically by using > > `filter-branch` (avoiding the merge conflicts that would arise if > > `rebase -i` was used). > > I do not quite follow. Is that to be used by Junio while integrating > branches? I was more thinking about a bot on GitHub. "Code cleanup as a service". Ciao, Dscho
[PATCH] sha1_file: release delta_stack on error in unpack_entry()
When unpack_entry() encounters a broken packed object, it returns early. It adjusts the reference count of the pack window, but leaks the buffer for a big delta stack in case the small automatic one was not enough. Jump to the cleanup code at end instead, which takes care of that. Signed-off-by: Rene Scharfe --- sha1_file.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index b60ae15f70..b7bb38b445 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2542,8 +2542,8 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset, error("bad packed object CRC for %s", sha1_to_hex(sha1)); mark_bad_packed_object(p, sha1); - unuse_pack(&w_curs); - return NULL; + data = NULL; + goto out; } } @@ -2681,6 +2681,7 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset, if (final_size) *final_size = size; +out: unuse_pack(&w_curs); if (delta_stack != small_delta_stack) -- 2.14.0
[PATCH] fsck: free buffers on error in fsck_obj()
Move the code for releasing tree buffers and commit buffers in fsck_obj() to the end of the function and make sure it's executed no matter of an error is encountered or not. Signed-off-by: Rene Scharfe --- builtin/fsck.c | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 99dea7adf6..698a8776a4 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -335,6 +335,8 @@ static void check_connectivity(void) static int fsck_obj(struct object *obj) { + int err; + if (obj->flags & SEEN) return 0; obj->flags |= SEEN; @@ -345,20 +347,13 @@ static int fsck_obj(struct object *obj) if (fsck_walk(obj, NULL, &fsck_obj_options)) objerror(obj, "broken links"); - if (fsck_object(obj, NULL, 0, &fsck_obj_options)) - return -1; - - if (obj->type == OBJ_TREE) { - struct tree *item = (struct tree *) obj; - - free_tree_buffer(item); - } + err = fsck_object(obj, NULL, 0, &fsck_obj_options); + if (err) + goto out; if (obj->type == OBJ_COMMIT) { struct commit *commit = (struct commit *) obj; - free_commit_buffer(commit); - if (!commit->parents && show_root) printf("root %s\n", describe_object(&commit->object)); } @@ -374,7 +369,12 @@ static int fsck_obj(struct object *obj) } } - return 0; +out: + if (obj->type == OBJ_TREE) + free_tree_buffer((struct tree *)obj); + if (obj->type == OBJ_COMMIT) + free_commit_buffer((struct commit *)obj); + return err; } static int fsck_obj_buffer(const struct object_id *oid, enum object_type type, -- 2.14.0
Re: Dropping support for older perl
On 09/08/17 15:38, Ævar Arnfjörð Bjarmason wrote: RHEL/CentOS 5.x has perl 5.8.8, but it also has curl 7.15.5[1] which is obseleted by these curl patches. Maybe we'd want to be more conservative with perl for whatever reason, but I'd like to at least bump our requirenment of 5.8.0 to 5.8.8. Those releases are 4 years apart, and a lot of bugs were fixed[2], and some constructs / modules have newer APIs we could use. But if we do the thing corresponding to these curl patches we should bump the dependency to 5.10.1, that was released in August 2009 (and the curl version JK is bumping us to in March 2009), and 5.10.1 is shipped with RHEL/CentOS 6. I agree with your thoughts. Though I'm a bit biased since I only really care about RHEL/CentOS in the context of being able to use vendor provided versions of curl and perl. The bump to 5.10.1 may be a bad idea, I know AIX/HPUX/Solaris and some others have historically been more conservative about upgrading perl than stuff like libcurl since it's in the base system. AFAIK it used to be common to build updated versions at least on Solaris. I provide perl 5.16.x and a recent curl for Solaris 2.6-9 as part of tgcware(1) and Solaris 10/11 users can use OpenCSW which seems to have 5.10.1 available. -tgc 1) https://jupiterrise.com/tgcware/
[PATCH] win32: plug memory leak on realloc() failure in syslog()
If realloc() fails then the original buffer is still valid. Free it before exiting the function. Signed-off-by: Rene Scharfe --- compat/win32/syslog.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/compat/win32/syslog.c b/compat/win32/syslog.c index 6c7c9b6053..161978d720 100644 --- a/compat/win32/syslog.c +++ b/compat/win32/syslog.c @@ -43,8 +43,10 @@ void syslog(int priority, const char *fmt, ...) va_end(ap); while ((pos = strstr(str, "%1")) != NULL) { + char *oldstr = str; str = realloc(str, st_add(++str_len, 1)); if (!str) { + free(oldstr); warning_errno("realloc failed"); return; } -- 2.14.0
Re: [PATCH 3/4] http: drop support for curl < 7.19.4
Jeff King wrote: > -#if LIBCURL_VERSION_NUM >= 0x071301 > curl_easy_setopt(result, CURLOPT_POSTREDIR, CURL_REDIR_POST_ALL); > -#elif LIBCURL_VERSION_NUM >= 0x071101 > curl_easy_setopt(result, CURLOPT_POST301, 1); > -#endif This seems to be an unintended behavioural change: the second condition wouldn't have applied previously and overrides the first option (equivalent to CURLOPT_POSTREDIR = CURL_REDIR_POST_301). -- Mischa
Re: fatal: Out of memory, getdelim failed under NFS mounts
Am 09.08.2017 um 19:39 schrieb Yaroslav Halchenko: > More context (may be different issue(s)) could be found at > http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/ > but currently I am consistently reproducing it while running > git (1:2.11.0-3 debian stretch build) within debian stretch singularity > environment [1]. > > External system is Centos 6.9, and git 1.7.1 (and installed in modules > 2.0.4) do not show similar buggy behavior. > > NFS mounted partitions are bind mounted inside the sinularity space and > when I try to do some git operations, I get that error inconsistently , e.g. > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > master > fatal: Out of memory, getdelim failed > error: git://github.com/datalad/datalad did not send all necessary > objects > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > master > fatal: Out of memory, getdelim failed > error: git://github.com/datalad/datalad did not send all necessary > objects > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > master > From git://github.com/datalad/datalad >* branch master -> FETCH_HEAD > fatal: Out of memory, getdelim failed > > and some times it succeeds. So it smells that some race condition > somewhere...? I doubt the type of file system matters. The questions are: How much main memory do you have, what is git trying to cram into it, is there a way to reduce the memory footprint or do you need to add more RAM? > any recommendations on how to pin point the "offender"? ;) Running "GIT_TRACE=1 git pull --ff-only origin master" would be a good start, I think, to find out which of the different activities that pull is doing causes the out-of-memory error. "free" and "ulimit -a" can help you find out how much memory you can use. Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report? getdelim() is used mostly to read lines from files like these and in the admittedly unlikely case that they are *really* long such an error would be expected. René
Re: fatal: Out of memory, getdelim failed under NFS mounts
Thank you René! comments/answers embedded below On Thu, 10 Aug 2017, René Scharfe wrote: > Am 09.08.2017 um 19:39 schrieb Yaroslav Halchenko: > > More context (may be different issue(s)) could be found at > > http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/ > > but currently I am consistently reproducing it while running > > git (1:2.11.0-3 debian stretch build) within debian stretch singularity > > environment [1]. > > External system is Centos 6.9, and git 1.7.1 (and installed in modules > > 2.0.4) do not show similar buggy behavior. > > NFS mounted partitions are bind mounted inside the sinularity space and > > when I try to do some git operations, I get that error inconsistently , e.g. > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > > master > > fatal: Out of memory, getdelim failed > > error: git://github.com/datalad/datalad did not send all necessary > > objects > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > > master > > fatal: Out of memory, getdelim failed > > error: git://github.com/datalad/datalad did not send all necessary > > objects > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > > master > > From git://github.com/datalad/datalad > > * branch master -> FETCH_HEAD > > fatal: Out of memory, getdelim failed > > and some times it succeeds. So it smells that some race condition > > somewhere...? > I doubt the type of file system matters. So far it has been a very consistent indicator. I did not manage to get this error while performing the same operation under /tmp (bind to local mounted drive), where it also feels going faster (again suggesting that original issue is some kind of a race) > The questions are: How much > main memory do you have, what is git trying to cram into it, is there > a way to reduce the memory footprint or do you need to add more RAM? > ... reordered ... > "free" and "ulimit -a" can help you find out how much memory you can > use. I think those aren't the reason: yhalchen@discovery:/mnt/scratch/yoh/datalad$ free -h totalusedfree shared buff/cache available Mem: 126G2.5G 90G652K 33G123G Swap: 127G1.7M127G yhalchen@discovery:/mnt/scratch/yoh/datalad$ ulimit unlimited > > any recommendations on how to pin point the "offender"? ;) > Running "GIT_TRACE=1 git pull --ff-only origin master" would be a > good start, I think, to find out which of the different activities > that pull is doing causes the out-of-memory error. samples of bad, and then good runs (from eyeballing -- the same until error message): yhalchen@discovery:/mnt/scratch/yoh$ cat git_trace_bad.log 14:05:25.782270 git.c:371 trace: built-in: git 'pull' '--ff-only' 'origin' 'master' 14:05:25.795036 run-command.c:350 trace: run_command: 'fetch' '--update-head-ok' 'origin' 'master' 14:05:25.795332 exec_cmd.c:116 trace: exec: 'git' 'fetch' '--update-head-ok' 'origin' 'master' 14:05:25.797212 git.c:371 trace: built-in: git 'fetch' '--update-head-ok' 'origin' 'master' 14:05:25.904088 run-command.c:350 trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' 14:05:26.085954 run-command.c:350 trace: run_command: 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103' 14:05:26.086333 exec_cmd.c:116 trace: exec: 'git' 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103' 14:05:26.088382 git.c:371 trace: built-in: git 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103' 14:05:26.133326 run-command.c:350 trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' 14:05:26.133688 exec_cmd.c:116 trace: exec: 'git' 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' 14:05:26.135493 git.c:371 trace: built-in: git 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' fatal: Out of memory, getdelim failed error: git://github.com/datalad/datalad did not send all necessary objects 14:05:26.138838 run-command.c:350 trace: run_command: 'gc' '--auto' 14:05:26.139131 exec_cmd.c:116 trace: exec: 'git' 'gc' '--auto' 14:05:26.141108 git.c:371 trace: built-in: git 'gc' '--auto' yhalchen@discovery:/mnt/scratch/yoh$ cat git_trace_good.log 14:05:37.851862 git.c:371 trace: built-in: git 'pull' '--ff-only' 'origin' 'master' 14:05:37.854250 run-command.c:350 trace: run_command: 'fetch' '--update-head-ok' 'origin' 'master' 14:05:37.854527 exec_cmd.c:116 trace: exec: 'git' 'fetch' '--update-head-ok' 'origin' 'master' 14:05:37.856389 git
Re: [PATCH V2 2/2] Convert size datatype to size_t
Hi Martin, On Thu, 10 Aug 2017, Martin Koegler wrote: > From: Martin Koegler > > It changes the signature of the core object access function > including any other functions to assure a clean compile if > sizeof(size_t) != sizeof(unsigned long). > > Signed-off-by: Martin Koegler > --- > For next. As this touches core functions, it will likely produce > conflicts with other changes. Please provide the commit you want > to rebase the patch on and I'll produce a V3. > > Includes changes from Johannes Schindelin. Thank you so much! Dscho
Re: [PATCH] apply: remove prefix_length member from apply_state
On Wed, Aug 9, 2017 at 5:54 PM, René Scharfe wrote: > Use a NULL-and-NUL check to see if we have a prefix and consistently use > C string functions on it instead of storing its length in a member of > struct apply_state. This avoids strlen() calls and simplifies the code. This looks like a good idea. > @@ -2088,10 +2087,9 @@ static int use_patch(struct apply_state *state, struct > patch *p) > int i; > > /* Paths outside are not touched regardless of "--include" */ > - if (0 < state->prefix_length) { > - int pathlen = strlen(pathname); > - if (pathlen <= state->prefix_length || > - memcmp(state->prefix, pathname, state->prefix_length)) > + if (state->prefix && *state->prefix) { > + const char *rest; > + if (!skip_prefix(pathname, state->prefix, &rest) || !*rest) > return 0; > } Yeah, or maybe declare "const char *rest;" just after "int i;" and then use: if (state->prefix && *state->prefix && (!skip_prefix(pathname, state->prefix, &rest) || !*rest)) return 0;
Re: [RFC] clang-format: outline the git project's coding style
Johannes Schindelin writes: > On Wed, 9 Aug 2017, Stefan Beller wrote: > >> > I am sure that something even better will be possible: a Continuous >> > "Integration" that fixes the coding style automatically by using >> > `filter-branch` (avoiding the merge conflicts that would arise if >> > `rebase -i` was used). >> >> I do not quite follow. Is that to be used by Junio while integrating >> branches? > > I was more thinking about a bot on GitHub. "Code cleanup as a service". I vaguely recall that there was a discussion to have SubmitGit wait for success from Travis CI; if that is already in place, then I can sort of see how it would help individual contributors to have the style checker in that pipeline as well. I have a mixed feelings about "fixing" styles automatically, though.
[PATCH] merge: use skip_prefix()
Get rid of a magic string length constant by using skip_prefix() instead of starts_with(). Signed-off-by: Rene Scharfe --- builtin/merge.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/builtin/merge.c b/builtin/merge.c index 900bafdb45..4facb6fdd0 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -1117,8 +1117,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix) * current branch. */ branch = branch_to_free = resolve_refdup("HEAD", 0, head_oid.hash, NULL); - if (branch && starts_with(branch, "refs/heads/")) - branch += 11; + if (branch) + skip_prefix(branch, "refs/heads/", &branch); if (!branch || is_null_oid(&head_oid)) head_commit = NULL; else -- 2.14.0
Not understanding with git wants to copy one file to another
I ran into a line in git commit ouput I had not see before #copied: d0/etc/hosts -> misc/old-readerHOSTvcs-files/etc/hosts So googling I learned that this might happen if git thinks the two files are the same. I was pretty sure they were not the same so checked them> diff d0/etc/host misc/old-readerHOSTvcs-files/etc/hosts The output is a bit long but shows them being quite different. Some 2 dozen or so lines that dramatically differ. Here are two that are at least kind of similar but would never be seen as the same: < 192.168.1.43 m2.local.lan m2 # 00-90-F5-A1-F9-E5 > 192.168.1.43m2.local.lanm2 # win 7 Not to mention they are quite different lines as well. So what is going on and what should I be looking at?
Re: [RFC] clang-format: outline the git project's coding style
On 08/10, Junio C Hamano wrote: > Johannes Schindelin writes: > > > On Wed, 9 Aug 2017, Stefan Beller wrote: > > > >> > I am sure that something even better will be possible: a Continuous > >> > "Integration" that fixes the coding style automatically by using > >> > `filter-branch` (avoiding the merge conflicts that would arise if > >> > `rebase -i` was used). > >> > >> I do not quite follow. Is that to be used by Junio while integrating > >> branches? > > > > I was more thinking about a bot on GitHub. "Code cleanup as a service". > > I vaguely recall that there was a discussion to have SubmitGit wait > for success from Travis CI; if that is already in place, then I can > sort of see how it would help individual contributors to have the > style checker in that pipeline as well. > > I have a mixed feelings about "fixing" styles automatically, though. > I still think we are far away from a world where we can fix style automatically. If we do want to keep pursuing this there are a number steps we'd want to take first. 1. Settle on a concrete style and document it using a formatter's rules (in say a .clang-format file). This style would most likely need to be tuned a little bit, at least the 'Penalty' configuration would need to be tuned which (as far as I understand it) is used to determine which rule to break first to ensure a line isn't too long. 2. Start getting contributors to use the tool to format their patches. This would include having some script or hook that a contributor could run to only format the sections of code that they touched. 3. Slowly the code base would begin to have a uniform style. At some point we may want to then reformat the remaining sections of the code base. At this point we could have some automated bot that fixes style. I'm sure I missed a step in there somewhere though. -- Brandon Williams
Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
On Tue, Aug 8, 2017 at 6:22 PM, Jonathan Tan wrote: > Here is the complete patch set. I have only moved the exported functions > that operate with packfiles and their static helpers - for example, > static functions like freshen_packed_object() that are used only by > non-pack-specific functions are not moved. > > In the end, 3 functions needed to be made global. They are > find_pack_entry(), mark_bad_packed_object(), and has_packed_and_bad(). > > Of the 3, find_pack_entry() is probably legitimately promoted. But I > think that the latter two functions needing to be accessed from > sha1_file.c points to a design that could be improved - they are only > used when packed_object_info() detects corruption, and used for marking > as bad and printing messages to the user respectively, which > packed_object_info() should probably do itself. But I have not made this > change in this patch set. > > (Other than the 3 functions above, there are some variables and > functions that are temporarily made global, but reduced back to static > when the wide scope is no longer needed.) I read through the patches yesterday and had no comment. Thanks, Stefan
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
On Thu, Aug 10, 2017 at 12:32 AM, Jeff King wrote: > On Wed, Aug 09, 2017 at 11:18:19AM -0700, Stefan Beller wrote: > >> On Wed, Aug 9, 2017 at 5:24 AM, Jeff King wrote: >> > It can be useful to invoke interpret-trailers for the >> > primary purpose of parsing existing trailers. But in that >> > case, we don't want to apply existing ifMissing or ifExists >> > rules from the config. Let's add a special mode where we >> > avoid applying those rules. Coupled with --only-trailers, >> > this gives us a reasonable parsing tool. >> >> I have the impression that the name is slightly misleading >> because 'only' just reduces the set. it does not enhance it. >> (Do we have a configuration that says "remove this trailer >> anytime"?) > > No, I think you can only add trailers via ifExists or ifMissing. > I actually called this --no-config originally, because to me it meant > "do not apply config". But the processing applies also to --trailer > arguments no the command line, which is how I ended up with > --only-existing. > >> So maybe this is rather worded as 'exact-trailers' ? > > I'm not fond of that, as it's vague about which exact trailers we're > talking about. I also thought of something like --verbatim, but I'd > worry that would seem to conflict with --normalize. > > I dunno. All of the names seem not quite descriptive enough to me. I meant 'exact' as in 'exactly from the patch/commit, no external influence such as config', so maybe '--from-patch' or '--from-commit' (which says the same as --no-config just the other way round. Having --no- in config options as the standard is a UX disaster IMHO as then we have to forbid the --no-no-X or reintroduce X and flip the default) Maybe --genuine ? > > -Peff
Re: [RFC] clang-format: outline the git project's coding style
Brandon Williams writes: > On 08/10, Junio C Hamano wrote: > >> I vaguely recall that there was a discussion to have SubmitGit wait >> for success from Travis CI; if that is already in place, then I can >> sort of see how it would help individual contributors to have the >> style checker in that pipeline as well. >> >> I have a mixed feelings about "fixing" styles automatically, though. > > I still think we are far away from a world where we can fix style > automatically. If we do want to keep pursuing this there are a number > steps we'd want to take first. > > 1. Settle on a concrete style and document it using a formatter's rules >(in say a .clang-format file). This style would most likely need to >be tuned a little bit, at least the 'Penalty' configuration would >need to be tuned which (as far as I understand it) is used to >determine which rule to break first to ensure a line isn't too long. Yes. I think this is what you started to get the ball rolling. Together with what checkpatch.pl already diagnoses, I think we can get a guideline that is more or less reasonable. > 2. Start getting contributors to use the tool to format their patches. >This would include having some script or hook that a contributor >could run to only format the sections of code that they touched. This, too. Running checkpatch.pl (possibly combined with a bit of tweaking it to match our needs) already catches many of the issues, so a tool with a similar interface would be easy to use, I would imagine. > 3. Slowly the code base would begin to have a uniform style. At >some point we may want to then reformat the remaining sections of the >code base. At this point we could have some automated bot that fixes >style. I suspect I am discussing this based on a different assumption. I think the primary goal of this effort is to make it easier to cleanse the new patches that appear on the list of trivial style issues, so that contributors and reviewers do not have to spend bandwidth and brain cycles during the review. And I have been assuming that we can do so even without waiting for a "tree wide" code churn on existing code to complete.
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
On Thu, Aug 10, 2017 at 10:27:19AM -0700, Stefan Beller wrote: > > I'm not fond of that, as it's vague about which exact trailers we're > > talking about. I also thought of something like --verbatim, but I'd > > worry that would seem to conflict with --normalize. > > > > I dunno. All of the names seem not quite descriptive enough to me. > > I meant 'exact' as in 'exactly from the patch/commit, no external > influence such as config', so maybe '--from-patch' or '--from-commit' > (which says the same as --no-config just the other way round. > Having --no- in config options as the standard is a UX disaster > IMHO as then we have to forbid the --no-no-X or reintroduce X > and flip the default) Yes, that was definitely the other reason I didn't want to call it "--no-config". :) It's not always from a patch or commit. The most accurate along those lines is "--from-input". > Maybe --genuine ? But in the greater context I think that's vague again; we don't know which part of the command's operation is "genuine". Perhaps "--exact-input" hits all of those. Or maybe "--only-input" to match the other "--only". I think I like that last one the best. It makes it clear that we are looking just at the input, and not anything else. Which is exactly what the feature does. -Peff
Re: [PATCH 3/4] http: drop support for curl < 7.19.4
On Thu, Aug 10, 2017 at 02:36:41PM +0200, Mischa POSLAWSKY wrote: > Jeff King wrote: > > -#if LIBCURL_VERSION_NUM >= 0x071301 > > curl_easy_setopt(result, CURLOPT_POSTREDIR, CURL_REDIR_POST_ALL); > > -#elif LIBCURL_VERSION_NUM >= 0x071101 > > curl_easy_setopt(result, CURLOPT_POST301, 1); > > -#endif > > This seems to be an unintended behavioural change: the second condition > wouldn't have applied previously and overrides the first option > (equivalent to CURLOPT_POSTREDIR = CURL_REDIR_POST_301). Thanks, you're right. I'll fix it in my re-roll. -Peff
Re: Not understanding with git wants to copy one file to another
On Thu, Aug 10, 2017 at 10:03 AM, Harry Putnam wrote: > I ran into a line in git commit ouput I had not see before > > #copied: d0/etc/hosts -> misc/old-readerHOSTvcs-files/etc/hosts > > So googling I learned that this might happen if git thinks the two > files are the same. > > I was pretty sure they were not the same so checked them> > > > > diff d0/etc/host misc/old-readerHOSTvcs-files/etc/hosts > > The output is a bit long but shows them being quite different. > > Some 2 dozen or so lines that dramatically differ. > > Here are two that are at least kind of similar but would never be seen > as the same: > > < 192.168.1.43 m2.local.lan m2 # 00-90-F5-A1-F9-E5 >> 192.168.1.43m2.local.lanm2 # win 7 > > Not to mention they are quite different lines as well. > > So what is going on and what should I be looking at? The diff machinery has a threshold for when it assumes a copy/move of a file. (e.g. "A file is assumed copied when at least 55% of lines are equal") https://git-scm.com/docs/git-diff See -C and -M option. git-status seems to use this machinery as well, but does not expose the options?
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
On Thu, Aug 10, 2017 at 10:33 AM, Jeff King wrote: > On Thu, Aug 10, 2017 at 10:27:19AM -0700, Stefan Beller wrote: > >> > I'm not fond of that, as it's vague about which exact trailers we're >> > talking about. I also thought of something like --verbatim, but I'd >> > worry that would seem to conflict with --normalize. >> > >> > I dunno. All of the names seem not quite descriptive enough to me. >> >> I meant 'exact' as in 'exactly from the patch/commit, no external >> influence such as config', so maybe '--from-patch' or '--from-commit' >> (which says the same as --no-config just the other way round. >> Having --no- in config options as the standard is a UX disaster >> IMHO as then we have to forbid the --no-no-X or reintroduce X >> and flip the default) > > Yes, that was definitely the other reason I didn't want to call it > "--no-config". :) > > It's not always from a patch or commit. The most accurate along those > lines is "--from-input". > >> Maybe --genuine ? > > But in the greater context I think that's vague again; we don't know > which part of the command's operation is "genuine". The input of course. ;) --genuine-input. > > Perhaps "--exact-input" hits all of those. Or maybe "--only-input" to > match the other "--only". > > I think I like that last one the best. It makes it clear that we are > looking just at the input, and not anything else. Which is exactly what > the feature does. Makes sense to me, Thanks, Stefan
[ANNOUNCE] Git v2.14.1, v2.13.5, and others
The latest maintenance release Git v2.14.1 is now available at the usual places, together with releases for older maintenance track for the same issue: v2.7.6, v2.8.6, v2.9.5, v2.10.4, v2.11.3, v2.12.4, and v2.13.5. These contain a security fix for CVE-2017-1000117, and are released in coordination with Subversion and Mercurial that share a similar issue. CVE-2017-9800 and CVE-2017-1000116 are assigned to these systems, respectively, for issues similar to it that are now addressed in their part of this coordinated release. The tarballs are found at: https://www.kernel.org/pub/software/scm/git/ The following public repositories all have a copy of these tags: url = https://kernel.googlesource.com/pub/scm/git/git url = git://repo.or.cz/alt-git.git url = https://github.com/gitster/git A malicious third-party can give a crafted "ssh://..." URL to an unsuspecting victim, and an attempt to visit the URL can result in any program that exists on the victim's machine being executed. Such a URL could be placed in the .gitmodules file of a malicious project, and an unsuspecting victim could be tricked into running "git clone --recurse-submodules" to trigger the vulnerability. Credits to find and fix the issue go to Brian Neel at GitLab, Joern Schneeweisz of Recurity Labs and Jeff King at GitHub. * A "ssh://..." URL can result in a "ssh" command line with a hostname that begins with a dash "-", which would cause the "ssh" command to instead (mis)treat it as an option. This is now prevented by forbidding such a hostname (which should not impact any real-world usage). * Similarly, when GIT_PROXY_COMMAND is configured, the command is run with host and port that are parsed out from "ssh://..." URL; a poorly written GIT_PROXY_COMMAND could be tricked into treating a string that begins with a dash "-" as an option. This is now prevented by forbidding such a hostname and port number (again, which should not impact any real-world usage). * In the same spirit, a repository name that begins with a dash "-" is also forbidden now.
Re: [PATCH 0/5] make interpret-trailers useful for parsing
On Thu, Aug 10, 2017 at 04:02:46AM -0400, Jeff King wrote: > On Wed, Aug 09, 2017 at 08:21:47AM -0400, Jeff King wrote: > > > This series teaches interpret-trailers to parse and output just the > > trailers. So now you can do: > > > > $ git log --format=%B -1 8d44797cc91231cd44955279040dc4a1ee0a797f | > > git interpret-trailers --parse > > Signed-off-by: Hartmut Henkel > > Helped-by: Stefan Beller > > Signed-off-by: Ralf Thielow > > Acked-by: Matthias Rüster > > And here's a v2 that addresses all of the comments except one: Stefan > suggested that --only-existing wasn't a great name. I agree, but I like > everything else less. Here's a v3 that takes care of that (renaming it to --only-input). It's otherwise the same as v2, but since the name-change ripples through the remaining patches, I wanted to get v3 in front of people sooner rather than later. [1/5]: trailer: put process_trailers() options into a struct [2/5]: interpret-trailers: add an option to show only the trailers [3/5]: interpret-trailers: add an option to show only existing trailers [4/5]: interpret-trailers: add an option to normalize output [5/5]: interpret-trailers: add --parse convenience option Documentation/git-interpret-trailers.txt | 34 +++--- builtin/interpret-trailers.c | 34 +++--- t/t7513-interpret-trailers.sh| 76 trailer.c| 68 ++-- trailer.h| 13 +- 5 files changed, 196 insertions(+), 29 deletions(-)
[PATCH v3 1/5] trailer: put process_trailers() options into a struct
We already have two options and are about to add a few more. To avoid having a huge number of boolean arguments, let's convert to an options struct which can be passed in. Signed-off-by: Jeff King --- builtin/interpret-trailers.c | 13 ++--- trailer.c| 10 ++ trailer.h| 10 +- 3 files changed, 21 insertions(+), 12 deletions(-) diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index 175f14797b..bb0d7b937a 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -18,13 +18,12 @@ static const char * const git_interpret_trailers_usage[] = { int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) { - int in_place = 0; - int trim_empty = 0; + struct process_trailer_options opts = PROCESS_TRAILER_OPTIONS_INIT; struct string_list trailers = STRING_LIST_INIT_NODUP; struct option options[] = { - OPT_BOOL(0, "in-place", &in_place, N_("edit files in place")), - OPT_BOOL(0, "trim-empty", &trim_empty, N_("trim empty trailers")), + OPT_BOOL(0, "in-place", &opts.in_place, N_("edit files in place")), + OPT_BOOL(0, "trim-empty", &opts.trim_empty, N_("trim empty trailers")), OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() @@ -36,11 +35,11 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) if (argc) { int i; for (i = 0; i < argc; i++) - process_trailers(argv[i], in_place, trim_empty, &trailers); + process_trailers(argv[i], &opts, &trailers); } else { - if (in_place) + if (opts.in_place) die(_("no input file given for in-place editing")); - process_trailers(NULL, in_place, trim_empty, &trailers); + process_trailers(NULL, &opts, &trailers); } string_list_clear(&trailers, 0); diff --git a/trailer.c b/trailer.c index 751b56c009..e21a0d1629 100644 --- a/trailer.c +++ b/trailer.c @@ -968,7 +968,9 @@ static FILE *create_in_place_tempfile(const char *file) return outfile; } -void process_trailers(const char *file, int in_place, int trim_empty, struct string_list *trailers) +void process_trailers(const char *file, + const struct process_trailer_options *opts, + struct string_list *trailers) { LIST_HEAD(head); LIST_HEAD(arg_head); @@ -980,7 +982,7 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str read_input_file(&sb, file); - if (in_place) + if (opts->in_place) outfile = create_in_place_tempfile(file); /* Print the lines before the trailers */ @@ -990,14 +992,14 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str process_trailers_lists(&head, &arg_head); - print_all(outfile, &head, trim_empty); + print_all(outfile, &head, opts->trim_empty); free_all(&head); /* Print the lines after the trailers as is */ fwrite(sb.buf + trailer_end, 1, sb.len - trailer_end, outfile); - if (in_place) + if (opts->in_place) if (rename_tempfile(&trailers_tempfile, file)) die_errno(_("could not rename temporary file to %s"), file); diff --git a/trailer.h b/trailer.h index 65cc5d79c6..9da00bedec 100644 --- a/trailer.h +++ b/trailer.h @@ -22,7 +22,15 @@ struct trailer_info { size_t trailer_nr; }; -void process_trailers(const char *file, int in_place, int trim_empty, +struct process_trailer_options { + int in_place; + int trim_empty; +}; + +#define PROCESS_TRAILER_OPTIONS_INIT {0} + +void process_trailers(const char *file, + const struct process_trailer_options *opts, struct string_list *trailers); void trailer_info_get(struct trailer_info *info, const char *str); -- 2.14.0.614.g0beb26d5e9
[PATCH v3 2/5] interpret-trailers: add an option to show only the trailers
In theory it's easy for any reader who wants to parse trailers to do so. But there are a lot of subtle corner cases around what counts as a trailer, when the trailer block begins and ends, etc. Since interpret-trailers already has our parsing logic, let's let callers ask it to just output the trailers. They still have to parse the "key: value" lines, but at least they can ignore all of the other corner cases. Signed-off-by: Jeff King --- Documentation/git-interpret-trailers.txt | 3 +++ builtin/interpret-trailers.c | 1 + t/t7513-interpret-trailers.sh| 39 trailer.c| 19 ++-- trailer.h| 1 + 5 files changed, 56 insertions(+), 7 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 31cdeaecdf..295dffbd21 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -80,6 +80,9 @@ OPTIONS trailer to the input messages. See the description of this command. +--only-trailers:: + Output only the trailers, not any other parts of the input. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index bb0d7b937a..afb12c11bc 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -24,6 +24,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) struct option options[] = { OPT_BOOL(0, "in-place", &opts.in_place, N_("edit files in place")), OPT_BOOL(0, "trim-empty", &opts.trim_empty, N_("trim empty trailers")), + OPT_BOOL(0, "only-trailers", &opts.only_trailers, N_("output only the trailers")), OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 0c6f91c433..90d30037b7 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -1275,4 +1275,43 @@ test_expect_success 'with cut line' ' test_cmp expected actual ' +test_expect_success 'only trailers' ' + git config trailer.sign.command "echo config-value" && + cat >expected <<-\EOF && + existing: existing-value + sign: config-value + added: added-value + EOF + git interpret-trailers \ + --trailer added:added-value \ + --only-trailers >actual <<-\EOF && + my subject + + my body + + existing: existing-value + EOF + test_cmp expected actual +' + +test_expect_success 'only-trailers omits non-trailer in middle of block' ' + git config trailer.sign.command "echo config-value" && + cat >expected <<-\EOF && + Signed-off-by: nobody + Signed-off-by: somebody + sign: config-value + EOF + git interpret-trailers --only-trailers >actual <<-\EOF && + subject + + it is important that the trailers below are signed-off-by + so that they meet the "25% trailers Git knows about" heuristic + + Signed-off-by: nobody + this is not a trailer + Signed-off-by: somebody + EOF + test_cmp expected actual +' + test_done diff --git a/trailer.c b/trailer.c index e21a0d1629..706351fc98 100644 --- a/trailer.c +++ b/trailer.c @@ -164,13 +164,15 @@ static void print_tok_val(FILE *outfile, const char *tok, const char *val) fprintf(outfile, "%s%c %s\n", tok, separators[0], val); } -static void print_all(FILE *outfile, struct list_head *head, int trim_empty) +static void print_all(FILE *outfile, struct list_head *head, + const struct process_trailer_options *opts) { struct list_head *pos; struct trailer_item *item; list_for_each(pos, head) { item = list_entry(pos, struct trailer_item, list); - if (!trim_empty || strlen(item->value) > 0) + if ((!opts->trim_empty || strlen(item->value) > 0) && + (!opts->only_trailers || item->token)) print_tok_val(outfile, item->token, item->value); } } @@ -897,9 +899,10 @@ static int process_input_file(FILE *outfile, trailer_info_get(&info, str); /* Print lines before the trailers as is */ - fwrite(str, 1, info.trailer_start - str, outfile); + if (outfile) + fwrite(str, 1, info.trailer_start - str, outfile); - if (!info.blank_line_before_trailer) + if (outfile && !info.blank_line_before_trailer) fprintf(outfile, "\n"); for (i = 0; i < info.trailer_nr; i++) { @@ -986,18 +989,20
[PATCH v3 3/5] interpret-trailers: add an option to show only existing trailers
It can be useful to invoke interpret-trailers for the primary purpose of parsing existing trailers. But in that case, we don't want to apply existing ifMissing or ifExists rules from the config. Let's add a special mode where we avoid applying those rules. Coupled with --only-trailers, this gives us a reasonable parsing tool. Signed-off-by: Jeff King --- Documentation/git-interpret-trailers.txt | 5 + builtin/interpret-trailers.c | 7 +++ t/t7513-interpret-trailers.sh| 16 trailer.c| 9 + trailer.h| 1 + 5 files changed, 34 insertions(+), 4 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 295dffbd21..7cc43b0e3e 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -83,6 +83,11 @@ OPTIONS --only-trailers:: Output only the trailers, not any other parts of the input. +--only-input:: + Output only trailers that exist in the input; do not add any + from the command-line or by following configured `trailer.*` + rules. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index afb12c11bc..2d90e0e480 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -25,6 +25,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "in-place", &opts.in_place, N_("edit files in place")), OPT_BOOL(0, "trim-empty", &opts.trim_empty, N_("trim empty trailers")), OPT_BOOL(0, "only-trailers", &opts.only_trailers, N_("output only the trailers")), + OPT_BOOL(0, "only-input", &opts.only_input, N_("do not apply config rules")), OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() @@ -33,6 +34,12 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, options, git_interpret_trailers_usage, 0); + if (opts.only_input && trailers.nr) + usage_msg_opt( + _("--trailer with --only-input does not make sense"), + git_interpret_trailers_usage, + options); + if (argc) { int i; for (i = 0; i < argc; i++) diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 90d30037b7..94b6c52473 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -1314,4 +1314,20 @@ test_expect_success 'only-trailers omits non-trailer in middle of block' ' test_cmp expected actual ' +test_expect_success 'only input' ' + git config trailer.sign.command "echo config-value" && + cat >expected <<-\EOF && + existing: existing-value + EOF + git interpret-trailers \ + --only-trailers --only-input >actual <<-\EOF && + my subject + + my body + + existing: existing-value + EOF + test_cmp expected actual +' + test_done diff --git a/trailer.c b/trailer.c index 706351fc98..2b5269183a 100644 --- a/trailer.c +++ b/trailer.c @@ -976,7 +976,6 @@ void process_trailers(const char *file, struct string_list *trailers) { LIST_HEAD(head); - LIST_HEAD(arg_head); struct strbuf sb = STRBUF_INIT; int trailer_end; FILE *outfile = stdout; @@ -992,9 +991,11 @@ void process_trailers(const char *file, trailer_end = process_input_file(opts->only_trailers ? NULL : outfile, sb.buf, &head); - process_command_line_args(&arg_head, trailers); - - process_trailers_lists(&head, &arg_head); + if (!opts->only_input) { + LIST_HEAD(arg_head); + process_command_line_args(&arg_head, trailers); + process_trailers_lists(&head, &arg_head); + } print_all(outfile, &head, opts); diff --git a/trailer.h b/trailer.h index 3cf35ced00..76c3b571bf 100644 --- a/trailer.h +++ b/trailer.h @@ -26,6 +26,7 @@ struct process_trailer_options { int in_place; int trim_empty; int only_trailers; + int only_input; }; #define PROCESS_TRAILER_OPTIONS_INIT {0} -- 2.14.0.614.g0beb26d5e9
[PATCH v3 5/5] interpret-trailers: add --parse convenience option
The last few commits have added command line options that can turn interpret-trailers into a parsing tool. Since they'd most often be used together, let's provide a convenient single option for callers to invoke this mode. This is implemented as a callback rather than a boolean so that its effect is applied immediately, as if those options had been specified. Later options can then override them. E.g.: git interpret-trailers --parse --no-normalize would work. Let's also update the documentation to make clear that this parsing mode behaves quite differently than the normal "add trailers to the input" mode. Signed-off-by: Jeff King --- Documentation/git-interpret-trailers.txt | 21 ++--- builtin/interpret-trailers.c | 12 2 files changed, 26 insertions(+), 7 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 598b74efd7..54c8b081c6 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -3,24 +3,27 @@ git-interpret-trailers(1) NAME -git-interpret-trailers - help add structured information into commit messages +git-interpret-trailers - add or parse structured information in commit messages SYNOPSIS [verse] -'git interpret-trailers' [--in-place] [--trim-empty] [(--trailer [(=|:)])...] [...] +'git interpret-trailers' [options] [(--trailer [(=|:)])...] [...] +'git interpret-trailers' [options] [--parse] [...] DESCRIPTION --- -Help adding 'trailers' lines, that look similar to RFC 822 e-mail +Help parsing or adding 'trailers' lines, that look similar to RFC 822 e-mail headers, at the end of the otherwise free-form part of a commit message. This command reads some patches or commit messages from either the - arguments or the standard input if no is specified. Then -this command applies the arguments passed using the `--trailer` -option, if any, to the commit message part of each input file. The -result is emitted on the standard output. + arguments or the standard input if no is specified. If +`--parse` is specified, the output consists of the parsed trailers. + +Otherwise, the this command applies the arguments passed using the +`--trailer` option, if any, to the commit message part of each input +file. The result is emitted on the standard output. Some configuration variables control the way the `--trailer` arguments are applied to each commit message and the way any existing trailer in @@ -93,6 +96,10 @@ OPTIONS line, with any existing whitespace continuation folded into a single line. +--parse:: + A convenience alias for `--only-trailers --only-input + --normalize`. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index b7592bedd9..010c181492 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -16,6 +16,16 @@ static const char * const git_interpret_trailers_usage[] = { NULL }; +static int parse_opt_parse(const struct option *opt, const char *arg, + int unset) +{ + struct process_trailer_options *v = opt->value; + v->only_trailers = 1; + v->only_input = 1; + v->normalize = 1; + return 0; +} + int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) { struct process_trailer_options opts = PROCESS_TRAILER_OPTIONS_INIT; @@ -27,6 +37,8 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "only-trailers", &opts.only_trailers, N_("output only the trailers")), OPT_BOOL(0, "only-input", &opts.only_input, N_("do not apply config rules")), OPT_BOOL(0, "normalize", &opts.normalize, N_("normalize trailer formatting")), + { OPTION_CALLBACK, 0, "parse", &opts, NULL, N_("set parsing options"), + PARSE_OPT_NOARG | PARSE_OPT_NONEG, parse_opt_parse }, OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() -- 2.14.0.614.g0beb26d5e9
[PATCH v3 4/5] interpret-trailers: add an option to normalize output
The point of "--only-trailers" is to give a caller an output that's easy for them to parse. Getting rid of the non-trailer material helps, but we still may see more complicated syntax like whitespace continuation. Let's add an option to normalize the output into one "key: value" line per trailer. As a bonus, this could be used even without --only-trailers to clean up unusual formatting in the incoming data. Signed-off-by: Jeff King --- Documentation/git-interpret-trailers.txt | 5 + builtin/interpret-trailers.c | 1 + t/t7513-interpret-trailers.sh| 21 trailer.c| 34 +++- trailer.h| 1 + 5 files changed, 61 insertions(+), 1 deletion(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 7cc43b0e3e..598b74efd7 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -88,6 +88,11 @@ OPTIONS from the command-line or by following configured `trailer.*` rules. +--normalize:: + Normalize trailer output so that trailers appear exactly one per + line, with any existing whitespace continuation folded into a + single line. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index 2d90e0e480..b7592bedd9 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -26,6 +26,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "trim-empty", &opts.trim_empty, N_("trim empty trailers")), OPT_BOOL(0, "only-trailers", &opts.only_trailers, N_("output only the trailers")), OPT_BOOL(0, "only-input", &opts.only_input, N_("do not apply config rules")), + OPT_BOOL(0, "normalize", &opts.normalize, N_("normalize trailer formatting")), OPT_STRING_LIST(0, "trailer", &trailers, N_("trailer"), N_("trailer(s) to add")), OPT_END() diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 94b6c52473..37c4f37882 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -1330,4 +1330,25 @@ test_expect_success 'only input' ' test_cmp expected actual ' +test_expect_success 'normalize' ' + cat >expected <<-\EOF && + foo: continued across several lines + EOF + # pass through tr to make leading and trailing whitespace more obvious + tr _ " " <<-\EOF | + my subject + + my body + + foo:_ + __continued + ___across + several + _lines + ___ + EOF + git interpret-trailers --only-trailers --only-input --normalize >actual && + test_cmp expected actual +' + test_done diff --git a/trailer.c b/trailer.c index 2b5269183a..07ef082284 100644 --- a/trailer.c +++ b/trailer.c @@ -887,7 +887,35 @@ static int ends_with_blank_line(const char *buf, size_t len) return is_blank_line(buf + ll); } +static void normalize_value(struct strbuf *val) +{ + struct strbuf out = STRBUF_INIT; + size_t i; + + strbuf_grow(&out, val->len); + i = 0; + while (i < val->len) { + char c = val->buf[i++]; + if (c == '\n') { + /* Collapse continuation down to a single space. */ + while (i < val->len && isspace(val->buf[i])) + i++; + strbuf_addch(&out, ' '); + } else { + strbuf_addch(&out, c); + } + } + + /* Empty lines may have left us with whitespace cruft at the edges */ + strbuf_trim(&out); + + /* output goes back to val as if we modified it in-place */ + strbuf_swap(&out, val); + strbuf_release(&out); +} + static int process_input_file(FILE *outfile, + int normalize, const char *str, struct list_head *head) { @@ -914,12 +942,16 @@ static int process_input_file(FILE *outfile, if (separator_pos >= 1) { parse_trailer(&tok, &val, NULL, trailer, separator_pos); + if (normalize) + normalize_value(&val); add_trailer_item(head, strbuf_detach(&tok, NULL), strbuf_detach(&val, NULL)); } else { strbuf_addstr(&val, trailer); strbuf_strip_suffix(&val, "\n"); + if (normalize) +
[PATCH 3/4] Convert zlib.c to size_t
From: Martin Koegler Signed-off-by: Martin Koegler --- builtin/pack-objects.c | 10 +- cache.h| 12 ++-- pack-check.c | 6 +++--- pack.h | 2 +- sha1_file.c| 6 +++--- wrapper.c | 8 zlib.c | 8 7 files changed, 26 insertions(+), 26 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d94fd17..aa70f80 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -222,15 +222,15 @@ static void copy_pack_data(struct sha1file *f, struct packed_git *p, struct pack_window **w_curs, off_t offset, - off_t len) + size_t len) { unsigned char *in; - unsigned long avail; + size_t avail; while (len) { in = use_pack(p, w_curs, offset, &avail); if (avail > len) - avail = (unsigned long)len; + avail = len; sha1write(f, in, avail); offset += avail; len -= avail; @@ -1388,8 +1388,8 @@ static void check_object(struct object_entry *entry) struct pack_window *w_curs = NULL; const unsigned char *base_ref = NULL; struct object_entry *base_entry; - unsigned long used, used_0; - unsigned long avail; + size_t used, used_0; + size_t avail; off_t ofs; unsigned char *buf, c; diff --git a/cache.h b/cache.h index 26a3eaa..9185763 100644 --- a/cache.h +++ b/cache.h @@ -42,10 +42,10 @@ #include typedef struct git_zstream { z_stream z; - unsigned long avail_in; - unsigned long avail_out; - unsigned long total_in; - unsigned long total_out; + size_t avail_in; + size_t avail_out; + size_t total_in; + size_t total_out; unsigned char *next_in; unsigned char *next_out; } git_zstream; @@ -62,7 +62,7 @@ void git_deflate_end(git_zstream *); int git_deflate_abort(git_zstream *); int git_deflate_end_gently(git_zstream *); int git_deflate(git_zstream *, int flush); -unsigned long git_deflate_bound(git_zstream *, unsigned long); +size_t git_deflate_bound(git_zstream *, size_t); /* The length in bytes and in hex digits of an object name (SHA-1 value). */ #define GIT_SHA1_RAWSZ 20 @@ -1678,7 +1678,7 @@ extern int open_pack_index(struct packed_git *); */ extern void close_pack_index(struct packed_git *); -extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); +extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, size_t *); extern void close_pack_windows(struct packed_git *); extern void close_all_packs(void); extern void unuse_pack(struct pack_window **); diff --git a/pack-check.c b/pack-check.c index 6f7714f..8cc2364 100644 --- a/pack-check.c +++ b/pack-check.c @@ -24,13 +24,13 @@ static int compare_entries(const void *e1, const void *e2) } int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, - off_t offset, off_t len, unsigned int nr) + off_t offset, size_t len, unsigned int nr) { const uint32_t *index_crc; uint32_t data_crc = crc32(0, NULL, 0); do { - unsigned long avail; + size_t avail; void *data = use_pack(p, w_curs, offset, &avail); if (avail > len) avail = len; @@ -65,7 +65,7 @@ static int verify_packfile(struct packed_git *p, git_SHA1_Init(&ctx); do { - unsigned long remaining; + size_t remaining; unsigned char *in = use_pack(p, w_curs, offset, &remaining); offset += remaining; if (!pack_sig_ofs) diff --git a/pack.h b/pack.h index 8294341..ca89ad0 100644 --- a/pack.h +++ b/pack.h @@ -78,7 +78,7 @@ struct progress; typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned long, void*, int*); extern const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *, const unsigned char *sha1); -extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr); +extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, size_t len, unsigned int nr); extern int verify_pack_index(struct packed_git *); extern int verify_pack(struct packed_git *, verify_fn fn, struct progress *, uint32_t); extern off_t write_pack_header(struct sha1file *f, uint32_t); diff --git a/sha1_file.c b/sha1_file.c index 97b39b0..3428172 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1228,7 +1228,7 @@ static int in_window(struct pack_window *win, off_t offset)
[PATCH 4/4] Fix delta offset overflow
From: Martin Koegler Prevent generating delta offsets beyond 4G. Signed-off-by: Martin Koegler --- diff-delta.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/diff-delta.c b/diff-delta.c index 3d5e1ef..633883e 100644 --- a/diff-delta.c +++ b/diff-delta.c @@ -454,6 +454,9 @@ create_delta(const struct delta_index *index, moff += msize; msize = left; + if (moff > 0x) + msize = 0; + if (msize < 4096) { int j; val = 0; -- 2.1.4
Re: Not understanding with git wants to copy one file to another
Stefan Beller writes: > On Thu, Aug 10, 2017 at 10:03 AM, Harry Putnam wrote: [...] Harry wrote: >> Here are two that are at least kind of similar but would never be seen >> as the same: >> >> < 192.168.1.43 m2.local.lan m2 # 00-90-F5-A1-F9-E5 >>> 192.168.1.43m2.local.lanm2 # win 7 Stefan B replied: > The diff machinery has a threshold for when it assumes > a copy/move of a file. (e.g. "A file is assumed copied when > at least 55% of lines are equal") > > https://git-scm.com/docs/git-diff > > See -C and -M option. > > git-status seems to use this machinery as well, but does > not expose the options? Well, now I'm even more confused. What actually happens? Is either file changed? Is only one file kept? On the surface it sounds like complete anathema to what git is all about. However, I know a tool this sophisticated is not doing something just outright stupid... so must be really missing the point here. I get the way you can make -M stricter or not... but I didn't call git-diff to see that copy thing comeup. I called git commit. There must be some way to set stricter guidlines to calling things copies. But then I must really not get it because it still seems almost silly to consider one file a copy of another if only 55% is the same. What am I missing?
Re: [PATCH v3 2/5] interpret-trailers: add an option to show only the trailers
On Thu, Aug 10, 2017 at 11:04 AM, Jeff King wrote: > In theory it's easy for any reader who wants to parse > trailers to do so. But there are a lot of subtle corner > cases around what counts as a trailer, when the trailer > block begins and ends, etc. Since interpret-trailers already > has our parsing logic, let's let callers ask it to just > output the trailers. > > They still have to parse the "key: value" lines, but at > least they can ignore all of the other corner cases. > > Signed-off-by: Jeff King Sorry for a sloppy review last round, upon reviewing I found another nit. > > +test_expect_success 'only trailers' ' > + git config trailer.sign.command "echo config-value" && You may want to use 'test_config' here, which keeps the config only for one test. The subsequent tests seem to overwrite the config, so this is not wrong, just not the best style.
Re: [PATCH v3 2/5] interpret-trailers: add an option to show only the trailers
On Thu, Aug 10, 2017 at 11:28:52AM -0700, Stefan Beller wrote: > > +test_expect_success 'only trailers' ' > > + git config trailer.sign.command "echo config-value" && > > You may want to use 'test_config' here, which keeps the config > only for one test. The subsequent tests seem to overwrite the > config, so this is not wrong, just not the best style. Yeah, I actually considered that but decided to keep style with the rest of the script. I agree the whole thing could possibly switch to test_config, but I suspect there may be some fallouts (the style of the rest of the script seems to assume some continuity between the tests). -Peff
[PATCH v2 1/2] format-patch: have progress option while generating patches
When generating patches for the rebase command if the user does not realize the branch they are rebasing onto is thousands of commits different there is no progress indication after initial rewinding message. The progress meter as presented in this patch assumes the thousands of patches to have a fine granularity as well as assuming to require all the same amount of work/time for each, such that a steady progress bar is achieved. We do not want to estimate the time for each patch based e.g. on their size or number of touched files (or parents) as that is too expensive for just a progress meter. This patch allows a progress option to be passed to format-patch so that the user can be informed the progress of generating the patch. This option is then used by the rebase command when calling format-patch. Signed-off-by: Kevin Willford --- Documentation/git-format-patch.txt | 4 builtin/log.c | 10 ++ 2 files changed, 14 insertions(+) diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt index c890328b02..6cbe462a77 100644 --- a/Documentation/git-format-patch.txt +++ b/Documentation/git-format-patch.txt @@ -23,6 +23,7 @@ SYNOPSIS [(--reroll-count|-v) ] [--to=] [--cc=] [--[no-]cover-letter] [--quiet] [--notes[=]] + [--progress] [] [ | ] @@ -283,6 +284,9 @@ you can use `--suffix=-patch` to get `0001-description-of-my-change-patch`. range are always formatted as creation patches, independently of this flag. +--progress:: + Show progress reports on stderr as patches are generated. + CONFIGURATION - You can specify extra mail header lines to be added to each message, diff --git a/builtin/log.c b/builtin/log.c index 725c7b8a1a..b07a5529c2 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -27,6 +27,7 @@ #include "version.h" #include "mailmap.h" #include "gpg-interface.h" +#include "progress.h" /* Set a default date-time format for git log ("log.date" config variable) */ static const char *default_date_mode = NULL; @@ -1422,6 +1423,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) char *branch_name = NULL; char *base_commit = NULL; struct base_tree_info bases; + int show_progress = 0; + struct progress *progress = NULL; const struct option builtin_format_patch_options[] = { { OPTION_CALLBACK, 'n', "numbered", &numbered, NULL, @@ -1493,6 +1496,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) OPT_FILENAME(0, "signature-file", &signature_file, N_("add a signature from a file")), OPT__QUIET(&quiet, N_("don't print the patch filenames")), + OPT_BOOL(0, "progress", &show_progress, +N_("show progress while generating patches")), OPT_END() }; @@ -1752,8 +1757,12 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) start_number--; } rev.add_signoff = do_signoff; + + if (show_progress) + progress = start_progress_delay(_("Generating patches"), total, 0, 1); while (0 <= --nr) { int shown; + display_progress(progress, total - nr); commit = list[nr]; rev.nr = total - nr + (start_number - 1); /* Make the second and subsequent mails replies to the first */ @@ -1818,6 +1827,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) if (!use_stdout) fclose(rev.diffopt.file); } + stop_progress(&progress); free(list); free(branch_name); string_list_clear(&extra_to, 0); -- 2.14.0.rc0.286.g44127d70e4
[PATCH v2 2/2] rebase: turn on progress option by default for format-patch
This change passes the progress option of format-patch checking that stderr is attached and rebase is not being run in quiet mode. Signed-off-by: Kevin Willford --- git-rebase--am.sh | 1 + git-rebase.sh | 6 ++ 2 files changed, 7 insertions(+) diff --git a/git-rebase--am.sh b/git-rebase--am.sh index 375239341f..ff98fe3a73 100644 --- a/git-rebase--am.sh +++ b/git-rebase--am.sh @@ -53,6 +53,7 @@ else git format-patch -k --stdout --full-index --cherry-pick --right-only \ --src-prefix=a/ --dst-prefix=b/ --no-renames --no-cover-letter \ + $git_format_patch_opt \ "$revisions" ${restrict_revision+^$restrict_revision} \ >"$GIT_DIR/rebased-patches" ret=$? diff --git a/git-rebase.sh b/git-rebase.sh index f8b3d1fd97..ad8415e3cf 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -74,6 +74,7 @@ test "$(git config --bool rebase.stat)" = true && diffstat=t autostash="$(git config --bool rebase.autostash || echo false)" fork_point=auto git_am_opt= +git_format_patch_opt= rebase_root= force_rebase= allow_rerere_autoupdate= @@ -445,6 +446,11 @@ else state_dir="$apply_dir" fi +if test -t 2 && test -z "$GIT_QUIET" +then + git_format_patch_opt="$git_format_patch_opt --progress" +fi + if test -z "$rebase_root" then case "$#" in -- 2.14.0.rc0.286.g44127d70e4
[PATCH v2 0/2] Add progress for format-patch and rebase
Changes since last patch: 1. Use start_progress_delay so progress isn't shown if generating the patches is fast enough 2. Updated to have text of "Generating patches" 3. Only show progress when the --progress flag is passed 4. In the rebase script check stderr and the quiet option is not set before propagating the progress flag to format-patch Kevin Willford (2): format-patch: have progress option while generating patches rebase: turn on progress option by default for format-patch Documentation/git-format-patch.txt | 4 builtin/log.c | 10 ++ git-rebase--am.sh | 1 + git-rebase.sh | 6 ++ 4 files changed, 21 insertions(+) -- 2.14.0.rc0.286.g44127d70e4
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 1:03 AM, Jeff King wrote: > The point of "--only-trailers" is to give a caller an output > that's easy for them to parse. Getting rid of the > non-trailer material helps, but we still may see more > complicated syntax like whitespace continuation. Let's add > an option to normalize the output into one "key: value" line > per trailer. > > As a bonus, this could be used even without --only-trailers > to clean up unusual formatting in the incoming data. This is useful for the parsing part, but for the writing part we'd rather want to have the opposite thing, such as '--line-break=rfc822'. But this doesn't have to be part of this series. With this in mind, I do not quite understand the latter use case how you would use normalized trailers without --only-trailers?
Re: [PATCH 0/5] make interpret-trailers useful for parsing
On Thu, Aug 10, 2017 at 11:03 AM, Jeff King wrote: > On Thu, Aug 10, 2017 at 04:02:46AM -0400, Jeff King wrote: > >> On Wed, Aug 09, 2017 at 08:21:47AM -0400, Jeff King wrote: >> >> > This series teaches interpret-trailers to parse and output just the >> > trailers. So now you can do: >> > >> > $ git log --format=%B -1 8d44797cc91231cd44955279040dc4a1ee0a797f | >> > git interpret-trailers --parse >> > Signed-off-by: Hartmut Henkel >> > Helped-by: Stefan Beller >> > Signed-off-by: Ralf Thielow >> > Acked-by: Matthias Rüster >> >> And here's a v2 that addresses all of the comments except one: Stefan >> suggested that --only-existing wasn't a great name. I agree, but I like >> everything else less. > > Here's a v3 that takes care of that (renaming it to --only-input). > > It's otherwise the same as v2, but since the name-change ripples through > the remaining patches, I wanted to get v3 in front of people sooner > rather than later. > Looks good, Thanks, Stefan
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 11:35:00AM -0700, Stefan Beller wrote: > On Thu, Aug 10, 2017 at 1:03 AM, Jeff King wrote: > > The point of "--only-trailers" is to give a caller an output > > that's easy for them to parse. Getting rid of the > > non-trailer material helps, but we still may see more > > complicated syntax like whitespace continuation. Let's add > > an option to normalize the output into one "key: value" line > > per trailer. > > > > As a bonus, this could be used even without --only-trailers > > to clean up unusual formatting in the incoming data. > > This is useful for the parsing part, but for the writing part we'd > rather want to have the opposite thing, such as > '--line-break=rfc822'. But this doesn't have to be part of this > series. With this in mind, I do not quite understand the latter > use case how you would use normalized trailers without > --only-trailers? If you prefer the normalized form (and the input was line-broken in a way that you don't like), then this would convert to your preferred form. I agree that you could potentially want the opposite (folding long lines). Perhaps something like --wrap=72. -Peff
Re: [PATCH 0/5] make interpret-trailers useful for parsing
Jeff King writes: >> > The above example made me wonder if we also want a format specifier >> > to do the above without piping, but it turns out that we already >> > have "log --format=%(trailers)", so we are good ;-) >> >> I was going to say, I thought we had a way to get trailers for a >> commit via the pretty format, since that is what i used in the past. > > I do like that you could get the trailers for many commits in a single > invocation. That doesn't matter for my current use-case, but obviously > piping through O(n) interpret-trailers invocations is a bad idea. > But there are a few difficulties with using %(trailers) for this,... I think it is clear to you, but it may not be clear to others, that I did not mean to say "because 'log --format' already knows about it, this change to interpret-trailers is unnecessary". > For (1) I think many callers would prefer to see the original > formatting. Maybe we'd need a %(trailers:normalize) or something. Thanks; that is exactly the line of thought I had in the back of my head without even realizing when I brought up %(trailers) format element.
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
Jeff King writes: > Perhaps "--exact-input" hits all of those. Or maybe "--only-input" to > match the other "--only". > > I think I like that last one the best. It makes it clear that we are > looking just at the input, and not anything else. Which is exactly what > the feature does. Sounds good. Thanks.
[PATCH] cache-tree: remove use of strbuf_addf in update_one
String formatting can be a performance issue when there are hundreds of thousands of trees. Change to stop using the strbuf_addf and just add the strings or characters individually. There are a limited number of modes so added a switch for the known ones and a default case if something comes through that are not a known one for git. Signed-off-by: Kevin Willford --- cache-tree.c | 24 +++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/cache-tree.c b/cache-tree.c index 2440d1dc89..41744b3db7 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -390,7 +390,29 @@ static int update_one(struct cache_tree *it, continue; strbuf_grow(&buffer, entlen + 100); - strbuf_addf(&buffer, "%o %.*s%c", mode, entlen, path + baselen, '\0'); + + switch (mode) { + case 0100644: + strbuf_add(&buffer, "100644 ", 7); + break; + case 0100664: + strbuf_add(&buffer, "100664 ", 7); + break; + case 0100755: + strbuf_add(&buffer, "100755 ", 7); + break; + case 012: + strbuf_add(&buffer, "12 ", 7); + break; + case 016: + strbuf_add(&buffer, "16 ", 7); + break; + default: + strbuf_addf(&buffer, "%o ", mode); + break; + } + strbuf_add(&buffer, path + baselen, entlen); + strbuf_addch(&buffer, '\0'); strbuf_add(&buffer, sha1, 20); #if DEBUG -- 2.14.0.rc0.286.g44127d70e4
Re: Not understanding with git wants to copy one file to another
On Thu, Aug 10, 2017 at 11:18 AM, Harry Putnam wrote: > Stefan Beller writes: > >> On Thu, Aug 10, 2017 at 10:03 AM, Harry Putnam wrote: > > [...] > > Harry wrote: >>> Here are two that are at least kind of similar but would never be seen >>> as the same: >>> >>> < 192.168.1.43 m2.local.lan m2 # 00-90-F5-A1-F9-E5 192.168.1.43m2.local.lanm2 # win 7 > > Stefan B replied: >> The diff machinery has a threshold for when it assumes >> a copy/move of a file. (e.g. "A file is assumed copied when >> at least 55% of lines are equal") >> >> https://git-scm.com/docs/git-diff >> >> See -C and -M option. >> >> git-status seems to use this machinery as well, but does >> not expose the options? > > Well, now I'm even more confused. What actually happens? Is either > file changed? Is only one file kept? > > On the surface it sounds like complete anathema to what git is all > about. > > However, I know a tool this sophisticated is not doing something just > outright stupid... so must be really missing the point here. > > I get the way you can make -M stricter or not... but I didn't call > git-diff to see that copy thing comeup. > > I called git commit. Ah. Sorry for confusing even more. By pointing out the options for git-diff, I just wanted to point out that such a mechanism ("rename/copy detection") exists. The output of git-status is similar to a dry run of git-commit, and apparently this detection is used there. > > There must be some way to set stricter guidlines to calling things > copies. Well from Gits perspective it is really hard to tell if it was a copy, or if it was similar incidentally (because the format/content of these files happen to follow some strict guidelines). The user could have moved/copied a file outside of Git (instead of git-mv, you'd use tools provided by your operating system to copy a file). Or the user could have written a file that is similar by chance. However that doesn't really matter, as Git tracks the content, and not how the file evolved. Consider the copy/move/rename detection as a heuristic, that wants to help the user, but may be mistaken. > > But then I must really not get it because it still seems almost silly > to consider one file a copy of another if only 55% is the same. > > What am I missing? > https://www.reddit.com/r/git/comments/3ogkk1/beginner_disable_rename_detection/ "Rename detection is just GUI sugar".
Re: [PATCH v3 2/5] interpret-trailers: add an option to show only the trailers
Jeff King writes: > On Thu, Aug 10, 2017 at 11:28:52AM -0700, Stefan Beller wrote: > >> > +test_expect_success 'only trailers' ' >> > + git config trailer.sign.command "echo config-value" && >> >> You may want to use 'test_config' here, which keeps the config >> only for one test. The subsequent tests seem to overwrite the >> config, so this is not wrong, just not the best style. > > Yeah, I actually considered that but decided to keep style with the rest > of the script. I agree the whole thing could possibly switch to > test_config, but I suspect there may be some fallouts (the style of the > rest of the script seems to assume some continuity between the tests). I agree with your judgment here. As you said in the "use tool to enforce consistent style" thread, sometimes humans need to apply better taste than mechanical tooling could, and I view this one of a good example. Even though this is not a C coding-style thing, but the essense of the problem is the same. That is one of the reasons why I earlier said that we may see more style-only critique to patches if we introduce a new tool without setting the expectation of what we want to get out of such a tool straight.
[PATCH v1 1/1] dir: teach status to show ignored directories
Teach Git to optionally show ignored directories when showing all untracked files. The git status command exposes the options to report ignored and/or untracked files. However, when reporting all untracked files (--untracked-files=all), all individual ignored files are reported as well. It is not currently possible to get the reporting behavior of the --ignored flag, while also reporting all untracked files. This change exposes a flag to report all untracked files while not showing individual files in ignored directories. Motivation: Our application (Visual Studio) needs all untracked files listed individually, but does not need all ignored files listed individually. Reporting all ignored files can affect the time it takes for status to run. For a representative repository, here are some measurements showing a large perf improvement for this scenario: | Command | Reported ignored entries | Time (s) | | --- | | | | 1 | 0| 1.3 | | 2 | 1024 | 4.2 | | 3 | 174904 | 7.5 | | 4 | 1046 | 1.6 | Commands: 1) status 2) status --ignored 3) status --ignored --untracked-files=all 4) status --ignored --untracked-files=all --show-ignored-directory This changes exposes a --show-ignored-directory flag to the git status command. This flag is utilized when running git status with the --ignored and --untracked-files options to not list ignored individual ignored files contained in directories that match an ignore pattern. Part of the perf improvement comes from the tweak to read_directory_recursive to stop scanning the file system after it encounters the first file. When a directory is ignored, all it needs to determine is if the directory is empty or not. The logic currently keeps scanning the file system until it finds an untracked file. However, as the directory is ignored, all the contained contents are also marked excluded. For ignored directories that contain a large number of files, this can take some time. Signed-off-by: Jameson Miller --- Documentation/git-status.txt | 5 + Documentation/technical/api-directory-listing.txt | 6 + builtin/commit.c | 4 + dir.c | 48 +- dir.h | 3 +- t/t7519-status-show-ignored-directory.sh | 189 ++ wt-status.c | 4 + wt-status.h | 1 + 8 files changed, 253 insertions(+), 7 deletions(-) create mode 100755 t/t7519-status-show-ignored-directory.sh diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt index d47f198f15..48ebe18571 100644 --- a/Documentation/git-status.txt +++ b/Documentation/git-status.txt @@ -100,6 +100,11 @@ configuration variable documented in linkgit:git-config[1]. --ignored:: Show ignored files as well. +--show-ignored-directory:: +Show directories that are ignored, instead of individual files. +Does not recurse into excluded directories when listing all +untracked files. + -z:: Terminate entries with NUL, instead of LF. This implies the `--porcelain=v1` output format if no other format is given. diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt index 6c77b4920c..a9816df44c 100644 --- a/Documentation/technical/api-directory-listing.txt +++ b/Documentation/technical/api-directory-listing.txt @@ -33,6 +33,12 @@ The notable options are: Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]` in addition to untracked files in `entries[]`. +`DIR_SHOW_IGNORED_DIRECTORY`::: + + If this is set, non-empty directories that match an ignore pattern are + returned. The individual files contained in ignored directories are not + included. + `DIR_KEEP_UNTRACKED_CONTENTS`::: Only has meaning if `DIR_SHOW_IGNORED_TOO` is also set; if this is set, the diff --git a/builtin/commit.c b/builtin/commit.c index 8e93802511..9fcf77ae3a 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1333,6 +1333,7 @@ static int git_status_config(const char *k, const char *v, void *cb) int cmd_status(int argc, const char **argv, const char *prefix) { + static int show_ignored_directory = 0; static struct wt_status s; int fd; struct object_id oid; @@ -1362,6 +1363,8 @@ int cmd_status(int argc, const char **argv, const char *prefix) N_("ignore changes to submodules, optional when: all, dirty, untracked. (Default: all)"), PARSE_OPT_OPTARG, NULL, (intptr_t)"all" }, OPT_COLUMN(0, "column", &s.colopts, N_("list untracked files in columns")), + OPT_BOOL(0, "show-ignored-directory", &show_i
[PATCH v1 0/1] Teach status to show ignored directories
Our application (Visual Studio) needs to run git status with options to report untracked and ignored files. It needs all untracked files reported individually, but would rather not have all individual ignored files under explicitly ignored directories reported. Directories that match an ignore pattern should be reported, instead of all contained files. It is not possible to get this output with the current arguments available to git status. You can get ignored files (--ignored), all untracked files (--untracked-files=all), but if you specify both options then you will also get all individual ignored files. This change is to add the option to have git status report all untracked files while also reporting directories that match an ignore pattern. The logic in dir.c was also modified to not iterate over all files in an ignored directory. This change resulted in a performance improvement in work directories with a large number of files contained in ignored directories. For example, this setup is present when bin and obj directories are contained in the repository work dir. Jameson Miller (1): dir: teach status to show ignored directories Documentation/git-status.txt | 5 + Documentation/technical/api-directory-listing.txt | 6 + builtin/commit.c | 4 + dir.c | 48 +- dir.h | 3 +- t/t7519-status-show-ignored-directory.sh | 189 ++ wt-status.c | 4 + wt-status.h | 1 + 8 files changed, 253 insertions(+), 7 deletions(-) create mode 100755 t/t7519-status-show-ignored-directory.sh -- 2.11.0
[PATCH] commit: skip discarding the index if there is no pre-commit hook
If there is not a pre-commit hook, there is no reason to discard the index and reread it. This change checks to presence of a pre-commit hook and then only discards the index if there was one. Signed-off-by: Kevin Willford --- builtin/commit.c | 29 + 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index e7a2cb6285..443949d87b 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -671,12 +671,22 @@ static int prepare_to_commit(const char *index_file, const char *prefix, const char *hook_arg2 = NULL; int clean_message_contents = (cleanup_mode != CLEANUP_NONE); int old_display_comment_prefix; + const char *precommit_hook = NULL; /* This checks and barfs if author is badly specified */ determine_author_info(author_ident); - if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL)) - return 0; + + if (!no_verify) { + /* +* Check to see if there is a pre-commit hook +* If there not one we can skip discarding the index later on +*/ + precommit_hook = find_hook("pre-commit"); + if (precommit_hook && + run_commit_hook(use_editor, index_file, "pre-commit", NULL)) + return 0; + } if (squash_message) { /* @@ -940,12 +950,15 @@ static int prepare_to_commit(const char *index_file, const char *prefix, return 0; } - /* -* Re-read the index as pre-commit hook could have updated it, -* and write it out as a tree. We must do this before we invoke -* the editor and after we invoke run_status above. -*/ - discard_cache(); + if (!no_verify && precommit_hook) { + /* +* Re-read the index as pre-commit hook could have updated it, +* and write it out as a tree. We must do this before we invoke +* the editor and after we invoke run_status above. +*/ + discard_cache(); + } + read_cache_from(index_file); if (update_main_cache_tree(0)) { error(_("Error building trees")); -- 2.14.0.rc0.286.g44127d70e4
Re: fatal: Out of memory, getdelim failed under NFS mounts
René Scharfe writes: > I doubt the type of file system matters. The questions are: How much > main memory do you have, what is git trying to cram into it, is there > a way to reduce the memory footprint or do you need to add more RAM? > >> any recommendations on how to pin point the "offender"? ;) > Running "GIT_TRACE=1 git pull --ff-only origin master" would be a > good start, I think, to find out which of the different activities > that pull is doing causes the out-of-memory error. > > "free" and "ulimit -a" can help you find out how much memory you can > use. > > Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report? > getdelim() is used mostly to read lines from files like these and in > the admittedly unlikely case that they are *really* long such an > error would be expected. There is only one getdelim() call, which was introduced in v2.5.0 timeframe, and it is used like this: r = getdelim(&sb->buf, &sb->alloc, term, fp); if (r > 0) { sb->len = r; return 0; } assert(r == -1); /* * Normally we would have called xrealloc, which will try to free * memory and recover. But we have no way to tell getdelim() to do so. * Worse, we cannot try to recover ENOMEM ourselves, because we have * no idea how many bytes were read by getdelim. * * Dying here is reasonable. It mirrors what xrealloc would do on * catastrophic memory failure. We skip the opportunity to free pack * memory and retry, but that's unlikely to help for a malloc small * enough to hold a single line of input, anyway. */ if (errno == ENOMEM) die("Out of memory, getdelim failed"); So the function is returning -1 and leaving ENOMEM in errno on Yaroslav's system. I wonder if we are truly hitting out of memory, though. The same symptom could bee seen if getdelim() does not touch errno when it returns -1, but some other system call earlier set it to ENOMEM, for example. If the same version of Git is recompiled there without HAVE_GETDELIM defined, would it still die with out of memory (presumably inside the call to strbuf_grow() in the strbuf_getwholeline() function)?
Re: [PATCH] cache-tree: remove use of strbuf_addf in update_one
On Thu, Aug 10, 2017 at 11:47 AM, Kevin Willford wrote: > String formatting can be a performance issue when there are > hundreds of thousands of trees. When changing this for the sake of performance, could you give an example (which kind of repository you need for this to become a bottleneck? I presume the large Windows repo? Or can I reproduce it with a small repo such as linux.git or even git.git?) and some numbers how this improves the performance? > Change to stop using the strbuf_addf and just add the strings > or characters individually. > > There are a limited number of modes so added a switch for the > known ones and a default case if something comes through that > are not a known one for git. > > Signed-off-by: Kevin Willford > --- > cache-tree.c | 24 +++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/cache-tree.c b/cache-tree.c > index 2440d1dc89..41744b3db7 100644 > --- a/cache-tree.c > +++ b/cache-tree.c > @@ -390,7 +390,29 @@ static int update_one(struct cache_tree *it, > continue; > > strbuf_grow(&buffer, entlen + 100); > - strbuf_addf(&buffer, "%o %.*s%c", mode, entlen, path + > baselen, '\0'); > + > + switch (mode) { > + case 0100644: > + strbuf_add(&buffer, "100644 ", 7); > + break; > + case 0100664: > + strbuf_add(&buffer, "100664 ", 7); > + break; > + case 0100755: > + strbuf_add(&buffer, "100755 ", 7); > + break; > + case 012: > + strbuf_add(&buffer, "12 ", 7); > + break; > + case 016: > + strbuf_add(&buffer, "16 ", 7); > + break; Maybe it is worth spelling out the modes in non-numeric, but e.g. S_IFGITLINK. > + default: > + strbuf_addf(&buffer, "%o ", mode); Given the repository you are measuring, maybe we could get away with fewer entries here and only take the 2 or 3 most used entries and special case them? Or in case this is assumed to be the exhaustive list, we could issue a warning here? Thanks, Stefan
Re: [PATCH] cache-tree: remove use of strbuf_addf in update_one
On Thu, Aug 10, 2017 at 11:58:34AM -0700, Stefan Beller wrote: > On Thu, Aug 10, 2017 at 11:47 AM, Kevin Willford wrote: > > String formatting can be a performance issue when there are > > hundreds of thousands of trees. > > When changing this for the sake of performance, could you give > an example (which kind of repository you need for this to become > a bottleneck? I presume the large Windows repo? Or can I > reproduce it with a small repo such as linux.git or even git.git?) > and some numbers how this improves the performance? I was about to say the same thing. Normally I don't mind a small optimization without numbers if the result is obviously an improvement. But in this case the result is a lot less readable, and it's not entirely clear to me that it would always be an improvement (we now always run 3 strbuf calls instead of one, and have to check the length for each one). What I'm wondering specifically is if vsnprintf() on Kevin's platform (which I'll assume is Windows) is slow, and we would do better to replace it with a faster compat/ routine. -Peff
Re: [PATCH] commit: skip discarding the index if there is no pre-commit hook
On Thu, Aug 10, 2017 at 02:54:16PM -0400, Kevin Willford wrote: > If there is not a pre-commit hook, there is no reason to discard > the index and reread it. > > This change checks to presence of a pre-commit hook and then only > discards the index if there was one. Sounds like a smart optimization. > diff --git a/builtin/commit.c b/builtin/commit.c > index e7a2cb6285..443949d87b 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -671,12 +671,22 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > const char *hook_arg2 = NULL; > int clean_message_contents = (cleanup_mode != CLEANUP_NONE); > int old_display_comment_prefix; > + const char *precommit_hook = NULL; The return value from find_hook() points to storage that may later be overwritten or even freed. I know your patch doesn't actually look at the contents of precommit_hook, but it seems like it's setting up a potential bomb for somebody later. Can we make this: int have_precommit_hook = 0; ... have_precommit_hook = !!find_hook("pre-commit"); ? Though see below. > - if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", > NULL)) > - return 0; > + > + if (!no_verify) { > + /* > + * Check to see if there is a pre-commit hook > + * If there not one we can skip discarding the index later on > + */ > + precommit_hook = find_hook("pre-commit"); > + if (precommit_hook && > + run_commit_hook(use_editor, index_file, "pre-commit", NULL)) > + return 0; > + } We'll find the hook again in run_commit_hook(), but it's not so expensive that it's worth trying to pass down the hook path we found (and if we switch to an integer flag we don't have the path anyway ;) ). But note that we don't even care about precommit_hook here. We could just call run_commit_hook() regardless. We only care later whether we ran it or not. So we could drop this hunk and the precommit_hook variable entirely, and... > @@ -940,12 +950,15 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > return 0; > } > > - /* > - * Re-read the index as pre-commit hook could have updated it, > - * and write it out as a tree. We must do this before we invoke > - * the editor and after we invoke run_status above. > - */ > - discard_cache(); > + if (!no_verify && precommit_hook) { > + /* > + * Re-read the index as pre-commit hook could have updated it, > + * and write it out as a tree. We must do this before we invoke > + * the editor and after we invoke run_status above. > + */ > + discard_cache(); > + } Just make this: if (!no_verify && find_hook("pre-commit")) ... discard cache ... That is racy if somebody removes the hook while we're running (so is your patch, but in the opposite direction). What we really want to know is "did we run the hook" and annoyingly run_hook_ve() doesn't tell us that. So I think the most robust solution would be refactoring that to pass out the information, and then use the flag here. But I'm not sure it's actually worth worrying about such a race in practice. -Peff
[PATCH 9/9] Convert cache functions to size_t
From: Martin Koegler Signed-off-by: Martin Koegler --- cache-tree.c | 6 +++--- cache-tree.h | 2 +- cache.h | 6 +++--- convert.c | 18 +- environment.c | 4 ++-- read-cache.c | 18 +- 6 files changed, 27 insertions(+), 27 deletions(-) diff --git a/cache-tree.c b/cache-tree.c index 2440d1d..77b3253 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -485,10 +485,10 @@ void cache_tree_write(struct strbuf *sb, struct cache_tree *root) write_one(sb, root, "", 0); } -static struct cache_tree *read_one(const char **buffer, unsigned long *size_p) +static struct cache_tree *read_one(const char **buffer, size_t *size_p) { const char *buf = *buffer; - unsigned long size = *size_p; + size_t size = *size_p; const char *cp; char *ep; struct cache_tree *it; @@ -568,7 +568,7 @@ static struct cache_tree *read_one(const char **buffer, unsigned long *size_p) return NULL; } -struct cache_tree *cache_tree_read(const char *buffer, unsigned long size) +struct cache_tree *cache_tree_read(const char *buffer, size_t size) { if (buffer[0]) return NULL; /* not the whole tree */ diff --git a/cache-tree.h b/cache-tree.h index f7b9cab..df59e6e 100644 --- a/cache-tree.h +++ b/cache-tree.h @@ -28,7 +28,7 @@ void cache_tree_invalidate_path(struct index_state *, const char *); struct cache_tree_sub *cache_tree_sub(struct cache_tree *, const char *); void cache_tree_write(struct strbuf *, struct cache_tree *root); -struct cache_tree *cache_tree_read(const char *buffer, unsigned long size); +struct cache_tree *cache_tree_read(const char *buffer, size_t size); int cache_tree_fully_valid(struct cache_tree *); int cache_tree_update(struct index_state *, int); diff --git a/cache.h b/cache.h index 9322303..f77d9ec 100644 --- a/cache.h +++ b/cache.h @@ -667,7 +667,7 @@ extern int chmod_index_entry(struct index_state *, struct cache_entry *ce, char extern int ce_same_name(const struct cache_entry *a, const struct cache_entry *b); extern void set_object_name_for_intent_to_add_entry(struct cache_entry *ce); extern int index_name_is_other(const struct index_state *, const char *, int); -extern void *read_blob_data_from_index(const struct index_state *, const char *, unsigned long *); +extern void *read_blob_data_from_index(const struct index_state *, const char *, size_t *); /* do stat comparison even if CE_VALID is true */ #define CE_MATCH_IGNORE_VALID 01 @@ -743,8 +743,8 @@ extern int pack_compression_level; extern size_t packed_git_window_size; extern size_t packed_git_limit; extern size_t delta_base_cache_limit; -extern unsigned long big_file_threshold; -extern unsigned long pack_size_limit_cfg; +extern size_t big_file_threshold; +extern size_t pack_size_limit_cfg; /* * Accessors for the core.sharedrepository config which lazy-load the value diff --git a/convert.c b/convert.c index 1012462..445599b 100644 --- a/convert.c +++ b/convert.c @@ -41,9 +41,9 @@ struct text_stat { unsigned printable, nonprintable; }; -static void gather_stats(const char *buf, unsigned long size, struct text_stat *stats) +static void gather_stats(const char *buf, size_t size, struct text_stat *stats) { - unsigned long i; + size_t i; memset(stats, 0, sizeof(*stats)); @@ -90,7 +90,7 @@ static void gather_stats(const char *buf, unsigned long size, struct text_stat * * The same heuristics as diff.c::mmfile_is_binary() * We treat files with bare CR as binary */ -static int convert_is_binary(unsigned long size, const struct text_stat *stats) +static int convert_is_binary(size_t size, const struct text_stat *stats) { if (stats->lonecr) return 1; @@ -101,7 +101,7 @@ static int convert_is_binary(unsigned long size, const struct text_stat *stats) return 0; } -static unsigned int gather_convert_stats(const char *data, unsigned long size) +static unsigned int gather_convert_stats(const char *data, size_t size) { struct text_stat stats; int ret = 0; @@ -118,7 +118,7 @@ static unsigned int gather_convert_stats(const char *data, unsigned long size) return ret; } -static const char *gather_convert_stats_ascii(const char *data, unsigned long size) +static const char *gather_convert_stats_ascii(const char *data, size_t size) { unsigned int convert_stats = gather_convert_stats(data, size); @@ -140,7 +140,7 @@ const char *get_cached_convert_stats_ascii(const struct index_state *istate, const char *path) { const char *ret; - unsigned long sz; + size_t sz; void *data = read_blob_data_from_index(istate, path, &sz); ret = gather_convert_stats_ascii(data, sz); free(data); @@ -222,7 +222,7 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action, static int has_cr_in_index(const struct i
[PATCH 6/9] Use size_t for sha1
From: Martin Koegler Signed-off-by: Martin Koegler --- block-sha1/sha1.c | 2 +- block-sha1/sha1.h | 2 +- ppc/sha1.c| 2 +- ppc/sha1.h| 2 +- sha1dc_git.c | 2 +- sha1dc_git.h | 2 +- sha1dc_git_ext.h | 2 +- 7 files changed, 7 insertions(+), 7 deletions(-) diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c index 22b125c..8681031 100644 --- a/block-sha1/sha1.c +++ b/block-sha1/sha1.c @@ -203,7 +203,7 @@ void blk_SHA1_Init(blk_SHA_CTX *ctx) ctx->H[4] = 0xc3d2e1f0; } -void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, unsigned long len) +void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, size_t len) { unsigned int lenW = ctx->size & 63; diff --git a/block-sha1/sha1.h b/block-sha1/sha1.h index 4df6747..9fb0441 100644 --- a/block-sha1/sha1.h +++ b/block-sha1/sha1.h @@ -13,7 +13,7 @@ typedef struct { } blk_SHA_CTX; void blk_SHA1_Init(blk_SHA_CTX *ctx); -void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *dataIn, unsigned long len); +void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *dataIn, size_t len); void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx); #define platform_SHA_CTX blk_SHA_CTX diff --git a/ppc/sha1.c b/ppc/sha1.c index ec6a192..f0dfcfb 100644 --- a/ppc/sha1.c +++ b/ppc/sha1.c @@ -25,7 +25,7 @@ int ppc_SHA1_Init(ppc_SHA_CTX *c) return 0; } -int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *ptr, unsigned long n) +int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *ptr, size_t n) { unsigned long nb; const unsigned char *p = ptr; diff --git a/ppc/sha1.h b/ppc/sha1.h index 9b24b32..52cac23 100644 --- a/ppc/sha1.h +++ b/ppc/sha1.h @@ -16,7 +16,7 @@ typedef struct { } ppc_SHA_CTX; int ppc_SHA1_Init(ppc_SHA_CTX *c); -int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *p, unsigned long n); +int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *p, size_t n); int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c); #define platform_SHA_CTX ppc_SHA_CTX diff --git a/sha1dc_git.c b/sha1dc_git.c index 4d32b4f..a9076bc 100644 --- a/sha1dc_git.c +++ b/sha1dc_git.c @@ -11,7 +11,7 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx) sha1_to_hex(hash)); } -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len) +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len) { const char *data = vdata; /* We expect an unsigned long, but sha1dc only takes an int */ diff --git a/sha1dc_git.h b/sha1dc_git.h index a8a5c1d..f6051aa 100644 --- a/sha1dc_git.h +++ b/sha1dc_git.h @@ -11,7 +11,7 @@ void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *); /* * Same as SHA1DCUpdate, but adjust types to match git's usual interface. */ -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len); +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len); #define platform_SHA_CTX SHA1_CTX #define platform_SHA1_Init SHA1DCInit diff --git a/sha1dc_git_ext.h b/sha1dc_git_ext.h index d0ea8ce..aede828 100644 --- a/sha1dc_git_ext.h +++ b/sha1dc_git_ext.h @@ -17,7 +17,7 @@ void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *); /* * Same as SHA1DCUpdate, but adjust types to match git's usual interface. */ -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len); +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len); #define platform_SHA_CTX SHA1_CTX #define platform_SHA1_Init git_SHA1DCInit -- 2.1.4
[PATCH 8/9] Convert fsck.c & commit.c to size_t
From: Martin Koegler Signed-off-by: Martin Koegler --- builtin/replace.c | 2 +- commit.c | 14 +++--- commit.h | 8 fsck.c| 14 +++--- fsck.h| 2 +- 5 files changed, 20 insertions(+), 20 deletions(-) diff --git a/builtin/replace.c b/builtin/replace.c index f4a85a1..dcd0d1e 100644 --- a/builtin/replace.c +++ b/builtin/replace.c @@ -391,7 +391,7 @@ static int create_graft(int argc, const char **argv, int force) struct commit *commit; struct strbuf buf = STRBUF_INIT; const char *buffer; - unsigned long size; + size_t size; if (get_oid(old_ref, &old) < 0) die(_("Not a valid object name: '%s'"), old_ref); diff --git a/commit.c b/commit.c index 79decc2..5ebac6a 100644 --- a/commit.c +++ b/commit.c @@ -231,19 +231,19 @@ int unregister_shallow(const struct object_id *oid) struct commit_buffer { void *buffer; - unsigned long size; + size_t size; }; define_commit_slab(buffer_slab, struct commit_buffer); static struct buffer_slab buffer_slab = COMMIT_SLAB_INIT(1, buffer_slab); -void set_commit_buffer(struct commit *commit, void *buffer, unsigned long size) +void set_commit_buffer(struct commit *commit, void *buffer, size_t size) { struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); v->buffer = buffer; v->size = size; } -const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) +const void *get_cached_commit_buffer(const struct commit *commit, size_t *sizep) { struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); if (!v) { @@ -256,7 +256,7 @@ const void *get_cached_commit_buffer(const struct commit *commit, unsigned long return v->buffer; } -const void *get_commit_buffer(const struct commit *commit, unsigned long *sizep) +const void *get_commit_buffer(const struct commit *commit, size_t *sizep) { const void *ret = get_cached_commit_buffer(commit, sizep); if (!ret) { @@ -291,7 +291,7 @@ void free_commit_buffer(struct commit *commit) } } -const void *detach_commit_buffer(struct commit *commit, unsigned long *sizep) +const void *detach_commit_buffer(struct commit *commit, size_t *sizep) { struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); void *ret; @@ -1128,7 +1128,7 @@ int parse_signed_commit(const struct commit *commit, struct strbuf *payload, struct strbuf *signature) { - unsigned long size; + size_t size; const char *buffer = get_commit_buffer(commit, &size); int in_signature, saw_signature = -1; const char *line, *tail; @@ -1284,7 +1284,7 @@ struct commit_extra_header *read_commit_extra_headers(struct commit *commit, const char **exclude) { struct commit_extra_header *extra = NULL; - unsigned long size; + size_t size; const char *buffer = get_commit_buffer(commit, &size); extra = read_commit_extra_header_lines(buffer, size, exclude); unuse_commit_buffer(commit, buffer); diff --git a/commit.h b/commit.h index 82e966e..fd44de3 100644 --- a/commit.h +++ b/commit.h @@ -70,20 +70,20 @@ void parse_commit_or_die(struct commit *item); * Associate an object buffer with the commit. The ownership of the * memory is handed over to the commit, and must be free()-able. */ -void set_commit_buffer(struct commit *, void *buffer, unsigned long size); +void set_commit_buffer(struct commit *, void *buffer, size_t size); /* * Get any cached object buffer associated with the commit. Returns NULL * if none. The resulting memory should not be freed. */ -const void *get_cached_commit_buffer(const struct commit *, unsigned long *size); +const void *get_cached_commit_buffer(const struct commit *, size_t *size); /* * Get the commit's object contents, either from cache or by reading the object * from disk. The resulting memory should not be modified, and must be given * to unuse_commit_buffer when the caller is done. */ -const void *get_commit_buffer(const struct commit *, unsigned long *size); +const void *get_commit_buffer(const struct commit *, size_t *size); /* * Tell the commit subsytem that we are done with a particular commit buffer. @@ -102,7 +102,7 @@ void free_commit_buffer(struct commit *); * Disassociate any cached object buffer from the commit, but do not free it. * The buffer (or NULL, if none) is returned. */ -const void *detach_commit_buffer(struct commit *, unsigned long *sizep); +const void *detach_commit_buffer(struct commit *, size_t *sizep); /* Find beginning and length of commit subject. */ int find_commit_subject(const char *commit_buffer, const char **subject); diff --git a/fsck.c b/fsck.c index feca3a8..9039373 100644 --- a/fsck.c +++ b/fsck.c @@ -632,18 +632,18 @@ static int fsck_tr
[PATCH 5/9] Convert sha1_file.c to size_t
From: Martin Koegler Signed-off-by: Martin Koegler --- cache.h | 16 +++ sha1_file.c | 68 ++--- streaming.c | 2 +- 3 files changed, 43 insertions(+), 43 deletions(-) diff --git a/cache.h b/cache.h index 9185763..9322303 100644 --- a/cache.h +++ b/cache.h @@ -1189,15 +1189,15 @@ static inline const unsigned char *lookup_replace_object(const unsigned char *sh /* Read and unpack a sha1 file into memory, write memory to a sha1 file */ extern int sha1_object_info(const unsigned char *, size_t *); -extern int hash_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *sha1); -extern int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *return_sha1); -extern int hash_sha1_file_literally(const void *buf, unsigned long len, const char *type, unsigned char *sha1, unsigned flags); -extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *); +extern int hash_sha1_file(const void *buf, size_t len, const char *type, unsigned char *sha1); +extern int write_sha1_file(const void *buf, size_t len, const char *type, unsigned char *return_sha1); +extern int hash_sha1_file_literally(const void *buf, size_t len, const char *type, unsigned char *sha1, unsigned flags); +extern int pretend_sha1_file(void *, size_t, enum object_type, unsigned char *); extern int force_object_loose(const unsigned char *sha1, time_t mtime); extern int git_open_cloexec(const char *name, int flags); #define git_open(name) git_open_cloexec(name, O_RDONLY) -extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size); -extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz); +extern void *map_sha1_file(const unsigned char *sha1, size_t *size); +extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, size_t mapsize, void *buffer, size_t bufsiz); extern int parse_sha1_header(const char *hdr, size_t *sizep); /* global flag to enable extra checks when accessing packed objects */ @@ -1723,8 +1723,8 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *) extern int is_pack_valid(struct packed_git *); extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, size_t *); -extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, size_t *sizep); -extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t); +extern size_t unpack_object_header_buffer(const unsigned char *buf, size_t len, enum object_type *type, size_t *sizep); +extern size_t get_size_from_delta(struct packed_git *, struct pack_window **, off_t); extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, size_t *); /* diff --git a/sha1_file.c b/sha1_file.c index 3428172..1b3efea 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -51,7 +51,7 @@ static struct cached_object { unsigned char sha1[20]; enum object_type type; void *buf; - unsigned long size; + size_t size; } *cached_objects; static int cached_object_nr, cached_object_alloc; @@ -818,8 +818,8 @@ static int check_packed_git_idx(const char *path, struct packed_git *p) * variable sized table containing 8-byte entries * for offsets larger than 2^31. */ - unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20; - unsigned long max_size = min_size; + size_t min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20; + size_t max_size = min_size; if (nr) max_size += (nr - 1)*8; if (idx_size < min_size || idx_size > max_size) { @@ -1763,7 +1763,7 @@ static int open_sha1_file(const unsigned char *sha1, const char **path) */ static void *map_sha1_file_1(const char *path, const unsigned char *sha1, -unsigned long *size) +size_t *size) { void *map; int fd; @@ -1790,13 +1790,13 @@ static void *map_sha1_file_1(const char *path, return map; } -void *map_sha1_file(const unsigned char *sha1, unsigned long *size) +void *map_sha1_file(const unsigned char *sha1, size_t *size) { return map_sha1_file_1(NULL, sha1, size); } -unsigned long unpack_object_header_buffer(const unsigned char *buf, - unsigned long len, enum object_type *type, size_t *sizep) +size_t unpack_object_header_buffer(const unsigned char *buf, + size_t len, enum object_type *type, size_t *sizep) { unsigned shift; size_t size, c; @@ -1821,8 +1821,8 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf, } static int unpack_sha1_short_he
Re: [PATCH] t1200: remove t1200-tutorial.sh
Johannes Schindelin writes: > Hi, > > On Wed, 9 Aug 2017, Stefan Beller wrote: > >> v1.2.0~121 (New tutorial, 2006-01-22) rewrote the tutorial such that the >> original intent of 2ae6c70674 (Adapt tutorial to cygwin and add test case, >> 2005-10-13) to test the examples from the tutorial doesn't hold any more. >> >> There are dedicated tests for the commands used, even "git whatchanged", >> such that removing these tests doesn't seem like a reduction in test >> coverage. >> >> Signed-off-by: Stefan Beller >> --- > > ACK, > Dscho Thanks, both. Will queue.
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 8:37 PM, Jeff King wrote: > On Thu, Aug 10, 2017 at 11:35:00AM -0700, Stefan Beller wrote: > >> On Thu, Aug 10, 2017 at 1:03 AM, Jeff King wrote: >> > The point of "--only-trailers" is to give a caller an output >> > that's easy for them to parse. Getting rid of the >> > non-trailer material helps, but we still may see more >> > complicated syntax like whitespace continuation. Let's add >> > an option to normalize the output into one "key: value" line >> > per trailer. >> > >> > As a bonus, this could be used even without --only-trailers >> > to clean up unusual formatting in the incoming data. >> >> This is useful for the parsing part, but for the writing part we'd >> rather want to have the opposite thing, such as >> '--line-break=rfc822'. But this doesn't have to be part of this >> series. With this in mind, I do not quite understand the latter >> use case how you would use normalized trailers without >> --only-trailers? > > If you prefer the normalized form (and the input was line-broken in a > way that you don't like), then this would convert to your preferred > form. I agree that you could potentially want the opposite (folding long > lines). Perhaps something like --wrap=72. Related to this, I wonder if people might want to "normalize" in different ways later. If that happens, we might regret having called this option "--normalize" instead of "--one-per-line" for example.
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 09:39:21PM +0200, Christian Couder wrote: > > If you prefer the normalized form (and the input was line-broken in a > > way that you don't like), then this would convert to your preferred > > form. I agree that you could potentially want the opposite (folding long > > lines). Perhaps something like --wrap=72. > > Related to this, I wonder if people might want to "normalize" in > different ways later. If that happens, we might regret having called > this option "--normalize" instead of "--one-per-line" for example. My assumption was that it would be OK to add other normalization later if it brings us closer to the "key: value" form as a standard, and it could fall under "--normalize", since that's what callers would want. And that's why I didn't want to call it something like --one-per-line. But if you are arguing that there can be many "standards" to normalize to, I agree that's a possibility. I think we have an out by extending to "--normalize=whatever-form" in the future. -Peff
Re: [PATCH 0/5] make interpret-trailers useful for parsing
Stefan Beller writes: > On Thu, Aug 10, 2017 at 11:03 AM, Jeff King wrote: >> On Thu, Aug 10, 2017 at 04:02:46AM -0400, Jeff King wrote: >> >>> On Wed, Aug 09, 2017 at 08:21:47AM -0400, Jeff King wrote: >>> >>> > This series teaches interpret-trailers to parse and output just the >>> > trailers. So now you can do: >>> > >>> > $ git log --format=%B -1 8d44797cc91231cd44955279040dc4a1ee0a797f | >>> > git interpret-trailers --parse >>> > Signed-off-by: Hartmut Henkel >>> > Helped-by: Stefan Beller >>> > Signed-off-by: Ralf Thielow >>> > Acked-by: Matthias Rüster >>> >>> And here's a v2 that addresses all of the comments except one: Stefan >>> suggested that --only-existing wasn't a great name. I agree, but I like >>> everything else less. >> >> Here's a v3 that takes care of that (renaming it to --only-input). >> >> It's otherwise the same as v2, but since the name-change ripples through >> the remaining patches, I wanted to get v3 in front of people sooner >> rather than later. >> > > Looks good, Yeah, looks good to me too. Thanks for the "--only-input" discussion.
Re: fatal: Out of memory, getdelim failed under NFS mounts
Am 10.08.2017 um 16:43 schrieb Yaroslav Halchenko: > On Thu, 10 Aug 2017, René Scharfe wrote: >> Am 09.08.2017 um 19:39 schrieb Yaroslav Halchenko: >>> More context (may be different issue(s)) could be found at >>> http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/ >>> but currently I am consistently reproducing it while running >>> git (1:2.11.0-3 debian stretch build) within debian stretch singularity >>> environment [1]. > >>> External system is Centos 6.9, and git 1.7.1 (and installed in modules >>> 2.0.4) do not show similar buggy behavior. > >>> NFS mounted partitions are bind mounted inside the sinularity space and >>> when I try to do some git operations, I get that error inconsistently , e.g. > >>> yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin >>> master >>> fatal: Out of memory, getdelim failed >>> error: git://github.com/datalad/datalad did not send all necessary >>> objects > >>> yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin >>> master >>> fatal: Out of memory, getdelim failed >>> error: git://github.com/datalad/datalad did not send all necessary >>> objects > >>> yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin >>> master >>> From git://github.com/datalad/datalad >>> * branch master -> FETCH_HEAD >>> fatal: Out of memory, getdelim failed > >>> and some times it succeeds. So it smells that some race condition >>> somewhere...? > >> I doubt the type of file system matters. > > So far it has been a very consistent indicator. I did not manage to get > this error while performing the same operation under /tmp (bind to local > mounted drive), where it also feels going faster (again suggesting that > original issue is some kind of a race) Well, there have been bugs in getdelim() before, e.g.: https://bugzilla.redhat.com/show_bug.cgi?id=601071 https://bugzilla.redhat.com/show_bug.cgi?id=1332917 git v2.5.0 was the first version to use it. So if all else fails it may be worth compiling git without HAVE_GETDELIM. >> The questions are: How much >> main memory do you have, what is git trying to cram into it, is there >> a way to reduce the memory footprint or do you need to add more RAM? >> ... reordered ... >> "free" and "ulimit -a" can help you find out how much memory you can >> use. > > I think those aren't the reason: > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ free -h >totalusedfree shared buff/cache > available > Mem: 126G2.5G 90G652K 33G > 123G > Swap: 127G1.7M127G Is all of that available to the git in the Singularity container or is that the memory size of the host and there's some kind of limit for the guests? > yhalchen@discovery:/mnt/scratch/yoh/datalad$ ulimit > unlimited That's just the maximum file size; memory-related limits are more interesting for this case. "ulimit -a" will show all limits. >>> any recommendations on how to pin point the "offender"? ;) >> Running "GIT_TRACE=1 git pull --ff-only origin master" would be a >> good start, I think, to find out which of the different activities >> that pull is doing causes the out-of-memory error. > > samples of bad, and then good runs (from eyeballing -- the same until > error message): > > yhalchen@discovery:/mnt/scratch/yoh$ cat git_trace_bad.log > 14:05:25.782270 git.c:371 trace: built-in: git 'pull' > '--ff-only' 'origin' 'master' > 14:05:25.795036 run-command.c:350 trace: run_command: 'fetch' > '--update-head-ok' 'origin' 'master' > 14:05:25.795332 exec_cmd.c:116 trace: exec: 'git' 'fetch' > '--update-head-ok' 'origin' 'master' > 14:05:25.797212 git.c:371 trace: built-in: git 'fetch' > '--update-head-ok' 'origin' 'master' > 14:05:25.904088 run-command.c:350 trace: run_command: 'rev-list' > '--objects' '--stdin' '--not' '--all' '--quiet' > 14:05:26.085954 run-command.c:350 trace: run_command: 'index-pack' > '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on > discovery.hpcc.dartmouth.edu' '--pack_header=2,103' > 14:05:26.086333 exec_cmd.c:116 trace: exec: 'git' 'index-pack' > '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on > discovery.hpcc.dartmouth.edu' '--pack_header=2,103' > 14:05:26.088382 git.c:371 trace: built-in: git 'index-pack' > '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on > discovery.hpcc.dartmouth.edu' '--pack_header=2,103' > 14:05:26.133326 run-command.c:350 trace: run_command: 'rev-list' > '--objects' '--stdin' '--not' '--all' '--quiet' > 14:05:26.133688 exec_cmd.c:116 trace: exec: 'git' 'rev-list' > '--objects' '--stdin' '--not' '--all' '--quiet' > 14:05:26.135493 git.c:371 trace: built-in: git 'rev-list' > '--objects' '--stdin' '--not' '--all' '--quiet' > fatal: Out of memory, getdelim failed
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 12:39 PM, Christian Couder wrote: > On Thu, Aug 10, 2017 at 8:37 PM, Jeff King wrote: >> On Thu, Aug 10, 2017 at 11:35:00AM -0700, Stefan Beller wrote: >> >>> On Thu, Aug 10, 2017 at 1:03 AM, Jeff King wrote: >>> > The point of "--only-trailers" is to give a caller an output >>> > that's easy for them to parse. Getting rid of the >>> > non-trailer material helps, but we still may see more >>> > complicated syntax like whitespace continuation. Let's add >>> > an option to normalize the output into one "key: value" line >>> > per trailer. >>> > >>> > As a bonus, this could be used even without --only-trailers >>> > to clean up unusual formatting in the incoming data. >>> >>> This is useful for the parsing part, but for the writing part we'd >>> rather want to have the opposite thing, such as >>> '--line-break=rfc822'. But this doesn't have to be part of this >>> series. With this in mind, I do not quite understand the latter >>> use case how you would use normalized trailers without >>> --only-trailers? >> >> If you prefer the normalized form (and the input was line-broken in a >> way that you don't like), then this would convert to your preferred >> form. I agree that you could potentially want the opposite (folding long >> lines). Perhaps something like --wrap=72. > > Related to this, I wonder if people might want to "normalize" in > different ways later. If that happens, we might regret having called > this option "--normalize" instead of "--one-per-line" for example. What is normal? Maybe --style=[one-line, wrapped:72, rfc, json, xml, ...] then? If you have --one-per-line, this may be orthogonal to e.g. json (as json can be crammed into one line IIUC), but when given the selection you cannot combine multiple styles. Scratch that, we actually want to combine these styles with each other.
Re: fatal: Out of memory, getdelim failed under NFS mounts
Thank you Junio On Thu, 10 Aug 2017, Junio C Hamano wrote: > There is only one getdelim() call, which was introduced in v2.5.0 > timeframe, and it is used like this: > r = getdelim(&sb->buf, &sb->alloc, term, fp); > if (r > 0) { > sb->len = r; > return 0; > } > assert(r == -1); > /* >* Normally we would have called xrealloc, which will try to free >* memory and recover. But we have no way to tell getdelim() to do so. >* Worse, we cannot try to recover ENOMEM ourselves, because we have >* no idea how many bytes were read by getdelim. >* >* Dying here is reasonable. It mirrors what xrealloc would do on >* catastrophic memory failure. We skip the opportunity to free pack >* memory and retry, but that's unlikely to help for a malloc small >* enough to hold a single line of input, anyway. >*/ > if (errno == ENOMEM) > die("Out of memory, getdelim failed"); > So the function is returning -1 and leaving ENOMEM in errno on > Yaroslav's system. > I wonder if we are truly hitting out of memory, though. The same > symptom could bee seen if getdelim() does not touch errno when it > returns -1, but some other system call earlier set it to ENOMEM, > for example. > If the same version of Git is recompiled there without HAVE_GETDELIM > defined, would it still die with out of memory (presumably inside > the call to strbuf_grow() in the strbuf_getwholeline() function)? will check now... for my own education (rotten by Python) -- how do you know which syscall set errno to be analyzed at this specific point? may be it was already set to ENOMEM before entry to this function? -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik
Re: [PATCH] cache-tree: remove use of strbuf_addf in update_one
On 8/10/2017 3:03 PM, Jeff King wrote: On Thu, Aug 10, 2017 at 11:58:34AM -0700, Stefan Beller wrote: On Thu, Aug 10, 2017 at 11:47 AM, Kevin Willford wrote: String formatting can be a performance issue when there are hundreds of thousands of trees. When changing this for the sake of performance, could you give an example (which kind of repository you need for this to become a bottleneck? I presume the large Windows repo? Or can I reproduce it with a small repo such as linux.git or even git.git?) and some numbers how this improves the performance? I was about to say the same thing. Normally I don't mind a small optimization without numbers if the result is obviously an improvement. But in this case the result is a lot less readable, and it's not entirely clear to me that it would always be an improvement (we now always run 3 strbuf calls instead of one, and have to check the length for each one). What I'm wondering specifically is if vsnprintf() on Kevin's platform (which I'll assume is Windows) is slow, and we would do better to replace it with a faster compat/ routine. -Peff The strbuf_add call is essentially only having to do a memcpy whereas the strbuf_addf will have to parse the string, determine the types, convert the data, and then get it in the buffer. That could be made faster with a better compat/ routine but I fear still far from the length check and memcpy. void strbuf_add(struct strbuf *sb, const void *data, size_t len) { strbuf_grow(sb, len); memcpy(sb->buf + sb->len, data, len); strbuf_setlen(sb, sb->len + len); } Here are some of the performance numbers from the windows repo. I will work on writing a perf test for this change so that we have a better idea on smaller repo what the impact of this change is on them. | w/o | with fix | --- git checkout | 36.08 s | 33.34 s | --- git checkout | 32.54 s | 28.26 s | --- git checkout | 44.10 s | 38.13 s | --- git merge| 32.90 s | 30.56 s | --- git rebase | 46.14 s | 42.18 s |
Re: fatal: Out of memory, getdelim failed under NFS mounts
Am 10.08.2017 um 20:56 schrieb Junio C Hamano: > René Scharfe writes: > >> I doubt the type of file system matters. The questions are: How much >> main memory do you have, what is git trying to cram into it, is there >> a way to reduce the memory footprint or do you need to add more RAM? >> >>> any recommendations on how to pin point the "offender"? ;) >> Running "GIT_TRACE=1 git pull --ff-only origin master" would be a >> good start, I think, to find out which of the different activities >> that pull is doing causes the out-of-memory error. >> >> "free" and "ulimit -a" can help you find out how much memory you can >> use. >> >> Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report? >> getdelim() is used mostly to read lines from files like these and in >> the admittedly unlikely case that they are *really* long such an >> error would be expected. > > There is only one getdelim() call, which was introduced in v2.5.0 > timeframe, and it is used like this: > > r = getdelim(&sb->buf, &sb->alloc, term, fp); > > if (r > 0) { > sb->len = r; > return 0; > } > assert(r == -1); > > /* >* Normally we would have called xrealloc, which will try to free >* memory and recover. But we have no way to tell getdelim() to do so. >* Worse, we cannot try to recover ENOMEM ourselves, because we have >* no idea how many bytes were read by getdelim. >* >* Dying here is reasonable. It mirrors what xrealloc would do on >* catastrophic memory failure. We skip the opportunity to free pack >* memory and retry, but that's unlikely to help for a malloc small >* enough to hold a single line of input, anyway. >*/ > if (errno == ENOMEM) > die("Out of memory, getdelim failed"); > > So the function is returning -1 and leaving ENOMEM in errno on > Yaroslav's system. > > I wonder if we are truly hitting out of memory, though. The same > symptom could bee seen if getdelim() does not touch errno when it > returns -1, but some other system call earlier set it to ENOMEM, > for example. That can happen when the end of a file is reached; getdelim() returns -1 and leaves errno unchanged. So we need to set errno to 0 just before that call. Still *some* function must have run into a memory shortage in that scenario, so adding/assigning more should help, right? > If the same version of Git is recompiled there without HAVE_GETDELIM > defined, would it still die with out of memory (presumably inside > the call to strbuf_grow() in the strbuf_getwholeline() function)? It's worth a try, if possible. René
Re: [PATCH v1 1/1] dir: teach status to show ignored directories
On Thu, Aug 10, 2017 at 11:49 AM, Jameson Miller wrote: Welcome to the Git mailing list. :) > Teach Git to optionally show ignored directories when showing all > untracked files. The git status command exposes the options to report > ignored and/or untracked files. However, when reporting all untracked > files (--untracked-files=all), all individual ignored files are reported > as well. It is not currently possible to get the reporting behavior of > the --ignored flag, while also reporting all untracked files. Trying to understand this based off the documentation for --untracked=all and --ignored, I realize that the documentation for --ignored seems to be lacking as I do not understand what the --ignored behavior is in combination with --untracked=[all, normal, no] > This > change exposes a flag to report all untracked files while not showing > individual files in ignored directories. By the description up to here, it sounds as if you want to introduce mode for --untracked, e.g. "normal-adjusted-for-ignored" (it's a bad suggestion)? However the patch seems to add an orthogonal flag, such that status --no-ignored --untracked=no --show-ignored-directory would also be possible. What is a reasonable expectation for the output of such? > Motivation: > Our application (Visual Studio) needs all untracked files listed > individually, but does not need all ignored files listed individually. For parsing output, I would strongly recommend --porcelain[=2], but that is a tangent. > Reporting all ignored files can affect the time it takes for status > to run. For a representative repository, here are some measurements > showing a large perf improvement for this scenario: > > | Command | Reported ignored entries | Time (s) | > | --- | | | > | 1 | 0| 1.3 | > | 2 | 1024 | 4.2 | > | 3 | 174904 | 7.5 | > | 4 | 1046 | 1.6 | > > Commands: > 1) status > 2) status --ignored > 3) status --ignored --untracked-files=all > 4) status --ignored --untracked-files=all --show-ignored-directory (2) is --untracked-files=normal I'd presume, such that this flag can be understood as a tweak to "normal" based on the similar size between 2 and 4? (The timing improvement from 2 to 4 is huge though). > This changes exposes a --show-ignored-directory flag to the git status s/changes/change/ > command. This flag is utilized when running git status with the > --ignored and --untracked-files options to not list ignored individual > ignored files contained in directories that match an ignore pattern. > > Part of the perf improvement comes from the tweak to > read_directory_recursive to stop scanning the file system after it > encounters the first file. When a directory is ignored, all it needs to > determine is if the directory is empty or not. The logic currently keeps > scanning the file system until it finds an untracked file. However, as > the directory is ignored, all the contained contents are also marked > excluded. For ignored directories that contain a large number of files, > this can take some time. I think it is possible to ignore a directory and still track files in it, what are the implications of this flag on a tracked (and changed) file in an ignored dir? What happens to empty directories that match an ignore pattern? > @@ -1362,6 +1363,8 @@ int cmd_status(int argc, const char **argv, const char > *prefix) > N_("ignore changes to submodules, optional when: all, > dirty, untracked. (Default: all)"), > PARSE_OPT_OPTARG, NULL, (intptr_t)"all" }, > OPT_COLUMN(0, "column", &s.colopts, N_("list untracked files > in columns")), > + OPT_BOOL(0, "show-ignored-directory", &show_ignored_directory, Is it possible to directly read into s.show_ignored_directory here? > +test_expect_success 'Verify behavior of status on folders with ignored > files' ' > + test_when_finished "git clean -fdx" && > + git status --porcelain=v2 --ignored --untracked-files=all > --show-ignored-directory >output && > + test_i18ncmp expect output > +' > + > +# Test status bahavior on folder with tracked and ignored files behavior > +cat >expect <<\EOF > +? expect > +? output > +! dir/tracked_ignored/ignored_1.ign > +! dir/tracked_ignored/ignored_2.ign > +! tracked_ignored/ignored_1.ign > +! tracked_ignored/ignored_2.ign > +EOF I think our latest 'best style' is to include these heredocs into the test.
Re: fatal: Out of memory, getdelim failed under NFS mounts
On Thu, Aug 10, 2017 at 09:58:37PM +0200, René Scharfe wrote: > > So the function is returning -1 and leaving ENOMEM in errno on > > Yaroslav's system. > > > > I wonder if we are truly hitting out of memory, though. The same > > symptom could bee seen if getdelim() does not touch errno when it > > returns -1, but some other system call earlier set it to ENOMEM, > > for example. > > That can happen when the end of a file is reached; getdelim() returns > -1 and leaves errno unchanged. So we need to set errno to 0 just > before that call. Good catch. That's a bug in my original conversoin of strbuf_getwholeline(). -Peff
Re: [PATCH V2 1/2] Fix delta integer overflows
Martin Koegler writes: > From: Martin Koegler Just a nitpick on the patch title. As "git shortlog --no-merges" output would tell you, we try to prefix the title with a short name of the area of codebase we are touching, followed by a colon and a space and then remainder without extra capitalization. Perhaps Subject: delta: fix enconding size larger than an "uint" can hold > The current delta code produces incorrect pack objects for files > 4GB. > > Signed-off-by: Martin Koegler I am a bit torn on this change. The original is indeed bad in that the code does not guarantee that an intermediate variable like 'l' is not large enough to hold the true size we know in index->src_size, and in that sense this change is an improvement. Given that this is not merely a local storage format but it also is an interchange format, we would probably want to make sure that the receiving end (e.g. get_delta_hdr_size() that is used at the beginning of patch_delta()) on a platform whose size_t is smaller than that of a platform that produced the delta stream with this code behaves "sensibly". If we replaced ulong we use in create/patch delta codepaths with uint32_t, that would be safer, just because the encoder would not be able to emit varint that is larger than the receivers to handle. But that defeats the whole point of using varint() to encode the sizes in the first place. It was partly done for space saving, but more for allowing larger sizes and larger ulong in the future without having to change the file format. Perhaps we should teach the receiving end to notice that the varint data it reads encodes a size that is too large for it to grok and die. With that, we can safely move forward with whatever size_t each platform uses. Thanks. > --- > For next. > > diff-delta.c | 24 +--- > 1 file changed, 13 insertions(+), 11 deletions(-) > > diff --git a/diff-delta.c b/diff-delta.c > index 3797ce6..cd238c8 100644 > --- a/diff-delta.c > +++ b/diff-delta.c > @@ -319,7 +319,9 @@ create_delta(const struct delta_index *index, >const void *trg_buf, unsigned long trg_size, >unsigned long *delta_size, unsigned long max_size) > { > - unsigned int i, outpos, outsize, moff, msize, val; > + unsigned int i, val; > + off_t outpos, moff; > + size_t l, outsize, msize; > int inscnt; > const unsigned char *ref_data, *ref_top, *data, *top; > unsigned char *out; > @@ -336,20 +338,20 @@ create_delta(const struct delta_index *index, > return NULL; > > /* store reference buffer size */ > - i = index->src_size; > - while (i >= 0x80) { > - out[outpos++] = i | 0x80; > - i >>= 7; > + l = index->src_size; > + while (l >= 0x80) { > + out[outpos++] = l | 0x80; > + l >>= 7; > } > - out[outpos++] = i; > + out[outpos++] = l; > > /* store target buffer size */ > - i = trg_size; > - while (i >= 0x80) { > - out[outpos++] = i | 0x80; > - i >>= 7; > + l = trg_size; > + while (l >= 0x80) { > + out[outpos++] = l | 0x80; > + l >>= 7; > } > - out[outpos++] = i; > + out[outpos++] = l; > > ref_data = index->src_buf; > ref_top = ref_data + index->src_size;
[ANNOUNCE] Git for Windows 2.14.1
Dear Git users, It is my pleasure to announce that Git for Windows 2.14.1 is available from: https://git-for-windows.github.io/ Changes since Git for Windows v2.14.0(2) (August 7th 2017) Note: there have been MinGit-only releases v2.12.2(3) and v2.13.1(3) with backports of the important bug fix in v2.14.1 as well as the experimental --show-ignored-directory option of git status. New Features * Comes with Git v2.14.1. * Comes with cURL v7.55.0. * The Git Bash Here context menu item is now also available in the special Libraries folders. Filename | SHA-256 | --- Git-2.14.1-64-bit.exe | 0dc556503e3ce4699228fc910a8e4a8d81172635ac8e8e16a11be107254c4901 Git-2.14.1-32-bit.exe | 0129e21eaed8efa6d795f712656463ee4f90aa2b3b66168f29b0da98f74104f7 PortableGit-2.14.1-64-bit.7z.exe | 3c3270a9df5f3db1f7637d86b94fb54a96e9145ba43c98a3e993cdffb1a1842e PortableGit-2.14.1-32-bit.7z.exe | df3f9b6c2dd2b12e5cb7035b9ca48d13b973d054a35b0939953aa6e7a00a0659 MinGit-2.14.1-64-bit.zip | 65c12e4959b8874187b68ec37e532fe7fc526e10f6f0f29e699fa1d2449e7d92 MinGit-2.14.1-32-bit.zip | 77b468e0ead1e7da4cb3a1cf35dabab5210bf10457b4142f5e9430318217cdef MinGit-2.14.1-busybox-64-bit.zip | 7e72a78e0711d27d98f851ec81a6fe27b4159066d548c2013dd7ce57a1b8cd03 MinGit-2.14.1-busybox-32-bit.zip | 2f3a3ae26391e5e3487501b3b16ee1c6385259ebfdaafcbee9947d7513dc0a0f Git-2.14.1-64-bit.tar.bz2 | 544615e2ef5e2040a67878ce7aac42cb103f948d52989239b3715dd6023b1007 Git-2.14.1-32-bit.tar.bz2 | 0aede42a7ec7a6351a3f273ab519679f95e9341cb63899c54be18a57819da6aa Ciao, Johannes
Product Inquiry
Dear Sir/Madam We are interested in your product, can we have your MOQ and FOB prices for consideration and purchase waiting for your response. Best Regards, Ms Tina Wang (Purchasing Manager) Email:purchasing.sale...@gmail.com
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 9:42 PM, Jeff King wrote: > On Thu, Aug 10, 2017 at 09:39:21PM +0200, Christian Couder wrote: > >> > If you prefer the normalized form (and the input was line-broken in a >> > way that you don't like), then this would convert to your preferred >> > form. I agree that you could potentially want the opposite (folding long >> > lines). Perhaps something like --wrap=72. >> >> Related to this, I wonder if people might want to "normalize" in >> different ways later. If that happens, we might regret having called >> this option "--normalize" instead of "--one-per-line" for example. > > My assumption was that it would be OK to add other normalization later > if it brings us closer to the "key: value" form as a standard, and it > could fall under "--normalize", since that's what callers would want. > And that's why I didn't want to call it something like --one-per-line. > > But if you are arguing that there can be many "standards" to normalize > to, I agree that's a possibility. I think we have an out by extending to > "--normalize=whatever-form" in the future. If we take `git log` as an example, we now have "--oneline" which is a shorthand for "--pretty=oneline --abbrev-commit". And the default for "--pretty" is called "medium". So instead of your suggestion, we could call this option "--oneline" now, and if other normalizations are later required we could then create "--pretty=whatever" and say that "--oneline" is a shorthand for "--pretty=oneline".
Re: fatal: Out of memory, getdelim failed under NFS mounts
On Thu, 10 Aug 2017, Jeff King wrote: > On Thu, Aug 10, 2017 at 09:58:37PM +0200, René Scharfe wrote: > > > So the function is returning -1 and leaving ENOMEM in errno on > > > Yaroslav's system. > > > I wonder if we are truly hitting out of memory, though. The same > > > symptom could bee seen if getdelim() does not touch errno when it > > > returns -1, but some other system call earlier set it to ENOMEM, > > > for example. > > That can happen when the end of a file is reached; getdelim() returns > > -1 and leaves errno unchanged. So we need to set errno to 0 just > > before that call. > Good catch. That's a bug in my original conversoin of > strbuf_getwholeline(). I think this did it! at least on this simple test... yet to test a bit more in those scenarios we ran into it before while testing git-annex. commit 36ef5e3ad2c187d3be664c33dbc8c06e59bceaf4 (HEAD -> bf-seterrno0) Author: Yaroslav O. Halchenko Date: Thu Aug 10 20:26:47 2017 + BF: set errno to 0 before getdelim call to get unbiased assesment later diff --git a/strbuf.c b/strbuf.c index 89d22e3b0..323c49ceb 100644 --- a/strbuf.c +++ b/strbuf.c @@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) /* Translate slopbuf to NULL, as we cannot call realloc on it */ if (!sb->alloc) sb->buf = NULL; + errno = 0; r = getdelim(&sb->buf, &sb->alloc, term, fp); if (r > 0) { -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik
Re: [PATCH 0/4] dropping support for older curl
[I am resending this since the original does not seem to have made it to the list, at least I cannot find it in any archives] On 09/08/17 23:47, Jeff King wrote: On Wed, Aug 09, 2017 at 11:42:12PM +0200, Johannes Schindelin wrote: I mean, if we even go out of our way to support the completely outdated and obsolete .git/branches/ for what is likely a single user, it may not be the worst to keep those couple of #ifdef guards to keep at least nominal support for older cURLs? You've totally ignored the argument I made back then[1], and which I reiterated in this thread. So I'll say it one more time: the more compelling reason is not the #ifdefs, but the fact that the older versions are totally untested. Perhaps you forgot but I stated in the original thread that I build RPMS for RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite every single time. I currently have 2.13.3 up for el4, el5, el6 and el7. Only el4 requires any patches, the rest will build out of the box with the vendor supplied version of curl. The plan was to drop the el4 builds for 2.14.0 to get rid of the patches. In fact, they do not even compile, and yet I have not seen any patches to fix that. I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems at all neither with building nor with running the testsuite. So IMHO this is about being honest with users about which versions we _actually_ support. I have no problem with you wanting to drop support for older curl releases (such as 7.15.5) but don't use the argument that it doesn't currently build and nobody cares. Also FWIW Red Hat continues to support RHEL 5 with the Extended Life-cycle Support program until 2020-11-30. -tgc
Re: [PATCH] cache-tree: remove use of strbuf_addf in update_one
On Thu, Aug 10, 2017 at 12:57 PM, Kevin Willford wrote: > Here are some of the performance numbers from the windows repo. > I will work on writing a perf test for this change so that we > have a better idea on smaller repo what the impact of this change > is on them. > > | w/o | with fix | > --- > git checkout | 36.08 s | 33.34 s | > --- > git checkout | 32.54 s | 28.26 s | > --- > git checkout | 44.10 s | 38.13 s | > --- > git merge| 32.90 s | 30.56 s | > --- > git rebase | 46.14 s | 42.18 s | > ~10-15% is impressive for this patch, I certainly did not expect as much. Thanks for providing the numbers! Stefan
Re: [PATCH V2 1/2] Fix delta integer overflows
On Thu, Aug 10, 2017 at 01:07:07PM -0700, Junio C Hamano wrote: > Perhaps we should teach the receiving end to notice that the varint > data it reads encodes a size that is too large for it to grok and > die. With that, we can safely move forward with whatever size_t > each platform uses. Yes, this is very important even for "unsigned long". I'd worry that malicious input could cause us to wrap to 0, and we'd potentially write into a too-small buffer[1]. There's some prior art with checking this against bitsizeof() in unpack_object_header_buffer() but get_delta_hdr_size() does not seem to have a check. -Peff [1] In most cases it's _probably_ not a vulnerability to wrap here, because we'd just read less data than we ought to. But it makes me nervous nonetheless.
Re: [PATCH 4/4] Fix delta offset overflow
Martin Koegler writes: > From: Martin Koegler > > Prevent generating delta offsets beyond 4G. > > Signed-off-by: Martin Koegler > --- > diff-delta.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/diff-delta.c b/diff-delta.c > index 3d5e1ef..633883e 100644 > --- a/diff-delta.c > +++ b/diff-delta.c > @@ -454,6 +454,9 @@ create_delta(const struct delta_index *index, > moff += msize; > msize = left; > > + if (moff > 0x) > + msize = 0; > + The lower 4-byte of moff (before incrementing it with msize) were already encoded to the output stream before this hunk. Shouldn't we be checking if moff would fit in uint32_t _before_ that happens? IOW, in a much earlier part of that "while (data < top)" loop, there is a code that tries to find a match that gives us a large msize by iterating over index->hash[], and it sets msize and moff as a potential location that we may want to use. If moff is already big there, then we shouldn't consider it a match because we cannot encode its location using 4-byte anyway, no? Cutting it off at here by resetting msize to 0 might help the next iteration (I didn't check, but is the effect of it is to corrupt the "val" rolling checksum and make it unlikely that the hash computation would not find a correct match?) but it somehow feels like closing the barn door after the horse has already bolted... > if (msize < 4096) { > int j; > val = 0;
[PATCH] strbuf: clear errno before calling getdelim(3)
getdelim(3) returns -1 at the end of the file and if it encounters an error, but sets errno only in the latter case. Set errno to zero before calling it to avoid misdiagnosing an out-of-memory condition due to a left-over value from some other function call. Reported-by: Yaroslav Halchenko Suggested-by: Junio C Hamano Signed-off-by: Rene Scharfe --- Do we need to save and restore the original value of errno? I doubt it, but didn't think deeply about it, yet. strbuf.c | 1 + 1 file changed, 1 insertion(+) diff --git a/strbuf.c b/strbuf.c index 89d22e3b09..323c49ceb3 100644 --- a/strbuf.c +++ b/strbuf.c @@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) /* Translate slopbuf to NULL, as we cannot call realloc on it */ if (!sb->alloc) sb->buf = NULL; + errno = 0; r = getdelim(&sb->buf, &sb->alloc, term, fp); if (r > 0) { -- 2.14.0
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: > getdelim(3) returns -1 at the end of the file and if it encounters an > error, but sets errno only in the latter case. Set errno to zero before > calling it to avoid misdiagnosing an out-of-memory condition due to a > left-over value from some other function call. Looks good to me. > Do we need to save and restore the original value of errno? I doubt it, > but didn't think deeply about it, yet. I'd say no. Anybody depending on strbuf_getwholeline() is clearly already wrong in the error case. And in general I think we assume that syscalls can clear errno on success if they choose to (this isn't a syscall, but obviously it is calling some). -Peff
Re: [PATCH 0/4] dropping support for older curl
On 09/08/17 23:47, Jeff King wrote: On Wed, Aug 09, 2017 at 11:42:12PM +0200, Johannes Schindelin wrote: I mean, if we even go out of our way to support the completely outdated and obsolete .git/branches/ for what is likely a single user, it may not be the worst to keep those couple of #ifdef guards to keep at least nominal support for older cURLs? You've totally ignored the argument I made back then[1], and which I reiterated in this thread. So I'll say it one more time: the more compelling reason is not the #ifdefs, but the fact that the older versions are totally untested. Perhaps you forgot but I stated in the original thread that I build RPMS for RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite every single time. I currently have 2.13.3 up for el4, el5, el6 and el7. Only el4 requires any patches, the rest will build out of the box with the vendor supplied version of curl. The plan was to drop the el4 builds for 2.14.0 to get rid of the patches. In fact, they do not even compile, and yet I have not seen any patches to fix that. I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems at all neither with building nor with running the testsuite. So IMHO this is about being honest with users about which versions we _actually_ support. I have no problem with you wanting to drop support for older curl releases (such as 7.15.5) but don't use the argument that it doesn't currently build and nobody cares. Also FWIW Red Hat continues to support RHEL 5 with the Extended Life-cycle Support program until 2020-11-30. -tgc
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 9:44 PM, Stefan Beller wrote: > On Thu, Aug 10, 2017 at 12:39 PM, Christian Couder > wrote: >> On Thu, Aug 10, 2017 at 8:37 PM, Jeff King wrote: >>> On Thu, Aug 10, 2017 at 11:35:00AM -0700, Stefan Beller wrote: >>> On Thu, Aug 10, 2017 at 1:03 AM, Jeff King wrote: > The point of "--only-trailers" is to give a caller an output > that's easy for them to parse. Getting rid of the > non-trailer material helps, but we still may see more > complicated syntax like whitespace continuation. Let's add > an option to normalize the output into one "key: value" line > per trailer. > > As a bonus, this could be used even without --only-trailers > to clean up unusual formatting in the incoming data. This is useful for the parsing part, but for the writing part we'd rather want to have the opposite thing, such as '--line-break=rfc822'. But this doesn't have to be part of this series. With this in mind, I do not quite understand the latter use case how you would use normalized trailers without --only-trailers? >>> >>> If you prefer the normalized form (and the input was line-broken in a >>> way that you don't like), then this would convert to your preferred >>> form. I agree that you could potentially want the opposite (folding long >>> lines). Perhaps something like --wrap=72. >> >> Related to this, I wonder if people might want to "normalize" in >> different ways later. If that happens, we might regret having called >> this option "--normalize" instead of "--one-per-line" for example. > > What is normal? > > Maybe --style=[one-line, wrapped:72, rfc, json, xml, ...] > then? Yeah, we could go there right now (using perhaps "--pretty" or "--format" instead of "--style", so that we are more consistent with other commands), but on the other hand we don't know yet if the other formats will really be needed. > If you have --one-per-line, this may be orthogonal to e.g. json > (as json can be crammed into one line IIUC), but when given the > selection you cannot combine multiple styles. > > Scratch that, we actually want to combine these styles with each > other. Yeah, that's another possibility for the future. People might want a --json option that can be used both with and without --oneline. But as the future is difficult to predict, let's try to make it easy for us in both cases. And I think starting with just "--oneline" would be easier to deal with later than "--normalize" (or "--style" or "--pretty" or "--format") especially in the latter case.
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 11:06:40PM +0200, Christian Couder wrote: > >> Related to this, I wonder if people might want to "normalize" in > >> different ways later. If that happens, we might regret having called > >> this option "--normalize" instead of "--one-per-line" for example. > > > > What is normal? > > > > Maybe --style=[one-line, wrapped:72, rfc, json, xml, ...] > > then? > > Yeah, we could go there right now (using perhaps "--pretty" or > "--format" instead of "--style", so that we are more consistent with > other commands), but on the other hand we don't know yet if the other > formats will really be needed. But some of those things are not 1:1 mappings with normalization. For instance, --json presumably implies --only-trailers. Or are we proposing to break the whole commit message down into components and output it all as json? -Peff
Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
Jonathan Tan writes: > Here is the complete patch set. I have only moved the exported functions > that operate with packfiles and their static helpers - for example, > static functions like freshen_packed_object() that are used only by > non-pack-specific functions are not moved. This will interfere with smaller changes and fixes we want to have early in the 'master' branch, so while I think it is a good idea to do something like this in the longer term, I'd have to ask you to either hold on or rebase this on them (you'll know what else you are conflicting with when you try to merge this to 'pu' yourself). Thanks.
Re: [RFC] clang-format: outline the git project's coding style
On 08/10, Junio C Hamano wrote: > Brandon Williams writes: > > > On 08/10, Junio C Hamano wrote: > > > >> I vaguely recall that there was a discussion to have SubmitGit wait > >> for success from Travis CI; if that is already in place, then I can > >> sort of see how it would help individual contributors to have the > >> style checker in that pipeline as well. > >> > >> I have a mixed feelings about "fixing" styles automatically, though. > > > > I still think we are far away from a world where we can fix style > > automatically. If we do want to keep pursuing this there are a number > > steps we'd want to take first. > > > > 1. Settle on a concrete style and document it using a formatter's rules > >(in say a .clang-format file). This style would most likely need to > >be tuned a little bit, at least the 'Penalty' configuration would > >need to be tuned which (as far as I understand it) is used to > >determine which rule to break first to ensure a line isn't too long. > > Yes. I think this is what you started to get the ball rolling. > Together with what checkpatch.pl already diagnoses, I think we can > get a guideline that is more or less reasonable. > > > 2. Start getting contributors to use the tool to format their patches. > >This would include having some script or hook that a contributor > >could run to only format the sections of code that they touched. > > This, too. Running checkpatch.pl (possibly combined with a bit of > tweaking it to match our needs) already catches many of the issues, > so a tool with a similar interface would be easy to use, I would > imagine. > > > 3. Slowly the code base would begin to have a uniform style. At > >some point we may want to then reformat the remaining sections of the > >code base. At this point we could have some automated bot that fixes > >style. > > I suspect I am discussing this based on a different assumption. > > I think the primary goal of this effort is to make it easier to > cleanse the new patches that appear on the list of trivial style > issues, so that contributors and reviewers do not have to spend > bandwidth and brain cycles during the review. And I have been > assuming that we can do so even without waiting for a "tree wide" > code churn on existing code to complete. Yes that's one of the steps I missed we can call it 2.5 ;) (3) could be a long term goal which is what I was trying to get at by saying: > > 3. Slowly the code base would begin to have a uniform style. -- Brandon Williams
Re: [PATCH 0/4] dropping support for older curl
On Thu, Aug 10, 2017 at 10:33:18PM +0200, Tom G. Christensen wrote: > > You've totally ignored the argument I made back then[1], and which I > > reiterated in this thread. So I'll say it one more time: the more > > compelling reason is not the #ifdefs, but the fact that the older > > versions are totally untested. > > Perhaps you forgot but I stated in the original thread that I build RPMS for > RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite every > single time. I didn't forget. I actually double-checked the patches you sent at the time, but I didn't see one for the CURLPROTO issue. And indeed, it is still broken for me: $ cd /path/to/curl/repo $ git checkout curl-7_15_5 $ ./buildconf && ./configure --prefix=/tmp/foo && make install $ cd /path/to/git $ git checkout v2.14.0 $ make CURLDIR=/tmp/foo V=1 http.o gcc -o http.o -c -MF ./.depend/http.o.d -MQ http.o -MMD -MP -g -O0 -Wall -Werror -Wdeclaration-after-statement -Wpointer-arith -Wstrict-prototypes -Wvla -Wold-style-declaration -Wold-style-definition -Wno-error -Wno-cpp -Wno-unused-value -Wno-strict-prototypes -I. -DUSE_LIBPCRE1 -DHAVE_ALLOCA_H -I/tmp/foo/include -DUSE_CURL_FOR_IMAP_SEND -DNO_GETTEXT -DSHA1_DC -DSHA1DC_NO_STANDARD_INCLUDES -DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 -DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" -DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C="\"sha1dc_git.c\"" -DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H="\"sha1dc_git.h\"" -DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\"" -DHAVE_PATHS_H -DHAVE_DEV_TTY -DHAVE_CLOCK_GETTIME -DHAVE_CLOCK_MONOTONIC -DHAVE_GETDELIM -DFREAD_READS_DIRECTORIES -DNO_STRLCPY -DSHELL_PATH='"/bin/sh"' -DPAGER_ENV='"LESS=FRX LV=-c"' http.c http.c: In function ‘get_curl_allowed_protocols’: http.c:685:24: error: ‘CURLPROTO_HTTP’ undeclared (first use in this function); did you mean ‘CURLPROXY_HTTP’? allowed_protocols |= CURLPROTO_HTTP; ^~ CURLPROXY_HTTP [and so on] > I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems at > all neither with building nor with running the testsuite. As you can see, this does not compile for me. What's going on? I don't see how it could work, as CURLPROTO_HTTP is not defined at all in that version of curl. Can you please double-check that you're building against the correct version of curl, and that you are building the HTTP parts of Git (which _are_ optional, and the test suite will pass without them). > > So IMHO this is about being honest with users about which versions we > > _actually_ support. > > I have no problem with you wanting to drop support for older curl releases > (such as 7.15.5) but don't use the argument that it doesn't currently build > and nobody cares. My argument isn't quite that nobody cares. It's that we do users a disservice by shipping a version of the code that very well may have hidden problems like security holes (for instance, we do not handle redirects safely in old versions of curl). So if you can get it to build it may _seem_ fine, but it's a bit of a booby-trap waiting to spring. I also won't claim any absolutes. I think we all agree this is a cost/benefit tradeoff. But there are a lot of options for building on a very old system. For instance, building without http if you don't need it. Or building a more recent libcurl (and even linking statically for simplicity). I'd find arguments against the latter more compelling if recent curl were hard to compile on old systems. I don't know whether that's the case (certainly on a modern system, it's much easier to get newer versions of curl to compile than older ones). > Also FWIW Red Hat continues to support RHEL 5 with the Extended Life-cycle > Support program until 2020-11-30. I saw that, too. But as I understand it, they provide no code updates: no bugfixes and no security updates. They just promise to answer the phone and help you with troubleshooting. It's possible my perception is wrong, though; I'm certainly not one of their customers. -Peff
Re: [PATCH 0/4] dropping support for older curl
On Thu, Aug 10, 2017 at 11:36:41AM +0200, Johannes Schindelin wrote: > Hopefully I had better luck expressing my concerns this time? I understand your argument much better now. I'm still not sure I agree. -Peff
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Thu, 10 Aug 2017, Jeff King wrote: > On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: > > getdelim(3) returns -1 at the end of the file and if it encounters an > > error, but sets errno only in the latter case. Set errno to zero before > > calling it to avoid misdiagnosing an out-of-memory condition due to a > > left-over value from some other function call. > Looks good to me. > > Do we need to save and restore the original value of errno? I doubt it, > > but didn't think deeply about it, yet. > I'd say no. Anybody depending on strbuf_getwholeline() is clearly > already wrong in the error case. And in general I think we assume that > syscalls can clear errno on success if they choose to (this isn't a > syscall, but obviously it is calling some). Shouldn't ideally errno being reset to 0 upon check of the syscall successful run right after that syscall? (I also see some spots within git code where it sets errno to ENOMEM) -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik