Git Bash and Git GUI freezing just after opening
I have a fresh git installation on my windows 10 machine, but I am unable to use the Git Bash or the Git GUI because the program freezes just after opening. The only difference I have made on my machine was to update Windows 10 Pro to it's latest version (build 14393.105) and change the default Documents folder to another partition. I really need to fix this. How can I get some help?
[PATCH] Allow stashes to be referenced by index only
Instead of referencing "stash@{n}" explicitly, it can simply be referenced as "n". Most users only reference stashes by their position in the stash stask (what I refer to as the "index"). The syntax for the typical stash (stash@{n}) is slightly annoying and easy to forget, and sometimes difficult to escape properly in a script. Because of this the capability to do things with the stash by simply referencing the index is desirable. This patch includes the superior implementation provided by Øsse Walle (thanks for that), with a slight change to fix a broken test in the test suite. I also merged the test scripts as suggested by Jeff King, and un-wrapped the documentation as suggested by Junio Hamano. Signed-off-by: Aaron M Watson--- Documentation/git-stash.txt | 3 ++- git-stash.sh| 17 +++-- t/t3903-stash.sh| 35 +++ 3 files changed, 52 insertions(+), 3 deletions(-) diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt index 92df596..2e9cef0 100644 --- a/Documentation/git-stash.txt +++ b/Documentation/git-stash.txt @@ -39,7 +39,8 @@ The latest stash you created is stored in `refs/stash`; older stashes are found in the reflog of this reference and can be named using the usual reflog syntax (e.g. `stash@{0}` is the most recently created stash, `stash@{1}` is the one before it, `stash@{2.hours.ago}` -is also possible). +is also possible). Stashes may also be referenced by specifying just the +stash index (e.g. the integer `n` is equivalent to `stash@{n}`). OPTIONS --- diff --git a/git-stash.sh b/git-stash.sh index 826af18..d8d3b8d 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -384,9 +384,10 @@ parse_flags_and_rev() i_tree= u_tree= - REV=$(git rev-parse --no-flags --symbolic --sq "$@") || exit 1 + REV=$(git rev-parse --no-flags --symbolic --sq "$@" 2> /dev/null) FLAGS= + ARGV= for opt do case "$opt" in @@ -404,10 +405,13 @@ parse_flags_and_rev() die "$(eval_gettext "unknown option: \$opt")" FLAGS="${FLAGS}${FLAGS:+ }$opt" ;; + *) + ARGV="${ARGV}${ARGV:+ }'$opt'" + ;; esac done - eval set -- $REV + eval set -- $ARGV case $# in 0) @@ -422,6 +426,15 @@ parse_flags_and_rev() ;; esac + case "$1" in + *[!0-9]*) + : + ;; + *) + set -- "${ref_stash}@{$1}" + ;; + esac + REV=$(git rev-parse --symbolic --verify --quiet "$1") || { reference="$1" die "$(eval_gettext "\$reference is not a valid reference")" diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index 2142c1f..f82a8c4 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -131,6 +131,26 @@ test_expect_success 'drop middle stash' ' test 1 = $(git show HEAD:file) ' +test_expect_success 'drop middle stash by index' ' + git reset --hard && + echo 8 > file && + git stash && + echo 9 > file && + git stash && + git stash drop 1 && + test 2 = $(git stash list | wc -l) && + git stash apply && + test 9 = $(cat file) && + test 1 = $(git show :file) && + test 1 = $(git show HEAD:file) && + git reset --hard && + git stash drop && + git stash apply && + test 3 = $(cat file) && + test 1 = $(git show :file) && + test 1 = $(git show HEAD:file) +' + test_expect_success 'stash pop' ' git reset --hard && git stash pop && @@ -604,6 +624,21 @@ test_expect_success 'invalid ref of the form stash@{n}, n >= N' ' git stash drop ' +test_expect_success 'invalid ref of the form "n", n >= N' ' + git stash clear && + test_must_fail git stash drop 0 && + echo bar5 > file && + echo bar6 > file2 && + git add file2 && + git stash && + test_must_fail git stash drop 1 && + test_must_fail git stash pop 1 && + test_must_fail git stash apply 1 && + test_must_fail git stash show 1 && + test_must_fail git stash branch tmp 1 && + git stash drop +' + test_expect_success 'stash branch should not drop the stash if the branch exists' ' git stash clear && echo foo >file && -- 2.7.4
What's cooking in git.git (Sep 2016, #02; Thu, 8)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. There are a few more topics in flight that may be ready to be picked up but I haven't, and other topics in flight that may not be quite ready. I'll start merging topics that have been cooking in 'next' to 'master', rewind and rebuild 'next', and merge those that have been waiting in 'pu' to 'next' before picking them up. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [New Topics] * bc/object-id (2016-09-07) 20 commits - builtin/reset: convert to use struct object_id - builtin/commit-tree: convert to struct object_id - builtin/am: convert to struct object_id - refs: add an update_ref_oid function. - sha1_name: convert get_sha1_mb to struct object_id - builtin/update-index: convert file to struct object_id - notes: convert init_notes to use struct object_id - builtin/rm: convert to use struct object_id - builtin/blame: convert file to use struct object_id - Convert read_mmblob to take struct object_id. - notes-merge: convert struct notes_merge_pair to struct object_id - builtin/checkout: convert some static functions to struct object_id - streaming: make stream_blob_to_fd take struct object_id - builtin: convert textconv_object to use struct object_id - builtin/cat-file: convert some static functions to struct object_id - builtin/cat-file: convert struct expand_data to use struct object_id - builtin/log: convert some static functions to use struct object_id - builtin/blame: convert struct origin to use struct object_id - builtin/apply: convert static functions to struct object_id - cache: convert struct cache_entry to use struct object_id The "unsigned char sha1[20]" to "struct object_id" conversion continues. Notable changes in this round includes that ce->sha1, i.e. the object name recorded in the cache_entry, turns into an object_id. It had merge conflicts with a few topics in flight (Christian's "apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's "status --porcelain-v2"). Extra sets of eyes double-checking for mismerges are highly appreciated. * ep/use-git-trace-curl-in-tests (2016-09-07) 4 commits - t5551-http-fetch-smart.sh: use the GIT_TRACE_CURL environment var - t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var - test-lib.sh: preserve GIT_TRACE_CURL from the environment - t5541-http-push-smart.sh: use the GIT_TRACE_CURL environment var Update a few tests that used to use GIT_CURL_VERBOSE to use the newer GIT_TRACE_CURL. Will merge to 'next'. * jk/pack-tag-of-tag (2016-09-07) 5 commits - pack-objects: walk tag chains for --include-tag - t5305: simplify packname handling - t5305: use "git -C" - t5305: drop "dry-run" of unpack-objects - t5305: move cleanup into test block "git pack-objects --include-tag" was taught that when we know that we are sending an object C, we want a tag B that directly points at C but also a tag A that points at the tag B. We used to miss the intermediate tag B in some cases. * js/t6026-clean-up (2016-09-07) 1 commit - t6026-merge-attr: clean up background process at end of test case A test spawned a short-lived background process, which sometimes prevented the test directory from getting removed at the end of the script on some platforms. Will merge to 'next'. * js/t9903-chaining (2016-09-07) 1 commit - t9903: fix broken && chain Will merge to 'next'. * jt/accept-capability-advertisement-when-fetching-from-void (2016-09-07) 2 commits - connect: advertized capability is not a ref - tests: move test_lazy_prereq JGIT to test-lib.sh JGit can show a fake ref "capabilities^{}" to "git fetch" when it does not advertise any refs, but "git fetch" was not prepared to see such an advertisement. Waiting for a reroll. Rewording the log, and avoiding making it overly loose are needed. cf. <20160908013431.gc25...@google.com> * rs/compat-strdup (2016-09-07) 1 commit - compat: move strdup(3) replacement to its own file Will merge to 'next'. * rs/hex2chr (2016-09-07) 1 commit - introduce hex2chr() for converting two hexadecimal digits to a character Will merge to 'next'. * rt/rebase-i-broken-insn-advise (2016-09-07) 1 commit - rebase -i: improve advice on bad instruction lines When "git rebase -i" is given a broken instruction, it told the user to fix it with "--edit-todo", but didn't say what the step after that was (i.e. "--continue"). Will hold. Dscho's "rebase -i" hopefully will become available in 'pu', by which time an equivalent of this fix would be ported to C. This is queued merely as a reminder. *
Re: [PATCH v7 08/10] convert: modernize tests
On Thu, Sep 8, 2016 at 11:21 AM,wrote: > From: Lars Schneider > > Use `test_config` to set the config, check that files are empty with > `test_must_be_empty`, compare files with `test_cmp`, and remove spaces > after ">" and "<". > > Please note that the "rot13" filter configured in "setup" keeps using > `git config` instead of `test_config` because subsequent tests might > depend on it. > > Signed-off-by: Lars Schneider > --- Makes sense & Reviewed-by "Stefan Beller " Thanks, Stefan
Re: [PATCH v7 06/10] pkt-line: add functions to read/write flush terminated packet streams
On Thu, Sep 8, 2016 at 11:21 AM,wrote: > From: Lars Schneider > > write_packetized_from_fd() and write_packetized_from_buf() write a > stream of packets. All content packets use the maximal packet size > except for the last one. After the last content packet a `flush` control > packet is written. I presume we need both write_* things in a later patch; can you clarify why we need both of them? > + if (paket_len < 0) { > + if (oldalloc == 0) > + strbuf_release(sb_out); So if old alloc is 0, we release it, which is documented as /** * Release a string buffer and the memory it used. You should not use the * string buffer after using this function, unless you initialize it again. */ > + else > + strbuf_setlen(sb_out, oldlen); Otherwise we just set the length back, such that it looks like before. So as a caller the strbuf is in a different state in case of error depending whether the strbuf already had some data in it. I think it would be better if we only did `strbuf_setlen(sb_out, oldlen);` here, such that the caller can strbuf_release it unconditionally. Or to make things more confusing, you could use strbuf_reset in case of 0, as that is a strbuf_setlen internally. ;) > @@ -77,6 +79,11 @@ char *packet_read_line(int fd, int *size); > */ > char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size); > > +/* > + * Reads a stream of variable sized packets until a flush packet is detected. Strictly speaking we read until a packet of size 0 appears, but as per the implementation of packet_read we cannot distinguish between "" and "0004", i.e. an empty non-flush packet. So I think we're fine both in the implementation as well as the documentation here.
Re: [PATCH v2 38/38] refs: implement iteration over only per-worktree refs
Other than the duplicated sign-offs, this series looks good to me ("Don't act surprised, you guys, cuz I wrote 'em"). Kind of a funny place to cut it off, but I guess it makes sense. On Sun, 2016-09-04 at 18:08 +0200, Michael Haggerty wrote: > From: David Turner> > Alternate refs backends might still use files to store per-worktree > refs. So provide a way to iterate over only the per-worktree references > in a ref_store. The other backend can set up a files ref_store and > iterate using the new DO_FOR_EACH_PER_WORKTREE_ONLY flag when iterating.
Bug: git-p4 can generate duplicate commits when syncing changes that span multiple depot paths
Reproduction Steps: 1. Have a git repo cloned from a perforce repo using multiple depot paths (e.g. //depot/foo and //depot/bar). 2. Submit a single change to the perforce repo that makes changes in both //depot/foo and //depot/bar. 3. Run "git p4 sync" to sync the change from #2. Expected Behavior: Change should be synced as a single commit to the git repo. Actual Behavior: Change is synced as multiple commits, one for each depot path that was affected. Best Guess: I believe this is happening because the command syntax "p4 changes //depot/foo/...@123,456 //depot/bar/...@123,456", which git-p4 uses to get the list of changes to sync, will return the same change number multiple times if the change was present in multiple depot paths. This is expected behavior as per the p4 changes documentation: "If p4 changes is called with multiple file arguments, the sets of changelists that affect each argument are evaluated individually. The final output is neither combined nor sorted; the effect is the same as calling p4 changes multiple times, once for each file argument." git-p4 is handling the sorting itself, but it is not handling the combining. I would imagine this is fixable in the p4ChangesForPaths() method by dropping non-unique elements of the list before or after sorting. Rudimentary testing in the python interpreter would suggest that something like "changes = sorted(set(changes))" should do the trick, but I am no python expert so there may be a better way. Thanks! - James
Re: [PATCH] checkout: eliminate unnecessary merge for trivial checkout
On Thu, Sep 08, 2016 at 02:22:16PM -0700, Junio C Hamano wrote: > > + /* > > +* Optimize the performance of checkout when the current and > > +* new branch have the same OID and avoid the trivial merge. > > +* For example, a "git checkout -b foo" just needs to create > > +* the new ref and report the stats. > > +*/ > > + if (!old.commit || !new->commit > > + || oidcmp(>object.oid, >commit->object.oid) > > + || !opts->new_branch || opts->new_branch_force || > > opts->new_orphan_branch > > + || opts->patch_mode || opts->merge || opts->force || > > opts->force_detach > > + || opts->writeout_stage || !opts->overwrite_ignore > > + || opts->ignore_skipworktree || opts->ignore_other_worktrees > > + || opts->new_branch_log || opts->branch_exists || opts->prefix > > + || opts->source_tree) { > > ... this is a maintenance nightmare in that any new option we will > add later will need to consider what this "optimization" is trying > (not) to skip. The first two lines (i.e. we need a real checkout if > we cannot positively say that old and new commits are the same > object) are clear, but no explanation was given for all the other > random conditions this if condition checks. What if opts->something > was not listed (or "listed" for that matter) in the list above--it > is totally unclear if it was missed by mistake (or "added by > mistake") or deliberately excluded (or "deliberately added"). > > For example, why is opts->prefix there? If > > git checkout -b new-branch HEAD > > should be able to omit the two-way merge, shouldn't > > cd t && git checkout -b new-branch HEAD > > also be able to? I was just writing another reply, but I think our complaints may have dovetailed. My issue is that the condition above is an unreadable mass. It would be really nice to pull it out into a helper function, and then all of the items could be split out and commented independently, like: static int needs_working_tree_merge(const struct checkout_opts *opts, const struct branch_info *old, const struct branch_info *new) { /* * We must do the merge if we are actually moving to a new * commit. */ if (!old->commit || !new->commit || oidcmp(>object.oid, >commit->object.oid)) return 1; /* Option "foo" is not compatible because of... */ if (opts->foo) return 1; ... etc ... } That still leaves your "what if opts->something is not listed" question open, but at least it makes it easier to comment on it in the code. -Peff PS I didn't think hard on whether the conditions above make _sense_. My first goal would be to get more communication about them individually, and then we can evaluate them.
Re: [PATCH] gpg-interface: reflect stderr to stderr
Jeff Kingwrites: >> Even though this patch is fixing only one of the two issues, I am >> tempted to say that we should queue it for now, as it does so >> without breaking a bigger gain made by the original, i.e. we learn >> the status of verification in a way the authors of GPG wants us to, >> while somebody figuires out what the best way is to show the prompt >> to the console on Windows. > > That's OK by me, but I don't know if we can put off the "best way to > show the prompt" fix. It seems like a pretty serious regression for > people on Windows. Yes, I am not saying that it is OK to keep Windows users broken. As I understand what Dscho said correctly, his users are covered by a reversion of the "read the GPG status correctly" patch, i.e. with a different trade-off between the correctness of GPG status vs usability of the prompt, he will ship with Git for Windows, and that stop-gap measure will last only until developers who can do Windows (which excludes you, me, and Michael it seems) comes up with a solution that satisfies both. I consider that an approach that is perfectly fine.
Re: [PATCH 4/5] versioncmp: pass full tagnames to swap_prereleases()
SZEDER Gáborwrites: > I'm not sure about the relevancy of this pararaph, or the relevancy of > the original version for that matter. I mean, there is a different > character for sure, so it's really rather obvious that it can't > possibly be the same suffix in both, isn't it? So I don't think it > adds much value, and don't mind deleting it in the reroll. Concurred. Let's lose this confusing statement. Thanks.
Re: [PATCH v2 3/3] Use the newly-introduced regexec_buf() function
Jeff Kingwrites: >> commit f96e5673 ("grep: use REG_STARTEND for all matching if available", >> 22-05-2010) introduced this test and expects ".. NUL characters themselves >> are not matched in any way". With the native library on cygwin they are >> matched, with the compat/regex they are not. Indeed, if you use the system >> 'grep' command (rather than 'git grep'), then it will also not match ... :-D >> >> Slightly off topic, but ... > > Hmm. So it sounds like the "regmatch" in grep.c could go away in favor > of Johannes's regexec_buf(), and cygwin ought to be using NO_REGEX. Sounds like a plan.
Re: [PATCH v7 05/10] pkt-line: add packet_write_gently()
On Thu, Sep 8, 2016 at 11:21 AM,wrote: > From: Lars Schneider > > packet_write_fmt_gently() uses format_packet() which lets the caller > only send string data via "%s". That means it cannot be used for > arbitrary data that may contain NULs. Makes sense. > > Add packet_write_gently() which writes arbitrary data and returns `0` > for success and `-1` for an error. I think documenting the return code is better done in either the header file or in a commend preceding the implementation instead of the commit message? Maybe just a generic comment for *_gently is good enough, maybe even no comment. So the commit is fine, too. I dunno. > This function is used by other > pkt-line functions in a subsequent patch. That's what I figured. Do we also need to mention that in the preceding patch for packet_flush_gently ? > > Signed-off-by: Lars Schneider > --- > pkt-line.c | 19 +++ > 1 file changed, 19 insertions(+) > > diff --git a/pkt-line.c b/pkt-line.c > index 37345ca..1d3d725 100644 > --- a/pkt-line.c > +++ b/pkt-line.c > @@ -181,6 +181,25 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...) > return status; > } > > +int packet_write_gently(const int fd_out, const char *buf, size_t size) > +{ > + static char packet_write_buffer[LARGE_PACKET_MAX]; > + > + if (size > sizeof(packet_write_buffer) - 4) { > + error("packet write failed"); > + return -1; > + } > + packet_trace(buf, size, 1); > + size += 4; > + set_packet_header(packet_write_buffer, size); > + memcpy(packet_write_buffer + 4, buf, size - 4); > + if (write_in_full(fd_out, packet_write_buffer, size) == size) > + return 0; > + > + error("packet write failed"); > + return -1; > +} > + > void packet_buf_write(struct strbuf *buf, const char *fmt, ...) > { > va_list args; > -- > 2.10.0 >
Re: [PATCH] Move format-patch base commit and prerequisites before email signature
Jeff Kingwrites: > On Thu, Sep 08, 2016 at 11:54:08AM -0700, Josh Triplett wrote: > >> > your problem description >> > looks perfect. I am still not sure if the code does a reasonable >> > thing in MIME case, though. >> >> It *looks* correct to me. > > Hmm. It looks correct to me, too; ... > ... > So this is actually fixing a bug,... Yes, I actually wanted to hear that from Josh and have that in the proposed log message ;-).
Re: [PATCH] checkout: eliminate unnecessary merge for trivial checkout
Ben Peartwrites: > Teach git to avoid unnecessary merge during trivial checkout. When > running 'git checkout -b foo' git follows a common code path through > the expensive merge_working_tree even when it is unnecessary. I would be lying if I said I am not sympathetic to the cause, but... > + /* > + * Optimize the performance of checkout when the current and > + * new branch have the same OID and avoid the trivial merge. > + * For example, a "git checkout -b foo" just needs to create > + * the new ref and report the stats. > + */ > + if (!old.commit || !new->commit > + || oidcmp(>object.oid, >commit->object.oid) > + || !opts->new_branch || opts->new_branch_force || > opts->new_orphan_branch > + || opts->patch_mode || opts->merge || opts->force || > opts->force_detach > + || opts->writeout_stage || !opts->overwrite_ignore > + || opts->ignore_skipworktree || opts->ignore_other_worktrees > + || opts->new_branch_log || opts->branch_exists || opts->prefix > + || opts->source_tree) { ... this is a maintenance nightmare in that any new option we will add later will need to consider what this "optimization" is trying (not) to skip. The first two lines (i.e. we need a real checkout if we cannot positively say that old and new commits are the same object) are clear, but no explanation was given for all the other random conditions this if condition checks. What if opts->something was not listed (or "listed" for that matter) in the list above--it is totally unclear if it was missed by mistake (or "added by mistake") or deliberately excluded (or "deliberately added"). For example, why is opts->prefix there? If git checkout -b new-branch HEAD should be able to omit the two-way merge, shouldn't cd t && git checkout -b new-branch HEAD also be able to? Even the main condition is unclear. It wants to see that old and new have exactly the same commit, but shouldn't the "the result of the two-way merge is known to be no-op" logic equally apply if the old and two trees are the same?
Re: [PATCH v7 03/10] pkt-line: add packet_write_fmt_gently()
On Thu, Sep 8, 2016 at 11:21 AM,wrote: > +static int packet_write_fmt_1(int fd, int gently, > + const char *fmt, va_list args) > +{ > + struct strbuf buf = STRBUF_INIT; > + size_t count; > + > + format_packet(, fmt, args); > + count = write_in_full(fd, buf.buf, buf.len); > + if (count == buf.len) > + return 0; > + > + if (!gently) { call check_pipe from write_or_die here instead of reproducing that function? > + if (errno == EPIPE) { > + if (in_async()) > + async_exit(141); > + > + signal(SIGPIPE, SIG_DFL); > + raise(SIGPIPE); > + /* Should never happen, but just in case... */ > + exit(141); > + } > + die_errno("packet write error"); > + } > + error("packet write failed"); > + return -1; I think the more idiomatic way is to return error(...); as error always return -1.
Re: [PATCH 3/3] diff: remove dead code
Stefan Bellerwrites: > When `len < 1`, len has to be 0 or negative, emit_line will then remove the > first character and by then `len` would be negative. As this doesn't > happen, it is safe to assume it is dead code. > > This continues to simplify the code, which was started in b8d9c1a66b > (2009-09-03, diff.c: the builtin_diff() deals with only two-file > comparison). We look at line[0] to see if it is '@' before this check, which would have been wrong if "len < 1" were ever true. > > Signed-off-by: Stefan Beller > --- > diff.c | 8 > 1 file changed, 8 deletions(-) > > diff --git a/diff.c b/diff.c > index 79ad91d..c143019 100644 > --- a/diff.c > +++ b/diff.c > @@ -1251,14 +1251,6 @@ static void fn_out_consume(void *priv, char *line, > unsigned long len) > return; > } > > - if (len < 1) { > - emit_line(o, reset, reset, line, len); > - if (ecbdata->diff_words > - && ecbdata->diff_words->type == DIFF_WORDS_PORCELAIN) > - fputs("~\n", o->file); > - return; > - } > - > if (ecbdata->diff_words) { > if (line[0] == '-') { > diff_words_append(line, len,
Re: [PATCH 2/3] diff: omit found pointer from emit_callback
Stefan Bellerwrites: > diff --git a/diff.c b/diff.c > index 4a6501c..79ad91d 100644 > --- a/diff.c > +++ b/diff.c > @@ -354,7 +354,6 @@ struct emit_callback { > const char **label_path; > struct diff_words_data *diff_words; > struct diff_options *opt; > - int *found_changesp; > struct strbuf *header; > }; I briefly wondered if we have some callsites that do not want o->found_changes to be modified (hence pointing this field at elsewhere), but the fact that you can _remove_ this field means that there is no such use case, which is good. > @@ -722,7 +721,6 @@ static void emit_rewrite_diff(const char *name_a, > > memset(, 0, sizeof(ecbdata)); > ecbdata.color_diff = want_color(o->use_color); > - ecbdata.found_changesp = >found_changes; > ecbdata.ws_rule = whitespace_rule(name_b); > ecbdata.opt = o; > if (ecbdata.ws_rule & WS_BLANK_AT_EOF) { > @@ -1215,13 +1213,13 @@ static void fn_out_consume(void *priv, char *line, > unsigned long len) > const char *reset = diff_get_color(ecbdata->color_diff, DIFF_RESET); > struct diff_options *o = ecbdata->opt; > const char *line_prefix = diff_line_prefix(o); > + o->found_changes = 1; > > if (ecbdata->header) { > fprintf(o->file, "%s", ecbdata->header->buf); > strbuf_reset(ecbdata->header); > ecbdata->header = NULL; > } > - *(ecbdata->found_changesp) = 1; Is there a good reason to move the assignment up? "The fact that this function was called even once means we found some change" is probably a good argument, but then I'd prefer to have a blank before it to separate it (the first statement) from the block of decls. No need to resend. Thanks.
Re: [PATCH 01/13] i18n: apply: mark plural string for translation
Thanks. I'll skip 01-03/13 and queue the remainder for now, as I'd want to see Christian's "split builtin/apply.c into two, moving bulk to apply.c at the top-level to be reused" merged to 'next' sooner and to 'master' hopefully during this cycle.
[PATCH] checkout: eliminate unnecessary merge for trivial checkout
Teach git to avoid unnecessary merge during trivial checkout. When running 'git checkout -b foo' git follows a common code path through the expensive merge_working_tree even when it is unnecessary. As a result, 95% of the time is spent in merge_working_tree doing the 2-way merge between the new and old commit trees that is unneeded. The time breakdown is as follows: merge_working_tree <-- 95% unpack_trees <-- 80% traverse_trees <-- 50% cache_tree_update <-- 17% mark_new_skip_worktree <-- 10% With a large repo, this cost is pronounced. Using "git checkout -b r" to create and switch to a new branch costs 166 seconds (all times worst case with a cold file system cache). git.c:406 trace: built-in: git 'checkout' '-b' 'r' read-cache.c:1667 performance: 17.442926555 s: read_index_from name-hash.c:128 performance: 2.912145231 s: lazy_init_name_hash read-cache.c:2208 performance: 4.387713335 s: write_locked_index trace.c:420 performance: 166.458921289 s: git command: 'c:\Users\benpeart\bin\git.exe' 'checkout' '-b' 'r' Switched to a new branch 'r' By adding a test to skip the unnecessary call to merge_working_tree in this case reduces the cost to 16 seconds. git.c:406 trace: built-in: git 'checkout' '-b' 's' read-cache.c:1667 performance: 16.100742476 s: read_index_from trace.c:420 performance: 16.461547867 s: git command: 'c:\Users\benpeart\bin\git.exe' 'checkout' '-b' 's' Switched to a new branch 's' Signed-off-by: Ben Peart--- builtin/checkout.c | 23 +++ 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index 8672d07..595d64b 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -827,10 +827,25 @@ static int switch_branches(const struct checkout_opts *opts, parse_commit_or_die(new->commit); } - ret = merge_working_tree(opts, , new, _error); - if (ret) { - free(path_to_free); - return ret; + /* +* Optimize the performance of checkout when the current and +* new branch have the same OID and avoid the trivial merge. +* For example, a "git checkout -b foo" just needs to create +* the new ref and report the stats. +*/ + if (!old.commit || !new->commit + || oidcmp(>object.oid, >commit->object.oid) + || !opts->new_branch || opts->new_branch_force || opts->new_orphan_branch + || opts->patch_mode || opts->merge || opts->force || opts->force_detach + || opts->writeout_stage || !opts->overwrite_ignore + || opts->ignore_skipworktree || opts->ignore_other_worktrees + || opts->new_branch_log || opts->branch_exists || opts->prefix + || opts->source_tree) { + ret = merge_working_tree(opts, , new, _error); + if (ret) { + free(path_to_free); + return ret; + } } if (!opts->quiet && !old.path && old.commit && new->commit != old.commit) -- 2.10.0.windows.1
Re: [PATCH 4/5] versioncmp: pass full tagnames to swap_prereleases()
Quoting Junio C Hamano: SZEDER Gábor writes: - * Note that we don't have to deal with the situation when both p1 and - * p2 start with the same suffix because the common part is already + * Note that we don't have to deal with the situation when both s1 and + * s2 contain the same suffix because the common part is already * consumed by the caller. "The common part is already consumed" was relevant while the function was fed p1 and p2, i.e. the first difference, but the whole point of passing the original s1 and s2 with ofs is so that the function can look behind ofs as necessary. Is "already consumed" still correct (or relevant) with s/p/s/ you did to its calling convention? Well, it's still correct in the sense that we don't have to worry about finding the same suffix in both strings. However, "consume" is not the right word to use here, as incrementing an offset until it points past the common part doesn't count as "consumption", so more rewording would be necessary. I'm not sure about the relevancy of this pararaph, or the relevancy of the original version for that matter. I mean, there is a different character for sure, so it's really rather obvious that it can't possibly be the same suffix in both, isn't it? So I don't think it adds much value, and don't mind deleting it in the reroll. Best, Gábor
Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests
Johannes Schindelinwrites: > On Thu, 1 Sep 2016, Junio C Hamano wrote: > >> Hopefully that [patch removing the - suffix] would help making >> Dscho's "what are the failed tests?" logic simpler. > > Of course. > > It also makes sure that those 2 hours I spent on writing and perfecting > the sed magic were spent in vain... ;-) Well it is either * the sed magic is so arcane that you'd need to spend a long time, comparable to 2 hours you already spent, if you ever need to look at it and figure out what it does next time you need to change something in it. or * you are not familiar with the sed magic and you would be able to write the same thing in 2 minutes next time if you need to adjust it when we add -pid back later. Either way, those 2 hours are not wasted. I personally fall into the former category. Any sed script that needs G, h, and x together I need to spend at least 15 minutes just to warm myself up, as I do not work with the language that often. Thanks ;-)
Re: [PATCH] Move format-patch base commit and prerequisites before email signature
On Thu, Sep 08, 2016 at 11:54:08AM -0700, Josh Triplett wrote: > > your problem description > > looks perfect. I am still not sure if the code does a reasonable > > thing in MIME case, though. > > It *looks* correct to me. Hmm. It looks correct to me, too; we stick it just after the patch, so with "--attach" it is part of the text/x-patch, which is reasonable. But looking at the results of "--attach" from _before_ your patch, it looks totally broken. The "base" information comes _after the final delimiter of the multipart/mixed. Most mailers would just throw it away when decoding the multipart, I think. So this is actually fixing a bug, and you could probably add a test (though I am not sure we have anything in git that actually parses multipart messages _or_ that carefully consumes the base-commit info, so it might be hard to test in practice). -Peff
Re: [PATCH 3/3] checkout: fix ambiguity check in subdir
Nguyễn Thái Ngọc Duywrites: > The two functions in parse_branchname_arg(), verify_non_filename and > check_filename, need correct prefix in order to reconstruct the paths > and check for their existence. With NULL prefix, they just check paths > at top dir instead. Good eyes. Will queue. > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > builtin/checkout.c| 4 ++-- > t/t2010-checkout-ambiguous.sh | 9 + > t/t2024-checkout-dwim.sh | 12 > 3 files changed, 23 insertions(+), 2 deletions(-) > > diff --git a/builtin/checkout.c b/builtin/checkout.c > index 1f71d06..53c7284 100644 > --- a/builtin/checkout.c > +++ b/builtin/checkout.c > @@ -985,7 +985,7 @@ static int parse_branchname_arg(int argc, const char > **argv, > int recover_with_dwim = dwim_new_local_branch_ok; > > if (!has_dash_dash && > - (check_filename(NULL, arg) || !no_wildcard(arg))) > + (check_filename(opts->prefix, arg) || !no_wildcard(arg))) > recover_with_dwim = 0; > /* >* Accept "git checkout foo" and "git checkout foo --" > @@ -1046,7 +1046,7 @@ static int parse_branchname_arg(int argc, const char > **argv, >* it would be extremely annoying. >*/ > if (argc) > - verify_non_filename(NULL, arg); > + verify_non_filename(opts->prefix, arg); > } else { > argcount++; > argv++; > diff --git a/t/t2010-checkout-ambiguous.sh b/t/t2010-checkout-ambiguous.sh > index e76e84a..2e47fe0 100755 > --- a/t/t2010-checkout-ambiguous.sh > +++ b/t/t2010-checkout-ambiguous.sh > @@ -41,6 +41,15 @@ test_expect_success 'check ambiguity' ' > test_must_fail git checkout world all > ' > > +test_expect_success 'check ambiguity in subdir' ' > + mkdir sub && > + # not ambiguous because sub/world does not exist > + git -C sub checkout world ../all && > + echo hello >sub/world && > + # ambiguous because sub/world does exist > + test_must_fail git -C sub checkout world ../all > +' > + > test_expect_success 'disambiguate checking out from a tree-ish' ' > echo bye > world && > git checkout world -- world && > diff --git a/t/t2024-checkout-dwim.sh b/t/t2024-checkout-dwim.sh > index 468a000..3e5ac81 100755 > --- a/t/t2024-checkout-dwim.sh > +++ b/t/t2024-checkout-dwim.sh > @@ -174,6 +174,18 @@ test_expect_success 'checkout of branch with a file > having the same name fails' > test_branch master > ' > > +test_expect_success 'checkout of branch with a file in subdir having the > same name fails' ' > + git checkout -B master && > + test_might_fail git branch -D spam && > + > + >spam && > + mkdir sub && > + mv spam sub/spam && > + test_must_fail git -C sub checkout spam && > + test_must_fail git rev-parse --verify refs/heads/spam && > + test_branch master > +' > + > test_expect_success 'checkout -- succeeds, even if a file with the > same name exists' ' > git checkout -B master && > test_might_fail git branch -D spam &&
Re: [PATCH] gpg-interface: reflect stderr to stderr
On Thu, Sep 08, 2016 at 11:20:09AM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > On Wed, Sep 07, 2016 at 10:27:34AM +0200, Michael J Gruber wrote: > > > >> Now, I can't reproduce C on Linux[*], so there is more involved. It > >> could be that my patch just exposes a problem in our start_command() > >> etc.: run-command.c contains a lot of ifdefing, so possibly quite > >> different code is run on different platforms. > > > > Maybe, though my blind guess is that it is simply that on Linux we can > > open /dev/tty directly, and console-IO on Windows is a bit more > > complicated. > > True. > > Even though this patch is fixing only one of the two issues, I am > tempted to say that we should queue it for now, as it does so > without breaking a bigger gain made by the original, i.e. we learn > the status of verification in a way the authors of GPG wants us to, > while somebody figuires out what the best way is to show the prompt > to the console on Windows. That's OK by me, but I don't know if we can put off the "best way to show the prompt" fix. It seems like a pretty serious regression for people on Windows. -Peff
Re: [PATCH 2/3] checkout.txt: document a common case that ignores ambiguation rules
Nguyễn Thái Ngọc Duywrites: > Normally we err on the safe side: if something can be seen as both an > SHA1 and a pathspec, we stop and scream. In checkout, there is one > exception added in 859fdab (git-checkout: improve error messages, detect > ambiguities. - 2008-07-23), to allow the common case "git checkout > branch". Let's document this exception. Good idea, but... > +ARGUMENT AMBIGUATION > + > + > +When there is only one argument given and it is not `--` (e.g. "git > +checkout abc"), "abc" could be seen as either a `` or a > +``, but Git will assume the argument is a ``, which is > +a common case for switching branches. Use `git checkout -- ` > +form if you mean it to be a pathspec. ... this is far from reasonable. I'd read "but Git will assume the argument is a tree-ish" to mean "git checkout Makefile" would attempt to checkout the Makefile branch and fail. When there is only one argument given and it is not `--` (e.g. "git checkout abc"), and when the argument is both a valid `` (e.g. a branch "abc" exists) and a valid `` (e.g. a file or a directory whose name is "abc" exists), Git would usually ask you to disambiguate. Because checking out a branch is so common an operation, however, "git checkout abc" takes "abc" as a `` in such situation. Use `git checkout -- ` if you want to checkout these paths out of the index. or something like that?
Re: [PATCH 2/3] t0001: work around the bug that reads config file before repo setup
On Thu, Sep 08, 2016 at 08:47:18PM +0700, Nguyễn Thái Ngọc Duy wrote: > git-init somehow reads '.git/config' at current directory and sets > log_all_ref_updates based on this file. Because log_all_ref_updates is > not unspecified (-1) any more. It will not be written to the new repo's > config file (see create_default_files() function). > > This will affect our tests in the next patch as we will compare the > config file and expect that core.logallrefupdates is already set to true > by "git init main-worktree". This is a bug for more than worktrees, and is something I'm working on fixing (what I'd like to do is kill off the lazy fallback to ".git/" as the repo name, which is almost always the wrong thing to do). I'm not opposed to your workaround, but just FYI. -Peff
Re: [PATCH 3/3] init: do not set core.worktree more often than necessary
Nguyễn Thái Ngọc Duywrites: > +/* > + * Return the first ".git" that we have encountered. > + * FIXME this function for not entirely correct because > + * setup_git_directory() and enter_repo() do not update first_git_dir > + * when they follow .git files. The function in its current state is > + * only suitable for "git init". > + */ Would it be possible to move this to "init-db.c" then? The very first thing cmd_init_db() does to what is in the environment.c is to call set_git_dir() via set_git_dir_init() to tell it where the ".git" thing is, no? Can't that code remember the location itself, instead of adding code that is known not to be usable by other callers? That would help avoiding the future confusion. > +const char *get_first_git_dir(void) > +{ > + return first_git_dir ? first_git_dir : git_dir; > +} > + > const char *get_git_common_dir(void) > { > return git_common_dir; > diff --git a/t/t0001-init.sh b/t/t0001-init.sh > index 393c940..d59669a 100755 > --- a/t/t0001-init.sh > +++ b/t/t0001-init.sh > @@ -393,9 +393,11 @@ test_expect_success 're-init from a linked worktree' ' > test_commit first && > git worktree add ../linked-worktree && > mv .git/info/exclude expected-exclude && > + cp .git/config expected-config && > find .git/worktrees -print | sort >expected && > git -C ../linked-worktree init && > test_cmp expected-exclude .git/info/exclude && > + test_cmp expected-config .git/config && > find .git/worktrees -print | sort >actual && > test_cmp expected actual > )
Re: [PATCH v2 3/3] Use the newly-introduced regexec_buf() function
On Thu, Sep 08, 2016 at 08:06:35PM +0100, Ramsay Jones wrote: > > Actually, I take it back again. Your test case doesn't have an embedded > > NUL in it (so we check that git finds it, but aside from the lack of > > segfault, stock git would already find it). > > This reminds me ... despite the native cygwin regex library no longer > having the 'regex bug' (ie t0070.5 now passes), I still have NO_REGEX > set on cygwin. This is because, when building with the native library, > we have an "unexpected pass" for t7008.12, which looks like: > > test_expect_failure 'git grep .fi a' ' > git grep .fi a > ' > [where the file a is set up earlier by: echo 'binaryQfile' | q_to_nul >a] > > commit f96e5673 ("grep: use REG_STARTEND for all matching if available", > 22-05-2010) introduced this test and expects ".. NUL characters themselves > are not matched in any way". With the native library on cygwin they are > matched, with the compat/regex they are not. Indeed, if you use the system > 'grep' command (rather than 'git grep'), then it will also not match ... :-D > > Slightly off topic, but ... Hmm. So it sounds like the "regmatch" in grep.c could go away in favor of Johannes's regexec_buf(), and cygwin ought to be using NO_REGEX. -Peff
Re: [PATCH 2/3] t0001: work around the bug that reads config file before repo setup
Nguyễn Thái Ngọc Duywrites: > git-init somehow reads '.git/config' at current directory and sets > log_all_ref_updates based on this file. Because log_all_ref_updates is > not unspecified (-1) any more. It will not be written to the new repo's > config file (see create_default_files() function). I vaguely recall this was discussed recently somewhere. Is this related to <20160902091130.jxckdeomw35ee...@sigill.intra.peff.net>? > This will affect our tests in the next patch as we will compare the > config file and expect that core.logallrefupdates is already set to true > by "git init main-worktree". Ugh. The "mv" workaround is fine to demonstrate that you have fixed one of two problems, but as you said, it would be nice to have another that expects failure until both of them gets fixed. > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > t/t0001-init.sh | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/t/t0001-init.sh b/t/t0001-init.sh > index d64e5e3..393c940 100755 > --- a/t/t0001-init.sh > +++ b/t/t0001-init.sh > @@ -385,7 +385,9 @@ test_expect_success MINGW 'bare git dir not hidden' ' > ' > > test_expect_success 're-init from a linked worktree' ' > + mv .git/config work-around-init-reading-wrong-file && > git init main-worktree && > + mv work-around-init-reading-wrong-file .git/config && > ( > cd main-worktree && > test_commit first &&
Re: [PATCH 1/3] init: correct re-initialization from a linked worktree
Nguyễn Thái Ngọc Duywrites: > When 'git init' is called from a linked worktree, '.git' dir as the main > '.git' (i.e. $GIT_COMMON_DIR) and populate the whole repository skeleton > in there. It does not harm anything (*) but it is still wrong. -ECANNOTPARSE. Did you mean "... worktree, we treat '.git' dir as if it is the main '.git' ..." or something entirely different? > diff --git a/builtin/init-db.c b/builtin/init-db.c > index 3a45f0b..6d9552e 100644 > --- a/builtin/init-db.c > +++ b/builtin/init-db.c > @@ -138,7 +138,7 @@ static void copy_templates(const char *template_dir) > goto close_free_return; > } > > - strbuf_addstr(, get_git_dir()); > + strbuf_addstr(, get_git_common_dir()); > strbuf_complete(, '/'); > copy_templates_1(, _path, dir); > close_free_return: > diff --git a/t/t0001-init.sh b/t/t0001-init.sh > index a6fdd5e..d64e5e3 100755 > --- a/t/t0001-init.sh > +++ b/t/t0001-init.sh > @@ -384,4 +384,19 @@ test_expect_success MINGW 'bare git dir not hidden' ' > ! is_hidden newdir > ' > > +test_expect_success 're-init from a linked worktree' ' > + git init main-worktree && > + ( > + cd main-worktree && > + test_commit first && > + git worktree add ../linked-worktree && > + mv .git/info/exclude expected-exclude && > + find .git/worktrees -print | sort >expected && > + git -C ../linked-worktree init && > + test_cmp expected-exclude .git/info/exclude && > + find .git/worktrees -print | sort >actual && > + test_cmp expected actual > + ) > +' > + > test_done
Re: [PATCH] Move format-patch base commit and prerequisites before email signature
Josh Triplettwrites: > If any other change ends up being necessary, I'll split the patch in v2. Thanks. I do not see anything else offhand myself, but other people watching the topic from the sideline may spot something we missed.
Re: [PATCH v2 3/3] Use the newly-introduced regexec_buf() function
On 08/09/16 09:35, Jeff King wrote: > On Thu, Sep 08, 2016 at 04:14:46AM -0400, Jeff King wrote: > >> On Thu, Sep 08, 2016 at 04:10:24AM -0400, Jeff King wrote: >> >>> On Thu, Sep 08, 2016 at 09:54:51AM +0200, Johannes Schindelin wrote: >>> > diff.c | 3 ++- > diffcore-pickaxe.c | 18 -- > xdiff-interface.c | 13 - > 3 files changed, 14 insertions(+), 20 deletions(-) I just realized that this should switch the test_expect_failure from 1/3 to a test_expect_success. >>> >>> Yep. I wonder if we also would want to test that we correctly find >>> regexes inside binary files. >>> >>> E.g., given a mixed binary/text file like: >>> >>> printf 'binary\0text' >file && >>> git add file && >>> git commit -m file >>> >>> then "git log -Stext" will find that file, but "--pickaxe-regex" will >>> not (using stock git). Ditto for "-Gtext". >>> >>> Your patch should fix that. >> >> Of course if I had actually _looked carefully_ at your patch, I would >> have seen that your test doesn't just check that we don't segfault, but >> actually confirms that we find the entry. >> >> Sorry for the noise. > > Actually, I take it back again. Your test case doesn't have an embedded > NUL in it (so we check that git finds it, but aside from the lack of > segfault, stock git would already find it). This reminds me ... despite the native cygwin regex library no longer having the 'regex bug' (ie t0070.5 now passes), I still have NO_REGEX set on cygwin. This is because, when building with the native library, we have an "unexpected pass" for t7008.12, which looks like: test_expect_failure 'git grep .fi a' ' git grep .fi a ' [where the file a is set up earlier by: echo 'binaryQfile' | q_to_nul >a] commit f96e5673 ("grep: use REG_STARTEND for all matching if available", 22-05-2010) introduced this test and expects ".. NUL characters themselves are not matched in any way". With the native library on cygwin they are matched, with the compat/regex they are not. Indeed, if you use the system 'grep' command (rather than 'git grep'), then it will also not match ... :-D Slightly off topic, but ... ATB, Ramsay Jones
Re: [PATCH] Move format-patch base commit and prerequisites before email signature
On Thu, Sep 08, 2016 at 11:34:15AM -0700, Junio C Hamano wrote: > Josh Triplettwrites: > > > Any text below the "-- " for the email signature gets treated as part of > > the signature, and many mail clients will trim it from the quoted text > > for a reply. Move it above the signature, so people can reply to it > > more easily. > > > > Add tests for the exact format of the email signature, and add tests to > > ensure the email signature appears last. > > > > (Patch by Junio Hamano; tests by Josh Triplett.) > > Signed-off-by: Josh Triplett > > --- > > > > Does the above seem reasonable, for a patch that incorporates the > > proposed patch from Message-Id > > xmqqh99rpud4@gitster.mtv.corp.google.com and adds tests? > > Other than that I'd probably retitle it, Ah, true, I should have titled it "format-patch: move base commit ...". > your problem description > looks perfect. I am still not sure if the code does a reasonable > thing in MIME case, though. It *looks* correct to me. > Thanks for tying the loose ends anyway. > > > diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh > > index b0579dd..a4af275 100755 > > --- a/t/t4014-format-patch.sh > > +++ b/t/t4014-format-patch.sh > > @@ -754,9 +754,22 @@ test_expect_success 'format-patch > > --ignore-if-in-upstream HEAD' ' > > git format-patch --ignore-if-in-upstream HEAD > > ' > > > > +git_version="$(git --version | sed "s/.* //")" > > + > > +signature() { > > + printf "%s\n%s\n\n" "-- " "${1:-$git_version}" > > +} > > Hmph. I would actually have expected that you would force a fixed > and an easily noticeable string via format.signature for the purpose > of the test, One of the git tests already did that. I just modified that test to test the exact signature format and that it appears at the end, rather than just grepping to check that the signature string appears somewhere. Then when doing so, I realized that I should check the default case too (at which point that test change probably should have gone in a separate patch). > but I guess this test covers a lot more than what the > purpose of the main part of the patch does (i.e. enforces that the > default signature must be made from the version string of Git). It > is not a bad thing to test, but it probably does not belong to this > change. If you _were_ to split the patch in two, that is where I > probably would split, i.e. "we didn't test what the default signature > looks like, or we didn't make sure --signature option overrides the > default signature, so let's test it" as the preliminary preparation, > followed by "having base info after sig is inconvenient, let's move > it and make sure base info stays before sig with additional test" as > the second (and primary) patch. > > But a single patch is fine. > > Thanks. If any other change ends up being necessary, I'll split the patch in v2. - Josh Triplett
Re: [PATCH 2/3] diff_populate_filespec: NUL-terminate buffers
On Thu, Sep 08, 2016 at 09:57:43AM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > Between the two options for regexec_buf(), I think you have convinced me > > that REG_STARTEND is better than just using compat/regex everywhere. I > > do think the fallback for platforms like musl should be "use > > compat/regex" and not doing an expensive copy (which in most cases is > > not even necessary). > > I agree with you that it would be the best approach to build > regexec_buf() that unconditionally uses REG_STARTEND and tell people > without REG_STARTEND to use compat/regex instead of their platform > regex library. > > The description in Makefile may want to be rephrased to clarify. > > -# Define NO_REGEX if you have no or inferior regex support in your C library. > +# Define NO_REGEX if your C library lacks regex support with REG_STARTEND > +# feature. > > The word "inferior" is not giving any useful information there. Yeah, very much agreed (and the "#error" when we lack REG_STARTEND I mentioned earlier may be a good hint). -Peff
Re: "fatal error in commit_refs" from pushing to github
On Thu, Sep 08, 2016 at 06:03:33PM +0700, Duy Nguyen wrote: > > So this is really an internal failure at the ref-update stage. There > > _should_ be a reasonable error message, but I think "fatal error in > > commit_refs" is the generic last-ditch fallback. I'll pass this along to > > people in charge of that code, as we should be generating a more useful > > error message. > > Hmm.. I'm interested in this because the "fix" is from client side. I > did "git gc" and "git fetch" and the problem was gone. It may also have been a transient error inside GitHub that resolved itself between your two pushes. -Peff
Re: [PATCH] Move format-patch base commit and prerequisites before email signature
Josh Triplettwrites: > Any text below the "-- " for the email signature gets treated as part of > the signature, and many mail clients will trim it from the quoted text > for a reply. Move it above the signature, so people can reply to it > more easily. > > Add tests for the exact format of the email signature, and add tests to > ensure the email signature appears last. > > (Patch by Junio Hamano; tests by Josh Triplett.) > Signed-off-by: Josh Triplett > --- > > Does the above seem reasonable, for a patch that incorporates the > proposed patch from Message-Id > xmqqh99rpud4@gitster.mtv.corp.google.com and adds tests? Other than that I'd probably retitle it, your problem description looks perfect. I am still not sure if the code does a reasonable thing in MIME case, though. Thanks for tying the loose ends anyway. > diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh > index b0579dd..a4af275 100755 > --- a/t/t4014-format-patch.sh > +++ b/t/t4014-format-patch.sh > @@ -754,9 +754,22 @@ test_expect_success 'format-patch > --ignore-if-in-upstream HEAD' ' > git format-patch --ignore-if-in-upstream HEAD > ' > > +git_version="$(git --version | sed "s/.* //")" > + > +signature() { > + printf "%s\n%s\n\n" "-- " "${1:-$git_version}" > +} Hmph. I would actually have expected that you would force a fixed and an easily noticeable string via format.signature for the purpose of the test, but I guess this test covers a lot more than what the purpose of the main part of the patch does (i.e. enforces that the default signature must be made from the version string of Git). It is not a bad thing to test, but it probably does not belong to this change. If you _were_ to split the patch in two, that is where I probably would split, i.e. "we didn't test what the default signature looks like, or we didn't make sure --signature option overrides the default signature, so let's test it" as the preliminary preparation, followed by "having base info after sig is inconvenient, let's move it and make sure base info stays before sig with additional test" as the second (and primary) patch. But a single patch is fine. Thanks.
Re: [PATCH 2/3] diff_populate_filespec: NUL-terminate buffers
Hi Peff & Junio, On Thu, 8 Sep 2016, Junio C Hamano wrote: > Jeff Kingwrites: > > > Between the two options for regexec_buf(), I think you have convinced me > > that REG_STARTEND is better than just using compat/regex everywhere. I > > do think the fallback for platforms like musl should be "use > > compat/regex" and not doing an expensive copy (which in most cases is > > not even necessary). > > I agree with you that it would be the best approach to build > regexec_buf() that unconditionally uses REG_STARTEND and tell people > without REG_STARTEND to use compat/regex instead of their platform > regex library. > > The description in Makefile may want to be rephrased to clarify. > > -# Define NO_REGEX if you have no or inferior regex support in your C library. > +# Define NO_REGEX if your C library lacks regex support with REG_STARTEND > +# feature. Okay, I will make that happen. But first I need to sleep. Ciao, Dscho
[PATCH v7 10/10] convert: add filter..process option
From: Lars SchneiderGit's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter..process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter..process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright Reviewed-by: Jakub Narebski Signed-off-by: Lars Schneider --- Documentation/gitattributes.txt| 154 +- contrib/long-running-filter/example.pl | 111 ++ convert.c | 349 --- pkt-line.h | 1 + t/t0021-conversion.sh | 365 - t/t0021/rot13-filter.pl| 179 unpack-trees.c | 1 + 7 files changed, 1129 insertions(+), 31 deletions(-) create mode 100755 contrib/long-running-filter/example.pl create mode 100755 t/t0021/rot13-filter.pl diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index 7aff940..ac000ea 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -293,7 +293,13 @@ checkout, when the `smudge` command is specified, the command is fed the blob object from its standard input, and its standard output is used to update the worktree file. Similarly, the `clean` command is used to convert the contents of worktree file -upon checkin. +upon checkin. By default these commands process only a single +blob and terminate. If a long running `process` filter is used +in place of `clean` and/or `smudge` filters, then Git can process +all blobs with a single filter command invocation for the entire +life of a single Git command, for example `git add --all`. See +section below for the description of the protocol used to +communicate with a `process` filter. One use of the content filtering is to massage the content into a shape that is more convenient for the platform, filesystem, and the user to use. @@ -373,6 +379,152 @@ not exist, or may have different contents. So, smudge and clean commands should not try to access the file on disk, but only act as filters on the content provided to them on standard input. +Long Running Filter Process +^^^ + +If the filter command (a string value) is defined via +`filter..process` then Git can process all blobs with a +single filter invocation for the entire life of a single Git +command. This is achieved by using the following packet format +(pkt-line, see technical/protocol-common.txt) based protocol over +standard input and standard output. + +Git starts the filter when it encounters the first file +that needs to be cleaned or smudged. After the filter started +Git sends a welcome message ("git-filter-client"), a list of +supported protocol version numbers, and a flush packet. Git expects +to read a welcome response message ("git-filter-server") and exactly +one protocol version number from the previously sent list. All further +communication will be based on the selected version. The remaining +protocol description below documents "version=2". Please note that +"version=42" in the example below does not exist and is only there +to illustrate how the protocol would look like with more than one +version. +
[PATCH v7 09/10] convert: make apply_filter() adhere to standard Git error handling
From: Lars Schneiderapply_filter() returns a boolean that tells the caller if it "did convert or did not convert". The variable `ret` was used throughout the function to track errors whereas `1` denoted success and `0` failure. This is unusual for the Git source where `0` denotes success. Rename the variable and flip its value to make the function easier readable for Git developers. Signed-off-by: Lars Schneider --- convert.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/convert.c b/convert.c index 986c239..a068df7 100644 --- a/convert.c +++ b/convert.c @@ -451,7 +451,7 @@ static int apply_filter(const char *path, const char *src, size_t len, int fd, * * (child --> cmd) --> us */ - int ret = 1; + int err = 0; struct strbuf nbuf = STRBUF_INIT; struct async async; struct filter_params params; @@ -478,22 +478,22 @@ static int apply_filter(const char *path, const char *src, size_t len, int fd, if (strbuf_read(, async.out, len) < 0) { error("read from external filter '%s' failed", cmd); - ret = 0; + err = -1; } if (close(async.out)) { error("read from external filter '%s' failed", cmd); - ret = 0; + err = -1; } if (finish_async()) { error("external filter '%s' failed", cmd); - ret = 0; + err = -1; } - if (ret) { + if (!err) { strbuf_swap(dst, ); } strbuf_release(); - return ret; + return !err; } static struct convert_driver { -- 2.10.0
[PATCH v7 04/10] pkt-line: add packet_flush_gently()
From: Lars Schneiderpacket_flush() would die in case of a write error even though for some callers an error would be acceptable. Add packet_flush_gently() which writes a pkt-line flush packet and returns `0` for success and `-1` for failure. Signed-off-by: Lars Schneider --- pkt-line.c | 9 + pkt-line.h | 1 + 2 files changed, 10 insertions(+) diff --git a/pkt-line.c b/pkt-line.c index 3824d05..37345ca 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -91,6 +91,15 @@ void packet_flush(int fd) write_or_die(fd, "", 4); } +int packet_flush_gently(int fd) +{ + packet_trace("", 4, 1); + if (write_in_full(fd, "", 4) == 4) + return 0; + error("flush packet write failed"); + return -1; +} + void packet_buf_flush(struct strbuf *buf) { packet_trace("", 4, 1); diff --git a/pkt-line.h b/pkt-line.h index 3caea77..3fa0899 100644 --- a/pkt-line.h +++ b/pkt-line.h @@ -23,6 +23,7 @@ void packet_flush(int fd); void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3))); void packet_buf_flush(struct strbuf *buf); void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3))); +int packet_flush_gently(int fd); int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3))); /* -- 2.10.0
[PATCH v7 07/10] convert: quote filter names in error messages
From: Lars SchneiderGit filter driver commands with spaces (e.g. `filter.sh foo`) are hard to read in error messages. Quote them to improve the readability. Signed-off-by: Lars Schneider --- convert.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/convert.c b/convert.c index 077f5e6..986c239 100644 --- a/convert.c +++ b/convert.c @@ -412,7 +412,7 @@ static int filter_buffer_or_fd(int in, int out, void *data) child_process.out = out; if (start_command(_process)) - return error("cannot fork to run external filter %s", params->cmd); + return error("cannot fork to run external filter '%s'", params->cmd); sigchain_push(SIGPIPE, SIG_IGN); @@ -430,13 +430,13 @@ static int filter_buffer_or_fd(int in, int out, void *data) if (close(child_process.in)) write_err = 1; if (write_err) - error("cannot feed the input to external filter %s", params->cmd); + error("cannot feed the input to external filter '%s'", params->cmd); sigchain_pop(SIGPIPE); status = finish_command(_process); if (status) - error("external filter %s failed %d", params->cmd, status); + error("external filter '%s' failed %d", params->cmd, status); strbuf_release(); return (write_err || status); @@ -477,15 +477,15 @@ static int apply_filter(const char *path, const char *src, size_t len, int fd, return 0; /* error was already reported */ if (strbuf_read(, async.out, len) < 0) { - error("read from external filter %s failed", cmd); + error("read from external filter '%s' failed", cmd); ret = 0; } if (close(async.out)) { - error("read from external filter %s failed", cmd); + error("read from external filter '%s' failed", cmd); ret = 0; } if (finish_async()) { - error("external filter %s failed", cmd); + error("external filter '%s' failed", cmd); ret = 0; } -- 2.10.0
[PATCH v7 05/10] pkt-line: add packet_write_gently()
From: Lars Schneiderpacket_write_fmt_gently() uses format_packet() which lets the caller only send string data via "%s". That means it cannot be used for arbitrary data that may contain NULs. Add packet_write_gently() which writes arbitrary data and returns `0` for success and `-1` for an error. This function is used by other pkt-line functions in a subsequent patch. Signed-off-by: Lars Schneider --- pkt-line.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/pkt-line.c b/pkt-line.c index 37345ca..1d3d725 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -181,6 +181,25 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...) return status; } +int packet_write_gently(const int fd_out, const char *buf, size_t size) +{ + static char packet_write_buffer[LARGE_PACKET_MAX]; + + if (size > sizeof(packet_write_buffer) - 4) { + error("packet write failed"); + return -1; + } + packet_trace(buf, size, 1); + size += 4; + set_packet_header(packet_write_buffer, size); + memcpy(packet_write_buffer + 4, buf, size - 4); + if (write_in_full(fd_out, packet_write_buffer, size) == size) + return 0; + + error("packet write failed"); + return -1; +} + void packet_buf_write(struct strbuf *buf, const char *fmt, ...) { va_list args; -- 2.10.0
[PATCH v7 08/10] convert: modernize tests
From: Lars SchneiderUse `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Signed-off-by: Lars Schneider --- t/t0021-conversion.sh | 58 +-- 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh index e799e59..dc50938 100755 --- a/t/t0021-conversion.sh +++ b/t/t0021-conversion.sh @@ -38,8 +38,8 @@ script='s/^\$Id: \([0-9a-f]*\) \$/\1/p' test_expect_success check ' - cmp test.o test && - cmp test.o test.t && + test_cmp test.o test && + test_cmp test.o test.t && # ident should be stripped in the repository git diff --raw --exit-code :test :test.i && @@ -47,10 +47,10 @@ test_expect_success check ' embedded=$(sed -ne "$script" test.i) && test "z$id" = "z$embedded" && - git cat-file blob :test.t > test.r && + git cat-file blob :test.t >test.r && - ./rot13.sh < test.o > test.t && - cmp test.r test.t + ./rot13.sh test.t && + test_cmp test.r test.t ' # If an expanded ident ever gets into the repository, we want to make sure that @@ -130,7 +130,7 @@ test_expect_success 'filter shell-escaped filenames' ' # delete the files and check them out again, using a smudge filter # that will count the args and echo the command-line back to us - git config filter.argc.smudge "sh ./argc.sh %f" && + test_config filter.argc.smudge "sh ./argc.sh %f" && rm "$normal" "$special" && git checkout -- "$normal" "$special" && @@ -141,7 +141,7 @@ test_expect_success 'filter shell-escaped filenames' ' test_cmp expect "$special" && # do the same thing, but with more args in the filter expression - git config filter.argc.smudge "sh ./argc.sh %f --my-extra-arg" && + test_config filter.argc.smudge "sh ./argc.sh %f --my-extra-arg" && rm "$normal" "$special" && git checkout -- "$normal" "$special" && @@ -154,9 +154,9 @@ test_expect_success 'filter shell-escaped filenames' ' ' test_expect_success 'required filter should filter data' ' - git config filter.required.smudge ./rot13.sh && - git config filter.required.clean ./rot13.sh && - git config filter.required.required true && + test_config filter.required.smudge ./rot13.sh && + test_config filter.required.clean ./rot13.sh && + test_config filter.required.required true && echo "*.r filter=required" >.gitattributes && @@ -165,17 +165,17 @@ test_expect_success 'required filter should filter data' ' rm -f test.r && git checkout -- test.r && - cmp test.o test.r && + test_cmp test.o test.r && ./rot13.sh expected && git cat-file blob :test.r >actual && - cmp expected actual + test_cmp expected actual ' test_expect_success 'required filter smudge failure' ' - git config filter.failsmudge.smudge false && - git config filter.failsmudge.clean cat && - git config filter.failsmudge.required true && + test_config filter.failsmudge.smudge false && + test_config filter.failsmudge.clean cat && + test_config filter.failsmudge.required true && echo "*.fs filter=failsmudge" >.gitattributes && @@ -186,9 +186,9 @@ test_expect_success 'required filter smudge failure' ' ' test_expect_success 'required filter clean failure' ' - git config filter.failclean.smudge cat && - git config filter.failclean.clean false && - git config filter.failclean.required true && + test_config filter.failclean.smudge cat && + test_config filter.failclean.clean false && + test_config filter.failclean.required true && echo "*.fc filter=failclean" >.gitattributes && @@ -197,8 +197,8 @@ test_expect_success 'required filter clean failure' ' ' test_expect_success 'filtering large input to small output should use little memory' ' - git config filter.devnull.clean "cat >/dev/null" && - git config filter.devnull.required true && + test_config filter.devnull.clean "cat >/dev/null" && + test_config filter.devnull.required true && for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB && echo "30MB filter=devnull" >.gitattributes && GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB @@ -207,7 +207,7 @@ test_expect_success 'filtering large input to small output should use little mem test_expect_success 'filter that does not read is fine' ' test-genrandom foo $((128 * 1024 + 1)) >big && echo "big filter=epipe" >.gitattributes && - git config
[PATCH v7 06/10] pkt-line: add functions to read/write flush terminated packet streams
From: Lars Schneiderwrite_packetized_from_fd() and write_packetized_from_buf() write a stream of packets. All content packets use the maximal packet size except for the last one. After the last content packet a `flush` control packet is written. read_packetized_to_buf() reads arbitrary sized packets until it detects a `flush` packet. Signed-off-by: Lars Schneider --- pkt-line.c | 68 ++ pkt-line.h | 7 +++ 2 files changed, 75 insertions(+) diff --git a/pkt-line.c b/pkt-line.c index 1d3d725..5001a07 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -209,6 +209,47 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...) va_end(args); } +int write_packetized_from_fd(int fd_in, int fd_out) +{ + static char buf[PKTLINE_DATA_MAXLEN]; + int err = 0; + ssize_t bytes_to_write; + + while (!err) { + bytes_to_write = xread(fd_in, buf, sizeof(buf)); + if (bytes_to_write < 0) + return COPY_READ_ERROR; + if (bytes_to_write == 0) + break; + err = packet_write_gently(fd_out, buf, bytes_to_write); + } + if (!err) + err = packet_flush_gently(fd_out); + return err; +} + +int write_packetized_from_buf(const char *src_in, size_t len, int fd_out) +{ + static char buf[PKTLINE_DATA_MAXLEN]; + int err = 0; + size_t bytes_written = 0; + size_t bytes_to_write; + + while (!err) { + if ((len - bytes_written) > sizeof(buf)) + bytes_to_write = sizeof(buf); + else + bytes_to_write = len - bytes_written; + if (bytes_to_write == 0) + break; + err = packet_write_gently(fd_out, src_in + bytes_written, bytes_to_write); + bytes_written += bytes_to_write; + } + if (!err) + err = packet_flush_gently(fd_out); + return err; +} + static int get_packet_data(int fd, char **src_buf, size_t *src_size, void *dst, unsigned size, int options) { @@ -318,3 +359,30 @@ char *packet_read_line_buf(char **src, size_t *src_len, int *dst_len) { return packet_read_line_generic(-1, src, src_len, dst_len); } + +ssize_t read_packetized_to_buf(int fd_in, struct strbuf *sb_out) +{ + int paket_len; + int options = PACKET_READ_GENTLE_ON_EOF; + + size_t oldlen = sb_out->len; + size_t oldalloc = sb_out->alloc; + + for (;;) { + strbuf_grow(sb_out, PKTLINE_DATA_MAXLEN+1); + paket_len = packet_read(fd_in, NULL, NULL, + sb_out->buf + sb_out->len, PKTLINE_DATA_MAXLEN+1, options); + if (paket_len <= 0) + break; + sb_out->len += paket_len; + } + + if (paket_len < 0) { + if (oldalloc == 0) + strbuf_release(sb_out); + else + strbuf_setlen(sb_out, oldlen); + return paket_len; + } + return sb_out->len - oldlen; +} diff --git a/pkt-line.h b/pkt-line.h index 3fa0899..6df8449 100644 --- a/pkt-line.h +++ b/pkt-line.h @@ -25,6 +25,8 @@ void packet_buf_flush(struct strbuf *buf); void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3))); int packet_flush_gently(int fd); int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3))); +int write_packetized_from_fd(int fd_in, int fd_out); +int write_packetized_from_buf(const char *src_in, size_t len, int fd_out); /* * Read a packetized line into the buffer, which must be at least size bytes @@ -77,6 +79,11 @@ char *packet_read_line(int fd, int *size); */ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size); +/* + * Reads a stream of variable sized packets until a flush packet is detected. + */ +ssize_t read_packetized_to_buf(int fd_in, struct strbuf *sb_out); + #define DEFAULT_PACKET_MAX 1000 #define LARGE_PACKET_MAX 65520 extern char packet_buffer[LARGE_PACKET_MAX]; -- 2.10.0
[PATCH v7 02/10] pkt-line: extract set_packet_header()
From: Lars Schneiderset_packet_header() converts an integer to a 4 byte hex string. Make this function locally available so that other pkt-line functions can use it. Signed-off-by: Lars Schneider --- pkt-line.c | 19 +-- 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/pkt-line.c b/pkt-line.c index 0a9b61c..e8adc0f 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -97,10 +97,20 @@ void packet_buf_flush(struct strbuf *buf) strbuf_add(buf, "", 4); } -#define hex(a) (hexchar[(a) & 15]) -static void format_packet(struct strbuf *out, const char *fmt, va_list args) +static void set_packet_header(char *buf, const int size) { static char hexchar[] = "0123456789abcdef"; + + #define hex(a) (hexchar[(a) & 15]) + buf[0] = hex(size >> 12); + buf[1] = hex(size >> 8); + buf[2] = hex(size >> 4); + buf[3] = hex(size); + #undef hex +} + +static void format_packet(struct strbuf *out, const char *fmt, va_list args) +{ size_t orig_len, n; orig_len = out->len; @@ -111,10 +121,7 @@ static void format_packet(struct strbuf *out, const char *fmt, va_list args) if (n > LARGE_PACKET_MAX) die("protocol error: impossibly long line"); - out->buf[orig_len + 0] = hex(n >> 12); - out->buf[orig_len + 1] = hex(n >> 8); - out->buf[orig_len + 2] = hex(n >> 4); - out->buf[orig_len + 3] = hex(n); + set_packet_header(>buf[orig_len], n); packet_trace(out->buf + orig_len + 4, n - 4, 1); } -- 2.10.0
[PATCH v7 01/10] pkt-line: rename packet_write() to packet_write_fmt()
From: Lars Schneiderpacket_write() should be called packet_write_fmt() as the string parameter can be formatted. Suggested-by: Junio C Hamano Signed-off-by: Lars Schneider --- builtin/archive.c| 4 ++-- builtin/receive-pack.c | 4 ++-- builtin/remote-ext.c | 4 ++-- builtin/upload-archive.c | 4 ++-- connect.c| 2 +- daemon.c | 2 +- http-backend.c | 2 +- pkt-line.c | 2 +- pkt-line.h | 2 +- shallow.c| 2 +- upload-pack.c| 30 +++--- 11 files changed, 29 insertions(+), 29 deletions(-) diff --git a/builtin/archive.c b/builtin/archive.c index a1e3b94..49f4914 100644 --- a/builtin/archive.c +++ b/builtin/archive.c @@ -47,10 +47,10 @@ static int run_remote_archiver(int argc, const char **argv, if (name_hint) { const char *format = archive_format_from_filename(name_hint); if (format) - packet_write(fd[1], "argument --format=%s\n", format); + packet_write_fmt(fd[1], "argument --format=%s\n", format); } for (i = 1; i < argc; i++) - packet_write(fd[1], "argument %s\n", argv[i]); + packet_write_fmt(fd[1], "argument %s\n", argv[i]); packet_flush(fd[1]); buf = packet_read_line(fd[0], NULL); diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 011db00..1ce7682 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -218,7 +218,7 @@ static int receive_pack_config(const char *var, const char *value, void *cb) static void show_ref(const char *path, const unsigned char *sha1) { if (sent_capabilities) { - packet_write(1, "%s %s\n", sha1_to_hex(sha1), path); + packet_write_fmt(1, "%s %s\n", sha1_to_hex(sha1), path); } else { struct strbuf cap = STRBUF_INIT; @@ -233,7 +233,7 @@ static void show_ref(const char *path, const unsigned char *sha1) if (advertise_push_options) strbuf_addstr(, " push-options"); strbuf_addf(, " agent=%s", git_user_agent_sanitized()); - packet_write(1, "%s %s%c%s\n", + packet_write_fmt(1, "%s %s%c%s\n", sha1_to_hex(sha1), path, 0, cap.buf); strbuf_release(); sent_capabilities = 1; diff --git a/builtin/remote-ext.c b/builtin/remote-ext.c index 88eb8f9..11b48bf 100644 --- a/builtin/remote-ext.c +++ b/builtin/remote-ext.c @@ -128,9 +128,9 @@ static void send_git_request(int stdin_fd, const char *serv, const char *repo, const char *vhost) { if (!vhost) - packet_write(stdin_fd, "%s %s%c", serv, repo, 0); + packet_write_fmt(stdin_fd, "%s %s%c", serv, repo, 0); else - packet_write(stdin_fd, "%s %s%chost=%s%c", serv, repo, 0, + packet_write_fmt(stdin_fd, "%s %s%chost=%s%c", serv, repo, 0, vhost, 0); } diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c index 2caedf1..dc872f6 100644 --- a/builtin/upload-archive.c +++ b/builtin/upload-archive.c @@ -88,11 +88,11 @@ int cmd_upload_archive(int argc, const char **argv, const char *prefix) writer.git_cmd = 1; if (start_command()) { int err = errno; - packet_write(1, "NACK unable to spawn subprocess\n"); + packet_write_fmt(1, "NACK unable to spawn subprocess\n"); die("upload-archive: %s", strerror(err)); } - packet_write(1, "ACK\n"); + packet_write_fmt(1, "ACK\n"); packet_flush(1); while (1) { diff --git a/connect.c b/connect.c index 722dc3f..5330d9c 100644 --- a/connect.c +++ b/connect.c @@ -730,7 +730,7 @@ struct child_process *git_connect(int fd[2], const char *url, * Note: Do not add any other headers here! Doing so * will cause older git-daemon servers to crash. */ - packet_write(fd[1], + packet_write_fmt(fd[1], "%s %s%chost=%s%c", prog, path, 0, target_host, 0); diff --git a/daemon.c b/daemon.c index 425aad0..afce1b9 100644 --- a/daemon.c +++ b/daemon.c @@ -281,7 +281,7 @@ static int daemon_error(const char *dir, const char *msg) { if (!informative_errors) msg = "access denied or repository not exported"; - packet_write(1, "ERR %s: %s", msg, dir); + packet_write_fmt(1, "ERR %s: %s", msg, dir); return -1; } diff --git a/http-backend.c b/http-backend.c index adc8c8c..eef0a36 100644 --- a/http-backend.c +++ b/http-backend.c @@ -464,7 +464,7 @@ static void get_info_refs(struct strbuf *hdr, char *arg)
[PATCH v7 03/10] pkt-line: add packet_write_fmt_gently()
From: Lars Schneiderpacket_write_fmt() would die in case of a write error even though for some callers an error would be acceptable. Add packet_write_fmt_gently() which writes a formatted pkt-line and returns `0` for success and `-1` for an error. Signed-off-by: Lars Schneider --- pkt-line.c | 43 +++ pkt-line.h | 1 + 2 files changed, 40 insertions(+), 4 deletions(-) diff --git a/pkt-line.c b/pkt-line.c index e8adc0f..3824d05 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -125,16 +125,51 @@ static void format_packet(struct strbuf *out, const char *fmt, va_list args) packet_trace(out->buf + orig_len + 4, n - 4, 1); } +static int packet_write_fmt_1(int fd, int gently, + const char *fmt, va_list args) +{ + struct strbuf buf = STRBUF_INIT; + size_t count; + + format_packet(, fmt, args); + count = write_in_full(fd, buf.buf, buf.len); + if (count == buf.len) + return 0; + + if (!gently) { + if (errno == EPIPE) { + if (in_async()) + async_exit(141); + + signal(SIGPIPE, SIG_DFL); + raise(SIGPIPE); + /* Should never happen, but just in case... */ + exit(141); + } + die_errno("packet write error"); + } + error("packet write failed"); + return -1; +} + void packet_write_fmt(int fd, const char *fmt, ...) { - static struct strbuf buf = STRBUF_INIT; va_list args; - strbuf_reset(); va_start(args, fmt); - format_packet(, fmt, args); + packet_write_fmt_1(fd, 0, fmt, args); + va_end(args); +} + +int packet_write_fmt_gently(int fd, const char *fmt, ...) +{ + int status; + va_list args; + + va_start(args, fmt); + status = packet_write_fmt_1(fd, 1, fmt, args); va_end(args); - write_or_die(fd, buf.buf, buf.len); + return status; } void packet_buf_write(struct strbuf *buf, const char *fmt, ...) diff --git a/pkt-line.h b/pkt-line.h index 1902fb3..3caea77 100644 --- a/pkt-line.h +++ b/pkt-line.h @@ -23,6 +23,7 @@ void packet_flush(int fd); void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3))); void packet_buf_flush(struct strbuf *buf); void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3))); +int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3))); /* * Read a packetized line into the buffer, which must be at least size bytes -- 2.10.0
[PATCH v7 00/10] Git filter protocol
From: Lars SchneiderThe goal of this series is to avoid launching a new clean/smudge filter process for each file that is filtered. A short summary about v1 to v5 can be found here: https://git.github.io/rev_news/2016/08/17/edition-18/ This series is also published on web: https://github.com/larsxschneider/git/pull/11 Thanks a lot to Stefan, Junio, Jakub (Narebski and Keller), Peff, and Torsten for very helpful reviews, Lars ## Major changes since v6 * series rebased to Git v2.10.0 * simplified and improved multi-packet test * v2 filter takes precedence over v1 filter * rename "error-all" to "abort" * patch 07/13 "pack-protocol: fix maximum pkt-line size" submitted separately: http://public-inbox.org/git/20160829175509.69050-1-larsxschnei...@gmail.com/ * removed "10/13 convert: generate large test files only once" * patch 13/13 "read-cache: make sure file handles are not inherited by child processes" moved to an own series: http://public-inbox.org/git/2016090521.72956-1-larsxschnei...@gmail.com/ ## All changes since v6 ### Stefan * http://public-inbox.org/git/cagz79kajn-yf28+ls2jyqss6jt-oj2jw-saxqn-oe5xaxsy...@mail.gmail.com/ * submit patch "pack-protocol: fix maximum pkt-line size" separately: http://public-inbox.org/git/20160829175509.69050-1-larsxschnei...@gmail.com/ * http://public-inbox.org/git/cagz79kbeoyvm_+mwkajt2phn1okzxate5d+dv_usj7dypgx...@mail.gmail.com/ * trace failures in new *_gently() pkt-line functions ### Junio * http://public-inbox.org/git/xmqq8tvkle6o@gitster.mtv.corp.google.com/ * share code between packet_write_fmt() and packet_write_fmt_gently() * http://public-inbox.org/git/xmqq4m68ldrg@gitster.mtv.corp.google.com/ * remove "Second, it will..." from "pkt-line: add packet_write_gently()" commit message * use separate buffer in write_packetized_from_fd() * fix write order in packet_write_gently() * http://public-inbox.org/git/xmqqzio0jxh7@gitster.mtv.corp.google.com/ * rename read/write flush terminated packet streams functions * remove meaningless "bytes_to_write > ..." check * reuse packet_read() function in read_packetized_to_buf() * http://public-inbox.org/git/xmqqk2f3fgbd@gitster.mtv.corp.google.com/ * revert test_config replacement in test setup * http://public-inbox.org/git/xmqq37lncvj6@gitster.mtv.corp.google.com/ * improve multi-packet test * http://public-inbox.org/git/xmqqzinv6wtg@gitster.mtv.corp.google.com/ * give v2 filter precedence over v1 filter * http://public-inbox.org/git/xmqqmvjt3nht@gitster.mtv.corp.google.com/ * explain that "error" behavior mimics v1 filter * explain that Git closes the pipe on exit to allow the filter a graceful exit ### Jakub * http://public-inbox.org/git/7fbd02a1-ad62-7391-df2a-835aae8a6...@gmail.com/ * rename "error-all" to "abort" * add a simple Perl v2 filter example in contrib ## Interdiff (v6..v7) diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index df059e9..ac000ea 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -470,20 +470,11 @@ packet: git< status=error\n packet: git< -In case the filter cannot or does not want to process the content -as well as any future content for the lifetime of the Git process, -it is expected to respond with an "error-all" status. Depending on -the `filter..required` flag Git will interpret that as error -but it will not stop or restart the filter process. - -packet: git< status=error-all\n -packet: git< - - If the filter experiences an error during processing, then it can -send the status "error". Depending on the `filter..required` -flag Git will interpret that as error but it will not stop or restart -the filter process. +send the status "error" after the content was (partially or +completely) sent. Depending on the `filter..required` flag +Git will interpret that as error but it will not stop or restart the +filter process. packet: git< status=success\n packet: git< @@ -495,21 +486,38 @@ packet: git< If the filter dies during the communication or does not adhere to the protocol then Git will stop the filter process and restart it -with the next file that needs to be processed. +with the next file that needs to be processed. Depending on the +`filter..required` flag Git will interpret that as error. + +The error handling for all cases above mimic the behavior of +the `filter..clean` / `filter..smudge` error +handling. + +In case the filter cannot or does not want to process the content +as well as any future content for the lifetime of the Git process, +it is expected to respond with an "abort" status. Depending on +the `filter..required` flag Git will interpret that as
Re: [PATCH v3 3/3] Use the newly-introduced regexec_buf() function
Johannes Schindelinwrites: > @@ -33,11 +32,8 @@ static void diffgrep_consume(void *priv, char *line, > unsigned long len) >* caller early. >*/ > return; > - /* Yuck -- line ought to be "const char *"! */ > - hold = line[len]; > - line[len] = '\0'; > - data->hit = !regexec(data->regexp, line + 1, 1, , 0); > - line[len] = hold; > + data->hit = !regexec_buf(data->regexp, line + 1, len - 1, 1, > + , 0); This is an unexpected happy surprise. It really feels good to see that "Yuck" line go. > @@ -228,18 +227,16 @@ static long ff_regexp(const char *line, long len, > len--; > } > > - line_buffer = xstrndup(line, len); /* make NUL terminated */ > - > for (i = 0; i < regs->nr; i++) { > struct ff_reg *reg = regs->array + i; > - if (!regexec(>re, line_buffer, 2, pmatch, 0)) { > + if (!regexec_buf(>re, line, len, 2, pmatch, 0)) { So is this hunk. Removing unnecessary copying is a very good thing. Please give these three patches a common prefix, e.g. regex: -G feeds a non NUL-terminated string to regexec() and fails regex: add regexec_buf() that can work on a non NUL-terminated string regex: use regexec_buf() or something like that. Also I agree with Peff that a test with an embedded NUL would be a good thing. This round is so close to perfect.
Re: [PATCH v3 2/3] Introduce a function to run regexec() on non-NUL-terminated buffers
Johannes Schindelinwrites: > We just introduced a test that demonstrates that our sloppy use of > regexec() on a mmap()ed area can result in incorrect results or even > hard crashes. > > So what we need to fix this is a function that calls regexec() on a > length-delimited, rather than a NUL-terminated, string. > > Happily, there is an extension to regexec() introduced by the NetBSD > project and present in all major regex implementation including > Linux', MacOSX' and the one Git includes in compat/regex/: by using > the (non-POSIX) REG_STARTEND flag, it is possible to tell the > regexec() function that it should only look at the offsets between > pmatch[0].rm_so and pmatch[0].rm_eo. > > That is exactly what we need. Yes, that is good. > Since support for REG_STARTEND is so widespread by now, let's just > introduce a helper function that uses it, and fall back to allocating > and constructing a NUL-terminated when REG_STARTEND is not available. I do not think this fallback is good; we do ship a compat/ fallback that does support REG_STARTEND and you'd want to use that. Not having the copying fallback means you do not even have to worry about the size+1 overflow and fix it with xmallocz() ;-)
Re: [PATCH 2/3] diff_populate_filespec: NUL-terminate buffers
Jeff Kingwrites: > Between the two options for regexec_buf(), I think you have convinced me > that REG_STARTEND is better than just using compat/regex everywhere. I > do think the fallback for platforms like musl should be "use > compat/regex" and not doing an expensive copy (which in most cases is > not even necessary). I agree with you that it would be the best approach to build regexec_buf() that unconditionally uses REG_STARTEND and tell people without REG_STARTEND to use compat/regex instead of their platform regex library. The description in Makefile may want to be rephrased to clarify. -# Define NO_REGEX if you have no or inferior regex support in your C library. +# Define NO_REGEX if your C library lacks regex support with REG_STARTEND +# feature. The word "inferior" is not giving any useful information there.
Re: git-gui, was Re: [PATCH v2 6/6] git-gui: Update Japanese information
Johannes Schindelinwrites: > On Wed, 7 Sep 2016, Junio C Hamano wrote: > >> Pat, we haven't heard from you for a long time. > > Indeed. There are a couple of git-gui patches in Git for Windows that have > been contributed a long time ago and not been picked up. > > Maybe it is time to just accept git-gui patches directly into Git, after > some other Tcl/Tk savvy people ACKed them? Yes, that was my thinking, though I'd prefer to see somebodyto be acting as a central point of contact so that I do not have to pay any attention to the part of the system other than responding to a pull request. The subsystem maintainer does not have to forever be Pat, of course, and can be handed over to somebody else, but if that is what is going to happen, I'd prefer to see a volunteer or two to step up and Pat to bless them.
Re: [PATCH] connect: tighten check for unexpected early hang up (Re: [PATCH v3 2/2] connect: advertized capability is not a ref)
Jonathan Niederwrites: > Jonathan Nieder wrote: > >> Subject: connect: tighten check for unexpected early hang up > [...] >> @@ -131,7 +131,7 @@ struct ref **get_remote_heads(int in, char *src_buf, >> size_t src_len, >>PACKET_READ_GENTLE_ON_EOF | >>PACKET_READ_CHOMP_NEWLINE); >> if (len < 0) >> -die_initial_contact(got_at_least_one_head); >> +die_initial_contact(first_line); > > This should say !first_line. > > I'll add tests if the patch seems like a good idea. I tried to write one-paragraph comment for the die_initial_contact() function, like so: +/* + * A remote end that is unwilling to talk to us would not give + * any response to us before hanging up. After seeing some + * response, we know the hang-up is unexpected. + */ +static void die_initial_contact(int saw_any_response) but then I got stuck. We may know that after seeing any response (not necessarily a ref, but .have or shallow) the other end is willing to talk to us, but the reverse is not necessarily true (it may be willing to talk to us, but the network between us may have prevented it from doing so). For that reason, the above comment is inappropriate for a function that takes a bool and gives an "unexpected hung-up" or an "unreachable, possible ACL or problems" message. So my second attempt was to comment on the variable that keeps track of the status of the conversation, which turned out to be better (attached). I think I fixed your "oops, the bool needs polarity flip". A test may be a good idea, but I am not sure how you plan to produce a failure after sending some response. -- >8 -- From: Jonathan Nieder Date: Wed, 7 Sep 2016 18:45:55 -0700 Subject: [PATCH] connect: tighten check for unexpected early hang up A server hanging up immediately to mark access being denied does not send any .have refs, shallow lines, or anything else before hanging up. If the server has sent anything, then the hangup is unexpected. That is, if the server hangs up after a shallow line but before sending any refs, then git should tell me so: fatal: The remote end hung up upon initial contact instead of suggesting an access control problem: fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. Noticed while examining this code. This case isn't likely to come up in practice but tightening the check makes the code easier to read and manipulate. Signed-off-by: Jonathan Nieder Signed-off-by: Junio C Hamano --- connect.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/connect.c b/connect.c index c53f3f1..067cf40 100644 --- a/connect.c +++ b/connect.c @@ -43,9 +43,9 @@ int check_ref_type(const struct ref *ref, int flags) return check_ref(ref->name, flags); } -static void die_initial_contact(int got_at_least_one_head) +static void die_initial_contact(int unexpected) { - if (got_at_least_one_head) + if (unexpected) die("The remote end hung up upon initial contact"); else die("Could not read from remote repository.\n\n" @@ -115,10 +115,17 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len, struct sha1_array *shallow_points) { struct ref **orig_list = list; - int got_at_least_one_head = 0; + + /* +* A hang-up after seeing some response from the other end +* means that it is unexpected, as we know the other end is +* willing to talk to us. A hang-up before seeing any +* response does not necessarily mean an ACL problem, though. +*/ + int saw_response; *list = NULL; - for (;;) { + for (saw_response = 0; ; saw_response = 1) { struct ref *ref; struct object_id old_oid; char *name; @@ -131,7 +138,7 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len, PACKET_READ_GENTLE_ON_EOF | PACKET_READ_CHOMP_NEWLINE); if (len < 0) - die_initial_contact(got_at_least_one_head); + die_initial_contact(saw_response); if (!len) break; @@ -171,7 +178,6 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len, oidcpy(>old_oid, _oid); *list = ref; list = >next; - got_at_least_one_head = 1; } annotate_refs_with_symref_info(*orig_list); -- 2.10.0-267-g7db2ae3
Re: [PATCH] connect: tighten check for unexpected early hang up (Re: [PATCH v3 2/2] connect: advertized capability is not a ref)
On Wed, Sep 7, 2016 at 6:45 PM, Jonathan Niederwrote: > (+cc: Heiko) > Jonathan Nieder wrote: > >> 'die_initial_contact' uses got_at_least_one_head to determine whether >> it was on the first line but code paths added later that use >> 'continue' don't populate it properly (see b06dcd7d, 40c155ff, and >> 1a7141ff). We could do >> >> int first_line = 1; >> >> for (;; first_line = 0) { >> ... >> } >> >> and use !first_line instead of got_at_least_one_head (removing >> got_at_least_one_head in the process since it has no other purpose). > > I got the history wrong. It looks like this was always confused > by the 'continue' cases. Unless I'm missing something subtle --- > thoughts? I was a bit confused by the line for (;; first_line = 0) { at first, but the explanation of 'continue's make sense for this pattern. However I'd rather prefer if we'd have int first_line = 1; for(;;) { ... // stuff with no continue here if (len < 0) die_initial_contact(!first_line); first_line = 0; ... // here we may have some continues, but that doesn't matter // w.r.t. first_line }
Re: [PATCH v3 2/2] connect: advertized capability is not a ref
Jonathan Niederwrites: > I think we can make this stricter. The capabilities^{} line is supposed > to be the first advertised ref, before any 'shallow' lines or .have > extra refs. "The first", or "the first and only"? I thought that it would be the latter.
[PATCH 0/3] Fix git-init in linked worktrees
My ASAP is not so ASAP. Sorry about that but I think I have fixed it. Side note about 2/3. I've known this problem (in general) for years (accidentally reading .git config file before .git is searched) and could not do anything about it. And because test_expect_failure should only be there if someone will eventually fix it, and I don't see myself doing it, and I don't see it super important that other people would bother, so I decided to simply sweep it under the rug. I could make a test_expect_failure if people think otherwise. Side note about 3/3. Yes there's a FIXME in there. It will take a lot more time to remove that FIXME (because setup_git_directory is not as simple). For now it should be ok to leave it there. When we find a use for get_first_git_dir outside git-init, we can fix it then. Nguyễn Thái Ngọc Duy (3): init: correct re-initialization from a linked worktree t0001: work around the bug that reads config file before repo setup init: do not set core.worktree more often than necessary builtin/init-db.c | 4 ++-- cache.h | 1 + environment.c | 16 +++- t/t0001-init.sh | 19 +++ 4 files changed, 37 insertions(+), 3 deletions(-) -- 2.8.2.524.g6ff3d78
[PATCH 3/3] init: do not set core.worktree more often than necessary
When "git init" is called with GIT_WORK_TREE environment set, we want to keep this worktree's location in core.worktree so the user does not have to set the environment again and again. See ef6f0af (git-init: set core.worktree if GIT_WORK_TREE is specified - 2007-07-04) We detect that by this logic (in needs_work_tree_config): normally worktree's top dir would contains ".git" directory, if this is not true, worktree is probably set to elsewhere by the user. Unfortunately when it calls get_git_dir() it does not take ".git" files into account. When we find a .git file, we immediately follow the file until we find the real ".git" directory. The location of this first ".git" file is lost. The .git file would satisfy the logic above and not create core.worktree (correct). But because the final .git's location is used, needs_work_tree_config() is misled and creates core.worktree anyway. This would not be a huge deal normally. But if this happens in a multiple worktree setup it becomes a real problem because up until now, core.worktree will be applied to the main worktree only. If you accidentally do "git init" from a linked worktree, you set core.worktree (for the main repo) pointing to the _linked_ worktree. After that point, may you live in interesting times. Record the .git file location and use it here. Noticed-by: Max NordlundHelped-by: Michael J Gruber Signed-off-by: Nguyễn Thái Ngọc Duy --- builtin/init-db.c | 2 +- cache.h | 1 + environment.c | 16 +++- t/t0001-init.sh | 2 ++ 4 files changed, 19 insertions(+), 2 deletions(-) diff --git a/builtin/init-db.c b/builtin/init-db.c index 6d9552e..36255f2 100644 --- a/builtin/init-db.c +++ b/builtin/init-db.c @@ -254,7 +254,7 @@ static int create_default_files(const char *template_path) /* allow template config file to override the default */ if (log_all_ref_updates == -1) git_config_set("core.logallrefupdates", "true"); - if (needs_work_tree_config(get_git_dir(), work_tree)) + if (needs_work_tree_config(get_first_git_dir(), work_tree)) git_config_set("core.worktree", work_tree); } diff --git a/cache.h b/cache.h index b780a91..e6c05f8 100644 --- a/cache.h +++ b/cache.h @@ -460,6 +460,7 @@ extern char *git_work_tree_cfg; extern int is_inside_work_tree(void); extern const char *get_git_dir(void); extern const char *get_git_common_dir(void); +extern const char *get_first_git_dir(void); extern char *get_object_directory(void); extern char *get_index_file(void); extern char *get_graft_file(void); diff --git a/environment.c b/environment.c index ca72464..8cfb8f3 100644 --- a/environment.c +++ b/environment.c @@ -100,7 +100,7 @@ static char *work_tree; static const char *namespace; static size_t namespace_len; -static const char *git_dir, *git_common_dir; +static const char *git_dir, *git_common_dir, *first_git_dir; static char *git_object_dir, *git_index_file, *git_graft_file; int git_db_env, git_index_env, git_graft_env, git_common_dir_env; @@ -168,6 +168,8 @@ static void setup_git_env(void) if (!git_dir) git_dir = DEFAULT_GIT_DIR_ENVIRONMENT; gitfile = read_gitfile(git_dir); + if (gitfile && !first_git_dir) + first_git_dir = xstrdup(git_dir); git_dir = xstrdup(gitfile ? gitfile : git_dir); if (get_common_dir(, git_dir)) git_common_dir_env = 1; @@ -203,6 +205,18 @@ const char *get_git_dir(void) return git_dir; } +/* + * Return the first ".git" that we have encountered. + * FIXME this function for not entirely correct because + * setup_git_directory() and enter_repo() do not update first_git_dir + * when they follow .git files. The function in its current state is + * only suitable for "git init". + */ +const char *get_first_git_dir(void) +{ + return first_git_dir ? first_git_dir : git_dir; +} + const char *get_git_common_dir(void) { return git_common_dir; diff --git a/t/t0001-init.sh b/t/t0001-init.sh index 393c940..d59669a 100755 --- a/t/t0001-init.sh +++ b/t/t0001-init.sh @@ -393,9 +393,11 @@ test_expect_success 're-init from a linked worktree' ' test_commit first && git worktree add ../linked-worktree && mv .git/info/exclude expected-exclude && + cp .git/config expected-config && find .git/worktrees -print | sort >expected && git -C ../linked-worktree init && test_cmp expected-exclude .git/info/exclude && + test_cmp expected-config .git/config && find .git/worktrees -print | sort >actual && test_cmp expected actual ) -- 2.8.2.524.g6ff3d78
[PATCH 1/3] init: correct re-initialization from a linked worktree
When 'git init' is called from a linked worktree, '.git' dir as the main '.git' (i.e. $GIT_COMMON_DIR) and populate the whole repository skeleton in there. It does not harm anything (*) but it is still wrong. Since 'git init' calls set_git_dir() at preparation time, which indirectly calls get_common_dir() and correctly detects multiple worktree setup, all git_path_buf() calls in create_default_files() will return correct paths in both single and multiple worktree setups. The only thing left is copy_templates(), which targets $GIT_DIR, not $GIT_COMMON_DIR. Fix that with get_git_common_dir(). This function will return $GIT_DIR in single-worktree setup, so we don't have to make a special case for multiple-worktree here. (*) It does in fact, thanks to another bug. More on that later. Noticed-by: Max NordlundHelped-by: Michael J Gruber Signed-off-by: Nguyễn Thái Ngọc Duy --- builtin/init-db.c | 2 +- t/t0001-init.sh | 15 +++ 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/builtin/init-db.c b/builtin/init-db.c index 3a45f0b..6d9552e 100644 --- a/builtin/init-db.c +++ b/builtin/init-db.c @@ -138,7 +138,7 @@ static void copy_templates(const char *template_dir) goto close_free_return; } - strbuf_addstr(, get_git_dir()); + strbuf_addstr(, get_git_common_dir()); strbuf_complete(, '/'); copy_templates_1(, _path, dir); close_free_return: diff --git a/t/t0001-init.sh b/t/t0001-init.sh index a6fdd5e..d64e5e3 100755 --- a/t/t0001-init.sh +++ b/t/t0001-init.sh @@ -384,4 +384,19 @@ test_expect_success MINGW 'bare git dir not hidden' ' ! is_hidden newdir ' +test_expect_success 're-init from a linked worktree' ' + git init main-worktree && + ( + cd main-worktree && + test_commit first && + git worktree add ../linked-worktree && + mv .git/info/exclude expected-exclude && + find .git/worktrees -print | sort >expected && + git -C ../linked-worktree init && + test_cmp expected-exclude .git/info/exclude && + find .git/worktrees -print | sort >actual && + test_cmp expected actual + ) +' + test_done -- 2.8.2.524.g6ff3d78
[PATCH 2/3] t0001: work around the bug that reads config file before repo setup
git-init somehow reads '.git/config' at current directory and sets log_all_ref_updates based on this file. Because log_all_ref_updates is not unspecified (-1) any more. It will not be written to the new repo's config file (see create_default_files() function). This will affect our tests in the next patch as we will compare the config file and expect that core.logallrefupdates is already set to true by "git init main-worktree". Signed-off-by: Nguyễn Thái Ngọc Duy--- t/t0001-init.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/t0001-init.sh b/t/t0001-init.sh index d64e5e3..393c940 100755 --- a/t/t0001-init.sh +++ b/t/t0001-init.sh @@ -385,7 +385,9 @@ test_expect_success MINGW 'bare git dir not hidden' ' ' test_expect_success 're-init from a linked worktree' ' + mv .git/config work-around-init-reading-wrong-file && git init main-worktree && + mv work-around-init-reading-wrong-file .git/config && ( cd main-worktree && test_commit first && -- 2.8.2.524.g6ff3d78
Re: How to simulate a real checkout to test a new smudge filter?
W dniu 06.09.2016 o 23:01, john smith pisze: > I'd prefer smudge/clean filters instead of `make' scripts etc. to > convert template dotfiles into something usable and back because > filters: > > 1. could be run automatically > > 2. do not modify files as shown by `git show HEAD:' and > therefore no files are reported as modified by git status and also > there are not conflicts when merging master into work/home branch. > > I have problems because with point 1 because apparently smudge filter > is not run automatically every time when branch is changed if files > listed in .gitattributes do not change. As the last resort I could > force smudge/clean filter to run just to keep advantage specified in > point 2. Couldn't you use post-checkout hook plus clean filter instead of clean/smudge filter pair, if the smudge part depends on the branch? Or make post-checkout hook invoke smudge filter... though `git cat-file --filters` is not in any released version, I think... Best, -- Jakub Narębski
Re: [PATCH v2 6/6] git-gui: Update Japanese information
On Wed, 07 Sep 2016 10:35:22 -0700 Junio C Hamano wrote: Since I received the patch directly bypassing vger, I queued it on gitgui-0.20.0 from Pat and tentatively merged it to my 'pu'. wow, thanks so much.
Re: "fatal error in commit_refs" from pushing to github
On Thu, Sep 8, 2016 at 8:25 AM, Jeff Kingwrote: > On Thu, Sep 08, 2016 at 07:49:12AM +0700, Duy Nguyen wrote: > >> I got the message in the subject when pushing to github today. Yes I >> know it's github, not git. But according to stackoveflow [1] it's a >> local problem. Which makes me think, if we know exactly what this is >> (or at least roughly the problem area), maybe we could improve git to >> catch it locally in the first place (and because other git servers may >> not have the same protection as github). Jeff maybe you can reveal >> something about this "fatal error in commit_refs"? I'm sure it's not >> in git code. But I would understand if the answer is "no". > > The short answer is that it's nothing to do with Git or the client; it's > GitHub-specific code running on the server that is outside of Git > entirely. > > The long answer is that pushes to GitHub don't hit Git directly these > days. They hit a proxy layer that speaks just enough of the Git protocol > to relay to N separate receives spread across N replica servers[1]. Those > receive-packs take in the pack and verify it, but don't actually update > any refs[2]. Then the proxy layer runs its own set of policy hooks, and > speaks a commit-protocol to each of the replicas so that they all agree > on the new ref state. That last step is called "commit_refs" internally. > > So this is really an internal failure at the ref-update stage. There > _should_ be a reasonable error message, but I think "fatal error in > commit_refs" is the generic last-ditch fallback. I'll pass this along to > people in charge of that code, as we should be generating a more useful > error message. Hmm.. I'm interested in this because the "fix" is from client side. I did "git gc" and "git fetch" and the problem was gone. From this description, I suppose C Git sends a good pack (phew!), but probably with some stale ref or something that upsets this this last stage. It's hard to make a connection back to either gc or fetch. Maybe gc does ref trimming or something (that should probably be done by git-push as well). Oh well.. maybe next time I see it, I'll get a nice and clear message :) -- Duy
[no subject]
I am Jacobs Steven $ 2 Million Has Been Donated To You this is Real,For More Info Contact Mr James Stocklas with this Email ; >> jamesstocklas...@gmail.com Send Your Response To >> jamesstocklas...@gmail.com __ This email has been scanned by the Symantec Email Security.cloud service. For more information please visit http://www.symanteccloud.com __
Re: [ANNOUNCE] Git for Windows 2.10.0
Am 03.09.2016 um 15:17 schrieb Johannes Schindelin: > Dear Git users, > > It is my pleasure to announce that Git for Windows 2.10.0 is available. > This time, I even blogged about it, primarily because I am so excited > about the speed improvements of rebase -i: > > https://blogs.msdn.microsoft.com/visualstudioalm/2016/09/03/whats-new-in-git-for-windows-2-10/ > > As always, you can download it from: https://git-for-windows.github.io/ > > Changes since Git for Windows v2.9.3(2) (August 25th 2016) > > New Features > > • Comes with Git v2.10.0. > • The git rebase -i command was made faster by reimplementing large > parts in C. I finally had the chance to do a "bigger" rebase and what shall I say... F***k, has this thing become fast, or what! Thank you so much for doing this Stefan -- /dev/random says: A Cat's courage is as strong as a dog's chain python -c "print '73746566616e2e6e616577654061746c61732d656c656b74726f6e696b2e636f6d'.decode('hex')" GPG Key fingerprint = 2DF5 E01B 09C3 7501 BCA9 9666 829B 49C5 9221 27AF
Re: [PATCH v2 3/3] Use the newly-introduced regexec_buf() function
On Thu, Sep 08, 2016 at 04:14:46AM -0400, Jeff King wrote: > On Thu, Sep 08, 2016 at 04:10:24AM -0400, Jeff King wrote: > > > On Thu, Sep 08, 2016 at 09:54:51AM +0200, Johannes Schindelin wrote: > > > > > > diff.c | 3 ++- > > > > diffcore-pickaxe.c | 18 -- > > > > xdiff-interface.c | 13 - > > > > 3 files changed, 14 insertions(+), 20 deletions(-) > > > > > > I just realized that this should switch the test_expect_failure from 1/3 > > > to a test_expect_success. > > > > Yep. I wonder if we also would want to test that we correctly find > > regexes inside binary files. > > > > E.g., given a mixed binary/text file like: > > > > printf 'binary\0text' >file && > > git add file && > > git commit -m file > > > > then "git log -Stext" will find that file, but "--pickaxe-regex" will > > not (using stock git). Ditto for "-Gtext". > > > > Your patch should fix that. > > Of course if I had actually _looked carefully_ at your patch, I would > have seen that your test doesn't just check that we don't segfault, but > actually confirms that we find the entry. > > Sorry for the noise. Actually, I take it back again. Your test case doesn't have an embedded NUL in it (so we check that git finds it, but aside from the lack of segfault, stock git would already find it). Sorry for the double-noise. -Peff
Re: [PATCH] t6026-merge-attr: wait for process to release trash directory
On Thu, Sep 08, 2016 at 10:05:58AM +0200, Johannes Schindelin wrote: > > > Is fifo safe on Windows, though? > > > > No clue. We seem to use mkfifo unconditionally in lib-daemon, but > > perhaps people do not run that test on Windows. Other invocations seem > > to be protected by the PIPE prerequisite. But... > > AFAICT we do not use mkfifo on Windows. Let's see what t/test-lib.sh has > to say about the matter: > > test_lazy_prereq PIPE ' > # test whether the filesystem supports FIFOs > case $(uname -s) in > CYGWIN*|MINGW*) > false > ;; > *) > rm -f testfifo && mkfifo testfifo > ;; > esac > ' > > So there you go. > > The reason it is disabled is that Cygwin/MSYS2 do have a concept of a > FIFO. But `git.exe` won't be able to access such a FIFO because it is > emulated by the POSIX emulation layer, which Git cannot access. Regarding my "unconditionally" above: coincidentally, I happened to be looking in lib-git-daemon.sh about an hour ago and noticed that we do indeed check "test_have_prereq PIPE" (just not near the mkfifo, of course, because we are not in a test block). It seems to have been added by a "Johannes Schindelin". Any relation? So yeah. It definitely would be a bad idea to use a fifo in a test that is already Windows-specific. -Peff
Re: [PATCH 2/3] diff_populate_filespec: NUL-terminate buffers
On Thu, Sep 08, 2016 at 09:49:38AM +0200, Johannes Schindelin wrote: > > > diff --git a/diff.c b/diff.c > > > index 534c12e..2c5a360 100644 > > > --- a/diff.c > > > +++ b/diff.c > > > @@ -951,7 +951,13 @@ static int find_word_boundaries(mmfile_t *buffer, > > > regex_t *word_regex, > > > { > > > if (word_regex && *begin < buffer->size) { > > > regmatch_t match[1]; > > > - if (!regexec(word_regex, buffer->ptr + *begin, 1, match, > > > 0)) { > > > + int f = 0; > > > +#ifdef REG_STARTEND > > > + match[0].rm_so = 0; > > > + match[0].rm_eo = *end - *begin; > > > + f = REG_STARTEND; > > > +#endif > > > + if (!regexec(word_regex, buffer->ptr + *begin, 1, match, > > > f)) { > > Heh. You introduced the same bug I did. Or maybe you just fetched my > mmap-regexec branch and looked at an intermediate iteration? I do not think I introduced anything. The quoted text is what you sent. Which is perhaps why it has your bug. :) > > But I much prefer this approach to copying the data just to add a NUL. > > I think it is not worth the burden. The only regex implementation in > semi-widespread use that do not support REG_STARTEND seems to be musl. > > I'd rather not spend *so much* effort just to support an obscure platform. > Not when the users of that obscure platform could spend that effort > themselves. And probably won't, because we only copy data to add a NUL on > those platforms when regexec() is called on an mmfile_t. I'm confused about what you think I'm proposing. I was saying I _like_ something like regexec_buf() instead of copying the data. Which seems like the simpler thing to me (and presumably to you). Or do you mean using compat/regex to build on re_search() consistently? I do not think that is all that complex; the question is only whether people really want to use their own regex libraries. Between the two options for regexec_buf(), I think you have convinced me that REG_STARTEND is better than just using compat/regex everywhere. I do think the fallback for platforms like musl should be "use compat/regex" and not doing an expensive copy (which in most cases is not even necessary). -Peff
Re: [PATCH v2 3/3] Use the newly-introduced regexec_buf() function
On Thu, Sep 08, 2016 at 04:10:24AM -0400, Jeff King wrote: > On Thu, Sep 08, 2016 at 09:54:51AM +0200, Johannes Schindelin wrote: > > > > diff.c | 3 ++- > > > diffcore-pickaxe.c | 18 -- > > > xdiff-interface.c | 13 - > > > 3 files changed, 14 insertions(+), 20 deletions(-) > > > > I just realized that this should switch the test_expect_failure from 1/3 > > to a test_expect_success. > > Yep. I wonder if we also would want to test that we correctly find > regexes inside binary files. > > E.g., given a mixed binary/text file like: > > printf 'binary\0text' >file && > git add file && > git commit -m file > > then "git log -Stext" will find that file, but "--pickaxe-regex" will > not (using stock git). Ditto for "-Gtext". > > Your patch should fix that. Of course if I had actually _looked carefully_ at your patch, I would have seen that your test doesn't just check that we don't segfault, but actually confirms that we find the entry. Sorry for the noise. -Peff
Re: [PATCH v2 0/3] Fix a segfault caused by regexec() being called on mmap()ed data
On Thu, Sep 08, 2016 at 09:33:29AM +0200, Johannes Schindelin wrote: > On Thu, 8 Sep 2016, Johannes Schindelin wrote: > > > We solve this by introducing a helper, regexec_buf(), that takes a > > pointer and a length instead of a NUL-terminated string. > > BTW I should have clarified why I decided on another name than regexecn() > (I had considered this even before reading Peff's proposed patch): the > in string functions suggest a limiting of NUL-terminated strings. In other > words, if n = 100 and the provided pointer points to a NUL-terminated > string of length 3, the *n function will treat it as a string of length 3. > > That is not what regexec_buf() does: it ignores the NUL. Hence the > different name. I agree that is a better name (this was the exact thing I wondered about with REG_STARTEND, but certainly what we _want_ is true "_buf" semantics). I guess an argument that REG_STARTEND does what we want everywhere is that the GNU implementation does what we want, and since it has been around for over a decade presumably _somebody_ would have complained if it did not match the NetBSD behavior. -Peff
Re: [PATCH v2 3/3] Use the newly-introduced regexec_buf() function
On Thu, Sep 08, 2016 at 09:54:51AM +0200, Johannes Schindelin wrote: > > diff.c | 3 ++- > > diffcore-pickaxe.c | 18 -- > > xdiff-interface.c | 13 - > > 3 files changed, 14 insertions(+), 20 deletions(-) > > I just realized that this should switch the test_expect_failure from 1/3 > to a test_expect_success. Yep. I wonder if we also would want to test that we correctly find regexes inside binary files. E.g., given a mixed binary/text file like: printf 'binary\0text' >file && git add file && git commit -m file then "git log -Stext" will find that file, but "--pickaxe-regex" will not (using stock git). Ditto for "-Gtext". Your patch should fix that. -Peff
Re: [PATCH] t6026-merge-attr: wait for process to release trash directory
Hi Peff & Junio, On Wed, 7 Sep 2016, Jeff King wrote: > On Wed, Sep 07, 2016 at 11:39:57AM -0700, Junio C Hamano wrote: > > > > Can we do some signaling with fifos to tell the hook when it is safe to > > > exit? Then we would just need to `wait` for its parent process. > > > > Is fifo safe on Windows, though? > > No clue. We seem to use mkfifo unconditionally in lib-daemon, but > perhaps people do not run that test on Windows. Other invocations seem > to be protected by the PIPE prerequisite. But... AFAICT we do not use mkfifo on Windows. Let's see what t/test-lib.sh has to say about the matter: test_lazy_prereq PIPE ' # test whether the filesystem supports FIFOs case $(uname -s) in CYGWIN*|MINGW*) false ;; *) rm -f testfifo && mkfifo testfifo ;; esac ' So there you go. The reason it is disabled is that Cygwin/MSYS2 do have a concept of a FIFO. But `git.exe` won't be able to access such a FIFO because it is emulated by the POSIX emulation layer, which Git cannot access. > > With v2 that explicitly kills, I guess we can make the sleep longer > > without slowing down in the optimistic case? > > Yeah, I think the v2 one is non-racy (I thought at first we might race > with the "echo", but it should be synchronous; the hook will not exit > until we have written the pid file, and git will not exit until the hook > is done running). Please note that Hannes and I discussed this (as I originally suggested to increase it to 10 seconds, and Hannes rightfully pointed out that we would have to change the script name, too, as it says sleep-one-second.sh, and that would have made the patch less readable) and we came to the same conclusion: it's not necessary. Ciao, Dscho
Re: [PATCH v2 2/3] Introduce a function to run regexec() on non-NUL-terminated buffers
On Thu, Sep 08, 2016 at 09:31:11AM +0200, Johannes Schindelin wrote: > diff --git a/git-compat-util.h b/git-compat-util.h > index db89ba7..19128b3 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -965,6 +965,27 @@ void git_qsort(void *base, size_t nmemb, size_t size, > #define qsort git_qsort > #endif > > +static inline int regexec_buf(const regex_t *preg, const char *buf, size_t > size, > + size_t nmatch, regmatch_t pmatch[], int eflags) > +{ > +#ifdef REG_STARTEND > + assert(nmatch > 0 && pmatch); > + pmatch[0].rm_so = 0; > + pmatch[0].rm_eo = size; > + return regexec(preg, buf, nmatch, pmatch, eflags | REG_STARTEND); > +#else > + char *buf2 = xmalloc(size + 1); > + int ret; > + > + memcpy(buf2, buf, size); > + buf2[size] = '\0'; I mentioned elsewhere that I'd prefer we just push people into using compat/regex if they don't have REG_STARTEND. But if we _do_ keep this fallback, note that the above has a buffer overflow (think what happens when "size" is the maximum value for a size_t). You can avoid it by using xmallocz(). -Peff
Re: [PATCH 0/3] Fix a segfault caused by regexec() being called on mmap()ed data
On Thu, Sep 08, 2016 at 09:29:58AM +0200, Johannes Schindelin wrote: > sorry for the late answer, I was really busy trying to come up with a new > and improved version of the patch series, and while hunting a bug I > introduced got bogged down with other tasks. No problem. I am not in a hurry. > > I always assumed the _point_ of re_search taking a ptr/len pair was > > exactly to handle this case. The documentation[1] says: > > > >`string` is the string you want to match; it can contain newline and > >null characters. `size` is the length of that string. > > > > Which seems pretty definitive to me (that's for re_match(), but > > re_search() is defined in the docs in terms of re_match()). > > Right. The problem is: I *really* want to avoid using GNU-isms. I don't think GNU-isms are a problem if we wrap them to give a nice interface, and if we rely on having compat/regex. But if you mean "I do not want to rely on using compat/regex everywhere", then OK. I can see arguments both for and against using a consistent regex library, but I do not care that much either way myself. > > We can contain this to the existing compat/regexec/regexec.c, and just > > provide a wrapper that is similar to regexec but takes a ptr/len pair. > > But we can do even better than that: we can provide a wrapper that uses > REG_STARTEND where available (which is really the majority of platforms we > care about: Linux, MacOSX, Windows, and even the *BSDs). Where it is not > available, we simply malloc(), memcpy() and append a NUL. Doesn't that make things much _worse_ for people on systems without REG_STARTEND? If we imagine that most regexec calls would operate on a NUL-terminated buffer, then they are now paying the extra malloc and copy for each call to regexec_buf(), even if the buffer was already NUL-terminated (because they have no idea whether it was or not). I think I'd rather just have: #ifndef REG_STARTEND #error "Your regex library sucks. Compile with NO_REGEX=NeedsStartEnd" #endif (or you could just use REG_STARTEND and let the compiler complain, but then the user has to figure out the right knob to twiddle). One other question about REG_STARTEND is: what does it do with NULs inside the buffer? Certainly glibc (and our compat/regex) treat it as a buffer with a particular length and ignore embedded NULs, as we want. But the NetBSD documentation says only: REG_STARTEND The string is considered to start at string + pmatch[0].rm_so and to have a terminating NUL located at string + pmatch[0].rm_eo (there need not actually be a NUL at that location), Besides avoiding a segfault, one of the benefits of regcomp_buf() is that we will now find pickaxe-regex strings inside mixed binary/text files. But it's not clear to me that NetBSD's implementation does this. I guess we can assume it is fine (it is certainly no _worse_ than the current behavior), and if people's platforms do not handle it, they can build with NO_REGEX. -Peff
[PATCH v3 3/3] Use the newly-introduced regexec_buf() function
The new regexec_buf() function operates on buffers with an explicitly specified length, rather than NUL-terminated strings. We need to use this function whenever the buffer we want to pass to regexec() may have been mmap()ed (and is hence not NUL-terminated). Note: the original motivation for this patch was to fix a bug where `git diff -G ` would crash. This patch converts more callers, though, some of which explicitly allocated and constructed NUL-terminated strings (or worse: modified read-only buffers to insert NULs). Some of the buffers actually may be NUL-terminated. As regexec_buf() uses REG_STARTEND where available, but has to fall back to allocating and constructing NUL-terminated strings where REG_STARTEND is not available, this makes the code less efficient in the latter case. However, given the widespread support for REG_STARTEND, combined with the improved ease of code maintenance, we strike the balance in favor of REG_STARTEND. Signed-off-by: Johannes Schindelin--- diff.c | 3 ++- diffcore-pickaxe.c | 18 -- t/t4061-diff-pickaxe.sh | 2 +- xdiff-interface.c | 13 - 4 files changed, 15 insertions(+), 21 deletions(-) diff --git a/diff.c b/diff.c index 534c12e..526775a 100644 --- a/diff.c +++ b/diff.c @@ -951,7 +951,8 @@ static int find_word_boundaries(mmfile_t *buffer, regex_t *word_regex, { if (word_regex && *begin < buffer->size) { regmatch_t match[1]; - if (!regexec(word_regex, buffer->ptr + *begin, 1, match, 0)) { + if (!regexec_buf(word_regex, buffer->ptr + *begin, +buffer->size - *begin, 1, match, 0)) { char *p = memchr(buffer->ptr + *begin + match[0].rm_so, '\n', match[0].rm_eo - match[0].rm_so); *end = p ? p - buffer->ptr : match[0].rm_eo + *begin; diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c index 55067ca..9795ca1 100644 --- a/diffcore-pickaxe.c +++ b/diffcore-pickaxe.c @@ -23,7 +23,6 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len) { struct diffgrep_cb *data = priv; regmatch_t regmatch; - int hold; if (line[0] != '+' && line[0] != '-') return; @@ -33,11 +32,8 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len) * caller early. */ return; - /* Yuck -- line ought to be "const char *"! */ - hold = line[len]; - line[len] = '\0'; - data->hit = !regexec(data->regexp, line + 1, 1, , 0); - line[len] = hold; + data->hit = !regexec_buf(data->regexp, line + 1, len - 1, 1, +, 0); } static int diff_grep(mmfile_t *one, mmfile_t *two, @@ -50,9 +46,11 @@ static int diff_grep(mmfile_t *one, mmfile_t *two, xdemitconf_t xecfg; if (!one) - return !regexec(regexp, two->ptr, 1, , 0); + return !regexec_buf(regexp, two->ptr, two->size, + 1, , 0); if (!two) - return !regexec(regexp, one->ptr, 1, , 0); + return !regexec_buf(regexp, one->ptr, one->size, + 1, , 0); /* * We have both sides; need to run textual diff and see if @@ -83,8 +81,8 @@ static unsigned int contains(mmfile_t *mf, regex_t *regexp, kwset_t kws) regmatch_t regmatch; int flags = 0; - assert(data[sz] == '\0'); - while (*data && !regexec(regexp, data, 1, , flags)) { + while (*data && + !regexec_buf(regexp, data, sz, 1, , flags)) { flags |= REG_NOTBOL; data += regmatch.rm_eo; if (*data && regmatch.rm_so == regmatch.rm_eo) diff --git a/t/t4061-diff-pickaxe.sh b/t/t4061-diff-pickaxe.sh index 5929f2e..f0bf50b 100755 --- a/t/t4061-diff-pickaxe.sh +++ b/t/t4061-diff-pickaxe.sh @@ -14,7 +14,7 @@ test_expect_success setup ' test_tick && git commit -m "A 4k file" ' -test_expect_failure '-G matches' ' +test_expect_success '-G matches' ' git diff --name-only -G "^0{4096}$" HEAD^ >out && test 4096-zeroes.txt = "$(cat out)" ' diff --git a/xdiff-interface.c b/xdiff-interface.c index f34ea76..50702a2 100644 --- a/xdiff-interface.c +++ b/xdiff-interface.c @@ -214,11 +214,10 @@ struct ff_regs { static long ff_regexp(const char *line, long len, char *buffer, long buffer_size, void *priv) { - char *line_buffer; struct ff_regs *regs = priv; regmatch_t pmatch[2]; int i; - int result = -1; + int result; /* Exclude terminating newline (and cr) from matching */ if (len > 0 && line[len-1] == '\n') { @@ -228,18 +227,16 @@ static long
[PATCH v3 2/3] Introduce a function to run regexec() on non-NUL-terminated buffers
We just introduced a test that demonstrates that our sloppy use of regexec() on a mmap()ed area can result in incorrect results or even hard crashes. So what we need to fix this is a function that calls regexec() on a length-delimited, rather than a NUL-terminated, string. Happily, there is an extension to regexec() introduced by the NetBSD project and present in all major regex implementation including Linux', MacOSX' and the one Git includes in compat/regex/: by using the (non-POSIX) REG_STARTEND flag, it is possible to tell the regexec() function that it should only look at the offsets between pmatch[0].rm_so and pmatch[0].rm_eo. That is exactly what we need. Since support for REG_STARTEND is so widespread by now, let's just introduce a helper function that uses it, and fall back to allocating and constructing a NUL-terminated when REG_STARTEND is not available. Signed-off-by: Johannes Schindelin--- git-compat-util.h | 21 + 1 file changed, 21 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index db89ba7..19128b3 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -965,6 +965,27 @@ void git_qsort(void *base, size_t nmemb, size_t size, #define qsort git_qsort #endif +static inline int regexec_buf(const regex_t *preg, const char *buf, size_t size, + size_t nmatch, regmatch_t pmatch[], int eflags) +{ +#ifdef REG_STARTEND + assert(nmatch > 0 && pmatch); + pmatch[0].rm_so = 0; + pmatch[0].rm_eo = size; + return regexec(preg, buf, nmatch, pmatch, eflags | REG_STARTEND); +#else + char *buf2 = xmalloc(size + 1); + int ret; + + memcpy(buf2, buf, size); + buf2[size] = '\0'; + ret = regexec(preg, buf2, nmatch, pmatch, eflags); + free(buf2); + + return ret; +#endif +} + #ifndef DIR_HAS_BSD_GROUP_SEMANTICS # define FORCE_DIR_SET_GID S_ISGID #else -- 2.10.0.windows.1.10.g803177d
[PATCH v3 0/3] Fix a segfault caused by regexec() being called on mmap()ed data
This patch series addresses a problem where `git diff` is called using `-G` or `-S --pickaxe-regex` on new-born files that are configured without user diff drivers, and that hence get mmap()ed into memory. The problem with that: mmap()ed memory is *not* NUL-terminated, yet the pickaxe code calls regexec() on it just the same. This problem has been reported by my colleague Chris Sidi. We solve this by introducing a helper, regexec_buf(), that takes a pointer and a length instead of a NUL-terminated string. This helper then uses REG_STARTEND where available, and falls back to allocating and constructing a NUL-terminated string. Given the wide-spread support for REG_STARTEND (Linux has it, MacOSX has it, Git for Windows has it because it uses compat/regex/ that has it), I think this is a fair trade-off. Changes since v2: - changed 3/3 to switch the test_expect_failure from 1/3 to a test_expect_success Johannes Schindelin (3): Demonstrate a problem: our pickaxe code assumes NUL-terminated buffers Introduce a function to run regexec() on non-NUL-terminated buffers Use the newly-introduced regexec_buf() function diff.c | 3 ++- diffcore-pickaxe.c | 18 -- git-compat-util.h | 21 + t/t4061-diff-pickaxe.sh | 22 ++ xdiff-interface.c | 13 - 5 files changed, 57 insertions(+), 20 deletions(-) create mode 100755 t/t4061-diff-pickaxe.sh Published-As: https://github.com/dscho/git/releases/tag/mmap-regexec-v3 Fetch-It-Via: git fetch https://github.com/dscho/git mmap-regexec-v3 Interdiff vs v2: diff --git a/t/t4061-diff-pickaxe.sh b/t/t4061-diff-pickaxe.sh index 5929f2e..f0bf50b 100755 --- a/t/t4061-diff-pickaxe.sh +++ b/t/t4061-diff-pickaxe.sh @@ -14,7 +14,7 @@ test_expect_success setup ' test_tick && git commit -m "A 4k file" ' -test_expect_failure '-G matches' ' +test_expect_success '-G matches' ' git diff --name-only -G "^0{4096}$" HEAD^ >out && test 4096-zeroes.txt = "$(cat out)" ' -- 2.10.0.windows.1.10.g803177d base-commit: 6ebdac1bab966b720d776aa43ca188fe378b1f4b
[PATCH v3 1/3] Demonstrate a problem: our pickaxe code assumes NUL-terminated buffers
When our pickaxe code feeds file contents to regexec(), it implicitly assumes that the file contents are read into implicitly NUL-terminated buffers (i.e. that we overallocate by 1, appending a single '\0'). This is not so. In particular when the file contents are simply mmap()ed, we can be virtually certain that the buffer is preceding uninitialized bytes, or invalid pages. Note that the test we add here is known to be flakey: we simply cannot know whether the byte following the mmap()ed ones is a NUL or not. Typically, on Linux the test passes. On Windows, it fails virtually every time due to an access violation (that's a segmentation fault for you Unix-y people out there). And Windows would be correct: the regexec() call wants to operate on a regular, NUL-terminated string, there is no NUL in the mmap()ed memory range, and it is undefined whether the next byte is even legal to access. When run with --valgrind it demonstrates quite clearly the breakage, of course. Being marked with `test_expect_failure`, this test will sometimes be declare "TODO fixed", even if it only passes by mistake. This test case represents a Minimal, Complete and Verifiable Example of a breakage reported by Chris Sidi. Signed-off-by: Johannes Schindelin--- t/t4061-diff-pickaxe.sh | 22 ++ 1 file changed, 22 insertions(+) create mode 100755 t/t4061-diff-pickaxe.sh diff --git a/t/t4061-diff-pickaxe.sh b/t/t4061-diff-pickaxe.sh new file mode 100755 index 000..5929f2e --- /dev/null +++ b/t/t4061-diff-pickaxe.sh @@ -0,0 +1,22 @@ +#!/bin/sh +# +# Copyright (c) 2016 Johannes Schindelin +# + +test_description='Pickaxe options' + +. ./test-lib.sh + +test_expect_success setup ' + test_commit initial && + printf "%04096d" 0 >4096-zeroes.txt && + git add 4096-zeroes.txt && + test_tick && + git commit -m "A 4k file" +' +test_expect_failure '-G matches' ' + git diff --name-only -G "^0{4096}$" HEAD^ >out && + test 4096-zeroes.txt = "$(cat out)" +' + +test_done -- 2.10.0.windows.1.10.g803177d
Re: [PATCH v2 3/3] Use the newly-introduced regexec_buf() function
Hi, On Thu, 8 Sep 2016, Johannes Schindelin wrote: > The new regexec_buf() function operates on buffers with an explicitly > specified length, rather than NUL-terminated strings. > > We need to use this function whenever the buffer we want to pass to > regexec() may have been mmap()ed (and is hence not NUL-terminated). > > Note: the original motivation for this patch was to fix a bug where > `git diff -G ` would crash. This patch converts more callers, > though, some of which explicitly allocated and constructed > NUL-terminated strings (or worse: modified read-only buffers to insert > NULs). > > Some of the buffers actually may be NUL-terminated. As regexec_buf() > uses REG_STARTEND where available, but has to fall back to allocating > and constructing NUL-terminated strings where REG_STARTEND is not > available, this makes the code less efficient in the latter case. > > However, given the widespread support for REG_STARTEND, combined with > the improved ease of code maintenance, we strike the balance in favor > of REG_STARTEND. > > Signed-off-by: Johannes Schindelin> --- > diff.c | 3 ++- > diffcore-pickaxe.c | 18 -- > xdiff-interface.c | 13 - > 3 files changed, 14 insertions(+), 20 deletions(-) I just realized that this should switch the test_expect_failure from 1/3 to a test_expect_success. Will send out v3 in a moment. Ciao, Dscho
Re: [PATCH 1/3] Demonstrate a problem: our pickaxe code assumes NUL-terminated buffers
Hi Peff, On Tue, 6 Sep 2016, Jeff King wrote: > On Mon, Sep 05, 2016 at 05:45:02PM +0200, Johannes Schindelin wrote: > > > Typically, on Linux the test passes. On Windows, it fails virtually > > every time due to an access violation (that's a segmentation fault for > > you Unix-y people out there). And Windows would be correct: the > > regexec() call wants to operate on a regular, NUL-terminated string, > > there is no NUL in the mmap()ed memory range, and it is undefined > > whether the next byte is even legal to access. > > > > When run with --valgrind it demonstrates quite clearly the breakage, of > > course. > > > > So we simply mark it with `test_expect_success` for now. > > I'd prefer if this were marked as expect_failure. It fails reliably for > me on Linux, even without --valgrind. But even if that were not so, > there is no reason to hurt bisectability of somebody running with > "--valgrind" (not when it costs so little to mark it correctly). Forgot to say in the cover letter: I did change this from test_expect_success to test_expect_failure. But of course, now I remember that I failed to change it back in 3/3. Bah. Ciao, Dscho
Re: [PATCH 2/3] diff_populate_filespec: NUL-terminate buffers
Hi Junio, On Wed, 7 Sep 2016, Junio C Hamano wrote: > Jeff Kingwrites: > > > What happens to those poor souls on systems without REG_STARTEND? Do > > they get to keep segfaulting? > > > > I think the solution is to push them into setting NO_REGEX. So looking > > at this versus a "regexecn", it seems: > > > > - this lets people keep using their native regexec if it supports > > STARTEND > > > > - this is a bit more clunky to use at the callsites (though we could > > _create_ a portable regexecn wrapper that uses this technique on top > > of the native regex library) > > > > But I much prefer this approach to copying the data just to add a NUL. > > I first thought "push them to NO_REGEX" to mean "they live with > crippled Git that does not do regexp" and went "Huh?", but it merely > means "let's avoid platform regex library and use on from the > compat/ hierarchy", which would solve the STARTEND portability issue > for everybody. > > Which is very good. > > The idea to create a thin regexecn() wrapper also sounds like a good > idea, too. The changes to the callsites in the demonstration patch > does look a bit clunky to me, too. The demonstration patch was only meant as a mere demonstration where this leads us. I DRY'd it up quite a bit (which was my plan all along, but it was faster to make the changes in place, to avoid a full-sale recompilation due to a central header change; you might not care because you use Linux with its native POSIX, while I have to use MSYS2, making even my builds slower). And I really do not think that it would be a good idea to use compat/regex/ for everybody, even if they already have a working regex.h on their system. Ciao, Dscho
Re: [PATCH 2/3] diff_populate_filespec: NUL-terminate buffers
Hi Peff, On Tue, 6 Sep 2016, Jeff King wrote: > On Tue, Sep 06, 2016 at 06:02:59PM +0200, Johannes Schindelin wrote: > > > It will still be quite tricky, because we have to touch a function that is > > rather at the bottom of the food chain: diff_populate_filespec() is called > > from fill_textconv(), which in turn is called from pickaxe_match(), and > > only pickaxe_match() knows whether we want to call regexec() or not (it > > depends on its regexp parameter). > > > > Adding a flag to diff_populate_filespec() sounds really reasonable until > > you see how many call sites fill_textconv() has. > > I was thinking of something quite gross, like a global "switch to using > slower-but-safer NUL termination" flag (but I agree with Junio's point > elsewhere that we do not even know if it is "slower"). Urgh. ;-) > > So now for the better idea. > > > > While I was researching the code for this reply, I hit upon one thing > > that I never knew existed, introduced in f96e567 (grep: use > > REG_STARTEND for all matching if available, 2010-05-22). Apparently, > > NetBSD introduced an extension to regexec() where you can specify > > buffer boundaries using REG_STARTEND. Which is pretty much what we > > need. > > Yes, and compat/regex support this, too. My question is whether it is > portable. That is only one question. Another, important question is: is it efficient? I have no idea whether there exists any hardware-accelerated regex library out there, maybe even using CUDA (I know that there is some code out there using SSE to perform LF -> CR/LF conversion, unfortunately it is intentionally incompatible with GPLv2). We cannot simply switch everybody and her dog to compat/regex/ just because we want to avoid a segfault. > > diff --git a/diff.c b/diff.c > > index 534c12e..2c5a360 100644 > > --- a/diff.c > > +++ b/diff.c > > @@ -951,7 +951,13 @@ static int find_word_boundaries(mmfile_t *buffer, > > regex_t *word_regex, > > { > > if (word_regex && *begin < buffer->size) { > > regmatch_t match[1]; > > - if (!regexec(word_regex, buffer->ptr + *begin, 1, match, > > 0)) { > > + int f = 0; > > +#ifdef REG_STARTEND > > + match[0].rm_so = 0; > > + match[0].rm_eo = *end - *begin; > > + f = REG_STARTEND; > > +#endif > > + if (!regexec(word_regex, buffer->ptr + *begin, 1, match, > > f)) { Heh. You introduced the same bug I did. Or maybe you just fetched my mmap-regexec branch and looked at an intermediate iteration? The problem with this patch is that *end is uninitialized. I actually initialized it in my patch, but it was still incorrect. I settled on using buffer->size - *begin in the end. > What happens to those poor souls on systems without REG_STARTEND? Do > they get to keep segfaulting? Of course not. Those poor souls on systems without REG_STARTEND pay a little price for that: malloc(); memcpy(); *end = '\0'; ... free(); I think it is worth it: maintenance of the code is much easier that way than forcing everybody and her dog and her dog's hamster to compat/regex/. > But I much prefer this approach to copying the data just to add a NUL. I think it is not worth the burden. The only regex implementation in semi-widespread use that do not support REG_STARTEND seems to be musl. I'd rather not spend *so much* effort just to support an obscure platform. Not when the users of that obscure platform could spend that effort themselves. And probably won't, because we only copy data to add a NUL on those platforms when regexec() is called on an mmfile_t. Better to keep it simple, Dscho
Re: [PATCH v2 0/3] Fix a segfault caused by regexec() being called on mmap()ed data
Hi, On Thu, 8 Sep 2016, Johannes Schindelin wrote: > We solve this by introducing a helper, regexec_buf(), that takes a > pointer and a length instead of a NUL-terminated string. BTW I should have clarified why I decided on another name than regexecn() (I had considered this even before reading Peff's proposed patch): the in string functions suggest a limiting of NUL-terminated strings. In other words, if n = 100 and the provided pointer points to a NUL-terminated string of length 3, the *n function will treat it as a string of length 3. That is not what regexec_buf() does: it ignores the NUL. Hence the different name. Ciao, Dscho
[PATCH v2 2/3] Introduce a function to run regexec() on non-NUL-terminated buffers
We just introduced a test that demonstrates that our sloppy use of regexec() on a mmap()ed area can result in incorrect results or even hard crashes. So what we need to fix this is a function that calls regexec() on a length-delimited, rather than a NUL-terminated, string. Happily, there is an extension to regexec() introduced by the NetBSD project and present in all major regex implementation including Linux', MacOSX' and the one Git includes in compat/regex/: by using the (non-POSIX) REG_STARTEND flag, it is possible to tell the regexec() function that it should only look at the offsets between pmatch[0].rm_so and pmatch[0].rm_eo. That is exactly what we need. Since support for REG_STARTEND is so widespread by now, let's just introduce a helper function that uses it, and fall back to allocating and constructing a NUL-terminated when REG_STARTEND is not available. Signed-off-by: Johannes Schindelin--- git-compat-util.h | 21 + 1 file changed, 21 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index db89ba7..19128b3 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -965,6 +965,27 @@ void git_qsort(void *base, size_t nmemb, size_t size, #define qsort git_qsort #endif +static inline int regexec_buf(const regex_t *preg, const char *buf, size_t size, + size_t nmatch, regmatch_t pmatch[], int eflags) +{ +#ifdef REG_STARTEND + assert(nmatch > 0 && pmatch); + pmatch[0].rm_so = 0; + pmatch[0].rm_eo = size; + return regexec(preg, buf, nmatch, pmatch, eflags | REG_STARTEND); +#else + char *buf2 = xmalloc(size + 1); + int ret; + + memcpy(buf2, buf, size); + buf2[size] = '\0'; + ret = regexec(preg, buf2, nmatch, pmatch, eflags); + free(buf2); + + return ret; +#endif +} + #ifndef DIR_HAS_BSD_GROUP_SEMANTICS # define FORCE_DIR_SET_GID S_ISGID #else -- 2.10.0.windows.1.10.g803177d
[PATCH v2 3/3] Use the newly-introduced regexec_buf() function
The new regexec_buf() function operates on buffers with an explicitly specified length, rather than NUL-terminated strings. We need to use this function whenever the buffer we want to pass to regexec() may have been mmap()ed (and is hence not NUL-terminated). Note: the original motivation for this patch was to fix a bug where `git diff -G ` would crash. This patch converts more callers, though, some of which explicitly allocated and constructed NUL-terminated strings (or worse: modified read-only buffers to insert NULs). Some of the buffers actually may be NUL-terminated. As regexec_buf() uses REG_STARTEND where available, but has to fall back to allocating and constructing NUL-terminated strings where REG_STARTEND is not available, this makes the code less efficient in the latter case. However, given the widespread support for REG_STARTEND, combined with the improved ease of code maintenance, we strike the balance in favor of REG_STARTEND. Signed-off-by: Johannes Schindelin--- diff.c | 3 ++- diffcore-pickaxe.c | 18 -- xdiff-interface.c | 13 - 3 files changed, 14 insertions(+), 20 deletions(-) diff --git a/diff.c b/diff.c index 534c12e..526775a 100644 --- a/diff.c +++ b/diff.c @@ -951,7 +951,8 @@ static int find_word_boundaries(mmfile_t *buffer, regex_t *word_regex, { if (word_regex && *begin < buffer->size) { regmatch_t match[1]; - if (!regexec(word_regex, buffer->ptr + *begin, 1, match, 0)) { + if (!regexec_buf(word_regex, buffer->ptr + *begin, +buffer->size - *begin, 1, match, 0)) { char *p = memchr(buffer->ptr + *begin + match[0].rm_so, '\n', match[0].rm_eo - match[0].rm_so); *end = p ? p - buffer->ptr : match[0].rm_eo + *begin; diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c index 55067ca..9795ca1 100644 --- a/diffcore-pickaxe.c +++ b/diffcore-pickaxe.c @@ -23,7 +23,6 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len) { struct diffgrep_cb *data = priv; regmatch_t regmatch; - int hold; if (line[0] != '+' && line[0] != '-') return; @@ -33,11 +32,8 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len) * caller early. */ return; - /* Yuck -- line ought to be "const char *"! */ - hold = line[len]; - line[len] = '\0'; - data->hit = !regexec(data->regexp, line + 1, 1, , 0); - line[len] = hold; + data->hit = !regexec_buf(data->regexp, line + 1, len - 1, 1, +, 0); } static int diff_grep(mmfile_t *one, mmfile_t *two, @@ -50,9 +46,11 @@ static int diff_grep(mmfile_t *one, mmfile_t *two, xdemitconf_t xecfg; if (!one) - return !regexec(regexp, two->ptr, 1, , 0); + return !regexec_buf(regexp, two->ptr, two->size, + 1, , 0); if (!two) - return !regexec(regexp, one->ptr, 1, , 0); + return !regexec_buf(regexp, one->ptr, one->size, + 1, , 0); /* * We have both sides; need to run textual diff and see if @@ -83,8 +81,8 @@ static unsigned int contains(mmfile_t *mf, regex_t *regexp, kwset_t kws) regmatch_t regmatch; int flags = 0; - assert(data[sz] == '\0'); - while (*data && !regexec(regexp, data, 1, , flags)) { + while (*data && + !regexec_buf(regexp, data, sz, 1, , flags)) { flags |= REG_NOTBOL; data += regmatch.rm_eo; if (*data && regmatch.rm_so == regmatch.rm_eo) diff --git a/xdiff-interface.c b/xdiff-interface.c index f34ea76..50702a2 100644 --- a/xdiff-interface.c +++ b/xdiff-interface.c @@ -214,11 +214,10 @@ struct ff_regs { static long ff_regexp(const char *line, long len, char *buffer, long buffer_size, void *priv) { - char *line_buffer; struct ff_regs *regs = priv; regmatch_t pmatch[2]; int i; - int result = -1; + int result; /* Exclude terminating newline (and cr) from matching */ if (len > 0 && line[len-1] == '\n') { @@ -228,18 +227,16 @@ static long ff_regexp(const char *line, long len, len--; } - line_buffer = xstrndup(line, len); /* make NUL terminated */ - for (i = 0; i < regs->nr; i++) { struct ff_reg *reg = regs->array + i; - if (!regexec(>re, line_buffer, 2, pmatch, 0)) { + if (!regexec_buf(>re, line, len, 2, pmatch, 0)) { if (reg->negate) - goto fail; +
[PATCH v2 0/3] Fix a segfault caused by regexec() being called on mmap()ed data
This patch series addresses a problem where `git diff` is called using `-G` or `-S --pickaxe-regex` on new-born files that are configured without user diff drivers, and that hence get mmap()ed into memory. The problem with that: mmap()ed memory is *not* NUL-terminated, yet the pickaxe code calls regexec() on it just the same. This problem has been reported by my colleague Chris Sidi. We solve this by introducing a helper, regexec_buf(), that takes a pointer and a length instead of a NUL-terminated string. This helper then uses REG_STARTEND where available, and falls back to allocating and constructing a NUL-terminated string. Given the wide-spread support for REG_STARTEND (Linux has it, MacOSX has it, Git for Windows has it because it uses compat/regex/ that has it), I think this is a fair trade-off. Changes since v1: - completely revamped the approach: now we no longer force-append a NUL whenever mmap()ing buffers for use by the diff machinery, but we fix things at the regexec() level. Johannes Schindelin (3): Demonstrate a problem: our pickaxe code assumes NUL-terminated buffers Introduce a function to run regexec() on non-NUL-terminated buffers Use the newly-introduced regexec_buf() function diff.c | 3 ++- diffcore-pickaxe.c | 18 -- git-compat-util.h | 21 + t/t4061-diff-pickaxe.sh | 22 ++ xdiff-interface.c | 13 - 5 files changed, 57 insertions(+), 20 deletions(-) create mode 100755 t/t4061-diff-pickaxe.sh Published-As: https://github.com/dscho/git/releases/tag/mmap-regexec-v2 Fetch-It-Via: git fetch https://github.com/dscho/git mmap-regexec-v2 Interdiff vs v1: diff --git a/diff.c b/diff.c index 32f7f46..526775a 100644 --- a/diff.c +++ b/diff.c @@ -951,7 +951,8 @@ static int find_word_boundaries(mmfile_t *buffer, regex_t *word_regex, { if (word_regex && *begin < buffer->size) { regmatch_t match[1]; - if (!regexec(word_regex, buffer->ptr + *begin, 1, match, 0)) { + if (!regexec_buf(word_regex, buffer->ptr + *begin, + buffer->size - *begin, 1, match, 0)) { char *p = memchr(buffer->ptr + *begin + match[0].rm_so, '\n', match[0].rm_eo - match[0].rm_so); *end = p ? p - buffer->ptr : match[0].rm_eo + *begin; @@ -2826,15 +2827,6 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags) s->data = strbuf_detach(, ); s->size = size; s->should_free = 1; - } else { - /* data must be NUL-terminated so e.g. for regexec() */ - char *data = xmalloc(s->size + 1); - memcpy(data, s->data, s->size); - data[s->size] = '\0'; - munmap(s->data, s->size); - s->should_munmap = 0; - s->data = data; - s->should_free = 1; } } else { diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c index 88820b6..9795ca1 100644 --- a/diffcore-pickaxe.c +++ b/diffcore-pickaxe.c @@ -23,7 +23,6 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len) { struct diffgrep_cb *data = priv; regmatch_t regmatch; - int hold; if (line[0] != '+' && line[0] != '-') return; @@ -33,11 +32,8 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len) * caller early. */ return; - /* Yuck -- line ought to be "const char *"! */ - hold = line[len]; - line[len] = '\0'; - data->hit = !regexec(data->regexp, line + 1, 1, , 0); - line[len] = hold; + data->hit = !regexec_buf(data->regexp, line + 1, len - 1, 1, + , 0); } static int diff_grep(mmfile_t *one, mmfile_t *two, @@ -49,12 +45,12 @@ static int diff_grep(mmfile_t *one, mmfile_t *two, xpparam_t xpp; xdemitconf_t xecfg; - assert(!one || one->ptr[one->size] == '\0'); - assert(!two || two->ptr[two->size] == '\0'); if (!one) - return !regexec(regexp, two->ptr, 1, , 0); + return !regexec_buf(regexp, two->ptr, two->size, + 1, , 0); if (!two) - return !regexec(regexp, one->ptr, 1, , 0); + return !regexec_buf(regexp, one->ptr, one->size, + 1, , 0); /* * We have both sides; need to run textual diff and see if @@ -85,8 +81,8 @@ static unsigned int contains(mmfile_t *mf, regex_t *regexp, kwset_t kws) regmatch_t regmatch; int flags = 0; - assert(data[sz] == '\0'); -
[PATCH v2 1/3] Demonstrate a problem: our pickaxe code assumes NUL-terminated buffers
When our pickaxe code feeds file contents to regexec(), it implicitly assumes that the file contents are read into implicitly NUL-terminated buffers (i.e. that we overallocate by 1, appending a single '\0'). This is not so. In particular when the file contents are simply mmap()ed, we can be virtually certain that the buffer is preceding uninitialized bytes, or invalid pages. Note that the test we add here is known to be flakey: we simply cannot know whether the byte following the mmap()ed ones is a NUL or not. Typically, on Linux the test passes. On Windows, it fails virtually every time due to an access violation (that's a segmentation fault for you Unix-y people out there). And Windows would be correct: the regexec() call wants to operate on a regular, NUL-terminated string, there is no NUL in the mmap()ed memory range, and it is undefined whether the next byte is even legal to access. When run with --valgrind it demonstrates quite clearly the breakage, of course. Being marked with `test_expect_failure`, this test will sometimes be declare "TODO fixed", even if it only passes by mistake. This test case represents a Minimal, Complete and Verifiable Example of a breakage reported by Chris Sidi. Signed-off-by: Johannes Schindelin--- t/t4061-diff-pickaxe.sh | 22 ++ 1 file changed, 22 insertions(+) create mode 100755 t/t4061-diff-pickaxe.sh diff --git a/t/t4061-diff-pickaxe.sh b/t/t4061-diff-pickaxe.sh new file mode 100755 index 000..5929f2e --- /dev/null +++ b/t/t4061-diff-pickaxe.sh @@ -0,0 +1,22 @@ +#!/bin/sh +# +# Copyright (c) 2016 Johannes Schindelin +# + +test_description='Pickaxe options' + +. ./test-lib.sh + +test_expect_success setup ' + test_commit initial && + printf "%04096d" 0 >4096-zeroes.txt && + git add 4096-zeroes.txt && + test_tick && + git commit -m "A 4k file" +' +test_expect_failure '-G matches' ' + git diff --name-only -G "^0{4096}$" HEAD^ >out && + test 4096-zeroes.txt = "$(cat out)" +' + +test_done -- 2.10.0.windows.1.10.g803177d
Re: [RFC/PATCH v2 0/3] patch-id for merges
On Wed, Sep 07, 2016 at 03:51:04PM -0700, Josh Triplett wrote: > > This is still marked RFC, because there are really two approaches here, > > and I'm not sure which one is better for "format-patch --base". I'd like > > to get input from Xiaolong Ye (who worked on --base), and Josh Triplett > > (who has proposed some patches in that area, and is presumably using > > them). > > Thanks. > > I'd love to see a more resilient patch-id mechanism, to make it easier > to match up patches between branches. I don't think it makes sense to > talk about the patch-id of a merge commit (though it might make sense > for a merge which makes additional changes not present in any of the > parents). Even if someone wants to match up merge commits with merge > commits, I don't think that should happen via patch-id; I think that > should happen in terms of "what patches does this merge introduce", > without constructing a merge-patch-id via a Merkle tree of commit > patch-ids. > > So, I think this patch series makes sense (modulo the comments about the > commit message in patch 3). We already don't respect merge commits when > doing format-patch; this seems consistent with that. If we ever make it > possible for format-patch to handle merge commits, then we should also > allow it to have merge commits as prerequisites. Thanks for the input. I knew that format-patch doesn't show merge commits, but I didn't realize that merges were skipped entirely for the base preparation (but I see it now; there is a "rev.max_parents = 1" setting in prepare_bases). So this really doesn't change the output there at all. And in fact, the switch to: if (commit_patch_id(commit, , sha1, 0)) - die(_("cannot get patch id")) + continue; should never hit that continue. It could be: die("BUG: somehow a merge got fed to commit_patch_id?"); but the "continue" somehow seems like the right thing to me. I'll wait another day or so for comments and then send the re-roll using this approach. -Peff
Re: [PATCH 0/3] Fix a segfault caused by regexec() being called on mmap()ed data
Hi Peff, sorry for the late answer, I was really busy trying to come up with a new and improved version of the patch series, and while hunting a bug I introduced got bogged down with other tasks. The good news is that I made up my mind about releasing a Git for Windows v2.10.0(2): originally, I had planned to do that today, to have time for any hot fixes until Sunday, if necessary, before going semi-dark. FWIW I am now trying to track my plans for v2.10.0(2) (or v2.10.1, if upstream Git v2.10.1 is released before) on GitHub: https://github.com/git-for-windows/git/milestone/3 On Tue, 6 Sep 2016, Jeff King wrote: > On Tue, Sep 06, 2016 at 04:06:32PM +0200, Johannes Schindelin wrote: > > > > I think re_search() the correct replacement function but it's been a > > > while since I've looked into it. > > > > The segfault I investigated happened in a call to strlen(). I see many > > calls to strlen() in compat/regex/... The one that triggers the segfault > > is in regexec(), compat/regex/regexec.c:241. > > Yes, that is the important one, I think. The others are for patterns, > error msgs, etc. Of course strlen() is not the only function that cares > about NUL delimiters (and there might even be a "while (*p)" somewhere > in the code). > > I always assumed the _point_ of re_search taking a ptr/len pair was > exactly to handle this case. The documentation[1] says: > >`string` is the string you want to match; it can contain newline and >null characters. `size` is the length of that string. > > Which seems pretty definitive to me (that's for re_match(), but > re_search() is defined in the docs in terms of re_match()). Right. The problem is: I *really* want to avoid using GNU-isms. > > The bigger problem is that re_search() is defined in the __USE_GNU section > > of regex.h, and I do not think it is appropriate to universally #define > > said constant before #include'ing regex.h. So it would appear that major > > surgery would be required if we wanted to use regular expressions on > > strings that are not NUL-terminated. > > We can contain this to the existing compat/regexec/regexec.c, and just > provide a wrapper that is similar to regexec but takes a ptr/len pair. But we can do even better than that: we can provide a wrapper that uses REG_STARTEND where available (which is really the majority of platforms we care about: Linux, MacOSX, Windows, and even the *BSDs). Where it is not available, we simply malloc(), memcpy() and append a NUL. Which is what my v2 does (will send it out in a moment). Ciao, Dscho
Re: [PATCH 2/3] diff_flush_patch_id: stop returning error result
On Thu, Sep 08, 2016 at 01:51:05AM +0100, Ramsay Jones wrote: > > > On 07/09/16 23:04, Jeff King wrote: > > All of our errors come from diff_get_patch_id(), which has > > exactly three error conditions. The first is an internal > > assertion, which should be a die("BUG") in the first place. > > > > The other two are caused by an inability to two diff blobs, >^ > Huh? ... to diff two blobs? Sorry. English my getting worse be to seems. Will fix in a re-roll. -Peff
Re: [PATCH 3/5] git-rebase--interactive: fix English grammar
Hi Alex, On Wed, 7 Sep 2016, Alex Henrie wrote: > diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh > index 7e558b0..6fd6d4e 100644 > --- a/git-rebase--interactive.sh > +++ b/git-rebase--interactive.sh > @@ -1082,7 +1082,7 @@ If they are meant to go into a new commit, run: > >git commit \$gpg_sign_opt_quoted > > -In both case, once you're done, continue with: > +In both cases, once you're done, continue with: Good catch! I ported this into my `prepare-sequencer` patch series. Ciao, Johannes
git-gui, was Re: [PATCH v2 6/6] git-gui: Update Japanese information
Hi Junio, On Wed, 7 Sep 2016, Junio C Hamano wrote: > Pat, we haven't heard from you for a long time. Indeed. There are a couple of git-gui patches in Git for Windows that have been contributed a long time ago and not been picked up. Maybe it is time to just accept git-gui patches directly into Git, after some other Tcl/Tk savvy people ACKed them? Ciao, Dscho
Re: [PATCH] compat: move strdup(3) replacement to its own file
Hi, On Wed, 7 Sep 2016, Junio C Hamano wrote: > René Scharfewrites: > > > Well, OK. I think the missing point is that the original nedmalloc > > doesn't come with strdup() and doesn't need it. Only _users_ of > > nedmalloc need it. Marius added it in nedmalloc.c, but strdup.c is a > > better place for it. > > Thanks. I'll add these lines like so: > > Move our implementation of strdup(3) out of compat/nedmalloc/ and > allow it to be used independently from USE_NED_ALLOCATOR. The > original nedmalloc doesn't come with strdup() and doesn't need it. > Only _users_ of nedmalloc need it, which was added when we imported > it to our compat/ hierarchy. > > This reduces the difference of our copy of nedmalloc from the > original, making it easier to update, and allows for easier testing > and reusing of our version of strdup(). Excellent! Thanks, Dscho
Re: Bug? ran into a "fatal" using interactive rebase
Hi Ralf, On Wed, 7 Sep 2016, Ralf Thielow wrote: > 2016-09-07 17:19 GMT+02:00 Johannes Schindelin: > > > > So, something like this should help (if you are interested in seeing this > > patch included, please run with it, as I am running short on time): What I meant is: could you craft a test, a proper commit message, and submit it? I will be mostly offline during the second half of September and have many other things to take care of right now. Thanks, Dscho
Re: [PATCH v2 00/38] Virtualization of the refs API
On 09/07/2016 09:20 PM, Junio C Hamano wrote: > Michael Haggertywrites: > >> This is v2 of the patch series to virtualize the references API >> (though earlier patch series similar in spirit were submitted by >> Ronnie Sahlberg and David Turner). Thanks to Junio, Eric, and Ramsay >> for their comments about v1 [1]. >> >> Nobody pointed out any fundamental problems with v1, but this version >> includes the following improvements: > > Curiously, many of these improvements were already in 'pu'. I pushed some of these changes to GitHub a while ago and mentioned that fact on the mailing list [1]; I assume you fetched from there. But I was waiting for the dust to settle from the earlier patch series and the 2.10 release before sending them to the mailing list again. >> * In "refs: add methods for reflog": >> >> * Don't export `files_reflog_iterator_begin()` (suggested by >> Ramsay). > > This I can see was missing in what has been in 'pu'. And according to your "What's cooking", you were waiting on this change before proceeding. FYI: I haven't done significant work on any next steps, which IMO could go in many possible directions: * Splitting loose and packed refs storage into two separate ref_store subclasses that are joined together using some kind of overlay_ref_store. This is not necessarily a blocker for the following ideas, but the overlay_ref_store would make it easier to construct other hybrid ref-stores, as for example the ubiquitous "most references are stored in main repo; per-worktree refs are stored in worktree". * Implementing reference caching as a ref_store that "writes through" changes to a backing reference store. This is also not a blocker for other ideas, but it would allow reference caching easily to be turned on/off for other reference stores (or even disabled for packed/loose reference storage during command invocations that know they won't have to iterate over references more than once). * A variant of loose reference storage that encodes the reference names somehow before turning them into filenames (1) to make refnames independent of filesystem path-name encoding issues, and (2) to eliminate D/F conflicts, thereby allowing reflogs to be retained after a reference is deleted. * A reference backend based on LMDB (or any other key-value store). * A reference backend based on Shawn Pearce's RefTree proposal or something like it [2]. It's uncertain when I'll next have a chunk of time to work on any of these. Michael [1] http://public-inbox.org/git/575990fb.5000...@alum.mit.edu/ [2] http://public-inbox.org/git/CAJo=hjvnapnaddcaawavu9c4rveqdos3ev9wtguhx4fd0v_...@mail.gmail.com/t
Re: [PATCH 5/5] unpack-trees: do not capitalize "working"
Alex Henriewrites: > In English, only proper nouns are capitalized. > > Signed-off-by: Alex Henrie > --- > unpack-trees.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/unpack-trees.c b/unpack-trees.c > index 11c37fb..c87a90a 100644 > --- a/unpack-trees.c > +++ b/unpack-trees.c > @@ -123,9 +123,9 @@ void setup_unpack_trees_porcelain(struct > unpack_trees_options *opts, > msgs[ERROR_SPARSE_NOT_UPTODATE_FILE] = > _("Cannot update sparse checkout: the following entries are not > up-to-date:\n%s"); > msgs[ERROR_WOULD_LOSE_ORPHANED_OVERWRITTEN] = > - _("The following Working tree files would be overwritten by > sparse checkout update:\n%s"); > + _("The following working tree files would be overwritten by > sparse checkout update:\n%s"); > msgs[ERROR_WOULD_LOSE_ORPHANED_REMOVED] = > - _("The following Working tree files would be removed by sparse > checkout update:\n%s"); > + _("The following working tree files would be removed by sparse > checkout update:\n%s"); Probably a leftover from an old sentence starting with Working? In any case, obviously correct too, thanks. -- Matthieu Moy http://www-verimag.imag.fr/~moy/
Re: [PATCH 3/5] git-rebase--interactive: fix English grammar
Alex Henriewrites: > --- a/git-rebase--interactive.sh > +++ b/git-rebase--interactive.sh > @@ -1082,7 +1082,7 @@ If they are meant to go into a new commit, run: > >git commit \$gpg_sign_opt_quoted > > -In both case, once you're done, continue with: > +In both cases, once you're done, continue with: I don't remember writing this, but since I'm Cc-ed I guess I did ;-). Obviously correct, thanks. -- Matthieu Moy http://www-verimag.imag.fr/~moy/
Re: [PATCH 1/3] diff.c: use diff_options directly
On Wed, Sep 7, 2016 at 4:36 PM, Stefan Bellerwrote: > The value of `ecbdata->opt` is accessible via the short variable `o` > already, so let's use that instead. > > Signed-off-by: Stefan Beller Seems reasonable. > --- > diff.c | 21 ++--- > 1 file changed, 10 insertions(+), 11 deletions(-) > > diff --git a/diff.c b/diff.c > index 534c12e..4a6501c 100644 > --- a/diff.c > +++ b/diff.c > @@ -1217,7 +1217,7 @@ static void fn_out_consume(void *priv, char *line, > unsigned long len) > const char *line_prefix = diff_line_prefix(o); > > if (ecbdata->header) { > - fprintf(ecbdata->opt->file, "%s", ecbdata->header->buf); > + fprintf(o->file, "%s", ecbdata->header->buf); > strbuf_reset(ecbdata->header); > ecbdata->header = NULL; > } > @@ -1229,9 +1229,9 @@ static void fn_out_consume(void *priv, char *line, > unsigned long len) > name_a_tab = strchr(ecbdata->label_path[0], ' ') ? "\t" : ""; > name_b_tab = strchr(ecbdata->label_path[1], ' ') ? "\t" : ""; > > - fprintf(ecbdata->opt->file, "%s%s--- %s%s%s\n", > + fprintf(o->file, "%s%s--- %s%s%s\n", > line_prefix, meta, ecbdata->label_path[0], reset, > name_a_tab); > - fprintf(ecbdata->opt->file, "%s%s+++ %s%s%s\n", > + fprintf(o->file, "%s%s+++ %s%s%s\n", > line_prefix, meta, ecbdata->label_path[1], reset, > name_b_tab); > ecbdata->label_path[0] = ecbdata->label_path[1] = NULL; > } > @@ -1249,15 +1249,15 @@ static void fn_out_consume(void *priv, char *line, > unsigned long len) > find_lno(line, ecbdata); > emit_hunk_header(ecbdata, line, len); > if (line[len-1] != '\n') > - putc('\n', ecbdata->opt->file); > + putc('\n', o->file); > return; > } > > if (len < 1) { > - emit_line(ecbdata->opt, reset, reset, line, len); > + emit_line(o, reset, reset, line, len); > if (ecbdata->diff_words > && ecbdata->diff_words->type == DIFF_WORDS_PORCELAIN) > - fputs("~\n", ecbdata->opt->file); > + fputs("~\n", o->file); > return; > } > > @@ -1282,8 +1282,8 @@ static void fn_out_consume(void *priv, char *line, > unsigned long len) > } > diff_words_flush(ecbdata); > if (ecbdata->diff_words->type == DIFF_WORDS_PORCELAIN) { > - emit_line(ecbdata->opt, context, reset, line, len); > - fputs("~\n", ecbdata->opt->file); > + emit_line(o, context, reset, line, len); > + fputs("~\n", o->file); > } else { > /* > * Skip the prefix character, if any. With > @@ -1294,7 +1294,7 @@ static void fn_out_consume(void *priv, char *line, > unsigned long len) > line++; > len--; > } > - emit_line(ecbdata->opt, context, reset, line, len); > + emit_line(o, context, reset, line, len); > } > return; > } > @@ -1316,8 +1316,7 @@ static void fn_out_consume(void *priv, char *line, > unsigned long len) > default: > /* incomplete line at the end */ > ecbdata->lno_in_preimage++; > - emit_line(ecbdata->opt, > - diff_get_color(ecbdata->color_diff, DIFF_CONTEXT), > + emit_line(o, diff_get_color(ecbdata->color_diff, > DIFF_CONTEXT), > reset, line, len); > break; > } > -- > 2.10.0.2.g0676c79.dirty >
Re: [PATCH v1 2/2] read-cache: make sure file handles are not inherited by child processes
> On 07 Sep 2016, at 20:23, Junio C Hamanowrote: > > Eric Wong writes: > >> We probably should be using O_NOATIME for all O_RDONLY cases >> to get the last bit of performance out (especially since >> non-modern-Linux systems probably still lack relatime). > > No, please do not go there. > > The user can read from a file in a working tree using "less", > "grep", etc., and they all update the atime, so should "git grep". > We do not use atime ourselves on these files but we should let > outside tools rely on the validity of atime (e.g. "what are the > files that were looked at yesterday?"). > > If you grep for noatime in our current codebase, you'd notice that > we use it only for files in objects/ hierarchy, and that makes very > good sense. These files are what we create for our _sole_ use and > no other tools can peek at them and expect to get any useful > information out of them (we hear from time to time that virus > scanners leaving open file descriptors on them causing trouble, but > that is an example of a useless access), and that makes a file in > objects/ hierarchy a fair game for noatime optimization. How do we deal with read-cache:ce_compare_data, though? By your definition above we shouldn't use NOATIME since the read file is not in objects/. However, the file read is not something the user explicitly triggers. The read is part of the Git internal "clean" machinery. What would you suggest? Should I open the file with or without NOATIME? Thanks, Lars