Re: git difftool with symlink to readonly jar failed
Hi, On Thu, May 24, 2018 at 11:11 PM, Etienne d'Hautefeuillewrote: > > #try a diff > git difftool --dir-diff 4cb98b4a307ce97d9e6b8e4b03211fa5ca8af7e7 > 0244799661b993b1f78fa5afb621de3fe4c4a39c > fatal: impossible d'ouvrir '/tmp/git-difftool.UQ4mqo/left/jenkins.war' en > écriture: Permission non accordée You should use LANG=C so that people can understand the error message. Also git difftool launches another program that will actually perform the diff. It looks like it is bcompare on your setup. Did you try with another program?
Re: [PATCH v3 00/20] Integrate commit-graph into 'fsck' and 'gc'
Derrick Stoleewrites: > One other change worth mentioning: in "commit-graph: add '--reachable' > option" I put the ref-iteration into a new external > 'write_commit_graph_reachable()' method inside commit-graph.c. This > makes the 'gc: automatically write commit-graph files' a simpler change. ;-).
Re: [PATCH 4/4] fetch: implement fetch.fsck.*
Ævar Arnfjörð Bjarmasonwrites: > fsck.skipList:: > - Like `fsck.` this variable has a corresponding > - `receive.fsck.skipList` variant. > + Like `fsck.` this variable has corresponding > + `receive.fsck.skipList` and `fetch.fsck.skipList` variants. > + > The path to a sorted list of object names (i.e. one SHA-1 per line) > that are known to be broken in a non-fatal way and should be I think I've said this already, but I tend to agree with Eric that this is the other way around. Perhaps that is because I consider fsck. the most basic one people would want to understand first, and then corresponding .fsck. a mere specialization of it. So "Here is what fsck.skipList does" followed by "By the way, you can configure it only for the (internal) fsck run during the object transfer with transfer.fsck.skipList" feels more natural presentation order. > diff --git a/fetch-pack.c b/fetch-pack.c > index 490c38f833..9e4282788e 100644 > --- a/fetch-pack.c > +++ b/fetch-pack.c > @@ -19,6 +19,7 @@ > #include "sha1-array.h" > #include "oidset.h" > #include "packfile.h" > +#include "fsck.h" > > static int transfer_unpack_limit = -1; > static int fetch_unpack_limit = -1; > @@ -33,6 +34,7 @@ static int agent_supported; > static int server_supports_filtering; > static struct lock_file shallow_lock; > static const char *alternate_shallow_file; > +static struct strbuf fsck_msg_types = STRBUF_INIT; > > /* Remember to update object flag allocation in object.h */ > #define COMPLETE (1U << 0) > @@ -935,7 +937,8 @@ static int get_pack(struct fetch_pack_args *args, >*/ > argv_array_push(, "--fsck-objects"); > else > - argv_array_push(, "--strict"); > + argv_array_pushf(, "--strict%s", > + fsck_msg_types.buf); This made a reader wonder what fsck_msg_types.buf[] has. It is empty or a comma separated list of things, prefixed with =, that is constructed by repeated calls to fetch_pack_config_cb(), so syntactically what we feed index-pack looks like "--strict", "--strict=thing", or "--strict=thing1,thing2,,thingn". And each "thing" is either "=" or "skiplist=". The buffer that has both msgtype specification and object name should not be called fsck_msg_types, though. It is probably fsck_exception or something. > + strbuf_addf(_msg_types, "%cskiplist=%s", > + fsck_msg_types.len ? ',' : '=', path); > + free((char *)path); > + return 0; > + } > + > + if (skip_prefix(var, "fetch.fsck.", )) { > + if (is_valid_msg_type(var, value)) > + strbuf_addf(_msg_types, "%c%s=%s", > + fsck_msg_types.len ? ',' : '=', var, value); > + else > + warning("Skipping unknown msg id '%s'", var); > + return 0; > + } > + > + return git_default_config(var, value, cb); > +}
Re: Wrong -dirty suffix set by setlocalversion (was: BUG in git diff-index)
On 24/05/2018 16:03, Mike Mason wrote: > diff --git a/scripts/setlocalversion b/scripts/setlocalversion > index 71f39410691b..9da4c5e83285 100755 > --- a/scripts/setlocalversion > +++ b/scripts/setlocalversion > @@ -73,8 +73,10 @@ scm_version() > printf -- '-svn%s' "`git svn find-rev $head`" > fi > > - # Check for uncommitted changes > - if git diff-index --name-only HEAD | grep -qv > "^scripts/package"; then > + # Check for uncommitted changes. Only check mtime and size. > + # Ignore insequential ctime, uid, gid and inode differences. > + if git -c "core.checkstat=minimal" diff-index --name-only HEAD > | \ > + grep -qv "^scripts/package"; then > printf '%s' -dirty > fi FWIW: Reported-by: marc.herb...@intel.com Reviewed-by: marc.herb...@intel.com (assuming a future and decent commit message) Tested-by: marc.herb...@intel.com So the real use case is making a copy of a whole tree before building. Typical in automated builds, old example: https://groups.google.com/a/chromium.org/d/msg/chromium-os-dev/zxOa0OLWFkw/N_Sb7EZOBwAJ Here's a more complex but faster and more transparent way to test Mike's fix than copying an entire tree: # Make sure you start from a clean state git describe --dirty # must not -dirty make prepare # Simulate a copy of the tree but with just one file rsync --perms --times README README.mtime_backup rm README rsync --perms --times README.mtime_backup README stat README README.mtime_backup # Demo the BUG fixed by Mike ./scripts/setlocalversion # -dirty BUG! because spurious inode ctime difference git diff-index HEAD git describe --dirty # not -dirty ./scripts/setlocalversion # not -dirty any more cause describe refreshed index # Make sure mtime still causes -dirty with AND without Mike's fix touch README ./scripts/setlocalversion # -dirty
Re: [PATCH 1/2] diff: turn --ita-invisible-in-index on by default
Hi Duy, Nguyễn Thái Ngọc Duy wrote: > Due to the implementation detail of intent-to-add entries. The current > "git diff" (i.e. no treeish or --cached argument) would show the > changes in the i-t-a file, but it does not mark the file as new, while > "diff --cached" would mark the file as new while showing its content > as empty. > > One evidence of the current output being wrong is that, the output > from "git diff" (with ita entries) cannot be applied because it > assumes empty files exist before applying. > > Turning on --ita-invisible-in-index [1] [2] would fix this. I'm having a lot of trouble understanding the above. Part of my confusion may be grammatical: for example, the first sentence is a sentence fragment. Another part is that the commit message doesn't tell me a story: what does the user try to do and fail that is not possible without this? What is the intention or effect behind the commits mentioned that leads to them being cited? To put it another way, the basic aspects I look for in a commit message are: - the motivation behind the change (a wrong behavior, a task that isn't possible, some excessive complexity, or whatever). The reader doesn't know your motivation so their default attitude will be to assume that nothing should change - a little more detail about the why and how behind the current behavior, to put the proposed in context. This makes it easier for the reader to understand how the change will affect users of that behavior that don't necessarily have the same motivation. An example illustrating the behavior can work well here. - any interesting details of implementation or alternatives considered that can make the patch easier to read now that the motivation is out of the way. - a word or two on what this makes possible I'm having trouble pulling apart these pieces in this commit message. Can you give an example of a command's output before and after this change so I can understand better why it's a good one? > This option is on by default in git-status [1] but we need more fixup > in rename detection code [3]. Luckily we don't need to do anything > else for the rename detection code in diff.c (wt-status.c uses a > customized one). > > [1] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist > in index" - 2016-10-24) > [2] b42b451919 (diff: add --ita-[in]visible-in-index - 2016-10-24) > [3] bc3dca07f4 (Merge branch 'nd/ita-wt-renames-in-status' - 2018-01-23) > > Signed-off-by: Nguyễn Thái Ngọc Duy> --- > builtin/diff.c | 7 +++ > t/t2203-add-intent.sh | 37 ++--- > t/t4011-diff-symlink.sh | 10 ++ > 3 files changed, 43 insertions(+), 11 deletions(-) This flips the default as announced but I'm not sure yet whether it's a good thing. After all, an intent-to-add entry is a real entry in the index; why wouldn't it show up in "git diff --cached"? Is the idea that it shouldn't show up there because "git commit" would not include the intent-to-add entry? That makes some sense to me. What does the endgame look like? Would we flip the default to --ita-invisible and then remove the option? Context is that an internal script does something like echo 'This file is added!' >added git add --intent-to-add added git diff --name-only --no-renames --diff-filter=A master to list added files and started failing with this patch (in "next"). Arguably the script should use diff-index for a stronger stability guarantee. Should the script use --ita-visible as a futureproofing measure as well? Actually, why is this "git diff" output changing at all, given that the script doesn't pass --cached? I would expect "git diff" to show the ITA entry because it affects the result of "git commit -a". Thanks, Jonathan
Re: [PATCH 3/4] config doc: elaborate on what transfer.fsckObjects does
Eric Sunshinewrites: >> +will instead be left unreferenced in the repository. That's considered >> +a bug, and hopefully future git release will implement a quarantine >> +for the "fetch" side as well. > > If this was a "BUGS" section in a man-page, the above might be less > scary. In this context, however, I wonder if it makes sense to tone it > down a bit: > > On the fetch side, malformed objects will instead be left > unreferenced in the repository. (However, in the future, such > objects may be quarantined for "fetch", as well.) I had an impression that nobody else sayd it is considered as a bug. Do we need to say it in this series? I'd rather not--with or without such a future modification (or lack of plan thereof), teaching the fetch side to pay attention to the various fsck tweaks is an improvement.
Re: [PATCH 1/4] config doc: don't describe *.fetchObjects twice
Ævar Arnfjörð Bjarmasonwrites: > Change the copy/pasted description of the fetch.fsckObjects and > receive.fsckObjects variables to refer to transfer.fsckObjects > instead. > > Let's not duplicate the description of what *.fsckObjects does twice. > instead let's refer to transfer.fsckObjects from both fetch.* and > receive.*. The two paragraphs above are duplicating what each other says, perhaps meant as a half-joke? Well played if that is the case ;-). Thanks for polishing this area.
Re: [PATCH] branch: issue "-l" deprecation warning after pager starts
On Fri, May 25, 2018 at 10:55:45AM +0900, Junio C Hamano wrote: > Jeff Kingwrites: > > > Hmm, actually, I suppose the true value of the warning is to help people > > doing "git branch -l foo", and it would still work there. The "more > > extreme" from your suggested patch would only affect "branch -l". > > > Still, I think I prefer the gentler version that we get by keeping it as > > a warning even in the latter case. > > "git branch -l newbranch [forkpoint]" that warns "We won't be doing > reflog creation with -l" is good, but "git branch -l" that warns "We > won't be doing reflog creation with -l" sounds like a pure noise, as > the user would say "Irrelevant, I am not doing that anyway--I am > listing". > > The warning to prepare users for the next step jk/branch-l-1-removal > should say "we won't be accepting '-l' as a silent and unadvertised > synonym soon. Spell it as --list" when "git branch -l" is given, I > would think. I hoped that reminding them that "-l is a synonym for --create-reflog" would serve as a gentle reminder that they're Doing It Wrong. I guess we could be more explicit, though. It is not "we won't be accepting -l as a synonym" though. It was never a synonym, it's just that it didn't happen to do anything in list mode. > > @@ -700,6 +700,11 @@ int cmd_branch(int argc, const char **argv, const char > > *prefix) > > if (list) > > setup_auto_pager("branch", 1); > > > > + if (used_deprecated_reflog_option) { > > + warning("the '-l' alias for '--create-reflog' is deprecated;"); > > + warning("it will be removed in a future version of Git"); > > + } > > So from that point of view, we may need a separate message to warn > users who _do_ want listing with '-l' before jk/branch-l-1-removal > removes it? > > The jk/branch-l-2-resurrection topic later repurposes '-l' for > '--list' but until that happens 'git branch -l' will error not, no? Yes, after step 1 it will error out. Again, I hoped the existing message would prepare people. But maybe we should do this on top of what I posted earlier? -- >8 -- Subject: [PATCH] branch: customize "-l" warning in list mode People mistakenly use "git branch -l", thinking that it triggers list mode. It doesn't, but the lack of non-option arguments in that command does (and the "-l" becomes a silent noop). Since afc968e579 (branch: deprecate "-l" option, 2018-03-26) we've warned that "-l" is going away. But the warning text is primarily aimed at people who _meant_ to use "-l", as in "git branch -l foo". People who mistakenly said "git branch -l" may be left puzzled. Let's make it clear that: 1. No, "-l" didn't do what they thought here. 2. It's going away, and what they should do instead. Signed-off-by: Jeff King --- builtin/branch.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/builtin/branch.c b/builtin/branch.c index 55bfacd843..b0b33dab94 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -701,8 +701,14 @@ int cmd_branch(int argc, const char **argv, const char *prefix) setup_auto_pager("branch", 1); if (used_deprecated_reflog_option) { - warning("the '-l' alias for '--create-reflog' is deprecated;"); - warning("it will be removed in a future version of Git"); + if (list) { + warning("the '-l' option is an alias for '--create-reflog' and"); + warning("has no effect in list mode. This option will soon be"); + warning("removed and you should omit it (or use '--list' instead)."); + } else { + warning("the '-l' alias for '--create-reflog' is deprecated;"); + warning("it will be removed in a future version of Git"); + } } if (delete) { -- 2.17.0.1391.g6fdbf40724
Re: [PATCH 2/4] config doc: unify the description of fsck.* and receive.fsck.*
Eric Sunshinewrites: > I see where you're coming from, however, I would think that readers > arriving at this topic (generally) do so as a result of actively > looking for it (as opposed to happening upon it), in which case they > probably are directly seeking information about it; the incidental > information is just a bonus after reading what they came to learn. Yup. That matches my mental model as well.
Re: [PATCH] branch: issue "-l" deprecation warning after pager starts
Jeff Kingwrites: > Hmm, actually, I suppose the true value of the warning is to help people > doing "git branch -l foo", and it would still work there. The "more > extreme" from your suggested patch would only affect "branch -l". > Still, I think I prefer the gentler version that we get by keeping it as > a warning even in the latter case. "git branch -l newbranch [forkpoint]" that warns "We won't be doing reflog creation with -l" is good, but "git branch -l" that warns "We won't be doing reflog creation with -l" sounds like a pure noise, as the user would say "Irrelevant, I am not doing that anyway--I am listing". The warning to prepare users for the next step jk/branch-l-1-removal should say "we won't be accepting '-l' as a silent and unadvertised synonym soon. Spell it as --list" when "git branch -l" is given, I would think. > @@ -700,6 +700,11 @@ int cmd_branch(int argc, const char **argv, const char > *prefix) > if (list) > setup_auto_pager("branch", 1); > > + if (used_deprecated_reflog_option) { > + warning("the '-l' alias for '--create-reflog' is deprecated;"); > + warning("it will be removed in a future version of Git"); > + } So from that point of view, we may need a separate message to warn users who _do_ want listing with '-l' before jk/branch-l-1-removal removes it? The jk/branch-l-2-resurrection topic later repurposes '-l' for '--list' but until that happens 'git branch -l' will error not, no?
Re: "man git-tag" inconsistent about whether you can tag non-commit objects
"Robert P. J. Day"writes: > embarrassed to admit i had no idea that you could tag non-commit > objects, only realized that when i was reading the man page and saw: > > SYNOPSIS > git tag [-a | -s | -u ] [-f] [-m | -F ] [-e] > [ | ] > > > so i tried it and, sure enough, i could tag a blob object. but if you > read further into DESCRIPTION, about halfway through, you read: > > "Otherwise just a tag reference for the SHA-1 object name of the >commit object is created (i.e. a lightweight tag)." >^^ > > which suggests only commit objects. finally, much further down, under > OPTIONS: > > ", > The object that the new tag will refer to, usually a commit. > > > so to clean this up, is it sufficient to just change that middle line > to say "object" rather than "commit object"? or is there more in the > man page that needs tweaking? As that sentence talks about a lightweight tag (i.e. a reference in refs/tags/ hierarchy that directly points at an object of any kind), another possibility would be to say Otherwise a tag reference that directly points at the given object (i.e. lightweight tag) is created.
Re: [PATCH 2/5] t6036, t6042: use test_line_count instead of wc -l
SZEDER Gáborwrites: >> -test 2 = $(git ls-files -s | wc -l) && >> -test 2 = $(git ls-files -u | wc -l) && >> -test 2 = $(git ls-files -o | wc -l) && > > Here 'git ls-files -o' should have listed two untracked files ... > >> +git ls-files -s >out && >> +test_line_count = 2 out && >> +git ls-files -u >out && >> +test_line_count = 2 out && >> +git ls-files -o >out && >> +test_line_count = 3 out && > > ... but now you expect it to list three. I was about to point out the > typo, but then noticed that you expect it to list one more untracked > file than before in all subsequent tests... now that can't be just a > typo, can it? > > Please mention in the commit message that when using an intermediate > file to store the output, 'git ls-files -o' will list that file, too, > that's why the number of expected untracked files had to be adjusted; > so future readers won't have to figure this out themselves. I'd expect that a reader of the commit who cares enough to bother to wonder by looking at the patch and seeing that 2 became 3 would know why already. And a reader of the resulting file would not know that the 3 used to be 2, and won't be helped by "we used to count to 2, now we have 'out' also counted" that much, especially in the commit log message. What would help the latter would be to name which three paths we expect to see in the comment (or test against the exact list of paths, instead of using test_line_count). > An alternative to consider would be to add a .gitignore file in the > initial commit to ignore 'out', then the number of untracked files > don't have to be adjusted. I think that is a preferred solution that we've used in ls-files and status tests successfully.
Re: [RFC/PATCH 4/7] rerere: fix crash when conflict goes unresolved
Thomas Gummererwrites: > Sorry I botched the description here, and failed to describe what the > code is actually doing. We're actually only removing the variant in > the MERGE_RR file, whose path we are now no longer able to handle. Oh, that's absolutely fine, then. Thanks for a prompt update.
Re: [PATCH v2] rev-parse: check lookup'ed commit references for NULL
Elijah Newrenwrites: > I would have used a Reported-by tag for Florian and Todd, but looking at > the bugzilla.redhat.com bug report doesn't show me Florian's email > address. I grepped through git logs and found two associated with that > name, but didn't know if they were still accurate, or were a different > Florian. So I just went with the sentence instead. Or write names after reported-by without any address? There is no law that says that a trailer's contents must be proper e-mail addresses. People are already known to put garbage on Cc:, for example. > builtin/rev-parse.c | 8 ++-- > t/t6101-rev-parse-parents.sh | 8 > 2 files changed, 14 insertions(+), 2 deletions(-) > > diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c > index a1e680b5e9..a0a0ace38d 100644 > --- a/builtin/rev-parse.c > +++ b/builtin/rev-parse.c > @@ -282,6 +282,10 @@ static int try_difference(const char *arg) > struct commit *a, *b; > a = lookup_commit_reference(_oid); > b = lookup_commit_reference(_oid); > + if (!a || !b) { > + *dotdot = '.'; > + return 0; > + } We thought A..B or X...Y were a commit range, but it turns out that it is not the case, since at least one end is not a committish. We simply restore the original and tell "No, this is not a range, try to parse it as something else" to the caller by returning 0. Makes sense. > @@ -328,12 +332,12 @@ static int try_parent_shorthands(const char *arg) > return 0; > > *dotdot = 0; > - if (get_oid_committish(arg, )) { > + if (get_oid_committish(arg, ) || > + !(commit = lookup_commit_reference())) { > *dotdot = '^'; > return 0; > } > > - commit = lookup_commit_reference(); OK, the logic flows the same way for things like foo^@ here, which makes sense. Looks good. Thanks.
Re: "git grep" and "working tree" vs "working directory"
Stefan Bellerwrites: > There are 2 dimensions to it: > * (where you are) > if you run git-grep from a sub directory of the repository, then the > "sub-working-tree" > will be searched. s/the repository/the top level directory of the working tree/, perhaps? >> also, at the bottom of that section, one reads: >> >> ... >> If given, limit the search to paths matching at least one >> pattern. Both leading paths match and glob(7) patterns are supported. >> >> For more details about the syntax, see the pathspec >> entry in gitglossary(7). >> >> but, again, what if is *not* given? then what? > > Assume "$pwd/." This is not technically wrong per-se, but I do not think there is any reason to encourage it over just a simple "." dot. >> finally, the first example has the same problem: >> >> git grep 'time_t' -- '*.[ch]' >> Looks for time_t in all tracked .c and .h files in the >> working directory and its subdirectories. >> >> in "the working directory"? >> >> what is the proper terminology to be used here? > > the working directory sounds right, not sure which aspect you want to be > exposed more clearly. "The part of the working tree below and including your current working directory", if you really want to be pedantic ;-). But almost all the examples that show how to work _with_ Git inspecting and manipulating tracked contents assume that your current working directory _is_ inside a working tree of the repository you are working on, so the above is equivalent to "The current working directory" should be clear enough for most readers, I would think.
Re: [PATCH v3 01/20] commit-graph: UNLEAK before die()
On 5/24/2018 6:47 PM, Stefan Beller wrote: On Thu, May 24, 2018 at 9:25 AM, Derrick Stoleewrote: Signed-off-by: Derrick Stolee --- builtin/commit-graph.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index 37420ae0fd..f0875b8bf3 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -51,8 +51,11 @@ static int graph_read(int argc, const char **argv) graph_name = get_commit_graph_filename(opts.obj_dir); graph = load_commit_graph_one(graph_name); - if (!graph) + if (!graph) { + UNLEAK(graph_name); die("graph file %s does not exist", graph_name); Unrelated to this patch: Is the command that ends up die()ing here a plumbing or porcelain, or: Do we want to translate the message here? In a lot of commands that show paths we single quote them '%s', (speaking from experience with a lot of submodule path code) This is for the 'git commit-graph read' command, which is plumbing (and 'read' is really only for testing). I don't think this message requires translation. I'll keep the quotes in mind for the future. Thanks, -Stolee
Re: [PATCH] submodule: do not pass null OID to setup_revisions
On Thu, 24 May 2018 16:07:49 -0700 Stefan Bellerwrote: > Hi Jonathan, > > On Thu, May 24, 2018 at 1:47 PM, Jonathan Tan > wrote: > > If "git pull --recurse-submodules --rebase" is invoked when the current > > branch and its corresponding remote-tracking branch have no merge base, > > a "bad object" fatal error occurs. This issue was introduced with commit > > a6d7eb2c7a ("pull: optionally rebase submodules (remote submodule > > changes only)", 2017-06-23), which also introduced this feature. > > Ok, what should happen instead? Just for there not to be this "bad object" error. > > This is because cmd_pull() in builtin/pull.c thus invokes > > submodule_touches_in_range() with a null OID as the first parameter. > > Ensure that this case works, and document what happens in this case. > > By documenting you mean adding a test, i.e. documenting it for the > developers, not the users. I also updated the submodule.h file, but yes, it is for the developers. I'll change the commit message to make this more clear if I need a reroll. > I inserted a test_pause here and inspect child: > * the submodule is the same as in parent, so this patch is > just testing it works with submodules the same? > * No, the submodule is not cloned into the child > at all. So we do not know what do do with the submodule. Yes, this test doesn't do much. I just wanted to make sure that submodule_touches_in_range() could be called without encountering this unrelated error. (Incidentally, we might want to add tests for the "cannot rebase with locally recorded submodule modifications", but I haven't looked into that.) > However this patch is about making sure the superproject > works out well, without this patch we'd have > $ git -C child pull --recurse-submodules --rebase > fatal: bad object > which is to be avoided. > > Yes I think this is the best way to fix the issue, I thought for some time > that > could first check if submodules are initialzed or active, but these > are checks are done afterwards, so this is ok. > > Reviewed-by: Stefan Beller Thanks!
Re: [PATCH] submodule: do not pass null OID to setup_revisions
Hi Jonathan, On Thu, May 24, 2018 at 1:47 PM, Jonathan Tanwrote: > If "git pull --recurse-submodules --rebase" is invoked when the current > branch and its corresponding remote-tracking branch have no merge base, > a "bad object" fatal error occurs. This issue was introduced with commit > a6d7eb2c7a ("pull: optionally rebase submodules (remote submodule > changes only)", 2017-06-23), which also introduced this feature. Ok, what should happen instead? > This is because cmd_pull() in builtin/pull.c thus invokes > submodule_touches_in_range() with a null OID as the first parameter. > Ensure that this case works, and document what happens in this case. By documenting you mean adding a test, i.e. documenting it for the developers, not the users. > > Signed-off-by: Jonathan Tan > --- > submodule.c | 6 -- > submodule.h | 5 - > t/t5572-pull-submodule.sh | 21 + > 3 files changed, 29 insertions(+), 3 deletions(-) > > diff --git a/submodule.c b/submodule.c > index 74d35b2577..49def93dd9 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -1169,8 +1169,10 @@ int submodule_touches_in_range(struct object_id > *excl_oid, > > argv_array_push(, "--"); /* args[0] program name */ > argv_array_push(, oid_to_hex(incl_oid)); > - argv_array_push(, "--not"); > - argv_array_push(, oid_to_hex(excl_oid)); > + if (!is_null_oid(excl_oid)) { > + argv_array_push(, "--not"); > + argv_array_push(, oid_to_hex(excl_oid)); > + } > > collect_changed_submodules(, ); > ret = subs.nr; > diff --git a/submodule.h b/submodule.h > index e5526f6aaa..1fd7111f60 100644 > --- a/submodule.h > +++ b/submodule.h > @@ -94,7 +94,10 @@ extern int merge_submodule(struct object_id *result, const > char *path, >const struct object_id *a, >const struct object_id *b, int search); > > -/* Checks if there are submodule changes in a..b. */ > +/* > + * Checks if there are submodule changes in a..b. If a is the null OID, > + * checks b and all its ancestors instead. > + */ > extern int submodule_touches_in_range(struct object_id *a, > struct object_id *b); > extern int find_unpushed_submodules(struct oid_array *commits, > diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh > index 321bd37deb..f916729a12 100755 > --- a/t/t5572-pull-submodule.sh > +++ b/t/t5572-pull-submodule.sh > @@ -132,4 +132,25 @@ test_expect_success 'pull rebase recursing fails with > conflicts' ' > test_i18ngrep "locally recorded submodule modifications" err > ' > > +test_expect_success 'branch has no merge base with remote-tracking > counterpart' ' > + rm -rf parent child && > + > + test_create_repo a-submodule && > + test_commit -C a-submodule foo && > + > + test_create_repo parent && > + git -C parent submodule add "$(pwd)/a-submodule" && > + git -C parent commit -m foo && > + > + git clone parent child && > + > + # Reset master so that it has no merge base with > + # refs/remotes/origin/master. > + OTHER=$(git -C child commit-tree -m bar \ > + $(git -C child rev-parse HEAD^{tree})) && > + git -C child reset --hard "$OTHER" && I inserted a test_pause here and inspect child: * the submodule is the same as in parent, so this patch is just testing it works with submodules the same? * No, the submodule is not cloned into the child at all. So we do not know what do do with the submodule. However this patch is about making sure the superproject works out well, without this patch we'd have $ git -C child pull --recurse-submodules --rebase fatal: bad object which is to be avoided. Yes I think this is the best way to fix the issue, I thought for some time that could first check if submodules are initialzed or active, but these are checks are done afterwards, so this is ok. Reviewed-by: Stefan Beller Thanks! Stefan
Re: Wrong -dirty suffix set by setlocalversion (was: BUG in git diff-index)
How about something like this? It ignores attributes that should have no bearing on whether the kernel is considered dirty. Copied trees with no other changes would no longer be marked with -dirty. Plus it works on read-only media since no index updating is required. Would this also be considered kosher, at least for the purposes of setlocalversion? diff --git a/scripts/setlocalversion b/scripts/setlocalversion index 71f39410691b..9da4c5e83285 100755 --- a/scripts/setlocalversion +++ b/scripts/setlocalversion @@ -73,8 +73,10 @@ scm_version() printf -- '-svn%s' "`git svn find-rev $head`" fi - # Check for uncommitted changes - if git diff-index --name-only HEAD | grep -qv "^scripts/package"; then + # Check for uncommitted changes. Only check mtime and size. + # Ignore insequential ctime, uid, gid and inode differences. + if git -c "core.checkstat=minimal" diff-index --name-only HEAD | \ + grep -qv "^scripts/package"; then printf '%s' -dirty fi
Re: why do "git log -h" and "git show -h" print the same thing?
On Thu, 24 May 2018, Stefan Beller wrote: > On Thu, May 24, 2018 at 6:37 AM, Robert P. J. Day> wrote: > > > > maybe this is deliberate, but it's confusing that, with git 2.17.0, > > the output of both "git log -h" and "git show -h" is exactly the same: > > > > $ git log -h > > usage: git log [] [] [[--] ...] > >or: git show [] ... > > > > -q, --quiet suppress diff output > > --source show source > > --use-mailmap Use mail map file > > --decorate-refs > > only decorate refs that match > > --decorate-refs-exclude > > do not decorate refs that match > > --decorate[=...] decorate options > > -L Process line range n,m in file, counting from 1 > > $ > > > > is that what's *supposed* to happen? > > I would think so, show is just "log -p" with the range clamped > down to ^... > > It's been in the code like that for a couple years by now, > e.g. see > e66dc0cc4b1a6 log.c: fix translation markings, 2015-01-06 ah, very well, it just caught me by surprise. rday -- Robert P. J. Day Ottawa, Ontario, CANADA http://crashcourse.ca/dokuwiki Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
Re: [PATCH 2/4] config doc: unify the description of fsck.* and receive.fsck.*
On Thu, May 24, 2018 at 4:12 PM, Ævar Arnfjörð Bjarmasonwrote: > On Thu, May 24 2018, Eric Sunshine wrote: >> On Thu, May 24, 2018 at 3:35 PM, Ævar Arnfjörð Bjarmason >> wrote: >>> fsck.skipList:: >>> + Like `fsck.` this variable has a corresponding >>> + `receive.fsck.skipList` variant. >>> ++ >>> +The path to a sorted list of object names (i.e. one SHA-1 per line) >>> +that are known to be broken in a non-fatal way and should be >>> +ignored. This feature is useful when an established project should be >>> +accepted despite early commits containing errors that can be safely >>> +ignored such as invalid committer email addresses. Note: corrupt >>> +objects cannot be skipped with this setting. >> >> Nit: This organization seems backward. Typically, one would describe >> what the option is for and then add the incidental note ("Like >> fsck.<...>, this variable...") at the end. It's not clear why this >> patch demotes the description to a secondary paragraph and considers >> the incidental note as primary. > > I could change it like that. I was thinking that later in the series > fetch.fsck.* is going to be first in the file, and then the user is told > to look at this variable, so it made sense to note from the outset that > we're describing several variables here. > What do you think? I see where you're coming from, however, I would think that readers arriving at this topic (generally) do so as a result of actively looking for it (as opposed to happening upon it), in which case they probably are directly seeking information about it; the incidental information is just a bonus after reading what they came to learn. Anyhow, I don't care too strongly about it (it was just a "nit", after all).
Re: [PATCH v3 01/20] commit-graph: UNLEAK before die()
On Thu, May 24, 2018 at 9:25 AM, Derrick Stoleewrote: > Signed-off-by: Derrick Stolee > --- > builtin/commit-graph.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c > index 37420ae0fd..f0875b8bf3 100644 > --- a/builtin/commit-graph.c > +++ b/builtin/commit-graph.c > @@ -51,8 +51,11 @@ static int graph_read(int argc, const char **argv) > graph_name = get_commit_graph_filename(opts.obj_dir); > graph = load_commit_graph_one(graph_name); > > - if (!graph) > + if (!graph) { > + UNLEAK(graph_name); > die("graph file %s does not exist", graph_name); Unrelated to this patch: Is the command that ends up die()ing here a plumbing or porcelain, or: Do we want to translate the message here? In a lot of commands that show paths we single quote them '%s', (speaking from experience with a lot of submodule path code)
Re: why do "git log -h" and "git show -h" print the same thing?
On Thu, May 24, 2018 at 6:37 AM, Robert P. J. Daywrote: > > maybe this is deliberate, but it's confusing that, with git 2.17.0, > the output of both "git log -h" and "git show -h" is exactly the same: > > $ git log -h > usage: git log [] [] [[--] ...] >or: git show [] ... > > -q, --quiet suppress diff output > --source show source > --use-mailmap Use mail map file > --decorate-refs > only decorate refs that match > --decorate-refs-exclude > do not decorate refs that match > --decorate[=...] decorate options > -L Process line range n,m in file, counting from 1 > $ > > is that what's *supposed* to happen? I would think so, show is just "log -p" with the range clamped down to ^... It's been in the code like that for a couple years by now, e.g. see e66dc0cc4b1a6 log.c: fix translation markings, 2015-01-06
Re: commit-graph: change in "best" merge-base when ambiguous
Derrick Stoleewrites: > On 5/22/2018 1:39 AM, Michael Haggerty wrote: >> On 05/21/2018 08:10 PM, Derrick Stolee wrote: >>> [...] >>> In the Discussion section of the `git merge-base` docs [1], we have the >>> following: >>> >>> When the history involves criss-cross merges, there can be more than >>> one best common ancestor for two commits. For example, with this topology: >>> >>> ---1---o---A >>> \ / >>> X >>> / \ >>> ---2---o---o---B >>> >>> both 1 and 2 are merge-bases of A and B. Neither one is better than >>> the other (both are best merge bases). When the --all option is not >>> given, it is unspecified which best one is output. >>> >>> This means our official documentation mentions that we do not have a >>> concrete way to differentiate between these choices. This makes me think >>> that this change in behavior is not a bug, but it _is_ a change in >>> behavior. It's worth mentioning, but I don't think there is any value in >>> making sure `git merge-base` returns the same output. >>> >>> Does anyone disagree? Is this something we should solidify so we always >>> have a "definitive" merge-base? >>> [...] >> This may be beyond the scope of what you are working on, but there are >> significant advantages to selecting a "best" merge base from among the >> candidates. Long ago [1] I proposed that the "best" merge base is the >> merge base candidate that minimizes the number of non-merge commits that >> are in >> >> git rev-list $candidate..$branch >> >> that are already in master: >> >> git rev-list $master >> >> (assuming merging branch into master), which is equivalent to choosing >> the merge base that minimizes >> >> git rev-list --count $candidate..$branch Is the above correct... >> In fact, this criterion is symmetric if you exchange branch ↔ master, >> which is a nice property, and indeed generalizes pretty simply to >> computing the merge base of more than two commits. ...as it doesn't seem to have the described symmetry. >> >> In that email I also included some data showing that the "best" merge >> base almost always results in either the same or a shorter diff than the >> more or less arbitrary algorithm that we currently use. Sometimes the >> difference in diff length is dramatic. >> >> To me it feels like the best *deterministic* merge base would be based >> on the above criterion, maybe with first-parent reachability, commit >> times, and SHA-1s used (in that order) to break ties. > > Thanks, everyone, for your perspective on this. I'm walking away with > these conclusions: > > 1. While this is a change in behavior, it is not a regression. We do > not need to act immediately to preserve old behavior in these > ambiguous cases. > > 2. We should (eventually) define tie-breaking conditions. I like > Michael's suggestion above. One thing I'd like to point out is that when searching for some algorithm to speed up merge-base calculation (which is called lowest common ancestor in graph theory, and for which I have currently found only an algorithm with O(|V|*{E|) preparation time, and U(1) query) I have found instead attempts to rigorously define single representative lowest common ancestor. It might be worth a look how it is done. Another possible source to compare against is the algorithm used by Mercurial (which as far as I know doesn't use recursive merge strategy, so it needs to chose one merge base). HTH, -- Jakub Narębski
Re: [PATCH v3 00/20] Integrate commit-graph into 'fsck' and 'gc'
On Thu, May 24 2018, Derrick Stolee wrote: > Thanks for all the feedback on v2. I've tried to make this round's > review a bit easier by splitting up the commits into smaller pieces. > Also, the test script now has less boilerplate and uses variables and > clear arithmetic to explain which bytes are being modified. Thanks. it's a lot easier. > One other change worth mentioning: in "commit-graph: add '--reachable' > option" I put the ref-iteration into a new external > 'write_commit_graph_reachable()' method inside commit-graph.c. This > makes the 'gc: automatically write commit-graph files' a simpler change. Maybe you want this, maybe not, but I came up with this to squash: diff --git a/Documentation/config.txt b/Documentation/config.txt index 9a3abd87e7..2665522385 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -900,7 +900,8 @@ the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1]. core.commitGraph:: Enable git commit graph feature. Allows reading from the - commit-graph file. + commit-graph file. See `gc.commitGraph` for automatically + maintaining the file. core.sparseCheckout:: Enable "sparse checkout" feature. See section "Sparse checkout" in @@ -1554,10 +1555,10 @@ gc.autoDetach:: if the system supports it. Default is true. gc.commitGraph:: - If true, then gc will rewrite the commit-graph file after any - change to the object database. If '--auto' is used, then the - commit-graph will not be updated unless the threshold is met. - See linkgit:git-commit-graph[1] for details. + If true, then gc will rewrite the commit-graph file when + linkgit:git-gc[1] is run. When using linkgit:git-gc[1] + '--auto' the commit-graph will be updated if housekeeping is + required. See linkgit:git-commit-graph[1] for details. gc.logExpiry:: If the file gc.log exists, then `git gc --auto` won't run I.e. let's mention the new gc.commitGraph in core.commitGraph, and I think the "any change to the object database" line in gc.commitGraph is needlessly confusing, let's just say "when git-gc is run".
git difftool with symlink to readonly jar failed
Hi, In my repo, I have some symlink to readonly jar, I can't make difftool between 2 commit Steps to reproduce : > git --version git version 2.17.0 > id uid=1000(sagoum) gid=1000(sagoum) groupes=1000(sagoum),5(tty),10(wheel),14(uucp),18(audio),20(dialout),27(video),35(games),78(kvm),85(usb),100(users),104(plugdev),992(android),993(docker),997(vboxusers) # readonly jar > ls -l /home/tempogit/jenkins_war total 510012 -rwxr- 1 root users 74585998 24 mai 21:08 jenkins-2.110.war -rwxr- 1 root users 74612387 24 mai 21:08 jenkins-2.114.war -rwxr- 1 root users 74606954 24 mai 21:08 jenkins-2.116.war -rwxr- 1 root users 74737297 24 mai 21:08 jenkins-2.121.war -rwxr- 1 root users 74525235 24 mai 21:08 jenkins-LTS-2.107.1.war -rwxr- 1 root users 74568464 24 mai 21:08 jenkins-LTS-2.107.2.war -rwxr- 1 root users 74576216 24 mai 21:08 jenkins-LTS-2.107.3.war #create a repo with symlink to this jar mkdir ~/tempo_2.17.0 cd ~/tempo_2.17.0 git init ln -s /home/tempogit/jenkins_war/jenkins-2.110.war jenkins.war git add jenkins.war git commit -a -m " add symlink on jenkins-2.110.war" rm jenkins.war ln -s /home/tempogit/jenkins_war/jenkins-2.114.war jenkins.war git add jenkins.war git commit -a -m " add symlink on jenkins-2.114.war" rm jenkins.war ln -s /home/tempogit/jenkins_war/jenkins-2.116.war jenkins.war git add jenkins.war git commit -a -m " add symlink on jenkins-2.116.war" #detail of the repo git log commit ac1bc44d899ffd5406a7b0c413cf6c2e3497d496 (HEAD -> master) Author: kakoumDate: Thu May 24 22:47:09 2018 +0200 add symlink on jenkins-2.116.war commit 0244799661b993b1f78fa5afb621de3fe4c4a39c Author: kakoum Date: Thu May 24 22:47:09 2018 +0200 add symlink on jenkins-2.114.war commit 4cb98b4a307ce97d9e6b8e4b03211fa5ca8af7e7 Author: kakoum Date: Thu May 24 22:47:09 2018 +0200 add symlink on jenkins-2.110.war #try a diff git difftool --dir-diff 4cb98b4a307ce97d9e6b8e4b03211fa5ca8af7e7 0244799661b993b1f78fa5afb621de3fe4c4a39c fatal: impossible d'ouvrir '/tmp/git-difftool.UQ4mqo/left/jenkins.war' en écriture: Permission non accordée #temporary file find /tmp/git-difftool.UQ4mqo/ -ls 412444 0 drwx-- 4 sagoum sagoum 80 mai 24 22:48 /tmp/git-difftool.UQ4mqo/ 412446 0 drwx-- 2 sagoum sagoum 40 mai 24 22:48 /tmp/git-difftool.UQ4mqo/right 412445 0 drwx-- 2 sagoum sagoum 60 mai 24 22:48 /tmp/git-difftool.UQ4mqo/left 412449 0 lrwxrwxrwx 1 sagoum sagoum 44 mai 24 22:48 /tmp/git-difftool.UQ4mqo/left/jenkins.war -> /home/tempogit/jenkins_war/jenkins-2.110.war # I have no problem to launch a bcompare between /tmp/git-difftool.UQ4mqo/left and /tmp/git-difftool.UQ4mqo/right (which is empty) Thank
Re: Weird revision walk behaviour
On 23/05/2018 20:35, Jeff King wrote: On Wed, May 23, 2018 at 01:32:46PM -0400, Jeff King wrote: On Wed, May 23, 2018 at 07:10:58PM +0200, SZEDER Gábor wrote: $ git log --oneline master..ba95710a3b -- ci/ ea44c0a594 Merge branch 'bw/protocol-v2' into jt/partial-clone-proto-v2 I keep some older builds around, and it does not reproduce with v1.6.6.3 (that's my usual goto for "old"). Bisecting turns up d0af663e42 (revision.c: Make --full-history consider more merges, 2013-05-16). It looks like an unintended change (the commit message claims that the non-full-history case shouldn't be affected). There's more discussion in the thread at: https://public-inbox.org/git/1366658602-12254-1-git-send-email-ke...@bracey.fi/ I haven't absorbed it all yet, but I'm adding Junio to the cc. In this case, we're hitting a merge commit which is not on master, but it has two parents which both are. Which, IIRC, means the merge commit is INTERESTING with two UNINTERESTING parents; and we are TREESAME to only one of them. The commit changing the logic of TREESAME you identified believes that those TREESAME changes for merges which were intended to improve fuller history modes shouldn't affect the simple history "because partially TREESAME merges are turned into normal commits". Clearly that didn't happen here. I think we need to look at why that isn't happening, and if it can be made to happen. The problem is that this commit is effectively the base of the graph - it's got a double-connection to the UNINTERESTING set, and maybe that prevented the simple history "follow 1 TREESAME" logic from kicking in. Maybe it won't follow 1 TREESAME to UNINTERESTING. I know there were quite a few changes later in the series to try to reconcile the simple and full history, for the cases where the simple history takes a weird path because of its love of TREESAME parents, hiding evil merges. But I believe the simple history behaviour was supposed to remain as-is - take first TREESAME always. Kevin
Re: "git grep" and "working tree" vs "working directory"
On Wed, May 23, 2018 at 11:50 AM, Robert P. J. Daywrote: > > more puzzling terminology, this time in the man page for "git grep". > the SYNOPSIS shows, at the very end, the clearly optional > "[...]", > > git grep ... >... snip ... >[--] [...] > > but nowhere in the man page is there an explanation as to the default > value used if there is no pathspec, and here's why that's confusing. > > first, what is the proper phrase for the *entire* checked out repo? What is the *entirety* of a checked out repo? (Is it just the main working tree, or do you mean all directories that are found "git worktree --list" ?) http://public-inbox.org/git/xmqqo9wy4hxa@gitster.mtv.corp.google.com gives insights into "worktree vs working tree", the former being the command and the latter being the directory you work in -- a working directory if you will -- but the terminology is working tree. There was another recent discussion on that, why it stuck with "tree". > working tree? working directory? either? and is that the proper phrase > to use *regardless* of where you happen to be located, perhaps in a > subdirectory? > > i did a quick test and, if i don't supply a pathspec, then "git > grep" (quite reasonably) recursively searches only the *current* > working directory (example from linux kernel source repo): > > $ cd scripts > $ git grep -il torvalds -- > checkstack.pl > get_maintainer.pl > package/mkdebian > $ > > however, if you peruse the very first part of the OPTIONS section of > that man page, you read: > > --cached > Instead of searching tracked files in the working tree, > search blobs registered in the index file. > > --no-index > Search files in the current directory that is not managed by Git. > > --untracked > In addition to searching in the tracked files in the > working tree, search also in untracked files. > > ... snip ... > > note how a couple of those options are described as searching "the > working tree", when they clearly(?) do no such thing if you happen to > be located in a subdirectory. There are 2 dimensions to it: * (where you are) if you run git-grep from a sub directory of the repository, then the "sub-working-tree" will be searched. Extend the example from above by calling cd scripts git rm --cached checkstack.pl git grep -il torvalds -- ls checkstack.pl * (what is searched) The options mentioned above specify what exactly is used as the base for searching (the file system, the index, or a commit) > also, at the bottom of that section, one reads: > > ... > If given, limit the search to paths matching at least one > pattern. Both leading paths match and glob(7) patterns are supported. > > For more details about the syntax, see the pathspec > entry in gitglossary(7). > > but, again, what if is *not* given? then what? Assume "$pwd/." > > finally, the first example has the same problem: > > git grep 'time_t' -- '*.[ch]' > Looks for time_t in all tracked .c and .h files in the > working directory and its subdirectories. > > in "the working directory"? > > what is the proper terminology to be used here? the working directory sounds right, not sure which aspect you want to be exposed more clearly.
BUG: No way to set fsck. when cloning
Am 24.05.2018 um 17:25 schrieb Ævar Arnfjörð Bjarmason: > When I do: > > git -c fetch.fsckObjects=true clone > g...@github.com:robbyrussell/oh-my-zsh.git > > I get: > > error: object 2b7227859263b6aabcc28355b0b994995b7148b6: > zeroPaddedFilemode: contains zero-padded file modes > fatal: Error in object > fatal: index-pack failed > > [...] Doing git clone --config transfer.fsckobjects=false --config receive.fsckobjects=false --config fetch.fsckobjects=false g...@github.com:robbyrussell/oh-my-zsh.git does the trick here (stolen from [1]). $ git --version git version 2.17.0.windows.1 I don't know why though. [1]: https://github.com/michaeljones/breathe/issues/340#issuecomment-390775142
[PATCH] submodule: do not pass null OID to setup_revisions
If "git pull --recurse-submodules --rebase" is invoked when the current branch and its corresponding remote-tracking branch have no merge base, a "bad object" fatal error occurs. This issue was introduced with commit a6d7eb2c7a ("pull: optionally rebase submodules (remote submodule changes only)", 2017-06-23), which also introduced this feature. This is because cmd_pull() in builtin/pull.c thus invokes submodule_touches_in_range() with a null OID as the first parameter. Ensure that this case works, and document what happens in this case. Signed-off-by: Jonathan Tan--- submodule.c | 6 -- submodule.h | 5 - t/t5572-pull-submodule.sh | 21 + 3 files changed, 29 insertions(+), 3 deletions(-) diff --git a/submodule.c b/submodule.c index 74d35b2577..49def93dd9 100644 --- a/submodule.c +++ b/submodule.c @@ -1169,8 +1169,10 @@ int submodule_touches_in_range(struct object_id *excl_oid, argv_array_push(, "--"); /* args[0] program name */ argv_array_push(, oid_to_hex(incl_oid)); - argv_array_push(, "--not"); - argv_array_push(, oid_to_hex(excl_oid)); + if (!is_null_oid(excl_oid)) { + argv_array_push(, "--not"); + argv_array_push(, oid_to_hex(excl_oid)); + } collect_changed_submodules(, ); ret = subs.nr; diff --git a/submodule.h b/submodule.h index e5526f6aaa..1fd7111f60 100644 --- a/submodule.h +++ b/submodule.h @@ -94,7 +94,10 @@ extern int merge_submodule(struct object_id *result, const char *path, const struct object_id *a, const struct object_id *b, int search); -/* Checks if there are submodule changes in a..b. */ +/* + * Checks if there are submodule changes in a..b. If a is the null OID, + * checks b and all its ancestors instead. + */ extern int submodule_touches_in_range(struct object_id *a, struct object_id *b); extern int find_unpushed_submodules(struct oid_array *commits, diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh index 321bd37deb..f916729a12 100755 --- a/t/t5572-pull-submodule.sh +++ b/t/t5572-pull-submodule.sh @@ -132,4 +132,25 @@ test_expect_success 'pull rebase recursing fails with conflicts' ' test_i18ngrep "locally recorded submodule modifications" err ' +test_expect_success 'branch has no merge base with remote-tracking counterpart' ' + rm -rf parent child && + + test_create_repo a-submodule && + test_commit -C a-submodule foo && + + test_create_repo parent && + git -C parent submodule add "$(pwd)/a-submodule" && + git -C parent commit -m foo && + + git clone parent child && + + # Reset master so that it has no merge base with + # refs/remotes/origin/master. + OTHER=$(git -C child commit-tree -m bar \ + $(git -C child rev-parse HEAD^{tree})) && + git -C child reset --hard "$OTHER" && + + git -C child pull --recurse-submodules --rebase +' + test_done -- 2.17.0.768.g1526ddbba1.dirty
Re: [PATCH 3/4] config doc: elaborate on what transfer.fsckObjects does
On Thu, May 24, 2018 at 3:35 PM, Ævar Arnfjörð Bjarmasonwrote: > The existing documentation led the user to believe that all we were > doing were basic readability sanity checks, but that hasn't been true > for a very long time. Update the description to match reality, and > note the caveat that there's a quarantine for accepting pushes, but > not for fetching. > > Signed-off-by: Ævar Arnfjörð Bjarmason > --- > diff --git a/Documentation/config.txt b/Documentation/config.txt > @@ -3339,9 +3339,19 @@ transfer.fsckObjects:: > -When set, the fetch or receive will abort in the case of a malformed > -object or a broken link. The result of an abort are only dangling > -objects. > +When set, the fetch receive will abort in the case of a malformed "fetch receive"? Did you mean "fetch or receive" (like the original)? > +object or a link to a nonexisting object. In addition, various other s/nonexisting/nonexistent/ > +issues are checked for, including legacy issues (see `fsck.`), > +and potential security issues like there being a `.GIT` directory (see s/there being/existence of/ > +the release notes for v2.2.1 for details). Other sanity and security > +checks may be added in future releases. > ++ > +On the receiving side failing fsckObjects will make those objects s/side/&,/ > +unreachable, see "QUARANTINE ENVIRONMENT" in > +linkgit:git-receive-pack[1]. On the fetch side the malformed objects s/side/&,/ > +will instead be left unreferenced in the repository. That's considered > +a bug, and hopefully future git release will implement a quarantine > +for the "fetch" side as well. If this was a "BUGS" section in a man-page, the above might be less scary. In this context, however, I wonder if it makes sense to tone it down a bit: On the fetch side, malformed objects will instead be left unreferenced in the repository. (However, in the future, such objects may be quarantined for "fetch", as well.)
Re: [PATCH 2/4] config doc: unify the description of fsck.* and receive.fsck.*
On Thu, May 24 2018, Eric Sunshine wrote: > On Thu, May 24, 2018 at 3:35 PM, Ævar Arnfjörð Bjarmason >wrote: >> The documentation for the fsck. and receive.fsck. >> variables was mostly duplicated in two places, with fsck. >> making no mention of the corresponding receive.fsck., and the >> same for fsck.skipList. >> [...] >> Rectify this situation by describing the feature in general terms >> under the fsck.* documentation, and make the receive.fsck.* >> documentation refer to those variables instead. >> [...] >> Signed-off-by: Ævar Arnfjörð Bjarmason >> --- >> diff --git a/Documentation/config.txt b/Documentation/config.txt >> @@ -1554,23 +1554,41 @@ filter..smudge:: >> fsck.skipList:: >> - The path to a sorted list of object names (i.e. one SHA-1 per >> - line) that are known to be broken in a non-fatal way and should >> - be ignored. This feature is useful when an established project >> - should be accepted despite early commits containing errors that >> - can be safely ignored such as invalid committer email addresses. >> - Note: corrupt objects cannot be skipped with this setting. >> + Like `fsck.` this variable has a corresponding >> + `receive.fsck.skipList` variant. >> ++ >> +The path to a sorted list of object names (i.e. one SHA-1 per line) >> +that are known to be broken in a non-fatal way and should be >> +ignored. This feature is useful when an established project should be >> +accepted despite early commits containing errors that can be safely >> +ignored such as invalid committer email addresses. Note: corrupt >> +objects cannot be skipped with this setting. > > Nit: This organization seems backward. Typically, one would describe > what the option is for and then add the incidental note ("Like > fsck.<...>, this variable...") at the end. It's not clear why this > patch demotes the description to a secondary paragraph and considers > the incidental note as primary. I could change it like that. I was thinking that later in the series fetch.fsck.* is going to be first in the file, and then the user is told to look at this variable, so it made sense to note from the outset that we're describing several variables here. What do you think?
[PATCH v2] Use proper syntax for replaceables in command docs
The standard for command documentation synopses appears to be: [...] means optional <...> means replaceable [<...>] means both optional and replaceable So fix a number of doc pages that use incorrect variations of the above. Signed-off-by: Robert P. J. Day--- diff --git a/Documentation/git-annotate.txt b/Documentation/git-annotate.txt index 05fd482b7..e44a83133 100644 --- a/Documentation/git-annotate.txt +++ b/Documentation/git-annotate.txt @@ -8,7 +8,7 @@ git-annotate - Annotate file lines with commit information SYNOPSIS [verse] -'git annotate' [options] file [revision] +'git annotate' [] [] DESCRIPTION --- diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt index aa3b2bf2f..3c0578217 100644 --- a/Documentation/git-check-attr.txt +++ b/Documentation/git-check-attr.txt @@ -9,8 +9,8 @@ git-check-attr - Display gitattributes information SYNOPSIS [verse] -'git check-attr' [-a | --all | attr...] [--] pathname... -'git check-attr' --stdin [-z] [-a | --all | attr...] +'git check-attr' [-a | --all | ...] [--] ... +'git check-attr' --stdin [-z] [-a | --all | ...] DESCRIPTION --- diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt index 611754f10..8b42cb3fb 100644 --- a/Documentation/git-check-ignore.txt +++ b/Documentation/git-check-ignore.txt @@ -9,8 +9,8 @@ git-check-ignore - Debug gitignore / exclude files SYNOPSIS [verse] -'git check-ignore' [options] pathname... -'git check-ignore' [options] --stdin +'git check-ignore' [] ... +'git check-ignore' [] --stdin DESCRIPTION --- diff --git a/Documentation/git-check-mailmap.txt b/Documentation/git-check-mailmap.txt index 39028ee1a..aa2055dbe 100644 --- a/Documentation/git-check-mailmap.txt +++ b/Documentation/git-check-mailmap.txt @@ -9,7 +9,7 @@ git-check-mailmap - Show canonical names and email addresses of contacts SYNOPSIS [verse] -'git check-mailmap' [options] ... +'git check-mailmap' [] ... DESCRIPTION diff --git a/Documentation/git-credential-cache.txt b/Documentation/git-credential-cache.txt index 2b8582639..0216c18ef 100644 --- a/Documentation/git-credential-cache.txt +++ b/Documentation/git-credential-cache.txt @@ -8,7 +8,7 @@ git-credential-cache - Helper to temporarily store passwords in memory SYNOPSIS - -git config credential.helper 'cache [options]' +git config credential.helper 'cache []' - DESCRIPTION diff --git a/Documentation/git-credential-store.txt b/Documentation/git-credential-store.txt index 25fb963f4..693dd9d9d 100644 --- a/Documentation/git-credential-store.txt +++ b/Documentation/git-credential-store.txt @@ -8,7 +8,7 @@ git-credential-store - Helper to store credentials on disk SYNOPSIS --- -git config credential.helper 'store [options]' +git config credential.helper 'store []' --- DESCRIPTION diff --git a/Documentation/git-cvsserver.txt b/Documentation/git-cvsserver.txt index 37b96c545..f98b7c6ed 100644 --- a/Documentation/git-cvsserver.txt +++ b/Documentation/git-cvsserver.txt @@ -22,7 +22,7 @@ cvspserver stream tcp nowait nobody /usr/bin/git-cvsserver git-cvsserver pserver Usage: [verse] -'git-cvsserver' [options] [pserver|server] [ ...] +'git-cvsserver' [] [pserver|server] [ ...] OPTIONS --- diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt index 7c2c44270..b180f1fa5 100644 --- a/Documentation/git-diff.txt +++ b/Documentation/git-diff.txt @@ -9,11 +9,11 @@ git-diff - Show changes between commits, commit and working tree, etc SYNOPSIS [verse] -'git diff' [options] [] [--] [...] -'git diff' [options] --cached [] [--] [...] -'git diff' [options] [--] [...] -'git diff' [options] -'git diff' [options] --no-index [--] +'git diff' [] [] [--] [...] +'git diff' [] --cached [] [--] [...] +'git diff' [] [--] [...] +'git diff' [] +'git diff' [] --no-index [--] DESCRIPTION --- @@ -21,7 +21,7 @@ Show changes between the working tree and the index or a tree, changes between the index and a tree, changes between two trees, changes between two blob objects, or changes between two files on disk. -'git diff' [options] [--] [...]:: +'git diff' [] [--] [...]:: This form is to view the changes you made relative to the index (staging area for the next commit). In other @@ -29,7 +29,7 @@ two blob objects, or changes between two files on disk. further add to the index but you still haven't. You can stage these changes by using linkgit:git-add[1]. -'git diff' [options] --no-index [--] :: +'git diff' [] --no-index [--] :: This form is to compare the given two paths on the filesystem. You can omit the `--no-index` option when @@ -38,7 +38,7 @@ two blob objects, or changes between two files on disk. or when running the command
Re: [PATCH] Use proper syntax for replaceables in command docs
On Thu, 24 May 2018, Eric Sunshine wrote: > On Thu, May 24, 2018 at 3:54 PM, Robert P. J. Day> wrote: > > The standard for command documentation synopses appears to be: > > > > [...] means optional > > <...> means replaceable > > [<...>] means both optional and replaceable > > > > So fix a number of doc pages that use incorrect variations of the > > above. > > > > Signed-off-by: Robert P. J. Day > > --- > > diff --git a/Documentation/git-check-attr.txt > > b/Documentation/git-check-attr.txt > > @@ -9,7 +9,7 @@ git-check-attr - Display gitattributes information > > -'git check-attr' [-a | --all | attr...] [--] pathname... > > +'git check-attr' [-a | --all | attr...] [--] ... > > 'git check-attr' --stdin [-z] [-a | --all | attr...] > > Don't you also want ""? > > > diff --git a/Documentation/git-check-ignore.txt > > b/Documentation/git-check-ignore.txt > > @@ -9,8 +9,8 @@ git-check-ignore - Debug gitignore / exclude files > > -'git check-ignore' [options] pathname... > > +'git check-ignore' [] ... > > Earlier in the patch, you changed "pathname" to "", but here > you change "pathname" to "", which is inconsistent. > > It's also inconsistent and odd to say "..." (with the "..."). > Seems better just to say "..." to match existing practice. points taken. note that this is not even close to a comprehensive fix; i just did the stuff i ran across. there is almost certainly much more, but i'll fix the above. rday
Re: [PATCH] Use proper syntax for replaceables in command docs
On Thu, May 24, 2018 at 3:54 PM, Robert P. J. Daywrote: > The standard for command documentation synopses appears to be: > > [...] means optional > <...> means replaceable > [<...>] means both optional and replaceable > > So fix a number of doc pages that use incorrect variations of the > above. > > Signed-off-by: Robert P. J. Day > --- > diff --git a/Documentation/git-check-attr.txt > b/Documentation/git-check-attr.txt > @@ -9,7 +9,7 @@ git-check-attr - Display gitattributes information > -'git check-attr' [-a | --all | attr...] [--] pathname... > +'git check-attr' [-a | --all | attr...] [--] ... > 'git check-attr' --stdin [-z] [-a | --all | attr...] Don't you also want ""? > diff --git a/Documentation/git-check-ignore.txt > b/Documentation/git-check-ignore.txt > @@ -9,8 +9,8 @@ git-check-ignore - Debug gitignore / exclude files > -'git check-ignore' [options] pathname... > +'git check-ignore' [] ... Earlier in the patch, you changed "pathname" to "", but here you change "pathname" to "", which is inconsistent. It's also inconsistent and odd to say "..." (with the "..."). Seems better just to say "..." to match existing practice.
[PATCH] Use proper syntax for replaceables in command docs
The standard for command documentation synopses appears to be: [...] means optional <...> means replaceable [<...>] means both optional and replaceable So fix a number of doc pages that use incorrect variations of the above. Signed-off-by: Robert P. J. Day--- diff --git a/Documentation/git-annotate.txt b/Documentation/git-annotate.txt index 05fd482b7..e44a83133 100644 --- a/Documentation/git-annotate.txt +++ b/Documentation/git-annotate.txt @@ -8,7 +8,7 @@ git-annotate - Annotate file lines with commit information SYNOPSIS [verse] -'git annotate' [options] file [revision] +'git annotate' [] [] DESCRIPTION --- diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt index aa3b2bf2f..aff476e28 100644 --- a/Documentation/git-check-attr.txt +++ b/Documentation/git-check-attr.txt @@ -9,7 +9,7 @@ git-check-attr - Display gitattributes information SYNOPSIS [verse] -'git check-attr' [-a | --all | attr...] [--] pathname... +'git check-attr' [-a | --all | attr...] [--] ... 'git check-attr' --stdin [-z] [-a | --all | attr...] DESCRIPTION diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt index 611754f10..006f3b9fa 100644 --- a/Documentation/git-check-ignore.txt +++ b/Documentation/git-check-ignore.txt @@ -9,8 +9,8 @@ git-check-ignore - Debug gitignore / exclude files SYNOPSIS [verse] -'git check-ignore' [options] pathname... -'git check-ignore' [options] --stdin +'git check-ignore' [] ... +'git check-ignore' [] --stdin DESCRIPTION --- diff --git a/Documentation/git-check-mailmap.txt b/Documentation/git-check-mailmap.txt index 39028ee1a..aa2055dbe 100644 --- a/Documentation/git-check-mailmap.txt +++ b/Documentation/git-check-mailmap.txt @@ -9,7 +9,7 @@ git-check-mailmap - Show canonical names and email addresses of contacts SYNOPSIS [verse] -'git check-mailmap' [options] ... +'git check-mailmap' [] ... DESCRIPTION diff --git a/Documentation/git-credential-cache.txt b/Documentation/git-credential-cache.txt index 2b8582639..0216c18ef 100644 --- a/Documentation/git-credential-cache.txt +++ b/Documentation/git-credential-cache.txt @@ -8,7 +8,7 @@ git-credential-cache - Helper to temporarily store passwords in memory SYNOPSIS - -git config credential.helper 'cache [options]' +git config credential.helper 'cache []' - DESCRIPTION diff --git a/Documentation/git-credential-store.txt b/Documentation/git-credential-store.txt index 25fb963f4..693dd9d9d 100644 --- a/Documentation/git-credential-store.txt +++ b/Documentation/git-credential-store.txt @@ -8,7 +8,7 @@ git-credential-store - Helper to store credentials on disk SYNOPSIS --- -git config credential.helper 'store [options]' +git config credential.helper 'store []' --- DESCRIPTION diff --git a/Documentation/git-cvsserver.txt b/Documentation/git-cvsserver.txt index 37b96c545..f98b7c6ed 100644 --- a/Documentation/git-cvsserver.txt +++ b/Documentation/git-cvsserver.txt @@ -22,7 +22,7 @@ cvspserver stream tcp nowait nobody /usr/bin/git-cvsserver git-cvsserver pserver Usage: [verse] -'git-cvsserver' [options] [pserver|server] [ ...] +'git-cvsserver' [] [pserver|server] [ ...] OPTIONS --- diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt index 7c2c44270..b180f1fa5 100644 --- a/Documentation/git-diff.txt +++ b/Documentation/git-diff.txt @@ -9,11 +9,11 @@ git-diff - Show changes between commits, commit and working tree, etc SYNOPSIS [verse] -'git diff' [options] [] [--] [...] -'git diff' [options] --cached [] [--] [...] -'git diff' [options] [--] [...] -'git diff' [options] -'git diff' [options] --no-index [--] +'git diff' [] [] [--] [...] +'git diff' [] --cached [] [--] [...] +'git diff' [] [--] [...] +'git diff' [] +'git diff' [] --no-index [--] DESCRIPTION --- @@ -21,7 +21,7 @@ Show changes between the working tree and the index or a tree, changes between the index and a tree, changes between two trees, changes between two blob objects, or changes between two files on disk. -'git diff' [options] [--] [...]:: +'git diff' [] [--] [...]:: This form is to view the changes you made relative to the index (staging area for the next commit). In other @@ -29,7 +29,7 @@ two blob objects, or changes between two files on disk. further add to the index but you still haven't. You can stage these changes by using linkgit:git-add[1]. -'git diff' [options] --no-index [--] :: +'git diff' [] --no-index [--] :: This form is to compare the given two paths on the filesystem. You can omit the `--no-index` option when @@ -38,7 +38,7 @@ two blob objects, or changes between two files on disk. or when running the command outside a working tree controlled by Git. -'git
Re: [PATCH 2/4] config doc: unify the description of fsck.* and receive.fsck.*
On Thu, May 24, 2018 at 3:35 PM, Ævar Arnfjörð Bjarmasonwrote: > The documentation for the fsck. and receive.fsck. > variables was mostly duplicated in two places, with fsck. > making no mention of the corresponding receive.fsck., and the > same for fsck.skipList. > [...] > Rectify this situation by describing the feature in general terms > under the fsck.* documentation, and make the receive.fsck.* > documentation refer to those variables instead. > [...] > Signed-off-by: Ævar Arnfjörð Bjarmason > --- > diff --git a/Documentation/config.txt b/Documentation/config.txt > @@ -1554,23 +1554,41 @@ filter..smudge:: > fsck.skipList:: > - The path to a sorted list of object names (i.e. one SHA-1 per > - line) that are known to be broken in a non-fatal way and should > - be ignored. This feature is useful when an established project > - should be accepted despite early commits containing errors that > - can be safely ignored such as invalid committer email addresses. > - Note: corrupt objects cannot be skipped with this setting. > + Like `fsck.` this variable has a corresponding > + `receive.fsck.skipList` variant. > ++ > +The path to a sorted list of object names (i.e. one SHA-1 per line) > +that are known to be broken in a non-fatal way and should be > +ignored. This feature is useful when an established project should be > +accepted despite early commits containing errors that can be safely > +ignored such as invalid committer email addresses. Note: corrupt > +objects cannot be skipped with this setting. Nit: This organization seems backward. Typically, one would describe what the option is for and then add the incidental note ("Like fsck.<...>, this variable...") at the end. It's not clear why this patch demotes the description to a secondary paragraph and considers the incidental note as primary.
Re: What's cooking in git.git (May 2018, #03; Wed, 23)
On 24 May 2018 at 01:02, Junio C Hamanowrote: > Here are the topics that have been cooking. Commits prefixed with > '-' are only in 'pu' (proposed updates) while commits prefixed with > '+' are in 'next'. The ones marked with '.' do not appear in any of > the integration branches, but I am still holding onto them. > > You can find the changes described here in the integration branches > of the repositories listed at > > http://git-blame.blogspot.com/p/git-public-repositories.html > > > > * ld/p4-unshelve (2018-05-23) 3 commits > - git-p4: unshelve: use action==add instead of rev==none > - SQUASH??? > - git-p4: add unshelve command > > "git p4" learned to "unshelve" shelved commit from P4. > > Expecting a reroll. > cf.
[PATCH v3] checkout & worktree: introduce checkout.implicitRemote
Introduce a checkout.implicitRemote setting which can be used to designate a remote to prefer (via checkout.implicitRemote=origin) when running e.g. "git checkout master" to mean origin/master, even though there's other remotes that have the "master" branch. I want this because it's very handy to use this workflow to checkout a repository and create a topic branch, then get back to a "master" as retrieved from upstream: ( rm -rf /tmp/tbdiff && git clone g...@github.com:trast/tbdiff.git && cd tbdiff && git branch -m topic && git checkout master ) That will output: Branch 'master' set up to track remote branch 'master' from 'origin'. Switched to a new branch 'master' But as soon as a new remote is added (e.g. just to inspect something from someone else) the DWIMery goes away: ( rm -rf /tmp/tbdiff && git clone g...@github.com:trast/tbdiff.git && cd tbdiff && git branch -m topic && git remote add avar g...@github.com:avar/tbdiff.git && git fetch avar && git checkout master ) Will output: error: pathspec 'master' did not match any file(s) known to git. The new checkout.implicitRemote config allows me to say that whenever that ambiguity comes up I'd like to prefer "origin", and it'll still work as though the only remote I had was "origin". I considered splitting this into checkout.implicitRemote and worktree.implicitRemote, but it's probably less confusing to break our own rules that anything shared between config should live in core.* than have two config settings, and I couldn't come up with a short name under core.* that made sense (core.implicitRemoteForCheckout?). See also 70c9ac2f19 ("DWIM "git checkout frotz" to "git checkout -b frotz origin/frotz"", 2009-10-18) which introduced this DWIM feature to begin with, and 4e85333197 ("worktree: make add dwim", 2017-11-26) which added it to git-worktree. Signed-off-by: Ævar Arnfjörð Bjarmason--- Took me a while to get around to this, but this v3 addresses feedback by Eric about the docs. Thanks! interdiff: 1: 905a17f35f ! 1: b23d1b71e9 checkout & worktree: introduce checkout.implicitRemote @@ -71,16 +71,15 @@ + tracking e.g. 'origin/'. This stops working as soon + as you have more than one remote with a '' + reference. This setting allows for setting the name of a -+ special remote that should always win when it comes to ++ preferred remote that should always win when it comes to + disambiguation. The typical use-case is to set this to + `origin`. ++ +Currently this is used by linkgit:git-checkout[1] when 'git checkout +' will checkout the '' branch on another remote, -+and by linkgit:git-worktree[1] when 'git worktree add' when referring -+to a remote branch. This setting might be used for other -+checkout-like commands or functionality in the future when -+appropriate. ++and by linkgit:git-worktree[1] when 'git worktree add' refers to a ++remote branch. This setting might be used for other checkout-like ++commands or functionality in the future. + clean.requireForce:: A boolean to make git-clean do nothing unless given -f, @@ -110,14 +109,14 @@ $ git worktree add --track -b / + -+It's also possible to use the `checkout.implicitRemote` setting to -+designate a special remote this rule should be applied to, even if the -+branch isn't unique across all remotes. See `checkout.implicitRemote` -+in linkgit:git-config[1]. ++The `checkout.implicitRemote` setting can be used to to designate a ++preferred `` this rule should be applied to, even if the ++`` isn't unique across all remotes. See ++`checkout.implicitRemote` in linkgit:git-config[1]. ++ If `` is omitted and neither `-b` nor `-B` nor `--detach` used, - then, as a convenience, a new branch based at HEAD is created automatically, - as if `-b $(basename )` was specified. + then, as a convenience, the new worktree is associated with a branch + (call it ``) named after `$(basename )`. If `` diff --git a/builtin/checkout.c b/builtin/checkout.c --- a/builtin/checkout.c @@ -235,6 +234,6 @@ + ) +' + - post_checkout_hook () { - gitdir=${1:-.git} - test_when_finished "rm -f $gitdir/hooks/post-checkout" && + test_expect_success 'git worktree add does not match remote' ' + test_when_finished rm -rf repo_a repo_b foo && + setup_remote_repo repo_a repo_b && Documentation/config.txt | 16
Re: What's cooking in git.git (May 2018, #03; Wed, 23)
On Thu, May 24 2018, Junio C Hamano wrote: > * ab/fetch-tags-noclobber (2018-05-16) 9 commits > - fixup! push tests: assert re-pushing annotated tags > - fetch: stop clobbering existing tags without --force > - fetch tests: add a test clobbering tag behavior > - fetch tests: correct a comment "remove it" -> "remove them" > - push doc: correct lies about how push refspecs work > - push tests: assert re-pushing annotated tags > - push tests: add more testing for forced tag pushing > - push tests: fix logic error in "push" test assertion > - push tests: remove redundant 'git push' invocation > > Expecting a reboot of the discussion to take it to some conclusion > and then a reroll. > cf.> cf. > cf. > cf. I hope to follow-up with this soon. Sorry for the delay. > * ab/get-short-oid (2018-05-11) 5 commits > (merged to 'next' on 2018-05-23 at 07e1908439) > + get_short_oid: sort ambiguous objects by type, then SHA-1 > + sha1-name.c: move around the collect_ambiguous() function > + git-p4: change "commitish" typo to "committish" > + sha1-array.h: align function arguments > + sha1-name.c: remove stray newline Sweet. I'll follow-up with the rest of the patches revised once that lands now that I'm fairly sure what you and Jeff mean by $sha1^{} :)
[PATCH 4/4] fetch: implement fetch.fsck.*
Implement support for fetch.fsck.* corresponding with the existing receive.fsck.*. This allows for pedantically cloning repositories with specific issues without turning off fetch.fsckObjects. One such repository is https://github.com/robbyrussell/oh-my-zsh.git which before this change will emit this error when cloned with fetch.fsckObjects: error: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: contains zero-padded file modes fatal: Error in object fatal: index-pack failed Now with fetch.fsck.zeroPaddedFilemode=warn we'll warn about that issue, but the clone will succeed: warning: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: contains zero-padded file modes warning: object a18c4d13c2a5fa2d4ecd5346c50e119b999b807d: zeroPaddedFilemode: contains zero-padded file modes warning: object 84df066176c8da3fd59b13731a86d90f4f1e5c9d: zeroPaddedFilemode: contains zero-padded file modes The motivation for this is to be able to turn on fetch.fsckObjects globally across a fleet of computers but still be able to manually clone various legacy repositories by either white-listing specific issues, or better yet whitelist specific objects. The use of --git-dir=* instead of -C in the tests could be considered somewhat archaic, but the tests I'm adding here are duplicating the corresponding receive.* tests with as few changes as possible. Signed-off-by: Ævar Arnfjörð Bjarmason--- Documentation/config.txt| 21 +++ fetch-pack.c| 32 +-- t/t5504-fetch-receive-strict.sh | 46 + 3 files changed, 92 insertions(+), 7 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 124f7a187c..af6187f6b5 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1426,6 +1426,16 @@ fetch.fsckObjects:: checked. Defaults to false. If not set, the value of `transfer.fsckObjects` is used instead. +fetch.fsck.:: + Acts like `fsck.`, but is used by + linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See + the `fsck.` documentation for details. + +fetch.fsck.skipList:: + Acts like `fsck.skipList`, but is used by + linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See + the `fsck.skipList` documentation for details. + fetch.unpackLimit:: If the number of objects fetched over the Git native transfer is below this @@ -1561,9 +1571,10 @@ fsck.:: repositories containing such data. + Setting `fsck.` will be picked up by linkgit:git-fsck[1], but -to accept pushes of such data set `receive.fsck.` instead. The -rest of the documentation discusses `fsck.*` for brevity, but the same -applies for the corresponding `receive.fsck.*` variables. +to accept pushes of such data set `receive.fsck.` instead, or +to clone or fetch it set `fetch.fsck.`. The rest of the +documentation discusses `fsck.*` for brevity, but the same applies for +the corresponding `receive.fsck.*` and `fetch..*`. variables. + When `fsck.` is set, errors can be switched to warnings and vice versa by configuring the `fsck.` setting where the @@ -1580,8 +1591,8 @@ hidden going forward, although that's unlikely to happen in practice unless someone is being deliberately malicious. fsck.skipList:: - Like `fsck.` this variable has a corresponding - `receive.fsck.skipList` variant. + Like `fsck.` this variable has corresponding + `receive.fsck.skipList` and `fetch.fsck.skipList` variants. + The path to a sorted list of object names (i.e. one SHA-1 per line) that are known to be broken in a non-fatal way and should be diff --git a/fetch-pack.c b/fetch-pack.c index 490c38f833..9e4282788e 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -19,6 +19,7 @@ #include "sha1-array.h" #include "oidset.h" #include "packfile.h" +#include "fsck.h" static int transfer_unpack_limit = -1; static int fetch_unpack_limit = -1; @@ -33,6 +34,7 @@ static int agent_supported; static int server_supports_filtering; static struct lock_file shallow_lock; static const char *alternate_shallow_file; +static struct strbuf fsck_msg_types = STRBUF_INIT; /* Remember to update object flag allocation in object.h */ #define COMPLETE (1U << 0) @@ -935,7 +937,8 @@ static int get_pack(struct fetch_pack_args *args, */ argv_array_push(, "--fsck-objects"); else - argv_array_push(, "--strict"); + argv_array_pushf(, "--strict%s", +fsck_msg_types.buf); } cmd.in = demux.out; @@ -1409,6 +1412,31 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, return ref; } +static int fetch_pack_config_cb(const char *var, const char *value, void *cb) +{ + if (strcmp(var,
[PATCH 2/4] config doc: unify the description of fsck.* and receive.fsck.*
The documentation for the fsck. and receive.fsck. variables was mostly duplicated in two places, with fsck. making no mention of the corresponding receive.fsck., and the same for fsck.skipList. I spent quite a lot of time today wondering why setting the fsck. variant wasn't working to clone a legacy repository (not that that would have worked anyway, but a subsequent patch implements fetch.fsck.). Rectify this situation by describing the feature in general terms under the fsck.* documentation, and make the receive.fsck.* documentation refer to those variables instead. This documentation was initially added in 2becf00ff7 ("fsck: support demoting errors to warnings", 2015-06-22) and 4b55b9b479 ("fsck: document the new receive.fsck. options", 2015-06-22). Signed-off-by: Ævar Arnfjörð Bjarmason--- Documentation/config.txt | 74 ++-- 1 file changed, 41 insertions(+), 33 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 623dffd198..351c541ab8 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1554,23 +1554,41 @@ filter..smudge:: linkgit:gitattributes[5] for details. fsck.:: - Allows overriding the message type (error, warn or ignore) of a - specific message ID such as `missingEmail`. -+ -For convenience, fsck prefixes the error/warning with the message ID, -e.g. "missingEmail: invalid author/committer line - missing email" means -that setting `fsck.missingEmail = ignore` will hide that issue. -+ -This feature is intended to support working with legacy repositories -which cannot be repaired without disruptive changes. + During fsck git may find issues with legacy data which + wouldn't be generated by current versions of git, and which + wouldn't be sent over the wire if `transfer.fsckObjects` was + set. This feature is intended to support working with legacy + repositories containing such data. ++ +Setting `fsck.` will be picked up by linkgit:git-fsck[1], but +to accept pushes of such data set `receive.fsck.` instead. The +rest of the documentation discusses `fsck.*` for brevity, but the same +applies for the corresponding `receive.fsck.*` variables. ++ +When `fsck.` is set, errors can be switched to warnings and +vice versa by configuring the `fsck.` setting where the +`` is the fsck message ID and the value is one of `error`, +`warn` or `ignore`. For convenience, fsck prefixes the error/warning +with the message ID, e.g. "missingEmail: invalid author/committer line +- missing email" means that setting `fsck.missingEmail = ignore` will +hide that issue. ++ +Depending on the circumstances it might be better to use +`fsck.skipList` instead to explicitly whitelist those objects that +have issues, otherwise new occurrences of the same issue will be +hidden going forward, although that's unlikely to happen in practice +unless someone is being deliberately malicious. fsck.skipList:: - The path to a sorted list of object names (i.e. one SHA-1 per - line) that are known to be broken in a non-fatal way and should - be ignored. This feature is useful when an established project - should be accepted despite early commits containing errors that - can be safely ignored such as invalid committer email addresses. - Note: corrupt objects cannot be skipped with this setting. + Like `fsck.` this variable has a corresponding + `receive.fsck.skipList` variant. ++ +The path to a sorted list of object names (i.e. one SHA-1 per line) +that are known to be broken in a non-fatal way and should be +ignored. This feature is useful when an established project should be +accepted despite early commits containing errors that can be safely +ignored such as invalid committer email addresses. Note: corrupt +objects cannot be skipped with this setting. gc.aggressiveDepth:: The depth parameter used in the delta compression @@ -2849,26 +2867,16 @@ receive.fsckObjects:: `transfer.fsckObjects` is used instead. receive.fsck.:: - When `receive.fsckObjects` is set to true, errors can be switched - to warnings and vice versa by configuring the `receive.fsck.` - setting where the `` is the fsck message ID and the value - is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes - the error/warning with the message ID, e.g. "missingEmail: invalid - author/committer line - missing email" means that setting - `receive.fsck.missingEmail = ignore` will hide that issue. -+ -This feature is intended to support working with legacy repositories -which would not pass pushing when `receive.fsckObjects = true`, allowing -the host to accept repositories with certain known issues but still catch -other issues. + Acts like `fsck.`, but is used by + linkgit:git-receive-pack[1] instead of + linkgit:git-fsck[1]. See the `fsck.` documentation for +
[PATCH 1/4] config doc: don't describe *.fetchObjects twice
Change the copy/pasted description of the fetch.fsckObjects and receive.fsckObjects variables to refer to transfer.fsckObjects instead. Let's not duplicate the description of what *.fsckObjects does twice. instead let's refer to transfer.fsckObjects from both fetch.* and receive.*. I don't think this description of it makes much sense, but for now I'm just moving the existing documentation around. Making it better will be done in a later patch. Signed-off-by: Ævar Arnfjörð Bjarmason--- Documentation/config.txt | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 84e2891aed..623dffd198 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1422,10 +1422,9 @@ fetch.recurseSubmodules:: fetch.fsckObjects:: If it is set to true, git-fetch-pack will check all fetched - objects. It will abort in the case of a malformed object or a - broken link. The result of an abort are only dangling objects. - Defaults to false. If not set, the value of `transfer.fsckObjects` - is used instead. + objects. See `transfer.fsckObjects` for what's + checked. Defaults to false. If not set, the value of + `transfer.fsckObjects` is used instead. fetch.unpackLimit:: If the number of objects fetched over the Git native @@ -2845,10 +2844,9 @@ receive.certNonceSlop:: receive.fsckObjects:: If it is set to true, git-receive-pack will check all received - objects. It will abort in the case of a malformed object or a - broken link. The result of an abort are only dangling objects. - Defaults to false. If not set, the value of `transfer.fsckObjects` - is used instead. + objects. See `transfer.fsckObjects` for what's checked. + Defaults to false. If not set, the value of + `transfer.fsckObjects` is used instead. receive.fsck.:: When `receive.fsckObjects` is set to true, errors can be switched @@ -3332,6 +3330,10 @@ transfer.fsckObjects:: When `fetch.fsckObjects` or `receive.fsckObjects` are not set, the value of this variable is used instead. Defaults to false. ++ +When set, the fetch or receive will abort in the case of a malformed +object or a broken link. The result of an abort are only dangling +objects. transfer.hideRefs:: String(s) `receive-pack` and `upload-pack` use to decide which -- 2.17.0.290.gded63e768a
[PATCH 3/4] config doc: elaborate on what transfer.fsckObjects does
The existing documentation led the user to believe that all we were doing were basic readability sanity checks, but that hasn't been true for a very long time. Update the description to match reality, and note the caveat that there's a quarantine for accepting pushes, but not for fetching. Signed-off-by: Ævar Arnfjörð Bjarmason--- Documentation/config.txt | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 351c541ab8..124f7a187c 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -3339,9 +3339,19 @@ transfer.fsckObjects:: not set, the value of this variable is used instead. Defaults to false. + -When set, the fetch or receive will abort in the case of a malformed -object or a broken link. The result of an abort are only dangling -objects. +When set, the fetch receive will abort in the case of a malformed +object or a link to a nonexisting object. In addition, various other +issues are checked for, including legacy issues (see `fsck.`), +and potential security issues like there being a `.GIT` directory (see +the release notes for v2.2.1 for details). Other sanity and security +checks may be added in future releases. ++ +On the receiving side failing fsckObjects will make those objects +unreachable, see "QUARANTINE ENVIRONMENT" in +linkgit:git-receive-pack[1]. On the fetch side the malformed objects +will instead be left unreferenced in the repository. That's considered +a bug, and hopefully future git release will implement a quarantine +for the "fetch" side as well. transfer.hideRefs:: String(s) `receive-pack` and `upload-pack` use to decide which -- 2.17.0.290.gded63e768a
[PATCH 0/4] fsck: doc fixes & fetch.fsck.* implementation
On Thu, May 24, 2018 at 9:02 PM, Jeff Kingwrote: > On Thu, May 24, 2018 at 07:04:04PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> That doesn't work, because that's for the server-side, but I need the >> fetch.fsck.* that doesn't exist. This works (I'll send a better patch >> with tests / docs etc. soon): > > Yeah, I think this is the right direction. > >> +static int fetch_pack_config_cb(const char *var, const char *value, void >> *cb) >> +{ >> + if (strcmp(var, "fetch.fsck.skiplist") == 0) { >> + const char *path; >> + >> + if (git_config_pathname(, var, value)) >> + return 1; >> + strbuf_addf(_msg_types, "%cskiplist=%s", >> + fsck_msg_types.len ? ',' : '=', path); >> + free((char *)path); >> + return 0; >> + } >> + >> + if (skip_prefix(var, "fetch.fsck.", )) { >> + if (is_valid_msg_type(var, value)) >> + strbuf_addf(_msg_types, "%c%s=%s", >> + fsck_msg_types.len ? ',' : '=', var, value); >> + else >> + warning("Skipping unknown msg id '%s'", var); >> + return 0; >> + } > > This matches what's in receive-pack, though the way we stuff all of the > options into a flat string is kind of nasty. I also wonder if we'd > eventually run up against command-line limits if somebody had a > complicated fsck config. Yeah, but I'm leaving this for the future. I doubt that it's going to happen in practice, although if you have a huge skip-list... > I wonder if we should simply be telling index-pack "please read fsck > config from the prefix 'fetch.fsck'" instead. I think this whole notion of reading the same config from two places sucks, and now with my patches it's three. But I can't think of a reasonable way to make it better without even more complexity, and maybe it's better to split it up anyway, i.e. the stuff you want to accept is different that fsck and fetch. > I dunno, maybe I am just creating work. Certainly I don't think it > should be a blocker for adding fetch.fsck support. But if you want to > think about it while you are here or write a patch, I wouldn't complain. :) Sorry, too late. I already wrote all of this :) Ævar Arnfjörð Bjarmason (4): config doc: don't describe *.fetchObjects twice config doc: unify the description of fsck.* and receive.fsck.* config doc: elaborate on what transfer.fsckObjects does fetch: implement fetch.fsck.* Documentation/config.txt| 113 fetch-pack.c| 32 - t/t5504-fetch-receive-strict.sh | 46 + 3 files changed, 148 insertions(+), 43 deletions(-) -- 2.17.0.290.gded63e768a
Re: Weird revision walk behaviour
On 23/05/2018 20:35, Jeff King wrote: There's more discussion in the thread at: https://public-inbox.org/git/1366658602-12254-1-git-send-email-ke...@bracey.fi/ I haven't absorbed it all yet, but I'm adding Junio to the cc. Just to ack that I've seen the discussion, but I can't identify the code's reasoning at the moment. My recollection is that I accepted while coming up with the algorithm that it might err slightly on the side of false positives in the display - there were some merge cases I was unable to fully distinguish whether or not the merge had lost a change it shouldn't have done, and if I was uncertain I'd rather show it than not. The first commit was not originally intended to alter behaviour for anything other than --full-history, but later in the chain there was specific consideration into tracking the path to the specified "bottom" commit. It may be that's part of what's happening here. Kevin
[PATCH] branch: issue "-l" deprecation warning after pager starts
On Thu, May 24, 2018 at 03:22:14PM -0400, Jeff King wrote: > On Thu, May 24, 2018 at 08:40:18PM +0530, Kaartic Sivaraam wrote: > > > > On the other hand, I'm not sure this is that big a deal. The point of > > > the deprecation warning is to catch people who are actually trying to > > > use "-l" as "--create-reflog", and that case does not page. The people > > > doing "git branch -l" are actually getting what they want eventually, > > > which is to turn it into "--list". In the interim step where it becomes > > > an unknown option, they'll get a hard error. > > > > I just thought we wouldn't want to surprise/confuse users who try to > > use "git branch -l" with the warning message about "create reflog" > > along-side the list of branches. That would just add to the confusion. > > So, I thought we should error out when users do "git branch -l" > > instead. Something like the following should help us prevent "git > > branch -l" from listing branch names and might also prevent the > > confusion. > > Yeah, I think that's just a more extreme version of the current plan (it > turns it immediately into a hard error instead of warning for a while). > If we just make the warning easier to see in the paged case, I think > that makes the current plan fine. > > I'll wrap up the patch I sent earlier. Hmm, actually, I suppose the true value of the warning is to help people doing "git branch -l foo", and it would still work there. The "more extreme" from your suggested patch would only affect "branch -l". Still, I think I prefer the gentler version that we get by keeping it as a warning even in the latter case. Here's the patch. It goes on top of jk/branch-l-0-deprecation (and will naturally conflict with the removal branch, but the resolution should be obvious). -- >8 -- Subject: [PATCH] branch: issue "-l" deprecation warning after pager starts If you run "git branch -l", we issue a deprecation warning that "-l" is probably not doing what you expect. But we do so while parsing the options, which is _before_ we start the pager. Depending on your pager settings, this means that the warning may get totally overwritten by the pager. Instead, let's delay the message until after we would have started the pager. If we do page, then it will end up inside the pager (since we redirect stderr). And if not (including the case when you really did mean for "-l" to work as "--create-reflog"), then it will still get shown on stderr. Signed-off-by: Jeff King--- builtin/branch.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/builtin/branch.c b/builtin/branch.c index e50a5a1680..55bfacd843 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -34,6 +34,7 @@ static const char * const builtin_branch_usage[] = { NULL }; +static int used_deprecated_reflog_option; static const char *head; static struct object_id head_oid; @@ -573,8 +574,7 @@ static int edit_branch_description(const char *branch_name) static int deprecated_reflog_option_cb(const struct option *opt, const char *arg, int unset) { - warning("the '-l' alias for '--create-reflog' is deprecated;"); - warning("it will be removed in a future version of Git"); + used_deprecated_reflog_option = 1; *(int *)opt->value = !unset; return 0; } @@ -700,6 +700,11 @@ int cmd_branch(int argc, const char **argv, const char *prefix) if (list) setup_auto_pager("branch", 1); + if (used_deprecated_reflog_option) { + warning("the '-l' alias for '--create-reflog' is deprecated;"); + warning("it will be removed in a future version of Git"); + } + if (delete) { if (!argc) die(_("branch name required")); -- 2.17.0.1391.g6fdbf40724
Re: jk/branch-l-0-deprecation (was Re: What's cooking in git.git (May 2018, #02; Thu, 17))
On Thu, May 24, 2018 at 08:40:18PM +0530, Kaartic Sivaraam wrote: > > On the other hand, I'm not sure this is that big a deal. The point of > > the deprecation warning is to catch people who are actually trying to > > use "-l" as "--create-reflog", and that case does not page. The people > > doing "git branch -l" are actually getting what they want eventually, > > which is to turn it into "--list". In the interim step where it becomes > > an unknown option, they'll get a hard error. > > I just thought we wouldn't want to surprise/confuse users who try to > use "git branch -l" with the warning message about "create reflog" > along-side the list of branches. That would just add to the confusion. > So, I thought we should error out when users do "git branch -l" > instead. Something like the following should help us prevent "git > branch -l" from listing branch names and might also prevent the > confusion. Yeah, I think that's just a more extreme version of the current plan (it turns it immediately into a hard error instead of warning for a while). If we just make the warning easier to see in the paged case, I think that makes the current plan fine. I'll wrap up the patch I sent earlier. -Peff
Re: is the standard "[]", and not "[options]" or other?
On Thu, May 24, 2018 at 4:45 AM, Robert P. J. Daywrote: > > but should the man pages be updated similarly? i can whip up a patch > for that unless someone wants to comment on this further. Yes, please! I think [] are the best, as they are pedantically correct. [--options] is the worst, as there is not such thing as --options. Thanks, Stefan
Re: What's cooking in git.git (May 2018, #03; Wed, 23)
> * sb/diff-color-move-more (2018-05-21) 8 commits > - diff: color-moved white space handling options imply color-moved > - diff.c: add --color-moved-ignore-space-delta option > - diff.c: decouple white space treatment from move detection algorithm > - diff.c: add a blocks mode for moved code detection > - diff.c: adjust hash function signature to match hashmap expectation > - diff.c: do not pass diff options as keydata to hashmap > - xdiff/xdiffi.c: remove unneeded function declarations > - xdiff/xdiff.h: remove unused flags > > "git diff --color-moved" feature has further been tweaked. > > Will merge to 'next'. This is nowhere near done after reading the feedback from Jonathan Tan; I am considering redoing it completely differently. Please eject from next. (at least the last 3 patches, the patches up to "add blocks mode" are reasonable.)
Re: [RFC/PATCH 7/7] rerere: teach rerere to handle nested conflicts
On 05/24, Junio C Hamano wrote: > Thomas Gummererwrites: > > > No automated test for this yet. As mentioned in the cover letter as > > well, I'm not sure if this is common enough for us to actually > > consider this use case. I don't know how nested conflicts could > > actually be created apart from committing a file with conflict > > markers, > > Recursive merge whose inner merge leaves conflict markers? Thanks, lots of stuff in Git I still have to learn :) > One thing that makes me wonder is that the conflict markers may not > "nest" so nicely. For example, if inner merges had two conflicts > like these: > > <<< > < > A > = > B > > > === > < > A > = > C > > > >>> > > where one side made something to A or B, while the other side made > something (or something else) to A or C, I would imagine that the > outer conflict could be "optimized" to produce this instead: > > > < > A > = > <<< > B > === > C > >>> > > Yeah, I do think that would be a nicer merge conflict to solve. But I think that should be done in a separate patch series if we decide to do so. When this one lands rerere will be able to handle the conflict either way :)
Re: BUG: No way to set fsck. when cloning
On Thu, May 24, 2018 at 07:04:04PM +0200, Ævar Arnfjörð Bjarmason wrote: > That doesn't work, because that's for the server-side, but I need the > fetch.fsck.* that doesn't exist. This works (I'll send a better patch > with tests / docs etc. soon): Yeah, I think this is the right direction. > +static int fetch_pack_config_cb(const char *var, const char *value, void *cb) > +{ > + if (strcmp(var, "fetch.fsck.skiplist") == 0) { > + const char *path; > + > + if (git_config_pathname(, var, value)) > + return 1; > + strbuf_addf(_msg_types, "%cskiplist=%s", > + fsck_msg_types.len ? ',' : '=', path); > + free((char *)path); > + return 0; > + } > + > + if (skip_prefix(var, "fetch.fsck.", )) { > + if (is_valid_msg_type(var, value)) > + strbuf_addf(_msg_types, "%c%s=%s", > + fsck_msg_types.len ? ',' : '=', var, value); > + else > + warning("Skipping unknown msg id '%s'", var); > + return 0; > + } This matches what's in receive-pack, though the way we stuff all of the options into a flat string is kind of nasty. I also wonder if we'd eventually run up against command-line limits if somebody had a complicated fsck config. I wonder if we should simply be telling index-pack "please read fsck config from the prefix 'fetch.fsck'" instead. I dunno, maybe I am just creating work. Certainly I don't think it should be a blocker for adding fetch.fsck support. But if you want to think about it while you are here or write a patch, I wouldn't complain. :) -Peff
Re: [RFC/PATCH 4/7] rerere: fix crash when conflict goes unresolved
On 05/24, Junio C Hamano wrote: > Thomas Gummererwrites: > > > To fix this, remove the rerere ID from the MERGE_RR file in case we > > can't handle it, and remove the folder for the ID. Removing it > > unconditionally is fine here, because if the user would have resolved > > the conflict and ran rerere, the entry would no longer be in the > > MERGE_RR file, so we wouldn't have this problem in the first place, > > I do not think removing the directory and losing _other_ conflicts > and their resolutions, if they exist, is fine in the modern world > order post rerere-multi update in 2016. Well, it is just as safe as > "rm -rf .git/rr-cache/" in the sense that it won't make Git start > segfaulting, but it is not fine as it is discarding information of > conflicts that has nothing to do with the current one that is > problematic. Sorry I botched the description here, and failed to describe what the code is actually doing. We're actually only removing the variant in the MERGE_RR file, whose path we are now no longer able to handle. And I think that's fine to do, because if it is still in the MERGE_RR file the conflict hasn't been resolved yet, afaiu. Will update the commit message.
Re: unexpected "unresolved merge conflict" for a new file
On Thu, May 24, 2018 at 01:36:57PM +0200, Michal Hocko wrote: > `git commit' fails on a newly added file with the following > * > * You have some suspicious patch lines: > * > * In Documentation/core-api/gfp_mask-from-fs-io.rst > * unresolved merge conflict (line 27) > Documentation/core-api/gfp_mask-from-fs-io.rst:27:=== This message isn't generated by git itself, but rather by a pre-commit hook. You can skip the hook by running "git commit --no-verify". As for the false positive in the hook logic, I can't say more without having seen the hook source. :) Do you know where you got it from? (Googling for "suspicious patch lines" turns up some hits, but with varying provenance). -Peff
Re: [PATCH v2] rev-parse: check lookup'ed commit references for NULL
On Wed, May 23, 2018 at 11:27:33PM -0700, Elijah Newren wrote: > Commits 2122f8b963d4 ("rev-parse: Add support for the ^! and ^@ syntax", > 2008-07-26) and 3dd4e7320d ("Teach rev-parse the ... syntax.", 2006-07-04) > taught rev-parse new syntax, and used lookup_commit_reference() as part of > their logic. Neither usage checked the returned commit to see if it was > non-NULL before using it. Check for NULL and ensure an appropriate error > is reported to the user. > > Reported by Florian Weimer and Todd Zullinger. > > Helped-by: Jeff King> Signed-off-by: Elijah Newren This version looks good to me. Thanks for taking care of this! -Peff
Re: BUG: No way to set fsck. when cloning
On Thu, May 24, 2018 at 05:25:29PM +0200, Ævar Arnfjörð Bjarmason wrote: > When I do: > > git -c fetch.fsckObjects=true clone > g...@github.com:robbyrussell/oh-my-zsh.git > > I get: > > error: object 2b7227859263b6aabcc28355b0b994995b7148b6: > zeroPaddedFilemode: contains zero-padded file modes > fatal: Error in object > fatal: index-pack failed > > The docs (https://git-scm.com/docs/git-config#git-config-fsckltmsg-idgt) > say you can override this with -c fsck.zeroPaddedFilemode = ignore, but > I see in builtin/fsck.c that just fsck_config() knows about this, and > indeed this works *after* you clone the repo when you use 'git fsck'. > > I don't have time to fix this now, but what's the best approach here? > Make all the relevant commands copy what fsck_config() is doing, or > should fsck_object() be lazily looking up this config by itself? I think the relevant commands need to do it themselves. We already have receive.fsck.*. I don't think there's an equivalent for fetching, but there probably should be. But fsck_object() doesn't have the context to know which set should be used. -Peff
Re: BUG: No way to set fsck. when cloning
On Thu, May 24 2018, Kevin Daudt wrote: > On Thu, May 24, 2018 at 05:25:29PM +0200, Ævar Arnfjörð Bjarmason wrote: >> When I do: >> >> git -c fetch.fsckObjects=true clone >> g...@github.com:robbyrussell/oh-my-zsh.git >> >> I get: >> >> error: object 2b7227859263b6aabcc28355b0b994995b7148b6: >> zeroPaddedFilemode: contains zero-padded file modes >> fatal: Error in object >> fatal: index-pack failed >> >> The docs (https://git-scm.com/docs/git-config#git-config-fsckltmsg-idgt) >> say you can override this with -c fsck.zeroPaddedFilemode = ignore, but >> I see in builtin/fsck.c that just fsck_config() knows about this, and >> indeed this works *after* you clone the repo when you use 'git fsck'. >> >> I don't have time to fix this now, but what's the best approach here? >> Make all the relevant commands copy what fsck_config() is doing, or >> should fsck_object() be lazily looking up this config by itself? > > Apparently someone reported this earlier[0]. Johannes replied: > >> Well, you can apparently have your cake and eat it too (see >> https://git-scm.com/docs/git-config#git-config-receivefsckltmsg-idgt): >> >> receive.fsck.:: >> When `receive.fsckObjects` is set to true, errors can be switched >> to warnings and vice versa by configuring the `receive.fsck.` >> setting where the `` is the fsck message ID and the value >> is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes >> the error/warning with the message ID, e.g. "missingEmail: invalid >> author/committer line - missing email" means that setting >> `receive.fsck.missingEmail = ignore` will hide that issue. >> >> In your case, use receive.fsck.zeroPaddedFilemode=ignore=warn (or >> =ignore). > > [0]https://public-inbox.org/git/alpine.deb.2.21.1.1801042125430...@minint-6bku6qn.europe.corp.microsoft.com/ > > Hope this helps, Kevin. That doesn't work, because that's for the server-side, but I need the fetch.fsck.* that doesn't exist. This works (I'll send a better patch with tests / docs etc. soon): diff --git a/fetch-pack.c b/fetch-pack.c index 490c38f833..9e4282788e 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -19,6 +19,7 @@ #include "sha1-array.h" #include "oidset.h" #include "packfile.h" +#include "fsck.h" static int transfer_unpack_limit = -1; static int fetch_unpack_limit = -1; @@ -33,6 +34,7 @@ static int agent_supported; static int server_supports_filtering; static struct lock_file shallow_lock; static const char *alternate_shallow_file; +static struct strbuf fsck_msg_types = STRBUF_INIT; /* Remember to update object flag allocation in object.h */ #define COMPLETE (1U << 0) @@ -935,7 +937,8 @@ static int get_pack(struct fetch_pack_args *args, */ argv_array_push(, "--fsck-objects"); else - argv_array_push(, "--strict"); + argv_array_pushf(, "--strict%s", +fsck_msg_types.buf); } cmd.in = demux.out; @@ -1409,6 +1412,31 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, return ref; } +static int fetch_pack_config_cb(const char *var, const char *value, void *cb) +{ + if (strcmp(var, "fetch.fsck.skiplist") == 0) { + const char *path; + + if (git_config_pathname(, var, value)) + return 1; + strbuf_addf(_msg_types, "%cskiplist=%s", + fsck_msg_types.len ? ',' : '=', path); + free((char *)path); + return 0; + } + + if (skip_prefix(var, "fetch.fsck.", )) { + if (is_valid_msg_type(var, value)) + strbuf_addf(_msg_types, "%c%s=%s", + fsck_msg_types.len ? ',' : '=', var, value); + else + warning("Skipping unknown msg id '%s'", var); + return 0; + } + + return git_default_config(var, value, cb); +} + static void fetch_pack_config(void) { git_config_get_int("fetch.unpacklimit", _unpack_limit); @@ -1417,7 +1445,7 @@ static void fetch_pack_config(void) git_config_get_bool("fetch.fsckobjects", _fsck_objects); git_config_get_bool("transfer.fsckobjects", _fsck_objects); - git_config(git_default_config, NULL); + git_config(fetch_pack_config_cb, NULL); } static void fetch_pack_setup(void)
Re: [PATCH 2/5] t6036, t6042: use test_line_count instead of wc -l
On Thu, May 24, 2018 at 3:05 AM, SZEDER Gáborwrote: >> --- a/t/t6036-recursive-corner-cases.sh >> +++ b/t/t6036-recursive-corner-cases.sh >> @@ -65,9 +65,12 @@ test_expect_success 'merge simple rename+criss-cross with >> no modifications' ' >> >> test_must_fail git merge -s recursive R2^0 && >> >> - test 2 = $(git ls-files -s | wc -l) && >> - test 2 = $(git ls-files -u | wc -l) && >> - test 2 = $(git ls-files -o | wc -l) && > > Here 'git ls-files -o' should have listed two untracked files ... > >> + git ls-files -s >out && >> + test_line_count = 2 out && >> + git ls-files -u >out && >> + test_line_count = 2 out && >> + git ls-files -o >out && >> + test_line_count = 3 out && > > ... but now you expect it to list three. I was about to point out the > typo, but then noticed that you expect it to list one more untracked > file than before in all subsequent tests... now that can't be just a > typo, can it? > > Please mention in the commit message that when using an intermediate > file to store the output, 'git ls-files -o' will list that file, too, > that's why the number of expected untracked files had to be adjusted; > so future readers won't have to figure this out themselves. How does adding the following to the commit message sound? Changing to use test_line_count means using an intermediate file to store the output, which means that 'git ls-files -o' will have an additional file to list, which means that the number of lines expected in some tests will therefore change as well (unless the same intermediate file was used to capture the output of a previous command). I'll include that in my next roll of the series after waiting for any other fixups folks point out. Sorry for the trouble. As always, thanks for taking a look!
Re: [PATCH v2 03/12] commit-graph: test that 'verify' finds corruption
On 5/21/2018 2:53 PM, Jakub Narebski wrote: +corrupt_data() { + file=$1 + pos=$2 + data="${3:-\0}" + printf "$data" | dd of="$file" bs=1 seek="$pos" conv=notrunc +} First, if we do this that way (and not by adding a test helper), the use of this function should be, I think, protected using appropriate test prerequisite. Not everyone has 'dd' tool installed, for example on MS Windows. Windows does not, but it is also missing many things this test suite needs. 'dd' is included in the Git for Windows SDK. I rebased this series onto Git for Windows and the tests passed when run in an SDK shell. Thanks, -Stolee
[PATCH v3 13/20] commit-graph: verify generation number
While iterating through the commit parents, perform the generation number calculation and compare against the value stored in the commit-graph. The tests demonstrate that having a different set of parents affects the generation number calculation, and this value propagates to descendants. Hence, we drop the single-line condition on the output. Signed-off-by: Derrick Stolee--- commit-graph.c | 18 ++ t/t5318-commit-graph.sh | 6 ++ 2 files changed, 24 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index fff22dc0c3..ead92460c1 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -922,6 +922,7 @@ int verify_commit_graph(struct commit_graph *g) for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit, *odb_commit; struct commit_list *graph_parents, *odb_parents; + uint32_t max_generation = 0; hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); @@ -956,6 +957,9 @@ int verify_commit_graph(struct commit_graph *g) oid_to_hex(_parents->item->object.oid), oid_to_hex(_parents->item->object.oid)); + if (graph_parents->item->generation > max_generation) + max_generation = graph_parents->item->generation; + graph_parents = graph_parents->next; odb_parents = odb_parents->next; } @@ -963,6 +967,20 @@ int verify_commit_graph(struct commit_graph *g) if (odb_parents != NULL) graph_report("commit-graph parent list for commit %s terminates early", oid_to_hex(_oid)); + + /* +* If one of our parents has generation GENERATION_NUMBER_MAX, then +* our generation is also GENERATION_NUMBER_MAX. Decrement to avoid +* extra logic in the following condition. +*/ + if (max_generation == GENERATION_NUMBER_MAX) + max_generation--; + + if (graph_commit->generation != max_generation + 1) + graph_report("commit-graph generation for commit %s is %u != %u", +oid_to_hex(_oid), +graph_commit->generation, +max_generation + 1); } return verify_commit_graph_error; diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 12f0d7f54d..673b0d37d5 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -272,6 +272,7 @@ GRAPH_BYTE_COMMIT_TREE=$GRAPH_COMMIT_DATA_OFFSET GRAPH_BYTE_COMMIT_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN` GRAPH_BYTE_COMMIT_EXTRA_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 4` GRAPH_BYTE_COMMIT_WRONG_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 3` +GRAPH_BYTE_COMMIT_GENERATION=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 8` # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -366,4 +367,9 @@ test_expect_success 'detect incorrect tree OID' ' "commit-graph parent for" ' +test_expect_success 'detect incorrect generation number' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_GENERATION "\01" \ + "generation" +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 18/20] commit-graph: add '--reachable' option
When writing commit-graph files, it can be convenient to ask for all reachable commits (starting at the ref set) in the resulting file. This is particularly helpful when writing to stdin is complicated, such as a future integration with 'git gc' which will call write_commit_graph_reachable() after performing cleanup of the object database. Signed-off-by: Derrick Stolee--- Documentation/git-commit-graph.txt | 8 ++-- builtin/commit-graph.c | 16 commit-graph.c | 32 commit-graph.h | 1 + t/t5318-commit-graph.sh| 10 ++ 5 files changed, 61 insertions(+), 6 deletions(-) diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt index a222cfab08..dececb79d7 100644 --- a/Documentation/git-commit-graph.txt +++ b/Documentation/git-commit-graph.txt @@ -38,12 +38,16 @@ Write a commit graph file based on the commits found in packfiles. + With the `--stdin-packs` option, generate the new commit graph by walking objects only in the specified pack-indexes. (Cannot be combined -with --stdin-commits.) +with `--stdin-commits` or `--reachable`.) + With the `--stdin-commits` option, generate the new commit graph by walking commits starting at the commits specified in stdin as a list of OIDs in hex, one OID per line. (Cannot be combined with ---stdin-packs.) +`--stdin-packs` or `--reachable`.) ++ +With the `--reachable` option, generate the new commit graph by walking +commits starting at all refs. (Cannot be combined with `--stdin-commits` +or `--stdin-packs`.) + With the `--append` option, include all commits that are present in the existing commit-graph file. diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index 0433dd6e20..20ce6437ae 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -9,7 +9,7 @@ static char const * const builtin_commit_graph_usage[] = { N_("git commit-graph [--object-dir ]"), N_("git commit-graph read [--object-dir ]"), N_("git commit-graph verify [--object-dir ]"), - N_("git commit-graph write [--object-dir ] [--append] [--stdin-packs|--stdin-commits]"), + N_("git commit-graph write [--object-dir ] [--append] [--reachable|--stdin-packs|--stdin-commits]"), NULL }; @@ -24,12 +24,13 @@ static const char * const builtin_commit_graph_read_usage[] = { }; static const char * const builtin_commit_graph_write_usage[] = { - N_("git commit-graph write [--object-dir ] [--append] [--stdin-packs|--stdin-commits]"), + N_("git commit-graph write [--object-dir ] [--append] [--reachable|--stdin-packs|--stdin-commits]"), NULL }; static struct opts_commit_graph { const char *obj_dir; + int reachable; int stdin_packs; int stdin_commits; int append; @@ -130,6 +131,8 @@ static int graph_write(int argc, const char **argv) OPT_STRING(0, "object-dir", _dir, N_("dir"), N_("The object directory to store the graph")), + OPT_BOOL(0, "reachable", , + N_("start walk at all refs")), OPT_BOOL(0, "stdin-packs", _packs, N_("scan pack-indexes listed by stdin for commits")), OPT_BOOL(0, "stdin-commits", _commits, @@ -143,11 +146,16 @@ static int graph_write(int argc, const char **argv) builtin_commit_graph_write_options, builtin_commit_graph_write_usage, 0); - if (opts.stdin_packs && opts.stdin_commits) - die(_("cannot use both --stdin-commits and --stdin-packs")); + if (opts.reachable + opts.stdin_packs + opts.stdin_commits > 1) + die(_("use at most one of --reachable, --stdin-commits, or --stdin-packs")); if (!opts.obj_dir) opts.obj_dir = get_object_directory(); + if (opts.reachable) { + write_commit_graph_reachable(opts.obj_dir, opts.append); + return 0; + } + if (opts.stdin_packs || opts.stdin_commits) { struct strbuf buf = STRBUF_INIT; lines_nr = 0; diff --git a/commit-graph.c b/commit-graph.c index a33600c584..057d734926 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -6,6 +6,7 @@ #include "packfile.h" #include "commit.h" #include "object.h" +#include "refs.h" #include "revision.h" #include "sha1-lookup.h" #include "commit-graph.h" @@ -651,6 +652,37 @@ static void compute_generation_numbers(struct packed_commit_list* commits) } } +struct hex_list { + char **hex_strs; + int hex_nr; + int hex_alloc; +}; + +static int add_ref_to_list(const char *refname, + const struct object_id *oid, + int flags, void *cb_data) +{ + struct hex_list *list = (struct
[PATCH v3 17/20] fsck: verify commit-graph
If core.commitGraph is true, verify the contents of the commit-graph during 'git fsck' using the 'git commit-graph verify' subcommand. Run this check on all alternates, as well. We use a new process for two reasons: 1. The subcommand decouples the details of loading and verifying a commit-graph file from the other fsck details. 2. The commit-graph verification requires the commits to be loaded in a specific order to guarantee we parse from the commit-graph file for some objects and from the object database for others. Signed-off-by: Derrick Stolee--- Documentation/git-fsck.txt | 3 +++ builtin/fsck.c | 21 + t/t5318-commit-graph.sh| 8 3 files changed, 32 insertions(+) diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt index b9f060e3b2..ab9a93fb9b 100644 --- a/Documentation/git-fsck.txt +++ b/Documentation/git-fsck.txt @@ -110,6 +110,9 @@ Any corrupt objects you will have to find in backups or other archives (i.e., you can just remove them and do an 'rsync' with some other site in the hopes that somebody else has the object you have corrupted). +If core.commitGraph is true, the commit-graph file will also be inspected +using 'git commit-graph verify'. See linkgit:git-commit-graph[1]. + Extracted Diagnostics - diff --git a/builtin/fsck.c b/builtin/fsck.c index ef78c6c00c..a6d5045b77 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -16,6 +16,7 @@ #include "streaming.h" #include "decorate.h" #include "packfile.h" +#include "run-command.h" #define REACHABLE 0x0001 #define SEEN 0x0002 @@ -45,6 +46,7 @@ static int name_objects; #define ERROR_REACHABLE 02 #define ERROR_PACK 04 #define ERROR_REFS 010 +#define ERROR_COMMIT_GRAPH 020 static const char *describe_object(struct object *obj) { @@ -815,5 +817,24 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) } check_connectivity(); + + if (core_commit_graph) { + struct child_process commit_graph_verify = CHILD_PROCESS_INIT; + const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL, NULL }; + commit_graph_verify.argv = verify_argv; + commit_graph_verify.git_cmd = 1; + + if (run_command(_graph_verify)) + errors_found |= ERROR_COMMIT_GRAPH; + + prepare_alt_odb(); + for (alt = alt_odb_list; alt; alt = alt->next) { + verify_argv[2] = "--object-dir"; + verify_argv[3] = alt->path; + if (run_command(_graph_verify)) + errors_found |= ERROR_COMMIT_GRAPH; + } + } + return errors_found; } diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 2680a2ebff..4941937163 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -394,4 +394,12 @@ test_expect_success 'detect invalid checksum hash' ' "incorrect checksum" ' +test_expect_success 'git fsck (checks commit-graph)' ' + cd "$TRASH_DIRECTORY/full" && + git fsck && + corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \ + "incorrect checksum" && + test_must_fail git fsck +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 15/20] commit-graph: test for corrupted octopus edge
The commit-graph file has an extra chunk to store the parent int-ids for parents beyond the first parent for octopus merges. Our test repo has a single octopus merge that we can manipulate to demonstrate the 'verify' subcommand detects incorrect values in that chunk. Signed-off-by: Derrick Stolee--- t/t5318-commit-graph.sh | 10 ++ 1 file changed, 10 insertions(+) diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 58adb8246d..240aef6add 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -248,6 +248,7 @@ test_expect_success 'git commit-graph verify' ' ' NUM_COMMITS=9 +NUM_OCTOPUS_EDGES=2 HASH_LEN=20 GRAPH_BYTE_VERSION=4 GRAPH_BYTE_HASH=5 @@ -274,6 +275,10 @@ GRAPH_BYTE_COMMIT_EXTRA_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 4` GRAPH_BYTE_COMMIT_WRONG_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 3` GRAPH_BYTE_COMMIT_GENERATION=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 8` GRAPH_BYTE_COMMIT_DATE=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 12` +GRAPH_COMMIT_DATA_WIDTH=`expr $HASH_LEN + 16` +GRAPH_OCTOPUS_DATA_OFFSET=`expr $GRAPH_COMMIT_DATA_OFFSET + \ + $GRAPH_COMMIT_DATA_WIDTH \* $NUM_COMMITS` +GRAPH_BYTE_OCTOPUS=`expr $GRAPH_OCTOPUS_DATA_OFFSET + 4` # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -378,4 +383,9 @@ test_expect_success 'detect incorrect commit date' ' "commit date" ' +test_expect_success 'detect incorrect parent for octopus merge' ' + corrupt_graph_and_verify $GRAPH_BYTE_OCTOPUS "\01" \ + "invalid parent" +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 20/20] commit-graph: update design document
The commit-graph feature is now integrated with 'fsck' and 'gc', so remove those items from the "Future Work" section of the commit-graph design document. Also remove the section on lazy-loading trees, as that was completed in an earlier patch series. Signed-off-by: Derrick Stolee--- Documentation/technical/commit-graph.txt | 22 -- 1 file changed, 22 deletions(-) diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt index e1a883eb46..c664acbd76 100644 --- a/Documentation/technical/commit-graph.txt +++ b/Documentation/technical/commit-graph.txt @@ -118,9 +118,6 @@ Future Work - The commit graph feature currently does not honor commit grafts. This can be remedied by duplicating or refactoring the current graft logic. -- The 'commit-graph' subcommand does not have a "verify" mode that is - necessary for integration with fsck. - - After computing and storing generation numbers, we must make graph walks aware of generation numbers to gain the performance benefits they enable. This will mostly be accomplished by swapping a commit-date-ordered @@ -130,25 +127,6 @@ Future Work - 'log --topo-order' - 'tag --merged' -- Currently, parse_commit_gently() requires filling in the root tree - object for a commit. This passes through lookup_tree() and consequently - lookup_object(). Also, it calls lookup_commit() when loading the parents. - These method calls check the ODB for object existence, even if the - consumer does not need the content. For example, we do not need the - tree contents when computing merge bases. Now that commit parsing is - removed from the computation time, these lookup operations are the - slowest operations keeping graph walks from being fast. Consider - loading these objects without verifying their existence in the ODB and - only loading them fully when consumers need them. Consider a method - such as "ensure_tree_loaded(commit)" that fully loads a tree before - using commit->tree. - -- The current design uses the 'commit-graph' subcommand to generate the graph. - When this feature stabilizes enough to recommend to most users, we should - add automatic graph writes to common operations that create many commits. - For example, one could compute a graph on 'clone', 'fetch', or 'repack' - commands. - - A server could provide a commit graph file as part of the network protocol to avoid extra calculations by clients. This feature is only of benefit if the user is willing to trust the file, because verifying the file is correct -- 2.16.2.329.gfb62395de6
[PATCH v3 16/20] commit-graph: verify contents match checksum
The commit-graph file ends with a SHA1 hash of the previous contents. If a commit-graph file has errors but the checksum hash is correct, then we know that the problem is a bug in Git and not simply file corruption after-the-fact. Compute the checksum right away so it is the first error that appears, and make the message translatable since this error can be "corrected" by a user by simply deleting the file and recomputing. The rest of the errors are useful only to developers. Be sure to continue checking the rest of the file data if the checksum is wrong. This is important for our tests, as we break the checksum as we modify bytes of the commit-graph file. Signed-off-by: Derrick Stolee--- commit-graph.c | 16 ++-- t/t5318-commit-graph.sh | 6 ++ 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index d2b291aca2..a33600c584 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -841,6 +841,7 @@ void write_commit_graph(const char *obj_dir, oids.nr = 0; } +#define VERIFY_COMMIT_GRAPH_ERROR_HASH 2 static int verify_commit_graph_error; static void graph_report(const char *fmt, ...) @@ -860,7 +861,9 @@ static void graph_report(const char *fmt, ...) int verify_commit_graph(struct commit_graph *g) { uint32_t i, cur_fanout_pos = 0; - struct object_id prev_oid, cur_oid; + struct object_id prev_oid, cur_oid, checksum; + struct hashfile *f; + int devnull; if (!g) { graph_report("no commit-graph file loaded"); @@ -879,6 +882,15 @@ int verify_commit_graph(struct commit_graph *g) if (verify_commit_graph_error) return verify_commit_graph_error; + devnull = open("/dev/null", O_WRONLY); + f = hashfd(devnull, NULL); + hashwrite(f, g->data, g->data_len - g->hash_len); + finalize_hashfile(f, checksum.hash, CSUM_CLOSE); + if (hashcmp(checksum.hash, g->data + g->data_len - g->hash_len)) { + graph_report(_("the commit-graph file has incorrect checksum and is likely corrupt")); + verify_commit_graph_error = VERIFY_COMMIT_GRAPH_ERROR_HASH; + } + for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit; @@ -916,7 +928,7 @@ int verify_commit_graph(struct commit_graph *g) cur_fanout_pos++; } - if (verify_commit_graph_error) + if (verify_commit_graph_error & ~VERIFY_COMMIT_GRAPH_ERROR_HASH) return verify_commit_graph_error; for (i = 0; i < g->num_commits; i++) { diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 240aef6add..2680a2ebff 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -279,6 +279,7 @@ GRAPH_COMMIT_DATA_WIDTH=`expr $HASH_LEN + 16` GRAPH_OCTOPUS_DATA_OFFSET=`expr $GRAPH_COMMIT_DATA_OFFSET + \ $GRAPH_COMMIT_DATA_WIDTH \* $NUM_COMMITS` GRAPH_BYTE_OCTOPUS=`expr $GRAPH_OCTOPUS_DATA_OFFSET + 4` +GRAPH_BYTE_FOOTER=`expr $GRAPH_OCTOPUS_DATA_OFFSET + 4 \* $NUM_OCTOPUS_EDGES` # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -388,4 +389,9 @@ test_expect_success 'detect incorrect parent for octopus merge' ' "invalid parent" ' +test_expect_success 'detect invalid checksum hash' ' + corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \ + "incorrect checksum" +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 02/20] commit-graph: fix GRAPH_MIN_SIZE
The GRAPH_MIN_SIZE macro should be the smallest size of a parsable commit-graph file. However, the minimum number of chunks was wrong. It is possible to write a commit-graph file with zero commits, and that violates this macro's value. Rewrite the macro, and use extra macros to better explain the magic constants. Signed-off-by: Derrick Stolee--- commit-graph.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index a8c337dd77..82295f0975 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -33,10 +33,11 @@ #define GRAPH_LAST_EDGE 0x8000 +#define GRAPH_HEADER_SIZE 8 #define GRAPH_FANOUT_SIZE (4 * 256) #define GRAPH_CHUNKLOOKUP_WIDTH 12 -#define GRAPH_MIN_SIZE (5 * GRAPH_CHUNKLOOKUP_WIDTH + GRAPH_FANOUT_SIZE + \ - GRAPH_OID_LEN + 8) +#define GRAPH_MIN_SIZE (GRAPH_HEADER_SIZE + 4 * GRAPH_CHUNKLOOKUP_WIDTH \ + + GRAPH_FANOUT_SIZE + GRAPH_OID_LEN) char *get_commit_graph_filename(const char *obj_dir) { -- 2.16.2.329.gfb62395de6
[PATCH v3 05/20] commit-graph: load a root tree from specific graph
When lazy-loading a tree for a commit, it will be important to select the tree from a specific struct commit_graph. Create a new method that specifies the commit-graph file and use that in get_commit_tree_in_graph(). Signed-off-by: Derrick Stolee--- commit-graph.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 78ba0edc80..25893ec096 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -358,14 +358,20 @@ static struct tree *load_tree_for_commit(struct commit_graph *g, struct commit * return c->maybe_tree; } -struct tree *get_commit_tree_in_graph(const struct commit *c) +static struct tree *get_commit_tree_in_graph_one(struct commit_graph *g, +const struct commit *c) { if (c->maybe_tree) return c->maybe_tree; if (c->graph_pos == COMMIT_NOT_FROM_GRAPH) - BUG("get_commit_tree_in_graph called from non-commit-graph commit"); + BUG("get_commit_tree_in_graph_one called from non-commit-graph commit"); + + return load_tree_for_commit(g, (struct commit *)c); +} - return load_tree_for_commit(commit_graph, (struct commit *)c); +struct tree *get_commit_tree_in_graph(const struct commit *c) +{ + return get_commit_tree_in_graph_one(commit_graph, c); } static void write_graph_chunk_fanout(struct hashfile *f, -- 2.16.2.329.gfb62395de6
[PATCH v3 08/20] commit-graph: verify required chunks are present
The commit-graph file requires the following three chunks: * OID Fanout * OID Lookup * Commit Data If any of these are missing, then the 'verify' subcommand should report a failure. This includes the chunk IDs malformed or the chunk count is truncated. Signed-off-by: Derrick Stolee--- commit-graph.c | 9 + t/t5318-commit-graph.sh | 29 + 2 files changed, 38 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 55b41664ee..06e3e4f9ba 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -860,5 +860,14 @@ int verify_commit_graph(struct commit_graph *g) return 1; } + verify_commit_graph_error = 0; + + if (!g->chunk_oid_fanout) + graph_report("commit-graph is missing the OID Fanout chunk"); + if (!g->chunk_oid_lookup) + graph_report("commit-graph is missing the OID Lookup chunk"); + if (!g->chunk_commit_data) + graph_report("commit-graph is missing the Commit Data chunk"); + return verify_commit_graph_error; } diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index bd64481c7a..4ef3fe3dc2 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -249,6 +249,15 @@ test_expect_success 'git commit-graph verify' ' GRAPH_BYTE_VERSION=4 GRAPH_BYTE_HASH=5 +GRAPH_BYTE_CHUNK_COUNT=6 +GRAPH_CHUNK_LOOKUP_OFFSET=8 +GRAPH_CHUNK_LOOKUP_WIDTH=12 +GRAPH_CHUNK_LOOKUP_ROWS=5 +GRAPH_BYTE_OID_FANOUT_ID=$GRAPH_CHUNK_LOOKUP_OFFSET +GRAPH_BYTE_OID_LOOKUP_ID=`expr $GRAPH_CHUNK_LOOKUP_OFFSET + \ + 1 \* $GRAPH_CHUNK_LOOKUP_WIDTH` +GRAPH_BYTE_COMMIT_DATA_ID=`expr $GRAPH_CHUNK_LOOKUP_OFFSET + \ + 2 \* $GRAPH_CHUNK_LOOKUP_WIDTH` # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -283,4 +292,24 @@ test_expect_success 'detect bad hash version' ' "hash version" ' +test_expect_success 'detect bad chunk count' ' + corrupt_graph_and_verify $GRAPH_BYTE_CHUNK_COUNT "\02" \ + "missing the Commit Data chunk" +' + +test_expect_success 'detect missing OID fanout chunk' ' + corrupt_graph_and_verify $GRAPH_BYTE_OID_FANOUT_ID "\0" \ + "missing the OID Fanout chunk" +' + +test_expect_success 'detect missing OID lookup chunk' ' + corrupt_graph_and_verify $GRAPH_BYTE_OID_LOOKUP_ID "\0" \ + "missing the OID Lookup chunk" +' + +test_expect_success 'detect missing commit data chunk' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_DATA_ID "\0" \ + "missing the Commit Data chunk" +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 10/20] commit-graph: verify objects exist
In the 'verify' subcommand, load commits directly from the object database to ensure they exist. Parse by skipping the commit-graph. Signed-off-by: Derrick Stolee--- commit-graph.c | 20 t/t5318-commit-graph.sh | 7 +++ 2 files changed, 27 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index cbd1aae514..0420ebcd87 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -238,6 +238,10 @@ static struct commit_list **insert_parent_or_die(struct commit_graph *g, { struct commit *c; struct object_id oid; + + if (pos >= g->num_commits) + die("invalid parent position %"PRIu64, pos); + hashcpy(oid.hash, g->chunk_oid_lookup + g->hash_len * pos); c = lookup_commit(); if (!c) @@ -905,5 +909,21 @@ int verify_commit_graph(struct commit_graph *g) cur_fanout_pos++; } + if (verify_commit_graph_error) + return verify_commit_graph_error; + + for (i = 0; i < g->num_commits; i++) { + struct commit *odb_commit; + + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); + + odb_commit = (struct commit *)create_object(cur_oid.hash, alloc_commit_node()); + if (parse_commit_internal(odb_commit, 0, 0)) { + graph_report("failed to parse %s from object database", +oid_to_hex(_oid)); + continue; + } + } + return verify_commit_graph_error; } diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index c050ef980b..996a016239 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -247,6 +247,7 @@ test_expect_success 'git commit-graph verify' ' git commit-graph verify >output ' +NUM_COMMITS=9 HASH_LEN=20 GRAPH_BYTE_VERSION=4 GRAPH_BYTE_HASH=5 @@ -265,6 +266,7 @@ GRAPH_BYTE_FANOUT1=`expr $GRAPH_FANOUT_OFFSET + 4 \* 4` GRAPH_BYTE_FANOUT2=`expr $GRAPH_FANOUT_OFFSET + 4 \* 255` GRAPH_OID_LOOKUP_OFFSET=`expr $GRAPH_FANOUT_OFFSET + 4 \* 256` GRAPH_BYTE_OID_LOOKUP_ORDER=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 8` +GRAPH_BYTE_OID_LOOKUP_MISSING=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 4 + 10` # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -334,4 +336,9 @@ test_expect_success 'detect incorrect OID order' ' "incorrect OID order" ' +test_expect_success 'detect OID not in object database' ' + corrupt_graph_and_verify $GRAPH_BYTE_OID_LOOKUP_MISSING "\01" \ + "from object database" +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 19/20] gc: automatically write commit-graph files
The commit-graph file is a very helpful feature for speeding up git operations. In order to make it more useful, write the commit-graph file by default during standard garbage collection operations. Add a 'gc.commitGraph' config setting that triggers writing a commit-graph file after any non-trivial 'git gc' command. Defaults to false while the commit-graph feature matures. We specifically do not want to turn this on by default until the commit-graph feature is fully integrated with history-modifying features like shallow clones. Signed-off-by: Derrick Stolee--- Documentation/config.txt | 6 ++ Documentation/git-gc.txt | 4 builtin/gc.c | 6 ++ t/t5318-commit-graph.sh | 14 ++ 4 files changed, 30 insertions(+) diff --git a/Documentation/config.txt b/Documentation/config.txt index 11f027194e..9a3abd87e7 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1553,6 +1553,12 @@ gc.autoDetach:: Make `git gc --auto` return immediately and run in background if the system supports it. Default is true. +gc.commitGraph:: + If true, then gc will rewrite the commit-graph file after any + change to the object database. If '--auto' is used, then the + commit-graph will not be updated unless the threshold is met. + See linkgit:git-commit-graph[1] for details. + gc.logExpiry:: If the file gc.log exists, then `git gc --auto` won't run unless that file is more than 'gc.logExpiry' old. Default is diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt index 571b5a7e3c..17dd654a59 100644 --- a/Documentation/git-gc.txt +++ b/Documentation/git-gc.txt @@ -119,6 +119,10 @@ The optional configuration variable `gc.packRefs` determines if it within all non-bare repos or it can be set to a boolean value. This defaults to true. +The optional configuration variable 'gc.commitGraph' determines if +'git gc' runs 'git commit-graph write'. This can be set to a boolean +value. This defaults to false. + The optional configuration variable `gc.aggressiveWindow` controls how much time is spent optimizing the delta compression of the objects in the repository when the --aggressive option is specified. The larger diff --git a/builtin/gc.c b/builtin/gc.c index 77fa720bd0..efd214a59f 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -20,6 +20,7 @@ #include "argv-array.h" #include "commit.h" #include "packfile.h" +#include "commit-graph.h" #define FAILED_RUN "failed to run %s" @@ -34,6 +35,7 @@ static int aggressive_depth = 50; static int aggressive_window = 250; static int gc_auto_threshold = 6700; static int gc_auto_pack_limit = 50; +static int gc_commit_graph = 0; static int detach_auto = 1; static timestamp_t gc_log_expire_time; static const char *gc_log_expire = "1.day.ago"; @@ -121,6 +123,7 @@ static void gc_config(void) git_config_get_int("gc.aggressivedepth", _depth); git_config_get_int("gc.auto", _auto_threshold); git_config_get_int("gc.autopacklimit", _auto_pack_limit); + git_config_get_bool("gc.commitgraph", _commit_graph); git_config_get_bool("gc.autodetach", _auto); git_config_get_expiry("gc.pruneexpire", _expire); git_config_get_expiry("gc.worktreepruneexpire", _worktrees_expire); @@ -480,6 +483,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix) if (pack_garbage.nr > 0) clean_pack_garbage(); + if (gc_commit_graph) + write_commit_graph_reachable(get_object_directory(), 0); + if (auto_gc && too_many_loose_objects()) warning(_("There are too many unreachable loose objects; " "run 'git prune' to remove them.")); diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index a659620332..d20b17586f 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -245,6 +245,20 @@ test_expect_success 'perform fast-forward merge in full repo' ' test_cmp expect output ' +test_expect_success 'check that gc clears commit-graph' ' + cd "$TRASH_DIRECTORY/full" && + git commit --allow-empty -m "blank" && + git commit-graph write --reachable && + cp $objdir/info/commit-graph commit-graph-before-gc && + git reset --hard HEAD~1 && + git config gc.commitGraph true && + git gc && + cp $objdir/info/commit-graph commit-graph-after-gc && + ! test_cmp commit-graph-before-gc commit-graph-after-gc && + git commit-graph write --reachable && + test_cmp commit-graph-after-gc $objdir/info/commit-graph +' + # the verify tests below expect the commit-graph to contain # exactly the commits reachable from the commits/8 branch. # If the file changes the set of commits in the list, then the -- 2.16.2.329.gfb62395de6
[PATCH v3 09/20] commit-graph: verify corrupt OID fanout and lookup
In the commit-graph file, the OID fanout chunk provides an index into the OID lookup. The 'verify' subcommand should find incorrect values in the fanout. Similarly, the 'verify' subcommand should find out-of-order values in the OID lookup. Signed-off-by: Derrick Stolee--- commit-graph.c | 36 t/t5318-commit-graph.sh | 22 ++ 2 files changed, 58 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 06e3e4f9ba..cbd1aae514 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -855,6 +855,9 @@ static void graph_report(const char *fmt, ...) int verify_commit_graph(struct commit_graph *g) { + uint32_t i, cur_fanout_pos = 0; + struct object_id prev_oid, cur_oid; + if (!g) { graph_report("no commit-graph file loaded"); return 1; @@ -869,5 +872,38 @@ int verify_commit_graph(struct commit_graph *g) if (!g->chunk_commit_data) graph_report("commit-graph is missing the Commit Data chunk"); + if (verify_commit_graph_error) + return verify_commit_graph_error; + + for (i = 0; i < g->num_commits; i++) { + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); + + if (i && oidcmp(_oid, _oid) >= 0) + graph_report("commit-graph has incorrect OID order: %s then %s", +oid_to_hex(_oid), +oid_to_hex(_oid)); + + oidcpy(_oid, _oid); + + while (cur_oid.hash[0] > cur_fanout_pos) { + uint32_t fanout_value = get_be32(g->chunk_oid_fanout + cur_fanout_pos); + if (i != fanout_value) + graph_report("commit-graph has incorrect fanout value: fanout[%d] = %u != %u", +cur_fanout_pos, fanout_value, i); + + cur_fanout_pos++; + } + } + + while (cur_fanout_pos < 256) { + uint32_t fanout_value = get_be32(g->chunk_oid_fanout + cur_fanout_pos); + + if (g->num_commits != fanout_value) + graph_report("commit-graph has incorrect fanout value: fanout[%d] = %u != %u", +cur_fanout_pos, fanout_value, i); + + cur_fanout_pos++; + } + return verify_commit_graph_error; } diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 4ef3fe3dc2..c050ef980b 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -247,6 +247,7 @@ test_expect_success 'git commit-graph verify' ' git commit-graph verify >output ' +HASH_LEN=20 GRAPH_BYTE_VERSION=4 GRAPH_BYTE_HASH=5 GRAPH_BYTE_CHUNK_COUNT=6 @@ -258,6 +259,12 @@ GRAPH_BYTE_OID_LOOKUP_ID=`expr $GRAPH_CHUNK_LOOKUP_OFFSET + \ 1 \* $GRAPH_CHUNK_LOOKUP_WIDTH` GRAPH_BYTE_COMMIT_DATA_ID=`expr $GRAPH_CHUNK_LOOKUP_OFFSET + \ 2 \* $GRAPH_CHUNK_LOOKUP_WIDTH` +GRAPH_FANOUT_OFFSET=`expr $GRAPH_CHUNK_LOOKUP_OFFSET + \ + $GRAPH_CHUNK_LOOKUP_WIDTH \* $GRAPH_CHUNK_LOOKUP_ROWS` +GRAPH_BYTE_FANOUT1=`expr $GRAPH_FANOUT_OFFSET + 4 \* 4` +GRAPH_BYTE_FANOUT2=`expr $GRAPH_FANOUT_OFFSET + 4 \* 255` +GRAPH_OID_LOOKUP_OFFSET=`expr $GRAPH_FANOUT_OFFSET + 4 \* 256` +GRAPH_BYTE_OID_LOOKUP_ORDER=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 8` # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -312,4 +319,19 @@ test_expect_success 'detect missing commit data chunk' ' "missing the Commit Data chunk" ' +test_expect_success 'detect incorrect fanout' ' + corrupt_graph_and_verify $GRAPH_BYTE_FANOUT1 "\01" \ + "fanout value" +' + +test_expect_success 'detect incorrect fanout' ' + corrupt_graph_and_verify $GRAPH_BYTE_FANOUT2 "\01" \ + "fanout value" +' + +test_expect_success 'detect incorrect OID order' ' + corrupt_graph_and_verify $GRAPH_BYTE_OID_LOOKUP_ORDER "\01" \ + "incorrect OID order" +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 14/20] commit-graph: verify commit date
Signed-off-by: Derrick Stolee--- commit-graph.c | 6 ++ t/t5318-commit-graph.sh | 6 ++ 2 files changed, 12 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index ead92460c1..d2b291aca2 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -981,6 +981,12 @@ int verify_commit_graph(struct commit_graph *g) oid_to_hex(_oid), graph_commit->generation, max_generation + 1); + + if (graph_commit->date != odb_commit->date) + graph_report("commit date for commit %s in commit-graph is %"PRItime" != %"PRItime, +oid_to_hex(_oid), +graph_commit->date, +odb_commit->date); } return verify_commit_graph_error; diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 673b0d37d5..58adb8246d 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -273,6 +273,7 @@ GRAPH_BYTE_COMMIT_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN` GRAPH_BYTE_COMMIT_EXTRA_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 4` GRAPH_BYTE_COMMIT_WRONG_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 3` GRAPH_BYTE_COMMIT_GENERATION=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 8` +GRAPH_BYTE_COMMIT_DATE=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 12` # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -372,4 +373,9 @@ test_expect_success 'detect incorrect generation number' ' "generation" ' +test_expect_success 'detect incorrect commit date' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_DATE "\01" \ + "commit date" +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 11/20] commit-graph: verify root tree OIDs
The 'verify' subcommand must compare the commit content parsed from the commit-graph and compare it against the content in the object database. Use lookup_commit() and parse_commit_in_graph_one() to parse the commits from the graph and compare against a commit that is loaded separately and parsed directly from the object database. Add checks for the root tree OID. Signed-off-by: Derrick Stolee--- commit-graph.c | 17 - t/t5318-commit-graph.sh | 7 +++ 2 files changed, 23 insertions(+), 1 deletion(-) diff --git a/commit-graph.c b/commit-graph.c index 0420ebcd87..19ea369fc6 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -880,6 +880,8 @@ int verify_commit_graph(struct commit_graph *g) return verify_commit_graph_error; for (i = 0; i < g->num_commits; i++) { + struct commit *graph_commit; + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); if (i && oidcmp(_oid, _oid) >= 0) @@ -897,6 +899,11 @@ int verify_commit_graph(struct commit_graph *g) cur_fanout_pos++; } + + graph_commit = lookup_commit(_oid); + if (!parse_commit_in_graph_one(g, graph_commit)) + graph_report("failed to parse %s from commit-graph", +oid_to_hex(_oid)); } while (cur_fanout_pos < 256) { @@ -913,16 +920,24 @@ int verify_commit_graph(struct commit_graph *g) return verify_commit_graph_error; for (i = 0; i < g->num_commits; i++) { - struct commit *odb_commit; + struct commit *graph_commit, *odb_commit; hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); + graph_commit = lookup_commit(_oid); odb_commit = (struct commit *)create_object(cur_oid.hash, alloc_commit_node()); if (parse_commit_internal(odb_commit, 0, 0)) { graph_report("failed to parse %s from object database", oid_to_hex(_oid)); continue; } + + if (oidcmp(_commit_tree_in_graph_one(g, graph_commit)->object.oid, + get_commit_tree_oid(odb_commit))) + graph_report("root tree OID for commit %s in commit-graph is %s != %s", +oid_to_hex(_oid), + oid_to_hex(get_commit_tree_oid(graph_commit)), + oid_to_hex(get_commit_tree_oid(odb_commit))); } return verify_commit_graph_error; diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 996a016239..21cc8e82f3 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -267,6 +267,8 @@ GRAPH_BYTE_FANOUT2=`expr $GRAPH_FANOUT_OFFSET + 4 \* 255` GRAPH_OID_LOOKUP_OFFSET=`expr $GRAPH_FANOUT_OFFSET + 4 \* 256` GRAPH_BYTE_OID_LOOKUP_ORDER=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 8` GRAPH_BYTE_OID_LOOKUP_MISSING=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 4 + 10` +GRAPH_COMMIT_DATA_OFFSET=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* $NUM_COMMITS` +GRAPH_BYTE_COMMIT_TREE=$GRAPH_COMMIT_DATA_OFFSET # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -341,4 +343,9 @@ test_expect_success 'detect OID not in object database' ' "from object database" ' +test_expect_success 'detect incorrect tree OID' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_TREE "\01" \ + "root tree OID for commit" +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 12/20] commit-graph: verify parent list
The commit-graph file stores parents in a two-column portion of the commit data chunk. If there is only one parent, then the second column stores 0x to indicate no second parent. The 'verify' subcommand checks the parent list for the commit loaded from the commit-graph and the one parsed from the object database. Test these checks for corrupt parents, too many parents, and wrong parents. The octopus merge will be tested in a later commit. Signed-off-by: Derrick Stolee--- commit-graph.c | 25 + t/t5318-commit-graph.sh | 18 ++ 2 files changed, 43 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 19ea369fc6..fff22dc0c3 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -921,6 +921,7 @@ int verify_commit_graph(struct commit_graph *g) for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit, *odb_commit; + struct commit_list *graph_parents, *odb_parents; hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); @@ -938,6 +939,30 @@ int verify_commit_graph(struct commit_graph *g) oid_to_hex(_oid), oid_to_hex(get_commit_tree_oid(graph_commit)), oid_to_hex(get_commit_tree_oid(odb_commit))); + + graph_parents = graph_commit->parents; + odb_parents = odb_commit->parents; + + while (graph_parents) { + if (odb_parents == NULL) { + graph_report("commit-graph parent list for commit %s is too long", +oid_to_hex(_oid)); + break; + } + + if (oidcmp(_parents->item->object.oid, _parents->item->object.oid)) + graph_report("commit-graph parent for %s is %s != %s", +oid_to_hex(_oid), + oid_to_hex(_parents->item->object.oid), + oid_to_hex(_parents->item->object.oid)); + + graph_parents = graph_parents->next; + odb_parents = odb_parents->next; + } + + if (odb_parents != NULL) + graph_report("commit-graph parent list for commit %s terminates early", +oid_to_hex(_oid)); } return verify_commit_graph_error; diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 21cc8e82f3..12f0d7f54d 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -269,6 +269,9 @@ GRAPH_BYTE_OID_LOOKUP_ORDER=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 8` GRAPH_BYTE_OID_LOOKUP_MISSING=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 4 + 10` GRAPH_COMMIT_DATA_OFFSET=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* $NUM_COMMITS` GRAPH_BYTE_COMMIT_TREE=$GRAPH_COMMIT_DATA_OFFSET +GRAPH_BYTE_COMMIT_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN` +GRAPH_BYTE_COMMIT_EXTRA_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 4` +GRAPH_BYTE_COMMIT_WRONG_PARENT=`expr $GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 3` # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -348,4 +351,19 @@ test_expect_success 'detect incorrect tree OID' ' "root tree OID for commit" ' +test_expect_success 'detect incorrect parent int-id' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_PARENT "\01" \ + "invalid parent" +' + +test_expect_success 'detect extra parent int-id' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_EXTRA_PARENT "\00" \ + "is too long" +' + +test_expect_success 'detect incorrect tree OID' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_WRONG_PARENT "\01" \ + "commit-graph parent for" +' + test_done -- 2.16.2.329.gfb62395de6
[PATCH v3 01/20] commit-graph: UNLEAK before die()
Signed-off-by: Derrick Stolee--- builtin/commit-graph.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index 37420ae0fd..f0875b8bf3 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -51,8 +51,11 @@ static int graph_read(int argc, const char **argv) graph_name = get_commit_graph_filename(opts.obj_dir); graph = load_commit_graph_one(graph_name); - if (!graph) + if (!graph) { + UNLEAK(graph_name); die("graph file %s does not exist", graph_name); + } + FREE_AND_NULL(graph_name); printf("header: %08x %d %d %d %d\n", -- 2.16.2.329.gfb62395de6
[PATCH v3 00/20] Integrate commit-graph into 'fsck' and 'gc'
Thanks for all the feedback on v2. I've tried to make this round's review a bit easier by splitting up the commits into smaller pieces. Also, the test script now has less boilerplate and uses variables and clear arithmetic to explain which bytes are being modified. One other change worth mentioning: in "commit-graph: add '--reachable' option" I put the ref-iteration into a new external 'write_commit_graph_reachable()' method inside commit-graph.c. This makes the 'gc: automatically write commit-graph files' a simpler change. Thanks, -Stolee Derrick Stolee (20): commit-graph: UNLEAK before die() commit-graph: fix GRAPH_MIN_SIZE commit-graph: parse commit from chosen graph commit: force commit to parse from object database commit-graph: load a root tree from specific graph commit-graph: add 'verify' subcommand commit-graph: verify catches corrupt signature commit-graph: verify required chunks are present commit-graph: verify corrupt OID fanout and lookup commit-graph: verify objects exist commit-graph: verify root tree OIDs commit-graph: verify parent list commit-graph: verify generation number commit-graph: verify commit date commit-graph: test for corrupted octopus edge commit-graph: verify contents match checksum fsck: verify commit-graph commit-graph: add '--reachable' option gc: automatically write commit-graph files commit-graph: update design document Documentation/config.txt | 6 + Documentation/git-commit-graph.txt | 14 +- Documentation/git-fsck.txt | 3 + Documentation/git-gc.txt | 4 + Documentation/technical/commit-graph.txt | 22 --- builtin/commit-graph.c | 59 +++- builtin/fsck.c | 21 +++ builtin/gc.c | 6 + commit-graph.c | 234 +-- commit-graph.h | 3 + commit.c | 9 +- commit.h | 1 + t/t5318-commit-graph.sh | 196 ++ 13 files changed, 539 insertions(+), 39 deletions(-) base-commit: 34fdd433396ee0e3ef4de02eb2189f8226eafe4e -- 2.16.2.329.gfb62395de6
[PATCH v3 06/20] commit-graph: add 'verify' subcommand
If the commit-graph file becomes corrupt, we need a way to verify that its contents match the object database. In the manner of 'git fsck' we will implement a 'git commit-graph verify' subcommand to report all issues with the file. Add the 'verify' subcommand to the 'commit-graph' builtin and its documentation. The subcommand is currently a no-op except for loading the commit-graph into memory, which may trigger run-time errors that would be caught by normal use. Add a simple test that ensures the command returns a zero error code. If no commit-graph file exists, this is an acceptable state. Do not report any errors. Signed-off-by: Derrick Stolee--- Documentation/git-commit-graph.txt | 6 ++ builtin/commit-graph.c | 38 ++ commit-graph.c | 26 ++ commit-graph.h | 2 ++ t/t5318-commit-graph.sh| 10 ++ 5 files changed, 82 insertions(+) diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt index 4c97b555cc..a222cfab08 100644 --- a/Documentation/git-commit-graph.txt +++ b/Documentation/git-commit-graph.txt @@ -10,6 +10,7 @@ SYNOPSIS [verse] 'git commit-graph read' [--object-dir ] +'git commit-graph verify' [--object-dir ] 'git commit-graph write' [--object-dir ] @@ -52,6 +53,11 @@ existing commit-graph file. Read a graph file given by the commit-graph file and output basic details about the graph file. Used for debugging purposes. +'verify':: + +Read the commit-graph file and verify its contents against the object +database. Used to check for corrupted data. + EXAMPLES diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index f0875b8bf3..0433dd6e20 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -8,10 +8,16 @@ static char const * const builtin_commit_graph_usage[] = { N_("git commit-graph [--object-dir ]"), N_("git commit-graph read [--object-dir ]"), + N_("git commit-graph verify [--object-dir ]"), N_("git commit-graph write [--object-dir ] [--append] [--stdin-packs|--stdin-commits]"), NULL }; +static const char * const builtin_commit_graph_verify_usage[] = { + N_("git commit-graph verify [--object-dir ]"), + NULL +}; + static const char * const builtin_commit_graph_read_usage[] = { N_("git commit-graph read [--object-dir ]"), NULL @@ -29,6 +35,36 @@ static struct opts_commit_graph { int append; } opts; + +static int graph_verify(int argc, const char **argv) +{ + struct commit_graph *graph = 0; + char *graph_name; + + static struct option builtin_commit_graph_verify_options[] = { + OPT_STRING(0, "object-dir", _dir, + N_("dir"), + N_("The object directory to store the graph")), + OPT_END(), + }; + + argc = parse_options(argc, argv, NULL, +builtin_commit_graph_verify_options, +builtin_commit_graph_verify_usage, 0); + + if (!opts.obj_dir) + opts.obj_dir = get_object_directory(); + + graph_name = get_commit_graph_filename(opts.obj_dir); + graph = load_commit_graph_one(graph_name); + FREE_AND_NULL(graph_name); + + if (!graph) + return 0; + + return verify_commit_graph(graph); +} + static int graph_read(int argc, const char **argv) { struct commit_graph *graph = NULL; @@ -163,6 +199,8 @@ int cmd_commit_graph(int argc, const char **argv, const char *prefix) PARSE_OPT_STOP_AT_NON_OPTION); if (argc > 0) { + if (!strcmp(argv[0], "verify")) + return graph_verify(argc, argv); if (!strcmp(argv[0], "read")) return graph_read(argc, argv); if (!strcmp(argv[0], "write")) diff --git a/commit-graph.c b/commit-graph.c index 25893ec096..55b41664ee 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -836,3 +836,29 @@ void write_commit_graph(const char *obj_dir, oids.alloc = 0; oids.nr = 0; } + +static int verify_commit_graph_error; + +static void graph_report(const char *fmt, ...) +{ + va_list ap; + struct strbuf sb = STRBUF_INIT; + verify_commit_graph_error = 1; + + va_start(ap, fmt); + strbuf_vaddf(, fmt, ap); + + fprintf(stderr, "%s\n", sb.buf); + strbuf_release(); + va_end(ap); +} + +int verify_commit_graph(struct commit_graph *g) +{ + if (!g) { + graph_report("no commit-graph file loaded"); + return 1; + } + + return verify_commit_graph_error; +} diff --git a/commit-graph.h b/commit-graph.h index 96cccb10f3..71a39c5a57 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -53,4 +53,6 @@ void
[PATCH v3 04/20] commit: force commit to parse from object database
In anticipation of verifying commit-graph file contents against the object database, create parse_commit_internal() to allow side-stepping the commit-graph file and parse directly from the object database. Due to the use of generation numbers, this method should not be called unless the intention is explicit in avoiding commits from the commit-graph file. Signed-off-by: Derrick Stolee--- commit.c | 9 +++-- commit.h | 1 + 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/commit.c b/commit.c index 1d28677dfb..6eaed0174c 100644 --- a/commit.c +++ b/commit.c @@ -392,7 +392,7 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long s return 0; } -int parse_commit_gently(struct commit *item, int quiet_on_missing) +int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph) { enum object_type type; void *buffer; @@ -403,7 +403,7 @@ int parse_commit_gently(struct commit *item, int quiet_on_missing) return -1; if (item->object.parsed) return 0; - if (parse_commit_in_graph(item)) + if (use_commit_graph && parse_commit_in_graph(item)) return 0; buffer = read_sha1_file(item->object.oid.hash, , ); if (!buffer) @@ -424,6 +424,11 @@ int parse_commit_gently(struct commit *item, int quiet_on_missing) return ret; } +int parse_commit_gently(struct commit *item, int quiet_on_missing) +{ + return parse_commit_internal(item, quiet_on_missing, 1); +} + void parse_commit_or_die(struct commit *item) { if (parse_commit(item)) diff --git a/commit.h b/commit.h index b5afde1ae9..5fde74fcd7 100644 --- a/commit.h +++ b/commit.h @@ -73,6 +73,7 @@ struct commit *lookup_commit_reference_by_name(const char *name); struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref_name); int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long size, int check_graph); +int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph); int parse_commit_gently(struct commit *item, int quiet_on_missing); static inline int parse_commit(struct commit *item) { -- 2.16.2.329.gfb62395de6
[PATCH v3 03/20] commit-graph: parse commit from chosen graph
Before verifying a commit-graph file against the object database, we need to parse all commits from the given commit-graph file. Create parse_commit_in_graph_one() to target a given struct commit_graph. Signed-off-by: Derrick Stolee--- commit-graph.c | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 82295f0975..78ba0edc80 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -310,7 +310,7 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin } } -int parse_commit_in_graph(struct commit *item) +static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item) { uint32_t pos; @@ -318,9 +318,21 @@ int parse_commit_in_graph(struct commit *item) return 0; if (item->object.parsed) return 1; + + if (find_commit_in_graph(item, g, )) + return fill_commit_in_graph(item, g, pos); + + return 0; +} + +int parse_commit_in_graph(struct commit *item) +{ + if (!core_commit_graph) + return 0; + prepare_commit_graph(); - if (commit_graph && find_commit_in_graph(item, commit_graph, )) - return fill_commit_in_graph(item, commit_graph, pos); + if (commit_graph) + return parse_commit_in_graph_one(commit_graph, item); return 0; } -- 2.16.2.329.gfb62395de6
[PATCH v3 07/20] commit-graph: verify catches corrupt signature
This is the first of several commits that add a test to check that 'git commit-graph verify' catches corruption in the commit-graph file. The first test checks that the command catches an error in the file signature. This is a check that exists in the existing commit-graph reading code. Add a helper method 'corrupt_graph_and_verify' to the test script t5318-commit-graph.sh. This helper corrupts the commit-graph file at a certain location, runs 'git commit-graph verify', and reports the output to the 'err' file. This data is filtered to remove the lines added by 'test_must_fail' when the test is run verbosely. Then, the output is checked to contain a specific error message. Signed-off-by: Derrick Stolee--- t/t5318-commit-graph.sh | 43 +++ 1 file changed, 43 insertions(+) diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 6ca451dfd2..bd64481c7a 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -235,9 +235,52 @@ test_expect_success 'perform fast-forward merge in full repo' ' test_cmp expect output ' +# the verify tests below expect the commit-graph to contain +# exactly the commits reachable from the commits/8 branch. +# If the file changes the set of commits in the list, then the +# offsets into the binary file will result in different edits +# and the tests will likely break. + test_expect_success 'git commit-graph verify' ' cd "$TRASH_DIRECTORY/full" && + git rev-parse commits/8 | git commit-graph write --stdin-commits && git commit-graph verify >output ' +GRAPH_BYTE_VERSION=4 +GRAPH_BYTE_HASH=5 + +# usage: corrupt_graph_and_verify +# Manipulates the commit-graph file at the position +# by inserting the data, then runs 'git commit-graph verify' +# and places the output in the file 'err'. Test 'err' for +# the given string. +corrupt_graph_and_verify() { + pos=$1 + data="${2:-\0}" + grepstr=$3 + cd "$TRASH_DIRECTORY/full" && + test_when_finished mv commit-graph-backup $objdir/info/commit-graph && + cp $objdir/info/commit-graph commit-graph-backup && + printf "$data" | dd of="$objdir/info/commit-graph" bs=1 seek="$pos" conv=notrunc && + test_must_fail git commit-graph verify 2>test_err && + grep -v "^+" test_err >err + grep "$grepstr" err +} + +test_expect_success 'detect bad signature' ' + corrupt_graph_and_verify 0 "\0" \ + "graph signature" +' + +test_expect_success 'detect bad version' ' + corrupt_graph_and_verify $GRAPH_BYTE_VERSION "\02" \ + "graph version" +' + +test_expect_success 'detect bad hash version' ' + corrupt_graph_and_verify $GRAPH_BYTE_HASH "\02" \ + "hash version" +' + test_done -- 2.16.2.329.gfb62395de6
Re: BUG: No way to set fsck. when cloning
On Thu, May 24, 2018 at 05:25:29PM +0200, Ævar Arnfjörð Bjarmason wrote: > When I do: > > git -c fetch.fsckObjects=true clone > g...@github.com:robbyrussell/oh-my-zsh.git > > I get: > > error: object 2b7227859263b6aabcc28355b0b994995b7148b6: > zeroPaddedFilemode: contains zero-padded file modes > fatal: Error in object > fatal: index-pack failed > > The docs (https://git-scm.com/docs/git-config#git-config-fsckltmsg-idgt) > say you can override this with -c fsck.zeroPaddedFilemode = ignore, but > I see in builtin/fsck.c that just fsck_config() knows about this, and > indeed this works *after* you clone the repo when you use 'git fsck'. > > I don't have time to fix this now, but what's the best approach here? > Make all the relevant commands copy what fsck_config() is doing, or > should fsck_object() be lazily looking up this config by itself? Apparently someone reported this earlier[0]. Johannes replied: > Well, you can apparently have your cake and eat it too (see > https://git-scm.com/docs/git-config#git-config-receivefsckltmsg-idgt): > > receive.fsck.:: > When `receive.fsckObjects` is set to true, errors can be switched > to warnings and vice versa by configuring the `receive.fsck.` > setting where the `` is the fsck message ID and the value > is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes > the error/warning with the message ID, e.g. "missingEmail: invalid > author/committer line - missing email" means that setting > `receive.fsck.missingEmail = ignore` will hide that issue. > > In your case, use receive.fsck.zeroPaddedFilemode=ignore=warn (or > =ignore). [0]https://public-inbox.org/git/alpine.deb.2.21.1.1801042125430...@minint-6bku6qn.europe.corp.microsoft.com/ Hope this helps, Kevin.
BUG: No way to set fsck. when cloning
When I do: git -c fetch.fsckObjects=true clone g...@github.com:robbyrussell/oh-my-zsh.git I get: error: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: contains zero-padded file modes fatal: Error in object fatal: index-pack failed The docs (https://git-scm.com/docs/git-config#git-config-fsckltmsg-idgt) say you can override this with -c fsck.zeroPaddedFilemode = ignore, but I see in builtin/fsck.c that just fsck_config() knows about this, and indeed this works *after* you clone the repo when you use 'git fsck'. I don't have time to fix this now, but what's the best approach here? Make all the relevant commands copy what fsck_config() is doing, or should fsck_object() be lazily looking up this config by itself?
Re: [PATCH v2] rev-parse: check lookup'ed commit references for NULL
On 05/24/2018 04:04 PM, Todd Zullinger wrote: I added Florian to Cc, in case he wants to provide a preferred address. Sorry, using this address is fine. Thanks, Florian
jk/branch-l-0-deprecation (was Re: What's cooking in git.git (May 2018, #02; Thu, 17))
On Thursday 17 May 2018 07:06 PM, Jeff King wrote: > But because git-branch does not kick in the pager until later > (because it only wants to do it for list-mode), that happens _after_ > we've emitted the message. > I observe exactly the consequence of this behaviour. First, the error is emitted and then the pager kicks in to list the branches. > On the other hand, I'm not sure this is that big a deal. The point of > the deprecation warning is to catch people who are actually trying to > use "-l" as "--create-reflog", and that case does not page. The people > doing "git branch -l" are actually getting what they want eventually, > which is to turn it into "--list". In the interim step where it becomes > an unknown option, they'll get a hard error. I just thought we wouldn't want to surprise/confuse users who try to use "git branch -l" with the warning message about "create reflog" along-side the list of branches. That would just add to the confusion. So, I thought we should error out when users do "git branch -l" instead. Something like the following should help us prevent "git branch -l" from listing branch names and might also prevent the confusion. diff --git a/builtin/branch.c b/builtin/branch.c index 452742fec..f3c5181bb 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -672,7 +672,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, options, builtin_branch_usage, 0); - if (!delete && !rename && !copy && !edit_description && !new_upstream && !unset_upstream && argc == 0) + if (!delete && !rename && !copy && !edit_description && !new_upstream && !unset_upstream && !reflog && argc == 0) list = 1; if (filter.with_commit || filter.merge != REF_FILTER_MERGED_NONE || filter.points_at.nr || -- Kaartic signature.asc Description: This is a digitally signed message part
RE: [PATCH v3 2/7] block alloc: add lifecycle APIs for cache_entry structs
> -Original Message- > From: Junio C HamanoOn Behalf Of Junio C Hamano > Sent: Thursday, May 24, 2018 12:52 AM > To: Jameson Miller > Cc: git@vger.kernel.org; pclo...@gmail.com; jonathanta...@google.com; > sbel...@google.com; peart...@gmail.com > Subject: Re: [PATCH v3 2/7] block alloc: add lifecycle APIs for cache_entry > structs > > Jameson Miller writes: > > > Add an API around managing the lifetime of cache_entry structs. > > Abstracting memory management details behind an API will allow for > > alternative memory management strategies without affecting all the > > call sites. This commit does not change how memory is allocated / > > freed. A later commit in this series will allocate cache entries from > > memory pools as appropriate. > > > > Motivation: > > It has been observed that the time spent loading an index with a large > > number of entries is partly dominated by malloc() calls. This change > > is in preparation for using memory pools to reduce the number of > > malloc() calls made when loading an index. > > > > This API makes a distinction between cache entries that are intended > > for use with a particular index and cache entries that are not. This > > enables us to use the knowledge about how a cache entry will be used > > to make informed decisions about how to handle the corresponding > > memory. > > Yuck. make_index_cache_entry()? > > Generally we use "cache" when working on the_index without passing istate, > and otherwise "index", which means that readers can assume that > distim_cache_entry(...)" is a shorter and more limited way to say > "distim_index_entry(_index, ...)". Having both index and cache in the > same > name smells crazy. > > If most of the alocations are for permanent kind, give it a shorter name call > it > make_cache_entry(_index, ...), and call the other non-permanent one with > a longer and more cumbersome name, perhaps > make_transient_cache_entry(...). Avoid saying "index" in the former name, as > the design decision this series is making to allocate memory for a cache-entry > from a pool associated to an index_state is already seen by what its first > parameter is. I like this suggestion - I will make this change in the next version of this series. > > > diff --git a/cache.h b/cache.h > > index f0a407602c..204f788438 100644 > > --- a/cache.h > > +++ b/cache.h > > @@ -339,6 +339,29 @@ extern void remove_name_hash(struct index_state > > *istate, struct cache_entry *ce) extern void free_name_hash(struct > > index_state *istate); > > > > > > +/* Cache entry creation and freeing */ > > + > > +/* > > + * Create cache_entry intended for use in the specified index. Caller > > + * is responsible for discarding the cache_entry with > > + * `discard_cache_entry`. > > + */ > > +extern struct cache_entry *make_index_cache_entry(struct index_state > > +*istate, unsigned int mode, const unsigned char *sha1, const char > > +*path, int stage, unsigned int refresh_options); extern struct > > +cache_entry *make_empty_index_cache_entry(struct index_state *istate, > > +size_t name_len); > > + > > +/* > > + * Create a cache_entry that is not intended to be added to an index. > > + * Caller is responsible for discarding the cache_entry > > + * with `discard_cache_entry`. > > + */ > > +extern struct cache_entry *make_transient_cache_entry(unsigned int > > +mode, const unsigned char *sha1, const char *path, int stage); extern > > +struct cache_entry *make_empty_transient_cache_entry(size_t > > +name_len); > > + > > +/* > > + * Discard cache entry. > > + */ > > +void discard_cache_entry(struct cache_entry *ce); > > I am not yet convinced that it is a good idea to require each istate to hold a > separate pool. Anything that uses unpack_trees() can do "starting from this > src_index, perform various mergy operations and deposit the result in > dst_index". Sometimes the two logical indices point at the same istate, > sometimes different. When src and dst are different istates, the code that > used > to simply add another pointer to the same ce to the dst index now needs to > duplicate it out of the pool associated with dst? I did not see any instances in unpack_trees() where it copied just the cache_entry pointer from src to dst, but I will check again. You are correct, all the cache_entries need to be duplicated before being added to the destination index, which is what I think the code already does. We tried to make this more explicity by converting the inline xcalloc/memcpy instances to an actual function. In the existing code (before this patch series), the index implicitly "owns" its cache_entry instances. This can be seen in the discard_index function, where the index will discard any cache entries that it has a reference to (that are not contained in the split index). If there is code that just copies the pointer to an unrelated index, then this cache entry would be freed when the
"man git-tag" inconsistent about whether you can tag non-commit objects
embarrassed to admit i had no idea that you could tag non-commit objects, only realized that when i was reading the man page and saw: SYNOPSIS git tag [-a | -s | -u ] [-f] [-m | -F ] [-e] [ | ] so i tried it and, sure enough, i could tag a blob object. but if you read further into DESCRIPTION, about halfway through, you read: "Otherwise just a tag reference for the SHA-1 object name of the commit object is created (i.e. a lightweight tag)." ^^ which suggests only commit objects. finally, much further down, under OPTIONS: ", The object that the new tag will refer to, usually a commit. so to clean this up, is it sufficient to just change that middle line to say "object" rather than "commit object"? or is there more in the man page that needs tweaking? rday -- Robert P. J. Day Ottawa, Ontario, CANADA http://crashcourse.ca/dokuwiki Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
RE: [PATCH v3 0/7] allocate cache entries from memory pool
> -Original Message- > From: Junio C HamanoOn Behalf Of Junio C Hamano > Sent: Thursday, May 24, 2018 12:55 AM > To: Jameson Miller > Cc: git@vger.kernel.org; pclo...@gmail.com; jonathanta...@google.com; > sbel...@google.com; peart...@gmail.com > Subject: Re: [PATCH v3 0/7] allocate cache entries from memory pool > > Jameson Miller writes: > > > Changes from V2: > > > > - Tweak logic of finding available memory block for memory > > allocation > > > > - Only search head block > > Hmph. Is that because we generally do not free() a lot so once a block is > filled, > there is not much chance that we have reclaimed space in the block later? > The design of the memory pool is that once the memory is claimed from the pool, it is not reused until the containing pool is discarded. Individual entries are not freed, only the entire memory pool is freed, and only after we are sure that there are no references to any of the entries in the pool. The memory pool design makes some tradeoffs. It is not meant to be completely replace malloc / free as a general purpose allocator, but rather used in scenarios where the benefit (faster allocations, lower bookkeeping overhead) is worth the tradeoffs (not able to free individual allocations). The access patterns around cache entries are well matched with the memory pool to get the benefits - the majority of cache entries are allocated up front when reading the index from disk, and are then discarded in bulk when the index is freed (if the index is freed at all (rather than just existing)). > > - Tweaked handling of large memory allocations. > > > > - Large blocks now tracked in same manner as "regular" > > blocks > > > > - Large blocks are placed at end of linked list of memory > > blocks > > If we are only carving out of the most recently allocated block, it seems that > there is no point looking for "the end", no? Right. If we are not searching the list, then there isn't any point in Appending odd large items to the end vs sticking it immediately past the head block. I will remove the usage of the tail pointer in the next version. Yes, this is true. I can remove the usage of the tail pointer here, as it is not really leveraged. I will make this change in the next version. > > > > - Cache_entry type gains notion of whether it was allocated > > from memory pool or not > > > > - Collapsed cache_entry discard logic into single > > function. This should make code easier to maintain > > That certainly should be safer to have a back-pointer pointing to which pool > each entry came from, but doesn't it result in memory bloat? Currently, entries claimed from a memory pool are not freed, so we only need to know whether the entry came from a memory pool or not. This has less memory impact than a full pointer but is also a bit more restrictive. We debated several approaches for what to do here and landed on using a simple bit for this rather than the full pointer. In the current code we use a full integer field for this, but we can convert this into a bit or bit field. The current flags word is full, so this would require a second flags field.
Re: [PATCH v2] rev-parse: check lookup'ed commit references for NULL
[Added Florian to Cc] Elijah Newren wrote: > Commits 2122f8b963d4 ("rev-parse: Add support for the ^! and ^@ syntax", > 2008-07-26) and 3dd4e7320d ("Teach rev-parse the ... syntax.", 2006-07-04) > taught rev-parse new syntax, and used lookup_commit_reference() as part of > their logic. Neither usage checked the returned commit to see if it was > non-NULL before using it. Check for NULL and ensure an appropriate error > is reported to the user. > > Reported by Florian Weimer and Todd Zullinger. > > Helped-by: Jeff King> Signed-off-by: Elijah Newren > --- The output is now much more consistent with other invalid input. The only (minor) difference I noticed was when using the fff...fff form. With exactly 40 chars, rev-parse prints both refs separately and then the full input string before the "fatal:" error. I doubt it's terribly important. # exactly 40 chars $ ./git-rev-parse ... ... fatal: ambiguous argument '...': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git [...] -- [...]' # not 40 chars $ ./git-rev-parse fff...fff fff...fff fatal: ambiguous argument 'fff...fff': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git [...] -- [...]' > I would have used a Reported-by tag for Florian and Todd, but looking at > the bugzilla.redhat.com bug report doesn't show me Florian's email > address. I grepped through git logs and found two associated with that > name, but didn't know if they were still accurate, or were a different > Florian. So I just went with the sentence instead. I added Florian to Cc, in case he wants to provide a preferred address. (The Red Hat Bugzilla only shows email addresses if you're logged in.) Thanks Elijah and Peff. > builtin/rev-parse.c | 8 ++-- > t/t6101-rev-parse-parents.sh | 8 > 2 files changed, 14 insertions(+), 2 deletions(-) > > diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c > index a1e680b5e9..a0a0ace38d 100644 > --- a/builtin/rev-parse.c > +++ b/builtin/rev-parse.c > @@ -282,6 +282,10 @@ static int try_difference(const char *arg) > struct commit *a, *b; > a = lookup_commit_reference(_oid); > b = lookup_commit_reference(_oid); > + if (!a || !b) { > + *dotdot = '.'; > + return 0; > + } > exclude = get_merge_bases(a, b); > while (exclude) { > struct commit *commit = pop_commit(); > @@ -328,12 +332,12 @@ static int try_parent_shorthands(const char *arg) > return 0; > > *dotdot = 0; > - if (get_oid_committish(arg, )) { > + if (get_oid_committish(arg, ) || > + !(commit = lookup_commit_reference())) { > *dotdot = '^'; > return 0; > } > > - commit = lookup_commit_reference(); > if (exclude_parent && > exclude_parent > commit_list_count(commit->parents)) { > *dotdot = '^'; > diff --git a/t/t6101-rev-parse-parents.sh b/t/t6101-rev-parse-parents.sh > index 8c617981a3..7683e4a114 100755 > --- a/t/t6101-rev-parse-parents.sh > +++ b/t/t6101-rev-parse-parents.sh > @@ -214,4 +214,12 @@ test_expect_success 'rev-list merge^-1x (garbage after > ^-1)' ' > test_must_fail git rev-list merge^-1x > ' > > +test_expect_success 'rev-parse $garbage^@ does not segfault' ' > + test_must_fail git rev-parse $EMPTY_TREE^@ > +' > + > +test_expect_success 'rev-parse $garbage...$garbage does not segfault' ' > + test_must_fail git rev-parse $EMPTY_TREE...$EMPTY_BLOB > +' > + > test_done -- Todd ~~ If the triangles were to make a God they would give him three sides. -- Montesquieu
why do "git log -h" and "git show -h" print the same thing?
maybe this is deliberate, but it's confusing that, with git 2.17.0, the output of both "git log -h" and "git show -h" is exactly the same: $ git log -h usage: git log [] [] [[--] ...] or: git show [] ... -q, --quiet suppress diff output --source show source --use-mailmap Use mail map file --decorate-refs only decorate refs that match --decorate-refs-exclude do not decorate refs that match --decorate[=...] decorate options -LProcess line range n,m in file, counting from 1 $ is that what's *supposed* to happen? rday -- Robert P. J. Day Ottawa, Ontario, CANADA http://crashcourse.ca/dokuwiki Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
Re: [GSoC][PATCH v3 0/4] rebase: split rebase -p from rebase -i
Hi, Le 24/05/2018 à 13:49, Alban Gruin a écrit : > Changes since v2: - Removing `mark_action_done()` from git-rebase--interactive.sh - Removing unused variables from git-rebase--interactive.sh Cheers, Alban
[GSoC][PATCH v3 1/4] rebase: duplicate git-rebase--interactive.sh to git-rebase--preserve-merges.sh
This duplicates git-rebase--interactive.sh to git-rebase--preserve-merges.sh. This is done to split -p from -i. No modifications are made to this file here, but any code that is not used by -p will be stripped in the next commit. Signed-off-by: Alban Gruin--- .gitignore |1 + Makefile |2 + git-rebase--preserve-merges.sh | 1069 3 files changed, 1072 insertions(+) create mode 100644 git-rebase--preserve-merges.sh diff --git a/.gitignore b/.gitignore index 833ef3b0b..ef4925485 100644 --- a/.gitignore +++ b/.gitignore @@ -117,6 +117,7 @@ /git-rebase--helper /git-rebase--interactive /git-rebase--merge +/git-rebase--preserve-merges /git-receive-pack /git-reflog /git-remote diff --git a/Makefile b/Makefile index 50da82b01..810a0d0f4 100644 --- a/Makefile +++ b/Makefile @@ -582,6 +582,7 @@ SCRIPT_LIB += git-mergetool--lib SCRIPT_LIB += git-parse-remote SCRIPT_LIB += git-rebase--am SCRIPT_LIB += git-rebase--interactive +SCRIPT_LIB += git-rebase--preserve-merges SCRIPT_LIB += git-rebase--merge SCRIPT_LIB += git-sh-setup SCRIPT_LIB += git-sh-i18n @@ -2271,6 +2272,7 @@ LOCALIZED_C = $(C_OBJ:o=c) $(LIB_H) $(GENERATED_H) LOCALIZED_SH = $(SCRIPT_SH) LOCALIZED_SH += git-parse-remote.sh LOCALIZED_SH += git-rebase--interactive.sh +LOCALIZED_SH += git-rebase--preserve-merges.sh LOCALIZED_SH += git-sh-setup.sh LOCALIZED_PERL = $(SCRIPT_PERL) diff --git a/git-rebase--preserve-merges.sh b/git-rebase--preserve-merges.sh new file mode 100644 index 0..2f4941d0f --- /dev/null +++ b/git-rebase--preserve-merges.sh @@ -0,0 +1,1069 @@ +# This shell script fragment is sourced by git-rebase to implement +# its interactive mode. "git rebase --interactive" makes it easy +# to fix up commits in the middle of a series and rearrange commits. +# +# Copyright (c) 2006 Johannes E. Schindelin +# +# The original idea comes from Eric W. Biederman, in +# https://public-inbox.org/git/m1odwkyuf5.fsf...@ebiederm.dsl.xmission.com/ +# +# The file containing rebase commands, comments, and empty lines. +# This file is created by "git rebase -i" then edited by the user. As +# the lines are processed, they are removed from the front of this +# file and written to the tail of $done. +todo="$state_dir"/git-rebase-todo + +# The rebase command lines that have already been processed. A line +# is moved here when it is first handled, before any associated user +# actions. +done="$state_dir"/done + +# The commit message that is planned to be used for any changes that +# need to be committed following a user interaction. +msg="$state_dir"/message + +# The file into which is accumulated the suggested commit message for +# squash/fixup commands. When the first of a series of squash/fixups +# is seen, the file is created and the commit message from the +# previous commit and from the first squash/fixup commit are written +# to it. The commit message for each subsequent squash/fixup commit +# is appended to the file as it is processed. +# +# The first line of the file is of the form +# # This is a combination of $count commits. +# where $count is the number of commits whose messages have been +# written to the file so far (including the initial "pick" commit). +# Each time that a commit message is processed, this line is read and +# updated. It is deleted just before the combined commit is made. +squash_msg="$state_dir"/message-squash + +# If the current series of squash/fixups has not yet included a squash +# command, then this file exists and holds the commit message of the +# original "pick" commit. (If the series ends without a "squash" +# command, then this can be used as the commit message of the combined +# commit without opening the editor.) +fixup_msg="$state_dir"/message-fixup + +# $rewritten is the name of a directory containing files for each +# commit that is reachable by at least one merge base of $head and +# $upstream. They are not necessarily rewritten, but their children +# might be. This ensures that commits on merged, but otherwise +# unrelated side branches are left alone. (Think "X" in the man page's +# example.) +rewritten="$state_dir"/rewritten + +dropped="$state_dir"/dropped + +end="$state_dir"/end +msgnum="$state_dir"/msgnum + +# A script to set the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, and +# GIT_AUTHOR_DATE that will be used for the commit that is currently +# being rebased. +author_script="$state_dir"/author-script + +# When an "edit" rebase command is being processed, the SHA1 of the +# commit to be edited is recorded in this file. When "git rebase +# --continue" is executed, if there are any staged changes then they +# will be amended to the HEAD commit, but only provided the HEAD +# commit is still the commit to be edited. When any other rebase +# command is processed, this file is deleted. +amend="$state_dir"/amend + +# For the post-rewrite hook, we make a list of rewritten
[GSoC][PATCH v3 2/4] rebase: strip unused code in git-rebase--preserve-merges.sh
This removes the code coming from git-rebase--interactive.sh that is not needed by preserve-merges. Signed-off-by: Alban Gruin--- git-rebase--preserve-merges.sh | 65 +++--- 1 file changed, 4 insertions(+), 61 deletions(-) diff --git a/git-rebase--preserve-merges.sh b/git-rebase--preserve-merges.sh index 2f4941d0f..c51c7828e 100644 --- a/git-rebase--preserve-merges.sh +++ b/git-rebase--preserve-merges.sh @@ -1,12 +1,8 @@ -# This shell script fragment is sourced by git-rebase to implement -# its interactive mode. "git rebase --interactive" makes it easy -# to fix up commits in the middle of a series and rearrange commits. +# This shell script fragment is sourced by git-rebase to implement its +# preserve-merges mode. # # Copyright (c) 2006 Johannes E. Schindelin # -# The original idea comes from Eric W. Biederman, in -# https://public-inbox.org/git/m1odwkyuf5.fsf...@ebiederm.dsl.xmission.com/ -# # The file containing rebase commands, comments, and empty lines. # This file is created by "git rebase -i" then edited by the user. As # the lines are processed, they are removed from the front of this @@ -287,17 +283,7 @@ pick_one () { empty_args="--allow-empty" fi - test -d "$rewritten" && - pick_one_preserving_merges "$@" && return - output eval git cherry-pick $allow_rerere_autoupdate $allow_empty_message \ - ${gpg_sign_opt:+$(git rev-parse --sq-quote "$gpg_sign_opt")} \ - $signoff "$strategy_args" $empty_args $ff "$@" - - # If cherry-pick dies it leaves the to-be-picked commit unrecorded. Reschedule - # previous task so this commit is not lost. - ret=$? - case "$ret" in [01]) ;; *) reschedule_last_action ;; esac - return $ret + pick_one_preserving_merges "$@" } pick_one_preserving_merges () { @@ -761,11 +747,6 @@ get_missing_commit_check_level () { initiate_action () { case "$1" in continue) - if test ! -d "$rewritten" - then - exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \ - --continue - fi # do we have anything to commit? if git diff-index --cached --quiet HEAD -- then @@ -824,12 +805,6 @@ first and then run 'git rebase --continue' again.")" ;; skip) git rerere clear - - if test ! -d "$rewritten" - then - exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \ - --continue - fi do_rest return 0 ;; @@ -944,43 +919,11 @@ EOF } expand_todo_ids - - test -d "$rewritten" || test -n "$force_rebase" || - onto="$(git rebase--helper --skip-unnecessary-picks)" || - die "Could not skip unnecessary pick commands" - checkout_onto - if test ! -d "$rewritten" - then - require_clean_work_tree "rebase" - exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \ - --continue - fi do_rest } -git_rebase__interactive () { - initiate_action "$action" - ret=$? - if test $ret = 0; then - return 0 - fi - - setup_reflog_action - init_basic_state - - init_revisions_and_shortrevisions - - git rebase--helper --make-script ${keep_empty:+--keep-empty} \ - ${rebase_merges:+--rebase-merges} \ - ${rebase_cousins:+--rebase-cousins} \ - $revisions ${restrict_revision+^$restrict_revision} >"$todo" || - die "$(gettext "Could not generate todo list")" - - complete_action -} - -git_rebase__interactive__preserve_merges () { +git_rebase__preserve_merges () { initiate_action "$action" ret=$? if test $ret = 0; then -- 2.16.1
[GSoC][PATCH v3 0/4] rebase: split rebase -p from rebase -i
This splits the `rebase --preserve-merges` functionnality from git-rebase--interactive.sh. All the dead code left by the duplication of git-rebase--interactive.sh is also removed. This will make it easier to rewrite this script in C. This patch series is based of js/sequencer-and-root-commits. Changes since v2: - Removing ` Alban Gruin (4): rebase: duplicate git-rebase--interactive.sh to git-rebase--preserve-merges.sh rebase: strip unused code in git-rebase--preserve-merges.sh rebase: use the new git-rebase--preserve-merges.sh rebase: remove -p code from git-rebase--interactive.sh .gitignore |1 + Makefile |2 + git-rebase--interactive.sh | 802 +-- git-rebase--preserve-merges.sh | 1012 git-rebase.sh | 32 +- 5 files changed, 1048 insertions(+), 801 deletions(-) create mode 100644 git-rebase--preserve-merges.sh -- 2.16.1
[GSoC][PATCH v3 4/4] rebase: remove -p code from git-rebase--interactive.sh
All the code specific to preserve-merges was moved to git-rebase--preserve-merges.sh, and so it’s useless to keep it here. Signed-off-by: Alban Gruin--- git-rebase--interactive.sh | 802 + 1 file changed, 8 insertions(+), 794 deletions(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 2f4941d0f..9884ecd71 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -13,83 +13,6 @@ # file and written to the tail of $done. todo="$state_dir"/git-rebase-todo -# The rebase command lines that have already been processed. A line -# is moved here when it is first handled, before any associated user -# actions. -done="$state_dir"/done - -# The commit message that is planned to be used for any changes that -# need to be committed following a user interaction. -msg="$state_dir"/message - -# The file into which is accumulated the suggested commit message for -# squash/fixup commands. When the first of a series of squash/fixups -# is seen, the file is created and the commit message from the -# previous commit and from the first squash/fixup commit are written -# to it. The commit message for each subsequent squash/fixup commit -# is appended to the file as it is processed. -# -# The first line of the file is of the form -# # This is a combination of $count commits. -# where $count is the number of commits whose messages have been -# written to the file so far (including the initial "pick" commit). -# Each time that a commit message is processed, this line is read and -# updated. It is deleted just before the combined commit is made. -squash_msg="$state_dir"/message-squash - -# If the current series of squash/fixups has not yet included a squash -# command, then this file exists and holds the commit message of the -# original "pick" commit. (If the series ends without a "squash" -# command, then this can be used as the commit message of the combined -# commit without opening the editor.) -fixup_msg="$state_dir"/message-fixup - -# $rewritten is the name of a directory containing files for each -# commit that is reachable by at least one merge base of $head and -# $upstream. They are not necessarily rewritten, but their children -# might be. This ensures that commits on merged, but otherwise -# unrelated side branches are left alone. (Think "X" in the man page's -# example.) -rewritten="$state_dir"/rewritten - -dropped="$state_dir"/dropped - -end="$state_dir"/end -msgnum="$state_dir"/msgnum - -# A script to set the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, and -# GIT_AUTHOR_DATE that will be used for the commit that is currently -# being rebased. -author_script="$state_dir"/author-script - -# When an "edit" rebase command is being processed, the SHA1 of the -# commit to be edited is recorded in this file. When "git rebase -# --continue" is executed, if there are any staged changes then they -# will be amended to the HEAD commit, but only provided the HEAD -# commit is still the commit to be edited. When any other rebase -# command is processed, this file is deleted. -amend="$state_dir"/amend - -# For the post-rewrite hook, we make a list of rewritten commits and -# their new sha1s. The rewritten-pending list keeps the sha1s of -# commits that have been processed, but not committed yet, -# e.g. because they are waiting for a 'squash' command. -rewritten_list="$state_dir"/rewritten-list -rewritten_pending="$state_dir"/rewritten-pending - -# Work around Git for Windows' Bash whose "read" does not strip CRLF -# and leaves CR at the end instead. -cr=$(printf "\015") - -strategy_args=${strategy:+--strategy=$strategy} -test -n "$strategy_opts" && -eval ' - for strategy_opt in '"$strategy_opts"' - do - strategy_args="$strategy_args -X$(git rev-parse --sq-quote "${strategy_opt#--}")" - done -' - GIT_CHERRY_PICK_HELP="$resolvemsg" export GIT_CHERRY_PICK_HELP @@ -105,15 +28,6 @@ case "$comment_char" in ;; esac -warn () { - printf '%s\n' "$*" >&2 -} - -# Output the commit message for the specified commit. -commit_message () { - git cat-file commit "$1" | sed "1,/^$/d" -} - orig_reflog_action="$GIT_REFLOG_ACTION" comment_for_reflog () { @@ -125,33 +39,6 @@ comment_for_reflog () { esac } -last_count= -mark_action_done () { - sed -e 1q < "$todo" >> "$done" - sed -e 1d < "$todo" >> "$todo".new - mv -f "$todo".new "$todo" - new_count=$(( $(git stripspace --strip-comments <"$done" | wc -l) )) - echo $new_count >"$msgnum" - total=$(($new_count + $(git stripspace --strip-comments <"$todo" | wc -l))) - echo $total >"$end" - if test "$last_count" != "$new_count" - then - last_count=$new_count - eval_gettext "Rebasing (\$new_count/\$total)"; printf "\r" - test -z "$verbose" || echo - fi -} - -# Put the last action marked done at the beginning of
[GSoC][PATCH v3 3/4] rebase: use the new git-rebase--preserve-merges.sh
Creates a new type of rebase, "preserve-merges", used when rebase is called with -p. Before that, the type for preserve-merges was "interactive", and some places of this script compared $type to "interactive". Instead, the code now checks if $interactive_rebase is empty or not, as it is set to "explicit" when calling an interactive rebase (and, possibly, one of its submodes), and "implied" when calling one of its submodes (eg. preserve-merges) *without* interactive rebase. It also detects the presence of the directory "$merge_dir"/rewritten left by the preserve-merges script when calling rebase --continue, --skip, etc., and, if it exists, sets the rebase mode to preserve-merges. In this case, interactive_rebase is set to "explicit", as "implied" would break some tests. Signed-off-by: Alban Gruin--- git-rebase.sh | 32 +--- 1 file changed, 25 insertions(+), 7 deletions(-) diff --git a/git-rebase.sh b/git-rebase.sh index 40be59ecc..19bdebb48 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -207,7 +207,14 @@ run_specific_rebase () { autosquash= fi . git-rebase--$type - git_rebase__$type${preserve_merges:+__preserve_merges} + + if test -z "$preserve_merges" + then + git_rebase__$type + else + git_rebase__preserve_merges + fi + ret=$? if test $ret -eq 0 then @@ -239,7 +246,12 @@ then state_dir="$apply_dir" elif test -d "$merge_dir" then - if test -f "$merge_dir"/interactive + if test -d "$merge_dir"/rewritten + then + type=preserve-merges + interactive_rebase=explicit + preserve_merges=t + elif test -f "$merge_dir"/interactive then type=interactive interactive_rebase=explicit @@ -402,14 +414,14 @@ if test -n "$action" then test -z "$in_progress" && die "$(gettext "No rebase in progress?")" # Only interactive rebase uses detailed reflog messages - if test "$type" = interactive && test "$GIT_REFLOG_ACTION" = rebase + if test -n "$interactive_rebase" && test "$GIT_REFLOG_ACTION" = rebase then GIT_REFLOG_ACTION="rebase -i ($action)" export GIT_REFLOG_ACTION fi fi -if test "$action" = "edit-todo" && test "$type" != "interactive" +if test "$action" = "edit-todo" && test -z "$interactive_rebase" then die "$(gettext "The --edit-todo action can only be used during interactive rebase.")" fi @@ -487,7 +499,13 @@ fi if test -n "$interactive_rebase" then - type=interactive + if test -z "$preserve_merges" + then + type=interactive + else + type=preserve-merges + fi + state_dir="$merge_dir" elif test -n "$do_merge" then @@ -647,7 +665,7 @@ require_clean_work_tree "rebase" "$(gettext "Please commit or stash them.")" # but this should be done only when upstream and onto are the same # and if this is not an interactive rebase. mb=$(git merge-base "$onto" "$orig_head") -if test "$type" != interactive && test "$upstream" = "$onto" && +if test -z "$interactive_rebase" && test "$upstream" = "$onto" && test "$mb" = "$onto" && test -z "$restrict_revision" && # linear history? ! (git rev-list --parents "$onto".."$orig_head" | sane_grep " .* ") > /dev/null @@ -691,7 +709,7 @@ then GIT_PAGER='' git diff --stat --summary "$mb" "$onto" fi -test "$type" = interactive && run_specific_rebase +test -n "$interactive_rebase" && run_specific_rebase # Detach HEAD and reset the tree say "$(gettext "First, rewinding head to replay your work on top of it...")" -- 2.16.1
is the standard "[]", and not "[options]" or other?
more pedantry -- was digging through "man git-diff" and noticed the inconsistency in how options are represented. first, in the synopsis, you see "[options]": SYNOPSIS git diff [options] [] [--] [...] git diff [options] --cached [] [--] [...] git diff [options] [--] [...] git diff [options] git diff [options] [--no-index] [--] while just below that in DESCRIPTION, it's all "[--options]": git diff [--options] [--] [...] ^^^ a further search produced this from RelNotes/2.7.0.txt: "A couple of commands still showed "[options]" in their usage string to note where options should come on their command line, but we spell that "[]" in most places these days." so, "git diff -h" does in fact use the allegedly encouraged syntax: $ git diff -h usage: git diff [] [ []] [--] [...] $ but should the man pages be updated similarly? i can whip up a patch for that unless someone wants to comment on this further. rday -- Robert P. J. Day Ottawa, Ontario, CANADA http://crashcourse.ca/dokuwiki Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
Re: [RFC/PATCH 4/7] rerere: fix crash when conflict goes unresolved
Thomas Gummererwrites: > To fix this, remove the rerere ID from the MERGE_RR file in case we > can't handle it, and remove the folder for the ID. Removing it > unconditionally is fine here, because if the user would have resolved > the conflict and ran rerere, the entry would no longer be in the > MERGE_RR file, so we wouldn't have this problem in the first place, I do not think removing the directory and losing _other_ conflicts and their resolutions, if they exist, is fine in the modern world order post rerere-multi update in 2016. Well, it is just as safe as "rm -rf .git/rr-cache/" in the sense that it won't make Git start segfaulting, but it is not fine as it is discarding information of conflicts that has nothing to do with the current one that is problematic.
unexpected "unresolved merge conflict" for a new file
Hi, `git commit' fails on a newly added file with the following * * You have some suspicious patch lines: * * In Documentation/core-api/gfp_mask-from-fs-io.rst * unresolved merge conflict (line 27) Documentation/core-api/gfp_mask-from-fs-io.rst:27:=== $ git status --porcelain A Documentation/core-api/gfp_mask-from-fs-io.rst $ git --version git version 2.17.0 from debian. Btw. the suspicious line is $ sed -n '27p' Documentation/core-api/gfp_mask-from-fs-io.rst === I believe this is a bug because a new file cannot have a conflict by definition and also there are no < in the file so there is no unresolved conflict there. So I guess the heuristic should be more clever. Thanks! -- Michal Hocko SUSE Labs
Re: [PATCH 2/5] t6036, t6042: use test_line_count instead of wc -l
On Thu, May 24, 2018 at 12:05 PM, SZEDER Gáborwrote: > >> Signed-off-by: Elijah Newren >> --- >> t/t6036-recursive-corner-cases.sh| 102 ++- >> t/t6042-merge-rename-corner-cases.sh | 99 +- >> 2 files changed, 134 insertions(+), 67 deletions(-) >> >> diff --git a/t/t6036-recursive-corner-cases.sh >> b/t/t6036-recursive-corner-cases.sh >> index cfe6a99771..3e659cff28 100755 >> --- a/t/t6036-recursive-corner-cases.sh >> +++ b/t/t6036-recursive-corner-cases.sh >> @@ -65,9 +65,12 @@ test_expect_success 'merge simple rename+criss-cross with >> no modifications' ' >> >> test_must_fail git merge -s recursive R2^0 && >> >> - test 2 = $(git ls-files -s | wc -l) && >> - test 2 = $(git ls-files -u | wc -l) && >> - test 2 = $(git ls-files -o | wc -l) && > > Here 'git ls-files -o' should have listed two untracked files ... > >> + git ls-files -s >out && >> + test_line_count = 2 out && >> + git ls-files -u >out && >> + test_line_count = 2 out && >> + git ls-files -o >out && >> + test_line_count = 3 out && > > ... but now you expect it to list three. I was about to point out the > typo, but then noticed that you expect it to list one more untracked > file than before in all subsequent tests... Hrm, not all, actually, because there is this one exception: >> @@ -426,10 +444,14 @@ test_expect_success 'handle rename/rename (2to1) >> conflict correctly' ' >> test_must_fail git merge -s recursive C^0 >out && Note that the file 'out' is created here ... >> test_i18ngrep "CONFLICT (rename/rename)" out && >> >> - test 2 -eq $(git ls-files -s | wc -l) && >> - test 2 -eq $(git ls-files -u | wc -l) && >> - test 2 -eq $(git ls-files -u c | wc -l) && >> - test 3 -eq $(git ls-files -o | wc -l) && ... so this 'git ls-files -o' already lists it as well, ... >> + git ls-files -s >out && >> + test_line_count = 2 out && >> + git ls-files -u >out && >> + test_line_count = 2 out && >> + git ls-files -u c >out && >> + test_line_count = 2 out && >> + git ls-files -o >out && >> + test_line_count = 3 out && ... therefore the number of untracked files here remains unchanged. >> >> test ! -f a && >> test ! -f b &&
Re: [RFC/PATCH 5/7] rerere: only return whether a path has conflicts or not
Thomas Gummererwrites: > We currently return the exact number of conflict hunks a certain path > has from the 'handle_paths' function. However all of its callers only > care whether there are conflicts or not or if there is an error. > Return only that information, and document that only that information > is returned. This will simplify the code in the subsequent steps. Makes sense.