Inconsistencies in commit-graph technical docs.
I recently read the "Supercharging the Git Commit Graph" blog by Derrick Stolee. I found the article interesting and wanted to verify the performance numbers for myself. Then that led me to want to know more about the implementation, so I read the technical design notes in commit-graph.txt, and then I jumped into the format documentation in commit-graph-format.txt. Along the way, I noticed a few issues. They might just be errors in the documentation, but I figured it was worth documenting my entire process just to be sure. "Supercharging the Git Commit Graph", by Derrick Stolee: https://blogs.msdn.microsoft.com/devops/2018/06/25/supercharging-the-git-commit-graph/ # TL;DR I found a few discrepencies between the documentation in commit-graph-format.txt and the results that I observed for myself. 1. The "Commit Data" chunk signature is documented to be 'CGET', but it should be 'CDAT'. commit-graph.c:18 #define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */ 2. The "no parent" value is documented to be 0x, but is actually 0x7000. commit-graph.c:34 #define GRAPH_PARENT_NONE 0x7000 3. The "generation" field isn't set on any of the commits. (I don't know what to make of this.) # Documented Process Just to follow along, here are the details about my local repo. I don't have any local changes; I just track upstream. repo:git/git remotes: git://git.kernel.org/pub/scm/git/git.git (origin) branch: master commit: ed843436dd4924c10669820cc73daf50f0b4dabd os: MacOS 10.11.6 (El Capitan) git: 2.18.0 Just to be clear, I am following the details as described in: /Documentation/technical/commit-graph-format.txt # Create the commit-graph File As a setup step, I have to prepare the commit-graph file as described in the blog post. $ git show-ref -s | git commit-graph write --stdin-commits # Verify SHA1 Checksum Starting at the end first because like most (all?) of git's binary formats, there is a SHA1 checksum that verifies the integrity of the file. Everything checks out fine. $ < .git/objects/info/commit-graph tail -c 20 | xxd -p 35507183b1e9546f9e4844f0796d630daebc85d8 $ < .git/objects/info/commit-graph wc -c 3037156 $ < .git/objects/info/commit-graph head -c $((3037156-20)) | sha1sum 35507183b1e9546f9e4844f0796d630daebc85d8 - # Header The header looks good: * signature: CGPH * version: 1 * hash:1 (SHA1) * "chunks":4 * "reserved": (ignored) $ < .git/objects/info/commit-graph xxd -l8 : 4347 5048 0101 0400 CGPH # Chunk Lookup, Problem: CGET/CDAT I have all of the documented "chunks". but it appears, that the documentation is wrong for the "Commit Data" chunk. The document signature is 'CGET', but I'm getting 'CDAT'. * OIDF, Object ID Fanout: 0x44 * OIDL, Object ID Lookup: 0x444 * CDAT, Commit Data:0x108f58 * EDGE, Large Edge List:0x2e577c * , terminating label: 0x2e57d0 $ < .git/objects/info/commit-graph xxd -s8 -l$(((4+1)*12)) -c12 -g4 0008: 4f494446 0044 OIDF...D 0014: 4f49444c 0444 OIDL...D 0020: 43444154 00108f58 CDAT...X 002c: 45444745 002e567c EDGE..V| 0038: 002e57d0 ..W. # OID Fanout Not much to tell here, but since it's important, I'll just show last line so know how many commits are recorded in the file. Everything checks out that I have 54209 commits in my local repo. $ < .git/objects/info/commit-graph xxd -s0x44 -l$((256*4)) -c4 -g4 ... 0440: d3c1 $ echo $((0xd3c1)) 54209 $ git rev-list --all | wc -l 54209 # OID Lookup Again, not much to see here; just an alphanumeric, sequential list off all the commit ids. $ < .git/objects/info/commit-graph xxd -s0x444 -l$((54209*20)) -c20 -g20 ... 00108f44: fffe694d607ea683b5d08ee99a46d9b06cb74006 ..iM`~...F..l.@. $ git cat-file -p fffe694d607ea683b5d08ee99a46d9b06cb74006 tree 1241274386e9d5dabaf370477ff19ba0eb36c3c2 parent 84dee6bbc9eb6dd8e613414ed0271d8290549e89 author Eric Wong 1168151155 -0800 committer Junio C Hamano 1168152528 -0800 ... # Commit Data This is where I start to see inconsistencies with the documented format. $ < .git/objects/info/commit-graph xxd -s0x108f58 -l$((54209*(20+16))) -c36 -g4 ... 002e5658: 12412743 86e9d5da baf37047 7ff19ba0 eb36c3c2 6dea 7000 45a097d0 >From the above 'git cat-file' command, we can confirm the tree is correct. tree:1241274386e9d5dabaf370477ff19ba0eb36c3c2 parent1: 6dea parent2: 7000 generation: timestamp: 45a097d0 As does the timestamp. $ date --date @$((0x45a097d0)) +%s 1168152528 Parent #1 looks good as well. < .git/objects/info/commit-graph xxd -s$((0x444+(0x6dea*20))) -l20 -c20 -g20 00089a8c: 84dee6bbc9eb6dd8e613414ed0271d8290549e89 ## Problem: Parent #2 Parent #2 (7000), however appears to be a mystery value.
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On Wed, Jun 27, 2018 at 07:54:52AM +, Steve Groeger wrote: > We have common code that is supposed to be usable across different platforms > and hence different file encodings. With the full support of the > working-tree-encoding in the latest version of git on all platforms, how do > we have files converted to different encodings on different platforms? > I could not find anything that would allow us to say 'if platform = z/OS then > encoding=EBCDIC else encoding=ASCII'. Is there a way this can be done? I don't believe there is such functionality. Git doesn't have attributes that are conditional on the platform in that sort of way. You could use a smudge/clean filter and adjust the filter for the platform you're on, which might meet your needs. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
[PATCH] sequencer: use configured comment character
Use configured comment character when generating comments about branches in an instruction sheet. Failure to honor this configuration causes a failure to parse the resulting instruction sheet. Signed-off-by: Aaron Schrab --- sequencer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sequencer.c b/sequencer.c index 4034c0461b..caf91af29d 100644 --- a/sequencer.c +++ b/sequencer.c @@ -3991,7 +3991,7 @@ static int make_script_with_merges(struct pretty_print_context *pp, entry = oidmap_get(&state.commit2label, &commit->object.oid); if (entry) - fprintf(out, "\n# Branch %s\n", entry->string); + fprintf(out, "\n%c Branch %s\n", comment_line_char, entry->string); else fprintf(out, "\n"); -- 2.18.0.419.gfe4b301394
Re: [PATCH v3] Documentation: declare "core.ignorecase" as internal variable
At 12:11 -0700 27 Jun 2018, Junio C Hamano wrote: Hmph. Do other people have difficulty applying this patch to their trees? It is just several lines long so I could retype it myself, but I guess "Content-Type: text/plain; charset=utf-8; format=flowed" has destroyed formatting of the patch rather badly. Yes, format=flowed requires lines that start with a space (along with '>' or 'From ') to be space-stuffed, adding a leading space. This will affect context lines in patches. I was able to apply it cleanly (I think) by sending the message to: sed '/@@/,$s/^ / /' | git am That's replacing two leading spaces with one.
Re: [PATCH] fetch: when deepening, check connectivity fully
Jonathan Tan writes: >> Hmph, don't we quote these in the trace output, requiring us to grep >> for "'--not' '--all'" or somesuch? > > I thought so too, but this was changed in commit 1fbdab21bb ("trace: > avoid unnecessary quoting", 2018-01-16). Yup; fortunately or unfortunately, that and the "partial clone" stuff are among the differences between v2.16 track and newer, so we can just forget about v2.16.x and lower and target this fix to maint-2.17 and above.
Re: [PATCH] fetch: when deepening, check connectivity fully
Jonathan Tan writes: >> Hmph, remind me how old and/or new shallow cut-off points affect >> this traversal? In order to verify that everything above the new >> cut-off points, opt->shallow_file should be pointing at the new >> cut-off points, then we do the full sweep (without --not --all) to >> ensure we won't find missing or broken objects but still stop at the >> new cut-off points, and then only after check_connected() passes, >> update the shallow file to make new shallow cut-off points active >> (and if the traversal fails, then we do nto install the new shallow >> cut-off points)? > > That is the way it should work, but after thinking about it once more, I > realize that it isn't. > > opt->shallow_file is not set to anything. And fetch-pack updates the > shallow file by itself (at least, that is my understanding of > update_shallow() in fetch-pack.c) before fetch calls check_connected(), > meaning that if check_connected() fails, there is still no rollback of > the shallow file. Ouch. We need to fix that; otherwise, a broken server will keep giving you a corrupt repository even with this fix, no?
Re: [PATCH] fetch: when deepening, check connectivity fully
> Jonathan Tan writes: > > > +test_expect_success 'shallow fetches check connectivity without stopping > > at existing refs' ' > > + cp -R .git server.git && > > + > > + # Normally, the connectivity check stops at ancestors of existing refs. > > + git init client && > > + GIT_TRACE="$(pwd)/trace" git -C client fetch "$(pwd)/server.git" && > > + grep "run_command: git rev-list" trace >rev-list-command && > > + grep -e "--not --all" rev-list-command && > > + > > + # But it does not for a shallow fetch... > > + rm -rf client trace && > > + git init client && > > + GIT_TRACE="$(pwd)/trace" git -C client fetch --depth=1 > > "$(pwd)/server.git" && > > + grep "run_command: git rev-list" trace >rev-list-command && > > + ! grep -e "--not --all" rev-list-command && > > + > > + # ...and when deepening. > > + rm trace && > > + GIT_TRACE="$(pwd)/trace" git -C client fetch --unshallow > > "$(pwd)/server.git" && > > + grep "run_command: git rev-list" trace >rev-list-command && > > + ! grep -e "--not --all" rev-list-command > > +' > > Hmph, don't we quote these in the trace output, requiring us to grep > for "'--not' '--all'" or somesuch? I thought so too, but this was changed in commit 1fbdab21bb ("trace: avoid unnecessary quoting", 2018-01-16). > I do not think of a better way to do the above without a huge effort > offhand, and the approach taken by the above may be the best we > could do, but it looks like quite a brittle test that knows too much > about the current implementation. "rev-list $new_commits --not > --all" is a so very common and useful pattern that it is not all > that implausible that we may want to come up with a new option to do > so, or more likely we may want to do that with an in-process API > without spawning an external rev-list (hence making it impossible to > observe via GIT_TRACE). I agree. The best way to do it would probably be to intercept the fetch response and substitute an empty packfile for the packfile returned by the fetch, like the one-time-sed mechanism [1], but I think that it is outside the scope of this patch. [1] https://public-inbox.org/git/afe5d7d3f876893fdad318665805df1e056717c6.1485381677.git.jonathanta...@google.com/
Re: [PATCH] fetch: when deepening, check connectivity fully
> Hmph, remind me how old and/or new shallow cut-off points affect > this traversal? In order to verify that everything above the new > cut-off points, opt->shallow_file should be pointing at the new > cut-off points, then we do the full sweep (without --not --all) to > ensure we won't find missing or broken objects but still stop at the > new cut-off points, and then only after check_connected() passes, > update the shallow file to make new shallow cut-off points active > (and if the traversal fails, then we do nto install the new shallow > cut-off points)? That is the way it should work, but after thinking about it once more, I realize that it isn't. opt->shallow_file is not set to anything. And fetch-pack updates the shallow file by itself (at least, that is my understanding of update_shallow() in fetch-pack.c) before fetch calls check_connected(), meaning that if check_connected() fails, there is still no rollback of the shallow file. So to directly answer your first question, only the new shallow cut-off points affect this traversal, and not the old ones. > If so, that sounds sensible. By not having "--not --all", we > potentially would do a full sweep, but if we are really deepening to > the root commits, then we do need one full sweep anyway (as these > deepest parts of the history were previously hidden by the shallow > cutoff points), and if we are only deepening the history by a bit, > even if we do not have "--not --all", we would hit the updated > shallow cutoff points (which may be at older parts of the history) > and stop, and for correctness we need to visit there anyway, so I > think we are not being overly pessimistic with this change, either. Yes, this analysis makes sense.
[PATCH v6 0/8] ref-in-want
Changes in v6: * Stricter checks in tests * Change the name of a test to better reflect what is being tested * Various style issues fixed in shell scripts * Renamed one-time-sed.sh to apply-one-time-sed.sh * Client now errors out when an unexpected ref is sent from the server during the wanted-ref section. Also changed the docs around the server's responsibility with the refs that are sent during this section. Brandon Williams (8): test-pkt-line: add unpack-sideband subcommand upload-pack: implement ref-in-want upload-pack: test negotiation with changing repository fetch: refactor the population of peer ref OIDs fetch: refactor fetch_refs into two functions fetch: refactor to make function args narrower fetch-pack: put shallow info in output parameter fetch-pack: implement ref-in-want Documentation/config.txt| 7 + Documentation/technical/protocol-v2.txt | 28 +- builtin/clone.c | 4 +- builtin/fetch.c | 135 + fetch-object.c | 2 +- fetch-pack.c| 53 +++- remote.c| 1 + remote.h| 1 + t/helper/test-pkt-line.c| 33 +++ t/lib-httpd.sh | 1 + t/lib-httpd/apache.conf | 8 + t/lib-httpd/apply-one-time-sed.sh | 22 ++ t/t5703-upload-pack-ref-in-want.sh | 377 transport-helper.c | 6 +- transport-internal.h| 9 +- transport.c | 34 ++- transport.h | 3 +- upload-pack.c | 66 + 18 files changed, 715 insertions(+), 75 deletions(-) create mode 100644 t/lib-httpd/apply-one-time-sed.sh create mode 100755 t/t5703-upload-pack-ref-in-want.sh -- 2.18.0.399.gad0ab374a1-goog
[PATCH v6 5/8] fetch: refactor fetch_refs into two functions
Refactor the fetch_refs function into a function that does the fetching of refs and another function that stores them. This is in preparation for allowing additional processing of the fetched refs before updating the local ref store. Signed-off-by: Brandon Williams --- builtin/fetch.c | 23 ++- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 545635448..2fabfed0e 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -968,9 +968,21 @@ static int fetch_refs(struct transport *transport, struct ref *ref_map) if (ret) ret = transport_fetch_refs(transport, ref_map); if (!ret) - ret |= store_updated_refs(transport->url, - transport->remote->name, - ref_map); + /* +* Keep the new pack's ".keep" file around to allow the caller +* time to update refs to reference the new objects. +*/ + return 0; + transport_unlock_pack(transport); + return ret; +} + +/* Update local refs based on the ref values fetched from a remote */ +static int consume_refs(struct transport *transport, struct ref *ref_map) +{ + int ret = store_updated_refs(transport->url, +transport->remote->name, +ref_map); transport_unlock_pack(transport); return ret; } @@ -1116,7 +1128,8 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL); transport_set_option(transport, TRANS_OPT_DEPTH, "0"); transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL); - fetch_refs(transport, ref_map); + if (!fetch_refs(transport, ref_map)) + consume_refs(transport, ref_map); if (gsecondary) { transport_disconnect(gsecondary); @@ -1165,7 +1178,7 @@ static int do_fetch(struct transport *transport, transport->url); } } - if (fetch_refs(transport, ref_map)) { + if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) { free_refs(ref_map); retcode = 1; goto cleanup; -- 2.18.0.399.gad0ab374a1-goog
[PATCH v6 2/8] upload-pack: implement ref-in-want
Currently, while performing packfile negotiation, clients are only allowed to specify their desired objects using object ids. This causes a vulnerability to failure when an object turns non-existent during negotiation, which may happen if, for example, the desired repository is provided by multiple Git servers in a load-balancing arrangement and there exists replication delay. In order to eliminate this vulnerability, implement the ref-in-want feature for the 'fetch' command in protocol version 2. This feature enables the 'fetch' command to support requests in the form of ref names through a new "want-ref " parameter. At the conclusion of negotiation, the server will send a list of all of the wanted references (as provided by "want-ref" lines) in addition to the generated packfile. Signed-off-by: Brandon Williams --- Documentation/config.txt| 7 ++ Documentation/technical/protocol-v2.txt | 28 - t/t5703-upload-pack-ref-in-want.sh | 160 upload-pack.c | 66 ++ 4 files changed, 260 insertions(+), 1 deletion(-) create mode 100755 t/t5703-upload-pack-ref-in-want.sh diff --git a/Documentation/config.txt b/Documentation/config.txt index ab641bf5a..fb1dd7428 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -3479,6 +3479,13 @@ Note that this configuration variable is ignored if it is seen in the repository-level config (this is a safety measure against fetching from untrusted repositories). +uploadpack.allowRefInWant:: + If this option is set, `upload-pack` will support the `ref-in-want` + feature of the protocol version 2 `fetch` command. This feature + is intended for the benefit of load-balanced servers which may + not have the same view of what OIDs their refs point to due to + replication delay. + url..insteadOf:: Any URL that starts with this value will be rewritten to start, instead, with . In cases where some site serves a diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt index 49bda76d2..1da71d0dd 100644 --- a/Documentation/technical/protocol-v2.txt +++ b/Documentation/technical/protocol-v2.txt @@ -299,12 +299,21 @@ included in the client's request: for use with partial clone and partial fetch operations. See `rev-list` for possible "filter-spec" values. +If the 'ref-in-want' feature is advertised, the following argument can +be included in the client's request as well as the potential addition of +the 'wanted-refs' section in the server's response as explained below. + +want-ref + Indicates to the server that the client wants to retrieve a + particular ref, where is the full name of a ref on the + server. + The response of `fetch` is broken into a number of sections separated by delimiter packets (0001), with each section beginning with its section header. output = *section -section = (acknowledgments | shallow-info | packfile) +section = (acknowledgments | shallow-info | wanted-refs | packfile) (flush-pkt | delim-pkt) acknowledgments = PKT-LINE("acknowledgments" LF) @@ -319,6 +328,10 @@ header. shallow = "shallow" SP obj-id unshallow = "unshallow" SP obj-id +wanted-refs = PKT-LINE("wanted-refs" LF) + *PKT-LINE(wanted-ref LF) +wanted-ref = obj-id SP refname + packfile = PKT-LINE("packfile" LF) *PKT-LINE(%x01-03 *%x00-ff) @@ -379,6 +392,19 @@ header. * This section is only included if a packfile section is also included in the response. +wanted-refs section + * This section is only included if the client has requested a + ref using a 'want-ref' line and if a packfile section is also + included in the response. + + * Always begins with the section header "wanted-refs". + + * The server will send a ref listing (" ") for + each reference requested using 'want-ref' lines. + + * The server MUST NOT send any refs which were not requested + using 'want-ref' lines. + packfile section * This section is only included if the client has sent 'want' lines in its request and either requested that no more diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh new file mode 100755 index 0..da86f7cde --- /dev/null +++ b/t/t5703-upload-pack-ref-in-want.sh @@ -0,0 +1,160 @@ +#!/bin/sh + +test_description='upload-pack ref-in-want' + +. ./test-lib.sh + +get_actual_refs () { + sed -n -e '/wanted-refs/,/0001/{ + /wanted-refs/d + /0001/d + p + }' actual_refs +} + +get_actual_commits () { + sed -n -e '/packfile/,//{ + /packfile/d + p + }' o.pack && + git index-pack o.pack && + git verify-pack -v o.idx | grep commit
[PATCH v6 3/8] upload-pack: test negotiation with changing repository
Add tests to check the behavior of fetching from a repository which changes between rounds of negotiation (for example, when different servers in a load-balancing agreement participate in the same stateless RPC negotiation). This forms a baseline of comparison to the ref-in-want functionality (which will be introduced to the client in subsequent commits), and ensures that subsequent commits do not change existing behavior. As part of this effort, a mechanism to substitute strings in a single HTTP response is added. Signed-off-by: Brandon Williams --- t/lib-httpd.sh | 1 + t/lib-httpd/apache.conf| 8 t/lib-httpd/apply-one-time-sed.sh | 22 ++ t/t5703-upload-pack-ref-in-want.sh | 68 ++ 4 files changed, 99 insertions(+) create mode 100644 t/lib-httpd/apply-one-time-sed.sh diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index 435a37465..ed41b155a 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -132,6 +132,7 @@ prepare_httpd() { cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH" install_script broken-smart-http.sh install_script error.sh + install_script apply-one-time-sed.sh ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules" diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf index 724d9ae46..581c010d8 100644 --- a/t/lib-httpd/apache.conf +++ b/t/lib-httpd/apache.conf @@ -111,9 +111,14 @@ Alias /auth/dumb/ www/auth/dumb/ SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} SetEnv GIT_HTTP_EXPORT_ALL + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + SetEnv GIT_HTTP_EXPORT_ALL + ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1 ScriptAlias /broken_smart/ broken-smart-http.sh/ ScriptAlias /error/ error.sh/ +ScriptAliasMatch /one_time_sed/(.*) apply-one-time-sed.sh/$1 Options FollowSymlinks @@ -123,6 +128,9 @@ ScriptAlias /error/ error.sh/ Options ExecCGI + + Options ExecCGI + Options ExecCGI diff --git a/t/lib-httpd/apply-one-time-sed.sh b/t/lib-httpd/apply-one-time-sed.sh new file mode 100644 index 0..fcef72892 --- /dev/null +++ b/t/lib-httpd/apply-one-time-sed.sh @@ -0,0 +1,22 @@ +#!/bin/sh + +# If "one-time-sed" exists in $HTTPD_ROOT_PATH, run sed on the HTTP response, +# using the contents of "one-time-sed" as the sed command to be run. If the +# response was modified as a result, delete "one-time-sed" so that subsequent +# HTTP responses are no longer modified. +# +# This can be used to simulate the effects of the repository changing in +# between HTTP request-response pairs. +if [ -e one-time-sed ]; then + "$GIT_EXEC_PATH/git-http-backend" >out + sed "$(cat one-time-sed)" out_modified + + if diff out out_modified >/dev/null; then + cat out + else + cat out_modified + rm one-time-sed + fi +else + "$GIT_EXEC_PATH/git-http-backend" +fi diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh index da86f7cde..32527a59c 100755 --- a/t/t5703-upload-pack-ref-in-want.sh +++ b/t/t5703-upload-pack-ref-in-want.sh @@ -157,4 +157,72 @@ test_expect_success 'want-ref with ref we already have commit for' ' check_output ' +. "$TEST_DIRECTORY"/lib-httpd.sh +start_httpd + +REPO="$HTTPD_DOCUMENT_ROOT_PATH/repo" +LOCAL_PRISTINE="$(pwd)/local_pristine" + +test_expect_success 'setup repos for change-while-negotiating test' ' + ( + git init "$REPO" && + cd "$REPO" && + >.git/git-daemon-export-ok && + test_commit m1 && + git tag -d m1 && + + # Local repo with many commits (so that negotiation will take + # more than 1 request/response pair) + git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo"; "$LOCAL_PRISTINE" && + cd "$LOCAL_PRISTINE" && + git checkout -b side && + for i in $(seq 1 33); do test_commit s$i; done && + + # Add novel commits to upstream + git checkout master && + cd "$REPO" && + test_commit m2 && + test_commit m3 && + git tag -d m2 m3 + ) && + git -C "$LOCAL_PRISTINE" remote set-url origin "http://127.0.0.1:$LIB_HTTPD_PORT/one_time_sed/repo"; && + git -C "$LOCAL_PRISTINE" config protocol.version 2 +' + +inconsistency () { + # Simulate that the server initially reports $2 as the ref + # corresponding to $1, and after that, $1 as the ref corresponding to + # $1. This corresponds to the real-life situation where the server's + # repository appears to change during negotiation, for example, when + # different servers in a load-balancing arrangement serve (stateless) + # RPCs during a single negotiation. + printf "s/%s/%s/" \ + $(git -C "$REPO" rev-parse $1 | tr -d "\n") \ +
[PATCH v6 7/8] fetch-pack: put shallow info in output parameter
Expand the transport fetch method signature, by adding an output parameter, to allow transports to return information about the refs they have fetched. Then communicate shallow status information through this mechanism instead of by modifying the input list of refs. This does require clients to sometimes generate the ref map twice: once from the list of refs provided by the remote (as is currently done) and potentially once from the new list of refs that the fetch mechanism provides. Signed-off-by: Brandon Williams --- builtin/clone.c | 4 ++-- builtin/fetch.c | 28 fetch-object.c | 2 +- fetch-pack.c | 15 --- transport-helper.c | 6 -- transport-internal.h | 9 - transport.c | 34 -- transport.h | 3 ++- 8 files changed, 77 insertions(+), 24 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index 99e73dae8..8f86d99c5 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -1155,7 +1155,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } if (!is_local && !complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs); + transport_fetch_refs(transport, mapped_refs, NULL); remote_head = find_ref_by_name(refs, "HEAD"); remote_head_points_at = @@ -1197,7 +1197,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) if (is_local) clone_local(path, git_dir); else if (refs && complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs); + transport_fetch_refs(transport, mapped_refs, NULL); update_remote_refs(refs, mapped_refs, remote_head_points_at, branch_top.buf, reflog_msg.buf, transport, diff --git a/builtin/fetch.c b/builtin/fetch.c index bda00e826..0347cf016 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -946,11 +946,13 @@ static int quickfetch(struct ref *ref_map) return check_connected(iterate_ref_map, &rm, &opt); } -static int fetch_refs(struct transport *transport, struct ref *ref_map) +static int fetch_refs(struct transport *transport, struct ref *ref_map, + struct ref **updated_remote_refs) { int ret = quickfetch(ref_map); if (ret) - ret = transport_fetch_refs(transport, ref_map); + ret = transport_fetch_refs(transport, ref_map, + updated_remote_refs); if (!ret) /* * Keep the new pack's ".keep" file around to allow the caller @@ -1112,7 +1114,7 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL); transport_set_option(transport, TRANS_OPT_DEPTH, "0"); transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL); - if (!fetch_refs(transport, ref_map)) + if (!fetch_refs(transport, ref_map, NULL)) consume_refs(transport, ref_map); if (gsecondary) { @@ -1128,6 +1130,7 @@ static int do_fetch(struct transport *transport, int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; const struct ref *remote_refs; + struct ref *updated_remote_refs = NULL; struct argv_array ref_prefixes = ARGV_ARRAY_INIT; if (tags == TAGS_DEFAULT) { @@ -1178,7 +1181,24 @@ static int do_fetch(struct transport *transport, transport->url); } } - if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) { + + if (fetch_refs(transport, ref_map, &updated_remote_refs)) { + free_refs(ref_map); + retcode = 1; + goto cleanup; + } + if (updated_remote_refs) { + /* +* Regenerate ref_map using the updated remote refs. This is +* to account for additional information which may be provided +* by the transport (e.g. shallow info). +*/ + free_refs(ref_map); + ref_map = get_ref_map(transport->remote, updated_remote_refs, rs, + tags, &autotags); + free_refs(updated_remote_refs); + } + if (consume_refs(transport, ref_map)) { free_refs(ref_map); retcode = 1; goto cleanup; diff --git a/fetch-object.c b/fetch-object.c index 853624f81..48fe63dd6 100644 --- a/fetch-object.c +++ b/fetch-object.c @@ -19,7 +19,7 @@ static void fetch_refs(const char *remote_name, struct ref *ref) transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
[PATCH v6 4/8] fetch: refactor the population of peer ref OIDs
Populate peer ref OIDs in get_ref_map instead of do_fetch. Besides tightening scopes of variables in the code, this also prepares for get_ref_map being able to be called multiple times within do_fetch. Signed-off-by: Brandon Williams --- builtin/fetch.c | 36 ++-- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index ea5b9669a..545635448 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -351,6 +351,7 @@ static struct ref *get_ref_map(struct transport *transport, /* opportunistically-updated references: */ struct ref *orefs = NULL, **oref_tail = &orefs; + struct string_list existing_refs = STRING_LIST_INIT_DUP; const struct ref *remote_refs; if (rs->nr) @@ -458,7 +459,23 @@ static struct ref *get_ref_map(struct transport *transport, tail = &rm->next; } - return ref_remove_duplicates(ref_map); + ref_map = ref_remove_duplicates(ref_map); + + for_each_ref(add_existing, &existing_refs); + for (rm = ref_map; rm; rm = rm->next) { + if (rm->peer_ref) { + struct string_list_item *peer_item = + string_list_lookup(&existing_refs, + rm->peer_ref->name); + if (peer_item) { + struct object_id *old_oid = peer_item->util; + oidcpy(&rm->peer_ref->old_oid, old_oid); + } + } + } + string_list_clear(&existing_refs, 1); + + return ref_map; } #define STORE_REF_ERROR_OTHER 1 @@ -1110,14 +1127,10 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) static int do_fetch(struct transport *transport, struct refspec *rs) { - struct string_list existing_refs = STRING_LIST_INIT_DUP; struct ref *ref_map; - struct ref *rm; int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; - for_each_ref(add_existing, &existing_refs); - if (tags == TAGS_DEFAULT) { if (transport->remote->fetch_tags == 2) tags = TAGS_SET; @@ -1136,18 +1149,6 @@ static int do_fetch(struct transport *transport, if (!update_head_ok) check_not_current_branch(ref_map); - for (rm = ref_map; rm; rm = rm->next) { - if (rm->peer_ref) { - struct string_list_item *peer_item = - string_list_lookup(&existing_refs, - rm->peer_ref->name); - if (peer_item) { - struct object_id *old_oid = peer_item->util; - oidcpy(&rm->peer_ref->old_oid, old_oid); - } - } - } - if (tags == TAGS_DEFAULT && autotags) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1"); if (prune) { @@ -1183,7 +1184,6 @@ static int do_fetch(struct transport *transport, } cleanup: - string_list_clear(&existing_refs, 1); return retcode; } -- 2.18.0.399.gad0ab374a1-goog
[PATCH v6 6/8] fetch: refactor to make function args narrower
Refactor find_non_local_tags and get_ref_map to only take the information they need instead of the entire transport struct. Besides improving code clarity, this also improves their flexibility, allowing for a different set of refs to be used instead of relying on the ones stored in the transport struct. Signed-off-by: Brandon Williams --- builtin/fetch.c | 52 - 1 file changed, 26 insertions(+), 26 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 2fabfed0e..bda00e826 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -254,9 +254,9 @@ static int will_fetch(struct ref **head, const unsigned char *sha1) return 0; } -static void find_non_local_tags(struct transport *transport, - struct ref **head, - struct ref ***tail) +static void find_non_local_tags(const struct ref *refs, + struct ref **head, + struct ref ***tail) { struct string_list existing_refs = STRING_LIST_INIT_DUP; struct string_list remote_refs = STRING_LIST_INIT_NODUP; @@ -264,7 +264,7 @@ static void find_non_local_tags(struct transport *transport, struct string_list_item *item = NULL; for_each_ref(add_existing, &existing_refs); - for (ref = transport_get_remote_refs(transport, NULL); ref; ref = ref->next) { + for (ref = refs; ref; ref = ref->next) { if (!starts_with(ref->name, "refs/tags/")) continue; @@ -338,7 +338,8 @@ static void find_non_local_tags(struct transport *transport, string_list_clear(&remote_refs, 0); } -static struct ref *get_ref_map(struct transport *transport, +static struct ref *get_ref_map(struct remote *remote, + const struct ref *remote_refs, struct refspec *rs, int tags, int *autotags) { @@ -346,27 +347,11 @@ static struct ref *get_ref_map(struct transport *transport, struct ref *rm; struct ref *ref_map = NULL; struct ref **tail = &ref_map; - struct argv_array ref_prefixes = ARGV_ARRAY_INIT; /* opportunistically-updated references: */ struct ref *orefs = NULL, **oref_tail = &orefs; struct string_list existing_refs = STRING_LIST_INIT_DUP; - const struct ref *remote_refs; - - if (rs->nr) - refspec_ref_prefixes(rs, &ref_prefixes); - else if (transport->remote && transport->remote->fetch.nr) - refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes); - - if (ref_prefixes.argc && - (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) { - argv_array_push(&ref_prefixes, "refs/tags/"); - } - - remote_refs = transport_get_remote_refs(transport, &ref_prefixes); - - argv_array_clear(&ref_prefixes); if (rs->nr) { struct refspec *fetch_refspec; @@ -403,7 +388,7 @@ static struct ref *get_ref_map(struct transport *transport, if (refmap.nr) fetch_refspec = &refmap; else - fetch_refspec = &transport->remote->fetch; + fetch_refspec = &remote->fetch; for (i = 0; i < fetch_refspec->nr; i++) get_fetch_map(ref_map, &fetch_refspec->items[i], &oref_tail, 1); @@ -411,7 +396,6 @@ static struct ref *get_ref_map(struct transport *transport, die("--refmap option is only meaningful with command-line refspec(s)."); } else { /* Use the defaults */ - struct remote *remote = transport->remote; struct branch *branch = branch_get(NULL); int has_merge = branch_has_merge_config(branch); if (remote && @@ -450,7 +434,7 @@ static struct ref *get_ref_map(struct transport *transport, /* also fetch all tags */ get_fetch_map(remote_refs, tag_refspec, &tail, 0); else if (tags == TAGS_DEFAULT && *autotags) - find_non_local_tags(transport, &ref_map, &tail); + find_non_local_tags(remote_refs, &ref_map, &tail); /* Now append any refs to be updated opportunistically: */ *tail = orefs; @@ -1143,6 +1127,8 @@ static int do_fetch(struct transport *transport, struct ref *ref_map; int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; + const struct ref *remote_refs; + struct argv_array ref_prefixes = ARGV_ARRAY_INIT; if (tags == TAGS_DEFAULT) { if (transport->remote->fetch_tags == 2) @@ -1158,7 +1144,21 @@ static int do_fetch(struct transport *transport, goto cleanup; } - ref_map = get_ref_map(transport, rs, tags, &autotags); + if (rs->nr) + refspec_ref_prefixe
[PATCH v6 1/8] test-pkt-line: add unpack-sideband subcommand
Add an 'unpack-sideband' subcommand to the test-pkt-line helper to enable unpacking packet line data sent multiplexed using a sideband. Signed-off-by: Brandon Williams --- t/helper/test-pkt-line.c | 33 + 1 file changed, 33 insertions(+) diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c index 0f19e53c7..30775f986 100644 --- a/t/helper/test-pkt-line.c +++ b/t/helper/test-pkt-line.c @@ -1,3 +1,4 @@ +#include "cache.h" #include "pkt-line.h" static void pack_line(const char *line) @@ -48,6 +49,36 @@ static void unpack(void) } } +static void unpack_sideband(void) +{ + struct packet_reader reader; + packet_reader_init(&reader, 0, NULL, 0, + PACKET_READ_GENTLE_ON_EOF | + PACKET_READ_CHOMP_NEWLINE); + + while (packet_reader_read(&reader) != PACKET_READ_EOF) { + int band; + int fd; + + switch (reader.status) { + case PACKET_READ_EOF: + break; + case PACKET_READ_NORMAL: + band = reader.line[0] & 0xff; + if (band < 1 || band > 2) + die("unexpected side band %d", band); + fd = band; + + write_or_die(fd, reader.line + 1, reader.pktlen - 1); + break; + case PACKET_READ_FLUSH: + return; + case PACKET_READ_DELIM: + break; + } + } +} + int cmd_main(int argc, const char **argv) { if (argc < 2) @@ -57,6 +88,8 @@ int cmd_main(int argc, const char **argv) pack(argc - 2, argv + 2); else if (!strcmp(argv[1], "unpack")) unpack(); + else if (!strcmp(argv[1], "unpack-sideband")) + unpack_sideband(); else die("invalid argument '%s'", argv[1]); -- 2.18.0.399.gad0ab374a1-goog
[PATCH v6 8/8] fetch-pack: implement ref-in-want
Implement ref-in-want on the client side so that when a server supports the "ref-in-want" feature, a client will send "want-ref" lines for each reference the client wants to fetch. This feature allows clients to tolerate inconsistencies that exist when a remote repository's refs change during the course of negotiation. This allows a client to request to request a particular ref without specifying the OID of the ref. This means that instead of hitting an error when a ref no longer points at the OID it did at the beginning of negotiation, negotiation can continue and the value of that ref will be sent at the termination of negotiation, just before a packfile is sent. More information on the ref-in-want feature can be found in Documentation/technical/protocol-v2.txt. Signed-off-by: Brandon Williams --- fetch-pack.c | 38 +++- remote.c | 1 + remote.h | 1 + t/t5703-upload-pack-ref-in-want.sh | 149 + 4 files changed, 186 insertions(+), 3 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 73890b894..0b4a9f288 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -1102,9 +1102,10 @@ static void add_shallow_requests(struct strbuf *req_buf, static void add_wants(const struct ref *wants, struct strbuf *req_buf) { + int use_ref_in_want = server_supports_feature("fetch", "ref-in-want", 0); + for ( ; wants ; wants = wants->next) { const struct object_id *remote = &wants->old_oid; - const char *remote_hex; struct object *o; /* @@ -1122,8 +1123,10 @@ static void add_wants(const struct ref *wants, struct strbuf *req_buf) continue; } - remote_hex = oid_to_hex(remote); - packet_buf_write(req_buf, "want %s\n", remote_hex); + if (!use_ref_in_want || wants->exact_oid) + packet_buf_write(req_buf, "want %s\n", oid_to_hex(remote)); + else + packet_buf_write(req_buf, "want-ref %s\n", wants->name); } } @@ -1334,6 +1337,32 @@ static void receive_shallow_info(struct fetch_pack_args *args, args->deepen = 1; } +static void receive_wanted_refs(struct packet_reader *reader, struct ref *refs) +{ + process_section_header(reader, "wanted-refs", 0); + while (packet_reader_read(reader) == PACKET_READ_NORMAL) { + struct object_id oid; + const char *end; + struct ref *r = NULL; + + if (parse_oid_hex(reader->line, &oid, &end) || *end++ != ' ') + die("expected wanted-ref, got '%s'", reader->line); + + for (r = refs; r; r = r->next) { + if (!strcmp(end, r->name)) { + oidcpy(&r->old_oid, &oid); + break; + } + } + + if (!r) + die("unexpected wanted-ref: '%s'", reader->line); + } + + if (reader->status != PACKET_READ_DELIM) + die("error processing wanted refs: %d", reader->status); +} + enum fetch_state { FETCH_CHECK_LOCAL = 0, FETCH_SEND_REQUEST, @@ -1408,6 +1437,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, if (process_section_header(&reader, "shallow-info", 1)) receive_shallow_info(args, &reader); + if (process_section_header(&reader, "wanted-refs", 1)) + receive_wanted_refs(&reader, ref); + /* get the pack */ process_section_header(&reader, "packfile", 0); if (get_pack(args, fd, pack_lockfile)) diff --git a/remote.c b/remote.c index abe80c139..2c2376fff 100644 --- a/remote.c +++ b/remote.c @@ -1735,6 +1735,7 @@ int get_fetch_map(const struct ref *remote_refs, if (refspec->exact_sha1) { ref_map = alloc_ref(name); get_oid_hex(name, &ref_map->old_oid); + ref_map->exact_oid = 1; } else { ref_map = get_remote_ref(remote_refs, name); } diff --git a/remote.h b/remote.h index 45ecc6cef..976292152 100644 --- a/remote.h +++ b/remote.h @@ -73,6 +73,7 @@ struct ref { force:1, forced_update:1, expect_old_sha1:1, + exact_oid:1, deletion:1; enum { diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh index 32527a59c..a73c55a47 100755 --- a/t/t5703-upload-pack-ref-in-want.sh +++ b/t/t5703-upload-pack-ref-in-want.sh @@ -211,6 +211,18 @@ test_expect_success 'server is initially ahead - no ref in want' ' grep "ERR upload-pack: not ou
Re: [PATCH v7 07/22] commit-graph: add 'verify' subcommand
> +int verify_commit_graph(struct repository *r, struct commit_graph *g) I haven't had the time to review this patch set, but I did rebase my object store refactoring [1] on this and wrote a test: static void test_verify_commit_graph(const char *gitdir, const char *worktree) { struct repository r; char *graph_name; struct commit_graph *graph; repo_init(&r, gitdir, worktree); graph_name = get_commit_graph_filename(r.objects->objectdir); graph = load_commit_graph_one(graph_name); printf("verification returned %d\n", verify_commit_graph(&r, graph)); repo_clear(&r); } However, it doesn't work because verify_commit_graph() invokes parse_commit_internal(), which tries to look up replace refs in the_repository. I think that verify_commit_graph() should not take a repository argument for now. To minimize churn on the review of this patch set, and to minimize diffs when we migrate parse_commit_internal() (and likely other functions) to take in a repository argument, I would be OK with something like the following instead: int verify_commit_graph(struct commit_graph *g) { /* * NEEDSWORK: Make r into a parameter when all functions * invoked by this function are not hardcoded to operate on * the_repository. */ struct repository *r = the_repository; /* ... */ As for my rebased refactoring, I'll send the patches to the mailing list once Junio updates ds/commit-graph-fsck with these latest changes, so that I can rebase once again on that and ensure that everything still works. [1] https://public-inbox.org/git/cover.1529616356.git.jonathanta...@google.com/
Re: [PATCH 2/2] grep.c: teach 'git grep --only-matching'
On Wed, Jun 27, 2018 at 02:11:13PM -0700, Junio C Hamano wrote: > Taylor Blau writes: > > >> Just initializing match_color where it is defined at the beginning of > >> show_line() should be sufficient, I think. > > > > I think that we could also use the following, and leave the `if > > (opt->color)` conditional where it is: > > > > diff --git a/grep.c b/grep.c > > index 48cca6723e..b985fb3ee0 100644 > > --- a/grep.c > > +++ b/grep.c > > @@ -1448,7 +1448,7 @@ static void show_line(struct grep_opt *opt, char > > *bol, char *eol, > > const char *name, unsigned lno, ssize_t cno, char sign) > > { > > int rest = eol - bol; > > - const char *match_color, *line_color = NULL; > > + const char *match_color = NULL, *line_color = NULL; > > > > if (opt->file_break && opt->last_shown == 0) { > > if (opt->show_hunk_mark) > > You say "we could also", but the above is exactly what I suggested, > so we seem to be on the same page, which is good ;-) Ah, sorry -- I misinterpreted your meaning "initializing match_color where it is defined" to mean bringing the large `if (opt->color)` conditional up, instead of just assigning to NULL. Let's do that and leave the `if (opt->color)` block where it is? Thanks, Taylor
Re: [PATCH 2/2] grep.c: teach 'git grep --only-matching'
Taylor Blau writes: >> Just initializing match_color where it is defined at the beginning of >> show_line() should be sufficient, I think. > > I think that we could also use the following, and leave the `if > (opt->color)` conditional where it is: > > diff --git a/grep.c b/grep.c > index 48cca6723e..b985fb3ee0 100644 > --- a/grep.c > +++ b/grep.c > @@ -1448,7 +1448,7 @@ static void show_line(struct grep_opt *opt, char *bol, > char *eol, > const char *name, unsigned lno, ssize_t cno, char sign) > { > int rest = eol - bol; > - const char *match_color, *line_color = NULL; > + const char *match_color = NULL, *line_color = NULL; > > if (opt->file_break && opt->last_shown == 0) { > if (opt->show_hunk_mark) You say "we could also", but the above is exactly what I suggested, so we seem to be on the same page, which is good ;-)
Re: [PATCH v5 2/8] upload-pack: implement ref-in-want
> Yeah after thinking more about this I wonder if we have some mental model that we want to teach to the users? What is the fetch command (using the ref-in-want capability) supposed to do? * update to the latest state observed by the latest remote talked to? * update to some approximate state that is converged from multiple remotes? * update to a state that the first remote had, that we talked to Having such a model would make it easier for me to follow this discussion. > I agree, we should have the client > fail out and require that the server MUST not send additional refs. This is reasoned for by the way we evolve the client, not some state the user expects to see short or longterm? > This can of course be loosened through a capability if we want to do > something else in the future. Thanks for sanity checking me :) ok, that is a sensible way to go forward.
Re: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs
Junio C Hamano wrote: > Jeff King writes: > >> Specifically, I'm thinking of: >> >> 1. The pre-built documentation that Junio builds for >> quick-install-doc. This _could_ be customized during the "quick" >> step, but it's probably not worth the effort. However, we'd want >> some kind of generic fill-in then, and hopefully not >> "/home/jch/etc" or something. > > That is very likely to happen, actually X-<. Obviously, we don't want the end result to cause regressions in the common case or any burden on you. Would setting the ETC_GIT(CONFIG|ATTRIBUTES) variables in the dist-doc target alleviate that concern? Alternately, we can make the default use generic paths and require some other knob to enable substituting the actual paths when building documentation. I tend to think that the default should be to build documentation that is accurate for that build, but since it's something I'll set once for my package builds it's not a big deal either way to me. -- Todd ~~ Einstein argued that there must be simplified explanations of nature, because God is not capricious or arbitrary. No such faith comforts the software engineer. -- Fred Brooks
Re: [PATCH v5 2/8] upload-pack: implement ref-in-want
On 06/27, Junio C Hamano wrote: > Brandon Williams writes: > > >> > +* The server SHOULD NOT send any refs which were not requested > >> > + using 'want-ref' lines and a client MUST ignore refs which > >> > + weren't requested. > >> > >> Just being curious, but the above feels the other way around. Why > >> are we being more lenient to writers of broken server than writers > >> of broken clients? The number of installations they need to take > >> back and replace is certainly lower for the former, which means > >> that, if exchanges of unsoliclited refs are unwanted, clients should > >> notice and barf (or warn) if the server misbehaves, and the server > >> should be forbidden from sending unsolicited refs, no? > > > > Ok so should I change the server part to "MUST NOT" and the client part > > to "SHOULD"? And I can add code to die when we see refs that weren't > > requested, but i feel like if we add an ability to request a pattern in > > the future this will completely change, which is why I currently have a > > client just ignoring anything else. > > I did not have enough information to give an answer to "should I do > X?"; that is why I asked these questions prefixed with "Just being > curious". I do not quite get a good feeling that I now know enough > to answer, still, but let me try. > > If we anticipate backward incompatible changes between this early > WIP stage and the final completed protocol, it would be GOOD to make > sure that an early WIP clients/servers fail when seeing the other > side gives them something they do not understand, no? > > So... Yeah after thinking more about this I agree, we should have the client fail out and require that the server MUST not send additional refs. This can of course be loosened through a capability if we want to do something else in the future. Thanks for sanity checking me :) -- Brandon Williams
Re: [PATCH] fetch: when deepening, check connectivity fully
Jonathan Tan writes: > +test_expect_success 'shallow fetches check connectivity without stopping at > existing refs' ' > + cp -R .git server.git && > + > + # Normally, the connectivity check stops at ancestors of existing refs. > + git init client && > + GIT_TRACE="$(pwd)/trace" git -C client fetch "$(pwd)/server.git" && > + grep "run_command: git rev-list" trace >rev-list-command && > + grep -e "--not --all" rev-list-command && > + > + # But it does not for a shallow fetch... > + rm -rf client trace && > + git init client && > + GIT_TRACE="$(pwd)/trace" git -C client fetch --depth=1 > "$(pwd)/server.git" && > + grep "run_command: git rev-list" trace >rev-list-command && > + ! grep -e "--not --all" rev-list-command && > + > + # ...and when deepening. > + rm trace && > + GIT_TRACE="$(pwd)/trace" git -C client fetch --unshallow > "$(pwd)/server.git" && > + grep "run_command: git rev-list" trace >rev-list-command && > + ! grep -e "--not --all" rev-list-command > +' Hmph, don't we quote these in the trace output, requiring us to grep for "'--not' '--all'" or somesuch? I do not think of a better way to do the above without a huge effort offhand, and the approach taken by the above may be the best we could do, but it looks like quite a brittle test that knows too much about the current implementation. "rev-list $new_commits --not --all" is a so very common and useful pattern that it is not all that implausible that we may want to come up with a new option to do so, or more likely we may want to do that with an in-process API without spawning an external rev-list (hence making it impossible to observe via GIT_TRACE).
Re: [PATCH 1/2] gitignore.txt: clarify default core.excludesfile path
Junio C Hamano wrote: > Todd Zullinger writes: > >> The default core.excludesfile path is $XDG_CONFIG_HOME/git/ignore. >> $HOME/.config/git/ignore is used if XDG_CONFIG_HOME is empty or unset, > > ... because $HOME/.config is the default value for XDG_CONFIG_HOME > when it is unset, that is? If so, the change makes sense. Indeed, that's the fallback path. Thanks, -- Todd
Re: [PATCH] fetch: when deepening, check connectivity fully
Jonathan Tan writes: > Do not stop at ancestors of our refs when deepening during fetching. > This is because when performing such a fetch, shallow entries may have > been updated; the client can reasonably assume that the existing refs > are connected up to the old shallow points, but not the new. OK. > diff --git a/connected.c b/connected.c > index 91feb78815..1bba888eff 100644 > --- a/connected.c > +++ b/connected.c > @@ -58,8 +58,10 @@ int check_connected(oid_iterate_fn fn, void *cb_data, > argv_array_push(&rev_list.args, "--stdin"); > if (repository_format_partial_clone) > argv_array_push(&rev_list.args, "--exclude-promisor-objects"); > - argv_array_push(&rev_list.args, "--not"); > - argv_array_push(&rev_list.args, "--all"); > + if (!opt->is_deepening_fetch) { > + argv_array_push(&rev_list.args, "--not"); > + argv_array_push(&rev_list.args, "--all"); > + } > argv_array_push(&rev_list.args, "--quiet"); > if (opt->progress) > argv_array_pushf(&rev_list.args, "--progress=%s", Hmph, remind me how old and/or new shallow cut-off points affect this traversal? In order to verify that everything above the new cut-off points, opt->shallow_file should be pointing at the new cut-off points, then we do the full sweep (without --not --all) to ensure we won't find missing or broken objects but still stop at the new cut-off points, and then only after check_connected() passes, update the shallow file to make new shallow cut-off points active (and if the traversal fails, then we do nto install the new shallow cut-off points)? If so, that sounds sensible. By not having "--not --all", we potentially would do a full sweep, but if we are really deepening to the root commits, then we do need one full sweep anyway (as these deepest parts of the history were previously hidden by the shallow cutoff points), and if we are only deepening the history by a bit, even if we do not have "--not --all", we would hit the updated shallow cutoff points (which may be at older parts of the history) and stop, and for correctness we need to visit there anyway, so I think we are not being overly pessimistic with this change, either.
Re: [PATCH v2 1/2] t3418: add testcase showing problems with rebase -i and strategy options
Elijah Newren writes: > Seems reasonable. Since these tests were essentially copies of other > tests within the same file, just for rebase -i instead of -m, should I > also add another patch to the series fixing up the rebase -m testcase > to also replace the subshell with a single-shot environment > assignment? I personally would think it would be best to leave that to a later day, and take your v3 that properly &&-chains these two assignments. It may be a good clean-up, but is an overkill if done as a preparatory clean-up in the context of these two small fixes.
Re: [PATCH v2 1/2] t3418: add testcase showing problems with rebase -i and strategy options
Johannes Sixt writes: > Pitfalls ahead! Thanks. I said "at least the latter" for this exact reason ;-). > PATH=... git rebase ... > > is OK, but > > PATH=... test_must_fail git rebase ... > > is not; the latter requires the subshell, otherwise the modified PATH > variable survives the command because test_must_fail is a shell > function. Yes, it's silly, but that's how it is. Or you could do test_must_fail env VAR=VAL git rebase ... which I am not particularly fond of, but seems to be used quite often.
Re: [PATCH 1/2] gitignore.txt: clarify default core.excludesfile path
Todd Zullinger writes: > The default core.excludesfile path is $XDG_CONFIG_HOME/git/ignore. > $HOME/.config/git/ignore is used if XDG_CONFIG_HOME is empty or unset, ... because $HOME/.config is the default value for XDG_CONFIG_HOME when it is unset, that is? If so, the change makes sense. > as described later in the document. > Signed-off-by: Todd Zullinger > --- > Documentation/gitignore.txt | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt > index ff5d7f9ed6..d107daaffd 100644 > --- a/Documentation/gitignore.txt > +++ b/Documentation/gitignore.txt > @@ -7,7 +7,7 @@ gitignore - Specifies intentionally untracked files to ignore > > SYNOPSIS > > -$HOME/.config/git/ignore, $GIT_DIR/info/exclude, .gitignore > +$XDG_CONFIG_HOME/git/ignore, $GIT_DIR/info/exclude, .gitignore > > DESCRIPTION > ---
Re: [PATCH v3] Documentation: declare "core.ignorecase" as internal variable
Marc Strapetz writes: > [1. text/plain] > The current description of "core.ignoreCase" reads like an option which > is intended to be changed by the user while it's actually expected to > be set by Git on initialization only. Subsequently, Git relies on the > proper configuration of this variable, as noted by Bryan Turner [1]: > > Git on a case-insensitive filesystem (APFS, HFS+, FAT32, exFAT, > vFAT, NTFS, etc.) is not designed to be run with anything other > than core.ignoreCase=true. > > [1] https://marc.info/?l=git&m=152998665813997&w=2 > mid:cagyf7-gee8jrgpkme9rhkptheq6p1+ebpmmwatmh01uo3bf...@mail.gmail.com > > Signed-off-by: Marc Strapetz > --- > Documentation/config.txt | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > Hmph. Do other people have difficulty applying this patch to their trees? It is just several lines long so I could retype it myself, but I guess "Content-Type: text/plain; charset=utf-8; format=flowed" has destroyed formatting of the patch rather badly. > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828..c70cfe956 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -390,16 +390,19 @@ core.hideDotFiles:: > default mode is 'dotGitOnly'. > > core.ignoreCase:: > - If true, this option enables various workarounds to enable > + Internal variable which enables various workarounds to enable > Git to work better on filesystems that are not case sensitive, > - like FAT. For example, if a directory listing finds > - "makefile" when Git expects "Makefile", Git will assume > + like APFS, HFS+, FAT, NTFS, etc. For example, if a directory listing > + finds "makefile" when Git expects "Makefile", Git will assume > it is really the same file, and continue to remember it as > "Makefile". > + > The default is false, except linkgit:git-clone[1] or linkgit:git-init[1] > will probe and set core.ignoreCase true if appropriate when the repository > is created. > ++ > +Git relies on the proper configuration of this variable for your operating > +and file system. Modifying this value may result in unexpected behavior. > > core.precomposeUnicode:: > This option is only used by Mac OS implementation of Git.
Re: [PATCH v5 2/8] upload-pack: implement ref-in-want
Brandon Williams writes: >> > + * The server SHOULD NOT send any refs which were not requested >> > +using 'want-ref' lines and a client MUST ignore refs which >> > +weren't requested. >> >> Just being curious, but the above feels the other way around. Why >> are we being more lenient to writers of broken server than writers >> of broken clients? The number of installations they need to take >> back and replace is certainly lower for the former, which means >> that, if exchanges of unsoliclited refs are unwanted, clients should >> notice and barf (or warn) if the server misbehaves, and the server >> should be forbidden from sending unsolicited refs, no? > > Ok so should I change the server part to "MUST NOT" and the client part > to "SHOULD"? And I can add code to die when we see refs that weren't > requested, but i feel like if we add an ability to request a pattern in > the future this will completely change, which is why I currently have a > client just ignoring anything else. I did not have enough information to give an answer to "should I do X?"; that is why I asked these questions prefixed with "Just being curious". I do not quite get a good feeling that I now know enough to answer, still, but let me try. If we anticipate backward incompatible changes between this early WIP stage and the final completed protocol, it would be GOOD to make sure that an early WIP clients/servers fail when seeing the other side gives them something they do not understand, no? So...
Re: [PATCH] rebase -i: Fix white space in comments
Sorry for the whitespace bug, it looks like everything is under control one way or the other.
Re: [PATCH v6 0/4] stash: add new tests and introduce a new helper function
Johannes Schindelin writes: > Hi, > > On Mon, 25 Jun 2018, Paul-Sebastian Ungureanu wrote: > >> This first series of patches does bring some changes and improvements to >> the test suite. One of the patches also introduces a new function >> `get_oidf()` which will be hepful for the incoming patches related to >> `git stash`. > > For reviewers: it is *my* fault that this patch submission is a bit funny > with two 1/4 and one 1/6 patches... *I* suggested to not send a 14-strong > patch series but split it into three, and then I failed to explain the > correct invocation to do that from the command-line. > > My sincere apologies, > Dscho Heh, what is more useful than apology is to tell us which order these three (apparent) series build on top of each other ;-) The answer, IIUC, is that * oidf+tests come first, then * apply/drop/branch/pop (as these rely on oidf) on top, and finally * list (as it wants to add to stash--helper that is a new file in the middle) When there is clear dependency like that, I agree that it would help readers to emphasize that these cannot be applied in an arbitrary order. It is especially true as the second part of the above _will_ apply more-or-less cleanly without the first one, and then fail to compile due to lack of oidf. Thanks.
[PATCH] fsck: check skiplist for object in fsck_blob()
Since commit ed8b10f631 ("fsck: check .gitmodules content", 2018-05-02), fsck will issue an error message for '.gitmodules' content that cannot be parsed correctly. This is the case, even when the corresponding blob object has been included on the skiplist. For example, using the cgit repository, we see the following: $ git fsck Checking object directories: 100% (256/256), done. error: bad config line 5 in blob .gitmodules error in blob 51dd1eff1edc663674df9ab85d2786a40f7ae3a5: gitmodulesParse: could not parse gitmodules blob Checking objects: 100% (6626/6626), done. $ $ git config fsck.skiplist '.git/skip' $ echo 51dd1eff1edc663674df9ab85d2786a40f7ae3a5 >.git/skip $ $ git fsck Checking object directories: 100% (256/256), done. error: bad config line 5 in blob .gitmodules Checking objects: 100% (6626/6626), done. $ Note that the error message issued by the config parser is still present, despite adding the object-id of the blob to the skiplist. One solution would be to provide a means of suppressing the messages issued by the config parser. However, given that (logically) we are asking fsck to ignore this object, a simpler approach is to just not call the config parser if the object is to be skipped. Add a check to the 'fsck_blob()' processing function, to determine if the object is on the skiplist and, if so, exit the function early. Signed-off-by: Ramsay Jones --- Hi Junio, I noticed recently that the 'cgit.git' repo was complaining when doing an 'git fsck' ... Note that 'cgit' had a '.gitmodules' file and a 'submodule.sh' script back in 2007, which had nothing to do with the current git submodule facilities, ... ;-) Viz: $ git show 51dd1eff1e # This file maps a submodule path to an url from where the submodule # can be obtained. The script "submodules.sh" finds the url in this file # when invoked with -i to clone the submodules. git git://git.kernel.org/pub/scm/git/git.git $ I just remembered that I had intended to review the name of the function that this patch introduces before sending, but totally forgot! :( [Hmm, 'to_be_skipped' -> object_to_be_skipped, object_on_skiplist, etc., ... suggestions welcome!] ATB, Ramsay Jones fsck.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 48e7e3686..702ceb629 100644 --- a/fsck.c +++ b/fsck.c @@ -281,6 +281,13 @@ static void append_msg_id(struct strbuf *sb, const char *msg_id) strbuf_addstr(sb, ": "); } +static int to_be_skipped(struct fsck_options *opts, struct object *obj) +{ + if (opts && opts->skiplist && obj) + return oid_array_lookup(opts->skiplist, &obj->oid) >= 0; + return 0; +} + __attribute__((format (printf, 4, 5))) static int report(struct fsck_options *options, struct object *object, enum fsck_msg_id id, const char *fmt, ...) @@ -292,8 +299,7 @@ static int report(struct fsck_options *options, struct object *object, if (msg_type == FSCK_IGNORE) return 0; - if (options->skiplist && object && - oid_array_lookup(options->skiplist, &object->oid) >= 0) + if (to_be_skipped(options, object)) return 0; if (msg_type == FSCK_FATAL) @@ -963,6 +969,9 @@ static int fsck_blob(struct blob *blob, const char *buf, return 0; oidset_insert(&gitmodules_done, &blob->object.oid); + if (to_be_skipped(options, &blob->object)) + return 0; + if (!buf) { /* * A missing buffer here is a sign that the caller found the -- 2.18.0
Re: [PATCH v6 3/4] stash: convert branch to builtin
Johannes Schindelin writes: >> +ret = cmd_checkout(args.argc, args.argv, prefix); > > I guess it is okay to run that, but the cmd_() functions are not *really* > meant to be called this way... Personally, I would be more comfortable > with a `run_command()` here, i.e. with a spawned process, until the time > when checkout.c/checkout.h have learned enough API for a direct call. Thanks for pointing it out. I'll add a bit more for those who are reading from sideline (iow, if you are Dscho, you do not have to read the remainder, as I know Dscho knows all of this). It is not OK to use cmd_$foo() as subroutine; depending on the value of $foo, where the call is made and the number of times the call is made, it may happen to work OK in the current code, but that is a ticking timebomb waiting to go off. This is primarily because cmd_$foo() is designed to be replacement of "main()" in usual programs---it is allowed to assume the global variables it uses have their initial values and nobody cares the state it leaves behind when it returns. Argument parser may flip bits in file-scope static variables to record which options are seen, assuming that these variables are initialized to all-zero, and that assumption would not hold for the second call to cmd_$foo(), etc. cmd_$foo() also is allowed to call die() when there is an unrecoverable error condition in the context of carrying out $foo; a scripted Porcelain that used $foo as a building block to implement a larger operation (e.g. "stash" that used "checkout" to switch to a different branch) may want to react to the failure and do something extra (i.e. "git checkout || do something more"). Using run_command() interface would not cause any of these problems; the subcommand will run in a clean environment, and the caller can react to its failure.
Re: [PATCH] t/lib-submodule-update: fix absorbing test
On Wed, Jun 27, 2018 at 2:31 PM Stefan Beller wrote: > From: Eric Sunshine > > This test has been dysfunctional since it was added by 259f3ee296 > (lib-submodule-update.sh: define tests for recursing into submodules, > 2017-03-14), however, problems went unnoticed due to a broken &&-chain > toward the end of the test. > [...] > Signed-off-by: Eric Sunshine > Signed-off-by: Stefan Beller > --- > In an ideal world the commands would not fail, but absorb the git directory > of the submodule. I manually tested that it is absorbed and not data from > a git directory is lost. > > I would propose to replace that patch with the patch below; I hope > the wording did not add more confusion than there is already. Thanks for diagnosing the problem, Stefan. I'm not a submodule user and was not at all confident that I had interpreted the test breakage correctly or that my fix was appropriate, so I'm happy to have a diagnosis and fix from the person who actually wrote the test. I'll also add a Helped-by: when re-posting.
Re: [PATCH v2 1/2] t3418: add testcase showing problems with rebase
> Am 27.06.2018 um 19:27 schrieb Elijah Newren: > > On Wed, Jun 27, 2018 at 9:54 AM, Junio C Hamano wrote: > >> Having said that, it would be simpler for at least the latter to > >> write it using a single-shot environment assignment, perhaps? I.e. > >> > >> PATH=./test-bin:$PATH git rebase --continue && > >> > >> without running in a subshell? > > > > Seems reasonable. Since these tests were essentially copies of other > > tests within the same file, just for rebase -i instead of -m, should I > > also add another patch to the series fixing up the rebase -m testcase > > to also replace the subshell with a single-shot environment > > assignment? > > Pitfalls ahead! > > PATH=... git rebase ... > > is OK, but > > PATH=... test_must_fail git rebase ... > > is not; the latter requires the subshell, otherwise the modified PATH > variable survives the command because test_must_fail is a shell > function. Yes, it's silly, but that's how it is. We have the 'test_env' helper function for this case: test_env PATH=./test-bin:$PATH test_must_fail git rebase --opts
[PATCH] t/lib-submodule-update: fix absorbing test
From: Eric Sunshine This test has been dysfunctional since it was added by 259f3ee296 (lib-submodule-update.sh: define tests for recursing into submodules, 2017-03-14), however, problems went unnoticed due to a broken &&-chain toward the end of the test. The test wants to verify that replacing a submodule containing a .git directory would absorb the .git directory into the .git/modules/ of the superproject, and then replace the working tree content with the liking of the superproject. The check if submodule content is around is wrong as the submodule should have been replaced by the content of the superproject. Delete the submodule content check, which also fixes the && chain in the test. While at it, fix broken &&-chains in a couple neighboring tests. Signed-off-by: Eric Sunshine Signed-off-by: Stefan Beller --- > The test wants to verify that replacing a submodule containing a .git > directory must fail. All other "must fail" tests in this script invoke > the supplied command as 'test_must_fail', however, this test neglects to > do so. In an ideal world the commands would not fail, but absorb the git directory of the submodule. I manually tested that it is absorbed and not data from a git directory is lost. I would propose to replace that patch with the patch below; I hope the wording did not add more confusion than there is already. Thanks, Stefan t/lib-submodule-update.sh | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh index 1f38a85371a..e90ec790877 100755 --- a/t/lib-submodule-update.sh +++ b/t/lib-submodule-update.sh @@ -755,7 +755,7 @@ test_submodule_recursing_with_args_common() { : >sub1/untrackedfile && test_must_fail $command replace_sub1_with_file && test_superproject_content origin/add_sub1 && - test_submodule_content sub1 origin/add_sub1 + test_submodule_content sub1 origin/add_sub1 && test -f sub1/untracked_file ) ' @@ -842,7 +842,7 @@ test_submodule_switch_recursing_with_args () { cd submodule_update && git branch -t add_sub1 origin/add_sub1 && : >sub1 && - echo sub1 >.git/info/exclude + echo sub1 >.git/info/exclude && $command add_sub1 && test_superproject_content origin/add_sub1 && test_submodule_content sub1 origin/add_sub1 @@ -969,7 +969,6 @@ test_submodule_forced_switch_recursing_with_args () { rm -rf .git/modules/sub1 && $command replace_sub1_with_directory && test_superproject_content origin/replace_sub1_with_directory && - test_submodule_content sub1 origin/modify_sub1 test_git_directory_exists sub1 ) ' -- 2.18.0.399.gad0ab374a1-goog
[PATCH v2] fetch-pack: support negotiation tip whitelist
During negotiation, fetch-pack eventually reports as "have" lines all commits reachable from all refs. Allow the user to restrict the commits sent in this way by providing a whitelist of tips; only the tips themselves and their ancestors will be sent. This feature is only supported for protocols that support connect or stateless-connect (such as HTTP with protocol v2). This will speed up negotiation when the repository has multiple relatively independent branches (for example, when a repository interacts with multiple repositories, such as with linux-next [1] and torvalds/linux [2]), and the user knows which local branch is likely to have commits in common with the upstream branch they are fetching. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/ [2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/ Signed-off-by: Jonathan Tan --- v2 is exactly the same as the original, except with user-facing documentation in Documentation/fetch-options.txt. > What's the plan to expose this "feature" to end-users? There is no > end-user facing documentation added by this patch, and in-code > comments only talk about what (mechanical) effect the option has, > but not when a user may want to use the feature, or how the user > would best decide the set of commits to pass to this new option. Jonathan Nieder also mentioned this. Lack of documentation was an oversight, sorry. I've added it in this version. > Would something like this > > git fetch $(git for-each-ref \ > --format=--nego-tip="%(objectname)" \ > refs/remotes/linux-next/) \ > linux-next > > be an expected typical way to pull from one remote, exposing only > the tips of refs we got from that remote and not the ones we > obtained from other places? Yes, that is one way. Alternatively, if the user is only fetching one branch, they may also want to specify a single branch. --- Documentation/fetch-options.txt | 12 +++ builtin/fetch.c | 21 + fetch-pack.c| 19 ++-- fetch-pack.h| 7 + t/t5510-fetch.sh| 55 + transport-helper.c | 3 ++ transport.c | 1 + transport.h | 10 ++ 8 files changed, 126 insertions(+), 2 deletions(-) diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt index 97d3217df9..80c4c94595 100644 --- a/Documentation/fetch-options.txt +++ b/Documentation/fetch-options.txt @@ -42,6 +42,18 @@ the current repository has the same history as the source repository. .git/shallow. This option updates .git/shallow and accept such refs. +--negotiation-tip:: + By default, Git will report, to the server, commits reachable + from all local refs to find common commits in an attempt to + reduce the size of the to-be-received packfile. If specified, + Git will only report commits reachable from the given commit. + This is useful to speed up fetches when the user knows which + local ref is likely to have commits in common with the + upstream ref being fetched. ++ +This option may be specified more than once; if so, Git will report +commits reachable from any of the given commits. + ifndef::git-pull[] --dry-run:: Show what would be done, without making any changes. diff --git a/builtin/fetch.c b/builtin/fetch.c index ea5b9669ad..12daec0f3b 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -63,6 +63,7 @@ static int shown_url = 0; static struct refspec refmap = REFSPEC_INIT_FETCH; static struct list_objects_filter_options filter_options; static struct string_list server_options = STRING_LIST_INIT_DUP; +static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP; static int git_fetch_config(const char *k, const char *v, void *cb) { @@ -174,6 +175,8 @@ static struct option builtin_fetch_options[] = { TRANSPORT_FAMILY_IPV4), OPT_SET_INT('6', "ipv6", &family, N_("use IPv6 addresses only"), TRANSPORT_FAMILY_IPV6), + OPT_STRING_LIST(0, "negotiation-tip", &negotiation_tip, N_("revision"), + N_("report that we have only objects reachable from this object")), OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options), OPT_END() }; @@ -1075,6 +1078,24 @@ static struct transport *prepare_transport(struct remote *remote, int deepen) filter_options.filter_spec); set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); } + if (negotiation_tip.nr) { + struct oid_array *oids; + if (transport->smart_options) { + int i; + oids = xcalloc(1, sizeof(*oids)); + for (i = 0; i < negotiation_tip.nr; i++) { + struct object_id oid; +
Re: [PATCH v3 1/2] t3418: add testcase showing problems with rebase -i and strategy options
> diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh > index 03bf1b8a3b..11546d6e14 100755 > --- a/t/t3418-rebase-continue.sh > +++ b/t/t3418-rebase-continue.sh > @@ -74,6 +74,38 @@ test_expect_success 'rebase --continue remembers merge > strategy and options' ' > test -f funny.was.run > ' > > +test_expect_failure 'rebase -i --continue handles merge strategy and > options' ' > + rm -fr .git/rebase-* && > + git reset --hard commit-new-file-F2-on-topic-branch && > + test_commit "commit-new-file-F3-on-topic-branch-for-dash-i" F3 32 && > + test_when_finished "rm -fr test-bin funny.was.run funny.args" && > + mkdir test-bin && > + cat >test-bin/git-merge-funny <<-EOF && > + #!$SHELL_PATH > + echo "\$@" >>funny.args > + case "\$1" in --opt) ;; *) exit 2 ;; esac > + case "\$2" in --foo) ;; *) exit 2 ;; esac > + case "\$4" in --) ;; *) exit 2 ;; esac > + shift 2 && > + >funny.was.run && > + exec git merge-recursive "\$@" > + EOF > + chmod +x test-bin/git-merge-funny && You could use the 'write_script' helper function here. > + ( > + PATH=./test-bin:$PATH && > + test_must_fail git rebase -i -s funny -Xopt -Xfoo master topic > + ) && > + test -f funny.was.run && > + rm funny.was.run && > + echo "Resolved" >F2 && > + git add F2 && > + ( > + PATH=./test-bin:$PATH && > + git rebase --continue > + ) && > + test -f funny.was.run > +' > + > test_expect_success 'rebase passes merge strategy options correctly' ' > rm -fr .git/rebase-* && > git reset --hard commit-new-file-F3-on-topic-branch && > -- > 2.18.0.9.g431b2c36d5 > >
Re: [PATCH v7 21/22] gc: automatically write commit-graph files
On 6/27/2018 2:09 PM, Junio C Hamano wrote: Derrick Stolee writes: @@ -40,6 +41,7 @@ static int aggressive_depth = 50; static int aggressive_window = 250; static int gc_auto_threshold = 6700; static int gc_auto_pack_limit = 50; +static int gc_write_commit_graph = 0; Please avoid unnecessary (and undesirable) explicit initialization to 0. Instead, let BSS to handle it by leaving " = 0" out. +test_expect_success 'check that gc computes commit-graph' ' + cd "$TRASH_DIRECTORY/full" && + git commit --allow-empty -m "blank" && + git commit-graph write --reachable && + cp $objdir/info/commit-graph commit-graph-before-gc && + git reset --hard HEAD~1 && + git config gc.writeCommitGraph true && + git gc && + cp $objdir/info/commit-graph commit-graph-after-gc && + ! test_cmp commit-graph-before-gc commit-graph-after-gc && The set of commits in the commit graph will chanbe by discarding the (old) tip commit, so the fact that the contents of the file changed across gc proves that "commit-graph write" was triggered during gc. Come to think of it, do we promise to end-users (in docs etc.) that commit-graph covers *ONLY* commits reachable from refs and HEAD? I am wondering what should happen if "git gc" here does not prune the reflog for HEAD---wouldn't we want to reuse the properties of the commit we are discarding when it comes back (e.g. you push, then reset back, and the next pull makes it reappear in your repository)? Today I learned that 'gc' keeps some of the reflog around. That makes sense, but I wouldn't optimize the commit-graph file for this scenario. I guess what I am really questioning is if it is sensible to define "--reachable" as "starting at all refs", unlike the usual connectivity rules "gc" uses, especially when this is run from inside "gc". It is sensible to me, especially because we only lose performance if we visit those other commits that are still in the object database. By writing the commit-graph on 'gc' and not during 'fetch', we are already assuming the commit-graph will usually be behind the set of commits that the user cares about, by design. An alternate view on the decision will need help answering from others who know more than me: In fetch negotiation, does the client report commits in the reflog as 'have's or do they get re-downloaded on a resulting 'git pull'? + git commit-graph write --reachable && + test_cmp commit-graph-after-gc $objdir/info/commit-graph This says that running "commit-graph write" twice without changing the topology MUST yield byte-for-byte identical commit-graph file. Is that a reasonable requirement on the future implementation? I am wondering if there will arise a situation where you need to store records in "some" fixed order but two records compare "the same" and tie-breaking them to give stable sort is expensive, or something like that, which would benefit if you leave an escape hatch to allow two logically identical graphs expressed bitwise differently. Since the file format allows flexibility in the order of the chunks, it is possible to have bitwise-different files that represent the same set of data. However, I would not want git to provide inconsistent output given the same set of commits covered by the file. Thanks, -Stolee
Re: [PATCH v2 1/2] t3418: add testcase showing problems with rebase -i and strategy options
On Wed, Jun 27, 2018 at 2:17 PM Johannes Sixt wrote: > Pitfalls ahead! > > PATH=... git rebase ... > > is OK, but > > PATH=... test_must_fail git rebase ... > > is not; the latter requires the subshell, otherwise the modified PATH > variable survives the command because test_must_fail is a shell > function. Yes, it's silly, but that's how it is. As an alternative to the subshell, some test use 'env': test_must_fail env PATH=... git rebase ...
Re: [PATCH v5 8/8] fetch-pack: implement ref-in-want
On 06/27, Jonathan Tan wrote: > > +test_expect_success 'setup repos for change-while-negotiating test' ' > > The tests that follow are basic ref-in-want tests, not tests on a repo > that changes during negotiation - this would be just "setup repos for > fetch tests". That looks like a copy-paste error. > > > +test_expect_success 'fetching with exact OID' ' > > + rm -rf local && > > + cp -r "$LOCAL_PRISTINE" local && > > + git -C local fetch origin $(git -C "$REPO" rev-parse > > d):refs/heads/actual && > > + > > + git -C "$REPO" rev-parse "d" >expected && > > + git -C local rev-parse refs/heads/actual >actual && > > + test_cmp expected actual > > +' > > Also verify that "want-ref refs/tags/d" is being sent over the wire, and > not any "want ...". (If not we can't distinguish these from the usual > non-want-ref behavior.) Same comment for the other tests. I think your mistaken on how what this test is looking for. no want-ref line is going to be sent because we're requesting an exact OID here, not a ref. But I can add checks for want-ref in the tests that should be sending want-ref. > > Other than that (and my other comments), this patch series looks good. -- Brandon Williams
Re: [PATCH v2 1/2] t3418: add testcase showing problems with rebase -i and strategy options
Am 27.06.2018 um 19:27 schrieb Elijah Newren: On Wed, Jun 27, 2018 at 9:54 AM, Junio C Hamano wrote: Having said that, it would be simpler for at least the latter to write it using a single-shot environment assignment, perhaps? I.e. PATH=./test-bin:$PATH git rebase --continue && without running in a subshell? Seems reasonable. Since these tests were essentially copies of other tests within the same file, just for rebase -i instead of -m, should I also add another patch to the series fixing up the rebase -m testcase to also replace the subshell with a single-shot environment assignment? Pitfalls ahead! PATH=... git rebase ... is OK, but PATH=... test_must_fail git rebase ... is not; the latter requires the subshell, otherwise the modified PATH variable survives the command because test_must_fail is a shell function. Yes, it's silly, but that's how it is. -- Hannes
Re: [PATCH v5 7/8] fetch-pack: put shallow info in output parameter
On 06/26, Junio C Hamano wrote: > Brandon Williams writes: > > > Expand the transport fetch method signature, by adding an output > > parameter, to allow transports to return information about the refs they > > have fetched. Then communicate shallow status information through this > > mechanism instead of by modifying the input list of refs. > > Makes sense. Would this mechanism also allow us to be more explicit > about the "tag following"? > Yes most likely. We could change it so that when a packfile is sent the result of tag following could be sent along too (the actual tag refs themselves that is) instead of having the client rely on the ref advertisement for tag following. -- Brandon Williams
Re: [PATCH v7 21/22] gc: automatically write commit-graph files
Derrick Stolee writes: > @@ -40,6 +41,7 @@ static int aggressive_depth = 50; > static int aggressive_window = 250; > static int gc_auto_threshold = 6700; > static int gc_auto_pack_limit = 50; > +static int gc_write_commit_graph = 0; Please avoid unnecessary (and undesirable) explicit initialization to 0. Instead, let BSS to handle it by leaving " = 0" out. > +test_expect_success 'check that gc computes commit-graph' ' > + cd "$TRASH_DIRECTORY/full" && > + git commit --allow-empty -m "blank" && > + git commit-graph write --reachable && > + cp $objdir/info/commit-graph commit-graph-before-gc && > + git reset --hard HEAD~1 && > + git config gc.writeCommitGraph true && > + git gc && > + cp $objdir/info/commit-graph commit-graph-after-gc && > + ! test_cmp commit-graph-before-gc commit-graph-after-gc && The set of commits in the commit graph will chanbe by discarding the (old) tip commit, so the fact that the contents of the file changed across gc proves that "commit-graph write" was triggered during gc. Come to think of it, do we promise to end-users (in docs etc.) that commit-graph covers *ONLY* commits reachable from refs and HEAD? I am wondering what should happen if "git gc" here does not prune the reflog for HEAD---wouldn't we want to reuse the properties of the commit we are discarding when it comes back (e.g. you push, then reset back, and the next pull makes it reappear in your repository)? I guess what I am really questioning is if it is sensible to define "--reachable" as "starting at all refs", unlike the usual connectivity rules "gc" uses, especially when this is run from inside "gc". > + git commit-graph write --reachable && > + test_cmp commit-graph-after-gc $objdir/info/commit-graph This says that running "commit-graph write" twice without changing the topology MUST yield byte-for-byte identical commit-graph file. Is that a reasonable requirement on the future implementation? I am wondering if there will arise a situation where you need to store records in "some" fixed order but two records compare "the same" and tie-breaking them to give stable sort is expensive, or something like that, which would benefit if you leave an escape hatch to allow two logically identical graphs expressed bitwise differently. > +' > + > # the verify tests below expect the commit-graph to contain > # exactly the commits reachable from the commits/8 branch. > # If the file changes the set of commits in the list, then the
Re: [PATCH v5 8/8] fetch-pack: implement ref-in-want
> +test_expect_success 'setup repos for change-while-negotiating test' ' The tests that follow are basic ref-in-want tests, not tests on a repo that changes during negotiation - this would be just "setup repos for fetch tests". > +test_expect_success 'fetching with exact OID' ' > + rm -rf local && > + cp -r "$LOCAL_PRISTINE" local && > + git -C local fetch origin $(git -C "$REPO" rev-parse > d):refs/heads/actual && > + > + git -C "$REPO" rev-parse "d" >expected && > + git -C local rev-parse refs/heads/actual >actual && > + test_cmp expected actual > +' Also verify that "want-ref refs/tags/d" is being sent over the wire, and not any "want ...". (If not we can't distinguish these from the usual non-want-ref behavior.) Same comment for the other tests. Other than that (and my other comments), this patch series looks good.
Re: [PATCH v7 22/22] commit-graph: update design document
Derrick Stolee writes: > The commit-graph feature is now integrated with 'fsck' and 'gc', > so remove those items from the "Future Work" section of the > commit-graph design document. Nice ;-)
Re: [PATCH v5 3/8] upload-pack: test negotiation with changing repository
On 06/26, Junio C Hamano wrote: > Brandon Williams writes: > > > diff --git a/t/lib-httpd/one-time-sed.sh b/t/lib-httpd/one-time-sed.sh > > new file mode 100644 > > index 0..8a9a5aca0 > > --- /dev/null > > +++ b/t/lib-httpd/one-time-sed.sh > > @@ -0,0 +1,22 @@ > > +#!/bin/sh > > + > > +# If "one-time-sed" exists in $HTTPD_ROOT_PATH, run sed on the HTTP > > response, > > +# using the contents of "one-time-sed" as the sed command to be run. If the > > +# response was modified as a result, delete "one-time-sed" so that > > subsequent > > +# HTTP responses are no longer modified. > > ;-) clever. > > > +# > > +# This can be used to simulate the effects of the repository changing in > > +# between HTTP request-response pairs. > > +if [ -e one-time-sed ]; then "$GIT_EXEC_PATH/git-http-backend" >out > > Style (cf. Documentation/CodingGuidelines). > > > + > > + sed "$(cat one-time-sed)" out_modified > > + > > + if diff out out_modified >/dev/null; then > > + cat out > > + else > > + cat out_modified > > + rm one-time-sed > > + fi > > +else > > + "$GIT_EXEC_PATH/git-http-backend" > > +fi > > OK. I was worried if the removal-after-use was about _this_ script > (in which case it is a bad hygiene), but this is not a one-time > script, but merely the mechanism to use the one-time sed script. > > Perhaps rename this to t/lib-httpd/apply-one-time-sed.sh or something > to avoid confusion? Sure, I'll go ahead and rename it to that. > > > diff --git a/t/t5703-upload-pack-ref-in-want.sh > > b/t/t5703-upload-pack-ref-in-want.sh > > index 0ef182970..a4fe0e7e4 100755 > > --- a/t/t5703-upload-pack-ref-in-want.sh > > +++ b/t/t5703-upload-pack-ref-in-want.sh > > @@ -150,4 +150,72 @@ test_expect_success 'want-ref with ref we already have > > commit for' ' > > check_output > > ' > > > > +. "$TEST_DIRECTORY"/lib-httpd.sh > > +start_httpd > > + > > +REPO="$HTTPD_DOCUMENT_ROOT_PATH/repo" > > +LOCAL_PRISTINE="$(pwd)/local_pristine" > > + > > +test_expect_success 'setup repos for change-while-negotiating test' ' > > + ( > > + git init "$REPO" && > > + cd "$REPO" && > > + >.git/git-daemon-export-ok && > > + test_commit m1 && > > + git tag -d m1 && > > + > > + # Local repo with many commits (so that negotiation will take > > + # more than 1 request/response pair) > > + git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo"; > > "$LOCAL_PRISTINE" && > > + cd "$LOCAL_PRISTINE" && > > + git checkout -b side && > > + for i in $(seq 1 33); do test_commit s$i; done && > > + > > + # Add novel commits to upstream > > + git checkout master && > > + cd "$REPO" && > > + test_commit m2 && > > + test_commit m3 && > > + git tag -d m2 m3 > > + ) && > > + git -C "$LOCAL_PRISTINE" remote set-url origin > > "http://127.0.0.1:$LIB_HTTPD_PORT/one_time_sed/repo"; && > > + git -C "$LOCAL_PRISTINE" config protocol.version 2 > > +' > > + > > +inconsistency() { > > Style. "inconsistency () {" > > > + # Simulate that the server initially reports $2 as the ref > > + # corresponding to $1, and after that, $1 as the ref corresponding to > > + # $1. This corresponds to the real-life situation where the server's > > + # repository appears to change during negotiation, for example, when > > + # different servers in a load-balancing arrangement serve (stateless) > > + # RPCs during a single negotiation. > > + printf "s/%s/%s/" \ > > + $(git -C "$REPO" rev-parse $1 | tr -d "\n") \ > > + $(git -C "$REPO" rev-parse $2 | tr -d "\n") \ > > + >"$HTTPD_ROOT_PATH/one-time-sed" > > +} > > + > > +test_expect_success 'server is initially ahead - no ref in want' ' > > + git -C "$REPO" config uploadpack.allowRefInWant false && > > + rm -rf local && > > + cp -r "$LOCAL_PRISTINE" local && > > + inconsistency master 1234567890123456789012345678901234567890 && > > + test_must_fail git -C local fetch 2>err && > > + grep "ERR upload-pack: not our ref" err > > +' > > + > > +test_expect_success 'server is initially behind - no ref in want' ' > > + git -C "$REPO" config uploadpack.allowRefInWant false && > > + rm -rf local && > > + cp -r "$LOCAL_PRISTINE" local && > > + inconsistency master "master^" && > > + git -C local fetch && > > + > > + git -C "$REPO" rev-parse --verify "master^" >expected && > > + git -C local rev-parse --verify refs/remotes/origin/master >actual && > > + test_cmp expected actual > > +' > > + > > +stop_httpd > > + > > test_done -- Brandon Williams
Re: [PATCH v5 2/8] upload-pack: implement ref-in-want
> Brandon Williams writes: > > > +wanted-refs section > > + * This section is only included if the client has requested a > > + ref using a 'want-ref' line and if a packfile section is also > > + included in the response. > > + > > + * Always begins with the section header "wanted-refs". > > + > > + * The server will send a ref listing (" ") for > > + each reference requested using 'want-ref' lines. > > + > > + * The server SHOULD NOT send any refs which were not requested > > + using 'want-ref' lines and a client MUST ignore refs which > > + weren't requested. > > Just being curious, but the above feels the other way around. Why > are we being more lenient to writers of broken server than writers > of broken clients? The number of installations they need to take > back and replace is certainly lower for the former, which means > that, if exchanges of unsoliclited refs are unwanted, clients should > notice and barf (or warn) if the server misbehaves, and the server > should be forbidden from sending unsolicited refs, no? I agree - if you were thinking of later relaxing this constraint to allow the server to supply tag ref information during tag following, writing "MUST NOT" here is still fine. (We can later change this to "if 'include-tag-ref' is sent, the server MAY send refs which were not requested, otherwise, the server MUST NOT".) Both Jonathan Nieder and I also suggested client enforcement [1] - I see that in patch 8, in receive_wanted_refs(), unrecognized wanted-ref lines are silently ignored. Maybe at least a warning would be good. [1] https://public-inbox.org/git/20180625230310.210182-1-jonathanta...@google.com/
Re: [PATCH v5 2/8] upload-pack: implement ref-in-want
On 06/26, Junio C Hamano wrote: > Brandon Williams writes: > > > +wanted-refs section > > + * This section is only included if the client has requested a > > + ref using a 'want-ref' line and if a packfile section is also > > + included in the response. > > + > > + * Always begins with the section header "wanted-refs". > > + > > + * The server will send a ref listing (" ") for > > + each reference requested using 'want-ref' lines. > > + > > + * The server SHOULD NOT send any refs which were not requested > > + using 'want-ref' lines and a client MUST ignore refs which > > + weren't requested. > > Just being curious, but the above feels the other way around. Why > are we being more lenient to writers of broken server than writers > of broken clients? The number of installations they need to take > back and replace is certainly lower for the former, which means > that, if exchanges of unsoliclited refs are unwanted, clients should > notice and barf (or warn) if the server misbehaves, and the server > should be forbidden from sending unsolicited refs, no? Ok so should I change the server part to "MUST NOT" and the client part to "SHOULD"? And I can add code to die when we see refs that weren't requested, but i feel like if we add an ability to request a pattern in the future this will completely change, which is why I currently have a client just ignoring anything else. > > > > diff --git a/t/t5703-upload-pack-ref-in-want.sh > > b/t/t5703-upload-pack-ref-in-want.sh > > new file mode 100755 > > index 0..0ef182970 > > --- /dev/null > > +++ b/t/t5703-upload-pack-ref-in-want.sh > > @@ -0,0 +1,153 @@ > > +#!/bin/sh > > + > > +test_description='upload-pack ref-in-want' > > + > > +. ./test-lib.sh > > + > > +get_actual_refs() { > > Style. "get_actual_refs () {" > > > + sed -n '/wanted-refs/,/0001/p' > unpack >actual_refs > > Unnecessary piping of sed output into another sed invocation? > > sed -n -e '/wanted-refs/,/0001/{ > /wanted-refs/d > /0001/d > p > }' > > or something like that? Yeah thanks for the help with sed :) -- Brandon Williams
Re: [PATCH 08/29] t7400: fix broken "submodule add/reconfigure --force" test
On Tue, Jun 26, 2018 at 12:30 AM Eric Sunshine wrote: > > This test has been dysfunctional since it was added by 619acfc78c > (submodule add: extend force flag to add existing repos, 2016-10-06), > however, two problems early in the test went unnoticed due to a broken > &&-chain later in the test. > > First, it tries configuring the submodule with repository "bogus-url", > however, "git submodule add" insists that the repository be either an > absolute URL or a relative pathname requiring prefix "./" or "../" (this > is true even with --force), but "bogus-url" does not meet those > criteria, thus the command fails. > > Second, it then tries configuring a submodule with a path which is > .gitignore'd, which is disallowed. This restriction can be overridden > with --force, but the test neglects to use that option. > > Fix both problems, as well as the broken &&-chain behind which they hid. > > Signed-off-by: Eric Sunshine This patch is Reviewed-by: Stefan Beller Thanks for this whole series (I just read the cover letter) and I think detecting broken && chains is a valuable part in the test suite. Thanks, Stefan > --- > t/t7400-submodule-basic.sh | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh > index 812db137b8..401adaed32 100755 > --- a/t/t7400-submodule-basic.sh > +++ b/t/t7400-submodule-basic.sh > @@ -171,12 +171,12 @@ test_expect_success 'submodule add to .gitignored path > with --force' ' > test_expect_success 'submodule add to reconfigure existing submodule with > --force' ' > ( > cd addtest-ignore && > - git submodule add --force bogus-url submod && > - git submodule add -b initial "$submodurl" submod-branch && > - test "bogus-url" = "$(git config -f .gitmodules > submodule.submod.url)" && > - test "bogus-url" = "$(git config submodule.submod.url)" && > + git submodule add --force /bogus-url submod && > + git submodule add --force -b initial "$submodurl" > submod-branch && > + test "/bogus-url" = "$(git config -f .gitmodules > submodule.submod.url)" && > + test "/bogus-url" = "$(git config submodule.submod.url)" && > # Restore the url > - git submodule add --force "$submodurl" submod > + git submodule add --force "$submodurl" submod && > test "$submodurl" = "$(git config -f .gitmodules > submodule.submod.url)" && > test "$submodurl" = "$(git config submodule.submod.url)" > ) > -- > 2.18.0.419.gfe4b301394 >
Re: [PATCH v5 3/8] upload-pack: test negotiation with changing repository
> +# If "one-time-sed" exists in $HTTPD_ROOT_PATH, run sed on the HTTP response, > +# using the contents of "one-time-sed" as the sed command to be run. If the > +# response was modified as a result, delete "one-time-sed" so that subsequent > +# HTTP responses are no longer modified. > +# Whitespace error on this line (the line immediately after "are no longer modified.").
Re: [PATCH v7 20/22] commit-graph: add '--reachable' option
Derrick Stolee writes: > diff --git a/commit-graph.c b/commit-graph.c > index a06e7e9549..adf54e3fe7 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -7,6 +7,7 @@ > #include "packfile.h" > #include "commit.h" > #include "object.h" > +#include "refs.h" > #include "revision.h" > #include "sha1-lookup.h" > #include "commit-graph.h" > @@ -656,6 +657,23 @@ static void compute_generation_numbers(struct > packed_commit_list* commits) > } > } > > +static int add_ref_to_list(const char *refname, > +const struct object_id *oid, > +int flags, void *cb_data) > +{ > + struct string_list *list = (struct string_list*)cb_data; Style: "(struct string_list *)cb_data" Also please have a blank line here between the decls and the first statement. > + string_list_append(list, oid_to_hex(oid)); > + return 0; > +}
Re: [PATCH v7 17/22] commit-graph: verify contents match checksum
Derrick Stolee writes: > + devnull = open("/dev/null", O_WRONLY); > + f = hashfd(devnull, NULL); > + hashwrite(f, g->data, g->data_len - g->hash_len); I wondered if we can hide the knowledge of "/dev/null" by reusing hashfd_check() from csum-file.c (which is used by the verification of pack idx file), which also gives us the hash comparison of an existing file on disk for free. But unfortunately this codepath only verifies the checksum and not contents, so we cannot avoid setting up the /dev/null sink manually like this patch does, I guess. > + finalize_hashfile(f, checksum.hash, CSUM_CLOSE); > + if (hashcmp(checksum.hash, g->data + g->data_len - g->hash_len)) { > + graph_report(_("the commit-graph file has incorrect checksum > and is likely corrupt")); > + verify_commit_graph_error = VERIFY_COMMIT_GRAPH_ERROR_HASH; > + } > + > for (i = 0; i < g->num_commits; i++) { > struct commit *graph_commit; > > @@ -909,7 +921,7 @@ int verify_commit_graph(struct repository *r, struct > commit_graph *g) > cur_fanout_pos++; > } > > - if (verify_commit_graph_error) > + if (verify_commit_graph_error & ~VERIFY_COMMIT_GRAPH_ERROR_HASH) > return verify_commit_graph_error; > > for (i = 0; i < g->num_commits; i++) { > diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh > index a0cf1f66de..fed05e2f12 100755 > --- a/t/t5318-commit-graph.sh > +++ b/t/t5318-commit-graph.sh > @@ -279,6 +279,7 @@ GRAPH_COMMIT_DATA_WIDTH=$(($HASH_LEN + 16)) > GRAPH_OCTOPUS_DATA_OFFSET=$(($GRAPH_COMMIT_DATA_OFFSET + \ >$GRAPH_COMMIT_DATA_WIDTH * $NUM_COMMITS)) > GRAPH_BYTE_OCTOPUS=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4)) > +GRAPH_BYTE_FOOTER=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4 * $NUM_OCTOPUS_EDGES)) > > # usage: corrupt_graph_and_verify > # Manipulates the commit-graph file at the position > @@ -393,4 +394,9 @@ test_expect_success 'detect incorrect parent for octopus > merge' ' > "invalid parent" > ' > > +test_expect_success 'detect invalid checksum hash' ' > + corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \ > + "incorrect checksum" > +' > + > test_done
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On 27.06.18 09:54, Steve Groeger wrote: > Hi, > > Sorry for incomplete post earlier. Here is the full post: > > > In the latest version of git a new attribute has been added, > working-tree-encoding. The release notes states: > > 'The new "working-tree-encoding" attribute can ask Git to convert the >contents to the specified encoding when checking out to the working >tree (and the other way around when checking in).' > We have been using this attribute on our z/OS systems using a version of git > from Rocket software to convert files to EBCDIC for quite a while now. On > other platforms (Linux, AIX etc) git ignored this attribute and therefore > left the files in ASCII. > > We have common code that is supposed to be usable across different platforms > and hence different file encodings. With the full support of the > working-tree-encoding in the latest version of git on all platforms, how do > we have files converted to different encodings on different platforms? > I could not find anything that would allow us to say 'if platform = z/OS then > encoding=EBCDIC else encoding=ASCII'. Is there a way this can be done? > > > > > Thanks > Steve Groeger [] Did you consider to put a gitattributes file on machine level ? https://git-scm.com/docs/gitattributes [snipped the other places where to put gitattributes] ... Attributes for all users on a system should be placed in the $(prefix)/etc/gitattributes file. > Java Runtimes Development > IBM Hursley > IBM United Kingdom Ltd > Tel: (44) 1962 816911 Mobex: 279990 Mobile: 07718 517 129 > Fax (44) 1962 816800 > Lotus Notes: Steve Groeger/UK/IBM > Internet: groe...@uk.ibm.com > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU >
Re: [PATCH v7 01/22] t5318-commit-graph.sh: use core.commitGraph
Derrick Stolee writes: > The commit-graph tests should be checking that normal Git operations > succeed and have matching output with and without the commit-graph > feature enabled. However, the test was toggling 'core.graph' instead > of the correct 'core.commitGraph' variable. > > Signed-off-by: Derrick Stolee > --- > > Junio, > > I sent this patch as a one-off a while ago, but it seems it was dropped. > I'm adding it back here so we don't forget it. Thanks, will queue. > > -Stolee > > t/t5318-commit-graph.sh | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh > index 77d85aefe7..59d0be2877 100755 > --- a/t/t5318-commit-graph.sh > +++ b/t/t5318-commit-graph.sh > @@ -28,8 +28,8 @@ test_expect_success 'create commits and repack' ' > ' > > graph_git_two_modes() { > - git -c core.graph=true $1 >output > - git -c core.graph=false $1 >expect > + git -c core.commitGraph=true $1 >output > + git -c core.commitGraph=false $1 >expect > test_cmp output expect > }
[PATCH] fetch: when deepening, check connectivity fully
Do not stop at ancestors of our refs when deepening during fetching. This is because when performing such a fetch, shallow entries may have been updated; the client can reasonably assume that the existing refs are connected up to the old shallow points, but not the new. This was noticed when a user tried to deepen an existing repository by fetching with --no-shallow from a server that did not send all necessary objects - the connectivity check as run by "git fetch" succeeded, but a subsequent "git fsck" failed. Signed-off-by: Jonathan Tan --- builtin/fetch.c | 5 - connected.c | 6 -- connected.h | 7 +++ t/t5537-fetch-shallow.sh | 23 +++ 4 files changed, 38 insertions(+), 3 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index ea5b9669ad..2cfd13342f 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -780,6 +780,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, const char *filename = dry_run ? "/dev/null" : git_path_fetch_head(); int want_status; int summary_width = transport_summary_width(ref_map); + struct check_connected_options opt = CHECK_CONNECTED_INIT; fp = fopen(filename, "a"); if (!fp) @@ -791,7 +792,9 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, url = xstrdup("foreign"); rm = ref_map; - if (check_connected(iterate_ref_map, &rm, NULL)) { + if (deepen) + opt.is_deepening_fetch = 1; + if (check_connected(iterate_ref_map, &rm, &opt)) { rc = error(_("%s did not send all necessary objects\n"), url); goto abort; } diff --git a/connected.c b/connected.c index 91feb78815..1bba888eff 100644 --- a/connected.c +++ b/connected.c @@ -58,8 +58,10 @@ int check_connected(oid_iterate_fn fn, void *cb_data, argv_array_push(&rev_list.args, "--stdin"); if (repository_format_partial_clone) argv_array_push(&rev_list.args, "--exclude-promisor-objects"); - argv_array_push(&rev_list.args, "--not"); - argv_array_push(&rev_list.args, "--all"); + if (!opt->is_deepening_fetch) { + argv_array_push(&rev_list.args, "--not"); + argv_array_push(&rev_list.args, "--all"); + } argv_array_push(&rev_list.args, "--quiet"); if (opt->progress) argv_array_pushf(&rev_list.args, "--progress=%s", diff --git a/connected.h b/connected.h index a53f03a61a..322dc76372 100644 --- a/connected.h +++ b/connected.h @@ -38,6 +38,13 @@ struct check_connected_options { * Insert these variables into the environment of the child process. */ const char **env; + + /* +* If non-zero, check the ancestry chain completely, not stopping at +* any existing ref. This is necessary when deepening existing refs +* during a fetch. +*/ + unsigned is_deepening_fetch : 1; }; #define CHECK_CONNECTED_INIT { 0 } diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh index df8d2f095a..ac4beac96b 100755 --- a/t/t5537-fetch-shallow.sh +++ b/t/t5537-fetch-shallow.sh @@ -186,4 +186,27 @@ EOF test_cmp expect actual ' +test_expect_success 'shallow fetches check connectivity without stopping at existing refs' ' + cp -R .git server.git && + + # Normally, the connectivity check stops at ancestors of existing refs. + git init client && + GIT_TRACE="$(pwd)/trace" git -C client fetch "$(pwd)/server.git" && + grep "run_command: git rev-list" trace >rev-list-command && + grep -e "--not --all" rev-list-command && + + # But it does not for a shallow fetch... + rm -rf client trace && + git init client && + GIT_TRACE="$(pwd)/trace" git -C client fetch --depth=1 "$(pwd)/server.git" && + grep "run_command: git rev-list" trace >rev-list-command && + ! grep -e "--not --all" rev-list-command && + + # ...and when deepening. + rm trace && + GIT_TRACE="$(pwd)/trace" git -C client fetch --unshallow "$(pwd)/server.git" && + grep "run_command: git rev-list" trace >rev-list-command && + ! grep -e "--not --all" rev-list-command +' + test_done -- 2.18.0.rc2.346.g013aa6912e-goog
Re: [PATCH v2 1/2] t3418: add testcase showing problems with rebase -i and strategy options
On Wed, Jun 27, 2018 at 9:54 AM, Junio C Hamano wrote: > + ( > + PATH=./test-bin:$PATH > + test_must_fail git rebase -i -s funny -Xopt -Xfoo master topic > + ) && > > + ( > + PATH=./test-bin:$PATH > + git rebase --continue > + ) && > > and it would be reasonable to assume that these variable assignments > would not fail. IOW, there is no BUG in breaking &&-chain at these > two places. It is merely that the automated mechanism introduced by > step 29/29 of that other series does not understand that (which is > expected). It is not wrong to have && at the end of the assignment, > though. > > Having said that, it would be simpler for at least the latter to > write it using a single-shot environment assignment, perhaps? I.e. > > PATH=./test-bin:$PATH git rebase --continue && > > without running in a subshell? Seems reasonable. Since these tests were essentially copies of other tests within the same file, just for rebase -i instead of -m, should I also add another patch to the series fixing up the rebase -m testcase to also replace the subshell with a single-shot environment assignment?
Re: [PATCH 2/2] grep.c: teach 'git grep --only-matching'
On Wed, Jun 27, 2018 at 09:40:10AM -0700, Junio C Hamano wrote: > Taylor Blau writes: > > > - if (sign == ':') > > - match_color = opt->color_match_selected; > > - else > > - match_color = opt->color_match_context; > > - if (sign == ':') > > - line_color = opt->color_selected; > > - else if (sign == '-') > > - line_color = opt->color_context; > > - else if (sign == '=') > > - line_color = opt->color_function; > > + if (opt->color) { > > + if (sign == ':') > > + match_color = opt->color_match_selected; > > + else > > + match_color = opt->color_match_context; > > + if (sign == ':') > > + line_color = opt->color_selected; > > + else if (sign == '-') > > + line_color = opt->color_context; > > + else if (sign == '=') > > + line_color = opt->color_function; > > + } > > The above change (specifically, the fact that this now is enclosed > in "if (opt->color) { ... }") unfortunately leaves match_color > undefined at this point in the control flow. The next loop then > calls output_color() with an undefined match_color and tricks stupid > compiler to issue a warning and makes -Werror build to fail. Right, this is because we are using the `while (next_match(...))` loop for non-colorized output, which is new. This seems like a definite problem, and certainly causes the -Werror build to fail for me, too. > > *eol = '\0'; > > while (next_match(opt, bol, eol, ctx, &match, eflags)) { > > if (match.rm_so == match.rm_eo) > > break; > > > > - output_color(opt, bol, match.rm_so, line_color); > > + if (opt->only_matching) > > + show_line_header(opt, name, lno, cno, sign); > > + else > > + output_color(opt, bol, match.rm_so, line_color); > > output_color(opt, bol + match.rm_so, > > match.rm_eo - match.rm_so, match_color); > > output_color() does check want_color(opt->color) before using its > last parameter, and want_color() gives false for opt->color that is > 0 (i.e. leaves match_color to be undefined), so in this case, the > compiler is worried too much, but still, we should work it around if > it is easy to do so. > > Just initializing match_color where it is defined at the beginning of > show_line() should be sufficient, I think. I think that we could also use the following, and leave the `if (opt->color)` conditional where it is: diff --git a/grep.c b/grep.c index 48cca6723e..b985fb3ee0 100644 --- a/grep.c +++ b/grep.c @@ -1448,7 +1448,7 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol, const char *name, unsigned lno, ssize_t cno, char sign) { int rest = eol - bol; - const char *match_color, *line_color = NULL; + const char *match_color = NULL, *line_color = NULL; if (opt->file_break && opt->last_shown == 0) { if (opt->show_hunk_mark) >From my reading, output_color() appears happy to treat a NULL color as meaningless, and instead redirect the call to `opt->output` without the preceding colorization and the reset afterwords. We could also move that `if (opt->color)` block up closer to where line_color and match_color are initialized, which might appear cleaner. I think the best time to do that would be in a preparatory patch in this series. What do you think? Thanks, Taylor
Re: [PATCH] submodule.c: report the submodule that an error occurs in
On Mon, Jun 25, 2018 at 8:58 AM Junio C Hamano wrote: > > SZEDER Gábor writes: > > >> When an error occurs in updating the working tree of a submodule in > >> submodule_move_head, tell the user which submodule the error occurred in. > >> > >> The call to read-tree contains a super-prefix, such that the read-tree > >> will correctly report any path related issues, but some error messages > >> do not contain a path, for example: > >> > >> ~/gerrit$ git checkout --recurse-submodules origin/master > >> ~/gerrit$ fatal: failed to unpack tree object > >> 07672f31880ba80300b38492df9d0acfcd6ee00a > >> > >> Give the hint which submodule has a problem. > >> > >> Signed-off-by: Stefan Beller > >> --- > >> submodule.c | 2 +- > >> t/lib-submodule-update.sh | 3 ++- > >> 2 files changed, 3 insertions(+), 2 deletions(-) > >> > >> diff --git a/submodule.c b/submodule.c > >> index 939d6870ecd..ebd092a14fd 100644 > >> --- a/submodule.c > >> +++ b/submodule.c > >> @@ -1668,7 +1668,7 @@ int submodule_move_head(const char *path, > >> argv_array_push(&cp.args, new_head ? new_head : empty_tree_oid_hex()); > >> > >> if (run_command(&cp)) { > >> -ret = -1; > >> +ret = error(_("Submodule '%s' could not be updated."), path); > > > > This is a translated error message ... > > > >> goto out; > >> } > >> > >> diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh > >> index 1f38a85371a..e27f5d8541d 100755 > >> --- a/t/lib-submodule-update.sh > >> +++ b/t/lib-submodule-update.sh > >> @@ -781,7 +781,8 @@ test_submodule_recursing_with_args_common() { > >> ( > >> cd submodule_update && > >> git branch -t invalid_sub1 origin/invalid_sub1 && > >> -test_must_fail $command invalid_sub1 && > >> +test_must_fail $command invalid_sub1 2>err && > >> +grep sub1 err && > > > > ... so the test should use 'test_i18ngrep' to check it. > > Thanks for being a careful reviewer, as always. > > Will tweak locally to skip one round-trip. Thanks for tweaking locally. I reviewed your queued patch and it looks good to me. Thanks! I remember deliberately not using the i18n grep as we grep for a part that should stay the same in all languages. However I forgot that details of the test suite implementation (it doesn't have a real translation, so it doesn't matter that this particular piece of the message will show up in all translations) Sorry for my mishap in thinking there. Thanks, Stefan
Re: [PATCH v2 1/2] t3418: add testcase showing problems with rebase -i and strategy options
Elijah Newren writes: >>> + chmod +x test-bin/git-merge-funny && >>> + ( >>> + PATH=./test-bin:$PATH >> >> Broken &&-chain (in subshell). >> >>> + test_must_fail git rebase -i -s funny -Xopt -Xfoo master >>> topic >>> + ) && >>> + test -f funny.was.run && >>> + rm funny.was.run && >>> + echo "Resolved" >F2 && >>> + git add F2 && >>> + ( >>> + PATH=./test-bin:$PATH >> >> Ditto. >> > > I'm just trying to prove how important your other patch series is. ;-) Actually, this shows why the other patch series, especially its last step, is problematic a bit. The above assignments are followed by a single command, i.e. + ( + PATH=./test-bin:$PATH + test_must_fail git rebase -i -s funny -Xopt -Xfoo master topic + ) && + ( + PATH=./test-bin:$PATH + git rebase --continue + ) && and it would be reasonable to assume that these variable assignments would not fail. IOW, there is no BUG in breaking &&-chain at these two places. It is merely that the automated mechanism introduced by step 29/29 of that other series does not understand that (which is expected). It is not wrong to have && at the end of the assignment, though. Having said that, it would be simpler for at least the latter to write it using a single-shot environment assignment, perhaps? I.e. PATH=./test-bin:$PATH git rebase --continue && without running in a subshell?
Re: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs
Jeff King writes: > Specifically, I'm thinking of: > > 1. The pre-built documentation that Junio builds for > quick-install-doc. This _could_ be customized during the "quick" > step, but it's probably not worth the effort. However, we'd want > some kind of generic fill-in then, and hopefully not > "/home/jch/etc" or something. That is very likely to happen, actually X-<.
Re: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs
I wrote: > Jeff King wrote: >> (Related, there's a build target in the local Makefile for using >> asciidoctor; does it need updated, too?) > > I didn't test asciidoctor specficially, but it also respects > the ASCIIDOC_EXTRA parameters, so I think it will work just > as well. I'll try to confirm that later today. Testing confirmed that asciidoctor works fine with this as well. Somewhat tangentially, I looked at using asciidoctor for the Fedora packages last year and one issue that kept me from using it then was the '[FIXME: source]' it includes in the footer of the manpage. When I dug into it at the time, it appeared this was due to no declaration (similarly missing for manual, and version). It wasn't clear whether it was possible to include a custom header template in plain asciidoctor. I got the impression that it would require using a custom backend, which in turn required the rubygem 'tilt' for processing. I spent about an hour poking around with it and decided that I'd put off building with asciidoctor until that was fixed. I felt that displaying '[FIXME: source]' wass worse than simply not including the version. It's always possible that I was doing something wrong in my use of asciidoctor (I just set USE_ASCIIDOCTOR). Or maybe the Fedora packages are missing some dependency which I missed. It might also be that we need some adjustments similar to https://patchwork.kernel.org/patch/10360207/ to get the mansource attribute passed on to asciidoctor. I only just ran across that patch and haven't had a chance to test sometime similar in the git manpage build. That looks promising though. -- Todd ~~ Why is it that we rejoice at a birth and grieve at a funeral? It is because we are not the person involved. -- Mark Twain
Re: [PATCH 2/2] grep.c: teach 'git grep --only-matching'
Taylor Blau writes: > - if (sign == ':') > - match_color = opt->color_match_selected; > - else > - match_color = opt->color_match_context; > - if (sign == ':') > - line_color = opt->color_selected; > - else if (sign == '-') > - line_color = opt->color_context; > - else if (sign == '=') > - line_color = opt->color_function; > + if (opt->color) { > + if (sign == ':') > + match_color = opt->color_match_selected; > + else > + match_color = opt->color_match_context; > + if (sign == ':') > + line_color = opt->color_selected; > + else if (sign == '-') > + line_color = opt->color_context; > + else if (sign == '=') > + line_color = opt->color_function; > + } The above change (specifically, the fact that this now is enclosed in "if (opt->color) { ... }") unfortunately leaves match_color undefined at this point in the control flow. The next loop then calls output_color() with an undefined match_color and tricks stupid compiler to issue a warning and makes -Werror build to fail. > *eol = '\0'; > while (next_match(opt, bol, eol, ctx, &match, eflags)) { > if (match.rm_so == match.rm_eo) > break; > > - output_color(opt, bol, match.rm_so, line_color); > + if (opt->only_matching) > + show_line_header(opt, name, lno, cno, sign); > + else > + output_color(opt, bol, match.rm_so, line_color); > output_color(opt, bol + match.rm_so, >match.rm_eo - match.rm_so, match_color); output_color() does check want_color(opt->color) before using its last parameter, and want_color() gives false for opt->color that is 0 (i.e. leaves match_color to be undefined), so in this case, the compiler is worried too much, but still, we should work it around if it is easy to do so. Just initializing match_color where it is defined at the beginning of show_line() should be sufficient, I think.
Re: [RFC PATCH v5] Implement --first-parent for git rev-list --bisect
Johannes Schindelin writes: > git rev-list --bisect-all --first-parent F..E >revs && > # only E, e1..e8 should be listed, nothing else > test_line_count = 9 revs && > for rev in E e1 e2 e3 e4 e5 e6 e7 e8 > do > grep "^$(git rev-parse $rev) " revs || return > done > > I am faster by... a lot. Like, seconds instead of minutes. I'm fine either way. I just thought you would not want 9 separate invocations of grep ;-)
[PATCH v3 0/2] Fix use of strategy options with interactive rebases
The interactive machinery for git rebase can accept special merge strategies or strategy options, but has a bug in its handling of strategy options. This short series patches that. Changes since v2: - Fix broken &&-chaining (Thanks to Eric for spotting) Elijah Newren (2): t3418: add testcase showing problems with rebase -i and strategy options Fix use of strategy options with interactive rebases git-rebase.sh | 2 +- sequencer.c| 7 ++- t/t3418-rebase-continue.sh | 32 3 files changed, 39 insertions(+), 2 deletions(-) 1: 43b9ac5a63 ! 1: f8a5df9ef1 t3418: add testcase showing problems with rebase -i and strategy options @@ -34,7 +34,7 @@ + EOF + chmod +x test-bin/git-merge-funny && + ( -+ PATH=./test-bin:$PATH ++ PATH=./test-bin:$PATH && + test_must_fail git rebase -i -s funny -Xopt -Xfoo master topic + ) && + test -f funny.was.run && @@ -42,7 +42,7 @@ + echo "Resolved" >F2 && + git add F2 && + ( -+ PATH=./test-bin:$PATH ++ PATH=./test-bin:$PATH && + git rebase --continue + ) && + test -f funny.was.run 2: d345eb96d5 = 2: b7e4971e66 Fix use of strategy options with interactive rebases -- 2.18.0.9.g431b2c36d5
[PATCH v3 2/2] Fix use of strategy options with interactive rebases
git-rebase.sh wrote strategy options to .git/rebase/merge/strategy_opts in the following format: '--ours' '--renormalize' Note the double spaces. git-rebase--interactive uses sequencer.c to parse that file, and sequencer.c used split_cmdline() to get the individual strategy options. After splitting, sequencer.c prefixed each "option" with a double dash, so, concatenating all its options would result in: -- --ours -- --renormalize So, when it ended up calling try_merge_strategy(), that in turn would run git merge-$strategy -- --ours -- --renormalize $merge_base -- $head $remote instead of the expected/desired git merge-$strategy --ours --renormalize $merge_base -- $head $remote Remove the extra spaces so that when it goes through split_cmdline() we end up with the desired command line. Signed-off-by: Elijah Newren --- git-rebase.sh | 2 +- sequencer.c| 7 ++- t/t3418-rebase-continue.sh | 2 +- 3 files changed, 8 insertions(+), 3 deletions(-) diff --git a/git-rebase.sh b/git-rebase.sh index 19bdebb480..f3b10c7f62 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -328,7 +328,7 @@ do do_merge=t ;; --strategy-option=*) - strategy_opts="$strategy_opts $(git rev-parse --sq-quote "--${1#--strategy-option=}")" + strategy_opts="$strategy_opts $(git rev-parse --sq-quote "--${1#--strategy-option=}" | sed -e s/^.//)" do_merge=t test -z "$strategy" && strategy=recursive ;; diff --git a/sequencer.c b/sequencer.c index 5354d4d51e..ef9237c814 100644 --- a/sequencer.c +++ b/sequencer.c @@ -2206,6 +2206,7 @@ static int populate_opts_cb(const char *key, const char *value, void *data) static void read_strategy_opts(struct replay_opts *opts, struct strbuf *buf) { int i; + char *strategy_opts_string; strbuf_reset(buf); if (!read_oneliner(buf, rebase_path_strategy(), 0)) @@ -2214,7 +2215,11 @@ static void read_strategy_opts(struct replay_opts *opts, struct strbuf *buf) if (!read_oneliner(buf, rebase_path_strategy_opts(), 0)) return; - opts->xopts_nr = split_cmdline(buf->buf, (const char ***)&opts->xopts); + strategy_opts_string = buf->buf; + if (*strategy_opts_string == ' ') + strategy_opts_string++; + opts->xopts_nr = split_cmdline(strategy_opts_string, + (const char ***)&opts->xopts); for (i = 0; i < opts->xopts_nr; i++) { const char *arg = opts->xopts[i]; diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh index 11546d6e14..c145dbac38 100755 --- a/t/t3418-rebase-continue.sh +++ b/t/t3418-rebase-continue.sh @@ -74,7 +74,7 @@ test_expect_success 'rebase --continue remembers merge strategy and options' ' test -f funny.was.run ' -test_expect_failure 'rebase -i --continue handles merge strategy and options' ' +test_expect_success 'rebase -i --continue handles merge strategy and options' ' rm -fr .git/rebase-* && git reset --hard commit-new-file-F2-on-topic-branch && test_commit "commit-new-file-F3-on-topic-branch-for-dash-i" F3 32 && -- 2.18.0.9.g431b2c36d5
[PATCH v3 1/2] t3418: add testcase showing problems with rebase -i and strategy options
We are not passing the same args to merge strategies when we are doing an --interactive rebase as we do with a --merge rebase. The merge strategy should not need to be aware of which type of rebase is in effect. Add a testcase which checks for the appropriate args. Signed-off-by: Elijah Newren --- t/t3418-rebase-continue.sh | 32 1 file changed, 32 insertions(+) diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh index 03bf1b8a3b..11546d6e14 100755 --- a/t/t3418-rebase-continue.sh +++ b/t/t3418-rebase-continue.sh @@ -74,6 +74,38 @@ test_expect_success 'rebase --continue remembers merge strategy and options' ' test -f funny.was.run ' +test_expect_failure 'rebase -i --continue handles merge strategy and options' ' + rm -fr .git/rebase-* && + git reset --hard commit-new-file-F2-on-topic-branch && + test_commit "commit-new-file-F3-on-topic-branch-for-dash-i" F3 32 && + test_when_finished "rm -fr test-bin funny.was.run funny.args" && + mkdir test-bin && + cat >test-bin/git-merge-funny <<-EOF && + #!$SHELL_PATH + echo "\$@" >>funny.args + case "\$1" in --opt) ;; *) exit 2 ;; esac + case "\$2" in --foo) ;; *) exit 2 ;; esac + case "\$4" in --) ;; *) exit 2 ;; esac + shift 2 && + >funny.was.run && + exec git merge-recursive "\$@" + EOF + chmod +x test-bin/git-merge-funny && + ( + PATH=./test-bin:$PATH && + test_must_fail git rebase -i -s funny -Xopt -Xfoo master topic + ) && + test -f funny.was.run && + rm funny.was.run && + echo "Resolved" >F2 && + git add F2 && + ( + PATH=./test-bin:$PATH && + git rebase --continue + ) && + test -f funny.was.run +' + test_expect_success 'rebase passes merge strategy options correctly' ' rm -fr .git/rebase-* && git reset --hard commit-new-file-F3-on-topic-branch && -- 2.18.0.9.g431b2c36d5
Re: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs
Jeff King wrote: > On Wed, Jun 27, 2018 at 12:56:37AM -0400, Todd Zullinger wrote: > >> Replace `$(prefix)/etc/gitconfig` and `$(prefix)/etc/gitattributes` in >> generated documentation with the paths chosen when building. Readers of >> the documentation should not need to know how `$(prefix)` was defined. > > Yes, I was just complaining about this yesterday. So what you're saying is that if I had procrastinated a little, you may have written such a patch for me? :) > Besides readers not > having any clue what $(prefix) means here, $(prefix)/etc is not even > correct for builds with prefix=/usr. > > So I like the overall direction here, but it leaves me with one > question: what happens for documentation outside of customized builds? > > Specifically, I'm thinking of: > > 1. The pre-built documentation that Junio builds for > quick-install-doc. This _could_ be customized during the "quick" > step, but it's probably not worth the effort. However, we'd want > some kind of generic fill-in then, and hopefully not > "/home/jch/etc" or something. If building docs separately for such a use, the values can be overridden by setting ETC_GITCONFIG and ETC_GITATTRIBUTES (or prefix or sysconfig). To keep the same values as we currently use, for example: make -C Documentation V=1 \ ETC_GITCONFIG='$$(prefix)/etc/gitconfig' \ ETC_GITATTRIBUTES='$$(prefix)/etc/gitattributes' I don't know if that's sufficient for folks who build documentation to share with a general audience or not. It might be enough if the default values are relatively sane and consistent. That would be a slight improvement over the current situation still. > 2. The manpages on git-scm.com, which are built with asciidoctor. I > think we'd want the same generic value there. Ideally it would be > embedded in the asciidoc source as "if this attribute isn't > defined, then use this", but it's not the end of the world to > require a patch to the site to handle this. We have that for the DEFAULT_(EDITOR|PAGER). I didn't know if we'd want that here, but maybe it's worth the effort if it helps when building docs destined for the website and such. It might make the plain text files a bit less readable though. The EDITOR/PAGER bits are in git-var.txt, like this: GIT_EDITOR:: Text editor for use by Git commands. The value is meant to be interpreted by the shell when it is used. Examples: `~/bin/vi`, `$SOME_ENVIRONMENT_VARIABLE`, `"C:\Program Files\Vim\gvim.exe" --nofork`. The order of preference is the `$GIT_EDITOR` environment variable, then `core.editor` configuration, then `$VISUAL`, then `$EDITOR`, and then the default chosen at compile time, which is usually 'vi'. ifdef::git-default-editor[] The build you are using chose '{git-default-editor}' as the default. endif::git-default-editor[] The ifdef would be a little different to set the var if it was not set, of course. I think if we ensure that ETC_GITCONFIG / ETC_GITATTRIBUTES are set sanely by default (and easily overridden) then we can probably avoid the need to handle it within the asciidoc file though. (There's more on that though below.) > (Related, there's a build target in the local Makefile for using > asciidoctor; does it need updated, too?) I didn't test asciidoctor specficially, but it also respects the ASCIIDOC_EXTRA parameters, so I think it will work just as well. I'll try to confirm that later today. >> diff --git a/Documentation/Makefile b/Documentation/Makefile >> index d079d7c73a..75af671798 100644 >> --- a/Documentation/Makefile >> +++ b/Documentation/Makefile >> @@ -95,6 +95,7 @@ DOC_MAN7 = $(patsubst %.txt,%.7,$(MAN7_TXT)) >> >> prefix ?= $(HOME) >> bindir ?= $(prefix)/bin >> +sysconfdir ?= $(prefix)/etc >> htmldir ?= $(prefix)/share/doc/git-doc >> infodir ?= $(prefix)/share/info >> pdfdir ?= $(prefix)/share/doc/git-doc >> @@ -205,6 +206,18 @@ DEFAULT_EDITOR_SQ = $(subst ','\'',$(DEFAULT_EDITOR)) >> ASCIIDOC_EXTRA += -a 'git-default-editor=$(DEFAULT_EDITOR_SQ)' >> endif >> >> +ifndef ETC_GITCONFIG >> +ETC_GITCONFIG = $(sysconfdir)/gitconfig >> +endif >> +ETC_GITCONFIG_SQ = $(subst ','\'',$(ETC_GITCONFIG)) >> +ASCIIDOC_EXTRA += -a 'etc-gitconfig=$(ETC_GITCONFIG_SQ)' >> + >> +ifndef ETC_GITATTRIBUTES >> +ETC_GITATTRIBUTES = $(sysconfdir)/gitattributes >> +endif >> +ETC_GITATTRIBUTES_SQ = $(subst ','\'',$(ETC_GITATTRIBUTES)) >> +ASCIIDOC_EXTRA += -a 'etc-gitattributes=$(ETC_GITATTRIBUTES_SQ)' >> + > > It's a shame we have to repeat this logic from the Makefile, though I > guess we already do so for prefix, bindir, etc, so far. > > Should we factor the path logic from the top-level Makefile into an > include that can be used from either? Yeah, maybe. I didn't like the duplication either, but as you noticed, we do it already for many of the other v
Re: [GSoC][PATCH v4 0/2] rebase -i: rewrite append_todo_help() in C
Hi Johannes, Le 26/06/2018 à 23:37, Johannes Schindelin a écrit : > Hi Alban, > > On Tue, 26 Jun 2018, Alban Gruin wrote: > >> This patch rewrites append_todo_help() from shell to C. The C version >> covers a bit more than the old shell version. To achieve that, some >> parameters were added to rebase--helper. >> >> This also introduce a new source file, rebase-interactive.c. >> >> This is part of the effort to rewrite interactive rebase in C. >> >> This is based on next, as of 2018-06-26. > > Out of curiosity: which commits that are not yet in `master` are required? > None, actually. Cheers, Alban
Re: [GSoC][PATCH v4 1/2] sequencer: make two functions and an enum from sequencer.c public
Hi Johannes, Le 26/06/2018 à 23:41, Johannes Schindelin a écrit : > Hi Alban, > > On Tue, 26 Jun 2018, Alban Gruin wrote: > >> diff --git a/sequencer.h b/sequencer.h >> index c5787c6b5..08397b0d1 100644 >> --- a/sequencer.h >> +++ b/sequencer.h >> @@ -3,6 +3,7 @@ >> >> const char *git_path_commit_editmsg(void); >> const char *git_path_seq_dir(void); >> +const char *rebase_path_todo(void); >> >> #define APPEND_SIGNOFF_DEDUP (1u << 0) >> >> @@ -57,6 +58,10 @@ struct replay_opts { >> }; >> #define REPLAY_OPTS_INIT { .action = -1, .current_fixups = STRBUF_INIT } >> >> +enum check_level { >> +CHECK_IGNORE = 0, CHECK_WARN, CHECK_ERROR >> +}; >> + > > While this is contained within scopes that include `sequencer.h`, it *is* > public now, so I am slightly uneasy about keeping this enum so generic. > Maybe we want to use > > enum missing_commit_check_level { > MISSING_COMMIT_CHECK_IGNORE = 0, > MISSING_COMMIT_CHECK_WARN, > MISSING_COMMIT_CHECK_ERROR > }; > > instead? > You’re right, this would be better. Cheers, Alban
Re: [PATCH] doc: substitute ETC_GIT(CONFIG|ATTRIBUTES) in generated docs
On Wed, Jun 27, 2018 at 12:56:37AM -0400, Todd Zullinger wrote: > Replace `$(prefix)/etc/gitconfig` and `$(prefix)/etc/gitattributes` in > generated documentation with the paths chosen when building. Readers of > the documentation should not need to know how `$(prefix)` was defined. Yes, I was just complaining about this yesterday. Besides readers not having any clue what $(prefix) means here, $(prefix)/etc is not even correct for builds with prefix=/usr. So I like the overall direction here, but it leaves me with one question: what happens for documentation outside of customized builds? Specifically, I'm thinking of: 1. The pre-built documentation that Junio builds for quick-install-doc. This _could_ be customized during the "quick" step, but it's probably not worth the effort. However, we'd want some kind of generic fill-in then, and hopefully not "/home/jch/etc" or something. 2. The manpages on git-scm.com, which are built with asciidoctor. I think we'd want the same generic value there. Ideally it would be embedded in the asciidoc source as "if this attribute isn't defined, then use this", but it's not the end of the world to require a patch to the site to handle this. (Related, there's a build target in the local Makefile for using asciidoctor; does it need updated, too?) > diff --git a/Documentation/Makefile b/Documentation/Makefile > index d079d7c73a..75af671798 100644 > --- a/Documentation/Makefile > +++ b/Documentation/Makefile > @@ -95,6 +95,7 @@ DOC_MAN7 = $(patsubst %.txt,%.7,$(MAN7_TXT)) > > prefix ?= $(HOME) > bindir ?= $(prefix)/bin > +sysconfdir ?= $(prefix)/etc > htmldir ?= $(prefix)/share/doc/git-doc > infodir ?= $(prefix)/share/info > pdfdir ?= $(prefix)/share/doc/git-doc > @@ -205,6 +206,18 @@ DEFAULT_EDITOR_SQ = $(subst ','\'',$(DEFAULT_EDITOR)) > ASCIIDOC_EXTRA += -a 'git-default-editor=$(DEFAULT_EDITOR_SQ)' > endif > > +ifndef ETC_GITCONFIG > +ETC_GITCONFIG = $(sysconfdir)/gitconfig > +endif > +ETC_GITCONFIG_SQ = $(subst ','\'',$(ETC_GITCONFIG)) > +ASCIIDOC_EXTRA += -a 'etc-gitconfig=$(ETC_GITCONFIG_SQ)' > + > +ifndef ETC_GITATTRIBUTES > +ETC_GITATTRIBUTES = $(sysconfdir)/gitattributes > +endif > +ETC_GITATTRIBUTES_SQ = $(subst ','\'',$(ETC_GITATTRIBUTES)) > +ASCIIDOC_EXTRA += -a 'etc-gitattributes=$(ETC_GITATTRIBUTES_SQ)' > + It's a shame we have to repeat this logic from the Makefile, though I guess we already do so for prefix, bindir, etc, so far. Should we factor the path logic from the top-level Makefile into an include that can be used from either? > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828c..ed903b60bd 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -5,7 +5,7 @@ The Git configuration file contains a number of variables > that affect > the Git commands' behavior. The `.git/config` file in each repository > is used to store the configuration for that repository, and > `$HOME/.gitconfig` is used to store a per-user configuration as > -fallback values for the `.git/config` file. The file `/etc/gitconfig` > +fallback values for the `.git/config` file. The file +{etc-gitconfig}+ I think the formatting tweak you've done here is the right thing. There's no way to expand within literal backticks (since that's the point). So we only care about the monospacing effect, which ++ should give. -Peff
[PATCH v7 20/22] commit-graph: add '--reachable' option
When writing commit-graph files, it can be convenient to ask for all reachable commits (starting at the ref set) in the resulting file. This is particularly helpful when writing to stdin is complicated, such as a future integration with 'git gc'. Signed-off-by: Derrick Stolee --- Documentation/git-commit-graph.txt | 8 ++-- builtin/commit-graph.c | 16 commit-graph.c | 18 ++ commit-graph.h | 1 + t/t5318-commit-graph.sh| 10 ++ 5 files changed, 47 insertions(+), 6 deletions(-) diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt index a222cfab08..dececb79d7 100644 --- a/Documentation/git-commit-graph.txt +++ b/Documentation/git-commit-graph.txt @@ -38,12 +38,16 @@ Write a commit graph file based on the commits found in packfiles. + With the `--stdin-packs` option, generate the new commit graph by walking objects only in the specified pack-indexes. (Cannot be combined -with --stdin-commits.) +with `--stdin-commits` or `--reachable`.) + With the `--stdin-commits` option, generate the new commit graph by walking commits starting at the commits specified in stdin as a list of OIDs in hex, one OID per line. (Cannot be combined with ---stdin-packs.) +`--stdin-packs` or `--reachable`.) ++ +With the `--reachable` option, generate the new commit graph by walking +commits starting at all refs. (Cannot be combined with `--stdin-commits` +or `--stdin-packs`.) + With the `--append` option, include all commits that are present in the existing commit-graph file. diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index ea28bc311a..c7d0db5ab4 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -10,7 +10,7 @@ static char const * const builtin_commit_graph_usage[] = { N_("git commit-graph [--object-dir ]"), N_("git commit-graph read [--object-dir ]"), N_("git commit-graph verify [--object-dir ]"), - N_("git commit-graph write [--object-dir ] [--append] [--stdin-packs|--stdin-commits]"), + N_("git commit-graph write [--object-dir ] [--append] [--reachable|--stdin-packs|--stdin-commits]"), NULL }; @@ -25,12 +25,13 @@ static const char * const builtin_commit_graph_read_usage[] = { }; static const char * const builtin_commit_graph_write_usage[] = { - N_("git commit-graph write [--object-dir ] [--append] [--stdin-packs|--stdin-commits]"), + N_("git commit-graph write [--object-dir ] [--append] [--reachable|--stdin-packs|--stdin-commits]"), NULL }; static struct opts_commit_graph { const char *obj_dir; + int reachable; int stdin_packs; int stdin_commits; int append; @@ -127,6 +128,8 @@ static int graph_write(int argc, const char **argv) OPT_STRING(0, "object-dir", &opts.obj_dir, N_("dir"), N_("The object directory to store the graph")), + OPT_BOOL(0, "reachable", &opts.reachable, + N_("start walk at all refs")), OPT_BOOL(0, "stdin-packs", &opts.stdin_packs, N_("scan pack-indexes listed by stdin for commits")), OPT_BOOL(0, "stdin-commits", &opts.stdin_commits, @@ -140,11 +143,16 @@ static int graph_write(int argc, const char **argv) builtin_commit_graph_write_options, builtin_commit_graph_write_usage, 0); - if (opts.stdin_packs && opts.stdin_commits) - die(_("cannot use both --stdin-commits and --stdin-packs")); + if (opts.reachable + opts.stdin_packs + opts.stdin_commits > 1) + die(_("use at most one of --reachable, --stdin-commits, or --stdin-packs")); if (!opts.obj_dir) opts.obj_dir = get_object_directory(); + if (opts.reachable) { + write_commit_graph_reachable(opts.obj_dir, opts.append); + return 0; + } + string_list_init(&lines, 0); if (opts.stdin_packs || opts.stdin_commits) { struct strbuf buf = STRBUF_INIT; diff --git a/commit-graph.c b/commit-graph.c index a06e7e9549..adf54e3fe7 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -7,6 +7,7 @@ #include "packfile.h" #include "commit.h" #include "object.h" +#include "refs.h" #include "revision.h" #include "sha1-lookup.h" #include "commit-graph.h" @@ -656,6 +657,23 @@ static void compute_generation_numbers(struct packed_commit_list* commits) } } +static int add_ref_to_list(const char *refname, + const struct object_id *oid, + int flags, void *cb_data) +{ + struct string_list *list = (struct string_list*)cb_data; + string_list_append(list, oid_to_hex(oid)); + return 0; +} + +void write_commit_graph_reachable(const char *obj_dir, int append) +{ +
[PATCH v7 19/22] commit-graph: use string-list API for input
Signed-off-by: Derrick Stolee --- builtin/commit-graph.c | 39 +-- commit-graph.c | 15 +++ commit-graph.h | 7 +++ 3 files changed, 23 insertions(+), 38 deletions(-) diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index 9d108f43a9..ea28bc311a 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -119,13 +119,9 @@ static int graph_read(int argc, const char **argv) static int graph_write(int argc, const char **argv) { - const char **pack_indexes = NULL; - int packs_nr = 0; - const char **commit_hex = NULL; - int commits_nr = 0; - const char **lines = NULL; - int lines_nr = 0; - int lines_alloc = 0; + struct string_list *pack_indexes = NULL; + struct string_list *commit_hex = NULL; + struct string_list lines; static struct option builtin_commit_graph_write_options[] = { OPT_STRING(0, "object-dir", &opts.obj_dir, @@ -149,34 +145,25 @@ static int graph_write(int argc, const char **argv) if (!opts.obj_dir) opts.obj_dir = get_object_directory(); + string_list_init(&lines, 0); if (opts.stdin_packs || opts.stdin_commits) { struct strbuf buf = STRBUF_INIT; - lines_nr = 0; - lines_alloc = 128; - ALLOC_ARRAY(lines, lines_alloc); - - while (strbuf_getline(&buf, stdin) != EOF) { - ALLOC_GROW(lines, lines_nr + 1, lines_alloc); - lines[lines_nr++] = strbuf_detach(&buf, NULL); - } - - if (opts.stdin_packs) { - pack_indexes = lines; - packs_nr = lines_nr; - } - if (opts.stdin_commits) { - commit_hex = lines; - commits_nr = lines_nr; - } + + while (strbuf_getline(&buf, stdin) != EOF) + string_list_append(&lines, strbuf_detach(&buf, NULL)); + + if (opts.stdin_packs) + pack_indexes = &lines; + if (opts.stdin_commits) + commit_hex = &lines; } write_commit_graph(opts.obj_dir, pack_indexes, - packs_nr, commit_hex, - commits_nr, opts.append); + string_list_clear(&lines, 0); return 0; } diff --git a/commit-graph.c b/commit-graph.c index d926c4b59f..a06e7e9549 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -657,10 +657,8 @@ static void compute_generation_numbers(struct packed_commit_list* commits) } void write_commit_graph(const char *obj_dir, - const char **pack_indexes, - int nr_packs, - const char **commit_hex, - int nr_commits, + struct string_list *pack_indexes, + struct string_list *commit_hex, int append) { struct packed_oid_list oids; @@ -701,10 +699,10 @@ void write_commit_graph(const char *obj_dir, int dirlen; strbuf_addf(&packname, "%s/pack/", obj_dir); dirlen = packname.len; - for (i = 0; i < nr_packs; i++) { + for (i = 0; i < pack_indexes->nr; i++) { struct packed_git *p; strbuf_setlen(&packname, dirlen); - strbuf_addstr(&packname, pack_indexes[i]); + strbuf_addstr(&packname, pack_indexes->items[i].string); p = add_packed_git(packname.buf, packname.len, 1); if (!p) die("error adding pack %s", packname.buf); @@ -717,12 +715,13 @@ void write_commit_graph(const char *obj_dir, } if (commit_hex) { - for (i = 0; i < nr_commits; i++) { + for (i = 0; i < commit_hex->nr; i++) { const char *end; struct object_id oid; struct commit *result; - if (commit_hex[i] && parse_oid_hex(commit_hex[i], &oid, &end)) + if (commit_hex->items[i].string && + parse_oid_hex(commit_hex->items[i].string, &oid, &end)) continue; result = lookup_commit_reference_gently(&oid, 1); diff --git a/commit-graph.h b/commit-graph.h index 4359812fb4..a79b854470 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -3,6 +3,7 @@ #include "git-compat-util.h" #include "repository.h" +#include "string-list.h" char *get_commit_graph_filename(const char *obj_dir); @@ -48,10 +49,8 @@ struct commit_graph { struct commit_graph *load_commit_
[PATCH v7 09/22] commit-graph: verify required chunks are present
The commit-graph file requires the following three chunks: * OID Fanout * OID Lookup * Commit Data If any of these are missing, then the 'verify' subcommand should report a failure. This includes the chunk IDs malformed or the chunk count is truncated. Signed-off-by: Derrick Stolee --- commit-graph.c | 9 + t/t5318-commit-graph.sh | 29 + 2 files changed, 38 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 22ef696e18..f30b4ccee9 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -848,5 +848,14 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) return 1; } + verify_commit_graph_error = 0; + + if (!g->chunk_oid_fanout) + graph_report("commit-graph is missing the OID Fanout chunk"); + if (!g->chunk_oid_lookup) + graph_report("commit-graph is missing the OID Lookup chunk"); + if (!g->chunk_commit_data) + graph_report("commit-graph is missing the Commit Data chunk"); + return verify_commit_graph_error; } diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index c0c1ff09b9..dc16849ddd 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -249,6 +249,15 @@ test_expect_success 'git commit-graph verify' ' GRAPH_BYTE_VERSION=4 GRAPH_BYTE_HASH=5 +GRAPH_BYTE_CHUNK_COUNT=6 +GRAPH_CHUNK_LOOKUP_OFFSET=8 +GRAPH_CHUNK_LOOKUP_WIDTH=12 +GRAPH_CHUNK_LOOKUP_ROWS=5 +GRAPH_BYTE_OID_FANOUT_ID=$GRAPH_CHUNK_LOOKUP_OFFSET +GRAPH_BYTE_OID_LOOKUP_ID=$(($GRAPH_CHUNK_LOOKUP_OFFSET + \ + 1 * $GRAPH_CHUNK_LOOKUP_WIDTH)) +GRAPH_BYTE_COMMIT_DATA_ID=$(($GRAPH_CHUNK_LOOKUP_OFFSET + \ +2 * $GRAPH_CHUNK_LOOKUP_WIDTH)) # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -283,4 +292,24 @@ test_expect_success 'detect bad hash version' ' "hash version" ' +test_expect_success 'detect low chunk count' ' + corrupt_graph_and_verify $GRAPH_BYTE_CHUNK_COUNT "\02" \ + "missing the .* chunk" +' + +test_expect_success 'detect missing OID fanout chunk' ' + corrupt_graph_and_verify $GRAPH_BYTE_OID_FANOUT_ID "\0" \ + "missing the OID Fanout chunk" +' + +test_expect_success 'detect missing OID lookup chunk' ' + corrupt_graph_and_verify $GRAPH_BYTE_OID_LOOKUP_ID "\0" \ + "missing the OID Lookup chunk" +' + +test_expect_success 'detect missing commit data chunk' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_DATA_ID "\0" \ + "missing the Commit Data chunk" +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 11/22] commit-graph: verify objects exist
In the 'verify' subcommand, load commits directly from the object database to ensure they exist. Parse by skipping the commit-graph. Signed-off-by: Derrick Stolee --- commit-graph.c | 18 ++ t/t5318-commit-graph.sh | 7 +++ 2 files changed, 25 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 866a9e7e41..00e89b71e9 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -11,6 +11,7 @@ #include "sha1-lookup.h" #include "commit-graph.h" #include "object-store.h" +#include "alloc.h" #define GRAPH_SIGNATURE 0x43475048 /* "CGPH" */ #define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */ @@ -242,6 +243,7 @@ static struct commit_list **insert_parent_or_die(struct commit_graph *g, { struct commit *c; struct object_id oid; + hashcpy(oid.hash, g->chunk_oid_lookup + g->hash_len * pos); c = lookup_commit(&oid); if (!c) @@ -893,5 +895,21 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) cur_fanout_pos++; } + if (verify_commit_graph_error) + return verify_commit_graph_error; + + for (i = 0; i < g->num_commits; i++) { + struct commit *odb_commit; + + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); + + odb_commit = (struct commit *)create_object(r, cur_oid.hash, alloc_commit_node(r)); + if (parse_commit_internal(odb_commit, 0, 0)) { + graph_report("failed to parse %s from object database", +oid_to_hex(&cur_oid)); + continue; + } + } + return verify_commit_graph_error; } diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 8ae2a49cfa..a27ec97269 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -247,6 +247,7 @@ test_expect_success 'git commit-graph verify' ' git commit-graph verify >output ' +NUM_COMMITS=9 HASH_LEN=20 GRAPH_BYTE_VERSION=4 GRAPH_BYTE_HASH=5 @@ -265,6 +266,7 @@ GRAPH_BYTE_FANOUT1=$(($GRAPH_FANOUT_OFFSET + 4 * 4)) GRAPH_BYTE_FANOUT2=$(($GRAPH_FANOUT_OFFSET + 4 * 255)) GRAPH_OID_LOOKUP_OFFSET=$(($GRAPH_FANOUT_OFFSET + 4 * 256)) GRAPH_BYTE_OID_LOOKUP_ORDER=$(($GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN * 8)) +GRAPH_BYTE_OID_LOOKUP_MISSING=$(($GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN * 4 + 10)) # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -334,4 +336,9 @@ test_expect_success 'detect incorrect OID order' ' "incorrect OID order" ' +test_expect_success 'detect OID not in object database' ' + corrupt_graph_and_verify $GRAPH_BYTE_OID_LOOKUP_MISSING "\01" \ + "from object database" +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 22/22] commit-graph: update design document
The commit-graph feature is now integrated with 'fsck' and 'gc', so remove those items from the "Future Work" section of the commit-graph design document. Also remove the section on lazy-loading trees, as that was completed in an earlier patch series. Signed-off-by: Derrick Stolee --- Documentation/technical/commit-graph.txt | 22 -- 1 file changed, 22 deletions(-) diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt index e1a883eb46..c664acbd76 100644 --- a/Documentation/technical/commit-graph.txt +++ b/Documentation/technical/commit-graph.txt @@ -118,9 +118,6 @@ Future Work - The commit graph feature currently does not honor commit grafts. This can be remedied by duplicating or refactoring the current graft logic. -- The 'commit-graph' subcommand does not have a "verify" mode that is - necessary for integration with fsck. - - After computing and storing generation numbers, we must make graph walks aware of generation numbers to gain the performance benefits they enable. This will mostly be accomplished by swapping a commit-date-ordered @@ -130,25 +127,6 @@ Future Work - 'log --topo-order' - 'tag --merged' -- Currently, parse_commit_gently() requires filling in the root tree - object for a commit. This passes through lookup_tree() and consequently - lookup_object(). Also, it calls lookup_commit() when loading the parents. - These method calls check the ODB for object existence, even if the - consumer does not need the content. For example, we do not need the - tree contents when computing merge bases. Now that commit parsing is - removed from the computation time, these lookup operations are the - slowest operations keeping graph walks from being fast. Consider - loading these objects without verifying their existence in the ODB and - only loading them fully when consumers need them. Consider a method - such as "ensure_tree_loaded(commit)" that fully loads a tree before - using commit->tree. - -- The current design uses the 'commit-graph' subcommand to generate the graph. - When this feature stabilizes enough to recommend to most users, we should - add automatic graph writes to common operations that create many commits. - For example, one could compute a graph on 'clone', 'fetch', or 'repack' - commands. - - A server could provide a commit graph file as part of the network protocol to avoid extra calculations by clients. This feature is only of benefit if the user is willing to trust the file, because verifying the file is correct -- 2.18.0.24.g1b579a2ee9
[PATCH v7 15/22] commit-graph: verify commit date
Signed-off-by: Derrick Stolee --- commit-graph.c | 6 ++ t/t5318-commit-graph.sh | 6 ++ 2 files changed, 12 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index e0f71658da..6d6c6beff9 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -986,6 +986,12 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) oid_to_hex(&cur_oid), graph_commit->generation, max_generation + 1); + + if (graph_commit->date != odb_commit->date) + graph_report("commit date for commit %s in commit-graph is %"PRItime" != %"PRItime, +oid_to_hex(&cur_oid), +graph_commit->date, +odb_commit->date); } return verify_commit_graph_error; diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 3128e19aef..2c65e6a95c 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -273,6 +273,7 @@ GRAPH_BYTE_COMMIT_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN)) GRAPH_BYTE_COMMIT_EXTRA_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 4)) GRAPH_BYTE_COMMIT_WRONG_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 3)) GRAPH_BYTE_COMMIT_GENERATION=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 11)) +GRAPH_BYTE_COMMIT_DATE=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 12)) # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -377,4 +378,9 @@ test_expect_success 'detect incorrect generation number' ' "non-zero generation number" ' +test_expect_success 'detect incorrect commit date' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_DATE "\01" \ + "commit date" +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 18/22] fsck: verify commit-graph
If core.commitGraph is true, verify the contents of the commit-graph during 'git fsck' using the 'git commit-graph verify' subcommand. Run this check on all alternates, as well. We use a new process for two reasons: 1. The subcommand decouples the details of loading and verifying a commit-graph file from the other fsck details. 2. The commit-graph verification requires the commits to be loaded in a specific order to guarantee we parse from the commit-graph file for some objects and from the object database for others. Signed-off-by: Derrick Stolee --- Documentation/git-fsck.txt | 3 +++ builtin/fsck.c | 21 + t/t5318-commit-graph.sh| 8 3 files changed, 32 insertions(+) diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt index b9f060e3b2..ab9a93fb9b 100644 --- a/Documentation/git-fsck.txt +++ b/Documentation/git-fsck.txt @@ -110,6 +110,9 @@ Any corrupt objects you will have to find in backups or other archives (i.e., you can just remove them and do an 'rsync' with some other site in the hopes that somebody else has the object you have corrupted). +If core.commitGraph is true, the commit-graph file will also be inspected +using 'git commit-graph verify'. See linkgit:git-commit-graph[1]. + Extracted Diagnostics - diff --git a/builtin/fsck.c b/builtin/fsck.c index 3ad4f160f9..9fb2edc69f 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -18,6 +18,7 @@ #include "decorate.h" #include "packfile.h" #include "object-store.h" +#include "run-command.h" #define REACHABLE 0x0001 #define SEEN 0x0002 @@ -47,6 +48,7 @@ static int name_objects; #define ERROR_REACHABLE 02 #define ERROR_PACK 04 #define ERROR_REFS 010 +#define ERROR_COMMIT_GRAPH 020 static const char *describe_object(struct object *obj) { @@ -822,5 +824,24 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) } check_connectivity(); + + if (core_commit_graph) { + struct child_process commit_graph_verify = CHILD_PROCESS_INIT; + const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL }; + commit_graph_verify.argv = verify_argv; + commit_graph_verify.git_cmd = 1; + + if (run_command(&commit_graph_verify)) + errors_found |= ERROR_COMMIT_GRAPH; + + prepare_alt_odb(the_repository); + for (alt = the_repository->objects->alt_odb_list; alt; alt = alt->next) { + verify_argv[2] = "--object-dir"; + verify_argv[3] = alt->path; + if (run_command(&commit_graph_verify)) + errors_found |= ERROR_COMMIT_GRAPH; + } + } + return errors_found; } diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index fed05e2f12..a9e8c774d5 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -399,4 +399,12 @@ test_expect_success 'detect invalid checksum hash' ' "incorrect checksum" ' +test_expect_success 'git fsck (checks commit-graph)' ' + cd "$TRASH_DIRECTORY/full" && + git fsck && + corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \ + "incorrect checksum" && + test_must_fail git fsck +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 17/22] commit-graph: verify contents match checksum
The commit-graph file ends with a SHA1 hash of the previous contents. If a commit-graph file has errors but the checksum hash is correct, then we know that the problem is a bug in Git and not simply file corruption after-the-fact. Compute the checksum right away so it is the first error that appears, and make the message translatable since this error can be "corrected" by a user by simply deleting the file and recomputing. The rest of the errors are useful only to developers. Be sure to continue checking the rest of the file data if the checksum is wrong. This is important for our tests, as we break the checksum as we modify bytes of the commit-graph file. Signed-off-by: Derrick Stolee --- commit-graph.c | 16 ++-- t/t5318-commit-graph.sh | 6 ++ 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 6d6c6beff9..d926c4b59f 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -833,6 +833,7 @@ void write_commit_graph(const char *obj_dir, oids.nr = 0; } +#define VERIFY_COMMIT_GRAPH_ERROR_HASH 2 static int verify_commit_graph_error; static void graph_report(const char *fmt, ...) @@ -852,8 +853,10 @@ static void graph_report(const char *fmt, ...) int verify_commit_graph(struct repository *r, struct commit_graph *g) { uint32_t i, cur_fanout_pos = 0; - struct object_id prev_oid, cur_oid; + struct object_id prev_oid, cur_oid, checksum; int generation_zero = 0; + struct hashfile *f; + int devnull; if (!g) { graph_report("no commit-graph file loaded"); @@ -872,6 +875,15 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) if (verify_commit_graph_error) return verify_commit_graph_error; + devnull = open("/dev/null", O_WRONLY); + f = hashfd(devnull, NULL); + hashwrite(f, g->data, g->data_len - g->hash_len); + finalize_hashfile(f, checksum.hash, CSUM_CLOSE); + if (hashcmp(checksum.hash, g->data + g->data_len - g->hash_len)) { + graph_report(_("the commit-graph file has incorrect checksum and is likely corrupt")); + verify_commit_graph_error = VERIFY_COMMIT_GRAPH_ERROR_HASH; + } + for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit; @@ -909,7 +921,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) cur_fanout_pos++; } - if (verify_commit_graph_error) + if (verify_commit_graph_error & ~VERIFY_COMMIT_GRAPH_ERROR_HASH) return verify_commit_graph_error; for (i = 0; i < g->num_commits; i++) { diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index a0cf1f66de..fed05e2f12 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -279,6 +279,7 @@ GRAPH_COMMIT_DATA_WIDTH=$(($HASH_LEN + 16)) GRAPH_OCTOPUS_DATA_OFFSET=$(($GRAPH_COMMIT_DATA_OFFSET + \ $GRAPH_COMMIT_DATA_WIDTH * $NUM_COMMITS)) GRAPH_BYTE_OCTOPUS=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4)) +GRAPH_BYTE_FOOTER=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4 * $NUM_OCTOPUS_EDGES)) # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -393,4 +394,9 @@ test_expect_success 'detect incorrect parent for octopus merge' ' "invalid parent" ' +test_expect_success 'detect invalid checksum hash' ' + corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \ + "incorrect checksum" +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 10/22] commit-graph: verify corrupt OID fanout and lookup
In the commit-graph file, the OID fanout chunk provides an index into the OID lookup. The 'verify' subcommand should find incorrect values in the fanout. Similarly, the 'verify' subcommand should find out-of-order values in the OID lookup. Signed-off-by: Derrick Stolee --- commit-graph.c | 36 t/t5318-commit-graph.sh | 22 ++ 2 files changed, 58 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index f30b4ccee9..866a9e7e41 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -843,6 +843,9 @@ static void graph_report(const char *fmt, ...) int verify_commit_graph(struct repository *r, struct commit_graph *g) { + uint32_t i, cur_fanout_pos = 0; + struct object_id prev_oid, cur_oid; + if (!g) { graph_report("no commit-graph file loaded"); return 1; @@ -857,5 +860,38 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) if (!g->chunk_commit_data) graph_report("commit-graph is missing the Commit Data chunk"); + if (verify_commit_graph_error) + return verify_commit_graph_error; + + for (i = 0; i < g->num_commits; i++) { + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); + + if (i && oidcmp(&prev_oid, &cur_oid) >= 0) + graph_report("commit-graph has incorrect OID order: %s then %s", +oid_to_hex(&prev_oid), +oid_to_hex(&cur_oid)); + + oidcpy(&prev_oid, &cur_oid); + + while (cur_oid.hash[0] > cur_fanout_pos) { + uint32_t fanout_value = get_be32(g->chunk_oid_fanout + cur_fanout_pos); + if (i != fanout_value) + graph_report("commit-graph has incorrect fanout value: fanout[%d] = %u != %u", +cur_fanout_pos, fanout_value, i); + + cur_fanout_pos++; + } + } + + while (cur_fanout_pos < 256) { + uint32_t fanout_value = get_be32(g->chunk_oid_fanout + cur_fanout_pos); + + if (g->num_commits != fanout_value) + graph_report("commit-graph has incorrect fanout value: fanout[%d] = %u != %u", +cur_fanout_pos, fanout_value, i); + + cur_fanout_pos++; + } + return verify_commit_graph_error; } diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index dc16849ddd..8ae2a49cfa 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -247,6 +247,7 @@ test_expect_success 'git commit-graph verify' ' git commit-graph verify >output ' +HASH_LEN=20 GRAPH_BYTE_VERSION=4 GRAPH_BYTE_HASH=5 GRAPH_BYTE_CHUNK_COUNT=6 @@ -258,6 +259,12 @@ GRAPH_BYTE_OID_LOOKUP_ID=$(($GRAPH_CHUNK_LOOKUP_OFFSET + \ 1 * $GRAPH_CHUNK_LOOKUP_WIDTH)) GRAPH_BYTE_COMMIT_DATA_ID=$(($GRAPH_CHUNK_LOOKUP_OFFSET + \ 2 * $GRAPH_CHUNK_LOOKUP_WIDTH)) +GRAPH_FANOUT_OFFSET=$(($GRAPH_CHUNK_LOOKUP_OFFSET + \ + $GRAPH_CHUNK_LOOKUP_WIDTH * $GRAPH_CHUNK_LOOKUP_ROWS)) +GRAPH_BYTE_FANOUT1=$(($GRAPH_FANOUT_OFFSET + 4 * 4)) +GRAPH_BYTE_FANOUT2=$(($GRAPH_FANOUT_OFFSET + 4 * 255)) +GRAPH_OID_LOOKUP_OFFSET=$(($GRAPH_FANOUT_OFFSET + 4 * 256)) +GRAPH_BYTE_OID_LOOKUP_ORDER=$(($GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN * 8)) # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -312,4 +319,19 @@ test_expect_success 'detect missing commit data chunk' ' "missing the Commit Data chunk" ' +test_expect_success 'detect incorrect fanout' ' + corrupt_graph_and_verify $GRAPH_BYTE_FANOUT1 "\01" \ + "fanout value" +' + +test_expect_success 'detect incorrect fanout final value' ' + corrupt_graph_and_verify $GRAPH_BYTE_FANOUT2 "\01" \ + "fanout value" +' + +test_expect_success 'detect incorrect OID order' ' + corrupt_graph_and_verify $GRAPH_BYTE_OID_LOOKUP_ORDER "\01" \ + "incorrect OID order" +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 21/22] gc: automatically write commit-graph files
The commit-graph file is a very helpful feature for speeding up git operations. In order to make it more useful, make it possible to write the commit-graph file during standard garbage collection operations. Add a 'gc.commitGraph' config setting that triggers writing a commit-graph file after any non-trivial 'git gc' command. Defaults to false while the commit-graph feature matures. We specifically do not want to have this on by default until the commit-graph feature is fully integrated with history-modifying features like shallow clones. Helped-by: Ævar Arnfjörð Bjarmason Signed-off-by: Derrick Stolee --- Documentation/config.txt | 9 ++--- Documentation/git-gc.txt | 4 builtin/gc.c | 6 ++ t/t5318-commit-graph.sh | 14 ++ 4 files changed, 30 insertions(+), 3 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1cc18a828c..978deecfee 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -904,9 +904,12 @@ core.notesRef:: This setting defaults to "refs/notes/commits", and it can be overridden by the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1]. -core.commitGraph:: - Enable git commit graph feature. Allows reading from the - commit-graph file. +gc.commitGraph:: + If true, then gc will rewrite the commit-graph file when + linkgit:git-gc[1] is run. When using linkgit:git-gc[1] + '--auto' the commit-graph will be updated if housekeeping is + required. Default is false. See linkgit:git-commit-graph[1] + for details. core.sparseCheckout:: Enable "sparse checkout" feature. See section "Sparse checkout" in diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt index 24b2dd44fe..f5bc98ccb3 100644 --- a/Documentation/git-gc.txt +++ b/Documentation/git-gc.txt @@ -136,6 +136,10 @@ The optional configuration variable `gc.packRefs` determines if it within all non-bare repos or it can be set to a boolean value. This defaults to true. +The optional configuration variable `gc.commitGraph` determines if +'git gc' should run 'git commit-graph write'. This can be set to a +boolean value. This defaults to false. + The optional configuration variable `gc.aggressiveWindow` controls how much time is spent optimizing the delta compression of the objects in the repository when the --aggressive option is specified. The larger diff --git a/builtin/gc.c b/builtin/gc.c index ccfb1ceaeb..b7dd206f41 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -20,6 +20,7 @@ #include "sigchain.h" #include "argv-array.h" #include "commit.h" +#include "commit-graph.h" #include "packfile.h" #include "object-store.h" #include "pack.h" @@ -40,6 +41,7 @@ static int aggressive_depth = 50; static int aggressive_window = 250; static int gc_auto_threshold = 6700; static int gc_auto_pack_limit = 50; +static int gc_write_commit_graph = 0; static int detach_auto = 1; static timestamp_t gc_log_expire_time; static const char *gc_log_expire = "1.day.ago"; @@ -129,6 +131,7 @@ static void gc_config(void) git_config_get_int("gc.aggressivedepth", &aggressive_depth); git_config_get_int("gc.auto", &gc_auto_threshold); git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit); + git_config_get_bool("gc.writecommitgraph", &gc_write_commit_graph); git_config_get_bool("gc.autodetach", &detach_auto); git_config_get_expiry("gc.pruneexpire", &prune_expire); git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire); @@ -641,6 +644,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix) if (pack_garbage.nr > 0) clean_pack_garbage(); + if (gc_write_commit_graph) + write_commit_graph_reachable(get_object_directory(), 0); + if (auto_gc && too_many_loose_objects()) warning(_("There are too many unreachable loose objects; " "run 'git prune' to remove them.")); diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 2208c266b5..5947de3d24 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -245,6 +245,20 @@ test_expect_success 'perform fast-forward merge in full repo' ' test_cmp expect output ' +test_expect_success 'check that gc computes commit-graph' ' + cd "$TRASH_DIRECTORY/full" && + git commit --allow-empty -m "blank" && + git commit-graph write --reachable && + cp $objdir/info/commit-graph commit-graph-before-gc && + git reset --hard HEAD~1 && + git config gc.writeCommitGraph true && + git gc && + cp $objdir/info/commit-graph commit-graph-after-gc && + ! test_cmp commit-graph-before-gc commit-graph-after-gc && + git commit-graph write --reachable && + test_cmp commit-graph-after-gc $objdir/info/commit-graph +' + # the verify tests below expect the commit-graph to contain # exactly the commits reachable
[PATCH v7 12/22] commit-graph: verify root tree OIDs
The 'verify' subcommand must compare the commit content parsed from the commit-graph against the content in the object database. Use lookup_commit() and parse_commit_in_graph_one() to parse the commits from the graph and compare against a commit that is loaded separately and parsed directly from the object database. Add checks for the root tree OID. Signed-off-by: Derrick Stolee --- commit-graph.c | 17 - t/t5318-commit-graph.sh | 7 +++ 2 files changed, 23 insertions(+), 1 deletion(-) diff --git a/commit-graph.c b/commit-graph.c index 00e89b71e9..5df18394f9 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -866,6 +866,8 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) return verify_commit_graph_error; for (i = 0; i < g->num_commits; i++) { + struct commit *graph_commit; + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); if (i && oidcmp(&prev_oid, &cur_oid) >= 0) @@ -883,6 +885,11 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) cur_fanout_pos++; } + + graph_commit = lookup_commit(&cur_oid); + if (!parse_commit_in_graph_one(g, graph_commit)) + graph_report("failed to parse %s from commit-graph", +oid_to_hex(&cur_oid)); } while (cur_fanout_pos < 256) { @@ -899,16 +906,24 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) return verify_commit_graph_error; for (i = 0; i < g->num_commits; i++) { - struct commit *odb_commit; + struct commit *graph_commit, *odb_commit; hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); + graph_commit = lookup_commit(&cur_oid); odb_commit = (struct commit *)create_object(r, cur_oid.hash, alloc_commit_node(r)); if (parse_commit_internal(odb_commit, 0, 0)) { graph_report("failed to parse %s from object database", oid_to_hex(&cur_oid)); continue; } + + if (oidcmp(&get_commit_tree_in_graph_one(g, graph_commit)->object.oid, + get_commit_tree_oid(odb_commit))) + graph_report("root tree OID for commit %s in commit-graph is %s != %s", +oid_to_hex(&cur_oid), + oid_to_hex(get_commit_tree_oid(graph_commit)), + oid_to_hex(get_commit_tree_oid(odb_commit))); } return verify_commit_graph_error; diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index a27ec97269..f258c6d5d0 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -267,6 +267,8 @@ GRAPH_BYTE_FANOUT2=$(($GRAPH_FANOUT_OFFSET + 4 * 255)) GRAPH_OID_LOOKUP_OFFSET=$(($GRAPH_FANOUT_OFFSET + 4 * 256)) GRAPH_BYTE_OID_LOOKUP_ORDER=$(($GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN * 8)) GRAPH_BYTE_OID_LOOKUP_MISSING=$(($GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN * 4 + 10)) +GRAPH_COMMIT_DATA_OFFSET=$(($GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN * $NUM_COMMITS)) +GRAPH_BYTE_COMMIT_TREE=$GRAPH_COMMIT_DATA_OFFSET # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -341,4 +343,9 @@ test_expect_success 'detect OID not in object database' ' "from object database" ' +test_expect_success 'detect incorrect tree OID' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_TREE "\01" \ + "root tree OID for commit" +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 14/22] commit-graph: verify generation number
While iterating through the commit parents, perform the generation number calculation and compare against the value stored in the commit-graph. The tests demonstrate that having a different set of parents affects the generation number calculation, and this value propagates to descendants. Hence, we drop the single-line condition on the output. Since Git will ship with the commit-graph feature without generation numbers, we need to accept commit-graphs with all generation numbers equal to zero. In this case, ignore the generation number calculation. However, verify that we should never have a mix of zero and non-zero generation numbers. Create a test that sets one commit to generation zero and all following commits report a failure as they have non-zero generation in a file that contains generation number zero. Signed-off-by: Derrick Stolee --- commit-graph.c | 34 ++ t/t5318-commit-graph.sh | 11 +++ 2 files changed, 45 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 6d8d774eb0..e0f71658da 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -846,10 +846,14 @@ static void graph_report(const char *fmt, ...) va_end(ap); } +#define GENERATION_ZERO_EXISTS 1 +#define GENERATION_NUMBER_EXISTS 2 + int verify_commit_graph(struct repository *r, struct commit_graph *g) { uint32_t i, cur_fanout_pos = 0; struct object_id prev_oid, cur_oid; + int generation_zero = 0; if (!g) { graph_report("no commit-graph file loaded"); @@ -911,6 +915,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit, *odb_commit; struct commit_list *graph_parents, *odb_parents; + uint32_t max_generation = 0; hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); @@ -945,6 +950,9 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) oid_to_hex(&graph_parents->item->object.oid), oid_to_hex(&odb_parents->item->object.oid)); + if (graph_parents->item->generation > max_generation) + max_generation = graph_parents->item->generation; + graph_parents = graph_parents->next; odb_parents = odb_parents->next; } @@ -952,6 +960,32 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) if (odb_parents != NULL) graph_report("commit-graph parent list for commit %s terminates early", oid_to_hex(&cur_oid)); + + if (!graph_commit->generation) { + if (generation_zero == GENERATION_NUMBER_EXISTS) + graph_report("commit-graph has generation number zero for commit %s, but non-zero elsewhere", +oid_to_hex(&cur_oid)); + generation_zero = GENERATION_ZERO_EXISTS; + } else if (generation_zero == GENERATION_ZERO_EXISTS) + graph_report("commit-graph has non-zero generation number for commit %s, but zero elsewhere", +oid_to_hex(&cur_oid)); + + if (generation_zero == GENERATION_ZERO_EXISTS) + continue; + + /* +* If one of our parents has generation GENERATION_NUMBER_MAX, then +* our generation is also GENERATION_NUMBER_MAX. Decrement to avoid +* extra logic in the following condition. +*/ + if (max_generation == GENERATION_NUMBER_MAX) + max_generation--; + + if (graph_commit->generation != max_generation + 1) + graph_report("commit-graph generation for commit %s is %u != %u", +oid_to_hex(&cur_oid), +graph_commit->generation, +max_generation + 1); } return verify_commit_graph_error; diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index b41c8f4d9b..3128e19aef 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -272,6 +272,7 @@ GRAPH_BYTE_COMMIT_TREE=$GRAPH_COMMIT_DATA_OFFSET GRAPH_BYTE_COMMIT_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN)) GRAPH_BYTE_COMMIT_EXTRA_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 4)) GRAPH_BYTE_COMMIT_WRONG_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 3)) +GRAPH_BYTE_COMMIT_GENERATION=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 11)) # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -366,4 +367,14 @@ test_expect_success 'detect wrong
[PATCH v7 13/22] commit-graph: verify parent list
The commit-graph file stores parents in a two-column portion of the commit data chunk. If there is only one parent, then the second column stores 0x to indicate no second parent. The 'verify' subcommand checks the parent list for the commit loaded from the commit-graph and the one parsed from the object database. Test these checks for corrupt parents, too many parents, and wrong parents. Add a boundary check to insert_parent_or_die() for when the parent position value is out of range. The octopus merge will be tested in a later commit. Signed-off-by: Derrick Stolee --- commit-graph.c | 28 t/t5318-commit-graph.sh | 18 ++ 2 files changed, 46 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 5df18394f9..6d8d774eb0 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -244,6 +244,9 @@ static struct commit_list **insert_parent_or_die(struct commit_graph *g, struct commit *c; struct object_id oid; + if (pos >= g->num_commits) + die("invalid parent position %"PRIu64, pos); + hashcpy(oid.hash, g->chunk_oid_lookup + g->hash_len * pos); c = lookup_commit(&oid); if (!c) @@ -907,6 +910,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit, *odb_commit; + struct commit_list *graph_parents, *odb_parents; hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); @@ -924,6 +928,30 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g) oid_to_hex(&cur_oid), oid_to_hex(get_commit_tree_oid(graph_commit)), oid_to_hex(get_commit_tree_oid(odb_commit))); + + graph_parents = graph_commit->parents; + odb_parents = odb_commit->parents; + + while (graph_parents) { + if (odb_parents == NULL) { + graph_report("commit-graph parent list for commit %s is too long", +oid_to_hex(&cur_oid)); + break; + } + + if (oidcmp(&graph_parents->item->object.oid, &odb_parents->item->object.oid)) + graph_report("commit-graph parent for %s is %s != %s", +oid_to_hex(&cur_oid), + oid_to_hex(&graph_parents->item->object.oid), + oid_to_hex(&odb_parents->item->object.oid)); + + graph_parents = graph_parents->next; + odb_parents = odb_parents->next; + } + + if (odb_parents != NULL) + graph_report("commit-graph parent list for commit %s terminates early", +oid_to_hex(&cur_oid)); } return verify_commit_graph_error; diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index f258c6d5d0..b41c8f4d9b 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -269,6 +269,9 @@ GRAPH_BYTE_OID_LOOKUP_ORDER=$(($GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN * 8)) GRAPH_BYTE_OID_LOOKUP_MISSING=$(($GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN * 4 + 10)) GRAPH_COMMIT_DATA_OFFSET=$(($GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN * $NUM_COMMITS)) GRAPH_BYTE_COMMIT_TREE=$GRAPH_COMMIT_DATA_OFFSET +GRAPH_BYTE_COMMIT_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN)) +GRAPH_BYTE_COMMIT_EXTRA_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 4)) +GRAPH_BYTE_COMMIT_WRONG_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 3)) # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -348,4 +351,19 @@ test_expect_success 'detect incorrect tree OID' ' "root tree OID for commit" ' +test_expect_success 'detect incorrect parent int-id' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_PARENT "\01" \ + "invalid parent" +' + +test_expect_success 'detect extra parent int-id' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_EXTRA_PARENT "\00" \ + "is too long" +' + +test_expect_success 'detect wrong parent' ' + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_WRONG_PARENT "\01" \ + "commit-graph parent for" +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 16/22] commit-graph: test for corrupted octopus edge
The commit-graph file has an extra chunk to store the parent int-ids for parents beyond the first parent for octopus merges. Our test repo has a single octopus merge that we can manipulate to demonstrate the 'verify' subcommand detects incorrect values in that chunk. Signed-off-by: Derrick Stolee --- t/t5318-commit-graph.sh | 10 ++ 1 file changed, 10 insertions(+) diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 2c65e6a95c..a0cf1f66de 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -248,6 +248,7 @@ test_expect_success 'git commit-graph verify' ' ' NUM_COMMITS=9 +NUM_OCTOPUS_EDGES=2 HASH_LEN=20 GRAPH_BYTE_VERSION=4 GRAPH_BYTE_HASH=5 @@ -274,6 +275,10 @@ GRAPH_BYTE_COMMIT_EXTRA_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 4)) GRAPH_BYTE_COMMIT_WRONG_PARENT=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 3)) GRAPH_BYTE_COMMIT_GENERATION=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 11)) GRAPH_BYTE_COMMIT_DATE=$(($GRAPH_COMMIT_DATA_OFFSET + $HASH_LEN + 12)) +GRAPH_COMMIT_DATA_WIDTH=$(($HASH_LEN + 16)) +GRAPH_OCTOPUS_DATA_OFFSET=$(($GRAPH_COMMIT_DATA_OFFSET + \ +$GRAPH_COMMIT_DATA_WIDTH * $NUM_COMMITS)) +GRAPH_BYTE_OCTOPUS=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4)) # usage: corrupt_graph_and_verify # Manipulates the commit-graph file at the position @@ -383,4 +388,9 @@ test_expect_success 'detect incorrect commit date' ' "commit date" ' +test_expect_success 'detect incorrect parent for octopus merge' ' + corrupt_graph_and_verify $GRAPH_BYTE_OCTOPUS "\01" \ + "invalid parent" +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 04/22] commit-graph: parse commit from chosen graph
Before verifying a commit-graph file against the object database, we need to parse all commits from the given commit-graph file. Create parse_commit_in_graph_one() to target a given struct commit_graph. Signed-off-by: Derrick Stolee --- commit-graph.c | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index f83f6d2373..e77b19971d 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -314,7 +314,7 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin } } -int parse_commit_in_graph(struct commit *item) +static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item) { uint32_t pos; @@ -322,9 +322,21 @@ int parse_commit_in_graph(struct commit *item) return 0; if (item->object.parsed) return 1; + + if (find_commit_in_graph(item, g, &pos)) + return fill_commit_in_graph(item, g, pos); + + return 0; +} + +int parse_commit_in_graph(struct commit *item) +{ + if (!core_commit_graph) + return 0; + prepare_commit_graph(); - if (commit_graph && find_commit_in_graph(item, commit_graph, &pos)) - return fill_commit_in_graph(item, commit_graph, pos); + if (commit_graph) + return parse_commit_in_graph_one(commit_graph, item); return 0; } -- 2.18.0.24.g1b579a2ee9
[PATCH v7 07/22] commit-graph: add 'verify' subcommand
If the commit-graph file becomes corrupt, we need a way to verify that its contents match the object database. In the manner of 'git fsck' we will implement a 'git commit-graph verify' subcommand to report all issues with the file. Add the 'verify' subcommand to the 'commit-graph' builtin and its documentation. The subcommand is currently a no-op except for loading the commit-graph into memory, which may trigger run-time errors that would be caught by normal use. Add a simple test that ensures the command returns a zero error code. If no commit-graph file exists, this is an acceptable state. Do not report any errors. Helped-by: Ramsay Jones Signed-off-by: Derrick Stolee --- Documentation/git-commit-graph.txt | 6 + builtin/commit-graph.c | 39 ++ commit-graph.c | 23 ++ commit-graph.h | 3 +++ t/t5318-commit-graph.sh| 10 5 files changed, 81 insertions(+) diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt index 4c97b555cc..a222cfab08 100644 --- a/Documentation/git-commit-graph.txt +++ b/Documentation/git-commit-graph.txt @@ -10,6 +10,7 @@ SYNOPSIS [verse] 'git commit-graph read' [--object-dir ] +'git commit-graph verify' [--object-dir ] 'git commit-graph write' [--object-dir ] @@ -52,6 +53,11 @@ existing commit-graph file. Read a graph file given by the commit-graph file and output basic details about the graph file. Used for debugging purposes. +'verify':: + +Read the commit-graph file and verify its contents against the object +database. Used to check for corrupted data. + EXAMPLES diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index f0875b8bf3..9d108f43a9 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -3,15 +3,22 @@ #include "dir.h" #include "lockfile.h" #include "parse-options.h" +#include "repository.h" #include "commit-graph.h" static char const * const builtin_commit_graph_usage[] = { N_("git commit-graph [--object-dir ]"), N_("git commit-graph read [--object-dir ]"), + N_("git commit-graph verify [--object-dir ]"), N_("git commit-graph write [--object-dir ] [--append] [--stdin-packs|--stdin-commits]"), NULL }; +static const char * const builtin_commit_graph_verify_usage[] = { + N_("git commit-graph verify [--object-dir ]"), + NULL +}; + static const char * const builtin_commit_graph_read_usage[] = { N_("git commit-graph read [--object-dir ]"), NULL @@ -29,6 +36,36 @@ static struct opts_commit_graph { int append; } opts; + +static int graph_verify(int argc, const char **argv) +{ + struct commit_graph *graph = NULL; + char *graph_name; + + static struct option builtin_commit_graph_verify_options[] = { + OPT_STRING(0, "object-dir", &opts.obj_dir, + N_("dir"), + N_("The object directory to store the graph")), + OPT_END(), + }; + + argc = parse_options(argc, argv, NULL, +builtin_commit_graph_verify_options, +builtin_commit_graph_verify_usage, 0); + + if (!opts.obj_dir) + opts.obj_dir = get_object_directory(); + + graph_name = get_commit_graph_filename(opts.obj_dir); + graph = load_commit_graph_one(graph_name); + FREE_AND_NULL(graph_name); + + if (!graph) + return 0; + + return verify_commit_graph(the_repository, graph); +} + static int graph_read(int argc, const char **argv) { struct commit_graph *graph = NULL; @@ -165,6 +202,8 @@ int cmd_commit_graph(int argc, const char **argv, const char *prefix) if (argc > 0) { if (!strcmp(argv[0], "read")) return graph_read(argc, argv); + if (!strcmp(argv[0], "verify")) + return graph_verify(argc, argv); if (!strcmp(argv[0], "write")) return graph_write(argc, argv); } diff --git a/commit-graph.c b/commit-graph.c index 9e228d3bb5..22ef696e18 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -827,3 +827,26 @@ void write_commit_graph(const char *obj_dir, oids.alloc = 0; oids.nr = 0; } + +static int verify_commit_graph_error; + +static void graph_report(const char *fmt, ...) +{ + va_list ap; + verify_commit_graph_error = 1; + + va_start(ap, fmt); + vfprintf(stderr, fmt, ap); + fprintf(stderr, "\n"); + va_end(ap); +} + +int verify_commit_graph(struct repository *r, struct commit_graph *g) +{ + if (!g) { + graph_report("no commit-graph file loaded"); + return 1; + } + + return verify_commit_graph_error; +} diff --git a/commit-graph.h b/commit-graph.h index 96cccb10f3..4359812fb4 100644 ---
[PATCH v7 00/22] Integrate commit-graph into 'fsck' and 'gc'
From: Derrick Stolee This v7 has very little material difference from the previous version. The only changes are: * Rename 'gc.commitGraph' to 'gc.writeCommitGraph' * Add 't5318-commit-graph.sh: use core.commitGraph' patch that was dropped. * Rebase onto latest master, which appears to fix any issues merging into 'next' and 'pu'. I'm sending this from my gmail account to hopefully avoid any message munging that happened in v6. You can see my commits on GitHub [1]. [1] https://github.com/gitgitgadget/git/pull/6 Thanks, -Stolee Derrick Stolee (22): t5318-commit-graph.sh: use core.commitGraph commit-graph: UNLEAK before die() commit-graph: fix GRAPH_MIN_SIZE commit-graph: parse commit from chosen graph commit: force commit to parse from object database commit-graph: load a root tree from specific graph commit-graph: add 'verify' subcommand commit-graph: verify catches corrupt signature commit-graph: verify required chunks are present commit-graph: verify corrupt OID fanout and lookup commit-graph: verify objects exist commit-graph: verify root tree OIDs commit-graph: verify parent list commit-graph: verify generation number commit-graph: verify commit date commit-graph: test for corrupted octopus edge commit-graph: verify contents match checksum fsck: verify commit-graph commit-graph: use string-list API for input commit-graph: add '--reachable' option gc: automatically write commit-graph files commit-graph: update design document Documentation/config.txt | 9 +- Documentation/git-commit-graph.txt | 14 +- Documentation/git-fsck.txt | 3 + Documentation/git-gc.txt | 4 + Documentation/technical/commit-graph.txt | 22 -- builtin/commit-graph.c | 99 ++--- builtin/fsck.c | 21 ++ builtin/gc.c | 6 + commit-graph.c | 249 +-- commit-graph.h | 11 +- commit.c | 10 +- commit.h | 1 + t/t5318-commit-graph.sh | 205 ++- 13 files changed, 572 insertions(+), 82 deletions(-) base-commit: ed843436dd4924c10669820cc73daf50f0b4dabd -- 2.18.0.24.g1b579a2ee9
[PATCH v7 05/22] commit: force commit to parse from object database
In anticipation of verifying commit-graph file contents against the object database, create parse_commit_internal() to allow side-stepping the commit-graph file and parse directly from the object database. Due to the use of generation numbers, this method should not be called unless the intention is explicit in avoiding commits from the commit-graph file. Signed-off-by: Derrick Stolee --- commit.c | 10 -- commit.h | 1 + 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/commit.c b/commit.c index 0c3b75aeff..598cf21cee 100644 --- a/commit.c +++ b/commit.c @@ -418,7 +418,7 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long s return 0; } -int parse_commit_gently(struct commit *item, int quiet_on_missing) +int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph) { enum object_type type; void *buffer; @@ -429,7 +429,7 @@ int parse_commit_gently(struct commit *item, int quiet_on_missing) return -1; if (item->object.parsed) return 0; - if (parse_commit_in_graph(item)) + if (use_commit_graph && parse_commit_in_graph(item)) return 0; buffer = read_object_file(&item->object.oid, &type, &size); if (!buffer) @@ -441,6 +441,7 @@ int parse_commit_gently(struct commit *item, int quiet_on_missing) return error("Object %s not a commit", oid_to_hex(&item->object.oid)); } + ret = parse_commit_buffer(item, buffer, size, 0); if (save_commit_buffer && !ret) { set_commit_buffer(item, buffer, size); @@ -450,6 +451,11 @@ int parse_commit_gently(struct commit *item, int quiet_on_missing) return ret; } +int parse_commit_gently(struct commit *item, int quiet_on_missing) +{ + return parse_commit_internal(item, quiet_on_missing, 1); +} + void parse_commit_or_die(struct commit *item) { if (parse_commit(item)) diff --git a/commit.h b/commit.h index 3ad07c2e3d..7e0f273720 100644 --- a/commit.h +++ b/commit.h @@ -77,6 +77,7 @@ struct commit *lookup_commit_reference_by_name(const char *name); struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref_name); int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long size, int check_graph); +int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph); int parse_commit_gently(struct commit *item, int quiet_on_missing); static inline int parse_commit(struct commit *item) { -- 2.18.0.24.g1b579a2ee9
[PATCH v7 08/22] commit-graph: verify catches corrupt signature
This is the first of several commits that add a test to check that 'git commit-graph verify' catches corruption in the commit-graph file. The first test checks that the command catches an error in the file signature. This is a check that exists in the existing commit-graph reading code. Add a helper method 'corrupt_graph_and_verify' to the test script t5318-commit-graph.sh. This helper corrupts the commit-graph file at a certain location, runs 'git commit-graph verify', and reports the output to the 'err' file. This data is filtered to remove the lines added by 'test_must_fail' when the test is run verbosely. Then, the output is checked to contain a specific error message. Most messages from 'git commit-graph verify' will not be marked for translation. There will be one exception: the message that reports an invalid checksum will be marked for translation, as that is the only message that is intended for a typical user. Helped-by: Szeder Gábor Signed-off-by: Derrick Stolee --- t/t5318-commit-graph.sh | 43 + 1 file changed, 43 insertions(+) diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 0830ef9fdd..c0c1ff09b9 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -235,9 +235,52 @@ test_expect_success 'perform fast-forward merge in full repo' ' test_cmp expect output ' +# the verify tests below expect the commit-graph to contain +# exactly the commits reachable from the commits/8 branch. +# If the file changes the set of commits in the list, then the +# offsets into the binary file will result in different edits +# and the tests will likely break. + test_expect_success 'git commit-graph verify' ' cd "$TRASH_DIRECTORY/full" && + git rev-parse commits/8 | git commit-graph write --stdin-commits && git commit-graph verify >output ' +GRAPH_BYTE_VERSION=4 +GRAPH_BYTE_HASH=5 + +# usage: corrupt_graph_and_verify +# Manipulates the commit-graph file at the position +# by inserting the data, then runs 'git commit-graph verify' +# and places the output in the file 'err'. Test 'err' for +# the given string. +corrupt_graph_and_verify() { + pos=$1 + data="${2:-\0}" + grepstr=$3 + cd "$TRASH_DIRECTORY/full" && + test_when_finished mv commit-graph-backup $objdir/info/commit-graph && + cp $objdir/info/commit-graph commit-graph-backup && + printf "$data" | dd of="$objdir/info/commit-graph" bs=1 seek="$pos" conv=notrunc && + test_must_fail git commit-graph verify 2>test_err && + grep -v "^+" test_err >err + test_i18ngrep "$grepstr" err +} + +test_expect_success 'detect bad signature' ' + corrupt_graph_and_verify 0 "\0" \ + "graph signature" +' + +test_expect_success 'detect bad version' ' + corrupt_graph_and_verify $GRAPH_BYTE_VERSION "\02" \ + "graph version" +' + +test_expect_success 'detect bad hash version' ' + corrupt_graph_and_verify $GRAPH_BYTE_HASH "\02" \ + "hash version" +' + test_done -- 2.18.0.24.g1b579a2ee9
[PATCH v7 01/22] t5318-commit-graph.sh: use core.commitGraph
The commit-graph tests should be checking that normal Git operations succeed and have matching output with and without the commit-graph feature enabled. However, the test was toggling 'core.graph' instead of the correct 'core.commitGraph' variable. Signed-off-by: Derrick Stolee --- Junio, I sent this patch as a one-off a while ago, but it seems it was dropped. I'm adding it back here so we don't forget it. -Stolee t/t5318-commit-graph.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 77d85aefe7..59d0be2877 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -28,8 +28,8 @@ test_expect_success 'create commits and repack' ' ' graph_git_two_modes() { - git -c core.graph=true $1 >output - git -c core.graph=false $1 >expect + git -c core.commitGraph=true $1 >output + git -c core.commitGraph=false $1 >expect test_cmp output expect } -- 2.18.0.24.g1b579a2ee9
[PATCH v7 06/22] commit-graph: load a root tree from specific graph
When lazy-loading a tree for a commit, it will be important to select the tree from a specific struct commit_graph. Create a new method that specifies the commit-graph file and use that in get_commit_tree_in_graph(). Signed-off-by: Derrick Stolee --- commit-graph.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index e77b19971d..9e228d3bb5 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -362,14 +362,20 @@ static struct tree *load_tree_for_commit(struct commit_graph *g, struct commit * return c->maybe_tree; } -struct tree *get_commit_tree_in_graph(const struct commit *c) +static struct tree *get_commit_tree_in_graph_one(struct commit_graph *g, +const struct commit *c) { if (c->maybe_tree) return c->maybe_tree; if (c->graph_pos == COMMIT_NOT_FROM_GRAPH) - BUG("get_commit_tree_in_graph called from non-commit-graph commit"); + BUG("get_commit_tree_in_graph_one called from non-commit-graph commit"); + + return load_tree_for_commit(g, (struct commit *)c); +} - return load_tree_for_commit(commit_graph, (struct commit *)c); +struct tree *get_commit_tree_in_graph(const struct commit *c) +{ + return get_commit_tree_in_graph_one(commit_graph, c); } static void write_graph_chunk_fanout(struct hashfile *f, -- 2.18.0.24.g1b579a2ee9
[PATCH v7 03/22] commit-graph: fix GRAPH_MIN_SIZE
The GRAPH_MIN_SIZE macro should be the smallest size of a parsable commit-graph file. However, the minimum number of chunks was wrong. It is possible to write a commit-graph file with zero commits, and that violates this macro's value. Rewrite the macro, and use extra macros to better explain the magic constants. Signed-off-by: Derrick Stolee --- commit-graph.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index b63a1fc85e..f83f6d2373 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -35,10 +35,11 @@ #define GRAPH_LAST_EDGE 0x8000 +#define GRAPH_HEADER_SIZE 8 #define GRAPH_FANOUT_SIZE (4 * 256) #define GRAPH_CHUNKLOOKUP_WIDTH 12 -#define GRAPH_MIN_SIZE (5 * GRAPH_CHUNKLOOKUP_WIDTH + GRAPH_FANOUT_SIZE + \ - GRAPH_OID_LEN + 8) +#define GRAPH_MIN_SIZE (GRAPH_HEADER_SIZE + 4 * GRAPH_CHUNKLOOKUP_WIDTH \ + + GRAPH_FANOUT_SIZE + GRAPH_OID_LEN) char *get_commit_graph_filename(const char *obj_dir) { -- 2.18.0.24.g1b579a2ee9
[PATCH v7 02/22] commit-graph: UNLEAK before die()
Signed-off-by: Derrick Stolee --- builtin/commit-graph.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index 37420ae0fd..f0875b8bf3 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -51,8 +51,11 @@ static int graph_read(int argc, const char **argv) graph_name = get_commit_graph_filename(opts.obj_dir); graph = load_commit_graph_one(graph_name); - if (!graph) + if (!graph) { + UNLEAK(graph_name); die("graph file %s does not exist", graph_name); + } + FREE_AND_NULL(graph_name); printf("header: %08x %d %d %d %d\n", -- 2.18.0.24.g1b579a2ee9
Re: [PATCH] rebase: fix documentation formatting
Hi Vlad, On Wed, 27 Jun 2018, Vladimir Parfinenko wrote: > Last sections are squashed into non-formatted block after adding > "REBASING MERGES". > To reproduce the error see bottom of page: > https://git-scm.com/docs/git-rebase > > Signed-off-by: Vladimir Parfinenko ACK! Thank you so much for fixing this (as you may suspect, I forgot to adjust the length of the "underline" after changing the header from "RECREATING MERGES" to "REBASING MERGES"), Johannes
Re: [RFC PATCH v5] Implement --first-parent for git rev-list --bisect
Hi Junio, On Tue, 26 Jun 2018, Junio C Hamano wrote: > Christian Couder writes: > > > Obviousness is often not the same for everybody. > > ... which you just learned---what you thought obvious turns out to > be not so obvious after all, so you adjust to help your readers. Indeed. And trying to tell the reader that they should find it obvious is not exactly productive. It only causes bad feelings and can be easily avoided. > >> In this particular case it even feels as if this test is not even > >> testing what it should test at all: > >> > >> - it should verify that all of the commits in the first parent > >> lineage are part of the list > > > > It does that. > > > >> - it should verify that none of the other commits are in the list > > > > It does that too. > > But the point is it does a lot more by insisting exact output. Indeed. One thing it does in addition is to make the test code a lot less obvious for readers in general. Let me summarize again what good regression tests have to deliver, because I think it cannot be stressed enough, especially in this context where we run the danger of adding poor regression tests: - a regression test needs to catch regressions (d'oh) - a regression test needs to be *quick*. Otherwise developers will skip running them, which is worse than having no tests at all (because the effort to develop them is wasted) - a regression test must make it as easy as possible to fix regressions There are quite a few corollaries to that last point, some of which are: - a regression test that fails for anything but an obvious reason is *useless* - a regression test that tries to test for *everything* (including dogs and cats) not only breaks the quickness requirement, but it also makes it confusing where to start fixing things. "The test talked about --bisect but now I am stuck somewhere in XYZ.c, what has *that* to do with --bisect?" is *not* something you want to risk any developer yelling at the test code that you authored - less is more. If you can use commits that were already generated in the common "setup" step, there is literally a negative value in generating a new set of commits instead - less is more. If you can catch the same regressions in three, concise *and* understandable lines, avoid using thirty lines instead - regression tests should not need to be adjusted when the logic changes in an intended way. It is a strong sign that a regression test was written badly if it starts failing for any reason other than a regression The overzealous "I want this output to be exactly this" stance of the tests we are discussing here is a very obvious violation here. Regression tests should fail only to indicate regressions. Really. Let me repeat that because it is so obvious when you think about it, but it is easy to forget when writing regression tests: Regression tests should fail only to indicate regressions. > For example, the version I reviewed had a two "expected output", and > said that the actual output must match either one of them. I guess it > was because there were two entries with the same distance and we cannot > rely on which order they appear in the result? If a test history gained > another entry with the same distance, then would we need 6 possible > expected output because we cannot rely on the order in which these three > come out? It is totally unnecessary to go there, as it would make those regression tests a lot less valuable than they could otherwise be. Let me elaborate further below. > That was the only thing I was complaining about. Dscho gave me too > much credit and read a lot more good things than what I actually > meant to say ;-). Why don't you just accept my praise gracefully? ;-) It's not that I gave you a lot of praise recently, even if you clearly deserve it. > >> And that is really all there is to test. > > Another is that "rev-list --bisect-all" promises that the entries > are ordered by the distance value. Yes! And you know what we can do there? We can test *precisely* that! # verify that the output is sorted by `dist` (descending) sed "s/.*dist=\([0-9]*\).*/\1/" dist && sort -n -r dist.sorted && test_cmp dist dist.sorted This extracts the distance numbers into their own file, then verifies that they were already sorted. The really big advantage here is that any future change that might result in a different order of entries with the same "dist" value *will not cause this regression test to fail*. And for a good reason: because ordering identical "dist" values differently is not a regression at all. > So taking the above three points, perhaps > > cat >expect < ... as written in one of the expect list in Tiago's patch > EOF Please no. Please let's *not* generate more commits when we already have a perfectly fine set of commits generated by the setup phase of the very same script. Please let's not use confusing names for those commit
Re: [PATCH] rebase -i: Fix white space in comments
Hi Dana, On Tue, 26 Jun 2018, dana wrote: > On 26 Jun 2018, at 16:44, Johannes Schindelin > wrote: > >There is of course one other way to fix this, and that is by rewriting > >this in C. > > > >Which Alban has done here ;-) > > > >http://public-inbox.org/git/20180626161643.31152-3-alban.gr...@gmail.com > > Oh, i'm sorry, i didn't see that. No need to be sorry, nobody expects you to read the "firehose" that is the Git mailing list in its entirety. > That change does appear to solve the same problem, so i'm happy to defer > to it. Thank you for confirming that it also fixes your issue, that is very helpful! Ciao, Johannes
[PATCH] rebase: fix documentation formatting
Last sections are squashed into non-formatted block after adding "REBASING MERGES". To reproduce the error see bottom of page: https://git-scm.com/docs/git-rebase Signed-off-by: Vladimir Parfinenko --- Documentation/git-rebase.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index bd5ecff98..6fe98165d 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -804,7 +804,7 @@ The ripple effect of a "hard case" recovery is especially bad: case" recovery too! REBASING MERGES -- +--- The interactive rebase command was originally designed to handle individual patch series. As such, it makes sense to exclude merge -- 2.18.0.windows.1