[PATCH v2 0/7] read/write_in_full leftovers
On Mon, Sep 25, 2017 at 04:26:47PM -0400, Jeff King wrote:

> This series addresses the bits leftover from the discussion two weeks
> ago in:
>
>   https://public-inbox.org/git/20170913170807.cyx7rrpoyhaau...@sigill.intra.peff.net/
>
> and its subthread. I don't think any of these is a real problem in
> practice, so this can be considered as a cleanup. I'm on the fence over
> whether the final one is a good idea. See the commit message for
> details.
>
>   [1/7]: files-backend: prefer "<0" for write_in_full() error check
>   [2/7]: notes-merge: drop dead zero-write code
>   [3/7]: read_in_full: reset errno before reading
>   [4/7]: get-tar-commit-id: prefer "!=" for read_in_full() error check
>   [5/7]: worktree: use xsize_t to access file size
>   [6/7]: worktree: check the result of read_in_full()
>   [7/7]: add xread_in_full() helper

Here's a followup based on the discussion on v1. There are actually a
few changes here.

I dropped the "read_in_full() should set errno on short reads" idea
(3/7 in the earlier series). It really is the caller's fault for
looking at errno when they know there hasn't been an error in the
first place. We should just bite the bullet and have the callers do
the right thing.

I also dropped the "xread_in_full" helper (7/7 earlier). The lego
sentences it created just weren't worth the hassle. Instead, I've
fixed all of the relevant callers to provide good error messages for
both cases. It's a few more lines of code, and it's probably rare for
users to see these in the first place. But it doesn't hurt too much to
be thorough, and I think it's good to model correct error handling.
This is in patches 4 and 5 below.

There are a handful of other changes:

  [1/7]: files-backend: prefer "<0" for write_in_full() error check
  [2/7]: notes-merge: drop dead zero-write code

These two are the same as before.
  [3/7]: prefer "!=" when checking read_in_full() result

Along with the get-tar-commit-id case, I found two other cases which
aren't bugs, but which should be updated for style reasons.

  [4/7]: avoid looking at errno for short read_in_full() returns
  [5/7]: distinguish error versus short read from read_in_full()

These two are new, and add in the messages. The ones in patch 4 are
actively wrong (they show a bogus errno). The ones in 5 are just less
descriptive than they could be. So if we find the extra lines of code
too much, we could drop 5 (or even convert the ones in 4 to just stop
looking at errno).

  [6/7]: worktree: use xsize_t to access file size

Same as before.

  [7/7]: worktree: check the result of read_in_full()

Similar, but now handles errors and short reads separately. It also
avoids leaking descriptors and memory in the error path (noticed by
Coverity).

 builtin/get-tar-commit-id.c |  6 --
 builtin/worktree.c          | 24 +---
 bulk-checkin.c              |  5 -
 csum-file.c                 |  2 +-
 notes-merge.c               |  2 --
 pack-write.c                |  7 ++-
 packfile.c                  | 11 +--
 pkt-line.c                  |  2 +-
 refs/files-backend.c        |  2 +-
 sha1_file.c                 | 11 ---
 10 files changed, 55 insertions(+), 17 deletions(-)

-Peff
Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response
Brandon Williams writes:

> +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
> +static int process_protocol_version(void)
> +{
> +	switch (determine_protocol_version_client(packet_buffer)) {
> +		case protocol_v1:
> +			return 1;
> +		case protocol_v0:
> +			return 0;
> +		default:
> +			die("server is speaking an unknown protocol");
> +	}
> +}

For the purpose of "technology demonstration" v1 protocol, it is OK to
discard the result of "determine_pvc()" like the above code, but in a
real application, we would do a bit more than just ignoring an extra
"version #" packet that appears at the beginning, no?

It would be sensible to design how the result of determine_pvc() call
is propagated to the remainder of the program in this patch and
implement it. Perhaps add a new global (like server_capabilities
already is) and store the value there, or something? Or pass a pointer
to enum protocol_version as a return-location parameter to this helper
function so that the process_capabilities() can pass a pointer to its
local variable?

> static void process_capabilities(int *len)
> {
> @@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  	 */
>  	int responded = 0;
>  	int len;
> -	int state = EXPECTING_FIRST_REF;
> +	int state = EXPECTING_PROTOCOL_VERSION;
>
>  	*list = NULL;
>
>  	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
>  		switch (state) {
> +		case EXPECTING_PROTOCOL_VERSION:
> +			if (process_protocol_version()) {
> +				state = EXPECTING_FIRST_REF;
> +				break;
> +			}
> +			state = EXPECTING_FIRST_REF;
> +			/* fallthrough */
>  		case EXPECTING_FIRST_REF:
>  			process_capabilities(&len);
>  			if (process_dummy_ref()) {
Re: [PATCH v2 5/9] upload-pack, receive-pack: introduce protocol version 1
Brandon Williams writes:

> @@ -1963,6 +1964,19 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  	else if (0 <= receive_unpack_limit)
>  		unpack_limit = receive_unpack_limit;
>
> +	switch (determine_protocol_version_server()) {
> +		case protocol_v1:
> +			if (advertise_refs || !stateless_rpc)
> +				packet_write_fmt(1, "version 1\n");
> +			/*
> +			 * v1 is just the original protocol with a version string,
> +			 * so just fall through after writing the version string.
> +			 */
> +		case protocol_v0:
> +		default:
> +			break;

When protocol_v2 is introduced in the other part of the codebase
(i.e. in protocol.[ch]), until these lines are updated accordingly to
take care of the new protocol, we'd pretend that the client asked (and
the server accepted) v0, even though the client and the daemon agreed
to talk v2.

Shouldn't the "default:" die instead?

The same for upload-pack.c
Re: [PATCH v2 4/9] daemon: recognize hidden request arguments
Brandon Williams writes:

> A normal request to git-daemon is structured as
> "command path/to/repo\0host=..\0" and due to a bug in an old version of
> git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
> command, 2009-06-04) we aren't able to place any extra args (separated
> by NULs) besides the host.

It's a bit unclear if that commit _introduced_ a bug, or just noticed
an old bug and documented it in its log message. How does that commit
impact the versions of Git that the updated code is capable of
interacting with?

> +static void parse_extra_args(struct hostinfo *hi, struct argv_array *env,
> +			     char *extra_args, int buflen)
> +{
> +	const char *end = extra_args + buflen;
> +	struct strbuf git_protocol = STRBUF_INIT;
> +
> +	/* First look for the host argument */
> +	extra_args = parse_host_arg(hi, extra_args, buflen);
> +
> +	/* Look for additional arguments placed after a second NUL byte */
> +	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
> +		const char *arg = extra_args;
> +
> +		/*
> +		 * Parse the extra arguments, adding most to 'git_protocol'
> +		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
> +		 * service that will be run.
> +		 *
> +		 * If there ends up being a particular arg in the future that
> +		 * git-daemon needs to parse specifically (like the 'host' arg)
> +		 * then it can be parsed here and not added to 'git_protocol'.
> +		 */
> +		if (*arg) {
> +			if (git_protocol.len > 0)
> +				strbuf_addch(&git_protocol, ':');
> +			strbuf_addstr(&git_protocol, arg);
> +		}
> +	}
> +
> +	if (git_protocol.len > 0)
> +		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
> +				 git_protocol.buf);
> +	strbuf_release(&git_protocol);
> }
Re: -s theirs use-case(s) Was: BUG: merge -s theirs is not in effect
On Wed, 27 Sep 2017, Junio C Hamano wrote:

> > and that is where the gotcha comes -- what if "my" changes were already
> > published? then I would like to avoid the rebase, and would -s theirs
> > to choose "their" solution in favor of mine and be able to push so
> > others could still "fast-forward" to the new state.
> >
> > So -- as to me it remains 'symmetric' ;)
>
> I do not necessarily agree. Once you decide that their history is
> the mainline, you'd rather want to treat your line of development as
> a side branch and make a merge in that direction, i.e. the first
> parent of the resulting merge is a commit on their history and the
> second parent is the last bad one of your history. So you would end
> up using "checkout their-history && merge -s ours your-history" to
> keep the first-parenthood sensible.
>
> And at that point, use of "-s ours" is no longer a workaround for
> lack of "-s theirs". It is a proper part of the desired semantics,
> i.e. from the point of view of the surviving canonical history line,
> you want to preserve what it did, nullifying what the other line of
> history did.
>
> So I still do not think the above scenario justifies "-s theirs".

ok, when you describe it like this (in my case I rarely cared about the
side of the merge), then indeed I might better do the entire dance with

	git reset --hard theirstate; git merge -s ours HEAD@{1}

and live happily with the left side being the one always correct and
hide "my" mistakes ;)

will keep it in mind

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik
Re: [PATCH v2 3/9] protocol: introduce protocol extension mechanisms
Brandon Williams writes:

> +`GIT_PROTOCOL`::
> +	For internal use only. Used in handshaking the wire protocol.
> +	Contains a colon ':' separated list of keys with optional values
> +	'key[=value]'. Presence of unknown keys must be tolerated.

Is this meant to be used only on the "server" end? Am I correct to
interpret "handshaking" to mean the initial connection acceptor
(e.g. "git daemon") uses it to pass what it decided to the programs
that implement the service (e.g. "git receive-pack")?

> +/*
> + * Environment variable used in handshaking the wire protocol.
> + * Contains a colon ':' separated list of keys with optional values
> + * 'key[=value]'. Presence of unknown keys must be tolerated.
> + */
> +#define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"

"Must be tolerated" feels a bit strange. When somebody asks you to use
"version 3" or "version 1 variant 2" while you only know "version 0"
or "version 1" and you are not yet even aware of the concept of
"variant", we simply ignore "variant=2" as if it wasn't there, even
though "version=3" will be rejected (because we know of "version";
it's just that we don't know "version=3").

> +enum protocol_version determine_protocol_version_server(void)
> +{
> +	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
> +	enum protocol_version version = protocol_v0;
> +
> +	if (git_protocol) {
> +		struct string_list list = STRING_LIST_INIT_DUP;
> +		const struct string_list_item *item;
> +		string_list_split(&list, git_protocol, ':', -1);
> +
> +		for_each_string_list_item(item, &list) {
> +			const char *value;
> +			enum protocol_version v;
> +
> +			if (skip_prefix(item->string, "version=", &value)) {
> +				v = parse_protocol_version(value);
> +				if (v > version)
> +					version = v;
> +			}
> +		}
> +
> +		string_list_clear(&list, 0);
> +	}
> +
> +	return version;
> +}

This implements "the largest one wins", not "the last one wins". Is
there a particular reason why the former is chosen?
Re: [PATCH 3/7] read_in_full: reset errno before reading
Jeff King writes:

> You actually don't need errno for that. You can write:
>
>	ret = read_in_full(..., size);
>	if (ret < 0)
>		die_errno("oops");
>	else if (ret != size)
>		die("short read");
>
> So I think using errno as a sentinel value to tell between the two cases
> doesn't have much value.
> ...
> One of the reasons I dislike the helper I showed earlier is that it
> involves assembling a lego sentence. But the alternative is requiring
> the caller to write both sentences from scratch (and writing a good
> sentence for the short read really is long: you expected X bytes but got
> Y).

All sensible considerations. I probably should have read the entire
discussion twice before starting to type X-<. Sorry about the noise.

> Sometimes a custom "short read" message can be better, if it's "could
> not read enough bytes for packfile header" or similar. You don't need to
> know the exact number then.

That's true, too. I dunno, either.
Re: [PATCH 3/7] read_in_full: reset errno before reading
On Tue, Sep 26, 2017 at 01:11:59PM +0900, Junio C Hamano wrote:

> Jeff King writes:
>
> >> #ifndef EUNDERFLOW
> >> # ifdef ENODATA
> >> #  define EUNDERFLOW ENODATA
> >> # else
> >> #  define EUNDERFLOW ESPIPE
> >> # endif
> >> #endif
> >
> > Right, I think our mails just crossed but I'm leaning in this direction.
>
> Hmph, I may be slow (or may be skimming the exchanges too fast), but
> what exactly is wrong with "0"? As long as we do not have to tell
> two or more "not exactly an error from syscall in errno" cases, I
> would think "0" is the best value to use.

The main reason to avoid "0" is just that the "read error: Success"
message is a bit funny.

> If the syserror message _is_ the issue, then we'd need to either
> pick an existing errno that is available everywhere (with possibly
> suboptimal message), or pick something and prepare a fallback for
> platforms that lack the errno, so picking "0" as the value and use
> whatever logic we would have used for the "fallback" would not sound
> too bad. I.e.
>
>	if (read_in_full(..., size) != size)
>		if (errno)
>			die_errno("oops");
>		else
>			die("short read");
>
> If the callsites were too numerous,

You actually don't need errno for that. You can write:

	ret = read_in_full(..., size);
	if (ret < 0)
		die_errno("oops");
	else if (ret != size)
		die("short read");

So I think using errno as a sentinel value to tell between the two cases
doesn't have much value.

> #define die_errno_or(msg1, msg2) if (errno) die_errno(msg1); else die(msg2)
>
> perhaps?

If you just care about dying, you can wrap the whole read_in_full() in
a helper (which is what my original patch 7 did). But I found there
were cases that didn't want to die. Of course _most_ of those don't
bother looking at errno in the first place. I think the only one that
does is index_core(), which calls error_errno(). So with two helpers
(one for die and one for error), I think we could cover the existing
cases.

Or we could convert the single error() case to just handle the logic
inline.

One of the reasons I dislike the helper I showed earlier is that it
involves assembling a lego sentence. But the alternative is requiring
the caller to write both sentences from scratch (and writing a good
sentence for the short read really is long: you expected X bytes but
got Y).

I guess just tweaking the errno value does not really give a good
"short read" error either. You get "unable to read foo: No data
available", without the X/Y information.

Sometimes a custom "short read" message can be better, if it's "could
not read enough bytes for packfile header" or similar. You don't need
to know the exact number then.

So I dunno. Maybe I have just talked myself into actually just fixing
all of the read_in_full() call sites manually.

-Peff
Re: [PATCH 1/1] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
"Eric Rannaud" writes:

> Any comments on the testing strategy with a background fast-import?
>
> To summarize:
> - fast-import is started in the background with a command stream that
>   ends with "checkpoint\nprogress checkpoint\n". fast-import keeps
>   running after reaching the last command (we don't want fast-import to
>   terminate).
> - In the meantime, the test is waiting to read "progress checkpoint" in
>   the output of fast-import, then it checks the testing conditions.
> - Finally, the test ensures that fast-import is still running in the
>   background (and thus that it has just observed the effect of
>   checkpoint, and not the side effects of fast-import terminating).
> - After 10 sec, no matter what, the background fast-import is sent
>   "done" and terminates.
>
> I tried to make sure that the test runs quickly and does not (typically)
> sleep. Upon failure, the test may take up to 10 sec to fully terminate.

Hmm, it certainly looks a bit too brittle with many tweakables like
10, 50, 55, etc. that heavily depend on the load on the system. Sorry
for asking you to go through the hoops.

> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
> index 67b8c50a5ab4..b410bf3a3a7a 100755
> --- a/t/t9300-fast-import.sh
> +++ b/t/t9300-fast-import.sh
> @@ -3120,4 +3120,121 @@ test_expect_success 'U: validate root delete result' '
>  	compare_diff_raw expect actual
>  '
>
> +###
> +### series V (checkpoint)
> +###
> +
> +# To make sure you're observing the side effects of checkpoint *before*
> +# fast-import terminates (and thus writes out its state), check that the
> +# fast-import process is still running using background_import_still_running
> +# *after* evaluating the test conditions.
> +background_import_until_checkpoint () {
> +	options=$1
> +	input_file=$2
> +	( cat $input_file; sleep 10; echo done ) | git fast-import $options >V.output &
> +	echo $! >V.pid
> +
> +	# The loop will poll for approximately 5 seconds before giving up.
> +	n=0
> +	while ! test "$(cat V.output)" = "progress checkpoint"; do

Micronit. Just like you have if/then on different lines, have while/do
on different lines, i.e.

	while test "$(cat V.output)" != "progress checkpoint"
	do

> +		if test $n -gt 55
> +		then
> +...
> +background_import_still_running () {
> +	if ! ps --pid "$(cat V.pid)"

"ps --pid" is probably not portable (I think we'd do "kill -0 $pid"
instead in our existing tests---and it should be kosher according to
POSIX [*1*, *2*]).

> +test_expect_success 'V: checkpoint updates refs after reset' '
> +	cat >input <<-INPUT_END &&

It is not wrong per-se but we quote INPUT_END when there is no
interpolation necessary in the body of here document to help readers,
like so:

	cat >input <<-\INPUT_END &&

> +	reset refs/heads/V
> +	from refs/heads/U
> +
> +	checkpoint
> +	progress checkpoint
> +	INPUT_END

> +test_expect_success 'V: checkpoint updates refs and marks after commit' '
> +	cat >input <<-INPUT_END &&

This one does want interpolation and the above is correct.

[Footnotes]

*1* http://pubs.opengroup.org/onlinepubs/9699919799/utilities/kill.html
    lists '0' as an allowed signal number to be sent, and defers to the
    description of the kill() function to explain what happens.

*2* http://pubs.opengroup.org/onlinepubs/9699919799/functions/kill.html
    says """If sig is 0 (the null signal), error checking is performed
    but no signal is actually sent. The null signal can be used to check
    the validity of pid."""
Re: [PATCH] t7406: submodule.&lt;name&gt;.update command must not be run from .gitmodules
Stefan Beller writes:

> submodule.<name>.update can be assigned an arbitrary command via setting
> it to "!command". When this command is found in the regular config, Git
> ought to just run that command instead of other update mechanisms.
>
> However if that command is just found in the .gitmodules file, it is
> potentially untrusted, which is why we do not run it. Add a test
> confirming the behavior.
>
> Suggested-by: Jonathan Nieder
> Signed-off-by: Stefan Beller
> ---

Earlier, we saw:

    Ideally we want this test to be super robust: e.g. if it runs the
    command but from a different directory, we still want the test to
    fail, and if it runs the command but using exec instead of a shell,
    we still want the test to fail.

and this one (i.e. signal that it is a command by prefixing with '!',
and then have a single command that would fail whether it is run via
run_command() with or without shell) would satisfy that criteria, I
would think.

>> This test for a missing file is certainly a remnant from the
>> previous iteration, isn't it?
>
> Yes. This is a good indicator I need some vacation.

Or just take a deep breath before making a knee-jerk reaction public
and instead double-check before sending things out ;-)

Will queue. Thanks.

> Thanks,
> Stefan
>
>  t/t7406-submodule-update.sh | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> index 034914a14f..6f083c4d68 100755
> --- a/t/t7406-submodule-update.sh
> +++ b/t/t7406-submodule-update.sh
> @@ -406,6 +406,14 @@ test_expect_success 'submodule update - command in .git/config' '
>  	)
>  '
>
> +test_expect_success 'submodule update - command in .gitmodules is ignored' '
> +	test_when_finished "git -C super reset --hard HEAD^" &&
> +	git -C super config -f .gitmodules submodule.submodule.update "!false" &&
> +	git -C super commit -a -m "add command to .gitmodules file" &&
> +	git -C super/submodule reset --hard $submodulesha1^ &&
> +	git -C super submodule update submodule
> +'
> +
>  cat << EOF >expect
>  Execution of 'false $submodulesha1' failed in submodule path 'submodule'
>  EOF
Re: [PATCH 4/3] branch: fix "copy" to never touch HEAD
Ævar Arnfjörð Bjarmason writes:

> I do think however that we also need to update the docs, the relevant
> origin/master...gitster/sd/branch-copy diff is currently:
>
>     +The `-c` and `-C` options have the exact same semantics as `-m` and
>     +`-M`, except instead of the branch being renamed it along with its
>     +config and reflog will be copied to a new name.
>
> Suggestions welcome, but I think that should probably be changed to
> something like the following as part of this patch:
>
>     The `-c` and `-C` options have the exact same semantics as `-m` and
>     `-M`, except instead of the branch being renamed it along with its
>     config and reflog will be copied to a new name. Furthermore, unlike
>     its `-m` and `-M` cousins the `-c` and `-C` options will never move
>     the HEAD, whereas the move option will do so if the source branch is
>     the currently checked-out branch.

I do not think anybody even _imagines_ copy to move HEAD, and do not
think "unlike -m, it doesn't touch HEAD" a worthwhile thing to say. It
is '-m' whose behaviour may look strange wrt HEAD, so _that_ may be
worth mentioning, though.

I suspect that your reaction probably comes from being too married to
the idea that HEAD has a string that is the same as the refname of the
current branch. But that is a mere implementation detail. Users would
think that HEAD points at the current branch and does not even care
how that pointing is implemented.

Conceptually, you can consider that each branch has its own identity,
separate from various traits it has, like what its upstream branch is,
what commit it points at, what its reflog says, and (most notably)
what its name is. Then you can think of "branch -m old new" to be (1)
finding the instance of branch that currently has name 'old' and (2)
updating only one of its traits, namely, its name, without changing
anything else.

Creating a new instance of a branch by copying an existing one, on the
other hand, would then be the matter of (1) finding the instance of
branch with name 'old' and (2) creating another instance of branch
with the same traits as the original, modulo its name is set to 'new'.

A necessary wrinkle for "branch -m" that falls out of the above world
model is that HEAD needs to be adjusted in order to keep pointing at
the same (renamed) instance of branch that now has an updated name, if
HEAD happened to be pointing at the instance of the branch whose name
trait has been updated.

So, from that point of view, I'd prefer to do something like the
attached patch instead. I notice that "branch -m" does not mention
configuration variables carried over to the new branch, but I do not
necessarily think we want to exhaustively enumerate what traits are
carried over here, so perhaps it is OK as is.

 Documentation/git-branch.txt | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index fe029ac6fc..d425e3acd4 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -63,11 +63,13 @@ With a `-m` or `-M` option, <oldbranch> will be renamed to <newbranch>.
 If <oldbranch> had a corresponding reflog, it is renamed to match
 <newbranch>, and a reflog entry is created to remember the branch
 renaming. If <newbranch> exists, -M must be used to force the rename
-to happen.
+to happen. If you rename a branch that is currently checked out,
+`HEAD` is adjusted so that the branch (whose name is now
+<newbranch>) remains the current branch.
 
-The `-c` and `-C` options have the exact same semantics as `-m` and
-`-M`, except instead of the branch being renamed it along with its
-config and reflog will be copied to a new name.
+With a `-c` or `-C` option, a new branch is created by
+copying the traits like the reflog contents and `branch.*.*`
+configuration from an existing <oldbranch>.
 
 With a `-d` or `-D` option, `<branchname>` will be deleted.
 You may specify more than one branch for deletion. If the branch currently
Re: git apply fails silently when on the wrong directory
On Tue, Sep 26, 2017 at 05:53:55PM -0700, Ernesto Alfonso wrote:

> I recently ran into a similar issue as described here:
>
>   https://stackoverflow.com/questions/24821431/git-apply-patch-fails-silently-no-errors-but-nothing-happens
>
> I was using the alias:
>
>   alias ganw='git diff -U0 -w --no-color "$@" | git apply --cached
>   --ignore-whitespace --unidiff-zero -'
>
> to stage non-whitespace changes, but I was not in the root directory and
> the changes were not being applied. I broke down the command to discover
> the 'git apply' part of the pipe was silently failing to apply anything,
> exiting 0 without even a warning.
>
> The exit status and lack of warning is terribly misleading, I imagine
> this would be the cause of subtle bugs in automated scripts.
>
> Is this expected behaviour?

Yes, this is as documented in git-apply(1):

    Reads the supplied diff output (i.e. "a patch") and applies it to
    files. When running from a subdirectory in a repository, patched
    paths outside the directory are ignored.

I agree it's a potential source of confusion. There's some previous
discussion in this thread:

  https://public-inbox.org/git/ca+dcaeqqqh59lb43y4bi1xktpnoodv11kkubbkng1oz7mdb...@mail.gmail.com/

-Peff
Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response
Brandon Williams writes:

> +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
> +static int process_protocol_version(void)
> +{
> +	switch (determine_protocol_version_client(packet_buffer)) {
> +		case protocol_v1:
> +			return 1;
> +		case protocol_v0:
> +			return 0;
> +		default:
> +			die("server is speaking an unknown protocol");
> +	}
> +}

checkpatch.pl yells at me:

    ERROR: switch and case should be at the same indent

and we would probably want to teach "make style" the same, if we
already don't.
Re: git apply fails silently when on the wrong directory
Also, has there been a feature request for a '-w' option to 'git add',
analogous to the same option in 'git diff'?

Ernesto Alfonso writes:

> I recently ran into a similar issue as described here:
>
>   https://stackoverflow.com/questions/24821431/git-apply-patch-fails-silently-no-errors-but-nothing-happens
>
> I was using the alias:
>
>   alias ganw='git diff -U0 -w --no-color "$@" | git apply --cached
>   --ignore-whitespace --unidiff-zero -'
>
> to stage non-whitespace changes, but I was not in the root directory and
> the changes were not being applied. I broke down the command to discover
> the 'git apply' part of the pipe was silently failing to apply anything,
> exiting 0 without even a warning.
>
> The exit status and lack of warning is terribly misleading, I imagine
> this would be the cause of subtle bugs in automated scripts.
>
> Is this expected behaviour?
>
> Thanks,
>
> Ernesto
git apply fails silently when on the wrong directory
I recently ran into a similar issue as described here:

  https://stackoverflow.com/questions/24821431/git-apply-patch-fails-silently-no-errors-but-nothing-happens

I was using the alias:

  alias ganw='git diff -U0 -w --no-color "$@" | git apply --cached --ignore-whitespace --unidiff-zero -'

to stage non-whitespace changes, but I was not in the root directory
and the changes were not being applied. I broke down the command to
discover the 'git apply' part of the pipe was silently failing to
apply anything, exiting 0 without even a warning.

The exit status and lack of warning is terribly misleading, I imagine
this would be the cause of subtle bugs in automated scripts.

Is this expected behaviour?

Thanks,

Ernesto
Re: [PATCH] submodule: correct error message for missing commits.
Stefan Beller writes:

> When a submodule diff should be displayed we currently just add the
> submodule objects to the main object store and then e.g. walk the
> revision graph and create a summary for that submodule.
>
> It is possible that we are missing the submodule either completely or
> partially, which we currently differentiate with different error messages
> depending on whether (1) the whole submodule object store is missing or
> (2) just the commits needed for this particular diff. (1) is reported as
> "not initialized", and (2) is reported as "commits not present".
>
> If a submodule is deinit'ed its repository data is still around inside
> the superproject, such that the diff can still be produced. In that way
> the error message (1) is misleading as we can have a diff despite the
> submodule being not initialized.
>
> Downgrade the error message (1) to be the same as (2) and just say
> the commits are not present, as that is the true reason why the diff
> cannot be shown.
>
> Signed-off-by: Stefan Beller
> ---

Sounds good. Thanks for working on it.
Re: [PATCH 0/3] Comment fixes
Han-Wen Nienhuys writes:

> follow more commit log conventions; verified it compiled (yay).

;-)

> (should I send patches that are in 'pu' again as well?)

Because these look more or less independent changes that can be queued
on separate topics, I do not think resending is needed at all, even
though mentioning that these do *not* replace what is queued like you
just did is an excellent way to prevent the maintainer from making
stupid mistakes.

Thanks.

> Han-Wen Nienhuys (3):
>   real_path: clarify return value ownership
>   read_gitfile_gently: clarify return value ownership.
>   string-list.h: move documentation from Documentation/api/ into header
>
>  Documentation/technical/api-string-list.txt | 209
>  abspath.c                                   |   4 +
>  setup.c                                     |   3 +-
>  string-list.h                               | 192 +
>  4 files changed, 168 insertions(+), 240 deletions(-)
>  delete mode 100644 Documentation/technical/api-string-list.txt
>
> --
> 2.14.1.821.g8fa685d3b7-goog
Re: -s theirs use-case(s) Was: BUG: merge -s theirs is not in effect
Yaroslav Halchenko writes:

> and that is where the gotcha comes -- what if "my" changes were already
> published? then I would like to avoid the rebase, and would -s theirs
> to choose "their" solution in favor of mine and be able to push so
> others could still "fast-forward" to the new state.
>
> So -- as to me it remains 'symmetric' ;)

I do not necessarily agree. Once you decide that their history is
the mainline, you'd rather want to treat your line of development as
a side branch and make a merge in that direction, i.e. the first
parent of the resulting merge is a commit on their history and the
second parent is the last bad one of your history. So you would end
up using "checkout their-history && merge -s ours your-history" to
keep the first-parenthood sensible.

And at that point, use of "-s ours" is no longer a workaround for
lack of "-s theirs". It is a proper part of the desired semantics,
i.e. from the point of view of the surviving canonical history line,
you want to preserve what it did, nullifying what the other line of
history did.

So I still do not think the above scenario justifies "-s theirs".
Re: [PATCH v2 0/9] Teach 'run' perf script to read config files
Roberto Tyley writes:

> I had a quick look at git-send-email.perl, I see the trick is the `time++` one
> introduced with https://github.com/git/git/commit/a5370b16 - seems reasonable!
>
> SubmitGit makes all emails in-reply-to the initial email, which I
> think is correct behaviour, but I can see that offsetting the times
> would probably give a more reliable sorting in a newsreader.
> ...
> ...so the only way SubmitGit can offset the times is to literally
> delay the sending of the emails, which is a bit unfortunate for
> patchbombs more than a few dozen commits long!

As this matters mostly for a series that is longer than several patches, it indeed is unfortunate. If SubmitGit needs to send and sleep for a dozen messages, does it mean the end user has to wait 12 (or is it 11? ;-)) seconds? If the human is the only thing that needs waiting, it may be OK (after all, it all happens in the web browser and the human can switch to other tasks after clicking "submit"), but that may not be acceptable if this artificial delay can cause a timeout in a moving part among many.

Thanks for looking into this.
[PATCH v2 0/9] protocol transition
v2 of this series has the following changes:

* Included Jonathan Tan's patch as [1/9] with a small tweak to not rely on hardcoding the size of sha1 into the capability line check.
* Reworked some of the logic in daemon.c.
* Added the word 'Experimental' to protocol.version to indicate that users shouldn't rely on its semantics to hold until it has been thoroughly tested.

Brandon Williams (8):
  pkt-line: add packet_write function
  protocol: introduce protocol extension mechanisms
  daemon: recognize hidden request arguments
  upload-pack, receive-pack: introduce protocol version 1
  connect: teach client to recognize v1 server response
  connect: tell server that the client understands v1
  http: tell server that the client understands v1
  i5700: add interop test for protocol transition

Jonathan Tan (1):
  connect: in ref advertisement, shallows are last

 Documentation/config.txt               |  17 ++
 Documentation/git.txt                  |   5 +
 Makefile                               |   1 +
 builtin/receive-pack.c                 |  14 ++
 cache.h                                |   9 +
 connect.c                              | 248
 daemon.c                               |  68 +++-
 http.c                                 |  18 ++
 pkt-line.c                             |   6 +
 pkt-line.h                             |   1 +
 protocol.c                             |  72
 protocol.h                             |  14 ++
 t/interop/i5700-protocol-transition.sh |  68
 t/lib-httpd/apache.conf                |   7 +
 t/t5700-protocol-v1.sh                 | 292 +
 upload-pack.c                          |  17 +-
 16 files changed, 776 insertions(+), 81 deletions(-)
 create mode 100644 protocol.c
 create mode 100644 protocol.h
 create mode 100755 t/interop/i5700-protocol-transition.sh
 create mode 100755 t/t5700-protocol-v1.sh

--
2.14.1.992.g2c7b836f3a-goog
[PATCH v2 5/9] upload-pack, receive-pack: introduce protocol version 1
Teach upload-pack and receive-pack to understand and respond using protocol version 1, if requested. Protocol version 1 is simply the original and current protocol (what I'm calling version 0) with the addition of a single packet line, which precedes the ref advertisement, indicating the protocol version being spoken. Signed-off-by: Brandon Williams--- builtin/receive-pack.c | 14 ++ upload-pack.c | 17 - 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index dd06b3fb4..cb179367b 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -24,6 +24,7 @@ #include "tmp-objdir.h" #include "oidset.h" #include "packfile.h" +#include "protocol.h" static const char * const receive_pack_usage[] = { N_("git receive-pack "), @@ -1963,6 +1964,19 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix) else if (0 <= receive_unpack_limit) unpack_limit = receive_unpack_limit; + switch (determine_protocol_version_server()) { + case protocol_v1: + if (advertise_refs || !stateless_rpc) + packet_write_fmt(1, "version 1\n"); + /* +* v1 is just the original protocol with a version string, +* so just fall through after writing the version string. 
+*/ + case protocol_v0: + default: + break; + } + if (advertise_refs || !stateless_rpc) { write_head_info(); } diff --git a/upload-pack.c b/upload-pack.c index 7efff2fbf..5cab39819 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -18,6 +18,7 @@ #include "parse-options.h" #include "argv-array.h" #include "prio-queue.h" +#include "protocol.h" static const char * const upload_pack_usage[] = { N_("git upload-pack [] "), @@ -1067,6 +1068,20 @@ int cmd_main(int argc, const char **argv) die("'%s' does not appear to be a git repository", dir); git_config(upload_pack_config, NULL); - upload_pack(); + + switch (determine_protocol_version_server()) { + case protocol_v1: + if (advertise_refs || !stateless_rpc) + packet_write_fmt(1, "version 1\n"); + /* +* v1 is just the original protocol with a version string, +* so just fall through after writing the version string. +*/ + case protocol_v0: + default: + upload_pack(); + break; + } + return 0; } -- 2.14.1.992.g2c7b836f3a-goog
[PATCH v2 8/9] http: tell server that the client understands v1
Tell a server that protocol v1 can be used by sending the http header 'Git-Protocol' indicating this. Also teach the apache http server to pass through the 'Git-Protocol' header as an environment variable 'GIT_PROTOCOL'. Signed-off-by: Brandon Williams--- cache.h | 2 ++ http.c | 18 + t/lib-httpd/apache.conf | 7 + t/t5700-protocol-v1.sh | 69 + 4 files changed, 96 insertions(+) diff --git a/cache.h b/cache.h index 0c792545f..aaadac1f0 100644 --- a/cache.h +++ b/cache.h @@ -451,6 +451,8 @@ static inline enum object_type object_type(unsigned int mode) * 'key[=value]'. Presence of unknown keys must be tolerated. */ #define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL" +/* HTTP header used to handshake the wire protocol */ +#define GIT_PROTOCOL_HEADER "Git-Protocol" /* * This environment variable is expected to contain a boolean indicating diff --git a/http.c b/http.c index 9e40a465f..ffb719216 100644 --- a/http.c +++ b/http.c @@ -12,6 +12,7 @@ #include "gettext.h" #include "transport.h" #include "packfile.h" +#include "protocol.h" static struct trace_key trace_curl = TRACE_KEY_INIT(CURL); #if LIBCURL_VERSION_NUM >= 0x070a08 @@ -897,6 +898,21 @@ static void set_from_env(const char **var, const char *envname) *var = val; } +static void protocol_http_header(void) +{ + if (get_protocol_version_config() > 0) { + struct strbuf protocol_header = STRBUF_INIT; + + strbuf_addf(_header, GIT_PROTOCOL_HEADER ": version=%d", + get_protocol_version_config()); + + + extra_http_headers = curl_slist_append(extra_http_headers, + protocol_header.buf); + strbuf_release(_header); + } +} + void http_init(struct remote *remote, const char *url, int proactive_auth) { char *low_speed_limit; @@ -927,6 +943,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth) if (remote) var_override(_proxy_authmethod, remote->http_proxy_authmethod); + protocol_http_header(); + pragma_header = curl_slist_append(http_copy_default_headers(), "Pragma: no-cache"); no_pragma_header = 
curl_slist_append(http_copy_default_headers(), diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf index 0642ae7e6..df1943631 100644 --- a/t/lib-httpd/apache.conf +++ b/t/lib-httpd/apache.conf @@ -67,6 +67,9 @@ LockFile accept.lock LoadModule unixd_module modules/mod_unixd.so + + LoadModule setenvif_module modules/mod_setenvif.so + PassEnv GIT_VALGRIND @@ -76,6 +79,10 @@ PassEnv ASAN_OPTIONS PassEnv GIT_TRACE PassEnv GIT_CONFIG_NOSYSTEM += 2.4> + SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0 + + Alias /dumb/ www/ Alias /auth/dumb/ www/auth/dumb/ diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh index 1988bbce6..65127 100755 --- a/t/t5700-protocol-v1.sh +++ b/t/t5700-protocol-v1.sh @@ -220,4 +220,73 @@ test_expect_success 'push with ssh:// using protocol v1' ' grep "push< version 1" log ' +# Test protocol v1 with 'http://' transport +# +. "$TEST_DIRECTORY"/lib-httpd.sh +start_httpd + +test_expect_success 'create repo to be served by http:// transport' ' + git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true && + test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one +' + +test_expect_success 'clone with http:// using protocol v1' ' + GIT_TRACE_PACKET=1 GIT_TRACE_CURL=1 git -c protocol.version=1 \ + clone "$HTTPD_URL/smart/http_parent" http_child 2>log && + + git -C http_child log -1 --format=%s >actual && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect && + test_cmp expect actual && + + # Client requested to use protocol v1 + grep "Git-Protocol: version=1" log && + # Server responded using protocol v1 + grep "git< version 1" log +' + +test_expect_success 'fetch with http:// using protocol v1' ' + test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two && + + GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \ + fetch 2>log && + + git -C http_child log -1 --format=%s FETCH_HEAD >actual && + git -C 
"$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect && + test_cmp expect actual && + + # Server responded using protocol v1 + grep "git< version 1" log +' + +test_expect_success 'pull with http:// using protocol v1' ' + GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \ + pull 2>log && + + git -C http_child log -1 --format=%s >actual && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s
[PATCH v2 3/9] protocol: introduce protocol extension mechanisms
Create protocol.{c,h} and provide functions which future servers and clients can use to determine which protocol to use or is being used. Also introduce the 'GIT_PROTOCOL' environment variable which will be used to communicate a colon separated list of keys with optional values to a server. Unknown keys and values must be tolerated. This mechanism is used to communicate which version of the wire protocol a client would like to use with a server. Signed-off-by: Brandon Williams--- Documentation/config.txt | 17 Documentation/git.txt| 5 Makefile | 1 + cache.h | 7 + protocol.c | 72 protocol.h | 14 ++ 6 files changed, 116 insertions(+) create mode 100644 protocol.c create mode 100644 protocol.h diff --git a/Documentation/config.txt b/Documentation/config.txt index dc4e3f58a..b78747abc 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2517,6 +2517,23 @@ The protocol names currently used by git are: `hg` to allow the `git-remote-hg` helper) -- +protocol.version:: + Experimental. If set, clients will attempt to communicate with a + server using the specified protocol version. If unset, no + attempt will be made by the client to communicate using a + particular protocol version, this results in protocol version 0 + being used. + Supported versions: ++ +-- + +* `0` - the original wire protocol. + +* `1` - the original wire protocol with the addition of a version string + in the initial response from the server. + +-- + pull.ff:: By default, Git does not create an extra merge commit when merging a commit that is a descendant of the current commit. Instead, the diff --git a/Documentation/git.txt b/Documentation/git.txt index 6e3a6767e..299f75c7b 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -697,6 +697,11 @@ of clones and fetches. which feed potentially-untrusted URLS to git commands. See linkgit:git-config[1] for more details. +`GIT_PROTOCOL`:: + For internal use only. Used in handshaking the wire protocol. 
+ Contains a colon ':' separated list of keys with optional values + 'key[=value]'. Presence of unknown keys must be tolerated. + Discussion[[Discussion]] diff --git a/Makefile b/Makefile index ed4ca438b..9ce68cded 100644 --- a/Makefile +++ b/Makefile @@ -842,6 +842,7 @@ LIB_OBJS += pretty.o LIB_OBJS += prio-queue.o LIB_OBJS += progress.o LIB_OBJS += prompt.o +LIB_OBJS += protocol.o LIB_OBJS += quote.o LIB_OBJS += reachable.o LIB_OBJS += read-cache.o diff --git a/cache.h b/cache.h index 49b083ee0..0c792545f 100644 --- a/cache.h +++ b/cache.h @@ -445,6 +445,13 @@ static inline enum object_type object_type(unsigned int mode) #define GIT_ICASE_PATHSPECS_ENVIRONMENT "GIT_ICASE_PATHSPECS" #define GIT_QUARANTINE_ENVIRONMENT "GIT_QUARANTINE_PATH" +/* + * Environment variable used in handshaking the wire protocol. + * Contains a colon ':' separated list of keys with optional values + * 'key[=value]'. Presence of unknown keys must be tolerated. + */ +#define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL" + /* * This environment variable is expected to contain a boolean indicating * whether we should or should not treat: diff --git a/protocol.c b/protocol.c new file mode 100644 index 0..369503065 --- /dev/null +++ b/protocol.c @@ -0,0 +1,72 @@ +#include "cache.h" +#include "config.h" +#include "protocol.h" + +static enum protocol_version parse_protocol_version(const char *value) +{ + if (!strcmp(value, "0")) + return protocol_v0; + else if (!strcmp(value, "1")) + return protocol_v1; + else + return protocol_unknown_version; +} + +enum protocol_version get_protocol_version_config(void) +{ + const char *value; + if (!git_config_get_string_const("protocol.version", )) { + enum protocol_version version = parse_protocol_version(value); + + if (version == protocol_unknown_version) + die("unknown value for config 'protocol.version': %s", + value); + + return version; + } + + return protocol_v0; +} + +enum protocol_version determine_protocol_version_server(void) +{ + const char 
*git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT); + enum protocol_version version = protocol_v0; + + if (git_protocol) { + struct string_list list = STRING_LIST_INIT_DUP; + const struct string_list_item *item; + string_list_split(, git_protocol, ':', -1); + + for_each_string_list_item(item, ) { + const char *value; + enum protocol_version v; + + if (skip_prefix(item->string,
[PATCH v2 2/9] pkt-line: add packet_write function
Add a function which can be used to write the contents of an arbitrary buffer. This makes it easy to build up data in a buffer before writing the packet instead of formatting the entire contents of the packet using 'packet_write_fmt()'.

Signed-off-by: Brandon Williams
---
 pkt-line.c | 6 ++
 pkt-line.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 647bbd3bc..c025d0332 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -188,6 +188,12 @@ static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 	return 0;
 }
 
+void packet_write(const int fd_out, const char *buf, size_t size)
+{
+	if (packet_write_gently(fd_out, buf, size))
+		die_errno("packet write failed");
+}
+
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 {
 	va_list args;
diff --git a/pkt-line.h b/pkt-line.h
index 66ef610fc..d9e9783b1 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -22,6 +22,7 @@ void packet_flush(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_write(const int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
--
2.14.1.992.g2c7b836f3a-goog
[PATCH v2 1/9] connect: in ref advertisement, shallows are last
From: Jonathan Tan

Currently, get_remote_heads() parses the ref advertisement in one loop, allowing refs and shallow lines to intersperse, despite this not being allowed by the specification. Refactor get_remote_heads() to use two loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other lines in the ref advertisement, which will be done in a subsequent patch.

As part of this change, this patch interprets capabilities only on the first line in the ref advertisement, printing a warning message when encountering capabilities on other lines.

Signed-off-by: Jonathan Tan
Signed-off-by: Brandon Williams
---
 connect.c | 189 --
 1 file changed, 123 insertions(+), 66 deletions(-)

diff --git a/connect.c b/connect.c
index df56c0cbf..8e2e276b6 100644
--- a/connect.c
+++ b/connect.c
@@ -11,6 +11,7 @@ #include "string-list.h" #include "sha1-array.h" #include "transport.h" +#include "strbuf.h" static char *server_capabilities; static const char *parse_feature_value(const char *, const char *, int *); @@ -107,6 +108,104 @@ static void annotate_refs_with_symref_info(struct ref *ref) string_list_clear(&list, 0); } +/* + * Read one line of a server's ref advertisement into packet_buffer.
+ */ +static int read_remote_ref(int in, char **src_buf, size_t *src_len, + int *responded) +{ + int len = packet_read(in, src_buf, src_len, + packet_buffer, sizeof(packet_buffer), + PACKET_READ_GENTLE_ON_EOF | + PACKET_READ_CHOMP_NEWLINE); + const char *arg; + if (len < 0) + die_initial_contact(*responded); + if (len > 4 && skip_prefix(packet_buffer, "ERR ", )) + die("remote error: %s", arg); + + *responded = 1; + + return len; +} + +#define EXPECTING_FIRST_REF 0 +#define EXPECTING_REF 1 +#define EXPECTING_SHALLOW 2 + +static void process_capabilities(int *len) +{ + int nul_location = strlen(packet_buffer); + if (nul_location == *len) + return; + server_capabilities = xstrdup(packet_buffer + nul_location + 1); + *len = nul_location; +} + +static int process_dummy_ref(void) +{ + struct object_id oid; + const char *name; + + if (parse_oid_hex(packet_buffer, , )) + return 0; + if (*name != ' ') + return 0; + name++; + + return !oidcmp(_oid, ) && !strcmp(name, "capabilities^{}"); +} + +static void check_no_capabilities(int len) +{ + if (strlen(packet_buffer) != len) + warning("Ignoring capabilities after first line '%s'", + packet_buffer + strlen(packet_buffer)); +} + +static int process_ref(int len, struct ref ***list, unsigned int flags, + struct oid_array *extra_have) +{ + struct object_id old_oid; + const char *name; + + if (parse_oid_hex(packet_buffer, _oid, )) + return 0; + if (*name != ' ') + return 0; + name++; + + if (extra_have && !strcmp(name, ".have")) { + oid_array_append(extra_have, _oid); + } else if (!strcmp(name, "capabilities^{}")) { + die("protocol error: unexpected capabilities^{}"); + } else if (check_ref(name, flags)) { + struct ref *ref = alloc_ref(name); + oidcpy(>old_oid, _oid); + **list = ref; + *list = >next; + } + check_no_capabilities(len); + return 1; +} + +static int process_shallow(int len, struct oid_array *shallow_points) +{ + const char *arg; + struct object_id old_oid; + + if (!skip_prefix(packet_buffer, "shallow ", )) + return 0; + 
+ if (get_oid_hex(arg, _oid)) + die("protocol error: expected shallow sha-1, got '%s'", arg); + if (!shallow_points) + die("repository on the other end cannot be shallow"); + oid_array_append(shallow_points, _oid); + check_no_capabilities(len); + return 1; +} + /* * Read all the refs from the other end */ @@ -123,76 +222,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len, * willing to talk to us. A hang-up before seeing any * response does not necessarily mean an ACL problem, though. */ - int saw_response; - int got_dummy_ref_with_capabilities_declaration = 0; + int responded = 0; + int len; + int state = EXPECTING_FIRST_REF; *list = NULL; - for (saw_response = 0; ; saw_response = 1) { - struct ref *ref; - struct object_id old_oid; - char *name; - int len, name_len; - char *buffer =
[PATCH v2 7/9] connect: tell server that the client understands v1
Teach the connection logic to tell a server that it understands protocol v1. This is done in 2 different ways for the built-in protocols.

1. git://

   A normal request is structured as "command path/to/repo\0host=..\0" and due to a bug in an old version of git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the command, 2009-06-04) we aren't able to place any extra args (separated by NULs) besides the host. In order to get around this limitation, put the protocol version information after a second NUL byte so the request is structured like: "command path/to/repo\0host=..\0\0version=1\0". git-daemon can then parse out the version number and set GIT_PROTOCOL.

2. ssh://, file://

   Set the GIT_PROTOCOL envvar with the desired protocol version. The envvar can be sent across ssh by using '-o SendEnv=GIT_PROTOCOL' and having the server whitelist this envvar.

Signed-off-by: Brandon Williams
---
 connect.c | 37 ++--
 t/t5700-protocol-v1.sh | 223 +
 2 files changed, 255 insertions(+), 5 deletions(-)
 create mode 100755 t/t5700-protocol-v1.sh

diff --git a/connect.c b/connect.c
index 1805debf3..12ebab724 100644
--- a/connect.c
+++ b/connect.c
@@ -871,6 +871,7 @@ struct child_process *git_connect(int fd[2], const char *url, printf("Diag: path=%s\n", path ? path : "NULL"); conn = NULL; } else if (protocol == PROTO_GIT) { + struct strbuf request = STRBUF_INIT; /* * Set up virtual host information based on where we will * connect, unless the user has overridden us in @@ -898,13 +899,25 @@ struct child_process *git_connect(int fd[2], const char *url, * Note: Do not add any other headers here! Doing so * will cause older git-daemon servers to crash.
*/ - packet_write_fmt(fd[1], -"%s %s%chost=%s%c", -prog, path, 0, -target_host, 0); + strbuf_addf(, + "%s %s%chost=%s%c", + prog, path, 0, + target_host, 0); + + /* If using a new version put that stuff here after a second null byte */ + if (get_protocol_version_config() > 0) { + strbuf_addch(, '\0'); + strbuf_addf(, "version=%d%c", + get_protocol_version_config(), '\0'); + } + + packet_write(fd[1], request.buf, request.len); + free(target_host); + strbuf_release(); } else { struct strbuf cmd = STRBUF_INIT; + const char *const *var; conn = xmalloc(sizeof(*conn)); child_process_init(conn); @@ -917,7 +930,9 @@ struct child_process *git_connect(int fd[2], const char *url, sq_quote_buf(, path); /* remove repo-local variables from the environment */ - conn->env = local_repo_env; + for (var = local_repo_env; *var; var++) + argv_array_push(>env_array, *var); + conn->use_shell = 1; conn->in = conn->out = -1; if (protocol == PROTO_SSH) { @@ -971,6 +986,14 @@ struct child_process *git_connect(int fd[2], const char *url, } argv_array_push(>args, ssh); + + if (get_protocol_version_config() > 0) { + argv_array_push(>args, "-o"); + argv_array_push(>args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT); + argv_array_pushf(>env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d", +get_protocol_version_config()); + } + if (flags & CONNECT_IPV4) argv_array_push(>args, "-4"); else if (flags & CONNECT_IPV6) @@ -985,6 +1008,10 @@ struct child_process *git_connect(int fd[2], const char *url, argv_array_push(>args, ssh_host); } else { transport_check_allowed("file"); + if (get_protocol_version_config() > 0) { + argv_array_pushf(>env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d", +get_protocol_version_config()); + } } argv_array_push(>args, cmd.buf); diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh new file mode 100755 index 0..1988bbce6 --- /dev/null +++ b/t/t5700-protocol-v1.sh @@ -0,0 +1,223 @@ +#!/bin/sh + +test_description='test git
[PATCH v2 9/9] i5700: add interop test for protocol transition
Signed-off-by: Brandon Williams--- t/interop/i5700-protocol-transition.sh | 68 ++ 1 file changed, 68 insertions(+) create mode 100755 t/interop/i5700-protocol-transition.sh diff --git a/t/interop/i5700-protocol-transition.sh b/t/interop/i5700-protocol-transition.sh new file mode 100755 index 0..9e83428a8 --- /dev/null +++ b/t/interop/i5700-protocol-transition.sh @@ -0,0 +1,68 @@ +#!/bin/sh + +VERSION_A=. +VERSION_B=v2.0.0 + +: ${LIB_GIT_DAEMON_PORT:=5600} +LIB_GIT_DAEMON_COMMAND='git.b daemon' + +test_description='clone and fetch by client who is trying to use a new protocol' +. ./interop-lib.sh +. "$TEST_DIRECTORY"/lib-git-daemon.sh + +start_git_daemon --export-all + +repo=$GIT_DAEMON_DOCUMENT_ROOT_PATH/repo + +test_expect_success "create repo served by $VERSION_B" ' + git.b init "$repo" && + git.b -C "$repo" commit --allow-empty -m one +' + +test_expect_success "git:// clone with $VERSION_A and protocol v1" ' + GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone "$GIT_DAEMON_URL/repo" child 2>log && + git.a -C child log -1 --format=%s >actual && + git.b -C "$repo" log -1 --format=%s >expect && + test_cmp expect actual && + grep "version=1" log +' + +test_expect_success "git:// fetch with $VERSION_A and protocol v1" ' + git.b -C "$repo" commit --allow-empty -m two && + git.b -C "$repo" log -1 --format=%s >expect && + + GIT_TRACE_PACKET=1 git.a -C child -c protocol.version=1 fetch 2>log && + git.a -C child log -1 --format=%s FETCH_HEAD >actual && + + test_cmp expect actual && + grep "version=1" log && + ! 
grep "version 1" log +' + +stop_git_daemon + +test_expect_success "create repo served by $VERSION_B" ' + git.b init parent && + git.b -C parent commit --allow-empty -m one +' + +test_expect_success "file:// clone with $VERSION_A and protocol v1" ' + GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone --upload-pack="git.b upload-pack" parent child2 2>log && + git.a -C child2 log -1 --format=%s >actual && + git.b -C parent log -1 --format=%s >expect && + test_cmp expect actual && + ! grep "version 1" log +' + +test_expect_success "file:// fetch with $VERSION_A and protocol v1" ' + git.b -C parent commit --allow-empty -m two && + git.b -C parent log -1 --format=%s >expect && + + GIT_TRACE_PACKET=1 git.a -C child2 -c protocol.version=1 fetch --upload-pack="git.b upload-pack" 2>log && + git.a -C child2 log -1 --format=%s FETCH_HEAD >actual && + + test_cmp expect actual && + ! grep "version 1" log +' + +test_done -- 2.14.1.992.g2c7b836f3a-goog
[PATCH v2 4/9] daemon: recognize hidden request arguments
A normal request to git-daemon is structured as "command path/to/repo\0host=..\0" and due to a bug in an old version of git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the command, 2009-06-04) we aren't able to place any extra args (separated by NULs) besides the host. In order to get around this limitation teach git-daemon to recognize additional request arguments hidden behind a second NUL byte. Requests can then be structured like: "command path/to/repo\0host=..\0\0version=1\0key=value\0". git-daemon can then parse out the extra arguments and set 'GIT_PROTOCOL' accordingly. Signed-off-by: Brandon Williams--- daemon.c | 68 +++- 1 file changed, 59 insertions(+), 9 deletions(-) diff --git a/daemon.c b/daemon.c index 30747075f..36cc794c9 100644 --- a/daemon.c +++ b/daemon.c @@ -282,7 +282,7 @@ static const char *path_ok(const char *directory, struct hostinfo *hi) return NULL;/* Fallthrough. Deny by default */ } -typedef int (*daemon_service_fn)(void); +typedef int (*daemon_service_fn)(const struct argv_array *env); struct daemon_service { const char *name; const char *config_name; @@ -363,7 +363,7 @@ static int run_access_hook(struct daemon_service *service, const char *dir, } static int run_service(const char *dir, struct daemon_service *service, - struct hostinfo *hi) + struct hostinfo *hi, const struct argv_array *env) { const char *path; int enabled = service->enabled; @@ -422,7 +422,7 @@ static int run_service(const char *dir, struct daemon_service *service, */ signal(SIGTERM, SIG_IGN); - return service->fn(); + return service->fn(env); } static void copy_to_log(int fd) @@ -462,25 +462,34 @@ static int run_service_command(struct child_process *cld) return finish_command(cld); } -static int upload_pack(void) +static int upload_pack(const struct argv_array *env) { struct child_process cld = CHILD_PROCESS_INIT; argv_array_pushl(, "upload-pack", "--strict", NULL); argv_array_pushf(, "--timeout=%u", timeout); + + argv_array_pushv(_array, 
env->argv); + return run_service_command(&cld); } -static int upload_archive(void) +static int upload_archive(const struct argv_array *env) { struct child_process cld = CHILD_PROCESS_INIT; argv_array_push(&cld, "upload-archive"); + + argv_array_pushv(&cld.env_array, env->argv); + return run_service_command(&cld); } -static int receive_pack(void) +static int receive_pack(const struct argv_array *env) { struct child_process cld = CHILD_PROCESS_INIT; argv_array_push(&cld, "receive-pack"); + + argv_array_pushv(&cld.env_array, env->argv); + return run_service_command(&cld); } @@ -574,7 +583,7 @@ static void canonicalize_client(struct strbuf *out, const char *in) /* * Read the host as supplied by the client connection. */ -static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen) +static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen) { char *val; int vallen; @@ -602,6 +611,43 @@ static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen) if (extra_args < end && *extra_args) die("Invalid request"); } + + return extra_args; +} + +static void parse_extra_args(struct hostinfo *hi, struct argv_array *env, +char *extra_args, int buflen) +{ + const char *end = extra_args + buflen; + struct strbuf git_protocol = STRBUF_INIT; + + /* First look for the host argument */ + extra_args = parse_host_arg(hi, extra_args, buflen); + + /* Look for additional arguments placed after a second NUL byte */ + for (; extra_args < end; extra_args += strlen(extra_args) + 1) { + const char *arg = extra_args; + + /* +* Parse the extra arguments, adding most to 'git_protocol' +* which will be used to set the 'GIT_PROTOCOL' envvar in the +* service that will be run. +* +* If there ends up being a particular arg in the future that +* git-daemon needs to parse specifically (like the 'host' arg) +* then it can be parsed here and not added to 'git_protocol'.
+*/ + if (*arg) { + if (git_protocol.len > 0) + strbuf_addch(&git_protocol, ':'); + strbuf_addstr(&git_protocol, arg); + } + } + + if (git_protocol.len > 0) + argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s", +git_protocol.buf); + strbuf_release(&git_protocol); } /* @@ -695,6 +741,7 @@ static int execute(void)
[PATCH v2 6/9] connect: teach client to recognize v1 server response
Teach a client to recognize that a server understands protocol v1 by looking at the first pkt-line the server sends in response. This is done by looking for the response "version 1" sent by upload-pack or receive-pack.

Signed-off-by: Brandon Williams
---
 connect.c | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/connect.c b/connect.c
index 8e2e276b6..1805debf3 100644
--- a/connect.c
+++ b/connect.c
@@ -12,6 +12,7 @@ #include "sha1-array.h" #include "transport.h" #include "strbuf.h" +#include "protocol.h" static char *server_capabilities; static const char *parse_feature_value(const char *, const char *, int *); @@ -129,9 +130,23 @@ static int read_remote_ref(int in, char **src_buf, size_t *src_len, return len; } -#define EXPECTING_FIRST_REF 0 -#define EXPECTING_REF 1 -#define EXPECTING_SHALLOW 2 +#define EXPECTING_PROTOCOL_VERSION 0 +#define EXPECTING_FIRST_REF 1 +#define EXPECTING_REF 2 +#define EXPECTING_SHALLOW 3 + +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */ +static int process_protocol_version(void) +{ + switch (determine_protocol_version_client(packet_buffer)) { + case protocol_v1: + return 1; + case protocol_v0: + return 0; + default: + die("server is speaking an unknown protocol"); + } +} static void process_capabilities(int *len) { @@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len, */ int responded = 0; int len; - int state = EXPECTING_FIRST_REF; + int state = EXPECTING_PROTOCOL_VERSION; *list = NULL; while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) { switch (state) { + case EXPECTING_PROTOCOL_VERSION: + if (process_protocol_version()) { + state = EXPECTING_FIRST_REF; + break; + } + state = EXPECTING_FIRST_REF; + /* fallthrough */ case EXPECTING_FIRST_REF: process_capabilities(&len); if (process_dummy_ref()) { -- 2.14.1.992.g2c7b836f3a-goog
Re: RFC v3: Another proposed hash function transition plan
Hi, Johannes Schindelin wrote: > Sorry, you are asking cryptography experts to spend their time on the Git > mailing list. I tried to get them to speak out on the Git mailing list. > They respectfully declined. > > I can't fault them, they have real jobs to do, and none of their managers > would be happy for them to educate the Git mailing list on matters of > cryptography, not after what happened in 2005. Fortunately we have had a few public comments from crypto specialists: https://public-inbox.org/git/91a34c5b-7844-3db2-cf29-411df5bcf...@noekeon.org/ https://public-inbox.org/git/CAL9PXLzhPyE+geUdcLmd=pidt5p8efebbsgx_ds88knz2q_...@mail.gmail.com/ https://public-inbox.org/git/CAL9PXLxMHG1nP5_GQaK_WSJTNKs=_qbaL6V5v2GzVG=9vu2...@mail.gmail.com/ https://public-inbox.org/git/59bfb95d.1030...@st.com/ https://public-inbox.org/git/59c149a3.6080...@st.com/ [...] > Let's be realistic. Git is pretty important to us, but it is not important > enough to sway, say, Intel into announcing hardware support for SHA3. Yes, I agree with this. (Adoption by Git could lead to adoption by some other projects, leading to more work on high quality software implementations in projects like OpenSSL, but I am not convinced that that would be a good thing for the world anyway. There are downsides to a proliferation of too many crypto primitives. This is the basic argument described in more detail at [1].) [...] > On Tue, 26 Sep 2017, Jason Cooper wrote: >> For my use cases, as a user of git, I have a plan to maintain provable >> integrity of existing objects stored in git under sha1 while migrating >> away from sha1. The same plan works for migrating away from SHA2 or >> SHA3 when the time comes. > > Please do not make the mistake of taking your use case to be a template > for everybody's use case. That said, I'm curious at what plan you are alluding to. Is it something that could benefit others on the list? Thanks, Jonathan [1] https://www.imperialviolet.org/2017/05/31/skipsha3.html
Re: [PATCH] technical doc: add a design doc for hash function transition
Hi, Stefan Beller wrote: > From: Jonathan Nieder

I go by jrnie...@gmail.com upstream. :)

> This is "RFC v3: Another proposed hash function transition plan" from > the git mailing list. > > Signed-off-by: Jonathan Nieder > Signed-off-by: Jonathan Tan > Signed-off-by: Brandon Williams > Signed-off-by: Stefan Beller I hadn't signed-off on this version, but it's not a big deal. [...] > --- > > This takes the original Google Doc[1] and adds it to our history, > such that the discussion can be on list and in the commit messages. > > * replaced SHA3-256 with NEWHASH, sha3 with newhash > * added section 'Implementation plan' > * added section 'Future work' > * added section 'Agreed-upon criteria for selecting NewHash' Thanks for sending this out. I had let it stall too long. As a tiny nit, I think NewHash is easier to read than NEWHASH. Not a big deal. More importantly, we need some text describing it and saying it's a placeholder. The implementation plan included here is out of date. It comes from an email where I was answering a question about what people can do to make progress, before this design had been agreed on. In the context of this design there are other steps we'd want to describe (having to do with implementing the translation table, etc). I also planned to add a description of the translation table based on what was discussed previously in this thread. Jonathan
Re: [PATCH] submodule: correct error message for missing commits.
On Tue, Sep 26, 2017 at 11:27 AM, Stefan Beller wrote: > When a submodule diff should be displayed we currently just add the > submodule objects to the main object store and then e.g. walk the > revision graph and create a summary for that submodule. > > It is possible that we are missing the submodule either completely or > partially, which we currently differentiate with different error messages > depending on whether (1) the whole submodule object store is missing or > (2) just the objects needed for this particular diff. (1) is reported as > "not initialized", and (2) is reported as "commits not present". > > If a submodule is deinit'ed its repository data is still around inside > the superproject, such that the diff can still be produced. In that way > the error message (1) is misleading as we can have a diff despite the > submodule being not initialized. > > Downgrade the error message (1) to be the same as (2) and just say > the commits are not present, as that is the true reason why the diff > cannot be shown. > > Signed-off-by: Stefan Beller This makes sense to me. Thanks, Jake > --- > submodule.c | 2 +- > t/t4059-diff-submodule-not-initialized.sh | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/submodule.c b/submodule.c > index 6531c5d609..280c246477 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -567,7 +567,7 @@ static void show_submodule_header(FILE *f, const char > *path, > > if (add_submodule_odb(path)) { > if (!message) > - message = "(not initialized)"; > + message = "(commits not present)"; > goto output_header; > } > > diff --git a/t/t4059-diff-submodule-not-initialized.sh > b/t/t4059-diff-submodule-not-initialized.sh > index cd70fd5192..49bca7b48d 100755 > --- a/t/t4059-diff-submodule-not-initialized.sh > +++ b/t/t4059-diff-submodule-not-initialized.sh > @@ -95,7 +95,7 @@ test_expect_success 'submodule not initialized in new > clone' ' > git clone . sm3 && > git -C sm3 diff-tree -p --no-commit-id --submodule=log HEAD >actual && > cat >expected <<-EOF && > - Submodule sm1 $smhead1...$smhead2 (not initialized) > + Submodule sm1 $smhead1...$smhead2 (commits not present) > EOF > test_cmp expected actual > ' > -- > 2.14.0.rc0.3.g6c2e499285 >
Google indexing https://public-inbox.org/git (was: BUG in git diff-index)
[Reduced Cc: and change Subject:] On 26/09/17 13:11, Eric Wong wrote: > There's no blocks on public-inbox.org and I'm completely against > any sort of blocking/throttling. Maybe there's too many pages > to index? Or the Message-IDs in URLs are too ugly/scary? Not > sure what to do about that... > > Anyways, I just put up a robots.txt with Crawl-Delay: 1, since I > seem to recall crawlers use a more conservative delay by default: Not sure what made the difference: Google can now easily find this thread on both https://public-inbox.org/git and on spinics too. Same change observed for another, unrelated thread on this list. Nevermind. "Unsearchable are his judgments, and his ways past finding out"
Re: [PATCH 09/13] rev-list: add object filtering support
On Fri, 22 Sep 2017 20:30:13 + Jeff Hostetler wrote: > + if (filter_options.relax) { Add some documentation about how this differs from ignore_missing_links in struct rev_info.
Re: [PATCH 07/13] object-filter: common declarations for object filtering
On Fri, 22 Sep 2017 20:30:11 + Jeff Hostetler wrote: > Makefile | 1 + > object-filter.c | 269 > > object-filter.h | 173 > 3 files changed, 443 insertions(+) > create mode 100644 object-filter.c > create mode 100644 object-filter.h I think these and list-objects-filter-* are multiple levels of indirection too many. Would a single file with a few implementations of filter_object_fn be sufficient?
Re: [PATCH 03/13] list-objects: filter objects in traverse_commit_list
On Fri, 22 Sep 2017 20:26:22 + Jeff Hostetler wrote: > From: Jeff Hostetler > > Create traverse_commit_list_filtered() and add filtering You mention _filtered() here, but this patch contains _worker(). > list-objects.h | 30 ++ > 2 files changed, 80 insertions(+), 16 deletions(-) > > diff --git a/list-objects.c b/list-objects.c > index b3931fa..3e86008 100644 > --- a/list-objects.c > +++ b/list-objects.c > @@ -13,10 +13,13 @@ static void process_blob(struct rev_info *revs, >show_object_fn show, >struct strbuf *path, >const char *name, > - void *cb_data) > + void *cb_data, > + filter_object_fn filter, > + void *filter_data) I had some thoughts about modifying "show" (instead of adding "filter", as you have done in this patch) to indicate to the caller that the corresponding object should not be marked "seen". That does have the advantage that we don't have so many callbacks flying around, but it also makes things more complicated, so I don't know which is better. > + if (filter) { > + r = filter(LOFT_END_TREE, obj, base->buf, &base->buf[baselen], > filter_data); > + if (r & LOFR_MARK_SEEN) > + obj->flags |= SEEN; > + if (r & LOFR_SHOW) > + show(obj, base->buf, cb_data); > + } This LOFT_END_TREE was added to support the code in list-objects-filter-sparse - would it be OK to completely remove the optimization involving the "provisional" omission of blobs? (I don't think the exact same tree object will occur multiple times often.) If yes, then I think this block can be removed too and will simplify the code. > +/* See object.h and revision.h */ > +#define FILTER_REVISIT (1<<25) If you do add this, also update object.h. But I don't think this is the right place to do it - it's probably better in list-objects-filter-sparse. > +typedef enum list_objects_filter_result list_objects_filter_result; > +typedef enum list_objects_filter_type list_objects_filter_type; I don't think Git style is to typedef enums. 
> +void traverse_commit_list_worker( > + struct rev_info *, > + show_commit_fn, show_object_fn, void *show_data, > + filter_object_fn filter, void *filter_data); Here (and throughout), as far as I can tell, the style is to indent to the character after the opening parenthesis.
[PATCH] technical doc: add a design doc for hash function transition
From: Jonathan Nieder

This is "RFC v3: Another proposed hash function transition plan" from the git mailing list. Signed-off-by: Jonathan Nieder Signed-off-by: Jonathan Tan Signed-off-by: Brandon Williams Signed-off-by: Stefan Beller --- This takes the original Google Doc[1] and adds it to our history, such that the discussion can be on list and in the commit messages. * replaced SHA3-256 with NEWHASH, sha3 with newhash * added section 'Implementation plan' * added section 'Future work' * added section 'Agreed-upon criteria for selecting NewHash' As the discussion restarts, here is our attempt to add value to it. We had planned to polish this more, but since the discussion is picking up again, we are posting it as-is. Thanks. [1] https://docs.google.com/document/d/18hYAQCTsDgaFUo-VJGhT0UqyetL2LbAzkWNK1fYS8R0/edit Documentation/Makefile | 1 + .../technical/hash-function-transition.txt | 571 + 2 files changed, 572 insertions(+) create mode 100644 Documentation/technical/hash-function-transition.txt diff --git a/Documentation/Makefile b/Documentation/Makefile index 2415e0d657..471bb29725 100644 --- a/Documentation/Makefile +++ b/Documentation/Makefile @@ -67,6 +67,7 @@ SP_ARTICLES += howto/maintain-git API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt))) SP_ARTICLES += $(API_DOCS) +TECH_DOCS += technical/hash-function-transition TECH_DOCS += technical/http-protocol TECH_DOCS += technical/index-format TECH_DOCS += technical/pack-format diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt new file mode 100644 index 00..0ac751d600 --- /dev/null +++ b/Documentation/technical/hash-function-transition.txt @@ -0,0 +1,571 @@ +Git hash function transition + + +Objective +- +Migrate Git from SHA-1 to a stronger hash function. + +Background +-- +At its core, the Git version control system is a content addressable +filesystem. 
It uses the SHA-1 hash function to name content. For +example, files, directories, and revisions are referred to by hash +values, unlike in other traditional version control systems where files +or versions are referred to via sequential numbers. The use of a hash +function to address its content delivers a few advantages: + +* Integrity checking is easy. Bit flips, for example, are easily + detected, as the hash of corrupted content does not match its name. +* Lookup of objects is fast. + +Using a cryptographically secure hash function brings additional +advantages: + +* Object names can be signed and third parties can trust the hash to + address the signed object and all objects it references. +* Communication using Git protocol and out of band communication + methods have a short reliable string that can be used to reliably + address stored content. + +Over time some flaws in SHA-1 have been discovered by security +researchers. https://shattered.io demonstrated a practical SHA-1 hash +collision. As a result, SHA-1 cannot be considered cryptographically +secure any more. This impacts the communication of hash values because +we cannot trust that a given hash value represents the known good +version of content that the speaker intended. + +SHA-1 still possesses the other properties such as fast object lookup +and safe error checking, but other hash functions that are believed to +be cryptographically secure are equally suitable. + +Goals +- +1. The transition to NEWHASH can be done one local repository at a time. + a. Requiring no action by any other party. + b. A NEWHASH repository can communicate with SHA-1 Git servers + (push/fetch). + c. Users can use SHA-1 and NEWHASH identifiers for objects + interchangeably. + d. New signed objects make use of a stronger hash function than + SHA-1 for their security guarantees. +2. Allow a complete transition away from SHA-1. + a. 
Local metadata for SHA-1 compatibility can be removed from a + repository if compatibility with SHA-1 is no longer needed. +3. Maintainability throughout the process. + a. The object format is kept simple and consistent. + b. Creation of a generalized repository conversion tool. + +Non-Goals +- +1. Add NEWHASH support to Git protocol. This is valuable and the + logical next step but it is out of scope for this initial design. +2. Transparently improving the security of existing SHA-1 signed + objects. +3. Intermixing objects using multiple hash functions in a single + repository. +4. Taking the opportunity to fix other bugs in git's formats and + protocols. +5. Shallow clones and fetches into a NEWHASH repository. (This will + change when we add NEWHASH support to Git
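As an aside for readers skimming the design doc: the content addressing it describes is easy to reproduce by hand, since an object's name is just the SHA-1 of a short "type size" header plus the raw content. A minimal sketch using only coreutils (no git needed):

```shell
# Compute a git blob's object name by hand: SHA-1 over the header
# "blob <size>\0" followed by the content.  "hello\n" is 6 bytes.
blob_id=$(printf 'blob 6\0hello\n' | sha1sum | awk '{print $1}')
echo "$blob_id"   # same id that `echo hello | git hash-object --stdin` prints
```

A flipped bit anywhere in the content yields a completely different name, which is the integrity property the document relies on.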
Re: [PATCH 02/13] oidset2: create oidset subclass with object length and pathname
On Fri, 22 Sep 2017 20:26:21 + Jeff Hostetler wrote: > From: Jeff Hostetler > > Create subclass of oidset where each entry has a > field to store the length of the object's content > and an optional pathname. > > This will be used in a future commit to build a > manifest of omitted objects in a partial/narrow > clone/fetch. As Brandon mentioned, I think "oidmap" should be the new data structure of choice (with "oidset" modified to use it). > +struct oidset2_entry { > + struct hashmap_entry hash; > + struct object_id oid; > + > + enum object_type type; > + int64_t object_length; /* This is SIGNED. Use -1 when unknown. */ > + char *pathname; > +}; object_length is defined to be "unsigned long" in Git code, I think. When is object_length not known, and in those cases, would it be better to use a separate data structure to store what we need?
Re: RFC v3: Another proposed hash function transition plan
Hi Jason, On Tue, 26 Sep 2017, Jason Cooper wrote: > On Thu, Sep 14, 2017 at 08:45:35PM +0200, Johannes Schindelin wrote: > > On Wed, 13 Sep 2017, Linus Torvalds wrote: > > > On Wed, Sep 13, 2017 at 6:43 AM, demerphq wrote: > > > > SHA3 however uses a completely different design where it mixes a 1088 > > > > bit block into a 1600 bit state, for a leverage of 2:3, and the excess > > > > is *preserved between each block*. > > > > > > Yes. And considering that the SHA1 attack was actually predicated on > > > the fact that each block was independent (no extra state between), I > > > do think SHA3 is a better model. > > > > > > So I'd rather see SHA3-256 than SHA256. > > Well, for what it's worth, we need to be aware that SHA3 is *different*. > In crypto, "different" = "bugs haven't been found yet". :-P > > And SHA2 is *known*. So we have a pretty good handle on how it'll > weaken over time. Here, you seem to agree with me. > > SHA-256 got much more cryptanalysis than SHA3-256, and apart from the > > length-extension problem that does not affect Git's usage, there are no > > known weaknesses so far. > > While I think that statement is true on its face (particularly when > including post-competition analysis), I don't think it's sufficient > justification to choose one over the other. And here you don't. I find that very confusing. > > It would seem that the experts I talked to were much more concerned about > > that amount of attention than the particulars of the algorithm. My > > impression was that the new features of SHA3 were less studied than the > > well-known features of SHA2, and that the new-ness of SHA3 is not > > necessarily a good thing. > > The only thing I really object to here is the abstract "experts". We're > talking about cryptography and integrity here. It's no longer > sufficient to cite anonymous experts. Either they can put their > thoughts, opinions and analysis on record here, or it shouldn't be > considered. Sorry. 
Sorry, you are asking cryptography experts to spend their time on the Git mailing list. I tried to get them to speak out on the Git mailing list. They respectfully declined. I can't fault them, they have real jobs to do, and none of their managers would be happy for them to educate the Git mailing list on matters of cryptography, not after what happened in 2005. > Other than their anonymity, though, I do agree with your experts' > assessments. I know what our in-house cryptography experts have to prove to start working at Microsoft. Forgive me, but you are not a known entity to me. > However, whether we choose SHA2 or SHA3 doesn't matter. To you, it does not matter. To me, it matters. To the several thousand developers working on Windows, probably the largest Git repository in active use, it matters. It matters because the speed difference that has little impact on you has a lot more impact on us. > Moving away from SHA1 does. Once the object_id code is in place to > facilitate that transition, the problem is solved from git's > perspective. Uh oh. You forgot the mapping. And the protocol. And pretty much everything except the oid. > If SHA3 is chosen as the successor, it's going to get a *lot* more > adoption, and thus, a lot more analysis. If cracks start to show, the > hard work of making git flexible is already done. We can migrate to > SHA4/5/whatever in an orderly fashion with far less effort than the > transition away from SHA1. Sure. And if XYZ789 is chosen, it's going to get a *lot* more adoption, too. We think. Let's be realistic. Git is pretty important to us, but it is not important enough to sway, say, Intel into announcing hardware support for SHA3. 
And if you try to force through *any* hash function only so that it gets more adoption and hence more support, in the short run you will make life harder for developers on more obscure platforms, who may not easily get high-quality, high-speed implementations of anything but the very mainstream (which is, let's face it, MD5, SHA-1 and SHA-256). I know I would have cursed you for such a decision back when I had to work on AIX and IRIX. > For my use cases, as a user of git, I have a plan to maintain provable > integrity of existing objects stored in git under sha1 while migrating > away from sha1. The same plan works for migrating away from SHA2 or > SHA3 when the time comes. Please do not make the mistake of taking your use case to be a template for everybody's use case. Migrating a large team away from any hash function to another one *will* be painful, and costly. Migrating will be very costly for hosting companies like GitHub, Microsoft and BitBucket, too. Ciao, Johannes
Re: [PATCH 4/3] branch: fix "copy" to never touch HEAD
On Fri, Sep 22 2017, Junio C. Hamano jotted: > When creating a new branch B by copying the branch A that happens to > be the current branch, it also updates HEAD to point at the new > branch. It probably was made this way because "git branch -c A B" > piggybacked its implementation on "git branch -m A B". > > This does not match the usual expectation. If I were sitting on a > blue chair, and somebody comes and repaints it to red, I would > accept ending up sitting on a chair that is now red (I am also OK to > stand, instead, as there no longer is my favourite blue chair). But > if somebody creates a new red chair, modelling it after the blue > chair I am sitting on, I do not expect to be booted off of the blue > chair and ending up sitting on the new red one. > > Let's fix this before it hits 'next'. Those who want to create a > new branch and switch to it can do "git checkout B" after doing a > "git branch -c B", and if that operation is so useful and deserves a > short-hand way to do so, perhaps extend "git checkout -b B" to copy > configurations while creating the new branch B. > > Signed-off-by: Junio C Hamano > --- > > * Let's send an updated one as a follow-up to the discussion thread >that it is a follow-up to. The patch in this message is the same >as the one I previously sent; the proposed log message has been >updated somewhat. > > builtin/branch.c | 9 +++-- > t/t3200-branch.sh | 10 +- > 2 files changed, 8 insertions(+), 11 deletions(-) This mostly looks good to me. I've already spent several screenfuls here on-list arguing for not applying the equivalent of this patch, which I won't repeat here. But just a brief comment: I think doing this is good, and it's something I agree with given that the consensus is going this way. I thought "do what -m does" was less confusing, but since everyone (you, Brad, even Sahil) seems to disagree I'm happy to be shown to be wrong & to have this go in so thoroughly reviewed & thought about. 
I do think however that we also need to update the docs, the relevant origin/master...gitster/sd/branch-copy diff is currently: +The `-c` and `-C` options have the exact same semantics as `-m` and +`-M`, except instead of the branch being renamed it along with its +config and reflog will be copied to a new name. Suggestions welcome, but I think that should probably be changed to something like the following as part of this patch: The `-c` and `-C` options have the exact same semantics as `-m` and `-M`, except instead of the branch being renamed it along with its config and reflog will be copied to a new name. Furthermore, unlike its `-m` and `-M` cousins the `-c` and `-C` options will never move the HEAD, whereas the move option will do so if the source branch is the currently checked-out branch. > diff --git a/builtin/branch.c b/builtin/branch.c > index 89f64f4123..e2e3692838 100644 > --- a/builtin/branch.c > +++ b/builtin/branch.c > @@ -506,12 +506,9 @@ static void copy_or_rename_branch(const char *oldname, > const char *newname, int > oldref.buf + 11); > } > > - if (replace_each_worktree_head_symref(oldref.buf, newref.buf, > logmsg.buf)) { > - if (copy) > - die(_("Branch copied to %s, but HEAD is not updated!"), > newname); > - else > - die(_("Branch renamed to %s, but HEAD is not > updated!"), newname); > - } > + if (!copy && > + replace_each_worktree_head_symref(oldref.buf, newref.buf, > logmsg.buf)) > + die(_("Branch renamed to %s, but HEAD is not updated!"), > newname); > > strbuf_release(); > > diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh > index 5d03ad16f6..be9b3784c6 100755 > --- a/t/t3200-branch.sh > +++ b/t/t3200-branch.sh > @@ -422,7 +422,7 @@ test_expect_success 'git branch --copy is a synonym for > -c' ' > test_cmp expect actual > ' > > -test_expect_success 'git branch -c ee ef should copy and checkout branch ef' > ' > +test_expect_success 'git branch -c ee ef should copy to create branch ef' ' > git checkout -b ee && > git reflog exists 
refs/heads/ee && > git config branch.ee.dummy Hello && > @@ -431,7 +431,7 @@ test_expect_success 'git branch -c ee ef should copy and > checkout branch ef' ' > git reflog exists refs/heads/ef && > test $(git config branch.ee.dummy) = Hello && > test $(git config branch.ef.dummy) = Hello && > - test $(git rev-parse --abbrev-ref HEAD) = ef > + test $(git rev-parse --abbrev-ref HEAD) = ee > ' > > test_expect_success 'git branch -c f/f g/g should work' ' > @@ -494,12 +494,12 @@ test_expect_success 'git branch -C c1 c2 should succeed > when c1 is checked out' > git checkout -b c1 && > git branch c2 && > git branch -C c1 c2 && > - test $(git rev-parse --abbrev-ref HEAD) = c2 > + test $(git
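For readers following along, the post-patch behavior is easy to check with a throwaway repository (this assumes a git recent enough to ship `branch -c` with this fix included; the repo layout below is my own, not from the test suite):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q repo
cd repo
git config user.email you@example.com
git config user.name you
git commit -q --allow-empty -m init
git checkout -qb ee
git config branch.ee.dummy Hello
git branch -c ee ef                      # copy, not rename
head=$(git rev-parse --abbrev-ref HEAD)
echo "HEAD is still: $head"              # stays on ee; config was copied to ef
```

With the fix, HEAD never moves: the copy gets the reflog and configuration, but the current branch stays checked out.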
Re: BUG in git diff-index
Marc Herbert wrote: > PS: I used NNTP and http://dir.gmane.org/gmane.comp.version-control.git > to quickly find this old thread (what could we do without NNTP?). Then > I googled for a web archive of this thread and Google could only find > this one: > http://git.661346.n2.nabble.com/BUG-in-git-diff-index-tt7652105.html#none > Is there a robots.txt to block indexing on > https://public-inbox.org/git/1459432667.2124.2.ca...@dwim.me ? There's no blocks on public-inbox.org and I'm completely against any sort of blocking/throttling. Maybe there's too many pages to index? Or the Message-IDs in URLs are too ugly/scary? Not sure what to do about that... Anyways, I just put up a robots.txt with Crawl-Delay: 1, since I seem to recall crawlers use a more conservative delay by default: ==> https://public-inbox.org/robots.txt <== User-Agent: * Crawl-Delay: 1 I don't know much about SEO other than keeping a site up and responsive; so perhaps there's more to be done about getting things indexed...
[PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
submodule.<name>.update can be assigned an arbitrary command via setting it to "!command". When this command is found in the regular config, Git ought to just run that command instead of other update mechanisms. However if that command is just found in the .gitmodules file, it is potentially untrusted, which is why we do not run it. Add a test confirming the behavior. Suggested-by: Jonathan Nieder Signed-off-by: Stefan Beller --- > This test for a missing file is certainly a remnant from the > previous iteration, isn't it? Yes. This is a good indicator I need some vacation. Thanks, Stefan t/t7406-submodule-update.sh | 8 1 file changed, 8 insertions(+) diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 034914a14f..6f083c4d68 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -406,6 +406,14 @@ test_expect_success 'submodule update - command in .git/config' ' ) ' +test_expect_success 'submodule update - command in .gitmodules is ignored' ' + test_when_finished "git -C super reset --hard HEAD^" && + git -C super config -f .gitmodules submodule.submodule.update "!false" && + git -C super commit -a -m "add command to .gitmodules file" && + git -C super/submodule reset --hard $submodulesha1^ && + git -C super submodule update submodule +' + cat << EOF >expect Execution of 'false $submodulesha1' failed in submodule path 'submodule' EOF -- 2.14.0.rc0.3.g6c2e499285
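The trust rule the new test pins down — a "!command" update is honored from `.git/config` but never from the tracked `.gitmodules` — can be sketched as a tiny policy function. This is an illustration of the rule only, not git's actual implementation; the `effective_update` helper and its fallback to `checkout` are my own simplifications:

```shell
# Decide which update mode to honor for a submodule.  A "!command" is
# only trusted when it comes from the repository's own config, which the
# user controls; .gitmodules is tracked content and may come from an
# untrusted upstream.  Illustration only -- not git's implementation.
effective_update() {
	local_value=$1      # from .git/config (trusted)
	gitmodules_value=$2 # from .gitmodules (untrusted)
	if [ -n "$local_value" ]; then
		echo "$local_value"              # may be "!command"
		return
	fi
	case "$gitmodules_value" in
	'!'*|'') echo checkout ;;            # ignore smuggled commands
	*) echo "$gitmodules_value" ;;
	esac
}

effective_update "" '!false'   # .gitmodules tries to smuggle a command
```

Here the ".gitmodules" value `!false` is discarded and the default `checkout` behavior is used, which is why the test above expects `git submodule update submodule` to succeed.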
Re: [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
On 26.09.2017 at 20:54, Stefan Beller wrote: +test_expect_success 'submodule update - command in .gitmodules is ignored' ' + test_when_finished "git -C super reset --hard HEAD^" && + + git -C super config -f .gitmodules submodule.submodule.update "!false" && + git -C super commit -a -m "add command to .gitmodules file" && + git -C super/submodule reset --hard $submodulesha1^ && + git -C super submodule update submodule && + test_path_is_missing bad This test for a missing file is certainly a remnant from the previous iteration, isn't it? -- Hannes
Re: BUG in git diff-index
On 31/03/16 13:39, Junio C Hamano wrote: > Andy Lowry writes: > >> So I think now that the script should do "update-index --refresh" >> followed by "diff-index --quiet HEAD". Sound correct? > > Yes. That has always been one of the kosher ways for any script to > make sure that the files in the working tree that are tracked have > not been modified relative to HEAD (assuming that the index matches > HEAD). Too bad kernel/scripts/setlocalversion didn't get the memo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cdf2bc632ebc9ef51 > scripts/setlocalversion on write-protected source tree (2013) > I don't see how removing "git update-index" could do any harm. This causes a spurious "-dirty" suffix when building from a directory copy (as Mike learned the hard way) PS: I used NNTP and http://dir.gmane.org/gmane.comp.version-control.git to quickly find this old thread (what could we do without NNTP?). Then I googled for a web archive of this thread and Google could only find this one: http://git.661346.n2.nabble.com/BUG-in-git-diff-index-tt7652105.html#none Is there a robots.txt to block indexing on https://public-inbox.org/git/1459432667.2124.2.ca...@dwim.me ?
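The "kosher" sequence Junio describes can be sketched as a self-contained script. The `touch` below changes only stat information, which is exactly the case where skipping `update-index --refresh` can yield a spurious "dirty" answer (as in the setlocalversion report):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
git config user.email you@example.com
git config user.name you
echo content >file
git add file
git commit -qm init

touch file   # stat-only change: the content is identical

# Refresh cached stat info so diff-index compares content, not timestamps.
git update-index -q --refresh
if git diff-index --quiet HEAD --; then
	state=clean
else
	state=dirty
fi
echo "worktree is $state"   # clean: only the timestamp changed
```

Without the refresh step, `diff-index` may report the stat-changed file as modified even though its content matches HEAD.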
Re: [PATCH v2 0/9] Teach 'run' perf script to read config files
> "Note: Amazon SES overrides any Date header you provide with the > time that Amazon > SES accepts the message." > > http://docs.aws.amazon.com/ses/latest/DeveloperGuide/header-fields.html > > ...so the only way SubmitGit can offset the times is to literally > delay the sending of the emails, > which is a bit unfortunate for patchbombs more than a few dozen commits long! How many series of more than "a few dozen" patches were sent to the list lately? I'd argue this is a non issue for the typical use case of submitGit. > I'll take a further look at this when I get a bit more free time. Thanks! Stefan
Re: [PATCH v2 0/9] Teach 'run' perf script to read config files
On 26 September 2017 at 16:40, Christian Couder wrote: > On Sun, Sep 24, 2017 at 9:59 AM, Junio C Hamano wrote: >> Christian Couder writes: >> >>> (It looks like smtp.gmail.com isn't working anymore for me, so I am >>> trying to send this using Gmail for the cover letter and Submitgit for >>> the patches.) >> >> SubmitGit may want to learn the "change the timestamps of the >> individual patches by 1 second" trick from "git send-email" to help >> threading (you can view inbox/comp.version-control.git/ group over >> nntp and tell your newsreader to sort-by-date). > > Roberto is now in CC. I will let him answer about that. I had a quick look at git-send-email.perl, I see the trick is the `time++` one introduced with https://github.com/git/git/commit/a5370b16 - seems reasonable! SubmitGit makes all emails in-reply-to the initial email, which I think is correct behaviour, but I can see that offsetting the times would probably give a more reliable sorting in a newsreader. Unfortunately the documentation for AWS Simple Email Service (SES) says: "Note: Amazon SES overrides any Date header you provide with the time that Amazon SES accepts the message." http://docs.aws.amazon.com/ses/latest/DeveloperGuide/header-fields.html ...so the only way SubmitGit can offset the times is to literally delay the sending of the emails, which is a bit unfortunate for patchbombs more than a few dozen commits long! I'll take a further look at this when I get a bit more free time. Roberto
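For reference, git-send-email's `time++` trick amounts to stamping each message of a series one second later than the previous, so date-sorting newsreaders keep the patches in order. A sketch using GNU `date` (illustrative only, not submitGit's code; the base epoch is arbitrary):

```shell
# Emit one RFC 2822 Date header per patch, each one second later than
# the previous -- the effect of the time++ trick in git-send-email.
base=1506400000
for i in 0 1 2; do
	date -u -R -d "@$((base + i))"
done
```

Each patch then carries a distinct, strictly increasing Date, even when the whole series is generated and sent in the same instant.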
Re: [PATCH 00/13] RFC object filtering for partial clone
On 9/26/2017 10:55 AM, Jeff Hostetler wrote: On 9/22/2017 8:39 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 20:26:19 + Jeff Hostetler wrote: ... I tried applying your patches and it doesn't apply cleanly on master. Could you try rebasing? In particular, the code references get_sha1_with_context(), which no longer exists (it was replaced with get_oid_with_context()). I usually build and submit patches relative to the latest available tag (v2.14.1) rather than a moving target, but yeah I'll look thru the other comments and rebase it forward. https://github.com/jeffhostetler/git/tree/core/rev_list_filtering I rebased this branch onto a current master and fixed the get_sha1_with_context() problems. Let me know if you have problems with it (or would prefer that I send it to the mailing list). Jeff
[PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
submodule.<name>.update can be assigned an arbitrary command via setting it to "!command". When this command is found in the regular config, Git ought to just run that command instead of other update mechanisms. However if that command is just found in the .gitmodules file, it is potentially untrusted, which is why we do not run it. Add a test confirming the behavior. Suggested-by: Jonathan Nieder Signed-off-by: Stefan Beller --- Johannes wrote: > I am pretty confident that this does not test what you intend to > test. Notice that $TEST_DIRECTORY is expanded when the script is > written. But that path contains a blank, and we have something like > this in the test script: > > #!/bin/sh > >/the/build/directory/t/trash directory.t7406/bad I can confirm that. Instead of mucking around with writing a script, "to make it robust", I decided to go the simplest route and just have the command "!false", which would make "git submodule update submodule" return an error. Thanks, Stefan t/t7406-submodule-update.sh | 10 ++ 1 file changed, 10 insertions(+) diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 034914a14f..a9ea098e55 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -406,6 +406,16 @@ test_expect_success 'submodule update - command in .git/config' ' ) ' +test_expect_success 'submodule update - command in .gitmodules is ignored' ' + test_when_finished "git -C super reset --hard HEAD^" && + + git -C super config -f .gitmodules submodule.submodule.update "!false" && + git -C super commit -a -m "add command to .gitmodules file" && + git -C super/submodule reset --hard $submodulesha1^ && + git -C super submodule update submodule && + test_path_is_missing bad +' + cat << EOF >expect Execution of 'false $submodulesha1' failed in submodule path 'submodule' EOF -- 2.14.0.rc0.3.g6c2e499285
Re: [PATCH v5] connect: in ref advertisement, shallows are last
On 09/26, Jonathan Tan wrote:
> Currently, get_remote_heads() parses the ref advertisement in one loop,
> allowing refs and shallow lines to intersperse, despite this not being
> allowed by the specification. Refactor get_remote_heads() to use two
> loops instead, enforcing that refs come first, and then shallows.
>
> This also makes it easier to teach get_remote_heads() to interpret other
> lines in the ref advertisement, which will be done in a subsequent
> patch.
>
> As part of this change, this patch interprets capabilities only on the
> first line in the ref advertisement, printing a warning message when
> encountering capabilities on other lines.
>
> Signed-off-by: Jonathan Tan
> ---
> Changes in v5:
> - print warning when encountering capabilities on other lines instead
>   of ignoring them (also updated commit message)
> - explicitly disallow refs of name "capabilities^{}" (except when it is
>   the only ref)
> ---
>  connect.c | 183 +++---
>  1 file changed, 117 insertions(+), 66 deletions(-)
>
> diff --git a/connect.c b/connect.c
> index df56c0cbf..df65a3fc4 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -11,6 +11,7 @@
>  #include "string-list.h"
>  #include "sha1-array.h"
>  #include "transport.h"
> +#include "strbuf.h"
>  
>  static char *server_capabilities;
>  static const char *parse_feature_value(const char *, const char *, int *);
> @@ -107,6 +108,98 @@ static void annotate_refs_with_symref_info(struct ref *ref)
>  	string_list_clear(&symref, 0);
>  }
>  
> +/*
> + * Read one line of a server's ref advertisement into packet_buffer.
> + */
> +static int read_remote_ref(int in, char **src_buf, size_t *src_len,
> +			   int *responded)
> +{
> +	int len = packet_read(in, src_buf, src_len,
> +			      packet_buffer, sizeof(packet_buffer),
> +			      PACKET_READ_GENTLE_ON_EOF |
> +			      PACKET_READ_CHOMP_NEWLINE);
> +	const char *arg;
> +	if (len < 0)
> +		die_initial_contact(*responded);
> +	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
> +		die("remote error: %s", arg);
> +
> +	*responded = 1;
> +
> +	return len;
> +}
> +
> +#define EXPECTING_FIRST_REF 0
> +#define EXPECTING_REF 1
> +#define EXPECTING_SHALLOW 2
> +
> +static void process_capabilities(int *len)
> +{
> +	int nul_location = strlen(packet_buffer);
> +	if (nul_location == *len)
> +		return;
> +	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
> +	*len = nul_location;
> +}
> +
> +static int process_dummy_ref(void)
> +{
> +	static char *template;
> +	if (!template)
> +		template = xstrfmt("%040d capabilities^{}", 0);

My only complaint is still here: I don't like the notion of hardcoding
the 0's. It's much more future proof and less error prone to call
parse_oid_hex and require that it matches null_oid.

> +	return !strcmp(packet_buffer, template);
> +}
> +
> +static void check_no_capabilities(int len)
> +{
> +	if (strlen(packet_buffer) != len)
> +		warning("Ignoring capabilities after first line '%s'",
> +			packet_buffer + strlen(packet_buffer));
> +}
> +
> +static int process_ref(int len, struct ref ***list, unsigned int flags,
> +		       struct oid_array *extra_have)
> +{
> +	struct object_id old_oid;
> +	const char *name;
> +
> +	if (parse_oid_hex(packet_buffer, &old_oid, &name))
> +		return 0;
> +	if (*name != ' ')
> +		return 0;
> +	name++;
> +
> +	if (extra_have && !strcmp(name, ".have")) {
> +		oid_array_append(extra_have, &old_oid);
> +	} else if (!strcmp(name, "capabilities^{}")) {
> +		die("protocol error: unexpected capabilities^{}");
> +	} else if (check_ref(name, flags)) {
> +		struct ref *ref = alloc_ref(name);
> +		oidcpy(&ref->old_oid, &old_oid);
> +		**list = ref;
> +		*list = &ref->next;
> +	}
> +	check_no_capabilities(len);
> +	return 1;
> +}
> +
> +static int process_shallow(int len, struct oid_array *shallow_points)
> +{
> +	const char *arg;
> +	struct object_id old_oid;
> +
> +	if (!skip_prefix(packet_buffer, "shallow ", &arg))
> +		return 0;
> +
> +	if (get_oid_hex(arg, &old_oid))
> +		die("protocol error: expected shallow sha-1, got '%s'", arg);
> +	if (!shallow_points)
> +		die("repository on the other end cannot be shallow");
> +	oid_array_append(shallow_points, &old_oid);
> +	check_no_capabilities(len);
> +	return 1;
> +}
> +
>  /*
>   * Read all the refs from the other end
>   */
> @@ -123,76 +216,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  	 * willing to talk to us. A hang-up before seeing any
>  	 * response does not necessarily mean an ACL problem, though.
>  	 */
> -
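For readers following the process_dummy_ref() discussion, this is the
advertisement line the `xstrfmt("%040d capabilities^{}", 0)` template is
matching — a null OID spelled as forty zero digits, a space, and the
fake ref name (a sketch only; the pkt-line length prefix and the
NUL-separated capability list that would follow are omitted):

```shell
set -e
# build the "dummy ref" line the same way the patch does: %040d of 0
zeros=$(printf '%040d' 0)
line="$zeros capabilities^{}"
printf '%s\n' "$line"
```

The reviewer's point is that hardcoding forty zeroes bakes in the SHA-1
hash length, whereas parsing the leading OID and comparing it against
null_oid would keep working if the hash (and thus the OID length) ever
changes.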
[PATCH] submodule: correct error message for missing commits.
When a submodule diff should be displayed we currently just add the
submodule objects to the main object store and then e.g. walk the
revision graph and create a summary for that submodule.

It is possible that we are missing the submodule either completely or
partially, which we currently differentiate with different error
messages depending on whether (1) the whole submodule object store is
missing or (2) just the objects needed for this particular diff.

(1) is reported as "not initialized", and (2) is reported as "commits
not present".

If a submodule is deinit'ed its repository data is still around inside
the superproject, such that the diff can still be produced. In that way
the error message (1) is misleading, as we can have a diff despite the
submodule being not initialized.

Downgrade the error message (1) to be the same as (2) and just say the
commits are not present, as that is the true reason why the diff cannot
be shown.

Signed-off-by: Stefan Beller
---
 submodule.c                               | 2 +-
 t/t4059-diff-submodule-not-initialized.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/submodule.c b/submodule.c
index 6531c5d609..280c246477 100644
--- a/submodule.c
+++ b/submodule.c
@@ -567,7 +567,7 @@ static void show_submodule_header(FILE *f, const char *path,
 
 	if (add_submodule_odb(path)) {
 		if (!message)
-			message = "(not initialized)";
+			message = "(commits not present)";
 		goto output_header;
 	}
 
diff --git a/t/t4059-diff-submodule-not-initialized.sh b/t/t4059-diff-submodule-not-initialized.sh
index cd70fd5192..49bca7b48d 100755
--- a/t/t4059-diff-submodule-not-initialized.sh
+++ b/t/t4059-diff-submodule-not-initialized.sh
@@ -95,7 +95,7 @@ test_expect_success 'submodule not initialized in new clone' '
 	git clone . sm3 &&
 	git -C sm3 diff-tree -p --no-commit-id --submodule=log HEAD >actual &&
 	cat >expected <<-EOF &&
-	Submodule sm1 $smhead1...$smhead2 (not initialized)
+	Submodule sm1 $smhead1...$smhead2 (commits not present)
 	EOF
 	test_cmp expected actual
 '
-- 
2.14.0.rc0.3.g6c2e499285
[PATCH v5] connect: in ref advertisement, shallows are last
Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.

As part of this change, this patch interprets capabilities only on the
first line in the ref advertisement, printing a warning message when
encountering capabilities on other lines.

Signed-off-by: Jonathan Tan
---
Changes in v5:
- print warning when encountering capabilities on other lines instead
  of ignoring them (also updated commit message)
- explicitly disallow refs of name "capabilities^{}" (except when it is
  the only ref)
---
 connect.c | 183 +++---
 1 file changed, 117 insertions(+), 66 deletions(-)

diff --git a/connect.c b/connect.c
index df56c0cbf..df65a3fc4 100644
--- a/connect.c
+++ b/connect.c
@@ -11,6 +11,7 @@
 #include "string-list.h"
 #include "sha1-array.h"
 #include "transport.h"
+#include "strbuf.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -107,6 +108,98 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+/*
+ * Read one line of a server's ref advertisement into packet_buffer.
+ */
+static int read_remote_ref(int in, char **src_buf, size_t *src_len,
+			   int *responded)
+{
+	int len = packet_read(in, src_buf, src_len,
+			      packet_buffer, sizeof(packet_buffer),
+			      PACKET_READ_GENTLE_ON_EOF |
+			      PACKET_READ_CHOMP_NEWLINE);
+	const char *arg;
+	if (len < 0)
+		die_initial_contact(*responded);
+	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	*responded = 1;
+
+	return len;
+}
+
+#define EXPECTING_FIRST_REF 0
+#define EXPECTING_REF 1
+#define EXPECTING_SHALLOW 2
+
+static void process_capabilities(int *len)
+{
+	int nul_location = strlen(packet_buffer);
+	if (nul_location == *len)
+		return;
+	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+	*len = nul_location;
+}
+
+static int process_dummy_ref(void)
+{
+	static char *template;
+	if (!template)
+		template = xstrfmt("%040d capabilities^{}", 0);
+	return !strcmp(packet_buffer, template);
+}
+
+static void check_no_capabilities(int len)
+{
+	if (strlen(packet_buffer) != len)
+		warning("Ignoring capabilities after first line '%s'",
+			packet_buffer + strlen(packet_buffer));
+}
+
+static int process_ref(int len, struct ref ***list, unsigned int flags,
+		       struct oid_array *extra_have)
+{
+	struct object_id old_oid;
+	const char *name;
+
+	if (parse_oid_hex(packet_buffer, &old_oid, &name))
+		return 0;
+	if (*name != ' ')
+		return 0;
+	name++;
+
+	if (extra_have && !strcmp(name, ".have")) {
+		oid_array_append(extra_have, &old_oid);
+	} else if (!strcmp(name, "capabilities^{}")) {
+		die("protocol error: unexpected capabilities^{}");
+	} else if (check_ref(name, flags)) {
+		struct ref *ref = alloc_ref(name);
+		oidcpy(&ref->old_oid, &old_oid);
+		**list = ref;
+		*list = &ref->next;
+	}
+	check_no_capabilities(len);
+	return 1;
+}
+
+static int process_shallow(int len, struct oid_array *shallow_points)
+{
+	const char *arg;
+	struct object_id old_oid;
+
+	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+		return 0;
+
+	if (get_oid_hex(arg, &old_oid))
+		die("protocol error: expected shallow sha-1, got '%s'", arg);
+	if (!shallow_points)
+		die("repository on the other end cannot be shallow");
+	oid_array_append(shallow_points, &old_oid);
+	check_no_capabilities(len);
+	return 1;
+}
+
 /*
  * Read all the refs from the other end
  */
@@ -123,76 +216,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * willing to talk to us. A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
+	int responded = 0;
+	int len;
+	int state = EXPECTING_FIRST_REF;
 
 	*list = NULL;
 
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
-		struct object_id old_oid;
-		char *name;
-		int len, name_len;
-
[ANNOUNCE] Git for Windows 2.14.2
Dear Git users,

It is my pleasure to announce that Git for Windows 2.14.2 is available from:

    https://git-for-windows.github.io/

Changes since Git for Windows v2.14.1 (August 10th 2017)

New Features

 * Comes with Git v2.14.2.
 * Comes with cURL v7.55.1.
 * The XP-compatibility layer emulating pthreads (which is no longer
   needed) was dropped in favor of modern Windows threading APIs; this
   should make threaded operations slightly faster and more robust.
 * On Windows, UNC paths can now be accessed via
   file://host/share/repo.git-style paths.
 * Comes with a new custom Git command "git update" to help keeping Git
   up-to-date on your machine.
 * The Git installer now offers an option to keep Git up-to-date by
   calling "git update" regularly.
 * Comes with BusyBox v1.28.0pre.16353.2739df917.
 * As is common elsewhere, Ctrl+Left and Ctrl+Right now move word-wise
   in Git Bash, too.
 * Comes with patch level 2 of the MSYS2 runtime (Git for Windows
   flavor) based on Cygwin 2.9.0.
 * Comes with Git LFS v2.3.0.
 * The vs/master branch can now be built in Visual Studio 2017, too.
 * As requested by the same user who implemented the change, Git for
   Windows now comes with tig, a text-mode interface for Git.

Bug Fixes

 * It is now possible to override http.sslBackend on the command-line.
 * The installer now detects correctly whether symbolic links can be
   created by regular users.
 * Git Bash now renders non-ASCII directories nicely.
 * A regression that caused the fetch operation with lots of refs to be
   a lot slower than before was fixed.
 * The git-gui.exe and gitk.exe wrappers intended to be used in Git CMD
   now handle command-line parameters correctly.
 * The core.longPaths setting is now heeded when packing refs, and by
   other previously forgotten Git commands.
 * Pressing Ctrl+Z in Git Bash no longer kills Win32 processes (e.g.
   git.exe), because POSIX job control is only available with MSYS2
   processes.
 * Git for Windows now sets core.fsyncObjectFiles = true by default,
   which makes it a lot more fault-tolerant, say, when power is lost.
 * A bug has been fixed where Git for Windows could run into an
   infinite loop trying to rename a file.
 * Before installing Git for Windows, we already verified that no Git
   Bash instance is active (which would prevent files from being
   overwritten). We now also verify that no git.exe processes are
   active, either.

Filename | SHA-256
-------- | -------
Git-2.14.2-64-bit.exe | ef144a5dbb74518b56727f6c228993b9e7f5aedec7bbf7d680dcffb5d36ce354
Git-2.14.2-32-bit.exe | 0c7cf5f1bd6532c3720920b953da6b5c563acab35e3caee0b6f337975e284f35
PortableGit-2.14.2-64-bit.7z.exe | 3d451383519379dfe2f160c4c3a49e4b9e39a98f3eea2534593cc3fb5a4e
PortableGit-2.14.2-32-bit.7z.exe | 6acd886bb1cebee7b5b607599c8327453110c28ddd8c4f976d14050ab0a17e11
MinGit-2.14.2-64-bit.zip | 9638733b8d749c43d59c34a714d582b2352356ee7d13c4acf919c18f307387f5
MinGit-2.14.2-32-bit.zip | 1c8ed7f54a8d50c84fbc767d97cf35992de42463d9852c56a1f3a3c1ac80c965
MinGit-2.14.2-busybox-64-bit.zip | e6a75e75378b89c4d2e9c89a7cd0b0995ea0b8c2bcd3a93c42d0ffadd17ccd75
MinGit-2.14.2-busybox-32-bit.zip | b0b4d2135ad63347dfd787aa8046a625272764d0a6ea73f4ebd793105d6d9545
Git-2.14.2-64-bit.tar.bz2 | cf2d4ccce3dbdd42755e2e89830a406b9cf5cef6305dc7bf176a4ef5d55cb47e
Git-2.14.2-32-bit.tar.bz2 | 8901421656b6e5215cf4a2f3872de3e6d89aa1177ea3c3ffbed54af292957023

Ciao,
Johannes
Re: RFC: Design and code of partial clones (now, missing commits and trees OK) (part 3)
On Tue, 26 Sep 2017 10:25:16 -0400 Jeff Hostetler wrote:
> >> Perhaps you could augment the OID lookup to remember where the object
> >> was found (essentially a .promisor bit set). Then you wouldn't need
> >> to touch them all.
> >
> > Sorry - I don't understand this. Are you saying that missing promisor
> > objects should go into the global object hashtable, so that we can set a
> > flag on them?
>
> I just meant could we add a bit to "struct object_info" to indicate
> that the object was found in a .promisor packfile? This could
> be set in sha1_object_info_extended().
>
> Then the is_promised() calls in fsck and gc would just test that bit.
>
> Given that that bit will be set on promisOR objects (and we won't
> have object_info for missing objects), you may need to adjust the
> iterator in the fsck/gc code slightly.
>
> This is a bit of a handwave, but could something like that eliminate
> the need to build this oidset?

This oidset is meant to contain the missing objects, and is needed as
the final check (attempt to read the object, then check this oidset).
Admittedly, right now I add objects to it even if they are present in
the DB, but that's because I think that it's better for the set to be
bigger than to incur the repeated existence checks. But even if we only
include truly missing objects in this oidset, we still need the oidset,
or store information about missing objects in some equivalent data
structure.

The bit that you mention being set on promisOR objects is already being
set. See the invocation of mark_uninteresting() in rev-list.c.
Re: RFC v3: Another proposed hash function transition plan
Hi all,

Sorry for late commentary...

On Thu, Sep 14, 2017 at 08:45:35PM +0200, Johannes Schindelin wrote:
> On Wed, 13 Sep 2017, Linus Torvalds wrote:
> > On Wed, Sep 13, 2017 at 6:43 AM, demerphq wrote:
> > > SHA3 however uses a completely different design where it mixes a 1088
> > > bit block into a 1600 bit state, for a leverage of 2:3, and the excess
> > > is *preserved between each block*.
> >
> > Yes. And considering that the SHA1 attack was actually predicated on
> > the fact that each block was independent (no extra state between), I
> > do think SHA3 is a better model.
> >
> > So I'd rather see SHA3-256 than SHA256.

Well, for what it's worth, we need to be aware that SHA3 is
*different*. In crypto, "different" = "bugs haven't been found yet".
:-P And SHA2 is *known*. So we have a pretty good handle on how it'll
weaken over time.

> SHA-256 got much more cryptanalysis than SHA3-256, and apart from the
> length-extension problem that does not affect Git's usage, there are no
> known weaknesses so far.

While I think that statement is true on its face (particularly when
including post-competition analysis), I don't think it's sufficient
justification to choose one over the other.

> It would seem that the experts I talked to were much more concerned about
> that amount of attention than the particulars of the algorithm. My
> impression was that the new features of SHA3 were less studied than the
> well-known features of SHA2, and that the new-ness of SHA3 is not
> necessarily a good thing.

The only thing I really object to here is the abstract "experts". We're
talking about cryptography and integrity here. It's no longer
sufficient to cite anonymous experts. Either they can put their
thoughts, opinions and analysis on record here, or it shouldn't be
considered. Sorry.

Other than their anonymity, though, I do agree with your experts'
assessments. However, whether we choose SHA2 or SHA3 doesn't matter.
Moving away from SHA1 does.
Once the object_id code is in place to facilitate that transition, the
problem is solved from git's perspective. If SHA3 is chosen as the
successor, it's going to get a *lot* more adoption, and thus, a lot
more analysis. If cracks start to show, the hard work of making git
flexible is already done. We can migrate to SHA4/5/whatever in an
orderly fashion with far less effort than the transition away from
SHA1.

For my use cases, as a user of git, I have a plan to maintain provable
integrity of existing objects stored in git under sha1 while migrating
away from sha1. The same plan works for migrating away from SHA2 or
SHA3 when the time comes.

thx,
Jason.
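To make concrete what "moving away from SHA1" actually swaps out: a Git
object id is just the hash of a small header plus the content, so the
hash function is a single, well-isolated ingredient. A sketch (assuming
`git` and coreutils `sha1sum` are on PATH; the sample content is made
up) of how a blob id is derived today:

```shell
set -e
# git's blob id is SHA-1 over "blob <size>\0<content>" ...
id=$(printf 'hello\n' | git hash-object --stdin)
# ... which we can reproduce by hand ("hello\n" is 6 bytes)
manual=$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)
echo "$id"
echo "$manual"
```

The two values come out identical; a hash transition replaces sha1sum
in this recipe (and the 40-hex-digit id width) while the
"type size NUL content" framing stays the same.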
Re: [PATCH v2 0/9] Teach 'run' perf script to read config files
On Sun, Sep 24, 2017 at 9:59 AM, Junio C Hamano wrote:
> Christian Couder writes:
>> (It looks like smtp.gmail.com isn't working anymore for me, so I am
>> trying to send this using Gmail for the cover letter and Submitgit for
>> the patches.)
>
> SubmitGit may want to learn the "change the timestamps of the
> individual patches by 1 second" trick from "git send-email" to help
> threading (you can view inbox/comp.version-control.git/ group over
> nntp and tell your newsreader to sort-by-date).

Roberto is now in CC. I will let him answer about that.

>> Highlevel view of the patches in the series
>> ~~~
>>
>> - Patches 1/9 to 4/9 were already in v1 and haven't changed.
>
> It isn't quite clear what _did_ change in the series and what
> lessons were learned from the previous round's discussion here.

In the previous round of discussion, I think the reviewers (you and
Peff) basically agreed that it was worth improving the test framework
to make it possible to easily run many tests, though reviewers were not
sure if what I had planned to do would in the end be a good solution.
(See
https://public-inbox.org/git/20170713065050.19215-1-chrisc...@tuxfamily.org/)

So what did change is that I implemented what was in the "Future work"
section in v1.

> The sample configuration in the description above (snipped) seems to
> have been extended and it shows that one of the use cases of the
> feature is to allow comparing runs against two versions, which
> looked more or less sensible way to express it.
> [...]
> Thanks. Let me see how well it works ;-)

Thanks for testing ;-)
Re: RFC: Design and code of partial clones (now, missing commits and trees OK)
On 09/22/2017 12:42 AM, Jonathan Tan wrote:
> On Thu, 21 Sep 2017 13:57:30 Jeff Hostetler wrote:
> [...]
>> I struggled with the terms here a little when looking at the source.
>> () A remote responding to a partial-clone is termed a
>> "promisor-remote". () Packfiles received from a promisor-remote are
>> marked with ".promisor" like ".keep" names.
>> () An object actually contained in such packfiles is called a
>> "promisor-object". () An object not-present but referenced by one of
>> the above promisor-objects is called a "promised-object" (aka a
>> "missing-object").
>>
>> I think the similarity of the words "promisOR" and "promisED" threw
>> me here as I was looking at the code. The code in is_promised()
>> [1] looked like it was adding all promisor- and promised-objects to
>> the "promised" OIDSET, but I could be mistaken.
>>
>> [1]
>> https://github.com/jonathantanmy/git/commit/7a9c2d9b6e2fce293817b595dee29a7eede0#diff-5d5d5dc185ef37dc30bb7d9a7ae0c4e8R1960
>
> I was struggling a bit with the terminology, true.
>
> Right now I'm thinking of:
> - promisor remote (as you defined)
> - promisor packfile (as you defined)
> - promisor object is an object known to belong to the promisor (whether
>   because we have it in a promisor packfile or because it is referenced
>   by an object in a promisor packfile)

Maybe naming has been discussed at length before, and I am jumping into
a long-settled topic. And admittedly this is bikeshedding. But I find
these names obscure, even as a developer. And terms like this will
undoubtedly bleed into the UI and documentation, so it would be good to
put some effort into choosing the best names possible.

I suppose that the term "promisor" comes from the computer science term
"promise" [1]. In that sense it is apt, because, say, a promisor object
is something that is known to be obtainable, but we don't have it yet.
But from the user's point of view, I think this isn't a very
illuminating term.
I think the user's mental model will be that there is a distinguished
remote repository that holds the project's entire published history,
and she has to remain connected to it for certain Git operations to
work [2]. Another interesting aspect of this remote is that it has to
be trusted never (well, almost never) to discard any objects [3].

Let me brainstorm about other names or concepts that seem closer to the
user's mental model:

* "backing remote", "backing repository"
* "lazy remote", "live remote", "cloud remote", "shared remote",
  "on-demand remote"
* "full remote", "deep remote", "permanent remote"
* "attached remote", "bound remote", "linked remote"
* "trusted remote", "authoritative remote", "official remote" (these
  terms probably imply more than we want)
* "upstream", "upstream remote" (probably too confusing with how the
  term "upstream" is currently used, even if in practice they will
  often be the same remote)
* "object depot", "depot remote", "depot repository", "remote object
  depot" (I don't like the term "object" that much, because it is
  rather remote from the Git user's daily life)
* "repository of record", "remote of record" (too wordy)
* "history depot", "history warehouse" (meh)
* (dare I say it?) "central repository"
* "object archive", "archival remote" (not so good because we already
  use "archive" for other concepts)
* depository (sounds too much like "repository")
* The thesaurus suggests nouns like: store, bank, warehouse, library,
  chronicle, annals, registry, locker, strongbox, attic, bunker

Personally I think "lazy remote" and "backing remote" are not too bad.

Michael

[1] https://en.wikipedia.org/wiki/Futures_and_promises

[2] I haven't checked whether the current proposal allows for multiple
"promisor remotes". It's certainly thinkable, if not now then in the
future. But I suppose that even then, 99% of users will configure a
single "promisor remote" for each repository.
[3] For those rare occasions where the server has to discard objects, it might make sense for the server to remember the names of the objects that were deleted, so that it can tell clients "no, you're not insane. I used to have that object but it has intentionally been obliterated", and possibly even a reason: "it is now taboo" vs. "I got tired of carrying it around".
Re: git config credential.helper with absolute path on windows not working properly
My situation was a little different, but I was able to get this to work
with some interesting escaping.

    helper = !"\"C:\\Path with spaces\\executable\"" --option1 value1 credential-helper $@

Notice the exclamation, quoted path of executable and extra escaped
quotes inside of that, plus escaped backslashes.

-- 
Sent from: http://git.661346.n2.nabble.com/
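The layers of escaping above are easier to follow if you let git itself
show the value it parses out of the config file. A sketch (the Windows
path is the made-up one from the message; only the quoting behavior is
the point here): inside a double-quoted config value, `\\` unescapes to
`\` and `\"` to `"`, so the stored helper string keeps the inner quotes
that protect the spaces when the `!...` command is later handed to the
shell.

```shell
set -e
cd "$(mktemp -d)"
# the same escaping as in the message, in a throwaway config file
cat >cred.cfg <<'EOF'
[credential]
	helper = !"\"C:\\Path with spaces\\executable\"" --option1 value1
EOF
# git unescapes one layer; the remaining quotes are for the shell
val=$(git config -f cred.cfg credential.helper)
printf '%s\n' "$val"
```

The printed value is `!"C:\Path with spaces\executable" --option1
value1` — exactly the command line, inner quotes intact, that the shell
will split correctly despite the blanks in the path.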
Re: [PATCH 00/13] RFC object filtering for partial clone
On 9/22/2017 8:39 PM, Jonathan Tan wrote:
> On Fri, 22 Sep 2017 20:26:19 + Jeff Hostetler wrote:
>> This draft contains filters to:
>> () omit all blobs
>> () omit blobs larger than some size
>> () omit blobs using a sparse-checkout specification
>>
>> In addition to specifying the filter criteria, the rev-list command
>> was updated to include options to:
>> () print a list of the omitted objects (due to the current filtering
>> criteria)
>> () print a list of missing objects (probably from a prior partial
>> clone/fetch).
>
> Thanks, this last point seems useful.
>
> I tried applying your patches and it doesn't apply cleanly on master.
> Could you try rebasing? In particular, the code references
> get_sha1_with_context(), which no longer exists (it was replaced with
> get_oid_with_context()).

I usually build and submit patches relative to the latest available tag
(v2.14.1) rather than a moving target, but yeah I'll look thru the
other comments and rebase it forward.

Thanks
Jeff
Re: [PATCH] git: add --no-optional-locks option
On 9/25/2017 12:17 PM, Johannes Schindelin wrote:
> Hi Kaartic,
>
> On Sun, 24 Sep 2017, Kaartic Sivaraam wrote:
>> On Thursday 21 September 2017 10:02 AM, Jeff King wrote:
>>> Some tools like IDEs or fancy editors may periodically run commands
>>> like "git status" in the background to keep track of the state of
>>> the repository.
>>
>> I might be missing something, shouldn't the IDEs be encouraged to use
>> libgit2 instead? I thought it was meant for these use cases.
>
> There are pros and cons. Visual Studio moved away from libgit2 e.g. to
> support SSH (back then, libgit2 did not support spawning processes, I
> have no idea whether that changed in the meantime).

There were other issues besides feature parity. The big one for VS was
that it moved the git computations into a separate process and address
space. You can't easily read/modify/write a 500MB .git/index file into
memory (with however many copies of the data that that creates) in a
32-bit process.
Re: RFC: Design and code of partial clones (now, missing commits and trees OK) (part 3)
On 9/22/2017 6:58 PM, Jonathan Tan wrote:
> On Fri, 22 Sep 2017 17:32:00 -0400 Jeff Hostetler wrote:
>> I guess I'm afraid that the first call to is_promised() is going to
>> cause a very long pause as it loads up a very large hash of objects.
>
> Yes, the first call will cause a long pause. (I think fsck and gc can
> tolerate this, but a better solution is appreciated.)
>
>> Perhaps you could augment the OID lookup to remember where the object
>> was found (essentially a .promisor bit set). Then you wouldn't need
>> to touch them all.
>
> Sorry - I don't understand this. Are you saying that missing promisor
> objects should go into the global object hashtable, so that we can set
> a flag on them?

I just meant could we add a bit to "struct object_info" to indicate
that the object was found in a .promisor packfile? This could be set
in sha1_object_info_extended().

Then the is_promised() calls in fsck and gc would just test that bit.

Given that that bit will be set on promisOR objects (and we won't have
object_info for missing objects), you may need to adjust the iterator
in the fsck/gc code slightly.

This is a bit of a handwave, but could something like that eliminate
the need to build this oidset?

> The oidset will deduplicate OIDs.

Right, but you still have an entry for each object. For a repo the
size of Windows, you may have 25M+ objects in your copy of the ODB.

> We have entries only for the "frontier" objects (the objects directly
> referenced by any promisor object). For the Windows repo, for example,
> I foresee that many of the blobs, trees, and commits will be "hiding"
> behind objects that the repository user did not download into their
> repo.
Re: RFC: Design and code of partial clones (now, missing commits and trees OK) (part 2/3)
On 9/22/2017 6:52 PM, Jonathan Tan wrote:
> On Fri, 22 Sep 2017 17:19:50 -0400 Jeff Hostetler wrote:
>>> In your specific example, how would rev-list know, on the client, to
>>> include (or exclude) a large blob in its output if it does not have
>>> it, and thus does not know its size?
>>
>> The client doesn't have the size. It just knows it is missing and it
>> needs it. It doesn't matter why it is missing. (But I guess the
>> client could assume it is because it is large.)
>
> Ah, OK. So rev-list on the client could filter the objects it has by
> size. My issue is that if the purpose of this feature in rev-list is
> to do prefetching, the only criterion we need to check for is absence
> from the local repo, right? (Or is filtering by size on the client
> useful for other reasons?)

The prefetch before a checkout may want all missing blobs, or it may
want to work with the sparse-checkout specification and only get the
required missing blobs in the subset of the tree. By putting the same
filter logic in rev-list, we can do that.

It also sets the stage for later filtering trees. (My current patch
only filters blobs. It would be nice to have a second version of the
sparse filter that also omits trees, but that may require a recursive
option in the fetch-objects protocol.)

> FYI I just posted my RFC this afternoon.
> https://public-inbox.org/git/20170922204211.ga24...@google.com/T/

Thanks, I'll take a look.
Re: -X theirs does not resolve symlink conflict Was: BUG: merge -s theirs is not in effect
On Tue, 26 Sep 2017, Junio C Hamano wrote:
> >> I do not recall people talking about symbolic links but the case of
> >> binary files has been on the wishlist for a long time, and I do not
> >> know of anybody who is working on (or is planning to work on) it.
>
> Ah, I misremembered.
>
> We've addressed the "binary files" case back in 2012 with a944af1d
> ("merge: teach -Xours/-Xtheirs to binary ll-merge driver",
> 2012-09-08). I do not know offhand if it is just as easy to plumb
> the MERGE_FAVOR_{OURS,THEIRS} bits thru the symbolic link codepath,
> like that patch did to the binary file codepath.
>
> Perhaps the attached (totally untested) patch might be a good
> starting point. I do not know if you are interested in hacking on
> Git, and I do not feel offended if you are not, but perhaps somebody

I would have felt honored to "hack on Git", but neither my C-foo is up
to par, nor is there more time I could adequately allocate for such an
endeavor. So meanwhile I am trying to contribute in hopefully
constructive "whining" while exploiting git.
> else might get interested in seeing if this #leftoverbits is a good
> direction to go in, and finishing it with docs and tests if it is
> ;-)
>
> merge-recursive.c | 17 +
> 1 file changed, 13 insertions(+), 4 deletions(-)
>
> >...<

This patch worked beautifully in my usecase!:

$> rm -rf /tmp/repo1; mkdir /tmp/repo1; cd /tmp/repo1; git init .; ln -s sym1 link; echo 1 > file; git add file link; git commit -m 'common'; git co -b b1; ln -sf b1link link; echo "b1 file" >| file; git commit -m 'b2 changes' -a; git co master; ln -sf masterlink link; echo "master file" >| file; git commit -m 'also modified in master' -a; git merge -X theirs --no-edit b1; ls -l link; cat file

warning: templates not found /home/yoh/share/git-core/templates
Initialized empty Git repository in /tmp/repo1/.git/
[master (root-commit) d2e9010] common
 2 files changed, 2 insertions(+)
 create mode 100644 file
 create mode 120000 link
Switched to a new branch 'b1'
[b1 a2b1321] b2 changes
 2 files changed, 2 insertions(+), 2 deletions(-)
Switched to branch 'master'
[master fbb4ba7] also modified in master
 2 files changed, 2 insertions(+), 2 deletions(-)
Auto-merging link
Auto-merging file
Merge made by the 'recursive' strategy.
 file | 2 +-
 link | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
lrwxrwxrwx 1 yoh yoh 6 Sep 26 09:32 link -> b1link
b1 file

I also tried -s ours and no explicit -s, and they did as prescribed as
well

PS thanks for the CCs

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834  Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik
Re: -s theirs use-case(s) Was: BUG: merge -s theirs is not in effect
On Tue, 26 Sep 2017, Junio C Hamano wrote:

> Yaroslav Halchenko writes:
>
> > 1. As a workaround for the absence of -m theirs I am using the mtheirs
> > git alias (I believe provided to me awhile back here on the list):
> >
> >     mtheirs = !sh -c 'git merge -s ours --no-commit $1 && git read-tree -m -u $1' -
> >
> > and it worked fine for my use cases
> >
> > 2. I think that if there is a reason for -s ours to exist, so there is
> > for -s theirs, since it is just the directionality of merges which
> > changes between the two
>
> Just on this point. They are not exactly symmetric.
>
> Imagine there are some undesirable changes you want to vanquish from
> the world, but they have already built on useful changes on top of
> the undesirable changes. A hypothetical history might look like
> this:
>
>              B---C
>             /
>        X---X---A
>       /
>  ---o---o  your mainline
>
> where 'X' denotes those unwanted changes.
> >...<
> The symmetric case where _you_ have wrong changes does not need "-s
> theirs". These mistakes X are yours, and so are the changes that
> depend on them:
>
>              B---C
>             /
>        X---X---A
>       /
>  ---o---o  their mainline
>
> and you can just rebase A, B and C on top of their mainline while
> getting rid of Xs yourself before publishing.

and that is where the gotcha comes -- what if "my" changes were already published? Then I would like to avoid the rebase, and would use -s theirs to choose "their" solution in favor of mine and be able to push so others could still "fast-forward" to the new state. So -- as to me it remains 'symmetric' ;)

> There may be valid workflows that benefit from "-s theirs", and I
> would not be surprised at all if we found more of them in the past 9
> years since we had the "why -s theirs does not exist" discussion in
> 2008. But "because -s ours can be used in reverse to emulate" is
> not a valid excuse to add "-s theirs". It can be used as a rationale
> against adding it (e.g.
"-s theirs generally is discouraged because > it forsters a bad workflow, but in a very rare case where it might > be useful, you can always check out their branch and merge yours > using '-s ours' to emulate it, so we do not lose any functionality > even if we did not add it"), though. sure, git is flexible, so workarounds could always be found, but often many options are just a matter of convenience. And here -s theirs would be one of them besides my other use case where it is a somewhat "by design" workflow, and -s theirs is use to take their exact state I would improve upon (before committing the "merge") -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik
Re: [PATCH 4/4] Move documentation of string_list into string-list.h
On Tue, Sep 26, 2017 at 7:22 AM, Junio C Hamano wrote:

> Junio C Hamano writes:
>
>> Thanks. I am not sure if you can safely reorder the contents of the
>> header files in general, but I trust that you made sure that this
>> does not introduce problems (e.g. referrals before definition).
>
> Alas, this time it seems my trust was grossly misplaced. Discarding
> this patch and redoing the integration cycle for the day.

Oops, my bad. I resent the remaining patches. They do compile now :)

--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
[PATCH 3/3] string-list.h: move documentation from Documentation/api/ into header
This mirrors commit 'bdfdaa497 ("strbuf.h: integrate api-strbuf.txt documentation, 2015-01-16") which did the same for strbuf.h: * API documentation uses /** */ to set it apart from other comments. * Function names were stripped from the comments. * Ordering of the header was adjusted to follow the one from the text file. * Edited some existing comments from string-list.h for consistency. Signed-off-by: Han-Wen Nienhuys--- Documentation/technical/api-string-list.txt | 209 string-list.h | 192 + 2 files changed, 162 insertions(+), 239 deletions(-) delete mode 100644 Documentation/technical/api-string-list.txt diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt deleted file mode 100644 index c08402b12..0 --- a/Documentation/technical/api-string-list.txt +++ /dev/null @@ -1,209 +0,0 @@ -string-list API -=== - -The string_list API offers a data structure and functions to handle -sorted and unsorted string lists. A "sorted" list is one whose -entries are sorted by string value in `strcmp()` order. - -The 'string_list' struct used to be called 'path_list', but was renamed -because it is not specific to paths. - -The caller: - -. Allocates and clears a `struct string_list` variable. - -. Initializes the members. You might want to set the flag `strdup_strings` - if the strings should be strdup()ed. For example, this is necessary - when you add something like git_path("..."), since that function returns - a static buffer that will change with the next call to git_path(). -+ -If you need something advanced, you can manually malloc() the `items` -member (you need this if you add things later) and you should set the -`nr` and `alloc` members in that case, too. - -. Adds new items to the list, using `string_list_append`, - `string_list_append_nodup`, `string_list_insert`, - `string_list_split`, and/or `string_list_split_in_place`. - -. 
Can check if a string is in the list using `string_list_has_string` or - `unsorted_string_list_has_string` and get it from the list using - `string_list_lookup` for sorted lists. - -. Can sort an unsorted list using `string_list_sort`. - -. Can remove duplicate items from a sorted list using - `string_list_remove_duplicates`. - -. Can remove individual items of an unsorted list using - `unsorted_string_list_delete_item`. - -. Can remove items not matching a criterion from a sorted or unsorted - list using `filter_string_list`, or remove empty strings using - `string_list_remove_empty_items`. - -. Finally it should free the list using `string_list_clear`. - -Example: - - -struct string_list list = STRING_LIST_INIT_NODUP; -int i; - -string_list_append(, "foo"); -string_list_append(, "bar"); -for (i = 0; i < list.nr; i++) - printf("%s\n", list.items[i].string) - - -NOTE: It is more efficient to build an unsorted list and sort it -afterwards, instead of building a sorted list (`O(n log n)` instead of -`O(n^2)`). -+ -However, if you use the list to check if a certain string was added -already, you should not do that (using unsorted_string_list_has_string()), -because the complexity would be quadratic again (but with a worse factor). - -Functions -- - -* General ones (works with sorted and unsorted lists as well) - -`string_list_init`:: - - Initialize the members of the string_list, set `strdup_strings` - member according to the value of the second parameter. - -`filter_string_list`:: - - Apply a function to each item in a list, retaining only the - items for which the function returns true. If free_util is - true, call free() on the util members of any items that have - to be deleted. Preserve the order of the items that are - retained. - -`string_list_remove_empty_items`:: - - Remove any empty strings from the list. If free_util is true, - call free() on the util members of any items that have to be - deleted. Preserve the order of the items that are retained. 
- -`print_string_list`:: - - Dump a string_list to stdout, useful mainly for debugging purposes. It - can take an optional header argument and it writes out the - string-pointer pairs of the string_list, each one in its own line. - -`string_list_clear`:: - - Free a string_list. The `string` pointer of the items will be freed in - case the `strdup_strings` member of the string_list is set. The second - parameter controls if the `util` pointer of the items should be freed - or not. - -* Functions for sorted lists only - -`string_list_has_string`:: - - Determine if the string_list has a given string or not. - -`string_list_insert`:: - - Insert a new element to the string_list. The returned pointer can be - handy if you want to write something to the `util` pointer of the - string_list_item
[PATCH 2/3] read_gitfile_gently: clarify return value ownership.
Signed-off-by: Han-Wen Nienhuys
---
 setup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index 42400fcc5..8c95841d5 100644
--- a/setup.c
+++ b/setup.c
@@ -541,7 +541,8 @@ void read_gitfile_error_die(int error_code, const char *path, const char *dir)
 /*
  * Try to read the location of the git directory from the .git file,
- * return path to git directory if found.
+ * return path to git directory if found. The return value comes from
+ * a shared buffer.
  *
  * On failure, if return_error_code is not NULL, return_error_code
  * will be set to an error code and NULL will be returned. If
--
2.14.1.821.g8fa685d3b7-goog
[PATCH 1/3] real_path: clarify return value ownership
Signed-off-by: Han-Wen Nienhuys
---
 abspath.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/abspath.c b/abspath.c
index 708aff8d4..985798532 100644
--- a/abspath.c
+++ b/abspath.c
@@ -202,6 +202,10 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
 	return retval;
 }

+/*
+ * Resolve `path` into an absolute, cleaned-up path. The return value
+ * comes from a shared buffer.
+ */
 const char *real_path(const char *path)
 {
 	static struct strbuf realpath = STRBUF_INIT;
--
2.14.1.821.g8fa685d3b7-goog
[PATCH 0/3] Comment fixes
follow more commit log conventions; verified it compiled (yay). (should I send patches that are in 'pu' again as well?)

Han-Wen Nienhuys (3):
  real_path: clarify return value ownership
  read_gitfile_gently: clarify return value ownership.
  string-list.h: move documentation from Documentation/api/ into header

 Documentation/technical/api-string-list.txt | 209
 abspath.c | 4 +
 setup.c | 3 +-
 string-list.h | 192 +
 4 files changed, 168 insertions(+), 240 deletions(-)
 delete mode 100644 Documentation/technical/api-string-list.txt

--
2.14.1.821.g8fa685d3b7-goog
[PATCH 1/1] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
The checkpoint command cycles packfiles if object_count != 0, a sensible test or there would be no pack files to write. Since 820b931012, the command also dumps branches, tags and marks, but still conditionally. However, it is possible for a command stream to modify refs or create marks without creating any new objects.

For example, reset a branch (and keep fast-import running):

	$ git fast-import
	reset refs/heads/master
	from refs/heads/master^
	checkpoint

but refs/heads/master remains unchanged.

Other example: a commit command that re-creates an object that already exists in the object database.

The man page also states that checkpoint "updates the refs" and that "placing a progress command immediately after a checkpoint will inform the reader when the checkpoint has been completed and it can safely access the refs that fast-import updated". This wasn't always true without this patch.

This fix unconditionally calls dump_{branches,tags,marks}() for all checkpoint commands. dump_branches() and dump_tags() are cheap to call in the case of a no-op.

Add tests to t9300 that observe the (non-packfile) effects of checkpoint.

Signed-off-by: Eric Rannaud
---
 fast-import.c | 6 +--
 t/t9300-fast-import.sh | 117 +
 2 files changed, 120 insertions(+), 3 deletions(-)

Any comments on the testing strategy with a background fast-import? To summarize:

- fast-import is started in the background with a command stream that ends with "checkpoint\nprogress checkpoint\n". fast-import keeps running after reaching the last command (we don't want fast-import to terminate).

- In the meantime, the test is waiting to read "progress checkpoint" in the output of fast-import, then it checks the testing conditions.

- Finally, the test ensures that fast-import is still running in the background (and thus that it has just observed the effect of checkpoint, and not the side effects of fast-import terminating).

- After 10 sec, no matter what, the background fast-import is sent "done" and terminates.
I tried to make sure that the test runs quickly and does not (typically) sleep. Upon failure, the test may take up to 10 sec to fully terminate. However, the test could break under (very heavy) load if fast-import is unable to make progress in a reasonable amount of time (either "progress checkpoint" is not read within 5 sec, or fast-import receives "done" before the testing conditions are checked). Let me know if this is not acceptable. Should these test cases be at least separated to a new t9304 to circumscribe any such nuisance? Alternatively, a FIFO-based approach could be considered.

(Note: added cases 2 and 4 pass without this patch, but 1 and 3 do not.)

diff --git a/fast-import.c b/fast-import.c
index 35bf671f12c4..d5e4cf0bad41 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3189,10 +3189,10 @@ static void checkpoint(void)
 	checkpoint_requested = 0;
 	if (object_count) {
 		cycle_packfile();
-		dump_branches();
-		dump_tags();
-		dump_marks();
 	}
+	dump_branches();
+	dump_tags();
+	dump_marks();
 }

 static void parse_checkpoint(void)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 67b8c50a5ab4..b410bf3a3a7a 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -3120,4 +3120,121 @@ test_expect_success 'U: validate root delete result' '
 	compare_diff_raw expect actual
 '

+###
+### series V (checkpoint)
+###
+
+# To make sure you're observing the side effects of checkpoint *before*
+# fast-import terminates (and thus writes out its state), check that the
+# fast-import process is still running using background_import_still_running
+# *after* evaluating the test conditions.
+background_import_until_checkpoint () {
+	options=$1
+	input_file=$2
+	( cat $input_file; sleep 10; echo done ) | git fast-import $options >V.output &
+	echo $! >V.pid
+
+	# The loop will poll for approximately 5 seconds before giving up.
+	n=0
+	while ! test "$(cat V.output)" = "progress checkpoint"; do
+		if test $n -gt 55
+		then
+			echo >&2 "no progress checkpoint received"
+			exit 1
+		fi
+
+		# Try to avoid sleeping in the first iterations and poll
+		# aggressively.
+		if test $n -ge 50
+		then
+			sleep 1
+		fi
+
+		n=$(($n + 1))
+	done
+}
+
+background_import_still_running () {
+	if ! ps --pid "$(cat V.pid)"
+	then
+		echo >&2 "background fast-import terminated too early"
+		exit 1
+	fi
+	kill $(cat V.pid)
+}
+
+test_expect_success 'V: checkpoint updates refs after reset' '
+	cat >input <<-INPUT_END &&
Re: [PATCH v2 2/5] p0008-abbrev.sh: Test find_unique_abbrev() perf
Derrick Stolee writes:

> diff --git a/t/helper/test-abbrev.c b/t/helper/test-abbrev.c
> new file mode 100644
> index 0..6866896eb
> --- /dev/null
> +++ b/t/helper/test-abbrev.c
> @@ -0,0 +1,19 @@
> +#include "cache.h"
> +#include

Same comment on as [1/5] applies.

> +
> +int cmd_main(int ac, const char **av)
> +{
> +	struct object_id oid;
> +	char hex[GIT_MAX_HEXSZ + 2];

Why +2 (as opposed to +1)?

> +	const char *end;
> +
> +	setup_git_directory();
> +
> +	while (fgets(hex, GIT_MAX_HEXSZ + 2, stdin)) {
> +		hex[GIT_MAX_HEXSZ] = 0;
> +		if (!parse_oid_hex(hex, &oid, &end))
> +			find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
> +	}
> +
> +	exit(0);
> +}
> diff --git a/t/perf/p0008-abbrev.sh b/t/perf/p0008-abbrev.sh
> new file mode 100755
> index 0..ba25e7824
> --- /dev/null
> +++ b/t/perf/p0008-abbrev.sh
> @@ -0,0 +1,22 @@
> +#!/bin/bash
> +
> +test_description='Test object disambiguation through abbreviations'
> +. ./perf-lib.sh
> +
> +test_perf_large_repo
> +
> +test-list-objects 10 | sort -R > objs.txt

I thought "sort randomly" was a GNUism. Does it work across platforms? I think not.

> +
> +test_perf 'find_unique_abbrev() for existing objects' '
> +	test-abbrev < objs.txt
> +'
> +
> +test-list-objects 10 --missing | sort -R > objs.txt
> +
> +test_perf 'find_unique_abbrev() for missing objects' '
> +	test-abbrev < objs.txt
> +'
> +
> +rm objs.txt
> +
> +test_done
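On the "sort -R" portability question: the usual portable alternative is the decorate-sort-undecorate idiom with POSIX awk and sort. Whether the perf script should adopt it is up to the patch author; this is just a self-contained sketch of the idiom:

```shell
# Shuffle stdin portably: prefix each line with a random key,
# sort numerically on that key, then strip the key again.
printf 'a\nb\nc\nd\n' |
awk 'BEGIN { srand() } { printf "%.6f\t%s\n", rand(), $0 }' |
sort -k1,1n |
cut -f2-
```

Every construct here (awk rand/srand, sort -k, cut -f with a tab delimiter) is in POSIX, unlike GNU sort's -R.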
Re: [PATCH v2 1/5] test-list-objects: List a subset of object ids
Derrick Stolee writes:

> diff --git a/t/helper/test-list-objects.c b/t/helper/test-list-objects.c
> new file mode 100644
> index 0..83b1250fe
> --- /dev/null
> +++ b/t/helper/test-list-objects.c
> @@ -0,0 +1,85 @@
> +#include "cache.h"
> +#include "packfile.h"
> +#include

If you include "cache.h", like the ordinary "git" program, you should not have to include any system headers like stdio.h yourself. The whole point of "git-compat-util.h must be the first to be included (indirectly including it via cache.h and friends is also OK)" rule is to make sure that system headers are included at the right place in the include sequence (e.g. defining feature macros before including certain headers, etc.).

> +int cmd_main(int ac, const char **av)
> +{
> +	unsigned int hash_delt = 0xFDB97531;
> +	unsigned int hash_base = 0x01020304;
> +...
> +	const int u_per_oid = sizeof(struct object_id) / sizeof(unsigned int);

Because the above will not work as you expect, unless your uint is 32-bit, you would probably want to make hash_delt and hash_base explicitly uint32_t, I think. Alternatively, because you assume that uint is at least 8-hex-digit wide in the output loop, you can keep "unsigned int" above, but change the divisor from sizeof(unsigned int) to a hardcoded 4. Either would fix the issue.

> +	c.total = 0;
> +	if (ac > 1)
> +		n = atoi(av[1]);

Otherwise n will stay 0 as initialized. Not checking the result of atoi() is probably permissible, as this is merely a test helper and we can assume that the caller will not do anything silly, like passing "-3" in av[1]. But it certainly would be a good discipline to make sure av[1] is a reasonable number.

I am afraid that I may be misreading the code, but the following code seems to require that the caller must know roughly how many objects there are (so that a large-enough value can be passed to av[1] to force c.select_mod to 1) when it wants to show everything, as the "missing" loop will run 0 times showing nothing, and the "listing" case will divide by zero. "Not passing anything will show everything" the proposed log message promises is a better design than what the code actually seems to be doing.

> +
> +	if (ac > 2 && !strcmp(av[2], "--missing")) {
> +		while (c.total++ < n) {

When n==0, this does nothing. I am not sure what the desired behaviour should be for "When no count is given, we show everything" in this codepath, though. Also, doesn't "test-list-objects 5 --missing" look very wrong? Shouldn't it be more like "test-list-objects --missing 5"?

> +			for (i = 0; i < u_per_oid; i++) {
> +				printf("%08x", hash_base);
> +				hash_base += hash_delt;
> +			}
> +			printf("\n");
> +		}
> +	} else {
> +		setup_git_directory();
> +
> +		for_each_loose_object(count_loose, &c, 0);
> +		for_each_packed_object(count_packed, &c, 0);
> +
> +		c.select_mod = 1 + c.total / n;

When n==0, this would divide by zero. Perhaps

	c.select_mod = n ? (1 + c.total / n) : 1;

or something to keep the promise of showing everything?

> +		for_each_loose_object(output_loose, &c, 0);
> +		for_each_packed_object(output_packed, &c, 0);
> +	}
> +
> +	exit(0);
> +}

Thanks.
Re: [PATCH] t7406: submodule..update command must not be run from .gitmodules
Johannes Sixt writes:

>> +	test_when_finished "git -C super reset --hard HEAD^" &&
>> +
>> +	write_script must_not_run.sh <<-EOF &&
>> +	>$TEST_DIRECTORY/bad
>> +	EOF
>
> I am pretty confident that this does not test what you intend to
> test. Notice that $TEST_DIRECTORY is expanded when the script is
> written. But that path contains a blank, and we have something like
> this in the test script:
>
>	#!/bin/sh
>	>/the/build/directory/t/trash directory.t7406/bad

Nicely explained. Thanks.
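The failure mode Johannes describes is easy to reproduce outside the test suite. A minimal self-contained sketch (the paths are made up) showing why the unquoted expansion breaks once the directory name contains a blank, and how quoting inside the generated script fixes it:

```shell
set -e
base="$(mktemp -d)"
dir="$base/trash directory.t7406"   # path with a blank, like the test's trash dir
mkdir -p "$dir"

# $dir expands unquoted at script-generation time; the generated script
# word-splits the path into a redirection target plus a bogus command:
cat >unquoted.sh <<EOF
>$dir/bad
EOF
sh unquoted.sh 2>/dev/null && echo "unexpectedly ok" || echo "broken as predicted"

# Quoting the expansion inside the generated script keeps the path whole:
cat >quoted.sh <<EOF
>"$dir/bad"
EOF
sh quoted.sh
test -f "$dir/bad" && echo "quoted works"
```

The broken variant even leaves a stray file behind (the redirection target up to the blank), which is exactly the kind of silent misfire the review caught.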
[ANNOUNCE] Git v2.14.2
The latest maintenance release Git v2.14.2 is now available at the usual places. In addition to the cvsserver related fixes that appear in 2.10.5, 2.11.4, 2.12.5 and 2.13.6 announced separately, this also includes various fixes that were merged already to the 'master' branch.

The tarballs are found at:

    https://www.kernel.org/pub/software/scm/git/

The following public repositories all have a copy of the 'v2.14.2' tag and the 'maint' branch that the tag points at:

    url = https://kernel.googlesource.com/pub/scm/git/git
    url = git://repo.or.cz/alt-git.git
    url = https://github.com/gitster/git

Git v2.14.2 Release Notes
=========================

Fixes since v2.14.1
-------------------

* Because recent Git for Windows do come with a real msgfmt, the build procedure for git-gui has been updated to use it instead of a hand-rolled substitute.

* "%C(color name)" in the pretty print format always produced ANSI color escape codes, which was an early design mistake. They now honor the configuration (e.g. "color.ui = never") and also tty-ness of the output medium.

* The http.{sslkey,sslCert} configuration variables are to be interpreted as a pathname that honors "~[username]/" prefix, but weren't, which has been fixed.

* Numerous bugs in walking of reflogs via "log -g" and friends have been fixed.

* "git commit" when seeing a totally empty message said "you did not edit the message", which is clearly wrong. The message has been corrected.

* When a directory is not readable, "gitweb" fails to build the project list. Work this around by skipping such a directory.

* A recently added test for the "credential-cache" helper revealed that EOF detection done around the time the connection to the cache daemon is torn down were flaky. This was fixed by reacting to ECONNRESET and behaving as if we got an EOF.

* Some versions of GnuPG fail to kill gpg-agent it auto-spawned and such a left-over agent can interfere with a test. Work it around by attempting to kill one before starting a new test.
* "git log --tag=no-such-tag" showed log starting from HEAD, which has been fixed---it now shows nothing. * The "tag.pager" configuration variable was useless for those who actually create tag objects, as it interfered with the use of an editor. A new mechanism has been introduced for commands to enable pager depending on what operation is being carried out to fix this, and then "git tag -l" is made to run pager by default. * "git push --recurse-submodules $there HEAD:$target" was not propagated down to the submodules, but now it is. * Commands like "git rebase" accepted the --rerere-autoupdate option from the command line, but did not always use it. This has been fixed. * "git clone --recurse-submodules --quiet" did not pass the quiet option down to submodules. * "git am -s" has been taught that some input may end with a trailer block that is not Signed-off-by: and it should refrain from adding an extra blank line before adding a new sign-off in such a case. * "git svn" used with "--localtime" option did not compute the tz offset for the timestamp in question and instead always used the current time, which has been corrected. * Memory leaks in a few error codepaths have been plugged. * bash 4.4 or newer gave a warning on NUL byte in command substitution done in "git stash"; this has been squelched. * "git grep -L" and "git grep --quiet -L" reported different exit codes; this has been corrected. * When handshake with a subprocess filter notices that the process asked for an unknown capability, Git did not report what program the offending subprocess was running. This has been corrected. * "git apply" that is used as a better "patch -p1" failed to apply a taken from a file with CRLF line endings to a file with CRLF line endings. 
The root cause was because it misused convert_to_git() that tried to do "safe-crlf" processing by looking at the index entry at the same path, which is a nonsense---in that mode, "apply" is not working on the data in (or derived from) the index at all. This has been fixed.

* Killing "git merge --edit" before the editor returns control left the repository in a state with MERGE_MSG but without MERGE_HEAD, which incorrectly tells the subsequent "git commit" that there was a squash merge in progress. This has been fixed.

* "git archive" did not work well with pathspecs and the export-ignore attribute.

* "git cvsserver" no longer is invoked by "git shell" by default, as it is old and largely unmaintained.

* Various Perl scripts did not use safe_pipe_capture() instead of backticks, leaving them susceptible to end-user input. They have been corrected.

Also contains various documentation updates and code clean-ups.

Credits go to joernchen
[ANNOUNCE] Git v2.13.6 and others
Maintenance releases Git v2.10.5, v2.11.4, v2.12.5 and v2.13.6 are now available at the usual places. These are solely about hardening "git shell" that is used on servers against an unsafe user input, which "git cvsserver" copes with poorly.

A copy of the release notes for v2.10.5 is attached at the end of the message, but the notes for other releases listed above essentially say the same thing.

The tarballs are found at:

    https://www.kernel.org/pub/software/scm/git/

The following public repositories all have a copy of the 'v2.10.5', 'v2.11.4', 'v2.12.5' and 'v2.13.6' tags and some of them have the 'maint-2.10', 'maint-2.11', 'maint-2.12' and 'maint-2.13' branches that the tags point at:

    url = https://kernel.googlesource.com/pub/scm/git/git
    url = git://repo.or.cz/alt-git.git
    url = https://github.com/gitster/git

Note that the committed version of the release notes for these versions all incorrectly mention "git daemon", where they should have said "git shell". It has been corrected in the attached copy, but because the release engineering was done several days in advance and the tags have already been shared with binary packagers and others at the git-secur...@googlegroups.com mailing list, these release tarballs are issued _with_ the known typo to avoid confusion of having two release tags with different contents. Sorry about that.

Git v2.10.5 Release Notes
=========================

Fixes since v2.10.4
-------------------

* "git cvsserver" no longer is invoked by "git shell" by default, as it is old and largely unmaintained.

* Various Perl scripts did not use safe_pipe_capture() instead of backticks, leaving them susceptible to end-user input. They have been corrected.

Credits go to joernchen for finding the unsafe constructs in "git cvsserver", and to Jeff King at GitHub for finding and fixing instances of the same issue in other scripts.
Changes since v2.10.4 are as follows:

Jeff King (3):
      shell: drop git-cvsserver support by default
      archimport: use safe_pipe_capture for user input
      cvsimport: shell-quote variable used in backticks

Junio C Hamano (3):
      cvsserver: move safe_pipe_capture() to the main package
      cvsserver: use safe_pipe_capture for `constant commands` as well
      Git 2.10.5

joernchen (1):
      cvsserver: use safe_pipe_capture instead of backticks