Re: [PATCH Outreachy 1/2] format: create pretty.h file
> I see you've "standardized" to drop "extern" from the declarations > in the header; I have an impression that our preference however is > to go in the other direction. OK, absolutely not a problem, I will return them. Do I need to write "extern" further in function declarations? And why did everyone choose writing "extern" every time? It looks obvious for me that declaration of function is extern, that's why I decided to throw them away. > The choice of bits that are moved to the new header looks quite > sensible to me. I'm very happy and satisfied with it :-) > s/futher/further/ It was a typo that I missed. Thank you! Will fix it also. > This has a toll on topics in flight that expect the symbols for > pretty are available in "commit.h"; they are forced to include > this new file they did not even know about. > > I notice that "commit.h" is included in "builtin.h"; perhaps adding > a new include for "pretty.h" there would be of lessor impact? I > dunno. > It's a middle point, as I said. I have plans to create unifying format.h then (for all formatting issues). I guess that pretty.h and ref-filter.h will be deleted later. But, I really need to create now that pretty.h because it is much easier to work with existing interface. If you have another ideas how to achieve the main goal - please share them with me, I would appreciate that so much. I am not sure that my solution is the best, but I can't come up with something better for now.
Re: [PATCH] docs/pretty-formats: mention commas in %(trailers) syntax
On Fri, Dec 08, 2017 at 12:16:36AM -0500, Jeff King wrote: > Commit 84ff053d47 (pretty.c: delimit "%(trailers)" arguments > with ",", 2017-10-01) switched the syntax of the trailers > placeholder, but forgot to update the documentation in > pretty-formats.txt. > > There's need to mention the old syntax; it was never in a > released version of Git. > > Signed-off-by: Jeff KingMy mistake, and thank you for giving this your attention.
Re: [SCRIPT/RFC 0/3] git-commit --onto-parent (three-way merge, no working tree file changes)
Hi Alexei, On 09/12/2017 03:18, Alexei Lozovsky wrote: > > > Chris reported in this very topic[1] that sometimes, due to > > conflicts with later commits, "checkout > commit > [checkout >] > > rebase --onto" is "much easier to do", where "commit --fixup > > > rebase -i" "breaks" (does not apply cleanly). > > It was more of a rant about conflict resolution by rebase rather than > a concern about modification time of files. While I'd prefer git to > not touch the source tree unnecessarily, it's not really a big deal > for me if it does and some parts of the project need to be rebuilt. Nevertheless, I found it valuable in supporting the case where "commit --fixup > rebase -i" seems to require even more work than otherwise necessary :) But thanks for clarifying, anyway, it does feel like `git rebase -i --autosquash` could be smarter in this regards, if `git rebase --onto` does it better...? Even though your explanation seems clear, having a real, easily reproducible case would help as well, I guess. > I kinda hoped that you may know this magic and incorporate it into > "commit --onto" which will allow to immediately get to the result of > the rebase: > > ---A---f!A---B' > > without spelling it all manually. If you mind enough to be bothered testing it out, might be even existing/initial state of originally proposed `git commit --onto-parent` script would work for you, as it does incorporate some trivial three-way merge resolution. In your starting situation: ---A---B ... you would just do something like: git commit --onto-parent A ... hopefully ending up in the desired state (hopefully = conflicts automatically resolved): ---A---C---B' You could even do this instead: git commit --onto-parent A --amend ... ending up with: ---A'---B' ... as that is basically what you wanted in the first place ;) > (And yeah, I'm actually Alexei, not Chris. That was my MUA being > dumb and using an old pseudonym than Google insists I'm called by.) Ah, sorry for the confusion :) Regards, Buga
Let us make business together.
Let us make business together. I am in-charge of transferring out funds our village generated from the sales of our local mined gold. I have some left over fund in the bank that I alone is aware and can transfer it out. My village that mines gold, has mandated me for sales of our raw Gold, and its fund control. I use this opportunity to look for some one who will provide an account to receive sum of ten million, US dollars left in the bank, this is gold sales fund unknown by our village, This fund has been laying for onward transfer to overseas as we transfer out all funds sold from our gold, till now this fund is lying in the bank, I have all documents concerning the fund and now I want to use it to establish outside my country. So if you are interested, then provide an account to receive the fund for a joint business and sharing, I will give you 30% of the fund. if you are interested reply me. Sincerely, Barrister Fatai Sanuo { fatai.sa...@myself.com }
Re: [SCRIPT/RFC 0/3] git-commit --onto-parent (three-way merge, no working tree file changes)
On Dec 9, 2017, at 01:54, Igor Djordjevic wrote: > On 08/12/2017 17:24, Junio C Hamano wrote: >> I'd rather do a quick fix-up on top (which ensures that at least the >> fix-up works in the context of the tip), and then "rebase -i" to >> move it a more appropriate place in the history (during which I have >> a chance to ensure that the fix-up works in the context it is >> intended to apply to). > > Chris reported in this very topic[1] that sometimes, due to conflicts > with later commits, "checkout > commit > [checkout >] rebase --onto" > is "much easier to do", where "commit --fixup > rebase -i" "breaks" > (does not apply cleanly). It was more of a rant about conflict resolution by rebase rather than a concern about modification time of files. While I'd prefer git to not touch the source tree unnecessarily, it's not really a big deal for me if it does and some parts of the project need to be rebuilt. The situations which tend to cause conflicts for me usually look like this. ---A---B Say, a file has this line added somewhere in commit A: +int some_function(); Then commit B adds another line: int some_function(); +int another_function(); Then I realize that I would like to have commit A to introduce some other_function() as well, so I add a fixup: ---A---B---f!A with the following diff: int some_function(); +int other_function(); int another_function(); And then I often find that "rebase -i --autosquash" fails to apply the commit B because it expects slightly different context around the changed lines. However, sometimes I see that if I do this: ---A---B \ f!A then "rebase --onto f!A A B" succeeds, nicely resolving the conflict without bothering me. No idea why. I kinda hoped that you may know this magic and incorporate it into "commit --onto" which will allow to immediately get to the result of the rebase: ---A---f!A---B' without spelling it all manually. (And yeah, I'm actually Alexei, not Chris. That was my MUA being dumb and using an old pseudonym than Google insists I'm called by.)
Re: [PATCH v2 4/5] Makefile: use the sha1collisiondetection submodule by default
On Fri, Dec 08, 2017 at 03:21:23PM -0800, Junio C Hamano wrote: > Junio C Hamanowrites: > > > That endgame allows us not force people to grab an essential part of > > the codebase as an external dependency from another place, which > > feels quite bad, especially when their primary interest is not in > > dogfooding submodule but in building a working version of Git. > > > > Removing one and keeping the other between the two will make the > > distribution more streamlined by removing the build-time knob we > > need to tweak, but the one that gets removed does not have to be the > > in-tree one that allows people to "git clone" from just one place. > > Perhaps this may deserve a bit more explanation. > > I wouldn't be so against "let's do submodule-only" if this were not > SHA-1 implementation but something like gitk and git-gui. An optional > part of a system that it is safe to leave to individual builders if > they want to fetch and use that part *is* an ideal target to bind as > a submodule to the system. > > It's just the "default SHA-1 implementation" is at the far opposite > end of the spectrum from "an optional part", and our use of > submodule to bind this code is definitely *not* because it makes > sense to use submodule in that context; it is because developers > (not necessarily those who consume Git sourcecode) *wanted* to use > submodule there, regardless of the real merit of doing so. Thanks for writing this out. I had a vague feeling that I didn't quite like going the submodule-only direction, but I was having trouble putting into words. It's exactly this. -Peff
Re: [SCRIPT/RFC 0/3] git-commit --onto-parent (three-way merge, no working tree file changes)
On 08/12/2017 17:24, Junio C Hamano wrote: > > > To get back on track, and regarding what`s already been said, > > would having something like this(1) feel useful? > > > > (1) git commit --onto > > Are you asking me if _I_ find it useful? It is not a very useful > question to ask, as I've taken things that I do not find useful > myself. It was also (kind of shy and subtle) "would you take it?", indeed, but I do value your personal opinion here, too, being a recognized developer, and one really knowing the Git (mailing list) community on top of it, so I appreciate you addressed both sides of the question. And it was partly addressed to Hannes, but more for a confirmation, I guess, him being the one to favor such a flow in the first place, over what I initially suggested. > Having said that, I do not see me personally using it. You keep > claiming that committing without ever materializing the exact state > that is committed in the working tree is a good thing. > > I do not subscribe to that view. No - and I find it an important difference to note - just that it might be acceptable / more preferable _in certain situations_, where the only alternative seems to be wasting (significant) amount of time on needless rebuilds of many files (just because of branch switching, otherwise untouched by the changes we`re interested in). If this is perceived a too uncommon/exotic case to worth addressing is a different matter, though. > I'd rather do a quick fix-up on top (which ensures that at least the > fix-up works in the context of the tip), and then "rebase -i" to > move it a more appropriate place in the history (during which I have > a chance to ensure that the fix-up works in the context it is > intended to apply to). Chris reported in this very topic[1] that sometimes, due to conflicts with later commits, "checkout > commit > [checkout >] rebase --onto" is "much easier to do", where "commit --fixup > rebase -i" "breaks" (does not apply cleanly). > I know that every time I say this, people who prefer to commit > things that never existed in the working tree will say "but we'll > test it later after we make these commit without having their state > in the working tree". But I also know better that "later" often do > not come, ever, at least for people like me ;-). No comment here ;) > The amount of work _required_ to record the fix-up at its final > resting place deeper in the history would be larger with "rebase -i" > approach, simply because approaches like "commit --onto" and "git > post" that throw a new commit deep in the history would not require > ever materializing it in the working tree. But because I care about > what I am actually committing, and because I am just lazy as any > other human (if not more), I'd prefer an apporach that _forces_ me > to have a checkout of the exact state that I'd be committing. That > would prod me to actually looking at and testing the state after the > change in the context it is meant to go. All that I agree with, too. But that said, I do find `git add --patch` invaluable (for example), where one can still opt to commit right away (and test later ;)), or do a proper `git stash push --keep-index` first in order to actually check/test the exact state/context before committing. One of the biggest advantages I see in using Git is that it provides so many possibilities, where there is not necessarily a single "correct" way to do something - depending on the (sub)context, the decision on "_the_ correct" way can be deferred to the user himself. Git (usually) does not judge, except in cases where something is considered "plain wrong" - still different than "might not be the best approach", but not necessarily a wrong one, either. But I do realize it also means more chances for beginner users to shoot themselves in the foot, blaming it on Git, so even if just a matter of personal taste, a more restrictive preference from the Git maintainer is understandable :) Regards, Buga [1] https://public-inbox.org/git/CAPc5daWupO6DMOMFGn=xjucg-jmyc4eyo8+tmasdwcaohxz...@mail.gmail.com/T/#m989306ab9327e15f14027cfd74ae8c5bf487affb
Re: [PATCH v2 4/5] Makefile: use the sha1collisiondetection submodule by default
On Fri, Dec 08 2017, Junio C. Hamano jotted: > Ævar Arnfjörð Bjarmasonwrites: > >> Instead the Makefile will emit an error if the contents of the >> submodule aren't checked out (line-wrapped. GNU make emits this all on >> one line): >> >> Makefile:1031: *** The sha1collisiondetection submodule is not >> checked out. Please make it available, either by cloning with >> --recurse-submodules, or by running "git submodule update >> --init". If you can't use it for whatever reason you can define >> NO_DC_SHA1_SUBMODULE=NoThanks. Stop. > > Sounds OK. > > But I actually do not mind to (and may even prefer to) have an > endgame opposite of what this series tries to lead us to. We've > tried to have this as submodule, we've seen that the arrangement > works, and now we declare victory and get rid of the submodule. I don't think we can say we tried without having this 4/5 (5/5 not needed) in a couple of releases. Without this series we always smoothly fall back to the non-submodule, and only use it if you opt in. So all we've really tested so far is: * CI systems that consume git.git and provide --recurse-submodules to git-clone by default. * People on this list that have gone out of their way to test by manually toggling the the flag. > That endgame allows us not force people to grab an essential part of > the codebase as an external dependency from another place, which > feels quite bad, especially when their primary interest is not in > dogfooding submodule but in building a working version of Git. As noted previously my two motivations are: 1) That we decide what we want to do with this, ultimately I don't really mind which way we go. 2) That if we go with the submodule by default, it should be understood that one of the main benefits is us *actually* dogfooding it and subsequently improving it for all git users. > Removing one and keeping the other between the two will make the > distribution more streamlined by removing the build-time knob we > need to tweak, but the one that gets removed does not have to be the > in-tree one that allows people to "git clone" from just one place. What you're describing here is a great example of #2, and also a way of making my point above that we really haven't tried submodules in git.git yet, since you're just worrying about this issue now that using it would the default. This is a UX issue with submodules that I agree sucks and there's no reason for why it couldn't be solved. E.g. one solution is that submodules could have something like: [submodule "sha1collisiondetection"] path = sha1collisiondetection url = https://github.com/cr-marcstevens/sha1collisiondetection.git branch = master localbranch = sha1collisiondetection/master Where the localbranch would be git.git's own copy in a branch of the the sha1collisiondetection/ commit. Then when you update the ref sha1collisiondetection/ points to it would also update the sha1collisiondetection/master branch (and warn/die when you push git.git master but not that branch). This would solve offer a solution to this submodule UX issue, but more importantly I think the likelyhood of such a patch (and others) actually being written is going to go up *significantly* if the git project itself is dogfooding the feature, with exhibit A being that you're already annoyed by it :)
Re: [PATCH v2 4/5] Makefile: use the sha1collisiondetection submodule by default
Junio C Hamanowrites: > That endgame allows us not force people to grab an essential part of > the codebase as an external dependency from another place, which > feels quite bad, especially when their primary interest is not in > dogfooding submodule but in building a working version of Git. > > Removing one and keeping the other between the two will make the > distribution more streamlined by removing the build-time knob we > need to tweak, but the one that gets removed does not have to be the > in-tree one that allows people to "git clone" from just one place. Perhaps this may deserve a bit more explanation. I wouldn't be so against "let's do submodule-only" if this were not SHA-1 implementation but something like gitk and git-gui. An optional part of a system that it is safe to leave to individual builders if they want to fetch and use that part *is* an ideal target to bind as a submodule to the system. It's just the "default SHA-1 implementation" is at the far opposite end of the spectrum from "an optional part", and our use of submodule to bind this code is definitely *not* because it makes sense to use submodule in that context; it is because developers (not necessarily those who consume Git sourcecode) *wanted* to use submodule there, regardless of the real merit of doing so. And that is why I do not feel entirely happy with the step 4/5 and also I feel moderately strongly against the step 5/5. These two steps have a "developer first" smell that disgusts me.
Re: [PATCH v2 2/5] Makefile: under "make dist", include the sha1collisiondetection submodule
On Fri, Dec 08 2017, Junio C. Hamano jotted: > Ævar Arnfjörð Bjarmasonwrites: > >> Include the sha1collisiondetection submodule when running "make >> dist". Even though we've been shipping the sha1collisiondetection >> submodule[1] and using it by default if it's checked out[2] anyone >> downloading git as a tarball would just get an empty >> sha1collisiondetection/ directory. > > While I can see that you are not including everything, but I do not > see _why_ you chose to do so and hardcode the burden of maintaining > the list of files we need to copy in the Makefile. I started by trying to come up with something generic which would handle future submodules, i.e.: git submodule foreach 'git ls-files' However, unlike the C programs ./git-submodule will bark about missing shell stuff when not installed, and the "dist" target already use ./git-*. Between that and someone using git.git probably never running sha1collisiondetection/ itself, it seemed fine just to hardcode the couple of things we needed, which are very unlikely to change. > This is much better than shipping a tarball that would not build at > the endgame stage, of course ;-)
Re: [PATCH v2 4/5] Makefile: use the sha1collisiondetection submodule by default
Ævar Arnfjörð Bjarmasonwrites: > Instead the Makefile will emit an error if the contents of the > submodule aren't checked out (line-wrapped. GNU make emits this all on > one line): > > Makefile:1031: *** The sha1collisiondetection submodule is not > checked out. Please make it available, either by cloning with > --recurse-submodules, or by running "git submodule update > --init". If you can't use it for whatever reason you can define > NO_DC_SHA1_SUBMODULE=NoThanks. Stop. Sounds OK. But I actually do not mind to (and may even prefer to) have an endgame opposite of what this series tries to lead us to. We've tried to have this as submodule, we've seen that the arrangement works, and now we declare victory and get rid of the submodule. That endgame allows us not force people to grab an essential part of the codebase as an external dependency from another place, which feels quite bad, especially when their primary interest is not in dogfooding submodule but in building a working version of Git. Removing one and keeping the other between the two will make the distribution more streamlined by removing the build-time knob we need to tweak, but the one that gets removed does not have to be the in-tree one that allows people to "git clone" from just one place.
Re: [PATCH v2 2/5] Makefile: under "make dist", include the sha1collisiondetection submodule
Ævar Arnfjörð Bjarmasonwrites: > Include the sha1collisiondetection submodule when running "make > dist". Even though we've been shipping the sha1collisiondetection > submodule[1] and using it by default if it's checked out[2] anyone > downloading git as a tarball would just get an empty > sha1collisiondetection/ directory. While I can see that you are not including everything, but I do not see _why_ you chose to do so and hardcode the burden of maintaining the list of files we need to copy in the Makefile. This is much better than shipping a tarball that would not build at the endgame stage, of course ;-) Thanks.
RE: Documentation Breakage at 2.5.6
-Original Message- On December 8, 2017 5:29 PM Junio C Hamano wrote: >"Randall S. Becker"writes: >> One request to Junio: Would it be possible to tag the commits to align >> with the tags in the main repo? That way, I can build a nice little >> Jenkins job to automatically fetch the correct commit for man pages >> when packaging up a release. >I am not interested in doing anything more than absolute minimum in the >history that records generated cruft. We already describe the mainline commit >object names in the messages; perhaps that is >sufficient? >commit daa88a54a985ed1ef258800c742223c2a8f0caaa > Author: Junio C Hamano > Date: Wed Dec 6 10:04:03 2017 -0800 > > Autogenerated manpages for v2.15.1-354-g95ec6 >The primary reason why I do not want to tag them is because the tree the >documentation sources were taken from is *not* the only thing that affects >these autogenerated cruft. The AsciiDoc toolchain that >happen to be >installed on the box the day I ran the documentation tools is an obvious >difference, and I do not want to make them appear any more definitive and >official. "This is *the* manpage for release >v2.15.1" is the message I do >not want to give. No worries. I will push on with trying to get asciidoc to build so that I can generate the man pages. That probably won't happen soon, so I'll keep MacGyvering. I do get generating is the better solution, but I would rather focus my own efforts on keeping up with git (that ports fairly easily and is essential to work life) than burning off $DAYJOB hours on asciidoc and/or xmlto. Cheers, Randall
[PATCH v2 3/5] sha1dc_git.h: re-arrange an ifdef chain for a subsequent change
A subsequent change will change the semantics of DC_SHA1_SUBMODULE in a way that would require moving these checks around, so start by moving them around without any functional changes to reduce the size of the subsequent patch. Signed-off-by: Ævar Arnfjörð Bjarmason--- sha1dc_git.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sha1dc_git.h b/sha1dc_git.h index a8c2729278..41e1c3fd3f 100644 --- a/sha1dc_git.h +++ b/sha1dc_git.h @@ -1,9 +1,9 @@ /* Plumbing with collition-detecting SHA1 code */ -#ifdef DC_SHA1_SUBMODULE -#include "sha1collisiondetection/lib/sha1.h" -#elif defined(DC_SHA1_EXTERNAL) +#ifdef DC_SHA1_EXTERNAL #include +#elif defined(DC_SHA1_SUBMODULE) +#include "sha1collisiondetection/lib/sha1.h" #else #include "sha1dc/sha1.h" #endif -- 2.15.1.424.g9478a66081
[PATCH v2 2/5] Makefile: under "make dist", include the sha1collisiondetection submodule
Include the sha1collisiondetection submodule when running "make dist". Even though we've been shipping the sha1collisiondetection submodule[1] and using it by default if it's checked out[2] anyone downloading git as a tarball would just get an empty sha1collisiondetection/ directory. Doing this automatically is a feature that's missing from git-archive, but in the meantime let's bundle this up into the tarball we ship. This ensures that the DC_SHA1_SUBMODULE flag does what's intended even in an unpacked tarball, and more importantly means we're building the exact same code from the same paths from git.git and from the tarball. I am not including all the files in the submodule, only the ones git actually needs (and the licenses), only including some files like this would be a useful feature if git-archive ever adds the ability to bundle up submodules. 1. commit 86cfd61e6b ("sha1dc: optionally use sha1collisiondetection as a submodule", 2017-07-01) 2. cac87dc01d ("sha1collisiondetection: automatically enable when submodule is populated", 2017-07-01) Signed-off-by: Ævar Arnfjörð Bjarmason--- Makefile | 15 +++ 1 file changed, 15 insertions(+) diff --git a/Makefile b/Makefile index dc886f8eda..3955b02b6a 100644 --- a/Makefile +++ b/Makefile @@ -2643,6 +2643,21 @@ dist: git-archive$(X) configure $(GIT_TARNAME)/configure \ $(GIT_TARNAME)/version \ $(GIT_TARNAME)/git-gui/version +ifdef DC_SHA1_SUBMODULE + @mkdir -p $(GIT_TARNAME)/sha1collisiondetection/lib + @cp sha1collisiondetection/LICENSE.txt \ + $(GIT_TARNAME)/sha1collisiondetection/ + @cp sha1collisiondetection/LICENSE.txt \ + $(GIT_TARNAME)/sha1collisiondetection/ + @cp sha1collisiondetection/lib/sha1.[ch] \ + $(GIT_TARNAME)/sha1collisiondetection/lib/ + @cp sha1collisiondetection/lib/ubc_check.[ch] \ + $(GIT_TARNAME)/sha1collisiondetection/lib/ + $(TAR) rf $(GIT_TARNAME).tar \ + $(GIT_TARNAME)/sha1collisiondetection/LICENSE.txt \ + $(GIT_TARNAME)/sha1collisiondetection/lib/sha1.[ch] \ + $(GIT_TARNAME)/sha1collisiondetection/lib/ubc_check.[ch] +endif @$(RM) -r $(GIT_TARNAME) gzip -f -9 $(GIT_TARNAME).tar -- 2.15.1.424.g9478a66081
[PATCH v2 4/5] Makefile: use the sha1collisiondetection submodule by default
Change the build process so that instead of needing to supply DC_SHA1_SUBMODULE=YesPlease to use the sha1collisiondetection submodule instead of the copy of the same code shipped in the sha1dc directory, it uses the submodule by default unless NO_DC_SHA1_SUBMODULE=NoThanks is supplied. This reverses the logic added by me in 86cfd61e6b ("sha1dc: optionally use sha1collisiondetection as a submodule", 2017-07-01). Git has now shipped with the submodule in git.git for two major releases, if we're ever going to migrate to fully using it instead of perpetually maintaining both sha1collisiondetection and the sha1dc directory this is a logical first step. This change removes the "auto" logic Junio added in cac87dc01d ("sha1collisiondetection: automatically enable when submodule is populated", 2017-07-01), I feel that automatically falling back to using sha1dc would defeat the point, which is to smoke out any remaining users of git.git who have issues cloning the submodule for whatever reason. Instead the Makefile will emit an error if the contents of the submodule aren't checked out (line-wrapped. GNU make emits this all on one line): Makefile:1031: *** The sha1collisiondetection submodule is not checked out. Please make it available, either by cloning with --recurse-submodules, or by running "git submodule update --init". If you can't use it for whatever reason you can define NO_DC_SHA1_SUBMODULE=NoThanks. Stop. Signed-off-by: Ævar Arnfjörð Bjarmason--- Makefile | 34 -- sha1dc_git.h | 2 +- 2 files changed, 21 insertions(+), 15 deletions(-) diff --git a/Makefile b/Makefile index 3955b02b6a..aed9d3001d 100644 --- a/Makefile +++ b/Makefile @@ -167,11 +167,12 @@ all:: # Without this option, i.e. the default behavior is to build git with its # own built-in code (or submodule). # -# Define DC_SHA1_SUBMODULE in addition to DC_SHA1 to use the -# sha1collisiondetection shipped as a submodule instead of the -# non-submodule copy in sha1dc/. This is an experimental option used -# by the git project to migrate to using sha1collisiondetection as a -# submodule. +# Define NO_DC_SHA1_SUBMODULE in addition to DC_SHA1 to use the +# sha1collisiondetection library shipped as a non-submodule copy in +# sha1dc/, instead of using the sha1collisiondetection submodule. This +# option might eventually go away in favor a hard dependency on +# cloning git.git with "--recurse-submodules" or on running "git +# submodule update --init" after cloning. # # Define OPENSSL_SHA1 environment variable when running make to link # with the SHA1 routine from openssl library. @@ -1026,8 +1027,15 @@ EXTLIBS = GIT_USER_AGENT = git/$(GIT_VERSION) -ifeq ($(wildcard sha1collisiondetection/lib/sha1.h),sha1collisiondetection/lib/sha1.h) -DC_SHA1_SUBMODULE = auto +ifndef NO_DC_SHA1_SUBMODULE + ifndef DC_SHA1_EXTERNAL + ifneq ($(wildcard sha1collisiondetection/lib/sha1.h),sha1collisiondetection/lib/sha1.h) +$(error The sha1collisiondetection submodule is not checked out. \ +Please make it available, either by cloning with --recurse-submodules, \ +or by running "git submodule update --init". If you can't use it for \ +whatever reason define NO_DC_SHA1_SUBMODULE=NoThanks) + endif + endif endif include config.mak.uname @@ -1497,19 +1505,17 @@ else BASIC_CFLAGS += -DSHA1_DC LIB_OBJS += sha1dc_git.o ifdef DC_SHA1_EXTERNAL - ifdef DC_SHA1_SUBMODULE - ifneq ($(DC_SHA1_SUBMODULE),auto) -$(error Only set DC_SHA1_EXTERNAL or DC_SHA1_SUBMODULE, not both) - endif + ifdef NO_DC_SHA1_SUBMODULE +$(error Only set DC_SHA1_EXTERNAL or NO_DC_SHA1_SUBMODULE, not both) endif BASIC_CFLAGS += -DDC_SHA1_EXTERNAL EXTLIBS += -lsha1detectcoll else -ifdef DC_SHA1_SUBMODULE +ifndef NO_DC_SHA1_SUBMODULE LIB_OBJS += sha1collisiondetection/lib/sha1.o LIB_OBJS += sha1collisiondetection/lib/ubc_check.o - BASIC_CFLAGS += -DDC_SHA1_SUBMODULE else + BASIC_CFLAGS += -DNO_DC_SHA1_SUBMODULE LIB_OBJS += sha1dc/sha1.o LIB_OBJS += sha1dc/ubc_check.o endif @@ -2643,7 +2649,7 @@ dist: git-archive$(X) configure $(GIT_TARNAME)/configure \ $(GIT_TARNAME)/version \ $(GIT_TARNAME)/git-gui/version -ifdef DC_SHA1_SUBMODULE +ifndef NO_DC_SHA1_SUBMODULE @mkdir -p $(GIT_TARNAME)/sha1collisiondetection/lib @cp sha1collisiondetection/LICENSE.txt \ $(GIT_TARNAME)/sha1collisiondetection/ diff --git a/sha1dc_git.h b/sha1dc_git.h index 41e1c3fd3f..1bcc4c473c 100644 --- a/sha1dc_git.h +++ b/sha1dc_git.h @@ -2,7 +2,7 @@ #ifdef DC_SHA1_EXTERNAL #include -#elif defined(DC_SHA1_SUBMODULE) +#elif !defined(NO_DC_SHA1_SUBMODULE) #include "sha1collisiondetection/lib/sha1.h" #else #include "sha1dc/sha1.h" -- 2.15.1.424.g9478a66081
[PATCH v2 1/5] Makefile: don't error out under DC_SHA1_EXTERNAL if DC_SHA1_SUBMODULE=auto
Fix a logic error in the initial introduction of DC_SHA1_EXTERNAL. If git.git has a sha1collisiondetection submodule checked out the logic to set DC_SHA1_SUBMODULE=auto would interact badly with the check for whether DC_SHA1_SUBMODULE was set. It would error out, meaning that there's no way to build git with DC_SHA1_EXTERNAL=YesPlease without deinit-ing the submodule. Instead, adjust the logic to only fire if the variable is to something else than "auto" which would mean it's a mistake on the part of whoever's building git, not just the Makefile tripping over its own logic. 1. 3964cbbb5c ("sha1dc: allow building with the external sha1dc library", 2017-08-15) 2. cac87dc01d ("sha1collisiondetection: automatically enable when submodule is populated", 2017-07-01) Signed-off-by: Ævar Arnfjörð Bjarmason--- Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/Makefile b/Makefile index fef9c8d272..dc886f8eda 100644 --- a/Makefile +++ b/Makefile @@ -1498,7 +1498,9 @@ else LIB_OBJS += sha1dc_git.o ifdef DC_SHA1_EXTERNAL ifdef DC_SHA1_SUBMODULE + ifneq ($(DC_SHA1_SUBMODULE),auto) $(error Only set DC_SHA1_EXTERNAL or DC_SHA1_SUBMODULE, not both) + endif endif BASIC_CFLAGS += -DDC_SHA1_EXTERNAL EXTLIBS += -lsha1detectcoll -- 2.15.1.424.g9478a66081
Re: [PATCH v7 00/16] Parial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests
On 12/08, Jeff Hostetler wrote: > From: Jeff Hostetler> > This is V7 of part 3 of partial clone. It builds upon V7 of part 2 > (which builds upon V6 of part 1). > > This version adds additional tests, fixes test errors on the MAC version, > and squashes some fixup commits. > > It also restores functionality accidentally dropped from the V6 series > for "git fetch" to automatically inherit the partial-clone filter-spec > when appropriate. This version extends the --no-filter argument to > override this inheritance. > I just finished reading through parts 1-3. Overall I like the series. There are a few point's that I'm not a big fan of but i wasn't able to come up with a better alternative. One of these being the need for a global variable to tell the fetch-object logic to not go to the server to try and fetch a missing object. One other thing i noticed was it looks like when you discover that you are missing a blob you you'll try to fault it in from the server without first checking its an object the server would even have. Shouldn't you first do a check to verify that the object in question is a promised object before you go out to contact the server to request it? You may have already ruled this out for some reason I'm not aware of (maybe its too costly to compute?). -- Brandon Williams
[PATCH v2 0/5] SHA1DC fixes & fully moving to a git.git submodule
Here's v2 as promised. Comments per-patch. Ævar Arnfjörð Bjarmason (5): Makefile: don't error out under DC_SHA1_EXTERNAL if DC_SHA1_SUBMODULE=auto Fixed indenting. Makefile: under "make dist", include the sha1collisiondetection submodule NEW: Change "make dist" to include the sha1collisiondetection/ dir in the tarball Junio's going to build when he makes releases, right now we just ship an empty directory. sha1dc_git.h: re-arrange an ifdef chain for a subsequent change No changes, trivial rewording of commit message. Makefile: use the sha1collisiondetection submodule by default s/NO_DC_SHA1_SUBMODULE=UnfortunatelyYes/NO_DC_SHA1_SUBMODULE=NoThanks/ as requested by Junio. Fix up wording of comment describing NO_DC_SHA1_SUBMODULE Fix indenting. sha1dc: remove in favor of using sha1collisiondetection as a submodule Reword & expand commit message. Don't die if both NO_DC_SHA1_SUBMODULE=Y and DC_SHA1_EXTERNAL=Y are provided. Makefile | 42 +- sha1dc/.gitattributes |1 - sha1dc/LICENSE.txt| 30 - sha1dc/sha1.c | 1900 - sha1dc/sha1.h | 110 --- sha1dc/ubc_check.c| 372 -- sha1dc/ubc_check.h| 52 -- sha1dc_git.h |6 +- 8 files changed, 27 insertions(+), 2486 deletions(-) delete mode 100644 sha1dc/.gitattributes delete mode 100644 sha1dc/LICENSE.txt delete mode 100644 sha1dc/sha1.c delete mode 100644 sha1dc/sha1.h delete mode 100644 sha1dc/ubc_check.c delete mode 100644 sha1dc/ubc_check.h -- 2.15.1.424.g9478a66081
Re: Documentation Breakage at 2.5.6
"Randall S. Becker"writes: > One request to Junio: Would it be possible to tag the commits to > align with the tags in the main repo? That way, I can build a nice > little Jenkins job to automatically fetch the correct commit for > man pages when packaging up a release. Sorry, I missed this due to an overlong line in the message. I am not interested in doing anything more than absolute minimum in the history that records generated cruft. We already describe the mainline commit object names in the messages; perhaps that is sufficient? commit daa88a54a985ed1ef258800c742223c2a8f0caaa Author: Junio C Hamano Date: Wed Dec 6 10:04:03 2017 -0800 Autogenerated manpages for v2.15.1-354-g95ec6 commit 466a3070abecf4081a12d8e07c770689506440b9 Author: Junio C Hamano Date: Wed Nov 29 10:12:49 2017 +0900 Autogenerated manpages for v2.15.1-271-g1a4e4 commit be681f4100647ab93fd19cb5066fcaa2cb79204b Author: Junio C Hamano Date: Mon Nov 27 12:33:30 2017 +0900 Autogenerated manpages for v2.15.0-374-g5f995 The primary reason why I do not want to tag them is because the tree the documentation sources were taken from is *not* the only thing that affects these autogenerated cruft. The AsciiDoc toolchain that happen to be installed on the box the day I ran the documentation tools is an obvious difference, and I do not want to make them appear any more definitive and official. "This is *the* manpage for release v2.15.1" is the message I do not want to give.
[PATCH] t6120: fix under gettext-poison
I'll queue this on top so that 'pu' would have one fewer breakage at Travis. diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index 4668f0058e..3e3fb462a0 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -341,7 +341,7 @@ test_expect_success 'describe directly tagged blob' ' test_expect_success 'describe tag object' ' git tag test-blob-1 -a -m msg unique-file:file && test_must_fail git describe test-blob-1 2>actual && - grep "fatal: test-blob-1 is neither a commit nor blob" actual + test_i18ngrep "fatal: test-blob-1 is neither a commit nor blob" actual ' test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' ' -- 2.15.1-501-gdae66d7f65
Re: [PATCH v2 4/4] t/Makefile: introduce TEST_SHELL_PATH
On Fri, Dec 08, 2017 at 04:08:19PM +0100, Johannes Schindelin wrote: > > Yes, but test-lib.sh sources GIT-BUILD-OPTIONS, which we > > built during the first "make". And that overrides the > > environment, giving us the original SHELL_PATH again. > > ... and we could simply see whether the environment variable > TEST_SHELL_PATH (which we would set in t/Makefile from the passed-in > SHELL_PATH) is set, and override it again. > > I still think we can do without recording test-phase details in the > build-phase (which may, or may not, know what the test-phase wants to do). > > In other words, I believe that we can make the invocation you mentioned > above work, by touching only t/Makefile (to pass SHELL_PATH as > TEST_SHELL_PATH) and t/test-lib.sh (to override the SHELL_PATH from > GIT-BUILD-OPTIONS with TEST_SHELL_PATH, if set). We could do that, but it makes TEST_SHELL_PATH inconsistent with all of the other config.mak variables. -Peff
Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
On Fri, Dec 08, 2017 at 12:39:55PM -0800, Stefan Beller wrote: > > If you add --raw, you can see that both commits introduce that blob, and > > it never "goes away". That's because that happened in a merge, which we > > don't diff in a default log invocation. > > We should when --raw is given. > --raw is documented as "For each commit, show a summary of changes > using the raw diff format." and I would argue that 'each commit' includes > merges. Though I guess this may have implications for long time users. And "--patch" is documented as "generate patch", but it also does nothing for merges by default. I think it has little to do with "--raw". It is simply that the default for "log" is none of "-c", "--cc", or "-m". We _could_ change that default ("--cc" is already the default for git-show), but I would not be surprised if that has fallouts (certainly it makes git-log much slower). > > So I think this one is tricky because of the revert. In the same way > > that pathspec-limiting is often tricky in the face of a revert, because > > the merges "hide" interesting things happening. > > yup, hidden merges are unfortunate. Is there an easy way to find out > about merges? (Junio hints at having tests around merges, which I'll do > next) If you find such an easy way, let me know. :) One of the few really manual types of query I remember having done in recent years is trying to pinpoint a bad merge. I.e., somebody during merge resolution accidentally does "git checkout --ours foo.c", blowing away changes which they didn't mean to. And then later you want to figure out which merge did it. If you use "-c" or "--cc", that isn't an "interesting" change, because it resolves to one side of the merge. If you use "-m", you get way too many changes and have to comb through them manually. I've resorted to "-m --first-parent", but then you frequently have to dig down several layers (e.g., the bad merge is a merge from "master" onto a topic branch, and your first "--first-parent" attempt will just find the bad topic being merged back into master). I think the most promising tool I've seen there is to redo the merge and show the diff between the auto-merge (including conflicts) and the committed tree. It's just another definition of "is this hunk interesting" that's different from "--cc". I'm not sure how that would interact with something like "--blobfind", though. For that matter, I'm not quite sure how your patch would interact with "--cc". I think you may need to special-case it. -Peff
Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
On Fri, Dec 08, 2017 at 10:37:08AM -0800, Junio C Hamano wrote: > > The two modes (dup/nodup) make string_list code tricky. Not sure > > how far we'd get with something simpler (e.g. an array of char pointers), > > but having the caller do all string allocations would make the code > > easier to analyze. > > Yes. > > It probably would have been more sensible if the API did not have > two modes (instead, have the caller pass whatever string to be > stored, *and* make the caller responsible for freeing them *if* it > passed an allocated string). I'd actually argue the other way: the simplest interface is one where the string list owns all of its pointers. That keeps the ownership/lifetime issues clear, and it's one less step for the caller to have to remember to do at the end (they do have to clear() the list, but they must do that anyway to free the array of items). It does mean that some callers may have to remember to free a temporary buffer right after adding its contents to the list. But that's a lesser evil, I think, since the memory ownership issues are all clearly resolved at the time of add. The big cost is just extra copies/allocations. I dunno. I actually do not mind the "nodup" version of append being used on a "dup" list. It is just a way of letting each call decide on whether to hand over ownership, and I think most sites are pretty clear. > For the push_refs_with_push() patch you sent, another possible fix > would be to make cas_options a nodup kind so that the result of > strbuf_detach() does not get an extra strdup to be lost when placed > in cas_options. With the current string-list API that would not > quite work, because freeing done in _release() is tied to the > "dup/nodup" ness of the string list. I think there even is a > codepath that initializes a string_list as nodup kind, stuffs string > in it giving the ownership, and then flips it into dup kind just > before calling _release() only to have it free the strings, or > something silly/ugly like that. Yes, the first grep hit for NODUP is bisect_clean_state(), which does this. I think it would be more clear if we could do: diff --git a/bisect.c b/bisect.c index 0fca17c02b..7c59408a13 100644 --- a/bisect.c +++ b/bisect.c @@ -1060,8 +1060,7 @@ static int mark_for_removal(const char *refname, const struct object_id *oid, int flag, void *cb_data) { struct string_list *refs = cb_data; - char *ref = xstrfmt("refs/bisect%s", refname); - string_list_append(refs, ref); + string_list_appendf(refs, "refs/bisect%s", refname); return 0; } @@ -1070,11 +1069,10 @@ int bisect_clean_state(void) int result = 0; /* There may be some refs packed during bisection */ - struct string_list refs_for_removal = STRING_LIST_INIT_NODUP; + struct string_list refs_for_removal = STRING_LIST_INIT_DUP; for_each_ref_in("refs/bisect", mark_for_removal, (void *) _for_removal); string_list_append(_for_removal, xstrdup("BISECT_HEAD")); result = delete_refs("bisect: remove", _for_removal, REF_NO_DEREF); - refs_for_removal.strdup_strings = 1; string_list_clear(_for_removal, 0); unlink_or_warn(git_path_bisect_expected_rev()); unlink_or_warn(git_path_bisect_ancestors_ok()); Having a "format into a string" wrapper doesn't cover _every_ string you might want to add to a list, but my experience with argv_array_pushf leads me to believe that it covers quite a lot of cases. I dunno. I am not so bothered by the current dual-nature that I think it is worth going on a crusade to eliminate one side. But I'm OK if somebody else wants to do so. -Peff
Re: [PATCH 1/2] git version --build-options: report the build platform, too
On Fri, Dec 8, 2017 at 4:17 PM, Eric Sunshinewrote: > On Fri, Dec 8, 2017 at 12:43 PM, Junio C Hamano wrote: >> Jonathan Nieder writes: @@ -413,6 +414,7 @@ int cmd_version(int argc, const char **argv, const char *prefix) if (build_options) { printf("sizeof-long: %d\n", (int)sizeof(long)); +printf("machine: %s\n", build_platform); >>> >>> Can this use GIT_BUILD_PLATFORM directly instead of going via the >>> indirection >>> of a mutable static string? That is, something like >>> >>> printf("machine: %s\n", GIT_BUILD_PLATFORM); >> >> Good point. And if this is externally identified as "machine", >> probably the macro should also use the same word, not "platform". >> We can go either way, as long as we are consistent, though. > > In Autoconf parlance, this would be called "host architecture" > (GIT_HOST_ARCH). My bad: "host cpu", rather (GIT_HOST_CPU).
Re: [PATCH 1/2] git version --build-options: report the build platform, too
On Fri, Dec 8, 2017 at 12:43 PM, Junio C Hamanowrote: > Jonathan Nieder writes: >>> @@ -413,6 +414,7 @@ int cmd_version(int argc, const char **argv, const char >>> *prefix) >>> >>> if (build_options) { >>> printf("sizeof-long: %d\n", (int)sizeof(long)); >>> +printf("machine: %s\n", build_platform); >> >> Can this use GIT_BUILD_PLATFORM directly instead of going via the indirection >> of a mutable static string? That is, something like >> >> printf("machine: %s\n", GIT_BUILD_PLATFORM); > > Good point. And if this is externally identified as "machine", > probably the macro should also use the same word, not "platform". > We can go either way, as long as we are consistent, though. In Autoconf parlance, this would be called "host architecture" (GIT_HOST_ARCH).
Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
On Fri, Dec 08, 2017 at 06:29:34PM +0100, René Scharfe wrote: > > By the way, I think there's another quite subtle leak in this function. > > We do this: > > > >format_commit_message(commit, "%s", , ); > >strbuf_ltrim(); > > > > and then only use "sb" if sb.len is non-zero. But we may have actually > > allocated to create our zero-length string (e.g., if we had a strbuf > > full of spaces and trimmed them all off). Since we reuse "sb" over and > > over as we loop, this will actually only leak once for the whole loop, > > not once per iteration. So it's probably not a big deal, but writing it > > with the explicit reset/release pattern fixes that (and is more > > idiomatic for our code base, I think). > > It's subtle, but I think it's not leaking, at least not in your example > case (and I can't think of another way). IIUC format_subject(), which > handles the "%s" part, doesn't touch sb if the subject is made up only > of whitespace. Yeah, I suspected that may be the case. But IMHO it is a poor use of strbufs if you have to dig that far to see whether the code leaks or not. The whole point of strbufs is to make string handling and memory ownership more obviously correct. Just skimming the history, I think it's mostly an artifact of the function was slowly converted over the years. -Peff
Re: [PATCH v4 1/4] Makefile: generate Perl header from template file
On Wed, Dec 06 2017, Ævar Arnfjörð Bjarmason jotted: > On Wed, Dec 6, 2017 at 7:56 PM, Daniel Jacqueswrote: >> On Wed, Dec 6, 2017 at 1:47 PM, Junio C Hamano wrote: >>> >>> Johannes Sixt writes: >>> >>> > The updated series works for me now. Nevertheless, I suggest to squash >>> > in the following change to protect against IFS and globbing characters in >>> > $INSTLIBDIR. >>> >>> Yeah, that is very sensible. >>> >>> > diff --git a/Makefile b/Makefile >>> > index 7ac4458f11..08c78a1a63 100644 >>> > --- a/Makefile >>> > +++ b/Makefile >>> > @@ -2072,7 +2072,7 @@ GIT-PERL-HEADER: $(PERL_HEADER_TEMPLATE) >>> > GIT-PERL-DEFINES perl/perl.mak Makefile >>> > INSTLIBDIR_EXTRA='$(PERLLIB_EXTRA_SQ)' && \ >>> > INSTLIBDIR="$$INSTLIBDIR$${INSTLIBDIR_EXTRA:+:$$INSTLIBDIR_EXTRA}" >>> > && \ >>> > sed -e 's=@@PATHSEP@@=$(pathsep)=g' \ >>> > - -e 's=@@INSTLIBDIR@@='$$INSTLIBDIR'=g' \ >>> > + -e 's=@@INSTLIBDIR@@='"$$INSTLIBDIR"'=g' \ >>> > -e 's=@@GITEXECDIR@@=$(gitexecdir_relative_SQ)=g' \ >>> > -e 's=@@PERLLIBDIR@@=$(perllibdir_relative_SQ)=g' \ >>> > $< >$@+ && \ >> >> Sounds good; I'll apply that to my working patch and include it in my >> next ("v5") submission, which is currently blocked pending avarab@'s Perl >> Makefile changes: >> https://public-inbox.org/git/20171129195430.10069-1-ava...@gmail.com/T/#t > > Thanks, FWIW I'll send another version of that at the end of the week > or so, I'm waiting to see if there's any more comments on it to reduce > list churn. Sorry, I got this conflated with my sha1collisiondetection series, I have nothing new to send out. It seems everyone's happy with the version of my v2 (with Junio's on top) from this series, so it's just a matter of waiting.
Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
Stefan Bellerwrites: >>> + >>> + if ((DIFF_FILE_VALID(p->one) && >>> + oidset_contains(options->blobfind, >one->oid)) || >>> + (DIFF_FILE_VALID(p->two) && >>> + oidset_contains(options->blobfind, >two->oid))) { >> >> Shouldn't this make sure that !DIFF_PAIR_UNMERGED(p) before looking >> at the sides of the pair? I think an unmerged pair is queued with >> filespecs whose modes are cleared, but we should not be relying on >> that---I think we even had bugs we fixed by introducing an explicit >> is_unmerged field to the filepair struct. > > I am not sure I follow. IIUC 'unmerged' only ever happens in the > index when there are multiple stages for one path (represented in the > working tree with diff markers). Aren't we supposed to find such > an unmerged blob as well (despite wrong mode, but the oid ought to fit)? I think diff_unmerged() records "this path is unmerged" without giving any object name to either side, so you do not get to compare (or even look at) p->{one,two}->oid with anything. That is why I think checking p->one and p->two like the above code, without making sure that you are looking at a pair that is not unmerged, is wrong.
Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
On Fri, Dec 08, 2017 at 06:29:31PM +0100, René Scharfe wrote: > Am 07.12.2017 um 22:27 schrieb Jeff King: > > Grepping for "list_append.*detach" shows a few other possible cases in > > transport-helper.c, which I think are leaks. > > -- >8 -- > Subject: [PATCH] transport-helper: plug strbuf and string_list leaks > > Transfer ownership of detached strbufs to string_lists of the > duplicating variety by calling string_list_append_nodup() instead of > string_list_append() to avoid duplicating and then leaking the buffer. Thanks, this part looks obviously correct. > While at it make sure to release the string_list when done; > push_refs_with_export() already does that. This one takes a bit more digging. I've been bitten before in Git's code by freeing what appeared to be a leak only to find out that we had passed the pointers off to some other data structure which expected them to persist. Here we feed them to set_helper_option(), which passes them to quote_c_style(), which makes a copy into a strbuf. So I think all is well. -Peff
Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
>> + >> + if ((DIFF_FILE_VALID(p->one) && >> + oidset_contains(options->blobfind, >one->oid)) || >> + (DIFF_FILE_VALID(p->two) && >> + oidset_contains(options->blobfind, >two->oid))) { > > Shouldn't this make sure that !DIFF_PAIR_UNMERGED(p) before looking > at the sides of the pair? I think an unmerged pair is queued with > filespecs whose modes are cleared, but we should not be relying on > that---I think we even had bugs we fixed by introducing an explicit > is_unmerged field to the filepair struct. I am not sure I follow. IIUC 'unmerged' only ever happens in the index when there are multiple stages for one path (represented in the working tree with diff markers). Aren't we supposed to find such an unmerged blob as well (despite wrong mode, but the oid ought to fit)? >> + if (revs->diffopt.blobfind) >> + revs->simplify_history = 0; >> return 0; >> } > > It makes sense to clear this bit by default, but is this an > unconditonal clearing? In other words, is there a way for the user > to countermand this default and ask for merge simplification from > the command line, e.g. "diff --simplify-history --blobfind="? As noted in your reply, this is consistent with other occurrences of simplify_history, so let's keep it this way. >> + >> + git log --abbrev=12 --oneline --blobfind=greeting^{blob} >actual.raw && >> + cut -c 14- actual.raw >actual && >> + test_cmp expect actual > > The hardcoded numbers look strange and smell like a workaround of > not asking for full object names. You mean the 12 and 14 ? Yeah I can fix that to 40 and 42 if you want. I wrote it this way to be agnostic of the actual object id, but thought 12 would be enough for this test no matter the future hash. > Also, now it has the simplify-history bit in the change, don't we > want to check a mergy history, and also running "diff-files" while > the index has unmerged entries? yup, I am working on that.
Re: [PATCH v2 5/7] diff: use skip-to-optional-val in parsing --relative
Jeff Kingwrites: > On Thu, Dec 07, 2017 at 02:31:38PM -0800, Junio C Hamano wrote: > >> If this goes on top as a standalone patch, then the reason why it is >> separate from the other users of _default() is not because the way >> it uses the null return is special, but because it was written by a >> different author, I would think. > > Mostly I was just concerned that we missed a somewhat subtle bug in the > first conversion, and I think it's worth calling out in the commit > message why that callsite must be converted in a particular way. Whether > that happens in a separate commit or squashed, I don't care too much. > > But I dunno. Maybe it is obvious now that the correct code is in there. > ;) It probably is too obvious to me only because the use case like this one that wanted to treat --foo and --foo="" differently was the only reason why I pushed against Christian's original one that hardcoded the equivalence without allowing what the _default() variant lets us do, I think.
Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)
Johannes Schindelinwrites: > We might want to consider using a saner Continuous Testing workflow, to > avoid re-testing (and re-finding) breakages in individual patch series > just because completely unrelated patch got updated. > > I mean, yes, it seemed like a good idea a long time ago to have One Branch > that contains All The Patch Series Currently Cooking, back when our most > reliable (because only) test facilities were poor humans. > > But we see how many more subtle bugs are spotted nowadays where Git's > source code is tested automatically on a growing number of Operating > System/CPU architecture "coordinates", and it is probably time to save > some human resources. > > How about testing the individual branches instead? We would benefit from both, so not "instead", but "in addition" would make more sense. Even if a topic passes a test in isolation, the job of the developer who originally did that topic does not end there, as the topic may break in presence of other topics in flight when tested together with them, and because a project is a team effort, we expect those familiar with the topics involved in such a breakage to all participate in diagnosing and fixing. Ideally, in addition to the tips of these integration branches, and in addition to the tips of topics, it would be nicer if we can test individual new commits. When we see the tip of 'pu' updated from A to B, then git rev-list --no-merges A..B would give us all individual non-merge commits that have been added, and assuming that we have already tested commits back when the tip was at A, these are the only commits that needs testing to see what is broken in the new round. I do not know how easy it is to arrange something like that, though. What we currently run with Travis lets us limit the number of jobs to the number of tentative integration branches; a scheme like that would require quite a lot more test cycles, triggered by a single pushout. (Unscientific numbers) $ git rev-list --count --no-merges pu@{48.hours}..pu@{24.hours} 10 $ git rev-list --count --no-merges pu@{24.hours}..pu 37
Re: [PATCH v1 1/2] t/README: remove mention of adding copyright notices
On 12/05, Jonathan Nieder wrote: > Hi, > > Thomas Gummerer wrote: > > > We generally no longer include copyright notices in new test scripts. > > However t/README still mentions it as something to include at the top of > > every new script. > > Where can I read more about this change? Was it a deliberate change > or something that simply happened? I'm not sure if it was a deliberate change, I just noticed that most new test files don't have a copyright notice anymore. $ git grep "Copyright (c)" t/* | sed -E 's/.*?Copyright .c. ([[:digit:]]+).*?/\1/' | sort | uniq -c 61 2005 40 2006 55 2007 32 2008 31 2009 30 2010 10 2011 14 2012 4 2013 3 2014 1 2015 5 2016 While we may be adding less new test files, we definitely added a few in 2017, for example t/t7521-ignored-mode.sh in 371c80c746 ("status: test ignored modes", 2017-10-30), or t/t0025-crlf-renormalize.sh in 9472935d81 ("add: introduce "--renormalize"", 2017-11-16). > Thanks, > Jonathan
Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
On Fri, Dec 8, 2017 at 12:19 PM, Jeff Kingwrote: > On Fri, Dec 08, 2017 at 04:28:23PM +, Ramsay Jones wrote: > >> On 08/12/17 09:34, Jeff King wrote: >> > On Thu, Dec 07, 2017 at 04:24:47PM -0800, Stefan Beller wrote: >> [snip] >> >> Junio hinted at a different approach of solving this problem, which this >> >> patch implements. Teach the diff machinery another flag for restricting >> >> the information to what is shown. For example: >> >> >> >> $ ./git log --oneline --blobfind=v2.0.0:Makefile >> >> b2feb64309 Revert the whole "ask curl-config" topic for now >> >> 47fbfded53 i18n: only extract comments marked with "TRANSLATORS:" >> >> >> >> we observe that the Makefile as shipped with 2.0 was introduced in >> >> v1.9.2-471-g47fbfded53 and replaced in v2.0.0-rc1-5-gb2feb64309 by >> >> a different blob. >> >> Sorry, this has nothing to do with this specific thread. :( >> >> However, the above caught my eye. Commit b2feb64309 does not 'replace >> the Makefile with a different blob'. That commit was a 'last minute' >> revert of a topic _prior_ to v2.0.0, which resulted in the _same_ >> blob as commit 47fbfded53. (i.e. a53f3a8326c2e62dc79bae7169d64137ac3dab20). > > Right, I think the patch is working as advertised but the commit message > misinterprets the results. :) Thanks for digging. I came to a similar realization. > If you add --raw, you can see that both commits introduce that blob, and > it never "goes away". That's because that happened in a merge, which we > don't diff in a default log invocation. We should when --raw is given. --raw is documented as "For each commit, show a summary of changes using the raw diff format." and I would argue that 'each commit' includes merges. Though I guess this may have implications for long time users. > And in fact finding where this "goes away" is hard, because the merge > doesn't trigger "-c" or "--cc". It's 8eaf517835, which presumably was > forked before the revert in b2feb64309, and then the merge was able to > replace that blob completely with the other side of the merge. > > Viewing with "-m" turns up a bunch of uninteresting merges. Looking at > "--first-parent" is how I got 8eaf517835. > > So I think this one is tricky because of the revert. In the same way > that pathspec-limiting is often tricky in the face of a revert, because > the merges "hide" interesting things happening. yup, hidden merges are unfortunate. Is there an easy way to find out about merges? (Junio hints at having tests around merges, which I'll do next)
Re: [PATCH Outreachy 1/2] format: create pretty.h file
Olga Telezhnayawrites: > archive.c | 1 + > builtin/notes.c | 2 +- > builtin/reset.c | 2 +- > builtin/show-branch.c | 2 +- > combine-diff.c| 1 + > commit.c | 1 + > commit.h | 80 -- > diffcore-pickaxe.c| 1 + > grep.c| 1 + > log-tree.c| 1 + > notes-cache.c | 1 + > pretty.h | 87 > +++ > revision.h| 2 +- > sequencer.c | 1 + > sha1_name.c | 1 + > submodule.c | 1 + > 16 files changed, 101 insertions(+), 84 deletions(-) > create mode 100644 pretty.h > > diff --git a/archive.c b/archive.c > index 0b7b62af0c3ec..60607e8c00857 100644 > --- a/archive.c > +++ b/archive.c > @@ -2,6 +2,7 @@ > #include "config.h" > #include "refs.h" > #include "commit.h" > +#include "pretty.h" > #include "tree-walk.h" > #include "attr.h" > #include "archive.h" This has a toll on topics in flight that expect the symbols for pretty are available in "commit.h"; they are forced to include this new file they did not even know about. I notice that "commit.h" is included in "builtin.h"; perhaps adding a new include for "pretty.h" there would be of lessor impact? I dunno.
Re: Re: bug deleting "unmerged" branch (2.12.3)
From: "Ulrich Windl"Hi Philip! I'm unsure what you are asking for... Ulrich Hi Ulrich, I was doing a retrospective follow up (of the second kind [1]). In your initial email https://public-inbox.org/git/5a1d70fd02a100029...@gwsmtp1.uni-regensburg.de/ you said "I wanted to delete the temporary branch (which is of no use now), I got a message that the branch is unmerged. I think if more than one branches are pointing to the same commit, one should be allowed to delete all but the last one without warning." My retrospectives question was to find what what part of the documentation could be improved to assist fellow coders and Git users in gaining a better understanding here. I think it's an easy mistake [2] to make and that we should try to make the man pages more assistive. I suspect that the description for the `git branch -d` needs a few more words to clarify the 'merged/unmerged' issue for those who recieve the warning message. Or maybe the git-glossary, etc. I tend to believe that most users will read some of the man pages, and would continue to do so if they are useful. I'd welcome any feedback or suggestions you could provide. -- Philip >>> "Philip Oakley" 04.12.17 0.30 Uhr >>> From: "Junio C Hamano" > "Philip Oakley" writes: > >> I think it was that currently you are on M, and neither A nor B are >> ancestors (i.e. merged) of M. >> >> As Junio said:- "branch -d" protects branches that are yet to be >> merged to the **current branch**. > > Actually, I think people loosened this over time and removal of > branch X is not rejected even if the range HEAD..X is not empty, as > long as X is marked to integrate with/build on something else with > branch.X.{remote,merge} and the range X@{upstream}..X is empty. > > So the stress of "current branch" above you added is a bit of a > white lie. Ah, thanks. [I haven't had chance to check the code] The man page does say: .-d .Delete a branch. The branch must be fully merged in its upstream .branch, or in HEAD if no upstream was set with --track .or --set-upstream. It's whether or not Ulrich had joined the two aspects together, and if the doc was sufficient to help recognise the 'unmerged' issue. Ulrich? -- Philip [1] Retrospective Second Directive, section 3.4.2 of (15th Ed) Agile Processes in software engineering and extreme programming. ISBN 1628251042 (for the perspective of the retrospective..) [2] 'mistake' colloquial part of the error categories of slips lapses and mistakes : Human Error, by Reason (James, prof) ISBN 0521314194 (worthwhile)
Re: [WIP 04/15] upload-pack: convert to a builtin
Johannes Schindelinwrites: > On Wed, 6 Dec 2017, Junio C Hamano wrote: > ... >> I vaguely recalled and feared that we on purpose kept this program >> separate from the rest of the system for a reason, but my mailing >> list search is coming up empty. > > I only recall that we kept it in the bin/ directory (as opposed to > mlibexec/git-core/) to help with fetching via SSH. Yes, that is about where it is installed (i.e. on $PATH), which is a different issue. My vague recollection was about what is (and what is not) included in and linked into the program built, with some reason that is different from but similar to the reason why remote helpers that link to curl and openssl libraries are excluded from the builtin deliberately. I know we exclude remote-helpers from builtin in order to save the start-up overhead for other more important built-in commands by not having to link these heavyweight libs. I suspect there was some valid reason why we didn't make upload-pack a built-in, but am failing to recall what the reason was.
Re: [WIP 11/15] serve: introduce git-serve
On 12/07, Junio C Hamano wrote: > Brandon Williamswrites: > > > +static struct protocol_capability *get_capability(const char *key) > > +{ > > + int i; > > + > > + if (!key) > > + return NULL; > > + > > + for (i = 0; i < ARRAY_SIZE(capabilities); i++) { > > + struct protocol_capability *c = [i]; > > + const char *out; > > + if (skip_prefix(key, c->name, ) && (!*out || *out == '=')) > > + return c; > > Looks familiar and resembles what was recently discussed on list ;-) > > > +int cmd_serve(int argc, const char **argv, const char *prefix) > > +{ > > + > > + struct option options[] = { > > + OPT_END() > > + }; > > + > > + /* ignore all unknown cmdline switches for now */ > > + argc = parse_options(argc, argv, prefix, options, grep_usage, > > +PARSE_OPT_KEEP_DASHDASH | > > +PARSE_OPT_KEEP_UNKNOWN); > > + serve(); > > + > > + return 0; > > +} I assume that at some point we may want to have a new endpoint that just does v2 without needing the side channel to tell it to do so. Maybe for brand new server commands, like a remote grep or a remote object-stat or something that don't have a v1 equivalent that can be fallen back to. That's why I included a builtin/serve.c > > ... > > +/* Main serve loop for protocol version 2 */ > > +void serve(void) > > +{ > > + /* serve by default supports v2 */ > > + packet_write_fmt(1, "version 2\n"); > > + > > + advertise_capabilities(); > > + > > + for (;;) > > + if (process_request()) > > + break; > > +} > > I am guessing that this would be run just like upload-pack, > i.e. invoked via ssh or via git-daemon, and that is why it can just > assume that fd#0/fd#1 are already connected to the other end. It > may be helpful to document somewhere how we envision to invoke this > program. > This function I was planning to just be executed by upload-pack and receive-pack when a client requests protocol v2. But yes the idea would be that fd#0/fd#1 would be already setup like they are for upload-pack and receive-pack. -- Brandon Williams
Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
On Fri, Dec 08, 2017 at 04:28:23PM +, Ramsay Jones wrote: > On 08/12/17 09:34, Jeff King wrote: > > On Thu, Dec 07, 2017 at 04:24:47PM -0800, Stefan Beller wrote: > [snip] > >> Junio hinted at a different approach of solving this problem, which this > >> patch implements. Teach the diff machinery another flag for restricting > >> the information to what is shown. For example: > >> > >> $ ./git log --oneline --blobfind=v2.0.0:Makefile > >> b2feb64309 Revert the whole "ask curl-config" topic for now > >> 47fbfded53 i18n: only extract comments marked with "TRANSLATORS:" > >> > >> we observe that the Makefile as shipped with 2.0 was introduced in > >> v1.9.2-471-g47fbfded53 and replaced in v2.0.0-rc1-5-gb2feb64309 by > >> a different blob. > > Sorry, this has nothing to do with this specific thread. :( > > However, the above caught my eye. Commit b2feb64309 does not 'replace > the Makefile with a different blob'. That commit was a 'last minute' > revert of a topic _prior_ to v2.0.0, which resulted in the _same_ > blob as commit 47fbfded53. (i.e. a53f3a8326c2e62dc79bae7169d64137ac3dab20). Right, I think the patch is working as advertised but the commit message misinterprets the results. :) If you add --raw, you can see that both commits introduce that blob, and it never "goes away". That's because that happened in a merge, which we don't diff in a default log invocation. And in fact finding where this "goes away" is hard, because the merge doesn't trigger "-c" or "--cc". It's 8eaf517835, which presumably was forked before the revert in b2feb64309, and then the merge was able to replace that blob completely with the other side of the merge. Viewing with "-m" turns up a bunch of uninteresting merges. Looking at "--first-parent" is how I got 8eaf517835. So I think this one is tricky because of the revert. In the same way that pathspec-limiting is often tricky in the face of a revert, because the merges "hide" interesting things happening. -Peff
Re: [WIP 07/15] connect: convert get_remote_heads to use struct packet_reader
On 12/06, Junio C Hamano wrote: > Brandon Williamswrites: > > > EXPECTING_DONE sounded like we are expecting to see 'done' packet > sent from the other side, but I was mistaken. It is the state > where we are "done" expecting anything ;-). > > Having an (unconditional) assignment to 'state' in the above switch > makes me feel somewhat uneasy, as the next "switch (state)" is what > is meant as the state machine that would allow us to say things like > "from this state, transition to that state is impossible". When we > get a flush while we are expecting the first ref, for example, we'd > just go into the "done" state. There is no provision for a future > update to say "no, getting a flush in this state is an error". I believe this is accepted behavior, receiving a flush in that state. And I don't think there is ever an instance during the ref advertisement where a flush would be an error. It just indicates that the advertisement is finished. > > That is no different from the current code; when read_remote_ref() > notices that it got a flush, it just leaves the loop without even > touching 'state' variable. But at least, I find that the current > code is more honest---it does not even touch 'state' and allows the > code after the loop to inspect it, if needed. From that point of > vhew, the way the new code uses 'state' to leave the loop upon > seeing a flush is a regression---it makes it harder to notice and > report when we got a flush in a wrong state. > > Perhaps getting rid of "EXPECTING_DONE" from the enum and then: > > int got_flush = 0; > while (1) { > switch (reader_read()) { > case PACKET_READ_FLUSH: > got_flush = 1; > break; > ... other cases ... > } > > if (got_flush) > break; > > switch (state) { > ... current code ... > } > } > > would be an improvement; we can later extend "if (got_flush)" part > to check what state we are in if we wanted to notice and report an > error there before breaking out of the loop. > I don't really see how this is any clearer from what this patch does though. I thought it made it easier to read as you no longer have an infinite loop, but rather know when it will end (when you move to the done state). -- Brandon Williams
Re: [PATCH] partial-clone: design doc
Jeff Hostetlerwrites: > From: Jeff Hostetler > > First draft of design document for partial clone feature. > > Signed-off-by: Jeff Hostetler > Signed-off-by: Jonathan Tan > --- Thanks. > +Non-Goals > +- > + > +Partial clone is independent of and not intended to conflict with > +shallow-clone, refspec, or limited-ref mechanisms since these all operate > +at the DAG level whereas partial clone and fetch works *within* the set > +of commits already chosen for download. It probably is not a huge deal (simply because it is about "Non-Goals") but I have no idea what "refspec" and "limited-ref mechanism" refer to in the above sentence, and I suspect many others share the same puzzlement. > +An object may be missing due to a partial clone or fetch, or missing due > +to repository corruption. To differentiate these cases, the local > +repository specially indicates packfiles obtained from the promisor > +remote. These "promisor packfiles" consist of a ".promisor" file > +with arbitrary contents (like the ".keep" files), in addition to > +their ".pack" and ".idx" files. (In the future, this ability > +may be extended to loose objects[a].) > + ... > +Foot Notes > +-- > + > +[a] Remembering that loose objects are promisor objects is mainly > +important for trees, since they may refer to promisor blobs that > +the user does not have. We do not need to mark loose blobs as > +promisor because they do not refer to other objects. I fail to see any logical link between the "loose" and "tree". Putting it differently, I do not see why "tree" is so special. A promisor pack that contains a tree but lacks blobs the tree refers to would be sufficient to let us remember that these missing blobs are not corruption. A loose commit or a tag that is somehow marked as obtained from a promisor, if it can serve just like a commit or a tag in a promisor pack to promise its direct pointee, would equally be useful (if very inefficient). In any case, I suspect "since they may refer to promisor blobs" is a typo of "since they may refer to promised blobs". > +- Currently, dynamic object fetching invokes fetch-pack for each item > + because most algorithms stumble upon a missing object and need to have > + it resolved before continuing their work. This may incur significant > + overhead -- and multiple authentication requests -- if many objects are > + needed. > + > + We need to investigate use of a long-running process, such as proposed > + in [5,6] to reduce process startup and overhead costs. Also perhaps in some operations we can enumerate the objects we will need upfront and ask for them in one go (e.g. "git log -p A..B" may internally want to do "rev-list --objects A..B" to enumerate trees and blobs that we may lack upfront). I do not think having the other side guess is a good idea, though. > +- We currently only promisor packfiles. We need to add support for > + promisor loose objects as described earlier. The earlier description was not convincing enough to feel the need to me; at least not yet.
Re: [WIP 04/15] upload-pack: convert to a builtin
On 12/06, Junio C Hamano wrote: > Brandon Williamswrites: > > > In order to allow for code sharing with the server-side of fetch in > > protocol-v2 convert upload-pack to be a builtin. > > > > Signed-off-by: Brandon Williams > > --- > > This looks obvious and straight-forward to a cursory look. > > I vaguely recalled and feared that we on purpose kept this program > separate from the rest of the system for a reason, but my mailing > list search is coming up empty. > > > Makefile | 3 ++- > > builtin.h | 1 + > > git.c | 1 + > > upload-pack.c | 2 +- > > 4 files changed, 5 insertions(+), 2 deletions(-) > > ... > > diff --git a/upload-pack.c b/upload-pack.c > > index ef99a029c..2d16952a3 100644 > > --- a/upload-pack.c > > +++ b/upload-pack.c > > @@ -1033,7 +1033,7 @@ static int upload_pack_config(const char *var, const > > char *value, void *unused) > > return parse_hide_refs_config(var, value, "uploadpack"); > > } > > > > -int cmd_main(int argc, const char **argv) > > +int cmd_upload_pack(int argc, const char **argv, const char *prefix) > > { > > const char *dir; > > int strict = 0; > > Shouldn't this file be moved to builtin/ directory, though? I can definitely move the file to builtin if you would prefer. -- Brandon Williams
Re: [WIP 08/15] connect: discover protocol version outside of get_remote_heads
On 12/07, Junio C Hamano wrote: > Brandon Williamswrites: > > > While we could wrap the preamble into a function it sort of defeats the > > purpose since you want to be able to call different functions based on > > the protocol version you're speaking. That way you can have hard > > separations between the code paths which operate on v0/v1 and v2. > > As I envision that the need for different processing across protocol > versions in real code would become far greater than it would fit in > cases within a single switch() statement, I imagined that the reader > structure would have a field that says which version of the protocol > it is, so that the caller of the preamble thing can inspect it later > to switch on it. > Yes, patch 9 changes this so that the protocol version is stored in the transport structure. This way the fetch and push code can know what to do in v2. -- Brandon Williams
Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
Am 08.12.2017 um 19:44 schrieb Junio C Hamano: > René Scharfewrites: > >> Am 07.12.2017 um 22:27 schrieb Jeff King: >>> Grepping for "list_append.*detach" shows a few other possible cases in >>> transport-helper.c, which I think are leaks. >> >> -- >8 -- >> Subject: [PATCH] transport-helper: plug strbuf and string_list leaks >> >> Transfer ownership of detached strbufs to string_lists of the >> duplicating variety by calling string_list_append_nodup() instead of >> string_list_append() to avoid duplicating and then leaking the buffer. >> >> While at it make sure to release the string_list when done; >> push_refs_with_export() already does that. >> >> Reported-by: Jeff King >> Signed-off-by: Rene Scharfe >> --- >> transport-helper.c | 7 +-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/transport-helper.c b/transport-helper.c >> index bf05a2dcf1..f682e7c534 100644 >> --- a/transport-helper.c >> +++ b/transport-helper.c >> @@ -882,7 +882,8 @@ static int push_refs_with_push(struct transport >> *transport, >> struct strbuf cas = STRBUF_INIT; >> strbuf_addf(, "%s:%s", >> ref->name, >> oid_to_hex(>old_oid_expect)); >> -string_list_append(_options, strbuf_detach(, >> NULL)); >> +string_list_append_nodup(_options, >> + strbuf_detach(, NULL)); >> } >> } >> if (buf.len == 0) { >> @@ -897,6 +898,7 @@ static int push_refs_with_push(struct transport >> *transport, >> strbuf_addch(, '\n'); >> sendline(data, ); >> strbuf_release(); >> +string_list_release(_options, 0); > > There is no such function; you meant _clear() perhaps? Yes, of course, I'm sorry. Not sure what happened there. O_o René
Re: [WIP 03/15] pkt-line: add delim packet support
On 12/07, Stefan Beller wrote: > On Mon, Dec 4, 2017 at 3:58 PM, Brandon Williamswrote: > > One of the design goals of protocol-v2 is to improve the semantics of > > flush packets. Currently in protocol-v1, flush packets are used both to > > indicate a break in a list of packet lines as well as an indication that > > one side has finished speaking. This makes it particularly difficult > > to implement proxies as a proxy would need to completely understand git > > protocol instead of simply looking for a flush packet. > > > > To do this, introduce the special deliminator packet '0001'. A delim > > packet can then be used as a deliminator between lists of packet lines > > while flush packets can be reserved to indicate the end of a response. > > > > Signed-off-by: Brandon Williams > > I presume the update for Documentation/technical/* comes at a later patch in > the > series, clarifying the exact semantic difference between the packet types? Yeah, currently there isn't a use for the delim packet but there will be one when v2 is introduced. -- Brandon Williams
[PATCH] Partial clone design document
From: Jeff HostetlerThis patch contains a design document that Jonathan Tan and I have been working on that describes the partial clone feature currently under development. Since edits to this document are independent of the code, I did not include in the part 1,2,3 patch series. Please let us know if there are any major sections we should add or any areas that need clarification. Thanks! Jeff Hostetler (1): partial-clone: design doc Documentation/technical/partial-clone.txt | 240 ++ 1 file changed, 240 insertions(+) create mode 100644 Documentation/technical/partial-clone.txt -- 2.9.3
[PATCH] partial-clone: design doc
From: Jeff HostetlerFirst draft of design document for partial clone feature. Signed-off-by: Jeff Hostetler Signed-off-by: Jonathan Tan --- Documentation/technical/partial-clone.txt | 240 ++ 1 file changed, 240 insertions(+) create mode 100644 Documentation/technical/partial-clone.txt diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt new file mode 100644 index 000..7ab39d8 --- /dev/null +++ b/Documentation/technical/partial-clone.txt @@ -0,0 +1,240 @@ +Partial Clone Design Notes +== + +The "Partial Clone" feature is a performance optimization for git that +allows git to function without having a complete copy of the repository. + +During clone and fetch operations, git normally downloads the complete +contents and history of the repository. That is, during clone the client +receives all of the commits, trees, and blobs in the repository into a +local ODB. Subsequent fetches extend the local ODB with any new objects. +For large repositories, this can take significant time to download and +large amounts of diskspace to store. + +The goal of this work is to allow git better handle extremely large +repositories. Often in these repositories there are many files that the +user does not need such as ancient versions of source files, files in +portions of the worktree outside of the user's work area, or large binary +assets. If we can avoid downloading such unneeded objects *in advance* +during clone and fetch operations, we can decrease download times and +reduce ODB disk usage. + + +Non-Goals +- + +Partial clone is independent of and not intended to conflict with +shallow-clone, refspec, or limited-ref mechanisms since these all operate +at the DAG level whereas partial clone and fetch works *within* the set +of commits already chosen for download. + + +Design Overview +--- + +Partial clone logically consists of the following parts: + +- A mechanism for the client to describe unneeded or unwanted objects to + the server. + +- A mechanism for the server to omit such unwanted objects from packfiles + sent to the client. + +- A mechanism for the client to gracefully handle missing objects (that + were previously omitted by the server). + +- A mechanism for the client to backfill missing objects as needed. + + +Design Details +-- + +- A new pack-protocol capability "filter" is added to the fetch-pack and + upload-pack negotiation. + + This uses the existing capability discovery mechanism. + See "filter" in Documentation/technical/pack-protocol.txt. + +- Clients pass a "filter-spec" to clone and fetch which is passed to the + server to request filtering during packfile construction. + + There are various filters available to accomodate different situations. + See "--filter=" in Documentation/rev-list-options.txt. + +- On the server pack-objects applies the requested filter-spec as it + creates "filtered" packfiles for the client. + + These filtered packfiles are incomplete in the traditional sense because + they may contain trees that reference blobs that the client does not have. + + + How the local repository gracefully handles missing objects + +With partial clone, the fact that objects can be missing makes such +repositories incompatible with older versions of Git, necessitating a +repository extension (see the documentation of "extensions.partialClone" +for more information). + +An object may be missing due to a partial clone or fetch, or missing due +to repository corruption. To differentiate these cases, the local +repository specially indicates packfiles obtained from the promisor +remote. These "promisor packfiles" consist of a ".promisor" file +with arbitrary contents (like the ".keep" files), in addition to +their ".pack" and ".idx" files. (In the future, this ability +may be extended to loose objects[a].) + +The local repository considers a "promisor object" to be an object that +it knows (to the best of its ability) that the promisor remote has, +either because the local repository has that object in one of its +promisor packfiles, or because another promisor object refers to it. Git +can then check if the missing object is a promisor object, and if yes, +this situation is common and expected. This also means that there is no +need to explicitly maintain an expensive-to-modify list of missing +objects on the client. + +Almost all Git code currently expects any objects referred to by other +objects to be present. Therefore, a fallback mechanism is added: +whenever Git attempts to read an object that is found to be missing, it +will attempt to fetch it from the promisor remote, expanding the subset +of objects available locally, then reattempt the read. This allows +objects to be "faulted in" from the promisor remote without complicated +prediction algorithms. For
Re: [PATCH Outreachy 2/2] format: create docs for pretty.h
On Fri, Dec 8, 2017 at 8:21 AM, Olga Telezhnayawrote: > Write some docs for functions in pretty.h. > Take it as a first draft, they would be changed later. > > Signed-off-by: Olga Telezhnaia > Mentored-by: Christian Couder > Mentored by: Jeff King > --- > diff --git a/pretty.h b/pretty.h > @@ -57,31 +58,74 @@ struct userformat_want { > +/* > + * Create a text message about commit using given "format" and "context". > + * Put the result to "sb". > + * Please use this function for custom formats. > + */ > void format_commit_message(const struct commit *commit, > const char *format, struct strbuf *sb, > const struct pretty_print_context *context); > > +/* > + * Parse given arguments from "arg", check it for correctness and > + * fill struct rev_info. To be consistent with the way you formatted the other comments, I think you'd want quotes around rev_info. > + */ > void get_commit_format(const char *arg, struct rev_info *);
Re: [PATCH v1 1/2] t0027: Don't use git commit
On Fri, Dec 08, 2017 at 10:21:19AM -0800, Junio C Hamano wrote: > Junio C Hamanowrites: > > > tbo...@web.de writes: > > > >> From: Torsten Bögershausen > >> > >> Replace `git commit -m "comment" ""` with `git commit -m "comment"` to > >> remove the empty path spec. > >> > >> Signed-off-by: Torsten Bögershausen > >> --- > >> t/t0027-auto-crlf.sh | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > > > > This looks a bit strange. The intent seems to commit all changes > > made in the working tree, so I'd understand it if it replaced the > > empty string with a single dot. > > > > I also thought that we deprecated use of an empty string as "match > > all" pathspec recently, and expected that this to break. > > > > ... goes and looks ... > > > > Indeed, 229a95aa ("t0027: do not use an empty string as a pathspec > > element", 2017-06-23) does exactly that. > > OK, I think I can safely omit this patch, because > > (1) when 2/2 is queued on top of tb/check-crlf-for-safe-crlf, > because ex/deprecate-empty-pathspec-as-match-all is not yet in > the topic, an empty pathspec still means "match all" and > nothing breaks; and > > (2) when tb/check-crlf-for-safe-crlf plus 2/2 is merged to any > branch with ex/deprecate-empty-pathspec-as-match-all, the topic > already fixes what this 1/2 tries to > > Thanks for being thorough, though. > Sure, the credit goes 100% to you: commit 229a95aafa77b583b46a3156b4fad469c264ddfd Author: Junio C Hamano Date: Fri Jun 23 11:04:14 2017 -0700 t0027: do not use an empty string as a pathspec element My brain did just assume that this had mad it to master, sorry for the noise
Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
René Scharfewrites: > Am 07.12.2017 um 22:27 schrieb Jeff King: >> Grepping for "list_append.*detach" shows a few other possible cases in >> transport-helper.c, which I think are leaks. > > -- >8 -- > Subject: [PATCH] transport-helper: plug strbuf and string_list leaks > > Transfer ownership of detached strbufs to string_lists of the > duplicating variety by calling string_list_append_nodup() instead of > string_list_append() to avoid duplicating and then leaking the buffer. > > While at it make sure to release the string_list when done; > push_refs_with_export() already does that. > > Reported-by: Jeff King > Signed-off-by: Rene Scharfe > --- > transport-helper.c | 7 +-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/transport-helper.c b/transport-helper.c > index bf05a2dcf1..f682e7c534 100644 > --- a/transport-helper.c > +++ b/transport-helper.c > @@ -882,7 +882,8 @@ static int push_refs_with_push(struct transport > *transport, > struct strbuf cas = STRBUF_INIT; > strbuf_addf(, "%s:%s", > ref->name, > oid_to_hex(>old_oid_expect)); > - string_list_append(_options, strbuf_detach(, > NULL)); > + string_list_append_nodup(_options, > + strbuf_detach(, NULL)); > } > } > if (buf.len == 0) { > @@ -897,6 +898,7 @@ static int push_refs_with_push(struct transport > *transport, > strbuf_addch(, '\n'); > sendline(data, ); > strbuf_release(); > + string_list_release(_options, 0); There is no such function; you meant _clear() perhaps?
Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
René Scharfewrites: >> I'm not sure it's string-list's fault. Many callers (including this one) >> ... > The two modes (dup/nodup) make string_list code tricky. Not sure > how far we'd get with something simpler (e.g. an array of char pointers), > but having the caller do all string allocations would make the code > easier to analyze. Yes. It probably would have been more sensible if the API did not have two modes (instead, have the caller pass whatever string to be stored, *and* make the caller responsible for freeing them *if* it passed an allocated string). For the push_refs_with_push() patch you sent, another possible fix would be to make cas_options a nodup kind so that the result of strbuf_detach() does not get an extra strdup to be lost when placed in cas_options. With the current string-list API that would not quite work, because freeing done in _release() is tied to the "dup/nodup" ness of the string list. I think there even is a codepath that initializes a string_list as nodup kind, stuffs string in it giving the ownership, and then flips it into dup kind just before calling _release() only to have it free the strings, or something silly/ugly like that. In any case, the patch looks sensible. Thanks for plugging the leaks.
Re: [PATCH v7 00/16] Parial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests
Jeff Hostetlerwrites: > On 12/8/2017 12:58 PM, Junio C Hamano wrote: >> Jeff Hostetler writes: >> >>> From: Jeff Hostetler >>> >>> This is V7 of part 3 of partial clone. It builds upon V7 of part 2 >>> (which builds upon V6 of part 1). >> >> Aren't the three patches at the bottom sort-of duplicate from the >> part 2 series? >> > > oops. yes, you're right. it looks like i selected pc*6*_p2..pc7_p3 > rather than pc*7*_p2..pc7_p3. sorry for the typo. > > and since the only changes in p2 were to squash those 2 commits near > the tip of p2, only those 3 commits changed SHAs in v7 over v6. > > so, please disregard the duplicates. > > would you like me to send a corrected V8 for p3 ? Nah. I just wanted to make sure that I am discarding the right ones (i.e. 1-3/16 of partial-clone, not 8-10/10 of fsck-promisors). Thanks for an update.
Re: [PATCH v1 1/2] t0027: Don't use git commit
Junio C Hamanowrites: > tbo...@web.de writes: > >> From: Torsten Bögershausen >> >> Replace `git commit -m "comment" ""` with `git commit -m "comment"` to >> remove the empty path spec. >> >> Signed-off-by: Torsten Bögershausen >> --- >> t/t0027-auto-crlf.sh | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) > > This looks a bit strange. The intent seems to commit all changes > made in the working tree, so I'd understand it if it replaced the > empty string with a single dot. > > I also thought that we deprecated use of an empty string as "match > all" pathspec recently, and expected that this to break. > > ... goes and looks ... > > Indeed, 229a95aa ("t0027: do not use an empty string as a pathspec > element", 2017-06-23) does exactly that. OK, I think I can safely omit this patch, because (1) when 2/2 is queued on top of tb/check-crlf-for-safe-crlf, because ex/deprecate-empty-pathspec-as-match-all is not yet in the topic, an empty pathspec still means "match all" and nothing breaks; and (2) when tb/check-crlf-for-safe-crlf plus 2/2 is merged to any branch with ex/deprecate-empty-pathspec-as-match-all, the topic already fixes what this 1/2 tries to Thanks for being thorough, though.
Re: [PATCH v1 1/2] t0027: Don't use git commit
tbo...@web.de writes: > From: Torsten Bögershausen> > Replace `git commit -m "comment" ""` with `git commit -m "comment"` to > remove the empty path spec. > > Signed-off-by: Torsten Bögershausen > --- > t/t0027-auto-crlf.sh | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) This looks a bit strange. The intent seems to commit all changes made in the working tree, so I'd understand it if it replaced the empty string with a single dot. I also thought that we deprecated use of an empty string as "match all" pathspec recently, and expected that this to break. ... goes and looks ... Indeed, 229a95aa ("t0027: do not use an empty string as a pathspec element", 2017-06-23) does exactly that. > diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh > index c2c208fdcd..97154f5c79 100755 > --- a/t/t0027-auto-crlf.sh > +++ b/t/t0027-auto-crlf.sh > @@ -370,7 +370,7 @@ test_expect_success 'setup master' ' > echo >.gitattributes && > git checkout -b master && > git add .gitattributes && > - git commit -m "add .gitattributes" "" && > + git commit -m "add .gitattributes" && > printf "\$Id: > \$\nLINEONE\nLINETWO\nLINETHREE" >LF && > printf "\$Id: > \$\r\nLINEONE\r\nLINETWO\r\nLINETHREE" >CRLF && > printf "\$Id: > \$\nLINEONE\r\nLINETWO\nLINETHREE" >CRLF_mix_LF &&
Re: [WIP 02/15] pkt-line: introduce struct packet_reader
On 12/07, Stefan Beller wrote: > On Mon, Dec 4, 2017 at 3:58 PM, Brandon Williamswrote: > > Sometimes it is advantageous to be able to peek the next packet line > > without consuming it (e.g. to be able to determine the protocol version > > a server is speaking). In order to do that introduce 'struct > > packet_reader' which is an abstraction around the normal packet reading > > logic. This enables a caller to be able to peek a single line at a time > > using 'packet_reader_peek()' and having a caller consume a line by > > calling 'packet_reader_read()'. > > > > Signed-off-by: Brandon Williams > > --- > > pkt-line.c | 61 > > + > > pkt-line.h | 20 > > 2 files changed, 81 insertions(+) > > > > diff --git a/pkt-line.c b/pkt-line.c > > index ac619f05b..518109bbe 100644 > > --- a/pkt-line.c > > +++ b/pkt-line.c > > @@ -406,3 +406,64 @@ ssize_t read_packetized_to_strbuf(int fd_in, struct > > strbuf *sb_out) > > } > > return sb_out->len - orig_len; > > } > > + > > +/* Packet Reader Functions */ > > +void packet_reader_init(struct packet_reader *reader, int fd, > > + char *src_buffer, size_t src_len) > > +{ > > + reader->fd = fd; > > + reader->src_buffer = src_buffer; > > + reader->src_len = src_len; > > + reader->buffer = packet_buffer; > > + reader->buffer_size = sizeof(packet_buffer); > > + reader->options = PACKET_READ_CHOMP_NEWLINE | > > PACKET_READ_GENTLE_ON_EOF; > > I assume the future user of this packet reader will need exactly > these options coincidentally. ;) > > I think it might be ok for now and later we can extend the reader if needed > to also take the flags as args. However given this set of args, this is a > gentle > packet reader, as it corresponds to the _gently version of reading > packets AFAICT. > > Unlike the pkt_read function this constructor of a packet reader doesn't need > the arguments for its buffer (packet_buffer and sizeof thereof), which > packet_read > unfortunately needs. We pass in packet_buffer all the time except in > builtin/receive-pack > for obtaining the gpg cert. I think that's ok here. Yeah, all of the callers I wanted to migrate all passed the same flags and same buffer. I figured there may be a point in the future where we'd want to extend this so instead of hard coding the flags in the read functions, I did so in the constructor. That way if we wanted to extend it in the future, it would be a little bit easier. > > > +enum packet_read_status packet_reader_read(struct packet_reader *reader) > > +{ > > + if (reader->line_peeked) { > > + reader->line_peeked = 0; > > + return reader->status; > > + } > > + > > + reader->status = packet_read_with_status(reader->fd, > > +>src_buffer, > > +>src_len, > > +reader->buffer, > > +reader->buffer_size, > > +>pktlen, > > +reader->options); > > + > > + switch (reader->status) { > > + case PACKET_READ_ERROR: > > + reader->pktlen = -1; > > In case of error the read function might not > have assigned the pktlen as requested, so we assign > it to -1/NULL here. Though the caller ought to already know > that they handle bogus, as the state is surely the first thing > they'd inspect? Exactly, I thought it would be easier to ensure that pktlen and line were had consistent state no matter what the output of the read was. But yes, callers should be looking at the status to determine what to do. > > > + reader->line = NULL; > > + break; > > + case PACKET_READ_NORMAL: > > + reader->line = reader->buffer; > > + break; > > + case PACKET_READ_FLUSH: > > + reader->pktlen = 0; > > + reader->line = NULL; > > + break; > > + } > > Oh, this gives an interesting interface for someone who is > just inspecting the len/line instead of the state, so it might be > worth keeping it this way. > > Can we have an API documentation in the header file, > explaining what to expect in each field given the state > of the (read, peaked) packet? Yep I can write some better API docs for these functions. -- Brandon Williams
Re: [PATCH Outreachy 1/2] format: create pretty.h file
On Fri, Dec 8, 2017 at 8:21 AM, Olga Telezhnayawrote: > Create header for pretty.c to make formatting interface more structured. > This is a middle point, this file would be merged futher with other s/futher/further/ > files which contain formatting stuff. > > Signed-off-by: Olga Telezhnaia > Mentored-by: Christian Couder > Mentored by: Jeff King
Re: [PATCH v7 00/16] Parial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests
On 12/8/2017 12:58 PM, Junio C Hamano wrote: Jeff Hostetlerwrites: From: Jeff Hostetler This is V7 of part 3 of partial clone. It builds upon V7 of part 2 (which builds upon V6 of part 1). Aren't the three patches at the bottom sort-of duplicate from the part 2 series? oops. yes, you're right. it looks like i selected pc*6*_p2..pc7_p3 rather than pc*7*_p2..pc7_p3. sorry for the typo. and since the only changes in p2 were to squash those 2 commits near the tip of p2, only those 3 commits changed SHAs in v7 over v6. so, please disregard the duplicates. would you like me to send a corrected V8 for p3 ? Jeff
Re: [WIP 01/15] pkt-line: introduce packet_read_with_status
On 12/07, Stefan Beller wrote: > On Mon, Dec 4, 2017 at 3:58 PM, Brandon Williamswrote: > > > diff --git a/pkt-line.h b/pkt-line.h > > index 3dad583e2..f1545929b 100644 > > --- a/pkt-line.h > > +++ b/pkt-line.h > > @@ -60,8 +60,16 @@ int write_packetized_from_buf(const char *src_in, size_t > > len, int fd_out); > > * If options contains PACKET_READ_CHOMP_NEWLINE, a trailing newline (if > > * present) is removed from the buffer before returning. > > */ > > +enum packet_read_status { > > + PACKET_READ_ERROR = -1, > > + PACKET_READ_NORMAL, > > + PACKET_READ_FLUSH, > > +}; > > #define PACKET_READ_GENTLE_ON_EOF (1u<<0) > > #define PACKET_READ_CHOMP_NEWLINE (1u<<1) > > +enum packet_read_status packet_read_with_status(int fd, char **src_buffer, > > size_t *src_len, > > + char *buffer, unsigned > > size, int *pktlen, > > + int options); > > int packet_read(int fd, char **src_buffer, size_t *src_len, char > > *buffer, unsigned size, int options); > > The documentation that is preceding these lines is very specific to > packet_read, e.g. > > If options does contain PACKET_READ_GENTLE_ON_EOF, > we will not die on condition 4 (truncated input), but instead return -1 > > which doesn't hold true for the _status version. Can you adapt the comment > to explain both read functions? Good point, I'll makes changes and document the _status version separately. -- Brandon Williams
Re: [PATCH] doc: reword gitworflows for neutrality
On Fri, Dec 8, 2017 at 10:18 AM, Daniel Bensoussanwrote: > doc: reword gitworflows for neutrality s/gitworflows/gitworkflows/ > Changed 'he' to 'them' to be more neutral in "gitworkflows.txt". > > See discussion at: > https://public-inbox.org/git/xmqqvahieeqy@gitster.mtv.corp.google.com/ > > Signed-off-by: Matthieu Moy > Signed-off-by: Timothee Albertin > Signed-off-by: Nathan Payre > Signed-off-by: Daniel Bensoussan > --- > diff --git a/Documentation/gitworkflows.txt b/Documentation/gitworkflows.txt > @@ -407,8 +407,8 @@ follows. > -Occasionally, the maintainer may get merge conflicts when he tries to > -pull changes from downstream. In this case, he can ask downstream to > +Occasionally, the maintainer may get merge conflicts when they try to > +pull changes from downstream. In this case, they can ask downstream to As a native English speaker, I find the new phrasing odd, and think this may a step backward. How about trying a different approach? For example: Occasionally, the maintainer may get merge conflicts when trying to pull changes from downstream. In this case, it may make sense to ask downstream to do the merge and resolve the conflicts instead (since, presumably, downstream will know better how to resolve them). > do the merge and resolve the conflicts themselves (perhaps they will > know better how to resolve them). It is one of the rare cases where > downstream 'should' merge from upstream.
Re: [PATCH v7 00/16] Parial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests
Jeff Hostetlerwrites: > From: Jeff Hostetler > > This is V7 of part 3 of partial clone. It builds upon V7 of part 2 > (which builds upon V6 of part 1). Aren't the three patches at the bottom sort-of duplicate from the part 2 series?
Re: [PATCH 2/2] version --build-options: report commit, too, if possible
Jonathan Niederwrites: >> We need to be careful, though, to report when the current commit cannot be >> determined, e.g. when building from a tarball without any associated Git >> repository. > > This means that on Debian, it would always print > > built from commit: (unknown) > > Maybe I shouldn't care, but I wonder if there's a way to improve on > that. E.g. should there be a makefile knob to allow Debian to specify > what to put there? Another "interesting" possibility is to build from a tarball extracted into a directory hierarchy that is controlled by an unrelated Git repository. E.g. "my $HOME is under $HOME/.git repository, and then I have a tarball extract in $HOME/src/git". We shouldn't embed the HEAD commit of that $HOME directory project in the resulting executable in such a case. We should be able to do this by being a bit more careful than the presented patch. Make sure that the toplevel is at the same directory as we assumed to be (i.e. where we found that Makefile) and trust rev-parse output only when that is the case, or something like that.
[PATCH v1 1/2] t0027: Don't use git commit
From: Torsten BögershausenReplace `git commit -m "comment" ""` with `git commit -m "comment"` to remove the empty path spec. Signed-off-by: Torsten Bögershausen --- t/t0027-auto-crlf.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh index c2c208fdcd..97154f5c79 100755 --- a/t/t0027-auto-crlf.sh +++ b/t/t0027-auto-crlf.sh @@ -370,7 +370,7 @@ test_expect_success 'setup master' ' echo >.gitattributes && git checkout -b master && git add .gitattributes && - git commit -m "add .gitattributes" "" && + git commit -m "add .gitattributes" && printf "\$Id: \$\nLINEONE\nLINETWO\nLINETHREE" >LF && printf "\$Id: \$\r\nLINEONE\r\nLINETWO\r\nLINETHREE" >CRLF && printf "\$Id: \$\nLINEONE\r\nLINETWO\nLINETHREE" >CRLF_mix_LF && -- 2.15.1.271.g1a4e40aa5d
[PATCH v1 2/2] t0027: Adapt the new MIX tests to Windows
From: Torsten BögershausenThe new MIX tests don't pass under Windows, adapt them to use the correct native line ending. Signed-off-by: Torsten Bögershausen --- Sorry for the breakage. This needs to go on top of tb/check-crlf-for-safe-crlf t/t0027-auto-crlf.sh | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh index 97154f5c79..8d929b76dc 100755 --- a/t/t0027-auto-crlf.sh +++ b/t/t0027-auto-crlf.sh @@ -170,22 +170,22 @@ commit_MIX_chkwrn () { git -c core.autocrlf=$crlf add $fname 2>"${pfx}_$f.err" done - test_expect_success "commit file with mixed EOL crlf=$crlf attr=$attr LF" ' + test_expect_success "commit file with mixed EOL onto LF crlf=$crlf attr=$attr" ' check_warning "$lfwarn" ${pfx}_LF.err ' - test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF" ' + test_expect_success "commit file with mixed EOL onto CLRF attr=$attr aeol=$aeol crlf=$crlf" ' check_warning "$crlfwarn" ${pfx}_CRLF.err ' - test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_mix_LF" ' + test_expect_success "commit file with mixed EOL onto CRLF_mix_LF attr=$attr aeol=$aeol crlf=$crlf" ' check_warning "$lfmixcrlf" ${pfx}_CRLF_mix_LF.err ' - test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf LF_mix_cr" ' + test_expect_success "commit file with mixed EOL onto LF_mix_cr attr=$attr aeol=$aeol crlf=$crlf " ' check_warning "$lfmixcr" ${pfx}_LF_mix_CR.err ' - test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_nul" ' + test_expect_success "commit file with mixed EOL onto CRLF_nul attr=$attr aeol=$aeol crlf=$crlf" ' check_warning "$crlfnul" ${pfx}_CRLF_nul.err ' } @@ -378,7 +378,7 @@ test_expect_success 'setup master' ' printf "\$Id: \$\r\nLINEONE\r\nLINETWO\rLINETHREE" >CRLF_mix_CR && printf "\$Id: \$\r\nLINEONEQ\r\nLINETWO\r\nLINETHREE" | q_to_nul >CRLF_nul && printf "\$Id: \$\nLINEONEQ\nLINETWO\nLINETHREE" | q_to_nul >LF_nul && - create_NNO_MIX_files CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF && + create_NNO_MIX_files && git -c core.autocrlf=false add NNO_*.txt MIX_*.txt && git commit -m "mixed line endings" && test_tick @@ -441,13 +441,14 @@ test_expect_success 'commit files attr=crlf' ' ' # Commit "CRLFmixLF" on top of these files already in the repo: -# LF, CRLF, CRLFmixLF LF_mix_CR CRLFNULL +# mixed mixed mixed mixed mixed +# onto onto ontoonto onto # attrLFCRLF CRLFmixLF LF_mix_CR CRLFNUL commit_MIX_chkwrn "" "" false """""" "" "" commit_MIX_chkwrn "" "" true"LF_CRLF" """" "LF_CRLF" "LF_CRLF" commit_MIX_chkwrn "" "" input "CRLF_LF" """" "CRLF_LF" "CRLF_LF" -commit_MIX_chkwrn "auto" "" false "CRLF_LF" """" "CRLF_LF" "CRLF_LF" +commit_MIX_chkwrn "auto" "" false "$WAMIX" """" "$WAMIX""$WAMIX" commit_MIX_chkwrn "auto" "" true"LF_CRLF" """" "LF_CRLF" "LF_CRLF" commit_MIX_chkwrn "auto" "" input "CRLF_LF" """" "CRLF_LF" "CRLF_LF" -- 2.15.1.271.g1a4e40aa5d
Re: [PATCH 1/2] git version --build-options: report the build platform, too
Jonathan Niederwrites: >> @@ -390,6 +390,7 @@ const char *help_unknown_cmd(const char *cmd) >> >> int cmd_version(int argc, const char **argv, const char *prefix) >> { >> +static char build_platform[] = GIT_BUILD_PLATFORM; >> int build_options = 0; >> const char * const usage[] = { >> N_("git version []"), >> @@ -413,6 +414,7 @@ int cmd_version(int argc, const char **argv, const char >> *prefix) >> >> if (build_options) { >> printf("sizeof-long: %d\n", (int)sizeof(long)); >> +printf("machine: %s\n", build_platform); > > Can this use GIT_BUILD_PLATFORM directly instead of going via the indirection > of a mutable static string? That is, something like > > printf("machine: %s\n", GIT_BUILD_PLATFORM); Good point. And if this is externally identified as "machine", probably the macro should also use the same word, not "platform". We can go either way, as long as we are consistent, though. >> --- a/help.h >> +++ b/help.h >> @@ -33,3 +33,16 @@ extern void list_commands(unsigned int colopts, struct >> cmdnames *main_cmds, stru >> */ >> extern void help_unknown_ref(const char *ref, const char *cmd, const char >> *error); >> #endif /* HELP_H */ >> + >> +/* >> + * identify build platform >> + */ >> +#ifndef GIT_BUILD_PLATFORM >> +#if defined __x86__ || defined __i386__ || defined __i586__ || defined >> __i686__ >> +#define GIT_BUILD_PLATFORM "x86" >> +#elif defined __x86_64__ >> +#define GIT_BUILD_PLATFORM "x86_64" >> +#else >> +#define GIT_BUILD_PLATFORM "unknown" >> +#endif >> +#endif > > This code needs to be inside the HELP_H header guard. Certainly. Thanks.
Re: [PATCH Outreachy 1/2] format: create pretty.h file
Olga Telezhnayawrites: > -extern void get_commit_format(const char *arg, struct rev_info *); > -extern const char *format_subject(struct strbuf *sb, const char *msg, > - const char *line_separator); > -extern void userformat_find_requirements(const char *fmt, struct > userformat_want *w); > -extern int commit_format_is_empty(enum cmit_fmt); > extern const char *skip_blank_lines(const char *msg); > -extern void format_commit_message(const struct commit *commit, > - const char *format, struct strbuf *sb, > - const struct pretty_print_context *context); > -extern void pretty_print_commit(struct pretty_print_context *pp, > - const struct commit *commit, > - struct strbuf *sb); > -extern void pp_commit_easy(enum cmit_fmt fmt, const struct commit *commit, > -struct strbuf *sb); > -void pp_user_info(struct pretty_print_context *pp, > - const char *what, struct strbuf *sb, > - const char *line, const char *encoding); > -void pp_title_line(struct pretty_print_context *pp, > -const char **msg_p, > -struct strbuf *sb, > -const char *encoding, > -int need_8bit_cte); > -void pp_remainder(struct pretty_print_context *pp, > - const char **msg_p, > - struct strbuf *sb, > - int indent); > ... > +void userformat_find_requirements(const char *fmt, struct userformat_want > *w); > +void pp_commit_easy(enum cmit_fmt fmt, const struct commit *commit, > + struct strbuf *sb); > +void pp_user_info(struct pretty_print_context *pp, const char *what, > + struct strbuf *sb, const char *line, > + const char *encoding); > +void pp_title_line(struct pretty_print_context *pp, const char **msg_p, > + struct strbuf *sb, const char *encoding, > + int need_8bit_cte); > +void pp_remainder(struct pretty_print_context *pp, const char **msg_p, > + struct strbuf *sb, int indent); > + > +void format_commit_message(const struct commit *commit, > + const char *format, struct strbuf *sb, > + const struct pretty_print_context *context); > + > +void get_commit_format(const char *arg, struct rev_info *); > + > +void pretty_print_commit(struct pretty_print_context *pp, > + const struct commit *commit, > + struct strbuf *sb); > + > +const char *format_subject(struct strbuf *sb, const char *msg, > + const char *line_separator); > + > +int commit_format_is_empty(enum cmit_fmt); I see you've "standardized" to drop "extern" from the declarations in the header; I have an impression that our preference however is to go in the other direction. The choice of bits that are moved to the new header looks quite sensible to me. Thanks.
Re: [PATCH v5 4/4] builtin/branch: strip refs/heads/ using skip_prefix
On Friday 08 December 2017 04:44 AM, Junio C Hamano wrote: Junio C Hamanowrites: Somehow this fell underneath my radar horizon. I see v4 and v5 of 4/4 but do not seem to find 1-3/4. Is this meant to be a standalone patch, or am I expected to already have 1-3 that we already are committed to take? Ah, I am guessing that this would apply on top of 1-3/4 in the thread with <20171118172648.17918-1-kaartic.sivar...@gmail.com> You guessed right; at the right time. I was about to ask why this got "out of your radar" in reply to your recent "What's cooking" email :-) The base of the series seems to predate 16169285 ("Merge branch 'jc/branch-name-sanity'", 2017-11-28), so let me see how it looks by applying those three plus this one on top of 'master' before that point. Let me know if this has terrible conflicts so that I can rebase the series on top of 'master'. Thanks, Kaartic
Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
Am 08.12.2017 um 11:14 schrieb Jeff King: > On Thu, Dec 07, 2017 at 01:47:14PM -0800, Junio C Hamano wrote: > >>> diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c >>> index 22034f87e7..8e8a15ea4a 100644 >>> --- a/builtin/fmt-merge-msg.c >>> +++ b/builtin/fmt-merge-msg.c >>> @@ -377,7 +377,8 @@ static void shortlog(const char *name, >>> string_list_append(, >>>oid_to_hex(>object.oid)); >>> else >>> - string_list_append(, strbuf_detach(, NULL)); >>> + string_list_append_nodup(, >>> +strbuf_detach(, NULL)); >>> } >>> >>> if (opts->credit_people) >> >> What is leaked comes from strbuf, so the title is not a lie, but I >> tend to think that this leak is caused by a somewhat strange >> string_list API. The subjects string-list is initialized as a "dup" >> kind, but a caller that wants to avoid leaking can (and should) use >> _nodup() call to add a string without duping. It all feels a bit >> too convoluted. > > I'm not sure it's string-list's fault. Many callers (including this one) > have _some_ entries whose strings must be duplicated and others which do > not. > > So either: > >1. The list gets marked as "nodup", and we add an extra xstrdup() to the > oid_to_hex call above. And also need to remember to free() the > strings later, since the list does not own them. > > or > >2. We mark it as "dup" and incur an extra allocation and copy, like: > > string_list_append(, sb.buf); > strbuf_release(); The two modes (dup/nodup) make string_list code tricky. Not sure how far we'd get with something simpler (e.g. an array of char pointers), but having the caller do all string allocations would make the code easier to analyze. > So I'd really blame the caller, which doesn't want to do (2) out of a > sense of optimization. It could also perhaps write it as: > >while (commit = get_revision(rev)) { > strbuf_reset(); > ... maybe put some stuff in sb ... > if (!sb.len) > string_list_append(, oid_to_hex(obj)); > else > string_list_append(, sb.buf); >} >strbuf_release(); > > which at least avoids the extra allocations. Right, we'd just have extra string copies in that case. > By the way, I think there's another quite subtle leak in this function. > We do this: > >format_commit_message(commit, "%s", , ); >strbuf_ltrim(); > > and then only use "sb" if sb.len is non-zero. But we may have actually > allocated to create our zero-length string (e.g., if we had a strbuf > full of spaces and trimmed them all off). Since we reuse "sb" over and > over as we loop, this will actually only leak once for the whole loop, > not once per iteration. So it's probably not a big deal, but writing it > with the explicit reset/release pattern fixes that (and is more > idiomatic for our code base, I think). It's subtle, but I think it's not leaking, at least not in your example case (and I can't think of another way). IIUC format_subject(), which handles the "%s" part, doesn't touch sb if the subject is made up only of whitespace. René
Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
Am 07.12.2017 um 22:27 schrieb Jeff King: > Grepping for "list_append.*detach" shows a few other possible cases in > transport-helper.c, which I think are leaks. -- >8 -- Subject: [PATCH] transport-helper: plug strbuf and string_list leaks Transfer ownership of detached strbufs to string_lists of the duplicating variety by calling string_list_append_nodup() instead of string_list_append() to avoid duplicating and then leaking the buffer. While at it make sure to release the string_list when done; push_refs_with_export() already does that. Reported-by: Jeff KingSigned-off-by: Rene Scharfe --- transport-helper.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/transport-helper.c b/transport-helper.c index bf05a2dcf1..f682e7c534 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -882,7 +882,8 @@ static int push_refs_with_push(struct transport *transport, struct strbuf cas = STRBUF_INIT; strbuf_addf(, "%s:%s", ref->name, oid_to_hex(>old_oid_expect)); - string_list_append(_options, strbuf_detach(, NULL)); + string_list_append_nodup(_options, +strbuf_detach(, NULL)); } } if (buf.len == 0) { @@ -897,6 +898,7 @@ static int push_refs_with_push(struct transport *transport, strbuf_addch(, '\n'); sendline(data, ); strbuf_release(); + string_list_release(_options, 0); return push_update_refs_status(data, remote_refs, flags); } @@ -930,7 +932,8 @@ static int push_refs_with_export(struct transport *transport, private = apply_refspecs(data->refspecs, data->refspec_nr, ref->name); if (private && !get_oid(private, )) { strbuf_addf(, "^%s", private); - string_list_append(_args, strbuf_detach(, NULL)); + string_list_append_nodup(_args, +strbuf_detach(, NULL)); oidcpy(>old_oid, ); } free(private); -- 2.15.1
Re: [PATCH 2/2] version --build-options: report commit, too, if possible
Hi, Johannes Schindelin wrote: > In particular when local tags are used (or tags that are pushed to some > fork) to build Git, it is very hard to figure out from which particular > revision a particular Git executable was built. Hm, can you say more about how this comes up in practice? I wonder if we should always augment the version string with the commit hash. E.g. I am running git version 2.15.1.424.g9478a66081 now, which already includes the commit hash because it disambiguates the .424 thing, but depending on the motivation, maybe we would also want git version 2.15.1.0.g9b185bef0c15 for people running v2.15.1 (or at least when such a tag is not a well known, published tag). > We need to be careful, though, to report when the current commit cannot be > determined, e.g. when building from a tarball without any associated Git > repository. This means that on Debian, it would always print built from commit: (unknown) Maybe I shouldn't care, but I wonder if there's a way to improve on that. E.g. should there be a makefile knob to allow Debian to specify what to put there? Thanks, Jonathan
Re: [PATCH 1/2] git version --build-options: report the build platform, too
Hi, Johannes Schindelin wrote: > From: Adric Norris> > When asking for bug reports to include the output of `git version > --build-options`, the idea is that we get a better idea of the > environment where said bug occurs. In this context, it is useful to > distinguish between 32 and 64-bit builds. Neat! The cover letter gives a clearer idea of the motivation: In Git for Windows, we ask users to paste the output of said command into their bug reports, with the idea that this frequently helps identify where the problems are coming from. There are some obvious missing bits of information in said output, though, and this patch series tries to fill the gaps at least a little. Could some of that text go here, too? [...] > --- a/help.c > +++ b/help.c > @@ -390,6 +390,7 @@ const char *help_unknown_cmd(const char *cmd) > > int cmd_version(int argc, const char **argv, const char *prefix) > { > + static char build_platform[] = GIT_BUILD_PLATFORM; > int build_options = 0; > const char * const usage[] = { > N_("git version []"), > @@ -413,6 +414,7 @@ int cmd_version(int argc, const char **argv, const char > *prefix) > > if (build_options) { > printf("sizeof-long: %d\n", (int)sizeof(long)); > + printf("machine: %s\n", build_platform); Can this use GIT_BUILD_PLATFORM directly instead of going via the indirection of a mutable static string? That is, something like printf("machine: %s\n", GIT_BUILD_PLATFORM); > /* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */ What do you think of this idea? uname_M isn't one of the variables in that file, though, so that's orthogonal to this patch. [...] > --- a/help.h > +++ b/help.h > @@ -33,3 +33,16 @@ extern void list_commands(unsigned int colopts, struct > cmdnames *main_cmds, stru > */ > extern void help_unknown_ref(const char *ref, const char *cmd, const char > *error); > #endif /* HELP_H */ > + > +/* > + * identify build platform > + */ > +#ifndef GIT_BUILD_PLATFORM > + #if defined __x86__ || defined __i386__ || defined __i586__ || defined > __i686__ > + #define GIT_BUILD_PLATFORM "x86" > + #elif defined __x86_64__ > + #define GIT_BUILD_PLATFORM "x86_64" > + #else > + #define GIT_BUILD_PLATFORM "unknown" > + #endif > +#endif This code needs to be inside the HELP_H header guard. Thanks and hope that helps, Jonathan
Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
Junio C Hamanowrites: > Stefan Beller writes: > ... >> @@ -2883,6 +2884,8 @@ int prepare_revision_walk(struct rev_info *revs) >> simplify_merges(revs); >> if (revs->children.name) >> set_children(revs); >> +if (revs->diffopt.blobfind) >> +revs->simplify_history = 0; >> return 0; >> } > > It makes sense to clear this bit by default, but is this an > unconditonal clearing? In other words, is there a way for the user > to countermand this default and ask for merge simplification from > the command line, e.g. "diff --simplify-history --blobfind="? Looking at the places that assign to revs->simplify_history in revision.c it seems that "this option turns the merge simplification off unconditionally" is pretty much the norm, and this change just follows suit. So perhaps it is OK to let this pass, at least for now.
[PATCH 1/2] git version --build-options: report the build platform, too
From: Adric NorrisWhen asking for bug reports to include the output of `git version --build-options`, the idea is that we get a better idea of the environment where said bug occurs. In this context, it is useful to distinguish between 32 and 64-bit builds. We start by distinguishing between x86 and x86_64 (hoping that interested parties will extend this to other architectures in the future). In addition, all 32-bit variants (i686, i586, etc.) are collapsed into `x86'. An example of the output is: $ git version --build-options git version 2.9.3.windows.2.826.g06c0f2f sizeof-long: 4 machine: x86_64 The label of `machine' was chosen so the new information will approximate the output of `uname -m'. Signed-off-by: Adric Norris Signed-off-by: Johannes Schindelin --- help.c | 2 ++ help.h | 13 + 2 files changed, 15 insertions(+) diff --git a/help.c b/help.c index 88a3aeaeb9f..df1332fa3c9 100644 --- a/help.c +++ b/help.c @@ -390,6 +390,7 @@ const char *help_unknown_cmd(const char *cmd) int cmd_version(int argc, const char **argv, const char *prefix) { + static char build_platform[] = GIT_BUILD_PLATFORM; int build_options = 0; const char * const usage[] = { N_("git version []"), @@ -413,6 +414,7 @@ int cmd_version(int argc, const char **argv, const char *prefix) if (build_options) { printf("sizeof-long: %d\n", (int)sizeof(long)); + printf("machine: %s\n", build_platform); /* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */ } return 0; diff --git a/help.h b/help.h index b21d7c94e8c..42dd9852194 100644 --- a/help.h +++ b/help.h @@ -33,3 +33,16 @@ extern void list_commands(unsigned int colopts, struct cmdnames *main_cmds, stru */ extern void help_unknown_ref(const char *ref, const char *cmd, const char *error); #endif /* HELP_H */ + +/* + * identify build platform + */ +#ifndef GIT_BUILD_PLATFORM + #if defined __x86__ || defined __i386__ || defined __i586__ || defined __i686__ + #define GIT_BUILD_PLATFORM "x86" + #elif defined __x86_64__ + #define GIT_BUILD_PLATFORM "x86_64" + #else + #define GIT_BUILD_PLATFORM "unknown" + #endif +#endif -- 2.15.1.windows.2
[PATCH 2/2] version --build-options: report commit, too, if possible
In particular when local tags are used (or tags that are pushed to some fork) to build Git, it is very hard to figure out from which particular revision a particular Git executable was built. Let's just report that in our build options. We need to be careful, though, to report when the current commit cannot be determined, e.g. when building from a tarball without any associated Git repository. Signed-off-by: Johannes Schindelin--- Makefile | 4 +++- help.c| 2 ++ version.c | 1 + version.h | 1 + 4 files changed, 7 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index fef9c8d2725..92a0ae3d8e3 100644 --- a/Makefile +++ b/Makefile @@ -1893,7 +1893,9 @@ builtin/help.sp builtin/help.s builtin/help.o: EXTRA_CPPFLAGS = \ version.sp version.s version.o: GIT-VERSION-FILE GIT-USER-AGENT version.sp version.s version.o: EXTRA_CPPFLAGS = \ '-DGIT_VERSION="$(GIT_VERSION)"' \ - '-DGIT_USER_AGENT=$(GIT_USER_AGENT_CQ_SQ)' + '-DGIT_USER_AGENT=$(GIT_USER_AGENT_CQ_SQ)' \ + '-DGIT_BUILT_FROM_COMMIT="$(shell git rev-parse -q --verify HEAD || \ + echo "(unknown)")"' $(BUILT_INS): git$X $(QUIET_BUILT_IN)$(RM) $@ && \ diff --git a/help.c b/help.c index df1332fa3c9..6ebea665780 100644 --- a/help.c +++ b/help.c @@ -413,6 +413,8 @@ int cmd_version(int argc, const char **argv, const char *prefix) printf("git version %s\n", git_version_string); if (build_options) { + printf("built from commit: %s\n", + git_built_from_commit_string); printf("sizeof-long: %d\n", (int)sizeof(long)); printf("machine: %s\n", build_platform); /* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */ diff --git a/version.c b/version.c index 6106a8098c8..41b718c29e1 100644 --- a/version.c +++ b/version.c @@ -3,6 +3,7 @@ #include "strbuf.h" const char git_version_string[] = GIT_VERSION; +const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT; const char *git_user_agent(void) { diff --git a/version.h b/version.h index 6911a4f1558..7c62e805771 100644 --- a/version.h +++ b/version.h @@ -2,6 +2,7 @@ #define VERSION_H extern const char git_version_string[]; +extern const char git_built_from_commit_string[]; const char *git_user_agent(void); const char *git_user_agent_sanitized(void); -- 2.15.1.windows.2
[PATCH 0/2] Offer more information in `git version --build-options`'s output
In Git for Windows, we ask users to paste the output of said command into their bug reports, with the idea that this frequently helps identify where the problems are coming from. There are some obvious missing bits of information in said output, though, and this patch series tries to fill the gaps at least a little. Adric Norris (1): git version --build-options: report the build platform, too Johannes Schindelin (1): version --build-options: report commit, too, if possible Makefile | 4 +++- help.c| 4 help.h| 13 + version.c | 1 + version.h | 1 + 5 files changed, 22 insertions(+), 1 deletion(-) base-commit: 95ec6b1b3393eb6e26da40c565520a8db9796e9f Published-As: https://github.com/dscho/git/releases/tag/built-from-commit-v1 Fetch-It-Via: git fetch https://github.com/dscho/git built-from-commit-v1 -- 2.15.1.windows.2
Re: [PATCH] docs/pretty-formats: mention commas in %(trailers) syntax
Jeff Kingwrites: > On Fri, Dec 08, 2017 at 03:10:34AM -0500, Eric Sunshine wrote: > >> On Fri, Dec 8, 2017 at 12:16 AM, Jeff King wrote: >> > Commit 84ff053d47 (pretty.c: delimit "%(trailers)" arguments >> > with ",", 2017-10-01) switched the syntax of the trailers >> > placeholder, but forgot to update the documentation in >> > pretty-formats.txt. >> > >> > There's need to mention the old syntax; it was never in a >> >> I suppose you mean: s/need/no need/ > > Yes, indeed. Thanks. Ah, I probably should switch to 'read-only' mode until I finish my inbox.
Re: [PATCH] docs/pretty-formats: mention commas in %(trailers) syntax
Jeff Kingwrites: > Commit 84ff053d47 (pretty.c: delimit "%(trailers)" arguments > with ",", 2017-10-01) switched the syntax of the trailers > placeholder, but forgot to update the documentation in > pretty-formats.txt. > > There's need to mention the old syntax; it was never in a > released version of Git. There's or There's no? > > Signed-off-by: Jeff King > --- > This should go on top of tb/delimit-pretty-trailers-args-with-comma. > > Documentation/pretty-formats.txt | 12 +++- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/Documentation/pretty-formats.txt > b/Documentation/pretty-formats.txt > index d433d50f81..e664c088a5 100644 > --- a/Documentation/pretty-formats.txt > +++ b/Documentation/pretty-formats.txt > @@ -204,11 +204,13 @@ endif::git-rev-list[] >than given and there are spaces on its left, use those spaces > - '%><()', '%><|()': similar to '% <()', '%<|()' >respectively, but padding both sides (i.e. the text is centered) > -- %(trailers): display the trailers of the body as interpreted by > - linkgit:git-interpret-trailers[1]. If the `:only` option is given, > - omit non-trailer lines from the trailer block. If the `:unfold` > - option is given, behave as if interpret-trailer's `--unfold` option > - was given. E.g., `%(trailers:only:unfold)` to do both. > +- %(trailers[:options]): display the trailers of the body as interpreted > + by linkgit:git-interpret-trailers[1]. The `trailers` string may be > + followed by a colon and zero or more comma-separated options. If the > + `only` option is given, omit non-trailer lines from the trailer block. > + If the `unfold` option is given, behave as if interpret-trailer's > + `--unfold` option was given. E.g., `%(trailers:only,unfold)` to do > + both. > > NOTE: Some placeholders may depend on other options given to the > revision traversal engine. For example, the `%g*` reflog options will
Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
On 08/12/17 09:34, Jeff King wrote: > On Thu, Dec 07, 2017 at 04:24:47PM -0800, Stefan Beller wrote: [snip] >> Junio hinted at a different approach of solving this problem, which this >> patch implements. Teach the diff machinery another flag for restricting >> the information to what is shown. For example: >> >> $ ./git log --oneline --blobfind=v2.0.0:Makefile >> b2feb64309 Revert the whole "ask curl-config" topic for now >> 47fbfded53 i18n: only extract comments marked with "TRANSLATORS:" >> >> we observe that the Makefile as shipped with 2.0 was introduced in >> v1.9.2-471-g47fbfded53 and replaced in v2.0.0-rc1-5-gb2feb64309 by >> a different blob. Sorry, this has nothing to do with this specific thread. :( However, the above caught my eye. Commit b2feb64309 does not 'replace the Makefile with a different blob'. That commit was a 'last minute' revert of a topic _prior_ to v2.0.0, which resulted in the _same_ blob as commit 47fbfded53. (i.e. a53f3a8326c2e62dc79bae7169d64137ac3dab20). [I haven't been following this topic, so just ignore me if I have misunderstood what the above was describing! :-D ] ATB, Ramsay Jones
Re: [SCRIPT/RFC 0/3] git-commit --onto-parent (three-way merge, no working tree file changes)
Igor Djordjevicwrites: > To get back on track, and regarding what`s already been said, would > having something like this(1) feel useful? > > (1) git commit --onto Are you asking me if _I_ find it useful? It is not a very useful question to ask, as I've taken things that I do not find useful myself. Having said that, I do not see me personally using it. You keep claiming that committing without ever materializing the exact state that is committed in the working tree is a good thing. I do not subscribe to that view. I'd rather do a quick fix-up on top (which ensures that at least the fix-up works in the context of the tip), and then "rebase -i" to move it a more appropriate place in the history (during which I have a chance to ensure that the fix-up works in the context it is intended to apply to). I know that every time I say this, people who prefer to commit things that never existed in the working tree will say "but we'll test it later after we make these commit without having their state in the working tree". But I also know better that "later" often do not come, ever, at least for people like me ;-). The amount of work _required_ to record the fix-up at its final resting place deeper in the history would be larger with "rebase -i" approach, simply because approaches like "commit --onto" and "git post" that throw a new commit deep in the history would not require ever materializing it in the working tree. But because I care about what I am actually committing, and because I am just lazy as any other human (if not more), I'd prefer an apporach that _forces_ me to have a checkout of the exact state that I'd be committing. That would prod me to actually looking at and testing the state after the change in the context it is meant to go.
[PATCH v7 04/16] upload-pack: add object filtering for partial clone
From: Jeff HostetlerTeach upload-pack to negotiate object filtering over the protocol and to send filter parameters to pack-objects. This is intended for partial clone and fetch. The idea to make upload-pack configurable using uploadpack.allowFilter comes from Jonathan Tan's work in [1]. [1] https://public-inbox.org/git/f211093280b422c32cc1b7034130072f35c5ed51.1506714999.git.jonathanta...@google.com/ Signed-off-by: Jeff Hostetler --- Documentation/config.txt | 4 Documentation/technical/pack-protocol.txt | 8 +++ Documentation/technical/protocol-capabilities.txt | 8 +++ upload-pack.c | 26 ++- 4 files changed, 45 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1ac0ae6..e528210 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -3268,6 +3268,10 @@ uploadpack.packObjectsHook:: was run. I.e., `upload-pack` will feed input intended for `pack-objects` to the hook, and expects a completed packfile on stdout. + +uploadpack.allowFilter:: + If this option is set, `upload-pack` will advertise partial + clone and partial fetch object filtering. + Note that this configuration variable is ignored if it is seen in the repository-level config (this is a safety measure against fetching from diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt index ed1eae8..a43a113 100644 --- a/Documentation/technical/pack-protocol.txt +++ b/Documentation/technical/pack-protocol.txt @@ -212,6 +212,7 @@ out of what the server said it could do with the first 'want' line. upload-request= want-list *shallow-line *1depth-request + [filter-request] flush-pkt want-list = first-want @@ -227,6 +228,8 @@ out of what the server said it could do with the first 'want' line. additional-want = PKT-LINE("want" SP obj-id) depth = 1*DIGIT + + filter-request= PKT-LINE("filter" SP filter-spec) Clients MUST send all the obj-ids it wants from the reference @@ -249,6 +252,11 @@ complete those commits. Commits whose parents are not received as a result are defined as shallow and marked as such in the server. This information is sent back to the client in the next step. +The client can optionally request that pack-objects omit various +objects from the packfile using one of several filtering techniques. +These are intended for use with partial clone and partial fetch +operations. See `rev-list` for possible "filter-spec" values. + Once all the 'want's and 'shallow's (and optional 'deepen') are transferred, clients MUST send a flush-pkt, to tell the server side that it is done sending the list. diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt index 26dcc6f..332d209 100644 --- a/Documentation/technical/protocol-capabilities.txt +++ b/Documentation/technical/protocol-capabilities.txt @@ -309,3 +309,11 @@ to accept a signed push certificate, and asks the to be included in the push certificate. A send-pack client MUST NOT send a push-cert packet unless the receive-pack server advertises this capability. + +filter +-- + +If the upload-pack server advertises the 'filter' capability, +fetch-pack may send "filter" commands to request a partial clone +or partial fetch and request that the server omit various objects +from the packfile. diff --git a/upload-pack.c b/upload-pack.c index e25f725..e6d38b9 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -10,6 +10,8 @@ #include "diff.h" #include "revision.h" #include "list-objects.h" +#include "list-objects-filter.h" +#include "list-objects-filter-options.h" #include "run-command.h" #include "connect.h" #include "sigchain.h" @@ -18,6 +20,7 @@ #include "parse-options.h" #include "argv-array.h" #include "prio-queue.h" +#include "quote.h" static const char * const upload_pack_usage[] = { N_("git upload-pack [] "), @@ -64,6 +67,10 @@ static int advertise_refs; static int stateless_rpc; static const char *pack_objects_hook; +static int filter_capability_requested; +static int filter_advertise; +static struct list_objects_filter_options filter_options; + static void reset_timeout(void) { alarm(timeout); @@ -131,6 +138,12 @@ static void create_pack_file(void) argv_array_push(_objects.args, "--delta-base-offset"); if (use_include_tag) argv_array_push(_objects.args, "--include-tag"); + if (filter_options.filter_spec) { + struct strbuf buf = STRBUF_INIT; + sq_quote_buf(, filter_options.filter_spec); + argv_array_pushf(_objects.args, "--filter=%s", buf.buf); +
[PATCH v7 01/16] sha1_file: support lazily fetching missing objects
From: Jonathan TanTeach sha1_file to fetch objects from the remote configured in extensions.partialclone whenever an object is requested but missing. The fetching of objects can be suppressed through a global variable. This is used by fsck and index-pack. However, by default, such fetching is not suppressed. This is meant as a temporary measure to ensure that all Git commands work in such a situation. Future patches will update some commands to either tolerate missing objects (without fetching them) or be more efficient in fetching them. In order to determine the code changes in sha1_file.c necessary, I investigated the following: (1) functions in sha1_file.c that take in a hash, without the user regarding how the object is stored (loose or packed) (2) functions in packfile.c (because I need to check callers that know about the loose/packed distinction and operate on both differently, and ensure that they can handle the concept of objects that are neither loose nor packed) (1) is handled by the modification to sha1_object_info_extended(). For (2), I looked at for_each_packed_object and others. For for_each_packed_object, the callers either already work or are fixed in this patch: - reachable - only to find recent objects - builtin/fsck - already knows about missing objects - builtin/cat-file - warning message added in this commit Callers of the other functions do not need to be changed: - parse_pack_index - http - indirectly from http_get_info_packs - find_pack_entry_one - this searches a single pack that is provided as an argument; the caller already knows (through other means) that the sought object is in a specific pack - find_sha1_pack - fast-import - appears to be an optimization to not store a file if it is already in a pack - http-walker - to search through a struct alt_base - http-push - to search through remote packs - has_sha1_pack - builtin/fsck - already knows about promisor objects - builtin/count-objects - informational purposes only (check if loose object is also packed) - builtin/prune-packed - check if object to be pruned is packed (if not, don't prune it) - revision - used to exclude packed objects if requested by user - diff - just for optimization Signed-off-by: Jonathan Tan --- builtin/cat-file.c | 2 ++ builtin/fetch-pack.c | 2 ++ builtin/fsck.c | 3 +++ builtin/index-pack.c | 6 ++ cache.h | 8 fetch-object.c | 3 +++ sha1_file.c | 32 ++ t/t0410-partial-clone.sh | 51 8 files changed, 99 insertions(+), 8 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index f5fa4fd..cf9ea5c 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -475,6 +475,8 @@ static int batch_objects(struct batch_options *opt) for_each_loose_object(batch_loose_object, , 0); for_each_packed_object(batch_packed_object, , 0); + if (repository_format_partial_clone) + warning("This repository has extensions.partialClone set. Some objects may not be loaded."); cb.opt = opt; cb.expand = diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 02abe72..15eeed7 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -53,6 +53,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) struct oid_array shallow = OID_ARRAY_INIT; struct string_list deepen_not = STRING_LIST_INIT_DUP; + fetch_if_missing = 0; + packet_trace_identity("fetch-pack"); memset(, 0, sizeof(args)); diff --git a/builtin/fsck.c b/builtin/fsck.c index 578a7c8..3b76c0e 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -678,6 +678,9 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) int i; struct alternate_object_database *alt; + /* fsck knows how to handle missing promisor objects */ + fetch_if_missing = 0; + errors_found = 0; check_replace_refs = 0; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 24c2f05..a0a35e6 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1657,6 +1657,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) unsigned foreign_nr = 1;/* zero is a "good" value, assume bad */ int report_end_of_input = 0; + /* +* index-pack never needs to fetch missing objects, since it only +* accesses the repo to do hash collision checks +*/ + fetch_if_missing = 0; + if (argc == 2 && !strcmp(argv[1], "-h")) usage(index_pack_usage); diff --git a/cache.h b/cache.h index c76f2e9..6980072 100644 --- a/cache.h +++ b/cache.h @@ -1727,6 +1727,14 @@
[PATCH v7 02/16] rev-list: support termination at promisor objects
From: Jonathan TanTeach rev-list to support termination of an object traversal at any object from a promisor remote (whether one that the local repo also has, or one that the local repo knows about because it has another promisor object that references it). This will be used subsequently in gc and in the connectivity check used by fetch. For efficiency, if an object is referenced by a promisor object, and is in the local repo only as a non-promisor object, object traversal will not stop there. This is to avoid building the list of promisor object references. (In list-objects.c, the case where obj is NULL in process_blob() and process_tree() do not need to be changed because those happen only when there is a conflict between the expected type and the existing object. If the object doesn't exist, an object will be synthesized, which is fine.) Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- Documentation/rev-list-options.txt | 11 builtin/rev-list.c | 71 +++--- list-objects.c | 29 ++- object.c | 2 +- revision.c | 33 +++- revision.h | 5 +- t/t0410-partial-clone.sh | 101 + 7 files changed, 240 insertions(+), 12 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 8d8b7f4..0ce8ccd 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -745,10 +745,21 @@ The form '--missing=allow-any' will allow object traversal to continue if a missing object is encountered. Missing objects will silently be omitted from the results. + +The form '--missing=allow-promisor' is like 'allow-any', but will only +allow object traversal to continue for EXPECTED promisor missing objects. +Unexpected missing objects will raise an error. ++ The form '--missing=print' is like 'allow-any', but will also print a list of the missing objects. Object IDs are prefixed with a ``?'' character. endif::git-rev-list[] +--exclude-promisor-objects:: + (For internal use only.) Prefilter object traversal at + promisor boundary. This is used with partial clone. This is + stronger than `--missing=allow-promisor` because it limits the + traversal, rather than just silencing errors about missing + objects. + --no-walk[=(sorted|unsorted)]:: Only show the given commits, but do not traverse their ancestors. This has no effect if a range is specified. If the argument diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 547a3e0..48f922d 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -15,6 +15,7 @@ #include "progress.h" #include "reflog-walk.h" #include "oidset.h" +#include "packfile.h" static const char rev_list_usage[] = "git rev-list [OPTION] ... [ -- paths... ]\n" @@ -67,6 +68,7 @@ enum missing_action { MA_ERROR = 0,/* fail if any missing objects are encountered */ MA_ALLOW_ANY,/* silently allow ALL missing objects */ MA_PRINT,/* print ALL missing objects in special section */ + MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */ }; static enum missing_action arg_missing_action; @@ -197,6 +199,12 @@ static void finish_commit(struct commit *commit, void *data) static inline void finish_object__ma(struct object *obj) { + /* +* Whether or not we try to dynamically fetch missing objects +* from the server, we currently DO NOT have the object. We +* can either print, allow (ignore), or conditionally allow +* (ignore) them. +*/ switch (arg_missing_action) { case MA_ERROR: die("missing blob object '%s'", oid_to_hex(>oid)); @@ -209,25 +217,36 @@ static inline void finish_object__ma(struct object *obj) oidset_insert(_objects, >oid); return; + case MA_ALLOW_PROMISOR: + if (is_promisor_object(>oid)) + return; + die("unexpected missing blob object '%s'", + oid_to_hex(>oid)); + return; + default: BUG("unhandled missing_action"); return; } } -static void finish_object(struct object *obj, const char *name, void *cb_data) +static int finish_object(struct object *obj, const char *name, void *cb_data) { struct rev_list_info *info = cb_data; - if (obj->type == OBJ_BLOB && !has_object_file(>oid)) + if (obj->type == OBJ_BLOB && !has_object_file(>oid)) { finish_object__ma(obj); + return 1; + } if (info->revs->verify_objects && !obj->parsed && obj->type != OBJ_COMMIT) parse_object(>oid); + return 0;
[PATCH v7 05/16] fetch-pack, index-pack, transport: partial clone
From: Jeff HostetlerSigned-off-by: Jeff Hostetler --- builtin/fetch-pack.c | 4 fetch-pack.c | 13 + fetch-pack.h | 2 ++ transport-helper.c | 5 + transport.c | 4 transport.h | 5 + 6 files changed, 33 insertions(+) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 15eeed7..7957807 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -153,6 +153,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) args.no_dependents = 1; continue; } + if (skip_prefix(arg, ("--" CL_ARG__FILTER "="), )) { + parse_list_objects_filter(_options, arg); + continue; + } usage(fetch_pack_usage); } if (deepen_not.nr) diff --git a/fetch-pack.c b/fetch-pack.c index 0798e0b..3c5f064 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -29,6 +29,7 @@ static int deepen_not_ok; static int fetch_fsck_objects = -1; static int transfer_fsck_objects = -1; static int agent_supported; +static int server_supports_filtering; static struct lock_file shallow_lock; static const char *alternate_shallow_file; @@ -379,6 +380,8 @@ static int find_common(struct fetch_pack_args *args, if (deepen_not_ok) strbuf_addstr(, " deepen-not"); if (agent_supported)strbuf_addf(, " agent=%s", git_user_agent_sanitized()); + if (args->filter_options.choice) + strbuf_addstr(, " filter"); packet_buf_write(_buf, "want %s%s\n", remote_hex, c.buf); strbuf_release(); } else @@ -407,6 +410,9 @@ static int find_common(struct fetch_pack_args *args, packet_buf_write(_buf, "deepen-not %s", s->string); } } + if (server_supports_filtering && args->filter_options.choice) + packet_buf_write(_buf, "filter %s", +args->filter_options.filter_spec); packet_buf_flush(_buf); state_len = req_buf.len; @@ -969,6 +975,13 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, else prefer_ofs_delta = 0; + if (server_supports("filter")) { + server_supports_filtering = 1; + print_verbose(args, _("Server supports filter")); + } else if (args->filter_options.choice) { + warning("filtering not recognized by server, ignoring"); + } + if ((agent_feature = server_feature_value("agent", _len))) { agent_supported = 1; if (agent_len) diff --git a/fetch-pack.h b/fetch-pack.h index aeac152..3e224a1 100644 --- a/fetch-pack.h +++ b/fetch-pack.h @@ -3,6 +3,7 @@ #include "string-list.h" #include "run-command.h" +#include "list-objects-filter-options.h" struct oid_array; @@ -12,6 +13,7 @@ struct fetch_pack_args { int depth; const char *deepen_since; const struct string_list *deepen_not; + struct list_objects_filter_options filter_options; unsigned deepen_relative:1; unsigned quiet:1; unsigned keep_pack:1; diff --git a/transport-helper.c b/transport-helper.c index c948d52..0650ca0 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -671,6 +671,11 @@ static int fetch(struct transport *transport, if (data->transport_options.update_shallow) set_helper_option(transport, "update-shallow", "true"); + if (data->transport_options.filter_options.choice) + set_helper_option( + transport, "filter", + data->transport_options.filter_options.filter_spec); + if (data->fetch) return fetch_with_fetch(transport, nr_heads, to_fetch); diff --git a/transport.c b/transport.c index f2fbc6f..cce50f5 100644 --- a/transport.c +++ b/transport.c @@ -166,6 +166,9 @@ static int set_git_option(struct git_transport_options *opts, } else if (!strcmp(name, TRANS_OPT_NO_DEPENDENTS)) { opts->no_dependents = !!value; return 0; + } else if (!strcmp(name, TRANS_OPT_LIST_OBJECTS_FILTER)) { + parse_list_objects_filter(>filter_options, value); + return 0; } return 1; } @@ -236,6 +239,7 @@ static int fetch_refs_via_pack(struct transport *transport, args.update_shallow = data->options.update_shallow; args.from_promisor = data->options.from_promisor; args.no_dependents = data->options.no_dependents; + args.filter_options = data->options.filter_options; if (!data->got_remote_heads) { connect_setup(transport, 0);
[PATCH v7 07/16] fetch-pack: test support excluding large blobs
From: Jonathan TanCreated tests to verify fetch-pack and upload-pack support for excluding large blobs using --filter=blobs:limit= parameter. Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- t/t5500-fetch-pack.sh | 27 +++ upload-pack.c | 13 + 2 files changed, 36 insertions(+), 4 deletions(-) diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh index 80a1a32..c57916b 100755 --- a/t/t5500-fetch-pack.sh +++ b/t/t5500-fetch-pack.sh @@ -755,4 +755,31 @@ test_expect_success 'fetching deepen' ' ) ' +test_expect_success 'filtering by size' ' + rm -rf server client && + test_create_repo server && + test_commit -C server one && + test_config -C server uploadpack.allowfilter 1 && + + test_create_repo client && + git -C client fetch-pack --filter=blob:limit=0 ../server HEAD && + + # Ensure that object is not inadvertently fetched + test_must_fail git -C client cat-file -e $(git hash-object server/one.t) +' + +test_expect_success 'filtering by size has no effect if support for it is not advertised' ' + rm -rf server client && + test_create_repo server && + test_commit -C server one && + + test_create_repo client && + git -C client fetch-pack --filter=blob:limit=0 ../server HEAD 2> err && + + # Ensure that object is fetched + git -C client cat-file -e $(git hash-object server/one.t) && + + test_i18ngrep "filtering not recognized by server" err +' + test_done diff --git a/upload-pack.c b/upload-pack.c index e6d38b9..15b6605 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -139,10 +139,15 @@ static void create_pack_file(void) if (use_include_tag) argv_array_push(_objects.args, "--include-tag"); if (filter_options.filter_spec) { - struct strbuf buf = STRBUF_INIT; - sq_quote_buf(, filter_options.filter_spec); - argv_array_pushf(_objects.args, "--filter=%s", buf.buf); - strbuf_release(); + if (pack_objects.use_shell) { + struct strbuf buf = STRBUF_INIT; + sq_quote_buf(, filter_options.filter_spec); + argv_array_pushf(_objects.args, "--filter=%s", buf.buf); + strbuf_release(); + } else { + argv_array_pushf(_objects.args, "--filter=%s", +filter_options.filter_spec); + } } pack_objects.in = -1; -- 2.9.3
[PATCH v7 06/16] fetch-pack: add --no-filter
From: Jeff HostetlerFixup fetch-pack to accept --no-filter to be consistent with rev-list and pack-objects. Signed-off-by: Jeff Hostetler --- builtin/fetch-pack.c | 4 1 file changed, 4 insertions(+) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 7957807..cbf5035 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -157,6 +157,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) parse_list_objects_filter(_options, arg); continue; } + if (!strcmp(arg, ("--no-" CL_ARG__FILTER))) { + list_objects_filter_release(_options); + continue; + } usage(fetch_pack_usage); } if (deepen_not.nr) -- 2.9.3
[PATCH v7 08/16] fetch: refactor calculation of remote list
From: Jonathan TanSeparate out the calculation of remotes to be fetched from and the actual fetching. This will allow us to include an additional step before the actual fetching in a subsequent commit. Signed-off-by: Jonathan Tan --- builtin/fetch.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 225c734..1b1f039 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1322,7 +1322,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) { int i; struct string_list list = STRING_LIST_INIT_DUP; - struct remote *remote; + struct remote *remote = NULL; int result = 0; struct argv_array argv_gc_auto = ARGV_ARRAY_INIT; @@ -1367,17 +1367,14 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) else if (argc > 1) die(_("fetch --all does not make sense with refspecs")); (void) for_each_remote(get_one_remote_for_fetch, ); - result = fetch_multiple(); } else if (argc == 0) { /* No arguments -- use default remote */ remote = remote_get(NULL); - result = fetch_one(remote, argc, argv); } else if (multiple) { /* All arguments are assumed to be remotes or groups */ for (i = 0; i < argc; i++) if (!add_remote_or_group(argv[i], )) die(_("No such remote or remote group: %s"), argv[i]); - result = fetch_multiple(); } else { /* Single remote or group */ (void) add_remote_or_group(argv[0], ); @@ -1385,14 +1382,19 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) /* More than one remote */ if (argc > 1) die(_("Fetching a group and specifying refspecs does not make sense")); - result = fetch_multiple(); } else { /* Zero or one remotes */ remote = remote_get(argv[0]); - result = fetch_one(remote, argc-1, argv+1); + argc--; + argv++; } } + if (remote) + result = fetch_one(remote, argc, argv); + else + result = fetch_multiple(); + if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) { struct argv_array options = ARGV_ARRAY_INIT; -- 2.9.3
[PATCH v7 10/16] partial-clone: define partial clone settings in config
From: Jeff HostetlerCreate get and set routines for "partial clone" config settings. These will be used in a future commit by clone and fetch to remember the promisor remote and the default filter-spec. Signed-off-by: Jeff Hostetler --- cache.h | 1 + config.c | 5 +++ environment.c | 1 + list-objects-filter-options.c | 90 +++ list-objects-filter-options.h | 6 +++ 5 files changed, 88 insertions(+), 15 deletions(-) diff --git a/cache.h b/cache.h index 6980072..bccc510 100644 --- a/cache.h +++ b/cache.h @@ -861,6 +861,7 @@ extern int grafts_replace_parents; #define GIT_REPO_VERSION_READ 1 extern int repository_format_precious_objects; extern char *repository_format_partial_clone; +extern const char *core_partial_clone_filter_default; struct repository_format { int version; diff --git a/config.c b/config.c index adb7d7a..adeee04 100644 --- a/config.c +++ b/config.c @@ -1241,6 +1241,11 @@ static int git_default_core_config(const char *var, const char *value) return 0; } + if (!strcmp(var, "core.partialclonefilter")) { + return git_config_string(_partial_clone_filter_default, +var, value); + } + /* Add other config variables here and to Documentation/config.txt. */ return 0; } diff --git a/environment.c b/environment.c index e52aab3..7537565 100644 --- a/environment.c +++ b/environment.c @@ -28,6 +28,7 @@ int warn_on_object_refname_ambiguity = 1; int ref_paranoia = -1; int repository_format_precious_objects; char *repository_format_partial_clone; +const char *core_partial_clone_filter_default; const char *git_commit_encoding; const char *git_log_output_encoding; const char *apply_default_whitespace; diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index 4c5b34e..5c47e2b 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -21,29 +21,36 @@ * subordinate commands when necessary. We also "intern" the arg for * the convenience of the current command. */ -int parse_list_objects_filter(struct list_objects_filter_options *filter_options, - const char *arg) +static int gently_parse_list_objects_filter( + struct list_objects_filter_options *filter_options, + const char *arg, + struct strbuf *errbuf) { const char *v0; - if (filter_options->choice) - die(_("multiple object filter types cannot be combined")); + if (filter_options->choice) { + if (errbuf) { + strbuf_init(errbuf, 0); + strbuf_addstr( + errbuf, + _("multiple filter-specs cannot be combined")); + } + return 1; + } filter_options->filter_spec = strdup(arg); if (!strcmp(arg, "blob:none")) { filter_options->choice = LOFC_BLOB_NONE; return 0; - } - if (skip_prefix(arg, "blob:limit=", )) { - if (!git_parse_ulong(v0, _options->blob_limit_value)) - die(_("invalid filter-spec expression '%s'"), arg); - filter_options->choice = LOFC_BLOB_LIMIT; - return 0; - } + } else if (skip_prefix(arg, "blob:limit=", )) { + if (git_parse_ulong(v0, _options->blob_limit_value)) { + filter_options->choice = LOFC_BLOB_LIMIT; + return 0; + } - if (skip_prefix(arg, "sparse:oid=", )) { + } else if (skip_prefix(arg, "sparse:oid=", )) { struct object_context oc; struct object_id sparse_oid; @@ -57,15 +64,27 @@ int parse_list_objects_filter(struct list_objects_filter_options *filter_options filter_options->sparse_oid_value = oiddup(_oid); filter_options->choice = LOFC_SPARSE_OID; return 0; - } - if (skip_prefix(arg, "sparse:path=", )) { + } else if (skip_prefix(arg, "sparse:path=", )) { filter_options->choice = LOFC_SPARSE_PATH; filter_options->sparse_path_value = strdup(v0); return 0; } - die(_("invalid filter-spec expression '%s'"), arg); + if (errbuf) { + strbuf_init(errbuf, 0); + strbuf_addf(errbuf, "invalid filter-spec '%s'", arg); + } + memset(filter_options, 0, sizeof(*filter_options)); + return 1; +} + +int parse_list_objects_filter(struct list_objects_filter_options *filter_options, + const char *arg) +{ + struct strbuf buf = STRBUF_INIT; + if (gently_parse_list_objects_filter(filter_options, arg, )) +
[PATCH v7 14/16] t5616: end-to-end tests for partial clone
From: Jeff HostetlerAdditional end-to-end tests for partial clone. Signed-off-by: Jeff Hostetler --- t/t5616-partial-clone.sh | 96 1 file changed, 96 insertions(+) create mode 100755 t/t5616-partial-clone.sh diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh new file mode 100755 index 000..3b8cc0b --- /dev/null +++ b/t/t5616-partial-clone.sh @@ -0,0 +1,96 @@ +#!/bin/sh + +test_description='git partial clone' + +. ./test-lib.sh + +# create a normal "src" repo where we can later create new commits. +# expect_1.oids will contain a list of the OIDs of all blobs. +test_expect_success 'setup normal src repo' ' + echo "{print \$1}" >print_1.awk && + echo "{print \$2}" >print_2.awk && + + git init src && + for n in 1 2 3 4 + do + echo "This is file: $n" > src/file.$n.txt + git -C src add file.$n.txt + git -C src commit -m "file $n" + git -C src ls-files -s file.$n.txt >>temp + done && + awk -f print_2.awk expect_1.oids && + test_line_count = 4 expect_1.oids +' + +# bare clone "src" giving "srv.bare" for use as our server. +test_expect_success 'setup bare clone for server' ' + git clone --bare "file://$(pwd)/src" srv.bare && + git -C srv.bare config --local uploadpack.allowfilter 1 && + git -C srv.bare config --local uploadpack.allowanysha1inwant 1 +' + +# do basic partial clone from "srv.bare" +# confirm we are missing all of the known blobs. +# confirm partial clone was registered in the local config. +test_expect_success 'do partial clone 1' ' + git clone --no-checkout --filter=blob:none "file://$(pwd)/srv.bare" pc1 && + git -C pc1 rev-list HEAD --quiet --objects --missing=print \ + | awk -f print_1.awk \ + | sed "s/?//" \ + | sort >observed.oids && + test_cmp expect_1.oids observed.oids && + test "$(git -C pc1 config --local core.repositoryformatversion)" = "1" && + test "$(git -C pc1 config --local extensions.partialclone)" = "origin" && + test "$(git -C pc1 config --local core.partialclonefilter)" = "blob:none" +' + +# checkout master to force dynamic object fetch of blobs at HEAD. +test_expect_success 'verify checkout with dynamic object fetch' ' + git -C pc1 rev-list HEAD --quiet --objects --missing=print >observed && + test_line_count = 4 observed && + git -C pc1 checkout master && + git -C pc1 rev-list HEAD --quiet --objects --missing=print >observed && + test_line_count = 0 observed +' + +# create new commits in "src" repo to establish a blame history on file.1.txt +# and push to "srv.bare". +test_expect_success 'push new commits to server' ' + git -C src remote add srv "file://$(pwd)/srv.bare" && + for x in a b c d e + do + echo "Mod $x" >>src/file.1.txt + git -C src add file.1.txt + git -C src commit -m "mod $x" + done && + git -C src blame master -- file.1.txt >expect.blame && + git -C src push -u srv master +' + +# (partial) fetch in the partial clone repo from the promisor remote. +# verify that fetch inherited the filter-spec from the config and DOES NOT +# have the new blobs. +test_expect_success 'partial fetch inherits filter settings' ' + git -C pc1 fetch origin && + git -C pc1 rev-list master..origin/master --quiet --objects --missing=print >observed && + test_line_count = 5 observed +' + +# force dynamic object fetch using diff. +# we should only get 1 new blob (for the file in origin/master). +test_expect_success 'verify diff causes dynamic object fetch' ' + git -C pc1 diff master..origin/master -- file.1.txt && + git -C pc1 rev-list master..origin/master --quiet --objects --missing=print >observed && + test_line_count = 4 observed +' + +# force full dynamic object fetch of the file's history using blame. +# we should get the intermediate blobs for the file. +test_expect_success 'verify blame causes dynamic object fetch' ' + git -C pc1 blame origin/master -- file.1.txt >observed.blame && + test_cmp expect.blame observed.blame && + git -C pc1 rev-list master..origin/master --quiet --objects --missing=print >observed && + test_line_count = 0 observed +' + +test_done -- 2.9.3
[PATCH v7 15/16] fetch: inherit filter-spec from partial clone
From: Jeff HostetlerTeach (partial) fetch to inherit the filter-spec used by the partial clone. Extend --no-filter to override this inheritance. Signed-off-by: Jeff Hostetler --- builtin/fetch-pack.c | 2 +- builtin/fetch.c | 56 --- builtin/rev-list.c| 2 +- list-objects-filter-options.c | 2 +- list-objects-filter-options.h | 12 ++ t/t5616-partial-clone.sh | 22 - 6 files changed, 89 insertions(+), 7 deletions(-) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index cbf5035..a7bc136 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -158,7 +158,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) continue; } if (!strcmp(arg, ("--no-" CL_ARG__FILTER))) { - list_objects_filter_release(_options); + list_objects_filter_set_no_filter(_options); continue; } usage(fetch_pack_usage); diff --git a/builtin/fetch.c b/builtin/fetch.c index 14aab71..79c866c 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1275,6 +1275,56 @@ static int fetch_multiple(struct string_list *list) return result; } +/* + * Fetching from the promisor remote should use the given filter-spec + * or inherit the default filter-spec from the config. + */ +static inline void fetch_one_setup_partial(struct remote *remote) +{ + /* +* Explicit --no-filter argument overrides everything, regardless +* of any prior partial clones and fetches. +*/ + if (filter_options.no_filter) + return; + + /* +* If no prior partial clone/fetch and the current fetch DID NOT +* request a partial-fetch, do a normal fetch. +*/ + if (!repository_format_partial_clone && !filter_options.choice) + return; + + /* +* If this is the FIRST partial-fetch request, we enable partial +* on this repo and remember the given filter-spec as the default +* for subsequent fetches to this remote. +*/ + if (!repository_format_partial_clone && filter_options.choice) { + partial_clone_register(remote->name, _options); + return; + } + + /* +* We are currently limited to only ONE promisor remote and only +* allow partial-fetches from the promisor remote. +*/ + if (strcmp(remote->name, repository_format_partial_clone)) { + if (filter_options.choice) + die(_("--filter can only be used with the remote configured in core.partialClone")); + return; + } + + /* +* Do a partial-fetch from the promisor remote using either the +* explicitly given filter-spec or inherit the filter-spec from +* the config. +*/ + if (!filter_options.choice) + partial_clone_get_default_filter_spec(_options); + return; +} + static int fetch_one(struct remote *remote, int argc, const char **argv) { static const char **refs = NULL; @@ -1404,13 +1454,13 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) } if (remote) { - if (filter_options.choice && - strcmp(remote->name, repository_format_partial_clone)) - die(_("--filter can only be used with the remote configured in core.partialClone")); + if (filter_options.choice || repository_format_partial_clone) + fetch_one_setup_partial(remote); result = fetch_one(remote, argc, argv); } else { if (filter_options.choice) die(_("--filter can only be used with the remote configured in core.partialClone")); + /* TODO should this also die if we have a previous partial-clone? */ result = fetch_multiple(); } diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 48f922d..8503dea 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -460,7 +460,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) continue; } if (!strcmp(arg, ("--no-" CL_ARG__FILTER))) { - list_objects_filter_release(_options); + list_objects_filter_set_no_filter(_options); continue; } if (!strcmp(arg, "--filter-print-omitted")) { diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index 5c47e2b..6a3cc98 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -94,7 +94,7 @@ int opt_parse_list_objects_filter(const struct option *opt, struct
[PATCH v7 16/16] t5616: test bulk prefetch after partial fetch
From: Jeff HostetlerAdd test to t5616 to bulk fetch missing objects following a partial fetch. A technique like this could be used in a pre-command hook for example. Signed-off-by: Jeff Hostetler --- t/t5616-partial-clone.sh | 32 +++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index 433e07e..29d8631 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -105,7 +105,7 @@ test_expect_success 'push new commits to server for file.2.txt' ' git -C src push -u srv master ' -# Do FULL fetch by disabling filter-spec using --no-filter. +# Do FULL fetch by disabling inherited filter-spec using --no-filter. # Verify we have all the new blobs. test_expect_success 'override inherited filter-spec using --no-filter' ' git -C pc1 fetch --no-filter origin && @@ -113,4 +113,34 @@ test_expect_success 'override inherited filter-spec using --no-filter' ' test_line_count = 0 observed ' +# create new commits in "src" repo to establish a history on file.3.txt +# and push to "srv.bare". +test_expect_success 'push new commits to server for file.3.txt' ' + for x in a b c d e f + do + echo "Mod file.3.txt $x" >>src/file.3.txt + git -C src add file.3.txt + git -C src commit -m "mod $x" + done && + git -C src push -u srv master +' + +# Do a partial fetch and then try to manually fetch the missing objects. +# This can be used as the basis of a pre-command hook to bulk fetch objects +# perhaps combined with a command in dry-run mode. +test_expect_success 'manual prefetch of missing objects' ' + git -C pc1 fetch --filter=blob:none origin && + git -C pc1 rev-list master..origin/master --quiet --objects --missing=print \ + | awk -f print_1.awk \ + | sed "s/?//" \ + | sort >observed.oids && + test_line_count = 6 observed.oids && + git -C pc1 fetch-pack --stdin "file://$(pwd)/srv.bare" observed.oids && + test_line_count = 0 observed.oids +' + test_done -- 2.9.3
[PATCH v7 11/16] clone: partial clone
From: Jonathan TanSigned-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- builtin/clone.c | 22 -- t/t5601-clone.sh | 49 + 2 files changed, 69 insertions(+), 2 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index dbddd98..f519bd4 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -26,6 +26,7 @@ #include "run-command.h" #include "connected.h" #include "packfile.h" +#include "list-objects-filter-options.h" /* * Overall FIXMEs: @@ -60,6 +61,7 @@ static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP; static int option_dissociate; static int max_jobs = -1; static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP; +static struct list_objects_filter_options filter_options; static int recurse_submodules_cb(const struct option *opt, const char *arg, int unset) @@ -135,6 +137,7 @@ static struct option builtin_clone_options[] = { TRANSPORT_FAMILY_IPV4), OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"), TRANSPORT_FAMILY_IPV6), + OPT_PARSE_LIST_OBJECTS_FILTER(_options), OPT_END() }; @@ -886,6 +889,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) struct refspec *refspec; const char *fetch_pattern; + fetch_if_missing = 0; + packet_trace_identity("clone"); argc = parse_options(argc, argv, prefix, builtin_clone_options, builtin_clone_usage, 0); @@ -1073,6 +1078,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) warning(_("--shallow-since is ignored in local clones; use file:// instead.")); if (option_not.nr) warning(_("--shallow-exclude is ignored in local clones; use file:// instead.")); + if (filter_options.choice) + warning(_("--filter is ignored in local clones; use file:// instead.")); if (!access(mkpath("%s/shallow", path), F_OK)) { if (option_local > 0) warning(_("source repository is shallow, ignoring --local")); @@ -1104,7 +,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix) transport_set_option(transport, TRANS_OPT_UPLOADPACK, option_upload_pack); - if (transport->smart_options && !deepen) + if (filter_options.choice) { + transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER, +filter_options.filter_spec); + transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); + } + + if (transport->smart_options && !deepen && !filter_options.choice) transport->smart_options->check_self_contained_and_connected = 1; refs = transport_get_remote_refs(transport); @@ -1164,13 +1177,17 @@ int cmd_clone(int argc, const char **argv, const char *prefix) write_refspec_config(src_ref_prefix, our_head_points_at, remote_head_points_at, _top); + if (filter_options.choice) + partial_clone_register("origin", _options); + if (is_local) clone_local(path, git_dir); else if (refs && complete_refs_before_fetch) transport_fetch_refs(transport, mapped_refs); update_remote_refs(refs, mapped_refs, remote_head_points_at, - branch_top.buf, reflog_msg.buf, transport, !is_local); + branch_top.buf, reflog_msg.buf, transport, + !is_local && !filter_options.choice); update_head(our_head_points_at, remote_head, reflog_msg.buf); @@ -1191,6 +1208,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } junk_mode = JUNK_LEAVE_REPO; + fetch_if_missing = 1; err = checkout(submodule_progress); strbuf_release(_msg); diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 9c56f77..6d37c6d 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -571,4 +571,53 @@ test_expect_success 'GIT_TRACE_PACKFILE produces a usable pack' ' git -C replay.git index-pack -v --stdin err && + + test_i18ngrep "filtering not recognized by server" err +' + +. "$TEST_DIRECTORY"/lib-httpd.sh +start_httpd + +test_expect_success 'partial clone using HTTP' ' + partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server" +' + +stop_httpd + test_done -- 2.9.3
[PATCH v7 12/16] unpack-trees: batch fetching of missing blobs
From: Jonathan TanWhen running checkout, first prefetch all blobs that are to be updated but are missing. This means that only one pack is downloaded during such operations, instead of one per missing blob. This operates only on the blob level - if a repository has a missing tree, they are still fetched one at a time. This does not use the delayed checkout mechanism introduced in commit 2841e8f ("convert: add "status=delayed" to filter process protocol", 2017-06-30) due to significant conceptual differences - in particular, for partial clones, we already know what needs to be fetched based on the contents of the local repo alone, whereas for status=delayed, it is the filter process that tells us what needs to be checked in the end. Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- fetch-object.c | 26 ++ fetch-object.h | 5 + t/t5601-clone.sh | 52 unpack-trees.c | 22 ++ 4 files changed, 101 insertions(+), 4 deletions(-) diff --git a/fetch-object.c b/fetch-object.c index 258fcfa..853624f 100644 --- a/fetch-object.c +++ b/fetch-object.c @@ -5,11 +5,10 @@ #include "transport.h" #include "fetch-object.h" -void fetch_object(const char *remote_name, const unsigned char *sha1) +static void fetch_refs(const char *remote_name, struct ref *ref) { struct remote *remote; struct transport *transport; - struct ref *ref; int original_fetch_if_missing = fetch_if_missing; fetch_if_missing = 0; @@ -18,10 +17,29 @@ void fetch_object(const char *remote_name, const unsigned char *sha1) die(_("Remote with no URL")); transport = transport_get(remote, remote->url[0]); - ref = alloc_ref(sha1_to_hex(sha1)); - hashcpy(ref->old_oid.hash, sha1); transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1"); transport_fetch_refs(transport, ref); fetch_if_missing = original_fetch_if_missing; } + +void fetch_object(const char *remote_name, const unsigned char *sha1) +{ + struct ref *ref = alloc_ref(sha1_to_hex(sha1)); + hashcpy(ref->old_oid.hash, sha1); + fetch_refs(remote_name, ref); +} + +void fetch_objects(const char *remote_name, const struct oid_array *to_fetch) +{ + struct ref *ref = NULL; + int i; + + for (i = 0; i < to_fetch->nr; i++) { + struct ref *new_ref = alloc_ref(oid_to_hex(_fetch->oid[i])); + oidcpy(_ref->old_oid, _fetch->oid[i]); + new_ref->next = ref; + ref = new_ref; + } + fetch_refs(remote_name, ref); +} diff --git a/fetch-object.h b/fetch-object.h index f371300..4b269d0 100644 --- a/fetch-object.h +++ b/fetch-object.h @@ -1,6 +1,11 @@ #ifndef FETCH_OBJECT_H #define FETCH_OBJECT_H +#include "sha1-array.h" + extern void fetch_object(const char *remote_name, const unsigned char *sha1); +extern void fetch_objects(const char *remote_name, + const struct oid_array *to_fetch); + #endif diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 6d37c6d..13610b7 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -611,6 +611,58 @@ test_expect_success 'partial clone: warn if server does not support object filte test_i18ngrep "filtering not recognized by server" err ' +test_expect_success 'batch missing blob request during checkout' ' + rm -rf server client && + + test_create_repo server && + echo a >server/a && + echo b >server/b && + git -C server add a b && + + git -C server commit -m x && + echo aa >server/a && + echo bb >server/b && + git -C server add a b && + git -C server commit -m x && + + test_config -C server uploadpack.allowfilter 1 && + test_config -C server uploadpack.allowanysha1inwant 1 && + + git clone --filter=blob:limit=0 "file://$(pwd)/server" client && + + # Ensure that there is only one negotiation by checking that there is + # only "done" line sent. ("done" marks the end of negotiation.) + GIT_TRACE_PACKET="$(pwd)/trace" git -C client checkout HEAD^ && + grep "git> done" trace >done_lines && + test_line_count = 1 done_lines +' + +test_expect_success 'batch missing blob request does not inadvertently try to fetch gitlinks' ' + rm -rf server client && + + test_create_repo repo_for_submodule && + test_commit -C repo_for_submodule x && + + test_create_repo server && + echo a >server/a && + echo b >server/b && + git -C server add a b && + git -C server commit -m x && + + echo aa >server/a && + echo bb >server/b && + # Also add a gitlink pointing to an arbitrary repository + git -C server submodule add
[PATCH v7 13/16] fetch-pack: restore save_commit_buffer after use
From: Jonathan TanIn fetch-pack, the global variable save_commit_buffer is set to 0, but not restored to its original value after use. In particular, if show_log() (in log-tree.c) is invoked after fetch_pack() in the same process, show_log() will return before printing out the commit message (because the invocation to get_cached_commit_buffer() returns NULL, because the commit buffer was not saved). I discovered this when attempting to run "git log -S" in a partial clone, triggering the case where revision walking lazily loads missing objects. Therefore, restore save_commit_buffer to its original value after use. An alternative to solve the problem I had is to replace get_cached_commit_buffer() with get_commit_buffer(). That invocation was introduced in commit a97934d ("use get_cached_commit_buffer where appropriate", 2014-06-13) to replace "commit->buffer" introduced in commit 3131b71 ("Add "--show-all" revision walker flag for debugging", 2008-02-13). In the latter commit, the commit author seems to be deciding between not showing an unparsed commit at all and showing an unparsed commit without the message (which is what the commit does), and did not mention parsing the unparsed commit, so I prefer to preserve the existing behavior. Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- fetch-pack.c | 4 1 file changed, 4 insertions(+) diff --git a/fetch-pack.c b/fetch-pack.c index 3c5f064..773453c 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -717,6 +717,7 @@ static int everything_local(struct fetch_pack_args *args, { struct ref *ref; int retval; + int old_save_commit_buffer = save_commit_buffer; timestamp_t cutoff = 0; save_commit_buffer = 0; @@ -786,6 +787,9 @@ static int everything_local(struct fetch_pack_args *args, print_verbose(args, _("already have %s (%s)"), oid_to_hex(remote), ref->name); } + + save_commit_buffer = old_save_commit_buffer; + return retval; } -- 2.9.3
[PATCH v7 09/16] fetch: support filters
From: Jeff HostetlerTeach fetch to support filters. This is only allowed for the remote configured in extensions.partialcloneremote. Signed-off-by: Jonathan Tan --- builtin/fetch.c | 23 +-- connected.c | 2 ++ remote-curl.c | 6 ++ t/t5500-fetch-pack.sh | 36 4 files changed, 65 insertions(+), 2 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 1b1f039..14aab71 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -18,6 +18,7 @@ #include "argv-array.h" #include "utf8.h" #include "packfile.h" +#include "list-objects-filter-options.h" static const char * const builtin_fetch_usage[] = { N_("git fetch [] [ [...]]"), @@ -55,6 +56,7 @@ static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND; static int shown_url = 0; static int refmap_alloc, refmap_nr; static const char **refmap_array; +static struct list_objects_filter_options filter_options; static int git_fetch_config(const char *k, const char *v, void *cb) { @@ -160,6 +162,7 @@ static struct option builtin_fetch_options[] = { TRANSPORT_FAMILY_IPV4), OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"), TRANSPORT_FAMILY_IPV6), + OPT_PARSE_LIST_OBJECTS_FILTER(_options), OPT_END() }; @@ -1044,6 +1047,11 @@ static struct transport *prepare_transport(struct remote *remote, int deepen) set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, "yes"); if (update_shallow) set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes"); + if (filter_options.choice) { + set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER, + filter_options.filter_spec); + set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); + } return transport; } @@ -1328,6 +1336,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) packet_trace_identity("fetch"); + fetch_if_missing = 0; + /* Record the command line for the reflog */ strbuf_addstr(_rla, "fetch"); for (i = 1; i < argc; i++) @@ -1361,6 +1371,9 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) if (depth || deepen_since || deepen_not.nr) deepen = 1; + if (filter_options.choice && !repository_format_partial_clone) + die("--filter can only be used when extensions.partialClone is set"); + if (all) { if (argc == 1) die(_("fetch --all does not take a repository argument")); @@ -1390,10 +1403,16 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) } } - if (remote) + if (remote) { + if (filter_options.choice && + strcmp(remote->name, repository_format_partial_clone)) + die(_("--filter can only be used with the remote configured in core.partialClone")); result = fetch_one(remote, argc, argv); - else + } else { + if (filter_options.choice) + die(_("--filter can only be used with the remote configured in core.partialClone")); result = fetch_multiple(); + } if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) { struct argv_array options = ARGV_ARRAY_INIT; diff --git a/connected.c b/connected.c index f416b05..3a5bd67 100644 --- a/connected.c +++ b/connected.c @@ -56,6 +56,8 @@ int check_connected(sha1_iterate_fn fn, void *cb_data, argv_array_push(_list.args,"rev-list"); argv_array_push(_list.args, "--objects"); argv_array_push(_list.args, "--stdin"); + if (repository_format_partial_clone) + argv_array_push(_list.args, "--exclude-promisor-objects"); argv_array_push(_list.args, "--not"); argv_array_push(_list.args, "--all"); argv_array_push(_list.args, "--quiet"); diff --git a/remote-curl.c b/remote-curl.c index 4318391..6ec5352 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -24,6 +24,7 @@ struct options { char *deepen_since; struct string_list deepen_not; struct string_list push_options; + char *filter; unsigned progress : 1, check_self_contained_and_connected : 1, cloning : 1, @@ -165,6 +166,9 @@ static int set_option(const char *name, const char *value) } else if (!strcmp(name, "no-dependents")) { options.no_dependents = 1; return 0; + } else if (!strcmp(name, "filter")) { + options.filter = xstrdup(value);; + return 0; } else { return 1 /* unsupported */; } @@ -834,6 +838,8 @@ static int fetch_git(struct discovery *heads,
[PATCH v7 03/16] gc: do not repack promisor packfiles
From: Jonathan TanTeach gc to stop traversal at promisor objects, and to leave promisor packfiles alone. This has the effect of only repacking non-promisor packfiles, and preserves the distinction between promisor packfiles and non-promisor packfiles. Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- Documentation/git-pack-objects.txt | 11 builtin/gc.c | 3 +++ builtin/pack-objects.c | 37 -- builtin/prune.c| 7 + builtin/repack.c | 8 -- t/t0410-partial-clone.sh | 54 -- 6 files changed, 114 insertions(+), 6 deletions(-) diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index aa403d0..81bc490 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -255,6 +255,17 @@ a missing object is encountered. This is the default action. The form '--missing=allow-any' will allow object traversal to continue if a missing object is encountered. Missing objects will silently be omitted from the results. ++ +The form '--missing=allow-promisor' is like 'allow-any', but will only +allow object traversal to continue for EXPECTED promisor missing objects. +Unexpected missing object will raise an error. + +--exclude-promisor-objects:: + Omit objects that are known to be in the promisor remote. (This + option has the purpose of operating only on locally created objects, + so that when we repack, we still maintain a distinction between + locally created objects [without .promisor] and objects from the + promisor remote [with .promisor].) This is used with partial clone. SEE ALSO diff --git a/builtin/gc.c b/builtin/gc.c index 3c5eae0..77fa720 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -458,6 +458,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix) argv_array_push(, prune_expire); if (quiet) argv_array_push(, "--no-progress"); + if (repository_format_partial_clone) + argv_array_push(, + "--exclude-promisor-objects"); if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) return error(FAILED_RUN, prune.argv[0]); } diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 45ad35d..f5fc401 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -75,6 +75,8 @@ static int use_bitmap_index = -1; static int write_bitmap_index; static uint16_t write_bitmap_options; +static int exclude_promisor_objects; + static unsigned long delta_cache_size = 0; static unsigned long max_delta_cache_size = 256 * 1024 * 1024; static unsigned long cache_max_small_delta_size = 1000; @@ -84,8 +86,9 @@ static unsigned long window_memory_limit = 0; static struct list_objects_filter_options filter_options; enum missing_action { - MA_ERROR = 0,/* fail if any missing objects are encountered */ - MA_ALLOW_ANY,/* silently allow ALL missing objects */ + MA_ERROR = 0, /* fail if any missing objects are encountered */ + MA_ALLOW_ANY, /* silently allow ALL missing objects */ + MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */ }; static enum missing_action arg_missing_action; static show_object_fn fn_show_object; @@ -2577,6 +2580,20 @@ static void show_object__ma_allow_any(struct object *obj, const char *name, void show_object(obj, name, data); } +static void show_object__ma_allow_promisor(struct object *obj, const char *name, void *data) +{ + assert(arg_missing_action == MA_ALLOW_PROMISOR); + + /* +* Quietly ignore EXPECTED missing objects. This avoids problems with +* staging them now and getting an odd error later. +*/ + if (!has_object_file(>oid) && is_promisor_object(>oid)) + return; + + show_object(obj, name, data); +} + static int option_parse_missing_action(const struct option *opt, const char *arg, int unset) { @@ -2591,10 +2608,18 @@ static int option_parse_missing_action(const struct option *opt, if (!strcmp(arg, "allow-any")) { arg_missing_action = MA_ALLOW_ANY; + fetch_if_missing = 0; fn_show_object = show_object__ma_allow_any; return 0; } + if (!strcmp(arg, "allow-promisor")) { + arg_missing_action = MA_ALLOW_PROMISOR; + fetch_if_missing = 0; + fn_show_object = show_object__ma_allow_promisor; + return 0; + } + die(_("invalid value for --missing"));
[PATCH v7 00/16] Parial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests
From: Jeff HostetlerThis is V7 of part 3 of partial clone. It builds upon V7 of part 2 (which builds upon V6 of part 1). This version adds additional tests, fixes test errors on the MAC version, and squashes some fixup commits. It also restores functionality accidentally dropped from the V6 series for "git fetch" to automatically inherit the partial-clone filter-spec when appropriate. This version extends the --no-filter argument to override this inheritance. Jeff Hostetler (8): upload-pack: add object filtering for partial clone fetch-pack, index-pack, transport: partial clone fetch-pack: add --no-filter fetch: support filters partial-clone: define partial clone settings in config t5616: end-to-end tests for partial clone fetch: inherit filter-spec from partial clone t5616: test bulk prefetch after partial fetch Jonathan Tan (8): sha1_file: support lazily fetching missing objects rev-list: support termination at promisor objects gc: do not repack promisor packfiles fetch-pack: test support excluding large blobs fetch: refactor calculation of remote list clone: partial clone unpack-trees: batch fetching of missing blobs fetch-pack: restore save_commit_buffer after use Documentation/config.txt | 4 + Documentation/git-pack-objects.txt| 11 ++ Documentation/rev-list-options.txt| 11 ++ Documentation/technical/pack-protocol.txt | 8 + Documentation/technical/protocol-capabilities.txt | 8 + builtin/cat-file.c| 2 + builtin/clone.c | 22 ++- builtin/fetch-pack.c | 10 ++ builtin/fetch.c | 83 - builtin/fsck.c| 3 + builtin/gc.c | 3 + builtin/index-pack.c | 6 + builtin/pack-objects.c| 37 +++- builtin/prune.c | 7 + builtin/repack.c | 8 +- builtin/rev-list.c| 73 +++- cache.h | 9 + config.c | 5 + connected.c | 2 + environment.c | 1 + fetch-object.c| 29 ++- fetch-object.h| 5 + fetch-pack.c | 17 ++ fetch-pack.h | 2 + list-objects-filter-options.c | 92 -- list-objects-filter-options.h | 18 ++ list-objects.c| 29 ++- object.c | 2 +- remote-curl.c | 6 + revision.c| 33 +++- revision.h| 5 +- sha1_file.c | 32 +++- t/t0410-partial-clone.sh | 206 +- t/t5500-fetch-pack.sh | 63 +++ t/t5601-clone.sh | 101 +++ t/t5616-partial-clone.sh | 146 +++ transport-helper.c| 5 + transport.c | 4 + transport.h | 5 + unpack-trees.c| 22 +++ upload-pack.c | 31 +++- 41 files changed, 1110 insertions(+), 56 deletions(-) create mode 100755 t/t5616-partial-clone.sh -- 2.9.3
Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)
Christian Couderwrites: > On Thu, Dec 7, 2017 at 7:04 PM, Junio C Hamano wrote: > > >> * cc/skip-to-optional-val (2017-12-07) 7 commits >> - t4045: test 'diff --relative' for real >> - t4045: reindent to make helpers readable >> - diff: use skip-to-optional-val in parsing --relative >> - diff: use skip_to_optional_val_default() >> - diff: use skip_to_optional_val() >> - index-pack: use skip_to_optional_val() >> - git-compat-util: introduce skip_to_optional_val() >> >> Introduce a helper to simplify code to parse a common pattern that >> expects either "--key" or "--key=". >> >> Even though I queued fixes for "diff --relative" on top, it may >> still want a final reroll to make it harder to misuse by allowing >> NULL at the valp part of the argument. > > Yeah, I already implemented that and it will be in the next v3 version. Good. I am hoping that you've followed the discussion on the tests, where all of us agreed that the approach taken by Jacob's one is preferrable over what is queued above? >> Also s/_val/_arg/. > > I am not sure that is a good idea, because it could suggest that the > functions are designed to parse only command option arguments, while > they can be used to parse any "key=val" string where "key" is also > allowed. > >> cf. >> cf. > > It doesn't look like s/_val/_arg/ was discussed in the above messages. It came from your statement that was made before the thread, where you said you'll rename it to use arg after I said I suspect that arg would make more sense than val. https://public-inbox.org/git/CAP8UFD2OSsqzhyAL-QG1TOowB-xgbf=kc9whre+flc+0j1x...@mail.gmail.com/ Thanks.
Re: Unfortunate interaction between git diff-index and fakeroot
Elazar Leibovichwrites: > ignore unused information, such as commit > 2cb45e95438c113871fbbea5b4f629f9463034e7 > which ignores st_dev, because it doesn't actually matter, or I do not think it ignores because "it doesn't matter". st_dev is known not to be stable (e.g. across reboots and reconnects nfs), so it is a poor indicator to use as a quick measure to see if the file contents may have changed since we last checked. On the other hand, in a sane (it is of course open to debate "by whose definition") environment, the fact that the owner of the file has changed _is_ a significant enough sign to suspect that the file contents have changed (i.e. the file is suspect to be dirty; you can actually check and refresh, of course, though). > commit e44794706eeb57f2ee38ed1604821aa38b8ad9d2 which ignores > mode changes not relevant to the owner. And that one is not even about the cached stat information used for the quick "dirty-ness" check. The change is about the mode bits in recorded history. File's mtime is not recorded in the history, either, but we do care and record it in the index, because it can be used as a good (albeit a bit too coarse grained) indicator that the contents of a file may have changed. The owner and group fall into the same category.
Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)
On Thu, Dec 7, 2017 at 7:04 PM, Junio C Hamanowrote: > * cc/skip-to-optional-val (2017-12-07) 7 commits > - t4045: test 'diff --relative' for real > - t4045: reindent to make helpers readable > - diff: use skip-to-optional-val in parsing --relative > - diff: use skip_to_optional_val_default() > - diff: use skip_to_optional_val() > - index-pack: use skip_to_optional_val() > - git-compat-util: introduce skip_to_optional_val() > > Introduce a helper to simplify code to parse a common pattern that > expects either "--key" or "--key=". > > Even though I queued fixes for "diff --relative" on top, it may > still want a final reroll to make it harder to misuse by allowing > NULL at the valp part of the argument. Yeah, I already implemented that and it will be in the next v3 version. > Also s/_val/_arg/. I am not sure that is a good idea, because it could suggest that the functions are designed to parse only command option arguments, while they can be used to parse any "key=val" string where "key" is also allowed. > cf. > cf. It doesn't look like s/_val/_arg/ was discussed in the above messages.
[PATCH v7 05/10] fsck: support promisor objects as CLI argument
From: Jonathan TanTeach fsck to not treat missing promisor objects provided on the CLI as an error when extensions.partialclone is set. Signed-off-by: Jonathan Tan --- builtin/fsck.c | 2 ++ t/t0410-partial-clone.sh | 13 + 2 files changed, 15 insertions(+) diff --git a/builtin/fsck.c b/builtin/fsck.c index 4c2a56d..578a7c8 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -750,6 +750,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) struct object *obj = lookup_object(oid.hash); if (!obj || !(obj->flags & HAS_OBJ)) { + if (is_promisor_object()) + continue; error("%s: object missing", oid_to_hex()); errors_found |= ERROR_OBJECT; continue; diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index 4f9931f..e96f436 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -125,4 +125,17 @@ test_expect_success 'missing object, but promised, passes fsck' ' git -C repo fsck ' +test_expect_success 'missing CLI object, but promised, passes fsck' ' + rm -rf repo && + test_create_repo repo && + test_commit -C repo my_commit && + + A=$(git -C repo commit-tree -m a HEAD^{tree}) && + promise_and_delete "$A" && + + git -C repo config core.repositoryformatversion 1 && + git -C repo config extensions.partialclone "arbitrary string" && + git -C repo fsck "$A" +' + test_done -- 2.9.3