Re: [PATCH 0/4] subtree: move out of contrib
On Tue, May 01 2018, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Mon, 30 Apr 2018, Ævar Arnfjörð Bjarmason wrote:
>
>> I think at this point git-subtree is widely used enough to move out of
>> contrib/, maybe others disagree, but patches are always better for
>> discussion than patch-less ML posts.
>
> Sure, it is used widely enough.
>
> However, it flies in the face of so many GSoC efforts to introduce yet
> another one of those poorly portable Unix shell scripts, as central part
> of Git's code base.
>
> The script itself does look quite straight-forward to port to a builtin,
> so why not give it a try?

That's a valid point. I think it makes sense to leave that aside for
now; maybe the consensus is that subtree is fine in every way except
that we'd like to have a policy of not introducing new shellscript
built-ins.

Let's first just assume it's in C already and look at it in terms of
its functionality, to figure out if it's worth even getting to that
point.

> If you are completely opposed to porting it to C, I will be completely
> opposed to moving it out of contrib/.

This series shows that we should split the concern about whether
something lives in contrib/ from whether it's built/installed by
default.

Regardless of whether we decide that subtree should be a blessed
default command, it makes sense to move it out of contrib, purely
because, as can be seen from this series, it'll replace >100 lines of
hacks with 1 line in our main Makefile.

We can then just add a flag to guard for it, e.g.
CONTRIB_SUBTREE=YesPlease. But that's just an internal implementation
detail of how we manage code sitting in git.git.
Re: [PATCH 0/4] subtree: move out of contrib
Hi Ævar,

On Mon, 30 Apr 2018, Ævar Arnfjörð Bjarmason wrote:

> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion than patch-less ML posts.

Sure, it is used widely enough.

However, it flies in the face of so many GSoC efforts to introduce yet
another one of those poorly portable Unix shell scripts, as central part
of Git's code base.

The script itself does look quite straight-forward to port to a builtin,
so why not give it a try?

If you are completely opposed to porting it to C, I will be completely
opposed to moving it out of contrib/.

If you need help with porting it, please come up with a task plan and I
can jump in, to help (but please do collaborate with me on this one,
don't leave all of the hard work to me).

Ciao,
Dscho
Re: [PATCH 0/4] subtree: move out of contrib
On Mon, Apr 30, 2018 at 11:50 AM, Ævar Arnfjörð Bjarmason wrote:
> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion than patch-less ML posts.

After narrow/partial clone becomes real, it should be "easy" to
implement some sort of narrow checkout that would achieve the same
thing. But it took me forever with all other stuff to get back to
this.

If we remove it from contrib and there are people willing to
update/maintain it, should it be a separate repository then? The
willing people will have much more freedom to update it. And I don't
have to answer the questions about who will maintain this thing in
git.git.

I don't like the idea of adding new (official) shell-based commands
either to be honest.
--
Duy
Re: [PATCH 0/4] subtree: move out of contrib
On Mon, Apr 30, 2018 at 6:21 PM, Ævar Arnfjörð Bjarmason wrote:
> Pretty clear it's garbage data, unless we're to believe that the
> relative interest of submodules in the US, Germany and Sweden is 51, 64
> & 84, but 75, 100 and 0 for subtree.

Oh yeah, Swedish people hate git-subtree. Nobody knows why.

Avery
Re: [PATCH 0/4] subtree: move out of contrib
On Mon, Apr 30 2018, Avery Pennarun wrote:

> On Mon, Apr 30, 2018 at 5:38 PM, Stefan Beller wrote:
> There's one exception, which is doing a one-time permanent merge of
> two projects into one. That's a nice feature, but is probably used
> extremely rarely.

FWIW this is the only thing I've used it for. I do this occasionally.
When I needed to merge some repositories together I used to do it
manually with format-patch + "perl -pe" or similar, and then some other
times I was less stupid and manually started doing something similar to
what subtree is doing, with a "move everything" commit just before the
merge of the two histories.

>> https://trends.google.com/trends/explore?date=all=git%20subtree,git%20submodule
>>
>> Not sure what to make of this data.
>
> Clearly people need a lot more help when using submodules than when
> using subtree :)

Pretty clear it's garbage data, unless we're to believe that the
relative interest of submodules in the US, Germany and Sweden is 51, 64
& 84, but 75, 100 and 0 for subtree.
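For readers unfamiliar with the manual approach mentioned above, here
is a minimal sketch of a "move everything" commit followed by a merge
of the two histories. It is not taken from the series or from
git-subtree itself; the function name merge_into_subdir and its
arguments are hypothetical. It assumes a committer identity is
configured, that the chosen subdirectory does not clash with a
top-level entry of the other repository, and a Git new enough to have
--allow-unrelated-histories.

```shell
#!/bin/sh
# Sketch only: merge repository <other> into the current repository,
# placing all of <other>'s files under <subdir>.
# merge_into_subdir is a hypothetical helper, not part of Git.
merge_into_subdir () {
	other=$1 subdir=$2 &&
	tmp=$(mktemp -d) &&
	# Work on a throwaway clone so <other> itself is left untouched.
	git clone -q "$other" "$tmp/clone" &&
	(
		cd "$tmp/clone" &&
		mkdir "$subdir" &&
		# The "move everything" commit: relocate each top-level
		# tracked entry into <subdir>/.
		git ls-tree --name-only -z HEAD |
		xargs -0 -I{} git mv -- {} "$subdir/" &&
		git commit -q -m "Move everything into $subdir/"
	) &&
	# The paths no longer collide, so the unrelated histories merge
	# cleanly.
	git fetch -q "$tmp/clone" HEAD &&
	git merge -q --allow-unrelated-histories \
		-m "Merge $subdir" FETCH_HEAD &&
	rm -rf "$tmp"
}
```

Run from inside the receiving repository, e.g.
"merge_into_subdir ../other-project vendor". Both histories stay
intact; the only synthetic commit is the move itself.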
Re: [PATCH 0/4] subtree: move out of contrib
On Mon, Apr 30, 2018 at 2:53 PM, Avery Pennarun wrote:

> For the best of both worlds, I've often thought that a good balance
> would be to use the same data structure that submodule uses, but to
> store all the code in a single git repo under different refs, which we
> might or might not download (or might or might not have different
> ACLs) under different circumstances.

There has been some experimentation with having a simpler ref
surface on the submodule side,
https://public-inbox.org/git/cover.1512168087.git.jonathanta...@google.com/

The way you describe the future of submodules, all we'd have to do
is to teach git-clone how to select the "interesting" refs for your
use case. Any other command would assume all submodule data
to be in the main repository.

The difference from Jonathan's proposal linked above would be that
the object store is in the main repo and the refs are prefixed
per submodule instead of "shadowed".

> However, when some projects get
> really huge (lots of very big submodule dependencies), then repacking
> one-big-repo starts becoming unwieldy; in that situation git-subtree
> also fails completely.

Yes, but that is a general scaling problem of Git that could be
tackled, e.g. repack into multiple packs serially instead of putting
everything into one pack.

>> Submodules do not need to produce a synthetic project history
>> when splitting off again, as the history is genuine. This allows
>> for easier work with upstream.
>
> Splitting for easier work upstream is great, and there really ought to
> be an official version of 'git subtree split', which is good for all
> sorts of purposes.
>
> However, I suspect almost all uses of the split feature are a)
> splitting a subtree that you previously merged in, or b) splitting a
> subtree into a separate project that you want to maintain separately
> from now on. Repeated splits in case (a) are only necessary because
> you're not using submodules, or in case (b) are only necessary because
> you didn't *switch* to submodules when it finally came time to split
> the projects. (In both cases you probably didn't switch to submodules
> because you didn't like one of its tradeoffs, especially the need to
> track multiple repos when you fork.)

That makes sense.

>
> There's one exception, which is doing a one-time permanent merge of
> two projects into one. That's a nice feature, but is probably used
> extremely rarely. More often people get into a
> merge-split-merge-split cycle that would be better served by a
> slightly improved git-submodule.

This rare use case is how git-subtree came into existence in
Git's contrib directory AFAICT,
https://kernel.googlesource.com/pub/scm/git/git/+/634392b26275fe5436c0ea131bc89b46476aa4ae
which is interesting to view in git-show, but I think defaults
could be tweaked there, as it currently shows me mostly a
license file.

>> Conceptually Gerrit is doing
>>
>>   while true:
>>     git submodule update --remote
>>     if worktree is dirty:
>>         git commit "update the submodules"
>>
>> just that Gerrit doesn't poll but does it event based.
>
> ...and it's super handy :) The problem is it's fundamentally
> centralized: because gerrit can serialize merges into the submodule,
> it also knows exactly how to update the link in the supermodule. If
> there was wild branching and merging (as there often is in git) and
> you had to resolve conflicts between two submodules, I don't think it
> would be obvious at all how to do it automatically when pushing a
> submodule. (This also works quite badly with git subtree --squash.)

With the poll based solution I don't think you'd run into many more
problems than you would with Gerrit's solution.

In a nearby thread, we were just discussing the submodule merging
strategies,
https://public-inbox.org/git/1524739599.20251.17.ca...@klsmartin.com/
which might seem confusing, but the implementation is actually easy
as we just fastforward-only in submodules.

>>
>> https://trends.google.com/trends/explore?date=all=git%20subtree,git%20submodule
>>
>> Not sure what to make of this data.
>
> Clearly people need a lot more help when using submodules than when
> using subtree :)

That could be true. :)

Thanks,
Stefan
Re: [PATCH 0/4] subtree: move out of contrib
On Mon, Apr 30, 2018 at 5:38 PM, Stefan Beller wrote:
> On Mon, Apr 30, 2018 at 1:45 PM, Avery Pennarun wrote:
> No objections from me either.
>
> Submodules seem to serve a slightly different purpose, though?

I think the purpose is actually the same - it's just the tradeoffs
that are different. Both sets of tradeoffs kind of suck.

> With Subtrees the superproject always contains all the code,
> even when you squash the subtree history when merging it in.
> In the submodule world, you may not have access to one of the
> submodules.

Right. Personally I think it's a disadvantage of subtree that it
always contains all the code (what if some people don't want the code
for a particular build variant?). However, it's a huge pain that
submodules *don't* contain all the code (what if I'm not online right
now, or the site supposedly containing the code goes offline, or I
want to make my own fork?).

For the best of both worlds, I've often thought that a good balance
would be to use the same data structure that submodule uses, but to
store all the code in a single git repo under different refs, which we
might or might not download (or might or might not have different
ACLs) under different circumstances. However, when some projects get
really huge (lots of very big submodule dependencies), then repacking
one-big-repo starts becoming unwieldy; in that situation git-subtree
also fails completely.

> Submodules do not need to produce a synthetic project history
> when splitting off again, as the history is genuine. This allows
> for easier work with upstream.

Splitting for easier work upstream is great, and there really ought to
be an official version of 'git subtree split', which is good for all
sorts of purposes.

However, I suspect almost all uses of the split feature are a)
splitting a subtree that you previously merged in, or b) splitting a
subtree into a separate project that you want to maintain separately
from now on. Repeated splits in case (a) are only necessary because
you're not using submodules, or in case (b) are only necessary because
you didn't *switch* to submodules when it finally came time to split
the projects. (In both cases you probably didn't switch to submodules
because you didn't like one of its tradeoffs, especially the need to
track multiple repos when you fork.)

> Subtrees present you the whole history by default and the user
> needs to be explicit about not wanting to see history from the
> subtree, which is the opposite of submodules (though this
> may be planned in the future to switch).

It turns out that AFAIK, almost everyone prefers 'git subtree
--squash', which squashes into a single commit each time you merge,
much like git submodule does. I doubt people would cry too much if
the full-history feature went away.

There's one exception, which is doing a one-time permanent merge of
two projects into one. That's a nice feature, but is probably used
extremely rarely. More often people get into a
merge-split-merge-split cycle that would be better served by a
slightly improved git-submodule.

>> The gerrit team (e.g. Stefan Beller) has been doing some really great
>> stuff to make submodules more usable by helping with relative
>> submodule links and by auto-updating links in supermodules at the
>> right times. Unfortunately doing that requires help from the server
>> side, which kind of messes up decentralization and so doesn't solve
>> the problem in the general case.
>
> Conceptually Gerrit is doing
>
>   while true:
>     git submodule update --remote
>     if worktree is dirty:
>         git commit "update the submodules"
>
> just that Gerrit doesn't poll but does it event based.

...and it's super handy :) The problem is it's fundamentally
centralized: because gerrit can serialize merges into the submodule,
it also knows exactly how to update the link in the supermodule. If
there was wild branching and merging (as there often is in git) and
you had to resolve conflicts between two submodules, I don't think it
would be obvious at all how to do it automatically when pushing a
submodule. (This also works quite badly with git subtree --squash.)

>> I really wish there were a good answer, but I don't know what it is.
>> I do know that lots of people seem to at least be happy using
>> git-subtree, and would be even happier if it were installed
>> automatically with git.
>
> https://trends.google.com/trends/explore?date=all=git%20subtree,git%20submodule
>
> Not sure what to make of this data.

Clearly people need a lot more help when using submodules than when
using subtree :)

Have fun,

Avery
Re: [PATCH 0/4] subtree: move out of contrib
On Mon, Apr 30, 2018 at 1:45 PM, Avery Pennarun wrote:
> On Mon, Apr 30, 2018 at 5:50 AM, Ævar Arnfjörð Bjarmason
> wrote:
>> I think at this point git-subtree is widely used enough to move out of
>> contrib/, maybe others disagree, but patches are always better for
>> discussion than patch-less ML posts.
>
> I really was hoping git-subtree would be completely obsoleted by
> git-submodule by now, but... it hasn't been, so... no objections on
> this end.

No objections from me either.

Submodules seem to serve a slightly different purpose, though?

With Subtrees the superproject always contains all the code,
even when you squash the subtree history when merging it in.
In the submodule world, you may not have access to one of the
submodules.

Submodules do not need to produce a synthetic project history
when splitting off again, as the history is genuine. This allows
for easier work with upstream.

Subtrees present you the whole history by default and the user
needs to be explicit about not wanting to see history from the
subtree, which is the opposite of submodules (though this
may be planned in the future to switch).

> The gerrit team (e.g. Stefan Beller) has been doing some really great
> stuff to make submodules more usable by helping with relative
> submodule links and by auto-updating links in supermodules at the
> right times. Unfortunately doing that requires help from the server
> side, which kind of messes up decentralization and so doesn't solve
> the problem in the general case.

Conceptually Gerrit is doing

  while true:
    git submodule update --remote
    if worktree is dirty:
        git commit "update the submodules"

just that Gerrit doesn't poll but does it event based.

> I really wish there were a good answer, but I don't know what it is.
> I do know that lots of people seem to at least be happy using
> git-subtree, and would be even happier if it were installed
> automatically with git.

https://trends.google.com/trends/explore?date=all=git%20subtree,git%20submodule

Not sure what to make of this data.
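The conceptual loop in the message above could be written out roughly
as follows. This is only a sketch of the polling variant, not how
Gerrit implements it (Gerrit is event based, as noted). The helper
names update_submodules_once and poll_submodules and the 60 second
default interval are made up for illustration; a committer identity is
assumed to be configured.

```shell
#!/bin/sh
# Sketch only: one iteration of the conceptual loop.
# update_submodules_once is a hypothetical helper name.
update_submodules_once () {
	# Move each submodule to the tip of its remote-tracking branch.
	git submodule update --remote &&
	# "Dirty" here means tracked content differs from HEAD, which
	# includes changed submodule links; uncommitted work inside a
	# submodule is ignored.
	if ! git diff --quiet --ignore-submodules=dirty HEAD
	then
		git commit -q -a -m "update the submodules"
	fi
}

# Polling wrapper; the interval in seconds is an arbitrary choice.
poll_submodules () {
	while true
	do
		update_submodules_once || return
		sleep "${1:-60}"
	done
}
```

Note that "git commit -a" records every tracked change in the
superproject, not only the updated submodule links, so this is best run
in a checkout that is used for nothing else.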
Re: [PATCH 0/4] subtree: move out of contrib
On Mon, Apr 30, 2018 at 5:50 AM, Ævar Arnfjörð Bjarmason wrote:
> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion than patch-less ML posts.

I really was hoping git-subtree would be completely obsoleted by
git-submodule by now, but... it hasn't been, so... no objections on
this end.

The gerrit team (e.g. Stefan Beller) has been doing some really great
stuff to make submodules more usable by helping with relative
submodule links and by auto-updating links in supermodules at the
right times. Unfortunately doing that requires help from the server
side, which kind of messes up decentralization and so doesn't solve
the problem in the general case.

I really wish there were a good answer, but I don't know what it is.
I do know that lots of people seem to at least be happy using
git-subtree, and would be even happier if it were installed
automatically with git.

Have fun,

Avery
Re: [PATCH 0/4] subtree: move out of contrib
From: "Ævar Arnfjörð Bjarmason"
> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion than patch-less ML posts.

Assuming this lands in Git, there will also need to be a simple
follow-on to Duy's series that is updating command-list.txt
(Message-Id: <20180429181844.21325-10-pclo...@gmail.com>).

Duy's series also does the completions thing IIUC ;-).
--
Philip

> Ævar Arnfjörð Bjarmason (4):
>   git-subtree: move from contrib/subtree/
>   subtree: remove support for git version <1.7
>   subtree: fix a test failure under GETTEXT_POISON
>   i18n: translate the git-subtree command
>
>  .gitignore                                    |   1 +
>  Documentation/git-submodule.txt               |   2 +-
>  .../subtree => Documentation}/git-subtree.txt |   3 +
>  Makefile                                      |   1 +
>  contrib/subtree/.gitignore                    |   7 -
>  contrib/subtree/COPYING                       | 339 --
>  contrib/subtree/INSTALL                       |  28 --
>  contrib/subtree/Makefile                      |  97 -
>  contrib/subtree/README                        |   8 -
>  contrib/subtree/t/Makefile                    |  86 -
>  contrib/subtree/todo                          |  48 ---
>  .../subtree/git-subtree.sh => git-subtree.sh  | 109 +++--
>  {contrib/subtree/t => t}/t7900-subtree.sh     |  21 +-
>  13 files changed, 78 insertions(+), 672 deletions(-)
>  rename {contrib/subtree => Documentation}/git-subtree.txt (99%)
>  delete mode 100644 contrib/subtree/.gitignore
>  delete mode 100644 contrib/subtree/COPYING
>  delete mode 100644 contrib/subtree/INSTALL
>  delete mode 100644 contrib/subtree/Makefile
>  delete mode 100644 contrib/subtree/README
>  delete mode 100644 contrib/subtree/t/Makefile
>  delete mode 100644 contrib/subtree/todo
>  rename contrib/subtree/git-subtree.sh => git-subtree.sh (84%)
>  rename {contrib/subtree/t => t}/t7900-subtree.sh (99%)
>
> --
> 2.17.0.290.gded63e768a
[PATCH 0/4] subtree: move out of contrib
I think at this point git-subtree is widely used enough to move out of
contrib/, maybe others disagree, but patches are always better for
discussion than patch-less ML posts.

Ævar Arnfjörð Bjarmason (4):
  git-subtree: move from contrib/subtree/
  subtree: remove support for git version <1.7
  subtree: fix a test failure under GETTEXT_POISON
  i18n: translate the git-subtree command

 .gitignore                                    |   1 +
 Documentation/git-submodule.txt               |   2 +-
 .../subtree => Documentation}/git-subtree.txt |   3 +
 Makefile                                      |   1 +
 contrib/subtree/.gitignore                    |   7 -
 contrib/subtree/COPYING                       | 339 --
 contrib/subtree/INSTALL                       |  28 --
 contrib/subtree/Makefile                      |  97 -
 contrib/subtree/README                        |   8 -
 contrib/subtree/t/Makefile                    |  86 -
 contrib/subtree/todo                          |  48 ---
 .../subtree/git-subtree.sh => git-subtree.sh  | 109 +++--
 {contrib/subtree/t => t}/t7900-subtree.sh     |  21 +-
 13 files changed, 78 insertions(+), 672 deletions(-)
 rename {contrib/subtree => Documentation}/git-subtree.txt (99%)
 delete mode 100644 contrib/subtree/.gitignore
 delete mode 100644 contrib/subtree/COPYING
 delete mode 100644 contrib/subtree/INSTALL
 delete mode 100644 contrib/subtree/Makefile
 delete mode 100644 contrib/subtree/README
 delete mode 100644 contrib/subtree/t/Makefile
 delete mode 100644 contrib/subtree/todo
 rename contrib/subtree/git-subtree.sh => git-subtree.sh (84%)
 rename {contrib/subtree/t => t}/t7900-subtree.sh (99%)

--
2.17.0.290.gded63e768a