Re: [PATCH 0/4] subtree: move out of contrib

2018-05-01 Thread Ævar Arnfjörð Bjarmason

On Tue, May 01 2018, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Mon, 30 Apr 2018, Ævar Arnfjörð Bjarmason wrote:
>
>> I think at this point git-subtree is widely used enough to move out of
>> contrib/, maybe others disagree, but patches are always better for
>> discussion than patch-less ML posts.
>
> Sure, it is used widely enough.
>
> However, it flies in the face of so many GSoC efforts to introduce yet
> another one of those poorly portable Unix shell scripts as a central
> part of Git's code base.
>
> The script itself does look quite straight-forward to port to a builtin,
> so why not give it a try?

That's a valid point. I think it makes sense to leave that aside for
now, maybe the consensus is that subtree is fine in every way except
we'd like to have a policy not to introduce new shellscript built-ins.

Let's first just assume it's in C already and look at it in terms of its
functionality, to figure out if it's worth even getting to that point.

> If you are completely opposed to porting it to C, I will be completely
> opposed to moving it out of contrib/.

This series shows that we should split the concern about whether
something lives in contrib/ from whether it's built/installed by
default.

Regardless of whether we decide that subtree should be a blessed
default command, it makes sense to move it out of contrib, purely
because, as can be seen from this series, it'll replace >100 lines of
hacks with 1 line in our main Makefile.

We can then just add a flag to guard it, e.g.
CONTRIB_SUBTREE=YesPlease.

But that's just an internal implementation detail of how we manage code
sitting in git.git.


Re: [PATCH 0/4] subtree: move out of contrib

2018-05-01 Thread Johannes Schindelin
Hi Ævar,

On Mon, 30 Apr 2018, Ævar Arnfjörð Bjarmason wrote:

> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion than patch-less ML posts.

Sure, it is used widely enough.

However, it flies in the face of so many GSoC efforts to introduce yet
another one of those poorly portable Unix shell scripts as a central
part of Git's code base.

The script itself does look quite straight-forward to port to a builtin,
so why not give it a try?

If you are completely opposed to porting it to C, I will be completely
opposed to moving it out of contrib/.

If you need help with porting it, please come up with a task plan and I
can jump in, to help (but please do collaborate with me on this one, don't
leave all of the hard work to me).

Ciao,
Dscho

Re: [PATCH 0/4] subtree: move out of contrib

2018-05-01 Thread Duy Nguyen
On Mon, Apr 30, 2018 at 11:50 AM, Ævar Arnfjörð Bjarmason wrote:
> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion than patch-less ML posts.

After narrow/partial clone becomes real, it should be "easy" to
implement some sort of narrow checkout that would achieve the same
thing. But it took me forever with all other stuff to get back to
this.

If we remove it from contrib and there are people willing to
update/maintain it, should it be a separate repository then? The
willing people will have much more freedom to update it. And I don't
have to answer the questions about who will maintain this thing in
git.git. I don't like the idea of adding new (official) shell-based
commands either to be honest.
-- 
Duy


Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Avery Pennarun
On Mon, Apr 30, 2018 at 6:21 PM, Ævar Arnfjörð Bjarmason wrote:
> Pretty clear it's garbage data, unless we're to believe that the
> relative interest of submodules in the US, Germany and Sweden is 51, 64
> & 84, but 75, 100 and 0 for subtree.

Oh yeah, Swedish people hate git-subtree.  Nobody knows why.

Avery


Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Ævar Arnfjörð Bjarmason

On Mon, Apr 30 2018, Avery Pennarun wrote:

> On Mon, Apr 30, 2018 at 5:38 PM, Stefan Beller  wrote:
> There's one exception, which is doing a one-time permanent merge of
> two projects into one.  That's a nice feature, but is probably used
> extremely rarely.

FWIW this is the only thing I've used it for. I do this occasionally
when I need to merge some repositories together. I used to do it
manually with format-patch + "perl -pe" or similar; other times I was
less stupid and did something similar to what subtree does, with a
"move everything" commit just before the merge of the two histories.
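
The manual approach described above (a "move everything" commit followed by a merge of the two histories) could be sketched roughly as follows. This is only an illustration, not what git-subtree itself runs: the function name merge_repo_under_prefix, the temporary clone and the commit messages are made up for the example, and it assumes Git 2.9 or later (for --allow-unrelated-histories) and plain path names without unusual characters.

```shell
# Hypothetical helper: merge the history of repository $1 into the
# current repository, with all of its files placed under directory $2.
merge_repo_under_prefix () {
	other=$1 prefix=$2
	tmp=$(mktemp -d) || return 1

	# Work on a throwaway clone so the other repository is not modified.
	git clone --quiet "$other" "$tmp/clone" &&
	(
		cd "$tmp/clone" &&
		mkdir -p "$prefix" &&
		# The "move everything" commit: every top-level entry of
		# HEAD is moved under the prefix directory.
		git ls-tree --name-only HEAD |
		while read -r entry
		do
			git mv "$entry" "$prefix/" || exit 1
		done &&
		git commit --quiet -m "Move everything into $prefix/"
	) &&
	# Merge the rewritten tip; the two histories share no ancestor.
	git fetch --quiet "$tmp/clone" HEAD &&
	git merge --quiet --allow-unrelated-histories \
		-m "Merge history of $prefix/" FETCH_HEAD
	status=$?

	rm -rf "$tmp"
	return $status
}
```

It would be called from inside the receiving repository, for example as `merge_repo_under_prefix ../other-project vendor/other` (both arguments are placeholders). The full history of the other project stays reachable through the merge commit, which is the one-time permanent merge case discussed in this thread.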

>> https://trends.google.com/trends/explore?date=all&q=git%20subtree,git%20submodule
>>
>> Not sure what to make of this data.
>
> Clearly people need a lot more help when using submodules than when
> using subtree :)

Pretty clear it's garbage data, unless we're to believe that the
relative interest of submodules in the US, Germany and Sweden is 51, 64
& 84, but 75, 100 and 0 for subtree.


Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Stefan Beller
On Mon, Apr 30, 2018 at 2:53 PM, Avery Pennarun  wrote:

> For the best of both worlds, I've often thought that a good balance
> would be to use the same data structure that submodule uses, but to
> store all the code in a single git repo under different refs, which we
> might or might not download (or might or might not have different
> ACLs) under different circumstances.

There has been some experimentation with having a simpler ref
surface on the submodule side,
https://public-inbox.org/git/cover.1512168087.git.jonathanta...@google.com/

The way you describe the future of submodules, all we'd have to do
is to teach git-clone how to select the "interesting" refs for your use
case. Any other command would assume all submodule data to be
in the main repository.

The difference from Jonathan's proposal linked above would be that the
object store is in the main repo and the refs are prefixed per
submodule instead of "shadowed".

>  However, when some projects get
> really huge (lots of very big submodule dependencies), then repacking
> one-big-repo starts becoming unwieldy; in that situation git-subtree
> also fails completely.

Yes, but that is a general scaling problem of Git that could be tackled,
e.g. repack into multiple packs serially instead of putting everything
into one pack.

>> Submodules do not need to produce a synthetic project history
>> when splitting off again, as the history is genuine. This allows
>> for easier work with upstream.
>
> Splitting for easier work upstream is great, and there really ought to
> be an official version of 'git subtree split', which is good for all
> sorts of purposes.
>
> However, I suspect almost all uses of the split feature are a)
> splitting a subtree that you previously merged in, or b) splitting a
> subtree into a separate project that you want to maintain separately
> from now on.  Repeated splits in case (a) are only necessary because
> you're not using submodules, or in case (b) are only necessary because
> you didn't *switch* to submodules when it finally came time to split
> the projects.  (In both cases you probably didn't switch to submodules
> because you didn't like one of its tradeoffs, especially the need to
> track multiple repos when you fork.)

That makes sense.

>
> There's one exception, which is doing a one-time permanent merge of
> two projects into one.  That's a nice feature, but is probably used
> extremely rarely.  More often people get into a
> merge-split-merge-split cycle that would be better served by a
> slightly improved git-submodule.

This rare use case is how git-subtree came into existence in Git's
contrib directory AFAICT,
https://kernel.googlesource.com/pub/scm/git/git/+/634392b26275fe5436c0ea131bc89b46476aa4ae
which is interesting to view in git-show, but I think defaults could
be tweaked there, as it currently shows me mostly a license file.

>> Conceptually Gerrit is doing
>>
>>   while true:
>>     git submodule update --remote
>>     if worktree is dirty:
>>       git commit "update the submodules"
>>
>> just that Gerrit doesn't poll but does it event based.
>
> ...and it's super handy :)  The problem is it's fundamentally
> centralized: because gerrit can serialize merges into the submodule,
> it also knows exactly how to update the link in the supermodule.  If
> there was wild branching and merging (as there often is in git) and
> you had to resolve conflicts between two submodules, I don't think it
> would be obvious at all how to do it automatically when pushing a
> submodule.  (This also works quite badly with git subtree --squash.)

With the poll-based solution I don't think you'd run into many more
problems than you would with Gerrit's solution.

In a nearby thread, we were just discussing the submodule merging
strategies,
https://public-inbox.org/git/1524739599.20251.17.ca...@klsmartin.com/
which might seem confusing, but the implementation is actually easy
as we only fast-forward in submodules.

>>
>> https://trends.google.com/trends/explore?date=all&q=git%20subtree,git%20submodule
>>
>> Not sure what to make of this data.
>
> Clearly people need a lot more help when using submodules than when
> using subtree :)

That could be true. :)

Thanks,
Stefan


Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Avery Pennarun
On Mon, Apr 30, 2018 at 5:38 PM, Stefan Beller  wrote:
> On Mon, Apr 30, 2018 at 1:45 PM, Avery Pennarun  wrote:
> No objections from me either.
>
> Submodules seem to serve a slightly different purpose, though?

I think the purpose is actually the same - it's just the tradeoffs
that are different.  Both sets of tradeoffs kind of suck.

> With Subtrees the superproject always contains all the code,
> even when you squash the subtree history when merging it in.
> In the submodule world, you may not have access to one of the
> submodules.

Right.  Personally I think it's a disadvantage of subtree that it
always contains all the code (what if some people don't want the code
for a particular build variant?).  However, it's a huge pain that
submodules *don't* contain all the code (what if I'm not online right
now, or the site supposedly containing the code goes offline, or I
want to make my own fork?).

For the best of both worlds, I've often thought that a good balance
would be to use the same data structure that submodule uses, but to
store all the code in a single git repo under different refs, which we
might or might not download (or might or might not have different
ACLs) under different circumstances.  However, when some projects get
really huge (lots of very big submodule dependencies), then repacking
one-big-repo starts becoming unwieldy; in that situation git-subtree
also fails completely.

> Submodules do not need to produce a synthetic project history
> when splitting off again, as the history is genuine. This allows
> for easier work with upstream.

Splitting for easier work upstream is great, and there really ought to
be an official version of 'git subtree split', which is good for all
sorts of purposes.

However, I suspect almost all uses of the split feature are a)
splitting a subtree that you previously merged in, or b) splitting a
subtree into a separate project that you want to maintain separately
from now on.  Repeated splits in case (a) are only necessary because
you're not using submodules, or in case (b) are only necessary because
you didn't *switch* to submodules when it finally came time to split
the projects.  (In both cases you probably didn't switch to submodules
because you didn't like one of its tradeoffs, especially the need to
track multiple repos when you fork.)

> Subtrees present you the whole history by default and the user
> needs to be explicit about not wanting to see history from the
> subtree, which is the opposite of submodules (though this
> may be planned in the future to switch).

It turns out that AFAIK, almost everyone prefers 'git subtree
--squash', which squashes into a single commit each time you merge,
much like git submodule does.  I doubt people would cry too much if
the full-history feature went away.
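
For readers who have not used the squash mode mentioned above, a minimal sketch of that workflow might look like this. The wrapper names vendor_add and vendor_pull are invented for the example; the `git subtree add` and `git subtree pull` subcommands with `--prefix` and `--squash` are the real interface, and the sketch assumes git-subtree is installed.

```shell
# Hypothetical wrappers around the real "git subtree" subcommands.

# vendor_add <prefix> <repository> <ref>
# Import <ref> of <repository> under <prefix>. With --squash the
# upstream history is collapsed into a single commit before merging.
vendor_add () {
	git subtree add --prefix="$1" --squash "$2" "$3"
}

# vendor_pull <prefix> <repository> <ref>
# Bring in newer upstream changes, again as one squashed commit
# followed by a merge into the current branch.
vendor_pull () {
	git subtree pull --prefix="$1" --squash "$2" "$3"
}
```

A call such as `vendor_add vendor/lib https://example.com/lib.git master` (placeholder URL and branch) leaves the individual upstream commits out of the superproject's log, which is the behaviour being compared to submodules here.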

There's one exception, which is doing a one-time permanent merge of
two projects into one.  That's a nice feature, but is probably used
extremely rarely.  More often people get into a
merge-split-merge-split cycle that would be better served by a
slightly improved git-submodule.

>> The gerrit team (eg. Stefan Beller) has been doing some really great
>> stuff to make submodules more usable by helping with relative
>> submodule links and by auto-updating links in supermodules at the
>> right times.  Unfortunately doing that requires help from the server
>> side, which kind of messes up decentralization and so doesn't solve
>> the problem in the general case.
>
> Conceptually Gerrit is doing
>
>   while true:
>     git submodule update --remote
>     if worktree is dirty:
>       git commit "update the submodules"
>
> just that Gerrit doesn't poll but does it event based.

...and it's super handy :)  The problem is it's fundamentally
centralized: because gerrit can serialize merges into the submodule,
it also knows exactly how to update the link in the supermodule.  If
there was wild branching and merging (as there often is in git) and
you had to resolve conflicts between two submodules, I don't think it
would be obvious at all how to do it automatically when pushing a
submodule.  (This also works quite badly with git subtree --squash.)

>> I really wish there were a good answer, but I don't know what it is.
>> I do know that lots of people seem to at least be happy using
>> git-subtree, and would be even happier if it were installed
>> automatically with git.
>
> https://trends.google.com/trends/explore?date=all&q=git%20subtree,git%20submodule
>
> Not sure what to make of this data.

Clearly people need a lot more help when using submodules than when
using subtree :)

Have fun,

Avery


Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Stefan Beller
On Mon, Apr 30, 2018 at 1:45 PM, Avery Pennarun  wrote:
> On Mon, Apr 30, 2018 at 5:50 AM, Ævar Arnfjörð Bjarmason wrote:
>> I think at this point git-subtree is widely used enough to move out of
>> contrib/, maybe others disagree, but patches are always better for
>> discussion than patch-less ML posts.
>
> I really was hoping git-subtree would be completely obsoleted by
> git-submodule by now, but... it hasn't been, so... no objections on
> this end.

No objections from me either.

Submodules seem to serve a slightly different purpose, though?

With Subtrees the superproject always contains all the code,
even when you squash the subtree history when merging it in.
In the submodule world, you may not have access to one of the
submodules.

Submodules do not need to produce a synthetic project history
when splitting off again, as the history is genuine. This allows
for easier work with upstream.

Subtrees present you the whole history by default and the user
needs to be explicit about not wanting to see history from the
subtree, which is the opposite of submodules (though this
may be planned in the future to switch).

> The gerrit team (eg. Stefan Beller) has been doing some really great
> stuff to make submodules more usable by helping with relative
> submodule links and by auto-updating links in supermodules at the
> right times.  Unfortunately doing that requires help from the server
> side, which kind of messes up decentralization and so doesn't solve
> the problem in the general case.

Conceptually Gerrit is doing

  while true:
    git submodule update --remote
    if worktree is dirty:
      git commit "update the submodules"

just that Gerrit doesn't poll but does it event based.
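
As a runnable approximation of the conceptual loop above, one could write something like the following. It is a sketch of the polling variant only, not Gerrit's implementation: the function names and the default 60-second interval are invented for the example, and it assumes the superproject has no other uncommitted changes to tracked files.

```shell
# Hypothetical helper: one iteration of the polling loop.
update_submodule_links () {
	# Move each submodule to the tip of its remote-tracking branch.
	git submodule --quiet update --remote || return 1

	# "worktree is dirty" is taken to mean that a submodule now points
	# at a different commit than the one recorded in the index.
	if ! git diff --quiet --ignore-submodules=dirty
	then
		git commit --quiet -a -m "update the submodules"
	fi
}

# Hypothetical driver: poll every $1 seconds (default 60) until an
# iteration fails.
poll_submodules () {
	while true
	do
		update_submodule_links || return 1
		sleep "${1:-60}"
	done
}
```

Running `poll_submodules 300` inside a superproject would check every five minutes; an event-based setup would instead call `update_submodule_links` whenever a submodule branch changes.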

> I really wish there were a good answer, but I don't know what it is.
> I do know that lots of people seem to at least be happy using
> git-subtree, and would be even happier if it were installed
> automatically with git.

https://trends.google.com/trends/explore?date=all&q=git%20subtree,git%20submodule

Not sure what to make of this data.


Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Avery Pennarun
On Mon, Apr 30, 2018 at 5:50 AM, Ævar Arnfjörð Bjarmason wrote:
> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion than patch-less ML posts.

I really was hoping git-subtree would be completely obsoleted by
git-submodule by now, but... it hasn't been, so... no objections on
this end.

The gerrit team (eg. Stefan Beller) has been doing some really great
stuff to make submodules more usable by helping with relative
submodule links and by auto-updating links in supermodules at the
right times.  Unfortunately doing that requires help from the server
side, which kind of messes up decentralization and so doesn't solve
the problem in the general case.

I really wish there were a good answer, but I don't know what it is.
I do know that lots of people seem to at least be happy using
git-subtree, and would be even happier if it were installed
automatically with git.

Have fun,

Avery


Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Philip Oakley

From: "Ævar Arnfjörð Bjarmason"

> I think at this point git-subtree is widely used enough to move out of
> contrib/, maybe others disagree, but patches are always better for
> discussion than patch-less ML posts.

Assuming this lands in Git, there will also need to be a simple
follow-on to Duy's series that is updating the command-list.txt
(Message-Id: <20180429181844.21325-10-pclo...@gmail.com>). Duy's series
also does the completions thing IIUC ;-).

--
Philip



[PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Ævar Arnfjörð Bjarmason
I think at this point git-subtree is widely used enough to move out of
contrib/, maybe others disagree, but patches are always better for
discussion than patch-less ML posts.

Ævar Arnfjörð Bjarmason (4):
  git-subtree: move from contrib/subtree/
  subtree: remove support for git version <1.7
  subtree: fix a test failure under GETTEXT_POISON
  i18n: translate the git-subtree command

 .gitignore                                    |   1 +
 Documentation/git-submodule.txt               |   2 +-
 .../subtree => Documentation}/git-subtree.txt |   3 +
 Makefile                                      |   1 +
 contrib/subtree/.gitignore                    |   7 -
 contrib/subtree/COPYING                       | 339 --
 contrib/subtree/INSTALL                       |  28 --
 contrib/subtree/Makefile                      |  97 -
 contrib/subtree/README                        |   8 -
 contrib/subtree/t/Makefile                    |  86 -
 contrib/subtree/todo                          |  48 ---
 .../subtree/git-subtree.sh => git-subtree.sh  | 109 +++---
 {contrib/subtree/t => t}/t7900-subtree.sh     |  21 +-
 rename {contrib/subtree => Documentation}/git-subtree.txt (99%)
 delete mode 100644 contrib/subtree/.gitignore
 delete mode 100644 contrib/subtree/COPYING
 delete mode 100644 contrib/subtree/INSTALL
 delete mode 100644 contrib/subtree/Makefile
 delete mode 100644 contrib/subtree/README
 delete mode 100644 contrib/subtree/t/Makefile
 delete mode 100644 contrib/subtree/todo
 rename contrib/subtree/git-subtree.sh => git-subtree.sh (84%)
 rename {contrib/subtree/t => t}/t7900-subtree.sh (99%)

-- 
2.17.0.290.gded63e768a