Re: [PATCH] FYI / RFC: submodules: introduce repo-like workflow

2018-09-28 Thread Stefan Beller
On Fri, Sep 28, 2018 at 12:26 PM Ævar Arnfjörð Bjarmason
 wrote:
>
>
> On Thu, Sep 27 2018, Stefan Beller wrote:
>
> > Internally we have rolled out this as an experiment for
> > "submodules replacing the repo tool[1]". The repo tool is described as:
> >
> > Repo unifies Git repositories when necessary, performs uploads to the
> > Gerrit revision control system, and automates parts of the Android
> > development workflow. Repo is not meant to replace Git, only to make
> > it easier to work with Git in the context of Android. The repo command
> > is an executable Python script that you can put anywhere in your path.
> >
> > In working with the Android source files, you use Repo for
> > across-network operations. For example, with a single Repo command you
> > can download files from multiple repositories into your local working
> > directory.
> >
> > In most situations, you can use Git instead of Repo, or mix Repo and
> > Git commands to form complex commands.
> >
> > [1] https://source.android.com/setup/develop/
>
> Some questions just out of curiosity, not for this patch in particular:
>
> Those docs seem to describe the situation without this patch, with this
> patch is the repo tool fully replaced?

Yeah I started by describing the status quo. Some points to add:

* The repo UX looks very similar to perforce, as that was used internally
   at Google at the time a lot. Git wasn't around for long in the days of
   early Android. Now that Git is well known, new grads joining are more likely
   to know Git than perforce.
* repo has no tests because it started as a small script
* maintenance/releases of repo are not ideal.

So yes, one of our long term goals is to replace repo with Git-only
(on the client side) workflow.

> How are you planning to migrate from repo to this on a repository-data
> basis

We don't plan to migrate the existing clients. You'd have to re-clone.

>, does gerrit also populate .gitmodules files appropriately, which
> your clone --recurse-submodules will pick up, but repo will just ignore,
> so you can use the two in parallel?

Gerrit has native support to update submodules in a superproject.
So if you submit code to a project that is a submodule of a superproject
the superproject updates its gitlink. (and if you use a topic based workflow
to submit to two projects, it will show up as an atomic update in the
superproject by having just one commit updating two gitlinks)

Also Gerrit has a plugin "supermanifest" [1], which tracks repo
manifests and mirrors changes from the manifest into the
superproject, e.g. adding a new project to the manifest will
add a new submodule to the superproject.

[1] https://gerrit.googlesource.com/plugins/supermanifest/

With both Gerrits internal superproject subscription and that plugin
it should be possible to use repo or git-submodule in parallel (on an
organisational level, i.e. you choose one of them, and your coworker
chooses the other one)

> Now "repo init -u" takes a URL to a manifest of repositories to stitch
> together, I've understood from past conversations (but am not sure) that
> this is used e.g. by downstream Android vendors so they can use what
> Google's using + whatever they have in-house, i.e. make the manifest the
> set of open source repos plus some (e.g. drivers specific to their
> device). How is that sort of workflow going to work where you
> (presumably) have do that via .gitmodules + commit entries in trees?

The manifest is tracked in its own manifest repo, so today they fork
that project, with the superproject you'd fork the superproject and modify
the .gitmodules file as needed.

> They run their own Gerrit install with some magic to sync back & forth?
>
> I assume that now the recursive "checkout" relies on all the origin/HEAD
> symbolic refs pointing to "master", but how is this going to deal with
> incorporating a repo whose main branch has a different name?

Change the name? Or pin it via sha1.

> E.g. "trunk" or "blead"? Perhaps some interaction with
> checkout.defaultRemote + submodule..branch= could make "git
> checkout :mainbranch" DWYM.
>
> > * Fetching changes ("repo sync") needs to fetch all repositories, as there
> >   is no central place that tracks what has changed. In a superproject
> >   git fetch can determine which submodules need fetching.  In Androids
> >   case the daily change is only in a few repositories (think 10s), so
> >   migrating to a superproject would save an order of magnitude in fetch
> >   traffic for daily updates of developers.
>
> Interesting that in all this time with the reliance on a central server
> repo wasn't already asking some custom API "what repos changed since xyz
> time" to narrow that down, but hey, .gitmodules + commit entries in
> trees will do it for you.

It's complicated. Shawn really grew to dislike the repo tool and wanted
to have it gone as quickly as possible, so no hacks that extend its life. ;-)
I would prefer to keep 

Re: [PATCH] FYI / RFC: submodules: introduce repo-like workflow

2018-09-28 Thread Ævar Arnfjörð Bjarmason


On Thu, Sep 27 2018, Stefan Beller wrote:

> Internally we have rolled out this as an experiment for
> "submodules replacing the repo tool[1]". The repo tool is described as:
>
> Repo unifies Git repositories when necessary, performs uploads to the
> Gerrit revision control system, and automates parts of the Android
> development workflow. Repo is not meant to replace Git, only to make
> it easier to work with Git in the context of Android. The repo command
> is an executable Python script that you can put anywhere in your path.
>
> In working with the Android source files, you use Repo for
> across-network operations. For example, with a single Repo command you
> can download files from multiple repositories into your local working
> directory.
>
> In most situations, you can use Git instead of Repo, or mix Repo and
> Git commands to form complex commands.
>
> [1] https://source.android.com/setup/develop/

Some questions just out of curiosity, not for this patch in particular:

Those docs seem to describe the situation without this patch, with this
patch is the repo tool fully replaced?

How are you planning to migrate from repo to this on a repository-data
basis, does gerrit also populate .gitmodules files appropriately, which
your clone --recurse-submodules will pick up, but repo will just ignore,
so you can use the two in parallel?

Now "repo init -u" takes a URL to a manifest of repositories to stitch
together, I've understood from past conversations (but am not sure) that
this is used e.g. by downstream Android vendors so they can use what
Google's using + whatever they have in-house, i.e. make the manifest the
set of open source repos plus some (e.g. drivers specific to their
device). How is that sort of workflow going to work where you
(presumably) have do that via .gitmodules + commit entries in trees?
They run their own Gerrit install with some magic to sync back & forth?

I assume that now the recursive "checkout" relies on all the origin/HEAD
symbolic refs pointing to "master", but how is this going to deal with
incorporating a repo whose main branch has a different name?
E.g. "trunk" or "blead"? Perhaps some interaction with
checkout.defaultRemote + submodule..branch= could make "git
checkout :mainbranch" DWYM.

> * Fetching changes ("repo sync") needs to fetch all repositories, as there
>   is no central place that tracks what has changed. In a superproject
>   git fetch can determine which submodules need fetching.  In Androids
>   case the daily change is only in a few repositories (think 10s), so
>   migrating to a superproject would save an order of magnitude in fetch
>   traffic for daily updates of developers.

Interesting that in all this time with the reliance on a central server
repo wasn't already asking some custom API "what repos changed since xyz
time" to narrow that down, but hey, .gitmodules + commit entries in
trees will do it for you.

> * Sometimes when the dependencies are not on a clear repository boundary
>   one would like to have git-bisect available across the different
>   repositories, which repo cannot provide due to its design.

I assume that you're not upgrading independently to e.g. every single
linux commit, just stable releases, so does bisecting deal with knowing
that e.g. a breakage occurred when linux.git was updated from v4.10 to
v4.12, and then to go within the repo itself and bisect from there, or
is that done manually?


Re: [PATCH] FYI / RFC: submodules: introduce repo-like workflow

2018-09-28 Thread Junio C Hamano
Jonathan Nieder  writes:

> (dropping cc-s to my internal address that I don't use on this list
>  and to git-c...@google.com which bounces)
> Hi,
>
> Stefan Beller wrote:
>
>> Internally we have rolled out this as an experiment for
>> "submodules replacing the repo tool[1]". The repo tool is described as:
>>
>> Repo unifies Git repositories when necessary, performs uploads to the
>> Gerrit revision control system, and automates parts of the Android
>> development workflow. Repo is not meant to replace Git, only to make
>> it easier to work with Git in the context of Android. The repo command
>> is an executable Python script that you can put anywhere in your path.
> [...]
>> Submodules can also be understood as unifying Git repositories, even more,
>> in the superproject the submodules have relationships between their
>> versions, which the repo tool only provides in releases.
>>
>> The repo tool does not provide these relationships between versions, but
>> all the repositories (in case of Android think of ~1000 git repositories)
>> are put in their place without depending on each other.
>>
>> This comes with a couple of advantages and disadvantages:
>
> Thanks for describing this background.

Thanks for this.  I probably won't be reading this before other
topics, but I've often found that changes from google were lacking
the backstory to make sense of them as a coherent whole.  

For example, a patch that says "Instead of detaching at the commit
recorded in the superproject, check out the branch with the same
name, even if it points at a different commit.  Here is the write up
of what it does in the Documentation/ and code" is hard to judge
beyond checking if the code does what it claims to do and if the
docs describe what the code does---without backstory like this that
talks about how individual small pieces fit within the plan for the
whole thing, judging if that "what it claims to do" is even sensible
is impossible.


Re: [PATCH] FYI / RFC: submodules: introduce repo-like workflow

2018-09-27 Thread Jonathan Nieder
(dropping cc-s to my internal address that I don't use on this list
 and to git-c...@google.com which bounces)
Hi,

Stefan Beller wrote:

> Internally we have rolled out this as an experiment for
> "submodules replacing the repo tool[1]". The repo tool is described as:
>
> Repo unifies Git repositories when necessary, performs uploads to the
> Gerrit revision control system, and automates parts of the Android
> development workflow. Repo is not meant to replace Git, only to make
> it easier to work with Git in the context of Android. The repo command
> is an executable Python script that you can put anywhere in your path.
[...]
> Submodules can also be understood as unifying Git repositories, even more,
> in the superproject the submodules have relationships between their
> versions, which the repo tool only provides in releases.
>
> The repo tool does not provide these relationships between versions, but
> all the repositories (in case of Android think of ~1000 git repositories)
> are put in their place without depending on each other.
>
> This comes with a couple of advantages and disadvantages:

Thanks for describing this background.

[...]
> This changes the following commands in the superproject:
>
>   checkout -B/-b create branches in subs, too
>   checkout (-f): update branch in submodule (create if needed) and check
>  it out; Pass the argument down literally if it is a branch
>  name (e.g. "checkout -f master" will run a
> "checkout -f master" in the submodule as well)
>   clone: see checkout
>   reset --hard: see checkout

As you mentioned, I've been using this submodule.repoLike=true mode
for my own use for a while.  You did a nice job of explaining on how
it fits into a Gerrit-driven workflow; I'd like to add that I find it
pleasant in non-Gerrit-driven contexts as well.

The primary difference from repoLike=false is that this makes the
normal state to have branches checked out in submodules.  For example,
if I run

git checkout --recurse-submodules -B master origin/master

then this will create and check out a "master" branch in all
submodules instead of only in the superproject.  This helps avoid some
issues in Git's submodule handling where submodule commits can be
pruned if they have not been checked out in a while because there is
no ref pointing to them.

Some next steps:

- now that we have a repository object, some of the implementation can
  be simplified and made more robust.  I expect that will also make
  these patches easier to review

- also in the direction of reviewability, at that point we may want to
  split this into multiple patches

- gitsubmodules.txt and config.txt should describe the new option, to
  help new users understand what this new repoLike workflow does

- there are some edge cases in the UX that get... messy that I should
  describe in another message

All that said, thanks for sending this out, and I'd be happy to hear
from any interested people --- feedback from anyone adventurous enough
to try this out would be very welcome.

Happy hacking,
Jonathan