Fwd: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default
mar., 27 mar. 2018, 02:07 Stefan Beller a scris: > > [snipped the cc list as well] > > On Tue, Mar 6, 2018 at 12:06 PM Eddy Petrișor > wrote: > > > Signed-off-by: Eddy Petrișor > > --- > > Did this go anywhere? > (I just came back from a longer vacation, sorry for the delay on my site) Not really. I am still unsure how is best to proceed. Details below. > > There are projects such as llvm/clang which use several repositories, and > they > > might be forked for providing support for various features such as adding > Redox > > awareness to the toolchain. This typically means the superproject will use > > another branch than master, occasionally even use an old commit from that > > non-master branch. > > > Combined with the fact that when incorporating such a hierachy of > repositories > > usually the user is interested in just the exact commit specified in the > > submodule info, it follows that a desireable usecase is to be also able to > > provide '--depth 1' to avoid waiting for ages for the clone operation to > > finish. > > Very sensible. The only change is that I realized that hard coding the depth is not necessary because the client can fetch more and more from the branch until the commit hash is found or the entire history was fetched and it wasn't found. This is more robust but has a variable performance penalty and is probably slower than single branch fetching from the start. > > Git submodule seems to be very stubborn and cloning master, although the > > wrapper script and the gitmodules-helper could work together to clone > directly > > the branch specified in the .gitmodules file, if specified. > > Also very sensible. > > So far so good, could you move these paragraphs before the triple dashed > line > and sign off so we record it as the commit message? Sure, as long as the implementation and design makes sense. > > Another wrinkle is that when the commit is not the tip of the branch, the > depth > > parameter should somehow be stored in the .gitmodules info, but any > change in > > the submodule will break the supermodule submodule depth info sooner or > later, > > which is definitly frigile. > > ... which is why I would not include that. > > git-fetch knows about --shallow-since or even better > shallow-exclude which could be set to the (depth+1)-th commit > (the boundary commit) recorded in the shallow information. I am unsure what that means. Without yet looking in the docs, would this --shallow-since be better than the try-until-found algorithm explained above? > > I tried digging into this section of the code and debugging with bashdb > to see > > where --depth might fit, but I got stuck on the shell-to-helper > interaction and > > the details of the submodule implementation, so I want to lay out this > first > > patch as starting point for the discussion in the hope somebody else > picks it > > up or can provide some inputs. I have the feeling there are multiple code > paths > > that are being ran, depending on the moment (initial clone, submodule > > recursive, post-clone update etc.) and I have a gut feeling there > shouldn't be > > any code duplication just because the operation is different. > > > This first patch is only trying to use a non-master branch, I have some > changes > > for the --depth part, but I stopped working on it due to the "default > depth" > > issue above. > > > Does any of this sound reasonable? > > Is this patch idea usable or did I managed to touch the part of the code > that > > should not be touched? > > This sounds reasonable. Thanks for writing the patch! OK. Now I need to make it good, which is the hard part :) > > diff --git a/git-submodule.sh b/git-submodule.sh > > index 2491496..370f19e 100755 > > --- a/git-submodule.sh > > +++ b/git-submodule.sh > > @@ -589,8 +589,11 @@ cmd_update() > > branch=$(git submodule--helper remote-branch > "$sm_path") > > if test -z "$nofetch" > > then > > + # non-default branch > > + rbranch=$(git config -f .gitmodules > submodule.$sm_path.branch) > > + > br_refspec=${rbanch:+"refs/heads/$rbranch:refs/heads/$rbranch"} > > Wouldn't we want to fetch into a remote tracking branch instead? > Instead of computing all this by yourself, these two lines could be > > br_refspec=$(git submodule--helper remote-branch $sm_path) > > I would think. I wasn't aware of this, will implement I the next version and see what happens. > > > > # Fetch remote before determining > tracking $sha1 > > - fetch_in_submodule "$sm_path" $depth || > > + fetch_in_submodule "$sm_path" $depth > $br_refspec || > > die "$(eval_gettext "Unable to fetch in > submodule path '\$sm_path'")" > > fi > > remote_name=$(sanitize_submodule_
Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default
[snipped the cc list as well] On Tue, Mar 6, 2018 at 12:06 PM Eddy Petrișor wrote: > Signed-off-by: Eddy Petrișor > --- Did this go anywhere? (I just came back from a longer vacation, sorry for the delay on my site) > There are projects such as llvm/clang which use several repositories, and they > might be forked for providing support for various features such as adding Redox > awareness to the toolchain. This typically means the superproject will use > another branch than master, occasionally even use an old commit from that > non-master branch. > Combined with the fact that when incorporating such a hierachy of repositories > usually the user is interested in just the exact commit specified in the > submodule info, it follows that a desireable usecase is to be also able to > provide '--depth 1' to avoid waiting for ages for the clone operation to > finish. Very sensible. > Git submodule seems to be very stubborn and cloning master, although the > wrapper script and the gitmodules-helper could work together to clone directly > the branch specified in the .gitmodules file, if specified. Also very sensible. So far so good, could you move these paragraphs before the triple dashed line and sign off so we record it as the commit message? > Another wrinkle is that when the commit is not the tip of the branch, the depth > parameter should somehow be stored in the .gitmodules info, but any change in > the submodule will break the supermodule submodule depth info sooner or later, > which is definitly frigile. ... which is why I would not include that. git-fetch knows about --shallow-since or even better shallow-exclude which could be set to the (depth+1)-th commit (the boundary commit) recorded in the shallow information. > I tried digging into this section of the code and debugging with bashdb to see > where --depth might fit, but I got stuck on the shell-to-helper interaction and > the details of the submodule implementation, so I want to lay out this first > patch as starting point for the discussion in the hope somebody else picks it > up or can provide some inputs. I have the feeling there are multiple code paths > that are being ran, depending on the moment (initial clone, submodule > recursive, post-clone update etc.) and I have a gut feeling there shouldn't be > any code duplication just because the operation is different. > This first patch is only trying to use a non-master branch, I have some changes > for the --depth part, but I stopped working on it due to the "default depth" > issue above. > Does any of this sound reasonable? > Is this patch idea usable or did I managed to touch the part of the code that > should not be touched? This sounds reasonable. Thanks for writing the patch! > diff --git a/git-submodule.sh b/git-submodule.sh > index 2491496..370f19e 100755 > --- a/git-submodule.sh > +++ b/git-submodule.sh > @@ -589,8 +589,11 @@ cmd_update() > branch=$(git submodule--helper remote-branch "$sm_path") > if test -z "$nofetch" > then > + # non-default branch > + rbranch=$(git config -f .gitmodules submodule.$sm_path.branch) > + br_refspec=${rbanch:+"refs/heads/$rbranch:refs/heads/$rbranch"} Wouldn't we want to fetch into a remote tracking branch instead? Instead of computing all this by yourself, these two lines could be br_refspec=$(git submodule--helper remote-branch $sm_path) I would think. > # Fetch remote before determining tracking $sha1 > - fetch_in_submodule "$sm_path" $depth || > + fetch_in_submodule "$sm_path" $depth $br_refspec || > die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")" > fi > remote_name=$(sanitize_submodule_env; cd "$sm_path" && get_default_remote) It would be awesome if you could write a little test for this feature, too. Look for the tests in regarding --remote in t7406 (in the t/ directory) as a starting point, please. Thanks! Stefan
Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default
On Sat, Mar 17, 2018 at 3:11 PM, Thomas Gummerer wrote: > On 03/17, Eddy Petrișor wrote: >> vin., 16 mar. 2018, 23:44 Eric Sunshine a scris: >> > It may be a disservice to remove mention of git-blame and git-shortlog >> > since git-contacts may not be suitable for everyone. Instead, perhaps >> > advertise git-contacts as a potentially simpler alternative to the >> > already-mentioned git-blamd & git-shortlog? > > Not sure how much of a disservice it would be. I think of > SubmittingPatches as mostly a document for new git contributors, for > who I think we should make it as easy as possible to start > contributing. Interpreting the output of 'git blame' and 'git > shortlog' feels like an extra hurdle for new contributors, especially > if someone is not familiar with the mailing list workflow. I do > remember wondering exactly how I should interpret this when I sent my > first patches. Okay. Mentioning those commands (in addition to git-contacts) is an opportunity to educate newcomers to Git the tool (not just to git.git the project) about additional ways to engage in project spelunking. By "disservice", I meant that that educational opportunity is lost. Eddy's suggestion of reversing the order, thus mentioning git-contacts first is a good alternative. However, perhaps this idea of educating newcomers to Git is misplaced in this context; such spelunking advice may be better suited to a general Git tutorial than to SubmittingPatches which is indeed specific to git.git. Given that reasoning, then my "disservice" view may be wrong. >> > Also, would it make sense to mention Felipe's git-related[1] which is >> > the original (and now more configurable) script from which >> > git-contacts functionality was derived? > > The reason I chose 'git contacts' over git-related is mainly that it > comes available with git. Mentioning both again makes things harder > on new contributors who already have enough to think about when > submitting the patch. > > I guess in the end it comes down to who we think the target of the > document is. To me it was always people new to the project, which is > why I think the single command there makes sense. Fair enough.
Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default
On 03/17, Eddy Petrișor wrote: > vin., 16 mar. 2018, 23:44 Eric Sunshine a scris: > > > On Fri, Mar 16, 2018 at 5:33 PM, Thomas Gummerer > > wrote: > > > a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches > > > @@ -260,8 +260,8 @@ that starts with `-BEGIN PGP SIGNED > > MESSAGE-`. That is > > > Send your patch with "To:" set to the mailing list, with "cc:" listing > > > -people who are involved in the area you are touching (the output from > > > -`git blame $path` and `git shortlog --no-merges $path` would help to > > > +people who are involved in the area you are touching (the `git > > > +contacts` command in `contrib/contacts/` can help to > > > identify them), to solicit comments and reviews. > > > > It may be a disservice to remove mention of git-blame and git-shortlog > > since git-contacts may not be suitable for everyone. Instead, perhaps > > advertise git-contacts as a potentially simpler alternative to the > > already-mentioned git-blamd & git-shortlog? Not sure how much of a disservice it would be. I think of SubmittingPatches as mostly a document for new git contributors, for who I think we should make it as easy as possible to start contributing. Interpreting the output of 'git blame' and 'git shortlog' feels like an extra hurdle for new contributors, especially if someone is not familiar with the mailing list workflow. I do remember wondering exactly how I should interpret this when I sent my first patches. > As a "victim" of the current documentation, I would advise to have the > order reversed. For a new contributor, judging if git blame and shortlog > are "more suitable" than git contracts or git related is definitely over > the reasonable knowledge expectation. Why is that more suitable than this? > How is suitability determined? > > A new person needs a straight forward way to focus on submitting the patch > in the right form. With experience adding more people in cc will come > naturally and those contacts might be aware of the contributor, too. This is also my experience, as I am getting involved with longer with the project I get more of an intuition who is involved where, and 'blame' and 'shortlog' start helping me confirm that and come up with a reasonable list of contacts (although I'm still not always sure whether I got the correct people or not). But I don't know if people that are getting involved for longer read this document much anymore. So I feel like having the commands mentioned here comes at the expense of new contributors, so I'm not sure it's worth it. > > Also, would it make sense to mention Felipe's git-related[1] which is > > the original (and now more configurable) script from which > > git-contacts functionality was derived? The reason I chose 'git contacts' over git-related is mainly that it comes available with git. Mentioning both again makes things harder on new contributors who already have enough to think about when submitting the patch. I guess in the end it comes down to who we think the target of the document is. To me it was always people new to the project, which is why I think the single command there makes sense. > > [1]: https://github.com/felipec/git-related > >
Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default
On Fri, Mar 16, 2018 at 5:33 PM, Thomas Gummerer wrote: > Subject: [PATCH] SubmittingPatches: mention the git contacts command > > Instead of just mentioning 'git blame' and 'git shortlog', which make > it harder than necessary for new contributors to pick out the > appropriate list of people to cc on their patch series, mention the > 'git contacts' utility, which should make it much easier to get a > reasonable list of contacts for a change. > > Signed-off-by: Thomas Gummerer > --- > diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches > @@ -260,8 +260,8 @@ that starts with `-BEGIN PGP SIGNED MESSAGE-`. > That is > Send your patch with "To:" set to the mailing list, with "cc:" listing > -people who are involved in the area you are touching (the output from > -`git blame $path` and `git shortlog --no-merges $path` would help to > +people who are involved in the area you are touching (the `git > +contacts` command in `contrib/contacts/` can help to > identify them), to solicit comments and reviews. It may be a disservice to remove mention of git-blame and git-shortlog since git-contacts may not be suitable for everyone. Instead, perhaps advertise git-contacts as a potentially simpler alternative to the already-mentioned git-blamd & git-shortlog? Also, would it make sense to mention Felipe's git-related[1] which is the original (and now more configurable) script from which git-contacts functionality was derived? [1]: https://github.com/felipec/git-related
Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default
On 03/08, Eddy Petrișor wrote: > 2018-03-06 22:21 GMT+02:00 Jonathan Nieder : > > > > (cc list snipped) > > Hi, > > > > Eddy Petrișor wrote: > > > > > Cc: [a lot of people] > > > > Can you say a little about how this cc list was created? E.g. should > > "git send-email" get a feature to warn about long cc lists? > > > I did it as advised by the documentation, git blame: > > https://github.com/git/git/blob/master/Documentation/SubmittingPatches#L264 > > I was suprised that there is no specialized script that does this, as > it is for the kernel or u-boot, so I ran first > > git log --pretty=format:'%an <%ae>' git-submodule.sh | sort -u > > mail-list-submodule > > then configure git to use that custom output and ignore the patch > since it was trying to convert every line of it into a email address: > > git config sendemail.ccCmd 'cat mail-list-submodule # ' > > Then "git send-email 0001..." did the rest. There's a 'git contacts' command which we've been carrying in "contrib/" for a while. Maybe we could advertise that in "Documentation/SubmittingPatches" instead of 'git blame' and 'git shortlog'? Maybe something like the patch below? --- >8 --- Subject: [PATCH] SubmittingPatches: mention the git contacts command Instead of just mentioning 'git blame' and 'git shortlog', which make it harder than necessary for new contributors to pick out the appropriate list of people to cc on their patch series, mention the 'git contacts' utility, which should make it much easier to get a reasonable list of contacts for a change. Signed-off-by: Thomas Gummerer --- Documentation/SubmittingPatches | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index a1d0feca36..945f8edb46 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -260,8 +260,8 @@ that starts with `-BEGIN PGP SIGNED MESSAGE-`. That is not a text/plain, it's something else. Send your patch with "To:" set to the mailing list, with "cc:" listing -people who are involved in the area you are touching (the output from -`git blame $path` and `git shortlog --no-merges $path` would help to +people who are involved in the area you are touching (the `git +contacts` command in `contrib/contacts/` can help to identify them), to solicit comments and reviews. :1: footnote:[The current maintainer: gits...@pobox.com] -- 2.16.2.804.g6dcf76e11
Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default
2018-03-08 21:29 GMT+02:00 Eddy Petrișor : > 2018-03-06 22:21 GMT+02:00 Jonathan Nieder : >> >> (cc list snipped) >> Hi, >> >> Eddy Petrișor wrote: >> >> > Cc: [a lot of people] >> >> Can you say a little about how this cc list was created? E.g. should >> "git send-email" get a feature to warn about long cc lists? > > > I did it as advised by the documentation, git blame: > > https://github.com/git/git/blob/master/Documentation/SubmittingPatches#L264 > > I was suprised that there is no specialized script that does this, as > it is for the kernel or u-boot, so I ran first > > git log --pretty=format:'%an <%ae>' git-submodule.sh | sort -u > > mail-list-submodule > > then configure git to use that custom output and ignore the patch > since it was trying to convert every line of it into a email address: > > git config sendemail.ccCmd 'cat mail-list-submodule # ' > > Then "git send-email 0001..." did the rest. > > >> >> > Signed-off-by: Eddy Petrișor >> > --- >> > >> > There are projects such as llvm/clang which use several repositories, and >> > they >> > might be forked for providing support for various features such as adding >> > Redox >> > awareness to the toolchain. This typically means the superproject will use >> > another branch than master, occasionally even use an old commit from that >> > non-master branch. >> > >> > Combined with the fact that when incorporating such a hierachy of >> > repositories >> > usually the user is interested in just the exact commit specified in the >> > submodule info, it follows that a desireable usecase is to be also able to >> > provide '--depth 1' to avoid waiting for ages for the clone operation to >> > finish. >> >> Some previous discussion is at >> https://public-inbox.org/git/CAGZ79ka6UXKyVLmdLg_M5-sB1x96g8FRzRZy=eny5ajbqf9...@mail.gmail.com/. >> >> In theory this should be straightforward: Git protocol allows fetching >> an arbitrary commit, so "git submodule update" and similar commands >> could fetch the submodule commit by SHA-1 instead of by refname. Poof! >> Problem gone. >> >> In practice, some complications: >> >> - some servers do not permit fetch-by-sha1. For example, github does >>not permit it. This is governed by the >>uploadpack.allowReachableSHA1InWant / uploadpack.allowAnySHA1InWant >>configuration items. > > > That is the problem I have faced since I tried to clone this repo > which has at lest 2 levels of submodules: > https://github.com/redox-os/redox > > The problematic modules are: > rust @ > https://github.com/redox-os/rust/tree/81c2bf4e51647295d3d92952dbb0464b460df0c3 > - first level > > and > > rust/src/llvm @ > https://github.com/rust-lang/llvm/tree/ba2edd794c7def715007931fcd1b4ce62aa711c8 > > >> >>That should be surmountable by making the behavior conditional, but >>it's a complication. > > > Which forced me to try to fetch a branch on which that commit exists, > but also consider providing --depth for the submodule checkout, > effectively minimizing the amount of brought in history. > >> >> >> - When the user passes --depth=, do they mean that to apply to >>the superproject, to the submodules, or both? Documentation should >>make the behavior clear. > > > The supermodule clone already exists and that works OK; I was after > providing something like 'git submodule update --depth 20 --recursive' > or evn providing that 'depth' info via the .gitmodules parameters > since 'shallow' is already used somehow, yet that is a bool value, not > an integer, like depth expects: > > > eddy@feodora:~/usr/src/redox/rust-eddy$ git config -f .gitmodules > --list | egrep '(depth|shallow)' > submodule.src/llvm.shallow=true > submodule.src/rt/hoedown.shallow=true > submodule.src/jemalloc.shallow=true > submodule.src/liblibc.shallow=true > submodule.src/doc/nomicon.shallow=true > submodule.src/tools/cargo.shallow=true > submodule.src/doc/reference.shallow=true > submodule.src/doc/book.shallow=true > submodule.src/tools/rls.shallow=true > submodule.src/libcompiler_builtins.shallow=true > submodule.src/tools/clippy.shallow=true > submodule.src/tools/rustfmt.shallow=true > submodule.src/tools/miri.shallow=true > submodule.src/dlmalloc.shallow=true > submodule.src/binaryen.shallow=true > submodule.src/doc/rust-by-example.shallow=true > submodule.src/llvm-emscripten.shallow=true > submodule.src/tools/rust-installer.shallow=true > > >> > Git submodule seems to be very stubborn and cloning master, although the >> > wrapper script and the gitmodules-helper could work together to clone >> > directly >> > the branch specified in the .gitmodules file, if specified. >> >> This could make sense. For the same reason as --depth in the >> superproject gives ambiguous signals about what should happen in >> submodules, --single-branch in the superproject gives ambiguous >> signals about what branch to fetch in submodules. > > > I never thought of providing the branch in the command line, since > that's not versionable info, bu
Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default
2018-03-06 22:21 GMT+02:00 Jonathan Nieder : > > (cc list snipped) > Hi, > > Eddy Petrișor wrote: > > > Cc: [a lot of people] > > Can you say a little about how this cc list was created? E.g. should > "git send-email" get a feature to warn about long cc lists? I did it as advised by the documentation, git blame: https://github.com/git/git/blob/master/Documentation/SubmittingPatches#L264 I was suprised that there is no specialized script that does this, as it is for the kernel or u-boot, so I ran first git log --pretty=format:'%an <%ae>' git-submodule.sh | sort -u > mail-list-submodule then configure git to use that custom output and ignore the patch since it was trying to convert every line of it into a email address: git config sendemail.ccCmd 'cat mail-list-submodule # ' Then "git send-email 0001..." did the rest. > > > Signed-off-by: Eddy Petrișor > > --- > > > > There are projects such as llvm/clang which use several repositories, and > > they > > might be forked for providing support for various features such as adding > > Redox > > awareness to the toolchain. This typically means the superproject will use > > another branch than master, occasionally even use an old commit from that > > non-master branch. > > > > Combined with the fact that when incorporating such a hierachy of > > repositories > > usually the user is interested in just the exact commit specified in the > > submodule info, it follows that a desireable usecase is to be also able to > > provide '--depth 1' to avoid waiting for ages for the clone operation to > > finish. > > Some previous discussion is at > https://public-inbox.org/git/CAGZ79ka6UXKyVLmdLg_M5-sB1x96g8FRzRZy=eny5ajbqf9...@mail.gmail.com/. > > In theory this should be straightforward: Git protocol allows fetching > an arbitrary commit, so "git submodule update" and similar commands > could fetch the submodule commit by SHA-1 instead of by refname. Poof! > Problem gone. > > In practice, some complications: > > - some servers do not permit fetch-by-sha1. For example, github does >not permit it. This is governed by the >uploadpack.allowReachableSHA1InWant / uploadpack.allowAnySHA1InWant >configuration items. That is the problem I have faced since I tried to clone this repo which has at lest 2 levels of submodules: https://github.com/redox-os/redox The problematic modules are: rust @ https://github.com/redox-os/rust/tree/81c2bf4e51647295d3d92952dbb0464b460df0c3 - first level and rust/src/llvm @ https://github.com/rust-lang/llvm/tree/ba2edd794c7def715007931fcd1b4ce62aa711c8 > >That should be surmountable by making the behavior conditional, but >it's a complication. Which forced me to try to fetch a branch on which that commit exists, but also consider providing --depth for the submodule checkout, effectively minimizing the amount of brought in history. > > > - When the user passes --depth=, do they mean that to apply to >the superproject, to the submodules, or both? Documentation should >make the behavior clear. The supermodule clone already exists and that works OK; I was after providing something like 'git submodule update --depth 20 --recursive' or evn providing that 'depth' info via the .gitmodules parameters since 'shallow' is already used somehow, yet that is a bool value, not an integer, like depth expects: eddy@feodora:~/usr/src/redox/rust-eddy$ git config -f .gitmodules --list | egrep '(depth|shallow)' submodule.src/llvm.shallow=true submodule.src/rt/hoedown.shallow=true submodule.src/jemalloc.shallow=true submodule.src/liblibc.shallow=true submodule.src/doc/nomicon.shallow=true submodule.src/tools/cargo.shallow=true submodule.src/doc/reference.shallow=true submodule.src/doc/book.shallow=true submodule.src/tools/rls.shallow=true submodule.src/libcompiler_builtins.shallow=true submodule.src/tools/clippy.shallow=true submodule.src/tools/rustfmt.shallow=true submodule.src/tools/miri.shallow=true submodule.src/dlmalloc.shallow=true submodule.src/binaryen.shallow=true submodule.src/doc/rust-by-example.shallow=true submodule.src/llvm-emscripten.shallow=true submodule.src/tools/rust-installer.shallow=true > > Git submodule seems to be very stubborn and cloning master, although the > > wrapper script and the gitmodules-helper could work together to clone > > directly > > the branch specified in the .gitmodules file, if specified. > > This could make sense. For the same reason as --depth in the > superproject gives ambiguous signals about what should happen in > submodules, --single-branch in the superproject gives ambiguous > signals about what branch to fetch in submodules. I never thought of providing the branch in the command line, since that's not versionable info, but to provide git-submodule a hint in the .gitsubmodule config about on which branch the hash in question should be found, like this: eddy@feodora:~/usr/src/redox/rust-eddy$ git config -f .gitmodules --list | egrep branch submodule.s
Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default
(cc list snipped) Hi, Eddy Petrișor wrote: > Cc: [a lot of people] Can you say a little about how this cc list was created? E.g. should "git send-email" get a feature to warn about long cc lists? > Signed-off-by: Eddy Petrișor > --- > > There are projects such as llvm/clang which use several repositories, and they > might be forked for providing support for various features such as adding > Redox > awareness to the toolchain. This typically means the superproject will use > another branch than master, occasionally even use an old commit from that > non-master branch. > > Combined with the fact that when incorporating such a hierachy of repositories > usually the user is interested in just the exact commit specified in the > submodule info, it follows that a desireable usecase is to be also able to > provide '--depth 1' to avoid waiting for ages for the clone operation to > finish. Some previous discussion is at https://public-inbox.org/git/CAGZ79ka6UXKyVLmdLg_M5-sB1x96g8FRzRZy=eny5ajbqf9...@mail.gmail.com/. In theory this should be straightforward: Git protocol allows fetching an arbitrary commit, so "git submodule update" and similar commands could fetch the submodule commit by SHA-1 instead of by refname. Poof! Problem gone. In practice, some complications: - some servers do not permit fetch-by-sha1. For example, github does not permit it. This is governed by the uploadpack.allowReachableSHA1InWant / uploadpack.allowAnySHA1InWant configuration items. That should be surmountable by making the behavior conditional, but it's a complication. - When the user passes --depth=, do they mean that to apply to the superproject, to the submodules, or both? Documentation should make the behavior clear. Fortunately I believe this complication has been takencare of using the --shallow-submodules option. > Git submodule seems to be very stubborn and cloning master, although the > wrapper script and the gitmodules-helper could work together to clone directly > the branch specified in the .gitmodules file, if specified. This could make sense. For the same reason as --depth in the superproject gives ambiguous signals about what should happen in submodules, --single-branch in the superproject gives ambiguous signals about what branch to fetch in submodules. > Another wrinkle is that when the commit is not the tip of the branch, the > depth > parameter should somehow be stored in the .gitmodules info, but any change in > the submodule will break the supermodule submodule depth info sooner or later, > which is definitly frigile. Hm, this seems to go in another direction. I don't think we should store the depth parameter in the .gitmodules file, since different users are likely to have different preferences about what to make shallow. If we make --depth easy enough to use at the superproject level then the user can specify what they want there. > I tried digging into this section of the code and debugging with bashdb to see > where --depth might fit, but I got stuck on the shell-to-helper interaction > and > the details of the submodule implementation, so I want to lay out this first > patch as starting point for the discussion in the hope somebody else picks it > up or can provide some inputs. I have the feeling there are multiple code > paths > that are being ran, depending on the moment (initial clone, submodule > recursive, post-clone update etc.) and I have a gut feeling there shouldn't be > any code duplication just because the operation is different. > > This first patch is only trying to use a non-master branch, I have some > changes > for the --depth part, but I stopped working on it due to the "default depth" > issue above. > > Does any of this sound reasonable? > Is this patch idea usable or did I managed to touch the part of the code that > should not be touched? I agree with the goal. As mentioned above, I'm not confident about the particular mechanism --- e.g. something using fetch-by-sha1 seems likely to be more intuitive. Today, the 'branch' setting in .gitmodules is only for "git submodule update --remote". This patch would be a significant expansion in scope for it. Hopefully others on the list can talk more about how that fits into various workflows and whether it would work out well. Thanks and hope that helps, Jonathan > git-submodule.sh | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/git-submodule.sh b/git-submodule.sh > index 2491496..370f19e 100755 > --- a/git-submodule.sh > +++ b/git-submodule.sh > @@ -589,8 +589,11 @@ cmd_update() > branch=$(git submodule--helper remote-branch "$sm_path") > if test -z "$nofetch" > then > + # non-default branch > + rbranch=$(git config -f .gitmodules > submodule.$sm_path.branch) > + > br_refspec=${rbanch:+"refs/heads/$rbranch