Re: Relative submodule URLs
Sorry for dropping out of the conversation; the last few days were a bit hectic.

Regarding recursive branching, I agree that a super-repo's branch names are not necessarily appropriate for its submodules, and that Heiko's simple workflow is a workable base to build upon. More thought is needed here, but that's for another day.

Regarding remote.default: Robert, please understand that the feature doesn't exist, and the idea is for it to serve only as a fallback when the current methods for remote selection end up resorting to the hardcoded "origin" name. More thought is also needed here, but not today.

Both Heiko and Robert took issue with this statement of mine:

On 14-08-22 12:00 PM, Marc Branchaud wrote:
> A branch should fork the entire repo, including its submodules. The implication is that if you want to push that branch somewhere, that "somewhere" needs to be able to accept the forks of the submodules *even if those submodules aren't changed in your branch* because at the very least the branch ref has to exist in the submodules' repositories.

Heiko said:
> It should be easy to work on a repository that is forked in its entirety, but it should also be possible (and properly supported) to only fork some submodules.

You're right, I overstated it when I said that the branch ref has to exist in the unchanged submodules. The super-repo branch records which submodules it updates, and when pushing the branch somewhere only those submodules' changes need to be pushed.

Robert asked:
> How will this impact *creating* branches? What about forking? Do you expect submodule forking/branching to be automatic as well? ... This seems difficult to do, especially the forking part since you would need an API for this (GitHub, Atlassian Stash, etc.), unless you are thinking of something clever like local/relative forks.

I meant "fork" in the local-branch sense: the branch represents a topic in the repository, and it should encompass the entire repository including its submodules (just like the branch encompasses all the files in the repository, even though the branch's commits only change a subset of those files).

I think you're talking about "fork" in the sense of setting up a mirror of a repository. I agree that there aren't really any tools for automatically doing that with repositories that contain relative-path submodules. I think git clone could learn to do it, though.

Heiko also said this:
> On Fri, Aug 22, 2014 at 12:00:07PM -0400, Marc Branchaud wrote:
>> With relative-path submodules, the push's target repo *must* also have the submodules in their proper places, so that they can get updated. Furthermore, if you clone a repo that has relative-path submodules you *must* also clone the submodules.
>
> That is not true. You can have relative submodules and just clone/fetch some from a different remote. It's just a question of how to specify/transport this information.

I meant that more as a general guideline than some kind of physical law. Sure, it's possible to scatter the submodules across all sorts of hosts, but it's not a good idea. When it comes to relative-path submodules, pushing and fetching submodule changes in the super-repo should just involve the one remote host (whatever way that's determined). This keeps things tractable, because otherwise your branch's changes are scattered among many different hosts and you end up considering weird things like "this part of the branch's changes is on host A but this other part is on host B, so let's record that somewhere, oh but what if host B is down when I'm trying to fetch, but I know that host C has the changes too so why don't I just fetch what I want from there." It's a nightmare.

It's infinitely better to treat a repository and its relative-path submodules as an atomic unit, so that any remote that hosts the repository also hosts the submodules. When pushing a branch with submodule changes, expect to find those submodules on the target remote and update them, regardless of how the target remote is determined. Same thing for fetching. It's just so much simpler to work this way.

So please, let's not try to specify submodule remotes per-branch or make that info pushable. It's enough for a branch's local configuration to say that it tracks fetch/pull refs on different remotes. The rest should flow from that.

M.

--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
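The "atomic unit" idea above rests on how a relative submodule URL is resolved against the URL of whichever superproject remote is in use. As a rough illustration only, the following simplified sketch shows that resolution. The function name sketch_resolve_url and the example.com URLs are made up for this example, and git's real logic handles more cases (scp-style "host:path" URLs, for instance).

```shell
#!/bin/sh
# Simplified sketch, NOT git's actual implementation: resolve a relative
# submodule URL against the URL of a superproject remote.
# sketch_resolve_url is a hypothetical helper name.
sketch_resolve_url() {
    base=${1%/}      # superproject remote URL, trailing slash removed
    rel=$2           # submodule URL as recorded in .gitmodules
    while :; do
        case $rel in
            ../*) rel=${rel#../}; base=${base%/*} ;;   # climb one level
            ./*)  rel=${rel#./} ;;                      # stay at this level
            *)    break ;;
        esac
    done
    printf '%s/%s\n' "$base" "$rel"
}

# The same .gitmodules entry follows whichever remote hosts the super-repo.
sketch_resolve_url "https://example.com/team/super.git" "../lib.git"
sketch_resolve_url "https://fork.example.org/me/super.git" "../lib.git"
```

Under these assumptions the single entry "../lib.git" lands next to the super-repo on each host, which is why a remote that hosts the super-repo is expected to host the submodule as well.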
Re: Re: Relative submodule URLs
On Thu, Aug 28, 2014 at 01:44:18PM -0400, Marc Branchaud wrote:
> Heiko also said this:
>> On Fri, Aug 22, 2014 at 12:00:07PM -0400, Marc Branchaud wrote:
>>> With relative-path submodules, the push's target repo *must* also have the submodules in their proper places, so that they can get updated. Furthermore, if you clone a repo that has relative-path submodules you *must* also clone the submodules.
>>
>> That is not true. You can have relative submodules and just clone/fetch some from a different remote. It's just a question of how to specify/transport this information.
>
> I meant that more as a general guideline than some kind of physical law. Sure, it's possible to scatter the submodules across all sorts of hosts, but it's not a good idea. When it comes to relative-path submodules, pushing and fetching submodule changes in the super-repo should just involve the one remote host (whatever way that's determined). This keeps things tractable, because otherwise your branch's changes are scattered among many different hosts and you end up considering weird things like "this part of the branch's changes is on host A but this other part is on host B, so let's record that somewhere, oh but what if host B is down when I'm trying to fetch, but I know that host C has the changes too so why don't I just fetch what I want from there." It's a nightmare.
>
> It's infinitely better to treat a repository and its relative-path submodules as an atomic unit, so that any remote that hosts the repository also hosts the submodules. When pushing a branch with submodule changes, expect to find those submodules on the target remote and update them, regardless of how the target remote is determined. Same thing for fetching. It's just so much simpler to work this way.

You are right, it's simpler. But I would not say better. Depending on your project it might be better to fork just some submodules.

> So please, let's not try to specify submodule remotes per-branch or make that info pushable. It's enough for a branch's local configuration to say that it tracks fetch/pull refs on different remotes. The rest should flow from that.

Why not? Git is all about flexibility. Of course if you organise your submodules in chaos you will get chaos. But consider this: you have a big project which consists of submodules (e.g. like Android, with hundreds of submodules). Now you want to develop something that involves just a subset of submodules, let's say two. If someone just wants to publish a small change to some submodules, you are demanding that they set up a mirror of *all* submodules in this big project. That might not even be feasible, depending on the project's size and the remote quota, not to mention having to first create a fork of hundreds of repositories. So in this situation we should support pointing just some submodules to other places.

Regarding transporting this information: if you ask someone to try out your change it should be as simple as possible. It should be enough to say "clone from there and check out that branch" (once recursive checkout and fetch for submodules are in place). So here we need a way to transport this configuration for a fork.

Yes, for a small project where it's feasible to simply clone all submodules you can just say: please fork everything. But for bigger projects that's not necessarily an option. So we should at least give users that option. Then it's a matter of policy how you work with a project.

I am not saying that everything for this should be implemented in the first steps, but we should keep it in mind and design everything in such a way that it is still possible to implement this kind of workflow later.

Cheers Heiko
Re: Re: Re: Relative submodule URLs
On Mon, Aug 25, 2014 at 09:29:07AM -0500, Robert Dailey wrote: On Sun, Aug 24, 2014 at 8:34 AM, Heiko Voigt hvo...@hvoigt.net wrote: New --with--remote parameter for 'git submodule' While having said all that about submodule settings I think a much much simpler start is to go ahead with a commandline setting, like Robert proposed here[2]. For that we do not have to worry about how it can be stored, transported, defined per submodule or on a branch, since answers to this are given at the commandline (and current repository state). There are still open questions about this though: * Should the name in the submodule be 'origin' even though you specified --with-remote=somewhere? For me its always confusing to have the same/similar remotes named differently in different repositories. That why I try to keep the names the same in all my clones of repositories (i.e. for my private, github, upstream remotes). * When you do a 'git submodule sync --with-remote=somewhere' should the remote be added or replaced. My opinion on these are: The remote should be named as in the superproject so --with-remote=somewhere adds/replaces the remote 'somewhere' in the submodules named on the commandline (or all in case no submodule is specified). In case of a fresh clone of the submodule, there would be no origin but only a remote under the new name. Would the --with-remote feature I describe be a feasible start for you Robert? What do others think? Is the naming of the parameter '--with-remote' alright? Cheers Heiko [1] http://article.gmane.org/gmane.comp.version-control.git/255512 [2] http://article.gmane.org/gmane.comp.version-control.git/255512 [3] https://github.com/jlehmann/git-submod-enhancements/wiki#special-ref-overriding-gitmodules-values Hi Heiko, My last email response was in violation of your request to keep the two topics separate, sorry about that. I started typing it this weekend and completed the draft this morning, without having read this response from you first. 
Thats fine, no problem. Here is what I think would make the feature most usable. I think you went over some of these ideas but I just want to clarify, to make sure we're on the same page. Please correct me as needed. 1. Running `git submodule update --with-remote name` shall fail the command unconditionally. I am not sure but I think you mean git submodule update --with-remote=name With the equals sign, without it you would name the submodule paths to update. No I think that should just add the remote name to all submodules that would be updated and do the normal update operation on them (with the new remote of course). 2. Using the `--with-remote` option on submodule `update` or `sync` will fail if it detects absolute submodule URLs in .gitmodule Yes, almost. Since you can have a mixture I suggest to only fail if the submodules that would be processed have an absolute url in them. If processed submodules are all relative it can go ahead. 3. Running `git submodule update --init --with-remote name` shall fail the command ONLY if a submodule is being processed that is NOT also being initialized. No since the --init flag just tells update to initialize submodules on-demand. It should just go ahead the same way as without --with-remote. 4. The behavior of git submodule's `update` or `sync` commands combined with `--with-remote` will REPLACE or CREATE the 'origin' remote in each submodule it is run in. We will not allow the user to configure what the submodule remote name will end up being (I think this is current behavior and forces good practice; I consider `origin` an adopted standard for git, and actually wish it was more enforced for super projects as well!) No please carefully read my email again. I specifically was describing the opposite. --with-remote=name creates/replaces the remote name in the submodule. I do not see a benefit in restricting the user from creating different remote names in the submodule. 
I think it would be more confusing if the remote 'origin' in the superproject does not point to the same location as 'origin' in the submodule. Let me know if I've missed anything. Once we clarify requirements I'll attempt to start work on this during my free time. I'll start by testing this through msysgit, since I do not have linux installed, but I have Linux Mint running in a Virtual Machine so I can test on both platforms as needed (I don't have a lot of experience on Linux though). I think it does not matter which development environment you use. In my experience though Linux is around 30x faster when it comes to the typical operations you do when developing git. Especially for running the testsuite that makes a difference between a few hours and minutes. I hope you won't mind me reaching out for questions as needed, however I
Re: Re: Re: Relative submodule URLs
On Tue, Aug 26, 2014 at 1:28 AM, Heiko Voigt hvo...@hvoigt.net wrote:
>> Hi Heiko, My last email response was in violation of your request to keep the two topics separate, sorry about that. I started typing it this weekend and completed the draft this morning, without having read this response from you first.
>
> That's fine, no problem.
>
>> Here is what I think would make the feature most usable. I think you went over some of these ideas but I just want to clarify, to make sure we're on the same page. Please correct me as needed.
>>
>> 1. Running `git submodule update --with-remote name` shall fail the command unconditionally.
>
> I am not sure, but I think you mean
>
>     git submodule update --with-remote=name
>
> with the equals sign; without it you would name the submodule paths to update. No, I think that should just add the remote name to all submodules that would be updated and do the normal update operation on them (with the new remote of course).

I'm not sure about Linux, but at least with msysgit on Windows, typing a two-dash option (such as --with-remote) forces command-line evaluation to use the next positional parameter as the option's argument. I've seen this work the same way with argparse in Python too. In my experience the command line has always worked that way; I'm not sure whether that is by design. I never use equals signs with git commands and have never had a problem. For example:

    git rebase --onto release/1.0 head~3 head

The `--onto` option knows to use `release/1.0` as its parameter.

>> 2. Using the `--with-remote` option on submodule `update` or `sync` will fail if it detects absolute submodule URLs in .gitmodules
>
> Yes, almost. Since you can have a mixture I suggest to only fail if the submodules that would be processed have an absolute url in them. If processed submodules are all relative it can go ahead.

For example, if it processes 3 submodules in the following order:

1. relative
2. absolute
3. relative

Should it fail before or after processing the 3rd (relative) submodule? I was thinking it would fail while trying to sync/update the 2nd one (which is absolute) and stop before processing the 3rd.

>> 3. Running `git submodule update --init --with-remote name` shall fail the command ONLY if a submodule is being processed that is NOT also being initialized.
>
> No, since the --init flag just tells update to initialize submodules on-demand. It should just go ahead the same way as without --with-remote.

But doesn't the on-demand initialization need to evaluate relative URLs and convert them to absolute based on the .gitmodules configuration? I thought the idea was to make `--with-remote` invalid for initialization/sync of absolute URLs. In other words, if I did:

    git submodule init --with-remote fork my-submodule-dir

and my-submodule-dir was not relative in .gitmodules, then the `--with-remote` flag becomes useless. We could fail silently, but for educational purposes I thought we were failing visibly in these scenarios. Maybe I misunderstood your original intent with the failures? Is init not doing the relative-to-absolute evaluation like I'm thinking? Please correct me where I'm wrong.

>> 4. The behavior of git submodule's `update` or `sync` commands combined with `--with-remote` will REPLACE or CREATE the 'origin' remote in each submodule it is run in. We will not allow the user to configure what the submodule remote name will end up being (I think this is current behavior and forces good practice; I consider `origin` an adopted standard for git, and actually wish it was more enforced for super projects as well!)
>
> No, please carefully read my email again. I specifically was describing the opposite. --with-remote=name creates/replaces the remote name in the submodule. I do not see a benefit in restricting the user from creating different remote names in the submodule. I think it would be more confusing if the remote 'origin' in the superproject does not point to the same location as 'origin' in the submodule.

Well, the reason why I said it would be 'origin' is so that the submodule knows which remote to use internally during an update. I'm assuming 'update' uses 'origin' internally in the submodule to know which remote to pull from. My understanding of how `git submodule update` knows which URL to pull from is probably incorrect. I'm not familiar with the internal mechanics of how this works. Perhaps you could explain or point me to some reading material on it?
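The disagreement over `--with-remote=name` versus `--with-remote name` comes down to whether the parser treats the option as always taking a value. As a hedged illustration, the sketch below accepts both spellings under that assumption. Note that `--with-remote` is only the option proposed in this thread, not an existing git flag, and parse_with_remote is a hypothetical helper, not part of git.

```shell
#!/bin/sh
# Sketch only: --with-remote is the option proposed in this thread, not an
# existing git flag. parse_with_remote is a hypothetical helper.
parse_with_remote() {
    remote=
    paths=
    while [ $# -gt 0 ]; do
        case $1 in
            --with-remote=*)
                # Stuck form: the value follows the equals sign.
                remote=${1#--with-remote=} ;;
            --with-remote)
                # Separate form: the next word is taken as the remote name.
                [ $# -ge 2 ] || { echo "error: --with-remote needs a name" >&2; return 1; }
                remote=$2
                shift ;;
            *)
                # Anything else is treated as a submodule path.
                paths="$paths $1" ;;
        esac
        shift
    done
    echo "remote=$remote paths=${paths# }"
}

parse_with_remote --with-remote=fork lib/a
parse_with_remote --with-remote fork lib/a
```

With a mandatory value the two forms parse identically. The ambiguity Heiko describes would only arise if the value were optional, because then the word after `--with-remote` could just as well be a submodule path.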
Re: Re: Re: Re: Relative submodule URLs
On Tue, Aug 26, 2014 at 10:18:48AM -0500, Robert Dailey wrote:
> On Tue, Aug 26, 2014 at 1:28 AM, Heiko Voigt hvo...@hvoigt.net wrote:
>>> My last email response was in violation of your request to keep the two topics separate, sorry about that. I started typing it this weekend and completed the draft this morning, without having read this response from you first.
>>
>> That's fine, no problem.
>>
>>> Here is what I think would make the feature most usable. I think you went over some of these ideas but I just want to clarify, to make sure we're on the same page. Please correct me as needed.
>>>
>>> 1. Running `git submodule update --with-remote name` shall fail the command unconditionally.
>>
>> I am not sure, but I think you mean
>>
>>     git submodule update --with-remote=name
>>
>> with the equals sign; without it you would name the submodule paths to update. No, I think that should just add the remote name to all submodules that would be updated and do the normal update operation on them (with the new remote of course).
>
> I'm not sure about Linux, but at least with msysgit on Windows, typing a two-dash option (such as --with-remote) forces command-line evaluation to use the next positional parameter as the option's argument. I've seen this work the same way with argparse in Python too. In my experience the command line has always worked that way; I'm not sure whether that is by design. I never use equals signs with git commands and have never had a problem. For example:
>
>     git rebase --onto release/1.0 head~3 head
>
> The `--onto` option knows to use `release/1.0` as its parameter.

Whether you are on Windows or Linux does not make a difference here. I just realized we are quite inconsistent:

    $ git grep -E -e '--\w+=\w+' -- Documentation/ | wc -l
    226
    $ git grep -E -e '--\w+ \w+' -- Documentation/ | wc -l
    75

I would prefer the first form, though, since that one is used more often. But we can leave that for later, once we have some code to talk about.

>>> 2. Using the `--with-remote` option on submodule `update` or `sync` will fail if it detects absolute submodule URLs in .gitmodules
>>
>> Yes, almost. Since you can have a mixture I suggest to only fail if the submodules that would be processed have an absolute url in them. If processed submodules are all relative it can go ahead.
>
> For example, if it processes 3 submodules in the following order:
>
> 1. relative
> 2. absolute
> 3. relative
>
> Should it fail before or after processing the 3rd (relative) submodule? I was thinking it would fail while trying to sync/update the 2nd one (which is absolute) and stop before processing the 3rd.

For consistency I would prefer that it fail right from the beginning in this situation, since the command cannot be completed.

>>> 3. Running `git submodule update --init --with-remote name` shall fail the command ONLY if a submodule is being processed that is NOT also being initialized.
>>
>> No, since the --init flag just tells update to initialize submodules on-demand. It should just go ahead the same way as without --with-remote.
>
> But doesn't the on-demand initialization need to evaluate relative URLs and convert them to absolute based on the .gitmodules configuration? I thought the idea was to make `--with-remote` invalid for initialization/sync of absolute URLs. In other words, if I did:
>
>     git submodule init --with-remote fork my-submodule-dir
>
> and my-submodule-dir was not relative in .gitmodules, then the `--with-remote` flag becomes useless. We could fail silently, but for educational purposes I thought we were failing visibly in these scenarios. Maybe I misunderstood your original intent with the failures? Is init not doing the relative-to-absolute evaluation like I'm thinking? Please correct me where I'm wrong.

Yes, it does the relative-to-absolute evaluation. But that is a different topic. For absolute URLs in .gitmodules it should fail, but you were talking about --init in general, and in general that should not fail IMO. So e.g.

    git submodule update --init --with-remote=name

should succeed when all submodule URLs in .gitmodules are relative and some submodules have already been initialized.

>>> 4. The behavior of git submodule's `update` or `sync` commands combined with `--with-remote` will REPLACE or CREATE the 'origin' remote in each submodule it is run in. We will not allow the user to configure what the submodule remote name will end up being (I think this is current behavior and forces good practice; I consider `origin` an adopted standard for git, and actually wish it was more enforced for super projects as well!)
>>
>> No, please carefully read my email again. I specifically was describing the opposite. --with-remote=name creates/replaces the remote name in the submodule. I do not see a benefit in restricting the user from creating different remote names in the submodule. I think it would be more confusing if the remote 'origin' in the
Re: Relative submodule URLs
On Fri, Aug 22, 2014 at 11:00 AM, Marc Branchaud marcn...@xiplink.com wrote: A couple of years ago I started to work on such a thing ([1] [2] [3]), mainly because when we tried to change to relative submodules we got bitten when someone used clone's -o option so that his super-repo had no origin remote *and* his was checked out on a detached HEAD. So get_default_remote() failed for him. I didn't have time to complete the work -- it ended up being quite involved. But Junio did come up with an excellent transition plan [4] for adopting a default remote setting. [1] (v0) http://thread.gmane.org/gmane.comp.version-control.git/200145 [2] (v1) http://thread.gmane.org/gmane.comp.version-control.git/201065 [3] (v2) http://thread.gmane.org/gmane.comp.version-control.git/201306 [4] http://article.gmane.org/gmane.comp.version-control.git/201332 I think you're on the right path. However I'd suggest something like the following: [submodule] remote = remote_for_relative_submodules (e.g. `upstream`) I think remote.default would be more generally useful, especially when working with detached checkouts. Honestly speaking I don't use default.remote, even now that I know about it thanks to the discussion ongoing here. The reason is that sometimes I push my branches to origin, sometimes I push them to my fork. I like explicit control as to which one I push to. I also sync my git config file to dropbox and I use it on multiple projects and platforms. I don't use the same push destination workflow on all projects. It seems to get in the way of my workflow more than it helps. I really only ever have two needs: 1. Push explicitly to my remote (e.g. `git push fork` or `git push origin`) 2. Push to the tracked branch (e.g. `git push`) I'm also not sure how `push.default = simple` conflicts with the usage of `remote.default`, since in the tracked-repo case, you must explicitly specify the source ref to push. Is this behavior documented somewhere? 
(For the record, I would also be happy if clone got rid of its -o option and origin became the sacred, reserved remote name (perhaps translated into other languages as needed) that clone always uses no matter what.) [branch.name] submoduleRemote = remote_for_relative_submodule If I understand correctly, you want this so that your branch can be a fork of only the super-repo while the submodules are not forked and so they should keep tracking their original repo. That's correct. But this is case-by-case. Sometimes I make a change where I want the submodule forked (rare), most times I don't. Sometimes I can get away with pushing changes to the submodule and worrying about it later since I know the submodule ref won't move forward unless someone does update --remote (which isn't often or only done as needed). To me this seems to be going in the opposite direction of having branches recursively apply to submodules, which I think most of us want. A branch should fork the entire repo, including its submodules. The implication is that if you want to push that branch somewhere, that somewhere needs to be able to accept the forks of the submodules *even if those submodules aren't changed in your branch* because at the very least the branch ref has to exist in the submodules' repositories. There are many levels on which this can apply. When it comes to checkouts and such, I agree. However, how will this impact *creating* branches? What about forking? Do you expect submodule forking branching to be automatic as well? Based on your description, it seems so (although a new branch doesn't necessarily have to correspond to a new fork, unless I'm misunderstanding you). This seems difficult to do, especially the forking part since you would need an API for this (Github, Atlassian Stash, etc), unless you are thinking of something clever like local/relative forks. However the inconvenience of forking manually isn't the main reason why I avoid forking submodules. 
It's the complication of pull requests. There is no uniformity there, which is unfortunate. Recursive pull requests are something outside the scope of git, I realize that, but it would still be nice. However the suggestion you make here lays the foundation for that I think. With absolute-path submodules, the push is a simple as creating the branch ref in the submodules' home repositories -- even if the main somewhere you're pushing to isn't one of those repositories. With relative-path submodules, the push's target repo *must* also have the submodules in their proper places, so that they can get updated. Furthermore, if you clone a repo that has relative-path submodules you *must* also clone the submodules. Robert, I think what you'll say to this is that you still want your branch to track the latest submodules updates from their home repository. (BTW, I'm confused with how you're using the terms upstream and origin. I'll use home to refer to the repository where everything
Re: Re: Relative submodule URLs
On Sun, Aug 24, 2014 at 8:34 AM, Heiko Voigt hvo...@hvoigt.net wrote:
> New --with-remote parameter for 'git submodule'
>
> While having said all that about submodule settings, I think a much, much simpler start is to go ahead with a commandline setting, like Robert proposed here [2]. For that we do not have to worry about how it can be stored, transported, defined per submodule or on a branch, since answers to this are given at the commandline (and current repository state).
>
> There are still open questions about this though:
>
> * Should the name in the submodule be 'origin' even though you specified --with-remote=somewhere? For me it's always confusing to have the same/similar remotes named differently in different repositories. That's why I try to keep the names the same in all my clones of repositories (i.e. for my private, github, upstream remotes).
>
> * When you do a 'git submodule sync --with-remote=somewhere', should the remote be added or replaced?
>
> My opinion on these: the remote should be named as in the superproject, so --with-remote=somewhere adds/replaces the remote 'somewhere' in the submodules named on the commandline (or all in case no submodule is specified). In case of a fresh clone of the submodule, there would be no origin but only a remote under the new name.
>
> Would the --with-remote feature I describe be a feasible start for you, Robert? What do others think? Is the naming of the parameter '--with-remote' alright?
>
> Cheers Heiko
>
> [1] http://article.gmane.org/gmane.comp.version-control.git/255512
> [2] http://article.gmane.org/gmane.comp.version-control.git/255512
> [3] https://github.com/jlehmann/git-submod-enhancements/wiki#special-ref-overriding-gitmodules-values

Hi Heiko,

My last email response was in violation of your request to keep the two topics separate, sorry about that. I started typing it this weekend and completed the draft this morning, without having read this response from you first.

At this point my only intention was to start discussion on a possible short-term solution. I realize the Git developers are working hard on improving submodule workflow for the long term. In addition, I do not have the domain expertise to properly make suggestions in regards to longer-term solutions, so I leave that to you :-)

The --with-remote feature would allow me to begin using relative submodules because, on a per-submodule basis, I can specify the remote it will use. When I fork a submodule and need to start tracking it, I can run `git submodule sync --with-remote fork`, which will take my super repo's 'fork' remote, REPLACE 'origin' in the submodule with that URL, and also redo the relative URL calculation. This is ideal since I use HTTP at home (so I can use my proxy server to access git behind the firewall at work) and when physically at work I use SSH for performance (to avoid the HTTP protocol). I also like the idea of never having to update my submodule URLs again if the git server moves, the domain name changes, or whatever else.

Here is what I think would make the feature most usable. I think you went over some of these ideas but I just want to clarify, to make sure we're on the same page. Please correct me as needed.

1. Running `git submodule update --with-remote name` shall fail the command unconditionally.

2. Using the `--with-remote` option on submodule `update` or `sync` will fail if it detects absolute submodule URLs in .gitmodules

3. Running `git submodule update --init --with-remote name` shall fail the command ONLY if a submodule is being processed that is NOT also being initialized.

4. The behavior of git submodule's `update` or `sync` commands combined with `--with-remote` will REPLACE or CREATE the 'origin' remote in each submodule it is run in. We will not allow the user to configure what the submodule remote name will end up being (I think this is current behavior and forces good practice; I consider `origin` an adopted standard for git, and actually wish it was more enforced for super projects as well!)

Let me know if I've missed anything. Once we clarify requirements I'll attempt to start work on this during my free time. I'll start by testing this through msysgit, since I do not have Linux installed, but I have Linux Mint running in a virtual machine so I can test on both platforms as needed (I don't have a lot of experience on Linux though).

I hope you won't mind me reaching out with questions as needed; however, I will attempt to be as resourceful as possible since I know you're all busy.

Thanks.
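Requirement 2 in the list above, combined with the preference Heiko voices elsewhere in the thread that the command fail right from the beginning, suggests validating every submodule URL before touching any of them. The following is a minimal sketch of that idea, not a proposed patch. The helper names is_relative_url and check_all_relative, the "path=url" argument format, and the example.com URL are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only. is_relative_url and check_all_relative are hypothetical
# helpers; the "path=url" pairs stand in for entries read from .gitmodules.

# A URL counts as relative here when it starts with ./ or ../
is_relative_url() {
    case $1 in
        ./*|../*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Validate every entry before any work starts, so that an absolute URL in
# the middle of the list cannot leave earlier submodules half-updated.
check_all_relative() {
    for entry in "$@"; do
        url=${entry#*=}
        if ! is_relative_url "$url"; then
            echo "error: submodule '${entry%%=*}' has an absolute URL: $url" >&2
            return 1
        fi
    done
    return 0
}

if check_all_relative "lib/a=../a.git" "lib/b=https://example.com/b.git" "lib/c=../c.git"; then
    echo "would process all submodules"
else
    echo "refusing to start"
fi
```

Checking up front trades a little extra work for an all-or-nothing result, which addresses the relative/absolute/relative ordering question without stopping partway through.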
Re: Re: Relative submodule URLs
On Mon, Aug 25, 2014 at 9:29 AM, Robert Dailey rcdailey.li...@gmail.com wrote: On Sun, Aug 24, 2014 at 8:34 AM, Heiko Voigt hvo...@hvoigt.net wrote: New --with--remote parameter for 'git submodule' While having said all that about submodule settings I think a much much simpler start is to go ahead with a commandline setting, like Robert proposed here[2]. For that we do not have to worry about how it can be stored, transported, defined per submodule or on a branch, since answers to this are given at the commandline (and current repository state). There are still open questions about this though: * Should the name in the submodule be 'origin' even though you specified --with-remote=somewhere? For me its always confusing to have the same/similar remotes named differently in different repositories. That why I try to keep the names the same in all my clones of repositories (i.e. for my private, github, upstream remotes). * When you do a 'git submodule sync --with-remote=somewhere' should the remote be added or replaced. My opinion on these are: The remote should be named as in the superproject so --with-remote=somewhere adds/replaces the remote 'somewhere' in the submodules named on the commandline (or all in case no submodule is specified). In case of a fresh clone of the submodule, there would be no origin but only a remote under the new name. Would the --with-remote feature I describe be a feasible start for you Robert? What do others think? Is the naming of the parameter '--with-remote' alright? Cheers Heiko [1] http://article.gmane.org/gmane.comp.version-control.git/255512 [2] http://article.gmane.org/gmane.comp.version-control.git/255512 [3] https://github.com/jlehmann/git-submod-enhancements/wiki#special-ref-overriding-gitmodules-values Hi Heiko, My last email response was in violation of your request to keep the two topics separate, sorry about that. 
I started typing it this weekend and completed the draft this morning, without having read this response from you first. At this point my only intention was to start discussion on a possible short-term solution. I realize the Git developers are working hard on improving submodule workflow for the long term. In addition I do not have the domain expertise to properly make suggestions in regards to longer-term solutions, so I leave that to you :-) The --with-remote feature would allow me to begin using relative submodules because: On a per-submodule basis, I can specify the remote it will use. When I fork a submodule and need to start tracking it, I can run `git submodule sync --with-remote fork`, which will take my super repo's 'fork' remote, REPLACE 'origin' in the submodule with that URL, and also redo the relative URL calculation. This is ideal since I use HTTP at home (so I can use my proxy server to access git behind firewall at work) and at work physically I use SSH for performance (to avoid HTTP protocol). I also like the idea of never having to update my submodule URLs again if the git server moves, domain name changes, or whatever else. Here is what I think would make the feature most usable. I think you went over some of these ideas but I just want to clarify, to make sure we're on the same page. Please correct me as needed. 1. Running `git submodule update --with-remote name` shall fail the command unconditionally. 2. Using the `--with-remote` option on submodule `update` or `sync` will fail if it detects absolute submodule URLs in .gitmodule 3. Running `git submodule update --init --with-remote name` shall fail the command ONLY if a submodule is being processed that is NOT also being initialized. 4. The behavior of git submodule's `update` or `sync` commands combined with `--with-remote` will REPLACE or CREATE the 'origin' remote in each submodule it is run in. 
We will not allow the user to configure what the submodule remote name will end up being (I think this is current behavior and forces good practice; I consider `origin` an adopted standard for git, and actually wish it was more enforced for super projects as well!) Let me know if I've missed anything. Once we clarify requirements I'll attempt to start work on this during my free time. I'll start by testing this through msysgit, since I do not have linux installed, but I have Linux Mint running in a Virtual Machine so I can test on both platforms as needed (I don't have a lot of experience on Linux though). I hope you won't mind me reaching out for questions as needed, however I will attempt to be as resourceful as possible since I know you're all busy. Thanks. Thought of a few more: 5. If `--with-remote` is unspecified, behavior will continue as it currently does (I'm not clear on the precedence here of various options, but I assume: `remote.default` first, then `branch.name.remote`) 6. `--with-remote` will take
Re: Re: Relative submodule URLs
Hi, since the mail got quite long. To avoid 'tl;dr', I talk about two topics in this mail: * Submodule settings for default remote (complex, future) * New --with--remote parameter for 'git submodule' (simple, now) Depending on your interest you might want to skip the first part of the email. I think they are two separate topics. Please only answer to either one and remove the other. That way we split the thread here and not mix the two together anymore. On Fri, Aug 22, 2014 at 12:00:07PM -0400, Marc Branchaud wrote: I think you're on the right path. However I'd suggest something like the following: [submodule] remote = remote_for_relative_submodules (e.g. `upstream`) I think remote.default would be more generally useful, especially when working with detached checkouts. Depends what workflow you have. Especially for submodules where the default remote might change from branch to branch this is not necessarily true. The following drawbacks in relation to submodules come to my mind: * You can not transport such configuration to the server. In case you are developing on a branch which has changes in a forked submodule that would be useful. * When your development in superproject and submodule gets merged to a stable branch (i.e. master) you also may not want that other remote anymore. So a setting, that can be per branch, might be preferred. * When your development gets pushed to a different remote the settings do not change. I.e. once part of the upstream repository the settings should possibly disappear. * You might only want to fork a certain submodule (since thats the only one you need to make changes in) in your branch. Then you need this setting to be per submodule. 
So to sum up a default remote setting which would be generally useful for submodules needs the following properties (IMO): * pushable * per branch * per remote * per submodule All of these being optional, so in case you have a local mirror, including submodules, of some project in which you develop with your team you might just want to set the default remote once for all submodules. I have not completely thought that through but the special ref idea[3] described by Jonathan seems to make it possible to implement all these properties. [branch.name] submoduleRemote = remote_for_relative_submodule If I understand correctly, you want this so that your branch can be a fork of only the super-repo while the submodules are not forked and so they should keep tracking their original repo. To me this seems to be going in the opposite direction of having branches recursively apply to submodules, which I think most of us want. I disagree. While recursive branches might make sense in some situations in most it does not. Consider a project in which you use a library which is separately maintained. You develop on featureA in your project and discover a bug in the submodule which you fix on a branch (which is then tracked in the submodule). Here it does not make sense to call your branch in the submodule featureA, since the submodule has no knowledge at all (and should not) about this featureA. While having said that, for a simple workflow while developing a certain feature recursive branches make sense. Lets say as a temporary local branch you could have that featureA branch in your submodule and just commit any changes you need in the submodule on that branch (including extensions and stuff). Later in the process you divide up that branch in the submodule into cleanups, bugfixes, extensions, ... to push it upstream for review and integration. A branch should fork the entire repo, including its submodules. 
The implication is that if you want to push that branch somewhere, that somewhere needs to be able to accept the forks of the submodules *even if those submodules aren't changed in your branch* because at the very least the branch ref has to exist in the submodules' repositories. I disagree here as well. As the distributed nature of git allows to have different remotes, I think its perfectly legitimate to just fork the repositories you need to change. It should be easy to work on a repository that is forked in its entirety, but it should also be possible (and properly supported) to only fork some submodules. I know it does make the situation more complex, but I think we should properly define the goal beforehand, so we do not exclude any use-cases. Then we can go ahead and just implement the simpler stuff (like entire repo forks) first, while making sure we do not block the more complex use-cases. With absolute-path submodules, the push is a simple as creating the branch ref in the submodules' home repositories -- even if the main somewhere you're pushing to isn't one of those repositories. With relative-path submodules, the push's target repo *must* also have the submodules in their proper places, so that they can get updated.
Re: Relative submodule URLs
On 14-08-19 12:07 PM, Robert Dailey wrote: On Mon, Aug 18, 2014 at 3:55 PM, Jonathan Nieder jrnie...@gmail.com wrote: Thanks for reporting. The remote used is the default remote that git fetch without further arguments would use: get_default_remote () { curr_branch=$(git symbolic-ref -q HEAD) curr_branch=${curr_branch#refs/heads/} origin=$(git config --get branch.$curr_branch.remote) echo ${origin:-origin} } The documentation is wrong. git-fetch(1) doesn't provide a name for this thing. Any ideas for wording? I guess a good start would be to call it the tracked remote instead of remote origin. The word tracked here makes it very obvious that if I have a remote tracking branch setup, it will use the remote portion of that configuration. The real question is, how will `git submodule update` function if a tracking remote is not configured? Will it fail with some useful error message? I don't like the idea of it defaulting to self remote mode, where it will be relative to my repo directory. That seems like it would fail is subtle ways in a worst-case scenario (if I did by happenstance have a bare repository cloned up one directory level for other reasons). Currently there isn't, short of reconfiguring the remote used by default by git fetch. I wish there was a way to specify the remote on a per-branch basis separately from the tracking branch. I read a while back that someone proposed some changes to git to support decentralized tracking (concept of an upstream tracking branch and a separate one for origin, i think). If that were implemented, then relative submodules could utilize the 'upstream' remote by default for each branch, which would provide more predictable default behavior. Right now most people on my team would not be aware that if they tracked a branch on their fork, they would subsequently need to fork the submodules to that same remote. 
Various co-workers use the remote named central instead of upstream and fork instead of origin (because that just makes more sense to them and it's perfectly valid). However if relative submodules require 'origin' to exist AND also represent the upstream repository (in triangle workflow), then this breaks on several levels. Can you explain further? In a triangle workflow, git fetch will pull from the 'origin' remote by default and will push to the remote named in the '[remote] pushdefault' setting (see remote.pushdefault in git-config(1)). So you can do [remote] pushDefault = whereishouldpush and then 'git fetch' and 'git fetch --recurse-submodules' will fetch from origin and 'git push' will push to the whereishouldpush remote. I didn't know about this option, seems useful. A common workflow that we use on my team is to set the tracking branch to 'upstream' for convenient pulls with rebase. This means a feature branch of mine can track its parent branch on 'upstream', so that when other pull requests get merged in on the remote repo branch, I can just do `git pull` and my feature branch rebases onto the latest of that parent branch. Cases like these would work with relative submodules because 'upstream' is the tracked remote (and most of the time we don't want to fork submodules). However sometimes I like to track the same pushed branch on origin (my fork), especially when it is up for pull request. In these cases, my submodule update will fail because I didn't fork my submodules when I changed my tracking branch. Is this correct? breaks on several levels was basically my way of saying that various workflow choices will break when you introduce submodules. One of the beautiful things about Git is that it allows everyone to choose their own workflow. But submodules seem to prevent that to some degree. I think addressing relative submodule usability issues is the best approach for the long term as they feel more sustainable and scalable. 
It's an absolute pain to move a submodule URL, I think we've all experienced it. It's even harder for me because I'm the go-to at work for help with Git. Most people that aren't advanced with Git will not know what to do without a ton of reading such. It might make sense to introduce a new [remote] default = whereishouldfetch setting to allow the name origin above to be replaced, too. Is that what you mean? A couple of years ago I started to work on such a thing ([1] [2] [3]), mainly because when we tried to change to relative submodules we got bitten when someone used clone's -o option so that his super-repo had no origin remote *and* his was checked out on a detached HEAD. So get_default_remote() failed for him. I didn't have time to complete the work -- it ended up being quite involved. But Junio did come up with an excellent transition plan [4] for adopting a default remote setting. [1]
Re: Re: Re: Relative submodule URLs
On Wed, Aug 20, 2014 at 08:18:12AM -0500, Robert Dailey wrote: On Tue, Aug 19, 2014 at 3:57 PM, Heiko Voigt hvo...@hvoigt.net wrote: I would actually error out when specified in already cloned state. Because otherwise the user might expect the remote to be updated. Since we are currently busy implementing recursive fetch and checkout I have added that to our ideas list[1] so we do not forget about it. In the meantime you can either use the branch.name.remote configuration to define a remote to use or just use 'origin'. Cheers Heiko [1] https://github.com/jlehmann/git-submod-enhancements/wiki#add-with-remote--switch-to-submodule-update Thanks Heiko. I would offer to help implement this for you, if you find it to be a good idea, but I've never done git development before and based on what I've seen it seems like you need to know at least 2-3 languages to contribute: bash, perl, C++. I know C++ Python but I don't know perl or bash scripting language. What would it take to help you guys out? It's easy to complain file bugs but as a developer I feel like I should offer more, if it suits you. For this particular case shell scripting should be sufficient. And it should not take too much time. Have a look at the git-submodule.sh script in the repository. That is the one implementing the git submodule command. Additionally you need to extend the documentation and write a test or two. Writing a test is also done in shell script. The documentation[1] is in asciidoc which is pretty self explanatory. The test should probably go into t/t7406-submodule-update.sh and, as Phil pointed out, in t7403-submodule-sync.sh). Also make sure to read the shell scripting part in Documentation/CodingGuidelines and as a general rule: Keep close to the style you find in the file. And when you are ready to send a patch: Documentation/SubmittingPatches. 
If you are happy but unsure about anything just send a patch with your implementation (CC me and everyone involved) and we will discuss it here on the list. Cheers Heiko [1] Documentation/git-submodule.txt
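As a starting point for anyone reading along in git-submodule.sh: the remote-selection helper quoted earlier in this thread can be laid out as below. The `|| var=` guards and the quoting are additions so the sketch also behaves under `set -e`, and `pick_remote` is purely hypothetical; it only illustrates where a value passed via the proposed `--with-remote` could take precedence.

```shell
# Remote that a bare `git fetch` would use for the current branch:
# branch.<name>.remote if configured, otherwise the literal name "origin".
get_default_remote () {
	curr_branch=$(git symbolic-ref -q HEAD) || curr_branch=
	curr_branch=${curr_branch#refs/heads/}
	origin=$(git config --get "branch.$curr_branch.remote") || origin=
	echo "${origin:-origin}"
}

# Hypothetical wrapper: an explicitly requested remote (what a
# --with-remote option might pass in) wins, otherwise fall back to
# the default remote computed above.
pick_remote () {
	if test -n "${1-}"
	then
		echo "$1"
	else
		get_default_remote
	fi
}
```

Whether the chosen name should also become the remote's name inside the submodule is exactly the open question discussed above; the sketch does not take a side.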
Re: Re: Re: Relative submodule URLs
On Tue, Aug 19, 2014 at 3:57 PM, Heiko Voigt hvo...@hvoigt.net wrote:

I would actually error out when specified in already cloned state. Because otherwise the user might expect the remote to be updated. Since we are currently busy implementing recursive fetch and checkout I have added that to our ideas list[1] so we do not forget about it. In the meantime you can either use the branch.<name>.remote configuration to define a remote to use or just use 'origin'.

Cheers Heiko

[1] https://github.com/jlehmann/git-submod-enhancements/wiki#add-with-remote--switch-to-submodule-update

Thanks Heiko. I would offer to help implement this for you, if you find it to be a good idea, but I've never done git development before and based on what I've seen it seems like you need to know at least 2-3 languages to contribute: bash, perl, C++. I know C++ and Python but I don't know perl or bash scripting. What would it take to help you guys out? It's easy to complain and file bugs, but as a developer I feel like I should offer more, if it suits you.

Let me know; I'm happy to help with anything. Thanks again!!
Re: Relative submodule URLs
On Mon, Aug 18, 2014 at 01:55:05PM -0700, Jonathan Nieder wrote: Robert Dailey wrote: The documentation wasn't 100% clear on this, but I'm assuming by remote origin, it says that the relative URL is relative to the actual remote *named* origin (and it is not using origin as just a general terminology). Thanks for reporting. The remote used is the default remote that git fetch without further arguments would use: get_default_remote () { curr_branch=$(git symbolic-ref -q HEAD) curr_branch=${curr_branch#refs/heads/} origin=$(git config --get branch.$curr_branch.remote) echo ${origin:-origin} } The documentation is wrong. git-fetch(1) doesn't provide a name for this thing. Any ideas for wording? How about 'upstream'? Like this[1]? Lets step back a little is this really what we want in such situation? Is one remote really the answer here? I suppose you have relative urls in your .gitmodules file and two remotes in you superproject right? What you want is that the remote names in the superproject are reflected in the submodules when you initialise and update them? Because at the moment what you get is always a remote 'origin' in the submodule. Even if that remote was called 'fork' in the superproject. Maybe in the relative URLs case we should teach the clone in submodule update to use all remotes with their names from the superproject? Would that solve your issue? Is there a way to specify (on a per-clone basis) which named remote will be used to calculate the URL for submodules? Currently there isn't, short of reconfiguring the remote used by default by git fetch. Well currently it is either the tracked remote by the currently checked out branch or if the branch has no tracked remote configured: 'origin'. So by configuring (or checking out a branch with) a different remote you can choose from remote submodule are cloned. No? 
Various co-workers use the remote named central instead of upstream and fork instead of origin (because that just makes more sense to them and it's perfectly valid). However if relative submodules require 'origin' to exist AND also represent the upstream repository (in triangle workflow), then this breaks on several levels. Can you explain further? In a triangle workflow, git fetch will pull from the 'origin' remote by default and will push to the remote named in the '[remote] pushdefault' setting (see remote.pushdefault in git-config(1)). So you can do [remote] pushDefault = whereishouldpush and then 'git fetch' and 'git fetch --recurse-submodules' will fetch from origin and 'git push' will push to the whereishouldpush remote. It might make sense to introduce a new [remote] default = whereishouldfetch setting to allow the name origin above to be replaced, too. Is that what you mean? I think the OP problem stems from him having a branch that does not have a remote configured. Since they do not have 'origin' as a remote and git submodule update --init --recursive path/to/submodule fails. Right? Cheers Heiko [1] From: Heiko Voigt hvo...@hvoigt.net Subject: [PATCH] submodule: use 'upstream' instead of 'origin' in documentation When talking about relative URL's it is ambiguous to use the term 'origin', since that might denote the default remote name 'origin'. Lets use 'upstream' to make it more clear that the upstream repository is meant. Signed-off-by: Heiko Voigt hvo...@hvoigt.net --- Documentation/git-submodule.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index 8e6af65..c6f82e6 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -80,15 +80,15 @@ to exist in the superproject. If path is not given, the The path is also used as the submodule's logical name in its configuration entries unless `--name` is used to specify a logical name. 
 +
-repository is the URL of the new submodule's origin repository.
+repository is the URL of the new submodule's upstream repository.
 This may be either an absolute URL, or (if it begins with ./
-or ../), the location relative to the superproject's origin
+or ../), the location relative to the superproject's upstream
 repository (Please note that to specify a repository 'foo.git'
 which is located right next to a superproject 'bar.git', you'll
 have to use '../foo.git' instead of './foo.git' - as one might expect
 when following the rules for relative URLs - because the evaluation
 of relative URLs in Git is identical to that of relative directories).
-If the superproject doesn't have an origin configured
+If the superproject doesn't have any remote configured
 the superproject is its own authoritative upstream and the current
 working directory is used instead.
 +
--
2.1.0.rc0.52.gaa544bf
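The documentation text touched by the patch says relative submodule URLs are evaluated like relative directories, which is why a sibling repository needs '../foo.git'. A minimal sketch of that evaluation follows. It is not the code from git-submodule.sh: `resolve_relative` is an illustrative name, the example.com URLs are placeholders, and it does not handle stepping above the first path component of an scp-style URL (host:path).

```shell
# resolve_relative <superproject-remote-url> <relative-submodule-url>
# Treats the remote URL like a directory: each leading "../" strips one
# trailing path component, each leading "./" is dropped.
resolve_relative () {
	base=${1%/}	# remote URL without a trailing slash
	rel=$2
	while :
	do
		case "$rel" in
		../*)
			rel=${rel#../}
			base=${base%/*}
			;;
		./*)
			rel=${rel#./}
			;;
		*)
			break
			;;
		esac
	done
	echo "$base/${rel%/}"
}
```

With a superproject remote of https://example.com/team/bar.git, '../foo.git' lands next to bar.git, while './foo.git' ends up underneath it at https://example.com/team/bar.git/foo.git, which is rarely what is wanted.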
Re: Relative submodule URLs
On Mon, Aug 18, 2014 at 3:55 PM, Jonathan Nieder jrnie...@gmail.com wrote: Thanks for reporting. The remote used is the default remote that git fetch without further arguments would use: get_default_remote () { curr_branch=$(git symbolic-ref -q HEAD) curr_branch=${curr_branch#refs/heads/} origin=$(git config --get branch.$curr_branch.remote) echo ${origin:-origin} } The documentation is wrong. git-fetch(1) doesn't provide a name for this thing. Any ideas for wording? I guess a good start would be to call it the tracked remote instead of remote origin. The word tracked here makes it very obvious that if I have a remote tracking branch setup, it will use the remote portion of that configuration. The real question is, how will `git submodule update` function if a tracking remote is not configured? Will it fail with some useful error message? I don't like the idea of it defaulting to self remote mode, where it will be relative to my repo directory. That seems like it would fail is subtle ways in a worst-case scenario (if I did by happenstance have a bare repository cloned up one directory level for other reasons). Currently there isn't, short of reconfiguring the remote used by default by git fetch. I wish there was a way to specify the remote on a per-branch basis separately from the tracking branch. I read a while back that someone proposed some changes to git to support decentralized tracking (concept of an upstream tracking branch and a separate one for origin, i think). If that were implemented, then relative submodules could utilize the 'upstream' remote by default for each branch, which would provide more predictable default behavior. Right now most people on my team would not be aware that if they tracked a branch on their fork, they would subsequently need to fork the submodules to that same remote. 
Various co-workers use the remote named central instead of upstream and fork instead of origin (because that just makes more sense to them and it's perfectly valid). However if relative submodules require 'origin' to exist AND also represent the upstream repository (in triangle workflow), then this breaks on several levels. Can you explain further? In a triangle workflow, git fetch will pull from the 'origin' remote by default and will push to the remote named in the '[remote] pushdefault' setting (see remote.pushdefault in git-config(1)). So you can do [remote] pushDefault = whereishouldpush and then 'git fetch' and 'git fetch --recurse-submodules' will fetch from origin and 'git push' will push to the whereishouldpush remote. I didn't know about this option, seems useful. A common workflow that we use on my team is to set the tracking branch to 'upstream' for convenient pulls with rebase. This means a feature branch of mine can track its parent branch on 'upstream', so that when other pull requests get merged in on the remote repo branch, I can just do `git pull` and my feature branch rebases onto the latest of that parent branch. Cases like these would work with relative submodules because 'upstream' is the tracked remote (and most of the time we don't want to fork submodules). However sometimes I like to track the same pushed branch on origin (my fork), especially when it is up for pull request. In these cases, my submodule update will fail because I didn't fork my submodules when I changed my tracking branch. Is this correct? breaks on several levels was basically my way of saying that various workflow choices will break when you introduce submodules. One of the beautiful things about Git is that it allows everyone to choose their own workflow. But submodules seem to prevent that to some degree. I think addressing relative submodule usability issues is the best approach for the long term as they feel more sustainable and scalable. 
It's an absolute pain to move a submodule URL, I think we've all experienced it. It's even harder for me because I'm the go-to at work for help with Git. Most people that aren't advanced with Git will not know what to do without a ton of reading such. It might make sense to introduce a new [remote] default = whereishouldfetch setting to allow the name origin above to be replaced, too. Is that what you mean? I think you're on the right path. However I'd suggest something like the following: [submodule] remote = remote_for_relative_submodules (e.g. `upstream`) [branch.name] submoduleRemote = remote_for_relative_submodule Above, `submodule.remote` is the 'default' remote used by all relative submodules on all branches. If unspecified, it defaults to `branch.name.remote` as it currently behaves. `branch.name.submoduleRemote` is an override for `submodule.remote`. Basically if you consider this scenario: [branch.myfoo] remote = origin submoduleRemote = upstream I can track an
Re: Relative submodule URLs
On Tue, Aug 19, 2014 at 5:24 AM, Heiko Voigt hvo...@hvoigt.net wrote:

I think the OP problem stems from him having a branch that does not have a remote configured. Since they do not have 'origin' as a remote and git submodule update --init --recursive path/to/submodule fails. Right?

Not exactly. The issue is that there is a tug of war going on between three specific commands (all of which utilize the tracked remote):

git fetch
git pull
git submodule update (for relative submodules)

The way I set up my remote tracking branch will be different for each of these commands:

- git pull :: If I want convenient pulls (with rebase), I will track my upstream branch. My pushes have to be more explicit as a tradeoff.
- git push :: If I want convenient pushes, track my origin branch. Pulls become less convenient. My relative submodules will now need to be forked.
- git submodule update :: I track upstream to avoid forking my submodules. But pushes become more inconvenient.

As you can see, I feel like we're overusing the single remote setting. Sure, we've added some global settings to set default push/pull remotes and such, but I don't think that is a sustainable long term solution.

I like the idea of possibly introducing multiple tracking remotes for various purposes. This adds some additional configuration overhead (slightly), but git is already very config heavy so it might be worth exploring. At least, this feels like a better thing for the long term as I won't be constantly switching my tracking remote for various purposes.

Could also explore the possibility of creating const remotes. If we specify a remote MUST exist for relative submodules, git can create it for us, and fail to operate without it. It's up to the user to map fork to origin if needed (perhaps add a `git remote clone source new remote` to assist with this)?

Various approaches we can take, but I don't do development on Git so I'm not sure what makes the most sense.
Re: Relative submodule URLs
Robert Dailey rcdailey.li...@gmail.com writes:

The way I set up my remote tracking branch will be different for each of these commands:

- git pull :: If I want convenient pulls (with rebase), I will track my upstream branch. My pushes have to be more explicit as a tradeoff.

Keeping 'origin' pointing at the repository where you cloned from, without doing anything funky (i.e. set up my remote) would give you convenient pulls.

- git push :: If I want convenient pushes, track my origin branch. Pulls become less convenient. My relative submodules will now need to be forked.

You need to configure your pushes to go to a different place, if you want them to go to a different place ;-). A long time ago, you had to change the URL used in both directions, making pulls less convenient, but hasn't this been made a non-issue for triangular workflows with the introduction of remote.pushdefault a long time ago?

- git submodule update :: I track upstream to avoid forking my submodules. But pushes become more inconvenient.

If 'submodule update' follows the same place as 'pull' goes by default, I would imagine that there is no issue here, no?

Am I oversimplifying the issue by guessing that the root cause is that you are not using remote.pushdefault from your configuration toolchest and instead setting the 'origin' to a wrong (i.e. where push goes) place?
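To make the remote.pushdefault point concrete, here is a small sketch that prints which remote a plain `git push` would pick for the current branch. It follows the precedence documented in git-config(1): branch.<name>.pushRemote, then remote.pushDefault, then branch.<name>.remote, then 'origin'. The function name `push_remote` is invented for illustration; Git resolves this internally and has no such helper.

```shell
# Print the remote a plain `git push` would target for the current
# branch, following the precedence documented in git-config(1).
push_remote () {
	branch=$(git symbolic-ref -q HEAD) || branch=	# empty on detached HEAD
	branch=${branch#refs/heads/}
	remote=
	if test -n "$branch"
	then
		remote=$(git config --get "branch.$branch.pushRemote") || remote=
	fi
	test -n "$remote" || remote=$(git config --get remote.pushDefault) || remote=
	if test -z "$remote" && test -n "$branch"
	then
		remote=$(git config --get "branch.$branch.remote") || remote=
	fi
	echo "${remote:-origin}"
}
```

In the triangular setup described above, the branch keeps tracking the repository it pulls from, and a single `git config remote.pushDefault <fork>` redirects pushes without touching what 'submodule update' resolves against.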
Re: Relative submodule URLs
On Tue, Aug 19, 2014 at 11:39 AM, Junio C Hamano gits...@pobox.com wrote:

> Robert Dailey rcdailey.li...@gmail.com writes:
>
>> The way I set up my remote tracking branch will be different for each of these commands:
>>
>> - git pull :: If I want convenient pulls (with rebase), I will track my upstream branch. My pushes have to be more explicit as a tradeoff.
>
> Keeping 'origin' pointing at the repository where you cloned from, without doing anything funky (i.e. set up my remote), would give you convenient pulls.
>
>> - git push :: If I want convenient pushes, track my origin branch. Pulls become less convenient. My relative submodules will now need to be forked.
>
> You need to configure your pushes to go to a different place, if you want them to go to a different place ;-). A long time ago, you had to change the URL used in both directions, making pulls less convenient, but hasn't this been made a non-issue for triangular workflows with the introduction of remote.pushdefault long ago?
>
>> - git submodule update :: I track upstream to avoid forking my submodules. But pushes become more inconvenient.
>
> If 'submodule update' follows the same place as 'pull' goes by default, I would imagine that there is no issue here, no?
>
> Am I oversimplifying the issue by guessing that the root cause is that you are not using remote.pushdefault from your configuration tool chest and are instead setting 'origin' to a wrong (i.e. where push goes) place?

Maybe I'm misunderstanding something here and you can help me out. All the reading I've done (mostly GitHub) says that 'upstream' points to the authoritative repository that you forked from but do not have permission to write to. 'origin' points to the place you push your changes for pull requests (the fork).

Basically the workflow I use is:

- Use 'upstream' to PULL changes (latest code is obtained from the authoritative repository)
- Use 'origin' to push your branches.

I never modify the branches that exist in 'upstream' on my 'origin' (I do everything through separate personal branches). That means if I have a branch off of 'master' named 'topic', I will track 'upstream/master' and get the latest with 'git pull'. When I'm ready for a pull request, I do 'git push origin' (I use push.default = simple). According to my understanding, relative submodules work here.

But not everyone on my team uses this workflow. Sometimes they track origin/topic (if we use my example again). Won't the submodule try to find itself on the fork now?

Basically it seems like what you're advocating is that I need to enforce a policy of "always track upstream, never track origin, and always set remote.pushdefault". Seems a bit error prone...
Re: Relative submodule URLs
Robert Dailey rcdailey.li...@gmail.com writes:

> Maybe I'm misunderstanding something here and you can help me out. All the reading I've done (mostly github) says that 'upstream' points to the authoritative repository that you forked from but do not have permissions to write to. 'origin' points to the place you push your changes for pull requests (the fork).

I do not know if that is how GitHub teaches people, but I would have to say that this is strange phrasing. I suspect that part of their documentation was written a long time ago, back when nobody on the GitHub side was involved in the design (let alone implementation) of Git, and I would take it with a grain of salt.

Having said that, here is a summary of the current support for referring to different repositories in Git:

The word 'origin' refers to where things originate from; a place you push to is where things go, so it makes no sense to use that word to refer to the repository where you publish your work result. The 'origin' may or may not be where you can push (or would want to push) to. It is where you 'pull' from to synchronize with the 'upstream'.

The 'upstream' in SCM context refers to those who control a logically more authoritative history than you, whose work you derive your work from, i.e. it is synonymous with 'origin'.

For people like Linus (i.e. he may pull from others, but that is to take in changes made as derived work; he does not pull to grab more authoritative work), there is no need to say 'upstream'; or you can consider that he is his own 'upstream'.

For those who use a CVS-style central repository model (i.e. they pull from that single central shared repository, and push their work back to the same repository), 'origin' is writable by them and they push to it. For people with a CVS-style central shared repository model, their central repository is their 'upstream' with respect to their local copy.

Since these two classes of people need just one other repository to refer to, we just used 'origin' when we did the very initial version of git clone, and these users can keep using that name to refer to that single other repository they interact with.

The support for the triangular workflow, in which you pull from one place and push the result of your work to another, and which the configuration variable 'remote.pushdefault' is a part of, is a relatively recent development in Git. I am not sure we have added an official term to our glossary to refer to the repository you push your work result to, but in the discussions we have seen phrases like 'publishing repository' used, I think. It must be writable by you, of course, and it may or may not be the same as the 'origin' repository.
Re: Re: Relative submodule URLs
On Tue, Aug 19, 2014 at 11:50:08AM -0500, Robert Dailey wrote:

> Maybe I'm misunderstanding something here and you can help me out. All the reading I've done (mostly github) says that 'upstream' points to the authoritative repository that you forked from but do not have permissions to write to. 'origin' points to the place you push your changes for pull requests (the fork).
>
> Basically the workflow I use is:
>
> - Use 'upstream' to PULL changes (latest code is obtained from the authoritative repository)
> - Use 'origin' to push your branches.
>
> Since I never modify the branches that exist in 'upstream' on my 'origin' (I do everything through separate personal branches). That means if I have a branch off of 'master' named 'topic', I will track 'upstream/master' and get latest with 'git pull'. When I'm ready for a pull request, I do 'git push origin' (I use push.default = simple). According to my understanding, relative submodules work here.
>
> But not everyone on my team uses this workflow. Sometimes they track origin/topic (if we use my example again). Won't the submodule try to find itself on the fork now?

Well, the remote for the submodule is currently only calculated once, when you do the initial git submodule update --init that clones the submodule. Afterwards the fixed URL is configured under the name 'origin' in the submodule, like in a normal git repository that you have freshly cloned. Which remote is used for cloning depends on the configured remote for the current branch, or 'origin' if none is configured.

When you do a fetch or push with --recurse-submodules, it only executes a 'git fetch' or 'git push' without any specific remote. For fetch, the same command-line options (but only the options) are passed on. Here it might make sense to guess the remote in the submodule somehow and not do what fetch without remotes would do.

For the triangular workflow not much work has been done with regard to submodule support. But since a submodule behaves like a normal git repository, maybe not much work is needed and we can just point to the workflow without submodules most of the time. We still have to figure that out properly.

Cheers Heiko
Re: Relative submodule URLs
On Tue, Aug 19, 2014 at 2:19 PM, Junio C Hamano gits...@pobox.com wrote:

> I do not know if that is how GitHub teaches people, but I would have to say that these are strange phrasing. I suspect that part of their documentation was written long time ago, back when nobody on the GitHub side were involved in design (let alone implementation) of Git, and I would take it with a grain of salt.
>
> Having said that, here is a summary of the current support for referring to different repositories in Git:
>
> [snip]

Wow, that was a very good read. Thank you for that. I definitely have been using the wrong terms. 'upstream' and 'origin' are interchangeable, yet I was using them to represent two distinct repositories.

I think going forward my central repository will be named 'origin'. For the name of the second, nothing has really jumped out at me yet, but it'll be either 'fork' or 'proxy'. 'surrogate' would be nice too if it weren't such a long word in comparison.

I'm sure you guys will find a name for it in good time. I wonder what Linus would suggest.
Re: Re: Relative submodule URLs
On Tue, Aug 19, 2014 at 2:30 PM, Heiko Voigt hvo...@hvoigt.net wrote:

> Well the remote for the submodule is currently only calculated once, when you do the initial git submodule update --init that clones the submodule. Afterwards the fixed url is configured under the name 'origin' in the submodule like in a normal git repository that you have freshly cloned. Which remote is used for cloning depends on the configured remote for the current branch or 'origin'.
>
> When you do a fetch or push with --recurse-submodules it only executes a 'git fetch' or 'git push' without any specific remote. For fetch the same commandline options (but only the options) are passed on. Here it might make sense to guess the remote in the submodule somehow and not do what fetch without remotes would do.
>
> For the triangular workflow not much work has been done in regards to submodule support. But since a submodule behaves like a normal git repository maybe there is not much work needed and we can just point to the workflow without submodules most times. We still have to figure that out properly.

Maybe then the only thing we need is a --with-remote option for git submodule?

    git submodule update --init --with-remote myremote

The --with-remote option would be a NOOP if the submodule is already initialized, as you say. But I could create an alias for this as needed to make sure it is always specified.

That way, just in case someone cloned from their fork (in which case 'origin' would not be pointing at the right place), they could tell it to use `myremote`. This is really the only strange case to handle right now (people who clone their forks instead of the actual upstream/central repository).
Re: Re: Re: Relative submodule URLs
On Tue, Aug 19, 2014 at 03:23:36PM -0500, Robert Dailey wrote:

> On Tue, Aug 19, 2014 at 2:30 PM, Heiko Voigt hvo...@hvoigt.net wrote:
>
>> Well the remote for the submodule is currently only calculated once, when you do the initial git submodule update --init that clones the submodule. Afterwards the fixed url is configured under the name 'origin' in the submodule like in a normal git repository that you have freshly cloned. Which remote is used for cloning depends on the configured remote for the current branch or 'origin'.
>>
>> When you do a fetch or push with --recurse-submodules it only executes a 'git fetch' or 'git push' without any specific remote. For fetch the same commandline options (but only the options) are passed on. Here it might make sense to guess the remote in the submodule somehow and not do what fetch without remotes would do.
>>
>> For the triangular workflow not much work has been done in regards to submodule support. But since a submodule behaves like a normal git repository maybe there is not much work needed and we can just point to the workflow without submodules most times. We still have to figure that out properly.
>
> Maybe then the only thing we need is a --with-remote option for git submodule?
>
>     git submodule update --init --with-remote myremote
>
> The --with-remote option would be a NOOP if it's already initialized, as you say. But I could create an alias for this as needed to make sure it is always specified.

I would actually error out when it is specified for an already cloned submodule, because otherwise the user might expect the remote to be updated.

Since we are currently busy implementing recursive fetch and checkout, I have added that to our ideas list[1] so we do not forget about it.

In the meantime you can either use the branch.<name>.remote configuration to define a remote to use, or just use 'origin'.

Cheers Heiko

[1] https://github.com/jlehmann/git-submod-enhancements/wiki#add-with-remote--switch-to-submodule-update
Re: Relative submodule URLs
Hi,

Robert Dailey wrote:

> The documentation wasn't 100% clear on this, but I'm assuming by remote origin, it says that the relative URL is relative to the actual remote *named* origin (and it is not using origin as just a general terminology).

Thanks for reporting. The remote used is the default remote that git fetch without further arguments would use:

    get_default_remote () {
        curr_branch=$(git symbolic-ref -q HEAD)
        curr_branch=${curr_branch#refs/heads/}
        origin=$(git config --get branch.$curr_branch.remote)
        echo ${origin:-origin}
    }

The documentation is wrong. git-fetch(1) doesn't provide a name for this thing. Any ideas for wording?

> Is there a way to specify (on a per-clone basis) which named remote will be used to calculate the URL for submodules?

Currently there isn't, short of reconfiguring the remote used by default by git fetch.

> Various co-workers use the remote named central instead of upstream and fork instead of origin (because that just makes more sense to them and it's perfectly valid).
>
> However if relative submodules require 'origin' to exist AND also represent the upstream repository (in triangle workflow), then this breaks on several levels.

Can you explain further? In a triangle workflow, 'git fetch' will pull from the 'origin' remote by default and 'git push' will push to the remote named in the '[remote] pushdefault' setting (see remote.pushdefault in git-config(1)). So you can do

    [remote]
        pushDefault = whereishouldpush

and then 'git fetch' and 'git fetch --recurse-submodules' will fetch from origin and 'git push' will push to the whereishouldpush remote.

It might make sense to introduce a new

    [remote]
        default = whereishouldfetch

setting to allow the name origin above to be replaced, too. Is that what you mean?

Meanwhile it is hard to fork a project that uses relative submodule URLs without also forking the submodules (or, conversely, to fork some of the submodules of a project that uses absolute submodule URLs). That's a real and serious problem, but I'm not sure how it relates to the names of remotes. My preferred fix involves teaching git to read a refs/meta/git (or similarly named) ref when cloning a project with submodules and letting settings from .gitmodules in that ref override .gitmodules in other branches. Is that what you were referring to?

Curious,
Jonathan
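To see which remote name the selection logic quoted above would pick, one can run it in a scratch repository. The sketch below wraps that function (adapted only by adding quoting and a `|| :` guard) and assumes hypothetical remotes named 'central' and 'fork', with made-up paths. It configures names only and fetches nothing.

```shell
#!/bin/sh
# Sketch: which remote name does the quoted selection logic return?
# Remote names 'central' and 'fork' and their paths are hypothetical.

get_default_remote () {
	curr_branch=$(git symbolic-ref -q HEAD)
	curr_branch="${curr_branch#refs/heads/}"
	# '|| :' only keeps this usable if the caller runs under 'set -e'.
	origin=$(git config --get "branch.$curr_branch.remote" || :)
	echo "${origin:-origin}"
}

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git remote add central /central/main
git remote add fork /user/main

# No branch.<name>.remote is set yet, so the function falls back to the
# literal name 'origin', even though no such remote exists here.
before=$(get_default_remote)

# Point the current branch at 'fork' and ask again.
branch=$(git symbolic-ref --short HEAD)
git config "branch.$branch.remote" fork
after=$(get_default_remote)

echo "before: $before"
echo "after:  $after"
```

The point of the sketch is that the answer depends on the currently checked-out branch's configuration, not on any remote being named 'origin' in general.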
Re: Relative submodule URLs, and forks that haven't forked the submodule
So let me see if I understand you correctly.

On Wed, Jun 11, 2014 at 12:15:39PM +0200, Charles Brossollet wrote:

> Hi,
>
> I'm banging my head on this problem: I have a central repo cloned by SSH, and a fork on the same server. The central remote is origin, and the fork is chbrosso-wip.
>
>     $ git remote -v | grep origin
>     origin chbrosso@lltech:/git/lightct.git (fetch)
>     origin chbrosso@lltech:/git/lightct.git (push)
>
>     $ git remote -v | grep chbrosso-wip
>     chbrosso-wip chbrosso@lltech:~/prog/git/lightct.git (fetch)
>     chbrosso-wip chbrosso@lltech:~/prog/git/lightct.git (push)
>
> On a local working copy, fetched my fork and checked out a remote branch out of it. Its remote-tracking branch is on the fork.
>
>     $ git branch -vv | grep \*
>     * actor d98ec24 [chbrosso-wip/actor] (commit msg)
>
> Now, submodules for this repo have relative URLs. And this is where the problem begins, because the submodule isn't forked, but resides only in origin.

Fork is not a git thing. It's not a git command and it's not supported by git. You can of course easily do a fork of a git project, but git will be unaware of it being a fork.

What you're saying is that you have one repository, lightct.git, and one other repository, motors.git, which is a submodule of lightct.git. Then you've made a copy of lightct.git in another place, for example /some/other/path/lightct.git, and naturally the relative submodule path will point to /some/other/path/motors.git, which doesn't exist, since you haven't copied motors.git.

> But this shouldn't cause any problem, right? The docs says that if relative URL are used, they resolve using the origin URL. First issue, it's not the case:

Origin refers to the repository you cloned from. That is, if you did

    git clone lightct.git my_working_copy

the origin for my_working_copy would be lightct.git. However, if you did

    git clone /some/other/path/lightct.git my_working_copy

the origin for my_working_copy would be /some/other/path/lightct.git.

So to me it seems to be correct.

>     $ cat .gitmodules
>     [submodule "motors"]
>         path = motors
>         url = ../motors.git
>         branch = master
>
>     $ git submodule init motors
>     Submodule 'motors' (chbrosso@lltech:~/prog/git/motors.git) registered for path 'motors'
>
> Here the submodule is registered on my fork, which doesn't exist, and it's wrong with what the documentation says. Fine, I'll edit the .git/config entry to make it point to origin:
>
>     $ git config submodule.motors.url chbrosso@lltech:/git/motors.git
>     $ git config submodule.motors.url
>     chbrosso@lltech:/git/motors.git
>     $ ssh chbrosso@lltech "if [ -d /git/motors.git ]; then echo 'ok'; fi"
>     Password:
>     ok
>
> So the submodule's url is changed, and points to a correct path, let's update so that I can work
>
>     $ git submodule update motors
>     Password:
>     fatal: '~/prog/git/motors.git' does not appear to be a git repository
>     fatal: Could not read from remote repository.
>
>     Please make sure you have the correct access rights
>     and the repository exists.
>     Unable to fetch in submodule path 'motors'
>
> That's right, it is still the old url, and I can't have my submodule!

Here you change the path to the submodule at /some/other/path/lightct.git and then it isn't changed in my_working_copy. How could it? They don't communicate if you don't tell them to.

> Can someone explain what's going on? And how can I get my submodule in the working copy?

Either create a copy of the submodule just as you did with lightct.git, or use non-relative paths.

--
Best regards
Fredrik Gustafsson

tel: 0733-608274
e-mail: iv...@iveqy.com
Re: Relative submodule URLs, and forks that haven't forked the submodule
Thanks for taking the time to understand; let me make it clearer.

On 12 June 2014 at 17:25, Fredrik Gustafsson iv...@iveqy.com wrote:

> So let me see if I understand you correctly.
>
> On Wed, Jun 11, 2014 at 12:15:39PM +0200, Charles Brossollet wrote:
>
>> Hi,
>>
>> I'm banging my head on this problem: I have a central repo cloned by SSH, and a fork on the same server. The central remote is origin, and the fork is chbrosso-wip.
>>
>>     $ git remote -v | grep origin
>>     origin chbrosso@lltech:/git/lightct.git (fetch)
>>     origin chbrosso@lltech:/git/lightct.git (push)
>>
>>     $ git remote -v | grep chbrosso-wip
>>     chbrosso-wip chbrosso@lltech:~/prog/git/lightct.git (fetch)
>>     chbrosso-wip chbrosso@lltech:~/prog/git/lightct.git (push)
>>
>> On a local working copy, fetched my fork and checked out a remote branch out of it. Its remote-tracking branch is on the fork.
>>
>>     $ git branch -vv | grep \*
>>     * actor d98ec24 [chbrosso-wip/actor] (commit msg)
>>
>> Now, submodules for this repo have relative URLs. And this is where the problem begins, because the submodule isn't forked, but resides only in origin.
>
> Fork is not a git thing. It's not a git command and it's not supported by git. You can of course easily do a fork of a git project, but git will be unaware of it beeing a fork.

OK, you get it: what I mean by fork here is an independent copy of a repository, at another remote place.

> What you're saying is that you've one repository: lightct.git and one other repository which is a submodule to lightct.git at motors.git. Then you've made a copy of lightct.git to an other place for example: /some/other/path/lightct.git and the naturally the submodule path that's relative will point to /some/other/path/motors.git that doesn't exists, since you haven't copied motors.git

That's right. Origin is the repository that was originally cloned to the working copy, and I have a copy of it in /some/other/path, without motors.git having been copied. I haven't copied motors.git because I won't modify it, so I still want to refer to it...

>> But this shouldn't cause any problem, right? The docs says that if relative URL are used, they resolve using the origin URL. First issue, it's not the case:
>
> Orgin refers to the repository you cloned from. That is if you did git clone lightct.git my_working_copy the origin for my_working_copy would be lightct.git. However if you did git clone /some/other/path/lightct.git my_working_copy the origin for my_working_copy would be /some/other/path/lightct.git
>
> So to me it seems to be correct.

No, in the working copy, origin's location isn't changed; it is still the repository I originally (!) cloned from. I added the other remote afterward, and named it chbrosso-wip, not origin. So the working copy has two remotes, origin and chbrosso-wip. If we follow the docs, the URL for the submodule shouldn't be set to chbrosso-wip's URL, but this is what is happening.

[snip]

>> That's right, it is still the old url, and I can't have my submodule!
>
> Here you change the path to the submodule at /some/other/path/lightct.git and then it isn't changed in my_working_copy. How could it? They don't communicate if you don't tell them to.

No, you missed my point. Let me explain it in a more condensed way.

There are 3 repos, main, fork, and sub, with the following URLs:

    /central/main
    /central/sub
    /user/main

sub is a submodule of main, referred to with a relative URL in .gitmodules.

The working copy was cloned from /central/main, which git therefore refers to as origin. I then added /user/main as another remote repository and fetched from it. Initially the submodule isn't cloned in the working copy.

The two problems I'm pointing out are:

1. After checking out a branch that tracks the /user/main repo, I call git submodule init motors. Git registers it in .git/config with URL /user/sub, while it should be /central/sub according to the documentation, because origin's URL is at /central.

2. For an obscure reason, changing the URL in .git/config to /central/sub and calling git submodule update still makes git want to clone from /user/sub, and it fails. There seems to be no way to tell git the right URL for this submodule, while it should be possible according to the submodule documentation.

>> Can someone explain what's going on? And how can I get my submodule in the working copy?
>
> Either created a copy of the submodule just as you did with lightct.git or use non-relative paths.
>
> --
> Best regards
> Fredrik Gustafsson
>
> tel: 0733-608274
> e-mail: iv...@iveqy.com
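The two candidate results in the three-repository layout above (/central/sub versus /user/sub) can be reproduced with a small stand-alone sketch. It is a deliberately simplified illustration, not git's own resolution code: the helper name resolve_relative is hypothetical, and it only handles leading "../" and "./" components on plain paths.

```shell
#!/bin/sh
# Simplified sketch of resolving a relative submodule URL against the
# URL of a superproject remote. Illustration only: 'resolve_relative'
# is a hypothetical helper and handles just leading ../ and ./ parts.

resolve_relative () {
	base=${1%/}   # URL of the remote used as the base, trailing slash removed
	rel=$2        # 'url' value taken from .gitmodules
	while :
	do
		case "$rel" in
		../*) base=${base%/*}; rel=${rel#../} ;;   # go up one level
		./*)  rel=${rel#./} ;;                     # stay at this level
		*)    break ;;
		esac
	done
	echo "$base/$rel"
}

# Same .gitmodules entry, two different base remotes.
central=$(resolve_relative /central/main ../sub)
user=$(resolve_relative /user/main ../sub)

echo "relative to /central/main: $central"
echo "relative to /user/main:    $user"
```

The same ../sub entry therefore lands in different places depending only on which remote's URL is taken as the base, which is why the choice of base remote matters in this report.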
Re: [git] Re: Relative submodule URLs, and forks that haven't forked the submodule
On Thu, Jun 12, 2014 at 06:05:10PM +0200, Charles Brossollet wrote:

> The two problems I'm pointing are:
>
> 1. After checkout of a branch that tracks /user/main repo, call git init submodule motors. Git registers it in .git/config with URL /user/sub, while it should be /central/sub according to documentation because origin's URL is at /central.

The logic for this is in resolve_relative_url, defined in git-submodule.sh. The remote it uses is calculated using get_default_remote, defined in git-parse-remote.sh:

    get_default_remote () {
        curr_branch=$(git symbolic-ref -q HEAD)
        curr_branch=${curr_branch#refs/heads/}
        origin=$(git config --get branch.$curr_branch.remote)
        echo ${origin:-origin}
    }

> 2. For an obscure reason, changing the url in .git/config to /central/sub and call git submodule update still make git want to clone from /user/sub, and fails. There seems to be no way to tell git the right URL for this submodule, while it should be possible according to the submodule documentation.

This is very surprising to me. With Git v1.9.1:

* Clone just the superproject, without its sibling submodule projects:

    $ git clone git://github.com/wking/pygrader.git pg-1

* Clone the isolated superproject, so we'll have broken relative URLs:

    $ git clone pg-1 pg-2

* Initialize a submodule:

    $ git submodule init dep/src/pyassuan

* Fix the broken, expanded-from-relative URL to point back to the original:

    $ git config submodule.dep/src/pyassuan.url git://github.com/wking/pyassuan.git

* Initial, cloning update:

    $ git submodule update

That all works as expected for me.

Cheers,
Trevor

--
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy