Re: Submodule, subtree, or something else?

2015-08-24 Thread Stefan Beller
On Sun, Aug 23, 2015 at 7:11 AM, Jānis Rukšāns janis.ruks...@gmail.com wrote:
 On Fri, 2015-08-21 at 17:07 -0700, Stefan Beller wrote:
 On Fri, Aug 21, 2015 at 3:47 PM, Jānis Rukšāns janis.ruks...@gmail.com wrote:
 
  A major drawback of submodules in my opinion is the
  inability to make a full clone from an existing one without having
  access to the central repository, which is something I have to do from
  time to time.

 Can you elaborate on that a bit more?
 git clone --recurse-submodules should do that no matter which remote
 you contact?

 I mean that if I have cloned a repository with submodules, cloning that
 repository with --recurse-submodules will either access the central
 server if absolute URLs are used, or requires additional clones for
 each submodule.  For example

 git clone --recursive http://somewhere/projectA.git
 git clone --recursive file://$(pwd)/projectA projectA.tmp

 The second command will cause the submodules to be downloaded again, or
 expect them to be found in $(pwd).

IIUC, the second command will look up the submodules in $(pwd), but if they
are not there they are skipped, so all of the existing submodules are cloned.
Why do you need more submodules in the tmp clone than in $(pwd)/projectA
would be my next question. But I see your point now.




 Or am I mistaken, or doing something wrong?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Submodule, subtree, or something else?

2015-08-24 Thread Jānis Rukšāns
On Mon, 2015-08-24 at 09:51 -0700, Stefan Beller wrote:
 IIUC, the second command will look up the submodules in $(pwd), but if
 they are not there they are skipped, so all of the existing submodules
 are cloned.
 Why do you need more submodules in the tmp clone than in
 $(pwd)/projectA would be my next question. But I see your point now.

The $(pwd) was just an example to illustrate my point.  The actual use
case is that I would be hacking on something at work, notice that it is
already late and I have to catch the last bus home, yet I don't want to
postpone whatever I was working on until the next day.  So I would do
git commit -a -m "[WIP] Stuff, finish at home" to save my work so far,
go home, and clone / fetch it over ssh.

Another important factor is that a lot of our code can be meaningfully
tested only on the actual hardware, and is built in a VM.  Quite often
getting things right involves many iterations of hack hack hack, git
commit --amend, fetch && reset --hard in the VM, build, test, repeat.
Being able to clone / fetch directly from the copy I am working on makes
it a lot easier.

As I wrote in the other e-mail, I managed to achieve the desired result
by using ./submodule (without .git suffix) as the submodule URL, and
creating a file named submodule in the bare repo with
'gitdir: ../submodule.git' as its contents, but I'm not sure whether
it is a good idea or not.

Jānis



Re: Submodule, subtree, or something else?

2015-08-24 Thread Jānis Rukšāns
On Sun, 2015-08-23 at 17:13 -0600, Cox, Michael wrote:
 You might want to take a look at how the Boost (boost.org) project
 uses submodules.  They use submodules for each library.  I know they
 use relative paths in their .gitmodules file to avoid the problem
 you're referring to regarding git clone --recurse-submodules.

Thanks!  I had a look at their setup, and they are using ../libx.git for
submodules, which unfortunately breaks when cloning from another
working copy:

$ git clone --recursive file:///tmp/gittest/repo.a/main.git main.work
Cloning into 'main.work'...
<snip>
Submodule 'liba' (file:///tmp/gittest/repo.a/liba.git) registered for path 'liba'
Cloning into 'liba'...
<snip>
Submodule path 'liba': checked out '6a0ef37c03a7068328956dcb8a08bc39f280edfc'

$ git clone --recursive file://$(pwd)/main.work main.home
Cloning into 'main.home'...
<snip>
Submodule 'liba' (file:///tmp/gittest/work/liba.git) registered for path 'liba'
Cloning into 'liba'...
fatal: '/tmp/gittest/work/liba.git' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
Clone of 'file:///tmp/gittest/work/liba.git' into submodule path 'liba' failed
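
The two "registered for path" lines above differ because a relative submodule URL is resolved against the URL the superproject was cloned from. A minimal sketch of that resolution follows; the function name resolve_submodule_url is made up for illustration, and the logic is a simplification, not git's actual implementation:

```shell
# Simplified illustration of resolving a relative submodule URL against the
# URL the superproject was cloned from. Hypothetical helper, not git's code.
resolve_submodule_url() {
    base=${1%/}   # superproject's remote URL, trailing slash removed
    rel=$2        # "url" value from .gitmodules
    while :; do
        case $rel in
            ../*) base=${base%/*}; rel=${rel#../} ;;  # step up one level
            ./*)  rel=${rel#./} ;;                    # stay at this level
            *)    break ;;
        esac
    done
    printf '%s/%s\n' "$base" "$rel"
}

# Cloned from the bare origin: the sibling liba.git exists.
resolve_submodule_url file:///tmp/gittest/repo.a/main.git ../liba.git
# Cloned from the working copy: there is no liba.git next to main.work.
resolve_submodule_url file:///tmp/gittest/work/main.work ../liba.git
# With ./liba the URL lands inside the working copy, where the submodule
# checkout itself lives.
resolve_submodule_url file:///tmp/gittest/work/main.work ./liba
```

The first two calls reproduce the URLs shown in the transcript; the third shows why a ./liba URL can work when cloning from a working copy.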


After some trial and error I managed to get what I wanted to achieve by
using ./liba as the submodule URL (no .git suffix!), and creating a file
named liba in /tmp/gittest/repo.a/main.git (i.e. the bare origin repo)
with a single line in it:

gitdir: ../liba.git

However, I'm not sure it is the right thing, or even advisable to do so.
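
For reference, the layout described above can be sketched as follows. This only sets up the redirect file in a scratch directory; the empty main.git and liba.git directories are stand-ins for the real bare repositories, and the variable names are chosen for illustration. As far as I understand, git resolves a relative gitdir path relative to the directory containing the file:

```shell
# Scratch stand-in for /tmp/gittest/repo.a; in practice main.git and
# liba.git would be the existing bare repositories, not empty directories.
origin=$(mktemp -d)
mkdir -p "$origin/main.git" "$origin/liba.git"

# The submodule URL "./liba" resolves to <origin>/main.git/liba, so a
# gitfile with that name redirects to the real bare repository.
printf 'gitdir: ../liba.git\n' > "$origin/main.git/liba"

# Show the redirect and the directory it points at, relative to main.git.
cat "$origin/main.git/liba"
gitfile_target=$(cd "$origin/main.git" && cd ../liba.git && pwd)
printf 'points at: %s\n' "$gitfile_target"
```

Whether relying on a gitfile inside a bare repository is advisable remains the open question raised above; the sketch only shows the mechanics.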



Re: Submodule, subtree, or something else?

2015-08-23 Thread Jānis Rukšāns
On Fri, 2015-08-21 at 17:07 -0700, Stefan Beller wrote:
 On Fri, Aug 21, 2015 at 3:47 PM, Jānis Rukšāns janis.ruks...@gmail.com wrote:
  
  A major drawback of submodules in my opinion is the
  inability to make a full clone from an existing one without having
  access to the central repository, which is something I have to do from
  time to time.
 
 Can you elaborate on that a bit more?
 git clone --recurse-submodules should do that no matter which remote
 you contact?

I mean that if I have cloned a repository with submodules, cloning that
repository with --recurse-submodules will either access the central
server if absolute URLs are used, or requires additional clones for
each submodule.  For example

git clone --recursive http://somewhere/projectA.git
git clone --recursive file://$(pwd)/projectA projectA.tmp

The second command will cause the submodules to be downloaded again, or
expect them to be found in $(pwd).

Or am I mistaken, or doing something wrong?



Re: Submodule, subtree, or something else?

2015-08-21 Thread Stefan Beller
On Fri, Aug 21, 2015 at 3:47 PM, Jānis Rukšāns janis.ruks...@gmail.com wrote:
 Hello,


 First of all, I apologise for the wall of text that follows; obviously I
 am bad at this.

 My $DAYJOB is switching from Subversion to Git, primarily because of
 its distributed nature (we are scattered all across the globe), and the
 ease of branching and merging.  One issue that has popped up is how to
 manage code shared between multiple projects.

 Our SVN setup used a shared repository for all projects, either using
 externals for shared code, or, more often than not, simply merging the
 code between projects as needed.  Ignoring the fact that merging with
 SVN is somewhat cumbersome, overall it has worked quite well for us,
 especially when combined with git-svn.

 For external libraries that rarely change, submodules appear to be the
 obvious choice when using Git.  On the other hand, I've found them
 somewhat cumbersome to use, and subtree merging (either using git
 subtree, or directly with git merge -s subtree) is closer to what we
 were doing in SVN.  A major drawback of submodules in my opinion is the
 inability to make a full clone from an existing one without having
 access to the central repository, which is something I have to do from
 time to time.

Can you elaborate on that a bit more?
git clone --recurse-submodules should do that no matter which remote
you contact?



 For internal libraries, the situation is even less clear.  For many of
 these libraries, most of the development happens within the context of a
 single project, with commits to the main project being interleaved with
 commits to the subproject(s), resulting in histories resembling:

  (using git submodule)

   A---B---S1---S2---C---S3
          ,´   ,´       ,´
     N---O----P---Q----R

  (using git subtree with --rejoin)

   A---B---N---O---M1---M2---Q---C---R---M3
                  /    /                /
            N'--O'----P-----------Q'--R'

  (using merge -s subtree)

   A---B---M1---M2---C---M3
          /    /        /
     N---O----P---Q----R

 where A, B and C are changes to the main project, N, O, P, Q and R are
 changes to library code, and Sn and Mn are submodule updates and merge
 commits, respectively.

 From what I have gathered, submodules have issues with branching and
 merging; therefore, unless I'm mistaken, submodules are kinda out of
 the question.  Of the remaining two options, merging directly results in a
 nicer history, but requires making all changes to the library repo first
 (although I am quite sure that a similar effect can be achieved with
 plumbing, similarly to how git subtree split works), and is harder to
 use than git subtree.  Also, all three options can result in the main
 project history being cluttered with extra commits.

 Lastly, there is a particularly painful 3rd party library that has an
 enormous amount of local modifications that are never going to make it
 upstream, essentially making it a fork, project specific changes that
 are required for one project, but would break others, separate language
 bindings that access the internals (often requiring bug fixes to be made
 simultaneously to both), and, if that wasn't enough, it *requires*
 several source files to be modified for each individual project that
 uses it.  It's a complete mess, but we're stuck with it for the existing
 projects, as switching to an alternative would be too time consuming.


 To sum up, I'm looking for something that would let us share code
 between multiple projects and allow for:

 1) separate histories with relatively easy branching and merging

 2) distributed workflow without having to set up multiple repositories
 everywhere (e.g. work - home - laptop)

 3) to work on the shared code within a project using it

 4) inspection of the complete history

 5) modifications that are not shared with other projects

 and would not result in lots of clutter in the history.

 Repository size is somewhat less of an issue, because each submodule has
 to be checked out anyway.

 Submodules let you have #3, and #1, #2 and #5 to a point, after which it
 becomes a pain.  git subtree allows #1, #2, #3 and #4, and #5 with some
 pain (?), but results in duplicate commits.  Using subtree merge
 strategy directly gives everything except #3, but is harder to use than
 submodules or subtree.

 Are there any other options beside these three for sharing (or in some
 cases, not sharing) common code between projects using Git, that would
 address the above points better?  Or, alternatively, ways to work around
 the drawbacks of the existing tools?

 Lastly, I will be grateful for any suggestions about how to handle the
 messy case described above better.

 Thanks,
 Jānis
