I found something that might be interesting. To me the problem seems to be the way the pack files are constructed.
I checked what happened with git log, and this is what I get:

/lhome/gitadmin/repo/main> git log
> commit 214baf2cea19d66e3a1817e8e6aa4883294be05f
> Merge: ac974b0 8ad7c91
> Author: gitadmin <gitadmin@gitrepo>
> Date: Wed Aug 29 13:02:42 2012 +0200
>
>     Split 'xyz/' into commit '8ad7c91aef6a4814fce80ab6e092fe7eeedc8090'
>
>     git-subtree-dir: xyz
>     git-subtree-mainline: ac974b0c9ac110a85c6f58fb460ee54a64992bda
>     git-subtree-split: 8ad7c91aef6a4814fce80ab6e092fe7eeedc8090
>
> commit ac974b0c9ac110a85c6f58fb460ee54a64992bda
> Author: gitadmin <gitadmin@gitrepo>
> Date: Wed Aug 29 13:02:07 2012 +0200
>
>     initial commit importing everything
>
> commit 8ad7c91aef6a4814fce80ab6e092fe7eeedc8090
> Author: gitadmin <gitadmin@gitrepo>
> Date: Wed Aug 29 13:02:07 2012 +0200
>
>     [xyz] initial commit importing everything

The interesting parts are the SHAs 8ad7c91aef6a4814fce80ab6e092fe7eeedc8090 and ac974b0c9ac110a85c6f58fb460ee54a64992bda, which show up both in the merge commit's message and as commits of their own. So the subtree command actually does create a synthetic commit (8ad7c91). I checked the contents of this commit with:

git cat-file -p 8ad7c91aef6a4814fce80ab6e092fe7eeedc8090

and saw that it only points to one tree object --> 0f201238. This tree object is the folder containing only the files / subtree I need. So the commit contains EXACTLY what I want to have.

So why does it not work? Here is the interesting part (which I still don't get): when I clone that subtree branch using file://, git is smart enough to repack and create a separate pack file that contains only what I need / what the commit points to. When I clone from a remote machine via ssh, though, I get the big pack file that contains everything. Interestingly, when cloning locally via ssh, git again repacks and gives me a new pack file.

Any ideas how I can force the repacking when cloning?

On Thursday, 30 August 2012 14:18:51 UTC+2, Philip Oakley wrote:
>
> On Aug 30, 11:34 am, Haasip Satang <[email protected]> wrote:
> > ;-) That's what I did, as you can see in my explanation above ;-)
> > The problem still seems to be that it only works locally on the
> > same Linux machine. When I try to clone from any remote machine
> > (no matter which OS) I end up getting the huge .git folder.
> >
> > So the question actually is why does
> >
> > git clone --depth 1 --no-hardlinks file:///home/me/gitTests/subtreeRepo -b subtrees/xyz xyz
> >
> > give me a small clone (but only locally), while cloning from
> > remote I get a big one.
>
> I'm going to 'guess' that it is one of two things.
>
> One is the pack protocol, which could mean that in one case you get
> a compressed pack (though that doesn't sound like your case ;-).
>
> And the other is that the --no-hardlinks, in conjunction with the
> other options, has severely limited the number of branches that are
> cloned (copied locally), while --depth=1 will (could) still have
> pulled down the lead commit for every branch and hidden them under
> /remotes/.
>
> > And as mentioned earlier as well, when cloning the small xyz from
> > remote then I end up with what I wanna have; a small xyz project
> > on a remote machine.
> >
> > Why can I not directly clone xyz remotely and get the same result
> > as with the local clone?
> >
> > On Thursday, 30 August 2012 09:00:04 UTC+2, Philip Oakley wrote:
> > >
> > > Isn't a shallow clone a good use case for this? You only need
> > > the latest commit of each project you want to build and then it
> > > either works or it doesn't, and the clone is then deleted.
> > >
> > > So is 'git clone --depth <depth>' what you need?
> > > Use <depth> := 1
> > >
> > > Just a thought
> > >
> > > Philip
> > >
> > > ----- Original Message -----
> > > From: Haasip Satang
> > > To: [email protected]
> > > Sent: Thursday, August 30, 2012 1:21 AM
> > > Subject: [git-users] Size of cloned git subtrees - only history /
> > > files for subtree needed
> > >
> > > Hi all,
> > >
> > > in short, the question behind the lengthy explanation below is:
> > > How can I create a clone of a subtree that contains only the data
> > > needed for that subtree in the .git folder?
> > >
> > > In detail, here is what I have tried already and what my setup
> > > looks like: We have a big repository containing multiple projects
> > > (political reasons, cannot avoid having that... at least for now).
> > > While this works fine for all the developers (they just clone the
> > > big repo and get all the projects they need), we are facing
> > > problems with our continuous build system (Jenkins).
> > >
> > > Here we would like to have a job for each single project; of
> > > course WITHOUT having to clone the whole big repo for every job,
> > > as this would lead to a significant overhead on disk.
> > >
> > > After searching around for some time I basically came across four
> > > potential solutions:
> > >
> > > 1. Sparse Checkout
> > > 2. Submodules
> > > 3. Individual Repos with a manager script like repo, mr,
> > >    git-status, and all the others that exist to tackle that problem
> > > 4. Subtrees
> > >
> > > The problem with 1 is that you still have to clone the whole repo
> > > (including all history), only to then check out a part of it -->
> > > still disk overhead.
> > > As for submodules, I personally don't really like them and don't
> > > think they should be used in this case; they are kinda difficult
> > > to handle and can be fragile anyway.
> > > The additional script-based solution seems kinda hacky as well, so
> > > I didn't really follow up on that too much.
> > >
> > > So my favorite solution so far is actually using git subtree,
> > > which is more or less easy (especially since the subtree branches
> > > are only used for the CI builds / in a read-only way; nothing
> > > needs to be pushed back to the big repo).
> > >
> > > The problem is, however, that when I clone the bare repo, create
> > > the subtree branches in the cloned working copy and then try to
> > > clone only these subtree branches, I still seem to get the whole
> > > big history, including all the stuff outside the tree.
> > >
> > > Is there any way to avoid that and create a synthetic project
> > > history containing only data relevant for the subtree?
> > >
> > > What I did to kinda get there is more of a hack. I create the
> > > subtree branch using:
> > >
> > > git subtree split --prefix=xyz --annotate="[xy] " --rejoin -b subtrees/xyz
> > >
> > > Then I clone that with:
> > >
> > > git clone --depth 1 --no-hardlinks file:///home/me/gitTests/subtreeRepo -b subtrees/xyz xyz
> > >
> > > So creating a shallow clone (depth 1) seems to be the only way,
> > > and even that only works on the local Linux machine. If I clone
> > > the same subtreeRepo branch on a remote machine I actually get the
> > > whole big pack / history with it (which I of course don't want).
> > >
> > > So what I did is I cloned the subtree branch locally and then
> > > cloned that repo from my remote Jenkins machine. While this seems
> > > to work (I haven't yet checked whether I'm getting the necessary
> > > change sets to send out the emails), it seems both unnecessarily
> > > complicated and very hacky.
> > >
> > > To sum up, let me conclude with the question from the beginning:
> > > How can I create a clone of a subtree that contains only the data
> > > needed for that subtree in the .git folder?
> > >
> > > Looking forward to your comments and ideas :)
> > >
> > > Thanks, Haasip
> > >
> > > --
> > > You received this message because you are subscribed to the Google
> > > Groups "Git for human beings" group.
> > > To view this discussion on the web visit
> > > https://groups.google.com/d/msg/git-users/-/n5ZPYpDf4EIJ.
> > > To post to this group, send email to [email protected].
> > > To unsubscribe from this group, send email to [email protected].
> > > For more options, visit this group at
> > > http://groups.google.com/group/git-users?hl=en.
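
For anyone who wants to reproduce the file:// behaviour discussed in this thread on a throwaway repository, here is a minimal sketch. It is not the poster's actual setup: the names bigrepo and other/, the file contents and the total_objects helper are all made up for illustration, and git commit-tree is used here as a stand-in for git subtree split (for a one-commit history it produces a similar parentless commit whose root tree is the xyz/ tree). It does not explain why the ssh clone came out large; it only shows one way to count what a clone actually received.

```shell
#!/bin/sh
set -eu

# Throwaway working area (hypothetical layout, not the poster's repo).
work=$(mktemp -d)
git init --quiet "$work/bigrepo"
cd "$work/bigrepo"
git config user.name "Example"
git config user.email "example@example.invalid"

mkdir xyz other
echo "small project" > xyz/README
# Random payload outside xyz/ so the difference in content is obvious.
head -c 200000 /dev/urandom > other/blob.bin
git add .
git commit --quiet -m "initial commit importing everything"

# Stand-in for `git subtree split`: a commit whose root tree is the xyz/ tree.
tree=$(git rev-parse HEAD:xyz)
commit=$(echo "[xyz] initial commit importing everything" | git commit-tree "$tree")
git branch subtrees/xyz "$commit"

# Shallow clone of only the subtree branch, through the file:// transport.
git clone --quiet --depth 1 --branch subtrees/xyz \
    "file://$work/bigrepo" "$work/xyz"

# Helper (assumed name): loose objects plus packed objects in a repository.
total_objects() {
    git -C "$1" count-objects -v |
        awk '/^count:|^in-pack:/ { n += $2 } END { print n }'
}

big=$(total_objects "$work/bigrepo")
small=$(total_objects "$work/xyz")
echo "objects in bigrepo: $big, objects in subtree clone: $small"
```

Running the same total_objects check inside a clone made over ssh would show whether the extra objects really arrive with the clone, or whether only the file:// case is being trimmed down to the reachable objects.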
