I found something that might be interesting. To me the problem seems to be 
the way the pack files are constructed. 

I checked what happened with git log and what I get is the following: 

/lhome/gitadmin/repo/main> git log
> commit 214baf2cea19d66e3a1817e8e6aa4883294be05f
> Merge: ac974b0 8ad7c91
> Author: gitadmin <gitadmin@gitrepo>
> Date:   Wed Aug 29 13:02:42 2012 +0200
>     Split 'xyz/' into commit '*8ad7c91aef6a4814fce80ab6e092fe7eeedc8090*'
>     git-subtree-dir: xyz
>     git-subtree-mainline: *ac974b0c9ac110a85c6f58fb460ee54a64992bda*
>     git-subtree-split: *8ad7c91aef6a4814fce80ab6e092fe7eeedc8090*
> commit *ac974b0c9ac110a85c6f58fb460ee54a64992bda*
> Author: gitadmin <gitadmin@gitrepo>
> Date:   Wed Aug 29 13:02:07 2012 +0200
>     initial commit importing everything
> commit 8ad7c91aef6a4814fce80ab6e092fe7eeedc8090
> Author: gitadmin <gitadmin@gitrepo>
> Date:   Wed Aug 29 13:02:07 2012 +0200
>     [xyz] initial commit importing everything


The interesting parts are the highlighted SHAs: the subtree command 
actually does create a synthetic commit (8ad7c91).

I checked the contents of this commit with: 

git cat-file -p 8ad7c91aef6a4814fce80ab6e092fe7eeedc8090

and saw that it points to just one tree object, 0f201238. This tree 
object holds only the files of the subtree I need, so the commit 
contains EXACTLY what I want to have. 
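
For reference, a minimal sketch of that inspection on a throwaway repository. The repository layout and file names are made up for illustration, and `git commit-tree` is used only as a stand-in for the parentless commit that `git subtree split` produces; it is not meant to show how subtree works internally. Note that `git cat-file` needs a mode flag such as `-t` (type) or `-p` (pretty-print).

```shell
#!/bin/sh
set -e
# Throwaway identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir xyz other
echo "wanted" > xyz/a.txt
echo "unwanted" > other/b.txt
git add .
git commit -q -m "initial commit importing everything"

# Stand-in for the split commit: a parentless commit whose root tree
# is the tree of the xyz/ directory.
subtree=$(git rev-parse HEAD:xyz)
split=$(printf '[xyz] initial commit importing everything\n' |
        git commit-tree "$subtree")

obj_type=$(git cat-file -t "$split")            # object type
git cat-file -p "$split"                        # tree, author, committer, message
tree_in_commit=$(git rev-parse "$split^{tree}") # the single tree it points to
files=$(git ls-tree -r --name-only "$split")    # files reachable from that tree

echo "type:  $obj_type"
echo "tree:  $tree_in_commit"
echo "files: $files"
```

If the tree listed by `cat-file -p` matches the subtree directory and `ls-tree` shows only the subtree files, the commit itself is not what drags the unrelated data along.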

So why does it not work? Here is the interesting part (that I still 
don't get): 

When I clone that subtree branch using file:// git is smart enough to 
repack and create a separate pack file that only contains what I need / 
what the commit is pointing to. 
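
The effect of the URL form can be seen in isolation by comparing a clone from a plain path with a clone from a file:// URL. According to the git-clone documentation, a plain local path copies or hardlinks the object directory, while file:// goes through the normal transport, which builds a pack of only the objects that are needed. The sketch below uses a made-up repository; the orphan branch merely stands in for the split branch, and `--single-branch` and `git -C` assume a reasonably recent Git.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/big"
cd "$work/big"
mkdir xyz other
echo "small" > xyz/a.txt
echo "pretend this is huge" > other/big.bin
git add .
git commit -q -m "initial commit importing everything"
big_blob=$(git rev-parse HEAD:other/big.bin)

# Stand-in for the split branch: no parents, only the subtree content.
git checkout -q --orphan subtrees/xyz
git rm -r -f -q .
echo "small" > a.txt
git add a.txt
git commit -q -m "[xyz] initial commit importing everything"

# Same branch, same repository, two different ways of naming the source.
git clone -q --single-branch -b subtrees/xyz "$work/big" "$work/by_path"
git clone -q --single-branch -b subtrees/xyz "file://$work/big" "$work/by_url"

# Report whether the unrelated blob ended up in the given clone.
has_blob() {
    if git -C "$1" cat-file -e "$2" 2>/dev/null; then echo yes; else echo no; fi
}
path_result=$(has_blob "$work/by_path" "$big_blob")
url_result=$(has_blob "$work/by_url" "$big_blob")

echo "plain path clone contains unrelated blob: $path_result"
echo "file:// clone contains unrelated blob:    $url_result"
```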

When I clone from remote via ssh though I get the big pack file that 
contains everything. Interestingly, when cloning locally via ssh git again 
repacks and gives me a new pack file. 

Any ideas how I can force the repacking when cloning? 
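
One thing that could be tried is to skip `git clone` and fetch only the split branch into an empty repository, so that no other refs are requested. This is a sketch under assumptions, not a confirmed fix for the ssh case: the source repository below is a local stand-in, the `url` variable is hypothetical (in the real setup it would be the ssh URL), and the orphan branch imitates what `git subtree split` produces.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/bigrepo"
cd "$work/bigrepo"
mkdir other
echo "unrelated project data" > other/big.bin
git add .
git commit -q -m "initial commit importing everything"
unrelated_blob=$(git rev-parse HEAD:other/big.bin)

# Imitation of the split branch with two commits of its own.
git checkout -q --orphan subtrees/xyz
git rm -r -f -q .
echo "v1" > a.txt
git add a.txt
git commit -q -m "[xyz] first"
echo "v2" > a.txt
git commit -q -a -m "[xyz] second"

# Hypothetical source URL; replace with the real remote.
url="file://$work/bigrepo"

# Empty repository, then fetch exactly one ref, one commit deep.
git init -q "$work/xyz"
cd "$work/xyz"
git fetch -q --depth 1 "$url" \
    "refs/heads/subtrees/xyz:refs/heads/subtrees/xyz"
git checkout -q subtrees/xyz

commits=$(git rev-list --count HEAD)
if git cat-file -e "$unrelated_blob" 2>/dev/null; then
    leaked=yes
else
    leaked=no
fi

git count-objects -v   # object and pack statistics of the new repository
echo "commits visible: $commits, unrelated blob present: $leaked"
```

Running `git count-objects -v` in the result and in a regular clone gives a direct size comparison, which should show whether the transport or the set of requested refs is responsible for the big pack.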



On Thursday, August 30, 2012 14:18:51 UTC+2, Philip Oakley wrote:
>
>
>
> On Aug 30, 11:34 am, Haasip Satang <haasip.sat...@googlemail.com> 
> wrote: 
> > ;-) That's what I did as you can see in my explanation above ;-) The 
> > problem still seems to be that it only works locally on the same linux 
> > machine. When I try to clone from any remote machine (no matter which 
> > OS) I end up getting the huge .git folder. 
> > 
> > So the question actually is why does 
> > 
> > git clone --depth 1 --no-hardlinks file:///home/me/gitTests/subtreeRepo 
> > -b subtrees/xyz xyz 
> > 
> > give me a small clone (but only locally), while cloning from remote I 
> > get a big one. 
> > 
>
> I'm going to 'guess' that it is one of two things. 
>
> One is the pack protocol that could mean that in one case you get a 
> compressed pack (though that doesn't sound like your case ;-). 
>
> And the other is that the --no-hardlinks, in conjunction with the 
> other options has severely limited the number of branches that are 
> cloned (copied locally), while --depth=1 will (could) still have 
> pulled down the lead commit for every branch and hidden them under 
> /remotes/. 
>
>
> > And as mentioned earlier as well, when cloning the small xyz from 
> > remote then I end up with what I wanna have; a small xyz project on a 
> > remote machine. 
> > 
> > Why can I not directly clone xyz remotely and get the same result as 
> > with the local clone? 
> > 
> > On Thursday, August 30, 2012 09:00:04 UTC+2, Philip Oakley wrote: 
> > 
> > 
> > 
> > 
> > 
> > > Isn't a shallow clone a good use case for this? You only need the 
> > > latest commit of each project you want to build and then it either 
> > > works or it doesn't, and the clone is then deleted. 
> > 
> > > So is 'git clone --depth <depth>' what you need? 
> > > Use  <depth> := 1 
> > 
> > > Just a thought 
> > 
> > > Philip 
> > 
> > > ----- Original Message ----- 
> > 
> > > *From:* Haasip Satang 
> > > *To:* git-...@googlegroups.com 
> > > *Sent:* Thursday, August 30, 2012 1:21 AM 
> > > *Subject:* [git-users] Size of cloned git subtrees - only history / 
> > > files for subtree needed 
> > 
> > > Hi all, 
> > 
> > > in short, the question behind the lengthy explanation below is: How 
> > > can I create a clone of a subtree that only contains the data needed 
> > > for that subtree in the .git folder? 
> > 
> > > In detail, here is what I have tried already and what my setup looks 
> > > like: We have a big repository containing multiple projects (political 
> > > reasons, cannot avoid having that... at least for now). While this 
> > > works fine for all the developers (they just clone the big repo and 
> > > get all the projects they need), we are facing problems with our 
> > > continuous build system (Jenkins). 
> > 
> > > Here we would like to have a job for each single project; of course 
> > > WITHOUT having to clone the whole big repo for every job, as this 
> > > would lead to a significant overhead on disk. 
> > 
> > > After searching around for some time I basically came across four 
> > > potential solutions: 
> > 
> > > 1. Sparse Checkout 
> > > 2. Submodules 
> > > 3. Individual Repos with a manager script like repo, mr, git-status, 
> > >    and all the others that exist to tackle that problem 
> > > 4. Subtrees 
> > 
> > > The problem with 1 is, you still have to clone the whole repo 
> > > (including all history), only to then checkout a part of it --> still 
> > > disk overhead. 
> > > As for submodules, I personally don't really like them and don't think 
> > > they should be used in this case; they are kinda difficult to handle 
> > > and can be fragile anyway. 
> > > The additional script based solution seems kinda hacky as well, so I 
> > > didn't really follow up on that too much. 
> > 
> > > So my favorite solution so far is actually using git subtree, which is 
> > > more or less easy (especially since the subtree branches are only used 
> > > for the CI builds / in a read only way, nothing needs to be pushed 
> > > back to the bigrepo). 
> > 
> > > The problem is, however, when I clone the bare repo and then create 
> > > the subtree branches in the cloned working copy and then try to clone 
> > > these subtree branches only, I still seem to get the whole big 
> > > history, including all the stuff outside the tree. 
> > 
> > > Is there any way to avoid that and create a synthetic project history 
> > > containing only data relevant for the subtree? 
> > 
> > > What I did to kinda get there is more a hacky way. I create the 
> > > subtree branch using: 
> > 
> > >  git subtree split --prefix=xyz --annotate="[xy] " --rejoin -b subtrees/xyz 
> > 
> > > Then I clone that with: 
> > 
> > > git clone --depth 1 --no-hardlinks file:///home/me/gitTests/subtreeRepo -b subtrees/xyz xyz 
> > 
> > > So creating a shallow clone (depth 1) seems to be the only way and 
> > > that also only works on the local linux machine. If I clone the same 
> > > subtreeRepo branch on a remote machine I actually get the whole big 
> > > pack / history with it (which I of course don't want). 
> > 
> > > So what I did is I cloned the subtree branch locally and then cloned 
> > > that repo from my remote Jenkins machine. While this seems to work (I 
> > > haven't looked into whether I'm getting the necessary change sets to 
> > > send out the emails yet) it seems both unnecessarily complicated and 
> > > very hacky. 
> > 
> > > To sum up, let me conclude with the question from the beginning: How 
> > > can I create a clone of a subtree that only contains the data needed 
> > > for that subtree in the .git folder? 
> > 
> > > Looking forward to your comments and ideas :) 
> > 
> > > Thanks, Haasip 
> > 
> > > -- 
> > > You received this message because you are subscribed to the Google 
> Groups 
> > > "Git for human beings" group. 
> > > To view this discussion on the web visit 
> > >https://groups.google.com/d/msg/git-users/-/n5ZPYpDf4EIJ. 
> > > To post to this group, send email to git-...@googlegroups.com. 
> > > To unsubscribe from this group, send email to 
> > > git-users+...@googlegroups.com. 
> > > For more options, visit this group at 
> > >http://groups.google.com/group/git-users?hl=en. 
> > 
>

