> On Oct 14, 2016, at 07:06, Emilian Bold <e...@apache.org> wrote:
> Hello,
> I've recently learned git allows 'shallow' clones that may contain no
> history whatsoever.
> See the git clone manual <https://git-scm.com/docs/git-clone>, specifically
> the --depth parameter.
> Obviously this will be a huge bandwidth, time and disk saver for some
> people.

I agree shallow git clones are great. I think I would use them even with 
smaller repos, at least until I needed more of the history.
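
For reference, this is roughly the workflow I have in mind (just a sketch; the 
URL is a placeholder, not our repo):

    # grab only the latest commit, none of the older history
    git clone --depth 1 https://example.org/repo.git
    cd repo

    # later, if the truncated history turns out not to be enough
    git fetch --depth=100    # deepen the history to the last 100 commits
    git fetch --unshallow    # or turn it into a full clone after all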

> And it seems that git even supports push / pull from shallow repositories.
> I believe this would permit us to still use a single unaltered repository
> while allowing users (or GitHub mirrors) to be shallow.

Yes, but even then the whole repository is still much larger than it needs to 
be. The repository is 1GB just for the sources. If I’m working on Groovy, Java, 
and Core, then I don’t need PHP, C/C++, or the rest, and frankly they are out 
of context in that case. Perhaps as a start we look at how to get moved over 
(we have to be able to put it in the infra regardless of thoughts on this) and 
then figure something out from there. I.e., it isn’t scalable, IMO, that 
everyone working on one technology has to contribute to and merge up with 
everyone working on the other technologies unless they are actually changing 
some central thing.
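
On the push/pull point from the quote above: as I understand it, newer git 
(1.9+, if I remember right) does let a shallow clone push and pull against a 
normal repository, so the day-to-day workflow would look roughly like this 
(again just a sketch, placeholder URL):

    git clone --depth 1 https://example.org/repo.git
    cd repo
    # ...edit, then commit and push as usual...
    git commit -am "Fix whatever"
    git push origin master

    # pulling newer upstream commits also works; the clone stays shallow
    git pull origin master

But that only trims the history; every current file in the repository still 
comes down, which is the structural part I’m getting at.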

> PS: Philosophically speaking, I see all this discussion about repository
> size and history stripping as a failure of DVCS
> <https://en.wikipedia.org/wiki/Distributed_version_control>s and/or of the
> Internet infrastructure. Removing history is the equivalent of removing
> comments to save disk space.

I don’t think that last statement is necessarily accurate. I mean, if a file 
has had so many changes that its old revisions are irrelevant and useless, then 
what meaning do they have? It is hard to make a case that they are useful after 
some time. To me it is like keeping too much stuff in the house because we are 
afraid to get rid of it. If you will never touch it, does it have any meaning? 
You might keep something, and some time down the road you go “Man, if I had 
that I could have made 10,000 bucks!”, but if you had sold off the old stuff 
and saved the money as you went through life, you probably would have had more 
money instantly available. And the times that 10,000-dollar item was actually 
laying around were probably so rare you can’t remember them, or they never 
happened. Maybe a bad analogy, but I think there is still a point when history 
is just stale, and even where it is slightly useful, not by much, because of 
the complication of relating it to “now” at any point in time; the deeper a 
file’s history, the bigger the complexity between depth N and depth 1, IMO.

On the DVCS stuff, I don’t know. It is like the “cloud”: smaller things just 
scale better, at least until not only disk space but also bandwidth gets 
cheaper and more available. Even on large networks like AWS, smaller drives 
scale better for these problems, whereas bigger ones don’t, because you are 
dealing with so many connections and data pools. Even if we were using SVN, if 
we depended on pulling down all of C++, Python, PHP, Java, Groovy, etc. just to 
work on, say, JavaScript, and those pieces pushed the pull over 1GB, I think 
the same problem would exist, and personally I don’t find it practical. So I 
see it as a problem of structure more than a problem with the technology…at 
least until we have quantum SSDs and quantum-entanglement-driven networks :-D

