> On Oct 14, 2016, at 07:06, Emilian Bold <e...@apache.org> wrote:
> I've recently learned git allows 'shallow' clones that may contain no
> history whatsoever.
> See the git clone manual <https://git-scm.com/docs/git-clone>, specifically
> the --depth parameter.
> Obviously this will be a huge bandwidth, time and disk saver for some
I agree shallow git clones are great. I think I would use them even with
smaller repos until I needed to know more.
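For anyone following along, a minimal sketch of the shallow-clone workflow being discussed (the repository URL is just a placeholder, not the actual project repo):

```shell
# Clone only the most recent commit; no earlier history is downloaded.
git clone --depth 1 https://example.org/big-repo.git
cd big-repo

# If the full history is later needed, it can still be fetched:
git fetch --unshallow
```

The `--unshallow` fetch means a shallow clone is not a one-way door; you pay the full download cost only if and when you actually need the history.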
> And it seems that git even supports push / pull from shallow repositories.
> I believe this would permit us to still use a single unaltered repository
> while allowing users (or GitHub mirrors) to be shallow.
Yes, but then the whole is still much larger. The repository is 1 GB just for
the sources. If I’m working on Groovy, Java, and Core, then I don’t need PHP,
C/C++, or the others, and frankly they are out of context in that case. Perhaps
as a start we look at how to get moved over (we have to be able to put it in
the infra regardless of opinions on this) and then figure something out. I.e.,
it isn’t scalable, IMO, that everyone working on one technology has to pull
down and merge up with everyone else working on other technologies unless they
are actually changing some central thing.
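One partial workaround, if the repository stays monolithic: a sparse checkout keeps the unrelated trees out of the working copy, even though the objects are still fetched. A sketch, assuming a recent git (the `git sparse-checkout` command needs 2.25+) and hypothetical directory names:

```shell
# Clone without populating the working tree, then check out only
# the java/ and groovy/ trees; php/, cpp/, etc. stay out of the
# working directory (their objects are still downloaded, though).
git clone --no-checkout https://example.org/big-repo.git
cd big-repo
git sparse-checkout set java groovy
git checkout master
```

This solves the "out of context" working-copy problem, but not the bandwidth or disk cost of the full object store, so it is a complement to, not a substitute for, splitting by technology.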
> PS: Philosophically speaking, I see all this discussion about repository
> size and history stripping as a failure of DVCS
> <https://en.wikipedia.org/wiki/Distributed_version_control>s and/or of the
> Internet infrastructure. Removing history is the equivalent of removing
> comments to save disk space.
I don’t think that last statement is necessarily accurate. If a file has so
many changes that its old revisions are irrelevant and useless, what meaning do
they have? It is hard to make a case that they are useful after some time. To
me it is like keeping too much stuff in the house because we are afraid to get
rid of it. If you will never touch it, does it have any meaning? You might keep
something, and some time down the road go “Man, if I still had that I could
have made 10,000 bucks!”, but if you had sold off old stuff and saved the money
as you went through life, you would probably have had more money instantly
available. And the times that 10,000-dollar item would have paid off were
probably so rare you can’t remember them, or they never happened. Maybe a bad
analogy, but I think there is still a point when history is just stale, and
even if slightly useful, not very, given the difficulty of relating it to “now”
at any point in time; the deeper a file’s history, the greater the complexity
between revision N and revision 1, IMO.
On the DVCS stuff, I don’t know. It is like the “cloud”: smaller things just
scale better, at least until not only disk space but also bandwidth gets
cheaper and more available. Even in large networks like AWS, smaller drives
scale better for these problems, whereas bigger ones don’t, because you are
dealing with so many connections and data pools. Even if we were using SVN, if
we depended on pulling down all of C++, Python, PHP, Java, Groovy, etc. just to
work on one of them, the same problem would exist, and personally I don’t find
that practical. So I see it as a problem of structure as much as a problem with
the technology…at least until we have quantum SSDs and quantum-entanglement-
driven networks :-D