On Mon, 2005-04-18 at 01:39 +0200, Petr Baudis wrote:
> I think this is bad, bad, bad. If you don't keep around all the
> _commits_, you get into all sorts of troubles - when merging, when doing
> git log, etc. And the commits themselves are probably a pretty small
> portion of the whole thing. I didn't do any actual measurement, but I
> would be pretty surprised if they came to much more than a few
> megabytes of data for the kernel history.

I'm not sure it's that bad -- and everyone already seems perfectly happy
not to have history going back before 2.6.12-rc2. We're not talking
about doing this by _default_ -- we're talking about allowing people to
keep trees pruned if they _want_ to. So I might want to drop history
before 2.6.0 on my laptop, for example.

> Of course an entirely different thing are _trees_ associated with those
> commits. As long as you stay with a simple three-way merge, you
> basically never want to look at trees which aren't heads and which you
> don't specifically request to look at. And the trees and what they carry
> inside is the main bulk of data.

If the trees are absent and you're trying to merge, what do you gain
from having the commit objects? And for the case of 'git log', I
certainly think it's acceptable that you lose out on those parts of
prehistory which you've explicitly removed from your local tree --
that's a feature, not a bug. 

For the special case of removing history before 2.6.12-rc2 from the
trees, I certainly think we can do it by leaving out all the commits,
not just the trees. Leaving them out is easy; what we cannot do is
_add_ that history back later if the commit chain omits it from the
start.

For history older than 2.6.12-rc2 I'd suggest that it would be available
in a different place, and absent from the 'main' working tree that
everyone uses by default. The only difference we'd see in the working
tree is that the 2.6.12-rc2 commit -- the oldest commit in that tree --
would actually have an absentee parent instead of appearing to be an
import. And all the sha1 hashes of all subsequent commits would be
different, of course.
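
To make the hash point concrete, here is a rough sketch of why adding a
parent to the oldest commit changes every ID after it. It assumes the
object layout of present-day Git ("<type> <size>\0<body>" hashed with
SHA-1), which may not match the format in use as of this mail. The
names object_id and commit_body, the author line, and the all-ones
parent ID are made up for illustration.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    # Assumed layout (present-day Git): "<kind> <size>\0<body>", SHA-1.
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: list[str], message: str) -> bytes:
    # Hypothetical fixed author/committer so only the parents vary.
    who = "A U Thor <author@example.com> 1113780000 +0000"
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return ("\n".join(lines) + "\n").encode()


empty_tree = object_id("tree", b"")
old_tip = "1" * 40  # placeholder ID for the tip of the imported history

# The oldest commit, once as a parentless import and once with a
# parent pointer to history that is not present locally.
root_as_import = object_id("commit", commit_body(empty_tree, [], "2.6.12-rc2"))
root_with_parent = object_id(
    "commit", commit_body(empty_tree, [old_tip], "2.6.12-rc2"))

# The same follow-up commit on top of each variant.
child_a = object_id(
    "commit", commit_body(empty_tree, [root_as_import], "next change"))
child_b = object_id(
    "commit", commit_body(empty_tree, [root_with_parent], "next change"))

print(root_as_import != root_with_parent)  # the root ID changes
print(child_a != child_b)                  # and so does every descendant
```

Nothing needs to be stored for old_tip here: the parent line is just
text inside the commit, so the pointer can exist while the object it
names is absent.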

Allowing pruning of older objects in the general case would be a little
harder than that, because as things stand you'd re-fetch them every
time you rsync from elsewhere -- but that wouldn't really be hard to
fix if we care.

Either way, I think it can probably be done by omitting the commit
objects as well as the trees -- but the important point is that we
_should_ include a 'parent' pointer in the oldest commit of the tree
we're working with, pointing back to the imported history.
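
As a sketch of what a 'git log'-style walk could do with such a tree,
the following treats a missing parent as the edge of local history
rather than as an error. The dict-based store and the walk_history
name are assumptions for illustration, not how any existing tool
does it.

```python
def walk_history(head, store):
    """Yield (commit_id, present) pairs reachable from head.

    store maps a commit ID to the list of its parent IDs; an ID that
    is missing from store stands for a pruned (absent) commit.
    """
    seen = set()
    stack = [head]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        if cid not in store:
            # Absent parent: report the boundary and do not descend.
            yield cid, False
            continue
        yield cid, True
        stack.extend(reversed(store[cid]))


# Three local commits; "old-tip" is referenced but has been pruned.
store = {"c3": ["c2"], "c2": ["c1"], "c1": ["old-tip"]}

for cid, present in walk_history("c3", store):
    print(cid if present else f"{cid} (not present locally)")
```

The walk still learns the ID of the absent parent, so someone who
fetches the older history later can continue from exactly that point.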

-- 
dwmw2
