Frank Sorenson <[EMAIL PROTECTED]> writes:

> It reduces the disk space requirement significantly (linux packs from
> 135MB to 73MB), and I'm seeing speed improvements as well (probably
> because cache-cold operation requires far less seeking, and the caching
> requirements are smaller).
>
> What are the benefits to keeping old packs?

For a private repository in which one develops and from which
one pushes to public repositories, packing everything into one
pack and pruning everything else (including old packs) is always
the optimal choice, if one can afford the time to repack.

There are no benefits to _keeping_ old packs, but there may be
benefits to _not_ packing everything into one huge pack when
other people are involved.

Suppose I currently have three packs (one covering the beginning
of time up to some time ago, one incremental on top of it, and
another incremental on top of the other two).  Somebody cloned
from my repository reasonably early in the project timeline (he
has only the first pack), and somebody else cloned yesterday (he
has all three packs).  Now "git count-objects" reports that many
other objects are unpacked, and I decide it is time to repack.

At this point I could pack everything into one new big pack and
remove the old packs.  Or I could create a fourth incremental.
Another possibility, which is what I currently do by hand, is to
create a pack that is incremental on top of the first two, and
replace the latest incremental with it.

Now these two people want to fetch from my repository, while a
third person wants to clone from scratch.  Which repacking
strategy gives the best transfer to these three people?  Having
a single huge pack favors the newcomer and penalizes the old
timers.  In particular, the current http-pull is not smart
enough to pick the better pack when an object is found in more
than one pack, so leaving old packs around would not help.

Leaving the old packs around could help all of them.  In the
above example, I could create the fourth incremental _and_ a
superpack that has everything in it.  The newcomer would slurp
in the superpack, the one with only the first pack could use
either the second+third+fourth packs or the superpack, and the
one with all three could use the fourth pack.
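
To make the comparison above concrete, here is a rough Python
sketch of the three-client example.  Everything in it is an
assumption made for illustration: the object ranges, the pack
sizes (including a superpack somewhat smaller than the sum of the
incrementals), the names, and above all the idea that a client
downloads whole packs and is smart enough to pick the cheapest
combination, which the current http-pull is not.

```python
from itertools import combinations

# Hypothetical packs: (set of object ids, size in arbitrary units).
P1 = (set(range(0, 100)), 100)        # beginning of time to some time ago
P2 = (set(range(100, 150)), 50)       # first incremental
P3 = (set(range(150, 180)), 30)       # second incremental
P4 = (set(range(180, 200)), 20)       # new incremental of the loose objects
SUPER = (set(range(0, 200)), 160)     # everything in one pack (assumed size)

STRATEGIES = {
    "single": {"super": SUPER},
    "incremental": {"p1": P1, "p2": P2, "p3": P3, "p4": P4},
    "both": {"p1": P1, "p2": P2, "p3": P3, "p4": P4, "super": SUPER},
}

# What each client already has before fetching.
CLIENTS = {
    "early": set(range(0, 100)),      # cloned when only the first pack existed
    "recent": set(range(0, 180)),     # cloned yesterday, has all three packs
    "newcomer": set(),                # clones from scratch
}


def cheapest_download(have, packs):
    """Smallest total size of whole packs that covers every missing object."""
    everything = set().union(*(objs for objs, _ in packs.values()))
    missing = everything - have
    if not missing:
        return 0
    best = None
    names = list(packs)
    # Brute force over all pack combinations; fine for a handful of packs.
    for count in range(1, len(names) + 1):
        for combo in combinations(names, count):
            covered = set().union(*(packs[name][0] for name in combo))
            if missing <= covered:
                cost = sum(packs[name][1] for name in combo)
                if best is None or cost < best:
                    best = cost
    return best


def compare():
    """Return {strategy: {client: download cost}} for the example above."""
    return {
        strategy: {
            client: cheapest_download(have, packs)
            for client, have in CLIENTS.items()
        }
        for strategy, packs in STRATEGIES.items()
    }


if __name__ == "__main__":
    for strategy, costs in compare().items():
        print(strategy, costs)
```

With these made-up sizes, the single pack costs every client 160
units, the incrementals alone cost 100, 20 and 200 units for the
early cloner, yesterday's cloner and the newcomer respectively,
and keeping both costs 100, 20 and 160.  The numbers only mean
something relative to the assumed sizes.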

Having said that, packing has interesting compression
characteristics.  Repacking the three existing packs (from the
example) along with the unpacked objects into one pack would
result in a very small pack compared to the sum of the three
existing packs, depending on how often you repack.  In that
sense, it may not be such a big deal to force everybody to
re-fetch everything by repacking everything into one, even if
most of the objects are already available to them locally.

> I disagree about not removing old packs.

I am not saying we should not remove old packs.  I am saying
that repacking, choosing which packs to remove, and doing the
actual removal should be kept as separate steps and in separate
commands, perhaps the latter two as part of "git prune".
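
As a sketch of what the middle step, choosing which packs to
remove, might look like when kept separate from repacking and
from the removal itself: the function below is hypothetical, is
not how "git prune" works, and assumes each pack can be described
simply as a set of object ids.  It only reports packs whose
objects are all present in the packs that would remain; deleting
them is left to a later step.

```python
def choose_removable(packs):
    """Return pack names that can be dropped without losing any object.

    packs maps a (hypothetical) pack name to the set of object ids it
    holds.  Smaller packs are considered first, so that when a big
    all-in-one pack exists it is the one that gets kept.
    """
    kept = dict(packs)
    removable = []
    for name in sorted(packs, key=lambda n: len(packs[n])):
        # Objects still available if this pack were dropped.
        others = set().union(
            *(objs for other, objs in kept.items() if other != name)
        )
        if kept[name] <= others:
            removable.append(name)
            del kept[name]  # later decisions must not rely on this pack
    return removable


if __name__ == "__main__":
    example = {
        "first": set(range(0, 100)),
        "second": set(range(100, 150)),
        "third": set(range(150, 180)),
        # Replacement for "third": incremental on top of the first two.
        "third-new": set(range(150, 200)),
    }
    print(choose_removable(example))
```

Dropping packs one at a time, and removing each from the kept set
before looking at the next, avoids the case where two packs with
the same contents are each judged redundant because of the other.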

-jc
