On Tuesday, August 06, 2013 06:24:50 am Duy Nguyen wrote:
> On Tue, Aug 6, 2013 at 9:38 AM, Ramkumar Ramachandra wrote:
> > + Garbage collect using a pseudo logarithmic packfile maintenance
> > + approach. This approach attempts to minimize packfile churn
> > + by keeping several generations of varying sized packfiles around
> > + and only consolidating packfiles (or loose objects) which are
> > + either new packfiles, or packfiles close to the same size as
> > + another packfile.
>
> I wonder if a simpler approach may be nearly as efficient
> as this one: keep the largest pack out, repack the rest at
> fetch/push time so there are at most 2 packs at a time.
> Or we could do the repack at 'gc --auto' time, but with a
> lower pack threshold (about 10 or so). When the second
> pack is as big as, say, half the size of the first, merge
> them into one at "gc --auto" time. This can be easily
> implemented in git-repack.sh.

It would definitely be better than the current gc approach.
However, I suspect it is still at least one to two orders of
magnitude off from where it should be. To give you a
real-world example, on our server today when gitexproll ran
on our kernel/msm repo, it consolidated 317 pack files into
a single packfile of almost 8M (it compresses/dedupes
shockingly well; one of those new packs alone was 33M). Our
largest packfile in that repo is 1.5G!

So let's now imagine that the second largest packfile is
only 100M. It would keep getting consolidated with 8M worth
of data every day (assuming the same conditions and no extra
compression). It would take (750M-100M)/8M ~ 81 days for it
to reach 750M (half the size of the 1.5G pack) and finally
stop being consolidated with the new packs daily. During
those 80+ days, it will on average be writing roughly 425M
too much per day (the average size of the second pack as it
grows from 100M to 750M) when it should be writing just 8M.
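
To make the arithmetic above easy to replay with other numbers,
here is a rough Python sketch. It is only a model, not anything
git or gitexproll actually runs: the function name and its
parameters are made up for illustration, sizes are in megabytes,
1.5G is rounded to 1500M, and it assumes the second pack grows
by exactly the daily amount with no extra compression.

```python
def simulate_second_pack_churn(largest_mb, second_mb, daily_mb,
                               merge_fraction=0.5):
    """Model the 'keep the largest pack out' policy (hypothetical helper).

    Every day the second pack is rewritten together with daily_mb of
    new data, until it reaches merge_fraction * largest_mb.
    Returns (days, average excess megabytes written per day).
    """
    threshold = largest_mb * merge_fraction
    size = second_mb
    days = 0
    excess_total = 0.0
    while size < threshold:
        # The repack rewrites the whole second pack plus the new data;
        # only the new data (daily_mb) strictly needed to be written,
        # so the current size of the second pack is the excess.
        excess_total += size
        size += daily_mb
        days += 1
    if days == 0:
        return 0, 0.0
    return days, excess_total / days


if __name__ == "__main__":
    days, avg_excess = simulate_second_pack_churn(1500, 100, 8)
    print(f"days of daily rewrites: {days}")
    print(f"average excess per day: {avg_excess:.0f}M")
```

The function counts whole days, so the fractional 81.25 from the
division rounds up by one day.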

So I can see the appeal of a simple solution. Unfortunately,
I think one layer would still "suck". And if you are going
to add even just one extra layer, I suspect that you might
as well go the full distance, since you would probably
already need to implement the logic to do so.
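
For what "several layers" could look like, here is a minimal
Python sketch of one possible selection rule. To be clear, this
is not the logic gitexproll uses; the function name and the
`ratio` parameter are assumptions chosen only to illustrate the
idea of merging packs that are close in size and leaving much
larger packs alone.

```python
def packs_to_consolidate(sizes_mb, ratio=2.0):
    """Pick the packs to merge in one pass (hypothetical rule).

    Walk the packs from smallest to largest and keep absorbing the
    next pack as long as it is no bigger than ratio times the total
    absorbed so far. Packs beyond that point are left untouched.
    """
    merged = []
    total = 0
    for size in sorted(sizes_mb):
        if merged and size > ratio * total:
            # The next pack is much larger than everything collected
            # so far: rewriting it now would be wasted churn.
            break
        merged.append(size)
        total += size
    return merged


if __name__ == "__main__":
    # Small new packs are merged; the 100M and 1500M packs are skipped.
    print(packs_to_consolidate([8, 8, 12, 100, 1500]))
    # Once the small generation has grown close to 100M, it cascades.
    print(packs_to_consolidate([28, 30, 100, 1500]))
```

With a rule like this, a large pack only gets rewritten once the
smaller generations below it have grown to a comparable size,
rather than every day.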
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation