On Tue, Sep 3, 2013 at 1:49 PM, Jeff King <p...@peff.net> wrote:
>> - by going through index-pack first, then unpack, we pay extra cost
>> for completing a thin pack into a full one. But compared to fetch's
>> total time, it should not be noticeable because unpack-objects is
>> only called when the pack contains a small number of objects.
> ...but the cost is paid by total pack size, not number of objects. So if
> I am pushing up a commit with a large uncompressible blob, I've
> effectively doubled my disk I/O. It would make more sense to me for
> index-pack to learn command-line options specifying the limits, and then
> to operate on the pack as it streams in. E.g., to decide after seeing
> the header to unpack rather than index, or to drop large blobs from the
> pack (and put them in their own pack directly) as we are streaming into
> it (we do not know the blob size ahead of time, but we can make a good
> guess if it has a large on-disk size in the pack).
Yeah letting index-pack do the work was my backup plan :) I think if
there is a big blob in the pack, then the pack should not be unpacked
at all. If you store big blobs in a separate pack, you already pay the
lookup cost of one more pack in find_pack_entry(), so why go through
the process of unpacking? index-pack still has the advantage of
streaming though. Will rework.