On Thu, Sep 12, 2013 at 12:45:44PM +0000, Pyeron, Jason J CTR (US) wrote:
> If the rules of engagement are changed a bit, the server side can be released
> from most of its work (CPU/IO).
>
> Client does the following, looping as needed:
>
> heads = server->heads();
> knownCommits = local->AllCommits();
> missingBlobs = [];
> foreach (commit : heads) if (!knownCommits->contains(commit))
>     missingBlobs[] = commit;
> foreach (commit : knownCommits) if (!commit->isValid())
>     missingBlobs[] = commit->blobs();
> if (missingBlobs->size() > 0) server->FetchBlobs(missingBlobs);

That doesn't quite work. The client does not know the set of missing
objects just from the commits. It knows the sha1 of the root trees it is
missing. And then if it fetches those, it knows the sha1 of any
top-level entries it is missing. And when it gets those, it knows the
sha1 of any 2nd-level entries it is missing, and so forth.
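
To make the level-by-level discovery concrete, here is a rough sketch. It
assumes a toy in-memory object store; the SERVER table, the object names,
and the fetch_by_level() helper are all invented for illustration and are
not git's real data structures or protocol. It only counts how many
requests a client needs when each response reveals just the next level of
sha1s.

```python
# Toy object store: sha -> (kind, payload). Hypothetical layout, not git's format.
SERVER = {
    "c1": ("commit", "t0"),          # commit points at its root tree
    "t0": ("tree", ["t1", "b0"]),    # root tree
    "t1": ("tree", ["t2", "b1"]),    # top-level entry
    "t2": ("tree", ["b2"]),          # 2nd-level entry
    "b0": ("blob", None),
    "b1": ("blob", None),
    "b2": ("blob", None),
}


def children(obj):
    """Return the sha1s an object refers to (a commit's root tree, a tree's entries)."""
    kind, payload = obj
    if kind == "commit":
        return [payload]
    if kind == "tree":
        return list(payload)
    return []


def fetch_by_level(server, have, wants):
    """Fetch missing objects one level per request; return (round_trips, objects)."""
    have = dict(have)
    round_trips = 0
    frontier = [sha for sha in wants if sha not in have]
    while frontier:
        round_trips += 1  # one request for everything known to be missing right now
        fetched = {sha: server[sha] for sha in frontier}
        have.update(fetched)
        # Only after receiving this level does the client learn the next one.
        next_frontier = []
        for obj in fetched.values():
            for child in children(obj):
                if child not in have and child not in next_frontier:
                    next_frontier.append(child)
        frontier = next_frontier
    return round_trips, have


trips, local = fetch_by_level(SERVER, {}, ["c1"])
print(trips, len(local))  # 5 round trips to fetch 7 objects
```

Even in this tiny example the number of requests grows with the depth of
the tree, not with anything the client could have known up front.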

You can progressively ask for each level, but:

  1. You are spending a round-trip for each request. Doing it per-object
     is awful (the dumb http walker will do this if the repo is not
     packed, and it's S-L-O-W). Doing it per-level would be better, but
     not great.

  2. You are losing opportunities for deltas (or you are making the
     state the server needs to maintain very complicated, as it must
     remember from request to request which objects you have gotten that
     can be used as delta bases).

  3. There is a lot of overhead in this protocol. The client has to
     mention each object individually by sha1. It may not seem like a
     lot, but it can easily add 10% to a clone (just look at the size of
     the pack .idx files versus the packfiles themselves).
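
If you want to check that last ratio on a repository of your own, a small
sketch like the one below would do. The idx_to_pack_ratio() helper is a
made-up name, and it assumes the usual layout where packs live in
.git/objects/pack. Note that the .idx also stores a CRC and an offset for
each object next to its 20-byte sha1, so it somewhat overstates the cost
of the sha1s alone; treat the result as a rough proxy, not a measurement
of the protocol.

```python
import glob
import os


def idx_to_pack_ratio(pack_dir):
    """Return total bytes of *.idx files divided by total bytes of *.pack files."""
    idx_bytes = sum(
        os.path.getsize(p) for p in glob.glob(os.path.join(pack_dir, "*.idx"))
    )
    pack_bytes = sum(
        os.path.getsize(p) for p in glob.glob(os.path.join(pack_dir, "*.pack"))
    )
    if pack_bytes == 0:
        raise ValueError("no packfiles found in %r" % pack_dir)
    return idx_bytes / pack_bytes
```

For example, calling idx_to_pack_ratio(".git/objects/pack") from the top
of a fully packed repository gives the fraction to compare against the
10% figure.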
-Peff