From: "Duy Nguyen" <pclo...@gmail.com>
Sent: Wednesday, November 27, 2013 11:50 PM
On Thu, Nov 28, 2013 at 5:52 AM, Philip Oakley <philipoak...@iee.org> wrote:
> In the pack transfer protocol documentation, the negotiation for refs
> is discussed, but it's unclear to me whether the negotiation
> explicitly navigates down into the trees and blobs of the commits
> that need to go into the pack.
>
> From one perspective I can see that, in the main, it's only commits
> that are being negotiated, and the DAG is used to imply which commits
> are to be sent between the wants and haves end points, without
> needing to descend into their trees and blobs. The tags and the
> objects they point to are explicitly given, so they are negotiated
> easily.
>
> The other view is that the negotiation should list every object type
> between the wants and haves as part of the negotiation. I just can't
> tell from the docs which assumption is appropriate. Is there any
> clarification on this?
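You can watch the negotiation on the wire to see this for yourself; assuming a git new enough to have GIT_TRACE_PACKET (1.8.1 or later), the throwaway demo below shows that only commit ids travel in the want/have lines:

```shell
# Throwaway demo: a local "server" repo, and a clone one commit behind.
server=$(mktemp -d)
git init -q "$server"
git -C "$server" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m one
client=$(mktemp -d)/clone
git clone -q "$server" "$client"
git -C "$server" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m two

# Only commit ids appear in the want/have lines of the negotiation.
GIT_TRACE_PACKET=1 git -C "$client" fetch origin 2>&1 | grep -E 'want |have '
```

No tree or blob id ever shows up in the exchange; the pack that follows carries those implicitly.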
All other object negotiation is inferred from commits, because sending
a full listing would be too much. If you say you have commit A, you
imply you have everything reachable from commit A down to the bottom.
With this knowledge, when you want commit B, the sender only needs to
send the trees and blobs that do not exist in commit A or any of its
ancestors.
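That computation is visible with plain rev-list, which is also what pack-objects runs underneath. In a throwaway demo repo (all names below are just for illustration), the objects B adds over A come out as exactly one commit, one tree and one blob:

```shell
# Throwaway demo: commit B changes one file on top of commit A.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
echo one > file && git add file && git commit -qm A
echo two > file && git add file && git commit -qm B

# Everything reachable from B but not from A: commit B, its new root
# tree, and the new blob -- three objects, nothing from A re-sent.
git rev-list --objects HEAD --not HEAD^
```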
> I presume that in the case of a tag that points to a tree, the pack
> will contain not only the specific tree that was tagged, but also the
> sub-trees and blobs that it references.
>
> Plus I'm wondering if the commit(s) that contain that tagged tree are
> also sent (given that such a tree could be repeated/reused in other
> commits, this may be expensive, and hence not done).
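The object walk from a tagged tree does recurse into its sub-trees and blobs, and since a tree cannot point back at a commit, no containing commit comes along. A quick check in a throwaway demo repo:

```shell
# Throwaway demo: a repo with a subdirectory, and a tag placed directly
# on the root tree rather than on the commit.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
mkdir dir && echo data > dir/file && git add dir && git commit -qm A
git tag treetag "HEAD^{tree}"

# The walk yields the root tree, the sub-tree and the blob, but the
# commit containing that tree is not reachable from it.
git rev-list --objects treetag
```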
To cut cost at the sender, we do something less than optimal (check
out the edge concept in the documentation, or else in pack-objects.c).
Pack bitmaps are supposed to provide cheap object traversal and make
the transferred pack even smaller.
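The hand-off from rev-list to pack-objects can be sketched with plumbing; --thin is the knob that lets the sender delta against "edge" objects the receiver already has (the repo and file names below are made up for the demo):

```shell
# Throwaway demo: two commits; pretend the receiver already has A.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
echo one > file && git add file && git commit -qm A
echo two > file && git add file && git commit -qm B

# Feed pack-objects the same rev range a fetch would compute; --thin
# allows deltas against edge objects the other side already has.
printf '%s\n^%s\n' "$(git rev-parse HEAD)" "$(git rev-parse HEAD^)" |
    git pack-objects --revs --thin --stdout > b-over-a.pack
```

A thin pack is not valid on disk by itself; the receiving side completes it with `git index-pack --fix-thin`.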
> I ask as I was cogitating on options for a 'narrow' clone (to
> complement shallow clones ;-) that could, say, in some way limit the
> size of objects downloaded, or the number of tree levels downloaded,
> or even limit by path.
Size limiting is easy because you don't need to traverse the object
DAG at all. Inside pack-objects it calls rev-list to collect the
objects to be sent; you just filter by size at that phase. Support for
raising or lowering the size limit is also workable, just like how
shallow deepen/shorten is done: you let the sender know you have size
limit A, now you want to raise it to B, and the sender just collects
the extra objects in the A..B range for all "have" refs.
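A sketch of that filtering phase using today's plumbing (the 1 MiB limit and the file names are made up; the real change would live inside pack-objects itself):

```shell
# Throwaway demo: one small blob and one 2 MiB blob.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
echo small > small.txt
head -c 2097152 /dev/zero > big.bin
git add . && git commit -qm A

# Collect objects as pack-objects would, then drop blobs over 1 MiB.
git rev-list --objects HEAD | cut -d' ' -f1 |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname)' |
    awk -v max=1048576 '!($1 == "blob" && $2 > max) { print $3 }'
```

Commits and trees always pass the filter; only oversized blobs fall out of the listing.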
The problem is how to let the client know which objects were not sent
due to the size limit, so it could set up refs/replace to stop the
user from running into missing objects. If there are too many excluded
objects, sending all those SHA-1s with pkt-line is inefficient. (A
path limit does not have this problem; the exclusions can be inferred
from the command line arguments most of the time.) Maybe you could
send this listing in binary format just before sending the pack.
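A minimal sketch of the refs/replace side on the client, assuming the server has somehow told us which ids were withheld (the "omitted" id below is made up, and raw update-ref is used because the real `git replace` command wants both objects present locally, which is exactly what we cannot satisfy here):

```shell
# Throwaway demo repo with one blob we can use as a placeholder.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
echo placeholder > note && git add note && git commit -qm A

# Pretend the server told us this blob id was withheld (made up here).
omitted=0123456789012345678901234567890123456789
placeholder=$(git rev-parse HEAD:note)

# Point refs/replace/<omitted> at a blob we do have, so traversals that
# honour replace refs see the placeholder instead of a missing object.
git update-ref refs/replace/$omitted $placeholder
git for-each-ref refs/replace
```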
> This was part of the problem I was thinking about ;-)
BTW another way to deal with large blobs in a clone is git-annex. I
was wondering the other day whether we could sort of integrate it into
git to provide a smooth UI (the user does not have to type "git annex
something", or at least not often). Of course git-annex would still be
optional, and the UI integration would only be activated via a config
key, after git-annex is installed.
> I'll have a look.