From: "Duy Nguyen" <pclo...@gmail.com>
Sent: Wednesday, November 27, 2013 11:50 PM
On Thu, Nov 28, 2013 at 5:52 AM, Philip Oakley <philipoak...@iee.org> wrote:
> In the pack transfer protocol documentation, the negotiation for refs
> is discussed, but it's unclear to me whether the negotiation
> explicitly navigates down into the trees and blobs of the commits
> that need to go into the pack.
>
> From one perspective I can see that, in the main, it's only commits
> that are being negotiated, and the DAG is used to imply which commits
> are to be sent between the wants and haves end points, without
> needing to descend into their trees and blobs. The tags and the
> objects they point to are explicitly given, so they are negotiated
> easily.
>
> The other view is that the negotiation should list every object type
> between the wants and haves as part of the negotiation. I just can't
> tell from the docs which assumption is appropriate. Is there any
> clarification on this?
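You can watch the negotiation on the wire to see this for yourself; assuming a git new enough to have GIT_TRACE_PACKET (1.8.1 or later), the throwaway demo below shows that only commit ids travel in the want/have lines:

```shell
# Throwaway demo: a local "server" repo, and a clone one commit behind.
server=$(mktemp -d)
git init -q "$server"
git -C "$server" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m one
client=$(mktemp -d)/clone
git clone -q "$server" "$client"
git -C "$server" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m two

# Only commit ids appear in the want/have lines of the negotiation.
GIT_TRACE_PACKET=1 git -C "$client" fetch origin 2>&1 | grep -E 'want |have '
```

No tree or blob id ever shows up in the exchange; the pack that follows carries those implicitly.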
All other object negotiation is inferred from commits, because sending
a full listing would be too much. If you say you have commit A, you
imply you have everything reachable from commit A down to the bottom.
With this knowledge, when you want commit B, the sender only needs to
send the trees and blobs that do not exist in commit A or any of its
ancestors.
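That computation is visible with plain rev-list, which is also what pack-objects runs underneath. In a throwaway demo repo (all names below are just for illustration), the objects B adds over A come out as exactly one commit, one tree and one blob:

```shell
# Throwaway demo: commit B changes one file on top of commit A.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
echo one > file && git add file && git commit -qm A
echo two > file && git add file && git commit -qm B

# Everything reachable from B but not from A: commit B, its new root
# tree, and the new blob -- three objects, nothing from A re-sent.
git rev-list --objects HEAD --not HEAD^
```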
> I presume that in the case of a tag that points to a tree, the pack
> will contain not only the specific tree that was tagged, but also the
> sub-trees and blobs that it references.
>
> Plus I'm wondering if the commit(s) that contain that tagged tree are
> also sent (given that such a tree could be repeated/reused in other
> commits, this may be expensive, and hence not done).
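The object walk from a tagged tree does recurse into its sub-trees and blobs, and since a tree cannot point back at a commit, no containing commit comes along. A quick check in a throwaway demo repo:

```shell
# Throwaway demo: a repo with a subdirectory, and a tag placed directly
# on the root tree rather than on the commit.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
mkdir dir && echo data > dir/file && git add dir && git commit -qm A
git tag treetag "HEAD^{tree}"

# The walk yields the root tree, the sub-tree and the blob, but the
# commit containing that tree is not reachable from it.
git rev-list --objects treetag
```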
To cut cost at the sender, we do something less than optimal (check
out the edge concept in the documentation, or else in pack-objects.c).
Pack bitmaps are supposed to provide cheap object traversal and make
the transferred pack even smaller.
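The hand-off from rev-list to pack-objects can be sketched with plumbing; --thin is the knob that lets the sender delta against "edge" objects the receiver already has (the repo and file names below are made up for the demo):

```shell
# Throwaway demo: two commits; pretend the receiver already has A.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
echo one > file && git add file && git commit -qm A
echo two > file && git add file && git commit -qm B

# Feed pack-objects the same rev range a fetch would compute; --thin
# allows deltas against edge objects the other side already has.
printf '%s\n^%s\n' "$(git rev-parse HEAD)" "$(git rev-parse HEAD^)" |
    git pack-objects --revs --thin --stdout > b-over-a.pack
```

A thin pack is not valid on disk by itself; the receiving side completes it with `git index-pack --fix-thin`.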
> I ask as I was cogitating on options for a 'narrow' clone (to
> complement shallow clones ;-) that could, say, in some way limit the
> size of objects downloaded, or the number of tree levels downloaded,
> or even limit by path.
Size limiting is easy because you don't need to traverse the object
DAG at all. Inside pack-objects it calls rev-list to collect the
objects to be sent; you just filter by size at that phase. Support for
raising or lowering the size limit is also workable, just like how
shallow deepen/shorten is done: you let the sender know you have size
limit A, now you want to raise it to B, and the sender just collects
the extra objects in the A..B range for all "have" refs.
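A sketch of that filtering phase using today's plumbing (the 1 MiB limit and the file names are made up; the real change would live inside pack-objects itself):

```shell
# Throwaway demo: one small blob and one 2 MiB blob.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
echo small > small.txt
head -c 2097152 /dev/zero > big.bin
git add . && git commit -qm A

# Collect objects as pack-objects would, then drop blobs over 1 MiB.
git rev-list --objects HEAD | cut -d' ' -f1 |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname)' |
    awk -v max=1048576 '!($1 == "blob" && $2 > max) { print $3 }'
```

Commits and trees always pass the filter; only oversized blobs fall out of the listing.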
The problem is how to let the client know which objects were not sent
due to the size limit, so it could set up refs/replace to stop the
user from running into missing objects. If there are too many excluded
objects, sending all those SHA-1s with pkt-line is inefficient. (A
path limit does not have this problem; the exclusions can be inferred
from the command line arguments most of the time.) Maybe you could
send this listing in binary format just before sending the pack.
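A minimal sketch of the refs/replace side on the client, assuming the server has somehow told us which ids were withheld (the "omitted" id below is made up, and raw update-ref is used because the real `git replace` command wants both objects present locally, which is exactly what we cannot satisfy here):

```shell
# Throwaway demo repo with one blob we can use as a placeholder.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
echo placeholder > note && git add note && git commit -qm A

# Pretend the server told us this blob id was withheld (made up here).
omitted=0123456789012345678901234567890123456789
placeholder=$(git rev-parse HEAD:note)

# Point refs/replace/<omitted> at a blob we do have, so traversals that
# honour replace refs see the placeholder instead of a missing object.
git update-ref refs/replace/$omitted $placeholder
git for-each-ref refs/replace
```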
> This was part of the problem I was thinking about ;-)
BTW another way to deal with large blobs in a clone is git-annex. I
was wondering the other day whether we could sort of integrate it into
git to provide a smooth UI (the user does not have to type "git annex
something", or at least not often). Of course git-annex would still be
optional, and the UI integration would only be activated via a config
key, after git-annex is installed.
> I'll have a look.