On 03/14/2013 02:40 PM, Jeff King wrote: > On Thu, Mar 14, 2013 at 12:28:53PM +0100, Michael Haggerty wrote: > >> Perhaps if peel_ref() *were* 100% reliable, we might be able to use it >> to avoid object lookups in some other places. > > In theory, some of the many uses of deref_tag could be adopted. However, > we do not always have the refname handy at that point (and I believe > peel_ref's optimization only kicks in during for_each_ref traversals > anyway). > > It may still be a win to check the packed-refs file before peeling a > random sha1, as looking up there should be cheaper than actually loading > the object. But right now, the way the optimization is used is always > O(1) to just check the last ref loaded. With your recent ref > refactoring, I think we should be able to do lookups in O(lg n).
There are more conceptual problems in the handling of peeled references and current_ref. I want to document them here for future reference and to get other people thinking about whether other parts of the code might be making wrong assumptions about this stuff. What is stored in ref_value.peeled? Is it the peeled version of ref_value.sha1, or is it the peeled version of the associated refname? Because they are not necessarily the same thing: an entry in the packed ref_cache *might* be overridden by a loose reference with the same refname but a different SHA1 which peels to a different value. Obviously we cannot always guarantee that ref_value.peeled is the peeled version of the refname, because it is read from a packed-refs file that might be old. So the only sane policy is that ref_value.peeled should be the peeled version of ref_value.sha1 regardless of whether the refname now points elsewhere. It follows that the only time we can be sure that ref_value.peeled is the correct peeled version of refname is if we know that there is no loose reference overriding it. This leads to some subtleties when reading and writing ref_value.peeled. When reading ref_value.peeled: When we iterate over references using do_for_each_ref(), then we only present a packed ref if there is no loose ref with the same refname. So in that case it is OK to point current_ref at the packed ref entry, and if somebody calls peel_ref() for the current reference then (modulo race conditions?) current_ref->u.value.peeled is indeed the correct peeled value for that refname. But if we iterate over *only* the packed refs using do_for_each_ref_in_dir() (as, for example, in repack_without_ref()), then the iteration doesn't even look at the loose refs, so we don't know whether the packed ref_entry being iterated over is authoritative. But we nevertheless set current_ref to that entry. So if somebody would call peel_ref() on the current refname during such an iteration, they could get the peeled version of the packed ref rather than the correct peeled version for the refname. Currently, nobody calls peel_ref() during such an iteration, so I don't think it causes any bugs. But it is an unmarked landmine. When setting ref_value.peeled: When determining the value to write to ref_value.peeled (which I believe only happens in pack-refs), we should record the peeled version of ref_entry.sha1, and *not* the peeled version of refname. This is indeed what pack-refs does. But if somebody were to get the clever and reasonable-sounding idea of using peel_ref() in the implementation of pack-refs.c:handle_one_ref(), then it would be easy to accidentally get a peeled value for the current refname rather than for the SHA1 being recorded in the file. I am trying to change repack_without_refs() to write the peeled refs to the packed-refs file (currently the peeled refs are lost if a packed ref is deleted). For this I would like to reuse the pre-peeled values from the old version of the packed-refs file to avoid having to resolve them all again. So all of the issues mentioned above are coming together for me. I think my solution will be to add a static function in refs.c that passes pointers to the ref_entries to its callback, so that the function has direct access to the peeled member rather than having to access it via the information-losing peel_ref(). Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html