[Adding Jeff King to CC; I meant to copy you in the original but forgot, sorry]
On 2013-01-23, at 14:19, Junio C Hamano <gits...@pobox.com> wrote:
> Jonathon Mah <j...@me.com> writes:
>> Add a new function "free_object_buffer", which marks the object as
>> un-parsed and frees the buffer. Only trees and commits have buffers;
>> other types are not affected. If the tree or commit buffer is already
>> NULL, the "parsed" flag is still cleared so callers can control the free
>> themselves (index-pack.c uses this).
>> Several areas of code would free buffers for object structs that
>> contained them ("struct tree" and "struct commit"), but without clearing
>> the "parsed" flag. parse_object would clear the flag for "struct tree",
>> but commits would remain in an invalid state (marked as parsed but with
>> a NULL buffer). Because the objects are uniqued (ccdc6037fee), the
>> invalid objects stay around and can lead to bad behavior.
>> In particular, running pickaxe on all refs in a repository with a cached
>> textconv could segfault. If the textconv cache ref was evaluated first
>> by cmd_log_walk, a subsequent notes_cache_match_validity call would
>> dereference NULL.
> Conceptually this is a right thing to do, but it is unclear why this
> conversion is safe in the existing code.
> A codepath that used to free() and assign NULL to a buffer without
> resetting .parsed would have assumed that it can find out the parsed
> properties of the object (e.g. .parents) without re-parsing the
> object, and much more importantly, the modifications made by that
> codepath will not be clobbered by later call to parse_object().
> IIRC, revision traversal machinery rewrites commit->parents but
> discards buffer when it knows that the log message is not needed
> (save_commit_buffer controls this behaviour). I do not offhand know
> if there are other instances of existing code that depend on the
> current behaviour, but have you audited all the codepaths that are
> affected by this patch and codepaths that work on objects this patch
> unmarks their .parsed field will not have such a problem?
No, I haven't audited the code paths (I have only the loosest familiarity with
the source). Indeed, I found that clearing the 'parsed' flag in fsck.c
(traverse_one_object()) is incorrect and causes test failures.
With the object cache, isn't modifying the object unsafe in general? Instead of
auditing code paths, it's now necessary to audit _all_ code that uses "struct
object", which seems infeasible.
Anyway, I don't care about the implementation (Junio does that extremely well),
I'm just trying to fix the segfault demonstrated in the test attached to the
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html