On 05/20/2013 06:44 PM, Jeff King wrote:
> On Mon, May 20, 2013 at 04:42:38PM +0200, Michael Haggerty wrote:
> 
>>>> * Many callers store the empty string ("") as the name; for example,
>>>>   most of the entries created during a run of rev-list have "" as
>>>>   their name.  This means that lots of needless copies of "" are being
>>>>   made.  I think that the best solution to this problem would be to
>>>>   store NULL rather than "" for such entries, but I haven't figured
>>>>   out all of the places where the name is used.
>>>
>>> Use strbufs?
>>>
>>> No allocation (except for the strbuf object itself) is needed for
>>> empty strings, and string ownership and be transferred to and from it
>>> to prevent extra copies.
>>
>> That would cost two extra size_t per object_array_entry.  I have the
>> feeling that this structure is used often enough that the extra overhead
>> would be a disadvantage, but I'm not sure.
>>
>> The obvious alternative would be to teach users to deal with NULL and
>> either add another constructor alternative that transfers string
>> ownership or *always* transfer string ownership and change the callers
>> to call xstrdup() if they don't already own the name string.  I think I
>> will try that approach first.
> 
> You could use the same trick that strbuf does: instead of NULL, point to
> a well-known empty string literal. Readers do not have to care about
> this optimization at all; only writers need to recognize the well-known
> pointer value. And since we do not update in place but only eventually
> free, it really is just that anyone calling free() would do "if (name !=
> well_known_empty_string)".

Yes, that sounds like the best solution.  Ultimately there is only one
writer, add_object_array_with_mode(), and it can do

    if (!name)
        entry->name = NULL;
    else if (!*name)
        entry->name = well_known_empty_string;
    else
        entry->name = xstrdup(name);

This should be a lot less intrusive than what I was trying: to change
callers who wrote name="" to write name=NULL instead.  But it is a
nightmare to find all of the code that reads name and decide whether
they need to do

    entry->name ? entry->name : ""

because that in turn depends on whether the code that wrote into the
same object_array always/never/sometimes wrote strings vs. NULL to the
name field.  Blech, encapsulation is tough in C.

While I was chasing down callers, I came across other gems like

builtin/checkout.c:
        add_pending_object(&revs, object, sha1_to_hex(object->sha1));

revision.c:
        add_pending_object(revs, object, sha1_to_hex(object->sha1));

submodule.c:
        add_pending_object(rev, &list->item->object,
                sha1_to_hex(list->item->object.sha1));

so apparently I wasn't the only one befuddled by the lifetime and
ownership of the name field of object_array_entry.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to