On 05/20/2013 06:44 PM, Jeff King wrote:
> On Mon, May 20, 2013 at 04:42:38PM +0200, Michael Haggerty wrote:
>
>>>> * Many callers store the empty string ("") as the name; for example,
>>>> most of the entries created during a run of rev-list have "" as
>>>> their name. This means that lots of needless copies of "" are being
>>>> made. I think that the best solution to this problem would be to
>>>> store NULL rather than "" for such entries, but I haven't figured
>>>> out all of the places where the name is used.
>>>
>>> Use strbufs?
>>>
>>> No allocation (except for the strbuf object itself) is needed for
>>> empty strings, and string ownership and be transferred to and from it
>>> to prevent extra copies.
>>
>> That would cost two extra size_t per object_array_entry. I have the
>> feeling that this structure is used often enough that the extra overhead
>> would be a disadvantage, but I'm not sure.
>>
>> The obvious alternative would be to teach users to deal with NULL and
>> either add another constructor alternative that transfers string
>> ownership or *always* transfer string ownership and change the callers
>> to call xstrdup() if they don't already own the name string. I think I
>> will try that approach first.
>
> You could use the same trick that strbuf does: instead of NULL, point to
> a well-known empty string literal. Readers do not have to care about
> this optimization at all; only writers need to recognize the well-known
> pointer value. And since we do not update in place but only eventually
> free, it really is just that anyone calling free() would do "if (name !=
> well_known_empty_string)".
Yes, that sounds like the best solution. Ultimately there is only one
writer, add_object_array_with_mode(), and it can do
if (!name)
entry->name = NULL;
else if (!*name)
entry->name = well_known_empty_string;
else
entry->name = xstrdup(name);
This should be a lot less intrusive than what I was trying: to change
callers who wrote name="" to write name=NULL instead. But it is a
nightmare to find all of the code that reads name and decide whether
they need to do
entry->name ? entry->name : ""
because that in turn depends on whether the code that wrote into the
same object_array always/never/sometimes wrote strings vs. NULL to the
name field. Blech, encapsulation is tough in C.
While I was chasing down callers, I came across other gems like
builtin/checkout.c:
add_pending_object(&revs, object, sha1_to_hex(object->sha1));
revision.c:
add_pending_object(revs, object, sha1_to_hex(object->sha1));
submodule.c:
add_pending_object(rev, &list->item->object,
sha1_to_hex(list->item->object.sha1));
so apparently I wasn't the only one befuddled by the lifetime and
ownership of the name field of object_array_entry.
Michael
--
Michael Haggerty
[email protected]
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html