Re: [PATCH v2 13/13] alloc: allow arbitrary repositories for alloc functions

2018-05-08 Thread Stefan Beller
On Tue, May 8, 2018 at 8:00 AM, Duy Nguyen  wrote:
> On Tue, May 8, 2018 at 12:59 AM, Stefan Beller  wrote:
>> @@ -501,9 +509,31 @@ void raw_object_store_clear(struct raw_object_store *o)
>>  void parsed_object_pool_clear(struct parsed_object_pool *o)
>>  {
>> /*
>> -* TOOD free objects in o->obj_hash.
>> -*
>>  * As objects are allocated in slabs (see alloc.c), we do
>>  * not need to free each object, but each slab instead.
>> +*
>> +* Before doing so, we need to free any additional memory
>> +* the objects may hold.
>>  */
>> +   unsigned i;
>> +
>> +   for (i = 0; i < o->obj_hash_size; i++) {
>> +   struct object *obj = o->obj_hash[i];
>> +
>> +   if (!obj)
>> +   continue;
>> +
>> +   if (obj->type == OBJ_TREE) {
>> +   free(((struct tree*)obj)->buffer);
>
> It would be nicer to keep this in separate functions, e.g.
> release_tree_node() and release_commit_node() to go with
> alloc_xxx_node().

ok, I can introduce that, although it seems unnecessary complicated
for now.

On top of this series I started an experiment (which rewrites alloc
and object.c a whole lot more; for performance reasons), which gets
rid of the multiple alloc_states. There will be only one allocation for
one repository, it can allocate across multiple types without alignment
overhead. It will reduce memory footprint of obj_hash by half, via
storing indexes instead of pointers in there.
That said, the experiment shall not influence the
direction of this series. Will fix.

>> +   } else if (obj->type == OBJ_COMMIT) {
>> +   free_commit_list(((struct commit*)obj)->parents);
>> +   free(&((struct commit*)obj)->util);
>> +   }
>> +   }
>
> I still don't see who frees obj_hash[] (or at least clears it if not
> freed). If I'm going to use this to free memory in pack-objects then
> I'd really prefer obj_hash[] freed because it's a big _big_ array.

gah!

> Just to be clear, what I mean is
>
> FREE_AND_NULL(o->obj_hash);
> o->obj_hash_size = 0;

ok, I just put it here, just before the calls
to clear_alloc_state()s.


Re: [PATCH v2 13/13] alloc: allow arbitrary repositories for alloc functions

2018-05-08 Thread Jonathan Tan
On Mon,  7 May 2018 15:59:16 -0700
Stefan Beller  wrote:

> + for (i = 0; i < o->obj_hash_size; i++) {
> + struct object *obj = o->obj_hash[i];
> +
> + if (!obj)
> + continue;
> +
> + if (obj->type == OBJ_TREE) {
> + free(((struct tree*)obj)->buffer);
> + } else if (obj->type == OBJ_COMMIT) {
> + free_commit_list(((struct commit*)obj)->parents);
> + free(&((struct commit*)obj)->util);
> + }

Besides the other comments by Peff and Duy, should the "tag" field of a
tag object be freed too? It is allocated by xmemdupz in tag.c, and is
not assigned to by any other code (verified by renaming it and then
fixing the compile errors one by one).

Other than that, and other than my small comment on patch 1, this patch
set looks good to me.


Re: [PATCH v2 13/13] alloc: allow arbitrary repositories for alloc functions

2018-05-08 Thread Duy Nguyen
On Tue, May 8, 2018 at 12:59 AM, Stefan Beller  wrote:
> @@ -501,9 +509,31 @@ void raw_object_store_clear(struct raw_object_store *o)
>  void parsed_object_pool_clear(struct parsed_object_pool *o)
>  {
> /*
> -* TOOD free objects in o->obj_hash.
> -*
>  * As objects are allocated in slabs (see alloc.c), we do
>  * not need to free each object, but each slab instead.
> +*
> +* Before doing so, we need to free any additional memory
> +* the objects may hold.
>  */
> +   unsigned i;
> +
> +   for (i = 0; i < o->obj_hash_size; i++) {
> +   struct object *obj = o->obj_hash[i];
> +
> +   if (!obj)
> +   continue;
> +
> +   if (obj->type == OBJ_TREE) {
> +   free(((struct tree*)obj)->buffer);

It would be nicer to keep this in separate functions, e.g.
release_tree_node() and release_commit_node() to go with
alloc_xxx_node().

> +   } else if (obj->type == OBJ_COMMIT) {
> +   free_commit_list(((struct commit*)obj)->parents);
> +   free(&((struct commit*)obj)->util);
> +   }
> +   }

I still don't see who frees obj_hash[] (or at least clears it if not
freed). If I'm going to use this to free memory in pack-objects then
I'd really prefer obj_hash[] freed because it's a big _big_ array.

Just to be clear, what I mean is

FREE_AND_NULL(o->obj_hash);
o->obj_hash_size = 0;

> +
> +   clear_alloc_state(o->blob_state);
> +   clear_alloc_state(o->tree_state);
> +   clear_alloc_state(o->commit_state);
> +   clear_alloc_state(o->tag_state);
> +   clear_alloc_state(o->object_state);
>  }
-- 
Duy


Re: [PATCH v2 13/13] alloc: allow arbitrary repositories for alloc functions

2018-05-08 Thread Jeff King
On Mon, May 07, 2018 at 03:59:16PM -0700, Stefan Beller wrote:

> @@ -501,9 +509,31 @@ void raw_object_store_clear(struct raw_object_store *o)
>  void parsed_object_pool_clear(struct parsed_object_pool *o)
> [...]
> + for (i = 0; i < o->obj_hash_size; i++) {
> + struct object *obj = o->obj_hash[i];
> +
> + if (!obj)
> + continue;
> +
> + if (obj->type == OBJ_TREE) {
> + free(((struct tree*)obj)->buffer);
> + } else if (obj->type == OBJ_COMMIT) {
> + free_commit_list(((struct commit*)obj)->parents);
> + free(&((struct commit*)obj)->util);
> + }
> + }

Coverity complains about this final free(). I think the "&" is doing an
incorrect extra level of indirection?

That said, I'm not sure if it is safe to blindly free the util field. We
don't necessarily know what downstream code has pointed it to. It may
not be allocated memory[1], or it may even be a more complicated data
structure that has sub-components that need freeing[2].

In the long run, it may be worth trying to get rid of this util field
completely, in favor of having callers use a commit_slab. That has
better memory-ownership semantics, and it would save 8 bytes in struct
commit.

[1] Grepping for "commit->util =", sequencer.c seems to assign pointers
into other arrays, as well as the "(void *)1".

[2] Most assignments seem to be flex-structs, but blame.c assigns a
linked list.

-Peff