Re: [PATCH 00/21] np/pack-v4 updates

2013-09-12 Thread Duy Nguyen
On Thu, Sep 12, 2013 at 11:20 PM, Nicolas Pitre  wrote:
> On Thu, 12 Sep 2013, Duy Nguyen wrote:
>
>> On Wed, Sep 11, 2013 at 11:25 PM, Nicolas Pitre  wrote:
>> > On Wed, 11 Sep 2013, Duy Nguyen wrote:
>> >
>> >> Nico, if you have time you may want to look into this. The result v4
>> >> pack from pack-objects on git.git for me is 35MB (one branch) while
>> >> packv4-create produces 30MB (v2 is 40MB). I don't know why there is
>> >> such a big difference in size. I compared. Ident dict is identical.
>> >> Tree dict is a bit different (some that have same hits are ordered
>> >> differently). Delta chains do not differ much. Many groups of entries
>> >> in the pack are displaced though. I guess I turned a wrong knob or
>> >> something in pack-objects in v4 code..
>> >
>> > Will try to have a closer look.
>>
>> Problem found. I encoded some trees as ref-delta instead of pv4-tree
>> :( Something like this brings the size back to packv4-create output
>>
>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>> index f604fa5..3d9ab0e 100644
>> --- a/builtin/pack-objects.c
>> +++ b/builtin/pack-objects.c
>> @@ -1490,7 +1490,8 @@ static void check_object(struct object_entry *entry)
>>   * deltify other objects against, in order to avoid
>>   * circular deltas.
>>   */
>> - entry->type = entry->in_pack_type;
>> + if (pack_version < 4)
>> + entry->type = entry->in_pack_type;
>>   entry->delta = base_entry;
>>   entry->delta_size = entry->size;
>>   entry->delta_sibling = base_entry->delta_child;
>
> Hmmm... I've folded this fix into your patch touching this area.

You may want to stricten the condition a bit, to "pack_version < 4 ||
entry->type != OBJ_TREE". I think always not doing it in v4 turns off
the reuse code path for blobs and tags.

> This code is becoming rather subtle and messy though.  We'll have to
> find a way to better abstract things.

Yep. Not sure how that should be done though. Maybe we can revisit it
when pack-objects learns to skip compatibility layer when reading v4
commits and trees..

> Especially since object data
> reuse will work only for blobs and tags with packv4.  Commits and trees
> will need adjustments to their indices.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/21] np/pack-v4 updates

2013-09-12 Thread Nicolas Pitre
On Thu, 12 Sep 2013, Duy Nguyen wrote:

> On Wed, Sep 11, 2013 at 11:25 PM, Nicolas Pitre  wrote:
> > On Wed, 11 Sep 2013, Duy Nguyen wrote:
> >
> >> Nico, if you have time you may want to look into this. The result v4
> >> pack from pack-objects on git.git for me is 35MB (one branch) while
> >> packv4-create produces 30MB (v2 is 40MB). I don't know why there is
> >> such a big difference in size. I compared. Ident dict is identical.
> >> Tree dict is a bit different (some that have same hits are ordered
> >> differently). Delta chains do not differ much. Many groups of entries
> >> in the pack are displaced though. I guess I turned a wrong knob or
> >> something in pack-objects in v4 code..
> >
> > Will try to have a closer look.
> 
> Problem found. I encoded some trees as ref-delta instead of pv4-tree
> :( Something like this brings the size back to packv4-create output
> 
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index f604fa5..3d9ab0e 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1490,7 +1490,8 @@ static void check_object(struct object_entry *entry)
>   * deltify other objects against, in order to avoid
>   * circular deltas.
>   */
> - entry->type = entry->in_pack_type;
> + if (pack_version < 4)
> + entry->type = entry->in_pack_type;
>   entry->delta = base_entry;
>   entry->delta_size = entry->size;
>   entry->delta_sibling = base_entry->delta_child;

Hmmm... I've folded this fix into your patch touching this area.

This code is becoming rather subtle and messy though.  We'll have to 
find a way to better abstract things.  Especially since object data 
reuse will work only for blobs and tags with packv4.  Commits and trees 
will need adjustments to their indices.


Nicolas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/21] np/pack-v4 updates

2013-09-11 Thread Duy Nguyen
On Wed, Sep 11, 2013 at 11:25 PM, Nicolas Pitre  wrote:
> On Wed, 11 Sep 2013, Duy Nguyen wrote:
>
>> Nico, if you have time you may want to look into this. The result v4
>> pack from pack-objects on git.git for me is 35MB (one branch) while
>> packv4-create produces 30MB (v2 is 40MB). I don't know why there is
>> such a big difference in size. I compared. Ident dict is identical.
>> Tree dict is a bit different (some that have same hits are ordered
>> differently). Delta chains do not differ much. Many groups of entries
>> in the pack are displaced though. I guess I turned a wrong knob or
>> something in pack-objects in v4 code..
>
> Will try to have a closer look.

Problem found. I encoded some trees as ref-delta instead of pv4-tree
:( Something like this brings the size back to packv4-create output

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f604fa5..3d9ab0e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1490,7 +1490,8 @@ static void check_object(struct object_entry *entry)
  * deltify other objects against, in order to avoid
  * circular deltas.
  */
- entry->type = entry->in_pack_type;
+ if (pack_version < 4)
+ entry->type = entry->in_pack_type;
  entry->delta = base_entry;
  entry->delta_size = entry->size;
  entry->delta_sibling = base_entry->delta_child;
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/21] np/pack-v4 updates

2013-09-11 Thread Nicolas Pitre
On Wed, 11 Sep 2013, Nguyễn Thái Ngọc Duy wrote:

> This contains fixups for some of my patches, some of Nico's, adds v4
> support to unpack-objects because the test suite needs it. With these,
> when force generating pack v4 unconditionally, the remaining failed
> tests are:
[...]

@junio: I've folded those patches into my branch, along with the better 
fix for the compilation issue you found.  So you may simply replace the 
branch you have in  pu with mine directly if you wish.

git://git.linaro.org/people/nico/git


Nicolas


Re: [PATCH 00/21] np/pack-v4 updates

2013-09-11 Thread Nicolas Pitre
On Wed, 11 Sep 2013, Duy Nguyen wrote:

> Nico, if you have time you may want to look into this. The result v4
> pack from pack-objects on git.git for me is 35MB (one branch) while
> packv4-create produces 30MB (v2 is 40MB). I don't know why there is
> such a big difference in size. I compared. Ident dict is identical.
> Tree dict is a bit different (some that have same hits are ordered
> differently). Delta chains do not differ much. Many groups of entries
> in the pack are displaced though. I guess I turned a wrong knob or
> something in pack-objects in v4 code..

Will try to have a closer look.

Thanks for your dedication so far.


Nicolas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/21] np/pack-v4 updates

2013-09-11 Thread Duy Nguyen
Nico, if you have time you may want to look into this. The result v4
pack from pack-objects on git.git for me is 35MB (one branch) while
packv4-create produces 30MB (v2 is 40MB). I don't know why there is
such a big difference in size. I compared. Ident dict is identical.
Tree dict is a bit different (some that have same hits are ordered
differently). Delta chains do not differ much. Many groups of entries
in the pack are displaced though. I guess I turned a wrong knob or
something in pack-objects in v4 code..
--
DUy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html