Re: pack v4 trees with a canonical base

2013-09-10 Thread Nicolas Pitre
On Tue, 10 Sep 2013, Junio C Hamano wrote:

> Nicolas Pitre  writes:
> 
> > On Tue, 10 Sep 2013, Junio C Hamano wrote:
> >
> >> There may be trees in the wild that record 100775 or 100777 in the
> >> mode field for executable blobs, which also need to be special
> >> cased.
> >
> > All the file mode bits are always preserved.  So this is not really a 
> > special case as far as the pack v4 encoding is concerned.
> 
> Ahh. OK.  It can theoretically be argued that you could further
> squeeze 13 bits out per tree entry because you would need only 5
> possible values (100644, 100755, 120000, 40000, and 160000, all
> octal) for the modes, but we will never know what other modes we
> would want to use in the future, so not being over-tight and using
> 16-bit for this purpose is probably a good trade-off.

Absolutely.  I tried not to lose any of the currently available 
extension possibilities in the canonical object format.

> (squeezing 8 bits out per tree entry would make the shape of ident 
> table entry and tree path entry different and may hurt reusing the 
> code to parse these tables).

One could argue that 16 bits is much more than sufficient to encode a 
time zone offset too.  But again, this didn't seem worth painting 
ourselves into a corner if some creative time zones are ever used.

Those tables are also compressed.  So any repetition of the same mode bit 
pattern or sparseness in the tz bits is likely to compress well.
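
As a rough illustration of that last point, here is a minimal sketch in
Python.  It assumes zlib-style deflate and big-endian 16-bit fields, and
the table contents (900 entries of 100644 plus 100 of 100755) are made up
for the example; none of this is taken from the actual pack v4 code.

```python
import struct
import zlib

# Hypothetical table: 1000 mode fields, mostly 100644 with some 100755.
modes = [0o100644] * 900 + [0o100755] * 100

# Store each mode in a full 16-bit field (big-endian is an assumption).
raw = b"".join(struct.pack(">H", mode) for mode in modes)

# Deflate the whole table; the repeated bit patterns collapse to very little.
packed = zlib.compress(raw)

print(len(raw), "bytes raw,", len(packed), "bytes deflated")
```

The unused high-order possibilities cost little once the table is
deflated, which is the reason for not being over-tight with the field.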


Nicolas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack v4 trees with a canonical base

2013-09-10 Thread Junio C Hamano
Nicolas Pitre  writes:

> On Tue, 10 Sep 2013, Junio C Hamano wrote:
>
>> Duy Nguyen  writes:
>> 
>> > On Tue, Sep 10, 2013 at 2:25 AM, Nicolas Pitre  wrote:
>> >> An eventual optimization to index-pack when completing a pack would be
>> >> to attempt the encoding of appended tree objects into the packv4 format
>> >> using the existing dictionary table in the pack, and fall back to the
>> >> canonical format if that table doesn't have all the necessary elements.
>> >
>> > Yeah, it's on the improvement todo list. The way pack-objects creates
>> > dictionaries right now, the tree dict should contain all elements the
>> > base trees need so fall back is only necessary when trees have
>> > extra zeros in mode field.
>> 
>> Careful.
>> 
>> There may be trees in the wild that record 100775 or 100777 in the
>> mode field for executable blobs, which also need to be special
>> cased.
>
> All the file mode bits are always preserved.  So this is not really a 
> special case as far as the pack v4 encoding is concerned.

Ahh. OK.  It can theoretically be argued that you could further
squeeze 13 bits out per tree entry because you would need only 5
possible values (100644, 100755, 120000, 40000, and 160000, all
octal) for the modes, but we will never know what other modes we
would want to use in the future, so not being over-tight and using
16-bit for this purpose is probably a good trade-off (squeezing 8
bits out per tree entry would make the shape of ident table entry
and tree path entry different and may hurt reusing the code to parse
these tables).
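
The arithmetic behind the "13 bits" figure can be checked with a short
sketch in Python.  The names KNOWN_MODES, FIELD_BITS and bits_needed are
only for illustration, and the list holds the five entry modes Git writes
today.

```python
# The five tree entry modes Git writes today (octal).
KNOWN_MODES = [0o100644, 0o100755, 0o120000, 0o40000, 0o160000]

# Width of the mode field under discussion.
FIELD_BITS = 16


def bits_needed(n_values):
    """Smallest number of bits that can distinguish n_values cases."""
    return max(1, (n_values - 1).bit_length())


index_bits = bits_needed(len(KNOWN_MODES))   # bits for an index into 5 values
saved_bits = FIELD_BITS - index_bits         # what a tighter encoding would save

# Every known mode, and odd ones such as 100775, still fit in 16 bits.
all_fit = all(mode < (1 << FIELD_BITS) for mode in KNOWN_MODES + [0o100775])

print(index_bits, saved_bits, all_fit)
```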







Re: pack v4 trees with a canonical base

2013-09-10 Thread Nicolas Pitre
On Tue, 10 Sep 2013, Junio C Hamano wrote:

> Duy Nguyen  writes:
> 
> > On Tue, Sep 10, 2013 at 2:25 AM, Nicolas Pitre  wrote:
> >> An eventual optimization to index-pack when completing a pack would be
> >> to attempt the encoding of appended tree objects into the packv4 format
> >> using the existing dictionary table in the pack, and fall back to the
> >> canonical format if that table doesn't have all the necessary elements.
> >
> > Yeah, it's on the improvement todo list. The way pack-objects creates
> > dictionaries right now, the tree dict should contain all elements the
> > base trees need so fall back is only necessary when trees have
> > extra zeros in mode field.
> 
> Careful.
> 
> There may be trees in the wild that record 100775 or 100777 in the
> mode field for executable blobs, which also need to be special
> cased.

All the file mode bits are always preserved.  So this is not really a 
special case as far as the pack v4 encoding is concerned.


Nicolas


Re: pack v4 trees with a canonical base

2013-09-10 Thread Junio C Hamano
Duy Nguyen  writes:

> On Tue, Sep 10, 2013 at 2:25 AM, Nicolas Pitre  wrote:
>> An eventual optimization to index-pack when completing a pack would be
>> to attempt the encoding of appended tree objects into the packv4 format
>> using the existing dictionary table in the pack, and fall back to the
>> canonical format if that table doesn't have all the necessary elements.
>
> Yeah, it's on the improvement todo list. The way pack-objects creates
> dictionaries right now, the tree dict should contain all elements the
> base trees need so fall back is only necessary when trees have
> extra zeros in mode field.

Careful.

There may be trees in the wild that record 100775 or 100777 in the
mode field for executable blobs, which also need to be special
cased.


Re: pack v4 trees with a canonical base

2013-09-09 Thread Duy Nguyen
On Tue, Sep 10, 2013 at 2:25 AM, Nicolas Pitre  wrote:
> An eventual optimization to index-pack when completing a pack would be
> to attempt the encoding of appended tree objects into the packv4 format
> using the existing dictionary table in the pack, and fall back to the
> canonical format if that table doesn't have all the necessary elements.

Yeah, it's on the improvement todo list. The way pack-objects creates
dictionaries right now, the tree dict should contain all elements the
base trees need so fall back is only necessary when trees have
extra zeros in mode field.
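
To make the fallback rule concrete, here is a minimal sketch in Python,
not the actual index-pack code.  The function names, the shape of
path_dict (a mapping from (mode, name) to a table index) and the returned
tuples are all hypothetical.  The only format fact relied on is the
canonical tree entry layout: octal mode, space, name, NUL, 20-byte SHA-1.

```python
def canonical_bytes(entries):
    """Serialize entries exactly as given, in canonical tree layout."""
    return b"".join(
        mode_str.encode() + b" " + name.encode() + b"\0" + sha1
        for mode_str, name, sha1 in entries
    )


def encode_tree(entries, path_dict):
    """Try the dictionary encoding; fall back to the canonical format.

    entries:   list of (mode_str, name, sha1) as read from the tree
    path_dict: hypothetical table mapping (mode, name) -> index
    """
    refs = []
    for mode_str, name, sha1 in entries:
        mode = int(mode_str, 8)
        # A mode written with extra leading zeros cannot be reproduced
        # from its numeric value, so the tree must stay canonical.
        if format(mode, "o") != mode_str:
            return ("canonical", canonical_bytes(entries))
        index = path_dict.get((mode, name))
        # The existing table lacks this element: fall back as well.
        if index is None:
            return ("canonical", canonical_bytes(entries))
        refs.append((index, sha1))
    return ("v4", refs)


sha = bytes(20)  # placeholder object name
table = {(0o100644, "README"): 0, (0o40000, "src"): 1}
print(encode_tree([("100644", "README", sha), ("40000", "src", sha)], table)[0])
print(encode_tree([("100644", "README", sha), ("040000", "src", sha)], table)[0])
```

In this sketch a mode such as 100775 needs no special handling as long as
the table holds an entry with that exact mode.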
-- 
Duy