Re: reftable: new ref storage format

Shawn Pearce Sun, 23 Jul 2017 16:04:32 -0700

On Sun, Jul 23, 2017 at 3:56 PM, Shawn Pearce <spea...@spearce.org> wrote:
> On Mon, Jul 17, 2017 at 6:43 PM, Michael Haggerty <mhag...@alum.mit.edu> wrote:
>> On Sun, Jul 16, 2017 at 12:43 PM, Shawn Pearce <spea...@spearce.org> wrote:
>>> On Sun, Jul 16, 2017 at 10:33 AM, Michael Haggerty <mhag...@alum.mit.edu> wrote:
>
>> * What would you think about being extravagant and making the
>> value_type a full byte? It would make the format a tiny bit easier to
>> work with, and would leave room for future enhancements (e.g.,
>> pseudorefs, peeled symrefs, support for the successors of SHA-1s)
>> without having to change the file format dramatically.
>
> I reran my 866k file with full byte value_type. It pushes up the
> average bytes per ref from 33 to 34, but the overall file size is
> still 28M (with 64 block size). I think it's reasonable to expand this
> to the full byte as you suggest.


FYI, I went back on this in the v3 draft I posted on Jul 22 in
https://public-inbox.org/git/CAJo=hJvxWg2J-yRiCK3szux=eym2thjt0kwo-sffooc1rkx...@mail.gmail.com/

I expanded value_type from 2 bits to 3 bits, but kept it as a bit
field in a varint. I just couldn't justify the additional byte per ref
in these large files. The prefix compression works well enough that
many refs are still able to use only a single byte for the
suffix_length << 3 | value_type varint, keeping the average at 33
bytes per ref.

The reftable format uses values 0-3, leaving 4-7 available. I reserved
4 for an arbitrary payload, such as MERGE_HEAD-style files.
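
To illustrate why a 3-bit value_type often still fits in one byte, here
is a minimal sketch of the packing described above. It assumes a plain
7-bits-per-byte varint with a continuation bit, which may not match the
exact byte layout reftable uses; the function names are hypothetical
and chosen only for this example.

```python
VALUE_TYPE_BITS = 3  # low bits of the varint hold value_type (0-7)


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int, 7 bits per byte, low bits first.

    Assumption: a generic continuation-bit varint, used here only to
    count bytes; the real on-disk encoding may differ.
    """
    if value < 0:
        raise ValueError("varint value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)  # final byte, continuation bit clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a varint produced by encode_varint."""
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value
        shift += 7
    raise ValueError("truncated varint")


def pack_ref_header(suffix_length: int, value_type: int) -> bytes:
    """Pack (suffix_length << 3) | value_type into a single varint."""
    if not 0 <= value_type < (1 << VALUE_TYPE_BITS):
        raise ValueError("value_type must fit in 3 bits")
    if suffix_length < 0:
        raise ValueError("suffix_length must be non-negative")
    return encode_varint((suffix_length << VALUE_TYPE_BITS) | value_type)


def unpack_ref_header(data: bytes) -> tuple[int, int]:
    """Return (suffix_length, value_type) from a packed header."""
    packed = decode_varint(data)
    return packed >> VALUE_TYPE_BITS, packed & ((1 << VALUE_TYPE_BITS) - 1)
```

Under this assumption, any suffix of up to 15 bytes packs into a single
byte together with its value_type, while a 16-byte suffix needs two. A
separate full-byte value_type would instead cost one extra byte on
every ref, whatever the suffix length.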
