On Sun, Jul 23, 2017 at 3:56 PM, Shawn Pearce <spea...@spearce.org> wrote:
> On Mon, Jul 17, 2017 at 6:43 PM, Michael Haggerty <mhag...@alum.mit.edu>
> wrote:
>> On Sun, Jul 16, 2017 at 12:43 PM, Shawn Pearce <spea...@spearce.org> wrote:
>>> On Sun, Jul 16, 2017 at 10:33 AM, Michael Haggerty <mhag...@alum.mit.edu>
>>> wrote:
>
>> * What would you think about being extravagant and making the
>> value_type a full byte? It would make the format a tiny bit easier to
>> work with, and would leave room for future enhancements (e.g.,
>> pseudorefs, peeled symrefs, support for the successors of SHA-1s)
>> without having to change the file format dramatically.
>
> I reran my 866k file with full byte value_type. It pushes up the
> average bytes per ref from 33 to 34, but the overall file size is
> still 28M (with 64 block size). I think its reasonable to expand this
> to the full byte as you suggest.

FYI, I went back on this in the v3 draft I posted on Jul 22 in
https://public-inbox.org/git/CAJo=hJvxWg2J-yRiCK3szux=eym2thjt0kwo-sffooc1rkx...@mail.gmail.com/

I expanded value_type from 2 bits to 3 bits, but kept it as a bit
field in a varint. I just couldn't justify the additional byte per ref
in these large files. The prefix compression works well enough that
many refs are still able to use only a single byte for the
suffix_length << 3 | value_type varint, keeping the average at 33
bytes per ref.

The reftable format uses values 0-3, leaving 4-7 available. I reserved
4 for an arbitrary payload like MERGE_HEAD type files.
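
To make the single-byte claim concrete, here is a rough Python sketch of
packing suffix_length << 3 | value_type into a varint. It is only an
illustration: the function names are hypothetical, and the varint here
is a plain LEB128-style encoding (7 payload bits per byte), which is an
assumption and may not match the byte layout the draft actually
specifies. The one-byte threshold depends only on having 7 payload bits
in the first byte, so with 3 bits taken by value_type any suffix_length
up to 15 still fits in a single byte.

```python
VALUE_TYPE_BITS = 3  # value_type occupies the low 3 bits of the varint


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a LEB128-style varint.

    Assumption: 7 payload bits per byte, high bit set when more bytes
    follow. The real reftable varint may order its bytes differently.
    """
    if value < 0:
        raise ValueError("varint value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def pack_suffix_and_type(suffix_length: int, value_type: int) -> bytes:
    """Pack suffix_length and value_type into one varint field."""
    if suffix_length < 0:
        raise ValueError("suffix_length must be non-negative")
    if not 0 <= value_type < (1 << VALUE_TYPE_BITS):
        raise ValueError("value_type must fit in 3 bits (0-7)")
    return encode_varint((suffix_length << VALUE_TYPE_BITS) | value_type)


def split_field(field: int) -> tuple[int, int]:
    """Split a decoded field back into (suffix_length, value_type)."""
    return field >> VALUE_TYPE_BITS, field & ((1 << VALUE_TYPE_BITS) - 1)


if __name__ == "__main__":
    for suffix_length in (1, 15, 16, 40):
        encoded = pack_suffix_and_type(suffix_length, 1)
        print(suffix_length, len(encoded), encoded.hex())
```

With 2 bits for value_type the single-byte limit would have been a
suffix_length of 31; with 3 bits it drops to 15, which prefix
compression still hits often enough to hold the average at 33 bytes
per ref.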