Compared with a fixed 4-byte int, variable-length encoding saves space for values under about 2^21 ~= 2M. It's a wash for values up to about 2^28 ~= 268M, and it costs an extra byte for larger values. I'm thinking of unsigned values here at the moment, and ignoring the CPU cost of encoding/decoding, which is tiny.
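To make those thresholds concrete, here is a rough sketch that counts the bytes a value would need. It assumes the usual 7-payload-bits-per-byte scheme (one continuation bit per byte) and a fixed 4-byte int as the baseline; the exact format under discussion isn't pinned down in this thread, and `varint_len` and `FIXED_LEN` are names made up for illustration.

```python
def varint_len(value: int) -> int:
    """Bytes needed for an unsigned int at 7 payload bits per byte (assumed scheme)."""
    if value < 0:
        raise ValueError("unsigned values only")
    n = 1
    while value >= 0x80:  # more than 7 bits remain, so another byte is needed
        value >>= 7
        n += 1
    return n


FIXED_LEN = 4  # assumed baseline: a fixed-width 32-bit int

# Values on either side of the 2^21 and 2^28 boundaries.
for v in (0, 127, 128, 2**21 - 1, 2**21, 2**28 - 1, 2**28, 2**32 - 1):
    size = varint_len(v)
    print(f"{v:>12}  varint={size}  delta vs fixed={size - FIXED_LEN:+d}")
```

Under those assumptions the delta is negative below 2^21, zero from 2^21 up to just under 2^28, and +1 from 2^28 onward.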
Yes, it's a loss for 15/16ths of the 32-bit key space (everything from 2^28 up). My big assumption is that in many cases that first 1/16th is heavily used. It's certainly true when values are counts, and true when they're product IDs. When they're hashes, nope. I actually don't know the situation here -- if it's a bad idea, we don't do it. But there are surely places where it's a good idea.

On Wed, May 25, 2011 at 10:14 PM, Jake Mannix wrote:
> If you have more than 32M IDs *total*, even if they are sequential starting
> at 0, "most" of them will take up the full 4 bytes, and only a tiny fraction
> will take up less than 3.
