On Wed, May 25, 2011 at 1:52 PM, Sean Owen <[email protected]> wrote:

> If keys are distributed across the keyspace then yes it is a net loss to try
> variable-length encoding. However it's my impression that keys aren't in
> many contexts. (I actually haven't thought about this one hard.)
>
> But for example in recommender-land where keys are product IDs, it's more
> common for there to be millions of keys ranging in value to, well, a few
> million, than spread across the key space.

If you have more than 32M IDs *total*, then even if they are sequential starting at 0, "most" of them will take up the full 4 bytes, and only a tiny fraction will take up fewer than 3.
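
To put rough numbers on that, here is a minimal sketch. It assumes a generic varint with 7 payload bits per byte (the scheme is my assumption, not necessarily the encoding under discussion; other formats, such as Hadoop's own VInt, have different thresholds). The names varint_len and length_histogram are made up for illustration.

```python
def varint_len(n):
    """Bytes needed to encode non-negative n at 7 payload bits per byte."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length


def length_histogram(total_ids):
    """Count how many IDs in range(total_ids) need each encoded length.

    Works on the length boundaries (2**7, 2**14, ...) directly instead of
    looping over every ID.
    """
    counts = {}
    lower = 0
    size = 1
    while lower < total_ids:
        # Exclusive upper bound of the values that fit in `size` bytes.
        upper = min(1 << (7 * size), total_ids)
        counts[size] = upper - lower
        lower = upper
        size += 1
    return counts


TOTAL = 32 * 1024 * 1024  # 32M sequential IDs starting at 0
hist = length_histogram(TOTAL)
for size, count in hist.items():
    print(f"{size} byte(s): {count:>10}  ({count / TOTAL:.4%})")
```

Under those assumptions, 31,457,280 of the 33,554,432 IDs (93.75%) need 4 bytes, and only 16,384 (under 0.05%) fit in fewer than 3.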
