On Oct 23, 12:43 am, dormando <[email protected]> wrote:
> > I'm trying to cache 1 billion items in memcached, which have a URL
> > key.
> > The value of one item has 12 bytes (3 integers). However, the key is
> > also been stored in memory, which increases the RAM size greatly. I'm
> > using a Base64 of a MD5 of a URL, but even so each item is being
> > cached in 160 bytes according to the "stats sizes" command.
> >
> > To save space I would like to not cache these keys. I'm willing to
> > accept the collision problem of having different items mapped to the
> > same entry.
> >
> > Is there any way/setup to not store the keys in memcached (or
> > membase) ?
>
> We have to know the key in order for the internal hash table to work.
>
You have to know the key to map it to a server and a RAM location, but I
do not see any reason why you should have to store it. The only
explanation seems to be distinguishing values after a hash collision,
since you cannot even iterate over the keys.

> > Is there any way/setup to reduce the waste space of small items like
> > these? How can I set a tiny slab size?
>
> A few things:
>
> 1) Use binary protocol, and use direct SHA1 (128bit or 256bit) as the key,
> which will save a lot of bytes over base64. MD5 is cutting it a bit close.

I'm using the binary protocol. The problem is that the raw MD5 and SHA1
digests can contain \n, \r and space bytes, which leads to this message:
"Key contains invalid characters". Is there any way to use a byte array
as the key instead of a string?

> 2) Use -C startup command, which disables CAS and saves 8 bytes per item

OK

> 3) Compare `stats sizes` with the slab class sizes after storing some test
> items, and adjust -f and/or the minimum slab size to get the slabs closer
> to ideal.
>

OK

> Should get you a lot closer.
>
> -Dormando

Thanks.
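For reference, here is a rough Python sketch of the key sizes being discussed. It only measures candidate key encodings for a URL and checks them for the bytes mentioned above (\n, \r and space); it does not talk to memcached. The names candidate_keys and has_forbidden_byte and the example URL are made up for illustration, and the unpadded URL-safe Base64 variant is my own addition, not something suggested in the thread.

```python
import base64
import hashlib

# Bytes reported above as rejected inside a key: newline, carriage return, space.
FORBIDDEN = frozenset(b"\n\r ")


def candidate_keys(url: str) -> dict:
    """Return several possible key encodings for one URL (hypothetical helper)."""
    data = url.encode("utf-8")
    md5 = hashlib.md5(data).digest()    # 16 raw bytes
    sha1 = hashlib.sha1(data).digest()  # 20 raw bytes
    return {
        "md5_raw": md5,
        "md5_base64": base64.b64encode(md5),  # 24 chars, padded
        "md5_base64_nopad": base64.urlsafe_b64encode(md5).rstrip(b"="),  # 22 chars
        "sha1_raw": sha1,
    }


def has_forbidden_byte(key: bytes) -> bool:
    """True if the key contains a newline, carriage return or space byte."""
    return any(b in FORBIDDEN for b in key)


if __name__ == "__main__":
    for name, key in candidate_keys("http://example.com/page").items():
        print(name, len(key), "forbidden byte" if has_forbidden_byte(key) else "clean")
```

The Base64 forms never contain the problematic bytes but cost 6 to 8 bytes more than the raw MD5 digest. A raw digest is the smallest option, but any given digest may happen to contain one of those bytes, so whether it works depends on the client accepting arbitrary byte keys.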
