On Oct 23, 12:43 am, dormando <[email protected]> wrote:
> > I'm trying to cache 1 billion items in memcached, which have a URL
> > key.
> > The value of one item has 12 bytes (3 integers). However, the key is
> > also being stored in memory, which increases the RAM usage greatly. I'm
> > using a Base64 encoding of an MD5 of the URL, but even so each item is
> > being cached in 160 bytes according to the "stats sizes" command.
>
> > To save space I would like to not cache these keys. I'm willing to
> > accept the collision problem of having different items mapped to the
> > same entry.
>
> > Is there any way/setup to not store the keys in memcached (or
> > membase) ?
>
> We have to know the key in order for the internal hash table to work.
>

You have to know the key to map it to a server and a RAM location, but I
do not see any reason why you should have to store it.
The only reason I can see for storing it is to distinguish values after a
hash collision, since you cannot even iterate over the keys.
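
To illustrate the trade-off, here is a minimal sketch of a table that
keeps only values and never the keys. It is a toy model, not how
memcached works internally; KeylessTable and bucket_of are hypothetical
names made up for this example.

```python
import hashlib


def bucket_of(key: bytes, nbuckets: int) -> int:
    """Map a key to a bucket index using the first 8 bytes of its MD5."""
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:8], "big") % nbuckets


class KeylessTable:
    """Toy table that stores values only (hypothetical, for illustration)."""

    def __init__(self, nbuckets: int) -> None:
        self.slots = [None] * nbuckets

    def set(self, key: bytes, value) -> None:
        # The key is used to pick the slot and is then thrown away.
        self.slots[bucket_of(key, len(self.slots))] = value

    def get(self, key: bytes):
        # Without the stored key there is nothing to compare against,
        # so a colliding key silently gets another key's value.
        return self.slots[bucket_of(key, len(self.slots))]
```

With a single bucket every key collides, so a get for one URL returns
whatever was stored last for any other URL. That is the collision
behaviour I said I am willing to accept.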


> > Is there any way/setup to reduce the wasted space of small items like
> > these? How can I set a tiny slab size?
>
> A few things:
>
> 1) Use binary protocol, and use direct SHA1 (128bit or 256bit) as the key,
> which will save a lot of bytes over base64. MD5 is cutting it a bit close.

I'm using the binary protocol. The problem is that the raw MD5 and SHA1
digests can contain \n, \r and space bytes, which leads to this message:
"Key contains invalid characters". Is there any way to use a byte array
as the key instead of a string?
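
For reference, a small sketch of the key sizes involved, using Python's
hashlib and base64. The function names and the example URL are
hypothetical; is_text_safe only checks for printable, non-space ASCII,
which is an approximation of what a client library might enforce, so the
actual rule of a given client would need checking.

```python
import base64
import hashlib


def key_encodings(url: bytes) -> dict:
    """Return several encodings of the MD5 digest of a URL."""
    raw = hashlib.md5(url).digest()  # 16 raw bytes, may contain any byte value
    return {
        "raw": raw,
        "hex": raw.hex().encode("ascii"),
        "base64": base64.b64encode(raw),
        # URL-safe alphabet with the trailing '=' padding stripped.
        "base64_nopad": base64.urlsafe_b64encode(raw).rstrip(b"="),
    }


def is_text_safe(key: bytes) -> bool:
    """True if the key has no control characters or whitespace bytes."""
    return all(32 < b < 127 for b in key)


if __name__ == "__main__":
    encs = key_encodings(b"http://example.com/some/page")
    for name, value in encs.items():
        print(name, len(value), is_text_safe(value))
```

The raw digest is the shortest form at 16 bytes, but it is the one that
can trip the invalid-characters check. Unpadded Base64 costs 22 bytes and
always passes it.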

> 2) Use -C startup command, which disables CAS and saves 8 bytes per item

OK

> 3) Compare `stats sizes` with the slab class sizes after storing some test
> items, and adjust -f and/or the minimum slab size to get the slabs closer
> to ideal.
>

OK
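
A rough sketch of how the growth factor spaces out the slab classes,
to help with comparing against `stats sizes`. This is a simplified model
and not memcached's actual code: the function names, the 8-byte
alignment, the 1 MB upper bound and the 63-class cap are assumptions
chosen for illustration, and the smallest chunk size is an input rather
than something the sketch derives.

```python
def slab_chunk_sizes(min_chunk, factor, max_chunk=1024 * 1024,
                     align=8, max_classes=63):
    """Approximate chunk size of each slab class (simplified model)."""
    sizes = []
    size = min_chunk
    while len(sizes) < max_classes and size <= max_chunk / factor:
        if size % align:
            size += align - size % align  # round up to the alignment
        sizes.append(size)
        size = int(size * factor)  # next class grows by the factor
    return sizes


def chunk_for(item_size, sizes):
    """Smallest chunk that fits the item, or None if nothing fits."""
    for chunk in sizes:
        if item_size <= chunk:
            return chunk
    return None


if __name__ == "__main__":
    for factor in (1.25, 1.05):
        sizes = slab_chunk_sizes(96, factor)
        chunk = chunk_for(100, sizes)
        print(factor, sizes[:6], "100-byte item ->", chunk,
              "wasted:", chunk - 100)
```

The 96-byte minimum and the 100-byte item are made-up inputs. The point
is only that a smaller factor packs the classes closer together, so a
small item wastes fewer bytes in its chunk.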

> Should get you a lot closer.
>
> -Dormando

Thanks.
