Re: What is a valid key?

Steven Grimm Thu, 20 Dec 2007 11:36:31 -0800

That seems fine to me, but we don't actually need to forbid 0x7F. Memcached doesn't do anything special with that byte.


-Steve



On Dec 20, 2007, at 11:34 AM, Aaron Stone wrote:

This is pretty verbose, but hopefully will cut way down on this FAQ:

 Keys are limited in length to 250 octets. Octets in the key MUST NOT
 have value 0x20 or less, nor value 0x7F (corresponding to ASCII space
 and all control characters below it, and ASCII del, respectively).
 Octets MAY have their 'high bits' set.

 Note: The UTF-8 character encoding produces output octets which meet
 these requirements. Please be aware that some characters may be
 represented as more than one octet. Refer to your language's string
 length functions to ensure that you are producing keys of 250 or fewer
 _octets_ and not simply 250 or fewer _characters_.
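
A minimal client-side sketch of the rule proposed above, in Python (chosen only for illustration; the thread does not name a language). The names `is_valid_key`, `MAX_KEY_OCTETS` and `forbid_del` are hypothetical, and rejecting the empty key is an assumption the proposed text does not state. The `forbid_del` switch exists because the reply at the top of this thread says 0x7F does not actually need to be forbidden. This is not how the memcached server itself checks keys.

```python
MAX_KEY_OCTETS = 250  # limit from the proposed text, counted in octets


def is_valid_key(key: bytes, forbid_del: bool = True) -> bool:
    """Check an already-encoded key against the proposed wording.

    The key is treated as raw octets; no particular character
    encoding is assumed or verified.
    """
    # Assumption: an empty key is rejected (the proposed text is silent on it).
    if not 0 < len(key) <= MAX_KEY_OCTETS:
        return False
    for octet in key:
        # 0x20 (space) and every control character below it are forbidden.
        if octet <= 0x20:
            return False
        # 0x7F (DEL) is forbidden only under the stricter proposed wording.
        if forbid_del and octet == 0x7F:
            return False
    # Octets with the high bit set (0x80-0xFF) are allowed.
    return True


# Characters and octets differ: 126 characters here encode to 252 octets.
too_long = ("\u00e9" * 126).encode("utf-8")
print(len("\u00e9" * 126), len(too_long), is_valid_key(too_long))
```

The length check runs on the encoded bytes, which is the octets-versus-characters distinction the note warns about.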

I forgot about that ascii 127 deal until I re-read 'man ascii' just now.
I assume we need to restrict that, too, so I put it in the text above.

Do you think this text still inadvertently suggests that we require UTF-8?


Aaron

On Thu, Dec 20, 2007, Kieran Benton <[EMAIL PROTECTED]> said:

Point taken - that was something I hadn't considered.

I still think it's a good idea to add a footnote into that section of
the docs to note that UTF8 is a "safe" encoding to use since it is so
popular in western systems and many devs might not necessarily know if
it fulfills the criteria (I certainly didn't from a brief scan).

This is of course if it's decided by the end of this thread that it can
be used generically! :)

Cheers,
Kieran

-----Original Message-----
From: Steven Grimm [mailto:[EMAIL PROTECTED]
Sent: 20 December 2007 18:32
To: Kieran Benton
Cc: Dustin Sallings; a.; [email protected]
Subject: Re: What is a valid key?

On Dec 20, 2007, at 10:14 AM, Kieran Benton wrote:

Are we saying that as long as you use UTF-8 for the key, and that it
is not longer than 250 bytes, then all is fine with both text and binary
protocols? If so then I think we should update the docs to say so
and be happy :)


It has nothing to do with UTF-8. There is no good reason to specify
that in the documentation. It's just a bunch of bytes (or octets, if
you prefer) with some specific byte values forbidden. The server does
not check the bytes in the key to make sure they form valid UTF-8
sequences. You can use ASCII or UTF-8 or ISO-8859-1 or ISO-8859-5 or
KOI-8 or GB-18030 or a random-number generator, so long as you avoid
the forbidden bytes. It does not even have to be a human-readable key;
it could be a raw hash value with certain bytes escaped. (Though
obviously that makes ad-hoc debugging a bit painful.)
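
One way the "raw hash value with certain bytes escaped" idea could look, as a hedged Python sketch. The thread does not name an escaping scheme; percent-style escaping, the function name `escape_key`, and the choice of SHA-256 are all assumptions made for illustration. Forbidden octets (0x20 and below, plus 0x7F to be conservative) and the escape character itself are replaced by `%XX`; every other octet, including those with the high bit set, passes through unchanged.

```python
import hashlib

ESCAPE = 0x25  # '%', assumed escape character (not specified in the thread)


def escape_key(raw: bytes) -> bytes:
    """Escape forbidden octets so arbitrary bytes can serve as a key."""
    out = bytearray()
    for octet in raw:
        if octet <= 0x20 or octet == 0x7F or octet == ESCAPE:
            # Emit '%' followed by two uppercase hex digits.
            out += b"%%%02X" % octet
        else:
            out.append(octet)
    return bytes(out)


# A 32-byte SHA-256 digest grows to at most 96 octets after escaping,
# well under the 250-octet limit.
digest = hashlib.sha256(b"some cached object").digest()
key = escape_key(digest)
print(len(digest), len(key))
```

Escaping the escape character as well keeps the mapping reversible, so two different raw values cannot collide on the same key.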

If we say "keys can be UTF-8" in the documentation, then some poor
Russian programmer, say, who is otherwise working in KOI-8 encoding is
going to add unnecessary code to a client library to transform KOI-8
to UTF-8 so as to comply with the protocol spec.

-Steve
