On Mon, Nov 12, 2007 at 14:43:11 -0800, Marc wrote:
> For the record, my motivation for a binary protocol was not computational
> efficiency but more efficient I/O, especially for large sets of small keys
> with small values, AND to reduce code complexity.

That's why you should be voting against the "tags" approach. With tags,
the reply format is:

  <tag> <data>
  <tag> <data>
  --- nothing for not found items
  <tag> <data>
  <tag> <data>
  <tag> <data>
  --- nothing for not found items
  <tag> <data>

Since you want to query lots of data, I guess <tag> is more than one
byte, right (otherwise you can query at most 256 items in one streaming
round)?

But what if we implement a one-to-one correspondence between requested
keys and responses? Let's see:

  <found> <data>
  <found> <data>
  <not_found>
  <found> <data>
  <found> <data>
  <found> <data>
  <not_found>
  <found> <data>

where <found> and <not_found> are one byte (a bit would be enough, if we
can add it to some other field).

So with sizeof(tag) >= 2 you have at least 2 * 6 = 12 meta bytes, one
tag for each of the six found items. With one-to-one you have
1 * 8 = 8, one status byte for each of the eight requested keys.
One-to-one is smaller if the hit ratio is >50%, that is. It was said
that Facebook has get-intensive 99-1 applications, so I doubt you are
optimizing for a hit rate <<50%.

I also described the code complexity issues of matching
keys/tags/whatever vs simple sequential processing in another mail.

So, why would one want to have tags?

--
Tomash Brechko
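
For illustration only, the meta-byte comparison above can be sketched in a few lines of Python. The function names are hypothetical and not part of any memcached code; the 2-byte tag is simply the minimum size assumed in the argument above, and the 1-byte status marker stands for <found>/<not_found>.

```python
def tagged_overhead(hits: int, tag_size: int = 2) -> int:
    """Meta bytes for the tagged format: one tag per found item,
    nothing at all is sent for items that were not found."""
    return hits * tag_size


def one_to_one_overhead(requested: int, status_size: int = 1) -> int:
    """Meta bytes for the one-to-one format: one status marker
    (<found> or <not_found>) per requested key."""
    return requested * status_size


def break_even_hit_ratio(tag_size: int = 2, status_size: int = 1) -> float:
    """Hit ratio above which the one-to-one format carries fewer meta bytes.
    Derived from hits * tag_size > requested * status_size."""
    return status_size / tag_size


if __name__ == "__main__":
    requested, hits = 8, 6  # the example reply: 6 found, 2 not found
    print("tagged:    ", tagged_overhead(hits))
    print("one-to-one:", one_to_one_overhead(requested))
    print("break-even:", break_even_hit_ratio())
```

With larger tags the break-even hit ratio only drops further (for example, a 4-byte tag gives 25%), so under these assumptions the one-to-one format wins for any workload with a reasonably high hit rate.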
