[EMAIL PROTECTED] wrote:
Roy Lyseng wrote:

In reality, a full blown string compare is not even needed in this case. One can loop through the command array comparing the first character only, then the second, etc. If the actual length of the command is stored in the array as well and the length of the requested command is done beforehand, one needs to at most only compare two characters and the length. From a quick look only the get/gets commands both start with the same two characters - therefore the length must also be compared. Otherwise the entire string in this particular case.

Yes - and I guess strcmp() will bail out after the first non-match.


I could also argue that 'add' is almost certainly less frequency in most people's apps than 'set'. In reality, we're nit-picking too much, a few tens/hundreds of clock cycles more or less per request would not give us any meaningful performance benefit. Interrupts and context switches are the bottlenecks when demand is extremely high - tens of thousands of requests per second.

Just for fun I started porting memcached into a kernel module with the goal of measuring performance differences between the standard one and the kernel module - but I haven't finished that yet. There would be pretty large savings on context switching and interrupts - though I don't think real world performance gains would be meaningful for 99% of the users on this list. Only a few push (or get close to pushing) memcached to its limits.

Just want to add that the benchmarks I see show 20% time in user-space and 80% in kernel-space...

Roy

Marc



Håkan Waara wrote:
I haven't hacked this code myself, so forgive me if I'm totally wrong, but couldn't you use a hash table instead below to avoid any string comparison at all?

I certainly could, but I do not think this is a performance pain point.

But the array of command strings would be a great start for another optimization.

Roy

/Håkan

Roy Lyseng wrote:
Hi,

I have been looking at structuring the command processing in memcached slightly.

I am thinking about creating an array with info about available commands:

static struct {
    char *cmdword;    /* Command word string */
    int   cmd;        /* Command shorthand */
    int   mintokens;  /* Minimum number of tokens (required) */
int maxtokens; /* Maximum number of tokens (zero means no limit) */
} cmds[] = {
   {"get",        CMD_GET,       3, 0},
   {"delete",     CMD_DELETE,    3, 5},
   {"add",        CMD_ADD,       6, 7},
   {"set",        CMD_SET,       6, 7},
   {"replace",    CMD_REPLACE,   6, 7},
   {"prepend",    CMD_PREPEND,   6, 7},
   {"append",     CMD_APPEND,    6, 7},
   {"gets",       CMD_GETS,      3, 0},
   {"cas",        CMD_CAS,       7, 8},
   {"incr",       CMD_INCR,      4, 5},
   {"decr",       CMD_DECR,      4, 5},
   {"bget",       CMD_BGET,      3, 0},
   {"own",        CMD_OWN,       3, 3},
   {"disown",     CMD_DISOWN,    3, 3},
   {"bg",         CMD_BG,        3, 3},
   {"stats",      CMD_STATS,     2, 0},
   {"flush_all",  CMD_FLUSH,     2, 4},
   {"version",    CMD_VERSION,   2, 2},
   {"quit",       CMD_QUIT,      2, 2},
{"slabs", CMD_SLABS, 5, 5}, /* Next token should be "reassign" */
   {"verbosity",  CMD_VERBOSITY, 3, 4},
{NULL, -1, 0, 0} /* Terminate with a NULL string pointer */
};

I have tried to sort the presumably most frequent commands first.

process_commands() will then do:

    ntokens = tokenize_command(command, tokens, MAX_TOKENS);

    for (i = 0; cmds[i].cmdword != NULL; i++) {
if (strcmp(tokens[COMMAND_TOKEN].value, cmds[i].cmdword) == 0) {
           cmd = cmds[i].cmd;
           break;
        }
    }

    if (cmd < 0) {
        out_string(c, "ERROR");              /* Token not matched */
        return;
    }
    if (ntokens < cmds[i].mintokens ||
       (cmds[i].maxtokens > 0 && ntokens > cmds[i].maxtokens)) {
out_string(c, "ERROR"); /* Invalid number of tokens for this cmd */
        return;
    }

c->item_comm = cmd; /* Command being processed on connection */

    switch (cmd) {
    case CMD_GET:
    case CMD_BGET:
        process_get_command(c, tokens, ntokens, false);
        break;
...

Does this look interesting to you guys?

Before I go any further with this, is there any other information that should be used to characterize commands?

Are there commands that are not in use (BGET?)

More?

Thanks,
Roy


!DSPAM:47d5500917066856617375!


Reply via email to