Paul McCullagh wrote:
Hi Brian and Mark,

On Jul 16, 2008, at 1:03 AM, Brian Aker wrote:

So to me these need to be set:
   error= setsockopt(ptr->fd, SOL_SOCKET, SO_LINGER,
                     &linger, (socklen_t)sizeof(struct linger));
   error= setsockopt(ptr->fd, SOL_SOCKET, SO_SNDTIMEO,
                     &waittime, (socklen_t)sizeof(struct timeval));
   error= setsockopt(ptr->fd, SOL_SOCKET, SO_RCVTIMEO,
                     &waittime, (socklen_t)sizeof(struct timeval));
   (void)fcntl(ptr->fd, F_SETFL, flags | O_NONBLOCK);

What I don't quite understand is, why use non-blocking I/O?

This would be like polling the connection. But after setting a timeout
you shouldn't need to poll.

Another thing:

Have you considered using c10k and a pool of threads?

FYI, this is precisely what the MySQL Proxy does/solves.  Does it make
sense to do this in the microkernel itself or external to it?  Or both...

-jay

Had to sit on this for a while, but I've worked up some courage... Sorry this is so long. Haven't found a way to describe it more succinctly.

I went in a different direction with the network handling in DPM (my shameless mysql proxy clone). Also, I don't know how mysql proxy does this offhand.

- Logical packets are read/written into buffers.
- Buffers are *only* passed in for processing after a complete packet has been read.
- Buffers are *only* written to socket after processing has completed.
- Buffers are written to sockets as fast as the data is taken. As sockets become writable there's a short path to the "find more buffer data and flush" code.

This adds a little more memory management and a little more memory, but it reduces syscalls a great deal. In some part of my mind I believe it helps with avoiding context switches as well. MySQL, by contrast, will do a blocking write on each individual logical packet...
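
Roughly, the read side of that looks something like the sketch below. conn_t, dispatch_packet(), and the fixed-size buffer are just names made up for illustration, not DPM's actual code; the only real detail is the MySQL packet header (3-byte little-endian length plus a sequence byte).

/* Minimal sketch of "only hand off complete packets". Invented names,
 * not DPM's actual code. */
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

typedef struct {
    int     fd;              /* non-blocking socket */
    uint8_t buf[65536];      /* real code grows this; payloads can reach 16MB */
    size_t  used;            /* bytes currently buffered */
} conn_t;

/* Hand a complete logical packet to the processing layer; does no I/O. */
static void dispatch_packet(conn_t *c, uint8_t *pkt, size_t len)
{
    (void)c; (void)pkt; (void)len;   /* stub for the sketch */
}

/* Called whenever the event loop says the socket is readable. */
static int conn_readable(conn_t *c)
{
    ssize_t n = read(c->fd, c->buf + c->used, sizeof(c->buf) - c->used);
    if (n == 0)
        return -1;                               /* peer closed */
    if (n < 0)
        return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;
    c->used += (size_t)n;

    /* MySQL wire packets: 3-byte little-endian length + 1-byte sequence id. */
    while (c->used >= 4) {
        size_t payload = (size_t)c->buf[0] | ((size_t)c->buf[1] << 8)
                       | ((size_t)c->buf[2] << 16);
        if (c->used < payload + 4)
            break;                               /* partial packet: wait for more */
        dispatch_packet(c, c->buf, payload + 4);
        memmove(c->buf, c->buf + payload + 4, c->used - (payload + 4));
        c->used -= payload + 4;
    }
    return 0;
}

The point is that the processing layer never touches the socket; it only ever sees whole packets.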

It also means I can do weird things and avoid weird timeslicing from the OS:
- Write to buffers on many sockets (multiplexing, processing)
- Flush all sockets in a loop. (syscalls, no processing)
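
In sketch form, that two-phase loop is something like this (conn_t, process_pending(), and serve_round() are invented names; real code would only revisit the sockets the event loop says are writable):

/* Sketch of the two-phase loop: build responses in memory, then batch the
 * write() syscalls. All names invented for illustration. */
#include <string.h>
#include <unistd.h>

typedef struct {
    int    fd;               /* non-blocking socket */
    char   out[65536];
    size_t out_len;
} conn_t;

void process_pending(conn_t *c);    /* stands in for: append response bytes to c->out */

void serve_round(conn_t *conns, int nconns)
{
    /* Phase 1: processing only -- build up responses in per-connection buffers. */
    for (int i = 0; i < nconns; i++)
        process_pending(&conns[i]);

    /* Phase 2: syscalls only -- flush whatever each socket will accept. */
    for (int i = 0; i < nconns; i++) {
        conn_t *c = &conns[i];
        while (c->out_len > 0) {
            ssize_t n = write(c->fd, c->out, c->out_len);
            if (n < 0) {
                /* EAGAIN/EWOULDBLOCK: socket full, the event loop calls back
                 * when it's writable; anything else, close the connection. */
                break;
            }
            memmove(c->out, c->out + (size_t)n, c->out_len - (size_t)n);
            c->out_len -= (size_t)n;
        }
    }
}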

I also have some minor modifications to this algorithm (not patched into DPM yet) that reduce memory usage a bit and reduce latency for receiving the response.

Obviously this can't work directly within mysql, since you want to stream large results back to the client. In proxy-land you tend to read a chunk, write a chunk, then flush the write before going back to read more data. So you add a tiny bit of latency but batch the syscalls. You also never end up flipping socket options back and forth with setsockopt calls...

How this relates to drizzle/MySQL:

- Blocking writes aren't all that useful. Process as fast as you can and flush in the background. Especially with thread pooling, where you can complete processing with potentially many fewer context switches and pick up the next incoming request via a message-passing interface. The obvious exception is large resultsets.

- Blocking reads aren't useful at all once we have a libevent layer. Pass complete request packets into a message queue, which are then picked up by the thread pool.
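
A minimal sketch of what I mean by the queue part, using plain pthreads as a stand-in for whatever message-passing layer drizzle actually grows (request_t, QSIZE, and worker() are invented names):

/* Sketch: complete request packets go into a bounded queue; a pool of
 * worker threads pulls them off. Illustrative names only. */
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    void   *session;         /* which connection this request belongs to */
    char   *packet;          /* one complete, fully buffered request packet */
    size_t  len;
} request_t;

#define QSIZE 1024
static request_t       queue[QSIZE];
static int             q_head, q_tail, q_count;
static pthread_mutex_t q_lock      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  q_not_full  = PTHREAD_COND_INITIALIZER;

/* Called from the event loop thread once a full packet has been read. */
void enqueue_request(request_t req)
{
    pthread_mutex_lock(&q_lock);
    while (q_count == QSIZE)
        pthread_cond_wait(&q_not_full, &q_lock);
    queue[q_tail] = req;
    q_tail = (q_tail + 1) % QSIZE;
    q_count++;
    pthread_cond_signal(&q_not_empty);
    pthread_mutex_unlock(&q_lock);
}

/* Every thread in the pool runs this loop; there are no blocking reads anywhere. */
void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&q_lock);
        while (q_count == 0)
            pthread_cond_wait(&q_not_empty, &q_lock);
        request_t req = queue[q_head];
        q_head = (q_head + 1) % QSIZE;
        q_count--;
        pthread_cond_signal(&q_not_full);
        pthread_mutex_unlock(&q_lock);

        /* ...execute the query against req.session, buffer the response... */
        free(req.packet);
    }
    return NULL;
}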

- This does require significant retooling so session contexts (half of THD?) are separate from the actual thread contexts (the rest of THD). Each connection has a logical session, which gets passed to the processing thread when it handles the actual request. This brings in session variables, temp tables, blah blah blah.
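
Structurally the split is roughly this; the field names are only illustrative, they don't correspond to actual THD members:

/* Sketch: per-connection state vs. per-thread state, so any worker thread
 * can pick up any session. Invented field names. */
#include <pthread.h>
#include <stdint.h>

typedef struct session {
    uint64_t  connection_id;
    int       client_fd;          /* owned by the event loop, not the worker */
    void     *session_variables;  /* @@session settings */
    void     *temp_tables;
    void     *open_transaction;
    /* ...anything that must survive between individual requests... */
} session_t;

typedef struct worker {
    pthread_t  thread;
    void      *scratch_mem;       /* per-request allocations, reset every time */
    session_t *current;           /* bound to a session only while running its query */
} worker_t;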

- The complexity of this work is why I decided on a proxy instead and ended up writing DPM. I kind of view my initial work toward production drizzle as actually being to use DPM as a plugin. DPM can then directly inject fully buffered queries, read results, etc.

The major tradeoff is that I make a conscious decision about how threads get reused: which query contexts are safe for connection pooling, and which client connections need a dedicated backend thread for the duration of their session. At that point I can actually implement a working 'RESET STATE' command by freeing the backend thread back into the mysql thread pool while keeping the client's tcp connection open. It's probably a little faster, at least...

Should drizzle's main IO still use libevent? Probably. The magic is in how we handle buffering of resultsets or incoming queries, and how threads are abstracted into a worker pool, which all goes over my head right now. If we throw in libevent but still require alarms/blocking/sleeping everywhere, there won't be much functional difference, and Mark Callaghan's blocking-with-timeouts design ends up being more efficient (less twiddling).

-Dormando
