Lesson 1: never assume your process doesn't fork() and exec() libuuid does a fork() and exec() of uuidd, which is setuid libuuid (so netstat doesn't show that a process is holding our socket open unless we run it as root).
uuidd would then get a clone of all our file descriptors. One of these being the socket that we're listening on. so as long as the uuidd process exists, we wouldn't be able to bind to the server socket. *ouch*. Since uuid is only ever called in some test cases, it's not a real issue running the test suite *unless* it tries to restart the server *after* using a libuuid function that could spawn the process. With mtr2 and --parallel=1, this was the case. with higher levels of parallelism... well, then you have more fun. so a simple fcntl() to set FD_CLOEXEC on the server socket fixes this problem (see my patch for the bug). It's actually worse than that.... It's a big security issue. This allows uuidd to do anything it wants with our file descriptors (e.g. send crap to clients, trash the innodb data files). Imagine the future with stored procs/routines in forked() processes. There's still more ways for it to be bad... It's also on accept()ed sockets. So a forked() process could do real nasty things to all our client connections (or simply run out of file descriptors, while drizzled is *just* below the limit). We can also deal with this with a fcntl() after accepting a socket. But it's even worse than that.... That leaves a race condition. Since we are a threaded app, and fork() and exec() could be going on simultaneously with accepting sockets. i.e. Thread 1 Thread 2 New process fd= accept fork() fcntl() exec() and new process has the fd that it shouldn't. This is fixed in (recent) linux kernels and libc.... or at least allows us to fix it. Namely, an atomic way to create a socket and set CLOEXEC (see the required reading below) But it's even worse than this.... Also file descriptors created for, well, files. This means all engines need to be modified too. Anybody have an idea what the story is on Solaris? Required reading: - http://lwn.net/Articles/292843/ - http://udrepper.livejournal.com/20407.html - https://bugzilla.redhat.com/show_bug.cgi?id=443321 -- Stewart "foiled by unix" Smith _______________________________________________ Mailing list: https://launchpad.net/~drizzle-discuss Post to : drizzle-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~drizzle-discuss More help : https://help.launchpad.net/ListHelp