> From: Gary Mills

> Really!  Does the parent broadcast to all of the children and then
> deal with the one that responds first?  That model won't scale very
> well.

The protocol is a standard flavor.  The parent sends a request to a
socket.  The request includes a return address for the response.  The
request is received, processed, and answered by a single child.

> Surely the parent must know which children are busy and which
> are idle.  Can't the parent just pick the first idle one?

There are perhaps subtle implementation problems related to speed,
correctness, and robustness with having the parent know the state of
all children.  However, there is a bigger problem with that model that
you in particular should care about.  For the parent to be able to send
to an individual child, and so over a private channel, the parent would
need to devote a file descriptor to each child.  Since every mail
message being processed by dccd might need a DNSBL lookup
simultaneously, that additional use of file descriptors would reduce
the -j limit on concurrent dccm jobs.  The current scheme uses no
additional file descriptors because talking to the DNSBL child is done
with the socket normally used to talk to the DCC server.

> They should
> all respond equally quickly.

Hidden in that thought is the problem.  See
http://www.google.com/search?q=unix+%22thundering+herd+problem%22

> My system call trace is puzzling.  When a helper is in its poll/recvfrom
> loop, poll() claims that one file descriptor is ready for reading, but
> recvfrom() says there's nothing there.  I don't know how that's even
> possible.  What could cause that behavior?

 - All of the idle children are asleep in select/poll waiting for
   something to happen on either the pipe that tells them that the
   parent has died or the socket on which requests arrive.

 - A request arrives on the file descriptor, so the kernel awakens all
   of the idle children.

 - One child gets into the recvfrom() system call first, receives the
   request, and starts working.

 - The thundering herd of other idlers get "sorry no data" from
   recvfrom() on the non-blocking socket and go back to sleep.

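The sequence in that list can be reproduced in a few lines.  What
follows is only a sketch in Python, not the dccm code: the function
name simulate_herd, the request payload, and the use of a Unix-domain
socketpair in place of the real DCC socket are all made up for
illustration, and the "children" are simulated in one process by
taking turns on the same non-blocking socket.

```python
import select
import socket


def simulate_herd(n_children=3):
    """Show what each simulated idle child sees for a single request."""
    # Hypothetical stand-in for the shared request socket: one end plays
    # the parent, the other is the socket every child inherits.
    parent, shared = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    shared.setblocking(False)

    # The parent sends exactly one request (payload is invented).
    parent.send(b"dnsbl lookup request")

    # Step 1: every idle child asks poll() and is told "readable",
    # because nobody has consumed the datagram yet.
    woken = []
    for _ in range(n_children):
        poller = select.poll()
        poller.register(shared.fileno(), select.POLLIN)
        woken.append(bool(poller.poll(1000)))

    # Step 2: each child calls recvfrom(); only the first gets the data,
    # the rest get EAGAIN (BlockingIOError) on the non-blocking socket.
    results = []
    for _ in range(n_children):
        try:
            shared.recvfrom(1024)
            results.append("got request")
        except BlockingIOError:
            results.append("no data")

    parent.close()
    shared.close()
    return woken, results


if __name__ == "__main__":
    print(simulate_herd())
```

Every simulated child is reported ready by poll(), yet only one
recvfrom() succeeds, which is the pattern in the system call trace.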
I coded a fix that uses write(1 byte)/read() on that pipe to awaken a
single child, and SIGALRM to awaken children that have been asleep too
long and need to kill themselves.  However, I can't make SIGALRM work
in some POSIX threads implementations.  So I'm ripping out support for
external filters to make it possible to use SIGALRM.  (The only
external filter ever hooked to dccm/dccproc/dccifd that I know of
requires threads, but no one cares about using it inside dccm, dccifd,
or dccproc.)


Vernon Schryver    [EMAIL PROTECTED]
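
One way to read the write(1 byte)/read() idea is sketched below.  It is
an assumption about the general technique, not the actual fix: the
function name wake_one is invented, threads stand in for the forked
children to keep the example short, and the SIGALRM timeout is left
out.  Each sleeper blocks in read() on the shared pipe, so one byte
written by the parent is consumed by exactly one sleeper, and closing
the write end (the parent going away) releases the rest with EOF.

```python
import os
import queue
import threading


def wake_one(n_workers=4):
    """Write one byte to a shared pipe; report who woke and if anyone else did."""
    r, w = os.pipe()
    woke = queue.Queue()

    def worker(ident):
        # Blocking read: each byte in the pipe goes to exactly one reader.
        token = os.read(r, 1)
        if token:
            woke.put(ident)  # a real wakeup token
        # An empty read means the write end closed (parent is gone).

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()

    os.write(w, b"x")             # wake a single sleeper
    first = woke.get(timeout=5)   # the one worker that consumed the byte
    try:
        woke.get(timeout=0.3)     # nobody else should have woken up
        extra = True
    except queue.Empty:
        extra = False

    os.close(w)                   # EOF releases the remaining sleepers
    for t in threads:
        t.join(timeout=5)
    os.close(r)
    return first, extra


if __name__ == "__main__":
    print(wake_one())
```

Compared with having everyone poll the request socket, only one
sleeper is scheduled per request, so there is no herd to send back to
sleep.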
