Hi,

I'm trying to fix an issue in a custom MPM called peruser. It is more or less prefork with pools of processes, each pool running as a different user. An additional pool of processes, called Multiplexers, accepts connections and passes them to the Workers. Each Worker pool has its own pair of sockets (socketpair(PF_UNIX, SOCK_STREAM)): one end for the Multiplexers and the other for the Workers. A Multiplexer sends the socket and the request data to a Worker using a blocking sendmsg(); the Workers use a non-blocking recvmsg().
The code looks like this. In the Workers' receive_from_multiplexer():

    ...
    // Don't block
    ret = recvmsg(ctrl_sock_fd, &msg, MSG_DONTWAIT);

    if (ret == -1 && errno == EAGAIN) {
        _DBG("receive_from_multiplexer recvmsg() EAGAIN, someone was faster");
        return APR_EAGAIN;
    }
    else if (ret == -1) {
        _DBG("recvmsg failed with error \"%s\"", strerror(errno));
        return APR_EGENERAL;
    }
    else _DBG("recvmsg returned %d", ret);

In the Multiplexers:

    if ((rv = sendmsg(processor->senv->output, &msg, 0)) == -1) {
        apr_pool_destroy(r->pool);
        ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, ap_server_conf,
                     "Writing message failed %d %d", rv, errno);
        return -1;
    }

The problem is that sometimes a Multiplexer is stuck on sendmsg() and a Worker is stuck on recvmsg(). The OS is Linux 2.6.32 on amd64. The stuck sendmsg() call looks like this:

    sendmsg(74, {msg_name(0)=NULL, msg_iov(5)=[{"y\1\0\0\0\0\0\0", 8}, {"\0\0\0\0\0\0\0\0", 8}, {"\230\322\265\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\364\351 \0\0\2\0\0\0\20\0\0\0\4\0\0\0\20\0\0\0\0\0\0\0l\324\265\0\0\0\0\0\0\0\0\0\0\0\0\0\2\0\351\364C\303p \0\0\0\0\0\0\0\0\10\0\0\0\0\0\0\0F\251\266\0\0\0\0\0@\24 5\256\2\0\0\0\0\370\222\266\0\0\0\0\0`\30\252\366\377\177\0\0\320\30\252\366\377\177\0\0\5\0\0\0\0\0\0\0\364\230\254\242a\177\0\0\6\0\0\0\1\0\0\0\5\0\0\0\1\0\0\0\4\0\0\0\1\0\0\0\3\0\0\0\1\0\1\0\213\0\0\0\1\0\0\0\220\361\5\0\0\0\0\0", 192}, {"GET /oglxxxxx.html HTTP/1.0\r\nHost: xxxxxxx \r\nUser-Agent: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)\r\nAccept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Encoding: gzip\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n\r\n\0", 378}, {"", 0}], msg_controllen=20, {cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {151}}, msg_flags=MSG_PROXY|MSG_DONTWAIT}, 0 < unfinished ...>

Killing the destination Workers frees all Multiplexers.
I think the problem might be in receive_from_multiplexer(): if a message is, for example, only half received, the code doesn't go back to read the rest. receive_from_multiplexer() is called after apr_pool() on multiple Workers, so there's no guarantee that the same Worker will come back to read the remainder of the message, and this blocks the socket for other messages.

I know that this is not httpd code, but the peruser mailing list is dead, and I don't have any other ideas where to go with this.

--
Michal Grzedzicki