On Sat, 18 Jun 2022, Jacob Moody wrote:
I've attempted to reproduce it, trying to remove the libthread/notify factors. I've come up with this:

	#include <u.h>
	#include <libc.h>

	static void
	proc_udp(void*)
	{
		char resp[512];
		char req[] = "request";
		int fd;
		int n;
		int pid;

		fd = dial("udp!185.157.221.201!5678", nil, nil, nil);
		if(fd < 0)
			exits("can't dial");

		if(write(fd, req, strlen(req)) != strlen(req))
			exits("can't write");

		pid = getpid();
		fprint(1, "start %d\n", pid);
		n = read(fd, resp, sizeof(resp)-1);
		fprint(1, "end %d %d\n", pid, n);
		exits(nil);
	}

	void
	main(int, char**)
	{
		int i;
		Waitmsg *wm;

		for(i = 0; i < 10; i++){
			switch(fork()){
			case -1:
				sysfatal("fork %r");
			case 0:
				proc_udp(nil);
				sysfatal("ret");
			default:
				break;
			}
		}
		for(i = 0; i < 10; i++){
			wm = wait();
			print("proc %d died with message %s\n", wm->pid, wm->msg);
		}
		exits(nil);
	}

This code makes it pretty obvious that we are losing some children; on my machine this program never exits. I see some portion of the readers correctly returning -1, and the parent is able to get their Waitmsg but not all of them.
Moody, I think this old thread will interest you:

https://marc.info/?t=112730920400001&r=1&w=2

Russ Cox explained there:

It appears that your program, at its core, it is doing this:

	void
	readproc(void *v)
	{
		int fd;
		char buf[100];

		fd = (int)v;
		read(fd, buf, sizeof buf);
	}

	void
	threadmain(int argc, char **argv)
	{
		int p[2];

		pipe(p);
		proccreate(readproc, (void*)p[0], 8192);
		proccreate(readproc, (void*)p[1], 8192);

		close(p[0]);	/* and here you expect the first readproc to be done */
		close(p[1]);	/* and here the second */
	}

Each read call is holding up a reference to its channel inside the kernel, so that even though you've closed the fd and removed the ref from the fd table, there is still a reference to each side of the pipe in the form of the process blocked on the read.

I've never been sure whether the implicit ref held during the system call is good behavior, but it's hard to change.

In your case, writing 0 (or anything) makes the read finish, releasing the last ref to the underlying pipe when the system call finishes, and then everything cleans up as expected.

So you've found your workaround, and now we understand why it works.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M6e48031f9e8673387c0b47b8
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
