gmake write error and possible solution
Hello all.

I'm posting this to misc@ because it doesn't look like a problem with the port itself. Recently I have been running into GMake's "write error" problem (too) often. It was reported here some time ago, with no result. After some more digging I found this commit in DragonFlyBSD:

http://www.mail-archive.com/commits%40crater.dragonflybsd.org/msg02534.html

Log:
Do not set O_NONBLOCK on a threaded program's descriptors any more. Instead, use the new system calls to directly issue non-blocking I/O. Additionally, force blocking I/O for debug output.

This partly solves the problem of programs such as bmake or gmake fork/exec'd children which happen to be threaded. The children would set O_NONBLOCK on e.g. stdin, stdout, and stderr, resulting in unexpected operation if the unrelated parent program tries to issue a read or write.

Solves: gmake 'write error' problem

Could anyone experienced comment on this, please?

--
Best wishes,
Vadim Zhukov
Re: gmake write error and possible solution
On Tue, Jan 6, 2009 at 6:47 PM, Vadim Zhukov persg...@gmail.com wrote:

> Recently I have been running into GMake's "write error" problem (too) often. It was reported here some time ago, with no result. After some more digging I found this commit in DragonFlyBSD:
>
> http://www.mail-archive.com/commits%40crater.dragonflybsd.org/msg02534.html
>
> Log:
> Do not set O_NONBLOCK on a threaded program's descriptors any more. Instead, use the new system calls to directly issue non-blocking I/O. Additionally, force blocking I/O for debug output.
>
> This partly solves the problem of programs such as bmake or gmake fork/exec'd children which happen to be threaded. The children would set O_NONBLOCK on e.g. stdin, stdout, and stderr, resulting in unexpected operation if the unrelated parent program tries to issue a read or write.
>
> Solves: gmake 'write error' problem
>
> Could anyone experienced comment on this, please?

We don't have whatever these new syscalls are and are unlikely to adopt them, so I don't think the fix is particularly relevant to OpenBSD. But yeah, faking threads in userland causes trouble. If we replace the thread library with a better one, then the problem goes away. Maybe.

Let me qualify that. The reason for the maybe is that there can be many reasons for a program to set stdout to non-blocking. It may not always be the result of pthread fiddling. So gmake is still wrong. If its behavior depends on whether an fd is set non-blocking in a child, that's a problem. It's just a problem that occurs less frequently without threads, it seems.
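To make the failure mode concrete, here is a minimal sketch of how a child's change to O_NONBLOCK shows up in the parent. It assumes an ordinary POSIX system where fork() shares the open file description between parent and child. The helper names set_nonblock_in_child() and is_nonblocking() are hypothetical and appear nowhere in gmake or in any thread library.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that sets O_NONBLOCK on the
 * inherited descriptor fd and exits, roughly what a userland thread
 * library might do to stdout. Returns 0 if the child succeeded,
 * -1 otherwise.
 */
int set_nonblock_in_child(int fd)
{
    assert(fd >= 0);

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: flip the flag on its copy of the descriptor. */
        int fl = fcntl(fd, F_GETFL);
        if (fl == -1 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) == -1)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the child and report whether it succeeded. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Returns 1 if fd is non-blocking, 0 if blocking, -1 on error. */
int is_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    if (fl == -1)
        return -1;
    return (fl & O_NONBLOCK) != 0;
}
```

Calling is_nonblocking() on a fresh pipe end before and after set_nonblock_in_child() should show the flag flipping in the parent, even though the parent never called fcntl(F_SETFL) itself.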
Re: gmake write error and possible solution
On Tue, Jan 6, 2009 at 5:07 PM, Ted Unangst ted.unan...@gmail.com wrote:

> ...
> Let me qualify that. The reason for the maybe is that there can be many reasons for a program to set stdout to non-blocking. It may not always be the result of pthread fiddling. So gmake is still wrong. If its behavior depends on whether an fd is set non-blocking in a child, that's a problem. It's just a problem that occurs less frequently without threads, it seems.

Some of us wish that the non-blocking flag was an fd flag (like FD_CLOEXEC) instead of a file table flag like it really is**; this would never have been an issue then.

As for this being a bug in gmake, well, the same bug exists in *lots* of programs. I used to hit it all the time with the system 'vi' when debugging a threaded program that crashed, leaving the session's std{in,out,err} as non-blocking. That mostly went away when the system ksh started resetting the terminal to blocking when the foreground process exited, but you can still hit it by running 'vi' from inside a threaded program (with system()), then stopping and starting the program and vi with ^Z and fg:

Error: input: Resource temporarily unavailable

Notice that resetting the state at startup isn't enough. Since the state could be changed by another process at any moment, you actually have to replace each should-be-blocking call with "try it, then poll() and loop if EAGAIN" logic...which probably isn't correct for a terminal device in non-canonical mode. Altering almost every program on the system to do that seems like the Wrong Thing to me.

Philip Guenther

** Yes, yes, there would have had to be some way to specify non-blocking open(). If we lived in that universe, the details would have been worked out already.
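For reference, the "try it, then poll() and loop if EAGAIN" pattern described above might look roughly like the following for write(). This is only a sketch under stated assumptions: write_fully() is a hypothetical name, it handles only the write side, and, as noted above, it is probably not the right logic for a terminal in non-canonical mode.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all len bytes from buf to fd, even if
 * some other process has switched the descriptor to non-blocking.
 * Returns 0 on success, -1 on error with errno set by write()/poll().
 */
int write_fully(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n >= 0) {
            /* Partial (or complete) write: advance and keep going. */
            p += n;
            len -= (size_t)n;
            continue;
        }
        if (errno == EINTR)
            continue;               /* interrupted by a signal, retry */
        if (errno == EAGAIN) {
            /* Descriptor is non-blocking and full: wait until writable. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            if (poll(&pfd, 1, -1) == -1 && errno != EINTR)
                return -1;
            continue;
        }
        return -1;                  /* a real error, e.g. EPIPE or EIO */
    }
    return 0;
}
```

Every should-be-blocking write() in a program would have to go through a wrapper like this, plus a matching one for read(), which illustrates why applying it to almost every program on the system is unattractive.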
Re: gmake write error and possible solution
On Tue, Jan 6, 2009 at 8:51 PM, Philip Guenther guent...@gmail.com wrote:

> As for this being a bug in gmake, well, the same bug exists in *lots* of programs. I used to hit it all the time with the system 'vi' when debugging a threaded program that crashed, leaving the session's std{in,out,err} as non-blocking. That mostly went away when the system ksh started resetting the terminal to blocking when the foreground process exited, but you can still hit it by running 'vi' from inside a threaded program (with system()), then stopping and starting the program and vi with ^Z and fg:
>
> Error: input: Resource temporarily unavailable
>
> Notice that resetting the state at startup isn't enough. Since the state could be changed by another process at any moment, you actually have to replace each should-be-blocking call with "try it, then poll() and loop if EAGAIN" logic...which probably isn't correct for a terminal device in non-canonical mode. Altering almost every program on the system to do that seems like the Wrong Thing to me.

My opinion is that for vi this is more of a corner case. I think it's reasonable for vi to assume it has blocking fds to start with, and for the shell to enforce that. The same goes for any other app that doesn't anticipate being toggled with another app on the console.

But gmake is actively exec'ing other jobs. It *knows* that other processes are running and that they are likely writing to stdout, so it should handle this case. Fixing every program that writes out data to use a loop is certainly overkill, but I don't think asking every program that uses fork+exec to reset or otherwise deal with non-blocking shared descriptors is too much.
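As an illustration of the "reset" option for a program that forks and execs jobs, a parent could clear O_NONBLOCK on its standard descriptors each time it reaps a child. The sketch below is an assumption about how that might be written; force_blocking() and reset_stdio_blocking() are hypothetical names, not gmake functions.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: clear O_NONBLOCK on fd if it is set.
 * Returns 0 on success (including when fd was already blocking),
 * -1 if fcntl() fails.
 */
int force_blocking(int fd)
{
    assert(fd >= 0);

    int fl = fcntl(fd, F_GETFL);
    if (fl == -1)
        return -1;
    if ((fl & O_NONBLOCK) == 0)
        return 0;                   /* already blocking, nothing to do */
    return fcntl(fd, F_SETFL, fl & ~O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: restore blocking mode on the standard
 * descriptors, intended to be called after each waitpid() in a
 * make-like parent. Failures (for example, a closed descriptor)
 * are deliberately ignored.
 */
void reset_stdio_blocking(void)
{
    (void)force_blocking(STDIN_FILENO);
    (void)force_blocking(STDOUT_FILENO);
    (void)force_blocking(STDERR_FILENO);
}
```

This narrows the window but does not close it: a still-running child can set the flag again at any moment, so a reset on its own only reduces how often the write error occurs.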