On Sun, Oct 30, 2005 at 12:25:23AM +0900, Jason Stubbs wrote: > On Saturday 29 October 2005 19:56, Jason Stubbs wrote: > > All file descriptors opened by spawn() are now closed in the correct > > places. The code that closes all unnecessary descriptors via brute force is > > no longer required. > > > > I found that a file descriptor was being left open and passed around. > > However, it was definately not created in spawn(). It seems to me though > > that code outside of spawn should take care of closing (or telling spawn to > > close) any unnecessary FDs. > > > On Saturday 29 October 2005 23:22, Brian Harring wrote: > > With the fd_unused route, there is no way clean way to have all fd's > > closed. You don't know what fd's are open when you spawn (nor is > > there any way to know, beyond testing each potential fd, or overriding > > the file builtin); going the fd_unused route means fd's the process > > shouldn't have access to, will have access to. > > This is a difference of viewpoint, I think. Having unknown fds left open by > the calling code seems unclean to me. Iterating through 100s or 1000s of fds > and attempting to close them just in case the caller was sloppy seems equally > unclean.
Arbitrary example as to why the brute force is required class user_source: def __init__(self): self.fd = open("/etc/passwd", "r") def __iter__(self): return self def next(self): l = self.fd.next() return l.split(":",1)[0] from portage_exec import spawn_bash def process_users(users): for x in users: spawn_bash("echo %s" % x, fd_pipes={1:1, 2:2}) if __name__ == "__main__": process_users(user_source()) I'd have to modify the process_users func to handle passing in a potentially unneeded list of fd's to close- if users were a list, I'm cluttering up process_user's signature specializing for once potential source, since the process_users function is designed to take an iterable object (any iterable object). Example above doesn't cover the rather nasty cases where there is N func calls involved between where the fd is opened, and the actual spawn call. The worst scenario is where there is >1 thread running- you're stuck using globals there, which is pretty ugly. It's pretty much best practice to close what you don't explicitly need when handling total control over to another process, good daemonizing code does the same thing due to the reasons I raised above. > By the way, this method does not allow having a pipe open to a child process > other than stdin, stdout or stderr. It does; the portage_exec that is in stable is the modified version I created for ebd, which uses this exact functionality. line #153 of trunk/pym/ebuild.py, example call- self.pid = spawn_func(self.ebd+" daemonize", fd_pipes={0:0, 1:1, 2:2, 3:cread, 4:dwrite}, returnpid=True,env=env, *args, **spawn_opts)[0] Diffing between stable and trunk portage_exec, no changes affect fd handling (not surprising, the code is nasty without your reworking), so please clarify. Addendum though... where the implicit fd closing breaks down is in usage of spawn_func, which is a trunk func only, used for implementing parallel-fetch, due to issues with insuring that the fetching process isn't screwing with the normal sequential buildplan execution (nice way of saying it was something of a hack). Plus side, seperate process, it's not bound by the gil, so it sidestepped that trickery. > > The original code is ugly, and knowing a bit more this time around I'd > > write it a bit differently, that said the protections/behaviour > > originally in it should be left- it's more robust, and is expected > > now. > > I disagree here. Presently, other than the execve call, every try/except > block > in portage_exec.py has bugs that are hidden by the "protections". Original in this case was referencing the fd_pipes additions, not the true original spawn func :) Most of that code could be converted over to OSError catches anyways, although doing so requires testing the piss out of it to make sure no regressions are introduced (shouldn't be possible, but don't care to screw up the subsystem). Looking through the 2.4 subprocess module (which came along after the mods in question), adding an option controlling the fd cleansing would make sense, as long as the default option is to close the fds. That's subjective, but the examples I gave above show (imo at least) why this is needed; it's anal and *will* expose bugs rather then letting them lie, but that's better from where I sit. ~harring
pgpYLAPM6yGKz.pgp
Description: PGP signature