On Sun, Oct 30, 2005 at 12:25:23AM +0900, Jason Stubbs wrote:
> On Saturday 29 October 2005 19:56, Jason Stubbs wrote:
> > All file descriptors opened by spawn() are now closed in the correct
> > places. The code that closes all unnecessary descriptors via brute force is
> > no longer required.
> >
> > I found that a file descriptor was being left open and passed around.
> > However, it was definately not created in spawn(). It seems to me though
> > that code outside of spawn should take care of closing (or telling spawn to
> > close) any unnecessary FDs.
> 
> 
> On Saturday 29 October 2005 23:22, Brian Harring wrote:
> > With the fd_unused route, there is no way clean way to have all fd's
> > closed.  You don't know what fd's are open when you spawn (nor is
> > there any way to know, beyond testing each potential fd, or overriding
> > the file builtin); going the fd_unused route means fd's the process
> > shouldn't have access to, will have access to.
> 
> This is a difference of viewpoint, I think. Having unknown fds left open by 
> the calling code seems unclean to me. Iterating through 100s or 1000s of fds 
> and attempting to close them just in case the caller was sloppy seems equally 
> unclean.

Arbitrary example as to why the brute force is required

class user_source:
        def __init__(self):
                self.fd = open("/etc/passwd", "r")
        def __iter__(self):
                return self
        def next(self):
                l = self.fd.next()
                return l.split(":",1)[0]

from portage_exec import spawn_bash
def process_users(users):
        for x in users:
                spawn_bash("echo %s" % x, fd_pipes={1:1, 2:2})

if __name__ == "__main__":
        process_users(user_source())

I'd have to modify the process_users func to handle passing in a 
potentially unneeded list of fd's to close- if users were a list, I'm 
cluttering up process_user's signature specializing for once potential 
source, since the process_users function is designed to take an 
iterable object (any iterable object).

Example above doesn't cover the rather nasty cases where there is N 
func calls involved between where the fd is opened, and the actual 
spawn call.  The worst scenario is where there is >1 thread running- 
you're stuck using globals there, which is pretty ugly.

It's pretty much best practice to close what you don't explicitly need 
when handling total control over to another process, good daemonizing code 
does the same thing due to the reasons I raised above.


> By the way, this method does not allow having a pipe open to a child process 
> other than stdin, stdout or stderr.

It does; the portage_exec that is in stable is the modified version I 
created for ebd, which uses this exact functionality.

line #153 of trunk/pym/ebuild.py, example call-

self.pid = spawn_func(self.ebd+" daemonize", fd_pipes={0:0, 1:1, 2:2, 
        3:cread, 4:dwrite}, returnpid=True,env=env, *args, **spawn_opts)[0]

Diffing between stable and trunk portage_exec, no changes affect fd 
handling (not surprising, the code is nasty without your reworking), 
so please clarify.

Addendum though... where the implicit fd closing breaks down is in 
usage of spawn_func, which is a trunk func only, used for implementing 
parallel-fetch, due to issues with insuring that the fetching process 
isn't screwing with the normal sequential buildplan execution (nice 
way of saying it was something of a hack).  Plus side, seperate 
process, it's not bound by the gil, so it sidestepped that trickery.


> > The original code is ugly, and knowing a bit more this time around I'd
> > write it a bit differently, that said the protections/behaviour
> > originally in it should be left- it's more robust, and is expected
> > now.
> 
> I disagree here. Presently, other than the execve call, every try/except 
> block 
> in portage_exec.py has bugs that are hidden by the "protections".

Original in this case was referencing the fd_pipes additions, not the 
true original spawn func :)

Most of that code could be converted over to OSError catches anyways, 
although doing so requires testing the piss out of it to make sure no 
regressions are introduced (shouldn't be possible, but don't care to 
screw up the subsystem).

Looking through the 2.4 subprocess module (which came along after the 
mods in question), adding an option controlling the fd cleansing would 
make sense, as long as the default option is to close the fds.

That's subjective, but the examples I gave above show (imo at least) 
why this is needed; it's anal and *will* expose bugs rather then 
letting them lie, but that's better from where I sit.
~harring

Attachment: pgpYLAPM6yGKz.pgp
Description: PGP signature

Reply via email to