On Wed, 8 Nov 2006, William Stein wrote:

> On Wed, 08 Nov 2006 07:27:39 -0800, Gonzalo Tornaria
> <[EMAIL PROTECTED]> wrote:
>
>> On Wed, Nov 08, 2006 at 02:23:18AM -0800, William Stein wrote:
>>> Good question. Longterm there are a couple of issues:
>>>
>>> (1) How do you tell the monitor about new processes that get spawned?
>>>     You could put that info in a temp file, but that feels a little
>>>     clunky.
>>
>> You can read from a pipe or some other form of IPC (e.g. unix socket)
>
> Yes, that could work.

This is what I was envisioning too.

>
>>> (3) I want to continue to support the spawned processes running on
>>>     other computers (or as different users) via ssh. With a separate
>>>     monitor for each spawned process this is possible (though two ssh
>>>     sessions would be needed). This isn't possible if there is only
>>>     one monitor, since it can only run on one computer.
>>
>> Just run one monitor for each host.
>
> OK, but then what about question 1 again? In particular, telling a monitor
> over the network about new processes would be complicated.

I don't think the monitor would have to run on the client computer.
Rather if I sent several jobs to, say, sage.math, there could be a
single monitor with an ssh to sage.math that monitors all of them. Then
there would be one monitor per host.

>
>>> (4) For reasons I don't understand, the slave process doesn't really
>>>     die until the monitor exits. If in the monitor script instead of
>>>     doing a sys.exit(0) after the kill, I continue the monitor
>>>     running, then the process the monitor is watching doesn't
>>>     terminate as it should. This is on OS X Intel, and is rather
>>>     odd, but isn't an issue with the 1-monitor per process model.
>>
>> See wait(2) and waitpid(2) (and also wait3 and wait4 for resource
>> information).
>>
>> Essentially, when a process dies, it stays in "zombie" state so that
>> one can (a) get exit status (b) get resource usage information (c)
>> dump core IIRC, etc.
>>
>> The usual trick to spawn e.g. a daemon is to fork / setsid(2) / fork,
>> run the process in question as a grandchild, and let the child die;
>> because of the setsid(2) call, the process is not adopted by its
>> grandparent, but by the init process, which is supposed to clean up on
>> exit of any process. [ See also setsid(8) ]
>>
>> Since we are talking about a monitor, the sensible thing is that the
>> monitor waits for all its subprocesses.
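
For what it's worth, here is a minimal sketch of the zombie behaviour described above, assuming Python 3 on a POSIX system. The names pid_exists and kill_and_reap are made up for illustration and are not part of any existing monitor script. It shows that a killed child stays in the process table until its parent calls waitpid; only the parent can do this, so a monitor that is merely a sibling of the process cannot reap it.

```python
import os
import signal
import time


def pid_exists(pid):
    """Return True if pid is still in the process table (running or zombie)."""
    try:
        os.kill(pid, 0)      # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    return True


def kill_and_reap():
    """Fork a child, SIGTERM it, and reap it from the parent with waitpid."""
    pid = os.fork()
    if pid == 0:
        time.sleep(60)       # child: stand-in for a gap/gp/magma process
        os._exit(0)
    os.kill(pid, signal.SIGTERM)
    before = pid_exists(pid)             # still listed: running or zombie
    _, status = os.waitpid(pid, 0)       # parent collects the exit status
    after = pid_exists(pid)              # entry is removed once reaped
    return before, after, status


before, after, status = kill_and_reap()
print(before, after, os.WIFSIGNALED(status), os.WTERMSIG(status))
```

If the parent never calls waitpid, the entry lingers until the parent itself exits, at which point init adopts and reaps the child.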
>
> One point that might not have been clear from my previous posting
> is that the monitor does not have any subprocesses. The gap/gp/magma,
> etc., process that it monitors is a sibling rather than a subprocess.
>
>> In addition, the monitor can get information about resource usage,
>> which could be interesting.
>
> Yes.
>
>>> (5) The overhead is minimal -- it really is only 2MB to run a minimal
>>>     Python process.
>>
>> However small, it's still O(n).
>
> Yes but for a typically running SAGE program n is about 3-4, at most.
> There's no reason in SAGE to launch numerous subprocesses.

Maybe I'm just not used to the idea of having 64GB of RAM yet, but 2MB,
though it won't bring a system to its knees, doesn't seem tiny to me.
Maybe it is because of orphan processes, but I often see more than 3-4
sage processes running at a time. E.g., one might have a notebook with
several worksheets (which each have their own sage subprocess) which may
have, say, magma and mathematica and maxima running.

>
>> BTW, isn't it better to kill -15 first, wait some time, then kill -9
>> (give a chance to cleanup in case there is a SIGTERM handler)
>
> Yes. Good point.
>
> Many thanks for your email.
>
> Anyway, this process monitor thing is a completely general purpose
> unix tool. It really a priori has nothing to do with SAGE. Most
> of the suggestions on the list are to turn it from what I wrote
> into a generic daemon. I wonder -- has such a generic daemon for
> process monitoring *already* been written and I just don't know
> about it?
>
> William
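
On the "kill -15 first, then kill -9" point, a rough sketch of what that could look like, assuming Python 3 and assuming the caller is the parent of the process (so it can wait on it). The names stop_process and spawn, and the grace periods, are hypothetical choices for illustration. A sibling monitor would have to use os.kill on the pid and poll instead of waiting.

```python
import signal
import subprocess
import sys


def stop_process(proc, grace=5.0):
    """Send SIGTERM, wait up to `grace` seconds, then fall back to SIGKILL."""
    proc.terminate()                     # SIGTERM, i.e. kill -15
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()                      # SIGKILL, i.e. kill -9
        return proc.wait()


def spawn(ignore_sigterm):
    """Start a sleeping child Python process; optionally make it ignore SIGTERM."""
    setup = "signal.signal(signal.SIGTERM, signal.SIG_IGN); " if ignore_sigterm else ""
    code = "import signal, time; " + setup + "print('ready', flush=True); time.sleep(60)"
    proc = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    proc.stdout.readline()               # block until the child has finished its setup
    return proc


# One child that dies on SIGTERM and one that ignores it.
polite = stop_process(spawn(ignore_sigterm=False), grace=2.0)
stubborn = stop_process(spawn(ignore_sigterm=True), grace=0.5)
print(polite, stubborn)
```

On POSIX, Popen reports death by signal as a negative return code, so the two values show which signal actually ended each child.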
