On Tue, 15 Feb 2011 08:39:10 +0100 Matthieu wrote:
> Hi,
>
> I come a little bit late on that but I would like to add that I agree with
> Don on that.
>
> IMHO, modifying such a behavior is not really great. There are more scenarios
> where salloc is executing a non-interactive command (salloc/mpirun) in
> background than scenarios where it is running a particular shell
> interactively _starting_ in background. If it is mandatory to have this
> behavior for interactive applications I would rather have a new option for
> salloc to use this mode than making it the default. At least for me, 99% of
> the backgrounded salloc are made for mpirun executions in non-regression
> tests, and I am not sure that my users will be happy to rewrite all their
> scripts or python programs that launch concurrent salloc/mpirun using
> threads (automatically set in background by python).
>
So far you are saying the same as Don. I have disagreed with neither your nor Don's comments, but apart from claiming that it "broke" things, there have been no constructive suggestions on how to make this better.
The change affects solely jobs started in the background; for foreground processes the behaviour is not "broken" (it did not change). If you can be sure that the job is indeed non-interactive, it can still be started in the background, but apparently there is a large number of scripts that resist any changes through e.g. perl or sed.

But if the job is interactive, starting it in the background will result in a hanging salloc, as per the reply to Mark. There is no way of bringing such a job into the foreground within salloc; the session is doomed to fail. That case is indeed broken, and this is not limited to the question of background or not. We had a user running gdb in this way; the lack of job control in salloc led to him having to use skill/scancel to clean up the hung session.

> Would it be possible to make it configurable ?
>
But how do you intend that? As per the reply to Don, there is no a priori way of telling whether a program is run in interactive mode or not. Program name or commandline flags are not sufficient - a shell can also run in non-interactive mode.

Gerrit

> 2011/2/15 <[email protected]>
>
> > It looks as though I am being outvoted in this, but I would like to make a
> > few more points:
> >
> >    1. The reason I got involved in this was that a rather large Bull
> >    customer has acceptance test script jobs that submit thousands of
> >    "salloc" requests as background jobs. These scripts worked just fine,
> >    and then they were broken by a change that appeared in the final
> >    version of 2.2.0, with no explanation of why salloc is suddenly
> >    restricted to running in the foreground.
> >    2. I understand that there are job control issues with salloc when run
> >    in the background, and that the changes in signal handling that Gerrit
> >    made improve the situation when salloc is run in the foreground by
> >    retaining better control of the job from the terminal, but I disagree
> >    that this is sufficient justification to remove the ability to run
> >    salloc in the background, especially since this change can be
> >    trivially bypassed by using input redirection. All that has been
> >    accomplished is to break the scripts I mentioned above and whatever
> >    else depended on the current behavior of salloc, and force users to
> >    add a "kludge" to obtain the behavior they had before. The customer
> >    scripts above have a legitimate use in invoking "non-interactive"
> >    usage of salloc, as do other examples such as starting an "xterm" on
> >    a SLURM allocation.
> >    3. I disagree that the proposed comments in the code provide
> >    sufficient explanation of this change. The new comments explain that
> >    salloc must be running in the foreground to issue the "tcsetpgrp"
> >    call and run "interactive" subprocesses, but they do not explain the
> >    rationale for disallowing salloc to run in the background when it is
> >    running only "non-interactive" subprocesses.
> >    4. The test case for my patch of submitting an interactive shell as a
> >    background job request is spurious. As Gerrit said, "if starting an
> >    interactive session via salloc, why would a user want to start it in a
> >    stopped state"? The answer is: you wouldn't. If you wanted to run
> >    interactively, you wouldn't add that "&" at the end of your command.
> >    But if you knew that you wanted to run something that would run
> >    "non-interactively", such as an "mpirun" or an "xterm", why would you
> >    not want to be able to add that "&", and free up your terminal or
> >    script for other commands? As Mark noted previously, if users
> >    inadvertently try to run jobs that need to be interactive in the
> >    background, they should fairly quickly learn that it isn't a good
> >    idea, whether under salloc or just a normal shell.
> >
> > Ok, I've had my say. I will rest my case now.
> >
> > -Don Albert-
> >
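
For anyone following the foreground/"tcsetpgrp" point in item 3 and the remark about input redirection in item 2, here is a minimal sketch of how a process can tell whether it owns the terminal. It is written in Python for brevity and is not salloc's code; terminal_state is a hypothetical helper name, and the three labels it returns are made up for this illustration.

```python
import os


def terminal_state(fd=0):
    """Classify how this process relates to the terminal on descriptor fd.

    Returns one of three illustrative labels:
      "no-tty"      fd is not a terminal (redirected from a file or pipe)
      "foreground"  our process group is the terminal's foreground group
      "background"  fd is our terminal, but another group is in the foreground
    """
    if not os.isatty(fd):
        return "no-tty"
    try:
        # Process group currently in the foreground on this terminal.
        fg_pgrp = os.tcgetpgrp(fd)
    except OSError:
        # fd is a terminal, but not this process's controlling terminal.
        return "no-tty"
    return "foreground" if fg_pgrp == os.getpgrp() else "background"


if __name__ == "__main__":
    print(terminal_state(0))
```

Started from an interactive shell with job control, this should report "foreground" when run normally, "background" when started with "&", and "no-tty" when started with "< /dev/null &". The last case is consistent with Don's observation that a check tied to the terminal on stdin no longer applies once stdin is redirected. Note that none of the three answers says whether the command itself is interactive, which is the point Gerrit makes above.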
