Michael Shapiro wrote:
> You can use the /proc PCSTOP and PCRUN directives
> for this.

Thanks; that appears to be what I need. But the man pages and source code are confusing me. While reading them I feel like a rat in a maze. Every time I read part of something to learn how it works, I bring preconceived notions about how the things it depends on work; those notions turn out to be wrong, and I have to go read about those other things. Repeat ad infinitum.

For example, I tried to decipher how SIGSTOP, SIGCONT, PCSTOP, and PCRUN interact when they are used together, including when and how they set various statuses, and I failed utterly. I get the impression that it's not possible to comprehend the answer without also learning how the entire kernel works.
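For the signal half of that question only, here is a minimal sketch in Python on a POSIX system. It shows SIGSTOP and SIGCONT changing a child's state and the parent observing each change through waitpid(). The function name is hypothetical, and the sketch does not touch /proc, PCSTOP, PCRUN, or pr_why at all.

```python
import os
import signal
import time


def stop_and_resume_child():
    """Stop a child with SIGSTOP, resume it with SIGCONT, and return
    the state changes the parent observed through waitpid()."""
    pid = os.fork()
    if pid == 0:
        # Child: idle until the parent kills it; never return to the caller.
        try:
            while True:
                time.sleep(0.1)
        finally:
            os._exit(0)

    events = []

    # WUNTRACED makes waitpid() report a child that has stopped.
    os.kill(pid, signal.SIGSTOP)
    _, status = os.waitpid(pid, os.WUNTRACED)
    if os.WIFSTOPPED(status):
        events.append(("stopped", signal.Signals(os.WSTOPSIG(status)).name))

    # WCONTINUED makes waitpid() report a child resumed by SIGCONT.
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, os.WCONTINUED)
    if os.WIFCONTINUED(status):
        events.append(("continued", None))

    # Clean up the child and reap it.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return events


if __name__ == "__main__":
    print(stop_and_resume_child())
```

This covers only what a parent can see with plain job-control signals; how a /proc stop directive combines with a signal stop is exactly the part it leaves open.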
It appears that SIGSTOP sets lwpstatus.pr_why to PR_SIGNALLED, whereas PCSTOP and PCDSTOP set it to PR_REQUESTED, and that these are mutually exclusive. But the man page says:

"If PCSTOP or PCDSTOP is applied to a thread that is stopped, but not because of an event of interest, the stop directive takes effect when the thread is restarted by the competing mechanism; at that time the thread enters a PR_REQUESTED stop before executing any user-level code."

I don't see where that pending stop directive is recorded. For example, if the scheduler is the competing mechanism and has set lwpstatus.pr_why to PR_SIGNALLED, and then I apply PCDSTOP, and then the scheduler tries to set the lwp runnable, it has to know somehow that I previously applied PCDSTOP, so that it can set lwpstatus.pr_why to PR_REQUESTED instead of setting the lwp runnable.

While trying to figure out how, I notice that the flags available for lwpstatus.pr_flags include various flavors of stopped, but no runnable. So there's another wrong preconceived notion: that the same structure which contains a flag to mark an lwp as stopped would also contain a flag to mark it as runnable. Then I try to read prcontrol.c/pr_stop(), and after a few minutes I just give up, even though it's only 64 lines long. This isn't a comment on the quality of the Solaris source code, just a comment that it seems futile to try to understand any part of the code without understanding all of it.

> > And if there is such a mechanism, then is there a way to tell the kernel's
> > scheduler to use it instead of sigstop and sigcont for a particular process,
> > so that the process thinks that it runs without ever being preempted by the
> > scheduler?
>
> Not sure what you're trying to do,

The sigcont signal sent by the scheduler is a side channel via which some information about system timing can leak to a process which is not authorized to have such information.
The system can deny the process access to system clocks and timers and the network so that the process can't time its own execution, but the process could still gain some (limited) timing information by using the arrival of sigcont signals as a crude timer. So in order to deny the process access to any timing information, the system must refrain from sending it such signals when it's scheduled to run.

> but I could imagine some inventive use of
> DTrace that would achieve this: DTrace's stop()
> action is equivalent to a PCSTOP, so you could write
> a DTrace script to hit a probe either on
> descheduling a process with particular attributes or
> every so often, hit it with a stop, and then have
> another control process later wake it up.

I don't understand how this would achieve my goal. I want the process to run as usual and to be periodically preempted as usual so that other processes can run concurrently, but I want the preemption to be completely undetectable, so that the process thinks it runs continuously in one contiguous timeslice and can't correlate individual timeslices with its own execution progress.

> you need to make sure you've got a shell which
> supports it or use /bin/kill. Many shells have kill
> built-ins which may not support the exact same syntax.

That's what my problem was; both /bin/sh and /bin/bash have a built-in kill. Not only does their syntax differ from that of /bin/kill, it even differs between the two shells. Somehow it seems wrong to have redundant functionality with the same name in the same place (the shell's command namespace) with subtly different syntax. This is a feature which I would be proud to include in a user interface if the goal were intentional obfuscation. Naturally, the shells' built-in kill commands are documented in the man pages, so technically I ought not to have been confused if I'd just bothered to RTFM.

> This is effectively what gcore() does [snip]

Thanks again; that's what I need.
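To make the side-channel concern above concrete, here is a minimal sketch in Python on a POSIX system of a process treating SIGCONT arrivals as ticks. It assumes nothing about when a kernel generates SIGCONT; the parent sends the signals explicitly. The function name and the pipe handshake are hypothetical illustration, not anything from Solaris.

```python
import os
import signal
import time


def sigcont_ticks_seen_by_child(cycles=3):
    """Stop and resume a child `cycles` times and return how many
    SIGCONT arrivals the child itself reported noticing."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report every SIGCONT it catches, like a crude clock tick.
        os.close(read_fd)

        def on_cont(signum, frame):
            os.write(write_fd, b"c")

        signal.signal(signal.SIGCONT, on_cont)
        os.write(write_fd, b"r")  # tell the parent the handler is installed
        try:
            while True:
                # Short sleeps so a pending handler always runs promptly.
                time.sleep(0.01)
        finally:
            os._exit(0)

    os.close(write_fd)
    ticks = 0
    if os.read(read_fd, 1) == b"r":  # wait for the child's handler
        for _ in range(cycles):
            os.kill(pid, signal.SIGSTOP)
            os.waitpid(pid, os.WUNTRACED)  # wait until the child has stopped
            os.kill(pid, signal.SIGCONT)
            # Wait for the child's report before the next stop; a new stop
            # signal discards a SIGCONT that is still pending.
            if os.read(read_fd, 1) == b"c":
                ticks += 1

    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    os.close(read_fd)
    return ticks


if __name__ == "__main__":
    print(sigcont_ticks_seen_by_child())
```

The child needs no clock to count the resumptions; catching the signal is enough, which is why the stop/resume mechanism itself would have to be invisible to the process.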
This message posted from opensolaris.org
_______________________________________________
opensolaris-code mailing list
https://opensolaris.org:444/mailman/listinfo/opensolaris-code
