Re: Someone help me understand this...?
On Thu, Aug 28, 2003 at 11:34:09AM -0400, Robert Watson wrote:

> Clearly, unbreaking applications like Diablo by default is desirable. At least OpenBSD has similar protections to these turned on by default, and possibly other systems as well. As 5.x sees more broad use, we may well bump into other cases where applications have similar behavior: they rely on no special protections once they've given up privilege. I wonder if Diablo can run unmodified on OpenBSD; it could be they don't include SIGALRM on the list of protected-against signals, or it could be that they modify Diablo for their environment to use an alternative signaling mechanism. Another alternative to this patch would simply be to add SIGALRM to the list of acceptable signals to deliver in the privilege-change case.

OpenBSD does not consider a process 'tainted' if it changes credentials while running. From the issetugid(2) manpage:

    The status of issetugid() is only affected by execve().

> In most cases, fail-stop is a reasonable behavior for unexpected security behavior from the system, but ignore is likely to shoot you later. :-) I tend to wrap even kill() calls as uid 0 in an assertion check, just to be on the safe side. If nothing else, it helps detect the case where the other process has died, and you're using a stale pid. It's particularly useful if the other process has died, the pid has been reused, and it's now owned by another user, which is a real-world case where kill() as a non-0 uid can fail even when you're sure it can't :-).

This can be avoided by careful programming: do not use SA_NOCLDWAIT and don't pass pids to kill() after they have been returned by wait() or similar functions. If the process has terminated in between, it's a zombie. In that case, FreeBSD probably returns ESRCH but SUSv3 mandates returning success (but performing no action).
Jilles Tjoelker ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
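The advice in this message (check the result of every kill() call, and treat "no such process" differently from "not permitted") could be sketched roughly as below. This is only an illustration, not code from Diablo or FreeBSD; the names `kill_result` and `checked_kill` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical classification of a kill(2) outcome. */
enum kill_result {
	KILL_OK,		/* kill() returned 0 */
	KILL_NO_PROCESS,	/* ESRCH: target is gone, pid may be stale */
	KILL_DENIED,		/* EPERM: credential or policy restriction */
	KILL_ERROR		/* EINVAL or anything unexpected */
};

/*
 * Send signum to pid and report why it failed, instead of silently
 * ignoring the return value.  signum 0 only performs the permission
 * and existence checks.
 */
enum kill_result
checked_kill(pid_t pid, int signum)
{
	assert(signum >= 0);
	if (kill(pid, signum) == 0)
		return (KILL_OK);
	switch (errno) {
	case ESRCH:
		return (KILL_NO_PROCESS);
	case EPERM:
		return (KILL_DENIED);
	default:
		return (KILL_ERROR);
	}
}
```

A parent that gets KILL_DENIED for one of its own children has most likely hit a policy restriction such as the one discussed in this thread, whereas KILL_NO_PROCESS suggests it is holding a stale pid and should drop it from its table.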
Re: Someone help me understand this...?
On Sat, 30 Aug 2003, Jilles Tjoelker wrote:

> On Thu, Aug 28, 2003 at 11:34:09AM -0400, Robert Watson wrote:
> > Clearly, unbreaking applications like Diablo by default is desirable. At least OpenBSD has similar protections to these turned on by default, and possibly other systems as well. As 5.x sees more broad use, we may well bump into other cases where applications have similar behavior: they rely on no special protections once they've given up privilege. I wonder if Diablo can run unmodified on OpenBSD; it could be they don't include SIGALRM on the list of protected-against signals, or it could be that they modify Diablo for their environment to use an alternative signaling mechanism. Another alternative to this patch would simply be to add SIGALRM to the list of acceptable signals to deliver in the privilege-change case.
>
> OpenBSD does not consider a process 'tainted' if it changes credentials while running. From the issetugid(2) manpage:
>
>     The status of issetugid() is only affected by execve().

In OpenBSD, two flags are used to represent the credential change notion: P_SUGIDEXEC and P_SUGID. issetugid() checks the first of these, but signal delivery checks P_SUGID. P_SUGIDEXEC is set during execve(). In FreeBSD, we have a combined notion used by both, since the same protections generally apply. You can find a comment comparing our use of P_SUGID to the OpenBSD approach in our issetugid() implementation:

    /*
     * Note: OpenBSD sets a P_SUGIDEXEC flag set at execve() time,
     * we use P_SUGID because we consider changing the owners as
     * tainting as well.
     * This is significant for procs that start as root and become
     * a user without an exec - programs cannot know *everything*
     * that libc *might* have put in their data segment.
     */

Regarding specific signals: inspection of the OpenBSD implementation reveals that the following signals are permitted in the P_SUGID case, assuming a reasonable credential match:

    case 0:
    case SIGKILL:
    case SIGINT:
    case SIGTERM:
    case SIGALRM:
    case SIGSTOP:
    case SIGTTIN:
    case SIGTTOU:
    case SIGTSTP:
    case SIGHUP:
    case SIGUSR1:
    case SIGUSR2:

In FreeBSD, we permit:

    case 0:
    case SIGKILL:
    case SIGINT:
    case SIGTERM:
    case SIGSTOP:
    case SIGTTIN:
    case SIGTTOU:
    case SIGTSTP:
    case SIGHUP:
    case SIGUSR1:
    case SIGUSR2:

So they permit SIGALRM in addition to the signals we support. In light of this thread, I think it would be reasonable to add SIGALRM to our list as well.

> > In most cases, fail-stop is a reasonable behavior for unexpected security behavior from the system, but ignore is likely to shoot you later. :-) I tend to wrap even kill() calls as uid 0 in an assertion check, just to be on the safe side. If nothing else, it helps detect the case where the other process has died, and you're using a stale pid. It's particularly useful if the other process has died, the pid has been reused, and it's now owned by another user, which is a real-world case where kill() as a non-0 uid can fail even when you're sure it can't :-).
>
> This can be avoided by careful programming: do not use SA_NOCLDWAIT and don't pass pids to kill() after they have been returned by wait() or similar functions. If the process has terminated in between, it's a zombie. In that case, FreeBSD probably returns ESRCH but SUSv3 mandates returning success (but performing no action).

There's still a race possible here; it just becomes narrower with conservative programming. And in the classic use of pids for signalling (/var/run/foo.pid, or kill -9 pid), these approaches won't help.
The only way to close this sort of race is to have a notion of a unique process identifier that lasts beyond the lifetime of the process itself -- i.e., the ability to return EMYSINCERESTREGRESTS if you try to signal a process after it has died, and have a guarantee that the handle won't be reused.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Network Associates Laboratories
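The two signal lists compared in this message can be modelled in userland as a single predicate. The sketch below is not the kernel code; the function name `sugid_signal_check` and the `allow_alrm` switch are hypothetical, and returning EPERM for a refused signal is an assumption about how the restriction shows up to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>

/*
 * Userland model of the P_SUGID signal filter described above.
 * Returns 0 if an unprivileged sender with matching credentials may
 * deliver signum to a process that has changed credentials, or EPERM
 * (assumed) otherwise.  allow_alrm = 0 models the FreeBSD list quoted
 * in the message; allow_alrm = 1 models the OpenBSD list.
 */
int
sugid_signal_check(int signum, int allow_alrm)
{
	assert(signum >= 0);
	switch (signum) {
	case 0:
	case SIGKILL:
	case SIGINT:
	case SIGTERM:
	case SIGSTOP:
	case SIGTTIN:
	case SIGTTOU:
	case SIGTSTP:
	case SIGHUP:
	case SIGUSR1:
	case SIGUSR2:
		return (0);
	case SIGALRM:
		/* The only difference between the two lists. */
		return (allow_alrm ? 0 : EPERM);
	default:
		return (EPERM);
	}
}
```

With `allow_alrm` set to 0, SIGALRM falls into the refused class, which matches the failure Diablo's parent process sees when it signals its children.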
Re: Someone help me understand this...?
On Sat, 30 Aug 2003 12:23:35 -0400 (EDT), Robert Watson [EMAIL PROTECTED] said:

> The only way to close this sort of race is to have a notion of a unique process identifier that lasts beyond the lifetime of the process itself -- i.e., the ability to return EMYSINCERESTREGRESTS if you try to signal a process after it has died, and have a guarantee that the handle won't be reused.

This is traditionally done by holding an advisory lock on the pid file; if the file is no longer locked, then the process holding the lock must have exited. You could also do it with UUIDs and a more heavyweight signal API.

-GAWollman
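The pid-file locking convention mentioned here might look roughly like the following, using flock(2). This is a sketch under assumptions: the helper names `pidfile_acquire` and `pidfile_is_held` are hypothetical, and a real daemon would need to decide how to handle forked children, which inherit the descriptor and therefore keep the lock alive.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Daemon side: create the pid file, take an exclusive advisory lock
 * and record our pid.  The descriptor must stay open for the life of
 * the daemon; the lock goes away when every copy of it is closed.
 * Returns the open descriptor, or -1 on failure.
 */
int
pidfile_acquire(const char *path)
{
	char buf[32];
	int fd, len;

	assert(path != NULL);
	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd == -1)
		return (-1);
	if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
		close(fd);	/* another instance already holds the lock */
		return (-1);
	}
	len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
	if (ftruncate(fd, 0) == -1 || write(fd, buf, (size_t)len) != (ssize_t)len) {
		close(fd);
		return (-1);
	}
	return (fd);
}

/*
 * Client side: 1 if the file is locked (the daemon is alive), 0 if it
 * is unlocked (the daemon has exited), -1 on error.
 */
int
pidfile_is_held(const char *path)
{
	int fd, held;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (-1);
	if (flock(fd, LOCK_SH | LOCK_NB) == -1)
		held = (errno == EWOULDBLOCK) ? 1 : -1;
	else
		held = 0;
	close(fd);	/* also drops the shared lock if we obtained it */
	return (held);
}
```

Checking the lock before reading the pid tells the client that the recorded pid still belongs to a live daemon at that instant. A small window remains between the check and the subsequent kill(), so this narrows the race rather than closing it.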
Re: Someone help me understand this...?
On Thu, 28 Aug 2003, Joe Greco wrote:

> > > > Could you confirm this happens with 4.8? The access control checks there are substantially different, and I wouldn't expect the behavior you're seeing on 4.8...
> > >
> > > Rather difficult. I'll see if the client will let me trash a production system, but usually people don't like $40K servers handing out a few hundred megabits of traffic going out of service. We were trying to fix it on the scratch box (which happens to have 5.1R on it) and then were going to see how it fared on the production systems.
> >
> > I think it's safe to assume that if you're seeing a similar failure, there's a different source given my reading of the code, but I'm willing to be proven wrong. It's probably not worth the investment if you're talking about large quantities of money, though.
>
> It's more like large quantities of annoyance and work. Can you describe the case you're envisioning? If I can easily poke at it, I can at least get some clues.

I guess all I'm looking for is confirmation that your original statement (happens in 4.8 and 5.1) is completely correct: the 5.1 behavior is expected, but I'm surprised it happens with 4.8.

> Correct. The USR signals control debug levels. If it was a signal that was only used internally, it could be changed, of course, but changing a signal used by humans (and one used in the same manner as other programs) is probably a bad idea.

Try the patch attached, which introduces both the conservative_signals sysctl, and adds SIGALRM to the list of acceptable signals for P_SUGID processes.

> Yeah, if anything, we probably don't want to do that, because the resources set up as root are usually more attractive. I don't have a problem with coding in some FreeBSD-isms, but I don't see it as buying us anything, does it?
I'm not sure there are explicit benefits in this specific situation, except that you can run Diablo with the resource limits of the user you configure, and potentially those might be similar to (but perhaps not identical to) those given to root. I.e., instead of hard-coding "use the resource limits of root", you're saying "use the resource limits of the user Diablo is run as, and set those to what you want". Given that heavy-weight news servers are likely to be dedicated machines, it's a subtle but perhaps useful semantic difference.

Updated patch below.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Network Associates Laboratories

Index: kern_prot.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_prot.c,v
retrieving revision 1.175
diff -u -r1.175 kern_prot.c
--- kern_prot.c	13 Jul 2003 01:22:20 -0000	1.175
+++ kern_prot.c	30 Aug 2003 19:45:50 -0000
@@ -1367,6 +1367,20 @@
 	return (cr_cansee(td->td_ucred, p->p_ucred));
 }
 
+/*
+ * 'conservative_signals' prevents the delivery of a broad class of
+ * signals by unprivileged processes to processes that have changed their
+ * credentials since the last invocation of execve().  This can prevent
+ * the leakage of cached information or retained privileges as a result
+ * of a common class of signal-related vulnerabilities.  However, this
+ * may interfere with some applications that expect to be able to
+ * deliver these signals to peer processes after having given up
+ * privilege.
+ */
+static int	conservative_signals = 1;
+SYSCTL_INT(_security_bsd, OID_AUTO, conservative_signals, CTLFLAG_RW,
+    &conservative_signals, 0, "Unprivileged processes prevented from "
+    "sending certain signals to processes whose credentials have changed");
 /*-
  * Determine whether cred may deliver the specified signal to proc.
  * Returns: 0 for permitted, an errno value otherwise.
@@ -1399,12 +1413,13 @@
 	 * bit on the target process.  If the bit is set, then additional
 	 * restrictions are placed on the set of available signals.
 	 */
-	if (proc->p_flag & P_SUGID) {
+	if (conservative_signals && (proc->p_flag & P_SUGID)) {
 		switch (signum) {
 		case 0:
 		case SIGKILL:
 		case SIGINT:
 		case SIGTERM:
+		case SIGALRM:
 		case SIGSTOP:
 		case SIGTTIN:
 		case SIGTTOU:
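If a kernel carries the patch above, a daemon could query the knob at startup, along the lines Joe suggested earlier in the thread. The sketch below assumes the sysctl ends up named security.bsd.conservative_signals (derived from the SYSCTL_INT declaration in the patch) and uses the FreeBSD sysctlbyname(3) call; the function name `conservative_signals_enabled` is hypothetical, and on other systems it simply reports "unknown".

```c
#include <assert.h>
#include <stddef.h>
#if defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Returns 1 if the restrictive P_SUGID signal checks are enabled,
 * 0 if they are disabled, and -1 if the state cannot be determined
 * (kernel without the patch, or not FreeBSD at all).
 */
int
conservative_signals_enabled(void)
{
#if defined(__FreeBSD__)
	int value = 0;
	size_t len = sizeof(value);

	/* Name assumed from the SYSCTL_INT() declaration in the patch. */
	if (sysctlbyname("security.bsd.conservative_signals",
	    &value, &len, NULL, 0) == -1)
		return (-1);
	assert(len == sizeof(value));
	return (value != 0);
#else
	return (-1);
#endif
}
```

A daemon that depends on SIGALRM between its own processes could log a warning when this returns 1 on a kernel whose permitted list does not include SIGALRM, rather than failing silently later.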
Re: Someone help me understand this...?
On Wed, 27 Aug 2003, Joe Greco wrote:

> I've got a weirdness with kill(2).
>
> This code is out of Diablo, the news package, and has been working fine for some years. It apparently works fine on other OS's.
>
> In the Diablo model, the parent process may choose to tell its children to update status via a signal. The loop basically consists of going through and issuing a SIGALRM.
>
> This stopped working a while ago, don't know precisely when. I was in the process of debugging it today and ran into this. The specific OS below is 5.1-RELEASE but apparently this happens on 4.8 as well.

Perhaps the children are setuid, the parent doesn't have appropriate privilege and you are mistaken about this happening under 4.8 as well. In 5.x since at least rev.1.80 of kern_prot.c, only certain signals not including SIGALRM can be sent from unprivileged processes to setuid processes. This is very UN-unixlike although it is permitted as an implementation-defined restriction in at least POSIX.1-2001. It breaks^Wexposes bugs in some old POSIX test programs and I don't have many security concerns so I just disable it locally:

%%%
Index: kern_prot.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_prot.c,v
retrieving revision 1.175
diff -u -2 -r1.175 kern_prot.c
--- kern_prot.c	13 Jul 2003 01:22:20 -0000	1.175
+++ kern_prot.c	17 Aug 2003 04:26:00 -0000
@@ -1395,4 +1387,5 @@
 		return (error);
 
+#if 0
 	/*
 	 * UNIX signal semantics depend on the status of the P_SUGID
@@ -1425,4 +1418,5 @@
 		}
 	}
+#endif
 
 	/*
%%%

> Wot? Why can't I send it a signal?
>
> I've read kill(2) rather carefully and cannot find the reason. It says,
>
>     For a process to have permission to send a signal to a process designated by pid, the real or effective user ID of the receiving process must match that of the sending process or the user must have appropriate privileges (such as given by a set-user-ID program or the user is the super-user).

The implementation-defined restrictions are not documented, of course ;-).

> Well, the sending and receiving processes both clearly have equal uid/euid.
>
> We're not running in a jail, so I don't expect any issues there.
>
> The parent process did actually start as root and then shed privilege with
>
>     struct passwd *pw = getpwnam("news");
>     struct group *gr = getgrnam("news");
>     gid_t gid;
>
>     if (pw == NULL) {
>         perror("getpwnam('news')");
>         exit(1);
>     }
>     if (gr == NULL) {
>         perror("getgrnam('news')");
>         exit(1);
>     }
>     gid = gr->gr_gid;
>     setgroups(1, &gid);
>     setgid(gr->gr_gid);
>     setuid(pw->pw_uid);
>
> so that looks all well and fine... so why can't it kill its own children, and why can't I kill one of its children from a shell with equivalent uid/euid?

Changing the ids is one way to make the process setuid (setuid-on-exec is another but that doesn't seem to be the problem here). The relevant setuid bit (P_SUGID) is normally cleared on exec, but perhaps it isn't here, either because the children don't exec or the effective ids don't match the real ids at the time of the exec.

> I know there's been some paranoia about signal delivery and all that, but my searching hasn't turned up anything that would explain this. Certainly the manual page ought to be updated if this is a new expected behaviour or something... at least some clue as to why it might fail would be helpful.

Certainly. It is incomplete even not counting complications for jails or other implementation-defined restrictions related to appropriate privilege.

Bruce
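The privilege-shedding sequence quoted in this message ignores the return values of setgroups(), setgid() and setuid(). A more defensive variant might look like the sketch below. It is not Diablo's code; the function name `drop_privileges` is hypothetical, and the final "can we get root back?" probe is an extra precaution that the original does not perform.

```c
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Permanently switch to the given user and group.  Returns 0 on
 * success and -1 on any failure.  Order matters: the supplementary
 * groups and the gid must be changed while still root, so setuid()
 * comes last.
 */
int
drop_privileges(const char *user, const char *group)
{
	struct passwd *pw;
	struct group *gr;
	uid_t uid;
	gid_t gid;

	assert(user != NULL && group != NULL);
	if ((pw = getpwnam(user)) == NULL)
		return (-1);
	uid = pw->pw_uid;	/* copy: pw points to static storage */
	if ((gr = getgrnam(group)) == NULL)
		return (-1);
	gid = gr->gr_gid;

	if (setgroups(1, &gid) == -1 || setgid(gid) == -1 ||
	    setuid(uid) == -1)
		return (-1);

	/* Extra check (not in the original): root must be unreachable. */
	if (uid != 0 && setuid(0) == 0)
		return (-1);
	return (0);
}
```

As the reply explains, on FreeBSD 5.x a successful credential change of this kind is exactly what marks the process, and the children it later forks, with P_SUGID, so checking the return values does not avoid the signal restriction; it only ensures the drop really happened.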
Re: Someone help me understand this...?
On Wed, 27 Aug 2003, Joe Greco wrote: I've got a weirdness with kill(2). This code is out of Diablo, the news package, and has been working fine for some years. It apparently works fine on other OS's. In the Diablo model, the parent process may choose to tell its children to update status via a signal. The loop basically consists of going through and issuing a SIGALRM. This stopped working a while ago, don't know precisely when. I was in the process of debugging it today and ran into this. The specific OS below is 5.1-RELEASE but apparently this happens on 4.8 as well. Perhaps the children are setuid, the parent doesn't have appropriate privelege and you are mistaken about this happening under 4.8 as well. Well, the parent process does the code I listed below early on in the initialization process - it pretty much does a little initialization, opens its socket (119), sheds privilege, and begins accepting conns. It then forks off processes for each connection. The program itself is not a suid executable, but rather relies on being launched by a root user. I'm not sure what appropriate privilege would be. If both the uid/euid of the parent match both the uid/euid of the child, I would expect to be able to kill the process... Client complains about the resulting problems also happening under 4.8 servers. Dunno. Could possibly be a separate issue. In 5.x since at least rev.1.80 of kern_prot.c, only certain signals not including SIGALRM can be sent from unprivileged processes to setuid processes. This is very UN-unixlike although it is permitted as an-implementation- defined restriction in at least POSIX.1-2001. 
It breaks^Wexposes bugs in some old POSIX test programs and I don't have many security concerns so I just disable it locally: %%% Index: kern_prot.c === RCS file: /home/ncvs/src/sys/kern/kern_prot.c,v retrieving revision 1.175 diff -u -2 -r1.175 kern_prot.c --- kern_prot.c 13 Jul 2003 01:22:20 - 1.175 +++ kern_prot.c 17 Aug 2003 04:26:00 - @@ -1395,4 +1387,5 @@ return (error); +#if 0 /* * UNIX signal semantics depend on the status of the P_SUGID @@ -1425,4 +1418,5 @@ } } +#endif /* %%% Wot? Why can't I send it a signal? I've read kill(2) rather carefully and cannot find the reason. It says, For a process to have permission to send a signal to a process designated by pid, the real or effective user ID of the receiving process must match that of the sending process or the user must have appropriate privileges (such as given by a set-user-ID program or the user is the super-user). The implementation-defined restrictions are not documented, of course ;-). Well, the sending and receiving processes both clearly have equal uid/euid. We're not running in a jail, so I don't expect any issues there. The parent process did actually start as root and then shed privilege with struct passwd *pw = getpwnam(news); struct group *gr = getgrnam(news); gid_t gid; if (pw == NULL) { perror(getpwnam('news')); exit(1); } if (gr == NULL) { perror(getgrnam('news')); exit(1); } gid = gr-gr_gid; setgroups(1, gid); setgid(gr-gr_gid); setuid(pw-pw_uid); so that looks all well and fine... so why can't it kill its own children, and why can't I kill one of its children from a shell with equivalent uid/euid? Changing the ids is one way to make the process setuid (setuid-on-exec is another but that doesn't seem to be the problem here). The relevant setuid bit (P_SUGID) is normally cleared on exec, but perhaps it isn't here, either because the children don't exec or the effective ids don't match the real ids at the time of the exec. 
The children aren't spawned via exec, but clearly they have equal uid/euid's. So what you're saying, I guess, is it's not supposed to work. I guess I'm a bit confused by the logic of this. I've seen numerous forking daemons over the years that do this sort of thing (not to mention I've written a number). I've always viewed shedding root privs as being a good thing... Was it really intended to break things in this manner? I know there's been some paranoia about signal delivery and all that, but my searching hasn't turned up anything that would explain this. Certainly the manual page ought to be updated if this is a new expected behaviour or something... at least some clue as to why it might fail would be helpful. Certainly. It is incomplete even not counting complications for jails or other implementation-defined restrictions related to appropriate privilege. Sigh. Thanks for the note, ... JG -- Joe Greco - sol.net Network
Re: Someone help me understand this...?
On Wed, 27 Aug 2003, Joe Greco wrote:

> The specific OS below is 5.1-RELEASE but apparently this happens on 4.8 as well.

Could you confirm this happens with 4.8? The access control checks there are substantially different, and I wouldn't expect the behavior you're seeing on 4.8...

> ...
>
> Well, the sending and receiving processes both clearly have equal uid/euid.
>
> We're not running in a jail, so I don't expect any issues there.
>
> ...
>
> The parent process did actually start as root and then shed privilege with
>
>     struct passwd *pw = getpwnam("news");
>     struct group *gr = getgrnam("news");
>     gid_t gid;
>
>     if (pw == NULL) {
>         perror("getpwnam('news')");
>         exit(1);
>     }
>     if (gr == NULL) {
>         perror("getgrnam('news')");
>         exit(1);
>     }
>     gid = gr->gr_gid;
>     setgroups(1, &gid);
>     setgid(gr->gr_gid);
>     setuid(pw->pw_uid);
>
> so that looks all well and fine... so why can't it kill its own children, and why can't I kill one of its children from a shell with equivalent uid/euid?
>
> I know there's been some paranoia about signal delivery and all that, but my searching hasn't turned up anything that would explain this. Certainly the manual page ought to be updated if this is a new expected behaviour or something... at least some clue as to why it might fail would be helpful.

The man page definitely needs to be updated, but I think it's worth having a conversation about whether the current behavior is too conservative first...

These changes come in response to a class of application vulnerabilities relating to the delivery of unexpected signals. The reason the process in question is being treated as special from an access control perspective is that it has undergone a credential change, resulting in the setting of the process P_SUGID bit. This bit remains set even if the remaining credentials of the process appear normal -- i.e., even if ruid==euid and rgid==egid -- and can only be reset by calling execve() on a normal binary, which is considered sufficient to flush the state of the process.
These processes are given special protection properties because they almost always have cached access to memory or resources acquired using the original credential. For example, if the process accesses the password file while holding root privilege, it may well have password hashes in memory from reading the shadow password file -- in fact, it likely even has a file descriptor to the shadow password file still open. The same P_SUGID flag is used to protect against unprivileged debugging of applications that have changed credentials and now appear normal. P_SUGID is also used to determine the results of the issetugid() system call, which is used by many libraries to see if they are running with (or have run with) privilege and need to behave in a more conservative manner.

I don't remember the details, but there have been at least a couple of demonstrated exploits of vulnerable applications using signals in which setuid applications rely on certain signals (such as SIGALRM, SIGIO, SIGURG) only being delivered as a result of system calls that set up timers, IO, etc. I seem to recall it might have involved a setuid application such as sendmail on OpenBSD, but I'll have to do some googling and get back to you.

These protections probably fall into the same class of conservative behavior as our preventing setuid programs from being started with closed stdin/stdout/stderr descriptors. Giving up privilege without performing an exec() is very difficult in UNIX, unfortunately, since the trappings of privilege may be maintained by libraries, etc, without the knowledge of application writers. Right now, signal delivery in 5.x is pretty conservative if a process has changed credentials, to protect against tampering with a class of applications that has, historically, been vulnerable to a broad variety of exploits.
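The library-side use of issetugid() described above can be illustrated with a small wrapper that refuses to trust the environment in a tainted process. This is a sketch only: `process_is_tainted` and `cautious_getenv` are hypothetical names, and the fallback branch for systems without issetugid() is an assumption that detects only a current id mismatch, not a past credential change.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* 1 if the process is, or may have been, running with privilege. */
int
process_is_tainted(void)
{
#if defined(__FreeBSD__) || defined(__OpenBSD__)
	return (issetugid() != 0);
#else
	/* Assumed fallback: weaker, sees only a current id mismatch. */
	return (getuid() != geteuid() || getgid() != getegid());
#endif
}

/*
 * getenv() variant for library code: ignore user-controlled
 * environment variables when the process is tainted.
 */
const char *
cautious_getenv(const char *name)
{
	assert(name != NULL);
	if (process_is_tainted())
		return (NULL);
	return (getenv(name));
}
```

On FreeBSD, where issetugid() reflects P_SUGID, a daemon that started as root and dropped privilege without an exec would get NULL from `cautious_getenv` for every variable, which is the conservative behaviour the message describes.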
I've attached an (untested) patch that makes this behavior run-time configurable using a sysctl -- when the sysctl is disabled, special-case handling for P_SUGID processes is disabled. I believe that this will cause the problem you're experiencing in 5.x to go away -- please let me know.

Clearly, unbreaking applications like Diablo by default is desirable. At least OpenBSD has similar protections to these turned on by default, and possibly other systems as well. As 5.x sees more broad use, we may well bump into other cases where applications have similar behavior: they rely on no special protections once they've given up privilege. I wonder if Diablo can run unmodified on OpenBSD; it could be they don't include SIGALRM on the list of protected-against signals, or it could be that they modify Diablo for their environment to use an alternative signaling mechanism. Another alternative to this patch would simply be to add SIGALRM to the list of acceptable signals to deliver
Re: Someone help me understand this...?
On Wed, 27 Aug 2003, Joe Greco wrote: The specific OS below is 5.1-RELEASE but apparently this happens on 4.8 as well. Could you confim this happens with 4.8? The access control checks there are substantially different, and I wouldn't expect the behavior you're seeing on 4.8... Rather difficult. I'll see if the client will let me trash a production system, but usually people don't like $40K servers handing out a few hundred megabits of traffic going out of service. We were trying to fix it on the scratch box (which happens to have 5.1R on it) and then were going to see how it fared on the production systems. The man page definitely needs to be updated, but I think it's worth having a conversation about whether the current behavior is too conservative first... These changes come in response to a class of application vulnerabilities relating to the delivery of unexpected signals. The reason the process in question is being treated as special from an access control perspective is that it has undergone a credential change, resulting in the setting of the process P_SUGID bit. This bit remains set even if the remaining credentials of the process appear normal -- i.e., even if ruid==euid, rgid==egid, and can only be reset by calling execve() on a normal binary, which is considered sufficient to flush the state of the process. These processes are given special protection properties because they almost always have cached access to memory or resources acquired using the original credential. For example, the process accesses the password file while holding root privilege, which means that the process may well have password hashes in memory from its reading the shadow password file -- in fact, it likely even have a file descriptor to the shadow password file still open. The same P_SUGID flag is used to prevent against unprivileged debugging of applications that have changed credentials and now appear normal. 
P_SUGID is also used to determine the results of the issetugid() system call, which is used by many libraries to see if they are running with (or have run with) privilege and need to behave in a more conservative manner. Okay, well, that makes good sense. I don't remember the details, but there have been at least a couple of demonstrated exploits of vulnerable applications using signals in which setuid applications rely on certain signals (such as SIGALRM, SIGIO, SIGURG) only being delivered as a result of system calls that set up timers, IO, etc. I seem to recall it might have involved a setuid application such as sendmail on OpenBSD, but I'll have to do some googling and get back to you. These protections probably fall into the same class of conservative behavior as our preventing setuid programs from being started with closed stdin/stdout/stderr descriptors. Giving up privilege without performing an exec() is very difficult in UNIX, unfortunately, since the trappings of privilege may be maintained by libraries, etc, without the knowledge of application writers. Right now, signal delivery in 5.x is pretty conservative if a process has changed credentials, to protect against tampering with a class of applications that has, historically, been vulnerable to a broad variety of exploits. I've attached an (untested) patch that makes this behavior run-time configuration using a sysctl -- when the sysctl is disabled, special-case handling for P_SUGID processes is disabled. I believe that this will cause the problem you're experiencing in 5.x to go away -- please let me know. Well, I'm hoping more for a general fix for Diablo, rather than a special patch for the OS. Clearly, unbreaking applications like Diablo by default is desirable. At least OpenBSD has similar protections to these turned on by default, and possibly other systems as well. 
As 5.x sees more broad use, we may well bump into other cases where applications have similar behavior: they rely on no special protections once they've given up privilege. I wonder if Diablo can run unmodified on OpenBSD; it could be they don't include SIGALRM on the list of protect against signals, or it could be that they modify Diablo for their environment to use an alternative signaling mechanism. Another alternative to this patch would simply be to add SIGARLM to the list of acceptable signals to deliver in the privilege-change case. I wonder if it would be reasonable to have some sort of interface that allowed a program to tell FreeBSD not to set this flag... if not, at least if there was a sysctl, code could be added so that the daemon checked the flag when starting and errored out if it wasn't set. BTW, it's worth noting that the mechanism Diablo is using to give up privilege actually does retain some privileges -- it doesn't, for example, synchronize its resource limits with those of the user it is switching to, so it retains the starting resource limits (likely those of
Re: Someone help me understand this...?
On Thu, 28 Aug 2003, Joe Greco wrote: On Wed, 27 Aug 2003, Joe Greco wrote: The specific OS below is 5.1-RELEASE but apparently this happens on 4.8 as well. Could you confim this happens with 4.8? The access control checks there are substantially different, and I wouldn't expect the behavior you're seeing on 4.8... Rather difficult. I'll see if the client will let me trash a production system, but usually people don't like $40K servers handing out a few hundred megabits of traffic going out of service. We were trying to fix it on the scratch box (which happens to have 5.1R on it) and then were going to see how it fared on the production systems. I think it's safe to assume that if you're seeing a similar failure, there's a different source given my reading of the code, but I'm willing to be proven wrong. It's probably not worth the investment if you're talking about large quantities of money, though. Clearly, unbreaking applications like Diablo by default is desirable. At least OpenBSD has similar protections to these turned on by default, and possibly other systems as well. As 5.x sees more broad use, we may well bump into other cases where applications have similar behavior: they rely on no special protections once they've given up privilege. I wonder if Diablo can run unmodified on OpenBSD; it could be they don't include SIGALRM on the list of protect against signals, or it could be that they modify Diablo for their environment to use an alternative signaling mechanism. Another alternative to this patch would simply be to add SIGARLM to the list of acceptable signals to deliver in the privilege-change case. I wonder if it would be reasonable to have some sort of interface that allowed a program to tell FreeBSD not to set this flag... if not, at least if there was a sysctl, code could be added so that the daemon checked the flag when starting and errored out if it wasn't set. 
We actually have such an interface, but it's only enabled for the purposes of regression testing. If you compile "options REGRESSION" into the kernel configuration, a new system call, __setsugid(), is exposed to applications. It's used by src/tools/regression/security/proc_to_proc to make it easier to set up process pairs for regression testing of inter-process access control. When I added it, there was some interest in just making it setsugid() and exposing it to all processes. Maybe we should just go this route for 5.2-RELEASE. Invoking it with a (0) argument would mean the application writer accepted the inherent risks. However, this would open the application to the risks of debugging attachment, which are probably greater than the signal risks in most cases. It's not clear what the best way to express "I want to accept these risks but not those risks" would be...

So far, it sounds like we have three work-arounds in the pot, perhaps we can think of something better:

(1) Remove SIGALRM from the list of prohibited signals in the P_SUGID case. Not clear what the risks are here based on common application use, but this is an easy change to make.

(2) Add setsugid() to allow applications to give up implicit protections associated with credential changes. This comes with greater risks, I suspect, since it opens up applications to more explicit vulnerabilities: signal attacks require more sophistication and luck, but debugging attacks are easy.

(3) Allow administrators to selectively disable the more restrictive signal checks at a system scope using a sysctl. This is easy, and comes with no risks as long as the setting is unchanged (the default in the patch I sent out earlier).

I'm tempted to commit (1) immediately to allow a workaround if we get nothing else figured out, and to think some more about (2) and (3).

Another possibility would be to encourage application writers to avoid overloading signals that already have meanings, and rely on the USR signals.
I assume the reason Diablo uses ALRM is that the USR signals already have assigned semantics?

BTW, it's worth noting that the mechanism Diablo is using to give up privilege actually does retain some privileges -- it doesn't, for example, synchronize its resource limits with those of the user it is switching to, so it retains the starting resource limits (likely those of the root account).

That's actually preferred in most cases. News servers almost always eat far more resources than whatever limits you might set by default, which just turns into telling people to remove the limits or use root's limits. Generally if a news package bumps limits bad things happen.

Right now, most applications in the base system make use of the setusercontext() call to modify their protections as part of a switch of users. They often pass in the flag LOGIN_SETALL and then remove the bits they don't need, such as LOGIN_SETRESOURCES. This also has the side effect
Re: Someone help me understand this...?
On Thu, 28 Aug 2003, Joe Greco wrote:

On Wed, 27 Aug 2003, Joe Greco wrote:

The specific OS below is 5.1-RELEASE but apparently this happens on 4.8 as well.

Could you confirm this happens with 4.8? The access control checks there are substantially different, and I wouldn't expect the behavior you're seeing on 4.8...

Rather difficult. I'll see if the client will let me trash a production system, but usually people don't like $40K servers handing out a few hundred megabits of traffic going out of service. We were trying to fix it on the scratch box (which happens to have 5.1R on it) and then were going to see how it fared on the production systems.

I think it's safe to assume that if you're seeing a similar failure, there's a different source given my reading of the code, but I'm willing to be proven wrong. It's probably not worth the investment if you're talking about large quantities of money, though.

It's more like large quantities of annoyance and work. Can you describe the case you're envisioning? If I can easily poke at it, I can at least get some clues.

Clearly, unbreaking applications like Diablo by default is desirable. At least OpenBSD has similar protections to these turned on by default, and possibly other systems as well. As 5.x sees more broad use, we may well bump into other cases where applications have similar behavior: they rely on no special protections once they've given up privilege. I wonder if Diablo can run unmodified on OpenBSD; it could be they don't include SIGALRM on the list of protected-against signals, or it could be that they modify Diablo for their environment to use an alternative signaling mechanism. Another alternative to this patch would simply be to add SIGALRM to the list of acceptable signals to deliver in the privilege-change case.

I wonder if it would be reasonable to have some sort of interface that allowed a program to tell FreeBSD not to set this flag...
if not, at least if there was a sysctl, code could be added so that the daemon checked the flag when starting and errored out if it wasn't set.

We actually have such an interface, but it's only enabled for the purposes of regression testing. If you compile options REGRESSION into the kernel configuration, a new system call, __setsugid(), is exposed to applications. It's used by src/tools/regression/security/proc_to_proc to make it easier to set up process pairs for regression testing of inter-process access control. When I added it, there was some interest in just making it setsugid() and exposing it to all processes. Maybe we should just go this route for 5.2-RELEASE. Invoking it with a (0) argument would mean the application writer accepted the inherent risks. However, this would open the application to the risks of debugging attachment, which are probably greater than the signal risks in most cases. It's not clear what the best way to express "I want to accept these risks but not those risks" would be...

So far, it sounds like we have three work-arounds in the pot; perhaps we can think of something better:

(1) Remove SIGALRM from the list of prohibited signals in the P_SUGID case. Not clear what the risks are here based on common application use, but this is an easy change to make.

(2) Add setsugid() to allow applications to give up implicit protections associated with credential changes. This comes with greater risks, I suspect, since it opens up applications to more explicit vulnerabilities: signal attacks require more sophistication and luck, but debugging attacks are easy.

(3) Allow administrators to selectively disable the more restrictive signal checks at a system scope using a sysctl. This is easy, and comes with no risks as long as the setting is unchanged (the default in the patch I sent out earlier).

I'm tempted to commit (1) immediately to allow a workaround if we get nothing else figured out, and to think some more about (2) and (3).
Another possibility would be to encourage application writers to avoid overloading signals that already have meanings, and rely on the USR signals. I assume the reason Diablo uses ALRM is that the USR signals already have assigned semantics?

Correct. The USR signals control debug levels. If it was a signal that was only used internally, it could be changed, of course, but changing a signal used by humans (and one used in the same manner as other programs) is probably a bad idea.

BTW, it's worth noting that the mechanism Diablo is using to give up privilege actually does retain some privileges -- it doesn't, for example, synchronize its resource limits with those of the user it is switching to, so it retains the starting resource limits (likely those of the root account).

That's actually preferred in most cases. News servers almost always eat far more
Someone help me understand this...?
I've got a weirdness with kill(2). This code is out of Diablo, the news package, and has been working fine for some years. It apparently works fine on other OS's. In the Diablo model, the parent process may choose to tell its children to update status via a signal. The loop basically consists of going through and issuing a SIGALRM. This stopped working a while ago, don't know precisely when. I was in the process of debugging it today and ran into this. The specific OS below is 5.1-RELEASE but apparently this happens on 4.8 as well.

    %echo $$
    29047
    %ps -O ruid,uid | egrep '28949|29045|29047'
    28949 8 8 p0 I 0:00.00 diablo: ihav=0chk=0rec=0 ent=0
    29045 8 8 p0 I 0:00.00 sleep 99
    29047 8 8 p0 D 0:00.01 -su (csh)
    %kill -ALRM 28949
    28949: Operation not permitted
    %kill -ALRM 29045
    %ps -O ruid,uid | egrep '28949|29045'
    28949 8 8 p0 I 0:00.00 diablo: ihav=0chk=0rec=0 ent=0
    %

Wot? Why can't I send it a signal? I've read kill(2) rather carefully and cannot find the reason. It says, "For a process to have permission to send a signal to a process designated by pid, the real or effective user ID of the receiving process must match that of the sending process or the user must have appropriate privileges (such as given by a set-user-ID program or the user is the super-user)."

Well, the sending and receiving processes both clearly have equal uid/euid. We're not running in a jail, so I don't expect any issues there. The parent process did actually start as root and then shed privilege with

    struct passwd *pw = getpwnam("news");
    struct group *gr = getgrnam("news");
    gid_t gid;

    if (pw == NULL) {
        perror("getpwnam('news')");
        exit(1);
    }
    if (gr == NULL) {
        perror("getgrnam('news')");
        exit(1);
    }
    gid = gr->gr_gid;
    setgroups(1, &gid);
    setgid(gr->gr_gid);
    setuid(pw->pw_uid);

so that looks all well and fine... so why can't it kill its own children, and why can't I kill one of its children from a shell with equivalent uid/euid?
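As a side note on the snippet above: it ignores the return values of setgroups(), setgid() and setuid(). A minimal sketch of the same sequence with every step checked might look like the following. The function name drop_privileges() is hypothetical and not part of Diablo, and this does nothing about the signal delivery check discussed in this thread; it only makes a failed privilege change visible instead of silent.

```c
#include <sys/types.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch to the given user and group, checking
 * every step. Returns 0 on success, -1 on failure (after writing a
 * message to stderr). Groups and gid are changed before the uid,
 * because after setuid() the process may no longer be allowed to
 * change its groups.
 */
int drop_privileges(const char *user, const char *group)
{
    struct passwd *pw;
    struct group *gr;
    uid_t uid;
    gid_t gid;

    pw = getpwnam(user);
    if (pw == NULL) {
        fprintf(stderr, "unknown user '%s'\n", user);
        return -1;
    }
    uid = pw->pw_uid;   /* copy out: the result lives in static storage */

    gr = getgrnam(group);
    if (gr == NULL) {
        fprintf(stderr, "unknown group '%s'\n", group);
        return -1;
    }
    gid = gr->gr_gid;

    if (setgroups(1, &gid) != 0) {
        perror("setgroups");
        return -1;
    }
    if (setgid(gid) != 0) {
        perror("setgid");
        return -1;
    }
    if (setuid(uid) != 0) {
        perror("setuid");
        return -1;
    }
    return 0;
}
```

A caller running as root would use something like `if (drop_privileges("news", "news") != 0) exit(1);` before doing any real work.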
I know there's been some paranoia about signal delivery and all that, but my searching hasn't turned up anything that would explain this. Certainly the manual page ought to be updated if this is a new expected behaviour or something... at least some clue as to why it might fail would be helpful.

... JG
--
Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net
We call it the 'one bite at the apple' rule. Give me one chance [and] then I won't contact you again. - Direct Marketing Ass'n position on e-mail spam (CNN)
With 24 million small businesses in the US alone, that's way too many apples.
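For anyone chasing a similar failure, a small wrapper around kill(2) that logs the errno together with the sender's own uid and euid can at least tell "Operation not permitted" (EPERM) apart from "No such process" (ESRCH). This is only a sketch; notify_child() is a made-up name, not something from Diablo.

```c
#include <sys/types.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: send sig to pid and report why it failed, if it
 * did. Returns 0 on success, or the errno value set by kill(2).
 */
int notify_child(pid_t pid, int sig)
{
    int err;

    if (kill(pid, sig) == 0)
        return 0;

    err = errno;    /* save before calling anything that may change it */
    fprintf(stderr, "kill(%ld, %d) as uid %ld / euid %ld: %s\n",
        (long)pid, sig, (long)getuid(), (long)geteuid(), strerror(err));
    return err;
}
```

Calling it with signal 0 performs only the existence and permission check without delivering anything, which is a cheap way to probe a child before relying on SIGALRM reaching it.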