Re: [PATCH] selinux: Assign proper class to PF_UNIX/SOCK_RAW sockets
On Wed, Jun 21, 2017 at 3:04 PM, Paul Moorewrote: > On Wed, Jun 21, 2017 at 5:48 AM, Luis Ressel wrote: >> On Tue, 20 Jun 2017 17:43:38 -0400 >> Paul Moore wrote: >> >>> Considering where we are at with respect to the merge window, let's >>> shelve this for now and I'll merge it after the next merge window >>> closes. In all likelihood I'll be sending selinux/next up to James >>> later this week and I'd like this to sit in linux-next for longer than >>> a few days. >> >> That means the change will land in 4.14 at the earliest, right? (Just >> out of curiosity.) > > That's correct. We are currently working towards a v4.12 release in > Linus' tree, the upcoming merge window will be for v4.13, and things > merged into selinux/next after that merge window will be for v4.14. > >> By the way, refpolicy only grants "socket" permissions to a handful of >> domains, all of which also have the corresponding "unix_dgram_socket" >> permissions. The fedora policy does the same (according to Stephen); >> this only leaves custom policies to be potentially affected by this >> change. > > While custom policies are definitely in the minority, we still need to > do out best not to break them without warning. > >> Given that the SOCK_RAW->SOCK_DGRAM translation is obscure enough not to >> be documented anywhere outside the kernel sources, I doubt there are >> many users of it, anyway. > > You very well may be right, I just felt that such a change requires > more than a week in the selinux/next tree. > > Thank you for your patch, it's in the queue and I'll be merging it > into the selinux/next branch in a few weeks. With all of the SELinux things for v4.13 upstream, I went ahead and merged this patch into selinux/next. -- paul moore www.paul-moore.com
[RFC][PATCH] selinux: Introduce a policy capability and permission for NNP transitions
As systemd ramps up enabling NoNewPrivileges (either explicitly in service unit files or as a side effect of other security-related settings in service unit files), we're increasingly running afoul of its interactions with SELinux. The end result is bad for the security of both SELinux-disabled and SELinux-enabled systems. Packagers have to turn off these options in the unit files to preserve SELinux domain transitions. For users who choose to disable SELinux, this means that they miss out on at least having the systemd-supported protections. For users who keep SELinux enabled, they may still be missing out on some protections because it isn't necessarily guaranteed that the SELinux policy for that service provides the same protections in all cases. Our options seem to be: 1) Just keep on the way we are now, i.e. packagers have to remove default protection settings from upstream package unit files in order to have them work with SELinux (and not just NoNewPrivileges= itself; increasingly systemd is enabling NNP as a side effect of other unit file options, even seemingly unrelated ones like PrivateDevices). SELinux-disabled users lose entirely, SELinux-enabled users may lose (depending on whether SELinux policy provides equivalent or better guarantees). 2) Change systemd to automatically disable NNP on SELinux-enabled systems. Unit files can be left unmodified from upstream. SELinux- disabled users win. SELinux-enabled users may lose. 3) Try to use typebounds, since we allow bounded transitions under NNP. Unit files can be left unmodified from upstream. SELinux-disabled users win. SELinux-enabled users get to benefit from systemd-provided protections. However, this option is impractical to implement in policy currently, since typebounds requires us to ensure that each domain is allowed everything all of its descendant domains are allowed, and this has to be repeated for the entire chain of domain transitions. There is no way to clone all allow rules from children to the parent in policy currently, and it is doubtful that doing so would be desirable even if it were practical, as it requires leaking permissions to objects and operations into parent domains that could weaken their own security in order to allow them to the children (e.g. if a child requires execmem permission, then so does the parent; if a child requires read to a symbolic link or temporary file that it can write, then so does the parent, ...). 4) Decouple NNP from SELinux transitions, so that we don't have to make a choice between them. Introduce a new policy capability that causes the ability to transition under NNP to be based on a new permission check between the old and new contexts rather than typebounds. Domain transitions can then be allowed in policy without requiring the parent to be a strict superset of all of its children. The rationale for this divergence from NNP behavior for capabilities is that SELinux permissions are substantially broader than just capabilities (they express a full access matrix, not just privileges) and can only be used to further restrict capabilities, not grant them beyond what is already permitted. Unit files can be left unmodified from upstream. SELinux-disabled users win. SELinux-enabled users can benefit from systemd-provided protections and policy won't need to radically change. This change takes the last approach above, as it seems to be the best option. Signed-off-by: Stephen Smalley--- security/selinux/hooks.c| 41 - security/selinux/include/classmap.h | 2 +- security/selinux/include/security.h | 2 ++ security/selinux/ss/services.c | 7 ++- 4 files changed, 36 insertions(+), 16 deletions(-) diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 3a06afb..f0c11c2 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -2326,24 +2326,37 @@ static int check_nnp_nosuid(const struct linux_binprm *bprm, return 0; /* No change in credentials */ /* -* The only transitions we permit under NNP or nosuid -* are transitions to bounded SIDs, i.e. SIDs that are -* guaranteed to only be allowed a subset of the permissions -* of the current SID. +* If the policy enables the nnp_transition policy capability, +* then we permit transitions under NNP or nosuid if the +* policy explicitly allows nnp_transition permission between +* the old and new contexts. */ - rc = security_bounded_transition(old_tsec->sid, new_tsec->sid); - if (rc) { + if (selinux_policycap_nnptransition) { + rc = avc_has_perm(old_tsec->sid, new_tsec->sid, + SECCLASS_PROCESS, + PROCESS__NNP_TRANSITION, NULL); + if (!rc) + return 0; + } else { /* -*
Re: [PATCH v2 2/8] exec: Move security_bprm_secureexec() earlier
On Mon, Jul 10, 2017 at 1:57 AM, Eric W. Biedermanwrote: > Kees Cook writes: > >> There are several places where exec needs to know if a privilege-gain has >> happened. These should be using the results of security_bprm_secureexec() >> but it is getting (needlessly) called very late. > > It is hard to tell at a glance but I believe this introduces a > regression. > > cap_bprm_set_creds is currently called before cap_bprm_secureexec and > it has a number of cases such as no_new_privs and ptrace that can result > in some of the precomputed credential changes not happening. > > Without accounting for that I believe your cap_bprm_securexec now > returns a postive value too early. It's still before cap_bprm_secureexec. cap_brpm_set_creds() is in prepare_binprm(), which is well before exec_binprm() and it's eventual call to setup_new_exec(). -Kees -- Kees Cook Pixel Security
Re: [PATCH v2 1/8] exec: Correct comments about "point of no return"
Kees Cookwrites: > On Mon, Jul 10, 2017 at 1:46 AM, Eric W. Biederman > wrote: >> >> But you miss it. >> >> The "point of no return" is the call to de_thread. Or aguably anything in >> flush_old_exec. Once anything in the current task is modified you can't >> return an error. >> >> It very much does not have anything to do with brpm.It has >> everything to do with current. > > Yes, but the thing that actually enforces this is the test of bprm->mm > and the SIGSEGV in search_binary_handlers(). So what. Calling that the point of no return is wrong. The point of no return is when we kill change anyting in signal_struct or task_struct. AKA killing the first thread in de_thread. It is more than just the SIGSEGV in search_binary_handlers that enforces this. de_thread only returns (with a failure code) after having killed some threads if those threads are dead. Similarly exec_mmap only returns with failure if we know that a core dump is pending, and as such the process will be killed before returning to userspace. I am a little worried that we may fail to dump some threads if a core dump races with exec, but that is a quality of implementation issue, and the window is very small so I don't know that it matters. The point of no return very much comes a while before clearing brpm->mm. >>> diff --git a/fs/exec.c b/fs/exec.c >>> index 904199086490..7842ae661e34 100644 >>> --- a/fs/exec.c >>> +++ b/fs/exec.c >>> @@ -1285,7 +1285,14 @@ int flush_old_exec(struct linux_binprm * bprm) >>> if (retval) >>> goto out; >>> >>> - bprm->mm = NULL;/* We're using it now */ >>> + /* >>> + * After clearing bprm->mm (to mark that current is using the >>> + * prepared mm now), we are at the point of no return. If >>> + * anything from here on returns an error, the check in >>> + * search_binary_handler() will kill current (since the mm has >>> + * been replaced). >>> + */ >>> + bprm->mm = NULL; >>> >>> set_fs(USER_DS); >>> current->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD | Eric
Re: [PATCH v2 2/8] exec: Move security_bprm_secureexec() earlier
Kees Cookwrites: > On Mon, Jul 10, 2017 at 1:57 AM, Eric W. Biederman > wrote: >> Kees Cook writes: >> >>> There are several places where exec needs to know if a privilege-gain has >>> happened. These should be using the results of security_bprm_secureexec() >>> but it is getting (needlessly) called very late. >> >> It is hard to tell at a glance but I believe this introduces a >> regression. >> >> cap_bprm_set_creds is currently called before cap_bprm_secureexec and >> it has a number of cases such as no_new_privs and ptrace that can result >> in some of the precomputed credential changes not happening. >> >> Without accounting for that I believe your cap_bprm_securexec now >> returns a postive value too early. > > It's still before cap_bprm_secureexec. cap_brpm_set_creds() is in > prepare_binprm(), which is well before exec_binprm() and it's eventual > call to setup_new_exec(). Good point. I didn't double check and the set in the name had me thinking it was setting the creds on current. Is there any reason we need a second security hook? It feels like we should be able to just fold the secureexec hook into the set_creds hook. The two are so interrelated I fear that having them separate only encourages them to diverge in trivial ways as it is easy to forget about some detail or other. Certainly having them called from different functions seems wrong. If we know enough in prepare_binprm we know enough. Eric
Re: [PATCH v2 1/8] exec: Correct comments about "point of no return"
On Mon, Jul 10, 2017 at 1:46 AM, Eric W. Biedermanwrote: > > But you miss it. > > The "point of no return" is the call to de_thread. Or aguably anything in > flush_old_exec. Once anything in the current task is modified you can't > return an error. > > It very much does not have anything to do with brpm.It has > everything to do with current. Yes, but the thing that actually enforces this is the test of bprm->mm and the SIGSEGV in search_binary_handlers(). -Kees > > >> diff --git a/fs/exec.c b/fs/exec.c >> index 904199086490..7842ae661e34 100644 >> --- a/fs/exec.c >> +++ b/fs/exec.c >> @@ -1285,7 +1285,14 @@ int flush_old_exec(struct linux_binprm * bprm) >> if (retval) >> goto out; >> >> - bprm->mm = NULL;/* We're using it now */ >> + /* >> + * After clearing bprm->mm (to mark that current is using the >> + * prepared mm now), we are at the point of no return. If >> + * anything from here on returns an error, the check in >> + * search_binary_handler() will kill current (since the mm has >> + * been replaced). >> + */ >> + bprm->mm = NULL; >> >> set_fs(USER_DS); >> current->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD | > > Eric -- Kees Cook Pixel Security
[PATCH] open_init_pty: Do not make stdin and stdout non-blocking
It is unclear why this was being done in the first place, and it has caused multiple bugs with run_init/open_init_pty usage. Fixes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=863187 Fixes: https://bugs.gentoo.org/show_bug.cgi?id=621062 Signed-off-by: Stephen Smalley--- policycoreutils/run_init/open_init_pty.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/policycoreutils/run_init/open_init_pty.c b/policycoreutils/run_init/open_init_pty.c index 6e25ea3..b37ae4d 100644 --- a/policycoreutils/run_init/open_init_pty.c +++ b/policycoreutils/run_init/open_init_pty.c @@ -276,10 +276,8 @@ int main(int argc, char *argv[]) } } - /* Non blocking mode for all file descriptors. */ + /* Non blocking mode for the pty master. */ setfd_nonblock(pty_master); - setfd_nonblock(STDIN_FILENO); - setfd_nonblock(STDOUT_FILENO); if (isatty(STDIN_FILENO)) { if (tty_semi_raw(STDIN_FILENO) < 0) { -- 2.9.4
Re: [PATCH v2 8/8] exec: Use sane stack rlimit under secureexec
On Mon, 2017-07-10 at 00:57 -0700, Kees Cook wrote: > For a secureexec, before memory layout selection has happened, reset the > stack rlimit to something sane to avoid the caller having control over > the resulting layouts. > > $ ulimit -s > 8192 > $ ulimit -s unlimited > $ /bin/sh -c 'ulimit -s' > unlimited > $ sudo /bin/sh -c 'ulimit -s' > 8192 > > Signed-off-by: Kees Cook> --- > fs/exec.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/fs/exec.c b/fs/exec.c > index e0186db02f90..1e3463854a16 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1343,6 +1343,16 @@ void setup_new_exec(struct linux_binprm * bprm) > > /* Make sure parent cannot signal privileged process. */ > current->pdeath_signal = 0; > + > + /* > + * For secureexec, reset the stack limit to sane default to > + * avoid bad behavior from the prior rlimits. This has to > + * happen before arch_pick_mmap_layout(), which examines > + * RLIMIT_STACK, but after the point of no return to avoid > + * needing to clean up the change on failure. > + */ > + if (current->signal->rlim[RLIMIT_STACK].rlim_cur > _STK_LIM) > + current->signal->rlim[RLIMIT_STACK].rlim_cur = _STK_LIM; As setup_new_exec() is called before install_exec_creds(), I think this leaves a window where the real user can change the limit again with prlimit(). Ben. > } > > arch_pick_mmap_layout(current->mm); -- Ben Hutchings Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer signature.asc Description: This is a digitally signed message part
[PATCH v2 7/8] exec: Consolidate pdeath_signal clearing
Instead of an additional secureexec check for pdeath_signal, just move it up into the initial secureexec test. Neither perf nor arch code touches pdeath_signal, so the relocation shouldn't change anything. Signed-off-by: Kees Cook--- fs/exec.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index eeb8323977d1..e0186db02f90 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1340,6 +1340,9 @@ void setup_new_exec(struct linux_binprm * bprm) if (security_bprm_secureexec(bprm)) { /* Record for AT_SECURE. */ bprm->secureexec = 1; + + /* Make sure parent cannot signal privileged process. */ + current->pdeath_signal = 0; } arch_pick_mmap_layout(current->mm); @@ -1363,10 +1366,6 @@ void setup_new_exec(struct linux_binprm * bprm) */ current->mm->task_size = TASK_SIZE; - if (bprm->secureexec) { - current->pdeath_signal = 0; - } - /* An exec changes our domain. We are no longer part of the thread group */ current->self_exec_id++; -- 2.7.4
[PATCH v2 2/8] exec: Move security_bprm_secureexec() earlier
There are several places where exec needs to know if a privilege-gain has happened. These should be using the results of security_bprm_secureexec() but it is getting (needlessly) called very late. Instead, move this earlier in the exec code, to the start of the point of no return in setup_new_exec(). Here, the new creds have already been calculated (and stored in bprm->cred), which is normally what security_bprm_secureexec() wants to examine. Since it's moved earlier, LSMs hooking bprm_secureexec() need to be adjusted to use the creds in bprm: $ git grep LSM_HOOK_INIT.*bprm_secureexec apparmor/lsm.c:LSM_HOOK_INIT(bprm_secureexec, apparmor_bprm_secureexec), commoncap.c: LSM_HOOK_INIT(bprm_secureexec, cap_bprm_secureexec), selinux/hooks.c: LSM_HOOK_INIT(bprm_secureexec, selinux_bprm_secureexec), smack/smack_lsm.c: LSM_HOOK_INIT(bprm_secureexec, smack_bprm_secureexec), AppArmor does not access creds in apparmor_bprm_secureexec. Capabilities needed to be adjusted to use bprm creds. SELinux needed to be adjusted to use bprm creds for the security structure. Smack needed to be adjusted to use bprm creds for the security structure. The result of the bprm_secureexec() hook is saved in a new bprm field "secureexec" so it can be queried later (just AT_SECURE currently). Signed-off-by: Kees Cook--- fs/binfmt_elf.c| 2 +- fs/binfmt_elf_fdpic.c | 2 +- fs/exec.c | 5 + include/linux/binfmts.h| 3 ++- include/linux/lsm_hooks.h | 3 ++- security/commoncap.c | 4 +--- security/selinux/hooks.c | 2 +- security/smack/smack_lsm.c | 2 +- 8 files changed, 14 insertions(+), 9 deletions(-) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 5075fd5c62c8..7f6ec4dac13d 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -254,7 +254,7 @@ create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec, NEW_AUX_ENT(AT_EUID, from_kuid_munged(cred->user_ns, cred->euid)); NEW_AUX_ENT(AT_GID, from_kgid_munged(cred->user_ns, cred->gid)); NEW_AUX_ENT(AT_EGID, from_kgid_munged(cred->user_ns, cred->egid)); - NEW_AUX_ENT(AT_SECURE, security_bprm_secureexec(bprm)); + NEW_AUX_ENT(AT_SECURE, bprm->secureexec); NEW_AUX_ENT(AT_RANDOM, (elf_addr_t)(unsigned long)u_rand_bytes); #ifdef ELF_HWCAP2 NEW_AUX_ENT(AT_HWCAP2, ELF_HWCAP2); diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index cf93a4fad012..5aa9199dfb13 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -650,7 +650,7 @@ static int create_elf_fdpic_tables(struct linux_binprm *bprm, NEW_AUX_ENT(AT_EUID,(elf_addr_t) from_kuid_munged(cred->user_ns, cred->euid)); NEW_AUX_ENT(AT_GID, (elf_addr_t) from_kgid_munged(cred->user_ns, cred->gid)); NEW_AUX_ENT(AT_EGID,(elf_addr_t) from_kgid_munged(cred->user_ns, cred->egid)); - NEW_AUX_ENT(AT_SECURE, security_bprm_secureexec(bprm)); + NEW_AUX_ENT(AT_SECURE, bprm->secureexec); NEW_AUX_ENT(AT_EXECFN, bprm->exec); #ifdef ARCH_DLINFO diff --git a/fs/exec.c b/fs/exec.c index 7842ae661e34..b92e37fb53aa 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1337,6 +1337,11 @@ EXPORT_SYMBOL(would_dump); void setup_new_exec(struct linux_binprm * bprm) { + if (security_bprm_secureexec(bprm)) { + /* Record for AT_SECURE. */ + bprm->secureexec = 1; + } + arch_pick_mmap_layout(current->mm); current->sas_ss_sp = current->sas_ss_size = 0; diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index 05488da3aee9..1afaa303cad0 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -27,9 +27,10 @@ struct linux_binprm { unsigned int cred_prepared:1,/* true if creds already prepared (multiple * preps happen for interpreters) */ - cap_effective:1;/* true if has elevated effective capabilities, + cap_effective:1,/* true if has elevated effective capabilities, * false if not; except for init which inherits * its parent's caps anyway */ + secureexec:1; /* true when gaining privileges */ #ifdef __alpha__ unsigned int taso:1; #endif diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h index 080f34e66017..d1bd24fb4a33 100644 --- a/include/linux/lsm_hooks.h +++ b/include/linux/lsm_hooks.h @@ -72,7 +72,8 @@ * Return a boolean value (0 or 1) indicating whether a "secure exec" * is required. The flag is passed in the auxiliary table * on the initial stack to the ELF interpreter to indicate whether libc - * should enable secure mode. + * should enable secure mode. Called before bprm_committing_creds(), + * so pending credentials are in @bprm->cred. * @bprm contains the linux_binprm structure. * * Security hooks for filesystem operations. diff
Re: [PATCH v2 2/8] exec: Move security_bprm_secureexec() earlier
Kees Cookwrites: > There are several places where exec needs to know if a privilege-gain has > happened. These should be using the results of security_bprm_secureexec() > but it is getting (needlessly) called very late. It is hard to tell at a glance but I believe this introduces a regression. cap_bprm_set_creds is currently called before cap_bprm_secureexec and it has a number of cases such as no_new_privs and ptrace that can result in some of the precomputed credential changes not happening. Without accounting for that I believe your cap_bprm_securexec now returns a postive value too early. > Instead, move this earlier in the exec code, to the start of the point > of no return in setup_new_exec(). Here, the new creds have already > been calculated (and stored in bprm->cred), which is normally what > security_bprm_secureexec() wants to examine. Since it's moved earlier, > LSMs hooking bprm_secureexec() need to be adjusted to use the creds in > bprm: > > $ git grep LSM_HOOK_INIT.*bprm_secureexec > apparmor/lsm.c:LSM_HOOK_INIT(bprm_secureexec, apparmor_bprm_secureexec), > commoncap.c: LSM_HOOK_INIT(bprm_secureexec, cap_bprm_secureexec), > selinux/hooks.c: LSM_HOOK_INIT(bprm_secureexec, selinux_bprm_secureexec), > smack/smack_lsm.c: LSM_HOOK_INIT(bprm_secureexec, smack_bprm_secureexec), > > AppArmor does not access creds in apparmor_bprm_secureexec. > > Capabilities needed to be adjusted to use bprm creds. > > SELinux needed to be adjusted to use bprm creds for the security structure. > > Smack needed to be adjusted to use bprm creds for the security structure. > > The result of the bprm_secureexec() hook is saved in a new bprm field > "secureexec" so it can be queried later (just AT_SECURE currently). > > Signed-off-by: Kees Cook > --- > fs/binfmt_elf.c| 2 +- > fs/binfmt_elf_fdpic.c | 2 +- > fs/exec.c | 5 + > include/linux/binfmts.h| 3 ++- > include/linux/lsm_hooks.h | 3 ++- > security/commoncap.c | 4 +--- > security/selinux/hooks.c | 2 +- > security/smack/smack_lsm.c | 2 +- > 8 files changed, 14 insertions(+), 9 deletions(-) > > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c > index 5075fd5c62c8..7f6ec4dac13d 100644 > --- a/fs/binfmt_elf.c > +++ b/fs/binfmt_elf.c > @@ -254,7 +254,7 @@ create_elf_tables(struct linux_binprm *bprm, struct > elfhdr *exec, > NEW_AUX_ENT(AT_EUID, from_kuid_munged(cred->user_ns, cred->euid)); > NEW_AUX_ENT(AT_GID, from_kgid_munged(cred->user_ns, cred->gid)); > NEW_AUX_ENT(AT_EGID, from_kgid_munged(cred->user_ns, cred->egid)); > - NEW_AUX_ENT(AT_SECURE, security_bprm_secureexec(bprm)); > + NEW_AUX_ENT(AT_SECURE, bprm->secureexec); > NEW_AUX_ENT(AT_RANDOM, (elf_addr_t)(unsigned long)u_rand_bytes); > #ifdef ELF_HWCAP2 > NEW_AUX_ENT(AT_HWCAP2, ELF_HWCAP2); > diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c > index cf93a4fad012..5aa9199dfb13 100644 > --- a/fs/binfmt_elf_fdpic.c > +++ b/fs/binfmt_elf_fdpic.c > @@ -650,7 +650,7 @@ static int create_elf_fdpic_tables(struct linux_binprm > *bprm, > NEW_AUX_ENT(AT_EUID,(elf_addr_t) from_kuid_munged(cred->user_ns, > cred->euid)); > NEW_AUX_ENT(AT_GID, (elf_addr_t) from_kgid_munged(cred->user_ns, > cred->gid)); > NEW_AUX_ENT(AT_EGID,(elf_addr_t) from_kgid_munged(cred->user_ns, > cred->egid)); > - NEW_AUX_ENT(AT_SECURE, security_bprm_secureexec(bprm)); > + NEW_AUX_ENT(AT_SECURE, bprm->secureexec); > NEW_AUX_ENT(AT_EXECFN, bprm->exec); > > #ifdef ARCH_DLINFO > diff --git a/fs/exec.c b/fs/exec.c > index 7842ae661e34..b92e37fb53aa 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1337,6 +1337,11 @@ EXPORT_SYMBOL(would_dump); > > void setup_new_exec(struct linux_binprm * bprm) > { > + if (security_bprm_secureexec(bprm)) { > + /* Record for AT_SECURE. */ > + bprm->secureexec = 1; > + } > + > arch_pick_mmap_layout(current->mm); > > current->sas_ss_sp = current->sas_ss_size = 0; > diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h > index 05488da3aee9..1afaa303cad0 100644 > --- a/include/linux/binfmts.h > +++ b/include/linux/binfmts.h > @@ -27,9 +27,10 @@ struct linux_binprm { > unsigned int > cred_prepared:1,/* true if creds already prepared (multiple >* preps happen for interpreters) */ > - cap_effective:1;/* true if has elevated effective capabilities, > + cap_effective:1,/* true if has elevated effective capabilities, >* false if not; except for init which inherits >* its parent's caps anyway */ > + secureexec:1; /* true when gaining privileges */ > #ifdef __alpha__ > unsigned int taso:1; > #endif > diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h > index 080f34e66017..d1bd24fb4a33
[PATCH v2 1/8] exec: Correct comments about "point of no return"
In commit 221af7f87b97 ("Split 'flush_old_exec' into two functions"), the comment about the point of no return should have stayed in flush_old_exec() since it refers to "bprm->mm = NULL;" line, but prior changes in commits c89681ed7d0e ("remove steal_locks()"), and fd8328be874f ("sanitize handling of shared descriptor tables in failing execve()") made it look like it meant the current->sas_ss_sp line instead. The comment is referring to the fact that once bprm->mm is NULL, all failures from a binfmt load_binary hook (e.g. load_elf_binary), will get SEGV raised against current. Move this comment and expand the explanation a bit, putting it above the assignment this time. This also removes an erroneous commet about when credentials are being installed. That has its own dedicated function, install_exec_creds(), which carries a similar (and correct) comment, so remove the bogus comment where installation is not actually happening. Signed-off-by: Kees Cook--- fs/exec.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 904199086490..7842ae661e34 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1285,7 +1285,14 @@ int flush_old_exec(struct linux_binprm * bprm) if (retval) goto out; - bprm->mm = NULL;/* We're using it now */ + /* +* After clearing bprm->mm (to mark that current is using the +* prepared mm now), we are at the point of no return. If +* anything from here on returns an error, the check in +* search_binary_handler() will kill current (since the mm has +* been replaced). +*/ + bprm->mm = NULL; set_fs(USER_DS); current->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD | @@ -1332,7 +1339,6 @@ void setup_new_exec(struct linux_binprm * bprm) { arch_pick_mmap_layout(current->mm); - /* This is the point of no return */ current->sas_ss_sp = current->sas_ss_size = 0; if (uid_eq(current_euid(), current_uid()) && gid_eq(current_egid(), current_gid())) @@ -1350,7 +1356,6 @@ void setup_new_exec(struct linux_binprm * bprm) */ current->mm->task_size = TASK_SIZE; - /* install the new credentials */ if (!uid_eq(bprm->cred->uid, current_euid()) || !gid_eq(bprm->cred->gid, current_egid())) { current->pdeath_signal = 0; -- 2.7.4
Re: [PATCH v2 1/8] exec: Correct comments about "point of no return"
But you miss it. The "point of no return" is the call to de_thread. Or aguably anything in flush_old_exec. Once anything in the current task is modified you can't return an error. It very much does not have anything to do with brpm.It has everything to do with current. > diff --git a/fs/exec.c b/fs/exec.c > index 904199086490..7842ae661e34 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1285,7 +1285,14 @@ int flush_old_exec(struct linux_binprm * bprm) > if (retval) > goto out; > > - bprm->mm = NULL;/* We're using it now */ > + /* > + * After clearing bprm->mm (to mark that current is using the > + * prepared mm now), we are at the point of no return. If > + * anything from here on returns an error, the check in > + * search_binary_handler() will kill current (since the mm has > + * been replaced). > + */ > + bprm->mm = NULL; > > set_fs(USER_DS); > current->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD | Eric
[PATCH v2 8/8] exec: Use sane stack rlimit under secureexec
For a secureexec, before memory layout selection has happened, reset the stack rlimit to something sane to avoid the caller having control over the resulting layouts. $ ulimit -s 8192 $ ulimit -s unlimited $ /bin/sh -c 'ulimit -s' unlimited $ sudo /bin/sh -c 'ulimit -s' 8192 Signed-off-by: Kees Cook--- fs/exec.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/fs/exec.c b/fs/exec.c index e0186db02f90..1e3463854a16 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1343,6 +1343,16 @@ void setup_new_exec(struct linux_binprm * bprm) /* Make sure parent cannot signal privileged process. */ current->pdeath_signal = 0; + + /* +* For secureexec, reset the stack limit to sane default to +* avoid bad behavior from the prior rlimits. This has to +* happen before arch_pick_mmap_layout(), which examines +* RLIMIT_STACK, but after the point of no return to avoid +* needing to clean up the change on failure. +*/ + if (current->signal->rlim[RLIMIT_STACK].rlim_cur > _STK_LIM) + current->signal->rlim[RLIMIT_STACK].rlim_cur = _STK_LIM; } arch_pick_mmap_layout(current->mm); -- 2.7.4
[PATCH v2 5/8] smack: Remove redundant pdeath_signal clearing
This removes the redundant pdeath_signal clearing in Smack: the check in smack_bprm_committing_creds() matches the check in smack_bprm_secureexec(), and since secureexec is now being checked for clearing pdeath_signal, this is redundant to the common exec code. Signed-off-by: Kees Cook--- security/smack/smack_lsm.c | 15 --- 1 file changed, 15 deletions(-) diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index 13cf9e66d5fe..4b10b782aecc 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -954,20 +954,6 @@ static int smack_bprm_set_creds(struct linux_binprm *bprm) } /** - * smack_bprm_committing_creds - Prepare to install the new credentials - * from bprm. - * - * @bprm: binprm for exec - */ -static void smack_bprm_committing_creds(struct linux_binprm *bprm) -{ - struct task_smack *bsp = bprm->cred->security; - - if (bsp->smk_task != bsp->smk_forked) - current->pdeath_signal = 0; -} - -/** * smack_bprm_secureexec - Return the decision to use secureexec. * @bprm: binprm for exec * @@ -4645,7 +4631,6 @@ static struct security_hook_list smack_hooks[] __lsm_ro_after_init = { LSM_HOOK_INIT(sb_parse_opts_str, smack_parse_opts_str), LSM_HOOK_INIT(bprm_set_creds, smack_bprm_set_creds), - LSM_HOOK_INIT(bprm_committing_creds, smack_bprm_committing_creds), LSM_HOOK_INIT(bprm_secureexec, smack_bprm_secureexec), LSM_HOOK_INIT(inode_alloc_security, smack_inode_alloc_security), -- 2.7.4
[PATCH v2 4/8] exec: Use secureexec for clearing pdeath_signal
Like dumpability, clearing pdeath_signal happens both in setup_new_exec() and later in commit_creds(). The test in setup_new_exec() is different from all other privilege comparisons, though: it is checking the new cred (bprm) uid vs the old cred (current) euid. This appears to be a bug, introduced by commit a6f76f23d297 ("CRED: Make execve() take advantage of copy-on-write credentials"): - if (bprm->e_uid != current_euid() || - bprm->e_gid != current_egid()) { - set_dumpable(current->mm, suid_dumpable); + /* install the new credentials */ + if (bprm->cred->uid != current_euid() || + bprm->cred->gid != current_egid()) { It was bprm euid vs current euid (and egids), but the effective got dropped. Nothing in the exec flow changes bprm->cred->uid (nor gid). The call traces are: prepare_bprm_creds() prepare_exec_creds() prepare_creds() memcpy(new_creds, old_creds, ...) security_prepare_creds() (unimplemented by commoncap) ... prepare_binprm() bprm_fill_uid() resets euid/egid to current euid/egid sets euid/egid on bprm based on set*id file bits security_bprm_set_creds() cap_bprm_set_creds() handle all caps-based manipulations so this test is effectively a test of current_uid() vs current_euid(), which is wrong, just like the prior dumpability tests were wrong. The commit log says "Clear pdeath_signal and set dumpable on certain circumstances that may not be covered by commit_creds()." This may be meaning the earlier old euid vs new euid (and egid) test that got changed. Luckily, as with dumpability, this is all masked by commit_creds() which performs old/new euid and egid tests and clears pdeath_signal. And again, like dumpability, we should include LSM secureexec logic for pdeath_signal clearing. For example, Smack goes out of its way to clear pdeath_signal when it finds a secureexec condition. Signed-off-by: Kees Cook--- fs/exec.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 3e519d4f0bd3..d7bda5b60e7b 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1361,8 +1361,7 @@ void setup_new_exec(struct linux_binprm * bprm) */ current->mm->task_size = TASK_SIZE; - if (!uid_eq(bprm->cred->uid, current_euid()) || - !gid_eq(bprm->cred->gid, current_egid())) { + if (bprm->secureexec) { current->pdeath_signal = 0; } else { if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP) -- 2.7.4
[PATCH v2 0/8] exec: Use sane stack rlimit under secureexec
As discussed with Linus and Andy, we need to reset the stack rlimit before we do memory layouts when execing a privilege-gaining (e.g. setuid) program. This moves security_bprm_secureexec() earlier (with required changes), and then lowers the stack limit when appropriate. As a side-effect, dumpability and pdeath_signal clearing is expanded to cover LSM definitions of secureexec (and Smack can drop its special handler for pdeath_signal clearing). I'd appreciate some extra eyes on this to make sure this isn't broken in some special way. I couldn't find anything that _depended_ on security_bprm_secureexec() being called late. Thanks! -Kees v2: - fix missed current_security() uses in LSMs. - research/consolidate dumpability setting logic - research/consolidate pdeath_signal clearing logic - split up logical steps a little more for easier review (and bisection) - fix some old broken comments
[PATCH v2 3/8] exec: Use secureexec for setting dumpability
The examination of "current" to decide dumpability is wrong. This was a check of and euid/uid (or egid/gid) mismatch in the existing process, not the newly created one. This appears to stretch back into even the "history.git" tree. Luckily, dumpability is later set in commit_creds(). In earlier kernel versions before creds existed, similar checks also existed late in the exec flow, covering up the mistake as far back as I could find. The commit_creds() check examines differences of euid, uid, egid, gid, and capabilities between the old and new creds. It would look like the setup_new_exec() dumpability test could be entirely removed, but strictly speaking, the secureexec test covers a different set of tests than what commit_creds() checks for. So, fix this test to use secureexec, which includes the same logical check (euid != uid || egid != gid), but checks bprm->cred, not current->cred. One would wonder if we need a security_commit_creds() LSM hook and to move the existing checks in commit_creds() into commoncaps.c, which would allow expanding the logic to all LSMs. Currently this doesn't seem needed, though. Signed-off-by: Kees Cook--- fs/exec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/exec.c b/fs/exec.c index b92e37fb53aa..3e519d4f0bd3 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1346,7 +1346,7 @@ void setup_new_exec(struct linux_binprm * bprm) current->sas_ss_sp = current->sas_ss_size = 0; - if (uid_eq(current_euid(), current_uid()) && gid_eq(current_egid(), current_gid())) + if (!bprm->secureexec) set_dumpable(current->mm, SUID_DUMP_USER); else set_dumpable(current->mm, suid_dumpable); -- 2.7.4