Re: [PATCH] selinux: Assign proper class to PF_UNIX/SOCK_RAW sockets

2017-07-10 Thread Paul Moore
On Wed, Jun 21, 2017 at 3:04 PM, Paul Moore  wrote:
> On Wed, Jun 21, 2017 at 5:48 AM, Luis Ressel  wrote:
>> On Tue, 20 Jun 2017 17:43:38 -0400
>> Paul Moore  wrote:
>>
>>> Considering where we are at with respect to the merge window, let's
>>> shelve this for now and I'll merge it after the next merge window
>>> closes.  In all likelihood I'll be sending selinux/next up to James
>>> later this week and I'd like this to sit in linux-next for longer than
>>> a few days.
>>
>> That means the change will land in 4.14 at the earliest, right? (Just
>> out of curiosity.)
>
> That's correct.  We are currently working towards a v4.12 release in
> Linus' tree, the upcoming merge window will be for v4.13, and things
> merged into selinux/next after that merge window will be for v4.14.
>
>> By the way, refpolicy only grants "socket" permissions to a handful of
>> domains, all of which also have the corresponding "unix_dgram_socket"
>> permissions. The fedora policy does the same (according to Stephen);
>> this only leaves custom policies to be potentially affected by this
>> change.
>
> While custom policies are definitely in the minority, we still need to
> do out best not to break them without warning.
>
>> Given that the SOCK_RAW->SOCK_DGRAM translation is obscure enough not to
>> be documented anywhere outside the kernel sources, I doubt there are
>> many users of it, anyway.
>
> You very well may be right, I just felt that such a change requires
> more than a week in the selinux/next tree.
>
> Thank you for your patch, it's in the queue and I'll be merging it
> into the selinux/next branch in a few weeks.

With all of the SELinux things for v4.13 upstream, I went ahead and
merged this patch into selinux/next.

-- 
paul moore
www.paul-moore.com


[RFC][PATCH] selinux: Introduce a policy capability and permission for NNP transitions

2017-07-10 Thread Stephen Smalley
As systemd ramps up enabling NoNewPrivileges (either explicitly in
service unit files or as a side effect of other security-related
settings in service unit files), we're increasingly running afoul of
its interactions with SELinux.

The end result is bad for the security of both SELinux-disabled and
SELinux-enabled systems.  Packagers have to turn off these
options in the unit files to preserve SELinux domain transitions.  For
users who choose to disable SELinux, this means that they miss out on
at least having the systemd-supported protections.  For users who keep
SELinux enabled, they may still be missing out on some protections
because it isn't necessarily guaranteed that the SELinux policy for
that service provides the same protections in all cases.

Our options seem to be:

1) Just keep on the way we are now, i.e. packagers have to remove
default protection settings from upstream package unit files in order
to have them work with SELinux (and not just NoNewPrivileges=
itself; increasingly systemd is enabling NNP as a side effect of other
unit file options, even seemingly unrelated ones like PrivateDevices).
SELinux-disabled users lose entirely, SELinux-enabled users may lose
(depending on whether SELinux policy provides equivalent or
better guarantees).

2) Change systemd to automatically disable NNP on SELinux-enabled
systems.  Unit files can be left unmodified from upstream.  SELinux-
disabled users win.  SELinux-enabled users may lose.

3) Try to use typebounds, since we allow bounded transitions under NNP.
Unit files can be left unmodified from upstream. SELinux-disabled users
win.  SELinux-enabled users get to benefit from systemd-provided
protections.  However, this option is impractical to implement in policy
currently, since typebounds requires us to ensure that each domain is
allowed everything all of its descendant domains are allowed, and this
has to be repeated for the entire chain of domain transitions.  There is
no way to clone all allow rules from children to the parent in policy
currently, and it is doubtful that doing so would be desirable even if
it were practical, as it requires leaking permissions to objects and
operations into parent domains that could weaken their own security in
order to allow them to the children (e.g. if a child requires execmem
permission, then so does the parent; if a child requires read to a symbolic
link or temporary file that it can write, then so does the parent, ...).

4) Decouple NNP from SELinux transitions, so that we don't have to
make a choice between them. Introduce a new policy capability that
causes the ability to transition under NNP to be based on a new permission
check between the old and new contexts rather than typebounds.  Domain
transitions can then be allowed in policy without requiring the parent
to be a strict superset of all of its children.  The rationale for this
divergence from NNP behavior for capabilities is that SELinux permissions
are substantially broader than just capabilities (they express a full
access matrix, not just privileges) and can only be used to further
restrict capabilities, not grant them beyond what is already permitted.
Unit files can be left unmodified from upstream.  SELinux-disabled users
win.  SELinux-enabled users can benefit from systemd-provided protections
and policy won't need to radically change.

This change takes the last approach above, as it seems to be the
best option.

Signed-off-by: Stephen Smalley 
---
 security/selinux/hooks.c| 41 -
 security/selinux/include/classmap.h |  2 +-
 security/selinux/include/security.h |  2 ++
 security/selinux/ss/services.c  |  7 ++-
 4 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 3a06afb..f0c11c2 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2326,24 +2326,37 @@ static int check_nnp_nosuid(const struct linux_binprm 
*bprm,
return 0; /* No change in credentials */
 
/*
-* The only transitions we permit under NNP or nosuid
-* are transitions to bounded SIDs, i.e. SIDs that are
-* guaranteed to only be allowed a subset of the permissions
-* of the current SID.
+* If the policy enables the nnp_transition policy capability,
+* then we permit transitions under NNP or nosuid if the
+* policy explicitly allows nnp_transition permission between
+* the old and new contexts.
 */
-   rc = security_bounded_transition(old_tsec->sid, new_tsec->sid);
-   if (rc) {
+   if (selinux_policycap_nnptransition) {
+   rc = avc_has_perm(old_tsec->sid, new_tsec->sid,
+ SECCLASS_PROCESS,
+ PROCESS__NNP_TRANSITION, NULL);
+   if (!rc)
+   return 0;
+   } else {
/*
-* 

Re: [PATCH v2 2/8] exec: Move security_bprm_secureexec() earlier

2017-07-10 Thread Kees Cook
On Mon, Jul 10, 2017 at 1:57 AM, Eric W. Biederman
 wrote:
> Kees Cook  writes:
>
>> There are several places where exec needs to know if a privilege-gain has
>> happened. These should be using the results of security_bprm_secureexec()
>> but it is getting (needlessly) called very late.
>
> It is hard to tell at a glance but I believe this introduces a
> regression.
>
> cap_bprm_set_creds is currently called before cap_bprm_secureexec and
> it has a number of cases such as no_new_privs and ptrace that can result
> in some of the precomputed credential changes not happening.
>
> Without accounting for that I believe your cap_bprm_securexec now
> returns a postive value too early.

It's still before cap_bprm_secureexec. cap_brpm_set_creds() is in
prepare_binprm(), which is well before exec_binprm() and it's eventual
call to setup_new_exec().

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v2 1/8] exec: Correct comments about "point of no return"

2017-07-10 Thread Eric W. Biederman
Kees Cook  writes:

> On Mon, Jul 10, 2017 at 1:46 AM, Eric W. Biederman
>  wrote:
>>
>> But you miss it.
>>
>> The "point of no return" is the call to de_thread.  Or aguably anything in
>> flush_old_exec.  Once anything in the current task is modified you can't
>> return an error.
>>
>> It very much does not have anything to do with brpm.It has
>> everything to do with current.
>
> Yes, but the thing that actually enforces this is the test of bprm->mm
> and the SIGSEGV in search_binary_handlers().

So what.  Calling that the point of no return is wrong.

The point of no return is when we kill change anyting in signal_struct
or task_struct.  AKA killing the first thread in de_thread.

It is more than just the SIGSEGV in search_binary_handlers that enforces
this.  de_thread only returns (with a failure code) after having killed
some threads if those threads are dead.

Similarly exec_mmap only returns with failure if we know that a core
dump is pending, and as such the process will be killed before returning
to userspace.

I am a little worried that we may fail to dump some threads if a core
dump races with exec, but that is a quality of implementation issue, and
the window is very small so I don't know that it matters.

The point of no return very much comes a while before clearing brpm->mm.

>>> diff --git a/fs/exec.c b/fs/exec.c
>>> index 904199086490..7842ae661e34 100644
>>> --- a/fs/exec.c
>>> +++ b/fs/exec.c
>>> @@ -1285,7 +1285,14 @@ int flush_old_exec(struct linux_binprm * bprm)
>>>   if (retval)
>>>   goto out;
>>>
>>> - bprm->mm = NULL;/* We're using it now */
>>> + /*
>>> +  * After clearing bprm->mm (to mark that current is using the
>>> +  * prepared mm now), we are at the point of no return. If
>>> +  * anything from here on returns an error, the check in
>>> +  * search_binary_handler() will kill current (since the mm has
>>> +  * been replaced).
>>> +  */
>>> + bprm->mm = NULL;
>>>
>>>   set_fs(USER_DS);
>>>   current->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |

Eric


Re: [PATCH v2 2/8] exec: Move security_bprm_secureexec() earlier

2017-07-10 Thread Eric W. Biederman
Kees Cook  writes:

> On Mon, Jul 10, 2017 at 1:57 AM, Eric W. Biederman
>  wrote:
>> Kees Cook  writes:
>>
>>> There are several places where exec needs to know if a privilege-gain has
>>> happened. These should be using the results of security_bprm_secureexec()
>>> but it is getting (needlessly) called very late.
>>
>> It is hard to tell at a glance but I believe this introduces a
>> regression.
>>
>> cap_bprm_set_creds is currently called before cap_bprm_secureexec and
>> it has a number of cases such as no_new_privs and ptrace that can result
>> in some of the precomputed credential changes not happening.
>>
>> Without accounting for that I believe your cap_bprm_securexec now
>> returns a postive value too early.
>
> It's still before cap_bprm_secureexec. cap_brpm_set_creds() is in
> prepare_binprm(), which is well before exec_binprm() and it's eventual
> call to setup_new_exec().

Good point.  I didn't double check and the set in the name had me
thinking it was setting the creds on current.

Is there any reason we need a second security hook?  It feels like we
should be able to just fold the secureexec hook into the set_creds hook.

The two are so interrelated I fear that having them separate only
encourages them to diverge in trivial ways as it is easy to forget about
some detail or other.

Certainly having them called from different functions seems wrong.  If
we know enough in prepare_binprm we know enough.

Eric



Re: [PATCH v2 1/8] exec: Correct comments about "point of no return"

2017-07-10 Thread Kees Cook
On Mon, Jul 10, 2017 at 1:46 AM, Eric W. Biederman
 wrote:
>
> But you miss it.
>
> The "point of no return" is the call to de_thread.  Or aguably anything in
> flush_old_exec.  Once anything in the current task is modified you can't
> return an error.
>
> It very much does not have anything to do with brpm.It has
> everything to do with current.

Yes, but the thing that actually enforces this is the test of bprm->mm
and the SIGSEGV in search_binary_handlers().

-Kees

>
>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index 904199086490..7842ae661e34 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1285,7 +1285,14 @@ int flush_old_exec(struct linux_binprm * bprm)
>>   if (retval)
>>   goto out;
>>
>> - bprm->mm = NULL;/* We're using it now */
>> + /*
>> +  * After clearing bprm->mm (to mark that current is using the
>> +  * prepared mm now), we are at the point of no return. If
>> +  * anything from here on returns an error, the check in
>> +  * search_binary_handler() will kill current (since the mm has
>> +  * been replaced).
>> +  */
>> + bprm->mm = NULL;
>>
>>   set_fs(USER_DS);
>>   current->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |
>
> Eric



-- 
Kees Cook
Pixel Security


[PATCH] open_init_pty: Do not make stdin and stdout non-blocking

2017-07-10 Thread Stephen Smalley
It is unclear why this was being done in the first place, and
it has caused multiple bugs with run_init/open_init_pty usage.

Fixes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=863187
Fixes: https://bugs.gentoo.org/show_bug.cgi?id=621062
Signed-off-by: Stephen Smalley 
---
 policycoreutils/run_init/open_init_pty.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/policycoreutils/run_init/open_init_pty.c 
b/policycoreutils/run_init/open_init_pty.c
index 6e25ea3..b37ae4d 100644
--- a/policycoreutils/run_init/open_init_pty.c
+++ b/policycoreutils/run_init/open_init_pty.c
@@ -276,10 +276,8 @@ int main(int argc, char *argv[])
}
}
 
-   /* Non blocking mode for all file descriptors. */
+   /* Non blocking mode for the pty master. */
setfd_nonblock(pty_master);
-   setfd_nonblock(STDIN_FILENO);
-   setfd_nonblock(STDOUT_FILENO);
 
if (isatty(STDIN_FILENO)) {
if (tty_semi_raw(STDIN_FILENO) < 0) {
-- 
2.9.4



Re: [PATCH v2 8/8] exec: Use sane stack rlimit under secureexec

2017-07-10 Thread Ben Hutchings
On Mon, 2017-07-10 at 00:57 -0700, Kees Cook wrote:
> For a secureexec, before memory layout selection has happened, reset the
> stack rlimit to something sane to avoid the caller having control over
> the resulting layouts.
> 
> $ ulimit -s
> 8192
> $ ulimit -s unlimited
> $ /bin/sh -c 'ulimit -s'
> unlimited
> $ sudo /bin/sh -c 'ulimit -s'
> 8192
> 
> Signed-off-by: Kees Cook 
> ---
>  fs/exec.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index e0186db02f90..1e3463854a16 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1343,6 +1343,16 @@ void setup_new_exec(struct linux_binprm * bprm)
>  
>   /* Make sure parent cannot signal privileged process. */
>   current->pdeath_signal = 0;
> +
> + /*
> +  * For secureexec, reset the stack limit to sane default to
> +  * avoid bad behavior from the prior rlimits. This has to
> +  * happen before arch_pick_mmap_layout(), which examines
> +  * RLIMIT_STACK, but after the point of no return to avoid
> +  * needing to clean up the change on failure.
> +  */
> + if (current->signal->rlim[RLIMIT_STACK].rlim_cur > _STK_LIM)
> + current->signal->rlim[RLIMIT_STACK].rlim_cur = _STK_LIM;

As setup_new_exec() is called before install_exec_creds(), I think this
leaves a window where the real user can change the limit again with
prlimit().

Ben.

>   }
>  
>   arch_pick_mmap_layout(current->mm);
-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer



signature.asc
Description: This is a digitally signed message part


[PATCH v2 7/8] exec: Consolidate pdeath_signal clearing

2017-07-10 Thread Kees Cook
Instead of an additional secureexec check for pdeath_signal, just move it
up into the initial secureexec test. Neither perf nor arch code touches
pdeath_signal, so the relocation shouldn't change anything.

Signed-off-by: Kees Cook 
---
 fs/exec.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index eeb8323977d1..e0186db02f90 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1340,6 +1340,9 @@ void setup_new_exec(struct linux_binprm * bprm)
if (security_bprm_secureexec(bprm)) {
/* Record for AT_SECURE. */
bprm->secureexec = 1;
+
+   /* Make sure parent cannot signal privileged process. */
+   current->pdeath_signal = 0;
}
 
arch_pick_mmap_layout(current->mm);
@@ -1363,10 +1366,6 @@ void setup_new_exec(struct linux_binprm * bprm)
 */
current->mm->task_size = TASK_SIZE;
 
-   if (bprm->secureexec) {
-   current->pdeath_signal = 0;
-   }
-
/* An exec changes our domain. We are no longer part of the thread
   group */
current->self_exec_id++;
-- 
2.7.4



[PATCH v2 2/8] exec: Move security_bprm_secureexec() earlier

2017-07-10 Thread Kees Cook
There are several places where exec needs to know if a privilege-gain has
happened. These should be using the results of security_bprm_secureexec()
but it is getting (needlessly) called very late.

Instead, move this earlier in the exec code, to the start of the point
of no return in setup_new_exec(). Here, the new creds have already
been calculated (and stored in bprm->cred), which is normally what
security_bprm_secureexec() wants to examine. Since it's moved earlier,
LSMs hooking bprm_secureexec() need to be adjusted to use the creds in
bprm:

$ git grep LSM_HOOK_INIT.*bprm_secureexec
apparmor/lsm.c:LSM_HOOK_INIT(bprm_secureexec, apparmor_bprm_secureexec),
commoncap.c:   LSM_HOOK_INIT(bprm_secureexec, cap_bprm_secureexec),
selinux/hooks.c:   LSM_HOOK_INIT(bprm_secureexec, selinux_bprm_secureexec),
smack/smack_lsm.c: LSM_HOOK_INIT(bprm_secureexec, smack_bprm_secureexec),

AppArmor does not access creds in apparmor_bprm_secureexec.

Capabilities needed to be adjusted to use bprm creds.

SELinux needed to be adjusted to use bprm creds for the security structure.

Smack needed to be adjusted to use bprm creds for the security structure.

The result of the bprm_secureexec() hook is saved in a new bprm field
"secureexec" so it can be queried later (just AT_SECURE currently).

Signed-off-by: Kees Cook 
---
 fs/binfmt_elf.c| 2 +-
 fs/binfmt_elf_fdpic.c  | 2 +-
 fs/exec.c  | 5 +
 include/linux/binfmts.h| 3 ++-
 include/linux/lsm_hooks.h  | 3 ++-
 security/commoncap.c   | 4 +---
 security/selinux/hooks.c   | 2 +-
 security/smack/smack_lsm.c | 2 +-
 8 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 5075fd5c62c8..7f6ec4dac13d 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -254,7 +254,7 @@ create_elf_tables(struct linux_binprm *bprm, struct elfhdr 
*exec,
NEW_AUX_ENT(AT_EUID, from_kuid_munged(cred->user_ns, cred->euid));
NEW_AUX_ENT(AT_GID, from_kgid_munged(cred->user_ns, cred->gid));
NEW_AUX_ENT(AT_EGID, from_kgid_munged(cred->user_ns, cred->egid));
-   NEW_AUX_ENT(AT_SECURE, security_bprm_secureexec(bprm));
+   NEW_AUX_ENT(AT_SECURE, bprm->secureexec);
NEW_AUX_ENT(AT_RANDOM, (elf_addr_t)(unsigned long)u_rand_bytes);
 #ifdef ELF_HWCAP2
NEW_AUX_ENT(AT_HWCAP2, ELF_HWCAP2);
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index cf93a4fad012..5aa9199dfb13 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -650,7 +650,7 @@ static int create_elf_fdpic_tables(struct linux_binprm 
*bprm,
NEW_AUX_ENT(AT_EUID,(elf_addr_t) from_kuid_munged(cred->user_ns, 
cred->euid));
NEW_AUX_ENT(AT_GID, (elf_addr_t) from_kgid_munged(cred->user_ns, 
cred->gid));
NEW_AUX_ENT(AT_EGID,(elf_addr_t) from_kgid_munged(cred->user_ns, 
cred->egid));
-   NEW_AUX_ENT(AT_SECURE,  security_bprm_secureexec(bprm));
+   NEW_AUX_ENT(AT_SECURE,  bprm->secureexec);
NEW_AUX_ENT(AT_EXECFN,  bprm->exec);
 
 #ifdef ARCH_DLINFO
diff --git a/fs/exec.c b/fs/exec.c
index 7842ae661e34..b92e37fb53aa 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1337,6 +1337,11 @@ EXPORT_SYMBOL(would_dump);
 
 void setup_new_exec(struct linux_binprm * bprm)
 {
+   if (security_bprm_secureexec(bprm)) {
+   /* Record for AT_SECURE. */
+   bprm->secureexec = 1;
+   }
+
arch_pick_mmap_layout(current->mm);
 
current->sas_ss_sp = current->sas_ss_size = 0;
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index 05488da3aee9..1afaa303cad0 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -27,9 +27,10 @@ struct linux_binprm {
unsigned int
cred_prepared:1,/* true if creds already prepared (multiple
 * preps happen for interpreters) */
-   cap_effective:1;/* true if has elevated effective capabilities,
+   cap_effective:1,/* true if has elevated effective capabilities,
 * false if not; except for init which inherits
 * its parent's caps anyway */
+   secureexec:1;   /* true when gaining privileges */
 #ifdef __alpha__
unsigned int taso:1;
 #endif
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 080f34e66017..d1bd24fb4a33 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -72,7 +72,8 @@
  * Return a boolean value (0 or 1) indicating whether a "secure exec"
  * is required.  The flag is passed in the auxiliary table
  * on the initial stack to the ELF interpreter to indicate whether libc
- * should enable secure mode.
+ * should enable secure mode. Called before bprm_committing_creds(),
+ * so pending credentials are in @bprm->cred.
  * @bprm contains the linux_binprm structure.
  *
  * Security hooks for filesystem operations.
diff 

Re: [PATCH v2 2/8] exec: Move security_bprm_secureexec() earlier

2017-07-10 Thread Eric W. Biederman
Kees Cook  writes:

> There are several places where exec needs to know if a privilege-gain has
> happened. These should be using the results of security_bprm_secureexec()
> but it is getting (needlessly) called very late.

It is hard to tell at a glance but I believe this introduces a
regression.

cap_bprm_set_creds is currently called before cap_bprm_secureexec and
it has a number of cases such as no_new_privs and ptrace that can result
in some of the precomputed credential changes not happening.

Without accounting for that I believe your cap_bprm_securexec now
returns a postive value too early.

> Instead, move this earlier in the exec code, to the start of the point
> of no return in setup_new_exec(). Here, the new creds have already
> been calculated (and stored in bprm->cred), which is normally what
> security_bprm_secureexec() wants to examine. Since it's moved earlier,
> LSMs hooking bprm_secureexec() need to be adjusted to use the creds in
> bprm:
>
> $ git grep LSM_HOOK_INIT.*bprm_secureexec
> apparmor/lsm.c:LSM_HOOK_INIT(bprm_secureexec, apparmor_bprm_secureexec),
> commoncap.c:   LSM_HOOK_INIT(bprm_secureexec, cap_bprm_secureexec),
> selinux/hooks.c:   LSM_HOOK_INIT(bprm_secureexec, selinux_bprm_secureexec),
> smack/smack_lsm.c: LSM_HOOK_INIT(bprm_secureexec, smack_bprm_secureexec),
>
> AppArmor does not access creds in apparmor_bprm_secureexec.
>
> Capabilities needed to be adjusted to use bprm creds.
>
> SELinux needed to be adjusted to use bprm creds for the security structure.
>
> Smack needed to be adjusted to use bprm creds for the security structure.
>
> The result of the bprm_secureexec() hook is saved in a new bprm field
> "secureexec" so it can be queried later (just AT_SECURE currently).
>
> Signed-off-by: Kees Cook 
> ---
>  fs/binfmt_elf.c| 2 +-
>  fs/binfmt_elf_fdpic.c  | 2 +-
>  fs/exec.c  | 5 +
>  include/linux/binfmts.h| 3 ++-
>  include/linux/lsm_hooks.h  | 3 ++-
>  security/commoncap.c   | 4 +---
>  security/selinux/hooks.c   | 2 +-
>  security/smack/smack_lsm.c | 2 +-
>  8 files changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 5075fd5c62c8..7f6ec4dac13d 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -254,7 +254,7 @@ create_elf_tables(struct linux_binprm *bprm, struct 
> elfhdr *exec,
>   NEW_AUX_ENT(AT_EUID, from_kuid_munged(cred->user_ns, cred->euid));
>   NEW_AUX_ENT(AT_GID, from_kgid_munged(cred->user_ns, cred->gid));
>   NEW_AUX_ENT(AT_EGID, from_kgid_munged(cred->user_ns, cred->egid));
> - NEW_AUX_ENT(AT_SECURE, security_bprm_secureexec(bprm));
> + NEW_AUX_ENT(AT_SECURE, bprm->secureexec);
>   NEW_AUX_ENT(AT_RANDOM, (elf_addr_t)(unsigned long)u_rand_bytes);
>  #ifdef ELF_HWCAP2
>   NEW_AUX_ENT(AT_HWCAP2, ELF_HWCAP2);
> diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
> index cf93a4fad012..5aa9199dfb13 100644
> --- a/fs/binfmt_elf_fdpic.c
> +++ b/fs/binfmt_elf_fdpic.c
> @@ -650,7 +650,7 @@ static int create_elf_fdpic_tables(struct linux_binprm 
> *bprm,
>   NEW_AUX_ENT(AT_EUID,(elf_addr_t) from_kuid_munged(cred->user_ns, 
> cred->euid));
>   NEW_AUX_ENT(AT_GID, (elf_addr_t) from_kgid_munged(cred->user_ns, 
> cred->gid));
>   NEW_AUX_ENT(AT_EGID,(elf_addr_t) from_kgid_munged(cred->user_ns, 
> cred->egid));
> - NEW_AUX_ENT(AT_SECURE,  security_bprm_secureexec(bprm));
> + NEW_AUX_ENT(AT_SECURE,  bprm->secureexec);
>   NEW_AUX_ENT(AT_EXECFN,  bprm->exec);
>  
>  #ifdef ARCH_DLINFO
> diff --git a/fs/exec.c b/fs/exec.c
> index 7842ae661e34..b92e37fb53aa 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1337,6 +1337,11 @@ EXPORT_SYMBOL(would_dump);
>  
>  void setup_new_exec(struct linux_binprm * bprm)
>  {
> + if (security_bprm_secureexec(bprm)) {
> + /* Record for AT_SECURE. */
> + bprm->secureexec = 1;
> + }
> +
>   arch_pick_mmap_layout(current->mm);
>  
>   current->sas_ss_sp = current->sas_ss_size = 0;
> diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
> index 05488da3aee9..1afaa303cad0 100644
> --- a/include/linux/binfmts.h
> +++ b/include/linux/binfmts.h
> @@ -27,9 +27,10 @@ struct linux_binprm {
>   unsigned int
>   cred_prepared:1,/* true if creds already prepared (multiple
>* preps happen for interpreters) */
> - cap_effective:1;/* true if has elevated effective capabilities,
> + cap_effective:1,/* true if has elevated effective capabilities,
>* false if not; except for init which inherits
>* its parent's caps anyway */
> + secureexec:1;   /* true when gaining privileges */
>  #ifdef __alpha__
>   unsigned int taso:1;
>  #endif
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index 080f34e66017..d1bd24fb4a33 

[PATCH v2 1/8] exec: Correct comments about "point of no return"

2017-07-10 Thread Kees Cook
In commit 221af7f87b97 ("Split 'flush_old_exec' into two functions"),
the comment about the point of no return should have stayed in
flush_old_exec() since it refers to "bprm->mm = NULL;" line, but prior
changes in commits c89681ed7d0e ("remove steal_locks()"), and
fd8328be874f ("sanitize handling of shared descriptor tables in failing
execve()") made it look like it meant the current->sas_ss_sp line instead.

The comment is referring to the fact that once bprm->mm is NULL, all
failures from a binfmt load_binary hook (e.g. load_elf_binary), will
get SEGV raised against current. Move this comment and expand the
explanation a bit, putting it above the assignment this time.

This also removes an erroneous commet about when credentials are being
installed. That has its own dedicated function, install_exec_creds(),
which carries a similar (and correct) comment, so remove the bogus comment
where installation is not actually happening.

Signed-off-by: Kees Cook 
---
 fs/exec.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 904199086490..7842ae661e34 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1285,7 +1285,14 @@ int flush_old_exec(struct linux_binprm * bprm)
if (retval)
goto out;
 
-   bprm->mm = NULL;/* We're using it now */
+   /*
+* After clearing bprm->mm (to mark that current is using the
+* prepared mm now), we are at the point of no return. If
+* anything from here on returns an error, the check in
+* search_binary_handler() will kill current (since the mm has
+* been replaced).
+*/
+   bprm->mm = NULL;
 
set_fs(USER_DS);
current->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |
@@ -1332,7 +1339,6 @@ void setup_new_exec(struct linux_binprm * bprm)
 {
arch_pick_mmap_layout(current->mm);
 
-   /* This is the point of no return */
current->sas_ss_sp = current->sas_ss_size = 0;
 
if (uid_eq(current_euid(), current_uid()) && gid_eq(current_egid(), 
current_gid()))
@@ -1350,7 +1356,6 @@ void setup_new_exec(struct linux_binprm * bprm)
 */
current->mm->task_size = TASK_SIZE;
 
-   /* install the new credentials */
if (!uid_eq(bprm->cred->uid, current_euid()) ||
!gid_eq(bprm->cred->gid, current_egid())) {
current->pdeath_signal = 0;
-- 
2.7.4



Re: [PATCH v2 1/8] exec: Correct comments about "point of no return"

2017-07-10 Thread Eric W. Biederman

But you miss it.

The "point of no return" is the call to de_thread.  Or aguably anything in
flush_old_exec.  Once anything in the current task is modified you can't
return an error.

It very much does not have anything to do with brpm.It has
everything to do with current.


> diff --git a/fs/exec.c b/fs/exec.c
> index 904199086490..7842ae661e34 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1285,7 +1285,14 @@ int flush_old_exec(struct linux_binprm * bprm)
>   if (retval)
>   goto out;
>  
> - bprm->mm = NULL;/* We're using it now */
> + /*
> +  * After clearing bprm->mm (to mark that current is using the
> +  * prepared mm now), we are at the point of no return. If
> +  * anything from here on returns an error, the check in
> +  * search_binary_handler() will kill current (since the mm has
> +  * been replaced).
> +  */
> + bprm->mm = NULL;
>  
>   set_fs(USER_DS);
>   current->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |

Eric


[PATCH v2 8/8] exec: Use sane stack rlimit under secureexec

2017-07-10 Thread Kees Cook
For a secureexec, before memory layout selection has happened, reset the
stack rlimit to something sane to avoid the caller having control over
the resulting layouts.

$ ulimit -s
8192
$ ulimit -s unlimited
$ /bin/sh -c 'ulimit -s'
unlimited
$ sudo /bin/sh -c 'ulimit -s'
8192

Signed-off-by: Kees Cook 
---
 fs/exec.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/fs/exec.c b/fs/exec.c
index e0186db02f90..1e3463854a16 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1343,6 +1343,16 @@ void setup_new_exec(struct linux_binprm * bprm)
 
/* Make sure parent cannot signal privileged process. */
current->pdeath_signal = 0;
+
+   /*
+* For secureexec, reset the stack limit to sane default to
+* avoid bad behavior from the prior rlimits. This has to
+* happen before arch_pick_mmap_layout(), which examines
+* RLIMIT_STACK, but after the point of no return to avoid
+* needing to clean up the change on failure.
+*/
+   if (current->signal->rlim[RLIMIT_STACK].rlim_cur > _STK_LIM)
+   current->signal->rlim[RLIMIT_STACK].rlim_cur = _STK_LIM;
}
 
arch_pick_mmap_layout(current->mm);
-- 
2.7.4



[PATCH v2 5/8] smack: Remove redundant pdeath_signal clearing

2017-07-10 Thread Kees Cook
This removes the redundant pdeath_signal clearing in Smack: the check in
smack_bprm_committing_creds() matches the check in smack_bprm_secureexec(),
and since secureexec is now being checked for clearing pdeath_signal, this
is redundant to the common exec code.

Signed-off-by: Kees Cook 
---
 security/smack/smack_lsm.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 13cf9e66d5fe..4b10b782aecc 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -954,20 +954,6 @@ static int smack_bprm_set_creds(struct linux_binprm *bprm)
 }
 
 /**
- * smack_bprm_committing_creds - Prepare to install the new credentials
- * from bprm.
- *
- * @bprm: binprm for exec
- */
-static void smack_bprm_committing_creds(struct linux_binprm *bprm)
-{
-   struct task_smack *bsp = bprm->cred->security;
-
-   if (bsp->smk_task != bsp->smk_forked)
-   current->pdeath_signal = 0;
-}
-
-/**
  * smack_bprm_secureexec - Return the decision to use secureexec.
  * @bprm: binprm for exec
  *
@@ -4645,7 +4631,6 @@ static struct security_hook_list smack_hooks[] 
__lsm_ro_after_init = {
LSM_HOOK_INIT(sb_parse_opts_str, smack_parse_opts_str),
 
LSM_HOOK_INIT(bprm_set_creds, smack_bprm_set_creds),
-   LSM_HOOK_INIT(bprm_committing_creds, smack_bprm_committing_creds),
LSM_HOOK_INIT(bprm_secureexec, smack_bprm_secureexec),
 
LSM_HOOK_INIT(inode_alloc_security, smack_inode_alloc_security),
-- 
2.7.4



[PATCH v2 4/8] exec: Use secureexec for clearing pdeath_signal

2017-07-10 Thread Kees Cook
Like dumpability, clearing pdeath_signal happens both in setup_new_exec()
and later in commit_creds(). The test in setup_new_exec() is different
from all other privilege comparisons, though: it is checking the new cred
(bprm) uid vs the old cred (current) euid. This appears to be a bug,
introduced by commit a6f76f23d297 ("CRED: Make execve() take advantage of
copy-on-write credentials"):

-   if (bprm->e_uid != current_euid() ||
-   bprm->e_gid != current_egid()) {
-   set_dumpable(current->mm, suid_dumpable);
+   /* install the new credentials */
+   if (bprm->cred->uid != current_euid() ||
+   bprm->cred->gid != current_egid()) {

It was bprm euid vs current euid (and egids), but the effective got
dropped. Nothing in the exec flow changes bprm->cred->uid (nor gid).
The call traces are:

prepare_bprm_creds()
prepare_exec_creds()
prepare_creds()
memcpy(new_creds, old_creds, ...)
security_prepare_creds() (unimplemented by commoncap)
...
prepare_binprm()
bprm_fill_uid()
resets euid/egid to current euid/egid
sets euid/egid on bprm based on set*id file bits
security_bprm_set_creds()
cap_bprm_set_creds()
handle all caps-based manipulations

so this test is effectively a test of current_uid() vs current_euid(),
which is wrong, just like the prior dumpability tests were wrong.

The commit log says "Clear pdeath_signal and set dumpable on
certain circumstances that may not be covered by commit_creds()." This
may be meaning the earlier old euid vs new euid (and egid) test that
got changed.

Luckily, as with dumpability, this is all masked by commit_creds()
which performs old/new euid and egid tests and clears pdeath_signal.

And again, like dumpability, we should include LSM secureexec logic for
pdeath_signal clearing. For example, Smack goes out of its way to clear
pdeath_signal when it finds a secureexec condition.

Signed-off-by: Kees Cook 
---
 fs/exec.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 3e519d4f0bd3..d7bda5b60e7b 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1361,8 +1361,7 @@ void setup_new_exec(struct linux_binprm * bprm)
 */
current->mm->task_size = TASK_SIZE;
 
-   if (!uid_eq(bprm->cred->uid, current_euid()) ||
-   !gid_eq(bprm->cred->gid, current_egid())) {
+   if (bprm->secureexec) {
current->pdeath_signal = 0;
} else {
if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP)
-- 
2.7.4



[PATCH v2 0/8] exec: Use sane stack rlimit under secureexec

2017-07-10 Thread Kees Cook
As discussed with Linus and Andy, we need to reset the stack rlimit
before we do memory layouts when execing a privilege-gaining (e.g.
setuid) program. This moves security_bprm_secureexec() earlier (with
required changes), and then lowers the stack limit when appropriate.

As a side-effect, dumpability and pdeath_signal clearing is expanded
to cover LSM definitions of secureexec (and Smack can drop its special
handler for pdeath_signal clearing).

I'd appreciate some extra eyes on this to make sure this isn't
broken in some special way. I couldn't find anything that _depended_
on security_bprm_secureexec() being called late.

Thanks!

-Kees

v2:
- fix missed current_security() uses in LSMs.
- research/consolidate dumpability setting logic
- research/consolidate pdeath_signal clearing logic
- split up logical steps a little more for easier review (and bisection)
- fix some old broken comments



[PATCH v2 3/8] exec: Use secureexec for setting dumpability

2017-07-10 Thread Kees Cook
The examination of "current" to decide dumpability is wrong. This was a
check of and euid/uid (or egid/gid) mismatch in the existing process,
not the newly created one. This appears to stretch back into even the
"history.git" tree. Luckily, dumpability is later set in commit_creds().
In earlier kernel versions before creds existed, similar checks also
existed late in the exec flow, covering up the mistake as far back as I
could find.

The commit_creds() check examines differences of euid, uid, egid, gid,
and capabilities between the old and new creds. It would look like
the setup_new_exec() dumpability test could be entirely removed, but
strictly speaking, the secureexec test covers a different set of tests
than what commit_creds() checks for. So, fix this test to use secureexec,
which includes the same logical check (euid != uid || egid != gid),
but checks bprm->cred, not current->cred.

One would wonder if we need a security_commit_creds() LSM hook and to
move the existing checks in commit_creds() into commoncaps.c, which
would allow expanding the logic to all LSMs. Currently this doesn't
seem needed, though.

Signed-off-by: Kees Cook 
---
 fs/exec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/exec.c b/fs/exec.c
index b92e37fb53aa..3e519d4f0bd3 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1346,7 +1346,7 @@ void setup_new_exec(struct linux_binprm * bprm)
 
current->sas_ss_sp = current->sas_ss_size = 0;
 
-   if (uid_eq(current_euid(), current_uid()) && gid_eq(current_egid(), 
current_gid()))
+   if (!bprm->secureexec)
set_dumpable(current->mm, SUID_DUMP_USER);
else
set_dumpable(current->mm, suid_dumpable);
-- 
2.7.4