Re: dropping capabilities in user namespace
Aditya Kali adityak...@google.com writes:

> On Wed, Apr 23, 2014 at 2:17 AM, Eric W. Biederman
> ebied...@xmission.com wrote:
>> Aditya Kali adityak...@google.com writes:
>>
>>> Hi all,
>>>
>>> I am trying to understand the behavior of how we can drop capabilities
>>> inside user namespace. i.e., I want to start a process inside user
>>> namespace with its effective and permitted capability sets cleared.
>>
>> Please note to start with that at the point you are in a user namespace
>> all of your capabilities are relative to that user namespace.
>>
>> Now I have not had any problem dropping capabilities in a user namespace
>> so you are doing something weird. Let me see if I can see what that
>> weird thing is.
>>
>>> A typical way in which a root (uid=0) process can drop its privileges is:
>>>
>>>   prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);
>>
>> You clear this bit in securebits that should already be clear anyway.
>>
>>>   setresuid(uid, uid, uid); // At this point, permitted and effective
>>>                             // capabilities are cleared
>>>   exec()
>>>
>>> But this sequence of operations inside a user namespace does not work
>>> as expected:
>>
>> As I look at this it seems to work as designed. By not starting with
>> uid 0 you have triggered the non-zero uid with caps section of the code
>> that has always behaved differently.
>>
>>> Assume /proc/pid/uid_map has entry: uid uid 1
>>>
>>>   attach_user_ns(pid); // OR create_user_ns() & write_uid_map()
>>>   prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);
>>>   setresuid(uid, uid, uid); // Fails to reset capabilities
>>>   exec()
>>>
>>> The exec()ed process starts with correct uid set, but still with all
>>> the capabilities.
>>>
>>> The differentiating factor here seems to be the 'root_uid' value in
>>> security/commoncap.c:cap_emulate_setxuid():
>>>
>>> static inline void cap_emulate_setxuid(struct cred *new,
>>>                                        const struct cred *old)
>>> {
>>>     kuid_t root_uid = make_kuid(old->user_ns, 0);
>>>
>>>     if ((uid_eq(old->uid, root_uid) ||
>>>          uid_eq(old->euid, root_uid) ||
>>>          uid_eq(old->suid, root_uid)) &&
>>>         (!uid_eq(new->uid, root_uid) &&
>>>          !uid_eq(new->euid, root_uid) &&
>>>          !uid_eq(new->suid, root_uid)) &&
>>>         !issecure(SECURE_KEEP_CAPS)) {
>>>         cap_clear(new->cap_permitted);
>>>         cap_clear(new->cap_effective);
>>>     }
>>>     ...
>>>
>>> There are a couple of problems here:
>>> (1) In the above example, when there is no mapping for uid 0 inside
>>> old->user_ns, make_kuid() returns INVALID_UID. Since we go on to
>>> compare root_uid without first checking if it's even valid, we never
>>> satisfy the 'if' condition and never clear the caps. This looks like a
>>> bug.
>>
>> INVALID_UID will never be in a capability set, so the comparison
>> against root_uid is guaranteed to fail if there is not a root
>> uid. That is correct.
>
> So this does seem like a regression in userns w.r.t.
> global/init-user-ns. (See below for correct example when the behavior
> is different).

It is outside of the conditions that can exist without user namespaces
so it can not possibly be a regression. It may be a violation of
expectations, but I am not certain it is wrong. Most of the way
capabilities work when you don't trigger their backwards compatible
behavior violates expectations.

>>> (2) Even if there is some mapping for uid 0 inside old->user_ns (say
>>> "0 1"), since old->uid = 0, and root_uid= (or some non-zero
>>> uid), the 'if' condition again remains unsatisfied.
>>
>> Correct. Because this code is not supposed to do something if you have
>> caps and your uid is not zero.
>>
>>> It looks like currently the only case where global root (uid=0)
>>> process can drop its capabilities inside a user namespace is by having
>>> "0 0 <length>" mapping in the uid_map file. It seems wrong to expose
>>> global root in user namespace just to drop privileges!
>>
>> Where does global root come into this? Nothing above is global root
>> specific? Or do you just mean you are starting all of this as the
>> global root user?
>
> I am starting my program as global root user, yes. The program
> attaches to given user namespaces, sets uid to given uid and does some
> work (which it expects to do as user <uid> without any capabilities).
>
> I made a mistake in my example above. If I exec() at the end, the
> capabilities do get cleared as you suggest. The problematic case is:
>
>   attach_to_userns(pid)
>   prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);
>   setresuid(uid, uid, uid); // Fails to reset capabilities
>   pause() / sleep(...) / do_some_work_as_uid() [[ no exec, sorry ]]
>
> And I was looking at the Cap* fields in /proc/<process-pid>/status from
> another terminal. I noticed that the capabilities were not reset after
> the setresuid() call.
> This behavior is different as compared to what happens in init_user_ns.

Yes this behavior is different as compared to what happens without user
namespaces involved, and I grant it is a bit surprising.

>>> So I feel we need to fix the condition checks everywhere we are
>>> using make_kuid() in security/commoncap.c.
>>> Can the security experts please advise how this is supposed to work?
>>
>> If you don't want to set your uid to 0 inside a user namespace before
>> setting your uid to something else. You need to call
Re: dropping capabilities in user namespace
Aditya Kali writes:

> Hi all,
>
> I am trying to understand the behavior of how we can drop capabilities
> inside user namespace. i.e., I want to start a process inside user
> namespace with its effective and permitted capability sets cleared.

Please note to start with that at the point you are in a user namespace
all of your capabilities are relative to that user namespace.

Now I have not had any problem dropping capabilities in a user namespace
so you are doing something weird. Let me see if I can see what that
weird thing is.

> A typical way in which a root (uid=0) process can drop its privileges is:
>
>   prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);

You clear this bit in securebits that should already be clear anyway.

>   setresuid(uid, uid, uid); // At this point, permitted and effective
>                             // capabilities are cleared
>   exec()
>
> But this sequence of operations inside a user namespace does not work
> as expected:

As I look at this it seems to work as designed. By not starting with
uid 0 you have triggered the non-zero uid with caps section of the code
that has always behaved differently.

> Assume /proc/pid/uid_map has entry: uid uid 1
>
>   attach_user_ns(pid); // OR create_user_ns() & write_uid_map()
>   prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);
>   setresuid(uid, uid, uid); // Fails to reset capabilities
>   exec()
>
> The exec()ed process starts with correct uid set, but still with all
> the capabilities.
>
> The differentiating factor here seems to be the 'root_uid' value in
> security/commoncap.c:cap_emulate_setxuid():
>
> static inline void cap_emulate_setxuid(struct cred *new,
>                                        const struct cred *old)
> {
>     kuid_t root_uid = make_kuid(old->user_ns, 0);
>
>     if ((uid_eq(old->uid, root_uid) ||
>          uid_eq(old->euid, root_uid) ||
>          uid_eq(old->suid, root_uid)) &&
>         (!uid_eq(new->uid, root_uid) &&
>          !uid_eq(new->euid, root_uid) &&
>          !uid_eq(new->suid, root_uid)) &&
>         !issecure(SECURE_KEEP_CAPS)) {
>         cap_clear(new->cap_permitted);
>         cap_clear(new->cap_effective);
>     }
>     ...
>
> There are a couple of problems here:
> (1) In the above example, when there is no mapping for uid 0 inside
> old->user_ns, make_kuid() returns INVALID_UID. Since we go on to
> compare root_uid without first checking if it's even valid, we never
> satisfy the 'if' condition and never clear the caps. This looks like a
> bug.

INVALID_UID will never be in a capability set, so the comparison
against root_uid is guaranteed to fail if there is not a root
uid. That is correct.

> (2) Even if there is some mapping for uid 0 inside old->user_ns (say
> "0 1"), since old->uid = 0, and root_uid= (or some non-zero
> uid), the 'if' condition again remains unsatisfied.

Correct. Because this code is not supposed to do something if you have
caps and your uid is not zero.

> It looks like currently the only case where global root (uid=0)
> process can drop its capabilities inside a user namespace is by having
> "0 0 <length>" mapping in the uid_map file. It seems wrong to expose
> global root in user namespace just to drop privileges!

Where does global root come into this? Nothing above is global root
specific? Or do you just mean you are starting all of this as the
global root user?

> So I feel we need to fix the condition checks everywhere we are
> using make_kuid() in security/commoncap.c.
> Can the security experts please advise how this is supposed to work?

If you don't want things to work like normal, and you want to skip
setting your uid to 0 inside the user namespace before calling
setresuid(2), you need to call capset(2), because you are in bizarro
land with respect to capabilities.

But your scenario continues to be very weird because after exec you
should not have capabilities.

Looking at cap_bprm_set_creds()
{
    if (!issecure(SECURE_NOROOT)) {
        ...
        if (uid_eq(new->euid, root_uid))
            effective = true;
    }
    ...
    if (effective)
        new->cap_effective = new->cap_permitted;
    else
        cap_clear(new->cap_effective);
    ...
}

That very clearly clears your effective set if your uid is not 0 in the
user namespace. I fail to see how even in the example you gave above
that you would have any effective capabilities after exec.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
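For anyone following along, here is a rough userspace sketch of the capset(2) route Eric suggests. It is only an illustration and not taken from the thread: the helper names drop_all_caps() and caps_are_empty() are made up, and it uses the raw capset/capget system calls with the version 3 header from <linux/capability.h> rather than libcap.

```c
#include <assert.h>
#include <linux/capability.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: clear the effective, permitted and inheritable
 * sets of the calling thread. Returns 0 on success, -1 with errno set.
 */
static int drop_all_caps(void)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

    memset(&hdr, 0, sizeof(hdr));
    hdr.version = _LINUX_CAPABILITY_VERSION_3;
    hdr.pid = 0;                    /* 0 means the calling thread */
    memset(data, 0, sizeof(data));  /* every set empty */

    return (int)syscall(SYS_capset, &hdr, data);
}

/*
 * Hypothetical helper: returns 1 if the effective and permitted sets of
 * the calling thread are empty, 0 if not, -1 if capget fails.
 */
static int caps_are_empty(void)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];
    int i;

    memset(&hdr, 0, sizeof(hdr));
    hdr.version = _LINUX_CAPABILITY_VERSION_3;
    memset(data, 0, sizeof(data));

    if (syscall(SYS_capget, &hdr, data) != 0)
        return -1;
    for (i = 0; i < _LINUX_CAPABILITY_U32S_3; i++) {
        assert(i >= 0);
        if (data[i].effective != 0 || data[i].permitted != 0)
            return 0;
    }
    return 1;
}
```

In the scenario from this thread the call would go after setresuid(uid, uid, uid), since CAP_SETUID is still needed for the uid change itself. Note that the raw system call only affects the calling thread, so a multi-threaded program would have to do this in every thread.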
Re: dropping capabilities in user namespace
On Wed, Apr 23, 2014 at 2:17 AM, Eric W. Biederman
ebied...@xmission.com wrote:
> Aditya Kali adityak...@google.com writes:
>
>> Hi all,
>>
>> I am trying to understand the behavior of how we can drop capabilities
>> inside user namespace. i.e., I want to start a process inside user
>> namespace with its effective and permitted capability sets cleared.
>
> Please note to start with that at the point you are in a user namespace
> all of your capabilities are relative to that user namespace.
>
> Now I have not had any problem dropping capabilities in a user namespace
> so you are doing something weird. Let me see if I can see what that
> weird thing is.
>
>> A typical way in which a root (uid=0) process can drop its privileges is:
>>
>>   prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);
>
> You clear this bit in securebits that should already be clear anyway.
>
>>   setresuid(uid, uid, uid); // At this point, permitted and effective
>>                             // capabilities are cleared
>>   exec()
>>
>> But this sequence of operations inside a user namespace does not work
>> as expected:
>
> As I look at this it seems to work as designed. By not starting with
> uid 0 you have triggered the non-zero uid with caps section of the code
> that has always behaved differently.
>
>> Assume /proc/pid/uid_map has entry: uid uid 1
>>
>>   attach_user_ns(pid); // OR create_user_ns() & write_uid_map()
>>   prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);
>>   setresuid(uid, uid, uid); // Fails to reset capabilities
>>   exec()
>>
>> The exec()ed process starts with correct uid set, but still with all
>> the capabilities.
>>
>> The differentiating factor here seems to be the 'root_uid' value in
>> security/commoncap.c:cap_emulate_setxuid():
>>
>> static inline void cap_emulate_setxuid(struct cred *new,
>>                                        const struct cred *old)
>> {
>>     kuid_t root_uid = make_kuid(old->user_ns, 0);
>>
>>     if ((uid_eq(old->uid, root_uid) ||
>>          uid_eq(old->euid, root_uid) ||
>>          uid_eq(old->suid, root_uid)) &&
>>         (!uid_eq(new->uid, root_uid) &&
>>          !uid_eq(new->euid, root_uid) &&
>>          !uid_eq(new->suid, root_uid)) &&
>>         !issecure(SECURE_KEEP_CAPS)) {
>>         cap_clear(new->cap_permitted);
>>         cap_clear(new->cap_effective);
>>     }
>>     ...
>>
>> There are a couple of problems here:
>> (1) In the above example, when there is no mapping for uid 0 inside
>> old->user_ns, make_kuid() returns INVALID_UID. Since we go on to
>> compare root_uid without first checking if it's even valid, we never
>> satisfy the 'if' condition and never clear the caps. This looks like a
>> bug.
>
> INVALID_UID will never be in a capability set, so the comparison
> against root_uid is guaranteed to fail if there is not a root
> uid. That is correct.

So this does seem like a regression in userns w.r.t.
global/init-user-ns. (See below for correct example when the behavior
is different).

>> (2) Even if there is some mapping for uid 0 inside old->user_ns (say
>> "0 1"), since old->uid = 0, and root_uid= (or some non-zero
>> uid), the 'if' condition again remains unsatisfied.
>
> Correct. Because this code is not supposed to do something if you have
> caps and your uid is not zero.
>
>> It looks like currently the only case where global root (uid=0)
>> process can drop its capabilities inside a user namespace is by having
>> "0 0 <length>" mapping in the uid_map file. It seems wrong to expose
>> global root in user namespace just to drop privileges!
>
> Where does global root come into this? Nothing above is global root
> specific? Or do you just mean you are starting all of this as the
> global root user?

I am starting my program as global root user, yes. The program
attaches to given user namespaces, sets uid to given uid and does some
work (which it expects to do as user <uid> without any capabilities).

I made a mistake in my example above. If I exec() at the end, the
capabilities do get cleared as you suggest. The problematic case is:

  attach_to_userns(pid)
  prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);
  setresuid(uid, uid, uid); // Fails to reset capabilities
  pause() / sleep(...) / do_some_work_as_uid() [[ no exec, sorry ]]

And I was looking at the Cap* fields in /proc/<process-pid>/status from
another terminal. I noticed that the capabilities were not reset after
the setresuid() call.
This behavior is different as compared to what happens in init_user_ns.

>> So I feel we need to fix the condition checks everywhere we are
>> using make_kuid() in security/commoncap.c.
>> Can the security experts please advise how this is supposed to work?
>
> If you don't want things to work like normal, and you want to skip
> setting your uid to 0 inside the user namespace before calling
> setresuid(2), you need to call capset(2), because you are in bizarro
> land with respect to capabilities.

I cannot call setuid(0) before setting the uid to something else, as
there is no uid 0 inside the userns as per the uid_map. I will try the
capset() approach, but I hope we could fix the above case too.

> But your scenario continues to be very weird because after exec you should
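The check described in this message, looking at the Cap* fields in /proc/<process-pid>/status, can also be done programmatically. The following is a small sketch under the assumption that the fields are the usual hexadecimal "CapEff:", "CapPrm:" and so on lines; read_cap_field() is a made-up helper name, not an existing API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: read one hexadecimal Cap* field (for example
 * "CapEff") from /proc/<pid>/status. pid_str may also be "self".
 * Returns 0 and stores the bitmask in *out, or -1 if the file or the
 * field cannot be found.
 */
static int read_cap_field(const char *pid_str, const char *field,
                          unsigned long long *out)
{
    char path[64];
    char line[256];
    size_t flen;
    FILE *f;
    int rc = -1;

    assert(pid_str != NULL && field != NULL && out != NULL);
    flen = strlen(field);

    if (snprintf(path, sizeof(path), "/proc/%s/status", pid_str)
        >= (int)sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    while (fgets(line, sizeof(line), f) != NULL) {
        /* Lines look like "CapEff:\t0000000000000000". */
        if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
            *out = strtoull(line + flen + 1, NULL, 16);
            rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}
```

Calling read_cap_field("self", "CapEff", &mask) before and after the setresuid() call would show whether the effective set actually changed, without needing a second terminal.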
dropping capabilities in user namespace
Hi all,

I am trying to understand the behavior of how we can drop capabilities
inside user namespace. i.e., I want to start a process inside user
namespace with its effective and permitted capability sets cleared.

A typical way in which a root (uid=0) process can drop its privileges is:

  prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);
  setresuid(uid, uid, uid); // At this point, permitted and effective
                            // capabilities are cleared
  exec()

But this sequence of operations inside a user namespace does not work
as expected:

Assume /proc/pid/uid_map has entry: uid uid 1

  attach_user_ns(pid); // OR create_user_ns() & write_uid_map()
  prctl(PR_SET_KEEPCAPS, 0, 0, 0, 0);
  setresuid(uid, uid, uid); // Fails to reset capabilities
  exec()

The exec()ed process starts with correct uid set, but still with all
the capabilities.

The differentiating factor here seems to be the 'root_uid' value in
security/commoncap.c:cap_emulate_setxuid():

static inline void cap_emulate_setxuid(struct cred *new,
                                       const struct cred *old)
{
    kuid_t root_uid = make_kuid(old->user_ns, 0);

    if ((uid_eq(old->uid, root_uid) ||
         uid_eq(old->euid, root_uid) ||
         uid_eq(old->suid, root_uid)) &&
        (!uid_eq(new->uid, root_uid) &&
         !uid_eq(new->euid, root_uid) &&
         !uid_eq(new->suid, root_uid)) &&
        !issecure(SECURE_KEEP_CAPS)) {
        cap_clear(new->cap_permitted);
        cap_clear(new->cap_effective);
    }
    ...

There are a couple of problems here:
(1) In the above example, when there is no mapping for uid 0 inside
old->user_ns, make_kuid() returns INVALID_UID. Since we go on to
compare root_uid without first checking if it's even valid, we never
satisfy the 'if' condition and never clear the caps. This looks like a
bug.
(2) Even if there is some mapping for uid 0 inside old->user_ns (say
"0 1"), since old->uid = 0, and root_uid= (or some non-zero
uid), the 'if' condition again remains unsatisfied.

It looks like currently the only case where global root (uid=0)
process can drop its capabilities inside a user namespace is by having
"0 0 <length>" mapping in the uid_map file. It seems wrong to expose
global root in user namespace just to drop privileges!

So I feel we need to fix the condition checks everywhere we are
using make_kuid() in security/commoncap.c.
Can the security experts please advise how this is supposed to work?

(FYI: Commit 18815a18085364d8514c0d0c4c986776cb74272c "userns: Convert
capabilities related permsion checks" introduced the make_kuid() change
in cap_emulate_setxuid() & other places).

Thanks,
--
Aditya
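To make cases (1) and (2) easier to follow, the condition can be modelled in plain userspace C. This is only a sketch and not kernel code: uids are plain integers instead of kuid_t, the uid_map is reduced to a single extent, and every name starting with model_ is invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: stands in for the kernel's INVALID_UID. */
#define MODEL_INVALID_UID UINT32_MAX

/* One uid_map line: "first lower_first count". */
struct model_uid_extent {
    uint32_t first;        /* first uid inside the namespace */
    uint32_t lower_first;  /* uid it maps to outside */
    uint32_t count;        /* length of the range */
};

/* Credentials as the kernel stores them, i.e. already translated. */
struct model_cred {
    uint32_t uid, euid, suid;
};

/* Model of make_kuid(): translate a namespace uid or report "unmapped". */
static uint32_t model_make_kuid(const struct model_uid_extent *map,
                                uint32_t ns_uid)
{
    assert(map != NULL);
    if (ns_uid >= map->first && ns_uid - map->first < map->count)
        return map->lower_first + (ns_uid - map->first);
    return MODEL_INVALID_UID;
}

/* Model of the 'if' quoted above: true when the caps would be cleared. */
static bool model_setxuid_clears_caps(const struct model_uid_extent *map,
                                      const struct model_cred *old_cred,
                                      const struct model_cred *new_cred,
                                      bool keep_caps)
{
    uint32_t root_uid = model_make_kuid(map, 0);
    bool old_had_root = old_cred->uid == root_uid ||
                        old_cred->euid == root_uid ||
                        old_cred->suid == root_uid;
    bool new_has_root = new_cred->uid == root_uid ||
                        new_cred->euid == root_uid ||
                        new_cred->suid == root_uid;

    return old_had_root && !new_has_root && !keep_caps;
}
```

With a map of {0, 0, 4294967295} (the initial namespace) a switch from uid 0 to uid 1000 clears the caps in this model. With {1000, 1000, 1} there is no root_uid to compare against, and with {0, 1, 1} root_uid is 1 while the old uid is 0, so in both cases the function returns false, which matches (1) and (2).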