Re: [PATCH v1 3/3] cgroup: relax common ancestor restriction for direct descendants

2016-07-20 Thread Aleksa Sarai
e, etc). The current setup would obviously still work, but you'd add a permission for users that just want to be able to limit their own processes. IIRC we need to update cgroup_procs_write_permission() anyway. By having the cgroup namespace requirement, you'd definitely have to "

Re: [PATCH v1 2/3] cgroup: allow for unprivileged subtree management

2016-07-20 Thread Aleksa Sarai
eference doesn't do anything. Getting it here would only make sense if the pointer is passed to an asynchronous context. I'll send out a fixed patchset once we figure out the cgroup_procs_write_permission() stuff. -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: [PATCH v1 3/3] cgroup: relax common ancestor restriction for direct descendants

2016-07-20 Thread Aleksa Sarai
e container crowd, but people also planning on using cgroups as advanced forms of rlimits). -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: [PATCH v1 3/3] cgroup: relax common ancestor restriction for direct descendants

2016-07-21 Thread Aleksa Sarai
folks would find it useful to be able to limit browser processes in a similar way. -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: [PATCH v1 3/3] cgroup: relax common ancestor restriction for direct descendants

2016-07-21 Thread Aleksa Sarai
administrator to *install* and support LXC (as well as the shadow-utils setuid binaries too). There are cases where you don't have the freedom to do that, and also "just get someone to give you privileges temporarily" is again punting on the problem. -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: [PATCH v1 3/3] cgroup: relax common ancestor restriction for direct descendants

2016-07-21 Thread Aleksa Sarai
oesn't inspire confidence :/ (from the runC side, we've had nothing but issues). Also, how do you even boot into a cgroupv2 system with systemd (I started backporting patches to openSUSE, but it's still not booting)? -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: [PATCH v1 3/3] cgroup: relax common ancestor restriction for direct descendants

2016-07-22 Thread Aleksa Sarai
ss anything (and if there's enough processes then cgroup.procs reads aren't atomic either). -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: [PATCH v1 3/3] cgroup: relax common ancestor restriction for direct descendants

2016-07-22 Thread Aleksa Sarai
to move the actual processes within the cgroup namespace around? The administrator could also join the cgroupns (without needing to join the userns) and then just move things around that way? Do any of those suggestions seem reasonable? -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: strace lockup when tracing exec in go

2016-09-22 Thread Aleksa Sarai
issue still persists, but I didn't apply your other proposed change to this conditional. Or am I misunderstanding what tsk->ptrace refers to? -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: [PATCH v2] cgroup: Add pids controller event when fork fails because of pid limit

2016-06-21 Thread Aleksa Sarai
} > + } > + return err; > } Why are we logging this? Isn't the pids.events file enough information? I feel like you could remove a lot of logic if you don't log this. And even if we do end up logging it, why have the boolean flag (the counter always increases, just log if the counter is currently 0 and you're incrementing it). -- Aleksa Sarai (cyphar) www.cyphar.com
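The suggestion above (drop the boolean flag and log only on the counter's 0 -> 1 transition) can be sketched in userspace terms. This is an illustrative model only, not kernel code: the class, its attribute names and the log text are all hypothetical stand-ins.

```python
class PidsEvents:
    """Hypothetical stand-in for a cgroup's pids.events state (not kernel code)."""

    def __init__(self):
        self.fail_count = 0  # assumed name for the event counter
        self.log = []        # stands in for the kernel log

    def fork_failed(self, cgroup_name):
        # The counter only ever increases, so "currently 0" identifies the
        # first failure without needing a separate "already logged" flag.
        if self.fail_count == 0:
            self.log.append(f"fork rejected by pids limit in {cgroup_name}")
        self.fail_count += 1
```

Because the counter is monotonic, a counter of zero says everything a separate flag would, which is the simplification being proposed.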

Re: strace lockup when tracing exec in go

2016-09-22 Thread Aleksa Sarai
'll apply it to my local kernel. :D -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: [PATCH] ioctl_tty.2: add TIOCGPTPEER documentation

2017-08-15 Thread Aleksa Sarai
wrappers around the open+ptsname combo that I mention earlier in the sentence (and thus are vulnerable to the same issue). But if you feel it's confusing you can feel free to drop it. Thanks. -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: [PATCH] ioctl_tty.2: add TIOCGPTPEER documentation

2017-08-16 Thread Aleksa Sarai
ation rather than open(2) with the pathname returned by ptsname(3). This would clarify that there are use cases where you need this particular feature, without causing people to panic over inaccurate claims of glibc being broken. Does that sound better? -- Aleksa Sarai Software Eng

Re: [PATCH] tty: hide unused pty_get_peer function

2017-06-20 Thread Aleksa Sarai
add TIOCGPTPEER ioctl") Signed-off-by: Arnd Bergmann Oops, I missed that. Acked-by: Aleksa Sarai -- Aleksa Sarai Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

RFC: making cn_proc work in {pid,user} namespaces

2017-10-15 Thread Aleksa Sarai
to see whether anyone has any solid NACKs against the use-case or whether there is some fundamental issue that I'm not seeing. If nobody objects, I'll be happy to work on this. [1]: https://lwn.net/Articles/532748/ -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: RFC: making cn_proc work in {pid,user} namespaces

2017-10-16 Thread Aleksa Sarai
in [1] -- it's clearly not a security issue per-se but it is a correctness one). I'll try to work through those in either case, but I imagine that the architecture reworks necessary to fix those issues will make making it work for unprivileged users quite trivial (excluding the part

[PATCH] scsi: require CAP_SYS_ADMIN to write to procfs interface

2017-11-04 Thread Aleksa Sarai
DoS by removing the underlying SCSI device of the host's / mount). Cc: Cc: "Eric W. Biederman" Signed-off-by: Aleksa Sarai --- drivers/scsi/scsi_proc.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_proc.c b/drivers/scsi/sc

Re: [PATCH] scsi: require CAP_SYS_ADMIN to write to procfs interface

2017-11-04 Thread Aleksa Sarai
DoS by removing the underlying SCSI device of the host's / mount). Cc: Cc: "Eric W. Biederman" Signed-off-by: Aleksa Sarai --- drivers/scsi/scsi_proc.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_proc.c b/drivers/scsi/sc

[PATCH v2] scsi: require CAP_SYS_ADMIN to write to procfs interface

2017-11-04 Thread Aleksa Sarai
DoS by removing the underlying SCSI device of the host's / mount). Cc: Cc: "Eric W. Biederman" Signed-off-by: Aleksa Sarai --- drivers/scsi/scsi_proc.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_proc.c b/drivers/scsi/sc

[PATCH v3] scsi: require CAP_SYS_ADMIN to write to procfs interface

2017-11-04 Thread Aleksa Sarai
DoS by removing the underlying SCSI device of the host's / mount). Cc: Cc: "Eric W. Biederman" Signed-off-by: Aleksa Sarai --- drivers/scsi/scsi_proc.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_proc.c b/drivers/scsi/sc

Re: [PATCH v3] scsi: require CAP_SYS_ADMIN to write to procfs interface

2017-11-04 Thread Aleksa Sarai
On 11/05/2017 01:56 PM, Aleksa Sarai wrote: Previously, the only capability effectively required to operate on the /proc/scsi interface was CAP_DAC_OVERRIDE (or for some other files, having an fsuid of GLOBAL_ROOT_UID was enough). This means that semi-privileged processes could interfere with

Re: [PATCH v3] scsi: require CAP_SYS_ADMIN to write to procfs interface

2017-11-05 Thread Aleksa Sarai
I've booted it on a few of my laptops, and nothing seemed to break. Is there a particular test-suite you'd recommend that I run? On Sun, Nov 5, 2017 at 6:31 PM, Greg KH wrote: > On Sun, Nov 05, 2017 at 01:56:35PM +1100, Aleksa Sarai wrote: >> Previously, the only capability e

Re: RFC(v2): Audit Kernel Container IDs

2017-10-18 Thread Aleksa Sarai
but also there are cases where thinking of it as being hierarchical isn't necessarily correct). -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: RFC(v2): Audit Kernel Container IDs

2017-10-19 Thread Aleksa Sarai
e sense for generic containers, but since the point of this facility is *specifically* for audit I imagine that not being able to move a process from a sub-container's ID is a benefit. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH https://www.cyphar.com/

Re: RFC(v2): Audit Kernel Container IDs

2017-10-19 Thread Aleksa Sarai
e sense for generic containers, but since the point of this facility is *specifically* for audit I imagine that not being able to move a process from a sub-container's ID is a benefit. [This assumes it's CAP_AUDIT_CONTROL which is what we are discussing in a sister thread.] --

Re: pids controller fails to track zombie processes

2015-10-05 Thread Aleksa Sarai
oller? But I'm not sure if that's safe (can we even call task_get_css() on a task that's being waited on?). -- Aleksa Sarai (cyphar) www.cyphar.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.

Re: [PATCH 14/14] cgroup: add cgroup_subsys->free() method and use it to fix pids controller

2015-10-12 Thread Aleksa Sarai
> exit. Looks good to me. Out of interest, is there any reason why we still have ->exit(), given the zombie process edge case? Surely the zombie process edge case would cause issues with kmemcg and similar controllers, if they use ->exit() and not ->free()? -- Aleksa Sarai (cyph

module_put_and_exit() and free_module()

2015-09-05 Thread Aleksa Sarai
se, why do you get EBUSY when trying to remove the module (surely you should get an ENOENT)? Is it even safe to attempt to remove a module from within itself? Thanks in advance. -- Aleksa Sarai (cyphar) www.cyphar.com -- To unsubscribe from this list: send the line "unsubscribe linux-k

Re: [PATCH] pty: fix O_CLOEXEC for TIOCGPTPEER

2018-07-19 Thread Aleksa Sarai
On 2018-07-19, Matthijs van Duin wrote: > It was being ignored because the flags were not passed to fd allocation. > > Fixes: 54ebbfb16034 ("tty: add TIOCGPTPEER ioctl") Acked-by: Aleksa Sarai > Signed-off-by: Matthijs van Duin > --- > drivers/tty/pty.c | 2 +- >

[REGRESSION v4.16-rc6] [PATCH] mqueue: forbid unprivileged user access to internal mount

2018-03-22 Thread Aleksa Sarai
mount was protected by mount_ns(), but the patch in question switched to kern_mount_data() which doesn't do this necessary permission check. So add it explicitly to mq_internal_mount(). Fixes: 36735a6a2b5e ("mqueue: switch to on-demand creation of internal mount") Reported-by: Felix

Re: [PATCH v7 1/6] seccomp: add a return code to trap to userspace

2018-09-28 Thread Aleksa Sarai
r Hicks > CC: Akihiro Suda Would you mind adding me to the Cc: list for the next round of patches? It's looking pretty neat! Thanks! -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

[PATCH 0/3] namei: implement various scoping AT_* flags

2018-09-29 Thread Aleksa Sarai
is, but it's not clear to me whether they should live here or in xfstests (as far as I can tell there are no other VFS tests in selftests, while there are some tests that look like generic VFS tests in xfstests). If you'd prefer them to be included in xfstests, let me know. [1]: https://lo

[PATCH 1/3] namei: implement O_BENEATH-style AT_* flags

2018-09-29 Thread Aleksa Sarai
y of them (and will be done in a separate patchset). Cc: Andy Lutomirski Cc: Eric Biederman Cc: Christian Brauner Signed-off-by: Aleksa Sarai --- fs/fcntl.c | 2 +- fs/namei.c | 61 ++-- fs/open.c

[PATCH 2/3] namei: implement AT_THIS_ROOT chroot-like path resolution

2018-09-29 Thread Aleksa Sarai
deally this flag would be supported by all *at(2) syscalls, but this will require adding flags arguments to many of them (and will be done in a separate patchset). [1]: https://github.com/cyphar/filepath-securejoin Cc: Eric Biederman Cc: Christian Brauner Signed-off-by: Alek

[PATCH 3/3] selftests: vfs: add AT_* path resolution tests

2018-09-29 Thread Aleksa Sarai
With the addition of so many new scoping flags, it's necessary to have some sort of validation that they really work. There were no vfs self-tests in the past, so this also includes a basic framework that future VFS tests can use. Signed-off-by: Aleksa Sarai --- tools/testing/selftests/Mak

Re: [PATCH 1/3] namei: implement O_BENEATH-style AT_* flags

2018-09-29 Thread Aleksa Sarai
On 2018-09-29, Christian Brauner wrote: > > Cc: Andy Lutomirski > > Cc: Eric Biederman > > Cc: Christian Brauner > > Signed-off-by: Aleksa Sarai > > Not to be a stickler about protocol but given that this is based heavily > on ideas from prior patchsets an

Re: [PATCH 0/3] namei: implement various scoping AT_* flags

2018-09-29 Thread Aleksa Sarai
on images). [*]: Sorry for the awful naming, I'm not sure what the correct name is (I've called them "super symlinks" in the past) -- if you have a better name please let me know! [1]: https://lwn.net/Articles/721443/ [2]: https://marc.info/?l=linux-kernel&m=149394765324531&w=2 -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [PATCH v4 2/4] namei: O_BENEATH-style path resolution flags

2018-11-23 Thread Aleksa Sarai
On 2018-11-23, Andy Lutomirski wrote: > > On Nov 23, 2018, at 5:10 AM, Jürg Billeter wrote: > > > > Hi Aleksa, > > > >> On Tue, 2018-11-13 at 01:26 +1100, Aleksa Sarai wrote: > >> * O_BENEATH: Disallow "escapes" from the starting point of t

Re: [PATCH v4 2/4] namei: O_BENEATH-style path resolution flags

2018-11-23 Thread Aleksa Sarai
On 2018-11-24, Aleksa Sarai wrote: > > >> On Tue, 2018-11-13 at 01:26 +1100, Aleksa Sarai wrote: > > >> * O_BENEATH: Disallow "escapes" from the starting point of the > > >> filesystem tree during resolution (you must stay "beneath" the

Re: [PATCH v4 2/4] namei: O_BENEATH-style path resolution flags

2018-11-23 Thread Aleksa Sarai
On 2018-11-23, Jürg Billeter wrote: > On Tue, 2018-11-13 at 01:26 +1100, Aleksa Sarai wrote: > > * O_BENEATH: Disallow "escapes" from the starting point of the > > filesystem tree during resolution (you must stay "beneath" the > > starting point

Re: [PATCH] proc: allow killing processes via file descriptors

2018-11-19 Thread Aleksa Sarai
its name suggest) it's tied to network namespaces and not pid namespaces so you wouldn't reasonably be able to use the API inside a container. Using an fd side-steps the problem somewhat (though this just gave me an idea -- I will add it to the other thread). -- Aleksa Sarai Senio

Re: [PATCH v1 2/2] signal: add procfd_signal() syscall

2018-11-19 Thread Aleksa Sarai
working on it? If we extend the procfd API to allow process creation this would allow a container to create a process outside its pidns. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [PATCH v1 2/2] signal: add procfd_signal() syscall

2018-11-19 Thread Aleksa Sarai
On 2018-11-19, Christian Brauner wrote: > On Tue, Nov 20, 2018 at 07:28:57AM +1100, Aleksa Sarai wrote: > > On 2018-11-19, Christian Brauner wrote: > > > + if (info) { > > > + ret = __copy_siginfo_from_user(sig, &kinfo, info); &

Re: [PATCH v1 2/2] signal: add procfd_signal() syscall

2018-11-19 Thread Aleksa Sarai
On 2018-11-20, Aleksa Sarai wrote: > On 2018-11-19, Christian Brauner wrote: > > On Tue, Nov 20, 2018 at 07:28:57AM +1100, Aleksa Sarai wrote: > > > On 2018-11-19, Christian Brauner wrote: > > > > + if (info) { > > > > + ret =

Re: [PATCH v1 2/2] signal: add procfd_signal() syscall

2018-11-19 Thread Aleksa Sarai
On 2018-11-19, Christian Brauner wrote: > On Tue, Nov 20, 2018 at 08:18:10AM +1100, Aleksa Sarai wrote: > > On 2018-11-19, Christian Brauner wrote: > > > On Tue, Nov 20, 2018 at 07:28:57AM +1100, Aleksa Sarai wrote: > > > > On 2018-11-19, Christian Brauner w

Re: [PATCH v1 2/2] signal: add procfd_signal() syscall

2018-11-19 Thread Aleksa Sarai
d_new() and the rest of the API we are discussing, I'd argue we'd want to allow passing an nsfs fd to specify what pidns we want the process to be created in (for procfd_new()). This will obviously require a permission check to make sure we aren't creating processes in a parent

Re: [PATCH v2] signal: add procfd_signal() syscall

2018-11-22 Thread Aleksa Sarai
close(fd); > exit(EXIT_FAILURE); > } > > close(fd); > > exit(EXIT_SUCCESS); > } > > [1]: https://lkml.org/lkml/2018/11/18/130 > > Cc: "Eric W. Biederman" > Cc: Serge Hallyn > Cc: Jann Hor

Re: [PATCH v2] signal: add procfd_signal() syscall

2018-11-29 Thread Aleksa Sarai
ork through that rewrite in a future series once this one goes in. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [PATCH] proc: allow killing processes via file descriptors

2018-11-18 Thread Aleksa Sarai
de from being ugly, > > avoids this particular issue because it merely lets you wait for > > something you already could have observed using readdir(). > > Yes. I mentioned this same issue-punting as the motivation behind > exithand, initially, just reading EOF on exit. One quest

Re: [PATCH] proc: allow killing processes via file descriptors

2018-11-18 Thread Aleksa Sarai
hing I think we should eventually consider is to provide an API which pings a listener whenever a process does an execve() (and possibly fork()). This is something you can get from FreeBSD's kqueue -- and is something that we have in a really neutered form in the "proc connector". But of course we can discuss this separately, especially if we have an extensible API idea in mind when we start. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [PATCH] proc: allow killing processes via file descriptors

2018-11-18 Thread Aleksa Sarai
On 2018-11-18, Daniel Colascione wrote: > On Sun, Nov 18, 2018 at 11:05 AM, Aleksa Sarai wrote: > > On 2018-11-18, Daniel Colascione wrote: > >> > Here's my point: if we're really going to make a new API to manipulate > >> > processes by their fd, I

Re: [PATCH v2] Document /proc/pid PID reuse behavior

2018-11-19 Thread Aleksa Sarai
port for it in 2016). I agree with your overall point, but it should be noted that the vast majority of Linux systems these days have protections against this (by default) that use the pids cgroup controller. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

[PATCH v4 0/4] namei: O_* flags to restrict path resolution

2018-11-12 Thread Aleksa Sarai
: Eric Biederman Cc: Andy Lutomirski Cc: David Howells Cc: Jann Horn Cc: Christian Brauner Cc: David Drysdale Cc: Cc: Cc: [1]: https://lwn.net/Articles/721443/ [2]: https://lore.kernel.org/patchwork/patch/784221/ [3]: https://lwn.net/Articles/619151/ [4]: https://lwn.net/Articles/603929/ [5]: https:/

[PATCH v4 1/4] namei: split out nd->dfd handling to dirfd_path_init

2018-11-12 Thread Aleksa Sarai
d-off-by: Aleksa Sarai --- fs/namei.c | 103 ++--- 1 file changed, 59 insertions(+), 44 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index fb913148d4d1..faefca58348d 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2168,9 +2168,59 @@ static int link_p

[PATCH v4 2/4] namei: O_BENEATH-style path resolution flags

2018-11-12 Thread Aleksa Sarai
made in this refresh. [1]: https://lwn.net/Articles/721443/ [2]: https://lwn.net/Articles/619151/ [3]: https://lwn.net/Articles/603929/ [4]: https://lwn.net/Articles/723057/ Cc: Eric Biederman Cc: Christian Brauner Cc: Suggested-by: David Drysdale Suggested-by: Al Viro Suggested-by: Andy Lutomi

[PATCH v4 3/4] namei: O_THISROOT: chroot-like path resolution

2018-11-12 Thread Aleksa Sarai
extending AT_EMPTY_PATH support. [1]: https://github.com/cyphar/filepath-securejoin Cc: Eric Biederman Cc: Christian Brauner Cc: Signed-off-by: Aleksa Sarai --- fs/fcntl.c | 2 +- fs/namei.c | 6 +++--- fs/open.c| 4 +++

[PATCH v4 4/4] namei: aggressively check for nd->root escape on ".." resolution

2018-11-12 Thread Aleksa Sarai
ock must have been taken by the attacker after path_is_under() returned in the victim), and thus will not be able to escape from the previously-inside-root path. Walking down is still safe since the entire subtree was moved (either by rename(2) or MS_MOVE) and because (as discussed above) walk

Re: [PATCH v4] signal: add taskfd_send_signal() syscall

2018-12-06 Thread Aleksa Sarai
ry to existing kill(2) interfaces (making it so that transitioning to it won't be seamless), What would the error be? ESRCH would be _very_ wrong, given that it would confuse the two states (zombie/dead-for-real) and would lead to weird cases where fstatat(taskfd) succeeds but taskfd_sen

Re: [PATCH v2] signal: add procfd_signal() syscall

2018-12-03 Thread Aleksa Sarai
to err; > will fail. Imho this is correct behavior since technically signaling a > struct pid is the equivalent of writing to a file and hence doesn't > purely operate on the file descriptor level. Not to mention that O_PATH file descriptors are a whole kettle of fish when it comes to permission checking semantics. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [PATCH v3 1/2] kretprobe: produce sane stack traces

2018-11-09 Thread Aleksa Sarai
ed a quick impl that I could test). I will fix this, thanks! By is_kretprobe_handler_context() I imagine you are referring to checking is_kretprobe(current_kprobe())? > So, we should check we are in the kretprobe handler context if tsk == current, > if not, we definitely can lock the hash lock witho

Re: Cgroups "pids" controller does not update "pids.current" count immediately

2018-06-14 Thread Aleksa Sarai
ur reproducer will actually produce zombies (I only took a quick look at it). > The "memory" controller, for example, works as expected and does not suffer > from this asynchronous lag. I'm not sure what makes the memory controller and the pids controller comparable in this aspec

Re: Cgroups "pids" controller does not update "pids.current" count immediately

2018-06-14 Thread Aleksa Sarai
On 2018-06-15, Aleksa Sarai wrote: > > I've tested this on 4.14.27 and 4.4.0-124-generic Ubuntu. > > > > If I start a couple of processes which exit very quickly (like a simple Bash > > script with many commands in it), the reported value in "pids.current"

Re: [PATCH v2] signal: add procfd_signal() syscall

2018-12-25 Thread Aleksa Sarai
ends on > > the guarantee that holding /proc/pid/reg_file also holds the pid, > > one of which I haven't checked carefully either. > > > > Oh, Sorry, I was wrong, the pid isn't reserved even when > the fd is kept in the user space. And I'm sorry that I ha

[PATCH v2 0/3] namei: implement various lookup restriction AT_* flags

2018-10-08 Thread Aleksa Sarai
tps://lwn.net/Articles/721443/ [2]: https://lore.kernel.org/patchwork/patch/784221/ [3]: https://lwn.net/Articles/619151/ [4]: https://lwn.net/Articles/603929/ [5]: https://lwn.net/Articles/723057/ [6]: https://github.com/cyphar/filepath-securejoin Aleksa Sarai (3): namei: implemen

[PATCH v2 0/3] namei: implement various lookup restriction AT_* flags

2018-10-08 Thread Aleksa Sarai
gh O_PATH and fstatat(2). [1]: https://lwn.net/Articles/721443/ [2]: https://lore.kernel.org/patchwork/patch/784221/ [3]: https://lwn.net/Articles/619151/ [4]: https://lwn.net/Articles/603929/ [5]: https://lwn.net/Articles/723057/ [6]: https://github.com/cyphar/filepath-securejoin Cc: A

[PATCH v2 1/3] namei: implement O_BENEATH-style AT_* flags

2018-10-08 Thread Aleksa Sarai
ch in turn was based on the Capsicum project[3]). Input from Linus and Andy in the AT_NO_JUMPS thread[4] determined most of the API changes made in this refresh. [1]: https://lwn.net/Articles/721443/ [2]: https://lwn.net/Articles/619151/ [3]: https://lwn.net/Articles/603929/ [4]: https://lwn.ne

[PATCH v3 0/3] namei: implement various lookup restriction AT_* flags

2018-10-09 Thread Aleksa Sarai
gh O_PATH and fstatat(2). [1]: https://lwn.net/Articles/721443/ [2]: https://lore.kernel.org/patchwork/patch/784221/ [3]: https://lwn.net/Articles/619151/ [4]: https://lwn.net/Articles/603929/ [5]: https://lwn.net/Articles/723057/ [6]: https://github.com/cyphar/filepath-securejoin Cc: A

[PATCH v3 1/3] namei: implement O_BENEATH-style AT_* flags

2018-10-09 Thread Aleksa Sarai
ch in turn was based on the Capsicum project[3]). Input from Linus and Andy in the AT_NO_JUMPS thread[4] determined most of the API changes made in this refresh. [1]: https://lwn.net/Articles/721443/ [2]: https://lwn.net/Articles/619151/ [3]: https://lwn.net/Articles/603929/ [4]: https://lwn.ne

[PATCH v3 2/3] namei: implement AT_THIS_ROOT chroot-like path resolution

2018-10-09 Thread Aleksa Sarai
ort in other *at(2) syscalls (because of AT_EMPTY_PATH many *at(2) operations do not need to support these flags directly). [1]: https://github.com/cyphar/filepath-securejoin Cc: Al Viro Cc: Eric Biederman Cc: Christian Brauner Signed-off-by: Aleksa Sarai --- fs/fcntl.c

[PATCH v3 3/3] namei: aggressively check for nd->root escape on ".." resolution

2018-10-09 Thread Aleksa Sarai
must have been taken after __d_path returned), and thus will not be able to escape from the previously-inside-root path. Walking down is still safe since the entire subtree was moved (either by rename(2) or MS_MOVE) and because (as discussed above) walking down is safe. Cc: Al Viro Cc: Jann Horn

Re: [PATCH v3 3/3] namei: aggressively check for nd->root escape on ".." resolution

2018-10-09 Thread Aleksa Sarai
On 2018-10-09, 'Jann Horn' via dev wrote: > On Tue, Oct 9, 2018 at 9:03 AM Aleksa Sarai wrote: > > This patch allows for AT_BENEATH and AT_THIS_ROOT to safely permit ".." > > resolution (in the case of AT_BENEATH the resolution will still fail if > > "

Re: [PATCH v2 1/3] namei: implement O_BENEATH-style AT_* flags

2018-10-10 Thread Aleksa Sarai
On 2018-10-09, Andy Lutomirski wrote: > On Mon, Oct 8, 2018 at 11:53 PM Aleksa Sarai wrote: > > * AT_NO_PROCLINK: Disallows ->get_link "symlink" jumping. This is a very > > specific restriction, and it exists because /proc/$pid/fd/... > > "symlinks"

Re: [RFC v5 1/1] ns: add binfmt_misc to the user namespace

2018-10-10 Thread Aleksa Sarai
heck an error case that > cannot happen. The first namespace binfmt_ns pointer is initialized with > &init_binfmt_ns, so the return value cannot be NULL. I'd argue that BUG() is a better thing to do then -- if doing a dummy error path makes no sense. Though IIRC BUG() is no longer a popular thing to do. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [PATCH v2 1/3] namei: implement O_BENEATH-style AT_* flags

2018-10-10 Thread Aleksa Sarai
On 2018-10-10, Aleksa Sarai wrote: > On 2018-10-09, Andy Lutomirski wrote: > > On Mon, Oct 8, 2018 at 11:53 PM Aleksa Sarai wrote: > > > * AT_NO_PROCLINK: Disallows ->get_link "symlink" jumping. This is a very > > > specific restrict

Re: [PATCH] kretprobe: produce sane stack traces

2018-10-29 Thread Aleksa Sarai
etprobe_save_stack_trace); > > + > > +void kretprobe_perf_callchain_kernel(struct kretprobe_instance *ri, > > +struct perf_callchain_entry_ctx *ctx) > > +{ > > + int i; > > + struct kretprobe_trace *krt = &ri->entry; > > + > >

Re: [RFC PATCH] Implement /proc/pid/kill

2018-10-29 Thread Aleksa Sarai
that make it a bit difficult to use /proc/$pid exclusively for introspection of a process -- especially in the context of containers.) -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [RFC PATCH] Implement /proc/pid/kill

2018-10-30 Thread Aleksa Sarai
I'm not suggesting we redesign peers or anything like that). This makes it basically a non-starter. But if, on top of this ground-work, we then referenced containers entirely via an fd to /proc/$pid then you could also avoid PID reuse races (as well as being able to find out implicitly

Re: [RFC PATCH] Implement /proc/pid/kill

2018-10-30 Thread Aleksa Sarai
On 2018-10-30, Joel Fernandes wrote: > On Wed, Oct 31, 2018 at 07:45:01AM +1100, Aleksa Sarai wrote: > [...] > > > > (Unfortunately > > > > there are lots of things that make it a bit difficult to use /proc/$pid > > > > exclusively for introspection

Re: [RFC PATCH] Implement /proc/pid/kill

2018-10-30 Thread Aleksa Sarai
reuse > races. (Sorry, I misunderstood your original question.) The problem is that holding /proc/$pid doesn't stop the PID from dying and being reused. The benefit of holding open /proc/$pid is that you will get an error if you try to use it *after* the PID has died -- which means that you don't need to worry about explicitly checking for PID reuse if you are only operating with the file descriptor and not the PID. So that sequence won't always work. There is a race where the pid might die and be recycled by the time you call kill(2) -- after you've done step 2. By tying step 2 and 3 together -- in this patch -- you remove the race (since in order to resolve the "kill" procfs file VFS must resolve the PID first -- atomically). Though this race window is likely very tiny, and I wonder how much PID churn you really need to hit it. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>
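The "error after the PID has died" property described above can be observed from userspace. The following is a minimal sketch, assuming a Linux system with /proc mounted. The function name is hypothetical, and it only shows that a held /proc/<pid> directory fd stops resolving once the process is reaped. It does not reproduce the kill(2) race itself.

```python
import os
import signal
import subprocess
import sys


def stale_proc_dirfd_is_rejected() -> bool:
    """Return True if lookups via a held /proc/<pid> fd fail after the process is reaped."""
    # Spawn a child that stays alive long enough for us to open its /proc entry.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    dirfd = os.open(f"/proc/{child.pid}", os.O_RDONLY | os.O_DIRECTORY)
    try:
        # While the child is alive, lookups relative to the fd succeed.
        os.stat("status", dir_fd=dirfd)

        child.send_signal(signal.SIGKILL)
        child.wait()  # reap the child; its PID number may now be reused

        try:
            # The fd still refers to the *old* process, so this lookup fails
            # rather than silently resolving to whoever holds the PID now.
            os.stat("status", dir_fd=dirfd)
        except OSError:
            return True
        return False
    finally:
        os.close(dirfd)
        if child.poll() is None:
            child.kill()
            child.wait()
```

The fd gives no way to keep the process alive or reserve its PID; it only guarantees that later operations through the fd cannot land on an unrelated process.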

Re: [RFC PATCH] Implement /proc/pid/kill

2018-10-30 Thread Aleksa Sarai
ls didn't make it. :+1: on a well thought-out and generic proposal. As we've discussed elsewhere, this is an issue that really would be great to (finally) solve. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [PATCH] kretprobe: produce sane stack traces

2018-10-31 Thread Aleksa Sarai
on graph > work with multiple users, and have kretprobes be able to hook into it > just like kprobes hooks into function tracer. > > I have some ideas on how to get this done, and will try to have an RFC > patch set ready by plumbers. Should I continue working on this patchset

Re: [PATCH v2] Implement /proc/pid/kill

2018-10-31 Thread Aleksa Sarai
cially if this would conflict with the idea Christian will propose -- as he said, there were proposals to do this in the past and they didn't get anywhere because of lack of discussion and brainstorming before posting patches. -- Aleksa Sarai Senior Software Engineer (Containers)

Re: [RFC PATCH v2] Minimal non-child process exit notification support

2018-11-01 Thread Aleksa Sarai
st provide as much information as you get from proc_connector -- such as the exit status?). Also maybe we should integrate this into the exit machinery instead of this loop... -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [RFC PATCH v2] Minimal non-child process exit notification support

2018-11-01 Thread Aleksa Sarai
On 2018-11-01, Aleksa Sarai wrote: > On 2018-10-29, Daniel Colascione wrote: > > This patch adds a new file under /proc/pid, /proc/pid/exithand. > > Attempting to read from an exithand file will block until the > > corresponding process exits, at which point the r

Re: [PATCH v3 0/3] namei: implement various lookup restriction AT_* flags

2018-10-17 Thread Aleksa Sarai
On 2018-10-09, Aleksa Sarai wrote: > The need for some sort of control over VFS's path resolution (to avoid > malicious paths resulting in inadvertent breakouts) has been a very > long-standing desire of many userspace applications. This patchset is a > revival of Al Viro'

Re: [PATCH v3 1/2] kretprobe: produce sane stack traces

2018-11-07 Thread Aleksa Sarai
On 2018-11-06, Steven Rostedt wrote: > On Sun, 4 Nov 2018 22:59:13 +1100 > Aleksa Sarai wrote: > > > The same issue is present in __save_stack_trace > > (arch/x86/kernel/stacktrace.c). This is likely the only reason that -- > > as Steven said -- stacktraces wouldn'

Re: [PATCH v3 1/3] namei: implement O_BENEATH-style AT_* flags

2018-10-13 Thread Aleksa Sarai
> pathnames, call dirfd_path_init() for relative ones". And I would argue that > taking LOOKUP_BENEATH handling out of dirfd_path_init() into path_init() > (relative) > case would be a good idea. Right, I could definitely do that -- though for AT_THIS_ROOT we'd duplicate t

Re: [PATCH v3 1/3] namei: implement O_BENEATH-style AT_* flags

2018-10-13 Thread Aleksa Sarai
st have them use it combined with > O_PATH and pass the result to ...at()... This works for stat and quite a few other things (which is why I only added openat(2) support for the moment), but I think we'd eventually need something like this for renameat2(2) as well as a few other choice *at(

Re: [PATCH v3 3/3] namei: aggressively check for nd->root escape on ".." resolution

2018-10-13 Thread Aleksa Sarai
check against &rename_lock at least and retry if it changed). I could definitely use path_is_under() if you prefer, though I think that in this case we'd need to take &rename_lock (right?). Also is there a speed issue with taking the write-side of a seqlock when we are just reading -- is this more efficient than doing a retry like in __d_path? -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [PATCH v3 3/3] namei: aggressively check for nd->root escape on ".." resolution

2018-10-13 Thread Aleksa Sarai
On 2018-10-13, Al Viro wrote: > On Sat, Oct 13, 2018 at 07:53:26PM +1100, Aleksa Sarai wrote: > > > I didn't know about path_is_under() -- I just checked and it appears to > > not take &rename_lock? From my understanding, in order to protect > > against t

Re: [PATCH 2/3] namei: implement AT_THIS_ROOT chroot-like path resolution

2018-10-04 Thread Aleksa Sarai
d thus will always be blocked. DO NOT MERGE: Currently this code returns -EMULTIHOP in this case, purely as a debugging measure (so that you can see that the protection actually does something). Obviously in the proper patch this will return -EXDEV.

Re: [PATCH 2/3] namei: implement AT_THIS_ROOT chroot-like path resolution

2018-10-05 Thread Aleksa Sarai
On 2018-10-04, Jann Horn wrote: > On Thu, Oct 4, 2018 at 6:26 PM Aleksa Sarai wrote: > > On 2018-09-29, Jann Horn wrote: > > > You attempt to open "C/../../etc/passwd" under the root "/A/B". > > > Something else concurrently moves /A/B/C

Re: [PATCH 2/3] namei: implement AT_THIS_ROOT chroot-like path resolution

2018-10-05 Thread Aleksa Sarai
, > > unsigned flags) > > nd->last_type = LAST_ROOT; /* if there are only slashes... */ > > nd->flags = flags | LOOKUP_JUMPED | LOOKUP_PARENT; > > nd->depth = 0; > > + nd->m_seq = read_seqbegin(&mount_lock); > >

[PATCH v3 0/2] kretprobe: produce sane stack traces

2018-11-01 Thread Aleksa Sarai
ce (switch away from hlist_for_each_entry_safe) * kprobe: make maximum stack size 127, which is the ftrace default Aleksa Sarai (2): kretprobe: produce sane stack traces trace: remove kretprobed checks Documentation/kprobes.txt | 6 +- include/linux/kprobes.h

[PATCH v3 2/2] trace: remove kretprobed checks

2018-11-01 Thread Aleksa Sarai
This is effectively a reversion of commit 76094a2cf46e ("ftrace: distinguish kretprobe'd functions in trace logs"), as the checking of kretprobe_trampoline *for tracing* is no longer necessary with the new kretprobe stack trace changes. Signed-off-by: Aleksa Sarai --

Re: [RFC PATCH v2] Minimal non-child process exit notification support

2018-11-01 Thread Aleksa Sarai
On 2018-11-01, Daniel Colascione wrote: > On Thu, Nov 1, 2018 at 7:00 AM, Aleksa Sarai wrote: > > On 2018-10-29, Daniel Colascione wrote: > >> This patch adds a new file under /proc/pid, /proc/pid/exithand. > >> Attempting to read from an exithand file will block u

Re: [PATCH] kretprobe: produce sane stack traces

2018-11-01 Thread Aleksa Sarai
I think your patch is the best fix for this issue. Thanks, I just sent a v3. Though, even with Steven's hooking of kprobes I think you'd still need to stash away the stack trace somewhere (or am I misunderstanding the proposal?). -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>

Re: [PATCH] kretprobe: produce sane stack traces

2018-11-01 Thread Aleksa Sarai
On 2018-11-02, Masami Hiramatsu wrote: > On Thu, 1 Nov 2018 21:49:48 +1100 > Aleksa Sarai wrote: > > > On 2018-11-01, Masami Hiramatsu wrote: > > > > > > Anyway, until that merge happens, this patch looks good to avoid > > > > > > this

[PATCH] kretprobe: produce sane stack traces

2018-10-26 Thread Aleksa Sarai
moving them completely is reasonable. [1]: https://github.com/iovisor/bpftrace/issues/101 Cc: Brendan Gregg Cc: Christian Brauner Signed-off-by: Aleksa Sarai --- include/linux/kprobes.h | 15 ++ kernel/events/callchain.c | 8 ++- kernel/kprobes.c | 108 +

Re: [PATCH v2 1/3] namei: implement O_BENEATH-style AT_* flags

2018-10-27 Thread Aleksa Sarai
On 2018-10-27, Ed Maste wrote: > On Tue, 9 Oct 2018 at 02:53, Aleksa Sarai wrote: > > > > +#ifndef O_BENEATH > > +#define O_BENEATH 0004000 /* *Not* the same as capsicum's > > O_BENEATH! */ > > +#endif > [...] > O_BENEATH originally came fro
