Re: [PATCH] fs: Replace kmap{,_atomic}() with kmap_local_page()

2022-06-30 Thread Eric W. Biederman
"Fabio M. De Francesco" writes: > The use of kmap() and kmap_atomic() are being deprecated in favor of > kmap_local_page(). > > With kmap_local_page(), the mappings are per thread, CPU local and not > globally visible. Furthermore, the mappings can be acquired from any > context (including

Re: [PATCH] capabilities: require CAP_SETFCAP to map uid 0 (v3.2)

2021-04-18 Thread Eric W. Biederman
Guiseppe can you take a look at this? This is a second attempt at tightening up the semantics of writing to file capabilities from a user namespace. The first attempt was reverted with 3b0c2d3eaa83 ("Revert 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities")"),

Re: 08ed4efad6: stress-ng.sigsegv.ops_per_sec -41.9% regression

2021-04-08 Thread Eric W. Biederman
Linus Torvalds writes: > On Thu, Apr 8, 2021 at 1:32 AM kernel test robot > wrote: >> >> FYI, we noticed a -41.9% regression of stress-ng.sigsegv.ops_per_sec due to >> commit >> 08ed4efad684 ("[PATCH v10 6/9] Reimplement RLIMIT_SIGPENDING on top of >> ucounts") > > Ouch. We were cautiously

Re: [PATCH v9 4/8] Reimplement RLIMIT_NPROC on top of ucounts

2021-04-07 Thread Eric W. Biederman
Alexey Gladkov writes: > On Mon, Apr 05, 2021 at 11:56:35AM -0500, Eric W. Biederman wrote: >> >> Also when setting ns->ucount_max[] in create_user_ns because one value >> is signed and the other is unsigned. Care should be taken so that >> rlimit_infinity

Re: [PATCH v9 0/8] Count rlimits in each user namespace

2021-04-05 Thread Eric W. Biederman
er > but in this case we face a different problem of uid mapping when transferring > files from one container to another. > > Eric W. Biederman mentioned this issue [2][3]. > > Introduced changes > -- > To address the problem, we bind rlimit counters to user nam

Re: [PATCH v9 3/8] Use atomic_t for ucounts reference counting

2021-04-05 Thread Eric W. Biederman
Alexey Gladkov writes: > The current implementation of the ucounts reference counter requires the > use of spin_lock. We're going to use get_ucounts() in more performance > critical areas like a handling of RLIMIT_SIGPENDING. > > Now we need to use spin_lock only if we want to change the

Re: [PATCH v9 6/8] Reimplement RLIMIT_SIGPENDING on top of ucounts

2021-04-05 Thread Eric W. Biederman
A small bug below. Eric > diff --git a/kernel/signal.c b/kernel/signal.c > index f2a1b898da29..1b537d9de447 100644 > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -413,49 +413,44 @@ void task_join_group_stop(struct task_struct *task) > static struct sigqueue * > __sigqueue_alloc(int

Re: [PATCH v9 4/8] Reimplement RLIMIT_NPROC on top of ucounts

2021-04-05 Thread Eric W. Biederman
Alexey Gladkov writes: > The rlimit counter is tied to uid in the user_namespace. This allows > rlimit values to be specified in userns even if they are already > globally exceeded by the user. However, the value of the previous > user_namespaces cannot be exceeded. > > To illustrate the impact

Re: [PATCH] psi: allow unprivileged users with CAP_SYS_RESOURCE to write psi files

2021-04-01 Thread Eric W. Biederman
Kees Cook writes: > On Wed, Mar 31, 2021 at 11:36:28PM -0500, Eric W. Biederman wrote: >> Josh Hunt writes: >> >> > Currently only root can write files under /proc/pressure. Relax this to >> > allow tasks running as unprivileged users with CAP_SYS_

Re: [PATCH] psi: allow unprivileged users with CAP_SYS_RESOURCE to write psi files

2021-03-31 Thread Eric W. Biederman
Josh Hunt writes: > Currently only root can write files under /proc/pressure. Relax this to > allow tasks running as unprivileged users with CAP_SYS_RESOURCE to be > able to write to these files. The test for CAP_SYS_RESOURCE really needs to be in open rather than in write. Otherwise a suid

Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

2021-03-27 Thread Eric W. Biederman
effects observed. Kicked off the longer runs now. > > Not a huge amount of changes from the posted series, but please peruse > here if you want to double check: > > https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-5.12 > > And diff against v2 posted is below. Thanks!

Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

2021-03-26 Thread Eric W. Biederman
Jens Axboe writes: > On 3/26/21 4:23 PM, Eric W. Biederman wrote: >> Jens Axboe writes: >> >>> On 3/26/21 2:29 PM, Eric W. Biederman wrote: >>>> Jens Axboe writes: >>>> >>>>> We go through various hoops to disallow signals for t

Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

2021-03-26 Thread Eric W. Biederman
Jens Axboe writes: > On 3/26/21 2:29 PM, Eric W. Biederman wrote: >> Jens Axboe writes: >> >>> We go through various hoops to disallow signals for the IO threads, but >>> there's really no reason why we cannot just allow them. The IO threads >>> neve

Re: [PATCH 3/4] exec: simplify the compat syscall handling

2021-03-26 Thread Eric W. Biederman
Christoph Hellwig writes: > diff --git a/fs/exec.c b/fs/exec.c > index 06e07278b456fa..b34c1eb9e7ad8e 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -391,47 +391,34 @@ static int bprm_mm_init(struct linux_binprm *bprm) > return err; > } > > -struct user_arg_ptr { > -#ifdef

Re: [PATCH 3/7] kernel: stop masking signals in create_io_thread()

2021-03-26 Thread Eric W. Biederman
Jens Axboe writes: > This is racy - move the blocking into when the task is created and > we're marking it as PF_IO_WORKER anyway. The IO threads are now > prepared to handle signals like SIGSTOP as well, so clear that from > the mask to allow proper stopping of IO threads. Acked

Re: [PATCH 1/7] kernel: don't call do_exit() for PF_IO_WORKER threads

2021-03-26 Thread Eric W. Biederman
Jens Axboe writes: > Right now we're never calling get_signal() from PF_IO_WORKER threads, but > in preparation for doing so, don't handle a fatal signal for them. The > workers have state they need to cleanup when exiting, and they don't do > coredumps, so just return instead of performing

Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

2021-03-26 Thread Eric W. Biederman
Jens Axboe writes: > We go through various hoops to disallow signals for the IO threads, but > there's really no reason why we cannot just allow them. The IO threads > never return to userspace like a normal thread, and hence don't go through > normal signal processing. Instead, just check for a

Re: [PATCH 0/2] Don't show PF_IO_WORKER in /proc/<pid>/task/

2021-03-25 Thread Eric W. Biederman
Oleg Nesterov writes: > On 03/25, Linus Torvalds wrote: >> >> The whole "signals are very special for IO threads" thing has caused >> so many problems, that maybe the solution is simply to _not_ make them >> special? > > Or may be IO threads should not abuse CLONE_THREAD? > > Why does

Re: [PATCH 0/2] Don't show PF_IO_WORKER in /proc/<pid>/task/

2021-03-25 Thread Eric W. Biederman
Oleg Nesterov writes: > On 03/25, Eric W. Biederman wrote: >> >> So looking quickly the flip side of the coin is gdb (and other >> debuggers) needs a way to know these threads are special, so it can know >> not to attach. > > may be, > >> I suspect get

Re: [PATCH 0/2] Don't show PF_IO_WORKER in /proc/<pid>/task/

2021-03-25 Thread Eric W. Biederman
Linus Torvalds writes: > On Thu, Mar 25, 2021 at 12:42 PM Linus Torvalds > wrote: >> >> On Thu, Mar 25, 2021 at 12:38 PM Linus Torvalds >> wrote: >> > >> > I don't know what the gdb logic is, but maybe there's some other >> > option that makes gdb not react to them? >> >> .. maybe we could

Re: [PATCH 0/2] Don't show PF_IO_WORKER in /proc/<pid>/task/

2021-03-25 Thread Eric W. Biederman
Jens Axboe writes: > On 3/25/21 1:42 PM, Linus Torvalds wrote: >> On Thu, Mar 25, 2021 at 12:38 PM Linus Torvalds >> wrote: >>> >>> I don't know what the gdb logic is, but maybe there's some other >>> option that makes gdb not react to them? >> >> .. maybe we could have a different name for

Re: [PATCH 0/2] Don't show PF_IO_WORKER in /proc/<pid>/task/

2021-03-25 Thread Eric W. Biederman
Jens Axboe writes: > Hi, > > Stefan reports that attaching to a task with io_uring will leave gdb > very confused and just repeatedly attempting to attach to the IO threads, > even though it receives an -EPERM every time. This patchset proposes to > skip PF_IO_WORKER threads as

Re: [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads

2021-03-25 Thread Eric W. Biederman
Stefan Metzmacher writes: > Am 25.03.21 um 12:24 schrieb Sasha Levin: >> From: "Eric W. Biederman" >> >> [ Upstream commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597 ] >> >> Just like we don't allow normal signals to IO threads, don't deliver a

Re: [PATCHSET 0/2] PF_IO_WORKER signal tweaks

2021-03-21 Thread Eric W. Biederman
Jens Axboe writes: > On 3/20/21 4:08 PM, Eric W. Biederman wrote: >> >> Added criu because I just realized that io_uring (which can open files >> from an io worker thread) looks to require some special handling for >> stopping and freezing processes.

Re: [PATCH 1/2] signal: don't allow sending any signals to PF_IO_WORKER threads

2021-03-21 Thread Eric W. Biederman
Jens Axboe writes: > On 3/20/21 3:38 PM, Eric W. Biederman wrote: >> Linus Torvalds writes: >> >>> On Sat, Mar 20, 2021 at 9:19 AM Eric W. Biederman >>> wrote: >>>> >>>> The creds should be reasonably in-sync with the rest of

Re: [PATCHSET 0/2] PF_IO_WORKER signal tweaks

2021-03-20 Thread Eric W. Biederman
Added criu because I just realized that io_uring (which can open files from an io worker thread) looks to require some special handling for stopping and freezing processes. If not in the SIGSTOP case in the related cgroup freezer case. Linus Torvalds writes: > On Sat, Mar 20, 2021 at 10:51

Re: [PATCH 1/2] signal: don't allow sending any signals to PF_IO_WORKER threads

2021-03-20 Thread Eric W. Biederman
Linus Torvalds writes: > On Sat, Mar 20, 2021 at 9:19 AM Eric W. Biederman > wrote: >> >> The creds should be reasonably in-sync with the rest of the threads. > > It's not about credentials (despite the -EPERM). > > It's about the fact that kernel threads cannot

Re: [PATCHSET 0/2] PF_IO_WORKER signal tweaks

2021-03-20 Thread Eric W. Biederman
Jens Axboe writes: > Hi, > > Been trying to ensure that we do the right thing wrt signals and > PF_IO_WORKER threads, and I think there are two cases we need to handle > explicitly: > > 1) Just don't allow signals to them in general. We do mask everything >as blocked, outside of SIGKILL, so

Re: [PATCH 2/2] signal: don't allow STOP on PF_IO_WORKER threads

2021-03-20 Thread Eric W. Biederman
Jens Axboe writes: > Just like we don't allow normal signals to IO threads, don't deliver a > STOP to a task that has PF_IO_WORKER set. The IO threads don't take > signals in general, and have no means of flushing out a stop either. At first glance this seems safe. This is before we count all

Re: [PATCH 1/2] signal: don't allow sending any signals to PF_IO_WORKER threads

2021-03-20 Thread Eric W. Biederman
Jens Axboe writes: > They don't take signals individually, and even if they share signals with > the parent task, don't allow them to be delivered through the worker > thread. This is silly I know, but why do we care? The creds should be reasonably in-sync with the rest of the threads. There

Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-18 Thread Eric W. Biederman
Oleg Nesterov writes: > On 03/18, qianli zhao wrote: >> >> Hi,Oleg >> >> Thank you for your reply. >> >> >> When init sub-threads running on different CPUs exit at the same time, >> >> zap_pid_ns_processes()->BUG() may be happened. >> >> > and why do you think your patch can't prevent this? >> >>

Re: [GIT PULL] userns regression fix for v5.12-rc3

2021-03-13 Thread Eric W. Biederman
Linus Torvalds writes: > On Fri, Mar 12, 2021 at 1:34 PM Eric W. Biederman > wrote: >> >> Please pull the for-v5.12-rc3 branch from the git tree. >> >> Removing the ambiguity broke userspace so please revert the change: >> It turns out that there a

Re: [PATCH v5] do_wait: make PIDTYPE_PID case O(1) instead of O(n)

2021-03-12 Thread Eric W. Biederman
Jim Newsome writes: > On 3/12/21 14:29, Eric W. Biederman wrote: >> When I looked at this a second time it became apparent that using >> pid_task twice should actually be faster as it removes a dependent load >> caused by thread_group_leader, and replaces it by accessing two

Re: [PATCH v5] do_wait: make PIDTYPE_PID case O(1) instead of O(n)

2021-03-12 Thread Eric W. Biederman
Jim Newsome writes: > do_wait is an internal function used to implement waitpid, waitid, > wait4, etc. To handle the general case, it does an O(n) linear scan of > the thread group's children and tracees. > > This patch adds a special-case when waiting on a pid to skip these scans > and instead

Re: [PATCH V2] exit: trigger panic when global init has exited

2021-03-12 Thread Eric W. Biederman
Qianli Zhao writes: > From: Qianli Zhao > > When init sub-threads running on different CPUs exit at the same time, > zap_pid_ns_processes()->BUG() may be happened. > And every thread status is abnormal after exit(PF_EXITING set,task->mm=NULL > etc), > which makes it difficult to parse coredump

Re: [patch V2 0/3] signals: Allow caching one sigqueue object per task

2021-03-11 Thread Eric W. Biederman
Thomas Gleixner writes: > This is a follow up to the initial submission which can be found here: > > https://lore.kernel.org/r/20210303142025.wbbt2nnr6dtgw...@linutronix.de > > Signal sending requires a kmem cache allocation at the sender side and the > receiver hands it back to the kmem cache

Re: [PATCH v3] do_wait: make PIDTYPE_PID case O(1) instead of O(n)

2021-03-11 Thread Eric W. Biederman
Oleg Nesterov writes: > On 03/10, Eric W. Biederman wrote: >> >> Jim Newsome writes: >> >> > +static int do_wait_pid(struct wait_opts *wo) >> > +{ >> > + struct task_s

Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

2021-03-11 Thread Eric W. Biederman
Thomas Gleixner writes: > On Wed, Mar 10 2021 at 15:57, Eric W. Biederman wrote: >> Thomas Gleixner writes: >>> IMO, not bothering with an extra counter and rlimit plus the required >>> atomic operations is just fine and having this for all tasks >>> un

Re: [PATCH v3] do_wait: make PIDTYPE_PID case O(1) instead of O(n)

2021-03-11 Thread Eric W. Biederman
Jim Newsome writes: > On 3/10/21 16:40, Eric W. Biederman wrote: >>> +// Optimization for waiting on PIDTYPE_PID. No need to iterate > through child >>> +// and tracee lists to find the target task. >> >> Minor nit: C++ style comments look very out of pla

Re: [PATCH v3] do_wait: make PIDTYPE_PID case O(1) instead of O(n)

2021-03-10 Thread Eric W. Biederman
Jim Newsome writes: > do_wait is an internal function used to implement waitpid, waitid, > wait4, etc. To handle the general case, it does an O(n) linear scan of > the thread group's children and tracees. > > This patch adds a special-case when waiting on a pid to skip these scans > and instead

Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP_EXIT

2021-03-10 Thread Eric W. Biederman
Oleg Nesterov writes: > On 03/10, Eric W. Biederman wrote: >> >> /* If global init has exited, >> * panic immediately to get a useable coredump. >> */ >> if (unlikely(is_global_init(tsk) && >> (thread_group_e

Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

2021-03-10 Thread Eric W. Biederman
Thomas Gleixner writes: > On Thu, Mar 04 2021 at 21:58, Thomas Gleixner wrote: >> On Thu, Mar 04 2021 at 13:04, Eric W. Biederman wrote: >>> Thomas Gleixner writes: >>>> >>>> We could of course do the caching unconditionally for all tasks. >

Re: [PATCH v2 1/1] fs: Allow no_new_privs tasks to call chroot(2)

2021-03-10 Thread Eric W. Biederman
atch is a follow-up of a previous one sent by Andy Lutomirski, but > with less limitations: > https://lore.kernel.org/lkml/0e2f0f54e19bff53a3739ecfddb4ffa9a6dbde4d.1327858005.git.l...@amacapital.net/ > > Cc: Al Viro > Cc: Andy Lutomirski > Cc: Christian Brauner > Cc: Christo

Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP_EXIT

2021-03-10 Thread Eric W. Biederman
Oleg Nesterov writes: > On 03/10, Eric W. Biederman wrote: >> >> /* If global init has exited, >> * panic immediately to get a useable coredump. >> */ >> if (unlikely(is_global_init(tsk) && >> (thread_group_e

Re: [RFC PATCH] mm: fork: Prevent a NULL deref by getting mm only if the refcount isn't 0

2021-03-10 Thread Eric W. Biederman
Filippo Sironi writes: > We've seen a number of crashes with the following signature: > > BUG: kernel NULL pointer dereference, address: > #PF: supervisor read access in kernel mode > #PF: error_code(0x) - not-present page > ... > Oops: [#1] SMP PTI

Re: [PATCH v1 1/1] fs: Allow no_new_privs tasks to call chroot(2)

2021-03-10 Thread Eric W. Biederman
Mickaël Salaün writes: > From: Mickaël Salaün > > Being able to easily change root directories enable to ease some > development workflow and can be used as a tool to strengthen > unprivileged security sandboxes. chroot(2) is not an access-control > mechanism per se, but it can be used to

Re: [PATCH] exit: trigger panic when init process is set to SIGNAL_GROUP_EXIT

2021-03-10 Thread Eric W. Biederman
qianli zhao writes: > Hi,Oleg > > Thanks for your reply. > >> To be honest, I don't understand the changelog. It seems that you want >> to uglify the kernel to simplify the debugging of buggy init? Or what? > > My patch is for the following purpose: > 1. I hope to fix the occurrence of

Re: kernel panic: Attempted to kill init!

2021-03-09 Thread Eric W. Biederman
Al Viro writes: > On Tue, Mar 09, 2021 at 11:29:14AM +0530, Palash Oswal wrote: > >> I observe the following result(notice the segfault in systemd): >> root@sandbox:~# ./repro >> [9.457767] got to 221 >> [9.457791] got to 183 >> [9.459144] got to 201 >> [9.459471] got to 208 >> [

Re: d28296d248: stress-ng.sigsegv.ops_per_sec -82.7% regression

2021-03-05 Thread Eric W. Biederman
Alexey Gladkov writes: > On Wed, Feb 24, 2021 at 12:50:21PM -0600, Eric W. Biederman wrote: >> Alexey Gladkov writes: >> >> > On Wed, Feb 24, 2021 at 10:54:17AM -0600, Eric W. Biederman wrote: >> >> kernel test robot writes: >> >> >>

Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

2021-03-04 Thread Eric W. Biederman
Thomas Gleixner writes: > On Thu, Mar 04 2021 at 09:11, Sebastian Andrzej Siewior wrote: >> On 2021-03-03 16:09:05 [-0600], Eric W. Biederman wrote: >>> Sebastian Andrzej Siewior writes: >>> >>> > From: Thomas Gleixner >>> > >>>

Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

2021-03-04 Thread Eric W. Biederman
Sebastian Andrzej Siewior writes: > On 2021-03-03 16:09:05 [-0600], Eric W. Biederman wrote: >> Sebastian Andrzej Siewior writes: >> >> > From: Thomas Gleixner >> > >> > Allow realtime tasks to cache one sigqueue in task struct. This avoids an >

Re: [GIT PULL] idmapped mounts for v5.12

2021-03-03 Thread Eric W. Biederman
Christian Brauner writes: > Hi Linus, > This series comes with an extensive xfstests suite covering both ext4 and xfs > https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts > It covers truncation, creation, opening, xattrs, vfscaps, setid execution, > setgid inheritance and more both

Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

2021-03-03 Thread Eric W. Biederman
Sebastian Andrzej Siewior writes: > From: Thomas Gleixner > > Allow realtime tasks to cache one sigqueue in task struct. This avoids an > allocation which can increase the latency or fail. > Ideally the sigqueue is cached after first successful delivery and will be > available for next signal

Re: exec error: BUG: Bad rss-counter

2021-03-03 Thread Eric W. Biederman
Ilya Lipnitskiy writes: > On Wed, Mar 3, 2021 at 7:50 AM Eric W. Biederman > wrote: >> >> Ilya Lipnitskiy writes: >> >> > On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman >> > wrote: >> >> >> >> Ilya Lipnitskiy write

Re: exec error: BUG: Bad rss-counter

2021-03-03 Thread Eric W. Biederman
Ilya Lipnitskiy writes: > On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman > wrote: >> >> Ilya Lipnitskiy writes: >> >> > On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman >> > wrote: >> >> >> >> Ilya Lipnitskiy writes: >

Re: exec error: BUG: Bad rss-counter

2021-03-02 Thread Eric W. Biederman
Ilya Lipnitskiy writes: > On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman > wrote: >> >> Ilya Lipnitskiy writes: >> >> > Eric, All, >> > >> > The following error appears when running Linux 5.10.18 on an embedded >> > MIPS mt76

Re: exec error: BUG: Bad rss-counter

2021-03-01 Thread Eric W. Biederman
Ilya Lipnitskiy writes: > Eric, All, > > The following error appears when running Linux 5.10.18 on an embedded > MIPS mt7621 target: > [0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1 > > Being a very generic error, I started digging and added a stack dump > before

Re: [PATCH v14 01/11] x86: kdump: replace the hard-coded alignment with macro CRASH_ALIGN

2021-02-26 Thread Eric W. Biederman
chenzhou writes: > On 2021/2/25 15:25, Baoquan He wrote: >> On 02/24/21 at 02:19pm, Catalin Marinas wrote: >>> On Sat, Jan 30, 2021 at 03:10:15PM +0800, Chen Zhou wrote: Move CRASH_ALIGN to header asm/kexec.h for later use. Besides, the alignment of crash kernel regions in x86 is

Re: d28296d248: stress-ng.sigsegv.ops_per_sec -82.7% regression

2021-02-24 Thread Eric W. Biederman
Alexey Gladkov writes: > On Wed, Feb 24, 2021 at 10:54:17AM -0600, Eric W. Biederman wrote: >> kernel test robot writes: >> >> > Greeting, >> > >> > FYI, we noticed a -82.7% regression of stress-ng.sigsegv.ops_per_sec

Re: d28296d248: stress-ng.sigsegv.ops_per_sec -82.7% regression

2021-02-24 Thread Eric W. Biederman
kernel test robot writes: > Greeting, > > FYI, we noticed a -82.7% regression of stress-ng.sigsegv.ops_per_sec due to > commit: > > > commit: d28296d2484fa11e94dff65e93eb25802a443d47 ("[PATCH v7 5/7] Reimplement > RLIMIT_SIGPENDING on top of ucounts") > url: >

Re: [PATCH v6 0/7] Count rlimits in each user namespace

2021-02-22 Thread Eric W. Biederman
Linus Torvalds writes: > On Mon, Feb 15, 2021 at 4:42 AM Alexey Gladkov > wrote: >> >> These patches are for binding the rlimit counters to a user in user >> namespace. > > So this is now version 6, but I think the kernel test robot keeps > complaining about them causing KASAN issues. > > The

Re: [PATCH v2] kexec: move machine_kexec_post_load() to public interface

2021-02-22 Thread Eric W. Biederman
Will Deacon writes: > On Fri, 19 Feb 2021 14:51:42 -0500, Pavel Tatashin wrote: >> machine_kexec_post_load() is called after kexec load is finished. It must >> declared in public header not in kexec_internal.h >> >> Fixes the following compiler warning: >> >>

Re: [PATCH v2] sparc: make copy_thread honor pid namespaces

2021-02-22 Thread Eric W. Biederman
CLONE_NEWPID | CLONE_NEWUSER) < 0) > err(1, "unshare"); > test_fork(); > return 0; > } > EOF > $ sh -c ./a.out > current: 10001, parent: 1, fork returned: 10002 > current: 10002, parent: 10001, fork returned: 10001 > cu

Re: [RESEND PATCH v4 0/3] proc: Relax check of mount visibility

2021-02-22 Thread Eric W. Biederman
Alexey Gladkov writes: > If only the dynamic part of procfs is mounted (subset=pid), then there is no > need to check if procfs is fully visible to the user in the new user > namespace. A couple of things. 1) Allowing the mount should come in the last patch. So we don't have a bisect hazard.

Re: [PATCH] sparc: make copy_thread honor pid namespaces

2021-02-18 Thread Eric W. Biederman
urrent->nsproxy->pid_ns_for_children instead of task_active_pid_ns(p). For sparc people. Do we know of anyone who actually uses the parent pid returned from fork to the child process? If not the code can simply return 0 and we don't have to do this. Eric > Cc: Eric W. Biederman > Cc: s

Re: [PATCH] proc: Convert S_ permission uses to octal

2021-02-12 Thread Eric W. Biederman
Matthew Wilcox writes: > On Fri, Feb 12, 2021 at 04:01:48PM -0600, Eric W. Biederman wrote: >> Joe Perches writes: >> >> > Convert S_ permissions to the more readable octal. >> > >> > Done using: >> > $ ./scripts/checkpatch.pl -f --fix

Re: [PATCH] proc: Convert S_ permission uses to octal

2021-02-12 Thread Eric W. Biederman
Joe Perches writes: > On Fri, 2021-02-12 at 16:01 -0600, Eric W. Biederman wrote: >> Joe Perches writes: >> >> > Convert S_ permissions to the more readable octal. >> > >> > Done using: >> > $ ./scripts/checkpatch.pl -f --fix

Re: [PATCH] proc: Convert S_ permission uses to octal

2021-02-12 Thread Eric W. Biederman
Joe Perches writes: > Convert S_ permissions to the more readable octal. > > Done using: > $ ./scripts/checkpatch.pl -f --fix-inplace --types=SYMBOLIC_PERMS > fs/proc/*.[ch] > > No difference in generated .o files allyesconfig x86-64 > > Link: >

Re: [PATCH v11 0/6] arm64: MMU enabled kexec relocation

2021-02-04 Thread Eric W. Biederman
Pavel Tatashin writes: >> > I understand that having an extra set of page tables could potentially >> > waste memory, especially if VAs are sparse, but in this case we use >> > page tables exclusively for contiguous VA space (copy [src, src + >> > size]). Therefore, the extra memory usage is

Re: [PATCH v11 0/6] arm64: MMU enabled kexec relocation

2021-02-03 Thread Eric W. Biederman
Pavel Tatashin writes: > Hi James, > >> The problem I see with this is rewriting the relocation code. It needs to >> work whether the >> machine has enough memory to enable the MMU during kexec, or not. >> >> In off-list mail to Pavel I proposed an alternative implementation here: >>

Re: [PATCH 2/2] security.capability: fix conversions on getxattr

2021-01-31 Thread Eric W. Biederman
"Serge E. Hallyn" writes: > On Fri, Jan 29, 2021 at 04:55:29PM -0600, Eric W. Biederman wrote: >> "Serge E. Hallyn" writes: >> >> > On Thu, Jan 28, 2021 at 02:19:13PM -0600, Eric W. Biederman wrote: >> >> "Serge E. Hallyn" wri

Re: [PATCH 2/2] security.capability: fix conversions on getxattr

2021-01-29 Thread Eric W. Biederman
"Serge E. Hallyn" writes: > On Thu, Jan 28, 2021 at 08:44:26PM +0100, Miklos Szeredi wrote: >> On Thu, Jan 28, 2021 at 6:09 PM Serge E. Hallyn wrote: >> > >> > On Tue, Jan 19, 2021 at 07:34:49PM -0600, Eric W. Biederman

Re: [PATCH 2/2] security.capability: fix conversions on getxattr

2021-01-29 Thread Eric W. Biederman
"Serge E. Hallyn" writes: > On Thu, Jan 28, 2021 at 02:19:13PM -0600, Eric W. Biederman wrote: >> "Serge E. Hallyn" writes: >> >> > On Tue, Jan 19, 2021 at 07:34:49PM -0600, Eric W. Biederman wrote: >> >> Miklos Szeredi writes: >>

Re: [PATCH 2/2] security.capability: fix conversions on getxattr

2021-01-28 Thread Eric W. Biederman
Miklos Szeredi writes: > On Thu, Jan 28, 2021 at 9:24 PM Eric W. Biederman > wrote: > >> >> From our previous discussions I would also argue it would be good >> if there was a bypass that skipped all conversions if the reader >> and the filesyst

Re: [PATCH 2/2] security.capability: fix conversions on getxattr

2021-01-28 Thread Eric W. Biederman
"Serge E. Hallyn" writes: > On Tue, Jan 19, 2021 at 07:34:49PM -0600, Eric W. Biederman wrote: >> Miklos Szeredi writes: >> >> > If a capability is stored on disk in v2 format cap_inode_getsecurity() will >> > currently return in v2 format uncon

Re: [PATCH v2 1/1] kexec: dump kmessage before machine_kexec

2021-01-28 Thread Eric W. Biederman
Pavel Tatashin writes: > kmsg_dump(KMSG_DUMP_SHUTDOWN) is called before > machine_restart(), machine_halt(), machine_power_off(), the only one that > is missing is machine_kexec(). > > The dmesg output that it contains can be used to study the shutdown > performance of both kernel and systemd

Re: [RFC PATCH v3 1/8] Use refcount_t for ucounts reference counting

2021-01-21 Thread Eric W. Biederman
Alexey Gladkov writes: > On Tue, Jan 19, 2021 at 07:57:36PM -0600, Eric W. Biederman wrote: >> Alexey Gladkov writes: >> >> > On Mon, Jan 18, 2021 at 12:34:29PM -0800, Linus Torvalds wrote: >> >> On Mon, Jan 18, 2021 at 11:46 AM Alexey Gladkov >> >

[RFC][PATCH] apparmor: Enforce progressively tighter permissions for no_new_privs

2021-01-20 Thread Eric W. Biederman
by having no_new_privs enforce progressively tighter permissions. Fixes: 9fcf78cca198 ("apparmor: update domain transitions that are subsets of confinement at nnp") Signed-off-by: Eric W. Biederman --- I came across this while examining the places cred_guard_mutex is used and trying to

Re: [RFC][PATCH] apparmor: Enforce progressively tighter permissions for no_new_privs

2021-01-20 Thread Eric W. Biederman
TL;DR selinux and apparmor ignore no_new_privs What? John Johansen writes: > On 1/20/21 1:26 PM, Eric W. Biederman wrote: >> >> The current understanding of apparmor with respect to no_new_privs is at >> odds with how no_new_privs is implemented and u

Re: [RFC][PATCH] apparmor: Enforce progressively tighter permissions for no_new_privs

2021-01-20 Thread Eric W. Biederman
This should now Cc the correct email address for James Morris. ebied...@xmission.com (Eric W. Biederman) writes: > The current understanding of apparmor with respect to no_new_privs is at > odds with how no_new_privs is implemented and understood by the rest of > t

Re: [RFC PATCH v3 1/8] Use refcount_t for ucounts reference counting

2021-01-19 Thread Eric W. Biederman
ebied...@xmission.com (Eric W. Biederman) writes: > Alexey Gladkov writes: > >> On Mon, Jan 18, 2021 at 12:34:29PM -0800, Linus Torvalds wrote: >>> On Mon, Jan 18, 2021 at 11:46 AM Alexey Gladkov >>> wrote: >>> > >>> > Sorry about that.

Re: [RFC PATCH v3 1/8] Use refcount_t for ucounts reference counting

2021-01-19 Thread Eric W. Biederman
Alexey Gladkov writes: > On Mon, Jan 18, 2021 at 12:34:29PM -0800, Linus Torvalds wrote: >> On Mon, Jan 18, 2021 at 11:46 AM Alexey Gladkov >> wrote: >> > >> > Sorry about that. I thought that this code is not needed when switching >> > from int to refcount_t. I was wrong. >> >> Well, you

Re: [PATCH 2/2] security.capability: fix conversions on getxattr

2021-01-19 Thread Eric W. Biederman
well this works with stacking. In particular ovl_xattr_set appears to call vfs_getxattr without overriding the creds. What the purpose of that is I haven't quite figured out. It looks like it is just a probe to see if an xattr is present so maybe it is ok. Acked-by: "Eric W. Biederman"

Re: [PATCH 0/2] capability conversion fixes

2021-01-19 Thread Eric W. Biederman
Miklos Szeredi writes: > It turns out overlayfs is actually okay wrt. multiple conversions, because > it uses the right context for lower operations. I.e. before calling > vfs_{set,get}xattr() on underlying fs, it overrides creds with that of the > mounter, so the current user ns will now match

Re: [PATCH 1/2] ecryptfs: fix uid translation for setxattr on security.capability

2021-01-19 Thread Eric W. Biederman
legated_inode and breaking leases. Code that is enabled with CONFIG_FILE_LOCKING. So unless I am missing something this introduces a different regression into ecryptfs. > > Reported-by: Eric W. Biederman > Cc: Tyler Hicks > Fixes: 7c03e2cda4a5 ("vfs: move cap_convert_nscap() c

Re: [RFC PATCH v2 1/8] Use atomic type for ucounts reference counting

2021-01-13 Thread Eric W. Biederman
Alexey Gladkov writes: We might want to use refcount_t instead of atomic_t. Not a big deal either way. > Signed-off-by: Alexey Gladkov > --- > include/linux/user_namespace.h | 2 +- > kernel/ucount.c| 10 +- > 2 files changed, 6 insertions(+), 6 deletions(-) > > diff

Re: [RFC PATCH v2 2/8] Add a reference to ucounts for each user

2021-01-13 Thread Eric W. Biederman
The subject is wrong. This should be: [RFC PATCH v2 2/8] Add a reference to ucounts for each cred. Further the explanation could use a little work. Something along the lines of: For RLIMIT_NPROC and some other rlimits the user_struct that holds the global limit is kept alive for the lifetime

Re: [PATCH v2 01/10] vfs: move cap_convert_nscap() call into vfs_setxattr()

2021-01-12 Thread Eric W. Biederman
ebied...@xmission.com (Eric W. Biederman) writes: > So there is the basic question do we want to read the raw bytes on disk > or do we want to return something meaningful to the reader. As the > existing tools use the xattr interface to set/clear fscaps returning > data to user space

Re: [PATCH v2 01/10] vfs: move cap_convert_nscap() call into vfs_setxattr()

2021-01-12 Thread Eric W. Biederman
Miklos Szeredi writes: > On Tue, Jan 12, 2021 at 1:15 AM Eric W. Biederman > wrote: >> >> Miklos Szeredi writes: >> >> > On Fri, Jan 01, 2021 at 11:35:16AM -0600, Eric W. Biederman wrote: > >> > For one: a v2 fscap is supposed to be equivalent t

Re: [PATCH v2 01/10] vfs: move cap_convert_nscap() call into vfs_setxattr()

2021-01-11 Thread Eric W. Biederman
Miklos Szeredi writes: > On Fri, Jan 01, 2021 at 11:35:16AM -0600, Eric W. Biederman wrote: >> Miklos Szeredi writes: >> >> > cap_convert_nscap() does permission checking as well as conversion of the >> > xattr value conditionally based on fs's user-ns. >

Re: [RFC PATCH v2 0/8] Count rlimits in each user namespace

2021-01-11 Thread Eric W. Biederman
Linus Torvalds writes: > On Sun, Jan 10, 2021 at 9:34 AM Alexey Gladkov > wrote: >> >> To address the problem, we bind rlimit counters to each user namespace. The >> result is a tree of rlimit counters with the biggest value at the root (aka >> init_user_ns). The rlimit counter

Re: [PATCH] x86/vm86/32: Remove VM86_SCREEN_BITMAP support

2021-01-09 Thread Eric W. Biederman
Andy Lutomirski writes: > The implementation was rather buggy. It unconditionally marked PTEs > read-only, even for VM_SHARED mappings. I'm not sure whether this is > actually a problem, but it certainly seems unwise. More importantly, it > released the mmap lock before flushing the TLB,

Re: in_compat_syscall() on x86

2021-01-05 Thread Eric W. Biederman
Al Viro writes: > On Mon, Jan 04, 2021 at 06:47:38PM -0600, Eric W. Biederman wrote: >> >> It is defined in the Ubuntu kernel configs I've got lurking: >> >> Both 3.8.0-19_generic (Ubuntu 13.04) and 5.4.0-56_generic (probably >> >> 20.04). >> >

Re: in_compat_syscall() on x86

2021-01-04 Thread Eric W. Biederman
Andy Lutomirski writes: >> On Jan 4, 2021, at 2:36 PM, David Laight wrote: >> >> From: Eric W. Biederman >>> Sent: 04 January 2021 20:41 >>> >>> Al Viro writes: >>> >>>> On Mon, Jan 04, 2021 at 12:16:56PM +000

Re: in_compat_syscall() on x86

2021-01-04 Thread Eric W. Biederman
Al Viro writes: > On Mon, Jan 04, 2021 at 12:16:56PM +, David Laight wrote: >> On x86 in_compat_syscall() is defined as: >> in_ia32_syscall() || in_x32_syscall() >> >> Now in_ia32_syscall() is a simple check of the TS_COMPAT flag. >> However in_x32_syscall() is a horrid beast that has

Re: [PATCH v2 01/10] vfs: move cap_convert_nscap() call into vfs_setxattr()

2021-01-01 Thread Eric W. Biederman
Miklos Szeredi writes: > cap_convert_nscap() does permission checking as well as conversion of the > xattr value conditionally based on fs's user-ns. > > This is needed by overlayfs and probably other layered fs (ecryptfs) and is > what vfs_foo() is supposed to do anyway. Well crap. I just

Re: Does uaccess_kernel() work for detecting kernel thread?

2020-12-22 Thread Eric W. Biederman
Tetsuo Handa writes: > Commit db68ce10c4f0a27c ("new helper: uaccess_kernel()") replaced > segment_eq(get_fs(), KERNEL_DS) > with uaccess_kernel(). But uaccess_kernel() became an unconditional "false" > for some architectures > due to commit 5e6e9852d6f76e01 ("uaccess: add infrastructure for

Re: [RFC PATCH] ptrace: make ptrace() fail if the tracee changed its pid unexpectedly

2020-12-21 Thread Eric W. Biederman
Oleg Nesterov writes: > On 12/17, Eric W. Biederman wrote: >> >> Oleg Nesterov writes: >> >> > Suppose we have 2 threads, the group-leader L and a sub-thread T, >> > both parked in ptrace_stop(). Debugger tries to resume both threads >>

Re: [PATCH] Smack: Handle io_uring kernel thread privileges.

2020-12-21 Thread Eric W. Biederman
ed-by: "Eric W. Biederman" > > Suggested-by: Jens Axboe > Signed-off-by: Casey Schaufler > --- >  security/smack/smack_access.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/security/smack/smack_access.c b/security/smack/sma

Re: [PATCH] signal: Don't init struct kernel_siginfo fields to zero again

2020-12-21 Thread Eric W. Biederman
Leesoo Ahn writes: > clear_siginfo() is responsible for clearing struct kernel_siginfo object. > It's obvious that manually initializing those fields is needless as > a commit[1] explains why the function was introduced and its guarantee that > all bits in the struct are cleared after it. The
