Launchpad has imported 7 comments from the remote bug at
https://bugzilla.kernel.org/show_bug.cgi?id=195453.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2017-04-18T13:22:33+00:00 colin.king wrote:

There is a race condition on the fs->users counter and fs->in_exec flag
that impacts exec() of a suid program when multiple threads are being
created and destroyed by the parent process.  The in_exec flag was
introduced back in 2009 with commit:

commit 498052bba55ecaff58db6a1436b0e25bfd75a7ff
Author: Al Viro <v...@zeniv.linux.org.uk>
Date:   Mon Mar 30 07:20:30 2009 -0400

    New locking/refcounting for fs_struct

When performing an exec() on a suid program, check_unsafe_exec() checks
whether p->fs->users is greater than the count of threads that share the
same fs with the parent.
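
For reference, the relevant counting logic looks roughly like this
(paraphrased from fs/exec.c of that era, with the unrelated ptrace and
no_new_privs checks omitted):

static void check_unsafe_exec(struct linux_binprm *bprm)
{
        struct task_struct *p = current, *t;
        unsigned n_fs = 1;

        spin_lock(&p->fs->lock);
        rcu_read_lock();
        /* count the threads that share our fs_struct */
        while_each_thread(p, t) {
                if (t->fs == p->fs)
                        n_fs++;
        }
        rcu_read_unlock();

        /*
         * More users than counted threads means the fs_struct is
         * shared outside the thread group, so the exec is flagged
         * unsafe and the suid bit will not be honoured.
         */
        if (p->fs->users > n_fs)
                bprm->unsafe |= LSM_UNSAFE_SHARE;
        else
                p->fs->in_exec = 1;
        spin_unlock(&p->fs->lock);
}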

However, there are two race conditions that this seems to fail on:

1. The parent may be creating a new thread: copy_fs() in kernel/fork.c
bumps the fs->users count before the thread is attached to the parent,
causing the check p->fs->users > n_fs in check_unsafe_exec() to be true.
The exec of the suid program by the parent is therefore marked as unsafe
and it executes as a non-suid executable.  The crux of the matter is
that the fs->users count is temporarily ahead by 1 of the number of
threads linked to the process, and we don't have any locking mechanism
to protect us from this in the exec phase.  (See the copy_fs() sketch
after this list.)

2. A thread may be dying, and the associated fs pointer is nullified
before the thread is removed from the parent's thread list; this also
trips the check_unsafe_exec() fs sanity check.  I believe this can be
fixed by also counting threads whose t->fs is NULL, e.g.:

diff --git a/fs/exec.c b/fs/exec.c
index 72934df68471..ebfd9b76b69f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1447,7 +1447,7 @@ static void check_unsafe_exec(struct linux_binprm *bprm)
        spin_lock(&p->fs->lock);
        rcu_read_lock();
        while_each_thread(p, t) {
-               if (t->fs == p->fs)
+               if (t->fs == p->fs || !t->fs)
                        n_fs++;
        }
        rcu_read_unlock();
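
As context for race 1, the thread-creation path that bumps the count
looks roughly like this (paraphrased from kernel/fork.c of that era);
note that fs->users is incremented well before copy_process() links the
new thread into the parent's thread list:

static int copy_fs(unsigned long clone_flags, struct task_struct *tsk)
{
        struct fs_struct *fs = current->fs;

        if (clone_flags & CLONE_FS) {
                /* share the parent's fs_struct */
                spin_lock(&fs->lock);
                if (fs->in_exec) {
                        spin_unlock(&fs->lock);
                        return -EAGAIN;
                }
                /*
                 * Bumped here; the thread is attached to the parent's
                 * thread list much later in copy_process(), which opens
                 * the window that race 1 describes.
                 */
                fs->users++;
                spin_unlock(&fs->lock);
                return 0;
        }
        tsk->fs = copy_fs_struct(fs);
        if (!tsk->fs)
                return -ENOMEM;
        return 0;
}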


The reproducer is quite simple and the issue is easy to trigger:

$ cat Makefile 
ALL=a b
all: $(ALL)

a: LDFLAGS=-pthread

b: b.c
        $(CC) b.c -o b
        sudo chown root:root b
        sudo chmod u+s b

test:
        for I in $$(seq 1000); do echo $$I; ./a ; done

clean:
        rm -vf $(ALL)


$ cat a.c
#include <unistd.h>
#include <stdio.h>
#include <pthread.h>
#include <time.h>

void *nothing(void *p)
{
        return NULL;
}

void *target(void *p) {
        for (;;) {
                pthread_t t;
                if (pthread_create(&t, NULL, nothing, NULL) == 0)
                        pthread_join(t, NULL);
        }
        return NULL;
}

int main(void)
{
        struct timespec tv;
        int i;

        for (i = 0; i < 10; i++) {
                pthread_t t;
                pthread_create(&t, NULL, target, NULL);
        }
        tv.tv_sec = 0;
        tv.tv_nsec = 100000;
        nanosleep(&tv, NULL);
        if (execl("./b", "./b", (char *)NULL) < 0)
                perror("execl");
        return 0;
}

$ cat b.c
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>

int main(void)
{
        const uid_t euid = geteuid();
        if (euid != 0) {
                printf("Failed, got euid %d (expecting 0)\n", euid);
                return 1;
        }
        return 0;
}


To reproduce:

make
make test

You will see "Failed, got euid 1000 (expecting 0)" errors whenever the
suid program being exec'd fails to run suid because of the race and the
resulting spurious failure in check_unsafe_exec().

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672819/comments/12

------------------------------------------------------------------------
On 2017-04-21T00:36:08+00:00 dev.jongy wrote:

I don't think the second condition you gave is relevant, because once
the task_struct->fs pointer is nullified, that thread is no longer
accounted for in fs->users.  Since the thread counts toward neither
fs->users nor n_fs, it is okay.
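
For reference, the exit path drops the pointer and the user count under
the same fs->lock that check_unsafe_exec() holds while counting
(paraphrased from fs/fs_struct.c of that era):

void exit_fs(struct task_struct *tsk)
{
        struct fs_struct *fs = tsk->fs;

        if (fs) {
                int kill;
                task_lock(tsk);
                spin_lock(&fs->lock);
                /*
                 * The pointer is cleared and the user count dropped
                 * atomically with respect to anyone holding fs->lock,
                 * so a dying thread leaves both counts together.
                 */
                tsk->fs = NULL;
                kill = !--fs->users;
                spin_unlock(&fs->lock);
                task_unlock(tsk);
                if (kill)
                        free_fs_struct(fs);
        }
}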

The first condition is indeed a problem.  I'm not sure what the desired
fix is (probably some kind of locking in copy_process() that
while_each_thread() could use too), but as it stands this race is bad.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672819/comments/14

------------------------------------------------------------------------
On 2017-04-21T09:47:07+00:00 colin.king wrote:

Thanks for looking at this.  The locking on copy_process() concerns me,
as I don't want to introduce a locking fix that causes a serious
performance regression on the copy and exec paths.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672819/comments/15

------------------------------------------------------------------------
On 2017-05-03T10:55:47+00:00 colin.king wrote:

Any updates on this?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672819/comments/16

------------------------------------------------------------------------
On 2017-05-10T11:09:09+00:00 colin.king wrote:

Created attachment 256351
This fix solves the issue without the overhead of extra per-thread
locking, using a simple lightweight retry.

This has been extensively tested on a multi-processor Xeon server with
the reproducers and it fixes the issue.  I've checked it under high
pthread contention and the retry loop occurs very rarely, so the
overhead of the retry is very small indeed.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672819/comments/22

------------------------------------------------------------------------
On 2021-04-16T01:24:27+00:00 benh wrote:

Did you ever submit the fix to the mailing list ?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672819/comments/40

------------------------------------------------------------------------
On 2022-07-13T01:39:01+00:00 shaoyi wrote:

Hi Colin,

May I ask for the most recent update on your fix?  This race condition
can still be reproduced on the current Linux mainline, v5.19-rc6.  I
found that Ubuntu had picked this patch up in
https://lists.ubuntu.com/archives/kernel-team/2017-May/084102.html but
that a soft-lockup issue was also reported against the 4.4 kernel in
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1876856, so I'm
wondering whether you have an updated version of the patch and a plan
to submit it upstream.  I would really appreciate it if you're still
tracking this issue!

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672819/comments/41

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1672819

Title:
  exec'ing a setuid binary from a threaded program sometimes fails to
  setuid

Status in Linux:
  Confirmed
Status in golang-1.6 package in Ubuntu:
  Invalid
Status in linux package in Ubuntu:
  Fix Released
Status in golang-1.6 source package in Xenial:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in golang-1.6 source package in Yakkety:
  Invalid
Status in linux source package in Yakkety:
  Fix Released
Status in golang-1.6 source package in Zesty:
  Invalid
Status in linux source package in Zesty:
  Fix Released

Bug description:
  == SRU template for golang-1.6 ==

  [Impact]
  The kernel bug reported below means that occasionally (maybe 1 in
  1000 times) the snapd -> snap-confine exec that is part of a snap
  execution fails to take the setuid bit on the snap-confine binary
  into account, which means that the execution fails.  This is
  extremely confusing for the user of the snap, who just sees a
  permission-denied error with no explanation.

  The kernel bug has been fixed in xenial and later, but not all users
  of snapd are on xenial+ kernels (they might be on trusty or another
  distribution entirely).  Backporting this fix means that the snapd in
  the core snap will get the workaround the next time it is built, and
  because the snapd on trusty or the other distro re-execs into the
  snapd in the core snap before execing snap-confine, users should not
  see the above behaviour.

  [Test case]
  This will be a bit tricky as the kernel bug has been fixed. A xenial 
container on a trusty host/VM should do the trick. The test case from 
https://gist.github.com/chipaca/806c90d96c437444f27f45a83d00a813 should be 
sufficient to demonstrate the bug and then, once golang-1.6 has been upgraded 
from proposed, the fix.

  [Regression potential]
  If there is a bug in the patch it could cause deadlocks in currently working 
programs. But the patch is pretty simple and has passed review upstream so I 
think it should be OK.

  == SRU REQUEST XENIAL, YAKKETY, ZESTY ==

  Due to two race conditions in check_unsafe_exec(),  exec'ing a setuid
  binary from a threaded program sometimes fails to setuid.

  == Fix ==

  Sauce patch for Xenial, Yakkety + Zesty:

  https://lists.ubuntu.com/archives/kernel-team/2017-May/084102.html

  This fix re-executes the unsafe check if there is a discrepancy
  between the expected fs count and the found count during the racy
  window of thread exec or exit.  The re-check occurs very infrequently
  and saves a lot of additional locking on per-thread structures that
  would make the performance of fork/exec/exit prohibitively expensive.
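
  As an illustration only, the retry idea might look something like the
  sketch below.  This is a hypothetical paraphrase, not the actual
  sauce patch; count_fs_sharers() is an invented helper standing in for
  the while_each_thread() walk shown earlier, and the retry bound is
  made up:

  /* Hypothetical sketch of a bounded retry in check_unsafe_exec(). */
  #define FS_COUNT_RETRIES 10     /* invented bound, not from the patch */

  unsigned retries = 0;
  unsigned n_fs = count_fs_sharers(p);  /* the while_each_thread() walk */

  while (p->fs->users > n_fs && retries++ < FS_COUNT_RETRIES) {
          /*
           * Possible transient mismatch from a racing clone()/exit():
           * drop the lock, let the racing thread make progress, and
           * count again.
           */
          spin_unlock(&p->fs->lock);
          cond_resched();
          spin_lock(&p->fs->lock);
          n_fs = count_fs_sharers(p);
  }
  if (p->fs->users > n_fs)        /* still mismatched: genuinely shared */
          bprm->unsafe |= LSM_UNSAFE_SHARE;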

  == Test case ==

  See the example C code in the patch,
  https://lists.ubuntu.com/archives/kernel-team/2017-May/084102.html

  Run the test code as follows: for i in $(seq 1000); do ./a; done

  With the patch no messages are emitted; without the patch one sees
  the message:

  "Failed, got euid 1000 (expecting 0)"

  ...which shows the setuid program failed the check in
  check_unsafe_exec() because of the race.

  == Regression potential ==

  Breaking existing safe exec semantics.

  ====================

  This can be reproduced with
  https://gist.github.com/chipaca/806c90d96c437444f27f45a83d00a813

  With that, and go 1.8, if you run “make” and then

  for i in `seq 99`; do ./a_go; done

  you'll see a variable number of “GOT 1000” messages (or whatever your
  user id is).  If you don't, add one or two more 9s on there.

  That's a simple go reproducer. You can also use “a_p” instead of
  “a_go” to see one that only uses pthreads. “a_c” is a C version that
  does *not* reproduce the issue.

  But it's not pthreads: if in a_go.go you comment out the “import "C"”,
  you'll still see the “GOT 1000” messages, in a static binary that uses
  no pthreads, just clone(2). You'll also see a bunch of warnings
  because it's not properly handling an EAGAIN from clone, but that's
  unrelated.

  If you pin the process to a single thread using taskset, you don't get
  the issue from a_go; a_p continues to reproduce the issue. In some
  virtualized environments we haven't been able to reproduce the issue
  either (e.g. some aws instances), but kvm works (you need -smp to see
  the issue from a_go).

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.4.0-64-generic 4.4.0-64.85
  ProcVersionSignature: Ubuntu 4.4.0-64.85-generic 4.4.44
  Uname: Linux 4.4.0-64-generic x86_64
  NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
  ApportVersion: 2.20.1-0ubuntu2.5
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/pcmC0D0p:   john       2354 F...m pulseaudio
   /dev/snd/controlC0:  john       2354 F.... pulseaudio
  CurrentDesktop: Unity
  Date: Tue Mar 14 17:17:23 2017
  HibernationDevice: RESUME=UUID=b9fd155b-dcbe-4337-ae77-6daa6569beaf
  InstallationDate: Installed on 2014-04-27 (1051 days ago)
  InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Release amd64 (20140417)
  MachineType: Dell Inc. Latitude E6510
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-64-generic 
root=/dev/mapper/ubuntu--vg-root ro enable_mtrr_cleanup mtrr_spare_reg_nr=8 
mtrr_gran_size=32M mtrr_chunk_size=32M quiet splash
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-64-generic N/A
   linux-backports-modules-4.4.0-64-generic  N/A
   linux-firmware                            1.157.8
  SourcePackage: linux
  SystemImageInfo: Error: command ['system-image-cli', '-i'] failed with exit 
code 2:
  UpgradeStatus: Upgraded to xenial on 2015-06-18 (634 days ago)
  dmi.bios.date: 12/05/2013
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: A16
  dmi.board.vendor: Dell Inc.
  dmi.chassis.type: 9
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvrA16:bd12/05/2013:svnDellInc.:pnLatitudeE6510:pvr0001:rvnDellInc.:rn:rvr:cvnDellInc.:ct9:cvr:
  dmi.product.name: Latitude E6510
  dmi.product.version: 0001
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1672819/+subscriptions

