control: reassign -1 src:linux
control: retitle -1 linux: cleared PKRU register after signal in amd64 32-bit
compat mode
control: close -1 4.10-1~exp1
control: affects -1 glibc
Hi,
On 2019-03-28 22:05, Aurelien Jarno wrote:
> Hi Florian,
>
> On 2019-03-27 23:59, Florian Weimer wrote:
> > retitle 924891 glibc: misc/tst-pkey fails due to cleared PKRU register
> > after signal in amd64 32-bit compat mode
> > thanks
> >
> > * Lucas Nussbaum:
> >
> > > On 27/03/19 at 08:48 +0100, Florian Weimer wrote:
> > >> > If that's useful, I can easily provide access to an AWS VM to debug
> > >> > this
> > >> > issue.
> > >>
> > >> Oh, that would be quite helpful indeed.
> > >
> > > Can you send your SSH key? (I thought there was a way to get the SSH key
> > > for a DD, but I cannot find it anymore)
> > >
> > > Then you will be able to ssh to root@18.184.55.40.
> > > There's sbuild and schroot setup on the VM.
> > >
> > > When you are done, please 'poweroff' the machine, which will terminate
> > > it.
> >
> > The issue reproduces outside the chroot, with the stretch userland.
> >
> > What happens is that once we get out of the SIGUSR1 signal handler,
> > the PKRU register has value zero. This happens around this code in
> > the test:
> >
> > /* Check that in a signal handler, there is no access. */
> > xsignal (SIGUSR1, &sigusr1_handler);
> > xraise (SIGUSR1);
> > xsignal (SIGUSR1, SIG_DFL);
> > TEST_COMPARE (sigusr1_handler_ran, 1);
> >
> > I checked the following (via a breakpoint in pkey_get; I don't think
> > GDB can read the PKRU register directly): Inside the SIGUSR1 signal
> > handler, PKRU has value 0x5554, as expected for this kernel, but
> > after the return, we get zero. This is the first time a signal is
> > delivered on the main thread, so it's consistent with fairly broken
> > signal handling as far as the PKRU register is concerned. I guess
> > clearing PKRU in this way might even constitute a minor security bug
> > (because the zero value means no restrictions).
>
> Thanks a lot for investigating and for all the details.
>
> > This commit looks highly relevant:
> >
> > commit a4455082dc6f0b5d51a23523f77600e8ede47c79
> > Author: Dave Hansen
> > Date: Wed Jun 8 10:25:33 2016 -0700
> >
> > x86/signals: Add missing signal_compat code for x86 features
> >
> > The 32-bit siginfo is a different binary format than the 64-bit
> > one. So, when running 32-bit binaries on 64-bit kernels, we have
> > to convert the kernel's 64-bit version to a 32-bit version that
> > userspace can grok.
> >
> > If the siginfo_t layout is incorrect (with regards to what the
> > hardware writes), I expect that we might end up copying back the wrong
> > PKRU value.
>
> This commit is actually already in the 4.9 kernel.
>
> > I'm not sure what to do here. This really looks like a kernel bug.
> > Maybe we should just verify that this is fixed in the buster kernel
> > and move on?
>
> I agree. I have been able to find a machine where I can temporarily run
> a VM. I have found that the problem has been solved between kernel
> 4.10-rc6 and 4.10, more precisely between the following debian packages:
> - linux-image-4.10.0-rc6-amd64-unsigned version 4.10~rc6-1~exp1
> - linux-image-4.10.0-trunk-amd64-unsigned version 4.10-1~exp1
>
> I gave a quick look at the commit logs, and I haven't identified a
> commit. I'll look again and try to identify the commit fixing the issue
> so that it can be backported in the stretch kernel. I'll then reassign
> the bug there.
I have identified that the issue has been fixed by the following commit
that went into kernel 4.10-rc7:
commit dffba9a31c7769be3231c420d4b364c92ba3f1ac
Author: Yu-cheng Yu
Date: Mon Jan 23 14:54:44 2017 -0800
x86/fpu/xstate: Fix xcomp_bv in XSAVES header
The compacted-format XSAVES area is determined at boot time and
never changed after. The field xsave.header.xcomp_bv indicates
which components are in the fixed XSAVES format.
In fpstate_init() we did not set xcomp_bv to reflect the XSAVES
format since at the time there is no valid data.
However, after we do copy_init_fpstate_to_fpregs() in fpu__clear(),
as in commit:
b22cbe404a9c x86/fpu: Fix invalid FPU ptrace state after execve()
and when __fpu_restore_sig() does fpu__restore() for a COMPAT-mode
app, a #GP occurs. This can be easily triggered by doing valgrind on
a COMPAT-mode "Hello World," as reported by Joakim Tjernlund and
others:
https://bugzilla.kernel.org/show_bug.cgi?id=190061
Fix it by setting xcomp_bv correctly.
This patch also moves the xcomp_bv initialization to the proper
place, which was in copyin_to_xsaves() as of:
4c833368f0bf x86/fpu: Set the xcomp_bv when we fake up a XSAVES area
which fixed the bug too, but it's more efficient and cleaner to
initialize things once per boot, not for every signal handlin