Re: [RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64

2020-10-29 Thread Florian Weimer
* Mathieu Desnoyers:

> On Sep 29, 2020, at 4:13 AM, Florian Weimer fwei...@redhat.com wrote:
>
>> * Mathieu Desnoyers:
>> 
>>>> So we have a bootstrap issue here that needs to be solved, I think.
>>>
>>> The one thing I'm not sure about is whether the vDSO interface is indeed
>>> superior to KTLS, or if it is just the model we are used to.
>>>
>>> AFAIU, the current use-cases for vDSO is that an application calls into
>>> glibc, which then calls the vDSO function exposed by the kernel. I wonder
>>> whether the vDSO indirection is really needed if we typically have a glibc
>>> function used as indirection ? For an end user, what is the benefit of vDSO
>>> over accessing KTLS data directly from glibc ?
>> 
>> I think the kernel can only reasonably maintain a single userspace data
>> structure.  It's not reasonable to update several versions of the data
>> structure in parallel.
>
> I disagree with your statement. Considering that the kernel needs to
> keep ABI compatibility for whatever it exposes to user-space, claiming
> that it should never update several versions of data structures
> exposed to user-space in parallel means that once a data structure is
> exposed to user-space as ABI in a certain way, it can never ever
> change in the future, even if we find a better way to do things.

I think it's possible to put data into userspace without making it ABI.
Think about the init_module system call.  The module blob comes from
userspace, but its (deeper) internal structure does not have a stable
ABI.  Similar for many BPF use cases.

If the internal KTLS blob structure turns into ABI, including the parts
that need to be updated on context switch, each versioning change has a
performance impact.

>> This means that glibc would have to support multiple kernel data
>> structures, and users might lose userspace acceleration after a kernel
>> update, until they update glibc as well.  The glibc update should be
>> ABI-compatible, but someone would still have to backport it, apply it to
>> container images, etc.
>
> No. If the kernel ever exposes a data structure to user-space as ABI,
> then it needs to stay there, and not break userspace. Hence the need to
> duplicate information provided to user-space if need be, so we can move
> on to better ABIs without breaking the old ones.

It can expose the data as an opaque blob.

> Or as Andy mentioned, we would simply pass the ktls offset as argument to
> the vDSO ? It seems simple enough. Would it fit all our use-cases including
> errno ?

That would work, yes.  It's neat, but it won't give you a way to provide
traditional syscall wrappers directly from the vDSO.

>> We'll see what will break once we have the correct TID after vfork. 8->
>> glibc currently supports malloc-after-vfork as an extension, and
>> a lot of software depends on it (OpenJDK, for example).
>
> I am not sure I see how that is related to KTLS?

The mutex implementation could switch to the KTLS TID because it is always
correct.  But then locking in a vfork'ed subprocess would no longer look
like locking from the parent thread because the TID would be different.
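
To make the concern concrete, here is a minimal sketch. It is not glibc's
mutex code; the owner_mutex type and its helpers are made up for
illustration. It assumes an owner-checking lock that records the locker's
TID, and compares a vfork child that still sees the parent's TID with one
that sees its own:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical owner-tracking mutex: records the TID of the locker. */
struct owner_mutex {
    int owner_tid;   /* 0 means unlocked */
};

static bool owner_lock(struct owner_mutex *m, int self_tid)
{
    if (m->owner_tid != 0)
        return false;        /* already held */
    m->owner_tid = self_tid;
    return true;
}

/* An owner-checking unlock refuses callers that are not the owner. */
static bool owner_unlock(struct owner_mutex *m, int self_tid)
{
    assert(self_tid != 0);
    if (m->owner_tid != self_tid)
        return false;
    m->owner_tid = 0;
    return true;
}

/* Parent locks, then the vfork child (sharing memory) tries to unlock. */
static bool child_can_unlock(int parent_tid, int tid_seen_by_child)
{
    struct owner_mutex m = { 0 };

    if (!owner_lock(&m, parent_tid))
        return false;
    return owner_unlock(&m, tid_seen_by_child);
}
```

With the parent's TID visible in the child, child_can_unlock(100, 100)
succeeds; with a correct per-task TID, child_can_unlock(100, 101) fails.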

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill



Re: [RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64

2020-10-20 Thread Mathieu Desnoyers
On Sep 29, 2020, at 4:13 AM, Florian Weimer fwei...@redhat.com wrote:

> * Mathieu Desnoyers:
> 
>>> So we have a bootstrap issue here that needs to be solved, I think.
>>
>> The one thing I'm not sure about is whether the vDSO interface is indeed
>> superior to KTLS, or if it is just the model we are used to.
>>
>> AFAIU, the current use-cases for vDSO is that an application calls into
>> glibc, which then calls the vDSO function exposed by the kernel. I wonder
>> whether the vDSO indirection is really needed if we typically have a glibc
>> function used as indirection ? For an end user, what is the benefit of vDSO
>> over accessing KTLS data directly from glibc ?
> 
> I think the kernel can only reasonably maintain a single userspace data
> structure.  It's not reasonable to update several versions of the data
> structure in parallel.

I disagree with your statement. Considering that the kernel needs to keep
ABI compatibility for whatever it exposes to user-space, claiming that it
should never update several versions of data structures exposed to user-space
in parallel means that once a data structure is exposed to user-space as ABI
in a certain way, it can never ever change in the future, even if we find
a better way to do things.

It makes more sense to allow multiple data structures to be updated
in parallel until older ones become deprecated/unused/irrelevant, at
which point those can be configured out at build time and eventually
phased out after years of deprecation. Having the ability to update multiple
data structures in user-space with replicated information is IMHO necessary
to allow creation of new/better accelerated ABIs.
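
As a rough illustration of what "updated in parallel" would mean, here is
a sketch in plain C. It is not kernel code: layout_v1, layout_v2 and
publish_cpu() are hypothetical names. It shows the stores that would be
needed each time a thread lands on a new CPU, one set per registered
layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two hypothetical generations of a user-visible per-thread layout. */
struct layout_v1 { uint32_t cpu_id; };
struct layout_v2 { uint32_t cpu_id; uint32_t node_id; };

/* What a thread registered; a NULL pointer means "not registered". */
struct registrations {
    struct layout_v1 *v1;
    struct layout_v2 *v2;
};

/*
 * Stand-in for the work done when a thread is scheduled on a new CPU.
 * Returns the number of stores performed, i.e. the cost that grows with
 * every layout kept alive in parallel.
 */
static int publish_cpu(const struct registrations *r, uint32_t cpu,
                       uint32_t node)
{
    int stores = 0;

    assert(r != NULL);
    if (r->v1) {
        r->v1->cpu_id = cpu;
        stores += 1;
    }
    if (r->v2) {
        r->v2->cpu_id = cpu;
        r->v2->node_id = node;
        stores += 2;
    }
    return stores;
}
```

Once the old layout is configured out (v1 == NULL here), only the stores
for the newer layout remain.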

> 
> This means that glibc would have to support multiple kernel data
> structures, and users might lose userspace acceleration after a kernel
> update, until they update glibc as well.  The glibc update should be
> ABI-compatible, but someone would still have to backport it, apply it to
> container images, etc.

No. If the kernel ever exposes a data structure to user-space as ABI,
then it needs to stay there, and not break userspace. Hence the need to
duplicate information provided to user-space if need be, so we can move
on to better ABIs without breaking the old ones.

> 
> What's worse, the glibc code would be quite hard to test because we
> would have to keep around multiple kernel versions to exercise all the
> different data structure variants.
> 
> In contrast, the vDSO code always matches the userspace data structures,
> is always updated at the same time, and tested together.  That looks
> like a clear win to me.

For cases where the overhead of vDSO is not an issue, I agree that it
makes things tidier than directly accessing a data structure. The
documentation of the ABI becomes much simpler as well.

> 
>> If we decide that using KTLS from a vDSO function is indeed a requirement,
>> then, as you point out, the thread_pointer is available as ABI, but we miss
>> the KTLS offset.
>>
>> Some ideas on how we could solve this: we could either make the KTLS
>> offset part of the ABI (fixed offset), or save the offset near the
>> thread pointer at a location that would become ABI. It would have to
>> be already populated with something which can help detect the case
>> where a vDSO is called from a thread which does not populate KTLS
>> though. Is that even remotely doable ?
> 
> I don't know.
> 
> We could decide that these accelerated system calls must only be called
> with a valid TCB.  That's unavoidable if the vDSO sets errno directly,
> so it's perhaps not a big loss.  It's also backwards-compatible because
> existing TCB-less code won't know about those new vDSO entrypoints.
> Calling into glibc from a TCB-less thread has always been undefined.
> TCB-less code would have to make direct, non-vDSO system calls, as today.
> 
> For discovering the KTLS offset, a per-process page at a fixed offset
> from the vDSO code (i.e., what real shared objects already do for global
> data) could store this offset.  This way, we could entirely avoid an ABI
> dependency.

Or as Andy mentioned, we would simply pass the ktls offset as argument to
the vDSO ? It seems simple enough. Would it fit all our use-cases including
errno ?

> 
> We'll see what will break once we have the correct TID after vfork. 8->
> glibc currently supports malloc-after-vfork as an extension, and
> a lot of software depends on it (OpenJDK, for example).

I am not sure I see how that is related to KTLS?

> 
>>> With the latter, we could
>>> directly expose the vDSO implementation to applications, assuming that
>>> we agree that the vDSO will not fail with ENOSYS to request fallback to
>>> the system call, but will itself perform the system call.
>>
>> We should not forget the fields needed by rseq as well: the rseq_cs
>> pointer and the cpu_id fields need to be accessed directly from the
>> rseq critical section, without function call. Those use-cases require
>> that applications and library can know 

Re: [RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64

2020-09-29 Thread Andy Lutomirski
On Mon, Sep 28, 2020 at 8:14 AM Florian Weimer wrote:
>
> * Mathieu Desnoyers:
>
> > Upstreaming efforts aiming to integrate rseq support into glibc led to
> > interesting discussions, where we identified a clear need to extend the
> > size of the per-thread structure shared between kernel and user-space
> > (struct rseq).  This is something that is not possible with the current
> > rseq ABI.  The fact that the current non-extensible rseq kernel ABI
> > would also prevent glibc's ABI to be extended prevents its integration
> > into glibc.
> >
> > Discussions with glibc maintainers led to the following design, which we
> > are calling "Kernel Thread Local Storage" or KTLS:
> >
> > - at glibc library init:
> >   - glibc queries the size and alignment of the KTLS area supported by the
> > kernel,
> >   - glibc reserves the memory area required by the kernel for main
> > thread,
> >   - glibc registers the offset from thread pointer where the KTLS area
> > will be placed for all threads belonging to the threads group which
> > are created with clone3 CLONE_RSEQ_KTLS,
> > - at nptl thread creation:
> >   - glibc reserves the memory area required by the kernel,
> > - application/libraries can query glibc for the offset/size of the
> >   KTLS area, and offset from the thread pointer to access that area.
>
> One remaining challenge I see is that we want to use vDSO functions to
> abstract away the exact layout of the KTLS area.  For example, there are
> various implementation strategies for getuid optimizations, some of them
> exposing a shared struct cred in a thread group, and others not doing
> that.
>
> The vDSO has access to the thread pointer because it's ABI (something
> that we recently (and quite conveniently) clarified for x86).  What it
> does not know is the offset of the KTLS area from the thread pointer.
> In the original rseq implementation, this offset could vary from thread
> to thread in a process, although the submitted glibc implementation did
> not use this level of flexibility and the offset is constant.  The vDSO
> is not relocated by the run-time dynamic loader, so it can't use ELF TLS
> data.

I assume that, by "thread pointer", you mean the pointer stored in
GSBASE on x86_32, FSBASE on x86_64, and elsewhere on other
architectures?

The vDSO has done pretty well so far by not touching FS, GS, or their
bases at all.  If we want to change that, I would be very
nervous about doing so in existing vDSO functions.  Regardless of
anything an ABI document might say and anything that existing or
previous glibc versions may or may not have done, there are plenty of
bizarre programs out there that don't really respect the psABI
document.  Go and various not-ready-for-prime-time-but-released-anyway
Bionic branches come to mind.  So we would need to tread very, very
carefully.

One way to side-step much of this would be to make the interface explicit:

long __vdso_do_whatever(void *ktls_ptr, ...);

Sadly, on x86, actually generating the ktls ptr is a bit nasty due to
the fact that lea %fs:(offset) doesn't do what one might have liked it
to do.  I suppose this could also be:

long __vdso_do_whatever(unsigned long ktls_offset);

which will generate quite nice code on x86_64.  I can't speak for the
asm capabilities of other architectures.
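
A user-space sketch of the two calling conventions, for comparison.  It
does not touch FS or GS: the thread pointer is simulated by a plain
variable, and struct ktls_sketch, fake_thread_pointer and the
sketch_getcpu_* names are assumptions made up for illustration, not
existing vDSO symbols.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical KTLS layout; only the field used below is shown. */
struct ktls_sketch { uint32_t cpu_id; };

/*
 * Stand-in for the thread pointer.  A real vDSO would read FSBASE on
 * x86_64; here the caller sets it so the sketch runs anywhere.
 */
static unsigned char *fake_thread_pointer;

/* Variant 1: the caller has already computed the address. */
static long sketch_getcpu_ptr(const void *ktls_ptr)
{
    struct ktls_sketch k;

    memcpy(&k, ktls_ptr, sizeof(k));
    return (long)k.cpu_id;
}

/* Variant 2: the caller passes only the offset from the thread pointer. */
static long sketch_getcpu_off(unsigned long ktls_offset)
{
    assert(fake_thread_pointer != NULL);
    return sketch_getcpu_ptr(fake_thread_pointer + ktls_offset);
}
```

In the second variant the callee does the thread-pointer-relative access
itself, so the caller never has to materialize the pointer.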

What I *don't* want to do is to accidentally repeat anything like the
%gs:0x28 mess we have with the stack cookie on x86_32.  (The stack
cookie is, in kernel code, in a completely nonsensical location.  I'm
quite surprised that any of the maintainers ever accepted the current
stack cookie implementation.  I assume there's some history there, but
I don't know it.  The end result is a festering mess in the x86_32
kernel code that only persists because no one cares quite enough about
x86_32 to fix it.)  We obviously won't end up with precisely the same
type of mistake here, but a mis-step here certainly does have the
possibility of promoting an unfortunate-in-hindsight design decision
in glibc and/or psABI to something that every other x86_64 Linux
software stack has to copy to be compatible with the vDSO.

As for errno itself, with all due respect to those who designed errno
before I was born, IMO it was a mistake.  Why exactly should the vDSO
know about errno?


Re: [RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64

2020-09-29 Thread Florian Weimer
* Mathieu Desnoyers:

>> So we have a bootstrap issue here that needs to be solved, I think.
>
> The one thing I'm not sure about is whether the vDSO interface is indeed
> superior to KTLS, or if it is just the model we are used to.
>
> AFAIU, the current use-cases for vDSO is that an application calls into
> glibc, which then calls the vDSO function exposed by the kernel. I wonder
> whether the vDSO indirection is really needed if we typically have a glibc
> function used as indirection ? For an end user, what is the benefit of vDSO
> over accessing KTLS data directly from glibc ?

I think the kernel can only reasonably maintain a single userspace data
structure.  It's not reasonable to update several versions of the data
structure in parallel.

This means that glibc would have to support multiple kernel data
structures, and users might lose userspace acceleration after a kernel
update, until they update glibc as well.  The glibc update should be
ABI-compatible, but someone would still have to backport it, apply it to
container images, etc.

What's worse, the glibc code would be quite hard to test because we
would have to keep around multiple kernel versions to exercise all the
different data structure variants.

In contrast, the vDSO code always matches the userspace data structures,
is always updated at the same time, and tested together.  That looks
like a clear win to me.

> If we decide that using KTLS from a vDSO function is indeed a requirement,
> then, as you point out, the thread_pointer is available as ABI, but we miss
> the KTLS offset.
>
> Some ideas on how we could solve this: we could either make the KTLS
> offset part of the ABI (fixed offset), or save the offset near the
> thread pointer at a location that would become ABI. It would have to
> be already populated with something which can help detect the case
> where a vDSO is called from a thread which does not populate KTLS
> though. Is that even remotely doable ?

I don't know.

We could decide that these accelerated system calls must only be called
with a valid TCB.  That's unavoidable if the vDSO sets errno directly,
so it's perhaps not a big loss.  It's also backwards-compatible because
existing TCB-less code won't know about those new vDSO entrypoints.
Calling into glibc from a TCB-less thread has always been undefined.
TCB-less code would have to make direct, non-vDSO system calls, as today.

For discovering the KTLS offset, a per-process page at a fixed offset
from the vDSO code (i.e., what real shared objects already do for global
data) could store this offset.  This way, we could entirely avoid an ABI
dependency.
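
A minimal sketch of that lookup, in plain C.  The vdso_data_sketch
structure and ktls_lookup() are hypothetical; they only illustrate the
idea of a per-process record that holds the offset plus a "registered"
marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-process data page placed next to the vDSO code. */
struct vdso_data_sketch {
    bool ktls_registered;      /* set once the process registers KTLS */
    ptrdiff_t ktls_offset;     /* offset from the thread pointer */
};

/*
 * Resolve the KTLS address for a thread, or NULL when the process has not
 * registered an offset.  Position-independent code can find the page
 * because its distance from the code is fixed at link time.
 */
static void *ktls_lookup(const struct vdso_data_sketch *page,
                         void *thread_pointer)
{
    assert(page != NULL);
    if (!page->ktls_registered || thread_pointer == NULL)
        return NULL;
    return (char *)thread_pointer + page->ktls_offset;
}
```

Note that this only covers the per-process part; it cannot tell whether
an individual thread was created without a KTLS area.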

We'll see what will break once we have the correct TID after vfork. 8->
glibc currently supports malloc-after-vfork as an extension, and
a lot of software depends on it (OpenJDK, for example).

>> With the latter, we could
>> directly expose the vDSO implementation to applications, assuming that
>> we agree that the vDSO will not fail with ENOSYS to request fallback to
>> the system call, but will itself perform the system call.
>
> We should not forget the fields needed by rseq as well: the rseq_cs
> pointer and the cpu_id fields need to be accessed directly from the
> rseq critical section, without function call. Those use-cases require
> that applications and library can know the KTLS offset and size and
> use those fields directly.

Yes, but those offsets could be queried using a function from the vDSO
(or using a glibc interface, to simplify linking).
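
Roughly like this, as a sketch: query once, then do a plain load on the
hot path.  The ktls_rseq_sketch layout, struct ktls_info and ktls_query()
are assumptions for illustration and do not correspond to an existing
vDSO or glibc interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the KTLS area holding the rseq fields. */
struct ktls_rseq_sketch {
    uint32_t cpu_id;
    uint64_t rseq_cs;
};

/* Hypothetical query result: where the area sits and how big it is. */
struct ktls_info {
    ptrdiff_t offset;   /* from the thread pointer */
    size_t size;        /* bytes the kernel populates */
};

/* Stand-in for the one-time query (vDSO or glibc function). */
static struct ktls_info ktls_query(ptrdiff_t offset)
{
    struct ktls_info info = { offset, sizeof(struct ktls_rseq_sketch) };

    return info;
}

/* Hot path: plain load, no function call, using the cached offset. */
static inline uint32_t read_cpu_id(const void *thread_pointer,
                                   const struct ktls_info *info)
{
    const struct ktls_rseq_sketch *k;

    assert(info->size >= offsetof(struct ktls_rseq_sketch, cpu_id)
                         + sizeof(uint32_t));
    k = (const struct ktls_rseq_sketch *)
        ((const char *)thread_pointer + info->offset);
    return k->cpu_id;
}
```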

Thanks,
Florian



Re: [RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64

2020-09-28 Thread Mathieu Desnoyers
On Sep 28, 2020, at 11:13 AM, Florian Weimer fwei...@redhat.com wrote:

> * Mathieu Desnoyers:
> 
>> Upstreaming efforts aiming to integrate rseq support into glibc led to
>> interesting discussions, where we identified a clear need to extend the
>> size of the per-thread structure shared between kernel and user-space
>> (struct rseq).  This is something that is not possible with the current
>> rseq ABI.  The fact that the current non-extensible rseq kernel ABI
>> would also prevent glibc's ABI to be extended prevents its integration
>> into glibc.
>>
>> Discussions with glibc maintainers led to the following design, which we
>> are calling "Kernel Thread Local Storage" or KTLS:
>>
>> - at glibc library init:
>>   - glibc queries the size and alignment of the KTLS area supported by the
>> kernel,
>>   - glibc reserves the memory area required by the kernel for main
>> thread,
>>   - glibc registers the offset from thread pointer where the KTLS area
>> will be placed for all threads belonging to the threads group which
>> are created with clone3 CLONE_RSEQ_KTLS,
>> - at nptl thread creation:
>>   - glibc reserves the memory area required by the kernel,
>> - application/libraries can query glibc for the offset/size of the
>>   KTLS area, and offset from the thread pointer to access that area.
> 
> One remaining challenge I see is that we want to use vDSO functions to
> abstract away the exact layout of the KTLS area.  For example, there are
> various implementation strategies for getuid optimizations, some of them
> exposing a shared struct cred in a thread group, and others not doing
> that.
> 
> The vDSO has access to the thread pointer because it's ABI (something
> that we recently (and quite conveniently) clarified for x86).  What it
> does not know is the offset of the KTLS area from the thread pointer.
> In the original rseq implementation, this offset could vary from thread
> to thread in a process, although the submitted glibc implementation did
> not use this level of flexibility and the offset is constant.  The vDSO
> is not relocated by the run-time dynamic loader, so it can't use ELF TLS
> data.

In the context of this prototype, the KTLS offset is the same for all threads
belonging to a thread group.

> 
> Furthermore, not all threads in a thread group may have an associated
> KTLS area.  In a potential glibc implementation, only the threads
> created by pthread_create would have it; threads created directly using
> clone would lack it (and would not even run with a correctly set up
> userspace TCB).

Right.

> 
> So we have a bootstrap issue here that needs to be solved, I think.

The one thing I'm not sure about is whether the vDSO interface is indeed
superior to KTLS, or if it is just the model we are used to.

AFAIU, the current use-case for the vDSO is that an application calls into
glibc, which then calls the vDSO function exposed by the kernel. I wonder
whether the vDSO indirection is really needed if we typically have a glibc
function used as indirection ? For an end user, what is the benefit of vDSO
over accessing KTLS data directly from glibc ?

If we decide that using KTLS from a vDSO function is indeed a requirement,
then, as you point out, the thread_pointer is available as ABI, but we miss
the KTLS offset.

Some ideas on how we could solve this: we could either make the KTLS
offset part of the ABI (fixed offset), or save the offset near the thread
pointer at a location that would become ABI. It would have to be already
populated with something which can help detect the case where a vDSO is
called from a thread which does not populate KTLS though. Is that even
remotely doable ?

> 
> In most cases, I would not be too eager to bypass the vDSO completely,
> and having the kernel expose a data-only interface.  I could perhaps
> make an exception for the current TID because that's so convenient to
> use in mutex implementations, and errno.

Indeed, using a KTLS field to store errno is another use-case I forgot to
mention. That would make life easier for errno handling in vDSO as well.

> With the latter, we could
> directly expose the vDSO implementation to applications, assuming that
> we agree that the vDSO will not fail with ENOSYS to request fallback to
> the system call, but will itself perform the system call.

We should not forget the fields needed by rseq as well: the rseq_cs pointer and
the cpu_id fields need to be accessed directly from the rseq critical section,
without function call. Those use-cases require that applications and library can
know the KTLS offset and size and use those fields directly. That being said,
there are certainly plenty of use-cases where it makes sense to use the KTLS
data through a vDSO, and only expose the vDSO interface, if the cost of the
extra vDSO call indirection is not prohibitive.

Thanks,

Mathieu


Re: [RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64

2020-09-28 Thread Florian Weimer
* Mathieu Desnoyers:

> Upstreaming efforts aiming to integrate rseq support into glibc led to
> interesting discussions, where we identified a clear need to extend the
> size of the per-thread structure shared between kernel and user-space
> (struct rseq).  This is something that is not possible with the current
> rseq ABI.  The fact that the current non-extensible rseq kernel ABI
> would also prevent glibc's ABI to be extended prevents its integration
> into glibc.
>
> Discussions with glibc maintainers led to the following design, which we
> are calling "Kernel Thread Local Storage" or KTLS:
>
> - at glibc library init:
>   - glibc queries the size and alignment of the KTLS area supported by the
> kernel,
>   - glibc reserves the memory area required by the kernel for main
> thread,
>   - glibc registers the offset from thread pointer where the KTLS area
> will be placed for all threads belonging to the threads group which
> are created with clone3 CLONE_RSEQ_KTLS,
> - at nptl thread creation:
>   - glibc reserves the memory area required by the kernel,
> - application/libraries can query glibc for the offset/size of the
>   KTLS area, and offset from the thread pointer to access that area.

One remaining challenge I see is that we want to use vDSO functions to
abstract away the exact layout of the KTLS area.  For example, there are
various implementation strategies for getuid optimizations, some of them
exposing a shared struct cred in a thread group, and others not doing
that.

The vDSO has access to the thread pointer because it's ABI (something
that we recently (and quite conveniently) clarified for x86).  What it
does not know is the offset of the KTLS area from the thread pointer.
In the original rseq implementation, this offset could vary from thread
to thread in a process, although the submitted glibc implementation did
not use this level of flexibility and the offset is constant.  The vDSO
is not relocated by the run-time dynamic loader, so it can't use ELF TLS
data.

Furthermore, not all threads in a thread group may have an associated
KTLS area.  In a potential glibc implementation, only the threads
created by pthread_create would have it; threads created directly using
clone would lack it (and would not even run with a correctly set up
userspace TCB).

So we have a bootstrap issue here that needs to be solved, I think.

In most cases, I would not be too eager to bypass the vDSO completely
and have the kernel expose a data-only interface.  I could perhaps
make an exception for the current TID because that's so convenient to
use in mutex implementations, and errno.  With the latter, we could
directly expose the vDSO implementation to applications, assuming that
we agree that the vDSO will not fail with ENOSYS to request fallback to
the system call, but will itself perform the system call.

Thanks,
Florian



[RFC PATCH 1/2] rseq: Implement KTLS prototype for x86-64

2020-09-25 Thread Mathieu Desnoyers
Upstreaming efforts aiming to integrate rseq support into glibc led to
interesting discussions, where we identified a clear need to extend the
size of the per-thread structure shared between kernel and user-space
(struct rseq).  This is something that is not possible with the current
rseq ABI.  The fact that the current non-extensible rseq kernel ABI
would also prevent glibc's ABI from being extended prevents its integration
into glibc.

Discussions with glibc maintainers led to the following design, which we
are calling "Kernel Thread Local Storage" or KTLS:

- at glibc library init:
  - glibc queries the size and alignment of the KTLS area supported by the
kernel,
  - glibc reserves the memory area required by the kernel for main
thread,
  - glibc registers the offset from thread pointer where the KTLS area
will be placed for all threads belonging to the threads group which
are created with clone3 CLONE_RSEQ_KTLS,
- at nptl thread creation:
  - glibc reserves the memory area required by the kernel,
- application/libraries can query glibc for the offset/size of the
  KTLS area, and offset from the thread pointer to access that area.
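
The placement step can be sketched as follows.  This is a simplified
illustration, not the prototype's code: struct ktls_req, align_up() and
ktls_place() are hypothetical names, and the sketch places the area at a
positive offset inside a per-thread block, whereas the actual direction
relative to the thread pointer depends on the architecture's TLS layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical answer to "what size and alignment does the kernel want?" */
struct ktls_req { size_t size; size_t align; };

/* Round v up to a multiple of align, which must be a power of two. */
static size_t align_up(size_t v, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (v + align - 1) & ~(align - 1);
}

/*
 * Pick the offset of the KTLS area inside a per-thread block whose first
 * tcb_size bytes are already used, and report the new total block size.
 * The same offset is then registered once and reused for every thread.
 */
static size_t ktls_place(size_t tcb_size, struct ktls_req req, size_t *total)
{
    size_t offset = align_up(tcb_size, req.align);

    *total = offset + req.size;
    return offset;
}
```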

The basic idea is that the kernel UAPI will define the layout of that
per-thread area, and glibc only deals with its allocation.

It should fulfill the extensibility needs for many extensions which can
benefit from a per-thread shared memory area between kernel and
user-space, e.g.:

- Exposing:
  - current NUMA node number,
  - current cpu number,
  - current user id, group id, thread id, process id, process group id, ...
  - signal block flag (fast-path for signal blocking),
  - pretty much anything a vDSO can be used for, without requiring the
function call.
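
One way such an extensible area could be consumed, as a sketch under
assumptions: the field names and their order below are invented for
illustration (this is not the UAPI layout of the patch), and the check
relies on fields only ever being appended, so the size reported by the
kernel tells user-space which fields exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; new fields are only ever appended. */
struct ktls_fields_sketch {
    uint32_t cpu_id;
    uint32_t node_id;
    uint32_t tid;
    uint32_t uid;
    uint32_t signal_block;
};

/* True when a field lies entirely within the size the kernel reported. */
#define KTLS_HAS_FIELD(kernel_size, field) \
    ((kernel_size) >= offsetof(struct ktls_fields_sketch, field) + \
     sizeof(((struct ktls_fields_sketch *)0)->field))

/* Read the TID from KTLS, or return -1 so the caller can use gettid. */
static long ktls_tid_or_fallback(const struct ktls_fields_sketch *k,
                                 size_t kernel_size)
{
    assert(k != NULL);
    if (!KTLS_HAS_FIELD(kernel_size, tid))
        return -1;
    return (long)k->tid;
}
```

An older kernel reporting a size of 8 bytes would expose cpu_id and
node_id only, and callers would fall back to the system call for the TID.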

This patch implements a crude prototype of extensible rseq ABI to show
what interfaces we need to expose.  It's currently wired up using rseq
flags, turning rseq into a system call multiplexer, which was easy for a
prototype.

It should be ready for some bikeshedding on what that ABI should look
like. Is this extension using flags OK, or should this appear as new
system calls instead ?

Signed-off-by: Mathieu Desnoyers 
Cc: Carlos O'Donell 
Cc: Florian Weimer
Cc: Peter Zijlstra (Intel) 
Cc: "Paul E. McKenney" 
Cc: Boqun Feng 
---
 arch/x86/kernel/process_64.c |   1 +
 include/linux/sched.h        | 122 +---
 include/linux/sched/signal.h | 150 +
 include/uapi/linux/rseq.h    |  16 +++-
 include/uapi/linux/sched.h   |   1 +
 kernel/fork.c                |  14 ++-
 kernel/rseq.c                | 177 +++
 7 files changed, 321 insertions(+), 160 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 9a97415b2139..f0c822a78d01 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -670,6 +670,7 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
 			task->thread.fsindex = 0;
 			x86_fsbase_write_task(task, arg2);
 		}
+		rseq_set_thread_pointer(task, arg2);
 		preempt_enable();
 		break;
 	}
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 683372943093..7b11ee7dee1a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1148,6 +1148,8 @@ struct task_struct {
 	 * with respect to preemption.
 	 */
 	unsigned long rseq_event_mask;
+	unsigned long rseq_thread_pointer;
+	unsigned int rseq_ktls:1;
 #endif
 
 	struct tlbflush_unmap_batch tlb_ubc;
@@ -1907,114 +1909,6 @@ extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
 #define TASK_SIZE_OF(tsk)	TASK_SIZE
 #endif
 
-#ifdef CONFIG_RSEQ
-
-/*
- * Map the event mask on the user-space ABI enum rseq_cs_flags
- * for direct mask checks.
- */
-enum rseq_event_mask_bits {
-	RSEQ_EVENT_PREEMPT_BIT	= RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT,
-	RSEQ_EVENT_SIGNAL_BIT	= RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT,
-	RSEQ_EVENT_MIGRATE_BIT	= RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT,
-};
-
-enum rseq_event_mask {
-	RSEQ_EVENT_PREEMPT	= (1U << RSEQ_EVENT_PREEMPT_BIT),
-	RSEQ_EVENT_SIGNAL	= (1U << RSEQ_EVENT_SIGNAL_BIT),
-	RSEQ_EVENT_MIGRATE	= (1U << RSEQ_EVENT_MIGRATE_BIT),
-};
-
-static inline void rseq_set_notify_resume(struct task_struct *t)
-{
-	if (t->rseq)
-		set_tsk_thread_flag(t, TIF_NOTIFY_RESUME);
-}
-
-void __rseq_handle_notify_resume(struct ksignal *sig, struct pt_regs *regs);
-
-static inline void rseq_handle_notify_resume(struct ksignal *ksig,
-					     struct pt_regs *regs)
-{
-	if (current->rseq)
-		__rseq_handle_notify_resume(ksig, regs);
-}
-
-static inline void rseq_signal_deliver(struct ksignal *ksig,
-				       struct pt_regs *regs)
-{
-