Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-13 Thread Mathieu Desnoyers
- On Jul 11, 2020, at 11:54 AM, Christian Brauner 
christian.brau...@ubuntu.com wrote:

> 
> The registration is a thread-group property, I'll assume, right? I.e. all
> threads will have the rseq TLS or no thread will have it?

No, rseq registration is a per-thread property, but it would arguably be
a bit weird for a thread-group to observe different registration states
for different threads.

> Some things I seem to be able to assume (correct me if I'm wrong):
> - rseq registration is expected to be done at thread creation time

True.

> - rseq registration _should_ only be done once, i.e. if a caller detects
>  that rseq is already registered for a thread, then they could
>  technically re-register rseq - risking a feature mismatch if doing so
>  - but they shouldn't re-register but simply assume that someone else
>  is in control of rseq. If they violate that assumption then you're
>  hosed anyway.

Right.

> So it seems as long as callers leave rseq registration alone whenever
> they detect that it is already registered then one can assume that rseq
> is under consistent control of a single library/program. If that's the
> case it should be safe to assume that the library will use the same rseq
> registration for all threads bounded by the available kernel features or
> by the set of features it is aware of.

The rseq registration is per-thread. But yes, as soon as one user registers
rseq, other users for that thread are expected to piggy-back on that
registration.
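To make the piggy-back rule concrete, here is a hedged sketch (not taken from librseq or glibc; the struct below is a local mirror of the 2020-era uapi layout, defined here only for illustration) of how a second user can detect an existing registration: the kernel stores a valid (>= 0) cpu number into the cpu_id field once the area is registered, so checking that field tells a library whether someone else already owns the registration.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the 2020-era uapi struct rseq layout (linux/rseq.h),
 * for illustration only -- real code should use the uapi header. */
struct rseq_abi_view {
    uint32_t cpu_id_start;
    uint32_t cpu_id;     /* sentinel values below until registered */
    uint64_t rseq_cs;
    uint32_t flags;
} __attribute__((aligned(32)));

/* Sentinel values from the uapi header: */
#define RSEQ_CPU_ID_UNINITIALIZED       ((uint32_t)-1)
#define RSEQ_CPU_ID_REGISTRATION_FAILED ((uint32_t)-2)

/* A later user piggy-backs on an existing registration instead of
 * re-registering: once the kernel has registered the area it keeps
 * cpu_id updated to a valid (>= 0) cpu number, so a non-negative value
 * means some other component (e.g. libc) already owns the registration. */
static int rseq_thread_is_registered(const struct rseq_abi_view *r)
{
    return (int32_t)r->cpu_id >= 0;
}
```

A user that finds the thread already registered should simply use the existing area and never unregister it.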

> I proposed that specific scheme because I was under the impression that
> you are in need of a mechanism for a caller (at thread creation I
> assume) to check what feature set is supported by rseq _without_
> issuing a system call. If you were to record the requested flags in
> struct rseq or in some other way, then another library which tries to
> register rseq for a thread but detects it has already been registered
> can look at e.g. whether the reliable cpu feature is around and then
> adjust its behavior accordingly.
> Another reason why this seems worthwhile is because of how rseq works in
> general. Since it registers a piece of userspace memory which userspace
> can trivially access enforcing that userspace issue a syscall to get at
> the feature list seems odd when you can just record it in the struct.
> But that's a matter of style, I guess.

Good points.

> 
> Also, I'm thinking about the case of adding one or two new features that
> introduce mutually exclusive behavior for rseq, i.e. if you register
> rseq with FEAT1 and someone else registers it with FEAT2 and FEAT1 and
> FEAT2 would lead to incompatible behavior for an aspect or all of rseq.
> Even if you had a way to query the kernel for FEAT1 and FEAT2 in the
> rseq syscall it would still be a problem since a caller wouldn't know at rseq
> registration time whether the library registered rseq with FEAT1 or
> FEAT2. If you record the behavior somewhere - kernel_flags or whatever -
> then the caller could check at registration time: "ah, rseq is registered
> with this behavior; I need to adjust my behavior accordingly."

I think what we want here is to be able to extend the feature set, but not
"pick and choose" different incompatible features.

[...]
>> 
>> One additional thing to keep in mind: the application can itself choose
>> to define the __rseq_abi TLS, which AFAIU (please let me know if I am
>> wrong) would take precedence over glibc's copy. So extending the
> 
> You mean either that an application could simply choose to ignore that e.g.
> glibc has already registered rseq and e.g. unregister it and register
> its own, or that it registers its own rseq before glibc comes into
> play?

Not quite.

I mean that an application binary or a preloaded library is allowed to
interpose with glibc and expose a __rseq_abi symbol with a size smaller
than glibc's __rseq_abi. The issue is that glibc (or other library
responsible for rseq registration) is unaware of that symbol's size.

This means that extending __rseq_abi cannot be done by means of additional
flags or parameters passed directly from the registration owner to the
rseq system call.

> I mean, if I interpreted what you're trying to say correctly, I think
> one needs to work under the assumption that if someone else has already
> registered rseq then it becomes the single source of truth. I think that
> basically needs to be the contract. Same way you expect a user of
> pthreads to not suddenly go and call clone3() with CLONE_THREAD |
> CLONE_VM | [...] and assume that this won't mess with glibc's internal
> state.

Right. The issue is not about which library owns the registration, but
rather making sure everyone agrees on the size of __rseq_abi, including
interposition scenarios.

[...]
>> 
>> For both approaches, we could either pass them as parameters with rseq
>> registration, and make rseq registration success conditional on the
>> kernel implementing those feature/fix-version, or validate the flag/version
>> separately from 

Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-11 Thread Christian Brauner
On Thu, Jul 09, 2020 at 11:15:57AM -0400, Mathieu Desnoyers wrote:
> - On Jul 9, 2020, at 8:49 AM, Christian Brauner 
> christian.brau...@ubuntu.com wrote:
> 
> > On Wed, Jul 08, 2020 at 01:34:48PM -0400, Mathieu Desnoyers wrote:
> >> - On Jul 8, 2020, at 12:22 PM, Christian Brauner
> >> christian.brau...@ubuntu.com wrote:
> >> [...]
> >> > I've been following this a little bit. The kernel version itself doesn't
> >> > really mean anything and the kernel version is imho not at all
> >> > interesting to userspace applications. Especially for cross-distro
> >> > programs. We can't go around and ask Red Hat, SUSE, Ubuntu, Archlinux,
> >> > openSUSE and god knows who what other distro what their fixed kernel
> >> > version is. That's not feasible at all and not how most programs do it.
> >> > Sure, a lot of programs name a minimal kernel version they require but
> >> > realistically we can't keep bumping it all the time. So the best
> >> > strategy for userspace imho has been to introduce a re-versioned flag or
> >> > enum that indicates the fixed behavior.
> >> > 
> >> > So I would suggest to just introduce
> >> > RSEQ_FLAG_REGISTER_2  = (1 << 2),
> >> > that's how these things are usually done (Netlink etc.). So not
> >> > introducing a fix bit or whatever but simply re-version your flag/enum.
> >> > We already deal with this today.
> >> 
> >> Because rseq is effectively a per-thread resource shared across application
> >> and libraries, it is not practical to merge the notion of version with the
> >> registration. Typically __rseq_abi is registered by libc, and can be used
> >> by the application and by many libraries. Early adopter libraries and
> >> applications (e.g. librseq, tcmalloc) can also choose to handle 
> >> registration
> >> if it's not already done by libc.
> > 
> > I'm probably missing the elephant in the room but I was briefly looking
> > at github.com/compudj/librseq and it seems to me that the registration
> > you're talking about is:
> > 
> > extern __thread struct rseq __rseq_abi;
> > extern int __rseq_handled;
> 
> Note that __rseq_handled has now vanished, adapting to glibc's ABI. I just
> updated librseq's header accordingly.
> 
> > 
> > and it's done in int rseq_register_current_thread(void) afaict and
> > currently registration is done with flags set to 0.
> 
> Correct, however that registration will become a no-op when linked against a
> glibc 2.32+, because the glibc will already have handled the registration
> at thread creation.

The registration is a thread-group property, I'll assume, right? I.e. all
threads will have the rseq TLS or no thread will have it? Some things I seem
to be able to assume (correct me if I'm wrong):
- rseq registration is expected to be done at thread creation time
- rseq registration _should_ only be done once, i.e. if a caller detects
  that rseq is already registered for a thread, then they could
  technically re-register rseq - risking a feature mismatch if doing so
  - but they shouldn't re-register but simply assume that someone else
  is in control of rseq. If they violate that assumption then you're
  hosed anyway.
So it seems as long as callers leave rseq registration alone whenever
they detect that it is already registered then one can assume that rseq
is under consistent control of a single library/program. If that's the
case it should be safe to assume that the library will use the same rseq
registration for all threads bounded by the available kernel features or
by the set of features it is aware of.

> 
> > 
> > What is the problem with either adding a - I don't know -
> > RSEQ_FLAG_REGISTER/RSEQ_RELIABLE_CPU_FIELD flag that is also recorded in
> > __rseq_abi.flags. If the kernel doesn't support the flag it will fail
> > registration with EINVAL. So the registering program can detect it. If a
> > caller needs to know whether another thread uses the new flag it can
> > query __rseq_abi.flags. Some form of coordination must be possible in
> > userspace otherwise you'll have trouble with any new feature you add. In
> > general, I don't see how this is different from adding a new feature to
> > rseq. It should be the same principle.
> 
> The problem with "extending" struct rseq is that it becomes complex
> because it is shared between libraries and application. Let's suppose
> the library doing the rseq registration does the scheme you describe:
> queries the kernel for features, and stores them in the __rseq_abi.flags.
> We end up with the following upgrade transition headaches for an
> application using __rseq_abi:
> 
> Kernel | glibc | librseq | __rseq_abi registration owner
> -------+-------+---------+--------------------------------------
> 4.18   | 2.31  | no      | application (reliable cpu_id = false)
> 4.18   | 2.31  | yes     | librseq (reliable cpu_id = false)
> 5.8    | 2.31  | yes     | librseq (reliable cpu_id = true)
> 5.8    | 2.32  | yes     | glibc (reliable cpu_id = false)
> 5.8    | 2.33+ | yes     | glibc (reliable cpu_id = true)

Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-09 Thread Mathieu Desnoyers
- On Jul 9, 2020, at 8:49 AM, Christian Brauner 
christian.brau...@ubuntu.com wrote:

> On Wed, Jul 08, 2020 at 01:34:48PM -0400, Mathieu Desnoyers wrote:
>> - On Jul 8, 2020, at 12:22 PM, Christian Brauner
>> christian.brau...@ubuntu.com wrote:
>> [...]
>> > I've been following this a little bit. The kernel version itself doesn't
>> > really mean anything and the kernel version is imho not at all
>> > interesting to userspace applications. Especially for cross-distro
>> > programs. We can't go around and ask Red Hat, SUSE, Ubuntu, Archlinux,
>> > openSUSE and god knows who what other distro what their fixed kernel
>> > version is. That's not feasible at all and not how most programs do it.
>> > Sure, a lot of programs name a minimal kernel version they require but
>> > realistically we can't keep bumping it all the time. So the best
>> > strategy for userspace imho has been to introduce a re-versioned flag or
>> > enum that indicates the fixed behavior.
>> > 
>> > So I would suggest to just introduce
>> > RSEQ_FLAG_REGISTER_2  = (1 << 2),
>> > that's how these things are usually done (Netlink etc.). So not
>> > introducing a fix bit or whatever but simply re-version your flag/enum.
>> > We already deal with this today.
>> 
>> Because rseq is effectively a per-thread resource shared across application
>> and libraries, it is not practical to merge the notion of version with the
>> registration. Typically __rseq_abi is registered by libc, and can be used
>> by the application and by many libraries. Early adopter libraries and
>> applications (e.g. librseq, tcmalloc) can also choose to handle registration
>> if it's not already done by libc.
> 
> I'm probably missing the elephant in the room but I was briefly looking
> at github.com/compudj/librseq and it seems to me that the registration
> you're talking about is:
> 
> extern __thread struct rseq __rseq_abi;
> extern int __rseq_handled;

Note that __rseq_handled has now vanished, adapting to glibc's ABI. I just
updated librseq's header accordingly.

> 
> and it's done in int rseq_register_current_thread(void) afaict and
> currently registration is done with flags set to 0.

Correct, however that registration will become a no-op when linked against a
glibc 2.32+, because the glibc will already have handled the registration
at thread creation.

> 
> What is the problem with either adding a - I don't know -
> RSEQ_FLAG_REGISTER/RSEQ_RELIABLE_CPU_FIELD flag that is also recorded in
> __rseq_abi.flags. If the kernel doesn't support the flag it will fail
> registration with EINVAL. So the registering program can detect it. If a
> caller needs to know whether another thread uses the new flag it can
> query __rseq_abi.flags. Some form of coordination must be possible in
> userspace otherwise you'll have trouble with any new feature you add. In
> general, I don't see how this is different from adding a new feature to
> rseq. It should be the same principle.

The problem with "extending" struct rseq is that it becomes complex
because it is shared between libraries and application. Let's suppose
the library doing the rseq registration does the scheme you describe:
queries the kernel for features, and stores them in the __rseq_abi.flags.
We end up with the following upgrade transition headaches for an
application using __rseq_abi:

Kernel | glibc | librseq | __rseq_abi registration owner
-------+-------+---------+--------------------------------------
4.18   | 2.31  | no      | application (reliable cpu_id = false)
4.18   | 2.31  | yes     | librseq (reliable cpu_id = false)
5.8    | 2.31  | yes     | librseq (reliable cpu_id = true)
5.8    | 2.32  | yes     | glibc (reliable cpu_id = false)
5.8    | 2.33+ | yes     | glibc (reliable cpu_id = true)

This kind of transition regressing feature-wise when upgrading a glibc
can be confusing for users.

One possibility would be to have the kernel store the "reliable cpu_id"
flag directly into a new __rseq_abi.kernel_flags (because __rseq_abi.flags
is documented as only read by the kernel). This would remove the registration
owner from the upgrade scenarios. But what would we gain by exposing this
flag within struct rseq? The only real reason for doing so over using an
explicit system call is typically speed, and querying the kernel for a
feature does not need to be done often, so this is why I originally favored
exposing this information through a new system call flag without changing
the content of struct rseq_cs.
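The query-by-flag idea favored above could be probed from userspace roughly as follows. This is a sketch only: RSEQ_FLAG_RELIABLE_CPU_ID and its value (1 << 2) come from this RFC, which was ultimately dropped and never merged, so on mainline kernels this probe is expected to fail with EINVAL (unknown flag) and the function returns 0.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_rseq
#define __NR_rseq 334   /* x86-64; present in headers since Linux 4.18 */
#endif

/* Flag value from this RFC patch set; never merged into mainline. */
#define RSEQ_FLAG_RELIABLE_CPU_ID (1 << 2)

/* Returns 1 if the kernel advertises the cpu_id fix, 0 otherwise
 * (including kernels where the rseq syscall or this flag is unknown).
 * The call has no side effects: an unknown flag is rejected before any
 * registration state is touched. */
static int rseq_cpu_id_is_reliable(void)
{
    long ret = syscall(__NR_rseq, NULL, 0U, RSEQ_FLAG_RELIABLE_CPU_ID, 0U);
    return ret == 0;
}
```

This keeps the feature query out of struct rseq entirely, at the cost of one extra syscall per interested user, which, as noted above, is rarely on any fast path.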

One additional thing to keep in mind: the application can itself choose
to define the __rseq_abi TLS, which AFAIU (please let me know if I am
wrong) would take precedence over glibc's copy. So extending the
size of struct rseq seems rather tricky because the application may
provide a smaller __rseq_abi, even if both the kernel and glibc agree
on a larger size.

> 
> I also don't 

Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-09 Thread Christian Brauner
On Wed, Jul 08, 2020 at 01:34:48PM -0400, Mathieu Desnoyers wrote:
> - On Jul 8, 2020, at 12:22 PM, Christian Brauner 
> christian.brau...@ubuntu.com wrote:
> [...]
> > I've been following this a little bit. The kernel version itself doesn't
> > really mean anything and the kernel version is imho not at all
> > interesting to userspace applications. Especially for cross-distro
> > programs. We can't go around and ask Red Hat, SUSE, Ubuntu, Archlinux,
> > openSUSE and god knows who what other distro what their fixed kernel
> > version is. That's not feasible at all and not how most programs do it.
> > Sure, a lot of programs name a minimal kernel version they require but
> > realistically we can't keep bumping it all the time. So the best
> > strategy for userspace imho has been to introduce a re-versioned flag or
> > enum that indicates the fixed behavior.
> > 
> > So I would suggest to just introduce
> > RSEQ_FLAG_REGISTER_2  = (1 << 2),
> > that's how these things are usually done (Netlink etc.). So not
> > introducing a fix bit or whatever but simply re-version your flag/enum.
> > We already deal with this today.
> 
> Because rseq is effectively a per-thread resource shared across application
> and libraries, it is not practical to merge the notion of version with the
> registration. Typically __rseq_abi is registered by libc, and can be used
> by the application and by many libraries. Early adopter libraries and
> applications (e.g. librseq, tcmalloc) can also choose to handle registration
> if it's not already done by libc.

I'm probably missing the elephant in the room but I was briefly looking
at github.com/compudj/librseq and it seems to me that the registration
you're talking about is:

extern __thread struct rseq __rseq_abi;
extern int __rseq_handled;

and it's done in int rseq_register_current_thread(void) afaict and
currently registration is done with flags set to 0.

What is the problem with either adding a - I don't know -
RSEQ_FLAG_REGISTER/RSEQ_RELIABLE_CPU_FIELD flag that is also recorded in
__rseq_abi.flags. If the kernel doesn't support the flag it will fail
registration with EINVAL. So the registering program can detect it. If a
caller needs to know whether another thread uses the new flag it can
query __rseq_abi.flags. Some form of coordination must be possible in
userspace otherwise you'll have trouble with any new feature you add. In
general, I don't see how this is different from adding a new feature to
rseq. It should be the same principle.
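The coordination scheme suggested here (register with a feature flag, let EINVAL signal an unsupported feature, fall back or piggy-back otherwise) hinges on classifying the result of the rseq() call. The decision logic is pure and can be sketched on its own; the outcome names are hypothetical, and note that on real kernels EINVAL is also returned for a mismatched re-registration, so this mapping is a simplification:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome classification for a registration attempted
 * with a new feature flag (names invented for this sketch). */
enum rseq_outcome {
    RSEQ_OWNED,          /* we registered; the feature flag was accepted   */
    RSEQ_PIGGYBACK,      /* thread already registered; reuse that area     */
    RSEQ_FEATURE_ABSENT, /* kernel rejected the flag (or lacks rseq): use
                          * the rseq-free fallback path                    */
};

static enum rseq_outcome classify_rseq_register(long ret, int err)
{
    if (ret == 0)
        return RSEQ_OWNED;
    if (err == EBUSY)   /* same area already registered for this thread */
        return RSEQ_PIGGYBACK;
    return RSEQ_FEATURE_ABSENT;  /* EINVAL, ENOSYS, ... */
}
```

A caller taking the RSEQ_PIGGYBACK branch would then, per the proposal above, read the feature bits the registration owner recorded in __rseq_abi.flags rather than re-registering.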

I also don't understand the "not practical to merge the notion of
version with the registration". I'm not sure what that means to be
honest. :)
But just thinking about adding a new feature to rseq. Then you're in the
same spot, I think. When you register a bumped rseq - because you added
a new flag or whatever - you register a new version one way or the other
since a new feature - imho - is always a version bump. In fact, you
could think of your "reliable cpu" as a new feature, not a bug. ;)

Also, you seem to directly version struct rseq_cs already through the
"version" member. So even if you are against the new flag I wouldn't
know what would stop you from directly versioning struct rseq itself.

And it's not that we don't version syscalls. We're doing it in multiple
ways, to be honest: syscalls with a flag argument that reject unknown
flags are bumped in their version every time you add a new flag that
they accept. We don't spell this out but this is effectively what it is.
Think of it as a minor version bump. Extensible syscalls are versioned
by size and when their struct grows are bumped in their (minor) version.
In fact extensible syscalls with flags argument embedded in the struct
can be version bumped in two ways: growing a new flag argument or
growing a new struct member.
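The size-based versioning described here can be exercised from userspace with any copy_struct_from_user()-style syscall. Since rseq itself is not size-extensible, this sketch uses openat2(2) as the example: we deliberately pass a struct larger than the known open_how with the extra bytes zeroed. A kernel that understands the larger layout uses it; an older kernel accepts the call as long as the unknown tail is all zero (and rejects it with E2BIG otherwise). The `future_field` member is hypothetical, added only to make the struct bigger.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_openat2
#define __NR_openat2 437   /* same number on all architectures since 5.6 */
#endif

/* Mirrors struct open_how (flags, mode, resolve) plus one hypothetical
 * extension field, to simulate a "newer userspace, older kernel" call. */
struct open_how_probe {
    uint64_t flags;
    uint64_t mode;
    uint64_t resolve;
    uint64_t future_field;   /* hypothetical; must be zero for old kernels */
};

static int openat2_probe(const char *path)
{
    struct open_how_probe how;

    memset(&how, 0, sizeof(how));  /* zeroed tail = forward compatible */
    how.flags = O_RDONLY;
    /* Passing the *larger* size is the version probe: success means the
     * kernel tolerates (or understands) the extended layout. */
    return (int)syscall(__NR_openat2, AT_FDCWD, path, &how, sizeof(how));
}
```

On a kernel without openat2 (pre-5.6, or a seccomp sandbox) this returns -1 with ENOSYS, which is itself a usable version signal.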

> 
> For instance, it is acceptable for glibc to register rseq for all threads,
> even in the presence of the cpu_id field inaccuracy, for use by the
> sched_getcpu(3) implementation. However, users of rseq which need to
> implement critical sections performing per-cpu data updates may want
> to know whether the cpu_id field is reliable to ensure they do not crash
> the process due to per-cpu data corruption.
> 
> This led me to consider exposing a feature-specific flag which can be
> queried by specific users to know whether rseq has a specific set of correct
> behaviors implemented.
> 
> > (Also, as a side-note. I see that you're passing struct rseq *rseq with
> > a length argument but you are not versioning by size. Is that
> > intentional? That basically somewhat locks you to the current struct
> > rseq layout and means users might run into problems when you extend
> > struct rseq in the future as they can't pass the new struct down to
> > older kernels. The way we deal with this is now - rseq might precede
> > this - is copy_struct_from_user() (for example in sched_{get,set}attr(),
> > openat2(), bpf(), clone3(), etc.). Maybe you want to 

Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-08 Thread Mathieu Desnoyers
- On Jul 8, 2020, at 12:22 PM, Christian Brauner 
christian.brau...@ubuntu.com wrote:
[...]
> I've been following this a little bit. The kernel version itself doesn't
> really mean anything and the kernel version is imho not at all
> interesting to userspace applications. Especially for cross-distro
> programs. We can't go around and ask Red Hat, SUSE, Ubuntu, Archlinux,
> openSUSE and god knows who what other distro what their fixed kernel
> version is. That's not feasible at all and not how most programs do it.
> Sure, a lot of programs name a minimal kernel version they require but
> realistically we can't keep bumping it all the time. So the best
> strategy for userspace imho has been to introduce a re-versioned flag or
> enum that indicates the fixed behavior.
> 
> So I would suggest to just introduce
> RSEQ_FLAG_REGISTER_2  = (1 << 2),
> that's how these things are usually done (Netlink etc.). So not
> introducing a fix bit or whatever but simply re-version your flag/enum.
> We already deal with this today.

Because rseq is effectively a per-thread resource shared across application
and libraries, it is not practical to merge the notion of version with the
registration. Typically __rseq_abi is registered by libc, and can be used
by the application and by many libraries. Early adopter libraries and
applications (e.g. librseq, tcmalloc) can also choose to handle registration
if it's not already done by libc.

For instance, it is acceptable for glibc to register rseq for all threads,
even in the presence of the cpu_id field inaccuracy, for use by the
sched_getcpu(3) implementation. However, users of rseq which need to
implement critical sections performing per-cpu data updates may want
to know whether the cpu_id field is reliable to ensure they do not crash
the process due to per-cpu data corruption.

This led me to consider exposing a feature-specific flag which can be
queried by specific users to know whether rseq has a specific set of correct
behaviors implemented.

> (Also, as a side-note. I see that you're passing struct rseq *rseq with
> a length argument but you are not versioning by size. Is that
> intentional? That basically somewhat locks you to the current struct
> rseq layout and means users might run into problems when you extend
> struct rseq in the future as they can't pass the new struct down to
> older kernels. The way we deal with this is now - rseq might precede
> this - is copy_struct_from_user() (for example in sched_{get,set}attr(),
> openat2(), bpf(), clone3(), etc.). Maybe you want to switch to that to
> keep rseq extensible? Users can detect the new rseq version by just
> passing a larger struct down to the kernel with the extra bytes set to 0
> and if rseq doesn't complain they know they're dealing with an rseq that
> knows larger struct sizes. Might be worth it if you have any reason to
> believe that struct rseq might need to grow.)

In the initial iterations of the rseq patch set, we had the rseq_len
argument, hoping we would eventually be able to extend struct rseq. However,
it was finally decided against making it extensible, so the rseq ABI merged
into the Linux kernel with a fixed size.

One of the key reasons for this is explained in
commit 83b0b15bcb0f ("rseq: Remove superfluous rseq_len from task_struct")

The rseq system call, when invoked with flags of "0" or
"RSEQ_FLAG_UNREGISTER" values, expects the rseq_len parameter to
be equal to sizeof(struct rseq), which is fixed-size and fixed-layout,
specified in uapi linux/rseq.h.

Expecting a fixed size for rseq_len is a design choice that ensures
multiple libraries and application defining __rseq_abi in the same
process agree on its exact size.
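The fixed-size contract in practice: every registration must pass exactly sizeof(struct rseq) together with a 32-byte aligned area. A hedged sketch of a raw registration (using the uapi header; RSEQ_SIG here is the signature value used by librseq and glibc, but any agreed-upon constant works). Note that on a libc that has already registered rseq for the thread, this call fails with EINVAL or EBUSY rather than silently disagreeing on size or address, which is exactly the point of the fixed-size design:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/rseq.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_rseq
#define __NR_rseq 334   /* x86-64; present in headers since Linux 4.18 */
#endif

/* Signature checked on abort; value used by librseq/glibc. */
#define RSEQ_SIG 0x53053053

/* The area must be TLS, 32-byte aligned, and stay valid for the
 * thread's lifetime: the kernel keeps the pointer after the call. */
static __thread struct rseq my_rseq __attribute__((aligned(32)));

static int register_rseq_fixed_size(void)
{
    /* rseq_len must be exactly sizeof(struct rseq) -- any other value
     * is rejected, which forces all users in the process to agree. */
    return (int)syscall(__NR_rseq, &my_rseq, sizeof(my_rseq), 0, RSEQ_SIG);
}
```

Returns 0 on success (older libc, nothing registered yet) or -1 with errno EINVAL/EBUSY when another area is already registered for the thread, in which case the caller piggy-backs on that one.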

The issue here is caused by the fact that the __rseq_abi variable is
shared across application/libraries for a given thread. So it's not
enough to agree between kernel and user-space on the extensibility
scheme, but we'd also have to find a way for all users within a process
to agree on the layout.

The "rseq_len" parameter could eventually become useful when combined
with additional flags, but currently its size is fixed.

There are indeed clear use-cases for extending struct rseq (or adding a new
similar TLS structure), for instance exposing the current numa node id,
or to implement a fast signal-disabling scheme. But the fact that struct rseq
is shared across application/libraries makes it tricky to "just extend" struct
rseq.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-08 Thread Florian Weimer
* Christian Brauner:

> I've been following this a little bit. The kernel version itself doesn't
> really mean anything and the kernel version is imho not at all
> interesting to userspace applications. Especially for cross-distro
> programs. We can't go around and ask Red Hat, SUSE, Ubuntu, Archlinux,
> openSUSE and god knows who what other distro what their fixed kernel
> version is.

And Red Hat Enterprise Linux only has a dozen or two kernel branches
under active maintenance, each with their own succession of version
numbers.  It's just not feasible.  Even figuring out the branch based
on the kernel version can be tricky!

> (Also, as a side-note. I see that you're passing struct rseq *rseq with
> a length argument but you are not versioning by size. Is that
> intentional? That basically somewhat locks you to the current struct
> rseq layout and means users might run into problems when you extend
> struct rseq in the future as they can't pass the new struct down to
> older kernels. The way we deal with this is now - rseq might precede
> this - is copy_struct_from_user()

The kernel retains the pointer after the system call returns.
Basically, ownership of the memory area is transferred to the kernel.
It's like set_robust_list in this regard.


Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-08 Thread Christian Brauner
On Wed, Jul 08, 2020 at 11:33:51AM -0400, Mathieu Desnoyers wrote:
> [ Context for Linus: I am dropping this RFC patch, but am curious to
>   hear your point of view on exposing to user-space which system call
>   behavior fixes are present in the kernel, either through feature
>   flags or system-call versioning. The intent is to allow user-space
>   to make better decisions on whether it should use a system call or
>   rely on fallback behavior. ]
> 
> - On Jul 7, 2020, at 3:55 PM, Florian Weimer f...@deneb.enyo.de wrote:
> 
> > * Carlos O'Donell:
> > 
> >> It's not a great fit IMO. Just let the kernel version be the arbiter of
> >> correctness.
> > 
> > For manual review, sure.  But checking it programmatically does not
> > yield good results due to backports.  Even those who use the stable
> > kernel series sometimes pick up critical fixes beforehand, so it's not
> > reliable possible for a program to say, “I do not want to run on this
> > kernel because it has a bad version”.  We had a recent episode of this
> > with the Go runtime, which tried to do exactly this.
> 
> FWIW, the kernel fix backport issue would also be a concern if we exposed
> a numeric "fix level version" with specific system calls: what should
> we do if a distribution chooses to include one fix in the sequence,
> but not others? Identifying fixes as "feature flags" allows
> cherry-picking specific fixes in a backport, but versions would not
> allow that.
> 
> That being said, maybe it's not such a bad thing to _require_ the
> entire series of fixes to be picked in backports, which would be a
> fortunate side-effect of the per-syscall-fix-version approach.
> 
> But I'm under the impression that such a scheme ends up versioning
> a system call, which I suspect will be a no-go from Linus' perspective.

I've been following this a little bit. The kernel version itself doesn't
really mean anything and the kernel version is imho not at all
interesting to userspace applications. Especially for cross-distro
programs. We can't go around and ask Red Hat, SUSE, Ubuntu, Archlinux,
openSUSE and god knows who what other distro what their fixed kernel
version is. That's not feasible at all and not how most programs do it.
Sure, a lot of programs name a minimal kernel version they require but
realistically we can't keep bumping it all the time. So the best
strategy for userspace imho has been to introduce a re-versioned flag or
enum that indicates the fixed behavior.

So I would suggest to just introduce
RSEQ_FLAG_REGISTER_2  = (1 << 2),
that's how these things are usually done (Netlink etc.). So not
introducing a fix bit or whatever but simply re-version your flag/enum.
We already deal with this today.

(Also, as a side-note. I see that you're passing struct rseq *rseq with
a length argument but you are not versioning by size. Is that
intentional? That basically somewhat locks you to the current struct
rseq layout and means users might run into problems when you extend
struct rseq in the future as they can't pass the new struct down to
older kernels. The way we deal with this is now - rseq might precede
this - is copy_struct_from_user() (for example in sched_{get,set}attr(),
openat2(), bpf(), clone3(), etc.). Maybe you want to switch to that to
keep rseq extensible? Users can detect the new rseq version by just
passing a larger struct down to the kernel with the extra bytes set to 0
and if rseq doesn't complain they know they're dealing with an rseq that
knows larger struct sizes. Might be worth it if you have any reason to
believe that struct rseq might need to grow.)

Christian


Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-08 Thread Mathieu Desnoyers
[ Context for Linus: I am dropping this RFC patch, but am curious to
  hear your point of view on exposing to user-space which system call
  behavior fixes are present in the kernel, either through feature
  flags or system-call versioning. The intent is to allow user-space
  to make better decisions on whether it should use a system call or
  rely on fallback behavior. ]

- On Jul 7, 2020, at 3:55 PM, Florian Weimer f...@deneb.enyo.de wrote:

> * Carlos O'Donell:
> 
>> It's not a great fit IMO. Just let the kernel version be the arbiter of
>> correctness.
> 
> For manual review, sure.  But checking it programmatically does not
> yield good results due to backports.  Even those who use the stable
> kernel series sometimes pick up critical fixes beforehand, so it's not
> reliably possible for a program to say, “I do not want to run on this
> kernel because it has a bad version”.  We had a recent episode of this
> with the Go runtime, which tried to do exactly this.

FWIW, the kernel fix backport issue would also be a concern if we exposed
a numeric "fix level version" with specific system calls: what should
we do if a distribution chooses to include one fix in the sequence,
but not others? Identifying fixes as "feature flags" allows
cherry-picking specific fixes in a backport, but versions would not
allow that.

That being said, maybe it's not such a bad thing to _require_ the
entire series of fixes to be picked in backports, which would be a
fortunate side-effect of the per-syscall-fix-version approach.

But I'm under the impression that such a scheme ends up versioning
a system call, which I suspect will be a no-go from Linus' perspective.

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-08 Thread Florian Weimer
* Mathieu Desnoyers:

> Allright, thanks for the insight! I'll drop these patches and focus only
> on the bugfix.

Thanks, much appreciated!


Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-07 Thread Florian Weimer
* Carlos O'Donell:

> It's not a great fit IMO. Just let the kernel version be the arbiter of
> correctness.

For manual review, sure.  But checking it programmatically does not
yield good results due to backports.  Even those who use the stable
kernel series sometimes pick up critical fixes beforehand, so it's not
reliably possible for a program to say, “I do not want to run on this
kernel because it has a bad version”.  We had a recent episode of this
with the Go runtime, which tried to do exactly this.


Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-07 Thread Mathieu Desnoyers
- On Jul 7, 2020, at 2:53 PM, Carlos O'Donell car...@redhat.com wrote:

> On 7/7/20 8:06 AM, Mathieu Desnoyers wrote:
>> - On Jul 7, 2020, at 7:32 AM, Florian Weimer f...@deneb.enyo.de wrote:
>> 
>>> * Mathieu Desnoyers:
>>>
 Those are very good points. One possibility we have would be to let
 glibc do the rseq registration without the RSEQ_FLAG_RELIABLE_CPU_ID
 flag. On kernels with the bug present, the cpu_id field is still good
 enough for typical uses of sched_getcpu() which does not appear to
 have a very strict correctness requirement on returning the right
 cpu number.

 Then libraries and applications which require a reliable cpu_id
 field could check this on their own by calling rseq with the
 RSEQ_FLAG_RELIABLE_CPU_ID flag. This would not make the state more
 complex in __rseq_abi, and let each rseq user decide about its own
 fate: whether it uses rseq or keeps using an rseq-free fallback.

 I am still tempted to allow combining RSEQ_FLAG_REGISTER |
 RSEQ_FLAG_RELIABLE_CPU_ID for applications which would not be using
 glibc, and want to check this flag on thread registration.
>>>
>>> Well, you could add a bug fix level field to the __rseq_abi variable.
>> 
>> Even though I initially planned to make the struct rseq_abi extensible,
>> the __rseq_abi variable ends up being fixed-size, so we need to be very
>> careful in choosing what we place in the remaining 12 bytes of padding.
>> I suspect we'd want to keep 8 bytes to express a pointer to an
>> "extended" structure.
>> 
>> I wonder if a bug fix level "version" is the right approach. We could
>> instead have a bitmask of fixes, which the application could independently
>> check. For instance, some applications may care about cpu_id field
>> reliability, and others not.
> 
> I agree with Florian.
> 
> Developers are not interested in a bitmask of fixes.
> 
> They want a version they can check, knowing that at a given version
> everything works as expected.
> 
> In reality today this is the kernel version.
> 
> So rseq is broken from a developer perspective until they can get a new
> kernel or an agreement from their downstream vendor that revision Z of
> the kernel they are using has the fix you've just committed.
> 
> Essentially this problem solves itself at a level higher than the interfaces
> we're talking about today.
> 
> Encoding this as a bitmask of fixes is an overengineered solution for a
> problem that the downstream communities already know how to solve.
> 
> I would strongly suggest a "version" or nothing.

That's a good point.

> 
>>> Then applications could check if the kernel has the appropriate level
>>> of non-buggyness.  But the same thing could be useful for many other
>>> kernel interfaces, and I haven't seen such a fix level value for them.
>>> What makes rseq so special?
>> 
>> I guess my only answer is because I care as a user of the system call, and
>> what is a system call without users ? I have real applications which work
>> today with end users deploying them on various kernels, old and new, and I
>> want to take advantage of this system call to speed them up. However, if I
>> have to choose between speed and correctness (in other words, not crashing
>> a critical application), I will choose correctness. So if I cannot detect
>> that I can safely use the system call, it becomes pretty much useless even
>> for my own use-cases.
> 
> Yes.
> 
> In the case of RHEL we have *tons* of users in the same predicament.
> 
> They just wait until the RHEL kernel team releases a fixed kernel version
> and check for that version and deploy with that version.
> 
> Likewise every other user of a kernel. They solve it by asking their
> kernel provider, internal or external, to verify the fix is applied to
> the deployment kernel.
> 
> If they are an ISV they have to test and deploy similar strategies for
> multiple kernel versions.
> 
> By trying to automate this you are encoding into the API some level of
> package management concepts, e.g. patch level, revision level, etc.
> 
> This is difficult to do without a more generalized API. Why do it just
> for rseq? Why do it with the few bits you have?
> 
> It's not a great fit IMO. Just let the kernel version be the arbiter of
> correctness.
> 
>>> It won't help with the present bug, but maybe we should add an rseq
>>> API sub-call that blocks future rseq registration for the thread.
>>> Then we can add a glibc tunable that flips off rseq reliably if people
>>> do not want to use it for some reason (and switch to the non-rseq
>>> fallback code instead).  But that's going to help with future bugs
>>> only.
>> 
>> I don't think it's needed. All I really need is to have _some_ way to
>> let lttng-ust or liburcu query whether the cpu_id field is reliable. This
>> state does not have to be made quickly accessible to other libraries,
>> nor does it have to be shared between libraries. It would allow each
>> user library or application to make its own mind on whether it can use
>> rseq or not.

Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-07 Thread Carlos O'Donell
On 7/7/20 8:06 AM, Mathieu Desnoyers wrote:
> - On Jul 7, 2020, at 7:32 AM, Florian Weimer f...@deneb.enyo.de wrote:
> 
>> * Mathieu Desnoyers:
>>
>>> Those are very good points. One possibility we have would be to let
>>> glibc do the rseq registration without the RSEQ_FLAG_RELIABLE_CPU_ID
>>> flag. On kernels with the bug present, the cpu_id field is still good
>>> enough for typical uses of sched_getcpu() which does not appear to
>>> have a very strict correctness requirement on returning the right
>>> cpu number.
>>>
>>> Then libraries and applications which require a reliable cpu_id
>>> field could check this on their own by calling rseq with the
>>> RSEQ_FLAG_RELIABLE_CPU_ID flag. This would not make the state more
>>> complex in __rseq_abi, and let each rseq user decide about its own
>>> fate: whether it uses rseq or keeps using an rseq-free fallback.
>>>
>>> I am still tempted to allow combining RSEQ_FLAG_REGISTER |
>>> RSEQ_FLAG_RELIABLE_CPU_ID for applications which would not be using
>>> glibc, and want to check this flag on thread registration.
>>
>> Well, you could add a bug fix level field to the __rseq_abi variable.
> 
> Even though I initially planned to make the struct rseq_abi extensible,
> the __rseq_abi variable ends up being fixed-size, so we need to be very
> careful in choosing what we place in the remaining 12 bytes of padding.
> I suspect we'd want to keep 8 bytes to express a pointer to an
> "extended" structure.
> 
> I wonder if a bug fix level "version" is the right approach. We could
> instead have a bitmask of fixes, which the application could independently
> check. For instance, some applications may care about cpu_id field
> reliability, and others not.

I agree with Florian.

Developers are not interested in a bitmask of fixes.

They want a version they can check, knowing that at a given version
everything works as expected.

In reality today this is the kernel version.

So rseq is broken from a developer perspective until they can get a new
kernel or an agreement from their downstream vendor that revision Z of
the kernel they are using has the fix you've just committed.

Essentially this problem solves itself at a level higher than the interfaces
we're talking about today.

Encoding this as a bitmask of fixes is an overengineered solution for a
problem that the downstream communities already know how to solve.

I would strongly suggest a "version" or nothing.

>> Then applications could check if the kernel has the appropriate level
>> of non-buggyness.  But the same thing could be useful for many other
>> kernel interfaces, and I haven't seen such a fix level value for them.
>> What makes rseq so special?
> 
> I guess my only answer is because I care as a user of the system call, and
> what is a system call without users ? I have real applications which work
> today with end users deploying them on various kernels, old and new, and I
> want to take advantage of this system call to speed them up. However, if I
> have to choose between speed and correctness (in other words, not crashing
> a critical application), I will choose correctness. So if I cannot detect
> that I can safely use the system call, it becomes pretty much useless even
> for my own use-cases.

Yes.

In the case of RHEL we have *tons* of users in the same predicament.

They just wait until the RHEL kernel team releases a fixed kernel version
and check for that version and deploy with that version.

Likewise every other user of a kernel. They solve it by asking their
kernel provider, internal or external, to verify the fix is applied to
the deployment kernel.

If they are an ISV they have to test and deploy similar strategies for
multiple kernel versions.

By trying to automate this you are encoding into the API some level of
package management concepts, e.g. patch level, revision level, etc.

This is difficult to do without a more generalized API. Why do it just
for rseq? Why do it with the few bits you have?

It's not a great fit IMO. Just let the kernel version be the arbiter of
correctness.
 
>> It won't help with the present bug, but maybe we should add an rseq
>> API sub-call that blocks future rseq registration for the thread.
>> Then we can add a glibc tunable that flips off rseq reliably if people
>> do not want to use it for some reason (and switch to the non-rseq
>> fallback code instead).  But that's going to help with future bugs
>> only.
> 
> I don't think it's needed. All I really need is to have _some_ way to
> let lttng-ust or liburcu query whether the cpu_id field is reliable. This
> state does not have to be made quickly accessible to other libraries,
> nor does it have to be shared between libraries. It would allow each
> user library or application to make its own mind on whether it can use
> rseq or not.
 
That check is "kernel version > x.y.z" :-)

or

"My kernel vendor told me it's fixed."

-- 
Cheers,
Carlos.



Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-07 Thread Mathieu Desnoyers
- On Jul 7, 2020, at 7:32 AM, Florian Weimer f...@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>> Those are very good points. One possibility we have would be to let
>> glibc do the rseq registration without the RSEQ_FLAG_RELIABLE_CPU_ID
>> flag. On kernels with the bug present, the cpu_id field is still good
>> enough for typical uses of sched_getcpu() which does not appear to
>> have a very strict correctness requirement on returning the right
>> cpu number.
>>
>> Then libraries and applications which require a reliable cpu_id
>> field could check this on their own by calling rseq with the
>> RSEQ_FLAG_RELIABLE_CPU_ID flag. This would not make the state more
>> complex in __rseq_abi, and let each rseq user decide about its own
>> fate: whether it uses rseq or keeps using an rseq-free fallback.
>>
>> I am still tempted to allow combining RSEQ_FLAG_REGISTER |
>> RSEQ_FLAG_RELIABLE_CPU_ID for applications which would not be using
>> glibc, and want to check this flag on thread registration.
> 
> Well, you could add a bug fix level field to the __rseq_abi variable.

Even though I initially planned to make the struct rseq_abi extensible,
the __rseq_abi variable ends up being fixed-size, so we need to be very
careful in choosing what we place in the remaining 12 bytes of padding.
I suspect we'd want to keep 8 bytes to express a pointer to an
"extended" structure.

I wonder if a bug fix level "version" is the right approach. We could
instead have a bitmask of fixes, which the application could independently
check. For instance, some applications may care about cpu_id field
reliability, and others not.

> Then applications could check if the kernel has the appropriate level
> of non-buggyness.  But the same thing could be useful for many other
> kernel interfaces, and I haven't seen such a fix level value for them.
> What makes rseq so special?

I guess my only answer is because I care as a user of the system call, and
what is a system call without users ? I have real applications which work
today with end users deploying them on various kernels, old and new, and I
want to take advantage of this system call to speed them up. However, if I
have to choose between speed and correctness (in other words, not crashing
a critical application), I will choose correctness. So if I cannot detect
that I can safely use the system call, it becomes pretty much useless even
for my own use-cases.

> It won't help with the present bug, but maybe we should add an rseq
> API sub-call that blocks future rseq registration for the thread.
> Then we can add a glibc tunable that flips off rseq reliably if people
> do not want to use it for some reason (and switch to the non-rseq
> fallback code instead).  But that's going to help with future bugs
> only.

I don't think it's needed. All I really need is to have _some_ way to
let lttng-ust or liburcu query whether the cpu_id field is reliable. This
state does not have to be made quickly accessible to other libraries,
nor does it have to be shared between libraries. It would allow each
user library or application to make its own mind on whether it can use
rseq or not.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-07 Thread Florian Weimer
* Mathieu Desnoyers:

> Those are very good points. One possibility we have would be to let
> glibc do the rseq registration without the RSEQ_FLAG_RELIABLE_CPU_ID
> flag. On kernels with the bug present, the cpu_id field is still good
> enough for typical uses of sched_getcpu() which does not appear to
> have a very strict correctness requirement on returning the right
> cpu number.
>
> Then libraries and applications which require a reliable cpu_id
> field could check this on their own by calling rseq with the
> RSEQ_FLAG_RELIABLE_CPU_ID flag. This would not make the state more
> complex in __rseq_abi, and let each rseq user decide about its own
> fate: whether it uses rseq or keeps using an rseq-free fallback.
>
> I am still tempted to allow combining RSEQ_FLAG_REGISTER |
> RSEQ_FLAG_RELIABLE_CPU_ID for applications which would not be using
> glibc, and want to check this flag on thread registration.

Well, you could add a bug fix level field to the __rseq_abi variable.
Then applications could check if the kernel has the appropriate level
of non-buggyness.  But the same thing could be useful for many other
kernel interfaces, and I haven't seen such a fix level value for them.
What makes rseq so special?

It won't help with the present bug, but maybe we should add an rseq
API sub-call that blocks future rseq registration for the thread.
Then we can add a glibc tunable that flips off rseq reliably if people
do not want to use it for some reason (and switch to the non-rseq
fallback code instead).  But that's going to help with future bugs
only.


Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-07 Thread Mathieu Desnoyers
- On Jul 7, 2020, at 3:29 AM, Florian Weimer f...@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>> commit 93b585c08d16 ("Fix: sched: unreliable rseq cpu_id for new tasks")
>> addresses an issue with the cpu_id field of newly created processes. Expose
>> a flag which can be used by user-space to query whether the kernel
>> implements this fix.
>>
>> Considering that this issue can cause corruption of user-space per-cpu
>> data updated with rseq, it is recommended that user-space detects
>> availability of this fix by using the RSEQ_FLAG_RELIABLE_CPU_ID flag
>> either combined with registration or on its own before using rseq.
> 
> Presumably, the intent is that glibc uses RSEQ_FLAG_RELIABLE_CPU_ID to
> register the rseq area.  That will surely prevent glibc itself from
> activating rseq on broken kernels.  But if another rseq library
> performs registration and has not been updated to use
> RSEQ_FLAG_RELIABLE_CPU_ID, we still end up with an active rseq area
> (and incorrect CPU IDs from sched_getcpu in glibc).  So further glibc
> changes will be needed.  I suppose we could block third-party rseq
> registration with a registration of a hidden rseq area (not
> __rseq_abi).  But then the question is if any of the third-party rseq
> users are expecting the EINVAL error code from their failed
> registration.
> 
> The rseq registration state machine is quite tricky already, and the
> need to use RSEQ_FLAG_RELIABLE_CPU_ID would make it even more
> complicated.  Even if we implemented all the changes, it's all going
> to be essentially dead, untestable code in a few months, when the
> broken kernels are out of circulation.  It does not appear to be good
> investment to me.

Those are very good points. One possibility we have would be to let
glibc do the rseq registration without the RSEQ_FLAG_RELIABLE_CPU_ID
flag. On kernels with the bug present, the cpu_id field is still good
enough for typical uses of sched_getcpu() which does not appear to
have a very strict correctness requirement on returning the right
cpu number.

Then libraries and applications which require a reliable cpu_id field
could check this on their own by calling rseq with the
RSEQ_FLAG_RELIABLE_CPU_ID flag. This would not make the state more
complex in __rseq_abi, and let each rseq user decide about its own fate:
whether it uses rseq or keeps using an rseq-free fallback.

I am still tempted to allow combining RSEQ_FLAG_REGISTER |
RSEQ_FLAG_RELIABLE_CPU_ID for applications which would not be using
glibc, and want to check this flag on thread registration.

Thoughts ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-07 Thread Florian Weimer
* Mathieu Desnoyers:

> commit 93b585c08d16 ("Fix: sched: unreliable rseq cpu_id for new tasks")
> addresses an issue with the cpu_id field of newly created processes. Expose
> a flag which can be used by user-space to query whether the kernel
> implements this fix.
>
> Considering that this issue can cause corruption of user-space per-cpu
> data updated with rseq, it is recommended that user-space detects
> availability of this fix by using the RSEQ_FLAG_RELIABLE_CPU_ID flag
> either combined with registration or on its own before using rseq.

Presumably, the intent is that glibc uses RSEQ_FLAG_RELIABLE_CPU_ID to
register the rseq area.  That will surely prevent glibc itself from
activating rseq on broken kernels.  But if another rseq library
performs registration and has not been updated to use
RSEQ_FLAG_RELIABLE_CPU_ID, we still end up with an active rseq area
(and incorrect CPU IDs from sched_getcpu in glibc).  So further glibc
changes will be needed.  I suppose we could block third-party rseq
registration with a registration of a hidden rseq area (not
__rseq_abi).  But then the question is if any of the third-party rseq
users are expecting the EINVAL error code from their failed
registration.

The rseq registration state machine is quite tricky already, and the
need to use RSEQ_FLAG_RELIABLE_CPU_ID would make it even more
complicated.  Even if we implemented all the changes, it's all going
to be essentially dead, untestable code in a few months, when the
broken kernels are out of circulation.  It does not appear to be good
investment to me.


[RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

2020-07-06 Thread Mathieu Desnoyers
commit 93b585c08d16 ("Fix: sched: unreliable rseq cpu_id for new tasks")
addresses an issue with the cpu_id field of newly created processes. Expose
a flag which can be used by user-space to query whether the kernel
implements this fix.

Considering that this issue can cause corruption of user-space per-cpu
data updated with rseq, it is recommended that user-space detects
availability of this fix by using the RSEQ_FLAG_RELIABLE_CPU_ID flag
either combined with registration or on its own before using rseq.

Signed-off-by: Mathieu Desnoyers 
Cc: Peter Zijlstra (Intel) 
Cc: Thomas Gleixner 
Cc: Florian Weimer 
Cc: "Paul E. McKenney" 
Cc: Boqun Feng 
Cc: "H . Peter Anvin" 
Cc: Paul Turner 
Cc: Dmitry Vyukov 
Cc: Neel Natu 
Cc: linux-...@vger.kernel.org
---
 include/uapi/linux/rseq.h | 5 +++++
 kernel/rseq.c             | 4 ++++
 2 files changed, 9 insertions(+)

diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 3b5fba25461a..a548b77c9520 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -21,13 +21,18 @@ enum rseq_cpu_id_state {
 /*
  * RSEQ_FLAG_UNREGISTER:   Unregister rseq ABI for caller thread.
  * RSEQ_FLAG_REGISTER: Register rseq ABI for caller thread.
+ * RSEQ_FLAG_RELIABLE_CPU_ID:  rseq provides a reliable cpu_id field.
  *
  * Flag value 0 has the same behavior as RSEQ_FLAG_REGISTER, but cannot be
  * combined with other flags. This behavior is kept for backward compatibility.
+ *
+ * The flag RSEQ_FLAG_REGISTER can be combined with the RSEQ_FLAG_RELIABLE_CPU_ID
+ * flag.
  */
 enum rseq_flags {
RSEQ_FLAG_UNREGISTER= (1 << 0),
RSEQ_FLAG_REGISTER  = (1 << 1),
+   RSEQ_FLAG_RELIABLE_CPU_ID   = (1 << 2),
 };
 
 enum rseq_cs_flags_bit {
diff --git a/kernel/rseq.c b/kernel/rseq.c
index 47ce221cd6f9..32cc2e0d961f 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -333,6 +333,8 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
current->rseq_sig = 0;
break;
case RSEQ_FLAG_REGISTER:
+   fallthrough;
+   case RSEQ_FLAG_REGISTER | RSEQ_FLAG_RELIABLE_CPU_ID:
if (current->rseq) {
/*
 * If rseq is already registered, check whether
@@ -365,6 +367,8 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 */
rseq_set_notify_resume(current);
break;
+   case RSEQ_FLAG_RELIABLE_CPU_ID:
+   break;
default:
return -EINVAL;
}
-- 
2.17.1