* Jann Horn:
> On Wed, Apr 14, 2021 at 12:27 PM Florian Weimer wrote:
>>
>> * Andrei Vagin:
>>
>> > We already have process_vm_readv and process_vm_writev to read and write
>> > to a process memory faster than we can do this with ptrace. And now it
* Borislav Petkov:
> On Mon, Apr 12, 2021 at 10:30:23PM +0000, Bae, Chang Seok wrote:
>> On Mar 26, 2021, at 03:30, Borislav Petkov wrote:
>> > On Thu, Mar 25, 2021 at 09:56:53PM -0700, Andy Lutomirski wrote:
>> >> We really ought to have a SIGSIGFAIL signal that's sent, double-fault
>> >>
* Andrei Vagin:
> We already have process_vm_readv and process_vm_writev to read and write
> to a process memory faster than we can do this with ptrace. And now it
> is time for process_vm_exec that allows executing code in an address
> space of another process. We can do this with ptrace but it
* Borislav Petkov:
> On Mon, Apr 12, 2021 at 04:19:29PM +0200, Florian Weimer wrote:
>> Maybe we could have done this in 2016 when I reported this for the first
>> time. Now it is too late, as more and more software is using
>> CPUID-based detection for AVX-512.
>
> S
* Ard Biesheuvel:
> Wouldn't that require the compiler to interpret the contents of the
> asm() block?
Yes and no. It would require proper toolchain support, so in this case
a new ELF relocation type, with compiler, assembler, and linker support
to generate those relocations and process them.
* Len Brown via Libc-alpha:
>> In particular, the library may use instructions that main() doesn't know
>> exist.
>
> And so I'll ask my question another way.
>
> How is it okay to change the value of XCR0 during the run time of a
> program?
>
> I submit that it is not, and that is a deal-killer
* Andy Lutomirski:
> On Fri, Mar 26, 2021 at 1:35 PM Florian Weimer wrote:
>>
>> * Andy Lutomirski:
>>
>> > On Fri, Mar 26, 2021 at 12:34 PM Florian Weimer wrote:
>> >> x86: Sporadic failures in tst-cpu-features-cpuinfo
>> >> <
* Andy Lutomirski:
> On Fri, Mar 26, 2021 at 12:34 PM Florian Weimer wrote:
>> x86: Sporadic failures in tst-cpu-features-cpuinfo
>> <https://sourceware.org/bugzilla/show_bug.cgi?id=27398#c3>
>
> It's worth noting that recent microcode updates have made RTM
* Andy Lutomirski:
>> > AVX-512 cleared, and programs need to explicitly request enablement.
>> > This would allow programs to opt into not saving/restoring across
>> > signals or to save/restore in buffers supplied when the feature is
>> > enabled.
>>
>> Isn't XSAVEOPT already able to handle
* Andy Lutomirski via Libc-alpha:
> glibc appears to use AVX512F for memcpy by default. (Unless
> Prefer_ERMS is default-on, but I genuinely can't tell if this is the
> case. I did some searching.) The commit adding it refers to a 2016
> email saying that it's 30% on KNL.
As far as I know, glibc only
* Bae, Chang Seok via Libc-alpha:
> On Mar 25, 2021, at 09:20, Borislav Petkov wrote:
>>
>> $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=3453 -o tst-minsigstksz-2
>> $ ./tst-minsigstksz-2
>> tst-minsigstksz-2: changed byte 50 bytes below configured stack
>>
>> Whoops.
>>
>> And the debug print
* Alexey Dobriyan:
> Some aren't -- PF_FORKNOEXEC. However it is silly for userspace to query it
> because a program knows if it forked but didn't exec without external help.
Libraries typically lack that knowledge, and may have reasons to
detect forks. But there are probably better ways than
* Mathieu Desnoyers:
> This way, the configuration structure can be expanded in the future. The
> rseq ABI structure is by definition fixed-size, so there is no point in
> having its size here.
>
> Florian, did I understand your request correctly, or am I missing your
> point ?
No, the idea was
* Piotr Figiel:
> diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> index 83ee45fa634b..d54cf6b6ce7c 100644
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -102,6 +102,14 @@ struct ptrace_syscall_info {
> };
> };
>
> +#define
* Greg Kroah-Hartman:
> I'm announcing the release of the 4.9.256 kernel.
>
> This, and the 4.4.256 release are a little bit "different" than normal.
>
> This contains only 1 patch, just the version bump from .255 to .256
> which ends up causing the userspace-visible LINUX_VERSION_CODE to
>
* Greg Kroah-Hartman:
> I'm announcing the release of the 4.9.256 kernel.
>
> This, and the 4.4.256 release are a little bit "different" than normal.
>
> This contains only 1 patch, just the version bump from .255 to .256 which ends
> up causing the userspace-visible LINUX_VERSION_CODE to behave
* Lukas Wunner:
> On Fri, Jan 08, 2021 at 12:02:53PM -0800, Linus Torvalds wrote:
>> I appreciate Arnd pointing out "--std=gnu11", though. What are the
>> actual relevant language improvements?
>>
>> Variable declarations in for-loops is the only one I can think of. I
>> think that would clean
* Suren Baghdasaryan:
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 6a660858784b..c2d600386902 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const
> struct iovec __user *, vec,
> goto release_task;
>
* Theodore Ts'o:
> On Thu, Jan 07, 2021 at 01:37:47PM +0000, Russell King - ARM Linux admin
> wrote:
>> > The gcc bugzilla mentions backports into gcc-linaro, but I do not see
>> > them in my git history.
>>
>> So, do we raise the minimum gcc version for the kernel as a whole to 5.1
>> or just
* Andy Lutomirski:
> If you want a 4GB allocation to succeed, you can only divide the
> address space into 32k fragments. Or, a little more precisely, if you
> want a randomly selected 4GB region to be empty, any other allocation
> has a 1/32k chance of being in the way. (Rough numbers — I’m
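The 32k figure can be reproduced with a quick calculation. This is only a sketch of the arithmetic, assuming a 47-bit user address space (x86-64 with 4-level paging), which the quoted message does not state; `max_fragments` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Number of equally sized fragments an address space of 2^addr_bits bytes
   can be divided into while each fragment still has room for one
   allocation of 2^alloc_bits bytes. (Hypothetical helper for illustration.) */
uint64_t max_fragments(unsigned addr_bits, unsigned alloc_bits)
{
    assert(alloc_bits <= addr_bits && addr_bits < 64);
    return UINT64_C(1) << (addr_bits - alloc_bits);
}
```

With 47 address bits and a 4GB (2^32 byte) allocation, `max_fragments(47, 32)` is 2^15 = 32768, which matches the 32k in the quoted message.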
* Topi Miettinen:
> +3 Additionally enable full randomization of memory mappings created
> +with mmap(NULL, ...). With 2, the base of the VMA used for such
> +mappings is random, but the mappings are created in predictable
> +places within the VMA and in sequential order. With 3,
* Jann Horn:
> But if you can't tell whether the more modern syscall failed because
> of a seccomp filter, you may be forced to retry with an older syscall
> even on systems where the new syscall works fine, and such a fallback
> may reduce security or reliability if you're trying to use some
* Mark Wielaard:
> For valgrind the issue is statx which we try to use before falling back
> to stat64, fstatat or stat (depending on architecture, not all define
> all of these). The problem with these fallbacks is that under some
> containers (libseccomp versions) they might return EPERM
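The fallback pattern under discussion looks roughly like the following. This is a minimal sketch, assuming glibc 2.28 or later for the statx() wrapper; `file_size` is a hypothetical helper and not valgrind's code. It treats EPERM like ENOSYS, which is exactly the ambiguity the thread is about: a seccomp filter that answers unknown system calls with EPERM cannot be told apart from a genuine permission error.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

/* Return the size of the file at path, or -1 on error.
   (Hypothetical helper for illustration.) */
long long file_size(const char *path)
{
    struct statx stx;

    /* Preferred path: the newer system call. */
    if (statx(AT_FDCWD, path, 0, STATX_SIZE, &stx) == 0)
        return (long long) stx.stx_size;

    /* ENOSYS means the kernel lacks statx. EPERM may come from a
       container seccomp policy rather than from the file itself, so it
       is treated as "possibly unsupported" here. Any other error is
       reported to the caller without retrying. */
    if (errno != ENOSYS && errno != EPERM)
        return -1;

    /* Fallback path: the older interface. */
    struct stat st;
    if (fstatat(AT_FDCWD, path, &st, 0) != 0)
        return -1;
    return (long long) st.st_size;
}
```

For a plain size query the retry is harmless. The concern raised in the thread applies when the older call is less safe than the newer one, because then a spurious EPERM silently downgrades the caller.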
* Jann Horn:
> +seccomp maintainers/reviewers
> [thread context is at
> https://lore.kernel.org/linux-api/87lfer2c0b@oldenburg2.str.redhat.com/
> ]
>
> On Tue, Nov 24, 2020 at 5:49 PM Christoph Hellwig wrote:
>> On Tue, Nov 24, 2020 at 03:08:05PM +0100, Mark Wielaard wrote:
>> > For valgrind
* Christoph Hellwig:
> On Tue, Nov 24, 2020 at 03:08:09PM +0100, Florian Weimer wrote:
>> Do you categorically reject the general advice, or specific instances as
>> well?
>
> All of the above. Really, if people decided to use seccomp to return
> nonsensical error
* Christoph Hellwig:
> On Tue, Nov 24, 2020 at 01:08:20PM +0100, Florian Weimer wrote:
>> This documents a way to safely use new security-related system calls
>> while preserving compatibility with container runtimes that require
>> insecure emulation (because they f
* Aleksa Sarai:
> As I mentioned in the runc thread[1], this is really down to Docker's
> default policy configuration. The EPERM-everything behaviour in OCI was
> inherited from Docker, and it boils down to not having an additional
> seccomp rule which does ENOSYS for unknown syscall numbers
* Christian Brauner:
> I'm sorry but I have some doubts about this new "rule". The idea of
> being able to reliably trigger an error for a system call other than
> EPERM might have merit in some scenarios but justifying it via a bug in
> a userspace standard is not enough in my opinion.
>
> The
, for existing system calls such as faccessat2,
without kernel or container runtime changes.
Signed-off-by: Florian Weimer
---
Documentation/process/adding-syscalls.rst | 37 +++
1 file changed, 37 insertions(+)
diff --git a/Documentation/process/adding-syscalls.rst
b
* Alejandro Colomar:
> The Linux kernel uses 'unsigned int' instead of 'int' for 'fd' and
> 'whence'. As glibc provides no wrapper, use the same types the
> kernel uses.
lseek is a POSIX interface, and glibc provides it. POSIX uses int for
file descriptors (and the whence parameter in case of
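For reference, the POSIX prototype is `off_t lseek(int fd, off_t offset, int whence)`. The following is a small sketch of a caller that uses those types; `fd_size` is a hypothetical helper and does not come from the patch under review.

```c
#include <sys/types.h>
#include <unistd.h>

/* Return the size of the file behind fd, restoring the file offset
   afterwards. Returns (off_t) -1 on error, for example when fd is
   invalid or not seekable. (Hypothetical helper for illustration.) */
off_t fd_size(int fd)
{
    off_t saved = lseek(fd, 0, SEEK_CUR);   /* remember current offset */
    if (saved == (off_t) -1)
        return (off_t) -1;

    off_t end = lseek(fd, 0, SEEK_END);     /* offset of end of file */
    if (end == (off_t) -1)
        return (off_t) -1;

    if (lseek(fd, saved, SEEK_SET) == (off_t) -1)
        return (off_t) -1;
    return end;
}
```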
* Segher Boessenkool:
> On Wed, Nov 18, 2020 at 12:17:30PM -0500, Steven Rostedt wrote:
>> I could change the stub from (void) to () if that would be better.
>
> Don't? In a function definition they mean exactly the same thing (and
> the kernel uses (void) everywhere else, which many people find
* Gabriel Krisman Bertazi:
> The main use case is to intercept Windows system calls of an application
> running over Wine. While Wine is using an unmodified glibc to execute
> its own native Linux syscalls, the Windows libraries might be directly
> issuing syscalls that we need to capture. So
* Peter Zijlstra:
>> The default Linux calling conventions are all of the cdecl family,
>> where the caller pops the argument off the stack. You didn't quote
>> enough context to tell whether other calling conventions matter in
>> your case.
>
> This is strictly in-kernel, and I think we're
* Peter Zijlstra:
> I think that as long as the function is completely empty (it never
> touches any of the arguments) this should work in practise.
>
> That is:
>
> void tp_nop_func(void) { }
>
> can be used as an argument to any function pointer that has a void
> return. In fact, I already do
* Gabriel Krisman Bertazi:
> This is the v7 of syscall user dispatch. This version is a bit
> different from v6 on the following points, after the modifications
> requested on that submission.
Is this supposed to work with existing (Linux) libcs, or do you bring
your own low-level run-time
* Gabriel Krisman Bertazi:
> +Interface
> +-
> +
> +A process can set up this mechanism on supported kernels
> +(CONFIG_SYSCALL_USER_DISPATCH) by executing the following prctl:
> +
> + prctl(PR_SET_SYSCALL_USER_DISPATCH, , , , [selector])
> +
> + is either PR_SYS_DISPATCH_ON or
* Catalin Marinas:
> Can the dynamic loader mmap() the main exe again while munmap'ing the
> original one? (sorry if it was already discussed)
No, we don't have a descriptor for that. /proc may not be mounted, and
using the path stored there has a race condition anyway.
Thanks,
Florian
* Will Deacon:
> Is there real value in this seccomp filter if it only looks at mprotect(),
> or was it just implemented because it's easy to do and sounds like a good
> idea?
It seems bogus to me. Everyone will just create alias mappings instead,
just like they did for the similar SELinux
* Szabolcs Nagy:
> Program headers are processed in two passes: after the first pass
> load segments are mmapped so in the second pass target specific
> note processing logic can access the notes.
>
> The second pass is moved later so various link_map fields are
> set up that may be useful for note
* Szabolcs Nagy:
> Re-mmap executable segments if possible instead of using mprotect
> to add PROT_BTI. This allows using BTI protection with security
> policies that prevent mprotect with PROT_EXEC.
>
> If the fd of the ELF module is not available because it was kernel
> mapped then mprotect is
* Mathieu Desnoyers:
> - On Sep 29, 2020, at 4:13 AM, Florian Weimer fwei...@redhat.com wrote:
>
>> * Mathieu Desnoyers:
>>
>>>> So we have a bootstrap issue here that needs to be solved, I think.
>>>
>>> The one thing I'm not sure about is w
* Alejandro Colomar via Libc-alpha:
> [[ CC += linux-man, linux-kernel, libc-alpha, mtk ]]
>
> On 2020-10-28 20:26, Alejandro Colomar wrote:
>> The manual page for getdents64() says the prototype should be the
>> following:
>> int getdents64(unsigned int fd, struct linux_dirent64 *dirp,
* Dave Martin via Libc-alpha:
> On Mon, Oct 26, 2020 at 05:45:42PM +0100, Florian Weimer via Libc-alpha wrote:
>> * Dave Martin via Libc-alpha:
>>
>> > Would it now help to add something like:
>> >
>> > int mchangeprot(void *addr, size_t len, int old_fl
* Dave Martin via Libc-alpha:
> Would it now help to add something like:
>
> int mchangeprot(void *addr, size_t len, int old_flags, int new_flags)
> {
> int ret = -EINVAL;
> mmap_write_lock(current->mm);
> if (all vmas in [addr .. addr + len) have
> their
* Topi Miettinen:
> Allowing mprotect(PROT_EXEC|PROT_BTI) would mean that all you need to
> circumvent MDWX is to add PROT_BTI flag. I'd suggest getting the flags
> right at mmap() time or failing that, reverting the PROT_BTI for
> legacy programs later.
>
> Could the kernel tell the loader of
* Topi Miettinen:
>> The dynamic loader has to process the LOAD segments to get to the ELF
>> note that says to enable BTI. Maybe we could do a first pass and
>> load only the segments that cover notes. But that requires lots of
>> changes to generic code in the loader.
>
> What if the loader
* Lennart Poettering:
> On Mi, 21.10.20 22:44, Jeremy Linton (jeremy.lin...@arm.com) wrote:
>
>> Hi,
>>
>> There is a problem with glibc+systemd on BTI enabled systems. Systemd
>> has a service flag "MemoryDenyWriteExecute" which uses seccomp to deny
>> PROT_EXEC changes. Glibc enables BTI only
* Mark Wielaard:
> On Sun, Oct 11, 2020 at 02:15:18PM +0200, Florian Weimer wrote:
>> * Mark Wielaard:
>>
>> > Yes, that would work. I don't know what the lowest supported GCC
>> > version is, but technically it was definitely fixed in 4.10.0, 4.8.4
>>
* Mark Wielaard:
> Yes, that would work. I don't know what the lowest supported GCC
> version is, but technically it was definitely fixed in 4.10.0, 4.8.4
> and 4.9.2. And various distros would probably have backported the
> fix. But checking for 5.0+ would certainly give you a good version.
>
>
* Peter Zijlstra:
> On Tue, Oct 06, 2020 at 11:20:01PM +0200, Florian Weimer wrote:
>> * Peter Zijlstra:
>>
>> > Our Documentation/memory-barriers.txt has a Control Dependencies section
>> > (which I shall not replicate here for brevity) which lists a numbe
* Peter Zijlstra:
> Our Documentation/memory-barriers.txt has a Control Dependencies section
> (which I shall not replicate here for brevity) which lists a number of
> caveats. But in general the work-around we use is:
>
> x = READ_ONCE(*foo);
> if (x > 42)
>
* Dave Martin via Libc-alpha:
> On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote:
>> On 10/6/20 8:25 AM, Dave Martin wrote:
>> > Or are people reporting real stack overruns on x86 today?
>>
>> We have real overruns. We have ~2800 bytes of XSAVE (register) state
>> mostly from
* Mathieu Desnoyers:
>> So we have a bootstrap issue here that needs to be solved, I think.
>
> The one thing I'm not sure about is whether the vDSO interface is indeed
> superior to KTLS, or if it is just the model we are used to.
>
> AFAIU, the current use-cases for vDSO is that an application
* Mathieu Desnoyers:
> Upstreaming efforts aiming to integrate rseq support into glibc led to
> interesting discussions, where we identified a clear need to extend the
> size of the per-thread structure shared between kernel and user-space
> (struct rseq). This is something that is not possible
* Madhavan T. Venkataraman:
> Otherwise, using an ABI quirk or a calling convention side effect to
> load the PC into a GPR is, IMO, non-standard or non-compliant or
> non-approved or whatever you want to call it. I would be
> conservative and not use it. Who knows what incompatibility there
>
* Solar Designer:
> While I share my opinion here, I don't mean that to block Madhavan's
> work. I'd rather defer to people more knowledgeable in current userland
> and ABI issues/limitations and plans on dealing with those, especially
> to Florian Weimer. I haven't seen Florian
* Jonathan Wakely:
> I don't see much point in using std::size here. If you're going to
> provide the alternative implementation for when std::size isn't
> defined, why not just use it always?
>
> template
> #if __cplusplus >= 201103L
> constexpr
> #endif
> inline std::size_t
>
* Madhavan T. Venkataraman:
> On 9/17/20 10:36 AM, Madhavan T. Venkataraman wrote:
libffi
==
I have implemented my solution for libffi and provided the changes for
X86 and ARM, 32-bit and 64-bit. Here is the reference patch:
* madvenka:
> Examples of trampolines
> ===
>
> libffi (A Portable Foreign Function Interface Library):
>
> libffi allows a user to define functions with an arbitrary list of
> arguments and return value through a feature called "Closures".
> Closures use trampolines to jump
* Minchan Kim:
> On Tue, Sep 01, 2020 at 08:46:02PM +0200, Florian Weimer wrote:
>> * Minchan Kim:
>>
>> > ssize_t process_madvise(int pidfd, const struct iovec *iovec,
>> > unsigned long vlen, int advice, unsigned int flags);
>>
>
* Minchan Kim:
> ssize_t process_madvise(int pidfd, const struct iovec *iovec,
> unsigned long vlen, int advice, unsigned int flags);
size_t for vlen provides a clearer hint regarding the type of special
treatment needed for ILP32 here (zero extension, not changing the type
* Yu-cheng Yu:
> On 9/1/2020 10:50 AM, Florian Weimer wrote:
>> * Yu-cheng Yu:
>>
>>> Like other arch_prctl()'s, this parameter was 'unsigned long'
>>> earlier. The idea was, since this arch_prctl is only implemented for
>>> the 64-bit kernel, we wanted
* Yu-cheng Yu:
> Like other arch_prctl()'s, this parameter was 'unsigned long'
> earlier. The idea was, since this arch_prctl is only implemented for
> the 64-bit kernel, we wanted it to look as 64-bit only. I will change
> it back to 'unsigned long'.
What about x32? In general, long is rather
* H. J. Lu:
> Can you think of ANY issues of passing more arguments to arch_prctl?
On x32, the glibc arch_prctl system call wrapper only passes two
arguments to the kernel, and applications have no way of detecting that.
musl only passes two arguments on all architectures. It happens to work
* H. J. Lu:
> On Thu, Aug 27, 2020 at 6:19 AM Florian Weimer wrote:
>>
>> * Dave Martin:
>>
>> > You're right that this has implications: for i386, libc probably pulls
>> > more arguments off the stack than are really there in some situation
* Dave Martin:
> You're right that this has implications: for i386, libc probably pulls
> more arguments off the stack than are really there in some situations.
> This isn't a new problem though. There are already generic prctls with
> fewer than 4 args that are used on x86.
As originally
* Andy Lutomirski:
>> I _believe_ there are also things like AES-NI that can get strong
>> protection from stuff like this. They load encryption keys into (AVX)
>> registers and then can do encrypt/decrypt operations without the keys
>> leaving the registers. If the key was loaded from a secret
* Dave Martin:
> On Tue, Aug 25, 2020 at 04:34:27PM -0700, Yu, Yu-cheng wrote:
>> On 8/25/2020 4:20 PM, Dave Hansen wrote:
>> >On 8/25/20 2:04 PM, Yu, Yu-cheng wrote:
>> I think this is more arch-specific. Even if it becomes a new syscall,
>> we still need to pass the same parameters.
>>
* Andy Lutomirski:
> On Mon, Aug 24, 2020 at 5:30 PM Yu-cheng Yu wrote:
>>
>> From: "H.J. Lu"
>>
>> Emulation of the legacy vsyscall page is required by some programs built
>> before 2013. Newer programs after 2013 don't use it. Disallow vsyscall
>> emulation when Control-flow Enforcement
* Madhavan T. Venkataraman:
> Standardization
> -
>
> Trampfd is a framework that can be used to implement multiple
> things. Maybe a few of those things can also be implemented in
> user land itself. But I think having just one mechanism to execute
> dynamic code objects is
* Andy Lutomirski:
> This is quite clever, but now I’m wondering just how much kernel help
> is really needed. In your series, the trampoline is a non-executable
> page. I can think of at least two alternative approaches, and I'd
> like to know the pros and cons.
>
> 1. Entirely userspace: a
* Al Viro:
> On Thu, Jul 23, 2020 at 07:12:24PM +0200, Mickaël Salaün wrote:
>> When the O_MAYEXEC flag is passed, openat2(2) may be subject to
>> additional restrictions depending on a security policy managed by the
>> kernel through a sysctl or implemented by an LSM thanks to the
>>
* Kevin Buettner:
> This commit fixes a regression encountered while running the
> gdb.base/corefile.exp test in GDB's test suite.
>
> In my testing, the typo prevented the sw_reserved field of struct
> fxregs_state from being output to the kernel XSAVES area. Thus the
> correct mask
* Carlos O'Donell:
> On 7/13/20 11:03 PM, Mathieu Desnoyers wrote:
>> Recent discussion led to a solution for extending struct rseq. This is
>> an implementation of the proposed solution.
>>
>> Now is a good time to agree on this scheme before the release of glibc
>> 2.32, just in case there are
* Mathieu Desnoyers:
> - On Jul 15, 2020, at 9:42 AM, Florian Weimer fwei...@redhat.com wrote:
>> * Mathieu Desnoyers:
>>
> [...]
>>> How would this allow early-rseq-adopter libraries to interact with
>>> glibc ?
>>
>> Under all exte
* Mathieu Desnoyers:
> So indeed it could be done today without upgrading the toolchains by
> writing custom assembler for each architecture to get the thread's
> struct rseq. AFAIU the ABI to access the thread pointer is fixed for
> each architecture, right ?
Yes, determining the thread pointer
* Mathieu Desnoyers:
> Practically speaking, I suspect this would mean postponing availability of
> rseq for widely deployed applications for a few more years ?
There is no rseq support in GCC today, so you have to write assembler
code anyway.
Thanks,
Florian
* Chris Kennelly:
> When glibc provides registration, is the anticipated use case that a
> library would unregister and reregister each thread to "upgrade" it to
> the most modern version of interface it knows about provided by the
> kernel?
Absolutely not, that is likely to break other
* Mathieu Desnoyers:
>> How are extensions going to affect the definition of struct rseq,
>> including its alignment?
>
> The alignment will never decrease. If the structure becomes large enough
> its alignment could theoretically increase. Would that be an issue ?
Telling the compiler that
* Mathieu Desnoyers:
> + /*
> + * Very last field of the structure, to calculate size excluding padding
> + * with offsetof().
> + */
> + char end[];
> } __attribute__((aligned(4 * sizeof(__u64))));
This makes the header incompatible with standard C++.
How are extensions
* Christian Brauner:
> I've been following this a little bit. The kernel version itself doesn't
> really mean anything and the kernel version is imho not at all
> interesting to userspace applications. Especially for cross-distro
> programs. We can't go around and ask Red Hat, SUSE, Ubuntu,
* Mathieu Desnoyers:
> Allright, thanks for the insight! I'll drop these patches and focus only
> on the bugfix.
Thanks, much appreciated!
* Carlos O'Donell:
> It's not a great fit IMO. Just let the kernel version be the arbiter of
> correctness.
For manual review, sure. But checking it programmatically does not
yield good results due to backports. Even those who use the stable
kernel series sometimes pick up critical fixes
I would like to point out that the subject is misleading: This is not
an ABI change. It fixes the contents of the __rseq_abi TLS variable
(as glibc calls it), but that's it.
(Sorry, I should have mentioned this earlier.)
* Mathieu Desnoyers:
> Those are very good points. One possibility we have would be to let
> glibc do the rseq registration without the RSEQ_FLAG_RELIABLE_CPU_ID
> flag. On kernels with the bug present, the cpu_id field is still good
> enough for typical uses of sched_getcpu() which does not
> error: Unexpected CPU 2, expected 0
> error: Unexpected CPU 2, expected 0
> error: Unexpected CPU 138, expected 0
> error: Unexpected CPU 138, expected 0
> error: Unexpected CPU 138, expected 0
> error: Unexpected CPU 138, expected 0
As far as I can tell, the glibc reproducer no longer shows the issue
with this patch applied.
Tested-By: Florian Weimer
* Mathieu Desnoyers:
> commit 93b585c08d16 ("Fix: sched: unreliable rseq cpu_id for new tasks")
> addresses an issue with cpu_id field of newly created processes. Expose
> a flag which can be used by user-space to query whether the kernel
> implements this fix.
>
> Considering that this issue can
* Mathieu Desnoyers:
> This is an RFC aiming for quick inclusion into the Linux kernel, unless
> we prefer reverting the entire rseq glibc integration and try again in 6
> months. Their upcoming release is on August 3rd, so we need to take a
> decision on this matter quickly.
Just to clarify
* Mathieu Desnoyers:
> - On Jul 6, 2020, at 1:50 PM, Florian Weimer fwei...@redhat.com wrote:
>
>> * Mathieu Desnoyers:
>>
>>> Now we need to discuss how we introduce that fix in a way that will
>>> allow user-space to trust the __rseq_abi.cpu_id field's
* Mathieu Desnoyers:
> Now we need to discuss how we introduce that fix in a way that will
> allow user-space to trust the __rseq_abi.cpu_id field's content.
I don't think that's necessary. We can mention it in the glibc
distribution notes on the wiki.
> The usual approach to kernel bug fixing
* Mathieu Desnoyers:
> When available, use the cpu_id field from __rseq_abi on Linux to
> implement sched_getcpu(). Fall-back on the vgetcpu vDSO if
> unavailable.
I've pushed this to glibc master, but unfortunately it looks like this
exposes a kernel bug related to affinity mask changes.
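A check for this kind of problem can be sketched as follows: pin the calling thread to one CPU and see whether sched_getcpu() agrees. This is not the glibc reproducer mentioned in the thread, only a minimal illustration; `check_pinned_cpu` is a hypothetical helper, and it assumes a system with at most CPU_SETSIZE CPUs.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling thread to the first CPU in its affinity mask and
   compare that CPU with what sched_getcpu() reports.
   Returns 1 if they agree, 0 on a mismatch, -1 if a call failed.
   (Hypothetical helper for illustration.) */
int check_pinned_cpu(void)
{
    cpu_set_t allowed;
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    /* Pick the lowest-numbered CPU the thread may run on. */
    int target = -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            target = cpu;
            break;
        }
    }
    if (target < 0)
        return -1;

    cpu_set_t one;
    CPU_ZERO(&one);
    CPU_SET(target, &one);
    if (sched_setaffinity(0, sizeof(one), &one) != 0)
        return -1;

    int current = sched_getcpu();

    /* Restore the original mask before reporting the result. */
    sched_setaffinity(0, sizeof(allowed), &allowed);

    if (current < 0)
        return -1;
    return current == target;
}
```

A return value of 0 corresponds to the "Unexpected CPU" class of failure: the thread is restricted to one CPU, yet the reported CPU number is a different one.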
* Mathieu Desnoyers via Libc-alpha:
> Register rseq TLS for each thread (including main), and unregister for
> each thread (excluding main). "rseq" stands for Restartable Sequences.
>
> See the rseq(2) man page proposed here:
> https://lkml.org/lkml/2018/9/19/647
>
> Those are based on glibc
* Mathieu Desnoyers:
>> I think we should keep things simple on the glibc side for now and make
>> these changes to the kernel headers first.
>
> Just to be sure I understand what you mean by "keep things simple", do you
> recommend removing the following lines completely for now from sys/rseq.h ?
>
* Mathieu Desnoyers:
>> I'm still worried that __rseq_static_assert and __rseq_alignof will show
>> up in the UAPI with textually different definitions. (This does not
>> apply to __rseq_tls_model_ie.)
>
> What makes this worry not apply to __rseq_tls_model_ie ?
It's not needed by the kernel
* Mathieu Desnoyers:
> diff --git a/manual/threads.texi b/manual/threads.texi
> index bb7a42c655..d5069d5581 100644
> --- a/manual/threads.texi
> +++ b/manual/threads.texi
> +@deftypevar {struct rseq} __rseq_abi
> +@standards{Linux, sys/rseq.h}
> +@Theglibc{} implements a @code{__rseq_abi} TLS
* Palmer Dabbelt:
> This patch set adds fchmodat4(), a new syscall. The actual
> implementation is super simple: essentially it's just the same as
> fchmodat(), but LOOKUP_FOLLOW is conditionally set based on the flags.
> I've attempted to make this match "man 2 fchmodat" as closely as
>
* Mathieu Desnoyers:
> - On Jun 3, 2020, at 8:05 AM, Florian Weimer fwei...@redhat.com wrote:
>
>> * Mathieu Desnoyers:
>>
>>> +#ifdef __cplusplus
>>> +# if __cplusplus >= 201103L
>>> +# define __rseq_static_assert(expr, diagnostic) sta
* Mathieu Desnoyers:
> +#ifdef __cplusplus
> +# if __cplusplus >= 201103L
> +# define __rseq_static_assert(expr, diagnostic) static_assert (expr,
> diagnostic)
> +# define __rseq_alignof(type) alignof (type)
> +# define __rseq_alignas(x) alignas (x)
>
* Christian Brauner:
> The performance is striking. For good measure, comparing the following
> simple close_all_fds() userspace implementation that is essentially just
> glibc's version in [6]:
>
> static int close_all_fds(void)
> {
> int dir_fd;
> DIR *dir;
> struct
* Mathieu Desnoyers:
>> Like the attribute, it needs to come right after the struct keyword, I
>> think. (Trailing attributes can be ambiguous, but not in this case.)
>
> Nope. _Alignas really _is_ special :-(
>
> struct _Alignas (16) blah {
> int a;
> };
>
> p.c:1:8: error: expected ‘{’
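In C11, _Alignas is an alignment specifier for object and member declarations, not something that can follow the struct keyword, so one way to get the same effect is to put it on the first member. A minimal sketch; `blah_alignment` is a hypothetical helper added only to make the result observable.

```c
#include <stddef.h>

/* The alignment specifier goes on a member declaration. The struct as a
   whole then takes on at least the strictest alignment of its members. */
struct blah {
    _Alignas(16) int a;
};

/* Report the resulting alignment of the struct type.
   (Hypothetical helper for illustration.) */
size_t blah_alignment(void)
{
    return _Alignof(struct blah);
}
```

This form is accepted by a C11 compiler, unlike the `struct _Alignas (16) blah` spelling quoted above.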