Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Wed, Oct 7, 2020 at 8:46 AM Dave Martin wrote: > > On Wed, Oct 07, 2020 at 06:30:03AM -0700, H.J. Lu wrote: > > On Wed, Oct 7, 2020 at 3:47 AM Dave Martin wrote: > > > > > > On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote: > > [...] > > > > > I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ. > > > > _SC_MINSIGSTKSZ is the minimum signal stack size from AT_MINSIGSTKSZ, > > > > which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the > > > > value > > > > of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application. > > > > > > Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead? > > > > Done. > > OK. I was prepared to have to argue my case a bit more, but if you > think this is OK then I will stop arguing ;) > > > > > If the arch has more or bigger registers to save in the signal frame, > > > the chances are that they're going to get saved in some userspace stack > > > frames too. So I suspect that the user signal handler stack usage may > > > scale up to some extent rather than being a constant. > > > > > > > > > To help people migrate without unpleasant surprises, I also figured it > > > would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >= > > > legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ. > > > This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a > > > drop-in replacement for MINSIGSTKSZ, etc. > > > > > > (To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64. > > > My idea was that sysconf () should hide this surprise, but people who > > > really want to know the kernel value would tolerate some > > > nonportability and read AT_MINSIGSTKSZ directly.) > > > > > > > > > So then: > > > > > > kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ); > > > minsigstksz = LEGACY_MINSIGSTKSZ; > > > if (kernel_minsigstksz > minsigstksz) > > > minsistksz = kernel_minsigstksz; > > > > > > sigstksz = LEGACY_SIGSTKSZ; > > > if (minsigstksz * 4 > sigstksz) > > > sigstksz = minsigstksz * 4; > > > > I updated users/hjl/AT_MINSIGSTKSZ branch with > > > > +@item _SC_MINSIGSTKSZ > > +@standards{GNU, unistd.h} > > Can we specify these more agressively now? > > > +Inquire about the signal stack size used by the kernel. > > I think we've already concluded that this should included all mandatory > overheads, including those imposed by the compiler and glibc? > > e.g.: > > --8<-- > > The returned value is the minimum number of bytes of free stack space > required in order to gurantee successful, non-nested handling of a > single signal whose handler is an empty function. > > -->8-- > > > + > > +@item _SC_SIGSTKSZ > > +@standards{GNU, unistd.h} > > +Inquire about the default signal stack size for a signal handler. > > Similarly: > > --8<-- > > The returned value is the suggested minimum number of bytes of stack > space required for a signal stack. > > This is not guaranteed to be enough for any specific purpose other than > the invocation of a single, non-nested, empty handler, but nonetheless > should be enough for basic scenarios involving simple signal handlers > and very low levels of signal nesting (say, 2 or 3 levels at the very > most). > > This value is provided for developer convenience and to ease migration > from the legacy SIGSTKSZ constant. Programs requiring stronger > guarantees should avoid using it if at all possible. > > -->8-- Done. > > If these descriptions are too wordy, we might want to move some of it > out to signal.texi, though. > > > > > case _SC_MINSIGSTKSZ: > > assert (GLRO(dl_minsigstacksize) != 0); > > return GLRO(dl_minsigstacksize); > > > > case _SC_SIGSTKSZ: > > { > > /* Return MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4. */ > > long int minsigstacksize = GLRO(dl_minsigstacksize); > > _Static_assert (__builtin_constant_p (MINSIGSTKSZ), > > "MINSIGSTKSZ is constant"); > > if (minsigstacksize < MINSIGSTKSZ) > > minsigstacksize = MINSIGSTKSZ; > > return minsigstacksize * 4; > > } > > > > > > > > (Or something like that, unless the architecture provides its own > > > definitions. ia64's MINSIGSTKSZ is enormous, so it probably doesn't > > > want this.) > > > > > > > > > Also: should all these values be rounded up to a multiple of the > > > architecture's preferred stack alignment? > > > > Kernel should provide a properly aligned AT_MINSIGSTKSZ. > > OK. Can you comment on Chang S. Bae's series? I wasn't sure that the > proposal produces an aligned value for AT_MINSIGSTKSZ on x86, but maybe > I just worked it out wrong. In Linux kernel, the signal frame data is composed of the following areas and laid out as: -- | alignment padding | -- | xsave buffer | <<< This must be 64 byte aligned
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Wed, Oct 07, 2020 at 06:30:03AM -0700, H.J. Lu wrote: > On Wed, Oct 7, 2020 at 3:47 AM Dave Martin wrote: > > > > On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote: [...] > > > I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ. > > > _SC_MINSIGSTKSZ is the minimum signal stack size from AT_MINSIGSTKSZ, > > > which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the > > > value > > > of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application. > > > > Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead? > > Done. OK. I was prepared to have to argue my case a bit more, but if you think this is OK then I will stop arguing ;) > > If the arch has more or bigger registers to save in the signal frame, > > the chances are that they're going to get saved in some userspace stack > > frames too. So I suspect that the user signal handler stack usage may > > scale up to some extent rather than being a constant. > > > > > > To help people migrate without unpleasant surprises, I also figured it > > would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >= > > legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ. > > This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a > > drop-in replacement for MINSIGSTKSZ, etc. > > > > (To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64. > > My idea was that sysconf () should hide this surprise, but people who > > really want to know the kernel value would tolerate some > > nonportability and read AT_MINSIGSTKSZ directly.) > > > > > > So then: > > > > kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ); > > minsigstksz = LEGACY_MINSIGSTKSZ; > > if (kernel_minsigstksz > minsigstksz) > > minsistksz = kernel_minsigstksz; > > > > sigstksz = LEGACY_SIGSTKSZ; > > if (minsigstksz * 4 > sigstksz) > > sigstksz = minsigstksz * 4; > > I updated users/hjl/AT_MINSIGSTKSZ branch with > > +@item _SC_MINSIGSTKSZ > +@standards{GNU, unistd.h} Can we specify these more agressively now? > +Inquire about the signal stack size used by the kernel. I think we've already concluded that this should included all mandatory overheads, including those imposed by the compiler and glibc? e.g.: --8<-- The returned value is the minimum number of bytes of free stack space required in order to gurantee successful, non-nested handling of a single signal whose handler is an empty function. -->8-- > + > +@item _SC_SIGSTKSZ > +@standards{GNU, unistd.h} > +Inquire about the default signal stack size for a signal handler. Similarly: --8<-- The returned value is the suggested minimum number of bytes of stack space required for a signal stack. This is not guaranteed to be enough for any specific purpose other than the invocation of a single, non-nested, empty handler, but nonetheless should be enough for basic scenarios involving simple signal handlers and very low levels of signal nesting (say, 2 or 3 levels at the very most). This value is provided for developer convenience and to ease migration from the legacy SIGSTKSZ constant. Programs requiring stronger guarantees should avoid using it if at all possible. -->8-- If these descriptions are too wordy, we might want to move some of it out to signal.texi, though. > > case _SC_MINSIGSTKSZ: > assert (GLRO(dl_minsigstacksize) != 0); > return GLRO(dl_minsigstacksize); > > case _SC_SIGSTKSZ: > { > /* Return MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4. */ > long int minsigstacksize = GLRO(dl_minsigstacksize); > _Static_assert (__builtin_constant_p (MINSIGSTKSZ), > "MINSIGSTKSZ is constant"); > if (minsigstacksize < MINSIGSTKSZ) > minsigstacksize = MINSIGSTKSZ; > return minsigstacksize * 4; > } > > > > > (Or something like that, unless the architecture provides its own > > definitions. ia64's MINSIGSTKSZ is enormous, so it probably doesn't > > want this.) > > > > > > Also: should all these values be rounded up to a multiple of the > > architecture's preferred stack alignment? > > Kernel should provide a properly aligned AT_MINSIGSTKSZ. OK. Can you comment on Chang S. Bae's series? I wasn't sure that the proposal produces an aligned value for AT_MINSIGSTKSZ on x86, but maybe I just worked it out wrong. > > Should the preferred stack alignment also be exposed through sysconf? > > Portable code otherwise has no way to know this, though if the > > preferred alignment is <= the minimum malloc()/alloca() alignment then > > this is generally not an issue.) > > Good question. But it is orthogonal to the signal stack size issue. Ack. There are various other brokennesses around this area, such as the fact that the register data may now run off the end of the mcontext_t object that is supposed to contain it. Ideally we should fork ucontext_t or mcontext_t into two types, one for the
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Wed, Oct 7, 2020 at 3:47 AM Dave Martin wrote: > > On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote: > > On Tue, Oct 6, 2020 at 9:55 AM Dave Martin wrote: > > > > > > On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote: > > > > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin wrote: > > > > > > > > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin > > > > > > wrote: > > > > > > > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > > > > During signal entry, the kernel pushes data onto the normal > > > > > > > > > > userspace > > > > > > > > > > stack. On x86, the data pushed onto the user stack includes > > > > > > > > > > XSAVE state, > > > > > > > > > > which has grown over time as new features and larger > > > > > > > > > > registers have been > > > > > > > > > > added to the architecture. > > > > > > > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h > > > > > > > > > > headers and > > > > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. > > > > > > > > > > Its value is > > > > > > > > > > compiled into programs and is part of the user/kernel ABI. > > > > > > > > > > The MINSIGSTKSZ > > > > > > > > > > constant indicates to userspace how much data the kernel > > > > > > > > > > expects to push on > > > > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > > > > > > > However, this constant is much too small and does not > > > > > > > > > > reflect recent > > > > > > > > > > additions to the architecture. For instance, when AVX-512 > > > > > > > > > > states are in > > > > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ > > > > > > > > > > remains 2KB. > > > > > > > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small > > > > > > > > > > MINSIGSTKSZ can > > > > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > > > > 1. Provide a variable minimum stack size to userspace, as a > > > > > > > > > > similar > > > > > > > > > >approach to [5] > > > > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach > > > > > > > > > followed in this > > > > > > > > > series does seem consistent with the way arm64 populates > > > > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf > > > > > > > > > interface to > > > > > > > > > this... > > > > > > > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > > > > > > > Thanks for the link. > > > > > > > > > > > > > > Are there patches yet? I already had some hacks in the works, > > > > > > > but I can > > > > > > > drop them if there's something already out there. > > > > > > > > > > > > I am working on it. > > > > > > > > > > OK. I may post something for discussion, but I'm happy for it to be > > > > > superseded by someone (i.e., other than me) who actually knows what > > > > > they're doing... > > > > > > > > Please see my previous email for my glibc patch: > > > > > > > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > > > > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk > > > > > > > of > > > > > > > buffer overruns. > > > > > > > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ > > > > > > > #define > > > > > > > (apart from the fact that it is rarely used, due to glibc's > > > > > > > shadowing > > > > > > > definitions) was that userspace binaries will have baked in the > > > > > > > old > > > > > > > value of the constant and may be making assumptions about it. > > > > > > > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > > > > changes. This could be a problem if an newly built library tries > > > > > > > to > > > > > > > memcpy() or dump such an object defined by and old binary. > > > > > > > Bounds-checking and the stack sizes passed to things like > > > > > > > sigaltstack() > > > > > > > and makecontext() could similarly go wrong. > > > > > > > > > > > > With my original proposal: > > > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > > > > constants: > > > > > > > > > > > >
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote: > On Tue, Oct 6, 2020 at 9:55 AM Dave Martin wrote: > > > > On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote: > > > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin wrote: > > > > > > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin > > > > > wrote: > > > > > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin > > > > > > > wrote: > > > > > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > > > During signal entry, the kernel pushes data onto the normal > > > > > > > > > userspace > > > > > > > > > stack. On x86, the data pushed onto the user stack includes > > > > > > > > > XSAVE state, > > > > > > > > > which has grown over time as new features and larger > > > > > > > > > registers have been > > > > > > > > > added to the architecture. > > > > > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h > > > > > > > > > headers and > > > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its > > > > > > > > > value is > > > > > > > > > compiled into programs and is part of the user/kernel ABI. > > > > > > > > > The MINSIGSTKSZ > > > > > > > > > constant indicates to userspace how much data the kernel > > > > > > > > > expects to push on > > > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > > > > > However, this constant is much too small and does not reflect > > > > > > > > > recent > > > > > > > > > additions to the architecture. For instance, when AVX-512 > > > > > > > > > states are in > > > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ > > > > > > > > > remains 2KB. > > > > > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small > > > > > > > > > MINSIGSTKSZ can > > > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > > > 1. Provide a variable minimum stack size to userspace, as a > > > > > > > > > similar > > > > > > > > >approach to [5] > > > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed > > > > > > > > in this > > > > > > > > series does seem consistent with the way arm64 populates > > > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf > > > > > > > > interface to > > > > > > > > this... > > > > > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > > > > > Thanks for the link. > > > > > > > > > > > > Are there patches yet? I already had some hacks in the works, but > > > > > > I can > > > > > > drop them if there's something already out there. > > > > > > > > > > I am working on it. > > > > > > > > OK. I may post something for discussion, but I'm happy for it to be > > > > superseded by someone (i.e., other than me) who actually knows what > > > > they're doing... > > > > > > Please see my previous email for my glibc patch: > > > > > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > > > > buffer overruns. > > > > > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ > > > > > > #define > > > > > > (apart from the fact that it is rarely used, due to glibc's > > > > > > shadowing > > > > > > definitions) was that userspace binaries will have baked in the old > > > > > > value of the constant and may be making assumptions about it. > > > > > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > > > changes. This could be a problem if an newly built library tries to > > > > > > memcpy() or dump such an object defined by and old binary. > > > > > > Bounds-checking and the stack sizes passed to things like > > > > > > sigaltstack() > > > > > > and makecontext() could similarly go wrong. > > > > > > > > > > With my original proposal: > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > > > constants: > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > > > Ah, I see. But both still API and ABI breaks; moreover, declaraing an > > > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the > > > > obvious thing to do with this constant in many simple cases. Such usage > > > > is widespread,
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 06, 2020 at 11:30:42AM -0700, Dave Hansen wrote: > On 10/6/20 10:00 AM, Dave Martin wrote: > > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: > >> On 10/6/20 8:25 AM, Dave Martin wrote: > >>> Or are people reporting real stack overruns on x86 today? > >> We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state > >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. > > Right. Out of interest, do you believe that's a direct consequence of > > the larger kernel-generated signal frame, or does the expansion of > > userspace stack frames play a role too? > > The kernel-generated signal frame is entirely responsible for the ~2800 > bytes that I'm talking about. > > I'm sure there are some systems where userspace plays a role, but those > are much less of a worry at the moment, since the kernel-induced > overflows mean an instant crash that userspace has no recourse for. Ack, that sounds pretty convincing. Cheers ---Dave
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 06, 2020 at 08:21:00PM +0200, Florian Weimer wrote: > * Dave Martin via Libc-alpha: > > > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: > >> On 10/6/20 8:25 AM, Dave Martin wrote: > >> > Or are people reporting real stack overruns on x86 today? > >> > >> We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state > >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. > > > > Right. Out of interest, do you believe that's a direct consequence of > > the larger kernel-generated signal frame, or does the expansion of > > userspace stack frames play a role too? > > I must say that I do not quite understand this question. > > 32 64-*byte* registers simply need 2048 bytes of storage space worst > case, there is really no way around that. If the architecture grows more or bigger registers, and if those registers are used in general-purpose code, then all stack frames will tend to grow, not just the signal frame. So a stack overflow might be caused by the larger signal frame by itself; or it might be caused by the growth of the stack of 20 function frames created by someone's signal handler. In the latter case, this is just a "normal" stack overflow, and nothing really to do with signals or SIGSTKSZ. Rebuilding with different compiler flags could also grow the stack usage and cause just the same problem. I also strongly suspect that people often don't think about signal nesting when allocating signal stacks. So, there might be a pre- existing potential overflow that just becomes more likely when the signal frame grows. That's not really SIGSTKSZ's fault. Of course, AVX-512 might never be used in general-purpose code. On AArch64, SVE can be used in general-purpose code, but it's too early to say what its prevalence will be in signal handlers. Probably low. > > In practice software just assumes SIGSTKSZ and then ignores the problem > > until / unless an actual stack overflow is seen. > > > > There's probably a lot of software out there whose stack is > > theoretically too small even without AVX-512 etc. in the mix, especially > > when considering the possibility of nested signals... > > That is certainly true. We have seen problems with ntpd, which > requested a 16 KiB stack, at a time when there were various deductions > from the stack size, and since the glibc dynamic loader also uses XSAVE, > ntpd exceeded the remaining stack space. But in this case, we just > fudged the stack size computation in pthread_create and made it less > likely that the dynamic loader was activated, which largely worked > around this particular problem. For MINSIGSTKSZ, we just don't have > this option because it's simply too small in the first place. > > I don't immediately recall a bug due to SIGSTKSZ being too small. The > test cases I wrote for this were all artificial, to raise awareness of > this issue (applications treating these as recommended values, rather > than minimum value to avoid immediately sigaltstack/phtread_create > failures, same issue with PTHREAD_STACK_MIN). Ack, I think if SIGSTKSZ was too small significantly often, there would be more awareness of the issue. Cheers ---Dave
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On 10/6/20 10:00 AM, Dave Martin wrote: > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: >> On 10/6/20 8:25 AM, Dave Martin wrote: >>> Or are people reporting real stack overruns on x86 today? >> We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. > Right. Out of interest, do you believe that's a direct consequence of > the larger kernel-generated signal frame, or does the expansion of > userspace stack frames play a role too? The kernel-generated signal frame is entirely responsible for the ~2800 bytes that I'm talking about. I'm sure there are some systems where userspace plays a role, but those are much less of a worry at the moment, since the kernel-induced overflows mean an instant crash that userspace has no recourse for.
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
* Dave Martin via Libc-alpha: > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: >> On 10/6/20 8:25 AM, Dave Martin wrote: >> > Or are people reporting real stack overruns on x86 today? >> >> We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. > > Right. Out of interest, do you believe that's a direct consequence of > the larger kernel-generated signal frame, or does the expansion of > userspace stack frames play a role too? I must say that I do not quite understand this question. 32 64-*byte* registers simply need 2048 bytes of storage space worst case, there is really no way around that. > In practice software just assumes SIGSTKSZ and then ignores the problem > until / unless an actual stack overflow is seen. > > There's probably a lot of software out there whose stack is > theoretically too small even without AVX-512 etc. in the mix, especially > when considering the possibility of nested signals... That is certainly true. We have seen problems with ntpd, which requested a 16 KiB stack, at a time when there were various deductions from the stack size, and since the glibc dynamic loader also uses XSAVE, ntpd exceeded the remaining stack space. But in this case, we just fudged the stack size computation in pthread_create and made it less likely that the dynamic loader was activated, which largely worked around this particular problem. For MINSIGSTKSZ, we just don't have this option because it's simply too small in the first place. I don't immediately recall a bug due to SIGSTKSZ being too small. The test cases I wrote for this were all artificial, to raise awareness of this issue (applications treating these as recommended values, rather than minimum value to avoid immediately sigaltstack/phtread_create failures, same issue with PTHREAD_STACK_MIN). Thanks, Florian -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 6, 2020 at 9:55 AM Dave Martin wrote: > > On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote: > > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin wrote: > > > > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin wrote: > > > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin > > > > > > wrote: > > > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > > During signal entry, the kernel pushes data onto the normal > > > > > > > > userspace > > > > > > > > stack. On x86, the data pushed onto the user stack includes > > > > > > > > XSAVE state, > > > > > > > > which has grown over time as new features and larger registers > > > > > > > > have been > > > > > > > > added to the architecture. > > > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h > > > > > > > > headers and > > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its > > > > > > > > value is > > > > > > > > compiled into programs and is part of the user/kernel ABI. The > > > > > > > > MINSIGSTKSZ > > > > > > > > constant indicates to userspace how much data the kernel > > > > > > > > expects to push on > > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > > > However, this constant is much too small and does not reflect > > > > > > > > recent > > > > > > > > additions to the architecture. For instance, when AVX-512 > > > > > > > > states are in > > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ > > > > > > > > remains 2KB. > > > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small > > > > > > > > MINSIGSTKSZ can > > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > > 1. Provide a variable minimum stack size to userspace, as a > > > > > > > > similar > > > > > > > >approach to [5] > > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed > > > > > > > in this > > > > > > > series does seem consistent with the way arm64 populates > > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface > > > > > > > to > > > > > > > this... > > > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > > > Thanks for the link. > > > > > > > > > > Are there patches yet? I already had some hacks in the works, but I > > > > > can > > > > > drop them if there's something already out there. > > > > > > > > I am working on it. > > > > > > OK. I may post something for discussion, but I'm happy for it to be > > > superseded by someone (i.e., other than me) who actually knows what > > > they're doing... > > > > Please see my previous email for my glibc patch: > > > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > > > buffer overruns. > > > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > > > definitions) was that userspace binaries will have baked in the old > > > > > value of the constant and may be making assumptions about it. > > > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > > changes. This could be a problem if an newly built library tries to > > > > > memcpy() or dump such an object defined by and old binary. > > > > > Bounds-checking and the stack sizes passed to things like > > > > > sigaltstack() > > > > > and makecontext() could similarly go wrong. > > > > > > > > With my original proposal: > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > > constants: > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > Ah, I see. But both still API and ABI breaks; moreover, declaraing an > > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the > > > obvious thing to do with this constant in many simple cases. Such usage > > > is widespread, see: > > > > > > * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D=1 > > > > > > > > > Your two approaches seem to trade off two different sources of buffer > > > overruns: undersized stacks versus ABI breaks across library boundaries. > > > > We can't get everything we want. > > > > > Since
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: > On 10/6/20 8:25 AM, Dave Martin wrote: > > Or are people reporting real stack overruns on x86 today? > > We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state > mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. Right. Out of interest, do you believe that's a direct consequence of the larger kernel-generated signal frame, or does the expansion of userspace stack frames play a role too? In practice software just assumes SIGSTKSZ and then ignores the problem until / unless an actual stack overflow is seen. There's probably a lot of software out there whose stack is theoretically too small even without AVX-512 etc. in the mix, especially when considering the possibility of nested signals... Cheers ---Dave
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote: > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin wrote: > > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin wrote: > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin > > > > > wrote: > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > During signal entry, the kernel pushes data onto the normal > > > > > > > userspace > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE > > > > > > > state, > > > > > > > which has grown over time as new features and larger registers > > > > > > > have been > > > > > > > added to the architecture. > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers > > > > > > > and > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its > > > > > > > value is > > > > > > > compiled into programs and is part of the user/kernel ABI. The > > > > > > > MINSIGSTKSZ > > > > > > > constant indicates to userspace how much data the kernel expects > > > > > > > to push on > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > However, this constant is much too small and does not reflect > > > > > > > recent > > > > > > > additions to the architecture. For instance, when AVX-512 states > > > > > > > are in > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains > > > > > > > 2KB. > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small > > > > > > > MINSIGSTKSZ can > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > 1. Provide a variable minimum stack size to userspace, as a > > > > > > > similar > > > > > > >approach to [5] > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in > > > > > > this > > > > > > series does seem consistent with the way arm64 populates > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > > this... > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > Thanks for the link. > > > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > > drop them if there's something already out there. > > > > > > I am working on it. > > > > OK. I may post something for discussion, but I'm happy for it to be > > superseded by someone (i.e., other than me) who actually knows what > > they're doing... > > Please see my previous email for my glibc patch: > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > > buffer overruns. > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > > definitions) was that userspace binaries will have baked in the old > > > > value of the constant and may be making assumptions about it. > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > changes. This could be a problem if an newly built library tries to > > > > memcpy() or dump such an object defined by and old binary. > > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > > and makecontext() could similarly go wrong. > > > > > > With my original proposal: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > constants: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > Ah, I see. But both still API and ABI breaks; moreover, declaraing an > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the > > obvious thing to do with this constant in many simple cases. Such usage > > is widespread, see: > > > > * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D=1 > > > > > > Your two approaches seem to trade off two different sources of buffer > > overruns: undersized stacks versus ABI breaks across library boundaries. > > We can't get everything we want. > > > Since undersized stack is by far the more familiar problem and we at > > least have guard regions to help detect overruns, I'd vote to keep > > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now. > > Agree. > > > Or are people reporting real stack overruns on x86 today? > > I hope so. > > > > > For arm64, we made large
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 6, 2020 at 8:43 AM Dave Martin wrote: > > On Tue, Oct 06, 2020 at 08:18:03AM -0700, H.J. Lu wrote: > > On Tue, Oct 6, 2020 at 5:12 AM H.J. Lu wrote: > > > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin wrote: > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin > > > > > wrote: > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > During signal entry, the kernel pushes data onto the normal > > > > > > > userspace > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE > > > > > > > state, > > > > > > > which has grown over time as new features and larger registers > > > > > > > have been > > > > > > > added to the architecture. > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers > > > > > > > and > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its > > > > > > > value is > > > > > > > compiled into programs and is part of the user/kernel ABI. The > > > > > > > MINSIGSTKSZ > > > > > > > constant indicates to userspace how much data the kernel expects > > > > > > > to push on > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > However, this constant is much too small and does not reflect > > > > > > > recent > > > > > > > additions to the architecture. For instance, when AVX-512 states > > > > > > > are in > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains > > > > > > > 2KB. > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small > > > > > > > MINSIGSTKSZ can > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > 1. Provide a variable minimum stack size to userspace, as a > > > > > > > similar > > > > > > >approach to [5] > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in > > > > > > this > > > > > > series does seem consistent with the way arm64 populates > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > > this... > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > Thanks for the link. > > > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > > drop them if there's something already out there. > > > > > > I am working on it. > > > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > > buffer overruns. > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > > definitions) was that userspace binaries will have baked in the old > > > > value of the constant and may be making assumptions about it. > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > changes. This could be a problem if an newly built library tries to > > > > memcpy() or dump such an object defined by and old binary. > > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > > and makecontext() could similarly go wrong. > > > > > > With my original proposal: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > constants: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the > > > > > kernel. > > > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > > discovery method is changing. The meaning of the value is exactly the > > > > same as before. > > > > > > > > If we are going to rename it though, it could make sense to go for > > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > > > > > The trouble with including "STKSZ" is that is sounds like a > > > > recommendation for your stack size. While the signal frame size is > > > > relevant to picking a stack size, it's not the only thing to > > > > consider. > > > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > > > kernel. The minimum stack size for a signal handler is more likely > > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > > > frame size used by kernel + 6KB for user application. > > > > > > > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > > > of a "recommended stack size" be abandoned? glibc can at least make
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On 10/6/20 8:25 AM, Dave Martin wrote: > Or are people reporting real stack overruns on x86 today? We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state mostly from AVX-512, and a 2048 byte MINSIGSTKSZ.
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 06, 2020 at 08:18:03AM -0700, H.J. Lu wrote: > On Tue, Oct 6, 2020 at 5:12 AM H.J. Lu wrote: > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin wrote: > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin wrote: > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > During signal entry, the kernel pushes data onto the normal > > > > > > userspace > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE > > > > > > state, > > > > > > which has grown over time as new features and larger registers have > > > > > > been > > > > > > added to the architecture. > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers > > > > > > and > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value > > > > > > is > > > > > > compiled into programs and is part of the user/kernel ABI. The > > > > > > MINSIGSTKSZ > > > > > > constant indicates to userspace how much data the kernel expects to > > > > > > push on > > > > > > the user stack, [2][3]. > > > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > > additions to the architecture. For instance, when AVX-512 states > > > > > > are in > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains > > > > > > 2KB. > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small > > > > > > MINSIGSTKSZ can > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > >approach to [5] > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in > > > > > this > > > > > series does seem consistent with the way arm64 populates > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > this... > > > > > > > > Here is my proposal for glibc: > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > Thanks for the link. > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > drop them if there's something already out there. > > > > I am working on it. > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > buffer overruns. > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > definitions) was that userspace binaries will have baked in the old > > > value of the constant and may be making assumptions about it. > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > changes. This could be a problem if an newly built library tries to > > > memcpy() or dump such an object defined by and old binary. > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > and makecontext() could similarly go wrong. > > > > With my original proposal: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > constants: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the > > > > kernel. > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > discovery method is changing. The meaning of the value is exactly the > > > same as before. > > > > > > If we are going to rename it though, it could make sense to go for > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > > > The trouble with including "STKSZ" is that is sounds like a > > > recommendation for your stack size. While the signal frame size is > > > relevant to picking a stack size, it's not the only thing to > > > consider. > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > > kernel. The minimum stack size for a signal handler is more likely > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > > frame size used by kernel + 6KB for user application. > > > > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > > of a "recommended stack size" be abandoned? glibc can at least make a > > > slightly more informed guess about suitable stack sizes than the kernel > > > (and glibc already has to guess anyway, in order to determine the > > > default thread stack size). > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > > available. > > > > > > > > > 3. Deprecate
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 6, 2020 at 8:25 AM Dave Martin wrote: > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin wrote: > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin wrote: > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > During signal entry, the kernel pushes data onto the normal > > > > > > userspace > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE > > > > > > state, > > > > > > which has grown over time as new features and larger registers have > > > > > > been > > > > > > added to the architecture. > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers > > > > > > and > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value > > > > > > is > > > > > > compiled into programs and is part of the user/kernel ABI. The > > > > > > MINSIGSTKSZ > > > > > > constant indicates to userspace how much data the kernel expects to > > > > > > push on > > > > > > the user stack, [2][3]. > > > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > > additions to the architecture. For instance, when AVX-512 states > > > > > > are in > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains > > > > > > 2KB. > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small > > > > > > MINSIGSTKSZ can > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > >approach to [5] > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in > > > > > this > > > > > series does seem consistent with the way arm64 populates > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > this... > > > > > > > > Here is my proposal for glibc: > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > Thanks for the link. > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > drop them if there's something already out there. > > > > I am working on it. > > OK. I may post something for discussion, but I'm happy for it to be > superseded by someone (i.e., other than me) who actually knows what > they're doing... Please see my previous email for my glibc patch: https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > buffer overruns. > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > definitions) was that userspace binaries will have baked in the old > > > value of the constant and may be making assumptions about it. > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > changes. This could be a problem if an newly built library tries to > > > memcpy() or dump such an object defined by and old binary. > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > and makecontext() could similarly go wrong. > > > > With my original proposal: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > constants: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > Ah, I see. But both still API and ABI breaks; moreover, declaraing an > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the > obvious thing to do with this constant in many simple cases. Such usage > is widespread, see: > > * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D=1 > > > Your two approaches seem to trade off two different sources of buffer > overruns: undersized stacks versus ABI breaks across library boundaries. We can't get everything we want. > Since undersized stack is by far the more familiar problem and we at > least have guard regions to help detect overruns, I'd vote to keep > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now. Agree. > Or are people reporting real stack overruns on x86 today? I hope so. > > For arm64, we made large vectors on SVE opt-in, so that oversized signal > frames are not seen by default. Would somethine similar be feasible on > x86? > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the > > > > kernel. > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > discovery method is
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin wrote: > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin wrote: > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE > > > > > state, > > > > > which has grown over time as new features and larger registers have > > > > > been > > > > > added to the architecture. > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > compiled into programs and is part of the user/kernel ABI. The > > > > > MINSIGSTKSZ > > > > > constant indicates to userspace how much data the kernel expects to > > > > > push on > > > > > the user stack, [2][3]. > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > additions to the architecture. For instance, when AVX-512 states are > > > > > in > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small > > > > > MINSIGSTKSZ can > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > In this series, we suggest a couple of things: > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > >approach to [5] > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > series does seem consistent with the way arm64 populates > > > > AT_MINSIGSTKSZ. > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > this... > > > > > > Here is my proposal for glibc: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > Thanks for the link. > > > > Are there patches yet? I already had some hacks in the works, but I can > > drop them if there's something already out there. > > I am working on it. OK. I may post something for discussion, but I'm happy for it to be superseded by someone (i.e., other than me) who actually knows what they're doing... > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > buffer overruns. > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > (apart from the fact that it is rarely used, due to glibc's shadowing > > definitions) was that userspace binaries will have baked in the old > > value of the constant and may be making assumptions about it. > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > changes. This could be a problem if an newly built library tries to > > memcpy() or dump such an object defined by and old binary. > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > and makecontext() could similarly go wrong. > > With my original proposal: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > constants: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html Ah, I see. But both still API and ABI breaks; moreover, declaraing an array with size based on (MIN)SIGSTKSZ is not just reasonable, but the obvious thing to do with this constant in many simple cases. Such usage is widespread, see: * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D=1 Your two approaches seem to trade off two different sources of buffer overruns: undersized stacks versus ABI breaks across library boundaries. Since undersized stack is by far the more familiar problem and we at least have guard regions to help detect overruns, I'd vote to keep MINSIGSTKSZ and SIGSTKSZ as-is, at least for now. Or are people reporting real stack overruns on x86 today? For arm64, we made large vectors on SVE opt-in, so that oversized signal frames are not seen by default. Would somethine similar be feasible on x86? > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the > > > kernel. > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > discovery method is changing. The meaning of the value is exactly the > > same as before. > > > > If we are going to rename it though, it could make sense to go for > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > The trouble with including "STKSZ" is that is sounds like a > > recommendation for your stack size. While the signal frame size is > > relevant to picking a stack size, it's not the only thing to > > consider. > > The problem is that AT_MINSIGSTKSZ is the signal frame
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 6, 2020 at 5:12 AM H.J. Lu wrote: > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin wrote: > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin wrote: > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE > > > > > state, > > > > > which has grown over time as new features and larger registers have > > > > > been > > > > > added to the architecture. > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > compiled into programs and is part of the user/kernel ABI. The > > > > > MINSIGSTKSZ > > > > > constant indicates to userspace how much data the kernel expects to > > > > > push on > > > > > the user stack, [2][3]. > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > additions to the architecture. For instance, when AVX-512 states are > > > > > in > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small > > > > > MINSIGSTKSZ can > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > In this series, we suggest a couple of things: > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > >approach to [5] > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > series does seem consistent with the way arm64 populates > > > > AT_MINSIGSTKSZ. > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > this... > > > > > > Here is my proposal for glibc: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > Thanks for the link. > > > > Are there patches yet? I already had some hacks in the works, but I can > > drop them if there's something already out there. > > I am working on it. > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > buffer overruns. > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > (apart from the fact that it is rarely used, due to glibc's shadowing > > definitions) was that userspace binaries will have baked in the old > > value of the constant and may be making assumptions about it. > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > changes. This could be a problem if an newly built library tries to > > memcpy() or dump such an object defined by and old binary. > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > and makecontext() could similarly go wrong. > > With my original proposal: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > constants: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the > > > kernel. > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > discovery method is changing. The meaning of the value is exactly the > > same as before. > > > > If we are going to rename it though, it could make sense to go for > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > The trouble with including "STKSZ" is that is sounds like a > > recommendation for your stack size. While the signal frame size is > > relevant to picking a stack size, it's not the only thing to > > consider. > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > kernel. The minimum stack size for a signal handler is more likely > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > frame size used by kernel + 6KB for user application. > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > of a "recommended stack size" be abandoned? glibc can at least make a > > slightly more informed guess about suitable stack sizes than the kernel > > (and glibc already has to guess anyway, in order to determine the > > default thread stack size). > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > available. > > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > is in use. > > > > Great if we can do it. I was concerned that this might be > > controversial. > > > > Would this just be a recommendation, or can we enforce it somehow? > > It is just an idea. We need to move away from constant SIGSTKSZ and > MINSIGSTKSZ. > Here is the glibc patch:
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Oct 6, 2020 at 2:25 AM Dave Martin wrote: > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin wrote: > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > which has grown over time as new features and larger registers have been > > > > added to the architecture. > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > compiled into programs and is part of the user/kernel ABI. The > > > > MINSIGSTKSZ > > > > constant indicates to userspace how much data the kernel expects to > > > > push on > > > > the user stack, [2][3]. > > > > > > > > However, this constant is much too small and does not reflect recent > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ > > > > can > > > > cause user stack overflow when delivering a signal. > > > > > > > > In this series, we suggest a couple of things: > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > >approach to [5] > > > > 2. Avoid using a too-small alternate stack > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > series does seem consistent with the way arm64 populates > > > AT_MINSIGSTKSZ. > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > this... > > > > Here is my proposal for glibc: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > Thanks for the link. > > Are there patches yet? I already had some hacks in the works, but I can > drop them if there's something already out there. I am working on it. > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > Can we do this? IIUC, this is an ABI break and carries the risk of > buffer overruns. > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > (apart from the fact that it is rarely used, due to glibc's shadowing > definitions) was that userspace binaries will have baked in the old > value of the constant and may be making assumptions about it. > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > changes. This could be a problem if an newly built library tries to > memcpy() or dump such an object defined by and old binary. > Bounds-checking and the stack sizes passed to things like sigaltstack() > and makecontext() could similarly go wrong. With my original proposal: https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html char [MINSIGSTKSZ] won't compile. The feedback is to increase the constants: https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > discovery method is changing. The meaning of the value is exactly the > same as before. > > If we are going to rename it though, it could make sense to go for > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > The trouble with including "STKSZ" is that is sounds like a > recommendation for your stack size. While the signal frame size is > relevant to picking a stack size, it's not the only thing to > consider. The problem is that AT_MINSIGSTKSZ is the signal frame size used by kernel. The minimum stack size for a signal handler is more likely AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal frame size used by kernel + 6KB for user application. > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > of a "recommended stack size" be abandoned? glibc can at least make a > slightly more informed guess about suitable stack sizes than the kernel > (and glibc already has to guess anyway, in order to determine the > default thread stack size). Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't available. > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > is in use. > > Great if we can do it. I was concerned that this might be > controversial. > > Would this just be a recommendation, or can we enforce it somehow? It is just an idea. We need to move away from constant SIGSTKSZ and MINSIGSTKSZ. -- H.J.
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin wrote: > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > During signal entry, the kernel pushes data onto the normal userspace > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > which has grown over time as new features and larger registers have been > > > added to the architecture. > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > constant indicates to userspace how much data the kernel expects to push > > > on > > > the user stack, [2][3]. > > > > > > However, this constant is much too small and does not reflect recent > > > additions to the architecture. For instance, when AVX-512 states are in > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ > > > can > > > cause user stack overflow when delivering a signal. > > > > > > In this series, we suggest a couple of things: > > > 1. Provide a variable minimum stack size to userspace, as a similar > > >approach to [5] > > > 2. Avoid using a too-small alternate stack > > > > I can't comment on the x86 specifics, but the approach followed in this > > series does seem consistent with the way arm64 populates > > AT_MINSIGSTKSZ. > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > this... > > Here is my proposal for glibc: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html Thanks for the link. Are there patches yet? I already had some hacks in the works, but I can drop them if there's something already out there. > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. Can we do this? IIUC, this is an ABI break and carries the risk of buffer overruns. The reason for not simply increasing the kernel's MINSIGSTKSZ #define (apart from the fact that it is rarely used, due to glibc's shadowing definitions) was that userspace binaries will have baked in the old value of the constant and may be making assumptions about it. For example, the type (char [MINSIGSTKSZ]) changes if this #define changes. This could be a problem if an newly built library tries to memcpy() or dump such an object defined by and old binary. Bounds-checking and the stack sizes passed to things like sigaltstack() and makecontext() could similarly go wrong. > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the discovery method is changing. The meaning of the value is exactly the same as before. If we are going to rename it though, it could make sense to go for something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". The trouble with including "STKSZ" is that is sounds like a recommendation for your stack size. While the signal frame size is relevant to picking a stack size, it's not the only thing to consider. Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept of a "recommended stack size" be abandoned? glibc can at least make a slightly more informed guess about suitable stack sizes than the kernel (and glibc already has to guess anyway, in order to determine the default thread stack size). > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > is in use. Great if we can do it. I was concerned that this might be controversial. Would this just be a recommendation, or can we enforce it somehow? Cheers ---Dave
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Mon, Oct 5, 2020 at 6:45 AM Dave Martin wrote: > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > During signal entry, the kernel pushes data onto the normal userspace > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > which has grown over time as new features and larger registers have been > > added to the architecture. > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > constant indicates to userspace how much data the kernel expects to push on > > the user stack, [2][3]. > > > > However, this constant is much too small and does not reflect recent > > additions to the architecture. For instance, when AVX-512 states are in > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > cause user stack overflow when delivering a signal. > > > > In this series, we suggest a couple of things: > > 1. Provide a variable minimum stack size to userspace, as a similar > >approach to [5] > > 2. Avoid using a too-small alternate stack > > I can't comment on the x86 specifics, but the approach followed in this > series does seem consistent with the way arm64 populates > AT_MINSIGSTKSZ. > > I need to dig up my glibc hacks for providing a sysconf interface to > this... Here is my proposal for glibc: https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE is in use. -- H.J.
Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > During signal entry, the kernel pushes data onto the normal userspace > stack. On x86, the data pushed onto the user stack includes XSAVE state, > which has grown over time as new features and larger registers have been > added to the architecture. > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > constant indicates to userspace how much data the kernel expects to push on > the user stack, [2][3]. > > However, this constant is much too small and does not reflect recent > additions to the architecture. For instance, when AVX-512 states are in > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > cause user stack overflow when delivering a signal. > > In this series, we suggest a couple of things: > 1. Provide a variable minimum stack size to userspace, as a similar >approach to [5] > 2. Avoid using a too-small alternate stack I can't comment on the x86 specifics, but the approach followed in this series does seem consistent with the way arm64 populates AT_MINSIGSTKSZ. I need to dig up my glibc hacks for providing a sysconf interface to this... Cheers ---Dave > > [1]: > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/bits/sigstack.h;h=b9dca794da093dc4d41d39db9851d444e1b54d9b;hb=HEAD > [2]: https://www.gnu.org/software/libc/manual/html_node/Signal-Stack.html > [3]: https://man7.org/linux/man-pages/man2/sigaltstack.2.html > [4]: https://bugzilla.kernel.org/show_bug.cgi?id=153531 > [5]: > https://blog.linuxplumbersconf.org/2017/ocw/system/presentations/4671/original/plumbers-dm-2017.pdf > > Chang S. Bae (4): > x86/signal: Introduce helpers to get the maximum signal frame size > x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ > x86/signal: Prevent an alternate stack overflow before a signal > delivery > selftest/x86/signal: Include test cases for validating sigaltstack > > arch/x86/ia32/ia32_signal.c | 11 +- > arch/x86/include/asm/elf.h| 4 + > arch/x86/include/asm/fpu/signal.h | 2 + > arch/x86/include/asm/sigframe.h | 25 + > arch/x86/include/uapi/asm/auxvec.h| 6 +- > arch/x86/kernel/cpu/common.c | 3 + > arch/x86/kernel/fpu/signal.c | 20 > arch/x86/kernel/signal.c | 66 +++- > tools/testing/selftests/x86/Makefile | 2 +- > tools/testing/selftests/x86/sigaltstack.c | 126 ++ > 10 files changed, 258 insertions(+), 7 deletions(-) > create mode 100644 tools/testing/selftests/x86/sigaltstack.c > > -- > 2.17.1 >
[RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
During signal entry, the kernel pushes data onto the normal userspace stack. On x86, the data pushed onto the user stack includes XSAVE state, which has grown over time as new features and larger registers have been added to the architecture. MINSIGSTKSZ is a constant provided in the kernel signal.h headers and typically distributed in lib-dev(el) packages, e.g. [1]. Its value is compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ constant indicates to userspace how much data the kernel expects to push on the user stack, [2][3]. However, this constant is much too small and does not reflect recent additions to the architecture. For instance, when AVX-512 states are in use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can cause user stack overflow when delivering a signal. In this series, we suggest a couple of things: 1. Provide a variable minimum stack size to userspace, as a similar approach to [5] 2. Avoid using a too-small alternate stack [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/bits/sigstack.h;h=b9dca794da093dc4d41d39db9851d444e1b54d9b;hb=HEAD [2]: https://www.gnu.org/software/libc/manual/html_node/Signal-Stack.html [3]: https://man7.org/linux/man-pages/man2/sigaltstack.2.html [4]: https://bugzilla.kernel.org/show_bug.cgi?id=153531 [5]: https://blog.linuxplumbersconf.org/2017/ocw/system/presentations/4671/original/plumbers-dm-2017.pdf Chang S. Bae (4): x86/signal: Introduce helpers to get the maximum signal frame size x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ x86/signal: Prevent an alternate stack overflow before a signal delivery selftest/x86/signal: Include test cases for validating sigaltstack arch/x86/ia32/ia32_signal.c | 11 +- arch/x86/include/asm/elf.h| 4 + arch/x86/include/asm/fpu/signal.h | 2 + arch/x86/include/asm/sigframe.h | 25 + arch/x86/include/uapi/asm/auxvec.h| 6 +- arch/x86/kernel/cpu/common.c | 3 + arch/x86/kernel/fpu/signal.c | 20 arch/x86/kernel/signal.c | 66 +++- tools/testing/selftests/x86/Makefile | 2 +- tools/testing/selftests/x86/sigaltstack.c | 126 ++ 10 files changed, 258 insertions(+), 7 deletions(-) create mode 100644 tools/testing/selftests/x86/sigaltstack.c -- 2.17.1