[Bug middle-end/66479] -fstack-check doesn't prevent stack allocation with size -1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66479 --- Comment #4 from Daniel Micay <danielmicay at gmail dot com> --- It's not a configuration issue. The -fstack-check feature is supposed to guarantee the guard page is hit if it's present, regardless of the size of the stack or the stack allocation.
[Bug target/67265] [x86] 'asm' operand has impossible constraints with -fstack-check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67265 --- Comment #3 from Daniel Micay <danielmicay at gmail dot com> --- It only temporarily needs a register to perform a write. The implementation might currently have to reserve one, but it *shouldn't* have to. It should be free other than guaranteeing a one-byte write per page.
[Bug c/68065] Size calculations for VLAs can overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065 --- Comment #9 from Daniel Micay --- > VLA also does not detect stack overflows either. Stack overflows are detected with -fstack-check, or at least they would be if the option worked properly: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66479 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65958 I've always found it quite bad that well-defined code compiled with GCC can actually be exploited (arbitrary write vulnerabilities) because -fstack-check is not enabled by default. MSVC++ and Clang on Windows guarantee that stack overflows from well-defined code (large stack frames, VLAs) will be caught. However, the switch causes a significant performance hit for functions where it triggers (rare, but sometimes performance-critical; a good example is jemalloc's rbtree implementation, which uses arrays rather than recursion) and compatibility issues due to the way it's currently implemented: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67265
[Bug sanitizer/68065] Size calculations for VLAs can overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065 --- Comment #25 from Daniel Micay --- > Just as GCC on Windows... Sure. I'm pointing out that Windows has had safety here for years, while Linux doesn't. It should. It really shouldn't be possible to exploit well-defined code. Running out of resources and aborting isn't ideal but that's at least sane, and doing better doesn't seem possible for C. > Figures please, otherwise that's just FUD as usual. ... pointing out that something isn't implemented ideally is FUD now? If it had no significant performance hit (which should be the case for optimized C, because it shouldn't need to reserve a register), then it would surely be enabled by default. We tried enabling it in Arch Linux but it had to be backed out due to performance concerns. Some compatibility issues came up (due to inline assembly), and investigation then demonstrated that the performance hit wasn't actually negligible, especially on i686. Among other things, it causes a significant performance hit (over 5% slower in a malloc micro-benchmark on x86_64, more on i686) for jemalloc, which has large enough stack frames to trigger it and essentially all of the code is inlined. It's usually pretty small... but a lot more than it should be. Anyway, I was just trying to be helpful. I'm only really interested in Android these days so GCC isn't really something I care about... I just happened to have thoughts about this stuff because I worked on similar issues in the Rust compiler / standard libraries (Rust is why LLVM is getting proper stack checking at all, Clang implements -fstack-check as a no-op right now for 'compatibility') and recently Bionic.
[Bug sanitizer/68065] Size calculations for VLAs can overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065 --- Comment #17 from Daniel Micay --- It's well-defined C code though. It shouldn't be possible for anything to go wrong here when using -fstack-check, i.e. it should be guaranteed to trigger a stack overflow which is caught. The size wrapping back around is different.
[Bug sanitizer/68065] Size calculations for VLAs can overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065 --- Comment #18 from Daniel Micay --- Also, it's possible to use segmented stacks to avoid stack overflow and this probably breaks in that context too. That's a very niche use case compared to -fstack-check though.
[Bug sanitizer/68065] Size calculations for VLAs can overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065 --- Comment #21 from Daniel Micay --- (In reply to jos...@codesourcery.com from comment #20) > Undefined behavior when the type is created (not when an object of that > type is declared or when sizeof is used) seems entirely in accordance with > normal C practice in areas such as stack overflow[*] (that is, where the C > standard fails to recognize limits in such areas but all implementations > in practice have such limits, that's a defect in the C standard). Stack overflow is undefined with GCC, but MSVC++ and Clang on Windows guarantee that it will be caught if the program doesn't invoke any truly undefined behavior. Clang will be getting an implementation for other platforms soon, and it will probably end up being enabled by default since it really has no significant overhead. The implementation of -fstack-check in GCC does have significant overhead, but it doesn't have to be that way. It shouldn't go out of its way to provide a proper stack trace with -O2/-O3 (or whatever other reasons it has for the slow implementation).
[Bug c/67999] Wrong optimization of pointer comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 --- Comment #5 from Daniel Micay --- An implementation can have object size limits, but I think it's incorrect if those limits are not enforced for all standard ways of allocating objects. Objects larger than PTRDIFF_MAX are forbidden with musl (malloc, mmap and friends report ENOMEM, and it's explicitly undefined to create them in another way and use them with libc), and it would make a lot of sense for glibc to do the same thing. I recently landed the same feature in Android's Bionic libc for mmap. Since glibc's implementations of standard library functions don't handle objects larger than PTRDIFF_MAX, this can definitely be considered a buggy area in glibc. The typical issue is `end - start`, but compilers considering addition to be undefined in this case isn't surprising either. FWIW, Clang also treats `ptr + size` with `size > PTRDIFF_MAX` as undefined for standard C pointer arithmetic because the underlying getelementptr instruction in LLVM is inherently a signed arithmetic operation. Clang marks standard C pointer arithmetic operations as "inbounds", which, among other things, makes it undefined behavior to use a value produced by wrapping. Last time I checked, the non-standard GNU C `void *` arithmetic doesn't get tagged as "inbounds" by Clang, so wrapping is well-defined for that.
[Bug c/67999] Wrong optimization of pointer comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 --- Comment #14 from Daniel Micay --- (In reply to Florian Weimer from comment #12) > (In reply to Daniel Micay from comment #10) > > (In reply to Florian Weimer from comment #7) > > > If this is not a GCC bug and it is the responsibility of allocators not to > > > produce huge objects, do we also have to make sure that no object crosses > > > the boundary between 0x7fff_ and 0x8000_? If pointers are treated > > > as de-facto signed, this is where signed overflow would occur. > > > > No, that's fine. > > Is this based on your reading of the standard, the GCC sources, or both? > (It is unusual to see people making such definite statements about > middle-end/back-end behavior, that's why I have to ask.) It's not the kind of thing the standard is concerned with: it'd be perfectly valid for an implementation to forbid that, as long as it was enforced throughout the implementation. It would be just crazy to have a requirement like that. As far as I know, the use of signed offsets for pointer arithmetic in GCC is just a design decision with known consequences. That's definitely the case in LLVM, since it's very explicitly documented as being a signed offset with undefined overflow.
[Bug c/67999] Wrong optimization of pointer comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 --- Comment #15 from Daniel Micay --- i.e. AFAIK the offsets are intended to be treated as signed but treating pointers as signed would be a serious bug rather than a design choice
[Bug c/67999] Wrong optimization of pointer comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 --- Comment #9 from Daniel Micay --- (In reply to Florian Weimer from comment #8) > (In reply to Alexander Cherepanov from comment #4) > > > Am I right that the C standards do not allow for such a limitation (and > > hence this should not be reported to glibc as a bug) and gcc is not > > standards-compliant in this regard? Or I'm missing something? > > The standard explicitly acknowledges the possibility of arrays that have > more than PTRDIFF_MAX elements (it says that the difference of two pointers > within the same array is not necessarily representable in ptrdiff_t). > > I'm hesitant to put in artificial limits into glibc because in the past, > there was significant demand for huge mappings in 32-bit programs (to the > degree that Red Hat even shipped special kernels for this purpose). I don't think there's much of a use case for allocating a single >2G allocation in a 3G or 4G address space. It has a high chance of failure simply due to virtual memory fragmentation, especially since the kernel's mmap allocation algorithm is so naive (keeps going downwards and ignores holes until it runs out, rather than using first-best-fit). Was the demand for a larger address space or was it really for the ability to allocate all that memory in one go?
[Bug c/67999] Wrong optimization of pointer comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 --- Comment #10 from Daniel Micay --- (In reply to Florian Weimer from comment #7) > If this is not a GCC bug and it is the responsibility of allocators not to > produce huge objects, do we also have to make sure that no object crosses > the boundary between 0x7fff_ and 0x8000_? If pointers are treated > as de-facto signed, this is where signed overflow would occur. No, that's fine. It's the offsets that are treated as ptrdiff_t. Clang/LLVM handle it the same way. There's a very important assumption for optimizations that pointer arithmetic cannot wrap (per the standard) and all offsets are treated as signed integers. AFAIK, `ptr + size` is equivalent to `ptr + (ptrdiff_t)size` in both Clang and GCC. There's documentation on how this is handled in LLVM IR here, specifically the inbounds marker which is added to all standard C pointer arithmetic: http://llvm.org/docs/LangRef.html#getelementptr-instruction I expect GCC works very similarly, but I'm not familiar with the GCC internals. It's not really a compiler bug because the standard allows object size limits, but the compiler and standard C library both need to be aware of those limits and enforce them if they exist. So it's a bug in GCC + glibc or Clang + glibc, not either of them alone. I think dealing with it in libc is the only full solution though due to issues like `p - q` and the usage of ssize_t for sizes in functions like read/write.
[Bug c/67999] Wrong optimization of pointer comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 --- Comment #13 from Daniel Micay --- They'd still be able to make a mmap system call via syscall(...) to avoid the check, so it seems like it's mostly an ABI compatibility issue. Of course, they'd have to be very careful to avoid all of the caveats of a mapping that large too. It could be dealt with as it usually is by making new symbols with the checks to avoid changing anything for old binaries. And yeah, the vanilla kernel ASLR is incredibly weak. It only uses up to 1MiB of virtual memory (8-bit ASLR) rather than 256MiB (16-bit) like PaX.
[Bug c/67999] Wrong optimization of pointer comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 --- Comment #20 from Daniel Micay --- > I think several issues are mixed: A conforming C implementation requires either fixing both the compiler and the libc functions to handle objects larger than PTRDIFF_MAX, or preventing such objects from being allocated via standard mechanisms (and *ideally* documenting the restriction). Since there are many other issues with objects larger than PTRDIFF_MAX (p - q, read/write and similar uses of ssize_t, etc.) and few reasons to allow them, it really makes the most sense to tackle this in libc. > Is it documented? I don't think musl has documentation like that in general. > This terminology is quite misleading. It's neither signed nor unsigned. > Pointer operand is unsigned while offsets are signed. Agreed. > How buggy? Are there bugs filed? Searching for PTRDIFF_MAX finds Zarro Boogs. It hasn't been treated as a systemic issue or considered as something related to PTRDIFF_MAX. You'd need to search for issues like ssize_t overflow to find them. If you really want one specific example, it looks like there's at least one case of `end - start` in stdlib/qsort.c, among other places: `char *mid = lo + size * ((hi - lo) / size >> 1);`. I don't think fixing every usage of `end - start` on arbitrarily sized objects is the right way to go, so it's not something I'd audit for and file bugs about. I did do some testing a while ago by passing objects larger than PTRDIFF_MAX to the standard libc functions taking pointer and size parameters, so I'm aware that it's problematic. > This is the same restriction as in C11 which is satisfied in the provided > example. (If one large number is a problem replace "buf + len" by "(int *)buf > + len / sizeof(int)".) So it should work in Clang? The length is cast to a signed integer of the same size and that negative signed offset is given as an argument to the inbounds GEP instruction, which is undefined since it wraps.
> For this to work a compiler has to support working with huge objects, > right? Well, they might just need a contiguous allocation without needing to use it all at once. It doesn't necessarily require compiler support, but it could easily go wrong without compiler support if the semantics of the implementation aren't clearly laid out (and at the moment they're definitely not documented).
[Bug c/70090] New: add non-constant variant of __builtin_object_size for _FORTIFY_SOURCE and -fsanitize=object-size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70090
Bug ID: 70090
Summary: add non-constant variant of __builtin_object_size for _FORTIFY_SOURCE and -fsanitize=object-size
Product: gcc  Version: unknown  Status: UNCONFIRMED  Severity: enhancement  Priority: P3  Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: danielmicay at gmail dot com
Target Milestone: ---

The __builtin_object_size intrinsic is primarily used for _FORTIFY_SOURCE, where it's used for both the compile-time and runtime checks. However, _FORTIFY_SOURCE would be better served by a more flexible intrinsic able to return a runtime value when possible. For example, consider this code:

    void *p = malloc(n);
    if (!p) { return 1; }
    memset(p, 0, m);

It would be straightforward to catch cases where m > n by replacing __builtin_object_size with a new __builtin_runtime_object_size intrinsic taking advantage of the alloc_size attribute for runtime values. It would still return a constant sentinel when the value is unknown. The fortified functions would use __builtin_constant_p and perform the runtime check if the value is not constant, falling through to the old code with the compile-time checks and fast paths when it's constant. This would make _FORTIFY_SOURCE significantly more useful for dynamic allocations by covering simple cases where the memory is used right away. The same code could also be used to improve -fsanitize=object-size.
[Bug c/67999] Wrong optimization of pointer comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 --- Comment #28 from Daniel Micay --- I got jemalloc / Bionic libc (Android) to report errors for malloc and mmap/mremap larger than PTRDIFF_MAX a while ago (along with fixing a missing case for mremap in musl), but glibc needs to be convinced to do the same. It would be a lot easier to convince them with this officially documented. I think it's perfectly reasonable if it's clearly stated that objects larger than PTRDIFF_MAX are not supported and that the libc implementation needs to deal with it.
[Bug middle-end/70090] add non-constant variant of __builtin_object_size for _FORTIFY_SOURCE and -fsanitize=object-size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70090 --- Comment #2 from Daniel Micay --- This has now been implemented in Clang as __builtin_dynamic_object_size. There's a thread on the GCC mailing list about it at https://www.mail-archive.com/gcc@gcc.gnu.org/msg87230.html.