[Bug middle-end/66479] -fstack-check doesn't prevent stack allocation with size -1

2015-08-18 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66479

Daniel Micay danielmicay at gmail dot com changed:

   What|Removed |Added

 CC||danielmicay at gmail dot com

--- Comment #4 from Daniel Micay danielmicay at gmail dot com ---
It's not a configuration issue. The -fstack-check feature is supposed to
guarantee that the guard page is hit if one is present, regardless of the size
of the stack or of the stack allocation.
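
To make the expectation concrete, a hypothetical sketch (not the test case from
this report): with -fstack-check, allocating the VLA below should probe the
stack and fault on the guard page, instead of moving the stack pointer past it
and letting the later write land in some unrelated mapping.

#include <string.h>

int consume(size_t n) {
    char buf[n];            /* n may be (size_t)-1, i.e. "size -1" */
    memset(buf, 0, 4096);   /* never reached if the probes hit the guard page */
    return buf[0];
}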


[Bug target/67265] [x86] 'asm' operand has impossible constraints with -fstack-check

2015-08-18 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67265

Daniel Micay danielmicay at gmail dot com changed:

   What|Removed |Added

 CC||danielmicay at gmail dot com

--- Comment #3 from Daniel Micay danielmicay at gmail dot com ---
It only temporarily needs a register to perform a write. The implementation
might currently have to reserve one, but it *shouldn't* have to. The feature
should be essentially free other than guaranteeing a one-byte write per page.
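
As a rough illustration of what the probing amounts to (a sketch only, not
GCC's actual probe sequence):

/* Sketch only, not GCC's emitted code: probing is one write per page of the
 * newly allocated stack area, so nothing needs to stay reserved across the
 * rest of the function. */
static void probe_pages(volatile char *base, unsigned long bytes) {
    for (unsigned long off = 0; off < bytes; off += 4096)
        base[off] = 0;      /* one-byte write per 4 KiB page */
}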


[Bug c/68065] Size calculations for VLAs can overflow

2015-10-27 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065

Daniel Micay  changed:

   What|Removed |Added

 CC||danielmicay at gmail dot com

--- Comment #9 from Daniel Micay  ---
> VLA also does not detect stack overflows either.

Stack overflows are detected with -fstack-check, or at least they would be if
the option worked properly: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66479
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65958

I've always found it quite bad that, with GCC, well-defined code can actually be
exploited (arbitrary write vulnerabilities) because -fstack-check is not enabled
by default. MSVC++ and Clang on Windows guarantee that stack overflows from
well-defined code (large stack frames, VLAs) will be caught. However, the switch
seems to cause a significant performance hit for functions where it triggers
(which are rare but sometimes performance critical; a good example is jemalloc's
rbtree implementation, which uses arrays rather than recursion) and compatibility
issues due to the way it's currently implemented:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67265


[Bug sanitizer/68065] Size calculations for VLAs can overflow

2015-11-11 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065

--- Comment #25 from Daniel Micay  ---
> Just as GCC on Windows...

Sure. I'm pointing out that Windows has had safety here for years, while Linux
doesn't. It should. It really shouldn't be possible to exploit well-defined
code. Running out of resources and aborting isn't ideal but that's at least
sane and doing better doesn't seem possible for C.

> Figures please, otherwise that's just FUD as usual.

... pointing out that something isn't implemented ideally is FUD now? If it had
no significant performance hit (which should be the case for optimized C,
because it shouldn't need to reserve a register), then it would surely be
enabled by default.

We tried enabling it in Arch Linux but it had to be backed out due to
performance concerns. Some compatibility issues came up (due to inline
assembly), and investigation then showed that the performance hit wasn't
really negligible, especially on i686. Among other things, it causes a
significant performance hit (over 5% slower in a malloc micro-benchmark on
x86_64, more on i686) for jemalloc, which has stack frames large enough to
trigger it and where essentially all of the code is inlined. The hit is
usually pretty small... but a lot more than it should be.

Anyway, I was just trying to be helpful. I'm only really interested in Android
these days so GCC isn't really something I care about... I just happened to
have thoughts about this stuff because I worked on similar issues in the Rust
compiler / standard libraries (Rust is why LLVM is getting proper stack
checking at all, Clang implements -fstack-check as a no-op right now for
'compatibility') and recently Bionic.

[Bug sanitizer/68065] Size calculations for VLAs can overflow

2015-11-10 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065

--- Comment #17 from Daniel Micay  ---
It's well-defined C code though. It shouldn't be possible for anything to go
wrong here when using -fstack-check, i.e. it should be guaranteed to trigger a
stack overflow which is caught. The size wrapping back around is a different
issue.
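
To illustrate the wrapping case (a hypothetical example, not the test case
from this report):

#include <stddef.h>

void touch(size_t n) {      /* e.g. n slightly above 2^32 on a 64-bit target */
    char vla[n][n];         /* the n * n size computation can wrap around
                               size_t, leaving the VLA far smaller than its
                               type implies */
    vla[0][0] = 1;
}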

[Bug sanitizer/68065] Size calculations for VLAs can overflow

2015-11-10 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065

--- Comment #18 from Daniel Micay  ---
Also, it's possible to use segmented stacks to avoid stack overflow and this
probably breaks in that context too. That's a very niche use case compared to
-fstack-check though.

[Bug sanitizer/68065] Size calculations for VLAs can overflow

2015-11-10 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68065

--- Comment #21 from Daniel Micay  ---
(In reply to jos...@codesourcery.com from comment #20)
> Undefined behavior when the type is created (not when an object of that 
> type is declared or when sizeof is used) seems entirely in accordance with 
> normal C practice in areas such as stack overflow[*] (that is, where the C 
> standard fails to recognize limits in such areas but all implementations 
> in practice have such limits, that's a defect in the C standard).

Stack overflow is undefined with GCC, but MSVC++ and Clang on Windows guarantee
that it will be caught if the program doesn't invoke any truly undefined
behavior. Clang will be getting an implementation for other platforms soon, and
it will probably end up being enabled by default since it really has no
significant overhead.

The implementation of -fstack-check in GCC does have significant overhead, but
it doesn't have to be that way. It shouldn't go out of its way to provide a
proper stack trace with -O2/-O3 (or whatever other reasons it has for the slow
implementation).

[Bug c/67999] Wrong optimization of pointer comparisons

2015-10-18 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999

Daniel Micay  changed:

   What|Removed |Added

 CC||danielmicay at gmail dot com

--- Comment #5 from Daniel Micay  ---
An implementation can have object size limits, but I think it's incorrect if
those limits are not enforced for all standard ways of allocating objects.
Objects larger than PTRDIFF_MAX are forbidden with musl (malloc, mmap and
friends report ENOMEM, and it's explicitly undefined to create them in another
way and use them with libc), and it would make a lot of sense for glibc to do
the same thing. I recently landed the same feature in Android's Bionic libc for
mmap.

Since glibc's implementations of standard library functions don't handle
objects larger than PTRDIFF_MAX, this can definitely be considered a buggy
area in glibc. The typical issue is `end - start`, but compilers considering
addition to be undefined in this case isn't surprising either.

FWIW, Clang also treats `ptr + size` with `size > PTRDIFF_MAX` as undefined for
standard C pointer arithmetic because the underlying getelementptr instruction
in LLVM is inherently a signed arithmetic operation. Clang marks standard C
pointer arithmetic operations as "inbounds", which among other things turns
using a value produced by wrapping into undefined behavior. Last time I
checked, the non-standard GNU C `void *` arithmetic doesn't get tagged as
"inbounds" by Clang, so wrapping is well-defined for that.


[Bug c/67999] Wrong optimization of pointer comparisons

2015-10-19 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999

--- Comment #14 from Daniel Micay  ---
(In reply to Florian Weimer from comment #12)
> (In reply to Daniel Micay from comment #10)
> > (In reply to Florian Weimer from comment #7)
> > > If this is not a GCC bug and it is the responsibility of allocators not to
> > > produce huge objects, do we also have to make sure that no object crosses
> > > the boundary between 0x7fff_ffff and 0x8000_0000?  If pointers are treated
> > > as de-facto signed, this is where signed overflow would occur.
> > 
> > No, that's fine.
> 
> Is this based on your reading of the standard, the GCC sources, or both? 
> (It is unusual to see people making such definite statements about
> middle-end/back-end behavior, that's why I have to ask.)

It's not the kind of thing the standard is concerned with: it'd be perfectly
valid for an implementation to forbid that, as long as it was enforced
throughout the implementation. It would just be a crazy requirement to impose,
though. As far as I know, the use of signed offsets for pointer arithmetic
in GCC is just a design decision with known consequences. That's definitely the
case in LLVM, since it's very explicitly documented as being a signed offset
with undefined overflow.


[Bug c/67999] Wrong optimization of pointer comparisons

2015-10-19 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999

--- Comment #15 from Daniel Micay  ---
I.e., AFAIK the offsets are intended to be treated as signed, but treating
pointers themselves as signed would be a serious bug rather than a design choice.


[Bug c/67999] Wrong optimization of pointer comparisons

2015-10-19 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999

--- Comment #9 from Daniel Micay  ---
(In reply to Florian Weimer from comment #8)
> (In reply to Alexander Cherepanov from comment #4)
> 
> > Am I right that the C standards do not allow for such a limitation (and
> > hence this should not be reported to glibc as a bug) and gcc is not
> > standards-compliant in this regard? Or I'm missing something?
> 
> The standard explicitly acknowledges the possibility of arrays that have
> more than PTRDIFF_MAX elements (it says that the difference of two pointers
> within the same array is not necessarily representable in ptrdiff_t).
> 
> I'm hesitant to put in artificial limits into glibc because in the past,
> there was significant demand for huge mappings in 32-bit programs (to the
> degree that Red Hat even shipped special kernels for this purpose).

I don't think there's much of a use case for a single >2G allocation in a 3G
or 4G address space. It has a high chance of failure simply due to virtual
memory fragmentation, especially since the kernel's mmap allocation algorithm
is so naive (it keeps going downwards and ignores holes until it runs out,
rather than using first-best-fit).

Was the demand for a larger address space or was it really for the ability to
allocate all that memory in one go?


[Bug c/67999] Wrong optimization of pointer comparisons

2015-10-19 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999

--- Comment #10 from Daniel Micay  ---
(In reply to Florian Weimer from comment #7)
> If this is not a GCC bug and it is the responsibility of allocators not to
> produce huge objects, do we also have to make sure that no object crosses
> the boundary between 0x7fff_ffff and 0x8000_0000?  If pointers are treated
> as de-facto signed, this is where signed overflow would occur.

No, that's fine. It's the offsets that are treated as ptrdiff_t. Clang/LLVM
handle it the same way. There's a very important assumption for optimizations
that pointer arithmetic cannot wrap (per the standard) and all offsets are
treated as signed integers. AFAIK, `ptr + size` is equivalent to `ptr +
(ptrdiff_t)size` in both Clang and GCC.

There's documentation on how this is handled in LLVM IR here, specifically the
inbounds marker which is added to all standard C pointer arithmetic:

http://llvm.org/docs/LangRef.html#getelementptr-instruction

I expect GCC works very similarly, but I'm not familiar with the GCC internals.

It's not really a compiler bug because the standard allows object size limits,
but the compiler and standard C library both need to be aware of those limits
and enforce them if they exist. So it's a bug in GCC + glibc or Clang + glibc,
not either of them alone. I think dealing with it in libc is the only full
solution though due to issues like `p - q` and the usage of ssize_t for sizes
in functions like read/write.
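
A minimal sketch of the libc-side approach (a hypothetical wrapper, not actual
glibc/musl code):

#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: refusing objects larger than PTRDIFF_MAX keeps
   p - q and ssize_t-returning interfaces like read/write well-defined for
   everything the allocator hands out. */
void *checked_malloc(size_t n) {
    if (n > PTRDIFF_MAX) {
        errno = ENOMEM;
        return NULL;
    }
    return malloc(n);
}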


[Bug c/67999] Wrong optimization of pointer comparisons

2015-10-19 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999

--- Comment #13 from Daniel Micay  ---
They'd still be able to make an mmap system call via syscall(...) to avoid the
check, so it seems like it's mostly an ABI compatibility issue. Of course,
they'd have to be very careful to avoid all of the caveats of a mapping that
large too. It could be dealt with as it usually is, by adding new symbols with
the checks, to avoid changing anything for old binaries.

And yeah, the vanilla kernel ASLR is incredibly weak. It only uses up to 1MiB
of virtual memory (8-bit ASLR) rather than 256MiB (16-bit) like PaX.


[Bug c/67999] Wrong optimization of pointer comparisons

2015-10-20 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999

--- Comment #20 from Daniel Micay  ---
> I think several issues are mixed:

A conforming C implementation requires either that both the compiler and the
libc functions handle > PTRDIFF_MAX objects or that such objects cannot be
allocated via standard mechanisms (and *ideally* the restriction should be
documented). Since there are many other issues with > PTRDIFF_MAX objects
(p - q, read/write and similar uses of ssize_t, etc.) and few reasons to allow
them, it really makes the most sense to tackle this in libc.

> Is it documented?

I don't think musl has documentation like that in general.

> This terminology is quite misleading. It's neither signed nor unsigned. 
> Pointer operand is unsigned while offsets are signed.

Agreed.

> How buggy? Are there bugs filed? Searching for PTRDIFF_MAX finds Zarro Boogs.

It hasn't been treated as a systemic issue or considered something related
to PTRDIFF_MAX. You'd need to search for issues like ssize_t overflow to find
them. If you really want one specific example, it looks like there's at least
one case of `end - start` in stdlib/qsort.c, among other places
(`char *mid = lo + size * ((hi - lo) / size >> 1);`). I don't think fixing
every usage of `end - start` on arbitrarily sized objects is the right way to
go, so it's not something I'd audit for and file bugs about.

I did do some testing a while ago by passing objects larger than PTRDIFF_MAX
to the standard libc functions that take pointer and size parameters, so I'm
aware that it's problematic.

> This is the same restriction as in C11 which is satisfied in the provided 
> example. (If one large number is a problem replace "buf + len" by "(int *)buf 
> + len / sizeof(int)".) So it should work in Clang?

The length is cast to a signed integer of the same size (so a length above
PTRDIFF_MAX becomes negative), and that negative signed offset is given as an
argument to the inbounds GEP instruction, which is undefined since it wraps.

> For this to work a compiler have to support for working with huge objects, 
> right?

Well, they might just need a contiguous allocation without actually needing to
use it all at once. It doesn't necessarily require compiler support, but it
could easily go wrong without it if the semantics of the implementation aren't
clearly laid out (and at the moment they're definitely not documented).


[Bug c/70090] New: add non-constant variant of __builtin_object_size for _FORTIFY_SOURCE and -fsanitize=object-size

2016-03-05 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70090

Bug ID: 70090
   Summary: add non-constant variant of __builtin_object_size for
_FORTIFY_SOURCE and -fsanitize=object-size
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: danielmicay at gmail dot com
  Target Milestone: ---

The __builtin_object_size intrinsic is primarily used for _FORTIFY_SOURCE,
where it drives both the compile-time and runtime checks. However,
_FORTIFY_SOURCE would be better served by a more flexible intrinsic able to
return a runtime value when possible.

For example, consider this code:

void *p = malloc(n);
if (!p) {
    return 1;
}
/* If m > n, this writes past the end of the allocation. */
memset(p, 0, m);

It would be straightforward to catch cases where m > n by replacing
__builtin_object_size with a new __builtin_runtime_object_size intrinsic taking
advantage of the alloc_size attribute for runtime values. It would still return
a constant sentinel when the value is unknown. The fortified functions would
use __builtin_constant_p and perform the runtime check if the value is not
constant, falling through to the old code with the compile-time checks and fast
paths when it's constant. This would make _FORTIFY_SOURCE significantly more
useful for dynamic allocations by covering simple cases where the memory is
used right away.
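
A hedged sketch of that scheme (__builtin_runtime_object_size is the name
proposed here, not an existing builtin; __builtin___memset_chk stands in for
the current fortified path):

#include <string.h>

static inline void *memset_fortified(void *p, int c, size_t n) {
    size_t bos = __builtin_runtime_object_size(p, 0);  /* hypothetical intrinsic */
    if (!__builtin_constant_p(bos)) {
        /* Bound only known at run time: check it here. */
        if (n > bos)
            __builtin_trap();
        return memset(p, c, n);
    }
    /* Constant result (possibly the unknown sentinel): fall through to the
       existing compile-time checked fast path. */
    return __builtin___memset_chk(p, c, n, bos);
}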

The same code could also be used to improve -fsanitize=object-size.

[Bug c/67999] Wrong optimization of pointer comparisons

2018-02-15 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999

--- Comment #28 from Daniel Micay  ---
I got jemalloc / Bionic libc (Android) to report errors for malloc and
mmap/mremap requests larger than PTRDIFF_MAX a while ago (along with fixing a
missing case for mremap in musl), but glibc needs to be convinced to do the
same. It would be a lot easier to convince them if this were officially
documented. I think it's perfectly reasonable if it's clearly stated that
objects larger than PTRDIFF_MAX are not supported and that the libc
implementation needs to deal with it.

[Bug middle-end/70090] add non-constant variant of __builtin_object_size for _FORTIFY_SOURCE and -fsanitize=object-size

2019-02-14 Thread danielmicay at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70090

--- Comment #2 from Daniel Micay  ---
This has now been implemented in Clang as __builtin_dynamic_object_size.
There's a thread on the GCC mailing list about it at
https://www.mail-archive.com/gcc@gcc.gnu.org/msg87230.html.
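
For reference, a small usage sketch (assuming a Clang version that supports
the builtin):

#include <stdlib.h>
#include <string.h>

int fill(size_t n, size_t m) {
    char *p = malloc(n);
    if (!p)
        return 1;
    /* Unlike __builtin_object_size, this can return a value computed at run
       time, e.g. the n passed to malloc, so m > n can be caught dynamically. */
    size_t bos = __builtin_dynamic_object_size(p, 0);
    if (bos != (size_t)-1 && m > bos)   /* (size_t)-1 means "unknown" */
        abort();
    memset(p, 0, m);
    free(p);
    return 0;
}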