[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2022-01-03 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #22 from rguenther at suse dot de  ---
On Mon, 13 Dec 2021, hubicka at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663
> 
> --- Comment #20 from Jan Hubicka  ---
> I really think with -fdelete-null-pointer-checks we should optimize away the
> pointer adjustment relying on the fact that program will segfault.
> 
> I was wondering, -fdelete-null-pointer-checks currently requires pointer to be
> precisely 0.  We are already iffy here since the access is at non-0 offset, 
> but
> since infer_nonnull_range_by_dereference uses check_loadstore:
> 
> static bool
> check_loadstore (gimple *, tree op, tree, void *data)
> {
>   if (TREE_CODE (op) == MEM_REF || TREE_CODE (op) == TARGET_MEM_REF)
> {
>   /* Some address spaces may legitimately dereference zero.  */
>   addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (op));
>   if (targetm.addr_space.zero_address_valid (as))
> return false;
> 
>   return operand_equal_p (TREE_OPERAND (op, 0), (tree)data, 0);
> }
>   return false;
> }
> 
> which completely ignores MEM_REF_OFFSET we actually turn into trap accesses
> that are arbitrarily far from NULL.  We also ignore handled components so we

I think MEM_REF[(void *)0 + 4] is non-canonical (IIRC we "simplify" that 
to MEM_REF[(void *)4])

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2021-12-13 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #21 from Jeffrey A. Law  ---
I don't think there's anything inherently wrong with treating 0 + small offset
similarly to 0 when it comes to -fdelete-null-pointer-checks.   I suspect it'll
break some poorly written low level code, though we fixed up some of that a
year or so ago in response to some of Martin's diagnostic work which was
flagging 0 + offset as problematical.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2021-12-13 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #20 from Jan Hubicka  ---
I really think with -fdelete-null-pointer-checks we should optimize away the
pointer adjustment relying on the fact that program will segfault.

I was wondering, -fdelete-null-pointer-checks currently requires pointer to be
precisely 0.  We are already iffy here since the access is at non-0 offset, but
since infer_nonnull_range_by_dereference uses check_loadstore:

static bool
check_loadstore (gimple *, tree op, tree, void *data)
{
  if (TREE_CODE (op) == MEM_REF || TREE_CODE (op) == TARGET_MEM_REF)
{
  /* Some address spaces may legitimately dereference zero.  */
  addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (op));
  if (targetm.addr_space.zero_address_valid (as))
return false;

  return operand_equal_p (TREE_OPERAND (op, 0), (tree)data, 0);
}
  return false;
}

which completely ignores MEM_REF_OFFSET we actually turn into trap accesses
that are arbitrarily far from NULL.  We also ignore handled components so we
miss this for example for variable array accesses I think.

However if we had --param null-pointer-zone defaulting to say 4k (a page size)
we could optimize all accesses that are near the NULL pointer.  This would let
us to optimize this correctly and also i.e. simplify ipa-pta that currently
special cases 0 as null but thinks that any other constat may point to any
global variable.  Small constants are common so this should optimize
noticeably.

For clang binary there are really many traps added by this logic that makes
code noticeably uglier than what clang generates on its own sources.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2021-12-13 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

Jan Hubicka  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #19 from Jan Hubicka  ---
*** Bug 103674 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2021-01-07 Thread law at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #18 from Jeffrey A. Law  ---
Jon, there's no way for the optimizers to improve the to_derived_bad case as
there's nothing in the IL after we leave the front-end that's useful.  In the
.original dump we have:

;; Function derived& to_derived_bad(base2*) (null)
;; enabled by -tree-original


return  = b != 0B ? (struct derived &) b + 18446744073709551612 : 0;


There's just nothing the optimizers can do with that.  The front-end would have
to provide more information or remove the check itself (as is done for the
to_derived_good case which has this .original dump):

;; Function derived& to_derived_good(base2*) (null)
;; enabled by -tree-original


return  = (struct derived &) NON_LVALUE_EXPR  +
18446744073709551612;

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2021-01-07 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

Jonathan Wakely  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-01-07
 Status|UNCONFIRMED |NEW

--- Comment #17 from Jonathan Wakely  ---
Copied from the PR 98501 dup:

Consider this code:

struct base1 { int a; };
struct base2 { int b; };
struct derived : base1, base2 {};

derived& to_derived_bad(base2* b)
{
return *static_cast(b);
}

derived& to_derived_good(base2* b)
{
return static_cast(*b);
}

I believe both of these functions are functionally equivalent and should
generate the same code. Both functions cast pointer from base to derived if it
is not nullptr and both cause undefined behavior if it is nullptr.

GCC optimizes to_derived_good() to a single subtraction, but it inserts
nullptr-check into to_derived_bad():

to_derived_good(base2*):
lea rax, [rdi-4]
ret
to_derived_bad(base2*):
lea rax, [rdi-4]
test rdi, rdi
mov edx, 0
cmove rax, rdx
ret

Could GCC omit the nullptr-check in to_derived_bad?

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2021-01-07 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

Jonathan Wakely  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #16 from Jonathan Wakely  ---
*** Bug 98501 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-27 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #15 from Jakub Jelinek  ---
In the particular case we are talking about, security/non-security, it doesn't
make sense to do anything but optimize it into straight line code, any
__builtin_trap or similar will just hurt.  If you feel e.g. by default adding
__builtin_unreachable is too dangerous in some cases, it can just handle
similar cases manually and optimize away the conditional if there are no
side-effect statements in between.
We are talking about:
   [local count: 1073741824]:
  if (base_2(D) != 0B)
goto ; [70.00%]
  else
goto ; [30.00%]

   [local count: 751619281]:
  iftmp.1_3 = base_2(D) + (sizetype)-4;

   [local count: 1073741824]:
  # iftmp.1_1 = PHI 
  _5 = MEM[(const struct Derived *)iftmp.1_1].D.2340.y;
where without -fno-delete-null-pointer-checks and without -fwrapv-pointer, we
can assume: 1) pointers in valid programs don't wrap 2) the first page is not
mapped
As offsetof (Derived, D.2340.y) is >= 4 and < 4096 here, we don't need to even
care about pointer wrapping, just rely on accesses to 0 .. 4095 offsets to
trap.
If the offsetof would be 0, it would be about pointer wrapping, whether we are
ok if instead of dereferencing *(int *)0 we dereference *(int *)-4 instead.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-27 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #14 from Jeffrey A. Law  ---
The reason for turning the dereference into a trap is because we can clean up
ancillary computations -- the address computation, the RHS of a store, and the
like via standard dead code elimination algorithms.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-27 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #13 from Jeffrey A. Law  ---
Marc,

Yes, absolutely.  In fact, I think that falls out of the work Martin S is doing
in this space.  Conceptually we're looking to generalize that code so that we
can route more cases where the compiler detects undefined behavior through the
path isolation routines.

Within those commonized routines we want to have a knob which selects different
behavior from the compiler when undefined behavior is detected which could
potentially include issuing the RTL equivalent of __builtin_unreachable vs trap
vs warn, but leave the code alone, try to mitigate, etc.

Where I think we've differed in the past is what to do with conditional branch
upon which the undefined behavior is control dependent upon.  As you may
remember, the original submission of path isolation would turn that conditional
into an unconditional branch to the valid path.  That's not correct because
there can be observable behavior that occurs on the path from the conditional,
but before the undefined behavior triggers.  Having a knob to twiddle *that*
may or may not be a good idea.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-27 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #12 from Marc Glisse  ---
(In reply to Jeffrey A. Law from comment #10)
> __builtin_trap emits an actual trap into the instruction stream which halts
> the process immediately which is *much* better from a security standpoint

Regardless of what the default is, I think we should be able to agree that
there are uses where we want to favor hardening/security (public facing
servers, web browsers), and others where performance is more important
(scientific simulations), and it would be nice to give users a choice.
(I think sanitizers already provide a way to turn __builtin_unreachable into
__builtin_trap, but that's more meant for explicit __builtin_unreachable in
user code)

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-26 Thread jzwinck at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #11 from John Zwinck  ---
Jeffrey, when I compile with -fno-isolate-erroneous-paths-dereference I am
asking the compiler not to insert traps.  I filed this issue in hopes that GCC
can better optimize when it is told not to insert traps.

I think the default behavior of inserting a trap is sort of OK ("sort of"
because it seems pointless to trap when not trapping would dereference 0x0,
which in my mind is also a good way to end a program that dereferences a null
pointer).  But when I tell GCC not to insert traps, I definitely want the
simplest, fastest code.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-26 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

Jeffrey A. Law  changed:

   What|Removed |Added

 CC||law at redhat dot com

--- Comment #10 from Jeffrey A. Law  ---
trap is much stronger than an unreachable.

If you hit a gcc_unreachable location at runtime, execution just continues
onward with no indication something terrible has happened.  It's literally just
a marker in our IL and results in no generated code.  I think it's
fundamentally broken from a security standpoint.

__builtin_trap emits an actual trap into the instruction stream which halts the
process immediately which is *much* better from a security standpoint

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-15 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek  ---
(In reply to rguent...@suse.de from comment #8)
> -fdelete-null-pointer-checks optimizes NULL pointer checks _after_ a
> dereference.  This case is first checking and then dereferencing.
> GCC sees

-fdelete-null-pointer-checks is just an option that says that it is ok to
optimize null pointer dereferences.
So, I don't see why we couldn't optimize this out.
-fisolate-erroneous-paths-dereference is documented to add __builtin_trap in
there instead of __builtin_unreachable, I'd say we want (by default) something
that will add __builtin_unreachable in those cases instead.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-15 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #8 from rguenther at suse dot de  ---
On Mon, 15 Jun 2020, redi at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663
> 
> --- Comment #7 from Jonathan Wakely  ---
> So yes, the static_cast should evaluate to zero,

Thanks for the clarification.

> but if it's followed by a
> dereference then it seems reasonable to expect -fdelete-null-pointer-checks to
> optimize away the handling for zero.

-fdelete-null-pointer-checks optimizes NULL pointer checks _after_ a
dereference.  This case is first checking and then dereferencing.
GCC sees

   [local count: 1073741824]:
  if (base_2(D) != 0B)
goto ; [70.00%]
  else
goto ; [30.00%]

   [local count: 751619278]:
  iftmp.1_3 = base_2(D) + 18446744073709551612;

   [local count: 1073741824]:
  # iftmp.1_1 = PHI 
  _5 = MEM[(int *)iftmp.1_1 + 4B];

in getter().  So this is more about if-conversion or in the GCC
case path isolation where I'd actually question that
MEM[0B + 4B] is an "obvious" null pointer dereference ;)
It's more obvious in the IL for field():

   [local count: 1073741824]:
  if (base_2(D) != 0B)  
goto ; [70.00%]
  else  
goto ; [30.00%]

   [local count: 751619278]:
  iftmp.0_3 = base_2(D) + 18446744073709551612; 

   [local count: 1073741824]: 
  # iftmp.0_1 = PHI 
  _5 = iftmp.0_1->D.2309.y;   

but my point is I guess that the C++ FE has a more "native"
view of this and should maybe elide the NULL pointer check
when the static_cast is dereferenced.  Because eliding the
check and isolating the (not NULL!) dereference is sth
different.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #7 from Jonathan Wakely  ---
So yes, the static_cast should evaluate to zero, but if it's followed by a
dereference then it seems reasonable to expect -fdelete-null-pointer-checks to
optimize away the handling for zero.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #6 from Jonathan Wakely  ---
Dereferencing in the null case is undefined.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-15 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #5 from rguenther at suse dot de  ---
On Mon, 15 Jun 2020, redi at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663
> 
> --- Comment #3 from Jonathan Wakely  ---
> Or to be more clear:
> 
> struct Large {
>   char pad[1024*1024];
>   int x;
> };
> 
> Large* p = 0;
> int i = p->x;

Sure, but this isn't the same if C++ mandates the static_cast
of a null evaluates to null and not the offset of the base class.
So what clang does is not unsafe but wrong since the offset is
missing and it returns Base1::x instead of Base2::y?

Note for getter and clang I see

_Z6getterP5Base2:   # @_Z6getterP5Base2
.cfi_startproc
# %bb.0:
leaq-4(%rdi), %rax
testq   %rdi, %rdi
cmoveq  %rdi, %rax
movl4(%rax), %eax
retq

So either static_cast(base) should evaluate to zero
or not.  If it does then clang dereferences the wrong address
in the null case (but only in 'field').

So, what does C++ say here?

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-15 Thread jzwinck at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #4 from John Zwinck  ---
Richard Biener said:

> Note it will make a difference for very large objects (and thus very large 
> offsets added) which may end up accessing actually mapped memory so IMHO what 
> clang does by default is a security risk.

I am a bit confused by this statement, as GCC turns what would have been a load
from address zero into a load from a non-zero address.  Here's a demo inspired
by Mr Wakely's example: https://godbolt.org/z/EvMpyz

Maybe I'm misreading the output, or you, but Clang's generated code looks safer
to me.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #3 from Jonathan Wakely  ---
Or to be more clear:

struct Large {
  char pad[1024*1024];
  int x;
};

Large* p = 0;
int i = p->x;

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

--- Comment #2 from Jonathan Wakely  ---
(In reply to Richard Biener from comment #1)
> I suppose the C++ standard says static_cast(nullptr) == nullptr
> and
> we literally follow that.  Note it will make a difference for very large
> objects (and thus very large offsets added) which may end up acccessing
> actually
> mapped memory so IMHO what clang does by default is a security risk.

Is that any worse than this though?

LargeObject* p = nullptr;
p->foo();

Adding a static_cast doesn't seem to be the problem here.

[Bug tree-optimization/95663] static_cast checks for null even when the pointer is dereferenced

2020-06-15 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95663

Richard Biener  changed:

   What|Removed |Added

 CC||law at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org
  Component|c++ |tree-optimization
   Keywords||missed-optimization
 Target||x86_64-*-* i?86-*-*

--- Comment #1 from Richard Biener  ---
I suppose the C++ standard says static_cast(nullptr) == nullptr and
we literally follow that.  Note it will make a difference for very large
objects (and thus very large offsets added) which may end up acccessing
actually
mapped memory so IMHO what clang does by default is a security risk.

Now, what we should eventually improve is the code generated in the isolated
path.  On GIMPLE we retain the load:

   [count: 0]:
  _7 ={v} MEM[(struct Derived *)0B].D.2340.y;
  __builtin_trap ();

(because it can trap).  The use of the cold section for the
ud2 is probably also bad since it will cause a larger jump instruction
where very likely

testq %rdi, %rdi
jne .L2
ud2
.L2:
movl (%rdi), ...

would be both faster and smaller.  For reference the generated code:

_Z5fieldP5Base2:
.LFB1:
.cfi_startproc
testq   %rdi, %rdi
je  .L2
movl(%rdi), %eax
ret
.cfi_endproc
.section.text.unlikely
.cfi_startproc
.type   _Z5fieldP5Base2.cold, @function
_Z5fieldP5Base2.cold:
.LFSB1:
.L2:
movl4, %eax
ud2


CCing Jeff for the RTL side representation - IIRC we have some special
CFG magic for gcc_unreachable, not sure if what we end up with
trap() matches that or if we should adjust this somehow.  Currently DCE
marks the load as always necessary because it seems isolate-paths makes the
load
volatile:

  /* We want the NULL pointer dereference to actually occur so that
 code that wishes to catch the signal can do so.

...

fair enough - but as you see above we dereference not NULL but some
derived constant which might not actually trap.  I wonder if it is
more useful/safe to replace the load with a plain *(char *)0?

Note I don't think what clang does here is reasonable.