On Friday, 12.12.2025 at 09:21 +0700, Jason Merrill via Gcc wrote:
> On Fri, Dec 12, 2025, 8:57 a.m. Krister Walfridsson via Gcc <[email protected]>
> wrote:
> 
> > On Thu, 11 Dec 2025, Richard Biener wrote:
> > 
> > > Date: Thu, 11 Dec 2025 12:38:43 +0100
> > > From: Richard Biener <[email protected]>
> > > To: Krister Walfridsson <[email protected]>
> > > Cc: [email protected]
> > > Subject: Re: pointer comparison in GIMPLE
> > > 
> > > On Thu, Dec 11, 2025 at 5:12 AM Krister Walfridsson
> > > <[email protected]> wrote:
> > > > 
> > > > On Wed, 10 Dec 2025, Richard Biener wrote:
> > > > 
> > > > > > The problem is that in GIMPLE, a pointer does not need to be in
> > > > > > bounds. The caller could call the function with a value of i such
> > > > > > that p + i happens to be equal to &a. So, as I understand it, the
> > > > > > GIMPLE semantics do not allow the pass to conclude that p + i == &a
> > > > > > is false, unless p + i is dereferenced (because dereferencing a
> > > > > > through p + i would be UB due to provenance).
> > > > > 
> > > > > GIMPLE adopts most of the C pointer restrictions here thus we can
> > > > > (and do) conclude that pointers stay within an object when advanced.
> > > > > This is used by the PTA pass which results are used when we optimize
> > > > > your example.  You have to divert to integer arithmetic to circumvent
> > > > > this and the PTA pass, while tracking provenance through integers as
> > > > > well, does the right thing with this.
> > > > 
> > > > Great, that is much better for smtgcc than the semantics I have
> > > > currently implemented!
> > > > 
> > > > But it is not completely clear to me what "most of the C pointer
> > > > restrictions" implies. Is the following a correct interpretation?
> > > > 
> > > > 1. A pointer must contain a value that points into (or one past) an
> > > > object corresponding to its provenance (where a pointer may have
> > > > multiple provenances). Otherwise it invokes undefined behavior.
> > > 
> > > Hmm.  I think it's only UB when you'd "use" that pointer.  That is, PTA
> > > would compute the points-to set to 'nothing'.  The immediate
> > > consequences are such pointer isn't equal to any other pointer and
> > > accesses through it alias with nothing, stores would be DSEd.  But at
> > > the point a SSA var is assigned such a pointer we couldn't place a
> > > trap() (?)
> > > 
> > > > 2. The provenance used for the result of POINTER_PLUS is the union of
> > > > the provenances for the two arguments.
> > > 
> > > For POINTER_PLUS it's the provenance of the first argument.
> > > 
> > > For PLUS_EXPR it is the union of both arguments.  For POINTER_DIFF_EXPR
> > > the result has no provenance.
> > > 
> > > > 3. The POINTER_PLUS operation is UB if the calculation overflows and
> > > > TYPE_OVERFLOW_WRAPS(ptr_type) is false.
> > > 
> > > Yes.
> > > 
> > > > 4. The rules are the same for the calculations done in MEM_REF and
> > > > TARGET_MEM_REF as for POINTER_PLUS.
> > > 
> > > Yes.
> > > 
> > > > Question: For the TARGET_MEM_REF calculation:
> > > >    BASE + STEP * INDEX + INDEX2 + OFFSET
> > > > Is it treated as one POINTER_PLUS, i.e.
> > > >    BASE + (STEP * INDEX + INDEX2 + OFFSET)
> > > > or as two (i.e. do we care about overflow and OOB between the two index
> > > > calculations)?
> > > 
> > > I'd say it counts as one pointer + offset calculation with all the offset
> > > calculation being done in wrapping operations.
> > 
> > Your answers match exactly what is currently implemented in smtgcc, so I
> > am still thinking this is a bug in GCC (or that there is some missing
> > GIMPLE rule I must implement).
> > 
> > The original program looks in GIMPLE like:
> > 
> >    void foo (char * p, long long int i)
> >    {
> >      char a;
> >      sizetype i.0_1;
> >      char * _2;
> > 
> >      <bb 2> :
> >      i.0_1 = (sizetype) i_3(D);
> >      _2 = p_4(D) + i.0_1;
> >      if (_2 == &a)
> >        goto <bb 3>;
> >      else
> >        goto <bb 4>;
> > 
> >      <bb 3> :
> >      __builtin_abort ();
> > 
> >      <bb 4> :
> >      a ={v} {CLOBBER(eos)};
> >      return;
> >    }
> > 
> > Assume, for the sake of argument, that the address of a is 0x2000000, p =
> > 0x1000000 and i = 0x1000000.
> > 
> > With the semantics as described in this mail thread, all operations are
> > defined:
> >   * _2 evaluates to 0x2000000, with the provenance of p (although the
> >     provenance is irrelevant in this execution).
> >   * The comparison is also defined and evaluates to true.
> >   * As a result, the program then calls __builtin_abort, which exits.
> > 
> > A valid optimization must produce the same result (including side effects)
> > given the same input for executions where all steps have defined
> > semantics.
> 
> 
> Except this comparison has an unspecified value, so an optimization is
> allowed to change it.

Is it?  For C, this is defined.
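
For concreteness, here is a minimal C sketch of the execution discussed in
the quoted text, done entirely in integer arithmetic so that no
out-of-bounds pointer is ever formed. The helper name would_abort and the
use of uintptr_t are assumptions for illustration only; they do not come
from the original test case, and the mapping between pointers and integers
is implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original program): models the
   address arithmetic behind `p + i == &a` on plain integers.  Unsigned
   addition wraps, so the sum itself is always well defined. */
static int would_abort(uintptr_t p, long long i, uintptr_t addr_a)
{
    uintptr_t sum = p + (uintptr_t)i;
    return sum == addr_a;   /* 1 if the branch to __builtin_abort is taken */
}
```

With the values from the example (p = 0x1000000, i = 0x1000000, address of
a = 0x2000000) the helper returns 1, which matches the claim that the
comparison evaluates to true at the address level. Whether the compiler
may still fold the pointer comparison is exactly the open question here.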

Martin


> 
> 
> > Therefore, an optimization that does not call __builtin_abort
> > for this input is buggy (or the semantics is incorrect).
> > 
> > You said in your second mail: "GIMPLE adopts most of the C pointer
> > restrictions here thus we can (and do) conclude that pointers stay within
> > an object when advanced.  This is used by the PTA pass which results are
> > used when we optimize your example." which is what I tried to reflect in:
> > 
> > > > 1. A pointer must contain a value that points into (or one past) an
> > > > object corresponding to its provenance (where a pointer may have
> > > > multiple provenances). Otherwise it invokes undefined behavior.
> > 
> > But as you say, that is wrong (and the vectorizer and ifconv do indeed
> > perform transformations that would be invalid with this semantics). So
> > what is the correct rule here for "pointers stay within an object"? All
> > ideas I have tried fail in different ways... :(
> > 
> > ---
> > 
> > I also have a somewhat related question regarding:
> > 
> > > > 2. The provenance used for the result of POINTER_PLUS is the union of
> > > > the provenances for the two arguments.
> > > 
> > > For POINTER_PLUS it's the provenance of the first argument.
> > 
> > > > 4. The rules are the same for the calculations done in MEM_REF and
> > > > TARGET_MEM_REF as for POINTER_PLUS.
> > > 
> > > Yes.
> > 
> > The ifconv pass sometimes rewrites memory accesses as:
> > 
> >    _84 = &MEM[(float *)0B + _83 + ivtmp.41_75 * 4];
> >    MEM[(float *)_84] = _3;
> > 
> > which you can see by compiling testsuite/gcc.dg/sms-11.c for x86_64 with
> > -O1.
> > 
> > If TARGET_MEM_REF works like POINTER_PLUS, which does not propagate
> > provenance through integers, then this store has no provenance and invokes
> > undefined behavior. So is ifopts buggy, or does TARGET_MEM_REF propagate
> > provenance from index/offset?
> > 
> >     /Krister
> > 
> > 
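
The TARGET_MEM_REF address form quoted above, BASE + (STEP * INDEX +
INDEX2 + OFFSET), can be sketched in C as follows. This only models the
reading given in the thread (the whole offset is computed in wrapping
arithmetic, then added to the base in one step); the function names
tmr_offset and tmr_address are hypothetical, and addresses are modelled as
integers rather than pointers, so provenance is not represented at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the offset part: STEP * INDEX + INDEX2 + OFFSET.
   size_t is unsigned, so every intermediate result wraps instead of
   overflowing. */
static size_t tmr_offset(size_t step, size_t index,
                         size_t index2, size_t offset)
{
    return step * index + index2 + offset;
}

/* Hypothetical model of the full address: a single base + offset
   addition, performed on integers rather than on pointers. */
static uintptr_t tmr_address(uintptr_t base, size_t step, size_t index,
                             size_t index2, size_t offset)
{
    return base + (uintptr_t)tmr_offset(step, index, index2, offset);
}
```

For example, tmr_address(0x1000, 4, 3, 8, 16) gives 0x1024, and
tmr_offset(1, SIZE_MAX, 1, 0) wraps to 0 rather than being an error. With
a base of 0, as in the quoted &MEM[(float *)0B + _83 + ivtmp.41_75 * 4],
the entire address comes from the index and offset operands, which is why
the provenance question arises.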
