On Fri, Dec 12, 2025, 8:57 a.m. Krister Walfridsson via Gcc <[email protected]>
wrote:

> On Thu, 11 Dec 2025, Richard Biener wrote:
>
> > Date: Thu, 11 Dec 2025 12:38:43 +0100
> > From: Richard Biener <[email protected]>
> > To: Krister Walfridsson <[email protected]>
> > Cc: [email protected]
> > Subject: Re: pointer comparison in GIMPLE
> >
> > On Thu, Dec 11, 2025 at 5:12 AM Krister Walfridsson
> > <[email protected]> wrote:
> >>
> >> On Wed, 10 Dec 2025, Richard Biener wrote:
> >>
> >>>> The problem is that in GIMPLE, a pointer does not need to be in
> >>>> bounds. The caller could call the function with a value of i such
> >>>> that p + i happens to be equal to &a. So, as I understand it, the
> >>>> GIMPLE semantics do not allow the pass to conclude that p + i == &a
> >>>> is false, unless p + i is dereferenced (because dereferencing a
> >>>> through p + i would be UB due to provenance).
> >>>
> >>> GIMPLE adopts most of the C pointer restrictions here thus we can
> >>> (and do) conclude that pointers stay within an object when advanced.
> >>> This is used by the PTA pass which results are used when we optimize
> >>> your example. You have to divert to integer arithmetic to circumvent
> >>> this and the PTA pass, while tracking provenance through integers as
> >>> well, does the right thing with this.
> >>
> >> Great, that is much better for smtgcc than the semantics I have
> >> currently implemented!
> >>
> >> But it is not completely clear to me what "most of the C pointer
> >> restrictions" implies. Is the following a correct interpretation?
> >>
> >> 1. A pointer must contain a value that points into (or one past) an
> >> object corresponding to its provenance (where a pointer may have
> >> multiple provenances). Otherwise it invokes undefined behavior.
> >
> > Hmm.  I think it's only UB when you'd "use" that pointer.  That is, PTA
> > would compute the points-to set to 'nothing'.  The immediate
> > consequences are such pointer isn't equal to any other pointer and
> > accesses through it alias with nothing, stores would be DSEd.  But at
> > the point a SSA var is assigned such a pointer we couldn't place a
> > trap() (?)
> >
> >> 2. The provenance used for the result of POINTER_PLUS is the union of
> >> the provenances for the two arguments.
> >
> > For POINTER_PLUS it's the provenance of the first argument.
> >
> > For PLUS_EXPR it is the union of both arguments.  For POINTER_DIFF_EXPR
> > the result has no provenance.
> >
> >> 3. The POINTER_PLUS operation is UB if the calculation overflows and
> >> TYPE_OVERFLOW_WRAPS(ptr_type) is false.
> >
> > Yes.
> >
> >> 4. The rules are the same for the calculations done in MEM_REF and
> >> TARGET_MEM_REF as for POINTER_PLUS.
> >
> > Yes.
> >
> >> Question: For the TARGET_MEM_REF calculation:
> >>    BASE + STEP * INDEX + INDEX2 + OFFSET
> >> Is it treated as one POINTER_PLUS, i.e.
> >>    BASE + (STEP * INDEX + INDEX2 + OFFSET)
> >> or as two (i.e. do we care about overflow and OOB between the two index
> >> calculations)?
> >
> > I'd say it counts as one pointer + offset calculation with all the offset
> > calculation being done in wrapping operations.
>
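
For readers following along, here is a rough C sketch of the "one pointer
+ offset calculation" reading described above. It is only an illustration
under stated assumptions, not GCC code: the names tmr_offset, tmr_address
and tmr_demo are made up, size_t stands in for sizetype, and the sketch
models only the integer value of the address, not provenance or any
undefined behavior on the final base + offset step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the offset part of
 *   BASE + (STEP * INDEX + INDEX2 + OFFSET)
 * evaluated in size_t, so every intermediate step wraps modulo 2^N. */
static size_t tmr_offset(size_t step, size_t index,
                         size_t index2, size_t offset)
{
    return step * index + index2 + offset;
}

/* Hypothetical helper: a single base + offset addition, done on the
 * integer representation of the address (assumption: we only care
 * about the numeric value here). */
static uintptr_t tmr_address(uintptr_t base, size_t step, size_t index,
                             size_t index2, size_t offset)
{
    return base + (uintptr_t)tmr_offset(step, index, index2, offset);
}

static void tmr_demo(void)
{
    /* 4 * 3 + 2 + 1 */
    assert(tmr_offset(4, 3, 2, 1) == 15);
    /* The product 2 * SIZE_MAX wraps, yet the total offset is 0. */
    assert(tmr_offset(2, SIZE_MAX, 0, 2) == 0);
    assert(tmr_address(0x1000, 4, 3, 2, 1) == 0x100f);
}
```

The second assertion is the interesting case: an intermediate term wraps,
but under this reading only the final offset matters.
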
> Your answers match exactly what is currently implemented in smtgcc, so I
> am still thinking this is a bug in GCC (or that there is some missing
> GIMPLE rule I must implement).
>
> The original program looks in GIMPLE like:
>
>    void foo (char * p, long long int i)
>    {
>      char a;
>      sizetype i.0_1;
>      char * _2;
>
>      <bb 2> :
>      i.0_1 = (sizetype) i_3(D);
>      _2 = p_4(D) + i.0_1;
>      if (_2 == &a)
>        goto <bb 3>;
>      else
>        goto <bb 4>;
>
>      <bb 3> :
>      __builtin_abort ();
>
>      <bb 4> :
>      a ={v} {CLOBBER(eos)};
>      return;
>    }
>
> Assume, for the sake of argument, that the address of a is 0x2000000, p =
> 0x1000000 and i = 0x1000000.
>
> With the semantics as described in this mail thread, all operations are
> defined:
>   * _2 evaluates to 0x2000000, with the provenance of p (although the
>     provenance is irrelevant in this execution).
>   * The comparison is also defined and evaluates to true.
>   * As a result, the program then calls __builtin_abort, which exits.
>
> A valid optimization must produce the same result (including side effects)
> given the same input for executions where all steps have defined
> semantics.


Except this comparison has an unspecified value, so an optimization is
allowed to change it.
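
As a small illustration of the "divert to integer arithmetic" route
mentioned earlier in the thread, here is a C sketch. It is not taken from
the test case, and the names addr_equal_via_int and addr_equal_demo are
hypothetical. The addresses are converted to uintptr_t first, so the
addition is plain unsigned arithmetic and no out-of-bounds pointer is ever
formed at the C level. The pointer-to-integer conversion itself is
implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: does p advanced by i bytes have the same
 * address as q?  The sum is computed on integers and wraps modulo
 * the width of uintptr_t. */
static int addr_equal_via_int(const char *p, long long i, const char *q)
{
    uintptr_t lhs = (uintptr_t)p + (uintptr_t)i;
    return lhs == (uintptr_t)q;
}

static void addr_equal_demo(void)
{
    char buf[8];

    /* All pointers below stay inside buf, so forming them is fine. */
    assert(addr_equal_via_int(buf, 3, buf + 3));
    assert(!addr_equal_via_int(buf, 3, buf + 4));
    /* A negative i wraps in the conversion and still lands on buf + 3. */
    assert(addr_equal_via_int(buf + 5, -2, buf + 3));
}
```
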


> Therefore, an optimization that does not call __builtin_abort
> for this input is buggy (or the semantics is incorrect).
>
> You said in your second mail: "GIMPLE adopts most of the C pointer
> restrictions here thus we can (and do) conclude that pointers stay within
> an object when advanced.  This is used by the PTA pass which results are
> used when we optimize your example." which is what I tried to reflect in:
>
> >> 1. A pointer must contain a value that points into (or one past) an
> >> object corresponding to its provenance (where a pointer may have
> >> multiple provenances). Otherwise it invokes undefined behavior.
>
> But as you say, that is wrong (and the vectorizer and ifconv do indeed
> perform transformations that would be invalid with this semantics). So
> what is the correct rule here for "pointers stay within an object"? All
> ideas I have tried fail in different ways... :(
>
> ---
>
> I also have a somewhat related question regarding:
>
> >> 2. The provenance used for the result of POINTER_PLUS is the union of
> >> the provenances for the two arguments.
> >
> > For POINTER_PLUS it's the provenance of the first argument.
>
> >> 4. The rules are the same for the calculations done in MEM_REF and
> >> TARGET_MEM_REF as for POINTER_PLUS.
> >
> > Yes.
>
> The ifconv pass sometimes rewrites memory accesses as:
>
>    _84 = &MEM[(float *)0B + _83 + ivtmp.41_75 * 4];
>    MEM[(float *)_84] = _3;
>
> which you can see by compiling testsuite/gcc.dg/sms-11.c for x86_64 with
> -O1.
>
> If TARGET_MEM_REF works like POINTER_PLUS, which does not propagate
> provenance through integers, then this store has no provenance and invokes
> undefined behavior. So is ifopts buggy, or does TARGET_MEM_REF propagate
> provenance from index/offset?
>
>     /Krister
>
>
