On Thu, 11 Dec 2025, Richard Biener wrote:
Date: Thu, 11 Dec 2025 12:38:43 +0100
From: Richard Biener <[email protected]>
To: Krister Walfridsson <[email protected]>
Cc: [email protected]
Subject: Re: pointer comparison in GIMPLE
On Thu, Dec 11, 2025 at 5:12 AM Krister Walfridsson
<[email protected]> wrote:
On Wed, 10 Dec 2025, Richard Biener wrote:
The problem is that in GIMPLE, a pointer does not need to be in
bounds. The caller could call the function with a value of i such that p +
i happens to be equal to &a. So, as I understand it, the GIMPLE semantics
do not allow the pass to conclude that p + i == &a is false, unless p + i
is dereferenced (because dereferencing a through p + i would be UB due to
provenance).
GIMPLE adopts most of the C pointer restrictions here thus we can (and
do) conclude that pointers stay within an object when advanced. This is
used by the PTA pass which results are used when we optimize your example.
You have to divert to integer arithmetic to circumvent this and the PTA
pass, while tracking provenance through integers as well, does the right
thing with this.
Great, that is much better for smtgcc than the semantics I have
currently
implemented!
But it is not completely clear to me what "most of the C pointer
restrictions" implies. Is the following a correct interpretation?
1. A pointer must contain a value that points into (or one past) an
object
corresponding to its provenance (where a pointer may have multiple
provenances). Otherwise it invokes undefined behavior.
Hmm. I think it's only UB when you'd "use" that pointer. That is, PTA
would
compute the points-to set to 'nothing'. The immediate consequences are
such
pointer isn't equal to any other pointer and accesses through it alias
with nothing,
stores would be DSEd. But at the point a SSA var is assigned such a
pointer
we couldn't place a trap() (?)
2. The provenance used for the result of POINTER_PLUS is the union of
the
provenances for the two arguments.
For POINTER_PLUS it's the provenance of the first argument.
For PLUS_EXPR it is the union of both arguments. For POINTER_DIFF_EXPR
the result has no provenance.
3. The POINTER_PLUS operation is UB if the calculation overflows and
TYPE_OVERFLOW_WRAPS(ptr_type) is false.
Yes.
4. The rules are the same for the calculations done in MEM_REF and
TARGET_MEM_REF as for POINTER_PLUS.
Yes.
Question: For the TARGET_MEM_REF calculation:
BASE + STEP * INDEX + INDEX2 + OFFSET
Is it treated as one POINTER_PLUS, i.e.
BASE + (STEP * INDEX + INDEX2 + OFFSET)
or as two (i.e. do we care about overflow and OOB between the two index
calculations)?
I'd say it counts as one pointer + offset calculation with all the offset
calculation being done in wrapping operations.
Your answers match exactly what is currently implemented in smtgcc, so I
am still thinking this is a bug in GCC (or that there is some missing
GIMPLE rule I must implement).
The original program looks in GIMPLE like:
void foo (char * p, long long int i)
{
char a;
sizetype i.0_1;
char * _2;
<bb 2> :
i.0_1 = (sizetype) i_3(D);
_2 = p_4(D) + i.0_1;
if (_2 == &a)
goto <bb 3>;
else
goto <bb 4>;
<bb 3> :
__builtin_abort ();
<bb 4> :
a ={v} {CLOBBER(eos)};
return;
}
Assume, for the sake of argument, that the address of a is 0x2000000, p =
0x1000000 and i = 0x1000000.
With the semantics as described in this mail thread, all operations are
defined:
* _2 evaluates to 0x2000000, with the provenance of p (although the
provenance is irrelevant in this execution).
* The comparison is also defined and evaluates to true.
* As a result, the program then calls __builtin_abort, which exits.
A valid optimization must produce the same result (including side effects)
given the same input for executions where all steps have defined
semantics.