On Friday, 12.12.2025 at 09:21 +0700, Jason Merrill via Gcc wrote:
> On Fri, Dec 12, 2025, 8:57 a.m. Krister Walfridsson via Gcc <[email protected]>
> wrote:
>
> > On Thu, 11 Dec 2025, Richard Biener wrote:
> >
> > > Date: Thu, 11 Dec 2025 12:38:43 +0100
> > > From: Richard Biener <[email protected]>
> > > To: Krister Walfridsson <[email protected]>
> > > Cc: [email protected]
> > > Subject: Re: pointer comparison in GIMPLE
> > >
> > > On Thu, Dec 11, 2025 at 5:12 AM Krister Walfridsson
> > > <[email protected]> wrote:
> > > >
> > > > On Wed, 10 Dec 2025, Richard Biener wrote:
> > > >
> > > > > > The problem is that in GIMPLE, a pointer does not need to be in
> > > > > > bounds. The caller could call the function with a value of i such
> > > > > > that p + i happens to be equal to &a. So, as I understand it, the
> > > > > > GIMPLE semantics do not allow the pass to conclude that p + i == &a
> > > > > > is false, unless p + i is dereferenced (because dereferencing a
> > > > > > through p + i would be UB due to provenance).
> > > > >
> > > > > GIMPLE adopts most of the C pointer restrictions here thus we can
> > > > > (and do) conclude that pointers stay within an object when advanced.
> > > > > This is used by the PTA pass which results are used when we optimize
> > > > > your example. You have to divert to integer arithmetic to circumvent
> > > > > this and the PTA pass, while tracking provenance through integers as
> > > > > well, does the right thing with this.
> > > >
> > > > Great, that is much better for smtgcc than the semantics I have
> > > > currently implemented!
> > > >
> > > > But it is not completely clear to me what "most of the C pointer
> > > > restrictions" implies. Is the following a correct interpretation?
> > > >
> > > > 1. A pointer must contain a value that points into (or one past) an
> > > >    object corresponding to its provenance (where a pointer may have
> > > >    multiple provenances). Otherwise it invokes undefined behavior.
> > >
> > > Hmm. I think it's only UB when you'd "use" that pointer.
> > > That is, PTA would compute the points-to set to 'nothing'. The
> > > immediate consequences are such pointer isn't equal to any other
> > > pointer and accesses through it alias with nothing,
> > > stores would be DSEd. But at the point a SSA var is assigned such a
> > > pointer we couldn't place a trap() (?)
> > >
> > > > 2. The provenance used for the result of POINTER_PLUS is the union of
> > > >    the provenances for the two arguments.
> > >
> > > For POINTER_PLUS it's the provenance of the first argument.
> > >
> > > For PLUS_EXPR it is the union of both arguments. For POINTER_DIFF_EXPR
> > > the result has no provenance.
> > >
> > > > 3. The POINTER_PLUS operation is UB if the calculation overflows and
> > > >    TYPE_OVERFLOW_WRAPS(ptr_type) is false.
> > >
> > > Yes.
> > >
> > > > 4. The rules are the same for the calculations done in MEM_REF and
> > > >    TARGET_MEM_REF as for POINTER_PLUS.
> > >
> > > Yes.
> > >
> > > > Question: For the TARGET_MEM_REF calculation:
> > > >    BASE + STEP * INDEX + INDEX2 + OFFSET
> > > > Is it treated as one POINTER_PLUS, i.e.
> > > >    BASE + (STEP * INDEX + INDEX2 + OFFSET)
> > > > or as two (i.e. do we care about overflow and OOB between the two
> > > > index calculations)?
> > >
> > > I'd say it counts as one pointer + offset calculation with all the
> > > offset calculation being done in wrapping operations.
> >
> > Your answers match exactly what is currently implemented in smtgcc, so I
> > am still thinking this is a bug in GCC (or that there is some missing
> > GIMPLE rule I must implement).
> >
> > The original program looks in GIMPLE like:
> >
> > void foo (char * p, long long int i)
> > {
> >   char a;
> >   sizetype i.0_1;
> >   char * _2;
> >
> >   <bb 2> :
> >   i.0_1 = (sizetype) i_3(D);
> >   _2 = p_4(D) + i.0_1;
> >   if (_2 == &a)
> >     goto <bb 3>;
> >   else
> >     goto <bb 4>;
> >
> >   <bb 3> :
> >   __builtin_abort ();
> >
> >   <bb 4> :
> >   a ={v} {CLOBBER(eos)};
> >   return;
> > }
> >
> > Assume, for the sake of argument, that the address of a is 0x2000000, p =
> > 0x1000000 and i = 0x1000000.
> >
> > With the semantics as described in this mail thread, all operations are
> > defined:
> > * _2 evaluates to 0x2000000, with the provenance of p (although the
> >   provenance is irrelevant in this execution).
> > * The comparison is also defined and evaluates to true.
> > * As a result, the program then calls __builtin_abort, which exits.
> >
> > A valid optimization must produce the same result (including side effects)
> > given the same input for executions where all steps have defined
> > semantics.
>
> Except this comparison has an unspecified value, so an optimization is
> allowed to change it.

Is it? For C, this is defined.

Martin

> > Therefore, an optimization that does not call __builtin_abort
> > for this input is buggy (or the semantics is incorrect).
> >
> > You said in your second mail: "GIMPLE adopts most of the C pointer
> > restrictions here thus we can (and do) conclude that pointers stay within
> > an object when advanced. This is used by the PTA pass which results are
> > used when we optimize your example." which is what I tried to reflect in:
> >
> > > > 1. A pointer must contain a value that points into (or one past) an
> > > >    object corresponding to its provenance (where a pointer may have
> > > >    multiple provenances). Otherwise it invokes undefined behavior.
> >
> > But as you say, that is wrong (and the vectorizer and ifconv do indeed
> > perform transformations that would be invalid with this semantics). So
> > what is the correct rule here for "pointers stay within an object"? All
> > ideas I have tried fails in different ways... :(
> >
> > ---
> >
> > I also have a somewhat related question regarding:
> >
> > > > 2. The provenance used for the result of POINTER_PLUS is the union of
> > > >    the provenances for the two arguments.
> > >
> > > For POINTER_PLUS it's the provenance of the first argument.
> >
> > > > 4. The rules are the same for the calculations done in MEM_REF and
> > > >    TARGET_MEM_REF as for POINTER_PLUS.
> > >
> > > Yes.
> >
> > The ifconv pass sometimes rewrites memory accesses as:
> >
> >   _84 = &MEM[(float *)0B + _83 + ivtmp.41_75 * 4];
> >   MEM[(float *)_84] = _3;
> >
> > which you can see by compiling testsuite/gcc.dg/sms-11.c for x86_64 with
> > -O1.
> >
> > If TARGET_MEM_REF works like POINTER_PLUS, which does not propagate
> > provenance through integers, then this store has no provenance and invokes
> > undefined behavior. So is ifopts buggy, or do TARGET_MEM_REF propagate
> > provenance from index/offset?
> >
> > /Krister
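
For reference, here is a minimal value-level sketch in C of the two address calculations discussed in the quoted text: the TARGET_MEM_REF form BASE + (STEP * INDEX + INDEX2 + OFFSET) with the offset computed in wrapping arithmetic, and the p + i == &a comparison with the concrete values from the example. The function names tmr_address and compares_equal are hypothetical, and the code models only the numeric values on plain unsigned integers. It says nothing about provenance, and it is not how GCC or smtgcc implement any of this.

```c
#include <stdint.h>

/* Hypothetical helper: numeric value of
     BASE + (STEP * INDEX + INDEX2 + OFFSET)
   treated as a single pointer + offset calculation. The offset is
   computed in unsigned arithmetic, which wraps modulo 2^N. */
uintptr_t tmr_address(uintptr_t base, uintptr_t step, uintptr_t index,
                      uintptr_t index2, uintptr_t offset)
{
    uintptr_t off = step * index + index2 + offset; /* wrapping offset */
    return base + off;                              /* one base + offset step */
}

/* Hypothetical helper: value-level view of "_2 == &a", where
   _2 = p + i and addr_a stands in for the address of a. */
int compares_equal(uintptr_t p, uintptr_t i, uintptr_t addr_a)
{
    return p + i == addr_a;
}
```

With the values assumed in the quoted example (p = 0x1000000, i = 0x1000000, address of a = 0x2000000), compares_equal returns 1. Whether an optimizer may nevertheless fold the comparison to false is exactly the question in this thread.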
