Hi Arthur and Richard,
Richard's and my replies got crossed. I tried to address some of the concerns
in the reply that preceded Richard's reply by a few minutes.
With the change, there is no difference in behaviour for:
void func(char d) {
size_t nbytes = sizeof(int);
for (int i = 0; i < 100; i++) {
__builtin_memcpy(x.b, &i, nbytes);
d = *p;
x.a += d;
}
}
The change should only impact TBAA annotation when dealing with accesses into
char arrays (non-VLAs only), so only accesses of the "x.b[i] = .." sort should
be affected, as they will have a different tbaa tag generated. More detailed
info is in the earlier reply. "d = *p" will have the same annotation as
before. (The change is in CGExpr.cpp in EmitArraySubscriptExpr, when dealing
with constant sized arrays). There is no change for the above test case:
!0 = metadata !{metadata !"clang version 3.5.0 "}
!1 = metadata !{metadata !2, metadata !2, i64 0}
!2 = metadata !{metadata !"any pointer", metadata !3, i64 0}
!3 = metadata !{metadata !"omnipotent char", metadata !4, i64 0}
!4 = metadata !{metadata !"Simple C/C++ TBAA"}
!5 = metadata !{metadata !6, metadata !6, i64 0}
!6 = metadata !{metadata !"int", metadata !3, i64 0}
!7 = metadata !{metadata !8, metadata !3, i64 0}
!8 = metadata !{metadata !"", metadata !3, i64 0, metadata !3, i64 1, metadata
!3, i64 105}
for.body: ; preds = %for.cond
%i.0..sroa_cast = bitcast i32* %i to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* getelementptr inbounds (%struct.S*
@x, i32 0, i32 1, i32 0), i8* %i.0..sroa_cast, i64 4, i32 1, i1 false)
%0 = load i32** @p, align 8, !tbaa !1
%1 = load i32* %0, align 4, !tbaa !5 <==== d = *p, tag !5 with base type
"int", with path leading to "omnipotent char", same as before
%conv = trunc i32 %1 to i8
%conv1 = zext i8 %conv to i32
%2 = load i8* getelementptr inbounds (%struct.S* @x, i32 0, i32 0), align 1,
!tbaa !7
%conv2 = zext i8 %2 to i32
%add = add nsw i32 %conv2, %conv1
%conv3 = trunc i32 %add to i8
store i8 %conv3, i8* getelementptr inbounds (%struct.S* @x, i32 0, i32 0),
align 1, !tbaa !7
%i.0.4 = load i32* %i, align 4
%inc = add nsw i32 %i.0.4, 1
store i32 %inc, i32* %i, align 4
br label %for.cond
After memcpy is expanded, the resulting store will not have a tbaa tag
associated with it, so the "d = *p" will not be able to get hoisted out, as it
cannot be proved that it is safe. This is true both with and without this
change, since "d = *p" has the same tag as before (and no new tags are being
generated in this case).
It will be able to get hoisted past the stores into constant sized char arrays,
as "*p" is an lvalue of type int, which is not the dynamic type of the object
being pointed to (char in this case), or any other classification that I can
see in Section 3.10 in the C++ Standard.
Even though it will not impact the first case you pointed out, can you please
expand on why it's not valid to hoist "d = *p" if is p is a pointer to int, and
we are dealing with stores/loads into arrays of the form
template<class T>
struct Container {
char data[N + sizeof T];
};
?
As Richard suggested, it could only be enabled by passing an additional flag
(-fchar-array-tbaa?), which would only have an effect if -fstruct-path-tbaa is
on.
Thanks,
Sanjin
-----Original Message-----
From: Arthur O'Dwyer [mailto:[email protected]]
Sent: Thursday, June 26, 2014 9:11 AM
To: Richard Smith
Cc: Sanjin Sijaric; cfe-commits
Subject: Re: [PATCH] More precise aliasing for char arrays
On Thu, Jun 26, 2014 at 4:10 AM, Richard Smith <[email protected]> wrote:
> On Wed, Jun 25, 2014 at 7:34 PM, Arthur O'Dwyer <[email protected]>
> wrote:
>> On Wed, Jun 25, 2014 at 3:26 PM, Sanjin Sijaric <[email protected]>
>> wrote:
>> >
>> >> int *p;
>> >> typedef struct {
>> >> char a;
>> >> char b[100];
>> >> char c;
>> >> } S;
>> >>
>> >> S x;
>> >>
>> >> void func1 (char d) {
>> >> for (int i = 0; i < 100; i++) {
>> >> x.b[i] += 1;
>> >> d = *p;
>> >> x.a += d;
>> >> }
>> >> }
>> >>
>> >> It seems like you want the compiler to hoist the read of `*p`
>> >> above the write to `x.b[i]`.
>> >> But that isn't generally possible, is it? because the caller might
>> >> have executed
>> >>
>> >> p = &x.b[3];
>> >>
>> >> before the call to func1.
[…]
> I think that's backwards from the intent: if you swap over 'int' and 'char'
> in the example, we cannot do the reordering, because p could point to
> (some byte of) one of the ints.
>
> With the test as-is, we *can* reorder the *p load (and even move it
> out of the loop):
> -- *p cannot alias x.b[i], because if 'x.b[i] += 1' has defined
> behavior, then x is an object of type S and x.b is an object of type
> char[100] and 0 <= i < 100, and therefore there is no int object
> aliased by that store
You left out one step there, which needs to be made explicit IMO: the fact that
we *store* into a *single byte* of x.b is important.
Obviously we could not do the same optimization on
size_t nbytes = sizeof(int);
for (int i = 0; i < 100; i++) {
__builtin_memcpy(x.b, &i, nbytes);
d = *p;
x.a += d;
}
because that would basically nerf "omnipotent char" (and "placement new" in
general). If the compiler can actually detect that the size of the last access
was incommensurate with the size of the load being reordered, then I withdraw
my objection, but otherwise it really seems like this is a super dangerous
optimization.
> -- *p cannot alias x.a, because if 'x.a += d' has defined behavior,
> then x is an object of type S, so a store to S::a cannot alias any int object.
This one I agree with your logic, btw. A named object of type 'char'
(being only one byte in size) clearly cannot alias with type 'int'
(being four bytes in size). My objections apply only to cases analogous to
template<class T>
struct Container {
char data[N + sizeof T];
};
–Arthur
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits