FYI, http://reviews.llvm.org/D8487 will fix your non-union testcase.
On Fri, Mar 20, 2015 at 11:08 AM, Jeroen Dobbelaere < [email protected]> wrote: > > > On Fri, Mar 20, 2015 at 5:50 PM, Daniel Berlin <[email protected]> > wrote: > >> >> >> On Fri, Mar 20, 2015 at 9:06 AM, Jeroen Dobbelaere < >> [email protected]> wrote: >> >>> In the next set of small functions, the goal is to identify the (for >>> llvm) wanted behavior related to aliasing. >>> (having a 32bit architecture in mind) >>> >>> For every function, I have three fields: >>> - c++11 : my interpretation of the c++11/14 standard based on a previous >>> mail from Richard Smith >>> (This is: a combination of the known layout of a >>> structure/union/array and the types) >>> (the relevant section for the types is 3.10 10) >>> - llvm: what does llvm deduce today >>> - future: what I would like to see in future >>> >>> The question to ask is: when should it be valid to REORDER the stores, >>> in the assumption >>> that after the function call, the second object is accessed (read from) >>> in a legal way. >>> - NoAlias: we are allowed to reorder the stores >>> - MayAlias: we are _not_ allowed to reorder the stores >>> >>> For the 'future' field, my personal guideline is that: >>> - if a union (sub)member is accessed (full path) and the union (or one >>> of its members) contains a type that matches >>> the other access (taking into account the access path), then aliasing >>> must be assumed (possibly taken into account >>> the offset/size of the access) >>> >>> NOTE: when the standard requires a MayAlias, and we provide a NoAlias, >>> we a have 'wrong code' issue >>> when the standard specifies a NoAlias, and we provide a MayAlias, >>> we have a possible performance degradation. >>> NOTE2: for member array accesses of the same type, llvm will today >>> always assume them as aliasing, >>> even if the happen at different offsets. This is acceptable as it >>> does not result in wrong code. >>> NOTE3: depending on the interpretation of the standard, more or less >>> cases will have the 'MayAlias' label. I try >>> to come with a guess on what the standard means, based on a >>> previous mail from Richard Smith. >>> (Richard, please comment for those cases where I made a wrong >>> guess ;) ) >>> NOTE4: it would be good if the standard can be extended with a set of >>> examples, explicitly stating where reorderings >>> are allowed and where not. >>> NOTE5: we probably want to have a similar behavior as 'gcc', and we >>> assume that gcc follows the 'future' rules, but >>> I was not able to deduce any useful information with >>> '-fdump-tree-alias'. Probably I missed something :( >>> >> >> -alias is the dump for field-sensitive points-to, not "all alias info". >> These are all incoming arguments, so non-IPA points-to will tell you >> nothing. >> >> But it does, of course, know things are in different places in each one: >> for test3_s03c, for example, it computes: >> >> a = &NONLOCAL >> b = &NONLOCAL >> derefaddrtmp = &NONLOCAL >> *a + 64 = derefaddrtmp >> *b + 160 = derefaddrtmp >> >> a(D), points-to non-local, points-to vars: { } >> b(D), points-to non-local, points-to vars: { } >> >> >> IE they point to nothing locally, but may point to any non-local variable. >> >> >> That is the best answer you can give these cases. >> >> I would be glad if somebody could extend this with information on >>> how gcc treats the accesses. >>> >> >> This would be complicated. >> GCC builds a simple initial memory SSA form, but isn't going to >> disambiguate it until an optimization asks for it to be disambiguated. >> >> Since there is nothing to optimize in your cases, i can't tell you >> whether it thinks they alias easily. >> I did take the one case you have below, and added a load, to determine >> aliasing. >> >> >> NOTE6: In order to make unions really useful, llvm allows to read a union >>> member with a different type than the one >>> that was used to write it. Once we have a correct deduction of >>> the aliasing relation, this should also work. >>> >>> >>> So, please comment if you agree/disagree ;) >>> >>> Jeroen >>> >>> -- >>> >>> void test_s03c(struct S03* a, struct S03* b) >>> { >>> a->mS00_1[0].mI_2=1; >>> b->mS00_1[1].mI_2=2; >>> >>> // c++11: NoAlias >>> // llvm: MayAlias ******** performance >>> // future: NoAlias >>> } >>> >> >> I actually don't understand why we get this wrong now :) >> > > One main issue is that arrays are stored as 'char' type in the tbaa struct > info. > The access itself then starts from a unknown location inside the array. > So, both the member type of an array are lost, as is the (in this case) > fixed offset in the tbaa path information. > > This can be fixed with a combination of improved tbaa struct information, > tbaa path information and analysis (which we will need to do anyway to get > the union testcases right). > > > This should be pretty trivial offset calculation. >> >> I transformed this into: >> int test_s03c(struct S03* a, struct S03* b) >> { >> a->mS00_1[0].mI_2=1; >> b->mS00_1[1].mI_2=2; >> return a->mS00_1[0].mI_2; >> } >> >> To see if GCC believes the second store clobbers the first store (IE >> whether it thinks they alias). >> It does not believe that, and will replace the return statement with >> "return 1"; >> >> So GCC correctly determines "NoAlias" in this case. >> >> > ah ! that is good way to find out what gcc makes of it. > > >> We definitely do not. >> >> GVN says: >> GVN: load i32 %7 is clobbered by store i32 2, i32* %6, align 4 >> >> and does *not* replace the return value with 1. >> >> We should fix this. >> >> BasicAA, sadly, has to walk backwards because it is stateless. >> It would be easier to walk forwards once, compute the actual offsets >> being accessed by the GEPs, and map that to the loads/stores for later >> usage. Then it is a constant time check for struct related aliasing like >> this. >> This isn't going to get invalidated all that often (only things like load >> widening/etc would change the offsets) >> >> >> Anyway, i looked into BasicAA, and it fails in aliasGeps. >> >> The logic in there confuses the heck out of me. >> >> It seems to go to a lot of trouble, and steadfastly refuses to believe >> that if you have two pointers whose underlying object is a struct argument, >> and which are accessed using GEP's at provably different offsets, they >> can't alias. >> This is wrong. >> >> In the case of arguments (or identified function locals) that are struct >> types, it doesn't matter whether the underlying pointers are must or may >> alias, it matters what the offset,size of the access is. >> >> In particular, there should be an else branch after this block: >> >> if (PreciseBaseAlias == NoAlias) { >> >> >> that says "if they [offsets, sizes] don't overlap different, and >> UnderlyingV1/V2 is an identified local, they can't alias) >> >> (if it can't figure out what the underlying object is, conservatively, >> the right thing to do is say MayAlias) >> >> >> I'll play with adding some logic here that does the right thing. >> >> >> The of your answers rest looks sane to me. >> >> > > Greetings, > -- > Jeroen Dobbelaere >
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
