Hi, On Fri, Dec 12, 2008 at 12:26:38PM +0100, Richard Guenther wrote: > On Fri, Dec 12, 2008 at 11:59 AM, Jan Hubicka <hubi...@ucw.cz> wrote: > >> On Fri, Dec 12, 2008 at 12:29 AM, Martin Jambor <mjam...@suse.cz> wrote: > >> > Hi, > >> > > >> > today I have encountered an unpleasant problem with the function > >> > get_ref_base_and_extent() when it claimed a known and constant offset > >> > for the expression insn_4(D)->u.fld[arg.82_3].rt_rtvec. (arg being a > >> > default_def parameter of the function, insn is an rtx). Moreover, it > >> > also returned constant size and max_size, all three equal to 64 > >> > (bits). > >> > > >> > This is certainly wrong (I believe the function got confused by > >> > unions) but after looking at the function and what it did in debugger, > >> > I grew unsure what the expected behavior is. The two alternatives I > >> > consider possibly correct are returned offset equal to -1 or, > >> > alternatively, offset equal to the offset of the array (+ offset of > >> > rt_rtvec which is zero) and max_size equal either to the size of the > >> > array or -1 if it is unknown. > >> > > >> > At the moment, the function never returns offset equal to -1, the > >> > comment above it does not even mention such possibility and, from the > >> > limited research I did, its callers do not expect and check for it. > >> > However, at the same time, the special handling of non-constants in > >> > array indices after the label "done" does not trigger in this > >> > particular case (here I suspect the unions come to play because it is > >> > the last part of the conjunction that evaluates to false). > >> > > >> > Which (or what else) is the correct semantics of the function? Both > >> > make sense to me (I would prefer the former but I suspect other users > >> > rely on the latter). What would be the correct fix, when a union > >> > field is itself a record ending with a variable-length array? How > >> > much would I pessimize things if I just returned -1 max_size if both a > >> > non-constant index and a union was encountered? > >> > > >> > Or did I miss something else? > >> > >> The function is supposed to return -1 for max_size in case the access > >> accesses a variable array index of an array of unknown size. > >> > >> insn_4(D)->u.fld[arg.82_3].rt_rtvec accesses struct rtvec_def -- in which > >> case the access _does_ have known size (an int plus one rtx pointer). > >> But maybe I am missing context here? > > > > Hmm, isn't this the usual problem with arrays running past the end of > > sturcture? > > No, get_ref_base_and_extent handles those fine (it is supposed to set > max_size to -1 in this case). > > > I guess we need to special case these for needs of SRA as you can't > > easilly know size of the array even if it is explicitely written in the > > program. > > Of course, just bail out if max_size is -1. The problem Martin is seeing > is that it isn't -1 for some reason. Martin, are you asking > get_ref_base_and_extent > for only a part of an access?
I don't know what you mean by a partial access, the gimple statement this expression comes from is: D.10261_5 = insn_4(D)->u.fld[arg.82_3].rt_rtvec; I simply grabbed the rhs and fed it to get_ref_base_and_extent(). I agree the function should return -1 max_size in the case of a variable indexing into an array that might have unknown size. The problem is how the function identifies such arrays. The idea is simple, these are arrays at the end of the whole aggregate and thus the hitherto calculated offset + max_size give exactly that, the size of the whole aggregate. However, this does not work if we have unions within the structure. A simple example would be: struct a { union { struct { int a; int var_array[1]; } x; struct { int a,b,c,d,e,f,g,h,i,j; } y; } z; } a; In this particular case, the ofset of var_array + its max_size (the size of its element, as the function computes it, this is something I also do not really understand) is not equal to the size of the whole structure, because the other union component is way larger. Therefore the final part of the following conjunction evaluates to false: if (seen_variable_array_ref && maxsize != -1 && host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1) && bit_offset + maxsize == (signed)TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp)))) maxsize = -1; Note that exp TREE_TYPE (exp) is now the type of the all-encompassing most-ouward aggregate. And thus maxsize is not set to -1. I believe a simple solution would be to ahve a nother flag to be set when processing component_ref with a union as its zeroth argument and always return maxsize of -1 if both seen_variable_array_ref and this flag is set. It is not optimal because the union field might be a structure with an array in it but not as the last field and so it cannot have variable size but well... ...if you think this case really matters I can try to handle it as well. Thanks, Martin