> On Apr 11, 2025, at 13:37, Martin Uecker <uec...@tugraz.at> wrote:
>
> On Friday, 11.04.2025 at 17:08 +0000, Qing Zhao wrote:
>>
>>> On Apr 11, 2025, at 12:20, Martin Uecker <uec...@tugraz.at> wrote:
>>>
>>> On Friday, 11.04.2025 at 16:01 +0000, Qing Zhao wrote:
>>>>
>>>>> On Apr 11, 2025, at 10:53, Martin Uecker <uec...@tugraz.at> wrote:
>>>>>
>>>>> On Friday, 11.04.2025 at 10:42 -0400, Andrew MacLeod wrote:
>>>>>> On 4/11/25 10:27, Qing Zhao wrote:
>>>>>>>
>>>>>>>> On Apr 10, 2025, at 11:12, Martin Uecker <uec...@tugraz.at> wrote:
>>>>>>>>
>>>>>>>> On Thursday, 10.04.2025 at 10:55 -0400, Siddhesh Poyarekar wrote:
>>>>>>>>> On 2025-04-10 10:50, Andrew MacLeod wrote:
>>>>>>>>>> It's not clear to me exactly what is being asked, but I think
>>>>>>>>>> the suggestion is that pointer references are being replaced
>>>>>>>>>> with a builtin function called .ACCESS_WITH_SIZE? And I presume
>>>>>>>>>> that builtin function has some parameters that give you
>>>>>>>>>> relevant range information of some sort?
>>>>>>>>> Added, not replaced, but yes, that's essentially it.
>>>>>>>>>
>>>>>>>>>> range-ops is set up to pull range information from builtin
>>>>>>>>>> functions already in
>>>>>>>>>> gimple-range-op.cc::gimple_range_op_handler::maybe_builtin_call ().
>>>>>>>>>> We'd just need to write a handler for this new one. You can
>>>>>>>>>> pull information from 2 operands under normal circumstances,
>>>>>>>>>> but exceptions can be made. I'd need a description of what it
>>>>>>>>>> looks like and how that translates to range info.
>>>>>>>>> That's perfect! It's probably redundant for cases where we end
>>>>>>>>> up with both .ACCESS_WITH_SIZE and a __bos/__bdos call, but I
>>>>>>>>> don't remember if that's the only place where .ACCESS_WITH_SIZE
>>>>>>>>> is generated today. Qing, could you please work with Andrew on
>>>>>>>>> this?
>>>>>>>> BTW, what I would find very interesting is inserting such
>>>>>>>> information at the points where arrays decay to pointers.
>>>>>>> Is the following the example?
>>>>>>>
>>>>>>>  1 #include <stdio.h>
>>>>>>>  2
>>>>>>>  3 void foo (int arr[]) {
>>>>>>>  4   // Inside the function, arr is treated as a pointer
>>>>>>>  5   arr[6] = 10;
>>>>>>>  6 }
>>>>>>>  7
>>>>>>>  8 int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>  9
>>>>>>> 10 int main() {
>>>>>>> 11   my_array[6] = 6;
>>>>>>> 12   int *ptr = my_array; // Array decays to pointer here
>>>>>>> 13   ptr[7] = 7;
>>>>>>> 14   foo (my_array);
>>>>>>> 15
>>>>>>> 16   return 0;
>>>>>>> 17 }
>>>>>>>
>>>>>>> When I use the latest gcc to compile the above with -Warray-bounds:
>>>>>>>
>>>>>>> []$ gcc -O2 -Warray-bounds t.c
>>>>>>> t.c: In function ‘main’:
>>>>>>> t.c:13:6: warning: array subscript 7 is outside array bounds of ‘int[5]’ [-Warray-bounds=]
>>>>>>>    13 |   ptr[7] = 7;
>>>>>>>       |   ~~~^~~
>>>>>>> t.c:8:5: note: at offset 28 into object ‘my_array’ of size 20
>>>>>>>     8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>       |     ^~~~~~~~
>>>>>>> In function ‘foo’,
>>>>>>>     inlined from ‘main’ at t.c:14:3:
>>>>>>> t.c:5:10: warning: array subscript 6 is outside array bounds of ‘int[5]’ [-Warray-bounds=]
>>>>>>>     5 |   arr[6] = 10;
>>>>>>>       |   ~~~~~~~^~~~
>>>>>>> t.c: In function ‘main’:
>>>>>>> t.c:8:5: note: at offset 24 into object ‘my_array’ of size 20
>>>>>>>     8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>       |     ^~~~~~~~
>>>>>>>
>>>>>>> It looks like even after the array decays to a pointer, the bound
>>>>>>> information is still carried for the decayed pointer somehow
>>>>>>> (I guess that VRP did this?)
>>>>>>
>>>>>> No, the behaviour in these warnings is from something else.
>>>>>> Although some range info from VRP is used, most of this is tracked
>>>>>> by the pointer_query (pointer-query.cc) mechanism that was written
>>>>>> a number of years ago before ranger was completed.
>>>>>> It attempts to do its own custom tracking of pointers and what
>>>>>> they point to and the size of things they access.
>>>>>>
>>>>>> There are issues with that code, and the goal is to replace it
>>>>>> with ranger's prange. Alas, there is enhancement work to prange
>>>>>> for that to happen, as it doesn't currently track any points-to
>>>>>> info. That would then be followed by converting the warning code
>>>>>> to use ranger/VRP instead.
>>>>>>
>>>>>> Any adjustments to ranger for this are unlikely to affect anything
>>>>>> until that work is done, and I do not think anyone is equipped to
>>>>>> attempt to update the existing pointer-query code.
>>>>>>
>>>>>> Unfortunately :-(
>>>>>
>>>>> Examples I have in mind for the .ACCESS_WITH_SIZE are the
>>>>> following two:
>>>>>
>>>>> struct foo {
>>>>>
>>>>>   char arr[3];
>>>>>   int b;
>>>>> };
>>>>>
>>>>> void f(struct foo x)
>>>>> {
>>>>>   char *ptr = x.arr;
>>>>>   ptr[4] = 10;
>>>>> }
>>>>
>>>> The above is an example about decaying a field array of a structure
>>>> to a pointer.
>>>>
>>>> Yes, usually tracking and keeping the bound info for a field is
>>>> harder than for a regular variable. However, I think it's still
>>>> possible to improve the compiler analysis to do this, since the
>>>> original bound info is in the code.
>>>>
>>>>>
>>>>> void g(char (*arr)[4])
>>>>> {
>>>>>   char *ptr = *arr;
>>>>>   ptr[4] = 1;
>>>>> }
>>>>>
>>>>
>>>> The above example is about decaying a formal parameter array to a
>>>> pointer. I think that since the bound information is in the code
>>>> too, the current compiler analysis should be able to be improved to
>>>> catch such a case.
>>>>
>>>> For the above two cases, the current compiler analysis is not able
>>>> to propagate the bound information. But since the bound info is
>>>> already in the code, it's possible to improve the current compiler
>>>> analysis to propagate such information more aggressively.
>>>>
>>>> Inserting a call to .ACCESS_WITH_SIZE for such cases is not
>>>> necessary.
>>>>
>>>> .ACCESS_WITH_SIZE is necessary for the cases where the size
>>>> information is not available in the source code, i.e., the cases
>>>> where we need to add an attribute to specify the size of the access,
>>>> for example, the counted_by attribute, the access attribute, etc.
>>>
>>> When you add an attribute, the information is also in the source.
>>>
>>> .ACCESS_WITH_SIZE can be used to make semantic information
>>> from a type or attribute available to other passes, so it
>>> seems the right tool to be used here.
>>>
>>
>> If I remember correctly, the main purpose of adding .ACCESS_WITH_SIZE
>> is to explicitly add the reference to the size field into the data
>> flow in order to prevent any incorrect code reordering from happening.
>>
>> Even without .ACCESS_WITH_SIZE, compiler phases that need the size
>> information can get it from the IR.
>>
>> My understanding is that the issue with the missing implicit data flow
>> dependency information applies only to the counted_by attribute, not
>> to the other types, which already have the bound information there.
>>
>
> The dependency issue is only for the size, but for
> other types the size information is often not
> preserved, so then not available later.
> .ACCESS_WITH_SIZE would solve this.
Yes, I see your point: the original size information from the array
would be passed through the 2nd parameter of .ACCESS_WITH_SIZE to the
decayed pointer. :-)

However, I am not sure how much more benefit this would bring compared
to improving the compiler analysis to better track the bound
information for pointers through data flow.

Qing

>
> Martin
>
>
>> Qing
>>>
>>> Martin
>>>
>>>>
>>>> thanks.
>>>>
>>>> Qing
>>>>
>>>>>
>>>>> Martin
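
To make the two decay cases discussed in this thread concrete, here is a
minimal C sketch of what "passing the array size along with the decayed
pointer" means at the source level. It only models the idea in user code
and is not how GCC represents .ACCESS_WITH_SIZE internally. The names
sized_ptr, DECAY_WITH_SIZE, in_bounds, f_index_ok and g_index_ok are
hypothetical and introduced purely for illustration.

```c
#include <stddef.h>

/* Hypothetical helper type: a decayed pointer that still remembers how
   many elements the original array had.  Illustration only; this is an
   assumption, not a GCC internal data structure.  */
struct sized_ptr {
  char *ptr;
  size_t count;
};

/* Decay an array to a pointer while recording its element count.
   Must be applied to an expression of array type, not to a pointer.  */
#define DECAY_WITH_SIZE(arr) \
  ((struct sized_ptr) { (arr), sizeof (arr) / sizeof ((arr)[0]) })

/* Return 1 if index IDX is within the recorded bound, 0 otherwise.  */
int
in_bounds (struct sized_ptr p, size_t idx)
{
  return idx < p.count;
}

struct foo {
  char arr[3];
  int b;
};

/* Case 1: an array member of a structure decays to a pointer.  */
int
f_index_ok (struct foo *x, size_t idx)
{
  struct sized_ptr p = DECAY_WITH_SIZE (x->arr);
  return in_bounds (p, idx);
}

/* Case 2: a pointer-to-array parameter is dereferenced and decays.  */
int
g_index_ok (char (*arr)[4], size_t idx)
{
  struct sized_ptr p = DECAY_WITH_SIZE (*arr);
  return in_bounds (p, idx);
}
```

With this sketch, f_index_ok (&x, 4) and g_index_ok (&buf, 4) both
return 0 for a struct foo x and a char buf[4], which corresponds to the
out-of-bounds stores ptr[4] in the two quoted examples. Whether the
compiler carries the bound explicitly like this or recovers it through
better pointer tracking is exactly the open question above.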