> On Apr 11, 2025, at 13:37, Martin Uecker <uec...@tugraz.at> wrote:
> 
> On Friday, 11.04.2025 at 17:08 +0000, Qing Zhao wrote:
>> 
>>> On Apr 11, 2025, at 12:20, Martin Uecker <uec...@tugraz.at> wrote:
>>> 
>>> On Friday, 11.04.2025 at 16:01 +0000, Qing Zhao wrote:
>>>> 
>>>>> On Apr 11, 2025, at 10:53, Martin Uecker <uec...@tugraz.at> wrote:
>>>>> 
>>>>> On Friday, 11.04.2025 at 10:42 -0400, Andrew MacLeod wrote:
>>>>>> On 4/11/25 10:27, Qing Zhao wrote:
>>>>>>> 
>>>>>>>> On Apr 10, 2025, at 11:12, Martin Uecker <uec...@tugraz.at> wrote:
>>>>>>>> 
>>>>>>>> On Thursday, 10.04.2025 at 10:55 -0400, Siddhesh Poyarekar wrote:
>>>>>>>>> On 2025-04-10 10:50, Andrew MacLeod wrote:
>>>>>>>>>> It's not clear to me exactly what is being asked, but I think the
>>>>>>>>>> suggestion is that pointer references are being replaced with a
>>>>>>>>>> builtin function called .ACCESS_WITH_SIZE?  And I presume that
>>>>>>>>>> builtin function has some parameters that give you relevant range
>>>>>>>>>> information of some sort?
>>>>>>>>> Added, not replaced, but yes, that's essentially it.
>>>>>>>>> 
>>>>>>>>>> range-ops is set up to pull range information from builtin functions
>>>>>>>>>> already in gimple-range-op.cc::
>>>>>>>>>> gimple_range_op_handler::maybe_builtin_call ().  We'd just need to
>>>>>>>>>> write a handler for this new one.  You can pull information from 2
>>>>>>>>>> operands under normal circumstances, but exceptions can be made.
>>>>>>>>>> I'd need a description of what it looks like and how that translates
>>>>>>>>>> to range info.
>>>>>>>>> That's perfect!  It's probably redundant for cases where we end up
>>>>>>>>> with both .ACCESS_WITH_SIZE and a __bos/__bdos call, but I don't
>>>>>>>>> remember if that's the only place where .ACCESS_WITH_SIZE is
>>>>>>>>> generated today.  Qing, could you please work with Andrew on this?
>>>>>>>> BTW, what I would find very interesting is inserting such information
>>>>>>>> at the points where arrays decay to pointer.
>>>>>>> Is the following the example?
>>>>>>> 
>>>>>>> 1 #include <stdio.h>
>>>>>>> 2
>>>>>>> 3 void foo (int arr[]) {
>>>>>>> 4   // Inside the function, arr is treated as a pointer
>>>>>>> 5   arr[6] = 10;
>>>>>>> 6 }
>>>>>>> 7
>>>>>>> 8 int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>> 9
>>>>>>> 10 int main() {
>>>>>>> 11   my_array[6] = 6;
>>>>>>> 12   int *ptr = my_array; // Array decays to pointer here
>>>>>>> 13   ptr[7] = 7;
>>>>>>> 14   foo (my_array);
>>>>>>> 15
>>>>>>> 16   return 0;
>>>>>>> 17 }
>>>>>>> 
>>>>>>> When I use the latest gcc to compile the above with -Warray-bounds:
>>>>>>> 
>>>>>>> []$ gcc -O2 -Warray-bounds t.c
>>>>>>> t.c: In function ‘main’:
>>>>>>> t.c:13:6: warning: array subscript 7 is outside array bounds of ‘int[5]’ [-Warray-bounds=]
>>>>>>>  13 |   ptr[7] = 7;
>>>>>>>     |   ~~~^~~
>>>>>>> t.c:8:5: note: at offset 28 into object ‘my_array’ of size 20
>>>>>>>   8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>     |     ^~~~~~~~
>>>>>>> In function ‘foo’,
>>>>>>>   inlined from ‘main’ at t.c:14:3:
>>>>>>> t.c:5:10: warning: array subscript 6 is outside array bounds of ‘int[5]’ [-Warray-bounds=]
>>>>>>>   5 |   arr[6] = 10;
>>>>>>>     |   ~~~~~~~^~~~
>>>>>>> t.c: In function ‘main’:
>>>>>>> t.c:8:5: note: at offset 24 into object ‘my_array’ of size 20
>>>>>>>   8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>     |     ^~~~~~~~
>>>>>>> 
>>>>>>> It looks like, even after the array decays to a pointer, the bound
>>>>>>> information is still carried for the decayed pointer somehow (I guess
>>>>>>> that VRP did this?)
>>>>>> 
>>>>>> No, the behaviour in these warnings is from something else. Although 
>>>>>> some range info from VRP is used, most of this is tracked by the 
>>>>>> pointer_query (pointer-query.cc) mechanism that was written a number of 
>>>>>> years ago before ranger was completed.  It attempts to do its own custom 
>>>>>> tracking of pointers and what they point to and the size of things they 
>>>>>> access.
>>>>>> 
>>>>>> There are issues with that code, and the goal is to replace it with
>>>>>> ranger's prange.  Alas, enhancement work on prange is needed for that to
>>>>>> happen, as it doesn't currently track any points-to info.  That would
>>>>>> then be followed by converting the warning code to use ranger/VRP
>>>>>> instead.
>>>>>> 
>>>>>> Any adjustments to ranger for this are unlikely to affect anything
>>>>>> until that work is done, and I do not think anyone is equipped to
>>>>>> attempt to update the existing pointer-query code.
>>>>>> 
>>>>>> Unfortunately :-(
>>>>> 
>>>>> Examples I have in mind for the .ACCESS_WITH_SIZE are the
>>>>> following two:
>>>>> 
>>>>> struct foo {
>>>>> 
>>>>>  char arr[3];
>>>>>  int b;
>>>>> };
>>>>> 
>>>>> void f(struct foo x)
>>>>> {
>>>>>  char *ptr = x.arr;
>>>>>  ptr[4] = 10;
>>>>> }
>>>> 
>>>> The above is an example of a structure's array field decaying to a
>>>> pointer.
>>>>
>>>> Yes, tracking and keeping the bound info for a field is usually harder
>>>> than for a regular variable.
>>>> However, I think it’s still possible to improve the compiler analysis to
>>>> do this, since the original bound info is in the code.
>>>> 
>>>>> 
>>>>> void g(char (*arr)[4])
>>>>> {
>>>>>  char *ptr = *arr;
>>>>>  ptr[4] = 1;
>>>>> }
>>>>> 
>>>> 
>>>> The above example is about a formal parameter array decaying to a pointer.
>>>> I think that since the bound information is in the code too, the current
>>>> compiler analysis can be improved to catch such cases.
>>>>
>>>> For the above two cases, the current compiler analysis is not able to
>>>> propagate the bound information.
>>>> But since the bound info is already in the code, it’s possible to improve
>>>> the current compiler analysis to propagate such information more
>>>> aggressively.
>>>>
>>>> Inserting a call to .ACCESS_WITH_SIZE for such cases is not necessary.
>>>>
>>>> .ACCESS_WITH_SIZE is necessary for the cases where the size information is
>>>> not available in the source code, i.e., the cases where we need to add an
>>>> attribute to specify the size of the access, for example, the counted_by
>>>> attribute, the access attribute, etc.
>>> 
>>> When you add an attribute, the information is also in the source.
>>> 
>>> .ACCESS_WITH_SIZE can be used to make semantic information
>>> from a type or attribute available to other passes, so it
>>> seems to be the right tool to use here.
>>> 
>> 
>> If I remember correctly, the main purpose of adding .ACCESS_WITH_SIZE is to
>> explicitly add the reference to the size field into the data flow in order
>> to prevent any incorrect code reordering from happening.
>>
>> Even without .ACCESS_WITH_SIZE, compiler phases that need the size
>> information can get it from the IR.
>>
>> My understanding is that the issue of the missing implicit data-flow
>> dependency applies only to the counted_by attribute, not to other types
>> that already carry the bound information.
>> 
> 
> The dependency issue is only for the size, but for
> other types the size information is often not
> preserved, so it is not available later.
> .ACCESS_WITH_SIZE would solve this.

Yes, I see your point here:

The original size information from the array will be passed through
the 2nd parameter of .ACCESS_WITH_SIZE to the decayed pointer. :-)

However, I am not sure how much more benefit this can bring compared
to improving the compiler analysis to better track the bound information
for pointers through data flow.
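
As a rough illustration of the idea (a user-level sketch only, not how GCC
represents .ACCESS_WITH_SIZE internally), the C below pairs a decayed pointer
with the size of the array it came from, so that a later check can still see
the bound. The names sized_ptr, DECAY_WITH_SIZE, in_bounds and f_access_ok are
hypothetical and exist only for this example; struct foo is the one from
Martin's first example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a pointer that carries its bound.
   This is NOT GCC's internal .ACCESS_WITH_SIZE; all names are invented. */
struct sized_ptr {
  char *ptr;    /* the decayed pointer */
  size_t size;  /* size in bytes of the array it decayed from */
};

/* Model of the decay point: capture sizeof the array together with the
   pointer.  The argument must be a real array, not already a pointer. */
#define DECAY_WITH_SIZE(arr) ((struct sized_ptr){ (arr), sizeof (arr) })

/* True if a one-byte access at index idx stays within the recorded bound. */
static bool in_bounds (struct sized_ptr p, size_t idx)
{
  return idx < p.size;
}

struct foo {
  char arr[3];
  int b;
};

/* Mirrors f() from the thread: reports whether x->arr[idx] would be a
   valid access, using only the information attached at the decay point. */
static bool f_access_ok (struct foo *x, size_t idx)
{
  struct sized_ptr p = DECAY_WITH_SIZE (x->arr);
  return in_bounds (p, idx);
}
```

With this model, f_access_ok (&x, 4) is false because the recorded size is 3,
which is the kind of fact a later pass could use if the bound were attached
where the array decays.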

Qing


> 
> Martin
> 
> 
>> Qing
>>> 
>>> Martin
>>> 
>>>> 
>>>> thanks.
>>>> 
>>>> Qing
>>>> 
>>>>> 
>>>>> Martin

