> On May 23, 2024, at 07:46, Richard Biener <richard.guent...@gmail.com> wrote: > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao <qing.z...@oracle.com> wrote: >> >> >> >>> On May 22, 2024, at 03:38, Richard Biener <richard.guent...@gmail.com> >>> wrote: >>> >>> On Tue, May 21, 2024 at 11:36 PM David Malcolm <dmalc...@redhat.com> wrote: >>>> >>>> On Tue, 2024-05-21 at 15:13 +0000, Qing Zhao wrote: >>>>> Thanks for the comments and suggestions. >>>>> >>>>>> On May 15, 2024, at 10:00, David Malcolm <dmalc...@redhat.com> >>>>>> wrote: >>>>>> >>>>>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: >>>>>>> On Mon, 13 May 2024, Qing Zhao wrote: >>>>>>> >>>>>>>> -Warray-bounds is an important option to enable linux kernal to >>>>>>>> keep >>>>>>>> the array out-of-bound errors out of the source tree. >>>>>>>> >>>>>>>> However, due to the false positive warnings reported in >>>>>>>> PR109071 >>>>>>>> (-Warray-bounds false positive warnings due to code duplication >>>>>>>> from >>>>>>>> jump threading), -Warray-bounds=1 cannot be added on by >>>>>>>> default. >>>>>>>> >>>>>>>> Although it's impossible to elinimate all the false positive >>>>>>>> warnings >>>>>>>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds >>>>>>>> documentation says "always out of bounds"), we should minimize >>>>>>>> the >>>>>>>> false positive warnings in -Warray-bounds=1. >>>>>>>> >>>>>>>> The root reason for the false positive warnings reported in >>>>>>>> PR109071 is: >>>>>>>> >>>>>>>> When the thread jump optimization tries to reduce the # of >>>>>>>> branches >>>>>>>> inside the routine, sometimes it needs to duplicate the code >>>>>>>> and >>>>>>>> split into two conditional pathes. for example: >>>>>>>> >>>>>>>> The original code: >>>>>>>> >>>>>>>> void sparx5_set (int * ptr, struct nums * sg, int index) >>>>>>>> { >>>>>>>> if (index >= 4) >>>>>>>> warn (); >>>>>>>> *ptr = 0; >>>>>>>> *val = sg->vals[index]; >>>>>>>> if (index >= 4) >>>>>>>> warn (); >>>>>>>> *ptr = *val; >>>>>>>> >>>>>>>> return; >>>>>>>> } >>>>>>>> >>>>>>>> With the thread jump, the above becomes: >>>>>>>> >>>>>>>> void sparx5_set (int * ptr, struct nums * sg, int index) >>>>>>>> { >>>>>>>> if (index >= 4) >>>>>>>> { >>>>>>>> warn (); >>>>>>>> *ptr = 0; // Code duplications since "warn" does >>>>>>>> return; >>>>>>>> *val = sg->vals[index]; // same this line. >>>>>>>> // In this path, since it's >>>>>>>> under >>>>>>>> the condition >>>>>>>> // "index >= 4", the compiler >>>>>>>> knows >>>>>>>> the value >>>>>>>> // of "index" is larger then 4, >>>>>>>> therefore the >>>>>>>> // out-of-bound warning. >>>>>>>> warn (); >>>>>>>> } >>>>>>>> else >>>>>>>> { >>>>>>>> *ptr = 0; >>>>>>>> *val = sg->vals[index]; >>>>>>>> } >>>>>>>> *ptr = *val; >>>>>>>> return; >>>>>>>> } >>>>>>>> >>>>>>>> We can see, after the thread jump optimization, the # of >>>>>>>> branches >>>>>>>> inside >>>>>>>> the routine "sparx5_set" is reduced from 2 to 1, however, due >>>>>>>> to >>>>>>>> the >>>>>>>> code duplication (which is needed for the correctness of the >>>>>>>> code), >>>>>>>> we >>>>>>>> got a false positive out-of-bound warning. >>>>>>>> >>>>>>>> In order to eliminate such false positive out-of-bound warning, >>>>>>>> >>>>>>>> A. Add one more flag for GIMPLE: is_splitted. >>>>>>>> B. During the thread jump optimization, when the basic blocks >>>>>>>> are >>>>>>>> duplicated, mark all the STMTs inside the original and >>>>>>>> duplicated >>>>>>>> basic blocks as "is_splitted"; >>>>>>>> C. Inside the array bound checker, add the following new >>>>>>>> heuristic: >>>>>>>> >>>>>>>> If >>>>>>>> 1. the stmt is duplicated and splitted into two conditional >>>>>>>> paths; >>>>>>>> + 2. the warning level < 2; >>>>>>>> + 3. the current block is not dominating the exit block >>>>>>>> Then not report the warning. >>>>>>>> >>>>>>>> The false positive warnings are moved from -Warray-bounds=1 to >>>>>>>> -Warray-bounds=2 now. >>>>>>>> >>>>>>>> Bootstrapped and regression tested on both x86 and aarch64. >>>>>>>> adjusted >>>>>>>> -Warray-bounds-61.c due to the false positive warnings. >>>>>>>> >>>>>>>> Let me know if you have any comments and suggestions. >>>>>>> >>>>>>> At the last Cauldron I talked with David Malcolm about these kind >>>>>>> of >>>>>>> issues and thought of instead of suppressing diagnostics to >>>>>>> record >>>>>>> how a block was duplicated. For jump threading my idea was to >>>>>>> record >>>>>>> the condition that was proved true when entering the path and do >>>>>>> this >>>>>>> by recording the corresponding locations >>>>> >>>>> Is only recording the location for the TRUE path enough? >>>>> We might need to record the corresponding locations for both TRUE and >>>>> FALSE paths since the VRP might be more accurate on both paths. >>>>> Is only recording the location is enough? >>>>> Do we need to record the pointer to the original condition stmt? >>>> >>>> Just to be clear: I don't plan to work on this myself (I have far too >>>> much already to work on...); I'm assuming Richard Biener is >>>> hoping/planning to implement this himself. >>> >>> While I think some of this might be an improvement to those vast >>> number of "false positive" diagnostics we have from (too) late diagnostic >>> passes I do not have the cycles to work on this. >> >> I can study a little bit more on how to improve the diagnostics for PR >> 109071 first. >> >> FYI, I just updated PR109071’s description as: Need more context for >> -Warray-bounds warnings due to code duplication from jump threading. >> >> As we already agreed, this is NOT a false positive. It caught a real bug in >> linux kernel that need to be patched in the kernel source. (See >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109071#c11) >> >> In order to add more context to the diagnostics to help the end user locate >> the bug accurately in their source code, compiler needs: >> >> 1. During jump threading phase, record the following information for each >> duplicated STMT: >> A. A pointer to the COND stmt; >> B. True/false for such COND >> 2. During array out-of-bound checking phase, whenever locate an out-of-bound >> case, >> A. Use a rich_location for the diagnostic; >> B. Create an instance of diagnositic_path, add events to this >> diagnostic_path based on the COND stmt, True/False info recorded in the STMT; >> C. Call richloc.set_path() to associate the path with the >> rich_location; >> D. Then issue warning with this rich_location instead of the current >> regular location. >> >> Any comment and suggestion to the above? > > I was originally thinking of using location_adhoc_data to store 1.A > and 1.B as a common bit to associate to each > copied stmt. IIRC location_adhoc_data->data is the stmts associated > lexical block so we'd need to stuff another > 'data' field into this struct, like a "copy history" where we could > then even record copied-of-copy-of-X. > locataion_adhoc_data seems imperfectly packed right now, with a 32bit > hole before 'data' and 32bit unused > at its end, so we might get away without memory use by re-ordering > discriminator before 'data' ...
Is “location_adhoc_data” an available data structure in current GCC? I just searched GCC source tree, cannot find it. Qing > > Richard. > >> Thanks. >> >> Qing >> >> >>> >>> Richard. >>> >>>> But feel free to ask me any questions about the diagnostic_path >>>> machinery within the diagnostics subsystem. >>>> >>>> [...snip...] >>>> >>>> Regards >>>> Dave >> >>