On Wed, May 22, 2024 at 8:53 PM Qing Zhao <qing.z...@oracle.com> wrote: > > > > > On May 22, 2024, at 03:38, Richard Biener <richard.guent...@gmail.com> > > wrote: > > > > On Tue, May 21, 2024 at 11:36 PM David Malcolm <dmalc...@redhat.com> wrote: > >> > >> On Tue, 2024-05-21 at 15:13 +0000, Qing Zhao wrote: > >>> Thanks for the comments and suggestions. > >>> > >>>> On May 15, 2024, at 10:00, David Malcolm <dmalc...@redhat.com> > >>>> wrote: > >>>> > >>>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > >>>>> On Mon, 13 May 2024, Qing Zhao wrote: > >>>>> > >>>>>> -Warray-bounds is an important option to enable linux kernal to > >>>>>> keep > >>>>>> the array out-of-bound errors out of the source tree. > >>>>>> > >>>>>> However, due to the false positive warnings reported in > >>>>>> PR109071 > >>>>>> (-Warray-bounds false positive warnings due to code duplication > >>>>>> from > >>>>>> jump threading), -Warray-bounds=1 cannot be added on by > >>>>>> default. > >>>>>> > >>>>>> Although it's impossible to elinimate all the false positive > >>>>>> warnings > >>>>>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > >>>>>> documentation says "always out of bounds"), we should minimize > >>>>>> the > >>>>>> false positive warnings in -Warray-bounds=1. > >>>>>> > >>>>>> The root reason for the false positive warnings reported in > >>>>>> PR109071 is: > >>>>>> > >>>>>> When the thread jump optimization tries to reduce the # of > >>>>>> branches > >>>>>> inside the routine, sometimes it needs to duplicate the code > >>>>>> and > >>>>>> split into two conditional pathes. for example: > >>>>>> > >>>>>> The original code: > >>>>>> > >>>>>> void sparx5_set (int * ptr, struct nums * sg, int index) > >>>>>> { > >>>>>> if (index >= 4) > >>>>>> warn (); > >>>>>> *ptr = 0; > >>>>>> *val = sg->vals[index]; > >>>>>> if (index >= 4) > >>>>>> warn (); > >>>>>> *ptr = *val; > >>>>>> > >>>>>> return; > >>>>>> } > >>>>>> > >>>>>> With the thread jump, the above becomes: > >>>>>> > >>>>>> void sparx5_set (int * ptr, struct nums * sg, int index) > >>>>>> { > >>>>>> if (index >= 4) > >>>>>> { > >>>>>> warn (); > >>>>>> *ptr = 0; // Code duplications since "warn" does > >>>>>> return; > >>>>>> *val = sg->vals[index]; // same this line. > >>>>>> // In this path, since it's > >>>>>> under > >>>>>> the condition > >>>>>> // "index >= 4", the compiler > >>>>>> knows > >>>>>> the value > >>>>>> // of "index" is larger then 4, > >>>>>> therefore the > >>>>>> // out-of-bound warning. > >>>>>> warn (); > >>>>>> } > >>>>>> else > >>>>>> { > >>>>>> *ptr = 0; > >>>>>> *val = sg->vals[index]; > >>>>>> } > >>>>>> *ptr = *val; > >>>>>> return; > >>>>>> } > >>>>>> > >>>>>> We can see, after the thread jump optimization, the # of > >>>>>> branches > >>>>>> inside > >>>>>> the routine "sparx5_set" is reduced from 2 to 1, however, due > >>>>>> to > >>>>>> the > >>>>>> code duplication (which is needed for the correctness of the > >>>>>> code), > >>>>>> we > >>>>>> got a false positive out-of-bound warning. > >>>>>> > >>>>>> In order to eliminate such false positive out-of-bound warning, > >>>>>> > >>>>>> A. Add one more flag for GIMPLE: is_splitted. > >>>>>> B. During the thread jump optimization, when the basic blocks > >>>>>> are > >>>>>> duplicated, mark all the STMTs inside the original and > >>>>>> duplicated > >>>>>> basic blocks as "is_splitted"; > >>>>>> C. Inside the array bound checker, add the following new > >>>>>> heuristic: > >>>>>> > >>>>>> If > >>>>>> 1. the stmt is duplicated and splitted into two conditional > >>>>>> paths; > >>>>>> + 2. the warning level < 2; > >>>>>> + 3. the current block is not dominating the exit block > >>>>>> Then not report the warning. > >>>>>> > >>>>>> The false positive warnings are moved from -Warray-bounds=1 to > >>>>>> -Warray-bounds=2 now. > >>>>>> > >>>>>> Bootstrapped and regression tested on both x86 and aarch64. > >>>>>> adjusted > >>>>>> -Warray-bounds-61.c due to the false positive warnings. > >>>>>> > >>>>>> Let me know if you have any comments and suggestions. > >>>>> > >>>>> At the last Cauldron I talked with David Malcolm about these kind > >>>>> of > >>>>> issues and thought of instead of suppressing diagnostics to > >>>>> record > >>>>> how a block was duplicated. For jump threading my idea was to > >>>>> record > >>>>> the condition that was proved true when entering the path and do > >>>>> this > >>>>> by recording the corresponding locations > >>> > >>> Is only recording the location for the TRUE path enough? > >>> We might need to record the corresponding locations for both TRUE and > >>> FALSE paths since the VRP might be more accurate on both paths. > >>> Is only recording the location is enough? > >>> Do we need to record the pointer to the original condition stmt? > >> > >> Just to be clear: I don't plan to work on this myself (I have far too > >> much already to work on...); I'm assuming Richard Biener is > >> hoping/planning to implement this himself. > > > > While I think some of this might be an improvement to those vast > > number of "false positive" diagnostics we have from (too) late diagnostic > > passes I do not have the cycles to work on this. > > I can study a little bit more on how to improve the diagnostics for PR 109071 > first. > > FYI, I just updated PR109071’s description as: Need more context for > -Warray-bounds warnings due to code duplication from jump threading. > > As we already agreed, this is NOT a false positive. It caught a real bug in > linux kernel that need to be patched in the kernel source. (See > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109071#c11) > > In order to add more context to the diagnostics to help the end user locate > the bug accurately in their source code, compiler needs: > > 1. During jump threading phase, record the following information for each > duplicated STMT: > A. A pointer to the COND stmt; > B. True/false for such COND > 2. During array out-of-bound checking phase, whenever locate an out-of-bound > case, > A. Use a rich_location for the diagnostic; > B. Create an instance of diagnositic_path, add events to this > diagnostic_path based on the COND stmt, True/False info recorded in the STMT; > C. Call richloc.set_path() to associate the path with the > rich_location; > D. Then issue warning with this rich_location instead of the current > regular location. > > Any comment and suggestion to the above?
I was originally thinking of using location_adhoc_data to store 1.A and 1.B as a common bit to associate to each copied stmt. IIRC location_adhoc_data->data is the stmts associated lexical block so we'd need to stuff another 'data' field into this struct, like a "copy history" where we could then even record copied-of-copy-of-X. locataion_adhoc_data seems imperfectly packed right now, with a 32bit hole before 'data' and 32bit unused at its end, so we might get away without memory use by re-ordering discriminator before 'data' ... Richard. > Thanks. > > Qing > > > > > > Richard. > > > >> But feel free to ask me any questions about the diagnostic_path > >> machinery within the diagnostics subsystem. > >> > >> [...snip...] > >> > >> Regards > >> Dave > >