On Wed, May 22, 2024 at 8:53 PM Qing Zhao <qing.z...@oracle.com> wrote:
>
>
>
> > On May 22, 2024, at 03:38, Richard Biener <richard.guent...@gmail.com> 
> > wrote:
> >
> > On Tue, May 21, 2024 at 11:36 PM David Malcolm <dmalc...@redhat.com> wrote:
> >>
> >> On Tue, 2024-05-21 at 15:13 +0000, Qing Zhao wrote:
> >>> Thanks for the comments and suggestions.
> >>>
> >>>> On May 15, 2024, at 10:00, David Malcolm <dmalc...@redhat.com>
> >>>> wrote:
> >>>>
> >>>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> >>>>> On Mon, 13 May 2024, Qing Zhao wrote:
> >>>>>
> >>>>>> -Warray-bounds is an important option to enable linux kernal to
> >>>>>> keep
> >>>>>> the array out-of-bound errors out of the source tree.
> >>>>>>
> >>>>>> However, due to the false positive warnings reported in
> >>>>>> PR109071
> >>>>>> (-Warray-bounds false positive warnings due to code duplication
> >>>>>> from
> >>>>>> jump threading), -Warray-bounds=1 cannot be added on by
> >>>>>> default.
> >>>>>>
> >>>>>> Although it's impossible to elinimate all the false positive
> >>>>>> warnings
> >>>>>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> >>>>>> documentation says "always out of bounds"), we should minimize
> >>>>>> the
> >>>>>> false positive warnings in -Warray-bounds=1.
> >>>>>>
> >>>>>> The root reason for the false positive warnings reported in
> >>>>>> PR109071 is:
> >>>>>>
> >>>>>> When the thread jump optimization tries to reduce the # of
> >>>>>> branches
> >>>>>> inside the routine, sometimes it needs to duplicate the code
> >>>>>> and
> >>>>>> split into two conditional pathes. for example:
> >>>>>>
> >>>>>> The original code:
> >>>>>>
> >>>>>> void sparx5_set (int * ptr, struct nums * sg, int index)
> >>>>>> {
> >>>>>>  if (index >= 4)
> >>>>>>    warn ();
> >>>>>>  *ptr = 0;
> >>>>>>  *val = sg->vals[index];
> >>>>>>  if (index >= 4)
> >>>>>>    warn ();
> >>>>>>  *ptr = *val;
> >>>>>>
> >>>>>>  return;
> >>>>>> }
> >>>>>>
> >>>>>> With the thread jump, the above becomes:
> >>>>>>
> >>>>>> void sparx5_set (int * ptr, struct nums * sg, int index)
> >>>>>> {
> >>>>>>  if (index >= 4)
> >>>>>>    {
> >>>>>>      warn ();
> >>>>>>      *ptr = 0;         // Code duplications since "warn" does
> >>>>>> return;
> >>>>>>      *val = sg->vals[index];   // same this line.
> >>>>>>                                // In this path, since it's
> >>>>>> under
> >>>>>> the condition
> >>>>>>                                // "index >= 4", the compiler
> >>>>>> knows
> >>>>>> the value
> >>>>>>                                // of "index" is larger then 4,
> >>>>>> therefore the
> >>>>>>                                // out-of-bound warning.
> >>>>>>      warn ();
> >>>>>>    }
> >>>>>>  else
> >>>>>>    {
> >>>>>>      *ptr = 0;
> >>>>>>      *val = sg->vals[index];
> >>>>>>    }
> >>>>>>  *ptr = *val;
> >>>>>>  return;
> >>>>>> }
> >>>>>>
> >>>>>> We can see, after the thread jump optimization, the # of
> >>>>>> branches
> >>>>>> inside
> >>>>>> the routine "sparx5_set" is reduced from 2 to 1, however,  due
> >>>>>> to
> >>>>>> the
> >>>>>> code duplication (which is needed for the correctness of the
> >>>>>> code),
> >>>>>> we
> >>>>>> got a false positive out-of-bound warning.
> >>>>>>
> >>>>>> In order to eliminate such false positive out-of-bound warning,
> >>>>>>
> >>>>>> A. Add one more flag for GIMPLE: is_splitted.
> >>>>>> B. During the thread jump optimization, when the basic blocks
> >>>>>> are
> >>>>>>   duplicated, mark all the STMTs inside the original and
> >>>>>> duplicated
> >>>>>>   basic blocks as "is_splitted";
> >>>>>> C. Inside the array bound checker, add the following new
> >>>>>> heuristic:
> >>>>>>
> >>>>>> If
> >>>>>>   1. the stmt is duplicated and splitted into two conditional
> >>>>>> paths;
> >>>>>> +  2. the warning level < 2;
> >>>>>> +  3. the current block is not dominating the exit block
> >>>>>> Then not report the warning.
> >>>>>>
> >>>>>> The false positive warnings are moved from -Warray-bounds=1 to
> >>>>>> -Warray-bounds=2 now.
> >>>>>>
> >>>>>> Bootstrapped and regression tested on both x86 and aarch64.
> >>>>>> adjusted
> >>>>>> -Warray-bounds-61.c due to the false positive warnings.
> >>>>>>
> >>>>>> Let me know if you have any comments and suggestions.
> >>>>>
> >>>>> At the last Cauldron I talked with David Malcolm about these kind
> >>>>> of
> >>>>> issues and thought of instead of suppressing diagnostics to
> >>>>> record
> >>>>> how a block was duplicated.  For jump threading my idea was to
> >>>>> record
> >>>>> the condition that was proved true when entering the path and do
> >>>>> this
> >>>>> by recording the corresponding locations
> >>>
> >>> Is only recording the location for the TRUE path  enough?
> >>> We might need to record the corresponding locations for both TRUE and
> >>> FALSE paths since the VRP might be more accurate on both paths.
> >>> Is only recording the location is enough?
> >>> Do we need to record the pointer to the original condition stmt?
> >>
> >> Just to be clear: I don't plan to work on this myself (I have far too
> >> much already to work on...); I'm assuming Richard Biener is
> >> hoping/planning to implement this himself.
> >
> > While I think some of this might be an improvement to those vast
> > number of "false positive" diagnostics we have from (too) late diagnostic
> > passes I do not have the cycles to work on this.
>
> I can study a little bit more on how to improve the diagnostics for PR 109071 
> first.
>
> FYI, I just updated PR109071’s description as: Need more context for 
> -Warray-bounds warnings due to code duplication from jump threading.
>
> As we already agreed, this is NOT a false positive. It caught a real bug in 
> linux kernel that need to be patched in the kernel source. (See 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109071#c11)
>
> In order to add more context to the diagnostics to help the end user locate 
> the bug accurately in their source code, compiler needs:
>
> 1. During jump threading phase, record the following information for each 
> duplicated STMT:
>         A. A pointer to the COND stmt;
>         B. True/false for such COND
> 2. During array out-of-bound checking phase, whenever locate an out-of-bound 
> case,
>         A. Use a rich_location for the diagnostic;
>         B. Create an instance of diagnositic_path, add events to this 
> diagnostic_path based on the COND stmt, True/False info recorded in the STMT;
>         C. Call richloc.set_path() to associate the path with the 
> rich_location;
>         D. Then issue warning with this rich_location instead of the current 
> regular location.
>
> Any comment and suggestion to the above?

I was originally thinking of using location_adhoc_data to store 1.A
and 1.B as a common bit to associate to each
copied stmt.  IIRC location_adhoc_data->data is the stmts associated
lexical block so we'd need to stuff another
'data' field into this struct, like a "copy history" where we could
then even record copied-of-copy-of-X.
locataion_adhoc_data seems imperfectly packed right now, with a 32bit
hole before 'data' and 32bit unused
at its end, so we might get away without memory use by re-ordering
discriminator before 'data' ...

Richard.

> Thanks.
>
> Qing
>
>
> >
> > Richard.
> >
> >> But feel free to ask me any questions about the diagnostic_path
> >> machinery within the diagnostics subsystem.
> >>
> >> [...snip...]
> >>
> >> Regards
> >> Dave
>
>

Reply via email to