Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-07-14 Thread David Malcolm via Gcc-patches
On Mon, 2020-07-13 at 17:07 -0400, Lewis Hyatt wrote:
> On Mon, Jul 13, 2020 at 03:04:20PM -0400, David Malcolm wrote:

[...]

> > OK for trunk with those nits fixed.
> > 
> > Dave
> > 
> > 
> 
> Thanks again for your time! I will address the above and then push in
> a day or two.

Excellent - thanks for all your work on this.

Dave



Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-07-13 Thread Lewis Hyatt via Gcc-patches
On Mon, Jul 13, 2020 at 03:04:20PM -0400, David Malcolm wrote:
> > +@item -fdiagnostics-column-unit=@var{UNIT}
> > +@opindex fdiagnostics-column-unit
> > +Select the units for the column number.  This affects traditional 
> > diagnostics
> > +(in the absence of @option{-fno-show-column}), as well as JSON format
> > +diagnostics if requested.
> > +
> > +The default @var{UNIT}, @samp{display}, considers the number of display
> > +columns occupied by each character.  This may be larger than the number
> > +of bytes required to encode the character, in the case of tab
> > +characters, or it may be smaller, in the case of multibyte characters.
> > +For example, the character ``@U{03C0}'' occupies one display column,
> > +and its UTF-8 encoding requires two bytes; the character ``@U{1F642}''
> > +occupies two display columns, and its UTF-8 encoding requires four
> > +bytes.
> 
> Thanks for reworking this.
> 
> I'm wary of those @U commands - does the generated HTML from "make
> html" work? (similarly for the man page).  Would it be safer to express
> them in the following form?
> 
>  For example, the character ``GREEK SMALL LETTER PI (U+03C0)'' occupies one
>  display column, and its UTF-8 encoding requires two bytes; the character
>  ``SLIGHTLY SMILING FACE' (U+1F642)'' occupies two display columns, and
>  its UTF-8 encoding requires four bytes.
> 
> or somesuch?

The HTML works, yes, but I hadn't thought to check the man page. Seems the
@U notation is carried through unmodified there, which isn't so clear. I
changed it to your suggestion, no need to overcomplicate it.

> 
> > +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte
> > +count in all cases, as was traditionally output by GCC prior to version 
> > 11.1.0.
> 
> [...snip...]
> 
> >  @item -fdiagnostics-format=@var{FORMAT}
> >  @opindex fdiagnostics-format
> >  Select a different format for printing diagnostics.
> > @@ -4764,11 +4791,15 @@ might be printed in JSON form (after formatting) 
> > like this:
> >  "locations": [
> >  @{
> >  "caret": @{
> > +   "display-column": 3,
> > +   "byte-column": 3,
> >  "column": 3,
> >  "file": "misleading-indentation.c",
> >  "line": 15
> >  @},
> >  "finish": @{
> > +   "display-column": 4,
> > +   "byte-column": 4,
> >  "column": 4,
> >  "file": "misleading-indentation.c",
> >  "line": 15
> 
> Nit: the new fields don't line up with the old ones.
> 
> > @@ -4784,6 +4815,8 @@ might be printed in JSON form (after formatting) like 
> > this:
> >  "locations": [
> >  @{
> >  "caret": @{
> > +   "display-column": 5,
> > +   "byte-column": 5,
> >  "column": 5,
> >  "file": "misleading-indentation.c",
> >  "line": 17
> 
> Likewise.

You are referring to the source code as opposed to the generated HTML,
correct? Both look fine to me, I think above effect is due to mixed
tabs+spaces convention in the git diff ironically :). I'll make sure it
looks right in any case.

> 
> [...snip...]
> 
> > diff --git a/gcc/opts.c b/gcc/opts.c
> > index 340d99434b3..525f44d079f 100644
> > --- a/gcc/opts.c
> > +++ b/gcc/opts.c
> > @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "opt-suggestions.h"
> >  #include "diagnostic-color.h"
> >  #include "selftest.h"
> > +#include "cpplib.h"
> 
> Is this new #include still needed?
> 
> [...snip...]
> 
> 
> OK for trunk with those nits fixed.
> 
> Dave
> 
>

Thanks again for your time! I will address the above and then push in a day or 
two.

-Lewis


Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-07-13 Thread David Malcolm via Gcc-patches
> On Wed, Jun 10, 2020 at 12:11:00PM -0400, David Malcolm wrote:
> > Thanks for the patch; sorry about the delay in reviewing it.
> > 
> > Some high-level review points
> > 
> > - I like the patch overall
> > 
> > - This will deserve an item in the release notes
> > 
> > - I don't like adding "global_tabstop" (I don't like global
> > variables).  Is there nowhere else we can handle this? I believe
> > there's a cluster of functions in the callgraph that make use of
> > it; can we simply pass around the tabstop value instead?  "tabstop"
> > seems to have several meanings.  If I'm reading the patch correctly
> >   * "tabstop > 0" means to expand tabs so that column numbers are a
> > multiple of tabstop
> >   * "tabstop == 0" means "don't expand tabs"
> >   * "tabstop < 0" in some places means: use the global_tabstop
value
> > Is it possible to eliminate global_tabstop value?  Or is there some
> > deep reason I'm missing?
> > 
> > I'll do a more thorough review once that's addressed/resolved
(since
> > eliminating global_tabstop might touch a few places).
> >
> 
> Thanks for the feedback! The attached updated patch addresses these
> concerns. Regarding tabstop, I have removed the new static variable
> global_tabstop in charset.c. FWIW, the usage of "tabstop" arguments
in the
> various new APIs did previously work a bit more consistently than you
> described. In all cases "tabstop <= 0" meant to use the default
value,
> otherwise it specified the tabstop to use (with tabstop=1 naturally
> restoring the old behavior of changing tabs to a single space). In
order
> for libcpp to provide this feature (callers can pass tabstop <= 0 to
get a
> default, and the default can in turn by configured when processing
the
> -ftabstop option), it does need to remember the default, and this has
to
> be a file-level static variable because the routines need to work
> independent of any cpp_reader instance. (Some frontends don't use
> libcpp to read their input, for instance.) Anyway, I see the point
that
> this file-level static, being accessible with cpp_set_tabstop() and
> cpp_get_tabstop(), is effectively just a global variable, so I have
> removed this feature, which just means that all callers need to pass
the
> tabstop they want to use. I am now rather using the
diagnostic_context
> object to remember the value passed to -ftabstop. The only place this
> involves global variables is now in c-family/c-indentation.c, where
if I
> understood correctly, the only diagnostic_context available is
global_dc,
> so I am getting the tabstop value from there. Please let me know if
> there's a better way to handle that? Prior to my patch, the tabstop
was
> obtained from a different global variable (extern cpp_options
*cpp_opts),
> so at least conservation of total globals is maintained. :)

Thanks.  That sounds like a good approach.


> Compared to the previous version, this one is a bit longer, since 25
or
> so call sites had to be modified to know the value of -ftabstop. Most
of
> the churn is in diagnostic-show-locus.c, because there are a fair
number of
> static helper functions and helper classes there, which just needed
to
> receive the diagnostic_context object from their callers. I could
> have made this simpler by letting the tabstop argument default to
> something like 8 in all functions that require it... this would
remove the
> need to pass it in all the selftests that are indifferent to it. I
figured
> it would be better to force this argument to be passed, though, or
else in
> the future it may be easy to forget to pass it where it is needed. 

Thanks.

> > Thanks for adding docs; some nits on them:
> > 
> > > --- a/gcc/doc/invoke.texi
> > > +++ b/gcc/doc/invoke.texi
> > 
> > [...snip...]
> > 
> > > +@item -fdiagnostics-column-unit=@var{UNIT}
> > > +@opindex fdiagnostics-column-unit
> > > +Select the units for the column number.  This affects
traditional diagnostics
> > > +(in the absence of @option{-fno-show-column}), as well as JSON
format
> > > +diagnostics if requested.
> > > +
> > > +The default @var{UNIT}, @samp{display}, considers the number of
display columns
> > > +occupied by each character.  This may be larger than the number
of bytes
> > > +occupied, in the case of tab characters, or it may be smaller,
in the case of
> > > +multibyte characters.  For example, the UTF-8 character ``@U{03C
0}'' occupies
> > > +two bytes and one display column, while the character ``@U{1F642
}'' occupies
> > > +four bytes and two display columns.
> > 
> > This is imprecise.  A unicode code point occupies some number of
display columns,
> > and its *UTF-8 encoding* occupies some number of bytes.
> > 
> > [and my inner pedant is now thinking: what about combining
diacritics? 
> > But I don't think we can ever issue a diagnostic on a diacritic; I
> > *think* we only ever care about the per-glyph level]
> > 
> > > +Setting @var{UNIT} to @samp{byte} changes the column number to
the
> > raw byte
> > > +count in all cases, as was 

Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-06-11 Thread Lewis Hyatt via Gcc-patches
On Wed, Jun 10, 2020 at 12:11:00PM -0400, David Malcolm wrote:
> Thanks for the patch; sorry about the delay in reviewing it.
> 
> Some high-level review points
> 
> - I like the patch overall
> 
> - This will deserve an item in the release notes
> 
> - I don't like adding "global_tabstop" (I don't like global
> variables).  Is there nowhere else we can handle this? I believe
> there's a cluster of functions in the callgraph that make use of
> it; can we simply pass around the tabstop value instead?  "tabstop"
> seems to have several meanings.  If I'm reading the patch correctly
>   * "tabstop > 0" means to expand tabs so that column numbers are a
> multiple of tabstop
>   * "tabstop == 0" means "don't expand tabs"
>   * "tabstop < 0" in some places means: use the global_tabstop value
> Is it possible to eliminate global_tabstop value?  Or is there some
> deep reason I'm missing?
> 
> I'll do a more thorough review once that's addressed/resolved (since
> eliminating global_tabstop might touch a few places).
>

Thanks for the feedback! The attached updated patch addresses these
concerns. Regarding tabstop, I have removed the new static variable
global_tabstop in charset.c. FWIW, the usage of "tabstop" arguments in the
various new APIs did previously work a bit more consistently than you
described. In all cases "tabstop <= 0" meant to use the default value,
otherwise it specified the tabstop to use (with tabstop=1 naturally
restoring the old behavior of changing tabs to a single space). In order
for libcpp to provide this feature (callers can pass tabstop <= 0 to get a
default, and the default can in turn by configured when processing the
-ftabstop option), it does need to remember the default, and this has to
be a file-level static variable because the routines need to work
independent of any cpp_reader instance. (Some frontends don't use
libcpp to read their input, for instance.) Anyway, I see the point that
this file-level static, being accessible with cpp_set_tabstop() and
cpp_get_tabstop(), is effectively just a global variable, so I have
removed this feature, which just means that all callers need to pass the
tabstop they want to use. I am now rather using the diagnostic_context
object to remember the value passed to -ftabstop. The only place this
involves global variables is now in c-family/c-indentation.c, where if I
understood correctly, the only diagnostic_context available is global_dc,
so I am getting the tabstop value from there. Please let me know if
there's a better way to handle that? Prior to my patch, the tabstop was
obtained from a different global variable (extern cpp_options *cpp_opts),
so at least conservation of total globals is maintained. :)

Compared to the previous version, this one is a bit longer, since 25 or
so call sites had to be modified to know the value of -ftabstop. Most of
the churn is in diagnostic-show-locus.c, because there are a fair number of
static helper functions and helper classes there, which just needed to
receive the diagnostic_context object from their callers. I could
have made this simpler by letting the tabstop argument default to
something like 8 in all functions that require it... this would remove the
need to pass it in all the selftests that are indifferent to it. I figured
it would be better to force this argument to be passed, though, or else in
the future it may be easy to forget to pass it where it is needed. 

> Thanks for adding docs; some nits on them:
> 
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> 
> [...snip...]
> 
> > +@item -fdiagnostics-column-unit=@var{UNIT}
> > +@opindex fdiagnostics-column-unit
> > +Select the units for the column number.  This affects traditional 
> > diagnostics
> > +(in the absence of @option{-fno-show-column}), as well as JSON format
> > +diagnostics if requested.
> > +
> > +The default @var{UNIT}, @samp{display}, considers the number of display 
> > columns
> > +occupied by each character.  This may be larger than the number of bytes
> > +occupied, in the case of tab characters, or it may be smaller, in the case 
> > of
> > +multibyte characters.  For example, the UTF-8 character ``@U{03C0}'' 
> > occupies
> > +two bytes and one display column, while the character ``@U{1F642}'' 
> > occupies
> > +four bytes and two display columns.
> 
> This is imprecise.  A unicode code point occupies some number of display 
> columns,
> and its *UTF-8 encoding* occupies some number of bytes.
> 
> [and my inner pedant is now thinking: what about combining diacritics? 
> But I don't think we can ever issue a diagnostic on a diacritic; I
> *think* we only ever care about the per-glyph level]
> 
> > +Setting @var{UNIT} to @samp{byte} changes the column number to the
> raw byte
> > +count in all cases, as was traditionally output by GCC prior to version 
> > 11.1.0.
> > +
> > +@item -fdiagnostics-column-origin=@var{ORIGIN}
> > +@opindex fdiagnostics-column-origin
> > +Select the origin for column numbers, i.e. the 

Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-06-10 Thread David Malcolm via Gcc-patches
On Fri, 2020-05-08 at 15:35 -0400, Lewis Hyatt wrote:
> On Fri, Jan 31, 2020 at 03:31:59PM -0500, David Malcolm wrote:
> > On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> > > Hello-
> > > 
> > > Here is the second patch that I mentioned when I submitted the
> > > other
> > > related
> > > patch (which is awaiting review):
> > > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html. 
> > 
> > Sorry about that; I'm v. busy with analyzer bugs right now.
> > 
> > > This second patch
> > > is based on top of the first one and it closes out PR49973 and
> > > PR86904 by
> > > adding the new option -fdiagnostics-column-unit=[display|byte].
> > > This
> > > allows
> > > to specify whether columns are output as simple byte counts (the
> > > current
> > > behavior), or as display columns including handling multibyte
> > > characters and
> > > tabs. The patch makes display columns the new default.
> > > Additionally,
> > > a
> > > second new option -fdiagnostics-column-origin is added, which
> > > allows
> > > to make
> > > the column 0-based (or N-based for any N) instead of 1-based. The
> > > default
> > > remains at 1-based as it is now.
> > > 
> > > A number of testcases were explicitly testing for the old
> > > behavior,
> > > so I
> > > have updated them to test for the new behavior instead, since the
> > > column
> > > number adjusted for tabs is more natural to test for, and matches
> > > what
> > > editors typically show (give or take 1 for the origin
> > > convention).
> > > 
> > > One other testcase (go.dg/arrayclear.go) was a bit of an oddity.
> > > It
> > > failed
> > > after this patch, although it doesn't test for any column
> > > numbers.
> > > The
> > > answer turned out to be, this test checks for identical error
> > > text on
> > > two
> > > different lines. When the column units are changed to display
> > > columns, then
> > > the column of the second error happens to match the line of the
> > > first
> > > one. dejagnu then misinterprets the second error as if it matched
> > > the
> > > location of the first one (it doesn't distinguish whether it
> > > checks
> > > for the
> > > line number or the column number in the output). I added a
> > > comment to
> > > the
> > > test explaining the situation; since adding the comment has the
> > > side
> > > effect
> > > of making the first line number no longer match the second column
> > > number, it
> > > also makes the test pass again.
> > > 
> > > It wasn't quite clear to me whether this change was appropriate
> > > for
> > > GCC 10
> > > or not at this point. We discussed it a couple months ago here:
> > > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html. Either
> > > way,
> > > I hope
> > > it isn't a problem that I submitted the patch for review now,
> > > whether
> > > it
> > > will end up in 10 or 11. Please let me know what's normally
> > > expected?
> > > Thanks!
> > 
> > Thanks Lewis.
> > 
> > This patch looks very promising, but should wait until gcc 11;
> > we're
> > trying to stabilize gcc 10 right now (I'm knee-deep in analyzer
> > bug-
> > fixing, so I don't want to add any more diagnostics changes).
> > 
> 
> Hi Dave-
> 
> Well GCC 10 was released for a whole day so I thought I would bug you
> with this
> patch again now :). To summarize, I previously sent this in two
> separate parts.
> 
> Part 1: 
> https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01626.html
> Part 2: 
> https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg02108.html
> 
> Part 1 added the support for converting tabs to spaces when
> outputting
> diagnostics. Part 2 added the new options -fdiagnostics-column-unit
> and
> -fdiagnostics-column-origin to control whether the column number is
> printed
> in display or byte units. Together they resolve both PR49973 and
> PR86904.
> 
> You provided me with feedback on part 2, which is quoted below with
> some
> notes interspersed. The new version of the patch incorporates all of
> your
> suggestions. Part 1 has not changed other than some trivial rebasing
> conflicts. The two patches touch nearly disjoint sets of files and
> are
> logically linked together, so I thought it would be simpler if I just
> sent
> one combined patch now. If you prefer them to be separated as before,
> please
> let me know and I can send them that way as well.
> 
> Bootstrap and reg tests were done on x86-64 Linux for all
> languages.  Tests
> look good:
> 
> type, before, after
> FAIL 96 96
> PASS 474637 475097
> UNSUPPORTED 11607 11607
> UNTESTED 195 195
> XFAIL 1816 1816
> XPASS 36 362

Thanks for the patch; sorry about the delay in reviewing it.

Some high-level review points

- I like the patch overall

- This will deserve an item in the release notes

- I don't like adding "global_tabstop" (I don't like global
variables).  Is there nowhere else we can handle this? I believe
there's a cluster of functions in the callgraph that make use of
it; can we simply pass around the tabstop value instead?  "tabstop"
seems to have 

Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-05-08 Thread Lewis Hyatt via Gcc-patches
On Fri, Jan 31, 2020 at 03:31:59PM -0500, David Malcolm wrote:
> On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> > Hello-
> > 
> > Here is the second patch that I mentioned when I submitted the other
> > related
> > patch (which is awaiting review):
> > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html. 
> 
> Sorry about that; I'm v. busy with analyzer bugs right now.
> 
> > This second patch
> > is based on top of the first one and it closes out PR49973 and
> > PR86904 by
> > adding the new option -fdiagnostics-column-unit=[display|byte]. This
> > allows
> > to specify whether columns are output as simple byte counts (the
> > current
> > behavior), or as display columns including handling multibyte
> > characters and
> > tabs. The patch makes display columns the new default. Additionally,
> > a
> > second new option -fdiagnostics-column-origin is added, which allows
> > to make
> > the column 0-based (or N-based for any N) instead of 1-based. The
> > default
> > remains at 1-based as it is now.
> > 
> > A number of testcases were explicitly testing for the old behavior,
> > so I
> > have updated them to test for the new behavior instead, since the
> > column
> > number adjusted for tabs is more natural to test for, and matches
> > what
> > editors typically show (give or take 1 for the origin convention).
> > 
> > One other testcase (go.dg/arrayclear.go) was a bit of an oddity. It
> > failed
> > after this patch, although it doesn't test for any column numbers.
> > The
> > answer turned out to be, this test checks for identical error text on
> > two
> > different lines. When the column units are changed to display
> > columns, then
> > the column of the second error happens to match the line of the first
> > one. dejagnu then misinterprets the second error as if it matched the
> > location of the first one (it doesn't distinguish whether it checks
> > for the
> > line number or the column number in the output). I added a comment to
> > the
> > test explaining the situation; since adding the comment has the side
> > effect
> > of making the first line number no longer match the second column
> > number, it
> > also makes the test pass again.
> > 
> > It wasn't quite clear to me whether this change was appropriate for
> > GCC 10
> > or not at this point. We discussed it a couple months ago here:
> > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html. Either way,
> > I hope
> > it isn't a problem that I submitted the patch for review now, whether
> > it
> > will end up in 10 or 11. Please let me know what's normally expected?
> > Thanks!
> 
> Thanks Lewis.
> 
> This patch looks very promising, but should wait until gcc 11; we're
> trying to stabilize gcc 10 right now (I'm knee-deep in analyzer bug-
> fixing, so I don't want to add any more diagnostics changes).
>

Hi Dave-

Well GCC 10 was released for a whole day so I thought I would bug you with this
patch again now :). To summarize, I previously sent this in two separate parts.

Part 1: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01626.html
Part 2: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg02108.html

Part 1 added the support for converting tabs to spaces when outputting
diagnostics. Part 2 added the new options -fdiagnostics-column-unit and
-fdiagnostics-column-origin to control whether the column number is printed
in display or byte units. Together they resolve both PR49973 and PR86904.

You provided me with feedback on part 2, which is quoted below with some
notes interspersed. The new version of the patch incorporates all of your
suggestions. Part 1 has not changed other than some trivial rebasing
conflicts. The two patches touch nearly disjoint sets of files and are
logically linked together, so I thought it would be simpler if I just sent
one combined patch now. If you prefer them to be separated as before, please
let me know and I can send them that way as well.

Bootstrap and reg tests were done on x86-64 Linux for all languages.  Tests
look good:

type, before, after
FAIL 96 96
PASS 474637 475097
UNSUPPORTED 11607 11607
UNTESTED 195 195
XFAIL 1816 1816
XPASS 36 36

> 
> > gcc/ChangeLog:
> > 
> > 2020-01-31  Lewis Hyatt  
> >
> 
> Please reference the PRs here
> 
> [...]
> 
> > gcc/testsuite/ChangeLog:
> > 
> > 2020-01-31  Lewis Hyatt  
> 
> Likewise here.
> 
> [...]
>

Done.

> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index 630c380bd6a..657985450c2 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -1309,6 +1309,14 @@ Enum(diagnostic_url_rule) String(always) 
> > Value(DIAGNOSTICS_URL_YES)
> >  EnumValue
> >  Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
> >  
> > +fdiagnostics-column-unit=
> > +Common Joined RejectNegative Enum(diagnostics_column_unit)
> > +-fdiagnostics-column-unit=[display|byte]   Select units for column numbers.
> Should this line mention the default?
>

Done.

> > +fdiagnostics-column-origin=
> > +Common Joined RejectNegative UInteger
> > 

Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-01-31 Thread Lewis Hyatt
Thanks for taking a look, sorry about that, it's my first new option
:). I will add in the next iteration.

-Lewis

On Fri, Jan 31, 2020 at 5:45 PM Joseph Myers  wrote:
>
> This seems to be missing invoke.texi documentation for the new options.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-01-31 Thread Joseph Myers
This seems to be missing invoke.texi documentation for the new options.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-01-31 Thread Lewis Hyatt
On Fri, Jan 31, 2020 at 3:32 PM David Malcolm  wrote:
>
> On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> > Hello-
> >
> > Here is the second patch that I mentioned when I submitted the other
> > related
> > patch (which is awaiting review):
> > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html.
>
> Sorry about that; I'm v. busy with analyzer bugs right now.

Thanks very much for the feedback. Totally understood that you are
busy with more pressing things for GCC 10. Looking forward to trying
out -fanalyzer myself too. I'll apply these suggestions then and send
an updated version after GCC 10 release.

-Lewis


Re: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-01-31 Thread David Malcolm
On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> Hello-
> 
> Here is the second patch that I mentioned when I submitted the other
> related
> patch (which is awaiting review):
> https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html. 

Sorry about that; I'm v. busy with analyzer bugs right now.

> This second patch
> is based on top of the first one and it closes out PR49973 and
> PR86904 by
> adding the new option -fdiagnostics-column-unit=[display|byte]. This
> allows
> to specify whether columns are output as simple byte counts (the
> current
> behavior), or as display columns including handling multibyte
> characters and
> tabs. The patch makes display columns the new default. Additionally,
> a
> second new option -fdiagnostics-column-origin is added, which allows
> to make
> the column 0-based (or N-based for any N) instead of 1-based. The
> default
> remains at 1-based as it is now.
> 
> A number of testcases were explicitly testing for the old behavior,
> so I
> have updated them to test for the new behavior instead, since the
> column
> number adjusted for tabs is more natural to test for, and matches
> what
> editors typically show (give or take 1 for the origin convention).
> 
> One other testcase (go.dg/arrayclear.go) was a bit of an oddity. It
> failed
> after this patch, although it doesn't test for any column numbers.
> The
> answer turned out to be, this test checks for identical error text on
> two
> different lines. When the column units are changed to display
> columns, then
> the column of the second error happens to match the line of the first
> one. dejagnu then misinterprets the second error as if it matched the
> location of the first one (it doesn't distinguish whether it checks
> for the
> line number or the column number in the output). I added a comment to
> the
> test explaining the situation; since adding the comment has the side
> effect
> of making the first line number no longer match the second column
> number, it
> also makes the test pass again.
> 
> It wasn't quite clear to me whether this change was appropriate for
> GCC 10
> or not at this point. We discussed it a couple months ago here:
> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html. Either way,
> I hope
> it isn't a problem that I submitted the patch for review now, whether
> it
> will end up in 10 or 11. Please let me know what's normally expected?
> Thanks!

Thanks Lewis.

This patch looks very promising, but should wait until gcc 11; we're
trying to stabilize gcc 10 right now (I'm knee-deep in analyzer bug-
fixing, so I don't want to add any more diagnostics changes).


> gcc/ChangeLog:
> 
> 2020-01-31  Lewis Hyatt  
>

Please reference the PRs here

[...]

> gcc/testsuite/ChangeLog:
> 
> 2020-01-31  Lewis Hyatt  

Likewise here.

[...]

> diff --git a/gcc/common.opt b/gcc/common.opt
> index 630c380bd6a..657985450c2 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1309,6 +1309,14 @@ Enum(diagnostic_url_rule) String(always) 
> Value(DIAGNOSTICS_URL_YES)
>  EnumValue
>  Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
>  
> +fdiagnostics-column-unit=
> +Common Joined RejectNegative Enum(diagnostics_column_unit)
> +-fdiagnostics-column-unit=[display|byte] Select units for column numbers.
Should this line mention the default?

> +fdiagnostics-column-origin=
> +Common Joined RejectNegative UInteger
> +-fdiagnostics-column-origin= Set the number of the first column.  
> Default 1-based.

These new options should be documented in gcc/doc/invoke.texi.

[...]

> @@ -43,21 +44,23 @@ static json::array *cur_children_array;
>  /* Generate a JSON object for LOC.  */
>  
>  json::value *
> -json_from_expanded_location (location_t loc)
> +json_from_expanded_location (diagnostic_context *context, location_t loc)
>  {
>expanded_location exploc = expand_location (loc);
>json::object *result = new json::object ();
>if (exploc.file)
>  result->set ("file", new json::string (exploc.file));
>result->set ("line", new json::integer_number (exploc.line));
> -  result->set ("column", new json::integer_number (exploc.column));
> +  const int col = diagnostic_converted_column (context, exploc);
> +  result->set ("column", new json::integer_number (col));

I wonder if the JSON output format should show *both* values: perhaps
add fields "byte-column" and "display-column", and retain the field
"column", which would follow -fdiagnostics-column-unit?

[...]

> @@ -219,6 +220,8 @@ diagnostic_initialize (diagnostic_context *context, int 
> n_opts)
>context->min_margin_width = 0;
>context->show_ruler_p = false;
>context->parseable_fixits_p = false;
> +  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;
> +  context->column_adj = 0;

I'm not sure, but I think I prefer it if we store the column origin
instead, rather than an offset relative to an origin of 1.

[...]

> @@ -338,8 +341,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)
>return