Re: Use predicates for RTL objects

2019-08-08 Thread Arvind Sankar
On Thu, Aug 08, 2019 at 01:26:42PM -0500, Segher Boessenkool wrote:
> LABEL_REF_P works out nicely because it is referring to something that
> is data, is not an operator.  "Leaves" in an RTL expression, if you want
> to look at it that way.
> 
> Predicates for other RTX codes aren't always as obvious, see CONST_P as
> example.  PLUS_P would be a bit borderline.
> 
> Part of the reason why REG_P and MEM_P and the like are nice, is that
> these predicates are often used in bigger conditions, maybe together
> with some XEXP and whatnot.  Is that the case for PLUS_P?
> 

Yes, it's used quite often in more complex conditions checking the
operands (eg to see whether they're constants), or applied to XEXP's
itself.

But I'm in agreement that PLUS_P just seems odd somehow. The leaf/data
vs operator distinction makes sense, maybe RTXOP_PLUS_P, but then you'd
want that to check if it was being called on an operator, so I don't
know if you'd do it unless/until we eventually have an rtx_op class and
have done the other bits of converting to C++.


Re: Use predicates for RTL objects

2019-08-08 Thread Arvind Sankar
On Thu, Aug 08, 2019 at 08:14:01PM +0200, Jakub Jelinek wrote:
> On Thu, Aug 08, 2019 at 01:12:33PM -0400, Arvind Sankar wrote:
> > On Thu, Aug 08, 2019 at 03:04:53PM +, Michael Matz wrote:
> > > Hi,
> > > 
> > > On Wed, 7 Aug 2019, Arvind Sankar wrote:
> > > 
> > > > => x->is_a (REG)
> > > 
> > > Oh god, please no.  Currently at least the RTL parts of GCC still have 
> > > mostly a consistent and obvious style, which is a good thing.  I have no 
> > > idea why anyone would think the above is easier to read than REG_P (x).
> > > 
> > > 
> > > Ciao,
> > > Michael.
> > > P.S: Consider this: the current style served us quite well for the last 
> > > 35 
> > > or so years, so before suggesting style changes, shouldn't you first work 
> > > on the sources for some time?
> > 
> > Well, the main point of the email was to ask for review of a patchset
> > that attempts to make progress on a TODO that has been outstanding for
> > at least 15 of those 35 years.
> 
> Not everything that is in some TODO list somewhere is something generally
> agreed upon.
> 
>   Jakub

Surely there's general agreement on using REG_P etc? I don't see anyone
objecting to it, and that's all the patchset does: to avoid any
confusion the second half of the email asking about opinions on is_a is
entirely independent from the first half describing the existing patchset.


Re: Use predicates for RTL objects

2019-08-08 Thread Arvind Sankar
On Thu, Aug 08, 2019 at 10:04:14AM -0500, Segher Boessenkool wrote:
> On Wed, Aug 07, 2019 at 02:58:12PM -0400, Arvind Sankar wrote:
> > > > > code does is cheap.  With "x->is_a (PLUS)", who knows what is 
> > > > > happening
> > > That's not my point -- my point was that it is *obvious* the way things
> > > are now, which is nice.
> > 
> > My reply is pointing out that it is just as (non-)obvious with or
> > without that inline function, if you want to use any of the helper
> > macros.
> 
> But that is not what you suggested, or at least not how I read it.
>   x->is_a (PLUS)
> is not obviously cheap or simple at all, while
>   GET_CODE (x) == PLUS
> obviously *is*.
> 
> The former also isn't readable.

That's a matter of what style you prefer and it seems like no-one else
shares my preference, and I accept that we're not going to make such a
change.

But there is really nothing more or less obvious about it. It's easy to
go look at the code, as you probably once did when checking what
GET_CODE or REG_P actually did, and is_a methods are expected to be
lightweight. Regarding hiding things, consider that we just added
LABEL_REF_P, which is for a comparison that happens less than half as
often as PLUS in the codebase (and I think it was actually only
used in one place). It was done presumably because the author/reviewers
felt that LABEL_REF_P (x) is more readable than GET_CODE (x) == LABEL_REF, 
even if the latter might be ever so slightly more transparent as to what
its doing than the former.

> 
> Indirection is *the* evil in programming.

Some would say that "All problems in computer science can be solved by
another level of indirection" ;)


Re: Use predicates for RTL objects

2019-08-08 Thread Arvind Sankar
On Thu, Aug 08, 2019 at 03:04:53PM +, Michael Matz wrote:
> Hi,
> 
> On Wed, 7 Aug 2019, Arvind Sankar wrote:
> 
> > => x->is_a (REG)
> 
> Oh god, please no.  Currently at least the RTL parts of GCC still have 
> mostly a consistent and obvious style, which is a good thing.  I have no 
> idea why anyone would think the above is easier to read than REG_P (x).
> 
> 
> Ciao,
> Michael.
> P.S: Consider this: the current style served us quite well for the last 35 
> or so years, so before suggesting style changes, shouldn't you first work 
> on the sources for some time?

Well, the main point of the email was to ask for review of a patchset
that attempts to make progress on a TODO that has been outstanding for
at least 15 of those 35 years.


Re: Use predicates for RTL objects

2019-08-08 Thread Arvind Sankar
On Thu, Aug 08, 2019 at 08:31:28AM -0600, Jeff Law wrote:
> True, but they could be.  When David was working in this space a few
> years ago I concluded that the main value in sub-classing the various
> RTL operators just wansn't worth the effort.  Instead we focused on
> starting to tear apart things like the toplevel objects into rtx_insn
> and the like.  THere's little value in treating those as simple RTXs.
> INSN_LIST and the like were also ripe for this treatment.
> 
> The biggest value in making a real class for the operators things would
> be to move the runtime RTL checking into a compile-time check.  But I
> couldn't really green light something like that without first completing
> the rtx_insn changes.

Are there any notes or old discussion threads on what remains? I would
be interested in taking a look if no-one else is.

Thanks

> 
> > 
> > 
> > If you really want to convert RTL to C++, you should start with getting
> > rid of rtx_format and rtx_class, and make REG_P etc. work just as they
> > have always done.
> Yup.  And continue pushing the rtx_insn bits deeper, tackling INSN_LIST,
> etc.
> 
> jeff


Copyright assignment forms

2019-08-07 Thread Arvind Sankar
Hi, I would like to begin contributing to the GCC project, and Segher
suggested that I should take care of the copyright assignment forms.

I am reaching out to obtain them. I am currently unemployed so there's
no issue with an employer.

Thanks


Re: Use predicates for RTL objects

2019-08-07 Thread Arvind Sankar
On Wed, Aug 07, 2019 at 01:05:51PM -0500, Segher Boessenkool wrote:
> On Wed, Aug 07, 2019 at 01:39:53PM -0400, Arvind Sankar wrote:
> > On Wed, Aug 07, 2019 at 12:33:53PM -0500, Segher Boessenkool wrote:
> > > On Wed, Aug 07, 2019 at 12:15:29PM -0400, Arvind Sankar wrote:
> > > > I would also like to get some comments on the following idea to make the
> > > > code checks more readable: I am thinking of adding
> > > > bool rtx_def::is_a (enum rtx_code) const
> > > > This would allow us to make all the rtx_code comparisons more readable
> > > > without having to define individual macros for each.
> > > > i.e.,
> > > > REG_P (x)  => x->is_a (REG)
> > > > GET_CODE (x) == PLUS   => x->is_a (PLUS)
> > > > GET_CODE (PATTERN (x)) == SEQUENCE => PATTERN (x)->is_a 
> > > > (SEQUENCE)
> > > 
> > > That makes things much worse.  Not only is it less readable (IMO), but
> > > the "is_a" idiom is used to check if something is of a certain class,
> > > which is not the case here.
> > 
> > Well, the rtx_code *is* kind of a class. It determines what fields of
> > the rtx are valid and what they contain etc.
> 
> It is not a class in the C++ sense.  Confusing this is not useful for
> anyone.

True, but the code is semantically a type identifier. Using the common
is_a idiom IMO makes it more readable for that reason, but that is a
matter of personal opinion, hence my request for comments.

> 
> > > In "GET_CODE (x) == PLUS" it is clear that what the resulting machine
> > > code does is cheap.  With "x->is_a (PLUS)", who knows what is happening
> > > below the covers!
> > 
...
> > predicate macros whose implementation is more complex than checking the
> > code field. You basically have to trust that it's sensibly implemented,
> > i.e. that it is as efficiently implemented as it can be.
> 
> That's not my point -- my point was that it is *obvious* the way things
> are now, which is nice.

My reply is pointing out that it is just as (non-)obvious with or
without that inline function, if you want to use any of the helper
macros. It wouldn't be a reasonable argument to say that INSN_P (x) is
obviously efficient while x->is_a (PLUS) is hiding some potentially
nasty things inside. There's around 200 uses of GET_CODE being compared
to PLUS in the non-target-specific code, so it's not that rare. It would
be nice to replace it with something better, but PLUS_P (x) just reads
badly to me.

> 
> > I don't think
> > people writing RTL transformations should be overly worried about what
> > machine code their predicates are generating, especially when
> > they're calling the defined API for it.
> 
> The whole *design* of RTL is based around us caring a whole lot.

I'm not saying that we don't care about performance. My point is that
if you know that what you're checking is just whether this RTX is a
PLUS-coded RTX or not, you should not care exactly which machine
instructions that check will generate, because you already know that the
test should be trivial, and any sensible implementation will be
efficient enough, especially after a compiler is done with it. i.e.
being sure of getting efficient machine code is not a reason for
avoiding macros/inline functions.


Re: Use predicates for RTL objects

2019-08-07 Thread Arvind Sankar
On Wed, Aug 07, 2019 at 12:33:53PM -0500, Segher Boessenkool wrote:
> On Wed, Aug 07, 2019 at 12:15:29PM -0400, Arvind Sankar wrote:
> > I would also like to get some comments on the following idea to make the
> > code checks more readable: I am thinking of adding
> > bool rtx_def::is_a (enum rtx_code) const
> > This would allow us to make all the rtx_code comparisons more readable
> > without having to define individual macros for each.
> > i.e.,
> > REG_P (x)  => x->is_a (REG)
> > GET_CODE (x) == PLUS   => x->is_a (PLUS)
> > GET_CODE (PATTERN (x)) == SEQUENCE => PATTERN (x)->is_a (SEQUENCE)
> 
> That makes things much worse.  Not only is it less readable (IMO), but
> the "is_a" idiom is used to check if something is of a certain class,
> which is not the case here.
> 

Well, the rtx_code *is* kind of a class. It determines what fields of
the rtx are valid and what they contain etc.

> In "GET_CODE (x) == PLUS" it is clear that what the resulting machine
> code does is cheap.  With "x->is_a (PLUS)", who knows what is happening
> below the covers!

We already have, for eg, is_a  (x), and there are
predicate macros whose implementation is more complex than checking the
code field. You basically have to trust that it's sensibly implemented,
i.e. that it is as efficiently implemented as it can be. I don't think
people writing RTL transformations should be overly worried about what
machine code their predicates are generating, especially when
they're calling the defined API for it.

> 
> (And "REG_P" and similar are much shorter code to type).
> 

That is true for the ones that exist, but there are lots more that don't
and it doesn't really make sense to add individual macros for all of
them.

> 
> Segher


Use predicates for RTL objects

2019-08-07 Thread Arvind Sankar
Hi,
I have posted a patch series [1] for converting some of the RTL code to
use predicate macros, as described in the suggestions for beginner GCC
projects [2]. Segher was kind enough to give some comments on the
initial posting [3].

The code has been bootstrapped natively on x86_64, and I have built
cross-compilers for all targets except tilegx which gave build errors
even on trunk. The compiler object files are identical with trunk except
for *-checksum.o and string tables in build/gen*.o which change from
GET_CODE (..) == .. etc to the predicate macro.

Not all possible changes have been made yet, I figured I'd send this out
to check first.

I am hoping one of the maintainers will be able to take some time to
review the patches -- a few are quite large but are mechanical.

I would also like to get some comments on the following idea to make the
code checks more readable: I am thinking of adding
bool rtx_def::is_a (enum rtx_code) const
This would allow us to make all the rtx_code comparisons more readable
without having to define individual macros for each.
i.e.,
REG_P (x)  => x->is_a (REG)
GET_CODE (x) == PLUS   => x->is_a (PLUS)
GET_CODE (PATTERN (x)) == SEQUENCE => PATTERN (x)->is_a (SEQUENCE)

More complex predicates could be left as macros or additional methods
could be defined like rtx_def::is_a_nondebug_insn etc. I think this
should mostly be an improvement, although the comparisons around INSN
may become slightly more confusing: currently, INSN_P (x) is different
from is_a  (x), and using something like x->is_a_insn () for
the former would probably increase confusion.

Thanks.

[1] https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00327.html
[2] https://gcc.gnu.org/projects/beginner.html
[3] https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00171.html


Re: glibc-2.2{8,9} multiarch build failure on x86_64 with gcc 9

2019-05-01 Thread Arvind Sankar
On Wed, May 01, 2019 at 02:48:30PM -0400, Arvind Sankar wrote:
> On Wed, May 01, 2019 at 02:36:10PM -0400, Carlos O'Donell wrote:
> > On 5/1/19 2:24 PM, Arvind Sankar wrote:
> > > gcc 9 when configured for fortran installs ISO_Fortran_Binding.h in
> > >   gfor_cdir = 
> > > $(libdir)/gcc/$(target_alias)/$(gcc_version)$(MULTISUBDIR)/include
> > > For x86_64's 32-bit architecture support, this creates a subdirectory 
> > > named 'include'
> > > inside $(libsubdir)/32 which didn't use to exist in gcc 8.
> > This doesn't seem correct.
> > 
> > I would have expected the header to exist under the target name, for 
> > example:
> > 
> > /usr/lib/gcc/i686-redhat-linux/9/include/ISO_Fortran_binding.h
> > 
> > This way it doesn't conflict with other uses.
> > 
> > Perhaps there is enough variability in the way you build, package, and 
> > install this
> > that it can break in some configurations.
> > 
> > I think the gcc community needs to comment on this.
> > 
> > -- 
> > Cheers,
> > Carlos.
> 
> To be clear, it does exist there as well (I think because the native arch has 
> an
> empty MULTISUBDIR. The breakage happens because it _also_ exists under 
> .../9/32/include.

Including Paul -- looks like the file gets installed in different places
in fortran-dev branch vs trunk?


Re: glibc-2.2{8,9} multiarch build failure on x86_64 with gcc 9

2019-05-01 Thread Arvind Sankar
On Wed, May 01, 2019 at 02:36:10PM -0400, Carlos O'Donell wrote:
> On 5/1/19 2:24 PM, Arvind Sankar wrote:
> > gcc 9 when configured for fortran installs ISO_Fortran_Binding.h in
> > gfor_cdir = 
> > $(libdir)/gcc/$(target_alias)/$(gcc_version)$(MULTISUBDIR)/include
> > For x86_64's 32-bit architecture support, this creates a subdirectory named 
> > 'include'
> > inside $(libsubdir)/32 which didn't use to exist in gcc 8.
> This doesn't seem correct.
> 
> I would have expected the header to exist under the target name, for example:
> 
> /usr/lib/gcc/i686-redhat-linux/9/include/ISO_Fortran_binding.h
> 
> This way it doesn't conflict with other uses.
> 
> Perhaps there is enough variability in the way you build, package, and 
> install this
> that it can break in some configurations.
> 
> I think the gcc community needs to comment on this.
> 
> -- 
> Cheers,
> Carlos.

To be clear, it does exist there as well (I think because the native arch has an
empty MULTISUBDIR. The breakage happens because it _also_ exists under 
.../9/32/include.


Re: glibc-2.2{8,9} multiarch build failure on x86_64 with gcc 9

2019-05-01 Thread Arvind Sankar
gcc 9 when configured for fortran installs ISO_Fortran_Binding.h in
gfor_cdir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)$(MULTISUBDIR)/include
For x86_64's 32-bit architecture support, this creates a subdirectory named 
'include'
inside $(libsubdir)/32 which didn't use to exist in gcc 8.

glibc's configure script, when configured with --with-headers=, uses the
following logic to guess at gcc's internal include directories. This
failes on gcc 9 because with -m32 now we will get $(gfor_cdir) as the
return for -print-file-name=include, instead of the real directory that
contains all the include files, $(libsubdir)/include. The result is that
SYSINCLUDES is incorrectly defined and the rest of the build fails due
to missing standard include files, eg stddef.h etc.

# if using special system headers, find out the compiler's sekrit
# header directory and add that to the list.  NOTE: Only does the right
# thing on a system that doesn't need fixincludes.  (Not presently a 
problem.)
if test -n "$sysheaders"; then
  SYSINCLUDES=-nostdinc
  for d in include include-fixed; do
i=`$CC -print-file-name="$d"` && test "x$i" != x && test "x$i" != 
"x$d" &&
SYSINCLUDES="$SYSINCLUDES -isystem $i"
  done
  SYSINCLUDES="$SYSINCLUDES \
-isystem `echo $sysheaders | sed 's/:/ -isystem /g'`"
  if test -n "$CXX"; then
CXX_SYSINCLUDES=
for cxxheaders in `$CXX -v -S -x c++ /dev/null -o /dev/null 2>&1 \
| sed -n -e '1,/#include/d' -e 's/^ \(\/.*\/[cg]++\)/\1/p'`; do
  test "x$cxxheaders" != x &&
  CXX_SYSINCLUDES="$CXX_SYSINCLUDES -isystem $cxxheaders"
done
  fi
fi
AC_SUBST(SYSINCLUDES)
AC_SUBST(CXX_SYSINCLUDES)

I'm not sure if this is a gcc bug or a glibc bug?

Should libgfortran install in the MULTISUBDIR? That's the only include
file that goes in there, and it is installing the same include file in
all of them, so this seems unnecessary.

glibc's configure fragment above also looks extremely fragile though,
the CXX_SYSINCLUDES version looks a little more robust, should something
similar be done for SYSINCLUDES?

(Fix return address in email)


glibc-2.2{8,9} multiarch build failure on x86_64 with gcc 9

2019-05-01 Thread Arvind Sankar
gcc 9 when configured for fortran installs ISO_Fortran_Binding.h in
gfor_cdir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)$(MULTISUBDIR)/include
For x86_64's 32-bit architecture support, this creates a subdirectory named 
'include'
inside $(libsubdir)/32 which didn't use to exist in gcc 8.

glibc's configure script, when configured with --with-headers=, uses the
following logic to guess at gcc's internal include directories. This
failes on gcc 9 because with -m32 now we will get $(gfor_cdir) as the
return for -print-file-name=include, instead of the real directory that
contains all the include files, $(libsubdir)/include. The result is that
SYSINCLUDES is incorrectly defined and the rest of the build fails due
to missing standard include files, eg stddef.h etc.

# if using special system headers, find out the compiler's sekrit
# header directory and add that to the list.  NOTE: Only does the right
# thing on a system that doesn't need fixincludes.  (Not presently a 
problem.)
if test -n "$sysheaders"; then
  SYSINCLUDES=-nostdinc
  for d in include include-fixed; do
i=`$CC -print-file-name="$d"` && test "x$i" != x && test "x$i" != 
"x$d" &&
SYSINCLUDES="$SYSINCLUDES -isystem $i"
  done
  SYSINCLUDES="$SYSINCLUDES \
-isystem `echo $sysheaders | sed 's/:/ -isystem /g'`"
  if test -n "$CXX"; then
CXX_SYSINCLUDES=
for cxxheaders in `$CXX -v -S -x c++ /dev/null -o /dev/null 2>&1 \
| sed -n -e '1,/#include/d' -e 's/^ \(\/.*\/[cg]++\)/\1/p'`; do
  test "x$cxxheaders" != x &&
  CXX_SYSINCLUDES="$CXX_SYSINCLUDES -isystem $cxxheaders"
done
  fi
fi
AC_SUBST(SYSINCLUDES)
AC_SUBST(CXX_SYSINCLUDES)

I'm not sure if this is a gcc bug or a glibc bug?

Should libgfortran install in the MULTISUBDIR? That's the only include
file that goes in there, and it is installing the same include file in
all of them, so this seems unnecessary.

glibc's configure fragment above also looks extremely fragile though,
the CXX_SYSINCLUDES version looks a little more robust, should something
similar be done for SYSINCLUDES?


Re: GCC-9.0.1-RC-20190426 excessive memory usage with -ggdb

2019-04-28 Thread Arvind Sankar
On Sun, Apr 28, 2019 at 10:59:45PM -0400, Arvind Sankar wrote:
> While building clang-8.0.0 or firefox-66.0.3 on a gentoo system (x86_64)
> gcc 9 release candidate runs out of memory with -O2 -ggdb. Gcc 8.3.0
> succeeds without issues. For clang there are about 5-6 compilations
> where the memory usage seems to grow without bound (I have 128Gb RAM +
> 128Gb swap and at least two of them were observed to reach 80+Gb each
> before getting killed).
> 
> For 8.3.0 nothing seems to reach more than single-digit Gb. Without the
> -ggdb switch gcc 9 also builds fine without excessive memory usage.
> 
> Files running out of memory:
> lib/Parse/ParseDeclCXX.cpp
> lib/Parse/ParseTentative.cpp
> lib/StaticAnalyzer/Checkers/MallocSizeofChecker.cpp
> lib/Format/FormatTokenLexer.cpp
> lib/Format/TokenAnnotator.cpp
> 
> The assembler output for all but the MallocSizeChecker.cpp was still 0
> bytes when the compiler crashed, and it wasn't growing for that one.
> 
> Experimenting with different -g and -gdwarf levels instead of passing in
> -ggdb, -g2 -gdwarf-4 crashes, -g2 -gdwarf-3 crashes, -g1 -gdwarf-4
> crashes.
> 
> Both gcc 8.3.0 and 9 were bootstrapped using profiledbootstrap + 
> --with-build-config=bootstrap-lto at -O2.
> In addition my gcc 9 is currently built with some additional
> optimization options: ira-loop-pressure, live-range-shrinkage, tree-lrs.
> 
> I'll try bootstrapping 9 with regular -O2 without pgo/lto and see if
> that is better.
> 
> I realize this is a rather low-quality report :) but have little
> experience in trying to narrow this down esp with cmake being used in
> clang build, and throwing this out there in case someone else can take a
> quick look.
> 
> PS I am not on the list, please include me explicitly on replies.

I see this has been reported as Bug 90273 already.


GCC-9.0.1-RC-20190426 excessive memory usage with -ggdb

2019-04-28 Thread Arvind Sankar
While building clang-8.0.0 or firefox-66.0.3 on a gentoo system (x86_64)
gcc 9 release candidate runs out of memory with -O2 -ggdb. Gcc 8.3.0
succeeds without issues. For clang there are about 5-6 compilations
where the memory usage seems to grow without bound (I have 128Gb RAM +
128Gb swap and at least two of them were observed to reach 80+Gb each
before getting killed).

For 8.3.0 nothing seems to reach more than single-digit Gb. Without the
-ggdb switch gcc 9 also builds fine without excessive memory usage.

Files running out of memory:
lib/Parse/ParseDeclCXX.cpp
lib/Parse/ParseTentative.cpp
lib/StaticAnalyzer/Checkers/MallocSizeofChecker.cpp
lib/Format/FormatTokenLexer.cpp
lib/Format/TokenAnnotator.cpp

The assembler output for all but the MallocSizeChecker.cpp was still 0
bytes when the compiler crashed, and it wasn't growing for that one.

Experimenting with different -g and -gdwarf levels instead of passing in
-ggdb, -g2 -gdwarf-4 crashes, -g2 -gdwarf-3 crashes, -g1 -gdwarf-4
crashes.

Both gcc 8.3.0 and 9 were bootstrapped using profiledbootstrap + 
--with-build-config=bootstrap-lto at -O2.
In addition my gcc 9 is currently built with some additional
optimization options: ira-loop-pressure, live-range-shrinkage, tree-lrs.

I'll try bootstrapping 9 with regular -O2 without pgo/lto and see if
that is better.

I realize this is a rather low-quality report :) but have little
experience in trying to narrow this down esp with cmake being used in
clang build, and throwing this out there in case someone else can take a
quick look.

PS I am not on the list, please include me explicitly on replies.