Re: [C++ PATCH] c++/91353 - P1331R2: Allow trivial default init in constexpr contexts.

2019-11-28 Thread Jason Merrill
On Thu, Nov 28, 2019 at 10:28 PM Marek Polacek  wrote:

> On Wed, Nov 27, 2019 at 10:47:58PM -0500, Jason Merrill wrote:
> > On 11/27/19 6:35 PM, Marek Polacek wrote:
> > > On Wed, Nov 27, 2019 at 04:47:01PM -0500, Jason Merrill wrote:
> > > > On 11/27/19 2:36 PM, Marek Polacek wrote:
> > > > > On Sun, Nov 24, 2019 at 12:24:48PM -0500, Jason Merrill wrote:
> > > > > > On 11/16/19 5:23 PM, Marek Polacek wrote:
> > > > > > > [ Working virtually on Baker Island. ]
> > > > > > >
> > > > > > > This patch implements C++20 P1331, allowing trivial default initialization in
> > > > > > > constexpr contexts.
> > > > > > >
> > > > > > > I used Jakub's patch from the PR which allowed uninitialized variables in
> > > > > > > constexpr contexts.  But the hard part was handling CONSTRUCTOR_NO_CLEARING
> > > > > > > which is always cleared in cxx_eval_call_expression.  We need to set it in
> > > > > > > the case a constexpr constructor doesn't initialize all the members, so that
> > > > > > > we can give proper diagnostic instead of value-initializing.  A lot of my
> > > > > > > attempts flopped but then I came up with this approach, which handles various
> > > > > > > cases as tested in constexpr-init8.C, where S is initialized by a non-default
> > > > > > > constexpr constructor, and constexpr-init9.C, using delegating constructors.
> > > > > > > And the best part is that I didn't need any new cx_check_missing_mem_inits
> > > > > > > calls!  Just save the information whether a constructor is missing an init
> > > > > > > into constexpr_fundef_table and retrieve it when needed.
> > > > > >
> > > > > > Is it necessary to clear the flag for constructors that do happen to
> > > > > > initialize all the members?  I would think that leaving that clearing to
> > > > > > reduced_constant_expression_p would be enough.
> > > > >
> > > > > It seems so: if I tweak cxx_eval_call_expression to only call clear_no_implicit_zero
> > > > > when 'fun' isn't DECL_CONSTRUCTOR_P, then a lot breaks, e.g. constexpr-base.C
> > > > > where the constructor initializes all the members.  By breaking I mean spurious
> > > > > errors coming from
> > > > >
> > > > >   if (TREE_CODE (r) == CONSTRUCTOR && CONSTRUCTOR_NO_CLEARING (r))
> > > > >     {
> > > > >       if (!allow_non_constant)
> > > > >         error ("%qE is not a constant expression because it refers to "
> > > > >                "an incompletely initialized variable", t);
> > > > >       TREE_CONSTANT (r) = false;
> > > > >       non_constant_p = true;
> > > > >     }
> > > >
> > > > Why didn't reduced_constant_expression_p unset CONSTRUCTOR_NO_CLEARING?
> > >
> > > We have a constructor that initializes a base class and members of a class:
> > >
> > >   {.D.2364={.i=12}, .a={.i=24}, .j=36}
> > >
> > > Say we don't clear CONSTRUCTOR_NO_CLEARING in this ctor in cxx_eval_call_expression.
> > > Then soon in reduced_constant_expression_p we do
> > >   field = next_initializable_field (TYPE_FIELDS (TREE_TYPE (t)));
> > > and since "Implement P0017R1, C++17 aggregates with bases. / r241187" we skip
> > > base fields in C++17 so 'field' is set to 'a'.
> >
> > Hmm?
> >
> > > next_initializable_field (tree field)
> > > {
> > >   while (field
> > >          && (TREE_CODE (field) != FIELD_DECL
> > >              || DECL_UNNAMED_BIT_FIELD (field)
> > >              || (DECL_ARTIFICIAL (field)
> > >                  && !(cxx_dialect >= cxx17 && DECL_FIELD_IS_BASE (field)))))
> > >     field = DECL_CHAIN (field);
> >
> > This skips artificial fields except that in C++17 and up base fields are
> > *not* skipped.
> >
> > How are you getting field starting with 'a'?  Are you compiling in a lower
> > standard mode?  The code using next_initializable_field doesn't work for
> > lower -std because of skipping base fields.
>
> Duh, I'm sorry, you're right of course, I got it backwards.  I didn't realize
> I was debugging without -std=c++2a :/
>
> > So perhaps we want to always clear_no_implicit_zero before c++20, and always
> > for c++20 and up?
>
> This doesn't work for constexpr-init8.C, where S is initialized by a non-default
> constexpr constructor:
>
>   struct S {
>     constexpr S(int) {}
>   };
>
>   struct W {
>     constexpr W(int) : s(8), p() {}
>
>     S s;
>     int *p;
>   };
>
>   constexpr auto a = W(42);
>
> When we perform register_constexpr_fundef the result of massage_constexpr_body is
>
>   {.s=S::S (&((struct W *) this)->s, NON_LVALUE_EXPR <8>), .p=0B}
>
> i.e. a ctor that initializes all the members.  But later
> reduced_constant_expression_p only ever sees {.p=0B} which seemingly doesn't
> initialize all the members, so CONSTRUCTOR_NO_CLEARING is not cleared, and we
> give an error.
>

Sounds like reduced_constant_expression_p needs to deal better with empty
bases.

Jason


Re: [C++ PATCH] c++/91353 - P1331R2: Allow trivial default init in constexpr contexts.

2019-11-28 Thread Marek Polacek
On Wed, Nov 27, 2019 at 10:47:58PM -0500, Jason Merrill wrote:
> On 11/27/19 6:35 PM, Marek Polacek wrote:
> > On Wed, Nov 27, 2019 at 04:47:01PM -0500, Jason Merrill wrote:
> > > On 11/27/19 2:36 PM, Marek Polacek wrote:
> > > > On Sun, Nov 24, 2019 at 12:24:48PM -0500, Jason Merrill wrote:
> > > > > On 11/16/19 5:23 PM, Marek Polacek wrote:
> > > > > > [ Working virtually on Baker Island. ]
> > > > > > 
> > > > > > This patch implements C++20 P1331, allowing trivial default 
> > > > > > initialization in
> > > > > > constexpr contexts.
> > > > > > 
> > > > > > I used Jakub's patch from the PR which allowed uninitialized 
> > > > > > variables in
> > > > > > constexpr contexts.  But the hard part was handling 
> > > > > > CONSTRUCTOR_NO_CLEARING
> > > > > > which is always cleared in cxx_eval_call_expression.  We need to 
> > > > > > set it in
> > > > > > the case a constexpr constructor doesn't initialize all the 
> > > > > > members, so that
> > > > > > we can give proper diagnostic instead of value-initializing.  A lot 
> > > > > > of my
> > > > > > attempts flopped but then I came up with this approach, which 
> > > > > > handles various
> > > > > > cases as tested in constexpr-init8.C, where S is initialized by a 
> > > > > > non-default
> > > > > > constexpr constructor, and constexpr-init9.C, using delegating 
> > > > > > constructors.
> > > > > > And the best part is that I didn't need any new 
> > > > > > cx_check_missing_mem_inits
> > > > > > calls!  Just save the information whether a constructor is missing 
> > > > > > an init
> > > > > > into constexpr_fundef_table and retrieve it when needed.
> > > > > 
> > > > > Is it necessary to clear the flag for constructors that do happen to
> > > > > initialize all the members?  I would think that leaving that clearing 
> > > > > to
> > > > > reduced_constant_expression_p would be enough.
> > > > 
> > > > It seems so: if I tweak cxx_eval_call_expression to only call 
> > > > clear_no_implicit_zero
> > > > when 'fun' isn't DECL_CONSTRUCTOR_P, then a lot breaks, e.g. 
> > > > constexpr-base.C
> > > > where the constructor initializes all the members.  By breaking I mean 
> > > > spurious
> > > > errors coming from
> > > > 
> > > >   if (TREE_CODE (r) == CONSTRUCTOR && CONSTRUCTOR_NO_CLEARING (r))
> > > >     {
> > > >       if (!allow_non_constant)
> > > >         error ("%qE is not a constant expression because it refers to "
> > > >                "an incompletely initialized variable", t);
> > > >       TREE_CONSTANT (r) = false;
> > > >       non_constant_p = true;
> > > >     }
> > > 
> > > Why didn't reduced_constant_expression_p unset CONSTRUCTOR_NO_CLEARING?
> > 
> > We have a constructor that initializes a base class and members of a class:
> > 
> >{.D.2364={.i=12}, .a={.i=24}, .j=36}
> > 
> > Say we don't clear CONSTRUCTOR_NO_CLEARING in this ctor in 
> > cxx_eval_call_expression.
> > Then soon in reduced_constant_expression_p we do
> >   field = next_initializable_field (TYPE_FIELDS (TREE_TYPE (t)));
> > and since "Implement P0017R1, C++17 aggregates with bases. / r241187" we 
> > skip
> > base fields in C++17 so 'field' is set to 'a'.
> 
> Hmm?
> 
> > next_initializable_field (tree field)
> > {
> >   while (field
> >          && (TREE_CODE (field) != FIELD_DECL
> >              || DECL_UNNAMED_BIT_FIELD (field)
> >              || (DECL_ARTIFICIAL (field)
> >                  && !(cxx_dialect >= cxx17 && DECL_FIELD_IS_BASE (field)))))
> >     field = DECL_CHAIN (field);
> 
> This skips artificial fields except that in C++17 and up base fields are
> *not* skipped.
> 
> How are you getting field starting with 'a'?  Are you compiling in a lower
> standard mode?  The code using next_initializable_field doesn't work for
> lower -std because of skipping base fields.

Duh, I'm sorry, you're right of course, I got it backwards.  I didn't realize
I was debugging without -std=c++2a :/

> So perhaps we want to always clear_no_implicit_zero before c++20, and always
> for c++20 and up?

This doesn't work for constexpr-init8.C, where S is initialized by a non-default
constexpr constructor:

  struct S {
    constexpr S(int) {}
  };

  struct W {
    constexpr W(int) : s(8), p() {}

    S s;
    int *p;
  };

  constexpr auto a = W(42);

When we perform register_constexpr_fundef the result of massage_constexpr_body 
is 

  {.s=S::S (&((struct W *) this)->s, NON_LVALUE_EXPR <8>), .p=0B}

i.e. a ctor that initializes all the members.  But later
reduced_constant_expression_p only ever sees {.p=0B} which seemingly doesn't
initialize all the members, so CONSTRUCTOR_NO_CLEARING is not cleared, and we
give an error.

That's why I opted for the approach in my original patch: in 
register_constexpr_fundef
we still see all the initializers.

Marek
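
The disagreement above about which fields next_initializable_field skips can be checked against a small model.  This is only a sketch: the Field struct and the cxx17_or_later parameter are hypothetical stand-ins for GCC's tree nodes and cxx_dialect, and only the condition of the quoted loop is reproduced.

```cpp
#include <cassert>

// Hypothetical stand-in for a GCC field node; only the properties that the
// quoted loop tests are modelled.
struct Field {
  bool is_field_decl;      // TREE_CODE (field) == FIELD_DECL
  bool unnamed_bit_field;  // DECL_UNNAMED_BIT_FIELD (field)
  bool artificial;         // DECL_ARTIFICIAL (field)
  bool is_base;            // DECL_FIELD_IS_BASE (field)
  const Field *chain;      // DECL_CHAIN (field)
};

// Same condition as the quoted loop; cxx17_or_later stands in for
// cxx_dialect >= cxx17.
const Field *next_initializable(const Field *field, bool cxx17_or_later)
{
  while (field
         && (!field->is_field_decl
             || field->unnamed_bit_field
             || (field->artificial
                 && !(cxx17_or_later && field->is_base))))
    field = field->chain;
  return field;
}
```

With a chain of an artificial base field followed by 'a' and 'j', the model returns the base field when cxx17_or_later is true and 'a' otherwise, which matches Jason's reading and the "debugging without -std=c++2a" explanation.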



Handle C2x attributes in Objective-C

2019-11-28 Thread Joseph Myers
When adding the initial support for C2x attributes, I deferred the
unbounded lookahead support required to support such attributes in
Objective-C (except for the changes to string literal handling, which
were the riskier piece of preparation for such lookahead support).
This patch adds that remaining ObjC support.

For C, the parser continues to work exactly as it did before.  For
ObjC, however, when checking for whether '[[' starts attributes, it
lexes however many tokens are needed to check for a matching ']]', but
in a raw mode that omits all the context-sensitive processing that
c_lex_with_flags normally does, so that that processing can be done
later when the right context-sensitive flags are set.  Those tokens
are saved in a separate raw_tokens vector in the parser, and normal
c_lex_one_token calls will get tokens from there and perform the
remaining processing on them, if any tokens are found there, so all
parsing not using the new interfaces gets the same tokens as it did
before.  (For C, this raw lexing never occurs and the vector of raw
tokens is always NULL.)
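
The scheme described above can be sketched in a few lines of standalone C++.  This is a simplified model under stated assumptions, not the code in c-parser.c: Token, Parser, lex_one and starts_attributes are hypothetical names, tokens are plain strings, and the "cooked" flag stands in for the context-sensitive processing that raw lexing skips.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical token: text is the spelling, cooked records whether the
// context-sensitive post-processing step has been applied.
struct Token {
  std::string text;
  bool cooked = false;
};

// Hypothetical parser state, loosely modelled on the raw_tokens and
// raw_tokens_used members described in the patch.
struct Parser {
  std::deque<std::string> input;    // spellings not yet lexed
  std::vector<Token> raw_tokens;    // saved raw look-ahead tokens
  std::size_t raw_tokens_used = 0;  // how many of them were handed out again
};

// Lex one token.  If raw, read from the input and skip post-processing.
// Otherwise reuse a saved raw token if there is one, then post-process it.
Token lex_one(Parser &p, bool raw = false)
{
  Token t;
  if (raw || p.raw_tokens.empty()) {
    assert(!p.input.empty());
    t.text = p.input.front();
    p.input.pop_front();
  } else {
    t = p.raw_tokens[p.raw_tokens_used++];
    if (p.raw_tokens_used == p.raw_tokens.size()) {
      p.raw_tokens.clear();
      p.raw_tokens_used = 0;
    }
  }
  if (!raw)
    t.cooked = true;  // stand-in for the deferred processing
  return t;
}

// Look ahead in raw mode: does the input start with "[[" that is closed by
// a matching "]]"?  Every token read here stays in raw_tokens.
bool starts_attributes(Parser &p)
{
  std::size_t pos = p.raw_tokens_used;  // first token not yet handed out
  std::size_t seen = 0;
  int depth = 0;
  for (;;) {
    if (pos == p.raw_tokens.size()) {
      if (p.input.empty())
        return false;  // ran out of tokens before the brackets balanced
      p.raw_tokens.push_back(lex_one(p, true));
    }
    const std::string s = p.raw_tokens[pos++].text;
    ++seen;
    if (seen <= 2 && s != "[")
      return false;  // does not begin with "[["
    if (s == "[")
      ++depth;
    else if (s == "]" && --depth == 0)
      return p.raw_tokens[pos - 2].text == "]";  // closed by "]]"?
  }
}
```

After starts_attributes has run, ordinary lex_one calls return the saved tokens in order with the deferred processing applied, so code that never looks ahead in raw mode sees the same token stream as before.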

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  Applied to 
mainline.

gcc/c:
2019-11-29  Joseph Myers  

* c-parser.c (struct c_parser): Add members raw_tokens and
raw_tokens_used.
(c_lex_one_token): Add argument raw.  Handle lexing raw tokens and
using previously-lexed raw tokens.
(c_parser_peek_nth_token_raw)
(c_parser_check_balanced_raw_token_sequence): New functions.
(c_parser_nth_token_starts_std_attributes): Use
c_parser_check_balanced_raw_token_sequence for Objective-C.

gcc/testsuite:
2019-11-29  Joseph Myers  

* objc.dg/attributes/gnu2x-attr-syntax-1.m: New test.

Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c(revision 278812)
+++ gcc/c/c-parser.c(working copy)
@@ -176,6 +176,12 @@ struct GTY(()) c_parser {
   /* How many look-ahead tokens are available (0 - 4, or
  more if parsing from pre-lexed tokens).  */
   unsigned int tokens_avail;
+  /* Raw look-ahead tokens, used only for checking in Objective-C
+ whether '[[' starts attributes.  */
+  vec<c_token, va_gc> *raw_tokens;
+  /* The number of raw look-ahead tokens that have since been fully
+ lexed.  */
+  unsigned int raw_tokens_used;
   /* True if a syntax error is being recovered from; false otherwise.
  c_parser_error sets this flag.  It should clear this flag when
  enough tokens have been consumed to recover from the error.  */
@@ -251,21 +257,40 @@ c_parser_set_error (c_parser *parser, bool err)
 
 static GTY (()) c_parser *the_parser;
 
-/* Read in and lex a single token, storing it in *TOKEN.  */
+/* Read in and lex a single token, storing it in *TOKEN.  If RAW,
+   context-sensitive postprocessing of the token is not done.  */
 
 static void
-c_lex_one_token (c_parser *parser, c_token *token)
+c_lex_one_token (c_parser *parser, c_token *token, bool raw = false)
 {
   timevar_push (TV_LEX);
 
-  token->type = c_lex_with_flags (&token->value, &token->location,
- &token->flags,
- (parser->lex_joined_string
-  ? 0 : C_LEX_STRING_NO_JOIN));
-  token->id_kind = C_ID_NONE;
-  token->keyword = RID_MAX;
-  token->pragma_kind = PRAGMA_NONE;
+  if (raw || vec_safe_length (parser->raw_tokens) == 0)
+{
+  token->type = c_lex_with_flags (&token->value, &token->location,
+ &token->flags,
+ (parser->lex_joined_string
+  ? 0 : C_LEX_STRING_NO_JOIN));
+  token->id_kind = C_ID_NONE;
+  token->keyword = RID_MAX;
+  token->pragma_kind = PRAGMA_NONE;
+}
+  else
+{
+  /* Use a token previously lexed as a raw look-ahead token, and
+complete the processing on it.  */
+  *token = (*parser->raw_tokens)[parser->raw_tokens_used];
+  ++parser->raw_tokens_used;
+  if (parser->raw_tokens_used == vec_safe_length (parser->raw_tokens))
+   {
+ vec_free (parser->raw_tokens);
+ parser->raw_tokens_used = 0;
+   }
+}
 
+  if (raw)
+goto out;
+
   switch (token->type)
 {
 case CPP_NAME:
@@ -434,6 +459,7 @@ static void
 default:
   break;
 }
+ out:
   timevar_pop (TV_LEX);
 }
 
@@ -484,6 +510,32 @@ c_parser_peek_nth_token (c_parser *parser, unsigne
   return &parser->tokens[n - 1];
 }
 
+/* Return a pointer to the Nth token from PARSER, reading it in as a
+   raw look-ahead token if necessary.  The N-1th token is already read
+   in.  Raw look-ahead tokens remain available for when the non-raw
+   functions above are called.  */
+
+c_token *
+c_parser_peek_nth_token_raw (c_parser *parser, unsigned int n)
+{
+  /* N is 1-based, not zero-based.  */
+  gcc_assert (n > 0);
+
+  if (parser->tokens_avail >= n)
+return &parser->tokens[n - 1];
+  unsigned int raw_len = vec_safe_length 

[PATCH] rs6000: Fix formatting of *mov{si,di}_internal.*

2019-11-28 Thread Segher Boessenkool
This implements the improvements I asked for.  Committing.


Segher


2019-11-28  Segher Boessenkool  

* config/rs6000/rs6000.md (*movsi_internal1): Fix formatting.  Improve
formatting.
(*movdi_internal64): Ditto.

---
 gcc/config/rs6000/rs6000.md | 192 ++--
 1 file changed, 96 insertions(+), 96 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 876dfe3..e3e17ad 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -6889,34 +6889,34 @@ (define_split
 UNSPEC_MOVSI_GOT))]
   "")
 
-;; MR   LA
-;; LWZ  LFIWZX  LXSIWZX
-;; STW  STFIWX  STXSIWX
-;; LI   LIS #
-;; XXLORXXSPLTIB 0  XXSPLTIB -1 VSPLTISW
-;; XXLXOR 0 XXLORC -1   P9 const
-;; MTVSRWZ  MFVSRWZ
-;; MF%1 MT%0NOP
+;;MR  LA
+;;LWZ LFIWZX  LXSIWZX
+;;STW STFIWX  STXSIWX
+;;LI  LIS #
+;;XXLOR   XXSPLTIB 0  XXSPLTIB -1 VSPLTISW
+;;XXLXOR 0XXLORC -1   P9 const
+;;MTVSRWZ MFVSRWZ
+;;MF%1MT%0NOP
 
 (define_insn "*movsi_internal1"
   [(set (match_operand:SI 0 "nonimmediate_operand"
-   "=r, r,
-   r,   d,  v,
-   m,   Z,  Z,
-   r,   r,  r,
-   wa,  wa, wa, v,
-   wa,  v,  v,
-   wa,  r,
-   r,   *h, *h")
+ "=r, r,
+  r,  d,  v,
+  m,  Z,  Z,
+  r,  r,  r,
+  wa, wa, wa, v,
+  wa, v,  v,
+  wa, r,
+  r,  *h, *h")
(match_operand:SI 1 "input_operand"
-   "r,  U,
-   m,   Z,  Z,
-   r,   d,  v,
-   I,   L,  n,
-   wa,  O,  wM, wB,
-   O,   wM, wS,
-   r,   wa,
-   *h,  r,  0"))]
+ "r,  U,
+  m,  Z,  Z,
+  r,  d,  v,
+  I,  L,  n,
+  wa, O,  wM, wB,
+  O,  wM, wS,
+  r,  wa,
+  *h, r,  0"))]
   "gpc_reg_operand (operands[0], SImode)
|| gpc_reg_operand (operands[1], SImode)"
   "@
@@ -6944,32 +6944,32 @@ (define_insn "*movsi_internal1"
mt%0 %1
nop"
   [(set_attr "type"
-   "*, *,
-   load,   fpload, fpload,
-   store,  fpstore,fpstore,
-   *,  *,  *,
-   veclogical, vecsimple,  vecsimple,  vecsimple,
-   veclogical, veclogical, vecsimple,
-   mffgpr, mftgpr,
-   *,  *,  *")
+ "*,  *,
+  load,   fpload, fpload,
+  store,  fpstore,fpstore,
+  *,  *,  *,
+  veclogical, vecsimple,  vecsimple,  vecsimple,
+  veclogical, veclogical, vecsimple,
+  mffgpr, mftgpr,
+  *,  *,  *")
(set_attr "length"
-   "*, *,
-   *,  *,   *,
-   *,  *,   *,
-   *,  *,   8,
-   *,  *,   *,  *,
-   *,  *,   8,
-   *,  *,
-   *,  *,   *")
+ "*,  *,
+  *,  *,  *,
+  *,  *,  *,
+  *,  *,  8,
+  *,  *,  *,  *,
+  *,  *,  8,
+  *,  *,
+  *,  *,  *")
(set_attr "isa"
-   "*,  *,
-   *,   p8v,   p8v,
-   *,   p8v,   p8v,
-   *,   *, *,
-   p8v, p9v,   p9v,   p8v,
-   p9v, p8v,   p9v,
-   p8v, p8v,
-   *,   *, *")])
+ "*,  *,
+  *,  p8v,p8v,
+  *,  p8v,p8v,
+  *,  *,  *,
+  p8v,p9v,p9v,p8v,
+  p9v,p8v,p9v,
+  p8v,p8v,
+  *,  *,  *")])
 
 ;; Like movsi, but adjust a SF value to be 

Re: [Patch, gcc-wwdocs] Update to Fortran changes

2019-11-28 Thread Gerald Pfeifer
On Tue, 26 Nov 2019, Mark Eggleston wrote:
> Second attempt this time with attachment.

From f884924877ba84578e75bd16cb127bab33eb5ee6 Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Tue, 26 Nov 2019 10:12:44 +
Subject: [PATCH] Update Fortran changes

+  
+A blank format item at the end of a format specification i.e. nothing
+following the final comma is allowed.  Use the option
+-fdec-blank-format-item, this options is implied with
+-fdec.
+  

The second sentence is a bit confusing.  "Use the option" ... "this
option is implied"?

When/why to use this option?

How does the implication come into play?

In any case "this option is" (singular)

Same below.


I have applied the patch below for now to address some of the above;
please feel free to go ahead with further changes.

Thanks,
Gerald


diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index de550813..f0f0d312 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -219,21 +219,21 @@ a work-in-progress.
   
 A blank format item at the end of a format specification, i.e. nothing
 following the final comma, is allowed.  Use the option
--fdec-blank-format-item, this options is implied with
+-fdec-blank-format-item; this option is implied with
 -fdec.
   
   
 The existing support for AUTOMATIC and STATIC
 attributes has been extended to allow variables with the
 AUTOMATIC attribute to be used in EQUIVALENCE
-statements. Use -fdec-static, this option is implied by
+statements. Use -fdec-static; this option is implied by
 -fdec.
   
   
 Allow character literals in assignments and DATA statements
 for numeric (INTEGER, REAL, or
 COMPLEX) or LOGICAL variables.  Use the option
--fdec-char-conversions, this options is implied with
+-fdec-char-conversions; this option is implied with
 -fdec.
   
   


[PATCH] rs6000: Use memory_operand for all simple {l,st}*brx instructions

2019-11-28 Thread Segher Boessenkool
We run fwprop before combine, very early even in the case of fwprop1;
and fwprop1 will change memory addressing to what it considers cheaper.
After the "common" change, it now changes the indexed store instruction
in the testcase to be to a constant address.  But that is not an
improvement at all: the byte reverse instructions only exist in the
indexed form, so they will not match anymore.

This patch changes the patterns for the byte reverse instructions to
allow plain memory_operand, letting reload fix this up.
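
For context, source of roughly the following shape is the kind of code where a memory access and a byte swap sit next to each other, so that a byte-reversed load or store instruction could be used.  This is only an illustration: the function names are made up, __builtin_bswap32 is the GCC builtin, and whether a single byte-reversed instruction is actually emitted depends on the target, the options and the GCC version.

```cpp
#include <cassert>
#include <cstdint>

// Load a 32-bit value from memory and reverse its bytes.
std::uint32_t load_reversed(const std::uint32_t *p)
{
  return __builtin_bswap32(*p);
}

// Reverse the bytes of a 32-bit value and store it to memory.
void store_reversed(std::uint32_t *p, std::uint32_t v)
{
  *p = __builtin_bswap32(v);
}
```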

Tested on powerpc64-linux {-m32,-m64}, committing to trunk.


Segher


2019-11-28  Segher Boessenkool  

PR target/92602
* config/rs6000/rs6000.md (bswap2_load for HSI): Change the
indexed_or_indirect_operand to be memory_operand.
(bswap2_store for HSI): Ditto.
(bswapdi2_load): Ditto.
(bswapdi2_store): Ditto.

---
 gcc/config/rs6000/rs6000.md | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 876dfe3..0187ba0 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -2510,13 +2510,13 @@ (define_expand "bswap2"
 
 (define_insn "bswap2_load"
   [(set (match_operand:HSI 0 "gpc_reg_operand" "=r")
-   (bswap:HSI (match_operand:HSI 1 "indexed_or_indirect_operand" "Z")))]
+   (bswap:HSI (match_operand:HSI 1 "memory_operand" "Z")))]
   ""
   "lbrx %0,%y1"
   [(set_attr "type" "load")])
 
 (define_insn "bswap2_store"
-  [(set (match_operand:HSI 0 "indexed_or_indirect_operand" "=Z")
+  [(set (match_operand:HSI 0 "memory_operand" "=Z")
(bswap:HSI (match_operand:HSI 1 "gpc_reg_operand" "r")))]
   ""
   "stbrx %1,%y0"
@@ -2632,13 +2632,13 @@ (define_expand "bswapdi2"
 ;; Power7/cell has ldbrx/stdbrx, so use it directly
 (define_insn "bswapdi2_load"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
-   (bswap:DI (match_operand:DI 1 "indexed_or_indirect_operand" "Z")))]
+   (bswap:DI (match_operand:DI 1 "memory_operand" "Z")))]
   "TARGET_POWERPC64 && TARGET_LDBRX"
   "ldbrx %0,%y1"
   [(set_attr "type" "load")])
 
 (define_insn "bswapdi2_store"
-  [(set (match_operand:DI 0 "indexed_or_indirect_operand" "=Z")
+  [(set (match_operand:DI 0 "memory_operand" "=Z")
(bswap:DI (match_operand:DI 1 "gpc_reg_operand" "r")))]
   "TARGET_POWERPC64 && TARGET_LDBRX"
   "stdbrx %1,%y0"
-- 
1.8.3.1



Re: [PATCH] Trivial patch to allow bootstrap on MacOS

2019-11-28 Thread Keller, Rainer
Dear Iain,
thanks for the quick reply. I wasn’t aware of the ticket 79885.

Yes, the intent is to use the same sysroot for build & run.

Hmm, --with-sysroot is an option of gcc/configure, not of the main
configure.
Not wanting to turn this into a bug report, but using --with-sysroot instead of
--with-build-sysroot failed, too.

Let’s keep track of this in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92719
I am happy to provide more info.

Thanks again for your help,
Rainer


> On 28. Nov 2019, at 16:34, Iain Sandoe  wrote:
> 
> Hello again Rainer,
> 
> Iain Sandoe  wrote:
> 
>>> GMP however is installed elsewhere (by Homebrew, MacPorts etc), so ignore 
>>> any -nostdinc
>> 
>> it also works to symlink the sources for gmp, mpfr, mpc (and isl, if you use
>> it) into the source tree - those then get bootstrapped along with the
>> compiler and there are no resulting external dependencies (which I find
>> preferable).
>>
>> However, --with-gmp= etc should also work with it (I’ll take a look at that
>> case).
>
> It works fine when using --with-sysroot=   and
> --with-gmp=/somewhere/outside/the/SDK  (/opt/….)
> 
> So, I think one can get the desired behaviour with this configuration scheme.
> 
> thanks
> Iain
> 

-
Prof. Dr.-Ing. Rainer Keller, Hochschule Esslingen
Professor für Betriebssysteme, verteilte und parallele Systeme
Fakultät Informationstechnik
Flandernstr. 101, Raum F01.320
73732 Esslingen
T.: +49 (0)711 397-4165
F.: +49 (0)711 397-48 4165



RE: [PATCH][GCC][SLP][testsuite] Turn off vect-epilogue-nomask for slp-rect-3

2019-11-28 Thread Tamar Christina
Hi Richi,

> >
> > This patch turns off vect-epilogue-nomask for slp-reduc-3 as it seems
> > that the epilogue in this loop is vectorizable using SLP and smaller
> > VF.  Since this test expects there to be no SLP vectorization at all
> > the testcase then fails for arm targets.
> 
> Actually we do expect SLP vectorization, just the counting might go wrong.
> 
> What's the actual FAIL for arm?

I should have worded this better considering the testcase literally contains 
SLP in the name...

The failure is for the XFAIL 

/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
xfail { vect_widen_sum_hi_to_si_pattern || { ! vect_unpack } } } } } */

My understanding of what is happening is that without epilogue nomask
it would only try HI modes, but thanks to the epilogue nomask
it tries QI mode as well, which succeeds.  The xfail then generates an XPASS
since the condition on it checks for HI to SI and not QI.

So I disabled the epilogue nomask since it seems to violate the conditions the
test actually wanted to test for.

Not quite sure why it's failing only on Arm though.

Regards,
Tamar

> 
> Disabling epilogue vect is of course OK if it simplifies things.
> 
> > Regtested on arm-none-eabi and no issues.
> >
> > Ok for trunk?
> 
> > Thanks,
> > Tamar
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2019-11-28  Tamar Christina  
> >
> > * gcc.dg/vect/slp-reduc-3.c: Turn off epilogue-nomask.
> >
> >
> 
> --
> Richard Biener 
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[PATCH] [libiberty] Fix read buffer overflow in split_directories

2019-11-28 Thread Tim Rühsen
An empty name param leads to read buffer overflow in
function split_directories.

* libiberty/make-relative-prefix.c (split_directories):
  Return early on empty name.
---
 libiberty/ChangeLog  | 7 +++
 libiberty/make-relative-prefix.c | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index b516903d94..b7e24d11ef 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,10 @@
+2019-11-28  Tim Ruehsen  
+
+   Fix read buffer overflow in split_directories
+
+   * make-relative-prefix.c (split_directories):
+   Return early on empty 'name'
+
 2019-11-16  Tim Ruehsen  

Fix write buffer overflow in cplus_demangle()
diff --git a/libiberty/make-relative-prefix.c b/libiberty/make-relative-prefix.c
index ec0b0ee749..2ff2af8a59 100644
--- a/libiberty/make-relative-prefix.c
+++ b/libiberty/make-relative-prefix.c
@@ -122,6 +122,9 @@ split_directories (const char *name, int *ptr_num_dirs)
   const char *p, *q;
   int ch;

+  if (!*name)
+return NULL;
+
   /* Count the number of directories.  Special case MSDOS disk names as part
  of the initial directory.  */
   p = name;
--
2.24.0
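
The effect of the guard can be illustrated with a simplified, standalone analogue.  This is a sketch under assumptions, not the libiberty code: split_dirs is a hypothetical function that only handles '/' separators, and it assumes the hazard is of the kind where the code later inspects the last stored component, which does not exist for an empty name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical simplified analogue of split_directories: split name into
// components, each keeping its trailing '/'.  Returns false for an empty
// name instead of continuing with zero components.
bool split_dirs(const std::string &name, std::vector<std::string> &out)
{
  out.clear();
  if (name.empty())
    return false;  // early return, in the spirit of the patch

  std::string::size_type start = 0;
  for (std::string::size_type i = 0; i < name.size(); ++i)
    if (name[i] == '/') {
      out.push_back(name.substr(start, i + 1 - start));
      start = i + 1;
    }
  if (start < name.size())
    out.push_back(name.substr(start));

  // Looking at the last component is only safe because name is non-empty,
  // so at least one component has been stored.
  assert(!out.empty());
  return true;
}
```

For "/usr/lib" this yields the components "/", "usr/" and "lib"; for "" it reports failure before any component is examined.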



Re: [PATCH v2] Add `--with-install-sysroot=' configuration option

2019-11-28 Thread Joseph Myers
On Thu, 28 Nov 2019, Maciej W. Rozycki wrote:

> > 
> > Rather, it's a suffix (as in SYSROOT_SUFFIX_SPEC, no command-line option 
> > to print it),
> 
>  Do you mean that there's no option to print SYSROOT_SUFFIX_SPEC on its 
> own or that no option prints it as a path component?  If the latter, then 
> I think it's an awful shortcoming, because there's no reasonable way for 
> a given GCC compilation to determine the layout expected.

There is no option to print the results of expanding SYSROOT_SUFFIX_SPEC 
on its own.  You can use -print-sysroot to print the full sysroot used, 
including the suffix.

>  Is it that with $toolexeclibdir we have say:
> 
> /usr/mips64el-st-linux-gnu/
>   +-> lib/
>   |  +-> 2e/
>   |  \-> 2f/
>   +-> lib32/
>   |+-> 2e/
>   |\-> 2f/
>   \-> lib64/
>+-> 2e/
>\-> 2f/

Yes.

> whereas `--sysroot=/path/to/sysroot' expects:
> 
> /path/to/sysroot/
> +-> 2e/
> | +-> lib/
> | +-> lib32/
> | \-> lib64/
> \-> 2f/
>   +-> lib/
>   +-> lib32/
>   \-> lib64/

Yes.  This latter structure is currently one that GCC can *use* but never 
*installs* anything into.
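
The two layouts confirmed above differ only in the order of the path components.  As a minimal sketch (the function names are hypothetical, and the strings are just the example directories from the diagrams, not output of any GCC option):

```cpp
#include <cassert>
#include <string>

// First layout ($toolexeclibdir style): base / libdir / multilib directory.
std::string toolexec_layout(const std::string &base,
                            const std::string &libdir,
                            const std::string &multilib)
{
  return base + "/" + libdir + "/" + multilib;
}

// Second layout (--sysroot style): sysroot / suffix / libdir.
std::string sysroot_layout(const std::string &sysroot,
                           const std::string &suffix,
                           const std::string &libdir)
{
  return sysroot + "/" + suffix + "/" + libdir;
}
```

With the example names, the 2f multilib's 64-bit libraries sit in /usr/mips64el-st-linux-gnu/lib64/2f under the first layout and in /path/to/sysroot/2f/lib64 under the second.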

>  If my understanding as expressed above is correct, then I think the way 
> to move forward with this change will be to rename the option to 
> `--with-toolexeclibdir=' or suchlike (and adjust documentation 
> accordingly) so that it avoids the ambiguity of "sysroot" and is in line 
> with the usual `--bindir=', `--libdir=', etc. or less usual 
> `--with-slibdir=' options where people can adjust the various installation 
> directories according to their requirements or preferences.

Yes, that seems a plausible approach.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v2] Add `--with-install-sysroot=' configuration option

2019-11-28 Thread Maciej W. Rozycki
On Fri, 22 Nov 2019, Joseph Myers wrote:

> >  As I recall the MIPS sysroot setup (please correct me if I got something 
> > wrong here) was like:
> 
> Yes, that's the sort of layout you get with sysroot suffixes.  See 
> gcc/config/mips/{st.h,t-st} for an example.

 Thanks for the pointer.

> >  Then the right-hand side of /path/to/somewhere (except for usr/) is what 
> > gets printed by `-print-multi-directory' or the left-hand side of output 
> > from `-print-multi-lib', e.g. `sof/el/lib64' for the example above.  
> 
> Rather, it's a suffix (as in SYSROOT_SUFFIX_SPEC, no command-line option 
> to print it),

 Do you mean that there's no option to print SYSROOT_SUFFIX_SPEC on its 
own or that no option prints it as a path component?  If the latter, then 
I think it's an awful shortcoming, because there's no reasonable way for 
a given GCC compilation to determine the layout expected.

> >  Well, I agree we need to have this stuff documented beyond what we 
> > currently have, but I think it applies equally to all the sysroot options 
> > we have, including both the `--sysroot=' GCC driver's option, and the 
> > `--with-sysroot=', `--with-build-sysroot=' and the newly-proposed 
> 
> All three of those refer to the top-level sysroot path, to which a sysroot 
> suffix is appended based on SYSROOT_SUFFIX_SPEC (unless 
> --no-sysroot-suffix is used).
> 
> > `--with-install-sysroot=' `configure' script's options as well.  All we 
> > currently have is this paragraph:
> 
> But this is a path relative to which SYSROOT_SUFFIX_SPEC isn't used at 
> all.

 Can you please show me the two directory layouts, one for `--sysroot=' 
and the other for `--with-install-sysroot=' aka $toolexeclibdir, say for 
the `mips64el-st-linux-gnu' target, and where exactly in GCC installation 
(if anywhere) the `--sysroot=' layout is used?

 Is it that with $toolexeclibdir we have say:

/usr/mips64el-st-linux-gnu/
  +-> lib/
  |  +-> 2e/
  |  \-> 2f/
  +-> lib32/
  |+-> 2e/
  |\-> 2f/
  \-> lib64/
   +-> 2e/
   \-> 2f/

whereas `--sysroot=/path/to/sysroot' expects:

/path/to/sysroot/
+-> 2e/
| +-> lib/
| +-> lib32/
| \-> lib64/
\-> 2f/
  +-> lib/
  +-> lib32/
  \-> lib64/

(and then GCC applies the former scheme to the directories pointed to by 
the `-B' and `-L' options and the latter scheme to the directory pointed 
to by the `--sysroot=' option)?

> >  And last but not least: do we want to hold my proposed change hostage to 
> > a sysroot handling documentation improvement?  It does not appear fair to 
> > me as the situation with said documentation is not a new problem nor one 
> > specific to this newly-added option, and the new option merely played the 
> 
> The proposed new option is, as far as I know, the first one introducing 
> this new kind of sysroot option (one for which the suffix from 
> SYSROOT_SUFFIX_SPEC is never added).

 Thank you for all your input.

 If my understanding as expressed above is correct, then I think the way 
to move forward with this change will be to rename the option to 
`--with-toolexeclibdir=' or suchlike (and adjust documentation 
accordingly) so that it avoids the ambiguity of "sysroot" and is in line 
with the usual `--bindir=', `--libdir=', etc. or less usual 
`--with-slibdir=' options where people can adjust the various installation 
directories according to their requirements or preferences.

 Then on top of this an option like `--enable-sysroot-for-toolexeclibdir' 
can be discussed in the future, that would switch $toolexeclibdir to the 
proper sysroot layout, whether `--with-toolexeclibdir=' has been used or 
not.  Such an option will necessarily have to rely on the presence of a 
GCC option to print SYSROOT_SUFFIX_SPEC/STARTFILE_PREFIX_SPEC for the 
multilib selected.

 If we agree on this plan, I'll post an updated patch.

  Maciej


Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-28 Thread Bernd Schmidt
On 11/28/19 8:53 PM, Gunther Nikl wrote:
> Bernd Schmidt :
>> On 11/23/19 9:53 PM, Bernd Schmidt wrote:

>> move.w %a4,%d0
>> -   tst.b %d0
>> -   jeq .L352
>> +   jeq .L353
>>
>> And the reason - that's a movqi using move.w.
>
> Can this problem also happen on older (pre-ccmode) GCC versions? Or was
> this only an issue of the ccmode conversion?

This was an error in the conversion (I think).
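
As a rough illustration of why dropping the tst.b was wrong (a standalone C++ sketch, not GCC or m68k code; both function names are made up for this example): a move.w sets the condition codes from the whole 16-bit word, while the byte-sized comparison only cares about the low 8 bits. The two can disagree whenever the upper byte is nonzero, so the flags left by the move cannot stand in for the test.

```cpp
#include <cassert>
#include <cstdint>

// Zero flag as a 16-bit move would leave it: set only if the whole word is 0.
bool zero_flag_after_word_move(std::uint16_t word) {
  return word == 0;
}

// Zero flag as a byte-sized test would compute it: set if the low byte is 0.
bool zero_flag_after_byte_test(std::uint16_t word) {
  return static_cast<std::uint8_t>(word) == 0;
}
```

With a value such as 0x0100 the byte test reports zero but the word move does not, which is the kind of mismatch that would send the jeq down the wrong path.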

> What about the compare constraint errors? Are those also present on
> older GCC versions but never surfaced?

Those were present, but presumably nothing ever tried to rerecognize a
compare insn after reload (unlike jumps) so you wouldn't get a crash. I
haven't checked whether it could have produced invalid assembly - I'm
guessing probably not.


Bernd


Making things a bit easier (was: [Patch, gcc-wwdocs] Update to Fortran changes)

2019-11-28 Thread Gerald Pfeifer
On Tue, 26 Nov 2019, Mark Eggleston wrote:
> I've checked the changes with W3 validator. There are no problems related 
> to my changes, however, it does object to the lack of character encoding 
> in the file. My knowledge of HTML is limited, so don't know how to fix 
> that.

Let me take this as good feedback and help make things a bit easier 
going forward. :-)

I just committed a patch that added a
  
tag to all of our HTML files which, also in my own tests with
https://validator.w3.org, addresses what you have encountered.

Is there anything else we can do to make things easier?

Gerald


Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-28 Thread Gunther Nikl
Bernd Schmidt :
> On 11/23/19 9:53 PM, Bernd Schmidt wrote:
> > I'll spend a few more days trying to see if I can do something
> > about the bootstrap failure Mikael saw (currently trying to do a
> > two-stage cross build rather than a really slow bootstrap).  
> 
> Whew, I think I have it. One tst instruction eliminated when it
> shouldn't have been:
> 
> move.w %a4,%d0
> -   tst.b %d0
> -   jeq .L352
> +   jeq .L353
> 
> And the reason - that's a movqi using move.w.

Can this problem also happen on older (pre-ccmode) GCC versions? Or was
this only an issue of the ccmode conversion?

What about the compare constraint errors? Are those also present on
older GCC versions but never surfaced?

Thanks,
Gunther


[wwwdocs] Push down into individual HTML files.

2019-11-28 Thread Gerald Pfeifer
This will make it easier to edit/validate pages and reduces our
dependency on preprocessing yet a bit more on top of the change
series I worked on last fall.

This change was triggered by feedback from Mark Eggleston; thank you!

Committed.

Gerald


commit 34bfcf1947c44e458af1b7ba201f25071c4d80a5
Author: Gerald Pfeifer 
Date:   Thu Nov 28 19:01:20 2019 +0100

Push  down into individual HTML files.

Historically we have been adding  to
all HTML files via our preprocessing machinery. With this change these
files become more self contained and in particular easier to validate
directly.

diff --git a/htdocs/about.html b/htdocs/about.html
index a67e3588..011f7ab5 100644
--- a/htdocs/about.html
+++ b/htdocs/about.html
@@ -2,6 +2,7 @@
 
 
 
+
 GCC: About
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/backends.html b/htdocs/backends.html
index c4b916d3..a392d5ea 100644
--- a/htdocs/backends.html
+++ b/htdocs/backends.html
@@ -2,6 +2,7 @@
 
 
 
+
 Status of Supported Architectures from Maintainers' Point of View
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/badspammer.html b/htdocs/badspammer.html
index e80d7c3c..e2016373 100644
--- a/htdocs/badspammer.html
+++ b/htdocs/badspammer.html
@@ -2,6 +2,7 @@
 
 
 
+
 GCC
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/benchmarks/index.html b/htdocs/benchmarks/index.html
index afb6c9d7..555d2906 100644
--- a/htdocs/benchmarks/index.html
+++ b/htdocs/benchmarks/index.html
@@ -1,6 +1,7 @@
 
 
 
+
 Benchmarking GCC
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/branch-closing.html b/htdocs/branch-closing.html
index 2b684503..f8a3ddfa 100644
--- a/htdocs/branch-closing.html
+++ b/htdocs/branch-closing.html
@@ -2,6 +2,7 @@
 
 
 
+
 Closing a GCC Release Branch
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/branching.html b/htdocs/branching.html
index 541a8b1b..5eb6e73a 100644
--- a/htdocs/branching.html
+++ b/htdocs/branching.html
@@ -2,6 +2,7 @@
 
 
 
+
 Branching for a GCC Release
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/bugs/index.html b/htdocs/bugs/index.html
index 36fac841..66d9138f 100644
--- a/htdocs/bugs/index.html
+++ b/htdocs/bugs/index.html
@@ -2,6 +2,7 @@
 
 
 
+
 GCC Bugs
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/bugs/management.html b/htdocs/bugs/management.html
index 92ee2730..18fee991 100644
--- a/htdocs/bugs/management.html
+++ b/htdocs/bugs/management.html
@@ -2,6 +2,7 @@
 
 
 
+
 Managing Bugs (Bugzilla and the Testsuite)
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/bugs/minimize.html b/htdocs/bugs/minimize.html
index 22df7de5..6197169a 100644
--- a/htdocs/bugs/minimize.html
+++ b/htdocs/bugs/minimize.html
@@ -2,6 +2,7 @@
 
 
 
+
 How to Minimize Test Cases for Bugs
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/bugs/reghunt.html b/htdocs/bugs/reghunt.html
index de2cdb6b..d9c92067 100644
--- a/htdocs/bugs/reghunt.html
+++ b/htdocs/bugs/reghunt.html
@@ -2,6 +2,7 @@
 
 
 
+
 How to Locate GCC Regressions
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/bugs/segfault.html b/htdocs/bugs/segfault.html
index a5769d2b..66543dc8 100644
--- a/htdocs/bugs/segfault.html
+++ b/htdocs/bugs/segfault.html
@@ -2,6 +2,7 @@
 
 
 
+
 How to debug a GCC segmentation fault
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/buildstat.html b/htdocs/buildstat.html
index 693a6ce4..f593d5c8 100644
--- a/htdocs/buildstat.html
+++ b/htdocs/buildstat.html
@@ -2,6 +2,7 @@
 
 
 
+
 Build status for GCC
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/bzkanban/index.html b/htdocs/bzkanban/index.html
index aa651f4f..009ea354 100644
--- a/htdocs/bzkanban/index.html
+++ b/htdocs/bzkanban/index.html
@@ -2,6 +2,7 @@
 
 
 
+
 Bz Kanban Board
 https://gcc.gnu.org/gcc.css; />
 https://maxcdn.bootstrapcdn.com/font-awesome/4.4.0/css/font-awesome.min.css;>
diff --git a/htdocs/c99status.html b/htdocs/c99status.html
index b5d80f98..b2b1fed6 100644
--- a/htdocs/c99status.html
+++ b/htdocs/c99status.html
@@ -2,6 +2,7 @@
 
 
 
+
 Status of C99 features in GCC
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/codingconventions.html b/htdocs/codingconventions.html
index d43f28f0..03a77063 100644
--- a/htdocs/codingconventions.html
+++ b/htdocs/codingconventions.html
@@ -2,6 +2,7 @@
 
 
 
+
 GCC Coding Conventions
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/codingrationale.html b/htdocs/codingrationale.html
index 2facc097..a2618b64 100644
--- a/htdocs/codingrationale.html
+++ b/htdocs/codingrationale.html
@@ -2,6 +2,7 @@
 
 
 
+
 GCC Coding Conventions Rationale and Discussion
 https://gcc.gnu.org/gcc.css; />
 
diff --git a/htdocs/contribute.html b/htdocs/contribute.html
index 80ebb26f..381aa02a 100644
--- a/htdocs/contribute.html
+++ b/htdocs/contribute.html
@@ -1,6 +1,7 @@
 
 
 
+
 
 
diff --git a/htdocs/contributewhy.html b/htdocs/contributewhy.html
index 0253d2a1..ce8d20f2 100644
--- a/htdocs/contributewhy.html
+++ b/htdocs/contributewhy.html
@@ -2,6 +2,7 @@
 
 
 

[Darwin, X86, testsuite, committed] Update tests for common section use.

2019-11-28 Thread Iain Sandoe
The switch to a default of no-common means that we no longer
indirect the accesses to 'xxx' in this test.  Adjust the
scan-assembler tests to reflect this.

tested on x86_64-darwin16, x86_64-linux-gnu
applied to mainline,
thanks
Iain


gcc/testsuite/ChangeLog:

2019-11-28  Iain Sandoe  

* gcc.target/i386/pr32219-2.c: Adjust scan-assembler entries
for revised common default.

diff --git a/gcc/testsuite/gcc.target/i386/pr32219-2.c b/gcc/testsuite/gcc.target/i386/pr32219-2.c
index b6212f7..a9c18ba 100644
--- a/gcc/testsuite/gcc.target/i386/pr32219-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr32219-2.c
@@ -12,13 +12,12 @@ foo ()
 }
 
 /* { dg-final { scan-assembler-not "movl\[ \t\]xxx\\(%rip\\), %" { target { ! ia32 } } } } */
-/* For Darwin m64 we are always PIC, but common symbols are indirected, which happens to
-   match the general "ELF" case.  */
-/* { dg-final { scan-assembler "xxx@GOTPCREL" { target { ! ia32 } } } } */
+/* For Darwin m64 PIC we make a direct access to this symbol.  */
+/* { dg-final { scan-assembler "xxx@GOTPCREL" { target { { ! ia32 } && { ! *-*-darwin* } } } } } */
 
 /* { dg-final { scan-assembler-not "movl\[ \t\]xxx@GOTOFF\\(%\[^,\]*\\), %" { target { ia32 && { ! *-*-darwin* } } } } } */
 /* { dg-final { scan-assembler "movl\[ \t\]xxx@GOT\\(%\[^,\]*\\), %" { target { ia32 && { ! *-*-darwin* } } } } } */
 
-/* Darwin m32 defaults to PIC but common symbols need to be indirected.  */
-/* { dg-final { scan-assembler {movl[ \t][Ll]_xxx\$non_lazy_ptr-L1\$pb\(%eax\),[ \t]%eax} { target { ia32 && *-*-darwin* } } } } */
+/* Darwin m32 PIC requires the picbase adjustment.  */
+/* { dg-final { scan-assembler {movl[ \t]_xxx-L1\$pb\(%eax\),[ \t]%eax} { target { ia32 && *-*-darwin* } } } } */
 



Re: [PATCH] Fix decimal floating-point LTO streaming for offloading compilation

2019-11-28 Thread Richard Biener
On Thu, Nov 28, 2019 at 4:04 PM Joseph Myers  wrote:
>
> On Thu, 28 Nov 2019, Julian Brown wrote:
>
> > Unlike e.g. the _FloatN types, when decimal floating-point types are
> > enabled, common tree nodes are created for each float type size (e.g.
> > dfloat32_type_node) and also a pointer to each type is created
> > (e.g. dfloat32_ptr_type_node). tree-streamer.c:record_common_node emits
> > these like:
>
> As far as I can tell, nothing actually uses those pointer nodes, or the
> corresponding BT_DFLOAT32_PTR etc. defined in builtin-types.def.  I don't
> know if they ever were used, or if they were just added by analogy to e.g.
> float_ptr_type_node.
>
> So I'd suggest simply removing all references to those tree nodes and
> corresponding BT_*, from builtin-types.def, jit/jit-builtins.c (commented
> out), tree-core.h, tree.c, tree.h.  Hopefully that will solve the
> offloading problem.

Indeed that seems to be the case and would be my suggestion as well.
The issue with LTO streaming here is that pointers get streamed as two
things but the error-mark replacement as one, which causes the mismatches.

So please just remove those three global types.

Richard.

> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: [PATCH][GCC][SLP][testsuite] Turn off vect-epilogue-nomask for slp-rect-3

2019-11-28 Thread Richard Biener
On Thu, 28 Nov 2019, Tamar Christina wrote:

> Hi All,
> 
> This patch turns off vect-epilogue-nomask for slp-reduc-3 as it seems that
> the epilogue in this loop is vectorizable using SLP and a smaller VF.  Since
> this test expects there to be no SLP vectorization at all the testcase then
> fails for arm targets.

Actually we do expect SLP vectorization, just the counting might go wrong.

What's the actual FAIL for arm?

Disabling epilogue vect is of course OK if it simplifies things.

> Regtested on arm-none-eabi and no issues.
> 
> Ok for trunk?

> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-11-28  Tamar Christina  
> 
>   * gcc.dg/vect/slp-reduc-3.c: Turn off epilogue-nomask.
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [PATCH] Fix decimal floating-point LTO streaming for offloading compilation

2019-11-28 Thread Joseph Myers
On Thu, 28 Nov 2019, Julian Brown wrote:

> On Thu, 28 Nov 2019 15:04:05 +
> Joseph Myers  wrote:
> 
> > On Thu, 28 Nov 2019, Julian Brown wrote:
> > 
> > > Unlike e.g. the _FloatN types, when decimal floating-point types are
> > > enabled, common tree nodes are created for each float type size
> > > (e.g. dfloat32_type_node) and also a pointer to each type is created
> > > (e.g. dfloat32_ptr_type_node). tree-streamer.c:record_common_node
> > > emits these like:  
> > 
> > As far as I can tell, nothing actually uses those pointer nodes, or
> > the corresponding BT_DFLOAT32_PTR etc. defined in builtin-types.def.
> > I don't know if they ever were used, or if they were just added by
> > analogy to e.g. float_ptr_type_node.
> > 
> > So I'd suggest simply removing all references to those tree nodes and 
> > corresponding BT_*, from builtin-types.def, jit/jit-builtins.c
> > (commented out), tree-core.h, tree.c, tree.h.  Hopefully that will
> > solve the offloading problem.
> 
> Thanks for review. How about this (lightly retested so far), assuming
> it passes full testing/bootstrap?

This patch is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix decimal floating-point LTO streaming for offloading compilation

2019-11-28 Thread Julian Brown
On Thu, 28 Nov 2019 15:04:05 +
Joseph Myers  wrote:

> On Thu, 28 Nov 2019, Julian Brown wrote:
> 
> > Unlike e.g. the _FloatN types, when decimal floating-point types are
> > enabled, common tree nodes are created for each float type size
> > (e.g. dfloat32_type_node) and also a pointer to each type is created
> > (e.g. dfloat32_ptr_type_node). tree-streamer.c:record_common_node
> > emits these like:  
> 
> As far as I can tell, nothing actually uses those pointer nodes, or
> the corresponding BT_DFLOAT32_PTR etc. defined in builtin-types.def.
> I don't know if they ever were used, or if they were just added by
> analogy to e.g. float_ptr_type_node.
> 
> So I'd suggest simply removing all references to those tree nodes and 
> corresponding BT_*, from builtin-types.def, jit/jit-builtins.c
> (commented out), tree-core.h, tree.c, tree.h.  Hopefully that will
> solve the offloading problem.

Thanks for review. How about this (lightly retested so far), assuming
it passes full testing/bootstrap?

Thanks,

Julian

ChangeLog

gcc/
* builtin-types.def (BT_DFLOAT32_PTR, BT_DFLOAT64_PTR,
BT_DFLOAT128_PTR): Remove.
* tree-core.h (TI_DFLOAT32_PTR_TYPE, TI_DFLOAT64_PTR_TYPE,
TI_DFLOAT128_PTR_TYPE): Remove.
* tree.c (build_common_tree_nodes): Remove dfloat32_ptr_type_node,
dfloat64_ptr_type_node and dfloat128_ptr_type_node initialisation.
* tree.h (dfloat32_ptr_type_node, dfloat64_ptr_type_node,
dfloat128_ptr_type_node): Remove macros.

gcc/jit/
* jit-builtins.c (BT_DFLOAT32_PTR, BT_DFLOAT64_PTR, BT_DFLOAT128_PTR):
Remove commented-out cases.
commit 80f69450724dbbf944a0d1e62e3ca6bdc3dd5a82
Author: Julian Brown 
Date:   Wed Nov 27 18:41:56 2019 -0800

Remove unused decimal floating-point pointer types

gcc/
* builtin-types.def (BT_DFLOAT32_PTR, BT_DFLOAT64_PTR,
BT_DFLOAT128_PTR): Remove.
* tree-core.h (TI_DFLOAT32_PTR_TYPE, TI_DFLOAT64_PTR_TYPE,
TI_DFLOAT128_PTR_TYPE): Remove.
* tree.c (build_common_tree_nodes): Remove dfloat32_ptr_type_node,
dfloat64_ptr_type_node and dfloat128_ptr_type_node initialisation.
* tree.h (dfloat32_ptr_type_node, dfloat64_ptr_type_node,
dfloat128_ptr_type_node): Remove macros.

gcc/jit/
* jit-builtins.c (BT_DFLOAT32_PTR, BT_DFLOAT64_PTR, BT_DFLOAT128_PTR):
Remove commented-out cases.

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 800b751de6d..2611e88da60 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -145,15 +145,6 @@ DEF_PRIMITIVE_TYPE (BT_DFLOAT64, (dfloat64_type_node
 DEF_PRIMITIVE_TYPE (BT_DFLOAT128, (dfloat128_type_node
    ? dfloat128_type_node
    : error_mark_node))
-DEF_PRIMITIVE_TYPE (BT_DFLOAT32_PTR, (dfloat32_ptr_type_node
-  ? dfloat32_ptr_type_node
-  : error_mark_node))
-DEF_PRIMITIVE_TYPE (BT_DFLOAT64_PTR, (dfloat64_ptr_type_node
-  ? dfloat64_ptr_type_node
-  : error_mark_node))
-DEF_PRIMITIVE_TYPE (BT_DFLOAT128_PTR, (dfloat128_ptr_type_node
-   ? dfloat128_ptr_type_node
-   : error_mark_node))
 
 DEF_PRIMITIVE_TYPE (BT_VALIST_REF, va_list_ref_type_node)
 DEF_PRIMITIVE_TYPE (BT_VALIST_ARG, va_list_arg_type_node)
diff --git a/gcc/jit/jit-builtins.c b/gcc/jit/jit-builtins.c
index 850329c7b36..93d48c64c40 100644
--- a/gcc/jit/jit-builtins.c
+++ b/gcc/jit/jit-builtins.c
@@ -434,9 +434,6 @@ builtins_manager::make_primitive_type (enum jit_builtin_type type_id)
 // case BT_DFLOAT32:
 // case BT_DFLOAT64:
 // case BT_DFLOAT128:
-// case BT_DFLOAT32_PTR:
-// case BT_DFLOAT64_PTR:
-// case BT_DFLOAT128_PTR:
 // case BT_VALIST_REF:
 // case BT_VALIST_ARG:
 // case BT_I1:
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 12e078882da..f76f68d835d 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -695,9 +695,6 @@ enum tree_index {
   TI_DFLOAT32_TYPE,
   TI_DFLOAT64_TYPE,
   TI_DFLOAT128_TYPE,
-  TI_DFLOAT32_PTR_TYPE,
-  TI_DFLOAT64_PTR_TYPE,
-  TI_DFLOAT128_PTR_TYPE,
 
   TI_VOID_LIST_NODE,
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 5ae250ee595..789f0a00f41 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10340,19 +10340,16 @@ build_common_tree_nodes (bool signed_char)
   TYPE_PRECISION (dfloat32_type_node) = DECIMAL32_TYPE_SIZE;
   SET_TYPE_MODE (dfloat32_type_node, SDmode);
   layout_type (dfloat32_type_node);
-  dfloat32_ptr_type_node = build_pointer_type (dfloat32_type_node);
 
   dfloat64_type_node = make_node (REAL_TYPE);
   TYPE_PRECISION (dfloat64_type_node) = DECIMAL64_TYPE_SIZE;
   SET_TYPE_MODE (dfloat64_type_node, DDmode);
   layout_type (dfloat64_type_node);
-  dfloat64_ptr_type_node = build_pointer_type (dfloat64_type_node);
 
   dfloat128_type_node = make_node (REAL_TYPE);
   TYPE_PRECISION (dfloat128_type_node) = DECIMAL128_TYPE_SIZE;
   

Fix leftover optimize checks

2019-11-28 Thread Jan Hubicka
Hi,
these optimize flag checks were forgotten after the params conversion.

Committed as obvious.
Honza

* ipa-inline.c (want_early_inline_function_p): Remove leftover optimize
checks.
Index: ipa-inline.c
===
--- ipa-inline.c(revision 278778)
+++ ipa-inline.c(working copy)
@@ -701,10 +701,8 @@ want_early_inline_function_p (struct cgr
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, e->call_stmt,
 "  will not early inline: %C->%C, "
-"growth %i exceeds --param early-inlining-insns%s\n",
-e->caller, callee, growth,
-opt_for_fn (e->caller->decl, optimize) >= 3
-? "" : "-O2");
+"growth %i exceeds --param early-inlining-insns\n",
+e->caller, callee, growth);
  want_inline = false;
}
   else if ((n = num_calls (callee)) != 0
@@ -713,11 +711,9 @@ want_early_inline_function_p (struct cgr
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, e->call_stmt,
 "  will not early inline: %C->%C, "
-"growth %i exceeds --param early-inlining-insns%s "
+"growth %i exceeds --param early-inlining-insns "
 "divided by number of calls\n",
-e->caller, callee, growth,
-opt_for_fn (e->caller->decl, optimize) >= 3
-? "" : "-O2");
+e->caller, callee, growth);
  want_inline = false;
}
 }
@@ -861,12 +857,9 @@ want_inline_small_function_p (struct cgr
- ipa_call_summaries->get (e)->call_stmt_size
  > inline_insns_single (e->caller, true))
 {
-  if (opt_for_fn (e->caller->decl, optimize) >= 3)
-   e->inline_failed = (DECL_DECLARED_INLINE_P (callee->decl)
-   ? CIF_MAX_INLINE_INSNS_SINGLE_LIMIT
-   : CIF_MAX_INLINE_INSNS_AUTO_LIMIT);
-  else
-   e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
+  e->inline_failed = (DECL_DECLARED_INLINE_P (callee->decl)
+ ? CIF_MAX_INLINE_INSNS_SINGLE_LIMIT
+ : CIF_MAX_INLINE_INSNS_AUTO_LIMIT);
   want_inline = false;
 }
   else


Add sanity checking for profile counter compatibility

2019-11-28 Thread Jan Hubicka
Hi,
this patch adds sanity checks which uncovered all the latent bugs I fixed
today.  I will add similar checking to cfg profile, too, and also a few
unit tests.

Bootstrapped/regtested x86_64-linux, plan to commit it shortly.

* profile-count.c (profile_count::to_cgraph_frequency,
profile_count::to_sreal_scale): Check for compatibility of counts.
* profile-count.h (compatible_p): Make public; add checking for
global0 versus global types.
* cgraph.c (cgraph_node::verify_node): Verify count compatibility.

Index: profile-count.c
===
--- profile-count.c (revision 278814)
+++ profile-count.c (working copy)
@@ -291,6 +292,7 @@ profile_count::to_cgraph_frequency (prof
 return 0;
   gcc_checking_assert (entry_bb_count.initialized_p ());
   uint64_t scale;
+  gcc_checking_assert (compatible_p (entry_bb_count));
   if (!safe_scale_64bit (!entry_bb_count.m_val ? m_val + 1 : m_val,
 CGRAPH_FREQ_BASE, MAX (1, entry_bb_count.m_val), &scale))
 return CGRAPH_FREQ_MAX;
@@ -328,6 +330,7 @@ profile_count::to_sreal_scale (profile_c
 return 0;
   if (m_val == in.m_val)
 return 1;
+  gcc_checking_assert (compatible_p (in));
 
   if (!in.m_val)
 {
Index: profile-count.h
===
--- profile-count.h (revision 278814)
+++ profile-count.h (working copy)
@@ -700,6 +700,7 @@ private:
   uint64_t UINT64_BIT_FIELD_ALIGN m_val : n_bits;
 #undef UINT64_BIT_FIELD_ALIGN
   enum profile_quality m_quality : 3;
+public:
 
   /* Return true if both values can meaningfully appear in single function
  body.  We have either all counters in function local or global, otherwise
@@ -711,9 +712,18 @@ private:
   if (*this == zero ()
  || other == zero ())
return true;
+  /* Do not allow nonzero global profile together with local guesses
+that are globally0.  */
+  if (ipa ().nonzero_p ()
+ && !(other.ipa () == other))
+   return false;
+  if (other.ipa ().nonzero_p ()
+ && !(ipa () == *this))
+   return false;
+   
   return ipa_p () == other.ipa_p ();
 }
-public:
+
   /* Used for counters which are expected to be never executed.  */
   static profile_count zero ()
 {
Index: cgraph.c
===
--- cgraph.c(revision 278814)
+++ cgraph.c(working copy)
@@ -3061,6 +3061,13 @@ cgraph_node::verify_node (void)
   error ("inline clone in same comdat group list");
   error_found = true;
 }
+  if (inlined_to && !count.compatible_p (inlined_to->count))
+{
+  error ("inline clone count is not compatible");
+  count.debug ();
+  inlined_to->count.debug ();
+  error_found = true;
+}
   if (!definition && !in_other_partition && local)
 {
   error ("local symbols must be defined");
@@ -3089,6 +3096,13 @@ cgraph_node::verify_node (void)
 identifier_to_locale (e->caller->name ()));
  error_found = true;
}
+  if (!e->count.compatible_p (count))
+   {
+ error ("edge count is not compatible with function count");
+ e->count.debug ();
+ count.debug ();
+ error_found = true;
+   }
   if (!e->indirect_unknown_callee
  || !e->indirect_info)
{
@@ -3137,6 +3151,13 @@ cgraph_node::verify_node (void)
 {
   if (e->verify_count ())
error_found = true;
+  if (!e->count.compatible_p (count))
+   {
+ error ("edge count is not compatible with function count");
+ e->count.debug ();
+ count.debug ();
+ error_found = true;
+   }
   if (gimple_has_body_p (e->caller->decl)
  && !e->caller->inlined_to
  && !e->speculative


[PATCH][GCC][SLP][testsuite] Turn off vect-epilogue-nomask for slp-rect-3

2019-11-28 Thread Tamar Christina
Hi All,

This patch turns off vect-epilogue-nomask for slp-reduc-3 as it seems that
the epilogue in this loop is vectorizable using SLP and smaller VF.  Since this
test expects there to be no SLP vectorization at all the testcase then fails
for arm targets.

Regtested on arm-none-eabi and no issues.

Ok for trunk?

Thanks,
Tamar

gcc/testsuite/ChangeLog:

2019-11-28  Tamar Christina  

* gcc.dg/vect/slp-reduc-3.c: Turn off epilogue-nomask.

-- 
diff --git a/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c b/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c
index 9c8124c9b5f289d0a2ed49d3c8ee626d0bf05862..7358275c3cba6b3fd41b34cb2449c85810b0a35c 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "--param=vect-epilogues-nomask=0" } */
 
 #include 
 #include "tree-vect.h"



Minor fix for profile_count::combine_with_ipa_count

2019-11-28 Thread Jan Hubicka
Hi,
this patch makes combine_with_ipa_count return uninitialized when
called on uninitialized count. Previously in some cases it returned
IPA, which is probably not a big deal but it should not do that for
consistency.
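
A minimal sketch of the intended behaviour, written as a toy C++ model rather than the real profile_count class (ToyCount, its fields and combine_with_ipa are all hypothetical names, and the real function has further cases that are omitted here): the early return keeps an uninitialized count uninitialized even when the IPA count passed in is nonzero.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for profile_count; every name here is hypothetical.
struct ToyCount {
  bool initialized = false;
  bool is_ipa = false;      // true if the value comes from a global (IPA) profile
  std::uint64_t value = 0;

  // Mirrors only the shape of the fix: an uninitialized count stays
  // uninitialized instead of being replaced by the IPA count.
  ToyCount combine_with_ipa(const ToyCount &ipa) const {
    if (!initialized)
      return *this;
    if (ipa.initialized && ipa.is_ipa && ipa.value != 0)
      return ipa;
    return *this;
  }
};
```

Without the first check, the toy version would hand back the IPA count for an uninitialized receiver, which is the inconsistency described above.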

Bootstrapped/regtested x86_64-linux, committed.

* profile-count.c (profile_count::combine_with_ipa_count): Return
uninitialized count if called on uninitialized count.

Index: profile-count.c
===
--- profile-count.c (revision 278814)
+++ profile-count.c (working copy)
@@ -373,6 +376,8 @@ profile_count::adjust_for_ipa_scaling (p
 profile_count
 profile_count::combine_with_ipa_count (profile_count ipa)
 {
+  if (!initialized_p ())
+return *this;
   ipa = ipa.ipa ();
   if (ipa.nonzero_p ())
 return ipa;


Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-11-28 Thread Thomas Schwinge
Hi Tobias!

On 2019-11-26T15:02:34+0100, Tobias Burnus  wrote:
> I now played also around common blocks with "!$acc declare 
> device_resident (/block/)". [See attached test-case diff.]

If you'd like to, please commit that, to document the status quo.  (I
have not reviewed.)


There are several issues with the OpenACC 'declare' implementation, so
that one generally needs to be re-visited at some point.  Basically
everything from the front ends handling, to middle end handling, to nvptx
back end handling (supposedly?; see ), to
libgomp handling.  So, you're adding here some more.  ;-)

> Observations:
>
> * !$acc declare has to come after the declaration of the common block. 
> In terms of the spec, it just needs to be in the declaration section, 
> i.e. it could also be before. – Seems as if one needs to split parsing 
> and resolving clauses.

Good find -- purely a Fortran front end issue, as I understand.  Please
file a GCC PR, unless there is a reason (implementation complexity?) to
be more "strict" ("referenced variable/common block needs to be lexically
in scope", or something like that?), and the OpenACC specification should
be changed instead?

> * If I just use '!$acc parallel', the used variables are copied in 
> according to OpenMP 4.0 semantics, i.e. without a defaultmap clause (of 
> OpenMP 4.5+; not yet in gfortran), scalars are firstprivate and arrays 
> are map(fromto:). – Does this behaviour match the spec or should this 
> automatically mapped to, e.g., no_create as the 'device_resident' is 
> known? [Side remark: the module file does contain 
> "OACC_DECLARE_DEVICE_RESIDENT".]

Not sure at this point.

> * If I explicitly use '!$acc parallel present(/block/)' that fails 
> because present() does not permit common blocks.
> (OpenACC 2.7, p36, l.1054: "For all clauses except deviceptr and 
> present, the list argument may include a Fortran common block name 
> enclosed within slashes").

Do you understand the rationale behind that restriction, by the way?  I'm
not sure I do.  Is it because we don't know/can't be sure that *all* of
the common block has been mapped (per the rules set elsewhere)?  That
would make sense in context of this:

> I could use no_create

... which basically means 'present' but don't complain if not actually
present.

> but that's not yet 
> supported.

But will be soon.  :-)


Grüße
 Thomas


> --- a/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90
> @@ -1,29 +1,106 @@
>  ! { dg-do run }
>  
>  module vars
>implicit none
>real b
> - !$acc declare device_resident (b)
> +  !$acc declare device_resident (b)
> +
> +  integer :: x, y, z
> +  common /block/ x, y, z
> +  !$acc declare device_resident (/block/)
>  end module vars
>  
> +subroutine set()
> +  use openacc
> +  implicit none
> +  integer :: a(5), b(1), c, vals(7)
> +  common /another/ a, b, c
> +  !$acc declare device_resident (/another/)
> +  if (.not. acc_is_present (a)) stop 10
> +  if (.not. acc_is_present (b)) stop 11
> +  if (.not. acc_is_present (c)) stop 12
> +
> +  vals = 99
> +  !$acc parallel copyout(vals) present(a, b, c) ! OK
> +! but w/o 'present', 'c' is firstprivate and a+b are 'map(fromto:'
> +! additionally, OpenACC 2.7 does not permit present(/another/)
> +! and no_create is not yet in the trunk (but submitted)
> +a = [11,12,13,14,15]
> +b = 16
> +c = 47
> +vals(1:5) = a
> +vals(6:6) = b
> +vals(7) = c
> +  !$acc end parallel
> +
> +  if (.not. acc_is_present (a)) stop 13
> +  if (.not. acc_is_present (b)) stop 14
> +  if (.not. acc_is_present (c)) stop 15
> +
> +  if (any (vals /= [11,12,13,14,15,16,47])) stop 16
> +end subroutine set
> +
> +subroutine check()
> +  use openacc
> +  implicit none
> +  integer :: g, h(3), i(3)
> +  common /another/ g, h, i
> +  integer :: val(7)
> +  !$acc declare device_resident (/another/)
> +  if (.not. acc_is_present (g)) stop 20
> +  if (.not. acc_is_present (h)) stop 21
> +  if (.not. acc_is_present (i)) stop 22
> +
> +  val = 99
> +  !$acc parallel copyout(val) present(g, h, i)
> +val(5:7) = i
> +val(1) = g
> +val(2:4) = h
> +  !$acc end parallel
> +
> +  if (.not. acc_is_present (g)) stop 23
> +  if (.not. acc_is_present (h)) stop 24
> +  if (.not. acc_is_present (i)) stop 25
> +
> +
> +  !print *, val
> +  if (any (val /= [11,12,13,14,15,16,47])) stop 26
> +end subroutine check
> +
> +
>  program test
>use vars
>use openacc
>implicit none
>real a
> +  integer :: k
>  
> -  if (acc_is_present (b) .neqv. .true.) STOP 1
> +  call set()
> +  call check()
> +
> +  if (.not. acc_is_present (b)) stop 1
> +  if (.not. acc_is_present (x)) stop 2
> +  if (.not. acc_is_present (y)) stop 3
> +  if (.not. acc_is_present (z)) stop 4
>  
>a = 2.0
> +  k = 42

Prevent inconsistent profiles to be created in inline_transform

2019-11-28 Thread Jan Hubicka
Hi,
ipa-inline-transform first applies edge redirection and then scales
profile.  For some reason cgraph_edge::redirect_call_stmt_to_callee
copies bb's count into callgraph edge count.  This leads to
inconsistency because cfg profile is before scaling at this point.
Fixed thus.
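
The ordering problem can be sketched with a toy model in C++ (this is not GCC code; ToyFunction, scale_blocks, redirect_edges and consistent are hypothetical names, and the integer num/den scaling only stands in for apply_scale): if edge counts are copied from block counts before the blocks are scaled, the two sets of counts end up disagreeing.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy function body: one call edge per basic block, for simplicity.
struct ToyFunction {
  std::vector<std::uint64_t> block_counts;  // per-basic-block counts
  std::vector<std::uint64_t> edge_counts;   // call-edge counts
};

// Scale every block count by num/den (den is assumed to be nonzero).
void scale_blocks(ToyFunction &fn, std::uint64_t num, std::uint64_t den) {
  for (auto &count : fn.block_counts)
    count = count * num / den;
}

// Stand-in for redirection: each edge picks up its block's current count.
void redirect_edges(ToyFunction &fn) {
  fn.edge_counts = fn.block_counts;
}

// The profile is consistent when edges and blocks agree.
bool consistent(const ToyFunction &fn) {
  return fn.edge_counts == fn.block_counts;
}
```

Calling redirect_edges and then scale_blocks leaves the edges holding the unscaled values, while scaling first and redirecting afterwards keeps both in step.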

profilebootstrapped/regtested x86_64-linux, committed.

* ipa-inline-transform.c (inline_transform): Scale profile before
redirecting.
Index: ipa-inline-transform.c
===
--- ipa-inline-transform.c  (revision 278811)
+++ ipa-inline-transform.c  (working copy)
@@ -681,6 +681,31 @@ inline_transform (struct cgraph_node *no
   if (preserve_function_body_p (node))
 save_inline_function_body (node);
 
+  profile_count num = node->count;
+  profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
+  bool scale = num.initialized_p () && !(num == den);
+  if (scale)
+{
+  profile_count::adjust_for_ipa_scaling (&num, &den);
+  if (dump_file)
+   {
+ fprintf (dump_file, "Applying count scale ");
+ num.dump (dump_file);
+ fprintf (dump_file, "/");
+ den.dump (dump_file);
+ fprintf (dump_file, "\n");
+   }
+
+  basic_block bb;
+  cfun->cfg->count_max = profile_count::uninitialized ();
+  FOR_ALL_BB_FN (bb, cfun)
+   {
+ bb->count = bb->count.apply_scale (num, den);
+ cfun->cfg->count_max = cfun->cfg->count_max.max (bb->count);
+   }
+  ENTRY_BLOCK_PTR_FOR_FN (cfun)->count = node->count;
+}
+
   for (e = node->callees; e; e = next)
 {
   if (!e->inline_failed)
@@ -693,32 +718,8 @@ inline_transform (struct cgraph_node *no
   timevar_push (TV_INTEGRATION);
   if (node->callees && (opt_for_fn (node->decl, optimize) || has_inline))
 {
-  profile_count num = node->count;
-  profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
-  bool scale = num.initialized_p () && !(num == den);
-  if (scale)
-   {
- profile_count::adjust_for_ipa_scaling (&num, &den);
- if (dump_file)
-   {
- fprintf (dump_file, "Applying count scale ");
- num.dump (dump_file);
- fprintf (dump_file, "/");
- den.dump (dump_file);
- fprintf (dump_file, "\n");
-   }
-
- basic_block bb;
- cfun->cfg->count_max = profile_count::uninitialized ();
- FOR_ALL_BB_FN (bb, cfun)
-   {
- bb->count = bb->count.apply_scale (num, den);
- cfun->cfg->count_max = cfun->cfg->count_max.max (bb->count);
-   }
- ENTRY_BLOCK_PTR_FOR_FN (cfun)->count = node->count;
-   }
   todo = optimize_inline_calls (current_function_decl);
-   }
+}
   timevar_pop (TV_INTEGRATION);
 
   cfun->always_inline_functions_inlined = true;


Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-11-28 Thread Thomas Schwinge
Hi Tobias!

On 2019-11-25T15:02:16+0100, Tobias Burnus  wrote:
> sorry for the belated reply.

Eh, no worries -- I'm way more behind on things...


> On 11/11/19 10:39 AM, Thomas Schwinge wrote:
>> By the way, do you know what the status is for Fortran common blocks in
>> OpenMP: supported vs. expected per the specification?
>
> No; however, I had a quick glance at the spec and at the test cases; 
> both compile-time and run-time test have some coverage, although I 
> didn't spot a run-time test for one 'omp target'.

Thanks.

> Definition (3.32.1 in F2018): "blank common" = "unnamed common block". 
> 'common /name/ x" (continues) define the common block named "name" by 
> adding 'x' to it. While "common // y" or "common y" appends 'y' to the 
> blank common.

Thanks for the concise summary.

> In OpenMP 5, common blocks appear twice – once [2.1, p.39, ll.11ff.] as 
> general rule in the definition of "list item" (which are inherited by 
> "extended list item" and "locator-list item"). [There are also some 
> constraints and notes regarding common blocks)]. It does not really tell 
> whether blank commons are permitted or not; some description is 
> explicitly for named-common variables, leaving blank-common ones out 
> (and undefined). But later sections explicitly make reference to blank 
> commons, hence, one can assume they are permitted unless explicitly 
> stated that they are not.

Yes, I go by the assumption that everything contained in the base
languages of OpenACC/OpenMP (so, the respective C, C++, Fortran
standards), should also work in an OpenACC/OpenMP context in a sensible
manner (detailed/clarified in the respective specification as necessary),
and if not supported then that ought to be spelled out explicitly.  (For
example, see either the "catch-all" notes in OpenACC 3.0,
1.7. "References", or the more in-detail notes in specific sections.)
Anything else I'd consider a bug in the respective specification, which
should be reported/fixed.

That said, if you think OpenMP needs to clarify whether Fortran blank
common blocks are supported or not, then file an issue or directly submit
a pull request against the specification on 
(once we've got access).

> And then very selectively for some items:
> * allocate – only with default allocator.
> * declare target – some restrictions and no blank commons
> * depend clause – no common permitted
> * threadprivate – some notes and explanation of the syntax (why?)
>also only here requirement regarding common blocks with bind(c)
>(why not also for declare target?)
> * linear clause – no common permitted
> * copyin – some notes
> * copyprivate – some notes
>
> As target test cases were suspiciously left out, I tried '!$omp target 
> map(/name/)' which was rejected. I think one should add test cases for 
> newer features – which mostly means 'omp target' and add the missing 
> common-block checks. – And one has to find out why blank commons are not 
> permitted and for the places where they are permitted, support has to be 
> added.

ACK.  Instead of "burying" such things in long emails, I like to see GCC
PRs filed, which can then be actioned on individually.

> Talking about blank common blocks, the current OpenACC implementation 
> does not seem to like them (see below); the spec (2.7) does not mention 
> blank common blocks at all. – It talks about name between two slashes, 
> but leaves it open whether the name can also be an empty string.

My assumption would thus be: yes, ought to be supported -- but I haven't
thought through whether that makes sense, so...

> common // x,y  !blank common
> !$acc parallel copyin(//)
> !$acc end parallel
> end
>
> fails with:
>
>  2 | !$acc parallel copyin(//)
>|   1
> Error: Syntax error in OpenMP variable list at (1)

..., please test with the PGI compiler (just to get more data), and
determine whether that makes sense to support in an OpenACC context
(likewise for OpenMP, of course), and then (once you've got access)
either file an issue, or (better) directly submit a pull request for
 to clarify that.  Sometimes
it's as easy as replacing non-standard text ("name between two slashes")
with the corresponding standard text (whatever the Fortran specification
calls this).


> On 2019-10-25T16:36:10+0200, Tobias Burnus  wrote:
>
>>> * I have now a new test case
>>> libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
>>> omplower.
>> Thanks. Curious: why 'omplower' instead of 'gimple' dump?
>
> So far I found -fdump-tree-original, -fdump-omplower and 
> -fdump-optimized quite useful – I have so far not used 
> -fdump-tree-gimple, hence, I cannot state what's the advantage of the 
> latter.

My rationale is that your code changes are in 'gcc/gimplify.c', so you'd
test for that stuff in the 'gimple' dump (which is between 'original' and
'omplower').

> The original dump I like because it shows 

Re: [patch,fortran] PR90374 add e0 zero width exponent support

2019-11-28 Thread Steve Kargl
On Thu, Nov 28, 2019 at 08:05:31AM -0800, Jerry DeLisle wrote:
> On 11/28/19 7:53 AM, Steve Kargl wrote:
> > On Thu, Nov 28, 2019 at 07:45:25AM -0800, Jerry DeLisle wrote:
> >> +if (u == FMT_ZERO)
> >> +  {
> >> +if (!gfc_notify_std (GFC_STD_F2018,
> >> +"Positive exponent width required in "
> >> +"format string at %L", &format_locus))
> >> +  {
> >> +saved_token = u;
> >> +goto fail;
> >> +  }
> >> +  }
> >> +else
> >> +  {
> >> +error = G_("Positive exponent width required in format"
> >> +   "string at %L");
> > 
> > This needs a space after "format" or before "string".
> > 
> >> +goto syntax;
> >> +  }
> >>}
> > 
> 
> Fixed. OK with that?
> 

LGTM.

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Re: [patch,fortran] PR90374 add e0 zero width exponent support

2019-11-28 Thread Jerry DeLisle

On 11/28/19 7:53 AM, Steve Kargl wrote:

On Thu, Nov 28, 2019 at 07:45:25AM -0800, Jerry DeLisle wrote:

+ if (u == FMT_ZERO)
+   {
+ if (!gfc_notify_std (GFC_STD_F2018,
+ "Positive exponent width required in "
+ "format string at %L", &format_locus))
+   {
+ saved_token = u;
+ goto fail;
+   }
+   }
+ else
+   {
+ error = G_("Positive exponent width required in format"
+"string at %L");


This needs a space after "format" or before "string".


+ goto syntax;
+   }
}




Fixed. OK with that?

Jerry


Re: [patch,fortran] PR90374 add e0 zero width exponent support

2019-11-28 Thread Steve Kargl
On Thu, Nov 28, 2019 at 07:45:25AM -0800, Jerry DeLisle wrote:
> +   if (u == FMT_ZERO)
> + {
> +   if (!gfc_notify_std (GFC_STD_F2018,
> +   "Positive exponent width required in "
> +   "format string at %L", &format_locus))
> + {
> +   saved_token = u;
> +   goto fail;
> + }
> + }
> +   else
> + {
> +   error = G_("Positive exponent width required in format"
> +  "string at %L");

This needs a space after "format" or before "string".

> +   goto syntax;
> + }
>   }
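
The missing space matters because adjacent string literals are concatenated with nothing between them. A minimal standalone sketch of the effect follows; the function names are hypothetical and exist only for this illustration, they are not part of the patch:

```cpp
#include <cassert>
#include <cstring>

// Adjacent string literals are pasted together verbatim, so the first
// message reads "...in formatstring at %L".
const char *broken_message() {
  return "Positive exponent width required in format"
         "string at %L";
}

// With a trailing space on the first literal the words stay separated.
const char *fixed_message() {
  return "Positive exponent width required in format "
         "string at %L";
}

// Hypothetical helper: true if needle occurs in haystack.
bool contains(const char *haystack, const char *needle) {
  return std::strstr(haystack, needle) != nullptr;
}
```

Here contains (broken_message (), "formatstring") holds, while the fixed variant contains "format string".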

-- 
Steve


Fix profile_count::max and profile_count::apply_scale WRT counts of different types

2019-11-28 Thread Jan Hubicka
Hi,
profile_count::max and profile_count::apply_scale are two functions that need
to work with pairs of counters from different functions (the first is used to
obtain overall statistics used by e.g. the inliner, and profile_count::apply_scale
is used to scale the function profile when inline clones are created).

profile_count::max simply needs to prefer global counts over local ones, which
is implemented by converting to IPA.

profile_count::apply_scale needs to be careful about quality updates.
If NUM is GUESSED_GLOBAL0 or GUESSED_GLOBAL0_ADJUSTED the result needs to
be of same type and if NUM is other IPA count the result needs to be IPA too.
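
As a rough illustration of the quality rule just described, here is a toy model in C++.  It is only a sketch of the prose rule, not the code in the patch: the enum ordering and the name scaled_quality are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>

// Toy quality ladder.  The ordering is an assumption for this sketch,
// modeled on the names used in the mail.
enum Quality {
  UNINITIALIZED,
  GUESSED_LOCAL,
  GUESSED_GLOBAL0,
  GUESSED_GLOBAL0_ADJUSTED,
  GUESSED,
  AFDO,
  ADJUSTED,
  PRECISE
};

// Hypothetical helper: quality of SELF scaled by NUM/DEN.
Quality scaled_quality(Quality self, Quality num, Quality den) {
  // Scaling never yields anything better than ADJUSTED, and never
  // better than the weakest of the three inputs.
  Quality q = std::min({std::min(self, ADJUSTED), num, den});
  // NUM is global0 (plain or adjusted): the result keeps that type.
  if (num == GUESSED_GLOBAL0 || num == GUESSED_GLOBAL0_ADJUSTED)
    return num;
  // NUM is any other IPA count: the result must be IPA too.
  if (num >= GUESSED)
    return std::max(q, GUESSED);
  return q;
}
```

For instance, scaling a GUESSED_LOCAL counter by a PRECISE numerator gives GUESSED in this model, so the result is no longer local.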

Bootstrapped/regtested x86_64-linux, will commit it shortly after testing on
Firefox.

* profile-count.h (profile_count::max): Work on profiles of different
type.
(profile_count::apply_scale): Be sure that ret is not local or global0
type if num is global.
Index: profile-count.h
===
--- profile-count.h (revision 278811)
+++ profile-count.h (working copy)
@@ -992,6 +992,14 @@ public:
 
   profile_count max (profile_count other) const
 {
+  profile_count val = *this;
+
+  /* Always prefer nonzero IPA counts over local counts.  */
+  if (ipa ().nonzero_p () || other.ipa ().nonzero_p ())
+   {
+ val = ipa ();
+ other = other.ipa ();
+   }
   if (!initialized_p ())
return other;
   if (!other.initialized_p ())
@@ -1001,8 +1009,8 @@ public:
   if (other == zero ())
return *this;
   gcc_checking_assert (compatible_p (other));
-  if (m_val < other.m_val || (m_val == other.m_val
- && m_quality < other.m_quality))
+  if (val.m_val < other.m_val || (m_val == other.m_val
+ && val.m_quality < other.m_quality))
return other;
   return *this;
 }
@@ -1075,8 +1083,11 @@ public:
   ret.m_val = MIN (val, max_count);
   ret.m_quality = MIN (MIN (MIN (m_quality, ADJUSTED),
num.m_quality), den.m_quality);
-  if (num.ipa_p () && !ret.ipa_p ())
-   ret.m_quality = MIN (num.m_quality, GUESSED);
+  /* Be sure that ret is not local if num is global.
+Also ensure that ret is not global0 when num is global.  */
+  if (num.ipa_p ())
+   ret.m_quality = MAX (num.m_quality,
+num == num.ipa () ? GUESSED : num.m_quality);
   return ret;
 }
 


[patch,fortran] PR90374 add e0 zero width exponent support

2019-11-28 Thread Jerry DeLisle

Hi all,

The attached patch implements the last piece of this, which enables the zero 
width exponent, giving a processor dependent width.
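
For readers unfamiliar with the runtime side, the digit-counting loop in write_float.def that this interacts with can be sketched standalone.  The function names below are hypothetical; only the loop shape and the "<= 0 means width not specified" test are taken from the patch:

```cpp
#include <cassert>
#include <cstdlib>

// Count the decimal digits needed to print |e|, mirroring the
// "for (i = abs (e); i >= 10; i /= 10) edigits++;" loop.
int exponent_digits(int e) {
  int edigits = 1;
  for (int i = std::abs(e); i >= 10; i /= 10)
    ++edigits;
  return edigits;
}

// Hypothetical predicate: with the patch, an exponent width of 0 (from e0)
// is treated like an omitted width, so the runtime picks the layout.
bool exponent_width_unspecified(int e_width) {
  return e_width <= 0;
}
```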


Regression tested on x86_64-pc-linux-gnu.

I don't think it is very intrusive and I updated the test case.

OK for trunk?

Regards,

Jerry

2019-11-27  Jerry DeLisle  

PR fortran/90374
* io.c (check_format): Allow zero width exponent with e0.

io/format.c (parse_format_list): Relax format checking to allow
e0 exponent specifier.

* gfortran.dg/fmt_zero_width.f90: Update test.



diff --git a/gcc/fortran/io.c b/gcc/fortran/io.c
index 57a3fdd5152..70aa6474445 100644
--- a/gcc/fortran/io.c
+++ b/gcc/fortran/io.c
@@ -1007,9 +1007,22 @@ data_desc:
 	goto fail;
 	  if (u != FMT_POSINT)
 	{
-	  error = G_("Positive exponent width required in format string "
-			 "at %L");
-	  goto syntax;
+	  if (u == FMT_ZERO)
+		{
+		  if (!gfc_notify_std (GFC_STD_F2018,
+  "Positive exponent width required in "
+  "format string at %L", &format_locus))
+		{
+		  saved_token = u;
+		  goto fail;
+		}
+		}
+	  else
+		{
+		  error = G_("Positive exponent width required in format"
+			 "string at %L");
+		  goto syntax;
+		}
 	}
 	}
 
diff --git a/gcc/testsuite/gfortran.dg/fmt_zero_width.f90 b/gcc/testsuite/gfortran.dg/fmt_zero_width.f90
index 093c0a44c34..640b6735c65 100644
--- a/gcc/testsuite/gfortran.dg/fmt_zero_width.f90
+++ b/gcc/testsuite/gfortran.dg/fmt_zero_width.f90
@@ -1,11 +1,11 @@
 ! { dg-do run }
 ! PR90374 "5.5 d0.d, e0.d, es0.d, en0.d, g0.d and ew.d edit descriptors
 program pr90374
+  implicit none
   real(4) :: rn
   character(32) :: afmt, aresult
-  real(8) :: one = 1.0D0, zero = 0.0D0, nan, pinf, minf
+  real(8) :: one = 1.0D0, zero = 0.0D0, pinf, minf
 
-  nan = zero/zero
   rn = 0.00314_4
   afmt = "(D0.3)"
   write (aresult,fmt=afmt) rn
@@ -22,15 +22,19 @@ program pr90374
   afmt = "(G0.10)"
   write (aresult,fmt=afmt) rn
   if (aresult /= "0.313928E-02") stop 24
+  afmt = "(E0.10e0)"
+  write (aresult,fmt=afmt) rn
+  if (aresult /= "0.313928E-02") stop 27
   write (aresult,fmt="(D0.3)") rn
-  if (aresult /= "0.314D-02") stop 26
+  if (aresult /= "0.314D-02") stop 29
   write (aresult,fmt="(E0.10)") rn
-  if (aresult /= "0.313928E-02") stop 28
+  if (aresult /= "0.313928E-02") stop 31
   write (aresult,fmt="(ES0.10)") rn
-  if (aresult /= "3.139280E-03") stop 30
+  if (aresult /= "3.139280E-03") stop 33
   write (aresult,fmt="(EN0.10)") rn
-  if (aresult /= "3.139280E-03") stop 32
+  if (aresult /= "3.139280E-03") stop 35
   write (aresult,fmt="(G0.10)") rn
-  if (aresult /= "0.313928E-02") stop 34
-
+  if (aresult /= "0.313928E-02") stop 37
+  write (aresult,fmt="(E0.10e0)") rn
+  if (aresult /= "0.313928E-02") stop 39
 end
diff --git a/libgfortran/io/format.c b/libgfortran/io/format.c
index b33620815d5..dd448c83e87 100644
--- a/libgfortran/io/format.c
+++ b/libgfortran/io/format.c
@@ -1027,11 +1027,17 @@ parse_format_list (st_parameter_dt *dtp, bool *seen_dd)
 	{
 	  t = format_lex (fmt);
 	  if (t != FMT_POSINT)
-	{
-	  fmt->error = "Positive exponent width required in format";
-	  goto finished;
-	}
-
+	if (t == FMT_ZERO)
+	  {
+		notify_std (&dtp->common, GFC_STD_F2018,
+			"Positive exponent width required");
+	  }
+	else
+	  {
+		fmt->error = "Positive exponent width required in "
+			 "format string at %L";
+		goto finished;
+	  }
 	  tail->u.real.e = fmt->value;
 	}
 
diff --git a/libgfortran/io/write_float.def b/libgfortran/io/write_float.def
index daa16679f53..ce6aec83114 100644
--- a/libgfortran/io/write_float.def
+++ b/libgfortran/io/write_float.def
@@ -482,7 +482,7 @@ build_float_string (st_parameter_dt *dtp, const fnode *f, char *buffer,
   for (i = abs (e); i >= 10; i /= 10)
 	edigits++;
 
-  if (f->u.real.e < 0)
+  if (f->u.real.e <= 0)
 	{
 	  /* Width not specified.  Must be no more than 3 digits.  */
 	  if (e > 999 || e < -999)


Re: [PATCH] Fix decimal floating-point LTO streaming for offloading compilation

2019-11-28 Thread Segher Boessenkool
Hi Joseph,

On Thu, Nov 28, 2019 at 03:04:05PM +, Joseph Myers wrote:
> On Thu, 28 Nov 2019, Julian Brown wrote:
> > Unlike e.g. the _FloatN types, when decimal floating-point types are
> > enabled, common tree nodes are created for each float type size (e.g.
> > dfloat32_type_node) and also a pointer to each type is created
> > (e.g. dfloat32_ptr_type_node). tree-streamer.c:record_common_node emits
> > these like:
> 
> As far as I can tell, nothing actually uses those pointer nodes, or the 
> corresponding BT_DFLOAT32_PTR etc. defined in builtin-types.def.  I don't 
> know if they ever were used, or if they were just added by analogy to e.g. 
> float_ptr_type_node.
> 
> So I'd suggest simply removing all references to those tree nodes and 
> corresponding BT_*, from builtin-types.def, jit/jit-builtins.c (commented 
> out), tree-core.h, tree.c, tree.h.  Hopefully that will solve the 
> offloading problem.

So your patch caused at least three problems, none of them completely
worked out yet, none of them trivial.

Maybe this isn't such a good idea during stage 3.


Segher


Re: [PATCH] Trivial patch to allow bootstrap on MacOS

2019-11-28 Thread Iain Sandoe

Hello again Rainer,

Iain Sandoe  wrote:

GMP however is installed elsewhere (by Homebrew, MacPorts etc), so  
ignore any -nostdinc


it also works to symlink the sources for gmp, mpfr, mpc (and isl, if you  
use it) into the source tree - those then get bootstrapped along with the  
compiler and there are no resulting external dependencies (which I find  
preferable).


However, --with-gmp= etc should also work with it (I’ll take a look at  
that case).


It works fine when using --with-sysroot=   and  
--with-gmp=/somewhere/outside/the/SDK  (/opt/….)


So, I think one can get the desired behaviour with this configuration scheme.

thanks
Iain



Re: [PATCH] cgraph: ifunc resolvers cannot be made local (PR 92697)

2019-11-28 Thread Jan Hubicka
> Hi,
> 
> In the attached testcase, IPA-SRA thinks that an ifunc resolver
> (meanwhile IPA-split into two functions) function can be changed and so
> goes ahead.  The cgraph machinery then however throws away the new clone
> of the caller instead of the "old" caller and inliner inlines the clone
> of the ".part" function into the original resolver, which results into
> an interesting miscompilation because IPA-SRA counted on that both the
> caller and the callee are modified.
> 
> Fixed by making cgraph_node::can_be_local_p return false for ifunc
> resolvers, as it should.  The patch also adds dumping of the symtab_node
> flag.  Bootstrapped and tested on x86_64-linux, OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> 
> 2019-11-27  Martin Jambor  
> 
>   PR ipa/92697
>   * cgraph.c (cgraph_node_cannot_be_local_p_1): Return true for
>   ifunc_resolvers.
>   * symtab.c (symtab_node::dump_base): Dump ifunc_resolver flag.
>   Removed trailing whitespace.
> 
>   testsuite/
>   * g++.dg/ipa/pr92697.C: New.
OK,
thanks!

Honza


[PATCH] cgraph: ifunc resolvers cannot be made local (PR 92697)

2019-11-28 Thread Martin Jambor
Hi,

In the attached testcase, IPA-SRA thinks that an ifunc resolver
(meanwhile IPA-split into two functions) function can be changed and so
goes ahead.  The cgraph machinery then however throws away the new clone
of the caller instead of the "old" caller and inliner inlines the clone
of the ".part" function into the original resolver, which results into
an interesting miscompilation because IPA-SRA counted on that both the
caller and the callee are modified.

Fixed by making cgraph_node::can_be_local_p return false for ifunc
resolvers, as it should.  The patch also adds dumping of the symtab_node
flag.  Bootstrapped and tested on x86_64-linux, OK for trunk?

Thanks,

Martin



2019-11-27  Martin Jambor  

PR ipa/92697
* cgraph.c (cgraph_node_cannot_be_local_p_1): Return true for
ifunc_resolvers.
* symtab.c (symtab_node::dump_base): Dump ifunc_resolver flag.
Removed trailing whitespace.

testsuite/
* g++.dg/ipa/pr92697.C: New.
---
 gcc/cgraph.c   |  1 +
 gcc/symtab.c   |  4 ++-
 gcc/testsuite/g++.dg/ipa/pr92697.C | 51 ++
 3 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr92697.C

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 1f7a5c58d98..dd07516b83e 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2227,6 +2227,7 @@ static bool
 cgraph_node_cannot_be_local_p_1 (cgraph_node *node, void *)
 {
   return !(!node->force_output
+  && !node->ifunc_resolver
   && ((DECL_COMDAT (node->decl)
&& !node->forced_by_abi
&& !node->used_from_object_file_p ()
diff --git a/gcc/symtab.c b/gcc/symtab.c
index 3e634e22c86..5a3122fc8bb 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -914,8 +914,10 @@ symtab_node::dump_base (FILE *f)
   if (DECL_STATIC_DESTRUCTOR (decl))
fprintf (f, " destructor");
 }
+  if (ifunc_resolver)
+fprintf (f, " ifunc_resolver");
   fprintf (f, "\n");
-  
+
   if (same_comdat_group)
 fprintf (f, "  Same comdat group as: %s\n",
 same_comdat_group->dump_asm_name ());
diff --git a/gcc/testsuite/g++.dg/ipa/pr92697.C 
b/gcc/testsuite/g++.dg/ipa/pr92697.C
new file mode 100644
index 000..8958bd0dcf2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr92697.C
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -fdump-ipa-sra" } */
+
+extern int have_avx2;
+extern int have_ssse3;
+
+namespace NTL
+{
+
+  static void randomstream_impl_init_base ()
+  {
+__builtin_printf ("Frob1\n");
+  }
+
+  static void // __attribute__ ((target ("ssse3")))
+randomstream_impl_init_ssse3 ()
+  {
+__builtin_printf ("Frob2\n");
+  }
+
+  static void
+//__attribute__ ((target ("avx2,fma,avx,pclmul,ssse3")))
+randomstream_impl_init_avx2 ()
+  {
+__builtin_printf ("Frob3\n");
+  }
+
+  extern "C"
+  {
+static void (*resolve_randomstream_impl_init (void)) ()
+{
+  if (have_avx2)
+   return &randomstream_impl_init_avx2;
+  if (have_ssse3)
+   return &randomstream_impl_init_ssse3;
+  return &randomstream_impl_init_base;
+}
+  }
+  static void
+__attribute__ ((ifunc ("resolve_" "randomstream_impl_init")))
+randomstream_impl_init ();
+  void foo ()
+  {
+randomstream_impl_init ();
+  }
+
+}
+
+
+/* { dg-final { scan-ipa-dump-not "Created new node" "sra" } } */
-- 
2.24.0



Re: [PATCH] Fix decimal floating-point LTO streaming for offloading compilation

2019-11-28 Thread Thomas Schwinge
Hi Julian!

On 2019-11-28T14:24:02+, Julian Brown  wrote:
> As mentioned in PR91985, offloading compilation is broken at present
> because of an issue with LTO streaming. With thanks to Joseph for
> hints, here's a solution.
>
> Unlike e.g. the _FloatN types, when decimal floating-point types are
> enabled, common tree nodes are created for each float type size (e.g.
> dfloat32_type_node) and also a pointer to each type is created
> (e.g. dfloat32_ptr_type_node). tree-streamer.c:record_common_node emits
> these like:
>
>(dfloat32_type_node)
>(dfloat64_type_node)
>   (dfloat128_type_node)
>*   (dfloat32_ptr_type_node)
>   
>*   (dfloat64_ptr_type_node)
>   
>*  (dfloat128_ptr_type_node)
>   
>
> I.e., with explicit emission of a copy of the pointed-to type following
> the pointer itself.

I also do see that, but I fail to understand why that duplication: the
first '' and the second one (after the ' *') are the
same node, or aren't they?

> When DFP is disabled, we instead get:
>
>   <<< error >>>
>   <<< error >>>
>   <<< error >>>
>   <<< error >>>
>   <<< error >>>
>   <<< error >>>

(With that expectedly being any 'NULL_TREE's converted to
'error_mark_node' in 'gcc/tree-streamer.c:record_common_node'.)

> So, the number of nodes emitted during LTO write-out in the host/read-in
> in the offload compiler do not match.

ACK.

> This patch restores the number of nodes emitted by creating
> dfloatN_ptr_type_node as generic pointers rather than treating them as
> flat error_type_nodes. I don't think there's an easy way of creating an
> "error_type_node *", nor do I know if that would really be preferable.
>
> Tested with offloading to NVPTX & bootstrapped. OK to apply?

> commit 17119773a8a45af098364b4faafe68f2e868479a
> Author: Julian Brown 
> Date:   Wed Nov 27 18:41:56 2019 -0800
>
> Fix decimal floating-point LTO streaming for offloading compilation
> 
> gcc/
> * tree.c (build_common_tree_nodes): Use pointer type for
> dfloat32_ptr_type_node, dfloat64_ptr_type_node and
> dfloat128_ptr_type_node when decimal floating point support is 
> disabled.
>
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 5ae250ee595..db3f225ea7f 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -10354,6 +10354,15 @@ build_common_tree_nodes (bool signed_char)
>layout_type (dfloat128_type_node);
>dfloat128_ptr_type_node = build_pointer_type (dfloat128_type_node);
>  }
> +  else
> +{
> +  /* These must be pointers else tree-streamer.c:record_common_node will 
> emit
> +  a different number of nodes depending on DFP availability, which breaks
> +  offloading compilation.  */
> +  dfloat32_ptr_type_node = ptr_type_node;
> +  dfloat64_ptr_type_node = ptr_type_node;
> +  dfloat128_ptr_type_node = ptr_type_node;
> +}
>  
>complex_integer_type_node = build_complex_type (integer_type_node, true);
>complex_float_type_node = build_complex_type (float_type_node, true);

(Maybe that's indeed better than my "hamfisted" patch.)  ;-)

But it still reads a bit like a workaround (explicitly setting only
'dfloat*_ptr_type_node' here, leaving the actual 'dfloat*_type_node'
untouched (and then later on implicitly converted to 'error_mark_node' as
mentioned).

I guess we need somebody with more experience to review this.


Grüße
 Thomas




Re: [PATCH] Fix decimal floating-point LTO streaming for offloading compilation

2019-11-28 Thread Joseph Myers
On Thu, 28 Nov 2019, Julian Brown wrote:

> Unlike e.g. the _FloatN types, when decimal floating-point types are
> enabled, common tree nodes are created for each float type size (e.g.
> dfloat32_type_node) and also a pointer to each type is created
> (e.g. dfloat32_ptr_type_node). tree-streamer.c:record_common_node emits
> these like:

As far as I can tell, nothing actually uses those pointer nodes, or the 
corresponding BT_DFLOAT32_PTR etc. defined in builtin-types.def.  I don't 
know if they ever were used, or if they were just added by analogy to e.g. 
float_ptr_type_node.

So I'd suggest simply removing all references to those tree nodes and 
corresponding BT_*, from builtin-types.def, jit/jit-builtins.c (commented 
out), tree-core.h, tree.c, tree.h.  Hopefully that will solve the 
offloading problem.

-- 
Joseph S. Myers
jos...@codesourcery.com


Fix profile adjustments while cloning

2019-11-28 Thread Jan Hubicka
Hi,
this patch fixes profile updates while cloning.  When a new clone is produced,
its global profile is subtracted from the original function.  If the original
function profile drops to 0 we want to switch from global profiles to global0
profiles, which is implemented by combine_with_ipa_count_within.

However, this is done on all edges independently and it may happen that we end
up combining global and global0 profiles in one function, which is not a good
idea.

This implements profile_count::combine_with_ipa_count_within, which is able
to take into account that the counter is inside a function with a given count.

Bootstrapped/regtested x86_64-linux, committed.

Honza

* profile-count.h (profile_count::combine_with_ipa_count_within):
Declare.
* profile-count.c (profile_count::combine_with_ipa_count_within):
New.
* cgraphclones.c (cgraph_edge::clone, cgraph_node::create_clone): Use
it.
Index: profile-count.h
===
--- profile-count.h (revision 278809)
+++ profile-count.h (working copy)
@@ -1194,6 +1215,10 @@ public:
  global0.  */
   profile_count combine_with_ipa_count (profile_count ipa);
 
+  /* Same as combine_with_ipa_count but inside function with count IPA2.  */
+  profile_count combine_with_ipa_count_within
+(profile_count ipa, profile_count ipa2);
+
   /* The profiling runtime uses gcov_type, which is usually 64bit integer.
  Conversions back and forth are used to read the coverage and get it
  into internal representation.  */
Index: profile-count.c
===
--- profile-count.c (revision 278809)
+++ profile-count.c (working copy)
@@ -383,6 +388,23 @@ profile_count::combine_with_ipa_count (p
   return this->global0adjusted ();
 }
 
+/* Sae as profile_count::combine_with_ipa_count but within function with count
+   IPA2.  */
+profile_count
+profile_count::combine_with_ipa_count_within (profile_count ipa,
+ profile_count ipa2)
+{
+  profile_count ret;
+  if (!initialized_p ())
+return *this;
+  if (ipa2.ipa () == ipa2 && ipa.initialized_p ())
+ret = ipa;
+  else
+ret = combine_with_ipa_count (ipa);
+  gcc_checking_assert (ret.compatible_p (ipa2));
+  return ret;
+}
+
 /* The profiling runtime uses gcov_type, which is usually 64bit integer.
Conversions back and forth are used to read the coverage and get it
into internal representation.  */
Index: cgraphclones.c
===
--- cgraphclones.c  (revision 278809)
+++ cgraphclones.c  (working copy)
@@ -136,8 +141,9 @@ cgraph_edge::clone (cgraph_node *n, gcal
 
   /* Update IPA profile.  Local profiles need no updating in original.  */
   if (update_original)
-count = count.combine_with_ipa_count (count.ipa () 
- - new_edge->count.ipa ());
+count = count.combine_with_ipa_count_within (count.ipa () 
+- new_edge->count.ipa (),
+caller->count);
   symtab->call_edge_duplication_hooks (this, new_edge);
   return new_edge;
 }
@@ -341,7 +349,14 @@ cgraph_node::create_clone (tree new_decl
 
   /* Update IPA profile.  Local profiles need no updating in original.  */
   if (update_original)
-count = count.combine_with_ipa_count (count.ipa () - prof_count.ipa ());
+{
+  if (inlined_to)
+count = count.combine_with_ipa_count_within (count.ipa ()
+- prof_count.ipa (),
+inlined_to->count);
+  else
+count = count.combine_with_ipa_count (count.ipa () - prof_count.ipa 
());
+}
   new_node->decl = new_decl;
   new_node->register_symbol ();
   new_node->origin = origin;


[PATCH] Fix decimal floating-point LTO streaming for offloading compilation

2019-11-28 Thread Julian Brown
As mentioned in PR91985, offloading compilation is broken at present
because of an issue with LTO streaming. With thanks to Joseph for
hints, here's a solution.

Unlike e.g. the _FloatN types, when decimal floating-point types are
enabled, common tree nodes are created for each float type size (e.g.
dfloat32_type_node) and also a pointer to each type is created
(e.g. dfloat32_ptr_type_node). tree-streamer.c:record_common_node emits
these like:

   (dfloat32_type_node)
   (dfloat64_type_node)
  (dfloat128_type_node)
   *   (dfloat32_ptr_type_node)
  
   *   (dfloat64_ptr_type_node)
  
   *  (dfloat128_ptr_type_node)
  

I.e., with explicit emission of a copy of the pointed-to type following
the pointer itself.

When DFP is disabled, we instead get:

  <<< error >>>
  <<< error >>>
  <<< error >>>
  <<< error >>>
  <<< error >>>
  <<< error >>>

So, the number of nodes emitted during LTO write-out in the host/read-in
in the offload compiler do not match.
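
To make the mismatch concrete, here is a toy model of the emission pattern described above.  It is a sketch under stated assumptions, not the real tree-streamer.c logic: Node, record and emitted_dfp_slots are hypothetical names, a null slot stands for a node that ends up as an error entry, and a pointer node is followed by its pointed-to type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy tree node: a non-null pointee marks a pointer type.
struct Node {
  const Node *pointee;
};

// A missing node becomes one "error" entry; a pointer node emits
// itself and then its pointed-to type.
std::size_t record(const Node *n) {
  if (n == nullptr)
    return 1;
  std::size_t emitted = 1;
  if (n->pointee != nullptr)
    emitted += record(n->pointee);
  return emitted;
}

// Entries emitted for the six DFP-related slots in three configurations.
std::size_t emitted_dfp_slots(bool dfp_enabled, bool patched) {
  static const Node void_type{nullptr};
  static const Node generic_ptr{&void_type};  // stand-in for ptr_type_node
  static const Node d32{nullptr}, d64{nullptr}, d128{nullptr};
  static const Node p32{&d32}, p64{&d64}, p128{&d128};

  std::vector<const Node *> slots(6, nullptr);
  if (dfp_enabled) {
    slots = {&d32, &d64, &d128, &p32, &p64, &p128};
  } else if (patched) {
    // Only the pointer slots are filled in; the types stay missing.
    slots[3] = slots[4] = slots[5] = &generic_ptr;
  }

  std::size_t total = 0;
  for (const Node *n : slots)
    total += record(n);
  return total;
}
```

In this model the DFP-enabled side emits nine entries and the unpatched DFP-disabled side only six; filling the pointer slots with a generic pointer brings the disabled side back to nine.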

This patch restores the number of nodes emitted by creating
dfloatN_ptr_type_node as generic pointers rather than treating them as
flat error_type_nodes. I don't think there's an easy way of creating an
"error_type_node *", nor do I know if that would really be preferable.

Tested with offloading to NVPTX & bootstrapped. OK to apply?

Thanks,

Julian

ChangeLog

gcc/
* tree.c (build_common_tree_nodes): Use pointer type for
dfloat32_ptr_type_node, dfloat64_ptr_type_node and
dfloat128_ptr_type_node when decimal floating point support
is disabled.
commit 17119773a8a45af098364b4faafe68f2e868479a
Author: Julian Brown 
Date:   Wed Nov 27 18:41:56 2019 -0800

Fix decimal floating-point LTO streaming for offloading compilation

gcc/
* tree.c (build_common_tree_nodes): Use pointer type for
dfloat32_ptr_type_node, dfloat64_ptr_type_node and
dfloat128_ptr_type_node when decimal floating point support is disabled.

diff --git a/gcc/tree.c b/gcc/tree.c
index 5ae250ee595..db3f225ea7f 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10354,6 +10354,15 @@ build_common_tree_nodes (bool signed_char)
   layout_type (dfloat128_type_node);
   dfloat128_ptr_type_node = build_pointer_type (dfloat128_type_node);
 }
+  else
+{
+  /* These must be pointers else tree-streamer.c:record_common_node will emit
+	 a different number of nodes depending on DFP availability, which breaks
+	 offloading compilation.  */
+  dfloat32_ptr_type_node = ptr_type_node;
+  dfloat64_ptr_type_node = ptr_type_node;
+  dfloat128_ptr_type_node = ptr_type_node;
+}
 
   complex_integer_type_node = build_complex_type (integer_type_node, true);
   complex_float_type_node = build_complex_type (float_type_node, true);


Fix scaling of profiles in ipa_merge_profiles

2019-11-28 Thread Jan Hubicka
Hi
this patch fixes two problems in ipa_merge_profiles.  First, we allow the cfg
profile to diverge from the cgraph profile, and prior to summing cfg profiles we
must compensate for this change.

Second, the function is trying to preserve as much information as possible (for
example to handle cases where one function has a guessed profile and the other
has an IPA profile), but it does so independently on each profile counter, which
is not good since all type transitions must be done the same way in order for the
resulting profile to be meaningful.

In particular the code sometimes makes node->count a global count
while some edges get global0 counters, which is not meaningful and leads to an ICE
with sanity checking I want to commit incrementally.

profiledbootstrapped x86_64-linux, committed.

* ipa-utils.c (ipa_merge_profiles): Be sure that all type transitions
of counters are done same way.

Index: ipa-utils.c
===
--- ipa-utils.c (revision 278681)
+++ ipa-utils.c (working copy)
@@ -398,6 +398,7 @@ ipa_merge_profiles (struct cgraph_node *
   tree oldsrcdecl = src->decl;
   struct function *srccfun, *dstcfun;
   bool match = true;
+  bool copy_counts = false;
 
   if (!src->definition
   || !dst->definition)
@@ -429,10 +430,26 @@ ipa_merge_profiles (struct cgraph_node *
 }
   profile_count orig_count = dst->count;
 
-  if (dst->count.initialized_p () && dst->count.ipa () == dst->count)
-dst->count += src->count.ipa ();
-  else 
-dst->count = src->count.ipa ();
+  /* Either sum the profiles if both are IPA and not global0, or
+ pick more informative one (that is nonzero IPA if other is
+ uninitialized, guessed or global0).   */
+
+  if ((dst->count.ipa ().nonzero_p ()
+   || src->count.ipa ().nonzero_p ())
+  && dst->count.ipa ().initialized_p ()
+  && src->count.ipa ().initialized_p ())
+dst->count = dst->count.ipa () + src->count.ipa ();
+  else if (dst->count.ipa ().initialized_p ())
+;
+  else if (src->count.ipa ().initialized_p ())
+{
+  copy_counts = true;
+  dst->count = src->count.ipa ();
+}
+
+  /* If no updating needed return early.  */
+  if (dst->count == orig_count)
+return;
 
   /* First handle functions with no gimple body.  */
   if (dst->thunk.thunk_p || dst->alias
@@ -544,6 +561,16 @@ ipa_merge_profiles (struct cgraph_node *
   struct cgraph_edge *e, *e2;
   basic_block srcbb, dstbb;
 
+  /* Function and global profile may be out of sync.  First scale it same
+way as fixup_cfg would.  */
+  profile_count srcnum = src->count;
+  profile_count srcden = ENTRY_BLOCK_PTR_FOR_FN (srccfun)->count;
+  bool srcscale = srcnum.initialized_p () && !(srcnum == srcden);
+  profile_count dstnum = orig_count;
+  profile_count dstden = ENTRY_BLOCK_PTR_FOR_FN (dstcfun)->count;
+  bool dstscale = !copy_counts
+ && dstnum.initialized_p () && !(dstnum == dstden);
+
   /* TODO: merge also statement histograms.  */
   FOR_ALL_BB_FN (srcbb, srccfun)
{
@@ -551,15 +578,15 @@ ipa_merge_profiles (struct cgraph_node *
 
  dstbb = BASIC_BLOCK_FOR_FN (dstcfun, srcbb->index);
 
- /* Either sum the profiles if both are IPA and not global0, or
-pick more informative one (that is nonzero IPA if other is
-uninitialized, guessed or global0).   */
- if (!dstbb->count.ipa ().initialized_p ()
- || (dstbb->count.ipa () == profile_count::zero ()
- && (srcbb->count.ipa ().initialized_p ()
- && !(srcbb->count.ipa () == profile_count::zero ()
+ profile_count srccount = srcbb->count;
+ if (srcscale)
+   srccount = srccount.apply_scale (srcnum, srcden);
+ if (dstscale)
+   dstbb->count = dstbb->count.apply_scale (dstnum, dstden);
+
+ if (copy_counts)
{
- dstbb->count = srcbb->count;
+ dstbb->count = srccount;
  for (i = 0; i < EDGE_COUNT (srcbb->succs); i++)
{
  edge srce = EDGE_SUCC (srcbb, i);
@@ -568,18 +595,21 @@ ipa_merge_profiles (struct cgraph_node *
dste->probability = srce->probability;
}
}   
- else if (srcbb->count.ipa ().initialized_p ()
-  && !(srcbb->count.ipa () == profile_count::zero ()))
+ else 
{
  for (i = 0; i < EDGE_COUNT (srcbb->succs); i++)
{
  edge srce = EDGE_SUCC (srcbb, i);
  edge dste = EDGE_SUCC (dstbb, i);
  dste->probability = 
-   dste->probability * dstbb->count.probability_in 
(dstbb->count + srcbb->count)
-   + srce->probability * srcbb->count.probability_in 
(dstbb->count + srcbb->count);
+   dste->probability * dstbb->count.ipa ().probability_in
+ 

Re: Prevent all uses of DFP when unsupported (PR c/91985)

2019-11-28 Thread Thomas Schwinge
Hi!

So, testing just finished, and indeed:

On 2019-11-27T22:33:25+, Joseph Myers  wrote:
> On Wed, 27 Nov 2019, Thomas Schwinge wrote:
>
>> If I turn that conditional cited above into 'if (1)', then nvptx
>> offloading testing seems to return to normality, but I have not yet
>> assessed whether that has any ill effects on decimal float types support,

... this (see attached) doesn't disturb x86_64-pc-linux-gnu (with nvptx
offloading) as well as powerpc64le-unknown-linux-gnu (without offloading)
tests in any way.

>> and/or how this should be fixed properly.  (Julian, please have a look,
>> if you can, or tell me if you're busy with other things.)
>
> Whatever allows this to work for _FloatN types (when x86_64 supports 
> _Float128 but nvptx doesn't, for example) should be applied to the DFP
> types as well.

Joseph, Julian, or anyone else, please review if that's (conceptually)
the correct fix, and before commit, I'll then remove the 'if'
conditional, and re-indent, obviously.  If approving this patch
(conceptually), please respond with "Reviewed-by: NAME " so that
your effort will be recorded in the commit log, see
.


Grüße
 Thomas


From 272dfd2d0bd05ca32a12a3d30ea2871030e5d784 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 28 Nov 2019 09:17:57 +0100
Subject: [PATCH] WIP Unconditionally enable decimal float types in
 'gcc/tree.c:build_common_tree_nodes'

---
 gcc/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree.c b/gcc/tree.c
index 5ae250ee595..d61496518fa 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10334,7 +10334,7 @@ build_common_tree_nodes (bool signed_char)
   uint64_type_node = make_or_reuse_type (64, 1);
 
   /* Decimal float types. */
-  if (targetm.decimal_float_supported_p ())
+  if (1 || targetm.decimal_float_supported_p ())
 {
   dfloat32_type_node = make_node (REAL_TYPE);
   TYPE_PRECISION (dfloat32_type_node) = DECIMAL32_TYPE_SIZE;
-- 
2.17.1




Re: [PATCH] [dlang/phobos] S/390: Fix PR91628

2019-11-28 Thread Robin Dapp
> OK from me, what about earlier comments of using __asm__ in a C
> source file?
> 
> I wouldn't really object to converting all .S sources (in fact I can
> do this myself) if it meant slightly better portability.

Adding to yesterday's message: feel free to apply the current version if
it's OK.  The inline __asm__ with the dummy function call seemed a bit
too hacky for most people.

Regards
 Robin



Fix scaling in update_profiling_info

2019-11-28 Thread Jan Hubicka
Hi,

This patch fixes scaling in update_profiling_info.  My understanding is that
orig_node and new_node have counts that come from cloning, but the real
distribution of execution counts is determined by counting callers of the
new clone.  This is new_sum.  We thus want to scale orig_node to
orig_node->count - new_sum and new_node to new_sum.
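
The arithmetic behind that split can be sketched as follows, assuming plain
64-bit integer counts instead of profile_count.  The helpers apply_scale and
split_count are hypothetical simplifications written for this note;
rounding to nearest is an assumption, and overflow is ignored for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Scale COUNT by NUM/DEN, rounding to nearest.  Overflow of COUNT * NUM
   is not handled in this sketch.  */
static uint64_t
apply_scale (uint64_t count, uint64_t num, uint64_t den)
{
  assert (den != 0);
  return (count * num + den / 2) / den;
}

struct split
{
  uint64_t new_node;   /* Count assigned to the clone (new_sum).  */
  uint64_t orig_node;  /* Remainder left on the original node.  */
};

/* Divide ORIG_COUNT between the clone and the original node, given the
   sum NEW_SUM of the counts of the callers redirected to the clone.  */
static struct split
split_count (uint64_t orig_count, uint64_t new_sum)
{
  assert (new_sum <= orig_count);
  struct split s = { new_sum, orig_count - new_sum };
  return s;
}
```

For example, with an original count of 100 and new_sum of 30, an outgoing
edge with count 10 scales to 3 in the clone and to 7 in the original node.
Each node's edges are scaled relative to the count that node had before the
update, which is why the clone's original count has to be saved.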

The code seems to miss the initialization of new_sum and the updating of
indirect calls.  Also, I do not see why new_node->count and orig_node->count
would be the same (because orig_node can be updated multiple times), so I added
code to save the original new_node->count so that scaling can be done properly.

Profilebootstrapped/regtested x86_64.  Martin, I would like you to take a look
at this.

Honza

* ipa-cp.c (update_profiling_info): Fix scaling.
Index: ipa-cp.c
===
--- ipa-cp.c(revision 278778)
+++ ipa-cp.c(working copy)
@@ -4091,6 +4091,7 @@ update_profiling_info (struct cgraph_nod
   struct caller_statistics stats;
   profile_count new_sum, orig_sum;
   profile_count remainder, orig_node_count = orig_node->count;
+  profile_count orig_new_node_count = new_node->count;
 
   if (!(orig_node_count.ipa () > profile_count::zero ()))
 return;
@@ -4128,15 +4129,20 @@ update_profiling_info (struct cgraph_nod
   remainder = orig_node_count.combine_with_ipa_count (orig_node_count.ipa ()
  - new_sum.ipa ());
   new_sum = orig_node_count.combine_with_ipa_count (new_sum);
+  new_node->count = new_sum;
   orig_node->count = remainder;
 
-  profile_count::adjust_for_ipa_scaling (&new_sum, &orig_node_count);
+  profile_count::adjust_for_ipa_scaling (&new_sum, &orig_new_node_count);
   for (cs = new_node->callees; cs; cs = cs->next_callee)
-cs->count = cs->count.apply_scale (new_sum, orig_node_count);
+cs->count = cs->count.apply_scale (new_sum, orig_new_node_count);
+  for (cs = new_node->indirect_calls; cs; cs = cs->next_callee)
+cs->count = cs->count.apply_scale (new_sum, orig_new_node_count);
 
   profile_count::adjust_for_ipa_scaling (&remainder, &orig_node_count);
   for (cs = orig_node->callees; cs; cs = cs->next_callee)
 cs->count = cs->count.apply_scale (remainder, orig_node_count);
+  for (cs = orig_node->indirect_calls; cs; cs = cs->next_callee)
+cs->count = cs->count.apply_scale (remainder, orig_node_count);
 
   if (dump_file)
 dump_profile_updates (orig_node, new_node);


[PATCH] Elide return during inlining when possible

2019-11-28 Thread Richard Biener


Also from investigating the abstraction penalty in PR92645 I noticed
we create pointless stmts to copy the return value to the result decl
at the return.  That's not needed if the call doesn't have a LHS.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2019-11-28  Richard Biener  

PR tree-optimization/92645
* tree-inline.c (remap_gimple_stmt): When the return value
is not wanted, elide GIMPLE_RETURN.

* gcc.dg/tree-ssa/inline-12.c: New testcase.

Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c   (revision 278765)
+++ gcc/tree-inline.c   (working copy)
@@ -1541,9 +1541,12 @@ remap_gimple_stmt (gimple *stmt, copy_bo
 assignment to the equivalent of the original RESULT_DECL.
 If RETVAL is just the result decl, the result decl has
 already been set (e.g. a recent "foo (_decl, ...)");
-just toss the entire GIMPLE_RETURN.  */
+just toss the entire GIMPLE_RETURN.  Likewise for when the
+call doesn't want the return value.  */
   if (retval
  && (TREE_CODE (retval) != RESULT_DECL
+ && (!id->call_stmt
+ || gimple_call_lhs (id->call_stmt) != NULL_TREE)
  && (TREE_CODE (retval) != SSA_NAME
  || ! SSA_NAME_VAR (retval)
  || TREE_CODE (SSA_NAME_VAR (retval)) != RESULT_DECL)))
Index: gcc/testsuite/gcc.dg/tree-ssa/inline-12.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/inline-12.c   (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/inline-12.c   (working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-einline" } */
+
+void *foo (void *, int);
+static inline void *mcp (void *src, int i)
+{
+  return foo (src, i);
+}
+void bar()
+{
+  int i;
+  mcp (&i, 0);
+}
+
+/* There should be exactly two assignments, one for both
+   the original foo call and the inlined copy (plus a clobber
+   that doesn't match here).  In particular bar should look like
+  :
+ _4 = foo (&i, 0);
+ i ={v} {CLOBBER};
+ return;  */
+/* { dg-final { scan-tree-dump-times " = " 2 "einline" } } */


[PATCH] More PR92645, teach vector CTOR optimization about more conversions

2019-11-28 Thread Richard Biener


The following fixes the reduced testcase in PR92645 (but not the original
C++ one because of abstraction - digging into that).

It teaches simplify_vector_constructor to consider all kinds of
conversions, even those changing the element size.  Since we now
have truncate and extend optabs for vector types the existing
code should already deal with those if the target supports it.
Until x86 does so I've taught simplify_vector_constructor to
handle the simple case of a non-permuted conversion via
VEC_UNPACK_* and VEC_PACK_TRUNC_EXPR.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2019-11-28  Richard Biener  

PR tree-optimization/92645
* tree-ssa-forwprop.c (get_bit_field_ref_def): Also handle
conversions inside a mode class.  Remove restriction on
preserving the element size.
(simplify_vector_constructor): Deal with the above and for
identity permutes also try using VEC_UNPACK_[FLOAT_]LO_EXPR
and VEC_PACK_TRUNC_EXPR.

* gcc.target/i386/pr92645-4.c: New testcase.

Index: gcc/tree-ssa-forwprop.c
===
--- gcc/tree-ssa-forwprop.c (revision 278765)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -2004,16 +2004,12 @@ get_bit_field_ref_def (tree val, enum tr
 return NULL_TREE;
   enum tree_code code = gimple_assign_rhs_code (def_stmt);
   if (code == FLOAT_EXPR
-  || code == FIX_TRUNC_EXPR)
+  || code == FIX_TRUNC_EXPR
+  || CONVERT_EXPR_CODE_P (code))
 {
   tree op1 = gimple_assign_rhs1 (def_stmt);
   if (conv_code == ERROR_MARK)
-   {
- if (maybe_ne (GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (val))),
-   GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op1)
-   return NULL_TREE;
- conv_code = code;
-   }
+   conv_code = code;
   else if (conv_code != code)
return NULL_TREE;
   if (TREE_CODE (op1) != SSA_NAME)
@@ -2078,9 +2074,8 @@ simplify_vector_constructor (gimple_stmt
  && VECTOR_TYPE_P (TREE_TYPE (ref))
  && useless_type_conversion_p (TREE_TYPE (op1),
TREE_TYPE (TREE_TYPE (ref)))
- && known_eq (bit_field_size (op1), elem_size)
  && constant_multiple_p (bit_field_offset (op1),
- elem_size, )
+ bit_field_size (op1), )
  && TYPE_VECTOR_SUBPARTS (TREE_TYPE (ref)).is_constant ())
{
  unsigned int j;
@@ -2153,7 +2148,83 @@ simplify_vector_constructor (gimple_stmt
   if (conv_code != ERROR_MARK
  && !supportable_convert_operation (conv_code, type, conv_src_type,
 _code))
-   return false;
+   {
+ /* Only few targets implement direct conversion patterns so try
+some simple special cases via VEC_[UN]PACK[_FLOAT]_LO_EXPR.  */
+ optab optab;
+ tree halfvectype, dblvectype;
+ if (CONVERT_EXPR_CODE_P (conv_code)
+ && (2 * TYPE_PRECISION (TREE_TYPE (TREE_TYPE (orig[0])))
+ == TYPE_PRECISION (TREE_TYPE (type)))
+ && mode_for_vector (as_a 
+ (TYPE_MODE (TREE_TYPE (TREE_TYPE (orig[0],
+ nelts * 2).exists ()
+ && (dblvectype
+ = build_vector_type (TREE_TYPE (TREE_TYPE (orig[0])),
+  nelts * 2))
+ && (optab = optab_for_tree_code (FLOAT_TYPE_P (TREE_TYPE (type))
+  ? VEC_UNPACK_FLOAT_LO_EXPR
+  : VEC_UNPACK_LO_EXPR,
+  dblvectype,
+  optab_default))
+ && (optab_handler (optab, TYPE_MODE (dblvectype))
+ != CODE_FOR_nothing))
+   {
+ gimple_seq stmts = NULL;
+ tree dbl;
+ if (refnelts == nelts)
+   {
+ /* ???  Paradoxical subregs don't exist, so insert into
+the lower half of a wider zero vector.  */
+ dbl = gimple_build (&stmts, BIT_INSERT_EXPR, dblvectype,
+ build_zero_cst (dblvectype), orig[0],
+ bitsize_zero_node);
+   }
+ else if (refnelts == 2 * nelts)
+   dbl = orig[0];
+ else
+   dbl = gimple_build (&stmts, BIT_FIELD_REF, dblvectype,
+   orig[0], TYPE_SIZE (dblvectype),
+   bitsize_zero_node);
+ gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+ gimple_assign_set_rhs_with_ops (gsi,
+ FLOAT_TYPE_P (TREE_TYPE (type))
+ ? 

Re: [PATCH] PR90838: Support ctz idioms

2019-11-28 Thread Wilco Dijkstra
ping

Hi Richard,

> Uh.  Well.  I think that the gimple-match-head.c hunk isn't something we 
> want.  Instead,
> since this optimizes a memory access, the handling should move
> to tree-ssa-forwprop.c where you _may_ use a (match ...)
> match.pd pattern to do the (rshift (mult (bit_and (negate @1) @1)
> matching.  It might be the first to use that feature, you need to
> declare the function to use it from tree-ssa-forwprop.c.  So

OK, I've moved it to fwprop, and it works just fine there while still
using match.pd to do the idiom matching. Here is the updated version:

[PATCH v2] PR90838: Support ctz idioms

v2: Use fwprop pass rather than match.pd

Support common idioms for count trailing zeroes using an array lookup.
The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic
constant which when multiplied by a power of 2 contains a unique value
in the top 5 or 6 bits.  This is then indexed into a table which maps it
to the number of trailing zeroes.  When the table is valid, we emit a
sequence using the target defined value for ctz (0):

int ctz1 (unsigned x)
{
  static const char table[32] =
{
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

  return table[((unsigned)((x & -x) * 0x077CB531U)) >> 27];
}

Is optimized to:

        rbit    w0, w0
        clz     w0, w0
        and     w0, w0, 31
        ret
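
To illustrate what "the table is valid" means here, the following sketch
rebuilds the 32-entry table from the magic constant and looks up a value
through it.  It is only an illustration of the idiom for the 32-bit case with
a shift of 27; build_ctz_table and ctz_lookup are hypothetical helpers, not
functions from the patch.

```c
#include <stdint.h>

/* Fill TABLE so that TABLE[((1u << i) * MAGIC) >> 27] == i for every bit
   position i.  Return 1 if each of the 32 slots is hit exactly once, which
   is the property that makes MAGIC usable for this idiom, and 0 otherwise.  */
static int
build_ctz_table (uint32_t magic, unsigned char table[32])
{
  unsigned char seen[32] = { 0 };

  for (unsigned i = 0; i < 32; i++)
    {
      uint32_t idx = (uint32_t) ((UINT32_C (1) << i) * magic) >> 27;
      if (seen[idx])
        return 0;
      seen[idx] = 1;
      table[idx] = (unsigned char) i;
    }
  return 1;
}

/* The idiom itself: isolate the lowest set bit, multiply, shift, index.
   For X == 0 the index is 0, so the result is whatever TABLE[0] holds.  */
static int
ctz_lookup (uint32_t x, const unsigned char table[32], uint32_t magic)
{
  return table[(uint32_t) ((x & -x) * magic) >> 27];
}
```

For 0x077CB531U this reproduces the table used in ctz1, and a lookup of 0
yields table[0], which is 0 for that table.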

Bootstrapped on AArch64. OK for commit?

ChangeLog:

2019-11-15  Wilco Dijkstra  

PR tree-optimization/90838
* tree-ssa-forwprop.c (optimize_count_trailing_zeroes):
Add new function.
(simplify_count_trailing_zeroes): Add new function.
(pass_forwprop::execute): Try ctz simplification.
* match.pd: Add matching for ctz idioms.
* testsuite/gcc.target/aarch64/pr90838.c: New test.
---

diff --git a/gcc/match.pd b/gcc/match.pd
index 6edf54b80012d87dbe7330f5ee638cdba2f9c099..479e9076f0d4deccda54425e93ee4567b85409aa 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6060,3 +6060,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (vec_perm vec_same_elem_p@0 @0 @1)
  @0)
+
+/* Match count trailing zeroes for simplify_count_trailing_zeroes in fwprop.
+   The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic
+   constant which when multiplied by a power of 2 contains a unique value
+   in the top 5 or 6 bits.  This is then indexed into a table which maps it
+   to the number of trailing zeroes.  */
+(match (ctz_table_index @1 @2 @3)
+  (rshift (mult (bit_and (negate @1) @1) INTEGER_CST@2) INTEGER_CST@3))
diff --git a/gcc/testsuite/gcc.target/aarch64/pr90838.c b/gcc/testsuite/gcc.target/aarch64/pr90838.c
new file mode 100644
index 
..bff3144c0d1b3984016e5a404e986eae785c73ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr90838.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int ctz1 (unsigned x)
+{
+  static const char table[32] =
+{
+  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
+  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
+};
+
+  return table[((unsigned)((x & -x) * 0x077CB531U)) >> 27];
+}
+
+int ctz2 (unsigned x)
+{
+  const int u = 0;
+  static short table[64] =
+{
+  32, 0, 1,12, 2, 6, u,13, 3, u, 7, u, u, u, u,14,
+  10, 4, u, u, 8, u, u,25, u, u, u, u, u,21,27,15,
+  31,11, 5, u, u, u, u, u, 9, u, u,24, u, u,20,26,
+  30, u, u, u, u,23, u,19,29, u,22,18,28,17,16, u
+};
+
+  x = (x & -x) * 0x0450FBAF;
+  return table[x >> 26];
+}
+
+int ctz3 (unsigned x)
+{
+  static int table[32] =
+{
+  0, 1, 2,24, 3,19, 6,25, 22, 4,20,10,16, 7,12,26,
+  31,23,18, 5,21, 9,15,11,30,17, 8,14,29,13,28,27
+};
+
+  if (x == 0) return 32;
+  x = (x & -x) * 0x04D7651F;
+  return table[x >> 27];
+}
+
+static const unsigned long long magic = 0x03f08c5392f756cdULL;
+
+static const char table[64] = {
+ 0,  1, 12,  2, 13, 22, 17,  3,
+14, 33, 23, 36, 18, 58, 28,  4,
+62, 15, 34, 26, 24, 48, 50, 37,
+19, 55, 59, 52, 29, 44, 39,  5,
+63, 11, 21, 16, 32, 35, 57, 27,
+61, 25, 47, 49, 54, 51, 43, 38,
+10, 20, 31, 56, 60, 46, 53, 42,
+ 9, 30, 45, 41,  8, 40,  7,  6,
+};
+
+int ctz4 (unsigned long x)
+{
+  unsigned long lsb = x & -x;
+  return table[(lsb * magic) >> 58];
+}
+
+/* { dg-final { scan-assembler-times "clz\t" 4 } } */
+/* { dg-final { scan-assembler-times "and\t" 2 } } */
+/* { dg-final { scan-assembler-not "cmp\t.*0" } } */
diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index fe55ca958b49b986f79a9a710d92b5d906959105..a632d54712be55f8070c9816e3c3702d4a493182 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs-tree.h"
 #include "tree-vector-builder.h"
 #include "vec-perm-indices.h"
+#include "internal-fn.h"
 
 /* This pass propagates the 

[committed] [testsuite][arm] Force use of -mfloat-abi=softfp in asm-flag-4.c

2019-11-28 Thread Christophe Lyon
Hi,

I've just committed the patch below as r278804. It avoids annoying
failures on asm-flag-4.c when the compiler is configured
--with-float=hard.

2019-11-28  Christophe Lyon  

* gcc.target/arm/asm-flag-4.c: Use -mfloat-abi=softfp.

Index: gcc/testsuite/gcc.target/arm/asm-flag-4.c
===
--- gcc/testsuite/gcc.target/arm/asm-flag-4.c   (revision 278803)
+++ gcc/testsuite/gcc.target/arm/asm-flag-4.c   (revision 278804)
@@ -1,6 +1,8 @@
 /* Test that we do not ice in thumb1 mode */
 /* { dg-do compile } */
-/* { dg-options "-march=armv4t" } */
+/* { dg-require-effective-target arm_arch_v4t_thumb_ok } */
+/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
+/* { dg-options "-march=armv4t -mfloat-abi=softfp" } */

 void __attribute__((target("arm"))) f(char *out)
 {

Christophe


Re: Ping: [PATCH] Add explicit description for -finline

2019-11-28 Thread Richard Biener
On Thu, 28 Nov 2019, luoxhu wrote:

> 
> 
> On 2019/11/4 11:42, luoxhu wrote:
> > On 2019/11/2 00:23, Joseph Myers wrote:
> >> On Thu, 31 Oct 2019, Xiong Hu Luo wrote:
> >>
> >>> +@code{-finline} enables inlining of function declared \"inline\".
> >>> +@code{-finline} is enabled at levels -O1, -O2, -O3 and -Os, but not -Og.
> >>
> >> Use @option{} to mark up option names (both -finline and all the -O
> >> options in this paragraph).  Use @code{} to mark up keyword names, not
> >> \"\".
> >>
> > 
> > Thanks.  So shall I commit the tiny patch with below updates?
> > 
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 1407d019d14..ea0d407fe11 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -8576,6 +8576,10 @@ optimizing.
> >   Single functions can be exempted from inlining by marking them
> >   with the @code{noinline} attribute.
> > 
> > +@option{-finline} enables inlining of function declared @code{inline}.
> > +@option{-finline} is enabled at levels @option{-O1}, @option{-O2},
> > @option{-O3}
> > +and @option{-Os}, but not @option{-Og}.
> > +

But this is wrong - -finline is enabled at -Og.  I don't think the new
sentence adds anything useful.

> >   @item -finline-small-functions
> >   @opindex finline-small-functions
> >   Integrate functions into their callers when their body is smaller than
> >   expected
> > 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [PATCH] Trivial patch to allow bootstrap on MacOS

2019-11-28 Thread Iain Sandoe

Hello Rainer,

thanks for the patch, but I think it’s only a work-around to part of the  
problem and there are alternate strategies for the “usual case” on  
MacOS/Darwin.


Keller, Rainer  wrote:

> the following is required to allow bootstrap in libcc1 during stage3 on
> MacOS Catalina (10.15). libcc1 invokes g++ with -nostdinc++


> MacOS Catalina doesn’t provide /usr/include anymore, instead one builds
> with:
> OSX_SDK_VERSION=`xcodebuild -showsdks | grep 'macOS\ 10' | cut -f2- -d'-' | cut -f2 -d' '`
> OSX_SDK_PATH=`xcodebuild -sdk $OSX_SDK_VERSION -version | grep -E '^Path: ' | cut -f2 -d' '`
>
> configure … --with-build-sysroot=$OSX_SDK_PATH


--with-build-sysroot= is known to have some problems;

see, for example, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79885

There is also a current series of patches on the topic from Maciej Rozicki.

I’m not sure what the intent is in using it in the MacOS builds - since I  
expect you will be using the same sysroot (SDK) at run time as you are at  
build time?




--with-sysroot= does, however, work (we also now honour SDKROOT in trunk,
open branches and gcc-7, but not on gcc-6 or gcc-5 [I will probably do some
Darwin-specific extra patches for the closed branches at some point])


I have successfully bootstrapped and tested on Catalina; see (for example):

https://gcc.gnu.org/ml/gcc-testresults/2019-11/msg01492.html

Usually, I am using the command line tools, so the installation is not  
going to move….


… however, if you configure with “--with-sysroot=$OSX_SDK_PATH” and then the
path moves (e.g. you relocate XCode) then you would have to supply the SDK
position at runtime anyway (either by setting SDKROOT or by passing
--sysroot=$NEW_SDK_POSITION for each compilation line)


> GMP however is installed elsewhere (by Homebrew, MacPorts etc), so ignore
> any -nostdinc


it also works to symlink the sources for gmp, mpfr, mpc (and isl, if you
use it) into the source tree - those then get bootstrapped along with the
compiler and there are no resulting external dependencies (which I find
preferable).


However, --with-gmp= etc should also work with it (I’ll take a look at that
case).



> Please note, I am not subscribed to the list.


HTH,
Iain



> Best regards,
> Rainer Keller
>
> gcc/Changelog:
> * Have gmp.h be found outside of sysroot
>
> --
> Index: gcc/system.h
> ===
> --- gcc/system.h(revision 278783)
> +++ gcc/system.h(working copy)
> @@ -684,7 +684,7 @@
>
> /* Do not introduce a gmp.h dependency on the build system.  */
> #ifndef GENERATOR_FILE
> -#include <gmp.h>
> +#include "gmp.h"
> #endif
>
> /* Get libiberty declarations.  */





Re: [PATCH] Fix ICE in tree-ssa-strlen.c (PR tree-optimization/92691)

2019-11-28 Thread Richard Biener
On Thu, 28 Nov 2019, Jakub Jelinek wrote:

> Hi!
> 
> The various routines propagate to the caller whether
>   if (check_and_optimize_stmt (&gsi, &cleanup_eh, evrp.get_vr_values ()))
>     gsi_next (&gsi);
> should do gsi_next or not (return false if e.g. gsi_remove is done, thus
> gsi is already moved to the next stmt).
> handle_printf_call returns that too, though with the values swapped,
> but since the move of handle_printf_call (then called handle_gimple_call)
> from the separate sprintf pass to strlen pass, the return value is ignored,
> while it must not be.  In some cases it means the following statement is not
> processed by the strlen pass, which can e.g. mean wrong-code because some
> strlen information is not invalidated when it should, or in other cases like
> in this testcase where the sprintf call that was removed was at the end of a 
> bb
> it means an ICE, because gsi_next when gsi is already at the end of bb is
> invalid.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

Richard.
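
The iterator convention described in the quoted message can be sketched
outside of GCC as follows.  This is only an analogy under stated assumptions:
an array with an index stands in for the statement iterator, and seq,
seq_remove, handle_item and process are hypothetical names.  A handler returns
true when the caller should advance and false when it has removed the current
element, in which case the position already designates the next one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement sequence.  */
struct seq
{
  int items[16];
  size_t len;
};

/* Remove the element at *POS.  Afterwards *POS already designates the
   element that followed it, or the end of the sequence.  */
static void
seq_remove (struct seq *s, size_t *pos)
{
  for (size_t i = *pos; i + 1 < s->len; i++)
    s->items[i] = s->items[i + 1];
  s->len--;
}

/* Drop negative elements.  Return true to let the caller advance *POS and
   false when the element was removed and *POS must be left alone.  */
static bool
handle_item (struct seq *s, size_t *pos)
{
  if (s->items[*pos] < 0)
    {
      seq_remove (s, pos);
      return false;
    }
  return true;
}

/* Walk the sequence, honouring the handler's return value.  Advancing
   unconditionally would skip the element after each removal and step past
   the end when the last element is removed.  */
static void
process (struct seq *s)
{
  for (size_t pos = 0; pos < s->len; )
    if (handle_item (s, &pos))
      pos++;
}
```

Processing the sequence 1, -2, -3, 4, -5 leaves 1, 4; the adjacent negatives
and the trailing negative are the two cases that go wrong when the return
value is ignored.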

> 2019-11-27  Jakub Jelinek  
> 
>   PR tree-optimization/92691
>   * tree-ssa-strlen.c (handle_store): Clarify return value meaning
>   in function comment.
>   (strlen_check_and_optimize_call): Likewise.  For handle_printf_call
>   calls, return !handle_printf_call rather than always returning true.
>   (check_and_optimize_stmt): Describe return value meaning in function
>   comment.  Formatting fix.
> 
>   * gcc.dg/tree-ssa/builtin-snprintf-10.c: New test.
> 
> --- gcc/tree-ssa-strlen.c.jj  2019-11-22 19:11:54.901951408 +0100
> +++ gcc/tree-ssa-strlen.c 2019-11-27 12:25:14.778564388 +0100
> @@ -4300,7 +4300,8 @@ count_nonzero_bytes (tree exp, unsigned
>  /* Handle a single or multibyte store other than by a built-in function,
> either via a single character assignment or by multi-byte assignment
> either via MEM_REF or via a type other than char (such as in
> -   '*(int*)a = 12345').  Return true when handled.  */
> +   '*(int*)a = 12345').  Return true to let the caller advance *GSI to
> +   the next statement in the basic block and false otherwise.  */
>  
>  static bool
>  handle_store (gimple_stmt_iterator *gsi, bool *zero_write, const vr_values 
> *rvals)
> @@ -4728,8 +4729,8 @@ is_char_type (tree type)
>  }
>  
>  /* Check the built-in call at GSI for validity and optimize it.
> -   Return true to let the caller advance *GSI to the statement
> -   in the CFG and false otherwise.  */
> +   Return true to let the caller advance *GSI to the next statement
> +   in the basic block and false otherwise.  */
>  
>  static bool
>  strlen_check_and_optimize_call (gimple_stmt_iterator *gsi,
> @@ -4738,16 +4739,13 @@ strlen_check_and_optimize_call (gimple_s
>  {
>gimple *stmt = gsi_stmt (*gsi);
>  
> +  /* When not optimizing we must be checking printf calls which
> + we do even for user-defined functions when they are declared
> + with attribute format.  */
>if (!flag_optimize_strlen
>|| !strlen_optimize
>|| !valid_builtin_call (stmt))
> -{
> -  /* When not optimizing we must be checking printf calls which
> -  we do even for user-defined functions when they are declared
> -  with attribute format.  */
> -  handle_printf_call (gsi, rvals);
> -  return true;
> -}
> +return !handle_printf_call (gsi, rvals);
>  
>tree callee = gimple_call_fndecl (stmt);
>switch (DECL_FUNCTION_CODE (callee))
> @@ -4806,7 +4804,8 @@ strlen_check_and_optimize_call (gimple_s
>   return false;
>break;
>  default:
> -  handle_printf_call (gsi, rvals);
> +  if (handle_printf_call (gsi, rvals))
> + return false;
>break;
>  }
>  
> @@ -4932,7 +4931,8 @@ handle_integral_assign (gimple_stmt_iter
>  /* Attempt to check for validity of the performed access a single statement
> at *GSI using string length knowledge, and to optimize it.
> If the given basic block needs clean-up of EH, CLEANUP_EH is set to
> -   true.  */
> +   true.  Return true to let the caller advance *GSI to the next statement
> +   in the basic block and false otherwise.  */
>  
>  static bool
>  check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh,
> @@ -4973,32 +4973,32 @@ check_and_optimize_stmt (gimple_stmt_ite
>   /* Handle assignment to a character.  */
>   handle_integral_assign (gsi, cleanup_eh);
>else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs))
> -  {
> - tree type = TREE_TYPE (lhs);
> - if (TREE_CODE (type) == ARRAY_TYPE)
> -   type = TREE_TYPE (type);
> -
> - bool is_char_store = is_char_type (type);
> - if (!is_char_store && TREE_CODE (lhs) == MEM_REF)
> -   {
> - /* To consider stores into char objects via integer types
> -other than char but not those to non-character objects,
> -determine the type of the destination rather than just
> -the 

Handle correctly global0 and global counters in profile_count::to_sreal_scale

2019-11-28 Thread Jan Hubicka
Hi,
This patch fixes a problem in profile_count::to_sreal_scale.  Our profile
counters can be function local, global (IPA), or function local but globally 0.
The last kind is used to hold static estimates for functions executed 0 times
in the profile.  Only one 64-bit value is stored, so if we compute the
frequency of a global0 counter in a global counter we mix them up and
incorrectly return a non-zero value.
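
A rough sketch of the rule, assuming a simplified counter with an explicit
kind tag rather than GCC's packed representation.  The names sc_kind,
sc_count and scale_in are hypothetical, and the zero-denominator case is left
out of the model.

```c
#include <assert.h>

/* Hypothetical model: a counter is function local, function local but
   globally zero, or global (IPA).  Only one value is stored per counter.  */
enum sc_kind { SC_LOCAL, SC_GLOBAL0, SC_GLOBAL };

struct sc_count
{
  enum sc_kind kind;
  unsigned long long val;
};

/* Return the frequency of C relative to IN.  */
static double
scale_in (struct sc_count c, struct sc_count in)
{
  assert (in.val != 0);  /* Zero denominators are not modelled here.  */

  /* A global0 counter only holds a local estimate; measured against a
     non-zero global counter its frequency is 0, whatever value it stores.  */
  if (in.kind == SC_GLOBAL && c.kind == SC_GLOBAL0)
    return 0.0;
  if (c.val == 0)
    return 0.0;
  if (c.val == in.val)
    return 1.0;
  return (double) c.val / (double) in.val;
}
```

Without the first check, a global0 counter storing 100 measured against a
global counter of 100 would come out as 1 instead of 0.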

I also implemented a unit test, but will commit the sanity checking separately
from the fixes: there are multiple bugs in this area that I tracked down.

Bootstrapped/regtested x86_64-linux, committed.

Honza

* profile-count.c (profile_count::to_sreal_scale): Handle correctly
combination of global0 and global counters.

Index: profile-count.c
===
--- profile-count.c (revision 278681)
+++ profile-count.c (working copy)
@@ -310,6 +311,20 @@ profile_count::to_sreal_scale (profile_c
 }
   if (known)
 *known = true;
+  /* Watch for cases where one count is IPA and other is not.  */
+  if (in.ipa ().initialized_p ())
+{
+  gcc_checking_assert (ipa ().initialized_p ());
+  /* If current count is inter-procedurally 0 and IN is inter-procedurally
+non-zero, return 0.  */
+  if (in.ipa ().nonzero_p ()
+ && !ipa().nonzero_p ())
+   return 0;
+}
+  else 
+/* We can handle correctly 0 IPA count within locally estimated
+   profile, but otherwise we are lost and this should not happen.   */
+gcc_checking_assert (!ipa ().initialized_p () || !ipa ().nonzero_p ());
   if (*this == zero ())
 return 0;
   if (m_val == in.m_val)


Re: [PATCH, GCC, AArch64] Fix PR88398 for AArch64

2019-11-28 Thread Richard Biener
On Wed, Nov 27, 2019 at 2:17 PM Wilco Dijkstra  wrote:
>
> Hi Richard,
>
> >> Yes so it does the insane "fully unrolled trailing loop before the unrolled
> >> loop" thing. One always does the trailing loop last (and typically as an
> >> actual loop of course) and then the code ends up much faster, close to
> >> the ideal version shown in the PR.
> >
> > Well, you can't do the unrolled loop first unless you keep all exit tests.
> > Not keeping them is the whole point of unrolling!
>
> You always need a loop entry test, but rather than testing iterations > 0,
> we can just test iterations >= 4 before entering a 4x unrolled loop.
>
> >> For these kinds of loops, stupid unrolling is clearly better than the
> >> default unrolling, both in size and in performance. For the example
> >> we only ever execute part of the "trailing" loop, and never enter the
> >> unrolled main loop!
> >
> > Well, then you don't want unrolling you want peeling.  You'd be
> > actually happy with four peeled iterations and then the regular,
> > not unrolled loop at the tail.
>
> While peeling would work in this case since the average number of
> iterations is so small, that's not what you'd want in general. The key is
> not to do the trailing loop before the unrolled loop.
>
> > The stupid strategy is what it says - stupid.
>
> Absolutely, it still can be improved significantly. We need to characterize
> loops and unroll smartly using different unroll strategies rather than
> bluntly unroll every loop 8 times.
>
> > Sure, which is why I suggest to change how we emit the
> > prologue here.  We can select the variant of the prologue
> > with a target hook based on preference for example, between
> > doing it peeling-like (which you prefer), using a scheme
> > like current (preferably in some optimized form).
>
> Well what I'm suggesting is to move the prologue to the epilogue
> similar to how the vectorizer executes the trailing loop at the end
> (rather than before the vectorized loop).

OK, that works as well, the current scheme tries to combine peeling
and unrolling to get the benefit of both.  For the case here
[insert sound heuristics] we want the peeling being done as a loop.
Whether that's placed before or after the unrolled copy doesn't matter
I guess, you'd either have

 for (; i < n % unroll-factor; )
   prologue;
 if (i >= unroll-factor)
   for () unrolled-loop

or

 if (n / unroll-factor > 0)
   for () unrolled-loop
 if (n % unroll-factor > 0)
   for () epilogue

I think a prologue might be more efficient and easier to set up as far
as IV-reuse is concerned?
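
For concreteness, the two shapes above can be written out by hand for a
simple reduction with an unroll factor of 4.  This is a source-level sketch of
the loop structure only, not what the RTL unroller emits; sum_prologue and
sum_epilogue are hypothetical names.

```c
#include <stddef.h>

/* Remainder iterations first, then the 4x unrolled body.  */
static long
sum_prologue (const int *a, size_t n)
{
  long s = 0;
  size_t i = 0;

  for (; i < n % 4; i++)        /* Prologue: n % 4 iterations.  */
    s += a[i];
  for (; i < n; i += 4)         /* The remaining trip count is a multiple of 4.  */
    s += (long) a[i] + a[i + 1] + a[i + 2] + a[i + 3];
  return s;
}

/* The 4x unrolled body first, guarded by an "at least 4 left" test, then
   the remainder iterations as an ordinary loop.  */
static long
sum_epilogue (const int *a, size_t n)
{
  long s = 0;
  size_t i = 0;

  for (; n - i >= 4; i += 4)    /* i never exceeds n, so n - i cannot wrap.  */
    s += (long) a[i] + a[i + 1] + a[i + 2] + a[i + 3];
  for (; i < n; i++)            /* Epilogue: at most 3 iterations.  */
    s += a[i];
  return s;
}
```

Both variants compute the same result.  For a short trip count such as 3, the
prologue form runs the remainder loop and then skips the unrolled body, while
the epilogue form fails the entry test once and then runs the remainder loop.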

In theory loop-unroll can then still decide (with heuristics)
to peel the prologue/epilogue (though we removed the
peeling code).

Richard.

> Cheers,
> Wilco


[PATCH] Trivial patch to allow bootstrap on MacOS

2019-11-28 Thread Keller, Rainer
Dear all,
the following is required to allow bootstrap in libcc1 during stage3 on MacOS
Catalina (10.15). libcc1 invokes g++ with -nostdinc++

MacOS Catalina doesn’t provide /usr/include anymore, instead one builds with:
OSX_SDK_VERSION=`xcodebuild -showsdks | grep 'macOS\ 10' | cut -f2- -d'-' | cut -f2 -d' '`
OSX_SDK_PATH=`xcodebuild -sdk $OSX_SDK_VERSION -version | grep -E '^Path: ' | cut -f2 -d' '`

configure … --with-build-sysroot=$OSX_SDK_PATH

GMP however is installed elsewhere (by Homebrew, MacPorts etc), so ignore any 
-nostdinc

Please note, I am not subscribed to the list.

Best regards,
Rainer Keller

gcc/Changelog:
* Have gmp.h be found outside of sysroot

--
Index: gcc/system.h
===
--- gcc/system.h(revision 278783)
+++ gcc/system.h(working copy)
@@ -684,7 +684,7 @@
 
 /* Do not introduce a gmp.h dependency on the build system.  */
 #ifndef GENERATOR_FILE
-#include <gmp.h>
+#include "gmp.h"
 #endif
 
 /* Get libiberty declarations.  */