[Bug tree-optimization/90270] [8/9/10 Regression] Do not select best induction variable optimization

2019-04-28 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

--- Comment #7 from bin cheng  ---
Also, when calling move_fixed_address_to_symbol, fixed_address_object_p looks
too restrictive: it only considers link-time constant addresses.  In this case,
the object is an array on the stack.

[Bug tree-optimization/90270] [8/9/10 Regression] Do not select best induction variable optimization

2019-04-28 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

--- Comment #6 from bin cheng  ---
(In reply to Andrew Pinski from comment #5)
> (In reply to bin cheng from comment #4)
> > On AArch64, ivopts generates the following code:
> >[local count: 954449108]:
> >   # crc_20 = PHI 
> >   # ivtmp.5_18 = PHI <1(2), ivtmp.5_17(5)>
> >   _19 = _counts + 18446744073709551612;
> >   _1 = MEM[base: _19, index: ivtmp.5_18, step: 4, offset: 0B];
> >   crc_10 = crcu32 (_1, crc_20);
> >   _5 = _counts + 18446744073709551612;
> 
> I thought we had decided _counts + 18446744073709551612 would be
> invalid gimple anyways as we are taking the address of one element before.

Could you direct me to the discussion about this decision?  I remember raising
this question once (probably in private).  If that is the rule, we need to
revise ivopts to avoid adding candidates which could violate it.

Anyway, it's an independent issue because the iv_cand could be moved forward by
one element, as in:

> >[local count: 954449108]:
> >   # crc_20 = PHI 
> >   # ivtmp.5_18 = PHI <0(2), ivtmp.5_17(5)>
> >   _19 = _counts;
> >   _1 = MEM[base: _19, index: ivtmp.5_18, step: 4, offset: 0B];
> >   crc_10 = crcu32 (_1, crc_20);
> >   _5 = _counts;
> 

Unfortunately, the cost computation still has problems that prevent it from
generating this code.

[Bug tree-optimization/90270] [8/9/10 Regression] Do not select best induction variable optimization

2019-04-28 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

--- Comment #5 from Andrew Pinski  ---
(In reply to bin cheng from comment #4)
> On AArch64, ivopts generates the following code:
>[local count: 954449108]:
>   # crc_20 = PHI 
>   # ivtmp.5_18 = PHI <1(2), ivtmp.5_17(5)>
>   _19 = _counts + 18446744073709551612;
>   _1 = MEM[base: _19, index: ivtmp.5_18, step: 4, offset: 0B];
>   crc_10 = crcu32 (_1, crc_20);
>   _5 = _counts + 18446744073709551612;

I thought we had decided _counts + 18446744073709551612 would be invalid
gimple anyways as we are taking the address of one element before.

[Bug tree-optimization/90270] [8/9/10 Regression] Do not select best induction variable optimization

2019-04-28 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

--- Comment #4 from bin cheng  ---
On AArch64, ivopts generates the following code:
   [local count: 954449108]:
  # crc_20 = PHI 
  # ivtmp.5_18 = PHI <1(2), ivtmp.5_17(5)>
  _19 = _counts + 18446744073709551612;
  _1 = MEM[base: _19, index: ivtmp.5_18, step: 4, offset: 0B];
  crc_10 = crcu32 (_1, crc_20);
  _5 = _counts + 18446744073709551612;
  _2 = MEM[base: _5, index: ivtmp.5_18, step: 4, offset: 0B];
  crc_12 = crcu32 (_2, crc_10);
  ivtmp.5_17 = ivtmp.5_18 + 1;
  if (ivtmp.5_17 != 9)
goto ; [87.50%]
  else
goto ; [12.50%]
Which looks optimal to me if _19/_5 can be hoisted out of loop.  And it is
intended to be hoisted by rtl liv.  (TREE liv doesn't help much, that's another
story)

Problem is in dom3 pass, cprop_operand, _19/_5 is propagated into memory access
although it causes invalid addressing mode on AArch64:
  [[(void *)_counts + -4B], [(void *)_counts + -4B]] 
EQUIVALENCES: { _19 } (1 elements)
Optimizing statement _1 = MEM[base: _19, index: ivtmp.5_18, step: 4, offset:
0B];
  Replaced '_19' with constant '[(void *)_counts + -4B]'
  Folded to: _1 = MEM[symbol: final_counts, index: ivtmp.5_18, step: 4, offset:
-4B];
LKUP STMT _1 = MEM[symbol: final_counts, index: ivtmp.5_18, step: 4, offset:
-4B] with .MEM_22
2>>> STMT _1 = MEM[symbol: final_counts, index: ivtmp.5_18, step: 4, offset:
-4B] with .MEM_22

It's kept in this form until the end of GIMPLE, then badly legitimized.

So ivopts worked hard to get the addressing mode and invariant expression
correct in this case; we need to avoid premature transformations afterwards.

BTW, with dom disabled by -fno-tree-dominator-opts, vrp2 does the same
transformation too.  -fno-tree-vrp is also necessary to get the optimal code.

Well, you can argue [base + iv << 2] is sub-optimal compared to [base + iv],
but that's hard to tune.  Also, a bias towards the original IV is in general
preferred for reasons like smaller setup code, better debug info, and even
performance in complicated loops.
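
To make the two induction-variable shapes concrete, here is a minimal C++ sketch. It is not the testcase from this bug (which is not quoted here): `crc_step` is a hypothetical stand-in for `crcu32`, and `load_at`, `walk_scaled_index` and `walk_byte_offset` are names invented for illustration. The sketch works with byte offsets instead of forming a pointer one element before the array, and only shows that both forms visit the same eight elements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the crcu32 call seen in the dumps; the real
// function is not shown in this thread.
inline std::uint32_t crc_step(std::uint32_t value, std::uint32_t crc) {
    return (crc * 31u) ^ value;
}

// Load one 32-bit element at a byte offset from the start of the array.
inline std::uint32_t load_at(const std::uint32_t* base, std::size_t byte_offset) {
    assert(base != nullptr);
    std::uint32_t v;
    std::memcpy(&v, reinterpret_cast<const unsigned char*>(base) + byte_offset,
                sizeof v);
    return v;
}

// Scaled-index form: iv runs 1..8 and each offset is iv * 4 - 4, mirroring
// MEM[base: _counts - 4, index: ivtmp, step: 4] with the exit test iv != 9.
inline std::uint32_t walk_scaled_index(const std::uint32_t* counts) {
    std::uint32_t crc = 0;
    for (std::size_t iv = 1; iv != 9; ++iv)
        crc = crc_step(load_at(counts, iv * 4 - 4), crc);
    return crc;
}

// Byte-offset form: iv runs 0, 4, ..., 28 and each offset is iv itself,
// mirroring a [base, iv] address with the exit test iv != 32.
inline std::uint32_t walk_byte_offset(const std::uint32_t* counts) {
    std::uint32_t crc = 0;
    for (std::size_t iv = 0; iv != 32; iv += 4)
        crc = crc_step(load_at(counts, iv), crc);
    return crc;
}
```

Both functions return the same value for any eight-element input; the difference is only in how the address is formed on each iteration, which is what the cost model has to weigh.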

[Bug fortran/30123] Document INQUIRE, especially UNFORMATTED and FORMATTED

2019-04-28 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30123

--- Comment #5 from Eric Gallager  ---
(In reply to Jürgen Reuter from comment #4)
> This seems like one of these documentation tasks which is in principle very
> easy to do but nobody is motivated to do.^^

Indeed.

[Bug c/43728] Add warning for redundant static function prototypes

2019-04-28 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43728

Eric Gallager  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #10 from Eric Gallager  ---
(In reply to Eric Gallager from comment #9)
> (In reply to Eric Gallager from comment #8)
> > Confirmed, although I probably wouldn't use such a warning myself if it were
> > added. (I like redundancy)
> 
> Do people still want this? Putting in WAITING for someone to re-confirm.

No reply, I guess no one really wants this after all.

[Bug c++/87403] [Meta-bug] Issues that suggest a new warning

2019-04-28 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403
Bug 87403 depends on bug 43728, which changed state.

Bug 43728 Summary: Add warning for redundant static function prototypes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43728

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WONTFIX

[Bug translation/90274] New: untranslated string literal in opts.c

2019-04-28 Thread roland.illig at gmx dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90274

Bug ID: 90274
   Summary: untranslated string literal in opts.c
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: translation
  Assignee: unassigned at gcc dot gnu.org
  Reporter: roland.illig at gmx dot de
  Target Milestone: ---

cc1 --help produces:

  printf ("  Known valid arguments for %s option:\n   ", option->opt_text);

There's a _(...) missing around the string literal.

While here, the command line options are not surrounded by quotes.

[Bug debug/90273] [9/10 Regression] GCC runs out of memory building Firefox

2019-04-28 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90273

--- Comment #4 from Jan Hubicka  ---
The code is:
  inline bool IsNodeInternal() const { return false; }

  template <typename First, typename... Args>
  inline bool IsNodeInternal(First aFirst, Args... aArgs) const {
    return mNodeInfo->Equals(aFirst) || IsNodeInternal(aArgs...);
  }

 public:
  inline bool IsHTMLElement() const {
    return IsElement() && IsInNamespace(3);
  }

  inline bool IsHTMLElement(const nsAtom* aTag) const {
    return IsElement() && mNodeInfo->Equals(aTag, 3);
  }

  template <typename First, typename... Args>
  inline bool IsAnyOfHTMLElements(First aFirst, Args... aArgs) const {
    return IsHTMLElement() && IsNodeInternal(aFirst, aArgs...);
  }

$ grep ";; Function nsINode::IsNodeInternal" *cfg2 | less
;; Function nsINode::IsNodeInternal (_ZNK7nsINode14IsNodeInternalEv,
funcdef_no=13668, decl_uid=274623, cgraph_uid=7143, symbol_order=7869)
;; Function nsINode::IsNodeInternal
(_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJEEEbT_DpT0_, funcdef_no=58886,
decl_uid=1359969, cgraph_uid=49491, symbol_order=50680)
;; Function nsINode::IsNodeInternal
(_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_EEEbT_DpT0_, funcdef_no=58870,
decl_uid=1359893, cgraph_uid=49475, symbol_order=50664)
;; Function nsINode::IsNodeInternal (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_EEEbT_DpT0_,
funcdef_no=58853, decl_uid=1359814, cgraph_uid=49458, symbol_order=50647)
;; Function nsINode::IsNodeInternal
(_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_EEEbT_DpT0_,
funcdef_no=58835, decl_uid=1359727, cgraph_uid=49440, symbol_order=50629)
;; Function nsINode::IsNodeInternal
(_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_EEEbT_DpT0_,
funcdef_no=58816, decl_uid=1359637, cgraph_uid=49421, symbol_order=50610)
;; Function nsINode::IsNodeInternal
(_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_S2_EEEbT_DpT0_,
funcdef_no=58796, decl_uid=1359532, cgraph_uid=49401, symbol_order=50590)
;; Function nsINode::IsNodeInternal
(_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_S2_S2_EEEbT_DpT0_,
funcdef_no=58774, decl_uid=1359424, cgraph_uid=49379, symbol_order=50568)
;; Function nsINode::IsNodeInternal
(_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_S2_S2_S2_EEEbT_DpT0_,
funcdef_no=58751, decl_uid=1359292, cgraph_uid=49356, symbol_order=50545)
;; Function nsINode::IsNodeInternal
(_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_S2_S2_S2_S2_EEEbT_DpT0_,
funcdef_no=58719, decl_uid=1359003, cgraph_uid=49324, symbol_order=50513)

GCC ends up producing empty BBS with 1753608 debug statements in
nsINode::IsNodeInternal (const struct nsINode * const this, struct nsStaticAtom *
aFirst, struct nsStaticAtom * aArgs#0, struct nsStaticAtom * aArgs#1, struct
nsStaticAtom * aArgs#2, struct nsStaticAtom * aArgs#3, struct nsStaticAtom *
aArgs#4, struct nsStaticAtom * aArgs#5, struct nsStaticAtom * aArgs#6, struct
nsStaticAtom * aArgs#7

I did not wait for the version with 9 parameters :)

[Bug debug/90273] [9/10 Regression] GCC runs out of memory building Firefox

2019-04-28 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90273

Jan Hubicka  changed:

   What|Removed |Added

 CC||rguenther at suse dot de

--- Comment #3 from Jan Hubicka  ---
Note that it fails without -flto as well. Without -g it builds for me.

We end up with huge BB containing
# DEBUG aFirst => NULL
# DEBUG aArgs#0 => NULL
# DEBUG this => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL
# DEBUG this => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL
# DEBUG aArgs#0 => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL
# DEBUG aArgs#0 => NULL
# DEBUG aArgs#1 => NULL
# DEBUG this => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL
# DEBUG this => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL
# DEBUG aArgs#0 => NULL
# DEBUG this => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL
# DEBUG this => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL
# DEBUG this => NULL
# DEBUG aFirst => NULL

Seems that these statements are being early inlined and duplicated
indefinitely. Something needs to cut them to sensible limit.

[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads

2019-04-28 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2

--- Comment #13 from Eric Gallager  ---
(In reply to Jim Wilson from comment #12)
> I no longer have access to IA-64 hardware.  I was leaving myself as
> maintainer just so that there was someone responsible for answering
> questions.  I don't care if the port survives or not.  I can resign if that
> makes things easier.
> 
> There is an ia64 debian group that has hardware, and is still doing debian
> work.  They would be disappointed if the ia64 port was deprecated.  I get
> the occasional binutils and gcc bug reports from them, and am occasionally
> able to fix them even though I don't have hardware.

Do they have a bugzilla account here that could be cc-ed?

> 
> I think that solving this bug requires that all locals get initialized to 0
> when defined.  Otherwise, you may accidentally read a register with the nat
> bit set when doing bit-field operations on a register.  I don't care enough
> about ia64 to try to fix this.

[Bug debug/90273] [9/10 Regression] GCC runs out of memory building Firefox

2019-04-28 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90273

--- Comment #2 from Jan Hubicka  ---
http://www.ucw.cz/~hubicka/Unified_cpp_dom_events0-8.ii.xz

[Bug debug/90273] [9/10 Regression] GCC runs out of memory building Firefox

2019-04-28 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90273

Eric Gallager  changed:

   What|Removed |Added

   Keywords||build, memory-hog
 CC||egallager at gcc dot gnu.org
 Blocks||45375

--- Comment #1 from Eric Gallager  ---
I see -flto in there, so making this block the mozillametabug


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375
[Bug 45375] [meta-bug] Issues with building Mozilla (i.e. Firefox) with LTO

[Bug debug/90273] New: [9/10 Regression] GCC runs out of memory building Firefox

2019-04-28 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90273

Bug ID: 90273
   Summary: [9/10 Regression] GCC runs out of memory building
Firefox
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

Running
/aux/hubicka/9-install/bin/g++ Unified_cpp_dom_events0-8.ii -c -flto
-flifetime-dse=1 -fPIC -fstack-protector-strong -Wall -Wempty-body
-Wignored-qualifiers -Woverloaded-virtual -Wpointer-arith -Wsign-compare
-Wtype-limits -Wunreachable-code -Wwrite-strings -Wno-invalid-offsetof
-Wc++1z-compat -Wduplicated-cond -Wimplicit-fallthrough
-Wno-error=maybe-uninitialized -Wno-error=deprecated-declarations
-Wno-error=array-bounds -Wno-error=coverage-mismatch
-Wno-error=free-nonheap-object -Wno-error=multistatement-macros
-Wno-error=class-memaccess  -Wformat -Wformat-overflow=2
-fno-sized-deallocation -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2
-fstack-protector-strong -fno-exceptions -fno-strict-aliasing -fno-rtti
-fno-exceptions -fno-math-errno -pthread -pipe -g -freorder-blocks -O2
-fno-omit-frame-pointer -funwind-tables -Wno-error=shadow

eventually runs out of memory on my machine, while GCC 8 finishes rather
quickly.
Most of the time is spent in gimple_copy.

Without debug info the file builds for me.

[Bug bootstrap/89864] gcc fails to build/bootstrap with XCode 10.2

2019-04-28 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #90 from Iain Sandoe  ---
(In reply to Zaak from comment #89)
> Anyone have a patch for 4.9? A user wants one, but I can't build 4.9 from
> source on Mojave.

4.9 is long-closed [as are 5 and 6]; is there some reason the user can't move
forward?
(I was not considering back-porting anything to earlier than 5.5/6.5 and that
would be out of tree "vendor branches" in git since SVN is closed as noted)

In the short-term you could try back porting the 8.3 patch to 4.9
fixincludes/fixincl.def and then regenerating the fixincl.x.

(unless there are other problems for 4.9 on 10.14)

[Bug tree-optimization/90269] loop distribution defeated by clobbers

2019-04-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90269

--- Comment #3 from Marc Glisse  ---
(In reply to Richard Biener from comment #2)
> Otherwise the patch looks sensible, mind to test/post it?

It bootstrapped and regtested fine, I'll send it later.

[Bug libstdc++/87982] No error for std::generate_n(ptr, ptr, f)

2019-04-28 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87982

--- Comment #6 from Jonathan Wakely  ---
(In reply to Jan van Dijk from comment #5)
> Somewhat off-topic: IMHO the standard is not explicit about the fact that
> gen() is (of course) to be invoked separately for every element in the range:
> "The generate_n algorithms invoke the function object gen and assign the
> return value of gen through all the iterators in the range...". But that may
> be me not being a native English speaker. The libstdc++ docs are more
> helpful for sure.

The current working draft is clear:
Assigns the result of successive evaluations of gen() through each iterator in
the range [first, first + N).
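
A small C++ sketch of what "successive evaluations" means in practice, using a stateful generator. The helper name `fill_with_counter` is invented for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Fill three elements with a stateful generator. Each element receives the
// result of a separate call to the generator, so the stored values differ.
inline std::array<int, 3> fill_with_counter() {
    std::array<int, 3> out{};
    int next = 0;
    std::generate_n(out.begin(), 3, [&next] { return next++; });
    assert(next == 3);  // the generator ran once per element
    return out;
}
```

If gen() were evaluated only once and its result assigned everywhere, all three elements would be equal; with successive evaluations they are 0, 1, 2.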

[Bug tree-optimization/90271] [missed-optimization] failure to keep variables in registers during "faux" memcpy

2019-04-28 Thread eyalroz at technion dot ac.il
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90271

--- Comment #6 from Eyal Rozenberg  ---
> Is the example from real-world code?

Yes. Example: Some machines support atomic instructions on aligned 32 bits or
on 64 bits, but not directly on 1, 2, 3, 5, 6 or 7 bytes. So in order to
atomically change a value of one of those "undesirable" sizes, you have to work
on its corresponding 4-byte or 8-byte stretch: You read it, change it in the
middle, then apply atomic compare-and-swap to it.
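
The read, patch, compare-and-swap sequence described above can be sketched in C++ roughly as follows. This is only an illustration under assumptions: `atomic_store_byte` is a made-up name, the enclosing word is taken to be a `std::atomic<std::uint32_t>`, and the byte is addressed by its position in memory.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Atomically overwrite one byte inside a 32-bit word: read the word, patch
// the byte in a local copy, then publish the copy with compare-and-swap,
// retrying if another thread changed the word in the meantime.
inline void atomic_store_byte(std::atomic<std::uint32_t>& word,
                              std::size_t byte_index, unsigned char value) {
    assert(byte_index < sizeof(std::uint32_t));
    std::uint32_t expected = word.load();
    std::uint32_t desired;
    do {
        desired = expected;
        // This memcpy into a local is the "faux" memcpy from the report.
        std::memcpy(reinterpret_cast<unsigned char*>(&desired) + byte_index,
                    &value, sizeof value);
        // On failure compare_exchange_weak reloads `expected`, so the next
        // iteration patches the fresh value.
    } while (!word.compare_exchange_weak(expected, desired));
}
```

The memcpy into `desired` is exactly the kind of byte replacement one would like to see kept in a register.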

[Bug tree-optimization/90271] [missed-optimization] failure to keep variables in registers during "faux" memcpy

2019-04-28 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90271

--- Comment #5 from rguenther at suse dot de  ---
On Sun, 28 Apr 2019, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90271
> 
> --- Comment #4 from Jakub Jelinek  ---
> One thing is that store-merging doesn't optimize this, I think we have an open
> enhancement request for that that should be able to cure that case.
> 
> Another one is that perhaps we should consider such MEM_REFs as not 
> necessarily
> forcing the variable into memory, and if that is the only thing to keep it
> addressable, we could in tree-ssa-live.c rewrite it using BIT_INSERT_EXPR.

Indeed, for the case we can rewrite the variable into SSA that should 
work.  If you change 'x' to be struct { int x; int y }; and just use
the x component that trick alone doesn't work - you'd first need SRA
to decompose this.  Oh, interestingly store-merging handles _that_
case just fine:

int replace_bytes_3(int *v1 ,char v2)
{
  memcpy( (void*) (((char*)v1)+1) ,  , sizeof(v2) );
  return *v1;
}

int foo3()
{
  struct { int x; int y; } s;
  s.x = 3;
  char c = 1;
  return replace_bytes_3(,c);
}

Coalescing successful!
Merged into 1 stores
New sequence of 1 stores to replace old one of 2 stores
Merging successful!
foo3 ()
{
  struct
  {
int x;
int y;
  } s;
  int _4;

   [local count: 1073741824]:
  MEM[(void *)] = 259;
  _4 = MEM[(int *)];
  s ={v} {CLOBBER};
  return _4;

with then optimal assembly.

[Bug tree-optimization/90271] [missed-optimization] failure to keep variables in registers during "faux" memcpy

2019-04-28 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90271

Richard Biener  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org
  Component|rtl-optimization|tree-optimization

--- Comment #3 from Richard Biener  ---
Well, very early in GIMPLE we end up with

   :
  v1 = 3;
  MEM[(char * {ref-all}) + 1B] = 1;
  _9 = v1;
  v1 ={v} {CLOBBER};
  return _9;

this means what we are looking for is store-merging handling this case
(albeit that runs quite late).  The extra complication there is likely
the fact that the memcpy changes the effective type of v1 and to preserve
TBAA correctness for the combined store we'd have to use a conservative
alias-set for the combined store (read: zero).

In an ideal world value-numbering would figure all this out of course ;)

Is the example from real-world code?  Just asking to see whether this
is an important case to handle.

[Bug tree-optimization/90271] [missed-optimization] failure to keep variables in registers during "faux" memcpy

2019-04-28 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90271

--- Comment #4 from Jakub Jelinek  ---
One thing is that store-merging doesn't optimize this, I think we have an open
enhancement request for that that should be able to cure that case.

Another one is that perhaps we should consider such MEM_REFs as not necessarily
forcing the variable into memory, and if that is the only thing to keep it
addressable, we could in tree-ssa-live.c rewrite it using BIT_INSERT_EXPR.

[Bug tree-optimization/90270] [8/9/10 Regression] Do not select best induction variable optimization

2019-04-28 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
 CC||amker at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

[Bug tree-optimization/90269] loop distribution defeated by clobbers

2019-04-28 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90269

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-04-28
 Ever confirmed|0   |1

--- Comment #2 from Richard Biener  ---
I think gimple_has_side_effects shouldn't return true for clobbers (and/or
maybe we shouldn't set gimple_has_volatile_ops on them).

Otherwise the patch looks sensible, mind to test/post it?

[Bug c++/90265] [9/10 Regression] ICE in build_call_a at gcc/cp/call.c:396 since r268377

2019-04-28 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90265

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

--- Comment #1 from Richard Biener  ---
At this point P2 unless we can fix really fast (and safe).

[Bug middle-end/90262] Inline small constant memmoves

2019-04-28 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90262

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-04-28
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Note we are expanding memmoves inline on GIMPLE already if we can use a single
load/store (with power-of-two size).

But confirmed, we can do like you say at the expense of register pressure.
We could also do a conditional branch and have two variants inline.
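
As an illustration of the register-pressure trade-off, here is a C++ sketch of a fixed 16-byte move that is safe for overlapping buffers. It assumes the suggestion is "load everything first, then store", since the original request is not quoted here; `move16` is a name invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Move 16 bytes between possibly overlapping buffers without calling
// memmove: both halves are loaded into temporaries before either store,
// so the stores cannot clobber bytes that are still to be read. The cost
// is that two 64-bit temporaries are live at the same time.
inline void move16(void* dst, const void* src) {
    assert(dst != nullptr && src != nullptr);
    std::uint64_t lo, hi;
    std::memcpy(&lo, src, 8);
    std::memcpy(&hi, static_cast<const unsigned char*>(src) + 8, 8);
    std::memcpy(dst, &lo, 8);
    std::memcpy(static_cast<unsigned char*>(dst) + 8, &hi, 8);
}
```

Each memcpy here copies between a buffer and a distinct local, so none of them overlaps; a compiler can turn them into plain loads and stores.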

[Bug rtl-optimization/90271] [missed-optimization] failure to keep variables in registers during "faux" memcpy

2019-04-28 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90271

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-04-28
 Ever confirmed|0   |1
   Severity|normal  |enhancement

--- Comment #2 from Andrew Pinski  ---
memcpy optimization happens only when expanding from Gimple to RTL and we don't
remove "stack" locations after RTL.

I have an idea on how to solve this: use VCE to convert to the same-size
integer (if it exists) and then use BIT_FIELD_REF to extract the value.  Note
that we need to deal with big vs. little endian (as BIT_FIELD_REF is dependent
on that).
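
At source level, the effect of treating the variable as a same-size integer and inserting bits into it might look like the C++ sketch below. It is not GIMPLE and not a patch; `insert_byte` is an invented helper, and the byte position is counted from the least significant byte to sidestep the endianness question.

```cpp
#include <cassert>
#include <cstdint>

// Replace one byte of a 32-bit value using only integer operations, so the
// value never needs a stack slot. byte_from_lsb counts from the least
// significant byte; on a little-endian target that matches the byte's
// offset in memory, on a big-endian target it is 3 minus that offset.
inline std::uint32_t insert_byte(std::uint32_t word, std::uint8_t byte,
                                 unsigned byte_from_lsb) {
    assert(byte_from_lsb < 4);
    const unsigned shift = 8 * byte_from_lsb;
    const std::uint32_t mask = std::uint32_t{0xFF} << shift;
    return (word & ~mask) | (std::uint32_t{byte} << shift);
}
```

For example, `insert_byte(3, 1, 1)` yields 259, the constant clang folds foo1() to on a little-endian target.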

[Bug rtl-optimization/90271] [missed-optimization] failure to keep variables in registers during "faux" memcpy

2019-04-28 Thread eyalroz at technion dot ac.il
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90271

--- Comment #1 from Eyal Rozenberg  ---
Can also reproduce this in C, with slightly different code:

int replace_bytes_3(int v1 ,char v2)
{
   memcpy( (void*) (((char*)&v1)+1) , &v2 , sizeof(v2) );
   return v1;
}

int foo3()
{
  int x = 3;
  char c = 1;
  return replace_bytes_3(x,c);
}


GodBolt: https://godbolt.org/z/1K89xh

Again, clang optimizes this correctly. Note specifically the way it handles the
non-inlined replace_bytes_3.

[Bug libstdc++/87982] No error for std::generate_n(ptr, ptr, f)

2019-04-28 Thread j.v.dijk at tue dot nl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87982

--- Comment #5 from Jan van Dijk  ---
Thanks a lot for this change.

One more nit: the standard clause 28.6.7(2) allows (== does not forbid)
negative count arguments, in which case generate_n is a no-op returning
__first, but this is not reflected by the libstdc++ documentation of the return
value, which claims  __first+__n unconditionally.
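
A quick C++ check of the negative-count behaviour described above; the helper name `negative_count_is_noop` is invented for this sketch.

```cpp
#include <algorithm>
#include <cassert>

// Call generate_n with a negative count and report whether it left the
// buffer untouched, never called the generator, and returned the starting
// iterator.
inline bool negative_count_is_noop() {
    int buf[2] = {7, 7};
    int calls = 0;
    int* ret = std::generate_n(buf, -1, [&calls] { ++calls; return 0; });
    return ret == buf && calls == 0 && buf[0] == 7 && buf[1] == 7;
}
```

A return value of true matches the no-op reading, and shows why documenting the return value as __first+__n unconditionally is misleading.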

Somewhat off-topic: IMHO the standard is not explicit about the fact that gen()
is (of course) to be invoked separately for every element in the range:
"The generate_n algorithms invoke the function object gen and assign the return
value of gen through all the iterators in the range...". But that may be me not
being a native English speaker. The libstdc++ docs are more helpful for sure.

And indeed, sorry for not being sufficiently precise: the standard does not say
to *which* integral type the conversion must be possible so that type could be
used internally, or better: simply used in the interface. Having the _Size
template argument allows, in principle, the loop to be written without the
conversion being actually done, but that would be such a pico-optimization that
I still do not understand why the standard did not just fix the count type to
(e.g.) std::ptrdiff_t --- but also that is outside the scope of this issue.

[Bug go/90272] New: internal compile error with full backtrace

2019-04-28 Thread 22374604 at sun dot ac.za
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90272

Bug ID: 90272
   Summary: internal compile error with full backtrace
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: 22374604 at sun dot ac.za
CC: cmang at google dot com
  Target Milestone: ---

Created attachment 46255
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46255=edit
program that uncovers the bug reported in the description

I found another bug different from the one I reported in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90116.  I have attached the
program uncovering this bug. Below is the output:

go1: internal compiler error: in func_value, at go/gofrontend/gogo.h:2583
0x9d0bfb Named_object::func_value()
../../gcc-8.2.0/gcc/go/gofrontend/gogo.h:2583
0xb1a03d Type_declaration::define_methods(Named_type*)
../../gcc-8.2.0/gcc/go/gofrontend/gogo.cc:7099
0xb1ae54 Named_object::set_type_value(Named_type*)
../../gcc-8.2.0/gcc/go/gofrontend/gogo.cc:7291
0xb1e54c Bindings::define_type(Named_object*, Named_type*)
../../gcc-8.2.0/gcc/go/gofrontend/gogo.cc:7853
0xafb8ed Gogo::define_type(Named_object*, Named_type*)
../../gcc-8.2.0/gcc/go/gofrontend/gogo.cc:2132
0xb979ea Parse::type_spec(void*, unsigned int)
../../gcc-8.2.0/gcc/go/gofrontend/parse.cc:1598
0xb95c9a Parse::decl(void (Parse::*)(void*, unsigned int), void*, unsigned int)
../../gcc-8.2.0/gcc/go/gofrontend/parse.cc:1357
0xb971e5 Parse::type_decl(unsigned int)
../../gcc-8.2.0/gcc/go/gofrontend/parse.cc:1520
0xb95861 Parse::declaration()
../../gcc-8.2.0/gcc/go/gofrontend/parse.cc:1321
0xbb1c2c Parse::program()
../../gcc-8.2.0/gcc/go/gofrontend/parse.cc:5807
0xae8734 go_parse_input_files(char const**, unsigned int, bool, bool)
../../gcc-8.2.0/gcc/go/gofrontend/go.cc:79
0xad4a71 go_langhook_parse_file
../../gcc-8.2.0/gcc/go/go-lang.c:329
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.

[Bug rtl-optimization/90271] New: [missed-optimization] failure to keep variables in registers during "faux" memcpy

2019-04-28 Thread eyalroz at technion dot ac.il
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90271

Bug ID: 90271
   Summary: [missed-optimization] failure to keep variables in
registers during "faux" memcpy
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eyalroz at technion dot ac.il
  Target Milestone: ---

Example on GodBolt: https://godbolt.org/z/Q17L1u

Consider the following functions:

template <typename T1, typename T2>
inline void replace_bytes (T1& v1 ,const T2& v2 ,std::size_t k) noexcept
{
   if (k > sizeof(T1) - sizeof(T2)) { return; }

   std::memcpy( (void*) (((char*)&v1)+k) , (const void*) &v2 , sizeof(T2) );
}

For plain-old-data types, this is nothing but the manipulation of v1's bytes
(and there are no pointer aliasing issues). So, at least when k is known at
compile-time, the compiler should IMHO keep the activity to within registers.

And yet - GCC doesn't: With the extra code

int foo1()
{
  int x = 3;
  char c = 1;
  replace_bytes(x,c,1);
  return x;
}

we get (at maximum optimization):

foo1():
mov DWORD PTR [rsp-4], 3
mov BYTE PTR [rsp-3], 1
mov eax, DWORD PTR [rsp-4]
ret

This, while clang _does_ optimize fully and has foo1() simply return 259 (=
256+3).

Even if we make k a template parameter - it doesn't help.
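
For reference, a self-contained C++ version that can be compiled to check the expected constant. The template parameter list and the `&v1`/`&v2` operands are an assumed reconstruction of the helper, the `static_assert` is an addition, and `foo1_sketch` is a renamed copy of foo1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed reconstruction of the report's helper: copy the bytes of v2 into
// v1 starting at byte offset k, doing nothing if they would not fit.
template <typename T1, typename T2>
inline void replace_bytes(T1& v1, const T2& v2, std::size_t k) noexcept {
    static_assert(sizeof(T2) <= sizeof(T1), "v2 must fit inside v1");
    if (k > sizeof(T1) - sizeof(T2)) { return; }
    std::memcpy(reinterpret_cast<char*>(&v1) + k, &v2, sizeof(T2));
}

// Same shape as foo1 in the report: overwrite byte 1 of the int 3 with 1.
inline int foo1_sketch() {
    int x = 3;
    char c = 1;
    replace_bytes(x, c, 1);
    return x;
}
```

With a 4-byte int, this returns 259 (0x00000103) on a little-endian target and 65539 (0x00010003) on a big-endian one, so the whole function can in principle fold to a constant.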

[Bug libstdc++/87982] No error for std::generate_n(ptr, ptr, f)

2019-04-28 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87982

Jonathan Wakely  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org

--- Comment #4 from Jonathan Wakely  ---
The requirement is that Size can be converted to an integer type (so it would
be wrong to assert it is an integer type).

I think converting it to the iterator's difference type is the right fix. Doing
that would make the original example ill-formed, because the pointer isn't
convertible to ptrdiff_t.

Now we're in stage 1 I'll make that change.
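
A rough C++ sketch of the idea, not the actual libstdc++ change: `generate_n_sketch` and `counter_sketch` are hypothetical names, and the sketch assumes the iterator's difference type is an integer type (which is not the case for every output iterator).

```cpp
#include <cassert>
#include <iterator>

// Hypothetical generate_n variant: the count is converted once to the
// iterator's difference type, so any Size that is implicitly convertible
// to an integer works, and no operator+ or operator-- on Size is needed.
template <typename OutIt, typename Size, typename Gen>
OutIt generate_n_sketch(OutIt first, Size n, Gen gen) {
    using Diff = typename std::iterator_traits<OutIt>::difference_type;
    Diff left = n;  // ill-formed if Size is a pointer, as intended
    for (; left > 0; --left, ++first)
        *first = gen();
    return first;
}

// Counter type in the spirit of comment #3: convertible to unsigned, with
// no public arithmetic operators.
struct counter_sketch {
    explicit counter_sketch(unsigned n) : n_(n) {}
    operator unsigned() const { return n_; }
private:
    unsigned n_;
};
```

With this shape, `generate_n_sketch(a, counter_sketch(2), gen)` compiles, while passing a pointer as the count is rejected because a pointer does not implicitly convert to ptrdiff_t.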

[Bug libstdc++/87982] No error for std::generate_n(ptr, ptr, f)

2019-04-28 Thread j.v.dijk at tue dot nl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87982

Jan van Dijk  changed:

   What|Removed |Added

 CC||j.v.dijk at tue dot nl

--- Comment #3 from Jan van Dijk  ---
Isn't the real problem that the standard does not specify any requirements
about the type of the count argument (or a precondition on its value)? 

Should that be a built-in integral type? Then the issue is solved by doing a
static_assert> to the implementation.

One could also argue that the counter merely be convertible to a std::size_t,
and the question (to WG21) is why then the count type is a template argument in
the first place. (One could also argue that it should be a std::ptrdiff_t,
since the algorithm really operates on a range [ptr,ptr+count] --- that would
also allow an assert(count>=0) in the implementation.)

If the intention is really that also user-defined types are supported, the
requirements on such types should be spelled out (by WG21). As an example, the
code below uses a custom counter object that can be converted to (=> and
compared with) an integer value. Reasonable enough. However, it does not
compile because of what really seems to be an undocumented implementation
detail in stl_algo.h: the usage of decltype(__n + 0) to compute a counter type
for internal usage. Is the code below valid or not?

It is remarkable how much Sunday morning can be spent on such an innocuous
issue :-)

cat 87892_2.cpp
#include <algorithm>

struct counter
{
counter(unsigned n);
operator unsigned() const;
counter& operator--();
private:
template  counter operator+(T v) const;
};

void foo()
{
  int a[2];
  std::generate_n(a, counter(2), []{ return 0;});
}

g++ -c 87892_2.cpp
In file included from /home/jan/local/gcc-head/include/c++/9.0.1/algorithm:62,
 from 87892_2.cpp:1:
/home/jan/local/gcc-head/include/c++/9.0.1/bits/stl_algo.h: In instantiation of
‘_OIter std::generate_n(_OIter, _Size, _Generator) [with _OIter = int*; _Size =
counter; _Generator = foo()::]’:
87892_2.cpp:15:48:   required from here
/home/jan/local/gcc-head/include/c++/9.0.1/bits/stl_algo.h:4448:27: error:
‘counter counter::operator+(T) const [with T = int]’ is private within this
context
 4448 |   for (__decltype(__n + 0) __niter = __n;
  |   ^~~
87892_2.cpp:9:29: note: declared private here
9 |  template  counter operator+(T v) const;
  |

[Bug tree-optimization/90240] [10 Regression] ICE in try_improve_iv_set, at tree-ssa-loop-ivopts.c:6694

2019-04-28 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90240

--- Comment #8 from bin cheng  ---
Patch proposed at:
https://gcc.gnu.org/ml/gcc-patches/2019-04/msg01101.html

[Bug tree-optimization/90270] [8/9/10 Regression] Do not select best induction variable optimization

2019-04-28 Thread rjiejie at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

--- Comment #3 from jojo  ---
Haha..., get_address_cost() adjusts the cost of some addresses in a way I did
not expect. You can find that code in my first comment (the original report).
I cannot understand why that code was added in the newer version :(

(In reply to Andrew Pinski from comment #1)
> Interesting aarch64 also has the same issue.

[Bug tree-optimization/90270] [8/9/10 Regression] Do not select best induction variable optimization

2019-04-28 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-04-28
     Ever confirmed|0                           |1

--- Comment #2 from Andrew Pinski  ---
GCC 7.3.0 for aarch64 produces:
.L2:
ldr w0, [x21, x19]
bl  crcu32
mov w1, w0
ldr w0, [x20, x19]
add x19, x19, 4
bl  crcu32
and w1, w0, 65535
cmp x19, 32
bne .L2

Which looks correct while GCC 8.3.0 produces:
.L2:
lsl x19, x20, 2
add x0, sp, 32
add x20, x20, 1
add x0, x0, x19
ldr w0, [x0, -4]
bl  crcu32
add x1, sp, 64
add x19, x1, x19
mov w1, w0
ldr w0, [x19, -4]
bl  crcu32
and w1, w0, 65535
cmp x20, 9
bne .L2

GCC 9.0.1 (from March 10th) produces:
.L2:
lsl x19, x20, 2
add x0, sp, 32
add x20, x20, 1
add x0, x0, x19
ldr w0, [x0, -4]
bl  crcu32
add x1, sp, 64
add x19, x1, x19
mov w1, w0
ldr w0, [x19, -4]
bl  crcu32
and w1, w0, 65535
cmp x20, 9
bne .L2

So I am going to assume GCC 9.1.0 also has a similar issue.

[Bug tree-optimization/90270] [8/9/10 Regression] Do not select best induction variable optimization

2019-04-28 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |7.3.0
   Target Milestone|---                         |8.4
      Known to fail|                            |8.3.0

[Bug tree-optimization/90270] [8/9/10 Regression] Do not select best induction variable optimization

2019-04-28 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |aarch64, riscv
            Summary|Do not select best          |[8/9/10 Regression] Do not
                   |induction variable          |select best induction
                   |optimization                |variable optimization

--- Comment #1 from Andrew Pinski  ---
Interesting aarch64 also has the same issue.

[Bug tree-optimization/90270] New: Do not select best induction variable optimization

2019-04-28 Thread rjiejie at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90270

Bug ID: 90270
   Summary: Do not select best induction variable optimization
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rjiejie at me dot com
  Target Milestone: ---

Using built-in specs.
COLLECT_GCC=/home/jojo/work/csky/cskytoolchain/csky-toolchain-build-riscv/riscv-install/bin/riscv64-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/home/jojo/work/csky/cskytoolchain/csky-toolchain-build-riscv/riscv-install/libexec/gcc/riscv64-unknown-linux-gnu/8.1.0/lto-wrapper
Target: riscv64-unknown-linux-gnu
Configured with:
/home/jojo/work/csky/cskytoolchain/csky-toolchain-build-riscv/riscv-gcc/configure
--target=riscv64-unknown-linux-gnu
--prefix=/home/jojo/work/csky/cskytoolchain/csky-toolchain-build-riscv/riscv-install
--with-sysroot=/home/jojo/work/csky/cskytoolchain/csky-toolchain-build-riscv/riscv-install/sysroot
--with-system-zlib --enable-shared --enable-tls --enable-languages=c,c++
--disable-libmudflap --disable-libssp --disable-libquadmath --disable-nls
--disable-bootstrap --src=.././riscv-gcc --enable-checking=yes
--with-pkgversion= --disable-multilib --with-abi=lp64 --with-arch=rv64imac
'CFLAGS_FOR_TARGET=-O2  -mcmodel=medlow' 'CXXFLAGS_FOR_TARGET=-O2 
-mcmodel=medlow' CFLAGS='-O0 -g' CXXFLAGS='-O0 -g'
Thread model: posix
gcc version 8.1.0 ()


The following test case does not select the best induction variables:

extern unsigned short int crcu32(unsigned int newval, unsigned short int crc);
unsigned short int func(unsigned short int crc)
{
  unsigned int final_counts[8];
  unsigned int track_counts[8];
  unsigned int i;

  for (i = 0; i < 8; i++) {
    crc = crcu32(final_counts[i], crc);
    crc = crcu32(track_counts[i], crc);
  }
  return crc;
}

The generated asm code:

.L2:
slli s0,s1,2
add  a5,sp,s0
lw   a0,-4(a5)
addi s1,s1,1
call crcu32
addi a5,sp,32
add  s0,a5,s0
mv   a1,a0
lw   a0,-4(s0)
call crcu32
mv   a1,a0
bne  s1,s2,.L2

I debugged this and found the following in the "ivopts" tree optimization
pass.

The additional code below, in file tree-ssa-loop-ivopts.c, adjusts the cost of
some kinds of address:

/* Cost of small invariant expression adjusted against loop niters
 is usually zero, which makes it difficult to be differentiated
 from candidate based on loop invariant variables.  Secondly, the
 generated invariant expression may not be hoisted out of loop by
 following pass.  We penalize the cost by rounding up in order to
 neutralize such effects.  */
cost.cost = adjust_setup_cost (data, cost.cost, true);
cost.scratch = cost.cost;


When I remove these two lines, the generated asm code is better:

.L2:
lw   a0,0(s0)
addi s0,s0,4
addi s1,s1,4
call crcu32
mv   a1,a0
lw   a0,-4(s1)
call crcu32
mv   a1,a0
bne  s0,s2,.L2
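
As a rough source-level picture of the two induction variable choices, the
sketch below writes the loop in both shapes. It is only an analogy for the two
asm listings, not what ivopts does internally. The crcu32 body is a made-up
stand-in (the report only declares it extern), and the arrays are passed in as
parameters here so that the two functions can be compared.

```cpp
#include <cassert>

// Hypothetical stand-in for the external crcu32 from the report.
unsigned short crcu32(unsigned int newval, unsigned short crc)
{
  return static_cast<unsigned short>((crc * 31u) ^ newval);
}

// One integer counter; each load address is recomputed as base + i * 4,
// similar in shape to the slli/add sequence in the first listing.
unsigned short crc_indexed(const unsigned int* final_counts,
                           const unsigned int* track_counts,
                           unsigned short crc)
{
  for (unsigned int i = 0; i < 8; i++) {
    crc = crcu32(final_counts[i], crc);
    crc = crcu32(track_counts[i], crc);
  }
  return crc;
}

// Two pointer induction variables stepped by one element, with the exit
// test on one of them, similar in shape to the second listing.
unsigned short crc_pointers(const unsigned int* final_counts,
                            const unsigned int* track_counts,
                            unsigned short crc)
{
  const unsigned int* f = final_counts;
  const unsigned int* t = track_counts;
  const unsigned int* const f_end = final_counts + 8;
  while (f != f_end) {
    crc = crcu32(*f++, crc);
    crc = crcu32(*t++, crc);
  }
  assert(t == track_counts + 8);  // both pointers advanced in lockstep
  return crc;
}
```

Both functions visit the same elements in the same order, so they return the
same value; the difference is only in how the addresses are formed on each
iteration.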