Re: libgo patch committed: Upgrade to Go 1.9 release

2017-09-14 Thread Rainer Orth
Hi Ian, > On Thu, Sep 14, 2017 at 3:19 PM, Rainer Orth > wrote: >> >>> I've committed a patch to libgo to upgrade it to the recent Go 1.9 release. >>> >>> As usual with these upgrades, the patch is too large to attach here. >>> I've attached the changes to files

Re: [PATCH v2] [libcc1] Rename C{,P}_COMPILER_NAME and remove triplet from them

2017-09-14 Thread Sergio Durigan Junior
Ping. On Friday, September 01 2017, I wrote: > On Wednesday, August 23 2017, Pedro Alves wrote: > >> On 08/23/2017 05:17 AM, Sergio Durigan Junior wrote: >>> Hi there, >>> >>> This is a series of two patches, one for GDB and one for GCC, which aims >>> to improve the detection and handling of

Re: [RFC][PACH 3/5] Prevent tree unroller from completely unrolling inner loops if that results in excessive strided-loads in outer loop

2017-09-14 Thread Andrew Pinski
On Thu, Sep 14, 2017 at 6:30 PM, Kugan Vivekanandarajah wrote: > This patch prevents tree unroller from completely unrolling inner loops if that > results in excessive strided-loads in outer loop. Same comments from the RTL version. Though one more comment

Re: [RFC][AARCH64][PATCH 2/5]: Add number of hw prefetchers available to cpu_prefetch_tune

2017-09-14 Thread Andrew Pinski
On Thu, Sep 14, 2017 at 6:28 PM, Kugan Vivekanandarajah wrote: > This patch adds number of hw prefetchers available to > cpu_prefetch_tune so it can be used in loop unrolling decisions. Can you explain the difference between this and num_slots

[RFC][AARCH64][PATCH 5/5] add aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop

2017-09-14 Thread Kugan Vivekanandarajah
This patch adds aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * cfgloop.h (iv_analyze_biv): export. * loop-iv.c: Likewise. * config/aarch64/aarch64.c

[RFC][PATCH 4/5] Change iv_analyze_result to take const_rtx.

2017-09-14 Thread Kugan Vivekanandarajah
Change iv_analyze_result to take const_rtx. This is just to make the next patch compile. No functional changes: Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * cfgloop.h (iv_analyze_result): Change 2nd param from rtx to const_rtx. *

[RFC][PACH 3/5] Prevent tree unroller from completely unrolling inner loops if that results in excessive strided-loads in outer loop

2017-09-14 Thread Kugan Vivekanandarajah
This patch prevents tree unroller from completely unrolling inner loops if that results in excessive strided-loads in outer loop. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * config/aarch64/aarch64.c (count_mem_load_streams): New.

[RFC][AARCH64][PATCH 2/5]: Add number of hw prefetchers available to cpu_prefetch_tune

2017-09-14 Thread Kugan Vivekanandarajah
This patch adds number of hw prefetchers available to cpu_prefetch_tune so it can be used in loop unrolling decisions. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah * config/aarch64/aarch64-protos.h (struct cpu_prefetch_tune): Add new field

[RFC][PATCH 1/5] Add separate parms for rtl unroller

2017-09-14 Thread Kugan Vivekanandarajah
This patch adds separate params for rtl unroller so that they can be tuned accordingly. Default values I have are based on some testing on aarch64. I am happy to leave it as the current value and set them in the back-end. Thanks, Kugan gcc/ChangeLog: 2017-09-12 Kugan Vivekanandarajah

[RFC][PATCH 0/5] Loop unrolling and memory load streams

2017-09-14 Thread Kugan Vivekanandarajah
While loop unrolling helps to keep the pipeline busy in modern processors, it also can increase the memory streams resulting in collisions for the hardware prefetcher that can impact performance. This patch series tries to detect this and limit the loop unrolling. Patch 1 : Add separate parms for

Re: [PATCH, rs6000] Don't mark the TOC reg as set up in prologue

2017-09-14 Thread Alan Modra
On Thu, Sep 14, 2017 at 11:39:54AM -0500, Segher Boessenkool wrote: > [ pressed send too early ] > > On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote: > > --- gcc/config/rs6000/rs6000.c (revision 252029) > > +++ gcc/config/rs6000/rs6000.c (working copy) > > @@ -37807,6

Re: libgo patch committed: Upgrade to Go 1.9 release

2017-09-14 Thread Ian Lance Taylor
On Thu, Sep 14, 2017 at 3:19 PM, Rainer Orth wrote: > >> I've committed a patch to libgo to upgrade it to the recent Go 1.9 release. >> >> As usual with these upgrades, the patch is too large to attach here. >> I've attached the changes to files that are more or

Re: [PATCH], Add support for __builtin_{sqrt,fma}f128 on PowerPC ISA 3.0

2017-09-14 Thread Michael Meissner
On Thu, Sep 14, 2017 at 09:54:14AM -0500, Segher Boessenkool wrote: > On Wed, Sep 13, 2017 at 05:46:00PM -0400, Michael Meissner wrote: > > This patch adds support on PowerPC ISA 3.0 for the built-in function > > __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction > > and

Re: libgo patch committed: Upgrade to Go 1.9 release

2017-09-14 Thread Rainer Orth
Hi Ian, > I've committed a patch to libgo to upgrade it to the recent Go 1.9 release. > > As usual with these upgrades, the patch is too large to attach here. > I've attached the changes to files that are more or less specific to > gccgo. > > This upgrade required some changes to the gotools

Re: [C++ PATCH] Renames/adjustments of 1z to 17

2017-09-14 Thread Jakub Jelinek
On Thu, Sep 14, 2017 at 10:32:12PM +0100, Pedro Alves wrote: > On 09/14/2017 09:26 PM, Jakub Jelinek wrote: > > +@item c++17 > > +@itemx c++1z > > +The 2017 ISO C++ standard plus amendments. > > +The name @samp{c++1z} is deprecated. > > + > > +@item gnu++17 > > +@itemx gnu++1z > > +GNU dialect of

Re: [C++ PATCH] Renames/adjustments of 1z to 17

2017-09-14 Thread Pedro Alves
On 09/14/2017 09:26 PM, Jakub Jelinek wrote: > +@item c++17 > +@itemx c++1z > +The 2017 ISO C++ standard plus amendments. > +The name @samp{c++1z} is deprecated. > + > +@item gnu++17 > +@itemx gnu++1z > +GNU dialect of @option{-std=c++17}. > +The name @samp{gnu++17} is deprecated. > @end table I

Re: [C++ PATCH] Renames/adjustments of 1z to 17

2017-09-14 Thread Jakub Jelinek
On Thu, Sep 14, 2017 at 02:24:01PM -0700, Mike Stump wrote: > > --- gcc/doc/invoke.texi.jj 2017-09-12 21:57:57.0 +0200 > > +++ gcc/doc/invoke.texi 2017-09-14 19:32:34.342959968 +0200 > > @@ -1870,15 +1870,15 @@ GNU dialect of @option{-std=c++14}. > > This is the default for C++ code.

Re: [C++ PATCH] Renames/adjustments of 1z to 17

2017-09-14 Thread Mike Stump
On Sep 14, 2017, at 1:26 PM, Jakub Jelinek wrote: > > Given https://herbsutter.com/2017/09/06/c17-is-formally-approved/ > this patch makes -std=c++17 and -std=gnu++17 the documented options > --- gcc/doc/invoke.texi.jj2017-09-12 21:57:57.0 +0200 > +++

Re: [PATCH, rs6000 version 2] Add support for vec_xst_len_r() and vec_xl_len_r() builtins

2017-09-14 Thread Carl Love
GCC maintainers: Here is an updated patch to address the comment from Segher. The one comment that was not addressed was: >> +(define_insn "altivec_lvsl_reg" >> + [(set (match_operand:V16QI 0 "vsx_register_operand" "=v") >> + (unspec:V16QI >> + [(match_operand:DI 1

Re: [PATCH, rs6000] [v2] Folding of vector loads in GIMPLE

2017-09-14 Thread Will Schmidt
On Thu, 2017-09-14 at 09:38 -0500, Bill Schmidt wrote: > On Sep 14, 2017, at 5:15 AM, Richard Biener > wrote: > > > > On Wed, Sep 13, 2017 at 10:14 PM, Bill Schmidt > > wrote: > >> On Sep 13, 2017, at 10:40 AM, Bill Schmidt

[committed] Fix handling of reference vars in C++ implicit task/taskloop firstprivate (PR c++/81314)

2017-09-14 Thread Jakub Jelinek
Hi! For firstprivate vars, even when implicit, the privatized entity is what the reference refers to; if its copy ctor or dtor need instantiation, doing this at gimplification time is too late, therefore we should handle it during genericization like we handle non-reference firstprivatized vars.

[C++ PATCH] Fix compile time hog in replace_placeholders (PR sanitizer/81929)

2017-09-14 Thread Jakub Jelinek
Hi! When the expression replace_placeholders is called on contains many SAVE_EXPRs that appear more than once in the tree, we hang walking them over and over again, while it is sufficient to just walk it without duplicates (not using cp_walk_tree_without_duplicates, because the callback can

Re: Rb_tree constructor optimization

2017-09-14 Thread François Dumont
I realized there was no test on the noexcept qualification of the move constructor with allocator. I added some and found out that patch was missing a noexcept qualification at _Rb_tree level. Here is the updated patch fully tested, ok to commit ? François On 13/09/2017 21:57, François

Re: [PATCH, PR81844] Fix condition folding in c_parser_omp_for_loop

2017-09-14 Thread Jakub Jelinek
On Thu, Sep 14, 2017 at 07:34:14PM +, de Vries, Tom wrote: > --- a/libgomp/testsuite/libgomp.c++/c++.exp > +++ b/libgomp/testsuite/libgomp.c++/c++.exp > @@ -22,6 +22,11 @@ dg-init > # Turn on OpenMP. > lappend ALWAYS_CFLAGS "additional_flags=-fopenmp" > > +# Switch into C++ mode.

Re: [PATCH, PR81844] Fix condition folding in c_parser_omp_for_loop

2017-09-14 Thread de Vries, Tom
> I know we don't have > libgomp.c-c++-common (maybe we should add that) Like so? Ran: - make check-target-libgomp RUNTESTFLAGS=c.exp=cancel-taskgroup-1.c - make check-target-libgomp RUNTESTFLAGS=c++.exp=cancel-taskgroup-1.c Currently running make check-target-libgomp. OK for trunk if tests

[committed] Fix crash accessing builtins in sanitizer.def and after (PR jit/82174)

2017-09-14 Thread David Malcolm
Calls to gcc_jit_context_get_builtin_function that accessed builtins in sanitizer.def and after (or failed to match any builtin) led to a crash accessing a NULL builtin name. The entries with the NULL name came from these lines in sanitizer.def: /* This has to come before all the sanitizer

Re: [PATCH version 2, rs6000] Add builtins to convert from float/double to int/long using current rounding mode

2017-09-14 Thread Michael Meissner
On Wed, Sep 13, 2017 at 06:08:45PM -0500, Segher Boessenkool wrote: > On Tue, Sep 12, 2017 at 07:17:07PM -0400, Michael Meissner wrote: > > On Tue, Sep 12, 2017 at 05:41:34PM -0500, Segher Boessenkool wrote: > > > This needs "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT" I think? Which > > > is the

Re: [PATCH], Add support for __builtin_{sqrt,fma}f128 on PowerPC ISA 3.0

2017-09-14 Thread Michael Meissner
On Wed, Sep 13, 2017 at 10:49:43PM +, Joseph Myers wrote: > On Wed, 13 Sep 2017, Michael Meissner wrote: > > > This patch adds support on PowerPC ISA 3.0 for the built-in function > > __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction > > and > > the built-in

Re: [patch, fortran, RFC] warn about out-of-bounds errors in DO loops

2017-09-14 Thread Thomas Koenig
Hi Richard, Is it OK to throw a hard error for this? Maybe the rules are different from C and C++, but normally we can't do that for code that's only invalid if executed. An unconditional warning would be good though. I can also issue an unconditional warning; this will even simplify the

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Steve Ellcey
On Thu, 2017-09-14 at 11:53 -0600, Jeff Law wrote: >  > > And I think that's starting to zero in on the problem -- > WORD_REGISTER_OPERATIONS is zero on aarch64 as you don't get extension > to word_mode for W form registers. > > I wonder if what needs to happen is somehow look to extend that

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Jeff Law
On 09/14/2017 10:33 AM, Steve Ellcey wrote: > On Thu, 2017-09-14 at 09:03 -0600, Jeff Law wrote: >> On 09/13/2017 03:46 PM, Steve Ellcey wrote: >>> >>> In arm32 rtl expansion, when reading the QI memory location, I see >>> these instructions get generated: >>> >>> (insn 10 3 11 2 (set (reg:SI

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Jeff Law
On 09/14/2017 10:33 AM, Steve Ellcey wrote: > On Thu, 2017-09-14 at 09:03 -0600, Jeff Law wrote: >> On 09/13/2017 03:46 PM, Steve Ellcey wrote: >>> >>> In arm32 rtl expansion, when reading the QI memory location, I see >>> these instructions get generated: >>> >>> (insn 10 3 11 2 (set (reg:SI

Re: [PATCH, rs6000] Don't mark the TOC reg as set up in prologue

2017-09-14 Thread Segher Boessenkool
On Thu, Sep 14, 2017 at 11:53:02AM -0500, Pat Haugen wrote: > On 09/14/2017 11:35 AM, Segher Boessenkool wrote: > > On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote: > >> --- gcc/config/rs6000/rs6000.c (revision 252029) > >> +++ gcc/config/rs6000/rs6000.c (working copy) > >> @@

Re: [PATCH] Add comments to struct cgraph_thunk_info

2017-09-14 Thread Jeff Law
On 09/14/2017 02:01 AM, Pierre-Marie de Rodat wrote: > Hello, > > This commit adds comments to fields in the cgraph_thunk_info structure > declaration from cgraph.h. They will hopefully answer questions that > people like myself can ask while discovering the thunk machinery. I > also made an

libgo patch committed: Upgrade to Go 1.9 release

2017-09-14 Thread Ian Lance Taylor
I've committed a patch to libgo to upgrade it to the recent Go 1.9 release. As usual with these upgrades, the patch is too large to attach here. I've attached the changes to files that are more or less specific to gccgo. This upgrade required some changes to the gotools Makefile. And one test

Re: Turn CANNOT_CHANGE_MODE_CLASS into a hook

2017-09-14 Thread Jeff Law
On 09/13/2017 01:19 PM, Richard Sandiford wrote: > This also seemed like a good opportunity to reverse the sense of the > hook to "can", to avoid the awkward double negative in !CANNOT. Yea. The double-negatives can sometimes make code hard to read. > > Tested on aarch64-linux-gnu,

Re: [PATCH, rs6000] Don't mark the TOC reg as set up in prologue

2017-09-14 Thread Segher Boessenkool
[ pressed send too early ] On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote: > --- gcc/config/rs6000/rs6000.c(revision 252029) > +++ gcc/config/rs6000/rs6000.c(working copy) > @@ -37807,6 +37807,11 @@ rs6000_set_up_by_prologue (struct hard_r > add_to_hard_reg_set

Re: Turn TRULY_NOOP_TRUNCATION into a hook

2017-09-14 Thread Jeff Law
On 09/13/2017 01:21 PM, Richard Sandiford wrote: > I'm not sure the documentation is correct that outprec is always less > than inprec, and each non-default implementation tested for the case > in which it wasn't, but the patch leaves it as-is. While the non-default implementations may always test

Re: [PATCH, rs6000] Don't mark the TOC reg as set up in prologue

2017-09-14 Thread Pat Haugen
On 09/14/2017 11:35 AM, Segher Boessenkool wrote: > On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote: >> --- gcc/config/rs6000/rs6000.c (revision 252029) >> +++ gcc/config/rs6000/rs6000.c (working copy) >> @@ -37807,6 +37807,11 @@ rs6000_set_up_by_prologue (struct hard_r >>

Re: [PATCH] Enhance PHI processing in VN

2017-09-14 Thread David Edelsohn
* tree-ssa-sccvn.c (visit_phi): Merge undefined values similar to VN_TOP. This seems to have regressed FAIL: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile "Read tp_first_run: 0" 2 FAIL: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile "Read tp_first_run: 2" 1 FAIL:

Re: Add option for whether ceil etc. can raise "inexact", adjust x86 conditions

2017-09-14 Thread Jan Hubicka
> > Well, it's of course the poor-mans solution compared to providing our own > ifunc-enabled libm ... One benefit here would be that we could have our own calling convention for this. So for floor/ceil we may just declare registers to be preserved (as they are on all modern AVX enabled cpus)

Re: Turn FUNCTION_ARG_OFFSET into a hook

2017-09-14 Thread Jeff Law
On 09/13/2017 01:22 PM, Richard Sandiford wrote: > Nice and easy, one definition and one use :-) > > Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. > Also tested by comparing the testsuite assembly output on at least one > target per CPU directory. OK to install? > >

Re: [PATCH, rs6000] Don't mark the TOC reg as set up in prologue

2017-09-14 Thread Segher Boessenkool
On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote: > --- gcc/config/rs6000/rs6000.c(revision 252029) > +++ gcc/config/rs6000/rs6000.c(working copy) > @@ -37807,6 +37807,11 @@ rs6000_set_up_by_prologue (struct hard_r > add_to_hard_reg_set (>set, Pmode,

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Steve Ellcey
On Thu, 2017-09-14 at 09:03 -0600, Jeff Law wrote: > On 09/13/2017 03:46 PM, Steve Ellcey wrote: > >  > > In arm32 rtl expansion, when reading the QI memory location, I see > > these instructions get generated: > > > > (insn 10 3 11 2 (set (reg:SI 119) > > (zero_extend:SI (mem:QI

[PATCHv2] Add a -Wcast-align=strict warning

2017-09-14 Thread Bernd Edlinger
On 09/04/17 10:07, Bernd Edlinger wrote: > Hi, > > as you know we have a -Wcast-align warning which works only for > STRICT_ALIGNMENT targets. But occasionally it would be nice to be > able to switch this warning on even for other targets. > > Therefore I would like to add a strict version of

Re: [PATCH version 3, rs6000] Add builtins to convert from float/double to int/long using current rounding mode

2017-09-14 Thread Segher Boessenkool
Hi Carl, On Wed, Sep 13, 2017 at 04:29:01PM -0700, Carl Love wrote: > -- add "TARGET_SF_FPR && TARGET_FPRND" to the define_insn "lrintsfsi2" > as mentioned it was missing on the original define_insn for fctiw. I don't think TARGET_FPRND is correct: this instruction is in the original PowerPC

[PATCH, rs6000] Don't mark the TOC reg as set up in prologue

2017-09-14 Thread Pat Haugen
Revision 235876 inadvertently caused the TOC reg to be marked as set up in prologue, which prevents shrink-wrapping from moving the prologue past a TOC reference. The following patch corrects the situation. Bootstrap/regtest on powerpc64le-linux and powerpc64-linux(-m32/-m64) with no new

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Jeff Law
On 09/13/2017 03:46 PM, Steve Ellcey wrote: > On Wed, 2017-09-13 at 14:46 -0500, Segher Boessenkool wrote: >> On Wed, Sep 13, 2017 at 06:13:50PM +0100, Kyrill Tkachov wrote: >>> >>> We are usually hesitant to add explicit subreg matching in the MD pattern >>> (though I don't remember if there's

[PATCH PR82163]Rewrite loop into lcssa form instantly

2017-09-14 Thread Bin Cheng
Hi, Current pcom implementation rewrites into lcssa form after all loops are transformed, this is not enough because unrolling of later loop checks lcssa form in function tree_transform_and_unroll_loop. This simple patch rewrites loop into lcssa form if store-store chain is handled. I think it

Re: [PATCH], Add support for __builtin_{sqrt,fma}f128 on PowerPC ISA 3.0

2017-09-14 Thread Segher Boessenkool
On Wed, Sep 13, 2017 at 05:46:00PM -0400, Michael Meissner wrote: > This patch adds support on PowerPC ISA 3.0 for the built-in function > __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction > and > the built-in function __builtin_fmaf128 generating XSMADDQP, XSMSUBQP, >

Re: [PATCH, rs6000] [v2] Folding of vector loads in GIMPLE

2017-09-14 Thread Bill Schmidt
On Sep 14, 2017, at 5:15 AM, Richard Biener wrote: > > On Wed, Sep 13, 2017 at 10:14 PM, Bill Schmidt > wrote: >> On Sep 13, 2017, at 10:40 AM, Bill Schmidt >> wrote: >>> >>> On Sep 13, 2017, at 7:23 AM,

Re: Add a vect_get_dr_size helper function

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 4:05 PM, Richard Sandiford wrote: > Richard Biener writes: >> On Thu, Sep 14, 2017 at 1:23 PM, Richard Sandiford >> wrote: >>> This patch adds a helper function for getting the number

Re: Store VECTOR_CST_NELTS directly in tree_node

2017-09-14 Thread Richard Sandiford
Richard Biener writes: > On Thu, Sep 14, 2017 at 1:13 PM, Richard Sandiford > wrote: >> Previously VECTOR_CST_NELTS (t) read the number of elements from >> TYPE_VECTOR_SUBPARTS (TREE_TYPE (t)). There were two ways of handling >> this

Re: Add a vect_get_dr_size helper function

2017-09-14 Thread Richard Sandiford
Richard Biener writes: > On Thu, Sep 14, 2017 at 1:23 PM, Richard Sandiford > wrote: >> This patch adds a helper function for getting the number of >> bytes accessed by a scalar data reference, which helps when general >> modes have a

[rs6000,patch] fix for fold-vec-ld-longlong.c test (lp64)

2017-09-14 Thread Will Schmidt
Hi, I missed a target lp64 require for the fold-vec-ld-longlong.c test. I'm now wearing my cone of shame. :-( Committing as trivial, momentarily. Thanks, -Will diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-longlong.c

Re: Make more use of gimple-fold.h in tree-vect-loop.c

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 1:25 PM, Richard Sandiford wrote: > This patch makes the vectoriser use the gimple-fold.h routines > in more cases, instead of vect_init_vector. Later patches want > to use the same interface to handle variable-length vectors. > > Tested on

Re: Add LOOP_VINFO_MAX_VECT_FACTOR

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 1:24 PM, Richard Sandiford wrote: > Epilogue vectorisation uses the vectorisation factor of the main loop > as the maximum vectorisation factor allowed for correctness. That makes > sense as a conservatively correct value, since the chosen

Re: Add a vect_get_dr_size helper function

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 1:23 PM, Richard Sandiford wrote: > This patch adds a helper function for getting the number of > bytes accessed by a scalar data reference, which helps when general > modes have a variable size. > > Tested on aarch64-linux-gnu,

Re: Add a vect_worthwhile_without_simd_p helper routine

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 1:22 PM, Richard Sandiford wrote: > The vectoriser sometimes considers lowering "vector" operations into N > scalar word operations. This N needs to be fixed at compile time, so > the condition guarding it needs to change when variable-length

Re: Add a vect_get_num_copies helper routine

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 1:22 PM, Richard Sandiford wrote: > This patch adds a vectoriser helper routine to calculate how > many copies of a vector statement we need. At present this > is always: > > LOOP_VINFO_VECT_FACTOR (loop_vinfo) / TYPE_VECTOR_SUBPARTS

Re: Add gimple_build_vector* helpers

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 1:20 PM, Richard Sandiford wrote: > This patch adds gimple-fold.h equivalents of build_vector and > build_vector_from_val. Like the other gimple-fold.h routines > they always return a valid gimple value and add any new > statements to a given

Re: Use vec<> for constant permute masks

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 1:20 PM, Richard Sandiford wrote: > This patch makes can_vec_perm_p & co. take a vec<>, wrapped in new > typedefs vec_perm_indices and auto_vec_perm_indices. There are two > reasons for doing this for SVE: > > (1) it means that the number of

Re: Use vec<> in build_vector

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 1:14 PM, Richard Sandiford wrote: > This patch makes build_vector take the elements as a vec<> rather > than a tree *. This is useful for SVE because it bundles the number > of elements with the elements themselves, and enforces the fact that

Re: Store VECTOR_CST_NELTS directly in tree_node

2017-09-14 Thread Richard Biener
On Thu, Sep 14, 2017 at 1:13 PM, Richard Sandiford wrote: > Previously VECTOR_CST_NELTS (t) read the number of elements from > TYPE_VECTOR_SUBPARTS (TREE_TYPE (t)). There were two ways of handling > this with variable TYPE_VECTOR_SUBPARTS: either forcibly convert

Re: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-14 Thread Markus Trippelsdorf
On 2017.09.14 at 14:36 +0200, Jakub Jelinek wrote: > On Thu, Sep 14, 2017 at 12:10:50PM +, Shalnov, Sergey wrote: > > GCC has the option "mprefer-avx128" to use 128-bit AVX registers instead > > of 256-bit AVX registers in the auto-vectorizer. > > > This patch enables the command line option

Re: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-14 Thread Jakub Jelinek
On Thu, Sep 14, 2017 at 12:10:50PM +, Shalnov, Sergey wrote: > GCC has the option "mprefer-avx128" to use 128-bit AVX registers instead > of 256-bit AVX registers in the auto-vectorizer. > This patch enables the command line option "mprefer-avx256" that reduces > 512-bit registers usage in

Re: [RFC] Make 4-stage PGO bootstrap really working

2017-09-14 Thread Martin Liška
PING^1 On 08/30/2017 11:45 AM, Martin Liška wrote: > Hi. > > This is follow up which I've just noticed. Main problem we have is that > an instrumented compiler w/ -fprofile-generate (built in $OBJDIR/gcc > subfolder) > will generate all *.gcda files in a same dir as *.o files. That's

[PATCH][RFC] Radically simplify emission of balanced tree for switch statements.

2017-09-14 Thread Martin Liška
Hello. As mentioned at Cauldron 2017, second step in switch lowering should be massive simplification in code that does expansion of balanced tree. Basically it includes VRP and DCE, which we can for obvious reason do by our own. The patch does that, and introduces a separate pass for -O0

[PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-14 Thread Shalnov, Sergey
Hi, GCC has the option "mprefer-avx128" to use 128-bit AVX registers instead of 256-bit AVX registers in the auto-vectorizer. This patch enables the command line option "mprefer-avx256" that reduces 512-bit registers usage in "march=skylake-avx512" mode. This is the initial implementation of the

Re: [PATCH] Enhance PHI processing in VN

2017-09-14 Thread Richard Biener
On Thu, 7 Sep 2017, Richard Biener wrote: > On Thu, 7 Sep 2017, Richard Biener wrote: > > > > > This enhances VN to do the same PHI handling as CCP, meeting > > undefined and constant to constant. I've gone a little bit > > further (and maybe will revisit this again) in also meeting > >

[PATCH][x86] Knights Mill -march/-mtune options

2017-09-14 Thread Peryt, Sebastian
Hi, This patch adds options -march=/-mtune=knm for Knights Mill. 2017-09-14 Sebastian Peryt gcc/ * config.gcc: Support "knm". * config/i386/driver-i386.c (host_detect_local_cpu): Detect "knm". * config/i386/i386-c.c

Make more use of gimple-fold.h in tree-vect-loop.c

2017-09-14 Thread Richard Sandiford
This patch makes the vectoriser use the gimple-fold.h routines in more cases, instead of vect_init_vector. Later patches want to use the same interface to handle variable-length vectors. Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. OK to install? Richard 2017-09-14

Add LOOP_VINFO_MAX_VECT_FACTOR

2017-09-14 Thread Richard Sandiford
Epilogue vectorisation uses the vectorisation factor of the main loop as the maximum vectorisation factor allowed for correctness. That makes sense as a conservatively correct value, since the chosen vectorisation factor will be strictly less than that anyway. However, once the VF itself becomes

Add a vect_get_dr_size helper function

2017-09-14 Thread Richard Sandiford
This patch adds a helper function for getting the number of bytes accessed by a scalar data reference, which helps when general modes have a variable size. Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. OK to install? Richard 2017-09-14 Richard Sandiford

Add a vect_worthwhile_without_simd_p helper routine

2017-09-14 Thread Richard Sandiford
The vectoriser sometimes considers lowering "vector" operations into N scalar word operations. This N needs to be fixed at compile time, so the condition guarding it needs to change when variable-length vectors are added. This patch puts the condition into a helper routine so that there's only

Add a vect_get_num_copies helper routine

2017-09-14 Thread Richard Sandiford
This patch adds a vectoriser helper routine to calculate how many copies of a vector statement we need. At present this is always: LOOP_VINFO_VECT_FACTOR (loop_vinfo) / TYPE_VECTOR_SUBPARTS (vectype) but later patches add other cases. Another benefit of using a helper routine is that it can
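The calculation quoted in the snippet above (vectorisation factor divided by the number of vector subparts) can be sketched in isolation. This is only an illustration with plain integers, not the GCC routine: `num_copies` and its parameters are hypothetical names, and the assertion that the factor is an exact multiple of the subparts count is an assumption made for this sketch.

```c
#include <assert.h>

/* Hypothetical stand-in for the quoted formula
   LOOP_VINFO_VECT_FACTOR (loop_vinfo) / TYPE_VECTOR_SUBPARTS (vectype),
   using plain integers instead of GCC's loop_vinfo and tree types.  */
static unsigned int
num_copies (unsigned int vect_factor, unsigned int vector_subparts)
{
  /* Assumption for this sketch: the factor is a non-zero exact
     multiple of the number of elements per vector.  */
  assert (vector_subparts != 0);
  assert (vect_factor % vector_subparts == 0);
  return vect_factor / vector_subparts;
}
```

For instance, under these assumptions a factor of 8 with 4 elements per vector gives `num_copies (8, 4) == 2`.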

Add gimple_build_vector* helpers

2017-09-14 Thread Richard Sandiford
This patch adds gimple-fold.h equivalents of build_vector and build_vector_from_val. Like the other gimple-fold.h routines they always return a valid gimple value and add any new statements to a given gimple_seq. In combination with later patches this reduces the number of force_gimple_operands.

Use vec<> for constant permute masks

2017-09-14 Thread Richard Sandiford
This patch makes can_vec_perm_p & co. take a vec<>, wrapped in new typedefs vec_perm_indices and auto_vec_perm_indices. There are two reasons for doing this for SVE: (1) it means that the number of elements is bundled with the elements themselves, and is obviously constant. (2) it makes it

Use vec<> in build_vector

2017-09-14 Thread Richard Sandiford
This patch makes build_vector take the elements as a vec<> rather than a tree *. This is useful for SVE because it bundles the number of elements with the elements themselves, and enforces the fact that the number is constant. Also, I think things like the folds can be used with any generic GNU

Store VECTOR_CST_NELTS directly in tree_node

2017-09-14 Thread Richard Sandiford
Previously VECTOR_CST_NELTS (t) read the number of elements from TYPE_VECTOR_SUBPARTS (TREE_TYPE (t)). There were two ways of handling this with variable TYPE_VECTOR_SUBPARTS: either forcibly convert the number to a constant (which is doable) or store the number directly in the VECTOR_CST. The

Re: [PATCH, PR81844] Fix condition folding in c_parser_omp_for_loop

2017-09-14 Thread Jakub Jelinek
On Mon, Aug 14, 2017 at 10:25:22AM +0200, Tom de Vries wrote: > 2017-08-14 Tom de Vries > > PR c/81844 Please use PR c/81875 instead, now that you've filed it. > * c-parser.c (c_parser_omp_for_loop): Fix condition folding. Fold only operands of cond, not

Re: [PATCH, rs6000] [v2] Folding of vector loads in GIMPLE

2017-09-14 Thread Richard Biener
On Wed, Sep 13, 2017 at 10:14 PM, Bill Schmidt wrote: > On Sep 13, 2017, at 10:40 AM, Bill Schmidt > wrote: >> >> On Sep 13, 2017, at 7:23 AM, Richard Biener >> wrote: >>> >>> On Tue, Sep 12, 2017 at 11:08

Re: Add option for whether ceil etc. can raise "inexact", adjust x86 conditions

2017-09-14 Thread Richard Biener
On Wed, Sep 13, 2017 at 7:34 PM, Martin Jambor wrote: > Hello, > > I apologize for not coming back to this, I keep on getting distracted. > Anyway... > > On Tue, Aug 15, 2017 at 02:20:55PM +, Joseph Myers wrote: >> On Tue, 15 Aug 2017, Martin Jambor wrote: >> >> > I am not

Re: [testsuite, sparc] Don't xfail gcc.dg/vect/vect-multitypes-12.c on 32-bit SPARC (PR tree-optimization/80996)

2017-09-14 Thread Rainer Orth
Hi Richard, > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { > target > sparc*-*-* xfail ilp32 } } } */ > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { > target > sparc*-*-* } } } */ > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1

Re: [PATCH] Make __FUNCTION__ a mergeable string and do not generate symbol entry.

2017-09-14 Thread Martin Liška
On 08/10/2017 09:43 PM, Jason Merrill wrote: > On 07/14/2017 01:35 AM, Martin Liška wrote: >> On 05/01/2017 09:13 PM, Jason Merrill wrote: >>> On Wed, Apr 26, 2017 at 6:58 AM, Martin Liška wrote: On 04/25/2017 01:58 PM, Jakub Jelinek wrote: > On Tue, Apr 25, 2017 at

[AArch64] Improve LDP/STP generation that requires a base register

2017-09-14 Thread Jackson Woodruff
Hi all, This patch generalizes the formation of LDP/STP that require a base register. Previously, we would only accept address pairs that were ordered in ascending or descending order, and only strictly sequential loads/stores. This patch improves that by allowing us to accept all orders

Re: [testsuite, sparc] Don't xfail gcc.dg/vect/vect-multitypes-12.c on 32-bit SPARC (PR tree-optimization/80996)

2017-09-14 Thread Richard Biener
On Thu, 14 Sep 2017, Rainer Orth wrote: > Since > > 2017-06-02 Richard Biener > > * tree-vect-loop.c (vect_analyze_loop_operations): Not relevant > PHIs are ok. > * tree-vect-stmts.c (process_use): Do not mark backedge defs > for inductions

[testsuite, sparc] Don't xfail gcc.dg/vect/vect-multitypes-12.c on 32-bit SPARC (PR tree-optimization/80996)

2017-09-14 Thread Rainer Orth
Since 2017-06-02 Richard Biener * tree-vect-loop.c (vect_analyze_loop_operations): Not relevant PHIs are ok. * tree-vect-stmts.c (process_use): Do not mark backedge defs for inductions as relevant. gcc.dg/vect/vect-multitypes-12.c XPASSes on

Minor tweak to dwarf2out_source_line

2017-09-14 Thread Eric Botcazou
The function contains these lines: if (debug_column_info) fprint_ul (asm_out_file, column); else putc ('0', asm_out_file); but they are dominated by: if (!debug_column_info) column = 0; Bootstrapped/regtested on x86_64-suse-linux, applied on mainline as
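The redundancy described in the snippet above can be reproduced outside the compiler. The sketch below is not the dwarf2out.c code: `column_original` and `column_simplified` are hypothetical names, and `snprintf` into a buffer stands in for `fprint_ul` and `putc` on `asm_out_file`. It only shows that once `column` has been zeroed when the flag is off, testing the flag a second time cannot change the output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Shape described in the message: the column is zeroed first,
   then the same flag is tested again before printing.  */
static void
column_original (char *buf, size_t len, bool debug_column_info,
                 unsigned long column)
{
  if (!debug_column_info)
    column = 0;

  if (debug_column_info)
    snprintf (buf, len, "%lu", column);
  else
    snprintf (buf, len, "0");
}

/* Equivalent shape: the second test is dominated by the first,
   so printing the (possibly zeroed) column is enough.  */
static void
column_simplified (char *buf, size_t len, bool debug_column_info,
                   unsigned long column)
{
  if (!debug_column_info)
    column = 0;

  snprintf (buf, len, "%lu", column);
}
```

Both functions write the same text for any flag and column value, which is why the second test can be dropped.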

[committed] Formatting fixes in the combiner

2017-09-14 Thread Jakub Jelinek
Hi! While debugging this function I've noticed way too many formatting issues and fixed them, committed as obvious to trunk: 2017-09-14 Jakub Jelinek * combine.c (make_compound_operation_int): Formatting fixes. --- gcc/combine.c.jj2017-09-12 21:58:06.0

[PATCH] Add comments to struct cgraph_thunk_info

2017-09-14 Thread Pierre-Marie de Rodat
Hello, This commit adds comments to fields in the cgraph_thunk_info structure declaration from cgraph.h. They will hopefully answer questions that people like myself can ask while discovering the thunk machinery. I also made an assertion stricter in cgraph_node::create_thunk. I'm adding Nathan

Re: [OpenACC] Enable SIMD vectorization on vector loops

2017-09-14 Thread Jakub Jelinek
On Wed, Sep 13, 2017 at 04:20:32PM -0700, Cesar Philippidis wrote: > 2017-09-13 Cesar Philippidis > > gcc/ > * omp-offload.c (oacc_xform_loop): Enable SIMD vectorization on > non-SIMT targets in acc vector loops. Ok, thanks. Jakub