Re: [PATCH] [AArch64] support -mfentry feature for arm64

2016-04-19 Thread Michael Matz
Hi, On Tue, 19 Apr 2016, Andrew Haley wrote: > > I will happily declare any implementation where it's impossible to > > safely patch the instruction stream by flushing the respective buffers > > or other means completely under control of the patching machinery, to > > be broken by design. > >

Re: Free up bits in DECLs and TYPEs

2016-04-18 Thread Michael Matz
Hi, On Thu, 10 Dec 2015, Bernd Schmidt wrote: > On 12/10/2015 04:04 PM, Michael Matz wrote: > > This isn't stage 3 material really, OTOH fairly low risk. Anyway, okay > > for trunk now or once stage 1 opens? > > This is cool and we want it, but not now. Ok for stage

Re: [PATCH] [AArch64] support -mfentry feature for arm64

2016-04-18 Thread Michael Matz
Hi, On Mon, 18 Apr 2016, Andrew Haley wrote: > >> That may not be safe. Consider an implementation which looks ahead > >> in the instruction stream and decodes the instructions speculatively. > > > > It should go without saying that patching instructions is followed by > > whatever means nece

Re: [PATCH] [AArch64] support -mfentry feature for arm64

2016-04-18 Thread Michael Matz
Hi, On Mon, 18 Apr 2016, Andrew Haley wrote: > On 04/15/2016 06:29 PM, Alexander Monakov wrote: > > > Alternatively: replace first nop with a short forward branch that > > jumps over the rest of the pad, patch rest of the pad, patch the > > initial forward branch. > > That may not be safe. Con

Re: [PATCH] [AArch64] support -mfentry feature for arm64

2016-04-18 Thread Michael Matz
Hi, On Sun, 17 Apr 2016, Alexander Monakov wrote: > I've noticed an issue in my (and probably Michael's) solution: if > there's a thread that made it past the first nop, but is still executing > the nop pad, it's unsafe to replace the nops. To solve that, it > suffices to have a forward branc

Re: [PATCH] [AArch64] support -mfentry feature for arm64

2016-04-15 Thread Michael Matz
Hi, On Thu, 14 Apr 2016, Maxim Kuvyrkov wrote: > It appears that implementing -fprolog-pad=N option in GCC will not > enable kernel live-patching support for AArch64. The proposal for the > option was to make GCC output a given number of NOPs at the beginning of > each function, and then the

Re: Fix PR44281 (bad RA with global regs)

2016-02-29 Thread Michael Matz
Hi, On Mon, 29 Feb 2016, Mikael Pettersson wrote: > Well, almost. While it is true that a signal handler cannot > *accidentally* clobber the register state of the interrupted thread, it > can in fact access and update any part of that state via the ucontext_t > passed to it. Doing so is unco

Re: Fix PR44281 (bad RA with global regs)

2016-02-29 Thread Michael Matz
Hi, On Mon, 29 Feb 2016, Bernd Schmidt wrote: > On 02/29/2016 06:07 PM, Michael Matz wrote: > > > %rbx would have to be implicitly used/clobbered by the asm. In addition > > it would have to be used by all function entries and exits (so that a > > function body whe

Re: Fix PR44281 (bad RA with global regs)

2016-02-29 Thread Michael Matz
Hi, On Fri, 26 Feb 2016, Bernd Schmidt wrote: > Calls do, asms currently don't AFAICT. Not sure whether it's allowed to > use them, but I think it should be straightforward to adjust df-scan. > > > Some jit-like code uses global reg vars to reserve registers for the > > generated code. It wou

Re: [PATCH 10/9] ENABLE_CHECKING refactoring: remove remaining occurrences

2016-02-24 Thread Michael Matz
Hi, On Wed, 24 Feb 2016, Martin Liška wrote: > >> grep ENABLE_CHECKING *.[ch] > > dwarf2out.c:#if ENABLE_CHECKING > > dwarf2out.c:#if ENABLE_CHECKING > > dwarf2out.c:#if ENABLE_CHECKING > > dwarf2out.h:#if ENABLE_CHECKING > > Hi Richi. > > Removal in dwarf2out.c is not possible due to assignmen

Re: Fix PR44281 (bad RA with global regs)

2016-02-22 Thread Michael Matz
Hi, On Mon, 22 Feb 2016, Jeff Law wrote: > > never considers them as candidates. However, we do seem to have proper > > data flow information for them. IMO one of the points of global reg vars, and its long-standing documentation to that effect, is that we do not have proper data flow informat

Re: [PATCH] Fix reassoc ICE (PR tree-optimization/69802)

2016-02-15 Thread Michael Matz
Hi, On Mon, 15 Feb 2016, Jakub Jelinek wrote: > + /* If op is default def SSA_NAME, there is no place to insert the > + new comparison. Give up, unless we can use OP itself as the > + range test. */ > + if (op && SSA_NAME_IS_DEFAULT_DEF (op)) > +{ > + if (op == range->exp > +

Re: [patch] Fix timevar internal consistency failure

2016-02-10 Thread Michael Matz
Hi, On Wed, 10 Feb 2016, David Malcolm wrote: > > +static timevar_id_t global_phase; > > FWIW I like the idea, but could this be a private field within class > timer, rather than a global? Sure, consider the patch amended accordingly. Ciao, Michael.

Re: [patch] Fix timevar internal consistency failure

2016-02-10 Thread Michael Matz
Hi, On Wed, 10 Feb 2016, Richard Biener wrote: > > The problem is that TV_PHASE_DBGINFO is now nested within > > TV_PHASE_OPT_GEN, which violates the above mutual exclusivity > > requirement. Therefore the attached patch simply gets rid of > > TV_PHASE_DBGINFO (as well as of the sibling TV_PH

Re: [PATCH] Fix PR69274, 435.gromacs performance regression due to RA

2016-02-08 Thread Michael Matz
Hi, On Mon, 8 Feb 2016, Richard Biener wrote: > 429.mcf 9120 243 37.6 S 9120 245 37.3 S > 429.mcf 9120 224 40.7 S 9120 241 37.8 * > 429.mcf 9120 225 40.5 * 9120 229 39.9 S > 471.omne

Re: Speedup configure and build with system.h

2016-01-26 Thread Michael Matz
Hi, On Tue, 26 Jan 2016, Uros Bizjak wrote: > > Meh. Can you try the attached patch with a configure test (it > > includes the generated files)? It works for me with 4.3.4, and should > > make your build include always. > > Yes, this patch works for me and allows bootstrap with gcc-4.1.2 to

Re: Speedup configure and build with system.h

2016-01-25 Thread Michael Matz
Hi, On Mon, 25 Jan 2016, Uros Bizjak wrote: > This patch caused bootstrap failure on non-c++11 bootstrap compiler > [1], e.g. CentOS 5.11. > > The problem is with std::swap, which was defined in header <algorithm> > until c++11 [2]. > > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69464 > [2] http://e

Re: Speedup configure and build with system.h

2016-01-25 Thread Michael Matz
Hi, On Fri, 22 Jan 2016, Jakub Jelinek wrote: > > > This may have caused: > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69434 > > > > Guess we need: > > > > 2016-01-22 Jakub Jelinek > > > > PR bootstrap/69434 > > * genrecog.c: Define INCLUDE_ALGORITHM before including sy

Re: Speedup configure and build with system.h

2016-01-22 Thread Michael Matz
Hi, On Thu, 21 Jan 2016, Richard Biener wrote: > I'm inclined to say #define INCLUDE_ALGORITHM is a better name, I've done this. On a different (slower) machine than the one from the initial mail: without patch, -j31 bootstrap all,ada: real 35m2.655s user 395m28.135s sys 12m10.814s

Re: Speedup configure and build with system.h

2016-01-22 Thread Michael Matz
Hi, On Fri, 22 Jan 2016, Oleg Endo wrote: > and have been put into system.h because there have > been problems with malloc poisoning and C++ stdlib implementation other > than libstdc++, which sometimes pull other headers which then cause > trouble. The fix for this set of errors was to inc

Speedup configure and build with system.h

2016-01-21 Thread Michael Matz
Hi, this has bothered me for some time. The gcc configure with stage1 feels like taking forever because some of the decl availability tests (checking for C function) include system.h, and that, since a while, unconditionally includes <algorithm> and <utility> under C++, and we meanwhile use the C++ compiler for

Re: [PATCH] Fix PR c++/21802 (two-stage name lookup fails for operators)

2015-12-16 Thread Michael Matz
Hi, On Mon, 14 Dec 2015, Patrick Palka wrote: > >>> >This should use cp_tree_operand_length. > >> Hmm, I don't immediately see how I can use this function here. It > >> expects a tree but I don't have an appropriate tree to give to it, only a > >> tree_code. > > > > True. So let's introduce cp_t

Gather hash-tab statistics only with GATHER_STATISTICS

2015-12-10 Thread Michael Matz
Hi, while profiling cc1plus I noticed high hash-table activity for gathering statistics, even though I haven't configured with --enable-gather-detailed-mem-stats. Turns out the hash table rewrite hard-coded the relevant settings to true. This patch makes it initialized by GATHER_STATISTICS.

Free up bits in DECLs and TYPEs

2015-12-10 Thread Michael Matz
Hello, the other day Richi wondered why we specify alignment in bits, instead of in log2, as if e.g. a 12 byte alignment would make much sense (sure, an alignment of 12 byte means the address is evenly dividable by 12; great!). This patch changes the two places where we specify alignment (types
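The point about storing alignment as a log2 value rather than as a bit count can be illustrated with a small sketch. This is not the encoding from the patch; `encode_align` and `decode_align` are hypothetical names, and the convention "0 means no alignment recorded, k > 0 means 2^(k-1)" is an assumption made only for this example. It shows why a non-power-of-two value such as 12 simply has no representation, and why a field of about 6 bits is enough where a full integer was used before.

```c
#include <assert.h>

/* Hypothetical encoding, not taken from the patch:
   0 means "no alignment recorded", k > 0 means an alignment of 2^(k-1).  */
unsigned
encode_align (unsigned align)
{
  /* Only 0 and powers of two are representable; 12 would be rejected.  */
  assert ((align & (align - 1)) == 0);

  unsigned k = 0;
  while (align)
    {
      k++;
      align >>= 1;
    }
  return k;
}

/* Inverse of encode_align, assuming a 32-bit unsigned.  */
unsigned
decode_align (unsigned k)
{
  assert (k <= 32);
  return k ? 1u << (k - 1) : 0;
}
```

With this scheme an alignment of 128 is stored as 8, so the stored value never exceeds 32 for a 32-bit alignment range.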

Re: Gimple loop splitting v2

2015-12-02 Thread Michael Matz
Hi, On Tue, 1 Dec 2015, Jeff Law wrote: > > So, okay for trunk? > -ENOPATCH Sigh :) Here it is. Ciao, Michael. * common.opt (-fsplit-loops): New flag. * passes.def (pass_loop_split): Add. * opts.c (default_options_table): Add OPT_fsplit_loops entry at -O3. (enab

Gimple loop splitting v2

2015-12-01 Thread Michael Matz
Hi, On Mon, 16 Nov 2015, Jeff Law wrote: > OK, if you want to keep them, then have a consistent way to turn them > on/off for future debugging. if0/if1 doesn't provide much of a clue to > someone else what to turn on/off if they need to debug this stuff. > > > I don't see any negative tests -

Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-25 Thread Michael Matz
Hi, On Wed, 25 Nov 2015, Bernd Schmidt wrote: > So here's a very basic version which I think is appropriate for the > current stage, and can be extended later. Ok if it passes testing? When we're improving that place, we should really only consider ASMs that change memory state to be problemat

Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-25 Thread Michael Matz
Hi, On Wed, 25 Nov 2015, Jakub Jelinek wrote: > > That looks bogus to me. It misses asm()s and at least today > > nonfreeing_call_p too much checks what it sounds like it checks. In > > practice it might work though. At least all the __sync_* and > > __atomic_* calls are _not_ barriers this

Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-25 Thread Michael Matz
Hi, On Wed, 25 Nov 2015, Richard Biener wrote: > I don't think so. Btw, if you want to add this please add a new gimple > predicate to identify "memory barrier" (any call or asm with a VDEF). if (is_gimple_call (stmt) && !nonfreeing_call_p (stmt)) nt_call_phase++; Ciao, Michael

Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-25 Thread Michael Matz
Hi, On Wed, 25 Nov 2015, Bernd Schmidt wrote: > On 11/23/2015 05:05 PM, Michael Matz wrote: > > > > It only does so under some conditions, amongst them if it sees a > > dominating access to the same memory of the same type (load or store) and > > size. So it doesn'

Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-23 Thread Michael Matz
Hi, On Fri, 20 Nov 2015, Jeff Law wrote: > > I'm undecided on whether cs-elim is safe wrt the store speculation vs > > locks concerns raised in the thread discussing Ian's > > noce_can_store_speculate_p, but that's not something we have to consider > > to solve the problem at hand. > I don't thin

Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-23 Thread Michael Matz
Hi, On Fri, 20 Nov 2015, Bernd Schmidt wrote: > On 11/19/2015 12:49 AM, Jeff Law wrote: > > On 11/18/2015 12:16 PM, Bernd Schmidt wrote: > > > I don't think so, actually. One safe option would be to rip it out and > > > just stop transforming this case, but let's start by looking at the code > >

Fix -fno-checking segfault

2015-11-19 Thread Michael Matz
Hi, in an enabled-checking compiler gcc_checking_assert is always executed. If that depends on things having happened under flag_checking being true, but it's actually false during runtime due to -fno-checking things go awry, like segfaulting in this case. The fix is obvious and checked in a

Re: Extend tree-call-cdce to calls whose result is used

2015-11-16 Thread Michael Matz
Hi, On Mon, 16 Nov 2015, Richard Biener wrote: > >> Which would leave us with a lowering stage early in the main > >> optimization pipeline - I think fold_builtins pass is way too late > >> but any "folding" pass will do (like forwprop or backprop where the > >> latter might be better because

Re: Gimple loop splitting

2015-11-16 Thread Michael Matz
Hi, On Thu, 12 Nov 2015, Jeff Law wrote: > > this new pass implements loop iteration space splitting for loops that > > contain a conditional that's always true for one part of the iteration > > space and false for the other, i.e. such situations: > FWIW, Ajit suggested the same transformation ea

Gimple loop splitting

2015-11-12 Thread Michael Matz
Hello, this new pass implements loop iteration space splitting for loops that contain a conditional that's always true for one part of the iteration space and false for the other, i.e. such situations: for (i = beg; i < end; i++) if (i < p) dothis(); else dothat(); this i
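A hand-written sketch of the transformation described above, for readers who want to see the shape of the result. It is not output of the pass. The `trace` buffer and `record` helper are hypothetical stand-ins for the side effects of `dothis()` and `dothat()`, and the clamping of the split point is an assumption about how the boundary has to be handled when `p` lies outside `[beg, end]`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace buffer standing in for the side effects of the
   dothis ()/dothat () calls in the mail.  */
enum { TRACE_MAX = 64 };
struct trace { char ev[TRACE_MAX]; size_t n; };

static void
record (struct trace *t, char c)
{
  assert (t->n < TRACE_MAX);
  t->ev[t->n++] = c;
}

/* Original form: the condition is evaluated on every iteration.  */
void
loop_unsplit (struct trace *t, int beg, int end, int p)
{
  for (int i = beg; i < end; i++)
    if (i < p)
      record (t, 'T');   /* dothis ()  */
    else
      record (t, 'E');   /* dothat ()  */
}

/* Split form: one loop per part of the iteration space, no condition
   inside either loop body.  */
void
loop_split (struct trace *t, int beg, int end, int p)
{
  /* Clamp the split point into [beg, end].  */
  int mid = p < beg ? beg : (p > end ? end : p);
  int i = beg;

  for (; i < mid; i++)
    record (t, 'T');
  for (; i < end; i++)
    record (t, 'E');
}

/* Returns 1 if both traces hold the same sequence of events.  */
int
trace_equal (const struct trace *a, const struct trace *b)
{
  if (a->n != b->n)
    return 0;
  for (size_t i = 0; i < a->n; i++)
    if (a->ev[i] != b->ev[i])
      return 0;
  return 1;
}
```

Both functions should produce the same event sequence for any `beg`, `end` and `p`, which is the property the split has to preserve.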

Re: RFC: Incomplete Draft Patches to Correct Errors in Loop Unrolling Frequencies (bugzilla problem 68212)

2015-11-10 Thread Michael Matz
Hi, On Tue, 10 Nov 2015, Richard Biener wrote: > >> +static bool > >> +same_edge_p (edge an_edge, edge another_edge) > >> +{ > >> + return ((an_edge->src == another_edge->src) > >> + && (an_edge->dest == another_edge->dest)); > >> +} > > > > > > Formatting aside (extra parentheses), I wo

Re: Extend tree-call-cdce to calls whose result is used

2015-11-09 Thread Michael Matz
Hi, On Mon, 9 Nov 2015, Richard Sandiford wrote: > +static bool > +can_use_internal_fn (gcall *call) > +{ > + /* Only replace calls that set errno. */ > + if (!gimple_vdef (call)) > +return false; Oh, I managed to confuse this in my head while reading the patch. So, hmm, you don't actua

Re: Extend tree-call-cdce to calls whose result is used

2015-11-09 Thread Michael Matz
Hi, On Mon, 9 Nov 2015, Richard Sandiford wrote: > -ffast-math would already cause us to treat the function as not setting > errno, so the code wouldn't be used. What is "the code"? I don't see any checking of the relevant flags in tree-call-cdce.c, so I wonder what would prevent the addition

Re: [PATCH] Minor refactoring in tree-ssanames.c & freelists verifier

2015-11-09 Thread Michael Matz
Hi, On Mon, 9 Nov 2015, Jeff Law wrote: +verify_ssaname_freelists (struct function *fun) +{ + /* Do nothing if we are in RTL format. */ + basic_block bb; + FOR_EACH_BB_FN (bb, fun) +{ + if (bb->flags & BB_RTL) + return; +} gimple_in_ssa_p (fun); + /* Then note the op

Re: Extend tree-call-cdce to calls whose result is used

2015-11-09 Thread Michael Matz
Hi, On Sat, 7 Nov 2015, Richard Sandiford wrote: > For -fmath-errno, builtins.c currently expands calls to sqrt to: > > y = sqrt_optab (x); > if (y != y) > [ sqrt (x); or errno = EDOM; ] > > - the call to sqrt is protected by the result of the optab rather > than the input. It

Re: Division Optimization in match and simplify

2015-11-05 Thread Michael Matz
Hi, On Wed, 4 Nov 2015, Richard Biener wrote: > Ah, it was _left_ shift of negative values that ubsan complains about. Note that this is only for the frontend definition of shifts. I don't see why gimple shouldn't define it to the only sensible definition there is, which also happens to be th

Re: Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))

2015-10-19 Thread Michael Matz
Hi, On Fri, 16 Oct 2015, David Malcolm wrote: > This fixes much of the bloat seen for influence.i when sending ranges > through for every token. Yeah, I think that's on the right track. > This was with 8 bits allocated for packed ranges (which is probably > excessive, but it makes debugging e

Re: Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))

2015-10-14 Thread Michael Matz
Hi, On Wed, 14 Oct 2015, Richard Biener wrote: > The compile-time and memory-usage impact for the adhocloc at every token > patchkit is quite big. Remember that gaining 1% in compile-time is hard > and 20-40% memory increase for influence.i looks too much. Yes. OTOH the compile time and memo

Re: [patch 0/3] Header file reduction.

2015-10-08 Thread Michael Matz
Hi, On Wed, 7 Oct 2015, Richard Biener wrote: > > I'm probably the last person in the world that still generally prefers > > -cp :-) I'm getting to the point where I can tolerate -u. > > No, I prefer -cp too - diff just too easily makes a mess out of diffs > with -u, esp. if you have re-inden

Re: [Patch match.pd] Add a simplify rule for x * copysign (1.0, y);

2015-10-01 Thread Michael Matz
Hi, On Thu, 1 Oct 2015, Joseph Myers wrote: > On Thu, 1 Oct 2015, Michael Matz wrote: > > > both cases. The catch is that strictly speaking (NaN * -1.0) needs to > > deliver NaN, not -NaN (operations involving quiet NaNs need to provide > > one of the input NaNs as re

Re: [Patch match.pd] Add a simplify rule for x * copysign (1.0, y);

2015-10-01 Thread Michael Matz
Hi, On Thu, 1 Oct 2015, Jakub Jelinek wrote: > But if x is a sNaN, then the multiplication will throw an exception, while > the transformed operation will not. Hmm, that's right, silly me. > So perhaps it should be guarded by > !HONOR_SNANS (TYPE_MODE (type)) > ? That makes sense, yes. Ciao,

Re: [Patch match.pd] Add a simplify rule for x * copysign (1.0, y);

2015-10-01 Thread Michael Matz
Hi, On Thu, 1 Oct 2015, James Greenhalgh wrote: > > > x * copysign (1.0, y) > > > > > > x ^ (y & (1 << sign_bit_position)) > > > > Also I think this can only be done for finite and non trapping types. > > That may be well true, I swithered either way and went for no checks, > but I'd happi
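The rewrite quoted above, `x * copysign (1.0, y)` into an XOR with the sign bit of `y`, can be sketched in plain C. This assumes IEEE 754 binary64 `double` values of the same size as `uint64_t`; the function names are hypothetical and this is not the match.pd rule itself. As the replies in this thread point out, the two forms are not interchangeable for signaling NaNs, so the sketch should only be read as covering finite, non-NaN inputs.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reference form from the mail.  */
double
mul_copysign (double x, double y)
{
  return x * copysign (1.0, y);
}

/* Bitwise form: flip the sign of x exactly when y has its sign bit set.
   Assumes IEEE 754 binary64 doubles.  */
double
xor_sign (double x, double y)
{
  uint64_t xb, yb;

  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);

  xb ^= yb & (UINT64_C (1) << 63);   /* sign bit is bit 63 */

  memcpy (&x, &xb, sizeof x);
  return x;
}
```

The multiplication form goes through the floating-point unit and can raise exceptions, while the XOR form is a pure integer operation on the representation, which is where the difference for signaling NaNs comes from.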

Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)

2015-09-23 Thread Michael Matz
Hi, On Wed, 23 Sep 2015, Richard Biener wrote: > The issue we have with LTO is that the linemap gets populated in quite > random order and thus we repeatedly switch files (we've mitigated this > somewhat for GCC 5). Yes. > We also considered dropping column info (and would drop range info) as

Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)

2015-09-23 Thread Michael Matz
Hi, On Tue, 22 Sep 2015, David Malcolm wrote: > The drawback is that it could bloat the ad-hoc table. Can the ad-hoc > table ever get smaller, or does it only ever get inserted into? It only ever grows. > An idea I had is that we could stash short ranges directly into the 32 > bits of locatio

Re: [PATCH WIP] Use Levenshtein distance for various misspellings in C frontend v2

2015-09-16 Thread Michael Matz
Hi, On Wed, 16 Sep 2015, Richard Biener wrote: > Btw, this looks quite expensive - I'm sure we want to limit the effort > here a bit. I'm not so sure. It's only used for printing an error, so walking all available decls is expensive but IMHO not too much so. > I don't want us to suggest using

Re: [PATCH 04/22] Reimplement diagnostic_show_locus, introducing rich_location classes

2015-09-11 Thread Michael Matz
Hi, On Thu, 10 Sep 2015, David Malcolm wrote: > +/* A range of source locations. > + > + Ranges are half-open: > + m_start is the first location within the range, whereas > + m_finish is the first location *after* the range. I think you eventually decided that they are closed, not half-ope

Re: [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs

2015-09-11 Thread Michael Matz
Hi, On Thu, 10 Sep 2015, David Malcolm wrote: > Does anyone know why this was "carefully packed" and to what extent > this matters? I'm adding an extra 8 bytes to it (or 4 if we eliminate > the existing location_t). As far as I can see, these are > short-lived, and there are only relative few a

Re: [PATCH 04/22] Reimplement diagnostic_show_locus, introducing rich_location classes

2015-09-11 Thread Michael Matz
Hi, On Thu, 10 Sep 2015, David Malcolm wrote: > +/* FIXME: (dmalcolm) > + This plugin is currently the only user of > + gcc_rich_location::add_range_with_caption > + As such, the symbol is present in libbackend.a, but not in "cc1", > + and running the plugin fails with a linker error: >

Re: New power of 2 hash policy

2015-09-11 Thread Michael Matz
Hi, On Thu, 10 Sep 2015, François Dumont wrote: > Here is a patch to offer an alternative hash policy. This one is > using power of 2 number of buckets allowing a faster modulo operation. > This is obvious when running the performance test that I have adapted to > use this alternative poli
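The reason a power-of-2 bucket count allows a faster modulo can be shown in a few lines. This is a generic C sketch of the idea, not the libstdc++ policy class from the patch; `is_pow2`, `next_pow2` and `bucket_index` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if n is a power of two.  */
int
is_pow2 (size_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}

/* Smallest power of two that is >= n (and >= 1).  */
size_t
next_pow2 (size_t n)
{
  /* Guard against overflow of the shift below.  */
  assert (n <= (SIZE_MAX >> 1) + 1);

  size_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

/* For a power-of-two bucket count, hash % nbuckets is the same as
   masking with nbuckets - 1, which avoids an integer division.  */
size_t
bucket_index (size_t hash, size_t nbuckets)
{
  assert (is_pow2 (nbuckets));
  return hash & (nbuckets - 1);
}
```

The trade-off is that only the low bits of the hash select the bucket, so such a policy depends more heavily on the hash function distributing its low bits well than a prime bucket count does.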

Re: [patch] Enable lightweight checks with _GLIBCXX_ASSERTIONS.

2015-09-08 Thread Michael Matz
Hi, On Mon, 7 Sep 2015, Jonathan Wakely wrote: > > Interesting. Is this mode ABI-compatible with the default mode? > > Yes, that's the main reason I want to make this change. > > > Should _FORTIFY_SOURCE imply _GLIBCXX_ASSERTIONS? > > Yes, I think it should. Then at least those assertions th

Re: [5/7] Allow gimple debug stmt in widen mode

2015-09-07 Thread Michael Matz
Hi, On Mon, 7 Sep 2015, Kugan wrote: > Allow GIMPLE_DEBUG with values in promoted register. Patch does much more. > gcc/ChangeLog: > > 2015-09-07 Kugan Vivekanandarajah > > * expr.c (expand_expr_real_1): Set proper SUNREG_PROMOTED_MODE for > SSA_NAME that was set by GIMPLE_CALL

Re: [4/7] Use correct promoted mode sign for result of GIMPLE_CALL

2015-09-07 Thread Michael Matz
Hi, On Mon, 7 Sep 2015, Kugan wrote: > For the following testcase (compiling with -O1; -O2 works fine), we have > a stmt with stm_code SSA_NAME (_7 = _ 6) and for which _6 is defined by > a GIMPLE_CALL. In this case, we are using wrong SUNREG promoted mode > resulting in wrong code. And why is t

Fix order of ENTRY and EXIT in reverse post order

2015-08-27 Thread Michael Matz
Hello, I've this in my tree since some time already. In-tree there is only one user of pre_and_rev_post_order_compute{,_fn} that actually wants entry and exit included, and that one is just a debug routine (draw_cfg_nodes_no_loops); so this bug right now is harmless. But I've used this for s

Re: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*

2015-08-19 Thread Michael Matz
Hi, On Wed, 19 Aug 2015, Richard Biener wrote: > I think tree_code is 64bits now. Huh? No; it's 16 bit since 8 bit run out. Ciao, Michael.

Re: [PATCH] Add warnings about GENERIC code-gen deficiencies in genmatch

2015-07-30 Thread Michael Matz
Hi, On Thu, 30 Jul 2015, Richard Biener wrote: > @@ -4174,11 +4267,13 @@ main (int argc, char **argv) >else if (strcmp (argv[i], "--generic") == 0) > gimple = false; >else if (strcmp (argv[i], "-v") == 0) > - verbose = true; > + verbose = 1; If you don't want to sta

Re: [PATCH][RFC] Re-work GIMPLE checking to be gimple class friendly

2015-07-27 Thread Michael Matz
Hi, On Mon, 27 Jul 2015, Richard Biener wrote: > > > static inline tree > > > gimple_assign_rhs1 (const_gimple gs) > > > { > > >GIMPLE_CHECK (gs, GIMPLE_ASSIGN); > > >return gimple_op (gs, 1); > > > } > > > > > > and the hidden checking is due to gimple_op being > > > > > > static i

Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE

2015-07-16 Thread Michael Matz
Hi, On Thu, 16 Jul 2015, Richard Earnshaw wrote: > >>> Now that we do have the problem, we can't fix it without an ARM port > >>> ABI change, which is undesirable, so we may have to fix it with a MI > >>> change. > >> > >> What's the ABI implication of fixing the inconsistency? > > > > I thin

Re: [PATCH] Add 'switch' statement to match.pd language

2015-07-16 Thread Michael Matz
Hi, On Thu, 16 Jul 2015, Richard Biener wrote: > > Similar, if the condition is an atom you should be able to leave the > > parens away: > > > > (switch > > cond (minus @0 @1) > > ) > > > > (given a predicate 'cond' defined appropriately). > > Yes. Though technically the condition cannot

Re: [PATCH 4/5] Downgrade value_expr_for_decl to non-cache

2015-07-15 Thread Michael Matz
Hi, On Wed, 15 Jul 2015, Richard Biener wrote: > >Or, maybe we're talking past each other. You mean the case where > >complicated-expr-on-Y is the value-expr, and Y is _no_ stale decl, but > >the complicated expr itself nevertheless is mentioned nowhere else? > >Yes, those trees must be reta

Re: [PATCH] Add 'switch' statement to match.pd language

2015-07-15 Thread Michael Matz
Hi, On Wed, 15 Jul 2015, Richard Biener wrote: > >> (switch > >> (A) B > >> (B) C > >> (C) D > >> E) > > > >The lispy way would have been > > > > (switch > >(A) (B) > >(C) (D) > >(E) (F) > >G) > > > >i.e. parenthesize the result as well, which then would be unambiguousl

Re: [PATCH 4/5] Downgrade value_expr_for_decl to non-cache

2015-07-15 Thread Michael Matz
Hi, On Wed, 15 Jul 2015, Michael Matz wrote: > Similar for "ptr->foo" if "ptr" is nowhere mentioned in code or tables. > In effect DECL_VALUE_EXPR refers to stale decls that aren't initialized, > aren't given a place and aren't dealt with in

Re: [PATCH 4/5] Downgrade value_expr_for_decl to non-cache

2015-07-15 Thread Michael Matz
Hi, On Wed, 15 Jul 2015, Jakub Jelinek wrote: > > No, I really meant value. If you think it has meaning, then tell me > > what it is for DECL_VALUE_EXPR (X) to be 'Y', if Y is nowhere else > > mentioned, neither in code, nor in local-decls, nor in globals, or > > anywhere else that would be r

Re: [PATCH 4/5] Downgrade value_expr_for_decl to non-cache

2015-07-15 Thread Michael Matz
Hi, On Wed, 15 Jul 2015, Jakub Jelinek wrote: > On Wed, Jul 15, 2015 at 04:14:07PM +0200, Michael Matz wrote: > > That's Tom's other approach with supporting multi-step dependencies. As I > > have tried to argue in the other thread, I think this idea is > > fundament

Re: [PATCH] Add 'switch' statement to match.pd language

2015-07-15 Thread Michael Matz
Hi, On Tue, 14 Jul 2015, Richard Biener wrote: > I know Micha detests the extra 'if' as much as the extra braces thus > would have prefered > > (switch > (A) B > (B) C > (C) D > E) The lispy way would have been (switch (A) (B) (C) (D) (E) (F) G) i.e. parenthesize t

Re: [PATCH 4/5] Downgrade value_expr_for_decl to non-cache

2015-07-15 Thread Michael Matz
Hi, On Tue, 14 Jul 2015, Richard Biener wrote: > For example have those special caches have two marking phases. The first > phase marks all non-key edges originating from each entry. The second > phase is the same as what we have now - unmarked entries get removed. > > The first phase would go

Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE

2015-07-15 Thread Michael Matz
Hi, On Tue, 14 Jul 2015, Jim Wilson wrote: > Now that we do have the problem, we can't fix it without an ARM port ABI > change, which is undesirable, so we may have to fix it with a MI change. What's the ABI implication of fixing the inconsistency? > There were two MI changes suggested, one wa

Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE

2015-07-15 Thread Michael Matz
Hi, On Tue, 14 Jul 2015, Richard Earnshaw wrote: > > I think it's a backend bug that parameters and locals are extended > > differently. The code in tree-outof-ssa was written with the > > assumption that the modes of RTL objects might be different (larger) > > than the tree types suggest, bu

Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE

2015-07-13 Thread Michael Matz
Hi, On Mon, 13 Jul 2015, Richard Biener wrote: > On Fri, Jul 10, 2015 at 5:46 PM, Jim Wilson wrote: > > On Tue, Jul 7, 2015 at 2:35 PM, Richard Biener > > wrote: > >> On July 7, 2015 6:29:21 PM GMT+02:00, Jim Wilson > >> wrote: > >>>signed sub-word locals. Thus to detect the need for a conve

Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-13 Thread Michael Matz
Hi, On Mon, 13 Jul 2015, Tom de Vries wrote: > > Implementing multi-step maps or making the hashmaps non-caching > > doesn't solve any of the above problems > > I'm not saying that making those hashmaps non-caching solves any of > these problems. Ah, I didn't mean to imply this, I meant to im

Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-13 Thread Michael Matz
Hi, On Sun, 12 Jul 2015, Tom de Vries wrote: > > I'm trying to get to a defined policy for what is allowed for caches. > > Either forbidding or allowing multi-step dependencies, I don't really > > mind. I think forbidding is the way to go, because ... > > I managed to write a patch series tha

Re: genmatch indent generated code

2015-07-10 Thread Michael Matz
Hi, On Fri, 10 Jul 2015, Richard Biener wrote: > > I also noticed it but didn't care ;) But now I notice > > > > switch (TREE_CODE (t)) > > { > > case SSA_NAME: > > > > cases are indented too much, it should be > > > > switch (TREE_CODE (t)) > > { > > case SSA_NAME: I like

Re: genmatch indent generated code

2015-07-09 Thread Michael Matz
Hi, On Thu, 9 Jul 2015, Jakub Jelinek wrote: > That violates the coding style by not using tabs ;) I knew it! Somebody would notice, pffft. Fixed in the committed version. Ciao, Michael. PS: this still isn't fully correct, as sometimes I start the strings with spaces which don't count towar

Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-09 Thread Michael Matz
Hi, On Thu, 9 Jul 2015, Tom de Vries wrote: > > > Given this I think the call to gt_ggc_mx is superfluous because it > > > wouldn't work reliably for multi-step dependencies anyway. Hence a > > > situation that works with that call in place, and breaking without > > > it is actually a bug wai

genmatch indent generated code

2015-07-09 Thread Michael Matz
Hi, while looking at gimple-match.c I got a minor stroke, so this patch makes genmatch generated (mostly) properly indented code. Sure, it could be done post-fact by an editor or something when one looks at the file, but all other generators also try to generate good looking code. No functio

Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-07 Thread Michael Matz
Hi, On Mon, 6 Jul 2015, Richard Biener wrote: > >> By doing so, we make the behaviour of gt_cleare_cache independent of the > >> order in which the entries are visited, turning: > >> - hard-to-trigger bugs which trigger for one visiting order but not for > >> another, into > >> - more easily tr

Re: flatten cfgloop.h

2015-07-06 Thread Michael Matz
Hi, On Sun, 5 Jul 2015, Prathamesh Kulkarni wrote: > Hi, > The attached patches flatten cfgloop.h. > patch-1.diff moves around prototypes and structures to respective > header-files. > patch-2.diff (mostly auto-generated) replicates cfgloop.h includes in c files. > Bootstrapped and tested on x86

Re: [hsa] HSA: add support for function declaration emission and, fix RA.

2015-06-30 Thread Michael Matz
Hi, On Tue, 30 Jun 2015, Martin Liška wrote: > Following patch implements emission of function declarations and removes > hsa_call_block_insn. The insn is replaced with a new hsa_arg_block_insn, > which will make insn iteration flat and much easier for register > allocator. Given that BRIG fo

Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

2015-06-25 Thread Michael Matz
Hi, On Thu, 25 Jun 2015, Benedikt Huber wrote: > > This is NOT a win on thunderX at least for single precision because > > you have to do the divide and sqrt in the same time as it takes 5 > > multiples (estimate and step are multiplies in the thunderX pipeline). > > Doubles is 10 multiplies

Fix PR66253 (GemsFDTD miscompile)

2015-06-17 Thread Michael Matz
Hi, this implements support for strided grouped stores in the non-SLP case (the SLP case existed already). Before we were ignoring all but the last store in a group. That led to a miscompile of GemsFDTD, the testcase reflects that situation. Also since r224511 yesterday grouped strided non-S

Backport PR63623 (debug info) to 4.8, 4.9

2015-06-12 Thread Michael Matz
Hi, this backports the fix for debug info of PR63623 to the 4.8 and 4.9 branches. Without this shrink-wrapped functions often have invalid debug info for parameters. Bootstrapped and regtested 4.8 and 4.9 with this on x86_64-linux, no regressions (for my machine/gdb combination 4.8 has two m

Re: Fix PR66251 (wrong code with strided group stores)

2015-05-26 Thread Michael Matz
Hi, On Fri, 22 May 2015, Richard Biener wrote: > >It's currently regstrapping on x86_64-linux, okay for trunk if that > >passes? > > OK. r223704 now. I've tried to also add the fortran runtime testcase from the PR that now exists, but failed. The gfortran.dg/vect testsuite is strange and c

Re: Do not compute alias sets for types that don't need them

2015-05-26 Thread Michael Matz
Hi, On Fri, 22 May 2015, Jan Hubicka wrote: > Index: tree-streamer-out.c > === > --- tree-streamer-out.c (revision 223508) > +++ tree-streamer-out.c (working copy) > @@ -346,6 +346,7 @@ pack_ts_type_common_value_fields (s

Fix PR66251 (wrong code with strided group stores)

2015-05-22 Thread Michael Matz
Hi, between Richi's improvements of grouped accesses, and mine to strided stores is an interaction that now leads to ICEs and wrong code after both are in, for instance PR66251. The added testcase reflects this situation, and uses both narrowing and widening (narrowing would still ICE, widen

Re: [PATCH i386] Allow sibcalls in no-PLT PIC

2015-05-20 Thread Michael Matz
Hi, On Wed, 20 May 2015, Rich Felker wrote: > > of a win that often, outside toy examples. Sure, the compiler can hoist > > function addresses trivially, but I think it will lead to spilling more > > often than not, or alternatively the hoisting will be undone by the > > register allocators r

Re: [PATCH i386] Allow sibcalls in no-PLT PIC

2015-05-20 Thread Michael Matz
Hi, On Tue, 19 May 2015, Richard Henderson wrote: > It is. The relaxation that HJ is working on requires that the reads > from the got not be hoisted. I'm not especially convinced that what > he's working on is a win. > > With LTO, the compiler can do the same job that he's attempting in the

Re: [PATCH i386] Allow sibcalls in no-PLT PIC

2015-05-19 Thread Michael Matz
Hi, On Tue, 19 May 2015, Jeff Law wrote: > > > Forget lazy binding. It's dead anyway because serious distros want > > > PIE+relro+bindnow+... > > > > You keep saying this, but I can't help the feeling it's mostly because > > musl doesn't support it ;-) > > FWIW, Red Hat is pushing PIE & parti

Re: [PATCH i386] Allow sibcalls in no-PLT PIC

2015-05-19 Thread Michael Matz
Hi, On Fri, 15 May 2015, Rich Felker wrote: > Forget lazy binding. It's dead anyway because serious distros want > PIE+relro+bindnow+... You keep saying this, but I can't help the feeling it's mostly because musl doesn't support it ;-) No, you don't have to use bindnow to get the effects of re

Re: RFC: Add a new relocation to x86-64/i386 psABIs

2015-05-18 Thread Michael Matz
Hi, On Mon, 18 May 2015, H.J. Lu wrote: > Yes, we should convert it to > > nop call foo/jmp foo nop > > I implemented it on users/hjl/relax branch in binutils git repo. > > > the insn decoder. For calls as well of course, but there it might be > > better to have it before the call. > > > > I

Re: RFC: Add a new relocation to x86-64/i386 psABIs

2015-05-18 Thread Michael Matz
Hi, On Mon, 18 May 2015, H.J. Lu wrote: > To avoid indirect branch to internal functions, I am proposing to add a > new relocation, R_X86_64_RELAX_GOTPCREL, to x86-64 psABI: > > 1. When branching to an external function, foo, compiler may generate > call/jmp *foo@GOTRELAX(%rip) >which

Re: My patch for GCC 5 directory names

2015-05-12 Thread Michael Matz
Hi, On Tue, 12 May 2015, H.J. Lu wrote: > >> So we have > >> > >> experimental > >> release > >> post-release > >> > >> Why not just rename prerelease to post-release? That is a one-line > >> change. > > > > Why print anything at all? 5.1.1 is after 5.1.0 in obvious ways. > > > > How can you te

Re: [PATCH][PR66010] Don't take address of ap unless necessary

2015-05-12 Thread Michael Matz
Hi, On Fri, 8 May 2015, Tom de Vries wrote: > III. > > Using the patch, before inlining we can see the address operator has been > removed in va_arg: > ... > f2_1 (struct * apD.1832) > { > intD.6 _4; > > # .MEM_3 = VDEF <.MEM_1(D)> > # USE = anything > # CLB = anything > > _4 = VA_A

Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-05-11 Thread Michael Matz
Hi, On Wed, 6 May 2015, Rich Felker wrote: > I don't see how this case is improved unless GCC is failing to consider > strong definitions in the same TU as locally-binding. Interposition of non-static non-inline non-weak symbols is supported independent of whether they are defined in the same TU or

Re: Vectorize stores with unknown stride

2015-05-07 Thread Michael Matz
On Thu, 7 May 2015, Alan Lawrence wrote: > Also update comment? (5 identical cases) > > Also update comment? Obviously a good idea, thanks :) (s/loads/accesses/ for the commit) > > @@ -5013,7 +5025,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator > > *gsi, gimple *vec_stmt, > >tr

Vectorize stores with unknown stride

2015-05-06 Thread Michael Matz
Hi, I've been sitting on this for quite some time already and always missed stage 1. This implements support for vectorizing strided stores with unknown but loop-invariant stride, like: sumit (float * __restrict dest, float * __restrict src, float * __restrict src2, int stride, int n
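
For readers of this index, a minimal sketch of the kind of loop the message is talking about. This is not the testcase from the patch (the snippet above is cut off and is left as-is); the function name strided_store and its body are hypothetical, chosen only to show a store whose stride is unknown at compile time but does not change inside the loop.

```c
#include <stddef.h>

/* Hypothetical illustration, not code from the patch: each iteration
   stores to dest at a distance of `stride` elements from the previous
   store.  `stride` is only known at run time, but it is loop invariant.  */
void
strided_store (float *__restrict dest, const float *__restrict src,
               int stride, int n)
{
  for (int i = 0; i < n; i++)
    dest[(ptrdiff_t) i * stride] = src[i] * 2.0f;
}
```

Whether a given compiler version actually vectorizes such a loop depends on the target and options; the sketch only shows the access pattern.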
