Re: [PR middle-end/78548] fix latent double free in tree-ssa-uninit.c
On 12/01/2016 07:32 AM, Richard Biener wrote: On Thu, Dec 1, 2016 at 12:03 PM, Aldy Hernandez wrote: This looks like a latent problem in the -Wmaybe-uninitialized code unrelated to my work. The problem here is a sequence of simplify_preds() followed by normalize_preds() that I added, but is based on prior art all over the file: + simplify_preds (&uninit_preds, NULL, false); + uninit_preds = normalize_preds (uninit_preds, NULL, false); The problem is that in this particular testcase we trigger simplify_preds_4 which makes a copy of a chain, frees the chain, and then tries to use the chain later (in normalize_preds). The normalize_preds() function tries to free again the chain and we blow up: This is the main problem in simplify_preds_4: /* Now clean up the chain. */ if (simplified) { for (i = 0; i < n; i++) { if ((*preds)[i].is_empty ()) continue; s_preds.safe_push ((*preds)[i]); // // Makes a copy of the pred_chain. } destroy_predicate_vecs (preds); // ^^^ // free() all the pred_chain's. (*preds) = s_preds; // ^^ // Wait a minute, we still keep a copy of the pred_chains. s_preds = vNULL; } I have no idea how this worked even before my patch. Perhaps we never had a simplify_preds() followed by a normalize_preds() where the simplification involved a call to simplify_preds_4. Interestingly enough, simplify_preds_2() has the exact same code, but with the fix I am proposing. So this seems like an oversight. Also, the fact that the simplification in simplify_preds_2 is more common than the simplification performed in simplify_preds_4 would suggest that simplify_preds_4 was uncommon enough and probably wasn't been used in a simplify_preds/normalize_preds combo. Anyways... I've made some other cleanups to the code, but the main gist of the entire patch is: - destroy_predicate_vecs (preds); + preds->release (); That is, release preds, but don't free the associated memory with the pred_chain's therein. This patch is on top of my pending patch here: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02900.html Tested on x86-64 Linux. OK for trunk? Ok. Thank you Richard. Since this is dependent on the other patch I will wait until said patch is approved to commit. Aldy
Re: [PATCH] correct handling of non-constant width and precision (pr 78521)
Hi Martin, > PR 78521 notes that the gimple-ssa-sprintf pass doesn't do the right > thing (i.e., the -Wformat-length and -fprintf-return-value options > behave incorrectly) when a conversion specification includes a width > or precision with a non-constant value. The code treats such cases > as if they were not provided which is incorrect and results in > the wrong bytes counts in warning messages and in the wrong ranges > being generated for such calls (or in the case sprintf(0, 0, ...) > for some such calls being eliminated). > > The attached patch corrects the handling of these cases, plus a couple > of other edge cases in the same area: it adjusts the parser to accept > precision in the form of just a period with no asterisk or decimal > digits after it (this sets the precision to zero), and corrects the > handling of zero precision and zero argument in integer directives > to produce no bytes on output. > > Finally, the patch also tightens up the constraint on the upper bound > of bounded functions like snprintf to be INT_MAX. The functions cannot > produce output in excess of INT_MAX + 1 bytes and some implementations > (e.g., Solaris) fail with EINVAL when the bound is INT_MAX or more. > This is the subject of PR 78520. this patch broke Solaris bootstrap: /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c: In function 'void {anonymous}::get_width_and_precision(const {anonymous}::conversion_spec&, long long int*, long long int*)': /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:777:45: error: call of overloaded 'abs(long long int)' is ambiguous width = abs (tree_to_shwi (spec.star_width)); ^ /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:777:45: note: candidates are: In file included from /usr/include/stdlib.h:12:0, from /vol/gcc/src/hg/trunk/local/gcc/system.h:258, from /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:49: /usr/include/iso/stdlib_iso.h:205:16: note: long int std::abs(long int) inline long abs(long _l) { return labs(_l); } ^ /usr/include/iso/stdlib_iso.h:160:12: note: int std::abs(int) extern int abs(int); ^ The following patch fixed this for me, but I've no idea if it's right. It bootstrapped successfully on sparc-sun-solaris2.12, i386-pc-solaris2.12, and x86_64-pc-linux-gnu. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University 2016-12-02 Rainer Orth * gimple-ssa-sprintf.c (get_width_and_precision): Use std::abs. diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c --- a/gcc/gimple-ssa-sprintf.c +++ b/gcc/gimple-ssa-sprintf.c @@ -774,7 +774,7 @@ get_width_and_precision (const conversio if (spec.star_width) { if (TREE_CODE (spec.star_width) == INTEGER_CST) - width = abs (tree_to_shwi (spec.star_width)); + width = std::abs (tree_to_shwi (spec.star_width)); else width = HOST_WIDE_INT_MIN; }
Re: [tree-tailcall] Check if function returns it's argument
On Thu, 1 Dec 2016, Jeff Law wrote: > On 12/01/2016 06:22 AM, Richard Biener wrote: > > > Well after removing DECL_BY_REFERENCE, verify_gimple still fails but > > > differently: > > > > > > tail-padding1.C:13:1: error: RESULT_DECL should be read only when > > > DECL_BY_REFERENCE is set > > > } > > > ^ > > > while verifying SSA_NAME nrvo_4 in statement > > > # .MEM_3 = VDEF <.MEM_1(D)> > > > nrvo_4 = __builtin_memset (nrvo_2(D), 0, 8); > > > tail-padding1.C:13:1: internal compiler error: verify_ssa failed > > > > Hmm, ok. Not sure why we enforce this. > I don't know either. But I would start by looking at tree-nrv.c since it > looks (based on the variable names) that the named-value-return optimization > kicked in. > > > > > Note that in the end this patch looks fishy -- iff we really need > > the LHS on the assignment for correctness if we have the tailcall > > flag set then what guarantees that later passes do not remove it > > again? So anybody removing a LHS would need to unset the tailcall flag? > > > > Saying again that I don't know enough about the RTL part of tailcall > > expansion. > The LHS on the assignment makes it easier to identify when a tail call is > possible. It's not needed for correctness. Not having the LHS on the > assignment just means we won't get an optimized tail call. > > Under what circumstances would the LHS possibly be removed? We know the > return statement references the LHS, so it's not going to be something that > DCE will do. Well, I thought Prathamesh added the optimization to copy-propagate the lhs from the returned argument. So we'd have both transforms here. Of course as always the user could have written the code in this way. If the LHS is not required for correctness then I don't think we need to put it there - Pratamesh verified the call is tail-called already if marked by the tailcall pass, even if the LHS is not present. Richard.
Re: [PATCH] S/390: Fix litpool-r3-1.c.
On Wed, Nov 23, 2016 at 12:27:30PM +0100, Dominik Vogt wrote: > gcc/ChangeLog-lp1 > > * gcc.target/s390/litpool-r3-1.c: Fix label number test. Applied. Thanks! -Andreas-
Re: [PATCH v3] Do not simplify "(and (reg) (const bit))" to if_then_else.
On Thu, Dec 01, 2016 at 04:30:08PM +0100, Dominik Vogt wrote: > gcc/ChangeLog > > * combine.c (combine_simplify_rtx): Suppress replacement of > "(and (reg) (const_int bit))" with "if_then_else". Applied. Thanks! -Andreas-
Re: [PATCH 2/2 v3] PR77822
On Fri, Nov 25, 2016 at 09:25:59AM +0100, Dominik Vogt wrote: > gcc/ChangeLogb > > PR target/77822 > * config/s390/s390.md ("extzv") > ("*extzv") > ("*extzvdi_lshiftrt") > ("*_ior_and_sr_ze") > ("*extract1bitdi") > ("*insv", "*insv_rnsbg_noshift") > ("*insv_rnsbg_srl", "*insv_mem_reg") > ("*insvdi_mem_reghigh", "*insvdi_reg_imm"): Use EXTRACT_ARGS_IN_RANGE > to validate the arguments of zero_extract and sign_extract. > gcc/testsuite/ChangeLogb > > PR target/77822 > * gcc.target/s390/s390.exp: Support .cxx tests. > * gcc.target/s390/pr77822-2.c: New test. > * gcc.target/s390/pr77822-1.cxx: New test. Applied with testcase .cxx renamed to .C. -Andreas-
Re: [PATCH 1/2 v3] PR77822
On Fri, Nov 25, 2016 at 09:29:46AM +0100, Dominik Vogt wrote: > gcc/ChangeLog > > PR target/77822 > * rtl.h (EXTRACT_ARGS_IN_RANGE): New. Applied. Thanks! -Andreas-
Re: [RFA] Handle target with no length attributes sanely in bb-reorder.c
On Thu, Dec 1, 2016 at 6:28 PM, Jeff Law wrote: > On 12/01/2016 05:04 AM, Segher Boessenkool wrote: >> >> On Thu, Dec 01, 2016 at 10:19:42AM +0100, Richard Biener wrote: > > Thinking about this again maybe targets w/o insn-length should simply > always use the 'simple' algorithm instead of the STV one? At least > that > might be what your change effectively does in some way? From reading the comments I don't think STC will collapse down into the simple algorithm if block copying is disabled. But Segher would know for sure. WRT the choice of simple vs STC, I doubt it matters much for the processors in question. >>> >>> >>> I guess STC doesn't make much sense if we can't say anything about BB >>> sizes. >> >> >> STC tries to make as large as possible consecutive "traces", mainly to >> help with instruction cache utilization and hit rate etc. It cannot do >> a very good job if it isn't allowed to copy blocks. >> >> "simple" tries to (dynamically) have as many fall-throughs as possible, >> i.e. as few jumps as possible. I hope it special-cases backedges, that is, not make those fallthru. >> It never copies code; if that means it >> has to jump every second insn, so be it. It provably is within a factor >> three of optimal (optimal is NP-hard), under a really weak assumption >> within a factor two, and it does better than that in practice. >> >> STC without block copying makes longer traces which is not a good idea >> for most architectures, only for those that have a short jump that is >> much shorter than longer jumps (and thus, cannot cover many jump >> targets). >> >> I do not know how STC behaves when it does not know the insn lengths. > > mn103 & m68k are definitely sensitive to jump distances, the former more so > than the latter. Some of the others probably are as well. > > I think we've probably discussed this more than is really necessary. We > just need to pick an alternative and go with it, I think either alternative > is reasonable (avoid copying when STC has no length information or fall back > to simple when there is no length information). The description of both algorithms doesn't make it an obvious pick. So lets leave the decision of the algorithm to the target and make STC beave sane with no length information (whether that means disallowing any copying or substituting a "default" length is another question -- but obviously it's the targets fault to not provide lenght information). Richard. > > > jeff
RE: [PATCH] PR fortran/77505 -- Treat negative character length as LEN=0
Thank you Steve. Thanks, Elizebeth -Original Message- From: Steve Kargl [mailto:s...@troutmask.apl.washington.edu] Sent: 02 December 2016 04:54 To: Punnoose, Elizebeth Cc: fort...@gcc.gnu.org; gcc-patches@gcc.gnu.org Subject: Re: [PATCH] PR fortran/77505 -- Treat negative character length as LEN=0 On Wed, Nov 30, 2016 at 05:13:28AM +, Punnoose, Elizebeth wrote: > Please excuse the messy formatting in my initial mail. Resending with > proper formatting. > > This patch checks for negative character length in the array > constructor, and treats it as LEN=0. > > A warning message is also printed if bounds checking is enabled. > > Bootstrapped and regression tested the patch on x86_64-linux-gnu and > aarch64-linux-gnu. > Thanks. After regression testing on x86_64-*-freebsd, I committed the attached patch. Not sure if the whitespace got messed up by my email agent, but I needed to reformat your testcases. I took the opportunity to rename and improve the testcases. The improvements check that in fact len=0 and that a warning is issued. Hopefully, you're inclined to submit additional patches in the future. A few recommendations are to include the text of your ChangeLog entry in body of the email, for example, 2016-12-01 Elizebeth Punnoose PR fortran/77505 * trans-array.c (trans_array_constructor): Treat negative character length as LEN = 0. 2016-12-01 Elizebeth Punnoose PR fortran/77505 * gfortran.dg/char_length_20.f90: New test. * gfortran.dg/char_length_21.f90: Ditto. (Note, 2 spaces before and after your name.) Then attach the patch to the email. This hopefully will prevent formatting issues with various email clients/servers. -- Steve
[PATCH, alpha]: Fix invalid RTX sharing
2016-12-02 Uros Bizjak * config/alpha/alpha.md (exception_receiver): Copy alpha_gp_ave_rtx return value. Bootstrapped and regression tested on alphaev68-linux-gnu, will commit soon. Uros. diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md index 3e4594b..0ed29de 100644 --- a/gcc/config/alpha/alpha.md +++ b/gcc/config/alpha/alpha.md @@ -5142,7 +5142,7 @@ "TARGET_ABI_OSF" { if (flag_reorder_blocks_and_partition) -operands[0] = alpha_gp_save_rtx (); +operands[0] = copy_rtx (alpha_gp_save_rtx ()); else operands[0] = const0_rtx; })
Re: [PATCH] have __builtin_object_size handle POINTER_PLUS with non-const offset (pr 77608)
On Thu, Dec 1, 2016 at 6:58 PM, Martin Sebor wrote: >> Sure - but then you maybe instead want to check for op being in >> range [0, max-of-signed-type-of-op] instead? So similar to >> expr_not_equal_to add a expr_in_range helper? >> >> Your function returns true for sizetype vars even if it might be >> effectively signed, like for >> >> sizetype i_1 = -4; >> i_2 = i_1 + 1; >> >> operand_unsigned_p (i) returns true. I suppose you may have (*) >> meant >> >> +static bool >> +operand_unsigned_p (tree op) >> +{ >> + if (TREE_CODE (op) == SSA_NAME) >> +{ >> + gimple *def = SSA_NAME_DEF_STMT (op); >> + if (is_gimple_assign (def)) >> + { >> + tree_code code = gimple_assign_rhs_code (def); >> + if (code == NOP_EXPR >>&& TYPE_PRECISION (TREE_TYPE (op)) > TYPE_PRECISION >> (TREE_TYPE (gimple_assign_rhs1 (def >> return tree_expr_nonnegative_p (gimple_assign_rhs1 (def))); >> + } >> +} >> + >> + return false; >> +} >> >> ? because only if you do see a cast and that cast is widening from an >> nonnegative number >> the final one will be unsigned (if interpreted as signed number). > > > I don't think this is what I want. Here's a test case that works > with my function but not with the suggested modification: > >char d[4]; >long f (unsigned long i) >{ > return __builtin_object_size (d + i + 1, 0); >} > > Here, the size I'm looking for is (at most) 3 no matter what value > i has. Am I missing a case where my function will do the wrong > thing? See above (*) I know what you are trying to do (retro-actively make value-ranges have a split variable / constant part). But I don't think that doing it the way you do it is a reasonable maintainable way. Others may disagree. >> You might want to use offset_ints here (see mem_ref_offset for example) > > > Okay, I'll see if I can switch to offset_int. > >> + gimple *def = SSA_NAME_DEF_STMT (off); + if (is_gimple_assign (def)) + { + tree_code code = gimple_assign_rhs_code (def); + if (code == PLUS_EXPR) + { + /* Handle offset in the form VAR + CST where VAR's type +is unsigned so the offset must be the greater of +OFFRANGE[0] and CST. This assumes the PLUS_EXPR +is in a canonical form with CST second. */ + tree rhs2 = gimple_assign_rhs2 (def); err, what? What about overflow? Aren't you just trying to decompose 'off' into a variable and a constant part here and somehow extracting a range for the variable part? So why not just do that? >>> >>> >>> >>> Sorry, what about overflow? >>> >>> The purpose of this code is to handle cases of the form >>> >>>& PTR [range (MIN, MAX)] + CST >> >> >> what if MAX + CST overflows? > > > The code doesn't look at MAX, only MIN is considered. It extracts > both but only actually uses MAX to see if it's dealing with a range > or a constant. Does that resolve your concern? But if MAX overflows then it might be smaller than MIN and thus you cannot conclude that [min, max] + CST is [min+CST, UNKNWON] but rather it's [UNKNOWN, UNKNOWN] (if you disregard the actual value of MAX). >>>char d[7]; >>> >>>#define bos(p, t) __builtin_object_size (p, t) >>> >>>long f (unsigned i) >>>{ >>> if (2 < i) i = 2; >>> >>> char *p = &d[i] + 3; >>> >>> return bos (p, 0); >>>} >> >> >> I'm sure that doesn't work as you match for PLUS_EXPR. > > > Sorry, I'm not sure what you mean. I mean that p = &d[i] + 3; is not represented as a PLUS_EXPR but as a POINTER_PLUS_EXPR. All pointer arithmetic is using POINTER_PLUS_EXPR. > The above evaluates to 4 with > the patch because i cannot be less than zero (otherwise &d[i] would > be invalid/undefined) so the type-0 size we want (the maximum) is > &d[7] - (&d[0] + 3) or 4. > >> Maybe simply ignore VR_ANTI_RANGEs for now then? > > > Yes, that makes sense. > >>> The code above is based on the observation that an anti-range is >>> often used to represent the full subrange of a narrower signed type >>> like signed char (as ~[128, -129]). I haven't been able to create >>> an anti-range like ~[5, 9]. When/how would a range like that come >>> about (so I can test it and implement the above correctly)? >> >> >> if (a < 4) >> if (a > 8) >> b = a; >> >> then b should have ~[5, 9] > > > Right :) I have figured out by know by know how to create an anti > range in general. What I meant is that I haven't had luck creating > them in a way that the tree-object-size pass could see (I'm guessing > because EVRP doesn't understand relational expressions). So given > this modified example from above: > > char d[9]; > > #define bos(p, t) __builtin_object_size (p, t) > > long f (unsigned a) > { >unsigned b = 0; > >if (a < 4) > if (a > 8
[avr,committed]: Fix coding-style glitches in avr.c
Committed rectifications for bunch of coding rule nits as obvious. Johann * config/avr/avr.c: Fix coding rule glitches. Index: config/avr/avr.c === --- config/avr/avr.c (revision 243104) +++ config/avr/avr.c (working copy) @@ -388,7 +388,7 @@ avr_parallel_insn_from_insns (rtx_insn * If this is the case, fill in the insns from casesi to INSNS[1..5] and the SImode extension to INSNS[0]. Moreover, extract the operands of pattern casesi__sequence forged from the sequence to recog_data. */ - + static bool avr_is_casesi_sequence (basic_block bb, rtx_insn *insn, rtx_insn *insns[6]) { @@ -702,7 +702,7 @@ avr_set_core_architecture (void) break; } else if (0 == strcmp (mcu->name, avr_mmcu) - // Is this a proper architecture ? + // Is this a proper architecture ? && NULL == mcu->macro) { avr_arch = &avr_arch_types[mcu->arch_id]; @@ -1078,7 +1078,7 @@ avr_set_current_function (tree decl) if (!STR_PREFIX_P (name, "__vector")) warning_at (loc, OPT_Wmisspelled_isr, "%qs appears to be a misspelled " - "%s handler, missing __vector prefix", name, isr); +"%s handler, missing __vector prefix", name, isr); } /* Don't print the above diagnostics more than once. */ @@ -1163,7 +1163,7 @@ avr_regs_to_save (HARD_REG_SET *set) /* Don't record frame pointer registers here. They are treated indivitually in prologue. */ && !(frame_pointer_needed - && (reg == REG_Y || reg == (REG_Y+1) + && (reg == REG_Y || reg == REG_Y + 1 { if (set) SET_HARD_REG_BIT (*set, reg); @@ -1374,7 +1374,7 @@ sequent_regs_live (void) else cur_seq = 0; - if (df_regs_ever_live_p (REG_Y+1)) + if (df_regs_ever_live_p (REG_Y + 1)) { ++live_seq; ++cur_seq; @@ -1807,7 +1807,8 @@ avr_expand_prologue (void) avr_prologue_setup_frame (size, set); if (flag_stack_usage_info) -current_function_static_stack_size = cfun->machine->stack_usage + INCOMING_FRAME_SP_OFFSET; +current_function_static_stack_size + = cfun->machine->stack_usage + INCOMING_FRAME_SP_OFFSET; } @@ -1840,9 +1841,9 @@ avr_asm_function_end_prologue (FILE *fil avr_outgoing_args_size()); fprintf (file, "/* frame size = " HOST_WIDE_INT_PRINT_DEC " */\n", - get_frame_size()); + get_frame_size()); fprintf (file, "/* stack size = %d */\n", - cfun->machine->stack_usage); + cfun->machine->stack_usage); /* Create symbol stack offset here so all functions have it. Add 1 to stack usage for offset so that SP + .L__stack_offset = return address. */ fprintf (file, ".L__stack_usage = %d\n", cfun->machine->stack_usage); @@ -2522,7 +2523,7 @@ avr_print_operand_address (FILE *file, m rtx x = addr; if (GET_CODE (x) == CONST) x = XEXP (x, 0); - if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x,1))) + if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1))) { /* Assembler gs() will implant word address. Make offset a byte offset inside gs() for assembler. This is @@ -2532,14 +2533,14 @@ avr_print_operand_address (FILE *file, m from symbol which may not be what the user really wanted. */ fprintf (file, "gs("); - output_addr_const (file, XEXP (x,0)); + output_addr_const (file, XEXP (x, 0)); fprintf (file, "+" HOST_WIDE_INT_PRINT_DEC ")", 2 * INTVAL (XEXP (x, 1))); if (AVR_3_BYTE_PC) if (warning (0, "pointer offset from symbol maybe incorrect")) { output_addr_const (stderr, addr); -fprintf(stderr,"\n"); +fprintf (stderr, "\n"); } } else @@ -2617,12 +2618,12 @@ avr_print_operand (FILE *file, rtx x, in } else if (code == 'E' || code == 'F') { - rtx op = XEXP(x, 0); + rtx op = XEXP (x, 0); fprintf (file, "%s", reg_names[REGNO (op) + ef]); } else if (code == 'I' || code == 'J') { - rtx op = XEXP(XEXP(x, 0), 0); + rtx op = XEXP (XEXP (x, 0), 0); fprintf (file, "%s", reg_names[REGNO (op) + ij]); } else if (REG_P (x)) @@ -2714,12 +2715,12 @@ avr_print_operand (FILE *file, rtx x, in } else if (GET_CODE (addr) == PLUS) { - avr_print_operand_address (file, VOIDmode, XEXP (addr,0)); + avr_print_operand_address (file, VOIDmode, XEXP (addr, 0)); if (REGNO (XEXP (addr, 0)) == REG_X) fatal_insn ("internal compiler error. Bad
Re: [patch,testsuite,avr]: Filter-out -mmcu= from options for tests that set -mmcu=
On 01.12.2016 17:40, Mike Stump wrote: On Dec 1, 2016, at 3:54 AM, Georg-Johann Lay wrote: This patch moves the compile tests that have a hard coded -mmcu=MCU in their dg-options to a new folder. The exp driver filters out -mmcu= from the command line options that are provided by, say, board description files or --tool-opts. This is needed because otherwise conflicting -mmcu= will FAIL respective test cases because of "specified option '-mmcu' more than once" errors from avr-gcc. Ok for trunk? So, it would be nice if different ports can use roughly similar schemes to handle the same problems. I think arm is one of the more complex ports at this point in this regard with a lot of people and a lot of years time to contemplate and implement solutions to the problem. They in particular don't have to move test cases around to handle the difference like this, I think it would be best to avoid that requirement if possible. Glancing around, two starting points for how the arm achieves what it does: lappend dg_runtest_extra_prunes "warning: switch -m(cpu|arch)=.* conflicts with -m(cpu|arch)=.* switch" in arm.exp, and they use something like: /* { dg-require-effective-target arm_crypto_ok } */ |crypto-vsha256hq_u32.c:2:/* { dg-require-effective-target arm_crypto_ok } */ /* { dg-add-options arm_crypto } */ |crypto-vsha256su0q_u32.c:2:/* { dg-require-effective-target arm_crypto_ok } */ to validate the requirements of the test case, and to ensure that optional things are selected. Nice, simple, extensible, handles multilibs, dejagnu arguments and different cpu defaults as I recall. You won't need all the hair the arm folks have, but if you stub in support in that direction, you then have simple, easy expansion room to match all complexities that the arm folks have already hit and solved. I tried this approach, but it does not work as expected. Notice that avr-gcc throws an error if conflicting -mmcu options are supplied. Pruning the output will make some tests "PASS", but the compiler didn't actually do anything but producing an error message... And one test FAILs because it should produce some specific diagnostic, but again the compiler just throws a different error, the output is pruned and the expected message is missing, hence fail. Also one test case is for ATmega8, but you won't run the whole test suite against ATmega8 because that device is too restricted to give reasonable results... Hence for some tests -mmcu=atmega8 is added by hand. Johann
Re: [PATCH] remove invalid "tail +16c"
On 11/22/2016 11:22 PM, ma.ji...@zte.com.cn wrote: > Hi all, > In "config/acx.m4", there are still some "tail +16c" which are invalid > on POSIX systems. > In my opinion, all "tail +16c" should be changed to "tail -c +16" > directly, as most systems has accept the latter. > And, to skip first 16 bytes, we should use "tail -c +17" instead of > "tail -c +16". > > * config/acx.m4:Change "tail +16c" to "tail -c +17". > * configure: Regenerate. >Thanks. I've installed this on the trunk after bootstrap & regression >testing on x86_64-linux-gnu. > >jeff > >Thanks Hi Jeff, Thanks for your attention. I can now close the bug in binuitls---https://sourceware.org/bugzilla/show_bug.cgi?id=20823.
[Committed] S/390: Fix RTL sharing when generating reg note.
gcc/ChangeLog: 2016-12-02 Andreas Krebbel * config/s390/s390.c (s390_save_gprs_to_fprs): Fix RTL sharing problem. --- gcc/ChangeLog | 5 + gcc/config/s390/s390.c | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index a0cefa7..03387cf 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2016-12-02 Andreas Krebbel + + * config/s390/s390.c (s390_save_gprs_to_fprs): Fix RTL sharing + problem. + 2016-12-02 Georg-Johann Lay * config/avr/avr-arch.h (avr_mcu_t) [n_flash]: Remove field. diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 767666e..030e10d 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -10666,7 +10666,7 @@ s390_save_gprs_to_fprs (void) /* This prevents dwarf2cfi from interpreting the set. Doing so it might emit def_cfa_register infos setting an FPR as new CFA. */ - add_reg_note (insn, REG_CFA_REGISTER, PATTERN (insn)); + add_reg_note (insn, REG_CFA_REGISTER, copy_rtx (PATTERN (insn))); } } } -- 2.9.1
Re: [PATCH GCC]Simplify (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2)
On Wed, Nov 30, 2016 at 3:10 PM, Richard Biener wrote: > On Fri, Nov 18, 2016 at 5:53 PM, Bin Cheng wrote: >> Hi, >> This is a rework of https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02007.html. >> Though review comments suggested it could be merged with last kind >> simplification >> of fold_cond_expr_with_comparison, it's not really applicable. As a matter >> of fact, >> the suggestion stands for patch >> @https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02005.html. >> I had previous patch >> (https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01898.html) >> moving fold_cond_expr_with_comparison to match.pd pattern and incorporated >> that patch into it. For this one, the rework is trivial, just renames >> several variable >> tags as suggested. Bootstrap and test on x86_64 and AArch64, is it OK? > > + A) Operand x is a unsigned to signed type conversion and c1 is > + integer zero. In this case, > + (signed type)x < 0 <=> x > MAX_VAL(signed type) > + (signed type)x >= 0 <=> x <= MAX_VAL(signed type) > > for (singed type)x < 0 -> x > signed-type-max we probably do a reverse > "canonicalization" transform? Yeah, > > /* Non-equality compare simplifications from fold_binary */ > (for cmp (lt gt le ge) > ... > (if (wi::eq_p (@1, signed_max) > && TYPE_UNSIGNED (arg1_type) > /* We will flip the signedness of the comparison operator > associated with the mode of @1, so the sign bit is > specified by this mode. Check that @1 is the signed > max associated with this sign bit. */ > && prec == GET_MODE_PRECISION (TYPE_MODE (arg1_type)) > /* signed_type does not work on pointer types. */ > && INTEGRAL_TYPE_P (arg1_type)) > /* The following case also applies to X < signed_max+1 > and X >= signed_max+1 because previous transformations. */ > (if (cmp == LE_EXPR || cmp == GT_EXPR) >(with { tree st = signed_type_for (arg1_type); } > (if (cmp == LE_EXPR) > (ge (convert:st @0) { build_zero_cst (st); }) > (lt (convert:st @0) { build_zero_cst (st); })) > > + if (cmp_code == GE_EXPR) > + cmp_code = LE_EXPR; > + c1 = wide_int_to_tree (op_type, wi::max_value (to_type)); > + } > ... > + if (op == PLUS_EXPR) > + real_c1 = wide_int_to_tree (op_type, > + wi::sub (c3, c2, sgn, &overflow)); > + else > + real_c1 = wide_int_to_tree (op_type, > + wi::add (c3, c2, sgn, &overflow)); > > can you avoid the tree building here and just continue using wide-ints please? > Simply do the wide_int_to_tree in the result patterns. Hi, I updated patch wrto your comments, also deleted two useless variables. Bootstrap and test, is it OK? Thanks, bin 2016-12-01 Bin Cheng * match.pd: Add new pattern: (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2). gcc/testsuite/ChangeLog 2016-12-01 Bin Cheng * gcc.dg/fold-bopcond-1.c: New test. * gcc.dg/fold-bopcond-2.c: New test. > > Otherwise looks ok to me. > > Thanks, > Richard. > > >> Thanks, >> bin >> >> 2016-11-17 Bin Cheng >> >> * match.pd: Add new pattern: >> (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2). >> >> gcc/testsuite/ChangeLog >> 2016-11-17 Bin Cheng >> >> * gcc.dg/fold-bopcond-1.c: New test. >> * gcc.dg/fold-bopcond-2.c: New test. diff --git a/gcc/match.pd b/gcc/match.pd index bc8a5e7..dbb9103 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -2038,6 +2038,106 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (convert (cond (eq @1 (convert @3)) (convert:from_type @3) (convert:from_type @2) +/* (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2) if: + + 1) OP is PLUS or MINUS. + 2) CMP is LT, LE, GT or GE. + 3) C3 == (C1 op C2), and computation doesn't have undefined behavior. + + This pattern also handles special cases like: + + A) Operand x is a unsigned to signed type conversion and c1 is + integer zero. In this case, + (signed type)x < 0 <=> x > MAX_VAL(signed type) + (signed type)x >= 0 <=> x <= MAX_VAL(signed type) + B) Const c1 may not equal to (C3 op' C2). In this case we also + check equality for (c1+1) and (c1-1) by adjusting comparison + code. + + TODO: Though signed type is handled by this pattern, it cannot be + simplified at the moment because C standard requires additional + type promotion. In order to match&simplify it here, the IR needs + to be cleaned up by other optimizers, i.e, VRP. */ +(for op (plus minus) + (for cmp (lt le gt ge) + (simplify + (cond (cmp (convert? @X) INTEGER_CST@1) (op @X INTEGER_CST@2) INTEGER_CST@3) + (with { tree from_type = TREE_TYPE (@X), to_type = TREE_TYPE (@1); } +(if (types_match (from_type, to_type
[PATCH] EVRP edge info refactoring and simple improvement
The following refactors range extraction from edges and makes EVRP able to merge such edge based ranges for the case of multiple predecessors. This allows it to catch anti-ranges from if (a < 4 || a > 8) if that is not re-written to a single test like when using gotos. I don't expect this to catch very much in practice but the refactoring means we can incorporate the tricks from register_edge_assert_for more easily (we "simply" have to use extract_ranges_from_edge as the one-and-true kind-of interface). Motivated by Martin Sebors work on b_o_s and the appearant inability of EVRP to do anything useful for him. Bootstrap & regtest running on x86_64-unknown-linux-gnu. Richard. 2016-12-02 Richard Biener * tree-vrp.c (evrp_dom_walker::extract_ranges_from_edge): New method split out from evrp_dom_walker::before_dom_children. (evrp_dom_walker::before_dom_children): Use it and implement merging of edge range info from multiple predecessors. * gcc.dg/tree-ssa/evrp7.c: New testcase. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/evrp7.c b/gcc/testsuite/gcc.dg/tree-ssa/evrp7.c new file mode 100644 index 000..c28347d --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/evrp7.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fgimple -fdump-tree-evrp" } */ + +void __GIMPLE (startwith("evrp")) +f (unsigned int a) +{ + if (a < 4U) +goto x; + else +goto bb_3; + +bb_3: + if (a > 8U) +goto x; + else +goto bb_4; + +x: + if (a == 5U) +goto bb_5; + else +goto bb_4; + +bb_5: + __builtin_abort (); + +bb_4: + return; +} + +/* { dg-final { scan-tree-dump-times "evrp" 0 "if" } } */ diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 600634d..592d3b0 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -10700,6 +10700,7 @@ public: void push_value_range (tree var, value_range *vr); value_range *pop_value_range (tree var); value_range *try_find_new_range (tree op, tree_code code, tree limit); + void extract_ranges_from_edge (edge, vec > &); /* Cond_stack holds the old VR. */ auto_vec > stack; @@ -10736,13 +10737,69 @@ evrp_dom_walker::try_find_new_range (tree op, tree_code code, tree limit) return NULL; } +/* Extract ranges on E and store them into V. */ + +void +evrp_dom_walker::extract_ranges_from_edge (edge e, + vec > & v) +{ + gimple *stmt = last_stmt (e->src); + tree op0; + if (stmt + && gimple_code (stmt) == GIMPLE_COND + && (op0 = gimple_cond_lhs (stmt)) + && TREE_CODE (op0) == SSA_NAME + && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt))) + || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt) +{ + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "Visiting controlling predicate "); + print_gimple_stmt (dump_file, stmt, 0, 0); + } + /* Entering a new scope. Try to see if we can find a VR +here. */ + tree op1 = gimple_cond_rhs (stmt); + tree_code code = gimple_cond_code (stmt); + + if (TREE_OVERFLOW_P (op1)) + op1 = drop_tree_overflow (op1); + + /* If condition is false, invert the cond. */ + if (e->flags & EDGE_FALSE_VALUE) + code = invert_tree_comparison (gimple_cond_code (stmt), + HONOR_NANS (op0)); + /* Add VR when (OP0 CODE OP1) condition is true. */ + value_range *op0_range = try_find_new_range (op0, code, op1); + + /* Register ranges for y in x < y where +y might have ranges that are useful. */ + tree limit; + tree_code new_code; + if (TREE_CODE (op1) == SSA_NAME + && extract_code_and_val_from_cond_with_ops (op1, code, + op0, op1, + false, + &new_code, &limit)) + { + /* Add VR when (OP1 NEW_CODE LIMIT) condition is true. */ + value_range *op1_range = try_find_new_range (op1, new_code, limit); + if (op1_range) + v.safe_push (std::make_pair (op1, op1_range)); + } + + if (op0_range) + v.safe_push (std::make_pair (op0, op0_range)); +} +} + /* See if there is any new scope is entered with new VR and set that VR to ssa_name before visiting the statements in the scope. */ edge evrp_dom_walker::before_dom_children (basic_block bb) { - tree op0 = NULL_TREE; edge_iterator ei; edge e; @@ -10768,53 +10825,10 @@ evrp_dom_walker::before_dom_children (basic_block bb) } if (pred_e) { - gimple *stmt = last_stmt (pred_e->src); - if (stmt - && gimple_code (stmt) == GIMPLE_COND - && (op0 = gimple_cond_lhs (stmt)) - && TREE_CODE (op0) == SSA_NAME - && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt))) - || POINTER_TYPE_P (TREE_TYPE (gimple_cond_l
[PR middle-end/78328] [committed] wrong wording for unbounded alloca case
The problem here is that as we follow the cast from an unsigned int to __SIZE_TYPE__, we ignore the VR_ANTI_RANGE of 7 exhibited by the test in the PR: +void g (unsigned int n) +{ + if (n == 7) +n = 11; + f (__builtin_alloca (n)); +} Since we can't get any meaningful information from VR_ANTI_RANGE as we drill down to a cast, the appropriate thing is to drop it without assuming it has a range. This was as oversight in not handling VR_ANTI_RANGE gracefully, and I'll go on a limb here and say that attached patch is obvious. Committed to trunk. commit 59256dfdee704d08bcd20f0576abd3353f5767cc Author: Aldy Hernandez Date: Fri Dec 2 04:42:26 2016 -0500 PR middle-end/78328 * gimple-ssa-warn-alloca.c (alloca_call_type): Handle VR_ANTI_RANGE. diff --git a/gcc/gimple-ssa-warn-alloca.c b/gcc/gimple-ssa-warn-alloca.c index e75f2fa..ae379f9 100644 --- a/gcc/gimple-ssa-warn-alloca.c +++ b/gcc/gimple-ssa-warn-alloca.c @@ -339,6 +339,8 @@ alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type) { // Fall through. } + else if (range_type == VR_ANTI_RANGE) + return alloca_type_and_limit (ALLOCA_UNBOUNDED); else if (range_type != VR_VARYING) return alloca_type_and_limit (ALLOCA_BOUND_MAYBE_LARGE, max); diff --git a/gcc/testsuite/gcc.dg/Walloca-12.c b/gcc/testsuite/gcc.dg/Walloca-12.c new file mode 100644 index 000..5d71cda --- /dev/null +++ b/gcc/testsuite/gcc.dg/Walloca-12.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-Walloca-larger-than=128 -O2" } */ + +void f (void*); + +void g (unsigned int n) +{ + if (n == 7) +n = 11; + f (__builtin_alloca (n)); /* { dg-warning "unbounded use of 'alloca'" } */ +}
[PATCH, Fortran, v1] Fix deallocation of nested derived typed components
Hi all, attached patch fixes on ICE, when freeing a scalar allocatable component in a derived typed coarray. Furthermore does it fix freeing of nested derived typed allocatable components. A simple code explains the bug that is solved by the patch: type inner integer, allocatable :: i end type type outer type(inner), allocatable :: link end type type(outer), allocatable :: obj allocate(obj) allocate(obj%link) allocate(obj%link%i) deallocate(obj%link) deallocate(obj) ! <- this will generate pseudo-pseudo-code of the kind: if (obj.link.i != 0) // But link is already NULL, i.e. a crash occurs. free(obj.link.i) The patch fixes this by moving the code for freeing link.i into the check if link is allocated, i.e.: if (obj.link != 0) { if (obj.link.i != 0) { free (obj.link.i); obj.link.i = 0; } free (obj.link); obj.link = 0; } Furthermore does the patch ensure that the handle of an allocatable component is set to 0. Bootstraped and regtested ok on x86_64-linux/F23. Ok for trunk? Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de gcc/fortran/ChangeLog: 2016-12-02 Andre Vehreschild * trans-array.c (structure_alloc_comps): Restructure deallocation of (nested) allocatable components. Insert dealloc of sub-component into the block guarded by the if != NULL for the component. * trans.c (gfc_deallocate_with_status): Allow deallocation of scalar and arrays as well as coarrays. (gfc_deallocate_scalar_with_status): Get the data member for coarrays only when freeing an array with descriptor. * trans.h: Change prototype of gfc_deallocate_with_status to allow adding statements into the block guarded by the if (pointer != 0) and supply a coarray handle. gcc/testsuite/ChangeLog: 2016-12-02 Andre Vehreschild * gfortran.dg/coarray_alloc_comp_3.f08: New test. * gfortran.dg/finalize_18.f90: Add count for additional guard against accessing null-pointer. diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index ac90a4b..a5deb42 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -8157,8 +8157,11 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl, tree null_cond = NULL_TREE; tree add_when_allocated; tree dealloc_fndecl; - bool called_dealloc_with_status; + tree caf_token; gfc_symbol *vtab; + int caf_dereg_mode; + symbol_attribute *attr; + bool deallocate_called; gfc_init_block (&fnblock); @@ -8265,7 +8268,8 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl, bool cmp_has_alloc_comps = (c->ts.type == BT_DERIVED || c->ts.type == BT_CLASS) && c->ts.u.derived->attr.alloc_comp; - bool same_type = c->ts.type == BT_DERIVED && der_type == c->ts.u.derived; + bool same_type = (c->ts.type == BT_DERIVED && der_type == c->ts.u.derived) + || (c->ts.type == BT_CLASS && der_type == CLASS_DATA (c)->ts.u.derived); cdecl = c->backend_decl; ctype = TREE_TYPE (cdecl); @@ -8274,112 +8278,120 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl, { case DEALLOCATE_ALLOC_COMP: - /* gfc_deallocate_scalar_with_status calls gfc_deallocate_alloc_comp - (i.e. this function) so generate all the calls and suppress the - recursion from here, if necessary. */ - called_dealloc_with_status = false; gfc_init_block (&tmpblock); - if ((c->ts.type == BT_DERIVED && !c->attr.pointer) - || (c->ts.type == BT_CLASS && !CLASS_DATA (c)->attr.class_pointer)) - { - comp = fold_build3_loc (input_location, COMPONENT_REF, ctype, - decl, cdecl, NULL_TREE); + comp = fold_build3_loc (input_location, COMPONENT_REF, ctype, + decl, cdecl, NULL_TREE); - /* The finalizer frees allocatable components. */ - called_dealloc_with_status - = gfc_add_comp_finalizer_call (&tmpblock, comp, c, - purpose == DEALLOCATE_ALLOC_COMP - && caf_enabled (caf_mode)); - } + /* Shortcut to get the attributes of the component. */ + if (c->ts.type == BT_CLASS) + attr = &CLASS_DATA (c)->attr; else - comp = NULL_TREE; + attr = &c->attr; - if (c->attr.allocatable && !c->attr.proc_pointer && !same_type - && (c->attr.dimension - || (caf_enabled (caf_mode) - && (caf_in_coarray (caf_mode) || c->attr.codimension - { - /* Allocatable arrays or coarray'ed components (scalar or - array). */ - int caf_dereg_mode - = (caf_in_coarray (caf_mode) || c->attr.codimension) - ? (gfc_caf_is_dealloc_only (caf_mode) - ? GFC_CAF_COARRAY_DEALLOCATE_ONLY - : GFC_CAF_COARRAY_DEREGISTER) - : GFC_CAF_COARRAY_NOCOARRAY; - if (comp == NULL_TREE) - comp = fold_build3_loc (input_location, COMPONENT_REF, ctype, - decl, cdecl, NULL_TREE); + if ((c->ts.type == BT_DERIVED && !c->attr.pointer) + || (c->ts.type == BT_CLASS && !CLASS_DATA (c)->attr.cla
Re: [RFC][PATCH] Speed-up use-after-scope (re-writing to SSA)
On Thu, Dec 1, 2016 at 5:30 PM, Martin Liška wrote: > On 11/23/2016 03:13 PM, Jakub Jelinek wrote: >> On Wed, Nov 23, 2016 at 02:57:07PM +0100, Martin Liška wrote: >>> I started review process in libsanitizer: https://reviews.llvm.org/D26965 >>> And I have a question that was asked in the review: can we distinguish >>> between load and store >>> in case of having usage of ASAN_POISON? >> >> I think with ASAN_POISON it is indeed just loads from after scope that can >> be caught, a store overwrites the variable with a new value and when turning >> the store after we make the var no longer addressable into SSA form, we >> loose information about the out of scope store. Furthermore, if there is >> first a store and then a read, like: >> if (argc != 12312) >> { >> char my_char; >> ptr = &my_char; >> } >> *ptr = i + 26; >> return *ptr; >> we don't notice even the read. Not sure what could be done against that >> though. I think we'd need to hook into the into-ssa framework, there it >> should know the current value of the variable at the point of the store is >> result of ASAN_POISON and be able to instead of turning that >> my_char = _23; >> into >> my_char_35 = _23; >> turn it into: >> my_char_35 = ASAN_POISON (_23); >> which would represent after scope store into my_char. >> >> Not really familiar with into-ssa though to know where to do it. >> >> Jakub >> > > Richi, may I ask you for help with this question? Probably where we handle the CLOBBER case (rewrite_stmt, maybe_register_def), we do this for -Wuninitialized. Richard. > Thanks, > Martin >
Re: [PATCH] S/390: Fix setmem-long test.
On Thu, Dec 01, 2016 at 03:47:22PM +0100, Dominik Vogt wrote: > gcc/testsuite/ChangeLog-setmem-long-test > > * gcc.target/s390/md/setmem_long-1.c: Fix test. Applied. Thanks! -Andreas-
Re: [PING] (v2) Add a "compact" mode to print_rtx_function
On Thu, Dec 01, 2016 at 01:27:55PM +0100, Bernd Schmidt wrote: > On 12/01/2016 11:12 AM, Dominik Vogt wrote: ... > >I'd like to get our test failure fixed, either by changing > >print-rtl.c or our test case. Is the above patch good for trunk? > >It does fix the s390 test failure. > > I still don't see a strong reason not to print the quotes, so I'd > suggest changing the testcase. Ok. I've just committed https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00084.html Bye, -Andreas-
[PATCH] formatting nit
Another formatting nit I noticed. Committed. nathan -- Nathan Sidwell 2016-12-02 Nathan Sidwell * diagnostic.c (diagnostic_report_diagnostic): Remove extraneous braces. Index: diagnostic.c === --- diagnostic.c (revision 243176) +++ diagnostic.c (working copy) @@ -834,9 +834,7 @@ diagnostic_report_diagnostic (diagnostic -Wno-error=*. */ if (context->warning_as_error_requested && diagnostic->kind == DK_WARNING) -{ - diagnostic->kind = DK_ERROR; -} +diagnostic->kind = DK_ERROR; if (diagnostic->option_index && diagnostic->option_index != permissive_error_option (context))
[PATCH] fix -fmax-errors & notes, take 2
Hi, this respin of my notes patch from October (https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00706.html) addresses the fortran problems encountered. I introduced a new function and call it from the fortran error machinery at an appropriate point. ok? nathan -- Nathan Sidwell 2016-12-02 Nathan Sidwell gcc/ * diagnostic.c (diagnostic_check_max_errors): New, broken out of ... (diagnostic_action_after_output): ... here. (diagnostic_report_diagnostic): Call it for non-notes. * diagnostic.h (diagnostic_check_max_errors): Declare. gcc/fortran/ * error.c (gfc_warning_check): Call diagnostic_check_max_errors. (gfc_error_check): Likewise. gcc/testsuite/ * c-c++-common/fmax_errors.c: Check notes after last error are emitted. Index: gcc/diagnostic.c === --- gcc/diagnostic.c (revision 243115) +++ gcc/diagnostic.c (working copy) @@ -446,6 +446,29 @@ bt_err_callback (void *data ATTRIBUTE_UN errnum == 0 ? "" : xstrerror (errnum)); } +/* Check if we've met the maximum error limit. */ + +void +diagnostic_check_max_errors (diagnostic_context *context, bool flush) +{ + if (!context->max_errors) +return; + + int count = (diagnostic_kind_count (context, DK_ERROR) + + diagnostic_kind_count (context, DK_SORRY) + + diagnostic_kind_count (context, DK_WERROR)); + + if (count >= (int) context->max_errors) +{ + fnotice (stderr, + "compilation terminated due to -fmax-errors=%u.\n", + context->max_errors); + if (flush) + diagnostic_finish (context); + exit (FATAL_EXIT_CODE); +} +} + /* Take any action which is expected to happen after the diagnostic is written out. This function does not always return. */ void @@ -470,18 +493,6 @@ diagnostic_action_after_output (diagnost diagnostic_finish (context); exit (FATAL_EXIT_CODE); } - if (context->max_errors != 0 - && ((unsigned) (diagnostic_kind_count (context, DK_ERROR) - + diagnostic_kind_count (context, DK_SORRY) - + diagnostic_kind_count (context, DK_WERROR)) - >= context->max_errors)) - { - fnotice (stderr, - "compilation terminated due to -fmax-errors=%u.\n", - context->max_errors); - diagnostic_finish (context); - exit (FATAL_EXIT_CODE); - } break; case DK_ICE: @@ -892,6 +901,9 @@ diagnostic_report_diagnostic (diagnostic return false; } + if (diagnostic->kind != DK_NOTE) +diagnostic_check_max_errors (context); + context->lock++; if (diagnostic->kind == DK_ICE || diagnostic->kind == DK_ICE_NOBT) Index: gcc/diagnostic.h === --- gcc/diagnostic.h (revision 243115) +++ gcc/diagnostic.h (working copy) @@ -320,6 +320,7 @@ void default_diagnostic_start_span_fn (d void default_diagnostic_finalizer (diagnostic_context *, diagnostic_info *); void diagnostic_set_caret_max_width (diagnostic_context *context, int value); void diagnostic_action_after_output (diagnostic_context *, diagnostic_t); +void diagnostic_check_max_errors (diagnostic_context *, bool flush = false); void diagnostic_file_cache_fini (void); Index: gcc/fortran/error.c === --- gcc/fortran/error.c (revision 243115) +++ gcc/fortran/error.c (working copy) @@ -1226,6 +1226,7 @@ gfc_warning_check (void) diagnostic_action_after_output (global_dc, warningcount_buffered ? DK_WARNING : DK_ERROR); + diagnostic_check_max_errors (global_dc, true); } } @@ -1370,6 +1371,7 @@ gfc_error_check (void) gcc_assert (gfc_output_buffer_empty_p (pp_error_buffer)); pp->buffer = tmp_buffer; diagnostic_action_after_output (global_dc, DK_ERROR); + diagnostic_check_max_errors (global_dc, true); return true; } Index: gcc/testsuite/c-c++-common/fmax-errors.c === --- gcc/testsuite/c-c++-common/fmax-errors.c (revision 243115) +++ gcc/testsuite/c-c++-common/fmax-errors.c (working copy) @@ -1,11 +1,21 @@ /* PR c/44782 */ /* { dg-do compile } */ -/* { dg-options "-fmax-errors=3" } */ +/* { dg-options "-fmax-errors=3 -Wall" } */ void foo (unsigned int i, unsigned int j) { (i) (); /* { dg-error "" } */ (j) (); /* { dg-error "" } */ - (i+j) (); /* { dg-error "" } */ + + i + j; /* { dg-warning "" } */ + + (k) (); /* { dg-error "" } */ + /* Make sure we see the notes related to the final error we emit. */ + /* { dg-message "identifier" "" { target c } 12 } */ + + /* Warnings after the final error should not appear. */ + i + j; /* no warning. */ + (i*j) (); /* no error here due to -fmax-errors */ + } /* { dg-prune-output "compilation terminated" } */
Re: [PATCH] EVRP edge info refactoring and simple improvement
On Fri, 2 Dec 2016, Richard Biener wrote: > > The following refactors range extraction from edges and makes EVRP > able to merge such edge based ranges for the case of multiple > predecessors. This allows it to catch anti-ranges from > if (a < 4 || a > 8) if that is not re-written to a single test like > when using gotos. > > I don't expect this to catch very much in practice but the refactoring > means we can incorporate the tricks from register_edge_assert_for > more easily (we "simply" have to use extract_ranges_from_edge as the > one-and-true kind-of interface). Like the following, preliminary testing shows FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate minv_[0-9]* == 5 to 0" FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate minv_[0-9]* == 6 to 0" FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate maxv_[0-9]* == 5 to 0" FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate maxv_[0-9]* == 6 to 0" FAIL: gcc.dg/tree-ssa/vrp35.c scan-tree-dump vrp1 "Removing dead stmt [^\r\n]* = j_.* == 10" FAIL: gcc.dg/tree-ssa/vrp36.c scan-tree-dump vrp1 "Removing dead stmt [^\r\n]* = i_.* == 1" so more things are done by EVRP. More testing next week because I expected more VRP testcase fallout. But as expected this allows for the single-test if (a < 4 || a > 8) variant. Richard. 2016-12-02 Richard Biener * tree-vrp.c (assert_info): New struct. (add_assert_info): New helper. (register_edge_assert_for_2): Refactor to add asserts to a vector of assert_info. (register_edge_assert_for_1): Likewise. (register_edge_assert_for): Likewise. (finish_register_edge_assert_for): New helper actually registering asserts where live on edge. (find_conditional_asserts): Adjust. (find_switch_asserts): Likewise. (evrp_dom_walker::try_find_new_range): Generalize. (evrp_dom_walker::extract_ranges_from_edge): Use register_edge_assert_for. (evrp_dom_walker::before_dom_children): Adjust. diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 592d3b0..62d0e9d 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -89,6 +89,21 @@ static tree vrp_evaluate_conditional_warnv_with_ops (enum tree_code, tree, tree, bool, bool *, bool *); +struct assert_info +{ + /* Predicate code for the ASSERT_EXPR. Must be COMPARISON_CLASS_P. */ + enum tree_code comp_code; + + /* Name to register the assert for. */ + tree name; + + /* Value being compared against. */ + tree val; + + /* Expression to compare. */ + tree expr; +}; + /* Location information for ASSERT_EXPRs. Each instance of this structure describes an ASSERT_EXPR for an SSA name. Since a single SSA name may have more than one assertion associated with it, these @@ -4956,6 +4971,19 @@ debug_all_asserts (void) dump_all_asserts (stderr); } +/* Push the assert info for NAME, EXPR, COMP_CODE and VAL to ASSERTS. */ + +static void +add_assert_info (vec &asserts, +tree name, tree expr, enum tree_code comp_code, tree val) +{ + assert_info info; + info.comp_code = comp_code; + info.name = name; + info.val = val; + info.expr = expr; + asserts.safe_push (info); +} /* If NAME doesn't have an ASSERT_EXPR registered for asserting 'EXPR COMP_CODE VAL' at a location that dominates block BB or @@ -5172,9 +5200,10 @@ masked_increment (const wide_int &val_in, const wide_int &mask, Invert the condition COND if INVERT is true. */ static void -register_edge_assert_for_2 (tree name, edge e, gimple_stmt_iterator bsi, +register_edge_assert_for_2 (tree name, edge e, enum tree_code cond_code, - tree cond_op0, tree cond_op1, bool invert) + tree cond_op0, tree cond_op1, bool invert, + vec &asserts) { tree val; enum tree_code comp_code; @@ -5185,10 +5214,8 @@ register_edge_assert_for_2 (tree name, edge e, gimple_stmt_iterator bsi, invert, &comp_code, &val)) return; - /* Only register an ASSERT_EXPR if NAME was found in the sub-graph - reachable from E. */ - if (live_on_edge (e, name)) -register_new_assert_for (name, name, comp_code, val, NULL, e, bsi); + /* Queue the assert. */ + add_assert_info (asserts, name, name, comp_code, val); /* In the case of NAME <= CST and NAME being defined as NAME = (unsigned) NAME2 + CST2 we can assert NAME2 >= -CST2 @@ -5228,8 +5255,7 @@ register_edge_assert_for_2 (tree name, edge e, gimple_stmt_iterator bsi, && TREE_CODE (name3) == SSA_NAME && (cst2 == NULL_TREE || TREE_CODE (cst2) == INTEGER_CST) - && INTEGRAL_TYPE_P (TREE_TYPE (name3)) - && live_on_edge (e, name3)) + &&
Re: [PATCH GCC]Simplify (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2)
On Fri, Dec 2, 2016 at 12:58 PM, Bin.Cheng wrote: > On Wed, Nov 30, 2016 at 3:10 PM, Richard Biener > wrote: >> On Fri, Nov 18, 2016 at 5:53 PM, Bin Cheng wrote: >>> Hi, >>> This is a rework of >>> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02007.html. >>> Though review comments suggested it could be merged with last kind >>> simplification >>> of fold_cond_expr_with_comparison, it's not really applicable. As a matter >>> of fact, >>> the suggestion stands for patch >>> @https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02005.html. >>> I had previous patch >>> (https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01898.html) >>> moving fold_cond_expr_with_comparison to match.pd pattern and incorporated >>> that patch into it. For this one, the rework is trivial, just renames >>> several variable >>> tags as suggested. Bootstrap and test on x86_64 and AArch64, is it OK? >> >> + A) Operand x is a unsigned to signed type conversion and c1 is >> + integer zero. In this case, >> + (signed type)x < 0 <=> x > MAX_VAL(signed type) >> + (signed type)x >= 0 <=> x <= MAX_VAL(signed type) >> >> for (singed type)x < 0 -> x > signed-type-max we probably do a reverse >> "canonicalization" transform? Yeah, >> >> /* Non-equality compare simplifications from fold_binary */ >> (for cmp (lt gt le ge) >> ... >> (if (wi::eq_p (@1, signed_max) >> && TYPE_UNSIGNED (arg1_type) >> /* We will flip the signedness of the comparison operator >> associated with the mode of @1, so the sign bit is >> specified by this mode. Check that @1 is the signed >> max associated with this sign bit. */ >> && prec == GET_MODE_PRECISION (TYPE_MODE (arg1_type)) >> /* signed_type does not work on pointer types. */ >> && INTEGRAL_TYPE_P (arg1_type)) >> /* The following case also applies to X < signed_max+1 >> and X >= signed_max+1 because previous transformations. */ >> (if (cmp == LE_EXPR || cmp == GT_EXPR) >>(with { tree st = signed_type_for (arg1_type); } >> (if (cmp == LE_EXPR) >> (ge (convert:st @0) { build_zero_cst (st); }) >> (lt (convert:st @0) { build_zero_cst (st); })) >> >> + if (cmp_code == GE_EXPR) >> + cmp_code = LE_EXPR; >> + c1 = wide_int_to_tree (op_type, wi::max_value (to_type)); >> + } >> ... >> + if (op == PLUS_EXPR) >> + real_c1 = wide_int_to_tree (op_type, >> + wi::sub (c3, c2, sgn, &overflow)); >> + else >> + real_c1 = wide_int_to_tree (op_type, >> + wi::add (c3, c2, sgn, &overflow)); >> >> can you avoid the tree building here and just continue using wide-ints >> please? >> Simply do the wide_int_to_tree in the result patterns. > Hi, > I updated patch wrto your comments, also deleted two useless > variables. Bootstrap and test, is it OK? Ok. Thanks, Richard. > Thanks, > bin > > 2016-12-01 Bin Cheng > > * match.pd: Add new pattern: > (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2). > > gcc/testsuite/ChangeLog > 2016-12-01 Bin Cheng > > * gcc.dg/fold-bopcond-1.c: New test. > * gcc.dg/fold-bopcond-2.c: New test. > >> >> Otherwise looks ok to me. >> >> Thanks, >> Richard. >> >> >>> Thanks, >>> bin >>> >>> 2016-11-17 Bin Cheng >>> >>> * match.pd: Add new pattern: >>> (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2). >>> >>> gcc/testsuite/ChangeLog >>> 2016-11-17 Bin Cheng >>> >>> * gcc.dg/fold-bopcond-1.c: New test. >>> * gcc.dg/fold-bopcond-2.c: New test.
Re: [PATCHv2 6/7, GCC, ARM, V8M] ARMv8-M Security Extension's cmse_nonsecure_call: use __gnu_cmse_nonsecure_call
On 23/11/16 11:53, Andre Vieira (lists) wrote: > On 11/11/16 16:19, Kyrill Tkachov wrote: >> And CC'ing Ramana and Richard this time... >> > > Hi, > > After some extra testing I found that the sibcall optimization was not > disabled for calls to function pointers with the cmse_nonsecure_call > attribute, causing the clearing and call to the function wrapper to be > skipped. This would result in an illegal branch into secure memory and > would HardFault. > > Added a test. > > Is this OK? > > Cheers, > Andre > > *** gcc/ChangeLog *** > 2016-11-xx Andre Vieira > Thomas Preud'homme > > * config/arm/arm.c (detect_cmse_nonsecure_call): New. > (cmse_nonsecure_call_clear_caller_saved): New. > (arm_reorg): Use cmse_nonsecure_call_clear_caller_saved. > (arm_function_ok_for_sibcall): Disable sibcalls for > cmse_nonsecure_call. > * config/arm/arm-protos.h (detect_cmse_nonsecure_call): New. > * config/arm/arm.md (call): Handle cmse_nonsecure_entry. > (call_value): Likewise. > (nonsecure_call_internal): New. > (nonsecure_call_value_internal): New. > * config/arm/thumb1.md (*nonsecure_call_reg_thumb1_v5): New. > (*nonsecure_call_value_reg_thumb1_v5): New. > * config/arm/thumb2.md (*nonsecure_call_reg_thumb2): New. > (*nonsecure_call_value_reg_thumb2): New. > * config/arm/unspecs.md (UNSPEC_NONSECURE_MEM): New. > > *** libgcc/ChangeLog *** > 2016-11-xx Andre Vieira > Thomas Preud'homme > > * config/arm/cmse_nonsecure_call.S: New. > * config/arm/t-arm: Compile cmse_nonsecure_call.S > > > *** gcc/testsuite/ChangeLog *** > 2016-11-xx Andre Vieira > Thomas Preud'homme > > * gcc.target/arm/cmse/cmse.exp: Run tests in mainline dir. > * gcc.target/arm/cmse/cmse-9.c: Added some extra tests. > * gcc.target/arm/cmse/cmse-14.c: New. > * gcc.target/arm/cmse/baseline/bitfield-4.c: New. > * gcc.target/arm/cmse/baseline/bitfield-5.c: New. > * gcc.target/arm/cmse/baseline/bitfield-6.c: New. > * gcc.target/arm/cmse/baseline/bitfield-7.c: New. > * gcc.target/arm/cmse/baseline/bitfield-8.c: New. > * gcc.target/arm/cmse/baseline/bitfield-9.c: New. > * gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: New. > * gcc.target/arm/cmse/baseline/cmse-11.c: New. > * gcc.target/arm/cmse/baseline/cmse-13.c: New. > * gcc.target/arm/cmse/baseline/cmse-6.c: New. > * gcc/testsuite/gcc.target/arm/cmse/baseline/union-1.c: New. > * gcc/testsuite/gcc.target/arm/cmse/baseline/union-2.c: New. > * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: New. > * gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: New. > * gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: New. > * gcc.target/arm/cmse/mainline/hard/cmse-13.c: New. > * gcc.target/arm/cmse/mainline/hard/cmse-7.c: New. > * gcc.target/arm/cmse/mainline/hard/cmse-8.c: New. > * gcc.target/arm/cmse/mainline/soft/cmse-13.c: New. > * gcc.target/arm/cmse/mainline/soft/cmse-7.c: New. > * gcc.target/arm/cmse/mainline/soft/cmse-8.c: New. > * gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: New. > * gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: New. > * gcc.target/arm/cmse/mainline/softfp/cmse-13.c: New. > * gcc.target/arm/cmse/mainline/softfp/cmse-7.c: New. > * gcc.target/arm/cmse/mainline/softfp/cmse-8.c: New. > Hi, To make the clearing of registers consistent between single and double precision I decided to clear all FP registers with 0. The callee-saved registers, saved, cleared and restored in the library wrapper we can do this without much penalty to performance. The caller-saved registers are compiler generated and currently generate a 'vldr' instruction, per cleared (sp or dp) register. This is far from optimal, but it works and it is "safer". I have some ideas to improve this, for instance using r0-r1 to clear the FP registers, since they will either contain the address of the callback function or an argument value, either way they will never contain secret information. I will address this at a later time. Changed the tests to reflect these changes. No changes to the ChangeLog. Is this OK? Cheers, Andre diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 634a5de05472d562972d9ad84b5858691715714c..05d73ab2271d28ce8d6d49199341d4365f0a2bcb 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -137,6 +137,7 @@ extern int arm_const_double_inline_cost (rtx); extern bool arm_const_double_by_parts (rtx); extern bool arm_const_double_by_immediates (rtx); extern void arm_emit_call_insn (rtx, rtx, bool); +bool detect_cmse_nonsecure_call (tree); extern const char *output_call (rtx *); void arm_emit_movpair (rtx, rtx); extern const char *output_mov_long_double_arm_fr
[Patch, Fortran, OOP] PR 43207: [OOP] invalid (pointer) assignment to and from abstract non-polymorphic expressions
Hi all, the attached patch fixes the PR in the subject line by introducing a new check to reject invalid code. It's a slight update of an old patch that I posted in the PR quite some time ago, using somewhat tighter checking to avoid side effects on the testsuite. Regtests cleanly on x86_64-linux-gnu. Ok for trunk? Cheers, Janus 2016-12-02 Janus Weil PR fortran/43207 * primary.c (gfc_match_varspec): Reject nonpolymorphic references to abstract types. 2016-12-02 Janus Weil PR fortran/43207 * gfortran.dg/abstract_type_9.f90: New test case. Index: gcc/fortran/primary.c === --- gcc/fortran/primary.c (revision 243176) +++ gcc/fortran/primary.c (working copy) @@ -,7 +,15 @@ check_substring: } } - /* F2008, C727. */ + /* F08:C611. */ + if (primary->ts.type == BT_DERIVED && primary->ref + && primary->ts.u.derived && primary->ts.u.derived->attr.abstract) +{ + gfc_error ("Nonpolymorphic reference to abstract type at %C"); + return MATCH_ERROR; +} + + /* F08:C727. */ if (primary->expr_type == EXPR_PPC && gfc_is_coindexed (primary)) { gfc_error ("Coindexed procedure-pointer component at %C"); ! { dg-do compile } ! ! PR 43207: [OOP] invalid (pointer) assignment to and from abstract non-polymorphic expressions ! ! Contributed by Tobias Burnus implicit none type, abstract :: parent integer :: i end type type, extends(parent) :: child class(parent), pointer :: comp end type type(child), target :: c1 class(child), allocatable :: c2 class(parent), pointer :: cp c1%parent = c1%parent ! { dg-error "Nonpolymorphic reference to abstract type" } c2%parent = c1%parent ! { dg-error "Nonpolymorphic reference to abstract type" } cp => c1%comp cp => c1%parent! { dg-error "Nonpolymorphic reference to abstract type" } call sub(c1%comp) call sub(c1%parent)! { dg-error "Nonpolymorphic reference to abstract type" } contains subroutine sub(arg) class(parent) :: arg end subroutine end
Re: [PATCHv2 6/7, GCC, ARM, V8M] ARMv8-M Security Extension's cmse_nonsecure_call: use __gnu_cmse_nonsecure_call
Hi Andre, On 02/12/16 13:36, Andre Vieira (lists) wrote: On 23/11/16 11:53, Andre Vieira (lists) wrote: On 11/11/16 16:19, Kyrill Tkachov wrote: And CC'ing Ramana and Richard this time... Hi, After some extra testing I found that the sibcall optimization was not disabled for calls to function pointers with the cmse_nonsecure_call attribute, causing the clearing and call to the function wrapper to be skipped. This would result in an illegal branch into secure memory and would HardFault. Added a test. Is this OK? Cheers, Andre *** gcc/ChangeLog *** 2016-11-xx Andre Vieira Thomas Preud'homme * config/arm/arm.c (detect_cmse_nonsecure_call): New. (cmse_nonsecure_call_clear_caller_saved): New. (arm_reorg): Use cmse_nonsecure_call_clear_caller_saved. (arm_function_ok_for_sibcall): Disable sibcalls for cmse_nonsecure_call. * config/arm/arm-protos.h (detect_cmse_nonsecure_call): New. * config/arm/arm.md (call): Handle cmse_nonsecure_entry. (call_value): Likewise. (nonsecure_call_internal): New. (nonsecure_call_value_internal): New. * config/arm/thumb1.md (*nonsecure_call_reg_thumb1_v5): New. (*nonsecure_call_value_reg_thumb1_v5): New. * config/arm/thumb2.md (*nonsecure_call_reg_thumb2): New. (*nonsecure_call_value_reg_thumb2): New. * config/arm/unspecs.md (UNSPEC_NONSECURE_MEM): New. *** libgcc/ChangeLog *** 2016-11-xx Andre Vieira Thomas Preud'homme * config/arm/cmse_nonsecure_call.S: New. * config/arm/t-arm: Compile cmse_nonsecure_call.S *** gcc/testsuite/ChangeLog *** 2016-11-xx Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse.exp: Run tests in mainline dir. * gcc.target/arm/cmse/cmse-9.c: Added some extra tests. * gcc.target/arm/cmse/cmse-14.c: New. * gcc.target/arm/cmse/baseline/bitfield-4.c: New. * gcc.target/arm/cmse/baseline/bitfield-5.c: New. * gcc.target/arm/cmse/baseline/bitfield-6.c: New. * gcc.target/arm/cmse/baseline/bitfield-7.c: New. * gcc.target/arm/cmse/baseline/bitfield-8.c: New. * gcc.target/arm/cmse/baseline/bitfield-9.c: New. * gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: New. * gcc.target/arm/cmse/baseline/cmse-11.c: New. * gcc.target/arm/cmse/baseline/cmse-13.c: New. * gcc.target/arm/cmse/baseline/cmse-6.c: New. * gcc/testsuite/gcc.target/arm/cmse/baseline/union-1.c: New. * gcc/testsuite/gcc.target/arm/cmse/baseline/union-2.c: New. * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: New. * gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: New. * gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: New. * gcc.target/arm/cmse/mainline/hard/cmse-13.c: New. * gcc.target/arm/cmse/mainline/hard/cmse-7.c: New. * gcc.target/arm/cmse/mainline/hard/cmse-8.c: New. * gcc.target/arm/cmse/mainline/soft/cmse-13.c: New. * gcc.target/arm/cmse/mainline/soft/cmse-7.c: New. * gcc.target/arm/cmse/mainline/soft/cmse-8.c: New. * gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: New. * gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: New. * gcc.target/arm/cmse/mainline/softfp/cmse-13.c: New. * gcc.target/arm/cmse/mainline/softfp/cmse-7.c: New. * gcc.target/arm/cmse/mainline/softfp/cmse-8.c: New. Hi, To make the clearing of registers consistent between single and double precision I decided to clear all FP registers with 0. The callee-saved registers, saved, cleared and restored in the library wrapper we can do this without much penalty to performance. The caller-saved registers are compiler generated and currently generate a 'vldr' instruction, per cleared (sp or dp) register. This is far from optimal, but it works and it is "safer". I have some ideas to improve this, for instance using r0-r1 to clear the FP registers, since they will either contain the address of the callback function or an argument value, either way they will never contain secret information. I will address this at a later time. Changed the tests to reflect these changes. No changes to the ChangeLog. Is this OK? Thanks, I much prefer the consistency. This is ok. I believe all patches in this series have been approved now, so you can go ahead and commit them. Please keep an eye out for fallout over the next week. Kyrill Cheers, Andre
Re: [PATCH] EVRP edge info refactoring and simple improvement
On Fri, 2 Dec 2016, Richard Biener wrote: > On Fri, 2 Dec 2016, Richard Biener wrote: > > > > > The following refactors range extraction from edges and makes EVRP > > able to merge such edge based ranges for the case of multiple > > predecessors. This allows it to catch anti-ranges from > > if (a < 4 || a > 8) if that is not re-written to a single test like > > when using gotos. > > > > I don't expect this to catch very much in practice but the refactoring > > means we can incorporate the tricks from register_edge_assert_for > > more easily (we "simply" have to use extract_ranges_from_edge as the > > one-and-true kind-of interface). > > Like the following, preliminary testing shows > > FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate > minv_[0-9]* == 5 to 0" > FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate > minv_[0-9]* == 6 to 0" > FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate > maxv_[0-9]* == 5 to 0" > FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate > maxv_[0-9]* == 6 to 0" > FAIL: gcc.dg/tree-ssa/vrp35.c scan-tree-dump vrp1 "Removing dead stmt > [^\r\n]* = j_.* == 10" > FAIL: gcc.dg/tree-ssa/vrp36.c scan-tree-dump vrp1 "Removing dead stmt > [^\r\n]* = i_.* == 1" > > so more things are done by EVRP. More testing next week because > I expected more VRP testcase fallout. But as expected this allows > for the single-test if (a < 4 || a > 8) variant. It also allows to optimize void f (unsigned a) { if (a < 4 || a > 8) goto x; if (a > 5 && a < 9) goto x; return; x: if (a == 5) __builtin_abort (); } which VRP does not handle, only reassoc can do this (implementing folds simplifications, so writing (a < 4 || a > 8 || (a > 5 && a < 9)) simplifies this in fold already. Richard. > Richard. > > 2016-12-02 Richard Biener > > * tree-vrp.c (assert_info): New struct. > (add_assert_info): New helper. > (register_edge_assert_for_2): Refactor to add asserts to a vector > of assert_info. > (register_edge_assert_for_1): Likewise. > (register_edge_assert_for): Likewise. > (finish_register_edge_assert_for): New helper actually registering > asserts where live on edge. > (find_conditional_asserts): Adjust. > (find_switch_asserts): Likewise. > (evrp_dom_walker::try_find_new_range): Generalize. > (evrp_dom_walker::extract_ranges_from_edge): Use > register_edge_assert_for. > (evrp_dom_walker::before_dom_children): Adjust. > > diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c > index 592d3b0..62d0e9d 100644 > --- a/gcc/tree-vrp.c > +++ b/gcc/tree-vrp.c > @@ -89,6 +89,21 @@ static tree vrp_evaluate_conditional_warnv_with_ops (enum > tree_code, >tree, tree, bool, bool *, >bool *); > > +struct assert_info > +{ > + /* Predicate code for the ASSERT_EXPR. Must be COMPARISON_CLASS_P. */ > + enum tree_code comp_code; > + > + /* Name to register the assert for. */ > + tree name; > + > + /* Value being compared against. */ > + tree val; > + > + /* Expression to compare. */ > + tree expr; > +}; > + > /* Location information for ASSERT_EXPRs. Each instance of this > structure describes an ASSERT_EXPR for an SSA name. Since a single > SSA name may have more than one assertion associated with it, these > @@ -4956,6 +4971,19 @@ debug_all_asserts (void) >dump_all_asserts (stderr); > } > > +/* Push the assert info for NAME, EXPR, COMP_CODE and VAL to ASSERTS. */ > + > +static void > +add_assert_info (vec &asserts, > + tree name, tree expr, enum tree_code comp_code, tree val) > +{ > + assert_info info; > + info.comp_code = comp_code; > + info.name = name; > + info.val = val; > + info.expr = expr; > + asserts.safe_push (info); > +} > > /* If NAME doesn't have an ASSERT_EXPR registered for asserting > 'EXPR COMP_CODE VAL' at a location that dominates block BB or > @@ -5172,9 +5200,10 @@ masked_increment (const wide_int &val_in, const > wide_int &mask, > Invert the condition COND if INVERT is true. */ > > static void > -register_edge_assert_for_2 (tree name, edge e, gimple_stmt_iterator bsi, > +register_edge_assert_for_2 (tree name, edge e, > enum tree_code cond_code, > - tree cond_op0, tree cond_op1, bool invert) > + tree cond_op0, tree cond_op1, bool invert, > + vec &asserts) > { >tree val; >enum tree_code comp_code; > @@ -5185,10 +5214,8 @@ register_edge_assert_for_2 (tree name, edge e, > gimple_stmt_iterator bsi, > invert, &comp_code, &val)) > return; > > - /* Only register an ASSERT_EXPR if NAME was found in the sub-graph > - reachable from E. */ > - if (live_on_edge (e, name)) > -register
[hsa] Exclude parallel outlines from hsa_callable_functions_p
Hi, after the merge of nvidia OpenMP implementation, the normal parallel outline functions were also marked as "omp declare target" which lead to them being cloned and compiled to HSA which is not only unnecessary but often leads to a lot of useless HSA warning noise. The following patch deal with this issue by making sure they are not considered callable from HSA. Bootstrapped and tested on x86_64-linux. I will commit it to trunk in a few moments (it is already part of my most recent merge from trunk to the hsa branch. Thanks, Martin Exclude parallel outlines from hsa_callable_functions_p 2016-11-29 Martin Jambor * hsa.c (hsa_callable_function_p): Return false for artificial functions. --- gcc/hsa.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/gcc/hsa.c b/gcc/hsa.c index f881e78..31e3252 100644 --- a/gcc/hsa.c +++ b/gcc/hsa.c @@ -90,7 +90,10 @@ bool hsa_callable_function_p (tree fndecl) { return (lookup_attribute ("omp declare target", DECL_ATTRIBUTES (fndecl)) - && !lookup_attribute ("oacc function", DECL_ATTRIBUTES (fndecl))); + && !lookup_attribute ("oacc function", DECL_ATTRIBUTES (fndecl)) + /* At this point, this is enough to identify clones for +parallel, which for HSA would need to be kernels anyway. */ + && !DECL_ARTIFICIAL (fndecl)); } /* Allocate HSA structures that are are used when dealing with different -- 2.10.2
C++ PATCH to stop inheriting copy constructors
In C++14 copy constructors were excluded from the inherited constructor set; Ville noticed that this was changed in the new inherited constructor rules, and lobbied for us to retain the old behavior in that case. This patch implements the proposed resolution. Tested x86_64-pc-linux-gnu, applied to trunk. 2016-12-01 Jason Merrill * call.c (add_function_candidate): Exclude inherited copy/move ctors. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 97003e5..b7aa97c 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -2042,6 +2042,24 @@ add_function_candidate (struct z_candidate **candidates, reason = arity_rejection (first_arg, i + remaining, len); } + /* An inherited constructor (12.6.3 [class.inhctor.init]) that has a first + parameter of type "reference to cv C" (including such a constructor + instantiated from a template) is excluded from the set of candidate + functions when used to construct an object of type D with an argument list + containing a single argument if C is reference-related to D. */ + if (viable && len == 1 && parmlist && DECL_CONSTRUCTOR_P (fn) + && flag_new_inheriting_ctors + && DECL_INHERITED_CTOR (fn)) +{ + tree ptype = non_reference (TREE_VALUE (parmlist)); + tree dtype = DECL_CONTEXT (fn); + if (reference_related_p (ptype, dtype)) + { + viable = false; + reason = inherited_ctor_rejection (); + } +} + /* Second, for a function to be viable, its constraints must be satisfied. */ if (flag_concepts && viable @@ -2142,18 +2160,6 @@ add_function_candidate (struct z_candidate **candidates, } } - /* Don't consider inherited constructors for initialization from an -expression of the same or derived type. */ - /* FIXME extend to operator=. */ - if (i == 0 && len == 1 - && DECL_INHERITED_CTOR (fn) - && reference_related_p (ctype, argtype)) - { - viable = 0; - reason = inherited_ctor_rejection (); - goto out; - } - /* Core issue 899: When [copy-]initializing a temporary to be bound to the first parameter of a copy constructor (12.8) called with a single argument in the context of direct-initialization,
Re: [PATCH] avoid calling alloca(0)
On 12/01/16 19:39, Jeff Law wrote: > On 11/30/2016 09:09 PM, Martin Sebor wrote: >>> What I think this tells us is that we're not at a place where we're >>> clean. But we can incrementally get there. The warning is only >>> catching a fairly small subset of the cases AFAICT. That's not unusual >>> and analyzing why it didn't trigger on those cases might be useful as >>> well. >> >> The warning has no smarts. It relies on constant propagation and >> won't find a call unless it sees it's being made with a constant >> zero. Looking at the top two on the list the calls are in extern >> functions not called from the same source file, so it probably just >> doesn't see that the functions are being called from another file >> with a zero. Building GCC with LTO might perhaps help. > Right. This is consistent with the limitations of other similar > warnings such as null pointer dereferences. > >> >>> So where does this leave us for gcc-7? I'm wondering if we drop the >>> warning in, but not enable it by default anywhere. We fix the cases we >>> can (such as reg-stack,c tree-ssa-threadedge.c, maybe others) before >>> stage3 closes, and shoot for the rest in gcc-8, including improvign the >>> warning (if there's something we can clearly improve), and enabling the >>> warning in -Wall or -Wextra. >> >> I'm fine with deferring the GCC fixes and working on the cleanup >> over time but I don't think that needs to gate enabling the option >> with -Wextra. The warnings can be suppressed or prevented from >> causing errors during a GCC build either via a command line option >> or by pragma in the code. AFAICT, from the other warnings I see >> go by, this is what has been done for -Wno-implicit-fallthrough >> while those warnings are being cleaned up. Why not take the same >> approach here? > The difference vs implicit fallthrough is that new instances of implicit > fallthrus aren't likely to be exposed by changes in IL that occur due to > transformations in the optimizer pipeline. > > Given the number of runtime triggers vs warnings, we know there's many > instances of passing 0 to the allocators that we're not diagnosing. I > can easily see differences in the early IL (such as those due to > BRANCH_COST differing for targets) exposing/hiding cases where 0 flows > into the allocator argument. Similarly for changes in inlining > decisions, jump threading, etc for profiled bootstraps. I'd like to > avoid playing wack-a-mole right now. > > So I'm being a bit more conservative here. Maybe it'd be appropriate in > Wextra since that's not enabled by default for GCC builds. > actually it is enabled, by -W -Wall ... > >> >> As much as I would like to improve the warning itself I'm also not >> sure I see much of an opportunity for it. It's not prone to high >> rates of false positives (hardly any in fact) and the cases it >> misses are those where it simply doesn't see the argument value >> because it's not been made available by constant propagation. > There's always ways :-) For example, I wouldn't be at all surprised if > you found PHIs that feed the allocation where one or more of the PHI > arguments are 0. > > >> >> That said, I consider the -Walloc-size-larger-than warning to be >> the more important part of the patch by far. I'd hate a lack of >> consensus on how to deal with GCC's handful of instances of >> alloca(0) to stall the rest of the patch. > Agreed on not wanting alloca(0) handling to stall the rest of the patch. I'd say negative arguments on alloca should always diagnosed, as that does increment the stack in reverse direction. And that is always very dangerous. I think the default of -Walloc-size-larger-than should be changed as Martin suggested to SIZE_T_MAX/2, I would make that the default. And keep alloca and malloc size limits separate warnings. Bernd.
Re: [PATCH] fix -fmax-errors & notes, take 2
On 12/02/2016 02:25 PM, Nathan Sidwell wrote: +/* Check if we've met the maximum error limit. */ Arguments should be documented. +void +diagnostic_check_max_errors (diagnostic_context *context, bool flush) +{ + if (!context->max_errors) +return; I prefer to spell that as != 0 since it's not a boolean, but you can also leave it. + int count = (diagnostic_kind_count (context, DK_ERROR) + + diagnostic_kind_count (context, DK_SORRY) + + diagnostic_kind_count (context, DK_WERROR)); + + if (count >= (int) context->max_errors) Looks like there are some unnecessary type mismatches leading to this cast. Maybe declare max_errors as int and remove the cast? Otherwise ok. Bernd
Re: [PATCH] Unset used bit in simplify_replace_* on newly copied rtxs (PR target/78614)
On Fri, Dec 02, 2016 at 12:34:13AM +0100, Jakub Jelinek wrote: > On Thu, Dec 01, 2016 at 11:48:03PM +0100, Bernd Schmidt wrote: > > On 12/01/2016 11:43 PM, Jakub Jelinek wrote: > > > > > >so we'd need to slow shallow_copy_rtx down by adding switch (code) > > >in there or something similar. > > > > Is there reason to assume it's time-critical? IMO eliminating the potential > > for error by not having callers need to do this outweighs that concern. > > Well, it isn't an error not to clear it, just missed optimization if some > caller > cares, and most of the shallow_copy_rtx users really don't care, e.g. > config/i386/i386.md and config/i386/i386.c callers. cfgexpand.c doesn't > either, at that point the used bit isn't set, the calls in cselib.c don't care > either. In emit-rtl.c one place explicitly sets used flag afterwards, another > clears it (that one could be removed if the logic is moved down). The > rs6000.c > use indeed could get rid of the explicit used bit clearing. Most of the > other shallow_copy_rtx callers in the middle-end just don't care. > I think it is really just simplify_replace_fn_rtx where (at least in the > rs6000 case) it is useful to clear it, similar trick is not used in many > places after initial unsharing during expansion before which the bit should > be clear on everything shareable, I saw it in ifcvt.c (set_used_flags > + some copying + unshare_*_chain afterwards). > After expansion when everything is unshared, gradually the used bits are no > longer relevant, they are set on stuff that has been originally unshared and > clear on newly added rtxes. Then the reload/postreload > unshare_all_rtx_again clears it and unshares the shared stuff. > So it is really just about spots that do the trick of set_used_flags, > call some functions where clearing of used bit on new stuff is important, > and finally call copy_rtx_if_shared. Anyway, here is a patch that changes shallow_copy_rtx, bootstrapped/regtested on x86_64-linux and i686-linux. I believe only the callers in the patch of the function actually care about used being 0 though (including the simplify-rtx.c hunks). 2016-11-30 Jakub Jelinek PR target/78614 * rtl.c (copy_rtx): Don't clear used flag here. (shallow_copy_rtx_stat): Clear used flag here unless code the rtx is shareable. * simplify-rtx.c (simplify_replace_fn_rtx): When copying rtx with 'E' in format, copy all vectors. * emit-rtl.c (copy_insn_1): Don't clear used flag here. * valtrack.c (cleanup_auto_inc_dec): Likewise. * config/rs6000/rs6000.c (rs6000_frame_related): Likewise. --- gcc/rtl.c.jj2016-10-31 13:28:12.0 +0100 +++ gcc/rtl.c 2016-12-02 11:01:12.557553040 +0100 @@ -318,10 +318,6 @@ copy_rtx (rtx orig) us to explicitly document why we are *not* copying a flag. */ copy = shallow_copy_rtx (orig); - /* We do not copy the USED flag, which is used as a mark bit during - walks over the RTL. */ - RTX_FLAG (copy, used) = 0; - format_ptr = GET_RTX_FORMAT (GET_CODE (copy)); for (i = 0; i < GET_RTX_LENGTH (GET_CODE (copy)); i++) @@ -367,7 +363,29 @@ shallow_copy_rtx_stat (const_rtx orig ME { const unsigned int size = rtx_size (orig); rtx const copy = ggc_alloc_rtx_def_stat (size PASS_MEM_STAT); - return (rtx) memcpy (copy, orig, size); + memcpy (copy, orig, size); + switch (GET_CODE (orig)) +{ + /* RTX codes copy_rtx_if_shared_1 considers are shareable, +the used flag is often used for other purposes. */ +case REG: +case DEBUG_EXPR: +case VALUE: +CASE_CONST_ANY: +case SYMBOL_REF: +case CODE_LABEL: +case PC: +case CC0: +case RETURN: +case SIMPLE_RETURN: +case SCRATCH: + break; +default: + /* For all other RTXes clear the used flag on the copy. */ + RTX_FLAG (copy, used) = 0; + break; +} + return copy; } /* Nonzero when we are generating CONCATs. */ --- gcc/simplify-rtx.c.jj 2016-12-02 00:15:09.200779256 +0100 +++ gcc/simplify-rtx.c 2016-12-02 10:55:24.283989673 +0100 @@ -547,13 +547,19 @@ simplify_replace_fn_rtx (rtx x, const_rt old_rtx, fn, data); if (op != RTVEC_ELT (vec, j)) { - if (newvec == vec) + if (x == newx) { - newvec = shallow_copy_rtvec (vec); - if (x == newx) - newx = shallow_copy_rtx (x); - XVEC (newx, i) = newvec; + newx = shallow_copy_rtx (x); + /* If we copy X, we need to copy also all + vectors in it, rather than copy only + a subset of them and share the rest. */ + for (int k = 0; fmt[k]; k++) + if (fmt[k] == 'E') + XVEC (newx, k) = shallow_copy_rtvec (XVEC (x, k)); + newvec
Re: C++ PATCH to stop inheriting copy constructors
On 2 December 2016 at 16:00, Jason Merrill wrote: > In C++14 copy constructors were excluded from the inherited > constructor set; Ville noticed that this was changed in the new > inherited constructor rules, and lobbied for us to retain the old > behavior in that case. This patch implements the proposed resolution. s/lobbied/convinced with reasonable rationale/ :) Smashing, thanks for the quick fix.
[C++ Patch] PR 78637 ("ICE on invalid C++ code (internal compiler error: in pop_namespace, at cp/name-lookup.c:3826)")
Hi, this regression is caused by my fix for c++/60385, where I changed push_namespace to early return (a bool) when pushdecl fails. In the present testcase, a different push_namespace call in cp_parser_namespace_definition, for nested namespace definitions, fails, thus returns early, and that is inconsistent with the final loop (pop_namespace ICEs): /* Finish the nested namespace definitions. */ while (nested_definition_count--) pop_namespace (); To fix this problem, I think we can simply increment nested_definition_count only when push_namespace succeeds. Tested x86_64-linux. Thanks, Paolo. // /cp 2016-12-02 Paolo Carlini PR c++/78637 * parser.c (cp_parser_namespace_definition): Increment nested_definition_count only if push_namespace succeeds. /testsuite 2016-12-02 Paolo Carlini PR c++/78637 * g++.dg/parse/namespace14.C: New. Index: cp/parser.c === --- cp/parser.c (revision 243171) +++ cp/parser.c (working copy) @@ -18184,8 +18184,8 @@ cp_parser_namespace_definition (cp_parser* parser) cp_parser_error (parser, "nested identifier required"); break; } - ++nested_definition_count; - push_namespace (identifier); + if (push_namespace (identifier)) + ++nested_definition_count; } } Index: testsuite/g++.dg/parse/namespace14.C === --- testsuite/g++.dg/parse/namespace14.C(revision 0) +++ testsuite/g++.dg/parse/namespace14.C(working copy) @@ -0,0 +1,6 @@ +// PR c++/78637 + +namespace X { +class Y; +} +namespace X::Y z; // { dg-error "namespace|expected|type" }
[PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)
Hi! While the https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01087.html added *andndi3_doubleword, I don't understand how it can actually work. The problem is that this pattern is for combine, and needs combining of DImode NOT setter with DImode AND user. But there is no DImode pattern for one_cmpldi2, so it is lowered early into two one_cmplsi2 patterns and I can't see how combine would be able to figure stuff out from that. This patch: 1) adds one_cmpldi2 pattern for stv purposes (which splits into two one_cmplsi2 after reload) 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones) 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it is for -mbmi only, and adds another *andndi3_doubleword pattern that is meant to live just from combine till the stv pass, or worse case till following split1 pass when it is split back into not followed by and; this change makes it possible to use pandn in stv pass, even without -mbmi Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-12-02 Jakub Jelinek PR target/70322 * config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle NOT. (dimode_scalar_chain::compute_convert_gain): Likewise. (dimode_scalar_chain::convert_insn): Likewise. * config/i386/i386.md (*andndi3_doubleword): New define_insn_and_split. Rename the old define_insn_and_split to... (*andndi3_doubleword_bmi): ... this. (one_cmpl2): Use SWIM1248x iterator instead of SWIM. (*one_cmpldi2_doubleword): New define_insn_and_split. * gcc.target/i386/pr70322-1.c: New test. * gcc.target/i386/pr70322-2.c: New test. * gcc.target/i386/pr70322-3.c: New test. --- gcc/config/i386/i386.c.jj 2016-12-02 11:17:40.702995111 +0100 +++ gcc/config/i386/i386.c 2016-12-02 12:01:31.656469089 +0100 @@ -2826,6 +2826,9 @@ dimode_scalar_to_vector_candidate_p (rtx return false; break; +case NOT: + break; + case REG: return true; @@ -2848,7 +2851,8 @@ dimode_scalar_to_vector_candidate_p (rtx if ((GET_MODE (XEXP (src, 0)) != DImode && !CONST_INT_P (XEXP (src, 0))) - || (GET_MODE (XEXP (src, 1)) != DImode + || (GET_CODE (src) != NOT + && GET_MODE (XEXP (src, 1)) != DImode && !CONST_INT_P (XEXP (src, 1 return false; @@ -3415,6 +3419,8 @@ dimode_scalar_chain::compute_convert_gai if (CONST_INT_P (XEXP (src, 1))) gain -= vector_const_cost (XEXP (src, 1)); } + else if (GET_CODE (src) == NOT) + gain += ix86_cost->add - COSTS_N_INSNS (1); else if (GET_CODE (src) == COMPARE) { /* Assume comparison cost is the same. */ @@ -3770,6 +3776,14 @@ dimode_scalar_chain::convert_insn (rtx_i PUT_MODE (src, V2DImode); break; +case NOT: + src = XEXP (src, 0); + convert_op (&src, insn); + subreg = gen_reg_rtx (V2DImode); + emit_insn_before (gen_move_insn (subreg, CONSTM1_RTX (V2DImode)), insn); + src = gen_rtx_XOR (V2DImode, src, subreg); + break; + case MEM: if (!REG_P (dst)) convert_op (&src, insn); --- gcc/config/i386/i386.md.jj 2016-12-01 23:24:51.663157486 +0100 +++ gcc/config/i386/i386.md 2016-12-02 12:50:27.616829191 +0100 @@ -8534,7 +8534,7 @@ (define_split operands[2] = gen_lowpart (QImode, operands[2]); }) -(define_insn_and_split "*andndi3_doubleword" +(define_insn_and_split "*andndi3_doubleword_bmi" [(set (match_operand:DI 0 "register_operand" "=r") (and:DI (not:DI (match_operand:DI 1 "register_operand" "r")) @@ -8551,6 +8551,27 @@ (define_insn_and_split "*andndi3_doublew (clobber (reg:CC FLAGS_REG))])] "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);") +;; This insn is there only so that the stv pass can use pandn even when +;; BMI is not available. It should be matched during combine and split +;; again in split1, unless the stv pass converts it. +(define_insn_and_split "*andndi3_doubleword" + [(set (match_operand:DI 0 "register_operand" "=r") + (and:DI + (not:DI (match_operand:DI 1 "register_operand" "r")) + (match_operand:DI 2 "nonimmediate_operand" "rm"))) + (clobber (reg:CC FLAGS_REG))] + "!TARGET_BMI && !TARGET_64BIT && TARGET_STV && TARGET_SSE2 + && can_create_pseudo_p ()" + "#" + "&& 1" + [(set (match_dup 3) + (not:DI (match_dup 1))) + (parallel [(set (match_dup 0) + (and:DI (match_dup 3) + (match_dup 2))) + (clobber (reg:CC FLAGS_REG))])] + "operands[3] = gen_reg_rtx (DImode);") + (define_insn "*andn_1" [(set (match_operand:SWI48 0 "register_operand" "=r,r") (and:SWI48 @@ -9313,8 +9334,8 @@ (define_split ;; One complement instructions (define_expand "one_cmpl2" - [(set (match_operand:SWIM 0 "nonimmediate_operand") - (not:SWIM (match_oper
[PATCH] OpenACC executable directives
This patch teaches the C and C++ FEs to expect ACC ENTER/EXIT DATA, ACC UPDATE and ACC WAIT executable directives to be used inside compound statements. This is to prevent situations such as if (needs_wait) #pragma acc wait // do something else here from generating unexpected code when the program is built without -fopenacc. The C and C++ FEs already guard against such usages for OpenMP executable directives. This patch also included a tweak for the diagnostics generated by oacc_enter_exit_data parser. The error message now more accurately reports if an error is detected in an enter or exit data directive. Is this patch OK for trunk? Cesar
Re: [PATCH] OpenACC executable directives
On 12/02/2016 06:37 AM, Cesar Philippidis wrote: > This patch teaches the C and C++ FEs to expect ACC ENTER/EXIT DATA, ACC > UPDATE and ACC WAIT executable directives to be used inside compound > statements. This is to prevent situations such as > > if (needs_wait) > #pragma acc wait > > // do something else here > > from generating unexpected code when the program is built without > -fopenacc. The C and C++ FEs already guard against such usages for > OpenMP executable directives. > > This patch also included a tweak for the diagnostics generated by > oacc_enter_exit_data parser. The error message now more accurately > reports if an error is detected in an enter or exit data directive. > > Is this patch OK for trunk? And here's the patch. Cesar 2016-12-02 Cesar Philippidis James Norris gcc/c/ * c-parser.c (c_parser_pragma): Error when PRAGMA_OACC_{ENTER_DATA, EXIT_DATA,WAIT} are not used in compound statements. (c_parser_oacc_enter_exit_data): Update diagnostics. gcc/cp/ * parser.c (cp_parser_oacc_enter_exit_data): Update diagnostics. (cp_parser_pragma): Error when PRAGMA_OACC_{ENTER_DATA, EXIT_DATA,WAIT} are not used in compound statements. gcc/testsuite/ * c-c++-common/goacc/data-2.c: Adjust test. * c-c++-common/goacc/executeables-1.c: New test. * g++.dg/goacc/data-1.C: Adjust test. diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 00fe731..f7bf9c4 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -10145,10 +10145,24 @@ c_parser_pragma (c_parser *parser, enum pragma_context context, bool *if_p) return false; case PRAGMA_OACC_ENTER_DATA: + if (context != pragma_compound) + { + if (context == pragma_stmt) + c_parser_error (parser, "%<#pragma acc enter data%> may only be " + "used in compound statements"); + goto bad_stmt; + } c_parser_oacc_enter_exit_data (parser, true); return false; case PRAGMA_OACC_EXIT_DATA: + if (context != pragma_compound) + { + if (context == pragma_stmt) + c_parser_error (parser, "%<#pragma acc exit data%> may only be " + "used in compound statements"); + goto bad_stmt; + } c_parser_oacc_enter_exit_data (parser, false); return false; @@ -10305,6 +10319,16 @@ c_parser_pragma (c_parser *parser, enum pragma_context context, bool *if_p) c_parser_cilk_grainsize (parser, if_p); return false; +case PRAGMA_OACC_WAIT: + if (context != pragma_compound) + { + if (context == pragma_stmt) + c_parser_error (parser, "%<#pragma acc enter data%> may only be " + "used in compound statements"); + goto bad_stmt; + } + /* FALL THROUGH. */ + default: if (id < PRAGMA_FIRST_EXTERNAL) { @@ -13871,28 +13895,26 @@ c_parser_oacc_enter_exit_data (c_parser *parser, bool enter) { location_t loc = c_parser_peek_token (parser)->location; tree clauses, stmt; + const char *p = ""; c_parser_consume_pragma (parser); - if (!c_parser_next_token_is (parser, CPP_NAME)) + if (c_parser_next_token_is (parser, CPP_NAME)) { - c_parser_error (parser, enter - ? "expected % in %<#pragma acc enter data%>" - : "expected % in %<#pragma acc exit data%>"); - c_parser_skip_to_pragma_eol (parser); - return; + p = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value); + c_parser_consume_token (parser); } - const char *p = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value); if (strcmp (p, "data") != 0) { - c_parser_error (parser, "invalid pragma"); + error_at (loc, enter + ? "expected % after %<#pragma acc enter%>" + : "expected % after %<#pragma acc exit%>"); + parser->error = true; c_parser_skip_to_pragma_eol (parser); return; } - c_parser_consume_token (parser); - if (enter) clauses = c_parser_oacc_all_clauses (parser, OACC_ENTER_DATA_CLAUSE_MASK, "#pragma acc enter data"); diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 843cbe2..08f5f9e 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -36175,23 +36175,18 @@ static tree cp_parser_oacc_enter_exit_data (cp_parser *parser, cp_token *pragma_tok, bool enter) { + location_t loc = pragma_tok->location; tree stmt, clauses; + const char *p = ""; - if (cp_lexer_next_token_is (parser->lexer, CPP_PRAGMA_EOL) - || cp_lexer_next_token_is_not (parser->lexer, CPP_NAME)) -{ - cp_parser_error (parser, enter - ? "expected % in %<#pragma acc enter data%>" - : "expected % in %<#pragma acc exit data%>"); - cp_parser_skip_to_pragma_eol (parser, pragma_tok); - return NULL_TREE; -} + if (cp_lexer_next_token_is (parser->lexer, CPP_NAME)) +p = IDENTIFIER_POINTER (cp_lexer_peek_token (parser->lexer)->u.value); - const char *p = -IDENTIFIER_POINTER (cp_lexer_peek_token (parser->lexer)->u.value); if (strcmp (p, "data") != 0) { - cp_parser_error (parser, "invalid pragma"); + error_at (loc,
Re: [Patch 0/2 PR78561] Recalculate constant pool size before emitting it
On Thu, Dec 01, 2016 at 12:55:21PM -0500, David Edelsohn wrote: > > James Greenhalgh writes: > > > The patch set has been bootstrapped and tested on aarch64-none-linux-gnu and > > x86-64-none-linux-gnu without any issues. I've also cross-tested it for > > aarch64-none-elf and build-tested it for rs6000 (though I couldn't run the > > testsuite as I don't have a test environment). > > There are PPC64 Linux and AIX systems in the GNU Compile Farm. All > have DejaGNU installed. I bootstrapped and tested a compiler with these patches applied on gcc112, there were no issues. I've applied the patches based on Jeff's OK as revisions 243182, 243183. Thanks, James
Re: [PATCH 8a/9] Introduce class function_reader (v6)
On 12/02/2016 03:00 AM, David Malcolm wrote: Changed in v6: - split out dump-reading selftests into followup patches (target-independent, and target-specific) - removes unneeded headers from read-rtl-function.c - numerous other cleanups identified in review Ok, starting to look very close to done. It appears I accidentally made you change element->directive. It seems like a good change but all I wanted to point out was a plural where I would have expected a singular. + /* Lookup param by name. */ + tree t_param = parse_mem_expr (name); + // TODO: what if not found? Error out? +/* Implementation of rtx_reader::handle_unknown_directive. + + Require a top-level "function" directive, as emitted by + print_rtx_function, and parse it. */ Should document START_LOC and NAME. + /* Do we need this to force cgraphunit.c to output the function? */ Well, what happens if you don't do this? Seems reasonable anyway, so maybe lose the comment. +/* Parse rtx instructions by calling read_rtx_code, calling + set_first_insn and set_last_insn as appropriate, and + adding the insn to the insn chain. + Consume the trailing ')'. */ + +rtx_insn * +function_reader::parse_insn (file_location start_loc, const char *name) Once again, please document args - make another pass over all the functions to make sure. + +/* Parse operand IDX of X, of code 'i' or 'n'. "... as specified by FORMAT_CHAR", presumably. + if (0 == strncmp (desc, "orig:", 5)) + { + expect_original_regno = true; + desc_start += 5; + /* Skip to any whitespace following the integer. */ + const char *space = strchr (desc_start, ' '); + if (space) + desc_start = space + 1; + } + /* Any remaining text may be the REG_EXPR. Alternatively we have +no REG_ATTRS, and instead we have ORIGINAL_REGNO. */ + if (ISDIGIT (*desc_start)) + { + /* Assume we have ORIGINAL_REGNO. */ + ORIGINAL_REGNO (x) = atoi (desc_start); + } This looks like an issue. You'll want to have ORIGINAL_REGNOs printed and parsed like other regnos, i.e. with %number for pseudos. Can be done as a followup if it's involved. + /* Verify lookup of hard registers. */ +#ifdef GCC_AARCH64_H + ASSERT_EQ (0, lookup_reg_by_dump_name ("x0")); + ASSERT_EQ (1, lookup_reg_by_dump_name ("x1")); +#endif +#ifdef I386_OPTS_H + ASSERT_EQ (0, lookup_reg_by_dump_name ("ax")); + ASSERT_EQ (1, lookup_reg_by_dump_name ("dx")); +#endif Please just remove this for now; not important enough to break the rule about target-specific bits in generic code. @@ -1103,13 +1244,39 @@ rtx_reader::read_rtx_code (const char *code_name) rtx value; /* Value of this node. */ }; + one_time_initialization (); I still don't understand why this isn't in the rtx_reader constructor? Seems pointless to call this each time we want to parse an rtx. + + /* Use the format_ptr to parse the various operands of this rtx. + read_rtx_operand is a vfunc, allowing the parser to vary between + parsing .md files and parsing .rtl dumps; the function_reader subclass + overrides the defaults when loading RTL dumps, to handle the + various extra attributes emitted by print_rtx. */ for (int idx = 0; format_ptr[idx] != 0; idx++) -read_rtx_operand (return_rtx, idx); +return_rtx = read_rtx_operand (return_rtx, idx); + + /* Call a vfunc to handle the various additional information that + print-rtl.c can write after the regular fields; does nothing when + parsing .md files. */ + handle_any_trailing_information (return_rtx); I still think just drop the bulk of these comments. Virtual functions aren't that arcane. + /* "stringbuf" was allocated within string_obstack and thus has + the its lifetime restricted to that of the rtx_reader. This is + OK for the generator programs, but for non-generator programs, + XSTR and XTMPL fields are meant to be allocated in the GC-managed + heap. Hence we need to allocate a copy in the GC-managed heap + for the non-generator case. */ + const char *string_ptr = finalize_string (stringbuf); Put the comment near the implementation that makes the copy (and maybe one explaining why the no-op in the other implementation), not here. Bernd
Re: [PATCH] OpenACC executable directives
On Fri, Dec 02, 2016 at 06:38:25AM -0800, Cesar Philippidis wrote: > 2016-12-02 Cesar Philippidis > James Norris > > gcc/c/ > * c-parser.c (c_parser_pragma): Error when PRAGMA_OACC_{ENTER_DATA, > EXIT_DATA,WAIT} are not used in compound statements. > (c_parser_oacc_enter_exit_data): Update diagnostics. > > gcc/cp/ > * parser.c (cp_parser_oacc_enter_exit_data): Update diagnostics. > (cp_parser_pragma): Error when PRAGMA_OACC_{ENTER_DATA, > EXIT_DATA,WAIT} are not used in compound statements. > > gcc/testsuite/ > * c-c++-common/goacc/data-2.c: Adjust test. > * c-c++-common/goacc/executeables-1.c: New test. > * g++.dg/goacc/data-1.C: Adjust test. Ok, thanks. Jakub
Re: [PATCH] Unset used bit in simplify_replace_* on newly copied rtxs (PR target/78614)
On 12/02/2016 03:12 PM, Jakub Jelinek wrote: --- gcc/rtl.c.jj2016-10-31 13:28:12.0 +0100 +++ gcc/rtl.c 2016-12-02 11:01:12.557553040 +0100 @@ -318,10 +318,6 @@ copy_rtx (rtx orig) us to explicitly document why we are *not* copying a flag. */ copy = shallow_copy_rtx (orig); - /* We do not copy the USED flag, which is used as a mark bit during - walks over the RTL. */ - RTX_FLAG (copy, used) = 0; - format_ptr = GET_RTX_FORMAT (GET_CODE (copy)); for (i = 0; i < GET_RTX_LENGTH (GET_CODE (copy)); i++) @@ -367,7 +363,29 @@ shallow_copy_rtx_stat (const_rtx orig ME { const unsigned int size = rtx_size (orig); rtx const copy = ggc_alloc_rtx_def_stat (size PASS_MEM_STAT); - return (rtx) memcpy (copy, orig, size); + memcpy (copy, orig, size); + switch (GET_CODE (orig)) +{ + /* RTX codes copy_rtx_if_shared_1 considers are shareable, +the used flag is often used for other purposes. */ +case REG: +case DEBUG_EXPR: +case VALUE: +CASE_CONST_ANY: +case SYMBOL_REF: +case CODE_LABEL: +case PC: +case CC0: +case RETURN: +case SIMPLE_RETURN: +case SCRATCH: + break; +default: + /* For all other RTXes clear the used flag on the copy. */ + RTX_FLAG (copy, used) = 0; + break; +} + return copy; } I like this a lot better. Of course now that it's spelled out it seems like several of these (PC, CC0, RETURN, maybe SCRATCH) should never be passed to shallow_copy_rtx and maybe a checking_assert to that effect might be in order. This part is OK. /* Nonzero when we are generating CONCATs. */ --- gcc/simplify-rtx.c.jj 2016-12-02 00:15:09.200779256 +0100 +++ gcc/simplify-rtx.c 2016-12-02 10:55:24.283989673 +0100 @@ -547,13 +547,19 @@ simplify_replace_fn_rtx (rtx x, const_rt old_rtx, fn, data); if (op != RTVEC_ELT (vec, j)) { - if (newvec == vec) + if (x == newx) { - newvec = shallow_copy_rtvec (vec); - if (x == newx) - newx = shallow_copy_rtx (x); - XVEC (newx, i) = newvec; + newx = shallow_copy_rtx (x); + /* If we copy X, we need to copy also all + vectors in it, rather than copy only + a subset of them and share the rest. */ + for (int k = 0; fmt[k]; k++) + if (fmt[k] == 'E') + XVEC (newx, k) = shallow_copy_rtvec (XVEC (x, k)); + newvec = XVEC (newx, i); } + else + gcc_checking_assert (vec != newvec); RTVEC_ELT (newvec, j) = op; } } @@ -566,7 +572,15 @@ simplify_replace_fn_rtx (rtx x, const_rt if (op != XEXP (x, i)) { if (x == newx) - newx = shallow_copy_rtx (x); + { + newx = shallow_copy_rtx (x); + /* If we copy X, we need to copy also all + vectors in it, rather than copy only + a subset of them and share the rest. */ + for (int k = 0; fmt[k]; k++) + if (fmt[k] == 'E') + XVEC (newx, k) = shallow_copy_rtvec (XVEC (x, k)); + } XEXP (newx, i) = op; } } After looking at it more, I feel that here as well it seems strange for simplify_replace_fn_rtx to have knowledge about these issues. Doesn't this belong in shallow_copy_rtx as well? --- gcc/emit-rtl.c.jj 2016-12-02 09:43:19.0 +0100 +++ gcc/emit-rtl.c 2016-12-02 10:56:37.001061044 +0100 @@ -5552,10 +5552,6 @@ copy_insn_1 (rtx orig) us to explicitly document why we are *not* copying a flag. */ copy = shallow_copy_rtx (orig); - /* We do not copy the USED flag, which is used as a mark bit during - walks over the RTL. */ - RTX_FLAG (copy, used) = 0; - /* We do not copy JUMP, CALL, or FRAME_RELATED for INSNs. */ if (INSN_P (orig)) { --- gcc/valtrack.c.jj 2016-10-31 13:28:06.0 +0100 +++ gcc/valtrack.c 2016-12-02 11:01:44.690144705 +0100 @@ -119,10 +119,6 @@ cleanup_auto_inc_dec (rtx src, machine_m us to explicitly document why we are *not* copying a flag. */ x = shallow_copy_rtx (x); - /* We do not copy the USED flag, which is used as a mark bit during - walks over the RTL. */ - RTX_FLAG (x, used) = 0; - /* We do not copy FRAME_RELATED for INSNs. */ if (INSN_P (x)) RTX_FLAG (x, frame_related) = 0; --- gcc/config/rs6000/rs6000.c.jj 2016-12-01 08:56:27.137105707 +0100 +++ gcc/config/rs6000/rs6000.c 2016-12-02 11:02:14.796762115 +0100 @@ -27186,7 +27186,6 @@ rs6000_frame_related (rtx_insn *insn, rt { pat = shallow_copy_rtx (pat);
Re: [PATCH 8b/9] Add target-independent selftests of RTL function reader
Conceptually, the concept of "target-independent RTL" seems weird to me. But I guess you expect these to work because the backend never gets involved trying to recognize insns or addressing modes? I think a comment to that effect somewhere near the C code to load these tests would be good. I think this potentially suggests a missing feature: the dump should indicate whether the reader should leave it alone or ask the target to verify it by recognizing the insns. Also, + (edge-from entry (flags "FALLTHRU")) + (cinsn 1 (set (reg:DI %2) +(lshiftrt:DI (reg:DI %0) +(const_int 32))) +"../../src/gcc/testsuite/gcc.dg/asr_div1.c":14 Weren't we going to make file/line information optional? Otherwise this is probably going to be OK since I got past my "this is all wrong" initial reaction. You'll want to Cc target maintainers for the other two patches. Bernd
[Patch, Fortran, OOP] PR 42188: F03:C612. The leftmost part-name shall be the name of a data object
Hi all, another simple fix for a rather old PR. This one adds a new check, in order to provide better error messages than just "Unclassifiable statement". Regtests cleanly on x86_64-linux-gnu. Ok for trunk? Cheers, Janus 2016-12-02 Janus Weil PR fortran/42188 * primary.c (gfc_match_rvalue): Add a new check that gives better error messages. 2016-12-02 Janus Weil PR fortran/42188 * gfortran.dg/derived_result_2.f90.f90: New test case. Index: gcc/fortran/primary.c === --- gcc/fortran/primary.c (revision 243176) +++ gcc/fortran/primary.c (working copy) @@ -3298,6 +3298,15 @@ gfc_match_rvalue (gfc_expr **result) if (sym->result == NULL) sym->result = sym; + gfc_gobble_whitespace (); + /* F08:C612. */ + if (gfc_peek_ascii_char() == '%') + { + gfc_error ("The leftmost part-ref in a data-ref can not be a " +"function reference at %C"); + m = MATCH_ERROR; + } + m = MATCH_YES; break; ! { dg-do compile } ! ! PR 42188: [OOP] F03:C612. The leftmost part-name shall be the name of a data object ! ! Contributed by Janus Weil module grid_module implicit none type grid contains procedure :: new_grid procedure :: new_int end type contains subroutine new_grid(this) class(grid) :: this end subroutine integer function new_int(this) class(grid) :: this new_int = 42 end function end module module field_module use grid_module implicit none type field type(grid) :: mesh end type contains type(field) function new_field() end function subroutine test integer :: i type(grid) :: g g = new_field()%mesh ! { dg-error "can not be a function reference" } call new_field()%mesh%new_grid() ! { dg-error "Syntax error" } i = new_field() % mesh%new_int() ! { dg-error "can not be a function reference" } end subroutine end module
Re: [PATCH 8/9] Introduce class function_reader (v4)
On 12/01/2016 10:43 PM, David Malcolm wrote: + rtx_insn *jump_insn = get_insn_by_uid (1); + ASSERT_EQ (JUMP_INSN, GET_CODE (jump_insn)); + ASSERT_EQ (ret_rtx, JUMP_LABEL (jump_insn)); + // FIXME: ^^^ use ASSERT_RTX_PTR_EQ here ^^^ Why is this a fixme and not just done that way (several instances)? ASSERT_RTX_PTR_EQ doesn't exist yet; there was some discussion about it in earlier versions of the patch, but I haven't written it. It would be equivalent to ASSERT_EQ, checking pointer equality, but would dump the mismatching expected vs actual rtx on failure. Should I convert this to a TODO, or go ahead and implement ASSERT_RTX_PTR_EQ? Missed this question. Just add ASSERT_RTX_PTR_EQ, that shouldn't be hard, should it? Bernd
Re: [PATCH 1/9] print_rtx: implement support for reuse IDs (v2)
On 12/02/2016 02:37 AM, David Malcolm wrote: Here's the current status of the kit: "[PATCH 1/9] print_rtx: implement support for reuse IDs (v2)" https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01188.html Still needs review Whoops, I thought this was settled. Ok for the new version. Bernd
[PATCH, Fortran, pr78505, v1] [F08] Coarray source allocation not synchronizing on oversubscribed cores
Hi all, attached patch adds a call to sync_all after an ALLOCATE allocating a coarray. This is to adhere to standard wanting: ..., execution of the segment (11.6.2) following the statement is delayed until all other active images in the current team have executed the same statement the same number of times in this team. This, esp. the "statement", means that assigning the SOURCE= is to come before the sync all, which means we have to do it explicitly after that assignment. Or the other way around the sync all can be done in library caf_register(). Bootstraps and regtests ok on x86_64-linux/F23. Ok for trunk? Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de gcc/fortran/ChangeLog: 2016-12-02 Andre Vehreschild PR fortran/78505 * trans-stmt.c (gfc_trans_allocate): Add sync all after the execution of the whole allocate-statement to adhere to the standard. gcc/testsuite/ChangeLog: 2016-12-02 Andre Vehreschild PR fortran/78505 * gfortran.dg/coarray_alloc_with_implicit_sync_1.f90: New test. diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c index 514db28..2219bcc 100644 --- a/gcc/fortran/trans-stmt.c +++ b/gcc/fortran/trans-stmt.c @@ -5506,7 +5506,7 @@ gfc_trans_allocate (gfc_code * code) stmtblock_t block; stmtblock_t post; tree nelems; - bool upoly_expr, tmp_expr3_len_flag = false, al_len_needs_set; + bool upoly_expr, tmp_expr3_len_flag = false, al_len_needs_set, is_coarray ; gfc_symtree *newsym = NULL; if (!code->ext.alloc.list) @@ -5516,6 +5516,7 @@ gfc_trans_allocate (gfc_code * code) expr3 = expr3_vptr = expr3_len = expr3_esize = NULL_TREE; label_errmsg = label_finish = errmsg = errlen = NULL_TREE; e3_is = E3_UNSET; + is_coarray = false; gfc_init_block (&block); gfc_init_block (&post); @@ -,8 +5556,9 @@ gfc_trans_allocate (gfc_code * code) expression. */ if (code->expr3) { - bool vtab_needed = false, temp_var_needed = false, - is_coarray = gfc_is_coarray (code->expr3); + bool vtab_needed = false, temp_var_needed = false; + + is_coarray = gfc_is_coarray (code->expr3); /* Figure whether we need the vtab from expr3. */ for (al = code->ext.alloc.list; !vtab_needed && al != NULL; @@ -6093,6 +6095,9 @@ gfc_trans_allocate (gfc_code * code) tree caf_decl, token; gfc_se caf_se; + /* Set flag, to add synchronize after the allocate. */ + is_coarray = true; + gfc_init_se (&caf_se, NULL); caf_decl = gfc_get_tree_for_caf_expr (expr); @@ -6114,6 +6119,11 @@ gfc_trans_allocate (gfc_code * code) } else { + /* Allocating coarrays needs a sync after the allocate executed. + Set the flag to add the sync after all objects are allocated. */ + is_coarray = is_coarray || (gfc_caf_attr (expr).codimension + && flag_coarray == GFC_FCOARRAY_LIB); + if (expr->ts.type == BT_CHARACTER && al_len != NULL_TREE && expr3_len != NULL_TREE) { @@ -6357,6 +6367,15 @@ gfc_trans_allocate (gfc_code * code) gfc_add_modify (&block, se.expr, tmp); } + if (is_coarray && flag_coarray == GFC_FCOARRAY_LIB) +{ + /* Add a sync all after the allocation has been executed. */ + tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_sync_all, + 3, null_pointer_node, null_pointer_node, + integer_zero_node); + gfc_add_expr_to_block (&post, tmp); +} + gfc_add_block_to_block (&block, &se.post); gfc_add_block_to_block (&block, &post); diff --git a/gcc/testsuite/gfortran.dg/coarray_alloc_with_implicit_sync_1.f90 b/gcc/testsuite/gfortran.dg/coarray_alloc_with_implicit_sync_1.f90 new file mode 100644 index 000..1dbbcb7 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/coarray_alloc_with_implicit_sync_1.f90 @@ -0,0 +1,10 @@ +! { dg-do compile } +! { dg-options "-fdump-tree-original -fcoarray=lib" } +! Check that allocating a coarray adds an implicit sync all. + + implicit none + integer, allocatable :: f(:)[:] + allocate( f(20)[*], source = 1 ) +end + +! { dg-final { scan-tree-dump-times "_gfortran_caf_sync_all \\(" 1 "original" } }
Re: [PATCH] PR fortran/78618 -- RANK() should not ICE
Hi Steve, 2016-12-02 2:33 GMT+01:00 Steve Kargl : > The attached patch fixes an ICE, a nearby whitespace issue, and > removed the ATTRIBUTE_UNUSED tag. THe change has passed regression > testing on x86_64-*-freebsd. Ok to commit? huh, I don't really understand why the argument of RANK is detected to be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be an EXPR_CONSTANT? Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed is the symbol "c" (as expected), but that clearly is not a function, so it seems to me that the actual bug here is that a->expr_type is set incorrectly ...? Cheers, Janus > 2016-12-01 Steven G. Kargl > > PR fortran/78618 > * check.c (gfc_check_rank): Remove ATTRIBUTE_UNUSED. Fix whitespace. > Fix ICE where a NULL pointer is dereferenced. > * simplify.c (gfc_convert_char_constant): Do not buffer error. > > 2016-12-01 Steven G. Kargl > > PR fortran/78618 > * gfortran.dg/char_conversion.f90: New test. > > -- > Steve
Re: [PR 70965] Schedule extra pass_rebuild_cgraph_edges
Hi, On Fri, Nov 25, 2016 at 02:10:51PM +0100, Jan Hubicka wrote: > > On Thu, Nov 24, 2016 at 5:44 PM, Martin Jambor wrote: > > > > > > ... > > > > > > 2016-11-24 Martin Jambor > > > > > > gcc/ > > > * passes.def (pass_build_ssa_passes): Add > > > pass_rebuild_cgraph_edges. > > > > > > gcc/testsuite/ > > > * g++.dg/pr70965.C: New test. > > > --- > > > gcc/passes.def | 1 + > > > gcc/testsuite/g++.dg/pr70965.C | 21 + > > > 2 files changed, 22 insertions(+) > > > create mode 100644 gcc/testsuite/g++.dg/pr70965.C > > > > > > diff --git a/gcc/passes.def b/gcc/passes.def > > > index 85a5af0..5faf17f 100644 > > > --- a/gcc/passes.def > > > +++ b/gcc/passes.def > > > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3. If not see > > >NEXT_PASS (pass_build_ssa_passes); > > >PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes) > > >NEXT_PASS (pass_fixup_cfg); > > > + NEXT_PASS (pass_rebuild_cgraph_edges); > > >NEXT_PASS (pass_build_ssa); > > >NEXT_PASS (pass_warn_nonnull_compare); > > >NEXT_PASS (pass_ubsan); > > Actually you want to rebuild at the end of pass_build_ssa_passes passes queue. > This may still trip an ICE if one of passes bellow modify CFG (which > pass_nothorw > does) > > Path is OK with that change. Well, I have already committed the patch as it was. But given the above, I will commit the following to trunk after bootstrapping and testing. Thanks, Martin 2016-12-02 Martin Jambor * passes.def: Move pass_rebuild_cgraph_edges to the end of pass_build_ssa_passes. --- gcc/passes.def | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/passes.def b/gcc/passes.def index 1117b8b..7b12a41 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -56,12 +56,12 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_build_ssa_passes); PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes) NEXT_PASS (pass_fixup_cfg); - NEXT_PASS (pass_rebuild_cgraph_edges); NEXT_PASS (pass_build_ssa); NEXT_PASS (pass_warn_nonnull_compare); NEXT_PASS (pass_ubsan); NEXT_PASS (pass_early_warn_uninitialized); NEXT_PASS (pass_nothrow); + NEXT_PASS (pass_rebuild_cgraph_edges); POP_INSERT_PASSES () NEXT_PASS (pass_chkp_instrumentation_passes); -- 2.10.2
Re: [PATCH] correct handling of non-constant width and precision (pr 78521)
On 12/02/2016 01:31 AM, Rainer Orth wrote: Hi Martin, PR 78521 notes that the gimple-ssa-sprintf pass doesn't do the right thing (i.e., the -Wformat-length and -fprintf-return-value options behave incorrectly) when a conversion specification includes a width or precision with a non-constant value. The code treats such cases as if they were not provided which is incorrect and results in the wrong bytes counts in warning messages and in the wrong ranges being generated for such calls (or in the case sprintf(0, 0, ...) for some such calls being eliminated). The attached patch corrects the handling of these cases, plus a couple of other edge cases in the same area: it adjusts the parser to accept precision in the form of just a period with no asterisk or decimal digits after it (this sets the precision to zero), and corrects the handling of zero precision and zero argument in integer directives to produce no bytes on output. Finally, the patch also tightens up the constraint on the upper bound of bounded functions like snprintf to be INT_MAX. The functions cannot produce output in excess of INT_MAX + 1 bytes and some implementations (e.g., Solaris) fail with EINVAL when the bound is INT_MAX or more. This is the subject of PR 78520. this patch broke Solaris bootstrap: /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c: In function 'void {anonymous}::get_width_and_precision(const {anonymous}::conversion_spec&, long long int*, long long int*)': /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:777:45: error: call of overloaded 'abs(long long int)' is ambiguous width = abs (tree_to_shwi (spec.star_width)); ^ /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:777:45: note: candidates are: In file included from /usr/include/stdlib.h:12:0, from /vol/gcc/src/hg/trunk/local/gcc/system.h:258, from /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:49: /usr/include/iso/stdlib_iso.h:205:16: note: long int std::abs(long int) inline long abs(long _l) { return labs(_l); } ^ /usr/include/iso/stdlib_iso.h:160:12: note: int std::abs(int) extern int abs(int); ^ The following patch fixed this for me, but I've no idea if it's right. It bootstrapped successfully on sparc-sun-solaris2.12, i386-pc-solaris2.12, and x86_64-pc-linux-gnu. Thanks for the heads up! I just looked at that code yesterday while analyzing bug 78608, wondering if it was safe. Now I know it isn't. I think it might be best to simply hand code the expression instead of taking a chance on abs. Let me take care of it today along with 78608. Martin
PR78634: ifcvt/i386 cost updates
With the i386 backend no longer double-counting the cost of a SET, the default implementation default_max_noce_ifcvt_seq_cost now provides too high a bound for if conversion, allowing very costly substitutions. The following patch fixes this by making an i386 variant of the hook, but first - James, do you recall how you arrived at the COSTS_N_INSNS(3) magic number? What happens on arm if you lower that in the default hook? The change in ifcvt.c seems to have no bearing on the PR, I just noticed we were missing a cost check in one place. Lightly tested so far, will bootstrap & test on x86_64-linux. Bernd PR rtl-optimization/78634 * config/i386/i386.c (ix86_max_noce_ifcvt_seq_cost): New function. (TARGET_MAX_NOCE_IFCVT_SEQ_COST): Define. * ifcvt.c (noce_try_cmove): Add missing cost check. Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 242958) +++ gcc/config/i386/i386.c (working copy) @@ -50301,6 +50301,28 @@ ix86_spill_class (reg_class_t rclass, ma return NO_REGS; } +/* Implement TARGET_MAX_NOCE_IFCVT_SEQ_COST. Like the default implementation, + but returns a lower bound. */ + +static unsigned int +ix86_max_noce_ifcvt_seq_cost (edge e) +{ + bool predictable_p = predictable_edge_p (e); + + enum compiler_param param += (predictable_p + ? PARAM_MAX_RTL_IF_CONVERSION_PREDICTABLE_COST + : PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST); + + /* If we have a parameter set, use that, otherwise take a guess using + BRANCH_COST. */ + if (global_options_set.x_param_values[param]) +return PARAM_VALUE (param); + else +return BRANCH_COST (true, predictable_p) * COSTS_N_INSNS (2); +} + + /* Implement targetm.vectorize.init_cost. */ static void * @@ -51615,6 +51637,8 @@ ix86_run_selftests (void) #undef TARGET_EXPAND_DIVMOD_LIBFUNC #define TARGET_EXPAND_DIVMOD_LIBFUNC ix86_expand_divmod_libfunc +#undef TARGET_MAX_NOCE_IFCVT_SEQ_COST +#define TARGET_MAX_NOCE_IFCVT_SEQ_COST ix86_max_noce_ifcvt_seq_cost #if CHECKING_P #undef TARGET_RUN_TARGET_SELFTESTS #define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests Index: gcc/ifcvt.c === --- gcc/ifcvt.c (revision 242958) +++ gcc/ifcvt.c (working copy) @@ -1826,7 +1826,7 @@ noce_try_cmove (struct noce_if_info *if_ noce_emit_move_insn (if_info->x, target); seq = end_ifcvt_sequence (if_info); - if (!seq) + if (!seq || !noce_conversion_profitable_p (seq, if_info)) return FALSE; emit_insn_before_setloc (seq, if_info->jump,
Re: [PATCH] Unset used bit in simplify_replace_* on newly copied rtxs (PR target/78614)
On Fri, Dec 02, 2016 at 03:51:52PM +0100, Bernd Schmidt wrote: > I like this a lot better. Of course now that it's spelled out it seems like > several of these (PC, CC0, RETURN, maybe SCRATCH) should never be passed to > shallow_copy_rtx and maybe a checking_assert to that effect might be in > order. This part is OK. Ok, I've committed the parts except simplify-rtx.c. > After looking at it more, I feel that here as well it seems strange for > simplify_replace_fn_rtx to have knowledge about these issues. Doesn't this > belong in shallow_copy_rtx as well? Started working on that, but it seems e.g. copy_insn_1 doesn't want this behavior, it has a complex system of sharing some of the subvectors of ASM_OPERANDS. Jakub
Re: [PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)
On Fri, Dec 2, 2016 at 3:21 PM, Jakub Jelinek wrote: > Hi! > > While the https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01087.html > added *andndi3_doubleword, I don't understand how it can actually work. > The problem is that this pattern is for combine, and needs combining > of DImode NOT setter with DImode AND user. But there is no DImode > pattern for one_cmpldi2, so it is lowered early into two one_cmplsi2 > patterns and I can't see how combine would be able to figure stuff out from > that. > > This patch: > 1) adds one_cmpldi2 pattern for stv purposes (which splits into two >one_cmplsi2 after reload) > 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones) > 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it >is for -mbmi only, and adds another *andndi3_doubleword pattern that is >meant to live just from combine till the stv pass, or worse case till >following split1 pass when it is split back into not followed by and; >this change makes it possible to use pandn in stv pass, even without >-mbmi Please use attached (lightly tested) patch to implement point 3) above. The patch splits insn after reload, as is the case with all STV patterns. > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2016-12-02 Jakub Jelinek > > PR target/70322 > * config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle > NOT. > (dimode_scalar_chain::compute_convert_gain): Likewise. > (dimode_scalar_chain::convert_insn): Likewise. > * config/i386/i386.md (*andndi3_doubleword): New > define_insn_and_split. > Rename the old define_insn_and_split to... > (*andndi3_doubleword_bmi): ... this. > (one_cmpl2): Use SWIM1248x iterator instead of SWIM. > (*one_cmpldi2_doubleword): New define_insn_and_split. > > * gcc.target/i386/pr70322-1.c: New test. > * gcc.target/i386/pr70322-2.c: New test. > * gcc.target/i386/pr70322-3.c: New test. Apart to andn patterns, the patch is OK with the small nit below. Thanks, Uros. > --- gcc/config/i386/i386.c.jj 2016-12-02 11:17:40.702995111 +0100 > +++ gcc/config/i386/i386.c 2016-12-02 12:01:31.656469089 +0100 > @@ -2826,6 +2826,9 @@ dimode_scalar_to_vector_candidate_p (rtx > return false; >break; > > +case NOT: > + break; > + > case REG: >return true; > > @@ -2848,7 +2851,8 @@ dimode_scalar_to_vector_candidate_p (rtx > >if ((GET_MODE (XEXP (src, 0)) != DImode > && !CONST_INT_P (XEXP (src, 0))) > - || (GET_MODE (XEXP (src, 1)) != DImode > + || (GET_CODE (src) != NOT > + && GET_MODE (XEXP (src, 1)) != DImode > && !CONST_INT_P (XEXP (src, 1 > return false; > > @@ -3415,6 +3419,8 @@ dimode_scalar_chain::compute_convert_gai > if (CONST_INT_P (XEXP (src, 1))) > gain -= vector_const_cost (XEXP (src, 1)); > } > + else if (GET_CODE (src) == NOT) > + gain += ix86_cost->add - COSTS_N_INSNS (1); >else if (GET_CODE (src) == COMPARE) > { > /* Assume comparison cost is the same. */ > @@ -3770,6 +3776,14 @@ dimode_scalar_chain::convert_insn (rtx_i >PUT_MODE (src, V2DImode); >break; > > +case NOT: > + src = XEXP (src, 0); > + convert_op (&src, insn); > + subreg = gen_reg_rtx (V2DImode); > + emit_insn_before (gen_move_insn (subreg, CONSTM1_RTX (V2DImode)), > insn); > + src = gen_rtx_XOR (V2DImode, src, subreg); > + break; > + > case MEM: >if (!REG_P (dst)) > convert_op (&src, insn); > --- gcc/config/i386/i386.md.jj 2016-12-01 23:24:51.663157486 +0100 > +++ gcc/config/i386/i386.md 2016-12-02 12:50:27.616829191 +0100 > @@ -8534,7 +8534,7 @@ (define_split >operands[2] = gen_lowpart (QImode, operands[2]); > }) > > -(define_insn_and_split "*andndi3_doubleword" > +(define_insn_and_split "*andndi3_doubleword_bmi" >[(set (match_operand:DI 0 "register_operand" "=r") > (and:DI > (not:DI (match_operand:DI 1 "register_operand" "r")) > @@ -8551,6 +8551,27 @@ (define_insn_and_split "*andndi3_doublew > (clobber (reg:CC FLAGS_REG))])] >"split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);") > > +;; This insn is there only so that the stv pass can use pandn even when > +;; BMI is not available. It should be matched during combine and split > +;; again in split1, unless the stv pass converts it. > +(define_insn_and_split "*andndi3_doubleword" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (and:DI > + (not:DI (match_operand:DI 1 "register_operand" "r")) > + (match_operand:DI 2 "nonimmediate_operand" "rm"))) > + (clobber (reg:CC FLAGS_REG))] > + "!TARGET_BMI && !TARGET_64BIT && TARGET_STV && TARGET_SSE2 > + && can_create_pseudo_p ()" > + "#" > + "&& 1" > + [(set (match_dup 3) > + (not:DI (match_dup
Re: [PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)
On Fri, Dec 2, 2016 at 5:11 PM, Uros Bizjak wrote: > On Fri, Dec 2, 2016 at 3:21 PM, Jakub Jelinek wrote: >> Hi! >> >> While the https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01087.html >> added *andndi3_doubleword, I don't understand how it can actually work. >> The problem is that this pattern is for combine, and needs combining >> of DImode NOT setter with DImode AND user. But there is no DImode >> pattern for one_cmpldi2, so it is lowered early into two one_cmplsi2 >> patterns and I can't see how combine would be able to figure stuff out from >> that. >> >> This patch: >> 1) adds one_cmpldi2 pattern for stv purposes (which splits into two >>one_cmplsi2 after reload) >> 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones) >> 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it >>is for -mbmi only, and adds another *andndi3_doubleword pattern that is >>meant to live just from combine till the stv pass, or worse case till >>following split1 pass when it is split back into not followed by and; >>this change makes it possible to use pandn in stv pass, even without >>-mbmi > > Please use attached (lightly tested) patch to implement point 3) > above. The patch splits insn after reload, as is the case with all STV > patterns. Now attached for real. Uros. Index: i386.md === --- i386.md (revision 243185) +++ i386.md (working copy) @@ -8534,15 +8534,24 @@ operands[2] = gen_lowpart (QImode, operands[2]); }) -(define_insn_and_split "*andndi3_doubleword" - [(set (match_operand:DI 0 "register_operand" "=r") +(define_insn "*andndi3_doubleword" + [(set (match_operand:DI 0 "register_operand" "=r,&r") (and:DI - (not:DI (match_operand:DI 1 "register_operand" "r")) - (match_operand:DI 2 "nonimmediate_operand" "rm"))) + (not:DI (match_operand:DI 1 "register_operand" "r,0")) + (match_operand:DI 2 "nonimmediate_operand" "rm,rm"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_BMI && !TARGET_64BIT && TARGET_STV && TARGET_SSE2" + "!TARGET_64BIT && TARGET_STV && TARGET_SSE2" "#" - "&& reload_completed" + [(set_attr "isa" "bmi,*")]) + +(define_split + [(set (match_operand:DI 0 "register_operand") + (and:DI + (not:DI (match_operand:DI 1 "register_operand")) + (match_operand:DI 2 "nonimmediate_operand"))) + (clobber (reg:CC FLAGS_REG))] + "!TARGET_64BIT && TARGET_BMI && TARGET_STV && TARGET_SSE2 + && reload_completed" [(parallel [(set (match_dup 0) (and:SI (not:SI (match_dup 1)) (match_dup 2))) (clobber (reg:CC FLAGS_REG))]) @@ -8551,6 +8560,24 @@ (clobber (reg:CC FLAGS_REG))])] "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);") +(define_split + [(set (match_operand:DI 0 "register_operand") + (and:DI + (not:DI (match_dup 0)) + (match_operand:DI 1 "nonimmediate_operand"))) + (clobber (reg:CC FLAGS_REG))] + "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2 + && reload_completed" + [(set (match_dup 0) (not:SI (match_dup 0))) + (parallel [(set (match_dup 0) + (and:SI (match_dup 0) (match_dup 1))) + (clobber (reg:CC FLAGS_REG))]) + (set (match_dup 2) (not:SI (match_dup 2))) + (parallel [(set (match_dup 2) + (and:SI (match_dup 2) (match_dup 3))) + (clobber (reg:CC FLAGS_REG))])] + "split_double_mode (DImode, &operands[0], 2, &operands[0], &operands[2]);") + (define_insn "*andn_1" [(set (match_operand:SWI48 0 "register_operand" "=r,r") (and:SWI48
Re: [PR 70965] Schedule extra pass_rebuild_cgraph_edges
> Hi, > > On Fri, Nov 25, 2016 at 02:10:51PM +0100, Jan Hubicka wrote: > > > On Thu, Nov 24, 2016 at 5:44 PM, Martin Jambor wrote: > > > > > > > > ... > > > > > > > > 2016-11-24 Martin Jambor > > > > > > > > gcc/ > > > > * passes.def (pass_build_ssa_passes): Add > > > > pass_rebuild_cgraph_edges. > > > > > > > > gcc/testsuite/ > > > > * g++.dg/pr70965.C: New test. > > > > --- > > > > gcc/passes.def | 1 + > > > > gcc/testsuite/g++.dg/pr70965.C | 21 + > > > > 2 files changed, 22 insertions(+) > > > > create mode 100644 gcc/testsuite/g++.dg/pr70965.C > > > > > > > > diff --git a/gcc/passes.def b/gcc/passes.def > > > > index 85a5af0..5faf17f 100644 > > > > --- a/gcc/passes.def > > > > +++ b/gcc/passes.def > > > > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3. If not see > > > >NEXT_PASS (pass_build_ssa_passes); > > > >PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes) > > > >NEXT_PASS (pass_fixup_cfg); > > > > + NEXT_PASS (pass_rebuild_cgraph_edges); > > > >NEXT_PASS (pass_build_ssa); > > > >NEXT_PASS (pass_warn_nonnull_compare); > > > >NEXT_PASS (pass_ubsan); > > > > Actually you want to rebuild at the end of pass_build_ssa_passes passes > > queue. > > This may still trip an ICE if one of passes bellow modify CFG (which > > pass_nothorw > > does) > > > > Path is OK with that change. > > Well, I have already committed the patch as it was. But given the > above, I will commit the following to trunk after bootstrapping and > testing. > > Thanks, > > Martin > > > 2016-12-02 Martin Jambor > > * passes.def: Move pass_rebuild_cgraph_edges to the end of > pass_build_ssa_passes. Thanks, Honza
Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
Ping? Is there anything else I need to do for the patch or is it OK for trunk? Thanks, Tamar From: Tamar Christina Sent: Friday, November 25, 2016 12:18:52 PM To: Joseph Myers Cc: GCC Patches; Wilco Dijkstra; rguent...@suse.de; l...@redhat.com; Michael Meissner; nd Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE. Hi Joseph, I have updated the patch with the changes, w.r.t to the formatting, there are tabs there that seem to be rendered at 4 spaces wide. In my editor setup at 8 spaces they are correct. Kind Regards, Tamar From: Joseph Myers Sent: Thursday, November 24, 2016 6:28:18 PM To: Tamar Christina Cc: GCC Patches; Wilco Dijkstra; rguent...@suse.de; l...@redhat.com; Michael Meissner; nd Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE. On Thu, 24 Nov 2016, Tamar Christina wrote: > @@ -11499,6 +11503,53 @@ to classify. GCC treats the last argument as > type-generic, which > means it does not do default promotion from float to double. > @end deftypefn > > +@deftypefn {Built-in Function} int __builtin_isnan (...) > +This built-in implements the C99 isnan functionality which checks if > +the given argument represents a NaN. The return value of the > +function will either be a 0 (false) or a 1 (true). > +On most systems, when an IEEE 754 floating point is used this > +built-in does not produce a signal when a signaling NaN is used. "an IEEE 754 floating point" should probably be "an IEEE 754 floating-point type" or similar. > +GCC treats the argument as type-generic, which means it does > +not do default promotion from float to double. I think type names such as float and double should use @code{} in the manual. > +the given argument represents an Infinite number. The return Infinite should not have a capital letter there. > +@deftypefn {Built-in Function} int __builtin_iszero (...) > +This built-in implements the C99 iszero functionality which checks if This isn't C99, it's TS 18661-1:2014. > +the given argument represents the number 0. The return 0 or -0. > +@deftypefn {Built-in Function} int __builtin_issubnormal (...) > +This built-in implements the C99 issubnormal functionality which checks if Again, TS 18661-1. > +the given argument represents a sub-normal number. The return Do not hyphenate subnormal. > + switch (DECL_FUNCTION_CODE (decl)) > + { > + case BUILT_IN_SETJMP: > + lower_builtin_setjmp (gsi); > + data->cannot_fallthru = false; > + return; The indentation in this whole block of code (not all quoted) is wrong. > +real_inf(&rinf); Missing space before '('. > + emit_tree_cond (&body, dest, done_label, > + is_normal(&body, arg, loc), fp_normal); > + emit_tree_cond (&body, dest, done_label, > + is_zero(&body, arg, loc), fp_zero); > + emit_tree_cond (&body, dest, done_label, > + is_nan(&body, arg, loc), fp_nan); > + emit_tree_cond (&body, dest, done_label, > + is_infinity(&body, arg, loc), fp_infinite); Likewise. > + fndecl(&body, arg, loc), t_true); Likewise. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] PR fortran/78618 -- RANK() should not ICE
On Fri, Dec 02, 2016 at 04:47:19PM +0100, Janus Weil wrote: > Hi Steve, > > 2016-12-02 2:33 GMT+01:00 Steve Kargl : > > The attached patch fixes an ICE, a nearby whitespace issue, and > > removed the ATTRIBUTE_UNUSED tag. THe change has passed regression > > testing on x86_64-*-freebsd. Ok to commit? > > huh, I don't really understand why the argument of RANK is detected to > be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be > an EXPR_CONSTANT? > > Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed > is the symbol "c" (as expected), but that clearly is not a function, > so it seems to me that the actual bug here is that a->expr_type is set > incorrectly ...? > I found that it is the function __convert_s4_s1. Namely, we have character, parameter :: c = char(256,4) print *, rank(c) c is kind=1 and char(256,4) is kind 4. so, we end up with print *, rank(__convert_s4_s1(...)) As __convert_s4_s1() has neither isym nor esym set, you get a problem. -- Steve
Re: [PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)
On Fri, Dec 02, 2016 at 05:12:20PM +0100, Uros Bizjak wrote: > >> This patch: > >> 1) adds one_cmpldi2 pattern for stv purposes (which splits into two > >>one_cmplsi2 after reload) > >> 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones) > >> 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it > >>is for -mbmi only, and adds another *andndi3_doubleword pattern that is > >>meant to live just from combine till the stv pass, or worse case till > >>following split1 pass when it is split back into not followed by and; > >>this change makes it possible to use pandn in stv pass, even without > >>-mbmi > > > > Please use attached (lightly tested) patch to implement point 3) > > above. The patch splits insn after reload, as is the case with all STV > > patterns. > > Now attached for real. Ok, I've checked in following patch (compared to your notes just added xfail to the pr70322-2.c test scan-assembler), feel free to test your patch and remove the xfail again. 2016-12-02 Jakub Jelinek PR target/70322 * config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle NOT. (dimode_scalar_chain::compute_convert_gain): Likewise. (dimode_scalar_chain::convert_insn): Likewise. * config/i386/i386.md (*one_cmpldi2_doubleword): New define_insn_and_split. (one_cmpl2): Use SWIM1248x iterator instead of SWIM. * gcc.target/i386/pr70322-1.c: New test. * gcc.target/i386/pr70322-2.c: New test. * gcc.target/i386/pr70322-3.c: New test. --- gcc/config/i386/i386.c.jj 2016-12-02 11:17:40.702995111 +0100 +++ gcc/config/i386/i386.c 2016-12-02 12:01:31.656469089 +0100 @@ -2826,6 +2826,9 @@ dimode_scalar_to_vector_candidate_p (rtx return false; break; +case NOT: + break; + case REG: return true; @@ -2848,7 +2851,8 @@ dimode_scalar_to_vector_candidate_p (rtx if ((GET_MODE (XEXP (src, 0)) != DImode && !CONST_INT_P (XEXP (src, 0))) - || (GET_MODE (XEXP (src, 1)) != DImode + || (GET_CODE (src) != NOT + && GET_MODE (XEXP (src, 1)) != DImode && !CONST_INT_P (XEXP (src, 1 return false; @@ -3415,6 +3419,8 @@ dimode_scalar_chain::compute_convert_gai if (CONST_INT_P (XEXP (src, 1))) gain -= vector_const_cost (XEXP (src, 1)); } + else if (GET_CODE (src) == NOT) + gain += ix86_cost->add - COSTS_N_INSNS (1); else if (GET_CODE (src) == COMPARE) { /* Assume comparison cost is the same. */ @@ -3770,6 +3776,14 @@ dimode_scalar_chain::convert_insn (rtx_i PUT_MODE (src, V2DImode); break; +case NOT: + src = XEXP (src, 0); + convert_op (&src, insn); + subreg = gen_reg_rtx (V2DImode); + emit_insn_before (gen_move_insn (subreg, CONSTM1_RTX (V2DImode)), insn); + src = gen_rtx_XOR (V2DImode, src, subreg); + break; + case MEM: if (!REG_P (dst)) convert_op (&src, insn); --- gcc/config/i386/i386.md.jj 2016-12-01 23:24:51.663157486 +0100 +++ gcc/config/i386/i386.md 2016-12-02 12:50:27.616829191 +0100 @@ -9312,9 +9312,22 @@ ;; One complement instructions +(define_insn_and_split "*one_cmpldi2_doubleword" + [(set (match_operand:DI 0 "nonimmediate_operand" "=rm") + (not:DI (match_operand:DI 1 "nonimmediate_operand" "0")))] + "!TARGET_64BIT && TARGET_STV && TARGET_SSE2 + && ix86_unary_operator_ok (NOT, DImode, operands)" + "#" + "&& reload_completed" + [(set (match_dup 0) + (not:SI (match_dup 1))) + (set (match_dup 2) + (not:SI (match_dup 3)))] + "split_double_mode (DImode, &operands[0], 2, &operands[0], &operands[2]);") + (define_expand "one_cmpl2" - [(set (match_operand:SWIM 0 "nonimmediate_operand") - (not:SWIM (match_operand:SWIM 1 "nonimmediate_operand")))] + [(set (match_operand:SWIM1248x 0 "nonimmediate_operand") + (not:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")))] "" "ix86_expand_unary_operator (NOT, mode, operands); DONE;") --- gcc/testsuite/gcc.target/i386/pr70322-1.c.jj2016-12-02 12:52:47.193051745 +0100 +++ gcc/testsuite/gcc.target/i386/pr70322-1.c 2016-12-02 12:52:24.708338078 +0100 @@ -0,0 +1,12 @@ +/* PR target/70322 */ +/* { dg-do compile { target ia32 } } */ +/* { dg-options "-O2 -msse2 -mstv -mbmi" } */ +/* { dg-final { scan-assembler "pandn" } } */ + +extern long long z; + +void +foo (long long x, long long y) +{ + z = ~x & y; +} --- gcc/testsuite/gcc.target/i386/pr70322-2.c.jj2016-12-02 12:52:50.165013898 +0100 +++ gcc/testsuite/gcc.target/i386/pr70322-2.c 2016-12-02 12:52:39.302152232 +0100 @@ -0,0 +1,12 @@ +/* PR target/70322 */ +/* { dg-do compile { target ia32 } } */ +/* { dg-options "-O2 -msse2 -mstv -mno-bmi" } */ +/* { dg-final { scan-assembler "pandn" { xfail *-*-* } } } */ + +extern long long z; + +void +foo (long long x, long long y) +{ + z = ~x & y; +} ---
Re: [PATCH] gcc: Fix sysroot relative paths for MinGW
On 10/08/2016 04:26 AM, Tadek Kijkowski wrote: Prevent paths relative to sysroot directory from being transformed to Windows form with MSYS prefix. Second attempt: Moved most changes to x-mingw32. Only thing added to Makefile.in are new variables to ease overriding in included file. Disabled overriding standard lib and include paths (with /mingw/lib and /mingw/include) in mingw32.h when TARGET_SYSTEM_ROOT is defined. Host fragments (x-foo files) are strongly discouraged. The developer docs suggest they should only be used for makefile dependencies. Practice is somewhat looser than that. This stuff is so ugly that I'd really prefer it be buried, so I think it deserves an exception to the guidelines. But if you need further stuff in this space, let's try to avoid additional host fragments. There's several other directories that are not over-ridden, presumably because they're not relative to sysroot? Target s-selftest in gcc fails on MinGW: /home/tadek/gcc/gcc-build-mingw32-sysroot/./gcc/xgcc -B/home/tadek/gcc/gcc-build-mingw32-sysroot/./gcc/ -xc -S -c /dev/null -fself-test cc1.exe: fatal error: input file 'nul.s' is the same as output file It's enough to specify any output file, like '-o self-test-result.s' to fix the issue. Specifying '-o /dev/null' works as well, but I'm not sure if it's safe on all systems. It's a topic for separate patch. I think you're supposed to use HOST_BIT_BUCKET instead of /dev/null. That can be a follow-up patch. Manual at "https://gcc.gnu.org/install/configure.html"; does mention that path specified with|--with-native-system-header-dir is located within sysroot, but it fails to mention that the same applies to ||--with-local-prefix. Changelog: 2016-10-08 Tadek Kijkowski> * gcc/Makefile.in, gcc/config/i386/x-mingw32: |Fix sysroot relative paths for MinGW * gcc/config/i386/mingw32.h: Disable overriding default library and include paths when sysroot is defined I'm going to go ahead and installon the trunk. Thanks for updating the documentation on what those little makefile fragments do. It'll be helpful if anyone ever needs to change them in the future. Thanks, Jeff
[PATCH, GCC/testsuite/ARM] XFAIL optional_thumb-1 and optional_thumb-2 testcases
Hi, The logic to make -mthumb optional for Thumb-only targets was designed to only apply when no mode is specified either at compile time or when the toolchain was configured (using --with-mode). The dg-skip-if in the testcases optional_thumb-1 and optional_thumb-2 catch the former case but not the latter. Unfortunately there is no dejagnu procedure to check the presence of an option in the configure line used for the compiler. This patch marks the two tests as XFAIL since that will at least allow to catch a change from PASS to XFAIL, thus making the test still useful. It also fixes a codestyle issue in the target triplet of the dg-skip-if directives of all the optional_thumb-* testcases. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2016-12-01 Thomas Preud'homme * gcc.target/arm/optional_thumb-3.c: Fix codestyle issue in target triplet of dg-skip-if. * gcc.target/arm/optional_thumb-1.c: Likewise and add xfail. * gcc.target/arm/optional_thumb-2.c: Likewise. Is this ok for stage3? Best regards, Thomas diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c index 23df62887ba4aaa1d8717a34ecda9a40246f0552..27787bb8a403534abf708be02a80bdcf42638958 100644 --- a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c +++ b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ -/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-*} { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */ +/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */ /* { dg-options "-march=armv6-m" } */ +/* { dg-xfail-if "fails if GCC configured with --with-mode=arm" { *-*-* } } */ /* Check that -mthumb is not needed when compiling for a Thumb-only target. */ diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-2.c b/gcc/testsuite/gcc.target/arm/optional_thumb-2.c index 4bd53a45eca97e62dd3b86d5a1a66c5ca21e7aad..5a8e5e8cfccf8d9175432ef6b5c07cb0b5f7b0bf 100644 --- a/gcc/testsuite/gcc.target/arm/optional_thumb-2.c +++ b/gcc/testsuite/gcc.target/arm/optional_thumb-2.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ -/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-*} { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */ +/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */ /* { dg-options "-mcpu=cortex-m4" } */ +/* { dg-xfail-if "fails if GCC configured with --with-mode=arm" { *-*-* } } */ /* Check that -mthumb is not needed when compiling for a Thumb-only target. */ diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c index f1fd5c8840b191e600c20a7817c611bb9bb645df..735c2774738c8d84deb14a9f082c34bf622b2d5d 100644 --- a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c +++ b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ /* { dg-require-effective-target arm_cortex_m } */ -/* { dg-skip-if "-mthumb given" { *-*-*} { "-mthumb" } } */ +/* { dg-skip-if "-mthumb given" { *-*-* } { "-mthumb" } } */ /* { dg-options "-marm" } */ /* { dg-error "target CPU does not support ARM mode" "missing error with -marm on Thumb-only targets" { target *-*-*} 0 } */
Re: [build] Handle gas/gld --compress-debug-sections=type
>From my point of view this should be backported to the active branches. Building GCC 5 and GCC 6 with binutils >=2.26 now results in $ gcc -c -gz foo.c gcc: error: -gz is not supported in this configuration building these GCC version with binutils 2.25 recognizes this option. On 30.05.2016 13:32, Rainer Orth wrote: > * When I removed the default in the gcc_cv_ld_compress test, the outcome > always was 0, irrespective of the linker tested. Before, it would > almost always have been 1 if testing GNU ld. It turns out that in > this (and numerous other) cases the nesting of tests using ld_date was > wrong. I believe most if not all of those ld_date tests can go, being > only relevant for truly prehistoric versions of GNU ld not in use > anymore. I'll probably submit a followup to remove them, simplifying > several ld tests. > > Bootstrapped without regressions on i386-pc-solaris2.1[02] with various > as/ld combinations and checking that the tests yielded the expected > results: > > gcc_cv_as_compress_debug/gcc_cv_ld_compress_debug > as/ld 2.100/0 > as/ld 2.120/3 > gas 2.26/ld 2.10 2/0 > gas 2.26/ld 2.12 2/3 > gas 2.26/gld 2.26 2/3 the GNU case now reads if test "$ld_vers_major" -lt 2 \ || test "$ld_vers_major" -eq 2 -a "$ld_vers_minor" -lt 21; then gcc_cv_ld_compress_debug=0 elif test "$ld_vers_major" -eq 2 -a "$ld_vers_minor" -lt 26; then gcc_cv_ld_compress_debug=1 else gcc_cv_ld_compress_debug=3 gcc_cv_ld_compress_debug_option="--compress-debug-sections" fi if test $ld_is_gold = yes; then gcc_cv_ld_compress_debug=2 gcc_cv_ld_compress_debug_option="--compress-debug-sections" fi so you end up with different values depending on the linker default. Is this intended? LINK_COMPRESS_DEBUG_SPEC in gcc.c is defined in terms of the linker used at build time, so currently you get the wrong specs when using the non-default linker when selecting the linker at runtime using -fuse-ld=... Matthias
Re: [PATCH] PR fortran/78618 -- RANK() should not ICE
2016-12-02 17:30 GMT+01:00 Steve Kargl : > On Fri, Dec 02, 2016 at 04:47:19PM +0100, Janus Weil wrote: >> Hi Steve, >> >> 2016-12-02 2:33 GMT+01:00 Steve Kargl : >> > The attached patch fixes an ICE, a nearby whitespace issue, and >> > removed the ATTRIBUTE_UNUSED tag. THe change has passed regression >> > testing on x86_64-*-freebsd. Ok to commit? >> >> huh, I don't really understand why the argument of RANK is detected to >> be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be >> an EXPR_CONSTANT? >> >> Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed >> is the symbol "c" (as expected), but that clearly is not a function, >> so it seems to me that the actual bug here is that a->expr_type is set >> incorrectly ...? > > I found that it is the function __convert_s4_s1. That's strange. If we see different things here, maybe we are running into some kind of undefined behavior (possibly related to gfc_bad_expr?). Anyway, after some more debugging I came to the conclusion that what actually fails is the error propagation, which seems to be broken in gfc_check_assign and can be fixed like this: Index: gcc/fortran/expr.c === --- gcc/fortran/expr.c(revision 243194) +++ gcc/fortran/expr.c(working copy) @@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER) { if (lvalue->ts.kind != rvalue->ts.kind && allow_convert) -gfc_convert_chartype (rvalue, &lvalue->ts); - - return true; +return gfc_convert_chartype (rvalue, &lvalue->ts); + else +return true; } if (!allow_convert) This also avoids the ICE and I think is the proper way to fix this ... Cheers, Janus
Re: [PATCH][AArch64] Separate shrink wrapping hooks implementation
On Wed, Nov 30, 2016 at 02:07:58PM +, Kyrill Tkachov wrote: > > On 29/11/16 20:29, Segher Boessenkool wrote: > >Hi James, Kyrill, > > > >On Tue, Nov 29, 2016 at 10:57:33AM +, James Greenhalgh wrote: > >>>+static sbitmap > >>>+aarch64_components_for_bb (basic_block bb) > >>>+{ > >>>+ bitmap in = DF_LIVE_IN (bb); > >>>+ bitmap gen = &DF_LIVE_BB_INFO (bb)->gen; > >>>+ bitmap kill = &DF_LIVE_BB_INFO (bb)->kill; > >>>+ > >>>+ sbitmap components = sbitmap_alloc (V31_REGNUM + 1); > >>>+ bitmap_clear (components); > >>>+ > >>>+ /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets. */ > >>>+ for (unsigned regno = R0_REGNUM; regno <= V31_REGNUM; regno++) > >>The use of R0_REGNUM and V31_REGNUM scare me a little bit, as we're > >>hardcoding > >>where the end of the register file is (does this, for example, fall apart > >>with the SVE work that was recently posted). Something like a > >>LAST_HARDREG_NUM might work? > >Components and registers aren't the same thing (you can have components > >for things that aren't just a register save, e.g. the frame setup, stack > >alignment, save of some non-GPR via a GPR, PIC register setup, etc.) > >The loop here should really only cover the non-volatile registers, and > >there should be some translation from register number to component number > >(it of course is convenient to have a 1-1 translation for the GPRs and > >floating point registers). For rs6000 many things in the backend already > >use non-symbolic numbers for the FPRs and GPRs, so that is easier there. > > Anyway, here's the patch with James's comments implemented. > I've introduced LAST_SAVED_REGNUM which is used to delimit the registers > considered for shrink-wrapping. > > aarch64_process_components is introduced and used to implement the > emit_prologue_components and emit_epilogue_components functions in a single > place. OK. > > Bootstrapped and tested on aarch64-none-linux-gnu. > > Thanks, > Kyrill > > 2016-11-30 Kyrylo Tkachov > > * config/aarch64/aarch64.h (machine_function): Add > reg_is_wrapped_separately field. > * config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Define new constant. > * config/aarch64/aarch64.c (emit_set_insn): Change return type to > rtx_insn *. > (aarch64_save_callee_saves): Don't save registers that are wrapped > separately. > (aarch64_restore_callee_saves): Don't restore registers that are > wrapped separately. > (offset_9bit_signed_unscaled_p, offset_12bit_unsigned_scaled_p, > aarch64_offset_7bit_signed_scaled_p): Move earlier in the file. > (aarch64_get_separate_components): New function. > (aarch64_get_next_set_bit): Likewise. > (aarch64_components_for_bb): Likewise. > (aarch64_disqualify_components): Likewise. > (aarch64_emit_prologue_components): Likewise. > (aarch64_emit_epilogue_components): Likewise. > (aarch64_set_handled_components): Likewise. > (aarch64_process_components): Likewise. > (TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS, > TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB, > TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS, > TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS, > TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS, > TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Define. > >>>+static void > >>>+aarch64_disqualify_components (sbitmap, edge, sbitmap, bool) > >>>+{ > >>>+} > >>Is there no default "do nothing" hook for this? > >I can make the shrink-wrap code do nothing here if this hook isn't > >defined, if you want? > > I don't mind either way. > If you do it I'll then remove the empty implementation in aarch64. If there is no empty default, this is fine. Thanks, James
Re: [PATCH] PR fortran/78618 -- RANK() should not ICE
On Fri, Dec 02, 2016 at 06:00:48PM +0100, Janus Weil wrote: > 2016-12-02 17:30 GMT+01:00 Steve Kargl : > > On Fri, Dec 02, 2016 at 04:47:19PM +0100, Janus Weil wrote: > >> Hi Steve, > >> > >> 2016-12-02 2:33 GMT+01:00 Steve Kargl : > >> > The attached patch fixes an ICE, a nearby whitespace issue, and > >> > removed the ATTRIBUTE_UNUSED tag. THe change has passed regression > >> > testing on x86_64-*-freebsd. Ok to commit? > >> > >> huh, I don't really understand why the argument of RANK is detected to > >> be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be > >> an EXPR_CONSTANT? > >> > >> Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed > >> is the symbol "c" (as expected), but that clearly is not a function, > >> so it seems to me that the actual bug here is that a->expr_type is set > >> incorrectly ...? > > > > I found that it is the function __convert_s4_s1. > > That's strange. If we see different things here, maybe we are running > into some kind of undefined behavior (possibly related to > gfc_bad_expr?). Anyway, after some more debugging I came to the > conclusion that what actually fails is the error propagation, which > seems to be broken in gfc_check_assign and can be fixed like this: > > > Index: gcc/fortran/expr.c > === > --- gcc/fortran/expr.c(revision 243194) > +++ gcc/fortran/expr.c(working copy) > @@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval >if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER) > { >if (lvalue->ts.kind != rvalue->ts.kind && allow_convert) > -gfc_convert_chartype (rvalue, &lvalue->ts); > - > - return true; > +return gfc_convert_chartype (rvalue, &lvalue->ts); > + else > +return true; > } > >if (!allow_convert) > > > This also avoids the ICE and I think is the proper way to fix this ... > When developing the patch, I fixed/avoided the ICE, first. Then, I tried Gerhard's second testcase without the PARAMETER attribute. An error message is emitted, so I went hunting for why PARAMETER suppressed the error message. This led to the second part of the patch of changing gfc_error to gfc_error_now. Perhaps, forcing the gfc_error_now prevents gfortran from inserting the __convert_s4_s1 call. In any event, if your patch causes an error message to be emitted, then I think that your patch is better than the one I proposed. Feel free to commit your patch. -- Steve
Re: [PATCH] avoid calling alloca(0)
On 12/2/16, Bernd Edlinger wrote: > On 12/01/16 19:39, Jeff Law wrote: >> On 11/30/2016 09:09 PM, Martin Sebor wrote: What I think this tells us is that we're not at a place where we're clean. But we can incrementally get there. The warning is only catching a fairly small subset of the cases AFAICT. That's not unusual and analyzing why it didn't trigger on those cases might be useful as well. >>> >>> The warning has no smarts. It relies on constant propagation and >>> won't find a call unless it sees it's being made with a constant >>> zero. Looking at the top two on the list the calls are in extern >>> functions not called from the same source file, so it probably just >>> doesn't see that the functions are being called from another file >>> with a zero. Building GCC with LTO might perhaps help. >> Right. This is consistent with the limitations of other similar >> warnings such as null pointer dereferences. >> >>> So where does this leave us for gcc-7? I'm wondering if we drop the warning in, but not enable it by default anywhere. We fix the cases we can (such as reg-stack,c tree-ssa-threadedge.c, maybe others) before stage3 closes, and shoot for the rest in gcc-8, including improvign the warning (if there's something we can clearly improve), and enabling the warning in -Wall or -Wextra. >>> >>> I'm fine with deferring the GCC fixes and working on the cleanup >>> over time but I don't think that needs to gate enabling the option >>> with -Wextra. The warnings can be suppressed or prevented from >>> causing errors during a GCC build either via a command line option >>> or by pragma in the code. AFAICT, from the other warnings I see >>> go by, this is what has been done for -Wno-implicit-fallthrough >>> while those warnings are being cleaned up. Why not take the same >>> approach here? >> The difference vs implicit fallthrough is that new instances of implicit >> fallthrus aren't likely to be exposed by changes in IL that occur due to >> transformations in the optimizer pipeline. >> >> Given the number of runtime triggers vs warnings, we know there's many >> instances of passing 0 to the allocators that we're not diagnosing. I >> can easily see differences in the early IL (such as those due to >> BRANCH_COST differing for targets) exposing/hiding cases where 0 flows >> into the allocator argument. Similarly for changes in inlining >> decisions, jump threading, etc for profiled bootstraps. I'd like to >> avoid playing wack-a-mole right now. >> >> So I'm being a bit more conservative here. Maybe it'd be appropriate in >> Wextra since that's not enabled by default for GCC builds. >> > > actually it is enabled, by -W -Wall ... > Don't the gcc docs say that -Wextra is the preferred name for -W? Why is gcc still using the deprecated name for the flag when building itself? Seems like it'd help to avoid this confusion if the flags read -Wextra instead of -W...
Re: Change default level for -Wimplicit-fallthrough
On 11/03/2016 05:51 AM, Bernd Schmidt wrote: I'm concerned about the number of false positives for this warning, and judging by previous discussions, I'm not alone in this. This patch limits it to level 1 (any comment before the case label disables the warning) for cases where the user specified no explicit level. It'll still generate enough noise that people will be aware of it and can choose whether to use a higher level or not. Bootstrapped and tested on x86_64-linux. Ok? I think we should wait for the distro builds to fire up and see how the warning impacts them before making a decision on this patch. jeff
Re: [PATCH] Add AVX512 k-mask intrinsics
2016-11-11 22:14 GMT+03:00 Uros Bizjak : > On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich > wrote: >> 2016-11-11 20:56 GMT+03:00 Uros Bizjak : >>> On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak wrote: On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich wrote: > 2016-11-11 17:34 GMT+03:00 Uros Bizjak : >> Some quick remarks: >> >> +(define_insn "kmovb" >> + [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k") >> + (unspec:QI >> + [(match_operand:QI 1 "nonimmediate_operand" "r,km")] >> + UNSPEC_KMOV))] >> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ" >> + "@ >> + kmovb\t{%k1, %0|%0, %k1} >> + kmovb\t{%1, %0|%0, %1}"; >> + [(set_attr "mode" "QI") >> + (set_attr "type" "mskmov") >> + (set_attr "prefix" "vex")]) >> + >> +(define_insn "kmovd" >> + [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k") >> + (unspec:SI >> + [(match_operand:SI 1 "nonimmediate_operand" "r,km")] >> + UNSPEC_KMOV))] >> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW" >> + "@ >> + kmovd\t{%k1, %0|%0, %k1} >> + kmovd\t{%1, %0|%0, %1}"; >> + [(set_attr "mode" "SI") >> + (set_attr "type" "mskmov") >> + (set_attr "prefix" "vex")]) >> + >> +(define_insn "kmovq" >> + [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km") >> + (unspec:DI >> + [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")] >> + UNSPEC_KMOV))] >> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW" >> + "@ >> + kmovq\t{%k1, %0|%0, %k1} >> + kmovq\t{%1, %0|%0, %1} >> + kmovq\t{%1, %0|%0, %1}"; >> + [(set_attr "mode" "DI") >> + (set_attr "type" "mskmov") >> + (set_attr "prefix" "vex")]) >> >> - kmovd (and existing kmovw) should be using register_operand for >> opreand 0. In this case, there is no need for MEM_P checks at all. >> - In the insn constraint, pease check TARGET_AVX before checking MEM_P. >> - please put these definitions above corresponding *mov??_internal >> patterns. > > Do you mean put below *mov??_internal patterns? Attached corrected such > way. No, please put kmovq near *movdi_internal, kmovd near *movsi_internal, etc. It doesn't matter if they are above or below their respective *mov??_internal patterns, as long as they are positioned in some consistent way. IOW, new patterns shouldn't be grouped together, as is the case with your patch. >>> >>> +(define_insn "kmovb" >>> + [(set (match_operand:QI 0 "register_operand" "=k,k") >>> +(unspec:QI >>> + [(match_operand:QI 1 "nonimmediate_operand" "r,km")] >>> + UNSPEC_KMOV))] >>> + "TARGET_AVX512DQ && !MEM_P (operands[1])" >>> >>> There is no need for !MEM_P, this will prevent memory operand, which >>> is allowed by constraint "m". >>> >>> +(define_insn "kmovq" >>> + [(set (match_operand:DI 0 "register_operand" "=k,k,km") >>> +(unspec:DI >>> + [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")] >>> + UNSPEC_KMOV))] >>> + "TARGET_AVX512BW && !MEM_P (operands[1])" >>> >>> Operand 0 should have "nonimmediate_operand" predicate. And here you >>> need && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent >>> mem->mem moves. >> >> Changed according your comments and attached. > > Still not good. > > +(define_insn "kmovd" > + [(set (match_operand:SI 0 "register_operand" "=k,k") > +(unspec:SI > + [(match_operand:SI 1 "nonimmediate_operand" "r,km")] > + UNSPEC_KMOV))] > + "TARGET_AVX512BW && !MEM_P (operands[1])" > > Remove !MEM_P in the above pattern. > > (define_insn "kmovw" > - [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k") > + [(set (match_operand:HI 0 "register_operand" "=k,k") > (unspec:HI >[(match_operand:HI 1 "nonimmediate_operand" "r,km")] >UNSPEC_KMOV))] > - "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F" > + "TARGET_AVX512F && !MEM_P (operands[1])" > > Also remove !MEM_P here. > > +(define_insn "kadd" > + [(set (match_operand:SWI1248x 0 "register_operand" "=r,&r,!k") > +(plus:SWI1248x > + (not:SWI1248x > +(match_operand:SWI1248x 1 "register_operand" "r,0,k")) > + (match_operand:SWI1248x 2 "register_operand" "r,r,k"))) > + (clobber (reg:CC FLAGS_REG))] > + "TARGET_AVX512F" > +{ > + switch (which_alternative) > +{ > +case 0: > + return "add\t{%k2, %k1, %k0|%k0, %k1, %k2}"; > +case 1: > + return "#"; > +case 2: > + if (TARGET_AVX512BW && mode == DImode) > +return "kaddq\t{%2, %1, %0|%0, %1, %2}"; > + else if (TARGET_AVX512BW && mode == SImode) > +return "kaddd\t{%2, %1, %0|%0, %1, %2}"; > + else if (TARGET_AVX512DQ && mode == QImode) > +return "kaddb\t{%2, %1, %0|%0, %1, %2}"; > + else > +return "kaddw\t{%2, %1, %0|%0, %1, %2}"; > + > > The above pattern i
Re: Avoid alloca(0) when temporarily propagating operands during threading
On Thu, Dec 01, 2016 at 11:43:19PM -0700, Jeff Law wrote: > > Martin's alloca work flagged this code as problematical. Essentially if we > had a statement with no operands and the statement was not in the hash > table, then we could end up performing alloca (0), which is inadvisable. I still don't understand why it is inadvisable. alloca(0) is not undefined behavior. It can return NULL, or non-unique pointer, or a unique pointer, and/or cause freeing of already left alloca blocks (like any other alloca call). None of that is a problem here. If num is 0, then copy is just set and never used. I expect most if not all gcc uses of alloca where the count can be 0 are like that. Jakub
Re: Avoid alloca(0) when temporarily propagating operands during threading
On 12/02/2016 10:58 AM, Jakub Jelinek wrote: On Thu, Dec 01, 2016 at 11:43:19PM -0700, Jeff Law wrote: Martin's alloca work flagged this code as problematical. Essentially if we had a statement with no operands and the statement was not in the hash table, then we could end up performing alloca (0), which is inadvisable. I still don't understand why it is inadvisable. alloca(0) is not undefined behavior. It can return NULL, or non-unique pointer, or a unique pointer, and/or cause freeing of already left alloca blocks (like any other alloca call). None of that is a problem here. If num is 0, then copy is just set and never used. I expect most if not all gcc uses of alloca where the count can be 0 are like that. It won't cause any problems in this and probably most instances, but leaving the code in its prior state is simply wrong from a maintenance standpoint. I'd much rather have the code explicitly and safely handle the zero operands case so that if someone makes a change later they don't have to worry about whether or not they're accessing memory which was never allocated. Additionally, it removes a false positive from the warning, thus making less noise. It's not unlike the strictly unnecessary initializations we do to shut up -Wuninitialized. Jeff
Re: [PATCH] PR fortran/78618 -- RANK() should not ICE
2016-12-02 18:13 GMT+01:00 Steve Kargl : >> >> > The attached patch fixes an ICE, a nearby whitespace issue, and >> >> > removed the ATTRIBUTE_UNUSED tag. THe change has passed regression >> >> > testing on x86_64-*-freebsd. Ok to commit? >> >> >> >> huh, I don't really understand why the argument of RANK is detected to >> >> be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be >> >> an EXPR_CONSTANT? >> >> >> >> Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed >> >> is the symbol "c" (as expected), but that clearly is not a function, >> >> so it seems to me that the actual bug here is that a->expr_type is set >> >> incorrectly ...? >> > >> > I found that it is the function __convert_s4_s1. >> >> That's strange. If we see different things here, maybe we are running >> into some kind of undefined behavior (possibly related to >> gfc_bad_expr?). Anyway, after some more debugging I came to the >> conclusion that what actually fails is the error propagation, which >> seems to be broken in gfc_check_assign and can be fixed like this: >> >> >> Index: gcc/fortran/expr.c >> === >> --- gcc/fortran/expr.c(revision 243194) >> +++ gcc/fortran/expr.c(working copy) >> @@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval >>if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER) >> { >>if (lvalue->ts.kind != rvalue->ts.kind && allow_convert) >> -gfc_convert_chartype (rvalue, &lvalue->ts); >> - >> - return true; >> +return gfc_convert_chartype (rvalue, &lvalue->ts); >> + else >> +return true; >> } >> >>if (!allow_convert) >> >> >> This also avoids the ICE and I think is the proper way to fix this ... >> > > When developing the patch, I fixed/avoided the ICE, first. Then, > I tried Gerhard's second testcase without the PARAMETER attribute. > An error message is emitted, so I went hunting for why PARAMETER > suppressed the error message. This led to the second part of the > patch of changing gfc_error to gfc_error_now. I think the gfc_error_now should not be necessary with my fix, but removing the ATTRIBUTE_UNUSED is certainly a good idea. > In any event, if your patch causes an error message to be emitted, > then I think that your patch is better than the one I proposed. > Feel free to commit your patch. I am now regtesting the attached version and, if successful, will commit. Cheers, Janus Index: gcc/fortran/check.c === --- gcc/fortran/check.c (revision 243194) +++ gcc/fortran/check.c (working copy) @@ -3667,7 +3667,7 @@ gfc_check_range (gfc_expr *x) bool -gfc_check_rank (gfc_expr *a ATTRIBUTE_UNUSED) +gfc_check_rank (gfc_expr *a) { /* Any data object is allowed; a "data object" is a "constant (4.1.3), variable (6), or subobject of a constant (2.4.3.2.3)" (F2008, 1.3.45). */ Index: gcc/fortran/expr.c === --- gcc/fortran/expr.c (revision 243194) +++ gcc/fortran/expr.c (working copy) @@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER) { if (lvalue->ts.kind != rvalue->ts.kind && allow_convert) - gfc_convert_chartype (rvalue, &lvalue->ts); - - return true; + return gfc_convert_chartype (rvalue, &lvalue->ts); + else + return true; } if (!allow_convert)
Re: Avoid alloca(0) when temporarily propagating operands during threading
On Fri, Dec 02, 2016 at 11:02:33AM -0700, Jeff Law wrote: > It won't cause any problems in this and probably most instances, but leaving > the code in its prior state is simply wrong from a maintenance standpoint. > > I'd much rather have the code explicitly and safely handle the zero operands > case so that if someone makes a change later they don't have to worry about > whether or not they're accessing memory which was never allocated. > > Additionally, it removes a false positive from the warning, thus making less > noise. > > It's not unlike the strictly unnecessary initializations we do to shut up > -Wuninitialized. But -Wuninitialized also found tons of real-world bugs. Do we have a single real-world example where such a warning would actually be useful (as in, there would be an actual bug)? Otherwise we are adding workarounds for a warning that just forces people to add those workarounds, but doesn't improve code in the wild in any way. Jakub
[PATCH 8a/9] Introduce class function_reader (v7)
On Fri, 2016-12-02 at 15:41 +0100, Bernd Schmidt wrote: > On 12/02/2016 03:00 AM, David Malcolm wrote: > > Changed in v6: > > - split out dump-reading selftests into followup patches > > (target-independent, and target-specific) > > - removes unneeded headers from read-rtl-function.c > > - numerous other cleanups identified in review > > Ok, starting to look very close to done. Thanks for the reviews so far. > It appears I accidentally made you change element->directive. It > seems > like a good change but all I wanted to point out was a plural where I > would have expected a singular. (nods) > > > > + /* Lookup param by name. */ > > + tree t_param = parse_mem_expr (name); > > + // TODO: what if not found? > > Error out? Currently, function_reader::parse_mem_expr can't fail; it creates a VAR_DECL for anything it doesn't recognize. But here, we should be looking specifically for a PARAM_DECL. I've moved the find-param-by-name code out from function_reader::parse_mem_expr into a new function, and call that instead, and error out if it isn't found from that. > > +/* Implementation of rtx_reader::handle_unknown_directive. > > + > > + Require a top-level "function" directive, as emitted by > > + print_rtx_function, and parse it. */ > > Should document START_LOC and NAME. Done. > > + /* Do we need this to force cgraphunit.c to output the function? > > */ > > Well, what happens if you don't do this? Seems reasonable anyway, so > maybe lose the comment. The two flag assignments don't seem to be needed; I think this is due to adding: if (node->native_rtl_p ()) node->force_output = 1; to cgraph_node::finalize_function in patch 9. Should I lose them (and the comment)? > > +/* Parse rtx instructions by calling read_rtx_code, calling > > + set_first_insn and set_last_insn as appropriate, and > > + adding the insn to the insn chain. > > + Consume the trailing ')'. */ > > + > > +rtx_insn * > > +function_reader::parse_insn (file_location start_loc, const char > > *name) > > Once again, please document args - make another pass over all the > functions to make sure. Done. In doing so, I noticed that the args to read_rtl_function_body were unnecessarily complicated, so I simplified it to just take a path. I also made some simplifications to the various fixup_source_location functions. > > + > > +/* Parse operand IDX of X, of code 'i' or 'n'. > > "... as specified by FORMAT_CHAR", presumably. Done. > > + if (0 == strncmp (desc, "orig:", 5)) > > + { > > + expect_original_regno = true; > > + desc_start += 5; > > + /* Skip to any whitespace following the integer. */ > > + const char *space = strchr (desc_start, ' '); > > + if (space) > > + desc_start = space + 1; > > + } > > + /* Any remaining text may be the REG_EXPR. Alternatively we > > have > > +no REG_ATTRS, and instead we have ORIGINAL_REGNO. */ > > + if (ISDIGIT (*desc_start)) > > + { > > + /* Assume we have ORIGINAL_REGNO. */ > > + ORIGINAL_REGNO (x) = atoi (desc_start); > > + } > > This looks like an issue. You'll want to have ORIGINAL_REGNOs printed > and parsed like other regnos, i.e. with %number for pseudos. Can be > done > as a followup if it's involved. This is involved; let's do this as a followup. > > + /* Verify lookup of hard registers. */ > > +#ifdef GCC_AARCH64_H > > + ASSERT_EQ (0, lookup_reg_by_dump_name ("x0")); > > + ASSERT_EQ (1, lookup_reg_by_dump_name ("x1")); > > +#endif > > +#ifdef I386_OPTS_H > > + ASSERT_EQ (0, lookup_reg_by_dump_name ("ax")); > > + ASSERT_EQ (1, lookup_reg_by_dump_name ("dx")); > > +#endif > > Please just remove this for now; not important enough to break the > rule > about target-specific bits in generic code. Removed for now. > > @@ -1103,13 +1244,39 @@ rtx_reader::read_rtx_code (const char > > *code_name) > >rtx value; /* Value of this node. */ > > }; > > > > + one_time_initialization (); > > I still don't understand why this isn't in the rtx_reader > constructor? > Seems pointless to call this each time we want to parse an rtx. Aha; that makes much more sense. Done. > > + > > + /* Use the format_ptr to parse the various operands of this rtx. > > + read_rtx_operand is a vfunc, allowing the parser to vary > > between > > + parsing .md files and parsing .rtl dumps; the function_reader > > subclass > > + overrides the defaults when loading RTL dumps, to handle the > > + various extra attributes emitted by print_rtx. */ > >for (int idx = 0; format_ptr[idx] != 0; idx++) > > -read_rtx_operand (return_rtx, idx); > > +return_rtx = read_rtx_operand (return_rtx, idx); > > + > > + /* Call a vfunc to handle the various additional information > > that > > + print-rtl.c can write after the regular fields; does nothing > > when > > + parsing .md files. */ > > + handle_any_trailing_information (return_rtx); > > I still think just drop the bulk o
Re: Avoid alloca(0) when temporarily propagating operands during threading
On 12/02/2016 11:11 AM, Jakub Jelinek wrote: On Fri, Dec 02, 2016 at 11:02:33AM -0700, Jeff Law wrote: It won't cause any problems in this and probably most instances, but leaving the code in its prior state is simply wrong from a maintenance standpoint. I'd much rather have the code explicitly and safely handle the zero operands case so that if someone makes a change later they don't have to worry about whether or not they're accessing memory which was never allocated. Additionally, it removes a false positive from the warning, thus making less noise. It's not unlike the strictly unnecessary initializations we do to shut up -Wuninitialized. But -Wuninitialized also found tons of real-world bugs. Do we have a single real-world example where such a warning would actually be useful (as in, there would be an actual bug)? Otherwise we are adding workarounds for a warning that just forces people to add those workarounds, but doesn't improve code in the wild in any way. Have you looked in depth at the lto.c issue it flagged? I can't prove that one is safe. And more generally, an under-sized allocation is a security risk. We should continue to try and identify those and warn for them. jeff
Re: Avoid alloca(0) when temporarily propagating operands during threading
On Fri, Dec 02, 2016 at 11:13:29AM -0700, Jeff Law wrote: > >But -Wuninitialized also found tons of real-world bugs. Do we have a single > >real-world example where such a warning would actually be useful (as in, > >there would be an actual bug)? Otherwise we are adding workarounds for a > >warning that just forces people to add those workarounds, but doesn't > >improve code in the wild in any way. > Have you looked in depth at the lto.c issue it flagged? I can't prove that > one is safe. If you mean the tree *map = XALLOCAVEC (tree, 2 * len); call, then I strongly doubt it can actually ever be called with len == 0. There is tree_scc *scc = (tree_scc *) alloca (sizeof (tree_scc) + (len - 1) * sizeof (tree)); a few lines above this and len is unsigned int, therefore for len 0 this will try to (on 64-bit host) alloca (32 + 0x * 4UL), i.e. alloca (0x4001cUL) and I'm pretty sure on most hosts alloca of 16GB won't really work. Even if this succeeded for whatever reason, what problem do you see with map = alloca (0)? > And more generally, an under-sized allocation is a security risk. We should Sure. But I really don't see 0 as being special in any way. We do support zero sized arrays and IMNSHO we want to support 0 sized alloca, it is very much the same thing. We also do support VLAs with 0 size. Jakub
Re: [PATCH] Add AVX512 k-mask intrinsics
On Fri, Dec 2, 2016 at 6:44 PM, Andrew Senkevich wrote: > 2016-11-11 22:14 GMT+03:00 Uros Bizjak : >> On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich >> wrote: >>> 2016-11-11 20:56 GMT+03:00 Uros Bizjak : On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak wrote: > On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich > wrote: >> 2016-11-11 17:34 GMT+03:00 Uros Bizjak : >>> Some quick remarks: >>> >>> +(define_insn "kmovb" >>> + [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k") >>> + (unspec:QI >>> + [(match_operand:QI 1 "nonimmediate_operand" "r,km")] >>> + UNSPEC_KMOV))] >>> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ" >>> + "@ >>> + kmovb\t{%k1, %0|%0, %k1} >>> + kmovb\t{%1, %0|%0, %1}"; >>> + [(set_attr "mode" "QI") >>> + (set_attr "type" "mskmov") >>> + (set_attr "prefix" "vex")]) >>> + >>> +(define_insn "kmovd" >>> + [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k") >>> + (unspec:SI >>> + [(match_operand:SI 1 "nonimmediate_operand" "r,km")] >>> + UNSPEC_KMOV))] >>> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW" >>> + "@ >>> + kmovd\t{%k1, %0|%0, %k1} >>> + kmovd\t{%1, %0|%0, %1}"; >>> + [(set_attr "mode" "SI") >>> + (set_attr "type" "mskmov") >>> + (set_attr "prefix" "vex")]) >>> + >>> +(define_insn "kmovq" >>> + [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km") >>> + (unspec:DI >>> + [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")] >>> + UNSPEC_KMOV))] >>> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW" >>> + "@ >>> + kmovq\t{%k1, %0|%0, %k1} >>> + kmovq\t{%1, %0|%0, %1} >>> + kmovq\t{%1, %0|%0, %1}"; >>> + [(set_attr "mode" "DI") >>> + (set_attr "type" "mskmov") >>> + (set_attr "prefix" "vex")]) >>> >>> - kmovd (and existing kmovw) should be using register_operand for >>> opreand 0. In this case, there is no need for MEM_P checks at all. >>> - In the insn constraint, pease check TARGET_AVX before checking MEM_P. >>> - please put these definitions above corresponding *mov??_internal >>> patterns. >> >> Do you mean put below *mov??_internal patterns? Attached corrected such >> way. > > No, please put kmovq near *movdi_internal, kmovd near *movsi_internal, > etc. It doesn't matter if they are above or below their respective > *mov??_internal patterns, as long as they are positioned in some > consistent way. IOW, new patterns shouldn't be grouped together, as is > the case with your patch. +(define_insn "kmovb" + [(set (match_operand:QI 0 "register_operand" "=k,k") +(unspec:QI + [(match_operand:QI 1 "nonimmediate_operand" "r,km")] + UNSPEC_KMOV))] + "TARGET_AVX512DQ && !MEM_P (operands[1])" There is no need for !MEM_P, this will prevent memory operand, which is allowed by constraint "m". +(define_insn "kmovq" + [(set (match_operand:DI 0 "register_operand" "=k,k,km") +(unspec:DI + [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")] + UNSPEC_KMOV))] + "TARGET_AVX512BW && !MEM_P (operands[1])" Operand 0 should have "nonimmediate_operand" predicate. And here you need && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent mem->mem moves. >>> >>> Changed according your comments and attached. >> >> Still not good. >> >> +(define_insn "kmovd" >> + [(set (match_operand:SI 0 "register_operand" "=k,k") >> +(unspec:SI >> + [(match_operand:SI 1 "nonimmediate_operand" "r,km")] >> + UNSPEC_KMOV))] >> + "TARGET_AVX512BW && !MEM_P (operands[1])" >> >> Remove !MEM_P in the above pattern. >> >> (define_insn "kmovw" >> - [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k") >> + [(set (match_operand:HI 0 "register_operand" "=k,k") >> (unspec:HI >>[(match_operand:HI 1 "nonimmediate_operand" "r,km")] >>UNSPEC_KMOV))] >> - "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F" >> + "TARGET_AVX512F && !MEM_P (operands[1])" >> >> Also remove !MEM_P here. >> >> +(define_insn "kadd" >> + [(set (match_operand:SWI1248x 0 "register_operand" "=r,&r,!k") >> +(plus:SWI1248x >> + (not:SWI1248x >> +(match_operand:SWI1248x 1 "register_operand" "r,0,k")) >> + (match_operand:SWI1248x 2 "register_operand" "r,r,k"))) >> + (clobber (reg:CC FLAGS_REG))] >> + "TARGET_AVX512F" >> +{ >> + switch (which_alternative) >> +{ >> +case 0: >> + return "add\t{%k2, %k1, %k0|%k0, %k1, %k2}"; >> +case 1: >> + return "#"; >> +case 2: >> + if (TARGET_AVX512BW && mode == DImode) >> +return "kaddq\t{%2, %1, %0|%0, %1, %2}"; >> + else if (TARGET_AVX512BW && mode == SImode) >> +return "kaddd\t{%2, %1, %0|%0, %1, %2}"; >>
Re: [PATCH] PR fortran/78618 -- RANK() should not ICE
2016-12-02 19:06 GMT+01:00 Janus Weil : > 2016-12-02 18:13 GMT+01:00 Steve Kargl : >>> >> > The attached patch fixes an ICE, a nearby whitespace issue, and >>> >> > removed the ATTRIBUTE_UNUSED tag. THe change has passed regression >>> >> > testing on x86_64-*-freebsd. Ok to commit? >>> >> >>> >> huh, I don't really understand why the argument of RANK is detected to >>> >> be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be >>> >> an EXPR_CONSTANT? >>> >> >>> >> Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed >>> >> is the symbol "c" (as expected), but that clearly is not a function, >>> >> so it seems to me that the actual bug here is that a->expr_type is set >>> >> incorrectly ...? >>> > >>> > I found that it is the function __convert_s4_s1. >>> >>> That's strange. If we see different things here, maybe we are running >>> into some kind of undefined behavior (possibly related to >>> gfc_bad_expr?). Anyway, after some more debugging I came to the >>> conclusion that what actually fails is the error propagation, which >>> seems to be broken in gfc_check_assign and can be fixed like this: >>> >>> >>> Index: gcc/fortran/expr.c >>> === >>> --- gcc/fortran/expr.c(revision 243194) >>> +++ gcc/fortran/expr.c(working copy) >>> @@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval >>>if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER) >>> { >>>if (lvalue->ts.kind != rvalue->ts.kind && allow_convert) >>> -gfc_convert_chartype (rvalue, &lvalue->ts); >>> - >>> - return true; >>> +return gfc_convert_chartype (rvalue, &lvalue->ts); >>> + else >>> +return true; >>> } >>> >>>if (!allow_convert) >>> >>> >>> This also avoids the ICE and I think is the proper way to fix this ... >>> >> >> When developing the patch, I fixed/avoided the ICE, first. Then, >> I tried Gerhard's second testcase without the PARAMETER attribute. >> An error message is emitted, so I went hunting for why PARAMETER >> suppressed the error message. This led to the second part of the >> patch of changing gfc_error to gfc_error_now. > > I think the gfc_error_now should not be necessary with my fix, but > removing the ATTRIBUTE_UNUSED is certainly a good idea. > > >> In any event, if your patch causes an error message to be emitted, >> then I think that your patch is better than the one I proposed. >> Feel free to commit your patch. > > I am now regtesting the attached version and, if successful, will commit. Testing went well. Committed as r243201. Cheers, Janus
Re: [PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)
On Fri, Dec 2, 2016 at 5:30 PM, Jakub Jelinek wrote: > On Fri, Dec 02, 2016 at 05:12:20PM +0100, Uros Bizjak wrote: >> >> This patch: >> >> 1) adds one_cmpldi2 pattern for stv purposes (which splits into two >> >>one_cmplsi2 after reload) >> >> 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones) >> >> 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it >> >>is for -mbmi only, and adds another *andndi3_doubleword pattern that is >> >>meant to live just from combine till the stv pass, or worse case till >> >>following split1 pass when it is split back into not followed by and; >> >>this change makes it possible to use pandn in stv pass, even without >> >>-mbmi >> > >> > Please use attached (lightly tested) patch to implement point 3) >> > above. The patch splits insn after reload, as is the case with all STV >> > patterns. >> >> Now attached for real. > > Ok, I've checked in following patch (compared to your notes just added > xfail to the pr70322-2.c test scan-assembler), feel free to test your patch > and remove the xfail again. > > 2016-12-02 Jakub Jelinek > > PR target/70322 > * config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle > NOT. > (dimode_scalar_chain::compute_convert_gain): Likewise. > (dimode_scalar_chain::convert_insn): Likewise. > * config/i386/i386.md (*one_cmpldi2_doubleword): New > define_insn_and_split. > (one_cmpl2): Use SWIM1248x iterator instead of SWIM. > > * gcc.target/i386/pr70322-1.c: New test. > * gcc.target/i386/pr70322-2.c: New test. > * gcc.target/i386/pr70322-3.c: New test. Attached to this message, please find the patch that finally implements PANDN generation for non-BMI targets. 2016-12-02 Uros Bizjak PR target/70322 * config/i386/i386.md (*andndi3_doubleword): Add non-BMI alternative and corresponding post-reload splitter. testsuite/ChangeLog: 2016-12-02 Uros Bizjak PR target/70322 * gcc.target/i386/pr70322-2.c (dg-final): Remove xfail. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Committed to mainline SVN. Uros. Index: config/i386/i386.md === --- config/i386/i386.md (revision 243196) +++ config/i386/i386.md (working copy) @@ -8534,15 +8534,24 @@ operands[2] = gen_lowpart (QImode, operands[2]); }) -(define_insn_and_split "*andndi3_doubleword" - [(set (match_operand:DI 0 "register_operand" "=r") +(define_insn "*andndi3_doubleword" + [(set (match_operand:DI 0 "register_operand" "=r,&r") (and:DI - (not:DI (match_operand:DI 1 "register_operand" "r")) - (match_operand:DI 2 "nonimmediate_operand" "rm"))) + (not:DI (match_operand:DI 1 "register_operand" "r,0")) + (match_operand:DI 2 "nonimmediate_operand" "rm,rm"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_BMI && !TARGET_64BIT && TARGET_STV && TARGET_SSE2" + "!TARGET_64BIT && TARGET_STV && TARGET_SSE2" "#" - "&& reload_completed" + [(set_attr "isa" "bmi,*")]) + +(define_split + [(set (match_operand:DI 0 "register_operand") + (and:DI + (not:DI (match_operand:DI 1 "register_operand")) + (match_operand:DI 2 "nonimmediate_operand"))) + (clobber (reg:CC FLAGS_REG))] + "!TARGET_64BIT && TARGET_BMI && TARGET_STV && TARGET_SSE2 + && reload_completed" [(parallel [(set (match_dup 0) (and:SI (not:SI (match_dup 1)) (match_dup 2))) (clobber (reg:CC FLAGS_REG))]) @@ -8551,6 +8560,24 @@ (clobber (reg:CC FLAGS_REG))])] "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);") +(define_split + [(set (match_operand:DI 0 "register_operand") + (and:DI + (not:DI (match_dup 0)) + (match_operand:DI 1 "nonimmediate_operand"))) + (clobber (reg:CC FLAGS_REG))] + "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2 + && reload_completed" + [(set (match_dup 0) (not:SI (match_dup 0))) + (parallel [(set (match_dup 0) + (and:SI (match_dup 0) (match_dup 1))) + (clobber (reg:CC FLAGS_REG))]) + (set (match_dup 2) (not:SI (match_dup 2))) + (parallel [(set (match_dup 2) + (and:SI (match_dup 2) (match_dup 3))) + (clobber (reg:CC FLAGS_REG))])] + "split_double_mode (DImode, &operands[0], 2, &operands[0], &operands[2]);") + (define_insn "*andn_1" [(set (match_operand:SWI48 0 "register_operand" "=r,r") (and:SWI48 Index: testsuite/gcc.target/i386/pr70322-2.c === --- testsuite/gcc.target/i386/pr70322-2.c (revision 243196) +++ testsuite/gcc.target/i386/pr70322-2.c (working copy) @@ -1,7 +1,7 @@ /* PR target/70322 */ /* { dg-do compile { target ia32 } } */ /* { dg-options "-O2 -msse2 -mstv -mno-bmi" } */ -/* { dg-final { scan-assembler "pandn" { xfail *-*-
Re: PR78599
Hi, On Thu, Dec 01, 2016 at 01:43:16PM +0100, Richard Biener wrote: > On Thu, Dec 1, 2016 at 11:07 AM, Prathamesh Kulkarni > wrote: > > Hi, > > As mentioned in PR, the issue seems to be that in > > propagate_bits_accross_jump_functions(), > > ipa_get_type() returns record_type during WPA and hence we pass > > invalid precision to > > ipcp_bits_lattice::meet_with (value, mask, precision) which eventually > > leads to runtime error. > > The attached patch tries to fix that, by bailing out if type of param > > is not integral or pointer type. > > This happens for the edge from deque_test -> > > _Z4copyIPd1BEvT_S2_T0_.isra.0/9. > > Feels more like a DECL_BY_REFERENCE mishandling and should be fixed elsewhere. That is indeed what is happening. Prathamesh, if you are going to save the type of arguments in the jump function, it should help you also with this issue. I know it was me who suggested using the function type to get at them and am sorry, I did not realize there potential issues with promotions and by_reference passing. By the way, please be careful not to introduce code style violations, especially lines exceeding 80 characters and adding trailing whitespace (propagate_bits_accross_jump_function has a few instances of both), I'd suggest setting your editor to highlight them. Thanks, Martin
Re: [PATCH] PR fortran/78618 -- RANK() should not ICE
On Fri, Dec 02, 2016 at 07:39:33PM +0100, Janus Weil wrote: > > Testing went well. Committed as r243201. > Thanks for reviewing my patch, and more importantly thanks for your patch. -- Steve
Re: [PATCH 8a/9] Introduce class function_reader (v7)
On 12/02/2016 07:44 PM, David Malcolm wrote: The two flag assignments don't seem to be needed; I think this is due to adding: if (node->native_rtl_p ()) node->force_output = 1; to cgraph_node::finalize_function in patch 9. Should I lose them (and the comment)? Let's keep this patch self-contained and not dependent on #9. So, leave them in for now. I think we can call this one OK for now (the orig_reg thing needs fixing and we'll probably want to do something about making locations optional, but all that can be done later). Bernd
Re: PR78599
On 3 December 2016 at 00:25, Martin Jambor wrote: > Hi, > > On Thu, Dec 01, 2016 at 01:43:16PM +0100, Richard Biener wrote: >> On Thu, Dec 1, 2016 at 11:07 AM, Prathamesh Kulkarni >> wrote: >> > Hi, >> > As mentioned in PR, the issue seems to be that in >> > propagate_bits_accross_jump_functions(), >> > ipa_get_type() returns record_type during WPA and hence we pass >> > invalid precision to >> > ipcp_bits_lattice::meet_with (value, mask, precision) which eventually >> > leads to runtime error. >> > The attached patch tries to fix that, by bailing out if type of param >> > is not integral or pointer type. >> > This happens for the edge from deque_test -> >> > _Z4copyIPd1BEvT_S2_T0_.isra.0/9. >> >> Feels more like a DECL_BY_REFERENCE mishandling and should be fixed >> elsewhere. > > That is indeed what is happening. Prathamesh, if you are going to save > the type of arguments in the jump function, it should help you also > with this issue. I know it was me who suggested using the function > type to get at them and am sorry, I did not realize there potential > issues with promotions and by_reference passing. > > By the way, please be careful not to introduce code style violations, > especially lines exceeding 80 characters and adding trailing > whitespace (propagate_bits_accross_jump_function has a few instances > of both), I'd suggest setting your editor to highlight them. Oops sorry about that, I will pay more attention to formatting henceforth. Using editor to highlight stray whitespace is indeed quite useful, thanks for that suggestion. Kugan has a patch for adding param type to jump function: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02732.html Once that gets committed, I will send a patch to use jfunc->param_type in propagate_bits_accross_jump_function(). Thanks, Prathamesh > > Thanks, > > Martin >
[PATCH] Add ASSERT_RTX_PTR_EQ
On Fri, 2016-12-02 at 16:28 +0100, Bernd Schmidt wrote: > On 12/01/2016 10:43 PM, David Malcolm wrote: > > > > + rtx_insn *jump_insn = get_insn_by_uid (1); > > > > + ASSERT_EQ (JUMP_INSN, GET_CODE (jump_insn)); > > > > + ASSERT_EQ (ret_rtx, JUMP_LABEL (jump_insn)); > > > > + // FIXME: ^^^ use ASSERT_RTX_PTR_EQ here ^^^ > > > > > > Why is this a fixme and not just done that way (several > > > instances)? > > > > ASSERT_RTX_PTR_EQ doesn't exist yet; there was some discussion > > about it > > in earlier versions of the patch, but I haven't written it. It > > would > > be equivalent to ASSERT_EQ, checking pointer equality, but would > > dump > > the mismatching expected vs actual rtx on failure. > > > > Should I convert this to a TODO, or go ahead and implement > > ASSERT_RTX_PTR_EQ? > > Missed this question. Just add ASSERT_RTX_PTR_EQ, that shouldn't be > hard, should it? > > > Bernd This patch implements an ASSERT_RTX_PTR_EQ macro, like ASSERT_EQ, but which dumps both rtx to stderr if the assertion fails. gcc/ChangeLog: * selftest-rtl.c (selftest::assert_rtx_ptr_eq_at): New function. * selftest-rtl.h (ASSERT_RTX_PTR_EQ): New macro. --- gcc/selftest-rtl.c | 23 +++ gcc/selftest-rtl.h | 18 ++ 2 files changed, 41 insertions(+) diff --git a/gcc/selftest-rtl.c b/gcc/selftest-rtl.c index 20e6550..c14db43 100644 --- a/gcc/selftest-rtl.c +++ b/gcc/selftest-rtl.c @@ -36,6 +36,29 @@ along with GCC; see the file COPYING3. If not see namespace selftest { +/* Compare rtx EXPECTED and ACTUAL by pointer equality, calling + ::selftest::pass if they are equal, aborting if they are non-equal. + LOC is the effective location of the assertion, MSG describes it. */ + +void +assert_rtx_ptr_eq_at (const location &loc, const char *msg, + rtx expected, rtx actual) +{ + if (expected == actual) +::selftest::pass (loc, msg); + else +{ + fprintf (stderr, "%s:%i: %s: FAIL: %s\n", loc.m_file, loc.m_line, + loc.m_function, msg); + fprintf (stderr, " expected (at %p): ", (void *)expected); + print_rtl (stderr, expected); + fprintf (stderr, "\n actual (at %p): ", (void *)actual); + print_rtl (stderr, actual); + fprintf (stderr, "\n"); + abort (); +} +} + /* Constructor for selftest::rtl_dump_test. Read a dumped RTL function from PATH. Takes ownership of PATH, freeing in dtor. diff --git a/gcc/selftest-rtl.h b/gcc/selftest-rtl.h index 35d6437..cb2772d 100644 --- a/gcc/selftest-rtl.h +++ b/gcc/selftest-rtl.h @@ -47,6 +47,24 @@ assert_rtl_dump_eq (const location &loc, const char *expected_dump, rtx x, assert_rtl_dump_eq (SELFTEST_LOCATION, (EXPECTED_DUMP), (RTX), \ (REUSE_MANAGER)) +/* Evaluate rtx EXPECTED and ACTUAL and compare them with == + (i.e. pointer equality), calling ::selftest::pass if they are + equal, aborting if they are non-equal. */ + +#define ASSERT_RTX_PTR_EQ(EXPECTED, ACTUAL) \ + SELFTEST_BEGIN_STMT \ + const char *desc = "ASSERT_RTX_PTR_EQ (" #EXPECTED ", " #ACTUAL ")"; \ + ::selftest::assert_rtx_ptr_eq_at (SELFTEST_LOCATION, desc, (EXPECTED), \ + (ACTUAL)); \ + SELFTEST_END_STMT + +/* Compare rtx EXPECTED and ACTUAL by pointer equality, calling + ::selftest::pass if they are equal, aborting if they are non-equal. + LOC is the effective location of the assertion, MSG describes it. */ + +extern void assert_rtx_ptr_eq_at (const location &loc, const char *msg, + rtx expected, rtx actual); + /* A class for testing RTL function dumps. */ class rtl_dump_test -- 1.8.5.3
Re: PR78634: ifcvt/i386 cost updates
On Fri, Dec 02, 2016 at 05:00:05PM +0100, Bernd Schmidt wrote: > With the i386 backend no longer double-counting the cost of a SET, > the default implementation default_max_noce_ifcvt_seq_cost now > provides too high a bound for if conversion, allowing very costly > substitutions. > > The following patch fixes this by making an i386 variant of the > hook, but first - James, do you recall how you arrived at the > COSTS_N_INSNS(3) magic number? What happens on arm if you lower that > in the default hook? Hi Bernd, Given the magic numbers in BRANCH_COST already, 3 was chosen to give a resonable chance of allowing if-conversion of blocks containing multiple sets. iWe need to do this because for AArch64 and x86 the conditional move pattern will expand to two further patterns, each of which will get a cost (before my cost model patches you would simply count the total number of expansions you would make, so the multiplication by three is to compensate) I wouldn't say there was much scientific to choosing between 2 and 3 beyond looking at cases which worked on x86_64 and AArch64 before I modified the cost model, and again after, and trying to maintain the number of such cases which were if-converted. Setting this to 2 in the generic hook and forcing AArch64 to run with a custom hook would be just as correct as this patch (though arguably yours is the better Stage 3 patch as it has more limited scope). Thanks, James > The change in ifcvt.c seems to have no bearing on the PR, I just > noticed we were missing a cost check in one place. > > Lightly tested so far, will bootstrap & test on x86_64-linux. > > > Bernd > PR rtl-optimization/78634 > * config/i386/i386.c (ix86_max_noce_ifcvt_seq_cost): New function. > (TARGET_MAX_NOCE_IFCVT_SEQ_COST): Define. > * ifcvt.c (noce_try_cmove): Add missing cost check. > > Index: gcc/config/i386/i386.c > === > --- gcc/config/i386/i386.c(revision 242958) > +++ gcc/config/i386/i386.c(working copy) > @@ -50301,6 +50301,28 @@ ix86_spill_class (reg_class_t rclass, ma >return NO_REGS; > } > > +/* Implement TARGET_MAX_NOCE_IFCVT_SEQ_COST. Like the default > implementation, > + but returns a lower bound. */ > + > +static unsigned int > +ix86_max_noce_ifcvt_seq_cost (edge e) > +{ > + bool predictable_p = predictable_edge_p (e); > + > + enum compiler_param param > += (predictable_p > + ? PARAM_MAX_RTL_IF_CONVERSION_PREDICTABLE_COST > + : PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST); > + > + /* If we have a parameter set, use that, otherwise take a guess using > + BRANCH_COST. */ > + if (global_options_set.x_param_values[param]) > +return PARAM_VALUE (param); > + else > +return BRANCH_COST (true, predictable_p) * COSTS_N_INSNS (2); > +} > + > + > /* Implement targetm.vectorize.init_cost. */ > > static void * > @@ -51615,6 +51637,8 @@ ix86_run_selftests (void) > #undef TARGET_EXPAND_DIVMOD_LIBFUNC > #define TARGET_EXPAND_DIVMOD_LIBFUNC ix86_expand_divmod_libfunc > > +#undef TARGET_MAX_NOCE_IFCVT_SEQ_COST > +#define TARGET_MAX_NOCE_IFCVT_SEQ_COST ix86_max_noce_ifcvt_seq_cost > #if CHECKING_P > #undef TARGET_RUN_TARGET_SELFTESTS > #define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests > Index: gcc/ifcvt.c > === > --- gcc/ifcvt.c (revision 242958) > +++ gcc/ifcvt.c (working copy) > @@ -1826,7 +1826,7 @@ noce_try_cmove (struct noce_if_info *if_ > noce_emit_move_insn (if_info->x, target); > > seq = end_ifcvt_sequence (if_info); > - if (!seq) > + if (!seq || !noce_conversion_profitable_p (seq, if_info)) > return FALSE; > > emit_insn_before_setloc (seq, if_info->jump,
[PATCH] Fix tsubst_init error recovery (PR c++/78649)
Hi! Trying to value initialize error_mark_node type ICEs, other spots avoid calling build_value_init* if the type is error_mark_node. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-12-02 Jakub Jelinek PR c++/78649 * pt.c (tsubst_init): Don't call build_value_init if decl's type is error_mark_node. * g++.dg/cpp0x/pr78649.C: New test. --- gcc/cp/pt.c.jj 2016-11-23 16:58:30.0 +0100 +++ gcc/cp/pt.c 2016-12-02 14:38:43.237451068 +0100 @@ -14082,7 +14082,7 @@ tsubst_init (tree init, tree decl, tree init = tsubst_expr (init, args, complain, in_decl, false); - if (!init) + if (!init && TREE_TYPE (decl) != error_mark_node) { /* If we had an initializer but it instantiated to nothing, --- gcc/testsuite/g++.dg/cpp0x/pr78649.C.jj 2016-12-02 14:51:02.057892855 +0100 +++ gcc/testsuite/g++.dg/cpp0x/pr78649.C2016-12-02 14:50:42.0 +0100 @@ -0,0 +1,16 @@ +// PR c++/78649 +// { dg-do compile { target c++11 } } + +template void foo (); +template +void +test () +{ + T t (foo...); // { dg-error "declared void" } +} + +int +main () +{ + test (); +} Jakub
Re: [PATCH] handle integer overflow/wrapping in printf directives (PR 78622)
On 12/01/2016 07:31 PM, Martin Sebor wrote: On 12/01/2016 02:15 AM, Jakub Jelinek wrote: On Thu, Dec 01, 2016 at 08:26:47AM +0100, Jakub Jelinek wrote: Isn't this too simplistic? I mean, if you have say dirtype of signed char and argmin say 4096 + 32 and argmax say 4096 + 64, (signed char) arg has range 32, 64, while I think your routine will yield -128, 127 (well, 0 as min and -128 as max as that is what you return for signed type). Can't you subtract argmax - argmin (best just in wide_int, no need to build trees), and use what you have just for the case where that number doesn't fit into the narrower precision, otherwise if argmin - (dirtype) argmin == argmax - (dirtype) argmax, just use (dirtype) argmin and (dirtype) argmax as the range, and in case that it crosses a boundary figure if you can do anything than the above? Guess all cases of signed/unsigned dirtype and/or argtype need to be considered. Richard noted that you could have a look at CONVERT_EXPR_CODE_P handling in extract_range_from_unary_expr. I think it is the || (vr0.type == VR_RANGE && integer_zerop (int_const_binop (RSHIFT_EXPR, int_const_binop (MINUS_EXPR, vr0.max, vr0.min), size_int (TYPE_PRECISION (outer_type))) part that is important here for the narrowing conversion. Thanks, that was a helpful suggestion! Attached is an update that follows the vrp approach. I assume the infinity stuff is specific to the VRP pass and not something I need to worry about here. Martin PS To your earlier question, argmin and argmax have the same meaning in the added helper function as in its caller. gcc-78622.diff PR middle-end/78622 - [7 Regression] -Wformat-length/-fprintf-return-value incorrect with overflow/wrapping gcc/ChangeLog: PR middle-end/78622 * gimple-ssa-sprintf.c (min_bytes_remaining): Use res.knownrange rather than res.bounded. (adjust_range_for_overflow): New function. (format_integer): Always set res.bounded to true unless either precision or width is specified and unknown. Call adjust_range_for_overflow. (format_directive): Remove vestigial quoting. (add_bytes): Correct the computation of boundrange used to decide whether a warning is of a "maybe" or "defnitely" kind. s/defnitely/definitely/ gcc/testsuite/ChangeLog: PR middle-end/78622 * gcc.c-torture/execute/pr78622.c: New test. * gcc.dg/tree-ssa/builtin-sprintf-2.c: Remove "benign" undefined behavior inadvertently introduced in a previous commit. Tighten up final checking. * gcc.dg/tree-ssa/builtin-sprintf-5.c: Rename macros for clarity. Add test cases. * gcc.dg/tree-ssa/builtin-sprintf-6.c: Add test cases. * gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Same. * gcc.dg/tree-ssa/builtin-sprintf-warn-6.c: Remove xfails and add a final optimization check. * gcc.dg/tree-ssa/builtin-sprintf.c: Add test cases. diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c index 99a635a..95a8710 100644 --- a/gcc/gimple-ssa-sprintf.c +++ b/gcc/gimple-ssa-sprintf.c @@ -795,6 +795,59 @@ get_width_and_precision (const conversion_spec &spec, *pprec = prec; } +/* With the range [*ARGMIN, *ARGMAX] of an integer directive's actual + argument, due to the conversion from either *ARGMIN or *ARGMAX to + the type of the directive's formal argument it's possible for both + to result in the same number of bytes or a range of bytes that's + less than the number of bytes that would result from formatting + some other value in the range [*ARGMIN, *ARGMAX]. This can be + determined by checking for the actual argument being in the range + of the type of the directive. If it isn't it must be assumed to + take on the full range of the directive's type. + Return true when the range has been adjusted to the range of + DIRTYPE, false otherwise. */ I wish I'd counted how many times I read that. + +static bool +adjust_range_for_overflow (tree dirtype, tree *argmin, tree *argmax) +{ + unsigned argprec = TYPE_PRECISION (TREE_TYPE (*argmin)); + unsigned dirprec = TYPE_PRECISION (dirtype); + + /* The logic here was inspired/lifted from the CONVERT_EXPR_CODE_P + branch in the extract_range_from_unary_expr function in tree-vrp.c. + */ Formatting it. If the comment-close won't fit, then line wrap prior to the last word in the comment. + + if (TREE_CODE (*argmin) == INTEGER_CST + && TREE_CODE (*argmax) == INTEGER_CST + && (dirprec >= argprec + || integer_zerop (int_const_binop (RSHIFT_EXPR, +int_const_binop (MINUS_EXPR, + *argmax, + *argmin), +size_int (dirprec) + { +
Re: [PATCH] enable -Wformat-length for dynamically allocated buffers (pr 78245)
On 11/29/2016 08:22 PM, Martin Sebor wrote: That said, I defer to you on how to proceed here. I'm prepared to do the work(*) but I do worry about jeopardizing the chances of this patch and the others making it into 7.0. So would it make sense to just init/fini the b_o_s framework in your pass and for builtin expansion? I think that should work for the sprintf checking. Let me test it. We can deal with the memxxx and strxxx patch (53562) independently if you prefer. Attached is a modified patch that calls {init,fini}_object_sizes() from the gimple-ssa-sprintf pass instead. While this works fine, I do like the approach of making the calls in a single function better because it makes for a more robust API. Decoupling the init/fini calls from the compute_object_size() function that depends on them having been made makes the API easier to accidentally misuse by calling one while forgetting to call one or both of the other two. It's not ideal, nor is the prospect of caching and potentially not invaliding properly. I've started tackling these kinds problems by wrapping everything into a class with suitable ctors/dtors and methods. With everything locked down inside a class, the only way to access the subsystem is by instantiating a suitable object (which obviously gives us control over init/fini). The problem then boils down to not having that instantiated object live across passes, which usually isn't a problem in GCC :-) I suggested it as a possibility, but wasn't going to demand it without knowing much more about the code in tree-object-size and how well it could be encapsulated. ANyway, I'll take another look at the patch. My recollection was that the only issue at hand was the init/fini/caching aspects. jeff
Re: [PATCH] Turn on gnu-indirect-function by default on PowerPC servers
On Thu, Nov 24, 2016 at 01:07:04AM +, Joseph Myers wrote: > On Wed, 23 Nov 2016, Michael Meissner wrote: > > > Since some of the embedded hosts use powerpc-*-linux, I only set > > the > > default if the target is a 64-bit PowerPC system. I tested the compiler > > manually to verify that ifunc support was enabled, and it was. I did a > > boostrap build/check cycle on a little endian power8 system and there were > > regressions. Can I install the patch on the trunk? > > > > Note, the documentation for the --enable-gnu-indirect-option does not list > > which systems current enable the option, so I did not modify the > > documentation. > > I don't think this should have anything to do with embedded/server, or > about whether it's configured for 32-bit or 64-bit default. > powerpc-linux-gnu --enable-targets=all -m64 should behave the same as > powerpc64-linux-gnu. Since I don't test the embedded environments, I was just trying to cautious about changes that affect other PowerPC users. This patch enables --enable-gnu-indirect-function on all PowerPC linux systems, except for targetting Android or uclib. It is exactly the same code as we use in the i[34567]86-*-linux, x86_64-*-linux*, s390-*-linux*, and s390x-*-linux* systems. Given the ifunc support is not generated unless the user uses it, it should not affect other PowerPC users. Is this patch ok to check into the trunk? 2016-12-02 Michael Meissner * config.gcc (powerpc*-*-linux*): Set gnu-indirect-function by default on PowerPC linux systems. Index: gcc/config.gcc === --- gcc/config.gcc (svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config.gcc) (revision 243202) +++ gcc/config.gcc (.../gcc/config.gcc)(working copy) @@ -2443,6 +2443,14 @@ powerpc*-*-linux*) if test x${enable_secureplt} = xyes; then tm_file="rs6000/secureplt.h ${tm_file}" fi + # Assume modern glibc if not targeting Android nor uclibc. + case ${target} in + *-*-*android*|*-*-*uclibc*|*-*-*musl*) + ;; + *) + default_gnu_indirect_function=yes + ;; + esac ;; powerpc-wrs-vxworks|powerpc-wrs-vxworksae|powerpc-wrs-vxworksmils) tm_file="${tm_file} elfos.h freebsd-spec.h rs6000/sysv4.h" -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH] Fix tsubst_init error recovery (PR c++/78649)
OK. On Fri, Dec 2, 2016 at 3:41 PM, Jakub Jelinek wrote: > Hi! > > Trying to value initialize error_mark_node type ICEs, other spots avoid > calling build_value_init* if the type is error_mark_node. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2016-12-02 Jakub Jelinek > > PR c++/78649 > * pt.c (tsubst_init): Don't call build_value_init if decl's type > is error_mark_node. > > * g++.dg/cpp0x/pr78649.C: New test. > > --- gcc/cp/pt.c.jj 2016-11-23 16:58:30.0 +0100 > +++ gcc/cp/pt.c 2016-12-02 14:38:43.237451068 +0100 > @@ -14082,7 +14082,7 @@ tsubst_init (tree init, tree decl, tree > >init = tsubst_expr (init, args, complain, in_decl, false); > > - if (!init) > + if (!init && TREE_TYPE (decl) != error_mark_node) > { >/* If we had an initializer but it > instantiated to nothing, > --- gcc/testsuite/g++.dg/cpp0x/pr78649.C.jj 2016-12-02 14:51:02.057892855 > +0100 > +++ gcc/testsuite/g++.dg/cpp0x/pr78649.C2016-12-02 14:50:42.0 > +0100 > @@ -0,0 +1,16 @@ > +// PR c++/78649 > +// { dg-do compile { target c++11 } } > + > +template void foo (); > +template > +void > +test () > +{ > + T t (foo...); // { dg-error "declared void" } > +} > + > +int > +main () > +{ > + test (); > +} > > Jakub
Re: [PATCH] enable -Wformat-length for dynamically allocated buffers (pr 78245)
On 11/29/2016 08:22 PM, Martin Sebor wrote: That said, I defer to you on how to proceed here. I'm prepared to do the work(*) but I do worry about jeopardizing the chances of this patch and the others making it into 7.0. So would it make sense to just init/fini the b_o_s framework in your pass and for builtin expansion? I think that should work for the sprintf checking. Let me test it. We can deal with the memxxx and strxxx patch (53562) independently if you prefer. Attached is a modified patch that calls {init,fini}_object_sizes() from the gimple-ssa-sprintf pass instead. While this works fine, I do like the approach of making the calls in a single function better because it makes for a more robust API. Decoupling the init/fini calls from the compute_object_size() function that depends on them having been made makes the API easier to accidentally misuse by calling one while forgetting to call one or both of the other two. Martin gcc-78245.diff PR middle-end/78245 - missing -Wformat-length on an overflow of a dynamically allocated buffer gcc/testsuite/ChangeLog: PR middle-end/78245 * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Add tests. gcc/ChangeLog: PR middle-end/78245 * gimple-ssa-sprintf.c (get_destination_size): Call compute_object_size. * tree-object-size.c (addr_object_size): Adjust. (pass_through_call): Adjust. (internal_object_size): New function. (compute_builtin_object_size): Call internal_object_size. (pass_object_sizes::execute): Adjust. * tree-object-size.h (fini_object_sizes): Declare. @@ -664,6 +665,18 @@ compute_builtin_object_size (tree ptr, int object_size_type, return *psize != unknown[object_size_type]; } +/* Compute __builtin_object_size value for PTR and set *PSIZE to + the resulting value. OBJECT_SIZE_TYPE is the second argument + to __builtin_object_size. Return true on success and false + when the object size could not be determined. */ + +bool +compute_builtin_object_size (tree ptr, int object_size_type, +unsigned HOST_WIDE_INT *psize) +{ + return internal_object_size (ptr, object_size_type, psize); +} Is this wrapper still necessary now that we've pulled the init/fini routines out? Seems like it shouldn't be. I think that's the only issue left. If the wrapper is needed, then the patch is fine. If the wrapper isn't, then a patch with the wrapper removed is pre-approved after the usual testing. jeff
Re: [PATCH] Fix PR78306
On 11/30/2016 07:18 AM, Richard Biener wrote: so progess on the OMP front hides regressions elsewhere. The patch that started this thread does not have any effect on the conformance testsuite result. That makes it even more obvious that it is a) unmaintained, b) probably not used widely as those ICEs have not been reported as bugs Let's deprecate Cilk+ if no maintainer steps up until 7.1 is released. Gerald, can you relay that to the SC? I've raised the issue with the steering committee; I will also speak with Intel about the possibility of them providing resources to maintain this code. jeff
Re: [PATCH] PR target/78639, Fix code generation on Power9
On Thu, Dec 01, 2016 at 07:46:31PM -0500, Michael Meissner wrote: > The previous code before 242679 used 'wY', but I deleted the 'w' by accident. > > This bug showed up in compiling PUGHReduce/Reduction.c in the cactusADM Spec > 2006 benchmark suite. I have verified that this fixes the problem. > > Assuming the bootstrap and make check that I'm running right now don't cause > any regressions, can I check the patch into the trunk? Yes of course. Thanks, Segher > 2016-12-01 Michael Meissner > > PR target/78639 > * config/rs6000/rs6000.md (movdi_internal64): Fix typo in > subversion id 242679 that causes the wrong store instruction to be > generated if a DImode is in an Altivec register using REG+REG > addressing.
[Committed,fortran] Plug memory leak.
While looking at code related to PR fortran/78618, I noticed a memory leak. The following patch plus the leak. 2016-12-02 Steven G. Kargl * simplify.c (gfc_convert_char_constant): Free result on error. Index: simplify.c === --- simplify.c (revision 243203) +++ simplify.c (working copy) @@ -7152,6 +7152,7 @@ gfc_convert_char_constant (gfc_expr *e, "into character kind %d", gfc_print_wide_char (result->value.character.string[i]), &e->where, kind); + gfc_free_expr (result); return &gfc_bad_expr; } Without the patch, valgrind finds ==4426== LEAK SUMMARY: ==4426==definitely lost: 192 bytes in 1 blocks ==4426==indirectly lost: 8 bytes in 1 blocks ==4426== possibly lost: 6,256 bytes in 21 blocks ==4426==still reachable: 817,676 bytes in 2,185 blocks ==4426== suppressed: 0 bytes in 0 blocks With the patch, she shows ==15883== LEAK SUMMARY: ==15883==definitely lost: 0 bytes in 0 blocks ==15883==indirectly lost: 0 bytes in 0 blocks ==15883== possibly lost: 6,256 bytes in 21 blocks ==15883==still reachable: 817,676 bytes in 2,185 blocks ==15883== suppressed: 0 bytes in 0 blocks -- Steve
Re: [RFA] Handle target with no length attributes sanely in bb-reorder.c
On Fri, Dec 02, 2016 at 09:47:00AM +0100, Richard Biener wrote: > >> STC tries to make as large as possible consecutive "traces", mainly to > >> help with instruction cache utilization and hit rate etc. It cannot do > >> a very good job if it isn't allowed to copy blocks. > >> > >> "simple" tries to (dynamically) have as many fall-throughs as possible, > >> i.e. as few jumps as possible. > > I hope it special-cases backedges, that is, not make those fallthru. It doesn't, and that is a good thing. Firstly, what is classified as backedge doesn't make too much sense in some cases; quite often the cfg isn't reducible. Secondly, for simple loops it does not matter: the backedge has lower frequency than the forward edges, so everything works out as you want. Thirdly, consider this loop: | S< | | A | / \| B C | \ /| D | | | E- | F where the backedge now has higher frequency than A->B and A->C. With simple this ends up as: S: ...; goto A D: ... E: ...; if (not again) goto F S: ... A: ...; if () goto C B: ...; goto D C: ...; goto D and this is cheaper than the alternative: S: ... A: ...; if () goto C B: ... D: ... E: ...; if (again) goto A F: C: ...; goto D If a fraction f of the loop iterations does C (so 1-f does B), the alternative has 1+2f jumps per iteration; the code simple makes has 1+f. > > I think we've probably discussed this more than is really necessary. We > > just need to pick an alternative and go with it, I think either alternative > > is reasonable (avoid copying when STC has no length information or fall back > > to simple when there is no length information). > > The description of both algorithms doesn't make it an obvious pick. So lets > leave the decision of the algorithm to the target and make STC beave sane > with no length information (whether that means disallowing any copying or > substituting a "default" length is another question -- but obviously it's the > targets fault to not provide lenght information). I agree. Segher
Re: [PATCH] Turn on gnu-indirect-function by default on PowerPC servers
On Fri, 2 Dec 2016, Michael Meissner wrote: > This patch enables --enable-gnu-indirect-function on all PowerPC linux > systems, > except for targetting Android or uclib. It is exactly the same code as we use > in the i[34567]86-*-linux, x86_64-*-linux*, s390-*-linux*, and s390x-*-linux* > systems. Given the ifunc support is not generated unless the user uses it, it > should not affect other PowerPC users. > > Is this patch ok to check into the trunk? This is not a review, but doing things the same as on other architectures makes sense to me. -- Joseph S. Myers jos...@codesourcery.com
[PATCH] Even more minimal reimplementation of errors.c within read-md.c
On Thu, 2016-12-01 at 13:40 +0100, Bernd Schmidt wrote: > On 11/30/2016 09:24 PM, David Malcolm wrote: > > > gcc/ChangeLog: > > * read-md.c (have_error): New global, copied from errors.c. > > (fatal): New function, copied from errors.c. > > I would have expected the function to go into diagnostic.c, but > actually > there are already various functions of this sort in read-md. In a way, there are three diagnostic systems in the codebase: - the functions in errors.h/errors.c (warning, error, fatal, etc) - the functions in read-md.c (error_at, fatal_at) which share the "have_error" global with errors.c. - the "real" diagnostics subsystem (diagnostics.c etc) It turns out that the only thing using "fatal" within read-md.c is md_reader::read_md_files. The interface this provides is overly complicated for the RTL FE's purposes, so the following patch avoids using it in favor of a simpler, new method: md_reader::read_file. With that, the only thing needed in read-md.c from errors.c on the host is just the "have_error" global; the patch verifies that by conditionalizing the include of errors.h on #ifdef GENERATOR_FILE (and similarly conditionalizing md_reader::read_md_files, since "fatal" isn't implemented on the host). > I'd request > you place it near fatal_at, and maybe add this to errors.h: > > inline bool seen_error () > { >return have_error; > } > > and replace explicit uses of have_error with that to unify things a > bit. > > > Bernd seen_error is already implemented in the "real" diagnostic subsystem, and would be a clash: we want a way to determine if read-md.c's implementation of error_at was called. Hence it seems simplest to conditionally add a copy of the "have_error" global to read-md.c. Thoughts? gcc/ChangeLog: * read-md.c: Wrap include of errors.h with #ifdef GENERATOR_FILE. (have_error): New global, copied from errors.c. (md_reader::read_md_files): Wrap with #ifdef GENERATOR_FILE. (md_reader::read_file): New method. * read-md.h (md_reader::read_file): New method. * read-rtl-function.c (read_rtl_function_body): Reimplement in terms of md_reader::read_file. --- gcc/read-md.c | 31 +++ gcc/read-md.h | 1 + gcc/read-rtl-function.c | 6 +- 3 files changed, 33 insertions(+), 5 deletions(-) diff --git a/gcc/read-md.c b/gcc/read-md.c index 25bc3c4..e581326 100644 --- a/gcc/read-md.c +++ b/gcc/read-md.c @@ -26,11 +26,23 @@ along with GCC; see the file COPYING3. If not see #endif #include "system.h" #include "coretypes.h" +#ifdef GENERATOR_FILE #include "errors.h" +#endif /* #ifdef GENERATOR_FILE */ #include "statistics.h" #include "vec.h" #include "read-md.h" +#ifndef GENERATOR_FILE + +/* Minimal reimplementation of errors.c for use by RTL frontend + within cc1. */ + +int have_error = 0; + +#endif /* #ifndef GENERATOR_FILE */ + + /* Associates PTR (which can be a string, etc.) with the file location specified by FILENAME and LINENO. */ struct ptr_loc { @@ -1190,6 +1202,8 @@ md_reader::add_include_path (const char *arg) m_last_dir_md_include_ptr = &dirtmp->next; } +#ifdef GENERATOR_FILE + /* The main routine for reading .md files. Try to process all the .md files specified on the command line and return true if no error occurred. @@ -1296,6 +1310,23 @@ md_reader::read_md_files (int argc, const char **argv, return !have_error; } +#endif /* #ifdef GENERATOR_FILE */ + +/* Read FILENAME. */ + +bool +md_reader::read_file (const char *filename) +{ + m_read_md_filename = filename; + m_read_md_file = fopen (m_read_md_filename, "r"); + if (m_read_md_file == 0) +{ + perror (m_read_md_filename); + return false; +} + handle_toplevel_file (); + return !have_error; +} /* Read FILENAME, filtering to just the given lines. */ diff --git a/gcc/read-md.h b/gcc/read-md.h index 6a73b00..5fc7605 100644 --- a/gcc/read-md.h +++ b/gcc/read-md.h @@ -110,6 +110,7 @@ class md_reader virtual ~md_reader (); bool read_md_files (int, const char **, bool (*) (const char *)); + bool read_file (const char *filename); bool read_file_fragment (const char *filename, int first_line, int last_line); diff --git a/gcc/read-rtl-function.c b/gcc/read-rtl-function.c index ddea836..5188b86 100644 --- a/gcc/read-rtl-function.c +++ b/gcc/read-rtl-function.c @@ -1590,12 +1590,8 @@ read_rtl_function_body (const char *path) init_emit (); init_varasm_status (); - auto_vec argv (2); - argv.safe_push (progname); - argv.safe_push (path); - function_reader reader; - if (!reader.read_md_files (argv.length (), argv.address (), NULL)) + if (!reader.read_file (path)) return false; return true; -- 1.8.5.3
[committed] selftest.c: remove calls to strndup (PR bootstrap/78616)
As part of the fix for PR c/78498 I added selftests for xstrndup in r243030, and seeing the implementation of strndup in libiberty, I also made the selftests verify strndup. This turned out to be overzealous, as strndup is not necessarily available in every configuration when the selftests run. This patch removes the testing for strndup (but not for xstrndup). Successfully bootstrapped®rtested on x86_64-pc-linux-gnu. Committed to trunk as r243207 (as "obvious") gcc/ChangeLog: PR bootstrap/78616 * selftest.c (selftest::assert_strndup_eq): Rename to... (selftest::assert_xstrndup_eq): ...this, and remove call to strndup. (selftest::test_strndup): Rename to... (selftest::test_xstrndup): ...this, updating for above renaming. (selftest::test_libiberty): Update for renaming. --- gcc/selftest.c | 40 +--- 1 file changed, 17 insertions(+), 23 deletions(-) diff --git a/gcc/selftest.c b/gcc/selftest.c index 6df73c2..40c6cb5 100644 --- a/gcc/selftest.c +++ b/gcc/selftest.c @@ -200,41 +200,35 @@ read_file (const location &loc, const char *path) /* Selftests for libiberty. */ -/* Verify that both strndup and xstrndup generate EXPECTED - when called on SRC and N. */ +/* Verify that xstrndup generates EXPECTED when called on SRC and N. */ static void -assert_strndup_eq (const char *expected, const char *src, size_t n) +assert_xstrndup_eq (const char *expected, const char *src, size_t n) { - char *buf = strndup (src, n); - if (buf) -ASSERT_STREQ (expected, buf); - free (buf); - - buf = xstrndup (src, n); + char *buf = xstrndup (src, n); ASSERT_STREQ (expected, buf); free (buf); } -/* Verify that strndup and xstrndup work as expected. */ +/* Verify that xstrndup works as expected. */ static void -test_strndup () +test_xstrndup () { - assert_strndup_eq ("", "test", 0); - assert_strndup_eq ("t", "test", 1); - assert_strndup_eq ("te", "test", 2); - assert_strndup_eq ("tes", "test", 3); - assert_strndup_eq ("test", "test", 4); - assert_strndup_eq ("test", "test", 5); + assert_xstrndup_eq ("", "test", 0); + assert_xstrndup_eq ("t", "test", 1); + assert_xstrndup_eq ("te", "test", 2); + assert_xstrndup_eq ("tes", "test", 3); + assert_xstrndup_eq ("test", "test", 4); + assert_xstrndup_eq ("test", "test", 5); /* Test on an string without zero termination. */ const char src[4] = {'t', 'e', 's', 't'}; - assert_strndup_eq ("", src, 0); - assert_strndup_eq ("t", src, 1); - assert_strndup_eq ("te", src, 2); - assert_strndup_eq ("tes", src, 3); - assert_strndup_eq ("test", src, 4); + assert_xstrndup_eq ("", src, 0); + assert_xstrndup_eq ("t", src, 1); + assert_xstrndup_eq ("te", src, 2); + assert_xstrndup_eq ("tes", src, 3); + assert_xstrndup_eq ("test", src, 4); } /* Run selftests for libiberty. */ @@ -242,7 +236,7 @@ test_strndup () static void test_libiberty () { - test_strndup (); + test_xstrndup (); } /* Selftests for the selftest system itself. */ -- 1.8.5.3
Re: [PATCH] handle integer overflow/wrapping in printf directives (PR 78622)
Thanks for looking at this! I realize it's dense code and not easy to make sense out of. PR middle-end/78622 - [7 Regression] -Wformat-length/-fprintf-return-value incorrect with overflow/wrapping gcc/ChangeLog: PR middle-end/78622 * gimple-ssa-sprintf.c (min_bytes_remaining): Use res.knownrange rather than res.bounded. (adjust_range_for_overflow): New function. (format_integer): Always set res.bounded to true unless either precision or width is specified and unknown. Call adjust_range_for_overflow. (format_directive): Remove vestigial quoting. (add_bytes): Correct the computation of boundrange used to decide whether a warning is of a "maybe" or "defnitely" kind. s/defnitely/definitely/ Will fix. +/* With the range [*ARGMIN, *ARGMAX] of an integer directive's actual + argument, due to the conversion from either *ARGMIN or *ARGMAX to + the type of the directive's formal argument it's possible for both + to result in the same number of bytes or a range of bytes that's + less than the number of bytes that would result from formatting + some other value in the range [*ARGMIN, *ARGMAX]. This can be + determined by checking for the actual argument being in the range + of the type of the directive. If it isn't it must be assumed to + take on the full range of the directive's type. + Return true when the range has been adjusted to the range of + DIRTYPE, false otherwise. */ I wish I'd counted how many times I read that. Let me see if I can word it better. + +static bool +adjust_range_for_overflow (tree dirtype, tree *argmin, tree *argmax) +{ + unsigned argprec = TYPE_PRECISION (TREE_TYPE (*argmin)); + unsigned dirprec = TYPE_PRECISION (dirtype); + + /* The logic here was inspired/lifted from the CONVERT_EXPR_CODE_P + branch in the extract_range_from_unary_expr function in tree-vrp.c. + */ Formatting it. If the comment-close won't fit, then line wrap prior to the last word in the comment. Okay. I didn't remember what the exact rules were. + + if (TREE_CODE (*argmin) == INTEGER_CST + && TREE_CODE (*argmax) == INTEGER_CST + && (dirprec >= argprec + || integer_zerop (int_const_binop (RSHIFT_EXPR, + int_const_binop (MINUS_EXPR, + *argmax, + *argmin), + size_int (dirprec) + { +*argmin = force_fit_type (dirtype, wi::to_widest (*argmin), 0, false); +*argmax = force_fit_type (dirtype, wi::to_widest (*argmax), 0, false); +return false; + } So in this case we're not changing the values, we're just converting them to a equal or wider type, right? Thus no adjustment (what about a sign change? Is that an adjustment?) and we return false per the function's specifications. This casts the value to the destination type, so it may result in different values. The important postcondition it preserves is that the difference ARGMAX - ARGMIN is less than or equal to TYPE_MAX of the directive's unsigned TYPE. (If it isn't, the range cannot be converted.) This lets us take, say: int f (int i) { if (i < 1024 || 1033 < i) i = 1024; return snprintf (0, 0, "%hhi", i); } and fold the function call into 1 because (signed char)1024 is 0 and (signed char)1033 is 9, and every other value in that same original range is also in the same range after the conversion. We know it's safe because (1033 - 1024 <= UCHAR_MAX) holds. But in in this: int g (int i) { if (i < 1024 || 1289 < i) i = 1024; return snprintf (0, 0, "%hhi", i); } even though (signed char)1024 is 0 and (signed char)1289 is also 9 like above, since (1289 - 1024) is 265 and thus greater than UCHAR_MAX, folding the result wouldn't be safe (for example, (sighed char)1034 yields 10). I picture this as a window bounded by the range of the directive's type sliding within another window bounded by the argument's type: [<-[...dirtype...]--->] the outer brackets delimit the range of the argument and the inner ones the directive's. The smaller window can slide left and right and it can shrink but it can't be wider that the directive's type's range. What about overflows when either argmin or argmax won't fit into dirtype without an overflow? What's the right behavior there? That's fine just as long as the property above holds. I think the algorithm works but the code came from tree-vrp where there are all sorts of additional checks for some infinities which VRP distinguishes from type limits like SCHAR_MIN and _MAX. I don't fully understand those and so I'd feel better if someone who does double checked this code. + + tree dirmin = TYPE_MIN_VALUE (dirtype); + tree dirmax = TYPE_MAX_VALUE (dirtype); From this point onward argmin/argmax are either not integers or we're doing a narrowing conversion, right? In both cases the fact that we're doing a narrowing conversion constrains th
Re: [PATCH] Turn on gnu-indirect-function by default on PowerPC servers
On Fri, Dec 02, 2016 at 04:09:22PM -0500, Michael Meissner wrote: > On Thu, Nov 24, 2016 at 01:07:04AM +, Joseph Myers wrote: > This patch enables --enable-gnu-indirect-function on all PowerPC linux > systems, > except for targetting Android or uclib. It is exactly the same code as we use > in the i[34567]86-*-linux, x86_64-*-linux*, s390-*-linux*, and s390x-*-linux* > systems. Given the ifunc support is not generated unless the user uses it, it > should not affect other PowerPC users. > > Is this patch ok to check into the trunk? Assuming you tested it, okay. Thanks, Segher > 2016-12-02 Michael Meissner > > * config.gcc (powerpc*-*-linux*): Set gnu-indirect-function by > default on PowerPC linux systems.