Re: Move some flag_unsafe_math_optimizations using simplify and match
Hi,

> Works for me if you specify -fno-math-errno.
> I think that's a regression we can accept.

Modified the pattern with fno-math-errno as a condition.

> Can you re-post with the typo fix and the missing :s?

Please find attached the modified patch as per the review comments.
Please suggest if there should be any further modifications.

Thanks,
Naveen

ChangeLog

2015-08-20  Naveen H.S  naveen.hurugalaw...@caviumnetworks.com

	* fold-const.c (fold_binary_loc): Move "Optimize sqrt(x)*sqrt(x) as x"
	to match.pd.
	Move "Optimize pow(x,y)*pow(z,y) as pow(x*z,y)" to match.pd.
	Move "Optimize tan(x)*cos(x) as sin(x)" to match.pd.
	Move "Optimize x*pow(x,c) as pow(x,c+1)" to match.pd.
	Move "Optimize pow(x,c)*x as pow(x,c+1)" to match.pd.
	Move "Optimize sin(x)/cos(x) as tan(x)" to match.pd.
	Move "Optimize cos(x)/sin(x) as 1.0/tan(x)" to match.pd.
	Move "Optimize sin(x)/tan(x) as cos(x)" to match.pd.
	Move "Optimize tan(x)/sin(x) as 1.0/cos(x)" to match.pd.
	Move "Optimize pow(x,c)/x as pow(x,c-1)" to match.pd.
	Move "Optimize x/pow(y,z) into x*pow(y,-z)" to match.pd.
	* match.pd (SIN): New operator.
	(TAN): New operator.
	(mult (SQRT@1 @0) @1): New simplifier.
	(mult (POW:s @0 @1) (POW:s @2 @1)): New simplifier.
	(mult:c (TAN:s @0) (COS:s @0)): New simplifier.
	(rdiv (SIN:s @0) (COS:s @0)): New simplifier.
	(rdiv (COS:s @0) (SIN:s @0)): New simplifier.
	(rdiv (SIN:s @0) (TAN:s @0)): New simplifier.
	(rdiv (TAN:s @0) (SIN:s @0)): New simplifier.
	(rdiv (POW:s @0 REAL_CST@1) @0): New simplifier.
	(rdiv @0 (SQRT:s (rdiv:s @1 @2))): New simplifier.
	(rdiv @0 (POW:s @1 @2)): New simplifier.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 93d6514..1e01726 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9957,12 +9957,6 @@ fold_binary_loc (location_t loc,
 	      tree arg00 = CALL_EXPR_ARG (arg0, 0);
 	      tree arg10 = CALL_EXPR_ARG (arg1, 0);
 
-	      /* Optimize sqrt(x)*sqrt(x) as x.  */
-	      if (BUILTIN_SQRT_P (fcode0)
-		  && operand_equal_p (arg00, arg10, 0)
-		  && ! HONOR_SNANS (element_mode (type)))
-		return arg00;
-
 	      /* Optimize root(x)*root(y) as root(x*y).  */
 	      rootfn = TREE_OPERAND (CALL_EXPR_FN (arg0), 0);
 	      arg = fold_build2_loc (loc, MULT_EXPR, type, arg00, arg10);
@@ -9989,15 +9983,6 @@ fold_binary_loc (location_t loc,
 	      tree arg10 = CALL_EXPR_ARG (arg1, 0);
 	      tree arg11 = CALL_EXPR_ARG (arg1, 1);
 
-	      /* Optimize pow(x,y)*pow(z,y) as pow(x*z,y).  */
-	      if (operand_equal_p (arg01, arg11, 0))
-		{
-		  tree powfn = TREE_OPERAND (CALL_EXPR_FN (arg0), 0);
-		  tree arg = fold_build2_loc (loc, MULT_EXPR, type,
-					      arg00, arg10);
-		  return build_call_expr_loc (loc, powfn, 2, arg, arg01);
-		}
-
 	      /* Optimize pow(x,y)*pow(x,z) as pow(x,y+z).  */
 	      if (operand_equal_p (arg00, arg10, 0))
 		{
@@ -10008,67 +9993,6 @@ fold_binary_loc (location_t loc,
 		}
 	    }
 
-	  /* Optimize tan(x)*cos(x) as sin(x).  */
-	  if (((fcode0 == BUILT_IN_TAN && fcode1 == BUILT_IN_COS)
-	       || (fcode0 == BUILT_IN_TANF && fcode1 == BUILT_IN_COSF)
-	       || (fcode0 == BUILT_IN_TANL && fcode1 == BUILT_IN_COSL)
-	       || (fcode0 == BUILT_IN_COS && fcode1 == BUILT_IN_TAN)
-	       || (fcode0 == BUILT_IN_COSF && fcode1 == BUILT_IN_TANF)
-	       || (fcode0 == BUILT_IN_COSL && fcode1 == BUILT_IN_TANL))
-	      && operand_equal_p (CALL_EXPR_ARG (arg0, 0),
-				  CALL_EXPR_ARG (arg1, 0), 0))
-	    {
-	      tree sinfn = mathfn_built_in (type, BUILT_IN_SIN);
-
-	      if (sinfn != NULL_TREE)
-		return build_call_expr_loc (loc, sinfn, 1,
-					    CALL_EXPR_ARG (arg0, 0));
-	    }
-
-	  /* Optimize x*pow(x,c) as pow(x,c+1).  */
-	  if (fcode1 == BUILT_IN_POW
-	      || fcode1 == BUILT_IN_POWF
-	      || fcode1 == BUILT_IN_POWL)
-	    {
-	      tree arg10 = CALL_EXPR_ARG (arg1, 0);
-	      tree arg11 = CALL_EXPR_ARG (arg1, 1);
-	      if (TREE_CODE (arg11) == REAL_CST
-		  && !TREE_OVERFLOW (arg11)
-		  && operand_equal_p (arg0, arg10, 0))
-		{
-		  tree powfn = TREE_OPERAND (CALL_EXPR_FN (arg1), 0);
-		  REAL_VALUE_TYPE c;
-		  tree arg;
-
-		  c = TREE_REAL_CST (arg11);
-		  real_arithmetic (&c, PLUS_EXPR, &c, &dconst1);
-		  arg = build_real (type, c);
-		  return build_call_expr_loc (loc, powfn, 2, arg0, arg);
-		}
-	    }
-
-	  /* Optimize pow(x,c)*x as pow(x,c+1).  */
-	  if (fcode0 == BUILT_IN_POW
-	      || fcode0 == BUILT_IN_POWF
-	      || fcode0 == BUILT_IN_POWL)
-	    {
-	      tree arg00 = CALL_EXPR_ARG (arg0, 0);
-	      tree arg01 = CALL_EXPR_ARG (arg0, 1);
-	      if (TREE_CODE (arg01) == REAL_CST
-		  && !TREE_OVERFLOW (arg01)
-		  && operand_equal_p (arg1, arg00, 0))
-		{
-		  tree powfn = TREE_OPERAND (CALL_EXPR_FN (arg0), 0);
-		  REAL_VALUE_TYPE c;
-		  tree arg;
-
-		  c = TREE_REAL_CST (arg01);
-
Re: [PATCH] Fix middle-end/67133, part 1
On 08/20/2015 10:51 AM, Marek Polacek wrote:
>> Based on the error, I suspect we've got a block ending with a
>> GIMPLE_COND with no successors in the CFG.
> Except that I'm also seeing a different error:
>
> /home/brq/mpolacek/gcc/libgo/go/text/template/exec.go:303:1: error: wrong outgoing edge flags at end of bb 6
>
> We have this bb:
>
> bb 6:
> # iftmp.1693_53 = PHI <0B(4)>
> _54 = t_5(D)->Pipe;
> GOTMP.163 = template.evalPipeline.pN19_text_template.state (s_7(D), dot, _31); [return slot optimization]
> dot = GOTMP.163;
> _61 = __go_new (__go_tdn_text_template..text_template.state, 64);
> *_35 = *s_7(D);
> # DEBUG newState => _35
> _35->tmpl = iftmp.1693_55;
> GOTMP.166.value = dot;
> _66 = __go_new (__go_td_AN22_text_template.variable1e, 40);
> SR.4170_67 = $;
> SR.4171_68 = 1;
> MEM[(struct .text/template.variable *)&GOTMP.166] = $;
> MEM[(struct .text/template.variable *)&GOTMP.166 + 8B] = 1;
> MEM[(struct .text/template.variable[1] *)_40][0] = GOTMP.166;
> _35->vars.__values = _40;
> _35->vars.__count = 1;
> _35->vars.__capacity = 1;
> _75 ={v} iftmp.1693_55->Tree;
> __builtin_trap ();
> _76 = _46->Root;
> D.8248.__methods = &__go_pimt__I25_.text_template_parse.treeFrpN24_text_template_parse.Tr4_CopyFrN24_text_template_parse.Nodeee8_PositionFrN23_text_template_parse.Posee6_StringFrN6_stringee4_TypeFrN28_text_template_parse.NodeTyp__N28_text_template_parse.ListNode;
> D.8248.__object = _47;
> template.walk.pN19_text_template.state (_35, dot, D.8248);
> return;
>
> and single_succ_p (bb) is not satisfied, so it must have more outgoing
> edges.  Not sure how that can happen...

OK, iftmp.1693_55 is NULL (via the PHI).  We inserted the trap.  That
looks reasonable.

What does the CFG look like after splitting the block?  There should be
flags you can pass to get the edge flags as part of the debugging output.

My first guess would be some kind of exception handling edge, but I
thought we avoided the transformation in that case.

Hmm, maybe seeing the CFG with edge flags at the time of trap insertion
would be useful too.

Jeff
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
> [snip]
> Having classpath (with binary files!) in the GCC SVN (or future git)
> repository is a significant burden, not to mention the size of the
> distributed source tarball.  If we can get rid of that, that would be a
> great step in reducing the burden.  Iff we can even without classpath
> build enough of java to be useful (do you really need gcj or only gij
> for bootstrapping openjdk?  After all ecj is just a drop-in to gcc as
> well).

All the Java compilers are written in Java (ecj, javac).  So to run them,
you need a JVM and its class library.  It's those binary files which
allow gcj to bootstrap the stack.

If OpenJDK had a minimal binary class library, it would be able to
bootstrap itself.  But, as things stand, you need enough of the JDK to
run a Java compiler and build the OpenJDK class libraries.  GCJ currently
fulfils that need where there isn't already an OpenJDK installation
available.

--
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222
PGP Key: rsa4096/248BDC07 (hkp://keys.gnupg.net)
Fingerprint = EC5A 1F5E C0AD 1D15 8F1F  8F91 3B96 A578 248B DC07
Re: [PATCH] Fix middle-end/67133, part 1
On Thu, Aug 20, 2015 at 06:51:45PM +0200, Marek Polacek wrote: and single_succ_p (bb) is not satisfied, so it must have more outgoing edges. Not sure how can that happen... Actually the problem seems to be that the BB ends with return but it has *no* outgoing edges. Marek
Re: [PR64164] drop copyrename, integrate into expand
On 08/19/2015 06:00 PM, Alexandre Oliva wrote: On Aug 19, 2015, Alexandre Oliva aol...@redhat.com wrote: I'm having some difficulty getting access to an ia64 box ATM, and for ada bootstraps, a cross won't do, so... if you still have that build tree around, any chance you could recompile par.o with both stage1 and stage2, with -fdump-rtl-expand-details, and email me the compiler dump files? Thanks! In the mean time, I have been able to duplicate the problem myself. As you say, it is triggered by -gtoggle. However, it has nothing whatsoever to do with the recent patches I installed. At most they expose some latent problem in the scheduler. I have verified in the expand dumps that both the gimple and the rtl representation in the relevant parts of the code are identical, except for the presence of debug stmts and insns. Indeed, compiling with -fno-schedule-insns{,2}, no differences arise. We did a couple fixes to this code earlier this year. Presumably there's something still subtly wrong in there that your changes are exposing. See Maxim's changes from Feb. You might also look and see if any of those insns have SCHED_GROUP_P set. Jeff
Re: [PATCH] Fix middle-end/67133, part 1
On 08/20/2015 11:00 AM, Marek Polacek wrote: On Thu, Aug 20, 2015 at 06:51:45PM +0200, Marek Polacek wrote: and single_succ_p (bb) is not satisfied, so it must have more outgoing edges. Not sure how can that happen... Actually the problem seems to be that the BB ends with return but it has *no* outgoing edges. But isn't that block removed -- it ought to be unreachable -- unless something didn't handle splitting the block after the trap. jeff
libgo patch committed: Another fix for killing sleep processes in testsuite
I committed this patch to libgo as another fix for killing the sleep
processes in the testsuite.  This avoids padding issues in the ps output.
Ran libgo testsuite.  Committed to mainline.

Ian

Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE	(revision 226899)
+++ gcc/go/gofrontend/MERGE	(working copy)
@@ -1,4 +1,4 @@
-448d30b696461a39870d1b8beb1195e411300bfd
+ec34cfb0b148ff461df12c8f5270a06e2f438b7c
 
 The first line of this file holds the git revision number of the
 last merge done from the gofrontend repository.
Index: libgo/testsuite/gotest
===
--- libgo/testsuite/gotest	(revision 226846)
+++ libgo/testsuite/gotest	(working copy)
@@ -504,6 +504,7 @@ xno)
 	fi
 	${GL} *.o ${GOLIBS}
+	set +e
 	if test "$bench" = ""; then
 	    if test "$trace" = "true"; then
 		echo ./a.out -test.short -test.timeout=${timeout}s "$@"
@@ -518,9 +519,11 @@ xno)
 	    wait $pid
 	    status=$?
 	    if ! test -f gotest-timeout; then
-		out=`ps -o pid,ppid | grep $alarmpid | cut -f1 -d ' '`
-		if test "x$out" != x; then
-		    kill -9 $out
+		sleeppid=`ps -o pid,ppid,cmd | grep $alarmpid | grep sleep | sed -e 's/ *\([0-9]*\) .*$/\1/'`
+		kill $alarmpid
+		wait $alarmpid
+		if test "$sleeppid" != ""; then
+		    kill $sleeppid
 		fi
 	    fi
 	else
RE: [PATCH][ARM]Tighten the conditions for arm_movw, arm_movt
Hi Renlin,

On 19/08/15 15:37, Renlin Li wrote:
> Hi Kyrylo,
>
> On 19/08/15 13:46, Kyrylo Tkachov wrote:
>> Hi Renlin,
>> Please send patches to gcc-patches for review.
>> Redirecting there now...
>
> Thank you! I should really double check after Thunderbird auto-completes
> the address for me.
>
> On 19/08/15 12:49, Renlin Li wrote:
>> Hi all,
>>
>> This simple patch will tighten the conditions when matching the movw
>> and arm_movt rtx patterns.  Those two patterns will generate the
>> following assembly:
>>
>>   movw w1, #:lower16: dummy + addend
>>   movt w1, #:upper16: dummy + addend
>>
>> The addend here is optional.  However, it should be a 16-bit signed
>> value within the range -32768 <= A <= 32767.
>>
>> By imposing this restriction explicitly, it will prevent the LRA/reload
>> code from generating invalid high/lo_sum code for the arm target.  In
>> process_address_1(), if the address is not legitimate, it will try to
>> generate a high/lo_sum pair to put the address into a register.  It
>> will check if the target supports those newly generated reload
>> instructions.  By defining those two patterns, arm will reject them if
>> the conditions are not met.  Otherwise, it might generate movw/movt
>> instructions with an addend larger than 32767, and this will cause a
>> GAS error: GAS will produce an "offset out of range" error message when
>> the addend for a MOVW/MOVT REL relocation is too large.
>>
>> arm-none-eabi regression tests are okay.  Okay to commit to the trunk
>> and backport to 5.0?

This is ok if it passes an arm bootstrap as well.  Please wait for a few
days on trunk for any fallout before backporting to GCC 5 (you can
bootstrap and test the patch there in the meantime).

Thanks,
Kyrill

> Regards,
> Renlin
>
> gcc/ChangeLog:
>
> 2015-08-19  Renlin Li  renlin...@arm.com
>
>	* config/arm/arm-protos.h (arm_valid_symbolic_address_p): Declare.
>	* config/arm/arm.c (arm_valid_symbolic_address_p): Define.
>	* config/arm/arm.md (arm_movt): Use arm_valid_symbolic_address_p.
>	* config/arm/constraints.md (j): Add check for high code.

Is it guaranteed that at this point XEXP (tmp, 0) and XEXP (tmp, 1) are
valid?  I think before you extract xop0 and xop1 you want to check that
tmp is indeed a PLUS and return false if it's not.  Only then should you
extract XEXP (tmp, 0) and XEXP (tmp, 1).

+  if (GET_CODE (tmp) == PLUS && GET_CODE (xop0) == SYMBOL_REF
+      && CONST_INT_P (xop1))
+    {
+      HOST_WIDE_INT offset = INTVAL (xop1);
+      if (offset < -0x8000 || offset > 0x7fff)
+	return false;
+      else
+	return true;

I think you can just do

  return IN_RANGE (offset, -0x8000, 0x7fff);

Updated accordingly, please check the latest attachment.

Thank you,
Renlin
Re: [PATCH][1/n] dwarf2out refactoring for early (LTO) debug
On Wed, 19 Aug 2015, Richard Biener wrote:

On Tue, 18 Aug 2015, Aldy Hernandez wrote:

On 08/18/2015 07:20 AM, Richard Biener wrote:

This starts a series of patches (still in development) to refactor
dwarf2out.c to better cope with early debug (and LTO debug).

Awesome!

Thanks.

Aldyh, what other testing did you usually do for changes?  Run the gdb
testsuite against the new compiler?  Anything else?

gdb testsuite, and make sure you test GCC with
--enable-languages=all,go,ada, though the latter is mostly useful while
you iron out bugs initially.  I found that ultimately, the best test was
C++.

I see.  Pre-merge I also bootstrapped the compiler and compared .debug*
section sizes in object files to make sure things were within reason.

+
+static void
+vmsdbgout_early_finish (const char *filename ATTRIBUTE_UNUSED)
+{
+  if (write_symbols == VMS_AND_DWARF2_DEBUG)
+    (*dwarf2_debug_hooks.early_finish) (filename);
+}

You can get rid of ATTRIBUTE_UNUSED now.

Done.

I've also refrained from moving

  gen_scheduled_generic_parms_dies ();
  gen_remaining_tmpl_value_param_die_attribute ();

for now as that causes regressions I have to investigate.

Tricky beast ;)

For g++.dg/debug/dwarf2/template-func-params-3.C we run into the issue
that when doing early dwarf the rtl_for_decl_init (bleh) call will fail
because it ends up asking

15809		   && ! walk_tree (init, reference_to_unused, NULL, NULL)

which uses TREE_ASM_WRITTEN to see if 'bleh' was emitted or not.  That's
not going to work at this stage - we even have no idea whether 'bleh' is
going to survive IPA or not (might be inlined).  With LTO it gets even
trickier as we only see a subset of the whole program (or original TU)
at LTRANS stage.

So it somehow looks like a late dwarf thing for this kind of symbolic
constants - but it _also_ looks like a very bad dwarf representation to
me (going through RTL is bad enough, heh).
We expect DW_OP_addr and a reference to _Z4blehv as follows:

	.byte	0x3	# DW_OP_addr
	.quad	_Z4blehv

but I wonder if DWARF has something better so we can refer to _Z4blehv
by means of the DIE for its declaration (so the debugger can resolve the
constant value)?  That would allow the debugger to print bleh even if
bleh was optimized out (it just would have to print <optimized out> for
the actual address).

So what I'll do is have two phases of
gen_remaining_tmpl_value_param_die_attribute
(gen_scheduled_generic_parms_dies doesn't have a similar issue AFAICS):
during early-debug add those we can, retaining those we can only handle
late (just checking the return value of tree_add_const_value_attribute).
For LTO the remaining ones will be dropped on the floor (or we'd have to
stream them somehow) unless we can change the DWARF representation of
these symbolic constants.

Thanks,
Richard.
[PATCH] Add extra compile options for dg-final only once.
This patch fixes an annoying problem of the dg-final tests using the
scan-assembler family of checks (and maybe others).  For a test file,
the option -ffat-lto-objects is added to the command line once for each
scan-assembler test, eventually resulting in an unreadable command line.
Can this be committed?

Ciao

Dominik ^_^  ^_^

--
Dominik Vogt
IBM Germany

gcc/testsuite/ChangeLog

	* lib/gcc-dg.exp: Add extra options for dg-final to the command
	line only once.

From e89aecf367ffd2e89ac6eec7a04edd2eddd2a0da Mon Sep 17 00:00:00 2001
From: Dominik Vogt v...@linux.vnet.ibm.com
Date: Thu, 20 Aug 2015 10:26:17 +0100
Subject: [PATCH] Add extra compile options for dg-final only once.

A file with many scan-assembler* tests used to add -ffat-lto-objects to
the command line many times, eventually rendering it unreadable.
---
 gcc/testsuite/lib/gcc-dg.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 7ce71df..7c1ab85 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -270,7 +270,7 @@ proc gcc-dg-test-1 { target_compile prog do_what extra_tool_flags } {
 	    set finalcmd [lindex $x 0]
 	    if { [info procs ${finalcmd}_required_options] != "" } {
 		set req [${finalcmd}_required_options]
-		if { $req != "" } {
+		if { $req != "" && [lsearch -exact $extra_tool_flags $req] == -1 } {
 		    lappend extra_tool_flags $req
 		}
 	    }
-- 
2.3.0
[i386] Simplify vector_all_ones_operand
gen_rtx_CONST_VECTOR ensures that there is a single instance of:

  (const_vector:M [(const_int -1) ... (const_int -1)])

for each M, so pointer equality with CONSTM1_RTX is enough.  This seemed
like a better fix than using the helper functions that I'm about to post.

Bootstrapped & regression-tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard

gcc/
	* config/i386/predicates.md (vector_all_ones_operand): Use
	CONSTM1_RTX to simplify definition.

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index a9c8623..bc76a5b 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -912,25 +912,9 @@
 ;; Return true if operand is a vector constant that is all ones.
 (define_predicate "vector_all_ones_operand"
-  (match_code "const_vector")
-{
-  int nunits = GET_MODE_NUNITS (mode);
-
-  if (GET_CODE (op) == CONST_VECTOR
-      && CONST_VECTOR_NUNITS (op) == nunits)
-    {
-      int i;
-      for (i = 0; i < nunits; ++i)
-	{
-	  rtx x = CONST_VECTOR_ELT (op, i);
-	  if (x != constm1_rtx)
-	    return false;
-	}
-      return true;
-    }
-
-  return false;
-})
+  (and (match_code "const_vector")
+       (match_test "INTEGRAL_MODE_P (GET_MODE (op))")
+       (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
 
 ;; Return true when OP is operand acceptable for standard SSE move.
 (define_predicate "vector_move_operand"
Re: Move some flag_unsafe_math_optimizations using simplify and match
On Thu, Aug 20, 2015 at 7:38 AM, Marc Glisse marc.gli...@inria.fr wrote: On Thu, 20 Aug 2015, Hurugalawadi, Naveen wrote: The following testcase does not generate x as needed. double t (double x) { x = sqrt (x) * sqrt (x); return x; } With -fno-math-errno, we CSE the calls to sqrt, so I would expect this to match: (mult (SQRT@1 @0) @1) Without the flag, I expect that one will apply (simplify (mult (SQRT:s @0) (SQRT:s @1)) (SQRT (mult @0 @1))) and then maybe we have something converting sqrt(x*x) to abs(x) or maybe not. ICK. I'd rather have CSE still CSE the two calls by adding some tricks regarding to errno ... I wonder if all the unsafe math optimizations are really ok without -fno-math-errno... Well, on GIMPLE they will preserve the original calls because of their side-effects setting errno... on GENERIC probably not. Richard. -- Marc Glisse
[Patch] Add to the libgfortran/newlib bodge to detect ftruncate support in ARM/AArch64/SH
Hi,

Steve's patch in 2013 [1] to fix the MIPS newlib/libgfortran build causes
subtle issues for an ARM/AArch64 newlib/libgfortran build.  The problem
is that ARM/AArch64 (and SH) define a stub function for ftruncate, which
we would previously have auto-detected, but which is not part of the
hardwiring Steve added.

Continuing the tradition of building bodge on bodge on bodge, this patch
hardwires HAVE_FTRUNCATE on for ARM/AArch64/SH, which does fix the issue
I was seeing.

If this patch is acceptable for trunk, I'd also like to backport it to
the 5.x and 4.9.x release branches.

I'm not quite sure how to effectively verify this patch.  I've looked at
the generated config.h for aarch64-none-elf and arm-none-eabi, and those
come out with HAVE_FTRUNCATE defined.  I wanted to check mips-none-elf,
but I had no success there - the configure failed earlier when trying to
link executables.  I'd appreciate your help, Steve, in checking that this
patch works with your build system.

Thanks,
James

[1]: https://gcc.gnu.org/ml/fortran/2013-09/msg00050.html

---
2015-08-14  James Greenhalgh  james.greenha...@arm.com

	* configure.ac: Define HAVE_FTRUNCATE for ARM/AArch64/SH newlib
	builds.
	* configure: Regenerate.

diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
index 35a8b39..adafb3f 100644
--- a/libgfortran/configure.ac
+++ b/libgfortran/configure.ac
@@ -295,6 +295,13 @@ if test "x${with_newlib}" = xyes; then
    if test x$long_double_math_on_this_cpu = xyes; then
       AC_DEFINE(HAVE_STRTOLD, 1, [Define if you have strtold.])
    fi
+
+   # ARM, AArch64 and SH also provide ftruncate.
+   case ${host} in
+     arm* | aarch64* | sh*)
+       AC_DEFINE(HAVE_FTRUNCATE, 1, [Define if you have ftruncate.])
+       ;;
+   esac
 else
    AC_CHECK_FUNCS_ONCE(getrusage times mkstemp strtof strtold snprintf \
 ftruncate chsize chdir getlogin gethostname kill link symlink sleep ttyname \
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
On 20/08/15 09:24, Matthias Klose wrote:
> On 08/20/2015 06:36 AM, Tom Tromey wrote:
>> Andrew> No, it isn't.  It's still a necessity for initial bootstrapping
>> Andrew> of OpenJDK/IcedTea.
>> Andrew Haley said the opposite here:
>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html
> if you need to bootstrap OpenJDK 6 or OpenJDK 7, then having gcj
> available for the target platform is required.  Starting with OpenJDK 8
> you should be able to cross build OpenJDK 8 with an OpenJDK 8 available
> on the cross platform.  It might be possible to cross build older
> OpenJDK versions, but this usually is painful.

Sure, but we don't need GCJ going forward.  I don't think that there are
any new platforms to which OpenJDK has not been ported which will
require GCJ to bootstrap.  And even if there are, anybody who needs to
do that can (and, indeed, should) use an earlier version of GCJ.  It's
not going to go away; it will always be in the GCC repos.  And because
newer versions of GCC may break GCJ (and maybe OpenJDK) it makes more
sense to use an old GCC/GCJ for the bootstrapping of an old OpenJDK.

Andrew.
Re: [PATCH] Only accept BUILT_IN_NORMAL stringops for interesting_stringop_to_profile_p
Thanks for the comments.  Attached please find the updated patch.  OK?

Index: gcc/value-prof.c
===
--- gcc/value-prof.c	(revision 141081)
+++ gcc/value-prof.c	(working copy)
@@ -209,7 +209,6 @@ gimple_add_histogram_value (struct function *fun,
   hist->fun = fun;
 }
 
-
 /* Remove histogram HIST from STMT's histogram list. */
 
 void
@@ -234,7 +233,6 @@ gimple_remove_histogram_value (struct function *fu
   free (hist);
 }
 
-
 /* Lookup histogram of type TYPE in the STMT. */
 
 histogram_value
@@ -389,6 +387,7 @@ stream_out_histogram_value (struct output_block *o
   if (hist->hvalue.next)
     stream_out_histogram_value (ob, hist->hvalue.next);
 }
+
 /* Dump information about HIST to DUMP_FILE. */
 
 void
@@ -488,7 +487,6 @@ gimple_duplicate_stmt_histograms (struct function
     }
 }
 
-
 /* Move all histograms associated with OSTMT to STMT. */
 
 void
@@ -529,7 +527,6 @@ visit_hist (void **slot, void *data)
   return 1;
 }
 
-
 /* Verify sanity of the histograms. */
 
 DEBUG_FUNCTION void
@@ -594,7 +591,6 @@ free_histograms (void)
     }
 }
 
-
 /* The overall number of invocations of the counter should match
    execution count of basic block.  Report it as error rather than
    internal error as it might mean that user has misused the profile
@@ -638,7 +634,6 @@ check_counter (gimple stmt, const char * name,
   return false;
 }
 
-
 /* GIMPLE based transformations. */
 
 bool
@@ -697,7 +692,6 @@ gimple_value_profile_transformations (void)
   return changed;
 }
 
-
 /* Generate code for transformation 1 (with parent gimple assignment
    STMT and probability of taking the optimal path PROB, which is
    equivalent to COUNT/ALL within roundoff error).  This generates the
@@ -859,6 +853,7 @@ gimple_divmod_fixed_value_transform (gimple_stmt_i
    probability of taking the optimal path PROB, which is equivalent to
    COUNT/ALL within roundoff error).  This generates the result into a
    temp and returns the temp; it does not replace or alter the original
    STMT. */
+
 static tree
 gimple_mod_pow2 (gimple stmt, int prob, gcov_type count, gcov_type all)
 {
@@ -938,6 +933,7 @@ gimple_mod_pow2 (gimple stmt, int prob, gcov_type
 }
 
 /* Do transform 2) on INSN if applicable. */
+
 static bool
 gimple_mod_pow2_value_transform (gimple_stmt_iterator *si)
 {
@@ -1540,15 +1536,15 @@ gimple_ic_transform (gimple_stmt_iterator *gsi)
   return true;
 }
 
-/* Return true if the stringop CALL with FNDECL shall be profiled.
-   SIZE_ARG be set to the argument index for the size of the string
-   operation.
-*/
+/* Return true if the stringop CALL shall be profiled.  SIZE_ARG be
+   set to the argument index for the size of the string operation. */
+
 static bool
-interesting_stringop_to_profile_p (tree fndecl, gimple call, int *size_arg)
+interesting_stringop_to_profile_p (gimple call, int *size_arg)
 {
-  enum built_in_function fcode = DECL_FUNCTION_CODE (fndecl);
+  enum built_in_function fcode;
 
+  fcode = DECL_FUNCTION_CODE (gimple_call_fndecl (call));
   if (fcode != BUILT_IN_MEMCPY && fcode != BUILT_IN_MEMPCPY
      && fcode != BUILT_IN_MEMSET && fcode != BUILT_IN_BZERO)
    return false;
@@ -1573,7 +1569,7 @@ static bool
     }
 }
 
-/* Convert stringop (..., vcall_size)
+/* Convert  stringop (..., vcall_size)
    into
    if (vcall_size == icall_size)
      stringop (..., icall_size);
@@ -1590,11 +1586,9 @@ gimple_stringop_fixed_value (gimple vcall_stmt, tr
   basic_block cond_bb, icall_bb, vcall_bb, join_bb;
   edge e_ci, e_cv, e_iv, e_ij, e_vj;
   gimple_stmt_iterator gsi;
-  tree fndecl;
   int size_arg;
 
-  fndecl = gimple_call_fndecl (vcall_stmt);
-  if (!interesting_stringop_to_profile_p (fndecl, vcall_stmt, &size_arg))
+  if (!interesting_stringop_to_profile_p (vcall_stmt, &size_arg))
     gcc_unreachable ();
 
   cond_bb = gimple_bb (vcall_stmt);
@@ -1673,11 +1667,11 @@ gimple_stringop_fixed_value (gimple vcall_stmt, tr
 
 /* Find values inside STMT for that we want to measure histograms for
    division/modulo optimization. */
+
 static bool
 gimple_stringops_transform (gimple_stmt_iterator *gsi)
 {
   gimple stmt = gsi_stmt (*gsi);
-  tree fndecl;
   tree blck_size;
   enum built_in_function fcode;
   histogram_value histogram;
@@ -1688,14 +1682,11 @@ gimple_stringops_transform (gimple_stmt_iterator *
   tree tree_val;
   int size_arg;
 
-  if (gimple_code (stmt) != GIMPLE_CALL)
+  if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
     return false;
-  fndecl = gimple_call_fndecl (stmt);
-  if (!fndecl)
+
+  if (!interesting_stringop_to_profile_p (stmt, &size_arg))
     return false;
-  fcode = DECL_FUNCTION_CODE (fndecl);
-  if (!interesting_stringop_to_profile_p (fndecl, stmt, &size_arg))
-    return false;
 
   blck_size = gimple_call_arg (stmt, size_arg);
   if (TREE_CODE (blck_size) == INTEGER_CST)
@@ -1704,10 +1695,12 @@ gimple_stringops_transform (gimple_stmt_iterator *
 
   histogram = gimple_histogram_value_of_type (cfun,
Re: [PATCH][2/n] Change dw2_asm_output_offset to allow assembling extra offset
On Wed, 19 Aug 2015, Mike Stump wrote:

> On Aug 19, 2015, at 7:25 AM, Richard Biener rguent...@suse.de wrote:
>> This is needed so that we can output references to
>> $early-debug-symbol + constant offset where $early-debug-symbol is the
>> beginning of a .debug_info section containing early debug info from
>> the compile-stage.  Constant offsets are always fine for any object
>> format I know,
>
> On darwin, generally speaking, they are not.  subsections_via_symbols
> can shed some light on the topic, if one is interested in all the fun.
> I'll give a quick intro below.
>
> foo+n only works if there is no other label of a certain type between
> the label and foo+4, and there are no labels of a certain type at
> foo+4, and foo+n refers to at least one byte after that label, and n is
> non-negative and ...
>
> So, for example, in
>
> 	nop
> foo:
> 	nop
>
> foo+32 would be invalid as nops are 4 bytes or so, and +32 is beyond
> the size of the region.  foo+0 would be fine.  foo+4 would be invalid,
> assuming nop generates 4 bytes.  foo-4 would be invalid.
>
> In:
>
> foo:
> 	nop
> bar:
> 	nop
>
> foo+4 would be invalid, as bar exists.
>
> In:
>
> foo:
> 	nop
> L12:
> 	nop
>
> foo+4 is fine, as local labels don't participate.
>
> One way to think about this is to imagine that each global label points
> to an independent section, and that section isn't loaded unless
> something refers to it, and one can only have pointers to the bytes
> inside that section, and that sections on output can be arbitrarily
> ordered.
>
> bar:
> 	nop
> foo:
> 	nop
>
> bar+4, even if you deferred this to running code, need not refer to
> foo.  I say this as background.
>
> In the optimization where gcc tries to bunch up global variables
> together and form base+offset to get to the different data, this does
> not work on darwin because base+offset isn't a valid way to go from one
> global label to the next, even in the same section.
>
> Now, if you merely sneak in data into the section with no labels, and
> you need to account for N extra bytes before, then you can change the
> existing reference to what it was before + N, without any worry.
>
> If you remove the interior labels to form your new base, and
> concatenate all the data together, then base+N to refer to the data is
> fine, if there are at least N+1 bytes of data after base.
>
> foo:
> 	nop
> bar:
> 	nop
>
> would become:
>
> base:
> Lfoo:
> 	nop
> Lbar:
> 	nop
>
> with base+0 and base+4.
>
> So, if you're confident you know and follow the rules, ok from my
> perspective.  If you're unsure, I can try and read a .s file and see if
> it looks ok.  Testing may not catch broken things unless you also
> select dead code stripping and try test cases with dead code.

I believe that we in the end have

Ldebug_info_from_t1.c:
  ...
Ldebug_info_from_t2.c:
  ...
debug_info:
  refer to Ldebug_info_from_t1.c + offset
  refer to Ldebug_info_from_t2.c + offset

where the references always use a positive offset and are based on the
next preceding label (so Ldebug_info_from_t1.c is not referring to an
entity at Ldebug_info_from_t2.c or beyond).  So that seems to follow the
restrictions you laid out above.

If basic testing doesn't help I'll refrain from doing it ;)  It can only
break LTO in the end.

Richard.
Re: [PATCH GCC]Improve loop bound info by simplifying conversions in iv base
On Fri, Aug 14, 2015 at 4:28 PM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Jul 28, 2015 at 11:38 AM, Bin Cheng bin.ch...@arm.com wrote: Hi, For now, SCEV may compute iv base in the form of (signed T)((unsigned T)base + step)). This complicates other optimizations/analysis depending on SCEV because it's hard to dive into type conversions. For many cases, such type conversions can be simplified with additional range information implied by loop initial conditions. This patch does such simplification. With simplified iv base, loop niter analysis can compute more accurate bound information since sensible value range can be derived for base+step. For example, accurate loop boundmay_be_zero information is computed for cases added by this patch. The code is actually borrowed from loop_exits_before_overflow. Moreover, with simplified iv base, the second case handled in that function now becomes the first case. I didn't remove that part of code because it may(?) still be visited in scev analysis itself and simple_iv isn't an interface for that. Is it OK? It looks quite special given it only handles a very specific pattern. Did you do any larger collecting of statistics on how many times this triggers, esp. how many times simplify_using_initial_conditions succeeds and how many times not? This function is somewhat expensive. Yes, this is corner case targeting induction variables of small signed types, just like added test cases. We need to convert it to unsigned, do the stepping, and convert back. I collected statistics for gcc bootstrap and spec2k6. The function is called about 400-500 times in both case. About 45% of calls succeeded in bootstrap, while only ~3% succeeded in spec2k6. I will prepare a new version patch if you think it's worthwhile in terms of compilation cost and benefit. 
Thanks, bin

+  || !operand_equal_p (iv->step,
+                       fold_convert (type,
+                                     TREE_OPERAND (e, 1)), 0))

operand_equal_p can handle sign-differences in integer constants, no need to fold_convert here. Also if you know that you are comparing integer constants please use tree_int_cst_equal_p.

+  extreme = lower_bound_in_type (type, type);

that's a strange function to call here (with two same types). Looks like just wide_int_to_tree (type, wi::max/min_value (type)).

+  extreme = fold_build2 (MINUS_EXPR, type, extreme, iv->step);

so as iv->step is an INTEGER_CST please do this whole thing using wide_ints and only build trees here:

+  e = fold_build2 (code, boolean_type_node, base, extreme);

Thanks, Richard. Thanks, bin 2015-07-28 Bin Cheng bin.ch...@arm.com * tree-ssa-loop-niter.c (tree_simplify_using_condition): Export the interface. * tree-ssa-loop-niter.h (tree_simplify_using_condition): Declare. * tree-scalar-evolution.c (simple_iv): Simplify type conversions in iv base using loop initial conditions. gcc/testsuite/ChangeLog 2015-07-28 Bin Cheng bin.ch...@arm.com * gcc.dg/tree-ssa/loop-bound-2.c: New test. * gcc.dg/tree-ssa/loop-bound-4.c: New test. * gcc.dg/tree-ssa/loop-bound-6.c: New test.
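A plain-C illustration of the pattern under discussion (the function names here are hypothetical, for illustration only): SCEV steps a small signed induction variable in the corresponding unsigned type to avoid signed overflow, which yields the `(signed T)((unsigned T)base + step)` base. When the loop's initial conditions guarantee that `base + step` stays in range of the signed type, the two conversions are a no-op and the base simplifies to a plain addition.

```c
#include <assert.h>

/* How SCEV expresses the base of a small signed iv: step in the
   unsigned type, then convert back.  */
signed char
base_via_unsigned (signed char base, signed char step)
{
  return (signed char) ((unsigned char) base + (unsigned char) step);
}

/* What the base simplifies to once range information proves that
   base + step cannot overflow the signed type.  */
signed char
base_simplified (signed char base, signed char step)
{
  return base + step;
}
```

For in-range inputs the two agree, which is exactly the condition the patch derives from the loop's initial conditions before dropping the conversions.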
Re: [AArch64] Break -mcpu tie between the compiler and assembler
On 20 August 2015 at 09:15, James Greenhalgh james.greenha...@arm.com wrote: 2015-08-19 James Greenhalgh james.greenha...@arm.com * common/config/aarch64/aarch64-common.c (AARCH64_CPU_NAME_LENGTH): Delete. (aarch64_option_extension): New. (all_extensions): Likewise. (processor_name_to_arch): Likewise. (arch_to_arch_name): Likewise. (all_cores): New. (all_architectures): Likewise. (aarch64_get_extension_string_for_isa_flags): Likewise. (aarch64_rewrite_selected_cpu): Change to rewrite CPU names to architecture names. * config/aarch64/aarch64-protos.h (aarch64_get_extension_string_for_isa_flags): New. * config/aarch64/aarch64.c (aarch64_print_extension): Delete. (aarch64_option_print): Get the string to print from aarch64_get_extension_string_for_isa_flags. (aarch64_declare_function_name): Likewise. * config/aarch64/aarch64.h (BIG_LITTLE_SPEC): Rename to... (MCPU_TO_MARCH_SPEC): This. (ASM_CPU_SPEC): Use it. (BIG_LITTLE_SPEC_FUNCTIONS): Rename to... (MCPU_TO_MARCH_SPEC_FUNCTIONS): ...This. (EXTRA_SPEC_FUNCTIONS): Use it. OK /Marcus
Re: [Patch] Add to the libgfortran/newlib bodge to detect ftruncate support in ARM/AArch64/SH
On 20 August 2015 at 09:31, James Greenhalgh james.greenha...@arm.com wrote: Hi, Steve's patch in 2013 [1] to fix the MIPS newlib/libgfortran build causes subtle issues for an ARM/AArch64 newlib/libgfortran build. The problem is that ARM/AArch64 (and SH) define a stub function for ftruncate, which we would previously have auto-detected, but which is not part of the hardwiring Steve added. Continuing the tradition of building bodge on bodge on bodge, this patch hardwires HAVE_FTRUNCATE on for ARM/AArch64/SH, which does fix the issue I was seeing. This is the second breakage I'm aware of due to the introduction of this hardwire code, the first being related to strtold. My recollection is that it is only the mips target that requires the newlib API hardwiring. Ideally we should rely only on the AC_CHECK_FUNCS_ONCE probe code and avoid the hardwire entirely. Perhaps a better approach for trunk would be something along the lines of: case ${host}--x${with_newlib} in mips*--xyes) hardwire_newlib=1;; esac if test ${hardwire_newlib:-0} -eq 1; then ... existing AC_DEFINES hardwire code else ... existing AC_CHECK_FUNCS_ONCE probe code fi In effect limiting the hardwire to just the target which is unable to probe. For backport to 4.9 and 5 I think James' more conservative patch is probably more appropriate. What do folks think? Cheers /Marcus
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
On Thu, Aug 20, 2015 at 4:48 AM, Andrew Hughes gnu.and...@redhat.com wrote: - Original Message - On Fri, Aug 7, 2015 at 1:21 PM, Uros Bizjak ubiz...@gmail.com wrote: Attached patch fixes: Makefile:871: warning: overriding recipe for target 'gjdoc' Makefile:786: warning: ignoring old recipe for target 'gjdoc' build warning when compiling libjava. The problem was in configure.ac: we have to make the gjdoc build depend on CREATE_WRAPPERS in the same way as the other tools a couple of lines above. While in this area, I also removed an obsolete automake 1.11 workaround. As mentioned in HACKING: Make sure you have Automake 1.11.1 installed. Exactly that version! I have included all generated files in the diff. The changes are small and they illustrate the effect of the patch. 2015-08-07 Uros Bizjak ubiz...@gmail.com * configure.ac (tools/gjdoc): Depend on CREATE_WRAPPERS. * configure: Regenerate. * tools/Makefile.am: Remove unneeded dependencies for Automake 1.11. * tools/Makefile.in: Regenerate. Patch was bootstrapped on x86_64-linux-gnu, Fedora 22. OK for GCC mainline? I have committed this patch to GCC mainline SVN repository. Both issues can be considered obvious and the fix is trivial. Also, it looks like the official classpath repository has been a dead place for a couple of years. Hardly. http://git.savannah.gnu.org/cgit/classpath.git/log/ I was looking at http://www.gnu.org/software/classpath/ where the link still points to the CVS server, with the latest ChangeLog entry from 2012-03-25. The repository you listed above can't be found on what looks like a classpath start page. I've taken the liberty of applying your patch to keep GCJ in sync with its upstream. Thanks! Uros.
RE: [PATCH][AArch64][1/3] Expand signed mod by power of 2 using CSNEG
Ping. https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00710.html Thanks, Kyrill On 03/08/15 14:01, James Greenhalgh wrote: On Fri, Jul 24, 2015 at 11:55:33AM +0100, Kyrill Tkachov wrote: Hi all, This patch implements an aarch64-specific expansion of the signed modulo by a power of 2. The proposed sequence makes use of the conditional negate instruction CSNEG. For a power of two N, x % N can be calculated with:

negs  x1, x0
and   x0, x0, #(N - 1)
and   x1, x1, #(N - 1)
csneg x0, x0, x1, mi

So, for N == 256 this would be:

negs  x1, x0
and   x0, x0, #255
and   x1, x1, #255
csneg x0, x0, x1, mi

For comparison, the existing sequence emitted by expand_smod_pow2 in expmed.c is:

asr x1, x0, 63
lsr x1, x1, 56
add x0, x0, x1
and x0, x0, 255
sub x0, x0, x1

Note that the CSNEG sequence is one instruction shorter and that the two and operations are independent, compared to the existing sequence where all instructions are dependent on the preceding instructions. For the special case of N == 2 we can do even better:

cmp   x0, xzr
and   x0, x0, 1
csneg x0, x0, x0, ge

I first tried implementing this in the generic code in expmed.c but that didn't work out for a few reasons: * This relies on having a conditional-negate instruction. We could gate it on HAVE_conditional_move and the combiner is capable of merging the final negate into the conditional move if a conditional negate is available (like on aarch64) but on targets without a conditional negate this would end up emitting a separate negate. * The first negs has to be a negs for the sequence to be a win i.e. having a separate negate and compare makes the sequence slower than the existing one (at least in my microbenchmarking) and I couldn't get subsequent passes to combine the negate and compare into the negs (presumably due to the use of the negated result in one of the ands). Doing it in the aarch64 backend where I could just call the exact gen_* functions that I need worked much more cleanly.
The costing logic is updated to reflect this sequence during the initialisation of expmed.c where it calculates the smod_pow2_cheap metric. The tests will come in patch 3 of the series which are partly shared with the equivalent arm implementation. Bootstrapped and tested on aarch64. Ok for trunk?

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9d88a60..7bb4a55 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6639,8 +6639,26 @@ cost_plus:
       if (VECTOR_MODE_P (mode))
         *cost += extra_cost->vect.alu;
       else if (GET_MODE_CLASS (mode) == MODE_INT)
-        *cost += (extra_cost->mult[mode == DImode].add
-                  + extra_cost->mult[mode == DImode].idiv);
+        {
+          /* We can expand signed mod by power of 2 using a
+             NEGS, two parallel ANDs and a CSNEG.  Assume here
+             that CSNEG is COSTS_N_INSNS (1).  This case should
+             only ever be reached through the set_smod_pow2_cheap check
+             in expmed.c.  */
+          if (code == MOD
+              && CONST_INT_P (XEXP (x, 1))
+              && exact_log2 (INTVAL (XEXP (x, 1))) > 0
+              && (mode == SImode || mode == DImode))
+            {
+              *cost += COSTS_N_INSNS (3)
+                       + 2 * extra_cost->alu.logical
+                       + extra_cost->alu.arith;
+              return true;
+            }
+
+          *cost += (extra_cost->mult[mode == DImode].add
+                    + extra_cost->mult[mode == DImode].idiv);
+        }
       else if (mode == DFmode)
         *cost += (extra_cost->fp[1].mult
                   + extra_cost->fp[1].div);

This looks like it calculates the wrong cost for !speed. I think we will still expand through mod<mode>3 when compiling for size, so we probably still want to cost the multiple instructions. Have I misunderstood? You're right, the logic needs a bit of wiggling to do the right thing. I've moved this case into a separate MOD case and added a gate on speed for the extra costs addition. Ok? Thanks, Kyrill 2015-08-13 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/aarch64/aarch64.md (mod<mode>3): New define_expand. (*neg<mode>2_compare0): Rename to... (neg<mode>2_compare0): ... This.
* config/aarch64/aarch64.c (aarch64_rtx_costs, MOD case): Move check for speed inside the if-then-elses. Reflect CSNEG sequence in MOD by power of 2 case. Thanks, James aarch64-mod-2.patch commit de67e5fba716ce835f595c4167f57ec4faf61607 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Wed Jul 15 17:01:13 2015 +0100 [AArch64][1/3] Expand signed mod by power of 2 using CSNEG diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 1394ed7..c8bd8d2 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -6652,6 +6652,25 @@ cost_plus: return
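For reference, the proposed CSNEG expansion can be modelled in plain C. This is a sketch for illustration, not the backend code itself; it ignores the x == INT_MIN case, where negation overflows.

```c
#include <assert.h>

/* C model of the CSNEG expansion of x % n for a power-of-two n:
     negs  x1, x0           ; x1 = -x, flags set from the negation
     and   x0, x0, #(n - 1)
     and   x1, x1, #(n - 1)
     csneg x0, x0, x1, mi   ; mi: -x is negative, i.e. x > 0,
                            ; so pick x0, otherwise pick -x1.  */
int
smod_pow2 (int x, int n)
{
  int neg = -x;                 /* undefined for INT_MIN; ignored here */
  int pos_rem = x & (n - 1);    /* remainder when x >= 0 */
  int neg_rem = neg & (n - 1);  /* magnitude of remainder when x < 0 */
  return x > 0 ? pos_rem : -neg_rem;
}
```

This matches C's truncated `%` for signed operands, e.g. `smod_pow2 (-5, 4)` gives -1, same as `-5 % 4`.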
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
On 08/20/2015 06:36 AM, Tom Tromey wrote: Andrew No, it isn't. It's still a necessity for initial bootstrapping of Andrew OpenJDK/IcedTea. Andrew Haley said the opposite here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html if you need bootstrapping OpenJDK 6 or OpenJDK 7, then having gcj available for the target platform is required. Starting with OpenJDK 8 you should be able to cross build OpenJDK 8 with an OpenJDK 8 available on the cross platform. It might be possible to cross build older OpenJDK versions, but this usually is painful. Matthias
RE: [PATCH, MIPS, Ping] Inline memcpy for MipsR6
Checked in as revision 227026. Thanks, Simon -Original Message- From: Moore, Catherine [mailto:catherine_mo...@mentor.com] Sent: 01 August 2015 20:18 To: Simon Dardis; gcc-patches@gcc.gnu.org Cc: Moore, Catherine Subject: RE: [PATCH, MIPS, Ping] Inline memcpy for MipsR6 -Original Message- From: Simon Dardis [mailto:simon.dar...@imgtec.com] Sent: Wednesday, July 29, 2015 4:29 AM To: gcc-patches@gcc.gnu.org Cc: Moore, Catherine Subject: [PATCH, MIPS, Ping] Inline memcpy for MipsR6 This patch enables inline memcpy for R6 which was previously disabled and adds support for expansion when source and destination are at least half-word aligned. https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00749.html Hi Simon, Two things need to be fixed up with this patch before committing. 1. The new test inline-memcpy-2.c should not be run with -Os (like the other new tests that you submitted). 2. Your patch is against older source than what is currently in the repository, causing this hunk not to apply cleanly:

@@ -8311,8 +8321,8 @@ bool
 mips_expand_block_move (rtx dest, rtx src, rtx length)
 {
   if (!ISA_HAS_LWL_LWR
-      && (MEM_ALIGN (src) < BITS_PER_WORD
-          || MEM_ALIGN (dest) < BITS_PER_WORD))
+      && (MEM_ALIGN (src) < MIPS_MIN_MOVE_MEM_ALIGN
+          || MEM_ALIGN (dest) < MIPS_MIN_MOVE_MEM_ALIGN))
     return false;
   if (CONST_INT_P (length))

The correct patch should look like this:

@@ -7780,8 +7790,9 @@ bool
 mips_expand_block_move (rtx dest, rtx src, rtx length)
 {
-  /* Disable entirely for R6 initially.  */
-  if (!ISA_HAS_LWL_LWR)
+  if (!ISA_HAS_LWL_LWR
+      && (MEM_ALIGN (src) < MIPS_MIN_MOVE_MEM_ALIGN
+          || MEM_ALIGN (dest) < MIPS_MIN_MOVE_MEM_ALIGN))
     return false;
   if (CONST_INT_P (length))

Okay with those changes. Thanks, Catherine
Re: [middle-end,patch] Making __builtin_signbit type-generic
Joseph Myers jos...@codesourcery.com writes: On Wed, 19 Aug 2015, Andreas Schwab wrote: Why only in unsafe mode? Isn't the sign bit of NaN always unreliable? NaN sign bits are meaningful for a limited set of operations. And what are those? Assignment to the same type, negation, absolute value, copysign, signbit. Thanks, that means I have to fix the fpu emulation. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
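The short list of operations above can be demonstrated directly. A quick sketch, assuming an IEEE 754 double (the functions used are all standard C99 `<math.h>` interfaces):

```c
#include <assert.h>
#include <math.h>

/* Exercise the operations for which a NaN's sign bit is meaningful:
   copysign writes it, signbit reads it, fabs clears it, and negation
   flips it -- all without raising exceptions.  */
int
nan_sign_demo (void)
{
  double n = copysign (NAN, -1.0);  /* a NaN with the sign bit set */
  int was_negative = signbit (n) != 0;
  double p = fabs (n);              /* absolute value clears the bit */
  double m = -p;                    /* negation sets it again */
  return was_negative
         && !signbit (p)
         && signbit (m) != 0
         && isnan (m);              /* still a NaN throughout */
}
```

Ordinary comparisons and arithmetic on NaN, by contrast, give no access to the sign bit, which is why `__builtin_signbit` has to inspect the representation rather than compare against zero.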
[AArch64] Break -mcpu tie between the compiler and assembler
On Wed, Aug 19, 2015 at 04:48:11PM +0100, Andrew Pinski wrote: On Wed, Aug 19, 2015 at 11:39 PM, James Greenhalgh james.greenha...@arm.com wrote: Hi, This patch has been sitting in my tree for a while - it comes in handy when trying out bootstrap or test with -mcpu values like -mcpu=cortex-a72 with a system assembler which trails trunk binutils. Essentially, we rewrite -mcpu=foo to a -march flag providing the same architecture revision and set of optional architecture features. There is no reason we should ever need the assembler to see a CPU name, it should only be interested in the architecture variant. While we're there, I've long found this function too fragile and hard to grok in C. So I've rewritten it in C++ to use std::string rather than raw C strings. Making this work with extension strings requires a slight refactor to the existing extension printing code to pull it across to somewhere common. Note that this also stops us from having to pick through a big.LITTLE target to find and separate the core names - we can just look up the architecture of the whole target and use that. The new function does leak the allocation of a C string to hold the result, but looking at gcc.c:getenv_spec_function and gcc.c:replace_extension_spec_func this is the usual thing to do. This has been through an aarch64-none-linux-gnu bootstrap and test run, configured with --with-cpu=cortex-a72, which my system assembler does not understand. Ok?

+  modified string, which seems much worse!  */
+  char *output = (char*) xmalloc (sizeof (*output)
+                                  * (outstr.length () + 1));
+  strcpy (output, outstr.c_str ());

Why not just:

  char *output = xstrdup (outstr.c_str ());

Or at least use XNEWVEC instead of xmalloc with a cast? Makes sense to me, patch updated along those lines. OK? Thanks, James --- 2015-08-19 James Greenhalgh james.greenha...@arm.com * common/config/aarch64/aarch64-common.c (AARCH64_CPU_NAME_LENGTH): Delete. (aarch64_option_extension): New. (all_extensions): Likewise.
(processor_name_to_arch): Likewise. (arch_to_arch_name): Likewise. (all_cores): New. (all_architectures): Likewise. (aarch64_get_extension_string_for_isa_flags): Likewise. (aarch64_rewrite_selected_cpu): Change to rewrite CPU names to architecture names. * config/aarch64/aarch64-protos.h (aarch64_get_extension_string_for_isa_flags): New. * config/aarch64/aarch64.c (aarch64_print_extension): Delete. (aarch64_option_print): Get the string to print from aarch64_get_extension_string_for_isa_flags. (aarch64_declare_function_name): Likewise. * config/aarch64/aarch64.h (BIG_LITTLE_SPEC): Rename to... (MCPU_TO_MARCH_SPEC): This. (ASM_CPU_SPEC): Use it. (BIG_LITTLE_SPEC_FUNCTIONS): Rename to... (MCPU_TO_MARCH_SPEC_FUNCTIONS): ...This. (EXTRA_SPEC_FUNCTIONS): Use it. diff --git a/gcc/common/config/aarch64/aarch64-common.c b/gcc/common/config/aarch64/aarch64-common.c index 726c625..07c6bba 100644 --- a/gcc/common/config/aarch64/aarch64-common.c +++ b/gcc/common/config/aarch64/aarch64-common.c @@ -27,7 +27,7 @@ #include common/common-target-def.h #include opts.h #include flags.h -#include errors.h +#include diagnostic.h #ifdef TARGET_BIG_ENDIAN_DEFAULT #undef TARGET_DEFAULT_TARGET_FLAGS @@ -107,36 +107,134 @@ aarch64_handle_option (struct gcc_options *opts, struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER; -#define AARCH64_CPU_NAME_LENGTH 128 +/* An ISA extension in the co-processor and main instruction set space. */ +struct aarch64_option_extension +{ + const char *const name; + const unsigned long flags_on; + const unsigned long flags_off; +}; + +/* ISA extensions in AArch64. 
*/ +static const struct aarch64_option_extension all_extensions[] = +{ +#define AARCH64_OPT_EXTENSION(NAME, FLAGS_ON, FLAGS_OFF, FEATURE_STRING) \ + {NAME, FLAGS_ON, FLAGS_OFF}, +#include config/aarch64/aarch64-option-extensions.def +#undef AARCH64_OPT_EXTENSION + {NULL, 0, 0} +}; + +struct processor_name_to_arch +{ + const std::string processor_name; + const enum aarch64_arch arch; + const unsigned long flags; +}; + +struct arch_to_arch_name +{ + const enum aarch64_arch arch; + const std::string arch_name; +}; + +/* Map processor names to the architecture revision they implement and + the default set of architectural feature flags they support. */ +static const struct processor_name_to_arch all_cores[] = +{ +#define AARCH64_CORE(NAME, X, IDENT, ARCH_IDENT, FLAGS, COSTS, IMP, PART) \ + {NAME, AARCH64_ARCH_##ARCH_IDENT, FLAGS}, +#include config/aarch64/aarch64-cores.def +#undef AARCH64_CORE + {generic, AARCH64_ARCH_8A, AARCH64_FL_FOR_ARCH8}, + {, aarch64_no_arch, 0} +}; + +/* Map architecture revisions to their string representation. */ +static const struct
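The core of the rewrite the patch performs can be sketched in plain C: a table maps each `-mcpu` name to the architecture revision it implements, and the spec function returns the corresponding `-march` value so the assembler never sees a CPU name. The table entries and function name below are illustrative only, not the patch's actual data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from CPU name to the architecture it
   implements; the real tables are generated from aarch64-cores.def.  */
struct cpu_to_arch { const char *cpu; const char *arch; };

static const struct cpu_to_arch cpu_arch_table[] = {
  { "cortex-a53", "armv8-a" },
  { "cortex-a72", "armv8-a" },
  { "generic",    "armv8-a" },
  { NULL, NULL }
};

/* Rewrite an -mcpu value to the -march value the assembler should see.
   Returns NULL for an unknown CPU so the caller can diagnose it.  */
const char *
rewrite_mcpu_to_march (const char *cpu)
{
  for (int i = 0; cpu_arch_table[i].cpu != NULL; i++)
    if (strcmp (cpu_arch_table[i].cpu, cpu) == 0)
      return cpu_arch_table[i].arch;
  return NULL;
}
```

With this shape, a trunk compiler that knows `-mcpu=cortex-a72` only ever asks the (possibly older) system assembler for `-march=armv8-a` plus the extension string, which it does understand.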
Re: Move some flag_unsafe_math_optimizations using simplify and match
On Thu, Aug 20, 2015 at 6:48 AM, Hurugalawadi, Naveen naveen.hurugalaw...@caviumnetworks.com wrote: Hi, Thanks again for your review and useful comments. I see. But I can't really help without a testcase that I can use to have a look (same for the above issue with the segfaults). The following testcase does not generate x as needed.

double t (double x) { x = sqrt (x) * sqrt (x); return x; }

Works for me if you specify -fno-math-errno. I think that's a regression we can accept. Later on GIMPLE CSE fails to CSE the two calls (because of the unknown side-effects, special-casing of (some) builtins would be necessary). All of the following operations result in a segfault with: aarch64-thunder-elf-gcc simlify-2.c -O2 -funsafe-math-optimizations

===
#include <math.h>
double t (double x, double y, double z)
{
  x = cbrt (x) * cbrt (y);
  x = exp10 (x) * exp10 (y);
  x = pow10 (x) * pow10 (y);
  x = x / cbrt (x/y);
  x = x / exp10 (y);
  x = x / pow10 (y);
  return x;
}
float t (float x, float y, float z)
{
  x = sqrtf (x) * sqrtf (y);
  x = expf (x) * expf (y);
  x = powf (x, y) * powf (x, z);
  x = x / expf (y);
  return x;
}
long double t1 (long double x, long double y, long double z)
{
  x = sqrtl (x) * sqrtl (y);
  x = expl (x) * expl (y);
  x = powl (x, y) * powl (x, z);
  x = x / expl (y);
  return x;
}
===

/* Simplify sqrt(x) * sqrt(y) -> sqrt(x*y). */
(simplify
 (mult (SQRT:s @0) (SQRT:s @1))
 (SQRT (mult @0 @1)))
/* Simplify pow(x,y) * pow(x,z) -> pow(x,y+z). */
(simplify
 (mult (POW:s @0 @1) (POW:s @0 @2))
 (POW @0 (plus @1 @2)))
/* Simplify expN(x) * expN(y) -> expN(x+y). */
(simplify
 (mult (EXP:s @0) (EXP:s @1))
 (EXP (plus @0 @1)))
/* Simplify x / expN(y) into x*expN(-y). */
(simplify
 (rdiv @0 (EXP @1))
 (mult @0 (EXP (negate @1))))
===

A quick fix to avoid this ICE would be to disable the pattern for -fmath-errno. Disabled the pattern for -fmath-errno. If you open a bugreport with the pattern and a testcase I'm going to have a closer look.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67285 Thanks for the detailed explanation of :s. Please let me know whether the working patch can be committed? If its okay and with your approval, I would like to move some more patterns using match and simplify. Can you re-post with the typo fix and the missing :s? Thanks, Richard. Thanks, Naveen
Re: [PATCH] S390: Fix vec_load_bndry.
On Tue, Aug 18, 2015 at 02:49:30PM +0200, Ulrich Weigand wrote: Dominik Vogt wrote: The attached patch fixes the vec_load_bndry builtin on S390. The second argument must be one of 64, 128, 256, ..., 4096, but the table expected that value to fit into 3 bits. Makes sense. However, I'd really like to see a testcase that verifies we're accepting the correct values and generate correct assembler ... Sure; updated patch atteched. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany From 689f07b5c98be80cf437981c2cffe20d3c339f57 Mon Sep 17 00:00:00 2001 From: Dominik Vogt v...@linux.vnet.ibm.com Date: Tue, 18 Aug 2015 13:11:08 +0100 Subject: [PATCH] S390: Fix vec_load_bndry. In one place it required 64, 128, ..., 4096 as the second argument and in another place it required that value to fit into three bits. --- gcc/config/s390/s390-builtins.def | 18 ++--- .../gcc.target/s390/zvector/vec-load_bndry-1.c | 80 ++ 2 files changed, 89 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-load_bndry-1.c diff --git a/gcc/config/s390/s390-builtins.def b/gcc/config/s390/s390-builtins.def index 9b11e41..3250eef 100644 --- a/gcc/config/s390/s390-builtins.def +++ b/gcc/config/s390/s390-builtins.def @@ -447,15 +447,15 @@ B_DEF (s390_vllezf,vec_insert_and_zerov4si,0, B_DEF (s390_vllezg,vec_insert_and_zerov2di,0, B_VX, 0, BT_FN_UV2DI_ULONGLONGCONSTPTR) OB_DEF (s390_vec_load_bndry,s390_vec_load_bndry_s8,s390_vec_load_bndry_dbl,B_VX,BT_FN_OV4SI_INTCONSTPTR_INT) -OB_DEF_VAR (s390_vec_load_bndry_s8, s390_vlbb, O2_U3, BT_OV_V16QI_SCHARCONSTPTR_USHORT) -OB_DEF_VAR (s390_vec_load_bndry_u8, s390_vlbb, O2_U3, BT_OV_UV16QI_UCHARCONSTPTR_USHORT) -OB_DEF_VAR (s390_vec_load_bndry_s16,s390_vlbb, O2_U3, BT_OV_V8HI_SHORTCONSTPTR_USHORT) -OB_DEF_VAR (s390_vec_load_bndry_u16,s390_vlbb, O2_U3, BT_OV_UV8HI_USHORTCONSTPTR_USHORT) -OB_DEF_VAR (s390_vec_load_bndry_s32,s390_vlbb, O2_U3, BT_OV_V4SI_INTCONSTPTR_USHORT) -OB_DEF_VAR (s390_vec_load_bndry_u32,s390_vlbb, O2_U3, 
BT_OV_UV4SI_UINTCONSTPTR_USHORT) -OB_DEF_VAR (s390_vec_load_bndry_s64,s390_vlbb, O2_U3, BT_OV_V2DI_LONGLONGCONSTPTR_USHORT) -OB_DEF_VAR (s390_vec_load_bndry_u64,s390_vlbb, O2_U3, BT_OV_UV2DI_ULONGLONGCONSTPTR_USHORT) -OB_DEF_VAR (s390_vec_load_bndry_dbl,s390_vlbb, O2_U3, BT_OV_V2DF_DBLCONSTPTR_USHORT) +OB_DEF_VAR (s390_vec_load_bndry_s8, s390_vlbb, O2_U16, BT_OV_V16QI_SCHARCONSTPTR_USHORT) +OB_DEF_VAR (s390_vec_load_bndry_u8, s390_vlbb, O2_U16, BT_OV_UV16QI_UCHARCONSTPTR_USHORT) +OB_DEF_VAR (s390_vec_load_bndry_s16,s390_vlbb, O2_U16, BT_OV_V8HI_SHORTCONSTPTR_USHORT) +OB_DEF_VAR (s390_vec_load_bndry_u16,s390_vlbb, O2_U16, BT_OV_UV8HI_USHORTCONSTPTR_USHORT) +OB_DEF_VAR (s390_vec_load_bndry_s32,s390_vlbb, O2_U16, BT_OV_V4SI_INTCONSTPTR_USHORT) +OB_DEF_VAR (s390_vec_load_bndry_u32,s390_vlbb, O2_U16, BT_OV_UV4SI_UINTCONSTPTR_USHORT) +OB_DEF_VAR (s390_vec_load_bndry_s64,s390_vlbb, O2_U16, BT_OV_V2DI_LONGLONGCONSTPTR_USHORT) +OB_DEF_VAR (s390_vec_load_bndry_u64,s390_vlbb, O2_U16, BT_OV_UV2DI_ULONGLONGCONSTPTR_USHORT) +OB_DEF_VAR (s390_vec_load_bndry_dbl,s390_vlbb, O2_U16, BT_OV_V2DF_DBLCONSTPTR_USHORT) B_DEF (s390_vlbb, vlbb, 0, B_VX, O2_U3, BT_FN_UV16QI_UCHARCONSTPTR_USHORT) diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-load_bndry-1.c b/gcc/testsuite/gcc.target/s390/zvector/vec-load_bndry-1.c new file mode 100644 index 000..9ebf6c7 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/zvector/vec-load_bndry-1.c @@ -0,0 +1,80 @@ +/* { dg-do compile { target { s390*-*-* } } } */ +/* { dg-options -O0 -mzarch -march=z13 -mzvector } */ + +#include vecintrin.h + +signed char +foo64 (signed char *p) +{ + return vec_load_bndry (p, 64)[0]; + /* { dg-final { scan-assembler-times \tvlbb\t%v..?,0\\(%r..?\\),0 1 } } */ +} + +signed char +foo128 (signed char *p) +{ + return +vec_load_bndry (p, 128)[0] ++ vec_load_bndry (p + 16, 128)[0]; + /* { dg-final { scan-assembler-times \tvlbb\t%v..?,0\\(%r..?\\),1 2 } } */ +} + +signed char +foo256 (signed char *p) +{ + return 
+vec_load_bndry (p, 256)[0] ++ vec_load_bndry (p + 16, 256)[0] ++ vec_load_bndry (p + 32, 256)[0]; + /* { dg-final { scan-assembler-times \tvlbb\t%v..?,0\\(%r..?\\),2 3 } } */ +} + +signed char +foo512 (signed char *p) +{ + return +vec_load_bndry (p,
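The mismatch the patch fixes is between two encodings of the same quantity. The vlbb instruction's 3-bit field holds log2(boundary) - 6, so 64 maps to 0, 128 to 1, and so on up to 4096 mapping to 6 (the testcase's expected `,0`/`,1`/`,2` operands show this). The builtin's second argument, by contrast, is the boundary value itself, 64..4096, which needs a 16-bit constant check (O2_U16), not a 3-bit one (O2_U3). A hypothetical helper, for illustration (uses the GCC-specific `__builtin_ctz`):

```c
#include <assert.h>

/* Map a vec_load_bndry boundary value to the 3-bit code the vlbb
   instruction encodes, or -1 if the value is not a valid boundary
   (a power of two in 64..4096).  */
int
vlbb_boundary_code (unsigned b)
{
  if (b < 64 || b > 4096 || (b & (b - 1)) != 0)
    return -1;
  return __builtin_ctz (b) - 6;  /* log2(b) - 6 */
}
```

The old table effectively range-checked the 64..4096 argument as if it were already the 0..7 code, rejecting every legal boundary value.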
Re: [PATCH] Fix middle-end/67133, part 1
Marek Polacek pola...@redhat.com writes: PR middle-end/67133 * gimple-ssa-isolate-paths.c (insert_trap_and_remove_trailing_statements): Rename to ... (insert_trap): ... this. Don't remove trailing statements; split block instead. (find_explicit_erroneous_behaviour): Don't remove all outgoing edges. This breaks go on aarch64:

../../../libgo/go/encoding/gob/decode.go: In function ‘gob.decIgnoreOpFor.pN20_encoding_gob.Decoder’:
../../../libgo/go/encoding/gob/decode.go:843:1: internal compiler error: in operator[], at vec.h:714
 func (dec *Decoder) decIgnoreOpFor(wireId typeId) decOp {
 ^
0xac5c3b vec<edge_def*, va_gc, vl_embed>::operator[](unsigned int)
        ../../gcc/vec.h:714
0xac5c3b extract_true_false_edges_from_block(basic_block_def*, edge_def**, edge_def**)
        ../../gcc/tree-cfg.c:8456
0xace9bf gimple_verify_flow_info
        ../../gcc/tree-cfg.c:5260
0x6ea1ab verify_flow_info()
        ../../gcc/cfghooks.c:260
0xadeca3 cleanup_tree_cfg_noloop
        ../../gcc/tree-cfgcleanup.c:739
0xadeca3 cleanup_tree_cfg()
        ../../gcc/tree-cfgcleanup.c:788
0x9d21c3 execute_function_todo
        ../../gcc/passes.c:1900
0x9d2b07 execute_todo
        ../../gcc/passes.c:2005

Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
Re: [RFC, patch] New attribute to create target clones
On Mon, Aug 3, 2015 at 9:43 PM, Jeff Law l...@redhat.com wrote: On 07/30/2015 04:19 PM, Evgeny Stupachenko wrote: Hi All, The patch enables new attribute 'ctarget', The attribute force compiler to create clones of a function with the attribute. For example: __attribute__((ctarget(avx,arch=slm,arch=core-avx2,default))) So presumably we're allowing both changing the ISA and the tuning options? In fact, it looks like we're able to change any -m option, right? What about something like -mregparm? I think the docs need to disallow clones with different ABI requirements. -mregparm is not allowed now. The targetm.target_option.valid_attribute_p hook specify which -m option is allowed for architecture. Here patch reuses Function Multiversioning methods. Currently default ctarget means that foo() will be optimized with current compiler options and target. The other option is to switch target to target specific minimum (like target x86-64). Is it better? I could make an argument for either. Do we have anything to guide us from other compilers such as ICC that may have a similar capability? Not sure. However ICC has similar to Function Multiversioning: __declcpec(cpu_specific(... where default is generic. I think for default we should do the same as Function Multiversioning - keep compiler options. That way users will be able to create target specific minimum by passing corresponding options to command line. What do you think about attribute name? 'ctarget' is short but not informative. Other variants are 'target_clones', 'targets'... target_clones seems good. For multiple_target.c: Can you please trim down the #include set. I can't believe you need all that stuff.If you really need backend stuff (tm.h), then use backend.h instead. It generally looks reasonable, but Jan knows the IPA code much better than I do and I'd like him to chime in. You might also ask Ilya to review the cloning and rules around clones since he head to deal with a lot of that stuff in his MPX work. jeff
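In spirit, the proposed attribute compiles one clone per listed target plus a resolver that selects a clone at load time (GCC implements this with an ifunc resolver). A plain-C analogy with a hand-written resolver, names illustrative only:

```c
#include <assert.h>

/* Two clones of the same function: identical semantics, but in the
   real feature each would be compiled with different target options
   (e.g. "avx2" vs. "default").  */
typedef int (*foo_fn) (int);

static int foo_default (int x) { return x + 1; }
static int foo_avx2    (int x) { return x + 1; }

/* Stand-in for the generated ifunc resolver: pick a clone based on a
   runtime CPU-feature check (here just a flag for illustration).  */
static foo_fn
foo_resolver (int cpu_has_avx2)
{
  return cpu_has_avx2 ? foo_avx2 : foo_default;
}
```

Whichever clone the resolver picks, callers see one symbol with one behavior; only the code generation differs, which is also why clones with different ABI-affecting options (like -mregparm) have to be rejected.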
Fix intelmic-mkoffload.c if the temp path contains a '-'
Hi all, during my test of OpenMP 4.0 offloading features I have found a bug in intelmic-mkoffload.c when the temp path contains a '-'. objcopy will in this case replace it with a '_' which wasn't reflected in the original code and resulted in a link error of the symbols '__offload_image_intelmic_start' and '__offload_image_intelmic_end'. This is my first contribution, so just reply if anything is wrong and I'll happily fix it. Greetings, Jonas -- Jonas Hahnfeld, MATSE-Auszubildender IT Center Group: High Performance Computing Division: Computational Science and Engineering RWTH Aachen University Seffenter Weg 23 D 52074 Aachen (Germany) hahnf...@itc.rwth-aachen.de www.itc.rwth-aachen.de Fix-intelmic-mkoffload.c-if-the-temp-path-contains-a.patch Description: Binary data smime.p7s Description: S/MIME cryptographic signature
[SPARC] Simplify const_all_ones_operand
gen_rtx_CONST_VECTOR ensures that there is a single instance of: (const_vector:M [(const_int -1) ... (const_int -1)]) for each M, so pointer equality with CONSTM1_RTX is enough. Also, HOST_BITS_PER_WIDE_INT == 32 is doubly dead: HOST_WIDE_INT is always 64 bits now, and we always use const_int rather than const_double or const_wide_int for all-ones values (or any other value that fits in a signed HOST_WIDE_INT). This seemed like a better fix than using the helper functions that I'm about to post. Tested with a cross-compiler and ensured that the predicate was still accepting all (-)1 values. OK to install? Thanks, Richard gcc/ * config/sparc/predicates.md (const_all_ones_operand): Use CONSTM1_RTX to simplify definition.

diff --git a/gcc/config/sparc/predicates.md b/gcc/config/sparc/predicates.md
index 88537c6..aa45f8e 100644
--- a/gcc/config/sparc/predicates.md
+++ b/gcc/config/sparc/predicates.md
@@ -27,31 +27,9 @@
 ;; Return true if the integer representation of OP is
 ;; all-ones.
 (define_predicate "const_all_ones_operand"
-  (match_code "const_int,const_double,const_vector")
-{
-  if (GET_CODE (op) == CONST_INT && INTVAL (op) == -1)
-    return true;
-#if HOST_BITS_PER_WIDE_INT == 32
-  if (GET_CODE (op) == CONST_DOUBLE
-      && GET_MODE (op) == VOIDmode
-      && CONST_DOUBLE_HIGH (op) == ~(HOST_WIDE_INT)0
-      && CONST_DOUBLE_LOW (op) == ~(HOST_WIDE_INT)0)
-    return true;
-#endif
-  if (GET_CODE (op) == CONST_VECTOR)
-    {
-      int i, num_elem = CONST_VECTOR_NUNITS (op);
-
-      for (i = 0; i < num_elem; i++)
-        {
-          rtx n = CONST_VECTOR_ELT (op, i);
-          if (! const_all_ones_operand (n, mode))
-            return false;
-        }
-      return true;
-    }
-  return false;
-})
+  (and (match_code "const_int,const_double,const_vector")
+       (match_test "INTEGRAL_MODE_P (GET_MODE (op))")
+       (match_test "op == CONSTM1_RTX (GET_MODE (op))")))

 ;; Return true if OP is the integer constant 4096.
 (define_predicate "const_4096_operand"
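The argument above rests on hash-consing: because gen_rtx_CONST_VECTOR interns constants, structurally equal constants share one object, so `op == CONSTM1_RTX (mode)` replaces the element-by-element walk. A toy interning scheme showing the same property (the pool here is a hypothetical stand-in for the RTL constant tables):

```c
#include <assert.h>

#define POOL_SIZE 16

static int pool_vals[POOL_SIZE];
static int pool_used;

/* Return a canonical pointer for VAL: equal values always get the
   same pointer, so callers can compare with == instead of comparing
   contents.  (No overflow handling -- illustration only.)  */
const int *
intern_int (int val)
{
  for (int i = 0; i < pool_used; i++)
    if (pool_vals[i] == val)
      return &pool_vals[i];
  pool_vals[pool_used] = val;
  return &pool_vals[pool_used++];
}
```

The predicate's extra `INTEGRAL_MODE_P` test plays the role of the old explicit code checks: only for integral modes is CONSTM1_RTX the interned all-ones constant the caller means.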
Re: [PATCH] Fix middle-end/67133, part 1
Marek Polacek pola...@redhat.com writes: Whilst I'm struggling with building cross libgo to reproduce this, is there something like preprocessed source for go? I don't think so. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
Re: Move some flag_unsafe_math_optimizations using simplify and match
On Thu, Aug 20, 2015 at 11:18 AM, Hurugalawadi, Naveen naveen.hurugalaw...@caviumnetworks.com wrote: Hi, Works for me if you specify -fno-math-errno. I think that's a regression we can accept. Modified the pattern with fno-math-errno as a condition. Can you re-post with the typo fix and the missing :s? Please find attached the modified patch as per the review comments. Please suggest if there should be any further modifications. You marked + /* Simplify x * pow(x,c) - pow(x,c+1). */ + (simplify + (mult @0 (POW:s @0 REAL_CST@1)) + (if (!flag_errno_math +!TREE_OVERFLOW (@1)) + (POW @0 (plus @1 { build_one_cst (type); } with !flag_errno_math to avoid ICEs when replacing a non-call with a call. But ... + /* Simplify sin(x) / cos(x) - tan(x). */ + (simplify + (rdiv (SIN:s @0) (COS:s @0)) + (TAN @0)) has exactly the same issue, so does the following (and maybe others) + /* Simplify cos(x) / sin(x) - 1 / tan(x). */ + (simplify + (rdiv (COS:s @0) (SIN:s @0)) + (rdiv { build_one_cst (type); } (TAN @0))) so I presume those simply do not trigger late (on GIMPLE) for any existing testcases. So to not expose this (latent) issue please wait until I find the time to fix the underlying issue in a more generic way. I will have a look today. Thanks, Richard. Thanks, Naveen ChangeLog 2015-08-20 Naveen H.S naveen.hurugalaw...@caviumnetworks.com * fold-const.c (fold_binary_loc) : Move sqrt(x)*sqrt(x) as x to match.pd. Move Optimize pow(x,y)*pow(z,y) as pow(x*z,y)to match.pd. Move Optimize tan(x)*cos(x) as sin(x) to match.pd. Move Optimize x*pow(x,c) as pow(x,c+1) to match.pd. Move Optimize pow(x,c)*x as pow(x,c+1) to match.pd. Move Optimize sin(x)/cos(x) as tan(x) to match.pd. Move Optimize cos(x)/sin(x) as 1.0/tan(x) to match.pd. Move Optimize sin(x)/tan(x) as cos(x) to match.pd. Move Optimize tan(x)/sin(x) as 1.0/cos(x) to match.pd. Move Optimize pow(x,c)/x as pow(x,c-1) to match.pd. Move Optimize x/pow(y,z) into x*pow(y,-z) to match.pd. * match.pd (SIN ) : New Operator. 
(TAN) : New Operator. (mult (SQRT@1 @0) @1) : New simplifier. (mult (POW:s @0 @1) (POW:s @2 @1)) : New simplifier. (mult:c (TAN:s @0) (COS:s @0)) : New simplifier. (mult:c (TAN:s @0) (COS:s @0)) : New simplifier. (rdiv (SIN:s @0) (COS:s @0)) : New simplifier. (rdiv (COS:s @0) (SIN:s @0)) : New simplifier. (rdiv (SIN:s @0) (TAN:s @0)) : New simplifier. (rdiv (TAN:s @0) (SIN:s @0)) : New simplifier. (rdiv (POW:s @0 REAL_CST@1) @0) : New simplifier. (rdiv @0 (SQRT:s (rdiv:s @1 @2))) : New simplifier. (rdiv @0 (POW:s @1 @2)) : New simplifier.
Re: [PATCH] Only accept BUILT_IN_NORMAL stringops for interesting_stringop_to_profile_p
On Thu, Aug 20, 2015 at 11:31 AM, Yangfei (Felix) felix.y...@huawei.com wrote: Thanks for the comments. Attached please find the updated patch. OK? Ok. Thanks, Richard. Index: gcc/value-prof.c === --- gcc/value-prof.c(revision 141081) +++ gcc/value-prof.c(working copy) @@ -209,7 +209,6 @@ gimple_add_histogram_value (struct function *fun, hist-fun = fun; } - /* Remove histogram HIST from STMT's histogram list. */ void @@ -234,7 +233,6 @@ gimple_remove_histogram_value (struct function *fu free (hist); } - /* Lookup histogram of type TYPE in the STMT. */ histogram_value @@ -389,6 +387,7 @@ stream_out_histogram_value (struct output_block *o if (hist-hvalue.next) stream_out_histogram_value (ob, hist-hvalue.next); } + /* Dump information about HIST to DUMP_FILE. */ void @@ -488,7 +487,6 @@ gimple_duplicate_stmt_histograms (struct function } } - /* Move all histograms associated with OSTMT to STMT. */ void @@ -529,7 +527,6 @@ visit_hist (void **slot, void *data) return 1; } - /* Verify sanity of the histograms. */ DEBUG_FUNCTION void @@ -594,7 +591,6 @@ free_histograms (void) } } - /* The overall number of invocations of the counter should match execution count of basic block. Report it as error rather than internal error as it might mean that user has misused the profile @@ -638,7 +634,6 @@ check_counter (gimple stmt, const char * name, return false; } - /* GIMPLE based transformations. */ bool @@ -697,7 +692,6 @@ gimple_value_profile_transformations (void) return changed; } - /* Generate code for transformation 1 (with parent gimple assignment STMT and probability of taking the optimal path PROB, which is equivalent to COUNT/ALL within roundoff error). This generates the @@ -859,6 +853,7 @@ gimple_divmod_fixed_value_transform (gimple_stmt_i probability of taking the optimal path PROB, which is equivalent to COUNT/ALL within roundoff error). This generates the result into a temp and returns the temp; it does not replace or alter the original STMT. 
*/ + static tree gimple_mod_pow2 (gimple stmt, int prob, gcov_type count, gcov_type all) { @@ -938,6 +933,7 @@ gimple_mod_pow2 (gimple stmt, int prob, gcov_type } /* Do transform 2) on INSN if applicable. */ + static bool gimple_mod_pow2_value_transform (gimple_stmt_iterator *si) { @@ -1540,15 +1536,15 @@ gimple_ic_transform (gimple_stmt_iterator *gsi) return true; } -/* Return true if the stringop CALL with FNDECL shall be profiled. - SIZE_ARG be set to the argument index for the size of the string - operation. -*/ +/* Return true if the stringop CALL shall be profiled. SIZE_ARG be + set to the argument index for the size of the string operation. */ + static bool -interesting_stringop_to_profile_p (tree fndecl, gimple call, int *size_arg) +interesting_stringop_to_profile_p (gimple call, int *size_arg) { - enum built_in_function fcode = DECL_FUNCTION_CODE (fndecl); + enum built_in_function fcode; + fcode = DECL_FUNCTION_CODE (gimple_call_fndecl (call)); if (fcode != BUILT_IN_MEMCPY fcode != BUILT_IN_MEMPCPY fcode != BUILT_IN_MEMSET fcode != BUILT_IN_BZERO) return false; @@ -1573,7 +1569,7 @@ static bool } } -/* Convert stringop (..., vcall_size) +/* Convert stringop (..., vcall_size) into if (vcall_size == icall_size) stringop (..., icall_size); @@ -1590,11 +1586,9 @@ gimple_stringop_fixed_value (gimple vcall_stmt, tr basic_block cond_bb, icall_bb, vcall_bb, join_bb; edge e_ci, e_cv, e_iv, e_ij, e_vj; gimple_stmt_iterator gsi; - tree fndecl; int size_arg; - fndecl = gimple_call_fndecl (vcall_stmt); - if (!interesting_stringop_to_profile_p (fndecl, vcall_stmt, size_arg)) + if (!interesting_stringop_to_profile_p (vcall_stmt, size_arg)) gcc_unreachable (); cond_bb = gimple_bb (vcall_stmt); @@ -1673,11 +1667,11 @@ gimple_stringop_fixed_value (gimple vcall_stmt, tr /* Find values inside STMT for that we want to measure histograms for division/modulo optimization. 
*/ + static bool gimple_stringops_transform (gimple_stmt_iterator *gsi) { gimple stmt = gsi_stmt (*gsi); - tree fndecl; tree blck_size; enum built_in_function fcode; histogram_value histogram; @@ -1688,14 +1682,11 @@ gimple_stringops_transform (gimple_stmt_iterator * tree tree_val; int size_arg; - if (gimple_code (stmt) != GIMPLE_CALL) + if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)) return false; - fndecl = gimple_call_fndecl (stmt); - if (!fndecl) + + if (!interesting_stringop_to_profile_p (stmt, &size_arg)) return false; - fcode = DECL_FUNCTION_CODE (fndecl); - if (!interesting_stringop_to_profile_p (fndecl, stmt, &size_arg)) - return false;
Re: [i386] Simplify vector_all_ones_operand
On Thu, Aug 20, 2015 at 12:02 PM, Richard Sandiford richard.sandif...@arm.com wrote: gen_rtx_CONST_VECTOR ensures that there is a single instance of: (const_vector:M [(const_int -1) ... (const_int -1)]) for each M, so pointer equality with CONSTM1_RTX is enough. This seemed like a better fix than using the helper functions that I'm about to post. Bootstrapped & regression-tested on x86_64-linux-gnu. OK to install? Thanks, Richard gcc/ * config/i386/predicates.md (vector_all_ones_operand): Use CONSTM1_RTX to simplify definition. OK. Thanks, Uros. diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md index a9c8623..bc76a5b 100644 --- a/gcc/config/i386/predicates.md +++ b/gcc/config/i386/predicates.md @@ -912,25 +912,9 @@ /* Return true if operand is a vector constant that is all ones. */ (define_predicate "vector_all_ones_operand" - (match_code "const_vector") -{ - int nunits = GET_MODE_NUNITS (mode); - - if (GET_CODE (op) == CONST_VECTOR - && CONST_VECTOR_NUNITS (op) == nunits) -{ - int i; - for (i = 0; i < nunits; ++i) -{ - rtx x = CONST_VECTOR_ELT (op, i); - if (x != constm1_rtx) -return false; -} - return true; -} - - return false; -}) + (and (match_code "const_vector") + (match_test "INTEGRAL_MODE_P (GET_MODE (op))") + (match_test "op == CONSTM1_RTX (GET_MODE (op))"))) ; Return true when OP is operand acceptable for standard SSE move. (define_predicate "vector_move_operand"
Re: [PATCH][1/n] dwarf2out refactoring for early (LTO) debug
On Thu, 20 Aug 2015, Richard Biener wrote: On Wed, 19 Aug 2015, Richard Biener wrote: On Tue, 18 Aug 2015, Aldy Hernandez wrote: On 08/18/2015 07:20 AM, Richard Biener wrote: This starts a series of patches (still in development) to refactor dwarf2out.c to better cope with early debug (and LTO debug). Awesome! Thanks. Aldyh, what other testing did you usually do for changes? Run the gdb testsuite against the new compiler? Anything else? gdb testsuite, and make sure you test GCC with --enable-languages=all,go,ada, though the latter is mostly useful while you iron out bugs initially. I found that ultimately, the best test was C++. I see. Pre merge I also bootstrapped the compiler and compared .debug* section sizes in object files to make sure things were within reason. + +static void +vmsdbgout_early_finish (const char *filename ATTRIBUTE_UNUSED) +{ + if (write_symbols == VMS_AND_DWARF2_DEBUG) +(*dwarf2_debug_hooks.early_finish) (filename); +} You can get rid of ATTRIBUTE_UNUSED now. Done. I've also refrained from moving gen_scheduled_generic_parms_dies (); gen_remaining_tmpl_value_param_die_attribute (); for now as that causes regressions I have to investigate. So I thought gen_scheduled_generic_parms_dies was fine but it exposes a hole in /* Generate early debug for global variables. Any local variables will be handled by either handling reachable functions from finalize_compilation_unit (and by consequence, locally scoped symbols), or by rest_of_type_compilation below. ... !decl_type_context (decl)) (*debug_hooks-early_global_decl) (decl); for __timepunct_cache::_S_timezones where through the rest_of_type_compilation we quickly finish processing __timepunct_cache as it is TYPE_DECL_SUPPRESS_DEBUG (so -type_decl exits w/o doing anything). 
So we fail to generate a type DIE for the global, which causes us, at late_global_decl time (we do output this global var), to end up doing #12 0x00b831f1 in gen_decl_die (decl=0x75650a20, origin=0x0, context_die=0x762ab000) at /space/rguenther/src/svn/trunk/gcc/dwarf2out.c:21535 21532 /* And its containing type. */ 21533 class_origin = decl_class_context (decl_or_origin); 21534 if (class_origin != NULL_TREE) 21535 gen_type_die_for_member (class_origin, decl_or_origin, context_die); and thus create the type DIE for __timepunct_cache late. This is a hole in current early-debug. IMHO we should force early_global_decl even for globals in decl_type_context () or at least for a type DIE to be created for TYPE_DECL_SUPPRESS_DEBUG type decls. Jason, any advice on this? I note /* In a TYPE_DECL nonzero means the detail info about this type is not dumped into stabs. Instead it will generate cross reference ('x') of names. This uses the same flag as DECL_EXTERNAL. */ #define TYPE_DECL_SUPPRESS_DEBUG(NODE) \ (TYPE_DECL_CHECK (NODE)->decl_common.decl_flag_1) referring to STABS and "This uses the same flag as DECL_EXTERNAL", so maybe this is just a bad coincidence? DECL_EXTERNAL doesn't check that it operates on a non-TYPE_DECL. So without knowing what the flag is supposed to be doing in DWARF (also the desired effect on any of its static members) I can only suggest we maybe do not want any debug info for __timepunct_cache::_S_timezones at all? Sorry to follow up myself all the time ;) The following seems to work for me here: Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c (revision 226937) +++ gcc/dwarf2out.c (working copy) @@ -21647,7 +21652,8 @@ dwarf2out_late_global_decl (tree decl) Skip over functions because they were handled by the debug_hooks->function_decl() call in rest_of_handle_final. 
*/ if ((TREE_CODE (decl) != FUNCTION_DECL || !DECL_INITIAL (decl)) - && !POINTER_BOUNDS_P (decl)) + && !POINTER_BOUNDS_P (decl) + && lookup_decl_die (decl) != NULL) dwarf2out_decl (decl); } which basically means only annotating global decls late if we created a DIE for them early. Kind-of makes sense. Going to test that more extensively separately. Richard.
Re: [PATCH] Fix middle-end/67133, part 1
On Thu, Aug 20, 2015 at 11:02:17AM +0200, Andreas Schwab wrote: Marek Polacek pola...@redhat.com writes: PR middle-end/67133 * gimple-ssa-isolate-paths.c (insert_trap_and_remove_trailing_statements): Rename to ... (insert_trap): ... this. Don't remove trailing statements; split block instead. (find_explicit_erroneous_behaviour): Don't remove all outgoing edges. This breaks go on aarch64: ../../../libgo/go/encoding/gob/decode.go: In function ‘gob.decIgnoreOpFor.pN20_encoding_gob.Decoder’: ../../../libgo/go/encoding/gob/decode.go:843:1: internal compiler error: in operator[], at vec.h:714 func (dec *Decoder) decIgnoreOpFor(wireId typeId) decOp { ^ 0xac5c3b vec<edge_def*, va_gc, vl_embed>::operator[](unsigned int) ../../gcc/vec.h:714 0xac5c3b extract_true_false_edges_from_block(basic_block_def*, edge_def**, edge_def**) ../../gcc/tree-cfg.c:8456 0xace9bf gimple_verify_flow_info ../../gcc/tree-cfg.c:5260 0x6ea1ab verify_flow_info() ../../gcc/cfghooks.c:260 0xadeca3 cleanup_tree_cfg_noloop ../../gcc/tree-cfgcleanup.c:739 0xadeca3 cleanup_tree_cfg() ../../gcc/tree-cfgcleanup.c:788 0x9d21c3 execute_function_todo ../../gcc/passes.c:1900 0x9d2b07 execute_todo ../../gcc/passes.c:2005 Whilst I'm struggling with building cross libgo to reproduce this, is there something like preprocessed source for go? So that ideally I'd just run ./go1 foo.go? That'd help tremendously. Marek
[patch] Restore installation of libstdc++.so.6.0.??-gdb.py
Matthias noticed that my backport of the Filesystem library removed the libstdc++.so.6.0.21-gdb.py file, which was now getting installed as libstdc++fs.a-gdb.py instead. Fixed by changing the glob we use to find the candidate library files that we use for the name of the installed gdb.py file. Tested powerpc64le-linux, committed to trunk and gcc-5-branch. commit dc3403f07733a847f37e15c856759a94aa7e97d7 Author: Jonathan Wakely jwak...@redhat.com Date: Thu Aug 20 11:11:51 2015 +0100 * python/Makefile.am: Ensure gdb.py is installed for libstdc++ not libstdc++fs. * python/Makefile.in: Regenerate. diff --git a/libstdc++-v3/python/Makefile.am b/libstdc++-v3/python/Makefile.am index 5d78224..ccb9427 100644 --- a/libstdc++-v3/python/Makefile.am +++ b/libstdc++-v3/python/Makefile.am @@ -49,7 +49,7 @@ install-data-local: gdb.py ## fragile, but there does not seem to be a better option, because ## libtool hides the real names from us. @here=`pwd`; cd $(DESTDIR)$(toolexeclibdir); \ - for file in libstdc++*; do \ + for file in libstdc++.*; do \ case $$file in \ *-gdb.py) ;; \ *.la) ;; \
[PATCH] Move late_global_decl call
This moves it where it really belongs, also avoiding extra work for the slim LTO compile phase. Bootstrapped and tested and gdb tested on x86_64-unknown-linux-gnu, applied. Richard. 2015-08-20 Richard Biener rguent...@suse.de * toplev.c (compile_file): Remove loop calling late_global_decl on all symbols. * varpool.c (varpool_node::assemble_decl): Call late_global_decl on decls we assembled. Index: gcc/toplev.c === --- gcc/toplev.c(revision 226966) +++ gcc/toplev.c(working copy) @@ -580,15 +580,6 @@ compile_file (void) if (seen_error ()) return; - /* After the parser has generated debugging information, augment - this information with any new location/etc information that may - have become available after the compilation proper. */ - timevar_start (TV_PHASE_DBGINFO); - symtab_node *node; - FOR_EACH_DEFINED_SYMBOL (node) -debug_hooks->late_global_decl (node->decl); - timevar_stop (TV_PHASE_DBGINFO); - timevar_start (TV_PHASE_LATE_ASM); /* Compilation unit is finalized. When producing non-fat LTO object, we are Index: gcc/varpool.c === --- gcc/varpool.c (revision 226966) +++ gcc/varpool.c (working copy) @@ -586,6 +586,12 @@ varpool_node::assemble_decl (void) gcc_assert (TREE_ASM_WRITTEN (decl)); gcc_assert (definition); assemble_aliases (); + /* After the parser has generated debugging information, augment +this information with any new location/etc information that may +have become available after the compilation proper. */ + timevar_start (TV_PHASE_DBGINFO); + debug_hooks->late_global_decl (decl); + timevar_stop (TV_PHASE_DBGINFO); return true; }
Re: [PATCH GCC]Improve loop bound info by simplifying conversions in iv base
On Thu, Aug 20, 2015 at 10:22 AM, Bin.Cheng amker.ch...@gmail.com wrote: On Fri, Aug 14, 2015 at 4:28 PM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Jul 28, 2015 at 11:38 AM, Bin Cheng bin.ch...@arm.com wrote: Hi, For now, SCEV may compute the iv base in the form of (signed T)((unsigned T)base + step). This complicates other optimizations/analyses depending on SCEV because it's hard to dive into type conversions. For many cases, such type conversions can be simplified with additional range information implied by loop initial conditions. This patch does such simplification. With a simplified iv base, loop niter analysis can compute more accurate bound information since a sensible value range can be derived for base+step. For example, accurate loop bound and may_be_zero information is computed for the cases added by this patch. The code is actually borrowed from loop_exits_before_overflow. Moreover, with a simplified iv base, the second case handled in that function now becomes the first case. I didn't remove that part of code because it may(?) still be visited in scev analysis itself and simple_iv isn't an interface for that. Is it OK? It looks quite special given it only handles a very specific pattern. Did you do any larger collecting of statistics on how many times this triggers, esp. how many times simplify_using_initial_conditions succeeds and how many times not? This function is somewhat expensive. Yes, this is a corner case targeting induction variables of small signed types, just like the added test cases. We need to convert to unsigned, do the stepping, and convert back. I collected statistics for gcc bootstrap and spec2k6. The function is called about 400-500 times in both cases. About 45% of calls succeeded in bootstrap, while only ~3% succeeded in spec2k6. I will prepare a new version patch if you think it's worthwhile in terms of compilation cost and benefit. Yes. Richard. 
Thanks, bin + || !operand_equal_p (iv->step, + fold_convert (type, +TREE_OPERAND (e, 1)), 0)) operand_equal_p can handle sign-differences in integer constants, no need to fold_convert here. Also if you know that you are comparing integer constants please use tree_int_cst_equal_p. + extreme = lower_bound_in_type (type, type); that's a strange function to call here (with two same types). Looks like just wide_int_to_tree (type, wi::max/min_value (type)). + extreme = fold_build2 (MINUS_EXPR, type, extreme, iv->step); so as iv->step is an INTEGER_CST please do this whole thing using wide_ints and only build trees here: + e = fold_build2 (code, boolean_type_node, base, extreme); Thanks, Richard. Thanks, bin 2015-07-28 Bin Cheng bin.ch...@arm.com * tree-ssa-loop-niter.c (tree_simplify_using_condition): Export the interface. * tree-ssa-loop-niter.h (tree_simplify_using_condition): Declare. * tree-scalar-evolution.c (simple_iv): Simplify type conversions in iv base using loop initial conditions. gcc/testsuite/ChangeLog 2015-07-28 Bin Cheng bin.ch...@arm.com * gcc.dg/tree-ssa/loop-bound-2.c: New test. * gcc.dg/tree-ssa/loop-bound-4.c: New test. * gcc.dg/tree-ssa/loop-bound-6.c: New test.
[ARC] Cleanup A5 references
This patch cleans up the references to the obsolete A5 processor. Can this be committed? Thanks, Claudiu 2015-08-20 Claudiu Zissulescu claz...@synopsys.com * common/config/arc/arc-common.c, config/arc/arc-opts.h, config/arc/arc.c, config/arc/arc.h, config/arc/arc.md, config/arc/arc.opt, config/arc/constraints.md, config/arc/t-arc-newlib: Remove references to A5.
Add utility functions for rtx broadcast/duplicate constants
Several pieces of code want to know whether all elements of a CONST_VECTOR are equal, and I'm about to add some more to simplify-rtx.c. This patch adds some utility functions for that. I don't think we're really helping ourselves by having the shift amount in v16qi 3 be: (const_vector:V16QI [ (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) ]) so I wanted to leave open the possibility of using: (const:V16QI (vec_duplicate:V16QI (const_int 3))) in future. The interface therefore passes back the duplicated element rather than leaving callers to use CONST_VECTOR_ELT (c, 0) (== XVECEXP (c, 0, 0)). unwrap_const_vec_duplicate is mostly for code that handles vector operations equivalently to scalar ops. The follow-on simplify-rtx.c code makes more use of this. It also came in useful for the tilegx/ tilepro predicates. Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu. I also built cross-compilers for s390x-linux-gnu, spu-elf, tilepro-elf and tilegx-elf and checked by hand that the affected code still worked. (Well, except for the SPU case. That's handling vector constants in which the elements are symbolic addresses, such as { foo, foo, foo, foo }. Such vectors don't seem to be treated as constants at the gimple level and the initial rtl code that we generate is too complex for later optimisations to convert back to a constant, so I wasn't sure how best to trigger it.) OK to install? Thanks, Richard gcc/ * rtl.h (rtvec_all_equal_p): Declare. (const_vec_duplicate_p, unwrap_const_vec_duplicate): New functions. * rtl.c (rtvec_all_equal_p): New function. * expmed.c (expand_mult): Use unwrap_const_vec_duplicate. * config/aarch64/aarch64.c (aarch64_vect_float_const_representable_p) (aarch64_simd_dup_constant): Use const_vec_duplicate_p. * config/arm/arm.c (neon_vdup_constant): Likewise. 
* config/s390/s390.c (s390_contiguous_bitmask_vector_p): Likewise. * config/tilegx/constraints.md (W, Y): Likewise. * config/tilepro/constraints.md (W, Y): Likewise. * config/spu/spu.c (spu_legitimate_constant_p): Likewise. (classify_immediate): Use unwrap_const_vec_duplicate. * config/tilepro/predicates.md (reg_or_v4s8bit_operand): Likewise. (reg_or_v2s8bit_operand): Likewise. * config/tilegx/predicates.md (reg_or_v8s8bit_operand): Likewise. (reg_or_v4s8bit_operand): Likewise. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 382be2c..9b2ea2c 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -9879,31 +9879,10 @@ sizetochar (int size) static bool aarch64_vect_float_const_representable_p (rtx x) { - int i = 0; - REAL_VALUE_TYPE r0, ri; - rtx x0, xi; - - if (GET_MODE_CLASS (GET_MODE (x)) != MODE_VECTOR_FLOAT) -return false; - - x0 = CONST_VECTOR_ELT (x, 0); - if (!CONST_DOUBLE_P (x0)) -return false; - - REAL_VALUE_FROM_CONST_DOUBLE (r0, x0); - - for (i = 1; i < CONST_VECTOR_NUNITS (x); i++) -{ - xi = CONST_VECTOR_ELT (x, i); - if (!CONST_DOUBLE_P (xi)) - return false; - - REAL_VALUE_FROM_CONST_DOUBLE (ri, xi); - if (!REAL_VALUES_EQUAL (r0, ri)) - return false; -} - - return aarch64_float_const_representable_p (x0); + rtx elt; + return (GET_MODE_CLASS (GET_MODE (x)) == MODE_VECTOR_FLOAT + && const_vec_duplicate_p (x, &elt) + && aarch64_float_const_representable_p (elt)); } /* Return true for valid and false for invalid. 
*/ @@ -10366,28 +10345,15 @@ aarch64_simd_dup_constant (rtx vals) { machine_mode mode = GET_MODE (vals); machine_mode inner_mode = GET_MODE_INNER (mode); - int n_elts = GET_MODE_NUNITS (mode); - bool all_same = true; rtx x; - int i; - - if (GET_CODE (vals) != CONST_VECTOR) -return NULL_RTX; - - for (i = 1; i < n_elts; ++i) -{ - x = CONST_VECTOR_ELT (vals, i); - if (!rtx_equal_p (x, CONST_VECTOR_ELT (vals, 0))) - all_same = false; -} - if (!all_same) + if (!const_vec_duplicate_p (vals, &x)) return NULL_RTX; /* We can load this constant by using DUP and a constant in a single ARM register. This will be cheaper than a vector load. */ - x = copy_to_mode_reg (inner_mode, CONST_VECTOR_ELT (vals, 0)); + x = copy_to_mode_reg (inner_mode, x); return gen_rtx_VEC_DUPLICATE (mode, x); } diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index da77244..c2095a3 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -12607,22 +12607,12 @@ neon_vdup_constant (rtx vals) { machine_mode mode = GET_MODE (vals); machine_mode inner_mode =
Re: [PATCH][2/n] Change dw2_asm_output_offset to allow assembling extra offset
On Aug 20, 2015, at 12:29 AM, Richard Biener rguent...@suse.de wrote: where the references always use positive offset and are based on the next previous lable (so Ldebug_info_from_t1.c is not refering to an entity at Ldebug_info_from_t2.c or beyond). So that seems to follow the restrictions you laid out above. I agree. If basic testing doesn't help I'll refrain from doing it ;) It can only break LTO in the end. :-) Actually, I don’t mind. It should be easy enough to ask a darwin person to spin it and watch for breakage if you want. I don’t expect it to not work given your elaboration.
Re: [PATCH][2/n] Change dw2_asm_output_offset to allow assembling extra offset
On Thu, 20 Aug 2015, Mike Stump wrote: On Aug 20, 2015, at 12:29 AM, Richard Biener rguent...@suse.de wrote: where the references always use positive offset and are based on the next previous lable (so Ldebug_info_from_t1.c is not refering to an entity at Ldebug_info_from_t2.c or beyond). So that seems to follow the restrictions you laid out above. I agree. If basic testing doesn't help I'll refrain from doing it ;) It can only break LTO in the end. :-) Actually, I don’t mind. It should be easy enough to ask a darwin person to spin it and watch for breakage if you want. I don’t expect it to not work given your elaboration. Err - you _are_ a darwin person ;) According to MAINTAINERS at least... Richard.
Re: [PATCH] Add extra compile options for dg-final only once.
On Thu, Aug 20, 2015 at 11:37 AM, Dominik Vogt v...@linux.vnet.ibm.com wrote: This patch fixes an annoying problem of the dg-final test using the scan-assembler family of tests (and maybe others). For a test file, the option -ffat-lto-objects is added to the command line once for each scan-assembler test, eventually resulting in an unreadable command line. Can this be committed? Yes. Thanks, Richard. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany
[PATCH 4/4] add default for CONSTANT_ALIGNMENT
From: tbsaunde tbsaunde@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ChangeLog: 2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org * defaults.h (CONSTANT_ALIGNMENT): New macro definition. * builtins.c (get_object_alignment_2): Adjust. * varasm.c (align_variable): Likewise. (get_variable_align): Likewise. (build_constant_desc): Likewise. (force_const_mem): Likewise. * doc/tm.texi.in: Likewise. * doc/tm.texi: Regenerate. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@227052 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 11 +++ gcc/builtins.c | 6 ++ gcc/defaults.h | 4 gcc/doc/tm.texi| 2 +- gcc/doc/tm.texi.in | 2 +- gcc/varasm.c | 17 - 6 files changed, 23 insertions(+), 19 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 2943501..2063885 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,16 @@ 2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org + * defaults.h (CONSTANT_ALIGNMENT): New macro definition. + * builtins.c (get_object_alignment_2): Adjust. + * varasm.c (align_variable): Likewise. + (get_variable_align): Likewise. + (build_constant_desc): Likewise. + (force_const_mem): Likewise. + * doc/tm.texi.in: Likewise. + * doc/tm.texi: Regenerate. + +2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org + * genconfig.c (main): Always define HAVE_cc0. * recog.c (rest_of_handle_peephole2): Adjust. diff --git a/gcc/builtins.c b/gcc/builtins.c index 31969ca..635ba54 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -314,10 +314,9 @@ get_object_alignment_2 (tree exp, unsigned int *alignp, /* The alignment of a CONST_DECL is determined by its initializer. 
*/ exp = DECL_INITIAL (exp); align = TYPE_ALIGN (TREE_TYPE (exp)); -#ifdef CONSTANT_ALIGNMENT if (CONSTANT_CLASS_P (exp)) align = (unsigned) CONSTANT_ALIGNMENT (exp, align); -#endif + known_alignment = true; } else if (DECL_P (exp)) @@ -393,10 +392,9 @@ get_object_alignment_2 (tree exp, unsigned int *alignp, /* STRING_CST are the only constant objects we allow to be not wrapped inside a CONST_DECL. */ align = TYPE_ALIGN (TREE_TYPE (exp)); -#ifdef CONSTANT_ALIGNMENT if (CONSTANT_CLASS_P (exp)) align = (unsigned) CONSTANT_ALIGNMENT (exp, align); -#endif + known_alignment = true; } diff --git a/gcc/defaults.h b/gcc/defaults.h index 4fe8eb1..d4d3a56 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1273,6 +1273,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define WORD_REGISTER_OPERATIONS 0 #endif +#ifndef CONSTANT_ALIGNMENT +#define CONSTANT_ALIGNMENT(EXP, ALIGN) ALIGN +#endif + #ifdef GCC_INSN_FLAGS_H /* Dependent default target macro definitions diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index f95646c..f5a1f84 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -1098,7 +1098,7 @@ that is being placed in memory. @var{constant} is the constant and have. The value of this macro is used instead of that alignment to align the object. -If this macro is not defined, then @var{basic-align} is used. +The default definition just returns @var{basic-align}. The typical use of this macro is to increase alignment for string constants to be word aligned so that @code{strcpy} calls that copy diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 2383fb9..9d5ac0a 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -1048,7 +1048,7 @@ that is being placed in memory. @var{constant} is the constant and have. The value of this macro is used instead of that alignment to align the object. -If this macro is not defined, then @var{basic-align} is used. +The default definition just returns @var{basic-align}. 
The typical use of this macro is to increase alignment for string constants to be word aligned so that @code{strcpy} calls that copy diff --git a/gcc/varasm.c b/gcc/varasm.c index 2ebac89..7fa2e7b 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -1043,7 +1043,6 @@ align_variable (tree decl, bool dont_output_data) if (! DECL_THREAD_LOCAL_P (decl) || data_align >= BITS_PER_WORD) align = data_align; #endif -#ifdef CONSTANT_ALIGNMENT if (DECL_INITIAL (decl) != 0 /* In LTO we have no errors in program; error_mark_node is used to mark offlined constructors. */ @@ -1056,7 +1055,6 @@ align_variable (tree decl, bool dont_output_data) if (! DECL_THREAD_LOCAL_P (decl) || const_align >= BITS_PER_WORD) align = const_align; } -#endif } } @@ -1097,7 +1095,6 @@ get_variable_align (tree decl) if (! DECL_THREAD_LOCAL_P (decl) || data_align >= BITS_PER_WORD) align = data_align; #endif -#ifdef CONSTANT_ALIGNMENT if (DECL_INITIAL (decl) != 0
Re: Forwarding -foffload=[...] from the driver (compile-time) to libgomp (run-time)
On Tue, 18 Aug 2015, Thomas Schwinge wrote: So, back to modifying the driver; here is my current messy WIP patch with still a lot of TODOs in it -- but it appears to work at last. :-) Maybe somebody else is able to continue with that task while I'm out of office. This has been developed on top of gomp-4_0-branch r226832. I'm also attaching a tarball of the even more messy individual patches, foffload.tar.bz2, in case there's anything to salvage in there, or if that helps to understand the development options/history. Earlier messages in this thread should give enough context on what this is about, http://news.gmane.org/find-root.php?message_id=%3C87egjopgh0.fsf%40kepler.schwinge.homeip.net%3E. This is what I've committed to gomp-4_0-branch, with the driver changes substantially cleaned up and smaller changes to the other bits of the patch. gcc: 2015-08-20 Thomas Schwinge tho...@codesourcery.com Joseph Myers jos...@codesourcery.com * doc/invoke.texi (-ffixed-@var{reg}): Document conflict with Fortran options. * gcc.c (offload_targets): Update comment. (add_omp_infile_spec_func, spec_lang_mask_accept): New. (driver_self_specs) [ENABLE_OFFLOADING]: Add spec to use %:add-omp-infile(). (static_spec_functions): Add add-omp-infile. (struct switchstr): Add lang_mask field. Expand comment. (struct infile): Add lang_mask field. (add_infile, save_switch, do_spec): Add lang_mask argument. (driver_unknown_option_callback, driver_wrong_lang_callback) (driver_handle_option, process_command, do_self_spec) (driver::do_spec_on_infiles): All callers changed. (give_switch): Check languages of switch against spec_lang_mask_accept. (driver::maybe_putenv_OFFLOAD_TARGETS): Do not use intermediate targets variable. * gcc.h (do_spec): Update prototype. fortran: 2015-08-20 Joseph Myers jos...@codesourcery.com * gfortranspec.c (lang_specific_pre_link): Update call to do_spec. java: 2015-08-20 Joseph Myers jos...@codesourcery.com * jvspec.c (lang_specific_pre_link): Update call to do_spec. 
libgomp: 2015-08-20 Thomas Schwinge tho...@codesourcery.com Joseph Myers jos...@codesourcery.com * plugin/configfrag.ac (fnmatch.h): Check for header. (fnmatch): Check for function. (tgt_name): Do not set. (offload_targets): Separate with colons not commas. * config.h.in, configure: Regenerate. * env.c (initialize_env): Make static. Remove TODO. * libgomp.h (gomp_offload_target_available_p): New prototype. * libgomp.map (GOACC_2.0.GOMP_4_BRANCH): Add GOMP_set_offload_targets. (INTERNAL): Remove. * libgomp_g.h (GOMP_set_offload_targets): New prototype. * oacc-init.c (resolve_device): Do not handle acc_device_host. Add comments. * target.c: Include fnmatch.h. (resolve_device): Use host fallback when offload data not available. (gomp_offload_target_available_p, offload_target_to_plugin_name) (gomp_offload_targets, gomp_offload_targets_init) (GOMP_set_offload_targets, gomp_plugin_prefix) (gomp_plugin_suffix): New. (gomp_load_plugin_for_device): Add gomp_debug call. (gomp_target_init): Use gomp_offload_targets instead of OFFLOAD_TARGETS. Handle and rewrite colon-separated string. * testsuite/lib/libgomp.exp: Expect offload targets to be colon-separated. Adjust matching of offload targets. Don't generate constructor here. (libgomp_target_compile): Use GCC_UNDER_TEST. (check_effective_target_openacc_nvidia_accel_supported) (check_effective_target_openacc_host_selected): Adjust checks of offload target names. * testsuite/libgomp.c++/c++.exp: Do not set HAVE_SET_GXX_UNDER_TEST or GXX_UNDER_TEST. * testsuite/libgomp.c/c.exp: Do not append to libgomp_compile_options. * testsuite/libgomp.fortran/fortran.exp: Do not set GFORTRAN_UNDER_TEST or libgomp_compile_options. * testsuite/libgomp.graphite/graphite.exp: Do not append to libgomp_compile_options. * testsuite/libgomp.oacc-c++/c++.exp: Set SAVE_GCC_UNDER_TEST and GCC_UNDER_TEST. Do not set HAVE_SET_GXX_UNDER_TEST and GXX_UNDER_TEST. Do not append to ALWAYS_CFLAGS. Adjust set of offload targets. Use -foffload=. 
* testsuite/libgomp.oacc-c/c.exp: Do not append to libgomp_compile_options or ALWAYS_CFLAGS. Adjust set of offload targets. Use -foffload=. * testsuite/libgomp.oacc-fortran/fortran.exp: Do not set GFORTRAN_UNDER_TEST or append to libgomp_compile_options. Do not append to ALWAYS_CFLAGS. Adjust set of offload targets. Use -foffload=. Index: libgomp/plugin/configfrag.ac
[PATCH 3/4] always define HAVE_peephole2
From: tbsaunde tbsaunde@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ChangeLog: 2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org * genconfig.c (main): Always define HAVE_cc0. * recog.c (rest_of_handle_peephole2): Adjust. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@227051 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 5 + gcc/genconfig.c | 5 + gcc/recog.c | 8 +++- 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 5debcca..2943501 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,10 @@ 2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org + * genconfig.c (main): Always define HAVE_cc0. + * recog.c (rest_of_handle_peephole2): Adjust. + +2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org + * reorg.c (relax_delay_slots): Don't use #if to check value of HAVE_cc0. diff --git a/gcc/genconfig.c b/gcc/genconfig.c index acbf381..fc3c1eb 100644 --- a/gcc/genconfig.c +++ b/gcc/genconfig.c @@ -372,6 +372,11 @@ main (int argc, char **argv) printf (#define HAVE_peephole2 1\n); printf (#define MAX_INSNS_PER_PEEP2 %d\n, max_insns_per_peep2); } + else +{ + printf (#define HAVE_peephole2 0\n); + printf (#define MAX_INSNS_PER_PEEP2 0\n); +} puts (\n#endif /* GCC_INSN_CONFIG_H */); diff --git a/gcc/recog.c b/gcc/recog.c index c595bbd..352aec2 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -3018,7 +3018,6 @@ split_all_insns_noflow (void) return 0; } -#ifdef HAVE_peephole2 struct peep2_insn_data { rtx_insn *insn; @@ -3651,7 +3650,6 @@ peephole2_optimize (void) if (peep2_do_cleanup_cfg) cleanup_cfg (CLEANUP_CFG_CHANGED); } -#endif /* HAVE_peephole2 */ /* Common predicates for use with define_bypass. */ @@ -3804,9 +3802,9 @@ if_test_bypass_p (rtx_insn *out_insn, rtx_insn *in_insn) static unsigned int rest_of_handle_peephole2 (void) { -#ifdef HAVE_peephole2 - peephole2_optimize (); -#endif + if (HAVE_peephole2) +peephole2_optimize (); + return 0; } -- 2.4.0
[PATCH 1/4] always define HAVE_conditional_execution
From: tbsaunde tbsaunde@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ChangeLog: 2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org * genconfig.c (main): Always define HAVE_CONDITIONAL_EXECUTION. * targhooks.c (default_have_conditional_execution): Adjust. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@227049 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 5 + gcc/genconfig.c | 2 ++ gcc/targhooks.c | 4 3 files changed, 7 insertions(+), 4 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 42abb92..87cccef 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org + + * genconfig.c (main): Always define HAVE_CONDITIONAL_EXECUTION. + * targhooks.c (default_have_conditional_execution): Adjust. + 2015-08-20 Richard Sandiford richard.sandif...@arm.com * rtl.h (rtvec_all_equal_p): Declare. diff --git a/gcc/genconfig.c b/gcc/genconfig.c index ac16c5b..acbf381 100644 --- a/gcc/genconfig.c +++ b/gcc/genconfig.c @@ -348,6 +348,8 @@ main (int argc, char **argv) if (have_cond_exec_flag) printf (#define HAVE_conditional_execution 1\n); + else +printf (#define HAVE_conditional_execution 0\n); if (have_lo_sum_flag) printf (#define HAVE_lo_sum 1\n); diff --git a/gcc/targhooks.c b/gcc/targhooks.c index 3eca47e..7238c8f 100644 --- a/gcc/targhooks.c +++ b/gcc/targhooks.c @@ -1350,11 +1350,7 @@ default_case_values_threshold (void) bool default_have_conditional_execution (void) { -#ifdef HAVE_conditional_execution return HAVE_conditional_execution; -#else - return false; -#endif } /* By default we assume that c99 functions are present at the runtime, -- 2.4.0
[PATCH 2/4] remove another #if for HAVE_cc0
From: tbsaunde tbsaunde@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ChangeLog: 2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org * reorg.c (relax_delay_slots): Don't use #if to check value of HAVE_cc0. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@227050 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 5 + gcc/reorg.c | 8 +++- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 87cccef..5debcca 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,10 @@ 2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org + * reorg.c (relax_delay_slots): Don't use #if to check value of + HAVE_cc0. + +2015-08-20 Trevor Saunders tbsaunde+...@tbsaunde.org + * genconfig.c (main): Always define HAVE_CONDITIONAL_EXECUTION. * targhooks.c (default_have_conditional_execution): Adjust. diff --git a/gcc/reorg.c b/gcc/reorg.c index 1c60e13..cdaa60c 100644 --- a/gcc/reorg.c +++ b/gcc/reorg.c @@ -3445,15 +3445,13 @@ relax_delay_slots (rtx_insn *first) ! condjump_in_parallel_p (delay_jump_insn) prev_active_insn (target_label) == insn ! BARRIER_P (prev_nonnote_insn (target_label)) -#if HAVE_cc0 /* If the last insn in the delay slot sets CC0 for some insn, various code assumes that it is in a delay slot. We could put it back where it belonged and delete the register notes, but it doesn't seem worthwhile in this uncommon case. */ - ! find_reg_note (XVECEXP (pat, 0, XVECLEN (pat, 0) - 1), - REG_CC_USER, NULL_RTX) -#endif - ) + (!HAVE_cc0 + || ! find_reg_note (XVECEXP (pat, 0, XVECLEN (pat, 0) - 1), + REG_CC_USER, NULL_RTX))) { rtx_insn *after; int i; -- 2.4.0
[PATCH 0/4] a little bit of ifdef removal
From: Trevor Saunders tbsaunde+...@tbsaunde.org Hi, just more removal of conditional compilation. The series was run through config-list.mk, and each patch was individually bootstrapped on x86_64-linux-gnu. I think this is still preapproved so committed. Trev tbsaunde (4): always define HAVE_conditional_execution remove another #if for HAVE_cc0 always define HAVE_peephole2 add default for CONSTANT_ALIGNMENT gcc/ChangeLog | 26 ++ gcc/builtins.c | 6 ++ gcc/defaults.h | 4 gcc/doc/tm.texi| 2 +- gcc/doc/tm.texi.in | 2 +- gcc/genconfig.c| 7 +++ gcc/recog.c| 8 +++- gcc/reorg.c| 8 +++- gcc/targhooks.c| 4 gcc/varasm.c | 17 - 10 files changed, 51 insertions(+), 33 deletions(-) -- 2.4.0
[gomp4] routine calls
I've committed this to gomp4 branch. It augments the call RTL with an optional const int, indicating the partitioning requirements of the target function. This is set from the target function's 'oacc function' attribute. We don't do anything with this information yet -- it'll be needed to get the correct number of threads to execute the call instruction. nathan 2015-08-20 Nathan Sidwell nat...@codesourcery.com * omp-low.c (build_oacc_routine_dims): Expand comment. * config/nvptx/nvptx.md (call_operation): Skip optional partitioning information. * config/nvptx/nvptx.c (nvptx_expand_call): Insert target partitioning information, if present. (nvptx_output_call_insn): Skip partitioning info, if present. Index: gcc/config/nvptx/nvptx.c === --- gcc/config/nvptx/nvptx.c (revision 226981) +++ gcc/config/nvptx/nvptx.c (working copy) @@ -848,19 +848,18 @@ nvptx_end_call_args (void) void nvptx_expand_call (rtx retval, rtx address) { - int nargs; + int nargs = 0; rtx callee = XEXP (address, 0); rtx pat, t; rtvec vec; bool external_decl = false; + rtx partitioning = NULL_RTX; + rtx varargs = NULL_RTX; + tree decl_type = NULL_TREE; - nargs = 0; for (t = cfun-machine-call_args; t; t = XEXP (t, 1)) nargs++; - bool has_varargs = false; - tree decl_type = NULL_TREE; - if (!call_insn_operand (callee, Pmode)) { callee = force_reg (Pmode, callee); @@ -877,6 +876,22 @@ nvptx_expand_call (rtx retval, rtx addre cfun-machine-has_call_with_sc = true; if (DECL_EXTERNAL (decl)) external_decl = true; + tree attr = get_oacc_fn_attrib (decl); + if (attr) + { + tree dims = TREE_VALUE (attr); + + for (int ix = 0; ix != GOMP_DIM_MAX; ix++) + { + if (TREE_PURPOSE (dims) + !integer_zerop (TREE_PURPOSE (dims))) + { + partitioning = GEN_INT (ix); + break; + } + dims = TREE_CHAIN (dims); + } + } } } if (cfun-machine-funtype @@ -887,31 +902,19 @@ nvptx_expand_call (rtx retval, rtx addre || TREE_CODE (cfun-machine-funtype) == METHOD_TYPE) stdarg_p (cfun-machine-funtype)) { - has_varargs = true; - 
cfun-machine-has_call_with_varargs = true; -} - vec = rtvec_alloc (nargs + 1 + (has_varargs ? 1 : 0)); - pat = gen_rtx_PARALLEL (VOIDmode, vec); - if (has_varargs) -{ - rtx this_arg = gen_reg_rtx (Pmode); + varargs = gen_reg_rtx (Pmode); if (Pmode == DImode) - emit_move_insn (this_arg, stack_pointer_rtx); + emit_move_insn (varargs, stack_pointer_rtx); else - emit_move_insn (this_arg, stack_pointer_rtx); - XVECEXP (pat, 0, nargs + 1) = gen_rtx_USE (VOIDmode, this_arg); -} - - /* Construct the call insn, including a USE for each argument pseudo - register. These will be used when printing the insn. */ - int i; - rtx arg; - for (i = 1, arg = cfun-machine-call_args; arg; arg = XEXP (arg, 1), i++) -{ - rtx this_arg = XEXP (arg, 0); - XVECEXP (pat, 0, i) = gen_rtx_USE (VOIDmode, this_arg); + emit_move_insn (varargs, stack_pointer_rtx); + cfun-machine-has_call_with_varargs = true; } + vec = rtvec_alloc (nargs + 1 + + (partitioning ? 1 : 0) + (varargs ? 1 : 0)); + pat = gen_rtx_PARALLEL (VOIDmode, vec); + int vec_pos = 0; + rtx tmp_retval = retval; t = gen_rtx_CALL (VOIDmode, address, const0_rtx); if (retval != NULL_RTX) @@ -920,7 +923,23 @@ nvptx_expand_call (rtx retval, rtx addre tmp_retval = gen_reg_rtx (GET_MODE (retval)); t = gen_rtx_SET (tmp_retval, t); } - XVECEXP (pat, 0, 0) = t; + XVECEXP (pat, 0, vec_pos++) = t; + + if (partitioning) +XVECEXP (pat, 0, vec_pos++) = partitioning; + + /* Construct the call insn, including a USE for each argument pseudo + register. These will be used when printing the insn. */ + for (rtx arg = cfun-machine-call_args; arg; arg = XEXP (arg, 1)) +{ + rtx this_arg = XEXP (arg, 0); + XVECEXP (pat, 0, vec_pos++) = gen_rtx_USE (VOIDmode, this_arg); +} + + if (varargs) + XVECEXP (pat, 0, vec_pos++) = gen_rtx_USE (VOIDmode, varargs); + + gcc_assert (vec_pos = XVECLEN (pat, 0)); /* If this is a libcall, decl_type is NULL. For a call to a non-libcall undeclared function, we'll have an external decl without arg types. 
@@ -1816,17 +1835,26 @@ nvptx_output_call_insn (rtx_insn *insn, static int labelno; bool needs_tgt = register_operand (callee, Pmode); rtx pat = PATTERN (insn); - int nargs = XVECLEN (pat, 0) - 1; + int arg_end = XVECLEN (pat, 0); + int arg_start = 1; tree decl = NULL_TREE; + rtx partitioning = NULL_RTX; - fprintf (asm_out_file, \t{\n); - if (result != NULL) + if (arg_end 1) { - fprintf (asm_out_file, \t\t.param%s %%retval_in;\n, - nvptx_ptx_type_from_mode (arg_promotion (GET_MODE (result)), - false)); + partitioning = XVECEXP (pat, 0, 1); + if (GET_CODE
Go patch committed: Don't crash on empty print call
This patch by Chris Manghane fixes the Go frontend to not crash on an empty print call (print()). An empty print call is useless as it does nothing, but of course we shouldn't crash. This fixes https://golang.org/issue/11526 . Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 227037) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -ec34cfb0b148ff461df12c8f5270a06e2f438b7c +cc7303c97b232ea979cab950d95aaf76c4e0f5b5 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: gcc/go/gofrontend/expressions.cc === --- gcc/go/gofrontend/expressions.cc(revision 226846) +++ gcc/go/gofrontend/expressions.cc(working copy) @@ -8177,6 +8177,12 @@ Builtin_call_expression::do_get_backend( location); } +// There aren't any arguments to the print builtin. The compiler +// issues a warning for this so we should avoid getting the backend +// representation for this call. Instead, perform a no-op. +if (print_stmts == NULL) + return context-backend()-boolean_constant_expression(false); + return print_stmts-get_backend(context); }
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
On August 20, 2015 7:35:37 PM GMT+02:00, Andrew Hughes gnu.and...@redhat.com wrote: - Original Message - snip... Having classpath (with binary files!) In the GCC SVN (or future git) repository is a significant burden, not to mention the size of the distributed source tarball. If we can get rid of that that would be a great step in reducing the burden. Iff we can even without classpath build enough of java to be useful (do you really need gcj or only gij for bootstrapping openjdk? After all ecj is just a drop-in to gcc as well). All the Java compilers are written in Java (ecj javac). So to run them, you need a JVM and its class library. It's those binary files which allow gcj to bootstrap the stack. If OpenJDK had a minimal binary class library, it would be able to bootstrap itself. But, as things stand, you need enough of the JDK to run a Java compiler and build the OpenJDK class libraries. GCJ currently fulfils that need where there isn't already an OpenJDK installation available. -- Actually, this makes me think... IcedTea already depends on CACAO and JamVM for alternate builds of OpenJDK. We could instead include the bytecode binaries for GNU Classpath in IcedTea, bootstrap JamVM and use that to bootstrap OpenJDK. That would remove our dependency on gcj and make IcedTea largely self-sufficient. It would also mean we could drop a bunch of conditional code which depends on what the system bootstrap JDK is, because it would always be the in-tree solution. We'd still need more than six months to make this transition though, as such a change really needs time for testing. OK, so how about deprecating Java for GCC 6 by removing it from the default languages and removing it for GCC 7 or before we switch to git (whatever happens earlier?) Richard.
Re: Add utility functions for rtx broadcast/duplicate constants
On 08/20/2015 04:34 AM, Richard Sandiford wrote: Several pieces of code want to know whether all elements of a CONST_VECTOR are equal, and I'm about to add some more to simplify-rtx.c. This patch adds some utility functions for that. I don't think we're really helping ourselves by having the shift amount in v16qi 3 be: (const_vector:V16QI [ (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) (const_int 3) ]) so I wanted to leave open the possibility of using: (const:V16QI (vec_duplicate:V16QI (const_int 3))) in future. The interface therefore passes back the duplicated element rather than leaving callers to use CONST_VECTOR_ELT (c, 0) (== XVECEXP (c, 0, 0)). unwrap_const_vec_duplicate is mostly for code that handles vector operations equivalently to scalar ops. The follow-on simplify-rtx.c code makes more use of this. It also came in useful for the tilegx/ tilepro predicates. Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu. I also built cross-compilers for s390x-linux-gnu, spu-elf, tilepro-elf and tilegx-elf and checked by hand that the affected code still worked. (Well, except for the SPU case. That's handling vector constants in which the elements are symbolic addresses, such as { foo, foo, foo, foo }. Such vectors don't seem to be treated as constants at the gimple level and the initial rtl code that we generate is too complex for later optimisations to convert back to a constant, so I wasn't sure how best to trigger it.) OK to install? Thanks, Richard gcc/ * rtl.h (rtvec_all_equal_p): Declare. (const_vec_duplicate_p, unwrap_const_vec_duplicate): New functions. * rtl.c (rtvec_all_equal_p): New function. * expmed.c (expand_mult): Use unwrap_const_vec_duplicate. * config/aarch64/aarch64.c (aarch64_vect_float_const_representable_p) (aarch64_simd_dup_constant): Use const_vec_duplicate_p. 
* config/arm/arm.c (neon_vdup_constant): Likewise. * config/s390/s390.c (s390_contiguous_bitmask_vector_p): Likewise. * config/tilegx/constraints.md (W, Y): Likewise. * config/tilepro/constraints.md (W, Y): Likewise. * config/spu/spu.c (spu_legitimate_constant_p): Likewise. (classify_immediate): Use unwrap_const_vec_duplicate. * config/tilepro/predicates.md (reg_or_v4s8bit_operand): Likewise. (reg_or_v2s8bit_operand): Likewise. * config/tilegx/predicates.md (reg_or_v8s8bit_operand): Likewise. (reg_or_v4s8bit_operand): Likewise. OK. Jeff
Re: [PATCH] Fix UBSan builtin types
On 08/20/2015 10:42 AM, Yury Gribov wrote: Hi all, GCC builtins BUILT_IN_UBSAN_HANDLE_NONNULL_ARG and BUILT_IN_UBSAN_HANDLE_NONNULL_ARG_ABORT were using BT_FN_VOID_PTR_PTRMODE whereas they are really BT_FN_VOID_PTR: void __ubsan::__ubsan_handle_nonnull_return(NonNullReturnData *Data) The patch fixes it. I only tested ubsan.exp (I doubt that bootstrap + full testsuite will add anything to this). Ok for trunk? Best regards, Yury Gribov fix-ubsan-builtin-types-1.patch commit d4747c9c7f78789ec7119fce07cd4526c4168ee0 Author: Yury Gribovy.gri...@samsung.com Date: Thu Aug 20 19:10:30 2015 +0300 2015-08-20 Yury Gribovy.gri...@samsung.com gcc/ * sanitizer.def (BUILT_IN_UBSAN_HANDLE_NONNULL_ARG, BUILT_IN_UBSAN_HANDLE_NONNULL_ARG): Fix builtin types. OK. jeff
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
On 08/20/2015 10:03 AM, Andrew Hughes wrote: - Original Message - On 08/20/2015 09:27 AM, Andrew Haley wrote: On 08/20/2015 03:57 PM, Andrew Hughes wrote: - Original Message - On 20/08/15 09:24, Matthias Klose wrote: On 08/20/2015 06:36 AM, Tom Tromey wrote: Andrew No, it isn't. It's still a necessity for initial bootstrapping of Andrew OpenJDK/IcedTea. Andrew Haley said the opposite here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html if you need bootstrapping OpenJDK 6 or OpenJDK 7, then having gcj available for the target platform is required. Starting with OpenJDK 8 you should be able to cross build OpenJDK 8 with an OpenJDK 8 available on the cross platform. It might be possible to cross build older OpenJDK versions, but this usually is painful. Sure, but we don't need GCJ going forward. I don't think that there are any new platforms to which OpenJDK has not been ported which will require GCJ to bootstrap. And even if there are, anybody who needs to do that can (and, indeed, should) use an earlier version of GCJ. It's not going to go away; it will always be in the GCC repos. And because newer versions of GCC may break GCJ (and maybe OpenJDK) it makes more sense to use an old GCC/GCJ for the bootstrapping of an old OpenJDK. I don't see how we don't at present. How else do you solve the chicken-and-egg situation of needing a JDK to build a JDK? I don't see crossing your fingers and hoping there's a binary around somewhere as a very sustainable system. That's what we do with GCC, binutils, etc: we bootstrap. Right. So the question is there some reason why OpenJDK can't be used to bootstrap itself? Ie, is there a fundamental reason why Andrew needs to drop back down to GCJ and start the bootstrapping process from scratch. ISTM that ideally the previous version of OpenJDK would be used to bootstrap the new version of OpenJDK. 
Which leaves the question of how to deal with new platforms, but it sounds like there's a cross-compilation process starting with OpenJDK 8 which ought to solve that problem. The issue is that we're still supporting a version of OpenJDK/IcedTea where there is no previous version (6). Once that goes, gcj could go too. This is still just a little too soon. But surely OpenJDK6 can build OpenJDK6, right? I don't see you're fundamentally getting anything from always starting with a GCJ bootstrap. That's where it comes unstuck. How do you get a JDK built when there are no JDK binaries for your architecture? Cross compilation, just like folks do for Ada. I'm not against this long-term, just not immediately. Deprecating it now and removing it in the next release cycle (7?) would probably be enough, but we need a little more time to wind down dependencies. I don't see us needing it in a GCC released in 2017. I was of the opinion that we should remove it from the default languages to be built. Others wanted to be more aggressive :-) jeff
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
- Original Message - snip... Having classpath (with binary files!) In the GCC SVN (or future git) repository is a significant burden, not to mention the size of the distributed source tarball. If we can get rid of that that would be a great step in reducing the burden. Iff we can even without classpath build enough of java to be useful (do you really need gcj or only gij for bootstrapping openjdk? After all ecj is just a drop-in to gcc as well). All the Java compilers are written in Java (ecj javac). So to run them, you need a JVM and its class library. It's those binary files which allow gcj to bootstrap the stack. If OpenJDK had a minimal binary class library, it would be able to bootstrap itself. But, as things stand, you need enough of the JDK to run a Java compiler and build the OpenJDK class libraries. GCJ currently fulfils that need where there isn't already an OpenJDK installation available. -- Actually, this makes me think... IcedTea already depends on CACAO and JamVM for alternate builds of OpenJDK. We could instead include the bytecode binaries for GNU Classpath in IcedTea, bootstrap JamVM and use that to bootstrap OpenJDK. That would remove our dependency on gcj and make IcedTea largely self-sufficient. It would also mean we could drop a bunch of conditional code which depends on what the system bootstrap JDK is, because it would always be the in-tree solution. We'd still need more than six months to make this transition though, as such a change really needs time for testing. -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 PGP Key: rsa4096/248BDC07 (hkp://keys.gnupg.net) Fingerprint = EC5A 1F5E C0AD 1D15 8F1F 8F91 3B96 A578 248B DC07
Re: [RFC][Scalar masks 1/x] Introduce GEN_MASK_EXPR.
On 08/17/2015 10:22 AM, Ilya Enkovich wrote: Hi, This patch starts a series introducing scalar mask support in the vectorizer. It was discussed at the recent Cauldron and a changes overview is available here: https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Vectorization+for+Intel+AVX-512.pdf. Here is a short list of the changes introduced by this series: - Add a new tree expr to produce scalar masks in vectorized code - Fix up if-conversion to use bool predicates instead of integer masks - Disable some bool patterns to avoid bool-to-int conversion where masks can be used - Support bool operands in vectorization factor computation - Support scalar masks in MASK_LOAD, MASK_STORE and VEC_COND_EXPR by adding new optabs - Support vectorization for statements which are now not transformed by bool patterns - Add target support (hooks, optabs, expands) This patch introduces the GEN_MASK_EXPR code. Initially I wanted to use a comparison as an operand for it, directly mapping it onto an AVX-512 comparison instruction. But the feedback was to simplify the new code's semantics and use it for converting vectors into scalar masks. Therefore if we want to compare two vectors into a scalar mask we use two statements: vect.18_87 = vect__5.13_81 vect__6.16_86; mask__ifc__23.17_88 = GEN_MASK vect.18_87; Trying it in practice I found it produced worse code. The problem is that on the target the first comparison is expanded into two instructions: a cmp with a mask result + a masked move to get a vector. GEN_MASK is then expanded into another comparison, with a zero vector. Thus I get two comparisons + a move instead of a single comparison, and have to optimize this out on the target side (current optimizers can't handle it). That's exactly what I wanted to avoid. For now I changed GEN_MASK_EXPR to take a vector value as an operand but didn't change the expand pattern, which has four operands: two vectors to compare + a cmp operator + a result. 
On expand I try to detect GEN_MASK uses a result of comparison and thus avoid double comparison generation. Patch series is not actually fully finished yet. I still have several type conversion tests not being vectorized and it wasn't widely tested. That's what I'm working on now. Will be glad to any comments. Thanks, Ilya -- 2015-08-17 Ilya Enkovich enkovich@gmail.com * expr.c (expand_expr_real_2): Support GEN_MASK_EXPR. * gimple-pretty-print.c (dump_unary_rhs): Likewise. * gimple.c (get_gimple_rhs_num_ops): Likewise. * optabs.c: Include gimple.h. (vector_compare_rtx): Add OPNO arg. (get_gen_mask_icode): New. (expand_gen_mask_expr_p): New. (expand_gen_mask_expr): New. (expand_vec_cond_expr): Adjust vector_compare_rtx call. * optabs.def (gen_mask_optab): New. (gen_masku_optab): New. * optabs.h (expand_gen_mask_expr_p): New. (expand_gen_mask_expr): New. * tree-cfg.c (verify_gimple_assign_unary): Support GEN_MASK_EXPR. * tree-inline.c (estimate_operator_cost): Likewise. * tree-pretty-print.c (dump_generic_node): Likewise. * tree-ssa-operands.c (get_expr_operands): Likewise. * tree.def (GEN_MASK_EXPR): New. A general question, would any of this likely help Yuri's work to optimize MASK_STORES? diff --git a/gcc/optabs.c b/gcc/optabs.c index a6ca706..bf466ca 100644 --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -51,6 +51,7 @@ along with GCC; see the file COPYING3. If not see #include recog.h #include reload.h #include target.h +#include gimple.h Hmm, part of me doesn't want to see optabs.c depending on gimple.h. How painful would it be to have this stuff live in expr.c? + +/* Generate insns for a GEN_MASK_EXPR, given its TYPE and operand. */ + +rtx +expand_gen_mask_expr (tree type, tree op0, rtx target) +{ + struct expand_operand ops[4]; + enum insn_code icode; + rtx comparison; + machine_mode mode = TYPE_MODE (type); + machine_mode cmp_op_mode; + bool unsignedp; + tree op0a, op0b; + enum tree_code tcode; + gimple def_stmt; + + /* Avoid double comparison. 
*/ + if (TREE_CODE (op0) == SSA_NAME + (def_stmt = SSA_NAME_DEF_STMT (op0)) + is_gimple_assign (def_stmt) + TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt)) == tcc_comparison) +{ + op0a = gimple_assign_rhs1 (def_stmt); + op0b = gimple_assign_rhs2 (def_stmt); + tcode = gimple_assign_rhs_code (def_stmt); +} + else +{ + op0a = op0; + op0b = build_zero_cst (TREE_TYPE (op0)); + tcode = NE_EXPR; +} + + unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a)); + cmp_op_mode = TYPE_MODE (TREE_TYPE (op0a)); + + gcc_assert (GET_MODE_BITSIZE (mode) = GET_MODE_NUNITS (cmp_op_mode)); + + icode = get_gen_mask_icode (cmp_op_mode, unsignedp); + if (icode == CODE_FOR_nothing) +return 0; So if the target doesn't have suitable insns, what happens? I suspect the answer is nothing
Re: [Scalar masks 2/x] Use bool masks in if-conversion
On 08/17/2015 10:25 AM, Ilya Enkovich wrote: Hi, This patch introduces a new vectorizer hook use_scalar_mask_p which affects code generated by the if-conversion pass (and affects patterns in later patches). Thanks, Ilya -- 2015-08-17 Ilya Enkovich enkovich@gmail.com * doc/tm.texi (TARGET_VECTORIZE_USE_SCALAR_MASK_P): New. * doc/tm.texi.in: Regenerated. * target.def (use_scalar_mask_p): New. * tree-if-conv.c: Include target.h. (predicate_mem_writes): Don't convert boolean predicates into integer when scalar masks are used. Presumably this is how you prevent the generation of scalar masks rather than boolean masks on targets which don't have the former? I hate to ask, but how painful would it be to go from boolean to integer masks later, such as during expansion? Or vice versa. Without deep knowledge of the entire patchkit, it feels like we're introducing target stuff in a place where we don't want it, and that we'd be better served with a canonical representation through gimple, then dropping into something more target-specific during gimple->rtl expansion. Jeff
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
- Original Message - On 08/20/2015 10:03 AM, Andrew Hughes wrote: - Original Message - On 08/20/2015 09:27 AM, Andrew Haley wrote: On 08/20/2015 03:57 PM, Andrew Hughes wrote: - Original Message - On 20/08/15 09:24, Matthias Klose wrote: On 08/20/2015 06:36 AM, Tom Tromey wrote: Andrew No, it isn't. It's still a necessity for initial bootstrapping of Andrew OpenJDK/IcedTea. Andrew Haley said the opposite here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html if you need bootstrapping OpenJDK 6 or OpenJDK 7, then having gcj available for the target platform is required. Starting with OpenJDK 8 you should be able to cross build OpenJDK 8 with an OpenJDK 8 available on the cross platform. It might be possible to cross build older OpenJDK versions, but this usually is painful. Sure, but we don't need GCJ going forward. I don't think that there are any new platforms to which OpenJDK has not been ported which will require GCJ to bootstrap. And even if there are, anybody who needs to do that can (and, indeed, should) use an earlier version of GCJ. It's not going to go away; it will always be in the GCC repos. And because newer versions of GCC may break GCJ (and maybe OpenJDK) it makes more sense to use an old GCC/GCJ for the bootstrapping of an old OpenJDK. I don't see how we don't at present. How else do you solve the chicken-and-egg situation of needing a JDK to build a JDK? I don't see crossing your fingers and hoping there's a binary around somewhere as a very sustainable system. That's what we do with GCC, binutils, etc: we bootstrap. Right. So the question is there some reason why OpenJDK can't be used to bootstrap itself? Ie, is there a fundamental reason why Andrew needs to drop back down to GCJ and start the bootstrapping process from scratch. ISTM that ideally the previous version of OpenJDK would be used to bootstrap the new version of OpenJDK. 
Which leaves the question of how to deal with new platforms, but it sounds like there's a cross-compilation process starting with OpenJDK 8 which ought to solve that problem. The issue is that we're still supporting a version of OpenJDK/IcedTea where there is no previous version (6). Once that goes, gcj could go too. This is still just a little too soon. But surely OpenJDK6 can build OpenJDK6, right? I don't see you're fundamentally getting anything from always starting with a GCJ bootstrap. I'm talking about when you don't already have OpenJDK 6. That's where it comes unstuck. How do you get a JDK built when there are no JDK binaries for your architecture? Cross compilation, just like folks do for Ada. Which still needs a JDK somewhere and, as Matthias mentioned, the build system on older versions of OpenJDK (the ones we're talking about) doesn't really support cross-compilation. I had to hack around just to get x86 on x86_64 to work. I'm not against this long-term, just not immediately. Deprecating it now and removing it in the next release cycle (7?) would probably be enough, but we need a little more time to wind down dependencies. I don't see us needing it in a GCC released in 2017. I was of the opinion that we should remove it from the default languages to be built. Others wanted to be more aggressive :-) I actually thought that change would have happened a long long time ago ;) I'm actually for the aggressive approach, just on a longer time scale, as I'll need time to transition IcedTea away from gcj. jeff -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 PGP Key: rsa4096/248BDC07 (hkp://keys.gnupg.net) Fingerprint = EC5A 1F5E C0AD 1D15 8F1F 8F91 3B96 A578 248B DC07
Re: [PATCH 1/15][ARM] Hide existing float16 intrinsics unless we have a scalar __fp16 type
Thanks, pushed with comment and ChangeLog fix as r227033. --Alan Kyrill Tkachov wrote: Hi Alan, On 28/07/15 12:23, Alan Lawrence wrote: This makes the existing float16 vector intrinsics available only when we have an __fp16 type (i.e. when one of the ARM_FP16_FORMAT_... macros is defined). Thus, we also rearrange the float16x[48]_t types to use the same type as __fp16 for the element type (ACLE says that __fp16 should be an alias). To keep the existing gcc.target/arm/neon/vcvt{f16_f32,f32_f16} tests working, as these do not specify an -mfp16-format, I've modified check_effective_target_arm_neon_fp16_ok to add in -mfp16-format=ieee *if necessary* (hence still allowing an explicit -mfp16-format=alternative). A documentation fix for this follows in the last patch. gcc/ChangeLog: * config/arm/arm-builtins.c (arm_init_simd_builtin_types): Move initialization of HFmode scalar type (float16_t) to... (arm_init_fp16_builtins): ...here, combining with previous __fp16. I'd say: ... Here. Combine with __fp16 initialization code (arm_init_builtins): Call arm_init_fp16_builtins earlier and always. * config/arm/arm_neon.h (vcvt_f16_f32, vcvt_f32_f16): Condition on having an -mfp16-format. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_arm_neon_fp16_ok_nocache): Add flag variants with -mfp16-format=ieee. @@ -1752,12 +1749,11 @@ arm_init_builtins (void) if (TARGET_REALLY_IWMMXT) arm_init_iwmmxt_builtins (); + arm_init_fp16_builtins (); + if (TARGET_NEON) arm_init_neon_builtins (); - if (arm_fp16_format) -arm_init_fp16_builtins (); - if (TARGET_CRC32) arm_init_crc32_builtins (); Can you please add a comment above arm_init_fp16_builtins (); saying that it needs to be called before arm_init_neon_builtins so that arm_simd_floatHF_type_node gets initialised properly? (Or words to that effect). Ok with the comment. Thanks, Kyrill
Re: [PATCH] S390: Fix vec_load_bndry.
Dominik Vogt wrote: gcc/ChangeLog * config/s390/s390-builtins.def: Fix value range of vec_load_bndry. gcc/testsuite/ChangeLog * gcc.target/s390/zvector/vec-load_bndry-1.c: New test. This is OK. Thanks, Ulrich -- Dr. Ulrich Weigand GNU/Linux compilers and toolchain ulrich.weig...@de.ibm.com
[avr, 5 + trunk, committed]: Fix KiB diagnostic for wrong address space.
https://gcc.gnu.org/r227034 https://gcc.gnu.org/r227035 If an address space is used that's beyond the flash of the current device (number of 64 KiB segments as specified by -mn-flash=) a diagnostic complains and prints the currently specified number of flash segments, i.e. 64 KiB chunks, but uses KiB as the unit. Hence the avr_n_flash number of segments has to be multiplied by 64 in order to convert from number of segments to KiB. Applied as obvious to trunk and gcc-5-branch (4.9 does not print the size). Johann Index: config/avr/avr.c === --- config/avr/avr.c(revision 227034) +++ config/avr/avr.c(revision 227035) @@ -9255,10 +9255,10 @@ avr_pgm_check_var_decl (tree node) { if (TYPE_P (node)) error ("%qT uses address space %qs beyond flash of %d KiB", - node, avr_addrspace[as].name, avr_n_flash); + node, avr_addrspace[as].name, 64 * avr_n_flash); else error ("%s %q+D uses address space %qs beyond flash of %d KiB", - reason, node, avr_addrspace[as].name, avr_n_flash); + reason, node, avr_addrspace[as].name, 64 * avr_n_flash); } else { @@ -9305,7 +9305,7 @@ avr_insert_attributes (tree node, tree * if (avr_addrspace[as].segment >= avr_n_flash) { error ("variable %q+D located in address space %qs beyond flash " - "of %d KiB", node, avr_addrspace[as].name, avr_n_flash); + "of %d KiB", node, avr_addrspace[as].name, 64 * avr_n_flash); } else if (!AVR_HAVE_LPM && avr_addrspace[as].pointer_size > 2) {
Re: [PATCH testsuite, ARM] skip Wno-frame-address tests
Hi Christian, On 20/08/15 14:45, Christian Bruel wrote: Hello, 2 tests from rev 226480 introduced a new failure for ARM when testing with -Werror, because a warning is always emitted regardless of whether -Wframe-address is given or not. From expand_builtin_frame_address: /* Some ports cannot access arbitrary stack frames. */ if (tem == NULL) { warning (0, "unsupported argument to %qD", fndecl); return const0_rtx; } This patch just skips the test on ARM, which can't access arbitrary stack frames anyway and will always warn. OK for trunk? thanks, Christian no-frame-address.patch 2015-08-20 Christian Bruel christian.br...@st.com * gcc.dg/Wno-frame-address.c: Skip for ARM. * g++.dg/Wno-frame-address.C: Ditto. Index: gcc/testsuite/gcc.dg/Wno-frame-address.c === --- gcc/testsuite/gcc.dg/Wno-frame-address.c(revision 227030) +++ gcc/testsuite/gcc.dg/Wno-frame-address.c(working copy) @@ -1,4 +1,5 @@ /* { dg-do compile } */ +/* { dg-skip-if "Cannot access arbitrary stack frames." { arm*-*-* } { "*" } { "" } } */ /* { dg-options "-Werror" } */ /* Verify that -Wframe-address is not enabled by default by enabling Index: gcc/testsuite/g++.dg/Wno-frame-address.C === --- gcc/testsuite/g++.dg/Wno-frame-address.C(revision 227030) +++ gcc/testsuite/g++.dg/Wno-frame-address.C(working copy) @@ -1,4 +1,5 @@ // { dg-do compile } +/* { dg-skip-if "Cannot access arbitrary stack frames." { arm*-*-* } { "*" } { "" } } */ // { dg-options "-Werror" } Use the C++-style comment here. Otherwise looks ok to me, though if more tests like this crop up we'd want a dg-requires-effective-target check that filters out the targets that don't implement this feature. Kyrill // Verify that -Wframe-address is not enabled by default by enabling
Re: [PATCH testsuite, ARM] skip Wno-frame-address tests
Kyrill Tkachov kyrylo.tkac...@arm.com writes: 2015-08-20 Christian Bruel christian.br...@st.com * gcc.dg/Wno-frame-address.c: Skip for ARM. * g++.dg/Wno-frame-address.C: Ditto. Index: gcc/testsuite/gcc.dg/Wno-frame-address.c === --- gcc/testsuite/gcc.dg/Wno-frame-address.c (revision 227030) +++ gcc/testsuite/gcc.dg/Wno-frame-address.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do compile } */ +/* { dg-skip-if "Cannot access arbitrary stack frames. " { arm*-*-* } { "*" } { "" } } */ /* { dg-options "-Werror" } */ /* Verify that -Wframe-address is not enabled by default by enabling Index: gcc/testsuite/g++.dg/Wno-frame-address.C === --- gcc/testsuite/g++.dg/Wno-frame-address.C (revision 227030) +++ gcc/testsuite/g++.dg/Wno-frame-address.C (working copy) @@ -1,4 +1,5 @@ // { dg-do compile } +/* { dg-skip-if "Cannot access arbitrary stack frames. " { arm*-*-* } { "*" } { "" } } */ // { dg-options "-Werror" } Use the C++-style comment here. Otherwise looks ok to me, though if more tests like this crop up we'd want a dg-requires-effective-target check that filters out the targets that don't implement this feature. Besides: omit the blank after "frames. " and the unnecessary default arguments ({ "*" } { "" }) in both cases. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH] Fix PR67285
The following patch avoids breaking virtual SSA form in fold_stmt by making sure we only ever generate calls to const functions in simplify expression replacements. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2015-08-20 Richard Biener rguent...@suse.de PR middle-end/67285 * gimple-fold.c (replace_stmt_with_simplification): Assert seq is empty when replacing a call with itself but different arguments. * gimple-match-head.c (maybe_push_res_to_seq): When pushing a call require that it is const. Index: gcc/gimple-fold.c === --- gcc/gimple-fold.c (revision 227031) +++ gcc/gimple-fold.c (working copy) @@ -3308,6 +3308,7 @@ replace_stmt_with_simplification (gimple } if (i < 3) gcc_assert (ops[i] == NULL_TREE); + gcc_assert (gimple_seq_empty_p (*seq)); return true; } else if (!inplace) Index: gcc/gimple-match-head.c === --- gcc/gimple-match-head.c (revision 227031) +++ gcc/gimple-match-head.c (working copy) @@ -338,6 +338,9 @@ maybe_push_res_to_seq (code_helper rcode tree decl = builtin_decl_implicit (rcode); if (!decl) return NULL_TREE; + /* We can't and should not emit calls to non-const functions. */ + if (!(flags_from_decl_or_type (decl) & ECF_CONST)) + return NULL_TREE; /* Play safe and do not allow abnormals to be mentioned in newly created statements. */ unsigned nargs;
[RFC]: Vectorization cost benefit changes.
All: I have done the vectorization cost changes as given below. I have considered only the cost associated with the inner loop instead of the outside cost. The consideration of inside scalar and vector cost is done as the inner costs are the most cost effective compared to the outside costs. min_profitable_iters = ((scalar_single_iter_cost - vec_inside_cost) * vf); The scalar_single_iter_cost considers the hardcoded value 50 which is used for most of the targets, and the scalar cost is multiplied with 50. The vector cost is subtracted from this scalar cost, and as the scalar cost is increased the chances of vectorization are higher with the same vectorization factor, and more loops will be vectorized. I have not changed the iteration count which is hardcoded with 50, and I will do the changes to replace the 50 with the static estimates of iteration count if you agree upon the below changes. I have run the SPEC CPU 2000 benchmarks with the below changes for i386 targets and significant gains are achieved with respect to the INT and FP benchmarks. Here is the data. Ratio with the vectorization cost changes (FP benchmarks) vs ratio without the vectorization cost changes (FP benchmarks) = 4640.102 vs 4583.379. Ratio with the vectorization cost changes (INT benchmarks) vs ratio without the vectorization cost changes (INT benchmarks) = 3812.883 vs 3778.558. Please give your feedback on the below changes for the vectorization cost benefit. 
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index 422b883..35d538f 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -2987,11 +2987,8 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo, min_profitable_iters = 1; else { - min_profitable_iters = ((vec_outside_cost - scalar_outside_cost) * vf - - vec_inside_cost * peel_iters_prologue - - vec_inside_cost * peel_iters_epilogue) - / ((scalar_single_iter_cost * vf) - - vec_inside_cost); + min_profitable_iters = ((scalar_single_iter_cost + - vec_inside_cost) * vf); if ((scalar_single_iter_cost * vf * min_profitable_iters) <= (((int) vec_inside_cost * min_profitable_iters) Thanks, Regards, Ajit
[PATCH: RL78] libgcc fixes for divmodsi, divmodhi and divmodqi
Hi, The following patch fixes issues in the div/mod emulation routines for the RL78 target. Hunk in divmodsi.S: This hunk adds a branch to 'main_loop_done_himode' instead of a direct 'ret'. The 'ret' from here was causing the hardware to crash as the registers were not being restored from the stack before return. This happened for long data division by 0. Note: A 'br $!' is used, as only using 'br $' gives a "relocation truncated to fit: R_RL78_DIR8S_PCREL" error at link time in the testcase. This hunk also fixes an issue related to the return register. r10,r11 were returned for the div instruction instead of the r8,r9 registers. Hunk in divmodhi.S: Fixes an issue related to the return register. r10 was returned for the div instruction instead of the r8 register. Hunk in divmodqi.S: Returns 0x00 instead of 0xff to keep results consistent with the other data types. The hunks in divmodhi and divmodqi are not critical, however the one in divmodsi is critical as the processor runs away to undefined space and crashes. This is regression tested for RL78 -msim. Please let me know if it is OK to commit. Best Regards, Kaushik Changelog: 2015-08-21 Kaushik Phatak kaushik.pha...@kpit.com * config/rl78/divmodqi.S: Return 0x00 by default for div by 0. * config/rl78/divmodsi.S: Update return register to r8. * config/rl78/divmodhi.S: Update return register to r8,r9. Branch to main_loop_done_himode to pop registers before return. 
Index: libgcc/config/rl78/divmodhi.S === --- libgcc/config/rl78/divmodhi.S (revision 227024) +++ libgcc/config/rl78/divmodhi.S (working copy) @@ -454,7 +454,11 @@ movw ax, den cmpw ax, #0 bnz $den_not_zero\which + .if \need_result + movw quot, #0 + .else movw num, #0 + .endif ret den_not_zero\which: Index: libgcc/config/rl78/divmodqi.S === --- libgcc/config/rl78/divmodqi.S (revision 227024) +++ libgcc/config/rl78/divmodqi.S (working copy) @@ -63,7 +63,7 @@ ret den_is_zero\which: - mov r8, #0xff + mov r8, #0x00 ret ;; These routines leave DE alone - the signed functions use DE Index: libgcc/config/rl78/divmodsi.S === --- libgcc/config/rl78/divmodsi.S (revision 227024) +++ libgcc/config/rl78/divmodsi.S (working copy) @@ -688,9 +688,14 @@ or a, denB3 ; not x cmpw ax, #0 bnz $den_not_zero\which + .if \need_result + movw quotL, #0 + movw quotH, #0 + .else movw numL, #0 movw numH, #0 - ret + .endif + br $!main_loop_done_himode\which den_not_zero\which: .if \need_result
C++ PATCH to overloaded friend hiding
Someone on the C++ committee pointed out that G++ unqualified lookup could see through one hidden friend, but not two, and not a template. Fixed thus. Tested x86_64-pc-linux-gnu, applying to trunk. commit bef61f4085710822782a462def4a7032c8a91668 Author: Jason Merrill ja...@redhat.com Date: Thu Aug 20 14:40:55 2015 -0400 * name-lookup.c (hidden_name_p): Handle OVERLOAD. diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c index 79e2863..baaf3e7 100644 --- a/gcc/cp/name-lookup.c +++ b/gcc/cp/name-lookup.c @@ -4346,6 +4346,13 @@ hidden_name_p (tree val) TYPE_FUNCTION_OR_TEMPLATE_DECL_P (val) && DECL_ANTICIPATED (val)) return true; + if (TREE_CODE (val) == OVERLOAD) +{ + for (tree o = val; o; o = OVL_CHAIN (o)) + if (!hidden_name_p (OVL_FUNCTION (o))) + return false; + return true; +} return false; } diff --git a/gcc/testsuite/g++.dg/lookup/friend16.C b/gcc/testsuite/g++.dg/lookup/friend16.C new file mode 100644 index 000..bb27773 --- /dev/null +++ b/gcc/testsuite/g++.dg/lookup/friend16.C @@ -0,0 +1,24 @@ +namespace std { + class ostream; +} + +namespace N2 { + class C0 {}; +} + +std::ostream& operator<<( std::ostream& os_, const N2::C0& m_); + +namespace N1 { + class C1 { +friend std::ostream& operator<<(std::ostream& os, const C1& what); + }; + + class C2 { +friend std::ostream& operator<<(std::ostream& os, const C2& what); + }; + + void foo(std::ostream& os, const N2::C0& m) + { +os << m; // Is this line valid? 
+ } +} diff --git a/gcc/testsuite/g++.dg/template/friend15.C b/gcc/testsuite/g++.dg/template/friend15.C index 4acbf2d..15ba1c2 100644 --- a/gcc/testsuite/g++.dg/template/friend15.C +++ b/gcc/testsuite/g++.dg/template/friend15.C @@ -10,10 +10,11 @@ template <typename> class X { struct Inner; template <typename R> -friend typename X<R>::Inner * foo () { return 0; } +friend typename X<R>::Inner * foo (X<R>*) { return 0; } }; template class X<void>; +X<void>* p; struct U { -void bar () { foo<void> (); } + void bar () { foo (p); } }; diff --git a/gcc/testsuite/g++.dg/template/friend18.C b/gcc/testsuite/g++.dg/template/friend18.C index 04ba26e..712d488 100644 --- a/gcc/testsuite/g++.dg/template/friend18.C +++ b/gcc/testsuite/g++.dg/template/friend18.C @@ -7,13 +7,14 @@ template <int N> struct X { - template <int M> friend int foo(X const &) + template <int M> friend int foo(X const &, X<M> const&) { return N * 10000 + M; } }; X<1234> bring; +X<5678> brung; int main() { - return foo<5678> (bring) != 12345678; + return foo (bring, brung) != 12345678; } diff --git a/gcc/testsuite/g++.old-deja/g++.pt/friend32.C b/gcc/testsuite/g++.old-deja/g++.pt/friend32.C index 512a69a..db8b724 100644 --- a/gcc/testsuite/g++.old-deja/g++.pt/friend32.C +++ b/gcc/testsuite/g++.old-deja/g++.pt/friend32.C @@ -7,8 +7,8 @@ struct S { }; template class S<int, double>; -template char f(char, long, short); -template char* f(char*, long*, short*); +template char f(char, long, short); // { dg-error "f" } +template char* f(char*, long*, short*); // { dg-error "f" } template <class X, class Y, class Z> X f(X x, Y, Z) {
Re: Should we remove remnants of UWIN support in gcc/config.* files?
On Thu, 20 Aug 2015, FX wrote: PS: gcc/config.host and gcc/config.build include some other such targets… without checking them all, I think the following could be removed: powerpc-*-beos i370-*-opened* | i370-*-mvs* i386-*-vsta i[34567]86-*-udk* i[34567]86-*-sysv4* i[34567]86-sequent-ptx4* | i[34567]86-sequent-sysv4* i[34567]86-*-sco3.2v5* Is there a good reason for not removing those targets? If not, I’ll try to track them down, check that they are indeed unsupported, and propose a patch removing them. Well, they aren't *targets*, but *host* and *build* systems. However, I don't think any of them are actually significantly relevant to GCC now as host or build systems (and nor do I think we need case statements for unsupported host or build systems - just let people try to build and possibly have the build fail, a fix for such a failure would better use autoconf than special-casing a triplet anyway). -- Joseph S. Myers jos...@codesourcery.com
[PATCH], PR target/67211, Fix PowerPC 'insn does not satisfy its constraints' error on GCC 5
PR 67211 is an error that shows up on the GCC 5.x branch when the test case is compiled with -mcpu=power7 -mtune=power8 -O3. In looking at the code, I noticed that the code optimized adjacent 64-bit integers/pointers in a structure from DImode to V2DImode. The compiler optimized these to the vector registers, and then tried to move a common field used later back to a GPR. If the CPU was power8, it would be able to use the direct move instructions, but on power7 those instructions don't exist. The current trunk compiler has dialed back on the optimization, and it no longer tries to optimize adjacent fields in this particular case to V2DImode, but it is an issue in the GCC 5 branch. In debugging the issue, I noticed the -mefficient-unaligned-vsx option was being set if -mtune=power8 was used, even if the architecture was not a power8. Efficient unaligned VSX is an architecture feature, and not a tuning feature. In fixing this to be an architecture feature, it no longer tried to do the V2DImode optimization because it didn't have fast unaligned support. I have checked this on a big endian power7 and a little endian power8 system, using the GCC 5.x patches and the patches for the trunk. There were no regressions in any of the runs. Is it ok to install these patches on both the GCC 5.x branch and trunk? I would like to commit a similar patch for the 4.9 branch as well. Is this ok? Note, due to rs6000.opt being slightly different between GCC 5.x and trunk, there are two different patches, one for GCC 5.x and the other for GCC 6.x (trunk). [gcc] 2015-08-20 Michael Meissner meiss...@linux.vnet.ibm.com PR target/67211 * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Set -mefficient-unaligned-vsx on ISA 2.7. * config/rs6000/rs6000.opt (-mefficient-unaligned-vsx): Convert option to a masked option. 
* config/rs6000/rs6000.c (rs6000_option_override_internal): Rework logic for -mefficient-unaligned-vsx so that it is set via an arch ISA option, instead of being set if -mtune=power8 is set. Move -mefficient-unaligned-vsx and -mallow-movmisalign handling to be near other default option handling. [gcc/testsuite] 2015-08-20 Michael Meissner meiss...@linux.vnet.ibm.com PR target/67211 * g++.dg/pr67211.C: New test. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/config/rs6000/rs6000-cpus.def === --- gcc/config/rs6000/rs6000-cpus.def (revision 226986) +++ gcc/config/rs6000/rs6000-cpus.def (working copy) @@ -53,6 +53,7 @@ | OPTION_MASK_P8_VECTOR\ | OPTION_MASK_CRYPTO \ | OPTION_MASK_DIRECT_MOVE \ +| OPTION_MASK_EFFICIENT_UNALIGNED_VSX \ | OPTION_MASK_HTM \ | OPTION_MASK_QUAD_MEMORY \ | OPTION_MASK_QUAD_MEMORY_ATOMIC \ @@ -78,6 +79,7 @@ | OPTION_MASK_DFP \ | OPTION_MASK_DIRECT_MOVE \ | OPTION_MASK_DLMZB\ +| OPTION_MASK_EFFICIENT_UNALIGNED_VSX \ | OPTION_MASK_FPRND\ | OPTION_MASK_HTM \ | OPTION_MASK_ISEL \ Index: gcc/config/rs6000/rs6000.opt === --- gcc/config/rs6000/rs6000.opt(revision 226986) +++ gcc/config/rs6000/rs6000.opt(working copy) @@ -212,7 +212,7 @@ Target Undocumented Var(TARGET_ALLOW_MOV ; Allow/disallow the movmisalign in DF/DI vectors mefficient-unaligned-vector -Target Undocumented Report Var(TARGET_EFFICIENT_UNALIGNED_VSX) Init(-1) Save +Target Undocumented Report Mask(EFFICIENT_UNALIGNED_VSX) Var(rs6000_isa_flags) ; Consider unaligned VSX accesses to be efficient/inefficient mallow-df-permute Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 226986) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -3692,6 +3692,45 @@ rs6000_option_override_internal (bool gl optimize = 3) rs6000_isa_flags |= OPTION_MASK_P8_FUSION_SIGN; + /* Set -mallow-movmisalign to explicitly on if we have full ISA 2.07 + support. 
If we only have ISA 2.06 support, and the user did not specify + the switch, leave it set to -1 so the movmisalign patterns are enabled, + but we don't
Re: [Patch] Add to the libgfortran/newlib bodge to detect ftruncate support in ARM/AArch64/SH
On Thu, 2015-08-20 at 09:31 +0100, James Greenhalgh wrote: I'd appreciate your help Steve to check that this patch works with your build system. Thanks, James Yes, this patch works fine with my builds for MIPS. Steve Ellcey sell...@imgtec.com
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
On Thu, 20 Aug 2015, Richard Biener wrote: OK, so how about deprecating Java for GCC 6 by removing it from the default languages and removing it for GCC 7 or before we switch to git (whatever happens earlier?) I don't think before we switch to git should be a relevant consideration for any change to the contents of the source tree (all the libgcj files in SVN should be included in the git repository just like all the other history of all the branches in the repository - while I think there should be a fresh conversion to git, I don't think we should take the opportunity to remove anything from the history). -- Joseph S. Myers jos...@codesourcery.com
Should we remove remnants of UWIN support in gcc/config.* files?
UWIN support was apparently removed from GCC in 2008. Yet some traces can still be found in gcc/config.* files. Attached patch would remove them. OK to commit? FX PS: gcc/config.host and gcc/config.build include some other such targets… without checking them all, I think the following could be removed: powerpc-*-beos i370-*-opened* | i370-*-mvs* i386-*-vsta i[34567]86-*-udk* i[34567]86-*-sysv4* i[34567]86-sequent-ptx4* | i[34567]86-sequent-sysv4* i[34567]86-*-sco3.2v5* Is there a good reason for not removing those targets? If not, I’ll try to track them down, check that they are indeed unsupported, and propose a patch removing them.
Re: Should we remove remnants of UWIN support in gcc/config.* files?
Well, they aren't *targets*, but *host* and *build* systems. Yes, but do we maintain a list of support host or build systems, that would be different from our list of supported targets? (That’s a question out of curiosity. I do agree with the rest of your message: in practice, they are not supported.) FX
Re: Should we remove remnants of UWIN support in gcc/config.* files?
On Thu, 20 Aug 2015, FX wrote: Well, they aren't *targets*, but *host* and *build* systems. Yes, but do we maintain a list of support host or build systems, that would be different from our list of supported targets? I don't think there's such a list. For any such system that's not a supported target to work in practice, it would need a reasonably modern C++ compiler, which probably rules out a lot of systems that have been obsoleted as targets. -- Joseph S. Myers jos...@codesourcery.com
[v3 patch] Fix friend declaration so it is visible to name lookup
Jason pointed out this isn't valid, and is going to fail to compile soon with a fix he's making. Tested powerpc64le-linux, committed to trunk. commit 89c676b9c2008823e7bbb7d5db615d908b1ea27d Author: Jonathan Wakely jwak...@redhat.com Date: Thu Aug 20 20:51:39 2015 +0100 * include/experimental/any (__any_caster): Define at namespace scope so the name is visible to name lookup. * testsuite/experimental/any/misc/any_cast_neg.cc: Adjust dg-error. diff --git a/libstdc++-v3/include/experimental/any b/libstdc++-v3/include/experimental/any index dae82b5..4cdc1dc 100644 --- a/libstdc++-v3/include/experimental/any +++ b/libstdc++-v3/include/experimental/any @@ -296,14 +296,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Storage _M_storage; template<typename _Tp> - friend void* __any_caster(const any* __any) - { - if (__any->_M_manager != _Manager<decay_t<_Tp>>::_S_manage) - return nullptr; - _Arg __arg; - __any->_M_manager(_Op_access, __any, &__arg); - return __arg._M_obj; - } + friend void* __any_caster(const any* __any); // Manage in-place contained object. template<typename _Tp> @@ -396,6 +389,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } // @} + template<typename _Tp> +void* __any_caster(const any* __any) +{ + if (__any->_M_manager != any::_Manager<decay_t<_Tp>>::_S_manage) + return nullptr; + any::_Arg __arg; + __any->_M_manager(any::_Op_access, __any, &__arg); + return __arg._M_obj; +} + /** * @brief Access the contained object. * diff --git a/libstdc++-v3/testsuite/experimental/any/misc/any_cast_neg.cc b/libstdc++-v3/testsuite/experimental/any/misc/any_cast_neg.cc index 1d1180c..5c7595d 100644 --- a/libstdc++-v3/testsuite/experimental/any/misc/any_cast_neg.cc +++ b/libstdc++-v3/testsuite/experimental/any/misc/any_cast_neg.cc @@ -26,5 +26,5 @@ void test01() using std::experimental::any_cast; const any y(1); - any_cast<int&>(y); // { dg-error "qualifiers" { target { *-*-* } } 360 } + any_cast<int&>(y); // { dg-error "qualifiers" { target { *-*-* } } 353 } }
[patch] libstdc++/67294 Don't run timed mutex tests on Darwin
I added these tests recently but Darwin doesn't support timed mutexes. Tested powerpc64le-linux, committed to trunk. commit 6d230c89901d56ef429e3cc255b0d2b2d137f94f Author: Jonathan Wakely jwak...@redhat.com Date: Thu Aug 20 21:29:29 2015 +0100 libstdc++/67294 Don't run timed mutex tests on Darwin PR libstdc++/67294 * testsuite/30_threads/recursive_timed_mutex/unlock/2.cc: Do not run on Darwin. * testsuite/30_threads/timed_mutex/unlock/2.cc: Likewise. diff --git a/libstdc++-v3/testsuite/30_threads/recursive_timed_mutex/unlock/2.cc b/libstdc++-v3/testsuite/30_threads/recursive_timed_mutex/unlock/2.cc index ac51f43..de5592a 100644 --- a/libstdc++-v3/testsuite/30_threads/recursive_timed_mutex/unlock/2.cc +++ b/libstdc++-v3/testsuite/30_threads/recursive_timed_mutex/unlock/2.cc @@ -1,7 +1,7 @@ -// { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* } } +// { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin powerpc-ibm-aix* } } // { dg-options "-std=gnu++11 -pthread" { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } } // { dg-options "-std=gnu++11 -pthreads" { target *-*-solaris* } } -// { dg-options "-std=gnu++11" { target *-*-cygwin *-*-darwin* } } +// { dg-options "-std=gnu++11" { target *-*-cygwin } } // { dg-require-cstdint } // { dg-require-gthreads } diff --git a/libstdc++-v3/testsuite/30_threads/timed_mutex/unlock/2.cc b/libstdc++-v3/testsuite/30_threads/timed_mutex/unlock/2.cc index 10fdc53..14e39de 100644 --- a/libstdc++-v3/testsuite/30_threads/timed_mutex/unlock/2.cc +++ b/libstdc++-v3/testsuite/30_threads/timed_mutex/unlock/2.cc @@ -1,7 +1,7 @@ -// { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* } } +// { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin 
powerpc-ibm-aix* } } // { dg-options "-std=gnu++11 -pthread" { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } } // { dg-options "-std=gnu++11 -pthreads" { target *-*-solaris* } } -// { dg-options "-std=gnu++11" { target *-*-cygwin *-*-darwin* } } +// { dg-options "-std=gnu++11" { target *-*-cygwin } } // { dg-require-cstdint } // { dg-require-gthreads }
[PATCH, rs6000] A few more vector interfaces
Hi, This patch adds a few more vector interfaces listed in the ELFv2 ABI v1.1: missing flavors of vec_madd, vec_pmsum_be, and vec_shasigma_be. Existing tests have been updated to check for correct code gen. Tested on powerpc64le-unknown-linux-gnu with no regressions. Ok for trunk? Thanks, Bill [gcc] 2015-08-20 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/altivec.h (vec_pmsum_be): New #define. (vec_shasigma_be): New #define. * config/rs6000/rs6000-builtin.def (VPMSUMB): New BU_P8V_AV2_2. (VPMSUMH): Likewise. (VPMSUMW): Likewise. (VPMSUMD): Likewise. (VPMSUM): New BU_P8V_OVERLOAD_2. * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): New entries for VEC_MADD and VEC_VPMSUM. [gcc/testsuite] 2015-08-20 Bill Schmidt wschm...@linux.vnet.ibm.com * gcc.target/powerpc/altivec-35.c (foo): Add tests for vec_madd. * gcc.target/powerpc/p8vector-builtin-8.c (foo): Add tests for vec_vpmsum_be and vec_shasigma_be. Index: gcc/config/rs6000/altivec.h === --- gcc/config/rs6000/altivec.h (revision 227035) +++ gcc/config/rs6000/altivec.h (working copy) @@ -208,6 +208,8 @@ #define vec_lvebx __builtin_vec_lvebx #define vec_lvehx __builtin_vec_lvehx #define vec_lvewx __builtin_vec_lvewx +#define vec_pmsum_be __builtin_vec_vpmsum +#define vec_shasigma_be __builtin_crypto_vshasigma /* Cell only intrinsics. 
*/ #ifdef __PPU__ #define vec_lvlx __builtin_vec_lvlx Index: gcc/config/rs6000/rs6000-builtin.def === --- gcc/config/rs6000/rs6000-builtin.def(revision 227035) +++ gcc/config/rs6000/rs6000-builtin.def(working copy) @@ -1489,6 +1489,10 @@ BU_P8V_AV_2 (VPKUDUM,vpkudum, CONST, altivec_v BU_P8V_AV_2 (VPKSDSS, vpksdss, CONST, altivec_vpksdss) BU_P8V_AV_2 (VPKUDUS, vpkudus, CONST, altivec_vpkudus) BU_P8V_AV_2 (VPKSDUS, vpksdus, CONST, altivec_vpksdus) +BU_P8V_AV_2 (VPMSUMB, vpmsumb, CONST, crypto_vpmsumb) +BU_P8V_AV_2 (VPMSUMH, vpmsumh, CONST, crypto_vpmsumh) +BU_P8V_AV_2 (VPMSUMW, vpmsumw, CONST, crypto_vpmsumw) +BU_P8V_AV_2 (VPMSUMD, vpmsumd, CONST, crypto_vpmsumd) BU_P8V_AV_2 (VRLD, vrld, CONST, vrotlv2di3) BU_P8V_AV_2 (VSLD, vsld, CONST, vashlv2di3) BU_P8V_AV_2 (VSRD, vsrd, CONST, vlshrv2di3) @@ -1570,6 +1574,7 @@ BU_P8V_OVERLOAD_2 (VPKSDSS, vpksdss) BU_P8V_OVERLOAD_2 (VPKSDUS,vpksdus) BU_P8V_OVERLOAD_2 (VPKUDUM,vpkudum) BU_P8V_OVERLOAD_2 (VPKUDUS,vpkudus) +BU_P8V_OVERLOAD_2 (VPMSUM, vpmsum) BU_P8V_OVERLOAD_2 (VRLD, vrld) BU_P8V_OVERLOAD_2 (VSLD, vsld) BU_P8V_OVERLOAD_2 (VSRAD, vsrad) Index: gcc/config/rs6000/rs6000-c.c === --- gcc/config/rs6000/rs6000-c.c(revision 227035) +++ gcc/config/rs6000/rs6000-c.c(working copy) @@ -2937,6 +2937,14 @@ const struct altivec_builtin_types altivec_overloa RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_MADD, VSX_BUILTIN_XVMADDDP, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF }, + { ALTIVEC_BUILTIN_VEC_MADD, ALTIVEC_BUILTIN_VMLADDUHM, +RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI }, + { ALTIVEC_BUILTIN_VEC_MADD, ALTIVEC_BUILTIN_VMLADDUHM, +RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI }, + { ALTIVEC_BUILTIN_VEC_MADD, ALTIVEC_BUILTIN_VMLADDUHM, +RS6000_BTI_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI }, + { ALTIVEC_BUILTIN_VEC_MADD, ALTIVEC_BUILTIN_VMLADDUHM, +RS6000_BTI_unsigned_V8HI, 
RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI }, { ALTIVEC_BUILTIN_VEC_MADDS, ALTIVEC_BUILTIN_VMHADDSHS, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI }, { ALTIVEC_BUILTIN_VEC_MLADD, ALTIVEC_BUILTIN_VMLADDUHM, @@ -4171,6 +4179,19 @@ const struct altivec_builtin_types altivec_overloa { P8V_BUILTIN_VEC_VMRGOW, P8V_BUILTIN_VMRGOW, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 }, + { P8V_BUILTIN_VEC_VPMSUM, P8V_BUILTIN_VPMSUMB, +RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, +RS6000_BTI_unsigned_V16QI, 0 }, + { P8V_BUILTIN_VEC_VPMSUM, P8V_BUILTIN_VPMSUMH, +RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, +RS6000_BTI_unsigned_V8HI, 0 }, + { P8V_BUILTIN_VEC_VPMSUM, P8V_BUILTIN_VPMSUMW, +RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI, +RS6000_BTI_unsigned_V4SI, 0 }, + { P8V_BUILTIN_VEC_VPMSUM, P8V_BUILTIN_VPMSUMD, +RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI, +RS6000_BTI_unsigned_V2DI, 0 }, + {
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
On 08/20/2015 09:27 AM, Andrew Haley wrote: On 08/20/2015 03:57 PM, Andrew Hughes wrote: - Original Message - On 20/08/15 09:24, Matthias Klose wrote: On 08/20/2015 06:36 AM, Tom Tromey wrote: Andrew No, it isn't. It's still a necessity for initial bootstrapping of Andrew OpenJDK/IcedTea. Andrew Haley said the opposite here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html if you need bootstrapping OpenJDK 6 or OpenJDK 7, then having gcj available for the target platform is required. Starting with OpenJDK 8 you should be able to cross build OpenJDK 8 with an OpenJDK 8 available on the cross platform. It might be possible to cross build older OpenJDK versions, but this usually is painful. Sure, but we don't need GCJ going forward. I don't think that there are any new platforms to which OpenJDK has not been ported which will require GCJ to bootstrap. And even if there are, anybody who needs to do that can (and, indeed, should) use an earlier version of GCJ. It's not going to go away; it will always be in the GCC repos. And because newer versions of GCC may break GCJ (and maybe OpenJDK) it makes more sense to use an old GCC/GCJ for the bootstrapping of an old OpenJDK. I don't see how we don't at present. How else do you solve the chicken-and-egg situation of needing a JDK to build a JDK? I don't see crossing your fingers and hoping there's a binary around somewhere as a very sustainable system. That's what we do with GCC, binutils, etc: we bootstrap. Right. So the question is: is there some reason why OpenJDK can't be used to bootstrap itself? I.e., is there a fundamental reason why Andrew needs to drop back down to GCJ and start the bootstrapping process from scratch? ISTM that ideally the previous version of OpenJDK would be used to bootstrap the new version of OpenJDK. Which leaves the question of how to deal with new platforms, but it sounds like there's a cross-compilation process starting with OpenJDK 8 which ought to solve that problem.
From a personal point of view, I need gcj to make sure each new IcedTea 1.x and 2.x release bootstraps. Sure, but all that does is test that the GCJ bootstrap still works. And it's probably the only serious use of GCJ left. And how much value is there in that in the real world? It's not a sudden whim: it's something we've been discussing for years. The only reason GCJ is still alive is that I committed to keep it going while we still needed it to bootstrap OpenJDK. Maintaining GCJ in GCC is a significant cost, and GCJ has reached the end of its natural life. Classpath is substantially unmaintained, and GCJ doesn't support any recent versions of Java. Right. I think we last discussed this in 2013 and there was still some benefit in keeping GCJ building, but that benefit is dwindling over time. There's an ongoing cost to every GCC developer to keep GCJ functional as changes in the core compiler happen. Furthermore, there's a round-trip cost for every patch under development by every developer in the bootstrap testing cycles. Given the marginal benefit to GCC and OpenJDK and the fairly high cost, we'd really prefer to drop GCJ. Jeff
Re: [PATCH, rs6000] A few more vector interfaces
On Thu, Aug 20, 2015 at 11:40 AM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch adds a few more vector interfaces listed in the ELFv2 ABI v1.1: missing flavors of vec_madd, vec_pmsum_be, and vec_shasigma_be. Existing tests have been updated to check for correct code gen. Tested on powerpc64le-unknown-linux-gnu with no regressions. Ok for trunk? Thanks, Bill [gcc] 2015-08-20 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/altivec.h (vec_pmsum_be): New #define. (vec_shasigma_be): New #define. * config/rs6000/rs6000-builtin.def (VPMSUMB): New BU_P8V_AV2_2. (VPMSUMH): Likewise. (VPMSUMW): Likewise. (VPMSUMD): Likewise. (VPMSUM): New BU_P8V_OVERLOAD_2. * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): New entries for VEC_MADD and VEC_VPMSUM. [gcc/testsuite] 2015-08-20 Bill Schmidt wschm...@linux.vnet.ibm.com * gcc.target/powerpc/altivec-35.c (foo): Add tests for vec_madd. * gcc.target/powerpc/p8vector-builtin-8.c (foo): Add tests for vec_vpmsum_be and vec_shasigma_be. Okay. Thanks, David
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
- Original Message - On 08/20/2015 03:57 PM, Andrew Hughes wrote: - Original Message - On 20/08/15 09:24, Matthias Klose wrote: On 08/20/2015 06:36 AM, Tom Tromey wrote: Andrew No, it isn't. It's still a necessity for initial bootstrapping of Andrew OpenJDK/IcedTea. Andrew Haley said the opposite here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html if you need bootstrapping OpenJDK 6 or OpenJDK 7, then having gcj available for the target platform is required. Starting with OpenJDK 8 you should be able to cross build OpenJDK 8 with an OpenJDK 8 available on the cross platform. It might be possible to cross build older OpenJDK versions, but this usually is painful. Sure, but we don't need GCJ going forward. I don't think that there are any new platforms to which OpenJDK has not been ported which will require GCJ to bootstrap. And even if there are, anybody who needs to do that can (and, indeed, should) use an earlier version of GCJ. It's not going to go away; it will always be in the GCC repos. And because newer versions of GCC may break GCJ (and maybe OpenJDK) it makes more sense to use an old GCC/GCJ for the bootstrapping of an old OpenJDK. I don't see how we don't at present. How else do you solve the chicken-and-egg situation of needing a JDK to build a JDK? I don't see crossing your fingers and hoping there's a binary around somewhere as a very sustainable system. That's what we do with GCC, binutils, etc: we bootstrap. True, but it's more amenable to cross-compilation than older versions of OpenJDK. I guess we've been riding on the fact that we have gcc available at an early stage on new systems and this allows us to get easily to gcj and from there to IcedTea. From a personal point of view, I need gcj to make sure each new IcedTea 1.x and 2.x release bootstraps. Sure, but all that does is test that the GCJ bootstrap still works. And it's probably the only serious use of GCJ left. 
Yes, but that's a feature I'm reluctant to suddenly drop in the late stages of these projects. We don't have it in IcedTea 3.x / OpenJDK 8 and so that usage will go when we drop support for 7. I don't plan to hold my system GCC at GCC 5 for the next decade or however long we plan to support IcedTea 2.x / OpenJDK 7. It's also still noticeably faster building with a native ecj than OpenJDK's javac. It would cause me and others a lot of pain to remove gcj at this point. What exactly is the reason to do so, other than some sudden whim? It's not a sudden whim: it's something we've been discussing for years. The only reason GCJ is still alive is that I committed to keep it going while we still needed it to bootstrap OpenJDK. Maintaining GCJ in GCC is a significant cost, and GCJ has reached the end of its natural life. Classpath is substantially unmaintained, and GCJ doesn't support any recent versions of Java. Ok, I wasn't aware of this work. I follow this list but the only patches I've really seen here are the occasional bumps from Matthias. I don't want to keep it around forever either. Is there a way we can stage the removal rather than going for a straight-out deletion so dependants have more time to adapt to this? For example, can we flag it as deprecated, take it out of defaults and the testsuite, etc. but leave the code there at least for a little while longer? Basically, whatever is needed to stop it being a burden to GCC developers without removing it altogether. Classpath is not unmaintained and has equally been kept going by me over the years for similar reasons. It is overdue a merge into gcj and I've been putting that off, both for want of a suitable point to do so and the need to deal with the mess that is Subversion. If gcj can be just kept around for a few more years, while the older IcedTeas also wind down, I'll do whatever work is needed to keep it going for my purposes, then we can finally remove it.
But dropping it altogether in the next six months is just too soon. Andrew. -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 PGP Key: rsa4096/248BDC07 (hkp://keys.gnupg.net) Fingerprint = EC5A 1F5E C0AD 1D15 8F1F 8F91 3B96 A578 248B DC07
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
- Original Message - Andrew No, it isn't. It's still a necessity for initial bootstrapping of Andrew OpenJDK/IcedTea. Andrew Haley said the opposite here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html Andrew Haley doesn't do releases of IcedTea 1.x and 2.x every three months. I do. Tom -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com)
RE: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Thursday, August 20, 2015 1:13 AM To: Ajit Kumar Agarwal; Richard Biener Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation On 08/15/2015 11:01 AM, Ajit Kumar Agarwal wrote: All: Please find the updated patch with suggestions and feedback incorporated. Thanks Jeff and Richard for the review comments. The following changes were made based on the feedback on the RFC comments and the review of the previous patch. 1. Both tracer and path splitting pass are separate passes, so that two instances of the pass will run in the end, one doing path splitting and one doing tracing, at different times in the optimization pipeline. I'll have to think about this. I'm not sure I agree totally with Richi's assertion that we should share code with the tracer pass, but I'll give it a good looksie. 2. Transform code is shared by the tracer and path splitting passes. The common code is extracted into a function transform_duplicate, placed in tracer.c, and the path splitting pass uses the transform code. OK. I'll take a good look at that. 3. The analysis that populates the basic blocks and traverses them using the Fibonacci heap is commonly used. This cannot be factored out into a new function, as the tracer pass does more analysis based on the profile, and different heuristics are used in the tracer and path splitting passes. Understood. 4. The included headers are minimal, with only what is required for the path splitting pass. Thanks. 5. The earlier patch did the SSA updating with a replace function to preserve the SSA representation required when moving the loop latch node (the same as the join block) to its predecessors, leaving the loop latch node as just a forwarding block. Such replace functions are not required, as suggested by Jeff.
Such replace functions go away with this patch, and the transformed code is factored into a function which is shared between the tracer and path splitting passes. Sounds good. Bootstrapping with i386 and Microblaze target works fine. No regression is seen in Deja GNU tests for Microblaze. There are lesser failures. Mibench/EEMBC benchmarks were run for Microblaze target and a gain of 9.3% is seen in rgbcmy_lite of the EEMBC benchmarks. What do you mean by there are lesser failures? Are you saying there are cases where path splitting generates incorrect code, or cases where path splitting produces code that is less efficient, or something else? I meant there are more Deja GNU testcases passing with the path splitting changes. SPEC 2000 benchmarks were run with i386 target and the following performance number is achieved. INT benchmarks with path splitting(ratio) Vs INT benchmarks without path splitting(ratio) = 3661.225091 vs 3621.520572 That's an impressive improvement. Anyway, I'll start taking a close look at this momentarily. Thanks Regards Ajit Jeff
Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
On 08/20/2015 09:38 AM, Ajit Kumar Agarwal wrote: Bootstrapping with i386 and Microblaze target works fine. No regression is seen in Deja GNU tests for Microblaze. There are lesser failures. Mibench/EEMBC benchmarks were run for Microblaze target and a gain of 9.3% is seen in rgbcmy_lite of the EEMBC benchmarks. What do you mean by there are lesser failures? Are you saying there are cases where path splitting generates incorrect code, or cases where path splitting produces code that is less efficient, or something else? I meant there are more Deja GNU testcases passing with the path splitting changes. Ah, in that case, that's definitely good news! jeff
Re: [PATCH] Missing Skylake -march=/-mtune= option
On 2015.08.13 at 12:31 +0300, Yuri Rumyantsev wrote: Hi All, Here is a patch adding -march/-mtune options for Skylake. http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/desktop-6th-gen-core-family-spec-update.pdf states that BMI1 and BMI2 are not supported. Is this true for all Skylake CPUs? Quote from the pdf: SKD002 CPUID Incorrectly Reports Bit Manipulation Instructions Support Executing CPUID with EAX = 7 and ECX = 0 may return EBX with bits [3] and [8] set, incorrectly indicating the presence of BMI1 and BMI2 instruction set extensions. Attempting to use instructions from the BMI1 or BMI2 instruction set extensions will result in a #UD exception. -- Markus
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
- Original Message - On 20/08/15 09:24, Matthias Klose wrote: On 08/20/2015 06:36 AM, Tom Tromey wrote: Andrew No, it isn't. It's still a necessity for initial bootstrapping of Andrew OpenJDK/IcedTea. Andrew Haley said the opposite here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html if you need bootstrapping OpenJDK 6 or OpenJDK 7, then having gcj available for the target platform is required. Starting with OpenJDK 8 you should be able to cross build OpenJDK 8 with an OpenJDK 8 available on the cross platform. It might be possible to cross build older OpenJDK versions, but this usually is painful. Sure, but we don't need GCJ going forward. I don't think that there are any new platforms to which OpenJDK has not been ported which will require GCJ to bootstrap. And even if there are, anybody who needs to do that can (and, indeed, should) use an earlier version of GCJ. It's not going to go away; it will always be in the GCC repos. And because newer versions of GCC may break GCJ (and maybe OpenJDK) it makes more sense to use an old GCC/GCJ for the bootstrapping of an old OpenJDK. I don't see how we don't at present. How else do you solve the chicken-and-egg situation of needing a JDK to build a JDK? I don't see crossing your fingers and hoping there's a binary around somewhere as a very sustainable system. From a personal point of view, I need gcj to make sure each new IcedTea 1.x and 2.x release bootstraps. I don't plan to hold my system GCC at GCC 5 for the next decade or however long we plan to support IcedTea 2.x / OpenJDK 7. It's also still noticeably faster building with a native ecj than OpenJDK's javac. It would cause me and others a lot of pain to remove gcj at this point. What exactly is the reason to do so, other than some sudden whim? Andrew. -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. 
(http://www.redhat.com)
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
On 08/20/2015 03:57 PM, Andrew Hughes wrote: - Original Message - On 20/08/15 09:24, Matthias Klose wrote: On 08/20/2015 06:36 AM, Tom Tromey wrote: Andrew No, it isn't. It's still a necessity for initial bootstrapping of Andrew OpenJDK/IcedTea. Andrew Haley said the opposite here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html if you need bootstrapping OpenJDK 6 or OpenJDK 7, then having gcj available for the target platform is required. Starting with OpenJDK 8 you should be able to cross build OpenJDK 8 with an OpenJDK 8 available on the cross platform. It might be possible to cross build older OpenJDK versions, but this usually is painful. Sure, but we don't need GCJ going forward. I don't think that there are any new platforms to which OpenJDK has not been ported which will require GCJ to bootstrap. And even if there are, anybody who needs to do that can (and, indeed, should) use an earlier version of GCJ. It's not going to go away; it will always be in the GCC repos. And because newer versions of GCC may break GCJ (and maybe OpenJDK) it makes more sense to use an old GCC/GCJ for the bootstrapping of an old OpenJDK. I don't see how we don't at present. How else do you solve the chicken-and-egg situation of needing a JDK to build a JDK? I don't see crossing your fingers and hoping there's a binary around somewhere as a very sustainable system. That's what we do with GCC, binutils, etc: we bootstrap. From a personal point of view, I need gcj to make sure each new IcedTea 1.x and 2.x release bootstraps. Sure, but all that does is test that the GCJ bootstrap still works. And it's probably the only serious use of GCJ left. I don't plan to hold my system GCC at GCC 5 for the next decade or however long we plan to support IcedTea 2.x / OpenJDK 7. It's also still noticeably faster building with a native ecj than OpenJDK's javac. It would cause me and others a lot of pain to remove gcj at this point. 
What exactly is the reason to do so, other than some sudden whim? It's not a sudden whim: it's something we've been discussing for years. The only reason GCJ is still alive is that I committed to keep it going while we still needed it to bootstrap OpenJDK. Maintaining GCJ in GCC is a significant cost, and GCJ has reached the end of its natural life. Classpath is substantially unmaintained, and GCJ doesn't support any recent versions of Java. Andrew.
RE: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Thursday, August 20, 2015 3:16 AM To: Ajit Kumar Agarwal; Richard Biener Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation On 08/15/2015 11:01 AM, Ajit Kumar Agarwal wrote: From cf2b64cc1d6623424d770f2a9ea257eb7e58e887 Mon Sep 17 00:00:00 2001 From: Ajit Kumar Agarwalajit...@xilix.com Date: Sat, 15 Aug 2015 18:19:14 +0200 Subject: [PATCH] [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation. Added a new pass on path splitting on tree SSA representation. The path splitting optimization does the CFG transformation of join block of the if-then-else same as the loop latch node is moved and merged with the predecessor blocks after preserving the SSA representation. ChangeLog: 2015-08-15 Ajit Agarwalajit...@xilinx.com * gcc/Makefile.in: Add the build of the new file tree-ssa-path-split.c Instead: * Makefile.in (OBJS): Add tree-ssa-path-split.o. * gcc/opts.c (OPT_ftree_path_split) : Add an entry for Path splitting pass with optimization flag greater and equal to O2. * opts.c (default_options_table): Add entry for path splitting optimization at -O2 and above. * gcc/passes.def (path_split): add new path splitting pass. Capitalize add. * gcc/tree-ssa-path-split.c: New. Use New file. * gcc/tracer.c (transform_duplicate): New. Use New function. * gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c: New. * gcc/testsuite/gcc.dg/path-split-1.c: New. These belong in gcc/testsuite/ChangeLog and remove the gcc/testsuite prefix. * gcc/doc/invoke.texi (ftree-path-split): Document. (fdump-tree-path_split): Document. Should just be two lines instead of three. And more generally, there's no need to prefix ChangeLog entries with gcc/. Now that the ChangeLog nits are out of the way, let's get to stuff that's more interesting. 
I will incorporate all the above changes in the upcoming patches. Signed-off-by:Ajit agarwalajit...@xilinx.com --- gcc/Makefile.in | 1 + gcc/common.opt | 4 + gcc/doc/invoke.texi | 16 +- gcc/opts.c | 1 + gcc/passes.def | 1 + gcc/testsuite/gcc.dg/path-split-1.c | 65 ++ gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c | 60 + gcc/timevar.def | 1 + gcc/tracer.c | 37 +-- gcc/tree-pass.h | 1 + gcc/tree-ssa-path-split.c| 330 +++ 11 files changed, 503 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/path-split-1.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c create mode 100644 gcc/tree-ssa-path-split.c diff --git a/gcc/common.opt b/gcc/common.opt index e80eadf..1d02582 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2378,6 +2378,10 @@ ftree-vrp Common Report Var(flag_tree_vrp) Init(0) Optimization Perform Value Range Propagation on trees +ftree-path-split +Common Report Var(flag_tree_path_split) Init(0) Optimization +Perform Path Splitting Maybe Perform Path Splitting for loop backedges or something which is a little more descriptive. The above isn't exactly right, so don't use it as-is. @@ -9068,6 +9075,13 @@ enabled by default at @option{-O2} and higher. Null pointer check elimination is only done if @option{-fdelete-null-pointer-checks} is enabled. +@item -ftree-path-split +@opindex ftree-path-split +Perform Path Splitting on trees. The join blocks of IF-THEN-ELSE same +as loop latch node is moved to its predecessor and the loop latch node +will be forwarding block. This is enabled by default at @option{-O2} +and higher. Needs some work. Maybe something along the lines of When two paths of execution merge immediately before a loop latch node, try to duplicate the merge node into the two paths. I will incorporate all the above changes. diff --git a/gcc/passes.def b/gcc/passes.def index 6b66f8f..20ddf3d 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -82,6 +82,7 @@ along with GCC; see the file COPYING3. 
If not see NEXT_PASS (pass_ccp); /* After CCP we rewrite no longer addressed locals into SSA form if possible. */ + NEXT_PASS (pass_path_split); NEXT_PASS (pass_forwprop); NEXT_PASS (pass_sra_early); I can't recall if we've discussed the location of the pass at all. I'm not objecting to this location, but would like to hear why you chose this particular location in the optimization pipeline. I have placed the path
Re: [PR25529] Convert (unsigned t * 2)/2 into unsigned (t 0x7FFFFFFF)
On Fri, Aug 7, 2015 at 1:43 AM, Hurugalawadi, Naveen naveen.hurugalaw...@caviumnetworks.com wrote: Hi, extend it - it should also work for non-INTEGER_CST divisors and it should work for any kind of division, not just exact_div. Please find attached the patch pr25529.patch that implements the pattern for all divisors. Please review and let me know if it's okay. Regression tested on AArch64 and x86_64. Thanks, Naveen 2015-08-07 Naveen H.S naveen.hurugalaw...@caviumnetworks.com PR middle-end/25529 gcc/ChangeLog: * match.pd (div (mult @0 @1) @1) : New simplifier. This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67298 -- H.J.
Re: [gomp4] New reduction infrastructure for OpenACC
Sigh, pdf's get filtered. Let's try some raw tex ... Here's the design document for the reduction implementation nathan \documentclass[a4paper]{article} \newcommand{\brk}{\linebreak[0]} \newcommand{\codespc}{\pagebreak[2]\\[1ex plus2pt minus1pt]} \newcommand{\codebrk}{\pagebreak[2]\\} \newcommand{\define}[1]{{\bf #1}} \begingroup \catcode`\~ = 12 \global\def\twiddle{~} \endgroup \newenvironment{codeblock}% {\begin{quote}\tt\begin{tabbing}% \hspace{1em}\=\hspace{1em}\=\hspace{1em}\=\hspace{1em}\=% \hspace{1em}\=\hspace{1em}\=\hspace{1em}\=\hspace{1em}\=% \kill}% {\end{tabbing}\end{quote}} \begin{document} \title{OpenACC Reductions} \author{Nathan Sidwell} \date{2015-08-13} \maketitle \begin{abstract} This document describes the design of the OpenACC reduction implementation in GCC. It is intended to be sufficiently general in the early part of the compiler, becoming more target-specific once we have entered the target-specific device compiler. \end{abstract} \section{Changes} \begin{itemize} \item[2015-07-28] Initial version \item[2015-07-30] Change internal builtins to not take addresses. \item[2015-08-03] Note about templated builtin expansion. \item[2015-08-07] Discuss reductions at outer parallel construct \item[2015-08-13] Note reductions at outer parallel consume inner gang reduction. Comment on memory barriers. \end{itemize} \section{General Features} We cannot emit anything that depends on a device feature before we've entered the device compiler. This means that anything happening in gimplify or omp-low has to be generic. It has to be sufficiently generic to permit other architectures to implement reductions. Thus, anything emitted here, beyond simply noting the gang/worker/vector level of the execution environment cannot know anything about gang/worker or vector beyond what the abstract specification describes. 
\subsection{Compiler Passes} The following passes are relevant to openACC compilation: \begin{enumerate} \item[Gimplify] This is where variables used in a parallel region are noted, and the required transformation determined -- for instance copy/private/firstprivate etc. These are all made explicit by augmenting the parallel clause itself. \item[Omp-lower] This is where a parallel region is broken out into a separate function, variables are rewritten according to copy, private or whatever. The structure describing where copied variables are located is created for both host and target sides. \item[Omp-expand] This is where loops are converted from serial form into a form to be executed in parallel on multiple threads. The multiple threads are implicit -- abstract functions provide the number of threads and the thread number of the current thread. \item[LTO write/read] The offloaded functions are written out to disc and read back into the device-specific compiler. The host-side compiler continues with its regular compilation. Following this point there are essentially two threads of compilation -- one for host and one for device. \item[Oacc-xform] This new pass is responsible for lowering the abstractions created in omp-lower and/or omp-expand into device-specific code sequences. Such sequences should be regular gimple as much as possible, but may include device-specific builtins. The expansions are done via target-specific hooks and default fallbacks. This step is also responsible for checking launch dimensions. \item[Expand-RTL] This pass expands any remaining internal functions and device-specific builtins into device-specific RTL. \item[Mach-dep-reorg] This pass may perform device specific reorganization. \end{enumerate} \subsection[Fork/Join]{Fork/Join} Omp-expand emits fork and join builtins around gang, worker and vector partitioned regions. In oacc-xform these are deleted by targets that do not need them. 
For PTX the hook deletes gang-level forks and joins, as they are irrelevant to the PTX execution model. They are expanded to PTX pseudo instructions at RTL expansion time, and those are propagated all the way through to the PTX reorg pass where they get turned into actual code. In the PTX case they are not real forks and joins. In the non-forked region we neuter all but one thread at the appropriate level -- they're emulated forks and joins for us. The reduction machinery will hang off these markers, by inserting additional builtins before and after them. These builtins will be expanded by device-specific code in the oacc-xform pass. Whether the expansion of those builtins generates device-specific builtins is a property of the particular backend. In the case of PTX, what each builtin expands to depends on the level of the reduction. \subsection{Reduction objects} The object used for a reduction has two separate lifetimes: \begin{itemize} \item The lifetime before (and after) the reduction loop. We refer to that as the RESULT object. \item The lifetime within the reduction loop. We refer to this as the LOCAL object. \end{itemize} There is
Re: [PATCH] Disable -mbranch-likely for -Os when targeting generic architecture
Robert Suchanek robert.sucha...@imgtec.com writes: The patch below disables generation of the branch-likely instructions for -Os, but only for the generic architecture. Branch-likely may result in some code-size reduction, but the cost of running the code on an R6 core is significant. How about instead splitting PTF_AVOID_BRANCHLIKELY into PTF_AVOID_BRANCHLIKELY_SPEED and PTF_AVOID_BRANCHLIKELY_SIZE? We could have PTF_AVOID_BRANCHLIKELY_ALWAYS as an OR of the two. Anything that does string ops on the architecture is suspicious :-) Thanks, Richard
RE: [PATCH] Disable -mbranch-likely for -Os when targeting generic architecture
Richard Sandiford rdsandif...@googlemail.com writes: Robert Suchanek robert.sucha...@imgtec.com writes: The patch below disables generation of the branch-likely instructions for -Os, but only for the generic architecture. Branch-likely may result in some code-size reduction, but the cost of running the code on an R6 core is significant. How about instead splitting PTF_AVOID_BRANCHLIKELY into PTF_AVOID_BRANCHLIKELY_SPEED and PTF_AVOID_BRANCHLIKELY_SIZE? We could have PTF_AVOID_BRANCHLIKELY_ALWAYS as an OR of the two. This sounds OK and is nicer. Anything that does string ops on the architecture is suspicious :-) You can blame me for this. I advocated the string-comparison approach as I had to do the same thing in gas IIRC for some feature and couldn't think of anything better to suggest. Thanks, Matthew
Re: Should we remove remnants of UWIN support in gcc/config.* files?
On August 20, 2015 5:22:47 PM EDT, Joseph Myers jos...@codesourcery.com wrote: On Thu, 20 Aug 2015, FX wrote: Well, they aren't *targets*, but *host* and *build* systems. Yes, but do we maintain a list of supported host or build systems that would be different from our list of supported targets? I don't think there's such a list. For any such system that's not a supported target to work in practice, it would need a reasonably modern C++ compiler, which probably rules out a lot of systems that have been obsoleted as targets. Couldn't such a list be compiled from major branch release announcements? There should be a deprecated and removed note in two release branch descriptions. Even if we screwed up and forgot to list it in both, it is likely to be in one of them. --joel
Re: [v3 patch] Fix friend declaration so it is visible to name lookup
Hi, On 08/20/2015 10:21 PM, Jonathan Wakely wrote: Jason pointed out this isn't valid, and is going to fail to compile soon with a fix he's making. I seem to remember that at some point we had the exact same issue with some member operator and operator of random. Paolo.
Re: [PATCH] Only accept BUILT_IN_NORMAL stringops for interesting_stringop_to_profile_p
On Thu, Aug 20, 2015 at 5:17 AM, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, as DECL_FUNCTION_CODE is overloaded for builtin functions in different classes, we need to check the builtin class before using fcode. Patch posted below. Bootstrapped on x86_64-suse-linux, OK for trunk? Thanks.

Ugh. The code in the callers already looks like it could have some TLC. For instance, instead of

  fndecl = gimple_call_fndecl (stmt);
  if (!fndecl)
    return false;
  fcode = DECL_FUNCTION_CODE (fndecl);
  if (!interesting_stringop_to_profile_p (fndecl, stmt, size_arg))
    return false;

simply do

  if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
    return false;
  if (!interesting_stringop_to_profile_p (gimple_call_fndecl (stmt), ...))

and similarly for the other caller. interesting_stringop_to_profile_p can also get the function code directly from stmt, removing the redundant first argument, or even do the gimple_call_builtin_p call itself. Mind reworking the patch accordingly? Thanks, Richard.

Index: gcc/value-prof.c
===================================================================
--- gcc/value-prof.c	(revision 141081)
+++ gcc/value-prof.c	(working copy)
@@ -1547,8 +1547,12 @@ gimple_ic_transform (gimple_stmt_iterator *gsi)
 static bool
 interesting_stringop_to_profile_p (tree fndecl, gimple call, int *size_arg)
 {
-  enum built_in_function fcode = DECL_FUNCTION_CODE (fndecl);
+  enum built_in_function fcode;
 
+  if (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
+    return false;
+
+  fcode = DECL_FUNCTION_CODE (fndecl);
   if (fcode != BUILT_IN_MEMCPY && fcode != BUILT_IN_MEMPCPY
       && fcode != BUILT_IN_MEMSET && fcode != BUILT_IN_BZERO)
     return false;
Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 141081)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2015-08-20  Felix Yang  felix.y...@huawei.com
+	    Jiji Jiang  jiangj...@huawei.com
+
+	* value-prof.c (interesting_stringop_to_profile_p): Only accept string
+	operations which belong to the BUILT_IN_NORMAL builtin class.
+
 2015-08-18  Segher Boessenkool  seg...@kernel.crashing.org
 
 	Backport from mainline:
Re: Move some flag_unsafe_math_optimizations using simplify and match
On Thu, 20 Aug 2015, Richard Biener wrote: On Thu, Aug 20, 2015 at 7:38 AM, Marc Glisse marc.gli...@inria.fr wrote: On Thu, 20 Aug 2015, Hurugalawadi, Naveen wrote:

The following testcase does not generate x as needed.

double t (double x)
{
  x = sqrt (x) * sqrt (x);
  return x;
}

With -fno-math-errno, we CSE the calls to sqrt, so I would expect this to match:

  (mult (SQRT@1 @0) @1)

Without the flag, I expect that this one will apply:

  (simplify
   (mult (SQRT:s @0) (SQRT:s @1))
   (SQRT (mult @0 @1)))

and then maybe we have something converting sqrt(x*x) to abs(x), or maybe not.

ICK. I'd rather have CSE still CSE the two calls by adding some tricks regarding errno...

I wonder if all the unsafe math optimizations are really ok without -fno-math-errno... Well, on GIMPLE they will preserve the original calls because of their side effects setting errno... on GENERIC probably not. But we are also introducing new math calls, and I am afraid those might set errno at an unexpected place in the code... I don't know if anyone interested in errno would ever use -funsafe-math-optimizations, though. -- Marc Glisse
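[Editorial note: put together, the pattern under discussion would look roughly like the match.pd sketch below. The pattern shape follows the ChangeLog entry "(mult (SQRT@1 @0) @1) : New simplifier"; the exact guarding condition is an assumption, not the final committed form.]

```
(if (flag_unsafe_math_optimizations)
 /* sqrt(x) * sqrt(x) -> x.  Relies on CSE having unified the two
    sqrt calls into a single SSA name, which only happens with
    -fno-math-errno (otherwise each call has an errno side effect).  */
 (simplify
  (mult (SQRT@1 @0) @1)
  @0))
```

Here `@1` captures the result of the `SQRT` call itself, so the second operand of the `mult` must be the very same SSA name, which is why the transformation regresses when errno side effects keep the two calls distinct.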
Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning
On August 20, 2015 5:52:55 PM GMT+02:00, Andrew Hughes gnu.and...@redhat.com wrote: - Original Message - On 08/20/2015 03:57 PM, Andrew Hughes wrote: - Original Message - On 20/08/15 09:24, Matthias Klose wrote: On 08/20/2015 06:36 AM, Tom Tromey wrote: Andrew: No, it isn't. It's still a necessity for initial bootstrapping of OpenJDK/IcedTea. Andrew Haley said the opposite here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00537.html If you need to bootstrap OpenJDK 6 or OpenJDK 7, then having gcj available for the target platform is required. Starting with OpenJDK 8 you should be able to cross build OpenJDK 8 with an OpenJDK 8 available on the cross platform. It might be possible to cross build older OpenJDK versions, but this usually is painful. Sure, but we don't need GCJ going forward. I don't think that there are any new platforms to which OpenJDK has not been ported which will require GCJ to bootstrap. And even if there are, anybody who needs to do that can (and, indeed, should) use an earlier version of GCJ. It's not going to go away; it will always be in the GCC repos. And because newer versions of GCC may break GCJ (and maybe OpenJDK) it makes more sense to use an old GCC/GCJ for the bootstrapping of an old OpenJDK. I don't see how we don't at present. How else do you solve the chicken-and-egg situation of needing a JDK to build a JDK? I don't see crossing your fingers and hoping there's a binary around somewhere as a very sustainable system. That's what we do with GCC, binutils, etc: we bootstrap. True, but it's more amenable to cross-compilation than older versions of OpenJDK. I guess we've been riding on the fact that we have gcc available at an early stage on new systems, and this allows us to get easily to gcj and from there to IcedTea. From a personal point of view, I need gcj to make sure each new IcedTea 1.x and 2.x release bootstraps. Sure, but all that does is test that the GCJ bootstrap still works. 
And it's probably the only serious use of GCJ left. Yes, but that's a feature I'm reluctant to suddenly drop in the late stages of these projects. We don't have it in IcedTea 3.x / OpenJDK 8, and so that usage will go when we drop support for 7. I don't plan to hold my system GCC at GCC 5 for the next decade or however long we plan to support IcedTea 2.x / OpenJDK 7. It's also still noticeably faster building with a native ecj than OpenJDK's javac. It would cause me and others a lot of pain to remove gcj at this point. What exactly is the reason to do so, other than some sudden whim? It's not a sudden whim: it's something we've been discussing for years. The only reason GCJ is still alive is that I committed to keep it going while we still needed it to bootstrap OpenJDK. Maintaining GCJ in GCC is a significant cost, and GCJ has reached the end of its natural life. Classpath is substantially unmaintained, and GCJ doesn't support any recent versions of Java. Ok, I wasn't aware of this work. I follow this list, but the only patches I've really seen here are the occasional bumps from Matthias. I don't want to keep it around forever either. Is there a way we can stage the removal rather than going for a straight-out deletion, so dependants have more time to adapt to this? For example, can we flag it as deprecated, take it out of defaults and the testsuite, etc., but leave the code there at least for a little while longer? Basically, whatever is needed to stop it being a burden to GCC developers without removing it altogether. Having classpath (with binary files!) in the GCC SVN (or future git) repository is a significant burden, not to mention the size of the distributed source tarball. If we can get rid of that, it would be a great step in reducing the burden. Iff we can, even without classpath, build enough of java to be useful (do you really need gcj or only gij for bootstrapping openjdk? After all, ecj is just a drop-in to gcc as well). Richard. 
Classpath is not unmaintained and has equally been kept going by me over the years for similar reasons. It is overdue a merge into gcj and I've been putting that off, both for want of a suitable point to do so and the need to deal with the mess that is Subversion. If gcj can be just kept around for a few more years, while the older IcedTeas also wind down, I'll do whatever work is needed to keep it going for my purposes, then we can finally remove it. But dropping it altogether in the next six months is just too soon. Andrew.