Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
Mike Stump mikest...@comcast.net writes: Constructors are allowed, but PODs are often passed more efficiently. That property seemed particularly important for double_int. Show us the difference in timing. Show us the generated code. I can't imagine that it could ever matter. I'm also curious about that statement... PODs don't really seem to offer much advantage with modern compilers, except in a few very specific cases (of which this doesn't seem to be one), e.g. in unions. -miles -- values of β will give rise to dom!
Re: Value type of map need not be default copyable
On Tue, 7 Aug 2012, Richard Smith wrote: I've attached a patch for unordered_map which solves the rvalue reference problem. For efficiency, I've created a new _M_emplace_bucket method rather than call emplace directly. I've verified all libstdc++ tests pass (sorry for the previous oversight) and am running the full GCC test suite now. However, I'd appreciate any feedback on whether this is a reasonable approach. STL hacking is way outside my comfort zone. ;-) If this looks good, I'll take a stab at std::map. I think you should remove the mapped_type() argument from the call to _M_emplace_bucket. In C++11, the mapped_type is not required to be copyable at all, just to be DefaultInsertable. Indeed. The reason I was talking about emplace is that you want an object to be created only at the time the node is created. That might mean passing piecewise_construct_t and an empty tuple to emplace (otherwise it is too similar to insert). Or for unordered_map where the node functions are exposed, you could just create the node directly without passing through emplace. -- Marc Glisse
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On Wed, Aug 8, 2012 at 9:09 AM, Miles Bader mi...@gnu.org wrote: Mike Stump mikest...@comcast.net writes: Constructors are allowed, but PODs are often passed more efficiently. That property seemed particularly important for double_int. Show us the difference in timing. Show us the generated code. I can't imagine that it could ever matter. I'm also curious about that statement... PODs don't really seem to offer much advantage with modern compilers, except in a few very specific cases (of which this doesn't seem to be one), e.g. in unions. They make a difference for the by-value passing ABI. double-ints can be passed in two registers on most platforms. Richard. -miles -- values of β will give rise to dom!
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On Tue, Aug 7, 2012 at 8:38 PM, Lawrence Crowl cr...@google.com wrote: On 8/7/12, Richard Guenther richard.guent...@gmail.com wrote: On Tue, Aug 7, 2012 at 2:35 AM, Lawrence Crowl cr...@google.com wrote: Convert double_int from a struct with function into a class with operators and methods. This patch adds the methods and operators. In general functions of the form double_int_whatever become member functions whatever or, when possible, operators. Every attempt has been made to preserve the existing algorithms, even at the expense of some optimization opportunities. Some examples: The ext operation takes a value and returns a value. However, that return value is usually assigned to the original variable. An operation that modified a variable would be more efficient. That's not always the case though and I think the interface should be consistent with existing behavior to avoid errors creeping in during the transition. We should probably think about naming conventions for mutating operations, as I expect we will want them eventually. Right. In the end I would prefer explicit constructors. In some cases, an outer sign-specific function calls an inner function with the sign as a parameter, which then decides which implementation to do. Decisions should not be artificially introduced, and the implementation of each case should be exposed as a separate routine. The existing operations are implemented in terms of the new operations, which necessarily adds a layer between the new code and the existing users. Once all files have migrated, this layer will be removed. There are several existing operations implemented in terms of even older legacy operations. This extra layer has not been removed. On occasion though, parameterized functions are often called with a constant argments. To support static statement of intent, and potentially faster code in the future, there are several new unparameterized member functions. Some examples: Four routines now encode both 'arithmetic or logical' and 'right or left' shift as part of the funciton name. Four routines now encode both 'signed or unsigned' and 'less than or greater than' as part of the function name. For most parts overloads that take an (unsigned) HOST_WIDE_INT argument would be nice, as well as the ability to say dbl + 1. Hm. There seems to be significant opinion that there should not be any implicit conversions. I am okay with operations as above, but would like to hear the opinions of others. Well, I'd simply add double_int operator+(HOST_WIDE_INT); double_int operator+(unsigned HOST_WIDE_INT); and be done with it ;) Yes, a tad bit more inconvenient on the implementation side compared to a non-explicit constructor from HOST_WIDE_INT or a conversion operator ... but adhering to the rule that we do _not_ want such automatic conversions (well, yet, for the start). -typedef struct +typedef struct double_int { +public: + /* Normally, we would define constructors to create instances. + Two things prevent us from doing so. + First, defining a constructor makes the class non-POD in C++03, + and we certainly want double_int to be a POD. + Second, the GCC conding conventions prefer explicit conversion, + and explicit conversion operators are not available until C++11. */ + + static double_int make (unsigned HOST_WIDE_INT cst); + static double_int make (HOST_WIDE_INT cst); + static double_int make (unsigned int cst); + static double_int make (int cst); Did we somehow decide to not allow constructors? It's odd to convert to C++ and end up with static member functions resembling them ... Constructors are allowed, but PODs are often passed more efficiently. That property seemed particularly important for double_int. True - I forgot about this issue. Same for the return values. I suppose we need to inspect code quality before going this route. Also I believe the conversion above introduces possible migration errors. Think of a previous HOST_WIDE_INT a; double_int d = uhwi_to_double_int (a); if you write that now as HOST_WIDE_INT a; double_int d = double_int::make (a); you get the effect of shwi_to_double_int. Oops. Hm. I think the code was more likely to be wrong originally, but I take your point. So as an intermediate I'd like you _not_ to introduce the make () overloads. Okay. Btw, if HOST_WIDE_INT == int the above won't even compile. Is that true of any host? I am not aware of any. Anyway, it is moot if we remove the overloads. Right. I'd simply rely on int / unsigned int promotion to HOST_WIDE_INT or unsigned HOST_WIDE_INT and only provide overloads for HOST_WIDE_INT kinds anyway. Richard. -- Lawrence Crowl
Re: [PATCH, Android] Runtime stack protector enabling for Android target
Hi all! I'd like to ask whether stack-protector changes for Android could go to 4.7? Pathes are: http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01089.html http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01157.html Thanks, Igor On Wed, Jul 25, 2012 at 2:08 AM, Maxim Kuvyrkov ma...@codesourcery.com wrote: On 24/07/2012, at 10:08 PM, Uros Bizjak wrote: On Mon, Jul 23, 2012 at 10:00 PM, Igor Zamyatin izamya...@gmail.com wrote: 2012-07-23 Sergey Melnikov sergey.melni...@intel.com * config/i386/i386.md (stack_protect_set): Disable the pattern for Android since Android libc (bionic) does not provide random value for stack protection guard at gs:0x14. Guard value will be provided from external symbol (default implementation). (stack_protect_set_mode): Likewise. (stack_protect_test): Likewise. (stack_protect_test_mode): Likewise. * gcc/defaults.h: Define macro TARGET_HAS_BIONIC to 0 - target does not have Bionic by default * config/linux.h: Redefine macro TARGET_HAS_BIONIC to (OPTION_BIONIC) Macro OPTION_BIONIC is defined in this file and provides Bionic accessibility status Looks OK to me, patch needs approval from Maxim. OK. Thanks for fixing this. -- Maxim Kuvyrkov CodeSourcery / Mentor Graphics
Re: PATCH: PR rtl-optimization/54157: [x32] -maddress-mode=long failures
H.J. Lu hjl.to...@gmail.com writes: diff --git a/gcc/combine.c b/gcc/combine.c index a07c046..b9a3589 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -10784,12 +10784,30 @@ gen_lowpart_for_combine (enum machine_mode omode, rtx x) if (omode == imode) return x; - /* Return identity if this is a CONST or symbolic reference. */ - if (omode == Pmode - (GET_CODE (x) == CONST - || GET_CODE (x) == SYMBOL_REF - || GET_CODE (x) == LABEL_REF)) -return x; + if (omode == Pmode) +{ + /* Return identity if this is a symbolic reference. */ + if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF) + return x; + + /* Return identity for CONST unless this is a PLUS of 2 constant + operands. */ + if (GET_CODE (x) == CONST) + { + rtx y = XEXP (x, 0); + if (GET_CODE (y) == PLUS +((CONST_INT_P (XEXP (y, 1)) + (GET_CODE (XEXP (y, 0)) == CONST +|| GET_CODE (XEXP (y, 0)) == SYMBOL_REF +|| GET_CODE (XEXP (y, 0)) == LABEL_REF)) + || (CONST_INT_P (XEXP (y, 1)) +(GET_CODE (XEXP (y, 0)) == CONST + || GET_CODE (XEXP (y, 0)) == SYMBOL_REF + || GET_CODE (XEXP (y, 0)) == LABEL_REF + goto fail; + return x; + } +} /* We can only support MODE being wider than a word if X is a constant integer or has a mode the same size. */ works for the testcase. My point was that the return x is always wrong. Whenever we return x here, we know we're returning something in a different mode from the one that the caller wanted. Returning a Pmode LABEL_REF might not trigger that plus_constant assert, but it's still wrong. It looks like this came from the mips-rewrite branch: 2003-03-13 Richard Sandiford rsand...@redhat.com * combine.c (gen_lowpart_for_combine): Treat the lowpart Pmode of a CONST as identity. Check the return value of gen_lowpart_common. so I can categorically confirm that the person who wrote it didn't know what they were doing. It also means that this case was motivated by an experiment to make Pmode == DImode for n32, which we ended up discarding because it produced worse code. If this case really is important, we might consider using convert_memory_address (Pmode, x) instead. I'm not sure whether that would be right for targets with address spaces though, because we don't know which address space is associated with the address. Hopefully someone who knows address spaces can comment. If it is correct, then it should probably go in gen_lowpart_common rather than gen_lowpart_for_combine. But given how few people have hit this, it doesn't look like a particularly important attempted optimisation. I'll pre-approve a patch that undoes my mistake and simply removes: /* Return identity if this is a CONST or symbolic reference. */ if (omode == Pmode (GET_CODE (x) == CONST || GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)) return x; Richard
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
Richard Guenther richard.guent...@gmail.com writes: Constructors are allowed, but PODs are often passed more efficiently. That property seemed particularly important for double_int. Show us the difference in timing. Show us the generated code. I can't imagine that it could ever matter. I'm also curious about that statement... PODs don't really seem to offer much advantage with modern compilers, except in a few very specific cases (of which this doesn't seem to be one), e.g. in unions. They make a difference for the by-value passing ABI. double-ints can be passed in two registers on most platforms. Sure, but that doesn't seem to depend on PODness -- non-PODs can be passed in two registers as well, AFAICS... E.g., in the following: typedef long valtype; struct X { valtype x, y; }; struct Y { Y (valtype a, valtype b) : x (a), y (b) { } valtype x, y; }; extern void fx (X x); void test_x () {X x = { 1, 2 }; fx (x); } extern void fy (Y y); void test_y () {Y y (1, 2); fy (y); } test_x and test_y use exactly the same calling sequence (and contain exactly the same assembly code)... [on x86-64] Thanks, -miles -- 80% of success is just showing up. --Woody Allen
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On Wed, 8 Aug 2012, Richard Guenther wrote: + static double_int make (unsigned HOST_WIDE_INT cst); + static double_int make (HOST_WIDE_INT cst); + static double_int make (unsigned int cst); + static double_int make (int cst); [...] Btw, if HOST_WIDE_INT == int the above won't even compile. Is that true of any host? I am not aware of any. Anyway, it is moot if we remove the overloads. Right. I'd simply rely on int / unsigned int promotion to HOST_WIDE_INT or unsigned HOST_WIDE_INT and only provide overloads for HOST_WIDE_INT kinds anyway. Sadly, that doesn't work with the current C++ rules (there is a proposal to change that for C++1y): void f(long); void f(unsigned long); void g(int x){f(x);} e.cc: In function ‘void g(int)’: e.cc:3:18: error: call of overloaded ‘f(int)’ is ambiguous e.cc:3:18: note: candidates are: e.cc:1:6: note: void f(long int) e.cc:2:6: note: void f(long unsigned int) -- Marc Glisse
Commit: RL78: Include tree-pass.h
Hi DJ, I am applying the following patch to the gcc mainline as an obvious fix for the following problem building the RL78 backend: gcc/config/rl78/rl78.c:151:3: error: 'PASS_POS_INSERT_BEFORE' undeclared here (not in a function) Cheers Nick gcc/ChangeLog 2012-08-08 Nick Clifton ni...@redhat.com * config/rl78/rl78.c: Include tree-pass.h. Index: gcc/config/rl78/rl78.c === --- gcc/config/rl78/rl78.c (revision 190223) +++ gcc/config/rl78/rl78.c (working copy) @@ -48,6 +48,7 @@ #include langhooks.h #include rl78-protos.h #include dumpfile.h +#include tree-pass.h static inline bool is_interrupt_func (const_tree decl); static inline bool is_brk_interrupt_func (const_tree decl);
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On Wed, Aug 8, 2012 at 10:33 AM, Marc Glisse marc.gli...@inria.fr wrote: On Wed, 8 Aug 2012, Richard Guenther wrote: + static double_int make (unsigned HOST_WIDE_INT cst); + static double_int make (HOST_WIDE_INT cst); + static double_int make (unsigned int cst); + static double_int make (int cst); [...] Btw, if HOST_WIDE_INT == int the above won't even compile. Is that true of any host? I am not aware of any. Anyway, it is moot if we remove the overloads. Right. I'd simply rely on int / unsigned int promotion to HOST_WIDE_INT or unsigned HOST_WIDE_INT and only provide overloads for HOST_WIDE_INT kinds anyway. Sadly, that doesn't work with the current C++ rules (there is a proposal to change that for C++1y): void f(long); void f(unsigned long); void g(int x){f(x);} e.cc: In function ‘void g(int)’: e.cc:3:18: error: call of overloaded ‘f(int)’ is ambiguous e.cc:3:18: note: candidates are: e.cc:1:6: note: void f(long int) e.cc:2:6: note: void f(long unsigned int) Ick ... I forgot that integer promotions do not apply for int - long. So even f(1) would be ambiguous, right? So I suppose we have to bite the bullet and add overloads for all (unsigned) integer types from int to HOST_WIDE_INT (if HOST_WIDE_INT is long long then we have to add overloads for int, unsigned int, long and unsigned long). Or use template magic to force promotion to HOST_WIDE_INT for integer types smaller than HOST_WIDE_INT... Richard. -- Marc Glisse
Re: Commit: RL78: Include tree-pass.h
On Wed, Aug 8, 2012 at 10:28 AM, Nick Clifton ni...@redhat.com wrote: Hi DJ, I am applying the following patch to the gcc mainline as an obvious fix for the following problem building the RL78 backend: gcc/config/rl78/rl78.c:151:3: error: 'PASS_POS_INSERT_BEFORE' undeclared here (not in a function) Err - you are inside the compiler and should not use plugin stuff to register your machine dependent pass. Cheers Nick gcc/ChangeLog 2012-08-08 Nick Clifton ni...@redhat.com * config/rl78/rl78.c: Include tree-pass.h. Index: gcc/config/rl78/rl78.c === --- gcc/config/rl78/rl78.c (revision 190223) +++ gcc/config/rl78/rl78.c (working copy) @@ -48,6 +48,7 @@ #include langhooks.h #include rl78-protos.h #include dumpfile.h +#include tree-pass.h static inline bool is_interrupt_func (const_tree decl); static inline bool is_brk_interrupt_func (const_tree decl);
Re: Commit: RL78: Include tree-pass.h
Hi Richard, Err - you are inside the compiler and should not use plugin stuff to register your machine dependent pass. Umm, OK, what is the correct method for registering target specific passes ? (Ones that need to run at times other than TARGET_MACHINE_DEPENDENT_REORG). Cheers Nick
Re: Commit: RL78: Include tree-pass.h
On Wed, Aug 8, 2012 at 11:02 AM, nick clifton ni...@redhat.com wrote: Hi Richard, Err - you are inside the compiler and should not use plugin stuff to register your machine dependent pass. Umm, OK, what is the correct method for registering target specific passes ? (Ones that need to run at times other than TARGET_MACHINE_DEPENDENT_REORG). Have two machine_reorg passes with different target hooks. Or schedule it twice and have a target hook that specifies at which position it should actually run. Richard. Cheers Nick
Re: [PATCH, Android] Runtime stack protector enabling for Android target
On Wed, Aug 8, 2012 at 9:54 AM, Igor Zamyatin izamya...@gmail.com wrote: I'd like to ask whether stack-protector changes for Android could go to 4.7? Pathes are: http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01089.html http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01157.html OK, as far as x86 is concerned. Thanks, Uros.
Fix simplify-rtx losing side effects in operand of IOR
simplify-rtx.c:simplify_binary_operation_1 simplifies certain cases of IOR to constants, but fails to check when doing so that it hasn't lost side effects in the nonconstant operand. This can cause wrong code generation (when called from combine) in certain cases where the constant operand only becomes visible for optimization at RTL optimization time (otherwise an equivalent optimization will happen earlier), and the nonconstant operand is a memory operand with a pre/post-modify address; the fix, adding side_effects_p calls, is straightforward (and I didn't find such issues elsewhere in simplify-rtx.c). The above conditions make this problem quite hard to trigger, but it shows up in pcretest on ARM with -O2 -funroll-loops, and the patch includes an artificial testcase that shows the problem (again, -O2 -funroll-loops; or -O3 -fomit-frame-pointer -funroll-loops as used by c-torture; I don't have an example that shows the issue without -funroll-loops). Tested with no regressions with cross to arm-none-linux-gnueabi (--with-mode=thumb --with-arch=armv7-a), and bootstrapped with no regressions on x86_64-unknown-linux-gnu. OK to commit? 2012-08-08 Joseph Myers jos...@codesourcery.com * simplify-rtx.c (simplify_binary_operation_1): Do not simplify IOR to a constant if one operand has side effects. testsuite: 2012-08-08 Joseph Myers jos...@codesourcery.com * gcc.c-torture/execute/20120808-1.c: New test. Index: testsuite/gcc.c-torture/execute/20120808-1.c === --- testsuite/gcc.c-torture/execute/20120808-1.c(revision 0) +++ testsuite/gcc.c-torture/execute/20120808-1.c(revision 0) @@ -0,0 +1,37 @@ +extern void exit (int); +extern void abort (void); + +volatile int i; +unsigned char *volatile cp; +unsigned char d[32] = { 0 }; + +int +main (void) +{ + unsigned char c[32] = { 0 }; + unsigned char *p = d + i; + int j; + for (j = 0; j 30; j++) +{ + int x = 0xff; + int y = *++p; + switch (j) + { + case 1: x ^= 2; break; + case 2: x ^= 4; break; + case 25: x ^= 1; break; + default: break; + } + c[j] = y | x; + cp = p; +} + if (c[0] != 0xff + || c[1] != 0xfd + || c[2] != 0xfb + || c[3] != 0xff + || c[4] != 0xff + || c[25] != 0xfe + || cp != d + 30) +abort (); + exit (0); +} Index: simplify-rtx.c === --- simplify-rtx.c (revision 190186) +++ simplify-rtx.c (working copy) @@ -2420,7 +2420,9 @@ simplify_binary_operation_1 (enum rtx_co case IOR: if (trueop1 == CONST0_RTX (mode)) return op0; - if (INTEGRAL_MODE_P (mode) trueop1 == CONSTM1_RTX (mode)) + if (INTEGRAL_MODE_P (mode) + trueop1 == CONSTM1_RTX (mode) + !side_effects_p (op0)) return op1; if (rtx_equal_p (trueop0, trueop1) ! side_effects_p (op0)) return op0; @@ -2434,7 +2436,8 @@ simplify_binary_operation_1 (enum rtx_co /* (ior A C) is C if all bits of A that might be nonzero are on in C. */ if (CONST_INT_P (op1) HWI_COMPUTABLE_MODE_P (mode) - (nonzero_bits (op0, mode) ~UINTVAL (op1)) == 0) + (nonzero_bits (op0, mode) ~UINTVAL (op1)) == 0 + !side_effects_p (op0)) return op1; /* Canonicalize (X C1) | C2. */ -- Joseph S. Myers jos...@codesourcery.com
[AArch64] Merge from upstream trunk reverted
Hi, I've just reverted my recent merge from upstream trunk on the aarch64-branch (r190119). A cleaner and broader merge will follow. Thanks Sofiane
Re: [PATCH, Android] Runtime stack protector enabling for Android target
On 8/08/2012, at 9:46 PM, Uros Bizjak wrote: On Wed, Aug 8, 2012 at 9:54 AM, Igor Zamyatin izamya...@gmail.com wrote: I'd like to ask whether stack-protector changes for Android could go to 4.7? Pathes are: http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01089.html http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01157.html OK, as far as x86 is concerned. Strictly speaking, these fixes are not for a regression, so do not readily qualify for a release branch. However, since the patches are no-ops on targets other than x86 and Uros OK'd them for x86 ... OK, provided that the patches in the above threads apply without conflicts. If there are conflicts, please repost for review. Thanks, -- Maxim Kuvyrkov CodeSourcery / Mentor Graphics
Re: Fix simplify-rtx losing side effects in operand of IOR
Joseph S. Myers jos...@codesourcery.com writes: 2012-08-08 Joseph Myers jos...@codesourcery.com * simplify-rtx.c (simplify_binary_operation_1): Do not simplify IOR to a constant if one operand has side effects. OK, thanks. Richard
Re: [PATCH] GCC Ada/GNAT configuration for GNU/Hurd
* Arnaud Charlet, 2012-06-18 : -#if defined (__linux__) !defined (_XOPEN_SOURCE) +#if (defined (__linux__) || defined (__GNU__)) !defined (_XOPEN_SOURCE) /** For Linux _XOPEN_SOURCE must be defined, otherwise IOV_MAX is not defined **/ #define _XOPEN_SOURCE 500 You need to update the comment here, since the section so far only applied to GNU/Linux and not GNU/Hurd. In fact, should that perhaps (unverified) simply say »For glibc, _XOPEN_SOURCE must be defined [...]« -- or is this code meant to be usable on GNU/Linux with a C library different from glibc? Posibly yes. Thomas (Quinot), any comment on the above? No strong opinion either way. What is important is that the comment be consistent with what we actually test. If we want the comment to say For glibc, blah... then we need to change the test to something that actually tests for glibc. Thomas. -- Thomas Quinot, Ph.D. ** qui...@adacore.com ** Senior Software Engineer AdaCore -- Paris, France -- New York, USA
[PATCH,i386] fma,fma4 and xop flags
Hello, Bdver2 cpu supports both fma and fma4 instructions. Previous to patch, option -mno-xop removes -mfma4. Similarly, option -mno-fma4 removes -mxop. So, the patch conditionally disables -mfma or -mfma4. Enabling -mxop is done by also checking -mfma. Ok for trunk? Regards Ganesh 2012-08-08 Ganesh Gopalasubramanian ganesh.gopalasubraman...@amd.com * common/config/i386/i386-common.c (ix86_handle_option): Reset fma flag after checking fma4. Reset fma4 flag after checking fma. Set xop flag after checking fma flags. Index: gcc/common/config/i386/i386-common.c === --- gcc/common/config/i386/i386-common.c(revision 189996) +++ gcc/common/config/i386/i386-common.c(working copy) @@ -310,8 +310,16 @@ } else { - opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA_UNSET; - opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA_UNSET; + if (opts-x_ix86_isa_flags OPTION_MASK_ISA_FMA4) +{ + opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA ; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA; +} + else +{ + opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA_UNSET; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA_UNSET; +} } return true; @@ -359,16 +367,32 @@ } else { - opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA4_UNSET; - opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA4_UNSET; + if (opts-x_ix86_isa_flags OPTION_MASK_ISA_FMA) +{ + opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA4 ; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA4; +} + else +{ + opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA4_UNSET; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA4_UNSET; + } } return true; case OPT_mxop: if (value) { - opts-x_ix86_isa_flags |= OPTION_MASK_ISA_XOP_SET; - opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_XOP_SET; + if (opts-x_ix86_isa_flags OPTION_MASK_ISA_FMA) +{ + opts-x_ix86_isa_flags |= OPTION_MASK_ISA_XOP ; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_XOP; +} +else +{ + opts-x_ix86_isa_flags |= OPTION_MASK_ISA_XOP_SET; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_XOP_SET; + } } else {
Re: [PATCH,i386] fma,fma4 and xop flags
On Wed, Aug 8, 2012 at 1:31 PM, ganesh.gopalasubraman...@amd.com wrote: Hello, Bdver2 cpu supports both fma and fma4 instructions. Previous to patch, option -mno-xop removes -mfma4. Similarly, option -mno-fma4 removes -mxop. Eh? Why's that? I think we should disentangle -mxop and -mfma4 instead. Otherwise, what does -mno-fma4 -mxop do? (it should enable both xop and fma4!) what should -mfma4 -mno-xop do (it should disable both xop and fma4!). All this is just confusing to the user, even if in AMD documents XOP includes FMA4. Richard. So, the patch conditionally disables -mfma or -mfma4. Enabling -mxop is done by also checking -mfma. Ok for trunk? Regards Ganesh 2012-08-08 Ganesh Gopalasubramanian ganesh.gopalasubraman...@amd.com * common/config/i386/i386-common.c (ix86_handle_option): Reset fma flag after checking fma4. Reset fma4 flag after checking fma. Set xop flag after checking fma flags. Index: gcc/common/config/i386/i386-common.c === --- gcc/common/config/i386/i386-common.c(revision 189996) +++ gcc/common/config/i386/i386-common.c(working copy) @@ -310,8 +310,16 @@ } else { - opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA_UNSET; - opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA_UNSET; + if (opts-x_ix86_isa_flags OPTION_MASK_ISA_FMA4) +{ + opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA ; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA; +} + else +{ + opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA_UNSET; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA_UNSET; +} } return true; @@ -359,16 +367,32 @@ } else { - opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA4_UNSET; - opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA4_UNSET; + if (opts-x_ix86_isa_flags OPTION_MASK_ISA_FMA) +{ + opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA4 ; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA4; +} + else +{ + opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_FMA4_UNSET; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_FMA4_UNSET; + } } return true; case OPT_mxop: if (value) { - opts-x_ix86_isa_flags |= OPTION_MASK_ISA_XOP_SET; - opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_XOP_SET; + if (opts-x_ix86_isa_flags OPTION_MASK_ISA_FMA) +{ + opts-x_ix86_isa_flags |= OPTION_MASK_ISA_XOP ; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_XOP; +} +else +{ + opts-x_ix86_isa_flags |= OPTION_MASK_ISA_XOP_SET; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_XOP_SET; + } } else {
Re: [RFH / Patch] PR 54191
On 08/08/2012 01:57 PM, Paolo Carlini wrote: this also triggers the static_assert. Really, in 'decltype(B{declvalFrom()})' almost *everything* is Ok between the curly brackets. Maybe we should have a separate PR for this. And I think this issue is addressed by the ongoing work on instantiation dependency, thus C++/51222 etc. AFAICS, the most uncertain case is the conditional operator test, otherwise we could split the PR. Paolo.
[Test] contrib/test_installed modified to set specific gcov
Hi, while running check for Android NDK compiler (I've used contrib/test_installed for it) I've noticed that gcov name is hardcoded in g++.dg/gcov/gcov.exp. All NDK x86 tools have prefix like i686-linux-android-, so testing will fail to spawn gcov. The patch attached introduces --with-gcov flag of contrib/test_installed so one could set specific gcov for testing. As workaround we could create a wrapper script named 'gcov' pointing to specific gcov in directory where GCC_UNDER_TEST resides. What do you think of this patch? Do you find it usefull? Thanks in advance. Anna gcov.p Description: Binary data
Re: [RFH / Patch] PR 54191
On 08/08/2012 02:47 PM, Paolo Carlini wrote: On 08/08/2012 01:57 PM, Paolo Carlini wrote: this also triggers the static_assert. Really, in 'decltype(B{declvalFrom()})' almost *everything* is Ok between the curly brackets. Maybe we should have a separate PR for this. And I think this issue is addressed by the ongoing work on instantiation dependency, thus C++/51222 etc. AFAICS, the most uncertain case is the conditional operator test, otherwise we could split the PR. I just confirmed that the pending patch for C++/51222 fixes the first 4 tests. Paolo.
Re: Value type of map need not be default copyable
On 08/08/2012 09:34 AM, Marc Glisse wrote: On Tue, 7 Aug 2012, Richard Smith wrote: I've attached a patch for unordered_map which solves the rvalue reference problem. For efficiency, I've created a new _M_emplace_bucket method rather than call emplace directly. I've verified all libstdc++ tests pass (sorry for the previous oversight) and am running the full GCC test suite now. However, I'd appreciate any feedback on whether this is a reasonable approach. STL hacking is way outside my comfort zone. ;-) If this looks good, I'll take a stab at std::map. I think you should remove the mapped_type() argument from the call to _M_emplace_bucket. In C++11, the mapped_type is not required to be copyable at all, just to be DefaultInsertable. Indeed. The reason I was talking about emplace is that you want an object to be created only at the time the node is created. That might mean passing piecewise_construct_t and an empty tuple to emplace (otherwise it is too similar to insert). Or for unordered_map where the node functions are exposed, you could just create the node directly without passing through emplace. This is what I try to do in the attached patch. I replace _M_insert_bucket with _M_insert_node and use it for operator[] implementation. I have also introduce a special std::pair constructor for container usage so that we do not have to include the whole tuple stuff just for associative container implementations. However one test is failing: /home/fdt/dev/gcc/libstdc++-v3/testsuite/23_containers/unordered_map/insert/array_syntax_move.cc:39:18: required from here /home/fdt/dev/gcc-build/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/stl_pair.h:175:42: error: use of deleted function '__gnu_test::rvalstruct::rvalstruct(const __gnu_test::rvalstruct)' : first(std::forward_T1(__x)), second() { } I don't understand why it doesn't use the move constructor. I can't see any std::forward call missing. Anyone ? François Index: include/bits/hashtable.h === --- include/bits/hashtable.h (revision 190209) +++ include/bits/hashtable.h (working copy) @@ -584,11 +584,10 @@ __node_base* _M_get_previous_node(size_type __bkt, __node_base* __n); - templatetypename _Arg - iterator - _M_insert_bucket(_Arg, size_type, __hash_code); + // Insert the node n. Assumes key doesn't exist + iterator + _M_insert_node(size_type __bkt, __hash_code __code, __node_type* __n); - templatetypename... _Args std::pairiterator, bool _M_emplace(std::true_type, _Args... __args); @@ -1307,54 +1306,49 @@ } } - // Insert v in bucket n (assumes no element with its key already present). + // Insert node in bucket bkt (assumes no element with its key already + // present). Take ownership of the passed node, deallocate it on exception. templatetypename _Key, typename _Value, typename _Alloc, typename _ExtractKey, typename _Equal, typename _H1, typename _H2, typename _Hash, typename _RehashPolicy, typename _Traits -templatetypename _Arg - typename _Hashtable_Key, _Value, _Alloc, _ExtractKey, _Equal, - _H1, _H2, _Hash, _RehashPolicy, - _Traits::iterator - _Hashtable_Key, _Value, _Alloc, _ExtractKey, _Equal, - _H1, _H2, _Hash, _RehashPolicy, _Traits:: - _M_insert_bucket(_Arg __v, size_type __n, __hash_code __code) - { - const __rehash_state __saved_state = _M_rehash_policy._M_state(); - std::pairbool, std::size_t __do_rehash - = _M_rehash_policy._M_need_rehash(_M_bucket_count, - _M_element_count, 1); +typename _Hashtable_Key, _Value, _Alloc, _ExtractKey, _Equal, + _H1, _H2, _Hash, _RehashPolicy, + _Traits::iterator +_Hashtable_Key, _Value, _Alloc, _ExtractKey, _Equal, + _H1, _H2, _Hash, _RehashPolicy, _Traits:: +_M_insert_node(size_type __bkt, __hash_code __code, __node_type* __node) +{ + const __rehash_state __saved_state = _M_rehash_policy._M_state(); + std::pairbool, std::size_t __do_rehash + = _M_rehash_policy._M_need_rehash(_M_bucket_count, + _M_element_count, 1); - if (__do_rehash.first) - { - const key_type __k = this-_M_extract()(__v); - __n = __hash_code_base::_M_bucket_index(__k, __code, + if (__do_rehash.first) + { + const key_type __k = this-_M_extract()(__node-_M_v); + __bkt = __hash_code_base::_M_bucket_index(__k, __code, __do_rehash.second); - } + } - __node_type* __node = nullptr; - __try - { - // Allocate the new node before doing the rehash so that we - // don't do a rehash if the allocation throws. - __node = _M_allocate_node(std::forward_Arg(__v)); - this-_M_store_code(__node, __code); - if (__do_rehash.first) - _M_rehash(__do_rehash.second, __saved_state); + __try + { + if (__do_rehash.first) + _M_rehash(__do_rehash.second, __saved_state); - _M_insert_bucket_begin(__n, __node); -
Re: [PATCH] Intrinsics for ADCX
Here is the patch with some obvious fixes. If there are no objections, could anyone please check it in? Done: http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00203.html Thanks, K
Re: Commit: RL78: Include tree-pass.h
On Wed, Aug 8, 2012 at 1:45 AM, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Aug 8, 2012 at 10:28 AM, Nick Clifton ni...@redhat.com wrote: Hi DJ, I am applying the following patch to the gcc mainline as an obvious fix for the following problem building the RL78 backend: gcc/config/rl78/rl78.c:151:3: error: 'PASS_POS_INSERT_BEFORE' undeclared here (not in a function) Err - you are inside the compiler and should not use plugin stuff to register your machine dependent pass. But we should definitely have a way to register machine dependent passes, and what's wrong with the plugin interface? Ian
Re: Value type of map need not be default copyable
On 08/08/2012 03:15 PM, François Dumont wrote: I have also introduce a special std::pair constructor for container usage so that we do not have to include the whole tuple stuff just for associative container implementations. To be clear: sorry, this is not an option. Paolo.
Re: PATCH: PR rtl-optimization/54157: [x32] -maddress-mode=long failures
On Wed, Aug 8, 2012 at 1:08 AM, Richard Sandiford rdsandif...@googlemail.com wrote: H.J. Lu hjl.to...@gmail.com writes: diff --git a/gcc/combine.c b/gcc/combine.c index a07c046..b9a3589 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -10784,12 +10784,30 @@ gen_lowpart_for_combine (enum machine_mode omode, rtx x) if (omode == imode) return x; - /* Return identity if this is a CONST or symbolic reference. */ - if (omode == Pmode - (GET_CODE (x) == CONST - || GET_CODE (x) == SYMBOL_REF - || GET_CODE (x) == LABEL_REF)) -return x; + if (omode == Pmode) +{ + /* Return identity if this is a symbolic reference. */ + if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF) + return x; + + /* Return identity for CONST unless this is a PLUS of 2 constant + operands. */ + if (GET_CODE (x) == CONST) + { + rtx y = XEXP (x, 0); + if (GET_CODE (y) == PLUS +((CONST_INT_P (XEXP (y, 1)) + (GET_CODE (XEXP (y, 0)) == CONST +|| GET_CODE (XEXP (y, 0)) == SYMBOL_REF +|| GET_CODE (XEXP (y, 0)) == LABEL_REF)) + || (CONST_INT_P (XEXP (y, 1)) +(GET_CODE (XEXP (y, 0)) == CONST + || GET_CODE (XEXP (y, 0)) == SYMBOL_REF + || GET_CODE (XEXP (y, 0)) == LABEL_REF + goto fail; + return x; + } +} /* We can only support MODE being wider than a word if X is a constant integer or has a mode the same size. */ works for the testcase. My point was that the return x is always wrong. Whenever we return x here, we know we're returning something in a different mode from the one that the caller wanted. Returning a Pmode LABEL_REF might not trigger that plus_constant assert, but it's still wrong. It looks like this came from the mips-rewrite branch: 2003-03-13 Richard Sandiford rsand...@redhat.com * combine.c (gen_lowpart_for_combine): Treat the lowpart Pmode of a CONST as identity. Check the return value of gen_lowpart_common. so I can categorically confirm that the person who wrote it didn't know what they were doing. It also means that this case was motivated by an experiment to make Pmode == DImode for n32, which we ended up discarding because it produced worse code. If this case really is important, we might consider using convert_memory_address (Pmode, x) instead. I'm not sure whether that would be right for targets with address spaces though, because we don't know which address space is associated with the address. Hopefully someone who knows address spaces can comment. If it is correct, then it should probably go in gen_lowpart_common rather than gen_lowpart_for_combine. But given how few people have hit this, it doesn't look like a particularly important attempted optimisation. I'll pre-approve a patch that undoes my mistake and simply removes: /* Return identity if this is a CONST or symbolic reference. */ if (omode == Pmode (GET_CODE (x) == CONST || GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)) return x; Richard This is the patch I checked in. Thanks. -- H.J. --- gcc/ 2012-08-08 Richard Sandiford rdsandif...@googlemail.com H.J. Lu hongjiu...@intel.com PR rtl-optimization/54157 * combine.c (gen_lowpart_for_combine): Don't return identity for CONST or symbolic reference. gcc/testsuite/ 2012-08-08 H.J. Lu hongjiu...@intel.com PR rtl-optimization/54157 * gcc.target/i386/pr54157.c: New file. diff --git a/gcc/combine.c b/gcc/combine.c index 495e129..2b91eb9 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -10634,13 +10634,6 @@ gen_lowpart_for_combine (enum machine_mode omode, rtx x) if (omode == imode) return x; - /* Return identity if this is a CONST or symbolic reference. */ - if (omode == Pmode - (GET_CODE (x) == CONST - || GET_CODE (x) == SYMBOL_REF - || GET_CODE (x) == LABEL_REF)) -return x; - /* We can only support MODE being wider than a word if X is a constant integer or has a mode the same size. */ if (GET_MODE_SIZE (omode) UNITS_PER_WORD diff --git a/gcc/testsuite/gcc.target/i386/pr54157.c b/gcc/testsuite/gcc.target/i386/pr54157.c new file mode 100644 index 000..b5c4528 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr54157.c @@ -0,0 +1,21 @@ +/* { dg-do compile { target { ! { ia32 } } } } */ +/* { dg-options -O2 -mx32 -maddress-mode=long -ftree-vectorize } */ + +struct s2{ + int n[24 -1][24 -1][24 -1]; +}; + +struct test2{ + struct s2 e; +}; + +struct test2 tmp2[4]; + +void main1 () +{ + int i,j; + + for (i = 0; i 24 -4; i++) + for (j = 0; j 24 -4; j++) + tmp2[2].e.n[1][i][j] = 8; +}
Re: Value type of map need not be default copyable
On Wed, 8 Aug 2012, François Dumont wrote: On 08/08/2012 09:34 AM, Marc Glisse wrote: On Tue, 7 Aug 2012, Richard Smith wrote: I've attached a patch for unordered_map which solves the rvalue reference problem. For efficiency, I've created a new _M_emplace_bucket method rather than call emplace directly. I've verified all libstdc++ tests pass (sorry for the previous oversight) and am running the full GCC test suite now. However, I'd appreciate any feedback on whether this is a reasonable approach. STL hacking is way outside my comfort zone. ;-) If this looks good, I'll take a stab at std::map. I think you should remove the mapped_type() argument from the call to _M_emplace_bucket. In C++11, the mapped_type is not required to be copyable at all, just to be DefaultInsertable. Indeed. The reason I was talking about emplace is that you want an object to be created only at the time the node is created. That might mean passing piecewise_construct_t and an empty tuple to emplace (otherwise it is too similar to insert). Or for unordered_map where the node functions are exposed, you could just create the node directly without passing through emplace. This is what I try to do in the attached patch. I replace _M_insert_bucket with _M_insert_node and use it for operator[] implementation. I have also introduce a special std::pair constructor for container usage so that we do not have to include the whole tuple stuff just for associative container implementations. However one test is failing: /home/fdt/dev/gcc/libstdc++-v3/testsuite/23_containers/unordered_map/insert/array_syntax_move.cc:39:18: required from here /home/fdt/dev/gcc-build/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/stl_pair.h:175:42: error: use of deleted function '__gnu_test::rvalstruct::rvalstruct(const __gnu_test::rvalstruct)' : first(std::forward_T1(__x)), second() { } I don't understand why it doesn't use the move constructor. I can't see any std::forward call missing. Anyone ? You are dealing with a pairT1,T2 where T1 is const T3. It is roughly the same issue as before, we end up trying to call a move constructor on a const. You probably want your pair constructor to accept T3, not T1. -- Marc Glisse
Re: PATCH: PR rtl-optimization/54157: [x32] -maddress-mode=long failures
On Wed, Aug 8, 2012 at 6:43 AM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Aug 8, 2012 at 3:40 PM, H.J. Lu hjl.to...@gmail.com wrote: rdsandif...@googlemail.com wrote: H.J. Lu hjl.to...@gmail.com writes: diff --git a/gcc/combine.c b/gcc/combine.c index a07c046..b9a3589 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -10784,12 +10784,30 @@ gen_lowpart_for_combine (enum machine_mode omode, rtx x) if (omode == imode) return x; - /* Return identity if this is a CONST or symbolic reference. */ - if (omode == Pmode - (GET_CODE (x) == CONST - || GET_CODE (x) == SYMBOL_REF - || GET_CODE (x) == LABEL_REF)) -return x; + if (omode == Pmode) +{ + /* Return identity if this is a symbolic reference. */ + if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF) + return x; + + /* Return identity for CONST unless this is a PLUS of 2 constant + operands. */ + if (GET_CODE (x) == CONST) + { + rtx y = XEXP (x, 0); + if (GET_CODE (y) == PLUS +((CONST_INT_P (XEXP (y, 1)) + (GET_CODE (XEXP (y, 0)) == CONST +|| GET_CODE (XEXP (y, 0)) == SYMBOL_REF +|| GET_CODE (XEXP (y, 0)) == LABEL_REF)) + || (CONST_INT_P (XEXP (y, 1)) +(GET_CODE (XEXP (y, 0)) == CONST + || GET_CODE (XEXP (y, 0)) == SYMBOL_REF + || GET_CODE (XEXP (y, 0)) == LABEL_REF + goto fail; + return x; + } +} /* We can only support MODE being wider than a word if X is a constant integer or has a mode the same size. */ works for the testcase. My point was that the return x is always wrong. Whenever we return x here, we know we're returning something in a different mode from the one that the caller wanted. Returning a Pmode LABEL_REF might not trigger that plus_constant assert, but it's still wrong. It looks like this came from the mips-rewrite branch: 2003-03-13 Richard Sandiford rsand...@redhat.com * combine.c (gen_lowpart_for_combine): Treat the lowpart Pmode of a CONST as identity. Check the return value of gen_lowpart_common. so I can categorically confirm that the person who wrote it didn't know what they were doing. It also means that this case was motivated by an experiment to make Pmode == DImode for n32, which we ended up discarding because it produced worse code. If this case really is important, we might consider using convert_memory_address (Pmode, x) instead. I'm not sure whether that would be right for targets with address spaces though, because we don't know which address space is associated with the address. Hopefully someone who knows address spaces can comment. If it is correct, then it should probably go in gen_lowpart_common rather than gen_lowpart_for_combine. But given how few people have hit this, it doesn't look like a particularly important attempted optimisation. I'll pre-approve a patch that undoes my mistake and simply removes: /* Return identity if this is a CONST or symbolic reference. */ if (omode == Pmode (GET_CODE (x) == CONST || GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)) return x; Richard This is the patch I checked in. Probably we need to backport this patch to 4.7, where x32 is -maddress-mode=long by default. It doesn't fail on 4.7 branch since checking mode on PLUS CONST is new on trunk. However, I think it is a correctness issue. Is this OK to backport to 4.7? Thanks. -- H.J.
[PATCH][4/n] Allow anonymous SSA names
This splits out more cleanups from the main patch to make that smaller. The only bigger part of the patch is making tree-stdarg track escapes/varargs for SSA names more precise by not globbing on SSA_NAME_VAR but instead using the bits [0..num_ssa_names-1] for SSA names and [num_ssa_names,max_decl_uid + num_ssa_names] for decls in its bitmap. Eventually this pass will get rewritten by Micha anyway who has pending patches to make va_arg survive until after inlining. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2012-08-08 Richard Guenther rguent...@suse.de * tree-call-cdce.c (check_pow): Simplify. (gen_conditions_for_pow_int_base): Likewise. * tree-ssa-dom.c (propagate_rhs_into_lhs): Do not handle virtual operands here. * tree-ssa-operands.c (get_name_decl): Remove unused function. * gimplify.c (gimple_regimplify_operands): Remove dead code. * tree-vrp.c (get_value_range): Move SSA_NAME_VAR access. * tree-parloops.c (create_phi_for_local_result): Use copy_ssa_name. * value-prof.c (gimple_ic): Use duplicate_ssa_name. (gimple_stringop_fixed_value): Likewise. * tree.c (needs_to_live_in_memory): Remove SSA name handling. * tree-stdarg.c (find_va_list_reference): Store SSA_NAME_VERSIONs in the bitmap alongside shifted DECL_UIDs. (va_list_counter_struct_op): Likewise. (va_list_ptr_read): Likewise. (va_list_ptr_write): Likewise. (check_va_list_escapes): Likewise. (check_all_va_list_escapes): Likewise. (execute_optimize_stdarg): Likewise. * tree-outof-ssa.c (insert_backedge_copies): Use copy_ssa_name. Index: gcc/tree-outof-ssa.c === *** gcc/tree-outof-ssa.c.orig 2012-08-08 13:56:23.0 +0200 --- gcc/tree-outof-ssa.c2012-08-08 14:20:22.116810728 +0200 *** insert_backedge_copies (void) *** 1030,1042 { gimple phi = gsi_stmt (gsi); tree result = gimple_phi_result (phi); - tree result_var; size_t i; if (!is_gimple_reg (result)) continue; - result_var = SSA_NAME_VAR (result); for (i = 0; i gimple_phi_num_args (phi); i++) { tree arg = gimple_phi_arg_def (phi, i); --- 1030,1040 *** insert_backedge_copies (void) *** 1048,1054 needed. */ if ((e-flags EDGE_DFS_BACK) (TREE_CODE (arg) != SSA_NAME ! || SSA_NAME_VAR (arg) != result_var || trivially_conflicts_p (bb, result, arg))) { tree name; --- 1046,1052 needed. */ if ((e-flags EDGE_DFS_BACK) (TREE_CODE (arg) != SSA_NAME ! || SSA_NAME_VAR (arg) != SSA_NAME_VAR (result) || trivially_conflicts_p (bb, result, arg))) { tree name; *** insert_backedge_copies (void) *** 1078,1087 /* Create a new instance of the underlying variable of the PHI result. */ ! stmt = gimple_build_assign (result_var, gimple_phi_arg_def (phi, i)); - name = make_ssa_name (result_var, stmt); - gimple_assign_set_lhs (stmt, name); /* copy location if present. */ if (gimple_phi_arg_has_location (phi, i)) --- 1076,1084 /* Create a new instance of the underlying variable of the PHI result. */ ! name = copy_ssa_name (result, NULL); ! stmt = gimple_build_assign (name, gimple_phi_arg_def (phi, i)); /* copy location if present. */ if (gimple_phi_arg_has_location (phi, i)) Index: gcc/tree-stdarg.c === *** gcc/tree-stdarg.c.orig 2012-08-08 13:56:23.0 +0200 --- gcc/tree-stdarg.c 2012-08-08 14:20:22.120810730 +0200 *** find_va_list_reference (tree *tp, int *w *** 266,276 tree var = *tp; if (TREE_CODE (var) == SSA_NAME) ! var = SSA_NAME_VAR (var); ! ! if (TREE_CODE (var) == VAR_DECL !bitmap_bit_p (va_list_vars, DECL_UID (var))) ! return var; return NULL_TREE; } --- 266,280 tree var = *tp; if (TREE_CODE (var) == SSA_NAME) ! { ! if (bitmap_bit_p (va_list_vars, SSA_NAME_VERSION (var))) ! return var; ! } ! else if (TREE_CODE (var) == VAR_DECL) ! { ! if (bitmap_bit_p (va_list_vars, DECL_UID (var) + num_ssa_names)) ! return var; ! } return NULL_TREE; } *** va_list_counter_struct_op
[PATCH] Add virual_operand_p predicate
This adds a virual_operand_p predicate and uses it where we currently use the bit on the decl (VAR_DECL_IS_VIRTUAL_OPERAND) directly. I suspect most of the is_gimple_reg users in SSA optimizers can be replaced by this predicate eventually making is_gimple_reg a private predicate to the gimplifier ... Bootstrap regtest pending on x86_64-unknown-linux-gnu. Richard. 2012-08-08 Richard Guenther rguent...@suse.de * tree-ssa-operands.h (virtual_operand_p): Declare. * tree-ssa-operands.c (virtual_operand_p): New predicate. * gimple.c (is_gimple_reg): Use virtual_operand_p. * tree-into-ssa.c (prepare_block_for_update): Likewise. * tree-vect-loop-manip.c (adjust_debug_stmts): Likewise. Index: gcc/gimple.c === *** gcc/gimple.c(revision 190226) --- gcc/gimple.c(working copy) *** is_gimple_id (tree t) *** 2782,2800 bool is_gimple_reg (tree t) { ! if (TREE_CODE (t) == SSA_NAME) ! { ! t = SSA_NAME_VAR (t); ! if (TREE_CODE (t) == VAR_DECL ! VAR_DECL_IS_VIRTUAL_OPERAND (t)) ! return false; ! return true; ! } ! ! if (TREE_CODE (t) == VAR_DECL !VAR_DECL_IS_VIRTUAL_OPERAND (t)) return false; if (!is_gimple_variable (t)) return false; --- 2782,2793 bool is_gimple_reg (tree t) { ! if (virtual_operand_p (t)) return false; + if (TREE_CODE (t) == SSA_NAME) + return true; + if (!is_gimple_variable (t)) return false; Index: gcc/tree-into-ssa.c === *** gcc/tree-into-ssa.c (revision 190226) --- gcc/tree-into-ssa.c (working copy) *** prepare_block_for_update (basic_block bb *** 2548,2561 gimple phi = gsi_stmt (si); tree lhs_sym, lhs = gimple_phi_result (phi); - lhs_sym = DECL_P (lhs) ? lhs : SSA_NAME_VAR (lhs); - if (TREE_CODE (lhs) == SSA_NAME ! (TREE_CODE (lhs_sym) != VAR_DECL ! || !VAR_DECL_IS_VIRTUAL_OPERAND (lhs_sym) ! || !cfun-gimple_df-rename_vops)) continue; mark_for_renaming (lhs_sym); mark_def_interesting (lhs_sym, phi, bb, insert_phi_p); --- 2568,2579 gimple phi = gsi_stmt (si); tree lhs_sym, lhs = gimple_phi_result (phi); if (TREE_CODE (lhs) == SSA_NAME ! (! virtual_operand_p (lhs) ! || ! cfun-gimple_df-rename_vops)) continue; + lhs_sym = DECL_P (lhs) ? lhs : SSA_NAME_VAR (lhs); mark_for_renaming (lhs_sym); mark_def_interesting (lhs_sym, phi, bb, insert_phi_p); Index: gcc/tree-ssa-operands.c === *** gcc/tree-ssa-operands.c (revision 190226) --- gcc/tree-ssa-operands.c (working copy) *** debug_immediate_uses_for (tree var) *** 1432,1437 --- 1421,1444 } + /* Return true if OP, an SSA name or a DECL is a virtual operand. */ + + bool + virtual_operand_p (tree op) + { + if (TREE_CODE (op) == SSA_NAME) + { + op = SSA_NAME_VAR (op); + if (!op) + return false; + } + + if (TREE_CODE (op) == VAR_DECL) + return VAR_DECL_IS_VIRTUAL_OPERAND (op); + + return false; + } + /* Unlink STMTs virtual definition from the IL by propagating its use. */ void Index: gcc/tree-ssa-operands.h === *** gcc/tree-ssa-operands.h (revision 190226) --- gcc/tree-ssa-operands.h (working copy) *** extern void debug_decl_set (bitmap); *** 116,121 --- 116,122 extern bool ssa_operands_active (void); + extern bool virtual_operand_p (tree); extern void unlink_stmt_vdef (gimple); enum ssa_op_iter_type { Index: gcc/tree-vect-loop-manip.c === *** gcc/tree-vect-loop-manip.c (revision 190226) --- gcc/tree-vect-loop-manip.c (working copy) *** adjust_debug_stmts (tree from, tree to, *** 205,212 { adjust_info ai; ! if (MAY_HAVE_DEBUG_STMTS TREE_CODE (from) == SSA_NAME !SSA_NAME_VAR (from) != gimple_vop (cfun)) { ai.from = from; ai.to = to; --- 205,213 { adjust_info ai; ! if (MAY_HAVE_DEBUG_STMTS !TREE_CODE (from) == SSA_NAME !! virtual_operand_p (from)) { ai.from = from; ai.to = to;
Re: Commit: RL78: Include tree-pass.h
On Wed, Aug 8, 2012 at 3:35 PM, Ian Lance Taylor i...@google.com wrote: On Wed, Aug 8, 2012 at 1:45 AM, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Aug 8, 2012 at 10:28 AM, Nick Clifton ni...@redhat.com wrote: Hi DJ, I am applying the following patch to the gcc mainline as an obvious fix for the following problem building the RL78 backend: gcc/config/rl78/rl78.c:151:3: error: 'PASS_POS_INSERT_BEFORE' undeclared here (not in a function) Err - you are inside the compiler and should not use plugin stuff to register your machine dependent pass. But we should definitely have a way to register machine dependent passes, and what's wrong with the plugin interface? I don't think we really want that (machine dependent passes). It seems we cannot get away with it (so we have mdreorg). Allowing (some) flexibility where to put mdreorg is ok, using two different mechanisms (mdreorg and a plugin) sounds odd and is IMHO bad for consistency. Richard. Ian
Re: Commit: RL78: Include tree-pass.h
On Wed, Aug 8, 2012 at 7:03 AM, Richard Guenther richard.guent...@gmail.com wrote: I don't think we really want that (machine dependent passes). It seems we cannot get away with it (so we have mdreorg). Allowing (some) flexibility where to put mdreorg is ok, using two different mechanisms (mdreorg and a plugin) sounds odd and is IMHO bad for consistency. I think we definitely want machine dependent passes. E.g., reg-stack should be one. The passes should live by normal rules, they shouldn't be like mdreorg. I don't really care about the mechanism as long as it exists. Ian
Re: Commit: RL78: Include tree-pass.h
On Wed, Aug 8, 2012 at 4:06 PM, Ian Lance Taylor i...@google.com wrote: On Wed, Aug 8, 2012 at 7:03 AM, Richard Guenther richard.guent...@gmail.com wrote: I don't think we really want that (machine dependent passes). It seems we cannot get away with it (so we have mdreorg). Allowing (some) flexibility where to put mdreorg is ok, using two different mechanisms (mdreorg and a plugin) sounds odd and is IMHO bad for consistency. I think we definitely want machine dependent passes. E.g., reg-stack should be one. The passes should live by normal rules, they shouldn't be like mdreorg. What is like mdreorg? That it is a pass centrally registered, called mdreorg that calls a target hook which happens to implement the pass? regstack is controlled by a target macro and is centrally registered, too. I don't really care about the mechanism as long as it exists. I was suggesting to for example register a 2nd mdreorg-like pass and add a 2nd target hook. regstack should get the same treatment. Exposing machine-dependent passes in the global pass schedule makes it easier to figure what you break - as I saw the RL78 use I wondered when I ever grepped for the pass dump-file name in arch specific dirs when changing that ... (never). Richard. Ian
[lra] patch to fix a ppc64 gcc testsuite failure
The following patch fixes a ppc64 gcc testsuite failure. The patch was successfully bootstrapped on x86/x86-64. Committed as rev. 190207. 2012-08-07 Vladimir Makarov vmaka...@redhat.com * lra-int.h (lra_constraint_iter_after_spill): New. * lra.c (lra): Initialize lra_constraint_iter_after_spill. * lra-constraints.c (lra_constraint_iter_after_spill): New. (lra_constraints): Use lra_constraint_iter_after_spill. Index: lra.c === --- lra.c (revision 190127) +++ lra.c (working copy) @@ -2118,7 +2118,8 @@ lra (FILE *f) COPY_HARD_REG_SET (lra_no_alloc_regs, ira_no_alloc_regs); lra_live_range_iter = lra_coalesce_iter = 0; - lra_constraint_iter = lra_inheritance_iter = lra_undo_inheritance_iter = 0; + lra_constraint_iter = lra_constraint_iter_after_spill = 0; + lra_inheritance_iter = lra_undo_inheritance_iter = 0; setup_reg_spill_flag (); @@ -2214,6 +2215,7 @@ lra (FILE *f) lra_constraint_new_regno_start = max_reg_num (); lra_constraint_new_insn_uid_start = get_max_uid (); bitmap_clear (lra_matched_pseudos); + lra_constraint_iter_after_spill = 0; } restore_scratches (); lra_eliminate (true); Index: lra-constraints.c === --- lra-constraints.c (revision 190127) +++ lra-constraints.c (working copy) @@ -3460,8 +3460,8 @@ debug_loc_equivalence_change_p (rtx *loc return result; } -/* Maximum number of constraint pass iteration number. It is for - preventing all LRA cycling. */ +/* Maximum number of constraint pass iteration number after the last + spill pass. It is for preventing all LRA cycling. */ #define MAX_CONSTRAINT_ITERATION_NUMBER 15 /* Maximum number of generated reload insns per an insn. It is for @@ -3471,6 +3471,10 @@ debug_loc_equivalence_change_p (rtx *loc /* The current iteration number of this LRA pass. */ int lra_constraint_iter; +/* The current iteration number of this LRA pass after the last spill + pass. */ +int lra_constraint_iter_after_spill; + /* True if we substituted equiv which needs checking register allocation correctness because the equivalent value contains allocatiable hard registers or when we restore multi-register @@ -3492,7 +3496,8 @@ lra_constraints (bool first_p) if (lra_dump_file != NULL) fprintf (lra_dump_file, \n** Local #%d: **\n\n, lra_constraint_iter); - if (lra_constraint_iter MAX_CONSTRAINT_ITERATION_NUMBER) + lra_constraint_iter_after_spill++; + if (lra_constraint_iter_after_spill MAX_CONSTRAINT_ITERATION_NUMBER) internal_error (Maximum number of LRA constraint passes is achieved (%d)\n, MAX_CONSTRAINT_ITERATION_NUMBER); Index: lra-int.h === --- lra-int.h (revision 190127) +++ lra-int.h (working copy) @@ -296,6 +296,7 @@ extern rtx lra_secondary_memory[NUM_MACH extern int lra_constraint_offset (int, enum machine_mode); extern int lra_constraint_iter; +extern int lra_constraint_iter_after_spill; extern bool lra_risky_transformations_p; extern int lra_inheritance_iter; extern int lra_undo_inheritance_iter;
Re: [Patch ARM 1/6] Canonicalize neon_vaba and neon_vabal patterns.
On 3 August 2012 16:00, Richard Earnshaw rearn...@arm.com wrote: On 30/07/12 12:43, Ramana Radhakrishnan wrote: Patch 1 fixes up the vaba and vabal patterns to use a canonical RTL form with the first operand to the plus being the more complex one. This patch canonicalizes the instruction patterns for the vaba and vabal intrinsics so that the more complex operand to plus is the first operand. This prevents needless splitting in combine. For reference, this was found by the new test in gcc.target/neon/vaba*.c and gcc.target/neon/vabal*.c from patch #4. Ok ? regards, Ramana 2012-07-27 Ramana Radhakrishnan ramana.radhakrish...@linaro.org * config/arm/neon.md (neon_vabamode): Change to define_expand. (neon_vabalmode): Likewise. (neon_vaba_internalmode): New internal pattern. (neon_vabal_internalmode): New internal pattern. In principle, this is OK. I think you could have achieved the same effect more simply though by just re-ordering the RTL but keeping the operand numbers the same. Indeed - this look better ? diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 7142c98..9e82564 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -2337,11 +2337,11 @@ (define_insn neon_vabamode [(set (match_operand:VDQIW 0 s_register_operand =w) -(plus:VDQIW (match_operand:VDQIW 1 s_register_operand 0) -(unspec:VDQIW [(match_operand:VDQIW 2 s_register_operand w) - (match_operand:VDQIW 3 s_register_operand w) - (match_operand:SI 4 immediate_operand i)] - UNSPEC_VABD)))] +(plus:VDQIW (unspec:VDQIW + [(match_operand:VDQIW 2 s_register_operand w) +(match_operand:VDQIW 3 s_register_operand w) + (match_operand:SI 4 immediate_operand i)] UNSPEC_VABD) + (match_operand:VDQIW 1 s_register_operand 0)))] TARGET_NEON vaba.%T4%#V_sz_elem\t%V_reg0, %V_reg2, %V_reg3 [(set (attr neon_type) @@ -2351,13 +2351,13 @@ (define_insn neon_vabalmode [(set (match_operand:V_widen 0 s_register_operand =w) -(plus:V_widen (match_operand:V_widen 1 s_register_operand 0) -(unspec:V_widen [(match_operand:VW 2 s_register_operand w) - (match_operand:VW 3 s_register_operand w) - (match_operand:SI 4 immediate_operand i)] - UNSPEC_VABDL)))] +(plus:V_widen (unspec:V_widen + [(match_operand:VW 2 s_register_operand w) + (match_operand:VW 3 s_register_operand w) + (match_operand:SI 4 immediate_operand i)] UNSPEC_VABDL) +(match_operand:V_widen 1 s_register_operand 0)))] TARGET_NEON - vabal.%T4%#V_sz_elem\t%q0, %P2, %P3 + vabal.%T4%#V_sz_elem\t%q0, %P1, %P2 [(set_attr neon_type neon_vaba)] ) Ramana
Re: Commit: RL78: Include tree-pass.h
On Wed, Aug 8, 2012 at 7:11 AM, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Aug 8, 2012 at 4:06 PM, Ian Lance Taylor i...@google.com wrote: On Wed, Aug 8, 2012 at 7:03 AM, Richard Guenther richard.guent...@gmail.com wrote: I don't think we really want that (machine dependent passes). It seems we cannot get away with it (so we have mdreorg). Allowing (some) flexibility where to put mdreorg is ok, using two different mechanisms (mdreorg and a plugin) sounds odd and is IMHO bad for consistency. I think we definitely want machine dependent passes. E.g., reg-stack should be one. The passes should live by normal rules, they shouldn't be like mdreorg. What is like mdreorg? That it is a pass centrally registered, called mdreorg that calls a target hook which happens to implement the pass? regstack is controlled by a target macro and is centrally registered, too. I don't really care about the mechanism as long as it exists. I was suggesting to for example register a 2nd mdreorg-like pass and add a 2nd target hook. regstack should get the same treatment. If the mechanism is a proliferation of mdreorg passes in every place we want a target-specific pass, that is fine with me. Ian
Re: PATCH: PR rtl-optimization/54157: [x32] -maddress-mode=long failures
H.J. Lu hjl.to...@gmail.com writes: On Wed, Aug 8, 2012 at 6:43 AM, Uros Bizjak ubiz...@gmail.com wrote: Probably we need to backport this patch to 4.7, where x32 is -maddress-mode=long by default. It doesn't fail on 4.7 branch since checking mode on PLUS CONST is new on trunk. However, I think it is a correctness issue. Is this OK to backport to 4.7? Yeah, I agree we should backport it. Richard
Re: Commit: RL78: Include tree-pass.h
On 08/08/2012 07:19 AM, Ian Lance Taylor wrote: I was suggesting to for example register a 2nd mdreorg-like pass and add a 2nd target hook. regstack should get the same treatment. If the mechanism is a proliferation of mdreorg passes in every place we want a target-specific pass, that is fine with me. I think it makes much more sense to edit the pass ordering from the backend, rather than hooks upon hooks upon hooks. Since the plugin interface exists, we might as well use it. r~
Re: Commit: RL78: Include tree-pass.h
But we should definitely have a way to register machine dependent passes, and what's wrong with the plugin interface? IIRC I asked about how to schedule that pass when I wrote it, and use the plugin API was the recommendation. Some background... The RL78 devirtualization pass is *not* a reorg pass, it has to happen after reload but before debug info is set up. The RL78 does not have a consistent register set or addressing scheme, GCC cannot practically support it. So RL78 uses a virtual register set and machine description until after reload, then copy data in and out of the virtual registers to physical registers depending on the operations required. The only way to get decent performance and reasonable debugging is to do that pass as soon after reload as possible, so the remaining optimizations can work with real registers. Basically, after the RL78 devirtualization pass, the RL78 is a completely different target machine.
[google] Omit another TARGET_LIB_PATH from RPATH_ENVVAR set on bootstrap builds (issue6446102)
Omit another TARGET_LIB_PATH from RPATH_ENVVAR set on bootstrap builds. A second occurrence of adding TARGET_LIB_PATH to LD_LIBRARY_PATH on gcc bootstrap builds. This one also needs removing to enable full test coverage. Discussion and rationale at: http://gcc.gnu.org/ml/gcc/2012-06/msg00314.html For google/main, google/gcc-4_7 and google/gcc-4_7-integration. Tested for bootstrap and regression. 2012-08-08 Simon Baldwin sim...@google.com * Makefile.tpl: Omit another TARGET_LIB_PATH from RPATH_ENVVAR set on bootstrap builds. * Makefile.in: Regenerate. Index: Makefile.in === --- Makefile.in (revision 190231) +++ Makefile.in (working copy) @@ -288,9 +288,6 @@ BASE_TARGET_EXPORTS = \ STRIP=$(STRIP_FOR_TARGET); export STRIP; \ WINDRES=$(WINDRES_FOR_TARGET); export WINDRES; \ WINDMC=$(WINDMC_FOR_TARGET); export WINDMC; \ -@if gcc-bootstrap - $(RPATH_ENVVAR)=`echo $(TARGET_LIB_PATH)$$$(RPATH_ENVVAR) | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \ -@endif gcc-bootstrap $(RPATH_ENVVAR)=`echo $(HOST_LIB_PATH)$$$(RPATH_ENVVAR) | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \ TARGET_CONFIGDIRS=$(TARGET_CONFIGDIRS); export TARGET_CONFIGDIRS; Index: Makefile.tpl === --- Makefile.tpl(revision 190231) +++ Makefile.tpl(working copy) @@ -291,9 +291,6 @@ BASE_TARGET_EXPORTS = \ STRIP=$(STRIP_FOR_TARGET); export STRIP; \ WINDRES=$(WINDRES_FOR_TARGET); export WINDRES; \ WINDMC=$(WINDMC_FOR_TARGET); export WINDMC; \ -@if gcc-bootstrap - $(RPATH_ENVVAR)=`echo $(TARGET_LIB_PATH)$$$(RPATH_ENVVAR) | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \ -@endif gcc-bootstrap $(RPATH_ENVVAR)=`echo $(HOST_LIB_PATH)$$$(RPATH_ENVVAR) | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \ TARGET_CONFIGDIRS=$(TARGET_CONFIGDIRS); export TARGET_CONFIGDIRS; -- This patch is available for review at http://codereview.appspot.com/6446102
Re: [PATCH] Set correct source location for deallocator calls
On 08/07/2012 06:25 AM, Richard Guenther wrote: (is re-setting _every_ stmt location really ok in all cases?) I'm certain that it's not, though you can't tell that from C++. Examine instead a Java test case using try-finally. In Java the contents of the finally would be incorrectly relocated from their original source line to the new line Dehao has decided upon. I can see the desire for wanting the call to ~t() to appear from the return statement. And for C++ we'll get the correct lines for the contents of ~t() post inlining (which happens after tree-eh). But unless the C++ front end uses something like UNKNOWN_LOCATION on the destructor call, I don't see how we can tell the Java and C++ cases apart. And if we can't, then I don't think we can make this change at all. r~
Re: [google] Omit another TARGET_LIB_PATH from RPATH_ENVVAR set on bootstrap builds (issue6446102)
On 12-08-08 11:49 , Simon Baldwin wrote: Omit another TARGET_LIB_PATH from RPATH_ENVVAR set on bootstrap builds. A second occurrence of adding TARGET_LIB_PATH to LD_LIBRARY_PATH on gcc bootstrap builds. This one also needs removing to enable full test coverage. Discussion and rationale at: http://gcc.gnu.org/ml/gcc/2012-06/msg00314.html For google/main, google/gcc-4_7 and google/gcc-4_7-integration. Tested for bootstrap and regression. 2012-08-08 Simon Baldwin sim...@google.com * Makefile.tpl: Omit another TARGET_LIB_PATH from RPATH_ENVVAR set on bootstrap builds. * Makefile.in: Regenerate. OK, but I gotta ask again... shouldn't this be in google/integration? :) Diego.
Re: [PATCH] Set correct source location for deallocator calls
On Wed, Aug 8, 2012 at 8:56 AM, Richard Henderson r...@redhat.com wrote: On 08/07/2012 06:25 AM, Richard Guenther wrote: (is re-setting _every_ stmt location really ok in all cases?) I'm certain that it's not, though you can't tell that from C++. Examine instead a Java test case using try-finally. In Java the contents of the finally would be incorrectly relocated from their original source line to the new line Dehao has decided upon. This makes sense. I was thinking what else can reside in the finally block, and apparently I was ignoring Java... I can see the desire for wanting the call to ~t() to appear from the return statement. And for C++ we'll get the correct lines for the contents of ~t() post inlining (which happens after tree-eh). Even if inline gives it right source position, it'll still have incorrect inline stack. But unless the C++ front end uses something like UNKNOWN_LOCATION on the destructor call, I don't see how we can tell the Java and C++ cases apart. And if we can't, then I don't think we can make this change at all. Then we should probably assign UNKNOWN_LOCATION for these destructor calls, what do you guys think? Thanks, Dehao r~
Re: [PATCH] Set correct source location for deallocator calls
On 08/08/2012 09:27 AM, Dehao Chen wrote: On Wed, Aug 8, 2012 at 8:56 AM, Richard Henderson r...@redhat.com wrote: On 08/07/2012 06:25 AM, Richard Guenther wrote: (is re-setting _every_ stmt location really ok in all cases?) I'm certain that it's not, though you can't tell that from C++. Examine instead a Java test case using try-finally. In Java the contents of the finally would be incorrectly relocated from their original source line to the new line Dehao has decided upon. This makes sense. I was thinking what else can reside in the finally block, and apparently I was ignoring Java... I can see the desire for wanting the call to ~t() to appear from the return statement. And for C++ we'll get the correct lines for the contents of ~t() post inlining (which happens after tree-eh). Even if inline gives it right source position, it'll still have incorrect inline stack. But unless the C++ front end uses something like UNKNOWN_LOCATION on the destructor call, I don't see how we can tell the Java and C++ cases apart. And if we can't, then I don't think we can make this change at all. Then we should probably assign UNKNOWN_LOCATION for these destructor calls, what do you guys think? I think it's certainly plausible. I can't think what other problems such a change would cause. Jason? r~
Re: [RFH / Patch] PR 54191
On 08/08/2012 02:47 PM, Paolo Carlini wrote: AFAICS, the most uncertain case is the conditional operator test, otherwise we could split the PR. Turned out to be really trivial: a missing check for error_mark_node in build_conditional_expr_1. I'll send in a separate message a complete regtested patch + testcase covering all the tests unrelated to instantiation dependent. Paolo.
[Patch, Fortran] PR40881 - Add two F95 obsolescence warnings
This patch implements a warning for the following Fortran 95 (and later) obsolescent features (cf. F2008, B.2 Obsolescent features): (2) Shared DO termination and termination on a statement other than END DO or CONTINUE -- use an END DO or a CONTINUE statement for each DO statement. (6) DATA statements amongst executable statements -- see B.2.5. With this patch, I think the only unimplemented obsolescence warning is for (8) Fixed form source -- see B.2.7. For the latter, I would like to see a possibility to silence that warning, given that there is substantial code around, which is in fixed form but otherwise a completely valid and obsolescent-free code. The motivation for implementing this patch was that I did a small obsolescent cleanup of our fixed-form code (which uses some Fortran 2003 features) and I realized that ifort had the shared DO termination warning and gfortran didn't. Build and regtested on x86-64-gnu-linux. OK for the trunk? Tobias 2012-08-08 Tobias Burnus bur...@net-b.de PR fortran/40881 * error.c (gfc_notify_std): Reset cur_error_buffer-flag flag when the error/warning has been printed. * gfortran.h (gfc_sl_type): Add ST_LABEL_DO_TARGET and ST_LABEL_ENDDO_TARGET. * match.c (gfc_match_do): Use ST_LABEL_DO_TARGET. * parse.c (check_statement_label): Use ST_LABEL_ENDDO_TARGET. (parse_executable): Add obsolescence check for DATA. * resolve.c (resolve_branch): Handle ST_LABEL_DO_TARGET. * symbol.c (gfc_define_st_label, gfc_reference_st_label): Add obsolescence diagnostics. * trans-stmt.c (gfc_trans_label_assign): Handle ST_LABEL_DO_TARGET. 2012-08-08 Tobias Burnus bur...@net-b.de PR fortran/40881 * gfortran.dg/data_constraints_3.f90: New. * gfortran.dg/data_constraints_1.f90: Update dg-warning. * gfortran.dg/pr37243.f: Ditto. * gfortran.dg/g77/19990826-3.f: Ditto. * gfortran.dg/g77/20020307-1.f : Ditto. * gfortran.dg/g77/980310-3.f: Ditto. diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c index 7e968db..dde6a0f 100644 --- a/gcc/fortran/error.c +++ b/gcc/fortran/error.c @@ -875,6 +875,7 @@ gfc_notify_std (int std, const char *gmsgid, ...) warnings++; else gfc_increment_error_count(); + cur_error_buffer-flag = 0; } return (warning !warnings_are_errors) ? SUCCESS : FAILURE; diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index b6e2975..9670022 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -146,8 +146,8 @@ ar_type; /* Statement label types. */ typedef enum -{ ST_LABEL_UNKNOWN = 1, ST_LABEL_TARGET, - ST_LABEL_BAD_TARGET, ST_LABEL_FORMAT +{ ST_LABEL_UNKNOWN = 1, ST_LABEL_TARGET, ST_LABEL_DO_TARGET, + ST_LABEL_ENDDO_TARGET, ST_LABEL_BAD_TARGET, ST_LABEL_FORMAT } gfc_sl_type; diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c index 737d6a3..5ab07e5 100644 --- a/gcc/fortran/match.c +++ b/gcc/fortran/match.c @@ -2400,7 +2400,7 @@ gfc_match_do (void) goto concurr_cleanup; if (label != NULL - gfc_reference_st_label (label, ST_LABEL_TARGET) == FAILURE) + gfc_reference_st_label (label, ST_LABEL_DO_TARGET) == FAILURE) goto concurr_cleanup; new_st.label1 = label; @@ -2454,7 +2454,7 @@ concurr_cleanup: done: if (label != NULL - gfc_reference_st_label (label, ST_LABEL_TARGET) == FAILURE) + gfc_reference_st_label (label, ST_LABEL_DO_TARGET) == FAILURE) goto cleanup; new_st.label1 = label; diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c index ecda163..d7e4b78 100644 --- a/gcc/fortran/parse.c +++ b/gcc/fortran/parse.c @@ -1168,7 +1168,10 @@ check_statement_label (gfc_statement st) case ST_END_ASSOCIATE: case_executable: case_exec_markers: - type = ST_LABEL_TARGET; + if (st == ST_ENDDO || st == ST_CONTINUE) + type = ST_LABEL_ENDDO_TARGET; + else + type = ST_LABEL_TARGET; break; case ST_FORMAT: @@ -3825,8 +3828,11 @@ parse_executable (gfc_statement st) case ST_NONE: unexpected_eof (); - case ST_FORMAT: case ST_DATA: + gfc_notify_std (GFC_STD_F95_OBS, DATA statement at %C after the + first executable statement); + /* Fall through. */ + case ST_FORMAT: case ST_ENTRY: case_executable: accept_statement (st); diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c index c5810b2..3db34c5 100644 --- a/gcc/fortran/resolve.c +++ b/gcc/fortran/resolve.c @@ -8767,7 +8767,8 @@ resolve_branch (gfc_st_label *label, gfc_code *code) return; } - if (label-defined != ST_LABEL_TARGET) + if (label-defined != ST_LABEL_TARGET + label-defined != ST_LABEL_ENDDO_TARGET) { gfc_error (Statement at %L is not a valid branch target statement for the branch statement at %L, label-where, code-loc); diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c index 455e6c9..135c1e5 100644 --- a/gcc/fortran/symbol.c +++ b/gcc/fortran/symbol.c @@ -2204,7 +2204,8 @@ gfc_define_st_label (gfc_st_label *lp, gfc_sl_type type, locus *label_locus)
Beyond Complex Register Management
On Aug 8, 2012, at 8:38 AM, DJ Delorie wrote: The RL78 devirtualization pass is *not* a reorg pass, it has to happen after reload but before debug info is set up. The RL78 does not have a consistent register set or addressing scheme, GCC cannot practically support it. Gosh, we got one of those two, though, I don't know how much worse your machine is than mine, in at all. For example, $fp is in register N, where N is calculated differently for each function and isn't known until compute_frame_size time. Which registers can be used isn't known until then either. Also, different parts of the prologue use totally different numbering. We handle this by having many different copies of all the registers, one copy for each possible category of use, and the once we get a feel for the landscape, we pin everything (that's a lie, we only pin some things) down, mark up which registers are going to be disappeared or put into service, essentially, for a given register, it can only ever be live in one category. Some of the categories have variable mappings (computed mappings) to the actual register, it isn't the case that the first register of category Y, maps to the same register as the first register of category Z. We also have different mappings depending where in the instruction stream the instruction falls, just for fun. Now, we do all this, with no reorg or devirtualization pass. I'd like to think what you do would be a proper subset of what we do. So, for example, we have one category for call clobbered registers, one for caller saved register, one for inbound argument register, one for outbound argument registers, gosh, so many, I can't recall all the others... The only magic we do have, is an extra call to reinit_regs in gimple_expand_cfg right before the if () fixup_tail_calls() code, so that TARGET_CONDITIONAL_REGISTER_USAGE has a chance to fire at just the right time. That firing pins some of the aspects of argument registers and some of the other registers. The rest of the pinning down is done at compute_frame_size time. So, there isn't just one pinning, but two. We handle the second pinning by dynamic code in print_operand. regno = func (regno, codegen_state); printf (%s, reg_name[regno]); The style allows for an arbitrarily computed register number, based upon anything one wants. We merely happen to use the frame information, leaf information, where in the instruction stream we are and a few other bits. Our usual func, is something like: switch CATEGORY (regno) case 1: regno = regno-CATEGORY_BASE[1] + actual_start_category[1]; break; case 2: regno = regno-category_base[2] + actual_start_category[2]; break; [ ... ] where the various category_bases are the start of the given categories and the actual_start_categories are the place to actually put the category in the actual register file. So, if you want to explore how to redo the port, or if it is even possible, just let me know... I like puzzles, but only if they aren't trivial. :-) The line count to switch your port is roughly O(NC*(ARPC/4 + 20) + 80), I suspect; where NC are the number of classes, and ARPC is the average registers per class. Now, all that said, the code is pretty basic and clean and we think interfaces well with the rest of the compiler. We don't violate gcc notion of codegen (though, we might get the DBX_REGISTER_NUMBER slightly wrong in the first part of the prologue). I think even that problem can be fixed, if we want. As we move to C++, I'd love for port maintainers to be able to get together and hoist _up_ code from the port so other ports can use it and thus, have more sharing. We make heavily stylized uses, which could be wrapped into a prettier api, given a reasonably powerful language. C++ might be powerful enough. I'd love to see someone improve the FIXED_REGISTERS, REGISTER_NAMES, REG_CLASS_CONTENTS, REGNO_REG_CLASS, REGNO_GLOBAL_BASE_REG_P, CLASS_MAX_NREGS experience, as the current api sucks; for example. I dread the slightest change to any of my registers, as the change ripples throughout a ton of disparate places, and it is currently too easy to forget one of them. Bonus points for cleaning up TARGET_SECONDARY_RELOAD at the same time as the above. I find that api sucks in ways I can't begin to describe.
Re: s390: Avoid CAS boolean output inefficiency
Richard Henderson wrote: On 08/07/2012 10:02 AM, Ulrich Weigand wrote: The following patch changes the builtin expander to pass a MEM oldval as-is to the back-end expander, so that the back-end can move the store to before the CC operation. With that patch I'm also seeing all the IPMs disappear. ... What do you think about this solution? It has the advantage that we still get the same xor code if we actually do need the ipm ... I'm ok with that patch. Thanks! I've checked in the following version. Tested on s390x-ibm-linux with no regressions. Bye, Ulrich ChangeLog: * builtins.c (expand_builtin_atomic_compare_exchange): Pass old value operand as MEM to expand_atomic_compare_and_swap. * config/s390/s390.md (atomic_compare_and_swapmode): Accept nonimmediate_operand for old value; generate load and store if needed. * config/s390/s390.c (s390_expand_cs_hqi): Accept any operand as vtarget. Index: gcc/builtins.c === *** gcc/builtins.c (revision 190226) --- gcc/builtins.c (working copy) *** expand_builtin_atomic_compare_exchange ( *** 5376,5381 --- 5376,5382 expect = expand_normal (CALL_EXPR_ARG (exp, 1)); expect = convert_memory_address (Pmode, expect); + expect = gen_rtx_MEM (mode, expect); desired = expand_expr_force_mode (CALL_EXPR_ARG (exp, 2), mode); weak = CALL_EXPR_ARG (exp, 3); *** expand_builtin_atomic_compare_exchange ( *** 5383,5396 if (host_integerp (weak, 0) tree_low_cst (weak, 0) != 0) is_weak = true; ! oldval = copy_to_reg (gen_rtx_MEM (mode, expect)); ! if (!expand_atomic_compare_and_swap ((target == const0_rtx ? NULL : target), oldval, mem, oldval, desired, is_weak, success, failure)) return NULL_RTX; ! emit_move_insn (gen_rtx_MEM (mode, expect), oldval); return target; } --- 5384,5398 if (host_integerp (weak, 0) tree_low_cst (weak, 0) != 0) is_weak = true; ! oldval = expect; if (!expand_atomic_compare_and_swap ((target == const0_rtx ? NULL : target), oldval, mem, oldval, desired, is_weak, success, failure)) return NULL_RTX; ! if (oldval != expect) ! emit_move_insn (expect, oldval); ! return target; } Index: gcc/config/s390/s390.c === *** gcc/config/s390/s390.c (revision 190226) --- gcc/config/s390/s390.c (working copy) *** s390_expand_cs_hqi (enum machine_mode mo *** 4825,4831 rtx res = gen_reg_rtx (SImode); rtx csloop = NULL, csend = NULL; - gcc_assert (register_operand (vtarget, VOIDmode)); gcc_assert (MEM_P (mem)); init_alignment_context (ac, mem, mode); --- 4825,4830 Index: gcc/config/s390/s390.md === *** gcc/config/s390/s390.md (revision 190226) --- gcc/config/s390/s390.md (working copy) *** *** 8870,8876 (define_expand atomic_compare_and_swapmode [(match_operand:SI 0 register_operand);; bool success output !(match_operand:DGPR 1 register_operand) ;; oldval output (match_operand:DGPR 2 memory_operand);; memory (match_operand:DGPR 3 register_operand) ;; expected intput (match_operand:DGPR 4 register_operand) ;; newval intput --- 8870,8876 (define_expand atomic_compare_and_swapmode [(match_operand:SI 0 register_operand);; bool success output !(match_operand:DGPR 1 nonimmediate_operand);; oldval output (match_operand:DGPR 2 memory_operand);; memory (match_operand:DGPR 3 register_operand) ;; expected intput (match_operand:DGPR 4 register_operand) ;; newval intput *** *** 8879,8887 (match_operand:SI 7 const_int_operand)] ;; failure model { ! rtx cc, cmp; emit_insn (gen_atomic_compare_and_swapmode_internal !(operands[1], operands[2], operands[3], operands[4])); cc = gen_rtx_REG (CCZ1mode, CC_REGNUM); cmp = gen_rtx_EQ (SImode, cc, const0_rtx); emit_insn (gen_cstorecc4 (operands[0], cmp, cc, const0_rtx)); --- 8879,8900 (match_operand:SI 7 const_int_operand)] ;; failure model { ! rtx cc, cmp, output = operands[1]; ! ! if (!register_operand (output, MODEmode)) ! output = gen_reg_rtx (MODEmode); ! emit_insn (gen_atomic_compare_and_swapmode_internal !(output, operands[2], operands[3], operands[4])); ! ! /* We deliberately accept non-register operands in the predicate ! to ensure the write back to the output operand happens *before* ! the store-flags code below. This makes it easier for combine ! to merge the store-flags code with a
Re: Beyond Complex Register Management
Gosh, we got one of those too, though, I don't know how much worse your machine is than mine, in at all. In the RL78 case, it's basically a modern Z80 clone. It has eight 8-bit registers (er, four banks of those, one active at a time) which can be combined into four 16-bit registers, but for each addressing mode there's specific register pairs that can be used as base registers, and sometimes the base register is 8-bits and sometimes it's 16, and sometimes you can combine them (i.e. [HL+B]) in weird ways. Worse, most of the MOV operations and all of the math/logic operations *only* use the A/AX register, and the ones that use other registers are often asymmetrical. For example, you can move data from memory to BC, but not from BC to memory. It's the weird addressing modes that confuse gcc. (I had similar problems with the m32c port, although it was clean enough to eventually force it to work right. Er, right enough. There's still one double-indirect addressing mode that I can't tell gcc about because it makes reload vomit.) Fortunately, there's an addressing mode that covers a small range of RAM with relatively flexible addressing, and that range happens to overlap the memory-mapped registers. So, we tell GCC to use some of that memory as virtual registers and after reload the devirtualization pass shuffles data in and out of real registers to do whatever operations are required, based on what addressing modes are used and what operations need be done. The post-reload optimizations then clean up unneeded moves etc. I.e. we built a micro-interpreter with a compile-time JIT :-) To do this, we have separate virtual and real *.md files and patterns, and the devirtualizer has some standard rules for copying data, which it applies until the pattern's predicates and constraints match. There are some hints in the patterns to tell it which rules to use, too. We do a recog on the virtual pattern to get the hints and operands, then recog on the real one until it works. As we move to C++, I'd love for port maintainers to be able to get together and hoist _up_ code from the port so other ports can use it and thus, have more sharing. The one bit of code I seem to need more often than usual is an unneeded move elimination pass or helper. I'd like to be able to call that at arbitrary times after (or before, or during) other passes to clean up unneeded moves. My implementations are fairly trivial because they cater to my ports, but a generic one that covers all ports generically would be much more appropriate.
Re: Beyond Complex Register Management
On Wed, Aug 08, 2012 at 10:52:28AM -0700, Mike Stump wrote: As we move to C++, I'd love for port maintainers to be able to get together and hoist _up_ code from the port so other ports can use it and thus, have more sharing. We make heavily stylized uses, which could be wrapped into a prettier api, given a reasonably powerful language. C++ might be powerful enough. I'd love to see someone improve the FIXED_REGISTERS, REGISTER_NAMES, REG_CLASS_CONTENTS, REGNO_REG_CLASS, REGNO_GLOBAL_BASE_REG_P, CLASS_MAX_NREGS experience, as the current api sucks; for example. I dread the slightest change to any of my registers, as the change ripples throughout a ton of disparate places, and it is currently too easy to forget one of them. Bonus points for cleaning up TARGET_SECONDARY_RELOAD at the same time as the above. I find that api sucks in ways I can't begin to describe. Would a registers.md for defining registers and register classes (and maybe things like FIXED_REGISTERS and the like) be a good start for you, or do you want something more developed first? -Nathan
Re: [PATCH][Cilkplus] Remove unwanted static chain.
Index: gcc/cp/parser.c === --- gcc/cp/parser.c (revision 190195) +++ gcc/cp/parser.c (working copy) @@ -28351,6 +28351,13 @@ FOR_EXPR (statement) = decl; CILK_FOR_GRAIN (statement) = grain; + /* If an initial value is available, and it is of type integer, then we + save it in CILK_FOR_INIT. */ + if (init TREE_TYPE (init) INTEGRAL_TYPE_P (TREE_TYPE (init))) +CILK_FOR_INIT (statement) = init; + else +CILK_FOR_INIT (statement) = NULL_TREE; + Shouldn't you only set this for flag_cilkplus (?), or does it need to be set for non Cilk instances of the compiler?
RE: [PATCH][Cilkplus] Remove unwanted static chain.
Hello Aldy, The only time we will get into this function (cp_parser_cilk_for) is when the fcilkplus is turned on. Here is the original call for this function (line #9983) : if (!flag_enable_cilk) fatal_error (-fcilkplus must be enabled to use %cilk_for%); else statement = cp_parser_cilk_for (parser, (tree) NULL_TREE, parser-in_statement); Thanks, Balaji V. Iyer. -Original Message- From: Aldy Hernandez [mailto:al...@redhat.com] Sent: Wednesday, August 08, 2012 2:23 PM To: Iyer, Balaji V Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH][Cilkplus] Remove unwanted static chain. Index: gcc/cp/parser.c === --- gcc/cp/parser.c (revision 190195) +++ gcc/cp/parser.c (working copy) @@ -28351,6 +28351,13 @@ FOR_EXPR (statement) = decl; CILK_FOR_GRAIN (statement) = grain; + /* If an initial value is available, and it is of type integer, then we + save it in CILK_FOR_INIT. */ + if (init TREE_TYPE (init) INTEGRAL_TYPE_P (TREE_TYPE (init))) +CILK_FOR_INIT (statement) = init; else +CILK_FOR_INIT (statement) = NULL_TREE; + Shouldn't you only set this for flag_cilkplus (?), or does it need to be set for non Cilk instances of the compiler?
Re: [PATCH][Cilkplus] Remove unwanted static chain.
On 08/08/12 13:27, Iyer, Balaji V wrote: Hello Aldy, The only time we will get into this function (cp_parser_cilk_for) is when the fcilkplus is turned on. Here is the original call for this function (line #9983) : if (!flag_enable_cilk) fatal_error (-fcilkplus must be enabled to use %cilk_for%); else statement = cp_parser_cilk_for (parser, (tree) NULL_TREE, parser-in_statement); Whoops, my bad. I should catch up on the rest of your changes before commenting :). Carry on...
Re: [patch] Fix problems with -fdebug-types-section and local types
Index: gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C === --- gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C (revision 0) +++ gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C (revision 0) @@ -0,0 +1,55 @@ +// { dg-do compile } +// { dg-options --std=c++11 -dA -gdwarf-4 -fdebug-types-section -fno-merge-debug-strings } + +// Check that -fdebug-types-sections does not copy a full referenced type +// into a type unit. + +// Checks that at least one type unit is generated. +// +// { dg-final { scan-assembler DIE \\(.*\\) DW_TAG_type_unit } } +// +// Check that func is declared exactly twice in the debug info: +// once in the type unit for struct D, and once in the compile unit. +// +// { dg-final { scan-assembler-times \\.ascii \func0\\[^\n\]*DW_AT_name 2 } } +// +// Check to make sure that no type unit contains a DIE with DW_AT_low_pc +// or DW_AT_ranges. These patterns assume that the compile unit is always +// emitted after all type units. +// +// { dg-final { scan-assembler-not \\.quad.*DW_AT_low_pc.*DIE \\(.*\\) DW_TAG_compile_unit } } +// { dg-final { scan-assembler-not \\.quad.*DW_AT_ranges.*DIE \\(.*\\) DW_TAG_compile_unit } } I found a minor problem with the regexps above -- I was using .* in a few places where I should have been using \[^\n\]*. Fixed as follows: +// { dg-final { scan-assembler DIE \\(\[^\n\]*\\) DW_TAG_type_unit } } +// { dg-final { scan-assembler-times \\.ascii \func0\\[^\n\]*DW_AT_name 2 } } +// { dg-final { scan-assembler-not \\.quad\[^\n\]*DW_AT_low_pc.*DIE \\(\[^\n\]*\\) DW_TAG_compile_unit } } +// { dg-final { scan-assembler-not \\.quad\[^\n\]*DW_AT_ranges.*DIE \\(\[^\n\]*\\) DW_TAG_compile_unit } } (Sorry, these will probably get split into two lines by gmail, but they're each a single line in the actual patch.) -cary
RE: [PATCH][Cilkplus] Remove unwanted static chain.
Its ok. I am glad you are catching all these, it makes me rethink and recheck. Thanks, Balaji V. Iyer. -Original Message- From: Aldy Hernandez [mailto:al...@redhat.com] Sent: Wednesday, August 08, 2012 2:29 PM To: Iyer, Balaji V Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH][Cilkplus] Remove unwanted static chain. On 08/08/12 13:27, Iyer, Balaji V wrote: Hello Aldy, The only time we will get into this function (cp_parser_cilk_for) is when the fcilkplus is turned on. Here is the original call for this function (line #9983) : if (!flag_enable_cilk) fatal_error (-fcilkplus must be enabled to use %cilk_for%); else statement = cp_parser_cilk_for (parser, (tree) NULL_TREE, parser-in_statement); Whoops, my bad. I should catch up on the rest of your changes before commenting :). Carry on...
Re: [PATCH, MIPS] 74k madd scheduler tweaks
Sandra Loosemore san...@codesourcery.com writes: +;; Before reload, all multiplier is registered as imul3 (which has a long +;; latency). We temporary jig the latency such that the macc groups +;; are scheduled closely together during the first scheduler pass. +(define_bypass 1 r74k_int_mul3 r74k_dsp_mac + mips_mult_madd_chain_bypass_p) +(define_bypass 1 r74k_int_mul3 r74k_dsp_mac_sat + mips_mult_madd_chain_bypass_p) If this is incorrect or looks like a hack to paper over some other problem, I'd be happy to drop the predicate on these bits too. WDYT? Hmm, yeah, it does look like they should be using mips_linked_madd_p instead, except that mips_linked_madd_p isn't yet wired up to handle DSP macs. Rather than pattern-match them all, the easiest thing would probably be to define a new attribute along the lines of: (define_attr accum_in none,0,1,2,3,4,5 (const_string none)) and use it for the existing imadds too. E.g.: (define_insn *mul_acc_si [(set (match_operand:SI 0 register_operand =l*?*?,d?) (plus:SI (mult:SI (match_operand:SI 1 register_operand d,d) (match_operand:SI 2 register_operand d,d)) (match_operand:SI 3 register_operand 0,d))) (clobber (match_scratch:SI 4 =X,l)) (clobber (match_scratch:SI 5 =X,d))] GENERATE_MADD_MSUB !TARGET_MIPS16 @ madd\t%1,%2 # [(set_attr type imadd) (set_attr accum_in 3) (set_attr mode SI) (set_attr length 4,8)]) Then mips_linked_madd_p can use get_attr_accum_in to check for chains. But if that's more work than you want right now, just dropping the predicates above would be OK too, like you say. Richard
Re: [cxx-conversion] Support garbage-collected C++ templates
hi Diego, just a word on style in the documentation: +templatetypename T +void gt_pch_nx (TPT *tp) +@{ + extern void gt_pch_nx (T); + + /* This marks field 'fld' of type 'T'. */ + gt_pch_nx (tp-fld); +@} 'extern' declaration at local scope if considered an extremely poor style in C++. Furthermore, it does not interact well with template instantiations and scope rules; plus it does not work well when the function actually has an internal linkage. A proper way to bring a symbol into local scope is through a using-declaration: using ::gt_pch_nx; -- Gaby
Re: [cxx-conversion] Support garbage-collected C++ templates
On 12-08-08 16:12 , Gabriel Dos Reis wrote: hi Diego, just a word on style in the documentation: +templatetypename T +void gt_pch_nx (TPT *tp) +@{ + extern void gt_pch_nx (T); + + /* This marks field 'fld' of type 'T'. */ + gt_pch_nx (tp-fld); +@} 'extern' declaration at local scope if considered an extremely poor style in C++. Furthermore, it does not interact well with template instantiations and scope rules; plus it does not work well when the function actually has an internal linkage. A proper way to bring a symbol into local scope is through a using-declaration: using ::gt_pch_nx; I struggled a bit with this. I need to tell the template function that there exists a free function gt_pch_nx that takes a T. This is not something that any header file has at the point of this declaration. The function gt_pch_nx(T) is declared and defined later in one of the gt-*.h files. This is another problem with the way that gengtype works today. The using-declaration that you propose does not seem to give me what I want, though. How does it know that the function takes a T? Diego.
[C++ Patch] PR 54191
Hi, this is a booted and tested patch which handles all the tests submitted as part of the PR besides the first 4, which require finish_decltype_type to use an instantiation_dependent_p along the lines of the work done as part of c++/51222. As I mentioned, I already verified that the latter work is enough to fix those 4 tests, thus my idea would be resolving asap this PR and then adding the 4 tests to c++/51222 or opening a separate PR only for those, blocked by c++/51222, whichever option you like. Otherwise, as I already described, I'm simply adding a tsubst_flags_t parameter to lookup_base, and the rest is straightforward. Thanks, Paolo. /cp 2012-08-08 Paolo Carlini paolo.carl...@oracle.com PR c++/54191 * search.c (lookup_base): Add tsubst_flags_t parameter. (adjust_result_of_qualified_name_lookup, check_final_overrider): Adjust. * name-lookup.c (do_class_using_decl): Likewise. * typeck.c (build_class_member_access_expr, finish_class_member_access_expr, get_member_function_from_ptrfunc, build_static_cast_1, get_delta_difference_1): Likewise. * class.c (build_base_path, convert_to_base, build_vtbl_ref_1, warn_about_ambiguous_bases): Likewise. * rtti.c (build_dynamic_cast_1): Likewise. * tree.c (maybe_dummy_object): Likewise. * typeck2.c (error_not_base_type, build_scoped_ref, build_m_component_ref): Likewise. * cvt.c (cp_convert_to_pointer, convert_to_pointer_force, build_up_reference): Likewise. * call.c (build_over_call): Likewise. (build_conditional_expr_1): Check arg2 and arg3 for error_mark_node before the valid_operands label. * cp-tree.h: Update. /testsuite 2012-08-08 Paolo Carlini paolo.carl...@oracle.com PR c++/54191 * g++.dg/cpp0x/sfinae39.C: New. Index: testsuite/g++.dg/cpp0x/sfinae39.C === --- testsuite/g++.dg/cpp0x/sfinae39.C (revision 0) +++ testsuite/g++.dg/cpp0x/sfinae39.C (revision 0) @@ -0,0 +1,98 @@ +// PR c++/54191 +// { dg-do compile { target c++11 } } + +struct B +{}; + +struct D + : private B +{}; + +templatetypename T +T declval(); + + +templatetypename From, typename To, + typename = decltype(static_castTo(declvalFrom())) +constexpr bool test_static_cast(int) +{ return true; } + +templatetypename, typename +constexpr bool test_static_cast(bool) +{ return false; } + +static_assert(!test_static_castB , D (0), ); +static_assert(!test_static_castB *, D *(0), ); + + +templatetypename From, typename To, + typename = decltype(dynamic_castTo(declvalFrom())) +constexpr bool test_dynamic_cast(int) +{ return true; } + +templatetypename, typename +constexpr bool test_dynamic_cast(bool) +{ return false; } + +static_assert(!test_dynamic_castD , B (0), ); +static_assert(!test_dynamic_castD *, B *(0), ); + + +int B::*pm = 0; + +templatetypename T, typename = decltype(declvalT().*pm) +constexpr bool test_member_ptr_dot(int) +{ return true; } + +templatetypename +constexpr bool test_member_ptr_dot(bool) +{ return false; } + +static_assert(!test_member_ptr_dotD(0), ); + + +templatetypename T, typename = decltype(declvalT()-*pm) +constexpr bool test_member_ptr_arrow(int) +{ return true; } + +templatetypename +constexpr bool test_member_ptr_arrow(bool) +{ return false; } + +static_assert(!test_member_ptr_arrowD *(0), ); + + +templatetypename T, typename U, + typename = decltype(declvalT() declvalU()) +constexpr bool test_rel_op(int) +{ return true; } + +templatetypename, typename +constexpr bool test_rel_op(bool) +{ return false; } + +static_assert(!test_rel_opD *, B *(0), ); + + +templatetypename T, typename U, + typename = decltype(declvalT() == declvalU()) +constexpr bool test_eq(int) +{ return true; } + +templatetypename, typename +constexpr bool test_eq(bool) +{ return false; } + +static_assert(!test_eqD *, B *(0), ); + + +templatetypename T, typename U, + typename = decltype(false ? declvalT() : declvalU()) +constexpr bool test_cond_op(int) +{ return true; } + +templatetypename, typename +constexpr bool test_cond_op(bool) +{ return false; } + +static_assert(!test_cond_opB *, D *(0), ); Index: cp/typeck.c === --- cp/typeck.c (revision 190231) +++ cp/typeck.c (working copy) @@ -2247,7 +2247,7 @@ build_class_member_access_expr (tree object, tree base_kind kind; binfo = lookup_base (access_path ? access_path : object_type, - member_scope, ba_unique, kind); + member_scope, ba_unique, kind, complain); if (binfo == error_mark_node) return error_mark_node; @@ -2630,7 +2630,8 @@ finish_class_member_access_expr (tree object, tree } /* Find the base of OBJECT_TYPE corresponding to
Re: Value type of map need not be default copyable
On 08/08/2012 03:39 PM, Paolo Carlini wrote: On 08/08/2012 03:15 PM, François Dumont wrote: I have also introduce a special std::pair constructor for container usage so that we do not have to include the whole tuple stuff just for associative container implementations. To be clear: sorry, this is not an option. Paolo. Then I can only imagine the attached patch which require to include tuple when including unordered_map or unordered_set. The std::pair(piecewise_construct_t, tuple, tuple) is the only constructor that allow to build a pair using the default constructor for the second member. In fact, adding declaration for std::make_tuple and std::forward_as_tuple could avoid to include tuple from unordered_set, there is no operator[] on unordered_set or unordered_multiset. But I am not sure it worth the effort, tell me. All unordered tests run under Linux x86_64, normal and debug modes. 2012-08-08 François Dumont fdum...@gcc.gnu.org Ollie Wild a...@google.com * include/bits/hashtable.h (_Hashtable::_M_insert_bucket): Replace by ... (_Hashtable::_M_insert_node): ... this, new. (_Hashtable::_M_insert(_Args, true_type)): Use latter. * include/bits/hashtable_policy.h (_Map_base::operator[]): Use latter, emplace the value_type rather than insert. * include/std/unordered_map: Include tuple. * include/std/unordered_set: Likewise. * testsuite/23_containers/unordered_map/operators/2.cc: New. François Index: include/std/unordered_map === --- include/std/unordered_map (revision 190209) +++ include/std/unordered_map (working copy) @@ -38,6 +38,7 @@ #include utility #include type_traits #include initializer_list +#include tuple #include bits/stl_algobase.h #include bits/allocator.h #include bits/stl_function.h // equal_to, _Identity, _Select1st Index: include/std/unordered_set === --- include/std/unordered_set (revision 190209) +++ include/std/unordered_set (working copy) @@ -38,6 +38,7 @@ #include utility #include type_traits #include initializer_list +#include tuple #include bits/stl_algobase.h #include bits/allocator.h #include bits/stl_function.h // equal_to, _Identity, _Select1st Index: include/bits/hashtable.h === --- include/bits/hashtable.h (revision 190209) +++ include/bits/hashtable.h (working copy) @@ -584,11 +584,11 @@ __node_base* _M_get_previous_node(size_type __bkt, __node_base* __n); - templatetypename _Arg - iterator - _M_insert_bucket(_Arg, size_type, __hash_code); + // Insert node in bucket bkt (assumes no element with its key already + // present). Take ownership of the node, deallocate it on exception. + iterator + _M_insert_node(size_type __bkt, __hash_code __code, __node_type* __n); - templatetypename... _Args std::pairiterator, bool _M_emplace(std::true_type, _Args... __args); @@ -1307,54 +1307,49 @@ } } - // Insert v in bucket n (assumes no element with its key already present). + // Insert node in bucket bkt (assumes no element with its key already + // present). Take ownership of the node, deallocate it on exception. templatetypename _Key, typename _Value, typename _Alloc, typename _ExtractKey, typename _Equal, typename _H1, typename _H2, typename _Hash, typename _RehashPolicy, typename _Traits -templatetypename _Arg - typename _Hashtable_Key, _Value, _Alloc, _ExtractKey, _Equal, - _H1, _H2, _Hash, _RehashPolicy, - _Traits::iterator - _Hashtable_Key, _Value, _Alloc, _ExtractKey, _Equal, - _H1, _H2, _Hash, _RehashPolicy, _Traits:: - _M_insert_bucket(_Arg __v, size_type __n, __hash_code __code) - { - const __rehash_state __saved_state = _M_rehash_policy._M_state(); - std::pairbool, std::size_t __do_rehash - = _M_rehash_policy._M_need_rehash(_M_bucket_count, - _M_element_count, 1); +typename _Hashtable_Key, _Value, _Alloc, _ExtractKey, _Equal, + _H1, _H2, _Hash, _RehashPolicy, + _Traits::iterator +_Hashtable_Key, _Value, _Alloc, _ExtractKey, _Equal, + _H1, _H2, _Hash, _RehashPolicy, _Traits:: +_M_insert_node(size_type __bkt, __hash_code __code, __node_type* __node) +{ + const __rehash_state __saved_state = _M_rehash_policy._M_state(); + std::pairbool, std::size_t __do_rehash + = _M_rehash_policy._M_need_rehash(_M_bucket_count, + _M_element_count, 1); - if (__do_rehash.first) - { - const key_type __k = this-_M_extract()(__v); - __n = __hash_code_base::_M_bucket_index(__k, __code, + if (__do_rehash.first) + { + const key_type __k = this-_M_extract()(__node-_M_v); + __bkt = __hash_code_base::_M_bucket_index(__k, __code, __do_rehash.second); - } + } - __node_type* __node = nullptr; - __try - { - // Allocate
Re: [cxx-conversion] Support garbage-collected C++ templates
On Wed, Aug 8, 2012 at 3:37 PM, Diego Novillo dnovi...@google.com wrote: On 12-08-08 16:12 , Gabriel Dos Reis wrote: hi Diego, just a word on style in the documentation: +templatetypename T +void gt_pch_nx (TPT *tp) +@{ + extern void gt_pch_nx (T); + + /* This marks field 'fld' of type 'T'. */ + gt_pch_nx (tp-fld); +@} 'extern' declaration at local scope if considered an extremely poor style in C++. Furthermore, it does not interact well with template instantiations and scope rules; plus it does not work well when the function actually has an internal linkage. A proper way to bring a symbol into local scope is through a using-declaration: using ::gt_pch_nx; I struggled a bit with this. I need to tell the template function that there exists a free function gt_pch_nx that takes a T. This is not something that any header file has at the point of this declaration. The function gt_pch_nx(T) is declared and defined later in one of the gt-*.h files. This is another problem with the way that gengtype works today. The using-declaration that you propose does not seem to give me what I want, though. How does it know that the function takes a T? So, if the issue that the function does not exist at the point of the template definition, but will definitely exist at the point where it is instantiated because of inclusion of a header file (later or in a different translation unit), then the usual way is to write no declaration at all. The compiler will perform a lookup at the point of instantiation and resolve the name appropriate, or error. you can just write: void gt_pch_nx (TPT *tp) { /* This marks field 'fld' of type 'T'. */ gt_pch_nx (tp-fld); } Am I understanding you correctly? -- Gaby
Re: [cxx-conversion] Support garbage-collected C++ templates
On Wed, Aug 8, 2012 at 4:47 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote: So, if the issue that the function does not exist at the point of the template definition, but will definitely exist at the point where it is instantiated because of inclusion of a header file (later or in a different translation unit), then the usual way is to write no declaration at all. The compiler will perform a lookup at the point of instantiation and resolve the name appropriate, or error. you can just write: void gt_pch_nx (TPT *tp) { /* This marks field 'fld' of type 'T'. */ gt_pch_nx (tp-fld); } Oh, but if I do that I get the error In file included from tree.h:29:0, from cp/tree.c:28: vec.h: In instantiation of 'void gt_ggc_mx(vec_tT*) [with T = tree_node*]': ./gt-cp-tree.h:571:59: required from here vec.h:180:5: error: no matching function for call to 'gt_ggc_mx(tree_node*)' gt_ggc_mx (v-vec[i]); ^ Because no declaration for gt_ggc_mx(tree_node *) exists at that point. It will exist in some other header file that's been generated by gengtype (which is included at the *end* of the .c file). Diego.
[SH] PR 50751
Hello, This patch fixes a minor issue related to the displacement addressing patterns, which leads to useless movt exts.* sequences and one of the predicates wrongly accepting non-mem ops. Tested on rev 190151 with make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} and no new failures. OK? Cheers, Oleg ChangeLog: PR target/50751 * config/sh/sh.md (*extendqisi2_compact_reg, *extendhisi2_compact_reg): Use arith_reg_operand predicate instead of register_operand. * config/sh/predicates.md (movsrc_no_disp_mem_operand): Accept only mem, simplify. Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 190151) +++ gcc/config/sh/sh.md (working copy) @@ -4819,14 +4819,14 @@ (define_insn *extendqisi2_compact_reg [(set (match_operand:SI 0 arith_reg_dest =r) - (sign_extend:SI (match_operand:QI 1 register_operand r)))] + (sign_extend:SI (match_operand:QI 1 arith_reg_operand r)))] TARGET_SH1 exts.b %1,%0 [(set_attr type arith)]) (define_insn *extendhisi2_compact_reg [(set (match_operand:SI 0 arith_reg_dest =r) - (sign_extend:SI (match_operand:HI 1 register_operand r)))] + (sign_extend:SI (match_operand:HI 1 arith_reg_operand r)))] TARGET_SH1 exts.w %1,%0 [(set_attr type arith)]) Index: gcc/config/sh/predicates.md === --- gcc/config/sh/predicates.md (revision 190151) +++ gcc/config/sh/predicates.md (working copy) @@ -428,28 +428,12 @@ return general_operand (op, mode); }) -;; Same as movsrc_operand, but rejects displacement addressing. +;; Returns 1 if OP is a MEM that does not use displacement addressing. (define_predicate movsrc_no_disp_mem_operand - (match_code subreg,reg,const_int,const_double,mem,symbol_ref,label_ref,const,const_vector) + (match_code mem) { - if (!general_movsrc_operand (op, mode)) -return 0; - - if ((mode == QImode || mode == HImode) - mode == GET_MODE (op) - (MEM_P (op) - || (GET_CODE (op) == SUBREG MEM_P (SUBREG_REG (op) -{ - rtx x = XEXP ((MEM_P (op) ? op : SUBREG_REG (op)), 0); - - if (GET_CODE (x) == PLUS - REG_P (XEXP (x, 0)) - CONST_INT_P (XEXP (x, 1))) - return 0; -} - - return 1; + return general_movsrc_operand (op, mode) satisfies_constraint_Snd (op); }) ;; Returns 1 if OP can be a destination of a move. Same as
Re: [cxx-conversion] Support garbage-collected C++ templates
On Wed, Aug 8, 2012 at 3:58 PM, Diego Novillo dnovi...@google.com wrote: On Wed, Aug 8, 2012 at 4:47 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote: So, if the issue that the function does not exist at the point of the template definition, but will definitely exist at the point where it is instantiated because of inclusion of a header file (later or in a different translation unit), then the usual way is to write no declaration at all. The compiler will perform a lookup at the point of instantiation and resolve the name appropriate, or error. you can just write: void gt_pch_nx (TPT *tp) { /* This marks field 'fld' of type 'T'. */ gt_pch_nx (tp-fld); } Oh, but if I do that I get the error In file included from tree.h:29:0, from cp/tree.c:28: vec.h: In instantiation of 'void gt_ggc_mx(vec_tT*) [with T = tree_node*]': ./gt-cp-tree.h:571:59: required from here vec.h:180:5: error: no matching function for call to 'gt_ggc_mx(tree_node*)' gt_ggc_mx (v-vec[i]); ^ Because no declaration for gt_ggc_mx(tree_node *) exists at that point. It will exist in some other header file that's been generated by gengtype (which is included at the *end* of the .c file). Aha, so it is an ordering issue, e.g. declarations being generated after they have been seen used in an instantiation. We might want to consider including the header file (that contains only the declarations of the marking functions) in the header files that contain the GTY-marked type definition. In this case, it would be included near the end of tree.h -- Gaby
Re: [cxx-conversion] Support garbage-collected C++ templates
On 12-08-08 17:25 , Gabriel Dos Reis wrote: Aha, so it is an ordering issue, e.g. declarations being generated after they have been seen used in an instantiation. We might want to consider including the header file (that contains only the declarations of the marking functions) in the header files that contain the GTY-marked type definition. In this case, it would be included near the end of tree.h Right. And that's the part of my plan that requires killing gengtype with fire first. When I started down that path, it became a very messy re-write, so I decided it was better to do it in stages. Diego.
Re: [cxx-conversion] Support garbage-collected C++ templates
On Wed, Aug 8, 2012 at 4:27 PM, Diego Novillo dnovi...@google.com wrote: On 12-08-08 17:25 , Gabriel Dos Reis wrote: Aha, so it is an ordering issue, e.g. declarations being generated after they have been seen used in an instantiation. We might want to consider including the header file (that contains only the declarations of the marking functions) in the header files that contain the GTY-marked type definition. In this case, it would be included near the end of tree.h Right. And that's the part of my plan that requires killing gengtype with fire first. When I started down that path, it became a very messy re-write, so I decided it was better to do it in stages. Aha, thanks. (I am slowly catching up.) -- Gaby
[SH] PR 51244 - Improve store of floating-point comparison
Hello, This patch mainly improves stores of negated/inverted floating point comparison results in regs and removes a useless zero-extension after storing the negated T bit in a reg. One thing that is annoying is the fact that the '-1' constant is emitted during combine and thus won't get any chance of being CSE-ed in any way. This results in multiple regs being loaded with the '-1' constant, although one would suffice. For integer comparisons, I've fixed this a while ago by loading the constant '-1' into a reg in the expander and emitting an insn that 'uses' that reg. However, this won't work for floating-point comparisons, because the cstoresf expander never sees the inversion. Is there a way to somehow merge pseudos that will be loaded with the same constant values before register allocation, possibly also lifting those constant loads outside of loops? Tested on rev 190151 with make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} and no new failures. OK? Cheers, Oleg ChangeLog: PR target/51244 * config/sh/sh.md: Add negc extu sequence peephole. (movrt, movnegt, movrt_negc, nott): Use t_reg_operand predicate. (*movrt_negc): New insn. * config/sh/sync.md (atomic_test_and_set): Pass gen_t_reg_rtx to gen_movnegt. * config/sh/sh.c (expand_cbranchsi4, sh_emit_scc_to_t, sh_emit_compare_and_branch, sh_emit_compare_and_set): Use get_t_reg_rtx. (sh_expand_t_scc): Pass gen_t_reg_rtx to gen_movnegt. testsuite/ChangeLog: PR target/51244 * gcc.target/sh/pr51244-5: New. * gcc.target/sh/pr51244-6: New. Index: gcc/testsuite/gcc.target/sh/pr51244-5.c === --- gcc/testsuite/gcc.target/sh/pr51244-5.c (revision 0) +++ gcc/testsuite/gcc.target/sh/pr51244-5.c (revision 0) @@ -0,0 +1,50 @@ +/* Check that no unnecessary sign or zero extension insn is generated after + a negc or movrt insn that stores the inverted T bit in a reg. */ +/* { dg-do compile { target sh*-*-* } } */ +/* { dg-options -O2 } */ +/* { dg-skip-if { sh*-*-* } { -m5* } { } } */ +/* { dg-final { scan-assembler-not extu|exts } } */ + +int +test_00 (int a, int b, int* c, short* d, int x) +{ + *d = x != 0; + *c = -1; + + if (x != 0) +return a 0; + + return 0; +} + +unsigned char +test_01 (int x) +{ + if (x 58 x 47) +return 1; + return 0; +} + +char +test_02 (int x) +{ + if (x 58 x 47) +return 1; + return 0; +} + +unsigned short +test_03 (int x) +{ + if (x 58 x 47) +return 1; + return 0; +} + +short +test_04 (int x) +{ + if (x 58 x 47) +return 1; + return 0; +} Index: gcc/testsuite/gcc.target/sh/pr51244-6.c === --- gcc/testsuite/gcc.target/sh/pr51244-6.c (revision 0) +++ gcc/testsuite/gcc.target/sh/pr51244-6.c (revision 0) @@ -0,0 +1,15 @@ +/* Check that no unnecessary sign or zero extension insn is generated after + a negc or movrt insn that stores the inverted T bit in a reg. */ +/* { dg-do compile { target sh*-*-* } } */ +/* { dg-options -O1 } */ +/* { dg-skip-if { sh*-*-* } { -m1 -m2 -m3 -m4al *nofpu -m4-340* -m4-400* -m4-500* -m5* } { } } */ +/* { dg-final { scan-assembler-not extu|exts } } */ + +float +test_00 (float q[4], float m[9]) +{ + float s0 = m[0] + m[1]; + float s1 = m[0] - m[1]; + + return q[s0 s1 ? 0 : 1]; +} Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 189953) +++ gcc/config/sh/sh.md (working copy) @@ -9515,7 +9515,7 @@ (define_insn movrt [(set (match_operand:SI 0 arith_reg_dest =r) - (xor:SI (reg:SI T_REG) (const_int 1)))] + (xor:SI (match_operand:SI 1 t_reg_operand ) (const_int 1)))] TARGET_SH2A movrt %0 [(set_attr type arith)]) @@ -9668,28 +9668,66 @@ (define_expand movnegt [(set (match_operand:SI 0 arith_reg_dest ) - (xor:SI (reg:SI T_REG) (const_int 1)))] - + (xor:SI (match_operand:SI 1 t_reg_operand ) (const_int 1)))] + TARGET_SH1 { if (TARGET_SH2A) -emit_insn (gen_movrt (operands[0])); +emit_insn (gen_movrt (operands[0], operands[1])); else { rtx val = force_reg (SImode, gen_int_mode (-1, SImode)); - emit_insn (gen_movrt_negc (operands[0], val)); + emit_insn (gen_movrt_negc (operands[0], operands[1], val)); } DONE; }) (define_insn movrt_negc [(set (match_operand:SI 0 arith_reg_dest =r) - (xor:SI (reg:SI T_REG) (const_int 1))) + (xor:SI (match_operand:SI 1 t_reg_operand ) (const_int 1))) (set (reg:SI T_REG) (const_int 1)) - (use (match_operand:SI 1 arith_reg_operand r))] + (use (match_operand:SI 2 arith_reg_operand r))] TARGET_SH1 - negc %1,%0 + negc %2,%0 [(set_attr type arith)]) +;; The -1 constant will not be CSE-ed for the *movrt_negc pattern, but the +;; pattern can be used by the combine pass. Using a
[google/gcc-4_7] XFAIL map element_access test
As previously discussed, this patch XFAIL's the new libstdc++ failure caused by http://gcc.gnu.org/viewcvs?revision=190129view=revision. It will be reverted once the issues discussed at http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00141.html have been resolved. Okay to submit to google/gcc-4_7? Ollie 2012-08-08 Ollie Wild a...@google.com * testsuite-management/powerpc-grtev3-linux-gnu.xfail: xfail 23_containers/map/element_access/2.cc from libstdc++. * testsuite-management/x86_64-grtev3-linux-gnu.xfail: Ditto. commit 0e6b70f28b33c1c017afb7374fc724cd61b62745 Author: Ollie Wild a...@google.com Date: Wed Aug 8 16:48:19 2012 -0500 XFail 23_containers/map/element_access/2.cc compilation from libstedc++. To be reverted once issues at http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00141.html are resolved. diff --git a/contrib/ChangeLog.google-4_7 b/contrib/ChangeLog.google-4_7 index 4217fc8..c1664f9 100644 --- a/contrib/ChangeLog.google-4_7 +++ b/contrib/ChangeLog.google-4_7 @@ -1,3 +1,9 @@ +2012-08-08 Ollie Wild a...@google.com + + * testsuite-management/powerpc-grtev3-linux-gnu.xfail: xfail + 23_containers/map/element_access/2.cc from libstdc++. + * testsuite-management/x86_64-grtev3-linux-gnu.xfail: Ditto. + 2012-08-06 Simon Baldwin sim...@google.com Cherry pick r190180 from google/main. diff --git a/contrib/testsuite-management/powerpc-grtev3-linux-gnu.xfail b/contrib/testsuite-management/powerpc-grtev3-linux-gnu.xfail index 14cd6c5..45752f8 100644 --- a/contrib/testsuite-management/powerpc-grtev3-linux-gnu.xfail +++ b/contrib/testsuite-management/powerpc-grtev3-linux-gnu.xfail @@ -170,3 +170,8 @@ FAIL: gcc.target/powerpc/pr46728-3.c scan-assembler-not pow FAIL: gcc.target/powerpc/pr46728-4.c scan-assembler-not pow FAIL: gcc.target/powerpc/pr46728-7.c scan-assembler-not pow FAIL: gcc.target/powerpc/pr46728-8.c scan-assembler-not pow + +# See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00141.html. Revert once +# that is resolved. +UNRESOLVED: 23_containers/map/element_access/2.cc compilation failed to produce executable +FAIL: 23_containers/map/element_access/2.cc (test for excess errors) diff --git a/contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail b/contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail index 38ab201..4fa47ec 100644 --- a/contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail +++ b/contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail @@ -63,3 +63,8 @@ flaky | FAIL: libmudflap.cth/pass40-frag.c output pattern test flaky | FAIL: libmudflap.cth/pass40-frag.c (-O2) execution test flaky | FAIL: libmudflap.cth/pass39-frag.c (-O3) (rerun 10) execution test flaky | FAIL: libgomp.graphite/force-parallel-6.c execution test + +# See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00141.html. Revert once +# that is resolved. +UNRESOLVED: 23_containers/map/element_access/2.cc compilation failed to produce executable +FAIL: 23_containers/map/element_access/2.cc (test for excess errors)
Re: [google/gcc-4_7] XFAIL map element_access test
On 12-08-08 17:52 , Ollie Wild wrote: As previously discussed, this patch XFAIL's the new libstdc++ failure caused by http://gcc.gnu.org/viewcvs?revision=190129view=revision. It will be reverted once the issues discussed at http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00141.html have been resolved. Okay to submit to google/gcc-4_7? Ollie 2012-08-08 Ollie Wild a...@google.com * testsuite-management/powerpc-grtev3-linux-gnu.xfail: xfail 23_containers/map/element_access/2.cc from libstdc++. * testsuite-management/x86_64-grtev3-linux-gnu.xfail: Ditto. OK. Diego.
Re: Beyond Complex Register Management
On Aug 8, 2012, at 11:16 AM, Nathan Froyd wrote: On Wed, Aug 08, 2012 at 10:52:28AM -0700, Mike Stump wrote: As we move to C++, I'd love for port maintainers to be able to get together and hoist _up_ code from the port so other ports can use it and thus, have more sharing. We make heavily stylized uses, which could be wrapped into a prettier api, given a reasonably powerful language. C++ might be powerful enough. I'd love to see someone improve the FIXED_REGISTERS, REGISTER_NAMES, REG_CLASS_CONTENTS, REGNO_REG_CLASS, REGNO_GLOBAL_BASE_REG_P, CLASS_MAX_NREGS experience, as the current api sucks; for example. I dread the slightest change to any of my registers, as the change ripples throughout a ton of disparate places, and it is currently too easy to forget one of them. Bonus points for cleaning up TARGET_SECONDARY_RELOAD at the same time as the above. I find that api sucks in ways I can't begin to describe. Would a registers.md for defining registers and register classes (and maybe things like FIXED_REGISTERS and the like) be a good start for you, or do you want something more developed first? I think I'd rather have a nice general purpose C++ library that has a nice api that can generate code. Longer term, it'd be nice to refactor all the gen*.c code to use the library. The problem with the current gen*.c scheme is there is no flexibility to do anything that wasn't preconceived. So, for example, mode iterators had to be added, a user of the library, can't just slot in 20 lines and have them. I had mode iterators in a port I did, a long time before we had them in gcc, but, since there wasn't a way to slot them in, that meant no one had them for years and years. Sad. I can't just slot in 100 lines to 'solve' the register issue. I'd like, not a solution to the register problem, but, rather a library, that is compete enough, to let me solve the register problem. The register case, maybe is too trivial to show what I mean, for example, here is 113 lines that solves a part of the problem. Not a big deal, but, we don't have an easy enough way to do this and a culture that encourages innovative solutions. 20 years later, and the same old gross api still is the only way to do it. Consider the below code. Notice that I can do, pretty much anything I need with it. At first, it didn't generate FIXED_REGISTERS, then, a few trivial lines later, it did. The it didn't handle aliases, then a few lines later, then it did. The bits are trivial and obvious for anyone to add. The api is nice and flexible and can sport many different ways to doing things, and, even if all of them are compete failures, one can _always_ layer _anything_ they want on top of the trivial interface and do anything they want to solve the problem they face. Notice the addition of regs, you can now do an entire register file with 1 line. My feeling is that we should have a recommend best practice. Those ports that don't, or didn't follow the best practice, are larger; those that do, are smaller. In this case, the best practice, if regs is it, is 1 line. You can add aliases, easy to use register names for parameters, with those being preferred, the usual names for the rest of them and so on. Also, if all the ports switched to it, then one can actually change FIXED_REGISTERS to be completely different, merely by changing the implementation of the api, once. One can move a register around, thus renumbering all the register, and reordering all the bits in the bits array, without any carefulness or fear. One can add a new register, without fear of missing anything, anywhere (within the scope of the generated code). If one wanted to add support for MODE_OK, or NREGS, or CANNOT_CONVERT, again, trivial to do, and possible to do in an incremental fashion. Notice how I added aliases, while still retaining compatibility with existing users of the api. Want to add FRAME_POINTER_REGNUM support, easy to do, just invent an nice api you like, and bolt it in. Want to revamp an api, easy put them in a namespace and put in a using regs_v1, newer code can do a using reg_v2. Eventually regs_v1 can be deleted, or not, or even moved into the last port file that uses it. Now, imagine a nice pretty api for secondary_reload. :-) Or vectors, or move instructions, or legitimate or legitimize... or imagine what: gen_const_int_pred(1, 4, 8, 14, 22, 32, 48, 64); could do. (Hint, generate predicates with stylized names for all of the given N-bit constants.) Much nicer, less typing, easier to maintain... We know exactly how to write the above, but only in a real language, C++ should be sufficient. genpred.c will _never_ and can never do this, no matter that it is what, five lines of code, given the stuff on which to build this on top of. This is _why_ the gen*.c things are the wrong architecture. #include stdio.h #include strings.h enum { BITS_PER = 64 }; enum
[PATCH, libjava] Use accessor functions to manipulate xmlOutputBuffer
Hello, This is a fix to prepare the xmlj_io.c file of gnu classpath to a coming API change in libxml2. Basically, we were previously accessing fields inside the xmlOutputBuffer struct of libxml2. In a coming version of libxml2, that won't be possible anymore. Client code will have to use accessor functions instead. For the gory details, there is an interestin note of Daniel Veillard (author of libxml2) at https://mail.gnome.org/archives/desktop-devel-list/2012-August/msg7.html. This patch defines too accessor macros that, depending on the version of libxml2 we are using will either access the fields of xmlOutputBuffer directly, or use the new accessor function. Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk. OK to commit? libjava/classpath/ * native/jni/xmlj/xmlj_io.c (GET_XML_OUTPUT_BUFFER_CONTENT) (GET_XML_OUTPUT_BUFFER_SIZE): New macros. (xmljOutputWriteCallback): Use them. --- libjava/classpath/native/jni/xmlj/xmlj_io.c | 20 +--- 1 files changed, 17 insertions(+), 3 deletions(-) diff --git a/libjava/classpath/native/jni/xmlj/xmlj_io.c b/libjava/classpath/native/jni/xmlj/xmlj_io.c index aa2964d..a55e48d 100644 --- a/libjava/classpath/native/jni/xmlj/xmlj_io.c +++ b/libjava/classpath/native/jni/xmlj/xmlj_io.c @@ -102,6 +102,19 @@ xmljFreeOutputStreamContext (OutputStreamContext * outContext); xmlCharEncoding xmljDetectCharEncoding (JNIEnv * env, jbyteArray buffer); + +#ifdef LIBXML2_NEW_BUFFER +#define GET_XML_OUTPUT_BUFFER_CONTENT(buf) (gchar *) \ + (char *) xmlOutputBufferGetContent(buf) +#define GET_XML_OUTPUT_BUFFER_SIZE(buf) \ + xmlOutputBufferGetSize(buf) +#else +#define GET_XML_OUTPUT_BUFFER_CONTENT(buf) \ + (buf)-buffer-content +#define GET_XML_OUTPUT_BUFFER_SIZE(buf) \ + (buf)-buffer-use +#endif + int xmljOutputWriteCallback (void *context, const char *buffer, int len) { @@ -752,9 +765,10 @@ xmljLoadExternalEntity (const char *URL, const char *ID, inputStream-directory = NULL; inputStream-buf = inputBuffer; - inputStream-base = inputStream-buf-buffer-content; - inputStream-cur = inputStream-buf-buffer-content; - inputStream-end = inputStream-base[inputStream-buf-buffer-use]; + inputStream-base = GET_XML_OUTPUT_BUFFER_CONTENT (inputStream-buf); + inputStream-cur = GET_XML_OUTPUT_BUFFER_CONTENT (inputStream-buf); + inputStream-end = + inputStream-base[GET_XML_OUTPUT_BUFFER_SIZE (inputStream-buf)]; if ((ctxt-directory == NULL) (inputStream-directory != NULL)) ctxt-directory = (char *) xmlStrdup ((const xmlChar *) inputStream-directory); -- Dodji
Re: [SH] PR 39423
On Mon, 2012-07-30 at 08:28 -0700, Richard Henderson wrote: On 2012-07-29 15:56, Oleg Endo wrote: + can_create_pseudo_p () + [(set (match_dup 5) (ashift:SI (match_dup 1) (match_dup 2))) + (set (match_dup 6) (plus:SI (match_dup 5) (match_dup 3))) + (set (match_dup 0) (mem:SI (plus:SI (match_dup 6) (match_dup 4] Don't create new mems like this -- you've lost alias info. You need to use replace_equiv_address or something on the original memory. Which means you have to actually capture the memory operand somehow. Better to use a custom predicate to match these memories with these complex addresses, rather than list them out each time: [(set (match_operand:SI 0 arith_reg_dest =r) (match_operand:SI 1 mem_index_disp_operand m))] How about the attached patch? Is that way of dealing with the mems OK? What could be a possible test case for the alias info issue? Tested on rev 190151 with make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} and no new failures. Cheers, Oleg ChangeLog: PR target/39423 * config/sh/predicates.md (mem_index_disp_operand): New predicate. * config/sh/sh.md (*movsi_index_disp): Rewrite insns to use the new mem_index_disp_operand predicate. testsuite/ChangeLog: PR target/39423 * gcc.target/sh/pr39423-1.c: New. Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 190151) +++ gcc/config/sh/sh.md (working copy) @@ -5119,114 +5119,116 @@ ;; FIXME: Combine never tries this kind of patterns for DImode. (define_insn_and_split *movsi_index_disp [(set (match_operand:SI 0 arith_reg_dest =r) - (mem:SI - (plus:SI - (plus:SI (mult:SI (match_operand:SI 1 arith_reg_operand r) - (match_operand:SI 2 const_int_operand)) - (match_operand:SI 3 arith_reg_operand r)) - (match_operand:SI 4 const_int_operand] - TARGET_SH1 sh_legitimate_index_p (SImode, operands[4], TARGET_SH2A, true) -exact_log2 (INTVAL (operands[2])) 0 + (match_operand:SI 1 mem_index_disp_operand m))] + TARGET_SH1 # can_create_pseudo_p () [(set (match_dup 5) (ashift:SI (match_dup 1) (match_dup 2))) (set (match_dup 6) (plus:SI (match_dup 5) (match_dup 3))) - (set (match_dup 0) (mem:SI (plus:SI (match_dup 6) (match_dup 4] + (set (match_dup 0) (match_dup 7))] { + rtx mem = operands[1]; + rtx plus0_rtx = XEXP (mem, 0); + rtx plus1_rtx = XEXP (plus0_rtx, 0); + rtx mult_rtx = XEXP (plus1_rtx, 0); + + operands[1] = XEXP (mult_rtx, 0); + operands[2] = GEN_INT (exact_log2 (INTVAL (XEXP (mult_rtx, 1; + operands[3] = XEXP (plus1_rtx, 1); + operands[4] = XEXP (plus0_rtx, 1); operands[5] = gen_reg_rtx (SImode); operands[6] = gen_reg_rtx (SImode); - operands[2] = GEN_INT (exact_log2 (INTVAL (operands[2]))); + operands[7] = +replace_equiv_address (mem, + gen_rtx_PLUS (SImode, operands[6], operands[4])); }) (define_insn_and_split *movhi_index_disp [(set (match_operand:SI 0 arith_reg_dest =r) - (sign_extend:SI - (mem:HI - (plus:SI - (plus:SI (mult:SI (match_operand:SI 1 arith_reg_operand r) -(match_operand:SI 2 const_int_operand)) - (match_operand:SI 3 arith_reg_operand r)) - (match_operand:SI 4 const_int_operand)] - TARGET_SH1 sh_legitimate_index_p (HImode, operands[4], TARGET_SH2A, true) -exact_log2 (INTVAL (operands[2])) 0 + (sign_extend:SI (match_operand:HI 1 mem_index_disp_operand m)))] + TARGET_SH1 # can_create_pseudo_p () [(set (match_dup 5) (ashift:SI (match_dup 1) (match_dup 2))) (set (match_dup 6) (plus:SI (match_dup 5) (match_dup 3))) - (set (match_dup 0) - (sign_extend:SI (mem:HI (plus:SI (match_dup 6) (match_dup 4)] + (set (match_dup 0) (sign_extend:SI (match_dup 7)))] { + rtx mem = operands[1]; + rtx plus0_rtx = XEXP (mem, 0); + rtx plus1_rtx = XEXP (plus0_rtx, 0); + rtx mult_rtx = XEXP (plus1_rtx, 0); + + operands[1] = XEXP (mult_rtx, 0); + operands[2] = GEN_INT (exact_log2 (INTVAL (XEXP (mult_rtx, 1; + operands[3] = XEXP (plus1_rtx, 1); + operands[4] = XEXP (plus0_rtx, 1); operands[5] = gen_reg_rtx (SImode); operands[6] = gen_reg_rtx (SImode); - operands[2] = GEN_INT (exact_log2 (INTVAL (operands[2]))); + operands[7] = +replace_equiv_address (mem, + gen_rtx_PLUS (SImode, operands[6], operands[4])); }) -(define_insn_and_split *movhi_index_disp - [(set (match_operand:SI 0 arith_reg_dest =r) - (zero_extend:SI - (mem:HI - (plus:SI - (plus:SI (mult:SI (match_operand:SI 1 arith_reg_operand r) -(match_operand:SI 2 const_int_operand)) - (match_operand:SI 3 arith_reg_operand r)) - (match_operand:SI 4 const_int_operand)] - TARGET_SH1 sh_legitimate_index_p (HImode, operands[4], TARGET_SH2A, true) -exact_log2 (INTVAL (operands[2])) 0 - # - can_create_pseudo_p () - [(set (match_dup 5)
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On 8/7/12, Mike Stump mikest...@comcast.net wrote: On Aug 7, 2012, at 11:42 AM, Lawrence Crowl wrote: On 8/7/12, Mike Stump mikest...@comcast.net wrote: On Aug 6, 2012, at 5:35 PM, Lawrence Crowl wrote: Convert double_int from a struct with function into a class with operators and methods. We have a wide_int class that replaces this class. :-( Really? Where? I don't see neither definition nor use under gcc. It is being developed; it is largely done, but is going through final reviews now. When we finish with those review bits, we'll send it out. It sounds like you'll beat us to the merge window, so, we'll cope with the fallout. Can I have a pointer to the interface? It would have been better to just convert it. That depends on how easy it is to migrate. In the process I learned that double_int isn't actually a double int, but is a precision-controlled int. Well, I'd call it a pair of HOST_WIDE_INTs... I think calling it a precision-controlled int is a tad optimistic. For example, if want to control it to be 2048 bits, you'd find that your hopes are misplaced. :-) Yes, well, it's clearly all relying on the host being bigger than the target. -- Lawrence Crowl
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On 8/7/12, Richard Henderson r...@redhat.com wrote: On 08/06/2012 05:35 PM, Lawrence Crowl wrote: +inline double_int +double_int::operator ++ () +{ + *this + double_int_one; + return *this; +} + +inline double_int +double_int::operator -- () +{ + *this - double_int_one; + return *this; +} Surely unused results there? Typically yes, but that is the interface for that operator. -- Lawrence Crowl
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On 8/7/12, Mike Stump mikest...@comcast.net wrote: On Aug 7, 2012, at 11:38 AM, Lawrence Crowl wrote: Hm. There seems to be significant opinion that there should not be any implicit conversions. I am okay with operations as above, but would like to hear the opinions of others. If there is an agreed upon and expected semantic, having them are useful. In the wide-int world, which replaces double_int, I think there is an agreeable semantic and I think it is useful, so, I think we should plan on having them, though, I'd be fine with punting their implementation until such time as someone needs it. If no one every needs the routine, I don't see the harm in not implementing it. At present, there are no functions equivalent to (double_int + int), so there can be no expressions that need this overload. I have no objection to adding such an overload, but if there are no objections, I would rather do it as a separate patch. -- Lawrence Crowl
Re: [PATCH] Strength reduction part 3 of 4: candidates with unknown strides
On Wed, Aug 1, 2012 at 10:36 AM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: Greetings, Thanks for the review of part 2! Here's another chunk of the SLSR code (I feel I owe you a few beers at this point). This performs analysis and replacement on groups of related candidates having an SSA name (rather than a constant) for a stride. This leaves only the conditional increment (CAND_PHI) case, which will be handled in the last patch of the series. Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new regressions. Ok for trunk? Thanks, Bill gcc: 2012-08-01 Bill Schmidt wschm...@linux.ibm.com * gimple-ssa-strength-reduction.c (struct incr_info_d): New struct. (incr_vec): New static var. (incr_vec_len): Likewise. (address_arithmetic_p): Likewise. (stmt_cost): Remove dead assignment. (dump_incr_vec): New function. (cand_abs_increment): Likewise. (lazy_create_slsr_reg): Likewise. (incr_vec_index): Likewise. (count_candidates): Likewise. (record_increment): Likewise. (record_increments): Likewise. (unreplaced_cand_in_tree): Likewise. (optimize_cands_for_speed_p): Likewise. (lowest_cost_path): Likewise. (total_savings): Likewise. (analyze_increments): Likewise. (ncd_for_two_cands): Likewise. (nearest_common_dominator_for_cands): Likewise. (profitable_increment_p): Likewise. (insert_initializers): Likewise. (introduce_cast_before_cand): Likewise. (replace_rhs_if_not_dup): Likewise. (replace_one_candidate): Likewise. (replace_profitable_candidates): Likewise. (analyze_candidates_and_replace): Handle candidates with SSA-name strides. gcc/testsuite: 2012-08-01 Bill Schmidt wschm...@linux.ibm.com * gcc.dg/tree-ssa/slsr-5.c: New. * gcc.dg/tree-ssa/slsr-6.c: New. * gcc.dg/tree-ssa/slsr-7.c: New. * gcc.dg/tree-ssa/slsr-8.c: New. * gcc.dg/tree-ssa/slsr-9.c: New. * gcc.dg/tree-ssa/slsr-10.c: New. * gcc.dg/tree-ssa/slsr-11.c: New. * gcc.dg/tree-ssa/slsr-12.c: New. * gcc.dg/tree-ssa/slsr-13.c: New. * gcc.dg/tree-ssa/slsr-14.c: New. * gcc.dg/tree-ssa/slsr-15.c: New. * gcc.dg/tree-ssa/slsr-16.c: New. * gcc.dg/tree-ssa/slsr-17.c: New. * gcc.dg/tree-ssa/slsr-18.c: New. * gcc.dg/tree-ssa/slsr-19.c: New. * gcc.dg/tree-ssa/slsr-20.c: New. * gcc.dg/tree-ssa/slsr-21.c: New. * gcc.dg/tree-ssa/slsr-22.c: New. * gcc.dg/tree-ssa/slsr-23.c: New. * gcc.dg/tree-ssa/slsr-24.c: New. * gcc.dg/tree-ssa/slsr-25.c: New. * gcc.dg/tree-ssa/slsr-26.c: New. * gcc.dg/tree-ssa/slsr-30.c: New. * gcc.dg/tree-ssa/slsr-31.c: New. == --- gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0) @@ -0,0 +1,25 @@ +/* Verify straight-line strength reduction fails for simple integer addition + with casts thrown in when -fwrapv is used. */ + +/* { dg-do compile } */ +/* { dg-options -O3 -fdump-tree-dom2 -fwrapv } */ +/* { dg-skip-if { ilp32 } { -m32 } { } } */ + This doesn't work on x32 nor Linux/ia32 since -m32 may not be needed for ILP32. This patch works for me. OK to install? Thanks. -- H.J. --- * gcc.dg/tree-ssa/slsr-30.c: Require non-ilp32. Remove dg-skip-if. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c b/gcc/testsuite/gcc.dg/tree -ssa/slsr-30.c index fbd6897..7921f43 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c @@ -1,9 +1,8 @@ /* Verify straight-line strength reduction fails for simple integer addition with casts thrown in when -fwrapv is used. */ -/* { dg-do compile } */ +/* { dg-do compile { target { ! { ilp32 } } } } */ /* { dg-options -O3 -fdump-tree-dom2 -fwrapv } */ -/* { dg-skip-if { ilp32 } { -m32 } { } } */ long f (int s, long c)
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On 8/8/12, Richard Guenther richard.guent...@gmail.com wrote: On Aug 7, 2012 Lawrence Crowl cr...@google.com wrote: We should probably think about naming conventions for mutating operations, as I expect we will want them eventually. Right. In the end I would prefer explicit constructors. I don't think we're thinking about the same thing. I'm talking about member functions like mystring.append (foo). The += operator is mutating as well. Constructors do not mutate, they create. -- Lawrence Crowl
Re: [PATCH] Strength reduction part 3 of 4: candidates with unknown strides
On Wed, Aug 8, 2012 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, Aug 1, 2012 at 10:36 AM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: Greetings, Thanks for the review of part 2! Here's another chunk of the SLSR code (I feel I owe you a few beers at this point). This performs analysis and replacement on groups of related candidates having an SSA name (rather than a constant) for a stride. This leaves only the conditional increment (CAND_PHI) case, which will be handled in the last patch of the series. Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new regressions. Ok for trunk? Thanks, Bill gcc: 2012-08-01 Bill Schmidt wschm...@linux.ibm.com * gimple-ssa-strength-reduction.c (struct incr_info_d): New struct. (incr_vec): New static var. (incr_vec_len): Likewise. (address_arithmetic_p): Likewise. (stmt_cost): Remove dead assignment. (dump_incr_vec): New function. (cand_abs_increment): Likewise. (lazy_create_slsr_reg): Likewise. (incr_vec_index): Likewise. (count_candidates): Likewise. (record_increment): Likewise. (record_increments): Likewise. (unreplaced_cand_in_tree): Likewise. (optimize_cands_for_speed_p): Likewise. (lowest_cost_path): Likewise. (total_savings): Likewise. (analyze_increments): Likewise. (ncd_for_two_cands): Likewise. (nearest_common_dominator_for_cands): Likewise. (profitable_increment_p): Likewise. (insert_initializers): Likewise. (introduce_cast_before_cand): Likewise. (replace_rhs_if_not_dup): Likewise. (replace_one_candidate): Likewise. (replace_profitable_candidates): Likewise. (analyze_candidates_and_replace): Handle candidates with SSA-name strides. gcc/testsuite: 2012-08-01 Bill Schmidt wschm...@linux.ibm.com * gcc.dg/tree-ssa/slsr-5.c: New. * gcc.dg/tree-ssa/slsr-6.c: New. * gcc.dg/tree-ssa/slsr-7.c: New. * gcc.dg/tree-ssa/slsr-8.c: New. * gcc.dg/tree-ssa/slsr-9.c: New. * gcc.dg/tree-ssa/slsr-10.c: New. * gcc.dg/tree-ssa/slsr-11.c: New. * gcc.dg/tree-ssa/slsr-12.c: New. * gcc.dg/tree-ssa/slsr-13.c: New. * gcc.dg/tree-ssa/slsr-14.c: New. * gcc.dg/tree-ssa/slsr-15.c: New. * gcc.dg/tree-ssa/slsr-16.c: New. * gcc.dg/tree-ssa/slsr-17.c: New. * gcc.dg/tree-ssa/slsr-18.c: New. * gcc.dg/tree-ssa/slsr-19.c: New. * gcc.dg/tree-ssa/slsr-20.c: New. * gcc.dg/tree-ssa/slsr-21.c: New. * gcc.dg/tree-ssa/slsr-22.c: New. * gcc.dg/tree-ssa/slsr-23.c: New. * gcc.dg/tree-ssa/slsr-24.c: New. * gcc.dg/tree-ssa/slsr-25.c: New. * gcc.dg/tree-ssa/slsr-26.c: New. * gcc.dg/tree-ssa/slsr-30.c: New. * gcc.dg/tree-ssa/slsr-31.c: New. == --- gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0) @@ -0,0 +1,25 @@ +/* Verify straight-line strength reduction fails for simple integer addition + with casts thrown in when -fwrapv is used. */ + +/* { dg-do compile } */ +/* { dg-options -O3 -fdump-tree-dom2 -fwrapv } */ +/* { dg-skip-if { ilp32 } { -m32 } { } } */ + This doesn't work on x32 nor Linux/ia32 since -m32 may not be needed for ILP32. This patch works for me. OK to install? This also does not work for mips64 where the options are either -mabi=32 or -mabi=n32 for ILP32. HJL's patch looks correct. Thanks, Andrew Thanks. -- H.J. --- * gcc.dg/tree-ssa/slsr-30.c: Require non-ilp32. Remove dg-skip-if. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c b/gcc/testsuite/gcc.dg/tree -ssa/slsr-30.c index fbd6897..7921f43 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c @@ -1,9 +1,8 @@ /* Verify straight-line strength reduction fails for simple integer addition with casts thrown in when -fwrapv is used. */ -/* { dg-do compile } */ +/* { dg-do compile { target { ! { ilp32 } } } } */ /* { dg-options -O3 -fdump-tree-dom2 -fwrapv } */ -/* { dg-skip-if { ilp32 } { -m32 } { } } */ long f (int s, long c)
Re: Beyond Complex Register Management
On Aug 8, 2012, at 11:14 AM, DJ Delorie wrote: It's the weird addressing modes that confuse gcc. Ah, yes... That problem. Sigh, my port is nice and orthogonal and doesn't suffer in this area, so... no solution from me.
Re: [PATCH] Strength reduction part 3 of 4: candidates with unknown strides
On Aug 8, 2012, at 3:25 PM, H.J. Lu wrote: This doesn't work on x32 nor Linux/ia32 since -m32 may not be needed for ILP32. This patch works for me. OK to install? Ok.
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On 8/8/12, Richard Guenther richard.guent...@gmail.com wrote: On Aug 7, 2012 Lawrence Crowl cr...@google.com wrote: On 8/7/12, Richard Guenther richard.guent...@gmail.com wrote: For most parts overloads that take an (unsigned) HOST_WIDE_INT argument would be nice, as well as the ability to say dbl + 1. Hm. There seems to be significant opinion that there should not be any implicit conversions. I am okay with operations as above, but would like to hear the opinions of others. Well, I'd simply add double_int operator+(HOST_WIDE_INT); double_int operator+(unsigned HOST_WIDE_INT); and be done with it ;) Yes, a tad bit more inconvenient on the implementation side compared to a non-explicit constructor from HOST_WIDE_INT or a conversion operator ... but adhering to the rule that we do _not_ want such automatic conversions (well, yet, for the start). Ah, we have a different definition of implicit conversion. In my mind, the above overloads do an implicit conversion. The issue is that the calling code does not make the conversion obvious. Note the difference in a = b + i; a = b + double_int(i); Using overloads as above is generally safer than using an implicit converting constructor or an implicit conversion operator. If folks are really only objecting to the implementation technique, we have a somewhat different outcome. -- Lawrence Crowl
Re: [PATCH] Strength reduction part 3 of 4: candidates with unknown strides
On 08/08/2012 03:27 PM, Andrew Pinski wrote: On Wed, Aug 8, 2012 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, Aug 1, 2012 at 10:36 AM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: Greetings, Thanks for the review of part 2! Here's another chunk of the SLSR code (I feel I owe you a few beers at this point). This performs analysis and replacement on groups of related candidates having an SSA name (rather than a constant) for a stride. This leaves only the conditional increment (CAND_PHI) case, which will be handled in the last patch of the series. Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new regressions. Ok for trunk? Thanks, Bill gcc: 2012-08-01 Bill Schmidt wschm...@linux.ibm.com * gimple-ssa-strength-reduction.c (struct incr_info_d): New struct. (incr_vec): New static var. (incr_vec_len): Likewise. (address_arithmetic_p): Likewise. (stmt_cost): Remove dead assignment. (dump_incr_vec): New function. (cand_abs_increment): Likewise. (lazy_create_slsr_reg): Likewise. (incr_vec_index): Likewise. (count_candidates): Likewise. (record_increment): Likewise. (record_increments): Likewise. (unreplaced_cand_in_tree): Likewise. (optimize_cands_for_speed_p): Likewise. (lowest_cost_path): Likewise. (total_savings): Likewise. (analyze_increments): Likewise. (ncd_for_two_cands): Likewise. (nearest_common_dominator_for_cands): Likewise. (profitable_increment_p): Likewise. (insert_initializers): Likewise. (introduce_cast_before_cand): Likewise. (replace_rhs_if_not_dup): Likewise. (replace_one_candidate): Likewise. (replace_profitable_candidates): Likewise. (analyze_candidates_and_replace): Handle candidates with SSA-name strides. gcc/testsuite: 2012-08-01 Bill Schmidt wschm...@linux.ibm.com * gcc.dg/tree-ssa/slsr-5.c: New. * gcc.dg/tree-ssa/slsr-6.c: New. * gcc.dg/tree-ssa/slsr-7.c: New. * gcc.dg/tree-ssa/slsr-8.c: New. * gcc.dg/tree-ssa/slsr-9.c: New. * gcc.dg/tree-ssa/slsr-10.c: New. * gcc.dg/tree-ssa/slsr-11.c: New. * gcc.dg/tree-ssa/slsr-12.c: New. * gcc.dg/tree-ssa/slsr-13.c: New. * gcc.dg/tree-ssa/slsr-14.c: New. * gcc.dg/tree-ssa/slsr-15.c: New. * gcc.dg/tree-ssa/slsr-16.c: New. * gcc.dg/tree-ssa/slsr-17.c: New. * gcc.dg/tree-ssa/slsr-18.c: New. * gcc.dg/tree-ssa/slsr-19.c: New. * gcc.dg/tree-ssa/slsr-20.c: New. * gcc.dg/tree-ssa/slsr-21.c: New. * gcc.dg/tree-ssa/slsr-22.c: New. * gcc.dg/tree-ssa/slsr-23.c: New. * gcc.dg/tree-ssa/slsr-24.c: New. * gcc.dg/tree-ssa/slsr-25.c: New. * gcc.dg/tree-ssa/slsr-26.c: New. * gcc.dg/tree-ssa/slsr-30.c: New. * gcc.dg/tree-ssa/slsr-31.c: New. == --- gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0) @@ -0,0 +1,25 @@ +/* Verify straight-line strength reduction fails for simple integer addition + with casts thrown in when -fwrapv is used. */ + +/* { dg-do compile } */ +/* { dg-options -O3 -fdump-tree-dom2 -fwrapv } */ +/* { dg-skip-if { ilp32 } { -m32 } { } } */ + This doesn't work on x32 nor Linux/ia32 since -m32 may not be needed for ILP32. This patch works for me. OK to install? This also does not work for mips64 where the options are either -mabi=32 or -mabi=n32 for ILP32. HJL's patch looks correct. Thanks, Andrew There are GCC targets with 16-bit integers. What's the actual set of targets on which this test is meant to run? There's a list of effective-target names based on data type sizes in http://gcc.gnu.org/onlinedocs/gccint/Effective_002dTarget-Keywords.html#Effective_002dTarget-Keywords. Janis Thanks. -- H.J. --- * gcc.dg/tree-ssa/slsr-30.c: Require non-ilp32. Remove dg-skip-if. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c b/gcc/testsuite/gcc.dg/tree -ssa/slsr-30.c index fbd6897..7921f43 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c @@ -1,9 +1,8 @@ /* Verify straight-line strength reduction fails for simple integer addition with casts thrown in when -fwrapv is used. */ -/* { dg-do compile } */ +/* { dg-do compile { target { ! { ilp32 } } } } */ /* { dg-options -O3 -fdump-tree-dom2 -fwrapv } */ -/* { dg-skip-if { ilp32 } { -m32 } { } } */ long f (int s, long c)
Re: s390: rearrange temp moves in s390_expand_cs_hqi
Richard Henderson wrote: In the same vein as your CAS boolean output patch, if we rearrange the copies here we can get the combined compare-and-branch insn for the z10. I see that the z196 prefers not to use those, but the number of insns in that case remains the same, merely in a different order. Yes, on z196 the compare-and-branch insns are discouraged since they're cracked (i.e. split into two micro-ops, just the same as if we'd have two original insns, but in effect a bit worse for the pipeline). In any case, the patch certainly makes sense. Can you please test with --with-arch=z10? Tested with no regressions. * config/s390/s390.c (s390_expand_cs_hqi): Copy val to a temp before performing the compare for the restart loop. OK. Thanks, Ulrich -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On 8/8/12, Richard Guenther richard.guent...@gmail.com wrote: On Aug 8, 2012 Marc Glisse marc.gli...@inria.fr wrote: On Wed, 8 Aug 2012, Richard Guenther wrote: + static double_int make (unsigned HOST_WIDE_INT cst); + static double_int make (HOST_WIDE_INT cst); + static double_int make (unsigned int cst); + static double_int make (int cst); [...] Btw, if HOST_WIDE_INT == int the above won't even compile. Is that true of any host? I am not aware of any. Anyway, it is moot if we remove the overloads. Right. I'd simply rely on int / unsigned int promotion to HOST_WIDE_INT or unsigned HOST_WIDE_INT and only provide overloads for HOST_WIDE_INT kinds anyway. Sadly, that doesn't work with the current C++ rules (there is a proposal to change that for C++1y): void f(long); void f(unsigned long); void g(int x){f(x);} e.cc: In function ‘void g(int)’: e.cc:3:18: error: call of overloaded ‘f(int)’ is ambiguous e.cc:3:18: note: candidates are: e.cc:1:6: note: void f(long int) e.cc:2:6: note: void f(long unsigned int) Ick ... I forgot that integer promotions do not apply for int - long. So even f(1) would be ambiguous, right? So I suppose we have to bite the bullet and add overloads for all (unsigned) integer types from int to HOST_WIDE_INT (if HOST_WIDE_INT is long long then we have to add overloads for int, unsigned int, long and unsigned long). Or use template magic to force promotion to HOST_WIDE_INT for integer types smaller than HOST_WIDE_INT... I did not get to the structure I had accidentally. However, with the prior suggestion to preserve exact semantics to existing calls, it is all moot because we cannot have overloading. We can test whether the code is making double_ints with cross-sign ints by adding undefined overloads. I think we should do that and make all such crossings explicit. However, I want to wait for a separate patch. -- Lawrence Crowl
[google/gcc-4_7] Fix problems with -fdebug-types-section and local types
This patch is for the google/gcc-4_7 branch. It's a backport of an upstream patch at: http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00398.html If approved for trunk, is this OK for google/gcc-4_7? Google ref b/6705530. Original description: With --std=c++11, a template parameter can refer to a local type defined within a function. Because that local type doesn't qualify for its own type unit, we copy it as an unworthy type into the type unit that refers to it, but we copy too much, leading to a comdat type unit that contains a DIE with subprogram definitions rather than declarations. These DIEs may have DW_AT_low_pc/high_pc or DW_AT_ranges attributes, and consequently can refer to range list entries that don't get emitted because they're not marked when the compile unit is scanned, eventually causing an undefined symbol at link time. In addition, while debugging this problem, I found that the DW_AT_object_pointer attribute, when left in the skeletons that are left behind in the compile unit, causes duplicate copies of the types to be copied back into the compile unit. This patch fixes these problems by removing the DW_AT_object_pointer attribute from the skeleton left behind in the compile unit, and by copying DW_TAG_subprogram DIEs as declarations when copying unworthy types into a type unit. In order to preserve information in the DIE structure, I also added DW_AT_abstract_origin as an attribute that should be copied when cloning a DIE as a declaration. I also fixed the dwarf4-typedef.C test, which should be turning on the -fdebug-types-section flag. Bootstrapped and tested on x86. OK for trunk? -cary 2012-08-07 Cary Coutant ccout...@google.com gcc/ * dwarf2out.c (clone_as_declaration): Copy DW_AT_abstract_origin attribute. (generate_skeleton_bottom_up): Remove DW_AT_object_pointer attribute from original DIE. (clone_tree_hash): Rename to ... (clone_tree_partial): ... this; change callers. Copy DW_TAG_subprogram DIEs as declarations. gcc/testsuite/ * testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C: New test case. * testsuite/g++.dg/debug/dwarf2/dwarf4-typedef.C: Add -fdebug-types-section flag. Index: gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C === --- gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C (revision 0) +++ gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C (revision 0) @@ -0,0 +1,55 @@ +// { dg-do compile } +// { dg-options --std=c++11 -dA -gdwarf-4 -fdebug-types-section -fno-merge-debug-strings } + +// Check that -fdebug-types-sections does not copy a full referenced type +// into a type unit. + +// Checks that at least one type unit is generated. +// +// { dg-final { scan-assembler DIE \\(\[^\n\]*\\) DW_TAG_type_unit } } +// +// Check that func is declared exactly twice in the debug info: +// once in the type unit for struct D, and once in the compile unit. +// +// { dg-final { scan-assembler-times \\.ascii \func0\\[^\n\]*DW_AT_name 2 } } +// +// Check to make sure that no type unit contains a DIE with DW_AT_low_pc +// or DW_AT_ranges. These patterns assume that the compile unit is always +// emitted after all type units. +// +// { dg-final { scan-assembler-not \\.quad\[^\n\]*DW_AT_low_pc.*DIE \\(\[^\n\]*\\) DW_TAG_compile_unit } } +// { dg-final { scan-assembler-not \\.quad\[^\n\]*DW_AT_ranges.*DIE \\(\[^\n\]*\\) DW_TAG_compile_unit } } + +struct A { + A(); + virtual ~A(); + virtual void foo(); + private: + int data; +}; + +struct B { + B(); + virtual ~B(); +}; + +extern B* table[]; + +struct D { + template typename T + T* get(int i) + { +B* cell = table[i]; +if (cell == 0) + cell = new T(); +return static_castT*(cell); + } +}; + +void func(D* d) +{ + struct C : B { +A a; + }; + d-getC(0)-a.foo(); +} Index: gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-typedef.C === --- gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-typedef.C (revision 190220) +++ gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-typedef.C (working copy) @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -gdwarf-4 } */ +/* { dg-options -gdwarf-4 -fdebug-types-section } */ /* Regression test for an ICE in output_die when using -gdwarf-4. */ Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c (revision 190220) +++ gcc/dwarf2out.c (working copy) @@ -7378,6 +7378,7 @@ clone_as_declaration (dw_die_ref die) switch (a-dw_attr) { +case DW_AT_abstract_origin: case DW_AT_artificial: case DW_AT_containing_type: case DW_AT_external: @@ -7510,6 +7511,12 @@ generate_skeleton_bottom_up (skeleton_ch dw_die_ref clone = clone_die (c); move_all_children (c, clone); +/* If the original has a DW_AT_object_pointer attribute, +
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On 8/8/12, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Aug 8, 2012 Miles Bader mi...@gnu.org wrote: Richard Guenther richard.guent...@gmail.com writes: Constructors are allowed, but PODs are often passed more efficiently. That property seemed particularly important for double_int. Show us the difference in timing. Show us the generated code. I can't imagine that it could ever matter. I'm also curious about that statement... PODs don't really seem to offer much advantage with modern compilers, except in a few very specific cases (of which this doesn't seem to be one), e.g. in unions. They make a difference for the by-value passing ABI. double-ints can be passed in two registers on most platforms. Sure, but that doesn't seem to depend on PODness -- non-PODs can be passed in two registers as well, AFAICS... E.g., in the following: typedef long valtype; struct X { valtype x, y; }; struct Y { valtype x, y; Y (valtype a, valtype b) : x (a), y (b) { } }; extern void fx (X x); void test_x () {X x = { 1, 2 }; fx (x); } extern void fy (Y y); void test_y () {Y y (1, 2); fy (y); } test_x and test_y use exactly the same calling sequence (and contain exactly the same assembly code)... [on x86-64] It is not PODness in the standard sense that matters. It is podness from the ABI perspecitve: Y does not have a user-defined copy-constructor nor a desctructor; it is not a polymorphic type, so it is OK to pass in registers just like X. Well, more specifically, the Itanium ABI says if the copy constructor and the destructor are trivial, one passes the class in registers. Other ABIs can make other choices. Older ABIs often pass all structs and classes indirectly. The adding in inline versions of various special members into the following code results in the following worst-case changes. x86 x86_64 instructions: 48 - 69 = 144% 38 - 53 = 139% stack operations:6 - 10 = 167% 2 - 6 = 300% memory operations: 28 - 38 = 136% 22 - 28 = 127% These numbers are not huge, particularly in context, but worth considering. In any event, the point is moot. To implement the exact semantics of current expressions using shwi_to_double_int and uhwi_to_double_int, we cannot use constructors anyway. So, there remains no immediate reason to use any constructors. struct type { #ifdef DEFAULT type (); #endif #ifdef INITIAL type (int); #endif #ifdef COPYCON type (const type from) : field1 (from.field1), field2 (from.field2) { } #endif #ifdef DESTRUCT ~type () { field1 = 0; field2 = 0; } #endif type method (type); long int field1; long int field2; }; extern type global; void callee (type arg); void function_caller () { type local (global); callee (local); callee (global); } void method_caller (type arg1, type arg2) { type var1 (global); type var2 = arg2; arg1.method (arg2); var1.method (var2); } -- Lawrence Crowl
Re: [SH] PR 39423
On 08/08/2012 03:12 PM, Oleg Endo wrote: How about the attached patch? Is that way of dealing with the mems OK? What could be a possible test case for the alias info issue? That looks like the right sort of thing. A test case would have to be for a missed-optimization, where we failed to schedule one load before another. It could be very hard to generate one intentionally... r~
Re: [PATCH, libjava] Use accessor functions to manipulate xmlOutputBuffer
On Thu, Aug 09, 2012 at 12:08:20AM +0200, Dodji Seketeli wrote: Hello, This is a fix to prepare the xmlj_io.c file of gnu classpath to a coming API change in libxml2. Basically, we were previously accessing fields inside the xmlOutputBuffer struct of libxml2. In a coming version of libxml2, that won't be possible anymore. Client code will have to use accessor functions instead. For the gory details, there is an interestin note of Daniel Veillard (author of libxml2) at https://mail.gnome.org/archives/desktop-devel-list/2012-August/msg7.html. This patch defines too accessor macros that, depending on the version of libxml2 we are using will either access the fields of xmlOutputBuffer directly, or use the new accessor function. Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk. OK to commit? For the record, I rewieved the patch and it looks fine to me, thanks Dodji ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ dan...@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/
Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)
On 8/8/12, Miles Bader mi...@gnu.org wrote: Mike Stump mikest...@comcast.net writes: Constructors are allowed, but PODs are often passed more efficiently. That property seemed particularly important for double_int. Show us the difference in timing. Show us the generated code. I can't imagine that it could ever matter. I'm also curious about that statement... PODs don't really seem to offer much advantage with modern compilers, except in a few very specific cases (of which this doesn't seem to be one), e.g. in unions. Which brings up a critical point that double_int is used in trees, which are in a union, and so double_int must be a POD until trees are no longer unions. -- Lawrence Crowl
Re: [PATCH] Strength reduction part 3 of 4: candidates with unknown strides
On Wed, 2012-08-08 at 15:35 -0700, Janis Johnson wrote: On 08/08/2012 03:27 PM, Andrew Pinski wrote: On Wed, Aug 8, 2012 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, Aug 1, 2012 at 10:36 AM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: Greetings, Thanks for the review of part 2! Here's another chunk of the SLSR code (I feel I owe you a few beers at this point). This performs analysis and replacement on groups of related candidates having an SSA name (rather than a constant) for a stride. This leaves only the conditional increment (CAND_PHI) case, which will be handled in the last patch of the series. Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new regressions. Ok for trunk? Thanks, Bill gcc: 2012-08-01 Bill Schmidt wschm...@linux.ibm.com * gimple-ssa-strength-reduction.c (struct incr_info_d): New struct. (incr_vec): New static var. (incr_vec_len): Likewise. (address_arithmetic_p): Likewise. (stmt_cost): Remove dead assignment. (dump_incr_vec): New function. (cand_abs_increment): Likewise. (lazy_create_slsr_reg): Likewise. (incr_vec_index): Likewise. (count_candidates): Likewise. (record_increment): Likewise. (record_increments): Likewise. (unreplaced_cand_in_tree): Likewise. (optimize_cands_for_speed_p): Likewise. (lowest_cost_path): Likewise. (total_savings): Likewise. (analyze_increments): Likewise. (ncd_for_two_cands): Likewise. (nearest_common_dominator_for_cands): Likewise. (profitable_increment_p): Likewise. (insert_initializers): Likewise. (introduce_cast_before_cand): Likewise. (replace_rhs_if_not_dup): Likewise. (replace_one_candidate): Likewise. (replace_profitable_candidates): Likewise. (analyze_candidates_and_replace): Handle candidates with SSA-name strides. gcc/testsuite: 2012-08-01 Bill Schmidt wschm...@linux.ibm.com * gcc.dg/tree-ssa/slsr-5.c: New. * gcc.dg/tree-ssa/slsr-6.c: New. * gcc.dg/tree-ssa/slsr-7.c: New. * gcc.dg/tree-ssa/slsr-8.c: New. * gcc.dg/tree-ssa/slsr-9.c: New. * gcc.dg/tree-ssa/slsr-10.c: New. * gcc.dg/tree-ssa/slsr-11.c: New. * gcc.dg/tree-ssa/slsr-12.c: New. * gcc.dg/tree-ssa/slsr-13.c: New. * gcc.dg/tree-ssa/slsr-14.c: New. * gcc.dg/tree-ssa/slsr-15.c: New. * gcc.dg/tree-ssa/slsr-16.c: New. * gcc.dg/tree-ssa/slsr-17.c: New. * gcc.dg/tree-ssa/slsr-18.c: New. * gcc.dg/tree-ssa/slsr-19.c: New. * gcc.dg/tree-ssa/slsr-20.c: New. * gcc.dg/tree-ssa/slsr-21.c: New. * gcc.dg/tree-ssa/slsr-22.c: New. * gcc.dg/tree-ssa/slsr-23.c: New. * gcc.dg/tree-ssa/slsr-24.c: New. * gcc.dg/tree-ssa/slsr-25.c: New. * gcc.dg/tree-ssa/slsr-26.c: New. * gcc.dg/tree-ssa/slsr-30.c: New. * gcc.dg/tree-ssa/slsr-31.c: New. == --- gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0) @@ -0,0 +1,25 @@ +/* Verify straight-line strength reduction fails for simple integer addition + with casts thrown in when -fwrapv is used. */ + +/* { dg-do compile } */ +/* { dg-options -O3 -fdump-tree-dom2 -fwrapv } */ +/* { dg-skip-if { ilp32 } { -m32 } { } } */ + This doesn't work on x32 nor Linux/ia32 since -m32 may not be needed for ILP32. This patch works for me. OK to install? This also does not work for mips64 where the options are either -mabi=32 or -mabi=n32 for ILP32. HJL's patch looks correct. Thanks, Andrew There are GCC targets with 16-bit integers. What's the actual set of targets on which this test is meant to run? There's a list of effective-target names based on data type sizes in http://gcc.gnu.org/onlinedocs/gccint/Effective_002dTarget-Keywords.html#Effective_002dTarget-Keywords. Yes, sorry. The test really is only valid when int and long have different sizes. So according to that link we should skip ilp32 and llp64 at a minimum. It isn't clear what we should do for int16 since the size of long isn't specified, so I suppose we should skip that as well. So, perhaps modify HJ's patch to have /* { dg-do compile { target { ! { ilp32 llp64 int16 } } } } */ ? Thanks, Bill Janis Thanks. -- H.J. --- * gcc.dg/tree-ssa/slsr-30.c: Require non-ilp32. Remove dg-skip-if. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c b/gcc/testsuite/gcc.dg/tree -ssa/slsr-30.c index fbd6897..7921f43 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c +++
Re: [PATCH] Strength reduction part 3 of 4: candidates with unknown strides
On 08/08/2012 06:41 PM, William J. Schmidt wrote: On Wed, 2012-08-08 at 15:35 -0700, Janis Johnson wrote: On 08/08/2012 03:27 PM, Andrew Pinski wrote: On Wed, Aug 8, 2012 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, Aug 1, 2012 at 10:36 AM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: +/* { dg-do compile } */ +/* { dg-options -O3 -fdump-tree-dom2 -fwrapv } */ +/* { dg-skip-if { ilp32 } { -m32 } { } } */ + This doesn't work on x32 nor Linux/ia32 since -m32 may not be needed for ILP32. This patch works for me. OK to install? This also does not work for mips64 where the options are either -mabi=32 or -mabi=n32 for ILP32. HJL's patch looks correct. Thanks, Andrew There are GCC targets with 16-bit integers. What's the actual set of targets on which this test is meant to run? There's a list of effective-target names based on data type sizes in http://gcc.gnu.org/onlinedocs/gccint/Effective_002dTarget-Keywords.html#Effective_002dTarget-Keywords. Yes, sorry. The test really is only valid when int and long have different sizes. So according to that link we should skip ilp32 and llp64 at a minimum. It isn't clear what we should do for int16 since the size of long isn't specified, so I suppose we should skip that as well. So, perhaps modify HJ's patch to have /* { dg-do compile { target { ! { ilp32 llp64 int16 } } } } */ ? Thanks, Bill That's confusing. Perhaps what you really need is a new effective target for sizeof(int) != sizeof(long). Janis
Re: [google] Remove deprecated pfmon-specific functions/structs from pmu-profile.c (issue6442086)
I have committed this to google/main for Chris (approved by Rong). Chris, please prepare a patch to backport this to google/4_7. Teresa On Tue, Aug 7, 2012 at 3:55 PM, Chris Manghane cm...@google.com wrote: Removes references in libgcov.c to functions and structs removed from pmu-profile.c For google/main. Tested with crosstools and bootstrap. 2012-08-07 Chris Manghane cm...@google.com * libgcc/pmu-profile.c (enum pmu_tool_type): Remove pfmon-specific functions/structs. (enum pmu_event_type): Ditto. (enum pmu_state): Ditto. (enum cpu_vendor_signature): Ditto. (struct pmu_tool_info): Ditto. (void gcov_write_branch_mispredict_infos): Ditto. (get_x86cpu_vendor): Ditto. (parse_pmu_profile_options): Ditto. (start_addr2line_symbolizer): Ditto. (reset_symbolizer_parent_pipes): Ditto. (reset_symbolizer_child_pipes): Ditto. (end_addr2line_symbolizer): Ditto. (symbolize_addr2line): Ditto. (start_pfmon_module): Ditto. (convert_pct_to_unsigned): Ditto. (parse_load_latency_line): Ditto. (parse_branch_mispredict_line): Ditto. (parse_pfmon_load_latency): Ditto. (parse_pfmon_tool_header): Ditto. (parse_pfmon_branch_mispredicts): Ditto. (pmu_start): Ditto. (init_pmu_branch_mispredict): Ditto. (init_pmu_tool): Ditto. (__gcov_init_pmu_profiler): Ditto. (__gcov_start_pmu_profiler): Ditto. (__gcov_stop_pmu_profiler): Ditto. (gcov_write_branch_mispredict_line): Ditto. (gcov_write_load_latency_infos): Ditto. (gcov_write_branch_mispredict_infos): Ditto. (gcov_write_tool_header): Ditto. (__gcov_end_pmu_profiler): Ditto. * libgcc/libgcov.c (gcov_alloc_filename): Remove references to pfmon specific functions/structs. (pmu_profile_stop): Ditto. (gcov_exit): Ditto. (__gcov_init): Ditto. (__gcov_flush): Ditto. Index: libgcc/pmu-profile.c === --- libgcc/pmu-profile.c(revision 190135) +++ libgcc/pmu-profile.c(working copy) @@ -67,169 +67,11 @@ see the files COPYING3 and COPYING.RUNTIME respect #define XDELETEVEC(p) free(p) #define XDELETE(p) free(p) -#define PFMON_CMD /usr/bin/pfmon -#define ADDR2LINE_CMD /usr/bin/addr2line -#define PMU_TOOL_MAX_ARGS (20) -static char default_addr2line[] = ??:0; -static const char pfmon_ll_header[] = # counts %self%cum -10 32 64256 1024 =1024 %wself -code addr symbol\n; -static const char pfmon_bm_header[] = -# counts %self%cum code addr symbol\n; - -const char *pfmon_intel_ll_args[PMU_TOOL_MAX_ARGS] = { - PFMON_CMD, - --aggregate-results, - --follow-all, - --with-header, - --smpl-module=pebs-ll, - --ld-lat-threshold=4, - --pebs-ll-dcmiss-code, - --resolve-addresses, - -emem_inst_retired:LATENCY_ABOVE_THRESHOLD, - --long-smpl-periods=1, - 0 /* terminating NULL must be present */ -}; - -const char *pfmon_amd_ll_args[PMU_TOOL_MAX_ARGS] = { - PFMON_CMD, - --aggregate-results, - --follow-all, - -uk, - --with-header, - --smpl-module=ibs, - --resolve-addresses, - -eibsop_event:uops, - --ibs-dcmiss-code, - --long-smpl-periods=0x0, - 0 /* terminating NULL must be present */ -}; - -const char *pfmon_intel_brm_args[PMU_TOOL_MAX_ARGS] = { - PFMON_CMD, - --aggregate-results, - --follow-all, - --with-header, - --resolve-addresses, - -eMISPREDICTED_BRANCH_RETIRED, - --long-smpl-periods=1, - 0 /* terminating NULL must be present */ -}; - -const char *pfmon_amd_brm_args[PMU_TOOL_MAX_ARGS] = { - PFMON_CMD, - --aggregate-results, - --follow-all, - --with-header, - --resolve-addresses, - -eRETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS, - --long-smpl-periods=1, - 0 /* terminating NULL must be present */ -}; - -const char *addr2line_args[PMU_TOOL_MAX_ARGS] = { - ADDR2LINE_CMD, - -e, - 0 /* terminating NULL must be present */ -}; - - -enum pmu_tool_type -{ - PTT_PFMON, - PTT_LAST -}; - -enum pmu_event_type -{ - PET_INTEL_LOAD_LATENCY, - PET_AMD_LOAD_LATENCY, - PET_INTEL_BRANCH_MISPREDICT, - PET_AMD_BRANCH_MISPREDICT, - PET_LAST -}; - -typedef struct pmu_tool_fns { - const char *name; /* name of the pmu tool */ - /* pmu tool commandline argument. */ - const char **arg_array; - /* Initialize pmu module. */ - void *(*init_pmu_module) (void); - /* Start profililing. */ - void (*start_pmu_module) (pid_t ppid, char *tmpfile, const char **args); - /* Stop profililing. */ - void (*stop_pmu_module) (void); - /* How to parse the output generated by the PMU tool. */ - int (*parse_pmu_output) (char *filename, void *pmu_data); - /* How to write
RE: [PATCH,i386] fma,fma4 and xop flags
Otherwise, what does -mno-fma4 -mxop do? (it should enable both xop and fma4!) what should -mfma4 -mno-xop do (it should disable both xop and fma4!). Yes! that's what GCC does now. Some flags are coupled (atleast for now). For ex, -mno-sse4.2 -mavx enables both sse4.2 and avx whereas -mavx -mno-sse4.2 disables both. Setting of the following are clubbed. 1) 3DNow sets MMX 2) SSE2 sets SSE 3) SSE3 sets SSE2 4) SSE4_1 sets SSE3 5) SSE4_2 sets SSE4_1 6) FMA sets AVX 7) AVX2 sets AVX 8) SSE4_A sets SSE3 9) FMA4 set SSE4_A and AVX 10) XOP sets FMA4 11) AES sets SSE2 12) PCLMUL sets SSE2 13) ABM sets POPCNT Resetting is done in reversely (MMX resets 3DNOW). IMO, if we have different cpuid flags, enabling\disabling the compiler flags depends on these cpuid flags directly. Adding subsets to them or tangling them together may give wrong results. Please let me know your opinion. Regards Ganesh -Original Message- From: Richard Guenther [mailto:richard.guent...@gmail.com] Sent: Wednesday, August 08, 2012 5:12 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org; ubiz...@gmail.com Subject: Re: [PATCH,i386] fma,fma4 and xop flags On Wed, Aug 8, 2012 at 1:31 PM, ganesh.gopalasubraman...@amd.com wrote: Hello, Bdver2 cpu supports both fma and fma4 instructions. Previous to patch, option -mno-xop removes -mfma4. Similarly, option -mno-fma4 removes -mxop. Eh? Why's that? I think we should disentangle -mxop and -mfma4 instead. Otherwise, what does -mno-fma4 -mxop do? (it should enable both xop and fma4!) what should -mfma4 -mno-xop do (it should disable both xop and fma4!). All this is just confusing to the user, even if in AMD documents XOP includes FMA4. Richard.