Re: [PATCH] Disable all target libraries if not building gcc
On Fri, 13 Mar 2015, H.J. Lu wrote:
> I am working on SHF_COMPRESSED support:
> http://www.sco.com/developers/gabi/latest/ch4.sheader.html
> I will import zlib from the GCC source tree into the binutils-gdb tree.
> But zlib failed to build as a target library in binutils-gdb.  There is
> no need to build target libraries if not building gcc.  OK for master?

This is definitely wrong.  newlib / libgloss is a target library that
certainly makes sense to build separately from GCC, and the toplevel
configure / build code should be identical in all three repositories
(GCC, binutils-gdb, newlib-cygwin).

-- 
Joseph S. Myers
jos...@codesourcery.com
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, Mar 16, 2015 at 4:43 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Mon, Mar 16, 2015 at 4:41 PM, Joseph Myers <jos...@codesourcery.com> wrote:
>> On Mon, 16 Mar 2015, H.J. Lu wrote:
>>> On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers <jos...@codesourcery.com> wrote:
>>>> On Fri, 13 Mar 2015, H.J. Lu wrote:
>>>>> I am working on SHF_COMPRESSED support:
>>>>> http://www.sco.com/developers/gabi/latest/ch4.sheader.html
>>>>> I will import zlib from the GCC source tree into the binutils-gdb
>>>>> tree.  But zlib failed to build as a target library in binutils-gdb.
>>>>> There is no need to build target libraries if not building gcc.
>>>>> OK for master?
>>>> This is definitely wrong.  newlib / libgloss is a target library that
>>>> certainly makes sense to build separately from GCC, and the toplevel
>>>> configure / build code should be identical in all three repositories
>>>> (GCC, binutils-gdb, newlib-cygwin).
>>> We need to work out something to avoid building target libraries in
>>> binutils.
>> I suggest not building zlib as a target library unless libgcj is also
>> being built as a target library, given that there should be no need to
>> build it as a target library in isolation.
> The logic is there.  But somehow it doesn't work for binutils.  I will
> take another look.

Here is a patch.  It excludes target-zlib if target-libjava isn't built.
Tested in binutils and GCC with java.

-- 
H.J.

diff --git a/configure b/configure
index 9e54319..f9455e9 100755
--- a/configure
+++ b/configure
@@ -6432,9 +6432,6 @@ Supported languages are: ${potential_languages}" "$LINENO" 5
   ac_configure_args=`echo " $ac_configure_args" | sed -e "s/ '--enable-languages=[^ ]*'//g" -e "s/$/ '--enable-languages=$enable_languages'/"`
-else
-  # Disable all target libraries if not building gcc.
-  target_configdirs=
 fi
 
 # Handle --disable-component generically.
@@ -6554,6 +6551,13 @@ for i in ${target_configdirs_all} ; do
   fi
 done
 
+# Exclude target-zlib if target-libjava isn't built.
+if echo ${target_configdirs} | grep target-libjava > /dev/null 2>&1; then
+  :
+else
+  target_configdirs=`echo ${target_configdirs} | sed -e 's/target-zlib//'`
+fi
+
 # libiberty-linker-plugin is special: it doesn't have its own source directory,
 # so we have to add it after the preceding checks.
 if test x"$extra_linker_plugin_flags$extra_linker_plugin_configure_flags" != x
diff --git a/configure.ac b/configure.ac
index eaf4661..a5a0fd0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2129,9 +2129,6 @@ Supported languages are: ${potential_languages}])
 AC_SUBST(stage1_languages)
   ac_configure_args=`echo " $ac_configure_args" | sed -e "s/ '--enable-languages=[[^ ]]*'//g" -e "s/$/ '--enable-languages=$enable_languages'/"`
-else
-  # Disable all target libraries if not building gcc.
-  target_configdirs=
 fi
 
 # Handle --disable-component generically.
@@ -2251,6 +2248,13 @@ for i in ${target_configdirs_all} ; do
   fi
 done
 
+# Exclude target-zlib if target-libjava isn't built.
+if echo ${target_configdirs} | grep target-libjava > /dev/null 2>&1; then
+  :
+else
+  target_configdirs=`echo ${target_configdirs} | sed -e 's/target-zlib//'`
+fi
+
 # libiberty-linker-plugin is special: it doesn't have its own source directory,
 # so we have to add it after the preceding checks.
 if test x"$extra_linker_plugin_flags$extra_linker_plugin_configure_flags" != x
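[Editorial note] For illustration only, the selection logic of the patch — drop target-zlib from the directory list unless target-libjava is present — can be sketched in C++ (the function name and string handling are mine, not part of the patch; the real logic lives in shell/sed in the toplevel configure):

```cpp
#include <string>

// Hypothetical mirror of the configure logic above: if the list of target
// directories does not mention target-libjava, strip target-zlib from it,
// just as the sed expression 's/target-zlib//' does.
std::string prune_target_zlib(std::string dirs) {
  if (dirs.find("target-libjava") == std::string::npos) {
    const std::string zlib = "target-zlib";
    std::string::size_type pos = dirs.find(zlib);
    if (pos != std::string::npos)
      dirs.erase(pos, zlib.size());  // leaves the surrounding whitespace, like sed
  }
  return dirs;
}
```

Note that, like the sed substitution, this leaves the neighbouring whitespace behind, which is harmless for a space-separated directory list.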
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Mon, 16 Mar 2015, Eric Botcazou wrote:
>> If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
>> architecture.  I doubt that the strict alignment code makes any sense
>> for modesize > BIGGEST_ALIGNMENT.
> Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition
> of BIGGEST_ALIGNMENT, not honoring an alignment larger than it is
> perfectly OK.

The context of the above statement is somewhat ambiguous: if the
statement is taken to include "no matter what is specified in the
program, including use of __attribute__ ((__aligned__ ...))" then I have
to object.  (I guess Eric, you didn't actually mean that, though.)

The definition of BIGGEST_ALIGNMENT is (tm.texi.in): "Biggest alignment
that any data type can require on this machine, in bits.  Note that this
is not the biggest alignment that is supported, just the biggest
alignment that, when violated, may cause a fault."

So, IMNSHO we'd better *support* a *larger* alignment, as in "if the
code specified it by explicit means", at least up to MAX_OFILE_ALIGNMENT.
But, "perfectly OK" when the code didn't explicitly say anything else.

BTW, unfortunately, BIGGEST_ALIGNMENT can't be blindly followed for
atomic accesses to undecorated (without explicit alignment specifiers)
code either.  Rather, the minimum alignment is the natural alignment
*absolutely everywhere no matter what target*, which matters for targets
where BIGGEST_ALIGNMENT < natural_alignment(all supported types) (well,
as long as the natural alignment is smaller than the target page-size or
cache-line, where at least one of the concepts is usually applicable).

Q: So why not adjust the BIGGEST_ALIGNMENT definition in such targets to
be at least the natural alignment of supported atomic types?
A: Because that unfortunately has bad effects on generated code for all
accesses to all sizes now <= BIGGEST_ALIGNMENT.

brgds, H-P
PS. It's very unfortunate that there's now __BIGGEST_ALIGNMENT__ visible
to programs.
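[Editorial note] The distinction being argued — the *default* (natural) alignment versus an alignment the program *explicitly requests*, which should be supported even beyond BIGGEST_ALIGNMENT — can be made concrete with a small standalone C++ sketch (the 64-byte figure is an arbitrary illustration, not tied to any particular target's BIGGEST_ALIGNMENT):

```cpp
#include <cstddef>

// A type whose alignment is explicitly raised by the program, as one would
// with __attribute__ ((__aligned__ (64))) in GNU C; per the argument above,
// such an explicit request should be *supported* even when it exceeds the
// default alignment of any type.
struct alignas(64) OverAligned {
  int v;
};

// The natural (default) alignment of int is the minimum that undecorated
// accesses can rely on; the explicit request only ever raises it.
static_assert(alignof(OverAligned) == 64, "explicit alignment is honored");
static_assert(alignof(int) <= alignof(OverAligned), "explicit >= natural");

constexpr std::size_t over_aligned_alignment() { return alignof(OverAligned); }
```

The static_asserts encode exactly the two claims in the message: an explicitly requested alignment is honored, and it is never below the natural one.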
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hans-Peter Nilsson <h...@bitrange.com> writes:
> Q: So why not adjust the BIGGEST_ALIGNMENT definition in such targets
> to be at least the natural alignment of supported atomic types?

A: Because it's an ABI change.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Tue, 17 Mar 2015, Andreas Schwab wrote:
> Hans-Peter Nilsson <h...@bitrange.com> writes:
>> Q: So why not adjust the BIGGEST_ALIGNMENT definition in such targets
>> to be at least the natural alignment of supported atomic types?
> A: Because it's an ABI change.

I intended that to be included in "bad effects":

  A: Because that unfortunately has bad effects on generated code for
  all accesses to all sizes now <= BIGGEST_ALIGNMENT.

but thanks for being specific (and I didn't remember exactly *what* bad
effects it was that I saw when I tried that long ago :)

brgds, H-P
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, Mar 16, 2015 at 7:26 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Mon, Mar 16, 2015 at 4:43 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Mon, Mar 16, 2015 at 4:41 PM, Joseph Myers <jos...@codesourcery.com> wrote:
>>> On Mon, 16 Mar 2015, H.J. Lu wrote:
>>>> On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers <jos...@codesourcery.com> wrote:
>>>>> On Fri, 13 Mar 2015, H.J. Lu wrote:
>>>>>> I am working on SHF_COMPRESSED support:
>>>>>> http://www.sco.com/developers/gabi/latest/ch4.sheader.html
>>>>>> I will import zlib from the GCC source tree into the binutils-gdb
>>>>>> tree.  But zlib failed to build as a target library in binutils-gdb.
>>>>>> There is no need to build target libraries if not building gcc.
>>>>>> OK for master?
>>>>> This is definitely wrong.  newlib / libgloss is a target library
>>>>> that certainly makes sense to build separately from GCC, and the
>>>>> toplevel configure / build code should be identical in all three
>>>>> repositories (GCC, binutils-gdb, newlib-cygwin).
>>>> We need to work out something to avoid building target libraries in
>>>> binutils.
>>> I suggest not building zlib as a target library unless libgcj is also
>>> being built as a target library, given that there should be no need
>>> to build it as a target library in isolation.
>> The logic is there.  But somehow it doesn't work for binutils.  I will
>> take another look.
> Here is a patch.  It excludes target-zlib if target-libjava isn't
> built.  Tested in binutils and GCC with java.

This version is more flexible to support future target libraries which
depend on target zlib.

-- 
H.J.

diff --git a/configure b/configure
index 9e54319..b719d38 100755
--- a/configure
+++ b/configure
@@ -6554,6 +6551,15 @@ for i in ${target_configdirs_all} ; do
   fi
 done
 
+# Exclude target-zlib if target-libjava isn't built.
+case ${target_configdirs} in
+*target-libjava*)
+  ;;
+*)
+  target_configdirs=`echo ${target_configdirs} | sed -e 's/target-zlib//'`
+  ;;
+esac
+
 # libiberty-linker-plugin is special: it doesn't have its own source directory,
 # so we have to add it after the preceding checks.
 if test x"$extra_linker_plugin_flags$extra_linker_plugin_configure_flags" != x
diff --git a/configure.ac b/configure.ac
index eaf4661..a4e4c7d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2251,6 +2248,15 @@ for i in ${target_configdirs_all} ; do
   fi
 done
 
+# Exclude target-zlib if target-libjava isn't built.
+case ${target_configdirs} in
+*target-libjava*)
+  ;;
+*)
+  target_configdirs=`echo ${target_configdirs} | sed -e 's/target-zlib//'`
+  ;;
+esac
+
 # libiberty-linker-plugin is special: it doesn't have its own source directory,
 # so we have to add it after the preceding checks.
 if test x"$extra_linker_plugin_flags$extra_linker_plugin_configure_flags" != x
Re: [AArch64][PR65375] Fix RTX cost for vector SET
On 17/03/15 03:48, Kyrill Tkachov wrote:
> On 16/03/15 13:15, Kugan wrote:
>> On 16/03/15 23:32, Kugan wrote:
>>>> lower-subreg.c:compute_costs() only cares about the cost of a
>>>> (set (reg) (const_int)) move but I think the intention, at least for
>>>> now, is to return extra_cost->vect.alu for all the vector operations.
>>> Almost, what we want at the moment is COSTS_N_INSNS (1) +
>>> extra_cost->vect.alu
>> Thanks Kyrill for the review.  Regression tested on aarch64-linux-gnu
>> with no new regression.  Is this OK for trunk?
>
> Are you sure it's a (set (reg) (const_int)) that's being costed here?
> I thought for moves into vector registers it would be a (set (reg)
> (const_vector)) which we don't handle in our rtx costs currently.  I
> think the correct approach would be to extend the aarch64_rtx_costs
> switch statement to handle the CONST_VECTOR case.  I believe you can
> use aarch64_simd_valid_immediate to check whether x is a valid
> immediate for a simd instruction and give it a cost of
> extra_cost->vect.alu.  The logic should be similar to the CONST_INT
> case.

Sorry about the (set (reg) (const_int)) above.  But the actual RTL that
is being split at 220r.subreg2 is

(insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0)
        (subreg:V4SF (reg/v:OI 73 [ __o ]) 0))
     /home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625 800 {*aarch64_simd_movv4sf}
     (nil))

And also, if we return an RTX cost above COSTS_N_INSNS (1), it will be
split and it doesn't recover from there.  Therefore we need something
like the below to prevent that happening.

Hi Kyrill,
How about the attached patch?  It is similar to what is currently done
for scalar register moves.

Hi Kugan,
yeah, I think this is a better approach, though I can't approve.

Here is the patch with a minor comment update.  Regression tested on
aarch64-linux-gnu with no new regression.  Is this OK for trunk?

Thanks,
Kugan

gcc/ChangeLog:

2015-03-17  Kugan Vivekanandarajah  <kug...@linaro.org>
	    Jim Wilson  <jim.wil...@linaro.org>

	PR target/65375
	* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle vector
	register copies.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cba3c1a..d6ad0af 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5544,10 +5544,17 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
       /* Fall through.  */
     case REG:
+      if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1))
+	{
+	  /* The cost is 1 per vector-register copied.  */
+	  int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
+			  / GET_MODE_SIZE (V4SImode);
+	  *cost = COSTS_N_INSNS (n_minus_1 + 1);
+	}
       /* const0_rtx is in general free, but we will use an
	 instruction to set a register to 0.  */
-      if (REG_P (op1) || op1 == const0_rtx)
-	{
+      else if (REG_P (op1) || op1 == const0_rtx)
+	{
	  /* The cost is 1 per register copied.  */
	  int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
			  / UNITS_PER_WORD;
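[Editorial note] The arithmetic in the patch — `(size - 1) / regsize + 1` — is the usual integer ceiling division, so the cost scales with the number of vector registers the move touches. A standalone sketch (illustrative names, not GCC's):

```cpp
// Illustrative reimplementation of the patch's cost arithmetic: the number
// of register copies needed to move `size` bytes through registers of
// `regsize` bytes each is ceil(size / regsize), computed in integer
// arithmetic as (size - 1) / regsize + 1 (valid for size >= 1).
constexpr int copies_needed(int size, int regsize) {
  return (size - 1) / regsize + 1;
}

// A V4SF value fits in one 16-byte vector register; an OImode-sized value
// (32 bytes) takes two, matching the (subreg:V4SF (reg:OI ...)) case above.
static_assert(copies_needed(16, 16) == 1, "one vector register");
static_assert(copies_needed(32, 16) == 2, "an OImode-sized value takes two");
```

The patch then feeds this count to COSTS_N_INSNS, so a multi-register copy is costed as that many instructions.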
Re: [PATCH ARM]Fix memset-inline-* failures on cortex-a9 tune by checking tune information.
On Fri, Mar 13, 2015 at 7:56 PM, Ramana Radhakrishnan
<ramana@googlemail.com> wrote:
> On Fri, Mar 6, 2015 at 7:46 AM, Bin Cheng <bin.ch...@arm.com> wrote:
>> Hi,
>> This patch is the second part fixing memset-inline-{4,5,6,8,9}.c
>> failures on cortex-a9.  It adds a function checking CPU tuning
>> information in dejagnu, and uses that function to skip the related
>> testcases when we are compiling for the cortex-a9 tune.
> Skips the related testcase for all tests where the tuning information
> doesn't use neon.  I think this technique can be used to clean up a
> number of multilib related failures in the gcc.target/arm testsuite.

Actually these are all related cases.  Cases {1,2,3} are intended for
non-neon target inlining tests; case 7 is an executable test which
should be run whatever the target supports.

>> Build and test on arm-none-eabi.  Is it OK?
>>
>> gcc/testsuite/ChangeLog
>> 2015-03-06  Bin Cheng  <bin.ch...@arm.com>
>>
>>	* lib/target-supports.exp (arm_tune_string_ops_prefer_neon): New.
>>	* gcc.target/arm/memset-inline-4.c: Skip for
>>	arm_tune_string_ops_prefer_neon.
>>	* gcc.target/arm/memset-inline-5.c: Ditto.
>>	* gcc.target/arm/memset-inline-6.c: Ditto.
>>	* gcc.target/arm/memset-inline-8.c: Ditto.
>>	* gcc.target/arm/memset-inline-9.c: Ditto.
> Ok, please document the new dejagnu helper routine in sourcebuild.texi

Done.  Patch updated; I will push both patches in if you are ok with it.

Thanks,
bin

gcc/ChangeLog
2015-03-17  Bin Cheng  <bin.ch...@arm.com>

	* doc/sourcebuild.texi (arm_tune_string_ops_prefer_neon): New.

gcc/testsuite/ChangeLog
2015-03-17  Bin Cheng  <bin.ch...@arm.com>

	* lib/target-supports.exp (arm_tune_string_ops_prefer_neon): New.
	* gcc.target/arm/memset-inline-4.c: Skip for
	arm_tune_string_ops_prefer_neon.
	* gcc.target/arm/memset-inline-5.c: Ditto.
	* gcc.target/arm/memset-inline-6.c: Ditto.
	* gcc.target/arm/memset-inline-8.c: Ditto.
	* gcc.target/arm/memset-inline-9.c: Ditto.
Index: gcc/testsuite/gcc.target/arm/memset-inline-4.c
===================================================================
--- gcc/testsuite/gcc.target/arm/memset-inline-4.c	(revision 221097)
+++ gcc/testsuite/gcc.target/arm/memset-inline-4.c	(working copy)
@@ -1,6 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mcpu=cortex-a9" } { "" } } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mtune=cortex-a9" } { "" } } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
 /* { dg-add-options arm_neon } */
Index: gcc/testsuite/gcc.target/arm/memset-inline-5.c
===================================================================
--- gcc/testsuite/gcc.target/arm/memset-inline-5.c	(revision 221097)
+++ gcc/testsuite/gcc.target/arm/memset-inline-5.c	(working copy)
@@ -1,6 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mcpu=cortex-a9" } { "" } } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mtune=cortex-a9" } { "" } } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
 /* { dg-add-options arm_neon } */
Index: gcc/testsuite/gcc.target/arm/memset-inline-6.c
===================================================================
--- gcc/testsuite/gcc.target/arm/memset-inline-6.c	(revision 221097)
+++ gcc/testsuite/gcc.target/arm/memset-inline-6.c	(working copy)
@@ -1,6 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mcpu=cortex-a9" } { "" } } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mtune=cortex-a9" } { "" } } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
 /* { dg-add-options arm_neon } */
Index: gcc/testsuite/gcc.target/arm/memset-inline-8.c
===================================================================
--- gcc/testsuite/gcc.target/arm/memset-inline-8.c	(revision 221097)
+++ gcc/testsuite/gcc.target/arm/memset-inline-8.c	(working copy)
@@ -1,6 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mcpu=cortex-a9" } { "" } } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mtune=cortex-a9" } { "" } } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
 /* { dg-add-options arm_neon } */
Index: gcc/testsuite/gcc.target/arm/memset-inline-9.c
C++ PATCH for c++/65327 aka DR 1688 (volatile constexpr variables)
The PR points out DR 1688: volatile + constexpr is a valid combination.
This patch makes the compiler accept such code.

There's a related problem:

  constexpr volatile int a = 42;
  constexpr int b = a;

the initialization of b should be rejected, but it is not.  My patch
does not address this issue though.

Bootstrapped/regtested on x86_64-linux, ok for trunk?  If so, should
this be backported to 4.8/4.9?

2015-03-16  Marek Polacek  <pola...@redhat.com>

	DR 1688
	PR c++/65327
	* decl.c (grokdeclarator): Allow volatile and constexpr together.

	* g++.dg/cpp0x/constexpr-object1.C: Change dg-error to dg-bogus.
	* g++.dg/cpp0x/pr65327.C: New test.

diff --git gcc/cp/decl.c gcc/cp/decl.c
index e35e484..cb0f11f 100644
--- gcc/cp/decl.c
+++ gcc/cp/decl.c
@@ -10134,8 +10134,9 @@ grokdeclarator (const cp_declarator *declarator,
      the object as `const'.  */
   if (constexpr_p && innermost_code != cdk_function)
     {
-      if (type_quals & TYPE_QUAL_VOLATILE)
-	error ("both %<volatile%> and %<constexpr%> cannot be used here");
+      /* DR1688 says that a `constexpr' specifier in combination with
+	 `volatile' is valid.  */
+
       if (TREE_CODE (type) != REFERENCE_TYPE)
	 {
	   type_quals |= TYPE_QUAL_CONST;
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-object1.C gcc/testsuite/g++.dg/cpp0x/constexpr-object1.C
index 41afbe9..287c3ac 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-object1.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-object1.C
@@ -19,7 +19,7 @@ constexpr A1 a2;	// { dg-error "uninitialized const" }
 
 const constexpr A1 a3 = A1();
 
-volatile constexpr A1 a4 = A1();	// { dg-error "both .volatile. and .constexpr. cannot" }
+volatile constexpr A1 a4 = A1();	// { dg-bogus "both .volatile. and .constexpr. cannot" }
 
 // error: on type declaration
 constexpr struct pixel
diff --git gcc/testsuite/g++.dg/cpp0x/pr65327.C gcc/testsuite/g++.dg/cpp0x/pr65327.C
index e69de29..c6cefab 100644
--- gcc/testsuite/g++.dg/cpp0x/pr65327.C
+++ gcc/testsuite/g++.dg/cpp0x/pr65327.C
@@ -0,0 +1,18 @@
+// PR c++/65327
+// { dg-do compile { target c++11 } }
+// DR1688 says that constexpr can be used together with volatile.
+
+constexpr volatile int i = 10;
+
+void
+foo ()
+{
+  constexpr volatile int j = 5;
+  static constexpr volatile int k = 5;
+}
+
+constexpr volatile int
+bar ()
+{
+  return i;
+}

	Marek
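[Editorial note] A minimal sketch of the semantics the patch enables, assuming a compiler with the DR 1688 resolution and C++11 or later: the object is constant-initialized, but each read of it is a volatile access performed at run time.

```cpp
// Valid per DR 1688: constexpr (constant initialization) combined with
// volatile (every access is a volatile access).
constexpr volatile int answer = 42;

int read_answer() {
  // A runtime volatile load.  By contrast, "constexpr int b = answer;"
  // would be the ill-formed case the message above says should be
  // rejected, since a volatile lvalue-to-rvalue conversion cannot appear
  // in a constant expression.
  return answer;
}
```

So the fix only removes the blanket error on the declaration; uses of such a variable in constant expressions remain a separate (and, per the message, still-unfixed) diagnostic question.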
[patch c++]: Fix for PR/65390
Hi,

this patch avoids the attempt to create a user-aligned type for variants
when the original's and the main variant's type alignment differ and the
original type isn't user-aligned.

Not sure if this is the preferred variant; we could create for such
cases an aligned type without setting user-align.  But as this should
just happen for invalid types, I am not sure if we actually need to
handle it later at all.  So I suggest the following fix for it:

ChangeLog

	PR c++/65390
	* tree.c (strip_typedefs): Don't attempt to build user-aligned
	type if original doesn't set it and main-variant differs to it.

Jonathan is just about to reduce the testcase, so the final patch should
include a reduced testcase for it.

Regression-tested for x86_64-unknown-linux-gnu.  Ok for apply?

Regards,
Kai

Index: tree.c
===================================================================
--- tree.c	(Revision 221277)
+++ tree.c	(Arbeitskopie)
@@ -1356,7 +1356,7 @@ strip_typedefs (tree t)
   if (!result)
     result = TYPE_MAIN_VARIANT (t);
   if (TYPE_USER_ALIGN (t) != TYPE_USER_ALIGN (result)
-      || TYPE_ALIGN (t) != TYPE_ALIGN (result))
+      || (TYPE_USER_ALIGN (t) && TYPE_ALIGN (t) != TYPE_ALIGN (result)))
     {
       gcc_assert (TYPE_USER_ALIGN (t));
       if (TYPE_ALIGN (t) == TYPE_ALIGN (result))
Re: [AArch64][PR65375] Fix RTX cost for vector SET
On 16/03/15 23:32, Kugan wrote:
>>> lower-subreg.c:compute_costs() only cares about the cost of a
>>> (set (reg) (const_int)) move but I think the intention, at least for
>>> now, is to return extra_cost->vect.alu for all the vector operations.
>> Almost, what we want at the moment is COSTS_N_INSNS (1) +
>> extra_cost->vect.alu
> Thanks Kyrill for the review.  Regression tested on aarch64-linux-gnu
> with no new regression.  Is this OK for trunk?

> Are you sure it's a (set (reg) (const_int)) that's being costed here?
> I thought for moves into vector registers it would be a (set (reg)
> (const_vector)) which we don't handle in our rtx costs currently.  I
> think the correct approach would be to extend the aarch64_rtx_costs
> switch statement to handle the CONST_VECTOR case.  I believe you can
> use aarch64_simd_valid_immediate to check whether x is a valid
> immediate for a simd instruction and give it a cost of
> extra_cost->vect.alu.  The logic should be similar to the CONST_INT
> case.

Sorry about the (set (reg) (const_int)) above.  But the actual RTL that
is being split at 220r.subreg2 is

(insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0)
        (subreg:V4SF (reg/v:OI 73 [ __o ]) 0))
     /home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625 800 {*aarch64_simd_movv4sf}
     (nil))

And also, if we return an RTX cost above COSTS_N_INSNS (1), it will be
split and it doesn't recover from there.  Therefore we need something
like the below to prevent that happening.

Hi Kyrill,
How about the attached patch?  It is similar to what is currently done
for scalar register moves.

Thanks,
Kugan

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cba3c1a..b9db3ac 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5544,10 +5544,17 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
       /* Fall through.  */
     case REG:
+      if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1))
+	{
+	  /* The cost is 1 per register copied.  */
+	  int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
+			  / GET_MODE_SIZE (V4SImode);
+	  *cost = COSTS_N_INSNS (n_minus_1 + 1);
+	}
       /* const0_rtx is in general free, but we will use an
	 instruction to set a register to 0.  */
-      if (REG_P (op1) || op1 == const0_rtx)
-	{
+      else if (REG_P (op1) || op1 == const0_rtx)
+	{
	  /* The cost is 1 per register copied.  */
	  int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
			  / UNITS_PER_WORD;
Re: RFC: Avoid calling convert_to_mode with invalid rtl.
Hi Eric,

> Note that the very same code is in expand_assignment, so they probably
> should be kept in sync.

Oops - I had missed that.

>> The patch adds a second call to expand_expr(), giving the address mode
>> as the suggested mode, and using a normal expansion, rather than
>> EXPAND_SUM.
> This might work, but even if it does not, the rtl in offset_rtx will be
> valid in a non-address context so that whatever convert_to_mode does
> will still remain valid.  I don't think that we want to expand twice
> the same expression.  What about calling force_operand on offset_rtx
> right before convert_to_mode?

That works too, and is much better than my solution.  What do you think
of the attached patch?  Is it suitable for submission once the sources
are out of stage 4?  (No regressions with an x86_64-pc-linux-gnu
toolchain and several fixes for an rl78-elf toolchain.)

Cheers
Nick

expr.c.patch.2
Description: Unix manual page
[PATCH] Remove unused function
This function is unused since the matrix-reorg.c removal back in 2012.
I think there's no point in keeping it around.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-03-16  Marek Polacek  <pola...@redhat.com>

	* cgraph.h (add_new_static_var): Remove declaration.
	* varpool.c (add_new_static_var): Remove function.

diff --git gcc/cgraph.h gcc/cgraph.h
index 99af026..52b15c5 100644
--- gcc/cgraph.h
+++ gcc/cgraph.h
@@ -2717,9 +2717,6 @@ cgraph_node::has_gimple_body_p (void)
    for ((node) = symtab->first_function_with_gimple_body (); (node); \
	(node) = symtab->next_function_with_gimple_body (node))
 
-/* Create a new static variable of type TYPE.  */
-tree add_new_static_var (tree type);
-
 /* Uniquize all constants that appear in memory.
    Each constant in memory thus far output is recorded
    in `const_desc_table'.  */
diff --git gcc/varpool.c gcc/varpool.c
index ce64279..f1439ca 100644
--- gcc/varpool.c
+++ gcc/varpool.c
@@ -760,27 +760,6 @@ symbol_table::output_variables (void)
   return changed;
 }
 
-/* Create a new global variable of type TYPE.  */
-tree
-add_new_static_var (tree type)
-{
-  tree new_decl;
-  varpool_node *new_node;
-
-  new_decl = create_tmp_var_raw (type);
-  DECL_NAME (new_decl) = create_tmp_var_name (NULL);
-  TREE_READONLY (new_decl) = 0;
-  TREE_STATIC (new_decl) = 1;
-  TREE_USED (new_decl) = 1;
-  DECL_CONTEXT (new_decl) = NULL_TREE;
-  DECL_ABSTRACT_P (new_decl) = false;
-  lang_hooks.dup_lang_specific_decl (new_decl);
-  new_node = varpool_node::get_create (new_decl);
-  varpool_node::finalize_decl (new_decl);
-
-  return new_node->decl;
-}
-
 /* Attempt to mark ALIAS as an alias to DECL.
    Return TRUE if successful.
    Extra name aliases are output whenever DECL is output.  */

	Marek
[Patch, Fortran, 2/2] Proposal on renaming gfc_vtable_*_get() to gfc_class_vtab_*_get()
Hi all,

this is the second part of the patch, substituting gfc_class_vtab_*_get()
for gfc_vtable_*_get() where needed.

Bootstraps and regtests ok on x86_64-linux-gnu/F20.

Ok for trunk?

Regards,
	Andre

-- 
Andre Vehreschild * Email: vehre ad gmx dot de

vtab_access_rework2_v1.clog
Description: Binary data

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 22fc7c7..54f8f4a 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -1196,7 +1196,7 @@ gfc_trans_create_temp_array (stmtblock_t * pre, stmtblock_t * post, gfc_ss * ss,
	elemsize = fold_convert (gfc_array_index_type,
				 TYPE_SIZE_UNIT (gfc_get_element_type (type)));
       else
-	elemsize = gfc_vtable_size_get (class_expr);
+	elemsize = gfc_class_vtab_size_get (class_expr);
 
       size = fold_build2_loc (input_location, MULT_EXPR,
			      gfc_array_index_type, size, elemsize);
@@ -3076,7 +3076,7 @@ build_class_array_ref (gfc_se *se, tree base, tree index)
   if (!GFC_CLASS_TYPE_P (TREE_TYPE (decl)))
     return false;
 
-  size = gfc_vtable_size_get (decl);
+  size = gfc_class_vtab_size_get (decl);
 
   /* Build the address of the element.  */
   type = TREE_TYPE (TREE_TYPE (base));
@@ -7986,7 +7986,8 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
	  dst_data = gfc_class_data_get (dcmp);
	  src_data = gfc_class_data_get (comp);
-	  size = fold_convert (size_type_node, gfc_vtable_size_get (comp));
+	  size = fold_convert (size_type_node,
+			       gfc_class_vtab_size_get (comp));
 
	  if (CLASS_DATA (c)->attr.dimension)
	    {
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 65a6ff1..24c20fc 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -907,7 +907,7 @@ tree
 gfc_get_class_array_ref (tree index, tree class_decl)
 {
   tree data = gfc_class_data_get (class_decl);
-  tree size = gfc_vtable_size_get (class_decl);
+  tree size = gfc_class_vtab_size_get (class_decl);
   tree offset = fold_build2_loc (input_location, MULT_EXPR,
				 gfc_array_index_type,
				 index, size);
@@ -943,16 +943,16 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems)
   args = NULL;
 
   if (from != NULL_TREE)
-    fcn = gfc_vtable_copy_get (from);
+    fcn = gfc_class_vtab_copy_get (from);
   else
-    fcn = gfc_vtable_copy_get (to);
+    fcn = gfc_class_vtab_copy_get (to);
 
   fcn_type = TREE_TYPE (TREE_TYPE (fcn));
 
   if (from != NULL_TREE)
     from_data = gfc_class_data_get (from);
   else
-    from_data = gfc_vtable_def_init_get (to);
+    from_data = gfc_class_vtab_def_init_get (to);
 
   to_data = gfc_class_data_get (to);
@@ -5744,7 +5744,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
				CLASS_DATA (expr->value.function.esym->result)->attr);
	    }
 
-	  final_fndecl = gfc_vtable_final_get (se->expr);
+	  final_fndecl = gfc_class_vtab_final_get (se->expr);
	  is_final = fold_build2_loc (input_location, NE_EXPR,
				      boolean_type_node,
				      final_fndecl,
@@ -5755,7 +5755,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
	  tmp = build_call_expr_loc (input_location,
				     final_fndecl, 3,
				     gfc_build_addr_expr (NULL, tmp),
-				     gfc_vtable_size_get (se->expr),
+				     gfc_class_vtab_size_get (se->expr),
				     boolean_false_node);
	  tmp = fold_build3_loc (input_location, COND_EXPR,
				 void_type_node, is_final, tmp,
diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index c899a73..1e43f1f 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -5866,14 +5866,14 @@ gfc_conv_intrinsic_sizeof (gfc_se *se, gfc_expr *expr)
       /* For deferred length arrays, conv_expr_descriptor returns an
	 indirect_ref to the component.  */
       if (arg->rank < 0)
-	byte_size = gfc_vtable_size_get (TREE_OPERAND (argse.expr, 0));
+	byte_size = gfc_class_vtab_size_get (TREE_OPERAND (argse.expr, 0));
       else if (arg->rank > 0)
	 /* The scalarizer added an additional temp.  To get the class' vptr
	    one has to look at the original backend_decl.  */
-	byte_size = gfc_vtable_size_get (
-	     GFC_DECL_SAVED_DESCRIPTOR (arg->symtree->n.sym->backend_decl));
+	byte_size = gfc_class_vtab_size_get (
+	     GFC_DECL_SAVED_DESCRIPTOR (arg->symtree->n.sym->backend_decl));
       else
-	byte_size = gfc_vtable_size_get (argse.expr);
+	byte_size = gfc_class_vtab_size_get (argse.expr);
     }
   else
     {
@@ -6003,10 +6003,10 @@ gfc_conv_intrinsic_storage_size (gfc_se *se, gfc_expr *expr)
       if (arg->ts.type == BT_CLASS)
	 {
	   if (arg->rank > 0)
-	    tmp = gfc_vtable_size_get (
+	    tmp = gfc_class_vtab_size_get (
		 GFC_DECL_SAVED_DESCRIPTOR (arg->symtree->n.sym->backend_decl));
	   else
-	    tmp = gfc_vtable_size_get (TREE_OPERAND (argse.expr, 0));
+	    tmp = gfc_class_vtab_size_get (TREE_OPERAND (argse.expr, 0));
	   tmp = fold_convert (result_type, tmp);
	   goto done;
	 }
@@ -6151,7 +6151,7 @@ gfc_conv_intrinsic_transfer (gfc_se * se, gfc_expr * expr)
				  argse.string_length);
       break;
     case
[Ada] Fix another ICE with -gnatct
An obscure case with a limited-with clause and 'Class in a subprogram
profile.

Tested on x86_64-suse-linux, applied on the mainline.

2015-03-16  Eric Botcazou  <ebotca...@adacore.com>

	* gcc-interface/decl.c (is_from_limited_with_of_main): New predicate.
	(gnat_to_gnu_entity) <E_Subprogram_Type>: Invoke it on return and
	parameter types to detect circularities in ASIS mode.
	* gcc-interface/trans.c (Attribute_to_gnu): Mention AI05-0151.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===================================================================
--- gcc-interface/decl.c	(revision 221446)
+++ gcc-interface/decl.c	(working copy)
@@ -182,6 +182,7 @@ static tree gnat_to_gnu_component_type (
 static tree gnat_to_gnu_param (Entity_Id, Mechanism_Type, Entity_Id, bool,
			       bool *);
 static tree gnat_to_gnu_field (Entity_Id, tree, int, bool, bool);
+static bool is_from_limited_with_of_main (Entity_Id);
 static tree change_qualified_type (tree, int);
 static bool same_discriminant_p (Entity_Id, Entity_Id);
 static bool array_type_has_nonaliased_component (tree, Entity_Id);
@@ -4252,11 +4253,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
	   context may now appear in parameter and result profiles.  If we
	   are only annotating types, break circularities here.  */
	if (type_annotate_only
-	    && IN (Ekind (gnat_return_type), Incomplete_Kind)
-	    && From_Limited_With (gnat_return_type)
-	    && In_Extended_Main_Code_Unit
-	       (Non_Limited_View (gnat_return_type))
-	    && !present_gnu_tree (Non_Limited_View (gnat_return_type)))
+	    && is_from_limited_with_of_main (gnat_return_type))
	  gnu_return_type = ptr_void_type_node;
	else
	  gnu_return_type = gnat_to_gnu_type (gnat_return_type);
@@ -4365,11 +4362,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
	       context may now appear in parameter and result profiles.  If
	       we are only annotating types, break circularities here.  */
	    if (type_annotate_only
-		&& IN (Ekind (gnat_param_type), Incomplete_Kind)
-		&& From_Limited_With (Etype (gnat_param_type))
-		&& In_Extended_Main_Code_Unit
-		   (Non_Limited_View (gnat_param_type))
-		&& !present_gnu_tree (Non_Limited_View (gnat_param_type)))
+		&& is_from_limited_with_of_main (gnat_param_type))
	      {
		gnu_param_type = ptr_void_type_node;
		fake_param_type = true;
@@ -5810,6 +5803,30 @@ gnat_to_gnu_param (Entity_Id gnat_param,
   return gnu_param;
 }
 
+/* Return true if GNAT_ENTITY is an incomplete entity coming from a limited
+   with of the main unit and whose full view has not been elaborated yet.  */
+
+static bool
+is_from_limited_with_of_main (Entity_Id gnat_entity)
+{
+  /* Class-wide types are always transformed into their root type.  */
+  if (Ekind (gnat_entity) == E_Class_Wide_Type)
+    gnat_entity = Root_Type (gnat_entity);
+
+  if (IN (Ekind (gnat_entity), Incomplete_Kind)
+      && From_Limited_With (gnat_entity))
+    {
+      Entity_Id gnat_full_view = Non_Limited_View (gnat_entity);
+
+      if (present_gnu_tree (gnat_full_view))
+	return false;
+
+      return In_Extended_Main_Code_Unit (gnat_full_view);
+    }
+
+  return false;
+}
+
 /* Like build_qualified_type, but TYPE_QUALS is added to the existing
    qualifiers on TYPE.  */
 
Index: gcc-interface/trans.c
===================================================================
--- gcc-interface/trans.c	(revision 221407)
+++ gcc-interface/trans.c	(working copy)
@@ -1593,8 +1593,9 @@ Attribute_to_gnu (Node_Id gnat_node, tre
   bool prefix_unused = false;
 
   /* ??? If this is an access attribute for a public subprogram to be used in
-     a dispatch table, do not translate its type as it's useless there and the
-     parameter types might be incomplete types coming from a limited with.  */
+     a dispatch table, do not translate its type as it's useless in this case
+     and the parameter types might be incomplete types coming from a limited
+     context in Ada 2012 (AI05-0151).  */
   if (Ekind (Etype (gnat_node)) == E_Access_Subprogram_Type
       && Is_Dispatch_Table_Entity (Etype (gnat_node))
       && Nkind (gnat_prefix) == N_Identifier
Re: [PATCH, i386 testsuite]: Require nonpic target for some tests
On 12-03-15 11:51, Uros Bizjak wrote: On Thu, Mar 12, 2015 at 11:41 AM, Tom de Vries tom_devr...@mentor.com wrote: Attached patch adds nonpic target requirement for some (obvious) cases, where data access or PIC register setup confuses scan-asms. 2015-01-30 Uros Bizjak ubiz...@gmail.com * gcc.target/i386/fuse-caller-save-rec.c: Require nonpic target. * gcc.target/i386/fuse-caller-save-xmm.c: Ditto. * gcc.target/i386/fuse-caller-save.c: Ditto. Hi, I've reverted this part of the patch. The scans were failing because the -fipa-ra optimization was broken for -m32 -fpic (PR64895). Not really. Allocator is free to allocate %ebx (or other call-saved register) as PIC register. In this case, unwanted push/pop sequence will be emitted. Sure, but I don't see what that has to do with the test-cases. I don't see a pic register used in fuse-caller-save.c and fuse-caller-save-rec.c. I do see a pic register used in gcc.target/i386/fuse-caller-save-xmm.c, but there's no scan for push/pop sequence in there. You are right, the call is (obviously) to a local function. There is no need for PIC reg, so this clears my concerns. The patch for PR64895 has been reverted, so the scans started failing again. I've added xfails. Thanks, - Tom 2015-03-16 Tom de Vries t...@codesourcery.com * gcc.target/i386/fuse-caller-save-rec.c: Add PR64895 xfail on scans. * gcc.target/i386/fuse-caller-save-xmm.c: Same. * gcc.target/i386/fuse-caller-save.c: Same. diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c index c660e01..7abcf91 100644 --- a/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c +++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c @@ -18,12 +18,14 @@ foo (int y) return y + bar (y); } +/* For !nonpic ia32 xfails, see PR64895. */ + /* Check that no registers are saved/restored. 
*/ -/* { dg-final { scan-assembler-not push } } */ -/* { dg-final { scan-assembler-not pop } } */ +/* { dg-final { scan-assembler-not push { xfail { { ! nonpic } ia32 } } } } */ +/* { dg-final { scan-assembler-not pop { xfail { { ! nonpic } ia32 } } } } */ /* Check that addition uses dx. */ -/* { dg-final { scan-assembler-times addl\t%\[re\]?dx, %\[re\]?ax 1 } } */ +/* { dg-final { scan-assembler-times addl\t%\[re\]?dx, %\[re\]?ax 1 { xfail { { ! nonpic } ia32 } } } } */ /* Verify that bar is self-recursive. */ /* { dg-final { scan-assembler-times call\t_?bar 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c index 1d02844..c2d0544 100644 --- a/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c +++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c @@ -15,11 +15,13 @@ foo (v2df y) return y + bar (y); } +/* For !nonpic ia32 xfails, see PR64895. */ + /* Check presence of all insns on xmm registers. These checks are expected to pass with both -fipa-ra and -fno-ipa-ra. */ /* { dg-final { scan-assembler-times addpd\t\\.?LC0.*, %xmm0 1 } } */ -/* { dg-final { scan-assembler-times addpd\t%xmm1, %xmm0 1 } } */ -/* { dg-final { scan-assembler-times movapd\t%xmm0, %xmm1 1 } } */ +/* { dg-final { scan-assembler-times addpd\t%xmm1, %xmm0 1 { xfail { { ! nonpic } ia32 } } } } */ +/* { dg-final { scan-assembler-times movapd\t%xmm0, %xmm1 1 { xfail { { ! nonpic } ia32 } } } } */ /* Check absence of save/restore of xmm1 register. */ /* { dg-final { scan-assembler-not movaps\t%xmm1, \\(%\[re\]?sp\\) } } */ diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c index 7cfd22a..4b8e68d 100644 --- a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c +++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c @@ -16,9 +16,11 @@ foo (int y) return y + bar (y); } +/* For !nonpic ia32 xfails, see PR64895. */ + /* Check that no registers are saved/restored. 
*/ -/* { dg-final { scan-assembler-not push } } */ -/* { dg-final { scan-assembler-not pop } } */ +/* { dg-final { scan-assembler-not push { xfail { { ! nonpic } ia32 } } } } */ +/* { dg-final { scan-assembler-not pop { xfail { { ! nonpic } ia32 } } } } */ /* PR61605. If the first argument register and the return register differ, then bar leaves the first argument register intact. That means in foo that the @@ -29,4 +31,4 @@ foo (int y) /* { dg-final { scan-assembler-not movl { target { ! ia32 } } } } */ /* Check that addition uses di (in case of no copy) or dx (in case of copy). */ -/* { dg-final { scan-assembler-times addl\t%\[re\]?d\[ix\], %\[re\]?ax 1 } } */ +/* { dg-final { scan-assembler-times addl\t%\[re\]?d\[ix\], %\[re\]?ax 1 { xfail { { ! nonpic } ia32 } } } } */ -- 1.9.1
RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant
From: Steven Bosscher [mailto:stevenb@gmail.com] Sent: Monday, March 09, 2015 7:48 PM To: Thomas Preud'homme Cc: GCC Patches; Eric Botcazou Subject: Re: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant New patch below. It looks like this would run for all candidate loop invariants, right? If so, you're creating run time of O(n_invariants*n_bbs_in_loop), a potential compile time hog for large loops. But why compute this at all? Perhaps I'm missing something, but you already have inv-always_executed available, no? Indeed. I didn't realize the information was already there. + basic_block use_bb; + + ref = DF_REF_INSN (use); + use_bb = BLOCK_FOR_INSN (ref); You can use DF_REF_BB. Since I need use_insn here I kept BLOCK_FOR_INSN but I used DF_REF_BB for the def below. So here are the new ChangeLog entries: *** gcc/ChangeLog *** 2015-03-11 Thomas Preud'homme thomas.preudho...@arm.com * loop-invariant.c (can_move_invariant_reg): New. (move_invariant_reg): Call above new function to decide whether instruction can just be moved, skipping creation of temporary register. *** gcc/testsuite/ChangeLog *** 2015-03-12 Thomas Preud'homme thomas.preudho...@arm.com * gcc.dg/loop-8.c: New test. * gcc.dg/loop-9.c: New test. diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index f79b497..8217d62 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1512,6 +1512,79 @@ replace_uses (struct invariant *inv, rtx reg, bool in_group) return 1; } And the new patch: +/* Whether invariant INV setting REG can be moved out of LOOP, at the end of + the block preceding its header. */ + +static bool +can_move_invariant_reg (struct loop *loop, struct invariant *inv, rtx reg) +{ + df_ref def, use; + unsigned int dest_regno, defs_in_loop_count = 0; + rtx_insn *insn = inv-insn; + basic_block bb = BLOCK_FOR_INSN (inv-insn); + + /* We ignore hard register and memory access for cost and complexity reasons. 
+ Hard registers are few at this stage and expensive to consider as they + require building a separate data flow. Memory access would require using + df_simulate_* and can_move_insns_across functions and is more complex. */ + if (!REG_P (reg) || HARD_REGISTER_P (reg)) +return false; + + /* Check whether the set is always executed. We could omit this condition if + we know that the register is unused outside of the loop, but it does not + seem worth finding out. */ + if (!inv->always_executed) +return false; + + /* Check that all uses reached by the def in insn would still be reached + by it. */ + dest_regno = REGNO (reg); + for (use = DF_REG_USE_CHAIN (dest_regno); use; use = DF_REF_NEXT_REG (use)) +{ + rtx_insn *use_insn; + basic_block use_bb; + + use_insn = DF_REF_INSN (use); + use_bb = BLOCK_FOR_INSN (use_insn); + + /* Ignore instruction considered for moving. */ + if (use_insn == insn) + continue; + + /* Don't consider uses outside loop. */ + if (!flow_bb_inside_loop_p (loop, use_bb)) + continue; + + /* Don't move if a use is not dominated by def in insn. */ + if (use_bb == bb && DF_INSN_LUID (insn) >= DF_INSN_LUID (use_insn)) + return false; + if (!dominated_by_p (CDI_DOMINATORS, use_bb, bb)) + return false; +} + + /* Check for other defs. Any other def in the loop might reach a use + currently reached by the def in insn. */ + for (def = DF_REG_DEF_CHAIN (dest_regno); def; def = DF_REF_NEXT_REG (def)) +{ + basic_block def_bb = DF_REF_BB (def); + + /* Defs in exit block cannot reach a use they weren't already reaching. */ + if (single_succ_p (def_bb)) + { + basic_block def_bb_succ; + + def_bb_succ = single_succ (def_bb); + if (!flow_bb_inside_loop_p (loop, def_bb_succ)) + continue; + } + + if (++defs_in_loop_count > 1) + return false; +} + + return true; +} + /* Move invariant INVNO out of the LOOP. Returns true if this succeeds, false otherwise. */ @@ -1545,11 +1618,8 @@ move_invariant_reg (struct loop *loop, unsigned invno) } } - /* Move the set out of the loop.
If the set is always executed (we could -omit this condition if we know that the register is unused outside of -the loop, but it does not seem worth finding out) and it has no uses -that would not be dominated by it, we may just move it (TODO). -Otherwise we need to create a temporary register. */ + /* If possible, just move the set out of the loop. Otherwise, we +need to create a temporary register. */ set = single_set (inv-insn); reg = dest = SET_DEST (set); if (GET_CODE (reg) == SUBREG) @@ -1557,19 +1627,25 @@ move_invariant_reg (struct loop *loop, unsigned invno) if (REG_P (reg))
[Patch, Fortran, 1/2] Proposal on renaming gfc_vtable_*_get() to gfc_class_vtab_*_get()
Hi all, I would like to propose a rename and split of the functions trans-expr.c::gfc_vtable_*_get(). Background: While fixing an issue, I encountered a case where I already had a handle to the vptr and wanted to retrieve the size component through that vptr handle. There was no function to do this easily: one can only get the size member of a class' vtab via gfc_vtable_size_get(). I therefore resorted to extracting from gfc_vtable_field_get(), which is called by all gfc_vtable_*_get() functions, the part that handles the vptr's component ref, and putting it into a new function vptr_field_get(). The old gfc_vtable_field_get() now calls this new vptr_field_get() function. During the split it occurred to me that the name gfc_vtable_*_get() does not clearly say what the function does. On first reading the name, one may think the function gets the named field from a vtable, but looking at the body one learns that it expects not a handle to a vptr but a handle to a class object. I therefore propose to rename the gfc_vtable_*_get() functions to gfc_class_vtab_*_get(). Side note: unfortunately we are not allowed to use ' in function names, or the names would have been even better as gfc_class'_vtab's_*_get(), e.g. for size gfc_class'_vtab's_size_get(). This is a two-part patch. This first patch does the split and introduces defines to keep the old gfc_vtable_*_get() names working. The second part substitutes the new name gfc_class_vtab_*_get() where needed. Bootstraps and regtests fine on x86_64-linux-gnu/F20. Ok for trunk?
Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de vtab_access_rework1_v1.clog Description: Binary data diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index f61c79f..062c927 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -176,72 +176,85 @@ gfc_class_len_get (tree decl) if (POINTER_TYPE_P (TREE_TYPE (decl))) decl = build_fold_indirect_ref_loc (input_location, decl); len = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (decl)), - CLASS_LEN_FIELD); + CLASS_LEN_FIELD); return fold_build3_loc (input_location, COMPONENT_REF, TREE_TYPE (len), decl, len, NULL_TREE); } +/* Get the specified FIELD from the VPTR. */ + static tree -gfc_vtable_field_get (tree decl, int field) +vptr_field_get (tree vptr, int fieldno) { - tree size; - tree vptr; - vptr = gfc_class_vptr_get (decl); + tree field; vptr = build_fold_indirect_ref_loc (input_location, vptr); - size = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (vptr)), - field); - size = fold_build3_loc (input_location, COMPONENT_REF, - TREE_TYPE (size), vptr, size, - NULL_TREE); - /* Always return size as an array index type. */ - if (field == VTABLE_SIZE_FIELD) -size = fold_convert (gfc_array_index_type, size); - gcc_assert (size); - return size; + field = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (vptr)), + fieldno); + field = fold_build3_loc (input_location, COMPONENT_REF, + TREE_TYPE (field), vptr, field, + NULL_TREE); + gcc_assert (field); + return field; } -tree -gfc_vtable_hash_get (tree decl) -{ - return gfc_vtable_field_get (decl, VTABLE_HASH_FIELD); -} - +/* Get the field from the class' vptr. 
*/ -tree -gfc_vtable_size_get (tree decl) +static tree +class_vtab_field_get (tree decl, int fieldno) { - return gfc_vtable_field_get (decl, VTABLE_SIZE_FIELD); + tree vptr; + vptr = gfc_class_vptr_get (decl); + return vptr_field_get (vptr, fieldno); } -tree -gfc_vtable_extends_get (tree decl) -{ - return gfc_vtable_field_get (decl, VTABLE_EXTENDS_FIELD); +/* Define a macro for creating the class_vtab_* and vptr_* accessors in + unison. */ +#define VTAB_GET_FIELD_GEN(name, field) tree \ +gfc_class_vtab_## name ##_get (tree cl) \ +{ \ + return class_vtab_field_get (cl, field); \ +} \ + \ +tree \ +gfc_vptr_## name ##_get (tree vptr) \ +{ \ + return vptr_field_get (vptr, field); \ } +VTAB_GET_FIELD_GEN (hash, VTABLE_HASH_FIELD) +VTAB_GET_FIELD_GEN (extends, VTABLE_EXTENDS_FIELD) +VTAB_GET_FIELD_GEN (def_init, VTABLE_DEF_INIT_FIELD) +VTAB_GET_FIELD_GEN (copy, VTABLE_COPY_FIELD) +VTAB_GET_FIELD_GEN (final, VTABLE_FINAL_FIELD) -tree -gfc_vtable_def_init_get (tree decl) -{ - return gfc_vtable_field_get (decl, VTABLE_DEF_INIT_FIELD); -} +/* The size field is returned as an array index. Therefore treat it and only + it specially. */ tree -gfc_vtable_copy_get (tree decl) +gfc_class_vtab_size_get (tree cl) { - return gfc_vtable_field_get (decl, VTABLE_COPY_FIELD); + tree size; + size = class_vtab_field_get (cl, VTABLE_SIZE_FIELD); + /* Always return size as an array index type. */ + size = fold_convert (gfc_array_index_type, size); + gcc_assert (size); + return size; } - tree -gfc_vtable_final_get (tree decl) +gfc_vptr_size_get (tree vptr) { - return gfc_vtable_field_get
Re: PR c++/64626 - C++14 single quote should not always be a digit separator
OK. Jason
[PATCH][expmed][cleanup] Use std::swap instead of manual swapping
Hi all, This patch replaces manual swapping in synth_mult with std::swap. Not much else to say about this. This code could arguably be refactored a bit but that's another story. I believe these are considered obvious at this point. I'll apply it in 24 hours unless somebody objects Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu. Thanks, Kyrill 2015-03-16 Kyrylo Tkachov kyrylo.tkac...@arm.com * expmed.c (synth_mult): Use std::swap instead of manually swapping algorithms.commit 20372baa835e35e365117b4e3f6bdc5e28a78a98 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Thu Mar 12 16:22:58 2015 + [expmed] Use std::swap instead of manual swapping diff --git a/gcc/expmed.c b/gcc/expmed.c index d2b2534..e84ab36 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -2551,9 +2551,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_shift; } @@ -2582,9 +2581,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_shift; } @@ -2624,9 +2622,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = 0; best_alg-op[best_alg-ops] = alg_sub_t_m2; } @@ -2644,9 +2641,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += 
op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = 0; best_alg-op[best_alg-ops] = alg_add_t_m2; } @@ -2667,9 +2663,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_sub_t_m2; } @@ -2723,9 +2718,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency = op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_add_factor; } @@ -2762,9 +2756,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency = op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_sub_factor; } @@ -2793,9 +2786,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_add_t2_m; } @@ -2818,9 +2810,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = 
best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_sub_t2_m; }
Re: [patch, fortran] Bug 64432 - [5 Regression] SYSTEM_CLOCK(COUNT_RATE=rate) wrong result for integer(4)::rate
On Sat, Mar 14, 2015 at 4:24 PM, Jerry DeLisle jvdeli...@charter.net wrote: Attachment on this one. On 03/14/2015 07:22 AM, Jerry DeLisle wrote: On 03/08/2015 04:58 PM, Steve Kargl wrote: On Mon, Mar 09, 2015 at 01:07:25AM +0200, Janne Blomqvist wrote: So I would prefer if we just hardcode the error values in the frontend (-HUGE, 0, 0), in case somebody tries to use the kind=1,2 versions, thus also removing the need for the new library functions, keeping the existing simpler ones instead. AFAICT this would be standards conforming. Any other opinions on this? Revised patch attached as requested. Regression tested on x86_64 linux. Typical results are shown below. I will provide a test case for the test-suite. $ ./a.out KIND=1: -127 0 0 KIND=1: -127 0 0 KIND=1: -127 . 0 --- KIND=2: -32767 0 0 KIND=2: -32767 . 0 --- KIND=4: 57496123 1000 2147483647 KIND=4: 57496123 1000.0 2147483647 --- KIND=8: 57496123484138 10 9223372036854775807 KIND=8: 57496123522116 10.000 9223372036854775807 --- KIND=10: 57496123575504 10 9223372036854775807 KIND=10: 57496123612377 10.000 9223372036854775807 --- KIND=16: 57496123669210 10 9223372036854775807 KIND=16: 57496123698413 10.00 9223372036854775807 OK for trunk? Regards, Jerry 2015-03-14 Jerry DeLisle jvdeli...@gcc.gnu.org PR fortran/64432 *trans-intrinisic.c (conv_intrinsic_system_clock): Check the smallest kind passed in user arguments and hard-code results for KIND=1 or KIND=2 to indicate no clock available. 2015-03-14 Jerry DeLisle jvdeli...@gcc.gnu.org PR libgfortran/64432 * intrinsics/system_clock.c (system_clock4, system_clock8): Cleanup some whitespace. Thanks, this looks good. Ok for trunk. -- Janne Blomqvist
[PATCH][expmed] Calculate mult-by-const cost properly in mult_by_coeff_cost
Hi all, Eyeballing the mult_by_coeff_cost function I think it has a typo/bug. It's supposed to return the cost of multiplying by a constant 'coeff'. It calculates that by taking the cost of a MULT rtx by that constant, comparing it to the cost of synthesizing that multiplication, and returning the cheapest. However, in the MULT rtx cost calculation it creates a MULT rtx of two REGs rather than a REG and the GEN_INT of coeff as I would expect. This patch fixes that in the obvious way. Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu. I'm guessing this is stage 1 material at this point? Thanks, Kyrill 2015-03-13 Kyrylo Tkachov kyrylo.tkac...@arm.com * expmed.c (mult_by_coeff_cost): Pass CONST_INT rtx to MULT cost calculation rather than fake_reg.commit fa230baf4f35d03cf072904dc048ce4ffcf43148 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Thu Mar 12 09:47:06 2015 + [expmed] Calculate mult-by-const properly in mult_by_coeff_cost diff --git a/gcc/expmed.c b/gcc/expmed.c index 0034203..fec9501 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -3285,7 +3285,8 @@ mult_by_coeff_cost (HOST_WIDE_INT coeff, machine_mode mode, bool speed) enum mult_variant variant; rtx fake_reg = gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1); - max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, fake_reg), speed); + max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, GEN_INT (coeff)), + speed); if (choose_mult_variant (mode, coeff, algorithm, variant, max_cost)) return algorithm.cost.cost; else
Re: [Patch, Fortran] Prevent segfault on illegal input
Hi Tobias, hi all, thanks for the review. Commited as r221455: r221455 | vehre | 2015-03-16 11:29:59 +0100 (Mo, 16. Mär 2015) | 13 Zeilen gcc/fortran/ChangeLog: 2015-03-16 Andre Vehreschild ve...@gmx.de * resolve.c: Prevent segfault on illegal input. gcc/testsuite/ChangeLog: 2015-03-16 Andre Vehreschild ve...@gmx.de * gfortran.dg/pointer_2.f90: New test. Regards, Andre On Fri, 13 Mar 2015 14:05:00 +0100 Tobias Burnus tobias.bur...@physik.fu-berlin.de wrote: Andre Vehreschild wrote: during debugging I found a segfault of gfortran, when it encounters an illegal fortran code. The mistake in Fortran is flagged correctly, but later gfortran crashes. This patch prevents the crash. Bootstraps and regtest ok on x86_64-linux-gnu. Ok for trunk? OK. Thanks for the patch. Tobias -- Andre Vehreschild * Email: vehre ad gmx dot de Index: gcc/fortran/ChangeLog === --- gcc/fortran/ChangeLog (Revision 221454) +++ gcc/fortran/ChangeLog (Arbeitskopie) @@ -1,3 +1,7 @@ +2015-03-16 Andre Vehreschild ve...@gmx.de + + * resolve.c: Prevent segfault on illegal input. + 2015-03-14 Mikael Morin mik...@gcc.gnu.org PR fortran/61138 Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (Revision 221454) +++ gcc/fortran/resolve.c (Arbeitskopie) @@ -2639,6 +2639,10 @@ expr-ts = sym-ts; expr-value.function.name = sym-name; expr-value.function.esym = sym; + /* Prevent crash when sym-ts.u.derived-components is not set due to previous + error(s). */ + if (sym-ts.type == BT_CLASS !CLASS_DATA (sym)) +return MATCH_ERROR; if (sym-ts.type == BT_CLASS CLASS_DATA (sym)-as) expr-rank = CLASS_DATA (sym)-as-rank; else if (sym-as != NULL) Index: gcc/testsuite/gfortran.dg/pointer_2.f90 === --- gcc/testsuite/gfortran.dg/pointer_2.f90 (Revision 0) +++ gcc/testsuite/gfortran.dg/pointer_2.f90 (Revision 221455) @@ -0,0 +1,18 @@ +! { dg-do compile } +! Check that the compiler reports the errors, but does not segfault. +! Contributed by: Andre Vehreschild ve...@gcc.gnu.org +! 
+program test +implicit none +class(*), pointer :: P +class(*), allocatable :: P2 + +allocate(P2, source=convertType(P)) + +contains + + function convertType(in) ! { dg-error must be dummy, allocatable or pointer } +class(*), intent(in) :: in +class(*) :: convertType + end function +end program test Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (Revision 221454) +++ gcc/testsuite/ChangeLog (Arbeitskopie) @@ -1,3 +1,7 @@ +2015-03-16 Andre Vehreschild ve...@gmx.de + + * gfortran.dg/pointer_2.f90: New test. + 2015-03-16 Eric Botcazou ebotca...@adacore.com * testsuite/g++.dg/pr65049.C: New test.
Fix PR middle-end/65409
This is an ICE present on all active branches, a regression introduced by the overhaul I applied throughout the RTL expander to avoid creating unnecessary stack temporaries for return values. We enter the special block for MEM:BLK in store_field with a PARALLEL:BLK and abort on the assertion. This RTX is dealt with immediately below and, as discussed with Jakub in the audit trail, adding the obvious guard is the safest thing to do for the release branches. Tested on x86_64-suse-linux, applied on all active branches. 2015-03-16 Eric Botcazou ebotca...@adacore.com PR middle-end/65409 * expr.c (store_field): Do not do a direct block copy if the source is a PARALLEL with BLKmode. 2015-03-16 Eric Botcazou ebotca...@adacore.com * testsuite/g++.dg/pr65049.C: New test. -- Eric BotcazouIndex: expr.c === --- expr.c (revision 221407) +++ expr.c (working copy) @@ -6645,11 +6645,12 @@ store_field (rtx target, HOST_WIDE_INT b mode != TYPE_MODE (TREE_TYPE (exp))) temp = convert_modes (mode, TYPE_MODE (TREE_TYPE (exp)), temp, 1); - /* If the modes of TEMP and TARGET are both BLKmode, both - must be in memory and BITPOS must be aligned on a byte - boundary. If so, we simply do a block copy. Likewise - for a BLKmode-like TARGET. */ - if (GET_MODE (temp) == BLKmode + /* If TEMP is not a PARALLEL (see below) and its mode and that of TARGET + are both BLKmode, both must be in memory and BITPOS must be aligned + on a byte boundary. If so, we simply do a block copy. Likewise for + a BLKmode-like TARGET. */ + if (GET_CODE (temp) != PARALLEL + GET_MODE (temp) == BLKmode (GET_MODE (target) == BLKmode || (MEM_P (target) GET_MODE_CLASS (GET_MODE (target)) == MODE_INT// PR middle-end/65409 // Reported by Ignacy Gawedzki b...@qult.net struct Foo { Foo() {} int a; int b; char c; }; Foo copy_foo(Foo); struct Bar : Foo { Bar(Foo t) : Foo(copy_foo(t)) {} }; Bar a = Foo();
[RFC AArch64] Implement TARGET_PROMOTE_FUNCTION_MODE for ILP32 code generation
Hi, For this trivial testcase: extern int bar (int, int); int foo (int *a, int *b) { return bar (*a, *b); } I noticed that GCC generates redundant zero-extension instructions under ILP32 (aarch64-linux-gnu-gcc -S -O2 -mabi=ilp32). Assembly code: .arch armv8-a+fp+simd .file 1.c .text .align 2 .p2align 3,,7 .global foo .type foo, %function foo: uxtw x0, w0 uxtw x1, w1 ldr w0, [x0] ldr w1, [x1] b bar .size foo, .-foo .ident GCC: (20140403) 5.0.0 20150312 (experimental) According to the ILP32 ABI, the two uxtw instructions here are not necessary. The following is a proposed patch to fix this issue, i.e. promoting pointer function arguments to word_mode. But I don't know whether it's a good idea to do this for pointer return values. Any comments? Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c (revision 221393) +++ gcc/config/aarch64/aarch64.c (working copy) @@ -1587,7 +1587,7 @@ aarch64_function_value (const_tree type, const_tre machine_mode ag_mode; mode = TYPE_MODE (type); - if (INTEGRAL_TYPE_P (type)) + if (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type)) mode = promote_function_mode (type, mode, unsignedp, func, 1); if (aarch64_return_in_msb (type)) @@ -1650,6 +1650,24 @@ aarch64_function_value_regno_p (const unsigned int return false; } +/* Implement TARGET_PROMOTE_FUNCTION_MODE. */ + +static machine_mode +aarch64_promote_function_mode (const_tree type, machine_mode mode, + int *punsignedp, const_tree fntype, + int for_return) +{ + /* Pointer function arguments and return values are promoted to word_mode. */ + if (type != NULL_TREE && POINTER_TYPE_P (type)) +{ + *punsignedp = POINTERS_EXTEND_UNSIGNED; + return word_mode; +} + + return default_promote_function_mode (type, mode, punsignedp, fntype, +for_return); +} + /* Implement TARGET_RETURN_IN_MEMORY.
If the type T of the result of a function is such that @@ -11329,6 +11347,9 @@ aarch64_gen_adjusted_ldpstp (rtx *operands, bool l #define TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE \ aarch64_override_options_after_change +#undef TARGET_PROMOTE_FUNCTION_MODE +#define TARGET_PROMOTE_FUNCTION_MODE aarch64_promote_function_mode + #undef TARGET_PASS_BY_REFERENCE #define TARGET_PASS_BY_REFERENCE aarch64_pass_by_reference aarch64-promote-v2.diff Description: aarch64-promote-v2.diff
Re: [RFC AArch64] Implement TARGET_PROMOTE_FUNCTION_MODE for ILP32 code generation
On Mar 16, 2015, at 2:28 AM, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, For this trivial testcase: extern int bar (int , int); int foo (int *a, int *b) { return bar (*a, *b); } I noticed that GCC generate redundant zero-extension instructions under ILP32 (aarch64-linux-gnu-gcc -S -O2 -mabi=ilp32). Assembly code: .arch armv8-a+fp+simd .file 1.c .text .align 2 .p2align 3,,7 .global foo .type foo, %function foo: uxtwx0, w0 uxtwx1, w1 ldr w0, [x0] ldr w1, [x1] b bar .size foo, .-foo .ident GCC: (20140403) 5.0.0 20150312 (experimental) According the ILP32 ABI, the two uxtw instrunctions here is not necessary. The following is a proposed patch to fix this issue, i.e. promoting pointer function arguments to word_mode. But I don't know whether it's a good idea to do this for pointer return values. Any comments? Actually they are required. The abi says the upper 32bits are undefined for arguments smaller then 64bits. I had this discussion a year or more ago about this case. A simple struct like struct a { int * b; int c; }; Will break the code if we don't have the zero extends Try void f(int *); void g(struct a d) { f(d.b); } And see that there is no zero extends inside g. I saw this exact thing when working on getting gccgo working. It also means the assembly functions in glibc are broken and need to be fixed. Thanks, Andrew Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c(revision 221393) +++ gcc/config/aarch64/aarch64.c(working copy) @@ -1587,7 +1587,7 @@ aarch64_function_value (const_tree type, const_tre machine_mode ag_mode; mode = TYPE_MODE (type); - if (INTEGRAL_TYPE_P (type)) + if (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type)) mode = promote_function_mode (type, mode, unsignedp, func, 1); if (aarch64_return_in_msb (type)) @@ -1650,6 +1650,24 @@ aarch64_function_value_regno_p (const unsigned int return false; } +/* Implement TARGET_PROMOTE_FUNCTION_MODE. 
*/ + +static machine_mode +aarch64_promote_function_mode (const_tree type, machine_mode mode, + int *punsignedp, const_tree fntype, + int for_return) +{ + /* Pointer function arguments and return values are promoted to word_mode. */ + if (type != NULL_TREE && POINTER_TYPE_P (type)) +{ + *punsignedp = POINTERS_EXTEND_UNSIGNED; + return word_mode; +} + + return default_promote_function_mode (type, mode, punsignedp, fntype, +for_return); +} + /* Implement TARGET_RETURN_IN_MEMORY. If the type T of the result of a function is such that @@ -11329,6 +11347,9 @@ aarch64_gen_adjusted_ldpstp (rtx *operands, bool l #define TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE \ aarch64_override_options_after_change +#undef TARGET_PROMOTE_FUNCTION_MODE +#define TARGET_PROMOTE_FUNCTION_MODE aarch64_promote_function_mode + #undef TARGET_PASS_BY_REFERENCE #define TARGET_PASS_BY_REFERENCE aarch64_pass_by_reference aarch64-promote-v2.diff
[Ada] Fix ICE on loop at -O3
This fixes a regression present on all active branches at -O3, caused by a call to fold_convert with an aggregate type. Tested on x86_64-suse-linux, applied on all active branches. 2015-03-16 Eric Botcazou ebotca...@adacore.com * gcc-interface/utils2.c (gnat_invariant_expr): Return null if the type of the expression ends up being composite. 2015-03-16 Eric Botcazou ebotca...@adacore.com * gnat.dg/loop_optimization18.ad[sb]: New test. * gnat.dg/loop_optimization18_pkg.ads: New helper. -- Eric Botcazou Index: gcc-interface/utils2.c === --- gcc-interface/utils2.c (revision 221407) +++ gcc-interface/utils2.c (working copy) @@ -2805,6 +2805,12 @@ gnat_invariant_expr (tree expr) expr = remove_conversions (expr, false); } + /* We are only interested in scalar types at the moment and, even if we may + have gone through padding types in the above loop, we must be back to a + scalar value at this point. */ + if (AGGREGATE_TYPE_P (TREE_TYPE (expr))) + return NULL_TREE; + if (TREE_CONSTANT (expr)) return fold_convert (type, expr); with Unchecked_Conversion; package Loop_Optimization18_Pkg is type Arr is array (Integer range <>) of Natural; type Rec (UB : Integer) is record L : Arr (1 .. UB); end record; type Byte_Array_Type is new String (1..4); function Conv is new Unchecked_Conversion (Byte_Array_Type, Integer); end Loop_Optimization18_Pkg; with Loop_Optimization18_Pkg; use Loop_Optimization18_Pkg; package Loop_Optimization18 is procedure Proc (Message : Byte_Array_Type); end Loop_Optimization18; -- { dg-do compile } -- { dg-options "-O3" } package body Loop_Optimization18 is procedure Proc (Message : Byte_Array_Type) is R : Rec (Conv (Message)); begin for Division in 1 .. R.UB loop R.L (Division) := 0; end loop; end; end Loop_Optimization18;
Re: [AArch64][PR65375] Fix RTX cost for vector SET
lower-subreg.c:compute_costs() only cares about the cost of a (set (reg) (const_int)) move but I think the intention, at least for now, is to return extra_cost->vect.alu for all the vector operations. Almost, what we want at the moment is COSTS_N_INSNS (1) + extra_cost->vect.alu Thanks Kyrill for the review. Regression tested on aarch64-linux-gnu with no new regression. Is this OK for trunk? Are you sure it's a (set (reg) (const_int)) that's being costed here? I thought for moves into vector registers it would be a (set (reg) (const_vector)) which we don't handle in our rtx costs currently. I think the correct approach would be to extend the aarch64_rtx_costs switch statement to handle the CONST_VECT case. I believe you can use aarch64_simd_valid_immediate to check whether x is a valid immediate for a simd instruction and give it a cost of extra_cost->vect.alu. The logic should be similar to the CONST_INT case. Sorry about the (set (reg) (const_int)) above. But the actual RTL that is being split at 220r.subreg2 is (insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0) (subreg:V4SF (reg/v:OI 73 [ __o ]) 0)) /home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625 800 {*aarch64_simd_movv4sf} (nil)) And also, if we return RTX cost above COSTS_N_INSNS (1), it will be split and it doesn't recover from there. Therefore we need something like the below to prevent that happening. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index cba3c1a..d5c80f1 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -5544,10 +5544,14 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, /* Fall through. */ case REG: + if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1)) + { + *cost = COSTS_N_INSNS (1); + } /* const0_rtx is in general free, but we will use an instruction to set a register to 0.
*/ - if (REG_P (op1) || op1 == const0_rtx) -{ + else if (REG_P (op1) || op1 == const0_rtx) + { /* The cost is 1 per register copied. */ int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) / UNITS_PER_WORD; Thanks, Kugan
[committed] Fix reduction_map destruction (PR middle-end/65431)
Hi! Trying to access *ctx->outer in delete_omp_context, which is called from splay tree deletion and thus it is pretty much random if the outer context is destructed before or after the child, is a very bad idea. It seems a unique splay tree is assigned to ctx->reduction_map only for the offloading openacc GIMPLE_OMP_TARGET constructs, so this patch deletes them on those contexts only and no other contexts. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2015-03-16 Jakub Jelinek ja...@redhat.com PR middle-end/65431 * omp-low.c (delete_omp_context): Only splay_tree_delete reduction_map in GIMPLE_OMP_TARGET && is_gimple_omp_offloaded && is_gimple_omp_oacc contexts. Don't look at ctx->outer. --- gcc/omp-low.c.jj 2015-03-12 09:49:16.0 +0100 +++ gcc/omp-low.c 2015-03-16 10:18:35.082465483 +0100 @@ -1580,10 +1580,12 @@ delete_omp_context (splay_tree_value val splay_tree_delete (ctx->field_map); if (ctx->sfield_map) splay_tree_delete (ctx->sfield_map); + /* Reduction map is copied to nested contexts, so only delete it in the + owner. */ if (ctx->reduction_map - /* Shared over several omp_contexts. */ - && (ctx->outer == NULL - || ctx->reduction_map != ctx->outer->reduction_map)) + && gimple_code (ctx->stmt) == GIMPLE_OMP_TARGET + && is_gimple_omp_offloaded (ctx->stmt) + && is_gimple_omp_oacc (ctx->stmt)) splay_tree_delete (ctx->reduction_map); /* We hijacked DECL_ABSTRACT_ORIGIN earlier. We need to clear it before Jakub
[committed] Fix a memory leak in omp-low.c offloading
Hi! While looking at PR65431, I've noticed we leak memory for every expand_omp_target called. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2015-03-16 Jakub Jelinek ja...@redhat.com * omp-low.c (expand_omp_target): Use auto_vec<tree, 11> instead of vec<tree> * with vec_alloc and release for args. Adjust all users. --- gcc/omp-low.c.jj 2015-03-16 10:47:40.0 +0100 +++ gcc/omp-low.c 2015-03-16 11:15:00.74227 +0100 @@ -9105,14 +9105,11 @@ expand_omp_target (struct omp_region *re } gimple g; - vec<tree> *args; /* The maximum number used by any start_ix, without varargs. */ - unsigned int argcnt = 11; - - vec_alloc (args, argcnt); - args->quick_push (device); + auto_vec<tree, 11> args; + args.quick_push (device); if (offloaded) -args->quick_push (build_fold_addr_expr (child_fn)); +args.quick_push (build_fold_addr_expr (child_fn)); switch (start_ix) { case BUILT_IN_GOMP_TARGET: @@ -9120,7 +9117,7 @@ expand_omp_target (struct omp_region *re case BUILT_IN_GOMP_TARGET_UPDATE: /* This const void * is part of the current ABI, but we're not actually using it.
*/ - args->quick_push (build_zero_cst (ptr_type_node)); + args.quick_push (build_zero_cst (ptr_type_node)); break; case BUILT_IN_GOACC_DATA_START: case BUILT_IN_GOACC_ENTER_EXIT_DATA: @@ -9130,10 +9127,10 @@ expand_omp_target (struct omp_region *re default: gcc_unreachable (); } - args->quick_push (t1); - args->quick_push (t2); - args->quick_push (t3); - args->quick_push (t4); + args.quick_push (t1); + args.quick_push (t2); + args.quick_push (t3); + args.quick_push (t4); switch (start_ix) { case BUILT_IN_GOACC_DATA_START: @@ -9166,9 +9163,9 @@ expand_omp_target (struct omp_region *re t_vector_length = fold_convert_loc (OMP_CLAUSE_LOCATION (c), integer_type_node, OMP_CLAUSE_VECTOR_LENGTH_EXPR (c)); - args->quick_push (t_num_gangs); - args->quick_push (t_num_workers); - args->quick_push (t_vector_length); + args.quick_push (t_num_gangs); + args.quick_push (t_num_workers); + args.quick_push (t_vector_length); } /* FALLTHRU */ case BUILT_IN_GOACC_ENTER_EXIT_DATA: @@ -9190,13 +9187,13 @@ expand_omp_target (struct omp_region *re integer_type_node, OMP_CLAUSE_ASYNC_EXPR (c)); - args->quick_push (t_async); + args.quick_push (t_async); /* Save the index, and... */ - t_wait_idx = args->length (); + t_wait_idx = args.length (); /* ... push a default value. */ - args->quick_push (fold_convert_loc (gimple_location (entry_stmt), - integer_type_node, - integer_zero_node)); + args.quick_push (fold_convert_loc (gimple_location (entry_stmt), + integer_type_node, + integer_zero_node)); c = find_omp_clause (clauses, OMP_CLAUSE_WAIT); if (c) { @@ -9206,19 +9203,19 @@ expand_omp_target (struct omp_region *re { if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_WAIT) { - args->safe_push (fold_convert_loc (OMP_CLAUSE_LOCATION (c), - integer_type_node, - OMP_CLAUSE_WAIT_EXPR (c))); + args.safe_push (fold_convert_loc (OMP_CLAUSE_LOCATION (c), + integer_type_node, + OMP_CLAUSE_WAIT_EXPR (c))); n++; } } /* Now that we know the number, replace the default value.
*/ - args->ordered_remove (t_wait_idx); - args->quick_insert (t_wait_idx, - fold_convert_loc (gimple_location (entry_stmt), - integer_type_node, - build_int_cst (integer_type_node, n))); + args.ordered_remove (t_wait_idx); + args.quick_insert (t_wait_idx, + fold_convert_loc (gimple_location (entry_stmt), +integer_type_node, +build_int_cst (integer_type_node, n))); } } break; @@ -9226,8 +9223,7 @@ expand_omp_target (struct omp_region *re gcc_unreachable (); } - g = gimple_build_call_vec (builtin_decl_explicit (start_ix), *args); - args->release (); + g = gimple_build_call_vec
Re: [AArch64][PR65375] Fix RTX cost for vector SET
Resending, now that I've figured out how to make gmail send text email instead of html. Almost, what we want at the moment is COSTS_N_INSNS (1) + extra_cost->vect.alu This won't work, because extra_cost->vect.alu is COSTS_N_INSNS (1), which means the total is COSTS_N_INSNS (2). The lower-subreg pass makes a decision on whether to split based on cost >= (word_move_cost * size/word_mode_size). Vectors are twice the size of word mode, and word moves are cost COSTS_N_INSNS (1). Setting the vector move cost to COSTS_N_INSNS (2) means we have COSTS_N_INSNS (2) >= COSTS_N_INSNS (2) and vector moves are split which is bad for vector register allocation. This calculation happens in compute_costs in lower-subreg.c. How about the attached patch? It is similar to what is currently done for scalar register move. I like this approach of using the vector register size instead of word size when we have a vector mode. Jim
Re: [AArch64][PR65375] Fix RTX cost for vector SET
On 16/03/15 13:15, Kugan wrote: On 16/03/15 23:32, Kugan wrote: lower-subreg.c:compute_costs() only cares about the cost of a (set (reg) (const_int)) move but I think the intention, at least for now, is to return extra_cost->vect.alu for all the vector operations. Almost, what we want at the moment is COSTS_N_INSNS (1) + extra_cost->vect.alu Thanks Kyrill for the review. Regression tested on aarch64-linux-gnu with no new regression. Is this OK for trunk? Are you sure it's a (set (reg) (const_int)) that's being costed here? I thought for moves into vector registers it would be a (set (reg) (const_vector)) which we don't handle in our rtx costs currently. I think the correct approach would be to extend the aarch64_rtx_costs switch statement to handle the CONST_VECT case. I believe you can use aarch64_simd_valid_immediate to check whether x is a valid immediate for a simd instruction and give it a cost of extra_cost->vect.alu. The logic should be similar to the CONST_INT case. Sorry about the (set (reg) (const_int)) above. But the actual RTL that is being split at 220r.subreg2 is (insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0) (subreg:V4SF (reg/v:OI 73 [ __o ]) 0)) /home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625 800 {*aarch64_simd_movv4sf} (nil)) And also, if we return RTX cost above COSTS_N_INSNS (1), it will be split and it doesn't recover from there. Therefore we need something like the below to prevent that happening. Hi Kyrill, How about the attached patch? It is similar to what is currently done for scalar register move. Hi Kugan, yeah, I think this is a better approach, though I can't approve. Kyrill Thanks, Kugan
[PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
Hi! On the following testcase, gimple LIM creates a vector COND_EXPR (scalar condition, vector lhs, rhs2 and rhs3), but if we don't have corresponding vector mode for it, we ICE trying to expand the BLKmode COND_EXPR, as it is unprepared for that. This patch lowers those (parallel or piecewise). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-03-16 Jakub Jelinek ja...@redhat.com PR tree-optimization/65427 * tree-vect-generic.c (do_cond, expand_vector_scalar_condition): New functions. (expand_vector_operations_1): Handle BLKmode vector COND_EXPR. * gcc.c-torture/execute/pr65427.c: New test. --- gcc/tree-vect-generic.c.jj 2015-01-15 20:25:40.0 +0100 +++ gcc/tree-vect-generic.c 2015-03-16 14:25:37.391932269 +0100 @@ -1417,6 +1417,57 @@ count_type_subparts (tree type) return VECTOR_TYPE_P (type) ? TYPE_VECTOR_SUBPARTS (type) : 1; } +static tree +do_cond (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b, +tree bitpos, tree bitsize, enum tree_code code) +{ + if (TREE_CODE (TREE_TYPE (a)) == VECTOR_TYPE) +a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos); + if (TREE_CODE (TREE_TYPE (b)) == VECTOR_TYPE) +b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos); + tree cond = gimple_assign_rhs1 (gsi_stmt (*gsi)); + return gimplify_build3 (gsi, code, inner_type, cond, a, b); +} + +/* Expand a vector COND_EXPR to scalars, piecewise. 
*/ +static void +expand_vector_scalar_condition (gimple_stmt_iterator *gsi) +{ + gassign *stmt = as_a <gassign *> (gsi_stmt (*gsi)); + tree type = gimple_expr_type (stmt); + tree compute_type = get_compute_type (COND_EXPR, mov_optab, type); + machine_mode compute_mode = TYPE_MODE (compute_type); + gcc_assert (compute_mode != BLKmode); + tree lhs = gimple_assign_lhs (stmt); + tree rhs2 = gimple_assign_rhs2 (stmt); + tree rhs3 = gimple_assign_rhs3 (stmt); + tree new_rhs; + + /* If the compute mode is not a vector mode (hence we are not decomposing + a BLKmode vector to smaller, hardware-supported vectors), we may want + to expand the operations in parallel. */ + if (GET_MODE_CLASS (compute_mode) != MODE_VECTOR_INT + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_FLOAT + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_FRACT + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_UFRACT + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_ACCUM + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_UACCUM) +new_rhs = expand_vector_parallel (gsi, do_cond, type, rhs2, rhs3, + COND_EXPR); + else +new_rhs = expand_vector_piecewise (gsi, do_cond, type, compute_type, + rhs2, rhs3, COND_EXPR); + if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (new_rhs))) +new_rhs = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, TREE_TYPE (lhs), + new_rhs); + + /* NOTE: We should avoid using gimple_assign_set_rhs_from_tree. One + way to do it is change expand_vector_operation and its callees to + return a tree_code, RHS1 and RHS2 instead of a tree. */ + gimple_assign_set_rhs_from_tree (gsi, new_rhs); + update_stmt (gsi_stmt (*gsi)); +} + /* Process one statement. If we identify a vector operation, expand it.
*/ static void @@ -1449,6 +1500,14 @@ expand_vector_operations_1 (gimple_stmt_ return; } + if (code == COND_EXPR + && TREE_CODE (TREE_TYPE (gimple_assign_lhs (stmt))) == VECTOR_TYPE + && TYPE_MODE (TREE_TYPE (gimple_assign_lhs (stmt))) == BLKmode) +{ + expand_vector_scalar_condition (gsi); + return; +} + if (code == CONSTRUCTOR && TREE_CODE (lhs) == SSA_NAME && VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (lhs))) --- gcc/testsuite/gcc.c-torture/execute/pr65427.c.jj 2015-03-16 14:36:29.489254701 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr65427.c 2015-03-16 14:40:58.789851433 +0100 @@ -0,0 +1,34 @@ +/* PR tree-optimization/65427 */ + +typedef int V __attribute__ ((vector_size (8 * sizeof (int)))); +V a, b, c, d, e, f; + +__attribute__((noinline, noclone)) void +foo (int x, int y) +{ + do +{ + if (x) + d = a ^ c; + else + d = a ^ b; +} + while (y); +} + +int +main () +{ + a = (V) { 1, 2, 3, 4, 5, 6, 7, 8 }; + b = (V) { 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80 }; + e = (V) { 0x41, 0x82, 0x43, 0x84, 0x45, 0x86, 0x47, 0x88 }; + foo (0, 0); + if (__builtin_memcmp (&d, &e, sizeof (V)) != 0) +__builtin_abort (); + c = (V) { 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40 }; + f = (V) { 0x81, 0x42, 0x83, 0x44, 0x85, 0x46, 0x87, 0x48 }; + foo (1, 0); + if (__builtin_memcmp (&d, &f, sizeof (V)) != 0) +__builtin_abort (); + return 0; +} Jakub
[RFC, stage1] Richer source location information for gcc 6 (location ranges etc)
I've been experimenting with revamping our diagnostics to better show source locations. Some of the ideas are inspired by Clang's diagnostics, but I'm going beyond what it does in some areas. I'm attaching a patch (against r221423), which is very much a work-in-progress, but in a release early, release often spirit I thought I'd solicit feedback here (obviously this would be for our next stage 1/gcc 6): Examples of the output using the patch (some very long lines, sorry): bitfld1.C:12:15: error: invalid operands of types ‘char*’ and ‘unsigned char:1’ to binary ‘operator|’ void *a = s.p | s.f; ~~~ ^ ~~~ multiline-rich-location-01.C:13:11: error: no match for ‘operator+’ (operand types are ‘int’ and ‘s’) return (first_function_with_a_very_long_name () ~~~ + second_function_with_a_very_long_name ()); ^ Going beyond pure ASCII, here's an example using unicode box-drawing chars (and showing a macro expansion): gcc.dg/20150304-01.c: In function ‘f’: gcc.dg/20150304-01.c:4:94: error: invalid operands to binary > (have ‘struct mystruct’ and ‘float’) #define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a > __b ?
__b : __a; }) ─── ▲ ─── gcc.dg/20150304-01.c:11:7: note: in expansion of macro ‘MYMAX’ X = MYMAX(P, F); ▲── An example showing captions on ranges, in this case via right-hand margins: multiline-rich-location-02.C: In function ‘int test(int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int)’: multiline-rich-location-02.C:30:11: error: no match for ‘operator+’ (operand types are ‘int’ and ‘s’) return (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,┐ └│ consectetur, adipiscing, elit, │ ── │ sed, eiusmod, tempor, │ ─ ├type ‘int’ incididunt, ut, labore, et, │ ─── │ dolore, magna, aliqua) │ ─┘ ┘ + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, // { dg-error "no match" }┐ ▲ └──│ amet, consectetur, │ ── │ adipiscing, elit, sed, │ ── ├type ‘s’ eiusmod, tempor, incididunt,│ │ ut, labore, et, dolore, │ ─── │ magna, aliqua)); │ ─┘ ┘ Potentially we can support locations within string literals, for -Wformat: pr52952-spurious-trailing-percent.c: In function ‘foo’: pr52952-spurious-trailing-percent.c:7:23: warning: spurious trailing ‘%’ in format [-Wformat=] printf("hello world %"); ▲ including ranges: pr52952-mismatching-argument-type.c: In function ‘foo’: pr52952-mismatching-argument-type.c:7:17: warning: format ‘%i’ expects argument of type ‘int’, but argument 2 has type ‘const char *’ [-Wformat=] printf("hello %i", msg); ▲─ potentially for multiline ones: pr52952-multiline-format-string.c: In function ‘f’: pr52952-multiline-format-string.c:6:12: warning: format ‘%d’ expects a matching ‘int’ argument [-Wformat=] "%" ▲─ "d" ─┘ You can see colorized versions of these at: https://dmalcolm.fedorapeople.org/gcc/2015-03-06/ https://dmalcolm.fedorapeople.org/gcc/2015-03-09/ https://dmalcolm.fedorapeople.org/gcc/2015-03-11/ The first two ranges in a
[Committed, PATCH] Disable all target libraries if not building gcc
On Fri, Mar 13, 2015 at 1:06 PM, H.J. Lu hongjiu...@intel.com wrote: I am working on SHF_COMPRESSED support: http://www.sco.com/developers/gabi/latest/ch4.sheader.html I will import zlib from GCC source tree into binutils-gdb tree. But zlib failed to build as a target library in binutils-gdb. There is no need to build target libraries if not building gcc. OK for master? Tested with binutils and GCC build. Checked into binutils-gdb tree. H.J. --- * configure.ac (target_configdirs): Clear if not building gcc. * configure: Regenerated. --- configure | 3 +++ configure.ac | 3 +++ 2 files changed, 6 insertions(+) diff --git a/configure b/configure index d075cf3..5caf82b 100755 --- a/configure +++ b/configure @@ -6614,6 +6614,9 @@ Supported languages are: ${potential_languages} $LINENO 5 ac_configure_args=`echo $ac_configure_args | sed -e "s/ '--enable-languages=[^ ]*'//g" -e "s/$/ '--enable-languages=$enable_languages'/"` +else + # Disable all target libraries if not building gcc. + target_configdirs= fi # Handle --disable-component generically. diff --git a/configure.ac b/configure.ac index 5ff56bf..34774d8 100644 --- a/configure.ac +++ b/configure.ac @@ -2039,6 +2039,9 @@ Supported languages are: ${potential_languages}]) AC_SUBST(stage1_languages) ac_configure_args=`echo $ac_configure_args | sed -e "s/ '--enable-languages=[[^ ]]*'//g" -e "s/$/ '--enable-languages=$enable_languages'/"` +else + # Disable all target libraries if not building gcc. + target_configdirs= fi # Handle --disable-component generically. -- 1.9.3 -- H.J.
[Patch, Fortran, PR 64787, v1] Invalid code on sourced allocation of class(*) character string
Dear all, herewith I submit a patch to fix another issue on deferred length strings and unlimited polymorphic objects. When an unlimited polymorphic object is allocated to store a char array, the _len component was not set correctly in all cases in gfc_allocate (). While analyzing the issue in gfc_allocate () I figured that it would be easier to fix by restructuring gfc_allocate () completely. I set to work and voila, here it is. Now the source=/mold= expression is evaluated just once before the allocate of all objects, when the objects to allocate aren't arrays. So most of the source=/mold= expression (expr3) evaluation has been extracted from the loop running over the objects to allocate. This also makes it possible to compute the size of a char array correctly, which needs the size of a single character. This is now done by storing that size in the size component of the vtab (for deferred length char arrays only). The size to allocate is then computed by the expression: expr3->_len > 0 ? expr3->_len * expr3->_vptr->size : expr3->_vptr->size Furthermore, PR64787 addresses the issue of copying the data correctly. In the vtab of an unlimited polymorphic object the copy function needs four arguments: from, to, from_len, to_cap, but before this patch copy () was called with from and to only, resulting in a crash of the compiled program. This patch introduces this block to solve the issue: if (to->_len > 0) to->_vptr->copy (from->_data, to->_data, from->_len, to->_len); else to->_vptr->copy (from->_data, to->_data); (For simplicity this explanation assumes that both objects from and to are unlimited polymorphic; the code will handle deferred length strings for from, too.) Furthermore, I tried to address the open comment #2 of PR57456. Bootstraps and regtests ok on x86_64-linux-gnu/F20. Comments and reviews welcome!
Note, this patch assumes that the following patches, which are all awaiting review, BTW, are present in your code tree (apply from top to bottom): https://gcc.gnu.org/ml/fortran/2015-02/msg00105.html (missing may produce deltas when patching) https://gcc.gnu.org/ml/fortran/2015-03/msg00032.html (same) https://gcc.gnu.org/ml/fortran/2015-03/msg00063.html (nice to have, or the new patch may apply with delta.) https://gcc.gnu.org/ml/fortran/2015-03/msg00074.html (necessary) https://gcc.gnu.org/ml/fortran/2015-03/msg00075.html (necessary) Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de pr64787_v1.clog Description: Binary data diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c index 786876c..949fc98 100644 --- a/gcc/fortran/class.c +++ b/gcc/fortran/class.c @@ -2562,13 +2562,19 @@ find_intrinsic_vtab (gfc_typespec *ts) c->attr.access = ACCESS_PRIVATE; /* Build a minimal expression to make use of - target-memory.c/gfc_element_size for 'size'. */ + target-memory.c/gfc_element_size for 'size'. Special handling + for character arrays, that are not constant sized: to support + len(str)*kind, only the kind information is stored in the + vtab. */ e = gfc_get_expr (); e->ts = *ts; e->expr_type = EXPR_VARIABLE; c->initializer = gfc_get_int_expr (gfc_default_integer_kind, NULL, - (int)gfc_element_size (e)); + ts->type == BT_CHARACTER && charlen == 0 ? + ts->kind : + (int)gfc_element_size (e)); gfc_free_expr (e); /* Add component _extends.
*/ diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index f55c691..f4fa9c8 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -3168,6 +3168,7 @@ void gfc_add_component_ref (gfc_expr *, const char *); void gfc_add_class_array_ref (gfc_expr *); #define gfc_add_data_component(e) gfc_add_component_ref(e,"_data") #define gfc_add_vptr_component(e) gfc_add_component_ref(e,"_vptr") +#define gfc_add_len_component(e) gfc_add_component_ref(e,"_len") #define gfc_add_hash_component(e) gfc_add_component_ref(e,"_hash") #define gfc_add_size_component(e) gfc_add_component_ref(e,"_size") #define gfc_add_def_init_component(e) gfc_add_component_ref(e,"_def_init") diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index 54f8f4a..9976f79 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -4975,8 +4975,7 @@ static tree gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset, gfc_expr ** lower, gfc_expr ** upper, stmtblock_t * pblock, stmtblock_t * descriptor_block, tree * overflow, - tree expr3_elem_size, tree *nelems, gfc_expr *expr3, - gfc_typespec *ts) + tree expr3_elem_size, tree *nelems, gfc_expr *expr3) { tree type; tree tmp; @@ -5156,9 +5155,6 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset, tmp = TYPE_SIZE_UNIT (tmp); } } - else if (ts->type != BT_UNKNOWN && ts->type != BT_CHARACTER) - /* FIXME: Properly handle characters. See
[Committed, PATCH]: Move cloog.m4 ChangeLog to config/ChangeLog
I checked in this patch to move cloog.m4 ChangeLog to config/ChangeLog. H.J. --- Index: ChangeLog === --- ChangeLog (revision 221457) +++ ChangeLog (working copy) @@ -331,7 +331,6 @@ 2014-11-11 Tobias Burnus bur...@net-b.de - * config/cloog.m4: Remove. * Makefile.def: Remove CLooG. * Makefile.tpl: Ditto. * configure.ac: Ditto. Index: config/ChangeLog === --- config/ChangeLog(revision 221457) +++ config/ChangeLog(working copy) @@ -25,6 +25,10 @@ * target-posix: New file. +2014-11-11 Tobias Burnus bur...@net-b.de + + * cloog.m4: Remove. + 2014-10-27 Tom Tromey tro...@redhat.com * gcc-plugin.m4: New file.
Re: [patch,avr]: Part5: Fix various problems with specs and specs file generation.
This patch introduces a new AVR-specific command-line option '-nodevicelib' so that linking of lib<dev>.a can be bypassed. The argument of -specs= is suffixed by %s instead of supplying the absolute path. That way -specs= works with installation paths that contain spaces. avr_mcu_t.library_name and its initializers in avr-mcus.def are cleaned up. This field was used to define __AVR_DEV_LIB_NAME__. If no device macro is defined because a device is not supported by avr-gcc, that hook macro can be defined to tell avr/io.h where to find the device's header file. This means it is pointless to define __AVR_DEV_LIB_NAME__ in avr-gcc: If a device is supported, the device macro (e.g. __AVR_ATmega8__) is defined and __AVR_DEV_LIB_NAME__ will never be used. The patch adds more help text to the device specs file to inform a potential reader about the role of the hook macro __AVR_DEV_LIB_NAME__. The patch also removes specs known to GCC and which don't directly depend on the device from the device specs file. This makes the specs file smaller and less confusing. Ok for trunk? Johann PR target/65296 * config/avr/avrlibc.h (LIB_SPEC, LIBGCC_SPEC) [AVR1]: Don't link libgcc.a, libc.a, libm.a. * config/avr/specs.h: Same. * config/avr/gen-avr-mmcu-specs.c (print_mcu): Don't print specs which don't (directly) depend on the device. Print more help. (*avrlibc_devicelib) [-nodevicelib]: Don't link libdev.a. (*cpp): Don't define __AVR_DEV_LIB_NAME__. * config/avr/driver-avr.c: Remove -nodevicelib from option list in case of an error. (avr_devicespecs_file): Use suffix %s for the specs file name instead of the absolute path. * config/avr/avr-arch.h (avr_mcu_t) [.library_name]: Remove. * config/avr/avr-mcus.def: Same: Adjust initializers. * config/avr/avr.opt (-nodevicelib): New option. * doc/invoke.texi (AVR Options): Document it.
Index: config/avr/avrlibc.h === --- config/avr/avrlibc.h (revision 221448) +++ config/avr/avrlibc.h (working copy) @@ -19,21 +19,17 @@ You should have received a copy of the G along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ -/* AVR-Libc implements functions from libgcc.a in libm.a, see PR54461. - More AVR-Libc specific specs originate from gen-avr-mmcu-specs.c: - - - LIBGCC_SPEC (*libgcc) - - LIB_SPEC (*lib) - -*/ - #undef LIB_SPEC #define LIB_SPEC\ - -lc %(avrlibc_devicelib) + %{!mmcu=avr1:-lc} %(avrlibc_devicelib) + +// AVR-Libc implements functions from libgcc.a in libm.a, see PR54461. +// For a list of functions which are provided by libm.a and are +// omitted from libgcc.a see libgcc's t-avrlibc. #undef LIBGCC_SPEC #define LIBGCC_SPEC \ - -lgcc -lm + %{!mmcu=avr1:-lgcc -lm} #undef STARTFILE_SPEC #define STARTFILE_SPEC \ Index: config/avr/avr-arch.h === --- config/avr/avr-arch.h (revision 221448) +++ config/avr/avr-arch.h (working copy) @@ -122,9 +122,6 @@ const char *const macro; /* Number of 64k segments in the flash. */ int n_flash; - - /* Old name of device library. */ - const char *const library_name; } avr_mcu_t; /* AVR device specific features. Index: config/avr/avr-devices.c === --- config/avr/avr-devices.c (revision 221448) +++ config/avr/avr-devices.c (working copy) @@ -111,12 +111,12 @@ avr_texinfo[] = const avr_mcu_t avr_mcu_types[] = { -#define AVR_MCU(NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH, LIBNAME)\ - { NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH, LIBNAME }, +#define AVR_MCU(NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH)\ + { NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH }, #include avr-mcus.def #undef AVR_MCU /* End of list. 
*/ - { NULL, ARCH_UNKNOWN, AVR_ISA_NONE, NULL, 0, 0, 0, NULL } + { NULL, ARCH_UNKNOWN, AVR_ISA_NONE, NULL, 0, 0, 0 } }; Index: config/avr/avr.opt === --- config/avr/avr.opt (revision 221448) +++ config/avr/avr.opt (working copy) @@ -94,3 +94,7 @@ Warn if the address space of an address mfract-convert-truncate Target Report Mask(FRACT_CONV_TRUNC) Allow to use truncation instead of rounding towards 0 for fractional int types + +nodevicelib +Driver Target Report RejectNegative +Do not link against the device-specific library libdev.a Index: config/avr/avr-mcus.def === --- config/avr/avr-mcus.def (revision 221448) +++ config/avr/avr-mcus.def (working copy) @@ -29,16 +29,25 @@ After that, rebuild everything and
Re: [PATCH] Remove unused function
On 03/16/15 07:36, Marek Polacek wrote: This function is unused since matrix-reorg.c removal back in 2012. I think there's no point in keeping it around. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2015-03-16 Marek Polacek pola...@redhat.com * cgraph.h (add_new_static_var): Remove declaration. * varpool.c (add_new_static_var): Remove function. OK. jeff
Re: [RFC, stage1] Richer source location information for gcc 6 (location ranges etc)
On 16 March 2015 at 17:52, David Malcolm dmalc...@redhat.com wrote: I've been experimenting with revamping our diagnostics to better show source locations. Some of the ideas are inspired by Clang's diagnostics, but I'm going beyond what it does in some areas. I'm attaching a patch (against r221423), which is very much a work-in-progress, but in a release early, release often spirit I thought I'd solicit feedback here (obviously this would be for our next stage 1/gcc 6): Wow, awesome! Obviously, I'm very much in favour of this in general. From my own experience, you are going to have two enormous bike-shedding experiences: one for the API and another for the output. My advice would be to implement a few obvious testcases and just the ability to print some more detailed location info, like Clang's -fdiagnostics-print-source-range-info (but see also my past troubles with that: https://gcc.gnu.org/ml/gcc-patches/2009-08/msg00174.html) without even printing any caret info. This way you can work out the internal details without having to defend at the same time the actual user interface. I think my mistake was to try to implement -fdiagnostics-print-source-range-info as a user option, I should have proposed it as a debug option such as -fdump-location-info and just move forward from that. Also, people are going to more easily review and accept this if you propose small incremental steps that just tackle (initially) one case at a time (of course, you may be working on your grand plan behind the scenes). Cheers, Manuel.
Re: C++ PATCH for c++/65327 aka DR 1688 (volatile constexpr variables)
OK for trunk, not for release branches. Jason
Re: [patch c++]: Fix for PR/65390
If there is an alignment mismatch without user intervention, there is a problem, we can't just ignore it. Where we run into trouble is with array types where the version built earlier has not been laid out yet but the new one has been. I've been trying to deal with that by making sure that we lay out the original type as well, but obviously that isn't working for this case. Why not? I suppose we could avoid checking TYPE_ALIGN if neither TYPE_USER_ALIGN nor TYPE_SIZE are set on 't', but checking TYPE_USER_ALIGN isn't enough. Jason
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On March 16, 2015 5:21:02 PM GMT+01:00, Jakub Jelinek ja...@redhat.com wrote: Hi! On the following testcase, gimple LIM creates a vector COND_EXPR (scalar condition, vector lhs, rhs2 and rhs3), but if we don't have a corresponding vector mode for it, we ICE trying to expand the BLKmode COND_EXPR, as it is unprepared for that. This patch lowers those (parallel or piecewise). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. Though maybe LIM should not create these for cost reasons? I also wonder if we should lower them to control flow?

Thanks, Richard.

2015-03-16  Jakub Jelinek  ja...@redhat.com

	PR tree-optimization/65427
	* tree-vect-generic.c (do_cond, expand_vector_scalar_condition): New
	functions.
	(expand_vector_operations_1): Handle BLKmode vector COND_EXPR.

	* gcc.c-torture/execute/pr65427.c: New test.

--- gcc/tree-vect-generic.c.jj	2015-01-15 20:25:40.0 +0100
+++ gcc/tree-vect-generic.c	2015-03-16 14:25:37.391932269 +0100
@@ -1417,6 +1417,57 @@ count_type_subparts (tree type)
   return VECTOR_TYPE_P (type) ? TYPE_VECTOR_SUBPARTS (type) : 1;
 }

+static tree
+do_cond (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
+	 tree bitpos, tree bitsize, enum tree_code code)
+{
+  if (TREE_CODE (TREE_TYPE (a)) == VECTOR_TYPE)
+    a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
+  if (TREE_CODE (TREE_TYPE (b)) == VECTOR_TYPE)
+    b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
+  tree cond = gimple_assign_rhs1 (gsi_stmt (*gsi));
+  return gimplify_build3 (gsi, code, inner_type, cond, a, b);
+}
+
+/* Expand a vector COND_EXPR to scalars, piecewise.  */
+
+static void
+expand_vector_scalar_condition (gimple_stmt_iterator *gsi)
+{
+  gassign *stmt = as_a <gassign *> (gsi_stmt (*gsi));
+  tree type = gimple_expr_type (stmt);
+  tree compute_type = get_compute_type (COND_EXPR, mov_optab, type);
+  machine_mode compute_mode = TYPE_MODE (compute_type);
+  gcc_assert (compute_mode != BLKmode);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  tree rhs3 = gimple_assign_rhs3 (stmt);
+  tree new_rhs;
+
+  /* If the compute mode is not a vector mode (hence we are not decomposing
+     a BLKmode vector to smaller, hardware-supported vectors), we may want
+     to expand the operations in parallel.  */
+  if (GET_MODE_CLASS (compute_mode) != MODE_VECTOR_INT
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_FLOAT
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_FRACT
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_UFRACT
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_ACCUM
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_UACCUM)
+    new_rhs = expand_vector_parallel (gsi, do_cond, type, rhs2, rhs3,
+				      COND_EXPR);
+  else
+    new_rhs = expand_vector_piecewise (gsi, do_cond, type, compute_type,
+				       rhs2, rhs3, COND_EXPR);
+  if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (new_rhs)))
+    new_rhs = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, TREE_TYPE (lhs),
+			       new_rhs);
+
+  /* NOTE: We should avoid using gimple_assign_set_rhs_from_tree.  One
+     way to do it is change expand_vector_operation and its callees to
+     return a tree_code, RHS1 and RHS2 instead of a tree.  */
+  gimple_assign_set_rhs_from_tree (gsi, new_rhs);
+  update_stmt (gsi_stmt (*gsi));
+}
+
 /* Process one statement.  If we identify a vector operation, expand it.
    */

 static void
@@ -1449,6 +1500,14 @@ expand_vector_operations_1 (gimple_stmt_
       return;
     }

+  if (code == COND_EXPR
+      && TREE_CODE (TREE_TYPE (gimple_assign_lhs (stmt))) == VECTOR_TYPE
+      && TYPE_MODE (TREE_TYPE (gimple_assign_lhs (stmt))) == BLKmode)
+    {
+      expand_vector_scalar_condition (gsi);
+      return;
+    }
+
   if (code == CONSTRUCTOR
       && TREE_CODE (lhs) == SSA_NAME
      && VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (lhs)))
--- gcc/testsuite/gcc.c-torture/execute/pr65427.c.jj	2015-03-16 14:36:29.489254701 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr65427.c	2015-03-16 14:40:58.789851433 +0100
@@ -0,0 +1,34 @@
+/* PR tree-optimization/65427 */
+
+typedef int V __attribute__ ((vector_size (8 * sizeof (int))));
+V a, b, c, d, e, f;
+
+__attribute__((noinline, noclone)) void
+foo (int x, int y)
+{
+  do
+    {
+      if (x)
+	d = a ^ c;
+      else
+	d = a ^ b;
+    }
+  while (y);
+}
+
+int
+main ()
+{
+  a = (V) { 1, 2, 3, 4, 5, 6, 7, 8 };
+  b = (V) { 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80 };
+  e = (V) { 0x41, 0x82, 0x43, 0x84, 0x45, 0x86, 0x47, 0x88 };
+  foo (0, 0);
+  if (__builtin_memcmp (&d, &e, sizeof (V)) != 0)
+    __builtin_abort ();
+  c = (V) { 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40 };
+  f = (V) { 0x81, 0x42, 0x83, 0x44, 0x85, 0x46, 0x87, 0x48 };
+  foo (1, 0);
+  if (__builtin_memcmp (&d, &f, sizeof (V)) != 0)
+    __builtin_abort ();
+  return 0;
+}
Re: [Patch, fortran] PR59198 - [4.8/4.9/5 Regression] ICE on cyclically dependent polymorphic types
Dear Tobias, As far as I can see, without the patch, gfc_get_derived_type goes into a continuous loop trying to build the abstract type. Why this is not the case with an additional non-procedure pointer component, I do not know. I suspect that there is a corner case out there that will challenge this patch but I was unable to generate it. I decided therefore to commit, with an additional condition in the loop to prevent repeated attempts to build the component field. Committed to trunk as revision 221474. 4.8 and 4.9 to follow. Cheers Paul On 16 March 2015 at 09:39, Paul Richard Thomas paul.richard.tho...@gmail.com wrote: Dear Tobias, I think that I have a partial understanding now and will attempt to verify it tonight. Certainly to not build the components, when a derived type is flagged to have proc_pointer components, cannot be right just because there can be other components as in the original testcase. This led to the lack of DECL_SIZE information for the field 'term' and the subsequent ICE. The key to understanding is the problem with typebound_operator_9.f03, when the check for proc_pointer components is omitted completely. This implies that there is something missing from the function (sorry, I forget its name) that builds proc_pointer fields. I will check that idea - enlightenment might lead to a different patch. Thanks for the review. Paul On 16 March 2015 at 09:07, Tobias Burnus tobias.bur...@physik.fu-berlin.de wrote: Dear Paul, Paul Richard Thomas wrote: The ChangeLog says it all. If the check is not done for components that are not procedure pointers, typebound_operator_9.f03 breaks. I am not entirely sure why this is the case but the fix works fine. Bootstraps and regtests on FC21/x86_64 - OK for 4.8, 4.9 and 5.0? OK. I have to admit that I also do not really understand why that's required. Tobias -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx -- Outside of a dog, a book is a man's best friend.
Inside of a dog it's too dark to read. Groucho Marx
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers jos...@codesourcery.com wrote: On Fri, 13 Mar 2015, H.J. Lu wrote: I am working on SHF_COMPRESSED support: http://www.sco.com/developers/gabi/latest/ch4.sheader.html I will import zlib from GCC source tree into binutils-gdb tree. But zlib failed to build as a target library in binutils-gdb. There is no need to build target libraries if not building gcc. OK for master? This is definitely wrong. newlib / libgloss is a target library that certainly makes sense to build separately from GCC, and the toplevel configure / build code should be identical in all three repositories (GCC, binutils-gdb, newlib-cygwin). We need to work out something to avoid building target libraries in binutils. -- H.J.
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, 16 Mar 2015, H.J. Lu wrote: On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers jos...@codesourcery.com wrote: On Fri, 13 Mar 2015, H.J. Lu wrote: I am working on SHF_COMPRESSED support: http://www.sco.com/developers/gabi/latest/ch4.sheader.html I will import zlib from GCC source tree into binutils-gdb tree. But zlib failed to build as a target library in binutils-gdb. There is no need to build target libraries if not building gcc. OK for master? This is definitely wrong. newlib / libgloss is a target library that certainly makes sense to build separately from GCC, and the toplevel configure / build code should be identical in all three repositories (GCC, binutils-gdb, newlib-cygwin). We need to work out something to avoid building target libraries in binutils. I suggest not building zlib as a target library unless libgcj is also being built as a target library, given that there should be no need to build it as a target library in isolation. It can be useful to build even GCC's target libraries when not building GCC (for example, rebuilding them with different optimization options using an installed compiler) - for libraries that don't use the host-side tm.h (all except libgcc and libobjc, I think), this is practical. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, Mar 16, 2015 at 4:41 PM, Joseph Myers jos...@codesourcery.com wrote: On Mon, 16 Mar 2015, H.J. Lu wrote: On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers jos...@codesourcery.com wrote: On Fri, 13 Mar 2015, H.J. Lu wrote: I am working on SHF_COMPRESSED support: http://www.sco.com/developers/gabi/latest/ch4.sheader.html I will import zlib from GCC source tree into binutils-gdb tree. But zlib failed to build as a target library in binutils-gdb. There is no need to build target libraries if not building gcc. OK for master? This is definitely wrong. newlib / libgloss is a target library that certainly makes sense to build separately from GCC, and the toplevel configure / build code should be identical in all three repositories (GCC, binutils-gdb, newlib-cygwin). We need to work out something to avoid building target libraries in binutils. I suggest not building zlib as a target library unless libgcj is also being built as a target library, given that there should be no need to built it as a target library in isolation. The logic is there. But somehow it doesn't work for binutils. I will take another look. -- H.J.
Re: [Patch, libstdc++/65420] Use constexpr variables as regex_constans flags
On Sun, Mar 15, 2015 at 5:59 AM, Daniel Krügler daniel.krueg...@gmail.com wrote: (I switched to a less interesting but practical change, so all class stuff is gone. But I'd like to reply to your comments.) Your implementation choice is an interesting approach. I believe that a strict reading of the library specification does not allow that form, because you are using as bitmask type a class type that is not std::bitset: You are too kind :) I simply forgot the exact requirements of a bitmask type, and didn't know how it should work; I thought it should be a concept, with several operators defined. As you mentioned, this isn't actually a bad idea; we are using concepts everywhere, so why not make it a concept? What I would consider as more controversial are the following things: a) The class takes advantage of a deprecated rule to implicitly declare a copy-assignment operator (because it has a user-declared copy-constructor). I suggest fixing that and adding a defaulted one before, at some future point, this bitmask type is no longer copy-assignable. This would be an easy fix. I admit that I can never remember in which cases a special member function will be implicitly defined or deleted :( b) Basically no operation is marked as noexcept. This is user-observable and seems like an unfortunate outcome. Obviously all of them should be marked as noexcept. c) The presence of a non-explicit conversion to bool allows users to unintentionally write code like std::regex_constants::syntax_option_type a = ..; [..] int x = a + 3; // Oops The presence of this function also *seemingly* allows to implicitly convert syntax_option_type to any integer type: int x = a; // Oops I suggest to make this conversion function explicit. I did so just because I used this implicit conversion (since they are originally enums) elsewhere in the code. I didn't want to change all that code; besides, the standard didn't specify the exact behavior of these two types.
So it is crappy, yes, but was more like an engineering trade-off to me (when I'm overengineering things ;). d) One could consider marking the mutable operations (syntax_option_type operator=(syntax_option_type)) as constexpr in C++14 mode, but that is just an idea, not a required functionality. I did notice that in other functions (operator bool, etc), but ignored these. On Sun, Mar 15, 2015 at 6:13 AM, Jonathan Wakely jwakely@gmail.com wrote: On 15 March 2015 at 08:09, Tim Shen wrote: Did a little bit of refactoring on regex_constants, so that users won't be polluted too much (they are not enum types anymore). I think this is overengineered and unnecessary. All it needs is something like:

enum syntax_option_type : unsigned { };
constexpr syntax_option_type icase = ...;

Acknowledged. Here's the simple version of it. Thanks!

-- Regards, Tim Shen

commit 33a88cefd85e04fcb7af463d6b23ea2248feef67
Author: Tim Shen tims...@google.com
Date:   Sun Mar 15 23:05:57 2015 -0700

	PR libstdc++/65420
	* include/bits/regex_constants.h: Use constexpr variables for flags.
	* testsuite/28_regex/constants/constexpr.cc: New testcase.

diff --git a/libstdc++-v3/include/bits/regex_constants.h b/libstdc++-v3/include/bits/regex_constants.h
index 1ef5a43..e2c7631 100644
--- a/libstdc++-v3/include/bits/regex_constants.h
+++ b/libstdc++-v3/include/bits/regex_constants.h
@@ -77,88 +77,97 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    * elements @c ECMAScript, @c basic, @c extended, @c awk, @c grep, @c egrep
    * %set.
    */
-  enum syntax_option_type : unsigned int
-  {
-    /**
-     * Specifies that the matching of regular expressions against a character
-     * sequence shall be performed without regard to case.
-     */
-    icase = 1 << _S_icase,
-
-    /**
-     * Specifies that when a regular expression is matched against a character
-     * container sequence, no sub-expression matches are to be stored in the
-     * supplied match_results structure.
-     */
-    nosubs = 1 << _S_nosubs,
-
-    /**
-     * Specifies that the regular expression engine should pay more attention to
-     * the speed with which regular expressions are matched, and less to the
-     * speed with which regular expression objects are constructed.  Otherwise
-     * it has no detectable effect on the program output.
-     */
-    optimize = 1 << _S_optimize,
-
-    /**
-     * Specifies that character ranges of the form [a-b] should be locale
-     * sensitive.
-     */
-    collate = 1 << _S_collate,
-
-    /**
-     * Specifies that the grammar recognized by the regular expression engine is
-     * that used by ECMAScript in ECMA-262 [Ecma International, ECMAScript
-     * Language Specification, Standard Ecma-262, third edition, 1999], as
-     * modified in section [28.13].  This grammar is similar to that defined
-     * in the PERL scripting language but extended with elements found in the
-     * POSIX regular expression grammar.
-     */
-    ECMAScript = 1
Re: [PATCH][RFA] [PR rtl-optimization/64317] Enhance postreload-gcse.c to eliminate more redundant loads
On Wed, Mar 11, 2015 at 03:30:36PM -0600, Jeff Law wrote:

+#ifndef GCC_GCSE__COMMONH
+#define GCC_GCSE__COMMONH

GCC_GCSE_COMMON_H instead?

@@ -1308,8 +1396,19 @@ gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED)
   if (expr_table->elements () > 0)
     {
+      /* Knowing which MEMs are transparent through a block can signifiantly
+	 increase the number of reundant loads found.  So compute transparency
+	 information for for each memory expression in the hash table.  */

s/for for/for/ ?

+      df_analyze ();
+      /* This can not be part of the normal allocation routine because
+	 we have to know the number of elements in the hash table.  */
+      transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+				     expr_table->elements ());
+      bitmap_vector_ones (transp, last_basic_block_for_fn (cfun));
+      expr_table->traverse <FILE *, compute_expr_transp> (dump_file);
       eliminate_partially_redundant_loads ();
       delete_redundant_insns ();
+      sbitmap_vector_free (transp);

       if (dump_file)
	 {

What effect does the patch have on compile time on say x86_64 or ppc64? I'm slightly leaning towards trying it even in stage4, but if e.g. richi disagrees, we could defer it to stage1 too.

	Jakub
Re: [patch c++]: Fix for PR/65390
2015-03-16 19:07 GMT+01:00 Jason Merrill ja...@redhat.com: If there is an alignment mismatch without user intervention, there is a problem, we can't just ignore it. Where we run into trouble is with array types where the version built earlier has not been laid out yet but the new one has been. I've been trying to deal with that by making sure that we lay out the original type as well, but obviously that isn't working for this case. Why not? Well, TYPE_ALIGN (t) is set to 32, and it differs from TYPE_ALIGN (result) (value 8), and TYPE_USER_ALIGN isn't set. I suppose we could avoid checking TYPE_ALIGN if neither TYPE_USER_ALIGN nor TYPE_SIZE are set on 't', but checking TYPE_USER_ALIGN isn't enough. For t, TYPE_SIZE is set, but it isn't a constant (as it is a variably modified type). So could we add an additional check here for TYPE_SIZE being an integer constant? Something like this condition, you mean?

...
if (TYPE_USER_ALIGN (t) != TYPE_USER_ALIGN (result)
    || ((TYPE_USER_ALIGN (t) || TREE_CODE (TYPE_SIZE (t)) == INTEGER_CST)
	&& TYPE_ALIGN (t) != TYPE_ALIGN (result)))
  {
...

Jason

Kai
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On Mon, Mar 16, 2015 at 07:15:59PM +0100, Richard Biener wrote: On March 16, 2015 5:21:02 PM GMT+01:00, Jakub Jelinek ja...@redhat.com wrote: On the following testcase, gimple LIM creates a vector COND_EXPR (scalar condition, vector lhs, rhs2 and rhs3), but if we don't have corresponding vector mode for it, we ICE trying to expand the BLKmode COND_EXPR, as it is unprepared for that. This patch lowers those (parallel or piecewise). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Though maybe LIM should not create these for cost reasons? I also wonder if we should lower them to control flow? Yeah, I've been thinking about teaching LIM not to do that if it is BLKmode. But then found how many other spots create COND_EXPRs and thought I just can't catch all of them. But guess I can change LIM incrementally too. I've also been thinking about lowering it to control flow, but only if it couldn't be done in say two halves comparison as in the testcase. I suppose doing say 2 V4SImode COND_EXPRs would still be beneficial over control flow, but if it is more than that (say 4 or 8+) it might be already better to just expand it as a PHI. But, as we don't create basic blocks in tree-vect-generic.c right now, I thought it might be too much for stage4. Jakub
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On March 16, 2015 7:26:58 PM GMT+01:00, Jakub Jelinek ja...@redhat.com wrote: On Mon, Mar 16, 2015 at 07:15:59PM +0100, Richard Biener wrote: On March 16, 2015 5:21:02 PM GMT+01:00, Jakub Jelinek ja...@redhat.com wrote: On the following testcase, gimple LIM creates a vector COND_EXPR (scalar condition, vector lhs, rhs2 and rhs3), but if we don't have corresponding vector mode for it, we ICE trying to expand the BLKmode COND_EXPR, as it is unprepared for that. This patch lowers those (parallel or piecewise). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Though maybe LIM should not create these for cost reasons? I also wonder if we should lower them to control flow? Yeah, I've been thinking about teaching LIM not to do that if it is BLKmode. But then found how many other spots create COND_EXPRs and thought I just can't catch all of them. But guess I can change LIM incrementally too. I've also been thinking about lowering it to control flow, but only if it couldn't be done in say two halves comparison as in the testcase. I suppose doing say 2 V4SImode COND_EXPRs would still be beneficial over control flow, but if it is more than that (say 4 or 8+) it might be already better to just expand it as a PHI. But, as we don't create basic blocks in tree-vect-generic.c right now, I thought it might be too much for stage4. Yeah, the patch is fine for stage 4. I just wrote down my random thoughts. Richard. Jakub
Re: [AArch64][PR65375] Fix RTX cost for vector SET
On 16/03/15 05:36, Kugan wrote:

Hi Kugan,

AArch64 RTX cost for vector SET is causing PR65375. Lower subreg is using this rtx_cost to compute the cost of moves, and splitting anything larger than word size, 64-bits in this case. The aarch64 rtx_costs is returning 2 * COSTS_N_INSNS (1) for vector moves, so they get split. The attached patch fixes this. With the patch the testcase in the PR:

#include <arm_neon.h>

void hello_vst2 (float *fout, float *fin)
{
  float32x4x2_t a;
  a = vld2q_f32 (fin);
  vst2q_f32 (fout, a);
}

changes to:

hello_vst2:
-	ld2	{v0.4s - v1.4s}, [x1]
-	sub	sp, sp, #32
-	umov	x1, v0.d[0]
-	umov	x2, v0.d[1]
-	str	q1, [sp, 16]
-	mov	x5, x1
-	stp	x5, x2, [sp]
-	ld1	{v0.16b - v1.16b}, [sp]
+	ld2	{v2.4s - v3.4s}, [x1]
+	orr	v0.16b, v2.16b, v2.16b
+	orr	v1.16b, v3.16b, v3.16b
 	st2	{v0.4s - v1.4s}, [x0]
-	add	sp, sp, 32
 	ret

lower-subreg.c:compute_costs() only cares about the cost of a (set (reg) (const_int )) move but I think the intention, at least for now, is to return extra_cost->vect.alu for all the vector operations.

Almost, what we want at the moment is COSTS_N_INSNS (1) + extra_cost->vect.alu

Regression tested on aarch64-linux-gnu with no new regression. Is this OK for trunk?

Are you sure it's a (set (reg) (const_int)) that's being costed here? I thought for moves into vector registers it would be a (set (reg) (const_vector)) which we don't handle in our rtx costs currently. I think the correct approach would be to extend the aarch64_rtx_costs switch statement to handle the CONST_VECTOR case. I believe you can use aarch64_simd_valid_immediate to check whether x is a valid immediate for a simd instruction and give it a cost of extra_cost->vect.alu. The logic should be similar to the CONST_INT case.

Thanks,
Kyrill

Thanks,
Kugan

gcc/ChangeLog:

2015-03-16  Kugan Vivekanandarajah  kug...@linaro.org
	    Jim Wilson  jim.wil...@linaro.org

	PR target/65375
	* config/aarch64/aarch64.c (aarch64_rtx_costs): Return
	extra_cost->vect.alu for SET.
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

when looking at the m68k I realized the following, which is a general problem... If the alignment of the structure is less than sizeof (field), the strict volatile bitfields code may read beyond the end of the structure! Consider this example:

struct s
{
  char x : 8;
  volatile unsigned int y : 31;
  volatile unsigned int z : 1;
} __attribute__ ((packed));

struct s global;

Here we have sizeof (struct s) == 5 and alignment (global) == 1. However, when we access global.z we read a 32-bit word at offset 4, which touches 3 bytes that are not safe to use. Something like that never happens with -fno-strict-volatile-bitfields, because IIRC, with the only exception of the simple_mem_bitfield_p code path, there is never an access mode used which is larger than MEM_ALIGN (x). In this example, if I want to use the packed attribute, I also have to use the aligned (4) attribute; this satisfies the check MEM_ALIGN (op0) >= modesize, which is IMO always necessary for strict volatile bitfields, not only on STRICT_ALIGNMENT targets. On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD, to use the strict volatile bitfields you have to add __attribute__ ((aligned (4))) to the structure. I had to do that on the pr23623.c test case to make it pass on m68k, for instance. I have attached the updated patch. As explained before, the check MEM_ALIGN (op0) >= modesize should always be done in strict_volatile_bitfield_p. For the targets that usually enable -fstrict-volatile-bitfields, nothing changes, except that when we use packed on the structure, we also need to add an aligned (4) attribute. For m68k, where the natural alignment of any structure is <= 2, we need to force aligned (4) if we want to ensure the access is in SImode. Boot-strapped and reg-tested on x86_64-linux-gnu. OK for trunk?

Thanks
Bernd.

gcc:
2015-03-15  Bernd Edlinger  bernd.edlin...@hotmail.de

	* expmed.c (strict_volatile_bitfield_p): Check that MEM_ALIGN allows
	a MODESIZE access.
	(store_bit_field, extract_bit_field): For !STRICT_ALIGNMENT explicitly
	generate an unaligned access if the field crosses a word boundary.

testsuite:
2015-03-15  Bernd Edlinger  bernd.edlin...@hotmail.de

	* gcc.dg/pr23623.c: Added aligned attribute.
	* gcc.dg/20141029-1.c: Likewise.
	* gcc.dg/20150306-1.c: New test.

patch-volatile-bitfields-1.diff
Description: Binary data
Re: [PATCH][RFA] [PR rtl-optimization/64317] Enhance postreload-gcse.c to eliminate more redundant loads
On 03/16/15 13:27, Jakub Jelinek wrote: On Wed, Mar 11, 2015 at 03:30:36PM -0600, Jeff Law wrote:

+#ifndef GCC_GCSE__COMMONH
+#define GCC_GCSE__COMMONH

GCC_GCSE_COMMON_H instead?

:-) Will fix.

@@ -1308,8 +1396,19 @@ gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED)
   if (expr_table->elements () > 0)
     {
+      /* Knowing which MEMs are transparent through a block can signifiantly
+	 increase the number of reundant loads found.  So compute transparency
+	 information for for each memory expression in the hash table.  */

s/for for/for/ ?

Similarly.

+      df_analyze ();
+      /* This can not be part of the normal allocation routine because
+	 we have to know the number of elements in the hash table.  */
+      transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+				     expr_table->elements ());
+      bitmap_vector_ones (transp, last_basic_block_for_fn (cfun));
+      expr_table->traverse <FILE *, compute_expr_transp> (dump_file);
       eliminate_partially_redundant_loads ();
       delete_redundant_insns ();
+      sbitmap_vector_free (transp);

       if (dump_file)
	 {

What effect does the patch have on compile time on say x86_64 or ppc64?

I'll test both. In the common case, the cost is going to be the basic bookkeeping so that we can compute the transparent property. The actual computation of transparency and everything else is guarded on having something in the hash tables -- and the overwhelming majority of the time there's nothing in the hash tables. Regardless, I'll pin down boxes and do some testing.

I'm slightly leaning towards trying it even in stage4, but if e.g. richi disagrees, we could defer it to stage1 too.

I'd be OK either way. I just want us to make a decision one way or the other :-)

Jeff
Re: [Patch, fortran] PR59198 - [4.8/4.9/5 Regression] ICE on cyclically dependent polymorphic types
Dear Paul, Paul Richard Thomas wrote: The ChangeLog says it all. If the check is not done for components that are not procedure pointers, typebound_operator_9.f03 breaks. I am not entirely sure why this is the case but the fix works fine. Bootstraps and regtests on FC21/x86_64 - OK for 4.8, 4.9 and 5.0? OK. I have to admit that I also do not really understand why that's required. Tobias
Re: RFC: Avoid calling convert_to_mode with invalid rtl.
if (offset)
  {
    machine_mode address_mode;
    rtx offset_rtx = expand_expr (offset, NULL_RTX, VOIDmode, EXPAND_SUM);

    gcc_assert (MEM_P (op0));
    address_mode = get_address_mode (op0);
    if (GET_MODE (offset_rtx) != address_mode)
      offset_rtx = convert_to_mode (address_mode, offset_rtx, 0);

The problem here is that convert_to_mode() may decide to offload offset_rtx into another reg and then extend the reg to address_mode, but it assumes that offset_rtx is a valid rtl expression for the target. Note that the very same code is in expand_assignment, so they probably should be kept in sync. The patch adds a second call to expand_expr(), giving the address mode as the suggested mode, and using a normal expansion, rather than EXPAND_SUM. This might work, but even if it does not, the rtl in offset_rtx will be valid in a non-address context so that whatever convert_to_mode does will still remain valid. I don't think that we want to expand the same expression twice. What about calling force_operand on offset_rtx right before convert_to_mode?

-- Eric Botcazou
Re: [Patch, fortran] PR59198 - [4.8/4.9/5 Regression] ICE on cyclically dependent polymorphic types
Dear Tobias, I think that I have a partial understanding now and will attempt to verify it tonight. Certainly to not build the components, when a derived type is flagged to have proc_pointer components cannot be right just because there can be other components as in the original testcase. This led to the lack of DECL_SIZE information for the field 'term' and the subsequent ICE. The key to understanding is the problem with typebound_operator_9.f03, when the check for proc_pointer components is omitted completely. This implies that there is something missing from the function (sorry, I forget its name) that build proc_pointer fields. I will check that idea - enlightenment might lead to a different patch. Thanks for the review. Paul On 16 March 2015 at 09:07, Tobias Burnus tobias.bur...@physik.fu-berlin.de wrote: Dear Paul, Paul Richard Thomas wrote: The ChangeLog says it all. If the check is not done for components that are not procedure pointers, typebound_operator_9.f03 breaks. I am not entirely sure why this is the case but the fix works fine. Bootstraps and regtests on FC21/x86_64 - OK for 4.8, 4.9 and 5.0? OK. I have to admit that I also do not really understand why that's required. Tobias -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
If we have BIGGEST_ALIGNMENT=16 that means we likely have a 16-bit architecture. I doubt that the strict volatile bitfields code makes any sense for modesize > BIGGEST_ALIGNMENT. Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK. -- Eric Botcazou
[Ada] Fix minor ICE with -gnatct
A short-circuit was added in gigi for E_Abstract_State but it results in an unbalanced handling of the context. Tested on x86_64-suse-linux, applied on the mainline.

2015-03-16  Eric Botcazou  ebotca...@adacore.com

	* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Abstract_State>: Do not
	short-circuit the regular handling.

-- Eric Botcazou

Index: gcc-interface/decl.c
===================================================================
--- gcc-interface/decl.c	(revision 221407)
+++ gcc-interface/decl.c	(working copy)
@@ -4800,9 +4800,11 @@ gnat_to_gnu_entity (Entity_Id gnat_entit

     case E_Abstract_State:
       /* This is a SPARK annotation that only reaches here when compiling in
-	 ASIS mode and has no characteristics to annotate.  */
+	 ASIS mode.  */
       gcc_assert (type_annotate_only);
-      return error_mark_node;
+      gnu_decl = error_mark_node;
+      saved = true;
+      break;

     default:
       gcc_unreachable ();