Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning
On Thu, Dec 26, 2013 at 7:28 AM, Gopalasubramanian, Ganesh ganesh.gopalasubraman...@amd.com wrote:
>> (get_amd_cpu): Handle AMD_BOBCAT, AMD_JAGUAR, AMDFAM15H_BDVER2 and AMDFAM15H_BDVER3.
>
> As mentioned earlier, we would like to stick with BTVER1 and BTVER2 instead of using BOBCAT or JAGUAR.
> Attached patch does the changes.

OK. I'm sorry I didn't notice previous conversation. Please install ASAP.

Thanks,
Uros.
RE: [Patch, i386] PR 59422 - Support more targets for function multi versioning
> I'm sorry I didn't notice previous conversation. Please install ASAP.

Thanks Uros! Committed to revision 206210.

- Ganesh
Re: [PATCH i386 4/8] [AVX512] [7/8] Add substed patterns: `round for expand' subst.
Hello Uros,

On 23 Dec 17:46, Uros Bizjak wrote:
> This round_expand_predicate is the predicate substitution I was referring to in the review of 5/8. Please use it also in insn patterns, perhaps renamed as round_predicate

This is a drawback of substs: each subst attribute is bound strictly to one subst. So this one:

+(define_subst_attr "round_expand_predicate" "round_expand" "nonimmediate_operand" "register_operand")

is bound to the round_expand subst (the second argument of the definition) and to it only. That is why the name is round_expand...: it reflects the subst it relates to. For the remaining substs I'll introduce dedicated attributes.

--
Thanks, K
Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning
On Thursday 26 December 2013, Gopalasubramanian, Ganesh wrote:
> Hi,
>
>> (get_amd_cpu): Handle AMD_BOBCAT, AMD_JAGUAR, AMDFAM15H_BDVER2 and AMDFAM15H_BDVER3.
>
> As mentioned earlier, we would like to stick with BTVER1 and BTVER2 instead of using BOBCAT or JAGUAR.
> Attached patch does the changes.

Sorry for missing your comment. Thanks for fixing it. Renaming the comments with the AMD family names might be overdoing it though.

`Allan
Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning
> Hi Honza,
>
> We have combined generic32 and generic64 into generic. There is no need to check generic anymore. Also we shouldn't change -mtune=i686 into -mtune=generic. OK to install?

The i686-generic change was intended to get generic optimized code for i686-linux configuration rather than pentiumpro. I think it still makes sense to use this, since it is what most 32bit distros still configure for?

Honza
Re: [PATCH i386 4/8] [AVX512] [5/8] Add substed patterns: rounding subst.
Hello,

On 23 Dec 17:26, Uros Bizjak wrote:
> On Mon, Dec 23, 2013 at 5:11 PM, Uros Bizjak ubiz...@gmail.com wrote:
>> So, OK for mainline, but I would kindly ask you to please wait a couple of days for possible Richard's comments
>
> When substituting constraints, please also substitute corresponding operand predicate:
>   nonimmediate_operand -> register_operand in 1st and 3rd case
>   memory_operand -> register_operand in 2nd case.

Thanks! I've introduced a new subst attribute:

+(define_subst_attr "round_nimm_predicate" "round" "nonimmediate_operand" "register_operand")

whose name reflects:
  1. its affiliation to the `round' subst (`round')
  2. the predicate it is intended to affect (`nimm_predicate')

TESTING
  1. Bootstrap passes.
  2. make check shows no regressions.
  3. Spec 2000 & 2006 builds show no regressions both with and without the -mavx512f option.
  4. Spec 2000 & 2006 runs show no regressions without the -mavx512f option.

If there are no more inputs, I'll check it in after 24 hrs from now.

--
Thanks, K

---
 gcc/config/i386/i386.c   |  32
 gcc/config/i386/i386.md  |  10 +
 gcc/config/i386/sse.md   | 480 ---
 gcc/config/i386/subst.md |  42 +
 4 files changed, 331 insertions(+), 233 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ecf5e0b..a3dd307 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -15041,6 +15041,38 @@ ix86_print_operand (FILE *file, rtx x, int code)
           fputs ("{z}", file);
           return;
 
+        case 'R':
+          gcc_assert (CONST_INT_P (x));
+
+          if (ASSEMBLER_DIALECT == ASM_INTEL)
+            fputs (", ", file);
+
+          switch (INTVAL (x))
+            {
+            case ROUND_NEAREST_INT:
+              fputs ("{rn-sae}", file);
+              break;
+            case ROUND_NEG_INF:
+              fputs ("{rd-sae}", file);
+              break;
+            case ROUND_POS_INF:
+              fputs ("{ru-sae}", file);
+              break;
+            case ROUND_ZERO:
+              fputs ("{rz-sae}", file);
+              break;
+            case ROUND_SAE:
+              fputs ("{sae}", file);
+              break;
+            default:
+              gcc_unreachable ();
+            }
+
+          if (ASSEMBLER_DIALECT == ASM_ATT)
+            fputs (", ", file);
+
+          return;
+
         case '*':
           if (ASSEMBLER_DIALECT == ASM_ATT)
             putc ('*', file);
diff --git
a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ab5b33f..30b8d74 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -241,6 +241,16 @@
    (ROUND_NO_EXC 0x8)
   ])
 
+;; Constants to represent AVX512F embeded rounding
+(define_constants
+  [(ROUND_NEAREST_INT 0)
+   (ROUND_NEG_INF 1)
+   (ROUND_POS_INF 2)
+   (ROUND_ZERO 3)
+   (NO_ROUND 4)
+   (ROUND_SAE 5)
+  ])
+
 ;; Constants to represent pcomtrue/pcomfalse variants
 (define_constants
   [(PCOM_FALSE 0)
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index adedf44..119d0b0 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1229,23 +1229,23 @@
 }
   [(set_attr "isa" "noavx,noavx,avx,avx")])
 
-(define_expand "<plusminus_insn><mode>3<mask_name>"
+(define_expand "<plusminus_insn><mode>3<mask_name><round_name>"
   [(set (match_operand:VF 0 "register_operand")
         (plusminus:VF
-          (match_operand:VF 1 "nonimmediate_operand")
-          (match_operand:VF 2 "nonimmediate_operand")))]
-  "TARGET_SSE && <mask_mode512bit_condition>"
+          (match_operand:VF 1 "<round_nimm_predicate>")
+          (match_operand:VF 2 "<round_nimm_predicate>")))]
+  "TARGET_SSE && <mask_mode512bit_condition> && <round_mode512bit_condition>"
   "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
 
-(define_insn "*<plusminus_insn><mode>3<mask_name>"
+(define_insn "*<plusminus_insn><mode>3<mask_name><round_name>"
   [(set (match_operand:VF 0 "register_operand" "=x,v")
         (plusminus:VF
-          (match_operand:VF 1 "nonimmediate_operand" "<comm>0,v")
-          (match_operand:VF 2 "nonimmediate_operand" "xm,vm")))]
-  "TARGET_SSE && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands) && <mask_mode512bit_condition>"
+          (match_operand:VF 1 "<round_nimm_predicate>" "<comm>0,v")
+          (match_operand:VF 2 "<round_nimm_predicate>" "xm,<round_constraint>")))]
+  "TARGET_SSE && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands) && <mask_mode512bit_condition> && <round_mode512bit_condition>"
   "@
    <plusminus_mnemonic><ssemodesuffix>\t{%2, %0|%0, %2}
-   v<plusminus_mnemonic><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
+   v<plusminus_mnemonic><ssemodesuffix>\t{<round_mask_op3>%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2<round_mask_op3>}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sseadd")
    (set_attr "prefix" "<mask_prefix3>")
@@ -1268,23 +1268,23 @@
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "<ssescalarmode>")])
 
-(define_expand "mul<mode>3<mask_name>"
+(define_expand
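For readers following the new 'R' operand code in the i386.c hunk above, here is a minimal stand-alone C sketch of the mapping it performs. The enum values mirror the define_constants added to i386.md; the function names rounding_suffix and format_rounding_operand, and the intel_syntax flag, are made up for this illustration and are not GCC functions.

```c
#include <stdio.h>
#include <stddef.h>

/* Values mirror the define_constants added to i386.md by the patch.  */
enum rounding_mode
{
  ROUND_NEAREST_INT = 0,
  ROUND_NEG_INF = 1,
  ROUND_POS_INF = 2,
  ROUND_ZERO = 3,
  NO_ROUND = 4,
  ROUND_SAE = 5
};

/* Hypothetical helper (not part of GCC): return the embedded-rounding
   suffix for MODE, or NULL for values where the patch would reach
   gcc_unreachable ().  */
const char *
rounding_suffix (int mode)
{
  switch (mode)
    {
    case ROUND_NEAREST_INT: return "{rn-sae}";
    case ROUND_NEG_INF:     return "{rd-sae}";
    case ROUND_POS_INF:     return "{ru-sae}";
    case ROUND_ZERO:        return "{rz-sae}";
    case ROUND_SAE:         return "{sae}";
    default:                return NULL;
    }
}

/* Hypothetical helper: write the operand text into BUF.  With Intel
   syntax the ", " separator comes before the suffix, with AT&T syntax
   it comes after, following the order of the fputs calls in the patch.
   Returns 0 on success, -1 on an unknown mode or a too-small buffer.  */
int
format_rounding_operand (char *buf, size_t size, int mode, int intel_syntax)
{
  const char *suffix = rounding_suffix (mode);
  int n;

  if (suffix == NULL)
    return -1;

  n = intel_syntax ? snprintf (buf, size, ", %s", suffix)
                   : snprintf (buf, size, "%s, ", suffix);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

The separator placement is the only dialect-dependent part, which is why the patch brackets the switch with two ASSEMBLER_DIALECT checks instead of duplicating the cases.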
PATCH: PR target/59601: [4.9 Regression] __attribute__ ((target(arch=corei7))) won't match Westmere processor
Hi,

After my Intel processor name cleanup, __attribute__ ((target("arch=corei7"))) is translated to PROCESSOR_NEHALEM mapped to M_INTEL_COREI7_NEHALEM. We used to have __attribute__ ((target("arch=corei7"))) to cover M_INTEL_COREI7_. Now it only covers M_INTEL_COREI7_NEHALEM. We have PROCESSOR_SANDYBRIDGE and PROCESSOR_HASWELL. But there is nothing to mark Westmere and Ivy Bridge. Since function versioning doesn't support extra ISAs in Westmere and Ivy Bridge, we don't lose anything. The solution is to map __attribute__ ((target("arch=corei7"))) and __attribute__ ((target("arch=nehalem"))) to M_INTEL_COREI7.

I tested mv14.C and mv15.C on Nehalem, Westmere, Sandy Bridge and Ivy Bridge. OK to install?

Thanks.

H.J.

gcc/

2013-12-26  H.J. Lu  hongjiu...@intel.com

        PR target/59601
        * config/i386/i386.c (get_builtin_code_for_version): Map
        PROCESSOR_NEHALEM to corei7.

gcc/testsuite/

2013-12-26  Uros Bizjak  ubiz...@gmail.com
            H.J. Lu  hongjiu...@intel.com

        PR target/59601
        * g++.dg/ext/mv14.C: New tests.
        * g++.dg/ext/mv15.C: Likewise.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 37bb656..e3d693a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -30010,7 +30010,10 @@ get_builtin_code_for_version (tree decl, tree *predicate_list)
               priority = P_PROC_SSSE3;
               break;
             case PROCESSOR_NEHALEM:
-              arg_str = "nehalem";
+              /* We translate arch=corei7 and arch=nehelam to
+                 corei7 so that it will be mapped to M_INTEL_COREI7
+                 as cpu type to cover all M_INTEL_COREI7_XXXs.  */
+              arg_str = "corei7";
               priority = P_PROC_SSE4_2;
               break;
             case PROCESSOR_SANDYBRIDGE:
diff --git a/gcc/testsuite/g++.dg/ext/mv14.C b/gcc/testsuite/g++.dg/ext/mv14.C
new file mode 100644
index 000..e36e08d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/mv14.C
@@ -0,0 +1,40 @@
+/* Test case to check if Multiversioning works.  */
+/* { dg-do run { target i?86-*-* x86_64-*-* } } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -fPIC" } */
+
+#include <assert.h>
+
+/* Default version.  */
+int foo (); // Extra declaration that is merged with the second one.
+int foo () __attribute__ ((target("default")));
+
+int foo () __attribute__ ((target("arch=corei7")));
+
+int (*p)() = foo;
+int main ()
+{
+  int val = foo ();
+  assert (val == (*p)());
+
+  /* Check in the exact same order in which the dispatching
+     is expected to happen.  */
+  if (__builtin_cpu_is ("corei7"))
+    assert (val == 5);
+  else
+    assert (val == 0);
+
+  return 0;
+}
+
+int __attribute__ ((target("default")))
+foo ()
+{
+  return 0;
+}
+
+int __attribute__ ((target("arch=corei7")))
+foo ()
+{
+  return 5;
+}
diff --git a/gcc/testsuite/g++.dg/ext/mv15.C b/gcc/testsuite/g++.dg/ext/mv15.C
new file mode 100644
index 000..42e39d2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/mv15.C
@@ -0,0 +1,40 @@
+/* Test case to check if Multiversioning works.  */
+/* { dg-do run { target i?86-*-* x86_64-*-* } } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -fPIC" } */
+
+#include <assert.h>
+
+/* Default version.  */
+int foo (); // Extra declaration that is merged with the second one.
+int foo () __attribute__ ((target("default")));
+
+int foo () __attribute__ ((target("arch=nehalem")));
+
+int (*p)() = foo;
+int main ()
+{
+  int val = foo ();
+  assert (val == (*p)());
+
+  /* Check in the exact same order in which the dispatching
+     is expected to happen.  */
+  if (__builtin_cpu_is ("corei7"))
+    assert (val == 5);
+  else
+    assert (val == 0);
+
+  return 0;
+}
+
+int __attribute__ ((target("default")))
+foo ()
+{
+  return 0;
+}
+
+int __attribute__ ((target("arch=nehalem")))
+foo ()
+{
+  return 5;
+}
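The regression described in this message comes down to matching on a CPU subtype where a CPU type was intended. A minimal sketch of that distinction in plain C follows; the enum and function names are hypothetical stand-ins chosen for this example (the real M_INTEL_COREI7* values live in GCC's CPU detection code), so treat it as a model of the idea rather than the actual dispatcher.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of the CPU type / subtype split.  */
enum cpu_type { CPU_OTHER, CPU_COREI7 };
enum cpu_subtype
{
  SUB_NONE,
  SUB_NEHALEM,
  SUB_WESTMERE,
  SUB_SANDYBRIDGE
};

/* Behaviour described as the regression: arch=corei7 ends up being
   checked against the Nehalem subtype only.  */
bool
matches_nehalem_subtype (enum cpu_type type, enum cpu_subtype subtype)
{
  (void) type;  /* The type is not consulted at all in this variant.  */
  return subtype == SUB_NEHALEM;
}

/* Behaviour after the fix: arch=corei7 and arch=nehalem are checked
   against the Core i7 type, which covers every subtype.  */
bool
matches_corei7_type (enum cpu_type type, enum cpu_subtype subtype)
{
  (void) subtype;  /* Any Core i7 subtype is accepted.  */
  return type == CPU_COREI7;
}
```

With a Westmere-like input (type Core i7, subtype Westmere) the first predicate is false and the second is true, which is the difference the two new tests are meant to catch when run on such a machine.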
Re: [RFC][gomp4] Offloading: Add device initialization and host-target function mapping
Ping. (Patch is slightly updated)

On 20 Dec 21:18, Ilya Verbin wrote:
> Hi Jakub,
>
> Could you please take a look at this patch for libgomp?
>
> It adds new function GOMP_register_lib, that should be called from every exec/lib with target regions (that was done in patch [1]). This function maintains the array of pointers to the target shared library descriptors.
>
> Also this patch adds target device initialization into GOMP_target and GOMP_target_data. At first, it calls device_init function from the plugin. This function takes array of target-images as input, and returns the array of target-side addresses. Currently, it always uses the first target-image from the descriptor, this should be generalized later. Then libgomp reads the tables from host-side exec/libs. After that, it inserts host-target address mapping into the splay tree.
>
> [1] http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01486.html
>
> Thanks,
>   -- Ilya

  -- Ilya

---
 libgomp/libgomp.map |   1 +
 libgomp/target.c    | 154 ---
 2 files changed, 146 insertions(+), 9 deletions(-)

diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2b64d05..792047f 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -208,6 +208,7 @@ GOMP_3.0 {
 
 GOMP_4.0 {
   global:
+        GOMP_register_lib;
         GOMP_barrier_cancel;
         GOMP_cancel;
         GOMP_cancellation_point;
diff --git a/libgomp/target.c b/libgomp/target.c
index d84a1fa..7677c28 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -84,6 +84,19 @@ struct splay_tree_key_s {
   bool copy_from;
 };
 
+enum library_descr {
+  DESCR_TABLE_START,
+  DESCR_TABLE_END,
+  DESCR_IMAGE_START,
+  DESCR_IMAGE_END
+};
+
+/* Array of pointers to target shared library descriptors.  */
+static void **libraries;
+
+/* Total number of target shared libraries.  */
+static int num_libraries;
+
 /* Array of descriptors of all available devices.  */
 static struct gomp_device_descr *devices;
 
@@ -117,11 +130,16 @@ struct gomp_device_descr
      TARGET construct.  */
   int id;
 
+  /* Set to true when device is initialized.  */
+  bool is_initialized;
+
   /* Plugin file handler.  */
   void *plugin_handle;
 
   /* Function handlers.  */
-  bool (*device_available_func) (void);
+  bool (*device_available_func) (int);
+  void (*device_init_func) (void **, int *, int, void ***, int *);
+  void (*device_run_func) (void *, uintptr_t);
 
   /* Splay tree containing information about mapped memory regions.  */
   struct splay_tree_s dev_splay_tree;
@@ -466,6 +484,89 @@ gomp_update (struct gomp_device_descr *devicep, size_t mapnum,
   gomp_mutex_unlock (&devicep->dev_env_lock);
 }
 
+void
+GOMP_register_lib (const void *openmp_target)
+{
+  libraries = realloc (libraries, (num_libraries + 1) * sizeof (void *));
+
+  if (libraries == NULL)
+    return;
+
+  libraries[num_libraries] = (void *) openmp_target;
+
+  num_libraries++;
+}
+
+static void
+gomp_init_device (struct gomp_device_descr *devicep)
+{
+  void **target_images = malloc (num_libraries * sizeof (void *));
+  int *target_img_sizes = malloc (num_libraries * sizeof (int));
+  if (target_images == NULL || target_img_sizes == NULL)
+    gomp_fatal ("Can not allocate memory");
+
+  /* Collect target images from the library descriptors and calculate the total
+     size of host address table.  */
+  int i, host_table_size = 0;
+  for (i = 0; i < num_libraries; i++)
+    {
+      void **lib = libraries[i];
+      void **host_table_start = lib[DESCR_TABLE_START];
+      void **host_table_end = lib[DESCR_TABLE_END];
+      /* FIXME: Select the proper target image.  */
+      target_images[i] = lib[DESCR_IMAGE_START];
+      target_img_sizes[i] = lib[DESCR_IMAGE_END] - lib[DESCR_IMAGE_START];
+      host_table_size += host_table_end - host_table_start;
+    }
+
+  /* Initialize the target device and receive the address table from target.  */
+  void **target_table = NULL;
+  int target_table_size = 0;
+  devicep->device_init_func (target_images, target_img_sizes, num_libraries,
+                             &target_table, &target_table_size);
+  free (target_images);
+  free (target_img_sizes);
+
+  if (host_table_size != target_table_size)
+    gomp_fatal ("Can't map target objects");
+
+  /* Initialize the mapping data structure.  */
+  void **target_entry = target_table;
+  for (i = 0; i < num_libraries; i++)
+    {
+      void **lib = libraries[i];
+      void **host_table_start = lib[DESCR_TABLE_START];
+      void **host_table_end = lib[DESCR_TABLE_END];
+      void **host_entry;
+      for (host_entry = host_table_start; host_entry < host_table_end;
+           host_entry += 2, target_entry += 2)
+        {
+          struct target_mem_desc *tgt = gomp_malloc (sizeof (*tgt));
+          tgt->refcount = 1;
+          tgt->array = gomp_malloc (sizeof (*tgt->array));
+          tgt->tgt_start = (uintptr_t) *target_entry;
+          tgt->tgt_end = tgt->tgt_start + *((uint64_t *) target_entry + 1);
+
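The registration step in GOMP_register_lib is a grow-by-one array append. A minimal sketch of that pattern in stand-alone C is given below, under the assumption that a caller wants to learn about allocation failure; the names register_lib, registered_count and registered_lib are hypothetical and not part of libgomp. One detail worth noting for review: assigning the result of realloc directly to the array pointer, as the patch does, drops the old array if realloc returns NULL, so the sketch goes through a temporary.

```c
#include <stdlib.h>

/* Array of registered descriptors and its length, as in the patch.  */
static void **libraries;
static int num_libraries;

/* Hypothetical variant of the registration function: append DESCR to
   the array.  Returns 0 on success and -1 if memory is exhausted, in
   which case the previously registered entries are left untouched.  */
int
register_lib (const void *descr)
{
  void **grown = realloc (libraries,
                          ((size_t) num_libraries + 1) * sizeof (void *));
  if (grown == NULL)
    return -1;

  libraries = grown;
  libraries[num_libraries++] = (void *) descr;
  return 0;
}

/* Number of descriptors registered so far.  */
int
registered_count (void)
{
  return num_libraries;
}

/* Descriptor at index I, or NULL if I is out of range.  */
const void *
registered_lib (int i)
{
  return (i >= 0 && i < num_libraries) ? libraries[i] : NULL;
}
```

Whether libgomp should report the failure, abort via gomp_fatal, or silently skip the library is a design decision for the patch author; the sketch only shows the pointer handling.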
Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning
On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
>> Hi Honza,
>>
>> We have combined generic32 and generic64 into generic. There is no need to check generic anymore. Also we shouldn't change -mtune=i686 into -mtune=generic. OK to install?
>
> The i686-generic change was intended to get generic optimized code for i686-linux configuration rather than pentiumpro. I think it still makes sense to use this, since it is what most 32bit distros still configure for?

Should -mtune=i686 define __tune_i686__? If not, how can it be defined? Don't we default -mtune to generic for i686-linux?

--
H.J.
Re: PATCH: PR target/59601: [4.9 Regression] __attribute__ ((target(arch=corei7))) won't match Westmere processor
On Thu, Dec 26, 2013 at 2:25 PM, H.J. Lu hongjiu...@intel.com wrote:
> After my Intel processor name cleanup, __attribute__ ((target("arch=corei7"))) is translated to PROCESSOR_NEHALEM mapped to M_INTEL_COREI7_NEHALEM. We used to have __attribute__ ((target("arch=corei7"))) to cover M_INTEL_COREI7_. Now it only covers M_INTEL_COREI7_NEHALEM. We have PROCESSOR_SANDYBRIDGE and PROCESSOR_HASWELL. But there is nothing to mark Westmere and Ivy Bridge. Since function versioning doesn't support extra ISAs in Westmere and Ivy Bridge, we don't lose anything. The solution is to map __attribute__ ((target("arch=corei7"))) and __attribute__ ((target("arch=nehalem"))) to M_INTEL_COREI7.
>
> I tested mv14.C and mv15.C on Nehalem, Westmere, Sandy Bridge and Ivy Bridge. OK to install?
>
> gcc/
>
> 2013-12-26  H.J. Lu  hongjiu...@intel.com
>
>         PR target/59601
>         * config/i386/i386.c (get_builtin_code_for_version): Map
>         PROCESSOR_NEHALEM to corei7.
>
> gcc/testsuite/
>
> 2013-12-26  Uros Bizjak  ubiz...@gmail.com
>             H.J. Lu  hongjiu...@intel.com
>
>         PR target/59601
>         * g++.dg/ext/mv14.C: New tests.
>         * g++.dg/ext/mv15.C: Likewise.

OK.

Thanks,
Uros.
[PATCH, i386]: Use vendor signatures from cpuid.h in cpuinfo.c
Hello!

Use the same definitions from common header.

2013-12-26  Uros Bizjak  ubiz...@gmail.com

        * config/i386/cpuinfo.c (enum vendor_signatures): Remove.
        (__cpu_indicator_init): Use signature_INTEL_ebx and
        signature_AMD_ebx from cpuid.h to check vendor signatures.

No functional changes.

Bootstrapped on x86_64-pc-linux-gnu and committed to mainline SVN.

Uros.

Index: config/i386/cpuinfo.c
===================================================================
--- config/i386/cpuinfo.c (revision 206210)
+++ config/i386/cpuinfo.c (working copy)
@@ -36,12 +36,6 @@ see the files COPYING3 and COPYING.RUNTIME respect
 int __cpu_indicator_init (void)
   __attribute__ ((constructor CONSTRUCTOR_PRIORITY));
 
-enum vendor_signatures
-{
-  SIG_INTEL = 0x756e6547 /* Genu */,
-  SIG_AMD = 0x68747541 /* Auth */
-};
-
 /* Processor Vendor and Models. */
 
 enum processor_vendor
@@ -368,7 +362,7 @@ __cpu_indicator_init (void)
   extended_model = (eax >> 12) & 0xf0;
   extended_family = (eax >> 20) & 0xff;
 
-  if (vendor == SIG_INTEL)
+  if (vendor == signature_INTEL_ebx)
     {
       /* Adjust model and family for Intel CPUS. */
       if (family == 0x0f)
@@ -385,7 +379,7 @@ __cpu_indicator_init (void)
       get_available_features (ecx, edx, max_level);
       __cpu_model.__cpu_vendor = VENDOR_INTEL;
     }
-  else if (vendor == SIG_AMD)
+  else if (vendor == signature_AMD_ebx)
     {
       /* Adjust model and family for AMD CPUS. */
       if (family == 0x0f)
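As background for the two constants removed above, the comments /* Genu */ and /* Auth */ refer to the first four characters of the vendor strings "GenuineIntel" and "AuthenticAMD", which CPUID leaf 0 returns in EBX with the first character in the lowest byte. The following portable C sketch reproduces the two values by packing those characters by hand; pack_signature is a name invented for this example, and the code assumes an ASCII execution character set.

```c
#include <stdint.h>

/* Hypothetical helper: pack the first four characters of S into a
   32-bit value with the first character in the least significant
   byte, which is how CPUID leaf 0 lays out the vendor string in EBX.  */
uint32_t
pack_signature (const char *s)
{
  return (uint32_t) (unsigned char) s[0]
         | (uint32_t) (unsigned char) s[1] << 8
         | (uint32_t) (unsigned char) s[2] << 16
         | (uint32_t) (unsigned char) s[3] << 24;
}
```

Packing "GenuineIntel" this way gives 0x756e6547 and "AuthenticAMD" gives 0x68747541, matching the removed SIG_INTEL and SIG_AMD enumerators.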
Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning
> On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
>>> Hi Honza,
>>>
>>> We have combined generic32 and generic64 into generic. There is no need to check generic anymore. Also we shouldn't change -mtune=i686 into -mtune=generic. OK to install?
>>
>> The i686-generic change was intended to get generic optimized code for i686-linux configuration rather than pentiumpro. I think it still makes sense to use this, since it is what most 32bit distros still configure for?
>
> Should -mtune=i686 define __tune_i686__? If not, how can it be defined? Don't we default -mtune to generic for i686-linux?

If i686-linux defaults to -mtune=generic, then I think it is all fine. i686 is bit misbehaved since it was used as both CPU name for PPro (that does not make much sense) and for the overall architecture...

Honza

> --
> H.J.
Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning
On Thu, Dec 26, 2013 at 7:45 AM, Jan Hubicka hubi...@ucw.cz wrote:
>> On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
>>>> Hi Honza,
>>>>
>>>> We have combined generic32 and generic64 into generic. There is no need to check generic anymore. Also we shouldn't change -mtune=i686 into -mtune=generic. OK to install?
>>>
>>> The i686-generic change was intended to get generic optimized code for i686-linux configuration rather than pentiumpro. I think it still makes sense to use this, since it is what most 32bit distros still configure for?
>>
>> Should -mtune=i686 define __tune_i686__? If not, how can it be defined? Don't we default -mtune to generic for i686-linux?
>
> If i686-linux defaults to -mtune=generic, then I think it is all fine.

We have defaulted

[hjl@gnu-6 gcc]$ ./xgcc -B./ -S /tmp/x.i -v
Reading specs from ./specs
COLLECT_GCC=./xgcc
Target: i686-linux
Configured with: /export/gnu/import/git/gcc/configure --enable-languages=c,c++ --disable-bootstrap i686-linux --prefix=/usr/gcc-4.9.0 --with-local-prefix=/usr/local --enable-targets=all --with-fpmath=sse : (reconfigured) /export/gnu/import/git/gcc/configure --enable-languages=c,c++ --disable-bootstrap i686-linux --prefix=/usr/gcc-4.9.0 --with-local-prefix=/usr/local --enable-targets=all --with-fpmath=sse
Thread model: posix
gcc version 4.9.0 20131224 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-B' './' '-S' '-v' '-mtune=generic' '-march=pentium4'
 ./cc1 -fpreprocessed /tmp/x.i -quiet -dumpbase x.i -mtune=generic -march=pentium4 -auxbase x -version -o x.s
GNU C (GCC) version 4.9.0 20131224 (experimental) (i686-linux)
        compiled by GNU C version 4.8.2 20131212 (Red Hat 4.8.2-7), GMP version 5.1.1, MPFR version 3.1.1, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C (GCC) version 4.9.0 20131224 (experimental) (i686-linux)
        compiled by GNU C version 4.8.2 20131212 (Red Hat 4.8.2-7), GMP version 5.1.1, MPFR version 3.1.1, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 8d0a04c49875a54ef44488e5406c52dd
COMPILER_PATH=./
LIBRARY_PATH=./:/lib/../lib/:/usr/lib/../lib/:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-B' './' '-S' '-v' '-mtune=generic' '-march=pentium4'
[hjl@gnu-6 gcc]$

I will check in my patch.

> i686 is bit misbehaved since it was used as both CPU name for PPro (that does not make much sense) and for the overall architecture...

Thanks.

--
H.J.
Re: PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index
On Wed, Dec 25, 2013 at 12:49 PM, Uros Bizjak ubiz...@gmail.com wrote:
>> TARGET_CPU_DEFAULT is left over for 32-bit target before --with-arch= and --with-cpu= were added. Today, -mtune=xxx -march=xxx are always passed to cc1 by GCC driver. If cc1 is run by hand and -mtune=xxx -march=xxx aren't passed to cc1, we should do
>>
>> 1. For 64-bit, it should be the same as -mtune=generic -march=x86_64 are passed.
>> 2. For 32-bit, it should be the same as -mtune=cpu -march=cpu are passed, where cpu is the target cpu used to configure GCC, like i386 in i386-linux, i486 in i486-linux, But there is no i786 cpu. i786 is treated as i686. If SUBTARGET32_DEFAULT_CPU is defined, it should be the same -mtune=SUBTARGET32_DEFAULT_CPU -march=SUBTARGET32_DEFAULT_CPU.
>>
>> Here is the patch to implement this.
>
> Let's do one step at a time. So, let's split the patch back to target/59587 fix:

I am not formally submitting the patch to define target_cpu_default for i[34567]86 targets:

http://gcc.gnu.org/git/?p=gcc.git;a=patch;h=c5d2157c8c9181286441317cf55570d8e33741c2

since it has no impact on x86-64 nor when GCC driver is used. It only changes the default arch/tune when cc1/cc1plus is run by hand, which is very unusual. I will leave the patch on hjl/arch branch just in case someone is interested.

Thanks.

--
H.J.
Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning
On Thu, Dec 26, 2013 at 8:06 AM, H.J. Lu hjl.to...@gmail.com wrote:
> On Thu, Dec 26, 2013 at 7:45 AM, Jan Hubicka hubi...@ucw.cz wrote:
>>> On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
>>>>> Hi Honza,
>>>>>
>>>>> We have combined generic32 and generic64 into generic. There is no need to check generic anymore. Also we shouldn't change -mtune=i686 into -mtune=generic. OK to install?
>>>>
>>>> The i686-generic change was intended to get generic optimized code for i686-linux configuration rather than pentiumpro. I think it still makes sense to use this, since it is what most 32bit distros still configure for?
>>>
>>> Should -mtune=i686 define __tune_i686__? If not, how can it be defined? Don't we default -mtune to generic for i686-linux?
>>
>> If i686-linux defaults to -mtune=generic, then I think it is all fine.
>
> ...
>
> I will check in my patch.

My patch exposes a testsuite bug:

spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -mtune=i686 -ffat-lto-objects -ffat-lto-objects -S -o andor-2.s^M
/export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c:1:0: error: CPU you selected does not support x86-64 instruction set^M
compiler exited with status 1
output is:
/export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c:1:0: error: CPU you selected does not support x86-64 instruction set^M

FAIL: gcc.target/i386/andor-2.c (test for excess errors)

We used to silently turn -mtune=i686 into -mtune=generic. Now we don't. It is wrong to accept -mtune=i686 when compiling for x86-64. I am checking in this patch as an obvious fix.

Thanks.

--
H.J.
--
diff --git a/gcc/testsuite/gcc.target/i386/andor-2.c b/gcc/testsuite/gcc.target/i386/andor-2.c
index 88118aa..eacc7b1 100644
--- a/gcc/testsuite/gcc.target/i386/andor-2.c
+++ b/gcc/testsuite/gcc.target/i386/andor-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune=i686" } */
+/* { dg-options "-O2 -mtune=generic" } */
 
 int h(int x, int y)
 {
Re: New prologue/epilogue code for i386 string functions
On Tue, Oct 22, 2013 at 8:58 AM, Jan Hubicka hubi...@ucw.cz wrote:
> Hi,
> this patch adds code to produce prologues/epilogues as suggested by Ondrej Bilka (I described the approach in more detail in http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02082.html)
>
> This patch is an updated and cleaned-up version after Mikhail's changes merging the memset/memcpy generation code. (I will continue with some incremental cleanups for the code duplication we ended up with).
>
> For now I don't have value range code in, but all logic is in place once http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02011.html gets reviewed.
>
> Bootstrapped/regtested x86_64-linux also with -minline-all-stringops and tested on SPEC2k6. I will commit it later today after more testing.
>
> Honza
>
>         * i386.h (TARGET_MISALIGNED_MOVE_STRING_PROLOGUES_EPILOGUES): New tuning flag.
>         * x86-tune.def (TARGET_MISALIGNED_MOVE_STRING_PROLOGUES): Define it.
>         * i386.c (expand_small_movmem_or_setmem): New function.
>         (expand_set_or_movmem_prologue_epilogue_by_misaligned_moves): New function
>         (alg_usable_p): Add support for value ranges; cleanup.
>         (ix86_expand_set_or_movmem): Add support for misaligned moves.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59605

--
H.J.
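The patch body is not quoted in this message, so the following is only an illustration of the general idea of covering a small block with two possibly overlapping misaligned moves instead of a byte loop. The function name copy_8_to_16 and the 8 to 16 byte range are assumptions chosen for the example; they are not taken from i386.c, and the real expander works on RTL rather than on C memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration: copy N bytes, 8 <= N <= 16, with exactly
   two 8-byte moves.  The first covers bytes [0, 8) and the second
   covers bytes [N - 8, N); for N < 16 the two ranges overlap, which is
   harmless because both write the same source data.  Returns DST, or
   NULL if N is outside the supported range.  */
void *
copy_8_to_16 (void *dst, const void *src, size_t n)
{
  uint64_t head, tail;

  if (n < 8 || n > 16)
    return NULL;

  /* Load both pieces before storing anything.  The fixed-size memcpy
     calls stand in for single unaligned 8-byte load/store insns.  */
  memcpy (&head, src, 8);
  memcpy (&tail, (const char *) src + n - 8, 8);
  memcpy (dst, &head, 8);
  memcpy ((char *) dst + n - 8, &tail, 8);
  return dst;
}
```

The attraction of this shape is that the length only affects an address computation, so no branch per byte or per size class is needed inside the covered range.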
Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning
On Thu, Dec 26, 2013 at 11:11 AM, H.J. Lu hjl.to...@gmail.com wrote:
> On Thu, Dec 26, 2013 at 8:06 AM, H.J. Lu hjl.to...@gmail.com wrote:
>> On Thu, Dec 26, 2013 at 7:45 AM, Jan Hubicka hubi...@ucw.cz wrote:
>>>> On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
>>>>>> Hi Honza,
>>>>>>
>>>>>> We have combined generic32 and generic64 into generic. There is no need to check generic anymore. Also we shouldn't change -mtune=i686 into -mtune=generic. OK to install?
>>>>>
>>>>> The i686-generic change was intended to get generic optimized code for i686-linux configuration rather than pentiumpro. I think it still makes sense to use this, since it is what most 32bit distros still configure for?
>>>>
>>>> Should -mtune=i686 define __tune_i686__? If not, how can it be defined? Don't we default -mtune to generic for i686-linux?
>>>
>>> If i686-linux defaults to -mtune=generic, then I think it is all fine.
>>
>> ...
>>
>> I will check in my patch.
>
> My patch exposes a testsuite bug:
>
> spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -mtune=i686 -ffat-lto-objects -ffat-lto-objects -S -o andor-2.s^M
> /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c:1:0: error: CPU you selected does not support x86-64 instruction set^M
> compiler exited with status 1
> output is:
> /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c:1:0: error: CPU you selected does not support x86-64 instruction set^M
>
> FAIL: gcc.target/i386/andor-2.c (test for excess errors)
>
> We used to silently turn -mtune=i686 into -mtune=generic. Now we don't. It is wrong to accept -mtune=i686 when compiling for x86-64. I am checking in this patch as an obvious fix.
>
> Thanks.
>
> --
> H.J.
> --
> diff --git a/gcc/testsuite/gcc.target/i386/andor-2.c b/gcc/testsuite/gcc.target/i386/andor-2.c
> index 88118aa..eacc7b1 100644
> --- a/gcc/testsuite/gcc.target/i386/andor-2.c
> +++ b/gcc/testsuite/gcc.target/i386/andor-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -mtune=i686" } */
> +/* { dg-options "-O2 -mtune=generic" } */
>
>  int h(int x, int y)
>  {

Another one happens with -mx32. I checked in this patch to fix it.

--
H.J.
---
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 98d22b3e..ad98f63 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,11 @@
 2013-12-26  H.J. Lu  hongjiu...@intel.com
 
+        * g++.old-deja/g++.other/store-expr1.C (dg-options): Replace
+        -mtune=i686 with -mtune=generic.
+        * g++.old-deja/g++.other/store-expr2.C (dg-options): Likewise.
+
+2013-12-26  H.J. Lu  hongjiu...@intel.com
+
         * gcc.target/i386/andor-2.c (dg-options): Replace -mtune=i686
         with -mtune=generic.
 
diff --git a/gcc/testsuite/g++.old-deja/g++.other/store-expr1.C b/gcc/testsuite/g++.old-deja/g++.other/store-expr1.C
index 72d30eb..af5e415 100644
--- a/gcc/testsuite/g++.old-deja/g++.other/store-expr1.C
+++ b/gcc/testsuite/g++.old-deja/g++.other/store-expr1.C
@@ -1,7 +1,7 @@
 // { dg-do run { target i?86-*-* x86_64-*-* } }
 // { dg-require-effective-target ilp32 }
 // { dg-require-effective-target fpic }
-// { dg-options "-mtune=i686 -O2 -fpic" }
+// { dg-options "-mtune=generic -O2 -fpic" }
 class G {};
 
 struct N {
diff --git a/gcc/testsuite/g++.old-deja/g++.other/store-expr2.C b/gcc/testsuite/g++.old-deja/g++.other/store-expr2.C
index 99e0943..1dffbcc 100644
--- a/gcc/testsuite/g++.old-deja/g++.other/store-expr2.C
+++ b/gcc/testsuite/g++.old-deja/g++.other/store-expr2.C
@@ -1,6 +1,6 @@
 // { dg-do run { target i?86-*-* x86_64-*-*} }
 // { dg-require-effective-target ilp32 }
-// { dg-options "-mtune=i686 -O2" }
+// { dg-options "-mtune=generic -O2" }
 class G {};
 
 struct N {
Re: [patch] powerpc64 FreeBSD support for boehm-gc
On 12/26/2013 12:11 AM, Andreas Tobler wrote:
> On 21.12.13 18:27, Andrew Haley wrote:
>> On 12/20/2013 10:15 PM, Andreas Tobler wrote:
>>> Ok for gcc trunk?
>>
>> OK, thanks.
>
> May I get this one down to 4.8 too? Not really needed, but for completeness. Results will follow...

No objections from me.

Andrew.
PATCH: PR target/59605: Create jump_around_label only if it doesn't exist
Hi Honza,

r203937 may create jump_around_label earlier. But later code doesn't check if jump_around_label exists. This patch fixes it. Tested on Linux/x86-64. OK to install?

Thanks.

H.J.
--
gcc/

2013-12-26  H.J. Lu  hongjiu...@intel.com

        PR target/59605
        * config/i386/i386.c (ix86_expand_set_or_movmem): Create
        jump_around_label only if it doesn't exist.

gcc/testsuite/

2013-12-26  H.J. Lu  hongjiu...@intel.com

        PR target/59605
        * gcc.dg/pr59605.c: New test.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0cf0a9d..07f9a86 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24015,7 +24015,8 @@ ix86_expand_set_or_movmem (rtx dst, rtx src, rtx count_exp, rtx val_exp,
           else
             {
               rtx hot_label = gen_label_rtx ();
-              jump_around_label = gen_label_rtx ();
+              if (jump_around_label == NULL_RTX)
+                jump_around_label = gen_label_rtx ();
               emit_cmp_and_jump_insns (count_exp, GEN_INT (dynamic_check - 1),
                                        LEU, 0, GET_MODE (count_exp), 1, hot_label);
               predict_jump (REG_BR_PROB_BASE * 90 / 100);
diff --git a/gcc/testsuite/gcc.dg/pr59605.c b/gcc/testsuite/gcc.dg/pr59605.c
new file mode 100644
index 000..4556843
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr59605.c
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-minline-stringops-dynamically" { target { i?86-*-* x86_64-*-* } } } */
+
+extern void abort (void);
+
+#define MAX_OFFSET (sizeof (long long))
+#define MAX_COPY (1024 + 8192)
+#define MAX_EXTRA (sizeof (long long))
+
+#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA)
+
+static union {
+  char buf[MAX_LENGTH];
+  long long align_int;
+  long double align_fp;
+} u;
+
+char A[MAX_LENGTH];
+
+int
+main ()
+{
+  int off, len, i;
+  char *p, *q;
+
+  for (i = 0; i < MAX_LENGTH; i++)
+    A[i] = 'A';
+
+  for (off = 0; off < MAX_OFFSET; off++)
+    for (len = 1; len < MAX_COPY; len++)
+      {
+        for (i = 0; i < MAX_LENGTH; i++)
+          u.buf[i] = 'a';
+
+        p = __builtin_memcpy (u.buf + off, A, len);
+        if (p != u.buf + off)
+          abort ();
+
+        q = u.buf;
+        for (i = 0; i < off; i++, q++)
+          if (*q != 'a')
+            abort ();
+
+        for (i = 0; i < len; i++, q++)
+          if (*q != 'A')
+            abort ();
+
+        for (i = 0; i < MAX_EXTRA; i++, q++)
+          if (*q != 'a')
+            abort ();
+      }
+
+  return 0;
+}
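The one-line fix above is an instance of "create on first use, reuse afterwards". A small stand-alone C model of that pattern follows. Everything in it is hypothetical: struct label, new_label and ensure_label merely stand in for rtx, gen_label_rtx and the guarded assignment, so it models the control flow only and says nothing about the surrounding expander.

```c
#include <stddef.h>

/* Hypothetical stand-in for a label object.  */
struct label
{
  int id;
};

#define MAX_LABELS 8

static struct label pool[MAX_LABELS];
static int labels_created;

/* Stand-in for gen_label_rtx (): hand out a fresh label, or NULL when
   the small fixed pool of this example is exhausted.  */
struct label *
new_label (void)
{
  if (labels_created >= MAX_LABELS)
    return NULL;
  pool[labels_created].id = labels_created;
  return &pool[labels_created++];
}

/* The pattern from the fix: keep EXISTING if an earlier code path
   already created it, otherwise create a label now.  */
struct label *
ensure_label (struct label *existing)
{
  return existing != NULL ? existing : new_label ();
}

/* Number of labels handed out so far.  */
int
label_count (void)
{
  return labels_created;
}
```

Calling ensure_label twice on the same variable yields the same label both times, whereas the unguarded assignment would replace a label that earlier code may already refer to.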
Re: [PATCH][x86] march aliases
On Mon, 23 Dec 2013 05:10:06 -0800 H.J. Lu hjl.to...@gmail.com wrote:
> This is the patch I checked in. I will submit separate patches for other parts.

Please be sure to update changes.html.

--
Ryan Hill    psn: dirtyepic_sk
gcc-porting/toolchain/wxwidgets @ gentoo.org
47C3 6D62 4864 0E49 8E9E 7F92 ED38 BD49 957A 8463
Re: [Patch] PR55189 enable -Wreturn-type by default
2013/12/21 Sylvestre Ledru sylves...@debian.org:
> Hello
>
> Following this thread http://gcc.gnu.org/ml/gcc/2013-11/msg00260.html and this bug, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55189
> I would like to propose the two following patches:
>
> I am activating -Wreturn-type by defaut and add the option -Wmissing-return
>
[snip]
> Index: gcc/ChangeLog
> ===================================================================
> --- gcc/ChangeLog (révision 206154)
> +++ gcc/ChangeLog (copie de travail)
> @@ -1,3 +1,11 @@
> +2013-12-20 Sylvestre Ledru sylves...@debian.org
> +
> +PR target/55189
> +* -Wreturn-type enabled by default.
> + * Introduce back the option -Wmissing-return (enabled by -Wall)
> + It was included by default with -Wreturn-type
> + * Update all tests failing because of these changes.
> +
>  2013-12-20 Eric Botcazou ebotca...@adacore.com
>
>  * config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with

Hi, Sylvestre,

Sorry I have no right to approve this patch. But I notice your ChangeLog formatting is not correct. You can refer to other entries in ChangeLog to refine yours, and then resubmit the patch for review. :)

Best regards,
jasonwucj
Re: [Patch] PR55189 enable -Wreturn-type by default
Chung-Wu wrote:
> But I notice your ChangeLog formatting is not correct. You can refer to other entries in ChangeLog to refine yours, and then resubmit the patch for review. :)

Or - use contrib/mklog to autogenerate template ChangeLog for you.

-Y