Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-26 Thread Uros Bizjak
On Thu, Dec 26, 2013 at 7:28 AM, Gopalasubramanian, Ganesh
ganesh.gopalasubraman...@amd.com wrote:

 (get_amd_cpu): Handle AMD_BOBCAT, AMD_JAGUAR, AMDFAM15H_BDVER2 and
 AMDFAM15H_BDVER3.

 As mentioned earlier, we would like to stick with BTVER1 and BTVER2 instead 
 of using BOBCAT or JAGUAR.
 Attached patch does the changes.

OK.

I'm sorry I didn't notice previous conversation. Please install ASAP.

Thanks,
Uros.


RE: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-26 Thread Gopalasubramanian, Ganesh
 I'm sorry I didn't notice previous conversation. Please install ASAP.

Thanks Uros! Committed to revision 206210.
- Ganesh



Re: [PATCH i386 4/8] [AVX512] [7/8] Add substed patterns: `round for expand' subst.

2013-12-26 Thread Kirill Yukhin
Hello Uros,
On 23 Dec 17:46, Uros Bizjak wrote:
 This round_expand_predicate is the predicate substitution I was
 referred to in the review of 5/8. Please use it also in insn patterns,
 perhaps renamed as round_predicate

This is a drawback of substs: a subst attribute is bound strictly to one
subst. So this one:

+(define_subst_attr "round_expand_predicate" "round_expand" "nonimmediate_operand" "register_operand")

is bound to the round_expand subst (the second argument of the definition) and
to it only. That is why the name is round_expand...: it reflects the subst it
relates to.

For the remaining substs I'll introduce dedicated attributes.

--
Thanks, K


Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-26 Thread Allan Sandfeld Jensen
On Thursday 26 December 2013, Gopalasubramanian, Ganesh wrote:
 Hi,
 
  (get_amd_cpu): Handle AMD_BOBCAT, AMD_JAGUAR, AMDFAM15H_BDVER2 and
  AMDFAM15H_BDVER3.
 
 As mentioned earlier, we would like to stick with BTVER1 and BTVER2 instead
 of using BOBCAT or JAGUAR. Attached patch does the changes.
 
Sorry for missing your comment. Thanks for fixing it. Renaming the comments 
with the AMD family names might be overdoing it though. 

Allan


Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning

2013-12-26 Thread Jan Hubicka
 Hi Honza,
 
 We have combined generic32 and generic64 into generic.  There is no need
 to check generic anymore.  Also we shouldn't change -mtune=i686 into
 -mtune=generic.  OK to install?

The i686-generic change was intended to get generic optimized code
for i686-linux configuration rather than pentiumpro.  I think it still makes
sense to use this, since it is what most 32bit distros still configure for?

Honza


Re: [PATCH i386 4/8] [AVX512] [5/8] Add substed patterns: rounding subst.

2013-12-26 Thread Kirill Yukhin
Hello,

On 23 Dec 17:26, Uros Bizjak wrote:
 On Mon, Dec 23, 2013 at 5:11 PM, Uros Bizjak ubiz...@gmail.com wrote:
  So, OK for mainline, but I would kindly ask you to please wait a
  couple of days for possible Richard's comments
 
 When substituting constraints, please also substitute corresponding
 operand predicate:
 
 nonimmediate_operand - register_operand in 1st and 3rd case
 memory_operand - register_operand in 2nd case.

Thanks! I've introduced a new subst attribute:
+(define_subst_attr "round_nimm_predicate" "round" "nonimmediate_operand" "register_operand")

whose name reflects:
  1.  its affiliation to the `round' subst (`round')
  2.  the predicate it is intended to affect (`nimm_predicate')

TESTING
  1. Bootstrap passes.
  2. make check shows no regressions.
  3. Spec 2000 & 2006 builds show no regressions both with and without the
     -mavx512f option.
  4. Spec 2000 & 2006 runs show no regressions without the -mavx512f option.

If there is no further input, I'll check it in 24 hours from now.
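
For readers following the operand-printing part of the patch, here is a minimal standalone sketch of the mapping that the new 'R' operand code performs. The constant values mirror the define_constants hunk in the patch. The names round_suffix and round_suffix_demo are hypothetical, and the sketch returns the suffix string instead of writing it to a FILE; values that the patch treats as unreachable (including NO_ROUND) yield NULL here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values mirror the define_constants hunk in the patch.  */
enum round_mode
{
  ROUND_NEAREST_INT = 0,
  ROUND_NEG_INF = 1,
  ROUND_POS_INF = 2,
  ROUND_ZERO = 3,
  NO_ROUND = 4,
  ROUND_SAE = 5
};

/* Hypothetical helper: return the embedded-rounding suffix for MODE,
   or NULL for values the patch's switch does not handle.  */
const char *
round_suffix (int mode)
{
  switch (mode)
    {
    case ROUND_NEAREST_INT:
      return "{rn-sae}";
    case ROUND_NEG_INF:
      return "{rd-sae}";
    case ROUND_POS_INF:
      return "{ru-sae}";
    case ROUND_ZERO:
      return "{rz-sae}";
    case ROUND_SAE:
      return "{sae}";
    default:
      return NULL;
    }
}

/* Usage example for the mapping above.  */
void
round_suffix_demo (void)
{
  assert (strcmp (round_suffix (ROUND_ZERO), "{rz-sae}") == 0);
  assert (strcmp (round_suffix (ROUND_SAE), "{sae}") == 0);
  assert (round_suffix (NO_ROUND) == NULL);
}
```

In the patch itself the separator ", " is printed before the suffix for the Intel dialect and after it for the AT&T dialect; the sketch leaves that part out.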

--
Thanks, K

---
 gcc/config/i386/i386.c   |  32 
 gcc/config/i386/i386.md  |  10 +
 gcc/config/i386/sse.md   | 480 ---
 gcc/config/i386/subst.md |  42 +
 4 files changed, 331 insertions(+), 233 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ecf5e0b..a3dd307 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -15041,6 +15041,38 @@ ix86_print_operand (FILE *file, rtx x, int code)
fputs ("{z}", file);
  return;
 
+   case 'R':
+ gcc_assert (CONST_INT_P (x));
+
+ if (ASSEMBLER_DIALECT == ASM_INTEL)
+   fputs (", ", file);
+
+ switch (INTVAL (x))
+   {
+   case ROUND_NEAREST_INT:
+ fputs ("{rn-sae}", file);
+ break;
+   case ROUND_NEG_INF:
+ fputs ("{rd-sae}", file);
+ break;
+   case ROUND_POS_INF:
+ fputs ("{ru-sae}", file);
+ break;
+   case ROUND_ZERO:
+ fputs ("{rz-sae}", file);
+ break;
+   case ROUND_SAE:
+ fputs ("{sae}", file);
+ break;
+   default:
+ gcc_unreachable ();
+   }
+
+ if (ASSEMBLER_DIALECT == ASM_ATT)
+   fputs (", ", file);
+
+ return;
+
case '*':
  if (ASSEMBLER_DIALECT == ASM_ATT)
putc ('*', file);
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ab5b33f..30b8d74 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -241,6 +241,16 @@
(ROUND_NO_EXC   0x8)
   ])
 
+;; Constants to represent AVX512F embeded rounding
+(define_constants
+  [(ROUND_NEAREST_INT  0)
+   (ROUND_NEG_INF  1)
+   (ROUND_POS_INF  2)
+   (ROUND_ZERO 3)
+   (NO_ROUND   4)
+   (ROUND_SAE  5)
+  ])
+
 ;; Constants to represent pcomtrue/pcomfalse variants
 (define_constants
   [(PCOM_FALSE 0)
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index adedf44..119d0b0 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1229,23 +1229,23 @@
 }
   [(set_attr isa noavx,noavx,avx,avx)])
 
-(define_expand plusminus_insnmode3mask_name
+(define_expand plusminus_insnmode3mask_nameround_name
   [(set (match_operand:VF 0 register_operand)
(plusminus:VF
- (match_operand:VF 1 nonimmediate_operand)
- (match_operand:VF 2 nonimmediate_operand)))]
-  TARGET_SSE  mask_mode512bit_condition
+ (match_operand:VF 1 round_nimm_predicate)
+ (match_operand:VF 2 round_nimm_predicate)))]
+  TARGET_SSE  mask_mode512bit_condition  round_mode512bit_condition
   ix86_fixup_binary_operands_no_copy (CODE, MODEmode, operands);)
 
-(define_insn *plusminus_insnmode3mask_name
+(define_insn *plusminus_insnmode3mask_nameround_name
   [(set (match_operand:VF 0 register_operand =x,v)
(plusminus:VF
- (match_operand:VF 1 nonimmediate_operand comm0,v)
- (match_operand:VF 2 nonimmediate_operand xm,vm)))]
-  TARGET_SSE  ix86_binary_operator_ok (CODE, MODEmode, operands)  
mask_mode512bit_condition
+ (match_operand:VF 1 round_nimm_predicate comm0,v)
+ (match_operand:VF 2 round_nimm_predicate 
xm,round_constraint)))]
+  TARGET_SSE  ix86_binary_operator_ok (CODE, MODEmode, operands)  
mask_mode512bit_condition  round_mode512bit_condition
   @
plusminus_mnemonicssemodesuffix\t{%2, %0|%0, %2}
-   vplusminus_mnemonicssemodesuffix\t{%2, %1, 
%0mask_operand3|%0mask_operand3, %1, %2}
+   vplusminus_mnemonicssemodesuffix\t{round_mask_op3%2, %1, 
%0mask_operand3|%0mask_operand3, %1, %2round_mask_op3}
   [(set_attr isa noavx,avx)
(set_attr type sseadd)
(set_attr prefix mask_prefix3)
@@ -1268,23 +1268,23 @@
(set_attr prefix orig,vex)
(set_attr mode ssescalarmode)])
 
-(define_expand mulmode3mask_name
+(define_expand 

PATCH: PR target/59601: [4.9 Regression] __attribute__ ((target(arch=corei7))) won't match Westmere processor

2013-12-26 Thread H.J. Lu
Hi,

After my Intel processor name cleanup,

__attribute__ ((target("arch=corei7"))) is translated to PROCESSOR_NEHALEM
mapped to M_INTEL_COREI7_NEHALEM. We used to have

__attribute__ ((target("arch=corei7")))

to cover M_INTEL_COREI7_. Now it only covers M_INTEL_COREI7_NEHALEM.
We have PROCESSOR_SANDYBRIDGE and PROCESSOR_HASWELL.  But there is nothing
to mark Westmere and Ivy Bridge.  Since function versioning doesn't support
extra ISAs in Westmere and Ivy Bridge, we don't lose anything. The solution
is to map

__attribute__ ((target("arch=corei7")))

and

__attribute__ ((target("arch=nehalem")))

to M_INTEL_COREI7.  I tested mv14.C and mv15.C on Nehalem, Westmere,
Sandy Bridge and Ivy Bridge.  OK to install?
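
To make the difference concrete, here is a small sketch of matching on a CPU subtype versus matching on the CPU type. It is an illustration only: the enums, struct cpu_model, and the function names are hypothetical stand-ins and do not reproduce the layout or values of the real runtime CPU model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for CPU type and subtype values.  */
enum cpu_type { TYPE_OTHER, TYPE_COREI7 };
enum cpu_subtype
{
  SUB_NONE,
  SUB_COREI7_NEHALEM,
  SUB_COREI7_WESTMERE,
  SUB_COREI7_SANDYBRIDGE
};

struct cpu_model
{
  enum cpu_type type;
  enum cpu_subtype subtype;
};

/* Matching on the subtype: only an exact Nehalem is selected.  */
bool
matches_nehalem_subtype (const struct cpu_model *m)
{
  return m->subtype == SUB_COREI7_NEHALEM;
}

/* Matching on the type: every member of the family is selected.  */
bool
matches_corei7_type (const struct cpu_model *m)
{
  return m->type == TYPE_COREI7;
}

/* Usage example: a Westmere machine is missed by the subtype check
   but caught by the type check.  */
void
corei7_match_demo (void)
{
  struct cpu_model westmere = { TYPE_COREI7, SUB_COREI7_WESTMERE };
  assert (!matches_nehalem_subtype (&westmere));
  assert (matches_corei7_type (&westmere));
}
```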

Thanks.

H.J.

gcc/

2013-12-26   H.J. Lu  hongjiu...@intel.com

PR target/59601
* config/i386/i386.c (get_builtin_code_for_version): Map
PROCESSOR_NEHALEM to corei7.

gcc/testsuite/

2013-12-26   Uros Bizjak  ubiz...@gmail.com
 H.J. Lu  hongjiu...@intel.com

PR target/59601
* g++.dg/ext/mv14.C: New tests.
* g++.dg/ext/mv15.C: Likewise.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 37bb656..e3d693a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -30010,7 +30010,10 @@ get_builtin_code_for_version (tree decl, tree *predicate_list)
  priority = P_PROC_SSSE3;
  break;
case PROCESSOR_NEHALEM:
- arg_str = "nehalem";
+ /* We translate "arch=corei7" and "arch=nehelam" to
+"corei7" so that it will be mapped to M_INTEL_COREI7
+as cpu type to cover all M_INTEL_COREI7_XXXs.  */
+ arg_str = "corei7";
  priority = P_PROC_SSE4_2;
  break;
case PROCESSOR_SANDYBRIDGE:
diff --git a/gcc/testsuite/g++.dg/ext/mv14.C b/gcc/testsuite/g++.dg/ext/mv14.C
new file mode 100644
index 000..e36e08d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/mv14.C
@@ -0,0 +1,40 @@
+/* Test case to check if Multiversioning works.  */
+/* { dg-do run { target i?86-*-* x86_64-*-* } } */
+/* { dg-require-ifunc "" }  */
+/* { dg-options "-O2 -fPIC" } */
+
+#include <assert.h>
+
+/* Default version.  */
+int foo (); // Extra declaration that is merged with the second one.
+int foo () __attribute__ ((target("default")));
+
+int foo () __attribute__ ((target("arch=corei7")));
+
+int (*p)() = &foo;
+int main ()
+{
+  int val = foo ();
+  assert (val ==  (*p)());
+
+  /* Check in the exact same order in which the dispatching
+ is expected to happen.  */
+  if (__builtin_cpu_is ("corei7"))
+    assert (val == 5);
+  else
+    assert (val == 0);
+  
+  return 0;
+}
+
+int __attribute__ ((target("default")))
+foo ()
+{
+  return 0;
+}
+
+int __attribute__ ((target("arch=corei7")))
+foo ()
+{
+  return 5;
+}
diff --git a/gcc/testsuite/g++.dg/ext/mv15.C b/gcc/testsuite/g++.dg/ext/mv15.C
new file mode 100644
index 000..42e39d2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/mv15.C
@@ -0,0 +1,40 @@
+/* Test case to check if Multiversioning works.  */
+/* { dg-do run { target i?86-*-* x86_64-*-* } } */
+/* { dg-require-ifunc "" }  */
+/* { dg-options "-O2 -fPIC" } */
+
+#include <assert.h>
+
+/* Default version.  */
+int foo (); // Extra declaration that is merged with the second one.
+int foo () __attribute__ ((target("default")));
+
+int foo () __attribute__ ((target("arch=nehalem")));
+
+int (*p)() = &foo;
+int main ()
+{
+  int val = foo ();
+  assert (val ==  (*p)());
+
+  /* Check in the exact same order in which the dispatching
+ is expected to happen.  */
+  if (__builtin_cpu_is ("corei7"))
+    assert (val == 5);
+  else
+    assert (val == 0);
+  
+  return 0;
+}
+
+int __attribute__ ((target("default")))
+foo ()
+{
+  return 0;
+}
+
+int __attribute__ ((target("arch=nehalem")))
+foo ()
+{
+  return 5;
+}


Re: [RFC][gomp4] Offloading: Add device initialization and host-target function mapping

2013-12-26 Thread Ilya Verbin
Ping.
(Patch is slightly updated)

On 20 Dec 21:18, Ilya Verbin wrote:
 Hi Jakub,
 
 Could you please take a look at this patch for libgomp?
 
 It adds new function GOMP_register_lib, that should be called from every
 exec/lib with target regions (that was done in patch [1]).  This function
 maintains the array of pointers to the target shared library descriptors.
 
 Also this patch adds target device initialization into GOMP_target and
 GOMP_target_data.  At first, it calls device_init function from the plugin.
 This function takes array of target-images as input, and returns the array of
 target-side addresses.  Currently, it always uses the first target-image from
 the descriptor, this should be generalized later.  Then libgomp reads the
 tables from host-side exec/libs.  After that, it inserts host-target address
 mapping into the splay tree.
 
 [1] http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01486.html
 
 Thanks,
 -- Ilya

-- Ilya
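
As an aside on the registration step described above, here is a minimal sketch of growing an array of descriptor pointers one entry at a time. It is a variation for illustration, not the code in the patch: the names register_lib and register_lib_demo are hypothetical, and unlike the posted GOMP_register_lib it keeps the old array when realloc fails and reports the failure to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Array of pointers to library descriptors and its length.  */
static void **libraries;
static int num_libraries;

/* Hypothetical registration helper.  Returns false when the array cannot
   grow; in that case the entries registered so far stay intact because the
   realloc result is only stored after it has been checked.  */
bool
register_lib (const void *descr)
{
  void **grown = realloc (libraries,
			  ((size_t) num_libraries + 1) * sizeof (void *));
  if (grown == NULL)
    return false;

  libraries = grown;
  libraries[num_libraries++] = (void *) descr;
  return true;
}

/* Usage example: register two descriptors and check their order.  */
void
register_lib_demo (void)
{
  static int a, b;
  int before = num_libraries;

  assert (register_lib (&a));
  assert (register_lib (&b));
  assert (num_libraries == before + 2);
  assert (libraries[before] == &a);
  assert (libraries[before + 1] == &b);
}
```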

---
 libgomp/libgomp.map |1 +
 libgomp/target.c|  154 ---
 2 files changed, 146 insertions(+), 9 deletions(-)

diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2b64d05..792047f 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -208,6 +208,7 @@ GOMP_3.0 {
 
 GOMP_4.0 {
   global:
+   GOMP_register_lib;
GOMP_barrier_cancel;
GOMP_cancel;
GOMP_cancellation_point;
diff --git a/libgomp/target.c b/libgomp/target.c
index d84a1fa..7677c28 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -84,6 +84,19 @@ struct splay_tree_key_s {
   bool copy_from;
 };
 
+enum library_descr {
+  DESCR_TABLE_START,
+  DESCR_TABLE_END,
+  DESCR_IMAGE_START,
+  DESCR_IMAGE_END
+};
+
+/* Array of pointers to target shared library descriptors.  */
+static void **libraries;
+
+/* Total number of target shared libraries.  */
+static int num_libraries;
+
 /* Array of descriptors of all available devices.  */
 static struct gomp_device_descr *devices;
 
@@ -117,11 +130,16 @@ struct gomp_device_descr
  TARGET construct.  */
   int id;
 
+  /* Set to true when device is initialized.  */
+  bool is_initialized;
+
   /* Plugin file handler.  */
   void *plugin_handle;
 
   /* Function handlers.  */
-  bool (*device_available_func) (void);
+  bool (*device_available_func) (int);
+  void (*device_init_func) (void **, int *, int, void ***, int *);
+  void (*device_run_func) (void *, uintptr_t);
 
   /* Splay tree containing information about mapped memory regions.  */
   struct splay_tree_s dev_splay_tree;
@@ -466,6 +484,89 @@ gomp_update (struct gomp_device_descr *devicep, size_t mapnum,
   gomp_mutex_unlock (&devicep->dev_env_lock);
 }
 
+void
+GOMP_register_lib (const void *openmp_target)
+{
+  libraries = realloc (libraries, (num_libraries + 1) * sizeof (void *));
+
+  if (libraries == NULL)
+    return;
+
+  libraries[num_libraries] = (void *) openmp_target;
+
+  num_libraries++;
+}
+
+static void
+gomp_init_device (struct gomp_device_descr *devicep)
+{
+  void **target_images = malloc (num_libraries * sizeof (void *));
+  int *target_img_sizes = malloc (num_libraries * sizeof (int));
+  if (target_images == NULL || target_img_sizes == NULL)
+    gomp_fatal ("Can not allocate memory");
+
+  /* Collect target images from the library descriptors and calculate the total
+     size of host address table.  */
+  int i, host_table_size = 0;
+  for (i = 0; i < num_libraries; i++)
+    {
+      void **lib = libraries[i];
+      void **host_table_start = lib[DESCR_TABLE_START];
+      void **host_table_end = lib[DESCR_TABLE_END];
+      /* FIXME: Select the proper target image.  */
+      target_images[i] = lib[DESCR_IMAGE_START];
+      target_img_sizes[i] = lib[DESCR_IMAGE_END] - lib[DESCR_IMAGE_START];
+      host_table_size += host_table_end - host_table_start;
+    }
+
+  /* Initialize the target device and receive the address table from target.  */
+  void **target_table = NULL;
+  int target_table_size = 0;
+  devicep->device_init_func (target_images, target_img_sizes, num_libraries,
+                             &target_table, &target_table_size);
+  free (target_images);
+  free (target_img_sizes);
+
+  if (host_table_size != target_table_size)
+    gomp_fatal ("Can't map target objects");
+
+  /* Initialize the mapping data structure.  */
+  void **target_entry = target_table;
+  for (i = 0; i < num_libraries; i++)
+    {
+      void **lib = libraries[i];
+      void **host_table_start = lib[DESCR_TABLE_START];
+      void **host_table_end = lib[DESCR_TABLE_END];
+      void **host_entry;
+      for (host_entry = host_table_start; host_entry < host_table_end;
+           host_entry += 2, target_entry += 2)
+        {
+          struct target_mem_desc *tgt = gomp_malloc (sizeof (*tgt));
+          tgt->refcount = 1;
+          tgt->array = gomp_malloc (sizeof (*tgt->array));
+          tgt->tgt_start = (uintptr_t) *target_entry;
+          tgt->tgt_end = tgt->tgt_start + *((uint64_t *) target_entry + 1);
+  

Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning

2013-12-26 Thread H.J. Lu
On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi Honza,

 We have combined generic32 and generic64 into generic.  There is no need
 to check generic anymore.  Also we shouldn't change -mtune=i686 into
 -mtune=generic.  OK to install?

 The i686-generic change was intended to get generic optimized code
 for i686-linux configuration rather than pentiumpro.  I think it still makes
 sense to use this, since it is what most 32bit distros still configure for?


Should -mtune=i686 define __tune_i686__?  If not, how can
it be defined? Don't we default -mtune to generic for
i686-linux?

-- 
H.J.


Re: PATCH: PR target/59601: [4.9 Regression] __attribute__ ((target(arch=corei7))) won't match Westmere processor

2013-12-26 Thread Uros Bizjak
On Thu, Dec 26, 2013 at 2:25 PM, H.J. Lu hongjiu...@intel.com wrote:

 After my Intel processor name cleanup,

 __attribute__ ((target("arch=corei7"))) is translated to PROCESSOR_NEHALEM
 mapped to M_INTEL_COREI7_NEHALEM. We used to have

 __attribute__ ((target("arch=corei7")))

 to cover M_INTEL_COREI7_. Now it only covers M_INTEL_COREI7_NEHALEM.
 We have PROCESSOR_SANDYBRIDGE and PROCESSOR_HASWELL.  But there is nothing
 to mark Westmere and Ivy Bridge.  Since function versioning doesn't support
 extra ISAs in Westmere and Ivy Bridge, we don't lose anything. The solution
 is to map

 __attribute__ ((target("arch=corei7")))

 and

 __attribute__ ((target("arch=nehalem")))

 to M_INTEL_COREI7.  I tested mv14.C and mv15.C on Nehalem, Westmere,
 Sandy Bridge and Ivy Bridge.  OK to install?

 gcc/

 2013-12-26   H.J. Lu  hongjiu...@intel.com

 PR target/59601
 * config/i386/i386.c (get_builtin_code_for_version): Map
 PROCESSOR_NEHALEM to corei7.

 gcc/testsuite/

 2013-12-26   Uros Bizjak  ubiz...@gmail.com
  H.J. Lu  hongjiu...@intel.com

 PR target/59601
 * g++.dg/ext/mv14.C: New tests.
 * g++.dg/ext/mv15.C: Likewise.

OK.

Thanks,
Uros.


[PATCH, i386]: Use vendor signatures from cpuid.h in cpuinfo.c

2013-12-26 Thread Uros Bizjak
Hello!

Use the same definitions from common header.

2013-12-26  Uros Bizjak  ubiz...@gmail.com

* config/i386/cpuinfo.c (enum vendor_signatures): Remove.
(__cpu_indicator_init): Use signature_INTEL_ebx and signature_AMD_ebx
from cpuid.h to check vendor signatures.

No functional changes.

Bootstrapped on x86_64-pc-linux-gnu and committed to mainline SVN.
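
As background on the constants involved: the values in the removed enum are simply the first four characters of the vendor string packed into a 32-bit register, first character in the lowest byte. A small sketch that recomputes them follows; pack_signature and pack_signature_demo are hypothetical names used only for this illustration, and the expected values are the ones shown in the removed enum.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four ASCII characters into a 32-bit value with the first character
   in the lowest byte, matching how the removed enum annotates its values
   with "Genu" and "Auth".  */
uint32_t
pack_signature (const char s[4])
{
  return (uint32_t) (unsigned char) s[0]
	 | (uint32_t) (unsigned char) s[1] << 8
	 | (uint32_t) (unsigned char) s[2] << 16
	 | (uint32_t) (unsigned char) s[3] << 24;
}

/* Usage example: recompute the two values from the removed enum.  */
void
pack_signature_demo (void)
{
  assert (pack_signature ("Genu") == 0x756e6547u);	/* SIG_INTEL */
  assert (pack_signature ("Auth") == 0x68747541u);	/* SIG_AMD */
}
```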

Uros.
Index: config/i386/cpuinfo.c
===
--- config/i386/cpuinfo.c   (revision 206210)
+++ config/i386/cpuinfo.c   (working copy)
@@ -36,12 +36,6 @@ see the files COPYING3 and COPYING.RUNTIME respect
 int __cpu_indicator_init (void)
   __attribute__ ((constructor CONSTRUCTOR_PRIORITY));
 
-enum vendor_signatures
-{
-  SIG_INTEL =  0x756e6547 /* Genu */,
-  SIG_AMD =0x68747541 /* Auth */
-};
-
 /* Processor Vendor and Models. */
 
 enum processor_vendor
@@ -368,7 +362,7 @@ __cpu_indicator_init (void)
   extended_model = (eax >> 12) & 0xf0;
   extended_family = (eax >> 20) & 0xff;
 
-  if (vendor == SIG_INTEL)
+  if (vendor == signature_INTEL_ebx)
 {
   /* Adjust model and family for Intel CPUS. */
   if (family == 0x0f)
@@ -385,7 +379,7 @@ __cpu_indicator_init (void)
   get_available_features (ecx, edx, max_level);
   __cpu_model.__cpu_vendor = VENDOR_INTEL;
 }
-  else if (vendor == SIG_AMD)
+  else if (vendor == signature_AMD_ebx)
 {
   /* Adjust model and family for AMD CPUS. */
   if (family == 0x0f)


Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning

2013-12-26 Thread Jan Hubicka
 On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Hi Honza,
 
  We have combined generic32 and generic64 into generic.  There is no need
  to check generic anymore.  Also we shouldn't change -mtune=i686 into
  -mtune=generic.  OK to install?
 
  The i686-generic change was intended to get generic optimized code
  for i686-linux configuration rather than pentiumpro.  I think it still makes
  sense to use this, since it is what most 32bit distros still configure for?
 
 
 Should -mtune=i686 define __tune_i686__?  If not, how can
 it be defined? Don't we default -mtune to generic for
 i686-linux?

If i686-linux defaults to -mtune=generic, then I think it is all fine.
i686 is a bit misbehaved, since it was used both as the CPU name for PPro (which
does not make much sense) and for the overall architecture...

Honza
 
 -- 
 H.J.


Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning

2013-12-26 Thread H.J. Lu
On Thu, Dec 26, 2013 at 7:45 AM, Jan Hubicka hubi...@ucw.cz wrote:
 On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Hi Honza,
 
  We have combined generic32 and generic64 into generic.  There is no need
  to check generic anymore.  Also we shouldn't change -mtune=i686 into
  -mtune=generic.  OK to install?
 
  The i686-generic change was intended to get generic optimized code
  for i686-linux configuration rather than pentiumpro.  I think it still 
  makes
  sense to use this, since it is what most 32bit distros still configure for?
 

 Should -mtune=i686 define __tune_i686__?  If not, how can
 it be defined? Don't we default -mtune to generic for
 i686-linux?

 If i686-linux defaults to -mtune=generic, then I think it is all fine.

We already default to -mtune=generic:
[hjl@gnu-6 gcc]$ ./xgcc -B./ -S /tmp/x.i -v
Reading specs from ./specs
COLLECT_GCC=./xgcc
Target: i686-linux
Configured with: /export/gnu/import/git/gcc/configure
--enable-languages=c,c++ --disable-bootstrap i686-linux
--prefix=/usr/gcc-4.9.0 --with-local-prefix=/usr/local
--enable-targets=all --with-fpmath=sse : (reconfigured)
/export/gnu/import/git/gcc/configure --enable-languages=c,c++
--disable-bootstrap i686-linux --prefix=/usr/gcc-4.9.0
--with-local-prefix=/usr/local --enable-targets=all --with-fpmath=sse
Thread model: posix
gcc version 4.9.0 20131224 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-B' './' '-S' '-v' '-mtune=generic' '-march=pentium4'
 ./cc1 -fpreprocessed /tmp/x.i -quiet -dumpbase x.i -mtune=generic
-march=pentium4 -auxbase x -version -o x.s
GNU C (GCC) version 4.9.0 20131224 (experimental) (i686-linux)
compiled by GNU C version 4.8.2 20131212 (Red Hat 4.8.2-7), GMP
version 5.1.1, MPFR version 3.1.1, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C (GCC) version 4.9.0 20131224 (experimental) (i686-linux)
compiled by GNU C version 4.8.2 20131212 (Red Hat 4.8.2-7), GMP
version 5.1.1, MPFR version 3.1.1, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 8d0a04c49875a54ef44488e5406c52dd
COMPILER_PATH=./
LIBRARY_PATH=./:/lib/../lib/:/usr/lib/../lib/:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-B' './' '-S' '-v' '-mtune=generic' '-march=pentium4'
[hjl@gnu-6 gcc]$

I will check in my patch.

 i686 is bit misbehaved since it was used as both CPU name for PPro (that does 
 not
 make much sense) and for the overall architecture...


Thanks.

-- 
H.J.


Re: PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index

2013-12-26 Thread H.J. Lu
On Wed, Dec 25, 2013 at 12:49 PM, Uros Bizjak ubiz...@gmail.com wrote:
 TARGET_CPU_DEFAULT is left over for 32-bit target before --with-arch=
 and --with-cpu= were added.  Today, -mtune=xxx -march=xxx are
 always passed to cc1 by GCC driver.  If cc1 is run by hand and
 -mtune=xxx -march=xxx aren't passed to cc1, we should do

 1. For 64-bit, it should be the same as -mtune=generic -march=x86_64
 are passed.
 2. For 32-bit, it should be the same as -mtune=cpu -march=cpu are
 passed, where cpu is the target cpu used to configure GCC,
 like i386 in i386-linux, i486 in i486-linux,  But there is no i786
 cpu.  i786 is treated as i686.  If SUBTARGET32_DEFAULT_CPU
 is defined, it should be the same -mtune=SUBTARGET32_DEFAULT_CPU
 -march=SUBTARGET32_DEFAULT_CPU.

 Here is the patch to implement this.

 Let's do one step at a time. So, let's split the patch back to target/59587 
 fix:


I am not formally submitting the patch to define target_cpu_default
for i[34567]86 targets:

http://gcc.gnu.org/git/?p=gcc.git;a=patch;h=c5d2157c8c9181286441317cf55570d8e33741c2

since it has no impact on x86-64, nor when the GCC driver
is used.  It only changes the default arch/tune when
cc1/cc1plus is run by hand, which is very unusual.
I will leave the patch on hjl/arch branch just in case
someone is interested.

Thanks.

-- 
H.J.


Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning

2013-12-26 Thread H.J. Lu
On Thu, Dec 26, 2013 at 8:06 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Dec 26, 2013 at 7:45 AM, Jan Hubicka hubi...@ucw.cz wrote:
 On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Hi Honza,
 
  We have combined generic32 and generic64 into generic.  There is no need
  to check generic anymore.  Also we shouldn't change -mtune=i686 into
  -mtune=generic.  OK to install?
 
  The i686-generic change was intended to get generic optimized code
  for i686-linux configuration rather than pentiumpro.  I think it still 
  makes
  sense to use this, since it is what most 32bit distros still configure 
  for?
 

 Should -mtune=i686 define __tune_i686__?  If not, how can
 it be defined? Don't we default -mtune to generic for
 i686-linux?

 If i686-linux defaults to -mtune=generic, then I think it is all fine.

...

 I will check in my patch.


My patch exposes a testsuite bug:

spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc/build-x86_64-linux/gcc/
/export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -mtune=i686
-ffat-lto-objects -ffat-lto-objects -S -o andor-2.s^M
/export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c:1:0:
error: CPU you selected does not support x86-64 instruction set^M
compiler exited with status 1
output is:
/export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c:1:0:
error: CPU you selected does not support x86-64 instruction set^M

FAIL: gcc.target/i386/andor-2.c (test for excess errors)

We used to silently turn -mtune=i686 into -mtune=generic.
Now we don't.  It is wrong to accept -mtune=i686 when compiling
for  x86-64.  I am checking in this patch as an obvious fix.

Thanks.

-- 
H.J.
--
diff --git a/gcc/testsuite/gcc.target/i386/andor-2.c
b/gcc/testsuite/gcc.target/i386/andor-2.c
index 88118aa..eacc7b1 100644
--- a/gcc/testsuite/gcc.target/i386/andor-2.c
+++ b/gcc/testsuite/gcc.target/i386/andor-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune=i686" } */
+/* { dg-options "-O2 -mtune=generic" } */

 int h(int x, int y)
 {


Re: New prologue/epilogue code for i386 string functions

2013-12-26 Thread H.J. Lu
On Tue, Oct 22, 2013 at 8:58 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 this patch adds code to produce prologues/epilogues as suggested by Ondrej
 Bilka (I described the approach in more detail in
 http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02082.html).
 This patch is an updated and cleaned-up version after Mikhail's changes merging
 the memset/memcpy generation code.  (I will continue with some incremental
 cleanups for the code duplication we ended up with).

 For now I don't have value range code in, but all logic is in place once
 http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02011.html
 gets reviewed.

 Bootstrapped/regtested x86_64-linux also with -minline-all-stringops and
 tested on SPEC2k6.
 I will commit it later today after more testing.

 Honza

 * i386.h (TARGET_MISALIGNED_MOVE_STRING_PROLOGUES_EPILOGUES): New 
 tuning flag.
 * x86-tune.def (TARGET_MISALIGNED_MOVE_STRING_PROLOGUES): Define it.
 * i386.c (expand_small_movmem_or_setmem): New function.
 (expand_set_or_movmem_prologue_epilogue_by_misaligned_moves): New 
 function
 (alg_usable_p): Add support for value ranges; cleanup.
 (ix86_expand_set_or_movmem): Add support for misaligned moves.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59605


-- 
H.J.


Re: PATCH: PR target/59588: Don't check/change generic/i686 tuning

2013-12-26 Thread H.J. Lu
On Thu, Dec 26, 2013 at 11:11 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Dec 26, 2013 at 8:06 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Dec 26, 2013 at 7:45 AM, Jan Hubicka hubi...@ucw.cz wrote:
 On Thu, Dec 26, 2013 at 4:38 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Hi Honza,
 
  We have combined generic32 and generic64 into generic.  There is no need
  to check generic anymore.  Also we shouldn't change -mtune=i686 into
  -mtune=generic.  OK to install?
 
  The i686-generic change was intended to get generic optimized code
  for i686-linux configuration rather than pentiumpro.  I think it still 
  makes
  sense to use this, since it is what most 32bit distros still configure 
  for?
 

 Should -mtune=i686 define __tune_i686__?  If not, how can
 it be defined? Don't we default -mtune to generic for
 i686-linux?

 If i686-linux defaults to -mtune=generic, then I think it is all fine.

 ...

 I will check in my patch.


 My patch exposes a testsuite bug:

 spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
 -B/export/build/gnu/gcc/build-x86_64-linux/gcc/
 /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c
 -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -mtune=i686
 -ffat-lto-objects -ffat-lto-objects -S -o andor-2.s^M
 /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c:1:0:
 error: CPU you selected does not support x86-64 instruction set^M
 compiler exited with status 1
 output is:
 /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/andor-2.c:1:0:
 error: CPU you selected does not support x86-64 instruction set^M

 FAIL: gcc.target/i386/andor-2.c (test for excess errors)

 We used to silently turn -mtune=i686 into -mtune=generic.
 Now we don't.  It is wrong to accept -mtune=i686 when compiling
 for  x86-64.  I am checking in this patch as an obvious fix.

 Thanks.

 --
 H.J.
 --
 diff --git a/gcc/testsuite/gcc.target/i386/andor-2.c
 b/gcc/testsuite/gcc.target/i386/andor-2.c
 index 88118aa..eacc7b1 100644
 --- a/gcc/testsuite/gcc.target/i386/andor-2.c
 +++ b/gcc/testsuite/gcc.target/i386/andor-2.c
 @@ -1,5 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options "-O2 -mtune=i686" } */
 +/* { dg-options "-O2 -mtune=generic" } */

  int h(int x, int y)
  {

Another one happens with -mx32.  I checked in
this patch to fix it.

-- 
H.J.
---
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 98d22b3e..ad98f63 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,11 @@
 2013-12-26   H.J. Lu  hongjiu...@intel.com

+ * g++.old-deja/g++.other/store-expr1.C (dg-options): Replace
+ -mtune=i686 with -mtune=generic.
+ * g++.old-deja/g++.other/store-expr2.C (dg-options): Likewise.
+
+2013-12-26   H.J. Lu  hongjiu...@intel.com
+
  * gcc.target/i386/andor-2.c (dg-options): Replace -mtune=i686
  with -mtune=generic.

diff --git a/gcc/testsuite/g++.old-deja/g++.other/store-expr1.C
b/gcc/testsuite/g++.old-deja/g++.other/store-expr1.C
index 72d30eb..af5e415 100644
--- a/gcc/testsuite/g++.old-deja/g++.other/store-expr1.C
+++ b/gcc/testsuite/g++.old-deja/g++.other/store-expr1.C
@@ -1,7 +1,7 @@
 // { dg-do run { target i?86-*-* x86_64-*-* } }
 // { dg-require-effective-target ilp32 }
 // { dg-require-effective-target fpic }
-// { dg-options "-mtune=i686 -O2 -fpic" }
+// { dg-options "-mtune=generic -O2 -fpic" }
 class G {};

 struct N {
diff --git a/gcc/testsuite/g++.old-deja/g++.other/store-expr2.C
b/gcc/testsuite/g++.old-deja/g++.other/store-expr2.C
index 99e0943..1dffbcc 100644
--- a/gcc/testsuite/g++.old-deja/g++.other/store-expr2.C
+++ b/gcc/testsuite/g++.old-deja/g++.other/store-expr2.C
@@ -1,6 +1,6 @@
 // { dg-do run { target i?86-*-* x86_64-*-*} }
 // { dg-require-effective-target ilp32 }
-// { dg-options "-mtune=i686 -O2" }
+// { dg-options "-mtune=generic -O2" }
 class G {};

 struct N {


Re: [patch] powerpc64 FreeBSD support for boehm-gc

2013-12-26 Thread Andrew Haley
On 12/26/2013 12:11 AM, Andreas Tobler wrote:
 On 21.12.13 18:27, Andrew Haley wrote:
 On 12/20/2013 10:15 PM, Andreas Tobler wrote:
 Ok for gcc trunk?

 OK, thanks.

 
 May I get this one down to 4.8 too? Not really needed, but for
 completeness. Results will follow...

No objections from me.

Andrew.




PATCH: PR target/59605: Create jump_around_label only if it doesn't exist

2013-12-26 Thread H.J. Lu
Hi Honza,

r203937 may create jump_around_label earlier, but the later code doesn't
check whether jump_around_label already exists.  This patch fixes it.
Tested on Linux/x86-64.  OK to install?
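
The shape of the fix can be illustrated outside GCC.  The following is
only a minimal sketch, assuming a simplified model: struct label,
make_label, get_label_unconditional and get_label_if_missing are
hypothetical names standing in for rtx labels and gen_label_rtx (), not
anything from i386.c.  It contrasts replacing an existing label with
creating one only when none exists yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a label; GCC's real type is rtx.  */
struct label { int id; };

static struct label pool[4];
static int labels_made;

/* Hypothetical analogue of gen_label_rtx (): returns a fresh label.  */
static struct label *
make_label (void)
{
  assert (labels_made < 4);
  pool[labels_made].id = labels_made;
  return &pool[labels_made++];
}

/* Old shape: always makes a new label and ignores the one that an
   earlier step may already have created.  */
static struct label *
get_label_unconditional (struct label *existing)
{
  (void) existing;
  return make_label ();
}

/* Patched shape: create the label only if none exists yet.  */
static struct label *
get_label_if_missing (struct label *existing)
{
  if (existing == NULL)
    existing = make_label ();
  return existing;
}
```

With the conditional form, a label handed out by an earlier step is
returned unchanged, so anything that already refers to it keeps a valid
target; the unconditional form silently hands back a different label.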

Thanks.

H.J.
--
gcc/

2013-12-26   H.J. Lu  hongjiu...@intel.com

PR target/59605
* config/i386/i386.c (ix86_expand_set_or_movmem): Create
jump_around_label only if it doesn't exist.

gcc/testsuite/

2013-12-26   H.J. Lu  hongjiu...@intel.com

PR target/59605
* gcc.dg/pr59605.c: New test.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0cf0a9d..07f9a86 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24015,7 +24015,8 @@ ix86_expand_set_or_movmem (rtx dst, rtx src, rtx count_exp, rtx val_exp,
   else
{
  rtx hot_label = gen_label_rtx ();
- jump_around_label = gen_label_rtx ();
+ if (jump_around_label == NULL_RTX)
+   jump_around_label = gen_label_rtx ();
  emit_cmp_and_jump_insns (count_exp, GEN_INT (dynamic_check - 1),
   LEU, 0, GET_MODE (count_exp), 1, hot_label);
  predict_jump (REG_BR_PROB_BASE * 90 / 100);
diff --git a/gcc/testsuite/gcc.dg/pr59605.c b/gcc/testsuite/gcc.dg/pr59605.c
new file mode 100644
index 000..4556843
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr59605.c
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-minline-stringops-dynamically" { target { i?86-*-* x86_64-*-* } } } */
+
+extern void abort (void);
+
+#define MAX_OFFSET (sizeof (long long))
+#define MAX_COPY (1024 + 8192)
+#define MAX_EXTRA (sizeof (long long))
+
+#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA)
+
+static union {
+  char buf[MAX_LENGTH];
+  long long align_int;
+  long double align_fp;
+} u;
+
+char A[MAX_LENGTH];
+
+int
+main ()
+{
+  int off, len, i;
+  char *p, *q;
+
+  for (i = 0; i < MAX_LENGTH; i++)
+A[i] = 'A';
+
+  for (off = 0; off < MAX_OFFSET; off++)
+for (len = 1; len < MAX_COPY; len++)
+  {
+   for (i = 0; i < MAX_LENGTH; i++)
+ u.buf[i] = 'a';
+
+   p = __builtin_memcpy (u.buf + off, A, len);
+   if (p != u.buf + off)
+ abort ();
+
+   q = u.buf;
+   for (i = 0; i < off; i++, q++)
+ if (*q != 'a')
+   abort ();
+
+   for (i = 0; i < len; i++, q++)
+ if (*q != 'A')
+   abort ();
+
+   for (i = 0; i < MAX_EXTRA; i++, q++)
+ if (*q != 'a')
+   abort ();
+  }
+
+  return 0;
+}


Re: [PATCH][x86] march aliases

2013-12-26 Thread Ryan Hill
On Mon, 23 Dec 2013 05:10:06 -0800
H.J. Lu hjl.to...@gmail.com wrote:

 This is the patch I checked in.  I will submit separate patches for
 other parts.

Please be sure to update changes.html.


-- 
Ryan Hill    psn: dirtyepic_sk
   gcc-porting/toolchain/wxwidgets @ gentoo.org

47C3 6D62 4864 0E49 8E9E  7F92 ED38 BD49 957A 8463




Re: [Patch] PR55189 enable -Wreturn-type by default

2013-12-26 Thread Chung-Ju Wu
2013/12/21 Sylvestre Ledru sylves...@debian.org:
 Hello

 Following this thread http://gcc.gnu.org/ml/gcc/2013-11/msg00260.html
 and this bug,
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55189

 I would like to propose the two following patches:

 I am activating -Wreturn-type by default and adding the option -Wmissing-return
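
For readers unfamiliar with the warning, here is a minimal sketch of the
kind of code -Wreturn-type reports.  The function names and the
SHOW_WARNING macro are hypothetical and exist only for this
illustration; the flawed function is guarded so that the file compiles
cleanly unless the macro is defined.  The proposed -Wmissing-return
option is not covered here, since its exact behavior is defined by the
patch under discussion.

```c
#include <assert.h>

#ifdef SHOW_WARNING   /* hypothetical macro, only for this illustration */
/* When x <= 0, control reaches the closing brace without returning a
   value; -Wreturn-type reports this function.  */
int
sign_flawed (int x)
{
  if (x > 0)
    return 1;
}
#endif

/* Every path returns a value, so -Wreturn-type stays quiet.  */
int
sign_fixed (int x)
{
  if (x > 0)
    return 1;
  if (x < 0)
    return -1;
  return 0;
}

/* Small self-check that a caller can run.  */
void
check_sign_fixed (void)
{
  assert (sign_fixed (5) == 1);
  assert (sign_fixed (-5) == -1);
  assert (sign_fixed (0) == 0);
}
```

Compiling with -DSHOW_WARNING -Wreturn-type should make GCC report
"control reaches end of non-void function" for sign_flawed.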

[snip]

 Index: gcc/ChangeLog
 ===
 --- gcc/ChangeLog   (revision 206154)
 +++ gcc/ChangeLog   (working copy)
 @@ -1,3 +1,11 @@
 +2013-12-20  Sylvestre Ledru  sylves...@debian.org
 +
 +PR target/55189
 +* -Wreturn-type enabled by default.
 +   * Introduce back the option -Wmissing-return (enabled by -Wall)
 +   It was included by default with -Wreturn-type
 +   * Update all tests failing because of these changes.
 +
  2013-12-20  Eric Botcazou  ebotca...@adacore.com
 * config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with

Hi, Sylvestre,

Sorry I have no right to approve this patch.
But I notice your ChangeLog formatting is not correct.

You can refer to other entries in ChangeLog to refine yours,
and then resubmit the patch for review. :)


Best regards,
jasonwucj


Re: [Patch] PR55189 enable -Wreturn-type by default

2013-12-26 Thread Yury Gribov

Chung-Ju Wu wrote:
 But I notice your ChangeLog formatting is not correct.

 You can refer to other entries in ChangeLog to refine yours,
 and then resubmit the patch for review. :)

Or use contrib/mklog to autogenerate a ChangeLog template for you.

-Y