[PATCH,gcc/MIPS] Make loongson3a use fused madd.d

2016-11-02 Thread Paul Hua
Hi,

Loongson3a has 4 operand fused madd instrcution. This patch set
loongson3a use fused madd.d.



ChangeLog :

*** gcc/ChangeLog ***

2016-11-03 Chenghua Xu 

config/mips/
* mips.h: Set loongson3a use fused madd.d.


Tested on loongson3a.


PS: I will soon submit some patches, how can i get a copyright assignment.
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 81862a9..5076a2b 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1056,11 +1056,11 @@ struct mips_cpu_info {
 
 /* ISA has 4 operand fused madd instructions of the form
'd = [+-] (a * b [+-] c)'.  */
-#define ISA_HAS_FUSED_MADD4	TARGET_MIPS8000
+#define ISA_HAS_FUSED_MADD4	(TARGET_MIPS8000 || TARGET_LOONGSON_3A)
 
 /* ISA has 4 operand unfused madd instructions of the form
'd = [+-] (a * b [+-] c)'.  */
-#define ISA_HAS_UNFUSED_MADD4	(ISA_HAS_FP4 && !TARGET_MIPS8000)
+#define ISA_HAS_UNFUSED_MADD4	(ISA_HAS_FP4 && !TARGET_MIPS8000 && !TARGET_LOONGSON_3A)
 
 /* ISA has 3 operand r6 fused madd instructions of the form
'c = c [+-] (a * b)'.  */


Re: [C++ Patch/RFC] PR 67980 ("left shift count is negative [-Wshift-count-negative] generated for unreachable code")

2016-11-02 Thread Paolo Carlini
.. I'm still looking for some directions about the best way to handle 
this issue: anyway, in case it wasn't clear, the second patch I posted 
passes testing.


Thanks,
Paolo.


Re: [openacc] adjust default num_gangs

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 03:11:52PM -0700, Cesar Philippidis wrote:
>if (seen_zero)
>  {
> +  /* See if the user provided GOMP_OPENACC_DIM environment
> +  variable to specify runtime defaults. */
> +  static int default_dims[GOMP_DIM_MAX];
> +
> +  pthread_mutex_lock (_dev_lock);

I think pthread_once would be better here, with the default_dims
initialization outlined into a separate helper function.

Jakub


Re: [openacc] adjust default num_gangs

2016-11-02 Thread Cesar Philippidis
On 11/02/2016 12:50 PM, Jakub Jelinek wrote:
> On Wed, Nov 02, 2016 at 12:34:47PM -0700, Cesar Philippidis wrote:
>> @@ -932,9 +933,84 @@ nvptx_exec (void (*fn), size_t mapnum, void 
>> **hostaddrs, void **devaddrs,
>>  
>>if (seen_zero)
>>  {
>> +  /* See if the user provided GOMP_OPENACC_DIM environment
>> + variable to specify runtime defaults. */
>> +  static int default_dims[GOMP_DIM_MAX];
>> +
>> +  if (!default_dims[0])
>> +{
> 
> Is this guarded by some lock, or is it just racy if multiple
> nvptx_execs are done at the same time?
> 
>> +  /* We only read the environment variable once.  You can't
>> + change it in the middle of execution.  The sytntax  is
> 
> syntax
> 
>> + the same as for the -fopenacc-dim compilation option.  */
>> +  const char *env_var = getenv ("GOMP_OPENACC_DIM");
> 
>> +
>> +  if (CUDA_SUCCESS == cuDeviceGetAttribute
>> +  (_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev)
>> +  && CUDA_SUCCESS == cuDeviceGetAttribute
>> +  (_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, dev)
>> +  && CUDA_SUCCESS == cuDeviceGetAttribute
>> +  (_size, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev)
>> +  && CUDA_SUCCESS == cuDeviceGetAttribute
>> +  (_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, 
>> dev))
> 
> The formatting is wrong.  1) you should use the call should be on lhs of ==,
> not rhs 2) ( should be after cuDeviceGetAttribute, not on the next line
> 3) still the lines are too long.
> 
> if (cuDeviceGetAttribute (_size,
>   CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
>   dev) == CUDA_SUCCESS
> && cuDeviceGetAttribute (...
> 
> CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR
> is still way too long, perhaps initialize a temporary const var
> to that, or use some macro like
> DEV_ATTR (MAX_THREADS_PER_MULTIPROCESSOR)
> where
> #define DEV_ATTR(x) CU_DEVICE_ATTRIBUTE_##x
> 
> Otherwise LGTM.

Thanks. I've applied this version to trunk.

Cesar

2016-11-02  Cesar Philippidis  
	Nathan Sidwell  

	gcc/
	* config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Set to zero.

	libgomp/
	* plugin/plugin-nvptx.c (nvptx_exec): Interrogate board attributes
	to determine default geometry.
	* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Set gang
	dimension.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 80fa9ae..782bbde 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4174,7 +4174,7 @@ nvptx_expand_builtin (tree exp, rtx target, rtx ARG_UNUSED (subtarget),
 /* Define dimension sizes for known hardware.  */
 #define PTX_VECTOR_LENGTH 32
 #define PTX_WORKER_LENGTH 32
-#define PTX_GANG_DEFAULT  32
+#define PTX_GANG_DEFAULT  0 /* Defer to runtime.  */
 
 /* Validate compute dimensions of an OpenACC offload or routine, fill
in non-unity defaults.  FN_LEVEL indicates the level at which a
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 327500c..5ee350d 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const char *
 cuda_error (CUresult r)
@@ -932,9 +933,88 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 
   if (seen_zero)
 {
+  /* See if the user provided GOMP_OPENACC_DIM environment
+	 variable to specify runtime defaults. */
+  static int default_dims[GOMP_DIM_MAX];
+
+  pthread_mutex_lock (_dev_lock);
+  if (!default_dims[0])
+	{
+	  /* We only read the environment variable once.  You can't
+	 change it in the middle of execution.  The syntax  is
+	 the same as for the -fopenacc-dim compilation option.  */
+	  const char *env_var = getenv ("GOMP_OPENACC_DIM");
+	  if (env_var)
+	{
+	  const char *pos = env_var;
+
+	  for (i = 0; *pos && i != GOMP_DIM_MAX; i++)
+		{
+		  if (i && *pos++ != ':')
+		break;
+		  if (*pos != ':')
+		{
+		  const char *eptr;
+
+		  errno = 0;
+		  long val = strtol (pos, (char **), 10);
+		  if (errno || val < 0 || (unsigned)val != val)
+			break;
+		  default_dims[i] = (int)val;
+		  pos = eptr;
+		}
+		}
+	}
+
+	  int warp_size, block_size, dev_size, cpu_size;
+	  CUdevice dev = nvptx_thread()->ptx_dev->dev;
+	  /* 32 is the default for known hardware.  */
+	  int gang = 0, worker = 32, vector = 32;
+	  CUdevice_attribute cu_tpb, cu_ws, cu_mpc, cu_tpm;
+
+	  cu_tpb = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK;
+	  cu_ws = CU_DEVICE_ATTRIBUTE_WARP_SIZE;
+	  cu_mpc = CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT;
+	  cu_tpm  = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR;
+
+	  if (cuDeviceGetAttribute (_size, cu_tpb, dev) == CUDA_SUCCESS
+	  && cuDeviceGetAttribute (_size, cu_ws, dev) == CUDA_SUCCESS
+	  && cuDeviceGetAttribute 

Re: [PATCH 1/2, libgcc]: Implement _divmoddi4

2016-11-02 Thread Ian Lance Taylor
On Wed, Nov 2, 2016 at 12:19 PM, Uros Bizjak  wrote:
> On Wed, Nov 2, 2016 at 2:27 PM, Ian Lance Taylor  wrote:
>> On Mon, Oct 31, 2016 at 7:46 AM, Uros Bizjak  wrote:
>>> This function will be used in a follow-up patch to implement
>>> TARGET_EXPAND_DIVMOD_LIBFUNC for x86 targets. Other targets can call
>>> this function, so IMO it should be part of a generic library.
>>>
>>> 2016-10-31  Uros Bizjak  
>>>
>>> * Makefile.in (LIB2_DIVMOD_FUNCS): Add _divmoddi4.
>>> * libgcc2.c (__divmoddi4): New function.
>>> * libgcc2.h (__divmoddi4): Declare.
>>> * libgcc-std.ver.in (GCC_7.0.0): New. Add __PFX_divmoddi4
>>> and __PFX_divmodti4.
>>
>> You aren't defining divmodti4 anywhere, so it seems premature to add
>> it to libgcc-std.ver.in.
>
> It is created magically for x86_64, in the same way e.g. divti3 is created.

Ah, OK.

Ian


Re: [PATCH, C++] Warn on redefinition of builtin functions (PR c++/71973)

2016-11-02 Thread Bernd Edlinger
On 11/02/16 18:51, Jason Merrill wrote:
> On 11/02/2016 02:11 AM, Bernd Edlinger wrote:
>> On 11/01/16 19:15, Bernd Edlinger wrote:
>>> On 11/01/16 18:11, Jason Merrill wrote:
 On Tue, Nov 1, 2016 at 11:45 AM, Bernd Edlinger
  wrote:
> On 11/01/16 16:20, Jason Merrill wrote:
>> On 10/17/2016 03:18 PM, Bernd Edlinger wrote:
>> I'm not even sure we need a new warning.  Can we combine this warning
>> with the block that currently follows?
>
> After 20 years of not having a warning on that,
> an implicitly enabled warning would at least break lots of bogus
> test cases.

 Would it, though?  Which test cases still break with the current patch?
>>>
>>> Less than before, but there are still at least a few of them.
>>>
>>> I can make a list and send it tomorrow.
>>
>> FAIL: g++.dg/cpp1y/lambda-generic-udt.C  -std=gnu++14 (test for excess
>> errors)
>> FAIL: g++.dg/cpp1y/lambda-generic-xudt.C  -std=gnu++14 (test for excess
>> errors)
>> FAIL: g++.dg/init/new15.C  -std=c++11 (test for excess errors)
>> FAIL: g++.dg/init/new15.C  -std=c++14 (test for excess errors)
>> FAIL: g++.dg/init/new15.C  -std=c++98 (test for excess errors)
>> FAIL: g++.dg/ipa/inline-1.C  -std=gnu++11 (test for excess errors)
>> FAIL: g++.dg/ipa/inline-1.C  -std=gnu++14 (test for excess errors)
>> FAIL: g++.dg/ipa/inline-1.C  -std=gnu++98 (test for excess errors)
>> FAIL: g++.dg/ipa/inline-2.C  -std=gnu++11 (test for excess errors)
>> FAIL: g++.dg/ipa/inline-2.C  -std=gnu++14 (test for excess errors)
>> FAIL: g++.dg/ipa/inline-2.C  -std=gnu++98 (test for excess errors)
>> FAIL: g++.dg/tc1/dr20.C  -std=c++11 (test for excess errors)
>> FAIL: g++.dg/tc1/dr20.C  -std=c++14 (test for excess errors)
>> FAIL: g++.dg/tc1/dr20.C  -std=c++98 (test for excess errors)
>> FAIL: g++.dg/tree-ssa/inline-1.C  -std=gnu++11 (test for excess errors)
>> FAIL: g++.dg/tree-ssa/inline-1.C  -std=gnu++14 (test for excess errors)
>> FAIL: g++.dg/tree-ssa/inline-1.C  -std=gnu++98 (test for excess errors)
>> FAIL: g++.dg/tree-ssa/inline-2.C  -std=gnu++11 (test for excess errors)
>> FAIL: g++.dg/tree-ssa/inline-2.C  -std=gnu++14 (test for excess errors)
>> FAIL: g++.dg/tree-ssa/inline-2.C  -std=gnu++98 (test for excess errors)
>> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto
>> -flto-partition=1to1 -fno-use-linker-plugin
>> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto
>> -flto-partition=none -fuse-linker-plugin
>> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto
>> -fuse-linker-plugin -fno-fat-lto-objects
>> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto
>> -flto-partition=1to1 -fno-use-linker-plugin
>> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto
>> -flto-partition=none -fuse-linker-plugin -fno-fat-lto-objects
>> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto
>> -fuse-linker-plugin
>> FAIL: g++.dg/lto/pr68811 cp_lto_pr68811_0.o assemble, -O2
>> FAIL: g++.old-deja/g++.law/except1.C  -std=gnu++11 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.law/except1.C  -std=gnu++14 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.law/except1.C  -std=gnu++98 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.mike/p700.C  -std=gnu++11 (test for excess errors)
>> FAIL: g++.old-deja/g++.mike/p700.C  -std=gnu++14 (test for excess errors)
>> FAIL: g++.old-deja/g++.mike/p700.C  -std=gnu++98 (test for excess errors)
>> FAIL: g++.old-deja/g++.other/builtins10.C  -std=c++11 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.other/builtins10.C  -std=c++14 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.other/realloc.C  -std=c++11 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.other/realloc.C  -std=c++14 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.other/realloc.C  -std=c++98 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.other/vbase5.C  -std=c++11 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.other/vbase5.C  -std=c++14 (test for excess
>> errors)
>> FAIL: g++.old-deja/g++.other/vbase5.C  -std=c++98 (test for excess
>> errors)
>>
>>
>> The lto test case does emit the warning when assembling, but
>> it still produces an executable and even executes it.
>>
>> Also g++.dg/cpp1y/lambda-generic-udt.C, g++.dg/tc1/dr20.C
>> and g++.old-deja/g++.other/vbase5.C are execution tests.
>>
>> So I was wrong to assume these were all compile-only tests.
>>
>> I think that list should be fixable, if we decide to enable
>> the warning by default.
>
> Yes, either by fixing the prototypes or disabling the warning.
>

Yes, I am inclined to enable the warning by default now.

Most of the test cases are fixable in a fairly obvious way,
see attachment.

But I am unsure about g++.old-deja/g++.other/builtins10.C:

// { dg-do assemble  }
// Test that built-in functions don't warn when prototyped without 
arguments.
// Origin: PR c++/9367
// Copyright (C) 

Re: [openacc] adjust default num_gangs

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 12:34:47PM -0700, Cesar Philippidis wrote:
> @@ -932,9 +933,84 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, 
> void **devaddrs,
>  
>if (seen_zero)
>  {
> +  /* See if the user provided GOMP_OPENACC_DIM environment
> +  variable to specify runtime defaults. */
> +  static int default_dims[GOMP_DIM_MAX];
> +
> +  if (!default_dims[0])
> + {

Is this guarded by some lock, or is it just racy if multiple
nvptx_execs are done at the same time?

> +   /* We only read the environment variable once.  You can't
> +  change it in the middle of execution.  The sytntax  is

syntax

> +  the same as for the -fopenacc-dim compilation option.  */
> +   const char *env_var = getenv ("GOMP_OPENACC_DIM");

> +
> +   if (CUDA_SUCCESS == cuDeviceGetAttribute
> +   (_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev)
> +   && CUDA_SUCCESS == cuDeviceGetAttribute
> +   (_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, dev)
> +   && CUDA_SUCCESS == cuDeviceGetAttribute
> +   (_size, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev)
> +   && CUDA_SUCCESS == cuDeviceGetAttribute
> +   (_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, 
> dev))

The formatting is wrong.  1) you should use the call should be on lhs of ==,
not rhs 2) ( should be after cuDeviceGetAttribute, not on the next line
3) still the lines are too long.

  if (cuDeviceGetAttribute (_size,
CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
dev) == CUDA_SUCCESS
  && cuDeviceGetAttribute (...

CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR
is still way too long, perhaps initialize a temporary const var
to that, or use some macro like
DEV_ATTR (MAX_THREADS_PER_MULTIPROCESSOR)
where
#define DEV_ATTR(x) CU_DEVICE_ATTRIBUTE_##x

Otherwise LGTM.

Jakub


[openacc] adjust default num_gangs

2016-11-02 Thread Cesar Philippidis
This patch teaches the libgomp runtime how to probe the CUDA driver to
extract the number of Stream Multiprocessors that are available on the
graphics hardware and use that as the default value for num_gangs.
Without that patch, libgomp used to have num_gangs default to 32, which
was chosen arbitrarily. At least this value maps onto a hardware value.

More details regarding this patch can be found here:

  https://gcc.gnu.org/ml/gcc-patches/2016-08/msg02064.html
  https://gcc.gnu.org/ml/gcc-patches/2016-08/msg02084.html

Is this patch OK for trunk?

Cesar
2016-11-02  Cesar Philippidis  
	Nathan Sidwell  

	gcc/
	* config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Set to zero.

	libgomp/
	* plugin/plugin-nvptx.c (nvptx_exec): Interrogate board attributes
	to determine default geometry.
	* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Set gang
	dimension.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 80fa9ae..782bbde 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4174,7 +4174,7 @@ nvptx_expand_builtin (tree exp, rtx target, rtx ARG_UNUSED (subtarget),
 /* Define dimension sizes for known hardware.  */
 #define PTX_VECTOR_LENGTH 32
 #define PTX_WORKER_LENGTH 32
-#define PTX_GANG_DEFAULT  32
+#define PTX_GANG_DEFAULT  0 /* Defer to runtime.  */
 
 /* Validate compute dimensions of an OpenACC offload or routine, fill
in non-unity defaults.  FN_LEVEL indicates the level at which a
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 327500c..91c1386 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const char *
 cuda_error (CUresult r)
@@ -932,9 +933,84 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 
   if (seen_zero)
 {
+  /* See if the user provided GOMP_OPENACC_DIM environment
+	 variable to specify runtime defaults. */
+  static int default_dims[GOMP_DIM_MAX];
+
+  if (!default_dims[0])
+	{
+	  /* We only read the environment variable once.  You can't
+	 change it in the middle of execution.  The sytntax  is
+	 the same as for the -fopenacc-dim compilation option.  */
+	  const char *env_var = getenv ("GOMP_OPENACC_DIM");
+	  if (env_var)
+	{
+	  const char *pos = env_var;
+
+	  for (i = 0; *pos && i != GOMP_DIM_MAX; i++)
+		{
+		  if (i && *pos++ != ':')
+		break;
+		  if (*pos != ':')
+		{
+		  const char *eptr;
+
+		  errno = 0;
+		  long val = strtol (pos, (char **), 10);
+		  if (errno || val < 0 || (unsigned)val != val)
+			break;
+		  default_dims[i] = (int)val;
+		  pos = eptr;
+		}
+		}
+	}
+
+	  int warp_size, block_size, dev_size, cpu_size;
+	  CUdevice dev = nvptx_thread()->ptx_dev->dev;
+	  /* 32 is the default for known hardware.  */
+	  int gang = 0, worker = 32, vector = 32;
+
+	  if (CUDA_SUCCESS == cuDeviceGetAttribute
+	  (_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev)
+	  && CUDA_SUCCESS == cuDeviceGetAttribute
+	  (_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, dev)
+	  && CUDA_SUCCESS == cuDeviceGetAttribute
+	  (_size, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev)
+	  && CUDA_SUCCESS == cuDeviceGetAttribute
+	  (_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, dev))
+	{
+	  GOMP_PLUGIN_debug (0, " warp_size=%d, block_size=%d,"
+ " dev_size=%d, cpu_size=%d\n",
+ warp_size, block_size, dev_size, cpu_size);
+	  gang = (cpu_size / block_size) * dev_size;
+	  worker = block_size / warp_size;
+	  vector = warp_size;
+	}
+
+	  /* There is no upper bound on the gang size.  The best size
+	 matches the hardware configuration.  Logical gangs are
+	 scheduled onto physical hardware.  To maximize usage, we
+	 should guess a large number.  */
+	  if (default_dims[GOMP_DIM_GANG] < 1)
+	default_dims[GOMP_DIM_GANG] = gang ? gang : 1024;
+	  /* The worker size must not exceed the hardware.  */
+	  if (default_dims[GOMP_DIM_WORKER] < 1
+	  || (default_dims[GOMP_DIM_WORKER] > worker && gang))
+	default_dims[GOMP_DIM_WORKER] = worker;
+	  /* The vector size must exactly match the hardware.  */
+	  if (default_dims[GOMP_DIM_VECTOR] < 1
+	  || (default_dims[GOMP_DIM_VECTOR] != vector && gang))
+	default_dims[GOMP_DIM_VECTOR] = vector;
+
+	  GOMP_PLUGIN_debug (0, " default dimensions [%d,%d,%d]\n",
+			 default_dims[GOMP_DIM_GANG],
+			 default_dims[GOMP_DIM_WORKER],
+			 default_dims[GOMP_DIM_VECTOR]);
+	}
+
   for (i = 0; i != GOMP_DIM_MAX; i++)
-   if (!dims[i])
- dims[i] = /* TODO */ 32;
+	if (!dims[i])
+	  dims[i] = default_dims[i];
 }
 
   /* This reserves a chunk of a pre-allocated page of memory mapped on both
@@ -954,8 +1030,8 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 		mapnum * sizeof (void *));
   

Re: [PATCH] enhance buffer overflow warnings (and c/53562)

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 10:55:23AM -0600, Martin Sebor wrote:
> >That's an unfair assertion in light of the numbers above.
> >
> >>If you want a warning for suspicious calls, sure, but
> >>1) it has to be clearly worded significantly differently from how do you
> >>   word it, so that users really understand you are warning about
> >>   suspicious code (though, I really believe it is very common and there
> >>   is really nothing the users can do about it)
> >>2) it really shouldn't be enabled in -Wall, and I think not even in
> >>-Wextra
> >
> >As you have argued yourself recently in our discussion of
> >-Wimplicit-fallthrough, warnings that aren't enabled by either
> >of these options don't generally benefit users because very few
> >turn them on explicitly.  I agree with that argument although I
> >would be in favor of rolling out a new warning gradually over
> >two or more releases if it were known to be prone to high rates
> >of false positive. The -Wstringop-overflow warning clearly isn't
> >in that category so there's no need for it.  My offer to do
> >additional testing is still good if you'd like to see more data.

But obviously not all levels of the warning can/should be enabled
with -Wall/-Werror.  There are cases which are worth warning by default
(the case where we want to inform the user if you reach this stmt,
you'll get your program killed (will call __chk_fail)) is something
that ought like before be enabled by default; can have a warning
switch users can disable.
Then there is the case where there is a sure buffer overflow (not using
-D_FORTIFY_SOURCE, but still __bos (, 0) tells the buffer is too short,
and it is unconditional (no tricks with PHIs where one path has short
and another part has long size).  This is something that is useful
in -Wall.
The rest I'm very doubtful about even for -Wextra.

One thing is that for useful warnings users have to be able to do something
to quiet them.  For -Wmisleading-indentation you indent your code properly,
for -Wimplicit-fallthrough you add attributes or comments, etc.
But I really don't know what you want people to do to tell the compiler
the warning is a false positive.  Add asm ("" : "+g" (ptr)); to hide
everything from the compiler?  Too ugly, too unportable, pessimizing correct
code.

Another thing is that __builtin_object_size (x, 1) is very fragile thing
especially since the introduction of MEM_REF, but also various foldings etc.
Just look up the numerous open PRs about it.  We try to mitigate it in some
early optimization passes, and had to introduce an early objsz pass copy to
deal with that.  But further down the optimization passes the distinctions
that __bos (x, 1) needs to do are very often lost, optimizers fold
(char *) + offsetof (struct S, fld) into  and vice versa, MEM_REF
just doesn't tell you anything about what field is used, etc.  It is
unreliable in both directions.  So trying to use it for anything very late
in the GIMPLE opts is just a bad idea.

Jakub


Re: [RFA] Fix false positive out of bounds array index warning in LRA

2016-11-02 Thread Jeff Law

On 11/02/2016 01:20 PM, Bernd Schmidt wrote:

On 10/29/2016 06:21 PM, Jeff Law wrote:


On a small number of ports, we only have 2 defined register classes.
NO_REGS and ALL_REGS.  Examples would include nvptx and vax.

So let's look at check_and_process_move from lra-constraints.c:

  sclass = dclass = NO_REGS;
  if (REG_P (dreg))
dclass = get_reg_class (REGNO (dreg));
  if (dclass == ALL_REGS)
/* ALL_REGS is used for new pseudos created by transformations
   like reload of SUBREG_REG (see function
   simplify_operand_subreg).  We don't know their class yet.  We
   should figure out the class from processing the insn
   constraints not in this fast path function.  Even if ALL_REGS
   were a right class for the pseudo, secondary_... hooks usually
   are not define for ALL_REGS.  */
return false;
  [ ... ]
  /* Set up hard register for a reload pseudo for hook
 secondary_reload because some targets just ignore unassigned
 pseudos in the hook.  */
  if (dclass != NO_REGS && lra_get_regno_hard_regno (REGNO (dreg)) < 0)
{
  dregno = REGNO (dreg);
  reg_renumber[dregno] = ira_class_hard_regs[dclass][0];
}


The reference to ira_class_hard_regs is flagged by VRP as having an
out-of-bounds array reference.  WTF!?  Of course it's pretty obvious
once you look at the dumps...

On most targets dclass in this code will have a VR_VARYING range and as
a result no warning will be issued.  But on the ptx and vax ports VRP is
able to give us the range ~[NO_REGS,ALL_REGS]  aka ~[0,1] The
out-of-range array index is now obvious.


So I tried to look up the rules for enum values and it seems like we
can't prove that the code in the if statement is dead. However, can't we
at least prove that it is "dead enough" for warning purposes?
For both C & C++ you can shove in a value outside the range of the enum. 
 There is new strongly typed enum in C++11, but I didn't figure 
converting to those would be well received :-)  So for the given code 
the range really is ~[0,1] and we can overflow the array bounds.





Ok for the trunk?


Hmm, seems like a deficiency in the warning code TBH, but I guess this
patch can't hurt. Maybe open a PR for improving the warning.
I guess we could enhance the warning on the assumption that getting a 
value that is not one of the enumeration constants.  That's assuming 
we've got the right type in the right place.  I'll take a look and see 
before going forward.


jeff


Re: [PATCH, Fortran] Extension: Support legacy PARAMETER statements with -std=legacy (or -fdec?)

2016-11-02 Thread Toon Moene

On 11/02/2016 04:47 PM, Fritz Reese wrote:


All,

Another quirk of legacy compilers is their syntax for PARAMETER
statements. Such statements are similar to standard PARAMETER
statements but lack parentheses following the PARAMETER keyword. There
is a good reason the standard doesn't support this - because the
statement becomes ambiguous with assignment statements in fixed form.
Consider the following:

parameter pi = 3.14d0

In fixed form, spaces are ignored so in standard Fortran the above
looks like an assignment to the variable "parameterpi". In legacy
compilers, the above is instead interpreted as

parameter (pi = 3.14d0)

which of course declares the variable 'pi' to be a parameter with the
value 3.14.


Please test this with a program that has a *variable* that's named 
PARAMETER.


I have encountered Fortran compilers in the past that couldn't cope with 
it (I won't name names).


Thanks and kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news


Re: [RFA] Fix false positive out of bounds array index warning in LRA

2016-11-02 Thread Bernd Schmidt

On 10/29/2016 06:21 PM, Jeff Law wrote:


On a small number of ports, we only have 2 defined register classes.
NO_REGS and ALL_REGS.  Examples would include nvptx and vax.

So let's look at check_and_process_move from lra-constraints.c:

  sclass = dclass = NO_REGS;
  if (REG_P (dreg))
dclass = get_reg_class (REGNO (dreg));
  if (dclass == ALL_REGS)
/* ALL_REGS is used for new pseudos created by transformations
   like reload of SUBREG_REG (see function
   simplify_operand_subreg).  We don't know their class yet.  We
   should figure out the class from processing the insn
   constraints not in this fast path function.  Even if ALL_REGS
   were a right class for the pseudo, secondary_... hooks usually
   are not define for ALL_REGS.  */
return false;
  [ ... ]
  /* Set up hard register for a reload pseudo for hook
 secondary_reload because some targets just ignore unassigned
 pseudos in the hook.  */
  if (dclass != NO_REGS && lra_get_regno_hard_regno (REGNO (dreg)) < 0)
{
  dregno = REGNO (dreg);
  reg_renumber[dregno] = ira_class_hard_regs[dclass][0];
}


The reference to ira_class_hard_regs is flagged by VRP as having an
out-of-bounds array reference.  WTF!?  Of course it's pretty obvious
once you look at the dumps...

On most targets dclass in this code will have a VR_VARYING range and as
a result no warning will be issued.  But on the ptx and vax ports VRP is
able to give us the range ~[NO_REGS,ALL_REGS]  aka ~[0,1] The
out-of-range array index is now obvious.


So I tried to look up the rules for enum values and it seems like we 
can't prove that the code in the if statement is dead. However, can't we 
at least prove that it is "dead enough" for warning purposes?



Ok for the trunk?


Hmm, seems like a deficiency in the warning code TBH, but I guess this 
patch can't hurt. Maybe open a PR for improving the warning.



Bernd


Re: [PATCH 1/2, libgcc]: Implement _divmoddi4

2016-11-02 Thread Uros Bizjak
On Wed, Nov 2, 2016 at 2:27 PM, Ian Lance Taylor  wrote:
> On Mon, Oct 31, 2016 at 7:46 AM, Uros Bizjak  wrote:
>> This function will be used in a follow-up patch to implement
>> TARGET_EXPAND_DIVMOD_LIBFUNC for x86 targets. Other targets can call
>> this function, so IMO it should be part of a generic library.
>>
>> 2016-10-31  Uros Bizjak  
>>
>> * Makefile.in (LIB2_DIVMOD_FUNCS): Add _divmoddi4.
>> * libgcc2.c (__divmoddi4): New function.
>> * libgcc2.h (__divmoddi4): Declare.
>> * libgcc-std.ver.in (GCC_7.0.0): New. Add __PFX_divmoddi4
>> and __PFX_divmodti4.
>
> You aren't defining divmodti4 anywhere, so it seems premature to add
> it to libgcc-std.ver.in.

It is created magically for x86_64, in the same way e.g. divti3 is created.

objdump of x86_64 libgcc.a:

--cut here--

...

_divmoddi4.o: file format elf64-x86-64


Disassembly of section .text:

 <__divmodti4>:
   0:41 57push   %r15
   2:41 56push   %r14
   4:49 89 f9 mov%rdi,%r9
   ...

--cut here--

Uros.


Re: RFD: Buffer handling for ASM_GENERATE_INTERNAL_LABEL

2016-11-02 Thread Bernd Schmidt

On 10/29/2016 04:17 PM, Trevor Saunders wrote:


So actually a thing I've wanted to do for a while is add a long string
building class to generate asm into that would use a set of buffers
adding new ones when the space is needed.  So that would look something
like


I'd do something a little simpler and just wrap an obstack in a class.


Bernd


Re: [PATCH] xtensa: don't xfail gcc.c-torture/compile/20001226-1.c

2016-11-02 Thread Max Filippov
On Wed, Nov 2, 2016 at 10:22 AM, augustine.sterl...@gmail.com
 wrote:
> On Tue, Nov 1, 2016 at 12:45 PM, Max Filippov  wrote:
>> With jump trampolines implemented in binutils since 2.25 and enabled by
>> default this test no longer fails on xtensa.
>>
>> 2016-11-01  Max Filippov  
>> gcc/testsuite/
>> * gcc.c-torture/compile/20001226-1.c: Don't xfail on xtensa.
>
> Approved.

Applied to trunk. Thank you!

-- Max


Re: [PATCH] xtensa: fix ICE on pr59037.c test

2016-11-02 Thread Max Filippov
On Wed, Nov 2, 2016 at 10:23 AM, augustine.sterl...@gmail.com
 wrote:
> On Tue, Nov 1, 2016 at 12:11 PM, Max Filippov  wrote:
>> xtensa gcc gets ICE on pr59037.c test because its xtensa_output_literal
>> function cannot handle integer literals of sizes other than 4 and 8,
>> whereas the test uses 16-byte int vector.
>> Split integer literal formatting into the recursive function
>> xtensa_output_integer_literal_parts capable of handling literals of any
>> power of 2 size not less than 4.
>>
>> 2016-11-01  Max Filippov  
>> gcc/
>> * config/xtensa/xtensa.c (xtensa_output_integer_literal_parts):
>> New function.
>> (xtensa_output_literal): Use xtensa_output_integer_literal_parts
>> to format MODE_INT and MODE_PARTIAL_INT literals.
>
> Approved.

Applied to trunk. Thank you!

-- Max


Re: [Patch ARM 4/4] Enable _Float16

2016-11-02 Thread Joseph Myers
On Wed, 2 Nov 2016, James Greenhalgh wrote:

> Done in this revision, though this makes no difference to what gets tested
> as these options are not added when compiling the probe functions
> (check_effective_target_float16). These would need a patch like:
> 
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index abc4ac0..36573cc 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -2485,7 +2485,7 @@ proc check_effective_target_has_q_floating_suffix { } {
>  proc check_effective_target_float16 {} {
>  return [check_no_compiler_messages_nocache float16 object {
>  _Float16 x;
> -}]
> +} [add_options_for_float16 {}]]
>  }
> 
>  proc check_effective_target_float32 {} {
> 
> To actually enable the tests.

Then I think such a change should be made (for all these functions, 
probably separately from the ARM patch) so we have proper test coverage 
for _Float16 on ARM.  (I noted when adding the dg-add-options to these 
tests that I hadn't tested this properly enabled the _Float64x / _Float128 
tests on powerpc64le and that that could do with powerpc maintainer 
testing.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 0/3] use rtx_insn * more

2016-11-02 Thread Trevor Saunders
On Wed, Nov 02, 2016 at 05:35:40PM +0100, Bernd Schmidt wrote:
> On 11/02/2016 03:55 PM, David Malcolm wrote:
> 
> > Did you mean this patch:
> >   https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01358.html
> 
> That one is ok after the test, sorry for not being more clear.

ah, no worries :) and sorry about the wrong link.

Trev

> 
> 
> Bernd
> 


Re: [PATCH] Generate reproducible output independently of the build-path

2016-11-02 Thread Ximin Luo
Ximin Luo:
> Joseph Myers:
>> On Wed, 2 Nov 2016, Ximin Luo wrote:
>>
>>> This patch series adds a new environment variable SOURCE_PREFIX_MAP. When 
>>> this
>>> is set, GCC will treat this as an implicit "-fdebug-prefix-map=$value"
>>> command-line argument. This makes the final binary output reproducible, and
>>
>> Only one argument?  Sometimes you may want multiple -fdebug-prefix-map 
>> options (for both source and build directories, for example).  Perhaps the 
>> value should be split on spaces to provide multiple such mappings?
>>
> 
> We wanted to keep this environment variable simple, and allowing this would 
> make it more costly for other projects to adopt.
> 
> Whichever separator character we choose, we would have to either (a) disallow 
> it within the oldprefix/newprefix strings like what PATH does or (b) think of 
> an escaping scheme. Neither choice is great - with (a) since we want to map 
> *arbitrary* paths the restriction is less acceptable here than in PATH, and 
> with (b) it adds complexity to the spec, for uncommon use-cases.
> 

If we picked '\n' as the separator character, we could perhaps live with the 
(a) restriction. I'll wait a while for others to comment some more.

> In the case of an out-of-tree build, it is still possible with the current 
> proposal. Instead of:
> 
> SOURCE_PREFIX_MAP="srcprefix=srcname:buildprefix=buildname"
> 
> One could arrange the tree like:
> 
> commonprefix/srcname
> commonprefix/buildname
> 
> then set SOURCE_PREFIX_MAP="commonprefix=." or 
> SOURCE_PREFIX_MAP="commonprefix/=".
> 

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git


Re: [PATCH, C++] Warn on redefinition of builtin functions (PR c++/71973)

2016-11-02 Thread Jason Merrill

On 11/02/2016 02:11 AM, Bernd Edlinger wrote:

On 11/01/16 19:15, Bernd Edlinger wrote:

On 11/01/16 18:11, Jason Merrill wrote:

On Tue, Nov 1, 2016 at 11:45 AM, Bernd Edlinger
 wrote:

On 11/01/16 16:20, Jason Merrill wrote:

On 10/17/2016 03:18 PM, Bernd Edlinger wrote:
I'm not even sure we need a new warning.  Can we combine this warning
with the block that currently follows?


After 20 years of not having a warning on that,
an implicitly enabled warning would at least break lots of bogus
test cases.


Would it, though?  Which test cases still break with the current patch?


Less than before, but there are still at least a few of them.

I can make a list and send it tomorrow.


FAIL: g++.dg/cpp1y/lambda-generic-udt.C  -std=gnu++14 (test for excess
errors)
FAIL: g++.dg/cpp1y/lambda-generic-xudt.C  -std=gnu++14 (test for excess
errors)
FAIL: g++.dg/init/new15.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/init/new15.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/init/new15.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/ipa/inline-1.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/ipa/inline-1.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/ipa/inline-1.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/ipa/inline-2.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/ipa/inline-2.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/ipa/inline-2.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/tc1/dr20.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/tc1/dr20.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/tc1/dr20.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/tree-ssa/inline-1.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/tree-ssa/inline-1.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/tree-ssa/inline-1.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/tree-ssa/inline-2.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/tree-ssa/inline-2.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/tree-ssa/inline-2.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto
-flto-partition=1to1 -fno-use-linker-plugin
FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto
-flto-partition=none -fuse-linker-plugin
FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto
-fuse-linker-plugin -fno-fat-lto-objects
FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto
-flto-partition=1to1 -fno-use-linker-plugin
FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto
-flto-partition=none -fuse-linker-plugin -fno-fat-lto-objects
FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto
-fuse-linker-plugin
FAIL: g++.dg/lto/pr68811 cp_lto_pr68811_0.o assemble, -O2
FAIL: g++.old-deja/g++.law/except1.C  -std=gnu++11 (test for excess errors)
FAIL: g++.old-deja/g++.law/except1.C  -std=gnu++14 (test for excess errors)
FAIL: g++.old-deja/g++.law/except1.C  -std=gnu++98 (test for excess errors)
FAIL: g++.old-deja/g++.mike/p700.C  -std=gnu++11 (test for excess errors)
FAIL: g++.old-deja/g++.mike/p700.C  -std=gnu++14 (test for excess errors)
FAIL: g++.old-deja/g++.mike/p700.C  -std=gnu++98 (test for excess errors)
FAIL: g++.old-deja/g++.other/builtins10.C  -std=c++11 (test for excess
errors)
FAIL: g++.old-deja/g++.other/builtins10.C  -std=c++14 (test for excess
errors)
FAIL: g++.old-deja/g++.other/realloc.C  -std=c++11 (test for excess errors)
FAIL: g++.old-deja/g++.other/realloc.C  -std=c++14 (test for excess errors)
FAIL: g++.old-deja/g++.other/realloc.C  -std=c++98 (test for excess errors)
FAIL: g++.old-deja/g++.other/vbase5.C  -std=c++11 (test for excess errors)
FAIL: g++.old-deja/g++.other/vbase5.C  -std=c++14 (test for excess errors)
FAIL: g++.old-deja/g++.other/vbase5.C  -std=c++98 (test for excess errors)


The lto test case does emit the warning when assembling, but
it still produces an executable and even executes it.

Also g++.dg/cpp1y/lambda-generic-udt.C, g++.dg/tc1/dr20.C
and g++.old-deja/g++.other/vbase5.C are execution tests.

So I was wrong to assume these were all compile-only tests.

I think that list should be fixable, if we decide to enable
the warning by default.


Yes, either by fixing the prototypes or disabling the warning.


If we remove the DECL_ANTICIPATED check, I see some failures in
builtin* tests due to missing extern "C".  That seems appropriate at
file scope, but I'm not sure it's right for namespace std.


Interesting, what have you done?


The attached.  But now that you've found that the existing warning deals 
with people doing silly things like redeclaring __builtin_* I guess 
let's leave it alone, and add the warning to the DECL_ANTICIPATED block 
the way you proposed.


Jason

commit d93d421435f8c38b2019526c7645a59e79a92cc5
Author: Jason Merrill 
Date:   Tue Nov 1 17:16:12 2016 -0400

rem-ant

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 

Re: [ping * 4] PR35503 - warn for restrict

2016-11-02 Thread Prathamesh Kulkarni
On 2 November 2016 at 23:07, Jason Merrill  wrote:
> On Wed, Nov 2, 2016 at 1:08 PM, Prathamesh Kulkarni
>  wrote:
>> On 2 November 2016 at 18:29, Jason Merrill  wrote:
>>> Then I'll approve the whole patch.
>> Thanks!
>> Trying the patch on kernel build (allmodconfig) reveals the following
>> couple of warnings:
>> http://pastebin.com/Sv2HFDUv
>>
>> I think warning for str_error_r() is correct
>
> It's accurate, but unhelpful; snprintf isn't going to use the contents
> of buf via the variadic argument, so this warning is just noise.
Ah, indeed, it's just printing address of buf, not using the contents.
>
>> however I am not sure if
>> warning for pager_preexec() is legit or a false positive:
>>
>> pager.c: In function 'pager_preexec':
>> pager.c:35:12: warning: passing argument 2 to restrict-qualified
>> parameter aliases with argument 4 [-Wrestrict]
>>   select(1, , NULL, , NULL);
>>  ^~~~~~
>> Is the warning correct for  the above call to select() syscall ?
>
> The warning looks correct based on the prototype
>
> extern int select (int __nfds, fd_set *__restrict __readfds,
>fd_set *__restrict __writefds,
>fd_set *__restrict __exceptfds,
>struct timeval *__restrict __timeout);
>
> But passing the same fd_set to both readfds and exceptfds seems
> reasonable to me, so this also seems like a false positive.
>
> Looking at C11, I see this example:
>
> EXAMPLE 3 The function parameter declarations
> void h(int n, int * restrict p, int * restrict q, int * restrict r)
> {
>   int i;
>   for (i = 0; i < n; i++)
> p[i] = q[i] + r[i];
> }
>
> illustrate how an unmodified object can be aliased through two
> restricted pointers. In particular, if a and b
> are disjoint arrays, a call of the form h(100, a, b, b) has defined
> behavior, because array b is not
> modified within function h.
>
> This is is another example of well-defined code that your warning will
> complain about.
Yes, that's a limitation of the patch, it just looks at the prototype, and
not how the arguments are used in the function.
>
>> Should we instead keep it in Wextra, or continue keeping it in Wall ?
>
> It seems that it doesn't belong in -Wall.  I don't feel strongly about 
> -Wextra.
Should I commit the patch by keeping Wrestrict "standalone",
ie, not including it in either Wall or Wextra ?

Thanks,
Prathamesh
>
> Jason


Re: [Patch ARM 4/4] Enable _Float16

2016-11-02 Thread James Greenhalgh

On Mon, Oct 24, 2016 at 10:28:42PM +, Joseph Myers wrote:
> On Mon, 24 Oct 2016, James Greenhalgh wrote:
>
> > Hi,
> >
> > Finally, having added support for single-step DFmode to HFmode conversions,
> > this patch adds support for _Float16 to the ARM back-end.
>
> Given the need for -mfp16-format=ieee (on some processors), you should be
> updating target-supports.exp:add_options_for_float16 to add that option in
> the ARM case, and make sure that the tests run in cases where the option
> is needed to enable support for this format.

Done in this revision, though this makes no difference to what gets tested
as these options are not added when compiling the probe functions
(check_effective_target_float16). These would need a patch like:

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index abc4ac0..36573cc 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2485,7 +2485,7 @@ proc check_effective_target_has_q_floating_suffix { } {
 proc check_effective_target_float16 {} {
 return [check_no_compiler_messages_nocache float16 object {
 _Float16 x;
-}]
+} [add_options_for_float16 {}]]
 }

 proc check_effective_target_float32 {} {

To actually enable the tests.

> > That means making sure that only __fp16 promotes and adding similar hooks to
> > those used in the AArch64 port giving the excess precision rules, and
> > marking HFmode as supported in libgcc.
>
> Does "supported in libgcc" mean this patch completes fixing bug 63250?

>From my reading of the bug. Yes. I'll tag the gcc ChangeLog as a fix, so it
is recorded against the PR.

Thanks,
James

---
gcc/

2016-11-02  James Greenhalgh  

PR target/63250
* config/arm/arm-builtins.c (arm_simd_floatHF_type_node): Rename to...
(arm_fp16_type_node): ...This, make visibile.
(arm_simd_builtin_std_type): Rename arm_simd_floatHF_type_node to
arm_fp16_type_node.
(arm_init_simd_builtin_types): Likewise.
(arm_init_fp16_builtins): Likewise.
* config/arm/arm.c (arm_excess_precision): New.
(arm_floatn_mode): Likewise.
(TARGET_C_EXCESS_PRECISION): Likewise.
(TARGET_FLOATN_MODE): Likewise.
(arm_promoted_type): Only promote arm_fp16_type_node.
* config/arm/arm.h (arm_fp16_type_node): Declare.

gcc/testsuite/

2016-11-02  James Greenhalgh  

* lib/target-supports.exp (add_options_for_float16): Add
-mfp16-format=ieee when testign arm*-*-*.

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index e73043d..5ed38d1 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -652,7 +652,8 @@ static struct arm_simd_type_info arm_simd_types [] = {
 };
 #undef ENTRY
 
-static tree arm_simd_floatHF_type_node = NULL_TREE;
+/* The user-visible __fp16 type.  */
+tree arm_fp16_type_node = NULL_TREE;
 static tree arm_simd_intOI_type_node = NULL_TREE;
 static tree arm_simd_intEI_type_node = NULL_TREE;
 static tree arm_simd_intCI_type_node = NULL_TREE;
@@ -739,7 +740,7 @@ arm_simd_builtin_std_type (enum machine_mode mode,
 case XImode:
   return arm_simd_intXI_type_node;
 case HFmode:
-  return arm_simd_floatHF_type_node;
+  return arm_fp16_type_node;
 case SFmode:
   return float_type_node;
 case DFmode:
@@ -840,8 +841,8 @@ arm_init_simd_builtin_types (void)
   /* Continue with standard types.  */
   /* The __builtin_simd{64,128}_float16 types are kept private unless
  we have a scalar __fp16 type.  */
-  arm_simd_types[Float16x4_t].eltype = arm_simd_floatHF_type_node;
-  arm_simd_types[Float16x8_t].eltype = arm_simd_floatHF_type_node;
+  arm_simd_types[Float16x4_t].eltype = arm_fp16_type_node;
+  arm_simd_types[Float16x8_t].eltype = arm_fp16_type_node;
   arm_simd_types[Float32x2_t].eltype = float_type_node;
   arm_simd_types[Float32x4_t].eltype = float_type_node;
 
@@ -1754,11 +1755,11 @@ arm_init_iwmmxt_builtins (void)
 static void
 arm_init_fp16_builtins (void)
 {
-  arm_simd_floatHF_type_node = make_node (REAL_TYPE);
-  TYPE_PRECISION (arm_simd_floatHF_type_node) = GET_MODE_PRECISION (HFmode);
-  layout_type (arm_simd_floatHF_type_node);
+  arm_fp16_type_node = make_node (REAL_TYPE);
+  TYPE_PRECISION (arm_fp16_type_node) = GET_MODE_PRECISION (HFmode);
+  layout_type (arm_fp16_type_node);
   if (arm_fp16_format)
-(*lang_hooks.types.register_builtin_type) (arm_simd_floatHF_type_node,
+(*lang_hooks.types.register_builtin_type) (arm_fp16_type_node,
 	   "__fp16");
 }
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 524c474..5695548 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -272,6 +272,7 @@ static bool arm_builtin_support_vector_misalignment (machine_mode mode,
 		 int misalignment,
 		 bool is_packed);
 static void arm_conditional_register_usage (void);

Re: [Patch 3/4] Half to double precision conversions

2016-11-02 Thread James Greenhalgh

On Mon, Oct 24, 2016 at 10:24:27PM +, Joseph Myers wrote:
> On Mon, 24 Oct 2016, James Greenhalgh wrote:
>
> > Conversions from double precision floats to the ARM __fp16 are required
> > to round only once.
>
> I'd expect that when fixing this you need to update
> gcc.target/arm/fp16-rounding-ieee-1.c (and fp16-rounding-alt-1.c, I
> suppose) to expect rounding once.  But I don't see such a testsuite change
> anywhere in this patch series.

Right, I went back and checked my ARM test environment. Sure enough it was
producing nonsense results. I've fixed my build system issue, which causes
these tests to start failing as you predicted. I've now fixed them in this
patch revision to use the value they would return if rounding happened
directly rather than first through SFmode.

Thanks,
James

---
gcc/

2016-11-02  James Greenhalgh  

* config/arm/arm.c (arm_convert_to_type): Delete.
(TARGET_CONVERT_TO_TYPE): Delete.
(arm_init_libfuncs): Enable trunc_optab from DFmode to HFmode.
(arm_libcall_uses_aapcs_base): Add trunc_optab from DF- to HFmode.
* config/arm/arm.h (TARGET_FP16_TO_DOUBLE): New.
* config/arm/arm.md (truncdfhf2): Only convert through SFmode if we
are in fast math mode, and have no single step hardware instruction.
(extendhfdf2): Only expand through SFmode if we don't have a
single-step hardware instruction.
* config/arm/vfp.md (*truncdfhf2): New.
(extendhfdf2): Likewise.

gcc/testsuite/

2016-11-02  James Greenhalgh  

* gcc.target/arm/fp16-rounding-alt-1.c (ROUNDED): Change expected
result.
* gcc.target/arm/fp16-rounding-ieee-1.c (ROUNDED): Change expected
result.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8ce4e4e..524c474 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -251,7 +251,6 @@ static bool arm_output_addr_const_extra (FILE *, rtx);
 static bool arm_allocate_stack_slots_for_args (void);
 static bool arm_warn_func_return (tree);
 static tree arm_promoted_type (const_tree t);
-static tree arm_convert_to_type (tree type, tree expr);
 static bool arm_scalar_mode_supported_p (machine_mode);
 static bool arm_frame_pointer_required (void);
 static bool arm_can_eliminate (const int, const int);
@@ -660,9 +659,6 @@ static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_PROMOTED_TYPE
 #define TARGET_PROMOTED_TYPE arm_promoted_type
 
-#undef TARGET_CONVERT_TO_TYPE
-#define TARGET_CONVERT_TO_TYPE arm_convert_to_type
-
 #undef TARGET_SCALAR_MODE_SUPPORTED_P
 #define TARGET_SCALAR_MODE_SUPPORTED_P arm_scalar_mode_supported_p
 
@@ -2565,6 +2561,11 @@ arm_init_libfuncs (void)
 			 ? "__gnu_h2f_ieee"
 			 : "__gnu_h2f_alternative"));
 
+  set_conv_libfunc (trunc_optab, HFmode, DFmode,
+			(arm_fp16_format == ARM_FP16_FORMAT_IEEE
+			 ? "__gnu_d2h_ieee"
+			 : "__gnu_d2h_alternative"));
+
   /* Arithmetic.  */
   set_optab_libfunc (add_optab, HFmode, NULL);
   set_optab_libfunc (sdiv_optab, HFmode, NULL);
@@ -5314,6 +5315,8 @@ arm_libcall_uses_aapcs_base (const_rtx libcall)
 			SFmode));
   add_libcall (libcall_htab, convert_optab_libfunc (trunc_optab, SFmode,
 			DFmode));
+  add_libcall (libcall_htab,
+		   convert_optab_libfunc (trunc_optab, HFmode, DFmode));
 }
 
   return libcall && libcall_htab->find (libcall) != NULL;
@@ -23742,23 +23745,6 @@ arm_promoted_type (const_tree t)
   return NULL_TREE;
 }
 
-/* Implement TARGET_CONVERT_TO_TYPE.
-   Specifically, this hook implements the peculiarity of the ARM
-   half-precision floating-point C semantics that requires conversions between
-   __fp16 to or from double to do an intermediate conversion to float.  */
-
-static tree
-arm_convert_to_type (tree type, tree expr)
-{
-  tree fromtype = TREE_TYPE (expr);
-  if (!SCALAR_FLOAT_TYPE_P (fromtype) || !SCALAR_FLOAT_TYPE_P (type))
-return NULL_TREE;
-  if ((TYPE_PRECISION (fromtype) == 16 && TYPE_PRECISION (type) > 32)
-  || (TYPE_PRECISION (type) == 16 && TYPE_PRECISION (fromtype) > 32))
-return convert (type, convert (float_type_node, expr));
-  return NULL_TREE;
-}
-
 /* Implement TARGET_SCALAR_MODE_SUPPORTED_P.
This simply adds HFmode as a supported mode; even though we don't
implement arithmetic on this type directly, it's supported by
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index fa8e84c..0f9a679 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -194,6 +194,11 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
 #define TARGET_FP16			\
   (ARM_FPU_FSET_HAS (TARGET_FPU_FEATURES, FPU_FL_FP16))
 
+/* FPU supports converting between HFmode and DFmode in a single hardware
+   step.  */
+#define TARGET_FP16_TO_DOUBLE		\
+  (TARGET_HARD_FLOAT && (TARGET_FP16 && TARGET_VFP5))
+
 /* FPU supports fused-multiply-add operations.  */
 #define TARGET_FMA (TARGET_FPU_REV >= 4)
 
diff 

Re: RFC: Warnings silenced when macros from system headers are used (PR c/78000, c/71613)

2016-11-02 Thread Jason Merrill
On Wed, Nov 2, 2016 at 12:37 PM, Joseph Myers  wrote:
> On Wed, 2 Nov 2016, Jason Merrill wrote:
>
>> It seems to me that the general principle is that we should consider
>> the location where the thing we're warning about is happening.  In
>>
>>float_var = LLONG_MIN;
>>
>> The relevant location is that of the assignment, not the constant on
>> the RHS.  In your ?: example, a simple answer would be to warn based
>
> I'm not sure we track locations well enough to handle that yet.
>
> Say you have an implicit conversion of a function argument to the type
> from the prototype and something about that conversion should be warned
> about.  Then you should obviously warn if the argument is a macro from a
> system header but the call is outside a system header.  But say the
> function call itself comes from a macro defined in a system header - you
> should still warn if the user passed an argument of the wrong type, even
> if that argument is a macro from a system header.
>
> That is:
>
> /* System header.  */
> int __foo (int);
> /* This sort of macro to avoid accidental interposition issues has been
>discussed lately on libc-alpha, so it's a realistic example.  */
> #define foo(x) (0 + __foo (x))
> /* User code.  */
> v = foo (NULL);
>
> should warn because the call to __foo logically results from user code
> even though both NULL and foo are macros defined in system headers.  I'm
> not sure what the locations look like here.

Sure, the problem here comes from the user combining the two macros.
I suppose in this case you could notice that the macro expansions of
'foo' and 'NULL' are not nested.

Jason


Re: [Patch 6/11] Migrate excess precision logic to use TARGET_EXCESS_PRECISION

2016-11-02 Thread James Greenhalgh

On Fri, Oct 28, 2016 at 09:09:55PM +, Joseph Myers wrote:
> On Fri, 14 Oct 2016, James Greenhalgh wrote:
>
> > +/* If the join of the implicit precision in which the target will compute
> > +   floating-point values and the standard precision in which the target 
> > will
> > +   compute values is not equal to the standard precision, then the target
> > +   is either unpredictable, or is a broken configuration in which it claims
> > +   standards compliance, but doesn't honor that.
> > +
> > +   Effective predictability for __GCC_IEC_559 in flag_iso_mode, means that
> > +   the implicit precision is not wider, or less predictable than the
> > +   standard precision.
> > +
> > +   Return TRUE if we have been asked to compile with
> > +   -fexcess-precision=standard, and following the rules above we are able
> > +   to guarantee the standards mode.  */
> > +
>
> I'm not convinced by the logic you have here.  At least, it seems
> different from what we have at present, where -std=c11
> -fexcess-precision=fast is not considered unpredictable if the target
> doesn't have any implicit excess precision.
>
> That is: I think the right question is whether the combination (front-end
> excess precision, implicit back-end excess precision) does the same thing
> as just front-end excess precision, regardless of the -fexcess-precision=
> option.

OK, I've reworked the patch along those lines. I noticed that the original
logic looked for

  && TARGET_FLT_EVAL_METHOD != 0

And I no longer make that check. Is that something I need to reinstate?
I didn't find any reference to excess precision in Annex F, so I'd guess
not?

Thanks,
James

---
gcc/

2016-11-02  James Greenhalgh  

* toplev.c (init_excess_precision): Delete most logic.
* tree.c (excess_precision_type): Rewrite to use
TARGET_EXCESS_PRECISION.
* doc/invoke.texi (-fexcess-precision): Document behaviour in a
more generic fashion.
* ginclude/float.h: Wrap definition of FLT_EVAL_METHOD in
__STDC_WANT_IEC_60559_TYPES_EXT__.

gcc/c-family/

2016-11-02  James Greenhalgh  

* c-common.c (excess_precision_mode_join): New.
(c_ts18661_flt_eval_method): New.
(c_c11_flt_eval_method): Likewise.
(c_flt_eval_method): Likewise.
* c-common.h (excess_precision_mode_join): New.
(c_flt_eval_method): Likewise.
* c-cppbuiltin.c (c_cpp_flt_eval_method_iec_559): New.
(cpp_iec_559_value): Call it.
(c_cpp_builtins): Modify logic for __LIBGCC_*_EXCESS_PRECISION__,
call c_flt_eval_method to set __FLT_EVAL_METHOD__ and
__FLT_EVAL_METHOD_TS_18661_3__.

gcc/testsuite/

2016-11-02  James Greenhalgh  

* gcc.dg/fpermitted-flt-eval-methods_3.c: New.
* gcc.dg/fpermitted-flt-eval-methods_4.c: Likewise.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 307862b..9f0b4a6 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -7944,4 +7944,86 @@ cb_get_suggestion (cpp_reader *, const char *goal,
   return bm.get_best_meaningful_candidate ();
 }
 
+/* Return the latice point which is the wider of the two FLT_EVAL_METHOD
+   modes X, Y.  This isn't just  >, as the FLT_EVAL_METHOD values added
+   by C TS 18661-3 for interchange  types that are computed in their
+   native precision are larger than the C11 values for evaluating in the
+   precision of float/double/long double.  If either mode is
+   FLT_EVAL_METHOD_UNPREDICTABLE, return that.  */
+
+enum flt_eval_method
+excess_precision_mode_join (enum flt_eval_method x,
+			enum flt_eval_method y)
+{
+  if (x == FLT_EVAL_METHOD_UNPREDICTABLE
+  || y == FLT_EVAL_METHOD_UNPREDICTABLE)
+return FLT_EVAL_METHOD_UNPREDICTABLE;
+
+  /* GCC only supports one interchange type right now, _Float16.  If
+ we're evaluating _Float16 in 16-bit precision, then flt_eval_method
+ will be FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16.  */
+  if (x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
+return y;
+  if (y == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
+return x;
+
+  /* Other values for flt_eval_method are directly comparable, and we want
+ the maximum.  */
+  return MAX (x, y);
+}
+
+/* Return the value that should be set for FLT_EVAL_METHOD in the
+   context of ISO/IEC TS 18861-3.
+
+   This relates to the effective excess precision seen by the user,
+   which is the join point of the precision the target requests for
+   -fexcess-precision={standard,fast} and the implicit excess precision
+   the target uses.  */
+
+static enum flt_eval_method
+c_ts18661_flt_eval_method (void)
+{
+  enum flt_eval_method implicit
+= targetm.c.excess_precision (EXCESS_PRECISION_TYPE_IMPLICIT);
+
+  enum excess_precision_type flag_type
+= (flag_excess_precision_cmdline == EXCESS_PRECISION_STANDARD
+   ? EXCESS_PRECISION_TYPE_STANDARD
+   : EXCESS_PRECISION_TYPE_FAST);
+
+  enum 

Re: [ping * 4] PR35503 - warn for restrict

2016-11-02 Thread Jason Merrill
On Wed, Nov 2, 2016 at 1:08 PM, Prathamesh Kulkarni
 wrote:
> On 2 November 2016 at 18:29, Jason Merrill  wrote:
>> Then I'll approve the whole patch.
> Thanks!
> Trying the patch on kernel build (allmodconfig) reveals the following
> couple of warnings:
> http://pastebin.com/Sv2HFDUv
>
> I think warning for str_error_r() is correct

It's accurate, but unhelpful; snprintf isn't going to use the contents
of buf via the variadic argument, so this warning is just noise.

> however I am not sure if
> warning for pager_preexec() is legit or a false positive:
>
> pager.c: In function 'pager_preexec':
> pager.c:35:12: warning: passing argument 2 to restrict-qualified
> parameter aliases with argument 4 [-Wrestrict]
>   select(1, , NULL, , NULL);
>  ^~~~~~
> Is the warning correct for  the above call to select() syscall ?

The warning looks correct based on the prototype

extern int select (int __nfds, fd_set *__restrict __readfds,
   fd_set *__restrict __writefds,
   fd_set *__restrict __exceptfds,
   struct timeval *__restrict __timeout);

But passing the same fd_set to both readfds and exceptfds seems
reasonable to me, so this also seems like a false positive.

Looking at C11, I see this example:

EXAMPLE 3 The function parameter declarations
void h(int n, int * restrict p, int * restrict q, int * restrict r)
{
  int i;
  for (i = 0; i < n; i++)
p[i] = q[i] + r[i];
}

illustrate how an unmodified object can be aliased through two
restricted pointers. In particular, if a and b
are disjoint arrays, a call of the form h(100, a, b, b) has defined
behavior, because array b is not
modified within function h.

This is is another example of well-defined code that your warning will
complain about.

> Should we instead keep it in Wextra, or continue keeping it in Wall ?

It seems that it doesn't belong in -Wall.  I don't feel strongly about -Wextra.

Jason


Re: [Patch 1/11] Add a new target hook for describing excess precision intentions

2016-11-02 Thread James Greenhalgh

On Fri, Oct 28, 2016 at 09:12:11PM +, Joseph Myers wrote:
> On Fri, 14 Oct 2016, James Greenhalgh wrote:
>
> > + value set for @code{-fexcess-precision=[standard|fast]}.",
>
> I think the correct markup for the option here is:
>
> @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}
>
> (that is, using @option not @code, and with the [ | ] not in a fixed-width
> font because they aren't part of the option name).
>

Thanks, I've updated that line in this revision of the patch following
your suggestion.

James

---
gcc/

2016-11-02  James Greenhalgh  

* target.def (excess_precision): New hook.
* target.h (flt_eval_method): New.
(excess_precision_type): Likewise.
* targhooks.c (default_excess_precision): New.
* targhooks.h (default_excess_precision): New.
* doc/tm.texi.in (TARGET_C_EXCESS_PRECISION): New.
* doc/tm.texi: Regenerate.
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index 869f858..09f8213 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -331,6 +331,24 @@ enum symbol_visibility
   VISIBILITY_INTERNAL
 };
 
+/* enums used by the targetm.excess_precision hook.  */
+
+enum flt_eval_method
+{
+  FLT_EVAL_METHOD_UNPREDICTABLE = -1,
+  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0,
+  FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1,
+  FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2,
+  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 = 16
+};
+
+enum excess_precision_type
+{
+  EXCESS_PRECISION_TYPE_IMPLICIT,
+  EXCESS_PRECISION_TYPE_STANDARD,
+  EXCESS_PRECISION_TYPE_FAST
+};
+
 /* Support for user-provided GGC and PCH markers.  The first parameter
is a pointer to a pointer, the second a cookie.  */
 typedef void (*gt_pointer_operator) (void *, void *);
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 0aed3d4..d1d1e76 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -947,6 +947,10 @@ sign-extend the result to 64 bits.  On such machines, set
 Do not define this macro if it would never modify @var{m}.
 @end defmac
 
+@deftypefn {Target Hook} {enum flt_eval_method} TARGET_C_EXCESS_PRECISION (enum excess_precision_type @var{type})
+Return a value, with the same meaning as @code{FLT_EVAL_METHOD} C that describes which excess precision should be applied.  @var{type} is either @code{EXCESS_PRECISION_TYPE_IMPLICIT}, @code{EXCESS_PRECISION_TYPE_FAST}, or @code{EXCESS_PRECISION_TYPE_STANDARD}.  For @code{EXCESS_PRECISION_TYPE_IMPLICIT}, the target should return which precision and range operations will be implictly evaluated in regardless of the excess precision explicitly added.  For @code{EXCESS_PRECISION_TYPE_STANDARD} and @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the explicit excess precision that should be added depending on the value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}.
+@end deftypefn
+
 @deftypefn {Target Hook} machine_mode TARGET_PROMOTE_FUNCTION_MODE (const_tree @var{type}, machine_mode @var{mode}, int *@var{punsignedp}, const_tree @var{funtype}, int @var{for_return})
 Like @code{PROMOTE_MODE}, but it is applied to outgoing function arguments or
 function return values.  The target hook should return the new mode
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 9e5b456..12f6bc0 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -921,6 +921,8 @@ sign-extend the result to 64 bits.  On such machines, set
 Do not define this macro if it would never modify @var{m}.
 @end defmac
 
+@hook TARGET_C_EXCESS_PRECISION
+
 @hook TARGET_PROMOTE_FUNCTION_MODE
 
 @defmac PARM_BOUNDARY
diff --git a/gcc/target.def b/gcc/target.def
index c3c87b0..9fc31b3 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5430,6 +5430,23 @@ DEFHOOK_UNDOC
  machine_mode, (char c),
  default_mode_for_suffix)
 
+DEFHOOK
+(excess_precision,
+ "Return a value, with the same meaning as @code{FLT_EVAL_METHOD} C that\
+ describes which excess precision should be applied.  @var{type} is\
+ either @code{EXCESS_PRECISION_TYPE_IMPLICIT},\
+ @code{EXCESS_PRECISION_TYPE_FAST}, or\
+ @code{EXCESS_PRECISION_TYPE_STANDARD}.  For\
+ @code{EXCESS_PRECISION_TYPE_IMPLICIT}, the target should return which\
+ precision and range operations will be implictly evaluated in regardless\
+ of the excess precision explicitly added.  For\
+ @code{EXCESS_PRECISION_TYPE_STANDARD} and\
+ @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the\
+ explicit excess precision that should be added depending on the\
+ value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}.",
+ enum flt_eval_method, (enum excess_precision_type type),
+ default_excess_precision)
+
 HOOK_VECTOR_END (c)
 
 /* Functions specific to the C++ frontend.  */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 866747a..73e1c25 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -2135,4 +2135,12 @@ default_min_arithmetic_precision (void)
   return WORD_REGISTER_OPERATIONS ? BITS_PER_WORD : BITS_PER_UNIT;
 }
 
+/* Default implementation of 

Re: [PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those

2016-11-02 Thread Jason Merrill
On Wed, Nov 2, 2016 at 12:33 PM, Jakub Jelinek  wrote:
> On Wed, Nov 02, 2016 at 04:44:05PM +0100, Jakub Jelinek wrote:
>> which means if gen_type_die or gen_type_die_with_usage doesn't
>> use the langhook, then we'd emit a completely useless { __pfn; __delta }
>> struct into debug info first, and then in modified_type_die used
>> the langhook, get OFFSET_TYPE and probably create the
>> DW_TAG_ptr_to_member_type.  So I think we really need that.
>>
>> > > 2) it is used for something Ada-ish I really don't know how to test etc.
>> > >to be able to find out if it is safe to call it in
>> > >gen_type_die_with_usage too
>> >
>> > You could find an Ada test that uses the code and verify that the
>> > output stays the same?
>>
>> I can try to find the patch that introduced it and if it contained any
>> testcases.
>
> I couldn't find any unfortunately.  Pierre, could you please test if the
> following patch doesn't regress anything in the Ada debug info area?
>
> Here is updated patch I'm going to bootstrap/regtest; it generates the
> same debuginfo on ref-3.C testcase.

Looks good.

Jason


Re: [PATCH] xtensa: fix ICE on pr59037.c test

2016-11-02 Thread augustine.sterl...@gmail.com
On Tue, Nov 1, 2016 at 12:11 PM, Max Filippov  wrote:
> xtensa gcc gets ICE on pr59037.c test because its xtensa_output_literal
> function cannot handle integer literals of sizes other than 4 and 8,
> whereas the test uses 16-byte int vector.
> Split integer literal formatting into the recursive function
> xtensa_output_integer_literal_parts capable of handling literals of any
> power of 2 size not less than 4.
>
> 2016-11-01  Max Filippov  
> gcc/
> * config/xtensa/xtensa.c (xtensa_output_integer_literal_parts):
> New function.
> (xtensa_output_literal): Use xtensa_output_integer_literal_parts
> to format MODE_INT and MODE_PARTIAL_INT literals.

Approved.


Re: [PATCH] xtensa: don't xfail gcc.c-torture/compile/20001226-1.c

2016-11-02 Thread augustine.sterl...@gmail.com
On Tue, Nov 1, 2016 at 12:45 PM, Max Filippov  wrote:
> With jump trampolines implemented in binutils since 2.25 and enabled by
> default this test no longer fails on xtensa.
>
> 2016-11-01  Max Filippov  
> gcc/testsuite/
> * gcc.c-torture/compile/20001226-1.c: Don't xfail on xtensa.

Approved.


PR61409: -Wmaybe-uninitialized false-positive with -O2

2016-11-02 Thread Aldy Hernandez

Hi Jeff.

As discussed in the PR, here is a patch exploring your idea of ignoring 
unguarded uses if we can prove that the guards for such uses are 
invalidated by the uninitialized operand paths being executed.


This is an updated patch from my suggestion in the PR.  It bootstraps 
with no regression on x86-64 Linux, and fixes the PR in question.


As the "NOTE:" in the code states, we could be much smarter when 
invalidating predicates, but for now let's do straight negation which 
works for the simple case.  We could expand on this in the future.


OK for trunk?

commit 8375d7e28c1a798dd0cc0f487d7fa1068d9eb124
Author: Aldy Hernandez 
Date:   Thu Aug 25 10:44:29 2016 -0400

PR middle-end/61409
* tree-ssa-uninit.c (use_pred_not_overlap_with_undef_path_pred):
Remove reference to missing NUM_PREDS in function comment.
(can_one_predicate_be_invalidated_p): New.
(can_chain_union_be_invalidated_p): New.
(flatten_out_predicate_chains): New.
(uninit_ops_invalidate_phi_use): New.
(is_use_properly_guarded): Call uninit_ops_invalidate_phi_use.

diff --git a/gcc/testsuite/gcc.dg/uninit-pr61409.c 
b/gcc/testsuite/gcc.dg/uninit-pr61409.c
new file mode 100644
index 000..c27a67b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/uninit-pr61409.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wuninitialized" } */
+
+int *rw;
+int line_height;
+int pixel_width;
+int text_cols;
+int width1, width2, width3;
+
+void *pointer;
+
+void f (int i, int j)
+{
+  void *ptr;
+  if (i)
+{
+  if (j)
+   return;
+  ptr = pointer;
+}
+  pixel_width = 1234 + width1 + 2 * width2 + 2 * width3;
+  *rw = text_cols + line_height;
+  if (i)
+rw=ptr; /* { dg-bogus "uninitialized" "bogus warning" } */
+}
diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
index 1344854..37c99a7 100644
--- a/gcc/tree-ssa-uninit.c
+++ b/gcc/tree-ssa-uninit.c
@@ -1184,11 +1184,10 @@ prune_uninit_phi_opnds (gphi *phi, unsigned 
uninit_opnds, gphi *flag_def,
  transformation which eliminates the merge point thus makes
  path sensitive analysis unnecessary.)
 
- NUM_PREDS is the number is the number predicate chains, PREDS is
- the array of chains, PHI is the phi node whose incoming (undefined)
- paths need to be pruned, and UNINIT_OPNDS is the bitmap holding
- uninit operand positions.  VISITED_PHIS is the pointer set of phi
- stmts being checked.  */
+ PHI is the phi node whose incoming (undefined) paths need to be
+ pruned, and UNINIT_OPNDS is the bitmap holding uninit operand
+ positions.  VISITED_PHIS is the pointer set of phi stmts being
+ checked.  */
 
 static bool
 use_pred_not_overlap_with_undef_path_pred (pred_chain_union preds,
@@ -2140,6 +2139,158 @@ normalize_preds (pred_chain_union preds, gimple 
*use_or_def, bool is_use)
   return norm_preds;
 }
 
+/* Return TRUE if PREDICATE can be invalidated by any individual
+   predicate in WORKLIST.  */
+
+static bool
+can_one_predicate_be_invalidated_p (pred_info predicate,
+   vec worklist)
+{
+  for (size_t i = 0; i < worklist.length (); ++i)
+{
+  pred_info *p = worklist[i];
+
+  /* NOTE: This is a very simple check, and only understands an
+exact opposite.  So, [i == 0] is currently only invalidated
+by [.NOT. i == 0] or [i != 0].  Ideally we should also
+invalidate with say [i > 5] or [i == 8].  There is certainly
+room for improvement here.  */
+  if (pred_neg_p (predicate, *p))
+   return true;
+}
+  return false;
+}
+
+/* Return TRUE if all USE_PREDS can be invalidated by some predicate
+   in WORKLIST.  */
+
+static bool
+can_chain_union_be_invalidated_p (pred_chain_union use_preds,
+ vec worklist)
+{
+  /* Remember:
+   PRED_CHAIN_UNION = PRED_CHAIN1 || PRED_CHAIN2 || PRED_CHAIN3
+   PRED_CHAIN = PRED_INFO1 && PRED_INFO2 && PRED_INFO3, etc.
+
+   We need to invalidate the entire PRED_CHAIN_UNION, which means,
+   invalidating every PRED_CHAIN in this union.  But to invalidate
+   an individual PRED_CHAIN, all we need to invalidate is _any_ one
+   PRED_INFO, by boolean algebra !PRED_INFO1 || !PRED_INFO2...  */
+  for (size_t i = 0; i < use_preds.length (); ++i)
+{
+  pred_chain c = use_preds[i];
+  bool entire_pred_chain_invalidated = false;
+  for (size_t j = 0; j < c.length (); ++j)
+   if (can_one_predicate_be_invalidated_p (c[i], worklist))
+ {
+   entire_pred_chain_invalidated = true;
+   break;
+ }
+  if (!entire_pred_chain_invalidated)
+   return false;
+}
+  return true;
+}
+
+/* Flatten out all the factors in all the pred_chain_union's in PREDS
+   into a WORKLIST of individual PRED_INFO's.
+
+   N is the number of pred_chain_union's in PREDS.
+
+   Since we are interested in the inverse of the PRED_CHAIN's, by
+   

Re: [PATCH][AArch64] Improve SHA1 scheduling

2016-11-02 Thread Andrew Pinski
On Tue, Oct 25, 2016 at 10:08 AM, Wilco Dijkstra  wrote:
> SHA1H instructions may be scheduled after a SHA1C instruction
> that uses the same input register.  However SHA1C updates its input,
> so if SHA1H is scheduled after it, it requires an extra move.
> Increase the priority of SHA1H to ensure it gets scheduled
> earlier, avoiding the move.
>
> Is this something the generic scheduler could do automatically for
> instructions with RMW operands?

I was thinking that but there are many of those on x86 so it might not
make sense at all.
Also the SIMD instruction FMLA has the similar problem most likely
though I don't know if it make sense to do a similar thing for that
one.
Maybe it makes sense to mark the instructions in the .md file with
some attribute and then just check for that attribute here instead of
special casing SHA1C.
Something like (not a full patch and don't claim this is correct):
(define_attr "rmw" "yes,no" (const_string "no"))

...
  "sha1\\t%q0, %s2, %3.4s"
  [(set_attr "type" "crypto_sha1_slow")]
  [(set_attr "rmw" "yes")]


...

static int
aarch64_sched_adjust_priority (rtx_insn *insn, int priority)
{

  if (get_attr_rmw (insn) == yes)
return MAX(priority - 10, 0);

  return priority;
}




Thanks,
Andrew

>
> Passes bootstrap & regress. OK for commit?
>
> ChangeLog:
> 2016-10-25  Wilco Dijkstra  
>
> * config/aarch64/aarch64.c (aarch64_sched_adjust_priority)
> New function.
> (TARGET_SCHED_ADJUST_PRIORITY): Define target hook.
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 9b2f9cb19343828dc39e9950ebbefe941521942a..2b25bd1bdd6f4e7737f8e04c3b3684cdff6c4b80
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -13668,6 +13668,26 @@ aarch64_sched_fusion_priority (rtx_insn *insn, int 
> max_pri,
>return;
>  }
>
> +/* Implement the TARGET_SCHED_ADJUST_PRIORITY hook.
> +   Adjust priority of sha1h instructions so they are scheduled before
> +   other SHA1 instructions.  */
> +
> +static int
> +aarch64_sched_adjust_priority (rtx_insn *insn, int priority)
> +{
> +  rtx x = PATTERN (insn);
> +
> +  if (GET_CODE (x) == SET)
> +{
> +  x = SET_SRC (x);
> +
> +  if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_SHA1H)
> +   return priority + 10;
> +}
> +
> +  return priority;
> +}
> +
>  /* Given OPERANDS of consecutive load/store, check if we can merge
> them into ldp/stp.  LOAD is true if they are load instructions.
> MODE is the mode of memory operands.  */
> @@ -14431,6 +14451,9 @@ aarch64_optab_supported_p (int op, machine_mode 
> mode1, machine_mode,
>  #undef TARGET_CAN_USE_DOLOOP_P
>  #define TARGET_CAN_USE_DOLOOP_P can_use_doloop_if_innermost
>
> +#undef TARGET_SCHED_ADJUST_PRIORITY
> +#define TARGET_SCHED_ADJUST_PRIORITY aarch64_sched_adjust_priority
> +
>  #undef TARGET_SCHED_MACRO_FUSION_P
>  #define TARGET_SCHED_MACRO_FUSION_P aarch64_macro_fusion_p
>
>


Re: [ping * 4] PR35503 - warn for restrict

2016-11-02 Thread Prathamesh Kulkarni
On 2 November 2016 at 18:29, Jason Merrill  wrote:
> Then I'll approve the whole patch.
Thanks!
Trying the patch on kernel build (allmodconfig) reveals the following
couple of warnings:
http://pastebin.com/Sv2HFDUv

I think warning for str_error_r() is correct, however I am not sure if
warning for
pager_preexec() is legit or a false positive:

pager.c: In function 'pager_preexec':
pager.c:35:12: warning: passing argument 2 to restrict-qualified
parameter aliases with argument 4 [-Wrestrict]
  select(1, , NULL, , NULL);
 ^~~~~~
Is the warning correct for  the above call to select() syscall ?

I am a bit anxious about keeping the warning in Wall because it's
breaking the kernel.
Should we instead keep it in Wextra, or continue keeping it in Wall ?

Also is there a way to gracefully disable Werror for kernel builds ?
I tried:
make allmodconfig
make all KCFLAGS="-Wno-error=restrict" CFLAGS="-Wno-error=restrict" -j8
but that didn't work. I managed to workaround by manually modifying
Makefiles to not pass Werror,
but I hope there's a better way.

Thanks,
Prathamesh
>
> On Wed, Nov 2, 2016 at 8:42 AM, Joseph Myers  wrote:
>> The format-checking parts of the patch are OK.
>>
>> --
>> Joseph S. Myers
>> jos...@codesourcery.com


[PATCH, GCC/ARM] Fix PR77933: stack corruption on ARM when using high registers and lr

2016-11-02 Thread Thomas Preudhomme

Hi,

When saving registers, function thumb1_expand_prologue () aims at minimizing the 
number of push instructions. One of the optimization it does is to push lr 
alongside high register(s) (after having moved them to low register(s)) when 
there is no low register to save. The way this is implemented is to add lr to 
the list of registers that can be pushed just before the push happens. This 
would then push lr and allows it to be used for further push if there was not 
enough registers to push all high registers to be pushed.


However, the logic that decides what register to move high registers to before 
being pushed only looks at low registers (see for loop initialization). This 
means not only that lr is not used for pushing high registers but also that lr 
is not removed from the list of registers to be pushed when it's not used. This 
extra lr push is not poped in epilogue leading in stack corruption.


This patch changes the loop to iterate over register r0 to lr so as to both fix 
the stack corruption and reuse lr to push some high register when possible.


ChangeLog entry are as follow:

*** gcc/ChangeLog ***

2016-11-01  Thomas Preud'homme  

PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Also check for lr being a
pushable register.


*** gcc/testsuite/ChangeLog ***

2016-11-01  Thomas Preud'homme  

PR target/77933
* gcc.target/arm/pr77933.c: New test.


Testing: no regression on arm-none-eabi GCC cross-compiler targeting Cortex-M0

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index dd8d5e5db8ca50daab648e58df290969aa794862..22a20caf42389748fc64ee0a3f880c6bea4c8f49 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -24990,7 +24990,7 @@ thumb1_expand_prologue (void)
 	{
 	  unsigned long real_regs_mask = 0;
 
-	  for (regno = LAST_LO_REGNUM; regno >= 0; regno --)
+	  for (regno = LR_REGNUM; regno >= 0; regno --)
 	{
 	  if (pushable_regs & (1 << regno))
 		{
diff --git a/gcc/testsuite/gcc.target/arm/pr77933.c b/gcc/testsuite/gcc.target/arm/pr77933.c
new file mode 100644
index ..cba7e9bb9aa57c6755f79b5ec2ea1a8744e23599
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr77933.c
@@ -0,0 +1,9 @@
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+int
+main (void)
+{
+  __asm__ volatile ("" : : : "r8", "r9", "lr");
+  return 0;
+}


Re: [PATCH] Generate reproducible output independently of the build-path

2016-11-02 Thread Ximin Luo
Joseph Myers:
> On Wed, 2 Nov 2016, Ximin Luo wrote:
> 
>> This patch series adds a new environment variable SOURCE_PREFIX_MAP. When 
>> this
>> is set, GCC will treat this as an implicit "-fdebug-prefix-map=$value"
>> command-line argument. This makes the final binary output reproducible, and
> 
> Only one argument?  Sometimes you may want multiple -fdebug-prefix-map 
> options (for both source and build directories, for example).  Perhaps the 
> value should be split on spaces to provide multiple such mappings?
> 

We wanted to keep this environment variable simple, and allowing this would 
make it more costly for other projects to adopt.

Whichever separator character we choose, we would have to either (a) disallow 
it within the oldprefix/newprefix strings like what PATH does or (b) think of 
an escaping scheme. Neither choice is great - with (a) since we want to map 
*arbitrary* paths the restriction is less acceptable here than in PATH, and 
with (b) it adds complexity to the spec, for uncommon use-cases.

In the case of an out-of-tree build, it is still possible with the current 
proposal. Instead of:

SOURCE_PREFIX_MAP="srcprefix=srcname:buildprefix=buildname"

One could arrange the tree like:

commonprefix/srcname
commonprefix/buildname

then set SOURCE_PREFIX_MAP="commonprefix=." or 
SOURCE_PREFIX_MAP="commonprefix/=".

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git


Re: [PATCH] enhance buffer overflow warnings (and c/53562)

2016-11-02 Thread Martin Sebor

Sure, they might and in that case the warning would be a false
positive.  It wouldn't be the first such warning that wasn't 100%
free of them.  But my testing with Binutils, GCC, and the Linux
kernel has exposed only 10 instances of new warnings and I don't
think I saw this idiom among them.  But even if some were, or if
all of them were false positives I think that would be well within
the acceptable rates.


I've gone through all the new warnings.  With one exception where
it's unclear what exactly is going on they all appear to be due
to the idiom of writing into a struct member to clear or copy
data into the member and those that follow to the end of the
struct, like so:

  struct S { int a, b, c } *s;
  memset (>b, 0, sizeof *s - offsetof (struct S, b));

It should be doable to recognize this idiom and suppress the
warning in this case.  Let me work on enhancing the patch to
do that.


Here are the numbers of warnings for
Binutils and the kernel:

113   -Wimplicit-fallthrough
 38   -Wformat-length=
 12   -Wunused-const-variable=
 10   -Wstringop-overflow
  2   -Wdangling-else
  2   -Wframe-address
  2   -Wint-in-bool-context
  1   -Wbool-operation


Using some header structure at the
beginning and then conditionally on fields in that structure various
payloads occurs in many projects, starting with glibc, gcc, Linux kernel,
... The warning really must not be detached from reality.


That's an unfair assertion in light of the numbers above.


If you want a warning for suspicious calls, sure, but
1) it has to be clearly worded significantly differently from how do you
   word it, so that users really understand you are warning about
   suspicious code (though, I really believe it is very common and there
   is really nothing the users can do about it)
2) it really shouldn't be enabled in -Wall, and I think not even in
-Wextra


As you have argued yourself recently in our discussion of
-Wimplicit-fallthrough, warnings that aren't enabled by either
of these options don't generally benefit users because very few
turn them on explicitly.  I agree with that argument although I
would be in favor of rolling out a new warning gradually over
two or more releases if it were known to be prone to high rates
of false positive. The -Wstringop-overflow warning clearly isn't
in that category so there's no need for it.  My offer to do
additional testing is still good if you'd like to see more data.

As for the wording, I welcome suggestions for improvements.

Martin




[PATCH] rs6000: Disable shrink-wrap-separate for abi=spe (PR78168)

2016-11-02 Thread Segher Boessenkool
With the SPE ABI, if we wrap GPRs we need to handle the upper half of the
extended 64-bit registers as well, which we cannot easily do.  So, this
patch disables separate shrink-wrapping for the SPE ABI.

Tested on powerpc64-linux {-m32,-m64}; also tested on the testcase (I
finally managed to build a config that fails).  Committing to trunk.


Segher


2016-11-02  Segher Boessenkool  

PR target/78168
* config/r6000/rs6000.c (rs6000_get_separate_components): Return
NULL if TARGET_SPE_ABI.

---
 gcc/config/rs6000/rs6000.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index f9e4739..e5a6166 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -27633,6 +27633,9 @@ rs6000_get_separate_components (void)
   if (WORLD_SAVE_P (info))
 return NULL;
 
+  if (TARGET_SPE_ABI)
+return NULL;
+
   sbitmap components = sbitmap_alloc (32);
   bitmap_clear (components);
 
-- 
1.9.3



Re: [PATCH][AArch64] Improve SHA1 scheduling

2016-11-02 Thread Wilco Dijkstra

ping


From: Wilco Dijkstra
Sent: 25 October 2016 18:08
To: GCC Patches
Cc: nd
Subject: [PATCH][AArch64] Improve SHA1 scheduling
    
SHA1H instructions may be scheduled after a SHA1C instruction
that uses the same input register.  However SHA1C updates its input,
so if SHA1H is scheduled after it, it requires an extra move.
Increase the priority of SHA1H to ensure it gets scheduled
earlier, avoiding the move.

Is this something the generic scheduler could do automatically for
instructions with RMW operands?

Passes bootstrap & regress. OK for commit?

ChangeLog:
2016-10-25  Wilco Dijkstra  

    * config/aarch64/aarch64.c (aarch64_sched_adjust_priority)
    New function.
    (TARGET_SCHED_ADJUST_PRIORITY): Define target hook.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
9b2f9cb19343828dc39e9950ebbefe941521942a..2b25bd1bdd6f4e7737f8e04c3b3684cdff6c4b80
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -13668,6 +13668,26 @@ aarch64_sched_fusion_priority (rtx_insn *insn, int 
max_pri,
   return;
 }
 
+/* Implement the TARGET_SCHED_ADJUST_PRIORITY hook.
+   Adjust priority of sha1h instructions so they are scheduled before
+   other SHA1 instructions.  */
+
+static int
+aarch64_sched_adjust_priority (rtx_insn *insn, int priority)
+{
+  rtx x = PATTERN (insn);
+
+  if (GET_CODE (x) == SET)
+    {
+  x = SET_SRC (x);
+
+  if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_SHA1H)
+   return priority + 10;
+    }
+
+  return priority;
+}
+
 /* Given OPERANDS of consecutive load/store, check if we can merge
    them into ldp/stp.  LOAD is true if they are load instructions.
    MODE is the mode of memory operands.  */
@@ -14431,6 +14451,9 @@ aarch64_optab_supported_p (int op, machine_mode mode1, 
machine_mode,
 #undef TARGET_CAN_USE_DOLOOP_P
 #define TARGET_CAN_USE_DOLOOP_P can_use_doloop_if_innermost
 
+#undef TARGET_SCHED_ADJUST_PRIORITY
+#define TARGET_SCHED_ADJUST_PRIORITY aarch64_sched_adjust_priority
+
 #undef TARGET_SCHED_MACRO_FUSION_P
 #define TARGET_SCHED_MACRO_FUSION_P aarch64_macro_fusion_p
 



Re: [PATCH][AArch64 - v3] Simplify eh_return implementation

2016-11-02 Thread Wilco Dijkstra

    

ping


From: Wilco Dijkstra
Sent: 02 September 2016 12:31
To: Ramana Radhakrishnan; GCC Patches
Cc: nd
Subject: Re: [PATCH][AArch64 - v3] Simplify eh_return implementation
    
Ramana Radhakrishnan wrote:
> Can you please file a PR for this and add some testcases ?  This sounds like 
> a serious enough problem that needs to be looked at probably going back since 
> the dawn of time.

I've created PR77455. Updated patch below:

This patch simplifies the handling of the EH return value.  We force the use of 
the
frame pointer so the return location is always at FP + 8.  This means we can 
emit
a simple volatile access in EH_RETURN_HANDLER_RTX without needing md
patterns, splitters and frame offset calculations.  The new implementation also
fixes various bugs in aarch64_final_eh_return_addr, which does not work with
-fomit-frame-pointer, alloca or outgoing arguments.

Bootstrap OK, GCC Regression OK, OK for trunk? Would it be useful to backport
this to GCC6.x?

ChangeLog:

2016-09-02  Wilco Dijkstra  

    PR77455
gcc/
    * config/aarch64/aarch64.md (eh_return): Remove pattern and splitter.
    * config/aarch64/aarch64.h (AARCH64_EH_STACKADJ_REGNUM): Remove.
    (EH_RETURN_HANDLER_RTX): New define.
    * config/aarch64/aarch64.c (aarch64_frame_pointer_required):
    Force frame pointer in EH return functions.
    (aarch64_expand_epilogue): Add barrier for eh_return.
    (aarch64_final_eh_return_addr): Remove.
    (aarch64_eh_return_handler_rtx): New function.
    * config/aarch64/aarch64-protos.h (aarch64_final_eh_return_addr):
    Remove.
    (aarch64_eh_return_handler_rtx): New prototype.

testsuite/
    * gcc.target/aarch64/eh_return.c: New test.
--
diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
3cdd69b8af1089a839e5d45cda94bc70a15cd777..327c0a97f6f687604afef249b79ac22628418070
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -358,7 +358,7 @@ int aarch64_hard_regno_mode_ok (unsigned, machine_mode);
 int aarch64_hard_regno_nregs (unsigned, machine_mode);
 int aarch64_uxt_size (int, HOST_WIDE_INT);
 int aarch64_vec_fpconst_pow_of_2 (rtx);
-rtx aarch64_final_eh_return_addr (void);
+rtx aarch64_eh_return_handler_rtx (void);
 rtx aarch64_mask_from_zextract_ops (rtx, rtx);
 const char *aarch64_output_move_struct (rtx *operands);
 rtx aarch64_return_addr (int, rtx);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
003fec87e41db618570663f28cc2387a87e8252a..fa81e4b853daf08842955288861ec7e7acca
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -400,9 +400,9 @@ extern unsigned aarch64_architecture_version;
 #define ASM_DECLARE_FUNCTION_NAME(STR, NAME, DECL)  \
   aarch64_declare_function_name (STR, NAME, DECL)
 
-/* The register that holds the return address in exception handlers.  */
-#define AARCH64_EH_STACKADJ_REGNUM (R0_REGNUM + 4)
-#define EH_RETURN_STACKADJ_RTX gen_rtx_REG (Pmode, AARCH64_EH_STACKADJ_REGNUM)
+/* For EH returns X4 contains the stack adjustment.  */
+#define EH_RETURN_STACKADJ_RTX gen_rtx_REG (Pmode, R4_REGNUM)
+#define EH_RETURN_HANDLER_RTX  aarch64_eh_return_handler_rtx ()
 
 /* Don't use __builtin_setjmp until we've defined it.  */
 #undef DONT_USE_BUILTIN_SETJMP
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
e742c19d76e6c62117aa62a990b9c2945aa06b74..f07d771ea343803e054e03f59c8c1efb698bf474
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2739,6 +2739,10 @@ aarch64_frame_pointer_required (void)
   && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM)))
 return true;
 
+  /* Force a frame pointer for EH returns so the return address is at FP+8.  */
+  if (crtl->calls_eh_return)
+    return true;
+
   return false;
 }
 
@@ -3298,7 +3302,8 @@ aarch64_expand_epilogue (bool for_sibcall)
  + cfun->machine->frame.saved_varargs_size) != 0;
 
   /* Emit a barrier to prevent loads from a deallocated stack.  */
-  if (final_adjust > crtl->outgoing_args_size || cfun->calls_alloca)
+  if (final_adjust > crtl->outgoing_args_size || cfun->calls_alloca
+  || crtl->calls_eh_return)
 {
   emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
   need_barrier_p = false;
@@ -3366,52 +3371,15 @@ aarch64_expand_epilogue (bool for_sibcall)
 emit_jump_insn (ret_rtx);
 }
 
-/* Return the place to copy the exception unwinding return address to.
-   This will probably be a stack slot, but could (in theory be the
-   return register).  */
+/* Implement EH_RETURN_HANDLER_RTX.  The return address is stored at FP + 8.
+   The access needs to be volatile to prevent it from being removed.  */
 rtx
-aarch64_final_eh_return_addr (void)
+aarch64_eh_return_handler_rtx (void)
 {
-  HOST_WIDE_INT fp_offset;
-
-  aarch64_layout_frame ();
-
-  fp_offset = 

Re: [PATCH v2][AArch64] Fix symbol offset limit

2016-11-02 Thread Wilco Dijkstra

    

  ping

From: Wilco Dijkstra
Sent: 12 September 2016 15:50
To: Richard Earnshaw; GCC Patches
Cc: nd
Subject: Re: [PATCH v2][AArch64] Fix symbol offset limit
    
Wilco wrote:    
> The original example is from GCC itself, the fixed_regs array is small but 
> due to
> optimization we can end up with _regs + 0x.

We could also check the bounds of each symbol if they exist, like the patch 
below.


In aarch64_classify_symbol symbols are allowed full-range offsets on 
relocations.
This means the offset can use all of the +/-4GB offset, leaving no offset 
available
for the symbol itself.  This results in relocation overflow and link-time errors
for simple expressions like _char + 0xff00.

To avoid this, limit the offset to +/-1GB so that the symbol needs to be within 
a
3GB offset from its references.  For the tiny code model use a 64KB offset, 
allowing
most of the 1MB range for code/data between the symbol and its references.
For symbols with a defined size, limit the offset to be within the size of the 
symbol.

ChangeLog:
2016-09-12  Wilco Dijkstra  

    gcc/
    * config/aarch64/aarch64.c (aarch64_classify_symbol):
    Apply reasonable limit to symbol offsets.

    testsuite/
    * gcc.target/aarch64/symbol-range.c (foo): Set new limit.
    * gcc.target/aarch64/symbol-range-tiny.c (foo): Likewise.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
385bd560fb12cd5d404e6ddb2f01edf1fe72d729..275a828ac9e6e9b8187380c1b602ffb1b2bcfb21
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9351,6 +9351,8 @@ aarch64_classify_symbol (rtx x, rtx offset)
   if (aarch64_tls_symbol_p (x))
 return aarch64_classify_tls_symbol (x);
 
+  const_tree decl = SYMBOL_REF_DECL (x);
+
   switch (aarch64_cmodel)
 {
 case AARCH64_CMODEL_TINY:
@@ -9359,25 +9361,45 @@ aarch64_classify_symbol (rtx x, rtx offset)
  we have no way of knowing the address of symbol at compile time
  so we can't accurately say if the distance between the PC and
  symbol + offset is outside the addressible range of +/-1M in the
-    TINY code model.  So we rely on images not being greater than
-    1M and cap the offset at 1M and anything beyond 1M will have to
-    be loaded using an alternative mechanism.  Furthermore if the
-    symbol is a weak reference to something that isn't known to
-    resolve to a symbol in this module, then force to memory.  */
+    TINY code model.  So we limit the maximum offset to +/-64KB and
+    assume the offset to the symbol is not larger than +/-(1M - 64KB).
+    Furthermore force to memory if the symbol is a weak reference to
+    something that doesn't resolve to a symbol in this module.  */
   if ((SYMBOL_REF_WEAK (x)
    && !aarch64_symbol_binds_local_p (x))
- || INTVAL (offset) < -1048575 || INTVAL (offset) > 1048575)
+ || !IN_RANGE (INTVAL (offset), -0x1, 0x1))
 return SYMBOL_FORCE_TO_MEM;
+
+ /* Limit offset to within the size of a declaration if available.  */
+ if (decl && DECL_P (decl))
+   {
+ const_tree decl_size = DECL_SIZE (decl);
+
+ if (decl_size
+ && !IN_RANGE (INTVAL (offset), 0, tree_to_shwi (decl_size)))
+   return SYMBOL_FORCE_TO_MEM;
+   }
+
   return SYMBOL_TINY_ABSOLUTE;
 
 case AARCH64_CMODEL_SMALL:
   /* Same reasoning as the tiny code model, but the offset cap here is
-    4G.  */
+    1G, allowing +/-3G for the offset to the symbol.  */
   if ((SYMBOL_REF_WEAK (x)
    && !aarch64_symbol_binds_local_p (x))
- || !IN_RANGE (INTVAL (offset), HOST_WIDE_INT_C (-4294967263),
-   HOST_WIDE_INT_C (4294967264)))
+ || !IN_RANGE (INTVAL (offset), -0x4000, 0x4000))
 return SYMBOL_FORCE_TO_MEM;
+
+ /* Limit offset to within the size of a declaration if available.  */
+ if (decl && DECL_P (decl))
+   {
+ const_tree decl_size = DECL_SIZE (decl);
+
+ if (decl_size
+ && !IN_RANGE (INTVAL (offset), 0, tree_to_shwi (decl_size)))
+   return SYMBOL_FORCE_TO_MEM;
+   }
+
   return SYMBOL_SMALL_ABSOLUTE;
 
 case AARCH64_CMODEL_TINY_PIC:
diff --git a/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c 
b/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c
index 
d7e46b059e41f2672b3a1da5506fa8944e752e01..d399a3637ed834ddc4bb429594c4ec229b5c2ea8
 100644
--- a/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c
+++ b/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c
@@ -1,12 +1,12 @@
-/* { dg-do compile } */
+/* { dg-do link } */
 /* { dg-options "-O3 -save-temps -mcmodel=tiny" } */
 
-int 

Re: [PATCH v2] AIX visibility

2016-11-02 Thread Segher Boessenkool
On Wed, Nov 02, 2016 at 11:28:40AM -0500, Segher Boessenkool wrote:
> On Wed, Nov 02, 2016 at 11:41:32AM -0400, David Edelsohn wrote:
> > Any comments on ASM_WEAKEN_DECL change?
> 
> It no longer checks RS6000_WEAK, is that always on now?

Oh never mind, I can't read (it actually scrolled off my screen, what
a great excuse).

So, looks just fine to me.


Segher


Re: [PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 04:44:05PM +0100, Jakub Jelinek wrote:
> which means if gen_type_die or gen_type_die_with_usage doesn't
> use the langhook, then we'd emit a completely useless { __pfn; __delta }
> struct into debug info first, and then in modified_type_die used
> the langhook, get OFFSET_TYPE and probably create the
> DW_TAG_ptr_to_member_type.  So I think we really need that.
> 
> > > 2) it is used for something Ada-ish I really don't know how to test etc.
> > >to be able to find out if it is safe to call it in
> > >gen_type_die_with_usage too
> > 
> > You could find an Ada test that uses the code and verify that the
> > output stays the same?
> 
> I can try to find the patch that introduced it and if it contained any
> testcases.

I couldn't find any unfortunately.  Pierre, could you please test if the
following patch doesn't regress anything in the Ada debug info area?

Here is updated patch I'm going to bootstrap/regtest; it generates the
same debuginfo on ref-3.C testcase.

2016-11-02  Jakub Jelinek  
Alexandre Oliva  
Jason Merrill  

PR debug/28767
PR debug/56974
* langhooks.h (struct lang_hooks_for_types): Document type_hash_eq
being also called on METHOD_TYPEs.  Add type_dwarf_attribute langhook.
* langhooks.c (lhd_type_dwarf_attribute): New function.
* langhooks-def.h (lhd_type_dwarf_attribute): Declare.
(LANG_HOOKS_TYPE_DWARF_ATTRIBUTE): Define.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Add
LANG_HOOKS_TYPE_DWARF_ATTRIBUTE.
* tree.h (check_lang_type): Declare.
* tree.c (check_lang_type): New function.
(check_qualified_type, check_aligned_type): Call it.
* dwarf2out.c (modified_type_die): Don't use type_main_variant
for FUNCTION_TYPE or METHOD_TYPE, instead walk over variants with
check_base_type and check_lang_type.
(gen_ptr_to_mbr_type_die): If lookup_type_die is already non-NULL,
return early.  For pointer-to-data-member add DW_AT_use_location
attribute.
(gen_subroutine_type_die): Add DW_AT_{,rvalue_}reference attribute
if needed.
(gen_type_die_with_usage): Don't use type_main_variant
for FUNCTION_TYPE or METHOD_TYPE, instead walk over variants with
check_base_type and check_lang_type.  Formatting fixes. Call
get_debug_type langhook.
cp/
* tree.c (cp_check_qualified_type): Use check_base_type and
TYPE_QUALS comparison instead of check_qualified_type.
(cxx_type_hash_eq): Return false if type_memfn_rqual don't match.
* cp-objcp-common.c (cp_get_debug_type): New function.
(cp_decl_dwarf_attribute): Don't handle types here.
(cp_type_dwarf_attribute): New function.
* cp-objcp-common.h (cp_get_debug_type, cp_type_dwarf_attribute):
Declare.
(LANG_HOOKS_GET_DEBUG_TYPE, LANG_HOOKS_TYPE_DWARF_ATTRIBUTE):
Define.
testsuite/
* g++.dg/debug/dwarf2/ptrdmem-1.C: New test.
* g++.dg/debug/dwarf2/ref-3.C: New test.
* g++.dg/debug/dwarf2/refqual-1.C: New test.
* g++.dg/debug/dwarf2/refqual-2.C: New test.

--- gcc/langhooks.h.jj  2016-11-02 15:55:59.243607249 +0100
+++ gcc/langhooks.h 2016-11-02 17:06:23.818730222 +0100
@@ -120,7 +120,7 @@ struct lang_hooks_for_types
   /* Return TRUE if TYPE1 and TYPE2 are identical for type hashing purposes.
  Called only after doing all language independent checks.
  At present, this function is only called when both TYPE1 and TYPE2 are
- FUNCTION_TYPEs.  */
+ FUNCTION_TYPE or METHOD_TYPE.  */
   bool (*type_hash_eq) (const_tree, const_tree);
 
   /* Return TRUE if TYPE uses a hidden descriptor and fills in information
@@ -162,6 +162,10 @@ struct lang_hooks_for_types
  for the debugger about scale factor, etc.  */
   bool (*get_fixed_point_type_info) (const_tree,
 struct fixed_point_type_info *);
+
+  /* Returns -1 if dwarf ATTR shouldn't be added for TYPE, or the attribute
+ value otherwise.  */
+  int (*type_dwarf_attribute) (const_tree, int);
 };
 
 /* Language hooks related to decls and the symbol table.  */
--- gcc/langhooks.c.jj  2016-11-02 15:55:59.232607389 +0100
+++ gcc/langhooks.c 2016-11-02 16:35:39.938889496 +0100
@@ -702,6 +702,15 @@ lhd_decl_dwarf_attribute (const_tree, in
   return -1;
 }
 
+/* Default implementation of LANG_HOOKS_TYPE_DWARF_ATTRIBUTE.  Don't add
+   any attributes.  */
+
+int
+lhd_type_dwarf_attribute (const_tree, int)
+{
+  return -1;
+}
+
 /* Returns true if the current lang_hooks represents the GNU C frontend.  */
 
 bool
--- gcc/langhooks-def.h.jj  2016-11-02 15:55:59.151608423 +0100
+++ gcc/langhooks-def.h 2016-11-02 17:05:54.296095837 +0100
@@ -84,6 +84,7 @@ extern bool lhd_omp_mappable_type (tree)
 extern const char *lhd_get_substring_location (const substring_loc &,
   

Re: RFC: Warnings silenced when macros from system headers are used (PR c/78000, c/71613)

2016-11-02 Thread Joseph Myers
On Wed, 2 Nov 2016, Jason Merrill wrote:

> It seems to me that the general principle is that we should consider
> the location where the thing we're warning about is happening.  In
> 
>float_var = LLONG_MIN;
> 
> The relevant location is that of the assignment, not the constant on
> the RHS.  In your ?: example, a simple answer would be to warn based

I'm not sure we track locations well enough to handle that yet.

Say you have an implicit conversion of a function argument to the type 
from the prototype and something about that conversion should be warned 
about.  Then you should obviously warn if the argument is a macro from a 
system header but the call is outside a system header.  But say the 
function call itself comes from a macro defined in a system header - you 
should still warn if the user passed an argument of the wrong type, even 
if that argument is a macro from a system header.

That is:

/* System header.  */
int __foo (int);
/* This sort of macro to avoid accidental interposition issues has been 
   discussed lately on libc-alpha, so it's a realistic example.  */
#define foo(x) (0 + __foo (x))
/* User code.  */
v = foo (NULL);

should warn because the call to __foo logically results from user code 
even though both NULL and foo are macros defined in system headers.  I'm 
not sure what the locations look like here.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Generate reproducible output independently of the build-path

2016-11-02 Thread Joseph Myers
On Wed, 2 Nov 2016, Ximin Luo wrote:

> This patch series adds a new environment variable SOURCE_PREFIX_MAP. When this
> is set, GCC will treat this as an implicit "-fdebug-prefix-map=$value"
> command-line argument. This makes the final binary output reproducible, and

Only one argument?  Sometimes you may want multiple -fdebug-prefix-map 
options (for both source and build directories, for example).  Perhaps the 
value should be split on spaces to provide multiple such mappings?

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 0/3] use rtx_insn * more

2016-11-02 Thread Bernd Schmidt

On 11/02/2016 03:55 PM, David Malcolm wrote:


Did you mean this patch:
  https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01358.html


That one is ok after the test, sorry for not being more clear.


Bernd



Re: [PATCH v2] AIX visibility

2016-11-02 Thread Segher Boessenkool
On Wed, Nov 02, 2016 at 11:41:32AM -0400, David Edelsohn wrote:
> Any comments on ASM_WEAKEN_DECL change?

It no longer checks RS6000_WEAK, is that always on now?

Otherwise looks fine to me.


Segher


Re: [PATCH 1/4] Use SOURCE_PREFIX_MAP envvar as an implicit debug-prefix-map

2016-11-02 Thread Joseph Myers
On Wed, 2 Nov 2016, Ximin Luo wrote:

> +@item SOURCE_PREFIX_MAP If this variable is set, it specifies a mapping

The text should start on a separate line from the @item.

> +that is used to transform filepaths that are output in debug symbols.
> +This helps the embedded paths become reproducible, without having the
> +unreproducible value be visible in other input sources - such as GCC

Use a TeX em dash, ---, instead of " - ".

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: RFC: Warnings silenced when macros from system headers are used (PR c/78000, c/71613)

2016-11-02 Thread Jason Merrill
On Wed, Nov 2, 2016 at 11:51 AM, Marek Polacek  wrote:
> One of the pending issues that we should address before we release GCC 7 is
> that sometimes we don't issue a warning if the location points to a macro
> defined in a system header, unless -Wsystem-headers.  Consider e.g.
>
>   enum { e1 = LLONG_MIN };
>
> or
>
>   float_var = LLONG_MIN;
>
> Here, LLONG_MIN comes from a system header and so the compiler doesn't print
> any warnings even though it should--the problem is not in the macro itself,
> but in how it's used.  This has happened before, e.g. with NULL, and the fix
> was to call expansion_point_location_if_in_system_header, but this strategy
> is not tenable; there are too many such issues.  After having spent days on
> this it seems that we should always use the expansion location except for
> certain pedwarns/warnings--the challenge is of course how to distinguish
> which ones.  This patch introduces expand_macros_sentinel that can be used
> to suppress expanding macros, removes various
> expansion_point_location_if_in_system_header calls and fixes testsuite
> fallout.  I have *not* gone over all the warnings/pedwarns yet, because this
> is touch-and-go whether this'll be considered a reasonable approach.
>
> The general principle should be: is it the macro or its user that is
> responsible for the warning?, but in practice it was often less clear to
> me what to do.  So e.g. imagine
>
>   #define C :
>   #define F i ?: 3
>
> in a system header and then
>
>   i = i ? C 5;  // 1
>   return F; // 2
>
> in user code -- I believe we should NOT warn for 2 (so don't expand the 
> location),
> but that also means we won't warn for 1.
>
> Thoughts?

It seems to me that the general principle is that we should consider
the location where the thing we're warning about is happening.  In

   float_var = LLONG_MIN;

The relevant location is that of the assignment, not the constant on
the RHS.  In your ?: example, a simple answer would be to warn based
on the location of the ?.  The ?C example seems not worth worrying
about either way.

Jason


Re: [PATCH][GCC][testsuite] Fix failing vminnm/vmaxnm tests for ARM

2016-11-02 Thread Christophe Lyon
On 2 November 2016 at 16:38, Tamar Christina  wrote:
> Hi all,
>
> This fixes the ARM failures in the testsuite.
> Previously these tests were gated on if ARMv8-a
> support was available in the compiler, not if the hardware
> was an ARMv8-a hardware.
>
> This thus resulted in correct code, but wouldn't run on
> any other hardware.
>
> Ran regression tests on aarch64-none-linux-gnu, arm-none-linux-gnueabihf.
>
> Ok for trunk?
>

It seems OK to me (without actually testing, though)

Thanks

Christophe

> Thanks,
> Tamar
>
> gcc/testsuite/
>
> 2016-11-01  Tamar Christina  
>
> * gcc.target/arm/simd/vmaxnm_f32_1.c (dg-require-effective-target):
> Check for arm_v8_neon_hw.
> * gcc.target/arm/simd/vmaxnmq_f32_1.c (dg-require-effective-target):
> Likewise.
> * gcc.target/arm/simd/vminnm_f32_1.c (dg-require-effective-target):
> Likewise.
> * gcc.target/arm/simd/vminnmq_f32_1.c(dg-require-effective-target):
> Likewise.


RFC: Warnings silenced when macros from system headers are used (PR c/78000, c/71613)

2016-11-02 Thread Marek Polacek
One of the pending issues that we should address before we release GCC 7 is
that sometimes we don't issue a warning if the location points to a macro
defined in a system header, unless -Wsystem-headers.  Consider e.g.
  
  enum { e1 = LLONG_MIN };

or

  float_var = LLONG_MIN;

Here, LLONG_MIN comes from a system header and so the compiler doesn't print
any warnings even though it should--the problem is not in the macro itself,
but in how it's used.  This has happened before, e.g. with NULL, and the fix
was to call expansion_point_location_if_in_system_header, but this strategy
is not tenable; there are too many such issues.  After having spent days on
this it seems that we should always use the expansion location except for
certain pedwarns/warnings--the challenge is of course how to distinguish
which ones.  This patch introduces expand_macros_sentinel that can be used
to suppress expanding macros, removes various
expansion_point_location_if_in_system_header calls and fixes testsuite
fallout.  I have *not* gone over all the warnings/pedwarns yet, because this
is touch-and-go whether this'll be considered a reasonable approach.

The general principle should be: is it the macro or its user that is
responsible for the warning?, but in practice it was often less clear to
me what to do.  So e.g. imagine

  #define C :
  #define F i ?: 3

in a system header and then

  i = i ? C 5;  // 1
  return F; // 2

in user code -- I believe we should NOT warn for 2 (so don't expand the 
location),
but that also means we won't warn for 1.

Thoughts?

Bootstrapped/regtested on x86_64-linux and ppc64-linux.

2016-11-02  Marek Polacek  

PR c/78000
PR c/71613
gcc/c-family/
* c-common.c (unsafe_conversion_p): Don't call
expansion_point_location_if_in_system_header.
(c_cpp_error): Add a paramater and expand locations depending on that.
* c-common.h (c_cpp_error): Update declaration.
* c-warn.c (warnings_for_convert_and_check): Don't call
expansion_point_location_if_in_system_header.
gcc/c/
* c-decl.c (declspecs_add_type): Use expand_macros_sentinel to suppress
expanding macro locations.
* c-parser.c (c_parser_postfix_expression) [RID_FUNCTION_NAME,
RID_PRETTY_FUNCTION_NAME, RID_C99_FUNCTION_NAME]: Likewise.
* c-typeck.c (convert_arguments): Don't call
expansion_point_location_if_in_system_header.
(pedwarn_init): Likewise.
(warning_init): Likewise.
(convert_for_assignment): Don't call
expansion_point_location_if_in_system_header.
(c_finish_return): Likewise.
gcc/cp/
* call.c (conversion_null_warnings): Don't call
expansion_point_location_if_in_system_header.
* cvt.c (build_expr_type_conversion): Likewise.
* parser.c (set_and_check_decl_spec_loc): Use expand_macros_sentinel to
suppress expanding macro locations.
* typeck.c (cp_build_binary_op): Don't call
expansion_point_location_if_in_system_header.
gcc/
* diagnostic.c (diagnostic_initialize): Initialize dc_expand_locations.
(diagnostic_report_warnings_p): New function.
* diagnostic.h (struct diagnostic_context): Add dc_expand_locations.
(diagnostic_report_warnings_p): Remove.
(struct expand_macros_sentinel): New sentinel.
(diagnostic_report_warnings_p): Declare.
* genmatch.c (error_cb): Add bool parameter.
(fatal_at): Adjust error_cb call.
(warning_at): Likewise.
* input.c (on_error): Add bool parameter.
gcc/fortran/
* cpp.c (cb_cpp_error): Add a paramater and expand locations depending
on that.
gcc/testsuite/
* gcc.dg/pr71613.c: New.
* gcc.dg/pr78000.c: New.
* gcc.dg/pr78000.h: New.
libcpp/
* charset.c (noop_error_cb): Add bool parameter.
(saved_error_handler): Likewise.
(cpp_interpret_string_ranges):
* errors.c (cpp_diagnostic_at_richloc): Adjust cb.error call.
(cpp_diagnostic_at): Likewise.
(cpp_diagnostic_with_line): Add bool parameter and pass it down to
cb.error.
(cpp_error_with_line_noexpand): New.
(cpp_warning_with_line_noexpand): New.
(cpp_pedwarning_with_line_noexpand): New.
(cpp_warning_with_line_syshdr): Pass true to cpp_diagnostic_with_line.
* expr.c (cpp_classify_number): Call *_noexpand variants.
* include/cpplib.h (error): Adjust declaration.
(cpp_error_with_line_noexpand): New.
(cpp_warning_with_line_noexpand): New.
(cpp_pedwarning_with_line_noexpand): New.

diff --git gcc/gcc/c-family/c-common.c gcc/gcc/c-family/c-common.c
index 307862b..b0a1b5c 100644
--- gcc/gcc/c-family/c-common.c
+++ gcc/gcc/c-family/c-common.c
@@ -1230,7 +1230,6 @@ unsafe_conversion_p (location_t loc, tree type, tree 
expr, bool produce_warns)
 {
   enum conversion_safety give_warning = SAFE_CONVERSION; /* is 0 or 

Re: [PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 11:31:25AM -0400, Jason Merrill wrote:
> On Wed, Nov 2, 2016 at 10:31 AM, Jakub Jelinek  wrote:
> > It uses Alex' LANG_HOOKS_GET_PTRMEMFN_TYPE langhook.  I've tried
> > to think about https://gcc.gnu.org/ml/gcc-patches/2011-05/msg00227.html
> > and we even have such a langhook now, modified_type_die
> > uses lang_hooks.types.get_debug_type, but
> > 1) it is just called in modified_type_die and not in
> >gen_type_die_with_usage, that looks weird
> 
> How much of a problem is that?  modified_type_die calls gen_type_die,
> does that not cover the cases needed here?

If e.g. on the ref-3.C testcase from the patch I put breakpoint on
both gen_ptr_to_mbr_type_die and modified_type_die (the latter only for
type->base.code == RECORD_TYPE), then I see first:
#0  gen_ptr_to_mbr_type_die (type=, 
context_die=>, 
class_type=, member_type=) at ../../gcc/dwarf2out.c:23128
#1  0x00c3a7e5 in gen_type_die_with_usage (type=, 
context_die=>, usage=DINFO_USAGE_DIR_USE)
at ../../gcc/dwarf2out.c:24428
#2  0x00c3ab92 in gen_type_die (type=, 
context_die=>) at ../../gcc/dwarf2out.c:24491
#3  0x00c3caed in gen_decl_die (decl=, 
origin=, ctx=0x0, 
context_die=>) at ../../gcc/dwarf2out.c:25117
#4  0x00c3b13e in process_scope_var (stmt=, 
decl=, origin=, 
context_die=>) at ../../gcc/dwarf2out.c:24620
#5  0x00c3b1bb in decls_for_scope (stmt=, 
context_die=>) at ../../gcc/dwarf2out.c:24645
and only afterwards:
#0  modified_type_die (type=, cv_quals=0, 
reverse=false, 
context_die=>) at ../../gcc/dwarf2out.c:12328
#1  0x00c2f03b in add_type_attribute (object_die=>, 
type=, cv_quals=0, reverse=false, 
context_die=>) at ../../gcc/dwarf2out.c:20346
#2  0x00c354f4 in gen_variable_die (decl=, 
origin=, 
context_die=>) at ../../gcc/dwarf2out.c:22688
#3  0x00c3cb8e in gen_decl_die (decl=, 
origin=, ctx=0x0, 
context_die=>) at ../../gcc/dwarf2out.c:25138
#4  0x00c3b13e in process_scope_var (stmt=, 
decl=, origin=, 
context_die=>) at ../../gcc/dwarf2out.c:24620
#5  0x00c3b1bb in decls_for_scope (stmt=, 
context_die=>) at ../../gcc/dwarf2out.c:24645

which means if gen_type_die or gen_type_die_with_usage doesn't
use the langhook, then we'd emit a completely useless { __pfn; __delta }
struct into debug info first, and then in modified_type_die used
the langhook, get OFFSET_TYPE and probably create the
DW_TAG_ptr_to_member_type.  So I think we really need that.

> > 2) it is used for something Ada-ish I really don't know how to test etc.
> >to be able to find out if it is safe to call it in
> >gen_type_die_with_usage too
> 
> You could find an Ada test that uses the code and verify that the
> output stays the same?

I can try to find the patch that introduced it and if it contained any
testcases.

> > 3) most importantly, if the C++ version of this langhook would create
> >OFFSET_TYPE on the fly, I don't know how to ensure effective sharing
> >of DW_TAG_ptr_to_member_type nodes with the same DW_AT_type
> >and DW_AT_containing_type; unless the C++ langhook adds some extra
> >hash table that caches already created OFFSET_TYPEs or something similar,
> >it would create a new OFFSET_TYPE each time it is called
> 
> build_offset_type already uses a hash table.

Ah, ok.

> > Also, I really don't know how well does GDB (especially older releases)
> > handle DW_TAG_ptr_to_member_type for PMF, so the patch wraps that currently
> > with if (dwarf_version >= 5).  Quick grep revealed that GDB has code to
> > handle the __pfn/__delta fields.  So, can I ask somebody from the GDB
> > team to test this patch with that if (dwarf_version >= 5) replaced
> > with if (1) and see if it works properly with current GDB as well as say
> > 4-5 years old one (e.g. with -gdwarf-2 or -gdwarf-3)?  If yes, we
> > should emit it unconditionally.
> 
> This all makes sense to me.
> 
> > +  if (dwarf_version >= 5)
> > +   {
> > + tree class_type = lang_hooks.types.get_ptrmemfn_type (type, 0);
> > + if (class_type != NULL_TREE)
> 
> This can be
> 
> if (dwarf_version >= 5)
>   if (tree class_type = lang_hooks.types.get_ptrmemfn_type (type, 0))

Ok.

Jakub


[PATCH, Fortran] Extension: Support legacy PARAMETER statements with -std=legacy (or -fdec?)

2016-11-02 Thread Fritz Reese
All,

Another quirk of legacy compilers is their syntax for PARAMETER
statements. Such statements are similar to standard PARAMETER
statements but lack parentheses following the PARAMETER keyword. There
is a good reason the standard doesn't support this - because the
statement becomes ambiguous with assignment statements in fixed form.
Consider the following:

parameter pi = 3.14d0

In fixed form, spaces are ignored so in standard Fortran the above
looks like an assignment to the variable "parameterpi". In legacy
compilers, the above is instead interpreted as

parameter (pi = 3.14d0)

which of course declares the variable 'pi' to be a parameter with the
value 3.14.

Attached is a patch for the GNU Fortran front-end which allows the
compiler to interpret these legacy PARAMETER statements. The patch in
its current form does this without warning through -std=legacy
(warning for -std=gnu, error for -std=f*). Bootstraps and regtests on
x86_64-redhat-linux.

However, note that this would change by default the compiler's
interpretation of fixed-form variables starting with the string
"parameter", if any such cases existed in real code. IMO fixed form
code is isomorphic to legacy code, so I imagine most users writing
fixed-form/legacy code would intend for a legacy PARAMETER statement,
rather than assignment to variable PARAMETERPI, when writing such a
statement.

Beyond compatibility, one motivation for acceptance/recognizance of
these statements in GNU Fortran is the counterintuitive compile errors
currently seen when compiling legacy code which uses this type of
statement. Because the legacy PARAMETER statement is currently
interpreted as an assignment in fixed-form, and parameter statements
often precede other specification statements in real code, GNU Fortran
complains not about the parameter statement itself but often the
following line with "Unexpected data declaration" or similar. This
serves to confuse the user. An example of this can be seen by
compiling the attached dec_parameter_1.f in fixed form with the
current build of GNU Fortran.

I am amenable to only enabling support for legacy PARAMETER statements
through -fdec, so that the default (-std=gnu) behavior is unchanged
for such cases in fixed form. But this would leave the esoteric
"Unexpected data declaration statement" errors for legacy code without
-fdec. The case would be difficult to detect and work around
specifically since the entire parameter "assignment" statement is
eaten by the time the error comes about.

What do you think?

---
Fritz Reese

Author: Fritz O. Reese 
Date:   Tue Nov 1 12:26:57 2016 -0400

Support legacy PARAMETER statements with -std=legacy.

gcc/fortran/
* decl.c (gfc_match_parameter): Allow omitted '()' with -std=legacy.
* parse.c (decode_statement): Match "parmeter" before assignments.
* gfortran.texi: Document.

gcc/testsuite/gfortran.dg/
* dec_parameter_1.f: New test.
* dec_parameter_2.f90: Likewise.
* dec_parameter_3.f90: Likewise.
* dec_parameter_4.f90: Likewise.

 gcc/fortran/decl.c|   10 +++-
 gcc/fortran/gfortran.texi |   16 ++
 gcc/fortran/parse.c   |4 +-
 gcc/testsuite/gfortran.dg/dec_parameter_1.f   |   64 +
 gcc/testsuite/gfortran.dg/dec_parameter_2.f90 |   63 
 gcc/testsuite/gfortran.dg/dec_parameter_3.f90 |   13 +
 gcc/testsuite/gfortran.dg/dec_parameter_4.f90 |   13 +
 7 files changed, 180 insertions(+), 3 deletions(-)
diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index f18eb41..0120ceb 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -7821,10 +7821,16 @@ cleanup:
 match
 gfc_match_parameter (void)
 {
+  const char *term = " )%t";
   match m;
 
   if (gfc_match_char ('(') == MATCH_NO)
-return MATCH_NO;
+{
+  /* With legacy PARAMETER statements, don't expect a terminating ')'.  */
+  if (!gfc_notify_std (GFC_STD_LEGACY, "PARAMETER without '()' at %C"))
+   return MATCH_NO;
+  term = " %t";
+}
 
   for (;;)
 {
@@ -7832,7 +7838,7 @@ gfc_match_parameter (void)
   if (m != MATCH_YES)
break;
 
-  if (gfc_match (" )%t") == MATCH_YES)
+  if (gfc_match (term) == MATCH_YES)
break;
 
   if (gfc_match_char (',') != MATCH_YES)
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 85ab31b..1d60551 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -1472,6 +1472,7 @@ compatibility extensions along with those enabled by 
@option{-std=legacy}.
 * Bitwise logical operators::
 * Extended I/O specifiers::
 * Default exponents::
+* Legacy PARAMETER statements::
 @end menu
 
 @node Old-style kind specifications
@@ -2705,6 +2706,21 @@ For compatibility, GNU Fortran supports a default 
exponent of zero in real
 constants with @option{-fdec}.  For example, 

Re: [PATCH] enhance buffer overflow warnings (and c/53562)

2016-11-02 Thread Martin Sebor

On 11/02/2016 01:37 AM, Jakub Jelinek wrote:

On Tue, Nov 01, 2016 at 08:55:03PM -0600, Martin Sebor wrote:

struct S {
  int a, b, c, d;
};

#define bos(p, t) __builtin_object_size (p, t)
#define memset0(p, i, n) __builtin___memset_chk (p, i, n, bos (p, 0))
#define memset1(p, i, n) __builtin___memset_chk (p, i, n, bos (p, 1))

void f0 (struct S *s)
{
  memset0 (>d, 0, 1024);   // no warning here (bos 0)
}


But we do not want the warning here, there is nothing wrong on it.
The caller may be

void
bar (void)
{
  struct T { struct S header; char payload[1024 - offsetof (struct S, d)]; } t;
  initialize ();
  f0 ();
}

and the callee might rely on that.


Sure, they might and in that case the warning would be a false
positive.  It wouldn't be the first such warning that wasn't 100%
free of them.  But my testing with Binutils, GCC, and the Linux
kernel has exposed only 10 instances of new warnings and I don't
think I saw this idiom among them.  But even if some were, or if
all of them were false positives I think that would be well within
the acceptable rates.  Here are the numbers of warnings for
Binutils and the kernel:

113   -Wimplicit-fallthrough
 38   -Wformat-length=
 12   -Wunused-const-variable=
 10   -Wstringop-overflow
  2   -Wdangling-else
  2   -Wframe-address
  2   -Wint-in-bool-context
  1   -Wbool-operation


Using some header structure at the
beginning and then conditionally on fields in that structure various
payloads occurs in many projects, starting with glibc, gcc, Linux kernel,
... The warning really must not be detached from reality.


That's an unfair assertion in light of the numbers above.


If you want a warning for suspicious calls, sure, but
1) it has to be clearly worded significantly differently from how do you
   word it, so that users really understand you are warning about
   suspicious code (though, I really believe it is very common and there
   is really nothing the users can do about it)
2) it really shouldn't be enabled in -Wall, and I think not even in -Wextra


As you have argued yourself recently in our discussion of
-Wimplicit-fallthrough, warnings that aren't enabled by either
of these options don't generally benefit users because very few
turn them on explicitly.  I agree with that argument although I
would be in favor of rolling out a new warning gradually over
two or more releases if it were known to be prone to high rates
of false positive. The -Wstringop-overflow warning clearly isn't
in that category so there's no need for it.  My offer to do
additional testing is still good if you'd like to see more data.

As for the wording, I welcome suggestions for improvements.

Martin


Re: [PATCH][GCC][testsuite] Fix failing vminnm/vmaxnm tests for ARM

2016-11-02 Thread Kyrill Tkachov


On 02/11/16 15:38, Tamar Christina wrote:

Hi all,

This fixes the ARM failures in the testsuite.
Previously these tests were gated on if ARMv8-a
support was available in the compiler, not if the hardware
was an ARMv8-a hardware.

This thus resulted in correct code, but wouldn't run on
any other hardware.

Ran regression tests on aarch64-none-linux-gnu, arm-none-linux-gnueabihf.

Ok for trunk?


Looks ok to me too.
Kyrill


Thanks,
Tamar

gcc/testsuite/

2016-11-01  Tamar Christina  

* gcc.target/arm/simd/vmaxnm_f32_1.c (dg-require-effective-target):
Check for arm_v8_neon_hw.
* gcc.target/arm/simd/vmaxnmq_f32_1.c (dg-require-effective-target):
Likewise.
* gcc.target/arm/simd/vminnm_f32_1.c (dg-require-effective-target):
Likewise.
* gcc.target/arm/simd/vminnmq_f32_1.c(dg-require-effective-target):
Likewise.




[PATCH v2] AIX visibility

2016-11-02 Thread David Edelsohn
This revised patch makes two changes:

1) Fix typo in configure.ac
2) Add AIX visibility support for ASM_WEAKEN_DECL, which does touch
the same code as Linux.

The AIX "weak" support fixes a large number of C++ visibility testcases.

Bootstrapped on powerpc-ibm-aix7.2.0.0.

* configure.ac (.hidden): Change to conftest_s string. Provide string
for AIX assembler.
(gcc_cv_ld_hidden): Yes for AIX.
* configure: Regenerate.

* dwarf2asm.c (USE_LINKONCE_INDIRECT): Don't set for AIX (XCOFF).

* config/rs6000/rs6000-protos.h (rs6000_asm_weaken_decl): Declare
(rs6000_xcoff_asm_output_aligned_decl_common): Declare.
* config/rs6000/xcoff.h (TARGET_ASM_GLOBALIZE_DECL_NAME): Define.
(ASM_OUTPUT_ALIGNED_DECL_COMMON): Define.
(ASM_OUTPUT_ALIGNED_COMMON): Delete.
* config/rs6000/rs6000.c (rs6000_init_builtins): Change clog rename
from #if to if.
(rs6000_xcoff_visibility): New.
(rs6000_xcoff_declare_function_name): Add visibility support.
(rs6000_xcoff_asm_globalize_decl_name): New.
(rs6000_xcoff_asm_output_aligned_decl_common): New.
(rs6000_asm_weaken_decl): New.
(rs6000_code_end): Disable HIDDEN_LINKONCE on XCOFF.
config/rs6000/rs6000.h (ASM_WEAKEN_DECL): Change definition to
reference function.

dwarf2asm.c okay?

Any comments on ASM_WEAKEN_DECL change?

Thanks, David


ZZ
Description: Binary data


Re: [PATCH] combine: Improve change_zero_ext (fixes PR71847)

2016-11-02 Thread Bin.Cheng
On Tue, Oct 25, 2016 at 11:40 AM, Segher Boessenkool
 wrote:
> This improves a few things in change_zero_ext.  Firstly, it should use
> the passed in pattern in recog_for_combine, not the pattern of the insn
> (they are not the same if the whole pattern was replaced).  Secondly,
> it handled zero_ext of a subreg, but with hard registers we do not get
> a subreg, instead the mode of the reg is changed.  So this handles that.
> Thirdly, after changing a zero_ext to an AND, the resulting RTL may become
> non-canonical, like (ior (ashift ..) (and ..)); the AND should be first,
> it is commutative.  And lastly, zero_extract as a set_dest wasn't handled
> at all, but now it is.
>
> This fixes the testcase in PR71847, and improves code generation in some
> other edge cases too.
>
> Tested on powerpc64-linux {-m32,-m64}; I'll also test it on x86 and
> will build lots of crosses before committing.

Hi Segher,
This causes failure @ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78186
It includes how to report it, could you please have a look?  Thanks.

Thanks,
Bin


[PATCH][GCC][testsuite] Fix failing vminnm/vmaxnm tests for ARM

2016-11-02 Thread Tamar Christina
Hi all,

This fixes the ARM failures in the testsuite.
Previously these tests were gated on if ARMv8-a
support was available in the compiler, not if the hardware
was an ARMv8-a hardware.

This thus resulted in correct code, but wouldn't run on
any other hardware.

Ran regression tests on aarch64-none-linux-gnu, arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Tamar

gcc/testsuite/

2016-11-01  Tamar Christina  

* gcc.target/arm/simd/vmaxnm_f32_1.c (dg-require-effective-target):
Check for arm_v8_neon_hw.
* gcc.target/arm/simd/vmaxnmq_f32_1.c (dg-require-effective-target):
Likewise.
* gcc.target/arm/simd/vminnm_f32_1.c (dg-require-effective-target):
Likewise.
* gcc.target/arm/simd/vminnmq_f32_1.c(dg-require-effective-target):
Likewise.

gcc.arm-fix-vminnm-vmaxnm-tests
Description: gcc.arm-fix-vminnm-vmaxnm-tests


Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module

2016-11-02 Thread Jason Merrill
OK.

On Wed, Nov 2, 2016 at 10:18 AM, Jiong Wang  wrote:
> On 02/11/16 13:42, Jakub Jelinek wrote:
>>
>> On Wed, Nov 02, 2016 at 01:26:48PM +, Jiong Wang wrote:
>>>
>>> -/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION
>>> note. */
>>> +/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION
>>> note.  */
>>
>>
>> Too long line.
>
>
> Hmm, it shows 80 columns under my editor.  I guess '+' is counted in?
>
>>
>>> +/* RTL sequences inside PARALLEL are raw expression representation.
>>> +
>>> +   mem_loc_descriptor can be used to build generic DWARF expressions
>>> for
>>> +   DW_CFA_expression and DW_CFA_val_expression where the expression
>>> may can
>>> +   not be represented using normal RTL sequences.  In this case,
>>> group all
>>> +   expression operations (DW_OP_*) inside a PARALLEL.  For those
>>> DW_OP which
>>> +   doesn't have RTL mapping, wrap it using UNSPEC.  The logic for
>>> parsing
>>> +   PARALLEL sequences is:
>>> +
>>> +   foreach elem inside PARALLEL
>>> + if (elem is UNSPEC)
>>> +   dw_op =  XINT (elem, 1) (DWARF operation is kept as UNSPEC
>>> number)
>>> +   oprnd1 = XVECEXP (elem, 0, 0)
>>> +   oprnd2 = XVECEXP (elem, 0, 1)
>>> + else
>>> +   call mem_loc_descriptor  */
>>
>>
>> Not sure if it is a good idea to document in weirdly formatted
>> pseudo-language what the code actually does a few lines below.  IMHO
>> either
>> express it in words, or don't express it at all.
>
>
> OK, fixed. I replaced these comments as some brief words.
>
>>
>>> +   exp_result =
>>> + new_loc_descr ((enum dwarf_location_atom) dw_op,
>>> oprnd1,
>>> +oprnd2);
>>
>>
>> Wrong formatting, = should be on the next line.
>>
>>> + }
>>> +   else
>>> + exp_result =
>>> +   mem_loc_descriptor (elem, mode, mem_mode,
>>> +   VAR_INIT_STATUS_INITIALIZED);
>>
>>
>> Likewise.
>
>
> Both fixed. Patch updated, please review.
>
>
> Thanks.
>
> gcc/
> 2016-11-02  Jiong Wang  
>
> * reg-notes.def (CFA_VAL_EXPRESSION): New entry.
> * dwarf2cfi.c (dwarf2out_frame_debug_cfa_val_expression): New
> function.
> (dwarf2out_frame_debug): Support REG_CFA_VAL_EXPRESSION.
> (output_cfa_loc): Support DW_CFA_val_expression.
> (output_cfa_loc_raw): Likewise.
> (output_cfi): Likewise.
> (output_cfi_directive): Likewise.
> * dwarf2out.c (dw_cfi_oprnd1_desc): Support DW_CFA_val_expression.
> (dw_cfi_oprnd2_desc): Likewise.
> (mem_loc_descriptor): Recognize new pattern generated for value
> expression.
>


Re: [PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those

2016-11-02 Thread Jason Merrill
On Wed, Nov 2, 2016 at 10:31 AM, Jakub Jelinek  wrote:
> It uses Alex' LANG_HOOKS_GET_PTRMEMFN_TYPE langhook.  I've tried
> to think about https://gcc.gnu.org/ml/gcc-patches/2011-05/msg00227.html
> and we even have such a langhook now, modified_type_die
> uses lang_hooks.types.get_debug_type, but
> 1) it is just called in modified_type_die and not in
>gen_type_die_with_usage, that looks weird

How much of a problem is that?  modified_type_die calls gen_type_die,
does that not cover the cases needed here?

> 2) it is used for something Ada-ish I really don't know how to test etc.
>to be able to find out if it is safe to call it in
>gen_type_die_with_usage too

You could find an Ada test that uses the code and verify that the
output stays the same?

> 3) most importantly, if the C++ version of this langhook would create
>OFFSET_TYPE on the fly, I don't know how to ensure effective sharing
>of DW_TAG_ptr_to_member_type nodes with the same DW_AT_type
>and DW_AT_containing_type; unless the C++ langhook adds some extra
>hash table that caches already created OFFSET_TYPEs or something similar,
>it would create a new OFFSET_TYPE each time it is called

build_offset_type already uses a hash table.

> Also, I really don't know how well does GDB (especially older releases)
> handle DW_TAG_ptr_to_member_type for PMF, so the patch wraps that currently
> with if (dwarf_version >= 5).  Quick grep revealed that GDB has code to
> handle the __pfn/__delta fields.  So, can I ask somebody from the GDB
> team to test this patch with that if (dwarf_version >= 5) replaced
> with if (1) and see if it works properly with current GDB as well as say
> 4-5 years old one (e.g. with -gdwarf-2 or -gdwarf-3)?  If yes, we
> should emit it unconditionally.

This all makes sense to me.

> +  if (dwarf_version >= 5)
> +   {
> + tree class_type = lang_hooks.types.get_ptrmemfn_type (type, 0);
> + if (class_type != NULL_TREE)

This can be

if (dwarf_version >= 5)
  if (tree class_type = lang_hooks.types.get_ptrmemfn_type (type, 0))

Jason


Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

2016-11-02 Thread Segher Boessenkool
On Wed, Nov 02, 2016 at 02:51:41PM +0100, Richard Biener wrote:
> >> I don't believe it needs a cleanup on every iteration. One cleanup at
> >> the end should work fine.
> >
> > But as the comment there says:
> >
> >   /* Merge the duplicated blocks into predecessors, when possible.  */
> >   cleanup_cfg (0);
> >
> > (this is not a new comment), and without merging blocks this whole
> > patch does zilch.
> >
> > There is no need to create new basic blocks at all (can replace the
> > final branch in a block with a copy of the whole block it jumps to,
> > instead); and then it is painfully obvious that switching to layout
> > mode here is pointless, too.
> >
> > Do you want me to do that?
> >
> > Btw, this isn't quadratic anyway; it is a constant number (the maximal
> > length allowed of the block to copy) linear.  Unless there are unboundedly
> > many forwarder blocks, which there shouldn't be, but I cannot prove that.
> 
> Well, you iterate calling functions (find candidates, merge, cleanup-cfg) that
> walk the whole function.  So unless the number of iterations is bound
> with a constant I call this quadratic (in function size).

It is bounded (with that caveat above): new candidates will be bigger than
the block merged into it, so there won't be more than
PARAM_MAX_GOTO_DUPLICATION_INSNS passes.

But I can make it all work simpler, in non-layout mode, if you prefer.


Segher


Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Martin Liška
On 11/02/2016 03:51 PM, Jakub Jelinek wrote:
> On Wed, Nov 02, 2016 at 03:38:25PM +0100, Martin Liška wrote:
>> it converts:
>> foo ()
>> {
>>   char a;
>>   char * p;
>>   char _1;
>>   int _2;
>>   int _8;
>>   int _9;
>>
>>   :
>>   ASAN_MARK (2, , 1);
>>   a = 0;
>>   p_6 = 
>>   ASAN_MARK (1, , 1);
>>   _1 = *p_6;
> 
> You shouldn't convert if a is addressable (when ignoring  in ASAN_MARK
> calls).  Only if there is  just in ASAN_MARK and MEM_REF, you can convert.

Sure, which should be done in execute_update_addresses_taken via 
gimple_ior_addresses_taken.

> 
>> to:
>>
>> foo ()
>> {
>>   char a;
>>   char * p;
>>   char _1;
>>   int _2;
>>
>>   :
>>   a_10 = 0;
>>   a_12 = ASAN_POISON ();
>>   _1 = a_12;
>>   if (_1 != 0)
>> goto ;
>>   else
>> goto ;
>>
>>   :
>>
>>   :
>>   # _2 = PHI <1(2), 0(3)>
>>   return _2;
>>
>> }
>>
>> and probably the last goal is to convert the newly added internal fn to a 
>> runtime call.
>> Hope sanopt pass is the right place where to it?
> 
> If ASAN_POISON is ECF_CONST and has any uses during sanopt, perhaps best
> would be to add an artificial variable you give the same name as the
> underlying var of the SSA_NAME (and alignment, locus etc.) and poison it
> right away (keep unpoisoning only to the function epilogue) and then
> ASAN_CHECK replace all uses of that SSA_NAME with ASAN_CHECK + use of
> (D) SSA_NAME.

When I create an ASAN_POISON call in execute_update_addresses_taken, there 
would not
be any ASAN_CHECK generated as it's going to be rewritten to SSA form (like the 
previous
sample I sent).

I like the idea of having a parallel variable, which can be poisoned at the 
very beginning of
a function. Whenever we have a use of the SSA_NAME (like a_12 = ASAN_POISON 
()), we can simply
insert BUILT_IN_ASAN_REPORT_LOADx(_variable) statement. No change 
would be necessary
for ASAN runtime in such case.

Will it work?
Thanks,
Martin


> 
>   Jakub
> 



Re: [PATCH] Fix for big-endian gcc.c-torture/execute/pr55750.c

2016-11-02 Thread Richard Biener
On Wed, 2 Nov 2016, Kyrill Tkachov wrote:

> Hi all,
> 
> I noticed that my patch for PR tree-optimization/78170 broke
> execute.exp=pr55750.c on big-endian.
> The problem with that patch is that we should not forget to clear the padding
> bits in the temporary
> buffer when merging values even when they are less than a byte.
> 
> Tested on aarch64_be-none-elf (the test failure goes away)
> Bootstrap and test on x86_64 ongoing.
> Ok for trunk if successful?

Ok.

Thanks,
Richard.

> David, does this patch allow AIX bootstrap to proceed by any chance?

Crossing fingers ;)

> Thanks,
> Kyrill
> 
> 2016-11-02  Kyrylo Tkachov  
> 
> * gimple-ssa-store-merging.c (encode_tree_to_bitpos): Don't forget to
> clear padding bits even when they're less than a byte.
> 


[PATCH] Fix PR78185

2016-11-02 Thread Richard Biener

The following fixes PR78185 by properly honoring possibly infinite child
loops when computing what blocks are always executed during loop invariant
motion.  Such loops behave as if the loop would exit at this point.

Both GIMPLE and RTL level passes have that very same issue and the 
following fixes the GIMPLE one and simply disables hoisting of possibly
trapping or faulting instructions in the RTL pass ... Eric seems to
remember this might regress gzip so I'm going to put this on one of
our SPEC testers for tonights run as well.  Maybe somebody else
wants to check the performance impact (I'm interested in both,
GIMPLE and RTL change fallout for performance).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

If all goes well I'll followup with the obvious removal of no longer
needed stuff in loop-invariant.c.

Another variant for RTL would be to simply treat all edges entering
a child loop (not only those entering possibly infinitely looping ones)
as an exit.  I think finite_loop_p has no RTL level variant (yet).

Richard.

2016-11-02  Richard Biener  

PR middle-end/78185
* loop-invariant.c (find_invariant_insn): Never hoist trapping or
faulting instructions.
* tree-ssa-loop-im.c: Include tree-ssa-loop-niter.h.
(fill_always_executed_in_1): Honor infinite child loops.

* gcc.dg/pr78185.c: New testcase.

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index 551103f..deb5be6 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1076,7 +1076,7 @@ pre_check_invariant_p (bool simple, rtx dest)
unless the program ends due to a function call.  */
 
 static void
-find_invariant_insn (rtx_insn *insn, bool always_reached, bool always_executed)
+find_invariant_insn (rtx_insn *insn, bool, bool always_executed)
 {
   df_ref ref;
   struct def *def;
@@ -1108,8 +1108,8 @@ find_invariant_insn (rtx_insn *insn, bool always_reached, 
bool always_executed)
   if (can_throw_internal (insn))
 return;
 
-  /* We cannot make trapping insn executed, unless it was executed before.  */
-  if (may_trap_or_fault_p (PATTERN (insn)) && !always_reached)
+  /* We cannot make trapping insn executed.  */
+  if (may_trap_or_fault_p (PATTERN (insn)))
 return;
 
   depends_on = BITMAP_ALLOC (NULL);
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 463db04..0524e57 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "trans-mem.h"
 #include "gimple-fold.h"
 #include "tree-scalar-evolution.h"
+#include "tree-ssa-loop-niter.h"
 
 /* TODO:  Support for predicated code motion.  I.e.
 
@@ -2369,8 +2370,16 @@ fill_always_executed_in_1 (struct loop *loop, sbitmap 
contains_call)
break;
 
  FOR_EACH_EDGE (e, ei, bb->succs)
-   if (!flow_bb_inside_loop_p (loop, e->dest))
- break;
+   {
+ /* If there is an exit from this BB.  */
+ if (!flow_bb_inside_loop_p (loop, e->dest))
+   break;
+ /* Or we enter a possibly non-finite loop.  */
+ if (flow_loop_nested_p (bb->loop_father,
+ e->dest->loop_father)
+ && ! finite_loop_p (e->dest->loop_father))
+   break;
+   }
  if (e)
break;
 
Index: trunk/gcc/testsuite/gcc.dg/pr78185.c
===
--- trunk/gcc/testsuite/gcc.dg/pr78185.c(revision 0)
+++ trunk/gcc/testsuite/gcc.dg/pr78185.c(revision 0)
@@ -0,0 +1,28 @@
+/* { dg-do run { target *-*-linux* *-*-gnu* } } */
+/* { dg-options "-O" } */
+
+#include 
+#include 
+#include 
+
+static char var1 = 0L;
+static char *var2 = 
+
+void do_exit (int i)
+{
+  exit (0);
+}
+
+int main(void)
+{
+  struct sigaction s;
+  sigemptyset (_mask);
+  s.sa_handler = do_exit;
+  s.sa_flags = 0;
+  sigaction (SIGALRM, , NULL);
+  alarm (1);
+  /* The following loop is infinite, the division by zero should not
+ be hoisted out of it.  */
+  for (; (var1 == 0 ? 0 : (100 / var1)) == *var2; );
+  return 0;
+}


[PATCH] Fix for big-endian gcc.c-torture/execute/pr55750.c

2016-11-02 Thread Kyrill Tkachov

Hi all,

I noticed that my patch for PR tree-optimization/78170 broke 
execute.exp=pr55750.c on big-endian.
The problem with that patch is that we should not forget to clear the padding 
bits in the temporary
buffer when merging values even when they are less than a byte.

Tested on aarch64_be-none-elf (the test failure goes away)
Bootstrap and test on x86_64 ongoing.
Ok for trunk if successful?

David, does this patch allow AIX bootstrap to proceed by any chance?

Thanks,
Kyrill

2016-11-02  Kyrylo Tkachov  

* gimple-ssa-store-merging.c (encode_tree_to_bitpos): Don't forget to
clear padding bits even when they're less than a byte.
diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
index 081620e50f603e2de8ed962aec6c619890ce1e33..db3c3c14a5b8938024db8bed80145c77d29396ac 100644
--- a/gcc/gimple-ssa-store-merging.c
+++ b/gcc/gimple-ssa-store-merging.c
@@ -430,7 +430,8 @@ encode_tree_to_bitpos (tree expr, unsigned char *ptr, int bitlen, int bitpos,
  contain a sign bit due to sign-extension).  */
   unsigned int padding
 = byte_size - ROUND_UP (bitlen, BITS_PER_UNIT) / BITS_PER_UNIT - 1;
-  if (padding != 0)
+  if (padding != 0
+  || bitlen % BITS_PER_UNIT != 0)
 {
   /* On big-endian the padding is at the 'front' so just skip the initial
 	 bytes.  */


Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 03:38:25PM +0100, Martin Liška wrote:
> it converts:
> foo ()
> {
>   char a;
>   char * p;
>   char _1;
>   int _2;
>   int _8;
>   int _9;
> 
>   :
>   ASAN_MARK (2, , 1);
>   a = 0;
>   p_6 = 
>   ASAN_MARK (1, , 1);
>   _1 = *p_6;

You shouldn't convert if a is addressable (when ignoring  in ASAN_MARK
calls).  Only if there is  just in ASAN_MARK and MEM_REF, you can convert.

> to:
> 
> foo ()
> {
>   char a;
>   char * p;
>   char _1;
>   int _2;
> 
>   :
>   a_10 = 0;
>   a_12 = ASAN_POISON ();
>   _1 = a_12;
>   if (_1 != 0)
> goto ;
>   else
> goto ;
> 
>   :
> 
>   :
>   # _2 = PHI <1(2), 0(3)>
>   return _2;
> 
> }
> 
> and probably the last goal is to convert the newly added internal fn to a 
> runtime call.
> Hope sanopt pass is the right place where to it?

If ASAN_POISON is ECF_CONST and has any uses during sanopt, perhaps best
would be to add an artificial variable you give the same name as the
underlying var of the SSA_NAME (and alignment, locus etc.) and poison it
right away (keep unpoisoning only to the function epilogue) and then
ASAN_CHECK replace all uses of that SSA_NAME with ASAN_CHECK + use of
(D) SSA_NAME.

Jakub


Re: [PATCH 0/3] use rtx_insn * more

2016-11-02 Thread David Malcolm
On Wed, 2016-11-02 at 00:05 -0400, Trevor Saunders wrote:
> On Mon, Oct 31, 2016 at 07:37:54AM -0600, Jeff Law wrote:
> > On 10/28/2016 01:13 PM, tbsaunde+...@tbsaunde.org wrote:
> > > From: Trevor Saunders 
> > > 
> > > HI,
> > > 
> > > This series changes various variables type from rtx to rtx_insn *
> > > so that the
> > > remaining patches in this series
> > > http://gcc.gnu.org/ml/gcc-patches/2016-10/msg01353.html can be
> > > applied.
> > > 
> > > patches bootstrapped and regtested on x86_64-linux-gnu, and run
> > > through config-list.mk, ok?
> > > 
> > > Thanks!
> > > 
> > > Trev
> > > 
> > > Trevor Saunders (3):
> > >   use rtx_insn * in various places where it is obvious
> > >   split up the trial variable in reorg.c:relax_delay_slots to use
> > > rtx_insn * more
> > >   split up some variables to use rtx_insn * more
> > All 3 patches in this series are fine.
> 
> Thanks for the reviews.
> 
> Can I trouble you to look at 
> http://gcc.gnu.org/ml/gcc-patches/2016-10/
> now that it is tested on aarch64?

I think you're missing the full URL here; the above URL is the index
for all of October.

Did you mean this patch:
  https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01358.html
?



Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Martin Liška
On 11/02/2016 02:16 PM, Richard Biener wrote:
> On Wed, Nov 2, 2016 at 2:06 PM, Jakub Jelinek  wrote:
>> On Wed, Nov 02, 2016 at 01:59:00PM +0100, Richard Biener wrote:
 Yeah, that is what I meant.  The issue is how to report uses of such
 SSA_NAME when there is no memory.  So, either we'd need a special runtime
 library entrypoint that would report uses after scope even when there is no
 underlying memory, or we'd need to force it at asan pass time into memory 
 again.
>>>
>>> Well, there can't be any uses outside the scope -- there are no (memory) 
>>> uses
>>> left if we rewrite the thing into SSA.  That is, the address can no
>>> longer "escape".
>>>
>>> Of course there could have been invalid uses before the rewrite into SSA.  
>>> But
>>> those can be diagnosed either immediately before or after re-writing into 
>>> SSA
>>> at compile-time (may be in dead code regions of course).
>>
>> Sure, we can warn on those at compile time, but we really should arrange to
>> error on those at runtime if they are ever executed, the UB happens only at
>> runtime, so in dead code isn't fatal.
> 
> Then we can replace those uses with a call into the asan runtime diagnosing 
> the
> issue instead?
> 
> Richard.
> 
>> Jakub

OK, thanks for the clarification, it's more clear to me. So we want to consider 
for
SSA transformation of ASAN_MARK only is_gimple_reg_types. I'm having a 
test-case where
it converts:
foo ()
{
  char a;
  char * p;
  char _1;
  int _2;
  int _8;
  int _9;

  :
  ASAN_MARK (2, , 1);
  a = 0;
  p_6 = 
  ASAN_MARK (1, , 1);
  _1 = *p_6;
  if (_1 != 0)
goto ;
  else
goto ;

  :
  _9 = 1;
  goto ;

  :
  _8 = 0;

  :
  # _2 = PHI <_9(3), _8(4)>
  return _2;

}

to:

foo ()
{
  char a;
  char * p;
  char _1;
  int _2;

  :
  a_10 = 0;
  a_12 = ASAN_POISON ();
  _1 = a_12;
  if (_1 != 0)
goto ;
  else
goto ;

  :

  :
  # _2 = PHI <1(2), 0(3)>
  return _2;

}

and probably the last goal is to convert the newly added internal fn to a 
runtime call.
Hope sanopt pass is the right place where to it?

Thanks,
Martin



Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 03:27:42PM +0100, Martin Liška wrote:
> > So is there anything I should do wrt -Wswitch-unreachable?
> > 
> > Marek
> > 
> 
> Probably not. I'm having a patch puts GIMPLE_SWITCH statement to a proper 
> place
> in GIMPLE_BIND. Let's see whether such patch can bootstrap and survive 
> regression
> tests.

Please do that only for -fsanitize-use-after-scope, it will likely affect at
least for -O0 the debugging experience.  For -O0 -fsanitize=address 
-fsanitize-use-after-scope
perhaps we could arrange for some extra stmt to have the locus of the
switch (where we still don't want the vars to appear in scope) and then
have no locus on the ASAN_MARK and actual GIMPLE_SWITCH or something
similar.

Jakub


[PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those

2016-11-02 Thread Jakub Jelinek
Hi!

This is a merge of my patch from yesterday, Jason's incremental patch to
that and parts of Alex' patch from Oct 19.

It uses Alex' LANG_HOOKS_GET_PTRMEMFN_TYPE langhook.  I've tried
to think about https://gcc.gnu.org/ml/gcc-patches/2011-05/msg00227.html
and we even have such a langhook now, modified_type_die
uses lang_hooks.types.get_debug_type, but
1) it is just called in modified_type_die and not in
   gen_type_die_with_usage, that looks weird
2) it is used for something Ada-ish I really don't know how to test etc.
   to be able to find out if it is safe to call it in
   gen_type_die_with_usage too
3) most importantly, if the C++ version of this langhook would create
   OFFSET_TYPE on the fly, I don't know how to ensure effective sharing
   of DW_TAG_ptr_to_member_type nodes with the same DW_AT_type
   and DW_AT_containing_type; unless the C++ langhook adds some extra
   hash table that caches already created OFFSET_TYPEs or something similar,
   it would create a new OFFSET_TYPE each time it is called

Also, I really don't know how well does GDB (especially older releases)
handle DW_TAG_ptr_to_member_type for PMF, so the patch wraps that currently
with if (dwarf_version >= 5).  Quick grep revealed that GDB has code to
handle the __pfn/__delta fields.  So, can I ask somebody from the GDB
team to test this patch with that if (dwarf_version >= 5) replaced
with if (1) and see if it works properly with current GDB as well as say
4-5 years old one (e.g. with -gdwarf-2 or -gdwarf-3)?  If yes, we
should emit it unconditionally.

Bootstrapped/regtested on x86_64-linux and i686-linux, on the new ref-3.C
testcase emits the number of DIEs I really expect to.  Ok for trunk?

2016-11-02  Jakub Jelinek  
Alexandre Oliva  
Jason Merrill  

PR debug/28767
PR debug/56974
* langhooks.h (struct lang_hooks_for_types): Document type_hash_eq
being also called on METHOD_TYPEs.  Add type_dwarf_attribute and
get_ptrmemfn_type langhooks.
* langhooks.c (lhd_type_dwarf_attribute): New function.
* langhooks-def.h (lhd_type_dwarf_attribute): Declare.
(LANG_HOOKS_TYPE_DWARF_ATTRIBUTE, LANG_HOOKS_GET_PTRMEMFN_TYPE):
Define.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Add
LANG_HOOKS_TYPE_DWARF_ATTRIBUTE and LANG_HOOKS_GET_PTRMEMFN_TYPE.
* hooks.h (hook_tree_const_tree_int_null): Declare.
* hooks.c (hook_tree_const_tree_int_null): New function.
* tree.h (check_lang_type): Declare.
* tree.c (check_lang_type): New function.
(check_qualified_type, check_aligned_type): Call it.
* dwarf2out.c (modified_type_die): Don't use type_main_variant
for FUNCTION_TYPE or METHOD_TYPE, instead walk over variants with
check_base_type and check_lang_type.
(gen_ptr_to_mbr_type_die): Add class_type and member_type arguments,
allow type to be something other than OFFSET_TYPE, if lookup_type_die
is already non-NULL, return early.  For OFFSET_TYPE add
DW_AT_use_location attribute.
(gen_subroutine_type_die): Add DW_AT_{,rvalue_}reference attribute
if needed.
(gen_type_die_with_usage): Don't use type_main_variant
for FUNCTION_TYPE or METHOD_TYPE, instead walk over variants with
check_base_type and check_lang_type.  Formatting fixes.
Adjust gen_ptr_to_mbr_type_die caller.  Handle PMF.
cp/
* tree.c (cp_check_qualified_type): Use check_base_type and
TYPE_QUALS comparison instead of check_qualified_type.
(cxx_type_hash_eq): Return false if type_memfn_rqual don't match.
* cp-objcp-common.c (cp_get_ptrmemfn_type): New function.
(cp_decl_dwarf_attribute): Don't handle types here.
(cp_type_dwarf_attribute): New function.
* cp-objcp-common.h (cp_get_ptrmemfn_type, cp_type_dwarf_attribute):
Declare.
(LANG_HOOKS_GET_PTRMEMFN_TYPE, LANG_HOOKS_TYPE_DWARF_ATTRIBUTE):
Define.
testsuite/
* g++.dg/debug/dwarf2/ptrdmem-1.C: New test.
* g++.dg/debug/dwarf2/ref-3.C: New test.
* g++.dg/debug/dwarf2/refqual-1.C: New test.
* g++.dg/debug/dwarf2/refqual-2.C: New test.

--- gcc/langhooks.h.jj  2016-11-01 15:18:44.994506161 +0100
+++ gcc/langhooks.h 2016-11-02 11:43:51.905046755 +0100
@@ -120,7 +120,7 @@ struct lang_hooks_for_types
   /* Return TRUE if TYPE1 and TYPE2 are identical for type hashing purposes.
  Called only after doing all language independent checks.
  At present, this function is only called when both TYPE1 and TYPE2 are
- FUNCTION_TYPEs.  */
+ FUNCTION_TYPE or METHOD_TYPE.  */
   bool (*type_hash_eq) (const_tree, const_tree);
 
   /* Return TRUE if TYPE uses a hidden descriptor and fills in information
@@ -162,6 +162,15 @@ struct lang_hooks_for_types
  for the debugger about scale factor, etc.  */
   bool (*get_fixed_point_type_info) 

Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Martin Liška
On 11/02/2016 03:20 PM, Marek Polacek wrote:
> On Wed, Nov 02, 2016 at 11:10:53AM +0100, Jakub Jelinek wrote:
>> On Wed, Nov 02, 2016 at 10:59:26AM +0100, Jakub Jelinek wrote:
 Which is gimplified as:

 int * ptr;

 switch (argc) , case 1: >
 {
   int a;

   try
 {
   ASAN_MARK (2, , 4);
   :
   goto ;
   :
   ptr = 
   goto ;
 }
   finally
 {
   ASAN_MARK (1, , 4);
 }
>>
>>> Shouldn't there be also ASAN_MARK on the implicit gotos from the switch
>>> statement?  Otherwise, consider this being done in a loop, after the first
>>> iteration you ASAN_MARK (1, , 4) (i.e. poison), then you iterate say with
>>> args 1 and have in case 1: a = 0;, won't that trigger runtime error?
>>
>> Wonder if for the variables declared inside of switch body, because we don't
>> care about uses before scope, it wouldn't be more efficient to arrange for
>> int *p, *q, *r;
>> switch (x)
>>   {
>> int a;
>>   case 1:
>> p = 
>> a = 5;
>> break;
>> int b;
>>   case 2:
>> int c;
>> q = 
>> r = 
>> b = 3;
>> c = 4;
>> break;
>>   }
>> to effectively ASAN_MARK (2, , 4); ASAN_MARK (2, , 4); ASAN_MARK (2, , 
>> 4);
>> before the GIMPLE_SWITCH, instead of unpoisoning them on every case label
>> where they might be in scope.  Though, of course, at least until lower pass
>> that is quite ugly, because it would refer to "not yet declared" variables.
>> Perhaps we'd need to move the ASAN_MARK and GIMPLE_SWITCH (but of course not
>> the expression evaluation of the switch control expression) inside of the
>> switches' GIMPLE_BIND.
> 
> So is there anything I should do wrt -Wswitch-unreachable?
> 
>   Marek
> 

Probably not. I'm having a patch puts GIMPLE_SWITCH statement to a proper place
in GIMPLE_BIND. Let's see whether such patch can bootstrap and survive 
regression
tests.

Thanks,
Martin


[PATCH] Generate reproducible output independently of the build-path

2016-11-02 Thread Ximin Luo
(Please keep me on CC, I am not subscribed)

Background
==

We are on a long journey to make build processes be able to reproduce the build
outputs independently of which filesystem path the build is being executed from
- e.g. if the executing user doesn't have root access to be able to execute it
under a standard path such as /build. This currently is making about 2k-3k [1]
packages in Debian unreproducible when build-paths are varied across builds.

[1] 
https://tests.reproducible-builds.org/debian/issues/unstable/captures_build_path_issue.html

Previous attempts have involved using -fdebug-prefix-map to strip away the
prefix of an absolute path, leaving behind the part relative to the top-level
directory of the source code, which is reproducible. But this flag was itself
stored in DW_AT_producer, which makes the final output unreproducible. This was
pointed out in bug 68848 and fixed in r231835.

However, through more testing, we have found out that the fix just mentioned is
not enough to achieve reproducibility in practice. The main issue is that many
different packages like to store CFLAGS et. al. in arbitrary ways. So if we add
an explicit -fdebug-prefix-map flag to the system-level CFLAGS etc, this will
often propagate into the build result, making it again dependent on the
build-path, and not reproducible. For example:

Some packages embed compiler flags into their pkg-config files (or equivalent):
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/curl.html
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/perl.html
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/qt4-x11.html
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/fflas-ffpack.html
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/sip4.html

Other packages embed compiler flags directly into the binary:
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/singular.html
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/mutt.html

etc etc.

We think it's not appropriate to patch all (3k+) of these packages to strip out
-fdebug-prefix-map flags. This would involve adding quite complex logic to
everyone's build scripts, and we have to adapt this logic every single time to
that particular package. Also, in general CFLAGS is *supposed* to affect the
compiler output, and saving it unconditionally is quite a reasonable thing for
packages to do. If we tried to patch all of these packages, we would be turning
"reproducible builds" in to a costly opt-in feature, rather than on-by-default
that everyone can easily benefit from.

So, we believe it is better to patch GCC centrally. Our proposed solution is 
similar to (a) the SOURCE_DATE_EPOCH environment variable which was previously 
accepted into GCC and was used to successfully make 400+ packages reproducible, 
and (b) the -fdebug-prefix-map mechanism that already exists in GCC and which 
nearly but not quite, achieves at-scale build-path-independent reproducibility.

Proposal


This patch series adds a new environment variable SOURCE_PREFIX_MAP. When this
is set, GCC will treat this as an implicit "-fdebug-prefix-map=$value"
command-line argument. This makes the final binary output reproducible, and
also hides the unreproducible value (the build path prefix) from CFLAGS et. al.
which everyone is (understandably) embedding as-is into their build output.

This environment variable also acts on the __FILE__ macro, mapping it in the
same way that debug-prefix-map works for debug symbols. We have seen that
__FILE__ is also a very large source of unreproducibility, and is represented
quite heavily in the 3k+ figure given above.

Finally, we tweak the __TIMESTAMP__ macro so it honours SOURCE_DATE_EPOCH, in a
similar way to how __DATE__ and __TIME__ do so already.

More details are given in the headers of the patch files themselves.

Testing
===

I've tested these patches on a Debian testing/unstable x86_64-linux-gnu system.
So far I've only run the new tests that this patch adds, on a disable-bootstrap
build. I will do a full bootstrap and run the full testsuite over the next few
days, both with and without this patch, and report back.

Copyright disclaimer


I dedicate these patches to the public domain by waiving all of my rights to
the work worldwide under copyright law, including all related and neighboring
rights, to the extent allowed by law.

See https://creativecommons.org/publicdomain/zero/1.0/legalcode for full text.

Please let me know if the above is insufficient and I will be happy to sign any
relevant forms.


[PATCH 3/4] Use SOURCE_PREFIX_MAP envvar to transform __FILE__

2016-11-02 Thread Ximin Luo
Honour the SOURCE_PREFIX_MAP environment variable when expanding the __FILE__
macro, in the same way that debug-prefix-map works for debugging symbol paths.

This patch follows similar lines to the earlier patch for SOURCE_DATE_EPOCH.
Specifically, we read the environment variable not in libcpp but via a hook
which has an implementation defined in gcc/c-family. However, to achieve this
is more complex than the earlier patch: we need to share the prefix_map data
structure and associated functions between libcpp and c-family. Therefore, we
need to move these to libiberty. (For comparison, the SOURCE_DATE_EPOCH patch
did not need this because time_t et. al. are in the standard C library.)

Acknowledgements


Dhole  who wrote the earlier patch for SOURCE_DATE_EPOCH
which saved me a lot of time on figuring out what to edit.

ChangeLogs
--

include/ChangeLog:

2016-11-01  Ximin Luo  

* prefix-map.h: New file, mostly derived from /gcc/final.c.

libiberty/ChangeLog:

2016-11-01  Ximin Luo  

* prefix-map.c: New file, mostly derived from /gcc/final.c.
* Makefile.in: Update for new files.

gcc/ChangeLog:

2016-11-01  Ximin Luo  

* final.c: Generalise and refactor code related to debug_prefix_map.
Move some of it to /libiberty/prefix-map.c, /include/prefix-map.h
and refactor the remaining code to use the moved-out things.
* doc/invoke.texi (Environment Variables): Update SOURCE_PREFIX_MAP to
describe how it affects __FILE__ expansion.

gcc/c-family/ChangeLog:

2016-11-01  Ximin Luo  

* c-common.c (cb_get_source_prefix_map): Define new call target.
* c-common.h (cb_get_source_prefix_map): Declare call target.
* c-lex.c (init_c_lex): Set the get_source_prefix_map callback.

libcpp/ChangeLog:

2016-11-01  Ximin Luo  

* include/cpplib.h (cpp_callbacks): Add get_source_prefix_map
callback.
* init.c (cpp_create_reader): Initialise source_prefix_map field.
* internal.h (cpp_reader): Add new field source_prefix_map.
* macro.c (_cpp_builtin_macro_text): Set the source_prefix_map field
if unset and apply it to the __FILE__ macro.

gcc/testsuite/ChangeLog:

2016-11-01  Ximin Luo  

* gcc.dg/cpp/source_prefix_map-1.c: New test.
* gcc.dg/cpp/source_prefix_map-2.c: New test.

Index: gcc-7-20161030/include/prefix-map.h
===
--- /dev/null
+++ gcc-7-20161030/include/prefix-map.h
@@ -0,0 +1,71 @@
+/* Declarations for manipulating filename prefixes.
+
+   Copyright (C) 2016 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+#ifndef _PREFIX_MAP_H
+#define _PREFIX_MAP_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef HAVE_STDLIB_H
+#include 
+#endif
+
+/* Linked-list of mappings from old prefixes to new prefixes.  */
+
+struct prefix_map
+{
+  const char *old_prefix;
+  const char *new_prefix;
+  size_t old_len;
+  size_t new_len;
+  struct prefix_map *next;
+};
+
+/* Parse a single prefix-map.
+
+   The string `arg' is split at the final '=' character. The part before
+   it is used to set `map->old_prefix' and `map->old_len', and the part
+   after it is used to set `map->new_prefix' and `map->new_len'.
+
+   If `arg' does not contain a '=' then 0 is returned. Otherwise, a
+   non-zero value is returned.
+   */
+
+extern int parse_prefix_map (const char *arg, struct prefix_map *map);
+
+/* Perform mapping of filename prefixes.
+
+   Return the filename corresponding to `old_name'. The return value is
+   equal to `old_name' if no transformation occurred, else it is equal
+   to `new_name' where the new filename is stored.
+
+   On entry into this function, `new_name' must be able to hold at least
+   `(old_name - map->old_len + map->old_len + 1)' characters, where
+   `map' is the mapping that will be selected and performed.
+   */
+
+extern const char *apply_prefix_map (const char *old_name, char *new_name,
+struct prefix_map *map_head);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _PREFIX_MAP_H */
Index: 

[PATCH 1/4] Use SOURCE_PREFIX_MAP envvar as an implicit debug-prefix-map

2016-11-02 Thread Ximin Luo
Define the SOURCE_PREFIX_MAP environment variable, and treat it as an implicit
CLI -fdebug-prefix-map option specified before any explicit such options.

Acknowledgements


Daniel Kahn Gillmor who wrote the patch for r231835, which saved me a lot of
time figuring out what to edit.

HW42 for discussion on the details of the proposal, and for suggesting that we
retain the ability to map the prefix to something other than ".".

ChangeLogs
--

gcc/ChangeLog:

2016-11-01  Ximin Luo  

* opts-global.c (add_debug_prefix_map_from_envvar): Add the value of
SOURCE_PREFIX_MAP as a debug_prefix_map if the former is set.
(handle_common_deferred_options): Call add_debug_prefix_map_from_envvar
before processing options.
* final.c: (add_debug_prefix_map): Be less specific in the error 
message,
since it can also be triggered by the SOURCE_PREFIX_MAP variable.
* doc/invoke.texi (Environment Variables): Document form and behaviour 
of
SOURCE_PREFIX_MAP.

gcc/testsuite/ChangeLog:

2016-11-01  Ximin Luo  

* gcc.dg/debug/dwarf2/source_prefix_map-1.c: New test.
* gcc.dg/debug/dwarf2/source_prefix_map-2.c: New test.

Index: gcc-7-20161030/gcc/opts-global.c
===
--- gcc-7-20161030.orig/gcc/opts-global.c
+++ gcc-7-20161030/gcc/opts-global.c
@@ -310,6 +310,21 @@ decode_options (struct gcc_options *opts
   finish_options (opts, opts_set, loc);
 }
 
+/* Add a debug-prefix-map using the SOURCE_PREFIX_MAP environment variable if
+   it is set.  */
+
+static void
+add_debug_prefix_map_from_envvar ()
+{
+  char *prefix_map;
+
+  prefix_map = getenv ("SOURCE_PREFIX_MAP");
+  if (!prefix_map)
+return;
+
+  add_debug_prefix_map (prefix_map);
+}
+
 /* Hold command-line options associated with stack limitation.  */
 const char *opt_fstack_limit_symbol_arg = NULL;
 int opt_fstack_limit_register_no = -1;
@@ -335,6 +350,8 @@ handle_common_deferred_options (void)
   if (flag_opt_info)
 opt_info_switch_p (NULL);
 
+  add_debug_prefix_map_from_envvar ();
+
   FOR_EACH_VEC_ELT (v, i, opt)
 {
   switch (opt->opt_index)
Index: gcc-7-20161030/gcc/final.c
===
--- gcc-7-20161030.orig/gcc/final.c
+++ gcc-7-20161030/gcc/final.c
@@ -1531,7 +1531,7 @@ add_debug_prefix_map (const char *arg)
   p = strchr (arg, '=');
   if (!p)
 {
-  error ("invalid argument %qs to -fdebug-prefix-map", arg);
+  error ("invalid value %qs for debug-prefix-map", arg);
   return;
 }
   map = XNEW (debug_prefix_map);
Index: gcc-7-20161030/gcc/doc/invoke.texi
===
--- gcc-7-20161030.orig/gcc/doc/invoke.texi
+++ gcc-7-20161030/gcc/doc/invoke.texi
@@ -26222,6 +26222,23 @@ Recognize EUCJP characters.
 If @env{LANG} is not defined, or if it has some other value, then the
 compiler uses @code{mblen} and @code{mbtowc} as defined by the default locale 
to
 recognize and translate multibyte characters.
+
+@item SOURCE_PREFIX_MAP If this variable is set, it specifies a mapping
+that is used to transform filepaths that are output in debug symbols.
+This helps the embedded paths become reproducible, without having the
+unreproducible value be visible in other input sources - such as GCC
+command-line flags or standardised build-time environment variables like
+@code{CFLAGS} - that are commonly legitimately-embedded in the build
+output by higher-level build processes.
+
+The form and behaviour is similar to @option{-fdebug-prefix-map}. That
+is, the value of @env{SOURCE_PREFIX_MAP} must be of the form
+@samp{@var{old}=@var{new}}. The split occurs on the first @code{=}
+character, so that @var{old} cannot itself contain a @code{=}.
+
+Whenever an absolute source- or build-related filepath is to be emitted
+in a final end-result output, GCC will replace @var{old} with @var{new}
+if that filepath starts with @var{old}.
 @end table
 
 @noindent
Index: gcc-7-20161030/gcc/testsuite/gcc.dg/debug/dwarf2/source_prefix_map-1.c
===
--- /dev/null
+++ gcc-7-20161030/gcc/testsuite/gcc.dg/debug/dwarf2/source_prefix_map-1.c
@@ -0,0 +1,9 @@
+/* DW_AT_comp_dir should be relative if SOURCE_PREFIX_MAP is a prefix of it. */
+/* { dg-do compile } */
+/* { dg-options "-gdwarf -dA" } */
+/* { dg-set-compiler-env-var SOURCE_PREFIX_MAP "[file dirname 
[pwd]]=DWARF2TEST" } */
+/* { dg-final { scan-assembler-dem "DW_AT_comp_dir: \"DWARF2TEST/gcc" } } */
+
+void func (void)
+{
+}
Index: gcc-7-20161030/gcc/testsuite/gcc.dg/debug/dwarf2/source_prefix_map-2.c
===
--- /dev/null
+++ gcc-7-20161030/gcc/testsuite/gcc.dg/debug/dwarf2/source_prefix_map-2.c
@@ -0,0 +1,8 @@
+/* DW_AT_comp_dir should be absolute if SOURCE_PREFIX_MAP is not 

[PATCH 4/4] Use SOURCE_DATE_EPOCH in place of __TIMESTAMP__ if the latter is newer.

2016-11-02 Thread Ximin Luo
This brings the behaviour in line with the __DATE__ and __TIME__ macros. Note
though the minor difference: __TIMESTAMP__ is defined as the file-modification
date and not the "current" date or time like the other two macros. Therefore,
we do a clamping behaviour (similar to tar --clamp-mtime).

Acknowledgements


Reiner Herrmann suggested the clamping behaviour.

ChangeLogs
--

libcpp/ChangeLog:

2016-11-01  Ximin Luo  

* macro.c (_cpp_builtin_macro_text): Use SOURCE_DATE_EPOCH in place of
__TIMESTAMP__ if the latter is newer than the former.

gcc/ChangeLog:

2016-11-01  Ximin Luo  

* doc/cppenv.texi (Environment Variables): Update SOURCE_DATE_EPOCH to
describe the new effect on __TIMESTAMP__.

gcc/testsuite/ChangeLog:

2016-11-01  Ximin Luo  

* gcc.dg/cpp/source_date_epoch-4.c: New test.
* gcc.dg/cpp/source_date_epoch-5.c: New test.

Index: gcc-7-20161030/libcpp/macro.c
===
--- gcc-7-20161030.orig/libcpp/macro.c
+++ gcc-7-20161030/libcpp/macro.c
@@ -264,7 +264,30 @@ _cpp_builtin_macro_text (cpp_reader *pfi
struct tm *tb = NULL;
struct stat *st = _cpp_get_file_stat (file);
if (st)
- tb = localtime (>st_mtime);
+ {
+   /* Set a reproducible timestamp for __DATE__ and __TIME__ 
macro
+  if SOURCE_DATE_EPOCH is defined.  */
+   if (pfile->source_date_epoch == (time_t) -2
+   && pfile->cb.get_source_date_epoch != NULL)
+ pfile->source_date_epoch = 
pfile->cb.get_source_date_epoch (pfile);
+
+   if (pfile->source_date_epoch >= (time_t) 0)
+ {
+   /* If SOURCE_DATE_EPOCH is set, we want reproducible
+  timestamps so use gmtime not localtime.  */
+   if (st->st_mtime >= pfile->source_date_epoch)
+ tb = gmtime (>source_date_epoch);
+   else
+ /* Don't use SOURCE_DATE_EPOCH if the timestamp is
+older, since that means it was probably already
+set in a reproducible way (e.g. via source tarball
+extraction or some other method). */
+ tb = gmtime (>st_mtime);
+ }
+   else
+ tb = localtime (>st_mtime);
+ }
+
if (tb)
  {
char *str = asctime (tb);
Index: gcc-7-20161030/gcc/doc/cppenv.texi
===
--- gcc-7-20161030.orig/gcc/doc/cppenv.texi
+++ gcc-7-20161030/gcc/doc/cppenv.texi
@@ -83,8 +83,9 @@ main input file is omitted.
 @item SOURCE_DATE_EPOCH
 If this variable is set, its value specifies a UNIX timestamp to be
 used in replacement of the current date and time in the @code{__DATE__}
-and @code{__TIME__} macros, so that the embedded timestamps become
-reproducible.
+and @code{__TIME__} macros, and in replacement of the file modification
+date in the @code{__TIMESTAMP__} macro if the latter is newer than the
+former, so that the embedded timestamps become reproducible.
 
 The value of @env{SOURCE_DATE_EPOCH} must be a UNIX timestamp,
 defined as the number of seconds (excluding leap seconds) since
Index: gcc-7-20161030/gcc/testsuite/gcc.dg/cpp/source_date_epoch-4.c
===
--- /dev/null
+++ gcc-7-20161030/gcc/testsuite/gcc.dg/cpp/source_date_epoch-4.c
@@ -0,0 +1,13 @@
+/* __TIMESTAMP__ should use SOURCE_DATE_EPOCH if the latter is older. */
+/* { dg-do run } */
+/* { dg-set-compiler-env-var TZ "UTC" } */
+/* { dg-set-compiler-env-var LC_ALL "C" } */
+/* { dg-set-compiler-env-var SOURCE_DATE_EPOCH "0" } */
+
+int
+main ()
+{
+  if (__builtin_strcmp (__TIMESTAMP__, "Thu Jan  1 00:00:00 1970") != 0)
+__builtin_abort ();
+  return 0;
+}
Index: gcc-7-20161030/gcc/testsuite/gcc.dg/cpp/source_date_epoch-5.c
===
--- /dev/null
+++ gcc-7-20161030/gcc/testsuite/gcc.dg/cpp/source_date_epoch-5.c
@@ -0,0 +1,13 @@
+/* __TIMESTAMP__ should not use SOURCE_DATE_EPOCH if the latter is newer. */
+/* { dg-do run } */
+/* { dg-set-compiler-env-var TZ "UTC" } */
+/* { dg-set-compiler-env-var LC_ALL "C" } */
+/* { dg-set-compiler-env-var SOURCE_DATE_EPOCH "253402300799" } */
+
+int
+main ()
+{
+  if (__builtin_strcmp (__TIMESTAMP__, "Fri Dec 31 23:59:59 UTC ") == 0)
+__builtin_abort ();
+  return 0;
+}


[PATCH 2/4] Split prefix-map values on the last '=' character, not the first

2016-11-02 Thread Ximin Luo
We are planning to ask other tools to support SOURCE_PREFIX_MAP, in the same
way that we have already done this for SOURCE_DATE_EPOCH. So, this will last
for some time and have quite wide reach. Consequently, we believe it is better
to split on the final '=', since it is much less likely to result in problems.

For example, with the previous behaviour (splitting on the first '=') it would
not be possible to map paths like the following:

C:\Documents and Settings\Betrand Russell\Proofs of 1+1=2\Automated 
Proofs\Source Code\
/srv/net/distributed-hash-table/address/VaIWP8YIWDChR2O76yiufXBsbw5g2skB_kp9VP-qVLvydovdGw==/projects/gcc-6/

These are contrived examples, but hopefully you can agree that they're not *so*
unrealistic - future software or users might plausibly wish to run reproducible
build processes underneath paths similar to these.

On the other hand, I can think of much fewer cases where the new-prefix *must*
include a '=' character. I can't think of any software project that includes
it, and I'd imagine that any such (or future) projects that might exist would
already have standardised "ASCII-only" versions of its name.

ChangeLogs
--

gcc/ChangeLog:

2016-11-01  Ximin Luo  

* final.c: (add_debug_prefix_map): Split on the last and not first '='.
* doc/invoke.texi (Environment Variables): Update SOURCE_PREFIX_MAP to
describe new parsing.

Index: gcc-7-20161030/gcc/final.c
===
--- gcc-7-20161030.orig/gcc/final.c
+++ gcc-7-20161030/gcc/final.c
@@ -1528,7 +1528,7 @@ add_debug_prefix_map (const char *arg)
   debug_prefix_map *map;
   const char *p;
 
-  p = strchr (arg, '=');
+  p = strrchr (arg, '=');
   if (!p)
 {
   error ("invalid value %qs for debug-prefix-map", arg);
Index: gcc-7-20161030/gcc/doc/invoke.texi
===
--- gcc-7-20161030.orig/gcc/doc/invoke.texi
+++ gcc-7-20161030/gcc/doc/invoke.texi
@@ -26233,8 +26233,8 @@ output by higher-level build processes.
 
 The form and behaviour is similar to @option{-fdebug-prefix-map}. That
 is, the value of @env{SOURCE_PREFIX_MAP} must be of the form
-@samp{@var{old}=@var{new}}. The split occurs on the first @code{=}
-character, so that @var{old} cannot itself contain a @code{=}.
+@samp{@var{old}=@var{new}}. The split occurs on the last @code{=}
+character, so that @var{new} cannot itself contain a @code{=}.
 
 Whenever an absolute source- or build-related filepath is to be emitted
 in a final end-result output, GCC will replace @var{old} with @var{new}


Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Marek Polacek
On Wed, Nov 02, 2016 at 11:10:53AM +0100, Jakub Jelinek wrote:
> On Wed, Nov 02, 2016 at 10:59:26AM +0100, Jakub Jelinek wrote:
> > > Which is gimplified as:
> > > 
> > > int * ptr;
> > > 
> > > switch (argc) , case 1: >
> > > {
> > >   int a;
> > > 
> > >   try
> > > {
> > >   ASAN_MARK (2, , 4);
> > >   :
> > >   goto ;
> > >   :
> > >   ptr = 
> > >   goto ;
> > > }
> > >   finally
> > > {
> > >   ASAN_MARK (1, , 4);
> > > }
> 
> > Shouldn't there be also ASAN_MARK on the implicit gotos from the switch
> > statement?  Otherwise, consider this being done in a loop, after the first
> > iteration you ASAN_MARK (1, , 4) (i.e. poison), then you iterate say with
> > args 1 and have in case 1: a = 0;, won't that trigger runtime error?
> 
> Wonder if for the variables declared inside of switch body, because we don't
> care about uses before scope, it wouldn't be more efficient to arrange for
> int *p, *q, *r;
> switch (x)
>   {
> int a;
>   case 1:
> p = 
> a = 5;
> break;
> int b;
>   case 2:
> int c;
> q = 
> r = 
> b = 3;
> c = 4;
> break;
>   }
> to effectively ASAN_MARK (2, , 4); ASAN_MARK (2, , 4); ASAN_MARK (2, , 
> 4);
> before the GIMPLE_SWITCH, instead of unpoisoning them on every case label
> where they might be in scope.  Though, of course, at least until lower pass
> that is quite ugly, because it would refer to "not yet declared" variables.
> Perhaps we'd need to move the ASAN_MARK and GIMPLE_SWITCH (but of course not
> the expression evaluation of the switch control expression) inside of the
> switches' GIMPLE_BIND.

So is there anything I should do wrt -Wswitch-unreachable?

Marek


Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module

2016-11-02 Thread Jiong Wang

On 02/11/16 13:42, Jakub Jelinek wrote:

On Wed, Nov 02, 2016 at 01:26:48PM +, Jiong Wang wrote:

-/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. */
+/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note.  
*/


Too long line.


Hmm, it shows 80 columns under my editor.  I guess '+' is counted in?




+/* RTL sequences inside PARALLEL are raw expression representation.
+
+   mem_loc_descriptor can be used to build generic DWARF expressions for
+   DW_CFA_expression and DW_CFA_val_expression where the expression may can
+   not be represented using normal RTL sequences.  In this case, group all
+   expression operations (DW_OP_*) inside a PARALLEL.  For those DW_OP 
which
+   doesn't have RTL mapping, wrap it using UNSPEC.  The logic for parsing
+   PARALLEL sequences is:
+
+   foreach elem inside PARALLEL
+ if (elem is UNSPEC)
+   dw_op =  XINT (elem, 1) (DWARF operation is kept as UNSPEC number)
+   oprnd1 = XVECEXP (elem, 0, 0)
+   oprnd2 = XVECEXP (elem, 0, 1)
+ else
+   call mem_loc_descriptor  */


Not sure if it is a good idea to document in weirdly formatted
pseudo-language what the code actually does a few lines below.  IMHO either
express it in words, or don't express it at all.


OK, fixed. I replaced these comments as some brief words.




+   exp_result =
+ new_loc_descr ((enum dwarf_location_atom) dw_op, oprnd1,
+oprnd2);


Wrong formatting, = should be on the next line.


+ }
+   else
+ exp_result =
+   mem_loc_descriptor (elem, mode, mem_mode,
+   VAR_INIT_STATUS_INITIALIZED);


Likewise.


Both fixed. Patch updated, please review.

Thanks.

gcc/
2016-11-02  Jiong Wang  

* reg-notes.def (CFA_VAL_EXPRESSION): New entry.
* dwarf2cfi.c (dwarf2out_frame_debug_cfa_val_expression): New function.
(dwarf2out_frame_debug): Support REG_CFA_VAL_EXPRESSION.
(output_cfa_loc): Support DW_CFA_val_expression.
(output_cfa_loc_raw): Likewise.
(output_cfi): Likewise.
(output_cfi_directive): Likewise.
* dwarf2out.c (dw_cfi_oprnd1_desc): Support DW_CFA_val_expression.
(dw_cfi_oprnd2_desc): Likewise.
(mem_loc_descriptor): Recognize new pattern generated for value
expression.

commit 36de0173c17efcc30c56ef10304377e71313e8bc
Author: Jiong Wang 
Date:   Wed Oct 19 15:42:04 2016 +0100

dwarf val expression

diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index 6491d5a..b8c88fb 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -1235,7 +1235,7 @@ dwarf2out_frame_debug_cfa_register (rtx set)
   reg_save (sregno, dregno, 0);
 }
 
-/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. */
+/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note.  */
 
 static void
 dwarf2out_frame_debug_cfa_expression (rtx set)
@@ -1267,6 +1267,29 @@ dwarf2out_frame_debug_cfa_expression (rtx set)
   update_row_reg_save (cur_row, regno, cfi);
 }
 
+/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_VAL_EXPRESSION
+   note.  */
+
+static void
+dwarf2out_frame_debug_cfa_val_expression (rtx set)
+{
+  rtx dest = SET_DEST (set);
+  gcc_assert (REG_P (dest));
+
+  rtx span = targetm.dwarf_register_span (dest);
+  gcc_assert (!span);
+
+  rtx src = SET_SRC (set);
+  dw_cfi_ref cfi = new_cfi ();
+  cfi->dw_cfi_opc = DW_CFA_val_expression;
+  cfi->dw_cfi_oprnd1.dw_cfi_reg_num = dwf_regno (dest);
+  cfi->dw_cfi_oprnd2.dw_cfi_loc
+= mem_loc_descriptor (src, GET_MODE (src),
+			  GET_MODE (dest), VAR_INIT_STATUS_INITIALIZED);
+  add_cfi (cfi);
+  update_row_reg_save (cur_row, dwf_regno (dest), cfi);
+}
+
 /* A subroutine of dwarf2out_frame_debug, process a REG_CFA_RESTORE note.  */
 
 static void
@@ -2033,10 +2056,16 @@ dwarf2out_frame_debug (rtx_insn *insn)
 	break;
 
   case REG_CFA_EXPRESSION:
+  case REG_CFA_VAL_EXPRESSION:
 	n = XEXP (note, 0);
 	if (n == NULL)
 	  n = single_set (insn);
-	dwarf2out_frame_debug_cfa_expression (n);
+
+	if (REG_NOTE_KIND (note) == REG_CFA_EXPRESSION)
+	  dwarf2out_frame_debug_cfa_expression (n);
+	else
+	  dwarf2out_frame_debug_cfa_val_expression (n);
+
 	handled_one = true;
 	break;
 
@@ -3015,7 +3044,8 @@ output_cfa_loc (dw_cfi_ref cfi, int for_eh)
   dw_loc_descr_ref loc;
   unsigned long size;
 
-  if (cfi->dw_cfi_opc == DW_CFA_expression)
+  if (cfi->dw_cfi_opc == DW_CFA_expression
+  || cfi->dw_cfi_opc == DW_CFA_val_expression)
 {
   unsigned r =
 	DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh);
@@ -3041,7 +3071,8 @@ output_cfa_loc_raw (dw_cfi_ref cfi)
   dw_loc_descr_ref loc;
   unsigned long size;
 
-  if (cfi->dw_cfi_opc == DW_CFA_expression)
+  if (cfi->dw_cfi_opc == DW_CFA_expression
+  || cfi->dw_cfi_opc 

Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS

2016-11-02 Thread Richard Biener
On Wed, Nov 2, 2016 at 2:43 PM, Wilco Dijkstra  wrote:
> Richard Biener wrote:
> On Tue, Nov 1, 2016 at 10:39 PM, Wilco Dijkstra  
> wrote:
>
>> > If bswap is false no byte swap is needed, so we found a native endian load
>> > and it will always perform the optimization by inserting an unaligned load.
>>
>> Yes, the general agreement is that the expander can do best and thus we
>> should canonicalize accesses to larger ones even for SLOW_UNALIGNED_ACCESS.
>> The expander will generate the canonical best code (hopefully...).
>
> Right, but there are cases where you have to choose between unaligned or 
> aligned
> accesses and you need to know whether the unaligned access is fast.
>
> A good example is memcpy expansion, if you have fast unaligned accesses then 
> you
> should use them to deal with the last few bytes, but if they get expanded, 
> using several
> aligned accesses is much faster than a single unaligned access.

Yes.  That's RTL expansion at which point you of course have to look
at SLOW_UNALIGNED_ACCESS.

>> > This apparently works on all targets, and doesn't cause alignment traps or
>> > huge slowdowns via trap emulation claimed by SLOW_UNALIGNED_ACCESS.
>> > So I'm at a loss what these macros are supposed to mean and how I can query
>> > whether a backend supports fast unaligned access for a particular mode.
>> >
>> > What I actually want to write is something like:
>> >
>> >  if (!FAST_UNALIGNED_LOAD (mode, align)) return false;
>> >
>> > And know that it only accepts unaligned accesses that are efficient on the 
>> > target.
>> > Maybe we need a new hook like this and get rid of the old one?
>>
>> No, we don't need to other hook.
>>
>> Note there is another similar user in gimple-fold.c when folding small
>> memcpy/memmove
>> to single load/store pairs (patch posted but not applied by me -- I've
>> asked for strict-align
>> target maintainer feedback but got none).
>
> I didn't find it, do you have a link?

https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00598.html

>> Now - for bswap I'm only 99% sure that unaligned load + bswap is
>> better than piecewise loads plus manual swap.
>
> It depends on whether unaligned loads and bswap are expanded or not. Even if 
> we
> assume the expansion is at least as efficient as doing it explicitly 
> (definitely true
> for modes larger than the native integer size - as we found out in PR77308!),
> if both the unaligned load and bswap are expanded it seems better not to make 
> the
> transformation for modes up to the word size. But there is no way to find out 
> as
> SLOW_UNALIGNED_ACCESS must be true whenever STRICT_ALIGN is true.

The case I was thinking about is availability of a bswap load operating only on
aligned memory and "regular" register bswap being "fake" provided by first
spilling to an aligned stack slot and then loading from that.

Maybe a bit far-fetched.

>> But generally I'm always in favor of removing SLOW_UNALIGNED_ACCESS /
>> STRICT_ALIGNMENT checks from the GIMPLE side of the compiler.
>
> I sort of agree because the purpose of these macros is unclear - the 
> documentation
> is insufficient and out of date. I do believe however we need an accurate way 
> to find out
> whether a target supports fast unaligned accesses as that is required to 
> generate good
> target code.

I believe the target macros are solely for RTL expansion and say that
it has to avoid
unaligned ops as those would trap.

Richard.

> Wilco


Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

2016-11-02 Thread Richard Biener
On Wed, Nov 2, 2016 at 2:40 PM, Segher Boessenkool
 wrote:
> On Wed, Nov 02, 2016 at 11:39:20AM +0100, Steven Bosscher wrote:
>>  On Wed, Nov 2, 2016 at 10:02 AM, Richard Biener
>>  wrote:
>> > On Mon, Oct 31, 2016 at 4:35 PM, Segher Boessenkool wrote:
>> >> On Mon, Oct 31, 2016 at 04:09:48PM +0100, Steven Bosscher wrote:
>> >>> On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote:
>> >>> > +  cfg_layout_finalize ();
>> >>>
>> >>> I don't think you have to go into/out-of cfglayout mode for each 
>> >>> iteration.
>> >>
>> >> Yeah probably.  I was too lazy :-)  It needs the cleanup_cfg every
>> >> iteration though, I was afraid that interacts.
>> >
>> > Ick.  Why would it need a cfg-cleanup every iteration?
>>
>> I don't believe it needs a cleanup on every iteration. One cleanup at
>> the end should work fine.
>
> But as the comment there says:
>
>   /* Merge the duplicated blocks into predecessors, when possible.  */
>   cleanup_cfg (0);
>
> (this is not a new comment), and without merging blocks this whole
> patch does zilch.
>
> There is no need to create new basic blocks at all (can replace the
> final branch in a block with a copy of the whole block it jumps to,
> instead); and then it is painfully obvious that switching to layout
> mode here is pointless, too.
>
> Do you want me to do that?
>
> Btw, this isn't quadratic anyway; it is a constant number (the maximal
> length allowed of the block to copy) linear.  Unless there are unboundedly
> many forwarder blocks, which there shouldn't be, but I cannot prove that.

Well, you iterate calling functions (find candidates, merge, cleanup-cfg) that
walk the whole function.  So unless the number of iterations is bound
with a constant I call this quadratic (in function size).

> And on a testcase with 2000 cases (instead of the 4 in the testcase in
> the PR) this pass takes less than 1% of the compiler runtime; and in
> the normal cases (no computed gotos to unfactor) it does the same as
> before.
>
>
> Segher


Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS

2016-11-02 Thread Wilco Dijkstra
Richard Biener wrote:
On Tue, Nov 1, 2016 at 10:39 PM, Wilco Dijkstra  wrote:

> > If bswap is false no byte swap is needed, so we found a native endian load
> > and it will always perform the optimization by inserting an unaligned load.
>
> Yes, the general agreement is that the expander can do best and thus we
> should canonicalize accesses to larger ones even for SLOW_UNALIGNED_ACCESS.
> The expander will generate the canonical best code (hopefully...).

Right, but there are cases where you have to choose between unaligned or aligned
accesses and you need to know whether the unaligned access is fast.

A good example is memcpy expansion, if you have fast unaligned accesses then you
should use them to deal with the last few bytes, but if they get expanded, 
using several
aligned accesses is much faster than a single unaligned access.

> > This apparently works on all targets, and doesn't cause alignment traps or
> > huge slowdowns via trap emulation claimed by SLOW_UNALIGNED_ACCESS.
> > So I'm at a loss what these macros are supposed to mean and how I can query
> > whether a backend supports fast unaligned access for a particular mode.
> >
> > What I actually want to write is something like:
> >
> >  if (!FAST_UNALIGNED_LOAD (mode, align)) return false;
> >
> > And know that it only accepts unaligned accesses that are efficient on the 
> > target.
> > Maybe we need a new hook like this and get rid of the old one?
>
> No, we don't need to other hook.
> 
> Note there is another similar user in gimple-fold.c when folding small
> memcpy/memmove
> to single load/store pairs (patch posted but not applied by me -- I've
> asked for strict-align
> target maintainer feedback but got none).

I didn't find it, do you have a link?

> Now - for bswap I'm only 99% sure that unaligned load + bswap is
> better than piecewise loads plus manual swap.

It depends on whether unaligned loads and bswap are expanded or not. Even if we 
assume the expansion is at least as efficient as doing it explicitly 
(definitely true
for modes larger than the native integer size - as we found out in PR77308!),
if both the unaligned load and bswap are expanded it seems better not to make 
the
transformation for modes up to the word size. But there is no way to find out as
SLOW_UNALIGNED_ACCESS must be true whenever STRICT_ALIGN is true.

> But generally I'm always in favor of removing SLOW_UNALIGNED_ACCESS /
> STRICT_ALIGNMENT checks from the GIMPLE side of the compiler.

I sort of agree because the purpose of these macros is unclear - the 
documentation
is insufficient and out of date. I do believe however we need an accurate way 
to find out
whether a target supports fast unaligned accesses as that is required to 
generate good
target code.

Wilco 

Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 01:26:48PM +, Jiong Wang wrote:
> -/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. 
> */
> +/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. 
>  */

Too long line.

> +/* RTL sequences inside PARALLEL are raw expression representation.
> +
> +   mem_loc_descriptor can be used to build generic DWARF expressions for
> +   DW_CFA_expression and DW_CFA_val_expression where the expression may 
> can
> +   not be represented using normal RTL sequences.  In this case, group 
> all
> +   expression operations (DW_OP_*) inside a PARALLEL.  For those DW_OP 
> which
> +   doesn't have RTL mapping, wrap it using UNSPEC.  The logic for parsing
> +   PARALLEL sequences is:
> +
> + foreach elem inside PARALLEL
> +   if (elem is UNSPEC)
> + dw_op =  XINT (elem, 1) (DWARF operation is kept as UNSPEC number)
> + oprnd1 = XVECEXP (elem, 0, 0)
> + oprnd2 = XVECEXP (elem, 0, 1)
> +   else
> + call mem_loc_descriptor  */

Not sure if it is a good idea to document in weirdly formatted
pseudo-language what the code actually does a few lines below.  IMHO either
express it in words, or don't express it at all.

> + exp_result =
> +   new_loc_descr ((enum dwarf_location_atom) dw_op, oprnd1,
> +  oprnd2);

Wrong formatting, = should be on the next line.

> +   }
> + else
> +   exp_result =
> + mem_loc_descriptor (elem, mode, mem_mode,
> + VAR_INIT_STATUS_INITIALIZED);

Likewise.

Jakub


Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

2016-11-02 Thread Segher Boessenkool
On Wed, Nov 02, 2016 at 11:39:20AM +0100, Steven Bosscher wrote:
>  On Wed, Nov 2, 2016 at 10:02 AM, Richard Biener
>  wrote:
> > On Mon, Oct 31, 2016 at 4:35 PM, Segher Boessenkool wrote:
> >> On Mon, Oct 31, 2016 at 04:09:48PM +0100, Steven Bosscher wrote:
> >>> On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote:
> >>> > +  cfg_layout_finalize ();
> >>>
> >>> I don't think you have to go into/out-of cfglayout mode for each 
> >>> iteration.
> >>
> >> Yeah probably.  I was too lazy :-)  It needs the cleanup_cfg every
> >> iteration though, I was afraid that interacts.
> >
> > Ick.  Why would it need a cfg-cleanup every iteration?
> 
> I don't believe it needs a cleanup on every iteration. One cleanup at
> the end should work fine.

But as the comment there says:

  /* Merge the duplicated blocks into predecessors, when possible.  */
  cleanup_cfg (0);

(this is not a new comment), and without merging blocks this whole
patch does zilch.

There is no need to create new basic blocks at all (can replace the
final branch in a block with a copy of the whole block it jumps to,
instead); and then it is painfully obvious that switching to layout
mode here is pointless, too.

Do you want me to do that?

Btw, this isn't quadratic anyway; it is a constant number (the maximal
length allowed of the block to copy) linear.  Unless there are unboundedly
many forwarder blocks, which there shouldn't be, but I cannot prove that.

And on a testcase with 2000 cases (instead of the 4 in the testcase in
the PR) this pass takes less than 1% of the compiler runtime; and in
the normal cases (no computed gotos to unfactor) it does the same as
before.


Segher


Re: [RFC PATCH] expand_strn_compare should attempt expansion even if neither string is constant

2016-11-02 Thread Aaron Sawdey
On Wed, 2016-11-02 at 13:41 +0100, Bernd Schmidt wrote:
> On 10/27/2016 03:14 AM, Aaron Sawdey wrote:
> > 
> > I'm currently working on a builtin expansion of strncmp for powerpc
> > similar to the one for memcmp I checked recently. One thing I
> > encountered is that the code in expand_strn_compare will not
> > attempt to
> > expand the cmpstrnsi pattern at all if neither string parameter is
> > a
> > constant string. This doesn't make a lot of sense in light of the
> > fact
> > that expand_str_compare starts with an attempt to expand cmpstrsi
> > and
> > then if that does not work, looks at the string args to see if one
> > is
> > constant so it can use cmpstrnsi with the known length.
> > 
> > The attached patch is my attempt to add this fallback path to
> > expand_strn_compare, i.e. if neither length is known, just attempt
> > expansion of cmpstrnsi using the given 3 arguments.
> > 
> > It looks like in addition to rs6000, there are 3 other targets that
> > have cmpstrnsi patterns: i386, sh, and rx.
> > 
> > Is this a reasonable approach to take with this? If so I'll
> > bootstrap/regtest on i386 as rs6000 does not as yet have an
> > expansion
> > for cmpstrsi or cmpstrnsi.
> 
> Just to be sure, this is superseded by your later patch series,
> correct?
> 
> (After I saw this one I had been trying to remember what exactly the 
> i386 issue was but it looks like you found it yourself.)
> 
> 
> Bernd

Yes, the later patch series replaces this preliminary patch. And yes,
the i386 issue took some headscratching and careful reading of the arch
manual on what repz cmpsb actually does.

Thanks,
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs.

2016-11-02 Thread Christophe Lyon
On 2 November 2016 at 12:22, Bin.Cheng  wrote:
> On Tue, Nov 1, 2016 at 12:21 PM, Kyrill Tkachov
>  wrote:
>> Hi Tamar,
>>
>>
>> On 26/10/16 16:01, Tamar Christina wrote:
>>>
>>> Hi Christophe,
>>>
>>> Here's the updated patch.
>>>
>>> Cheers,
>>> Tamar
>>> 
>>> From: Christophe Lyon 
>>> Sent: Wednesday, October 19, 2016 11:23:56 AM
>>> To: Tamar Christina
>>> Cc: GCC Patches; Kyrylo Tkachov; nd
>>> Subject: Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and
>>> vminnmQ_ST intrinsincs.
>>>
>>> On 19 October 2016 at 11:36, Tamar Christina 
>>> wrote:

 Hi All,

 This patch implements the vmaxnmQ_ST and vminnmQ_ST intrinsics. The
 current builtin registration code is deficient since it can't access
 standard pattern names, to which vmaxnmQ_ST and vminnmQ_ST map
 directly. Thus, to enable the vectoriser to have access to these
 intrinsics, we implement them using builtin functions, which we
 expand to the proper standard pattern using a define_expand.

 This patch also implements the __ARM_FEATURE_NUMERIC_MAXMIN macro,
 which is defined when __ARM_ARCH >= 8, and which enables the
 intrinsics.

 Regression tested on arm-none-eabi and no regressions.

 This patch is a rework of a previous patch:
 https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01971.html

 OK for trunk?
> These cases failed on arm-none-linux-gnueabihf as below:
> FAIL: gcc.target/arm/simd/vmaxnm_f32_1.c execution test
> FAIL: gcc.target/arm/simd/vmaxnmq_f32_1.c execution test
> FAIL: gcc.target/arm/simd/vminnm_f32_1.c execution test
> FAIL: gcc.target/arm/simd/vminnmq_f32_1.c execution test
>
> For such changes, I would suggest reg test for both bare-metal and
> linux toolchains, plus a bootstrap for linux toolchain.
>

Hi,

I confirm some tests are failing:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/241736/report-build-info.html

Sorry I couldn't answer/test before you committed, I was on holidays.

Christophe

> Thanks,
> bin
>
>>
>>
>> Ok.
>> Thanks,
>> Kyrill
>>
>>
 Thanks,
 Tamar

 ---

 gcc/

 2016-10-19  Bilyan Borisov  
  Tamar Christina 

  * config/arm/arm-c.c (arm_cpu_builtins): New macro definition.
  * config/arm/arm_neon.h (vmaxnm_f32): New intrinsinc.
  (vmaxnmq_f32): Likewise.
  (vminnm_f32): Likewise.
  (vminnmq_f32): Likewise.
  * config/arm/arm_neon_builtins.def (vmaxnm): New builtin.
  (vminnm): Likewise.
  * config/arm/neon.md (neon_, VCVTF): New
  expander.

 gcc/testsuite/

 2016-10-19  Bilyan Borisov  

  * gcc.target/arm/simd/vmaxnm_f32_1.c: New.
  * gcc.target/arm/simd/vmaxnmq_f32_1.c: Likewise.
  * gcc.target/arm/simd/vminnm_f32_1.c: Likewise.
  * gcc.target/arm/simd/vminnmq_f32_1.c: Likewise.

>>> I think you forgot to attach the new tests.
>>>
>>> Christophe
>>>
>>


Re: [PATCH 1/2, libgcc]: Implement _divmoddi4

2016-11-02 Thread Ian Lance Taylor
On Mon, Oct 31, 2016 at 7:46 AM, Uros Bizjak  wrote:
> This function will be used in a follow-up patch to implement
> TARGET_EXPAND_DIVMOD_LIBFUNC for x86 targets. Other targets can call
> this function, so IMO it should be part of a generic library.
>
> 2016-10-31  Uros Bizjak  
>
> * Makefile.in (LIB2_DIVMOD_FUNCS): Add _divmoddi4.
> * libgcc2.c (__divmoddi4): New function.
> * libgcc2.h (__divmoddi4): Declare.
> * libgcc-std.ver.in (GCC_7.0.0): New. Add __PFX_divmoddi4
> and __PFX_divmodti4.

You aren't defining divmodti4 anywhere, so it seems premature to add
it to libgcc-std.ver.in.

Other than that the patch is OK.

Thanks.

Ian


Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module

2016-11-02 Thread Jiong Wang

On 01/11/16 16:48, Jason Merrill wrote:
> It seems to me that a CFA_*expression note would never use target
> UNSPEC codes, and a DWARF UNSPEC would never appear outside of such a
> note, so we don't need to worry about conflicts.

Indeed.

DWARF UNSPEC is deeper inside DW_CFA_*expression note.  My worry of conflict
makes no sense.

I updated the patch to put DWARF operation in to UNSPEC number field.

x86-64 bootstrap OK,  no regression on gcc/g++.

Please review.

Thanks.

gcc/
2016-11-02  Jiong Wang  

* reg-notes.def (CFA_VAL_EXPRESSION): New entry.
* dwarf2cfi.c (dwarf2out_frame_debug_cfa_val_expression): New 
function.

(dwarf2out_frame_debug): Support REG_CFA_VAL_EXPRESSION.
(output_cfa_loc): Support DW_CFA_val_expression.
(output_cfa_loc_raw): Likewise.
(output_cfi): Likewise.
(output_cfi_directive): Likewise.
* dwarf2out.c (dw_cfi_oprnd1_desc): Support DW_CFA_val_expression.
(dw_cfi_oprnd2_desc): Likewise.
(mem_loc_descriptor): Recognize new pattern generated for value
expression.
diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index 6491d5aaf4c4a21241cc718bfff1016f6d149951..b8c88fbae1df80a2664a414d8ae016a5343bf435 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -1235,7 +1235,7 @@ dwarf2out_frame_debug_cfa_register (rtx set)
   reg_save (sregno, dregno, 0);
 }
 
-/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. */
+/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note.  */
 
 static void
 dwarf2out_frame_debug_cfa_expression (rtx set)
@@ -1267,6 +1267,29 @@ dwarf2out_frame_debug_cfa_expression (rtx set)
   update_row_reg_save (cur_row, regno, cfi);
 }
 
+/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_VAL_EXPRESSION
+   note.  */
+
+static void
+dwarf2out_frame_debug_cfa_val_expression (rtx set)
+{
+  rtx dest = SET_DEST (set);
+  gcc_assert (REG_P (dest));
+
+  rtx span = targetm.dwarf_register_span (dest);
+  gcc_assert (!span);
+
+  rtx src = SET_SRC (set);
+  dw_cfi_ref cfi = new_cfi ();
+  cfi->dw_cfi_opc = DW_CFA_val_expression;
+  cfi->dw_cfi_oprnd1.dw_cfi_reg_num = dwf_regno (dest);
+  cfi->dw_cfi_oprnd2.dw_cfi_loc
+= mem_loc_descriptor (src, GET_MODE (src),
+			  GET_MODE (dest), VAR_INIT_STATUS_INITIALIZED);
+  add_cfi (cfi);
+  update_row_reg_save (cur_row, dwf_regno (dest), cfi);
+}
+
 /* A subroutine of dwarf2out_frame_debug, process a REG_CFA_RESTORE note.  */
 
 static void
@@ -2033,10 +2056,16 @@ dwarf2out_frame_debug (rtx_insn *insn)
 	break;
 
   case REG_CFA_EXPRESSION:
+  case REG_CFA_VAL_EXPRESSION:
 	n = XEXP (note, 0);
 	if (n == NULL)
 	  n = single_set (insn);
-	dwarf2out_frame_debug_cfa_expression (n);
+
+	if (REG_NOTE_KIND (note) == REG_CFA_EXPRESSION)
+	  dwarf2out_frame_debug_cfa_expression (n);
+	else
+	  dwarf2out_frame_debug_cfa_val_expression (n);
+
 	handled_one = true;
 	break;
 
@@ -3015,7 +3044,8 @@ output_cfa_loc (dw_cfi_ref cfi, int for_eh)
   dw_loc_descr_ref loc;
   unsigned long size;
 
-  if (cfi->dw_cfi_opc == DW_CFA_expression)
+  if (cfi->dw_cfi_opc == DW_CFA_expression
+  || cfi->dw_cfi_opc == DW_CFA_val_expression)
 {
   unsigned r =
 	DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh);
@@ -3041,7 +3071,8 @@ output_cfa_loc_raw (dw_cfi_ref cfi)
   dw_loc_descr_ref loc;
   unsigned long size;
 
-  if (cfi->dw_cfi_opc == DW_CFA_expression)
+  if (cfi->dw_cfi_opc == DW_CFA_expression
+  || cfi->dw_cfi_opc == DW_CFA_val_expression)
 {
   unsigned r =
 	DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, 1);
@@ -3188,6 +3219,7 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref fde, int for_eh)
 
 	case DW_CFA_def_cfa_expression:
 	case DW_CFA_expression:
+	case DW_CFA_val_expression:
 	  output_cfa_loc (cfi, for_eh);
 	  break;
 
@@ -3302,16 +3334,13 @@ output_cfi_directive (FILE *f, dw_cfi_ref cfi)
   break;
 
 case DW_CFA_def_cfa_expression:
-  if (f != asm_out_file)
-	{
-	  fprintf (f, "\t.cfi_def_cfa_expression ...\n");
-	  break;
-	}
-  /* FALLTHRU */
 case DW_CFA_expression:
+case DW_CFA_val_expression:
   if (f != asm_out_file)
 	{
-	  fprintf (f, "\t.cfi_cfa_expression ...\n");
+	  fprintf (f, "\t.cfi_%scfa_%sexpression ...\n",
+		   cfi->dw_cfi_opc == DW_CFA_def_cfa_expression ? "def_" : "",
+		   cfi->dw_cfi_opc == DW_CFA_val_expression ? "val_" : "");
 	  break;
 	}
   fprintf (f, "\t.cfi_escape %#x,", cfi->dw_cfi_opc);
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 4a3df339df2c6a6816ac8b8dbdb2466a7492c592..7dac70d7392f2c457ffd3f677e07decb1ba488a1 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -518,6 +518,7 @@ dw_cfi_oprnd1_desc (enum dwarf_call_frame_info cfi)
 case DW_CFA_def_cfa_register:
 case DW_CFA_register:
 case DW_CFA_expression:
+case DW_CFA_val_expression:
   return dw_cfi_oprnd_reg_num;
 
 case DW_CFA_def_cfa_offset:
@@ -551,6 +552,7 @@ 

Re: [PATCH] libiberty: Fix -Wimplicit-fallthrough warnings.

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 02:19:33PM +0100, Mark Wielaard wrote:
> Adjust some comments, add some explicit fall through comments or explicit
> returns where necessary to not get implicit-fallthrough warnings.
> 
> All fall throughs were deliberate. In one case I added an explicit return
> false for clarity instead of falling through a default case (that also
> would return false).
> 
> libiberty/ChangeLog:
> 
>* cplus-dem.c (demangle_signature): Move fall through comment.
>(demangle_fund_type): Add fall through comment between 'G' and 'I'.
>* hashtab.c (iterative_hash): Add fall through comments.
>* regex.c (regex_compile): Add Fall through comment after '+'/'?'.
>(byte_re_match_2_internal): Add Fall through comment after jump_n.
>Change "Note fall through" to "Fall through".
>(common_op_match_null_string_p): Return false after set_number_at
>instead of fall through.

LGTM, except for:

> --- a/libiberty/cplus-dem.c
> +++ b/libiberty/cplus-dem.c
> @@ -1658,8 +1658,8 @@ demangle_signature (struct work_stuff *work,
> break;
>   }
> else
> - /* fall through */
>   {;}
> + /* fall through */

I think you should just remove the else and {;} and just have fallthrough
comment indented where else used to be.

Jakub


[PATCH] libiberty: Fix -Wimplicit-fallthrough warnings.

2016-11-02 Thread Mark Wielaard
Adjust some comments, add some explicit fall through comments or explicit
returns where necessary to not get implicit-fallthrough warnings.

All fall throughs were deliberate. In one case I added an explicit return
false for clarity instead of falling through a default case (that also
would return false).

libiberty/ChangeLog:

   * cplus-dem.c (demangle_signature): Move fall through comment.
   (demangle_fund_type): Add fall through comment between 'G' and 'I'.
   * hashtab.c (iterative_hash): Add fall through comments.
   * regex.c (regex_compile): Add Fall through comment after '+'/'?'.
   (byte_re_match_2_internal): Add Fall through comment after jump_n.
   Change "Note fall through" to "Fall through".
   (common_op_match_null_string_p): Return false after set_number_at
   instead of fall through.
---
diff --git a/libiberty/cplus-dem.c b/libiberty/cplus-dem.c
index 7f63397..810e971 100644
--- a/libiberty/cplus-dem.c
+++ b/libiberty/cplus-dem.c
@@ -1658,8 +1658,8 @@ demangle_signature (struct work_stuff *work,
  break;
}
  else
-   /* fall through */
{;}
+   /* fall through */
 
default:
  if (AUTO_DEMANGLING || GNU_DEMANGLING)
@@ -4024,6 +4024,7 @@ demangle_fund_type (struct work_stuff *work,
  success = 0;
  break;
}
+  /* fall through */
 case 'I':
   (*mangled)++;
   if (**mangled == '_')
diff --git a/libiberty/hashtab.c b/libiberty/hashtab.c
index 04607ea..99381b1 100644
--- a/libiberty/hashtab.c
+++ b/libiberty/hashtab.c
@@ -962,17 +962,17 @@ iterative_hash (const PTR k_in /* the key */,
   c += length;
   switch(len)  /* all the case statements fall through */
 {
-case 11: c+=((hashval_t)k[10]<<24);
-case 10: c+=((hashval_t)k[9]<<16);
-case 9 : c+=((hashval_t)k[8]<<8);
+case 11: c+=((hashval_t)k[10]<<24);/* fall through */
+case 10: c+=((hashval_t)k[9]<<16); /* fall through */
+case 9 : c+=((hashval_t)k[8]<<8);  /* fall through */
   /* the first byte of c is reserved for the length */
-case 8 : b+=((hashval_t)k[7]<<24);
-case 7 : b+=((hashval_t)k[6]<<16);
-case 6 : b+=((hashval_t)k[5]<<8);
-case 5 : b+=k[4];
-case 4 : a+=((hashval_t)k[3]<<24);
-case 3 : a+=((hashval_t)k[2]<<16);
-case 2 : a+=((hashval_t)k[1]<<8);
+case 8 : b+=((hashval_t)k[7]<<24); /* fall through */
+case 7 : b+=((hashval_t)k[6]<<16); /* fall through */
+case 6 : b+=((hashval_t)k[5]<<8);  /* fall through */
+case 5 : b+=k[4];  /* fall through */
+case 4 : a+=((hashval_t)k[3]<<24); /* fall through */
+case 3 : a+=((hashval_t)k[2]<<16); /* fall through */
+case 2 : a+=((hashval_t)k[1]<<8);  /* fall through */
 case 1 : a+=k[0];
   /* case 0: nothing left to add */
 }
diff --git a/libiberty/regex.c b/libiberty/regex.c
index 9ffc3f4..6854e3b 100644
--- a/libiberty/regex.c
+++ b/libiberty/regex.c
@@ -2493,6 +2493,7 @@ PREFIX(regex_compile) (const char *ARG_PREFIX(pattern),
   if ((syntax & RE_BK_PLUS_QM)
   || (syntax & RE_LIMITED_OPS))
 goto normal_char;
+ /* Fall through.  */
 handle_plus:
 case '*':
   /* If there is no previous pattern... */
@@ -6697,6 +6698,7 @@ byte_re_match_2_internal (struct re_pattern_buffer *bufp,
 {
   case jump_n:
is_a_jump_n = true;
+   /* Fall through.  */
   case pop_failure_jump:
  case maybe_pop_jump:
  case jump:
@@ -7125,7 +7127,7 @@ byte_re_match_2_internal (struct re_pattern_buffer *bufp,
   DEBUG_PRINT1 ("  Match => jump.\n");
  goto unconditional_jump;
}
-/* Note fall through.  */
+/* Fall through.  */
 
 
/* The end of a simple repeat has a pop_failure_jump back to
@@ -7150,7 +7152,7 @@ byte_re_match_2_internal (struct re_pattern_buffer *bufp,
dummy_low_reg, dummy_high_reg,
reg_dummy, reg_dummy, reg_info_dummy);
   }
- /* Note fall through.  */
+ /* Fall through.  */
 
unconditional_jump:
 #ifdef _LIBC
@@ -7453,6 +7455,7 @@ byte_re_match_2_internal (struct re_pattern_buffer *bufp,
 {
 case jump_n:
   is_a_jump_n = true;
+ /* Fall through.  */
 case maybe_pop_jump:
 case pop_failure_jump:
 case jump:
@@ -7718,6 +7721,7 @@ PREFIX(common_op_match_null_string_p) (UCHAR_T **p, 
UCHAR_T *end,
 
 case set_number_at:
   p1 += 2 * OFFSET_ADDRESS_SIZE;
+  return false;
 
 default:
   /* All other opcodes mean we cannot match the empty string.  */
-- 
1.8.3.1



Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Richard Biener
On Wed, Nov 2, 2016 at 2:06 PM, Jakub Jelinek  wrote:
> On Wed, Nov 02, 2016 at 01:59:00PM +0100, Richard Biener wrote:
>> > Yeah, that is what I meant.  The issue is how to report uses of such
>> > SSA_NAME when there is no memory.  So, either we'd need a special runtime
>> > library entrypoint that would report uses after scope even when there is no
>> > underlying memory, or we'd need to force it at asan pass time into memory 
>> > again.
>>
>> Well, there can't be any uses outside the scope -- there are no (memory) uses
>> left if we rewrite the thing into SSA.  That is, the address can no
>> longer "escape".
>>
>> Of course there could have been invalid uses before the rewrite into SSA.  
>> But
>> those can be diagnosed either immediately before or after re-writing into SSA
>> at compile-time (may be in dead code regions of course).
>
> Sure, we can warn on those at compile time, but we really should arrange to
> error on those at runtime if they are ever executed, the UB happens only at
> runtime, so in dead code isn't fatal.

Then we can replace those uses with a call into the asan runtime diagnosing the
issue instead?

Richard.

> Jakub


[PATCH] Allow non-NULL offset for store-merging bases

2016-11-02 Thread Richard Biener

The following teaches store-merging to handle non-NULL offset if the
base is already addressable (otherwise introducing new pointers to
a non-addressable base invalidates points-to information, see a comment
in the patch how we could avoid this in theory).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2016-11-02  Richard Biener  

* gimple-ssa-store-merging.c: Include gimplify-me.h.
(imm_store_chain_info::output_merged_stores): Force base_addr
to be proper GIMPLE for a MEM_REF address.
(pass_store_merging::execute): Restrict negative bitpos
handling to non-MEM_REF bases.  Remove TREE_THIS_VOLATILE
check.  Take into account non-NULL_TREE offset if the base
is already addressable.

* gcc.dg/store_merging_8.c: New testcase.

Index: gcc/gimple-ssa-store-merging.c
===
--- gcc/gimple-ssa-store-merging.c  (revision 241789)
+++ gcc/gimple-ssa-store-merging.c  (working copy)
@@ -125,6 +125,7 @@
 #include "tree-cfg.h"
 #include "tree-eh.h"
 #include "target.h"
+#include "gimplify-me.h"
 
 /* The maximum size (in bits) of the stores this pass should generate.  */
 #define MAX_STORE_BITSIZE (BITS_PER_WORD)
@@ -1127,6 +1128,8 @@ imm_store_chain_info::output_merged_stor
   unsigned int i;
   bool fail = false;
 
+  tree addr = force_gimple_operand_1 (unshare_expr (base_addr), ,
+ is_gimple_mem_ref_addr, NULL_TREE);
   FOR_EACH_VEC_ELT (split_stores, i, split_store)
 {
   unsigned HOST_WIDE_INT try_size = split_store->size;
@@ -1137,7 +1140,7 @@ imm_store_chain_info::output_merged_stor
 
   tree int_type = build_nonstandard_integer_type (try_size, UNSIGNED);
   int_type = build_aligned_type (int_type, align);
-  tree dest = fold_build2 (MEM_REF, int_type, base_addr,
+  tree dest = fold_build2 (MEM_REF, int_type, addr,
   build_int_cst (offset_type, try_pos));
 
   tree src = native_interpret_expr (int_type,
@@ -1366,15 +1369,10 @@ pass_store_merging::execute (function *f
   , , );
  /* As a future enhancement we could handle stores with the same
 base and offset.  */
- bool invalid = offset || reversep || bitpos < 0
+ bool invalid = reversep
 || ((bitsize > MAX_BITSIZE_MODE_ANY_INT)
  && (TREE_CODE (rhs) != INTEGER_CST))
-|| !rhs_valid_for_store_merging_p (rhs)
-   /* An access may not be volatile itself but base_addr may be
-  a volatile decl i.e. MEM[].  The hashing for
-  tree_operand_hash won't consider such stores equal to each
-  other so we can't track chains on them.  */
-|| TREE_THIS_VOLATILE (base_addr);
+|| !rhs_valid_for_store_merging_p (rhs);
 
  /* We do not want to rewrite TARGET_MEM_REFs.  */
  if (TREE_CODE (base_addr) == TARGET_MEM_REF)
@@ -1398,7 +1396,32 @@ pass_store_merging::execute (function *f
  /* get_inner_reference returns the base object, get at its
 address now.  */
  else
-   base_addr = build_fold_addr_expr (base_addr);
+   {
+ if (bitpos < 0)
+   invalid = true;
+ base_addr = build_fold_addr_expr (base_addr);
+   }
+
+ if (! invalid
+ && offset != NULL_TREE)
+   {
+ /* If the access is variable offset then a base
+decl has to be address-taken to be able to
+emit pointer-based stores to it.
+???  We might be able to get away with
+re-using the original base up to the first
+variable part and then wrapping that inside
+a BIT_FIELD_REF.  */
+ tree base = get_base_address (base_addr);
+ if (! base
+ || (DECL_P (base)
+ && ! TREE_ADDRESSABLE (base)))
+   invalid = true;
+ else
+   base_addr = build2 (POINTER_PLUS_EXPR,
+   TREE_TYPE (base_addr),
+   base_addr, offset);
+   }
 
  struct imm_store_chain_info **chain_info
= m_stores.get (base_addr);
Index: gcc/testsuite/gcc.dg/store_merging_8.c
===
--- gcc/testsuite/gcc.dg/store_merging_8.c  (revision 0)
+++ gcc/testsuite/gcc.dg/store_merging_8.c  (working copy)
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target non_strict_align } */
+/* { 

Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 01:59:00PM +0100, Richard Biener wrote:
> > Yeah, that is what I meant.  The issue is how to report uses of such
> > SSA_NAME when there is no memory.  So, either we'd need a special runtime
> > library entrypoint that would report uses after scope even when there is no
> > underlying memory, or we'd need to force it at asan pass time into memory 
> > again.
> 
> Well, there can't be any uses outside the scope -- there are no (memory) uses
> left if we rewrite the thing into SSA.  That is, the address can no
> longer "escape".
> 
> Of course there could have been invalid uses before the rewrite into SSA.  But
> those can be diagnosed either immediately before or after re-writing into SSA
> at compile-time (may be in dead code regions of course).

Sure, we can warn on those at compile time, but we really should arrange to
error on those at runtime if they are ever executed, the UB happens only at
runtime, so in dead code isn't fatal.

Jakub


Re: [ping * 4] PR35503 - warn for restrict

2016-11-02 Thread Jason Merrill
Then I'll approve the whole patch.

On Wed, Nov 2, 2016 at 8:42 AM, Joseph Myers  wrote:
> The format-checking parts of the patch are OK.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Richard Biener
On Wed, Nov 2, 2016 at 1:56 PM, Jakub Jelinek  wrote:
> On Wed, Nov 02, 2016 at 01:36:31PM +0100, Richard Biener wrote:
>> > Unless I'm missing something we either need to perform further analysis
>> > during the addressable subpass - this variable could be made
>> > non-addressable, but is used in ASAN_MARK, would if we rewrote it into SSA
>> > form there be any SSA uses of the poisoning ASAN_MARK?  If yes, keep it
>> > addressable, otherwise rewrite into SSA.
>> > Or, just rewrite it into SSA always, but turn the poisoning ASAN_MARK into
>> > some new internal ECF_CONST call var_5 = ASAN_POISON (); and if we have any
>> > uses of those, rewrite it back into addressable immediately or later or
>> > something.
>>
>> Or just give up optimizing (asan has a penalty anyway)?  Or
>
> Well, asan has a penalty and -fsanitize-use-after-scope even bigger penalty,
> but the point is to make that penalty bearable.
>
>> re-structure ASAN_POISON ()
>> similar to clobbers:
>>
>>   var = ASAN_POISION ();
>>
>> that avoids the address-taking and thus should be less heavy-weight on
>> optimizations.
>
> Yeah, that is what I meant.  The issue is how to report uses of such
> SSA_NAME when there is no memory.  So, either we'd need a special runtime
> library entrypoint that would report uses after scope even when there is no
> underlying memory, or we'd need to force it at asan pass time into memory 
> again.

Well, there can't be any uses outside the scope -- there are no (memory) uses
left if we rewrite the thing into SSA.  That is, the address can no
longer "escape".

Of course there could have been invalid uses before the rewrite into SSA.  But
those can be diagnosed either immediately before or after re-writing into SSA
at compile-time (may be in dead code regions of course).

Richard.

> Jakub


Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Jakub Jelinek
On Wed, Nov 02, 2016 at 01:36:31PM +0100, Richard Biener wrote:
> > Unless I'm missing something we either need to perform further analysis
> > during the addressable subpass - this variable could be made
> > non-addressable, but is used in ASAN_MARK, would if we rewrote it into SSA
> > form there be any SSA uses of the poisoning ASAN_MARK?  If yes, keep it
> > addressable, otherwise rewrite into SSA.
> > Or, just rewrite it into SSA always, but turn the poisoning ASAN_MARK into
> > some new internal ECF_CONST call var_5 = ASAN_POISON (); and if we have any
> > uses of those, rewrite it back into addressable immediately or later or
> > something.
> 
> Or just give up optimizing (asan has a penalty anyway)?  Or

Well, asan has a penalty and -fsanitize-use-after-scope even bigger penalty,
but the point is to make that penalty bearable.

> re-structure ASAN_POISON ()
> similar to clobbers:
> 
>   var = ASAN_POISION ();
> 
> that avoids the address-taking and thus should be less heavy-weight on
> optimizations.

Yeah, that is what I meant.  The issue is how to report uses of such
SSA_NAME when there is no memory.  So, either we'd need a special runtime
library entrypoint that would report uses after scope even when there is no
underlying memory, or we'd need to force it at asan pass time into memory again.

Jakub


Re: [ping * 4] PR35503 - warn for restrict

2016-11-02 Thread Joseph Myers
The format-checking parts of the patch are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS

2016-11-02 Thread Richard Biener
On Tue, Nov 1, 2016 at 10:39 PM, Wilco Dijkstra  wrote:
>  Jeff Law  wrote:
>
>> I think you'll need to look at bz61320 before this could go in.
>
> I had a look, but there is nothing there that is related - eventually
> a latent alignment bug was fixed in IVOpt. Note that the bswap phase
> currently inserts unaligned accesses irrespectively of STRICT_ALIGNMENT
> or SLOW_UNALIGNED_ACCESS:
>
> -  if (bswap
>  - && align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type))
>  - && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align))
>  -   return false;
>
> If bswap is false no byte swap is needed, so we found a native endian load
> and it will always perform the optimization by inserting an unaligned load.

Yes, the general agreement is that the expander can do best and thus we
should canonicalize accesses to larger ones even for SLOW_UNALIGNED_ACCESS.
The expander will generate the canonical best code (hopefully...).

> This apparently works on all targets, and doesn't cause alignment traps or
> huge slowdowns via trap emulation claimed by SLOW_UNALIGNED_ACCESS.
> So I'm at a loss what these macros are supposed to mean and how I can query
> whether a backend supports fast unaligned access for a particular mode.
>
> What I actually want to write is something like:
>
>  if (!FAST_UNALIGNED_LOAD (mode, align)) return false;
>
> And know that it only accepts unaligned accesses that are efficient on the 
> target.
> Maybe we need a new hook like this and get rid of the old one?

No, we don't need to other hook.

Note there is another similar user in gimple-fold.c when folding small
memcpy/memmove
to single load/store pairs (patch posted but not applied by me -- I've
asked for strict-align
target maintainer feedback but got none).

Now - for bswap I'm only 99% sure that unaligned load + bswap is
better than piecewise
loads plus manual swap.

But generally I'm always in favor of removing SLOW_UNALIGNED_ACCESS /
STRICT_ALIGNMENT
checks from the GIMPLE side of the compiler.

Richard.

> Wilco
>


Re: [RFC PATCH] expand_strn_compare should attempt expansion even if neither string is constant

2016-11-02 Thread Bernd Schmidt

On 10/27/2016 03:14 AM, Aaron Sawdey wrote:

I'm currently working on a builtin expansion of strncmp for powerpc
similar to the one for memcmp I checked recently. One thing I
encountered is that the code in expand_strn_compare will not attempt to
expand the cmpstrnsi pattern at all if neither string parameter is a
constant string. This doesn't make a lot of sense in light of the fact
that expand_str_compare starts with an attempt to expand cmpstrsi and
then if that does not work, looks at the string args to see if one is
constant so it can use cmpstrnsi with the known length.

The attached patch is my attempt to add this fallback path to
expand_strn_compare, i.e. if neither length is known, just attempt
expansion of cmpstrnsi using the given 3 arguments.

It looks like in addition to rs6000, there are 3 other targets that
have cmpstrnsi patterns: i386, sh, and rx.

Is this a reasonable approach to take with this? If so I'll
bootstrap/regtest on i386 as rs6000 does not as yet have an expansion
for cmpstrsi or cmpstrnsi.


Just to be sure, this is superseded by your later patch series, correct?

(After I saw this one I had been trying to remember what exactly the 
i386 issue was but it looks like you found it yourself.)



Bernd


Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-02 Thread Richard Biener
On Wed, Nov 2, 2016 at 10:52 AM, Jakub Jelinek  wrote:
> On Wed, Nov 02, 2016 at 10:40:35AM +0100, Richard Biener wrote:
>> > I wonder if the sanitize_asan_mark wouldn't better be some PROP_* property
>> > set during the asan pass and kept on until end of compilation of that
>> > function.  That means even if a var only addressable because of ASAN_MARK 
>> > is
>> > discovered after this pass we'd still be able to rewrite it into SSA.
>>
>> Note that being TREE_ADDRESSABLE also has effects on alias analysis
>> (didn't follow the patches to see whether you handle ASAN_MARK specially
>> in points-to analysis and/or alias analysis).
>>
>> Generally in update-address-taken you can handle ASAN_MARK similar to
>> MEM_REF (and drop it in the rewrite phase?).
>
> That is the intent, but we can't do that before the asan pass, because
> otherwise as Martin explained we don't diagnose at runtime bugs where
> a variable is used outside of its scope only through a MEM_REF.
> So we need to wait for asan pass to actually add a real builtin call that
> takes the address in that case.  Except now I really don't see how that
> can work for the case where the var is used only properly when it is in the
> scope, because the asan pass will still see those being addressable.
>
> Unless I'm missing something we either need to perform further analysis
> during the addressable subpass - this variable could be made
> non-addressable, but is used in ASAN_MARK, would if we rewrote it into SSA
> form there be any SSA uses of the poisoning ASAN_MARK?  If yes, keep it
> addressable, otherwise rewrite into SSA.
> Or, just rewrite it into SSA always, but turn the poisoning ASAN_MARK into
> some new internal ECF_CONST call var_5 = ASAN_POISON (); and if we have any
> uses of those, rewrite it back into addressable immediately or later or
> something.

Or just give up optimizing (asan has a penalty anyway)?  Or
re-structure ASAN_POISON ()
similar to clobbers:

  var = ASAN_POISION ();

that avoids the address-taking and thus should be less heavy-weight on
optimizations.

Richard.

>
> Jakub


Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-02 Thread Richard Biener
On Tue, 1 Nov 2016, Yuri Rumyantsev wrote:

> Hi All,
> 
> I re-send all patches sent by Ilya earlier for review which support
> vectorization of loop epilogues and loops with low trip count. We
> assume that the only patch - vec-tails-07-combine-tail.patch - was not
> approved by Jeff.
> 
> I did re-base of all patches and performed bootstrapping and
> regression testing that did not show any new failures. Also all
> changes related to new vect_do_peeling algorithm have been changed
> accordingly.
> 
> Is it OK for trunk?

I would have prefered that the series up to -03-nomask-tails would
_only_ contain epilogue loop vectorization changes but unfortunately
the patchset is oddly separated.

I have a comment on that part nevertheless:

@@ -1608,7 +1614,10 @@ vect_enhance_data_refs_alignment (loop_vec_info 
loop_vinfo)
   /* Check if we can possibly peel the loop.  */
   if (!vect_can_advance_ivs_p (loop_vinfo)
   || !slpeel_can_duplicate_loop_p (loop, single_exit (loop))
-  || loop->inner)
+  || loop->inner
+  /* Required peeling was performed in prologue and
+is not required for epilogue.  */
+  || LOOP_VINFO_EPILOGUE_P (loop_vinfo))
 do_peeling = false;

   if (do_peeling
@@ -1888,7 +1897,10 @@ vect_enhance_data_refs_alignment (loop_vec_info 
loop_vinfo)

   do_versioning =
optimize_loop_nest_for_speed_p (loop)
-   && (!loop->inner); /* FORNOW */
+   && (!loop->inner) /* FORNOW */
+/* Required versioning was performed for the
+  original loop and is not required for epilogue.  */
+   && !LOOP_VINFO_EPILOGUE_P (loop_vinfo);

   if (do_versioning)
 {

please do that check in the single caller of this function.

Otherwise I still dislike the new ->aux use and I believe that simply
passing down info from the processed parent would be _much_ cleaner.
That is, here (and avoid the FOR_EACH_LOOP change):

@@ -580,12 +586,21 @@ vectorize_loops (void)
&& dump_enabled_p ())
   dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
"loop vectorized\n");
-   vect_transform_loop (loop_vinfo);
+   new_loop = vect_transform_loop (loop_vinfo);
num_vectorized_loops++;
/* Now that the loop has been vectorized, allow it to be unrolled
   etc.  */
loop->force_vectorize = false;

+   /* Add new loop to a processing queue.  To make it easier
+  to match loop and its epilogue vectorization in dumps
+  put new loop as the next loop to process.  */
+   if (new_loop)
+ {
+   loops.safe_insert (i + 1, new_loop->num);
+   vect_loops_num = number_of_loops (cfun);
+ }

simply dispatch to a vectorize_epilogue (loop_vinfo, new_loop)
function which will set up stuff properly (and also perform
the if-conversion of the epilogue there).

That said, if we can get in non-masked epilogue vectorization
separately that would be great.

I'm still torn about all the rest of the stuff and question its
usability (esp. merging the epilogue with the main vector loop).
But it has already been approved ... oh well.

Thanks,
Richard.


Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add

2016-11-02 Thread Bill Schmidt
On Nov 2, 2016, at 4:28 AM, Jakub Jelinek  wrote:
> 
> On Wed, Nov 02, 2016 at 10:19:26AM +0100, Richard Biener wrote:
>> On Tue, Nov 1, 2016 at 1:09 AM, Jakub Jelinek  wrote:
>>> On Mon, Oct 31, 2016 at 05:28:42PM -0500, Bill Schmidt wrote:
 The PowerPC back end loses performance on vector intrinsics, because 
 currently
 all of them are treated as calls throughout the middle-end phases and only
 expanded when they reach RTL.  Our version of altivec.h currently defines 
 the
 public names of overloaded functions (like vec_add) to be #defines for 
 hidden
 functions (like __builtin_vec_add), which are recognized in the parser as
 requiring special back-end support.  Tables in rs6000-c.c handle dispatch 
 of
 the overloaded functions to specific function calls appropriate to the 
 argument
 types.
>>> 
>>> This doesn't look very nice.  If all you care is that the builtins like
>>> __builtin_altivec_vaddubm etc. that __builtin_vec_add overloads into fold
>>> into generic vector operations under certain conditions, just fold those
>>> into whatever you want in targetm.gimple_fold_builtin (gsi).
>> 
>> Note that traditionally "overloading" with GCC "builtins" is done by
>> using varargs
>> and the "type generic" attribute.  That doesn't scale to return type 
>> overloading
>> though for which we usually added direct support to the parser (for example
>> for __builtin_shuffle).
> 
> My understanding is that rs6000 already does that, it hooks into
> resolve_overloaded_builtin which already handles the fully type generic
> builtins where not just the arguments, but also the return type can be
> picked up.  But it resolves the overloaded builtins into calls to other
> builtins that are not type-generic.
> 
> So, either that function instead of returning the specific md builtin calls
> in some cases already returns trees with the generic behavior of the
> builtin, or it returns what it does now and then in the gimple fold builtin
> target hook (note, the normal fold builtin target hook is not right for
> that, because it is mostly used for folding builtins into constant - callers
> will usually throw away other results) fold those specific md builtins
> into whatever GIMPLE you want.  If we want to decrease amount of folding in
> the FEs, the gimple fold builtin hook is probably better.
> 
>   Jakub

Thanks, all.  Using the gimple_fold_builtin target hook works very well and
is exactly what I'm looking for.  I've reworked the patch to the much simpler
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00104.html.

Much obliged for the help!

Bill


  1   2   >