[Bug c/77992] Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread kjlu at gatech dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

--- Comment #9 from Kangjie Lu  ---
(In reply to Andrew Pinski from comment #8)
> A simple google search (secure memset [glibc]) finds a few things:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1381.pdf
> 
> https://sourceware.org/ml/libc-alpha/2014-12/msg00506.html
> 
> https://www.securecoding.cert.org/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537

Thanks for sharing these interesting links. 
Sure, compiler optimizations may sometimes aggressively eliminate dead code.

As I mentioned in my last reply, this is not a problem in our work because
our instrumentation is inserted after all LLVM optimization passes. 
The inserted memset will not be removed.

Back to my original problem, many Linux kernel developers also hope GCC can 
provide a feature (like a compilation option) that can zero-initialize 
padding bytes. Fixing these information leaks manually will make the code
maintenance extremely difficult.  
Anyway, I just wanted to report this issue :)

[Bug c/77992] Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

--- Comment #8 from Andrew Pinski  ---
A simple google search (secure memset [glibc]) finds a few things:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1381.pdf

https://sourceware.org/ml/libc-alpha/2014-12/msg00506.html

https://www.securecoding.cert.org/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537

[Bug c/77992] Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread kjlu at gatech dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

--- Comment #7 from Kangjie Lu  ---
(In reply to Andrew Pinski from comment #6)
> >More information can be found in our research paper: 
> >http://www.cc.gatech.edu/~klu38/publications/unisan-ccs16.pdf
> 
> 
> Your research paper is wrong and does not consider that C is an inherently
> insecure language to begin with.  There are many other things wrong with
> it, for example recommending the use of memset when you want to hide the
> stores from the compiler.  There is already a thread on the glibc mailing
> list about this exact thing: adding a secure memset which GCC is not going
> to optimize away.

Thanks for your feedback. 
We do think C is not a safe language, and that's why we want to secure programs
written in C.
Could you provide me with more information about the thread? We use LLVM instead
of GCC. Our instrumentation is inserted after optimization passes.

Thanks!

[Bug c/77992] Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

--- Comment #6 from Andrew Pinski  ---
>More information can be found in our research paper: 
>http://www.cc.gatech.edu/~klu38/publications/unisan-ccs16.pdf


Your research paper is wrong and does not consider that C is an inherently
insecure language to begin with.  There are many other things wrong with it,
for example recommending the use of memset when you want to hide the stores
from the compiler.  There is already a thread on the glibc mailing list about
this exact thing: adding a secure memset which GCC is not going to optimize
away.

[Bug c/77992] Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
  Component|driver  |c
 Resolution|--- |INVALID
   Severity|critical|normal

--- Comment #5 from Andrew Pinski  ---
Again, C is not the language you want for this.
Also, if you want this, you should use a loop which hides the stores from GCC
using inline asm.

>Besides performance (I understand that the unaligned initialization could be 
>expensive), any other reasons?

Because there is no reason to do it that way.  You said to initialize only
those fields anyways.
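
As a rough illustration of the suggestion above (a loop whose stores are
hidden from GCC with inline asm), a minimal sketch might look like the
following.  The name secure_zero is hypothetical and does not come from the
thread, and the empty asm statement with a "memory" clobber is a GNU C
extension, so this assumes GCC or a compatible compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: zero n bytes at p in a way the optimizer
   should not be able to discard as dead stores.  */
static void
secure_zero (void *p, size_t n)
{
  unsigned char *bytes = p;

  assert (p != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    bytes[i] = 0;

  /* Empty asm that takes p as an input and clobbers "memory": the
     compiler has to assume the asm may read the buffer, so the stores
     above are treated as observable.  */
  __asm__ __volatile__ ("" : : "r" (p) : "memory");
}
```

A caller would invoke secure_zero (buf, sizeof buf) right before the buffer
goes out of scope.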

Re: [PATCH] Don't peel extra copy of loop in unroller for loops with exit at end

2016-10-14 Thread Andrew Pinski
On Fri, Oct 14, 2016 at 8:28 PM, Andrew Pinski  wrote:
> On Thu, Sep 22, 2016 at 12:10 PM, Pat Haugen
>  wrote:
>> I noticed the loop unroller peels an extra copy of the loop before it enters 
>> the switch block code to round the iteration count to a multiple of the 
>> unroll factor. This peeled copy is only needed for the case where the exit 
>> test is at the beginning of the loop since in that case it inserts the test 
>> for zero peel iterations before that peeled copy.
>>
>> This patch bumps the iteration count by 1 for loops with the exit at the end 
>> so that it represents the number of times the loop body is executed, and 
>> therefore removes the need to always execute that first peeled copy. With 
>> this change, when the number of executions of the loop is an even multiple 
>> of the unroll factor then the code will jump to the unrolled loop 
>> immediately instead of executing all the switch code and peeled copies of 
>> the loop and then falling into the unrolled loop. This change also reduces 
>> code size by removing a peeled copy of the loop.
>>
>> Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk?
>
> This patch or
> PR rtl-optimization/68212
> * cfgloopmanip.c (duplicate_loop_to_header_edge): Use preheader edge
> frequency when computing scale factor for peeled copies.
> * loop-unroll.c (unroll_loop_runtime_iterations): Fix freq/count
> values for switch/peel blocks/edges.
>
> Caused a ~2.7-3.5% regression in coremarks with -funroll-all-loops.

I should say on ThunderX (aarch64-linux-gnu).

Thanks,
Andrew

>
> Thanks,
> Andrew
>
>>
>>
>>
>> 2016-09-22  Pat Haugen  
>>
>> * loop-unroll.c (unroll_loop_runtime_iterations): Condition initial
>> loop peel to loops with exit test at the beginning.
>>
>>


Re: [PATCH] Don't peel extra copy of loop in unroller for loops with exit at end

2016-10-14 Thread Andrew Pinski
On Thu, Sep 22, 2016 at 12:10 PM, Pat Haugen
 wrote:
> I noticed the loop unroller peels an extra copy of the loop before it enters 
> the switch block code to round the iteration count to a multiple of the 
> unroll factor. This peeled copy is only needed for the case where the exit 
> test is at the beginning of the loop since in that case it inserts the test 
> for zero peel iterations before that peeled copy.
>
> This patch bumps the iteration count by 1 for loops with the exit at the end 
> so that it represents the number of times the loop body is executed, and 
> therefore removes the need to always execute that first peeled copy. With 
> this change, when the number of executions of the loop is an even multiple of 
> the unroll factor then the code will jump to the unrolled loop immediately 
> instead of executing all the switch code and peeled copies of the loop and 
> then falling into the unrolled loop. This change also reduces code size by 
> removing a peeled copy of the loop.
>
> Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk?
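
For readers unfamiliar with runtime unrolling, the control flow described in
the quoted text can be approximated at source level as below.  This is only
an analogy under stated assumptions: GCC performs the transformation on RTL,
and the function name, the unroll factor of 4, and the summing loop body are
all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints with the loop unrolled by 4.  The switch runs the
   n % 4 leftover iterations first (the "peeled copies"); when n is a
   multiple of 4 it falls straight into the unrolled loop.  */
static long
sum_unrolled (const int *a, size_t n)
{
  long sum = 0;
  size_t i = 0;

  assert (a != NULL || n == 0);

  switch (n % 4)
    {
    case 3: sum += a[i++]; /* fall through */
    case 2: sum += a[i++]; /* fall through */
    case 1: sum += a[i++]; /* fall through */
    case 0: break;
    }

  /* The remaining count is now a multiple of 4.  */
  for (; i < n; i += 4)
    sum += a[i] + a[i + 1] + a[i + 2] + a[i + 3];

  return sum;
}
```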

This patch or
PR rtl-optimization/68212
* cfgloopmanip.c (duplicate_loop_to_header_edge): Use preheader edge
frequency when computing scale factor for peeled copies.
* loop-unroll.c (unroll_loop_runtime_iterations): Fix freq/count
values for switch/peel blocks/edges.

Caused a ~2.7-3.5% regression in coremarks with -funroll-all-loops.

Thanks,
Andrew

>
>
>
> 2016-09-22  Pat Haugen  
>
> * loop-unroll.c (unroll_loop_runtime_iterations): Condition initial
> loop peel to loops with exit test at the beginning.
>
>


Re: [PATCH] rs6000: Fix shrink-wrap-separate for AIX

2016-10-14 Thread Segher Boessenkool
On Sat, Oct 15, 2016 at 03:00:20AM +, Segher Boessenkool wrote:
> 2016-10-15  Segher Boessenkool  
> 
>   * config/rs6000/rs6000.c (rs6000_get_separate_components): Do not
>   make LR a separately shrink-wrapped component if savres_strategy
>   contains any of {SAVE,REST}_INLINE_{GPRS,FPRS,VRS}.  Do not wrap
>   GPRs if {SAVE,REST}_INLINE_GPRS.  Do not disallow all wrapping
>   when {SAVE,REST}_INLINE_GPRS.

Wow I messed that up.

* config/rs6000/rs6000.c (rs6000_get_separate_components): Do not
make LR a separately shrink-wrapped component unless savres_strategy
contains all of {SAVE,REST}_INLINE_{GPRS,FPRS,VRS}.  Do not wrap
GPRs unless both {SAVE,REST}_INLINE_GPRS.  Do not disallow all
wrapping when not both {SAVE,REST}_INLINE_GPRS.
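
The corrected conditions can be sketched as plain bitmask tests.  The flag
names are the ones used in the patch, but the numeric values below are
hypothetical placeholders, and the other conditions the real function checks
(lr_save_p, the PIC cases, WORLD_SAVE_P) are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values; the real definitions live in rs6000.c.  */
enum
{
  SAVE_INLINE_GPRS = 0x01,
  REST_INLINE_GPRS = 0x02,
  SAVE_INLINE_FPRS = 0x04,
  REST_INLINE_FPRS = 0x08,
  SAVE_INLINE_VRS = 0x10,
  REST_INLINE_VRS = 0x20
};

/* GPRs may be wrapped only if both their save and restore are inline.  */
static bool
can_wrap_gprs (int strategy)
{
  int need = SAVE_INLINE_GPRS | REST_INLINE_GPRS;
  return (strategy & need) == need;
}

/* LR may be wrapped only if no out-of-line routine is used at all,
   because every out-of-line save/restore routine needs LR.  */
static bool
can_wrap_lr (int strategy)
{
  int need = SAVE_INLINE_GPRS | REST_INLINE_GPRS
	     | SAVE_INLINE_FPRS | REST_INLINE_FPRS
	     | SAVE_INLINE_VRS | REST_INLINE_VRS;
  bool ok = (strategy & need) == need;

  /* Wrapping LR implies that GPR wrapping is allowed too.  */
  assert (!ok || can_wrap_gprs (strategy));
  return ok;
}
```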


Segher


[PATCH] rs6000: Fix shrink-wrap-separate for AIX

2016-10-14 Thread Segher Boessenkool
All out-of-line register save routines need LR, so we cannot wrap the
LR component if there are out-of-line saves.  This didn't show up for
testing on Linux because none of the tests there use out-of-line FPR
saves without also using out-of-line GPR saves, which we did handle.

This patch fixes it, and also cleans up code a little.

Is this okay for trunk?


Segher


2016-10-15  Segher Boessenkool  

* config/rs6000/rs6000.c (rs6000_get_separate_components): Do not
make LR a separately shrink-wrapped component if savres_strategy
contains any of {SAVE,REST}_INLINE_{GPRS,FPRS,VRS}.  Do not wrap
GPRs if {SAVE,REST}_INLINE_GPRS.  Do not disallow all wrapping
when {SAVE,REST}_INLINE_GPRS.

---
 gcc/config/rs6000/rs6000.c | 41 +
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 1110ee2..df48980 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -27434,27 +27434,29 @@ rs6000_get_separate_components (void)
 {
   rs6000_stack_t *info = rs6000_stack_info ();
 
-  if (!(info->savres_strategy & SAVE_INLINE_GPRS)
-  || !(info->savres_strategy & REST_INLINE_GPRS)
-  || WORLD_SAVE_P (info))
+  if (WORLD_SAVE_P (info))
 return NULL;
 
   sbitmap components = sbitmap_alloc (32);
   bitmap_clear (components);
 
   /* The GPRs we need saved to the frame.  */
-  int reg_size = TARGET_32BIT ? 4 : 8;
-  int offset = info->gp_save_offset;
-  if (info->push_p)
-offset += info->total_size;
-
-  for (unsigned regno = info->first_gp_reg_save; regno < 32; regno++)
+  if ((info->savres_strategy & SAVE_INLINE_GPRS)
+  && (info->savres_strategy & REST_INLINE_GPRS))
 {
-  if (IN_RANGE (offset, -0x8000, 0x7fff)
- && rs6000_reg_live_or_pic_offset_p (regno))
-   bitmap_set_bit (components, regno);
+  int reg_size = TARGET_32BIT ? 4 : 8;
+  int offset = info->gp_save_offset;
+  if (info->push_p)
+   offset += info->total_size;
 
-  offset += reg_size;
+  for (unsigned regno = info->first_gp_reg_save; regno < 32; regno++)
+   {
+ if (IN_RANGE (offset, -0x8000, 0x7fff)
+ && rs6000_reg_live_or_pic_offset_p (regno))
+   bitmap_set_bit (components, regno);
+
+ offset += reg_size;
+   }
 }
 
   /* Don't mess with the hard frame pointer.  */
@@ -27467,11 +27469,18 @@ rs6000_get_separate_components (void)
   || (flag_pic && DEFAULT_ABI == ABI_DARWIN))
 bitmap_clear_bit (components, RS6000_PIC_OFFSET_TABLE_REGNUM);
 
-  /* Optimize LR save and restore if we can.  This is component 0.  */
+  /* Optimize LR save and restore if we can.  This is component 0.  Any
+ out-of-line register save/restore routines need LR.  */
   if (info->lr_save_p
-  && !(flag_pic && (DEFAULT_ABI == ABI_V4 || DEFAULT_ABI == ABI_DARWIN)))
+  && !(flag_pic && (DEFAULT_ABI == ABI_V4 || DEFAULT_ABI == ABI_DARWIN))
+  && (info->savres_strategy & SAVE_INLINE_GPRS)
+  && (info->savres_strategy & REST_INLINE_GPRS)
+  && (info->savres_strategy & SAVE_INLINE_FPRS)
+  && (info->savres_strategy & REST_INLINE_FPRS)
+  && (info->savres_strategy & SAVE_INLINE_VRS)
+  && (info->savres_strategy & REST_INLINE_VRS))
 {
-  offset = info->lr_save_offset;
+  int offset = info->lr_save_offset;
   if (info->push_p)
offset += info->total_size;
   if (IN_RANGE (offset, -0x8000, 0x7fff))
-- 
1.9.3



[Bug testsuite/77623] [7 regression] test cases gcc.target/powerpc/warn-1.c and warn-2.c fail starting with r239994

2016-10-14 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77623

Segher Boessenkool  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||segher at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #2 from Segher Boessenkool  ---
Fixed with r241056, I wasn't aware there was a BZ, sorry.

[Bug middle-end/77989] [7 Regression] -O3 causes verify_gimple fail

2016-10-14 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77989

Markus Trippelsdorf  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-10-15
 CC||trippels at gcc dot gnu.org
  Component|c   |middle-end
Summary|-O3 causes verify_gimple|[7 Regression] -O3 causes
   |fail|verify_gimple fail
 Ever confirmed|0   |1

--- Comment #1 from Markus Trippelsdorf  ---
markus@x4 tmp % cat fail.i
int a, d;
char b;
char *c;
void fn1() {
  char *e = &b;
  c = &b + 48;
  while (e < c)
    e++;
  e++;
  c = &b + a;
  while (e < c)
    d += *e++;
}

markus@x4 tmp % gcc -fchecking -O3 -c fail.i
fail.i: In function ‘fn1’:
fail.i:4:5: error: invalid address operand in MEM_REF
 int fn1() {
 ^~~
MEM[(void *) + 49B];

fail.i:4:5: error: invalid first operand of MEM_REF
[(void *) + 49B]
fail.i:12:10: note: in statement
 d += *e++;
  ^~~~
# VUSE <.MEM_59>
_15 = MEM[(void *) + 49B];
fail.i:4: confused by earlier errors, bailing out

[Bug rtl-optimization/77843] ICE for gcc.dg/tree-ssa/forwprop-35.c

2016-10-14 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77843

Segher Boessenkool  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Segher Boessenkool  ---
It is fixed.

Re: [PATCH AArch64]Penalize vector cost for large loops with too many vect insns.

2016-10-14 Thread kugan

Hi Bin,

On 15/10/16 00:15, Bin Cheng wrote:

+/* Test for likely overcommitment of vector hardware resources.  If a
+   loop iteration is relatively large, and too large a percentage of
+   instructions in the loop are vectorized, the cost model may not
+   adequately reflect delays from unavailable vector resources.
+   Penalize the loop body cost for this case.  */
+
+static void
+aarch64_density_test (struct aarch64_vect_loop_cost_data *data)
+{
+  const int DENSITY_PCT_THRESHOLD = 85;
+  const int DENSITY_SIZE_THRESHOLD = 128;
+  const int DENSITY_PENALTY = 10;
+  struct loop *loop = data->loop_info;
+  basic_block *bbs = get_loop_body (loop);


Is this worth being part of the cost model such that it can have 
different defaults for different micro-architecture?
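
One way to picture the question is to move the three constants into a
per-tuning structure.  The sketch below is an assumption, not the code from
the patch: only the three threshold values appear in the quoted excerpt, and
the struct, the function name, and the penalty formula (scale the vector cost
by the penalty percentage when both thresholds are exceeded, as the comment
describes) are hypothetical.

```c
#include <assert.h>

/* Hypothetical per-microarchitecture tunables for the density test.  */
struct density_tunables
{
  int pct_threshold;	/* e.g. 85  */
  int size_threshold;	/* e.g. 128 */
  int penalty_pct;	/* e.g. 10  */
};

/* Return the loop body cost after applying the density penalty.  */
static int
penalized_body_cost (const struct density_tunables *t,
		     int vec_cost, int not_vec_cost)
{
  assert (t != 0 && vec_cost >= 0 && not_vec_cost >= 0);

  int total = vec_cost + not_vec_cost;
  if (total == 0)
    return vec_cost;

  /* Percentage of the loop cost that comes from vector statements.  */
  int density_pct = vec_cost * 100 / total;

  if (density_pct > t->pct_threshold && total > t->size_threshold)
    return vec_cost * (100 + t->penalty_pct) / 100;

  return vec_cost;
}
```

With the thresholds from the quoted patch (85, 128, 10), a loop with vector
cost 180 and non-vector cost 20 would be charged 198 instead of 180.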



Thanks,
Kugan


[Bug driver/77992] Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread kjlu at gatech dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

Kangjie Lu  changed:

   What|Removed |Added

 Status|RESOLVED|UNCONFIRMED
 Resolution|INVALID |---

--- Comment #4 from Kangjie Lu  ---
(In reply to Andrew Pinski from comment #3)
> There is no way in C to do that. If you want a secure language you need
> something different.

Could you please explain why there is no way in C to initialize padding?
Besides performance (I understand that the unaligned initialization could be
expensive), any other reasons?

Thanks!

[Bug driver/77992] Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Andrew Pinski  ---
There is no way in C to do that. If you want a secure language you need
something different.

[Bug driver/77992] Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread kjlu at gatech dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

Kangjie Lu  changed:

   What|Removed |Added

 Status|RESOLVED|UNCONFIRMED
 Resolution|INVALID |---

--- Comment #2 from Kangjie Lu  ---
Then I guess this is an unspecified area in C11.

Anyway, the failure to initialize the padding bytes will cause information
leaks; many leaks have been confirmed.

I would suggest that GCC initialize padding bytes even if it is not specified
in C11.


Thanks,
Kangjie

[Bug driver/77992] Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

Martin Sebor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||msebor at gcc dot gnu.org
 Resolution|--- |INVALID

--- Comment #1 from Martin Sebor  ---
As the quoted text states, paragraph 10 applies to objects with static or
thread storage duration.  The object s in the test program has automatic
storage duration, so the requirement doesn't apply, and padding bits are not
required to be initialized.
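
The distinction can be seen in a small sketch.  The object name static_s and
the helper all_bytes_zero are hypothetical; the struct mirrors the one in the
test program.

```c
#include <assert.h>
#include <stddef.h>

struct S
{
  long l;
  char c;
};

/* Static storage duration and no explicit initializer: this is the
   case C11 6.7.9/10 covers, so members and padding start as zero bits.  */
static struct S static_s;

/* Return 1 if every byte of the object representation is zero.  */
static int
all_bytes_zero (const void *p, size_t n)
{
  const unsigned char *bytes = p;

  assert (p != NULL);
  for (size_t i = 0; i < n; i++)
    if (bytes[i] != 0)
      return 0;
  return 1;
}
```

Calling all_bytes_zero (&static_s, sizeof static_s) is expected to return 1.
The same call on an automatic struct S initialized member by member carries
no such guarantee for the padding bytes.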

Re: [PATCH, libfortran] PR 48587 Newunit allocator

2016-10-14 Thread Bernhard Reutner-Fischer
On 14 October 2016 22:41:25 CEST, Janne Blomqvist  
wrote:
>On Fri, Oct 14, 2016 at 8:01 PM, Bernhard Reutner-Fischer
> wrote:
>> On 13 October 2016 22:08:21 CEST, Jerry DeLisle
> wrote:
>>>On 10/13/2016 08:16 AM, Janne Blomqvist wrote:
>>

 Regtested on x86_64-pc-linux-gnu. Ok for trunk?

>>>
>>>Yes, OK, clever! Thanks!
>>
>> Is 32 something a typical program uses?
>
>Probably not. Then again, wasting a puny 32 bytes vs. the time it
>takes to do one or two extra realloc+copy operations when opening that
>many files?

Every realloc I'm aware of uses pools.

>
>> I'd have started at 8 and had not doubled but += 16 fwiw.
>
>I can certainly start at a smaller value like 8 or 16, but I'd like to

Yes please.

>keep the multiplicative increase in order to get O(log(N))
>reallocs+copys rather than O(N) when increasing the size.

Bike-shedding, but if she's going to use that many units, O(log(N)) will be
nothing compared to the expected insn storm to follow. Incrementing by
max(initial value, 64), let's say, short of doubling the initial value, is
still an overestimate IMHO.
Thanks for taking care of this either way.
Cheers
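
To make the trade-off discussed above concrete, the number of reallocations
under each growth policy can be counted directly.  This is a small sketch
with hypothetical helper names; it does not touch libgfortran's actual
allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Reallocations needed to grow from `initial` to at least `target`
   slots when the capacity doubles each time: O(log(N)).  */
static unsigned
reallocs_doubling (size_t initial, size_t target)
{
  unsigned count = 0;

  assert (initial > 0);
  for (size_t cap = initial; cap < target; cap *= 2)
    count++;
  return count;
}

/* Same, but the capacity grows by a fixed `step` each time: O(N).  */
static unsigned
reallocs_additive (size_t initial, size_t step, size_t target)
{
  unsigned count = 0;

  assert (step > 0);
  for (size_t cap = initial; cap < target; cap += step)
    count++;
  return count;
}
```

Growing from 8 to 1024 units takes 7 reallocations when doubling and 64 when
adding 16 each time.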



[Bug driver/77992] New: Failures to initialize padding bytes -- causing many information leaks

2016-10-14 Thread kjlu at gatech dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

Bug ID: 77992
   Summary: Failures to initialize padding bytes -- causing many
information leaks
   Product: gcc
   Version: 5.4.0
Status: UNCONFIRMED
  Severity: critical
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kjlu at gatech dot edu
  Target Milestone: ---

Created attachment 39817
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39817&action=edit
testcase

Hello,

I'd like to report an implementation (or even design) problem in GCC.

Chapter §6.7.9/10 in C11:
"If an object that has static or thread storage duration is not initialized
explicitly, then:
...
if it is an aggregate, every member is initialized (recursively) according to
these rules, and any padding is initialized to zero bits;"

According to this specification, padding bytes should be initialized when the
initializer is static.
Take a look at this example (say x86_64):
/
struct S {
    long l;
    char c;
};

int main (void) {
    struct S s = {
        .l = 0,
        .c = 0
    };
}
/
The developer has carefully initialized all fields with constants.
Object "s" is supposed to be fully initialized, i.e., the seven padding bytes
right after "s.c" are supposed to be initialized.
However, these padding bytes are in fact not initialized.
In contrast, LLVM would initialize the padding bytes in such a case.
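
The amount of tail padding, and the usual manual workaround, can be sketched
as follows.  The helper names are hypothetical.  One caveat: clearing the
object with memset before assigning the members is common practice, but C11
lets padding bytes take unspecified values whenever a member is stored, so
the language itself does not promise the padding stays zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct S
{
  long l;
  char c;
};

/* Bytes of padding after the last member `c`; this is 7 on x86_64,
   where long is 8 bytes and the struct is 16 bytes.  */
static size_t
tail_padding (void)
{
  return sizeof (struct S) - (offsetof (struct S, c) + sizeof (char));
}

/* Clear the whole object, padding included, then assign the members.  */
static void
init_cleared (struct S *s, long l, char c)
{
  assert (s != NULL);
  memset (s, 0, sizeof *s);
  s->l = l;
  s->c = c;
}
```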

Similarly, when "variables" are used to initialize the fields of "s", padding
bytes are not initialized either, such as:
/
struct S s = {
    .l = variable1,
    .c = variable2
};
/

Such failures to initialize padding bytes will result in many information
leaks. We have found many information leaks in the Linux kernel.
Here is an example:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-4482
More information can be found in our research paper:
http://www.cc.gatech.edu/~klu38/publications/unisan-ccs16.pdf

The testing program for reproducing the leak is attached.

Testing environment:
"Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-5 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib
--disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2)"


My suggestion to reliably address this problem is that padding bytes of an
object, which are implicitly introduced by compilers, should be
zero-initialized upon object allocation.

Please let me know if you need more information or any assistance.

Best Regards,
Kangjie Lu

[Bug libstdc++/77987] unique_ptr<T[]> reset rejects cv-compatible pointers

2016-10-14 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77987

--- Comment #1 from Jonathan Wakely  ---
Seems simple enough to fix:

@@ -608,8 +608,9 @@
   >
>>
   void
-  reset(_Up __p) noexcept
+  reset(_Up __ptr) noexcept
   {
+   pointer __p = __ptr;
using std::swap;
swap(std::get<0>(_M_t), __p);
if (__p != nullptr)

Re: [PATCH] PR fortran/77978 -- STOP code fixes

2016-10-14 Thread Steve Kargl
On Fri, Oct 14, 2016 at 05:24:53PM -0700, Steve Kargl wrote:
> For the code 
> 
>   program foo
> stop merge(667, 668, .true.)
>   end 
> 
> gfortran with either -std=f95 or -std=f2003 should reject 
> this code.  My patch does not fix this issue, because it
> would (1) require a complete rewrite of gfc_match_stopcode
> (which I am not willing to do) and (2) it simply is a vastly
> unimportant corner case that gives the desired behavior.
> 

I take it back.  It is sort of accidentally fixed with a
completely unintelligent error for -std=f95.  It is
accepted for -std=f2003.  Either way I have no intention
of fixing this usage.

troutmask:sgk[492] gfc7 -c -std=f2003 a.f90
troutmask:sgk[493] gfc7 -c -std=f95 a.f90
a.f90:2:6:

   stop merge(667,668,.true.)
  1
Error: Fortran 2003: Elemental function as initialization expression
with non-integer/non-character arguments at (1)


-- 
Steve


Go patch committed: copy runtime package time code from Go 1.7 runtime

2016-10-14 Thread Ian Lance Taylor
This patch to the Go frontend and libgo copies the runtime package
time code from the Go 1.7 runtime.

This tweaks the frontend to fix the handling of function values for
-fgo-c-header to generate FuncVal*, not simply FuncVal.

While we're here change runtime.nanotime to use clock_gettime with
CLOCK_MONOTONIC, rather than gettimeofday.  This is what the gc
library does.  It provides nanosecond precision and a monotonic clock.
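
For reference, reading a monotonic clock with nanosecond resolution through
the POSIX interface looks roughly like this in C.  The function name is
hypothetical and this is not the libgo implementation, only a sketch of the
clock_gettime call mentioned above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds on the monotonic clock.  Unlike gettimeofday, this
   clock is not affected when the wall-clock time is set.  */
static int64_t
monotonic_nanotime (void)
{
  struct timespec ts;
  int rc = clock_gettime (CLOCK_MONOTONIC, &ts);

  assert (rc == 0);
  (void) rc;
  return (int64_t) ts.tv_sec * 1000000000 + ts.tv_nsec;
}
```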

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 241189)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-ec3dc927da71d15cac48a13c0fb0c1f94572d0d2
+880cb0a45590d992880fc6aabc7484e54c817eeb
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/types.cc
===
--- gcc/go/gofrontend/types.cc  (revision 240942)
+++ gcc/go/gofrontend/types.cc  (working copy)
@@ -5928,7 +5928,7 @@ Struct_type::write_field_to_c_header(std
   break;
 
 case TYPE_FUNCTION:
-  os << "FuncVal";
+  os << "FuncVal*";
   break;
 
 case TYPE_POINTER:
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 241189)
+++ libgo/Makefile.am   (working copy)
@@ -519,7 +519,6 @@ runtime_files = \
reflect.c \
runtime1.c \
sigqueue.c \
-   time.c \
$(runtime_getncpu_file)
 
 goc2c.$(OBJEXT): runtime/goc2c.c
Index: libgo/go/runtime/stubs.go
===
--- libgo/go/runtime/stubs.go   (revision 241189)
+++ libgo/go/runtime/stubs.go   (working copy)
@@ -196,15 +196,15 @@ func getcallersp(argp unsafe.Pointer) ui
 // argp used in Defer structs when there is no argp.
 const _NoArgs = ^uintptr(0)
 
-// //go:linkname time_now time.now
-// func time_now() (sec int64, nsec int32)
+//go:linkname time_now time.now
+func time_now() (sec int64, nsec int32)
 
-/*
+// For gccgo, expose this for C callers.
+//go:linkname unixnanotime runtime.unixnanotime
 func unixnanotime() int64 {
sec, nsec := time_now()
return sec*1e9 + int64(nsec)
 }
-*/
 
 // round n up to a multiple of a.  a must be a power of 2.
 func round(n, a uintptr) uintptr {
Index: libgo/go/runtime/time.go
===
--- libgo/go/runtime/time.go   (revision 0)
+++ libgo/go/runtime/time.go   (working copy)
@@ -0,0 +1,307 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// Time-related runtime and pieces of package time.
+
+package runtime
+
+import "unsafe"
+
+// Export temporarily for gccgo's C code to call:
+//go:linkname addtimer runtime.addtimer
+//go:linkname deltimer runtime.deltimer
+
+// Package time knows the layout of this structure.
+// If this struct changes, adjust ../time/sleep.go:/runtimeTimer.
+// For GOOS=nacl, package syscall knows the layout of this structure.
+// If this struct changes, adjust ../syscall/net_nacl.go:/runtimeTimer.
+type timer struct {
+   i int // heap index
+
+   // Timer wakes up at when, and then at when+period, ... (period > 0 only)
+   // each time calling f(arg, now) in the timer goroutine, so f must be
+   // a well-behaved function and not block.
+   when   int64
+   period int64
+   f      func(interface{}, uintptr)
+   arg    interface{}
+   seq    uintptr
+}
+
+var timers struct {
+   lock         mutex
+   gp           *g
+   created      bool
+   sleeping     bool
+   rescheduling bool
+   waitnote     note
+   t            []*timer
+}
+
+// nacl fake time support - time in nanoseconds since 1970
+var faketime int64
+
+// Package time APIs.
+// Godoc uses the comments in package time, not these.
+
+// time.now is implemented in assembly.
+
+// timeSleep puts the current goroutine to sleep for at least ns nanoseconds.
+//go:linkname timeSleep time.Sleep
+func timeSleep(ns int64) {
+   if ns <= 0 {
+   return
+   }
+
+   t := new(timer)
+   t.when = nanotime() + ns
+   t.f = goroutineReady
+   t.arg = getg()
+   lock(&timers.lock)
+   addtimerLocked(t)
+   goparkunlock(&timers.lock, "sleep", traceEvGoSleep, 2)
+}
+
+// startTimer adds t to the timer heap.
+//go:linkname startTimer time.startTimer
+func startTimer(t *timer) {
+   if raceenabled {
+   racerelease(unsafe.Pointer(t))
+   }
+   addtimer(t)
+}
+
+// stopTimer removes t from the timer heap if it is there.
+// It returns true if t was removed, false if t wasn't even there.
+//go:linkname stopTimer time.stopTimer
+func stopTimer(t *timer) bool {
+   

[PATCH] PR fortran/77978 -- STOP code fixes

2016-10-14 Thread Steve Kargl
The attached patch fixes a number of shortcomings with
STOP codes in gfortran.  The updated comment in the
code nicely summarizes the problem.

 /* Match a number or character constant after an (ERROR) STOP or PAUSE
-   statement.  */
+   statement.  The requirements for a stop-code differs in the standards.
+
+   Fortran 95 has
+
+   R840 stop-stmt  is STOP [ stop-code ]
+   R841 stop-code  is scalar-char-constant
+   or digit [ digit [ digit [ digit [ digit ] ] ] ]
+
+   Fortran 2003 is the same as Fortran 95 except R840 and R841 are now
+   R849 and R850.
+
+   Fortran 2008 has
+
+   R855 stop-stmt is STOP [ stop-code ]
+   R856 allstop-stmt  is ALL STOP [ stop-code ]
+   R857 stop-code is scalar-default-char-constant-expr
+  or scalar-int-constant-expr
+*/

So, the F95/2003 "digit [...]" is not a scalar-int-constant-expr.
It sort of looks like a statement label, but of course it is not
a statement label as the stop code does not label anything.  Currently,
gfortran parses "digit [...]" as an expression.  I've added the
necessary checking that "digit [...]" is valid with one exception. 
For the code 

  program foo
stop merge(667, 668, .true.)
  end 

gfortran with either -std=f95 or -std=f2003 should reject 
this code.  My patch does not fix this issue, because it
would (1) require a complete rewrite of gfc_match_stopcode
(which I am not willing to do) and (2) it simply is a vastly
unimportant corner case that gives the desired behavior.
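
The F95/2003 digit form of a stop-code (R841: one to five digits) is easy to
state as a standalone check.  The function below is a hypothetical
illustration in C; it is not how gfc_match_stopcode is written, since that
function parses the stop-code as an expression.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if s consists of exactly one to five decimal digits,
   i.e. "digit [ digit [ digit [ digit [ digit ] ] ] ]".  */
static bool
is_f95_digit_stop_code (const char *s)
{
  size_t n = 0;

  assert (s != NULL);
  while (isdigit ((unsigned char) s[n]))
    n++;

  return n >= 1 && n <= 5 && s[n] == '\0';
}
```

Under this rule "667" is a valid F95 stop-code, while the text
"merge(667, 668, .true.)" is not.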

A second issue raised by John in PR fortran/77978 is that
for F95/2003, the following is valid free-form source code:

  program foo
stop666
  end

but is invalid F2008.  The patch fixes this bug, too.

OK to commit?

2016-10-XX  Steven G. Kargl  

PR fortran/77978
* match.c (gfc_match_stopcode): Fix error reporting for several
deficiencies in matching STOP codes.
 
2016-10-XX  Steven G. Kargl  

PR fortran/77978
* gfortran.dg/pr77978_1.f90: New test.
* gfortran.dg/pr77978_2.f90: Ditto.
* gfortran.dg/pr77978_3.f90: Ditto.

-- 
Steve
Index: gcc/fortran/match.c
===
--- gcc/fortran/match.c	(revision 241074)
+++ gcc/fortran/match.c	(working copy)
@@ -2732,7 +2732,24 @@ gfc_match_cycle (void)
 
 
 /* Match a number or character constant after an (ERROR) STOP or PAUSE
-   statement.  */
+   statement.  The requirements for a stop-code differs in the standards.
+
+   Fortran 95 has
+
+   R840 stop-stmt  is STOP [ stop-code ]
+   R841 stop-code  is scalar-char-constant
+   or digit [ digit [ digit [ digit [ digit ] ] ] ]
+
+   Fortran 2003 is the same as Fortran 95 except R840 and R841 are now
+   R849 and R850.
+
+   Fortran 2008 has
+
+   R855 stop-stmt is STOP [ stop-code ]
+   R856 allstop-stmt  is ALL STOP [ stop-code ]
+   R857 stop-code is scalar-default-char-constant-expr
+  or scalar-int-constant-expr
+*/
 
 static match
 gfc_match_stopcode (gfc_statement st)
@@ -2740,6 +2757,27 @@ gfc_match_stopcode (gfc_statement st)
   gfc_expr *e;
   match m;
 
+  /* The default selected Standards. */
+  int std = GFC_STD_GNU | GFC_STD_LEGACY | GFC_STD_F77 | GFC_STD_F95
+	  | GFC_STD_F95_OBS | GFC_STD_F95_DEL | GFC_STD_F2003
+	  | GFC_STD_F2008 | GFC_STD_F2008_OBS | GFC_STD_F2008_TS;
+
+  if (gfc_current_form != FORM_FIXED)
+{
+  char c;
+
+  c = gfc_peek_ascii_char ();
+
+  if (c != ' '
+	  && gfc_option.allow_std != std
+	  && (gfc_option.allow_std & GFC_STD_F2008))
+	{
+	  gfc_error ("Blank required in %s statement near %C",
+		 gfc_ascii_statement (st));
+	  return MATCH_ERROR;
+	}
+}
+
   e = NULL;
 
   if (gfc_match_eos () != MATCH_YES)
@@ -2785,6 +2823,15 @@ gfc_match_stopcode (gfc_statement st)
 
   if (e != NULL)
 {
+  gfc_simplify_expr (e, 0);
+
+  if (e->expr_type != EXPR_CONSTANT)
+	{
+	  gfc_error ("STOP code at %L must be a constant expression",
+		 &e->where);
+	  goto cleanup;
+	}
+
   if (!(e->ts.type == BT_CHARACTER || e->ts.type == BT_INTEGER))
 	{
 	  gfc_error ("STOP code at %L must be either INTEGER or CHARACTER type",
@@ -2794,8 +2841,7 @@ gfc_match_stopcode (gfc_statement st)
 
   if (e->rank != 0)
 	{
-	  gfc_error ("STOP code at %L must be scalar",
-		 &e->where);
+	  gfc_error ("STOP code at %L must be scalar", &e->where);
 	  goto cleanup;
 	}
 
@@ -2807,12 +2853,35 @@ gfc_match_stopcode (gfc_statement st)
 	  goto cleanup;
 	}
 
-  if (e->ts.type == BT_INTEGER
-	  && e->ts.kind != gfc_default_integer_kind)
+  if (e->ts.type == BT_INTEGER)
 	{
-	  gfc_error ("STOP code at %L must be default integer KIND=%d",
-		 &e->where, (int) gfc_default_integer_kind);
-	  goto cleanup;
+	  if (e->ts.kind != gfc_default_integer_kind)
+	{
+	  gfc_error ("STOP code at %L must be default integer KIND=%d",
+			 &e->where, (int) gfc_default_integer_kind);
+	  goto 

[Bug middle-end/77991] New: ICE on x32 in plus_constant, at explow.c:87

2016-10-14 Thread kilobyte at angband dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77991

Bug ID: 77991
   Summary: ICE on x32 in plus_constant, at explow.c:87
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kilobyte at angband dot pl
  Target Milestone: ---

Created attachment 39816
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39816&action=edit
reduced reproducer

The attached test case ICEs on x32 target in plus_constant, at explow.c:87,
when compiling with -O1 or higher.  No splat with -O0.

The test case comes from a new version of qemu, where this ICE triggers on a
number of files.  No other target seems to be affected, as qemu built
successfully on all architectures but x32.

Host arch doesn't seem to matter, only target: reproducible with -mx32 on amd64
too.

Reproduced on Debian packages of gcc 5.4.1-2, 6.2.0-6 and 20161006-1 trunk
snapshot.

Minimal invocation:
(on x32) gcc -O -c f.i
(on amd64) gcc -mx32 -O -c f.i

[Bug fortran/77978] stop codes misinterpreted in both f2003 and f2008

2016-10-14 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77978

kargl at gcc dot gnu.org changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |kargl at gcc dot gnu.org

--- Comment #3 from kargl at gcc dot gnu.org ---
I have a patch cooking.

   Fortran 95 has

   R840 stop-stmt  is STOP [ stop-code ]
   R841 stop-code  is scalar-char-constant
                   or digit [ digit [ digit [ digit [ digit ] ] ] ]

   Fortran 2003 is the same as Fortran 95 except R840 and R841 are now
   R849 and R850.

   Fortran 2008 has

   R855 stop-stmt     is STOP [ stop-code ]
   R856 allstop-stmt  is ALL STOP [ stop-code ]
   R857 stop-code     is scalar-default-char-constant-expr
                      or scalar-int-constant-expr
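
To make the Fortran 95 rule concrete, here is a minimal C++ sketch of the
numeric form of R841 only (one to five digits, no sign).  The helper name
is_f95_digit_stop_code is invented for this illustration and is not gfortran
code; the character-constant form is deliberately not handled.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper, not part of gfortran: returns true if TEXT matches
// the numeric form of the Fortran 95 stop-code (R841), i.e. one to five
// decimal digits with no sign and no embedded blanks.
bool is_f95_digit_stop_code (const std::string &text)
{
  if (text.empty () || text.size () > 5)
    return false;
  for (std::string::size_type i = 0; i < text.size (); ++i)
    if (!std::isdigit (static_cast<unsigned char> (text[i])))
      return false;
  return true;
}
```

Under this rule "666" is accepted while "-66" is rejected; the Fortran 2008
scalar-int-constant-expr form would accept both.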

[Bug libstdc++/77987] unique_ptr<T[]> reset rejects cv-compatible pointers

2016-10-14 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77987

Jonathan Wakely  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                    |ASSIGNED
   Last reconfirmed|                               |2016-10-14
           Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org
     Ever confirmed|0                              |1

Re: [PATCH, rs6000] pr65479 Add option to fix failing asan test cases.

2016-10-14 Thread Segher Boessenkool
On Fri, Oct 14, 2016 at 02:37:42PM -0500, Bill Seurer wrote:
> [PATCH, rs6000] pr65479 Add option to fix failing asan test cases.
> 
> This patch adds the -fasynchronous-unwind-tables option to several of the asan
> test cases.  The option causes a full stack trace to be produced when the
> sanitizer detects an error.  Without the full trace the 3 test cases fail.

Should we enable this whenever asan is used, instead?  Not just in the
testsuite?  And is this actually PowerPC-specific?


Segher


Re: [PATCH] Improve DWARF constant attribute langhooks

2016-10-14 Thread Jason Merrill
OK.

On Fri, Oct 14, 2016 at 1:29 PM, Jakub Jelinek  wrote:
> Hi!
>
> Before early dwarf changes, if we wanted to note some decl property so that
> some corresponding DWARF attribute can be emitted, we had to use some
> generic IL bit for that.  Now a langhook can be used instead (hopefully for
> 7.x even with LTO), but having a single langhook for each such bit looks
> excessive to me, when all we actually want is forward some bits from the C++
> FE lang structures/macros to dwarf2out.
>
> So, this patch introduces a lang hook through which dwarf2out can ask if
> some DW_AT_* attribute should be added to decl (it is dwarf2out's business
> to guard it with dwarf_version, dwarf_strict and other conditions), and the
> lang hook just returns -1 if nothing should be added (most attributes we
> care about here have either boolean 0/1 or small unsigned integer values),
> or the value of the attribute that should be added.
>
> I've converted 3 attributes to this new langhook.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-10-14  Jakub Jelinek  
>
> * langhooks.h (struct lang_hooks_for_decls): Remove
> function_decl_explicit_p, function_decl_deleted_p and
> function_decl_defaulted hooks.  Add decl_dwarf_attribute hook.
> * langhooks-def.h (lhd_decl_dwarf_attribute): Declare.
> (LANG_HOOKS_FUNCTION_DECL_EXPLICIT_P,
> LANG_HOOKS_FUNCTION_DECL_DELETED_P,
> LANG_HOOKS_FUNCTION_DECL_DEFAULTED): Remove.
> (LANG_HOOKS_DECL_DWARF_ATTRIBUTE): Define.
> (LANG_HOOKS_DECLS): Remove LANG_HOOKS_FUNCTION_DECL_EXPLICIT_P,
> LANG_HOOKS_FUNCTION_DECL_DELETED_P and
> LANG_HOOKS_FUNCTION_DECL_DEFAULTED.  Add
> LANG_HOOKS_DECL_DWARF_ATTRIBUTE.
> * langhooks.c (lhd_decl_dwarf_attribute): New function.
> * dwarf2out.c (gen_subprogram_die): Use
> lang_hooks.decls.decl_dwarf_attribute instead of
> lang_hooks.decls.function_decl_*.
> cp/
> * cp-objcp-common.h (cp_function_decl_explicit_p,
> cp_function_decl_deleted_p, cp_function_decl_defaulted): Remove.
> (cp_decl_dwarf_attribute): Declare.
> (LANG_HOOKS_FUNCTION_DECL_EXPLICIT_P,
> LANG_HOOKS_FUNCTION_DECL_DELETED_P,
> LANG_HOOKS_FUNCTION_DECL_DEFAULTED): Remove.
> (LANG_HOOKS_DECL_DWARF_ATTRIBUTE): Redefine.
> * cp-objcp-common.c (cp_function_decl_explicit_p,
> cp_function_decl_deleted_p, cp_function_decl_defaulted): Remove.
> (cp_decl_dwarf_attribute): New function.
>
> --- gcc/langhooks.h.jj  2016-10-13 10:24:46.0 +0200
> +++ gcc/langhooks.h 2016-10-14 14:27:07.806695803 +0200
> @@ -182,16 +182,9 @@ struct lang_hooks_for_decls
>/* Returns the chain of decls so far in the current scope level.  */
>tree (*getdecls) (void);
>
> -  /* Returns true if DECL is explicit member function.  */
> -  bool (*function_decl_explicit_p) (const_tree);
> -
> -  /* Returns true if DECL is C++11 deleted special member function.  */
> -  bool (*function_decl_deleted_p) (const_tree);
> -
> -  /* Returns 0 if DECL is NOT a C++11 defaulted special member
> - function, 1 if it is explicitly defaulted within the class body,
> - or 2 if it is explicitly defaulted outside the class body.  */
> -  int (*function_decl_defaulted) (const_tree);
> +  /* Returns -1 if dwarf ATTR shouldn't be added for DECL, or the attribute
> + value otherwise.  */
> +  int (*decl_dwarf_attribute) (const_tree, int);
>
>/* Returns True if the parameter is a generic parameter decl
>   of a generic type, e.g a template template parameter for the C++ FE.  */
> --- gcc/langhooks-def.h.jj  2016-10-13 10:28:19.0 +0200
> +++ gcc/langhooks-def.h 2016-10-14 14:29:09.535146412 +0200
> @@ -83,6 +83,7 @@ extern bool lhd_omp_mappable_type (tree)
>
>  extern const char *lhd_get_substring_location (const substring_loc &,
>location_t *out_loc);
> +extern int lhd_decl_dwarf_attribute (const_tree, int);
>
>  #define LANG_HOOKS_NAME"GNU unknown"
>  #define LANG_HOOKS_IDENTIFIER_SIZE sizeof (struct lang_identifier)
> @@ -213,9 +214,7 @@ extern tree lhd_make_node (enum tree_cod
>  #define LANG_HOOKS_GLOBAL_BINDINGS_P global_bindings_p
>  #define LANG_HOOKS_PUSHDECLpushdecl
>  #define LANG_HOOKS_GETDECLSgetdecls
> -#define LANG_HOOKS_FUNCTION_DECL_EXPLICIT_P hook_bool_const_tree_false
> -#define LANG_HOOKS_FUNCTION_DECL_DELETED_P hook_bool_const_tree_false
> -#define LANG_HOOKS_FUNCTION_DECL_DEFAULTED hook_int_const_tree_0
> +#define LANG_HOOKS_DECL_DWARF_ATTRIBUTE lhd_decl_dwarf_attribute
>  #define LANG_HOOKS_WARN_UNUSED_GLOBAL_DECL lhd_warn_unused_global_decl
>  #define LANG_HOOKS_POST_COMPILATION_PARSING_CLEANUPS NULL
>  #define LANG_HOOKS_DECL_OK_FOR_SIBCALL lhd_decl_ok_for_sibcall
> @@ -236,9 +235,7 @@ extern tree lhd_make_node (enum tree_cod
> 

C++ PATCH for P0017, C++17 aggregates with bases

2016-10-14 Thread Jason Merrill
Implementing this proposal was a pretty straightforward matter of
changing the definition of aggregate and treating artificial base
fields as initializable in aggregate initialization.  For this to work
with empty bases, I needed to create base fields for them, which we
haven't done in the past.  build_base_field warned about problems with
empty base fields confusing the back end, but I wasn't able to find
any such trouble.  For the time being I'm only creating them in C++17
mode, to limit any regressions.
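
For readers unfamiliar with P0017, here is a small sketch of what the new
rule permits.  The type and function names are made up for illustration, and
the code assumes a compiler and library with C++17 support (-std=c++17 or
later).

```cpp
#include <cassert>
#include <type_traits>

struct Empty {};
struct Base { int a; };

// With P0017 a class with public, non-virtual bases can still be an
// aggregate.
struct Derived : Base { int b; };
struct WithEmptyBase : Empty { int x; };

static_assert (std::is_aggregate<Derived>::value, "aggregate in C++17");
static_assert (std::is_aggregate<WithEmptyBase>::value, "aggregate in C++17");

// Each base is initialized first, as if it were a leading member; an empty
// base takes an empty brace pair.
inline Derived make_derived () { return Derived{{1}, 2}; }
inline WithEmptyBase make_with_empty_base () { return WithEmptyBase{{}, 3}; }
```

The empty-base case is the one that needs a base field to exist, which is
why the patch starts creating such fields in C++17 mode.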

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 99ba4ce4ec3daa3897b6bc971381ca4b1cdc54b1
Author: Jason Merrill 
Date:   Fri Oct 14 07:45:02 2016 -0400

Implement P0017R1, C++17 aggregates with bases.

* class.c (build_base_field_1): Split out from...
(build_base_field): ...here.  In C++17 mode, build a field for
empty bases.
* decl.c (xref_basetypes): In C++17 aggregates can have bases.
(next_initializable_field): Allow base fields in C++17.
* typeck2.c (process_init_constructor_record): Likewise.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 46f1bac..9a6ea97 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -4452,6 +4452,34 @@ layout_empty_base (record_layout_info rli, tree binfo,
   return atend;
 }
 
+/* Build the FIELD_DECL for BASETYPE as a base of T, add it to the chain of
+   fields at NEXT_FIELD, and return it.  */
+
+static tree
+build_base_field_1 (tree t, tree basetype, tree *&next_field)
+{
+  /* Create the FIELD_DECL.  */
+  gcc_assert (CLASSTYPE_AS_BASE (basetype));
+  tree decl = build_decl (input_location,
+ FIELD_DECL, NULL_TREE, CLASSTYPE_AS_BASE (basetype));
+  DECL_ARTIFICIAL (decl) = 1;
+  DECL_IGNORED_P (decl) = 1;
+  DECL_FIELD_CONTEXT (decl) = t;
+  DECL_SIZE (decl) = CLASSTYPE_SIZE (basetype);
+  DECL_SIZE_UNIT (decl) = CLASSTYPE_SIZE_UNIT (basetype);
+  SET_DECL_ALIGN (decl, CLASSTYPE_ALIGN (basetype));
+  DECL_USER_ALIGN (decl) = CLASSTYPE_USER_ALIGN (basetype);
+  DECL_MODE (decl) = TYPE_MODE (basetype);
+  DECL_FIELD_IS_BASE (decl) = 1;
+
+  /* Add the new FIELD_DECL to the list of fields for T.  */
+  DECL_CHAIN (decl) = *next_field;
+  *next_field = decl;
+  next_field = &DECL_CHAIN (decl);
+
+  return decl;
+}
+
 /* Layout the base given by BINFO in the class indicated by RLI.
*BASE_ALIGN is a running maximum of the alignments of
any base class.  OFFSETS gives the location of empty base
@@ -4483,29 +4511,12 @@ build_base_field (record_layout_info rli, tree binfo,
   CLASSTYPE_EMPTY_P (t) = 0;
 
   /* Create the FIELD_DECL.  */
-  decl = build_decl (input_location,
-FIELD_DECL, NULL_TREE, CLASSTYPE_AS_BASE (basetype));
-  DECL_ARTIFICIAL (decl) = 1;
-  DECL_IGNORED_P (decl) = 1;
-  DECL_FIELD_CONTEXT (decl) = t;
-  if (CLASSTYPE_AS_BASE (basetype))
-   {
- DECL_SIZE (decl) = CLASSTYPE_SIZE (basetype);
- DECL_SIZE_UNIT (decl) = CLASSTYPE_SIZE_UNIT (basetype);
- SET_DECL_ALIGN (decl, CLASSTYPE_ALIGN (basetype));
- DECL_USER_ALIGN (decl) = CLASSTYPE_USER_ALIGN (basetype);
- DECL_MODE (decl) = TYPE_MODE (basetype);
- DECL_FIELD_IS_BASE (decl) = 1;
+  decl = build_base_field_1 (t, basetype, next_field);
 
- /* Try to place the field.  It may take more than one try if we
-have a hard time placing the field without putting two
-objects of the same type at the same address.  */
- layout_nonempty_base_or_field (rli, decl, binfo, offsets);
- /* Add the new FIELD_DECL to the list of fields for T.  */
- DECL_CHAIN (decl) = *next_field;
- *next_field = decl;
- next_field = &DECL_CHAIN (decl);
-   }
+  /* Try to place the field.  It may take more than one try if we
+have a hard time placing the field without putting two
+objects of the same type at the same address.  */
+  layout_nonempty_base_or_field (rli, decl, binfo, offsets);
 }
   else
 {
@@ -4541,6 +4552,13 @@ build_base_field (record_layout_info rli, tree binfo,
 create CONSTRUCTORs for the class by iterating over the
 FIELD_DECLs, and the back end does not handle overlapping
 FIELD_DECLs.  */
+  if (cxx_dialect >= cxx1z && !BINFO_VIRTUAL_P (binfo))
+   {
+ tree decl = build_base_field_1 (t, basetype, next_field);
+ DECL_FIELD_OFFSET (decl) = BINFO_OFFSET (binfo);
+ DECL_FIELD_BIT_OFFSET (decl) = bitsize_zero_node;
+ SET_DECL_OFFSET_ALIGN (decl, BITS_PER_UNIT);
+   }
 
   /* An empty virtual base causes a class to be non-empty
 -- but in that case we do not need to clear CLASSTYPE_EMPTY_P
@@ -6586,7 +6604,7 @@ layout_class_type (tree t, tree *virtuals_p)
 
   /* Make sure that empty classes are reflected in RLI at this
  point.  */
-  include_empty_classes(rli);
+  include_empty_classes (rli);
 
   

[PATCH, libfortran] PR 48587 Newunit allocator

2016-10-14 Thread Janne Blomqvist
Currently GFortran never reuses unit numbers allocated with NEWUNIT=,
instead having a simple counter that is decremented each time such a
unit is opened.  For a long running program which repeatedly opens
files with NEWUNIT= and closes them, the counter can wrap around and
cause an abort.  This patch replaces the counter with an allocator
that keeps track of which unit numbers are allocated, and can reuse
them once they have been deallocated.  Since operating systems tend to
limit the number of simultaneous open files for a process to a
relatively modest number, a relatively simple approach with a linear
scan through an array suffices.  Though as a small optimization there
is a low water indicator keeping track of the index for which all unit
numbers below are already allocated.  This linear scan also ensures
that we always allocate the smallest available unit number.
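
The scheme described above can be sketched in C++ as follows.  This is only
an illustration under stated assumptions, not the code in the patch: the
class name, the member names, and the starting value -10 are hypothetical,
and the real allocator in libgfortran/io/unit.c is written in C.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative value only: the first unit number handed out.  Later units
// are more negative.
const int kFirstNewunit = -10;

class NewunitAllocator
{
public:
  NewunitAllocator () : in_use_ (16, false), low_water_ (0) {}

  // Linear scan from the low water index; returns the free unit number
  // closest to kFirstNewunit and marks it as allocated.
  int alloc ()
  {
    std::size_t i = low_water_;
    while (i < in_use_.size () && in_use_[i])
      ++i;
    if (i == in_use_.size ())
      in_use_.resize (in_use_.size () * 2, false);  // Grow by doubling.
    in_use_[i] = true;
    low_water_ = i + 1;  // Every slot below i + 1 is now known to be taken.
    return kFirstNewunit - static_cast<int> (i);
  }

  // Mark UNIT as free again so that a later alloc () can reuse it.
  void release (int unit)
  {
    const std::size_t i = static_cast<std::size_t> (kFirstNewunit - unit);
    assert (i < in_use_.size () && in_use_[i]);
    in_use_[i] = false;
    if (i < low_water_)
      low_water_ = i;
  }

private:
  std::vector<bool> in_use_;  // in_use_[i] is true if slot i is allocated.
  std::size_t low_water_;     // All slots below this index are allocated.
};
```

Releasing a unit lowers the low water index when needed, so the next
allocation finds the freed slot first and the numbers in use stay compact.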

2016-10-15  Janne Blomqvist  

PR libfortran/48587
* io/io.h (get_unique_unit_number): Remove prototype.
(newunit_alloc): New prototype.
* io/open.c (st_open): Call newunit_alloc.
* io/unit.c (newunits,newunit_size,newunit_lwi): New static
variables.
(GFC_FIRST_NEWUNIT): Rename to NEWUNIT_START.
(next_available_newunit): Remove variable.
(get_unit): Call newunit_alloc, don't try to create negative
external unit.
(close_unit_1): Call newunit_free.
(close_units): Free newunits array.
(get_unique_number): Remove function.
(newunit_alloc): New function.
(newunit_free): New function.
* io/transfer.c (data_transfer_init): Check for invalid unit
number.

testsuite ChangeLog:

2016-10-15  Janne Blomqvist  

PR libfortran/48587
* gfortran.dg/negative_unit2.f90: New testcase.

Regtested on x86_64-pc-linux-gnu.

Version 2 of the patch. Compared to v1:
* Check for invalid unit number in transfer.c:data_transfer_init and
  unit.c:get_unit.
* Add testcase to check invalid unit number.
* Reduce initial size of newunits array from 32 to 16.
---
 gcc/testsuite/gfortran.dg/negative_unit2.f90 |   9 +++
 libgfortran/io/io.h  |   5 +-
 libgfortran/io/open.c|   2 +-
 libgfortran/io/transfer.c|  10 ++-
 libgfortran/io/unit.c| 108 +--
 5 files changed, 107 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/negative_unit2.f90

diff --git a/gcc/testsuite/gfortran.dg/negative_unit2.f90 b/gcc/testsuite/gfortran.dg/negative_unit2.f90
new file mode 100644
index 000..e7fb85e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/negative_unit2.f90
@@ -0,0 +1,9 @@
+! { dg-do run }
+! Test case submitted by Dominique d'Humieres
+program negative_unit2
+  integer :: i, j
+  ! i should be <= NEWUNIT_FIRST in libgfortran/io/unit.c
+  i = -100
+  write(unit=i,fmt=*, iostat=j) 10
+  if (j == 0) call abort
+end program negative_unit2
diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h
index ea93fba..aaacc08 100644
--- a/libgfortran/io/io.h
+++ b/libgfortran/io/io.h
@@ -715,8 +715,9 @@ internal_proto (finish_last_advance_record);
 extern int unit_truncate (gfc_unit *, gfc_offset, st_parameter_common *);
 internal_proto (unit_truncate);
 
-extern GFC_INTEGER_4 get_unique_unit_number (st_parameter_common *);
-internal_proto(get_unique_unit_number);
+extern int newunit_alloc (void);
+internal_proto(newunit_alloc);
+
 
 /* open.c */
 
diff --git a/libgfortran/io/open.c b/libgfortran/io/open.c
index d074b02..2e7163d 100644
--- a/libgfortran/io/open.c
+++ b/libgfortran/io/open.c
@@ -812,7 +812,7 @@ st_open (st_parameter_open *opp)
   if ((opp->common.flags & IOPARM_LIBRETURN_MASK) == IOPARM_LIBRETURN_OK)
 {
   if ((opp->common.flags & IOPARM_OPEN_HAS_NEWUNIT))
-   opp->common.unit = get_unique_unit_number(&opp->common);
+   opp->common.unit = newunit_alloc ();
   else if (opp->common.unit < 0)
{
  u = find_unit (opp->common.unit);
diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index 902c020..7696cca 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -2601,7 +2601,15 @@ data_transfer_init (st_parameter_dt *dtp, int read_flag)
 
   dtp->u.p.current_unit = get_unit (dtp, 1);
 
-  if (dtp->u.p.current_unit->s == NULL)
+  if (dtp->u.p.current_unit == NULL)
+{
+  /* This means we tried to access an external unit < 0 without
+having opened it first with NEWUNIT=.  */
+  generate_error (&dtp->common, LIBERROR_BAD_OPTION,
+ "Invalid unit number in statement");
+  return;
+}
+  else if (dtp->u.p.current_unit->s == NULL)
 {  /* Open the unit with some default flags.  */
st_parameter_open opp;
unit_convert conv;
diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index 274b24b..41cd52f 100644
--- a/libgfortran/io/unit.c
+++ 

[Bug libstdc++/77990] unique_ptr<T, D>::unique_ptr(T*) imposes CopyConstructible on the deleter

2016-10-14 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77990

Jonathan Wakely  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                    |ASSIGNED
   Last reconfirmed|                               |2016-10-14
           Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org
     Ever confirmed|0                              |1

Re: [wwwdocs, coding conventions] Mention OVERRIDE/FINAL

2016-10-14 Thread Pedro Alves
On 10/14/2016 10:28 PM, David Malcolm wrote:

> I propose that we update our coding conventions to mention the OVERRIDE
> and FINAL macros in the paragraph that discusses virtual funcs.
> 
> The attached patch (to the website) does so.
> 

Good idea, I like it.

GDB is following GCC's C++ coding conventions, BTW:

  
https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards#C.2B-.2B--specific_coding_conventions

At least for starters.  :-)  Let's see how that goes.

Thanks,
Pedro Alves



[wwwdocs, coding conventions] Mention OVERRIDE/FINAL

2016-10-14 Thread David Malcolm
On Fri, 2016-10-14 at 16:27 +0100, Pedro Alves wrote:
> On 10/12/2016 03:13 PM, Bernd Schmidt wrote:
> > On 10/12/2016 04:09 PM, Pedro Alves wrote:
> > > 
> > > Thanks.  Here's a follow up patch that I was just testing.
> > > 
> > > Need this if building with "g++ -std=gnu++11", with gcc < 4.7.
> > 
> > Lovely. That's ok too if the other one goes in.
> 
> FYI, I pushed these in now.  I also bootstrapped with the
> jit included in the selected languages, and hacked the
> jit code a bit to trigger the problems OVERRIDE intends to
> catch, just to make sure it still works.
> 
> Thanks,
> Pedro Alves

I propose that we update our coding conventions to mention the OVERRIDE
and FINAL macros in the paragraph that discusses virtual funcs.

The attached patch (to the website) does so.
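
As a rough illustration of the proposed convention, here is a sketch that
uses simplified stand-ins for the two macros.  The definitions below are an
assumption made for this example (the real ones in include/ansidecl.h cover
more compiler versions), and the class names are hypothetical.

```cpp
#include <cassert>

// Simplified stand-ins, assumed for this sketch only: expand to the C++11
// keywords when they are available and to nothing otherwise.
#if __cplusplus >= 201103L
# define OVERRIDE override
# define FINAL final
#else
# define OVERRIDE
# define FINAL
#endif

struct base_visitor
{
  virtual ~base_visitor () {}
  virtual int visit () { return 0; }
};

struct counting_visitor : base_visitor
{
  // A C++11 compiler rejects this declaration if it does not actually
  // override a virtual function, e.g. after a signature change in the base.
  int visit () OVERRIDE FINAL { return 42; }
};

inline int run (base_visitor &v) { return v.visit (); }
```

With an older compiler the annotations vanish, so they cost nothing there
and still document intent for a human reader.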

OK to commit?

Index: htdocs/codingconventions.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/codingconventions.html,v
retrieving revision 1.77
diff -u -p -r1.77 codingconventions.html
--- htdocs/codingconventions.html	18 Sep 2016 13:55:17 -	1.77
+++ htdocs/codingconventions.html	14 Oct 2016 21:22:44 -
@@ -902,7 +902,10 @@ Its use with data-carrying classes is mo
 
 Think carefully about the size and performance impact
 of virtual functions and virtual bases
-before using them.
+before using them.  If you do use virtual functions, use the
+OVERRIDE and FINAL macros from
+include/ansidecl.h to annotate the code for a human reader,
+and to allow sufficiently modern C++ compilers to detect mistakes.
 
 
 


Re: [PATCH, libfortran] PR 48587 Newunit allocator

2016-10-14 Thread Janne Blomqvist
On Fri, Oct 14, 2016 at 8:01 PM, Bernhard Reutner-Fischer
 wrote:
> On 13 October 2016 22:08:21 CEST, Jerry DeLisle  wrote:
>>On 10/13/2016 08:16 AM, Janne Blomqvist wrote:
>
>>>
>>> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
>>>
>>
>>Yes, OK, clever! Thanks!
>
> Is 32 something a typical program uses?

Probably not. Then again, wasting a puny 32 bytes vs. the time it
takes to do one or two extra realloc+copy operations when opening that
many files?

> I'd have started at 8 and had not doubled but += 16 fwiw.

I can certainly start at a smaller value like 8 or 16, but I'd like to
keep the multiplicative increase in order to get O(log(N))
reallocs+copys rather than O(N) when increasing the size.
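
To put numbers on that trade-off, here is a small sketch that counts how
many times a buffer has to be regrown under each policy.  The function names
are invented for this example and nothing here is taken from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Number of regrow steps needed to reach at least TARGET slots when each
// step doubles the capacity.  START must be greater than zero.
std::size_t regrows_doubling (std::size_t start, std::size_t target)
{
  std::size_t n = 0;
  for (std::size_t cap = start; cap < target; cap *= 2)
    ++n;
  return n;
}

// Same count when each step adds a fixed STEP (greater than zero) slots.
std::size_t regrows_additive (std::size_t start, std::size_t step,
                              std::size_t target)
{
  std::size_t n = 0;
  for (std::size_t cap = start; cap < target; cap += step)
    ++n;
  return n;
}
```

Growing from 16 to 1024 slots takes 6 doublings, while starting at 8 and
adding 16 each time takes 64 steps to reach the same size.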

-- 
Janne Blomqvist


Re: fix -fmax-errors & notes

2016-10-14 Thread David Malcolm
On Fri, 2016-10-14 at 15:50 -0400, Nathan Sidwell wrote:
> On 10/14/16 15:17, David Malcolm wrote:
> 
> > "Limits the maximum number of error messages to @var{n}, at which
> > point
> > GCC bails out rather than attempting to continue processing the
> > source
> > code.  If @var{n} is 0 (the default), there is no limit on the
> > number
> > of error messages produced.  If @option{-Wfatal-errors} is also
> > specified, then @option{-Wfatal-errors} takes precedence over this
> > option."
> > 
> > I'm not sure that the above would still be true after this patch.
> 
> disagree.  The above documentation is still correct.

Yes - it's possible to interpret the docs in such a way that they're
still correct.

> > How about splitting out the bail-out code into a separate function:
> > 
> >diagnostic_handle_max_errors
> > 
> > or somesuch, and calling it before emitting a diagnostic (like in
> > your
> > patch), and *also* at various key points in compilation - perhaps
> > at
> > some of the places where we call seen_error?  That way we wouldn't
> > need
> > an additional error to happen to stop processing, and notes would
> > still
> > happen after the final error.  There could also be one just before
> > we
> > cleanup the global_dc, so that the user gets the message there, if
> > they
> > haven't gotten it yet.
> 
> that would be possible, but seems over engineered to me. The patch I
> posted is 
> clearly an improvement in the user interface.

Your patch is a definite improvement to the UI.   I agree with your
characterization of my suggestion.

The patch is OK, assuming usual testing.

Thanks

Dave


Re: [PATCH v2,rs6000] Add built-in function support for Power9 string operations

2016-10-14 Thread Segher Boessenkool
On Thu, Oct 13, 2016 at 09:45:22AM -0600, Kelvin Nilsen wrote:
> 3. Replace magic number 74 with CR6_REGNO in vsx.md (2 occurrences)
>and vector.md (3 occurrences).

Some remain, see below.

> +moves bytes 16 - @code{len} to 15 of the corresponding vector.  For the

> +the element to be extracted is found at position @code{(15 - index)}.

Which of these is what we want, "@code{123 - x}" or "123 - @code{x}"?
(I think the former).  And parens or not?  Whichever way, making it more
consistent will make it easier to read.

> +;; This expansion handles the V4SF and V2DF modes in the Power9
> +;; implementation of the vec_all_ne and vec_any_eq built-in
> +;; functions.
> +(define_expand "vector_ne__p"
> +  [(parallel
> +[(set (reg:CC 74)

CR6_REGNO

You can use 
to uncover these problems.

>  (define_expand "vector_gt__p"
>[(parallel
>  [(set (reg:CC 74)

Your patch is against an old source tree?  Trunk has CR6_REGNO here.

> +(define_insn "*vector_nez__p"
> +  [(set (reg:CC 74)

CR6_REGNO

> +;; Load VSX Vector with Length
> +(define_expand "lxvl"
> +  [(set (match_dup 3)
> +(match_operand:DI 2 "register_operand" "r"))
> +   (set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
> + (unspec:V16QI
> +  [(match_operand:DI 1 "gpc_reg_operand" "b")
> +   (match_dup 3)]
> +  UNSPEC_LXVL))]

Constraints are useless on a define_expand.

> +(define_expand "stxvl"

Here too.


Okay for trunk with those things fixed.  Thank you!


Segher


Re: [PATCH] (v2) Tweaks to print_rtx_function

2016-10-14 Thread Bernd Schmidt

On 10/14/2016 10:12 PM, David Malcolm wrote:

gcc/ChangeLog:
* print-rtl-function.c (print_edge): Omit "(flags)" when none are
set.
(print_rtx_function): Update example in comment for...
* print-rtl.c (print_rtx_operand_code_r): In compact mode, print
non-virtual pseudos with a '%' sigil followed by the regno, offset
by (LAST_VIRTUAL_REGISTER + 1), so that the first non-virtual
pseudo is dumped as "%0".


Ok.


Bernd



[Bug fortran/77978] stop codes misinterpreted in both f2003 and f2008

2016-10-14 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77978

kargl at gcc dot gnu.org changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
                 CC|                               |kargl at gcc dot gnu.org

--- Comment #2 from kargl at gcc dot gnu.org ---
(In reply to john.harper from comment #0)
> Stop codes changed from the f2003 standard to f2008. The first of these 2 
> programs has a stop code valid in f2003 but not in f2008, the second has a
> stop code valid in f2008 but not in f2003. But gfortran 6.1.1 happily
> compiles and runs both programs with either -std=f2003 or -std=f2008.
> 
> Here are the program listings, gfortran -v result, and the results of
> compiling with the -std options that should not have worked:
> 
> cayley[~/Jfh] % cat stopnumber.f90
> ! stop666 (no space before 666) is valid f95 or f2003 but bad f2008
>   implicit none
>   stop666
> end program
> cayley[~/Jfh] % cat stopnumber2.f90
> ! stop expression is valid f2008 but bad f2003
>   implicit none 
> 

It's actually worse than it appears.

% gfc7 -o z -std=f95 a.f90
% ./z
STOP -66
% cat a.f90
  implicit none
  stop -66
end program
% vi a.f90 (remove minus sign)
% gfc7 -o z -std=f95 a.f90
% ./z
STOP 66

Re: fix -fmax-errors & notes

2016-10-14 Thread Nathan Sidwell

On 10/14/16 15:17, David Malcolm wrote:


"Limits the maximum number of error messages to @var{n}, at which point
GCC bails out rather than attempting to continue processing the source
code.  If @var{n} is 0 (the default), there is no limit on the number
of error messages produced.  If @option{-Wfatal-errors} is also
specified, then @option{-Wfatal-errors} takes precedence over this
option."

I'm not sure that the above would still be true after this patch.


disagree.  The above documentation is still correct.


How about splitting out the bail-out code into a separate function:

   diagnostic_handle_max_errors

or somesuch, and calling it before emitting a diagnostic (like in your
patch), and *also* at various key points in compilation - perhaps at
some of the places where we call seen_error?  That way we wouldn't need
an additional error to happen to stop processing, and notes would still
happen after the final error.  There could also be one just before we
cleanup the global_dc, so that the user gets the message there, if they
haven't gotten it yet.


that would be possible, but seems over engineered to me. The patch I posted is 
clearly an improvement in the user interface.


nathan


[PATCH] For -gdwarf-5 emit DW_OP_{implicit_pointer,entry_value,*_type,convert,reinterpret}

2016-10-14 Thread Jakub Jelinek
Hi!

Another set of GNU extensions that were accepted into DWARF5, so we should
emit them even for -gstrict-dwarf -gdwarf-5, and for -gdwarf-5 should use
the accepted standard opcodes instead of the corresponding GNU ones.

Bootstrapped/regtested on x86_64-linux and i686-linux on top of the
dwarf2.{def,h} patch, ok for trunk?

2016-10-14  Jakub Jelinek  

* dwarf2out.c (dwarf_op): New function.
(size_of_loc_descr): Handle DW_OP_{implicit_pointer,entry_value},
DW_OP_{const,regval,deref}_type and DW_OP_{convert,reinterpret}.
(output_loc_operands, output_loc_operands_raw): Likewise.
(resolve_args_picking_1, prune_unused_types_walk_loc_descr,
mark_base_types, hash_loc_operands, compare_loc_operands): Likewise.
(resolve_addr_in_expr): Likewise.  Only punt for !dwarf_strict
if dwarf_version < 5.
(convert_descriptor_to_mode): Use dwarf_op (DW_OP_xxx) instead of
DW_OP_GNU_xxx.
(scompare_loc_descriptor, ucompare_loc_descriptor,
minmax_loc_descriptor, typed_binop, mem_loc_descriptor,
implicit_ptr_descriptor, optimize_one_addr_into_implicit_ptr,
optimize_location_into_implicit_ptr): Likewise.  Only punt for
!dwarf_strict if dwarf_version < 5.
(string_cst_pool_decl): Adjust comment.
(non_dwarf_expression): Handle DW_OP_implicit_pointer.

--- gcc/dwarf2out.c.jj  2016-10-14 18:09:22.0 +0200
+++ gcc/dwarf2out.c 2016-10-14 19:02:00.291876232 +0200
@@ -1514,6 +1514,54 @@ loc_list_plus_const (dw_loc_list_ref lis
 #define DWARF_REF_SIZE \
   (dwarf_version == 2 ? DWARF2_ADDR_SIZE : DWARF_OFFSET_SIZE)
 
+/* Utility inline function for construction of ops that were GNU extension
+   before DWARF 5.  */
+static inline enum dwarf_location_atom
+dwarf_op (enum dwarf_location_atom op)
+{
+  switch (op)
+{
+case DW_OP_implicit_pointer:
+  if (dwarf_version < 5)
+   return DW_OP_GNU_implicit_pointer;
+  break;
+
+case DW_OP_entry_value:
+  if (dwarf_version < 5)
+   return DW_OP_GNU_entry_value;
+  break;
+
+case DW_OP_const_type:
+  if (dwarf_version < 5)
+   return DW_OP_GNU_const_type;
+  break;
+
+case DW_OP_regval_type:
+  if (dwarf_version < 5)
+   return DW_OP_GNU_regval_type;
+  break;
+
+case DW_OP_deref_type:
+  if (dwarf_version < 5)
+   return DW_OP_GNU_deref_type;
+  break;
+
+case DW_OP_convert:
+  if (dwarf_version < 5)
+   return DW_OP_GNU_convert;
+  break;
+
+case DW_OP_reinterpret:
+  if (dwarf_version < 5)
+   return DW_OP_GNU_reinterpret;
+  break;
+
+default:
+  break;
+}
+  return op;
+}
+
 static unsigned long int get_base_type_offset (dw_die_ref);
 
 /* Return the size of a location descriptor.  */
@@ -1633,15 +1681,18 @@ size_of_loc_descr (dw_loc_descr_ref loc)
   size += size_of_uleb128 (loc->dw_loc_oprnd1.v.val_unsigned)
  + loc->dw_loc_oprnd1.v.val_unsigned;
   break;
+case DW_OP_implicit_pointer:
 case DW_OP_GNU_implicit_pointer:
   size += DWARF_REF_SIZE + size_of_sleb128 (loc->dw_loc_oprnd2.v.val_int);
   break;
+case DW_OP_entry_value:
 case DW_OP_GNU_entry_value:
   {
unsigned long op_size = size_of_locs (loc->dw_loc_oprnd1.v.val_loc);
size += size_of_uleb128 (op_size) + op_size;
break;
   }
+case DW_OP_const_type:
 case DW_OP_GNU_const_type:
   {
unsigned long o
@@ -1668,6 +1719,7 @@ size_of_loc_descr (dw_loc_descr_ref loc)
  }
break;
   }
+case DW_OP_regval_type:
 case DW_OP_GNU_regval_type:
   {
unsigned long o
@@ -1676,6 +1728,7 @@ size_of_loc_descr (dw_loc_descr_ref loc)
+ size_of_uleb128 (o);
   }
   break;
+case DW_OP_deref_type:
 case DW_OP_GNU_deref_type:
   {
unsigned long o
@@ -1683,6 +1736,8 @@ size_of_loc_descr (dw_loc_descr_ref loc)
size += 1 + size_of_uleb128 (o);
   }
   break;
+case DW_OP_convert:
+case DW_OP_reinterpret:
 case DW_OP_GNU_convert:
 case DW_OP_GNU_reinterpret:
   if (loc->dw_loc_oprnd1.val_class == dw_val_class_unsigned_const)
@@ -2043,6 +2098,7 @@ output_loc_operands (dw_loc_descr_ref lo
   }
   break;
 
+case DW_OP_implicit_pointer:
 case DW_OP_GNU_implicit_pointer:
   {
char label[MAX_ARTIFICIAL_LABEL_BYTES
@@ -2054,11 +2110,13 @@ output_loc_operands (dw_loc_descr_ref lo
   }
   break;
 
+case DW_OP_entry_value:
 case DW_OP_GNU_entry_value:
   dw2_asm_output_data_uleb128 (size_of_locs (val1->v.val_loc), NULL);
   output_loc_sequence (val1->v.val_loc, for_eh_or_skip);
   break;
 
+case DW_OP_const_type:
 case DW_OP_GNU_const_type:
   {
unsigned long o = get_base_type_offset (val1->v.val_die_ref.die), l;
@@ -2132,6 +2190,7 @@ output_loc_operands (dw_loc_descr_ref lo
  }
   }
 

Re: [patch, libstdc++] std::shuffle: Generate two swap positions at a time if possible

2016-10-14 Thread Jonathan Wakely

On 02/09/16 20:53 +0200, Eelis wrote:

On 2016-09-02 20:20, Eelis van der Weegen wrote:

On 2016-08-31 14:45, Jonathan Wakely wrote:

Is this significantly faster than just using
uniform_int_distribution<_IntType>{0, __bound - 1}(__g) so we don't
need to duplicate the logic? (And people maintaining the code won't
reconvince themselves it's correct every time they look at it :-)

[..]

Could we hoist this test out of the loop somehow?

If we change the loop condition to be __i+1 < __last we don't need to
test it on every iteration, and then after the loop we can just do
the final swap if (__urange % 2).


Reusing std::uniform_int_distribution seems just as fast, so I've removed 
__generate_random_index_below.

I've hoisted the (__i + 1 == __last) check out of the loop (in a slightly 
different way), and it seems to shave off a couple more cycles, yay!

Updated patch attached.



Please ignore that patch, I used __g()&1 but that's invalid (the new 
"UniformRandomBitGenerator" name is misleading).

Updated patch (which uses a proper distribution even for the [0,1] case) 
attached.



I've finally got round to committing this patch to trunk.

Thanks for your patience, and sorry for the delay!

commit 1ddf9566764a85da4826628098b352bd30ba2bbc
Author: Jonathan Wakely 
Date:   Fri Oct 14 20:27:13 2016 +0100

Optimize std::shuffle by using generator to get two values at once

2016-10-14  Eelis van der Weegen  

	* include/bits/stl_algo.h (shuffle): Extract two random numbers from
	each generator invocation when its range is large enough.

diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index 0538a79..db99cb8 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -3772,6 +3772,47 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef typename std::make_unsigned<_DistanceType>::type __ud_type;
   typedef typename std::uniform_int_distribution<__ud_type> __distr_type;
   typedef typename __distr_type::param_type __p_type;
+
+  typedef typename std::remove_reference<_UniformRandomNumberGenerator>::type _Gen;
+  typedef typename std::common_type<typename _Gen::result_type, __ud_type>::type __uc_type;
+
+  const __uc_type __urngrange = __g.max() - __g.min();
+  const __uc_type __urange = __uc_type(__last - __first);
+
+  if (__urngrange / __urange >= __urange)
+// I.e. (__urngrange >= __urange * __urange) but without wrap issues.
+  {
+	_RandomAccessIterator __i = __first + 1;
+
+	// Since we know the range isn't empty, an even number of elements
+	// means an uneven number of elements /to swap/, in which case we
+	// do the first one up front:
+
+	if ((__urange % 2) == 0)
+	{
+	  __distr_type __d{0, 1};
+	  std::iter_swap(__i++, __first + __d(__g));
+	}
+
+	// Now we know that __last - __i is even, so we do the rest in pairs,
+	// using a single distribution invocation to produce swap positions
+	// for two successive elements at a time:
+
+	while (__i != __last)
+	{
+	  const __uc_type __swap_range = __uc_type(__i - __first) + 1;
+	  const __uc_type __comp_range = __swap_range * (__swap_range + 1);
+
+	  std::uniform_int_distribution<__uc_type> __d{0, __comp_range - 1};
+	  const __uc_type __pospos = __d(__g);
+
+	  std::iter_swap(__i++, __first + (__pospos % __swap_range));
+	  std::iter_swap(__i++, __first + (__pospos / __swap_range));
+	}
+
+	return;
+  }
+
   __distr_type __d;
 
   for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
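
For readers who want to try the pairing trick outside libstdc++, here is a minimal standalone sketch of the loop added above. It is not the committed code: the name pairwise_shuffle is made up for illustration, it fixes the working type to std::uint64_t, and it assumes the generator range is large enough, so it omits the (__urngrange / __urange >= __urange) guard and the fallback path that the real patch keeps.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <random>

// Hypothetical helper (not part of libstdc++): shuffles [first, last) and
// draws one random number per *pair* of elements, as in the patch above.
// Assumption: n * n fits comfortably in 64 bits, so no range guard is done.
template <typename RandomIt, typename Urbg>
void pairwise_shuffle(RandomIt first, RandomIt last, Urbg&& g)
{
  typedef typename std::iterator_traits<RandomIt>::difference_type diff_t;
  typedef std::uint64_t uc_type;

  if (first == last)
    return;

  const uc_type n = static_cast<uc_type>(last - first);
  RandomIt i = first + 1;

  // An even element count leaves an odd number of positions to swap,
  // so handle the first one on its own.
  if (n % 2 == 0)
    {
      std::uniform_int_distribution<uc_type> d(0, 1);
      std::iter_swap(i++, first + static_cast<diff_t>(d(g)));
    }

  // last - i is now even: one draw yields two swap positions.
  while (i != last)
    {
      const uc_type swap_range = static_cast<uc_type>(i - first) + 1;
      const uc_type comp_range = swap_range * (swap_range + 1);

      std::uniform_int_distribution<uc_type> d(0, comp_range - 1);
      const uc_type pospos = d(g);

      // pospos % swap_range lies in [0, swap_range - 1],
      // pospos / swap_range lies in [0, swap_range].
      std::iter_swap(i++, first + static_cast<diff_t>(pospos % swap_range));
      std::iter_swap(i++, first + static_cast<diff_t>(pospos / swap_range));
    }
}
```

Calling pairwise_shuffle(v.begin(), v.end(), gen) with a std::mt19937 gen should behave like std::shuffle, just with roughly half as many distribution invocations.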


[PATCH] (v2) Tweaks to print_rtx_function

2016-10-14 Thread David Malcolm
On Thu, 2016-10-13 at 16:18 +0200, Bernd Schmidt wrote:
> On 10/13/2016 04:08 PM, David Malcolm wrote:
> > I thought it might be useful to brainstorm [1] some ideas on this,
> > so here are various possible ways it could be printed for this
> > use-case:
> > 
> > * Offset by LAST_VIRTUAL_REGISTER + 1 (as in the patch), and
> > printed
> > just as a number, giving:
> > 
> >   (reg:SI 3)
> 
> Unambiguous in the compact format, nice low register numbers, but
> some
> potential for confusion with hard regs based on what people are used
> to.
> 
> > * Prefixed by a "sigil" character:
> 
> >   (reg:SI %3)
> 
> Avoids the confusion issue and shouldn't overlap with hard register
> names. I think this is the one I prefer, followed by plain (reg:SI
> 3).
> 
> >   (reg:SI P3)
> 
> Can't use this, as there are machines with P3 registers.
> 
> > * Prefixed so it looks like a register name:
> > 
> >   (reg:SI pseudo-3)
> >   (reg:SI pseudo_3)
> >   (reg:SI pseudo+3)
> 
> Not too different from just a "%" prefix and probably too verbose.
> 
> > Looking at print_rtx_operand_code_r there are also things like
> > ORIGINAL_REGNO, REG_EXPR and REG_OFFSET which get printed after the
> > main regno, e.g.:
> 
> >   (reg:SI 1 [  ])
> 
> That's the REG_EXPR here presumably? The interesting part comes when
> parsing this.

Indeed, but that seems like an issue for another patch...

Here's an updated version of the patch, which uses the '%' sigil for
non-virtual pseudos.
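
As a rough illustration of the numbering scheme (not the code in print-rtl.c), the sketch below maps a raw pseudo register number to its compact "%N" spelling. The function name and the constant are hypothetical; the value 86 is only inferred from the x86_64 example in the patch, where (reg:SI 87) is dumped as "%0", and the real LAST_VIRTUAL_REGISTER is target-dependent.

```cpp
#include <string>

// Assumption: stands in for GCC's LAST_VIRTUAL_REGISTER; 86 is inferred
// from the example dump in this patch (87 -> "%0"), not from GCC headers.
constexpr unsigned kLastVirtualRegister = 86;

// Hypothetical helper: compact name of a non-virtual pseudo.  The caller
// is assumed to pass only regnos above kLastVirtualRegister.
std::string compact_pseudo_name(unsigned regno)
{
  return "%" + std::to_string(regno - (kLastVirtualRegister + 1));
}
```

Under that assumption, 87, 88 and 89 come out as "%0", "%1" and "%2", matching the updated comment in print_rtx_function.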

Successfully bootstrapped on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
* print-rtl-function.c (print_edge): Omit "(flags)" when none are
set.
(print_rtx_function): Update example in comment for...
* print-rtl.c (print_rtx_operand_code_r): In compact mode, print
non-virtual pseudos with a '%' sigil followed by the regno, offset
by (LAST_VIRTUAL_REGISTER + 1), so that the first non-virtual
pseudo is dumped as "%0".
---
 gcc/print-rtl-function.c | 29 ++---
 gcc/print-rtl.c  |  8 
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/gcc/print-rtl-function.c b/gcc/print-rtl-function.c
index 2abae84..f46304b 100644
--- a/gcc/print-rtl-function.c
+++ b/gcc/print-rtl-function.c
@@ -60,9 +60,11 @@ print_edge (FILE *outfile, edge e, bool from)
 
   /* Express edge flags as a string with " | " separator.
  e.g. (flags "FALLTHRU | DFS_BACK").  */
-  fprintf (outfile, " (flags \"");
-  bool seen_flag = false;
-#define DEF_EDGE_FLAG(NAME,IDX) \
+  if (e->flags)
+{
+  fprintf (outfile, " (flags \"");
+  bool seen_flag = false;
+#define DEF_EDGE_FLAG(NAME,IDX)\
   do { \
 if (e->flags & EDGE_##NAME)\
   {\
@@ -75,7 +77,10 @@ print_edge (FILE *outfile, edge e, bool from)
 #include "cfg-flags.def"
 #undef DEF_EDGE_FLAG
 
-  fprintf (outfile, "\"))\n");
+  fprintf (outfile, "\")");
+}
+
+  fprintf (outfile, ")\n");
 }
 
 /* If BB is non-NULL, print the start of a "(block)" directive for it
@@ -132,7 +137,9 @@ can_have_basic_block_p (const rtx_insn *insn)
If COMPACT, then instructions are printed in a compact form:
- INSN_UIDs are omitted, except for jumps and CODE_LABELs,
- INSN_CODEs are omitted,
-   - register numbers are omitted for hard and virtual regs
+   - register numbers are omitted for hard and virtual regs, and
+ non-virtual pseudos are offset relative to the first such reg, and
+ printed with a '%' sigil e.g. "%0" for (LAST_VIRTUAL_REGISTER + 1),
- insn names are prefixed with "c" (e.g. "cinsn", "cnote", etc)
 
Example output (with COMPACT==true):
@@ -148,13 +155,13 @@ can_have_basic_block_p (const rtx_insn *insn)
   (reg:SI di [ i ])) "t.c":2
   (nil))
 (cnote NOTE_INSN_FUNCTION_BEG)
-(cinsn (set (reg:SI 89)
+(cinsn (set (reg:SI %2)
   (mem/c:SI (plus:DI (reg/f:DI virtual-stack-vars)
   (const_int -4)) [1 i+0 S4 A32])) "t.c":3
   (nil))
 (cinsn (parallel [
-  (set (reg:SI 87 [ _2 ])
-  (ashift:SI (reg:SI 89)
+  (set (reg:SI %0 [ _2 ])
+  (ashift:SI (reg:SI %2)
   (const_int 1)))
   (clobber (reg:CC flags))
   ]) "t.c":3
@@ -162,11 +169,11 @@ can_have_basic_block_p (const rtx_insn *insn)
   (const_int -4)) [1 i+0 S4 A32])
   (const_int 1))
   (nil)))
-(cinsn (set (reg:SI 88 [  ])
-  (reg:SI 87 [ _2 ])) "t.c":3
+(cinsn (set (reg:SI %1 [  ])
+  (reg:SI %0 [ _2 ])) "t.c":3
   (nil))
 (cinsn (set 

[PATCH] Resolve ambiguities in std::experimental::sample test

2016-10-14 Thread Jonathan Wakely

With --target_board=unix/-std=gnu++17 this test fails, because
std::sample is defined too.

* testsuite/experimental/algorithm/sample.cc: Qualify calls to
resolve ambiguity between std::sample and std::experimental::sample.

Tested x86_64-linux, committed to trunk.
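
The ambiguity can be reproduced without the library. In the sketch below all names (lib, pick, rng) are made up: lib plays the role of std, lib::experimental the role of std::experimental, and rng the role of std::mt19937. The using-declaration makes the experimental overload visible, while argument-dependent lookup on the rng argument also finds lib::pick, so an unqualified call has two equally good candidates.

```cpp
namespace lib {
  struct rng { };

  // Plays the role of std::sample (visible once C++17 is enabled).
  inline int pick(int, rng&) { return 1; }

  namespace experimental {
    // Plays the role of std::experimental::sample.
    inline int pick(int, rng&) { return 2; }
  }
}

using lib::experimental::pick;

inline int call_qualified()
{
  lib::rng r;
  // An unqualified pick(0, r) would be ambiguous here: the using-declaration
  // supplies lib::experimental::pick and ADL on lib::rng supplies lib::pick.
  return lib::experimental::pick(0, r);
}
```

Qualifying the call, as the patch does, suppresses ADL and selects the experimental overload only.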


commit 8129c87b2dcd834a8a4d3d165f60e6fb295a4d79
Author: Jonathan Wakely 
Date:   Fri Oct 14 20:33:38 2016 +0100

Resolve ambiguities in std::experimental::sample test

* testsuite/experimental/algorithm/sample.cc: Qualify calls to
resolve ambiguity between std::sample and std::experimental::sample.

diff --git a/libstdc++-v3/testsuite/experimental/algorithm/sample.cc 
b/libstdc++-v3/testsuite/experimental/algorithm/sample.cc
index 16e6a74..e3c25e8 100644
--- a/libstdc++-v3/testsuite/experimental/algorithm/sample.cc
+++ b/libstdc++-v3/testsuite/experimental/algorithm/sample.cc
@@ -28,7 +28,6 @@
 
 std::mt19937 rng;
 
-using std::experimental::sample;
 using std::istream_iterator;
 using std::ostream_iterator;
 
@@ -39,7 +38,7 @@ test01()
   int samp[10] = { };
 
   // population smaller than desired sample size
-  auto it = sample(pop, pop + 2, samp, 10, rng);
+  auto it = std::experimental::sample(pop, pop + 2, samp, 10, rng);
   VERIFY( it == samp + 2 );
   VERIFY( std::accumulate(samp, samp + 10, 0) == 3 );
 }
@@ -50,7 +49,7 @@ test02()
   const int pop[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
   int samp[10] = { };
 
-  auto it = sample(pop, std::end(pop), samp, 10, rng);
+  auto it = std::experimental::sample(pop, std::end(pop), samp, 10, rng);
   VERIFY( it == samp + 10 );
 
   std::sort(samp, it);
@@ -65,7 +64,9 @@ test03()
   int samp[5] = { };
 
   // input iterator for population
-  auto it = sample(istream_iterator<int>{pop}, {}, samp, 5, rng);
+  auto it = std::experimental::sample(istream_iterator<int>{pop}, {},
+  samp,
+  5, rng);
   VERIFY( it == samp + 5 );
 
   std::sort(samp, it);
@@ -80,7 +81,9 @@ test04()
   std::stringstream samp;
 
   // forward iterator for population and output iterator for result
-  sample(pop.begin(), pop.end(), ostream_iterator<int>{samp, " "}, 5, rng);
+  std::experimental::sample(pop.begin(), pop.end(),
+ostream_iterator<int>{samp, " "},
+5, rng);
 
   // samp.rdbuf()->pubseekoff(0, std::ios::beg);
   std::vector<int> v(istream_iterator<int>{samp}, {});


[PATCH] For -gdwarf-5 emit DWARF5 .debug_macro

2016-10-14 Thread Jakub Jelinek
Hi!

The GNU DW_AT_GNU_macros and .debug_macro* extensions were added to DWARF5,
so we should emit those even in -gstrict-dwarf -gdwarf-5 mode.

Bootstrapped/regtested on x86_64-linux and i686-linux on top of the
dwarf2.{def,h} patch, ok for trunk?
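
For orientation, here is a small sketch of the renamed opcodes and of the define/undef selection that output_macinfo_op performs. It is not taken from dwarf2.def: the enum name and the helper to_strp_form are made up, and the numeric values are written from the DWARF 5 macro opcode table, so they should be checked against dwarf2.def before being relied on. The old DW_MACRO_GNU_define_indirect, DW_MACRO_GNU_undef_indirect and DW_MACRO_GNU_transparent_include are assumed to share the values of the three new names.

```cpp
#include <cstdint>

// Assumed opcode values (DWARF 5 macro information entries).
enum dwarf_macro_op : std::uint8_t {
  DW_MACRO_define      = 0x01,
  DW_MACRO_undef       = 0x02,
  DW_MACRO_define_strp = 0x05,  // formerly DW_MACRO_GNU_define_indirect
  DW_MACRO_undef_strp  = 0x06,  // formerly DW_MACRO_GNU_undef_indirect
  DW_MACRO_import      = 0x07   // formerly DW_MACRO_GNU_transparent_include
};

// Hypothetical helper mirroring the ternary in output_macinfo_op: a define
// becomes define_strp, anything else is treated as an undef.
inline dwarf_macro_op to_strp_form(dwarf_macro_op op)
{
  return op == DW_MACRO_define ? DW_MACRO_define_strp : DW_MACRO_undef_strp;
}
```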

2016-10-14  Jakub Jelinek  

* dwarf2out.c (dwarf2out_define, dwarf2out_undef, output_macinfo_op,
optimize_macinfo_range, save_macinfo_strings): Replace
DW_MACRO_GNU_* constants with corresponding DW_MACRO_* constants.
(output_macinfo): Likewise.  Emit .debug_macro* rather than
.debug_macinfo* even for -gstrict-dwarf -gdwarf-5.
(init_sections_and_labels): Use .debug_macro* labels rather than
.debug_macinfo* labels even for -gstrict-dwarf -gdwarf-5.
(dwarf2out_finish): Use DW_AT_macros instead of DW_AT_macro_info
or DW_AT_GNU_macros for -gdwarf-5.

--- gcc/dwarf2out.c.jj  2016-10-14 16:39:12.0 +0200
+++ gcc/dwarf2out.c 2016-10-14 17:59:32.304773877 +0200
@@ -25213,7 +25213,7 @@ dwarf2out_define (unsigned int lineno AT
 {
   macinfo_entry e;
   /* Insert a dummy first entry to be able to optimize the whole
-predefined macro block using DW_MACRO_GNU_transparent_include.  */
+predefined macro block using DW_MACRO_import.  */
   if (macinfo_table->is_empty () && lineno <= 1)
{
  e.code = 0;
@@ -25240,7 +25240,7 @@ dwarf2out_undef (unsigned int lineno ATT
 {
   macinfo_entry e;
   /* Insert a dummy first entry to be able to optimize the whole
-predefined macro block using DW_MACRO_GNU_transparent_include.  */
+predefined macro block using DW_MACRO_import.  */
   if (macinfo_table->is_empty () && lineno <= 1)
{
  e.code = 0;
@@ -25312,8 +25312,7 @@ output_macinfo_op (macinfo_entry *ref)
  && (debug_str_section->common.flags & SECTION_MERGE) != 0)
{
  ref->code = ref->code == DW_MACINFO_define
- ? DW_MACRO_GNU_define_indirect
- : DW_MACRO_GNU_undef_indirect;
+ ? DW_MACRO_define_strp : DW_MACRO_undef_strp;
  output_macinfo_op (ref);
  return;
}
@@ -25324,16 +25323,16 @@ output_macinfo_op (macinfo_entry *ref)
   (unsigned long) ref->lineno);
   dw2_asm_output_nstring (ref->info, -1, "The macro");
   break;
-case DW_MACRO_GNU_define_indirect:
-case DW_MACRO_GNU_undef_indirect:
+case DW_MACRO_define_strp:
+case DW_MACRO_undef_strp:
   node = find_AT_string (ref->info);
   gcc_assert (node
-  && ((node->form == DW_FORM_strp)
-  || (node->form == DW_FORM_GNU_str_index)));
+ && (node->form == DW_FORM_strp
+ || node->form == DW_FORM_GNU_str_index));
   dw2_asm_output_data (1, ref->code,
-  ref->code == DW_MACRO_GNU_define_indirect
-  ? "Define macro indirect"
-  : "Undefine macro indirect");
+  ref->code == DW_MACRO_define_strp
+  ? "Define macro strp"
+  : "Undefine macro strp");
   dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
   (unsigned long) ref->lineno);
   if (node->form == DW_FORM_strp)
@@ -25344,8 +25343,8 @@ output_macinfo_op (macinfo_entry *ref)
 dw2_asm_output_data_uleb128 (node->index, "The macro: \"%s\"",
  ref->info);
   break;
-case DW_MACRO_GNU_transparent_include:
-  dw2_asm_output_data (1, ref->code, "Transparent include");
+case DW_MACRO_import:
+  dw2_asm_output_data (1, ref->code, "Import");
   ASM_GENERATE_INTERNAL_LABEL (label,
   DEBUG_MACRO_SECTION_LABEL, ref->lineno);
   dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL);
@@ -25361,7 +25360,7 @@ output_macinfo_op (macinfo_entry *ref)
other compilation unit .debug_macinfo sections.  IDX is the first
index of a define/undef, return the number of ops that should be
emitted in a comdat .debug_macinfo section and emit
-   a DW_MACRO_GNU_transparent_include entry referencing it.
+   a DW_MACRO_import entry referencing it.
If the define/undef entry should be emitted normally, return 0.  */
 
 static unsigned
@@ -25447,10 +25446,10 @@ optimize_macinfo_range (unsigned int idx
   for (i = 0; i < 16; i++)
 sprintf (tail + i * 2, "%02x", checksum[i] & 0xff);
 
-  /* Construct a macinfo_entry for DW_MACRO_GNU_transparent_include
+  /* Construct a macinfo_entry for DW_MACRO_import
  in the empty vector entry before the first define/undef.  */
   inc = &(*macinfo_table)[idx - 1];
-  inc->code = DW_MACRO_GNU_transparent_include;
+  inc->code = DW_MACRO_import;
   inc->lineno = 0;
   inc->info = ggc_strdup (grp_name);
   if (!*macinfo_htab)
@@ 

[PATCH, rs6000] pr65479 Add option to fix failing asan test cases.

2016-10-14 Thread Bill Seurer
[PATCH, rs6000] pr65479 Add option to fix failing asan test cases.

This patch adds the -fasynchronous-unwind-tables option to several of the asan
test cases.  The option causes a full stack trace to be produced when the
sanitizer detects an error.  Without the full trace the 3 test cases fail.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65479 for more information.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu,
powerpc64be-unknown-linux-gnu, and x86_64-pc-linux-gnu with no regressions.
Is this ok for trunk?

[gcc/testsuite]

2016-10-14  Bill Seurer  

* c-c++-common/asan/misalign-1.c: Add option for powerpc.
* c-c++-common/asan/misalign-2.c: Add option for powerpc.
* c-c++-common/asan/null-deref-1.c: Add option for powerpc.

Index: gcc/testsuite/c-c++-common/asan/misalign-1.c
===
--- gcc/testsuite/c-c++-common/asan/misalign-1.c(revision 241174)
+++ gcc/testsuite/c-c++-common/asan/misalign-1.c(working copy)
@@ -1,6 +1,7 @@
 /* { dg-do run { target { ilp32 || lp64 } } } */
 /* { dg-options "-O2" } */
 /* { dg-additional-options "-fno-omit-frame-pointer" { target *-*-darwin* } } */
+/* { dg-additional-options "-fasynchronous-unwind-tables" { target { powerpc*-*-linux* } } } */
 /* { dg-shouldfail "asan" } */
 
 struct S { int i; } __attribute__ ((packed));
@@ -39,5 +40,5 @@ main ()
 /* { dg-output "ERROR: AddressSanitizer:\[^\n\r]*on address\[^\n\r]*" } */
 /* { dg-output "0x\[0-9a-f\]+ at pc 0x\[0-9a-f\]+ bp 0x\[0-9a-f\]+ sp 0x\[0-9a-f\]+\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*READ of size 4 at 0x\[0-9a-f\]+ thread T0\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*foo(\[^\n\r]*misalign-1.c:1\[01]|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*misalign-1.c:3\[45]|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
+/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*foo(\[^\n\r]*misalign-1.c:1\[12]|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*misalign-1.c:3\[56]|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
Index: gcc/testsuite/c-c++-common/asan/misalign-2.c
===
--- gcc/testsuite/c-c++-common/asan/misalign-2.c(revision 241174)
+++ gcc/testsuite/c-c++-common/asan/misalign-2.c(working copy)
@@ -1,6 +1,7 @@
 /* { dg-do run { target { ilp32 || lp64 } } } */
 /* { dg-options "-O2" } */
 /* { dg-additional-options "-fno-omit-frame-pointer" { target *-*-darwin* } } */
+/* { dg-additional-options "-fasynchronous-unwind-tables" { target { powerpc*-*-linux* } } } */
 /* { dg-shouldfail "asan" } */
 
 struct S { int i; } __attribute__ ((packed));
@@ -39,5 +40,5 @@ main ()
 /* { dg-output "ERROR: AddressSanitizer:\[^\n\r]*on address\[^\n\r]*" } */
 /* { dg-output "0x\[0-9a-f\]+ at pc 0x\[0-9a-f\]+ bp 0x\[0-9a-f\]+ sp 0x\[0-9a-f\]+\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*READ of size 4 at 0x\[0-9a-f\]+ thread T0\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*baz(\[^\n\r]*misalign-2.c:2\[23]|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*misalign-2.c:3\[45]|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
+/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*baz(\[^\n\r]*misalign-2.c:2\[34]|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*misalign-2.c:3\[56]|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
Index: gcc/testsuite/c-c++-common/asan/null-deref-1.c
===
--- gcc/testsuite/c-c++-common/asan/null-deref-1.c  (revision 241174)
+++ gcc/testsuite/c-c++-common/asan/null-deref-1.c  (working copy)
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-options "-fno-omit-frame-pointer -fno-shrink-wrap" } */
 /* { dg-additional-options "-mno-omit-leaf-frame-pointer" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-additional-options "-fasynchronous-unwind-tables" { target { powerpc*-*-linux* } } } */
 /* { dg-shouldfail "asan" } */
 
 __attribute__((noinline, noclone))
@@ -18,5 +19,5 @@ int main()
 
 /* { dg-output "ERROR: AddressSanitizer:? SEGV on unknown address\[^\n\r]*" } */
 /* { dg-output "0x\[0-9a-f\]+ \[^\n\r]*pc 0x\[0-9a-f\]+\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*#0 0x\[0-9a-f\]+ +(in \[^\n\r]*NullDeref\[^\n\r]* (\[^\n\r]*null-deref-1.c:10|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*null-deref-1.c:15|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*#0 0x\[0-9a-f\]+ +(in \[^\n\r]*NullDeref\[^\n\r]* (\[^\n\r]*null-deref-1.c:11|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main 

[Bug middle-end/77959] ICE in ix86_decompose_address, at i386/i386.c:14954

2016-10-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77959

--- Comment #10 from Jakub Jelinek  ---
Author: jakub
Date: Fri Oct 14 19:36:58 2016
New Revision: 241182

URL: https://gcc.gnu.org/viewcvs?rev=241182&root=gcc&view=rev
Log:
PR middle-end/77959
* expr.c (expand_expr_real_1) <case CONST_DECL>: For EXPAND_WRITE
return a MEM.

* gfortran.dg/pr77959.f90: New test.

Added:
trunk/gcc/testsuite/gfortran.dg/pr77959.f90
Modified:
trunk/gcc/ChangeLog
trunk/gcc/expr.c
trunk/gcc/testsuite/ChangeLog

[Bug libstdc++/77990] New: unique_ptr<T, D>::unique_ptr(T*) imposes CopyConstructible on the deleter

2016-10-14 Thread evansr at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77990

Bug ID: 77990
   Summary: unique_ptr<T, D>::unique_ptr(T*) imposes
CopyConstructible on the deleter
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: evansr at google dot com
CC: brooks at gcc dot gnu.org
  Target Milestone: ---

Created attachment 39815
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39815&action=edit
test case

C++11 spec [unique.ptr.single.ctor] says:

For unique_ptr<T, D> the ctor overload:

  explicit unique_ptr(pointer p);

Requires: D shall satisfy DefaultConstructible.

The current libstdc++ implementation imposes CopyConstructible because it
constructs the _M_t data member tuple as _M_t(__p, deleter_type()).

See:
https://gcc.gnu.org/viewcvs/gcc/trunk/libstdc%2B%2B-v3/include/bits/unique_ptr.h?revision=237531&view=markup#l171
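
The attached test case is not reproduced in this message. A minimal sketch of the kind of code affected might look like the following; the deleter name and the helper function are made up for illustration. The deleter is DefaultConstructible but not CopyConstructible, so on an implementation that follows the quoted requirement the constructor call should compile, whereas an implementation that builds the member as _M_t(__p, deleter_type()) would be expected to reject it.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical deleter: default-constructible, but neither copyable
// nor movable (declaring the copy operations suppresses the moves).
struct NonCopyableDeleter {
  NonCopyableDeleter() = default;
  NonCopyableDeleter(const NonCopyableDeleter&) = delete;
  NonCopyableDeleter& operator=(const NonCopyableDeleter&) = delete;

  void operator()(int* p) const { delete p; }
};

static_assert(std::is_default_constructible<NonCopyableDeleter>::value,
              "deleter must be DefaultConstructible");
static_assert(!std::is_copy_constructible<NonCopyableDeleter>::value,
              "deleter must not be CopyConstructible");

inline int make_and_read()
{
  // Only the unique_ptr(pointer) constructor is used here.
  std::unique_ptr<int, NonCopyableDeleter> p(new int(42));
  return *p;
}
```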

Re: [PATCH] Add "__RTL" to cc1 (v2)

2016-10-14 Thread David Malcolm
On Fri, 2016-10-14 at 21:27 +0200, Bernd Schmidt wrote:
> On 10/14/2016 09:25 PM, David Malcolm wrote:
> > 
> > The behavior probably should be that it runs the remainder of the
> > RTL
> > passes from some specified point, and generates valid assembler (so
> > that we can have dg-do DejaGnu tests).
> 
> Actually I had imagined that tests would specify before and after RTL
> so 
> that we verify that the pass we're testing does what it's supposed to
> do.

Note that this approach would allow for:

  /* { dg-final { scan-rtl-dump "SOMETHING" "PASS OF INTEREST" } } */

directives in the .c file, so it would support specifying the "after
RTL" to some extent.


Re: [PATCH] Fix expansion ICE on store to CONST_DECL (PR middle-end/77959)

2016-10-14 Thread Richard Biener
On October 14, 2016 7:20:43 PM GMT+02:00, Jakub Jelinek  
wrote:
>Hi!
>
>The following (invalid) testcase ICEs, because we try to store into
>CONST_DECL's FIELD.  Normally in GIMPLE we have MEM_REF[] and
>writes to that expand gracefully into a MEM, but as soon as we use
>get_inner_reference in expand_assignment (even if the MEM is just
>reverse
>order, or we just want to store to a part of it etc.),
>get_inner_reference
>looks through even that MEM_REF.  Instead of hacking that around in
>expand_assignment, just attempting to handle EXPAND_WRITE into
>CONST_DECL
>looked easier to me.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

>2016-10-14  Jakub Jelinek  
>
>   PR middle-end/77959
>   * expr.c (expand_expr_real_1) <case CONST_DECL>: For EXPAND_WRITE
>   return a MEM.
>
>   * gfortran.dg/pr77959.f90: New test.
>
>--- gcc/expr.c.jj  2016-10-09 13:19:09.0 +0200
>+++ gcc/expr.c 2016-10-13 11:49:36.386993921 +0200
>@@ -9914,6 +9914,19 @@ expand_expr_real_1 (tree exp, rtx target
>   }
> 
> case CONST_DECL:
>+  if (modifier == EXPAND_WRITE)
>+  {
>+/* Writing into CONST_DECL is always invalid, but handle it
>+   gracefully.  */
>+addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
>+machine_mode address_mode = targetm.addr_space.address_mode (as);
>+op0 = expand_expr_addr_expr_1 (exp, NULL_RTX, address_mode,
>+   EXPAND_NORMAL, as);
>+op0 = memory_address_addr_space (mode, op0, as);
>+temp = gen_rtx_MEM (mode, op0);
>+set_mem_addr_space (temp, as);
>+return temp;
>+  }
>   return expand_expr (DECL_INITIAL (exp), target, VOIDmode, modifier);
> 
> case REAL_CST:
>--- gcc/testsuite/gfortran.dg/pr77959.f90.jj   2016-10-13
>11:57:30.019992471 +0200
>+++ gcc/testsuite/gfortran.dg/pr77959.f90  2016-10-13 11:58:50.719969914
>+0200
>@@ -0,0 +1,16 @@
>+! PR middle-end/77959
>+! { dg-do compile }
>+! { dg-options "-O2" }
>+
>+program pr77959
>+  interface
>+subroutine foo(x)  ! { dg-warning "Type mismatch in argument" }
>+  real :: x
>+end
>+  end interface
>+  call foo(1.0)
>+end
>+subroutine foo(x)
>+  complex :: x
>+  x = x + 1
>+end
>
>   Jakub




Re: [PATCH] Add "__RTL" to cc1 (v2)

2016-10-14 Thread Bernd Schmidt

On 10/14/2016 09:25 PM, David Malcolm wrote:


The behavior probably should be that it runs the remainder of the RTL
passes from some specified point, and generates valid assembler (so
that we can have dg-do DejaGnu tests).


Actually I had imagined that tests would specify before and after RTL so 
that we verify that the pass we're testing does what it's supposed to do.



Bernd



[Bug sanitizer/65479] sanitizer stack trace missing frames past #0 on powerpc64

2016-10-14 Thread seurer at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65479

Bill Seurer  changed:

   What|Removed |Added

 CC||seurer at linux dot 
vnet.ibm.com

--- Comment #18 from Bill Seurer  ---
With other fixes that have been made the only thing remaining to get the 3 test
cases working is the -fasynchronous-unwind-tables option.  I tried this on
powerpc64 LE and BE and it works fine.  I also verified it doesn't break on
x86.

c-c++-common/asan/misalign-1.c
c-c++-common/asan/misalign-2.c
c-c++-common/asan/null-deref-1.c


/* { dg-additional-options "-fasynchronous-unwind-tables" { target { powerpc*-*-linux* } } } */

(plus some tweaking of line numbers in expected outputs)

I'll be submitting a patch for this.

Re: [PATCH] Add "__RTL" to cc1 (v2)

2016-10-14 Thread David Malcolm
On Fri, 2016-10-14 at 11:33 +0200, Richard Biener wrote:
> On Thu, Oct 13, 2016 at 3:51 PM, Bernd Schmidt 
> wrote:
> > On 10/13/2016 03:49 PM, Richard Biener wrote:
> > > 
> > > Does it really run a single pass only?  Thus you can't do a
> > > { dg-do run } test with __RTL?
> > 
> > 
> > I think that's really not the intended use-case. To my mind this is
> > for
> > unit-testing: ensuring that a given rtl pass performs the expected
> > transformation on an input.
> 
> Ok, so at least for the GIMPLE FE side I thought it's useful to allow
> a correctness verification with something simpler than pattern
> matching
> on the pass output.  By means of doing runtime verification of an
> expected
> result (this necessarily includes running followup passes as we have
> to
> generate code).  I don't see why this shouldn't apply to __RTL -- it
> might
> be more difficult to get __RTL testcases to the point where they emit
> assembly of course.
> 
> OTOH the question then still is what's the default behavior if you do
> _not_
> specify a "single pass to run".

As noted elsewhere, the current behavior is that it merely parses the
function and ignores it - and that's a bug in the current
implementation.

The behavior probably should be that it runs the remainder of the RTL
passes from some specified point, and generates valid assembler (so
that we can have dg-do DejaGnu tests).


Re: [PATCH] Add "__RTL" to cc1 (v2)

2016-10-14 Thread David Malcolm
On Thu, 2016-10-13 at 15:49 +0200, Richard Biener wrote:
> On Fri, Oct 7, 2016 at 5:58 PM, David Malcolm 
> wrote:
> > On Wed, 2016-10-05 at 16:09 +, Joseph Myers wrote:
> > > On Wed, 5 Oct 2016, David Malcolm wrote:
> > > 
> > > > @@ -1752,6 +1759,35 @@ c_parser_declaration_or_fndef (c_parser
> > > > *parser, bool fndef_ok,
> > > >c_parser_skip_to_end_of_block_or_statement (parser);
> > > >return;
> > > >  }
> > > > +
> > > > +  if (c_parser_next_token_is (parser, CPP_KEYWORD))
> > > > +{
> > > > +  c_token *kw_token = c_parser_peek_token (parser);
> > > > +  if (kw_token->keyword == RID_RTL)
> > > 
> > > if (c_parser_next_token_is_keyword (parser, RID_RTL))
> > > 
> > > You're missing an update to the comment above this function to
> > > show
> > > what
> > > the new syntax is.
> > 
> > Thanks.  Here's an updated version of the patch which fixes that,
> > along with some other fixes:
> > * Use c_parser_next_token_is_keyword.
> > * Removed a stray "FIXME".
> > * Removed some debug code.
> > * Add more comments
> > * Fixed a typo in the ChangeLog ("__RID" -> "__RTL")
> > 
> > Blurb from original version:
> > 
> > This patch implements Richi's idea of having a custom __RTL marker
> > in C function definitions, to indicate that the body of the
> > function
> > is to be parsed as RTL, rather than C:
> > 
> > int __RTL test_fn_1 (int i)
> > {
> >  (function "times_two"
> >(insn-chain
> >  (note 1 0 4 (nil) NOTE_INSN_DELETED)
> >  ;; etc
> >) ;; insn-chain
> >(crtl
> >  (return_rtx
> >(reg/i:SI 0 ax)
> >  ) ;; return_rtx
> >) ;; crtl
> >   ) ;; function
> > }
> > 
> > This allows for decls and types to be declared in C, and to use
> > the function decl from the C frontend.
> > 
> > I added support for running a single pass by giving __RTL an
> > optional
> > parameter (the name of the pass).  For example:
> 
> So what's the default behavior?

Currently it loads the RTL, but doesn't manage to generate assembler
for it.  There are some state issues to be tracked down.

> > int __RTL ("rtl-dfinit") test_fn_2 (int i)
> > {
> >  (function "times_two"
> >(insn-chain
> >  (note 1 0 4 (nil) NOTE_INSN_DELETED)
> >  ;; etc
> >) ;; insn-chain
> >(crtl
> >  (return_rtx
> >(reg/i:SI 0 ax)
> >  ) ;; return_rtx
> >) ;; crtl
> >   ) ;; function
> > }
> 
> Does it really run a single pass only? 

Yes.

>  Thus you can't do a { dg-do run } test
> with __RTL?  

Currently not, but I think I can fix that.

> The GIMPLE FE has a __GIMPLE (starts-with: "pass") thing
> starting from a specific pass but going all the way to assembly
> output.

It strikes me that we might need that; we probably need some way to
identify what state the RTL is in.

> It looks like your run-one-rtl-pass thingy is directly invoked from
> the "frontend"
> rather than passing down everything to the middle-end?

Yes.  There are some nasty state issues here: the whole of the
RTL-handling in the backend seems to expect a single function, and various
other singleton state (e.g. "crtl" aka "x_rtl").  The RTL "frontend" is
populating that state directly, so I think I have to do one function at
a time, running any/all RTL passes as soon as each one is parsed.

[...]


[Bug c++/67182] Initialising map with disabled copy elision yields unexpected results.

2016-10-14 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67182

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
  Known to work||4.9.4, 5.4.0, 6.2.0
 Resolution|--- |FIXED
   Target Milestone|--- |5.4
  Known to fail||5.1.0, 5.3.0

--- Comment #1 from Jonathan Wakely  ---
This is fixed in GCC 5.4

Re: fix -fmax-errors & notes

2016-10-14 Thread David Malcolm
On Thu, 2016-10-13 at 06:48 -0400, Nathan Sidwell wrote:
> On 10/11/16 16:07, David Malcolm wrote:
> 
> > This logic is running when the next diagnostic is about to be
> > emitted.
> > But what if the user has selected -Wfatal-errors and there's a
> > single
> > error and no further diagnostics?  Could this change the observable
> > behavior?  (I'm trying to think of a case here, but failing).
> 
> 
> This version only moves the -fmax-errors handling.  I've addressed
> your testcase 
> comments too.  WDYT?

invoke.texi has this text for -fmax-errors:

"Limits the maximum number of error messages to @var{n}, at which point
GCC bails out rather than attempting to continue processing the source
code.  If @var{n} is 0 (the default), there is no limit on the number
of error messages produced.  If @option{-Wfatal-errors} is also
specified, then @option{-Wfatal-errors} takes precedence over this
option."

I'm not sure that the above would still be true after this patch.

How about splitting out the bail-out code into a separate function:

   diagnostic_handle_max_errors

or somesuch, and calling it before emitting a diagnostic (like in your
patch), and *also* at various key points in compilation - perhaps at
some of the places where we call seen_error?  That way we wouldn't need
an additional error to happen to stop processing, and notes would still
happen after the final error.  There could also be one just before we
cleanup the global_dc, so that the user gets the message there, if they
haven't gotten it yet.
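
To make the suggestion concrete, here is a toy sketch of such a split-out check. It is not GCC code: diag_state and its fields are invented stand-ins for the relevant parts of the diagnostic context, and only the function name comes from the proposal above. The point is that the same check can be called both before emitting a diagnostic and at other key points, so that notes attached to the final error are still printed before compilation stops.

```cpp
// Hypothetical stand-in for the bits of diagnostic_context involved.
struct diag_state {
  int error_count = 0;   // errors emitted so far
  int max_errors = 0;    // -fmax-errors=N; 0 means "no limit"
  bool bailed_out = false;
};

// Returns true when the limit has been reached and compilation should stop.
// Safe to call repeatedly from several places.
inline bool diagnostic_handle_max_errors(diag_state& s)
{
  if (s.max_errors != 0 && s.error_count >= s.max_errors)
    {
      s.bailed_out = true;
      return true;
    }
  return false;
}
```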

Thanks
Dave


[PATCH] xtensa: add HW FPU sequences for DIV/SQRT/RECIP/RSQRT

2016-10-14 Thread Max Filippov
Use new FPU instruction sequences documented in the ISA book to
implement __divsf3, __divdf3, __recipsf2, __recipdf2, __rsqrtsf2,
__rsqrtdf2 and __ieee754_sqrtf and __ieee754_sqrt.

2013-02-12  Ding-Kai Chen  
libgcc/
* config/xtensa/ieee754-df.S (__recipdf2, __rsqrtdf2,
__ieee754_sqrt): New functions.
(__divdf3): Add implementation with new FPU instructions under
#if XCHAL_HAVE_DFP_DIV.
* config/xtensa/ieee754-sf.S (__recipsf2, __rsqrtsf2,
__ieee754_sqrtf): New functions.
(__divsf3): Add implementation with new FPU instructions under
#if XCHAL_HAVE_FP_DIV.
* config/xtensa/t-xtensa (LIB1ASMFUNCS): Add _sqrtf, _recipsf2
_rsqrtsf2, _sqrt, _recipdf2 and _rsqrtdf2.
---
 libgcc/config/xtensa/ieee754-df.S | 179 +-
 libgcc/config/xtensa/ieee754-sf.S | 156 -
 libgcc/config/xtensa/t-xtensa |   4 +-
 3 files changed, 336 insertions(+), 3 deletions(-)

diff --git a/libgcc/config/xtensa/ieee754-df.S 
b/libgcc/config/xtensa/ieee754-df.S
index 1d9ef46..efb3c41 100644
--- a/libgcc/config/xtensa/ieee754-df.S
+++ b/libgcc/config/xtensa/ieee754-df.S
@@ -1217,8 +1217,59 @@ __muldf3:
 
 #ifdef L_divdf3
 
-   .literal_position
/* Division */
+
+#if XCHAL_HAVE_DFP_DIV
+
+.text
+.align 4
+.global __divdf3
+.type  __divdf3, @function
+__divdf3:
+   leaf_entry  sp, 16
+
+   wfrd    f1, xh, xl
+   wfrd    f2, yh, yl
+
+
+   div0.d  f3, f2
+   nexp01.d    f4, f2
+   const.d f0, 1
+   maddn.d f0, f4, f3
+   const.d f5, 0
+   mov.d   f7, f2
+   mkdadj.d    f7, f1
+   maddn.d f3, f0, f3
+   maddn.d f5, f0, f0
+   nexp01.d    f1, f1
+   div0.d  f2, f2
+   maddn.d f3, f5, f3
+   const.d f5, 1
+   const.d f0, 0
+   neg.d   f6, f1
+   maddn.d f5, f4, f3
+   maddn.d f0, f6, f2
+   maddn.d f3, f5, f3
+   maddn.d f6, f4, f0
+   const.d f2, 1
+   maddn.d f2, f4, f3
+   maddn.d f0, f6, f3
+   neg.d   f1, f1
+   maddn.d f3, f2, f3
+   maddn.d f1, f4, f0
+   addexpm.d   f0, f7
+   addexp.d    f3, f7
+   divn.d  f0, f1, f3
+
+   rfr xl, f0
+   rfrd    xh, f0
+
+   leaf_return
+
+#else
+
+   .literal_position
+
 __divdf3_aux:
 
/* Handle unusual cases (zeros, subnormals, NaNs and Infinities).
@@ -1537,6 +1588,8 @@ __divdf3:
movi    xl, 0
leaf_return
 
+#endif /* XCHAL_HAVE_DFP_DIV */
+
 #endif /* L_divdf3 */
 
 #ifdef L_cmpdf2
@@ -2388,3 +2441,127 @@ __extendsfdf2:
 #endif /* L_extendsfdf2 */
 
 
+#if XCHAL_HAVE_DFP_SQRT
+#ifdef L_sqrt
+
+.text
+.align 4
+.global __ieee754_sqrt
+.type  __ieee754_sqrt, @function
+__ieee754_sqrt:
+   leaf_entry  sp, 16
+
+   wfrd    f1, xh, xl
+
+   sqrt0.d f2, f1
+   const.d f4, 0
+   maddn.d f4, f2, f2
+   nexp01.d    f3, f1
+   const.d f0, 3
+   addexp.d    f3, f0
+   maddn.d f0, f4, f3
+   nexp01.d    f4, f1
+   maddn.d f2, f0, f2
+   const.d f5, 0
+   maddn.d f5, f2, f3
+   const.d f0, 3
+   maddn.d f0, f5, f2
+   neg.d   f6, f4
+   maddn.d f2, f0, f2
+   const.d f0, 0
+   const.d f5, 0
+   const.d f7, 0
+   maddn.d f0, f6, f2
+   maddn.d f5, f2, f3
+   const.d f3, 3
+   maddn.d f7, f3, f2
+   maddn.d f4, f0, f0
+   maddn.d f3, f5, f2
+   neg.d   f2, f7
+   maddn.d f0, f4, f2
+   maddn.d f7, f3, f7
+   mksadj.d    f2, f1
+   nexp01.d    f1, f1
+   maddn.d f1, f0, f0
+   neg.d   f3, f7
+   addexpm.d   f0, f2
+   addexp.d    f3, f2
+   divn.d  f0, f1, f3
+
+   rfr xl, f0
+   rfrd    xh, f0
+
+   leaf_return
+
+#endif /* L_sqrt */
+#endif /* XCHAL_HAVE_DFP_SQRT */
+
+#if XCHAL_HAVE_DFP_RECIP
+#ifdef L_recipdf2
+   /* Reciprocal */
+
+   .align  4
+   .global __recipdf2
+   .type   __recipdf2, @function
+__recipdf2:
+   leaf_entry  sp, 16
+
+   wfrd    f1, xh, xl
+
+   recip0.d    f0, f1
+   const.d f2, 2
+   msub.d  f2, f1, f0
+   mul.d   f3, f1, f0
+   const.d f4, 2
+   mul.d   f5, f0, f2
+   msub.d  f4, f3, f2
+   const.d f2, 1
+   mul.d   f0, f5, f4
+   msub.d  f2, f1, f0
+   maddn.d 

[SPARC] Use new pass registration mechanism

2016-10-14 Thread Eric Botcazou
Tested on SPARC/Solaris, applied on the mainline.


2016-10-14  Eric Botcazou  

* config/sparc/sparc-passes.def: New file.
* config/sparc/t-sparc (PASSES_EXTRA): Add sparc-passes.def.
* config/sparc/sparc-protos.h (make_pass_work_around_errata): New.
* config/sparc/sparc.c (sparc_option_override): Don't register passes.

-- 
Eric Botcazou
Index: config/sparc/sparc-passes.def
===
--- config/sparc/sparc-passes.def	(revision 0)
+++ config/sparc/sparc-passes.def	(working copy)
@@ -0,0 +1,27 @@
+/* Description of target passes for SPARC. 
+   Copyright (C) 2016 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/*
+   Macros that can be used in this file:
+   INSERT_PASS_AFTER (PASS, INSTANCE, TGT_PASS)
+   INSERT_PASS_BEFORE (PASS, INSTANCE, TGT_PASS)
+   REPLACE_PASS (PASS, INSTANCE, TGT_PASS)
+ */
+
+  INSERT_PASS_AFTER (pass_delay_slots, 1, pass_work_around_errata);
Index: config/sparc/sparc-protos.h
===
--- config/sparc/sparc-protos.h	(revision 241147)
+++ config/sparc/sparc-protos.h	(working copy)
@@ -47,6 +47,7 @@ extern void sparc_profile_hook (int);
 extern void sparc_override_options (void);
 extern void sparc_output_scratch_registers (FILE *);
 extern void sparc_target_macros (void);
+extern void sparc_emit_membar_for_model (enum memmodel, int, int);
 
 #ifdef RTX_CODE
 extern machine_mode select_cc_mode (enum rtx_code, rtx, rtx);
@@ -110,6 +111,6 @@ unsigned int sparc_regmode_natural_size
 bool sparc_modes_tieable_p (machine_mode, machine_mode);
 #endif /* RTX_CODE */
 
-extern void sparc_emit_membar_for_model (enum memmodel, int, int);
+extern rtl_opt_pass *make_pass_work_around_errata (gcc::context *);
 
 #endif /* __SPARC_PROTOS_H__ */
Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 241147)
+++ config/sparc/sparc.c	(working copy)
@@ -883,10 +883,10 @@ mem_ref (rtx x)
 }
 
 /* We use a machine specific pass to enable workarounds for errata.
+
We need to have the (essentially) final form of the insn stream in order
to properly detect the various hazards.  Therefore, this machine specific
-   pass runs as late as possible.  The pass is inserted in the pass pipeline
-   at the end of sparc_option_override.  */
+   pass runs as late as possible.  */
 
 static unsigned int
 sparc_do_work_around_errata (void)
@@ -1706,21 +1706,6 @@ sparc_option_override (void)
  pessimizes for double floating-point registers.  */
   if (!global_options_set.x_flag_ira_share_save_slots)
 flag_ira_share_save_slots = 0;
-
-  /* We register a machine specific pass to work around errata, if any.
- The pass mut be scheduled as late as possible so that we have the
- (essentially) final form of the insn stream to work on.
- Registering the pass must be done at start up.  It's convenient to
- do it here.  */
-  opt_pass *errata_pass = make_pass_work_around_errata (g);
-  struct register_pass_info insert_pass_work_around_errata =
-{
-  errata_pass,		/* pass */
-  "dbr",			/* reference_pass_name */
-  1,			/* ref_pass_instance_number */
-  PASS_POS_INSERT_AFTER	/* po_op */
-};
-  register_pass (&insert_pass_work_around_errata);
 }
 
 /* Miscellaneous utilities.  */
Index: config/sparc/t-sparc
===
--- config/sparc/t-sparc	(revision 241147)
+++ config/sparc/t-sparc	(working copy)
@@ -18,6 +18,8 @@
 # along with GCC; see the file COPYING3.  If not see
 # <http://www.gnu.org/licenses/>.
 
+PASSES_EXTRA += $(srcdir)/config/sparc/sparc-passes.def
+
 sparc-c.o: $(srcdir)/config/sparc/sparc-c.c
 	$(COMPILE) $<
 	$(POSTCOMPILE)


[Bug libstdc++/24693] Deque improvements

2016-10-14 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24693

--- Comment #6 from Jonathan Wakely  ---
Prototype patch: https://gcc.gnu.org/ml/libstdc++/2016-10/msg00017.html

[Bug libstdc++/66338] std::forward_as_tuple() issue with single argument

2016-10-14 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66338

Jonathan Wakely  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Jonathan Wakely  ---
.

[PATCH] Make std::bind use std::invoke

2016-10-14 Thread Jonathan Wakely

I think this is the last place where we weren't using std::__invoke()
for something the standard defines in terms of INVOKE. This should
make all our call wrappers and functional stuff consistent in terms of
support for invoking a pointer-to-member, reference_wrapper etc.

This means we can remove _Maybe_wrap_member_pointer, because
std::__invoke does the right thing for member pointers.

I've also added the deprecated attribute to the volatile overloads,
corresponding to the new wording in C++17:

 "The cv-qualifiers cv of the call wrapper g, as specified below,
 shall be neither volatile nor const volatile."
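
As an illustration of the INVOKE behaviour referred to above, here is a small sketch using only the public std::invoke (C++17) and std::bind interfaces. It does not show the internal std::__invoke helper, and the Counter type is made up for the example. The same pointer-to-member callable works with an object, a pointer, or a reference_wrapper:

```cpp
#include <cassert>
#include <functional>

// Hypothetical type used only for this example.
struct Counter {
  int value = 0;
  int add(int n) { value += n; return value; }
};

// Direct calls through std::invoke.
inline int invoke_examples() {
  Counter c;
  std::invoke(&Counter::add, c, 2);            // member function + object
  std::invoke(&Counter::add, &c, 3);           // member function + pointer
  std::invoke(&Counter::add, std::ref(c), 4);  // member function + reference_wrapper
  return std::invoke(&Counter::value, c);      // pointer to data member
}

// The same callables bound with std::bind.
inline int bind_examples() {
  Counter c;
  auto by_ref = std::bind(&Counter::add, std::ref(c), std::placeholders::_1);
  by_ref(5);
  auto by_ptr = std::bind(&Counter::add, &c, std::placeholders::_1);
  by_ptr(6);
  auto read = std::bind(&Counter::value, std::placeholders::_1);
  return read(c);
}
```

Both helpers modify the same Counter through every call form, so the final value reflects all of the additions.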


* include/std/functional (_Mu, _Mu):
Simplify forwarding from tuple of references.
(_Maybe_wrap_member_pointer): Remove.
(_Bind::__call, _Bind::__call_c, _Bind::__call_v, _Bind::__call_c_v):
Use std::__invoke.
(_Bind::_Mu_type, _Bind::_Res_type_impl, _Bind::_Res_type)
(_Bind::__dependent, _Bind::_Res_type_cv): New helpers to simplify
return type deduction.
(_Bind::operator(), _Bind::operator() const): Use new helpers.
(_Bind::operator() volatile, _Bind::operator() const volatile):
Likewise. Add deprecated attribute for C++17 mode.
(_Bind_result::__call): Use std::__invoke.
(_Bind_result::operator() volatile)
(_Bind_result::operator() const volatile): Add deprecated attribute.
(_Bind_helper::__maybe_type, _Bindres_helper::__maybe_type): Remove.
(_Bind_helper, _Bindres_helper): Don't use _Maybe_wrap_member_pointer.
(bind, bind): Don't use __maybe_type.
* src/c++11/compatibility-thread-c++0x.cc
(_Maybe_wrap_member_pointer): Define here for compatibility symbols.
* testsuite/20_util/bind/68912.cc: Don't test volatile-qualification
in C++17 mode.
* testsuite/20_util/bind/cv_quals.cc: Likewise.
* testsuite/20_util/bind/cv_quals_2.cc: Likewise.

Tested x86_64-linux, -std=gnu++{14,17},  committed to trunk.

commit 0a21180fd9ac92e6fe756f1956fded4c8f3149f2
Author: Jonathan Wakely 
Date:   Thu Oct 13 01:40:13 2016 +0100

Make std::bind use std::invoke

* include/std/functional (_Mu, _Mu):
Simplify forwarding from tuple of references.
(_Maybe_wrap_member_pointer): Remove.
(_Bind::__call, _Bind::__call_c, _Bind::__call_v, _Bind::__call_c_v):
Use std::__invoke.
(_Bind::_Mu_type, _Bind::_Res_type_impl, _Bind::_Res_type)
(_Bind::__dependent, _Bind::_Res_type_cv): New helpers to simplify
return type deduction.
(_Bind::operator(), _Bind::operator() const): Use new helpers.
(_Bind::operator() volatile, _Bind::operator() const volatile):
Likewise. Add deprecated attribute for C++17 mode.
(_Bind_result::__call): Use std::__invoke.
(_Bind_result::operator() volatile)
(_Bind_result::operator() const volatile): Add deprecated attribute.
(_Bind_helper::__maybe_type, _Bindres_helper::__maybe_type): Remove.
(_Bind_helper, _Bindres_helper): Don't use _Maybe_wrap_member_pointer.
(bind, bind): Don't use __maybe_type.
* src/c++11/compatibility-thread-c++0x.cc
(_Maybe_wrap_member_pointer): Define here for compatibility symbols.
* testsuite/20_util/bind/68912.cc: Don't test volatile-qualification
in C++17 mode.
* testsuite/20_util/bind/cv_quals.cc: Likewise.
* testsuite/20_util/bind/cv_quals_2.cc: Likewise.

diff --git a/libstdc++-v3/include/std/functional 
b/libstdc++-v3/include/std/functional
index d39b519..ad67a1d 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -742,7 +742,7 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
   const _Index_tuple<_Indexes...>&) const volatile
-> decltype(__arg(declval<_Args>()...))
{
- return __arg(std::forward<_Args>(std::get<_Indexes>(__tuple))...);
+ return __arg(std::get<_Indexes>(std::move(__tuple))...);
}
 };
 
@@ -759,10 +759,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
_Safe_tuple_element_t<(is_placeholder<_Arg>::value - 1), _Tuple>&&
operator()(const volatile _Arg&, _Tuple& __tuple) const volatile
{
- using __type
-   = __tuple_element_t<(is_placeholder<_Arg>::value - 1), _Tuple>;
- return std::forward<__type>(
- ::std::get<(is_placeholder<_Arg>::value - 1)>(__tuple));
+ return
+   ::std::get<(is_placeholder<_Arg>::value - 1)>(std::move(__tuple));
}
 };
 
@@ -781,50 +779,6 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
{ return std::forward<_CVArg>(__arg); }
 };
 
-  /**
-   *  Maps member pointers into instances of _Mem_fn but leaves all
-   *  other function objects untouched. Used by std::bind(). The
-   *  

Re: [PATCH v2] aarch64: Add split-stack initial support

2016-10-14 Thread Wilco Dijkstra
Hi,

> Split-stack prologue on function entry is as follow (this goes before the
> usual function prologue):

>   mrs x9, tpidr_el0
>   mov x10, -

As Jiong already remarked, the nop won't work. Do we know the maximum adjustment
that the linker is allowed to make? If so, and we can limit the adjustment to
16MB in most cases, emitting 2 subtracts is best. Larger offsets need
mov/movk/sub, but that should be extremely rare.

>   nop/movk

>   add x10, sp, x10
>   ldr x9, [x9, 16]

Is there any need to detect underflow of x10, or is there a guarantee that
stacks are never allocated in the low 2GB (given the maximum adjustment is
2GB)? It's safe to do a signed comparison.

>   cmp x10, x9
>   b.cs    enough

Why save/restore x30 and use the call x30+8 trick when we could pass the
continuation address and use a tailcall? That also avoids emitting extra
unwind info.

>   stp x30, [sp, -16]
>   bl __morestack
>   ldp x30, [sp], 16
>   ret

This part doesn't make any sense - both x28 and carry flag as an input, and
spread across the prolog - why???

> enough:
>   mov x10, sp
[prolog]
>   b.cs    continue
>   mov x10, x28
continue:
[rest of function]

Why not do this?

function:
mrs x9, tpidr_el0
sub x10, sp, N & 0xfff000
sub x10, x10, N & 0xfff
ldr x9, [x9, 16]
adr x12, main_fn_entry
mov x11, sp   [if function has stacked arguments]
cmp x10, x9
b.ge    main_fn_entry
b __morestack
b __morestack
main_fn_entry: [x11 is argument pointer]
[prolog]
[rest of function]
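
The arithmetic behind the two subtracts can be sketched as follows. This is only a model of the sequence above, under the assumption that the frame size N fits in 24 bits (the 16MB limit mentioned earlier); SplitImm, split and has_enough_stack are hypothetical names, not part of any patch:

```cpp
#include <cassert>
#include <cstdint>

// The two pieces of N used by the pair of subtract instructions.
struct SplitImm {
  std::uint32_t high;  // N & 0xfff000
  std::uint32_t low;   // N & 0xfff
};

// True when N can be handled by two subtracts (N < 16MB).
constexpr bool fits_two_subs(std::uint64_t n) { return n <= 0xffffff; }

constexpr SplitImm split(std::uint32_t n) {
  return { n & 0xfff000, n & 0xfff };
}

// Models: sub x10, sp, high; sub x10, x10, low; cmp x10, x9; b.ge.
// The comparison is signed, so a result that drops below zero is
// treated as "not enough stack".
inline bool has_enough_stack(std::int64_t sp, std::uint32_t n,
                             std::int64_t guard) {
  SplitImm s = split(n);
  std::int64_t x10 = sp - s.high;
  x10 -= s.low;
  return x10 >= guard;
}
```

For example, N = 0x123456 splits into 0x123000 and 0x456, and the two pieces always sum back to N.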

In __morestack you need to save x8 as well (another argument register!) and
x12 (the continuation address). After returning from the call x8 doesn't need
to be preserved.

There are several issues with unwinding in __morestack. x28 is not described
as a callee-save so will be corrupted if unwinding across a __morestack call.
This won't unwind correctly after the ldp as the unwinder will use the
restored frame pointer to try to restore x29/x30:

+   ldp x29, x30, [x28, STACKFRAME_BASE]
+   ldr x28, [x28, STACKFRAME_BASE + 80]
+
+   .cfi_remember_state
+   .cfi_restore 30
+   .cfi_restore 29
+   .cfi_def_cfa 31, 0

This stores a random x30 value on the stack. What is the purpose of this?
Nothing can unwind to here:

+   # Start using new stack
+   stp x29, x30, [x0, -16]!
+   mov sp, x0

Also we no longer need split_stack_arg_pointer_used_p () or any code that uses
it (functions that don't have any arguments passed on the stack could omit the
mov x11, sp).

Wilco



Re: [C++ PATCH] DR 1511 - const volatile variables and ODR

2016-10-14 Thread Jason Merrill
OK.

On Fri, Oct 14, 2016 at 1:23 PM, Jakub Jelinek  wrote:
> Hi!
>
> We weren't implementing this DR: in the past all non-extern (and
> non-inline) const vars at namespace scope had internal linkage, but now
> only non-volatile const vars do.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
>
> 2016-10-14  Jakub Jelinek  
>
> DR 1511 - const volatile variables and ODR
> * decl.c (grokvardecl): Change flags argument to type_quals,
> add conceptp argument.  Set TREE_PUBLIC for non-static volatile vars.
> (grokdeclarator): Adjust grokvardecl caller.
>
> * g++.dg/DRs/dr1511-1.C: New test.
> * g++.dg/DRs/dr1511-2.C: New test.
>
> --- gcc/cp/decl.c.jj2016-10-14 12:31:49.0 +0200
> +++ gcc/cp/decl.c   2016-10-14 12:50:28.697542270 +0200
> @@ -68,7 +68,7 @@ static int unary_op_p (enum tree_code);
>  static void push_local_name (tree);
>  static tree grok_reference_init (tree, tree, tree, int);
>  static tree grokvardecl (tree, tree, tree, const cp_decl_specifier_seq *,
> -int, int, int, int, tree);
> +int, int, int, bool, int, tree);
>  static int check_static_variable_definition (tree, tree);
>  static void record_unknown_type (tree, const char *);
>  static tree builtin_function_1 (tree, tree, bool);
> @@ -8512,8 +8512,9 @@ grokvardecl (tree type,
>  tree orig_declarator,
>  const cp_decl_specifier_seq *declspecs,
>  int initialized,
> -int flags,
> +int type_quals,
>  int inlinep,
> +bool conceptp,
>  int template_count,
>  tree scope)
>  {
> @@ -8522,8 +8523,8 @@ grokvardecl (tree type,
>
>gcc_assert (!name || identifier_p (name));
>
> -  bool constp = flags&1;
> -  bool conceptp = flags&2;
> +  bool constp = (type_quals & TYPE_QUAL_CONST) != 0;
> +  bool volatilep = (type_quals & TYPE_QUAL_VOLATILE) != 0;
>
>/* Compute the scope in which to place the variable, but remember
>   whether or not that scope was explicitly specified by the user.   */
> @@ -8580,6 +8581,7 @@ grokvardecl (tree type,
>TREE_PUBLIC (decl) = (declspecs->storage_class != sc_static
> && (DECL_THIS_EXTERN (decl)
> || ! constp
> +   || volatilep
> || inlinep));
>TREE_STATIC (decl) = ! DECL_EXTERNAL (decl);
>  }
> @@ -11626,8 +11628,9 @@ grokdeclarator (const cp_declarator *dec
> decl = grokvardecl (type, dname, unqualified_id,
> declspecs,
> initialized,
> -   ((type_quals & TYPE_QUAL_CONST) != 0) | (2 * 
> concept_p),
> +   type_quals,
> inlinep,
> +   concept_p,
> template_count,
> ctype ? ctype : in_namespace);
> if (decl == NULL_TREE)
> --- gcc/testsuite/g++.dg/DRs/dr1511-1.C.jj  2016-10-14 13:12:06.745016428 
> +0200
> +++ gcc/testsuite/g++.dg/DRs/dr1511-1.C 2016-10-14 13:12:40.715583815 +0200
> @@ -0,0 +1,38 @@
> +/* DR 1511 - const volatile variables and the one-definition rule */
> +/* { dg-do run } */
> +/* { dg-additional-sources "dr1511-2.C" } */
> +
> +typedef const int cint;
> +typedef const volatile int cvint;
> +typedef volatile int vint;
> +const int v1 = 5;
> +extern volatile const int v2;
> +cint v3 = 7;
> +extern cvint v4;
> +extern const vint v5;
> +extern volatile cint v6;
> +const int w1 = 5;
> +extern volatile const int w2;
> +cint w3 = 7;
> +extern cvint w4;
> +extern const vint w5;
> +extern volatile cint w6;
> +extern const int &r1;
> +extern volatile const int &r2;
> +extern const int &r3;
> +extern const volatile int &r4;
> +extern const volatile int &r5;
> +extern const volatile int &r6;
> +
> +int
> +main ()
> +{
> +  if (v1 != 5 || v2 != 6 || v3 != 7 || v4 != 8 || v5 != 9 || v6 != 10)
> +__builtin_abort ();
> +  if (w1 != 5 || w2 != 6 || w3 != 7 || w4 != 8 || w5 != 9 || w6 != 10)
> +__builtin_abort ();
> +  if (r1 != w1 || &r1 == &w1 || r2 != w2 || &r2 != &w2 || r3 != w3 || &r3 == &w3)
> +__builtin_abort ();
> +  if (r4 != w4 || &r4 != &w4 || r5 != w5 || &r5 != &w5 || r6 != w6 || &r6 != &w6)
> +__builtin_abort ();
> +}
> --- gcc/testsuite/g++.dg/DRs/dr1511-2.C.jj  2016-10-14 13:12:09.912976098 
> +0200
> +++ gcc/testsuite/g++.dg/DRs/dr1511-2.C 2016-10-14 13:11:53.0 +0200
> @@ -0,0 +1,24 @@
> +/* DR 1511 - const volatile variables and the one-definition rule */
> +/* { dg-do compile } */
> +
> +typedef const int cint;
> +typedef const volatile int cvint;
> +typedef volatile int vint;
> +const int v1 = 5;
> +volatile const int v2 = 6;
> +cint v3 = 7;
> +cvint v4 = 8;
> +const vint v5 = 9;
> +volatile cint v6 = 10;
> +const int w1 = 5;
> +volatile const int w2 = 6;
> +cint w3 
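
The linkage decision that the quoted decl.c hunk encodes in TREE_PUBLIC can be modelled in isolation. The following is a sketch only: VarDeclInfo, StorageClass and has_external_linkage are hypothetical names, and the function mirrors just the boolean shape of the patched condition for namespace-scope variables, not the surrounding compiler logic:

```cpp
#include <cassert>

// Hypothetical summary of the declaration properties the condition reads.
enum class StorageClass { none, static_, extern_ };

struct VarDeclInfo {
  StorageClass storage = StorageClass::none;
  bool is_const = false;
  bool is_volatile = false;
  bool is_inline = false;
};

// Same shape as the patched condition: a non-static variable is public
// if it is declared extern, is not const, is volatile, or is inline.
inline bool has_external_linkage(const VarDeclInfo &v) {
  return v.storage != StorageClass::static_
         && (v.storage == StorageClass::extern_
             || !v.is_const
             || v.is_volatile
             || v.is_inline);
}
```

Under this model a plain `const int` keeps internal linkage, while a `const volatile int` becomes external, which is the behaviour change the DR asks for.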

Re: New option -flimit-function-alignment

2016-10-14 Thread Bernd Schmidt

On 10/12/2016 09:27 PM, Denys Vlasenko wrote:

Yes, something like "if max_skip >= func_size, temporarily lower
max_skip to func_size-1" (because otherwise we can create padding
bigger-or-equal to the entire function in size, which is stupid
- it's better to just put the function in that space).

This would be nice.


That would be this patch. Bootstrapped and tested on x86_64-linux, ok?


Bernd
gcc/
	* common.opt (flimit-function-alignment): New.
	* doc/invoke.texi (-flimit-function-alignment): Document.
	* emit-rtl.h (struct rtl_data): Add max_insn_address field.
	* final.c (shorten_branches): Set it.
	* varasm.c (assemble_start_function): Limit alignment if
	requested.

gcc/testsuite/
	* gcc.target/i386/align-limit.c: New test.

Index: gcc/common.opt
===
--- gcc/common.opt	(revision 240861)
+++ gcc/common.opt	(working copy)
@@ -906,6 +906,9 @@ Align the start of functions.
 falign-functions=
 Common RejectNegative Joined UInteger Var(align_functions)
 
+flimit-function-alignment
+Common Report Var(flag_limit_function_alignment) Optimization Init(0)
+
 falign-jumps
 Common Report Var(align_jumps,0) Optimization UInteger
 Align labels which are only reached by jumping.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 240861)
+++ gcc/doc/invoke.texi	(working copy)
@@ -368,7 +368,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-ira-share-spill-slots @gol
 -fisolate-erroneous-paths-dereference -fisolate-erroneous-paths-attribute @gol
 -fivopts -fkeep-inline-functions -fkeep-static-functions @gol
--fkeep-static-consts -flive-range-shrinkage @gol
+-fkeep-static-consts -flimit-function-alignment -flive-range-shrinkage @gol
 -floop-block -floop-interchange -floop-strip-mine @gol
 -floop-unroll-and-jam -floop-nest-optimize @gol
 -floop-parallelize-all -flra-remat -flto -flto-compression-level @gol
@@ -8262,6 +8262,12 @@ If @var{n} is not specified or is zero,
 
 Enabled at levels @option{-O2}, @option{-O3}.
 
+@item -flimit-function-alignment
+If this option is enabled, the compiler tries to avoid unnecessarily
+overaligning functions. It attempts to instruct the assembler to align
+by the amount specified by @option{-falign-functions}, but not to
+skip more bytes than the size of the function.
+
 @item -falign-labels
 @itemx -falign-labels=@var{n}
 @opindex falign-labels
Index: gcc/emit-rtl.h
===
--- gcc/emit-rtl.h	(revision 240861)
+++ gcc/emit-rtl.h	(working copy)
@@ -284,6 +284,9 @@ struct GTY(()) rtl_data {
  to eliminable regs (like the frame pointer) are set if an asm
  sets them.  */
   HARD_REG_SET asm_clobbers;
+
+  /* The highest address seen during shorten_branches.  */
+  int max_insn_address;
 };
 
 #define return_label (crtl->x_return_label)
Index: gcc/final.c
===
--- gcc/final.c	(revision 240861)
+++ gcc/final.c	(working copy)
@@ -1462,7 +1462,7 @@ shorten_branches (rtx_insn *first)
   if (!increasing)
 	break;
 }
-
+  crtl->max_insn_address = insn_current_address;
   free (varying_length);
 }
 
Index: gcc/testsuite/gcc.target/i386/align-limit.c
===
--- gcc/testsuite/gcc.target/i386/align-limit.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/align-limit.c	(working copy)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -falign-functions=64 -flimit-function-alignment" } */
+/* { dg-final { scan-assembler ".p2align 6,,1" } } */
+/* { dg-final { scan-assembler-not ".p2align 6,,63" } } */
+
+void
+test_func (void)
+{
+}
Index: gcc/varasm.c
===
--- gcc/varasm.c	(revision 240861)
+++ gcc/varasm.c	(working copy)
@@ -1791,9 +1791,14 @@ assemble_start_function (tree decl, cons
   && align_functions_log > align
   && optimize_function_for_speed_p (cfun))
 {
+  int align_log = align_functions_log;
+  int max_skip = align_functions - 1;
+  if (flag_limit_function_alignment && crtl->max_insn_address > 0
+	  && max_skip >= crtl->max_insn_address)
+	max_skip = crtl->max_insn_address - 1;
+
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
-  ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file,
- align_functions_log, align_functions - 1);
+  ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip);
 #else
   ASM_OUTPUT_ALIGN (asm_out_file, align_functions_log);
 #endif
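
The max_skip computation in the varasm.c hunk can be exercised on its own. This is a standalone sketch: function_max_skip is a hypothetical helper, with parameters standing in for align_functions, crtl->max_insn_address and flag_limit_function_alignment:

```cpp
#include <cassert>

// align_functions:   requested alignment in bytes (e.g. 64)
// max_insn_address:  highest instruction address recorded for the
//                    function, or 0 when it is not known
// limit:             whether -flimit-function-alignment is in effect
constexpr int function_max_skip(int align_functions, int max_insn_address,
                                bool limit) {
  int max_skip = align_functions - 1;
  // Never skip as many bytes as the function itself occupies.
  if (limit && max_insn_address > 0 && max_skip >= max_insn_address)
    max_skip = max_insn_address - 1;
  return max_skip;
}
```

With a 64-byte alignment request, a function whose recorded size is 2 gets a maximum skip of 1 when the option is on and 63 when it is off; functions larger than the alignment are unaffected.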


[Bug c/77989] New: -O3 causes verify_gimple fail

2016-10-14 Thread dcb314 at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77989

Bug ID: 77989
   Summary: -O3 causes verify_gimple fail
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

Created attachment 39814
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39814&action=edit
C source code

The attached C code, when compiled by gcc trunk dated 20161014 and 
compiler flag -O3, does this:

tar.c: In function ‘tar_wr’:
tar.c:554:1: error: invalid address operand in MEM_REF
MEM[(char *)[(void *) + 156B]];

tar.c:554:1: error: invalid first operand of MEM_REF
[(void *) + 156B]
tar.c:300:15: note: in statement
# VUSE <.MEM_505>
_324 = MEM[(char *)[(void *) + 156B]];
tar.c:554:1: internal compiler error: verify_gimple failed
0xd7a849 verify_gimple_in_cfg(function*, bool)
../../trunk/gcc/tree-cfg.c:5208
0xbf46fb execute_function_todo
../../trunk/gcc/passes.c:1965

[Bug tree-optimization/77988] New: ICE on valid code at -Os and above on x86_64-linux-gnu: verify_gimple failed

2016-10-14 Thread su at cs dot ucdavis.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77988

Bug ID: 77988
   Summary: ICE on valid code at -Os and above on
x86_64-linux-gnu: verify_gimple failed
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu
  Target Milestone: ---

It is a regression from 6.2.x. 


$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20161013 (experimental) [trunk revision 241143] (GCC)
$
$ gcc-trunk -O1 small.c
$ gcc-6.2 -Os small.c
$
$ gcc-trunk -Os small.c
small.c: In function ‘main’:
small.c:4:5: error: invalid address operand in MEM_REF
 int main ()
 ^~~~
b[0];

small.c:4:5: error: invalid first operand of MEM_REF
[0]
small.c:15:11: note: in statement
   if (*f)
   ^~
# VUSE <.MEM_9(D)>
_2 = b[0];
small.c:4:5: internal compiler error: verify_gimple failed
 int main ()
 ^~~~
0xc31eff verify_gimple_in_cfg(function*, bool)
../../gcc-source-trunk/gcc/tree-cfg.c:5208
0xb12492 execute_function_todo
../../gcc-source-trunk/gcc/passes.c:1965
0xb12e49 execute_todo
../../gcc-source-trunk/gcc/passes.c:2015
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
$


--


static int a = 2;
int b[1], c, d;

int main ()
{ 
  int e = a, *f = &b[0];
  if (d)
for (e = 0; e < 1; e++)
  ;
  if (e)
{
L1:
  if (b < f)
__builtin_abort ();
  if (*f)
c++;
  return 0;
}
  f = 0;
  goto L1;
  return 0;
}

[testsuite] Small tweaks to gnat.dg/debug[789].adb

2016-10-14 Thread Eric Botcazou
This removes redundant switches, reorders them and adds missing final -margs.

Tested on x86_64-suse-linux and SPARC/Solaris, applied on the mainline.


2016-10-14  Eric Botcazou  

* gnat.dg/debug7.adb (dg-options): Remove -g.
* gnat.dg/debug8.adb (dg-options): Add -margs.
* gnat.dg/debug9.adb (dg-options): Remove -g and add -margs.

-- 
Eric Botcazou
Index: gnat.dg/debug7.adb
===
--- gnat.dg/debug7.adb	(revision 241147)
+++ gnat.dg/debug7.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile }
--- { dg-options "-cargs -g -gdwarf-2 -gstrict-dwarf -dA -margs" }
+-- { dg-options "-cargs -gdwarf-2 -gstrict-dwarf -dA -margs" }
 -- { dg-final { scan-assembler "DW_TAG_imported_decl" } }
 
 package body Debug7 is
Index: gnat.dg/debug8.adb
===
--- gnat.dg/debug8.adb	(revision 241147)
+++ gnat.dg/debug8.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile }
--- { dg-options "-cargs -g -fgnat-encodings=minimal -dA" }
+-- { dg-options "-cargs -g -fgnat-encodings=minimal -dA -margs" }
 -- { dg-final { scan-assembler-not "DW_OP_const4u" } }
 -- { dg-final { scan-assembler-not "DW_OP_const8u" } }
 
Index: gnat.dg/debug9.adb
===
--- gnat.dg/debug9.adb	(revision 241147)
+++ gnat.dg/debug9.adb	(working copy)
@@ -7,7 +7,7 @@
 --  some hackish way to check that types are output in the proper context (i.e.
 --  at global or local scope).
 --
---  { dg-options "-g -gdwarf-4 -cargs -fdebug-types-section -dA" }
+--  { dg-options "-cargs -gdwarf-4 -fdebug-types-section -dA -margs" }
 --  { dg-final { scan-assembler-times "\\(DIE \\(0x\[a-f0-9\]*\\) DW_TAG_type_unit\\)" 0 } }
 
 procedure Debug9 is


[PATCH] read-md.c: Move various state to within class rtx_reader

2016-10-14 Thread David Malcolm
On Wed, 2016-10-12 at 22:57 +0100, Richard Sandiford wrote:
> Sorry, haven't had time to read the full series yet, but:
> 
> David Malcolm  writes:
> > On Wed, 2016-10-05 at 17:51 +0200, Bernd Schmidt wrote:
> > > On 10/05/2016 06:14 PM, David Malcolm wrote:
> > > > The selftests for the RTL frontend require supporting multiple
> > > > reader instances being alive one after another in-process, so
> > > > this lack of cleanup would become a leak.
> > > 
> > > > +  /* Initialize global data.  */
> > > > > +  obstack_init (&string_obstack);
> > > > +  ptr_locs = htab_create (161, leading_ptr_hash,
> > > > leading_ptr_eq_p,
> > > > 0);
> > > > > +  obstack_init (&ptr_loc_obstack);
> > > > +  joined_conditions = htab_create (161, leading_ptr_hash,
> > > > leading_ptr_eq_p, 0);
> > > > > +  obstack_init (&joined_conditions_obstack);
> > > > +  md_constants = htab_create (31, leading_string_hash,
> > > > + leading_string_eq_p, (htab_del)
> > > > 0);
> > > > +  enum_types = htab_create (31, leading_string_hash,
> > > > +   leading_string_eq_p, (htab_del)
> > > > 0);
> > > > +
> > > > +  /* Unlock the stdio streams.  */
> > > > +  unlock_std_streams ();
> > > 
> > > Hmm, but these are global statics. Shouldn't they first be moved
> > > to
> > > become class members?
> > 
> > [CCing Richard Sandiford]
> > 
> > I tried to move these into class rtx_reader, but doing so rapidly
> > became quite invasive, with many of functions in the gen* tools
> > becoming methods.
> 
> Is that just to avoid introducing explicit references to the global
> rtx_reader object in the gen* tools?  If so, then TBH adding those
> references sound better to me than tying generator-specific functions
> to the rtx reader (not least because most of them do more than just
> read rtl).
> 
> > Arguably that would be a good thing, but there are a couple of
> > issues:
> > 
> > (a) some of these functions take "vec" arguments; moving them from
> > static functions to being class methods requires that vec.h has
> > been
> > included when the relevant class decl is parsed.
> 
> I don't think including vec.h more often should be a blocker though.
> :-)
> 
> > (b) rtx_reader turns into a big dumping ground of methods, for the
> > union of all of the various gen* tools.
> > 
> > One way to alleviate these issues would be to do the split of
> > rtx_reader into base_rtx_reader vs rtx_reader from patch 9 of the
> > kit:
> >   https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00273.html
> > and perhaps to split out part of read-md.h into a new read-rtl.h.
> > 
> > Before I reorganize the patches, does this approach sound
> > reasonable?
> > 
> > Alternatively, a less invasive approach is to have an accessor for
> > these fields, so that things using them can get at them via the
> > rtx_reader_ptr singleton e.g.:
> > 
> > void
> > grow_string_obstack (char ch)
> > {
> >obstack_1grow (rtx_reader_ptr->get_string_obstack (), ch);
> > }
> > 
> > and similar.
> 
> I think it's OK for the generators to refer to rtx_reader_ptr directly.
> Obviously that makes the patches more invasive, but hopefully the
> extra changes are mechanical.
> 
> Thanks,
> Richard

Thanks.

Here's an updated version of the patch.

As before:

Various global data items relating to reading .md files are
currently initialized in rtx_reader::read_md_files, and are
not cleaned up.

The selftests for the RTL frontend require supporting multiple
reader instances being alive one after another in-process, so
this lack of cleanup would become a leak.

What's new:

In this version of the patch, I've moved the global variables to
be fields of class rtx_reader, moving their setup to the constructor.
The patch adds matching cleanups to the destructor, along with a
cleanup of m_base_dir.

Doing so requires updating the various users of these fields.
Where it seemed appropriate, I made the functions be methods
of rtx_reader.  In other cases, I updated them to access the fields
via rtx_reader_ptr.
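
To illustrate the shape of this change, here is a minimal C++ sketch; it is
not the actual GCC code.  A reader class owns state that used to be global,
sets it up in its constructor, releases it in its destructor, and publishes
itself through a singleton pointer.  The names reader, reader_ptr,
m_string_buf and grow_string_buf are hypothetical stand-ins, and a
std::string replaces the obstack.

```cpp
#include <cassert>
#include <string>

class reader;

// Hypothetical stand-in for rtx_reader_ptr: points at the live reader, if any.
static reader *reader_ptr = nullptr;

class reader
{
public:
  reader ()
  {
    // Only one reader may be alive at a time.
    assert (reader_ptr == nullptr);
    reader_ptr = this;
  }

  ~reader ()
  {
    // The fields are destroyed with the instance, so successive readers
    // in one process start from a clean state and nothing leaks.
    reader_ptr = nullptr;
  }

  reader (const reader &) = delete;
  reader &operator= (const reader &) = delete;

  // Accessor for a field that used to be a global variable.
  std::string *get_string_buf () { return &m_string_buf; }

private:
  std::string m_string_buf;
};

// A free function that reaches the field through the singleton pointer
// instead of being turned into a method.
void
grow_string_buf (char ch)
{
  reader_ptr->get_string_buf ()->push_back (ch);
}
```

Creating a second reader after the first has been destroyed starts with an
empty buffer, which is the property the selftests need.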

Successfully bootstrapped on x86_64-pc-linux-gnu.

Comparing md5sums of BUILD/gcc/insn-*.[hc] before/after shows them to be
unchanged by the patch.

OK for trunk?

gcc/ChangeLog:
* genattrtab.c (gen_attr): Use rtx_reader_ptr for lookup_enum_type
call.
* genconstants.c: Include "statistics.h" and "vec.h".
(main): Update for conversion of traverse_enum_types to a method.
* genenums.c: Include "statistics.h" and "vec.h".
(main): Update for conversion of traverse_enum_types to a method.
* genmddeps.c: Include "statistics.h" and "vec.h".
* gensupport.c (gen_mnemonic_setattr): Update for move of
string_obstack to a field of rtx_reader.
(mnemonic_htab_callback): Likewise.
(gen_mnemonic_attr): Likewise.
* read-md.c: Include "statistics.h" and "vec.h".
(string_obstack): Convert this global to field "m_string_obstack"
of class rtx_reader.
(ptr_locs): 

[Bug c++/77987] New: unique_ptr<T[]> reset rejects cv-compatible pointers

2016-10-14 Thread barry.revzin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77987

Bug ID: 77987
   Summary: unique_ptr<T[]> reset rejects cv-compatible pointers
   Product: gcc
   Version: 6.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: barry.revzin at gmail dot com
  Target Milestone: ---

#include <memory>

int main() {
    std::unique_ptr<const char[]> p;
    p.reset(new char[1]);
}

fails to compile due to reset invoking swap on two different types.
However, this code should be well-formed per N4089.
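
For reference, a minimal sketch of the kind of use this is expected to
support, assuming a standard library that implements the N4089 wording.  The
helper name make_readonly_copy is hypothetical and only serves the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Copy TEXT into a freshly allocated buffer and hand it out as read-only.
std::unique_ptr<const char[]>
make_readonly_copy (const char *text)
{
  assert (text != nullptr);
  const std::size_t n = std::strlen (text) + 1;
  char *buf = new char[n];
  std::memcpy (buf, text, n);

  std::unique_ptr<const char[]> p;
  // reset receives a char*, while the stored pointer type is const char*;
  // this is only a qualification conversion, so it should be accepted.
  p.reset (buf);
  return p;
}
```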

[PATCH] Add various DWARF5 constants to dwarf2.{h,def}

2016-10-14 Thread Jakub Jelinek
Hi!

Now that DWARF5 public review draft has been released, I went through
the document looking for double dagger marked constants and added them to
dwarf2.{def,h}.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-10-14  Jakub Jelinek  

* dwarf2.h (enum dwarf_calling_convention): Add new DWARF5
calling convention codes.
(enum dwarf_line_number_content_type): New.
(enum dwarf_location_list_entry_type): Add DWARF5 DW_LLE_*
codes.
(enum dwarf_source_language): Add new DWARF5 DW_LANG_* codes.
(enum dwarf_macro_record_type): Add DWARF5 DW_MACRO_* codes.
(enum dwarf_name_index_attribute): New.
(enum dwarf_range_list_entry): New.
(enum dwarf_unit_type): New.
* dwarf2.def: Add new DWARF5 DW_TAG_*, DW_FORM_*, DW_AT_*,
DW_OP_* and DW_ATE_* entries.

--- include/dwarf2.h.jj 2016-08-12 11:12:47.0 +0200
+++ include/dwarf2.h    2016-10-14 11:41:14.962308511 +0200
@@ -175,6 +175,10 @@ enum dwarf_calling_convention
 DW_CC_program = 0x2,
 DW_CC_nocall = 0x3,
 
+/* DWARF 5.  */
+DW_CC_pass_by_reference = 0x4,
+DW_CC_pass_by_value = 0x5,
+
 DW_CC_lo_user = 0x40,
 DW_CC_hi_user = 0xff,
 
@@ -257,15 +261,38 @@ enum dwarf_line_number_hp_sfc_ops
 DW_LNE_HP_SFC_associate = 3
   };
 
-/* Type codes for location list entries.
-   Extension for Fission.  See http://gcc.gnu.org/wiki/DebugFission.  */
+/* Content type codes in line table directory_entry_format
+   and file_name_entry_format sequences.  */
+enum dwarf_line_number_content_type
+  {
+DW_LNCT_path = 0x1,
+DW_LNCT_directory_index = 0x2,
+DW_LNCT_timestamp = 0x3,
+DW_LNCT_size = 0x4,
+DW_LNCT_MD5 = 0x5,
+DW_LNCT_lo_user = 0x2000,
+DW_LNCT_hi_user = 0x3fff
+  };
 
+/* Type codes for location list entries.  */
 enum dwarf_location_list_entry_type
   {
-DW_LLE_GNU_end_of_list_entry = 0,
-DW_LLE_GNU_base_address_selection_entry = 1,
-DW_LLE_GNU_start_end_entry = 2,
-DW_LLE_GNU_start_length_entry = 3
+DW_LLE_end_of_list = 0x00,
+DW_LLE_base_addressx = 0x01,
+DW_LLE_startx_endx = 0x02,
+DW_LLE_startx_length = 0x03,
+DW_LLE_offset_pair = 0x04,
+DW_LLE_default_location = 0x05,
+DW_LLE_base_address = 0x06,
+DW_LLE_start_end = 0x07,
+DW_LLE_start_length = 0x08,
+
+/* Former extension for Fission.
+   See http://gcc.gnu.org/wiki/DebugFission.  */
+DW_LLE_GNU_end_of_list_entry = 0x00,
+DW_LLE_GNU_base_address_selection_entry = 0x01,
+DW_LLE_GNU_start_end_entry = 0x02,
+DW_LLE_GNU_start_length_entry = 0x03
   };
 
 #define DW_CIE_ID0x
@@ -305,14 +332,22 @@ enum dwarf_source_language
 /* DWARF 4.  */
 DW_LANG_Python = 0x0014,
 /* DWARF 5.  */
+DW_LANG_OpenCL = 0x0015,
 DW_LANG_Go = 0x0016,
-
-DW_LANG_C_plus_plus_11 = 0x001a, /* dwarf5.20141029.pdf DRAFT */
+DW_LANG_Modula3 = 0x0017,
+DW_LANG_Haskell = 0x0018,
+DW_LANG_C_plus_plus_03 = 0x0019,
+DW_LANG_C_plus_plus_11 = 0x001a,
+DW_LANG_OCaml = 0x001b,
 DW_LANG_Rust = 0x001c,
 DW_LANG_C11 = 0x001d,
+DW_LANG_Swift = 0x001e,
+DW_LANG_Julia = 0x001f,
+DW_LANG_Dylan = 0x0020,
 DW_LANG_C_plus_plus_14 = 0x0021,
 DW_LANG_Fortran03 = 0x0022,
 DW_LANG_Fortran08 = 0x0023,
+DW_LANG_RenderScript = 0x0024,
 
 DW_LANG_lo_user = 0x8000,  /* Implementation-defined range start.  */
 DW_LANG_hi_user = 0x,  /* Implementation-defined range start.  */
@@ -342,7 +377,7 @@ enum dwarf_macinfo_record_type
 DW_MACINFO_vendor_ext = 255
   };
 
-/* DW_TAG_GNU_defaulted/DW_TAG_defaulted attributes.  */
+/* DW_TAG_defaulted/DW_TAG_GNU_defaulted attributes.  */
 enum dwarf_defaulted_attribute
   {
 DW_DEFAULTED_no = 0x00,
@@ -353,21 +388,75 @@ enum dwarf_defaulted_attribute
 /* Names and codes for new style macro information.  */
 enum dwarf_macro_record_type
   {
-DW_MACRO_GNU_define = 1,
-DW_MACRO_GNU_undef = 2,
-DW_MACRO_GNU_start_file = 3,
-DW_MACRO_GNU_end_file = 4,
-DW_MACRO_GNU_define_indirect = 5,
-DW_MACRO_GNU_undef_indirect = 6,
-DW_MACRO_GNU_transparent_include = 7,
+DW_MACRO_define = 0x01,
+DW_MACRO_undef = 0x02,
+DW_MACRO_start_file = 0x03,
+DW_MACRO_end_file = 0x04,
+DW_MACRO_define_strp = 0x05,
+DW_MACRO_undef_strp = 0x06,
+DW_MACRO_import = 0x07,
+DW_MACRO_define_sup = 0x08,
+DW_MACRO_undef_sup = 0x09,
+DW_MACRO_import_sup = 0x0a,
+DW_MACRO_define_strx = 0x0b,
+DW_MACRO_undef_strx = 0x0c,
+DW_MACRO_lo_user = 0xe0,
+DW_MACRO_hi_user = 0xff,
+
+/* Compatibility macros for the GNU .debug_macro extension.  */
+DW_MACRO_GNU_define = 0x01,
+DW_MACRO_GNU_undef = 0x02,
+DW_MACRO_GNU_start_file = 0x03,
+DW_MACRO_GNU_end_file = 0x04,
+DW_MACRO_GNU_define_indirect = 0x05,
+DW_MACRO_GNU_undef_indirect = 0x06,
+DW_MACRO_GNU_transparent_include = 0x07,
 

[PATCH] Emit DW_AT_inline for C++17 inline variables

2016-10-14 Thread Jakub Jelinek
Hi!

This also uses the infrastructure of the langhook patch I've sent earlier.
It emits (if not strict dwarf) DW_AT_inline on explicit or implicit inline
variables, and also tweaks dwarf2out so that for inline static data members
we consider in-class declarations as definitions (emit DW_AT_linkage_name
and no DW_AT_declaration) for them.
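
As background, a short C++17 sketch of the kinds of variables involved:
explicitly inline variables, and constexpr static data members, which C++17
makes implicitly inline.  All names are illustrative only, and a compiler in
C++17 mode (-std=c++17, or -std=c++1z on older releases) is assumed.

```cpp
#include <cassert>

// Explicitly inline namespace-scope variable: every translation unit that
// sees this definition refers to the same single object.
inline int counter = 0;

struct settings
{
  // Explicitly inline static data member: the in-class declaration is also
  // the definition, so no out-of-class definition is needed.
  static inline double scale = 4.0;

  // constexpr static data members are implicitly inline in C++17.
  static constexpr int width = 2;
};

// Taking the addresses odr-uses the members; in C++17 this links without
// any out-of-class definition.
inline const int *width_address () { return &settings::width; }
inline double *scale_address () { return &settings::scale; }

// Every odr-use refers to the same object.
inline void
verify_single_object ()
{
  assert (scale_address () == &settings::scale);
  assert (width_address () == &settings::width);
}
```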

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-10-14  Jakub Jelinek  

* dwarf2out.c (add_linkage_name): Add linkage attribute even for
DW_TAG_member if it is inline static data member.
(gen_variable_die): Consider inline static data member's DW_TAG_member
to be definition rather than declaration.  Add DW_AT_inline attribute
if needed.
cp/
* cp-objcp-common.c (cp_decl_dwarf_attribute): Handle DW_AT_inline.
testsuite/
* g++.dg/debug/dwarf2/inline-var-1.C: New test.

--- gcc/dwarf2out.c.jj  2016-10-14 15:22:57.0 +0200
+++ gcc/dwarf2out.c 2016-10-14 16:39:12.226917016 +0200
@@ -18896,7 +18896,10 @@ add_linkage_name (dw_die_ref die, tree d
   && VAR_OR_FUNCTION_DECL_P (decl)
   && TREE_PUBLIC (decl)
   && !(VAR_P (decl) && DECL_REGISTER (decl))
-  && die->die_tag != DW_TAG_member)
+  && (die->die_tag != DW_TAG_member
+ || (VAR_P (decl)
+ && (lang_hooks.decls.decl_dwarf_attribute (decl, DW_AT_inline)
+ != -1))))
 add_linkage_name_raw (die, decl);
 }
 
@@ -21382,6 +21385,20 @@ gen_variable_die (tree decl, tree origin
   return;
 }
 
+  /* For static data members, the declaration (or definition for inline
+ variables) in the class is supposed to have DW_TAG_member tag;
+ the specification if any should still be DW_TAG_variable referencing
+ the DW_TAG_member DIE.  */
+  enum dwarf_tag tag = DW_TAG_variable;
+  if (declaration && class_scope_p (context_die))
+{
+  tag = DW_TAG_member;
+  /* Inline static data members are defined inside of the class.  */
+  if (lang_hooks.decls.decl_dwarf_attribute (decl_or_origin, DW_AT_inline)
+ != -1)
+   declaration = false;
+}
+
   if (old_die)
 {
   if (declaration)
@@ -21414,14 +21431,7 @@ gen_variable_die (tree decl, tree origin
  goto gen_variable_die_location;
}
 }
-
-  /* For static data members, the declaration in the class is supposed
- to have DW_TAG_member tag; the specification should still be
- DW_TAG_variable referencing the DW_TAG_member DIE.  */
-  if (declaration && class_scope_p (context_die))
-var_die = new_die (DW_TAG_member, context_die, decl);
-  else
-var_die = new_die (DW_TAG_variable, context_die, decl);
+  var_die = new_die (tag, context_die, decl);
 
   if (origin != NULL)
 origin_die = add_abstract_origin_attribute (var_die, origin);
@@ -21521,6 +21531,17 @@ gen_variable_die (tree decl, tree origin
   && (origin_die == NULL || get_AT (origin_die, DW_AT_const_expr) == NULL)
   && !specialization_p)
 add_AT_flag (var_die, DW_AT_const_expr, 1);
+
+  if (!dwarf_strict)
+{
+  int inl = lang_hooks.decls.decl_dwarf_attribute (decl_or_origin,
+  DW_AT_inline);
+  if (inl != -1
+ && !get_AT (var_die, DW_AT_inline)
+ && (origin_die == NULL || get_AT (origin_die, DW_AT_inline) == NULL)
+ && !specialization_p)
+   add_AT_unsigned (var_die, DW_AT_inline, inl);
+}
 }
 
 /* Generate a DIE to represent a named constant.  */
--- gcc/cp/cp-objcp-common.c.jj 2016-10-14 15:01:27.0 +0200
+++ gcc/cp/cp-objcp-common.c    2016-10-14 15:46:14.650303379 +0200
@@ -173,6 +173,16 @@ cp_decl_dwarf_attribute (const_tree decl
return 1;
   break;
 
+case DW_AT_inline:
+  if (VAR_P (decl) && DECL_INLINE_VAR_P (decl))
+   {
+ if (DECL_VAR_DECLARED_INLINE_P (decl))
+   return DW_INL_declared_inlined;
+ else
+   return DW_INL_inlined;
+   }
+  break;
+
 default:
   break;
 }
--- gcc/testsuite/g++.dg/debug/dwarf2/inline-var-1.C.jj 2016-10-14 16:55:30.345512927 +0200
+++ gcc/testsuite/g++.dg/debug/dwarf2/inline-var-1.C    2016-10-14 16:56:45.704558635 +0200
@@ -0,0 +1,26 @@
+// { dg-do compile }
+// { dg-options "-O -std=c++1z -g -dA -gno-strict-dwarf" }
+// { dg-require-weak "" }
+// { dg-final { scan-assembler-times "0x3\[^\n\r]* DW_AT_inline" 6 } }
+// { dg-final { scan-assembler-times "0x1\[^\n\r]* DW_AT_inline" 2 } }
+// { dg-final { scan-assembler-not " DW_AT_declaration" } }
+// { dg-final { scan-assembler-times " DW_AT_\[^\n\r]*linkage_name" 7 } }
+
+inline int a;
+struct S
+{
+  static inline double b = 4.0;
+  static constexpr int c = 2;
+  static constexpr inline char d = 3;
+} s;
+template <int N>
+inline int e = N;
+int &f = e<2>;
+template <int N>
+struct T
+{
+  static inline double g = 4.0;
+  static constexpr int h = 2;
+  static inline constexpr char i = 3;
+};
+T<5> t;

Jakub


[PATCH] Emit DW_AT_const_expr for constexpr variables

2016-10-14 Thread Jakub Jelinek
Hi!

This relies on the previous langhook patch (which greatly simplifies it).

I'm only handling variables for now.  DW_AT_const_expr is just weird on
functions/methods; is it supposed to appear only on
DW_TAG_inlined_subroutine?

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-10-14  Jakub Jelinek  

* dwarf2out.c (gen_variable_die): Emit DW_AT_const_expr attribute
if needed.
cp/
* cp-objcp-common.c (cp_decl_dwarf_attribute): Handle
DW_AT_const_expr.
testsuite/
* g++.dg/debug/dwarf2/constexpr-var-1.C: New test.

--- gcc/dwarf2out.c.jj  2016-10-14 14:37:15.0 +0200
+++ gcc/dwarf2out.c 2016-10-14 15:22:57.878078634 +0200
@@ -21513,6 +21513,14 @@ gen_variable_die (tree decl, tree origin
 }
   else
 tree_add_const_value_attribute_for_decl (var_die, decl_or_origin);
+
+  if ((dwarf_version >= 4 || !dwarf_strict)
+  && lang_hooks.decls.decl_dwarf_attribute (decl_or_origin,
+   DW_AT_const_expr) == 1
+  && !get_AT (var_die, DW_AT_const_expr)
+  && (origin_die == NULL || get_AT (origin_die, DW_AT_const_expr) == NULL)
+  && !specialization_p)
+add_AT_flag (var_die, DW_AT_const_expr, 1);
 }
 
 /* Generate a DIE to represent a named constant.  */
--- gcc/cp/cp-objcp-common.c.jj 2016-10-14 14:27:56.0 +0200
+++ gcc/cp/cp-objcp-common.c    2016-10-14 15:01:27.770495885 +0200
@@ -168,6 +168,11 @@ cp_decl_dwarf_attribute (const_tree decl
}
   break;
 
+case DW_AT_const_expr:
+  if (VAR_OR_FUNCTION_DECL_P (decl) && DECL_DECLARED_CONSTEXPR_P (decl))
+   return 1;
+  break;
+
 default:
   break;
 }
--- gcc/testsuite/g++.dg/debug/dwarf2/constexpr-var-1.C.jj  2016-10-14 15:32:23.323882991 +0200
+++ gcc/testsuite/g++.dg/debug/dwarf2/constexpr-var-1.C 2016-10-14 15:31:56.0 +0200
@@ -0,0 +1,9 @@
+// { dg-do compile }
+// { dg-options "-O -std=c++11 -g -dA -gno-strict-dwarf" }
+// { dg-final { scan-assembler-times " DW_AT_const_expr" 2 } }
+
+constexpr int a = 5;
+struct S
+{
+  static constexpr int b = 6;
+} s;

Jakub


[PATCH] Improve DWARF constant attribute langhooks

2016-10-14 Thread Jakub Jelinek
Hi!

Before early dwarf changes, if we wanted to note some decl property so that
some corresponding DWARF attribute can be emitted, we had to use some
generic IL bit for that.  Now a langhook can be used instead (hopefully for
7.x even with LTO), but having a single langhook for each such bit looks
excessive to me, when all we actually want is to forward some bits from the C++
FE lang structures/macros to dwarf2out.

So, this patch introduces a lang hook through which dwarf2out can ask if
some DW_AT_* attribute should be added to decl (it is dwarf2out's business
to guard it with dwarf_version, dwarf_strict and other conditions), and the
lang hook just returns -1 if nothing should be added (most attributes we
care about here have either boolean 0/1 or small unsigned integer values),
or the value of the attribute that should be added.
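
The convention can be sketched in a few lines of C++.  This is an
illustration of the idea only, not the GCC implementation: attr_code,
decl_info, decl_attribute and should_emit are hypothetical names, and the
attribute codes are placeholders rather than real DW_AT_* encodings.

```cpp
#include <cassert>

// Placeholder attribute codes; the real hook is passed DW_AT_* values.
enum attr_code { ATTR_EXPLICIT, ATTR_DELETED, ATTR_DEFAULTED };

// Hypothetical front-end view of a declaration.
struct decl_info
{
  bool is_explicit;
  bool is_deleted;
  int defaulted_kind;  // 0 = not defaulted, 1 = in class, 2 = out of class
};

// One hook for all attributes: returns -1 when ATTR should not be added
// for DECL, otherwise the value to emit for it.
int
decl_attribute (const decl_info &decl, int attr)
{
  assert (decl.defaulted_kind >= 0 && decl.defaulted_kind <= 2);
  switch (attr)
    {
    case ATTR_EXPLICIT:
      if (decl.is_explicit)
        return 1;
      break;
    case ATTR_DELETED:
      if (decl.is_deleted)
        return 1;
      break;
    case ATTR_DEFAULTED:
      if (decl.defaulted_kind != 0)
        return decl.defaulted_kind;
      break;
    default:
      break;
    }
  return -1;
}

// Consumer side: version and strictness checks stay with the caller;
// the hook only reports what the front end knows.
bool
should_emit (const decl_info &decl, int attr, bool allowed_by_version)
{
  return allowed_by_version && decl_attribute (decl, attr) != -1;
}
```

Adding a further attribute then means one more case in the switch rather
than one more hook.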

I've converted 3 attributes to this new langhook.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-10-14  Jakub Jelinek  

* langhooks.h (struct lang_hooks_for_decls): Remove
function_decl_explicit_p, function_decl_deleted_p and
function_decl_defaulted hooks.  Add decl_dwarf_attribute hook.
* langhooks-def.h (lhd_decl_dwarf_attribute): Declare.
(LANG_HOOKS_FUNCTION_DECL_EXPLICIT_P,
LANG_HOOKS_FUNCTION_DECL_DELETED_P,
LANG_HOOKS_FUNCTION_DECL_DEFAULTED): Remove.
(LANG_HOOKS_DECL_DWARF_ATTRIBUTE): Define.
(LANG_HOOKS_DECLS): Remove LANG_HOOKS_FUNCTION_DECL_EXPLICIT_P,
LANG_HOOKS_FUNCTION_DECL_DELETED_P and
LANG_HOOKS_FUNCTION_DECL_DEFAULTED.  Add
LANG_HOOKS_DECL_DWARF_ATTRIBUTE.
* langhooks.c (lhd_decl_dwarf_attribute): New function.
* dwarf2out.c (gen_subprogram_die): Use
lang_hooks.decls.decl_dwarf_attribute instead of
lang_hooks.decls.function_decl_*.
cp/
* cp-objcp-common.h (cp_function_decl_explicit_p,
cp_function_decl_deleted_p, cp_function_decl_defaulted): Remove.
(cp_decl_dwarf_attribute): Declare.
(LANG_HOOKS_FUNCTION_DECL_EXPLICIT_P,
LANG_HOOKS_FUNCTION_DECL_DELETED_P,
LANG_HOOKS_FUNCTION_DECL_DEFAULTED): Remove.
(LANG_HOOKS_DECL_DWARF_ATTRIBUTE): Redefine.
* cp-objcp-common.c (cp_function_decl_explicit_p,
cp_function_decl_deleted_p, cp_function_decl_defaulted): Remove.
(cp_decl_dwarf_attribute): New function.

--- gcc/langhooks.h.jj  2016-10-13 10:24:46.0 +0200
+++ gcc/langhooks.h 2016-10-14 14:27:07.806695803 +0200
@@ -182,16 +182,9 @@ struct lang_hooks_for_decls
   /* Returns the chain of decls so far in the current scope level.  */
   tree (*getdecls) (void);
 
-  /* Returns true if DECL is explicit member function.  */
-  bool (*function_decl_explicit_p) (const_tree);
-
-  /* Returns true if DECL is C++11 deleted special member function.  */
-  bool (*function_decl_deleted_p) (const_tree);
-
-  /* Returns 0 if DECL is NOT a C++11 defaulted special member
- function, 1 if it is explicitly defaulted within the class body,
- or 2 if it is explicitly defaulted outside the class body.  */
-  int (*function_decl_defaulted) (const_tree);
+  /* Returns -1 if dwarf ATTR shouldn't be added for DECL, or the attribute
+ value otherwise.  */
+  int (*decl_dwarf_attribute) (const_tree, int);
 
   /* Returns True if the parameter is a generic parameter decl
  of a generic type, e.g a template template parameter for the C++ FE.  */
--- gcc/langhooks-def.h.jj  2016-10-13 10:28:19.0 +0200
+++ gcc/langhooks-def.h 2016-10-14 14:29:09.535146412 +0200
@@ -83,6 +83,7 @@ extern bool lhd_omp_mappable_type (tree)
 
 extern const char *lhd_get_substring_location (const substring_loc &,
   location_t *out_loc);
+extern int lhd_decl_dwarf_attribute (const_tree, int);
 
 #define LANG_HOOKS_NAME "GNU unknown"
 #define LANG_HOOKS_IDENTIFIER_SIZE sizeof (struct lang_identifier)
@@ -213,9 +214,7 @@ extern tree lhd_make_node (enum tree_cod
 #define LANG_HOOKS_GLOBAL_BINDINGS_P global_bindings_p
 #define LANG_HOOKS_PUSHDECL pushdecl
 #define LANG_HOOKS_GETDECLS getdecls
-#define LANG_HOOKS_FUNCTION_DECL_EXPLICIT_P hook_bool_const_tree_false
-#define LANG_HOOKS_FUNCTION_DECL_DELETED_P hook_bool_const_tree_false
-#define LANG_HOOKS_FUNCTION_DECL_DEFAULTED hook_int_const_tree_0
+#define LANG_HOOKS_DECL_DWARF_ATTRIBUTE lhd_decl_dwarf_attribute
 #define LANG_HOOKS_WARN_UNUSED_GLOBAL_DECL lhd_warn_unused_global_decl
 #define LANG_HOOKS_POST_COMPILATION_PARSING_CLEANUPS NULL
 #define LANG_HOOKS_DECL_OK_FOR_SIBCALL lhd_decl_ok_for_sibcall
@@ -236,9 +235,7 @@ extern tree lhd_make_node (enum tree_cod
   LANG_HOOKS_GLOBAL_BINDINGS_P, \
   LANG_HOOKS_PUSHDECL, \
   LANG_HOOKS_GETDECLS, \
-  LANG_HOOKS_FUNCTION_DECL_EXPLICIT_P, \
-  LANG_HOOKS_FUNCTION_DECL_DELETED_P, \
-  LANG_HOOKS_FUNCTION_DECL_DEFAULTED, \
+  LANG_HOOKS_DECL_DWARF_ATTRIBUTE, \
   

[C++ PATCH] DR 1511 - const volatile variables and ODR

2016-10-14 Thread Jakub Jelinek
Hi!

We weren't implementing this DR.  In the past, all non-extern (and
non-inline) const vars at namespace scope had internal linkage, but now
only non-volatile const vars do.
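
The resulting rule can be written down as a small predicate.  The sketch
below restates the TREE_PUBLIC condition that the patch installs for
namespace-scope variables; var_decl_info and has_external_linkage are
hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical summary of a namespace-scope variable declaration.
struct var_decl_info
{
  bool is_static;    // declared with the static storage class
  bool is_extern;    // declared with extern
  bool is_const;
  bool is_volatile;
  bool is_inline;
};

// A namespace-scope variable is public unless it is static, or it is a
// const, non-volatile, non-inline variable that is not declared extern.
bool
has_external_linkage (const var_decl_info &d)
{
  // A declaration cannot carry both storage classes.
  assert (!(d.is_static && d.is_extern));
  return !d.is_static
	 && (d.is_extern || !d.is_const || d.is_volatile || d.is_inline);
}
```

Under this predicate a plain const int has internal linkage, while a
const volatile int has external linkage.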

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2016-10-14  Jakub Jelinek  

DR 1511 - const volatile variables and ODR
* decl.c (grokvardecl): Change flags argument to type_quals,
add conceptp argument.  Set TREE_PUBLIC for non-static volatile vars.
(grokdeclarator): Adjust grokvardecl caller.

* g++.dg/DRs/dr1511-1.C: New test.
* g++.dg/DRs/dr1511-2.C: New test.

--- gcc/cp/decl.c.jj    2016-10-14 12:31:49.0 +0200
+++ gcc/cp/decl.c   2016-10-14 12:50:28.697542270 +0200
@@ -68,7 +68,7 @@ static int unary_op_p (enum tree_code);
 static void push_local_name (tree);
 static tree grok_reference_init (tree, tree, tree, int);
 static tree grokvardecl (tree, tree, tree, const cp_decl_specifier_seq *,
-int, int, int, int, tree);
+int, int, int, bool, int, tree);
 static int check_static_variable_definition (tree, tree);
 static void record_unknown_type (tree, const char *);
 static tree builtin_function_1 (tree, tree, bool);
@@ -8512,8 +8512,9 @@ grokvardecl (tree type,
 tree orig_declarator,
 const cp_decl_specifier_seq *declspecs,
 int initialized,
-int flags,
+int type_quals,
 int inlinep,
+bool conceptp,
 int template_count,
 tree scope)
 {
@@ -8522,8 +8523,8 @@ grokvardecl (tree type,
 
   gcc_assert (!name || identifier_p (name));
 
-  bool constp = flags&1;
-  bool conceptp = flags&2;
+  bool constp = (type_quals & TYPE_QUAL_CONST) != 0;
+  bool volatilep = (type_quals & TYPE_QUAL_VOLATILE) != 0;
 
   /* Compute the scope in which to place the variable, but remember
  whether or not that scope was explicitly specified by the user.   */
@@ -8580,6 +8581,7 @@ grokvardecl (tree type,
   TREE_PUBLIC (decl) = (declspecs->storage_class != sc_static
&& (DECL_THIS_EXTERN (decl)
|| ! constp
+   || volatilep
|| inlinep));
   TREE_STATIC (decl) = ! DECL_EXTERNAL (decl);
 }
@@ -11626,8 +11628,9 @@ grokdeclarator (const cp_declarator *dec
decl = grokvardecl (type, dname, unqualified_id,
declspecs,
initialized,
-   ((type_quals & TYPE_QUAL_CONST) != 0) | (2 * concept_p),
+   type_quals,
inlinep,
+   concept_p,
template_count,
ctype ? ctype : in_namespace);
if (decl == NULL_TREE)
--- gcc/testsuite/g++.dg/DRs/dr1511-1.C.jj  2016-10-14 13:12:06.745016428 +0200
+++ gcc/testsuite/g++.dg/DRs/dr1511-1.C 2016-10-14 13:12:40.715583815 +0200
@@ -0,0 +1,38 @@
+/* DR 1511 - const volatile variables and the one-definition rule */
+/* { dg-do run } */
+/* { dg-additional-sources "dr1511-2.C" } */
+
+typedef const int cint;
+typedef const volatile int cvint;
+typedef volatile int vint;
+const int v1 = 5;
+extern volatile const int v2;
+cint v3 = 7;
+extern cvint v4;
+extern const vint v5;
+extern volatile cint v6;
+const int w1 = 5;
+extern volatile const int w2;
+cint w3 = 7;
+extern cvint w4;
+extern const vint w5;
+extern volatile cint w6;
+extern const int &r1;
+extern volatile const int &r2;
+extern const int &r3;
+extern const volatile int &r4;
+extern const volatile int &r5;
+extern const volatile int &r6;
+
+int
+main ()
+{
+  if (v1 != 5 || v2 != 6 || v3 != 7 || v4 != 8 || v5 != 9 || v6 != 10)
+    __builtin_abort ();
+  if (w1 != 5 || w2 != 6 || w3 != 7 || w4 != 8 || w5 != 9 || w6 != 10)
+    __builtin_abort ();
+  if (r1 != w1 || &r1 == &w1 || r2 != w2 || &r2 != &w2 || r3 != w3 || &r3 == &w3)
+    __builtin_abort ();
+  if (r4 != w4 || &r4 != &w4 || r5 != w5 || &r5 != &w5 || r6 != w6 || &r6 != &w6)
+    __builtin_abort ();
+}
--- gcc/testsuite/g++.dg/DRs/dr1511-2.C.jj  2016-10-14 13:12:09.912976098 +0200
+++ gcc/testsuite/g++.dg/DRs/dr1511-2.C 2016-10-14 13:11:53.0 +0200
@@ -0,0 +1,24 @@
+/* DR 1511 - const volatile variables and the one-definition rule */
+/* { dg-do compile } */
+
+typedef const int cint;
+typedef const volatile int cvint;
+typedef volatile int vint;
+const int v1 = 5;
+volatile const int v2 = 6;
+cint v3 = 7;
+cvint v4 = 8;
+const vint v5 = 9;
+volatile cint v6 = 10;
+const int w1 = 5;
+volatile const int w2 = 6;
+cint w3 = 7;
+cvint w4 = 8;
+const vint w5 = 9;
+volatile cint w6 = 10;
+const int &r1 = w1;
+volatile const int &r2 = w2;
+const int &r3 = w3;
+const volatile int &r4 = w4;
+const volatile int &r5 = w5;
+const volatile int &r6 = w6;

Jakub


libgo patch committed: just do file/line lookup in C, move Func to Go

2016-10-14 Thread Ian Lance Taylor
In order to port stack backtraces to Go, we need the ability to look
up file/line information for PC values without allocating memory.
This libgo patch moves the handling of Func from C code to Go code,
and simplifies the C code to just look up function/file/line/entry
information for a PC.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 241171)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-911fceabd4c955b2f29f6b532f241a002ca7ad4f
+993840643e27e52cda7e86e6a775f54443ea5d07
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/symtab.go
===
--- libgo/go/runtime/symtab.go  (revision 240942)
+++ libgo/go/runtime/symtab.go  (working copy)
@@ -65,19 +65,20 @@ func (ci *Frames) Next() (frame Frame, m
}
more = len(ci.callers) > 0
 
-   f, file, line := funcframe(pc, i)
-   if f == nil {
+   // Subtract 1 from PC to undo the 1 we added in callback in
+   // go-callers.c.
+   function, file, line := funcfileline(pc-1, int32(i))
+   if function == "" && file == "" {
return Frame{}, more
}
+   entry := funcentry(pc - 1)
+   f := &Func{name: function, entry: entry}
 
-   entry := f.Entry()
xpc := pc
if xpc > entry {
xpc--
}
 
-   function := f.Name()
-
frame = Frame{
PC:   xpc,
Func: f,
@@ -97,21 +98,29 @@ func (ci *Frames) Next() (frame Frame, m
 
 // A Func represents a Go function in the running binary.
 type Func struct {
-   opaque struct{} // unexported field to disallow conversions
+   name  string
+   entry uintptr
 }
 
 // FuncForPC returns a *Func describing the function that contains the
 // given program counter address, or else nil.
-func FuncForPC(pc uintptr) *Func
+func FuncForPC(pc uintptr) *Func {
+   name, _, _ := funcfileline(pc, -1)
+   if name == "" {
+   return nil
+   }
+   entry := funcentry(pc)
+   return &Func{name: name, entry: entry}
+}
 
 // Name returns the name of the function.
 func (f *Func) Name() string {
-   return funcname_go(f)
+   return f.name
 }
 
 // Entry returns the entry address of the function.
 func (f *Func) Entry() uintptr {
-   return funcentry_go(f)
+   return f.entry
 }
 
 // FileLine returns the file name and line number of the
@@ -119,11 +128,10 @@ func (f *Func) Entry() uintptr {
 // The result will not be accurate if pc is not a program
 // counter within f.
 func (f *Func) FileLine(pc uintptr) (file string, line int) {
-   return funcline_go(f, pc)
+   _, file, line = funcfileline(pc, -1)
+   return file, line
 }
 
-// implemented in symtab.c
-func funcline_go(*Func, uintptr) (string, int)
-func funcname_go(*Func) string
-func funcentry_go(*Func) uintptr
-func funcframe(uintptr, int) (*Func, string, int)
+// implemented in go-caller.c
+func funcfileline(uintptr, int32) (string, string, int)
+func funcentry(uintptr) uintptr
Index: libgo/runtime/go-caller.c
===
--- libgo/runtime/go-caller.c   (revision 240942)
+++ libgo/runtime/go-caller.c   (working copy)
@@ -1,4 +1,4 @@
-/* go-caller.c -- runtime.Caller and runtime.FuncForPC for Go.
+/* go-caller.c -- look up function/file/line/entry info
 
Copyright 2009 The Go Authors. All rights reserved.
Use of this source code is governed by a BSD-style
@@ -171,8 +171,6 @@ struct caller_ret
 
 struct caller_ret Caller (int n) __asm__ (GOSYM_PREFIX "runtime.Caller");
 
-Func *FuncForPC (uintptr_t) __asm__ (GOSYM_PREFIX "runtime.FuncForPC");
-
 /* Implement runtime.Caller.  */
 
 struct caller_ret
@@ -193,115 +191,40 @@ Caller (int skip)
   return ret;
 }
 
-/* Implement runtime.FuncForPC.  */
-
-Func *
-FuncForPC (uintptr_t pc)
-{
-  Func *ret;
-  String fn;
-  String file;
-  intgo line;
-  uintptr_t val;
-
-  if (!__go_file_line (pc, -1, &fn, &file, &line))
-return NULL;
-
-  ret = (Func *) runtime_malloc (sizeof (*ret));
-  ret->name = fn;
-
-  if (__go_symbol_value (pc, &val))
-ret->entry = val;
-  else
-ret->entry = 0;
-
-  return ret;
-}
-
-/* Look up the file and line information for a PC within a
-   function.  */
+/* Look up the function name, file name, and line number for a PC.  */
 
-struct funcline_go_return
+struct funcfileline_return
 {
+  String retfn;
   String retfile;
   intgo retline;
 };
 
-struct funcline_go_return
-runtime_funcline_go (Func *f, uintptr targetpc)
-  __asm__ (GOSYM_PREFIX "runtime.funcline_go");
+struct funcfileline_return
+runtime_funcfileline (uintptr targetpc, int32 index)
+  __asm__ (GOSYM_PREFIX "runtime.funcfileline");
 
-struct funcline_go_return

[PATCH] Fix expansion ICE on store to CONST_DECL (PR middle-end/77959)

2016-10-14 Thread Jakub Jelinek
Hi!

The following (invalid) testcase ICEs, because we try to store into
CONST_DECL's FIELD.  Normally in GIMPLE we have MEM_REF[] and
writes to that expand gracefully into a MEM, but as soon as we use
get_inner_reference in expand_assignment (even if the MEM is just reverse
order, or we just want to store to a part of it etc.), get_inner_reference
looks through even that MEM_REF.  Instead of hacking that around in
expand_assignment, just attempting to handle EXPAND_WRITE into CONST_DECL
looked easier to me.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-10-14  Jakub Jelinek  

PR middle-end/77959
* expr.c (expand_expr_real_1) <case CONST_DECL>: For EXPAND_WRITE
return a MEM.

* gfortran.dg/pr77959.f90: New test.

--- gcc/expr.c.jj   2016-10-09 13:19:09.0 +0200
+++ gcc/expr.c  2016-10-13 11:49:36.386993921 +0200
@@ -9914,6 +9914,19 @@ expand_expr_real_1 (tree exp, rtx target
   }
 
 case CONST_DECL:
+  if (modifier == EXPAND_WRITE)
+   {
+ /* Writing into CONST_DECL is always invalid, but handle it
+gracefully.  */
+ addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
+ machine_mode address_mode = targetm.addr_space.address_mode (as);
+ op0 = expand_expr_addr_expr_1 (exp, NULL_RTX, address_mode,
+EXPAND_NORMAL, as);
+ op0 = memory_address_addr_space (mode, op0, as);
+ temp = gen_rtx_MEM (mode, op0);
+ set_mem_addr_space (temp, as);
+ return temp;
+   }
   return expand_expr (DECL_INITIAL (exp), target, VOIDmode, modifier);
 
 case REAL_CST:
--- gcc/testsuite/gfortran.dg/pr77959.f90.jj    2016-10-13 11:57:30.019992471 +0200
+++ gcc/testsuite/gfortran.dg/pr77959.f90   2016-10-13 11:58:50.719969914 +0200
@@ -0,0 +1,16 @@
+! PR middle-end/77959
+! { dg-do compile }
+! { dg-options "-O2" }
+
+program pr77959
+  interface
+subroutine foo(x)  ! { dg-warning "Type mismatch in argument" }
+  real :: x
+end
+  end interface
+  call foo(1.0)
+end
+subroutine foo(x)
+  complex :: x
+  x = x + 1
+end

Jakub


libgo patch committed: support SPARC64/ELF relocs

2016-10-14 Thread Ian Lance Taylor
This patch by James Clarke adds support for SPARC64/ELF relocs to the
debug/elf package in libgo.  This is a backport of
https://golang.org/cl/30870 from the master library.  Bootstrapped and
ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.
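
For readers unfamiliar with what applying such a relocation involves, here
is a rough C++ sketch of the core step for a 64-bit absolute relocation:
take the symbol index from the upper half of r_info and store the addend at
the given offset.  It is not a translation of the Go code below.  The names
rela64, symbol_index, apply_abs64_be and read_be64 are hypothetical, the
bounds check is this sketch's own, and big-endian output is assumed here,
whereas the Go code uses the byte order recorded in the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// In-memory form of an ELF64 relocation entry with addend
// (three 8-byte fields, 24 bytes on disk).
struct rela64
{
  std::uint64_t off;
  std::uint64_t info;
  std::int64_t addend;
};

// The symbol index lives in the upper 32 bits of r_info.
inline std::uint64_t
symbol_index (const rela64 &r)
{
  return r.info >> 32;
}

// Store the addend as a big-endian 64-bit value at r.off.  Returns false
// and leaves DST untouched if the addend is negative or the write would
// not fit inside DST.
bool
apply_abs64_be (std::vector<std::uint8_t> &dst, const rela64 &r)
{
  if (r.addend < 0 || r.off > dst.size () || dst.size () - r.off < 8)
    return false;
  const std::uint64_t value = static_cast<std::uint64_t> (r.addend);
  const std::size_t base = static_cast<std::size_t> (r.off);
  for (std::size_t i = 0; i < 8; ++i)
    dst[base + i] = static_cast<std::uint8_t> (value >> (8 * (7 - i)));
  return true;
}

// Read back a big-endian 64-bit value; OFF must leave room for 8 bytes.
std::uint64_t
read_be64 (const std::vector<std::uint8_t> &src, std::size_t off)
{
  assert (off <= src.size () && src.size () - off >= 8);
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < 8; ++i)
    value = (value << 8) | src[off + i];
  return value;
}
```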

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 241163)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-5f043fc2bf0f92a84a1f7da57acd79a61c9d2592
+911fceabd4c955b2f29f6b532f241a002ca7ad4f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/debug/elf/file.go
===
--- libgo/go/debug/elf/file.go  (revision 240942)
+++ libgo/go/debug/elf/file.go  (working copy)
@@ -598,6 +598,8 @@ func (f *File) applyRelocations(dst []by
return f.applyRelocationsMIPS64(dst, rels)
case f.Class == ELFCLASS64 && f.Machine == EM_S390:
return f.applyRelocationss390x(dst, rels)
+   case f.Class == ELFCLASS64 && f.Machine == EM_SPARCV9:
+   return f.applyRelocationsSPARC64(dst, rels)
default:
return errors.New("applyRelocations: not implemented")
}
@@ -959,6 +961,51 @@ func (f *File) applyRelocationss390x(dst
}
}
 
+   return nil
+}
+
+func (f *File) applyRelocationsSPARC64(dst []byte, rels []byte) error {
+   // 24 is the size of Rela64.
+   if len(rels)%24 != 0 {
+   return errors.New("length of relocation section is not a multiple of 24")
+   }
+
+   symbols, _, err := f.getSymbols(SHT_SYMTAB)
+   if err != nil {
+   return err
+   }
+
+   b := bytes.NewReader(rels)
+   var rela Rela64
+
+   for b.Len() > 0 {
+   binary.Read(b, f.ByteOrder, &rela)
+   symNo := rela.Info >> 32
+   t := R_SPARC(rela.Info & 0x)
+
+   if symNo == 0 || symNo > uint64(len(symbols)) {
+   continue
+   }
+   sym := &symbols[symNo-1]
+   if SymType(sym.Info&0xf) != STT_SECTION {
+   // We don't handle non-section relocations for now.
+   continue
+   }
+
+   switch t {
+   case R_SPARC_64, R_SPARC_UA64:
+   if rela.Off+8 >= uint64(len(dst)) || rela.Addend < 0 {
+   continue
+   }
+   f.ByteOrder.PutUint64(dst[rela.Off:rela.Off+8], uint64(rela.Addend))
+   case R_SPARC_32, R_SPARC_UA32:
+   if rela.Off+4 >= uint64(len(dst)) || rela.Addend < 0 {
+   continue
+   }
+   f.ByteOrder.PutUint32(dst[rela.Off:rela.Off+4], uint32(rela.Addend))
+   }
+   }
+
return nil
 }
 
Index: libgo/go/debug/elf/file_test.go
===
--- libgo/go/debug/elf/file_test.go (revision 240942)
+++ libgo/go/debug/elf/file_test.go (working copy)
@@ -492,6 +492,25 @@ var relocationTests = []relocationTest{
},
},
{
+   "testdata/go-relocation-test-gcc620-sparc64.obj",
+   []relocationTestEntry{
+   {0, &dwarf.Entry{
+   Offset:   0xb,
+   Tag:  dwarf.TagCompileUnit,
+   Children: true,
+   Field: []dwarf.Field{
+   {Attr: dwarf.AttrProducer, Val: "GNU C11 6.2.0 20160914 -mcpu=v9 -g -fstack-protector-strong", Class: dwarf.ClassString},
+   {Attr: dwarf.AttrLanguage, Val: int64(12), Class: dwarf.ClassConstant},
+   {Attr: dwarf.AttrName, Val: "hello.c", Class: dwarf.ClassString},
+   {Attr: dwarf.AttrCompDir, Val: "/tmp", Class: dwarf.ClassString},
+   {Attr: dwarf.AttrLowpc, Val: uint64(0x0), Class: dwarf.ClassAddress},
+   {Attr: dwarf.AttrHighpc, Val: int64(0x2c), Class: dwarf.ClassConstant},
+   {Attr: dwarf.AttrStmtList, Val: int64(0), Class: dwarf.ClassLinePtr},
+   },
+   }},
+   },
+   },
+   {
"testdata/go-relocation-test-gcc493-mips64le.obj",
[]relocationTestEntry{
{0, &dwarf.Entry{
Index: libgo/go/debug/elf/testdata/go-relocation-test-gcc620-sparc64.obj
===
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: 

[Bug target/77966] Corrupt function with -fsanitize-coverage=trace-pc

2016-10-14 Thread jpoimboe at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77966

--- Comment #6 from Josh Poimboeuf  ---
(In reply to Arnd Bergmann from comment #5)
> I checked the test case using "-fsanitize=unreachable" and that avoids the
> problem.
> 
> Josh, should we set that whenever we enable objtool in the kernel?

In theory, adding -fsanitize=unreachable might be a workable option for
allowing objtool to detect such unreachable blocks.

However, in practice, that option doesn't seem to work as advertised.  It seems
to change the control flow unexpectedly.  When adding it to the test case, it
doesn't add a __ubsan_handle_builtin_unreachable() call to the unreachable
block.  Instead, it treats it as a normal loop, and removes the assumption that
the loop can only run one time.

Here's the same test case from comment #1, with -fsanitize=unreachable added:

 :
   0:   55                      push   %rbp
   1:   53                      push   %rbx
   2:   48 89 fd                mov    %rdi,%rbp
   5:   31 db                   xor    %ebx,%ebx
   7:   48 83 ec 08             sub    $0x8,%rsp
   b:   e8 00 00 00 00          callq  10
                        c: R_X86_64_PC32        __sanitizer_cov_trace_pc-0x4
  10:   8b 45 00                mov    0x0(%rbp),%eax
  13:   85 c0                   test   %eax,%eax
  15:   75 11                   jne    28
  17:   48 83 c4 08             add    $0x8,%rsp
  1b:   5b                      pop    %rbx
  1c:   5d                      pop    %rbp
  1d:   e9 00 00 00 00          jmpq   22
                        1e: R_X86_64_PC32       __sanitizer_cov_trace_pc-0x4
  22:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  28:   e8 00 00 00 00          callq  2d
                        29: R_X86_64_PC32       __sanitizer_cov_trace_pc-0x4
  2d:   89 d8                   mov    %ebx,%eax
  2f:   83 c3 01                add    $0x1,%ebx
  32:   48 8b 7c c5 08          mov    0x8(%rbp,%rax,8),%rdi
  37:   e8 00 00 00 00          callq  3c
                        38: R_X86_64_PC32       ioread32-0x4
  3c:   39 5d 00                cmp    %ebx,0x0(%rbp)
  3f:   77 e7                   ja     28
  41:   eb d4                   jmp    17

[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies

2016-10-14 Thread pthaugen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212

--- Comment #4 from Pat Haugen  ---
Author: pthaugen
Date: Fri Oct 14 17:10:18 2016
New Revision: 241170

URL: https://gcc.gnu.org/viewcvs?rev=241170&root=gcc&view=rev
Log:
PR rtl-optimization/68212
* cfgloopmanip.c (duplicate_loop_to_header_edge): Use preheader edge
frequency when computing scale factor for peeled copies.
* loop-unroll.c (unroll_loop_runtime_iterations): Fix freq/count
values for switch/peel blocks/edges.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/cfgloopmanip.c
trunk/gcc/loop-unroll.c

Re: [Patch AArch64 11/11] Enable _Float16

2016-10-14 Thread James Greenhalgh

On Fri, Sep 30, 2016 at 06:03:57PM +0100, James Greenhalgh wrote:
> Hi,
>
> Finally, this patch adds the back-end wiring to get AArch64 support for
> the _Float16 type working.
>
> Bootstrapped on AArch64 with no issues.
>
> OK?

I spotted a bug in the way I'd written aarch64_promoted_type. We were not
taking the TYPE_MAIN_VARIANT before comparing with aarch64_fp16_type, so we
would fail to promote "volatile __fp16" correctly.

That's fixed in this revision, which has been through a new round of
bootstrap and cross-testing.
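
To illustrate the bug described above, here is a minimal, self-contained sketch. It does not use GCC's real tree API: `toy_type`, `toy_promote_buggy` and `toy_promote_fixed` are hypothetical stand-ins, and the `main_variant` field only models what TYPE_MAIN_VARIANT provides for qualified types.

```c
#include <stddef.h>

/* Toy model of a type node.  A qualified type such as "volatile __fp16"
   is a distinct node whose main_variant points at the unqualified type.  */
struct toy_type
{
  const char *name;
  const struct toy_type *main_variant;
};

/* Hypothetical nodes standing in for aarch64_fp16_type_node and friends.  */
static const struct toy_type toy_fp16 = { "__fp16", &toy_fp16 };
static const struct toy_type toy_volatile_fp16
  = { "volatile __fp16", &toy_fp16 };
static const struct toy_type toy_float = { "float", &toy_float };

/* Compares the node itself, so qualified variants are missed.  */
const struct toy_type *
toy_promote_buggy (const struct toy_type *t)
{
  return t == &toy_fp16 ? &toy_float : NULL;
}

/* Goes through the main variant first, so qualifiers do not matter.  */
const struct toy_type *
toy_promote_fixed (const struct toy_type *t)
{
  return t->main_variant == &toy_fp16 ? &toy_float : NULL;
}
```

In this model, `toy_promote_buggy (&toy_volatile_fp16)` returns NULL (no promotion), while `toy_promote_fixed (&toy_volatile_fp16)` returns the float node.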

OK?

Thanks,
James

---
2016-10-14  James Greenhalgh  

* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Update
__FLT_EVAL_METHOD__ and __FLT_EVAL_METHOD_C99__ when we switch
architecture levels.
* config/aarch64/aarch64.c (aarch64_promoted_type): Only promote
the aarch64_fp16_type_node, not all HFmode types.
(aarch64_libgcc_floating_mode_supported_p): Support HFmode.
(aarch64_scalar_mode_supported_p): Likewise.
(aarch64_excess_precision): New.
(TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Define.
(TARGET_SCALAR_MODE_SUPPORTED_P): Likewise.
(TARGET_C_EXCESS_PRECISION): Likewise.

2016-10-14  James Greenhalgh  

* gcc.target/aarch64/_Float16_1.c: New.
* gcc.target/aarch64/_Float16_2.c: Likewise.
* gcc.target/aarch64/_Float16_3.c: Likewise.

diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 422e322..320b912 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -133,6 +133,16 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
 
   aarch64_def_or_undef (TARGET_CRYPTO, "__ARM_FEATURE_CRYPTO", pfile);
   aarch64_def_or_undef (TARGET_SIMD_RDMA, "__ARM_FEATURE_QRDMX", pfile);
+
+  /* Not for ACLE, but required to keep "float.h" correct if we switch
+ target between implementations that do or do not support ARMv8.2-A
+ 16-bit floating-point extensions.  */
+  cpp_undef (pfile, "__FLT_EVAL_METHOD__");
+  builtin_define_with_int_value ("__FLT_EVAL_METHOD__",
+ c_flt_eval_method (true));
+  cpp_undef (pfile, "__FLT_EVAL_METHOD_C99__");
+  builtin_define_with_int_value ("__FLT_EVAL_METHOD_C99__",
+ c_flt_eval_method (false));
 }
 
 /* Implement TARGET_CPU_CPP_BUILTINS.  */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f32eb5f..4f9191b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14025,12 +14025,20 @@ aarch64_vec_fpconst_pow_of_2 (rtx x)
   return firstval;
 }
 
-/* Implement TARGET_PROMOTED_TYPE to promote __fp16 to float.  */
+/* Implement TARGET_PROMOTED_TYPE to promote 16-bit floating point types
+   to float.
+
+   __fp16 always promotes through this hook.
+   _Float16 may promote if TARGET_FLT_EVAL_METHOD is 16, but we do that
+   through the generic excess precision logic rather than here.  */
+
 static tree
 aarch64_promoted_type (const_tree t)
 {
-  if (SCALAR_FLOAT_TYPE_P (t) && TYPE_PRECISION (t) == 16)
+
+  if (TYPE_P (t) && TYPE_MAIN_VARIANT (t) == aarch64_fp16_type_node)
 return float_type_node;
+
   return NULL_TREE;
 }
 
@@ -14050,6 +14058,17 @@ aarch64_optab_supported_p (int op, machine_mode mode1, machine_mode,
 }
 }
 
+/* Implement TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P - return TRUE
+   if MODE is HFmode, and punt to the generic implementation otherwise.  */
+
+static bool
+aarch64_libgcc_floating_mode_supported_p (machine_mode mode)
+{
+  return (mode == HFmode
+	  ? true
+	  : default_libgcc_floating_mode_supported_p (mode));
+}
+
 /* Implement TARGET_SCALAR_MODE_SUPPORTED_P - return TRUE
if MODE is HFmode, and punt to the generic implementation otherwise.  */
 
@@ -14061,6 +14080,47 @@ aarch64_scalar_mode_supported_p (machine_mode mode)
 	  : default_scalar_mode_supported_p (mode));
 }
 
+/* Set the value of FLT_EVAL_METHOD.
+   ISO/IEC TS 18661-3 defines two values that we'd like to make use of:
+
+0: evaluate all operations and constants, whose semantic type has at
+   most the range and precision of type float, to the range and
+   precision of float; evaluate all other operations and constants to
+   the range and precision of the semantic type;
+
+N, where _FloatN is a supported interchange floating type
+   evaluate all operations and constants, whose semantic type has at
+   most the range and precision of _FloatN type, to the range and
+   precision of the _FloatN type; evaluate all other operations and
+   constants to the range and precision of the semantic type;
+
+   If we have the ARMv8.2-A extensions then we support _Float16 in native
+   precision, so we should set this to 16.  Otherwise, we support the type,
+   but want to evaluate expressions in float precision, so set this to
+   0.  */
+
+static enum flt_eval_method
+aarch64_excess_precision (enum excess_precision_type type)
+{
+  switch 

Re: [PATCH, libfortran] PR 48587 Newunit allocator

2016-10-14 Thread Bernhard Reutner-Fischer
On 13 October 2016 22:08:21 CEST, Jerry DeLisle  wrote:
>On 10/13/2016 08:16 AM, Janne Blomqvist wrote:

>>
>> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
>>
>
>Yes, OK, clever! Thanks!

Is 32 something a typical program uses?
I'd have started at 8 and had not doubled but += 16 fwiw.
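
For a rough feel for the trade-off, the sketch below counts how many times a table would have to grow under each policy. The function names are made up for illustration and say nothing about how the patch itself is written; only the numbers 32, 8 and 16 come from this thread.

```c
#include <stddef.h>

/* Number of growth steps needed before the capacity reaches TARGET when
   the capacity doubles each time.  START must be nonzero.  */
size_t
toy_grow_steps_doubling (size_t start, size_t target)
{
  size_t steps = 0;
  for (size_t cap = start; cap < target; cap *= 2)
    steps++;
  return steps;
}

/* Same, but the capacity grows by a fixed nonzero INCREMENT each time.  */
size_t
toy_grow_steps_additive (size_t start, size_t increment, size_t target)
{
  size_t steps = 0;
  for (size_t cap = start; cap < target; cap += increment)
    steps++;
  return steps;
}
```

Under these assumptions, reaching 1000 entries takes 5 doublings from 32 but 62 steps of 16 from 8, whereas a program that needs only 20 entries pays zero steps with the larger start and one step with the smaller one.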

Cheers



Re: [Patch 6/11] Migrate excess precision logic to use TARGET_EXCESS_PRECISION

2016-10-14 Thread James Greenhalgh

On Fri, Sep 30, 2016 at 05:32:01PM +, Joseph Myers wrote:
> On Fri, 30 Sep 2016, James Greenhalgh wrote:
>
> >/* float.h needs to know this.  */
> > +  /* We already have the option -fno-fp-int-builtin-inexact to ensure
> > + certain built-in functions follow TS 18661-1 semantics.  It might be
> > + reasonable to have a new option to enable FLT_EVAL_METHOD using new
> > + values.  However, I'd be inclined to think that such an option should
> > + be on by default for -std=gnu*, only off for strict conformance modes.
> > + (There would be both __FLT_EVAL_METHOD__ and __FLT_EVAL_METHOD_C99__,
> > + say, predefined macros, so that <float.h> could also always use the
> > + new value if __STDC_WANT_IEC_60559_TYPES_EXT__ is defined.)  */
>
> This comment makes no sense in the context.  The comment should not be
> talking about some other option for a different issue, or about
> half-thought-out ideas for how something might be implemented; comments
> need to relate to the actual code (which in this case is obvious and not
> in need of comments beyond saying what the macro semantics are).

Yes, that was a particularly useless comment. Modified in this revision.

> In any case, this patch does not achieve the proposed semantics, since
> there is no change to ginclude/float.h.

Ah, I thought float.h was outside the project. My mistake.

> The goal is: if the user's options imply new FLT_EVAL_METHOD values are
> OK, *or* they defined __STDC_WANT_IEC_60559_TYPES_EXT__ before including
> <float.h>, it should use the appropriate TS 18661-3 value.  Otherwise
> (strict standards modes for existing standards, no
> __STDC_WANT_IEC_60559_TYPES_EXT__) it should use a C11 value.
>
> So in a strict standards mode you need to predefine macros with both
> choices of values and let <float.h> choose between them.  One possibility
> is: __FLT_EVAL_METHOD_C99__ is the value to use when
> __STDC_WANT_IEC_60559_TYPES_EXT__ is not defined, __FLT_EVAL_METHOD__ is
> the value to use when it is defined.  Or some other arrangement, with or
> without a macro saying what setting you have for the new option.  But you
> can't avoid changing <float.h>.
>
> Tests then should be testing the value of FLT_EVAL_METHOD from <float.h>,
> *not* the internal macros predefined by the compiler.
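
A minimal sketch of the arrangement described in the quoted text, assuming hypothetical TOY_ macros in place of the compiler predefines. This is not the actual ginclude/float.h change, and the values 0 and 16 are chosen only as an example of a target that evaluates _Float16 natively.

```c
/* Stand-ins for the two values the compiler would predefine.  The names
   and the values 0 and 16 are assumptions made for this sketch.  */
#define TOY_FLT_EVAL_METHOD_C99 0   /* value acceptable to C99/C11 code */
#define TOY_FLT_EVAL_METHOD_TS  16  /* TS 18661-3 value, e.g. _Float16 */

/* What the header would do: pick one value depending on whether the user
   asked for the TS 18661-3 types before including it.  */
#ifdef __STDC_WANT_IEC_60559_TYPES_EXT__
#define TOY_FLT_EVAL_METHOD TOY_FLT_EVAL_METHOD_TS
#else
#define TOY_FLT_EVAL_METHOD TOY_FLT_EVAL_METHOD_C99
#endif

/* Expose the selected value so it can be inspected at run time.  */
int
toy_selected_flt_eval_method (void)
{
  return TOY_FLT_EVAL_METHOD;
}
```

Compiling the same file with and without -D__STDC_WANT_IEC_60559_TYPES_EXT__ switches the value the function reports, which is the behaviour a test of the header (rather than of the internal macros) would check.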

I've added tests testing the float.h behaviour in this patch, and I'll
leave those testing __FLT_EVAL_METHOD__ in patch [5/11]. For all
the difference the extra testing makes, I'd rather test both, as explicit
testing that the clamping from -fpermitted-eval-methods works, and the
value is correctly set in float.h, but I can certainly drop the tests in
5/11 if you'd prefer.

I've also fixed a bug I noticed with the legacy __fp16 type. Excess
precision should leave this alone, so we need to check with
targetm.promoted_type before applying the rules in excess_precision_type.

Thanks,
James

---
gcc/

2016-10-14  James Greenhalgh  

* toplev.c (init_excess_precision): Delete most logic.
* tree.c (excess_precision_type): Rewrite to use
TARGET_EXCESS_PRECISION.
* doc/invoke.texi (-fexcess-precision): Document behaviour in a
more generic fashion.
* ginclude/float.h: Wrap definition of FLT_EVAL_METHOD in
__STDC_WANT_IEC_60559_TYPES_EXT__.

gcc/c-family/

2016-10-14  James Greenhalgh  

* c-common.c (excess_precision_mode_join): New.
(c_ts18661_flt_eval_method): New.
(c_c11_flt_eval_method): Likewise.
(c_flt_eval_method): Likewise.
* c-common.h (excess_precision_mode_join): New.
(c_flt_eval_method): Likewise.
* c-cppbuiltin.c (c_cpp_flt_eval_method_iec_559): New.
(cpp_iec_559_value): Call it.
(c_cpp_builtins): Modify logic for __LIBGCC_*_EXCESS_PRECISION__,
call c_flt_eval_method to set __FLT_EVAL_METHOD__ and
__FLT_EVAL_METHOD_TS_18661_3__.

gcc/testsuite/

2016-10-14  James Greenhalgh  

* gcc.dg/fpermitted-flt-eval-methods_3.c: New.
* gcc.dg/fpermitted-flt-eval-methods_4.c: Likewise.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index c4a0ce8..2a4add5 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -11043,4 +11043,86 @@ cb_get_suggestion (cpp_reader *, const char *goal,
   return bm.get_best_meaningful_candidate ();
 }
 
+/* Return the lattice point which is the wider of the two FLT_EVAL_METHOD
+   modes X, Y.  This isn't just  >, as the FLT_EVAL_METHOD values added
+   by C TS 18661-3 for interchange  types that are computed in their
+   native precision are larger than the C11 values for evaluating in the
+   precision of float/double/long double.  If either mode is
+   FLT_EVAL_METHOD_UNPREDICTABLE, return that.  */
+
+enum flt_eval_method
+excess_precision_mode_join (enum flt_eval_method x,
+			enum flt_eval_method y)
+{
+  if (x == FLT_EVAL_METHOD_UNPREDICTABLE
+  || y == 
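
The patch text is cut off at this point. As a hedged illustration of the join that the comment describes, the sketch below is one possible way to write it, not necessarily what the patch does. The enum values are copied from patch 1/11 in this series; the function name `toy_excess_precision_mode_join` is hypothetical.

```c
/* Values as given for enum flt_eval_method in patch 1/11.  */
enum flt_eval_method
{
  FLT_EVAL_METHOD_UNPREDICTABLE = -1,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0,
  FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1,
  FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 = 16
};

/* Return the wider of X and Y.  A plain numeric maximum is wrong because
   the _Float16 value (16) is numerically largest yet describes the
   narrowest evaluation format.  */
enum flt_eval_method
toy_excess_precision_mode_join (enum flt_eval_method x,
				enum flt_eval_method y)
{
  if (x == FLT_EVAL_METHOD_UNPREDICTABLE
      || y == FLT_EVAL_METHOD_UNPREDICTABLE)
    return FLT_EVAL_METHOD_UNPREDICTABLE;

  /* Anything joined with the _Float16 mode is at least as wide as it.  */
  if (x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
    return y;
  if (y == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
    return x;

  /* The remaining C11 values are ordered by width.  */
  return x > y ? x : y;
}
```

For example, joining the _Float16 mode with the float mode yields the float mode, even though 16 > 0 numerically.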

Re: [Patch 4/11] Implement TARGET_C_EXCESS_PRECISION for m68k

2016-10-14 Thread James Greenhalgh

On Fri, Sep 30, 2016 at 11:28:28AM -0600, Jeff Law wrote:
> On 09/30/2016 11:01 AM, James Greenhalgh wrote:
> >
> >Hi,
> >
> >This patch ports the logic from m68k's TARGET_FLT_EVAL_METHOD to the new
> >target hook TARGET_C_EXCESS_PRECISION.
> >
> >Patch tested by building an m68k-none-elf toolchain and running
> >m68k.exp (without the ability to execute) with no regressions, and manually
> >inspecting the output assembly code when compiling
> >testsuite/gcc.target/i386/excess-precision* to show no difference in
> >code-generation.
> >
> >OK?
> >
> >Thanks,
> >James
> >
> >---
> >gcc/
> >
> >2016-09-30  James Greenhalgh  
> >
> > * config/m68k/m68k.c (m68k_excess_precision): New.
> > (TARGET_C_EXCESS_PRECISION): Define.
> OK when prereqs are approved.  Similarly for other targets where you
> needed to add this hook.

Thanks Jeff, Andreas,

I spotted a very silly bug when I was retesting this patch set - when I
swapped the namespace for the new target macro it changed from
TARGET_EXCESS_PRECISION to TARGET_C_EXCESS_PRECISION but I failed to
update the m68k patch to reflect that.

This second revision fixes that (obvious) oversight.

OK?

Thanks,
James

---
gcc/

2016-10-14  James Greenhalgh  

* config/m68k/m68k.c (m68k_excess_precision): New.
(TARGET_C_EXCESS_PRECISION): Define.

diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
index b152ca8..3edeb71 100644
--- a/gcc/config/m68k/m68k.c
+++ b/gcc/config/m68k/m68k.c
@@ -183,6 +183,8 @@ static rtx m68k_function_arg (cumulative_args_t, machine_mode,
 static bool m68k_cannot_force_const_mem (machine_mode mode, rtx x);
 static bool m68k_output_addr_const_extra (FILE *, rtx);
 static void m68k_init_sync_libfuncs (void) ATTRIBUTE_UNUSED;
+static enum flt_eval_method
+m68k_excess_precision (enum excess_precision_type);
 
 /* Initialize the GCC target structure.  */
 
@@ -323,6 +325,9 @@ static void m68k_init_sync_libfuncs (void) ATTRIBUTE_UNUSED;
 #undef TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA
 #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA m68k_output_addr_const_extra
 
+#undef TARGET_C_EXCESS_PRECISION
+#define TARGET_C_EXCESS_PRECISION m68k_excess_precision
+
 /* The value stored by TAS.  */
 #undef TARGET_ATOMIC_TEST_AND_SET_TRUEVAL
 #define TARGET_ATOMIC_TEST_AND_SET_TRUEVAL 128
@@ -6531,4 +6536,36 @@ m68k_epilogue_uses (int regno ATTRIBUTE_UNUSED)
 	  == m68k_fk_interrupt_handler));
 }
 
+
+/* Implement TARGET_C_EXCESS_PRECISION.
+
+   Set the value of FLT_EVAL_METHOD in float.h.  When using 68040 fp
+   instructions, we get proper intermediate rounding, otherwise we
+   get extended precision results.  */
+
+static enum flt_eval_method
+m68k_excess_precision (enum excess_precision_type type)
+{
+  switch (type)
+{
+  case EXCESS_PRECISION_TYPE_FAST:
+	/* The fastest type to promote to will always be the native type,
+	   whether that occurs with implicit excess precision or
+	   otherwise.  */
+	return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+  case EXCESS_PRECISION_TYPE_STANDARD:
+  case EXCESS_PRECISION_TYPE_IMPLICIT:
+	/* Otherwise, the excess precision we want when we are
+	   in a standards compliant mode, and the implicit precision we
+	   provide can be identical.  */
+	if (TARGET_68040 || ! TARGET_68881)
+	  return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+
+	return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
+  default:
+	gcc_unreachable ();
+}
+  return FLT_EVAL_METHOD_UNPREDICTABLE;
+}
+
 #include "gt-m68k.h"
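
The decision table in the hook above can be exercised outside GCC with a small stand-alone model. This is only a sketch: the two enums are copied from patch 1/11 of this series, while `toy_m68k_excess_precision` and its two boolean parameters (standing in for TARGET_68040 and TARGET_68881) are hypothetical.

```c
#include <stdbool.h>

/* Copied from the coretypes.h hunk of patch 1/11.  */
enum flt_eval_method
{
  FLT_EVAL_METHOD_UNPREDICTABLE = -1,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0,
  FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1,
  FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 = 16
};

enum excess_precision_type
{
  EXCESS_PRECISION_TYPE_IMPLICIT,
  EXCESS_PRECISION_TYPE_STANDARD,
  EXCESS_PRECISION_TYPE_FAST
};

/* Model of the m68k hook with the target flags passed as arguments.  */
enum flt_eval_method
toy_m68k_excess_precision (enum excess_precision_type type,
			   bool target_68040, bool target_68881)
{
  switch (type)
    {
    case EXCESS_PRECISION_TYPE_FAST:
      /* Fast mode never asks for extra precision.  */
      return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
    case EXCESS_PRECISION_TYPE_STANDARD:
    case EXCESS_PRECISION_TYPE_IMPLICIT:
      /* 68040 instructions round intermediates; without a 68881 there
	 is no extended-precision hardware to introduce excess.  */
      if (target_68040 || !target_68881)
	return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
      return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
    }
  return FLT_EVAL_METHOD_UNPREDICTABLE;
}
```

With a 68881 and no 68040, the model reports long double for the standard and implicit queries and float for the fast query.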


Re: [Patch 3/11] Implement TARGET_C_EXCESS_PRECISION for s390

2016-10-14 Thread James Greenhalgh

Hi,

On Fri, Oct 07, 2016 at 10:34:25AM +0200, Andreas Krebbel wrote:
> On 10/04/2016 03:42 PM, Joseph Myers wrote:
> > On Tue, 4 Oct 2016, Andreas Krebbel wrote:
> >
> >>> (b) Handling EXCESS_PRECISION_TYPE_IMPLICIT like
> >>> EXCESS_PRECISION_TYPE_FAST would accurately describe what the back end
> >>> does.  It would mean that the default FLT_EVAL_METHOD is 0, which is a
> >>> more accurate description of how the compiler actually behaves, and would
> >>> avoid the suboptimal code in libgcc and glibc.  It would however mean that
> >>> unless -fexcess-precision=standard is used, FLT_EVAL_METHOD (accurate) is
> >>> out of sync with float_t in math.h (inaccurate).
> >>
> >> With (b) we would violate the C standard which explicitly states that
> >> the definition of float_t needs to be float if FLT_EVAL_METHOD is 0.
> >> I've no idea how much code really relies on that. So far I only know
> >> about the Plum Hall testsuite ;) So this probably would still be a safe
> >> change. Actually it was like that for many years without any problems
> >> ... until I've changed it due to the Plum Hall finding :(
> >> https://gcc.gnu.org/ml/gcc-patches/2013-03/msg01124.html
> >
> > You'd only violate it outside standards conformance modes (which you
> > should be using for running conformance testsuites); with -std=c11 etc.
> > -fexcess-precision=standard would be implied, meaning FLT_EVAL_METHOD
> > remains as 1 in that case.
>
> wrt (b): Agreed. I was more concerned about all the other code which might
> accidentally be built without a strict standard compliance option. I did some
> searches in the source package repos. The only snippet where the definition
> of FLT_EVAL_METHOD might affect code generation is in musl libc but that one
> is being built with -std=c99. So I don't see anything speaking against (b).
> I'm ok with going that way.
>
> wrt (c): float_t appears to be more widely used than I expected. But the only
> hits which might indicate potential ABI problems were in clucene and libassa.
> (I've scanned the header files of about 25k Ubuntu source packages).
> I'm also not sure about script language interfaces with C. There might be
> potential problems out there which I wasn't able to catch with my scan.
> While I fully agree with you that the float_t type definition for S/390 in
> Glibc is plain wrong I do not really feel comfortable with changing it.
>
> An interesting case is imagemagick. They define their ABI-relevant
> MagickRealType based on the size of float_t in recent versions but explicitly
> without depending on FLT_EVAL_METHOD
> (http://www.imagemagick.org/discourse-server/viewtopic.php?t=22136). They
> build with -std=gnu99 so this helps us with (b) I think. To my understanding
> MagickRealType would stay double not affected by FLT_EVAL_METHOD changes.

Here is a patch implementing what I think has been discussed in this thread.

OK?

Thanks,
James

---
gcc/

2016-10-14  James Greenhalgh  

* config/s390/s390.c (s390_excess_precision): New.
(TARGET_C_EXCESS_PRECISION): Define.

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index f69b470..8f6f199 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -15107,6 +15107,43 @@ s390_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1, const_tree ty
   return NULL;
 }
 
+/* Implement TARGET_C_EXCESS_PRECISION.
+
+   FIXME: For historical reasons, float_t and double_t are typedef'ed to
+   double on s390, causing operations on float_t to operate in a higher
+   precision than is necessary.  However, it is not the case that SFmode
+   operations have implicit excess precision, and we generate more optimal
+   code if we let the compiler know no implicit extra precision is added.
+
+   That means when we are compiling with -fexcess-precision=fast, the value
+   we set for FLT_EVAL_METHOD will be out of line with the actual precision of
+   float_t (though they would be correct for -fexcess-precision=standard).
+
+   A complete fix would modify glibc to remove the unnecessary typedef
+   of float_t to double.  */
+
+static enum flt_eval_method
+s390_excess_precision (enum excess_precision_type type)
+{
+  switch (type)
+{
+  case EXCESS_PRECISION_TYPE_IMPLICIT:
+  case EXCESS_PRECISION_TYPE_FAST:
+	/* The fastest type to promote to will always be the native type,
+	   whether that occurs with implicit excess precision or
+	   otherwise.  */
+	return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+  case EXCESS_PRECISION_TYPE_STANDARD:
+	/* Otherwise, when we are in a standards compliant mode, to
+	   ensure consistency with the implementation in glibc, report that
+	   float is evaluated to the range and precision of double.  */
+	return FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE;
+  default:
+	gcc_unreachable ();
+}
+  return FLT_EVAL_METHOD_UNPREDICTABLE;
+}
+
 /* Initialize GCC target structure.  */
 
 #undef  TARGET_ASM_ALIGNED_HI_OP
@@ 

[Bug fortran/77978] stop codes misinterpreted in both f2003 and f2008

2016-10-14 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77978

Dominique d'Humieres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-10-14
     Ever confirmed|0                           |1

--- Comment #1 from Dominique d'Humieres  ---
Confirmed from 4.8 up to trunk (7.0).

Re: [Patch 1/11] Add a new target hook for describing excess precision intentions

2016-10-14 Thread James Greenhalgh

On Fri, Sep 30, 2016 at 05:56:53PM +0100, James Greenhalgh wrote:
>
> This patch introduces TARGET_C_EXCESS_PRECISION. This hook takes a tri-state
> argument, one of EXCESS_PRECISION_TYPE_IMPLICIT,
> EXCESS_PRECISION_TYPE_STANDARD, EXCESS_PRECISION_TYPE_FAST. These relate to
> the implicit extra precision added by the target, the excess precision that
> should be guaranteed for -fexcess-precision=standard, and the excess
> precision that should be added for performance under -fexcess-precision=fast.
>
> Bootstrapped and tested in sequence with the other patches in this series
> on AArch64, and as a standalone patch on x86_64.
>

Hi,

This version of this patch has no major changes, simply updating the
comment above default_excess_precision to use the newer
TARGET_C_EXCESS_PRECISION name for this target hook, rather than
TARGET_EXCESS_PRECISION as it was in an earlier patch revision.

Bootstrapped and tested in sequence with the other patches in this series
on AArch64, and as a standalone patch on x86_64.

OK?

Thanks
James

---
gcc/

2016-10-14  James Greenhalgh  

* target.def (excess_precision): New hook.
* target.h (flt_eval_method): New.
(excess_precision_type): Likewise.
* targhooks.c (default_excess_precision): New.
* targhooks.h (default_excess_precision): New.
* doc/tm.texi.in (TARGET_C_EXCESS_PRECISION): New.
* doc/tm.texi: Regenerate.
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index 70f909d..c4b00b0 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -331,6 +331,24 @@ enum symbol_visibility
   VISIBILITY_INTERNAL
 };
 
+/* enums used by the targetm.excess_precision hook.  */
+
+enum flt_eval_method
+{
+  FLT_EVAL_METHOD_UNPREDICTABLE = -1,
+  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0,
+  FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1,
+  FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2,
+  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 = 16
+};
+
+enum excess_precision_type
+{
+  EXCESS_PRECISION_TYPE_IMPLICIT,
+  EXCESS_PRECISION_TYPE_STANDARD,
+  EXCESS_PRECISION_TYPE_FAST
+};
+
 /* Support for user-provided GGC and PCH markers.  The first parameter
is a pointer to a pointer, the second a cookie.  */
 typedef void (*gt_pointer_operator) (void *, void *);
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index a4a8e49..c21a772 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -947,6 +947,10 @@ sign-extend the result to 64 bits.  On such machines, set
 Do not define this macro if it would never modify @var{m}.
 @end defmac
 
+@deftypefn {Target Hook} {enum flt_eval_method} TARGET_C_EXCESS_PRECISION (enum excess_precision_type @var{type})
+Return a value, with the same meaning as @code{FLT_EVAL_METHOD} C that describes which excess precision should be applied.  @var{type} is either @code{EXCESS_PRECISION_TYPE_IMPLICIT}, @code{EXCESS_PRECISION_TYPE_FAST}, or @code{EXCESS_PRECISION_TYPE_STANDARD}.  For @code{EXCESS_PRECISION_TYPE_IMPLICIT}, the target should return which precision and range operations will be implicitly evaluated in regardless of the excess precision explicitly added.  For @code{EXCESS_PRECISION_TYPE_STANDARD} and @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the explicit excess precision that should be added depending on the value set for @code{-fexcess-precision=[standard|fast]}.
+@end deftypefn
+
 @deftypefn {Target Hook} machine_mode TARGET_PROMOTE_FUNCTION_MODE (const_tree @var{type}, machine_mode @var{mode}, int *@var{punsignedp}, const_tree @var{funtype}, int @var{for_return})
 Like @code{PROMOTE_MODE}, but it is applied to outgoing function arguments or
 function return values.  The target hook should return the new mode
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 265f1be..19b381b 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -921,6 +921,8 @@ sign-extend the result to 64 bits.  On such machines, set
 Do not define this macro if it would never modify @var{m}.
 @end defmac
 
+@hook TARGET_C_EXCESS_PRECISION
+
 @hook TARGET_PROMOTE_FUNCTION_MODE
 
 @defmac PARM_BOUNDARY
diff --git a/gcc/target.def b/gcc/target.def
index b6968f7..3b17c62 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5402,6 +5402,23 @@ DEFHOOK_UNDOC
  machine_mode, (char c),
  default_mode_for_suffix)
 
+DEFHOOK
+(excess_precision,
+ "Return a value, with the same meaning as @code{FLT_EVAL_METHOD} C that\
+ describes which excess precision should be applied.  @var{type} is\
+ either @code{EXCESS_PRECISION_TYPE_IMPLICIT},\
+ @code{EXCESS_PRECISION_TYPE_FAST}, or\
+ @code{EXCESS_PRECISION_TYPE_STANDARD}.  For\
+ @code{EXCESS_PRECISION_TYPE_IMPLICIT}, the target should return which\
+ precision and range operations will be implicitly evaluated in regardless\
+ of the excess precision explicitly added.  For\
+ @code{EXCESS_PRECISION_TYPE_STANDARD} and\
+ @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the\
+ explicit excess precision that should be added depending on the\
+ value set for 

[PATCH 2/8] nvptx: implement predicated instructions

2016-10-14 Thread Alexander Monakov
This patch wires up generation of predicated instruction forms in nvptx.md and
fixes their handling in nvptx.c.  This is a prerequisite for the following
patch.  On its own it doesn't affect generated code because COND_EXEC
instructions are created by if-conversion only after register allocation,
which is not performed on NVPTX.
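
As an illustration of the syntax involved: PTX writes a guard predicate as `@p` or `@!p` in front of the instruction. The helper below is a hypothetical stand-in, not the backend's nvptx_print_operand; the register name is passed in as a plain string, and the `toy_` name is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Write the PTX guard for predicate register REG into BUF: "@reg" when
   the instruction runs if REG is true, "@!reg" when it runs if REG is
   false.  Returns the number of characters the full guard needs,
   following snprintf conventions.  */
int
toy_format_predicate (char *buf, size_t size, const char *reg, bool negate)
{
  return snprintf (buf, size, "@%s%s", negate ? "!" : "", reg);
}
```

Calling it with a register named `%r1` and `negate` set produces the five-character guard `@!%r1`.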

* config/nvptx/nvptx.c (nvptx_output_call_insn): Handle COND_EXEC
patterns.  Emit instruction predicate.
(nvptx_print_operand): Fix handling of instruction predicates.
* config/nvptx/nvptx.md (predicable): New attribute.  Generate
predicated forms via define_cond_exec.
(br_true): Mark as not predicable.
(br_false): Ditto.
(br_true_uni): Ditto.
(br_false_uni): Ditto.
(return): Ditto.
(trap_if_true): Ditto.
(trap_if_false): Ditto.
(nvptx_fork): Ditto.
(nvptx_forked): Ditto.
(nvptx_joining): Ditto.
(nvptx_join): Ditto.
(nvptx_barsync): Ditto.
---
 gcc/config/nvptx/nvptx.c  | 14 --
 gcc/config/nvptx/nvptx.md | 43 +++
 2 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 0525b17..4cdaa1e 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -1927,6 +1927,8 @@ nvptx_output_mov_insn (rtx dst, rtx src)
   return "%.\tcvt%t0%t1\t%0, %1;";
 }
 
+static void nvptx_print_operand (FILE *, rtx, int);
+
 /* Output INSN, which is a call to CALLEE with result RESULT.  For ptx, this
involves writing .param declarations and in/out copies into them.  For
indirect calls, also write the .callprototype.  */
@@ -1938,6 +1940,8 @@ nvptx_output_call_insn (rtx_insn *insn, rtx result, rtx callee)
   static int labelno;
   bool needs_tgt = register_operand (callee, Pmode);
   rtx pat = PATTERN (insn);
+  if (GET_CODE (pat) == COND_EXEC)
+pat = COND_EXEC_CODE (pat);
   int arg_end = XVECLEN (pat, 0);
   tree decl = NULL_TREE;
 
@@ -1982,6 +1986,8 @@ nvptx_output_call_insn (rtx_insn *insn, rtx result, rtx callee)
   fprintf (asm_out_file, ";\n");
 }
 
+  /* The '.' stands for the call's predicate, if any.  */
+  nvptx_print_operand (asm_out_file, NULL_RTX, '.');
   fprintf (asm_out_file, "\t\tcall ");
   if (result != NULL_RTX)
 fprintf (asm_out_file, "(%s_in), ", reg_names[NVPTX_RETURN_REGNUM]);
@@ -2045,8 +2051,6 @@ nvptx_print_operand_punct_valid_p (unsigned char c)
   return c == '.' || c== '#';
 }
 
-static void nvptx_print_operand (FILE *, rtx, int);
-
 /* Subroutine of nvptx_print_operand; used to print a memory reference X to 
FILE.  */
 
 static void
@@ -2107,12 +2111,10 @@ nvptx_print_operand (FILE *file, rtx x, int code)
   x = current_insn_predicate;
   if (x)
{
- unsigned int regno = REGNO (XEXP (x, 0));
- fputs ("[", file);
+ fputs ("@", file);
  if (GET_CODE (x) == EQ)
fputs ("!", file);
- fputs (reg_names [regno], file);
- fputs ("]", file);
+ output_reg (file, REGNO (XEXP (x, 0)), VOIDmode);
}
   return;
 }
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index e91e8ac..5c5c991 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -126,6 +126,17 @@ (define_predicate "call_operation"
   return true;
 })
 
+(define_attr "predicable" "false,true"
+  (const_string "true"))
+
+(define_cond_exec
+  [(match_operator 0 "predicate_operator"
+  [(match_operand:BI 1 "nvptx_register_operand" "")
+   (match_operand:BI 2 "const0_operand" "")])]
+  ""
+  ""
+  )
+
 (define_constraint "P0"
   "An integer with the value 0."
   (and (match_code "const_int")
@@ -511,7 +522,8 @@ (define_insn "br_true"
  (label_ref (match_operand 1 "" ""))
  (pc)))]
   ""
-  "%j0\\tbra\\t%l1;")
+  "%j0\\tbra\\t%l1;"
+  [(set_attr "predicable" "false")])
 
 (define_insn "br_false"
   [(set (pc)
@@ -520,7 +532,8 @@ (define_insn "br_false"
  (label_ref (match_operand 1 "" ""))
  (pc)))]
   ""
-  "%J0\\tbra\\t%l1;")
+  "%J0\\tbra\\t%l1;"
+  [(set_attr "predicable" "false")])
 
 ;; unified conditional branch
 (define_insn "br_true_uni"
@@ -529,7 +542,8 @@ (define_insn "br_true_uni"
   UNSPEC_BR_UNIFIED) (const_int 0))
 (label_ref (match_operand 1 "" "")) (pc)))]
   ""
-  "%j0\\tbra.uni\\t%l1;")
+  "%j0\\tbra.uni\\t%l1;"
+  [(set_attr "predicable" "false")])
 
 (define_insn "br_false_uni"
   [(set (pc) (if_then_else
@@ -537,7 +551,8 @@ (define_insn "br_false_uni"
   UNSPEC_BR_UNIFIED) (const_int 0))
 (label_ref (match_operand 1 "" "")) (pc)))]
   ""
-  "%J0\\tbra.uni\\t%l1;")
+  "%J0\\tbra.uni\\t%l1;"
+  [(set_attr "predicable" "false")])
 
 (define_expand "cbranch4"
   [(set (pc)
@@ -940,7 +955,8 @@ (define_insn "return"
   ""
 {
   return nvptx_output_return ();

[PATCH 7/8] nvptx backend: new insns for OpenMP SIMD-via-SIMT

2016-10-14 Thread Alexander Monakov
This patch implements in nvptx.md a few new instruction patterns that are used
for OpenMP SIMD code.
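
As a rough illustration of what the lane-exchange and ballot patterns provide,
here is a host-side sketch that emulates a butterfly exchange and a ballot
across a 32-lane warp.  It is a simulation under assumptions, not code from the
patch: kWarpSize, Warp, xchg_bfly, warp_sum and vote_ballot are hypothetical
names, and the loops over lanes stand in for what the hardware does in a
single shuffle or vote instruction.

```cpp
#include <array>
#include <cstdint>

// Hypothetical host-side model of one warp (32 lanes, as on NVPTX).
constexpr unsigned kWarpSize = 32;
using Warp = std::array<std::uint32_t, kWarpSize>;

// Emulate a butterfly exchange: each lane reads the value held by the
// lane whose index is its own index XOR MASK.
Warp
xchg_bfly (const Warp &in, unsigned mask)
{
  Warp out{};
  for (unsigned lane = 0; lane < kWarpSize; ++lane)
    out[lane] = in[(lane ^ mask) % kWarpSize];
  return out;
}

// Sum reduction built from log2(32) = 5 butterfly steps.  After the
// last step every lane holds the total.
Warp
warp_sum (Warp v)
{
  for (unsigned mask = kWarpSize / 2; mask > 0; mask /= 2)
    {
      Warp other = xchg_bfly (v, mask);
      for (unsigned lane = 0; lane < kWarpSize; ++lane)
        v[lane] += other[lane];
    }
  return v;
}

// Emulate a ballot: bit N of the result is set when lane N's predicate
// is true.
std::uint32_t
vote_ballot (const std::array<bool, kWarpSize> &pred)
{
  std::uint32_t bits = 0;
  for (unsigned lane = 0; lane < kWarpSize; ++lane)
    if (pred[lane])
      bits |= std::uint32_t (1) << lane;
  return bits;
}
```

Filling a Warp with the lane indices 0..31 and calling warp_sum leaves 496 in
every lane, which is the kind of cross-lane reduction a butterfly exchange is
typically used for.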

* config/nvptx/nvptx-protos.h (nvptx_shuffle_kind): Move enum
declaration from nvptx.c.
(nvptx_gen_shuffle): Declare.
* config/nvptx/nvptx.c (nvptx_shuffle_kind): Move to nvptx-protos.h.
(nvptx_gen_shuffle): Export.
* config/nvptx/nvptx.md (UNSPEC_VOTE_BALLOT): New unspec.
(UNSPEC_LANEID): Ditto.
(UNSPECV_NOUNROLL): Ditto.
(nvptx_vote_ballot): New pattern.
(omp_simt_lane): Ditto.
(omp_simt_last_lane): Ditto.
(omp_simt_ordered): Ditto.
(omp_simt_vote_any): Ditto.
(omp_simt_xchg_bfly): Ditto.
(omp_simt_xchg_idx): Ditto.
(nvptx_nounroll): Ditto.
* target-insns.def (omp_simt_lane): New.
(omp_simt_last_lane): New.
(omp_simt_ordered): New.
(omp_simt_vote_any): New.
(omp_simt_xchg_bfly): New.
(omp_simt_xchg_idx): New.
---
 gcc/config/nvptx/nvptx-protos.h | 11 +
 gcc/config/nvptx/nvptx.c        | 12 +-
 gcc/config/nvptx/nvptx.md       | 94 +
 gcc/target-insns.def            |  6 +++
 4 files changed, 112 insertions(+), 11 deletions(-)

diff --git a/gcc/config/nvptx/nvptx-protos.h b/gcc/config/nvptx/nvptx-protos.h
index 647607d..331ec0a 100644
--- a/gcc/config/nvptx/nvptx-protos.h
+++ b/gcc/config/nvptx/nvptx-protos.h
@@ -21,6 +21,16 @@
 #ifndef GCC_NVPTX_PROTOS_H
 #define GCC_NVPTX_PROTOS_H
 
+/* The kind of shuffle instruction.  */
+enum nvptx_shuffle_kind
+{
+  SHUFFLE_UP,
+  SHUFFLE_DOWN,
+  SHUFFLE_BFLY,
+  SHUFFLE_IDX,
+  SHUFFLE_MAX
+};
+
 extern void nvptx_declare_function_name (FILE *, const char *, const_tree decl);
 extern void nvptx_declare_object_name (FILE *file, const char *name,
   const_tree decl);
@@ -36,6 +46,7 @@ extern void nvptx_register_pragmas (void);
 extern void nvptx_expand_oacc_fork (unsigned);
 extern void nvptx_expand_oacc_join (unsigned);
 extern void nvptx_expand_call (rtx, rtx);
+extern rtx nvptx_gen_shuffle (rtx, rtx, rtx, nvptx_shuffle_kind);
 extern rtx nvptx_expand_compare (rtx);
 extern const char *nvptx_ptx_type_from_mode (machine_mode, bool);
 extern const char *nvptx_output_mov_insn (rtx, rtx);
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index ef85ef6..f9ac380 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -72,16 +72,6 @@
 /* This file should be included last.  */
 #include "target-def.h"
 
-/* The kind of shuffe instruction.  */
-enum nvptx_shuffle_kind
-{
-  SHUFFLE_UP,
-  SHUFFLE_DOWN,
-  SHUFFLE_BFLY,
-  SHUFFLE_IDX,
-  SHUFFLE_MAX
-};
-
 /* The various PTX memory areas an object might reside in.  */
 enum nvptx_data_area
 {
@@ -1455,7 +1445,7 @@ nvptx_gen_pack (rtx dst, rtx src0, rtx src1)
 /* Generate an instruction or sequence to broadcast register REG
across the vectors of a single warp.  */
 
-static rtx
+rtx
 nvptx_gen_shuffle (rtx dst, rtx src, rtx idx, nvptx_shuffle_kind kind)
 {
   rtx res;
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 35ae71e..91d1129 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -42,6 +42,10 @@ (define_c_enum "unspec" [
 
UNSPEC_BIT_CONV
 
+   UNSPEC_VOTE_BALLOT
+
+   UNSPEC_LANEID
+
UNSPEC_SHUFFLE
UNSPEC_BR_UNIFIED
 ])
@@ -57,6 +61,8 @@ (define_c_enum "unspecv" [
UNSPECV_FORKED
UNSPECV_JOINING
UNSPECV_JOIN
+
+   UNSPECV_NOUNROLL
 ])
 
 (define_attr "subregs_ok" "false,true"
@@ -1169,6 +1175,88 @@ (define_insn "nvptx_shuffle"
   ""
   "%.\\tshfl%S3.b32\\t%0, %1, %2, 31;")
 
+(define_insn "nvptx_vote_ballot"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+   (unspec:SI [(match_operand:BI 1 "nvptx_register_operand" "R")]
+  UNSPEC_VOTE_BALLOT))]
+  ""
+  "%.\\tvote.ballot.b32\\t%0, %1;")
+
+;; Patterns for OpenMP SIMD-via-SIMT lowering
+
+;; Implement IFN_GOMP_SIMT_LANE: set operand 0 to lane index
+(define_insn "omp_simt_lane"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "")
+   (unspec:SI [(const_int 0)] UNSPEC_LANEID))]
+  ""
+  "%.\\tmov.u32\\t%0, %%laneid;")
+
+;; Implement IFN_GOMP_SIMT_ORDERED: copy operand 1 to operand 0 and
+;; place a compiler barrier to disallow unrolling/peeling the containing loop
+(define_expand "omp_simt_ordered"
+  [(match_operand:SI 0 "nvptx_register_operand" "=R")
+   (match_operand:SI 1 "nvptx_register_operand" "R")]
+  ""
+{
+  emit_move_insn (operands[0], operands[1]);
+  emit_insn (gen_nvptx_nounroll ());
+  DONE;
+})
+
+;; Implement IFN_GOMP_SIMT_XCHG_BFLY: perform a "butterfly" exchange
+;; across lanes
+(define_expand "omp_simt_xchg_bfly"
+  [(match_operand 0 "nvptx_register_operand" "=R")
+   (match_operand 1 "nvptx_register_operand" "R")
+   (match_operand:SI 2 "nvptx_nonmemory_operand" "Ri")]
+  ""
+{
+  emit_insn (nvptx_gen_shuffle (operands[0], operands[1], operands[2],
+ 

[PATCH 6/8] new target hook: TARGET_SIMT_VF

2016-10-14 Thread Alexander Monakov
This patch adds a new target hook and implements it in a straightforward
manner on NVPTX to indicate that the target is running in SIMT fashion with 32
threads in a synchronous group ("warp").  For use in OpenMP transforms.
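
A minimal sketch of how such a hook might be consumed, assuming the hook is
NULL on targets that do not define it (as the target.def entry in this patch
suggests).  The struct simt_hooks, the helper simt_width and the fallback
value of 1 are hypothetical and only serve the illustration; the constant 32
is the warp size stated above.

```cpp
// Hypothetical stand-in for the target hook vector.  A target that does
// not implement the hook leaves the pointer null.
struct simt_hooks
{
  int (*vf) ();
};

// Models nvptx_simt_vf from the patch: 32 threads per warp.
constexpr int kPtxVectorLength = 32;

int
nvptx_simt_vf_sketch ()
{
  return kPtxVectorLength;
}

// Hypothetical caller: treat a missing hook as "no SIMT execution",
// i.e. a group width of 1.
int
simt_width (const simt_hooks &hooks)
{
  return hooks.vf ? hooks.vf () : 1;
}
```

With a null hook simt_width yields 1; with the NVPTX-style hook installed it
yields 32.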

* config/nvptx/nvptx.c (nvptx_simt_vf): New.
(TARGET_SIMT_VF): Define.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_SIMT_VF): New hook.
* target.def: Define it.
---
 gcc/config/nvptx/nvptx.c | 11 +++
 gcc/doc/tm.texi          |  4
 gcc/doc/tm.texi.in       |  2 ++
 gcc/target.def           | 12
 4 files changed, 29 insertions(+)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 1c3267f..ef85ef6 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4453,6 +4453,14 @@ nvptx_expand_builtin (tree exp, rtx target, rtx ARG_UNUSED (subtarget),
 #define PTX_WORKER_LENGTH 32
 #define PTX_GANG_DEFAULT  32
 
+/* Implement TARGET_SIMT_VF target hook: number of threads in a warp.  */
+
+static int
+nvptx_simt_vf ()
+{
+  return PTX_VECTOR_LENGTH;
+}
+
 /* Validate compute dimensions of an OpenACC offload or routine, fill
in non-unity defaults.  FN_LEVEL indicates the level at which a
routine might spawn a loop.  It is negative for non-routines.  If
@@ -5221,6 +5229,9 @@ nvptx_goacc_reduction (gcall *call)
 #undef  TARGET_BUILTIN_DECL
 #define TARGET_BUILTIN_DECL nvptx_builtin_decl
 
+#undef TARGET_SIMT_VF
+#define TARGET_SIMT_VF nvptx_simt_vf
+
 #undef TARGET_GOACC_VALIDATE_DIMS
 #define TARGET_GOACC_VALIDATE_DIMS nvptx_goacc_validate_dims
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index a4a8e49..76477d6 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5836,6 +5836,10 @@ usable.  In that case, the smaller the number is, the more desirable it is
 to use it.
 @end deftypefn
 
+@deftypefn {Target Hook} int TARGET_SIMT_VF (void)
+Return number of threads in SIMT thread group on the target.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_GOACC_VALIDATE_DIMS (tree @var{decl}, int *@var{dims}, int @var{fn_level})
 This hook should check the launch dimensions provided for an OpenACC
 compute region, or routine.  Defaulted values are represented as -1
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 265f1be..36672af 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4289,6 +4289,8 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_SIMD_CLONE_USABLE
 
+@hook TARGET_SIMT_VF
+
 @hook TARGET_GOACC_VALIDATE_DIMS
 
 @hook TARGET_GOACC_DIM_LIMIT
diff --git a/gcc/target.def b/gcc/target.def
index b6968f7..0018f4d 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1648,6 +1648,18 @@ int, (struct cgraph_node *), NULL)
 
 HOOK_VECTOR_END (simd_clone)
 
+/* Functions relating to OpenMP SIMT vectorization transform.  */
+#undef HOOK_PREFIX
+#define HOOK_PREFIX "TARGET_SIMT_"
+HOOK_VECTOR (TARGET_SIMT, simt)
+
+DEFHOOK
+(vf,
+"Return number of threads in SIMT thread group on the target.",
+int, (void), NULL)
+
+HOOK_VECTOR_END (simt)
+
 /* Functions relating to openacc.  */
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_GOACC_"
-- 
1.8.3.1



[PATCH 0/8] OpenMP offloading to NVPTX: backend patches

2016-10-14 Thread Alexander Monakov
Hi,

I'm resending the patch series with backend prerequisites for OpenMP
offloading to the NVIDIA PTX ISA.  The patches are rebased on trunk.

Could a global reviewer have a look at patch 6 (new TARGET_SIMT_VF hook) please?

Documentation changes in doc/invoke.texi have already been reviewed
by Sandra Loosemore (thank you!).

Alexander


[PATCH 8/8] nvptx: handle OpenMP "omp target entrypoint"

2016-10-14 Thread Alexander Monakov
This patch implements emission of OpenMP target region entrypoints: the
compiler emits the target function with '$impl' appended to the name, and
under the original name it emits a short entry sequence that sets up shared
memory arrays and calls the target function via 'gomp_nvptx_main' (which is
implemented in libgomp).
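
The entry sequence stores, for each warp, an initial soft-stack pointer
computed as stack + stacksize * (ctaid.x * ntid.y + tid.y + 1), as the comment
in the diff below spells out.  A host-side sketch of that arithmetic follows.
The struct launch_pos and the function initial_stack_top are hypothetical
names; reading the "+ 1" as "each warp starts at the high end of its slice" is
an interpretation, not something the patch states.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of the launch coordinates the entry code reads.
struct launch_pos
{
  unsigned tid_y;    // warp index within the block
  unsigned ntid_y;   // number of warps per block
  unsigned ctaid_x;  // block index
};

// Initial soft-stack pointer for one warp.  The warp's global index
// selects a slice of STACKSIZE bytes; the "+ 1" places the pointer at
// the end of that slice.
std::uintptr_t
initial_stack_top (std::uintptr_t stack, std::size_t stacksize,
                   const launch_pos &p)
{
  std::size_t warp = std::size_t (p.ctaid_x) * p.ntid_y + p.tid_y;
  return stack + stacksize * (warp + 1);
}
```

For example, with 4 warps per block, warp 1 of block 2 has global index 9 and
therefore gets stack + 10 * stacksize.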

* config/nvptx/nvptx.c (write_as_kernel): Restrict to OpenACC target
regions.
(write_omp_entry): New.  Use it...
(nvptx_declare_function_name): ...here to emit OpenMP target region
entrypoints.
(nvptx_record_offload_symbol): Handle NULL attributes.
---
 gcc/config/nvptx/nvptx.c | 82 +---
 1 file changed, 78 insertions(+), 4 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index f9ac380..8d86aa8 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -739,7 +739,10 @@ static bool
 write_as_kernel (tree attrs)
 {
   return (lookup_attribute ("kernel", attrs) != NULL_TREE
- || lookup_attribute ("omp target entrypoint", attrs) != NULL_TREE);
+ || (lookup_attribute ("omp target entrypoint", attrs) != NULL_TREE
+ && lookup_attribute ("oacc function", attrs) != NULL_TREE));
+  /* For OpenMP target regions, the corresponding kernel entry is emitted from
+ write_omp_entry as a separate function.  */
 }
 
 /* Emit a linker marker for a function decl or defn.  */
@@ -1096,6 +1099,69 @@ nvptx_init_unisimt_predicate (FILE *file)
   need_unisimt_decl = true;
 }
 
+/* Emit kernel NAME for function ORIG outlined for an OpenMP 'target' region:
+
+   extern void gomp_nvptx_main (void (*fn)(void*), void *fnarg);
+   void __attribute__((kernel)) NAME (void *arg, char *stack, size_t stacksize)
+   {
+ __nvptx_stacks[tid.y] = stack + stacksize * (ctaid.x * ntid.y + tid.y + 1);
+ __nvptx_uni[tid.y] = 0;
+ gomp_nvptx_main (ORIG, arg);
+   }
+   ORIG itself should not be emitted as a PTX .entry function.  */
+
+static void
+write_omp_entry (FILE *file, const char *name, const char *orig)
+{
+  static bool gomp_nvptx_main_declared;
+  if (!gomp_nvptx_main_declared)
+{
+  gomp_nvptx_main_declared = true;
+  write_fn_marker (func_decls, false, true, "gomp_nvptx_main");
+  func_decls << ".extern .func gomp_nvptx_main (.param.u" << POINTER_SIZE
+<< " %in_ar1, .param.u" << POINTER_SIZE << " %in_ar2);\n";
+}
+#define ENTRY_TEMPLATE(PS, PS_BYTES, MAD_PS_32) "\
+ (.param.u" PS " %arg, .param.u" PS " %stack, .param.u" PS " %sz)\n\
+{\n\
+   .reg.u32 %r<3>;\n\
+   .reg.u" PS " %R<4>;\n\
+   mov.u32 %r0, %tid.y;\n\
+   mov.u32 %r1, %ntid.y;\n\
+   mov.u32 %r2, %ctaid.x;\n\
+   cvt.u" PS ".u32 %R1, %r0;\n\
+   " MAD_PS_32 " %R1, %r1, %r2, %R1;\n\
+   mov.u" PS " %R0, __nvptx_stacks;\n\
+   " MAD_PS_32 " %R0, %r0, " PS_BYTES ", %R0;\n\
+   ld.param.u" PS " %R2, [%stack];\n\
+   ld.param.u" PS " %R3, [%sz];\n\
+   add.u" PS " %R2, %R2, %R3;\n\
+   mad.lo.u" PS " %R2, %R1, %R3, %R2;\n\
+   st.shared.u" PS " [%R0], %R2;\n\
+   mov.u" PS " %R0, __nvptx_uni;\n\
+   " MAD_PS_32 " %R0, %r0, 4, %R0;\n\
+   mov.u32 %r0, 0;\n\
+   st.shared.u32 [%R0], %r0;\n\
+   mov.u" PS " %R0, \0;\n\
+   ld.param.u" PS " %R1, [%arg];\n\
+   {\n\
+   .param.u" PS " %P<2>;\n\
+   st.param.u" PS " [%P0], %R0;\n\
+   st.param.u" PS " [%P1], %R1;\n\
+   call.uni gomp_nvptx_main, (%P0, %P1);\n\
+   }\n\
+   ret.uni;\n\
+}\n"
+  static const char entry64[] = ENTRY_TEMPLATE ("64", "8", "mad.wide.u32");
+  static const char entry32[] = ENTRY_TEMPLATE ("32", "4", "mad.lo.u32  ");
+#undef ENTRY_TEMPLATE
+  const char *entry_1 = TARGET_ABI64 ? entry64 : entry32;
+  /* Position ENTRY_2 after the embedded nul using strlen of the prefix.  */
+  const char *entry_2 = entry_1 + strlen (entry64) + 1;
+  fprintf (file, ".visible .entry %s%s%s%s", name, entry_1, orig, entry_2);
+  need_softstack_decl = need_unisimt_decl = true;
+}
+
 /* Implement ASM_DECLARE_FUNCTION_NAME.  Writes the start of a ptx
function, including local var decls and copies from the arguments to
local regs.  */
@@ -1107,6 +1173,14 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
   tree result_type = TREE_TYPE (fntype);
   int argno = 0;
 
+  if (lookup_attribute ("omp target entrypoint", DECL_ATTRIBUTES (decl))
+  && !lookup_attribute ("oacc function", DECL_ATTRIBUTES (decl)))
+{
+  char *buf = (char *) alloca (strlen (name) + sizeof ("$impl"));
+  sprintf (buf, "%s$impl", name);
+  write_omp_entry (file, name, buf);
+  name = buf;
+}
   /* We construct the initial part of the function into a string
  stream, in order to share the prototype writing code.  */
   std::stringstream s;
@@ -4176,13 +4250,13 @@ nvptx_record_offload_symbol (tree decl)
 case FUNCTION_DECL:
   {
 

[PATCH 5/8] nvptx mkoffload: pass -mgomp for OpenMP offloading

2016-10-14 Thread Alexander Monakov
This patch wires up use of alternative -mgomp multilib for OpenMP offloading
via nvptx mkoffload.  It makes OpenACC and OpenMP incompatible for
simultaneous offloading compilation, so I've added a diagnostic for that.
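
The check boils down to "exactly one of the two options is present".  A small
standalone sketch of that logic is below; mkoffload does this inline in main,
so the helper exactly_one_offload_model and its use_mgomp out-parameter are
hypothetical and exist only to make the behaviour testable in isolation.

```cpp
#include <cstring>

// Hypothetical helper.  Scans the argument vector for -fopenmp and
// -fopenacc, reports through USE_MGOMP whether -mgomp should be passed
// on to the offload compiler, and returns true only when exactly one of
// the two options was given.
bool
exactly_one_offload_model (int argc, const char *const *argv,
                           bool *use_mgomp)
{
  bool fopenmp = false;
  bool fopenacc = false;
  for (int i = 1; i < argc; i++)
    {
      if (std::strcmp (argv[i], "-fopenmp") == 0)
        fopenmp = true;
      else if (std::strcmp (argv[i], "-fopenacc") == 0)
        fopenacc = true;
    }
  *use_mgomp = fopenmp;
  // Same condition as the XOR in the patch: true when the flags differ.
  return fopenacc != fopenmp;
}
```

Both "neither option" and "both options" are rejected, which matches the
fatal_error condition !(fopenacc ^ fopenmp) in the diff.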

* config/nvptx/mkoffload.c (main): Check that either OpenACC or OpenMP
is selected.  Pass -mgomp to offload compiler in OpenMP case.
---
 gcc/config/nvptx/mkoffload.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/config/nvptx/mkoffload.c b/gcc/config/nvptx/mkoffload.c
index c8eed45..d876c7b 100644
--- a/gcc/config/nvptx/mkoffload.c
+++ b/gcc/config/nvptx/mkoffload.c
@@ -460,6 +460,7 @@ main (int argc, char **argv)
 
   /* Scan the argument vector.  */
   bool fopenmp = false;
+  bool fopenacc = false;
   for (int i = 1; i < argc; i++)
 {
 #define STR "-foffload-abi="
@@ -476,11 +477,15 @@ main (int argc, char **argv)
 #undef STR
   else if (strcmp (argv[i], "-fopenmp") == 0)
fopenmp = true;
+  else if (strcmp (argv[i], "-fopenacc") == 0)
+   fopenacc = true;
   else if (strcmp (argv[i], "-save-temps") == 0)
save_temps = true;
   else if (strcmp (argv[i], "-v") == 0)
verbose = true;
 }
+  if (!(fopenacc ^ fopenmp))
+fatal_error (input_location, "either -fopenacc or -fopenmp must be set");
 
   struct obstack argv_obstack;
   obstack_init (&argv_obstack);
@@ -501,6 +506,8 @@ main (int argc, char **argv)
 default:
   gcc_unreachable ();
 }
+  if (fopenmp)
+obstack_ptr_grow (&argv_obstack, "-mgomp");
 
   for (int ix = 1; ix != argc; ix++)
 {
-- 
1.8.3.1



[PATCH 1/8] nvptx -msoft-stack

2016-10-14 Thread Alexander Monakov
This patch implements the '-msoft-stack' code generation variant for NVPTX.  The
goal is to avoid relying on the '.local' memory space for placement of automatic
data, and instead have an explicitly maintained stack pointer (which can be
set up to point to preallocated global memory space).  This makes stack data
accessible from all threads and modifiable with atomic instructions.  It also
makes it possible to implement variable-length stack allocation
(for 'alloca' and C99 VLAs).

Each warp has its own 'soft stack' pointer.  It lives in a shared memory array
called __nvptx_stacks at index %tid.y (as in OpenACC, OpenMP offloading is
going to use launch geometry such that %tid.y gives the warp index).  It is
retrieved in the function prologue (if the function needs a stack frame) and may
also be written there (if the function is non-leaf, so that its callees see
the updated stack pointer), and is restored prior to returning.

Startup code is responsible for setting up the initial soft-stack pointer. For
-mmainkernel testing it is libgcc's __main, for OpenMP offloading it's the
kernel region entry code.
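
The prologue/epilogue protocol described above can be modelled on the host
roughly as follows.  This is only a sketch under assumptions: soft_stacks
stands in for __nvptx_stacks, the class soft_frame and the constants are
hypothetical, the stack is assumed to grow downward, and frame sizes are
rounded up to an assumed 8-byte alignment.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical host model: one soft-stack pointer slot per warp.
constexpr unsigned kMaxWarps = 32;
std::uintptr_t soft_stacks[kMaxWarps];

// Assumed alignment kept for every frame (64 bits).
constexpr std::size_t kKeepAlign = 8;

// Models one function activation.  The constructor plays the role of
// the prologue, the destructor that of the epilogue.
class soft_frame
{
public:
  soft_frame (unsigned warp, std::size_t size, bool leaf)
    : slot_ (soft_stacks[warp]), prev_ (slot_), leaf_ (leaf)
  {
    // Round the frame size up to the alignment, then carve the frame
    // out below the previous stack pointer.
    size = (size + kKeepAlign - 1) / kKeepAlign * kKeepAlign;
    base_ = prev_ - size;
    // Only non-leaf functions publish the new pointer, so that their
    // callees allocate below this frame.
    if (!leaf_)
      slot_ = base_;
  }

  ~soft_frame ()
  {
    // Restore the caller's stack pointer before returning.
    if (!leaf_)
      slot_ = prev_;
  }

  std::uintptr_t base () const { return base_; }

private:
  std::uintptr_t &slot_;
  std::uintptr_t prev_;
  std::uintptr_t base_;
  bool leaf_;
};
```

A leaf function in this model never touches the shared slot: it reads the
current pointer, uses memory below it, and returns.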

gcc/:
* config/nvptx/nvptx-protos.h (nvptx_output_set_softstack): Declare.
* config/nvptx/nvptx.c (need_softstack_decl): New variable.
(init_softstack_frame): New.
(nvptx_declare_function_name): Handle TARGET_SOFT_STACK.
(nvptx_output_set_softstack): New.
(nvptx_get_drap_rtx): Return %argp as the DRAP if needed.
(nvptx_file_end): Handle need_softstack_decl.
* config/nvptx/nvptx.h (TARGET_CPU_CPP_BUILTINS): Define
__nvptx_softstack__ when -msoft-stack is active.
(STACK_SIZE_MODE): Define.
(FIXED_REGISTERS): Adjust.
(SOFTSTACK_SLOT_REGNUM): New.
(SOFTSTACK_PREV_REGNUM): New.
(REGISTER_NAMES): Adjust.
(struct machine_function): New bool field has_softstack.
* config/nvptx/nvptx.md (UNSPEC_SET_SOFTSTACK): New.
(epilogue): Emit stack restore if TARGET_SOFT_STACK.
(allocate_stack): Implement for TARGET_SOFT_STACK.  Remove unused code.
(allocate_stack_): Remove unused pattern.
(set_softstack_insn): New pattern.
(restore_stack_block): Handle for TARGET_SOFT_STACK.
* config/nvptx/nvptx.opt (msoft-stack): New option.
* doc/invoke.texi (msoft-stack): Document.

gcc/testsuite/:
* gcc.target/nvptx/softstack.c: New test.
* lib/target-supports.exp (check_effective_target_alloca): Use a
compile test.

libgcc/:
* config/nvptx/crt0.c (__main): Setup __nvptx_stacks.
---
 gcc/config/nvptx/nvptx-protos.h            |   1 +
 gcc/config/nvptx/nvptx.c                   | 120 ++---
 gcc/config/nvptx/nvptx.h                   |  15 +++-
 gcc/config/nvptx/nvptx.md                  |  36 ++---
 gcc/config/nvptx/nvptx.opt                 |   4 +
 gcc/doc/invoke.texi                        |  12 +++
 gcc/testsuite/gcc.target/nvptx/softstack.c |  23 ++
 gcc/testsuite/lib/target-supports.exp      |   5 +-
 libgcc/config/nvptx/crt0.c                 |   8 ++
 9 files changed, 198 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/nvptx/softstack.c

diff --git a/gcc/config/nvptx/nvptx-protos.h b/gcc/config/nvptx/nvptx-protos.h
index ec4588e..647607d 100644
--- a/gcc/config/nvptx/nvptx-protos.h
+++ b/gcc/config/nvptx/nvptx-protos.h
@@ -41,5 +41,6 @@ extern const char *nvptx_ptx_type_from_mode (machine_mode, bool);
 extern const char *nvptx_output_mov_insn (rtx, rtx);
 extern const char *nvptx_output_call_insn (rtx_insn *, rtx, rtx);
 extern const char *nvptx_output_return (void);
+extern const char *nvptx_output_set_softstack (unsigned);
 #endif
 #endif
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 9e04f5b..0525b17 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -141,6 +141,9 @@ static GTY(()) rtx worker_red_sym;
 /* Global lock variable, needed for 128bit worker & gang reductions.  */
 static GTY(()) tree global_lock_var;
 
+/* True if any function references __nvptx_stacks.  */
+static bool need_softstack_decl;
+
 /* Allocate a new, cleared machine_function structure.  */
 
 static struct machine_function *
@@ -981,6 +984,67 @@ init_frame (FILE  *file, int regno, unsigned align, unsigned size)
   POINTER_SIZE, reg_names[regno], reg_names[regno]);
 }
 
+/* Emit soft stack frame setup sequence.  */
+
+static void
+init_softstack_frame (FILE *file, unsigned alignment, HOST_WIDE_INT size)
+{
+  /* Maintain 64-bit stack alignment.  */
+  unsigned keep_align = BIGGEST_ALIGNMENT / BITS_PER_UNIT;
+  size = ROUND_UP (size, keep_align);
+  int bits = POINTER_SIZE;
+  const char *reg_stack = reg_names[STACK_POINTER_REGNUM];
+  const char *reg_frame = reg_names[FRAME_POINTER_REGNUM];
+  const char *reg_sspslot = reg_names[SOFTSTACK_SLOT_REGNUM];
+  const char *reg_sspprev = reg_names[SOFTSTACK_PREV_REGNUM];
+  fprintf (file, "\t.reg.u%d 

[PATCH 4/8] nvptx -mgomp

2016-10-14 Thread Alexander Monakov
This patch adds option -mgomp which enables -msoft-stack plus -muniform-simt,
and wires up the corresponding multilib variant.  This codegen convention is
used for OpenMP offloading.
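
In terms of option masks, the override step amounts to the following sketch.
The bit values, the struct override_result and the function
apply_gomp_override are hypothetical (GCC generates the real MASK_* values
from nvptx.opt, and the real code reports conflicts through error ()); the
logic mirrors the nvptx_option_override hunk below.

```cpp
// Hypothetical bit assignments for the three target options.
enum : unsigned
{
  kMaskSoftStack = 1u << 0,
  kMaskUniformSimt = 1u << 1,
  kMaskGomp = 1u << 2
};

// Result of the override: the adjusted flags, plus whether a conflict
// with -fopenacc would have been diagnosed.
struct override_result
{
  unsigned flags;
  bool error;
};

override_result
apply_gomp_override (unsigned target_flags, bool flag_openacc)
{
  override_result r{target_flags, false};
  // Each of the three options conflicts with -fopenacc.
  if (flag_openacc
      && (target_flags & (kMaskGomp | kMaskSoftStack | kMaskUniformSimt)))
    r.error = true;
  // -mgomp implies both -msoft-stack and -muniform-simt.
  if (target_flags & kMaskGomp)
    r.flags |= kMaskSoftStack | kMaskUniformSimt;
  return r;
}
```

Passing only the -mgomp bit yields all three bits set; passing any of the bits
together with -fopenacc sets the error indication.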

* config/nvptx/nvptx.c (diagnose_openacc_conflict): New.  Use it...
(nvptx_option_override): ...here.  Handle TARGET_GOMP.
* config/nvptx/nvptx.opt (mgomp): New option.
* config/nvptx/t-nvptx (MULTILIB_OPTIONS): New.
* doc/invoke.texi (mgomp): Document.

libgcc:
* config/nvptx/mgomp.c: New file.
* config/nvptx/t-nvptx: Add mgomp.c.
---
 gcc/config/nvptx/nvptx.c    | 17 +
 gcc/config/nvptx/nvptx.opt  |  4
 gcc/config/nvptx/t-nvptx    |  2 ++
 gcc/doc/invoke.texi         |  6 ++
 libgcc/config/nvptx/mgomp.c | 32
 libgcc/config/nvptx/t-nvptx |  3 ++-
 6 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 libgcc/config/nvptx/mgomp.c

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 65217ab..1c3267f 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -157,6 +157,16 @@ nvptx_init_machine_status (void)
   return p;
 }
 
+/* Issue a diagnostic when option OPTNAME is enabled (as indicated by OPTVAL)
+   and -fopenacc is also enabled.  */
+
+static void
+diagnose_openacc_conflict (bool optval, const char *optname)
+{
+  if (flag_openacc && optval)
+error ("option %s is not supported together with -fopenacc", optname);
+}
+
 /* Implement TARGET_OPTION_OVERRIDE.  */
 
 static void
@@ -194,6 +204,13 @@ nvptx_option_override (void)
   worker_red_sym = gen_rtx_SYMBOL_REF (Pmode, "__worker_red");
   SET_SYMBOL_DATA_AREA (worker_red_sym, DATA_AREA_SHARED);
   worker_red_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
+
+  diagnose_openacc_conflict (TARGET_GOMP, "-mgomp");
+  diagnose_openacc_conflict (TARGET_SOFT_STACK, "-msoft-stack");
+  diagnose_openacc_conflict (TARGET_UNIFORM_SIMT, "-muniform-simt");
+
+  if (TARGET_GOMP)
+target_flags |= MASK_SOFT_STACK | MASK_UNIFORM_SIMT;
 }
 
 /* Return a ptx type for MODE.  If PROMOTE, then use .u32 for QImode to
diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index 0d46e1d..cb6194d 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -40,3 +40,7 @@ Use custom stacks instead of local memory for automatic storage.
 muniform-simt
 Target Report Mask(UNIFORM_SIMT)
 Generate code that can keep local state uniform across all lanes.
+
+mgomp
+Target Report Mask(GOMP)
+Generate code for OpenMP offloading: enables -msoft-stack and -muniform-simt.
diff --git a/gcc/config/nvptx/t-nvptx b/gcc/config/nvptx/t-nvptx
index e2580c9..6c1010d 100644
--- a/gcc/config/nvptx/t-nvptx
+++ b/gcc/config/nvptx/t-nvptx
@@ -8,3 +8,5 @@ ALL_HOST_OBJS += mkoffload.o
 mkoffload$(exeext): mkoffload.o collect-utils.o libcommon-target.a $(LIBIBERTY) $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \
  mkoffload.o collect-utils.o libcommon-target.a $(LIBIBERTY) $(LIBS)
+
+MULTILIB_OPTIONS = mgomp
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5150b2f..6d6247c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -20281,6 +20281,12 @@ variant is used for OpenMP offloading, but the option is exposed on its own
 for the purpose of testing the compiler; to generate code suitable for linking
 into programs using OpenMP offloading, use option @option{-mgomp}.
 
+@item -mgomp
+@opindex mgomp
+Generate code for use in OpenMP offloading: enables the @option{-msoft-stack}
+and @option{-muniform-simt} options, and selects the corresponding multilib
+variant.
+
 @end table
 
 @node PDP-11 Options
diff --git a/libgcc/config/nvptx/mgomp.c b/libgcc/config/nvptx/mgomp.c
new file mode 100644
index 000..d8ca581
--- /dev/null
+++ b/libgcc/config/nvptx/mgomp.c
@@ -0,0 +1,32 @@
+/* Define shared memory arrays for -msoft-stack and -muniform-simt.
+
+   Copyright (C) 2015-2016 Free Software Foundation, Inc.
+
+   This file is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by the
+   Free Software Foundation; either version 3, or (at your option) any
+   later version.
+
+   This file is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+/* OpenACC offloading does not use these 

[PATCH 3/8] nvptx -muniform-simt

2016-10-14 Thread Alexander Monakov
This patch implements the -muniform-simt code generation option, which is used to
emit code for OpenMP offloading.  The goal is to emit code that can either
execute "normally", or can execute in a way that keeps all lanes in a given
warp active, their local state synchronized, and observable effects from
execution happening as if only one lane were active.  The latter mode is how
OpenMP offloaded code runs outside of SIMD regions.

To achieve that, the compiler instruments atomic instructions and calls to
functions provided by the CUDA runtime (malloc, free, vprintf), i.e. those
that GCC itself doesn't compile.

Instrumentation converts an atomic instruction to a predicated atomic
instruction followed by a warp shuffle.  To illustrate,

   atom.op dest, 

becomes

  @PRED atom.op dest, 
   shfl.idx dest, dest, MASTER

where, outside of SIMD regions:

- PRED is true in lane 0, false in lanes 1-31, so the side effect happens once
- MASTER is 0 in all lanes, so the shuffle synchronizes 'dest' among all lanes

and inside of SIMD regions:

- PRED is true in all lanes, so the atomic is done in all lanes independently
- MASTER equals the current lane number, so the shuffle is a no-op.

To keep track of the current state and compute PRED and MASTER, the compiler uses
a shared memory array 'unsigned __nvptx_uni[]' with per-warp all-zeros or
all-ones masks.  The mask word is zero outside of SIMD regions, all-ones
inside.  The function prologue uses the mask to compute MASTER and PRED via:

MASTER = LANE_ID & MASK;
PRED   = LANE_ID == MASTER;

Calls are handled like atomics.
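
To make the two modes concrete, here is a host-side emulation of the
MASTER/PRED computation and of a predicated atomic add followed by an indexed
shuffle.  It is a sketch, not the compiler's implementation: kLanes,
lane_state, unisimt_state and warp_atomic_add are hypothetical names, and the
lanes run sequentially here whereas real hardware executes them together.

```cpp
#include <array>
#include <cstdint>

// Hypothetical host-side model of one warp.
constexpr unsigned kLanes = 32;

struct lane_state
{
  std::uint32_t master;  // lane to read the result from
  bool pred;             // whether this lane performs the side effect
};

// MASK is the per-warp word from __nvptx_uni: zero outside SIMD
// regions, all-ones inside.
lane_state
unisimt_state (std::uint32_t lane_id, std::uint32_t mask)
{
  std::uint32_t master = lane_id & mask;
  return { master, lane_id == master };
}

// Emulate "@PRED atom.add dest, [mem], val" followed by
// "shfl.idx dest, dest, MASTER" for a whole warp.  Returns the value
// of DEST seen by each lane afterwards.
std::array<std::uint32_t, kLanes>
warp_atomic_add (std::uint32_t &mem, std::uint32_t val, std::uint32_t mask)
{
  std::array<std::uint32_t, kLanes> dest{};
  for (std::uint32_t lane = 0; lane < kLanes; ++lane)
    if (unisimt_state (lane, mask).pred)
      {
        dest[lane] = mem;  // atomic returns the old value
        mem += val;
      }
  std::array<std::uint32_t, kLanes> out{};
  for (std::uint32_t lane = 0; lane < kLanes; ++lane)
    out[lane] = dest[unisimt_state (lane, mask).master];
  return out;
}
```

With a zero mask the memory location is updated once and all 32 lanes observe
the same old value; with an all-ones mask it is updated 32 times and each lane
keeps its own result.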

gcc/
* config/nvptx/nvptx.c (need_unisimt_decl): New variable.  Set it...
(nvptx_init_unisimt_predicate): ...here (new function) and use it...
(nvptx_file_end): ...here to emit declaration of __nvptx_uni array.
(nvptx_declare_function_name): Call nvptx_init_unisimt_predicate.
(nvptx_get_unisimt_master): New helper function.
(nvptx_get_unisimt_predicate): Ditto.
(nvptx_call_insn_is_syscall_p): Ditto.
(nvptx_unisimt_handle_set): Ditto.
(nvptx_reorg_uniform_simt): New.  Transform code for -muniform-simt.
(nvptx_reorg): Call nvptx_reorg_uniform_simt.
* config/nvptx/nvptx.h (TARGET_CPU_CPP_BUILTINS): Define
__nvptx_unisimt__ when -muniform-simt option is active.
(struct machine_function): Add unisimt_master, unisimt_predicate
rtx fields.
* config/nvptx/nvptx.md (atomic): New attribute.
(atomic_compare_and_swap_1): Mark with atomic attribute.
(atomic_exchange): Ditto.
(atomic_fetch_add): Ditto.
(atomic_fetch_addsf): Ditto.
(atomic_fetch_): Ditto.
* config/nvptx/nvptx.opt (muniform-simt): New option.
* doc/invoke.texi (-muniform-simt): Document.

gcc/testsuite/
* gcc.target/nvptx/unisimt.c: New test.

libgcc/
* config/nvptx/crt0.c (__main): Setup __nvptx_uni.
---
 gcc/config/nvptx/nvptx.c                 | 124 +++
 gcc/config/nvptx/nvptx.h                 |   4 +
 gcc/config/nvptx/nvptx.md                |  18 +++--
 gcc/config/nvptx/nvptx.opt               |   4 +
 gcc/doc/invoke.texi                      |  11 +++
 gcc/testsuite/gcc.target/nvptx/unisimt.c |  22 ++
 libgcc/config/nvptx/crt0.c               |   4 +
 7 files changed, 182 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/nvptx/unisimt.c

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 4cdaa1e..65217ab 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -144,6 +144,9 @@ static GTY(()) tree global_lock_var;
 /* True if any function references __nvptx_stacks.  */
 static bool need_softstack_decl;
 
+/* True if any function references __nvptx_uni.  */
+static bool need_unisimt_decl;
+
 /* Allocate a new, cleared machine_function structure.  */
 
 static struct machine_function *
@@ -1058,6 +1061,34 @@ nvptx_init_axis_predicate (FILE *file, int regno, const char *name)
   fprintf (file, "\t}\n");
 }
 
+/* Emit code to initialize predicate and master lane index registers for
+   -muniform-simt code generation variant.  */
+
+static void
+nvptx_init_unisimt_predicate (FILE *file)
+{
+  int bits = POINTER_SIZE;
+  int master = REGNO (cfun->machine->unisimt_master);
+  int pred = REGNO (cfun->machine->unisimt_predicate);
+  fprintf (file, "\t{\n");
+  fprintf (file, "\t\t.reg.u32 %%ustmp0;\n");
+  fprintf (file, "\t\t.reg.u%d %%ustmp1;\n", bits);
+  fprintf (file, "\t\t.reg.u%d %%ustmp2;\n", bits);
+  fprintf (file, "\t\tmov.u32 %%ustmp0, %%tid.y;\n");
+  fprintf (file, "\t\tmul%s.u32 %%ustmp1, %%ustmp0, 4;\n",
+  bits == 64 ? ".wide" : ".lo");
+  fprintf (file, "\t\tmov.u%d %%ustmp2, __nvptx_uni;\n", bits);
+  fprintf (file, "\t\tadd.u%d %%ustmp2, %%ustmp2, %%ustmp1;\n", bits);
+  fprintf (file, "\t\tld.shared.u32 %%r%d, [%%ustmp2];\n", master);
+  fprintf (file, "\t\tmov.u32 %%ustmp0, %%tid.x;\n");
+  /* 

[Bug c/77965] -Wduplicated-cond should find duplicated condition / identical expressions of form "a || a" or "a && a"

2016-10-14 Thread egall at gwmail dot gwu.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77965

Eric Gallager changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |egall at gwmail dot gwu.edu

--- Comment #1 from Eric Gallager ---
Use -Wlogical-op:

$ /usr/local/bin/gcc -c -Wall -Wextra -pedantic -Wlogical-op -Wduplicated-cond -Wtautological-compare logical_op.c
logical_op.c: In function ‘foo’:
logical_op.c:5:17: warning: logical ‘or’ of equal expressions [-Wlogical-op]
  return (x == 5 || x == 5) ? 1 : 0;
 ^~

[PATCH] Avoid copies in std::scoped_allocator_adaptor piecewise construction

2016-10-14 Thread Jonathan Wakely

While looking at LWG 2511 I realised that we can prevent
scoped_allocator_adaptor::construct(pair*, ...) from making any
copies internally. The transformed tuples that get passed to the
std::pair constructor (with additional allocator arguments) can be
tuples of references even if the incoming tuples are not.

See P0475R0 (which will be in the pre-Issaquah mailing) for a proposal
to resolve LWG 2511 by requiring this behaviour.
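
The underlying idea, turning a tuple of values into a tuple of references to
its elements by expanding an index pack, can be shown outside the library.
The sketch below is not the libstdc++ code: it uses the public C++14
std::index_sequence where the patch uses the internal _Index_tuple, and the
names tracked, as_refs and as_refs_impl are hypothetical.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// Counts copy constructions so that the absence of copies is observable.
struct tracked
{
  static int copies;
  tracked () = default;
  tracked (const tracked &) { ++copies; }
  tracked (tracked &&) noexcept {}
};
int tracked::copies = 0;

// Build a tuple of references to the elements of T.  Each element is
// obtained with std::get on the moved-from tuple, which yields a
// reference rather than a new object.
template<typename... Args, std::size_t... Ind>
std::tuple<Args&&...>
as_refs_impl (std::tuple<Args...> &t, std::index_sequence<Ind...>)
{
  return std::tuple<Args&&...> (std::get<Ind> (std::move (t))...);
}

template<typename... Args>
std::tuple<Args&&...>
as_refs (std::tuple<Args...> &t)
{
  return as_refs_impl (t, std::index_sequence_for<Args...> ());
}
```

The returned tuple refers into the original one, so it must not outlive it;
in the patch the reference tuples are consumed immediately by the pair
constructor.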

* include/std/scoped_allocator (scoped_allocator_adaptor): Forward
piecewise construction arguments as tuples of references, to avoid
copies (related to LWG 2511).
* testsuite/20_util/scoped_allocator/construct_pair.cc: New test.

Tested powerpc64le-linux, committed to trunk.


commit 2af91ec5027f51a60cb548379cccf4fb69e34897
Author: Jonathan Wakely 
Date:   Tue Jul 21 13:14:33 2015 +0100

Avoid copies in std::scoped_allocator_adaptor piecewise construction

* include/std/scoped_allocator (scoped_allocator_adaptor): Forward
piecewise construction arguments as tuples of references, to avoid
copies (related to LWG 2511).
* testsuite/20_util/scoped_allocator/construct_pair.cc: New test.

diff --git a/libstdc++-v3/include/std/scoped_allocator b/libstdc++-v3/include/std/scoped_allocator
index 39762fe..dcb97df 100644
--- a/libstdc++-v3/include/std/scoped_allocator
+++ b/libstdc++-v3/include/std/scoped_allocator
@@ -369,10 +369,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
= __use_alloc<_T1, inner_allocator_type, _Args1...>(__inner);
  auto __y_use_tag
= __use_alloc<_T2, inner_allocator_type, _Args2...>(__inner);
+ typename _Build_index_tuple::__type __x_indices;
+ typename _Build_index_tuple::__type __y_indices;
  typedef __outermost_alloc_traits _O_traits;
  _O_traits::construct(__outermost(*this), __p, piecewise_construct,
-  _M_construct_p(__x_use_tag, __x),
-  _M_construct_p(__y_use_tag, __y));
+  _M_construct_p(__x_use_tag, __x_indices, __x),
+  _M_construct_p(__y_use_tag, __y_indices, __y));
}
 
   template
@@ -428,26 +430,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  const scoped_allocator_adaptor<_OutA2, _InA...>& __b) noexcept;
 
 private:
-  template<typename _Tuple>
-   _Tuple&&
-   _M_construct_p(__uses_alloc0, _Tuple& __t)
+  template<typename _Ind, typename... _Args>
+   tuple<_Args&&...>
+   _M_construct_p(__uses_alloc0, _Ind, tuple<_Args...>& __t)
{ return std::move(__t); }
 
-  template<typename... _Args>
-   std::tuple<allocator_arg_t, inner_allocator_type&, _Args...>
-   _M_construct_p(__uses_alloc1_, std::tuple<_Args...>& __t)
+  template<size_t... _Ind, typename... _Args>
+   tuple<allocator_arg_t, inner_allocator_type&, _Args&&...>
+   _M_construct_p(__uses_alloc1_, _Index_tuple<_Ind...>,
+  tuple<_Args...>& __t)
{
- typedef std::tuple<allocator_arg_t, inner_allocator_type&> _Tuple;
- return std::tuple_cat(_Tuple(allocator_arg, inner_allocator()),
-   std::move(__t));
+ return { allocator_arg, inner_allocator(),
+ std::get<_Ind>(std::move(__t))...
+ };
}
 
-  template<typename... _Args>
-   std::tuple<_Args..., inner_allocator_type&>
-   _M_construct_p(__uses_alloc2_, std::tuple<_Args...>& __t)
+  template<size_t... _Ind, typename... _Args>
+   tuple<_Args&&..., inner_allocator_type&>
+   _M_construct_p(__uses_alloc2_, _Index_tuple<_Ind...>,
+  tuple<_Args...>& __t)
{
- typedef std::tuple<inner_allocator_type&> _Tuple;
- return std::tuple_cat(std::move(__t), _Tuple(inner_allocator()));
+ return { std::get<_Ind>(std::move(__t))..., inner_allocator() };
}
 };
 
diff --git a/libstdc++-v3/testsuite/20_util/scoped_allocator/construct_pair.cc b/libstdc++-v3/testsuite/20_util/scoped_allocator/construct_pair.cc
new file mode 100644
index 000..2996412
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/scoped_allocator/construct_pair.cc
@@ -0,0 +1,81 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++11 } }
+

[Bug c++/71912] [6 regression] flexible array in struct in union rejected

2016-10-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71912

Martin Sebor  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Martin Sebor  ---
Backported to 6-branch in r241168.

[Bug c++/69698] [meta-bug] flexible array members

2016-10-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69698
Bug 69698 depends on bug 71912, which changed state.

Bug 71912 Summary: [6 regression] flexible array in struct in union rejected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71912

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

[Bug c++/71912] [6 regression] flexible array in struct in union rejected

2016-10-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71912

--- Comment #9 from Martin Sebor  ---
Author: msebor
Date: Fri Oct 14 15:37:54 2016
New Revision: 241168

URL: https://gcc.gnu.org/viewcvs?rev=241168&root=gcc&view=rev
Log:
PR c++/71912 - [6/7 regression] flexible array in struct in union rejected

gcc/cp/ChangeLog:
* class.c (struct flexmems_t):  Add members.
(find_flexarrays): Add arguments.  Correct handling of anonymous
structs.
(diagnose_flexarrays): Adjust to issue warnings in addition to errors.
(check_flexarrays): Add argument.
(diagnose_invalid_flexarray): New functions.

gcc/testsuite/ChangeLog:
* g++.dg/ext/flexary4.C: Adjust.
* g++.dg/ext/flexary5.C: Same.
* g++.dg/ext/flexary9.C: Same.
* g++.dg/ext/flexary19.C: New test.
* g++.dg/ext/flexary18.C: New test.
* g++.dg/torture/pr64312.C: Add a dg-error directive to an ill-formed
regression test.
* g++.dg/compat/struct-layout-1_generate.c (subfield): Add argument.
Avoid generating a flexible array member in an array.


Modified:
branches/gcc-6-branch/gcc/cp/ChangeLog
branches/gcc-6-branch/gcc/cp/class.c
branches/gcc-6-branch/gcc/testsuite/ChangeLog
branches/gcc-6-branch/gcc/testsuite/g++.dg/compat/struct-layout-1_generate.c
branches/gcc-6-branch/gcc/testsuite/g++.dg/ext/flexary4.C
branches/gcc-6-branch/gcc/testsuite/g++.dg/ext/flexary5.C
branches/gcc-6-branch/gcc/testsuite/g++.dg/ext/flexary9.C
branches/gcc-6-branch/gcc/testsuite/g++.dg/torture/pr64312.C

DWARF Version 5 Public Review Draft Released

2016-10-14 Thread Michael Eager

DWARF Version 5 Public Review Draft Released
October 14, 2016

The DWARF Debugging Information Format Committee has released the public review draft of Version 5 
of the DWARF Debugging Information Format standard. The DWARF debugging format is used to 
communicate debugging information between a compiler and debugger to make it easier for programmers 
to develop, test, and debug programs.


DWARF is used by a wide range of compilers and debuggers, both proprietary and open source, to 
support debugging of Ada, C, C++, Cobol, FORTRAN, Java, and other programming languages. DWARF
V5 adds support for new languages like Rust, Swift, OCaml, Go, and Haskell, as well as support for 
new features in the older languages.  DWARF can be used with many processor architectures, from 
8-bit to 64-bit.


DWARF is the standard debugging format for Linux and several versions of Unix and is widely used 
with embedded processors. DWARF is designed to be extended easily to support new languages and new 
architectures.


The DWARF Version 5 Standard has been in development for six years.  DWARF Committee members include 
representatives from over a dozen major companies with extensive experience in compiler and debugger 
development.  Version 5 incorporates improvements in many areas:  better data compression, 
separation of debugging data from executable files, improved description of macros and source files, 
faster searching for symbols, improved debugging of optimized code, as well as numerous improvements in 
functionality and performance.


The public review draft of the DWARF Version 5 standard can be downloaded without charge from the DWARF 
website (http://dwarfstd.org). The DWARF Committee will accept public comments on DWARF Version 5 
until November 30, after which a finalized version will be published.  Additional information about 
DWARF, including how to subscribe to the DWARF mailing list, can also be found on the website. 
Questions about the DWARF Debugging Information Format or the DWARF Committee can be directed to the 
DWARF Committee Chair, Michael Eager at i...@dwarfstd.org.


--
Michael Eager, Chair, DWARF Debugging Format Standards Committee
i...@dwarfstd.org  650-325-8077


