date:20150902

Re: [gomp4] expunge shared_size from launch API

2015-09-02 Thread Tom de Vries


On 01-09-15 12:15, Tom de Vries wrote:

On 31/08/15 19:39, Nathan Sidwell wrote:

* builtin-types.def (DEF_FUNCTION_TYPE_VAR_6): Define.


Committed attached follow-up patch to fix the ada build.

Thanks,
- Tom



0001-Fix-gomp-4_0-branch-ada-build.patch


Fix gomp-4_0-branch ada build

2015-09-01  Tom de Vries

* gcc-interface/utils.c (DEF_FUNCTION_TYPE_VAR_6): Define.
---
  gcc/ada/gcc-interface/utils.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index 66ba904..e32a539 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -5376,6 +5376,8 @@ enum c_builtin_type
  #define DEF_FUNCTION_TYPE_VAR_4(NAME, RETURN, ARG1, ARG2, ARG3, ARG4) NAME,
  #define DEF_FUNCTION_TYPE_VAR_5(NAME, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5) \
NAME,
+#define DEF_FUNCTION_TYPE_VAR_6(NAME, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5, \
+   ARG6) NAME,
  #define DEF_FUNCTION_TYPE_VAR_7(NAME, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5, \
ARG6, ARG7) NAME,
  #define DEF_FUNCTION_TYPE_VAR_12(NAME, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5, \
@@ -5399,6 +5401,7 @@ enum c_builtin_type
  #undef DEF_FUNCTION_TYPE_VAR_3
  #undef DEF_FUNCTION_TYPE_VAR_4
  #undef DEF_FUNCTION_TYPE_VAR_5
+#undef DEF_FUNCTION_TYPE_VAR_6
  #undef DEF_FUNCTION_TYPE_VAR_7
  #undef DEF_FUNCTION_TYPE_VAR_12
  #undef DEF_POINTER_TYPE
@@ -5506,6 +5509,9 @@ install_builtin_function_types (void)
def_fn_type (ENUM, RETURN, 1, 4, ARG1, ARG2, ARG3, ARG4);
  #define DEF_FUNCTION_TYPE_VAR_5(ENUM, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5) \
def_fn_type (ENUM, RETURN, 1, 5, ARG1, ARG2, ARG3, ARG4, ARG5);
+#define DEF_FUNCTION_TYPE_VAR_6(ENUM, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5, \
+   ARG6)   \
+  def_fn_type (ENUM, RETURN, 1, 7, ARG1, ARG2, ARG3, ARG4, ARG5, ARG6);


And of course, the argument for the 'n' parameter of def_fn_type should be '6' 
for DEF_FUNCTION_TYPE_VAR_6, not '7'.


Fixed in attached patch, committed.

Thanks,
- Tom
Fix DEF_FUNCTION_TYPE_VAR_6 in install_builtin_function_types

2015-09-02  Tom de Vries  

	* gcc-interface/utils.c (DEF_FUNCTION_TYPE_VAR_6): Fix define in
	install_builtin_function_types
---
 gcc/ada/gcc-interface/utils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index e32a539..2ffc436 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -5511,7 +5511,7 @@ install_builtin_function_types (void)
   def_fn_type (ENUM, RETURN, 1, 5, ARG1, ARG2, ARG3, ARG4, ARG5);
 #define DEF_FUNCTION_TYPE_VAR_6(ENUM, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5, \
 ARG6)\
-  def_fn_type (ENUM, RETURN, 1, 7, ARG1, ARG2, ARG3, ARG4, ARG5, ARG6);
+  def_fn_type (ENUM, RETURN, 1, 6, ARG1, ARG2, ARG3, ARG4, ARG5, ARG6);
 #define DEF_FUNCTION_TYPE_VAR_7(ENUM, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5, \
 ARG6, ARG7)\
   def_fn_type (ENUM, RETURN, 1, 7, ARG1, ARG2, ARG3, ARG4, ARG5, ARG6, ARG7);
-- 
1.9.1
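
Both hunks above touch the same macro because the builtin type list is expanded twice: once to produce enumerators and once to emit def_fn_type calls. The following is a minimal sketch of that pattern, not the real GCC code. TOY_BUILTIN_TYPES, toy_def_fn_type and every other toy_* name are hypothetical stand-ins, and argument types are modelled as plain strings. In this toy the registrar reads exactly n variadic arguments, so passing a count of 7 with six arguments would read past what the caller supplied.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for builtin-types.def: one entry per type.  */
#define TOY_BUILTIN_TYPES \
  DEF_FUNCTION_TYPE_VAR_5 (TOY_FN_5, "void", "a1", "a2", "a3", "a4", "a5") \
  DEF_FUNCTION_TYPE_VAR_6 (TOY_FN_6, "void", "a1", "a2", "a3", "a4", "a5", "a6")

/* First expansion: each entry contributes one enumerator.  */
#define DEF_FUNCTION_TYPE_VAR_5(NAME, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5) \
  NAME,
#define DEF_FUNCTION_TYPE_VAR_6(NAME, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5, \
                                ARG6) NAME,
enum toy_builtin_type { TOY_BUILTIN_TYPES TOY_LAST };
#undef DEF_FUNCTION_TYPE_VAR_5
#undef DEF_FUNCTION_TYPE_VAR_6

/* What the toy registrar recorded for each enumerator.  */
int toy_arg_count[TOY_LAST];
const char *toy_last_arg[TOY_LAST];

/* Toy registrar: consumes exactly N variadic arguments, so N must equal
   the number of arguments the caller actually passed.  */
static void
toy_def_fn_type (enum toy_builtin_type def, const char *ret, int var,
                 int n, ...)
{
  va_list ap;
  const char *arg = NULL;

  assert (ret != NULL && n >= 0);
  (void) var;
  va_start (ap, n);
  for (int i = 0; i < n; i++)
    arg = va_arg (ap, char *);
  va_end (ap);
  toy_arg_count[def] = n;
  toy_last_arg[def] = arg;
}

/* Second expansion: each entry becomes a registration call whose count
   matches the macro's arity.  */
#define DEF_FUNCTION_TYPE_VAR_5(ENUM, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5) \
  toy_def_fn_type (ENUM, RETURN, 1, 5, ARG1, ARG2, ARG3, ARG4, ARG5);
#define DEF_FUNCTION_TYPE_VAR_6(ENUM, RETURN, ARG1, ARG2, ARG3, ARG4, ARG5, \
                                ARG6) \
  toy_def_fn_type (ENUM, RETURN, 1, 6, ARG1, ARG2, ARG3, ARG4, ARG5, ARG6);

void
toy_install_builtin_function_types (void)
{
  TOY_BUILTIN_TYPES
}
#undef DEF_FUNCTION_TYPE_VAR_5
#undef DEF_FUNCTION_TYPE_VAR_6
```

Calling toy_install_builtin_function_types () once fills both tables; the entry for TOY_FN_6 then records six arguments, the last being "a6".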

Re: [Patch, libfortran] PR 67414 Improve error handling

2015-09-02 Thread FX

> Hmm, in that case errnum must be set to 0. What about the attached
> patch, which prints the existing message if errnum == 0, and the new
> and improved only for errnum > 0?

OK. Thanks for the patch.

FX

Re: [PATCH v2] [libstdc++] Run tests on RTEMS

2015-09-02 Thread Sebastian Huber


On 01/09/15 23:07, Jeff Law wrote:

On 09/01/2015 05:02 AM, Sebastian Huber wrote:

v2: Include all options and not only "dg-do run ...".

libstdc++-v3/ChangeLog
2015-09-01  Sebastian Huber 

testsuite/*: Use 's/\*-\*-cygwin\* /&*-*-rtems* /' to add RTEMS
target selector to all tests that run on Cygwin.

So presumably those tests actually run correctly :-)


Not all, but it's not that bad:

Target is arm-unknown-rtems4.11
Host   is arm-unknown-rtems4.11
Build  is x86_64-pc-linux-gnu

=== libstdc++ tests ===

Schedule of variations:
rtems-arm-realview_pbx_a9_qemu/-march=armv7-a/-mthumb/-mfpu=neon/-mfloat-abi=hard

Running target rtems-arm-realview_pbx_a9_qemu/-march=armv7-a/-mthumb/-mfpu=neon/-mfloat-abi=hard
Using /scratch/git-rtems-testing/dejagnu/boards/rtems-arm-realview_pbx_a9_qemu.exp as board description file for target.
Using /usr/share/dejagnu/config/sim.exp as generic interface file for target.
Using /usr/share/dejagnu/baseboards/basic-sim.exp as board description file for target.
Using /home/EB/sebastian_h/archive/gcc-git/libstdc++-v3/testsuite/config/default.exp as tool-and-target-specific interface file.
Running /home/EB/sebastian_h/archive/gcc-git/libstdc++-v3/testsuite/libstdc++-abi/abi.exp ...
Running /home/EB/sebastian_h/archive/gcc-git/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp ...

FAIL: 25_algorithms/copy/streambuf_iterators/wchar_t/4.cc execution test
FAIL: 25_algorithms/find/istreambuf_iterators/wchar_t/2.cc execution test
FAIL: 25_algorithms/random_shuffle/moveable.cc execution test
FAIL: 27_io/basic_istream/extractors_other/wchar_t/2.cc execution test
FAIL: 27_io/basic_istream/get/wchar_t/2.cc execution test
FAIL: 27_io/basic_istream/ignore/wchar_t/3.cc execution test
FAIL: 27_io/basic_istream/seekg/wchar_t/sstream.cc execution test
FAIL: 27_io/basic_istream/tellg/wchar_t/sstream.cc execution test
FAIL: 27_io/basic_ostream/inserters_other/wchar_t/1.cc execution test
FAIL: 27_io/basic_stringbuf/setbuf/char/4.cc execution test
FAIL: 27_io/objects/wchar_t/12048-1.cc execution test
FAIL: 27_io/objects/wchar_t/12048-2.cc execution test
FAIL: 27_io/objects/wchar_t/12048-3.cc execution test
FAIL: 27_io/objects/wchar_t/12048-4.cc execution test
WARNING: program timed out.
FAIL: 30_threads/async/42819.cc execution test
WARNING: program timed out.
FAIL: 30_threads/async/49668.cc execution test
WARNING: program timed out.
FAIL: 30_threads/async/any.cc execution test
WARNING: program timed out.
FAIL: 30_threads/async/async.cc execution test
WARNING: program timed out.
FAIL: 30_threads/condition_variable/members/3.cc execution test
FAIL: 30_threads/shared_timed_mutex/try_lock/3.cc execution test
WARNING: program timed out.
FAIL: 30_threads/thread/native_handle/cancel.cc execution test
FAIL: 30_threads/timed_mutex/try_lock_until/57641.cc execution test
FAIL: tr1/8_c_compatibility/complex/50880.cc (test for excess errors)
WARNING: tr1/8_c_compatibility/complex/50880.cc compilation failed to produce executable

FAIL: tr1/8_c_compatibility/complex/functions.cc (test for excess errors)
Running /home/EB/sebastian_h/archive/gcc-git/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp ...
Running /home/EB/sebastian_h/archive/gcc-git/libstdc++-v3/testsuite/libstdc++-xmethods/xmethods.exp ...


=== libstdc++ Summary ===

# of expected passes            9029
# of unexpected failures        24
# of expected failures          65
# of unsupported tests          726

One issue is a thread cancel/exit misbehaviour/deviation from glibc in 
RTEMS. Another issue is that the files under libstdc++-v3/testsuite/data 
are currently not available in our test driver which uses Qemu.




I don't think the ChangeLog is strictly OK according to standards. 
Every file changed is supposed to be listed.  I know it's a pain, but 
until we change those requirements it's probably best to stick with 
current standards.


Given a context diff or a unidiff, contrib/mklog can generate a 
skeleton ChangeLog entry for all the referenced files.


I think

* firstfile: What changed.
* secondfile: Likewise.
* thirdfile: Likewise.

Is fine.

OK with the fixed ChangeLog.

jeff


My first ChangeLog looked like this, but then I found this:

2014-05-23  Jonathan Wakely  

PR libstdc++/60793
* testsuite/*: Use 's/\*-\*-freebsd\* /&*-*-dragonfly* /' to add
dragonfly target selector to all tests that run on freebsd.

I will fix the ChangeLog.

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.

Re: [Fortran, committed] XFAIL read_dir.f90 on FreeBSD

2015-09-02 Thread Janne Blomqvist

On Wed, Sep 2, 2015 at 1:28 AM, Jerry DeLisle  wrote:
> On 09/01/2015 11:18 AM, Steve Kargl wrote:
>> On Tue, Sep 01, 2015 at 11:16:27AM -0700, Steve Kargl wrote:
>>> open(unit=10, file='junko.dir',iostat=ios,action='read',access='stream')
>>> if (ios.ne.0) call abort
>>> read(10, iostat=ios) c
>>> -   if (ios.ne.21) call abort
>>> +   if (ios.ne.21) then
>>> +  close(10)
>>
>> I forgot to mention that 'close(10, status="delete')' does not
>> work on a directory.  Should it?
>>
>>> +  call system('rmdir junko.dir')
>>> +  call abort
>>> +   end if
>>> +   close(10)
>>> call system('rmdir junko.dir')
>>
>
> Thanks for the touch up Steve.  I suspect other OS's will not work either.  I
> assumed close with Status="delete" would not work on a directory.

That's because libgfortran uses unlink(2), which only works for files,
not directories. One could change that to use remove(3), which works
for both.

Also, I suspect the reason why it fails on freebsd is that errno
EISDIR is not 21 there. Perhaps one should just check for ios /= 0?
Unfortunately the __linux__ define isn't available since the switch to
cpp (IIRC there is a PR for that), so it's not that straightforward to
check which platform one runs on.

-- 
Janne Blomqvist
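
The difference between unlink(2) and remove(3) described above can be tried out with a small POSIX program. This is only a sketch under stated assumptions: toy_delete_directory is a hypothetical helper, /tmp is assumed to be writable, and the errno that unlink() reports for a directory differs between systems, so it is returned rather than compared against a fixed number.

```c
#define _XOPEN_SOURCE 700       /* for mkdtemp */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Creates an empty scratch directory, tries unlink() on it and then
   falls back to remove().  Stores the errno from unlink() in
   *UNLINK_ERRNO (0 if unlink() unexpectedly succeeded).  Returns 0 if
   the directory is gone afterwards and -1 otherwise.  */
int
toy_delete_directory (int *unlink_errno)
{
  char path[] = "/tmp/toy-dirXXXXXX";

  assert (unlink_errno != NULL);
  if (mkdtemp (path) == NULL)
    return -1;

  *unlink_errno = 0;
  if (unlink (path) == 0)
    return 0;   /* Some systems let a privileged user unlink a directory.  */
  *unlink_errno = errno;

  /* remove() behaves like rmdir() for a directory, so the directory
     has to be empty for this to succeed.  */
  return remove (path) == 0 ? 0 : -1;
}
```

On a typical system the function returns 0 and *unlink_errno holds a nonzero value such as EISDIR or EPERM, which is the behaviour the close(10, status="delete") case runs into.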

[Patch, avr] Fix PR65210

2015-09-02 Thread Senthil Kumar Selvaraj

Hi,

  This (rather trivial) patch fixes PR65210. The ICE happens because code
  wasn't handling io_low attribute where it is supposed to.

  If this is ok, could someone commit please? I don't have commit
  access.

Regards
Senthil

gcc/ChangeLog

2015-09-02  Senthil Kumar Selvaraj  

PR target/65210
* config/avr/avr.c (avr_eval_addr_attrib): Look for io_low
attribute as well.

gcc/testsuite/ChangeLog

PR target/65210
* gcc.target/avr/pr65210.c: New test.

diff --git gcc/config/avr/avr.c gcc/config/avr/avr.c
index bec9a8b..9f5bc88 100644
--- gcc/config/avr/avr.c
+++ gcc/config/avr/avr.c
@@ -9069,6 +9069,8 @@ avr_eval_addr_attrib (rtx x)
   if (SYMBOL_REF_FLAGS (x) & SYMBOL_FLAG_IO)
{
  attr = lookup_attribute ("io", DECL_ATTRIBUTES (decl));
+ if (!attr || !TREE_VALUE (attr))
+   attr = lookup_attribute ("io_low", DECL_ATTRIBUTES (decl));
  gcc_assert (attr);
}
   if (!attr || !TREE_VALUE (attr))
diff --git gcc/testsuite/gcc.target/avr/pr65210.c 
gcc/testsuite/gcc.target/avr/pr65210.c
new file mode 100644
index 000..1aed441
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr65210.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+
+/* This testcase exposes PR65210. Usage of the io_low attribute
+   causes assertion failure because code only looks for the io
+   attribute if SYMBOL_FLAG_IO is set. */
+
+volatile char q __attribute__((io_low,address(0x81)));
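
The lookup order introduced by the avr.c hunk can be modelled outside the compiler. The sketch below is a toy, not GCC code: struct toy_attr, toy_lookup_attribute and toy_find_io_attr are hypothetical names, and a linked list of name/value pairs stands in for DECL_ATTRIBUTES. It follows the hunk as posted: look for "io" first, fall back to "io_low" when "io" is absent or has no argument, then assert that something was found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy attribute list entry.  */
struct toy_attr
{
  const char *name;
  const char *value;            /* NULL when the attribute has no argument.  */
  const struct toy_attr *next;
};

/* Returns the first entry called NAME, or NULL if there is none.  */
static const struct toy_attr *
toy_lookup_attribute (const char *name, const struct toy_attr *list)
{
  for (; list != NULL; list = list->next)
    if (strcmp (list->name, name) == 0)
      return list;
  return NULL;
}

/* Prefer "io"; fall back to "io_low" when "io" is missing or has no
   argument.  The assert stands in for gcc_assert (attr).  */
const struct toy_attr *
toy_find_io_attr (const struct toy_attr *attrs)
{
  const struct toy_attr *attr = toy_lookup_attribute ("io", attrs);

  if (attr == NULL || attr->value == NULL)
    attr = toy_lookup_attribute ("io_low", attrs);
  assert (attr != NULL);
  return attr;
}
```

For a list that models __attribute__((io_low,address(0x81))), the function returns the "io_low" entry instead of tripping the assertion, which is the case the new test exercises.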

Re: [gomp4] declare directive

2015-09-02 Thread Tom de Vries


On 12-08-15 20:31, James Norris wrote:

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index bc54067..eee5340 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -5864,8 +5864,7 @@ void
  finish_oacc_declare (gfc_namespace *ns, enum sym_flavor flavor)
  {
gfc_code *code, *end_c, *code2;
-  gfc_oacc_declare *oc;
-  gfc_omp_clauses *omp_clauses = NULL, *ret_clauses = NULL;
+  gfc_oacc_declare *oc, *new_oc;
gfc_omp_namelist *n;
locus where = gfc_current_locus;



This introduces an unused variable new_oc.

Attached patch removes that and some other unused variables in 
finish_oacc_declare.

Committed.

Thanks,
- Tom
Remove unused vars in finish_oacc_declare

2015-09-02  Tom de Vries  

	* trans-decl.c (finish_oacc_declare): Remove unused variables.
---
 gcc/fortran/trans-decl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index c84b098..39acabd 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -5890,8 +5890,8 @@ find_module_oacc_declare_clauses (gfc_symbol *sym)
 void
 finish_oacc_declare (gfc_namespace *ns, enum sym_flavor flavor)
 {
-  gfc_code *code, *end_c, *code2;
-  gfc_oacc_declare *oc, *new_oc;
+  gfc_code *code;
+  gfc_oacc_declare *oc;
   gfc_omp_namelist *n;
   locus where = gfc_current_locus;
 
-- 
1.9.1

Re: [PATCH][wwwdocs][AArch64] Add entry for target attributes and pragmas

2015-09-02 Thread Kyrill Tkachov


Hi Gerald,

On 01/09/15 18:29, Gerald Pfeifer wrote:

On Tue, 1 Sep 2015, Kyrill Tkachov wrote:

This wwwdocs patch adds an entry to the GCC 6 changes page about the
aarch64 target attributes and pragmas support.

Thanks for thinking of this, Kyrill.


Thanks for the feedback.



Index: htdocs/gcc-6/changes.html
===
+ 
+   The AArch64 port now supports target attributes and pragmas.  Please
+   refer to the documentation for details of available attributes and
+   pragmas as well as usage instructions.
+ 

Here, isn't the second sentence the default assumption anyway,
that is, do we need to highlight it specifically or can we omit
it?


My thinking was that when we introduce some new command-line option we list it
here and give a short description of it (new -mcpu values, for example).
However, here we introduce about 10 new target attributes and pragmas and
listing them all would make this entry too long for my liking so as a
shorthand for listing them all I chose to point to the documentation.

Unless you feel strongly against this reasoning I'd like to commit the
patch as is within 48 hours.




Please consider this feedback (and "let's keep it as is" is a
fine response, too) and go ahead.

Gerald

Re: [PING] Re: [PATCH] c/66516 - missing diagnostic on taking the address of a builtin function

2015-09-02 Thread Jason Merrill


On 09/01/2015 06:25 PM, Martin Sebor wrote:

Having now made this change, I don't think the added complexity
of three declarations and two trivial definitions of the new
c_decl_implicit function across five files is an improvement,


Three declarations?  Isn't declaring it in c-common.h enough?


especially since we're still checking for c_dialect_cxx().
(The change simply replaces:

   && (c_dialect_cxx () || !DECL_LANG_FLAG_2 (expr))

with

   && (c_dialect_cxx () || c_decl_implicit (expr))

as suggested.)


Seems like you can do without the check for C++ if you're defining this 
function for C++ (to just return false).


I agree with Joseph that the function is better.


+bool diag /* = true */)


Let's call these parameters "reject_builtin" rather than the generic "diag".


+function = decay_conversion (function, complain, false);


Please add a comment to "false", e.g. /*reject_builtin*/false

Jason

RE: [PATCH][RTL-ifcvt] Make non-conditional execution if-conversion more aggressive

2015-09-02 Thread Zamyatin, Igor

> 
> 
> On 19/08/15 17:57, Jeff Law wrote:
> > On 08/12/2015 08:31 AM, Kyrill Tkachov wrote:
> >> 2015-08-10  Kyrylo Tkachov 
> >>
> >>   * ifcvt.c (struct noce_if_info): Add then_simple, else_simple,
> >>   then_cost, else_cost fields.  Change branch_cost field to
> >> unsigned int.
> >>   (end_ifcvt_sequence): Call set_used_flags on each insn in the
> >>   sequence.
> >>   Include rtl-iter.h.
> >>   (noce_simple_bbs): New function.
> >>   (noce_try_move): Bail if basic blocks are not simple.
> >>   (noce_try_store_flag): Likewise.
> >>   (noce_try_store_flag_constants): Likewise.
> >>   (noce_try_addcc): Likewise.
> >>   (noce_try_store_flag_mask): Likewise.
> >>   (noce_try_cmove): Likewise.
> >>   (noce_try_minmax): Likewise.
> >>   (noce_try_abs): Likewise.
> >>   (noce_try_sign_mask): Likewise.
> >>   (noce_try_bitop): Likewise.
> >>   (bbs_ok_for_cmove_arith): New function.
> >>   (noce_emit_all_but_last): Likewise.
> >>   (noce_emit_insn): Likewise.
> >>   (noce_emit_bb): Likewise.
> >>   (noce_try_cmove_arith): Handle non-simple basic blocks.
> >>   (insn_valid_noce_process_p): New function.
> >>   (contains_mem_rtx_p): Likewise.
> >>   (bb_valid_for_noce_process_p): Likewise.
> >>   (noce_process_if_block): Allow non-simple basic blocks
> >>   where appropriate.
> >>
> >> 2015-08-11  Kyrylo Tkachov 
> >>
> >>   * gcc.dg/ifcvt-1.c: New test.
> >>   * gcc.dg/ifcvt-2.c: Likewise.
> >>   * gcc.dg/ifcvt-3.c: Likewise.

Looks like ifcvt-3.c fails on x86_64. I see

New failures:
FAIL: gcc.dg/ifcvt-3.c scan-rtl-dump ce1 "3 true changes made"

Could you please take a look?

Thanks,
Igor

Re: [PR64164] drop copyrename, integrate into expand

2015-09-02 Thread Alan Lawrence


On 14/08/15 19:57, Alexandre Oliva wrote:


I'm glad it appears to be working to everyone's
satisfaction now.  I've just committed it as r226901, with only a
context adjustment to account for a change in use_register_for_decl in
function.c.  /me crosses fingers :-)

Here's the patch as checked in:


One more failure to report, I'm afraid. On AArch64 Bigendian, 
aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from r227348):


In file included from /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c:14:0:
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c: In function 'func_return_val_10':
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:12:24: internal compiler error: in simplify_subreg, at simplify-rtx.c:5808
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:13:40: note: in definition of macro 'FUNC_NAME_COMBINE'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:15:27: note: in expansion of macro 'FUNC_NAME_1'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:15:39: note: in expansion of macro 'FUNC_BASE_NAME'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:69:33: note: in expansion of macro 'FUNC_NAME'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c:23:1: note: in expansion of macro 'FUNC_VAL_CHECK'
0xa7ba44 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
/work/alalaw01/src/gcc/gcc/simplify-rtx.c:5808
0xa7c4ef simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
/work/alalaw01/src/gcc/gcc/simplify-rtx.c:6031
0x7ad097 operand_subword(rtx_def*, unsigned int, int, machine_mode)
/work/alalaw01/src/gcc/gcc/emit-rtl.c:1611
0x7def4e move_block_from_reg(int, rtx_def*, int)
/work/alalaw01/src/gcc/gcc/expr.c:1536
0x83a494 assign_parm_setup_block
/work/alalaw01/src/gcc/gcc/function.c:3117
0x841a43 assign_parms
/work/alalaw01/src/gcc/gcc/function.c:3857
0x842ffa expand_function_start(tree_node*)
/work/alalaw01/src/gcc/gcc/function.c:5286
0x6e7496 execute
/work/alalaw01/src/gcc/gcc/cfgexpand.c:6203
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c compilation,  -O1  (internal compiler error)


Also at -O2, -O3 -g, -Og -g, -Os. -O0 is OK.

simplify_subreg is called with outermode=DImode, op=

(concat:CHI (reg:HI 76 [ t ])
(reg:HI 77 [ t+2 ]))

innermode = BLKmode (which violates the assertion), byte=0.

move_block_from_reg (in expr.c) calls operand_subword(x, i, 1, BLKmode), here 
i=0 and x is the concat:CHI above, and operand_subword doesn't handle that case 
(well, it passes it onto simplify_subreg).


In assign_parm_setup_block, I see 'mem = validize_mem (copy_rtx (stack_parm))' 
where stack_parm is again the same concat:CHI.


This should be easily reproducible with a stage 1 compiler 
(aarch64_be-none-elf).

--Alan

RFC: Combine of compare & and oddity

2015-09-02 Thread Wilco Dijkstra

Hi,

Combine canonicalizes certain AND masks in a comparison with zero into extracts
of the widest register type. During matching these are expanded into a very
inefficient sequence that fails to match. For example (x & 2) == 0 is matched
in combine like this:

Failed to match this instruction:
(set (reg:CC 66 cc)
(compare:CC (zero_extract:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 0)
(const_int 1 [0x1])
(const_int 1 [0x1]))
(const_int 0 [0])))
Failed to match this instruction:
(set (reg:CC 66 cc)
(compare:CC (and:DI (lshiftrt:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 0)
(const_int 1 [0x1]))
(const_int 1 [0x1]))
(const_int 0 [0])))

Neither matches the AArch64 patterns for ANDS/TST (which is just compare and
AND). If the immediate is not a power of 2 or a power of 2 minus 1 then it
matches correctly as expected.

I don't understand how ((x >> 1) & 1) != 0 could be a useful expansion (it even
uses shifts by 0 at times, which are unlikely to ever match anything). Why does
combine not try to match the obvious (x & C) != 0 case? Single-bit and mask
tests are very common, so this blocks efficient code generation on many
targets.

It's trivial to change change_zero_ext to expand extracts always into AND and
remove the redundant subreg. However, wouldn't it make more sense to never
special-case certain AND immediates in the first place?
Wilco
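
For reference, the two shapes being compared compute the same predicate. The sketch below states that equivalence at the C level; it says nothing about what combine or any backend emits. The function names are hypothetical, and the widening to 64 bits only mirrors the DImode subreg in the dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* (x & C) != 0 with C a single-bit mask: the form a compare-and-AND
   pattern such as ANDS/TST could match directly.  */
bool
toy_test_bit_and (uint32_t x, unsigned bit)
{
  assert (bit < 32);
  return (x & (UINT32_C (1) << bit)) != 0;
}

/* ((x >> bit) & 1) != 0 on a widened value: the shape of the
   lshiftrt/and expansion in the second dump.  */
bool
toy_test_bit_shift (uint32_t x, unsigned bit)
{
  assert (bit < 32);
  return (((uint64_t) x >> bit) & 1) != 0;
}

/* Checks that both forms agree for every bit of a few sample values.  */
void
toy_bit_test_demo (void)
{
  const uint32_t samples[] = { 0, 1, 2, 3, 0x80000000u, 0xffffffffu,
                               0x12345678u };

  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (unsigned bit = 0; bit < 32; bit++)
      assert (toy_test_bit_and (samples[i], bit)
              == toy_test_bit_shift (samples[i], bit));
}
```

With bit = 1 the first function is the (x & 2) != 0 test from the example above, and the second is its shift-and-mask expansion.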

Patch GCC for profile-func-internal-id=0 coverage-callback=1

2015-09-02 Thread Matt Deeds

Hello, Honza.  David Li said you might be able to help me get this
patch into GCC trunk.  I sent mail for this on August 27, but didn't
get a reply.  It's a small change to make these two options work
together:

profile-func-internal-id=0 coverage-callback=1

Let me know what I can do to get this submitted.

This patch is for svn://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9.  I add
support for the profile-func-internal-id parameter in the instrumentation
generated for __coverage_callback.

Add support for the profile-func-internal-id parameter to the coverage callback.
Without this change, the function identifier passed to __coverage_callback
(enabled with param=coverage-callback=1) does not match the values emitted in
the .gcno file.  Because the function profile_id is typically more unique
(typically 32 bits) than the function internal id (typically 16 bits), it can be
desirable to have the profile_id used to identify a function as opposed to the
function internal id.

I've instrumented a large binary creating over 500 .gcno files and confirmed
that function IDs in these .gcno files match the IDs in __coverage_callback.  In
my example, there were typically about one to four functions sharing the same
internal function ID.  There were no collisions using profile_id.


Index: gcc/tree-profile.c
===
--- gcc/tree-profile.c (revision 226647)
+++ gcc/tree-profile.c (working copy)
@@ -864,8 +864,20 @@ gimple_gen_edge_profiler (int edgeno, edge e)
 {
   gimple call;
   tree tree_edgeno = build_int_cst (gcov_type_node, edgeno);
-  tree tree_uid = build_int_cst (gcov_type_node,
+
+  tree tree_uid;
+  if (PARAM_VALUE (PARAM_PROFILE_FUNC_INTERNAL_ID))
+{
+  tree_uid  = build_int_cst (gcov_type_node,
  current_function_funcdef_no);
+}
+  else
+{
+  gcc_assert (coverage_node_map_initialized_p ());
+
+  tree_uid = build_int_cst
+  (gcov_type_node, cgraph_get_node (current_function_decl)->profile_id);
+}
   tree callback_fn_type
   = build_function_type_list (void_type_node,
   gcov_type_node,

Re: [Fortran, committed] XFAIL read_dir.f90 on FreeBSD

2015-09-02 Thread Steve Kargl

On Wed, Sep 02, 2015 at 11:30:07AM +0300, Janne Blomqvist wrote:
> On Wed, Sep 2, 2015 at 1:28 AM, Jerry DeLisle  wrote:
> > On 09/01/2015 11:18 AM, Steve Kargl wrote:
> >> On Tue, Sep 01, 2015 at 11:16:27AM -0700, Steve Kargl wrote:
> >>> open(unit=10, file='junko.dir',iostat=ios,action='read',access='stream')
> >>> if (ios.ne.0) call abort
> >>> read(10, iostat=ios) c
> >>> -   if (ios.ne.21) call abort
> >>> +   if (ios.ne.21) then
> >>> +  close(10)
> >>
> >> I forgot to mention that 'close(10, status="delete')' does not
> >> work on a directory.  Should it?
> >>
> >>> +  call system('rmdir junko.dir')
> >>> +  call abort
> >>> +   end if
> >>> +   close(10)
> >>> call system('rmdir junko.dir')
> >>
> >
> > Thanks for the touch up Steve.  I suspect other OS's will not work either.  I
> > assumed close with Status="delete" would not work on a directory.
> 
> That's because libgfortran uses unlink(2), which only works for files,
> not directories. One could change that to use remove(3), which works
> for both.

I suspect people who create directories and then 
want to delete them will use SYSTEM or the 
standard conforming equivalent.

> Also, I suspect the reason why it fails on freebsd is that errno
> EISDIR is not 21 there. Perhaps one should just check for ios /= 0?

I checked.  FreeBSD's EISDIR is 21; however, ios == 0 in this
case.  I haven't looked too deep.  FreeBSD is probably
adhering to the unix philosophy of "everything is a file".

-- 
Steve
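
What read(2) reports for a directory on a given system can be checked directly, independent of libgfortran. The following is a minimal sketch; toy_read_errno is a hypothetical helper, and the result is meant to be compared against the symbolic EISDIR rather than a literal 21.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Opens PATH read-only and attempts a one-byte read.  Returns 0 if
   read() succeeded, the errno value if read() failed, or -1 if PATH
   could not be opened at all.  */
int
toy_read_errno (const char *path)
{
  char c;
  int fd, result;

  assert (path != NULL);
  fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;
  result = read (fd, &c, 1) < 0 ? errno : 0;
  close (fd);
  return result;
}
```

A result equal to EISDIR for toy_read_errno (".") corresponds to the ios == 21 case the test expects; a result of 0 corresponds to the ios == 0 behaviour reported above.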

Re: [PATCH] add initial support for J2 core to sh target

2015-09-02 Thread Oleg Endo


On 02 Sep 2015, at 02:08, Rich Felker  wrote:

> On Wed, Sep 02, 2015 at 01:24:55AM +0900, Oleg Endo wrote:
>>> I'm not sure what the best way to achieve multiple goals is, but the
>>> current behavior makes it so you need --isa=any (and a final binary
>>> with weird ABI tag) to have a binary that supports atomic operations
>>> on any SH model. musl libc already has such support (except the new J2
>>> CAS instruction) and I would like to eventually provide a libatomic
>>> approach for GCC too so that it's possible to use __sync/C11 atomics
>>> and have the binary be safe to run on any model that supports the
>>> baseline ISA & ABI you built for (e.g. all >=SH2 if you used -m2).
>> 
>> I don't know the details of your implementation. The compiler
>> generated atomic sequences are not really compatible. The safest
>> thing is not to enable any atomic model in GCC and let it emit
>> function calls to __atomic*.
> 
> Exactly -- but then, libatomic.a needs to contain J2-specific cas.l
> opcodes and SH4A-specific movli.l/movco.l opcodes and code that
> selects at runtime which to use (or whether to use imask or gusa)
> based on hwcap, etc. The point is that a mix of opcodes for different
> ISA levels end up being in the final binary, which might otherwise be
> targeted for SH-2 baseline so it can run on any of them.
> 
>>> I have a patch for that part, just not expanding the
>>> already-very-complex SH "family-tree" of instruction support. However
>>> it's likely that encoding details will change (the draft encoding
>>> overlaps with something used by SH2A IIRC, and the intent was not to
>>> have such overlap)
>> 
>> Yeah, it overlaps with the first 16 bit word of the 32 bit SH2A
>> load/store insns.
>> 
>>> so I'm holding off on submitting it until the
>>> hardware side works out this issue.
>> 
>> Sounds reasonable.
> 
> In the meantime, do you have any suggestions on how the ISA level
> stuff should be done to add J2 on the binutils side?

Let's continue this topic on the binutils list
( https://sourceware.org/ml/binutils/2015-09/msg00031.html )
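
The runtime selection discussed in the quoted text (pick an atomic implementation based on hwcap) can be outlined with a function pointer. This is a sketch only. The TOY_HWCAP_NATIVE_CAS bit and all toy_* names are hypothetical, the "native" path uses the GCC/Clang __atomic_compare_exchange_n builtin as a portable stand-in for cas.l or movli.l/movco.l, and the fallback is a plain placeholder that is not atomic by itself.

```c
#include <assert.h>
#include <stdbool.h>

typedef bool (*toy_cas_fn) (int *ptr, int expected, int desired);

/* Hypothetical capability bit; real hwcap values are platform-defined.  */
#define TOY_HWCAP_NATIVE_CAS 0x1UL

/* Stand-in for a path built on a native instruction.  */
static bool
toy_cas_native (int *ptr, int expected, int desired)
{
  return __atomic_compare_exchange_n (ptr, &expected, desired, false,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Stand-in for a uniprocessor fallback such as imask or gusa.  A real
   version would make this sequence uninterruptible; this placeholder
   is NOT atomic.  */
static bool
toy_cas_fallback (int *ptr, int expected, int desired)
{
  if (*ptr != expected)
    return false;
  *ptr = desired;
  return true;
}

/* Chooses an implementation once, based on the capability bits.  */
toy_cas_fn
toy_select_cas (unsigned long hwcap)
{
  return (hwcap & TOY_HWCAP_NATIVE_CAS) ? toy_cas_native : toy_cas_fallback;
}

/* Both choices must behave the same from the caller's point of view.  */
void
toy_cas_demo (void)
{
  for (unsigned long hwcap = 0; hwcap <= TOY_HWCAP_NATIVE_CAS; hwcap++)
    {
      toy_cas_fn cas = toy_select_cas (hwcap);
      int v = 1;

      assert (cas (&v, 1, 2) && v == 2);    /* Match: value is swapped.  */
      assert (!cas (&v, 1, 3) && v == 2);   /* Mismatch: value unchanged.  */
    }
}
```

The point of the design is the one raised in the thread: the final binary carries every variant and decides at run time, so the same object can run on any model that supports the baseline ISA.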

Re: patch for PR61578

2015-09-02 Thread Christophe Lyon

Hi Vladimir,



On 1 September 2015 at 21:39, Vladimir Makarov  wrote:
>   The following patch is for
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578
>
>   The patch was bootstrapped and tested on x86 and x86-64.
>
>   Committed as rev. 227382.
>

Since this patch, I can see:
  gcc.dg/vect/slp-perm-5.c (internal compiler error)
  gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects (internal compiler error)

on arm* targets.

Can you have a look?

Thanks,

Christophe.


> 2015-09-01  Vladimir Makarov  
>
> PR target/61578
> * lra-lives.c (process_bb_lives): Process move pseudos with the
> same value for copies and preferences
> * lra-constraints.c (match_reload): Create match reload pseudo
> with the same value from single dying input pseudo.
>

Re: [PATCH][RTL-ifcvt] Make non-conditional execution if-conversion more aggressive

2015-09-02 Thread Kyrill Tkachov



On 02/09/15 16:18, Zamyatin, Igor wrote:


On 19/08/15 17:57, Jeff Law wrote:

On 08/12/2015 08:31 AM, Kyrill Tkachov wrote:

2015-08-10  Kyrylo Tkachov 

   * ifcvt.c (struct noce_if_info): Add then_simple, else_simple,
   then_cost, else_cost fields.  Change branch_cost field to
unsigned int.
   (end_ifcvt_sequence): Call set_used_flags on each insn in the
   sequence.
   Include rtl-iter.h.
   (noce_simple_bbs): New function.
   (noce_try_move): Bail if basic blocks are not simple.
   (noce_try_store_flag): Likewise.
   (noce_try_store_flag_constants): Likewise.
   (noce_try_addcc): Likewise.
   (noce_try_store_flag_mask): Likewise.
   (noce_try_cmove): Likewise.
   (noce_try_minmax): Likewise.
   (noce_try_abs): Likewise.
   (noce_try_sign_mask): Likewise.
   (noce_try_bitop): Likewise.
   (bbs_ok_for_cmove_arith): New function.
   (noce_emit_all_but_last): Likewise.
   (noce_emit_insn): Likewise.
   (noce_emit_bb): Likewise.
   (noce_try_cmove_arith): Handle non-simple basic blocks.
   (insn_valid_noce_process_p): New function.
   (contains_mem_rtx_p): Likewise.
   (bb_valid_for_noce_process_p): Likewise.
   (noce_process_if_block): Allow non-simple basic blocks
   where appropriate.

2015-08-11  Kyrylo Tkachov 

   * gcc.dg/ifcvt-1.c: New test.
   * gcc.dg/ifcvt-2.c: Likewise.
   * gcc.dg/ifcvt-3.c: Likewise.

Looks like ifcvt-3.c fails on x86_64. I see

New failures:
FAIL: gcc.dg/ifcvt-3.c scan-rtl-dump ce1 "3 true changes made"

Could you please take a look?


Hmm, these pass for me on x86_64-pc-linux-gnu.
The test is most probably failing due to branch costs being too low for the
transformation to kick in. The test passes for me with -mtune=intel and 
-mtune=generic.
Do you know what the default tuning CPU is used for that failing test?

Thanks,
Kyrill



Thanks,
Igor
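
For readers following the thread, this is roughly what the transformation amounts to at the source level. The sketch is hypothetical and is not the content of ifcvt-3.c: both arms of a small if/else are computed unconditionally and a single select picks the result. Whether a compiler actually does this depends on branch costs, as noted above. Unsigned arithmetic is used so that evaluating both arms is always well defined.

```c
#include <assert.h>

/* Branchy form: each arm does a little arithmetic before the join.  */
unsigned
toy_branchy (int c, unsigned a, unsigned b)
{
  unsigned x;

  if (c)
    x = a + 1;
  else
    x = b - 1;
  return x;
}

/* Hand-written equivalent of the converted form: both arms are
   evaluated and one conditional select chooses between them.  */
unsigned
toy_converted (int c, unsigned a, unsigned b)
{
  unsigned then_val = a + 1;
  unsigned else_val = b - 1;

  return c ? then_val : else_val;
}

/* Both forms must agree for every input.  */
void
toy_ifcvt_demo (void)
{
  for (int c = 0; c <= 1; c++)
    for (unsigned a = 0; a < 4; a++)
      for (unsigned b = 0; b < 4; b++)
        assert (toy_branchy (c, a, b) == toy_converted (c, a, b));
}
```

The converted form executes more arithmetic but no branch, which is why the decision is sensitive to the target's branch cost.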

[gomp4.1] Depend clause support for offloading

2015-09-02 Thread Jakub Jelinek

Hi!

On Wed, Sep 02, 2015 at 02:21:14PM +0300, Ilya Verbin wrote:
> On Mon, Aug 31, 2015 at 17:07:53 +0200, Jakub Jelinek wrote:
> > * gimplify.c (gimplify_scan_omp_clauses): Handle
> > struct element GOMP_MAP_FIRSTPRIVATE_POINTER.
> 
> Have you seen this?
> 
> gcc/gimplify.c: In function ‘void gimplify_scan_omp_clauses(tree_node**, gimple_statement_base**, omp_region_type, tree_code)’:
> gcc/gimplify.c:6578:12: error: ‘sc’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>   : *sc != c;
> ^

I haven't, but I haven't bootstrapped it for a while, just keep
doing make -C gcc -j16 -k check RUNTESTFLAGS=gomp.exp and
make check-target-libgomp.  That said, this looks like a false positive,
but I've added a NULL initialization for it anyway.
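
The general shape of such a warning, and of the fix, can be shown on a small standalone function. This is a generic illustration with hypothetical names, not the gimplify.c code, and no claim is made that a particular GCC version warns on exactly this shape. The pointer is only read when a flag set in the same branch is true, but initializing it to NULL removes any doubt for the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Returns a pointer to the first element equal to KEY when SEARCH is
   nonzero and such an element exists, otherwise NULL.  */
const int *
toy_find (const int *v, size_t n, int key, int search)
{
  /* Without "= NULL" a compiler may be unable to prove that HIT is
     only read after the loop assigned it, and warn about the use.  */
  const int *hit = NULL;
  int found = 0;

  if (search)
    for (size_t i = 0; i < n; i++)
      if (v[i] == key)
        {
          hit = &v[i];
          found = 1;
          break;
        }

  return found ? hit : NULL;
}

/* Exercises the found, disabled and not-found paths.  */
void
toy_find_demo (void)
{
  static const int v[] = { 3, 5, 7 };

  assert (toy_find (v, 3, 5, 1) == &v[1]);
  assert (toy_find (v, 3, 5, 0) == NULL);
  assert (toy_find (v, 3, 9, 1) == NULL);
}
```

The explicit initialization costs nothing at run time here and keeps -Werror builds going when the analysis cannot follow the correlation between the two variables.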

Here is the start of the async offloading support I've talked about,
but nowait is not supported on the library side yet, only depend clause
(and for that I haven't added a testcase yet).

2015-09-02  Jakub Jelinek  

* gimplify.c (gimplify_scan_omp_clauses): Initialize sc
to NULL to avoid false positive warnings.
* omp-low.c (check_omp_nesting_restrictions): Diagnose
depend(source) or depend(sink:...) on #pragma omp target *.
(expand_omp_target): Pass flags and depend arguments to
GOMP_target_{41,update_41,enter_exit_data} libcalls.
(lower_depend_clauses): Change first argument from gimple
to tree * pointing to the stmt's clauses.
(lower_omp_taskreg): Adjust caller.
(lower_omp_target): Lower depend clauses.  Always use 16-bit
kinds and 8 as align shift.  Use
GOMP_MAP_DELETE_ZERO_LEN_ARRAY_SECTION for zero length array
section in map clause with delete kind.
* omp-builtins.def (BUILT_IN_GOMP_TARGET,
BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA): Add flags and depend arguments.
(BUILT_IN_GOMP_TARGET_UPDATE): Change library function name
to GOMP_target_update_41.  Add flags and depend arguments,
remove unused argument.
* builtin-types.def (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR,
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): Remove.
(BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR,
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR): New.
gcc/c/
* c-typeck.c (handle_omp_array_sections): Set
OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION even for
GOMP_MAP_DELETE kinds.
gcc/cp/
* semantics.c (handle_omp_array_sections): Set
OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION even for
GOMP_MAP_DELETE kinds.
gcc/fortran/
* types.def (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR,
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): Remove.
(BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR,
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR): New.
include/
* gomp-constants.h (enum gomp_map_kind): Add
GOMP_MAP_DELETE_ZERO_LEN_ARRAY_SECTION.
(GOMP_TARGET_FLAG_NOWAIT, GOMP_TARGET_FLAG_EXIT_DATA): Define.
libgomp/
* libgomp_g.h (GOMP_target_41, GOMP_target_enter_exit_data): Add
flags and depend arguments.
(GOMP_target_update_41): New prototype.
* libgomp.h (gomp_task_maybe_wait_for_dependencies): New prototype.
* libgomp.map (GOMP_4.1): Add GOMP_target_update_41.
* task.c (gomp_task_maybe_wait_for_dependencies): Remove prototype.
No longer static.
* target.c (GOMP_target_41): Add flags and depend arguments.  If
depend is non-NULL, wait until all dependencies are satisfied.
(GOMP_target_enter_exit_data): Likewise.  Use
flags & GOMP_TARGET_FLAG_EXIT_DATA to determine if it is enter
or exit data construct, instead of analysing kinds.
(gomp_exit_data): Handle GOMP_MAP_DELETE_ZERO_LEN_ARRAY_SECTION.
(GOMP_target_update_41): New function.
* testsuite/libgomp.c/target-24.c: New test.
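
The target.c entries above describe deciding between enter and exit data
from a flags word, and waiting only when a depend array is passed.  A rough
sketch of that decision follows.  The SKETCH_* names and their bit values
are assumptions made for illustration; the real constants are the ones this
patch adds to include/gomp-constants.h.

```cpp
#include <cassert>

// Assumed bit values for illustration only; the real definitions are the
// ones the patch adds to include/gomp-constants.h.
constexpr unsigned int SKETCH_FLAG_NOWAIT = 1u << 0;
constexpr unsigned int SKETCH_FLAG_EXIT_DATA = 1u << 1;

enum class data_construct { enter, exit };

// Decide enter vs. exit from the flags word alone, without inspecting the
// per-mapping kinds.
data_construct
classify (unsigned int flags)
{
  return (flags & SKETCH_FLAG_EXIT_DATA) ? data_construct::exit
                                         : data_construct::enter;
}

// A non-null depend array means the call has dependencies to wait for.
bool
has_dependencies (void *const *depend)
{
  return depend != nullptr;
}
```

Carrying the direction in a flag keeps the runtime from having to infer it
from the map kinds, which is the change the ChangeLog entry describes.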

--- gcc/gimplify.c.jj   2015-08-31 16:57:23.0 +0200
+++ gcc/gimplify.c  2015-09-02 14:20:41.012253248 +0200
@@ -6557,8 +6557,8 @@ gimplify_scan_omp_clauses (tree *list_p,
}
  else
{
- tree *osc = struct_map_to_clause->get (decl), *sc;
- tree *pt = NULL;
+ tree *osc = struct_map_to_clause->get (decl);
+ tree *sc = NULL, *pt = NULL;
  if (!ptr && TREE_CODE (*osc) == TREE_LIST)
osc = &TREE_PURPOSE (*osc);
  if (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_FLAG_ALWAYS)
--- gcc/omp-low.c.jj   2015-09-01 17:39:05.0 +0200
+++ gcc/omp-low.c   2015-09-02 15:13:13.726567918 +0200
@@ -3440,6 +3440,19 @@ check_omp_nesting_restrictions (gimple s
}
   break;
 case GIMPLE_OMP_TARGET:
+  for (c = gimple_omp_target_clauses (stmt); c; c = OMP_CLAUSE_CHAIN (c))
+   if (OMP_CLAUSE_CODE (c) ==

Re: Patch GCC for profile-func-internal-id=0 coverage-callback=1

2015-09-02 Thread Xinliang David Li

Sorry for the wrong advice. I thought the feature was in trunk. Rong,
can you submit the callback support to trunk?

David


On Wed, Sep 2, 2015 at 1:41 PM, Rong Xu  wrote:
> Matt,
>
> It seems this patch is for the google branch, rather than the trunk. The
> code for the coverage callback function is not in trunk.
>
> It's ok to submit to google/gcc-4_9 branch.
>
> Thanks,
>
> -Rong
>
> On Wed, Sep 2, 2015 at 10:01 AM, Matt Deeds  wrote:
>>
>> Hello, Honza.  David Li said you might be able to help me get this
>> patch into GCC trunk.  I sent mail for this on August 27, but didn't
>> get a reply.  It's a small change to make these two options work
>> together:
>>
>> profile-func-internal-id=0 coverage-callback=1
>>
>> Let me know what I can do to get this submitted.
>>
>> This patch is for svn://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9.  I add
>> support for profile-func-internal-id in the instrumentation generated for
>> __coverage_callback.
>>
>> Add support for the profile-func-internal-id parameter to the coverage
>> callback.  Without this change, the function identifier passed to
>> __coverage_callback (enabled with param=coverage-callback=1) does not match
>> the values emitted in the .gcno file.  Because the function profile_id
>> (typically 32 bits) is more likely to be unique than the function internal
>> id (typically 16 bits), it can be desirable to identify a function by its
>> profile_id rather than by its internal id.
>>
>> I've instrumented a large binary creating over 500 .gcno files and confirmed
>> that function IDs in these .gcno files match the IDs in __coverage_callback.
>> In my example, there were typically about one to four functions sharing the
>> same internal function ID.  There were no collisions using profile_id.
>>
>>
>> Index: gcc/tree-profile.c
>> ===
>> --- gcc/tree-profile.c (revision 226647)
>> +++ gcc/tree-profile.c (working copy)
>> @@ -864,8 +864,20 @@ gimple_gen_edge_profiler (int edgeno, edge e)
>>  {
>>gimple call;
>>tree tree_edgeno = build_int_cst (gcov_type_node, edgeno);
>> -  tree tree_uid = build_int_cst (gcov_type_node,
>> +
>> +  tree tree_uid;
>> +  if (PARAM_VALUE (PARAM_PROFILE_FUNC_INTERNAL_ID))
>> +{
>> +  tree_uid  = build_int_cst (gcov_type_node,
>>   current_function_funcdef_no);
>> +}
>> +  else
>> +{
>> +  gcc_assert (coverage_node_map_initialized_p ());
>> +
>> +  tree_uid = build_int_cst
>> +  (gcov_type_node, cgraph_get_node (current_function_decl)->profile_id);
>> +}
>>tree callback_fn_type
>>= build_function_type_list (void_type_node,
>>gcov_type_node,
>
>

Re: [RFC] COMDAT Safe Module Level Multi versioning

2015-09-02 Thread Sriraman Tallam

On Tue, Aug 18, 2015 at 9:46 PM, Cary Coutant  wrote:
>> Thanks, will make those changes.  Do you recommend a different name
>> for this flag like -fmake-comdat-functions-static?
>
> Well, the C++ ABI refers to this as "vague linkage." It may be a bit
> too long or too ABI-specific, but maybe something like
> -f[no-]use-vague-linkage-for-functions or
> -f[no-]functions-vague-linkage?

Done and patch attached.

* c-family/c.opt (fvague-linkage-functions): New option.
* cp/decl2.c (comdat_linkage): Implement new option.  Warn when
virtual comdat functions are seen.
* ipa.c (function_and_variable_visibility): Check for no vague
linkage.
* doc/invoke.texi: Document new option.
* testsuite/g++.dg/no-vague-linkage-functions-1.C: New test.




>
> And perhaps note in the doc that using this option may technically
> break the C++ ODR, so it should be used only when you know what you're
> doing.

Done.
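
To make the address-comparison caveat concrete, here is a small sketch.  It
does not use the new option; the namespaces unit_a and unit_b are
hypothetical stand-ins for two translation units, and the static helpers
model the per-unit local copies that -fno-vague-linkage-functions would
produce for a comdat function.

```cpp
#include <cassert>

using fn = int (*) (int);

// unit_a and unit_b stand in for two translation units that both include
// the same inline helper.  `static` models the per-unit local copy.
namespace unit_a
{
  static int helper (int x) { return x + 1; }
  fn callback () { return &helper; }
}

namespace unit_b
{
  static int helper (int x) { return x + 1; }
  fn callback () { return &helper; }
}

// Code that identifies a function by its address.
bool
same_function (fn a, fn b)
{
  return a == b;
}
```

Both callbacks behave identically, yet same_function reports them as
different functions.  With vague linkage there would be a single copy and
the comparison would succeed.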

Thanks
Sri

>
> -cary

Index: c-family/c.opt
===
--- c-family/c.opt  (revision 227383)
+++ c-family/c.opt  (working copy)
@@ -1236,6 +1236,16 @@ fweak
 C++ ObjC++ Var(flag_weak) Init(1)
 Emit common-like symbols as weak symbols
 
+fvague-linkage-functions
+C++ Var(flag_vague_linkage_functions) Init(1)
+Option -fno-vague-linkage-functions makes comdat functions static and local
+to the module. With -fno-vague-linkage-functions, virtual comdat functions
+still use vague linkage.  With -fno-vague-linkage-functions, the address of
+the comdat functions that are made local will be unique and this can cause
+unintended behavior when addresses of these comdat functions are used in
+comparisons.  This option may technically break the C++ ODR and users of
+this flag should know what they are doing.
+
 fwide-exec-charset=
 C ObjC C++ ObjC++ Joined RejectNegative
 -fwide-exec-charset= Convert all wide strings and character 
constants to character set 
Index: cp/decl2.c
===
--- cp/decl2.c  (revision 227383)
+++ cp/decl2.c  (working copy)
@@ -1702,8 +1702,22 @@ mark_vtable_entries (tree decl)
 void
 comdat_linkage (tree decl)
 {
-  if (flag_weak)
-    make_decl_one_only (decl, cxx_comdat_group (decl));
+  if (flag_weak
+  && (flag_vague_linkage_functions
+ || TREE_CODE (decl) != FUNCTION_DECL
+ || DECL_VIRTUAL_P (decl)))
+{
+  make_decl_one_only (decl, cxx_comdat_group (decl));
+  /* Warn when -fno-vague-linkage-functions is used and we found virtual
+comdat functions.  Virtual comdat functions must still use vague
+linkage.  */
+  if (TREE_CODE (decl) == FUNCTION_DECL
+ && DECL_VIRTUAL_P (decl)
+ && !flag_vague_linkage_functions)
+   warning_at (DECL_SOURCE_LOCATION (decl), 0,
+   "fno-vague-linkage-functions: Comdat linkage of virtual "
+   "function %q#D preserved.", decl);
+}
   else if (TREE_CODE (decl) == FUNCTION_DECL
   || (VAR_P (decl) && DECL_ARTIFICIAL (decl)))
 /* We can just emit function and compiler-generated variables
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 227383)
+++ doc/invoke.texi (working copy)
@@ -189,7 +189,7 @@ in the following sections.
 -fno-pretty-templates @gol
 -frepo  -fno-rtti  -fstats  -ftemplate-backtrace-limit=@var{n} @gol
 -ftemplate-depth=@var{n} @gol
--fno-threadsafe-statics -fuse-cxa-atexit  -fno-weak  -nostdinc++ @gol
+-fno-threadsafe-statics -fuse-cxa-atexit  -fno-weak -fno-vague-linkage-functions -nostdinc++ @gol
 -fvisibility-inlines-hidden @gol
 -fvtable-verify=@var{std|preinit|none} @gol
 -fvtv-counts -fvtv-debug @gol
@@ -2448,6 +2448,18 @@ option exists only for testing, and should not be
 it results in inferior code and has no benefits.  This option may
 be removed in a future release of G++.
 
+@item -fno-vague-linkage-functions
+@opindex fno-vague-linkage-functions
+Do not use vague linkage for comdat non-virtual functions, even if it
+is provided by the linker.  This option is useful when comdat functions
+generated in certain compilation units need to be kept local to the
+respective units and not exposed globally.  This does not apply to virtual
+comdat functions as their pointers may be taken via virtual tables.
+This can cause unintended behavior if the addresses of the comdat functions
+that are made local are used in comparisons, which are not warned about.
+This option may technically break the C++ ODR and users of this flag should

Re: [C PATCH] Better diagnostic for empty enum (PR c/67432)

2015-09-02 Thread Joseph Myers

On Wed, 2 Sep 2015, Marek Polacek wrote:

> This PR asks for a better error wrt empty enums.  So I've handled
> empty enum specially.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Reviving SH FDPIC target

2015-09-02 Thread Rich Felker

On Wed, Sep 02, 2015 at 05:05:35PM -0400, Rich Felker wrote:
> On Wed, Sep 02, 2015 at 07:59:45PM +, Joseph Myers wrote:
> > On Wed, 2 Sep 2015, Rich Felker wrote:
> > 
> > > Also, according to Joseph Myers, there was some unresolved
> > > disagreement that stalled (and eventually sunk) the old patch, so if
> > > anyone's still around who has objections to it, could you speak up and
> > > let me know what's wrong? Kaz Kojima seems to have approved the patch
> > > at the time so I'm confused what the issue was/is.
> > 
> > It's patch 1/3 (architecture-independent) that had the disagreement (and 
> > patch 3/3 depends on patch 1/3).
> > 
> > https://gcc.gnu.org/ml/gcc-patches/2010-08/msg01462.html
> 
> So this is only for __fpscr_values? In that case I think the right
> solution is just to follow up with getting rid of __fpscr_values, if
> it's not already done:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60138
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53513
> 
> 53513 is marked fixed, but I didn't follow up to confirm that the
> actual problems I reported in 60138 are fixed; I'll do some more
> research on this. But if all goes well, we can just drop 1/3.

I've confirmed that gcc 5.2 does not produce references to
__fpscr_values; instead, it does:

mov.l   .L4,r3
...
sts fpscr,r1
xor r3,r1
lds r1,fpscr
...
.L4:
.long   524288

So if __fpscr_values was the only reason for patch 1/3 in the FDPIC
patchset, I think we can safely drop it. And patch 2/3 was already
committed, so 3/3, the one I was originally looking at, seems to be
all we need. It was approved at the time, so I'll proceed with merging
it with 5.2.0.
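
For reference, the constant at .L4 is a single-bit mask, so the sts/xor/lds
sequence simply flips one FPSCR bit in place.  The sketch below models that
on a plain integer.  The function name is made up, and my reading that bit
19 is the FPSCR precision (PR) bit is an assumption to be checked against
the SH-4 manual.

```cpp
#include <cassert>
#include <cstdint>

// 524288 is the literal at .L4 in the listing above; it equals 1 << 19.
constexpr std::uint32_t fpscr_toggle_mask = 524288u;
static_assert (fpscr_toggle_mask == (1u << 19), "mask has exactly bit 19 set");

// Models "sts fpscr,r1; xor r3,r1; lds r1,fpscr" on a plain integer.
// Assumption: bit 19 is the precision (PR) bit of FPSCR.
std::uint32_t
toggle_fpscr_bit (std::uint32_t fpscr)
{
  return fpscr ^ fpscr_toggle_mask;
}
```

Applying the toggle twice restores the original value, which is why no
lookup table such as __fpscr_values is needed for switching back.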

Rich

Go patch committed: report invalid receiver types

2015-09-02 Thread Ian Lance Taylor

This patch by Chris Manghane fixes the Go frontend to report invalid
function receiver types as an error rather than crashing.  This fixes
https://golang.org/issue/12324 .  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 227420)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-3f8feb4f905535448833a14e4f5c83f682087749
+672ac2abc52d8bd70cb9fb03dd4a32fdde9c438f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 227299)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -1818,7 +1818,11 @@ Gogo::start_function(const std::string&
  function);
}
  else
-   go_unreachable();
+{
+  error_at(type->receiver()->location(),
+   "invalid receiver type (receiver must be a named type)");
+  ret = Named_object::make_function(name, NULL, function);
+}
}
   this->package_->bindings()->add_method(ret);
 }

[PATCH 05/10] bt-load.c: remove typedefs that hide pointerness

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-03  Trevor Saunders  

* bt-load.c (struct btr_def_group): Rename from btr_def_group_s.
(struct btr_user): Rename from btr_user_s.
(struct btr_def): Rename from btr_def_s.
(find_btr_def_group): Adjust.
(add_btr_def): Likewise.
(new_btr_user): Likewise.
(note_other_use_this_block): Likewise.
(compute_defs_uses_and_gen): Likewise.
(link_btr_uses): Likewise.
(build_btr_def_use_webs): Likewise.
(block_at_edge_of_live_range_p): Likewise.
(btr_def_live_range): Likewise.
(combine_btr_defs): Likewise.
(move_btr_def): Likewise.
(migrate_btr_def): Likewise.
(migrate_btr_defs): Likewise.
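
As background on why such typedefs are being removed, here is a small
sketch of the problem.  The types item_s, item_t and item are invented for
the example and are not from bt-load.c.  It shows how const binds
differently once the pointer is hidden inside a typedef name.

```cpp
#include <cassert>

// Old style: the typedef name is a pointer type, which the name hides.
typedef struct item_s
{
  struct item_s *next;
  int value;
} *item_t;

// New style: a plain struct name; every pointer is spelled with '*'.
struct item
{
  item *next;
  int value;
};

// `const item_t` is a const pointer to a modifiable item_s, so this
// compiles even though the parameter looks read-only.
void
bump_old (const item_t p)
{
  p->value++;
}

// `const item *` is a pointer to a const item; writing through it would be
// rejected by the compiler.
int
sum_new (const item *head)
{
  int total = 0;
  for (const item *p = head; p; p = p->next)
    total += p->value;
  return total;
}
```

With the explicit spelling, a signature such as btr_def ** is visibly a
pointer to a pointer, where the old btr_def * looked like a single level.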
---
 gcc/bt-load.c | 140 +-
 1 file changed, 71 insertions(+), 69 deletions(-)

diff --git a/gcc/bt-load.c b/gcc/bt-load.c
index 5d8b752..9b1d366 100644
--- a/gcc/bt-load.c
+++ b/gcc/bt-load.c
@@ -51,18 +51,20 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl-iter.h"
 #include "fibonacci_heap.h"
 
+struct btr_def;
+
 /* Target register optimizations - these are performed after reload.  */
 
-typedef struct btr_def_group_s
+struct btr_def_group
 {
-  struct btr_def_group_s *next;
+  btr_def_group *next;
   rtx src;
-  struct btr_def_s *members;
-} *btr_def_group;
+  btr_def *members;
+};
 
-typedef struct btr_user_s
+struct btr_user
 {
-  struct btr_user_s *next;
+  btr_user *next;
   basic_block bb;
   int luid;
   rtx_insn *insn;
@@ -74,7 +76,7 @@ typedef struct btr_user_s
   int n_reaching_defs;
   int first_reaching_def;
   char other_use_this_block;
-} *btr_user;
+};
 
 /* btr_def structs appear on three lists:
  1. A list of all btr_def structures (head is
@@ -85,10 +87,10 @@ typedef struct btr_user_s
group (head is in a BTR_DEF_GROUP struct, linked by
NEXT_THIS_GROUP field).  */
 
-typedef struct btr_def_s
+struct btr_def
 {
-  struct btr_def_s *next_this_bb;
-  struct btr_def_s *next_this_group;
+  btr_def *next_this_bb;
+  btr_def *next_this_group;
   basic_block bb;
   int luid;
   rtx_insn *insn;
@@ -98,8 +100,8 @@ typedef struct btr_def_s
  source (i.e. a label), group links together all the
  insns with the same source.  For other branch register
  setting insns, group is NULL.  */
-  btr_def_group group;
-  btr_user uses;
+  btr_def_group *group;
+  btr_user *uses;
   /* If this def has a reaching use which is not a simple use
  in a branch instruction, then has_ambiguous_use will be true,
  and we will not attempt to migrate this definition.  */
@@ -119,38 +121,38 @@ typedef struct btr_def_s
  to clear out trs_live_at_end again.  */
   char own_end;
   bitmap live_range;
-} *btr_def;
+};
 
-typedef fibonacci_heap <long, btr_def_s> btr_heap_t;
-typedef fibonacci_node <long, btr_def_s> btr_heap_node_t;
+typedef fibonacci_heap <long, btr_def> btr_heap_t;
+typedef fibonacci_node <long, btr_def> btr_heap_node_t;
 
 static int issue_rate;
 
 static int basic_block_freq (const_basic_block);
 static int insn_sets_btr_p (const rtx_insn *, int, int *);
-static void find_btr_def_group (btr_def_group *, btr_def);
-static btr_def add_btr_def (btr_heap_t *, basic_block, int, rtx_insn *,
-   unsigned int, int, btr_def_group *);
-static btr_user new_btr_user (basic_block, int, rtx_insn *);
+static void find_btr_def_group (btr_def_group **, btr_def *);
+static btr_def *add_btr_def (btr_heap_t *, basic_block, int, rtx_insn *,
+   unsigned int, int, btr_def_group **);
+static btr_user *new_btr_user (basic_block, int, rtx_insn *);
 static void dump_hard_reg_set (HARD_REG_SET);
 static void dump_btrs_live (int);
-static void note_other_use_this_block (unsigned int, btr_user);
-static void compute_defs_uses_and_gen (btr_heap_t *, btr_def *,btr_user *,
+static void note_other_use_this_block (unsigned int, btr_user *);
+static void compute_defs_uses_and_gen (btr_heap_t *, btr_def **, btr_user **,
   sbitmap *, sbitmap *, HARD_REG_SET *);
 static void compute_kill (sbitmap *, sbitmap *, HARD_REG_SET *);
 static void compute_out (sbitmap *bb_out, sbitmap *, sbitmap *, int);
-static void link_btr_uses (btr_def *, btr_user *, sbitmap *, sbitmap *, int);
+static void link_btr_uses (btr_def **, btr_user **, sbitmap *, sbitmap *, int);
 static void build_btr_def_use_webs (btr_heap_t *);
-static int block_at_edge_of_live_range_p (int, btr_def);
-static void clear_btr_from_live_range (btr_def def);
-static void add_btr_to_live_range (btr_def, int);
+static int block_at_edge_of_live_range_p (int, btr_def *);
+static void clear_btr_from_live_range (btr_def *def);
+static void add_btr_to_live_range (btr_def *, int);
 static void augment_live_range (bitmap, HARD_REG_SET *, basic_block,
basic_block,

[PATCH 03/10] var-tracking.c: remove typedef of location_chain

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-03  Trevor Saunders  

* var-tracking.c (struct location_chain): Rename from
location_chain_def.
(struct variable_part): Adjust.
(variable_htab_free): Likewise.
(unshare_variable): Likewise.
(get_init_value): Likewise.
(get_addr_from_local_cache): Likewise.
(drop_overlapping_mem_locs): Likewise.
(val_reset): Likewise.
(struct variable_union_info): Likewise.

(variable_union): Likewise.
(find_loc_in_1pdv): 
Likewise.
(insert_into_intersection): Likewise.
(intersect_loc_chains): Likewise.
(canonicalize_loc_order_check): Likewise.
(canonicalize_values_mark): Likewise.
(canonicalize_values_star): Likewise.
(canonicalize_vars_star): Likewise.
(variable_merge_over_cur): Likewise.
(remove_duplicate_values): Likewise.
(variable_post_merge_new_vals): Likewise.
(variable_post_merge_perm_vals): Likewise.
(find_mem_expr_in_1pdv): Likewise.
(dataflow_set_preserve_mem_locs): Likewise.
(dataflow_set_remove_mem_locs): Likewise.
(variable_part_different_p): Likewise.
(onepart_variable_different_p): Likewise.
(find_src_set_src): Likewise.
(dump_var): Likewise.
(set_slot_part): Likewise.
(clobber_slot_part): Likewise.
(delete_slot_part): Likewise.
(vt_expand_var_loc_chain): Likewise.
(emit_note_insn_var_location): Likewise.
(vt_finalize): Likewise.
---
 gcc/var-tracking.c | 136 ++---
 1 file changed, 68 insertions(+), 68 deletions(-)

diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index cd394a0..5a53a4a 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -264,10 +264,10 @@ typedef struct attrs_def
 } *attrs;
 
 /* Structure for chaining the locations.  */
-typedef struct location_chain_def
+struct location_chain
 {
   /* Next element in the chain.  */
-  struct location_chain_def *next;
+  location_chain *next;
 
   /* The location (REG, MEM or VALUE).  */
   rtx loc;
@@ -277,7 +277,7 @@ typedef struct location_chain_def
 
   /* Initialized? */
   enum var_init_status init;
-} *location_chain;
+};
 
 /* A vector of loc_exp_dep holds the active dependencies of a one-part
DV on VALUEs, i.e., the VALUEs expanded so as to form the current
@@ -337,7 +337,7 @@ struct onepart_aux
 struct variable_part
 {
   /* Chain of locations of the part.  */
-  location_chain loc_chain;
+  location_chain *loc_chain;
 
   /* Location which was last emitted to location list.  */
   rtx cur_loc;
@@ -589,8 +589,8 @@ static pool_allocator valvar_pool
   ("small variable_def pool", 256, sizeof (variable_def));
 
 /* Alloc pool for struct location_chain_def.  */
-static object_allocator<location_chain_def> location_chain_def_pool
-  ("location_chain_def pool", 1024);
+static object_allocator<location_chain> location_chain_pool
+  ("location_chain pool", 1024);
 
 /* Alloc pool for struct shared_hash_def.  */
 static object_allocator shared_hash_def_pool
@@ -663,7 +663,7 @@ static void dataflow_set_clear (dataflow_set *);
 static void dataflow_set_copy (dataflow_set *, dataflow_set *);
 static int variable_union_info_cmp_pos (const void *, const void *);
 static void dataflow_set_union (dataflow_set *, dataflow_set *);
-static location_chain find_loc_in_1pdv (rtx, variable, variable_table_type *);
+static location_chain *find_loc_in_1pdv (rtx, variable, variable_table_type *);
 static bool canon_value_cmp (rtx, rtx);
 static int loc_cmp (rtx, rtx);
 static bool variable_part_different_p (variable_part *, variable_part *);
@@ -1435,7 +1435,7 @@ variable_htab_free (void *elem)
 {
   int i;
   variable var = (variable) elem;
-  location_chain node, next;
+  location_chain *node, *next;
 
   gcc_checking_assert (var->refcount > 0);
 
@@ -1738,8 +1738,8 @@ unshare_variable (dataflow_set *set, variable_def **slot, 
variable var,
 
   for (i = 0; i < var->n_var_parts; i++)
 {
-  location_chain node;
-  location_chain *nextp;
+  location_chain *node;
+  location_chain **nextp;
 
   if (i == 0 && var->onepart)
{
@@ -1756,9 +1756,9 @@ unshare_variable (dataflow_set *set, variable_def **slot, 
variable var,
   nextp = &new_var->var_part[i].loc_chain;
   for (node = var->var_part[i].loc_chain; node; node = node->next)
{
- location_chain new_lc;
+ location_chain *new_lc;
 
- new_lc = new location_chain_def;
+ new_lc = new location_chain;
  new_lc->next = NULL;
  if (node->init > initialized)
new_lc->init = node->init;
@@ -1882,7 +1882,7 @@ get_init_value (dataflow_set *set, rtx loc, decl_or_value 
dv)

[PATCH 04/10] var-tracking.c: remove typedef of shared_hash

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-03  Trevor Saunders  

* var-tracking.c (shared_hash_def): Rename to shared_hash.
(shared_hash): Remove typedef.
(struct dataflow_set): Adjust.
(shared_hash_unshare): Likewise.
(dataflow_set_merge): Likewise.
(vt_initialize): Likewise.
(vt_finalize): Likewise.
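
For readers new to this code, the structure being renamed is a refcounted,
copy-on-write table.  The sketch below is a hypothetical miniature of that
idea using std::map; shared_table and the table_* functions are invented
names and not the var-tracking.c implementation.  It also shows why writers
take a pointer to the pointer, which is what shared_hash **pvars spells out
after this patch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical miniature of a refcounted table; names are illustrative.
struct shared_table
{
  int refcount;
  std::map<std::string, int> *htab;
};

shared_table *
table_new ()
{
  shared_table *t = new shared_table;
  t->refcount = 1;
  t->htab = new std::map<std::string, int>;
  return t;
}

// Share: bump the count and hand back the same object.
shared_table *
table_copy (shared_table *t)
{
  t->refcount++;
  return t;
}

// Unshare: give the caller a private copy and drop its reference on T.
shared_table *
table_unshare (shared_table *t)
{
  assert (t->refcount > 1);
  shared_table *n = new shared_table;
  n->refcount = 1;
  n->htab = new std::map<std::string, int> (*t->htab);
  t->refcount--;
  return n;
}

// Release one reference; free the table when the last one goes away.
void
table_destroy (shared_table *t)
{
  assert (t->refcount > 0);
  if (--t->refcount == 0)
    {
      delete t->htab;
      delete t;
    }
}

// Writers take a pointer to the pointer so they can swap in a private copy.
void
table_set (shared_table **pt, const std::string &key, int value)
{
  if ((*pt)->refcount > 1)
    *pt = table_unshare (*pt);
  (*(*pt)->htab)[key] = value;
}
```

A write through a shared handle leaves the other holder's contents
untouched, and each table ends up with a reference count of one.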
---
 gcc/var-tracking.c | 56 +++---
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 5a53a4a..126feee 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -525,14 +525,14 @@ struct emit_note_data
 
 /* Structure holding a refcounted hash table.  If refcount > 1,
it must be first unshared before modified.  */
-typedef struct shared_hash_def
+struct shared_hash
 {
   /* Reference count.  */
   int refcount;
 
   /* Actual hash table.  */
   variable_table_type *htab;
-} *shared_hash;
+};
 
 /* Structure holding the IN or OUT set for a basic block.  */
 struct dataflow_set
@@ -544,10 +544,10 @@ struct dataflow_set
   attrs regs[FIRST_PSEUDO_REGISTER];
 
   /* Variable locations.  */
-  shared_hash vars;
+  shared_hash *vars;
 
   /* Vars that is being traversed.  */
-  shared_hash traversed_vars;
+  shared_hash *traversed_vars;
 };
 
 /* The structure (one for each basic block) containing the information
@@ -593,8 +593,8 @@ static object_allocator location_chain_pool
   ("location_chain pool", 1024);
 
 /* Alloc pool for struct shared_hash_def.  */
-static object_allocator<shared_hash_def> shared_hash_def_pool
-  ("shared_hash_def pool", 256);
+static object_allocator<shared_hash> shared_hash_pool
+  ("shared_hash pool", 256);
 
 /* Alloc pool for struct loc_exp_dep_s for NOT_ONEPART variables.  */
 object_allocator loc_exp_dep_pool ("loc_exp_dep pool", 64);
@@ -611,7 +611,7 @@ static bool emit_notes;
 static variable_table_type *dropped_values;
 
 /* Empty shared hashtable.  */
-static shared_hash empty_shared_hash;
+static shared_hash *empty_shared_hash;
 
 /* Scratch register bitmap used by cselib_expand_value_rtx.  */
 static bitmap scratch_regs = NULL;
@@ -1571,7 +1571,7 @@ attrs_list_mpdv_union (attrs *dstp, attrs src, attrs src2)
 /* Return true if VARS is shared.  */
 
 static inline bool
-shared_hash_shared (shared_hash vars)
+shared_hash_shared (shared_hash *vars)
 {
   return vars->refcount > 1;
 }
@@ -1579,7 +1579,7 @@ shared_hash_shared (shared_hash vars)
 /* Return the hash table for VARS.  */
 
 static inline variable_table_type *
-shared_hash_htab (shared_hash vars)
+shared_hash_htab (shared_hash *vars)
 {
   return vars->htab;
 }
@@ -1587,7 +1587,7 @@ shared_hash_htab (shared_hash vars)
 /* Return true if VAR is shared, or maybe because VARS is shared.  */
 
 static inline bool
-shared_var_p (variable var, shared_hash vars)
+shared_var_p (variable var, shared_hash *vars)
 {
   /* Don't count an entry in the changed_variables table as a duplicate.  */
   return ((var->refcount > 1 + (int) var->in_changed_variables)
@@ -1596,10 +1596,10 @@ shared_var_p (variable var, shared_hash vars)
 
 /* Copy variables into a new hash table.  */
 
-static shared_hash
-shared_hash_unshare (shared_hash vars)
+static shared_hash *
+shared_hash_unshare (shared_hash *vars)
 {
-  shared_hash new_vars = new shared_hash_def;
+  shared_hash *new_vars = new shared_hash;
   gcc_assert (vars->refcount > 1);
   new_vars->refcount = 1;
   new_vars->htab = new variable_table_type (vars->htab->elements () + 3);
@@ -1610,8 +1610,8 @@ shared_hash_unshare (shared_hash vars)
 
 /* Increment reference counter on VARS and return it.  */
 
-static inline shared_hash
-shared_hash_copy (shared_hash vars)
+static inline shared_hash *
+shared_hash_copy (shared_hash *vars)
 {
   vars->refcount++;
   return vars;
@@ -1621,7 +1621,7 @@ shared_hash_copy (shared_hash vars)
anymore.  */
 
 static void
-shared_hash_destroy (shared_hash vars)
+shared_hash_destroy (shared_hash *vars)
 {
   gcc_checking_assert (vars->refcount > 0);
   if (--vars->refcount == 0)
@@ -1635,7 +1635,7 @@ shared_hash_destroy (shared_hash vars)
INSERT, insert it if not already present.  */
 
 static inline variable_def **
-shared_hash_find_slot_unshare_1 (shared_hash *pvars, decl_or_value dv,
+shared_hash_find_slot_unshare_1 (shared_hash **pvars, decl_or_value dv,
 hashval_t dvhash, enum insert_option ins)
 {
   if (shared_hash_shared (*pvars))
@@ -1644,7 +1644,7 @@ shared_hash_find_slot_unshare_1 (shared_hash *pvars, 
decl_or_value dv,
 }
 
 static inline variable_def **
-shared_hash_find_slot_unshare (shared_hash *pvars, decl_or_value dv,
+shared_hash_find_slot_unshare (shared_hash **pvars, decl_or_value dv,
   enum insert_option ins)
 {
   return shared_hash_find_slot_unshare_1 (pvars, dv, dv_htab_hash (dv), ins);
@@ -1655,7 +1655,7 @@ shared_hash_find_slot_unshare (shared_hash *pvars,

[PATCH 07/10] tree-vrp.c: remove typedefs that hide pointerness

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-03  Trevor Saunders  

* tree-vrp.c (struct assert_locus_d): Rename to assert_locus.
(dump_asserts_for): Adjust.
(register_new_assert_for): Likewise.
(process_assert_insertions): Likewise.
(insert_range_assertions): Likewise.
---
 gcc/tree-vrp.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 21fbed0..c45daba 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -126,7 +126,7 @@ static tree vrp_evaluate_conditional_warnv_with_ops (enum 
tree_code,
SSA name may have more than one assertion associated with it, these
locations are kept in a linked list attached to the corresponding
SSA name.  */
-struct assert_locus_d
+struct assert_locus
 {
   /* Basic block where the assertion would be inserted.  */
   basic_block bb;
@@ -148,11 +148,9 @@ struct assert_locus_d
   tree expr;
 
   /* Next node in the linked list.  */
-  struct assert_locus_d *next;
+  assert_locus *next;
 };
 
-typedef struct assert_locus_d *assert_locus_t;
-
 /* If bit I is present, it means that SSA name N_i has a list of
assertions that should be inserted in the IL.  */
 static bitmap need_assert_for;
@@ -160,7 +158,7 @@ static bitmap need_assert_for;
 /* Array of locations lists where to insert assertions.  ASSERTS_FOR[I]
holds a list of ASSERT_LOCUS_T nodes that describe where
ASSERT_EXPRs for SSA name N_I should be inserted.  */
-static assert_locus_t *asserts_for;
+static assert_locus **asserts_for;
 
 /* Value range array.  After propagation, VR_VALUE[I] holds the range
of values that SSA name N_I may take.  */
@@ -4897,7 +4895,7 @@ void debug_all_asserts (void);
 void
 dump_asserts_for (FILE *file, tree name)
 {
-  assert_locus_t loc;
+  assert_locus *loc;
 
   fprintf (file, "Assertions to be inserted for ");
   print_generic_expr (file, name, 0);
@@ -4979,7 +4977,7 @@ register_new_assert_for (tree name, tree expr,
 edge e,
 gimple_stmt_iterator si)
 {
-  assert_locus_t n, loc, last_loc;
+  assert_locus *n, *loc, *last_loc;
   basic_block dest_bb;
 
   gcc_checking_assert (bb == NULL || e == NULL);
@@ -5054,7 +5052,7 @@ register_new_assert_for (tree name, tree expr,
   /* If we didn't find an assertion already registered for
  NAME COMP_CODE VAL, add a new one at the end of the list of
  assertions associated with NAME.  */
-  n = XNEW (struct assert_locus_d);
+  n = XNEW (struct assert_locus);
   n->bb = dest_bb;
   n->e = e;
   n->si = si;
@@ -6333,7 +6331,7 @@ find_assert_locations (void)
indicated by LOC.  Return true if we made any edge insertions.  */
 
 static bool
-process_assert_insertions_for (tree name, assert_locus_t loc)
+process_assert_insertions_for (tree name, assert_locus *loc)
 {
   /* Build the comparison expression NAME_i COMP_CODE VAL.  */
   gimple stmt;
@@ -6401,12 +6399,12 @@ process_assert_insertions (void)
 
   EXECUTE_IF_SET_IN_BITMAP (need_assert_for, 0, i, bi)
 {
-  assert_locus_t loc = asserts_for[i];
+  assert_locus *loc = asserts_for[i];
   gcc_assert (loc);
 
   while (loc)
{
- assert_locus_t next = loc->next;
+ assert_locus *next = loc->next;
  update_edges_p |= process_assert_insertions_for (ssa_name (i), loc);
  free (loc);
  loc = next;
@@ -6458,7 +6456,7 @@ static void
 insert_range_assertions (void)
 {
   need_assert_for = BITMAP_ALLOC (NULL);
-  asserts_for = XCNEWVEC (assert_locus_t, num_ssa_names);
+  asserts_for = XCNEWVEC (assert_locus *, num_ssa_names);
 
   calculate_dominance_info (CDI_DOMINATORS);
 
-- 
2.4.0

[PATCH 02/10] dse.c: remove some typedefs that hide pointerness

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-02  Trevor Saunders  

* dse.c (store_info_t): Remove typedef.
(group_info_t): Likewise.
(const_group_info_t): Likewise.
(deferred_change_t): Likewise.
(get_group_info): Adjust.
(free_store_info): Likewise.
(canon_address): Likewise.
(clear_rhs_from_active_local_stores): Likewise.
(record_store): Likewise.
(replace_read): Likewise.
(check_mem_read_rtx): Likewise.
(scan_insn): Likewise.
(remove_useless_values): Likewise.
(dse_step1): Likewise.
(dse_step2_init): Likewise.
(dse_step2_nospill): Likewise.
(scan_stores_nospill): Likewise.
(scan_reads_nospill): Likewise.
(dse_step3_exit_block_scan): Likewise.
(dse_step3): Likewise.
(dse_step5_nospill): Likewise.
(dse_step6): Likewise.
---
 gcc/dse.c | 115 ++
 1 file changed, 55 insertions(+), 60 deletions(-)

diff --git a/gcc/dse.c b/gcc/dse.c
index 780379a..0634d9d 100644
--- a/gcc/dse.c
+++ b/gcc/dse.c
@@ -307,7 +307,6 @@ lowpart_bitmask (int n)
   return mask >> (HOST_BITS_PER_WIDE_INT - n);
 }
 
-typedef struct store_info *store_info_t;
 static object_allocator cse_store_info_pool ("cse_store_info_pool",
   100);
 
@@ -400,7 +399,7 @@ struct insn_info_type
  But it could also contain clobbers.  Insns that contain more than
  one mem set are not deletable, but each of those mems are here in
  order to provide info to delete other insns.  */
-  store_info_t store_rec;
+  store_info *store_rec;
 
   /* The linked list of mem uses in this insn.  Only the reads from
  rtx bases are listed here.  The reads to cselib bases are
@@ -564,8 +563,6 @@ struct group_info
   int *offset_map_n, *offset_map_p;
   int offset_map_size_n, offset_map_size_p;
 };
-typedef struct group_info *group_info_t;
-typedef const struct group_info *const_group_info_t;
 
 static object_allocator<group_info> group_info_pool
   ("rtx_group_info_pool", 100);
@@ -574,7 +571,7 @@ static object_allocator<group_info> group_info_pool
 static int rtx_group_next_id;
 
 
-static vec<group_info_t> rtx_group_vec;
+static vec<group_info *> rtx_group_vec;
 
 
 /* This structure holds the set of changes that are being deferred
@@ -591,15 +588,13 @@ struct deferred_change
   struct deferred_change *next;
 };
 
-typedef struct deferred_change *deferred_change_t;
-
 static object_allocator<deferred_change> deferred_change_pool
   ("deferred_change_pool", 10);
 
-static deferred_change_t deferred_change_list = NULL;
+static deferred_change *deferred_change_list = NULL;
 
 /* The group that holds all of the clear_alias_sets.  */
-static group_info_t clear_alias_group;
+static group_info *clear_alias_group;
 
 /* The modes of the clear_alias_sets.  */
 static htab_t clear_alias_mode_table;
@@ -680,11 +675,11 @@ static hash_table 
*rtx_group_table;
 
 /* Get the GROUP for BASE.  Add a new group if it is not there.  */
 
-static group_info_t
+static group_info *
 get_group_info (rtx base)
 {
   struct group_info tmp_gi;
-  group_info_t gi;
+  group_info *gi;
   group_info **slot;
 
   if (base)
@@ -693,7 +688,7 @@ get_group_info (rtx base)
 if necessary.  */
   tmp_gi.rtx_base = base;
   slot = rtx_group_table->find_slot (&tmp_gi, INSERT);
-  gi = (group_info_t) *slot;
+  gi = *slot;
 }
   else
 {
@@ -790,17 +785,17 @@ dse_step0 (void)
 static void
 free_store_info (insn_info_t insn_info)
 {
-  store_info_t store_info = insn_info->store_rec;
-  while (store_info)
+  store_info *cur = insn_info->store_rec;
+  while (cur)
 {
-  store_info_t next = store_info->next;
-  if (store_info->is_large)
-   BITMAP_FREE (store_info->positions_needed.large.bmap);
-  if (store_info->cse_base)
-   cse_store_info_pool.remove (store_info);
+  store_info *next = cur->next;
+  if (cur->is_large)
+   BITMAP_FREE (cur->positions_needed.large.bmap);
+  if (cur->cse_base)
+   cse_store_info_pool.remove (cur);
   else
-   rtx_store_info_pool.remove (store_info);
-  store_info = next;
+   rtx_store_info_pool.remove (cur);
+  cur = next;
 }
 
   insn_info->cannot_delete = true;
@@ -1015,7 +1010,7 @@ can_escape (tree expr)
OFFSET and WIDTH.  */
 
 static void
-set_usage_bits (group_info_t group, HOST_WIDE_INT offset, HOST_WIDE_INT width,
+set_usage_bits (group_info *group, HOST_WIDE_INT offset, HOST_WIDE_INT width,
 tree expr)
 {
   HOST_WIDE_INT i;
@@ -1240,7 +1235,7 @@ canon_address (rtx mem,
   if (ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (mem))
  && const_or_frame_p (address))
{
- group_info_t group = get_group_info (address);
+ group_info *group = get_group_info (address);
 
  if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "

[PATCH 01/10] don't typedef alias_set_entry and unhide pointerness

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-02  Trevor Saunders  

* alias.c (alias_set_entry_d): Rename to alias_set_entry.
(alias_set_entry): Remove typedef.
(alias_set_subset_of): Adjust.
(alias_sets_conflict_p): Likewise.
(init_alias_set_entry): Likewise.
(get_alias_set): Likewise.
(new_alias_set): Likewise.
(record_alias_subset): Likewise.
---
 gcc/alias.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/gcc/alias.c b/gcc/alias.c
index 4681e3f..a0e8616 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -134,7 +134,7 @@ along with GCC; see the file COPYING3.  If not see
 
 struct alias_set_hash : int_hash  {};
 
-struct GTY(()) alias_set_entry_d {
+struct GTY(()) alias_set_entry {
   /* The alias set number, as stored in MEM_ALIAS_SET.  */
   alias_set_type alias_set;
 
@@ -158,7 +158,6 @@ struct GTY(()) alias_set_entry_d {
   /* Nonzero if is_pointer or if one of childs have has_pointer set.  */
   bool has_pointer;
 };
-typedef struct alias_set_entry_d *alias_set_entry;
 
 static int rtx_equal_for_memref_p (const_rtx, const_rtx);
 static int memrefs_conflict_p (int, rtx, int, rtx, HOST_WIDE_INT);
@@ -167,7 +166,7 @@ static int base_alias_check (rtx, rtx, rtx, rtx, 
machine_mode,
 machine_mode);
 static rtx find_base_value (rtx);
 static int mems_in_disjoint_alias_sets_p (const_rtx, const_rtx);
-static alias_set_entry get_alias_set_entry (alias_set_type);
+static alias_set_entry *get_alias_set_entry (alias_set_type);
 static tree decl_for_component_ref (tree);
 static int write_dependence_p (const_rtx,
   const_rtx, machine_mode, rtx,
@@ -288,7 +287,7 @@ static bool copying_arguments;
 
 
 /* The splay-tree used to store the various alias set entries.  */
-static GTY (()) vec<alias_set_entry, va_gc> *alias_sets;
+static GTY (()) vec<alias_set_entry *, va_gc> *alias_sets;
 
 /* Build a decomposed reference object for querying the alias-oracle
from the MEM rtx and store it in *REF.
@@ -395,7 +394,7 @@ rtx_refs_may_alias_p (const_rtx x, const_rtx mem, bool 
tbaa_p)
 /* Returns a pointer to the alias set entry for ALIAS_SET, if there is
such an entry, or NULL otherwise.  */
 
-static inline alias_set_entry
+static inline alias_set_entry *
 get_alias_set_entry (alias_set_type alias_set)
 {
   return (*alias_sets)[alias_set];
@@ -417,7 +416,7 @@ mems_in_disjoint_alias_sets_p (const_rtx mem1, const_rtx 
mem2)
 bool
 alias_set_subset_of (alias_set_type set1, alias_set_type set2)
 {
-  alias_set_entry ase2;
+  alias_set_entry *ase2;
 
   /* Everything is a subset of the "aliases everything" set.  */
   if (set2 == 0)
@@ -453,7 +452,7 @@ alias_set_subset_of (alias_set_type set1, alias_set_type 
set2)
  get_alias_set for more details.  */
   if (ase2 && ase2->has_pointer)
 {
-  alias_set_entry ase1 = get_alias_set_entry (set1);
+  alias_set_entry *ase1 = get_alias_set_entry (set1);
 
   if (ase1 && ase1->is_pointer)
{
@@ -477,8 +476,8 @@ alias_set_subset_of (alias_set_type set1, alias_set_type 
set2)
 int
 alias_sets_conflict_p (alias_set_type set1, alias_set_type set2)
 {
-  alias_set_entry ase1;
-  alias_set_entry ase2;
+  alias_set_entry *ase1;
+  alias_set_entry *ase2;
 
   /* The easy case.  */
   if (alias_sets_must_conflict_p (set1, set2))
@@ -808,10 +807,10 @@ alias_ptr_types_compatible_p (tree t1, tree t2)
 
 /* Create emptry alias set entry.  */
 
-alias_set_entry
+alias_set_entry *
 init_alias_set_entry (alias_set_type set)
 {
-  alias_set_entry ase = ggc_alloc<alias_set_entry_d> ();
+  alias_set_entry *ase = ggc_alloc<alias_set_entry> ();
   ase->alias_set = set;
   ase->children = NULL;
   ase->has_zero_child = false;
@@ -1057,7 +1056,7 @@ get_alias_set (tree t)
   /* We treat pointer types specially in alias_set_subset_of.  */
   if (POINTER_TYPE_P (t) && set)
 {
-  alias_set_entry ase = get_alias_set_entry (set);
+  alias_set_entry *ase = get_alias_set_entry (set);
   if (!ase)
ase = init_alias_set_entry (set);
   ase->is_pointer = true;
@@ -1075,8 +1074,8 @@ new_alias_set (void)
   if (flag_strict_aliasing)
 {
   if (alias_sets == 0)
-   vec_safe_push (alias_sets, (alias_set_entry) 0);
-  vec_safe_push (alias_sets, (alias_set_entry) 0);
+   vec_safe_push (alias_sets, (alias_set_entry *) NULL);
+  vec_safe_push (alias_sets, (alias_set_entry *) NULL);
   return alias_sets->length () - 1;
 }
   else
@@ -1099,8 +1098,8 @@ new_alias_set (void)
 void
 record_alias_subset (alias_set_type superset, alias_set_type subset)
 {
-  alias_set_entry superset_entry;
-  alias_set_entry subset_entry;
+  alias_set_entry *superset_entry;
+  alias_set_entry *subset_entry;
 
   /* It is possible in complex type situations for both sets to be the same,
  in which case we can ignore this operation.  */
-- 
2.4.0

[PATCH 06/10] tree-ssa-ter.c: remove typedefs that hide pointerness

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-03  Trevor Saunders  

* tree-ssa-ter.c (temp_expr_table_d): Rename to temp_expr_table
and remove typedef.
(new_temp_expr_table): Adjust.
(free_temp_expr_table): Likewise.
(version_to_be_replaced_p): Likewise.
(make_dependent_on_partition): Likewise.
(add_to_partition_kill_list): Likewise.
(remove_from_partition_kill_list): Likewise.
(add_dependence): Likewise.
(finished_with_expr): Likewise.
(process_replaceable): Likewise.
(kill_expr): Likewise.
(kill_virtual_exprs): Likewise.
(mark_replaceable): Likewise.
(find_replaceable_in_bb): Likewise.
(find_replaceable_exprs): Likewise.
(debug_ter): Likewise.
---
 gcc/tree-ssa-ter.c | 39 +++
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/gcc/tree-ssa-ter.c b/gcc/tree-ssa-ter.c
index f7ca95b..17686a9 100644
--- a/gcc/tree-ssa-ter.c
+++ b/gcc/tree-ssa-ter.c
@@ -162,7 +162,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Temporary Expression Replacement (TER) table information.  */
 
-typedef struct temp_expr_table_d
+struct temp_expr_table
 {
   var_map map;
   bitmap *partition_dependencies;  /* Partitions expr is dependent on.  */
@@ -174,7 +174,7 @@ typedef struct temp_expr_table_d
   bitmap new_replaceable_dependencies; /* Holding place for pending dep's.  */
   int *num_in_part;/* # of ssa_names in a partition.  */
   int *call_cnt;   /* Call count at definition.  */
-} *temp_expr_table_p;
+};
 
 /* Used to indicate a dependency on VDEFs.  */
 #define VIRTUAL_PARTITION(table)   (table->virtual_partition)
@@ -183,19 +183,18 @@ typedef struct temp_expr_table_d
 static bitmap_obstack ter_bitmap_obstack;
 
 #ifdef ENABLE_CHECKING
-extern void debug_ter (FILE *, temp_expr_table_p);
+extern void debug_ter (FILE *, temp_expr_table *);
 #endif
 
 
 /* Create a new TER table for MAP.  */
 
-static temp_expr_table_p
+static temp_expr_table *
 new_temp_expr_table (var_map map)
 {
-  temp_expr_table_p t;
   unsigned x;
 
-  t = XNEW (struct temp_expr_table_d);
+  temp_expr_table *t = XNEW (struct temp_expr_table);
   t->map = map;
 
   t->partition_dependencies = XCNEWVEC (bitmap, num_ssa_names + 1);
@@ -229,7 +228,7 @@ new_temp_expr_table (var_map map)
vector.  */
 
 static bitmap
-free_temp_expr_table (temp_expr_table_p t)
+free_temp_expr_table (temp_expr_table *t)
 {
   bitmap ret = NULL;
 
@@ -264,7 +263,7 @@ free_temp_expr_table (temp_expr_table_p t)
 /* Return TRUE if VERSION is to be replaced by an expression in TAB.  */
 
 static inline bool
-version_to_be_replaced_p (temp_expr_table_p tab, int version)
+version_to_be_replaced_p (temp_expr_table *tab, int version)
 {
   if (!tab->replaceable_expressions)
 return false;
@@ -276,7 +275,7 @@ version_to_be_replaced_p (temp_expr_table_p tab, int 
version)
the expression table */
 
 static inline void
-make_dependent_on_partition (temp_expr_table_p tab, int version, int p)
+make_dependent_on_partition (temp_expr_table *tab, int version, int p)
 {
   if (!tab->partition_dependencies[version])
 tab->partition_dependencies[version] = BITMAP_ALLOC (&ter_bitmap_obstack);
@@ -288,7 +287,7 @@ make_dependent_on_partition (temp_expr_table_p tab, int 
version, int p)
 /* Add VER to the kill list for P.  TAB is the expression table */
 
 static inline void
-add_to_partition_kill_list (temp_expr_table_p tab, int p, int ver)
+add_to_partition_kill_list (temp_expr_table *tab, int p, int ver)
 {
   if (!tab->kill_list[p])
 {
@@ -303,7 +302,7 @@ add_to_partition_kill_list (temp_expr_table_p tab, int p, 
int ver)
table.  */
 
 static inline void
-remove_from_partition_kill_list (temp_expr_table_p tab, int p, int version)
+remove_from_partition_kill_list (temp_expr_table *tab, int p, int version)
 {
   gcc_checking_assert (tab->kill_list[p]);
   bitmap_clear_bit (tab->kill_list[p], version);
@@ -321,7 +320,7 @@ remove_from_partition_kill_list (temp_expr_table_p tab, int 
p, int version)
expression table.  */
 
 static void
-add_dependence (temp_expr_table_p tab, int version, tree var)
+add_dependence (temp_expr_table *tab, int version, tree var)
 {
   int i;
   bitmap_iterator bi;
@@ -372,7 +371,7 @@ add_dependence (temp_expr_table_p tab, int version, tree 
var)
expression from consideration as well by freeing the decl uid bitmap.  */
 
 static void
-finished_with_expr (temp_expr_table_p tab, int version, bool free_expr)
+finished_with_expr (temp_expr_table *tab, int version, bool free_expr)
 {
   unsigned i;
   bitmap_iterator bi;
@@ -444,7 +443,7 @@ ter_is_replaceable_p (gimple stmt)
 /* Create an expression entry for a replaceable expression.  */
 
 static void
-process_replaceable (temp_expr_table_p tab, gimple stmt, int call_cnt)
+process_replaceable

[PATCH 00/10] removal of typedefs that hide pointerness episode 1

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

Hi,

Personally I think hiding that variables are pointers is confusing, and I
believe consensus is we should move away from this style.  So this series
starts to do that.
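
As a minimal sketch of the style change this series makes (the `node` type and both helper functions are hypothetical and do not come from the GCC sources), compare a pointer hidden behind a typedef with the same pointer spelled out. One practical consequence is visible in the dse.c patch, which had to carry a separate `const_group_info_t` typedef: `const` applied to a pointer typedef makes the pointer const, not the object it points to.

```cpp
// Hypothetical record type, standing in for structs such as store_info.
struct node
{
  int value;
  node *next;
};

// Old style: the typedef hides that node_t is a pointer.
typedef struct node *node_t;

// Nothing at the use site shows that N is a pointer, and "const node_t"
// means "node *const": the pointer is const, the node is still writable.
static int
sum_hidden (const node_t n)
{
  int total = 0;
  for (node_t cur = n; cur; cur = cur->next)
    total += cur->value;
  return total;
}

// New style: the pointer is explicit, and const applies to the node.
static int
sum_explicit (const node *n)
{
  int total = 0;
  for (const node *cur = n; cur; cur = cur->next)
    total += cur->value;
  return total;
}
```

Both functions walk the same list and return the same sum; only the second makes the pointerness and the constness readable from the signature.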

patches individually bootstrapped + regtested on x86_64-linux-gnu, ok?

Trev

Trevor Saunders (10):
  don't typedef alias_set_entry and unhide pointerness
  dse.c: remove some typedefs that hide pointerness
  var-tracking.c: remove typedef of location_chain
  var-tracking.c: remove typedef of shared_hash
  bt-load.c: remove typedefs that hide pointerness
  tree-ssa-ter.c: remove typedefs that hide pointerness
  tree-vrp.c: remove typedefs that hide pointerness
  dwarf2cfi.c: remove typedef that hides pointerness
  dwarf2out.c: remove typedefs that hide pointerness
  tree-ssa-loop-im.c: remove typedefs that hide pointerness

 gcc/alias.c|  31 +++--
 gcc/bt-load.c  | 140 ++--
 gcc/dse.c  | 115 -
 gcc/dwarf2cfi.c|   5 +-
 gcc/dwarf2out.c| 340 -
 gcc/tree-ssa-loop-im.c |  98 +++---
 gcc/tree-ssa-ter.c |  39 +++---
 gcc/tree-vrp.c |  22 ++--
 gcc/var-tracking.c | 192 ++--
 9 files changed, 479 insertions(+), 503 deletions(-)

-- 
2.4.0

Re: [PATCH] [RTEMS] Update RTEMS thread model

2015-09-02 Thread Sebastian Huber


Committed:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=227428

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.

[PATCH 09/10] dwarf2out.c: remove typedefs that hide pointerness

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-03  Trevor Saunders  

* dwarf2out.c (dw_attr_ref): Remove typedef.
(dw_line_info_ref): Likewise.
(pubname_ref): Likewise.
(dw_ranges_ref): Likewise.
(dw_ranges_by_label_ref): Likewise.
(comdat_type_node_ref): Likewise.
(dw_line_info_table_struct): Rename to dw_line_info_table.
(get_AT): Adjust.
(get_AT_low_pc): Likewise.
(get_AT_hi_pc): Likewise.
(get_AT_string): Likewise.
(get_AT_flag): Likewise.
(get_AT_unsigned): Likewise.
(get_AT_ref): Likewise.
(get_AT_file): Likewise.
(remove_AT): Likewise.
(print_die): Likewise.
(check_die): Likewise.
(die_checksum): Likewise.
(attr_checksum_ordered): Likewise.
(struct checksum_attributes): Likewise.
(collect_checksum_attributes): Likewise.
(die_checksum_ordered): Likewise.
(same_die_p): Likewise.
(is_declaration_die): Likewise.
(clone_die): Likewise.
(clone_as_declaration): Likewise.
(copy_declaration_context): Likewise.
(break_out_comdat_types): Likewise.
(copy_decls_walk): Likewise.
(output_location_lists): Likewise.
(external_ref_hasher::hash): Likewise.
(optimize_external_refs_1): Likewise.
(build_abbrev_table): Likewise.
(size_of_die): Likewise.
(unmark_all_dies): Likewise.
(size_of_pubnames): Likewise.
(output_die_abbrevs): Likewise.
(output_die): Likewise.
(output_pubnames): Likewise.
(add_ranges_num): Likewise.
(add_ranges_by_labels): Likewise.
(add_high_low_attributes): Likewise.
(gen_producer_string): Likewise.
(dwarf2out_set_name): Likewise.
(new_line_info_table): Likewise.
(prune_unused_types_walk_attribs): Likewise.
(prune_unused_types_update_strings): Likewise.
(prune_unused_types): Likewise.
(resolve_addr): Likewise.
(optimize_location_lists_1): Likewise.
(index_location_lists): Likewise.
(dwarf2out_finish): Likewise.
---
 gcc/dwarf2out.c | 340 +++-
 1 file changed, 163 insertions(+), 177 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index d9d3063..f441e92 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -2550,14 +2550,7 @@ const struct gcc_debug_hooks dwarf2_lineno_debug_hooks =
 
 typedef long int dw_offset;
 
-/* Define typedefs here to avoid circular dependencies.  */
-
-typedef struct dw_attr_struct *dw_attr_ref;
-typedef struct dw_line_info_struct *dw_line_info_ref;
-typedef struct pubname_struct *pubname_ref;
-typedef struct dw_ranges_struct *dw_ranges_ref;
-typedef struct dw_ranges_by_label_struct *dw_ranges_by_label_ref;
-typedef struct comdat_type_struct *comdat_type_node_ref;
+struct comdat_type_node;
 
 /* The entries in the line_info table more-or-less mirror the opcodes
that are used in the real dwarf line table.  Arrays of these entries
@@ -2596,7 +2589,7 @@ typedef struct GTY(()) dw_line_info_struct {
 } dw_line_info_entry;
 
 
-typedef struct GTY(()) dw_line_info_table_struct {
+struct GTY(()) dw_line_info_table {
   /* The label that marks the end of this section.  */
   const char *end_label;
 
@@ -2610,9 +2603,7 @@ typedef struct GTY(()) dw_line_info_table_struct {
   bool in_use;
 
   vec *entries;
-} dw_line_info_table;
-
-typedef dw_line_info_table *dw_line_info_table_p;
+};
 
 
 /* Each DIE attribute has a field specifying the attribute kind,
@@ -2634,7 +2625,7 @@ typedef struct GTY((chain_circular ("%h.die_sib"), 
for_user)) die_struct {
   union die_symbol_or_type_node
 {
   const char * GTY ((tag ("0"))) die_symbol;
-  comdat_type_node_ref GTY ((tag ("1"))) die_type_node;
+  comdat_type_node *GTY ((tag ("1"))) die_type_node;
 }
   GTY ((desc ("%0.comdat_type_p"))) die_id;
   vec *die_attr;
@@ -2680,7 +2671,7 @@ typedef struct GTY(()) pubname_struct {
 pubname_entry;
 
 
-struct GTY(()) dw_ranges_struct {
+struct GTY(()) dw_ranges {
   /* If this is positive, it's a block number, otherwise it's a
  bitwise-negated index into dw_ranges_by_label.  */
   int num;
@@ -2696,21 +2687,20 @@ typedef struct GTY(()) macinfo_struct {
 macinfo_entry;
 
 
-struct GTY(()) dw_ranges_by_label_struct {
+struct GTY(()) dw_ranges_by_label {
   const char *begin;
   const char *end;
 };
 
 /* The comdat type node structure.  */
-typedef struct GTY(()) comdat_type_struct
+struct GTY(()) comdat_type_node
 {
   dw_die_ref root_die;
   dw_die_ref type_die;
   dw_die_ref skeleton_die;
   char signature[DWARF_TYPE_SIGNATURE_SIZE];
-  struct comdat_type_struct *next;
-}
-comdat_type_node;
+  comdat_type_node *next;
+};
 
 /* A list of DIEs for which we can't

[PATCH 10/10] tree-ssa-loop-im.c: remove typedefs that hide pointerness

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-03  Trevor Saunders  

* tree-ssa-loop-im.c (mem_ref_loc_p): Remove typedef.
(mem_ref_p): Likewise.
(outermost_indep_loop): Adjust.
(mem_ref_in_stmt): Likewise.
(determine_max_movement): Likewise.
(mem_ref_alloc): Likewise.
(record_mem_ref_loc): Likewise.
(set_ref_stored_in_loop): Likewise.
(mark_ref_stored): Likewise.
(gather_mem_refs_stmt): Likewise.
(mem_refs_may_alias_p): Likewise.
(for_all_locs_in_loop): Likewise.
(struct rewrite_mem_ref_loc): Likewise.
(rewrite_mem_refs): Likewise.
(struct first_mem_ref_loc_1): Likewise.
(first_mem_ref_loc): Likewise.
(struct sm_set_flag_if_changed): Likewise.
(execute_sm_if_changed_flag_set): Likewise.
(execute_sm): Likewise.
(hoist_memory_references): Likewise.
(struct ref_always_accessed): Likewise.
(ref_always_accessed_p): Likewise.
(refs_independent_p): Likewise.
(record_dep_loop): Likewise.
(ref_indep_loop_p_1): Likewise.
(ref_indep_loop_p_2): Likewise.
(ref_indep_loop_p): Likewise.
(can_sm_ref_p): Likewise.
(find_refs_for_sm): Likewise.
(tree_ssa_lim_finalize): Likewise.
---
 gcc/tree-ssa-loop-im.c | 98 +-
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index b85d9cb..f1d4a8c 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -102,16 +102,16 @@ static hash_map *lim_aux_data_map;
 
 /* Description of a memory reference location.  */
 
-typedef struct mem_ref_loc
+struct mem_ref_loc
 {
   tree *ref;   /* The reference itself.  */
   gimple stmt; /* The statement in that it occurs.  */
-} *mem_ref_loc_p;
+};
 
 
 /* Description of a memory reference.  */
 
-typedef struct im_mem_ref
+struct im_mem_ref
 {
   unsigned id; /* ID assigned to the memory reference
   (its index in memory_accesses.refs_list)  */
@@ -138,7 +138,7 @@ typedef struct im_mem_ref
   If it is only loaded, then it is independent
 on all stores in the loop.  */
   bitmap_head dep_loop;/* The complement of INDEP_LOOP.  */
-} *mem_ref_p;
+};
 
 /* We use two bits per loop in the ref->{in,}dep_loop bitmaps, the first
to record (in)dependence against stores in the loop and its subloops, the
@@ -181,7 +181,7 @@ static struct
   hash_table *refs;
 
   /* The list of memory references.  */
-  vec<mem_ref_p> refs_list;
+  vec<im_mem_ref *> refs_list;
 
   /* The set of memory references accessed in each loop.  */
   vec refs_in_loop;
@@ -200,7 +200,7 @@ static struct
 static bitmap_obstack lim_bitmap_obstack;
 static obstack mem_ref_obstack;
 
-static bool ref_indep_loop_p (struct loop *, mem_ref_p);
+static bool ref_indep_loop_p (struct loop *, im_mem_ref *);
 
 /* Minimum cost of an expensive expression.  */
 #define LIM_EXPENSIVE ((unsigned) PARAM_VALUE (PARAM_LIM_EXPENSIVE))
@@ -537,7 +537,7 @@ stmt_cost (gimple stmt)
instead.  */
 
 static struct loop *
-outermost_indep_loop (struct loop *outer, struct loop *loop, mem_ref_p ref)
+outermost_indep_loop (struct loop *outer, struct loop *loop, im_mem_ref *ref)
 {
   struct loop *aloop;
 
@@ -590,13 +590,13 @@ simple_mem_ref_in_stmt (gimple stmt, bool *is_store)
 
 /* Returns the memory reference contained in STMT.  */
 
-static mem_ref_p
+static im_mem_ref *
 mem_ref_in_stmt (gimple stmt)
 {
   bool store;
   tree *mem = simple_mem_ref_in_stmt (stmt, &store);
   hashval_t hash;
-  mem_ref_p ref;
+  im_mem_ref *ref;
 
   if (!mem)
 return NULL;
@@ -790,7 +790,7 @@ determine_max_movement (gimple stmt, bool 
must_preserve_exec)
 
   if (gimple_vuse (stmt))
 {
-  mem_ref_p ref = mem_ref_in_stmt (stmt);
+  im_mem_ref *ref = mem_ref_in_stmt (stmt);
 
   if (ref)
{
@@ -1420,10 +1420,10 @@ memref_free (struct im_mem_ref *mem)
 /* Allocates and returns a memory reference description for MEM whose hash
value is HASH and id is ID.  */
 
-static mem_ref_p
+static im_mem_ref *
 mem_ref_alloc (tree mem, unsigned hash, unsigned id)
 {
-  mem_ref_p ref = XOBNEW (&mem_ref_obstack, struct im_mem_ref);
+  im_mem_ref *ref = XOBNEW (&mem_ref_obstack, struct im_mem_ref);
   ao_ref_init (&ref->mem, mem);
   ref->id = id;
   ref->hash = hash;
@@ -1439,7 +1439,7 @@ mem_ref_alloc (tree mem, unsigned hash, unsigned id)
description REF.  The reference occurs in statement STMT.  */
 
 static void
-record_mem_ref_loc (mem_ref_p ref, gimple stmt, tree *loc)
+record_mem_ref_loc (im_mem_ref *ref, gimple stmt, tree *loc)
 {
   mem_ref_loc aref;
   aref.stmt = stmt;
@@ -1451,7 +1451,7 @@ record_mem_ref_loc (mem_ref_p ref, gimple stmt, tree *loc)

[PATCH 08/10] dwarf2cfi.c: remove typedef that hides pointerness

2015-09-02 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2015-09-03  Trevor Saunders  

* dwarf2cfi.c (dw_trace_info_ref): Remove typedef.
---
 gcc/dwarf2cfi.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index ab18062..1cfa6a7 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -160,9 +160,6 @@ struct dw_trace_info
 };
 
 
-typedef dw_trace_info *dw_trace_info_ref;
-
-
 /* Hashtable helpers.  */
 
 struct trace_info_hasher : nofree_ptr_hash <dw_trace_info>
@@ -186,7 +183,7 @@ trace_info_hasher::equal (const dw_trace_info *a, const dw_trace_info *b)
 
 /* The variables making up the pseudo-cfg, as described above.  */
 static vec<dw_trace_info> trace_info;
-static vec<dw_trace_info_ref> trace_work_list;
+static vec<dw_trace_info *> trace_work_list;
 static hash_table<trace_info_hasher> *trace_index;
 
 /* A vector of call frame insns for the CIE.  */
-- 
2.4.0

Re: [PATCH v2] [libstdc++] Run tests on RTEMS

2015-09-02 Thread Sebastian Huber




On 01/09/15 23:07, Jeff Law wrote:

On 09/01/2015 05:02 AM, Sebastian Huber wrote:

v2: Include all options and not only "dg-do run ...".

libstdc++-v3/ChangeLog
2015-09-01  Sebastian Huber 

testsuite/*: Use 's/\*-\*-cygwin\* /&*-*-rtems* /' to add RTEMS
target selector to all tests that run on Cygwin.

So presumably those tests actually run correctly :-)

I don't think the ChangeLog is strictly OK according to standards. 
Every file changed is supposed to be listed.  I know it's a pain, but 
until we change those requirements it's probably best to stick with 
current standards.


Given a context diff or a unidiff, contrib/mklog can generate a 
skeleton ChangeLog entry for all the referenced files.


I think

* firstfile: What changed.
* secondfile: Likewise.
* thirdfile: Likewise.

Is fine.

OK with the fixed ChangeLog.


Committed:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=227429

[hsa] Stricter target_follows_kernelizable_pattern

2015-09-02 Thread Martin Jambor

Hi,

the patch below makes target_follows_kernelizable_pattern stricter by
adding a few checks for clauses that have to preclude kernelization.
Committed to the branch.

Thanks,

Martin


2015-09-02  Martin Jambor  

* omp-low.c (target_follows_kernelizable_pattern): Parallel
num_thread clause and non-automatic loop schedule preclude
kernelization.
---
 gcc/ChangeLog.hsa |  6 ++
 gcc/omp-low.c | 32 ++--
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 6c2bbe7..d6c521f 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -2832,9 +2832,23 @@ target_follows_kernelizable_pattern (gomp_target *target, tree *group_size_p,
   gomp_parallel *par;
   if (!stmt || !(par = dyn_cast <gomp_parallel *> (stmt)))
 return NULL;
+
+  tree clauses = gimple_omp_parallel_clauses (par);
+  tree num_threads_clause = find_omp_clause (clauses, OMP_CLAUSE_NUM_THREADS);
+  if (num_threads_clause)
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, tloc,
+"Will not turn target construct into a "
+"simple GPGPU kernel because there is a num_threads "
+"clause of the parallel construct that "
+"is likely to require looping \n");
+  return NULL;
+}
+
   stmt = single_stmt_in_seq_skip_bind (gimple_omp_body (par), tloc, "parallel");
-  /* FIXME: We are currently ignoring parallel clauses and potentially also
- sharing clauses of teams and distribute, if there are any. We need to
+  /* FIXME: We are currently ignoring parallel sharing clauses and potentially
+ also sharing clauses of teams and distribute, if there are any. We need to
  check they can be skipped.  */
   gomp_for *gfor;
   if (!stmt || !(gfor = dyn_cast <gomp_for *> (stmt)))
@@ -2859,6 +2873,20 @@ target_follows_kernelizable_pattern (gomp_target *target, tree *group_size_p,
   return NULL;
 }
 
+  clauses = gimple_omp_for_clauses (gfor);
+  tree for_sched_clause = find_omp_clause (clauses, OMP_CLAUSE_SCHEDULE);
+
+  if (for_sched_clause
+  && OMP_CLAUSE_SCHEDULE_KIND (for_sched_clause) != OMP_CLAUSE_SCHEDULE_AUTO)
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, tloc,
+"Will not turn target construct into a simple GPGPU "
+"kernel because the inner loop has non-automatic "
+"scheduling clause\n");
+  return NULL;
+}
+
   if (teams)
 gather_inner_locals (gimple_omp_body (teams), kri);
   if (dist)
-- 
2.4.6
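
The two checks added by the patch above can be modelled outside the compiler. This is only a sketch under stated assumptions: `clause`, `clause_code`, `schedule_kind`, `find_clause` and `clauses_allow_kernelization` are hypothetical stand-ins, not GCC's tree-based clause representation or its `find_omp_clause`. It shows the decision logic only: a `num_threads` clause on the parallel construct, or a schedule clause on the loop that is anything other than auto, rules out kernelization.

```cpp
#include <vector>

// Hypothetical clause codes and schedule kinds (not GCC's enumerations).
enum clause_code { CLAUSE_NUM_THREADS, CLAUSE_SCHEDULE, CLAUSE_SHARED };
enum schedule_kind { SCHED_STATIC, SCHED_DYNAMIC, SCHED_AUTO };

struct clause
{
  clause_code code;
  schedule_kind sched;  // Only meaningful when code == CLAUSE_SCHEDULE.
};

// Return the first clause with code CODE, or nullptr if there is none.
static const clause *
find_clause (const std::vector<clause> &clauses, clause_code code)
{
  for (const clause &c : clauses)
    if (c.code == code)
      return &c;
  return nullptr;
}

// Return true if neither of the two new checks rejects the construct.
static bool
clauses_allow_kernelization (const std::vector<clause> &parallel_clauses,
                             const std::vector<clause> &for_clauses)
{
  // A num_threads clause on the parallel construct precludes kernelization.
  if (find_clause (parallel_clauses, CLAUSE_NUM_THREADS))
    return false;

  // So does any explicit loop schedule other than auto.
  const clause *sched = find_clause (for_clauses, CLAUSE_SCHEDULE);
  if (sched && sched->sched != SCHED_AUTO)
    return false;

  return true;
}
```

A loop with no schedule clause at all still passes, which matches the patch: the check only fires when a schedule clause is present and its kind is not auto.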

Re: RFC: Combine of compare & and oddity

2015-09-02 Thread Segher Boessenkool

Hi Wilco,

On Wed, Sep 02, 2015 at 06:09:24PM +0100, Wilco Dijkstra wrote:
> Combine canonicalizes certain AND masks in a comparison with zero into
> extracts of the widest register type. During matching these are expanded
> into a very inefficient sequence that fails to match. For example
> (x & 2) == 0 is matched in combine like this:
> 
> Failed to match this instruction:
> (set (reg:CC 66 cc)
> (compare:CC (zero_extract:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 0)
> (const_int 1 [0x1])
> (const_int 1 [0x1]))
> (const_int 0 [0])))

Yes.  Some processors even have specific instructions to do this.

> Failed to match this instruction:
> (set (reg:CC 66 cc)
> (compare:CC (and:DI (lshiftrt:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 0)
> (const_int 1 [0x1]))
> (const_int 1 [0x1]))
> (const_int 0 [0])))

This is after r223067.  Combine tests only one "final" instruction; that
revision rewrites zero_ext* if it doesn't match and tries again.  This
helps for processors that can do more generic masks (like rs6000, and I
believe also aarch64?): without it, you need many more patterns to match
all the zero_ext{ract,end} cases.

> Neither matches the AArch64 patterns for ANDS/TST (which is just compare
> and AND). If the immediate is not a power of 2 or a power of 2 -1 then it
> matches correctly as expected.
> 
> I don't understand how ((x >> 1) & 1) != 0 could be a useful expansion

It is zero_extract(x,1,1) really.  This is convenient for (old and embedded)
processors that have special bit-test instructions.  If we now want combine
to not do this, we'll have to update all backends that rely on it.
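
For readers less familiar with RTL, here is a small software model of the three forms discussed in this thread. It is a sketch, not combine's code: the `zero_extract` function below is a hypothetical helper that assumes bit 0 is the least significant bit (on targets where BITS_BIG_ENDIAN is set, the RTL position operand counts from the other end), and the three `bit1_clear_*` names are made up for illustration.

```cpp
#include <cstdint>

// Model of zero_extract: take WIDTH bits of VALUE starting at bit POS
// (bit 0 = least significant) and zero-extend them.
// Assumes 0 < width < 64 and pos + width <= 64.
static uint64_t
zero_extract (uint64_t value, unsigned width, unsigned pos)
{
  uint64_t mask = (UINT64_C (1) << width) - 1;
  return (value >> pos) & mask;
}

// The source-level test from the thread: (x & 2) == 0.
static bool
bit1_clear_and (uint32_t x)
{
  return (x & 2) == 0;
}

// The canonical form: zero_extract (x, 1, 1) == 0.
static bool
bit1_clear_extract (uint32_t x)
{
  return zero_extract (x, 1, 1) == 0;
}

// The form after the zero_extract is rewritten: ((x >> 1) & 1) == 0.
static bool
bit1_clear_shift (uint32_t x)
{
  return ((x >> 1) & 1) == 0;
}
```

All three predicates agree for every input when the result is only compared for equality with zero; the disagreement Wilco describes is about which of these shapes a backend has patterns for, not about their value.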

> (it even uses shifts by 0 at times which are unlikely to ever match anything).

It matches fine on some targets.  It certainly looks silly, and could be
expressed simpler.  Patch should be simple; watch this space / remind me /
or file a PR.

> Why does combine not try to match the obvious (x & C) != 0 case?
> Single-bit and mask tests are very common, so this blocks efficient code
> generation on many targets.

They are common, and many processors had instructions for them, which is
(supposedly) why combine special-cased them.

> It's trivial to change change_zero_ext to expand extracts always into AND
> and remove the redundant subreg.

Not really trivial...  Think about comparisons...

int32_t x;
((x >> 31) & 1) > 0;
// is not the same as
(x & 0x80000000) > 0; // signed comparison

(You do not easily know what the COMPARE is used for).
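
The difference can be checked with a short program. This is a sketch under the assumption that the comparison is done on signed 32-bit values, as the comment in the example says; the function names are hypothetical. Extracting bit 31 yields 0 or 1, while masking bit 31 in place yields 0 or INT32_MIN, which is negative.

```cpp
#include <cstdint>

// Bit 31 extracted as a 0/1 value, then compared as signed: true whenever
// the sign bit is set.  The unsigned cast avoids shifting a negative value.
static bool
extract_gt_zero (int32_t x)
{
  int32_t bit = static_cast<int32_t> ((static_cast<uint32_t> (x) >> 31) & 1);
  return bit > 0;
}

// Bit 31 masked in place, then compared as signed: the only nonzero result
// is INT32_MIN, which is negative, so this is never true.
static bool
mask_gt_zero (int32_t x)
{
  int32_t masked = x & INT32_MIN;
  return masked > 0;
}
```

For any negative `x` the first function returns true and the second returns false, so replacing the extract with a plain AND is only safe when the comparison is known to be an equality test or unsigned.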

> However wouldn't it make more sense to never special case certain AND
> immediate in the first place?

Yes, but we need to make sure no targets regress (i.e. by rewriting patterns
for those that do).  We need to know the impact of such a change, at the least.

Segher

[gomp4] remove xfails in the libgomp reduction tests

2015-09-02 Thread Cesar Philippidis

A couple of reduction tests inside libgomp had xfails because Julian
added those tests before my reduction patches were ready. Most of them
should pass unmodified, but I found a bug in
loop-reduction-wv-p-3.c, where a private variable was used without being
initialized. This patch fixes that bug and removes the xfails from the
reduction test cases.

This patch has been committed to gomp-4_0-branch.
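
To illustrate the kind of bug described above, here is a minimal sketch loosely modelled on the shape of these tests. It is not the committed test: the function name, array sizes and data clauses are assumptions. A variable listed in `private` has no defined value inside the region, so it must be assigned before the inner loop accumulates into it. Without OpenACC enabled the pragmas are ignored and the code runs serially.

```cpp
enum { GANGS = 32, PER_GANG = 32 };

// Sum each row of ARR (GANGS rows of PER_GANG elements) into OUT.
static void
sum_rows (const int *arr, int *out)
{
  int res;

  #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
		       private(res) copyin(arr[0:GANGS * PER_GANG]) \
		       copyout(out[0:GANGS])
  {
    #pragma acc loop gang
    for (int j = 0; j < GANGS; j++)
      {
	// res is private, so it is uninitialized on entry to the region;
	// give it a value before the reduction accumulates into it.
	res = 0;

	#pragma acc loop worker vector reduction(+:res)
	for (int i = 0; i < PER_GANG; i++)
	  res += arr[j * PER_GANG + i];

	out[j] = res;
      }
  }
}
```

Dropping the `res = 0;` line reproduces the class of bug the patch fixes: the reduction would start from whatever value the private copy happens to hold.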

Cesar
2015-09-02  Cesar Philippidis  

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-2.c:
	Remove xfail.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-4.c:
	Remove xfail.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-vector-p-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-3.c:
	Likewise.  Initialize res because it's private.


diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-2.c
index 3e5c707..ea5c151 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-2.c
@@ -1,5 +1,3 @@
-/* { dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } { "*" } { "" } } */
-
 #include 
 
 /* Test of reduction on loop directive (gangs, workers and vectors, non-private
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-3.c
index 44d7f0f..0056f3c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-3.c
@@ -1,5 +1,3 @@
-/* { dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } { "*" } { "" } } */
-
 #include 
 
 /* Test of reduction on loop directive (gangs, workers and vectors, non-private
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-4.c
index 8bc18f7..e69d0ec 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-4.c
@@ -1,5 +1,3 @@
-/* { dg-xfail-run-if "TODO" { *-*-* } { "*" } { "" } } */
-
 #include 
 
 /* Test of reduction on loop directive (gangs, workers and vectors, multiple
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-vector-p-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-vector-p-2.c
index 63f3fef..15f0053 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-vector-p-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-vector-p-2.c
@@ -1,5 +1,3 @@
-/* { dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } { "*" } { "" } } */
-
 #include 
 
 /* Test of reduction on loop directive (vector reduction in
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-3.c
index ac96525..b5e28fb 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-3.c
@@ -1,5 +1,3 @@
-/* { dg-xfail-run-if "TODO" { *-*-* } { "*" } { "" } } */
-
 #include 
 
 /* Test of reduction on loop directive (workers and vectors, private reduction
@@ -16,6 +14,9 @@ main (int argc, char *argv[])
   #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
 		   private(res) copyin(arr) copyout(out)
   {
+/* Private variables aren't initialized by default in openacc.  */
+res = 0;
+
 /* "res" should be available at the end of the following loop (and should
have the same value redundantly in each gang).  */
 #pragma acc loop worker vector reduction(+:res)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-3.c
index 860e56d..9b26f9b 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-3.c
@@ -1,5 +1,3 @@
-/* { dg-xfail-run-if "TODO" { *-*-* } { "*" } { "" } } */
-
 #include 
 
 /* Test of reduction on both parallel and loop directives (workers and vectors
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c
index 41e0f71..38e63e3 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c
@@ -1,5 +1,3 @@
-/* {

Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting

2015-09-02 Thread Jeff Law


On 09/02/2015 07:36 AM, Jiong Wang wrote:


For the record, after Bin's recent tree-ssa-ivopt improvement, which originated
from PR62173, this patch is no longer beneficial.

It happens sometimes.



I have stopped working on this patch. Thanks for the time spent
reviewing and discussing it.
No problem, thanks for testing after Bin's changes and letting me know 
that this has become a non-issue.


Jeff

Re: [PATCH][RTL-ifcvt] Make non-conditional execution if-conversion more aggressive

2015-09-02 Thread Jeff Law


On 09/02/2015 09:18 AM, Zamyatin, Igor wrote:



On 19/08/15 17:57, Jeff Law wrote:

On 08/12/2015 08:31 AM, Kyrill Tkachov wrote:

2015-08-10  Kyrylo Tkachov 

   * ifcvt.c (struct noce_if_info): Add then_simple, else_simple,
   then_cost, else_cost fields.  Change branch_cost field to
unsigned int.
   (end_ifcvt_sequence): Call set_used_flags on each insn in the
   sequence.
   Include rtl-iter.h.
   (noce_simple_bbs): New function.
   (noce_try_move): Bail if basic blocks are not simple.
   (noce_try_store_flag): Likewise.
   (noce_try_store_flag_constants): Likewise.
   (noce_try_addcc): Likewise.
   (noce_try_store_flag_mask): Likewise.
   (noce_try_cmove): Likewise.
   (noce_try_minmax): Likewise.
   (noce_try_abs): Likewise.
   (noce_try_sign_mask): Likewise.
   (noce_try_bitop): Likewise.
   (bbs_ok_for_cmove_arith): New function.
   (noce_emit_all_but_last): Likewise.
   (noce_emit_insn): Likewise.
   (noce_emit_bb): Likewise.
   (noce_try_cmove_arith): Handle non-simple basic blocks.
   (insn_valid_noce_process_p): New function.
   (contains_mem_rtx_p): Likewise.
   (bb_valid_for_noce_process_p): Likewise.
   (noce_process_if_block): Allow non-simple basic blocks
   where appropriate.

2015-08-11  Kyrylo Tkachov 

   * gcc.dg/ifcvt-1.c: New test.
   * gcc.dg/ifcvt-2.c: Likewise.
   * gcc.dg/ifcvt-3.c: Likewise.


Looks like ifcvt-3.c fails on x86_64. I see

New failures:
FAIL: gcc.dg/ifcvt-3.c scan-rtl-dump ce1 "3 true changes made"

Could you please take a look?

Hmm, FWIW, these are passing in my builds/tests:
[law@localhost c-family]$ grep ifcvt-3 /tmp/gcc.sum
PASS: gcc.dg/ifcvt-3.c (test for excess errors)
PASS: gcc.dg/ifcvt-3.c scan-rtl-dump ce1 "3 true changes made"
PASS: gcc.dg/vect/vect-ifcvt-3.c (test for excess errors)
PASS: gcc.dg/vect/vect-ifcvt-3.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vectorized 1 loops" 1
PASS: gcc.dg/vect/vect-ifcvt-3.c -flto -ffat-lto-objects (test for 
excess errors)

PASS: gcc.dg/vect/vect-ifcvt-3.c -flto -ffat-lto-objects execution test
PASS: gcc.dg/vect/vect-ifcvt-3.c execution test
PASS: gcc.dg/vect/vect-ifcvt-3.c scan-tree-dump-times vect "vectorized 1 
loops" 1

Jeff

Re: [PATCH] PR 60586

2015-09-02 Thread Jeff Law


On 09/01/2015 10:30 PM, Iyer, Balaji V wrote:

Hi Jeff,
I thought about this for a minute and I don't think I need to use the 
lang_hooks. I could do this change right before calling gimplify_cilk_spawn. I 
have attached the fixed patch and have answered your questions below. Here are 
the ChangeLog entries:

gcc/c-family/ChangeLog:
2015-09-01  Balaji V. Iyer  

 PR middle-end/60586
 * c-common.h (cilk_gimplify_call_params_in_spawned_fn): New prototype.
 * c-gimplify.c (c_gimplify_expr): Added a call to the function
 cilk_gimplify_call_params_in_spawned_fn.
 * cilk.c (cilk_gimplify_call_params_in_spawned_fn): New function.
 (gimplify_cilk_spawn): Removed EXPR_STMT and CLEANUP_POINT_EXPR
 unwrapping.

gcc/cp/ChangeLog
2015-09-01  Balaji V. Iyer  

 PR middle-end/60586
 * cp-gimplify.c (cilk_cp_gimplify_call_params_in_spawned_fn): New
 function.
 (cp_gimplify_expr): Added a call to the function
 cilk_cp_gimplify_call_params_in_spawned_fn.

gcc/testsuite/ChangeLog
2015-09-01  Balaji V. Iyer  

 PR middle-end/60586
 * c-c++-common/cilk-plus/CK/pr60586.c: New file.
 * g++.dg/cilk-plus/CK/pr60586.cc: Likewise.

Is this OK for trunk?

Yes.   Please install.

Thanks,

Jeff

Re: RFC: Combine of compare & and oddity

2015-09-02 Thread Segher Boessenkool

On Wed, Sep 02, 2015 at 01:59:58PM -0600, Jeff Law wrote:
> >(set (reg:CC 66 cc)
> > (compare:CC (and:DI (lshiftrt:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 
> > 0)
> > (const_int 1 [0x1]))
> > (const_int 1 [0x1]))
> > (const_int 0 [0])))
> Yea, this is an alternative form.   I don't offhand remember how/why 
> this form appears, but it certainly does.  I don't think any ports 
> handle this form (but I certainly haven't done any checks), but I believe 
> combine creates it primarily for internal purposes.

Combine replaces zero_ext* with equivalent shift/and patterns and tries
again, if things don't match.  Targets with more generic masking insns
do not want to describe the very many cases that can be described with
zero_ext* separately.

rs6000 handles this exact pattern, btw.  And I'll be very happy if we can
just drop it :-)
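
As an illustration of the rewrite described above, the three spellings of the bit test can be modelled in plain C. The zero_extract helper below is a hypothetical model of the RTL operation, assuming little-endian bit numbering; it is not GCC code.

```c
#include <stdint.h>

/* Hypothetical C model of RTL (zero_extract X LEN POS), assuming bit 0
   is the least significant bit, 0 < len < 64 and pos + len <= 64.  */
static uint64_t
zero_extract (uint64_t x, unsigned len, unsigned pos)
{
  return (x >> pos) & ((UINT64_C (1) << len) - 1);
}

/* Three spellings of "bit 1 of X is clear".  */

/* The form written in the source: (x & 2) == 0.  */
int
test_and (uint32_t x)
{
  return (x & 2) == 0;
}

/* The shift/and form combine retries with: ((x >> 1) & 1) == 0.  */
int
test_shift (uint32_t x)
{
  return ((x >> 1) & 1) == 0;
}

/* The canonical zero_extract form: one bit wide, at position 1.  */
int
test_extract (uint32_t x)
{
  return zero_extract (x, 1, 1) == 0;
}
```

All three functions compute the same predicate, which is why a target only wants to describe one of them in its machine description.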

> >I don't understand how ((x >> 1) & 1) != 0 could be a useful expansion (it
> >even uses shifts by 0 at times which are unlikely to ever match anything).
> >Why does combine not try to match the obvious (x & C) != 0 case? Single-bit
> >and mask tests are very common, so this blocks efficient code generation on
> >many targets.
> From md.texi:
> 
> @cindex @code{zero_extract}, canonicalization of
> @cindex @code{sign_extract}, canonicalization of
> @item
> Equality comparisons of a group of bits (usually a single bit) with zero
> will be written using @code{zero_extract} rather than the equivalent
> @code{and} or @code{sign_extract} operations.

Oh it's even documented, thanks.  I do still think we should think of
changing this.


Segher

[C PATCH] Better diagnostic for empty enum (PR c/67432)

2015-09-02 Thread Marek Polacek

This PR asks for a better error wrt empty enums.  So I've handled
empty enum specially.
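
The decision the patch adds can be sketched outside the compiler as follows. The toy_* types are hypothetical stand-ins, far simpler than the real c_parser, and the sketch ignores the fact that c_parser_error itself stays quiet once an error has been recorded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the C parser's token kinds and state; hypothetical
   and much simpler than the real c_parser.  */
enum toy_token { TOY_NAME, TOY_CLOSE_BRACE, TOY_OTHER };

struct toy_parser
{
  enum toy_token next;   /* token found where an enumerator is expected */
  bool error;            /* an error has already been reported */
};

/* Return the diagnostic to give where an enumerator name is expected,
   or NULL if the next token is an identifier and parsing can go on.  */
const char *
toy_enum_diagnostic (struct toy_parser *parser)
{
  if (parser->next == TOY_NAME)
    return NULL;

  /* "enum {}": say what is actually wrong, once.  */
  if (parser->next == TOY_CLOSE_BRACE && !parser->error)
    {
      parser->error = true;
      return "empty enum is invalid";
    }

  /* Anything else keeps the generic message.  */
  return "expected identifier";
}
```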

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-09-02  Marek Polacek  

PR c/67432
* c-parser.c (c_parser_enum_specifier): Give a better error for
an empty enum.

* gcc.dg/pr67432.c: New test.

diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 1a58798..11a2b0f 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -2555,7 +2555,16 @@ c_parser_enum_specifier (c_parser *parser)
  location_t decl_loc, value_loc;
  if (c_parser_next_token_is_not (parser, CPP_NAME))
{
- c_parser_error (parser, "expected identifier");
+ /* Give a nicer error for "enum {}".  */
+ if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE)
+ && !parser->error)
+   {
+ error_at (c_parser_peek_token (parser)->location,
+   "empty enum is invalid");
+ parser->error = true;
+   }
+ else
+   c_parser_error (parser, "expected identifier");
  c_parser_skip_until_found (parser, CPP_CLOSE_BRACE, NULL);
  values = error_mark_node;
  break;
diff --git gcc/testsuite/gcc.dg/pr67432.c gcc/testsuite/gcc.dg/pr67432.c
index e69de29..74367a9 100644
--- gcc/testsuite/gcc.dg/pr67432.c
+++ gcc/testsuite/gcc.dg/pr67432.c
@@ -0,0 +1,6 @@
+/* PR c/67432 */
+/* { dg-do compile } */
+
+enum {}; /* { dg-error "empty enum is invalid" } */
+enum E {}; /* { dg-error "empty enum is invalid" } */
+enum F {} e; /* { dg-error "empty enum is invalid" } */

Marek

Reviving SH FDPIC target

2015-09-02 Thread Rich Felker

I've started work on reviving the FDPIC support patch for the SH
target, which was proposed upstream in 2010 then abandoned:

https://gcc.gnu.org/ml/gcc-patches/2010-08/msg01464.html

Right now I'm in the process of determining what parts can be applied
as-is to current gcc, and what parts need to be adapted or rewritten
to account for changes in gcc between the 4.5 era and now (5.2/6.0).

The original patch as posted contained a significant amount of changes
that were unrelated to FDPIC code generation but rather for adding
sh*-uclinux tuples, and some things that even look like they're
associated with the old bFLT format. I am omitting these parts for now
since I'm unfamiliar with the old uClinux stuff and it's unmaintained.
If anyone else wants to use it, I think it would make more sense
factored as a separate patch anyway.

The target I have in mind is SH-2/J2 with musl libc, but uClibc or
even glibc could be made to work with it. I will submit the patches
for musl support (basically, just hooking up the dynamic linker name
for fdpic) separately; I believe they'll need merging with the
already-pending musl support patch from Szabolcs Nagy.

One question I'd like to ask now in case it's a problem that takes a
while to work out -- is copyright assignment already handled for the
old patch? The contributors listed in it are (all codesourcery):

- Daniel Jacobowitz
- Joseph Myers
- Mark Shinwell
- Andrew Stubbs

Also, according to Joseph Myers, there was some unresolved
disagreement that stalled (and eventually sunk) the old patch, so if
anyone's still around who has objections to it, could you speak up and
let me know what's wrong? Kaz Kojima seems to have approved the patch
at the time so I'm confused what the issue was/is.

Rich

Re: RFC: Combine of compare & and oddity

2015-09-02 Thread Jeff Law


On 09/02/2015 11:09 AM, Wilco Dijkstra wrote:

Hi,

Combine canonicalizes certain AND masks in a comparison with zero into extracts
of the widest register type. During matching these are expanded into a very
inefficient sequence that fails to match. For example (x & 2) == 0 is matched
in combine like this:

Failed to match this instruction:
(set (reg:CC 66 cc)
 (compare:CC (zero_extract:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 0)
 (const_int 1 [0x1])
 (const_int 1 [0x1]))
 (const_int 0 [0])))
That's a fairly standard looking bit test.  One could argue about the 
mode of the extraction being problematical because it introduces the 
SUBREG, but overall it's a normal looking bit test.  Various ports know 
about this form.




Failed to match this instruction:
(set (reg:CC 66 cc)
 (compare:CC (and:DI (lshiftrt:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 0)
 (const_int 1 [0x1]))
 (const_int 1 [0x1]))
 (const_int 0 [0])))
Yea, this is an alternative form.   I don't offhand remember how/why 
this form appears, but it certainly does.  I don't think any ports 
handle this form (but I certainly haven't done any checks), but I believe 
combine creates it primarily for internal purposes.




Neither matches the AArch64 patterns for ANDS/TST (which is just compare and
AND). If the immediate is not a power of 2 or a power of 2 -1 then it matches
correctly as expected.

It might be advisable to recognize the first form.
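
To make the affected set of immediates concrete, here is a small sketch of a predicate for the masks the report says get rewritten (a power of 2, or a power of 2 minus 1). The function names are made up for illustration; combine's real decision logic is more involved and is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if C has exactly one bit set, i.e. C is a power of 2.  */
bool
is_pow2 (uint64_t c)
{
  return c != 0 && (c & (c - 1)) == 0;
}

/* True if C is 2^n - 1 for some n >= 1, i.e. a run of low-order ones.  */
bool
is_pow2_minus_1 (uint64_t c)
{
  return c != 0 && (c & (c + 1)) == 0;
}

/* Hypothetical predicate: masks for which, per the report above, the
   compare-with-zero was seen rewritten into a zero_extract.  */
bool
mask_becomes_extract (uint64_t c)
{
  return is_pow2 (c) || is_pow2_minus_1 (c);
}
```

For example, masks 2, 3 and 7 fall into the rewritten set, while 5 and 6 do not and so would reach the ANDS/TST patterns unchanged.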



I don't understand how ((x >> 1) & 1) != 0 could be a useful expansion (it even
uses shifts by 0 at times which are unlikely to ever match anything). Why does
combine not try to match the obvious (x & C) != 0 case? Single-bit and mask
tests are very common, so this blocks efficient code generation on many targets.

From md.texi:

@cindex @code{zero_extract}, canonicalization of
@cindex @code{sign_extract}, canonicalization of
@item
Equality comparisons of a group of bits (usually a single bit) with zero
will be written using @code{zero_extract} rather than the equivalent
@code{and} or @code{sign_extract} operations.

Jeff

Re: Reviving SH FDPIC target

2015-09-02 Thread Joseph Myers

On Wed, 2 Sep 2015, Rich Felker wrote:

> Also, according to Joseph Myers, there was some unresolved
> disagreement that stalled (and eventually sunk) the old patch, so if
> anyone's still around who has objections to it, could you speak up and
> let me know what's wrong? Kaz Kojima seems to have approved the patch
> at the time so I'm confused what the issue was/is.

It's patch 1/3 (architecture-independent) that had the disagreement (and 
patch 3/3 depends on patch 1/3).

https://gcc.gnu.org/ml/gcc-patches/2010-08/msg01462.html

-- 
Joseph S. Myers
jos...@codesourcery.com

Go patch committed: mark erroneous constants as invalid

2015-09-02 Thread Ian Lance Taylor

This patch by Chris Manghane avoids a compiler crash by marking
erroneous constants as invalid and turning them into error expressions
when seen.  This fixes https://golang.org/issue/11541 .  Bootstrapped
and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to
mainline.
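
The idea can be sketched in C as follows. This is a deliberately simplified model with hypothetical toy_* names; it does not mirror the exact control flow of the Go frontend, only the principle that a failed evaluation which already reported an error becomes an error expression rather than being left for later passes.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified model of a numeric constant.  */
struct toy_constant
{
  long value;
  bool invalid;   /* set once an error has been reported for it */
};

/* Possible outcomes of lowering a constant binary expression.  */
enum toy_expr_kind
{
  TOY_EXPR_CONSTANT,    /* folded to a constant */
  TOY_EXPR_ERROR,       /* replaced by an error expression */
  TOY_EXPR_UNCHANGED    /* could not fold, no error: keep as is */
};

/* Evaluate LEFT / RIGHT into NC.  Returns false on failure and marks
   NC invalid when the failure is a reported error.  */
static bool
toy_eval_div (long left, long right, struct toy_constant *nc)
{
  if (right == 0)
    {
      /* The real code reports "division by zero" here.  */
      nc->invalid = true;
      nc->value = 0;
      return false;
    }
  nc->value = left / right;
  return true;
}

/* Lower LEFT / RIGHT, storing the folded value in *RESULT on success.  */
enum toy_expr_kind
toy_lower_div (long left, long right, long *result)
{
  struct toy_constant nc = { 0, false };

  if (!toy_eval_div (left, right, &nc))
    return nc.invalid ? TOY_EXPR_ERROR : TOY_EXPR_UNCHANGED;

  *result = nc.value;
  return TOY_EXPR_CONSTANT;
}
```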

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 227395)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-a63e173b20baa1a48470dd31a1fb1f2704b37011
+3f8feb4f905535448833a14e4f5c83f682087749
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 227395)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -4567,6 +4567,7 @@ Binary_expression::eval_integer(Operator
   if (mpz_sizeinbase(val, 2) > 0x10)
{
  error_at(location, "constant addition overflow");
+  nc->set_invalid();
  mpz_set_ui(val, 1);
}
   break;
@@ -4575,6 +4576,7 @@ Binary_expression::eval_integer(Operator
   if (mpz_sizeinbase(val, 2) > 0x10)
{
  error_at(location, "constant subtraction overflow");
+  nc->set_invalid();
  mpz_set_ui(val, 1);
}
   break;
@@ -4589,6 +4591,7 @@ Binary_expression::eval_integer(Operator
   if (mpz_sizeinbase(val, 2) > 0x10)
{
  error_at(location, "constant multiplication overflow");
+  nc->set_invalid();
  mpz_set_ui(val, 1);
}
   break;
@@ -4598,6 +4601,7 @@ Binary_expression::eval_integer(Operator
   else
{
  error_at(location, "division by zero");
+  nc->set_invalid();
  mpz_set_ui(val, 0);
}
   break;
@@ -4607,6 +4611,7 @@ Binary_expression::eval_integer(Operator
   else
{
  error_at(location, "division by zero");
+  nc->set_invalid();
  mpz_set_ui(val, 0);
}
   break;
@@ -4618,6 +4623,7 @@ Binary_expression::eval_integer(Operator
else
  {
error_at(location, "shift count overflow");
+nc->set_invalid();
mpz_set_ui(val, 1);
  }
break;
@@ -4629,6 +4635,7 @@ Binary_expression::eval_integer(Operator
if (mpz_cmp_ui(right_val, shift) != 0)
  {
error_at(location, "shift count overflow");
+nc->set_invalid();
mpz_set_ui(val, 1);
  }
else
@@ -4723,6 +4730,7 @@ Binary_expression::eval_float(Operator o
   else
{
  error_at(location, "division by zero");
+  nc->set_invalid();
  mpfr_set_ui(val, 0, GMP_RNDN);
}
   break;
@@ -4787,6 +4795,7 @@ Binary_expression::eval_complex(Operator
   if (mpc_cmp_si(right_val, 0) == 0)
{
  error_at(location, "division by zero");
+  nc->set_invalid();
  mpc_set_ui(val, 0, MPC_RNDNN);
  break;
}
@@ -4849,7 +4858,14 @@ Binary_expression::do_lower(Gogo* gogo,
Numeric_constant nc;
if (!Binary_expression::eval_constant(op, &left_nc, &right_nc,
  location, &nc))
- return this;
+  {
+if (nc.is_invalid())
+  {
+go_assert(saw_errors());
+return Expression::make_error(location);
+  }
+return this;
+  }
return nc.expression(location);
  }
   }
@@ -15189,7 +15205,7 @@ Numeric_constant::set_type(Type* type, b
 
 bool
 Numeric_constant::check_int_type(Integer_type* type, bool issue_error,
-Location location) const
+Location location)
 {
   mpz_t val;
   switch (this->classification_)
@@ -15203,7 +15219,11 @@ Numeric_constant::check_int_type(Integer
   if (!mpfr_integer_p(this->u_.float_val))
{
  if (issue_error)
-   error_at(location, "floating point constant truncated to integer");
+{
+  error_at(location,
+   "floating point constant truncated to integer");
+  this->set_invalid();
+}
  return false;
}
   mpz_init(val);
@@ -15215,7 +15235,10 @@ Numeric_constant::check_int_type(Integer
  || !mpfr_zero_p(mpc_imagref(this->u_.complex_val)))
{
  if (issue_error)
-   error_at(location, "complex constant truncated to integer");
+{
+  error_at(location, "complex constant truncated to integer");
+  this->set_invalid();
+}
  return false;
}
   mpz_init(val);
@@ -15253,7 +15276,10 @@ Numeric_constant::check_int_type(Integer
 }
 
   if

Re: Remove redundant test for global_regs

2015-09-02 Thread Georg-Johann Lay


Anatoliy Sokolov schrieb:

Hi.

The fixed_reg_set contains all fixed and global registers. This patch 
replaces the code "fixed_regs[r] || global_regs[r]" with "TEST_HARD_REG_BIT 
(fixed_reg_set, r)".


Even though this is technically a no-op change,

  TEST_HARD_REG_BIT (fixed_reg_set, r)

appears to be a test for r being a fixed register and not for being a 
fixed-or-global register, so the old style


  fixed_regs[r] || global_regs[r]

seems to be less error prone.
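
To make the two styles concrete, here is a small self-contained model. The toy_* names, the register count and the register assignments are hypothetical; the sketch only assumes, as the patch description states, that the set is built to contain every fixed or global register.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_REGS 8   /* hypothetical register count */

/* Per-register flags, as consulted by the old-style test.  */
static const bool toy_fixed_regs[TOY_NUM_REGS]  = { 1, 0, 0, 0, 0, 0, 0, 1 };
static const bool toy_global_regs[TOY_NUM_REGS] = { 0, 0, 1, 0, 0, 0, 0, 0 };

/* Build a bit set containing every fixed or global register.  */
uint32_t
toy_build_fixed_reg_set (void)
{
  uint32_t set = 0;
  for (unsigned r = 0; r < TOY_NUM_REGS; r++)
    if (toy_fixed_regs[r] || toy_global_regs[r])
      set |= UINT32_C (1) << r;
  return set;
}

/* New style: a single bit test against the combined set.  */
bool
toy_test_bit (uint32_t set, unsigned r)
{
  return (set >> r) & 1;
}

/* Old style: spells out both conditions at the point of use.  */
bool
toy_old_style (unsigned r)
{
  return toy_fixed_regs[r] || toy_global_regs[r];
}
```

Both tests agree for every register in this model; the readability concern is that only the old style says "or global" where the reader can see it.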

Johann


Bootstrapped and reg-tested on x86_64-unknown-linux-gnu and
powerpc64le-unknown-linux-gnu.

OK for trunk?

2015-08-24  Anatoly Sokolov  

* cse.c (FIXED_REGNO_P): Don't check global_regs. Use fixed_reg_set
instead of fixed_regs.
* df-problems.c (can_move_insns_across): Ditto.
* postreload.c (reload_combine_recognize_pattern): Ditto.
* recog.c (peep2_find_free_register): Ditto.
* regcprop.c (copy_value): Ditto.
* regrename.c (check_new_reg_p, rename_chains): Ditto.
* sel-sched.c (init_regs_for_mode, mark_unavailable_hard_regs): Ditto.

Index: gcc/cse.c
===
--- gcc/cse.c(revision 226953)
+++ gcc/cse.c(working copy)
@@ -463,7 +463,7 @@
A reg wins if it is either the frame pointer or designated as 
fixed.  */

 #define FIXED_REGNO_P(N)  \
   ((N) == FRAME_POINTER_REGNUM || (N) == HARD_FRAME_POINTER_REGNUM \
-   || fixed_regs[N] || global_regs[N])
+   || TEST_HARD_REG_BIT (fixed_reg_set, N))

 /* Compute cost of X, as stored in the `cost' field of a table_elt.  Fixed
hard registers and pointers into the frame are the cheapest with a cost
Index: gcc/df-problems.c
===
--- gcc/df-problems.c(revision 226953)
+++ gcc/df-problems.c(working copy)
@@ -3871,8 +3871,7 @@
   EXECUTE_IF_SET_IN_BITMAP (merge_set, 0, i, bi)
 {
   if (i < FIRST_PSEUDO_REGISTER
-  && ! fixed_regs[i]
-  && ! global_regs[i])
+  && ! TEST_HARD_REG_BIT (fixed_reg_set, i))
 {
   fail = 1;
   break;
Index: gcc/postreload.c
===
--- gcc/postreload.c(revision 226953)
+++ gcc/postreload.c(working copy)
@@ -1144,7 +1144,7 @@
   && reg_state[i].store_ruid <= reg_state[regno].use_ruid
   && (call_used_regs[i] || df_regs_ever_live_p (i))
   && (!frame_pointer_needed || i != HARD_FRAME_POINTER_REGNUM)
-  && !fixed_regs[i] && !global_regs[i]
+  && !TEST_HARD_REG_BIT (fixed_reg_set, i)
   && hard_regno_nregs[i][GET_MODE (reg)] == 1
   && targetm.hard_regno_scratch_ok (i))
 {
Index: gcc/recog.c
===
--- gcc/recog.c(revision 226953)
+++ gcc/recog.c(working copy)
@@ -3162,17 +3162,11 @@
   for (j = 0; success && j < hard_regno_nregs[regno][mode]; j++)
 {
   /* Don't allocate fixed registers.  */
-  if (fixed_regs[regno + j])
+  if (TEST_HARD_REG_BIT (fixed_reg_set, regno + j))
 {
   success = 0;
   break;
 }
-  /* Don't allocate global registers.  */
-  if (global_regs[regno + j])
-{
-  success = 0;
-  break;
-}
   /* Make sure the register is of the right class.  */
   if (! TEST_HARD_REG_BIT (reg_class_contents[cl], regno + j))
 {
Index: gcc/regcprop.c
===
--- gcc/regcprop.c(revision 226953)
+++ gcc/regcprop.c(working copy)
@@ -315,7 +315,7 @@
   /* Do not propagate copies to fixed or global registers, patterns
  can be relying to see particular fixed register or users can
  expect the chosen global register in asm.  */
-  if (fixed_regs[dr] || global_regs[dr])
+  if (TEST_HARD_REG_BIT (fixed_reg_set, dr))
 return;

   /* If SRC and DEST overlap, don't record anything.  */
Index: gcc/regrename.c
===
--- gcc/regrename.c(revision 226953)
+++ gcc/regrename.c(working copy)
@@ -311,8 +311,7 @@

   for (i = nregs - 1; i >= 0; --i)
 if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
-|| fixed_regs[new_reg + i]
-|| global_regs[new_reg + i]
+|| TEST_HARD_REG_BIT (fixed_reg_set, new_reg + i)
 /* Can't use regs which aren't saved by the prologue.  */
 || (! df_regs_ever_live_p (new_reg + i)
 && ! call_used_regs[new_reg + i])
@@ -440,7 +439,7 @@
   if (this_head->cannot_rename)
 continue;

-  if (fixed_regs[reg] || global_regs[reg]
+  if (TEST_HARD_REG_BIT (fixed_reg_set, reg)
   || (!HARD_FRAME_POINTER_IS_FRAME_POINTER && frame_pointer_needed
   && reg == HARD_FRAME_POINTER_REGNUM)
   || (HARD_FRAME_POINTER_REGNUM && frame_pointer_needed
Index: gcc/sel-sched.c
===
---

Re: [PATCH] fix PR53852: stop ISL after a given number of operations

2015-09-02 Thread Jeff Law


On 09/02/2015 04:52 PM, Tobias Grosser wrote:

On 09/03/2015 12:34 AM, Sebastian Pop wrote:

2015-09-02  Sebastian Pop  

 * config.in: Regenerate.
 * configure: Regenerate.
 * configure.ac (HAVE_ISL_CTX_MAX_OPERATIONS): Detect.
 * graphite-optimize-isl.c (optimize_isl): Stop computation when
 PARAM_MAX_ISL_OPERATIONS is reached.
 * params.def (PARAM_MAX_ISL_OPERATIONS): Add.

 * graphite-dependences.c (extend_schedule): Remove gcc_asserts on
 result equal to isl_stat_ok as the status now can be isl_error_quota.
 (subtract_commutative_associative_deps): Same.
 (compute_deps): Same.

testsuite/
 * gcc.dg/graphite/uns-interchange-12.c: Adjust pattern to pass with
 both isl-0.12 and isl-0.15.
 * gcc.dg/graphite/uns-interchange-14.c: Same.
 * gcc.dg/graphite/uns-interchange-15.c: Same.
 * gcc.dg/graphite/uns-interchange-mvt.c: Same.


Hi Sebastian,

this looks good to me.

Thanks for taking care of this Sebastian!

jeff

Re: Reviving SH FDPIC target

2015-09-02 Thread Rich Felker

On Wed, Sep 02, 2015 at 07:59:45PM +, Joseph Myers wrote:
> On Wed, 2 Sep 2015, Rich Felker wrote:
> 
> > Also, according to Joseph Myers, there was some unresolved
> > disagreement that stalled (and eventually sunk) the old patch, so if
> > anyone's still around who has objections to it, could you speak up and
> > let me know what's wrong? Kaz Kojima seems to have approved the patch
> > at the time so I'm confused what the issue was/is.
> 
> It's patch 1/3 (architecture-independent) that had the disagreement (and 
> patch 3/3 depends on patch 1/3).
> 
> https://gcc.gnu.org/ml/gcc-patches/2010-08/msg01462.html

So this is only for __fpscr_values? In that case I think the right
solution is just to follow up with getting rid of __fpscr_values, if
it's not already done:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60138
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53513

53513 is marked fixed, but I didn't follow up to confirm that the
actual problems I reported in 60138 are fixed; I'll do some more
research on this. But if all goes well, we can just drop 1/3.

Thanks for the quick reply!

Rich

Re: [PATCH] Fix ICE when generating a vector shift by scalar

2015-09-02 Thread Bill Schmidt


On Wed, 2015-09-02 at 14:44 +0200, Richard Biener wrote:
> On Tue, Sep 1, 2015 at 5:53 PM, Bill Schmidt
>  wrote:
> > On Tue, 2015-09-01 at 11:01 +0200, Richard Biener wrote:
> >> On Mon, Aug 31, 2015 at 10:28 PM, Bill Schmidt
> >>  wrote:
> >> > Hi,
> >> >
> >> > The following simple test fails when attempting to convert a vector
> >> > shift-by-scalar into a vector shift-by-vector.
> >> >
> >> >   typedef unsigned char v16ui __attribute__((vector_size(16)));
> >> >
> >> >   v16ui vslb(v16ui v, unsigned char i)
> >> >   {
> >> > return v << i;
> >> >   }
> >> >
> >> > When this code is gimplified, the shift amount gets expanded to an
> >> > unsigned int:
> >> >
> >> >   vslb (v16ui v, unsigned char i)
> >> >   {
> >> > v16ui D.2300;
> >> > unsigned int D.2301;
> >> >
> >> > D.2301 = (unsigned int) i;
> >> > D.2300 = v << D.2301;
> >> > return D.2300;
> >> >   }
> >> >
> >> > In expand_binop, the shift-by-scalar is converted into a shift-by-vector
> >> > using expand_vector_broadcast, which produces the following rtx to be
> >> > used to initialize a V16QI vector:
> >> >
> >> > (parallel:V16QI [
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > (subreg/s/v:SI (reg:DI 155) 0)
> >> > ])
> >> >
> >> > The back end eventually chokes trying to generate a copy of the SImode
> >> > expression into a QImode memory slot.
> >> >
> >> > This patch fixes this problem by ensuring that the shift amount is
> >> > truncated to the inner mode of the vector when necessary.  I've added a
> >> > test case verifying correct PowerPC code generation in this case.
> >> >
> >> > Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> >> > regressions.  Is this ok for trunk?
> >> >
> >> > Thanks,
> >> > Bill
> >> >
> >> >
> >> > [gcc]
> >> >
> >> > 2015-08-31  Bill Schmidt  
> >> >
> >> > * optabs.c (expand_binop): Don't create a broadcast vector with a
> >> > source element wider than the inner mode.
> >> >
> >> > [gcc/testsuite]
> >> >
> >> > 2015-08-31  Bill Schmidt  
> >> >
> >> > * gcc.target/powerpc/vec-shift.c: New test.
> >> >
> >> >
> >> > Index: gcc/optabs.c
> >> > ===
> >> > --- gcc/optabs.c(revision 227353)
> >> > +++ gcc/optabs.c(working copy)
> >> > @@ -1608,6 +1608,13 @@ expand_binop (machine_mode mode, optab binoptab, r
> >> >
> >> >if (otheroptab && optab_handler (otheroptab, mode) != 
> >> > CODE_FOR_nothing)
> >> > {
> >> > + /* The scalar may have been extended to be too wide.  Truncate
> >> > +it back to the proper size to fit in the broadcast vector.  
> >> > */
> >> > + machine_mode inner_mode = GET_MODE_INNER (mode);
> >> > + if (GET_MODE_BITSIZE (inner_mode)
> >> > + < GET_MODE_BITSIZE (GET_MODE (op1)))
> >>
> >> Does that work for modeless constants?  Btw, what do other targets do
> >> here?  Do they
> >> also choke or do they cope with the wide operand?
> >
> > Good question.  This works by serendipity more than by design.  Because
> > a constant has a mode of VOIDmode, its bitsize is 0 and the TRUNCATE
> > won't be generated.  It would be better for me to put in an explicit
> > check for CONST_INT rather than relying on this, though.  I'll fix that.
> >
> > I am not sure what other targets do here; I can check.  However, do you
> > think that's relevant?  I'm concerned that
> >
> > (parallel:V16QI [
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > (subreg/s/v:SI (reg:DI 155) 0)
> > ])
> >
> > is a nonsensical expression and shouldn't be

Re: [PATCH] fix PR53852: stop ISL after a given number of operations

2015-09-02 Thread Tobias Grosser


On 09/03/2015 12:34 AM, Sebastian Pop wrote:

2015-09-02  Sebastian Pop  

 * config.in: Regenerate.
 * configure: Regenerate.
 * configure.ac (HAVE_ISL_CTX_MAX_OPERATIONS): Detect.
 * graphite-optimize-isl.c (optimize_isl): Stop computation when
 PARAM_MAX_ISL_OPERATIONS is reached.
 * params.def (PARAM_MAX_ISL_OPERATIONS): Add.

 * graphite-dependences.c (extend_schedule): Remove gcc_asserts on
 result equal to isl_stat_ok as the status now can be isl_error_quota.
 (subtract_commutative_associative_deps): Same.
 (compute_deps): Same.

testsuite/
 * gcc.dg/graphite/uns-interchange-12.c: Adjust pattern to pass with
 both isl-0.12 and isl-0.15.
 * gcc.dg/graphite/uns-interchange-14.c: Same.
 * gcc.dg/graphite/uns-interchange-15.c: Same.
 * gcc.dg/graphite/uns-interchange-mvt.c: Same.
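
The general technique in the ChangeLog above, bounding a computation by an operation count and treating "quota reached" as a normal outcome rather than an assertion failure, can be sketched as follows. The toy_* names are hypothetical and this is not the ISL API; it only models the control flow.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a solver context with an operation quota.
   A max_ops of 0 means "no limit".  */
struct toy_ctx
{
  unsigned long ops;
  unsigned long max_ops;
  bool quota_exceeded;
};

/* Account for one operation; return false once the quota is used up.  */
static bool
toy_count_op (struct toy_ctx *ctx)
{
  if (ctx->max_ops != 0 && ctx->ops >= ctx->max_ops)
    {
      ctx->quota_exceeded = true;
      return false;
    }
  ctx->ops++;
  return true;
}

/* Sum 0..N-1, charging one operation per step.  Returns false if the
   quota was hit, in which case *RESULT is left untouched and the caller
   is expected to fall back to the unoptimized result.  */
bool
toy_optimize (struct toy_ctx *ctx, unsigned long n, unsigned long *result)
{
  unsigned long sum = 0;

  for (unsigned long i = 0; i < n; i++)
    {
      if (!toy_count_op (ctx))
        return false;
      sum += i;
    }

  *result = sum;
  return true;
}
```

The point for callers is that a false return is an expected status to be handled, which is why the asserts on a successful status had to go.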


Hi Sebastian,

this looks good to me.

Tobias

Re: [PING] Re: [PATCH] c/66516 - missing diagnostic on taking the address of a builtin function

2015-09-02 Thread Martin Sebor


On 09/02/2015 09:29 AM, Jason Merrill wrote:

On 09/01/2015 06:25 PM, Martin Sebor wrote:

Having now made this change, I don't think the added complexity
of three declarations and two trivial definitions of the new
c_decl_implicit function across five files is an improvement,


Three declarations?  Isn't declaring it in c-common.h enough?

...


Seems like you can do without the check for C++ if you're defining this
function for C++ (to just return false).


Yes on both counts. Thanks.



I agree with Joseph that the function is better.


+ bool diag /* = true */)


Let's call these parameters "reject_builtin" rather than the generic
"diag".


+function = decay_conversion (function, complain, false);


Please add a comment to "false", e.g. /*reject_builtin*/false


Done.

The latest patch (attached) implements all of the requested changes
plus a few tweaks to the tests to make them more strict.

I also noticed and fixed a gotcha in the dg-error directives
that might be interesting to mention: in prior patches tests
were all: /* dg-error "builtin" */
Unfortunately, since both the name of the test file and the
test function have the string "builtin" in them, the directives
would pass even if the error message wasn't the expected:

  builtin function ‘__builtin_trap’ must be directly called

This problem was hiding a few invalid C++ tests. I've fixed
the directives to avoid the problem.
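
To make the gotcha concrete, here is a minimal sketch in plain C.  It
assumes a simplified model in which a directive passes when its pattern
occurs anywhere in a diagnostic line; real DejaGnu matching uses regular
expressions, and the function name and both sample lines below are made
up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in for a dg-error check: true when PATTERN occurs
   anywhere in the diagnostic LINE.  Substring search is an assumption
   made to keep the sketch short.  */
bool
directive_matches (const char *line, const char *pattern)
{
  return strstr (line, pattern) != NULL;
}

/* Hypothetical compiler output.  The first line is NOT the diagnostic
   the test is looking for, yet it contains "builtin" via the file name.  */
const char *const sample_unexpected =
  "addr_builtin-1.c:12:3: error: invalid conversion";
const char *const sample_expected =
  "addr_builtin-1.c:12:3: error: builtin function '__builtin_trap' "
  "must be directly called";
```

With these samples, the loose pattern "builtin" matches the unexpected
line as well, while a pattern such as "must be directly called" matches
only the intended diagnostic.  A more specific pattern is one way to
avoid the problem, not necessarily the exact change made in the patch.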

I've tested it on x86_64 this time with no regressions.

Martin

gcc/ChangeLog
2015-09-02  Martin Sebor  

	PR c/66516
	* doc/extend.texi (Other Builtins): Document when the address
	of a builtin function can be taken.

gcc/c-family/ChangeLog
2015-09-02  Martin Sebor  

	PR c/66516
	* c-common.h (c_decl_implicit, reject_gcc_builtin): Declare new
	functions.
	* c-common.c (reject_gcc_builtin): Define.

gcc/c/ChangeLog
2015-09-02  Martin Sebor  

	PR c/66516
	* c/c-typeck.c (convert_arguments, parser_build_unary_op)
	(build_conditional_expr, c_cast_expr, convert_for_assignment)
	(build_binary_op, _objc_common_truthvalue_conversion): Call
	reject_gcc_builtin.
	(c_decl_implicit): Define.

gcc/cp/ChangeLog
2015-09-02  Martin Sebor  

	PR c/66516
	* cp/cp-tree.h (mark_rvalue_use, decay_conversion): Add new
	argument(s).
	* cp/expr.c (mark_rvalue_use): Use new argument.
	* cp/call.c (build_addr_func): Call decay_conversion with new
	argument.
	* cp/pt.c (convert_template_argument): Call reject_gcc_builtin.
	* cp/typeck.c (decay_conversion): Use new argument.
	(c_decl_implicit): Define.

gcc/testsuite/ChangeLog
2015-09-02  Martin Sebor  

	PR c/66516
	* g++.dg/addr_builtin-1.C: New test.
	* gcc.dg/addr_builtin-1.c: New test.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 7691035..4cc6c3e 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -12882,4 +12882,41 @@ pointer_to_zero_sized_aggr_p (tree t)
   return (TYPE_SIZE (t) && integer_zerop (TYPE_SIZE (t)));
 }

+/* For an EXPR of a FUNCTION_TYPE that references a GCC built-in function
+   with no library fallback or for an ADDR_EXPR whose operand is such type
+   issues an error pointing to the location LOC.
+   Returns true when the expression has been diagnosed and false
+   otherwise.  */
+bool
+reject_gcc_builtin (const_tree expr, location_t loc /* = UNKNOWN_LOCATION */)
+{
+  if (TREE_CODE (expr) == ADDR_EXPR)
+expr = TREE_OPERAND (expr, 0);
+
+  if (TREE_TYPE (expr)
+  && TREE_CODE (TREE_TYPE (expr)) == FUNCTION_TYPE
+  && DECL_P (expr)
+  /* The intersection of DECL_BUILT_IN and DECL_IS_BUILTIN avoids
+	 false positives for user-declared built-ins such as abs or
+	 strlen, and for C++ operators new and delete.
+	 The c_decl_implicit() test avoids false positives for implicitly
+	 declared builtins with library fallbacks (such as abs).  */
+  && DECL_BUILT_IN (expr)
+  && DECL_IS_BUILTIN (expr)
+  && !c_decl_implicit (expr)
+  && !DECL_ASSEMBLER_NAME_SET_P (expr))
+{
+  if (loc == UNKNOWN_LOCATION)
+	loc = EXPR_LOC_OR_LOC (expr, input_location);
+
+  /* Reject arguments that are builtin functions with
+	 no library fallback.  */
+  error_at (loc, "builtin function %qE must be directly called", expr);
+
+  return true;
+}
+
+  return false;
+}
+
 #include "gt-c-family-c-common.h"
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index be63cd2..0d589b5 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -572,6 +572,7 @@ extern int field_decl_cmp (const void *, const void *);
 extern void resort_sorted_fields (void *, void *, gt_pointer_operator,
   void *);
 extern bool has_c_linkage (const_tree decl);
+extern bool c_decl_implicit (const_tree);
 
 /* Switches common to the C front ends.  */

@@ -1437,5 +1438,6 @@ extern bool contains_cilk_spawn_stmt (tree);
 extern tree cilk_for_number_of_iterations (tree);
 extern bool check_no_cilk (tree, const

Re: [PR64164] drop copyrename, integrate into expand

2015-09-02 Thread Alexandre Oliva

On Sep  2, 2015, Alan Lawrence  wrote:

> One more failure to report, I'm afraid. On AArch64 Bigendian,
> aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from
> r227348):

Thanks.  The failure mode was different in the current, revamped git
branch aoliva/pr64164, but I've just fixed it there.

I'm almost ready to post a new patch, with a new, simpler, less fragile
and more maintainable approach to integrate cfgexpand and assign_parms'
RTL assignment, so if you could give it a spin on big and little endian
aarch64 natives, that would be very much appreciated!

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

[PATCH] fix PR53852: stop ISL after a given number of operations

2015-09-02 Thread Sebastian Pop

2015-09-02  Sebastian Pop  

* config.in: Regenerate.
* configure: Regenerate.
* configure.ac (HAVE_ISL_CTX_MAX_OPERATIONS): Detect.
* graphite-optimize-isl.c (optimize_isl): Stop computation when
PARAM_MAX_ISL_OPERATIONS is reached.
* params.def (PARAM_MAX_ISL_OPERATIONS): Add.

* graphite-dependences.c (extend_schedule): Remove gcc_asserts on
result equal to isl_stat_ok as the status now can be 
isl_error_quota.
(subtract_commutative_associative_deps): Same.
(compute_deps): Same.

testsuite/
* gcc.dg/graphite/uns-interchange-12.c: Adjust pattern to pass with
both isl-0.12 and isl-0.15.
* gcc.dg/graphite/uns-interchange-14.c: Same.
* gcc.dg/graphite/uns-interchange-15.c: Same.
* gcc.dg/graphite/uns-interchange-mvt.c: Same.
---
 gcc/config.in  |  6 ++
 gcc/configure  | 28 
 gcc/configure.ac   | 11 +++
 gcc/graphite-dependences.c | 83 +-
 gcc/graphite-optimize-isl.c| 49 -
 gcc/params.def |  5 ++
 gcc/testsuite/gcc.dg/graphite/uns-interchange-12.c |  2 +-
 gcc/testsuite/gcc.dg/graphite/uns-interchange-14.c |  2 +-
 gcc/testsuite/gcc.dg/graphite/uns-interchange-15.c |  2 +-
 .../gcc.dg/graphite/uns-interchange-mvt.c  |  2 +-
 10 files changed, 120 insertions(+), 70 deletions(-)
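
The idea behind the ChangeLog entries above (stop after a budget of
operations, and let callers treat "quota exceeded" as a normal outcome
instead of asserting success) can be sketched in plain C.  This is not
the isl API; every name below is hypothetical, and the convention that a
limit of 0 means "unlimited" is an assumption of this sketch.

```c
/* Hypothetical status codes, loosely playing the role that an ok
   status and a quota error play in the patch.  */
enum op_status { OP_OK, OP_QUOTA_EXCEEDED };

/* Hypothetical operation budget; max_operations == 0 means no limit
   in this sketch.  */
struct op_budget
{
  unsigned long max_operations;
  unsigned long used;
};

/* Account for one operation, or report that the budget is spent.  */
static enum op_status
charge_one (struct op_budget *budget)
{
  if (budget->max_operations != 0
      && budget->used >= budget->max_operations)
    return OP_QUOTA_EXCEEDED;
  budget->used++;
  return OP_OK;
}

/* Sum 1..N, giving up once the budget is exhausted.  *RESULT is only
   written on success, so a caller must check the status rather than
   assert that it is OP_OK.  */
enum op_status
bounded_sum (struct op_budget *budget, unsigned n, unsigned long *result)
{
  unsigned long sum = 0;
  for (unsigned i = 1; i <= n; i++)
    {
      if (charge_one (budget) != OP_OK)
        return OP_QUOTA_EXCEEDED;
      sum += i;
    }
  *result = sum;
  return OP_OK;
}
```

A caller that gets OP_QUOTA_EXCEEDED simply keeps whatever it had before
the computation started, which is the behavior one would want from an
optimizer that is allowed to give up.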

diff --git a/gcc/config.in b/gcc/config.in
index 22a4e6b..98c4647 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1332,6 +1332,12 @@
 #endif
 
 
+/* Define if isl_ctx_get_max_operations exists. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_ISL_CTX_MAX_OPERATIONS
+#endif
+
+
 /* Define if isl_options_set_schedule_serialize_sccs exists. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_ISL_OPTIONS_SET_SCHEDULE_SERIALIZE_SCCS
diff --git a/gcc/configure b/gcc/configure
index 0d31383..07d39f9 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -28625,6 +28625,29 @@ rm -f core conftest.err conftest.$ac_objext \
   { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_has_isl_options_set_schedule_serialize_sccs" >&5
 $as_echo "$ac_has_isl_options_set_schedule_serialize_sccs" >&6; }
 
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking Checking for isl_ctx_get_max_operations" >&5
+$as_echo_n "checking Checking for isl_ctx_get_max_operations... " >&6; }
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include 
+int
+main ()
+{
+isl_ctx_get_max_operations (isl_ctx_alloc ());
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_link "$LINENO"; then :
+  ac_has_isl_ctx_get_max_operations=yes
+else
+  ac_has_isl_ctx_get_max_operations=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_has_isl_ctx_get_max_operations" >&5
+$as_echo "$ac_has_isl_ctx_get_max_operations" >&6; }
+
   LIBS="$saved_LIBS"
   CXXFLAGS="$saved_CXXFLAGS"
 
@@ -28639,6 +28662,11 @@ $as_echo "#define HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE 1" >>confdefs.h
 $as_echo "#define HAVE_ISL_OPTIONS_SET_SCHEDULE_SERIALIZE_SCCS 1" >>confdefs.h
 
   fi
+  if test x"$ac_has_isl_ctx_get_max_operations" = x"yes"; then
+
+$as_echo "#define HAVE_ISL_CTX_MAX_OPERATIONS 1" >>confdefs.h
+
+  fi
 fi
 
 # Check for plugin support
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 846651d..b6e8bed 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -5790,6 +5790,13 @@ if test "x${ISLLIBS}" != "x" ; then
   [ac_has_isl_options_set_schedule_serialize_sccs=no])
   AC_MSG_RESULT($ac_has_isl_options_set_schedule_serialize_sccs)
 
+  AC_MSG_CHECKING([Checking for isl_ctx_get_max_operations])
+  AC_TRY_LINK([#include ],
+  [isl_ctx_get_max_operations (isl_ctx_alloc ());],
+  [ac_has_isl_ctx_get_max_operations=yes],
+  [ac_has_isl_ctx_get_max_operations=no])
+  AC_MSG_RESULT($ac_has_isl_ctx_get_max_operations)
+
   LIBS="$saved_LIBS"
   CXXFLAGS="$saved_CXXFLAGS"
 
@@ -5802,6 +5809,10 @@ if test "x${ISLLIBS}" != "x" ; then
  AC_DEFINE(HAVE_ISL_OPTIONS_SET_SCHEDULE_SERIALIZE_SCCS, 1,
[Define if isl_options_set_schedule_serialize_sccs exists.])
   fi
+  if test x"$ac_has_isl_ctx_get_max_operations" = x"yes"; then
+ AC_DEFINE(HAVE_ISL_CTX_MAX_OPERATIONS, 1,
+   [Define if isl_ctx_get_max_operations exists.])
+  fi
 fi
 
 GCC_ENABLE_PLUGINS
diff --git a/gcc/graphite-dependences.c b/gcc/graphite-dependences.c
index c3c2090..85f16f3 100644
--- a/gcc/graphite-dependences.c
+++ b/gcc/graphite-dependences.c
@@ -256,17 +256,12 @@ __isl_give isl_union_map *
 extend_schedule (__isl_take isl_union_map *x)
 {
   int max = 0;
-  isl_stat res;
   struct extend_schedule_str str;
 
-  res =

Re: [Patch] PR67351 Implement << N & >> N optimizers

2015-09-02 Thread Marc Glisse


+/* Optimize (x >> c) << c into x & (-1<
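
A small check of the identity this comment appears to describe, assuming
it is the usual mask equivalence for an unsigned x and a shift count
below the type width.  The function names are hypothetical and the
sketch is limited to 32-bit unsigned values.

```c
#include <stdint.h>

/* Shift right then left by the same count: clears the low C bits.
   Requires c < 32.  */
uint32_t
shift_pair (uint32_t x, unsigned c)
{
  return (x >> c) << c;
}

/* The same result expressed as a single AND with an all-ones mask
   shifted left by C.  Requires c < 32.  */
uint32_t
mask_form (uint32_t x, unsigned c)
{
  return x & (uint32_t) (UINT32_MAX << c);
}
```

For signed operands the situation is more delicate (right shifts of
negative values are implementation-defined in C), which is why the
sketch sticks to unsigned arithmetic.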

Re: [gomp4.1] Structure element mapping support

2015-09-02 Thread Ilya Verbin

On Mon, Aug 31, 2015 at 17:07:53 +0200, Jakub Jelinek wrote:
>   * gimplify.c (gimplify_scan_omp_clauses): Handle
>   struct element GOMP_MAP_FIRSTPRIVATE_POINTER.

Have you seen this?

gcc/gimplify.c: In function ‘void gimplify_scan_omp_clauses(tree_node**, gimple_statement_base**, omp_region_type, tree_code)’:
gcc/gimplify.c:6578:12: error: ‘sc’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  : *sc != c;
^

  -- Ilya

Re: Fix 61441

2015-09-02 Thread Sujoy Saraswati

Hi Richard,

> Note that I'm curious what
> the actual bug is - is it that (double) sNaN creates a sNaN?  Then the fix
> should be elsewhere, in constant folding itself
> (fold_convert_const_real_from_real
> or real_convert).
>
> If that isn't the bug you have very many other passes to fix for the
> same problem.
>
> So - can you please explain?

In this test case, the floating point operation for converting the
float to double is what should convert the sNaN to qNaN. I tried to
cover more floating point operations than just the conversion
operations exposed by this test case. For example, if we consider the
following program -

#define _GNU_SOURCE
#include 
#include 

int main (void)
{
  float x;
  float sNaN = __builtin_nansf ("");
  x = sNaN + .0;
  return issignaling(x);
}

The operation (sNaN + .0) should also result in qNaN after folding.
Hence, I thought of doing the sNaN to qNaN conversion in various
places under tree-ssa-ccp, where the result upon a folding is
available. I do agree that this approach may mean many more such
places should also be covered in other passes, but thought of sending
the fix for the ccp pass to start with.
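
For readers less familiar with the encoding, here is a minimal sketch of
what "converting sNaN to qNaN" means at the bit level.  It assumes IEEE
754 binary32 with the convention that a set most-significant fraction
bit marks a quiet NaN (true on x86 and ARM; legacy MIPS and PA-RISC use
the opposite sense).  The helper names are hypothetical and this is not
how real.c represents values; it works on raw bit patterns so that no
floating-point operation can quiet the value behind our back.

```c
#include <stdbool.h>
#include <stdint.h>

#define F32_EXP_MASK   UINT32_C (0x7f800000)  /* all-ones exponent */
#define F32_FRAC_MASK  UINT32_C (0x007fffff)  /* fraction field */
#define F32_QUIET_BIT  UINT32_C (0x00400000)  /* top fraction bit */

/* NaN: exponent all ones and a nonzero fraction.  */
bool
f32_is_nan (uint32_t bits)
{
  return (bits & F32_EXP_MASK) == F32_EXP_MASK
	 && (bits & F32_FRAC_MASK) != 0;
}

/* Signaling NaN: a NaN whose quiet bit is clear (assumed encoding).  */
bool
f32_is_signaling (uint32_t bits)
{
  return f32_is_nan (bits) && (bits & F32_QUIET_BIT) == 0;
}

/* Quiet a signaling NaN by setting the quiet bit; leave everything
   else (including the payload) untouched.  */
uint32_t
f32_quiet (uint32_t bits)
{
  return f32_is_signaling (bits) ? (bits | F32_QUIET_BIT) : bits;
}
```

For example, the pattern 0x7fa00000 is a signaling NaN under this
encoding and f32_quiet turns it into 0x7fe00000, while 0x7fc00000 (a
quiet NaN) and 0x3f800000 (1.0f) are returned unchanged.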

Let me know if you would suggest an alternate approach.

Regards,
Sujoy

> Thanks,
> Richard.
>
>> Regards,
>> Sujoy
>>
>> 2015-09-01  Sujoy Saraswati 
>>
>> PR tree-optimization/61441
>> * tree-ssa-ccp.c (convert_snan_to_qnan): Convert sNaN to qNaN when
>> flag_signaling_nans is off.
>> (ccp_fold_stmt, visit_assignment, visit_cond_stmt): call
>> convert_snan_to_qnan to convert sNaN to qNaN on constant folding.
>>
>> PR tree-optimization/61441
>> * gcc.dg/pr61441.c: New testcase.
>>
>> Index: gcc/tree-ssa-ccp.c
>> ===
>> --- gcc/tree-ssa-ccp.c  (revision 226965)
>> +++ gcc/tree-ssa-ccp.c  (working copy)
>> @@ -560,6 +560,24 @@ value_to_wide_int (ccp_prop_value_t val)
>>return 0;
>>  }
>>
>> +/* Convert sNaN to qNaN when flag_signaling_nans is off */
>> +
>> +static void
>> +convert_snan_to_qnan (tree expr)
>> +{
>> +  if (expr
>> +  && (TREE_CODE (expr) == REAL_CST)
>> +  && !flag_signaling_nans)
>> +  {
>> +REAL_VALUE_TYPE *d = TREE_REAL_CST_PTR (expr);
>> +
>> +if (HONOR_NANS (TYPE_MODE (TREE_TYPE (expr)))
>> +&& REAL_VALUE_ISNAN (*d)
>> +&& d->signalling)
>> +  d->signalling = 0;
>> +  }
>> +}
>> +
>>  /* Return the value for the address expression EXPR based on alignment
>> information.  */
>>
>> @@ -2156,6 +2174,7 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
>> if (val.lattice_val != CONSTANT
>> || val.mask != 0)
>>   return false;
>> +convert_snan_to_qnan (val.value);
>>
>> if (dump_file)
>>   {
>> @@ -2197,7 +2216,10 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
>> bool res;
>> if (!useless_type_conversion_p (TREE_TYPE (lhs),
>> TREE_TYPE (new_rhs)))
>> +{
>>   new_rhs = fold_convert (TREE_TYPE (lhs), new_rhs);
>> +  convert_snan_to_qnan (new_rhs);
>> +}
>> res = update_call_from_tree (gsi, new_rhs);
>> gcc_assert (res);
>> return true;
>> @@ -2216,6 +2238,7 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
>>  tree new_rhs = fold_builtin_alloca_with_align (stmt);
>>  if (new_rhs)
>>   {
>> +convert_snan_to_qnan (new_rhs);
>> bool res = update_call_from_tree (gsi, new_rhs);
>> tree var = TREE_OPERAND (TREE_OPERAND (new_rhs, 0),0);
>> gcc_assert (res);
>> @@ -2260,7 +2283,10 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
>>   {
>> tree rhs = unshare_expr (val);
>> if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
>> +{
>>   rhs = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), rhs);
>> +  convert_snan_to_qnan (rhs);
>> +}
>> gimple_assign_set_rhs_from_tree (gsi, rhs);
>> return true;
>>   }
>> @@ -2292,6 +2318,7 @@ visit_assignment (gimple stmt, tree *output_p)
>>/* Evaluate the statement, which could be
>>  either a GIMPLE_ASSIGN or a GIMPLE_CALL.  */
>>val = evaluate_stmt (stmt);
>> +  convert_snan_to_qnan (val.value);
>>
>>/* If STMT is an assignment to an SSA_NAME, we only have one
>>  value to set.  */
>> @@ -2324,6 +2351,7 @@ visit_cond_stmt (gimple stmt, edge *taken_edge_p)
>>if (val.lattice_val != CONSTANT
>>|| val.mask != 0)
>>  return SSA_PROP_VARYING;
>> +  convert_snan_to_qnan (val.value);
>>
>>/* Find which edge out of the conditional block will be taken and add it
>>   to the worklist.  If no single edge can be determined statically,
>>
>> Index: gcc/testsuite/gcc.dg/pr61441.c
>>

Re: Fix 61441

2015-09-02 Thread Richard Biener

On Wed, Sep 2, 2015 at 1:36 PM, Sujoy Saraswati  wrote:
> Hi Richard,
>
>> Note that I'm curious what
>> the actual bug is - is it that (double) sNaN creates a sNaN?  Then the fix
>> should be elsewhere, in constant folding itself
>> (fold_convert_const_real_from_real
>> or real_convert).
>>
>> If that isn't the bug you have very many other passes to fix for the
>> same problem.
>>
>> So - can you please explain?
>
> In this test case, the floating point operation for converting the
> float to double is what should convert the sNaN to qNaN. I tried to
> cover more floating point operations than just the conversion
> operations exposed by this test case. For example, if we consider the
> following program -
>
> #define _GNU_SOURCE
> #include 
> #include 
>
> int main (void)
> {
>   float x;
>   float sNaN = __builtin_nansf ("");
>   x = sNaN + .0;
>   return issignaling(x);
> }
>
> The operation (sNaN + .0) should also result in qNaN after folding.
> Hence, I thought of doing the sNaN to qNaN conversion in various
> places under tree-ssa-ccp, where the result upon a folding is
> available. I do agree that this approach may mean many more such
> places should also be covered in other passes, but thought of sending
> the fix for the ccp pass to start with.
>
> Let me know if you suggest alternate approach.

CCP and other passes ultimately end up using fold-const.c:const_{unop,binop}
for constant folding, so that is where the fix should go (or into real.c).  That
will automatically handle other passes doing similar transforms.

Richard.

> Regards,
> Sujoy
>
>> Thanks,
>> Richard.
>>
>>> Regards,
>>> Sujoy
>>>
>>> 2015-09-01  Sujoy Saraswati 
>>>
>>> PR tree-optimization/61441
>>> * tree-ssa-ccp.c (convert_snan_to_qnan): Convert sNaN to qNaN when
>>> flag_signaling_nans is off.
>>> (ccp_fold_stmt, visit_assignment, visit_cond_stmt): call
>>> convert_snan_to_qnan to convert sNaN to qNaN on constant folding.
>>>
>>> PR tree-optimization/61441
>>> * gcc.dg/pr61441.c: New testcase.
>>>
>>> Index: gcc/tree-ssa-ccp.c
>>> ===
>>> --- gcc/tree-ssa-ccp.c  (revision 226965)
>>> +++ gcc/tree-ssa-ccp.c  (working copy)
>>> @@ -560,6 +560,24 @@ value_to_wide_int (ccp_prop_value_t val)
>>>return 0;
>>>  }
>>>
>>> +/* Convert sNaN to qNaN when flag_signaling_nans is off */
>>> +
>>> +static void
>>> +convert_snan_to_qnan (tree expr)
>>> +{
>>> +  if (expr
>>> +  && (TREE_CODE (expr) == REAL_CST)
>>> +  && !flag_signaling_nans)
>>> +  {
>>> +REAL_VALUE_TYPE *d = TREE_REAL_CST_PTR (expr);
>>> +
>>> +if (HONOR_NANS (TYPE_MODE (TREE_TYPE (expr)))
>>> +&& REAL_VALUE_ISNAN (*d)
>>> +&& d->signalling)
>>> +  d->signalling = 0;
>>> +  }
>>> +}
>>> +
>>>  /* Return the value for the address expression EXPR based on alignment
>>> information.  */
>>>
>>> @@ -2156,6 +2174,7 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
>>> if (val.lattice_val != CONSTANT
>>> || val.mask != 0)
>>>   return false;
>>> +convert_snan_to_qnan (val.value);
>>>
>>> if (dump_file)
>>>   {
>>> @@ -2197,7 +2216,10 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
>>> bool res;
>>> if (!useless_type_conversion_p (TREE_TYPE (lhs),
>>> TREE_TYPE (new_rhs)))
>>> +{
>>>   new_rhs = fold_convert (TREE_TYPE (lhs), new_rhs);
>>> +  convert_snan_to_qnan (new_rhs);
>>> +}
>>> res = update_call_from_tree (gsi, new_rhs);
>>> gcc_assert (res);
>>> return true;
>>> @@ -2216,6 +2238,7 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
>>>  tree new_rhs = fold_builtin_alloca_with_align (stmt);
>>>  if (new_rhs)
>>>   {
>>> +convert_snan_to_qnan (new_rhs);
>>> bool res = update_call_from_tree (gsi, new_rhs);
>>> tree var = TREE_OPERAND (TREE_OPERAND (new_rhs, 0),0);
>>> gcc_assert (res);
>>> @@ -2260,7 +2283,10 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
>>>   {
>>> tree rhs = unshare_expr (val);
>>> if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
>>> +{
>>>   rhs = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), rhs);
>>> +  convert_snan_to_qnan (rhs);
>>> +}
>>> gimple_assign_set_rhs_from_tree (gsi, rhs);
>>> return true;
>>>   }
>>> @@ -2292,6 +2318,7 @@ visit_assignment (gimple stmt, tree *output_p)
>>>/* Evaluate the statement, which could be
>>>  either a GIMPLE_ASSIGN or a GIMPLE_CALL.  */
>>>val = evaluate_stmt (stmt);
>>> +  convert_snan_to_qnan (val.value);
>>>
>>>/* If STMT is an assignment to an

[gomp4.1] Fix up ordered threads handling and initial step towards ordered simd

2015-09-02 Thread Jakub Jelinek

Hi!

This fixes ICE with ordered simd (broken with the doacross changes; the
second chunk) and allows ordered simd inside of simd regions.
As it is still unclear whether ordered simd is required to be only closely
nested inside of simd construct or declare simd functions (i.e. lexically
nested), or only closely nested region (i.e. dynamically nested), I'm
deferring further implementation until that is settled.
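
For context, a minimal sketch of the kind of code this is meant to
accept: an ordered simd block lexically nested in a simd loop.  The
function and parameter names are hypothetical.  The block serializes the
one statement that may conflict across iterations (repeated bucket
indices), while the rest of the loop body remains a candidate for
vectorization.  Compiled without OpenMP support the pragmas are ignored
and the loop simply runs serially.

```c
/* Accumulate WEIGHT[i] into HIST[BUCKET[i]] for i in [0, N).  Two
   iterations may name the same bucket, so the update is placed in an
   ordered simd block to keep it in iteration order.  */
void
accumulate_by_bucket (const int *bucket, const int *weight,
		      int *hist, int n)
{
#pragma omp simd
  for (int i = 0; i < n; i++)
    {
#pragma omp ordered simd
      {
	hist[bucket[i]] += weight[i];
      }
    }
}
```

The open question above (lexical versus dynamic nesting) would decide
whether the same block is also allowed in a function merely called from
the simd loop; the sketch only shows the lexically nested case.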

Committed to gomp 4.1 branch.

2015-09-02  Jakub Jelinek  

* omp-low.c (check_omp_nesting_restrictions): Allow ordered simd
instead of simd region.  Don't assume that all ordered construct
clauses must be depend clauses.

* testsuite/libgomp.c/ordered-4.c: New test.
* testsuite/libgomp.c++/ordered-1.C: New test.

--- gcc/omp-low.c.jj    2015-08-31 16:57:23.0 +0200
+++ gcc/omp-low.c   2015-09-01 17:39:05.114232692 +0200
@@ -3118,8 +3118,16 @@ check_omp_nesting_restrictions (gimple s
   if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
  && gimple_omp_for_kind (ctx->stmt) & GF_OMP_FOR_SIMD)
{
+ c = NULL_TREE;
+ if (gimple_code (stmt) == GIMPLE_OMP_ORDERED)
+   {
+ c = gimple_omp_ordered_clauses (as_a  (stmt));
+ if (c && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_SIMD)
+   return true;
+   }
  error_at (gimple_location (stmt),
-   "OpenMP constructs may not be nested inside simd region");
+   "OpenMP constructs other than %<#pragma omp ordered simd%>"
+   " may not be nested inside simd region");
  return false;
}
   else if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS)
@@ -3337,6 +3345,13 @@ check_omp_nesting_restrictions (gimple s
   for (c = gimple_omp_ordered_clauses (as_a  (stmt));
   c; c = OMP_CLAUSE_CHAIN (c))
{
+ if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND)
+   {
+ gcc_assert (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_THREADS
+ || (ctx == NULL
+ && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_SIMD));
+ continue;
+   }
  enum omp_clause_depend_kind kind = OMP_CLAUSE_DEPEND_KIND (c);
  if (kind == OMP_CLAUSE_DEPEND_SOURCE
  || kind == OMP_CLAUSE_DEPEND_SINK)
--- libgomp/testsuite/libgomp.c/ordered-4.c.jj  2015-09-01 15:04:14.419283225 +0200
+++ libgomp/testsuite/libgomp.c/ordered-4.c 2015-09-01 15:04:37.612966738 +0200
@@ -0,0 +1,83 @@
+extern
+#ifdef __cplusplus
+"C"
+#endif
+void abort (void);
+
+void
+foo (int i, char *j)
+{
+  #pragma omp atomic
+  j[i]++;
+  #pragma omp ordered threads
+  {
+int t;
+#pragma omp atomic read
+t = j[i];
+if (t != 3)
+  abort ();
+if (i > 1)
+  {
+   #pragma omp atomic read
+   t = j[i - 1];
+   if (t == 2)
+ abort ();
+  }
+if (i < 127)
+  {
+   #pragma omp atomic read
+   t = j[i + 1];
+   if (t == 4)
+ abort ();
+  }
+  }
+  #pragma omp atomic
+  j[i]++;
+}
+
+int
+main ()
+{
+  int i;
+  char j[128];
+  #pragma omp parallel
+  {
+#pragma omp for
+for (i = 0; i < 128; i++)
+  j[i] = 0;
+#pragma omp for ordered schedule(dynamic, 1)
+for (i = 0; i < 128; i++)
+  {
+   #pragma omp atomic
+   j[i]++;
+   #pragma omp ordered threads
+   {
+ int t;
+ #pragma omp atomic read
+ t = j[i];
+ if (t != 1)
+   abort ();
+ if (i > 1)
+   {
+ #pragma omp atomic read
+ t = j[i - 1];
+ if (t == 0)
+   abort ();
+   }
+ if (i < 127)
+   {
+ #pragma omp atomic read
+ t = j[i + 1];
+ if (t == 2)
+   abort ();
+   }
+   }
+   #pragma omp atomic
+   j[i]++;
+  }
+#pragma omp for ordered schedule(static, 1)
+for (i = 0; i < 128; i++)
+  foo (i, j);
+  }
+  return 0;
+}
--- libgomp/testsuite/libgomp.c++/ordered-1.C.jj    2015-09-01 15:05:23.432341517 +0200
+++ libgomp/testsuite/libgomp.c++/ordered-1.C   2015-09-01 15:05:18.289411694 +0200
@@ -0,0 +1 @@
+#include "../libgomp.c/ordered-4.c"

Jakub

Re: [PATCH] [ARM, Callgraph] Fix PR67280: function incorrectly marked as nothrow

2015-09-02 Thread Jan Hubicka

> >This patch is an attempt to fix
> >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67280. I have written up
> >an analysis of the bug there.
> >
> >When cgraph_node::create_wrapper() updates the callgraph for the new
> >function, it sets the can_throw_external flag to false, even when
> >wrapping a function which can throw. This causes the ipa-pure-const
> >phase to mark the wrapper function as nothrow which results in
Oops...
> >incorrect unwinding tables. (more details on bugzilla)
> Seems clearly wrong.  I wonder if there are other properties that
> should be set but aren't in the thunk.

can_throw_external seems to be only one.
> 
> >
> >The attached patch addresses the problem in
> >cgraph_node::create_wrapper(). A slightly more general approach would
> >be to change symbol_table::create_edge() so that it checks
> >TREE_NOTHROW(callee->decl) when call_stmt is NULL.
> I'm not well versed in the cgraph code -- my worry with this
> approach would be that the wrapper's state is inconsistent with what
> the wrapper can call.  It seems cleaner to make sure these various
> flags are correct when we create the wrapper.

It kind of sucks that one needs to mind this flag each time one creates an edge,
but setting the value in create_edge is not quite correct, as that function does
not have any information on where the call appears or whether the exception is
handled locally.
> >gcc/ChangeLog:
> >
> >2015-08-28  Charles Baylis  
> >
> > * cgraphunit.c (cgraph_node::create_wrapper): Set can_throw_external
> > in new callgraph edge.
> Ultimately Jan's call.

This is OK.
Thanks for looking into this!

Honza
> 
> Jeff

Re: [PATCH] Fix PR66705

2015-09-02 Thread Richard Biener

On Wed, 2 Sep 2015, Jan Hubicka wrote:

> > 
> > I was naively using ->get_constructor in IPA PTA without properly
> > checking whether that succeeds.  Now I tried to use ctor_for_folding
> > but that isn't good as we want to analyze non-const globals in IPA
> > PTA and we need to analyze their initializers as well.
> > 
> > Thus I'm trying below with ctor_for_analysis, but I really "just"
> > need the initializer or a "not available" for conservative handling.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > 
> > Honza - I suppose you should double-check this and suggest sth
> > different (or implement sth more generic in the IPA infrastructure).
> 
> Yep, you are correct that we don't currently have a way to look into the ctor
> without actually loading it. But do you need something more than just walking
> the references that you already have in the ipa-ref lists?

Hmm, no, the ipa-ref list should be enough (unless we start field-sensitive
analysis or need NULL inits for correctness).  I still have to figure out
how to walk the list and what the reference would look like (what
is ref->use?  IPA_REF_ADDR?  can those be speculative?)

Richard.

> > 
> > Thanks,
> > Richard.
> > 
> > 2015-09-02  Richard Biener  
> > 
> > PR ipa/66705
> > * tree-ssa-structalias.c (ctor_for_analysis): New function.
> > (create_variable_info_for_1): Use ctor_for_analysis instead
> > of get_constructor.
> > (create_variable_info_for): Likewise.
> 
> Otherwise I would go for making ctor_for_analysis a method of varpool_node 
> and...
> > 
> > * g++.dg/lto/pr66705_0.C: New testcase.
> > 
> > Index: gcc/tree-ssa-structalias.c
> > ===
> > --- gcc/tree-ssa-structalias.c  (revision 227207)
> > +++ gcc/tree-ssa-structalias.c  (working copy)
> > @@ -5637,6 +5637,26 @@ check_for_overlaps (vec fiel
> >return false;
> >  }
> >  
> > +/* We can't use ctor_for_folding as that only returns constant constructors.  */
> > +
> > +static tree
> > +ctor_for_analysis (tree decl)
> > +{
> > +  varpool_node *node = varpool_node::get (decl);
> > +  if (!node)
> > +return error_mark_node;
> > +  node = node->ultimate_alias_target ();
> > +  if (DECL_INITIAL (node->decl) != error_mark_node
> > +  || !in_lto_p)
> > +return (DECL_INITIAL (node->decl)
> > +   ? DECL_INITIAL (node->decl) : error_mark_node);
> 
> I think returning NULL here is just fine. 
> error_mark_node means constructor is not really available. NULL is
> the usual way to say that the variable is not initialized.
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

Re: [PATCH] Import liboffloadmic from upstream

2015-09-02 Thread Dodji Seketeli

Ilya Verbin  writes:

> On Tue, Sep 01, 2015 at 09:58:22 +0200, Dodji Seketeli wrote:
>> Woops.  can you send me the exact two libraries so that I can see what's
>> going wrong?  You can quickly file an issue to
>> https://sourceware.org/bugzilla/enter_bug.cgi?product=libabigail or just
>> send me the two binaries by email. I'll quickly look into this.
>
> Done: https://sourceware.org/bugzilla/show_bug.cgi?id=18904

Thanks!  I think I have fixed the issue.  Sorry for the
inconvenience.

The ABI changes of the library according to abidiff are:
http://paste.fedoraproject.org/262507/14412012

Cheers,

-- 
Dodji

Re: [PATCH] Fix PR66705

2015-09-02 Thread Richard Biener

On Wed, 2 Sep 2015, Richard Biener wrote:

> On Wed, 2 Sep 2015, Jan Hubicka wrote:
> 
> > > 
> > > I was naively using ->get_constructor in IPA PTA without properly
> > > checking whether that succeeds.  Now I tried to use ctor_for_folding
> > > but that isn't good as we want to analyze non-const globals in IPA
> > > PTA and we need to analyze their initializers as well.
> > > 
> > > Thus I'm trying below with ctor_for_analysis, but I really "just"
> > > need the initializer or a "not available" for conservative handling.
> > > 
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > 
> > > Honza - I suppose you should double-check this and suggest sth
> > > different (or implement sth more generic in the IPA infrastructure).
> > 
> > Yep, you are correct that we don't currently have a way to look into the ctor
> > without actually loading it. But do you need something more than just walking
> > the references that you already have in the ipa-ref lists?
> 
> Hmm, no, the ipa-ref list should be enough (unless we start field-sensitive
> analysis or need NULL inits for correctness).  I still have to figure out
> how to walk the list and what the reference would look like (what
> is ref->use?  IPA_REF_ADDR?  can those be speculative?)

Sth like the following seems to work.

Richard.

2015-09-02  Richard Biener  

PR ipa/66705
* tree-ssa-structalias.c (ctor_for_analysis): New function.
(create_variable_info_for_1): Use ctor_for_analysis instead
of get_constructor.
(create_variable_info_for): Likewise.

* g++.dg/lto/pr66705_0.C: New testcase.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 227207)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -5650,7 +5650,6 @@ create_variable_info_for_1 (tree decl, c
   auto_vec fieldstack;
   fieldoff_s *fo;
   unsigned int i;
-  varpool_node *vnode;
 
   if (!declsize
   || !tree_fits_uhwi_p (declsize))
@@ -5668,12 +5667,10 @@ create_variable_info_for_1 (tree decl, c
   /* Collect field information.  */
   if (use_field_sensitive
   && var_can_have_subvars (decl)
-  /* ???  Force us to not use subfields for global initializers
-in IPA mode.  Else we'd have to parse arbitrary initializers.  */
+  /* ???  Force us to not use subfields for globals in IPA mode.
+Else we'd have to parse arbitrary initializers.  */
   && !(in_ipa_mode
-  && is_global_var (decl)
-  && (vnode = varpool_node::get (decl))
-  && vnode->get_constructor ()))
+  && is_global_var (decl)))
 {
   fieldoff_s *fo = NULL;
   bool notokay = false;
@@ -5805,13 +5802,13 @@ create_variable_info_for (tree decl, con
 
  /* If this is a global variable with an initializer and we are in
 IPA mode generate constraints for it.  */
- if (vnode->get_constructor ()
- && vnode->definition)
+ ipa_ref *ref;
+ for (unsigned idx = 0; vnode->iterate_reference (idx, ref); ++idx)
{
  auto_vec rhsc;
  struct constraint_expr lhs, *rhsp;
  unsigned i;
- get_constraint_for_rhs (vnode->get_constructor (), );
+ get_constraint_for_address_of (ref->referred->decl, );
  lhs.var = vi->id;
  lhs.offset = 0;
  lhs.type = SCALAR;
Index: gcc/testsuite/g++.dg/lto/pr66705_0.C
===
--- gcc/testsuite/g++.dg/lto/pr66705_0.C(revision 0)
+++ gcc/testsuite/g++.dg/lto/pr66705_0.C(working copy)
@@ -0,0 +1,15 @@
+// { dg-lto-do link }
+// { dg-lto-options { { -O2 -flto -flto-partition=max -fipa-pta } } }
+// { dg-extra-ld-options "-r -nostdlib" }
+
+class A {
+public:
+A();
+};
+int a = 0;
+void foo() {
+a = 0;
+A b;
+for (; a;)
+  ;
+}

Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting

2015-09-02 Thread Jiong Wang


Jeff Law writes:

> On 05/21/2015 02:46 PM, Jiong Wang wrote:
>>
>> Thanks for these thoughts.
>>
>> I tried but still can't prove this transformation will not introduce
>> extra pointer overflow even given it's reassociation with vfp, although
>> my first impression is that it will not introduce extra risk in real
>> applications.
>>
>> Have done a quick check on hppa's legitimize_address. I see for (plus
>> sym_ref, const_int), if const_int is beyond +-4K, then that hook will
>> force them into register, then (plus reg, reg) is always OK.
> I'm virtually certain the PA's legitimize_address is not overflow safe.
> It was written long before we started worrying about overflows in
> address computations.  It was mostly concerned with trying to generate good
> addressing modes without running afoul of the implicit space register
> selection issues.
>
> A SYMBOL_REF is always a valid base register.  However, as the comment 
> in hppa_legitimize_address notes, we might be given a MEM for something 
> like:  x[n-10].
>
> We don't want to rewrite that as (x-10) + n, even though doing so 
> would be beneficial for LICM.
>
>
>>
>> So for target hooks,  my understanding of your idea is something like:
>>
>>   new hook targetm.pointer_arith_reassociate (), if return -1 then
>>   support full reassociation, 0 for limited, 1 for should not do any
>>   reassociation. the default version return -1 as most targets are OK to
>>   do reassociation given we can prove there is no introducing of overflow
>>   risk. While for target like HPPA, we should define this hook to return
>>   0 for limited support.
> Right.  Rather than use magic constants, I'd suggest an enum for the 
> tri-state.  FULL_PTR_REASSOCIATION, PARTIAL_PTR_REASSOCIATION, 
> NO_PTR_REASSOCIATION.
>
>
>>
>>   Then, if targetm.pointer_arith_reassociate () returns 0 (limited), we
>>   should further invoke the second hook targetm.limited_reassociate_p (rtx x),
>>   to check that the reassociated rtx 'x' meets any restrictions; for example,
>>   for HPPA, the constant part shouldn't be beyond +-4K.
> Right.
>
> Jeff
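
The tri-state scheme discussed above can be sketched roughly as follows. Only
the three enumerator names come from the thread; the helper functions, their
names, and the inclusive treatment of the +-4K bound are assumptions made for
illustration, not the interface of any existing GCC hook.

```cpp
#include <cassert>
#include <cstdint>

// Tri-state suggested in the thread in place of the magic constants -1/0/1.
enum ptr_reassociation {
  FULL_PTR_REASSOCIATION,
  PARTIAL_PTR_REASSOCIATION,
  NO_PTR_REASSOCIATION
};

// Hypothetical stand-in for a target's second hook: accept a reassociated
// address only if its constant part stays within +-4K (the HPPA example;
// the bound is treated as inclusive here, which is an assumption).
inline bool limited_reassociate_ok(std::int64_t const_part) {
  return const_part >= -4096 && const_part <= 4096;
}

// Hypothetical driver: may (base + const_part) be reassociated at all?
inline bool may_reassociate(ptr_reassociation policy, std::int64_t const_part) {
  switch (policy) {
    case FULL_PTR_REASSOCIATION:    return true;
    case PARTIAL_PTR_REASSOCIATION: return limited_reassociate_ok(const_part);
    case NO_PTR_REASSOCIATION:      return false;
  }
  return false;
}
```

With named enumerators the caller cannot confuse "limited" with "none", which
is easy to do with bare integer return values.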

For the record, after Bin's recent tree-ssa-ivopt improvement, which originated
from PR62173, this patch is no longer beneficial.

I can't see such re-shuffling opportunities at the RTL level anymore. This
patch was trying to hoist the RTL sequences generated for local array
base addresses that had not been hoisted out of the loop at the tree level,
but these are now handled quite well by tree-ssa-ivopt.

During both aarch64 and mips64 bootstraps, the optimization in this patch
was never triggered, whereas it triggered quite a few times before Bin's
tree-level fix.

I have stopped working on this patch. Thanks for the time spent
reviewing and discussing it.

-- 
Regards,
Jiong

Re: [PATCH][optabs][ifcvt][1/3] Define negcc, notcc optabs

2015-09-02 Thread Kyrill Tkachov


Hi Jeff,

On 01/09/15 23:13, Jeff Law wrote:

On 09/01/2015 09:04 AM, Kyrill Tkachov wrote:

Hi all,

This first patch introduces the negcc and notcc optabs that should
expand to a conditional
negate or a conditional bitwise complement operation.

These are used in ifcvt.c to transform code of the form:
if (test) x = -A; else x = A;
into:
x = A; if (test) x = -x;
where the "if (test) x = -x;" is implemented using the negcc (or notcc
in the ~x case)
if such an optab is available. If such an optab is not implemented then
no transformation
is performed.  Thus, without patches 2/3 and 3/3 this patch does not
impact behaviour on any target.
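
As a source-level illustration of the transformation described above (ifcvt
itself works on RTL, so this is only an analogy, and the function names are
made up for the example), the two shapes compute the same value:

```cpp
#include <cassert>

// Original shape: both arms of the conditional assign x.
inline int before_ifcvt(bool test, int a) {
  int x;
  if (test) x = -a; else x = a;
  return x;
}

// Shape after the transformation: an unconditional move followed by a
// conditional negate, which a negcc pattern could implement without a branch.
inline int after_ifcvt(bool test, int a) {
  int x = a;
  if (test) x = -x;
  return x;
}
```

The notcc case has the same structure with bitwise complement (~x) in place of
negation.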

Bootstrapped and tested as part of the series on arm, aarch64, x86_64.

Ok for trunk?

Thanks,
Kyrill

2015-09-01  Kyrylo Tkachov 

  * ifcvt.c (noce_try_inverse_constants): New function.
  (noce_process_if_block): Call it.
  * optabs.h (emit_conditional_neg_or_complement): Declare prototype.
  * optabs.def (negcc_optab, notcc_optab): Declare.
  * optabs.c (emit_conditional_neg_or_complement): New function.
  * doc/md.texi (Standard Names): Document negcc, notcc names.

negnotcc-optabs.patch


commit a2183218070ed5f2dca0a9651fdb08ce134ba8ee
Author: Kyrylo Tkachov
Date:   Thu Aug 13 18:14:52 2015 +0100

  [optabs][ifcvt][1/3] Define negcc, notcc optabs

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 0bffdc6..5038269 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5791,6 +5791,21 @@ move operand 2 or (operands 2 + operand 3) into operand 
0 according to the
   comparison in operand 1.  If the comparison is false, operand 2 is moved into
   operand 0, otherwise (operand 2 + operand 3) is moved.

+@cindex @code{neg@var{mode}cc} instruction pattern
+@item @samp{neg@var{mode}cc}
+Similar to @samp{mov@var{mode}cc} but for conditional negation.  Conditionally
+move the negation of operand 2 operand 3 into operand 0 according to the
+comparison in operand 1.  If the comparison is true, the negation of operand 2
+is moved into operand 0, otherwise operand 3 is moved.
+
+@cindex @code{not@var{mode}cc} instruction pattern
+@item @samp{not@var{mode}cc}
+Similar to @samp{neg@var{mode}cc} but for conditional complement.
+Conditionally move the bitwise complement of operand 2 operand 3 into operand 0
+according to the comparison in operand 1.  If the comparison is true,
+the complement of operand 2 is moved into operand 0, otherwise operand 3 is
+moved.

"operand 2 operand 3" does not parse.  I think you mean "operand 2 or
operand 3" in both instances above.  Even that doesn't parse well as
it's not clear if operand 3 or the negation/complement of operand 3 is
meant.  Can you try to resolve the ambiguity of the second sentence in
each description?


You're right, it doesn't parse. I've improved the description.
We either move the negated operand 2 or the unchanged operand 3.



And just a note.  The canonical way to refer to these patterns should be
negcc/notcc.  That avoids conflicting with the older negscc patterns
with different semantics that are defined by some ports.  You're already
using that terminology, so there's nothing to change, I just wanted to
point it out.



+

+  rtx_code code;
+  if (val_a == -val_b)

Do we have to worry about signed overflow here?  I'm thinking
specifically when val_b is the smallest possible integer representable
by a HOST_WIDE_INT.  I suspect you may be able to avoid these problems
with judicious use of the hwi interfaces.


I understand the issue, but am not sure what hwi interfaces to use here.
It seems that the problem will be if val_b is HOST_WIDE_INT_MIN.
Looking at the definition of abs_hwi in hwint.h, before it negates its argument
it asserts that it's not HOST_WIDE_INT_MIN. I think that's to avoid this exact
issue?
If so, I've added a check for HOST_WIDE_INT_MIN which should cover the
undefined case when negating a HOST_WIDE_INT, unless there's something else
I'm missing.
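
A minimal sketch of the kind of guard described here, assuming a 64-bit
stand-in for HOST_WIDE_INT; the alias and function name are hypothetical and
this is not the code from the patch:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-in for HOST_WIDE_INT; the real type is chosen by GCC's hwint.h.
using hwi = std::int64_t;

// True if a == -b, without ever evaluating -b when b is the minimum value,
// whose negation is not representable (signed overflow, undefined behaviour).
inline bool is_negation_of(hwi a, hwi b) {
  if (b == std::numeric_limits<hwi>::min())
    return false;  // no representable value equals -min
  return a == -b;
}
```

Returning false for the minimum value is safe because no representable value
is its negation, so the transformation is simply skipped in that case.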




So I think we just need to resolve the doc change and make sure we're
not triggering undefined behaviour above and this can go forward.


Thanks, here's the updated patch.
Kyrill

2015-09-02  Kyrylo Tkachov 

 * ifcvt.c (noce_try_inverse_constants): New function.
 (noce_process_if_block): Call it.
 * optabs.h (emit_conditional_neg_or_complement): Declare prototype.
 * optabs.def (negcc_optab, notcc_optab): Declare.
 * optabs.c (emit_conditional_neg_or_complement): New function.
 * doc/md.texi (Standard Names): Document negcc, notcc names.



jeff



commit e3546b6e9fa772fe15f0a9845dbe429c02f5b327
Author: Kyrylo Tkachov 
Date:   Thu Aug 13 18:14:52 2015 +0100

[optabs][ifcvt][1/3] Define negcc, notcc optabs

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 619259f..c4e43f3 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5791,6 +5791,21 @@ move operand 2 or (operands 2 + operand 3) into operand 0

Re: [Patch, libstdc++] Fix data races in basic_string implementation

2015-09-02 Thread Dmitry Vyukov

On Wed, Sep 2, 2015 at 12:58 PM, Marc Glisse  wrote:
> On Tue, 1 Sep 2015, Dmitry Vyukov wrote:
>
>> The refcounted basic_string implementation contains several data races
>> on _M_refcount:
>
>
> There are several bug reports about races in basic_string in bugzilla (some
> might even have been closed as wontfix because of the new implementation).
> Does this also fix some of them?

I've tried to search for "basic_string race" with all statuses:
https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=UNCONFIRMED_status=NEW_status=ASSIGNED_status=SUSPENDED_status=WAITING_status=REOPENED_status=RESOLVED_status=VERIFIED_status=CLOSED_known_to_fail_type=allwords_known_to_work_type=allwords=libstdc%2B%2B_id=125385_format=advanced_desc=basic_string%20race_desc_type=allwordssubstr

But it does not yield anything interesting. What bugs can I reference?

Re: [Patch, libstdc++] Fix data races in basic_string implementation

2015-09-02 Thread Jonathan Wakely


On 02/09/15 15:49 +0200, Dmitry Vyukov wrote:

On Wed, Sep 2, 2015 at 12:58 PM, Marc Glisse  wrote:

On Tue, 1 Sep 2015, Dmitry Vyukov wrote:


The refcounted basic_string implementation contains several data races
on _M_refcount:



There are several bug reports about races in basic_string in bugzilla (some
might even have been closed as wontfix because of the new implementation).
Does this also fix some of them?


I've tried to search for "basic_string race" with all statuses:
https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=UNCONFIRMED_status=NEW_status=ASSIGNED_status=SUSPENDED_status=WAITING_status=REOPENED_status=RESOLVED_status=VERIFIED_status=CLOSED_known_to_fail_type=allwords_known_to_work_type=allwords=libstdc%2B%2B_id=125385_format=advanced_desc=basic_string%20race_desc_type=allwordssubstr

But it does not yield anything interesting. What bugs can I reference?


There's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334 but I don't
think this fixes it. That's a race condition in the logic, not a
formal data race due to non-atomic operations like the problem being
fixed here.

Re: [AArch64_be] Fix vldX/vstX AdvSIMD intrinsics

2015-09-02 Thread James Greenhalgh

On Wed, Sep 02, 2015 at 02:18:03PM +0100, Christophe Lyon wrote:
> Hi,
> 
> The aarch64_vldX/aarch64_vstX expanders used for the vldX/vstX AdvSIMD
> intrinsics in Q mode called vec_load_lanes, which shuffles the vectors
> to match the layout expected by the vectorizer.
> 
> We do not want this to happen when the intrinsics are called directly
> by the end-user code.
> 
> This patch fixes this, by calling gen_aarch64_simd_ldX/gen_aarch64_simd_stX.
> 
> With this patch, the following tests now pass in advsimd-intrinsics
> (target aarch64_be):
> vldX_lane.c, vtrn, vuzp, vzip
> as well as aarch64/vldN_1.c and aarch64/vstN_1.c
> 
> It fixes PR 59810, 63652, 63653.

Great!

> 
> No regression, and tested on aarch64 and aarch64_be using the Foundation 
> Model.
> 
> OK for trunk?

OK.

Thanks,
James

> 2015-09-02  Christophe Lyon  
> 
>   PR target/59810
>   PR target/63652
>   PR target/63653
>   * config/aarch64/aarch64-simd.md
>   (aarch64_ld): Call
>   gen_aarch64_simd_ld.
>   (aarch64_st): Call
>   gen_aarch64_simd_st.

RE: [PATCH] MIPS: Prevent the p5600-bonding.c test from being run for the n32 and 64 ABIs

2015-09-02 Thread Andrew Bennett

> > diff --git a/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > index 0890ffa..20c26ca 100644
> > --- a/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > +++ b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > @@ -1,6 +1,7 @@
> >  /* { dg-do compile } */
> >  /* { dg-options "-dp -mtune=p5600  -mno-micromips -mno-mips16" } */
> >  /* { dg-skip-if "Bonding needs peephole optimization." { *-*-* } { "-O0" "-
> O1" } { "" } }
> > */
> > +/* { dg-skip-if "There is no DI mode support for load/store bonding" { *-*-
> * } { "-
> > mabi=n32" "-mabi=64" } { "" } } */
> > >  typedef int VINT32 __attribute__ ((vector_size((16))));
> 
> If the best fix we can do for this test is to limit what it tests then we
> should still not just skip it. There is some precedent for tests that
> require a specific arch with the isa=loongson special case. I'd rather
> just lock the test down to p5600 as per the filename.

I have changed the testcase's dg-options so that it is only built for p5600.
The updated patch and ChangeLog are below.  

Ok to commit?

Many thanks,



Andrew


testsuite/
* gcc.target/mips/p5600-bonding.c (dg-options): Force the test to be
always built for p5600.
* gcc.target/mips/mips.exp (mips-dg-options): Add support for the
isa=p5600 dg-option.


diff --git a/gcc/testsuite/gcc.target/mips/mips.exp 
b/gcc/testsuite/gcc.target/mips/mips.exp
index 42e7fff..e8d1895 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -142,6 +142,9 @@
 #   isa=loongson
 #  select a Loongson processor
 #
+#   isa=p5600
+#  select a P5600 processor
+#
 #   addressing=absolute
 #  force absolute addresses to be used
 #
@@ -1009,6 +1012,10 @@ proc mips-dg-options { args } {
if { ![regexp {^-march=loongson} $arch] } {
set arch "-march=loongson2f"
}
+   } elseif { [string equal $spec "isa=p5600"] } {
+   if { ![regexp {^-march=p5600} $arch] } {
+   set arch "-march=p5600"
+   }
} else {
if { ![regexp {^(isa(?:|_rev))(=|<=|>=)([0-9]*)$} \
   $spec dummy prop relation value nocpus] } {
diff --git a/gcc/testsuite/gcc.target/mips/p5600-bonding.c 
b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
index 0890ffa..0bc6d91 100644
--- a/gcc/testsuite/gcc.target/mips/p5600-bonding.c
+++ b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-dp -mtune=p5600  -mno-micromips -mno-mips16" } */
+/* { dg-options "-dp isa=p5600 -mtune=p5600 -mno-micromips -mno-mips16" } */
 /* { dg-skip-if "Bonding needs peephole optimization." { *-*-* } { "-O0" "-O1" 
} { "" } } */
 typedef int VINT32 __attribute__ ((vector_size((16))));

Re: [Patch, libstdc++] Fix data races in basic_string implementation

2015-09-02 Thread Dmitry Vyukov

Added comment to _M_dispose and restored ChangeLog entry.
Please take another look.


On Wed, Sep 2, 2015 at 3:17 PM, Jonathan Wakely  wrote:
> On 01/09/15 17:42 +0200, Dmitry Vyukov wrote:
>>
>> On Tue, Sep 1, 2015 at 5:08 PM, Jonathan Wakely 
>> wrote:
>>>
>>> On 01/09/15 16:56 +0200, Dmitry Vyukov wrote:


 I don't understand how a new gcc may not support __atomic builtins on
 ints. How it is even possible? That's a portable API provided by
 recent gcc's...
>>>
>>>
>>>
>>> The built-in function is always defined, but it might expand to a call
>>> to an external function in libatomic, and it would be a regression for
>>> code using std::string to start requiring libatomic (although maybe it
>>> would be necessary if it's the only way to make the code correct).
>>>
>>> I don't know if there are any targets that define __GTHREADS and also
>>> don't support __atomic_load(int*, ...) without libatomic. If such
>>> targets exist then adding a new configure check that only depends on
>>> __atomic_load(int*, ...) would mean we keep supporting those targets.
>>>
>>> Another option would be to simply do:
>>>
>>> bool
>>> _M_is_shared() const _GLIBCXX_NOEXCEPT
>>> #if defined(__GTHREADS)
>>> +{ return __atomic_load(&this->_M_refcount, __ATOMIC_ACQUIRE) >
>>> 0; }
>>> +#else
>>> { return this->_M_refcount > 0; }
>>> +#endif
>>>
>>> and see if anyone complains!
>>
>>
>> I like this option!
>> If a platform uses multithreading and has non-inlined atomic loads,
>> then the way to fix this is to provide inlined atomic loads rather
>> than to fix all call sites.
>>
>> Attaching new patch. Please take another look.
>
>
> This looks good. Torvald suggested that it would be useful to add a
> similar comment to the release operation in _M_dispose, so that both
> sides of the release-acquire are similarly documented. Could you add
> that and provide a suitable ChangeLog entry?
>
> Thanks!
>
>
>> Index: include/bits/basic_string.h
>> ===
>> --- include/bits/basic_string.h (revision 227363)
>> +++ include/bits/basic_string.h (working copy)
>> @@ -2601,11 +2601,32 @@
>>
>> bool
>> _M_is_leaked() const _GLIBCXX_NOEXCEPT
>> -{ return this->_M_refcount < 0; }
>> +{
>> +#if defined(__GTHREADS)
>> +  // _M_refcount is mutated concurrently by
>> _M_refcopy/_M_dispose,
>> +  // so we need to use an atomic load. However, _M_is_leaked
>> +  // predicate does not change concurrently (i.e. the string is
>> either
>> +  // leaked or not), so a relaxed load is enough.
>> +  return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) <
>> 0;
>> +#else
>> +  return this->_M_refcount < 0;
>> +#endif
>> +}
>>
>> bool
>> _M_is_shared() const _GLIBCXX_NOEXCEPT
>> -{ return this->_M_refcount > 0; }
>> +   {
>> +#if defined(__GTHREADS)
>> +  // _M_refcount is mutated concurrently by
>> _M_refcopy/_M_dispose,
>> +  // so we need to use an atomic load. Another thread can drop
>> last
>> +  // but one reference concurrently with this check, so we need
>> this
>> +  // load to be acquire to synchronize with release fetch_and_add
>> in
>> +  // _M_dispose.
>> +  return __atomic_load_n(&this->_M_refcount, __ATOMIC_ACQUIRE) >
>> 0;
>> +#else
>> +  return this->_M_refcount > 0;
>> +#endif
>> +}
>>
>> void
>> _M_set_leaked() _GLIBCXX_NOEXCEPT
>
>
Index: ChangeLog
===
--- ChangeLog   (revision 227400)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2015-09-02  Dmitry Vyukov  
+
+   * include/bits/basic_string.h: Fix data races on _M_refcount.
+
 2015-09-02  Sebastian Huber  
 
PR libstdc++/67408
Index: include/bits/basic_string.h
===
--- include/bits/basic_string.h (revision 227400)
+++ include/bits/basic_string.h (working copy)
@@ -2601,11 +2601,32 @@
 
 bool
_M_is_leaked() const _GLIBCXX_NOEXCEPT
-{ return this->_M_refcount < 0; }
+{
+#if defined(__GTHREADS)
+  // _M_refcount is mutated concurrently by _M_refcopy/_M_dispose,
+  // so we need to use an atomic load. However, _M_is_leaked
+  // predicate does not change concurrently (i.e. the string is either
+  // leaked or not), so a relaxed load is enough.
+  return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
+#else
+  return this->_M_refcount < 0;
+#endif
+}
 
 bool
_M_is_shared() const _GLIBCXX_NOEXCEPT
-{ return this->_M_refcount > 0; }
+   {
+#if defined(__GTHREADS)
+  // _M_refcount is mutated concurrently by _M_refcopy/_M_dispose,
+  // so

Re: [testsuite] Clean up effective_target cache

2015-09-02 Thread Christophe Lyon

On 1 September 2015 at 16:04, Christophe Lyon
 wrote:
> On 25 August 2015 at 17:31, Mike Stump  wrote:
>> On Aug 25, 2015, at 1:14 AM, Christophe Lyon  
>> wrote:
>>> Some subsets of the tests override ALWAYS_CXXFLAGS or
>>> TEST_ALWAYS_FLAGS and perform effective_target support tests using
>>> these modified flags.
>>
>>> This patch adds a new function 'clear_effective_target_cache', which
>>> is called at the end of every .exp file which overrides
>>> ALWAYS_CXXFLAGS or TEST_ALWAYS_FLAGS.
>>
>> So, a simple English directive somewhere that says, if one changes 
>> ALWAYS_CXXFLAGS or TEST_ALWAYS_FLAGS then they should do a 
>> clear_effective_target_cache at the end as the target cache can make 
>> decisions based upon the flags, and those decisions need to be redone when 
>> the flags change would be nice.
>>
>> I do wonder, do we need to reexamine when setting the flags?  I’m thinking 
>> of a sequence like: non-thumb default, is_thumb, set flags (thumb), 
>> is_thumb.  Anyway, safe to punt this until someone discovers it or is 
>> reasonably sure it happens.
>>
>> Anyway, all looks good.  Ok.
>>
> Here is what I have committed (r227372).

Hmmm, in fact this was r227401.

>
> I updated the comment before clear_effective_target_cache, and copied
> the directive you suggested above.
> I also added a test to check if $et_prop_list exists before clearing
> (there were error messages otherwise).
>
> Christophe.
>
>>> However, I noticed that lib/g++.exp changes ALWAYS_CXXFLAGS, but does
>>> not appear to restore it. In doubt, I didn't change it.
>>
>> Yeah, I examined it.  It seems like it might not matter, as anyone setting 
>> and unsetting would come in cleared, and if they didn’t, it should be 
>> roughly the same exact state, meaning, no clearing necessary.  I think it is 
>> safe to punt this until someone finds a bug or can see a way that it would 
>> matter.  I also don’t think it would hurt to clear, if someone wanted to 
>> refactor the code a bit and make the clearing and the cleanup a little more 
>> automatic.  I’m thinking of a RAII style code in which the dtor runs the 
>> clear.  Not sure if that is even possible in tcl.  [ checking ] Nope, maybe 
>> not.  Oh well.

Re: [Patch, libstdc++] Fix data races in basic_string implementation

2015-09-02 Thread Jonathan Wakely


On 02/09/15 16:01 +0200, Dmitry Vyukov wrote:

Added comment to _M_dispose and restored ChangeLog entry.
Please take another look.


Thanks, this is OK for trunk.

I assume you are covered by the Google company-wide copyright
assignment, so someone just needs to commit it, which I can do if you
like.
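
For readers following the thread, the release/acquire pairing being documented
can be sketched on a toy reference count. This uses std::atomic rather than
the __atomic builtins and _Atomic_word of the real basic_string, the type and
member names are invented, and the memory orders on the increment and
decrement are a conservative assumption rather than a statement of what
libstdc++ does.

```cpp
#include <atomic>
#include <cassert>

// Toy refcounted representation: 0 means one owner, >0 shared, <0 leaked.
struct toy_rep {
  std::atomic<int> refcount{0};

  // Leaked-ness does not change concurrently, so a relaxed load suffices.
  bool is_leaked() const {
    return refcount.load(std::memory_order_relaxed) < 0;
  }

  // Acquire load pairs with the release half of the decrement below, so a
  // thread that sees "not shared" also sees the other owner's earlier writes.
  bool is_shared() const {
    return refcount.load(std::memory_order_acquire) > 0;
  }

  void add_ref() { refcount.fetch_add(1, std::memory_order_acq_rel); }

  // Returns true if the caller dropped the last reference and must free.
  bool release_ref() {
    return refcount.fetch_sub(1, std::memory_order_acq_rel) <= 0;
  }
};
```

The point of the pairing is that a non-atomic read of the counter is a formal
data race even when the logic around it is otherwise correct.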

[patch] libstdc++/67408 Handle distinct __gthread_recursive_mutex_t type.

2015-09-02 Thread Jonathan Wakely


This makes recursive_timed_mutex work for non-pthreads targets where
__gthread_mutex_t and __gthread_recursive_mutex_t are not the same
type. Thanks to Sebastian for the bug report and patch.

Tested powerpc64le-linux, committed to trunk.
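
The shape of the fix can be sketched with a toy CRTP hierarchy. All names
below are hypothetical stand-ins (the real code uses the gthreads types and
keeps _M_timedlock private behind a friend declaration); the sketch only shows
why delegating to the derived class lets each mutex use its own handle type.

```cpp
#include <cassert>

// Two distinct handle types, standing in for __gthread_mutex_t and
// __gthread_recursive_mutex_t on a target where they differ.
struct plain_handle { int locked = 0; };
struct recursive_handle { int depth = 0; };

inline bool plain_timedlock(plain_handle* h) {
  if (h->locked) return false;  // toy behaviour: fail instead of waiting
  h->locked = 1;
  return true;
}

inline bool recursive_timedlock(recursive_handle* h) {
  ++h->depth;  // re-locking by the owner always succeeds in this toy
  return true;
}

// CRTP base: it no longer touches the native handle, it asks the derived
// class to perform the timed lock with the matching handle type.
template <typename Derived>
struct timed_impl {
  bool try_lock_until() { return static_cast<Derived*>(this)->timedlock(); }
};

struct toy_timed_mutex : timed_impl<toy_timed_mutex> {
  bool timedlock() { return plain_timedlock(&handle); }
  plain_handle handle;
};

struct toy_recursive_timed_mutex : timed_impl<toy_recursive_timed_mutex> {
  bool timedlock() { return recursive_timedlock(&handle); }
  recursive_handle handle;
};
```

Before the change, the base class passed the result of native_handle() to the
non-recursive lock function, which only type-checks when both handle types are
the same.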


commit 852d09122561d301b2980b6f9c97e88c5499006c
Author: Jonathan Wakely 
Date:   Wed Sep 2 11:39:46 2015 +0100

2015-09-02  Sebastian Huber  

	PR libstdc++/67408
	* include/std/mutex (__timed_mutex_impl::_M_try_lock_until): Use
	_Derived::_M_timedlock().
	(timed_mutex): Add _M_timedlock() and make base class a friend.
	(recursive_timed_mutex): Likewise.

diff --git a/libstdc++-v3/include/std/mutex b/libstdc++-v3/include/std/mutex
index deb85df..790508c 100644
--- a/libstdc++-v3/include/std/mutex
+++ b/libstdc++-v3/include/std/mutex
@@ -230,8 +230,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	static_cast(__ns.count())
 	  };
 
-	  auto __mutex = static_cast<_Derived*>(this)->native_handle();
-	  return !__gthread_mutex_timedlock(__mutex, &__ts);
+	  return static_cast<_Derived*>(this)->_M_timedlock(__ts);
 	}
 
   template
@@ -293,6 +292,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 native_handle_type
 native_handle()
 { return &_M_mutex; }
+
+private:
+  friend class __timed_mutex_impl<timed_mutex>;
+
+  bool
+  _M_timedlock(const __gthread_time_t& __ts)
+  { return !__gthread_mutex_timedlock(&_M_mutex, &__ts); }
   };
 
   /// recursive_timed_mutex
@@ -346,6 +352,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 native_handle_type
 native_handle()
 { return &_M_mutex; }
+
+private:
+  friend class __timed_mutex_impl<recursive_timed_mutex>;
+
+  bool
+  _M_timedlock(const __gthread_time_t& __ts)
+  { return !__gthread_recursive_mutex_timedlock(&_M_mutex, &__ts); }
   };
 #endif
 #endif // _GLIBCXX_HAS_GTHREADS

Re: [Patch, libstdc++] Fix data races in basic_string implementation

2015-09-02 Thread Marc Glisse


On Tue, 1 Sep 2015, Dmitry Vyukov wrote:


The refcounted basic_string implementation contains several data races
on _M_refcount:


There are several bug reports about races in basic_string in bugzilla 
(some might even have been closed as wontfix because of the new 
implementation). Does this also fix some of them?


(ChangeLog entry appears to be missing)

--
Marc Glisse

Re: Fix 61441

2015-09-02 Thread Sujoy Saraswati

> CCP and other passes ultimately end up using fold-const.c:const_{unop,binop}
> for constant folding so that is where the fix should go to (or to real.c).
> That will automatically handle other passes doing similar transforms.

Thanks for the tip. I will modify my fix and post it.
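
For reference, the sNaN-to-qNaN conversion this fix is about amounts to the
following at the bit level for IEEE 754 binary32. This is a sketch on raw bit
patterns with made-up helper names, and it assumes the common encoding in
which a set top fraction bit means "quiet" (some legacy targets use the
opposite convention); GCC itself does this on its REAL_VALUE_TYPE
representation, not on raw bits.

```cpp
#include <cassert>
#include <cstdint>

// IEEE 754 binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
constexpr std::uint32_t kExpMask  = 0x7f800000u;
constexpr std::uint32_t kFracMask = 0x007fffffu;
constexpr std::uint32_t kQuietBit = 0x00400000u;  // top fraction bit

// NaN: all exponent bits set and a non-zero fraction.
inline bool is_nan_bits(std::uint32_t b) {
  return (b & kExpMask) == kExpMask && (b & kFracMask) != 0;
}

// Signalling NaN: a NaN whose quiet bit is clear.
inline bool is_signaling_bits(std::uint32_t b) {
  return is_nan_bits(b) && (b & kQuietBit) == 0;
}

// Quiet a signalling NaN; every other bit pattern is returned unchanged.
inline std::uint32_t quiet_bits(std::uint32_t b) {
  return is_signaling_bits(b) ? (b | kQuietBit) : b;
}
```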

Regards,
Sujoy

>
> Richard.
>
>> Regards,
>> Sujoy
>>
>>> Thanks,
>>> Richard.
>>>
 Regards,
 Sujoy

 2015-09-01  Sujoy Saraswati 

 PR tree-optimization/61441
 * tree-ssa-ccp.c (convert_snan_to_qnan): Convert sNaN to qNaN when
 flag_signaling_nans is off.
 (ccp_fold_stmt, visit_assignment, visit_cond_stmt): call
 convert_snan_to_qnan to convert sNaN to qNaN on constant folding.

 PR tree-optimization/61441
 * gcc.dg/pr61441.c: New testcase.

 Index: gcc/tree-ssa-ccp.c
 ===
 --- gcc/tree-ssa-ccp.c  (revision 226965)
 +++ gcc/tree-ssa-ccp.c  (working copy)
 @@ -560,6 +560,24 @@ value_to_wide_int (ccp_prop_value_t val)
return 0;
  }

 +/* Convert sNaN to qNaN when flag_signaling_nans is off */
 +
 +static void
 +convert_snan_to_qnan (tree expr)
 +{
 +  if (expr
 +  && (TREE_CODE (expr) == REAL_CST)
 +  && !flag_signaling_nans)
 +  {
 +REAL_VALUE_TYPE *d = TREE_REAL_CST_PTR (expr);
 +
 +if (HONOR_NANS (TYPE_MODE (TREE_TYPE (expr)))
 +&& REAL_VALUE_ISNAN (*d)
 +&& d->signalling)
 +  d->signalling = 0;
 +  }
 +}
 +
  /* Return the value for the address expression EXPR based on alignment
 information.  */

 @@ -2156,6 +2174,7 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
 if (val.lattice_val != CONSTANT
 || val.mask != 0)
   return false;
 +convert_snan_to_qnan (val.value);

 if (dump_file)
   {
 @@ -2197,7 +2216,10 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
 bool res;
 if (!useless_type_conversion_p (TREE_TYPE (lhs),
 TREE_TYPE (new_rhs)))
 +{
   new_rhs = fold_convert (TREE_TYPE (lhs), new_rhs);
 +  convert_snan_to_qnan (new_rhs);
 +}
 res = update_call_from_tree (gsi, new_rhs);
 gcc_assert (res);
 return true;
 @@ -2216,6 +2238,7 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
  tree new_rhs = fold_builtin_alloca_with_align (stmt);
  if (new_rhs)
   {
 +convert_snan_to_qnan (new_rhs);
 bool res = update_call_from_tree (gsi, new_rhs);
 tree var = TREE_OPERAND (TREE_OPERAND (new_rhs, 0),0);
 gcc_assert (res);
 @@ -2260,7 +2283,10 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
   {
 tree rhs = unshare_expr (val);
 if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE 
 (rhs)))
 +{
   rhs = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), rhs);
 +  convert_snan_to_qnan (rhs);
 +}
 gimple_assign_set_rhs_from_tree (gsi, rhs);
 return true;
   }
 @@ -2292,6 +2318,7 @@ visit_assignment (gimple stmt, tree *output_p)
/* Evaluate the statement, which could be
  either a GIMPLE_ASSIGN or a GIMPLE_CALL.  */
val = evaluate_stmt (stmt);
 +  convert_snan_to_qnan (val.value);

/* If STMT is an assignment to an SSA_NAME, we only have one
  value to set.  */
 @@ -2324,6 +2351,7 @@ visit_cond_stmt (gimple stmt, edge *taken_edge_p)
if (val.lattice_val != CONSTANT
|| val.mask != 0)
  return SSA_PROP_VARYING;
 +  convert_snan_to_qnan (val.value);

/* Find which edge out of the conditional block will be taken and add it
   to the worklist.  If no single edge can be determined statically,

 Index: gcc/testsuite/gcc.dg/pr61441.c
 ===
 --- gcc/testsuite/gcc.dg/pr61441.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/pr61441.c  (working copy)
 @@ -0,0 +1,17 @@
 +/* { dg-do run } */
 +/* { dg-options "-O1 -lm" } */
 +
 +#define _GNU_SOURCE
 +#include 
 +#include 
 +
 +int main (void)
 +{
 +  float sNaN = __builtin_nansf ("");
 +  double x = (double) sNaN;
 +  if (issignaling(x))
 +  {
 +__builtin_abort();
 +  }
 +  return 0;
 +}

Re: [PATCH GCC]Look into unnecessary conversion when checking mult_op in get_shiftadd_cost

2015-09-02 Thread Richard Biener

On Wed, Sep 2, 2015 at 5:50 AM, Bin Cheng  wrote:
> Hi,
> When calling get_shiftadd_cost, the mult_op is stripped at caller places.
> We should look into unnecessary conversion in op1 before checking equality,
> otherwise it computes wrong shiftadd cost.  This patch picks this small
> issue up.
>
> Bootstrap and test on x86_64 and aarch64 along with other patches.  Is it
> OK?

Just do STRIP_NOPS (op1) unconditionally?  Thus

  STRIP_NOPS (op1);
  mult_in_op1 = operand_equal_p (op1, mult, 0);

ok with that change.
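
To illustrate why the strip is needed before the comparison, here is a toy
model of the check. The node type and helpers are invented for the example
(GCC's tree, STRIP_NOPS and operand_equal_p are far richer); pointer identity
stands in for operand_equal_p.

```cpp
#include <cassert>

// Toy expression node standing in for GCC's tree.
enum toy_code { TOY_VAR, TOY_MULT, TOY_NOP_CONVERT };

struct toy_expr {
  toy_code code;
  int id;                   // identifies a variable or operation
  const toy_expr* operand;  // wrapped expression for TOY_NOP_CONVERT
};

// Analogue of STRIP_NOPS: peel conversions that do not change the value.
inline const toy_expr* strip_nops(const toy_expr* e) {
  while (e->code == TOY_NOP_CONVERT) e = e->operand;
  return e;
}

// Stand-in for operand_equal_p, using pointer identity.
inline bool same_operand(const toy_expr* a, const toy_expr* b) {
  return a == b;
}

// The caller has already stripped MULT, so OP1 must be stripped as well
// before the two are compared.
inline bool mult_in_op1(const toy_expr* op1, const toy_expr* mult) {
  return same_operand(strip_nops(op1), mult);
}
```

Without the strip, an OP1 that is merely a no-op conversion of the multiply
compares unequal and the wrong operand is costed.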

Thanks,
Richard.

> Thanks,
> bin
>
> 2015-08-31  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (get_shiftadd_cost): Look into
> unnecessary type conversion for OP1.

Re: [PATCH] Fix ICE when generating a vector shift by scalar

2015-09-02 Thread Richard Biener

On Tue, Sep 1, 2015 at 5:53 PM, Bill Schmidt
 wrote:
> On Tue, 2015-09-01 at 11:01 +0200, Richard Biener wrote:
>> On Mon, Aug 31, 2015 at 10:28 PM, Bill Schmidt
>>  wrote:
>> > Hi,
>> >
>> > The following simple test fails when attempting to convert a vector
>> > shift-by-scalar into a vector shift-by-vector.
>> >
>> >   typedef unsigned char v16ui __attribute__((vector_size(16)));
>> >
>> >   v16ui vslb(v16ui v, unsigned char i)
>> >   {
>> > return v << i;
>> >   }
>> >
>> > When this code is gimplified, the shift amount gets expanded to an
>> > unsigned int:
>> >
>> >   vslb (v16ui v, unsigned char i)
>> >   {
>> > v16ui D.2300;
>> > unsigned int D.2301;
>> >
>> > D.2301 = (unsigned int) i;
>> > D.2300 = v << D.2301;
>> > return D.2300;
>> >   }
>> >
>> > In expand_binop, the shift-by-scalar is converted into a shift-by-vector
>> > using expand_vector_broadcast, which produces the following rtx to be
>> > used to initialize a V16QI vector:
>> >
>> > (parallel:V16QI [
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > (subreg/s/v:SI (reg:DI 155) 0)
>> > ])
>> >
>> > The back end eventually chokes trying to generate a copy of the SImode
>> > expression into a QImode memory slot.
>> >
>> > This patch fixes this problem by ensuring that the shift amount is
>> > truncated to the inner mode of the vector when necessary.  I've added a
>> > test case verifying correct PowerPC code generation in this case.
>> >
>> > Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
>> > regressions.  Is this ok for trunk?
>> >
>> > Thanks,
>> > Bill
>> >
>> >
>> > [gcc]
>> >
>> > 2015-08-31  Bill Schmidt  
>> >
>> > * optabs.c (expand_binop): Don't create a broadcast vector with a
>> > source element wider than the inner mode.
>> >
>> > [gcc/testsuite]
>> >
>> > 2015-08-31  Bill Schmidt  
>> >
>> > * gcc.target/powerpc/vec-shift.c: New test.
>> >
>> >
>> > Index: gcc/optabs.c
>> > ===
>> > --- gcc/optabs.c(revision 227353)
>> > +++ gcc/optabs.c(working copy)
>> > @@ -1608,6 +1608,13 @@ expand_binop (machine_mode mode, optab binoptab, r
>> >
>> >if (otheroptab && optab_handler (otheroptab, mode) != 
>> > CODE_FOR_nothing)
>> > {
>> > + /* The scalar may have been extended to be too wide.  Truncate
>> > +it back to the proper size to fit in the broadcast vector.  */
>> > + machine_mode inner_mode = GET_MODE_INNER (mode);
>> > + if (GET_MODE_BITSIZE (inner_mode)
>> > + < GET_MODE_BITSIZE (GET_MODE (op1)))
>>
>> Does that work for modeless constants?  Btw, what do other targets do
>> here?  Do they
>> also choke or do they cope with the wide operand?
>
> Good question.  This works by serendipity more than by design.  Because
> a constant has a mode of VOIDmode, its bitsize is 0 and the TRUNCATE
> won't be generated.  It would be better for me to put in an explicit
> check for CONST_INT rather than relying on this, though.  I'll fix that.
>
> I am not sure what other targets do here; I can check.  However, do you
> think that's relevant?  I'm concerned that
>
> (parallel:V16QI [
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> (subreg/s/v:SI (reg:DI 155) 0)
> ])
>
> is a nonsensical expression and shouldn't be produced by common code, in
> my view.  It seems best to make this explicitly correct.  Please let me
> know if that's off-base.

No, the above does indeed look fishy, though other back ends' vec_init_optab
expanders might just handle it fine.
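
In scalar terms, the truncate-then-broadcast step under discussion looks
roughly like the following. This is a plain C++ analogy with an invented
function name, not RTL and not the patch itself.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Broadcast a shift count into a 16 x 8-bit vector. The count arrives
// widened to 32 bits (as after gimplification) and is truncated to the
// element width first, so every lane holds a value of the element's type.
inline std::array<std::uint8_t, 16> broadcast_shift_count(std::uint32_t count) {
  std::uint8_t narrow = static_cast<std::uint8_t>(count);  // the TRUNCATE step
  std::array<std::uint8_t, 16> lanes;
  lanes.fill(narrow);
  return lanes;
}
```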

OTOH if a conversion is required it would be nice to CSE it, thus
force the result to a

Re: [PATCH][AArch64][1/3] Expand signed mod by power of 2 using CSNEG

2015-09-02 Thread Kyrill Tkachov



On 01/09/15 11:40, Kyrill Tkachov wrote:

Hi James,

On 01/09/15 10:25, James Greenhalgh wrote:

On Thu, Aug 13, 2015 at 01:36:50PM +0100, Kyrill Tkachov wrote:

Some comments below.

Thanks, I'll incorporate them, with one clarification inline.


And here's the updated patch.

Thanks,
Kyrill

2015-09-02  Kyrylo Tkachov  

 * config/aarch64/aarch64.md (mod3): New define_expand.
 (*neg2_compare0): Rename to...
 (neg2_compare0): ... This.
 * config/aarch64/aarch64.c (aarch64_rtx_costs, MOD case):
 Move check for speed inside the if-then-elses.  Reflect
 CSNEG sequence in MOD by power of 2 case.
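
The NEGS / AND / AND / CSNEG sequence referred to here can be modelled in
portable C++ as below. This is only a sketch of the arithmetic for the 32-bit
case, with an invented function name; negation is done on unsigned values so
the model itself has no signed overflow, and n is assumed to be a power of two
greater than 1.

```cpp
#include <cassert>
#include <cstdint>

// Model of:  negs x1, x0 ; and x0, x0, #(n-1) ; and x1, x1, #(n-1) ;
//            csneg x0, x0, x1, mi
inline std::int32_t mod_pow2(std::int32_t x, std::int32_t n) {
  const std::uint32_t mask = static_cast<std::uint32_t>(n) - 1u;
  const std::uint32_t ux = static_cast<std::uint32_t>(x);
  const std::uint32_t neg = 0u - ux;          // NEGS: compute -x and set flags
  const bool mi = (neg >> 31) != 0;           // N flag: -x is negative
  const std::uint32_t lo = ux & mask;         // AND on x
  const std::uint32_t hi = neg & mask;        // AND on -x
  // CSNEG: keep lo if MI holds, otherwise take the negation of hi.
  return mi ? static_cast<std::int32_t>(lo) : -static_cast<std::int32_t>(hi);
}
```

For positive x the result is x & (n - 1); for x <= 0 it is -((-x) & (n - 1)),
which matches C's truncating remainder.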




diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1394ed7..c8bd8d2 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6652,6 +6652,25 @@ cost_plus:
 return true;
   
   case MOD:

+/* We can expand signed mod by power of 2 using a
+   NEGS, two parallel ANDs and a CSNEG.  Assume here
+   that CSNEG is COSTS_N_INSNS (1).  This case should

Why do we want to hardcode this assumption rather than parameterise? Even
if you model this as the cost of an unconditional NEG I think that is
better than hardcoding zero cost.



+   only ever be reached through the set_smod_pow2_cheap check
+   in expmed.c.  */
+  if (CONST_INT_P (XEXP (x, 1))
+ && exact_log2 (INTVAL (XEXP (x, 1))) > 0
+ && (mode == SImode || mode == DImode))
+   {
+ *cost += COSTS_N_INSNS (3);

Can you add a comment to make it clear why this is not off-by-one? By
quick inspection it looks like you have made a typo trying to set the
cost to be 3 instructions rather than 4 - a reader needs the extra
knowledge that we already have a COSTS_N_INSNS(1) as a baseline.

This would be clearer as:

/* We will expand to four instructions, reset the baseline.  */
*cost = COSTS_N_INSNS (4);


+
+ if (speed)
+   *cost += 2 * extra_cost->alu.logical
++ extra_cost->alu.arith;
+
+ return true;
+   }
+
+/* Fall-through.  */
   case UMOD:
 if (speed)
{
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index b7b04c4..a515573 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -302,6 +302,62 @@ (define_expand "cmp<mode>"
 }
   )
   
+;; AArch64-specific expansion of signed mod by power of 2 using CSNEG.

Seems a strange comment given that we are in aarch64.md :-).


+;; For x0 % n where n is a power of 2 produce:
+;; negs   x1, x0
+;; and    x0, x0, #(n - 1)
+;; and    x1, x1, #(n - 1)
+;; csneg  x0, x0, x1, mi
+
+(define_expand "mod<mode>3"
+  [(match_operand:GPI 0 "register_operand" "")
+   (match_operand:GPI 1 "register_operand" "")
+   (match_operand:GPI 2 "const_int_operand" "")]
+  ""
+  {
+HOST_WIDE_INT val = INTVAL (operands[2]);
+
+if (val <= 0
+   || exact_log2 (INTVAL (operands[2])) <= 0
+   || !aarch64_bitmask_imm (INTVAL (operands[2]) - 1, <MODE>mode))
+  FAIL;
+
+rtx mask = GEN_INT (val - 1);
+
+/* In the special case of x0 % 2 we can do the even shorter:
+   cmp x0, xzr
+   and x0, x0, 1
+   csneg   x0, x0, x0, ge.  */
+if (val == 2)
+  {
+   rtx masked = gen_reg_rtx (<MODE>mode);
+   rtx ccreg = aarch64_gen_compare_reg (LT, operands[1], const0_rtx);

Non-obvious why this is correct given the comment above saying we want GE.

We want to negate if the comparison earlier yielded "less than zero".
Unfortunately, the CSNEG form of that is written as

csneg   x0, x0, x0, ge

which looks counter-intuitive at first glance.
With my other patch posted at 
https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00020.html
the generated assembly would be:
cneg x0, x0, lt
which more closely matches the RTL generation in this hunk.
If you think that patch should go in, I'll rewrite the CSNEG form in this 
comment to CNEG.

Kyrill




+   emit_insn (gen_and<mode>3 (masked, operands[1], mask));
+   rtx x = gen_rtx_LT (VOIDmode, ccreg, const0_rtx);
+   emit_insn (gen_csneg3<mode>_insn (operands[0], x, masked, masked));
+   DONE;
+  }
+
+rtx neg_op = gen_reg_rtx (<MODE>mode);
+rtx_insn *insn = emit_insn (gen_neg<mode>2_compare0 (neg_op, operands[1]));
+
+/* Extract the condition register and mode.  */
+rtx cmp = XVECEXP (PATTERN (insn), 0, 0);
+rtx cc_reg = SET_DEST (cmp);
+rtx cond = gen_rtx_GE (VOIDmode, cc_reg, const0_rtx);
+
+rtx masked_pos = gen_reg_rtx (<MODE>mode);
+emit_insn (gen_and<mode>3 (masked_pos, operands[1], mask));
+
+rtx masked_neg = gen_reg_rtx (<MODE>mode);
+emit_insn (gen_and<mode>3 (masked_neg, neg_op, mask));
+
+emit_insn (gen_csneg3<mode>_insn (operands[0], cond,
+  masked_neg, masked_pos));
+DONE;
+  }
+)
+

Thanks,
James
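
For readers following the review, the four-instruction sequence in the patch comment (NEGS, two ANDs, CSNEG) can be modelled in plain C. This is only a sketch for illustration and is not part of the patch: the function name smod_pow2 is made up here, and the N flag set by NEGS is modelled by testing the sign bit of the negated value.

```c
#include <assert.h>
#include <stdint.h>

/* Signed remainder x % n for n a power of two (n >= 2), computed without a
   division.  Each step is annotated with the instruction it stands for.
   The name smod_pow2 is hypothetical and used only for this sketch.  */
int64_t
smod_pow2 (int64_t x, int64_t n)
{
  assert (n >= 2 && (n & (n - 1)) == 0);

  uint64_t mask = (uint64_t) n - 1;
  uint64_t ux = (uint64_t) x;
  uint64_t neg = 0 - ux;             /* negs  x1, x0 (unsigned, so it wraps) */
  int mi = (neg >> 63) != 0;         /* N flag from negs: -x is negative */
  uint64_t masked_pos = ux & mask;   /* and   x0, x0, #(n - 1) */
  uint64_t masked_neg = neg & mask;  /* and   x1, x1, #(n - 1) */

  /* csneg x0, x0, x1, mi: keep x0 if MI holds, otherwise take -x1.  */
  return mi ? (int64_t) masked_pos : -(int64_t) masked_neg;
}
```

For a negative input the result is the negated low bits of -x, which matches the truncating semantics of the C % operator; for example, smod_pow2 (-5, 4) gives -1.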



commit 534676af75436d7e11865403bdd231fadc7c19aa
Author: Kyrylo Tkachov 
Date:   Wed Jul 15 17:01:13 2015 +0100

Re: [PATCH, PR67405, committed] Avoid NULL pointer dereference

2015-09-02 Thread Richard Biener

On Wed, Sep 2, 2015 at 2:51 PM, Ilya Enkovich  wrote:
> 2015-09-02 15:35 GMT+03:00 Richard Biener :
>> On Tue, Sep 1, 2015 at 5:03 PM, Ilya Enkovich  wrote:
>>> Hi,
>>>
>>> This fixes an ICE by adding a NULL check.  Bootstrapped and regtested for 
>>> x86_64-unknown-linux-gnu.  Applied to trunk.  Does this need to be ported 
>>> to gcc-5-branch?
>>>
>>> Thanks,
>>> Ilya
>>> --
>>> gcc/
>>>
>>> 2015-09-01  Ilya Enkovich  
>>>
>>> PR target/67405
>>> * tree-chkp.c (chkp_find_bound_slots_1): Add NULL check.
>>>
>>> gcc/testsuite/
>>>
>>> 2015-09-01  Ilya Enkovich  
>>>
>>> PR target/67405
>>> * g++.dg/pr67405.C: New test.
>>>
>>>
>>> diff --git a/gcc/testsuite/g++.dg/pr67405.C b/gcc/testsuite/g++.dg/pr67405.C
>>> new file mode 100644
>>> index 0000000..5055921
>>> --- /dev/null
>>> +++ b/gcc/testsuite/g++.dg/pr67405.C
>>> @@ -0,0 +1,11 @@
>>> +// { dg-do compile }
>>> +
>>> +struct S
>>> +{
>>> +  S f; // { dg-error "incomplete type" }
>>> +};
>>> +
>>> +void
>>> +fn1 (S p1)
>>> +{
>>> +}
>>> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
>>> index 8c1b48c..2489abb 100644
>>> --- a/gcc/tree-chkp.c
>>> +++ b/gcc/tree-chkp.c
>>> @@ -1667,8 +1667,9 @@ chkp_find_bound_slots_1 (const_tree type, bitmap 
>>> have_bound,
>>>for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
>>> if (TREE_CODE (field) == FIELD_DECL)
>>>   {
>>> -   HOST_WIDE_INT field_offs
>>> - = TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
>>> +   HOST_WIDE_INT field_offs = 0;
>>> +   if (DECL_FIELD_BIT_OFFSET (field))
>>
>> DECL_FIELD_BIT_OFFSET should never be NULL.  Whoever created that
>> FIELD_DECL created an invalid one.
>
> I'll check where this decl comes from. Is there a proper checker to
> add a NULL test for DECL_FIELD_BIT_OFFSET, BTW?

The type verifier Honza added recently I guess.

Richard.

> Thanks,
> Ilya
>
>>
>> Richard.
>>
>>> + field_offs += TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET 
>>> (field));
>>> if (DECL_FIELD_OFFSET (field))
>>>   field_offs += TREE_INT_CST_LOW (DECL_FIELD_OFFSET (field)) * 
>>> 8;
>>> chkp_find_bound_slots_1 (TREE_TYPE (field), have_bound,

Re: [PATCH GCC]Try to fold (long)(A-B) into (long)A - (long)B for canonicalization in tree affine

2015-09-02 Thread Richard Biener

On Wed, Sep 2, 2015 at 5:48 AM, Bin Cheng  wrote:
> Hi,
> Generally we don't try to fold (long)(A-B) into (long)A - (long)B because it
> results in more operations.  On the other hand, this fold is wanted when we
> want to explore as many canonical opportunities as possible.  Tree affine is
> definitely such a place.  This patch supports this in
> tree_to_aff_combination, so it can produce canonical affines rather than
> stupid expressions like " + (sizetype) (t_4(D) + t_4(D)) * 4 -
> (sizetype)t_4(D) * 8".
>
> Bootstrap and test on x86_64 and aarch64 along with other patches.  Is it
> OK?

Hmm, but we also do code-gen from affine combinations and we can't reverse
that operation.  So eventually we end up generating an expensive wide plus/minus
instead of a narrow one and an extension.

But I see we already do this transform, just only for constant 2nd operand and
in aff_combination_expand.  Any reason you chose tree_to_aff_combination
only?  I suppose if we need it in both places it would be good to factor it out
to a helper function.

Richard.

>
> 2015-08-31  Bin Cheng  
>
> * tree-affine.c (tree_to_aff_combination): Try to fold (long)(A-B)
> by adding CASE_CONVERT support.

[PATCH][AArch64][3/5] Improve immediate generation

2015-09-02 Thread Wilco Dijkstra

Remove aarch64_bitmasks, aarch64_build_bitmask_table and aarch64_bitmasks_cmp
as they are no longer used by the immediate generation code.

No change in generated code, passes GCC regression tests/bootstrap.

ChangeLog:
2015-09-02  Wilco Dijkstra  

* gcc/config/aarch64/aarch64.c (aarch64_bitmasks): Remove.
(AARCH64_NUM_BITMASKS): Remove.  (aarch64_bitmasks_cmp): Remove.
(aarch64_build_bitmask_table): Remove.

---
 gcc/config/aarch64/aarch64.c | 69 
 1 file changed, 69 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 070c68b..0bc6b19 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -563,12 +563,6 @@ static const struct aarch64_option_extension 
all_extensions[] =
increment address.  */
 static machine_mode aarch64_memory_reference_mode;
 
-/* A table of valid AArch64 "bitmask immediate" values for
-   logical instructions.  */
-
-#define AARCH64_NUM_BITMASKS  5334
-static unsigned HOST_WIDE_INT aarch64_bitmasks[AARCH64_NUM_BITMASKS];
-
 typedef enum aarch64_cond_code
 {
   AARCH64_EQ = 0, AARCH64_NE, AARCH64_CS, AARCH64_CC, AARCH64_MI, AARCH64_PL,
@@ -3172,67 +3166,6 @@ aarch64_tls_referenced_p (rtx x)
 }
 
 
-static int
-aarch64_bitmasks_cmp (const void *i1, const void *i2)
-{
-  const unsigned HOST_WIDE_INT *imm1 = (const unsigned HOST_WIDE_INT *) i1;
-  const unsigned HOST_WIDE_INT *imm2 = (const unsigned HOST_WIDE_INT *) i2;
-
-  if (*imm1 < *imm2)
-return -1;
-  if (*imm1 > *imm2)
-return +1;
-  return 0;
-}
-
-
-static void
-aarch64_build_bitmask_table (void)
-{
-  unsigned HOST_WIDE_INT mask, imm;
-  unsigned int log_e, e, s, r;
-  unsigned int nimms = 0;
-
-  for (log_e = 1; log_e <= 6; log_e++)
-{
-  e = 1 << log_e;
-  if (e == 64)
-   mask = ~(HOST_WIDE_INT) 0;
-  else
-   mask = ((HOST_WIDE_INT) 1 << e) - 1;
-  for (s = 1; s < e; s++)
-   {
- for (r = 0; r < e; r++)
-   {
- /* set s consecutive bits to 1 (s < 64) */
- imm = ((unsigned HOST_WIDE_INT)1 << s) - 1;
- /* rotate right by r */
- if (r != 0)
-   imm = ((imm >> r) | (imm << (e - r))) & mask;
- /* replicate the constant depending on SIMD size */
- switch (log_e) {
- case 1: imm |= (imm <<  2);
- case 2: imm |= (imm <<  4);
- case 3: imm |= (imm <<  8);
- case 4: imm |= (imm << 16);
- case 5: imm |= (imm << 32);
- case 6:
-   break;
- default:
-   gcc_unreachable ();
- }
- gcc_assert (nimms < AARCH64_NUM_BITMASKS);
- aarch64_bitmasks[nimms++] = imm;
-   }
-   }
-}
-
-  gcc_assert (nimms == AARCH64_NUM_BITMASKS);
-  qsort (aarch64_bitmasks, nimms, sizeof (aarch64_bitmasks[0]),
-aarch64_bitmasks_cmp);
-}
-
-
 /* Return true if val can be encoded as a 12-bit unsigned immediate with
a left shift of 0 or 12 bits.  */
 bool
@@ -7828,8 +7761,6 @@ aarch64_override_options (void)
|| (aarch64_arch_string && valid_arch))
 gcc_assert (explicit_arch != aarch64_no_arch);
 
-  aarch64_build_bitmask_table ();
-
   aarch64_override_options_internal (&global_options);
 
   /* Save these options as the default ones in case we push and pop them later
-- 
1.8.3

[PATCH][AArch64][4/5] Improve immediate generation

2015-09-02 Thread Wilco Dijkstra

The code that emits a movw with an add/sub is hardly ever used, and all cases
in actual code are already covered by mov+movk, so it is redundant (an example
of such an immediate is 0x00000abcul).

Passes GCC regression tests/bootstrap. Minor changes in generated code due to
movk being used instead of add/sub (codesize remains the same).

ChangeLog:
2015-09-02  Wilco Dijkstra  

* gcc/config/aarch64/aarch64.c (aarch64_internal_mov_immediate):
Remove redundant immediate generation code.

---
 gcc/config/aarch64/aarch64.c | 60 
 1 file changed, 60 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 0bc6b19..bd6e522 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1371,8 +1371,6 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
   int i;
   bool first;
   unsigned HOST_WIDE_INT val, val2;
-  bool subtargets;
-  rtx subtarget;
   int one_match, zero_match, first_not_ffff_match;
   int num_insns = 0;
 
@@ -1402,7 +1400,6 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
   /* Remaining cases are all for DImode.  */
 
   val = INTVAL (imm);
-  subtargets = optimize && can_create_pseudo_p ();
 
   one_match = 0;
   zero_match = 0;
@@ -1440,63 +1437,6 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
   if (zero_match == 2)
 goto simple_sequence;
 
-  mask = 0x0ffff0000UL;
-  for (i = 16; i < 64; i += 16, mask <<= 16)
-{
-  HOST_WIDE_INT comp = mask & ~(mask - 1);
-
-  if (aarch64_uimm12_shift (val - (val & mask)))
-   {
- if (generate)
-   {
- subtarget = subtargets ? gen_reg_rtx (DImode) : dest;
- emit_insn (gen_rtx_SET (subtarget, GEN_INT (val & mask)));
- emit_insn (gen_adddi3 (dest, subtarget,
-GEN_INT (val - (val & mask))));
-   }
- num_insns += 2;
- return num_insns;
-   }
-  else if (aarch64_uimm12_shift (-(val - ((val + comp) & mask))))
-   {
- if (generate)
-   {
- subtarget = subtargets ? gen_reg_rtx (DImode) : dest;
- emit_insn (gen_rtx_SET (subtarget,
- GEN_INT ((val + comp) & mask)));
- emit_insn (gen_adddi3 (dest, subtarget,
-GEN_INT (val - ((val + comp) & mask))));
-   }
- num_insns += 2;
- return num_insns;
-   }
-  else if (aarch64_uimm12_shift (val - ((val - comp) | ~mask)))
-   {
- if (generate)
-   {
- subtarget = subtargets ? gen_reg_rtx (DImode) : dest;
- emit_insn (gen_rtx_SET (subtarget,
- GEN_INT ((val - comp) | ~mask)));
- emit_insn (gen_adddi3 (dest, subtarget,
-GEN_INT (val - ((val - comp) | ~mask))));
-   }
- num_insns += 2;
- return num_insns;
-   }
-  else if (aarch64_uimm12_shift (-(val - (val | ~mask))))
-   {
- if (generate)
-   {
- subtarget = subtargets ? gen_reg_rtx (DImode) : dest;
- emit_insn (gen_rtx_SET (subtarget, GEN_INT (val | ~mask)));
- emit_insn (gen_adddi3 (dest, subtarget,
-GEN_INT (val - (val | ~mask))));
-   }
- num_insns += 2;
- return num_insns;
-   }
-}
-
   if (zero_match != 2 && one_match != 2)
 {
   for (i = 0; i < 64; i += 16, mask <<= 16)
-- 
1.8.3

[PATCH][AArch64][1/5] Improve immediate generation

2015-09-02 Thread Wilco Dijkstra

This patch reimplements aarch64_bitmask_imm using bitwise arithmetic rather
than a slow binary search. The algorithm searches for a sequence of set bits.
If there are no more set bits and not all bits are set, it is a valid mask.
Otherwise it determines the distance to the next set bit and checks the mask
is repeated across the full 64 bits. Native performance is 5-6x faster on
typical queries.

No change in generated code, passes GCC regression/bootstrap.
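
As a reading aid, the algorithm described above can be sketched as standalone C for the 64-bit case. This is not the code from the patch: the names is_bitmask_imm64 and lowest_set_bit are made up for this sketch, the 32-bit (SImode) replication step is left out, and plain loops replace clz_hwi and the multiplier table so that no compiler builtins are needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Isolate the lowest set bit of V (0 if V is 0).  */
static uint64_t
lowest_set_bit (uint64_t v)
{
  return v & (0 - v);
}

/* Return true if VAL is encodable as a 64-bit AArch64 bitmask immediate.
   Hypothetical helper that follows the steps described in the mail.  */
bool
is_bitmask_imm64 (uint64_t val)
{
  /* A single non-wrapping run of ones: adding the lowest set bit clears the
     run and leaves at most one bit.  All-zeroes and all-ones are rejected.  */
  uint64_t tmp = val + lowest_set_bit (val);
  if (tmp == lowest_set_bit (tmp))
    return val + 1 > 1;

  /* Make bit 0 clear so only runs of ones need to be searched for.  */
  if (val & 1)
    val = ~val;

  /* Remove the first run of ones; if nothing is left there was one run.  */
  uint64_t first_one = lowest_set_bit (val);
  tmp = val & (val + first_one);
  if (tmp == 0)
    return true;

  /* Distance in bit positions between the starts of the first two runs.  */
  uint64_t next_one = lowest_set_bit (tmp);
  unsigned bits = 0;
  for (uint64_t b = first_one; b != next_one; b <<= 1)
    bits++;
  assert (bits >= 2 && bits < 64);

  /* The distance must be a power of 2 and the first run must fit in it.  */
  uint64_t run = val ^ tmp;
  if ((bits & (bits - 1)) != 0 || (run >> bits) != 0)
    return false;

  /* Replicate the first run across all 64 bits and compare.  */
  uint64_t repeated = run;
  for (unsigned width = bits; width < 64; width *= 2)
    repeated |= repeated << width;
  return val == repeated;
}
```

Enumerating every element size, run length and rotation, as the old aarch64_build_bitmask_table did, gives 5334 values, and each of them should be accepted by a check of this shape.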

ChangeLog:
2015-09-02  Wilco Dijkstra  

* gcc/config/aarch64/aarch64.c (aarch64_bitmask_imm):
Reimplement using faster algorithm.

---
 gcc/config/aarch64/aarch64.c | 62 +---
 1 file changed, 53 insertions(+), 9 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c666dce..ba1b77e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3301,19 +3301,63 @@ aarch64_movw_imm (HOST_WIDE_INT val, machine_mode mode)
  || (val & (((HOST_WIDE_INT) 0xffff) << 16)) == val);
 }
 
+/* Multipliers for repeating bitmasks of width 32, 16, 8, 4, and 2.  */
+
+static const unsigned HOST_WIDE_INT bitmask_imm_mul[] =
+  {
+0x0000000100000001ull,
+0x0001000100010001ull,
+0x0101010101010101ull,
+0x1111111111111111ull,
+0x5555555555555555ull,
+  };
+
 
 /* Return true if val is a valid bitmask immediate.  */
+
 bool
-aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode mode)
+aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
 {
-  if (GET_MODE_SIZE (mode) < 8)
-{
-  /* Replicate bit pattern.  */
-  val &= (HOST_WIDE_INT) 0xffffffff;
-  val |= val << 32;
-}
-  return bsearch (&val, aarch64_bitmasks, AARCH64_NUM_BITMASKS,
- sizeof (aarch64_bitmasks[0]), aarch64_bitmasks_cmp) != NULL;
+  unsigned HOST_WIDE_INT val, tmp, mask, first_one, next_one;
+  int bits;
+
+  /* Check for a single sequence of one bits and return quickly if so.
+ The special cases of all ones and all zeroes returns false.  */
+  val = (unsigned HOST_WIDE_INT) val_in;
+  tmp = val + (val & -val);
+
+  if (tmp == (tmp & -tmp))
+return (val + 1) > 1;
+
+  /* Replicate 32-bit immediates so we can treat them as 64-bit.  */
+  if (mode == SImode)
+val = (val << 32) | (val & 0xffffffff);
+
+  /* Invert if the immediate doesn't start with a zero bit - this means we
+ only need to search for sequences of one bits.  */
+  if (val & 1)
+val = ~val;
+
+  /* Find the first set bit and set tmp to val with the first sequence of one
+ bits removed.  Return success if there is a single sequence of ones.  */
+  first_one = val & -val;
+  tmp = val & (val + first_one);
+
+  if (tmp == 0)
+return true;
+
+  /* Find the next set bit and compute the difference in bit position.  */
+  next_one = tmp & -tmp;
+  bits = clz_hwi (first_one) - clz_hwi (next_one);
+  mask = val ^ tmp;
+
+  /* Check the bit position difference is a power of 2, and that the first
+ sequence of one bits fits within 'bits' bits.  */
+  if ((mask >> bits) != 0 || bits != (bits & -bits))
+return false;
+
+  /* Check the sequence of one bits is repeated 64/bits times.  */
+  return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26];
 }
 
 
-- 
1.8.3

Re: [PATCH, PR67405, committed] Avoid NULL pointer dereference

2015-09-02 Thread Richard Biener

On Tue, Sep 1, 2015 at 5:03 PM, Ilya Enkovich  wrote:
> Hi,
>
> This fixes an ICE by adding a NULL check.  Bootstrapped and regtested for 
> x86_64-unknown-linux-gnu.  Applied to trunk.  Does this need to be ported to 
> gcc-5-branch?
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2015-09-01  Ilya Enkovich  
>
> PR target/67405
> * tree-chkp.c (chkp_find_bound_slots_1): Add NULL check.
>
> gcc/testsuite/
>
> 2015-09-01  Ilya Enkovich  
>
> PR target/67405
> * g++.dg/pr67405.C: New test.
>
>
> diff --git a/gcc/testsuite/g++.dg/pr67405.C b/gcc/testsuite/g++.dg/pr67405.C
> new file mode 100644
> index 0000000..5055921
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr67405.C
> @@ -0,0 +1,11 @@
> +// { dg-do compile }
> +
> +struct S
> +{
> +  S f; // { dg-error "incomplete type" }
> +};
> +
> +void
> +fn1 (S p1)
> +{
> +}
> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
> index 8c1b48c..2489abb 100644
> --- a/gcc/tree-chkp.c
> +++ b/gcc/tree-chkp.c
> @@ -1667,8 +1667,9 @@ chkp_find_bound_slots_1 (const_tree type, bitmap 
> have_bound,
>for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
> if (TREE_CODE (field) == FIELD_DECL)
>   {
> -   HOST_WIDE_INT field_offs
> - = TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
> +   HOST_WIDE_INT field_offs = 0;
> +   if (DECL_FIELD_BIT_OFFSET (field))

DECL_FIELD_BIT_OFFSET should never be NULL.  Whoever created that
FIELD_DECL created an invalid one.

Richard.

> + field_offs += TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
> if (DECL_FIELD_OFFSET (field))
>   field_offs += TREE_INT_CST_LOW (DECL_FIELD_OFFSET (field)) * 8;
> chkp_find_bound_slots_1 (TREE_TYPE (field), have_bound,

[PATCH][AArch64][2/5] Improve immediate generation

2015-09-02 Thread Wilco Dijkstra

aarch64_internal_mov_immediate uses loops iterating over all legal bitmask
immediates to find 2-instruction immediate combinations. One loop is quadratic
and despite being extremely expensive very rarely finds a matching immediate
(43 matches in all of SPEC2006 but none are emitted in final code), so it can
be removed without any effect on code quality. The other loop can be replaced
by a constant-time search: rather than iterating over all legal bitmask
values, reconstruct a potential bitmask and query the fast
aarch64_bitmask_imm.

No change in generated code, passes GCC regression tests/bootstrap.
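
The constant-time search described above can be sketched in standalone C as follows. This is a model for illustration only, not the patch itself: find_mov_movk and is_bitmask_imm64 are names invented for this sketch, and is_bitmask_imm64 is a compact stand-in for the 64-bit case of aarch64_bitmask_imm. For each 16-bit chunk it tries the three candidates the new comment lists (chunk cleared, chunk set to all ones, chunk copied from the other 32-bit half) and asks whether the result is a bitmask immediate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compact 64-bit bitmask-immediate test (hypothetical stand-in).  */
static bool
is_bitmask_imm64 (uint64_t val)
{
  uint64_t tmp = val + (val & (0 - val));
  if (tmp == (tmp & (0 - tmp)))
    return val + 1 > 1;
  if (val & 1)
    val = ~val;
  uint64_t first_one = val & (0 - val);
  tmp = val & (val + first_one);
  if (tmp == 0)
    return true;
  uint64_t next_one = tmp & (0 - tmp);
  unsigned bits = 0;
  for (uint64_t b = first_one; b != next_one; b <<= 1)
    bits++;
  uint64_t run = val ^ tmp;
  if ((bits & (bits - 1)) != 0 || (run >> bits) != 0)
    return false;
  for (unsigned width = bits; width < 64; width *= 2)
    run |= run << width;
  return val == run;
}

/* Look for a bitmask immediate that differs from VAL in one 16-bit chunk.
   On success store it in *VAL2_OUT and the chunk's bit offset in *SHIFT_OUT;
   a MOV of *VAL2_OUT followed by a MOVK of that chunk then builds VAL.  */
bool
find_mov_movk (uint64_t val, uint64_t *val2_out, unsigned *shift_out)
{
  assert (val2_out && shift_out);

  uint64_t swapped = (val >> 32) | (val << 32);
  uint64_t mask = 0xffff;
  for (unsigned i = 0; i < 64; i += 16, mask <<= 16)
    {
      uint64_t candidates[3] = {
        val & ~mask,                       /* chunk forced to zeroes */
        val | mask,                        /* chunk forced to ones */
        (val & ~mask) | (swapped & mask),  /* chunk from the other half */
      };
      for (int k = 0; k < 3; k++)
        if (candidates[k] != val && is_bitmask_imm64 (candidates[k]))
          {
            *val2_out = candidates[k];
            *shift_out = i;
            return true;
          }
    }
  return false;
}
```

For example, 0x5555555555551234 is found through the third candidate: copying 0x5555 from the other half gives the repeating bitmask 0x5555555555555555, and a MOVK of 0x1234 at offset 0 then restores the value.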

ChangeLog:
2015-09-02  Wilco Dijkstra  

* gcc/config/aarch64/aarch64.c (aarch64_internal_mov_immediate):
Replace slow immediate matching loops with a faster algorithm.

---
 gcc/config/aarch64/aarch64.c | 96 +++-
 1 file changed, 23 insertions(+), 73 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c0280e6..d6f7cb0 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1376,7 +1376,7 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
   unsigned HOST_WIDE_INT mask;
   int i;
   bool first;
-  unsigned HOST_WIDE_INT val;
+  unsigned HOST_WIDE_INT val, val2;
   bool subtargets;
   rtx subtarget;
   int one_match, zero_match, first_not_ffff_match;
@@ -1503,85 +1503,35 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
}
 }
 
-  /* See if we can do it by arithmetically combining two
- immediates.  */
-  for (i = 0; i < AARCH64_NUM_BITMASKS; i++)
+  if (zero_match != 2 && one_match != 2)
 {
-  int j;
-  mask = 0xffff;
+  /* Try emitting a bitmask immediate with a movk replacing 16 bits.
+For a 64-bit bitmask try whether changing 16 bits to all ones or
+zeroes creates a valid bitmask.  To check any repeated bitmask,
+try using 16 bits from the other 32-bit half of val.  */
 
-  if (aarch64_uimm12_shift (val - aarch64_bitmasks[i])
- || aarch64_uimm12_shift (-val + aarch64_bitmasks[i]))
+  for (i = 0; i < 64; i += 16, mask <<= 16)
{
- if (generate)
-   {
- subtarget = subtargets ? gen_reg_rtx (DImode) : dest;
- emit_insn (gen_rtx_SET (subtarget,
- GEN_INT (aarch64_bitmasks[i])));
- emit_insn (gen_adddi3 (dest, subtarget,
-GEN_INT (val - aarch64_bitmasks[i])));
-   }
- num_insns += 2;
- return num_insns;
+ val2 = val & ~mask;
+ if (val2 != val && aarch64_bitmask_imm (val2, mode))
+   break;
+ val2 = val | mask;
+ if (val2 != val && aarch64_bitmask_imm (val2, mode))
+   break;
+ val2 = val2 & ~mask;
+ val2 = val2 | (((val2 >> 32) | (val2 << 32)) & mask);
+ if (val2 != val && aarch64_bitmask_imm (val2, mode))
+   break;
}
-
-  for (j = 0; j < 64; j += 16, mask <<= 16)
+  if (i != 64)
{
- if ((aarch64_bitmasks[i] & ~mask) == (val & ~mask))
+ if (generate)
{
- if (generate)
-   {
- emit_insn (gen_rtx_SET (dest,
- GEN_INT (aarch64_bitmasks[i])));
- emit_insn (gen_insv_immdi (dest, GEN_INT (j),
-GEN_INT ((val >> j) & 0xffff)));
-   }
- num_insns += 2;
- return num_insns;
+ emit_insn (gen_rtx_SET (dest, GEN_INT (val2)));
+ emit_insn (gen_insv_immdi (dest, GEN_INT (i),
+GEN_INT ((val >> i) & 0xffff)));
}
-   }
-}
-
-  /* See if we can do it by logically combining two immediates.  */
-  for (i = 0; i < AARCH64_NUM_BITMASKS; i++)
-{
-  if ((aarch64_bitmasks[i] & val) == aarch64_bitmasks[i])
-   {
- int j;
-
- for (j = i + 1; j < AARCH64_NUM_BITMASKS; j++)
-   if (val == (aarch64_bitmasks[i] | aarch64_bitmasks[j]))
- {
-   if (generate)
- {
-   subtarget = subtargets ? gen_reg_rtx (mode) : dest;
-   emit_insn (gen_rtx_SET (subtarget,
-   GEN_INT (aarch64_bitmasks[i])));
-   emit_insn (gen_iordi3 (dest, subtarget,
-  GEN_INT (aarch64_bitmasks[j])));
- }
-   num_insns += 2;
-   return num_insns;
- }
-   }
-  else if ((val & aarch64_bitmasks[i]) == val)
-   {
- int j;
-
- for (j = i + 1; j < AARCH64_NUM_BITMASKS; j++)
-   if (val == (aarch64_bitmasks[j] & aarch64_bitmasks[i]))
- {
-   if (generate)
- {
-

[PATCH] Fix PR66705

2015-09-02 Thread Richard Biener


I was naively using ->get_constructor in IPA PTA without properly
checking whether that succeeds.  Now I tried to use ctor_for_folding
but that isn't good as we want to analyze non-const globals in IPA
PTA and we need to analyze their initializers as well.

Thus I'm trying below with ctor_for_analysis, but I really "just"
need the initializer or a "not available" for conservative handling.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Honza - I suppose you should double-check this and suggest something
different (or implement something more generic in the IPA infrastructure).

Thanks,
Richard.

2015-09-02  Richard Biener  

PR ipa/66705
* tree-ssa-structalias.c (ctor_for_analysis): New function.
(create_variable_info_for_1): Use ctor_for_analysis instead
of get_constructor.
(create_variable_info_for): Likewise.

* g++.dg/lto/pr66705_0.C: New testcase.

Index: gcc/tree-ssa-structalias.c
===================================================================
--- gcc/tree-ssa-structalias.c  (revision 227207)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -5637,6 +5637,26 @@ check_for_overlaps (vec fiel
   return false;
 }
 
+/* We can't use ctor_for_folding as that only returns constant constructors.  
*/
+
+static tree
+ctor_for_analysis (tree decl)
+{
+  varpool_node *node = varpool_node::get (decl);
+  if (!node)
+return error_mark_node;
+  node = node->ultimate_alias_target ();
+  if (DECL_INITIAL (node->decl) != error_mark_node
+  || !in_lto_p)
+return (DECL_INITIAL (node->decl)
+   ? DECL_INITIAL (node->decl) : error_mark_node);
+  if (in_lto_p
+  && node->lto_file_data
+  && !node->body_removed)
+return node->get_constructor ();
+  return error_mark_node;
+}
+
 /* Create a varinfo structure for NAME and DECL, and add it to VARMAP.
This will also create any varinfo structures necessary for fields
of DECL.  */
@@ -5650,7 +5670,6 @@ create_variable_info_for_1 (tree decl, c
   auto_vec<fieldoff_s> fieldstack;
   fieldoff_s *fo;
   unsigned int i;
-  varpool_node *vnode;
 
   if (!declsize
   || !tree_fits_uhwi_p (declsize))
@@ -5672,8 +5691,7 @@ create_variable_info_for_1 (tree decl, c
 in IPA mode.  Else we'd have to parse arbitrary initializers.  */
   && !(in_ipa_mode
   && is_global_var (decl)
-  && (vnode = varpool_node::get (decl))
-  && vnode->get_constructor ()))
+  && ctor_for_analysis (decl) != error_mark_node))
 {
   fieldoff_s *fo = NULL;
   bool notokay = false;
@@ -5805,13 +5823,13 @@ create_variable_info_for (tree decl, con
 
  /* If this is a global variable with an initializer and we are in
 IPA mode generate constraints for it.  */
- if (vnode->get_constructor ()
- && vnode->definition)
+ tree ctor = ctor_for_analysis (decl);
+ if (ctor != error_mark_node)
{
   auto_vec<ce_s> rhsc;
   struct constraint_expr lhs, *rhsp;
   unsigned i;
- get_constraint_for_rhs (vnode->get_constructor (), &rhsc);
+ get_constraint_for_rhs (ctor, &rhsc);
  lhs.var = vi->id;
  lhs.offset = 0;
  lhs.type = SCALAR;
Index: gcc/testsuite/g++.dg/lto/pr66705_0.C
===================================================================
--- gcc/testsuite/g++.dg/lto/pr66705_0.C(revision 0)
+++ gcc/testsuite/g++.dg/lto/pr66705_0.C(working copy)
@@ -0,0 +1,15 @@
+// { dg-lto-do link }
+// { dg-lto-options { { -O2 -flto -flto-partition=max -fipa-pta } } }
+// { dg-extra-ld-options "-r -nostdlib" }
+
+class A {
+public:
+A();
+};
+int a = 0;
+void foo() {
+a = 0;
+A b;
+for (; a;)
+  ;
+}

Re: Location of "dg-final" directives? (was Re: [PATCH][GCC] Algorithmic optimization in match and simplify)

2015-09-02 Thread Andre Vieira




On 01/09/15 17:54, Marek Polacek wrote:

On Tue, Sep 01, 2015 at 12:50:27PM -0400, David Malcolm wrote:

I can't comment on the patch itself, but I noticed that in the testsuite
addition, you've gathered all the "dg-final" clauses at the end.

I think that this is consistent with existing practice in gcc, but
AFAIK, the dg-final clauses can appear anywhere in the file.

Would it make test cases more readable if we put the dg-final clauses
next to the code that they relate to?  Or, at least after the function
that they relate to?

For example:

   unsigned short
   foo (unsigned short a)
   {
 a ^= 0x4002;
 a >>= 1;
 a |= 0x8000; /* Simplify to ((a >> 1) ^ 0xa001).  */
 return a;
   }
   /* { dg-final { scan-tree-dump "\\^ 40961" "forwprop1" } } */

   unsigned short
   bar (unsigned short a)
   {
   /* snip */ /* Simplify to ((a << 1) | 0x8005).  */
   }
   /* { dg-final { scan-tree-dump "\\| 32773" "forwprop1" } } */

   ... etc ...

I think this may be more readable than the "group them all at the end of
the file approach", especially with testcases with many files and
dg-final directives.


Yeah, it's probably somewhat more readable.  Same for dg-output.  E.g. some
ubsan tests already use dg-outputs after functions they relate to.

Marek



If no one objects, I'll make those changes too. Sounds reasonable to me.

Andre
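
As a side note, the simplification quoted in the foo example above, ((a >> 1) ^ 0xa001), is cheap to confirm exhaustively because the input is only 16 bits wide. The sketch below is not part of the testsuite patch; foo_original copies the quoted body, while foo_simplified and check_foo_simplification are names made up here. Note that 0xa001 is 40961, the constant the quoted scan-tree-dump pattern looks for.

```c
#include <assert.h>
#include <stdint.h>

/* The function body as quoted in the example.  */
uint16_t
foo_original (uint16_t a)
{
  a ^= 0x4002;
  a >>= 1;
  a |= 0x8000;
  return a;
}

/* The form the comment says it should simplify to.  */
uint16_t
foo_simplified (uint16_t a)
{
  return (uint16_t) ((a >> 1) ^ 0xa001);
}

/* Compare both forms over every 16-bit input.  */
void
check_foo_simplification (void)
{
  for (uint32_t a = 0; a <= 0xffff; a++)
    assert (foo_original ((uint16_t) a) == foo_simplified ((uint16_t) a));
}
```

The two agree because the shift leaves bit 15 clear, so OR-ing in 0x8000 is the same as XOR-ing it, and the shifted 0x4002 contributes 0x2001.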

[PATCH][AArch64][0/5] Improve immediate generation

2015-09-02 Thread Wilco Dijkstra

This is a set of patches to reduce the compile-time overhead of immediate
generation on AArch64. There have been discussions and investigations into
reducing the overhead of immediate generation using various caching
strategies. However, the statistics showed that some of the expensive
immediate loops are not beneficial, and the algorithms can be improved
significantly. The resulting speedups are so large that caching can no longer
show a measurable benefit.

aarch64_bitmask_imm is rewritten to use bitwise arithmetic rather than binary
search. aarch64_internal_mov_immediate is rewritten to replace slow linear and
quadratic loops with constant-time logic, reduce the number of special cases
and simplify the overall logic. There are slight differences in the generated
sequences; however, all immediates are the same size (no codesize difference
in SPEC2006). Overall buildtime improvement is 0.3% when building SPEC2006.

[PATCH][AArch64][5/5] Improve immediate generation

2015-09-02 Thread Wilco Dijkstra

Clean up the remainder of aarch64_internal_mov_immediate. Compute the number
of 16-bit aligned 16-bit masks that are all-zeroes or all-ones, and emit the
smallest sequence using a single loop skipping either all-ones or all-zeroes.

Passes GCC regression tests/bootstrap. Minor changes in generated code for
some special cases but codesize is identical.
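
To illustrate the counting step, here is a small C sketch of how the number of all-zero and all-one 16-bit chunks determines the length of the plain MOV/MOVK fallback sequence. It is a simplified model rather than the patch: count_chunks and mov_movk_length are hypothetical names, and the shorter sequences that go through bitmask immediates are deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Count the 16-bit aligned chunks of VAL that equal PATTERN.  */
static int
count_chunks (uint64_t val, uint16_t pattern)
{
  int n = 0;
  for (int shift = 0; shift < 64; shift += 16)
    n += ((val >> shift) & 0xffff) == pattern;
  return n;
}

/* Length of the fallback sequence for VAL: the initial mov fills the
   register with zeroes or ones, so chunks matching the more common of the
   two patterns need no movk.  At least one instruction is always needed.  */
int
mov_movk_length (uint64_t val)
{
  int zero_match = count_chunks (val, 0x0000);
  int one_match = count_chunks (val, 0xffff);
  int skipped = one_match > zero_match ? one_match : zero_match;
  int length = 4 - skipped;
  assert (length >= 0 && length <= 4);
  return length > 0 ? length : 1;
}
```

Under this model 0x0000000012345678 takes two instructions (two zero chunks are skipped), while 0x123456789abcdef0 takes the full four.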

ChangeLog:
2015-09-02  Wilco Dijkstra  

* gcc/config/aarch64/aarch64.c (aarch64_internal_mov_immediate):
Cleanup immediate generation code.

---
 gcc/config/aarch64/aarch64.c | 137 ---
 1 file changed, 39 insertions(+), 98 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index af9a3d3..ca4428a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1367,75 +1367,42 @@ static int
 aarch64_internal_mov_immediate (rtx dest, rtx imm, bool generate,
machine_mode mode)
 {
-  unsigned HOST_WIDE_INT mask;
   int i;
-  bool first;
-  unsigned HOST_WIDE_INT val, val2;
-  int one_match, zero_match, first_not_ffff_match;
-  int num_insns = 0;
+  unsigned HOST_WIDE_INT val, val2, mask;
+  int one_match, zero_match;
+  int num_insns;
 
-  if (CONST_INT_P (imm) && aarch64_move_imm (INTVAL (imm), mode))
+  val = INTVAL (imm);
+
+  if (aarch64_move_imm (val, mode))
 {
   if (generate)
emit_insn (gen_rtx_SET (dest, imm));
-  num_insns++;
-  return num_insns;
+  return 1;
 }
 
-  if (mode == SImode)
+  if ((val >> 32) == 0 || mode == SImode)
 {
-  /* We know we can't do this in 1 insn, and we must be able to do it
-in two; so don't mess around looking for sequences that don't buy
-us anything.  */
   if (generate)
{
- emit_insn (gen_rtx_SET (dest, GEN_INT (INTVAL (imm) & 0xffff)));
- emit_insn (gen_insv_immsi (dest, GEN_INT (16),
-GEN_INT ((INTVAL (imm) >> 16) & 0xffff)));
+ emit_insn (gen_rtx_SET (dest, GEN_INT (val & 0xffff)));
+ if (mode == SImode)
+   emit_insn (gen_insv_immsi (dest, GEN_INT (16),
+  GEN_INT ((val >> 16) & 0xffff)));
+ else
+   emit_insn (gen_insv_immdi (dest, GEN_INT (16),
+  GEN_INT ((val >> 16) & 0xffff)));
}
-  num_insns += 2;
-  return num_insns;
+  return 2;
 }
 
   /* Remaining cases are all for DImode.  */
 
-  val = INTVAL (imm);
-
-  one_match = 0;
-  zero_match = 0;
   mask = 0xffff;
-  first_not_ffff_match = -1;
-
-  for (i = 0; i < 64; i += 16, mask <<= 16)
-{
-  if ((val & mask) == mask)
-   one_match++;
-  else
-   {
- if (first_not_ffff_match < 0)
-   first_not_ffff_match = i;
- if ((val & mask) == 0)
-   zero_match++;
-   }
-}
-
-  if (one_match == 2)
-{
-  /* Set one of the quarters and then insert back into result.  */
-  mask = 0xffffll << first_not_ffff_match;
-  if (generate)
-   {
- emit_insn (gen_rtx_SET (dest, GEN_INT (val | mask)));
- emit_insn (gen_insv_immdi (dest, GEN_INT (first_not_ffff_match),
-GEN_INT ((val >> first_not_ffff_match)
- & 0xffff)));
-   }
-  num_insns += 2;
-  return num_insns;
-}
-
-  if (zero_match == 2)
-goto simple_sequence;
+  zero_match = ((val & mask) == 0) + ((val & (mask << 16)) == 0) +
+((val & (mask << 32)) == 0) + ((val & (mask << 48)) == 0);
+  one_match = ((~val & mask) == 0) + ((~val & (mask << 16)) == 0) +
+((~val & (mask << 32)) == 0) + ((~val & (mask << 48)) == 0);
 
   if (zero_match != 2 && one_match != 2)
 {
@@ -1463,58 +1430,32 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
{
  emit_insn (gen_rtx_SET (dest, GEN_INT (val2)));
  emit_insn (gen_insv_immdi (dest, GEN_INT (i),
-GEN_INT ((val >> i) & 0xffff)));
+GEN_INT ((val >> i) & 0xffff)));
}
- return 2;
}
 }
 
-  if (one_match > zero_match)
-{
-  /* Set either first three quarters or all but the third.  */
-  mask = 0xffffll << (16 - first_not_ffff_match);
-  if (generate)
-   emit_insn (gen_rtx_SET (dest,
-   GEN_INT (val | mask | 0xffffffff00000000ull)));
-  num_insns ++;
+  /* Generate 2-4 instructions, skipping 16 bits of all zeroes or ones which
+ are emitted by the initial mov.  If one_match > zero_match, skip set bits,
+ otherwise skip zero bits.  */
 
-  /* Now insert other two quarters. */
-  for (i = first_not_ffff_match + 16, mask <<= (first_not_ffff_match << 1);
-  i < 64; i += 16, mask <<= 16)
-   {
- if ((val & mask) != mask)
-   {
- if
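
The new zero_match/one_match computation in the hunk above counts how many
16-bit quarters of the 64-bit immediate are all zeroes or all ones.  A minimal
standalone sketch of that counting, assuming a hypothetical helper name
(count_quarter_matches) and struct that are not part of aarch64.c, might look
like this:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only; these names do not exist in aarch64.c.
// Holds the number of 16-bit quarters that are all zeroes / all ones.
struct QuarterMatches
{
  int zero_match;
  int one_match;
};

// Count the 16-bit quarters of VAL that equal 0x0000 or 0xffff.
inline QuarterMatches
count_quarter_matches (std::uint64_t val)
{
  const std::uint64_t mask = 0xffff;
  QuarterMatches m = { 0, 0 };
  for (int shift = 0; shift < 64; shift += 16)
    {
      const std::uint64_t quarter = (val >> shift) & mask;
      m.zero_match += (quarter == 0);
      m.one_match += (quarter == mask);
    }
  return m;
}

// Small self-check of the counting logic.
inline void
check_quarter_matches ()
{
  // Quarters from low to high: 0x0000, 0x1234, 0xffff, 0x0000.
  const QuarterMatches m = count_quarter_matches (0x0000ffff12340000ull);
  assert (m.zero_match == 2);
  assert (m.one_match == 1);
}
```

The patch computes the same counts with four unrolled comparisons instead of
a loop; the loop form here is only meant to make the intent easier to read.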

Re: [PATCH, PR67405, committed] Avoid NULL pointer dereference

2015-09-02 Thread Ilya Enkovich

2015-09-02 15:35 GMT+03:00 Richard Biener :
> On Tue, Sep 1, 2015 at 5:03 PM, Ilya Enkovich  wrote:
>> Hi,
>>
>> This fixes an ICE by adding a NULL check.  Bootstrapped and regtested for 
>> x86_64-unknown-linux-gnu.  Applied to trunk.  Does this need to be ported to 
>> gcc-5-branch?
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2015-09-01  Ilya Enkovich  
>>
>> PR target/67405
>> * tree-chkp.c (chkp_find_bound_slots_1): Add NULL check.
>>
>> gcc/testsuite/
>>
>> 2015-09-01  Ilya Enkovich  
>>
>> PR target/67405
>> * g++.dg/pr67405.C: New test.
>>
>>
>> diff --git a/gcc/testsuite/g++.dg/pr67405.C b/gcc/testsuite/g++.dg/pr67405.C
>> new file mode 100644
>> index 000..5055921
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/pr67405.C
>> @@ -0,0 +1,11 @@
>> +// { dg-do compile }
>> +
>> +struct S
>> +{
>> +  S f; // { dg-error "incomplete type" }
>> +};
>> +
>> +void
>> +fn1 (S p1)
>> +{
>> +}
>> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
>> index 8c1b48c..2489abb 100644
>> --- a/gcc/tree-chkp.c
>> +++ b/gcc/tree-chkp.c
>> @@ -1667,8 +1667,9 @@ chkp_find_bound_slots_1 (const_tree type, bitmap 
>> have_bound,
>>for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
>> if (TREE_CODE (field) == FIELD_DECL)
>>   {
>> -   HOST_WIDE_INT field_offs
>> - = TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
>> +   HOST_WIDE_INT field_offs = 0;
>> +   if (DECL_FIELD_BIT_OFFSET (field))
>
> DECL_FIELD_BIT_OFFSET should never be NULL.  Whoever created that
> FIELD_DECL created an invalid one.

I'll check where this decl comes from. BTW, is there a proper checker
where a NULL test for DECL_FIELD_BIT_OFFSET could be added?

Thanks,
Ilya

>
> Richard.
>
>> + field_offs += TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
>> if (DECL_FIELD_OFFSET (field))
>>   field_offs += TREE_INT_CST_LOW (DECL_FIELD_OFFSET (field)) * 8;
>> chkp_find_bound_slots_1 (TREE_TYPE (field), have_bound,

Re: [PATCH] Fix PR66705

2015-09-02 Thread Jan Hubicka

> 
> I was naively using ->get_constructor in IPA PTA without properly
> checking whether that succeeds.  Now I tried to use ctor_for_folding
> but that isn't good as we want to analyze non-const globals in IPA
> PTA and we need to analyze their initializers as well.
> 
> Thus I'm trying below with ctor_for_analysis, but I really "just"
> need the initializer or a "not available" for conservative handling.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> Honza - I suppose you should double-check this and suggest sth
> different (or implement sth more generic in the IPA infrastructure).

Yep, you are correct that we don't currently have a way to look into ctor
without actually loading. But do you need something more than just walking
references that you already have in ipa-ref lists?
> 
> Thanks,
> Richard.
> 
> 2015-09-02  Richard Biener  
> 
>   PR ipa/66705
>   * tree-ssa-structalias.c (ctor_for_analysis): New function.
>   (create_variable_info_for_1): Use ctor_for_analysis instead
>   of get_constructor.
>   (create_variable_info_for): Likewise.

Otherwise I would go for making ctor_for_analysis a method of varpool_node 
and...
> 
>   * g++.dg/lto/pr66705_0.C: New testcase.
> 
> Index: gcc/tree-ssa-structalias.c
> ===
> --- gcc/tree-ssa-structalias.c(revision 227207)
> +++ gcc/tree-ssa-structalias.c(working copy)
> @@ -5637,6 +5637,26 @@ check_for_overlaps (vec fiel
>return false;
>  }
>  
> +/* We can't use ctor_for_folding as that only returns constant constructors. 
>  */
> +
> +static tree
> +ctor_for_analysis (tree decl)
> +{
> +  varpool_node *node = varpool_node::get (decl);
> +  if (!node)
> +return error_mark_node;
> +  node = node->ultimate_alias_target ();
> +  if (DECL_INITIAL (node->decl) != error_mark_node
> +  || !in_lto_p)
> +return (DECL_INITIAL (node->decl)
> + ? DECL_INITIAL (node->decl) : error_mark_node);

I think returning NULL here is just fine. 
error_mark_node means constructor is not really available. NULL is
the usual way to say that the variable is not initialized.

Re: [Patch, libstdc++] Fix data races in basic_string implementation

2015-09-02 Thread Jonathan Wakely


On 01/09/15 17:42 +0200, Dmitry Vyukov wrote:

On Tue, Sep 1, 2015 at 5:08 PM, Jonathan Wakely  wrote:

On 01/09/15 16:56 +0200, Dmitry Vyukov wrote:


I don't understand how a new gcc may not support __atomic builtins on
ints. How it is even possible? That's a portable API provided by
recent gcc's...



The built-in function is always defined, but it might expand to a call
to an external function in libatomic, and it would be a regression for
code using std::string to start requiring libatomic (although maybe it
would be necessary if it's the only way to make the code correct).

I don't know if there are any targets that define __GTHREADS and also
don't support __atomic_load(int*, ...) without libatomic. If such
targets exist then adding a new configure check that only depends on
__atomic_load(int*, ...) would mean we keep supporting those targets.

Another option would be to simply do:

bool
_M_is_shared() const _GLIBCXX_NOEXCEPT
#if defined(__GTHREADS)
+{ return __atomic_load(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0; }
+#else
{ return this->_M_refcount > 0; }
+#endif

and see if anyone complains!


I like this option!
If a platform uses multithreading and has non-inlined atomic loads,
then the way to fix this is to provide inlined atomic loads rather
than to fix all call sites.

Attaching new patch. Please take another look.


This looks good. Torvald suggested that it would be useful to add a
similar comment to the release operation in _M_dispose, so that both
sides of the release-acquire are similarly documented. Could you add
that and provide a suitable ChangeLog entry?

Thanks!



Index: include/bits/basic_string.h
===
--- include/bits/basic_string.h (revision 227363)
+++ include/bits/basic_string.h (working copy)
@@ -2601,11 +2601,32 @@

bool
_M_is_leaked() const _GLIBCXX_NOEXCEPT
-{ return this->_M_refcount < 0; }
+{
+#if defined(__GTHREADS)
+  // _M_refcount is mutated concurrently by _M_refcopy/_M_dispose,
+  // so we need to use an atomic load. However, _M_is_leaked
+  // predicate does not change concurrently (i.e. the string is either
+  // leaked or not), so a relaxed load is enough.
+  return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
+#else
+  return this->_M_refcount < 0;
+#endif
+}

bool
_M_is_shared() const _GLIBCXX_NOEXCEPT
-{ return this->_M_refcount > 0; }
+   {
+#if defined(__GTHREADS)
+  // _M_refcount is mutated concurrently by _M_refcopy/_M_dispose,
+  // so we need to use an atomic load. Another thread can drop last
+  // but one reference concurrently with this check, so we need this
+  // load to be acquire to synchronize with release fetch_and_add in
+  // _M_dispose.
+  return __atomic_load_n(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0;
+#else
+  return this->_M_refcount > 0;
+#endif
+}

void
_M_set_leaked() _GLIBCXX_NOEXCEPT
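
The release/acquire pairing discussed above can be sketched outside of
libstdc++ roughly as follows.  This is only an illustration under stated
assumptions: RefCountedRep and its member names are hypothetical, the counter
convention (0 means one owner, positive means shared, negative means leaked)
mirrors the comments in the patch, and the GCC/Clang __atomic builtins are
used directly instead of the library's own dispatch helpers.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted string representation.
// refcount == 0: single owner; > 0: shared; < 0: leaked.
struct RefCountedRep
{
  int refcount = 0;

  // The leaked state does not change concurrently, so a relaxed load is
  // assumed to be sufficient here.
  bool is_leaked () const
  { return __atomic_load_n (&refcount, __ATOMIC_RELAXED) < 0; }

  // Acquire load: pairs with the release part of the decrement in
  // drop_ref () performed by another owner.
  bool is_shared () const
  { return __atomic_load_n (&refcount, __ATOMIC_ACQUIRE) > 0; }

  // Take an additional reference.
  void add_ref ()
  { __atomic_fetch_add (&refcount, 1, __ATOMIC_ACQ_REL); }

  // Drop one reference.  Returns true when the caller held the last
  // reference and is therefore responsible for destroying the rep.
  bool drop_ref ()
  { return __atomic_fetch_add (&refcount, -1, __ATOMIC_ACQ_REL) <= 0; }
};

// Single-threaded walk through the state transitions.
inline void
check_ref_counted_rep ()
{
  RefCountedRep rep;
  assert (!rep.is_shared ());   // one owner
  rep.add_ref ();
  assert (rep.is_shared ());    // two owners
  assert (!rep.drop_ref ());    // back to one owner, nothing to destroy
  assert (!rep.is_shared ());
  assert (rep.drop_ref ());     // last owner gone
}
```

The walk-through only exercises the counter arithmetic; it does not
demonstrate the cross-thread ordering itself, which would need a
multi-threaded test or a tool such as ThreadSanitizer.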

[AArch64_be] Fix vldX/vstX AdvSIMD intrinsics

2015-09-02 Thread Christophe Lyon

Hi,

The aarch64_vldX/aarch64_vstX expanders used for the vldX/vstX AdvSIMD
intrinsics in Q mode called vec_load_lanes, which shuffles the vectors
to match the layout expected by the vectorizer.

We do not want this to happen when the intrinsics are called directly
by the end-user code.

This patch fixes this, by calling gen_aarch64_simd_ldX/gen_aarch64_simd_stX.

With this patch, the following tests now pass in advsimd-intrinsics
(target aarch64_be):
vldX_lane.c, vtrn, vuzp, vzip
as well as aarch64/vldN_1.c and aarch64/vstN_1.c

It fixes PR 59810, 63652, 63653.

No regression, and tested on aarch64 and aarch64_be using the Foundation Model.

OK for trunk?

Christophe.
2015-09-02  Christophe Lyon  

	PR target/59810
	PR target/63652
	PR target/63653
	* config/aarch64/aarch64-simd.md
	(aarch64_ld): Call
	gen_aarch64_simd_ld.
	(aarch64_st): Call
	gen_aarch64_simd_st.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9777418..75fa0ab 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4566,7 +4566,7 @@
   machine_mode mode = mode;
   rtx mem = gen_rtx_MEM (mode, operands[1]);
 
-  emit_insn (gen_vec_load_lanes (operands[0], mem));
+  emit_insn (gen_aarch64_simd_ld (operands[0], mem));
   DONE;
 })
 
@@ -4849,7 +4849,7 @@
   machine_mode mode = mode;
   rtx mem = gen_rtx_MEM (mode, operands[0]);
 
-  emit_insn (gen_vec_store_lanes (mem, operands[1]));
+  emit_insn (gen_aarch64_simd_st (mem, operands[1]));
   DONE;
 })

Re: [PATCH] Fix PR66705

2015-09-02 Thread Jan Hubicka

> On Wed, 2 Sep 2015, Richard Biener wrote:
> 
> > On Wed, 2 Sep 2015, Jan Hubicka wrote:
> > 
> > > > 
> > > > I was naively using ->get_constructor in IPA PTA without properly
> > > > checking whether that succeeds.  Now I tried to use ctor_for_folding
> > > > but that isn't good as we want to analyze non-const globals in IPA
> > > > PTA and we need to analyze their initializers as well.
> > > > 
> > > > Thus I'm trying below with ctor_for_analysis, but I really "just"
> > > > need the initializer or a "not available" for conservative handling.
> > > > 
> > > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > > 
> > > > Honza - I suppose you should double-check this and suggest sth
> > > > different (or implement sth more generic in the IPA infrastructure).
> > > 
> > > Yep, you are correct that we don't currently have a way to look into ctor
> > > without actually loading. But do you need something more than just walking
> > > references that you already have in ipa-ref lists?
> > 
> > Hmm, no, ipa-ref list should be enough (unless we start field-sensitive
> > analysis or need NULL inits for correctness).  Still have to figure out
> > how to walk the list and how the reference would look like (what
> > is ref->use?  IPA_REF_ADDR?  can those be speculative?)
> 
> Sth like the following seems to work.

Yep, it looks good to me. Do you conservatively handle constructors that
are in other units? Those won't have ipa-ref lists streamed to ltrans
stage.  I suppose you do not care because all references in them can be
supplied by foreign code, so you need to be conservative anyway.

Honza

Re: [PATCH] Fix PR66705

2015-09-02 Thread Richard Biener

On Wed, 2 Sep 2015, Jan Hubicka wrote:

> > On Wed, 2 Sep 2015, Richard Biener wrote:
> > 
> > > On Wed, 2 Sep 2015, Jan Hubicka wrote:
> > > 
> > > > > 
> > > > > I was naively using ->get_constructor in IPA PTA without properly
> > > > > checking whether that succeeds.  Now I tried to use ctor_for_folding
> > > > > but that isn't good as we want to analyze non-const globals in IPA
> > > > > PTA and we need to analyze their initializers as well.
> > > > > 
> > > > > Thus I'm trying below with ctor_for_analysis, but I really "just"
> > > > > need the initializer or a "not available" for conservative handling.
> > > > > 
> > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > > > 
> > > > > Honza - I suppose you should double-check this and suggest sth
> > > > > different (or implement sth more generic in the IPA infrastructure).
> > > > 
> > > > Yep, you are correct that we don't currently have a way to look into ctor
> > > > without actually loading. But do you need something more than just 
> > > > walking
> > > > references that you already have in ipa-ref lists?
> > > 
> > > Hmm, no, ipa-ref list should be enough (unless we start field-sensitive
> > > analysis or need NULL inits for correctness).  Still have to figure out
> > > how to walk the list and how the reference would look like (what
> > > is ref->use?  IPA_REF_ADDR?  can those be speculative?)
> > 
> > Sth like the following seems to work.
> 
> Yep, it looks good to me. Do you conservatively handle constructors that 
> are in other units? Those won't have ipa-ref lists streamed to ltrans 
> stage.  I suppose you do not care because all references in them can be 
> supplied by foreign code, so you need to be conservative anyway.

I use all_refs_explicit_p () to go a conservative path.  And indeed
I may trip aliases (well, the code doesn't handle aliases correctly
anyway I guess - I just walk vars via

  /* Create constraints for global variables and their initializers.  */
  FOR_EACH_VARIABLE (var)
    {
      if (var->alias && var->analyzed)
        continue;

      get_vi_for_tree (var->decl);
    }

and in get_vi_for_tree look at its ref list).  So I should only get
"ultimate" alias targets and only those may have initializers?

Richard.

Re: [PATCH PR66388]Add sizetype cand for BIV of smaller type if it's used as index of memory ref

2015-09-02 Thread Richard Biener

On Wed, Sep 2, 2015 at 5:26 AM, Bin Cheng  wrote:
> Hi,
> This patch is a new approach to fix PR66388.  IVO today computes iv_use with
> iv_cand which has at least same type precision as the use.  On 64bit
> platforms like AArch64, this results in different iv_cand created for each
> address type iv_use, and register pressure increased.  As a matter of fact,
> the BIV should be used for all iv_uses in some of these cases.  It is a
> latent bug but recently getting worse because of overflow changes.
>
> The original approach at
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01484.html can fix the issue
> except that it conflicts with IV elimination.  It seems to me impossible to
> mitigate the contradiction.
>
> This new approach fixes the issue by adding sizetype iv_cand for BIVs
> directly.  In cases where the original BIV is preferred, the sizetype iv_cand
> will be chosen.  As for code generation, the sizetype iv_cand has the same
> effect as the original BIV.  Actually, it's better because BIV needs to be
> explicitly extended to sizetype to be used in address expression on most
> targets.
>
> One shortcoming of this approach is that it may introduce more iv candidates.
> To minimize the impact, this patch does sophisticated code analysis and adds
> a sizetype candidate for a BIV only if it is used as an index.  Moreover, it
> avoids adding a candidate of the original type if the BIV is only used as an
> index.
> Statistics for compiling spec2k6 shows increase of candidate number is
> modest and can be ignored.
>
> There are two more patches following to fix corner cases revealed by this
> one.  Together they bring an obvious perf improvement for spec2k6/int on
> aarch64.
> Spec2k6/int
> 400.perlbench   3.44%
> 445.gobmk   -0.86%
> 456.hmmer   14.83%
> 458.sjeng   2.49%
> 462.libquantum  -0.79%
> GEOMEAN 1.68%
>
> There is also about 0.36% improvement for spec2k6/fp, mostly because of case
> 436.cactusADM.  I believe it can be further improved, but that should be
> another patch.
>
> I also collected benchmark data for x86_64.  Spec2k6/fp is not affected.  As
> for spec2k6/int, though the geomean is improved slightly, 400.perlbench is
> regressed by ~3%.  I can see BIVs are chosen for some loops instead of
> address candidates.  Generally, the loop header will be simplified because
> iv elimination with BIV is simpler; the number of instructions in loop body
> isn't changed.  I suspect the regression comes from different addressing
> modes.  With BIV, complex addressing mode like [base + index << scale +
> disp] is used, rather than [base + disp].  I guess the former has more
> micro-ops, thus more expensive.  This guess can be confirmed by manually
> suppressing the complex addressing mode with higher address cost.
> Now the problem becomes why overall cost of BIV is computed lower while the
> actual cost is higher.  I noticed for most affected loops, loop header is
> bloated because of iv elimination using the old address candidate.  The
> bloated loop header results in much higher cost than BIV.  As a result, BIV
> is preferred.  I also noticed the bloated loop header generally can be
> simplified (I have a following patch for this).  After applying the local
> patch, the old address candidate is chosen, and most of regression is
> recovered.
> In conclusion, I think the bloated loop header should be blamed for the
> regression, and it can be resolved.
>
> Bootstrapped and tested on x86_64 and aarch64.  It fixes failure of
> gcc.target/i386/pr49781-1.c, without new breakage.
>
> So what do you think?

The data above looks ok to me.

+static struct iv *
+find_deriving_biv_for_iv (struct ivopts_data *data, struct iv *iv)
+{
+  aff_tree aff;
+  struct expand_data exp_data;
+
+  if (!iv->ssa_name || TREE_CODE (iv->ssa_name) != SSA_NAME)
+return iv;
+
+  /* Expand IV's ssa_name till the deriving biv is found.  */
+  exp_data.data = data;
+  exp_data.biv = NULL;
+  tree_to_aff_combination_expand (iv->ssa_name, TREE_TYPE (iv->ssa_name),
+ &aff, &data->name_expansion_cache,
+ stop_expand, &exp_data);
+  return exp_data.biv;

that's actually "abusing" tree_to_aff_combination_expand for simply walking
SSA uses and their defs uses recursively until you hit "stop".  ISTR past
discussion to add a generic walk_ssa_use interface for that.  Not sure if it
materialized with a name I can't remember or whether it didn't.

-  add_candidate (data, iv->base, iv->step, true, NULL);
+  /* Check if this biv is used in address type use.  */
+  if (iv->no_overflow  && iv->have_address_use
+  && INTEGRAL_TYPE_P (TREE_TYPE (iv->base))
+  && TYPE_PRECISION (TREE_TYPE (iv->base)) < TYPE_PRECISION (sizetype))
+{
+  tree type = unsigned_type_for (sizetype);

sizetype is unsigned.

the rest looks ok to me but I really don't like the abuse of
tree_to_aff_combination_expand...

Thanks,
Richard.

> Thanks,
> bin
>
> 2015-08-31  Bin Cheng
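
The situation described in this thread, a basic induction variable narrower
than sizetype that is used as a memory index, can be illustrated at the source
level.  The two functions below are a hypothetical sketch (the names are not
from the patch or the PR testcase): the first uses a 32-bit index that must be
extended to pointer width for each address computation on a 64-bit target, and
the second is roughly what a sizetype candidate for the same induction variable
corresponds to.  It says nothing about what IVOPTS actually emits.

```cpp
#include <cassert>
#include <cstddef>

// 32-bit induction variable used only as an array index.  On a 64-bit
// target the address of a[i] is a + (size_t) i * sizeof (int), so the
// index conceptually needs an extension to pointer width.
inline long
sum_with_int_index (const int *a, int n)
{
  long sum = 0;
  for (int i = 0; i < n; ++i)
    sum += a[i];
  return sum;
}

// The same loop with an induction variable that already has pointer width,
// so no extension is needed inside the address expression.
inline long
sum_with_size_index (const int *a, std::size_t n)
{
  long sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += a[i];
  return sum;
}

// Both forms compute the same result.
inline void
check_index_forms ()
{
  const int data[] = { 1, 2, 3, 4 };
  assert (sum_with_int_index (data, 4) == 10);
  assert (sum_with_size_index (data, 4) == 10);
}
```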

Re: [PATCH] Fix PR66705

2015-09-02 Thread Jan Hubicka

> 
> Hmm, no, ipa-ref list should be enough (unless we start field-sensitive
> analysis or need NULL inits for correctness).  Still have to figure out
> how to walk the list and how the reference would look like (what
> is ref->use?  IPA_REF_ADDR?  can those be speculative?)

Yep, it should be IPA_REF_ADDR.  If you do not check for that, you will
probably also trip IPA_REF_ALIAS as dereference (which would lead to missed
optimization).  If we go field sensitive, we may want to extend ipa references
anyway - that would be useful for ipa-reference and other stuff, too.

Honza
> 
> Richard.
> 
> > > 
> > > Thanks,
> > > Richard.
> > > 
> > > 2015-09-02  Richard Biener  
> > > 
> > >   PR ipa/66705
> > >   * tree-ssa-structalias.c (ctor_for_analysis): New function.
> > >   (create_variable_info_for_1): Use ctor_for_analysis instead
> > >   of get_constructor.
> > >   (create_variable_info_for): Likewise.
> > 
> > Otherwise I would go for making ctor_for_analysis a method of varpool_node 
> > and...
> > > 
> > >   * g++.dg/lto/pr66705_0.C: New testcase.
> > > 
> > > Index: gcc/tree-ssa-structalias.c
> > > ===
> > > --- gcc/tree-ssa-structalias.c(revision 227207)
> > > +++ gcc/tree-ssa-structalias.c(working copy)
> > > @@ -5637,6 +5637,26 @@ check_for_overlaps (vec fiel
> > >return false;
> > >  }
> > >  
> > > +/* We can't use ctor_for_folding as that only returns constant 
> > > constructors.  */
> > > +
> > > +static tree
> > > +ctor_for_analysis (tree decl)
> > > +{
> > > +  varpool_node *node = varpool_node::get (decl);
> > > +  if (!node)
> > > +return error_mark_node;
> > > +  node = node->ultimate_alias_target ();
> > > +  if (DECL_INITIAL (node->decl) != error_mark_node
> > > +  || !in_lto_p)
> > > +return (DECL_INITIAL (node->decl)
> > > + ? DECL_INITIAL (node->decl) : error_mark_node);
> > 
> > I think returning NULL here is just fine. 
> > error_mark_node means constructor is not really available. NULL is
> > the usual way to say that the variable is not initialized.
> > 
> > 
> 
> -- 
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nuernberg)

Re: [Patch, libstdc++] Fix data races in basic_string implementation

2015-09-02 Thread Dmitry Vyukov

Thank you.

Yes, I am covered by the Google copyright assignment, and I have commit access.
I don't commit frequently, hope I didn't mess up things fundamentally :)
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=227403

On Wed, Sep 2, 2015 at 4:08 PM, Jonathan Wakely  wrote:
> On 02/09/15 16:01 +0200, Dmitry Vyukov wrote:
>>
>> Added comment to _M_dispose and restored ChangeLog entry.
>> Please take another look.
>
>
> Thanks, this is OK for trunk.
>
> I assume you are covered by the Google company-wide copyright
> assignment, so someone just needs to commit it, which I can do if you
> like.
>
