Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Uros Bizjak
On Tue, Dec 3, 2013 at 6:53 AM, Konstantin Serebryany
 wrote:

>>> >> with #if LINUX_VERSION_CODE >= 132640
>>> Good idea, let me try that.
>>
>> Had a quick look at this on RHEL 5.
>> Following patch let me compile at least the first source file, but then
>> I run into tons of issues in sanitizer_platform_limits_posix.cc.
>
> That's what I am afraid of. Even if we manage to compile everything,
> there is no guarantee that the code will work.
> I suggest to simply disable libsanitizer build on the older systems
> which is what happens de facto now.
> If there is significant interest in maintaining asan&co on older
> systems (which I have not seen so far),
> then those interested will need to help us in upstream repository (llvm) by
> a) sending us patches using http://llvm.org/docs/Phabricator.html and
> b) setting up a public buildbot (attached to the LLVM master bot) with
> the system they care about.
> If there is no one interested enough to do a) and b) I say we should
> not spend time on this.

IMO, it is also OK for the configure to check for needed features and
disable libsanitizer (perhaps with some informative message) if
minimum requirements are not met. If someone adds workarounds for
those missing features to support older systems, then these checks can
easily be adapted. The problem ATM is, that gcc won't build
out-of-the-box on older distributions, although adding
--disable-libsanitizer manually works OK.
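
As an aside on the quoted #if: the constant 132640 appears to be kernel
2.6.32 in the LINUX_VERSION_CODE encoding. A minimal sketch, assuming the
classic (a << 16) + (b << 8) + c packing; MY_KERNEL_VERSION and
kernel_version_code are names made up here so the example does not depend
on kernel headers:

```c
#include <assert.h>

/* Sketch of the classic kernel version packing: major in bits 16 and up,
   minor in bits 8..15, patch level in bits 0..7.  The names below are
   illustrative only and are not taken from any kernel header.  */
#define MY_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Function form of the same packing, convenient for run-time checks.  */
unsigned int
kernel_version_code (unsigned int major, unsigned int minor,
                     unsigned int patch)
{
  /* Minor and patch level must each fit in their 8-bit field.  */
  assert (minor < 256 && patch < 256);
  return MY_KERNEL_VERSION (major, minor, patch);
}
```

With this packing, 2.6.32 encodes to 132640, so where the kernel's own
KERNEL_VERSION macro is available the check could be spelled in terms of
the version triple rather than the raw number.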

> And this discussion does not affect the merge since nothing that works
> today will get broken, right?

Yes, that's right.

Uros.


Re: [PATCH] Fix SSE (pre-AVX) alignment handling (PR target/59163)

2013-12-02 Thread Jeff Law

On 12/02/13 15:58, Jakub Jelinek wrote:

Hi!

As discussed in the PR, combiner can combine e.g. unaligned integral
load (e.g. TImode) together with some SSE instruction that requires aligned
load, but doesn't actually check it.  For AVX, most of the instructions
actually allow unaligned operands, except for a few vmov* instructions where
the patterns typically handle the misaligned mems through misaligned_operand
checks, and some nontemporal move insns that have UNSPECs that should
prevent combination.  The following patch attempts to solve this by
rejecting combining of unaligned memory loads/stores into SSE insns that
don't allow it.  I've added ssememalign attribute for that, but actually
only later on realized that even for the insns which load/store < 16 byte
memory values if strict alignment checking isn't turned on in hw, the
arguments don't have to be aligned at all, so perhaps instead of
ssememalign in bits all we could have is a boolean attribute whether
insn requires for pre-AVX memory operands to be as aligned as their mode, or
not (with default that it does require that).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-12-02  Jakub Jelinek  
Uros Bizjak  

PR target/59163
* config/i386/i386.c (ix86_legitimate_combined_insn): If for
!TARGET_AVX there is misaligned MEM operand with vector mode
and get_attr_ssememalign is 0, return false.
(ix86_expand_special_args_builtin): Add get_pointer_alignment
computed alignment and for non-temporal loads/stores also
at least GET_MODE_ALIGNMENT as MEM_ALIGN.
* config/i386/sse.md
(_loadu,
_storeu,
_loaddqu,
_storedqu, _lddqu,
sse_vmrcpv4sf2, sse_vmrsqrtv4sf2, sse2_cvtdq2pd, sse_movhlps,
sse_movlhps, sse_storehps, sse_loadhps, *vec_interleave_highv2df,
*vec_interleave_lowv2df, *vec_extractv2df_1_sse, sse2_movsd,
sse4_1_v8qiv8hi2, sse4_1_v4qiv4si2,
sse4_1_v4hiv4si2, sse4_1_v2qiv2di2,
sse4_1_v2hiv2di2, sse4_1_v2siv2di2, sse4_2_pcmpestr,
*sse4_2_pcmpestr_unaligned, sse4_2_pcmpestri, sse4_2_pcmpestrm,
sse4_2_pcmpestr_cconly, sse4_2_pcmpistr, *sse4_2_pcmpistr_unaligned,
sse4_2_pcmpistri, sse4_2_pcmpistrm, sse4_2_pcmpistr_cconly): Add
ssememalign attribute.
* config/i386/i386.md (ssememalign): New define_attr.

* g++.dg/torture/pr59163.C: New test.

OK for the trunk.

I doubt it's worth doing anything special for the case where strict 
alignment on SSE stuff is turned off.  Unless someone is screaming for it.


I'm trusting that Uros & yourself actually have the right alignments in 
the sse.md changes.  I didn't look at those closely.
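
The rule stated in the ChangeLog entry for ix86_legitimate_combined_insn
can be modelled roughly as follows. This is only a sketch under stated
assumptions: the function and parameter names are hypothetical, alignments
are in bits, and the real code works on RTL operands rather than plain
integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the check described for
   ix86_legitimate_combined_insn.  All names are hypothetical.
   A ssememalign of 0 stands for "no relaxed alignment", i.e. the
   memory operand must be as aligned as its mode.  */
bool
combined_mem_operand_ok (bool target_avx, unsigned int mem_align,
                         unsigned int mode_align, unsigned int ssememalign)
{
  assert (mode_align != 0);

  /* Per the description, most AVX encodings accept unaligned operands.  */
  if (target_avx)
    return true;

  /* Pre-AVX: reject a misaligned vector MEM unless the insn carries a
     relaxed ssememalign attribute.  */
  if (mem_align < mode_align && ssememalign == 0)
    return false;

  return true;
}
```

For example, a byte-aligned TImode load folded into a pre-AVX insn with
the default attribute would be rejected, while the same load into an insn
marked with a non-zero ssememalign would be allowed.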


jeff


Re: [PATCH] Fix up cmove expansion (PR target/58864, take 2)

2013-12-02 Thread Jeff Law

On 12/02/13 15:51, Jakub Jelinek wrote:

Hi!

On Sat, Nov 30, 2013 at 12:38:30PM +0100, Eric Botcazou wrote:

Rather than adding do_pending_stack_adjust () in all the places, especially
when it isn't clear whether emit_conditional_move will be called at all and
whether it will actually do do_pending_stack_adjust (), I chose to add
two new functions to save/restore the pending stack adjustment state,
so that when instruction sequence is thrown away (either by doing
start_sequence/end_sequence around it and not emitting it, or
delete_insns_since) the state can be restored, and have changed all the
places that IMHO need it for emit_conditional_move.


Why not do it in emit_conditional_move directly then?  The code thinks it's
clever to do:

   do_pending_stack_adjust ();
   last = get_last_insn ();
   prepare_cmp_insn (XEXP (comparison, 0), XEXP (comparison, 1),
GET_CODE (comparison), NULL_RTX, unsignedp, OPTAB_WIDEN,
&comparison, &cmode);
[...]
   delete_insns_since (last);
   return NULL_RTX;

but apparently not, so why not delete the stack adjustment as well and restore
the state afterwards?


On Sat, Nov 30, 2013 at 09:10:35AM +0100, Richard Biener wrote:

The idea is good but I'd like to see a struct rather than an array for the
storage.


So, this patch attempts to include both of the proposed changes.
Furthermore, I've noticed that calls.c has been saving/restoring those
two values by hand, so now it can use the new APIs for that too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

What about 4.8 branch?  I could create an alternative patch for 4.8,
keep everything as is and just save/restore the two fields by hand in
emit_conditional_move like calls.c used to do it.

2013-12-02  Jakub Jelinek  

PR target/58864
* dojump.c (save_pending_stack_adjust, restore_pending_stack_adjust):
New functions.
* expr.h (struct saved_pending_stack_adjust): New type.
(save_pending_stack_adjust, restore_pending_stack_adjust): New
prototypes.
* optabs.c (emit_conditional_move): Call save_pending_stack_adjust
and get_last_insn before do_pending_stack_adjust, call
restore_pending_stack_adjust after delete_insns_since.
* expr.c (expand_expr_real_2): Don't call do_pending_stack_adjust
before calling emit_conditional_move.
* expmed.c (expand_sdiv_pow2): Likewise.
* calls.c (expand_call): Use {save,restore}_pending_stack_adjust.

* g++.dg/opt/pr58864.C: New test.

OK.

Branch maintainers call on how they want to deal with this in 4.8.
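
The save/restore idea from the ChangeLog can be sketched in isolation. The
two globals below are stand-ins for the expander state, and the field
names are assumptions made for this example; the actual patch may differ
in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the two pieces of expander state.  In GCC these are
   per-function data; plain globals are used here for illustration.  */
int pending_stack_adjust;
int stack_pointer_delta;

/* Snapshot of both values, kept in a struct rather than an array.
   The field names are assumed for this sketch.  */
struct saved_pending_stack_adjust
{
  int x_pending_stack_adjust;
  int x_stack_pointer_delta;
};

/* Record the current state so a discarded insn sequence can be undone.  */
void
save_pending_stack_adjust (struct saved_pending_stack_adjust *save)
{
  assert (save != NULL);
  save->x_pending_stack_adjust = pending_stack_adjust;
  save->x_stack_pointer_delta = stack_pointer_delta;
}

/* Put back the state recorded by save_pending_stack_adjust.  */
void
restore_pending_stack_adjust (const struct saved_pending_stack_adjust *save)
{
  assert (save != NULL);
  pending_stack_adjust = save->x_pending_stack_adjust;
  stack_pointer_delta = save->x_stack_pointer_delta;
}
```

A caller would save before emitting a speculative sequence and restore
only on the path where that sequence is thrown away.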

jeff



Re: [PING] [PATCH] Optional alternative base_expr in finding basis for CAND_REFs

2013-12-02 Thread Jeff Law

On 12/02/13 08:47, Yufeng Zhang wrote:

Ping~

http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03360.html




Thanks,
Yufeng

On 11/26/13 15:02, Yufeng Zhang wrote:

On 11/26/13 12:45, Richard Biener wrote:

On Thu, Nov 14, 2013 at 12:25 AM, Yufeng
Zhang   wrote:

On 11/13/13 20:54, Bill Schmidt wrote:

The second version of your original patch is ok with me with the
following changes.  Sorry for the little side adventure into the
next-interp logic; in the end that's going to hurt more than it
helps in
this case.  Thanks for having a look at it, anyway.  Thanks also for
cleaning up this version to be less intrusive to common interfaces; I
appreciate it.



Thanks a lot for the review.  I've attached an updated patch with the
suggested changes incorporated.

For the next-interp adventure, I was quite happy to do the
experiment; it's
a good chance of gaining insight into the pass.  Many thanks for
your prompt
replies and patience in guiding!



Everything else looks OK to me.  Please ask Richard for final
approval,
as I'm not a maintainer.
First a note, I need to check on voting for Bill as the slsr maintainer 
from the steering committee.   Voting was in progress just before the 
close of stage1 development so I haven't tallied the results :-)




Yes, you are right that non-trivial 'base' trees are rarely shared.
   The cache is introduced mainly because get_alternative_base () may be
called twice on the same 'base' tree, once in the
find_basis_for_candidate () for look-up and the other time in
alloc_cand_and_find_basis () for record_potential_basis ().  I'm happy
to leave out the cache if you think the benefit is trivial.
Without some sense of how expensive the lookups are vs how often the 
cache hits it's awful hard to know if the cache is worth it.


I'd say take it out unless you have some sense it's really saving time. 
 It's a pretty minor implementation detail either way.






+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-slsr" } */
+
+typedef int arr_2[50][50];
+
+void foo (arr_2 a2, int v1)
+{
+  int i, j;
+
+  i = v1 + 5;
+  j = i;
+  a2 [i-10] [j] = 2;
+  a2 [i] [j++] = i;
+  a2 [i+20] [j++] = i;
+  a2 [i-3] [i-1] += 1;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "MEM" 5 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */

scanning for 5 MEMs looks non-sensical.  What transform do
you expect?  I see other slsr testcases do similar non-sensical
checking which is bad, too.


As the slsr optimizes CAND_REF candidates by simply lowering them to
MEM_REF from e.g. ARRAY_REF, I think scanning for the number of MEM_REFs
is an effective check.  Alternatively, I can add a follow-up patch to
add some dumping facility in replace_ref () to print out the replacing
actions when -fdump-tree-slsr-details is on.
I think adding some details to the dump and scanning for them would be 
better.  That's the only change that is required for this to move forward.


I suggest doing it quickly.  We're well past stage1 close at this point.

jeff



Re: [RFC][LIBGCC][2 of 2] 64 bit divide implementation for processor without hw divide instruction

2013-12-02 Thread Kugan
ping

Thanks,
Kugan

On 27/11/13 15:30, Kugan wrote:
> On 27/11/13 02:07, Richard Earnshaw wrote:
>> On 23/11/13 01:54, Kugan wrote:
> 
> [snip]
> 
>>> +2013-11-22  Kugan Vivekanandarajah  
>>> +
>>> +   * libgcc/config/arm/pbapi-lib.h (HAVE_NO_HW_DIVIDE): Define for
>>
>> It's bpabi-lib.h
> 
> Thanks for the review.
> 
>>> +   __ARM_ARCH_7_A__.
>>> +
>>>
>>>
>>
>> No, this will:
>> 1) Do the wrong thing for Cortex-a7, A12 and A15 (which all have HW
>> divide, and currently define __ARM_ARCH_7_A__).
>> 2) Do the wrong thing for v7-M and v7-R devices, which have Thumb HW
>> division instructions.
>> 3) Do the wrong thing for all pre-v7 devices, which don't have HW division.
>>
>> I think the correct solution is to test !defined(__ARM_ARCH_EXT_IDIV__)
> 
> I understand it now and updated the code as attached.
> 
> +2013-11-27  Kugan Vivekanandarajah  
> +
> + * config/arm/bpapi-lib.h (TARGET_HAS_NO_HW_DIVIDE): Define for
> + architectures that do not have hardware divide instruction.
> + i.e. architectures that do not define __ARM_ARCH_EXT_IDIV__.
> +
> 
> 
> Is this OK for trunk now?
> Thanks,
> Kugan
> 
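
The suggested test on __ARM_ARCH_EXT_IDIV__ might look like the sketch
below. TARGET_HAS_NO_HW_DIVIDE is the name used in the ChangeLog entry
above; uses_software_divide is a hypothetical helper added only so the
result can be observed, and the value 1 given to the macro is an
assumption.

```c
#include <assert.h>

/* Sketch only: define the "no hardware divide" marker whenever the
   compiler does not advertise integer divide instructions.  */
#if !defined (__ARM_ARCH_EXT_IDIV__)
#define TARGET_HAS_NO_HW_DIVIDE 1
#endif

/* Hypothetical helper: returns 1 when the software divide path would
   be selected, 0 otherwise.  */
int
uses_software_divide (void)
{
#ifdef TARGET_HAS_NO_HW_DIVIDE
  return 1;
#else
  return 0;
#endif
}
```

Keying on the feature macro rather than on __ARM_ARCH_7_A__ is what
covers the three cases listed in the review: cores with the divide
instructions, v7-M/v7-R devices, and pre-v7 devices.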



Re: [PATCH] _Cilk_for for C and C++

2013-12-02 Thread Jeff Law

On 11/27/13 17:52, Jason Merrill wrote:

On 11/27/2013 04:14 PM, Iyer, Balaji V wrote:

I completely agree with you that there are certain parts of Cilk
Plus that are similar to OMP4, namely #pragma simd and SIMD-enabled
functions (formerly called elemental functions). But the Cilk
keywords are almost completely orthogonal to OpenMP. They are
semantically different  and one cannot be transformed to another. Cilk
uses automatically load-balanced work-stealing using the Cilk runtime,
whereas OMP uses work sharing via OMP runtime. There are a number of
other semantic differences but this is the core-issue. #pragma simd
and #pragma omp have converged in several places but the Cilk part has
always been different from OpenMP.


Yes, Cilk for loops will use the Cilk runtime and OMP for loops will use
the OMP runtime, but that doesn't mean they can't share a lot of the
middle end code along the way.

We already have several different varieties of parallel/simd loops all
represented by GIMPLE_OMP_FOR, and I think this could be another
GP_OMP_FOR_KIND_.
Right.  It's not a question of what runtime they call back into, but 
that both share a common overall structure.


Conceptually I look at a for loop as having 4 main components.

Initializer, test condition, increment and the body.

I'd like to hope things like the syntactic & semantic analysis of the 
first three would be largely the same.  Most of the Cilk specific bits 
would be in the handling of the body -- but there may be some 
significant code sharing that can happen there too.





...

As you can tell, this is not how openmp handles a #pragma omp for loop.


It's different in detail, but #pragma omp parallel for works very
similarly: it creates a separate function for the body of the loop and
then passes that to GOMP_parallel along with any shared data.

My thoughts exactly.
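
The shared structure being discussed, a loop body outlined into its own
function and handed to a runtime together with the shared data, can be
sketched without either runtime. Everything below is hypothetical:
run_chunks is a serial stand-in and does not have the signature of
GOMP_parallel or of any Cilk entry point.

```c
#include <assert.h>
#include <stddef.h>

/* Shared data that the outlined body needs.  */
struct loop_data
{
  int *a;
  int scale;
};

/* Outlined loop body: one call handles iterations [start, end).  */
void
loop_body (void *arg, long start, long end)
{
  struct loop_data *d = arg;
  for (long i = start; i < end; i++)
    d->a[i] *= d->scale;
}

/* Hypothetical stand-in for a runtime entry point.  It runs the chunks
   one after another; a real runtime would hand them to worker threads.  */
void
run_chunks (void (*fn) (void *, long, long), void *data, long n,
            long nchunks)
{
  assert (fn != NULL && nchunks > 0);
  long step = (n + nchunks - 1) / nchunks;
  for (long start = 0; start < n; start += step)
    {
      long end = start + step < n ? start + step : n;
      fn (data, start, end);
    }
}

/* What "for (i = 0; i < n; i++) a[i] *= scale;" becomes after outlining.  */
void
scale_array (int *a, long n, int scale)
{
  struct loop_data d = { a, scale };
  run_chunks (loop_body, &d, n, 4);
}
```

Only run_chunks would differ between a work-sharing and a work-stealing
runtime in this picture; the outlining step and the data record are
common to both.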
jeff


Re: [patch] combine ICE fix

2013-12-02 Thread Jeff Law

On 11/27/13 17:13, Cesar Philippidis wrote:


I looked into adding support for incremental DF scanning from df*.[ch]
in combine but there are a couple of problems. First of all, combine
does its own DF analysis. It does so because its usage falls under this
category (df-core.c):

c) If the pass modifies insns several times, this incremental
   updating may be expensive.

Furthermore, combine's DF relies on the DF scanning to be deferred, so
the DF_REF_DEF_COUNT values would be off. Eg, calls to SET_INSN_DELETED
take place before it updates the notes for those insns. Also, combine
has a tendency to undo its changes occasionally.
I think at this stage of the release cycle, converting combine to 
incremental DF is probably a no-go.  However, we should keep it in mind 
for the future -- while hairy I'd really like to see that happen in the 
long term.


As for moving this issue forward, I think the REG_N_SETs uses in combine 
are valid/legitimate, so the real question in my mind is whether or not 
to turn this into a vec or leave it as an array.


I'd lean for the former -- turning it into a vec gets us bounds checking 
which probably would have caught this problem earlier.


This would also need a testcase for the testsuite.  Presumably with a 
vec version that won't be hard to create since you should get an 
out-of-bounds error on the array indexing rather waiting for a segfault.
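
To illustrate why a bounds-checked container would surface the problem
earlier, here is a minimal sketch. It is not GCC's vec; checked_counts and
checked_at are hypothetical names, and the accessor reports a bad index by
returning NULL where a checking build of GCC would instead abort.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical length-carrying table; not GCC's vec.  */
struct checked_counts
{
  size_t len;
  int *data;
};

/* Return a pointer to element I, or NULL when I is out of bounds.
   A raw array access would silently read past the end here.  */
int *
checked_at (struct checked_counts *v, size_t i)
{
  assert (v != NULL);
  return i < v->len ? &v->data[i] : NULL;
}
```

With the length stored next to the data, an index one past the end is
reported at the access itself instead of showing up later as a segfault.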


jeff



Re: [PING^2] [PATCH] PR59063

2013-12-02 Thread Yury Gribov

> Is this still necessary after HJ's patch?

Frankly I don't have access to non-sanitizer-enabled platform but if I 
manually disable it in libsanitizer/configure, I start getting Asan test 
errors which are similar to e.g. 
http://gcc.gnu.org/ml/gcc-testresults/2013-12/msg00189.html 
(i386-unknown-freebsd10.0 from Dec 2). I hope Andreas can confirm 
whether we still need this patch.


-Y

---
From: Jeff Law 
Sent:  Tuesday, December 03, 2013 9:01AM
To: Yury Gribov , Andreas Schwab 

Cc: Jakub Jelinek , gcc-patches@gcc.gnu.org, 
eugeni.stepa...@gmail.com, VandeVondele Joost 
, Evgeny Gavrin , 
Viacheslav Garbuzov 

Subject: Re: [PING^2] [PATCH] PR59063
On 12/03/2013 09:01 AM, Jeff Law wrote:
On 12/01/13 23:12, Yury Gribov wrote:

 > This is causing all the tests being run on all targets,
 > even if libsanitizer is not supported,
 > most of them failing due to link errors.

Thanks for the info and sorry about this. I should probably check
non-sanitized platforms as well before committing patches. Does the
attached patch make sense to you? Worked for me on x64 and x64 with
manually disabled libsanitizer.

[ ... ]
Is this still necessary after HJ's patch?

jeff





Re: [patch] Fix failure of ACATS c52102c

2013-12-02 Thread Jeff Law

On 11/30/13 10:24, Eric Botcazou wrote:

Hi,

this test started to fail very recently on 32-bit platforms with 64-bit HWI.
Not sure exactly why, but the issue is straightforward and was latent.

For the following reference, a call to ao_ref_init_from_ptr_and_size yields:

(gdb) p debug_generic_expr((tree_node *) 0x76e01200)
&a[0 ...]{lb: 4294967292 sz: 4}
(gdb) p debug_generic_expr(size)
20
(gdb) p dref
$36 = {ref = 0x0, base = 0x76dfd260, offset = -137438953344, size = 160,
   max_size = 160, ref_alias_set = 0, base_alias_set = 0, volatile_p = false}

The offset is bogus.  'a' is an array with lower bound -4 so {lb: 4294967292
sz: 4} is actually {lb: -4 sz: 4}.  The computation of the offset goes wrong
in get_addr_base_and_unit_offset_1 because it is not done in sizetype.

Fixed by copying the relevant bits from get_ref_base_and_extent, where the
computation is correctly done in sizetype.
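
The bogus offset can be reproduced with plain integer arithmetic. A
minimal sketch, assuming a 32-bit index type, 4-byte elements and offsets
counted in bits; the function names are hypothetical and this is not the
tree-dfa.h code:

```c
#include <assert.h>
#include <stdint.h>

/* Buggy variant: index and lower bound are widened as unsigned values,
   so a lower bound of -4 is seen as 4294967292.  */
int64_t
offset_bits_zero_extended (uint32_t index, uint32_t low_bound,
                           uint32_t elt_size)
{
  assert (elt_size > 0);
  return ((int64_t) index - (int64_t) low_bound) * (int64_t) elt_size * 8;
}

/* Fixed variant: subtract in the 32-bit precision of the index type
   (wrapping modulo 2^32), then sign-extend the difference.  */
int64_t
offset_bits_index_precision (uint32_t index, uint32_t low_bound,
                             uint32_t elt_size)
{
  assert (elt_size > 0);
  uint32_t diff = index - low_bound;
  return (int64_t) (int32_t) diff * (int64_t) elt_size * 8;
}
```

With index 0 and the lower bound bit pattern 4294967292, the first
function gives -137438953344, the value in the gdb dump, while the second
gives 128 bits, i.e. 16 bytes past the start of the array.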

Tested on x86_64-suse-linux, OK for the mainline?


2013-11-30  Eric Botcazou  

* tree-dfa.h (get_addr_base_and_unit_offset_1) : Do the
offset computation using the precision of the index type.


2013-11-30  Eric Botcazou  

* gnat.dg/opt30.adb: New test.

Ok.

Thanks,
Jeff


Re: [PING][PATCH] LRA: check_rtl modifies RTL instruction stream

2013-12-02 Thread Jeff Law

On 11/28/13 16:50, Alan Modra wrote:


This is due to that innocuous seeming change of setting
lra_in_progress before calling check_rtl(), in combination with
previous changes Vlad made to the rs6000 backend here:
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02208.html
In particular the "Call legitimate_constant_pool_address_p in strict
mode for LRA" change, that sets "strict" when lra_in_progress.

Is this still an issue?

That code has gone through a couple revisions.  Robert's change was 
reverted and Vlad twiddled things to use recog_memoized instead of 
insn_invalid_p which prevents check_rtl from incorrectly adding CLOBBERs 
after the point where an insn's form is supposed to be fixed.


jeff



Re: [PING] [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-12-02 Thread Jeff Law

On 11/28/13 03:24, Bernd Edlinger wrote:

Hi,

On Wed, 27 Nov 2013 12:07:16, Jeff Law wrote:


On 11/27/13 05:29, Bernd Edlinger wrote:

Hi,

ping...

this patch still open: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02291.html

Note: it does, as it is, _not_ depend on the keep_aligning patch.
And it would fix some really nasty wrong code generation issues.

Is there a testcase for this problem?


Yes,
the patch contains two test cases, one for

struct S { V a; V b[0]; } P __attribute__((aligned (1)))
and another for

struct S { V b[1]; } P __attribute__((aligned (1)))


V can be anything that has a movmisalign_optab or is SLOW_UNALIGNED_ACCESS

If V::b is used as flexible array, reading p->b[1] gives garbage.

We tried hard, to fix this in stor-layout.c by not using the mode of V
for struct S, but this created ABI-fallout. So currently the only possible
way to fix it seems to be in expansion, by letting expand_real_1 know that
we need a memory reference, even if it may be unaligned.



My recommendation is to start a separate thread which focuses on this
issue and only this issue.



If there are more questions of general interest, please feel free to start
in a new thread.
Well, my point is there's this thread that deals with multiple issues; 
the message with the patch itself references prior versions of the patch 
and sorting out any discussion that relates strictly to the outstanding 
patch is nontrivial.


Hence my request to repost the patch as a new independent thread. 
Include the testcase and a reference to any current bugzilla bugs.




jeff



Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Konstantin Serebryany
On Tue, Dec 3, 2013 at 2:32 AM, Jakub Jelinek  wrote:
> On Mon, Dec 02, 2013 at 05:59:53PM +0100, Konstantin Serebryany wrote:
>> >> with #if LINUX_VERSION_CODE >= 132640
>> Good idea, let me try that.
>
> Had a quick look at this on RHEL 5.
> Following patch let me compile at least the first source file, but then
> I run into tons of issues in sanitizer_platform_limits_posix.cc.

That's what I am afraid of. Even if we manage to compile everything,
there is no guarantee that the code will work.
I suggest to simply disable libsanitizer build on the older systems
which is what happens de facto now.
If there is significant interest in maintaining asan&co on older
systems (which I have not seen so far),
then those interested will need to help us in upstream repository (llvm) by
a) sending us patches using http://llvm.org/docs/Phabricator.html and
b) setting up a public buildbot (attached to the LLVM master bot) with
the system they care about.
If there is no one interested enough to do a) and b) I say we should
not spend time on this.

And this discussion does not affect the merge since nothing that works
today will get broken, right?





>
> I think the main problem is that you are mixing standard glibc headers and
> linux kernel headers in the same source file, that is a big no no.
> Lots of the kernel headers declare the same things as glibc headers.
>
> I'd strongly recommend splitting the files, so that you include absolute 
> minimum of
> glibc headers when you include linux/* and/or asm/* headers and no kernel 
> headers
> if you include tons of glibc headers.
> And as the errors show up, there are also .cfi* directives that are used
> unconditionally (you've said you've removed it from sanitizer_common or where 
> it
> was used (IMHO a pity, much better would be conditionalizing them on either 
> compiler
> preprocessor macros or whatever clang provides as alternative for that when 
> not building
> with gcc)), but they are used in tsan (in HACKY_CALL macro).  Plus in *.S file
> (either that could be again guarded by the same preprocessor macro, or 
> configure or
> something else).  Note that RHEL5 here has already gas that supports .cfi_* 
> directives
> (just not .cfi_personality/.cfi_lsda I think), but if you go to even older 
> system
> it will not be there.  E.g. glibc assembler files solve that by defining 
> various
> CFI_STARTPROC etc. macros that either expand to .cfi_startproc etc. if 
> assembler
> supports the directives, or nothing otherwise.
>
> --- sanitizer_platform_limits_linux.cc.jj   2013-12-02 15:27:58.0 
> -0500
> +++ sanitizer_platform_limits_linux.cc  2013-12-02 17:06:19.0 -0500
> @@ -51,8 +51,12 @@
>  #endif
>
>  #if !SANITIZER_ANDROID
> +#include 
> +//  has been added in 2.6.32
> +#if LINUX_VERSION_CODE >= 132640
>  #include 
>  #endif
> +#endif
>
>  namespace __sanitizer {
>unsigned struct_statfs64_sz = sizeof(struct statfs64);
> @@ -75,15 +79,18 @@ CHECK_SIZE_AND_OFFSET(io_event, res);
>  CHECK_SIZE_AND_OFFSET(io_event, res2);
>
>  #if !SANITIZER_ANDROID
> +#if LINUX_VERSION_CODE >= 132640
>  COMPILER_CHECK(sizeof(struct __sanitizer_perf_event_attr) <=
> sizeof(struct perf_event_attr));
>  CHECK_SIZE_AND_OFFSET(perf_event_attr, type);
>  CHECK_SIZE_AND_OFFSET(perf_event_attr, size);
>  #endif
> +#endif
>
>  COMPILER_CHECK(iocb_cmd_pread == IOCB_CMD_PREAD);
>  COMPILER_CHECK(iocb_cmd_pwrite == IOCB_CMD_PWRITE);
> -#if !SANITIZER_ANDROID
> +#if !SANITIZER_ANDROID && LINUX_VERSION_CODE >= 132627
> +// IOCB_CMD_PREADV/PWRITEV has been added in 2.6.19
>  COMPILER_CHECK(iocb_cmd_preadv == IOCB_CMD_PREADV);
>  COMPILER_CHECK(iocb_cmd_pwritev == IOCB_CMD_PWRITEV);
>  #endif
> --- sanitizer_platform_limits_posix.cc.jj   2013-12-02 15:27:58.0 
> -0500
> +++ sanitizer_platform_limits_posix.cc  2013-12-02 17:11:00.0 -0500
> @@ -82,12 +82,16 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
> +//  has been added in 2.6.26
> +#if LINUX_VERSION_CODE >= 132634
>  #include 
> +#endif
>  #include 
>  #include 
>  #include 
>
> So the current errors are (from make -j64 -k to show more than one file):
> In file included from /usr/include/sys/ustat.h:30:0,
>  from 
> ../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:84:
> /usr/include/bits/ustat.h:25:8: error: redefinition of ‘struct ustat’
>  struct ustat
> ^
> In file included from /usr/include/linux/if_ether.h:24:0,
>  from /usr/include/netinet/if_ether.h:26,
>  from /usr/include/netinet/ether.h:26,
>  from 
> ../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:47:
> /usr/include/linux/types.h:156:8: error: previous definition of ‘struct ustat’
>  struct ustat {
> ^
> In file included from /usr/include/linux/mroute.h:5:0,
>  from 
> ../../../../libsanitizer/sanitizer_

Re: [PATCH] Fix nested function ICE with VLAs (PR middle-end/59011)

2013-12-02 Thread Jeff Law

On 12/02/13 16:02, Jakub Jelinek wrote:

Hi!

tree-nested.c uses declare_vars with last argument true, which
relies on BLOCK_VARS of gimple_bind_block being a tail
of the gimple_bind_vars chain.  But unfortunately a debug info
improvement I've added to gimplify_var_or_parm_decl 4 years ago
violates this assumption, in that it adds some VAR_DECLs
at the head of BLOCK_VARS (DECL_INITIAL (current_function_decl))
chain, but doesn't adjust gimple_bind_vars chain correspondingly.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk/4.8?

2013-12-02  Jakub Jelinek  

PR middle-end/59011
* gimplify.c (nonlocal_vla_vars): New variable.
(gimplify_var_or_parm_decl): Put VAR_DECLs for VLAs into
nonlocal_vla_vars chain.
(gimplify_body): Call declare_vars on nonlocal_vla_vars chain
if outer_bind has DECL_INITIAL (current_function_decl) block.

* gcc.dg/pr59011.c: New test.
OK for the trunk.  Branch maintainers have final call on their 
respective branches.


jeff



Re: [PING^2] [PATCH] PR59063

2013-12-02 Thread Jeff Law

On 12/01/13 23:12, Yury Gribov wrote:

 > This is causing all the tests being run on all targets,
 > even if libsanitizer is not supported,
 > most of them failing due to link errors.

Thanks for the info and sorry about this. I should probably check
non-sanitized platforms as well before committing patches. Does the
attached patch make sense to you? Worked for me on x64 and x64 with
manually disabled libsanitizer.

[ ... ]
Is this still necessary after HJ's patch?

jeff


Re: [PATCH] Fix recent regression in tree-object-size.c (PR tree-optimization/59362)

2013-12-02 Thread Jeff Law

On 12/02/13 16:04, Jakub Jelinek wrote:

Hi!

Recent change to tree-object-size.c to fold stmts with immediate uses
of __builtin_object_size result broke the pass, because it now can
create new SSA_NAMEs and the code wasn't expecting that to happen.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2013-12-02  Jakub Jelinek  

PR tree-optimization/59362
* tree-object-size.c (object_sizes): Change into array of
vec.
(compute_builtin_object_size): Check computed bitmap for
non-NULL instead of object_sizes.  Call safe_grow on object_sizes
vector if new SSA_NAMEs appeared.
(init_object_sizes): Check computed bitmap for non-NULL.
Call safe_grow on object_sizes elements instead of initializing
it with XNEWVEC.
(fini_object_sizes): Call release on object_sizes elements, don't
set it to NULL.

* gcc.c-torture/compile/pr59362.c: New test.

OK.
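
The "grow the table when new SSA_NAMEs appear" idea can be sketched with a
plain growable array. This is only an illustration: size_table and
size_table_grow are hypothetical names, realloc stands in for the vec
machinery, and the fill value for new slots is an assumption of the
example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-name size table indexed by a version number.  */
struct size_table
{
  size_t len;
  unsigned long *sizes;
};

/* Grow T to at least NEW_LEN entries, filling new slots with FILL.
   Returns 0 on success, -1 on allocation failure (T is unchanged).  */
int
size_table_grow (struct size_table *t, size_t new_len, unsigned long fill)
{
  assert (t != NULL);
  if (new_len <= t->len)
    return 0;

  unsigned long *p = realloc (t->sizes, new_len * sizeof *p);
  if (p == NULL)
    return -1;

  /* Existing entries are preserved; only the tail is initialized.  */
  for (size_t i = t->len; i < new_len; i++)
    p[i] = fill;

  t->sizes = p;
  t->len = new_len;
  return 0;
}
```

A pass that may create new names mid-flight would call the grow function
before indexing with a version number it has not seen before.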
Jeff



Re: patch for elimination to SP when it is changed in RTL (PR57293)

2013-12-02 Thread Jeff Law

On 12/02/13 16:44, Vladimir Makarov wrote:



   First of all, it is a bad situation for code performance when IRA
decides that it can use frame pointer for allocation, and after that
LRA/reload decides that frame pointer can not be used and spills all
pseudos assigned to FP.  The generated code will be much worse than
one generated if we decided not to use FP from the IRA start.
Yup.  Actually, I think we had the same problem with the old 
local/global/reload allocator as well.


I don't recall the specifics, but I think the problem was global thought 
it could eliminate FP, but reload didn't and as a result code generation 
suffered.


I don't recall ever auditing ports to see if they were vulnerable to 
this class of problem.  So there may be others that might trigger 
the assert.




2013-12-02  Vladimir Makarov  

 * config/aarch64/aarch64.c (aarch64_frame_pointer_required): Check
 LR_REGNUM.
 (aarch64_can_eliminate): Don't check elimination source when
 frame_pointer_requred is false.

s/frame_pointer_requred/frame_pointer_required in the ChangeLog entry.

jeff



Re: Patch ping (stage1-ish patches)

2013-12-02 Thread Jeff Law

On 11/28/13 00:17, Jakub Jelinek wrote:

On Wed, Nov 27, 2013 at 01:11:59PM -0700, Jeff Law wrote:

On 11/27/13 00:36, Jakub Jelinek wrote:


Use libbacktrace for libsanitizer's symbolization (will need tweaking,
depending on next libsanitizer merge, whether the corresponding
sanitizer_common changes are upstreamed or not, and perhaps to compile
libbacktrace sources again with renamed function names and other tweaks
- different allocator, only subset of files, etc.; but, there is a P1
bug for this anyway):
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02055.html

Isn't libsanitizer maintained outside GCC?  In which case making
significant changes of this nature ought to be avoided.


libsanitizer contains some files imported from upstream (pretty much all of
*.cc and *.h) and the rest (configury/Makefiles etc.) is owned by GCC, as
the LLVM buildsystem is very different.
OK.  I actually kind of came to the same conclusion while looking at 
other sanitizer library patches.


The changes to the *.cc/*.h files actually have been committed to upstream,
so a next merge from upstream will bring those changes automatically and
we'll just need the build system etc. changes.  When that happens (I think
Kostya said he'll work on that), I'll update the patch accordingly.

OK.  Go ahead and check it in then.

Thanks for clarifying things,
jeff


Re: [PATCH, testsuite] Fix some testcases for nds32 target and provide new nds32 target specific tests

2013-12-02 Thread Jeff Law

On 11/28/13 03:03, Chung-Ju Wu wrote:

Hi, Mike,

There is a pending testsuite patch for nds32 target:
 http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01584.html

Is it OK for trunk? :)


Best regards,
jasonwucj


2013/11/14 Chung-Ju Wu :


I would like to modify some testcases for nds32 target.
Also I have some nds32 target specific tests which were
suggested by Joseph earlier:
   http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00396.html

The patch is attached and a ChangeLog is as below:

gcc/testsuite/
2013-11-14  Chung-Ju Wu  

 * g++.dg/other/PR23205.C: Skip for nds32*-*-*.
 * g++.dg/other/pr23205-2.C: Skip for nds32*-*-*.
 * gcc.dg/20020312-2.c: Add __nds32__ case.
 * gcc.dg/builtin-apply2.c: Skip for nds32*-*-*.
 * gcc.dg/lower-subreg-1.c: Skip for nds32*-*-*.
 * gcc.dg/sibcall-3.c: Expected fail for nds32*-*-*.
 * gcc.dg/sibcall-4.c: Expected fail for nds32*-*-*.
 * gcc.dg/stack-usage-1.c (SIZE): Define case for __nds32__.
 * gcc.dg/torture/pr37868.c: Skip for nds32*-*-*.
 * gcc.dg/torture/stackalign/builtin-apply-2.c: Skip for nds32*-*-*.
 * gcc.dg/tree-ssa/20040204-1.c: Expected fail for nds32*-*-*.
 * gcc.dg/tree-ssa/forwprop-28.c: Skip for nds32*-*-*.
 * gcc.dg/tree-ssa/pr42585.c: Skip for nds32*-*-*.
 * gcc.dg/tree-ssa/sra-12.c: Skip for nds32*-*-*.
 * gcc.target/nds32: New nds32 specific directory and testcases.
 * lib/target-supports.exp (check_profiling_available): Check for
 nds32*-*-elf.

This is fine.  Sorry for the delay.

Thanks,
Jeff



Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-12-02 Thread Jeff Law

On 11/27/13 15:31, Wei Mi wrote:

Hmm, maybe attack from the other direction? -- could we clear SCHED_GROUP_P
for each insn at the start of this loop in sched_analyze?

It's not as clean in the sense that SCHED_GROUP_P "escapes" the scheduler,
but it might be an option.

for (insn = head;; insn = NEXT_INSN (insn))
 {

   if (INSN_P (insn))
 {
   /* And initialize deps_lists.  */
   sd_init_insn (insn);
 }

   deps_analyze_insn (deps, insn);

   if (insn == tail)
 {
   if (sched_deps_info->use_cselib)
 cselib_finish ();
   return;
 }
 }
Jeff







Thanks for the suggestion. It looks workable. Then I need to move the
SCHED_GROUP_P setting for macrofusion from sched_init to a place
inside sched_analyze after the SCHED_GROUP_P cleanup. It will be more
consistent with the settings for cc0 setter-user group and call group,
which are both inside sched_analyze.
I am trying this method...

Thanks,
Wei.


Here is the patch. The patch does the SCHED_GROUP_P cleanup in
sched_analyze before deps_analyze_insn sets SCHED_GROUP_P and chains the
insn with prev insns. It also moves try_group_insn for macrofusion from
sched_init to sched_analyze_insn.
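
The ordering described above (clear the flag first, then let the analysis
set it again) can be sketched outside GCC. This is only a toy model under
stated assumptions: struct toy_insn, its fuses_with_prev field and
toy_sched_analyze are hypothetical names, not the real RTL types or the
code in sched-deps.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an RTL insn; not GCC's real data structure.  */
struct toy_insn
{
  struct toy_insn *next;
  bool sched_group_p;   /* models SCHED_GROUP_P (insn) */
  bool fuses_with_prev; /* hypothetical: insn should stay glued to its
                           predecessor */
};

/* Models the walk in sched_analyze: for each insn from HEAD to TAIL,
   drop any stale group flag first, then let the per-insn analysis decide
   whether to set it again.  Returns the number of grouped insns.  */
int
toy_sched_analyze (struct toy_insn *head, struct toy_insn *tail)
{
  int grouped = 0;

  for (struct toy_insn *insn = head;; insn = insn->next)
    {
      /* Cleanup: forget whatever an earlier scheduling pass left behind.  */
      insn->sched_group_p = false;

      /* Analysis: re-establish the flag only where it is wanted now.  */
      if (insn != head && insn->fuses_with_prev)
        {
          insn->sched_group_p = true;
          grouped++;
        }

      if (insn == tail)
        return grouped;
    }
}
```

With three insns where only the middle one fuses with its predecessor, a
stale flag on the first or last insn is gone after the walk and only the
middle insn ends up grouped.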

Bootstrap and regression tests pass on x86_64-linux-gnu. Is it OK?

Thanks,
Wei.

2013-11-27  Wei Mi  

 PR rtl-optimization/59020
 * sched-deps.c (try_group_insn): Move it from haifa-sched.c to here.
 (sched_analyze_insn): Call try_group_insn.
 (sched_analyze): Cleanup SCHED_GROUP_P before start the analysis.
 * haifa-sched.c (try_group_insn): Moved to sched-deps.c.
 (group_insns_for_macro_fusion): Removed.
 (sched_init): Remove calling group_insns_for_macro_fusion.

2013-11-27  Wei Mi  

 PR rtl-optimization/59020
 * testsuite/gcc.dg/pr59020.c: New.
 * testsuite/gcc.dg/macro-fusion-1.c: New.
 * testsuite/gcc.dg/macro-fusion-2.c: New.

This is fine.  Thanks for your patience,

Jeff



[PATCH] [PR tree-optimization/59322] Remove unwanted jump thread path copying

2013-12-02 Thread Jeff Law


Code was added to copy the jump threading path (AUX field on an edge) by 
a change from Zdenek in 2007.  At the time the code was added, AFAICT, 
the copied AUX field would never be examined and certainly not used for 
threading.


I'd been suspicious of Zdenek's code to copy the AUX field, but went 
along with it and changed it to properly copy the updated representation.


The testcase for 59322 is wonderful in that it shows how that copying is
just plain bad because it results in dangling embedded edge pointers in
the jump threading path.  We make the copy and attach it to a new
edge's AUX field in the CFG.  We then thread from outside the loop
through the loop header.  We can delete edges in the CFG as a result.
This leaves a dangling edge pointer in the jump threading path
structure (particularly the copied one).


We then walk all the edges to see if any are threads from inside the 
loop, through the backedge to another point within the loop.  This may 
reference the jump threading path containing the dangling edge pointer.


As I mentioned, I went back and looked at the 2007 code and AFAICT 
copying the AUX field was dead when it was added.  This patch removes 
the code to copy the jump threading path in the AUX field to newly 
created edges.  And (of course), it fixes the 59322 testcase.
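
The hazard can be modelled in a few lines. The sketch below is a toy under
stated assumptions, not the code in tree-ssa-threadupdate.c: toy_edge,
toy_path and both functions are hypothetical, and a "removed" flag stands
in for an edge that has really been deleted, so that the example never
touches freed memory.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PATH 4

/* Toy CFG edge.  "removed" models an edge deleted from the CFG; a real
   deletion would release the memory, leaving any remaining pointer
   dangling.  */
struct toy_edge
{
  bool removed;
  void *aux;   /* models the AUX field holding a jump threading path */
};

/* Toy jump threading path: a list of pointers into the CFG.  */
struct toy_path
{
  size_t len;
  struct toy_edge *edges[TOY_MAX_PATH];
};

/* Element-wise copy.  The copy shares the embedded edge pointers.  */
void
toy_copy_path (struct toy_path *dst, const struct toy_path *src)
{
  dst->len = src->len;
  for (size_t i = 0; i < src->len; i++)
    dst->edges[i] = src->edges[i];
}

/* Returns true if PATH still refers only to edges present in the CFG.  */
bool
toy_path_usable (const struct toy_path *path)
{
  for (size_t i = 0; i < path->len; i++)
    if (path->edges[i]->removed)
      return false;
  return true;
}
```

Attaching a copy to a new edge's aux and then removing one of the original
edges leaves the copy unusable, while setting aux to NULL (the approach the
patch takes) leaves nothing stale to walk later.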


Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
Installed on the trunk.


Jeff
PR tree-optimization/59322
* tree-ssa-threadupdate.c (create_edge_and_update_destination_phis):
Remove code which copied jump threading paths.

PR tree-optimization/59322
* gcc.c-torture/compile/pr59322.c: New test

diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 24d0f42..ad727a1 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -421,27 +421,22 @@ create_edge_and_update_destination_phis (struct 
redirection_data *rd,
   e->probability = REG_BR_PROB_BASE;
   e->count = bb->count;
 
-  /* We have to copy path -- which means creating a new vector as well
- as all the jump_thread_edge entries.  */
-  if (rd->path->last ()->e->aux)
-{
-  vec *path = THREAD_PATH (rd->path->last ()->e);
-  vec *copy = new vec ();
+  /* We used to copy the thread path here.  That was added in 2007
+ and dutifully updated through the representation changes in 2013.
 
-  /* Sadly, the elements of the vector are pointers and need to
-be copied as well.  */
-  for (unsigned int i = 0; i < path->length (); i++)
-   {
- jump_thread_edge *x
-   = new jump_thread_edge ((*path)[i]->e, (*path)[i]->type);
- copy->safe_push (x);
-   }
-  e->aux = (void *)copy;
-}
-  else
-{
-  e->aux = NULL;
-}
+ In 2013 we added code to thread from an interior node through
+ the backedge to another interior node.  That runs after the code
+ to thread through loop headers from outside the loop.
+
+ The latter may delete edges in the CFG, including those
+ which appeared in the jump threading path we copied here.  Thus
+ we'd end up using a dangling pointer.
+
+ After reviewing the 2007/2011 code, I can't see how anything
+ depended on copying the AUX field and clearly copying the jump
+ threading path is problematical due to embedded edge pointers.
+ It has been removed.  */
+  e->aux = NULL;
 
   /* If there are any PHI nodes at the destination of the outgoing edge
  from the duplicate block, then we will need to add a new argument


Diagnose array assignment for C90 (PR 58235)

2013-12-02 Thread Joseph S. Myers
This patch fixes bug 58235, a corner case with non-lvalue arrays in
C90 where the assignment of a non-lvalue array to an expression with
array type was not diagnosed.  A specific check is added for
assignments to arrays (which are never valid).

Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  Applied
to mainline.

c:
2013-12-02  Joseph Myers  

PR c/58235
* c-typeck.c (build_modify_expr): Diagnose assignment to
expression with array type.

testsuite:
2013-12-02  Joseph Myers  

PR c/58235
* gcc.dg/c90-array-lval-8.c: New test.

Index: testsuite/gcc.dg/c90-array-lval-8.c
===
--- testsuite/gcc.dg/c90-array-lval-8.c (revision 0)
+++ testsuite/gcc.dg/c90-array-lval-8.c (revision 0)
@@ -0,0 +1,20 @@
+/* Test for non-lvalue arrays: test that they cannot be assigned to
+   array variables.  PR 58235.  */
+/* { dg-do compile } */
+/* { dg-options "-std=iso9899:1990 -pedantic-errors" } */
+
+struct s { char c[1]; } x;
+struct s f (void) { return x; }
+
+void
+g (void)
+{
+  char c[1];
+  c = f ().c; /* { dg-error "array" } */
+}
+
+void
+h (void)
+{
+  char c[1] = f ().c; /* { dg-error "array" } */
+}
Index: c/c-typeck.c
===
--- c/c-typeck.c(revision 205585)
+++ c/c-typeck.c(working copy)
@@ -5205,6 +5205,14 @@ build_modify_expr (location_t location, tree lhs,
   if (TREE_CODE (lhs) == ERROR_MARK || TREE_CODE (rhs) == ERROR_MARK)
 return error_mark_node;
 
+  /* Ensure an error for assigning a non-lvalue array to an array in
+ C90.  */
+  if (TREE_CODE (lhstype) == ARRAY_TYPE)
+{
+  error_at (location, "assignment to expression with array type");
+  return error_mark_node;
+}
+
   /* For ObjC properties, defer this check.  */
   if (!objc_is_property_ref (lhs) && !lvalue_or_else (location, lhs, 
lv_assign))
 return error_mark_node;

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix PR58944

2013-12-02 Thread Sriraman Tallam
On Thu, Nov 28, 2013 at 9:36 PM, Bernd Edlinger
 wrote:
> Hi,
>
> On Wed, 27 Nov 2013 19:49:39, Uros Bizjak wrote:
>>
>> On Mon, Nov 25, 2013 at 10:08 PM, Sriraman Tallam  
>> wrote:
>>
>>> I have attached a patch to fix this bug :
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58944
>>>
>>> A similar problem was also reported here:
>>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01050.html
>>>
>>>
>>> Recently, ix86_valid_target_attribute_tree in config/i386/i386.c was
>>> refactored to not depend on global_options structure and to be able to
>>> use any gcc_options structure. One clean way to fix this is by having
>>> target_option_default_node save all the default target options which
>>> can be restored to any gcc_options structure. The root cause of the
>>> above bugs was that ix86_arch_string and ix86_tune_string were not
>>> saved in target_option_default_node in PR58944 and
>>> ix86_preferred_stack_boundary_arg was not saved in the latter case.
>>>
>>> This patch saves all the target options used in i386.opt which are
>>> either obtained from the command-line or set to some default. Is this
>>> patch alright?
>>
>> Things look rather complicated, but I see no other solution than save
>> and restore the way you propose.
>>
>> Please wait 24h if somebody has a different idea, otherwise please go
>> ahead and commit the patch to mainline.
>>
>
> Maybe you should also look at the handling of preferred_stack_boundary_arg
> versus incoming_stack_boundary_arg in ix86_option_override_internal:
>
> Remember ix86_incoming_stack_boundary_arg is defined to
> global_options.x_ix86_incoming_stack_boundary_arg.
>
> like this?
>
>   if (opts_set->x_ix86_incoming_stack_boundary_arg)
> {
> -  if (ix86_incoming_stack_boundary_arg
> +  if (opts->x_ix86_incoming_stack_boundary_arg
>   < (TARGET_64BIT_P (opts->x_ix86_isa_flags) ? 4 : 2)
> -  || ix86_incoming_stack_boundary_arg> 12)
> + || opts->x_ix86_incoming_stack_boundary_arg> 12)
> error ("-mincoming-stack-boundary=%d is not between %d and 12",
> -   ix86_incoming_stack_boundary_arg,
> +  opts->x_ix86_incoming_stack_boundary_arg,
>TARGET_64BIT_P (opts->x_ix86_isa_flags) ? 4 : 2);
>   else
> {
>   ix86_user_incoming_stack_boundary
> -= (1 << ix86_incoming_stack_boundary_arg) * BITS_PER_UNIT;
> +   = (1 << opts->x_ix86_incoming_stack_boundary_arg) * BITS_PER_UNIT;
>   ix86_incoming_stack_boundary
> = ix86_user_incoming_stack_boundary;
> }
>

Thanks for catching this. I will make this change in the same patch.

Sri
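
The change quoted above amounts to reading the argument from the options
structure that was passed in rather than from a global. A minimal sketch,
assuming a cut-down structure: toy_options, TOY_BITS_PER_UNIT and
toy_incoming_stack_boundary are hypothetical stand-ins, not the real
gcc_options fields or ix86_option_override_internal.

```c
#include <stdbool.h>

#define TOY_BITS_PER_UNIT 8

/* Hypothetical, cut-down stand-in for struct gcc_options.  */
struct toy_options
{
  bool is_64bit;
  int incoming_stack_boundary_arg;   /* log2 of the boundary in bytes */
};

/* Validate the argument held in OPTS (not in any global) and return the
   boundary in bits, or -1 if the argument is out of range.  The limits
   follow the quoted check: at least 4 for 64-bit, at least 2 otherwise,
   and never more than 12.  */
int
toy_incoming_stack_boundary (const struct toy_options *opts)
{
  int min = opts->is_64bit ? 4 : 2;
  int arg = opts->incoming_stack_boundary_arg;

  if (arg < min || arg > 12)
    return -1;
  return (1 << arg) * TOY_BITS_PER_UNIT;
}
```

Because every read goes through opts, the same function gives consistent
answers whether it is handed the global options or a saved default set.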

> Note however that opts_set always points to global_options_set,
> so this logic combines the state of global_options_set and the
> target_option_default_node.



>
>
> Bernd.
>
>
>> Thanks,
>> Uros


Re: backport fix for go hash function names to 4.8

2013-12-02 Thread Ian Lance Taylor
On Wed, Nov 27, 2013 at 3:04 PM, Michael Hudson-Doyle
 wrote:
>
> This patch brings the recent fix for the generated hash functions of
> types that are aliases for structures containing unexported fields to
> the 4.8 branch.

Thanks.  Committed to 4.8 branch.

Ian


Re: Backport reflect.Call fixes to 4.8 branch

2013-12-02 Thread Ian Lance Taylor
On Wed, Nov 27, 2013 at 3:01 PM, Michael Hudson-Doyle
 wrote:
> This patch brings the recent fix for calling a function or method that
> takes or returns an empty struct via reflection to the 4.8 branch.

Thanks.  Committed to 4.8 branch.

Ian


[wwwdocs] back-end -> back end

2013-12-02 Thread Gerald Pfeifer
After changing a backend to back end or back-end, I realized we had
a number of uses of back-end which actually were about the noun (back
end).

Fixed thusly.

Index: svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.190
diff -u -3 -p -r1.190 svn.html
--- svn.html3 Dec 2013 01:04:41 -   1.190
+++ svn.html3 Dec 2013 01:08:52 -
@@ -377,7 +377,7 @@ the command svn log --stop-on-copy
   line.
 
   st/cli-be
-  The goal of the branch is to develop a back-end producing CLI binaries,
+  The goal of the branch is to develop a back end producing CLI binaries,
   compliant with ECMA-335 specification.
   This branch was originally maintained by Roberto Costa
   robsettanta...@gmail.com>.
Index: gcc-3.0/criteria.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.0/criteria.html,v
retrieving revision 1.30
diff -u -3 -p -r1.30 criteria.html
--- gcc-3.0/criteria.html   29 Dec 2006 10:03:38 -  1.30
+++ gcc-3.0/criteria.html   3 Dec 2013 01:08:53 -
@@ -78,7 +78,7 @@ be completed before GCC 3.0 is released:
like the other GCC front-ends.  This conversion will enable the
simplification, optimization, and removal of code in the
machine-independent portions of the compiler, as well as in 
-   the various back-ends.  Done.
+   the various back ends.  Done.
 
 Chill Front-End Garbage Collection
 Like the Java front-end, the Chill front-end will be converted
Index: gcc-3.0/features.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.0/features.html,v
retrieving revision 1.34
diff -u -3 -p -r1.34 features.html
--- gcc-3.0/features.html   31 Oct 2013 23:52:44 -  1.34
+++ gcc-3.0/features.html   3 Dec 2013 01:08:53 -
@@ -132,7 +132,7 @@
 New Targets and Target Specific Improvements
 
   
-New x86 back-end, generating much improved code.
+New x86 back end, generating much improved code.
 Support for a generic i386-elf target contributed.
 New option to emit x86 assembly code using Intel style syntax
 (-mintel-syntax).
Index: projects/cli.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cli.html,v
retrieving revision 1.23
diff -u -3 -p -r1.23 cli.html
--- projects/cli.html   3 Dec 2013 01:04:42 -   1.23
+++ projects/cli.html   3 Dec 2013 01:08:54 -
@@ -12,8 +12,8 @@
 Latest news
 Introduction
 Contributing
-The CLI back-end
-The CLI front-end
+The CLI back end
+The CLI front end
 Readings
 
 
@@ -59,7 +59,7 @@ that allows most of c applications to be
 
 
 2007-07-10
-Added CLI front-end
+Added CLI front end
 
 
 
@@ -93,11 +93,11 @@ with different abstraction levels, from 
 to low-level languages with no managed execution at all.
 
 
-The purpose of this project is to develop a GCC back-end that produces
+The purpose of this project is to develop a GCC back end that produces
 CLI-compliant binaries.
 The initial focus is on C language (more precisely, C99);
 C++ is likely to be considered in the future, as well as any
-other language for which there is an interest for a CLI back-end.
+other language for which there is an interest for a CLI back end.
 
 
 STMicroelectronics started this project in 2006,
@@ -106,7 +106,7 @@ as part of the European funded project <
 
 In 2007 to explore the potential of .NET as a deployment file format, in
 collaboration with http://www.hipeac.net/";>HiPEAC, we
-developped also a CIL front-end (always using GCC).
+developped also a CIL front end (always using GCC).
 
 
 
@@ -151,9 +151,9 @@ gcc CLI Backend and the CLI Frontend and
 http://gcc.gnu.org/viewcvs/branches/st/README?view=markup";>Build 
instructions
 
 
-The CLI back-end
+The CLI back end
 
-Unlike a typical GCC back-end, CLI back-end stops the compilation flow
+Unlike a typical GCC back end, the CLI back end stops the compilation flow
 at the end of the middle-end passes and, without going through any RTL
 pass, it emits CIL bytecode from GIMPLE representation.
 As a matter of fact, RTL is not a convenient representation to emit
@@ -171,7 +171,7 @@ data type information that is not preser
 
 Target machine model
 
-Like existing GCC back-ends, CLI is truly seen as a target machine
+Like existing GCC back ends, CLI is truly seen as a target machine
 and, as such, it follows GCC policy about the organization of the
 back-end specific files.
 
@@ -202,10 +202,10 @@ This is an overview of such a descriptio
   attribute).
 
   Properties exclusively needed by RTL passes are skipped.
-  This is a mere consequence of the fact that CLI back-end starts
+  This is a mere consequence of the fact that the CLI back end starts
   from GIMPLE and it does not go through RTL at all.
 
-  Though CLI back-end does not

PATCH: PR target/59363: [4.9 Regression] r203886 miscompiles git

2013-12-02 Thread H.J. Lu
Hi,

emit_memset fails to adjust destination address after gen_strset, which
leads to the wrong address in aliasing info.  This patch fixes it.
Tested on Linux/x86-64.  OK to install?
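
The class of bug can be illustrated with a toy model, assuming that the
compiler's knowledge about a store is kept in a separate record from the
pointer itself. Everything here (toy_mem, toy_memset, the adjust flag) is
hypothetical and is not the i386.c code or adjust_automodify_address_nv.

```c
#include <stddef.h>
#include <string.h>

/* Toy memory reference: where the compiler believes the next store lands.
   Loosely models the address information attached to a MEM.  */
struct toy_mem
{
  size_t offset;   /* known offset from the start of the object */
};

/* Fill LEN bytes of BUF with VAL in PIECE-sized stores.  RECORDED receives
   the offset the bookkeeping claimed for each store.  If ADJUST is zero the
   bookkeeping is never advanced, which models the bug.  Returns the number
   of stores.  LEN must be a multiple of PIECE.  */
size_t
toy_memset (unsigned char *buf, size_t len, size_t piece, int val,
            int adjust, size_t *recorded)
{
  struct toy_mem dst = { 0 };
  unsigned char *ptr = buf;
  size_t n = 0;

  for (size_t done = 0; done < len; done += piece)
    {
      recorded[n++] = dst.offset;
      memset (ptr, val, piece);   /* the store itself */
      ptr += piece;               /* the pointer advances automatically */
      if (adjust)
        dst.offset += piece;      /* the step the fix adds */
    }
  return n;
}
```

In both modes the bytes written are the same. Only the recorded offsets
differ: without the adjustment every store claims offset 0, which is the
kind of wrong information that can mislead later alias analysis.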

Thanks.

H.J.
---
gcc/

2013-12-03   H.J. Lu  

PR target/59363
* config/i386/i386.c (emit_memset): Adjust destination address
after gen_strset.

gcc/testsuite/

2013-12-03   H.J. Lu  

PR target/59363
* gcc.target/i386/pr59363.c: New file.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index aa221df..d395a99 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22806,6 +22806,8 @@ emit_memset (rtx destmem, rtx destptr, rtx promoted_val,
   if (piece_size <= GET_MODE_SIZE (word_mode))
{
  emit_insn (gen_strset (destptr, dst, promoted_val));
+ dst = adjust_automodify_address_nv (dst, move_mode, destptr,
+ piece_size);
  continue;
}
 
diff --git a/gcc/testsuite/gcc.target/i386/pr59363.c 
b/gcc/testsuite/gcc.target/i386/pr59363.c
new file mode 100644
index 000..a4e1240
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr59363.c
@@ -0,0 +1,24 @@
+/* PR target/59363 */
+/* { dg-do run } */
+/* { dg-options "-O2 -mtune=amdfam10" } */
+
+typedef struct {
+  int ctxlen;
+  long interhunkctxlen;
+  int flags;
+  long find_func;
+  void *find_func_priv;
+  int hunk_func;
+} xdemitconf_t;
+
+__attribute__((noinline))
+int xdi_diff(xdemitconf_t *xecfg) {
+  if (xecfg->hunk_func == 0)
+__builtin_abort();
+  return 0;
+}
+int main() {
+  xdemitconf_t xecfg = {0};
+  xecfg.hunk_func = 1;
+  return xdi_diff(&xecfg);
+}


Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.

2013-12-02 Thread Cong Hou
Hi Richard

Could you please take a look at this patch and see if it is ready for
the trunk? The patch is pasted as a text file here again.

Thank you very much!


Cong


On Mon, Nov 11, 2013 at 11:25 AM, Cong Hou  wrote:
> Hi James
>
> Sorry for the late reply.
>
>
> On Fri, Nov 8, 2013 at 2:55 AM, James Greenhalgh
>  wrote:
>>> On Tue, Nov 5, 2013 at 9:58 AM, Cong Hou  wrote:
>>> > Thank you for your detailed explanation.
>>> >
>>> > Once GCC detects a reduction operation, it will automatically
>>> > accumulate all elements in the vector after the loop. In the loop the
>>> > reduction variable is always a vector whose elements are reductions of
>>> > corresponding values from other vectors. Therefore in your case the
>>> > only instruction you need to generate is:
>>> >
>>> > VABAL   ops[3], ops[1], ops[2]
>>> >
>>> > It is OK if you accumulate the elements into one in the vector inside
>>> > of the loop (if one instruction can do this), but you have to make
>>> > sure other elements in the vector should remain zero so that the final
>>> > result is correct.
>>> >
>>> > If you are confused about the documentation, check the one for
>>> > udot_prod (just above usad in md.texi), as it has very similar
>>> > behavior as usad. Actually I copied the text from there and did some
>>> > changes. As those two instruction patterns are both for vectorization,
>>> > their behavior should not be difficult to explain.
>>> >
>>> > If you have more questions or think that the documentation is still
>>> > improper please let me know.
>>
>> Hi Cong,
>>
>> Thanks for your reply.
>>
>> I've looked at Dorit's original patch adding WIDEN_SUM_EXPR and
>> DOT_PROD_EXPR and I see that the same ambiguity exists for
>> DOT_PROD_EXPR. Can you please add a note in your tree.def
>> that SAD_EXPR, like DOT_PROD_EXPR can be expanded as either:
>>
>>   tmp = WIDEN_MINUS_EXPR (arg1, arg2)
>>   tmp2 = ABS_EXPR (tmp)
>>   arg3 = PLUS_EXPR (tmp2, arg3)
>>
>> or:
>>
>>   tmp = WIDEN_MINUS_EXPR (arg1, arg2)
>>   tmp2 = ABS_EXPR (tmp)
>>   arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
>>
>> Where WIDEN_MINUS_EXPR is a signed MINUS_EXPR, returning a
>> value of the same (widened) type as arg3.
>>
>
>
> I have added it, although we currently don't have WIDEN_MINUS_EXPR (I
> mentioned it in tree.def).
>
>
>> Also, while looking for the history of DOT_PROD_EXPR I spotted this
>> patch:
>>
>>   [autovect] [patch] detect mult-hi and sad patterns
>>   http://gcc.gnu.org/ml/gcc-patches/2005-10/msg01394.html
>>
>> I wonder what the reason was for that patch to be dropped?
>>
>
> It has been 8 years.. I have no idea why this patch is not accepted
> finally. There is even no reply in that thread. But I believe the SAD
> pattern is very important to be recognized. ARM also provides
> instructions for it.
>
>
> Thank you for your comment again!
>
>
> thanks,
> Cong
>
>
>
>> Thanks,
>> James
>>
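
For reference, the reduction discussed in this thread can be sketched in
plain C. This is a scalar model under stated assumptions, not the
vectorizer's output: TOY_LANES and both function names are hypothetical.
The first function is the widen, subtract, abs, add sequence; the second
keeps one partial sum per lane inside the loop and adds the lanes together
only after the loop, as described for reductions above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_LANES 4

/* Scalar reference: widen, subtract, take the absolute value, accumulate.  */
uint32_t
toy_sad_scalar (const uint8_t *a, const uint8_t *b, size_t n)
{
  uint32_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += (uint32_t) abs ((int) a[i] - (int) b[i]);
  return sum;
}

/* Vector-style reduction: keep TOY_LANES partial sums inside the loop and
   add them together only once, after the loop.  N must be a multiple of
   TOY_LANES.  */
uint32_t
toy_sad_lanes (const uint8_t *a, const uint8_t *b, size_t n)
{
  uint32_t acc[TOY_LANES] = { 0 };

  for (size_t i = 0; i < n; i += TOY_LANES)
    for (size_t lane = 0; lane < TOY_LANES; lane++)
      acc[lane] += (uint32_t) abs ((int) a[i + lane] - (int) b[i + lane]);

  uint32_t sum = 0;
  for (size_t lane = 0; lane < TOY_LANES; lane++)
    sum += acc[lane];
  return sum;
}
```

Both functions return the same total for any input whose length is a
multiple of TOY_LANES, since addition is only being regrouped.
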
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6bdaa31..37ff6c4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,4 +1,24 @@
-2013-11-01  Trevor Saunders  
+2013-10-29  Cong Hou  
+
+   * tree-vect-patterns.c (vect_recog_sad_pattern): New function for SAD
+   pattern recognition.
+   (type_conversion_p): PROMOTION is true if it's a type promotion
+   conversion, and false otherwise.  Return true if the given expression
+   is a type conversion one.
+   * tree-vectorizer.h: Adjust the number of patterns.
+   * tree.def: Add SAD_EXPR.
+   * optabs.def: Add sad_optab.
+   * cfgexpand.c (expand_debug_expr): Add SAD_EXPR case.
+   * expr.c (expand_expr_real_2): Likewise.
+   * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
+   * gimple.c (get_gimple_rhs_num_ops): Likewise.
+   * optabs.c (optab_for_tree_code): Likewise.
+   * tree-cfg.c (estimate_operator_cost): Likewise.
+   * tree-ssa-operands.c (get_expr_operands): Likewise.
+   * tree-vect-loop.c (get_initial_def_for_reduction): Likewise.
+   * config/i386/sse.md: Add SSE2 and AVX2 expand for SAD.
+   * doc/generic.texi: Add document for SAD_EXPR.
+   * doc/md.texi: Add document for ssad and usad.
 
* function.c (reorder_blocks): Convert block_stack to a stack_vec.
* gimplify.c (gimplify_compound_lval): Likewise.
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index fb05ce7..1f824fb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -2740,6 +2740,7 @@ expand_debug_expr (tree exp)
{
case COND_EXPR:
case DOT_PROD_EXPR:
+   case SAD_EXPR:
case WIDEN_MULT_PLUS_EXPR:
case WIDEN_MULT_MINUS_EXPR:
case FMA_EXPR:
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 9094a1c..af73817 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -7278,6 +7278,36 @@
   DONE;
 })
 
+(define_expand "usadv16qi"
+  [(match_operand:V4SI 0 "register_operand")
+   (match_operand:V16QI 1 "register_operand")
+   (match_operand:V16QI 2 "nonimmediate_operand")
+   (match_op

Re: [wwwdocs] backend -> back end

2013-12-02 Thread Gerald Pfeifer
On Mon, 11 Nov 2013, Joseph S. Myers wrote:
>> Working to address a user question, I noticed that many of our pages use 
>> the spelling of "backend" when http://gcc.gnu.org/codingconventions.html
>> suggests "back end" (noun) and "back-end" (adjective).
>> 
>> Joseph, if you confirm that back end it is (as a noun), I'll apply
>> the patch below.
> I believe it's correct, but several of the cases where you have "back end" 
> look like adjective uses that should be "back-end" to me:
> 
>> -Richard Henderson has finished merging the ia32 backend rewrite into the
>> +Richard Henderson has finished merging the ia32 back end rewrite into the
> 
>> - Fix x86 backend problem with Fortran literals and -fpic.
>> + Fix x86 back end problem with Fortran literals and -fpic.
> 
>> -improved epilogue sequences for Pentium chips and backend
>> +improved epilogue sequences for Pentium chips and back end
> 
>> - Fix a m68k backend bug which caused invalid offsets in reg+d 
>> + Fix a m68k back end bug which caused invalid offsets in reg+d 
> 
>> -SPARC backend rewrite.
>> +SPARC back end rewrite.
> 
>> -Fix SPARC backend bug which caused aborts in final.c.
>> +Fix SPARC back end bug which caused aborts in final.c.

In these cases above, isn't this more about  
  (ia32 back end) rewrite
than
  ia32 (back-end rewrite)
?  That was my intuition, but then my native tongue is a bit different
here.  Though, doesn't English have noun adjectives as well?

I have made the other changes you pointed out, reverted the above, and
committed the patch below.  If you have guidance based on my note above,
I'd appreciate that.

Thanks,
Gerald


Index: contribute.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/contribute.html,v
retrieving revision 1.81
diff -u -3 -p -r1.81 contribute.html
--- contribute.html 18 Aug 2013 20:59:25 -  1.81
+++ contribute.html 3 Dec 2013 01:00:51 -
@@ -308,11 +308,11 @@ is a good candidate for being mentioned 
 
 Larger accomplishments, either as part of a specific project, or long
 term commitment, merit mention on the front page.  Examples include projects
-like tree-ssa, new backends, major advances in optimization or standards
+like tree-ssa, new back ends, major advances in optimization or standards
 compliance.
 
 The gcc-announce mailing list serves to announce new releases and changes
-like frontends or backends being dropped.
+like front ends or back ends being dropped.
 
 
 
Index: frontends.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/frontends.html,v
retrieving revision 1.38
diff -u -3 -p -r1.38 frontends.html
--- frontends.html  27 Mar 2013 21:53:00 -  1.38
+++ frontends.html  3 Dec 2013 01:00:52 -
@@ -33,14 +33,14 @@ are very mature.
 href="http://www.mercurylang.org/download/gcc-backend.html";>Mercury,
 a declarative logic/functional language. The University of Melbourne Mercury
 compiler is written in Mercury; originally it compiled via C but now it also
-has a backend that generates assembler directly, using the GCC backend.
+has a back end that generates assembler directly, using the GCC back end.
 
 http://CobolForGCC.sourceforge.net/";>Cobol For GCC
 (at an early stage of development).
 
 http://www.nongnu.org/gm2/";>GNU Modula-2 implements
 the PIM2, PIM3, PIM4 and ISO dialects of the language.  The compiler
-is fully operational with the GCC 4.1.2 backend (on GNU/Linux x86
+is fully operational with the GCC 4.1.2 back end (on GNU/Linux x86
 systems).  Work is in progress to move the frontend to the GCC trunk.
 The frontend is mostly written in Modula-2, but includes a bootstrap
 procedure via a heavily modified version of p2c.
Index: lists.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/lists.html,v
retrieving revision 1.106
diff -u -3 -p -r1.106 lists.html
--- lists.html  24 Oct 2013 01:31:10 -  1.106
+++ lists.html  3 Dec 2013 01:00:52 -
@@ -95,7 +95,7 @@ before subscribing<
   http://gcc.gnu.org/ml/jit/";>jit
   is for discussion and development of
   http://gcc.gnu.org/wiki/JIT";>libgccjit, an experimental
-  library for implementing Just-In-Time compilation using gcc as a backend.
+  library for implementing Just-In-Time compilation using GCC as a back end.
   The list is intended for both users and developers of the library.
   Patches for the jit branch should go to both this list and
   gcc-patches.
Index: news.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/news.html,v
retrieving revision 1.137
diff -u -3 -p -r1.137 news.html
--- news.html   12 Aug 2013 00:01:45 -  1.137
+++ news.html   3 Dec 2013 01:00:52 -
@@ -1079,10 +1079,10 @@ noticed or fixed defects or made other u
 May 1, 2000
 
 Richard Earnshaw of ARM Ltd, and Nick Clifton of Cyg

Re: [PATCH, rs6000] Skip another test case for little endian

2013-12-02 Thread Bill Schmidt
Good idea, Mike, I'll make that change.

Thanks,
Bill

On Mon, 2013-12-02 at 16:54 -0800, Mike Stump wrote:
> On Dec 2, 2013, at 3:32 PM, Bill Schmidt  wrote:
> > The test case gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c fails if a
> > loop isn't vectorized.  When compiled for little endian, the cost of
> > vectorizing the loop is deemed too high
> 
> > Index: gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c
> > ===
> > --- gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c  
> > (revision 205585)
> > +++ gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c  
> > (working copy)
> > @@ -1,4 +1,5 @@
> > /* { dg-require-effective-target vect_int } */
> > +/* { dg-skip-if "" { powerpc*le-*-* } { "*" } { "" } } */
> 
> We like noting comments somewhere why we want to skip, the idea being if 
> another target has high costs, they can just know that this is a common 
> failure mode.  Maybe something like:
> 
> +/* { dg-skip-if "cost too high" { powerpc*le-*-* } { "*" } { "" } } */
> 
> ?



Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.

2013-12-02 Thread Cong Hou
Any comment on this patch?


thanks,
Cong


On Fri, Nov 22, 2013 at 11:40 AM, Cong Hou  wrote:
> On Fri, Nov 22, 2013 at 3:57 AM, Marc Glisse  wrote:
>> On Thu, 21 Nov 2013, Cong Hou wrote:
>>
>>> On Thu, Nov 21, 2013 at 4:39 PM, Marc Glisse  wrote:

>>>> On Thu, 21 Nov 2013, Cong Hou wrote:
>>>>
>>>>> While I added the new define_insn_and_split for vec_merge, a bug is
>>>>> exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz2" ]
>>>>> only takes one input, but the corresponding builtin functions have two
>>>>> inputs, which are shown in i386.c:
>>>>>
>>>>>  { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2,
>>>>> "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN,
>>>>> (int)MULTI_ARG_2_SF },
>>>>>  { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2,
>>>>> "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN,
>>>>> (int)MULTI_ARG_2_DF },
>>>>>
>>>>> In consequence, the ix86_expand_multi_arg_builtin() function tries to
>>>>> check two args but based on the define_expand of xop_vmfrcz2,
>>>>> the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be
>>>>> incorrect (because it only needs one input).
>>>>>
>>>>> The patch below fixed this issue.
>>>>>
>>>>> Bootstrapped and tested on an x86-64 machine. Note that this patch
>>>>> should be applied before the one I sent earlier (sorry for sending
>>>>> them in wrong order).
>>>>
>>>>
>>>>
>>>> This is PR 56788. Your patch seems strange to me and I don't think it
>>>> fixes the real issue, but I'll let more knowledgeable people answer.
>>>
>>>
>>>
>>> Thank you for pointing out the bug report. This patch is not intended
>>> to fix PR56788.
>>
>>
>> IMHO, if PR56788 was fixed, you wouldn't have this issue, and if PR56788
>> doesn't get fixed, I'll post a patch to remove _mm_frcz_sd and the
>> associated builtin, which would solve your issue as well.
>
>
> I agree. Then I will wait until your patch is merged to the trunk,
> otherwise my patch could not pass the test.
>
>
>>
>>
>>> For your function:
>>>
>>> #include 
>>> __m128d f(__m128d x, __m128d y){
>>>  return _mm_frcz_sd(x,y);
>>> }
>>>
>>> Note that the second parameter is ignored intentionally, but the
>>> prototype of this function contains two parameters. My fix is
>>> explicitly telling GCC that the optab xop_vmfrczv4sf3 should have
>>> three operands instead of two, to let it have the correct information
>>> in insn_data[CODE_FOR_xop_vmfrczv4sf3].operand[2] which is used to
>>> match the type of the second parameter in the builtin function in
>>> ix86_expand_multi_arg_builtin().
>>
>>
>> I disagree that this is intentional, it is a bug. AFAIK there is no AMD
>> documentation that could be used as a reference for what _mm_frcz_sd is
>> supposed to do. The only existing documentations are by Microsoft (which
>> does *not* ignore the second argument) and by LLVM (which has a single
>> argument). Whatever we choose for _mm_frcz_sd, the builtin should take a
>> single argument, and if necessary we'll use 2 builtins to implement
>> _mm_frcz_sd.
>>
>
>
> I also only found the one by Microsoft.. If the second argument is
> ignored, we could just remove it, as long as there is no "standard"
> that requires two arguments. Hopefully it won't break current projects
> using _mm_frcz_sd.
>
> Thank you for your comments!
>
>
> Cong
>
>
>> --
>> Marc Glisse


Re: [PATCH, rs6000] Skip another test case for little endian

2013-12-02 Thread Mike Stump
On Dec 2, 2013, at 3:32 PM, Bill Schmidt  wrote:
> The test case gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c fails if a
> loop isn't vectorized.  When compiled for little endian, the cost of
> vectorizing the loop is deemed too high

> Index: gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c
> ===
> --- gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c
> (revision 205585)
> +++ gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c
> (working copy)
> @@ -1,4 +1,5 @@
> /* { dg-require-effective-target vect_int } */
> +/* { dg-skip-if "" { powerpc*le-*-* } { "*" } { "" } } */

We like noting comments somewhere why we want to skip, the idea being if 
another target has high costs, they can just know that this is a common 
failure mode.  Maybe something like:

+/* { dg-skip-if "cost too high" { powerpc*le-*-* } { "*" } { "" } } */

?

Re: [PATCH, rs6000] Skip another test case for little endian

2013-12-02 Thread David Edelsohn
On Mon, Dec 2, 2013 at 6:32 PM, Bill Schmidt
 wrote:
> The test case gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c fails if a
> loop isn't vectorized.  When compiled for little endian, the cost of
> vectorizing the loop is deemed too high to vectorize due to unaligned
> vector accesses within the loop.  Therefore we should skip this test for
> LE.
>
> Verified on powerpc64le-unknown-linux-gnu.  Is this ok for trunk?
>
> Thanks,
> Bill
>
>
> 2013-12-02  Bill Schmidt  
>
> * gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c: Skip for little
> endian.

LGTM.

Thanks, David


Re: [wide-int] Drop some lingering uses of precision 0

2013-12-02 Thread Mike Stump
On Dec 2, 2013, at 12:20 PM, Richard Sandiford  
wrote:
> I noticed that there were still a couple of tests for zero precision.

> OK to install?

Ok.

Re: [PATCH] Allow compounds with empty initializer in pedantic mode (PR c/59351)

2013-12-02 Thread Joseph S. Myers
On Mon, 2 Dec 2013, Marek Polacek wrote:

> Regtested/bootstrapped on x86_64-linux, ok for trunk and 4.8 and
> perhaps even 4.7?
> 
> 2013-12-02  Marek Polacek  
> 
>   PR c/59351
> c/
>   * c-decl.c (build_compound_literal): Allow compound literals with
>   empty initial value.
> testsuite/
>   * gcc.dg/pr59351.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: patch for elimination to SP when it is changed in RTL (PR57293)

2013-12-02 Thread Vladimir Makarov

On 12/1/2013, 7:57 AM, James Greenhalgh wrote:

On Thu, Nov 28, 2013 at 10:11:26PM +, Vladimir Makarov wrote:

Committed as rev. 205498.

   2013-11-28  Vladimir Makarov

PR target/57293
* ira.h (ira_setup_eliminable_regset): Remove parameter.
* ira.c (ira_setup_eliminable_regset): Ditto.  Add
SUPPORTS_STACK_ALIGNMENT for crtl->stack_realign_needed.
Don't call lra_init_elimination.
(ira): Call ira_setup_eliminable_regset without arguments.
* loop-invariant.c (calculate_loop_reg_pressure): Remove argument
from ira_setup_eliminable_regset call.
* gcse.c (calculate_bb_reg_pressure): Ditto.
* haifa-sched.c (sched_init): Ditto.
* lra.h (lra_init_elimination): Remove the prototype.
* lra-int.h (lra_insn_recog_data): New member sp_offset.  Move
used_insn_alternative upper.
(lra_eliminate_regs_1): Add one more parameter.
(lra-eliminate): Ditto.
* lra.c (lra_invalidate_insn_data): Set sp_offset.
(setup_sp_offset): New.
(lra_process_new_insns): Call setup_sp_offset.
(lra): Add argument to lra_eliminate calls.
* lra-constraints.c (get_equiv_substitution): Rename to get_equiv.
(get_equiv_with_elimination): New.
(process_addr_reg): Call get_equiv_with_elimination instead of
get_equiv_substitution.
(equiv_address_substitution): Ditto.
(loc_equivalence_change_p): Ditto.
(loc_equivalence_callback, lra_constraints): Ditto.
(curr_insn_transform): Ditto.  Print the sp offset
(process_alt_operands): Prevent stack pointer reloads.
(lra_constraints): Remove one argument from lra_eliminate call.
Move it up.  Mark used hard regs before it.  Use
get_equiv_with_elimination instead of get_equiv_substitution.
* lra-eliminations.c (lra_eliminate_regs_1): Add parameter and
assert for param values combination.  Use sp offset.  Add argument
to lra_eliminate_regs_1 calls.
(lra_eliminate_regs): Add argument to lra_eliminate_regs_1 call.
(curr_sp_change): New static var.
(mark_not_eliminable): Add parameter.  Update curr_sp_change.
Don't prevent elimination to sp if we can calculate its change.
Pass the argument to mark_not_eliminable calls.
(eliminate_regs_in_insn): Add a parameter.  Use sp offset.  Add
argument to lra_eliminate_regs_1 call.
(update_reg_eliminate): Move calculation of hard regs for spill
lower.  Switch off lra_in_progress temporarily to generate regs
involved into elimination.
(lra_init_elimination): Rename to init_elimination.  Make it
static.  Set up insn sp offset, check the offsets at the end of
BBs.
(process_insn_for_elimination): Add parameter.  Pass its value to
eliminate_regs_in_insn.
(lra_eliminate): Add parameter.  Pass its value to
process_insn_for_elimination.  Add assert for param values
combination.  Call init_elimination.  Don't update offsets in
equivalence substitutions.
* lra-spills.c (assign_mem_slot): Don't call lra_eliminate_regs_1
for created stack slot.
(remove_pseudos): Call lra_eliminate_regs_1 before changing memory
onto stack slot.

2013-11-28  Vladimir Makarov

PR target/57293
* gcc.target/i386/pr57293.c: New.


Hi Vlad,

This patch seems to cause some problems for AArch64. I see an assert
triggering when building libgloss:

/work/gcc-clean/build-aarch64-none-elf/obj/gcc1/gcc/xgcc 
-B/work/gcc-clean/build-aarch64-none-elf/obj/gcc1/gcc/ 
-B/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/newlib/ 
-isystem 
/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/newlib/targ-include
 -isystem /work/gcc-clean/src/binutils/newlib/libc/include 
-B/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/libgloss/aarch64
 
-L/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/libgloss/libnosys
 -L/work/gcc-clean/src/binutils/libgloss/aarch64 
-L/work/gcc-clean/build-aarch64-none-elf/obj/binutils/./ld-O2 -g -O2 -g -I. 
-I/work/gcc-clean/src/binutils/libgloss/aarch64/.. -DARM_RDI_MONITOR -o 
rdimon-_exit.o -c /work/gcc-clean/src/binutils/libgloss/aarch64/_exit.c
/work/gcc-clean/src/binutils/libgloss/aarch64/_exit.c: In function '_exit':
/work/gcc-clean/src/binutils/libgloss/aarch64/_exit.c:41:1: internal compiler 
error: in update_reg_eliminate, at lra-eliminations.c:1157
  }
  ^




  First of all, it is a bad situation for code performance when IRA
decides that it can use frame pointer for allocation, and after that
LRA/reload decides that frame pointer can not be used and spills all
pseudos assigned to FP.  The generated code will be much worse than
one generated if we decided not to use FP from the IRA start.

  Therefore I decided to put an assert for checking the si

Re: [PATCH] Time profiler - phase 2

2013-12-02 Thread Martin Liška
Hello,
   there are dumps for Inkscape; the results look very good.

Link: https://docs.google.com/file/d/0B0pisUJ80pO1Y0t1aEVBRlByR28/edit

There are a few functions that look like this (wpa cgraph):

_ZL13resync_activeP19_EgeSelectOneActionii/2604322 (resync_active)
@0x7f84af42cea0
  Type: function definition analyzed
  Visibility: prevailing_def_ironly
  References:
  Referring:
  Read from file: libinkscape.a
  Availability: local
  First run: 4422
  Function flags: executed 47x local
  Called by: ege_select_one_action_set_active_text/2604300 (0.34 per
call) (can throw external)
_ZL21commit_pending_changeP19_EgeSelectOneAction/2604327 (0.16 per
call) (can throw external)
_ZL34ege_select_one_action_set_propertyP8_GObjectjPK7_GValueP11_GParamSpec/2604316
(47x) (0.16 per call) (can throw external)
  Calls: _ZL13resync_activeP19_EgeSelectOneActionii.part.0/2604456
(10x) (0.21 per call) (can throw external)

_ZL13resync_activeP19_EgeSelectOneActionii.part.0/2604456
(_ZL13resync_activeP19_EgeSelectOneActionii.part.0) @0x7f84af42cd68
  Type: function definition analyzed
  Visibility: artificial
  References: _ZL7signals/2604291 (read)
  Referring:
  Read from file: libinkscape.a
  Availability: local
  First run: 0
  Function flags: executed 10x local
  Called by: _ZL13resync_activeP19_EgeSelectOneActionii/2604322 (10x)
(0.21 per call) (can throw external)

The first function has a profile (its position is correct according to
valgrind) and the second does not.  Both of them come from the same object
file.  The problem is that, according to valgrind, the second one is called.
What does .part.X mean?  Is it a part of a function that was split out into
a separate function?  Is there any way these two profiles could be merged?

Thank you,
Martin

On 27 November 2013 00:24, Martin Liška  wrote:
> Hello,
> I present the results reached for GIMP by the reordering pass.  It is
> important to note that I used just a single '.text' section where all
> symbols are placed.  As you can see, there are just a few functions that
> are not caught by the pass (3 of them are LTO clones, which I will look
> into), and 2 functions were not seen during the -fprofile-generate run.
>
> In the following days, I will prepare the same dumps for Inkscape and Firefox.
>
> Martin
>
> On 18 November 2013 11:16, Jan Hubicka  wrote:
>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>> index 5cb07b7..754f882 100644
>>> --- a/gcc/ChangeLog
>>> +++ b/gcc/ChangeLog
>>> @@ -1,3 +1,13 @@
>>> +2013-11-17  Martin Liska  
>>> + Jan Hubicka  
>>> +
>>> + * cgraphunit.c (node_cmp): New function.
>>> + (expand_all_functions): Function ordering added.
>>> + * common.opt: New profile based function reordering flag introduced.
>>> + * lto-partition.c: Support for time profile added.
>>> + * lto.c: Likewise.
>>> + * predict.c (handle_missing_profiles): Time profile handled in
>>> +   missing profiles.
>>
>> OK,
>> thanks!  Implementing the function section naming scheme would be easy and
>> it would enable us to do the reordering even w/o LTO, which would be quite
>> cool.  Let's hope it gets resolved soon.
>>
>> Honza


[PATCH, rs6000] Skip another test case for little endian

2013-12-02 Thread Bill Schmidt
The test case gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c fails if a
loop isn't vectorized.  When compiled for little endian, the cost of
vectorizing the loop is deemed too high to vectorize due to unaligned
vector accesses within the loop.  Therefore we should skip this test for
LE.

Verified on powerpc64le-unknown-linux-gnu.  Is this ok for trunk?

Thanks,
Bill


2013-12-02  Bill Schmidt  

* gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c: Skip for little
endian.


Index: gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c
===
--- gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c  (revision 205585)
+++ gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-34.c  (working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-skip-if "" { powerpc*le-*-* } { "*" } { "" } } */
 
 #include 
 #include "../../tree-vect.h"




Committed: fix epiphany libgcc build

2013-12-02 Thread Joern Rennecke


2013-12-02  Joern Rennecke  

* config/epiphany/epiphany.h: Wrap rtl_opt_pass declarations
in #ifndef IN_LIBGCC2 / #endif.

Index: config/epiphany/epiphany.h
===
--- config/epiphany/epiphany.h  (revision 205586)
+++ config/epiphany/epiphany.h  (working copy)
@@ -929,8 +929,10 @@ enum
 };
 
 extern int epiphany_normal_fp_rounding;
+#ifndef IN_LIBGCC2
 extern rtl_opt_pass *make_pass_mode_switch_use (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_resolve_sw_modes (gcc::context *ctxt);
+#endif
 
 /* This will need to be adjusted when FP_CONTRACT_ON is properly
implemented.  */


Re: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM

2013-12-02 Thread Eric Botcazou
> Good question. Most of the time the expansion can not know if it expands
> Ada, C, or Fortran. In this case we know it can only be Ada, so the C++
> memory model is not mandatory. Maybe Eric can tell, if a data store race
> condition may be an issue in Ada if  structure is laid out like
> __attribute((packed,aligned(1))) I mean, if that is at all possible.

Unlike in C++, all bets are off in Ada as soon as you have non-byte-aligned 
objects.  For byte-aligned objects, there is the following implementation 
advice (C.6 2/22):

"A load or store of a volatile object whose size is a multiple of 
System.Storage_Unit and whose alignment is nonzero, should be implemented by 
accessing exactly the bits of the object and no others."

so the answer is (theoretically) yes for volatile fields.  But, in practice, 
we probably reject the potentially problematic volatile fields in gigi.

-- 
Eric Botcazou


[PATCH] Fix recent regression in tree-object-size.c (PR tree-optimization/59362)

2013-12-02 Thread Jakub Jelinek
Hi!

Recent change to tree-object-size.c to fold stmts with immediate uses
of __builtin_object_size result broke the pass, because it now can
create new SSA_NAMEs and the code wasn't expecting that to happen.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2013-12-02  Jakub Jelinek  

PR tree-optimization/59362
* tree-object-size.c (object_sizes): Change into array of
vec<unsigned HOST_WIDE_INT>.
(compute_builtin_object_size): Check computed bitmap for
non-NULL instead of object_sizes.  Call safe_grow on object_sizes
vector if new SSA_NAMEs appeared.
(init_object_sizes): Check computed bitmap for non-NULL.
Call safe_grow on object_sizes elements instead of initializing
it with XNEWVEC.
(fini_object_sizes): Call release on object_sizes elements, don't
set it to NULL.

* gcc.c-torture/compile/pr59362.c: New test.

--- gcc/tree-object-size.c.jj   2013-11-22 21:03:16.0 +0100
+++ gcc/tree-object-size.c  2013-12-02 10:16:01.777024163 +0100
@@ -78,7 +78,7 @@ static void check_for_plus_in_loops_1 (s
the subobject (innermost array or field with address taken).
object_sizes[2] is lower bound for number of bytes till the end of
the object and object_sizes[3] lower bound for subobject.  */
-static unsigned HOST_WIDE_INT *object_sizes[4];
+static vec<unsigned HOST_WIDE_INT> object_sizes[4];
 
 /* Bitmaps what object sizes have been computed already.  */
 static bitmap computed[4];
@@ -506,7 +506,7 @@ compute_builtin_object_size (tree ptr, i
 
   if (TREE_CODE (ptr) == SSA_NAME
   && POINTER_TYPE_P (TREE_TYPE (ptr))
-  && object_sizes[object_size_type] != NULL)
+  && computed[object_size_type] != NULL)
 {
   if (!bitmap_bit_p (computed[object_size_type], SSA_NAME_VERSION (ptr)))
{
@@ -514,6 +514,8 @@ compute_builtin_object_size (tree ptr, i
  bitmap_iterator bi;
  unsigned int i;
 
+ if (num_ssa_names > object_sizes[object_size_type].length ())
+   object_sizes[object_size_type].safe_grow (num_ssa_names);
  if (dump_file)
{
  fprintf (dump_file, "Computing %s %sobject size for ",
@@ -1175,12 +1177,12 @@ init_object_sizes (void)
 {
   int object_size_type;
 
-  if (object_sizes[0])
+  if (computed[0])
 return;
 
   for (object_size_type = 0; object_size_type <= 3; object_size_type++)
 {
-  object_sizes[object_size_type] = XNEWVEC (unsigned HOST_WIDE_INT, num_ssa_names);
+  object_sizes[object_size_type].safe_grow (num_ssa_names);
   computed[object_size_type] = BITMAP_ALLOC (NULL);
 }
 
@@ -1197,9 +1199,8 @@ fini_object_sizes (void)
 
   for (object_size_type = 0; object_size_type <= 3; object_size_type++)
 {
-  free (object_sizes[object_size_type]);
+  object_sizes[object_size_type].release ();
   BITMAP_FREE (computed[object_size_type]);
-  object_sizes[object_size_type] = NULL;
 }
 }
 
--- gcc/testsuite/gcc.c-torture/compile/pr59362.c.jj    2013-12-02 10:20:10.964738283 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr59362.c       2013-12-02 10:18:19.0 +0100
@@ -0,0 +1,21 @@
+/* PR tree-optimization/59362 */
+
+char *
+foo (char *r, int s)
+{
+  r = __builtin___stpcpy_chk (r, "abc", __builtin_object_size (r, 1));
+  if (s)
+r = __builtin___stpcpy_chk (r, "d", __builtin_object_size (r, 1));
+  return r;
+}
+
+char *a;
+long int b;
+
+void
+bar (void)
+{
+  b = __builtin_object_size (0, 0);
+  a = __builtin___stpcpy_chk (0, "", b);
+  b = __builtin_object_size (a, 0);
+}

Jakub
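
The grow-on-demand idea behind the tree-object-size.c change above can be
sketched in plain C.  This is only an illustration under stated assumptions,
not the GCC code: struct size_table and the size_table_* helpers are
hypothetical names standing in for one object_sizes[] element and for the
vec safe_grow/release calls, and unlike the patch this sketch zero-fills
newly added slots.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one object_sizes[] element: a table indexed
   by an ID that can keep growing (the SSA name version in the patch).  */
struct size_table
{
  unsigned long *data;
  size_t length;
};

/* Grow TABLE so that indexes 0 .. NEEDED-1 are valid.  New slots are
   zeroed (an assumption of this sketch).  Returns 0 on success, -1 if
   the allocation fails.  */
static int
size_table_grow (struct size_table *table, size_t needed)
{
  if (needed <= table->length)
    return 0;
  unsigned long *p = realloc (table->data, needed * sizeof *p);
  if (p == NULL)
    return -1;
  memset (p + table->length, 0, (needed - table->length) * sizeof *p);
  table->data = p;
  table->length = needed;
  return 0;
}

/* Store VALUE at INDEX, growing first in case INDEX was created after
   the table was initially sized.  */
static int
size_table_set (struct size_table *table, size_t index, unsigned long value)
{
  if (size_table_grow (table, index + 1) != 0)
    return -1;
  table->data[index] = value;
  return 0;
}

/* Free the storage and leave TABLE empty and reusable.  */
static void
size_table_release (struct size_table *table)
{
  free (table->data);
  table->data = NULL;
  table->length = 0;
}
```

The point of the sketch is that a fixed-size array sized once at init time
is indexed out of bounds as soon as a new ID appears, whereas a table that
is re-checked against the current ID count before each use is not.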


[PATCH] Fix nested function ICE with VLAs (PR middle-end/59011)

2013-12-02 Thread Jakub Jelinek
Hi!

tree-nested.c uses declare_vars with last argument true, which
relies on BLOCK_VARS of gimple_bind_block being a tail
of the gimple_bind_vars chain.  But unfortunately a debug info
improvement I've added to gimplify_var_or_parm_decl 4 years ago
violates this assumption, in that it adds some VAR_DECLs
at the head of BLOCK_VARS (DECL_INITIAL (current_function_decl))
chain, but doesn't adjust gimple_bind_vars chain correspondingly.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk/4.8?

2013-12-02  Jakub Jelinek  

PR middle-end/59011
* gimplify.c (nonlocal_vla_vars): New variable.
(gimplify_var_or_parm_decl): Put VAR_DECLs for VLAs into
nonlocal_vla_vars chain.
(gimplify_body): Call declare_vars on nonlocal_vla_vars chain
if outer_bind has DECL_INITIAL (current_function_decl) block.

* gcc.dg/pr59011.c: New test.

--- gcc/gimplify.c.jj   2013-12-02 14:33:34.0 +0100
+++ gcc/gimplify.c  2013-12-02 20:32:02.883491995 +0100
@@ -1689,6 +1689,9 @@ gimplify_conversion (tree *expr_p)
 /* Nonlocal VLAs seen in the current function.  */
 static struct pointer_set_t *nonlocal_vlas;
 
+/* The VAR_DECLs created for nonlocal VLAs for debug info purposes.  */
+static tree nonlocal_vla_vars;
+
 /* Gimplify a VAR_DECL or PARM_DECL.  Return GS_OK if we expanded a
DECL_VALUE_EXPR, and it's worth re-examining things.  */
 
@@ -1737,14 +1740,13 @@ gimplify_var_or_parm_decl (tree *expr_p)
ctx = ctx->outer_context;
  if (!ctx && !pointer_set_insert (nonlocal_vlas, decl))
{
- tree copy = copy_node (decl), block;
+ tree copy = copy_node (decl);
 
  lang_hooks.dup_lang_specific_decl (copy);
  SET_DECL_RTL (copy, 0);
  TREE_USED (copy) = 1;
- block = DECL_INITIAL (current_function_decl);
- DECL_CHAIN (copy) = BLOCK_VARS (block);
- BLOCK_VARS (block) = copy;
+ DECL_CHAIN (copy) = nonlocal_vla_vars;
+ nonlocal_vla_vars = copy;
  SET_DECL_VALUE_EXPR (copy, unshare_expr (value_expr));
  DECL_HAS_VALUE_EXPR_P (copy) = 1;
}
@@ -8562,6 +8564,21 @@ gimplify_body (tree fndecl, bool do_parm
 
   if (nonlocal_vlas)
 {
+  if (nonlocal_vla_vars)
+   {
+ /* tree-nested.c may later on call declare_vars (..., true);
+which relies on BLOCK_VARS chain to be the tail of the
+gimple_bind_vars chain.  Ensure we don't violate that
+assumption.  */
+ if (gimple_bind_block (outer_bind)
+ == DECL_INITIAL (current_function_decl))
+   declare_vars (nonlocal_vla_vars, outer_bind, true);
+ else
+   BLOCK_VARS (DECL_INITIAL (current_function_decl))
+ = chainon (BLOCK_VARS (DECL_INITIAL (current_function_decl)),
+nonlocal_vla_vars);
+ nonlocal_vla_vars = NULL_TREE;
+   }
   pointer_set_destroy (nonlocal_vlas);
   nonlocal_vlas = NULL;
 }
--- gcc/testsuite/gcc.dg/pr59011.c.jj   2013-12-02 20:14:22.350702153 +0100
+++ gcc/testsuite/gcc.dg/pr59011.c  2013-12-02 20:15:10.902455660 +0100
@@ -0,0 +1,22 @@
+/* PR middle-end/59011 */
+/* { dg-do compile } */
+/* { dg-options "-std=gnu99" } */
+
+void
+foo (int m)
+{
+  int a[m];
+  void
+  bar (void)
+  {
+{
+  int
+  baz (void)
+  {
+   return a[0];
+  }
+}
+a[0] = 42;
+  }
+  bar ();
+}

Jakub
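
The invariant described above (one chain must stay a tail of another) can
be illustrated with an ordinary singly linked list.  This is a sketch under
assumptions, not GCC code: struct decl, is_tail_of and prepend are
hypothetical names; the chain pointer plays the role of DECL_CHAIN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a decl node linked through DECL_CHAIN.  */
struct decl
{
  const char *name;
  struct decl *chain;
};

/* Return 1 if the list starting at TAIL is reached by following the
   chain from HEAD, i.e. TAIL is a tail of HEAD's chain.  An empty TAIL
   is a tail of any chain.  */
static int
is_tail_of (const struct decl *head, const struct decl *tail)
{
  if (tail == NULL)
    return 1;
  for (; head != NULL; head = head->chain)
    if (head == tail)
      return 1;
  return 0;
}

/* Prepend NODE to the chain whose head pointer is *HEAD.  */
static void
prepend (struct decl **head, struct decl *node)
{
  node->chain = *head;
  *head = node;
}
```

If the outer chain is b -> a and the inner chain is just a, the inner chain
is a tail of the outer one.  Prepending a new node to the inner chain alone
makes it copy -> a, which is no longer a tail of b -> a; that is the kind of
breakage the message describes, and why the patch collects the new decls
separately and adds them in one place at the end.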


[PATCH] Fix SSE (pre-AVX) alignment handling (PR target/59163)

2013-12-02 Thread Jakub Jelinek
Hi!

As discussed in the PR, the combiner can combine e.g. an unaligned integral
load (e.g. TImode) with some SSE instruction that requires an aligned
load, but doesn't actually check it.  For AVX, most of the instructions
actually allow unaligned operands, except for a few vmov* instructions where
the patterns typically handle the misaligned mems through misaligned_operand
checks, and some nontemporal move insns that have UNSPECs that should
prevent combination.  The following patch attempts to solve this by
rejecting combining of unaligned memory loads/stores into SSE insns that
don't allow it.  I've added an ssememalign attribute for that, but only
later realized that even insns which load/store < 16 byte memory values
don't need their arguments aligned at all if strict alignment checking
isn't turned on in hw.  So perhaps, instead of ssememalign in bits, all we
need is a boolean attribute saying whether the insn requires pre-AVX memory
operands to be as aligned as their mode or not (with the default being that
it does require that).
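
As a rough illustration of the rule being proposed (fold a memory operand
into an instruction only if it is aligned enough or the instruction
tolerates misalignment), here is a minimal C sketch.  The function names
and the byte-based interface are assumptions made for the example; the
patch itself works on RTL operands and an insn attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if P is aligned to ALIGN bytes.  ALIGN must be a power of
   two.  */
static int
is_aligned (const void *p, size_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}

/* Hypothetical policy mirroring the check described above: the operand
   at P may be folded into an instruction only if the instruction
   accepts unaligned memory, or P is aligned to MODE_ALIGN bytes.  */
static int
may_fold_memory_operand (const void *p, size_t mode_align,
                         int insn_allows_unaligned)
{
  return insn_allows_unaligned || is_aligned (p, mode_align);
}
```

With a 16-byte mode, an address such as 32 passes either way, while an
address such as 36 passes only for instructions flagged as accepting
unaligned operands.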

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-12-02  Jakub Jelinek  
Uros Bizjak  

PR target/59163
* config/i386/i386.c (ix86_legitimate_combined_insn): If for
!TARGET_AVX there is misaligned MEM operand with vector mode
and get_attr_ssememalign is 0, return false.
(ix86_expand_special_args_builtin): Add get_pointer_alignment
computed alignment and for non-temporal loads/stores also
at least GET_MODE_ALIGNMENT as MEM_ALIGN.
* config/i386/sse.md
(_loadu,
_storeu,
_loaddqu,
_storedqu, _lddqu,
sse_vmrcpv4sf2, sse_vmrsqrtv4sf2, sse2_cvtdq2pd, sse_movhlps,
sse_movlhps, sse_storehps, sse_loadhps, *vec_interleave_highv2df,
*vec_interleave_lowv2df, *vec_extractv2df_1_sse, sse2_movsd,
sse4_1_v8qiv8hi2, sse4_1_v4qiv4si2,
sse4_1_v4hiv4si2, sse4_1_v2qiv2di2,
sse4_1_v2hiv2di2, sse4_1_v2siv2di2, sse4_2_pcmpestr,
*sse4_2_pcmpestr_unaligned, sse4_2_pcmpestri, sse4_2_pcmpestrm,
sse4_2_pcmpestr_cconly, sse4_2_pcmpistr, *sse4_2_pcmpistr_unaligned,
sse4_2_pcmpistri, sse4_2_pcmpistrm, sse4_2_pcmpistr_cconly): Add
ssememalign attribute.
* config/i386/i386.md (ssememalign): New define_attr.

* g++.dg/torture/pr59163.C: New test.

--- gcc/config/i386/i386.c.jj   2013-12-02 14:33:34.813367951 +0100
+++ gcc/config/i386/i386.c  2013-12-02 19:57:39.116438744 +0100
@@ -5685,6 +5685,17 @@ ix86_legitimate_combined_insn (rtx insn)
  bool win;
  int j;
 
+ /* For pre-AVX disallow unaligned loads/stores where the
+instructions don't support it.  */
+ if (!TARGET_AVX
+ && VECTOR_MODE_P (GET_MODE (op))
+ && misaligned_operand (op, GET_MODE (op)))
+   {
+ int min_align = get_attr_ssememalign (insn);
+ if (min_align == 0)
+   return false;
+   }
+
  /* A unary operator may be accepted by the predicate, but it
 is irrelevant for matching constraints.  */
  if (UNARY_P (op))
@@ -32426,11 +32437,12 @@ ix86_expand_args_builtin (const struct b
 
 static rtx
 ix86_expand_special_args_builtin (const struct builtin_description *d,
-   tree exp, rtx target)
+ tree exp, rtx target)
 {
   tree arg;
   rtx pat, op;
   unsigned int i, nargs, arg_adjust, memory;
+  bool aligned_mem = false;
   struct
 {
   rtx op;
@@ -32493,6 +32505,26 @@ ix86_expand_special_args_builtin (const
   klass = store;
   /* Reserve memory operand for target.  */
   memory = ARRAY_SIZE (args);
+  switch (icode)
+   {
+   /* These builtins and instructions require the memory
+  to be properly aligned.  */
+   case CODE_FOR_avx_movntv4di:
+   case CODE_FOR_sse2_movntv2di:
+   case CODE_FOR_avx_movntv8sf:
+   case CODE_FOR_sse_movntv4sf:
+   case CODE_FOR_sse4a_vmmovntv4sf:
+   case CODE_FOR_avx_movntv4df:
+   case CODE_FOR_sse2_movntv2df:
+   case CODE_FOR_sse4a_vmmovntv2df:
+   case CODE_FOR_sse2_movntidi:
+   case CODE_FOR_sse_movntq:
+   case CODE_FOR_sse2_movntisi:
+ aligned_mem = true;
+ break;
+   default:
+ break;
+   }
   break;
 case V4SF_FTYPE_V4SF_PCV2SF:
 case V2DF_FTYPE_V2DF_PCDOUBLE:
@@ -32549,6 +32581,11 @@ ix86_expand_special_args_builtin (const
{
  op = ix86_zero_extend_to_Pmode (op);
  target = gen_rtx_MEM (tmode, op);
+ unsigned int align = get_pointer_alignment (arg);
+ if (aligned_mem && align < GET_MODE_ALIGNMENT (tmode))
+   align = GET_MODE_ALIGNMENT (tmode);
+ if (MEM_ALIGN (target) < align)
+   set_mem_align (target, align);
}
   else
target = force_reg (tmode, op);
@

[PATCH] Fix up cmove expansion (PR target/58864, take 2)

2013-12-02 Thread Jakub Jelinek
Hi!

On Sat, Nov 30, 2013 at 12:38:30PM +0100, Eric Botcazou wrote:
> > Rather than adding do_pending_stack_adjust () in all the places, especially
> > when it isn't clear whether emit_conditional_move will be called at all and
> > whether it will actually do do_pending_stack_adjust (), I chose to add
> > two new functions to save/restore the pending stack adjustment state,
> > so that when instruction sequence is thrown away (either by doing
> > start_sequence/end_sequence around it and not emitting it, or
> > delete_insns_since) the state can be restored, and have changed all the
> > places that IMHO need it for emit_conditional_move.
> 
> Why not do it in emit_conditional_move directly then?  The code thinks it's 
> clever to do:
> 
>   do_pending_stack_adjust ();
>   last = get_last_insn ();
>   prepare_cmp_insn (XEXP (comparison, 0), XEXP (comparison, 1),
>   GET_CODE (comparison), NULL_RTX, unsignedp, OPTAB_WIDEN,
>   &comparison, &cmode);
> [...]
>   delete_insns_since (last);
>   return NULL_RTX;
> 
> but apparently not, so why not delete the stack adjustment as well and 
> restore 
> the state afterwards?

On Sat, Nov 30, 2013 at 09:10:35AM +0100, Richard Biener wrote:
> The idea is good but I'd like to see a struct rather than an array for the
> storage.

So, this patch attempts to include both of the proposed changes.
Furthermore, I've noticed that calls.c has been saving/restoring those
two values by hand, so now it can use the new APIs for that too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

What about 4.8 branch?  I could create an alternative patch for 4.8,
keep everything as is and just save/restore the two fields by hand in
emit_conditional_move like calls.c used to do it.
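
The save/restore pattern described above, reduced to a self-contained C
sketch.  Everything here is a hypothetical stand-in (pending_adjust,
pointer_delta, flush_pending, try_emit); it only shows the shape of
"snapshot, attempt, roll back on failure" and leaves out details of the
real patch such as the inhibit_defer_pop check.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two pieces of global expander state.  */
static int pending_adjust;
static int pointer_delta;

/* Snapshot of the two globals, kept in a struct rather than an array.  */
struct saved_state
{
  int pending_adjust;
  int pointer_delta;
};

static void
save_state (struct saved_state *save)
{
  save->pending_adjust = pending_adjust;
  save->pointer_delta = pointer_delta;
}

static void
restore_state (const struct saved_state *save)
{
  pending_adjust = save->pending_adjust;
  pointer_delta = save->pointer_delta;
}

/* Flush the pending adjustment; a side effect of emitting code.  */
static void
flush_pending (void)
{
  pointer_delta -= pending_adjust;
  pending_adjust = 0;
}

/* Try to emit a sequence.  If CAN_EMIT is zero the attempt is abandoned
   and the global state is rolled back as if nothing had been emitted.
   Returns CAN_EMIT.  */
static int
try_emit (int can_emit)
{
  struct saved_state saved;

  save_state (&saved);
  flush_pending ();
  if (!can_emit)
    {
      restore_state (&saved);
      return 0;
    }
  return 1;
}
```

Keeping the snapshot inside the function that may discard its own output
means callers no longer need to flush the state themselves before calling
it.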

2013-12-02  Jakub Jelinek  

PR target/58864
* dojump.c (save_pending_stack_adjust, restore_pending_stack_adjust):
New functions.
* expr.h (struct saved_pending_stack_adjust): New type.
(save_pending_stack_adjust, restore_pending_stack_adjust): New
prototypes.
* optabs.c (emit_conditional_move): Call save_pending_stack_adjust
and get_last_insn before do_pending_stack_adjust, call
restore_pending_stack_adjust after delete_insns_since.
* expr.c (expand_expr_real_2): Don't call do_pending_stack_adjust
before calling emit_conditional_move.
* expmed.c (expand_sdiv_pow2): Likewise.
* calls.c (expand_call): Use {save,restore}_pending_stack_adjust.

* g++.dg/opt/pr58864.C: New test.

--- gcc/dojump.c.jj 2013-12-02 14:33:25.954413998 +0100
+++ gcc/dojump.c2013-12-02 15:08:39.958423641 +0100
@@ -96,6 +96,29 @@ do_pending_stack_adjust (void)
   pending_stack_adjust = 0;
 }
 }
+
+/* Remember pending_stack_adjust/stack_pointer_delta.
+   To be used around code that may call do_pending_stack_adjust (),
+   but the generated code could be discarded e.g. using delete_insns_since.  */
+
+void
+save_pending_stack_adjust (saved_pending_stack_adjust *save)
+{
+  save->x_pending_stack_adjust = pending_stack_adjust;
+  save->x_stack_pointer_delta = stack_pointer_delta;
+}
+
+/* Restore the saved pending_stack_adjust/stack_pointer_delta.  */
+
+void
+restore_pending_stack_adjust (saved_pending_stack_adjust *save)
+{
+  if (inhibit_defer_pop == 0)
+{
+  pending_stack_adjust = save->x_pending_stack_adjust;
+  stack_pointer_delta = save->x_stack_pointer_delta;
+}
+}
 
 /* Expand conditional expressions.  */
 
--- gcc/expr.h.jj   2013-12-02 14:33:26.263412414 +0100
+++ gcc/expr.h  2013-12-02 14:50:15.374175604 +0100
@@ -473,6 +473,28 @@ extern void clear_pending_stack_adjust (
 /* Pop any previously-pushed arguments that have not been popped yet.  */
 extern void do_pending_stack_adjust (void);
 
+/* Struct for saving/restoring of pending_stack_adjust/stack_pointer_delta
+   values.  */
+
+struct saved_pending_stack_adjust
+{
+  /* Saved value of pending_stack_adjust.  */
+  int x_pending_stack_adjust;
+
+  /* Saved value of stack_pointer_delta.  */
+  int x_stack_pointer_delta;
+};
+
+/* Remember pending_stack_adjust/stack_pointer_delta.
+   To be used around code that may call do_pending_stack_adjust (),
+   but the generated code could be discarded e.g. using delete_insns_since.  */
+
+extern void save_pending_stack_adjust (saved_pending_stack_adjust *);
+
+/* Restore the saved pending_stack_adjust/stack_pointer_delta.  */
+
+extern void restore_pending_stack_adjust (saved_pending_stack_adjust *);
+
 /* Return the tree node and offset if a given argument corresponds to
a string constant.  */
 extern tree string_constant (tree, tree *);
--- gcc/optabs.c.jj 2013-12-02 14:33:25.0 +0100
+++ gcc/optabs.c2013-12-02 15:12:04.0 +0100
@@ -4566,8 +4566,10 @@ emit_conditional_move (rtx target, enum
   if (!COMPARISON_P (comparison))
 return NULL_RTX;
 
-  do_pending_stack_adjust ();
+  saved_pending_stack_adju

[committed] Fix VRP range meet (PR tree-optimization/59358)

2013-12-02 Thread Jakub Jelinek
Hi!

The following testcase is miscompiled (into an endless loop), because
union_ranges didn't account for the possibility that *vr0max and vr1max
are not comparable (one of them is symbolic).

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
preapproved by richi on IRC, committed to trunk/4.8.

2013-12-02  Jakub Jelinek  

PR tree-optimization/59358
* tree-vrp.c (union_ranges): To check for the partially
overlapping ranges or adjacent ranges, also compare *vr0max
with vr1max.

* gcc.c-torture/execute/pr59358.c: New test.

--- gcc/tree-vrp.c.jj   2013-11-28 23:51:58.0 +0100
+++ gcc/tree-vrp.c  2013-12-02 13:24:10.750956769 +0100
@@ -7758,7 +7758,8 @@ union_ranges (enum value_range_type *vr0
 }
   else if ((operand_less_p (vr1min, *vr0max) == 1
|| operand_equal_p (vr1min, *vr0max, 0))
-  && operand_less_p (*vr0min, vr1min) == 1)
+  && operand_less_p (*vr0min, vr1min) == 1
+  && operand_less_p (*vr0max, vr1max) == 1)
 {
   /* [  (  ]  ) or [   ](   ) */
   if (*vr0type == VR_RANGE
@@ -7792,7 +7793,8 @@ union_ranges (enum value_range_type *vr0
 }
   else if ((operand_less_p (*vr0min, vr1max) == 1
|| operand_equal_p (*vr0min, vr1max, 0))
-  && operand_less_p (vr1min, *vr0min) == 1)
+  && operand_less_p (vr1min, *vr0min) == 1
+  && operand_less_p (vr1max, *vr0max) == 1)
 {
   /* (  [  )  ] or (   )[   ] */
   if (*vr0type == VR_RANGE
--- gcc/testsuite/gcc.c-torture/execute/pr59358.c.jj    2013-12-02 13:26:33.984198815 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr59358.c       2013-12-02 13:26:17.0 +0100
@@ -0,0 +1,44 @@
+/* PR tree-optimization/59358 */
+
+__attribute__((noinline, noclone)) int
+foo (int *x, int y)
+{
+  int z = *x;
+  if (y > z && y <= 16)
+while (y > z)
+  z *= 2;
+  return z;
+}
+
+int
+main ()
+{
+  int i;
+  for (i = 1; i < 17; i++)
+{
+  int j = foo (&i, 16);
+  int k;
+  if (i >= 8 && i <= 15)
+   k = 16 + (i - 8) * 2;
+  else if (i >= 4 && i <= 7)
+   k = 16 + (i - 4) * 4;
+  else if (i == 3)
+   k = 24;
+  else
+   k = 16;
+  if (j != k)
+   __builtin_abort ();
+  j = foo (&i, 7);
+  if (i >= 7)
+   k = i;
+  else if (i >= 4)
+   k = 8 + (i - 4) * 2;
+  else if (i == 3)
+   k = 12;
+  else
+   k = 8;
+  if (j != k)
+   __builtin_abort ();
+}
+  return 0;
+}

Jakub
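
Why the extra comparison matters can be shown with a small tri-state
comparison sketch.  This is an illustration only: struct bound,
bound_less_p, bound_equal_p and known_partial_overlap_p are hypothetical,
and the -1 "cannot tell" result is an assumption of this sketch rather than
the exact return convention of operand_less_p.

```c
#include <assert.h>

/* Hypothetical range bound: a known constant or a symbolic value.  */
struct bound
{
  int is_symbolic;
  int value;            /* Meaningful only when !is_symbolic.  */
};

/* Return 1 if A < B, 0 if !(A < B), and -1 if the two bounds cannot be
   compared because at least one of them is symbolic.  */
static int
bound_less_p (struct bound a, struct bound b)
{
  if (a.is_symbolic || b.is_symbolic)
    return -1;
  return a.value < b.value;
}

/* Return 1 only if both bounds are constants with the same value.  */
static int
bound_equal_p (struct bound a, struct bound b)
{
  return !a.is_symbolic && !b.is_symbolic && a.value == b.value;
}

/* Return 1 only if range 0 is known to start first and to overlap or
   touch range 1 without containing it:
   min0 < min1 <= max0 < max1.  Every comparison must be known true;
   an unknown answer is treated as "no".  */
static int
known_partial_overlap_p (struct bound min0, struct bound max0,
                         struct bound min1, struct bound max1)
{
  return ((bound_less_p (min1, max0) == 1 || bound_equal_p (min1, max0))
          && bound_less_p (min0, min1) == 1
          && bound_less_p (max0, max1) == 1);
}
```

Without the last comparison, a range whose upper bound is symbolic would be
classified as partially overlapping on the strength of the other two tests
alone, even though nothing is known about how the upper bounds relate.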


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Jakub Jelinek
On Mon, Dec 02, 2013 at 05:59:53PM +0100, Konstantin Serebryany wrote:
> >> with #if LINUX_VERSION_CODE >= 132640
> Good idea, let me try that.

Had a quick look at this on RHEL 5.
Following patch let me compile at least the first source file, but then
I run into tons of issues in sanitizer_platform_limits_posix.cc.

I think the main problem is that you are mixing standard glibc headers and
linux kernel headers in the same source file, which is a big no-no.
Lots of the kernel headers declare the same things as glibc headers.

I'd strongly recommend splitting the files, so that you include the absolute
minimum of glibc headers when you include linux/* and/or asm/* headers, and
no kernel headers if you include tons of glibc headers.
And as the errors show, there are also .cfi* directives that are used
unconditionally.  You've said you've removed them from sanitizer_common or
wherever they were used (IMHO a pity; much better would be conditionalizing
them on either compiler preprocessor macros or whatever clang provides as an
alternative for that when not building with gcc), but they are still used in
tsan (in the HACKY_CALL macro), plus in a *.S file (that could again be
guarded by the same preprocessor macro, or configure, or something else).
Note that RHEL5 here already has a gas that supports .cfi_* directives
(just not .cfi_personality/.cfi_lsda I think), but if you go to an even older
system they will not be there.  E.g. glibc assembler files solve that by
defining various CFI_STARTPROC etc. macros that either expand to
.cfi_startproc etc. if the assembler supports the directives, or to nothing
otherwise.
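
For reference, the magic numbers used with LINUX_VERSION_CODE in this
thread follow the classic kernel version encoding, which the sketch below
reproduces.  The SKETCH_* macros and have_perf_event are names made up for
this example so that it does not depend on any kernel header; the build
kernel is assumed to be 2.6.18, the RHEL 5 series.

```c
#include <assert.h>

/* Same arithmetic as the classic KERNEL_VERSION macro, under a
   different (hypothetical) name so no kernel header is needed.  */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Stand-in for LINUX_VERSION_CODE on the build machine; 2.6.18 is an
   assumption of this sketch.  */
#define SKETCH_LINUX_VERSION_CODE SKETCH_KERNEL_VERSION (2, 6, 18)

/* Enable a feature only when the headers are new enough, in the same
   way the patch below guards the perf_event checks.  */
#if SKETCH_LINUX_VERSION_CODE >= SKETCH_KERNEL_VERSION (2, 6, 32)
# define SKETCH_HAVE_PERF_EVENT 1
#else
# define SKETCH_HAVE_PERF_EVENT 0
#endif

/* Return 1 if the perf_event-dependent code would be compiled in.  */
static int
have_perf_event (void)
{
  return SKETCH_HAVE_PERF_EVENT;
}
```

With this encoding 2.6.32 is 132640, 2.6.19 is 132627 and 2.6.26 is 132634,
which matches the constants in the #if lines of the patch.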

--- sanitizer_platform_limits_linux.cc.jj   2013-12-02 15:27:58.0 -0500
+++ sanitizer_platform_limits_linux.cc  2013-12-02 17:06:19.0 -0500
@@ -51,8 +51,12 @@
 #endif
 
 #if !SANITIZER_ANDROID
+#include 
+//  has been added in 2.6.32
+#if LINUX_VERSION_CODE >= 132640
 #include 
 #endif
+#endif
 
 namespace __sanitizer {
   unsigned struct_statfs64_sz = sizeof(struct statfs64);
@@ -75,15 +79,18 @@ CHECK_SIZE_AND_OFFSET(io_event, res);
 CHECK_SIZE_AND_OFFSET(io_event, res2);
 
 #if !SANITIZER_ANDROID
+#if LINUX_VERSION_CODE >= 132640
 COMPILER_CHECK(sizeof(struct __sanitizer_perf_event_attr) <=
sizeof(struct perf_event_attr));
 CHECK_SIZE_AND_OFFSET(perf_event_attr, type);
 CHECK_SIZE_AND_OFFSET(perf_event_attr, size);
 #endif
+#endif
 
 COMPILER_CHECK(iocb_cmd_pread == IOCB_CMD_PREAD);
 COMPILER_CHECK(iocb_cmd_pwrite == IOCB_CMD_PWRITE);
-#if !SANITIZER_ANDROID
+#if !SANITIZER_ANDROID && LINUX_VERSION_CODE >= 132627
+// IOCB_CMD_PREADV/PWRITEV has been added in 2.6.19
 COMPILER_CHECK(iocb_cmd_preadv == IOCB_CMD_PREADV);
 COMPILER_CHECK(iocb_cmd_pwritev == IOCB_CMD_PWRITEV);
 #endif
--- sanitizer_platform_limits_posix.cc.jj   2013-12-02 15:27:58.0 -0500
+++ sanitizer_platform_limits_posix.cc  2013-12-02 17:11:00.0 -0500
@@ -82,12 +82,16 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+//  has been added in 2.6.26
+#if LINUX_VERSION_CODE >= 132634
 #include 
+#endif
 #include 
 #include 
 #include 
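
The magic numbers in the LINUX_VERSION_CODE guards above are just encoded kernel versions. A small sketch that reproduces the arithmetic of the kernel's KERNEL_VERSION(a, b, c) macro (re-implemented here as a hypothetical constexpr helper so it builds without kernel headers) and checks the three constants used in the patch:

```cpp
// Mirrors the arithmetic of KERNEL_VERSION(a, b, c) from <linux/version.h>;
// kernel_version is a local helper, not a kernel API.
constexpr unsigned kernel_version(unsigned major, unsigned minor,
                                  unsigned patch) {
  return (major << 16) + (minor << 8) + patch;
}

// The constants used in the guards above.
static_assert(kernel_version(2, 6, 32) == 132640, "perf_event guard");
static_assert(kernel_version(2, 6, 26) == 132634, "guard in the posix file");
static_assert(kernel_version(2, 6, 19) == 132627, "IOCB_CMD_PREADV guard");
```

Spelling the guards as LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 32) and so on would make the accompanying comments unnecessary.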

So the current errors are (from make -j64 -k to show more than one file):
In file included from /usr/include/sys/ustat.h:30:0,
                 from ../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:84:
/usr/include/bits/ustat.h:25:8: error: redefinition of ‘struct ustat’
 struct ustat
^
In file included from /usr/include/linux/if_ether.h:24:0,
                 from /usr/include/netinet/if_ether.h:26,
                 from /usr/include/netinet/ether.h:26,
                 from ../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:47:
/usr/include/linux/types.h:156:8: error: previous definition of ‘struct ustat’
 struct ustat {
^
In file included from /usr/include/linux/mroute.h:5:0,
                 from ../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:90:
/usr/include/linux/in.h:26:16: error: redeclaration of ‘IPPROTO_IP’
   IPPROTO_IP = 0,  /* Dummy protocol for TCP  */
^
In file included from /usr/include/arpa/inet.h:23:0,
                 from ../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:20:
/usr/include/netinet/in.h:33:5: note: previous declaration ‘IPPROTO_IP’
 IPPROTO_IP = 0,/* Dummy protocol for TCP.  */
 ^
In file included from /usr/include/linux/mroute.h:5:0,
                 from ../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:90:
/usr/include/linux/in.h:27:18: error: redeclaration of ‘IPPROTO_ICMP’
   IPPROTO_ICMP = 1,  /* Internet Control Message Protocol */
  ^
In file included from /usr/include/arpa/inet.h:23:0,
                 from ../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:20:
/usr/include/netinet/in.h:37:5: note: previous declaration ‘IPPROTO_ICMP’
 IPPROTO_ICMP 

Re: wwwdocs: Broken links due to the preprocess script

2013-12-02 Thread Tobias Burnus

Gerald Pfeifer wrote:

Okay, so I applied this patch plus the one below to adjust
gcc-4.9/changes.html accordingly.  (The first anchor there
is not stable, but for other reasons.)


But it should be sufficient to check them before the release; after that
one is fine, as the links should refer to the released versions per
item 13 of http://gcc.gnu.org/releasing.html (that's why I proposed to
add that item).



Talking about links like 
http://gcc.gnu.org/onlinedocs/gcc/Language-Independent-Options.html#index-fdiagnostics-color-246


What I always disliked is that it doesn't link to the @item
"-fdiagnostics-color[=WHEN]" but to the first paragraph after the
@item. That makes some sense if one looks at the source code:


@item -fdiagnostics-color[=@var{WHEN}]
@itemx -fno-diagnostics-color
@opindex fdiagnostics-color
@cindex highlight, color, colour
@vindex GCC_COLORS @r{environment variable}
Use color ...

Actually, it also makes sense if one reads the texi doc,
http://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#Index-Entries
: "Index entries should precede the visible material that is being
indexed. [...] Among other reasons, that way following indexing links
(in whatever context) ends up before the material, where readers want to
be, instead of after."



Thus, it seems we should go through all of GCC's *.texi files and swap
the order of @item and @*index.


Tobias


Re: _Cilk_spawn and _Cilk_sync for C++

2013-12-02 Thread Jason Merrill

On 11/28/2013 11:40 AM, Iyer, Balaji V wrote:

Consider the following test case. I took this from the lambda_spawns.cc line 
#203.

as you can tell, it is clobbering the lambda closure at the end of the lambda 
calling and then it is catching value of A from main2 as it is supposed to.


Yep, your patch gives a fine result for this testcase.


What am I misunderstanding?


It just gets there for the wrong reason: remapping exactly nothing in 
the closure object happens to give the desired semantics, treating the 
temporaries in the CONSTRUCTOR as local to the spawned function and 
referring to variables from the spawning context via the nested function 
static chain.  But presumably you have all that remapping machinery 
there because doing nothing doesn't always give the desired result.  Right?


When you add CONSTRUCTOR handling to extract_free_variables, you get the 
crash in gimplify_var_or_parm_decl because you don't specifically handle 
VEC_INIT_EXPR, which needs to be handled a lot like TARGET_EXPR, and end 
up trying to pass the address of its temporary object into the spawned 
function, which doesn't work because, like a TARGET_EXPR, the temporary 
doesn't exist outside of the VEC_INIT_EXPR.  This doesn't mean that 
handling CONSTRUCTOR is wrong; leaving it out means that you aren't 
going to handle aggregate temporaries properly either.


I think it might be better for gimplify_cilk_spawn to gimplify the call 
expression first, and then do your transformation on the gimple, so you 
don't have to worry about language-specific magic.


Now, after all that I must admit that cilk_spawn could only ever see 
VEC_INIT_EXPR in the context of a lambda closure initialization, and the 
default behavior should always be correct for a lambda closure 
initialization, so I guess I'm willing to allow the magic lambda 
handling with a comment about it being a workaround.



+is_lambda_fn_p (tree call_exp)
+{
+  if (TREE_CODE (call_exp) != CALL_EXPR)
+return false;
+  tree call_fn = CALL_EXPR_FN (call_exp);
+  if (TREE_CODE (call_fn) == ADDR_EXPR)
+call_fn = TREE_OPERAND (call_fn, 0);


Use get_callee_fndecl to get the FUNCTION_DECL.  And change the name of 
the function, since it isn't testing whether the argument is itself a 
lambda function; perhaps call_to_lambda_fn_p?



case CILK_SPAWN_STMT:
  gcc_assert
(fn_contains_cilk_spawn_p (cfun)
 && lang_hooks.cilkplus.cilk_detect_spawn_and_unwrap (expr_p));
  if (!seen_error ())
{
  ret = (enum gimplify_status)
lang_hooks.cilkplus.gimplify_cilk_spawn (expr_p, pre_p,
 post_p);
  break;
}
  /* If errors are seen, then just process it as a CALL_EXPR.  */


Please remove these langhooks and instead add handling of 
CILK_SPAWN_STMT to c_gimplify_expr and cp_gimplify_expr.



  lang_hooks.cilkplus.install_body_with_frame_cleanup (fndecl, stmt,
   (void *) wd);


And instead of this langhook, declare a function in c-common.h that is 
defined by all C family front ends.



+stabilize_expr (orig_body, &pre_body);


Here you're pre-evaluating the entire call, rather than just the lambda 
closure object, which means none of the arguments to the call will be 
remapped.  I think you want


CALL_EXPR_ARG (orig_body, 0)
  = stabilize_expr (CALL_EXPR_ARG (orig_body, 0), &pre_body);
append_to_statement_list (orig_body, &pre_body);

instead.


+  gcc_assert (TREE_CODE (catch_list) == STATEMENT_LIST);


You don't need this, append_to_statement_list handles the list not yet 
being a list fine.



+   /* We set this here so that finish_call_expr can set lambda to a var.
+  if it is not done so.  */


This comment is obsolete.


+   error_at (input_location, "_Cilk_sync cannot be used without enabling "
+ "Cilk Plus");
+  cp_lexer_consume_token (parser->lexer);
+  if (parser->in_statement & IN_CILK_SPAWN)
+   parser->in_statement = parser->in_statement & ~IN_CILK_SPAWN;


Why are you messing with in_statement in the cilk_spawn code?

Jason



Re: [Dwarf Patch] Use offset into debug_info for pubtype name referring to pubtype section

2013-12-02 Thread Sterling Augustine
On Mon, Dec 2, 2013 at 1:59 PM, Cary Coutant  wrote:
> This is OK, but your patch also has a local change to contrib/mklog.
> Please be careful not to commit that.

Committed without the contrib/mklog portion.

Also committing on google/gcc-4_8 and google/main.


Re: [Dwarf Patch] Use offset into debug_info for pubtype name referring to pubtype section

2013-12-02 Thread Cary Coutant
> gcc/ChangeLog
>
> 2013-12-02 Sterling Augustine  
>
> * dwarf2out.c (output_pubnames): Use comp_unit_die ()->die_offset
> when there isn't a skeleton die.

This is OK, but your patch also has a local change to contrib/mklog.
Please be careful not to commit that.

Thanks!

-cary


[Dwarf Patch] Use offset into debug_info for pubtype name referring to pubtype section

2013-12-02 Thread Sterling Augustine
The enclosed patch fixes a mismerge from google/gcc-4_7 to main. When
outputting a pubtype whose type has no skeleton section, its DIE
offset should be taken from the comp_unit_die, instead of zero. Zero is
actually a place-holder for the end of the pubtypes.

Sterling

gcc/ChangeLog

2013-12-02 Sterling Augustine  

* dwarf2out.c (output_pubnames): Use comp_unit_die ()->die_offset
when there isn't a skeleton die.


pubtypes-bug.tot-patch
Description: Binary data


Re: wwwdocs: Broken links due to the preprocess script

2013-12-02 Thread Gerald Pfeifer
On Mon, 2 Dec 2013, Tobias Burnus wrote:
> Looks good to me. (I fully concur that the _002d is ugly.)

Okay, so I applied this patch plus the one below to adjust 
gcc-4.9/changes.html accordingly.  (The first anchor there
is not stable, but for other reasons.)

Thanks for pushing for this fix!

Gerald


Index: gcc-4.9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.43
diff -u -3 -p -r1.43 changes.html
--- gcc-4.9/changes.html30 Nov 2013 16:36:59 -  1.43
+++ gcc-4.9/changes.html2 Dec 2013 21:42:24 -
@@ -110,7 +110,7 @@
   
 Support for colorizing diagnostics emitted by GCC has been added.
 The http://gcc.gnu.org/onlinedocs/gcc/Language-Independent-Options.html#index-fdiagnostics_002dcolor-239";
+
href="http://gcc.gnu.org/onlinedocs/gcc/Language-Independent-Options.html#index-fdiagnostics-color-246";
 >-fdiagnostics-color=auto will enable it when
 outputting to terminals, -fdiagnostics-color=always
 unconditionally.  The GCC_COLORS environment variable
@@ -136,7 +136,7 @@
 
 
 With the new http://gcc.gnu.org/onlinedocs/gcc/Loop_002dSpecific-Pragmas.html";
+href="http://gcc.gnu.org/onlinedocs/gcc/Loop-Specific-Pragmas.html";
 >#pragma GCC ivdep, the user can assert that there are no
 loop-carried dependencies which would prevent concurrent execution of
 consecutive iterations using SIMD (single instruction multiple data)


[Patch] Add comments for future regex work

2013-12-02 Thread Tim Shen
...for optimization purpose. Should be done in one month.

Thanks!


-- 
Regards,
Tim Shen
commit cc7d58128e68455498d0257c4796cb70a9e24990
Author: tim 
Date:   Mon Dec 2 15:49:15 2013 -0500

2013-12-02  Tim Shen  

	* regex_compiler.h: Add todo comment.
	* regex_executor.tcc: Likewise.

diff --git a/libstdc++-v3/include/bits/regex_compiler.h b/libstdc++-v3/include/bits/regex_compiler.h
index b9f8127..5759d48 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -237,6 +237,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// Matches a character range (bracket expression)
+  // TODO: Convert used _M_flags fields to template parameters, including
+  // collate and icase. Avoid using std::set, could use flat_set
+  // (sorted vector and binary search) instead; use an fixed sized (256)
+  // vector for char specialization if necessary.
   template
 struct _BracketMatcher
 {
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index 22fd67c..150adb4 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -162,6 +162,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return false;
 }
 
+  // TODO: Use a function vector to dispatch, instead of using switch-case.
   template
   template
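
A rough sketch of what the first TODO describes, under stated assumptions: FlatCharSet and CharTable are hypothetical names, not libstdc++ internals. The first keeps a sorted vector and looks characters up by binary search; the second is the fixed-size, 256-entry variant for char, using std::bitset here in place of the vector mentioned in the comment:

```cpp
#include <algorithm>
#include <bitset>
#include <vector>

// Hypothetical replacement for std::set<char>: a sorted vector without
// duplicates, searched with binary search.
class FlatCharSet {
 public:
  void insert(char c) {
    std::vector<char>::iterator pos =
        std::lower_bound(chars_.begin(), chars_.end(), c);
    if (pos == chars_.end() || *pos != c)
      chars_.insert(pos, c);  // keeps the vector sorted
  }
  bool contains(char c) const {
    return std::binary_search(chars_.begin(), chars_.end(), c);
  }

 private:
  std::vector<char> chars_;
};

// Hypothetical fixed-size table for the char specialization: one bit per
// possible unsigned char value, so lookup is a single indexed test.
class CharTable {
 public:
  void insert(char c) { bits_.set(static_cast<unsigned char>(c)); }
  bool contains(char c) const {
    return bits_.test(static_cast<unsigned char>(c));
  }

 private:
  std::bitset<256> bits_;
};
```

The flat set trades slower insertion for contiguous storage and cheaper lookups, which suits a matcher that is built once and queried many times.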


RE: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM

2013-12-02 Thread Bernd Edlinger
Hi,

On Mon, 2 Dec 2013 15:55:08, Richard Biener wrote:
>
> On Mon, Nov 25, 2013 at 1:07 PM, Bernd Edlinger
>  wrote:
>> Hello,
>>
>> I had forgotten to run the Ada test suite when I submitted the previous 
>> version of this patch.
>> And indeed there were some Ada test cases failing because in Ada packed 
>> structures are
>> like bit fields, but without the DECL_BIT_FIELD_TYPE attribute.
>
> I think they may have DECL_BIT_FIELD set though, not sure.
>
>> Please find attached the updated version of this patch.
>>
>> Boot-strapped and regression-tested on x86_64-linux-gnu.
>> Ok for trunk?
>
> So you mimic what Eric added in get_bit_range? Btw, I'm not sure
> the "conservative" way of failing get_bit_range with not limiting the
> access at all is good.
>
> That is, we may want to do
>
> + /* The C++ memory model naturally applies to byte-aligned fields.
> + However, if we do not have a DECL_BIT_FIELD_TYPE but BITPOS or
> + BITSIZE are not byte-aligned, there is no need to limit the range
> + we can access. This can occur with packed structures in Ada. */
> + if (bitregion_start == 0 && bitregion_end == 0
> + && bitsize> 0
> + && bitsize % BITS_PER_UNIT == 0
> + && bitpos % BITS_PER_UNIT == 0)
> + {
> + bitregion_start = bitpos;
> + bitregion_end = bitpos + bitsize - 1;
> + }
>
> thus not else if but also apply it when get_bit_range "failed" (as it may
> fail for other reasons). A better fallback would be to track down
> the outermost byte-aligned handled-component and limit the access
> to that (though I guess Ada doesn't care at all about the C++ memory
> model and only Ada has bit-aligned aggregates).
>
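
The fallback quoted above can be tried out in isolation. This is only a sketch: BitRegion, limit_region and kBitsPerUnit are hypothetical stand-ins for the bitregion_start/bitregion_end variables and BITS_PER_UNIT in expand_assignment, and 8-bit units are assumed.

```cpp
const unsigned long kBitsPerUnit = 8;  // stand-in for BITS_PER_UNIT

struct BitRegion {
  unsigned long start;
  unsigned long end;
};

// If no region has been set (both bounds are 0) and the access is
// byte-aligned, limit the region to exactly the accessed bits.
// Otherwise the region is returned unchanged.
inline BitRegion limit_region(BitRegion region, unsigned long bitpos,
                              unsigned long bitsize) {
  if (region.start == 0 && region.end == 0
      && bitsize > 0
      && bitsize % kBitsPerUnit == 0
      && bitpos % kBitsPerUnit == 0) {
    region.start = bitpos;
    region.end = bitpos + bitsize - 1;
  }
  return region;
}
```

A one-byte store at bit 16 is then confined to bits 16..23, while a bit-aligned access (as in packed Ada records) stays unconstrained.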

Good question. Most of the time the expansion cannot know if it expands
Ada, C, or Fortran. In this case we know it can only be Ada, so the C++ memory
model is not mandatory. Maybe Eric can tell whether a data store race condition
may be an issue in Ada if a structure is laid out like
__attribute((packed,aligned(1))).
I mean, if that is at all possible.

> That said, the patch looks ok as-is to me, let's see if we can clean
> things up for the next stage1.
>

Ok, applied as-is.

Thanks
Bernd.

> Thanks,
> Richard.
>
>> Bernd.
>>
>>> On Mon, 18 Nov 2013 11:37:05, Bernd Edlinger wrote:
>>>
>>> Hi,
>>>
>>> On Fri, 15 Nov 2013 13:30:51, Richard Biener wrote:
> That looks like always pretending it is a bit field.
> But it is not a bit field, and bitregion_start=bitregion_end=0
> means it is an ordinary value.

 I don't think it is supposed to mean that. It's supposed to mean
 "the access is unconstrained".

>>>
>>> Ok, agreed, I did not regard that as a feature.
>>> And apparently only the code path in expand_assignment
>>> really has a problem with it.
>>>
>>> So here is my second attempt at fixing this.
>>>
>>> Boot-strapped and regression-tested on x86_64-linux-gnu.
>>>
>>> OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.

Re: [wide-int] i am concerned about the typedef for widest-int.

2013-12-02 Thread Kenneth Zadeck

On 12/02/2013 03:34 PM, Richard Sandiford wrote:

Kenneth Zadeck  writes:

see wide-int.h around line 290

the MAX_BITSIZE_MODE_ANY_INT is the largest mode on the machine. however
if the value coming in is an unsigned number of the type that represents
that mode, don't we lose a bit?

That was the +1 mentioned here:

http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03745.html

I.e. it should be "widest supported arithmetic input + 1".

Thanks,
Richard

do we want 129 or do we want to round that up to the next hwi?
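
To put numbers on that question, here is a small sketch assuming a 64-bit HOST_WIDE_INT; blocks_needed and round_up_to_hwi are hypothetical helpers, not functions from wide-int.h:

```cpp
constexpr unsigned kHostBitsPerWideInt = 64;  // assumed HOST_WIDE_INT width

// Number of HOST_WIDE_INT blocks needed to hold PRECISION bits.
constexpr unsigned blocks_needed(unsigned precision) {
  return (precision + kHostBitsPerWideInt - 1) / kHostBitsPerWideInt;
}

// PRECISION rounded up to a whole number of blocks.
constexpr unsigned round_up_to_hwi(unsigned precision) {
  return blocks_needed(precision) * kHostBitsPerWideInt;
}

static_assert(blocks_needed(128) == 2, "128 bits fit in two blocks");
static_assert(blocks_needed(129) == 3, "the extra bit costs a third block");
static_assert(round_up_to_hwi(129) == 192, "129 rounds up to 192 bits");
```

So under these assumptions a precision of 129 already occupies three blocks of storage; rounding up only changes the nominal precision, not the footprint.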


Re: [wide-int] i am concerned about the typedef for widest-int.

2013-12-02 Thread Richard Sandiford
Kenneth Zadeck  writes:
> see wide-int.h around line 290
>
> the MAX_BITSIZE_MODE_ANY_INT is the largest mode on the machine. however
> if the value coming in is an unsigned number of the type that represents
> that mode, don't we lose a bit?

That was the +1 mentioned here:

   http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03745.html

I.e. it should be "widest supported arithmetic input + 1".

Thanks,
Richard


Re: [PATCH] Fix PR56344

2013-12-02 Thread Richard Biener
Marek Polacek  wrote:
>On Mon, Dec 02, 2013 at 05:40:33PM +0100, Marek Polacek wrote:
>> On Mon, Dec 02, 2013 at 04:01:05PM +0100, Richard Biener wrote:
>> > On Wed, Mar 13, 2013 at 1:57 PM, Marek Polacek 
>wrote:
>> > > Ping.
>> > 
>> > Ok.  (yay, oldest patch in my review queue ...)
>> 
>> ;) thanks.  Just to be sure, did you mean to ok this patch (that is,
>> the one with HOST_BITS_PER_INT)?

Yes, thanks,
Richard.

>> Bootstrap/regtest in progress.
>> 
>> 2013-12-02  Marek Polacek  
>> 
>>  PR middle-end/56344
>>  * calls.c (expand_call): Disallow passing huge arguments
>>  by value.
>> 
>> --- gcc/calls.c.mp4  2013-12-02 17:12:18.621057873 +0100
>> +++ gcc/calls.c  2013-12-02 17:32:35.523684716 +0100
>> @@ -3047,6 +3047,15 @@ expand_call (tree exp, rtx target, int i
>>  {
>>rtx before_arg = get_last_insn ();
>>  
>> +  /* We don't allow passing huge (> 2^30 B) arguments
>> + by value.  It would cause an overflow later on.  */
>> +  if (adjusted_args_size.constant
>> +  >= (1 << (HOST_BITS_PER_INT - 1)))
>
>Surely I meant to use "HOST_BITS_PER_INT - 2" here.
>
>   Marek
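
A short sketch of why the correction matters, with hypothetical names (kHostBitsPerInt stands in for HOST_BITS_PER_INT, and args_too_large for the check in expand_call): shifting 1 by HOST_BITS_PER_INT - 1 would land in the sign bit of int, whereas HOST_BITS_PER_INT - 2 gives the largest power of two that is still a positive int, which is 2^30 when int has 32 bits.

```cpp
#include <climits>

// Stand-in for HOST_BITS_PER_INT.
const int kHostBitsPerInt = sizeof(int) * CHAR_BIT;

// 1 << (kHostBitsPerInt - 1) would shift into the sign bit, so the
// threshold uses kHostBitsPerInt - 2 and stays a positive int.
const int kMaxArgsSize = 1 << (kHostBitsPerInt - 2);

// True if an argument block of ARGS_SIZE bytes should be rejected.
inline bool args_too_large(long long args_size) {
  return args_size >= kMaxArgsSize;
}
```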




[wide-int] i am concerned about the typedef for widest-int.

2013-12-02 Thread Kenneth Zadeck

see wide-int.h around line 290

the MAX_BITSIZE_MODE_ANY_INT is the largest mode on the machine. however
if the value coming in is an unsigned number of the type that represents
that mode, don't we lose a bit?


kenny


[wide-int] Drop some lingering uses of precision 0

2013-12-02 Thread Richard Sandiford
I noticed that there were still a couple of tests for zero precision.
This patch replaces them with asserts when handling separately-supplied
precisions and simply drops them when handling existing wide_ints.
(The idea is that most code would break for zero precision wide_ints
and only asserting in some use sites would be inconsistent.)

Also, to_shwi is called either with a nonzero precision argument or
with no argument.  I think it'd be clearer to split the two cases into
separate (overloaded) functions.  It's also more efficient, since the
compiler doesn't know that a variable-precision argument must be nonzero.
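
The shape of that split can be sketched on a single 64-bit block. Everything here is illustrative: SmallInt, sext64 and zext64 are hypothetical stand-ins for generic_wide_int, sext_hwi and zext_hwi, and the is_sign_extended shortcut from the real code is left out.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low PREC bits of SRC; requires 0 < PREC < 64.
// The final conversion assumes two's complement, as on all GCC hosts.
inline int64_t sext64(int64_t src, unsigned prec) {
  assert(prec > 0 && prec < 64);
  const uint64_t mask = (uint64_t(1) << prec) - 1;
  const uint64_t sign = uint64_t(1) << (prec - 1);
  const uint64_t low = uint64_t(src) & mask;
  return int64_t((low ^ sign) - sign);
}

// Zero-extend the low PREC bits of SRC; requires 0 < PREC < 64.
inline uint64_t zext64(uint64_t src, unsigned prec) {
  assert(prec > 0 && prec < 64);
  return src & ((uint64_t(1) << prec) - 1);
}

// Hypothetical single-block value with a nonzero precision.
struct SmallInt {
  int64_t block;
  unsigned precision;

  // Caller-supplied precision: no "zero means natural" test is needed.
  int64_t to_shwi(unsigned prec) const {
    return prec < 64 ? sext64(block, prec) : block;
  }
  // Natural precision gets its own overload.
  int64_t to_shwi() const { return to_shwi(precision); }

  uint64_t to_uhwi(unsigned prec) const {
    return prec < 64 ? zext64(uint64_t(block), prec) : uint64_t(block);
  }
  uint64_t to_uhwi() const { return to_uhwi(precision); }
};
```

With separate overloads, the one-argument versions never have to branch on a sentinel value, which is the efficiency point made above.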

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


Index: gcc/wide-int.cc
===
--- gcc/wide-int.cc 2013-12-02 20:03:50.112581766 +
+++ gcc/wide-int.cc 2013-12-02 20:12:22.178998274 +
@@ -275,9 +275,8 @@ wi::from_mpz (const_tree type, mpz_t x,
 wide_int
 wi::max_value (unsigned int precision, signop sgn)
 {
-  if (precision == 0)
-return shwi (0, precision);
-  else if (sgn == UNSIGNED)
+  gcc_checking_assert (precision != 0);
+  if (sgn == UNSIGNED)
 /* The unsigned max is just all ones.  */
 return shwi (-1, precision);
   else
@@ -290,7 +289,8 @@ wi::max_value (unsigned int precision, s
 wide_int
 wi::min_value (unsigned int precision, signop sgn)
 {
-  if (precision == 0 || sgn == UNSIGNED)
+  gcc_checking_assert (precision != 0);
+  if (sgn == UNSIGNED)
 return uhwi (0, precision);
   else
 /* The signed min is all zeros except the top bit.  This must be
@@ -1487,9 +1487,6 @@ wi::popcount (const wide_int_ref &x)
   unsigned int i;
   int count;
 
-  if (x.precision == 0)
-return 0;
-
   /* The high order block is special if it is the last block and the
  precision is not an even multiple of HOST_BITS_PER_WIDE_INT.  We
  have to clear out any ones above the precision before doing
@@ -2082,10 +2079,6 @@ wi::ctz (const wide_int_ref &x)
 int
 wi::exact_log2 (const wide_int_ref &x)
 {
-  /* 0-precision values can only hold 0.  */
-  if (x.precision == 0)
-return -1;
-
   /* Reject cases where there are implicit -1 blocks above HIGH.  */
   if (x.len * HOST_BITS_PER_WIDE_INT < x.precision && x.sign_mask () < 0)
 return -1;
Index: gcc/wide-int.h
===
--- gcc/wide-int.h  2013-12-02 19:52:05.424989079 +
+++ gcc/wide-int.h  2013-12-02 20:12:22.179998282 +
@@ -644,8 +644,10 @@ class GTY(()) generic_wide_int : public
   generic_wide_int (const T &, unsigned int);
 
   /* Conversions.  */
-  HOST_WIDE_INT to_shwi (unsigned int = 0) const;
-  unsigned HOST_WIDE_INT to_uhwi (unsigned int = 0) const;
+  HOST_WIDE_INT to_shwi (unsigned int) const;
+  HOST_WIDE_INT to_shwi () const;
+  unsigned HOST_WIDE_INT to_uhwi (unsigned int) const;
+  unsigned HOST_WIDE_INT to_uhwi () const;
   HOST_WIDE_INT to_short_addr () const;
 
   /* Public accessors for the interior of a wide int.  */
@@ -735,18 +737,23 @@ inline generic_wide_int ::gener
 inline HOST_WIDE_INT
 generic_wide_int ::to_shwi (unsigned int precision) const
 {
-  if (precision == 0)
-{
-  if (is_sign_extended)
-   return this->get_val ()[0];
-  precision = this->get_precision ();
-}
   if (precision < HOST_BITS_PER_WIDE_INT)
 return sext_hwi (this->get_val ()[0], precision);
   else
 return this->get_val ()[0];
 }
 
+/* Return THIS as a signed HOST_WIDE_INT, in its natural precision.  */
+template 
+inline HOST_WIDE_INT
+generic_wide_int ::to_shwi () const
+{
+  if (is_sign_extended)
+return this->get_val ()[0];
+  else
+return to_shwi (this->get_precision ());
+}
+
 /* Return THIS as an unsigned HOST_WIDE_INT, zero-extending from
PRECISION.  If THIS does not fit in PRECISION, the information
is lost.  */
@@ -754,14 +761,20 @@ generic_wide_int ::to_shwi (uns
 inline unsigned HOST_WIDE_INT
 generic_wide_int ::to_uhwi (unsigned int precision) const
 {
-  if (precision == 0)
-precision = this->get_precision ();
   if (precision < HOST_BITS_PER_WIDE_INT)
 return zext_hwi (this->get_val ()[0], precision);
   else
 return this->get_val ()[0];
 }
 
+/* Return THIS as an unsigned HOST_WIDE_INT, in its natural precision.  */
+template 
+inline unsigned HOST_WIDE_INT
+generic_wide_int ::to_uhwi () const
+{
+  return to_uhwi (this->get_precision ());
+}
+
 /* TODO: The compiler is half converted from using HOST_WIDE_INT to
represent addresses to using offset_int to represent addresses.
We use to_short_addr at the interface from new code to old,
@@ -2289,9 +2302,7 @@ wi::add (const T1 &x, const T2 &y, signo
   unsigned HOST_WIDE_INT xl = xi.ulow ();
   unsigned HOST_WIDE_INT yl = yi.ulow ();
   unsigned HOST_WIDE_INT resultl = xl + yl;
-  if (precision == 0)
-   *overflow = false;
-  else if (sgn == SIGNED)
+  if (sgn == SIGNED)
*overflow = (((resultl ^ xl) & (r

Re: [wwwdocs] Document Runtime CPU detection builtins

2013-12-02 Thread Gerald Pfeifer
On Tue, 21 Aug 2012, Sriraman Tallam wrote:
> Committed after making the changes.
> 
> One small problem, I am not sure how to fix this:
> 
> The hyper link I referenced is :
> http://gcc.gnu.org/onlinedocs/gcc/X86-Built_002din-Functions.html#X86-Built_002din-Functions
> 
> whereas the committed changes.html is pointing to:
> http://gcc.gnu.org/onlinedocs/gcc/X86-Built-in-Functions.html#X86-Built-in-Functions
> 
> Please note that the "_002din" is missing. This makes the link broken,
> did I miss anything? I verified that I submitted the right link.

Based on changes I just committed and applied on gcc.gnu.org, finally
there won't be new files or anchors with "_002d" in their names, just
"-" instead.

The patch below, which I just committed, adjust the links.  All simpler
and nicer now. :-)

Gerald

Index: gcc-4.8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.124
diff -u -3 -p -r1.124 changes.html
--- gcc-4.8/changes.html26 Nov 2013 03:21:07 -  1.124
+++ gcc-4.8/changes.html2 Dec 2013 20:17:11 -
@@ -512,7 +512,7 @@ int i = A().f();  // error, f() requires
 added. For details, see the
 http://gcc.gnu.org/wiki/avr-gcc#Fixed-Point_Support";>
   GCC wiki and the
-http://gcc.gnu.org/onlinedocs/gcc/Fixed_002dPoint.html";>
+http://gcc.gnu.org/onlinedocs/gcc/Fixed-Point.html";>
   user manual.  The support is not complete. 
   
   A new print modifier %r for register operands in inline
@@ -584,7 +584,7 @@ int i = A().f();  // error, f() requires
   __builtin_cpu_is("westmere") returns a positive integer if
   the run-time CPU is an Intel Core i7 Westmere processor.  Please refer
   to the http://gcc.gnu.org/onlinedocs/gcc/X86-Built_002din-Functions.html#X86-Built_002din-Functions";>
+  
href="http://gcc.gnu.org/onlinedocs/gcc/X86-Built-in-Functions.html#X86-Built-in-Functions";>
   user manual for the list of valid CPU names recognized.
   A built-in function __builtin_cpu_supports has been
   added to detect if the run-time CPU supports a particular ISA feature.
@@ -592,7 +592,7 @@ int i = A().f();  // error, f() requires
   one string literal argument, the ISA feature.  For example,
   __builtin_cpu_supports("ssse3") returns a positive integer
   if the run-time CPU supports SSSE3 instructions.  Please refer to the http://gcc.gnu.org/onlinedocs/gcc/X86-Built_002din-Functions.html#X86-Built_002din-Functions";>
+  
href="http://gcc.gnu.org/onlinedocs/gcc/X86-Built-in-Functions.html#X86-Built-in-Functions";>
   user manual for the list of valid ISA names recognized.
 
 Caveat: If these built-in functions are called before any static


[PATCH] Allow compounds with empty initializer in pedantic mode (PR c/59351)

2013-12-02 Thread Marek Polacek
We triggered an assert on the attached testcase, because when building the
compound literal with an empty initial value complete_array_type returns
3, but we assert that it returns 0.  It returns 3 only in the pedantic mode,
where empty initializer braces are forbidden.  Since we already gave
a warning, I think we could loosen the assert a bit and allow
empty initial values at that point.  sizeof on such a compound literal
then yields zero, which I think is correct.
The assert exists even in GCC 4.0.

Regtested/bootstrapped on x86_64-linux, ok for trunk and 4.8 and
perhaps even 4.7?

2013-12-02  Marek Polacek  

PR c/59351
c/
* c-decl.c (build_compound_literal): Allow compound literals with
empty initial value.
testsuite/
* gcc.dg/pr59351.c: New test.

--- gcc/c/c-decl.c.mp3  2013-12-02 20:23:27.947224366 +0100
+++ gcc/c/c-decl.c  2013-12-02 20:25:56.618779873 +0100
@@ -4693,7 +4693,9 @@ build_compound_literal (location_t loc,
 {
   int failure = complete_array_type (&TREE_TYPE (decl),
 DECL_INITIAL (decl), true);
-  gcc_assert (!failure);
+  /* If complete_array_type returns 3, it means that the
+ initial value of the compound literal is empty.  Allow it.  */
+  gcc_assert (failure == 0 || failure == 3);
 
   type = TREE_TYPE (decl);
   TREE_TYPE (DECL_INITIAL (decl)) = type;
--- gcc/testsuite/gcc.dg/pr59351.c.mp3  2013-12-02 20:29:05.612345428 +0100
+++ gcc/testsuite/gcc.dg/pr59351.c  2013-12-02 20:48:47.298751979 +0100
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c99 -Wpedantic" } */
+
+unsigned int
+foo (void)
+{
+  return sizeof ((int[]) {}); /* { dg-warning "ISO C forbids empty initializer braces" } */
+}

Marek


Re: [wide-int] small cleanup in wide-int.*

2013-12-02 Thread Kenneth Zadeck

committed as revision 205599 to wide-int branch.

kenny

On 12/02/2013 05:50 AM, Richard Biener wrote:

On Sat, Nov 30, 2013 at 1:55 AM, Kenneth Zadeck
 wrote:

Richi,

this is the first of either 2 or 3 patches to fix this.  There are two
places that need to be fixed for us to do 1X + 1, and this patch fixes the
first one.  There was an unnecessary call to mul_full, and this was the only
call to mul_full.  So this patch removes the call and also the function itself.

The other place is in tree-vrp, which is discussed here and will be dealt
with in the other patches.

tested on x86-64.

Ok to commit?

Ok.

Thanks,
Richard.


Kenny



On 11/29/2013 05:24 AM, Richard Biener wrote:

On Thu, Nov 28, 2013 at 6:11 PM, Kenneth Zadeck
 wrote:

This patch does three things in wide-int:

1) it cleans up some comments.
2) removes a small amount of trash.
3) it changes the max size of the wide int from being 4x of
MAX_BITSIZE_MODE_ANY_INT to 2x + 1.  This should improve large muls and
divs as well as perhaps help with some cache behavior.

@@ -235,8 +233,8 @@ along with GCC; see the file COPYING3.
  range of a multiply.  This code needs 2n + 2 bits.  */

   #define WIDE_INT_MAX_ELTS \
-  ((4 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) \
-   / HOST_BITS_PER_WIDE_INT)
+  (((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) \
+/ HOST_BITS_PER_WIDE_INT) + 1)

I always wondered why VRP (if that is the only reason we do 2*n+1)
cannot simply use FIXED_WIDE_INT(MAX_BITSIZE_MODE_ANY_INT*2 + 1)?
Other widest_int users should not suffer IMHO.  widest_int should
strictly cover all modes that the target can do any arithmetic on
(thus not XImode or OImode on x86_64).
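
For a sense of scale, the two WIDE_INT_MAX_ELTS formulas from the hunk above can be evaluated directly. The values 64 and 128 are assumptions for a typical x86_64 configuration, and the function names are hypothetical:

```cpp
constexpr unsigned kHostBitsPerWideInt = 64;     // assumed HOST_BITS_PER_WIDE_INT
constexpr unsigned kMaxBitsizeModeAnyInt = 128;  // assumed MAX_BITSIZE_MODE_ANY_INT

// Old definition: room for 4x the widest mode.
constexpr unsigned old_max_elts(unsigned max_bits) {
  return (4 * max_bits + kHostBitsPerWideInt - 1) / kHostBitsPerWideInt;
}

// New definition: room for 2x the widest mode, plus one extra block.
constexpr unsigned new_max_elts(unsigned max_bits) {
  return ((2 * max_bits + kHostBitsPerWideInt - 1) / kHostBitsPerWideInt) + 1;
}

static_assert(old_max_elts(kMaxBitsizeModeAnyInt) == 8, "eight blocks before");
static_assert(new_max_elts(kMaxBitsizeModeAnyInt) == 5, "five blocks after");
```

Under these assumptions each widest_int shrinks from eight blocks to five; a 1x + 1 scheme would need only three.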

Richard.


ok to commit




Index: gcc/wide-int.cc
===
--- gcc/wide-int.cc	(revision 205597)
+++ gcc/wide-int.cc	(working copy)
@@ -1247,22 +1247,18 @@ wi_pack (unsigned HOST_WIDE_INT *result,
 }
 
 /* Multiply Op1 by Op2.  If HIGH is set, only the upper half of the
-   result is returned.  If FULL is set, the entire result is returned
-   in a mode that is twice the width of the inputs.  However, that
-   mode needs to exist if the value is to be usable.  Clients that use
-   FULL need to check for this.
-
-   If HIGH or FULL are not set, throw away the upper half after the
-   check is made to see if it overflows.  Unfortunately there is no
-   better way to check for overflow than to do this.  If OVERFLOW is
-   nonnull, record in *OVERFLOW whether the result overflowed.  SGN
-   controls the signedness and is used to check overflow or if HIGH or
-   FULL is set.  */
+   result is returned.  
+
+   If HIGH is not set, throw away the upper half after the check is
+   made to see if it overflows.  Unfortunately there is no better way
+   to check for overflow than to do this.  If OVERFLOW is nonnull,
+   record in *OVERFLOW whether the result overflowed.  SGN controls
+   the signedness and is used to check overflow or if HIGH is set.  */
 unsigned int
 wi::mul_internal (HOST_WIDE_INT *val, const HOST_WIDE_INT *op1,
 		  unsigned int op1len, const HOST_WIDE_INT *op2,
 		  unsigned int op2len, unsigned int prec, signop sgn,
-		  bool *overflow, bool high, bool full)
+		  bool *overflow, bool high)
 {
   unsigned HOST_WIDE_INT o0, o1, k, t;
   unsigned int i;
@@ -1313,7 +1309,7 @@ wi::mul_internal (HOST_WIDE_INT *val, co
   /* If we need to check for overflow, we can only do half wide
  multiplies quickly because we need to look at the top bits to
  check for the overflow.  */
-  if ((high || full || needs_overflow)
+  if ((high || needs_overflow)
   && (prec <= HOST_BITS_PER_HALF_WIDE_INT))
 {
   unsigned HOST_WIDE_INT r;
@@ -1372,7 +1368,7 @@ wi::mul_internal (HOST_WIDE_INT *val, co
 
   /* We did unsigned math above.  For signed we must adjust the
  product (assuming we need to see that).  */
-  if (sgn == SIGNED && (full || high || needs_overflow))
+  if (sgn == SIGNED && (high || needs_overflow))
 {
   unsigned HOST_WIDE_INT b;
   if (op1[op1len-1] < 0)
@@ -1420,13 +1416,7 @@ wi::mul_internal (HOST_WIDE_INT *val, co
 	  *overflow = true;
 }
 
-  if (full)
-{
-  /* compute [2prec] <- [prec] * [prec] */
-  wi_pack ((unsigned HOST_WIDE_INT *) val, r, 2 * half_blocks_needed);
-  return canonize (val, blocks_needed * 2, prec * 2);
-}
-  else if (high)
+  if (high)
 {
   /* compute [prec] <- ([prec] * [prec]) >> [prec] */
   wi_pack ((unsigned HOST_WIDE_INT *) val,
Index: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 205597)
+++ gcc/fold-const.c	(working copy)
@@ -5962,11 +5962,12 @@ extract_muldiv_1 (tree t, tree c, enum t
 	 assuming no overflow.  */
   if (tcode == code)
 	{
-	  bool overflow_p;
+	  bool overflow_p = false;
+	  bool overflow_mul_p;
 	  signop sign = TYPE_SIGN (ctype);
-	  wide_int mul = wi::mul_full (op1, c, sign);

Re: [patch] introduce aarch64 as a Go architecture

2013-12-02 Thread Mike Stump
On Dec 2, 2013, at 1:10 AM, Andrew Pinski  wrote:
>> All the documentation relevant to this architecture uses the term
>> "aarch64". How is arm64 obvious?
> 
> The same reason Linus used arm64:
> https://lkml.org/lkml/2012/7/15/133

Thanks for the link, ah, now I exactly understand what that port is!  :-)  
arm64 conveys more to me, more quickly.

Re: [wide-int] Add fast path for hosts with HWI widening multiplication

2013-12-02 Thread Paolo Bonzini
On 02/12/2013 20:34, Richard Sandiford wrote:
>>> >> I followed Joseph's suggestion and reused longlong.h.  I copied it from
>>> >> libgcc rather than glibc since it seemed better for GCC to have a single
>>> >> version across both gcc/ and libgcc/.  I can put it in include/ if that
>>> >> seems better.
>> >
>> > Actually copying complex code like this does not seem maintainable.  I
>> > think there needs to be only one copy in the GCC sources.  If that
>> > requires moving it back from libgcc to gcc, or moving it to include,
>> > do that.
> OK, will do, but which do you prefer?

libgcc/ should not use gcc/ sources too much.  Please put it in include/.
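
For readers who have not looked at longlong.h: among other things it
provides widening multiplication primitives such as umul_ppmm.  As a rough
portable sketch of the idea only (this is not the longlong.h code, and
sketch_umul_wide is a made-up name), a 64x64->128 bit unsigned multiply can
be assembled from 32-bit halves:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the full 128-bit product of A and B,
   returning the high and low 64-bit words through HI and LO.  */
static void
sketch_umul_wide (uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
  const uint64_t mask = 0xffffffffu;
  uint64_t a_lo = a & mask, a_hi = a >> 32;
  uint64_t b_lo = b & mask, b_hi = b >> 32;

  /* Four partial products; each fits in 64 bits.  */
  uint64_t p0 = a_lo * b_lo;
  uint64_t p1 = a_lo * b_hi;
  uint64_t p2 = a_hi * b_lo;
  uint64_t p3 = a_hi * b_hi;

  /* Sum of everything that lands in bits 32..63, plus its carries.
     At most three 32-bit values are added, so this cannot overflow.  */
  uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);

  *lo = (p0 & mask) | (mid << 32);
  *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```

On hosts with a native widening multiply the whole function collapses to
one instruction, which is the fast path the subject line refers to.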

Paolo


Re: wwwdocs: Broken links due to the preprocess script

2013-12-02 Thread Tobias Burnus

Gerald Pfeifer wrote:

Below you'll find a patch for maintainer-scripts/update_web_docs_svn
which I tested on gcc.gnu.org and the current documentation pages (not
those for older releases) are adjusted now.

Among others this fixes the link you reported above (though adjusting
gcc-4.9/changes.html directly is now a logical next step).

Thoughts?


Looks good to me. (I fully concur that the _002d is ugly.)

Tobias


Re: [PATCH, testsuite] Fix some testcases for nds32 target and provide new nds32 target specific tests

2013-12-02 Thread Mike Stump
On Dec 2, 2013, at 5:02 AM, Chung-Ju Wu  wrote:
> Perhaps I should have used the following description, which seems much better:
> 
> +/* { dg-skip-if "Variadic funcs have all args on stack. Normal funcs have args in registers." { nds32*-*-* } "*" "" } */

Reads nicely, thanks.  Also, if I do a port, and this test case fails, and I 
read that and those facts apply to my port, I can just effortlessly go that 
direction.  To me, this is the best use of this information.  Secondary would 
be if people wanted to do a target_supports, it would be more clear to the 
untrained why it was done in the first place.

Re: [wide-int] Add fast path for hosts with HWI widening multiplication

2013-12-02 Thread Richard Sandiford
Ian Lance Taylor  writes:
> On Sun, Dec 1, 2013 at 2:28 AM, Richard Sandiford
>  wrote:
>> I followed Joseph's suggestion and reused longlong.h.  I copied it from
>> libgcc rather than glibc since it seemed better for GCC to have a single
>> version across both gcc/ and libgcc/.  I can put it in include/ if that
>> seems better.
>
> Actually copying complex code like this does not seem maintainable.  I
> think there needs to be only one copy in the GCC sources.  If that
> requires moving it back from libgcc to gcc, or moving it to include,
> do that.

OK, will do, but which do you prefer?

Thanks,
Richard


Re: [PATCH] Fix PR56344

2013-12-02 Thread Marek Polacek
On Mon, Dec 02, 2013 at 05:40:33PM +0100, Marek Polacek wrote:
> On Mon, Dec 02, 2013 at 04:01:05PM +0100, Richard Biener wrote:
> > On Wed, Mar 13, 2013 at 1:57 PM, Marek Polacek  wrote:
> > > Ping.
> > 
> > Ok.  (yay, oldest patch in my review queue ...)
> 
> ;) thanks.  Just to be sure, did you mean to ok this patch (that is,
> the one with HOST_BITS_PER_INT)?
> 
> Bootstrap/regtest in progress.
> 
> 2013-12-02  Marek Polacek  
> 
>   PR middle-end/56344
>   * calls.c (expand_call): Disallow passing huge arguments
>   by value.
> 
> --- gcc/calls.c.mp4   2013-12-02 17:12:18.621057873 +0100
> +++ gcc/calls.c   2013-12-02 17:32:35.523684716 +0100
> @@ -3047,6 +3047,15 @@ expand_call (tree exp, rtx target, int i
>   {
> rtx before_arg = get_last_insn ();
>  
> +   /* We don't allow passing huge (> 2^30 B) arguments
> +  by value.  It would cause an overflow later on.  */
> +   if (adjusted_args_size.constant
> +   >= (1 << (HOST_BITS_PER_INT - 1)))

Surely I meant to use "HOST_BITS_PER_INT - 2" here.
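
To spell out the arithmetic behind that correction, here is a small sketch.
It assumes HOST_BITS_PER_INT is simply the width of the host int in bits;
SKETCH_BITS_PER_INT and the two helpers are illustrative names, not GCC's.

```c
#include <assert.h>
#include <limits.h>

/* Stand-in for HOST_BITS_PER_INT (assumption: width of int in bits).  */
#define SKETCH_BITS_PER_INT ((int) (CHAR_BIT * sizeof (int)))

/* 1 << (bits - 2) is the largest power of two that fits in a signed
   int; with a 32-bit int this is 2^30, matching the "> 2^30 B" comment
   in the patch.  */
static int
sketch_limit (void)
{
  return 1 << (SKETCH_BITS_PER_INT - 2);
}

/* 1 << (bits - 1) would shift into the sign bit of a signed int.  The
   same bit pattern is computed here in unsigned arithmetic only to show
   that it lies above INT_MAX.  */
static unsigned int
sketch_sign_bit (void)
{
  return 1u << (SKETCH_BITS_PER_INT - 1);
}
```

So the comparison against 1 << (HOST_BITS_PER_INT - 1) cannot be expressed
in a signed int, while the "- 2" variant is a well-defined positive bound.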

Marek


Re: [PATCH] Avoid SIMD clone dg-do run tests if assembler doesn't support AVX2 (PR lto/59326)

2013-12-02 Thread Richard Henderson
On 11/29/2013 12:02 PM, Jakub Jelinek wrote:
> As we create SIMD clones for all of SSE2, AVX and AVX2 ISAs right now,
> the assembler needs to support SSE2, AVX and AVX2.  Apparently some folks
> are still using binutils that don't handle that, this patch conditionalizes
> the test on that.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.


r~


Fix a bug in points-to solver

2013-12-02 Thread Xinliang David Li
The points-to solver has a bug that can cause complex constraints to be
skipped, leading to wrong points-to results. In the case that exposed
the problem, there is an sd constraint, x = *y, which is never processed.
The final points-to set of 'y' is { NULL READONLY ESCAPED NOLOCAL }, but the
points-to set of 'x' is {}.

What happens is that before 'y' is processed, it is merged with another
node 'z' during cycle elimination (the complex constraints get
transferred to 'z'), but 'z' is not marked as 'changed', so it is
skipped in a later iteration.
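
The scenario can be sketched with a toy worklist.  This is not the code in
tree-ssa-structalias.c; the union-find array, the changed flags and all
sketch_* names are invented for illustration, and a plain counter stands in
for the list of complex constraints attached to a node.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NODES 4

static unsigned int sketch_parent[SKETCH_NODES];  /* union-find forest */
static bool sketch_changed[SKETCH_NODES];         /* "needs processing" */
static int sketch_complex[SKETCH_NODES];          /* complex constraints */

static void
sketch_reset (void)
{
  for (unsigned int i = 0; i < SKETCH_NODES; i++)
    {
      sketch_parent[i] = i;
      sketch_changed[i] = false;
      sketch_complex[i] = 0;
    }
}

/* Return the representative of node N.  */
static unsigned int
sketch_find (unsigned int n)
{
  while (sketch_parent[n] != n)
    n = sketch_parent[n];
  return n;
}

/* Merge FROM into TO (TO must differ from FROM) and move FROM's
   complex constraints over to TO.  */
static void
sketch_merge (unsigned int from, unsigned int to)
{
  sketch_parent[from] = to;
  sketch_complex[to] += sketch_complex[from];
  sketch_complex[from] = 0;
}

/* Visit node I, which cycle elimination merges into MERGE_TARGET.
   With MARK_REP set, the representative is flagged as changed (the
   behaviour the patch adds).  Returns true if the caller should skip I
   because it is no longer its own representative.  */
static bool
sketch_visit (unsigned int i, unsigned int merge_target, bool mark_rep)
{
  sketch_merge (i, merge_target);
  unsigned int rep = sketch_find (i);
  if (mark_rep)
    sketch_changed[rep] = true;
  return rep != i;
}
```

With node 1 playing 'y' and node 2 playing 'z', calling
sketch_visit (1, 2, false) leaves node 2 holding the constraint but not
flagged, so nothing ever processes it; passing true flags node 2.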

The attached patch fixes the problem. The problem is exposed by a
large program built with -fprofile-generate in LIPO mode, so there
is no small testcase attached.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu, OK for trunk?

Index: ChangeLog
===
--- ChangeLog   (revision 205579)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2013-12-02  Xinliang David Li  
+
+   * tree-ssa-structalias.c (solve_graph): Mark rep node changed
+   after cycle elimination.
+
 2013-12-01  Eric Botcazou  

* config/i386/winnt.c (i386_pe_asm_named_section): Be prepared for an
Index: tree-ssa-structalias.c
===
--- tree-ssa-structalias.c  (revision 205579)
+++ tree-ssa-structalias.c  (working copy)
@@ -2655,8 +2655,13 @@ solve_graph (constraint_graph_t graph)

  /* In certain indirect cycle cases, we may merge this
 variable to another.  */
- if (eliminate_indirect_cycles (i) && find (i) != i)
-   continue;
+ if (eliminate_indirect_cycles (i))
+{
+ unsigned int rep = find (i);
+ bitmap_set_bit (changed, rep);
+ if (i != rep)
+   continue;
+}

  /* If the node has changed, we need to process the
 complex constraints and outgoing edges again.  */


Re: [C++,doc] vector conditional expression

2013-12-02 Thread Gerald Pfeifer
On Mon, 2 Dec 2013, Marc Glisse wrote:
>> Index: doc/extend.texi
>> ===
>> +In C++, the ternary operator @code{?:} is available. @code{a?b:c}, where
>> +@code{b} and @code{c} are vectors of the same type and @code{a} is an
>> +integer vector of the same size and number of elements as @code{b} and
>> +@code{c}
>> 
>> Why "same size and number of elements" in the above?  What is the
>> difference between these two?
> (on x86_64)
> A vector of 4 int and a vector of 4 long have the same number of elements but
> not the same size.
> A vector of 8 int and a vector of 4 long have the same size but not the same
> number of elements.
> 
> For semantics, we want the same number of elements. To match the 
> hardware, we want the same size.

Ah, so it was good I asked. :-)  Thanks for your explanation.

It seems the way this is intended is
  integer vector of the (same size and number of elements) as 
whereas I parsed it as
  (integer vector of the same size) and (number of elements) as
hence wondering what the difference between the size of the vector and 
the number of elements was.
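
The distinction Marc describes can be checked with GCC's vector_size
attribute.  The sketch below uses fixed-width element types instead of
int and long so that it does not depend on the x86_64 data model; the
typedef names and the SKETCH_NELTS macro are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t sketch_v4i32 __attribute__ ((vector_size (16))); /* 4 x 32-bit */
typedef int64_t sketch_v4i64 __attribute__ ((vector_size (32))); /* 4 x 64-bit */
typedef int32_t sketch_v8i32 __attribute__ ((vector_size (32))); /* 8 x 32-bit */

/* Number of elements in a vector type with the given element type.  */
#define SKETCH_NELTS(vec_type, elt_type) \
  (sizeof (vec_type) / sizeof (elt_type))

/* Same number of elements, different total size.  */
static_assert (SKETCH_NELTS (sketch_v4i32, int32_t)
               == SKETCH_NELTS (sketch_v4i64, int64_t), "same count");
static_assert (sizeof (sketch_v4i32) != sizeof (sketch_v4i64), "sizes differ");

/* Same total size, different number of elements.  */
static_assert (sizeof (sketch_v8i32) == sizeof (sketch_v4i64), "same size");
static_assert (SKETCH_NELTS (sketch_v8i32, int32_t)
               != SKETCH_NELTS (sketch_v4i64, int64_t), "counts differ");
```

Only a mask whose elements match both in count and in width, e.g. a
4 x 32-bit integer vector paired with 4 x 32-bit operands, satisfies both
requirements at once.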

Rephrasing this as "the same number and size of elements as" or better
"the same number of elements of the same size as" may help avoid this.

Gerald


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Jakub Jelinek
On Mon, Dec 02, 2013 at 05:59:53PM +0100, Konstantin Serebryany wrote:
> On Mon, Dec 2, 2013 at 5:44 PM, Jakub Jelinek  wrote:
> > On Mon, Dec 02, 2013 at 05:26:45PM +0100, Uros Bizjak wrote:
> >> No, so your patch doesn't regress anything. I can configure with
> >> --disable-libsanitizer to skip build of libsanitizer, although it
> >> would be nice to support RHEL5 derived long-term distributions.
> >>
> >> > Is there a way to test gcc in such environment w/o setting up VMs
> >> > (e.g. chroot, or some such)?
> >>
> >> Maybe gcc compile farm has linux-2.6.18 machine available?
> >
> > That or perhaps try say:
> > mkdir ~/centos5
> > cd ~/centos5
> > wget 
> > http://mirrors.kernel.org/centos/5/os/x86_64/CentOS/glibc-devel-2.5-118.x86_64.rpm
> > wget 
> > http://mirrors.kernel.org/centos/5/os/x86_64/CentOS/glibc-headers-2.5-118.x86_64.rpm
> > wget 
> > http://mirrors.kernel.org/centos/5/os/x86_64/CentOS/kernel-headers-2.6.18-371.el5.x86_64.rpm
> > for i in *.rpm; do
> >   rpm2cpio $i | cpio -id
> > done
> >
> > and then compile with
> > g++ -nostdinc `g++ -v -E -xc++ /dev/null 2>&1 | sed -n '/^#include  > of/{/\/usr\/include/d;s/^ \//-isystem /p}'` -isystem ~/centos5/usr/include/
> > This command will use all standard C++ search paths except for /usr/include,
> > and will use ~/centos5/usr/include/ instead of that.
> 
> Doing this gives me:
> ../gcc/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc:24:20:
> fatal error: stddef.h: No such file or directory
> because stddef.h is found in /usr/include/linux; I guess we need some
> more gcc flags here.

Oops, sorry, should have been:
g++ -nostdinc `g++ -v -E -xc++ /dev/null 2>&1 | sed -n '/^#include 

Re: PATCH: PR other/59055: gcc.texinfo warnings

2013-12-02 Thread H.J. Lu
On Mon, Dec 2, 2013 at 5:10 AM, Gerald Pfeifer  wrote:
> On Fri, 8 Nov 2013, H.J. Lu wrote:
>> bugreport.texi has
>>
>> @menu
>> * Criteria:  Bug Criteria.   Have you really found a bug?
>> * Reporting: Bug Reporting.  How to report a bug effectively.
>> * Known: Trouble.Known problems.
>> * Help: Service. Where to ask for help.
>> @end menu
>>
>> That means include order should be bugreport.texi, trouble.texi,
>> service.texi.  And we need to specify next, previous and up nodes to
>> Service and Trouble nodes.  OK to install?
>
> Thanks for looking into this, H.J.!  Looking at the logic we have
> been using elsewhere, I am wondering whether it shouldn't be the
> order in bugreport.texi that should be adjusted -- switching the
> "Known: Trouble" and "Reporting: Bug Reporting" nodes?

It doesn't work:

/export/gnu/import/git/gcc/gcc/doc/trouble.texi:5: warning: node next
`Trouble' in menu `Service' and in sectioning `Bugs' differ
/export/gnu/import/git/gcc/gcc/doc/trouble.texi:5: warning: node prev
`Trouble' in menu `Bug Reporting' and in sectioning `Gcov' differ
/export/gnu/import/git/gcc/gcc/doc/trouble.texi:5: warning: node up
`Trouble' in menu `Bugs' and in sectioning `Top' differ
/export/gnu/import/git/gcc/gcc/doc/service.texi:5: warning: node prev
`Service' in menu `Trouble' and in sectioning `Bugs' differ
/export/gnu/import/git/gcc/gcc/doc/service.texi:5: warning: node up
`Service' in menu `Bugs' and in sectioning `Top' differ

> That way you could omit
>
>> 2013-11-08  H.J. Lu  
>>
>>   PR other/59055
>>   * doc/gcc.texi: Move Trouble after Bugs in menu.  Include
>>   trouble.texi after bugreport.texi.
>
> those two changes, and only update bugreport.texi?  If this sounds
> acceptable, please go ahead and make this change.
>
>>   * doc/service.texi: Add next, previous and up nodes to Service
>>   nodes.
>>   * doc/trouble.texi: Add next, previous and up nodes to Trouble
>>   nodes.
>
> Why are these necessary?  The texinfo documentation says the following
> about @node:
>
> The subsequent arguments are optional—they are the names of the
> ‘Next’, ‘Previous’, and ‘Up’ pointers, in that order. We strongly
> recommend omitting them if your Texinfo document is hierarchically
> organized, as virtually all are
>

gcc.texi has

@include gcov.texi
@include trouble.texi
@include bugreport.texi
@include service.texi

and there is a menu in bugreport.texi as well as

@node Bug Criteria,Bug Reporting,,Bugs
@node Bug Reporting,,Bug Criteria,Bugs

They aren't really hierarchically organized.


H.J.


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Konstantin Serebryany
On Mon, Dec 2, 2013 at 5:44 PM, Jakub Jelinek  wrote:
> On Mon, Dec 02, 2013 at 05:26:45PM +0100, Uros Bizjak wrote:
>> No, so your patch doesn't regress anything. I can configure with
>> --disable-libsanitizer to skip build of libsanitizer, although it
>> would be nice to support RHEL5 derived long-term distributions.
>>
>> > Is there a way to test gcc in such environment w/o setting up VMs
>> > (e.g. chroot, or some such)?
>>
>> Maybe gcc compile farm has linux-2.6.18 machine available?
>
> That or perhaps try say:
> mkdir ~/centos5
> cd ~/centos5
> wget 
> http://mirrors.kernel.org/centos/5/os/x86_64/CentOS/glibc-devel-2.5-118.x86_64.rpm
> wget 
> http://mirrors.kernel.org/centos/5/os/x86_64/CentOS/glibc-headers-2.5-118.x86_64.rpm
> wget 
> http://mirrors.kernel.org/centos/5/os/x86_64/CentOS/kernel-headers-2.6.18-371.el5.x86_64.rpm
> for i in *.rpm; do
>   rpm2cpio $i | cpio -id
> done
>
> and then compile with
> g++ -nostdinc `g++ -v -E -xc++ /dev/null 2>&1 | sed -n '/^#include  of/{/\/usr\/include/d;s/^ \//-isystem /p}'` -isystem ~/centos5/usr/include/
> This command will use all standard C++ search paths except for /usr/include,
> and will use ~/centos5/usr/include/ instead of that.

Doing this gives me:
../gcc/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc:24:20:
fatal error: stddef.h: No such file or directory
because stddef.h is found in /usr/include/linux; I guess we need some
more gcc flags here.


>> with #if LINUX_VERSION_CODE >= 132640
Good idea, let me try that.


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Jakub Jelinek
On Mon, Dec 02, 2013 at 05:43:17PM +0100, Konstantin Serebryany wrote:
> We can fix this particular failure, but unless someone helps us test
> the code upstream
> (not just that it builds, but also that it works) asan has little
> chance to work on old systems anyway.

For kernel headers that were added only recently and don't exist in
older kernels, perhaps you can
#include 
and guard the include of such headers plus everything related to that
with #if LINUX_VERSION_CODE >= 132640
(at least according to the kernel's git, the linux/perf_event.h header was
added in 2.6.32).
Or alternatively use configure, but you'd need to use it in both
compiler-rt buildsystem and gcc's libsanitizer configure.

Jakub


Re: [PATCH i386] Enable -freorder-blocks-and-partition

2013-12-02 Thread Martin Liška
Dear Teresa,
   I will double-check today whether the graphs are correct :)

Martin

On 2 December 2013 17:16, Jeff Law  wrote:
> On 12/02/13 08:16, Teresa Johnson wrote:
>>
>>
>> I'm wondering if the -fno-reorder-blocks-and-partition graph really
>> had that disabled. I am surprised that the size of the .text and
>> .text.hot did not shrink from splitting.
>
> Could be due to needing longer jump opcodes to reach the unlikely sections.
> jeff
>


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Jakub Jelinek
On Mon, Dec 02, 2013 at 05:26:45PM +0100, Uros Bizjak wrote:
> No, so your patch doesn't regress anything. I can configure with
> --disable-libsanitizer to skip build of libsanitizer, although it
> would be nice to support RHEL5 derived long-term distributions.
> 
> > Is there a way to test gcc in such environment w/o setting up VMs
> > (e.g. chroot, or some such)?
> 
> Maybe gcc compile farm has linux-2.6.18 machine available?

That or perhaps try say:
mkdir ~/centos5
cd ~/centos5
wget 
http://mirrors.kernel.org/centos/5/os/x86_64/CentOS/glibc-devel-2.5-118.x86_64.rpm
wget 
http://mirrors.kernel.org/centos/5/os/x86_64/CentOS/glibc-headers-2.5-118.x86_64.rpm
wget 
http://mirrors.kernel.org/centos/5/os/x86_64/CentOS/kernel-headers-2.6.18-371.el5.x86_64.rpm
for i in *.rpm; do
  rpm2cpio $i | cpio -id
done

and then compile with
g++ -nostdinc `g++ -v -E -xc++ /dev/null 2>&1 | sed -n '/^#include 

Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Konstantin Serebryany
On Mon, Dec 2, 2013 at 5:26 PM, Uros Bizjak  wrote:
> On Mon, Dec 2, 2013 at 5:12 PM, Konstantin Serebryany
>  wrote:
>
>>>>> Does it support using libbacktrace in GCC?
>>>>
>>>> Not on its own, but the support in the upstream maintained files
>>>> is there, so hopefully it will be just a matter of follow-up patch
>>>> with configury/Makefile etc. stuff, I'll work on it once the merge is
>>>> committed.
>>>>
>>>> What is more important now is to test the patch Kostya posted on non-x86_64
>>>> targets and/or older kernel headers (say RHEL5, older SLES, etc.).
>>>
>>> Unfortunately, the build breaks on CentOS 5.10 (= RHEL5) with:
>>>
>>> libtool: compile:  /home/uros/gcc-build-xxx/./gcc/xgcc -shared-libgcc
>>> -B/home/uros/gcc-build-xxx/./gcc -nostdinc++
>>> -L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/src
>>> -L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
>>> -L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
>>> -B/usr/local/x86_64-unknown-linux-gnu/bin/
>>> -B/usr/local/x86_64-unknown-linux-gnu/lib/ -isystem
>>> /usr/local/x86_64-unknown-linux-gnu/include -isystem
>>> /usr/local/x86_64-unknown-linux-gnu/sys-include -D_GNU_SOURCE -D_DEBUG
>>> -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
>>> -I. -I../../../../gcc-svn/trunk/libsanitizer/sanitizer_common -I
>>> ../../../../gcc-svn/trunk/libsanitizer/include -Wall -W
>>> -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC
>>> -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer
>>> -funwind-tables -fvisibility=hidden -Wno-variadic-macros
>>> -I../../libstdc++-v3/include
>>> -I../../libstdc++-v3/include/x86_64-unknown-linux-gnu
>>> -I../../../../gcc-svn/trunk/libsanitizer/../libstdc++-v3/libsupc++ -g
>>> -O2 -D_GNU_SOURCE -MT sanitizer_platform_limits_linux.lo -MD -MP -MF
>>> .deps/sanitizer_platform_limits_linux.Tpo -c
>>> ../../../../gcc-svn/trunk/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
>>>  -fPIC -DPIC -o .libs/sanitizer_platform_limits_linux.o
>>> ../../../../gcc-svn/trunk/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc:54:30:
>>> fatal error: linux/perf_event.h: No such file or directory
>>>  #include 
>>
>> Sounds familiar. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59068
>> Do things work for you w/o my patch in fresh trunk?
>
> No, so your patch doesn't regress anything. I can configure with
> --disable-libsanitizer to skip build of libsanitizer, although it
> would be nice to support RHEL5 derived long-term distributions.
Ok, so this does not gate the merge.

We can fix this particular failure, but unless someone helps us test
the code upstream
(not just that it builds, but also that it works) asan has little
chance to work on old systems anyway.

--kcc

>
>> Is there a way to test gcc in such environment w/o setting up VMs
>> (e.g. chroot, or some such)?
>
> Maybe gcc compile farm has linux-2.6.18 machine available?
>
> Uros.


Re: [PATCH] Fix PR56344

2013-12-02 Thread Marek Polacek
On Mon, Dec 02, 2013 at 04:01:05PM +0100, Richard Biener wrote:
> On Wed, Mar 13, 2013 at 1:57 PM, Marek Polacek  wrote:
> > Ping.
> 
> Ok.  (yay, oldest patch in my review queue ...)

;) thanks.  Just to be sure, did you mean to ok this patch (that is,
the one with HOST_BITS_PER_INT)?

Bootstrap/regtest in progress.

2013-12-02  Marek Polacek  

PR middle-end/56344
* calls.c (expand_call): Disallow passing huge arguments
by value.

--- gcc/calls.c.mp4 2013-12-02 17:12:18.621057873 +0100
+++ gcc/calls.c 2013-12-02 17:32:35.523684716 +0100
@@ -3047,6 +3047,15 @@ expand_call (tree exp, rtx target, int i
{
  rtx before_arg = get_last_insn ();
 
+ /* We don't allow passing huge (> 2^30 B) arguments
+by value.  It would cause an overflow later on.  */
+ if (adjusted_args_size.constant
+ >= (1 << (HOST_BITS_PER_INT - 1)))
+   {
+ sorry ("passing too large argument on stack");
+ continue;
+   }
+
  if (store_one_arg (&args[i], argblock, flags,
 adjusted_args_size.var != 0,
 reg_parm_stack_space)

Marek


Re: patch for elimination to SP when it is changed in RTL (PR57293)

2013-12-02 Thread Vladimir Makarov

On 12/1/2013, 7:57 AM, James Greenhalgh wrote:

On Thu, Nov 28, 2013 at 10:11:26PM +, Vladimir Makarov wrote:

Committed as rev. 205498.

   2013-11-28  Vladimir Makarov

PR target/57293
* ira.h (ira_setup_eliminable_regset): Remove parameter.
* ira.c (ira_setup_eliminable_regset): Ditto.  Add
SUPPORTS_STACK_ALIGNMENT for crtl->stack_realign_needed.
Don't call lra_init_elimination.
(ira): Call ira_setup_eliminable_regset without arguments.
* loop-invariant.c (calculate_loop_reg_pressure): Remove argument
from ira_setup_eliminable_regset call.
* gcse.c (calculate_bb_reg_pressure): Ditto.
* haifa-sched.c (sched_init): Ditto.
* lra.h (lra_init_elimination): Remove the prototype.
* lra-int.h (lra_insn_recog_data): New member sp_offset.  Move
used_insn_alternative upper.
(lra_eliminate_regs_1): Add one more parameter.
(lra-eliminate): Ditto.
* lra.c (lra_invalidate_insn_data): Set sp_offset.
(setup_sp_offset): New.
(lra_process_new_insns): Call setup_sp_offset.
(lra): Add argument to lra_eliminate calls.
* lra-constraints.c (get_equiv_substitution): Rename to get_equiv.
(get_equiv_with_elimination): New.
(process_addr_reg): Call get_equiv_with_elimination instead of
get_equiv_substitution.
(equiv_address_substitution): Ditto.
(loc_equivalence_change_p): Ditto.
(loc_equivalence_callback, lra_constraints): Ditto.
(curr_insn_transform): Ditto.  Print the sp offset
(process_alt_operands): Prevent stack pointer reloads.
(lra_constraints): Remove one argument from lra_eliminate call.
Move it up.  Mark used hard regs before it.  Use
get_equiv_with_elimination instead of get_equiv_substitution.
* lra-eliminations.c (lra_eliminate_regs_1): Add parameter and
assert for param values combination.  Use sp offset.  Add argument
to lra_eliminate_regs_1 calls.
(lra_eliminate_regs): Add argument to lra_eliminate_regs_1 call.
(curr_sp_change): New static var.
(mark_not_eliminable): Add parameter.  Update curr_sp_change.
Don't prevent elimination to sp if we can calculate its change.
Pass the argument to mark_not_eliminable calls.
(eliminate_regs_in_insn): Add a parameter.  Use sp offset.  Add
argument to lra_eliminate_regs_1 call.
(update_reg_eliminate): Move calculation of hard regs for spill
lower.  Switch off lra_in_progress temporarily to generate regs
involved into elimination.
(lra_init_elimination): Rename to init_elimination.  Make it
static.  Set up insn sp offset, check the offsets at the end of
BBs.
(process_insn_for_elimination): Add parameter.  Pass its value to
eliminate_regs_in_insn.
(lra_eliminate): Add parameter.  Pass its value to
process_insn_for_elimination.  Add assert for param values
combination.  Call init_elimination.  Don't update offsets in
equivalence substitutions.
* lra-spills.c (assign_mem_slot): Don't call lra_eliminate_regs_1
for created stack slot.
(remove_pseudos): Call lra_eliminate_regs_1 before changing memory
onto stack slot.

2013-11-28  Vladimir Makarov

PR target/57293
* gcc.target/i386/pr57293.c: New.


Hi Vlad,

This patch seems to cause some problems for AArch64. I see an assert
triggering when building libgloss:

/work/gcc-clean/build-aarch64-none-elf/obj/gcc1/gcc/xgcc 
-B/work/gcc-clean/build-aarch64-none-elf/obj/gcc1/gcc/ 
-B/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/newlib/ 
-isystem 
/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/newlib/targ-include
 -isystem /work/gcc-clean/src/binutils/newlib/libc/include 
-B/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/libgloss/aarch64
 
-L/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/libgloss/libnosys
 -L/work/gcc-clean/src/binutils/libgloss/aarch64 
-L/work/gcc-clean/build-aarch64-none-elf/obj/binutils/./ld-O2 -g -O2 -g -I. 
-I/work/gcc-clean/src/binutils/libgloss/aarch64/.. -DARM_RDI_MONITOR -o 
rdimon-_exit.o -c /work/gcc-clean/src/binutils/libgloss/aarch64/_exit.c
/work/gcc-clean/src/binutils/libgloss/aarch64/_exit.c: In function '_exit':
/work/gcc-clean/src/binutils/libgloss/aarch64/_exit.c:41:1: internal compiler 
error: in update_reg_eliminate, at lra-eliminations.c:1157
  }

Thanks, James.  I'll try to reproduce it with the cross-compiler.  I 
expected that the patch might be disruptive.  It is pretty big. 
Therefore I started to work on it (and the related PRs) first. I'll try 
to fix as soon as I can.





Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Uros Bizjak
On Mon, Dec 2, 2013 at 5:12 PM, Konstantin Serebryany
 wrote:

>>>> Does it support using libbacktrace in GCC?
>>>
>>> Not on its own, but the support in the upstream maintained files
>>> is there, so hopefully it will be just a matter of follow-up patch
>>> with configury/Makefile etc. stuff, I'll work on it once the merge is
>>> committed.
>>>
>>> What is more important now is to test the patch Kostya posted on non-x86_64
>>> targets and/or older kernel headers (say RHEL5, older SLES, etc.).
>>
>> Unfortunately, the build breaks on CentOS 5.10 (= RHEL5) with:
>>
>> libtool: compile:  /home/uros/gcc-build-xxx/./gcc/xgcc -shared-libgcc
>> -B/home/uros/gcc-build-xxx/./gcc -nostdinc++
>> -L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/src
>> -L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
>> -L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
>> -B/usr/local/x86_64-unknown-linux-gnu/bin/
>> -B/usr/local/x86_64-unknown-linux-gnu/lib/ -isystem
>> /usr/local/x86_64-unknown-linux-gnu/include -isystem
>> /usr/local/x86_64-unknown-linux-gnu/sys-include -D_GNU_SOURCE -D_DEBUG
>> -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
>> -I. -I../../../../gcc-svn/trunk/libsanitizer/sanitizer_common -I
>> ../../../../gcc-svn/trunk/libsanitizer/include -Wall -W
>> -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC
>> -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer
>> -funwind-tables -fvisibility=hidden -Wno-variadic-macros
>> -I../../libstdc++-v3/include
>> -I../../libstdc++-v3/include/x86_64-unknown-linux-gnu
>> -I../../../../gcc-svn/trunk/libsanitizer/../libstdc++-v3/libsupc++ -g
>> -O2 -D_GNU_SOURCE -MT sanitizer_platform_limits_linux.lo -MD -MP -MF
>> .deps/sanitizer_platform_limits_linux.Tpo -c
>> ../../../../gcc-svn/trunk/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
>>  -fPIC -DPIC -o .libs/sanitizer_platform_limits_linux.o
>> ../../../../gcc-svn/trunk/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc:54:30:
>> fatal error: linux/perf_event.h: No such file or directory
>>  #include 
>
> Sounds familiar. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59068
> Do things work for you w/o my patch in fresh trunk?

No, so your patch doesn't regress anything. I can configure with
--disable-libsanitizer to skip build of libsanitizer, although it
would be nice to support RHEL5 derived long-term distributions.

> Is there a way to test gcc in such environment w/o setting up VMs
> (e.g. chroot, or some such)?

Maybe gcc compile farm has linux-2.6.18 machine available?

Uros.


Re: [PATCH i386] Enable -freorder-blocks-and-partition

2013-12-02 Thread Jeff Law

On 12/02/13 08:16, Teresa Johnson wrote:


> I'm wondering if the -fno-reorder-blocks-and-partition graph really
> had that disabled. I am surprised that the size of the .text and
> .text.hot did not shrink from splitting.

Could be due to needing longer jump opcodes to reach the unlikely sections.
jeff



Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Konstantin Serebryany
On Mon, Dec 2, 2013 at 4:49 PM, Uros Bizjak  wrote:
> Hello!
>
>>> Does it support using libbacktrace in GCC?
>>
>> Not on its own, but the support in the upstream maintained files
>> is there, so hopefully it will be just a matter of follow-up patch
>> with configury/Makefile etc. stuff, I'll work on it once the merge is
>> committed.
>>
>> What is more important now is to test the patch Kostya posted on non-x86_64
>> targets and/or older kernel headers (say RHEL5, older SLES, etc.).
>
> Unfortunately, the build breaks on CentOS 5.10 (= RHEL5) with:
>
> libtool: compile:  /home/uros/gcc-build-xxx/./gcc/xgcc -shared-libgcc
> -B/home/uros/gcc-build-xxx/./gcc -nostdinc++
> -L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/src
> -L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
> -L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
> -B/usr/local/x86_64-unknown-linux-gnu/bin/
> -B/usr/local/x86_64-unknown-linux-gnu/lib/ -isystem
> /usr/local/x86_64-unknown-linux-gnu/include -isystem
> /usr/local/x86_64-unknown-linux-gnu/sys-include -D_GNU_SOURCE -D_DEBUG
> -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
> -I. -I../../../../gcc-svn/trunk/libsanitizer/sanitizer_common -I
> ../../../../gcc-svn/trunk/libsanitizer/include -Wall -W
> -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC
> -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer
> -funwind-tables -fvisibility=hidden -Wno-variadic-macros
> -I../../libstdc++-v3/include
> -I../../libstdc++-v3/include/x86_64-unknown-linux-gnu
> -I../../../../gcc-svn/trunk/libsanitizer/../libstdc++-v3/libsupc++ -g
> -O2 -D_GNU_SOURCE -MT sanitizer_platform_limits_linux.lo -MD -MP -MF
> .deps/sanitizer_platform_limits_linux.Tpo -c
> ../../../../gcc-svn/trunk/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
>  -fPIC -DPIC -o .libs/sanitizer_platform_limits_linux.o
> ../../../../gcc-svn/trunk/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc:54:30:
> fatal error: linux/perf_event.h: No such file or directory
>  #include 

Sounds familiar. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59068
Do things work for you w/o my patch in fresh trunk?

Is there a way to test gcc in such environment w/o setting up VMs
(e.g. chroot, or some such)?

--kcc

>   ^
> compilation terminated.
> gmake[4]: *** [sanitizer_platform_limits_linux.lo] Error 1
> gmake[4]: *** Waiting for unfinished jobs
>
> Uros.


Re: [PATCH] Adjust ubsan/vla-1.c test

2013-12-02 Thread Jakub Jelinek
On Mon, Dec 02, 2013 at 04:57:49PM +0100, Marek Polacek wrote:
> This patch puts every VLA test into its own function to make it
> less likely to fail due to stack overflow.
> 
> Ran ubsan testsuite, ok for trunk?

Ok, thanks.
> 2013-12-02  Marek Polacek  
> 
> testsuite/
>   * c-c++-common/ubsan/vla-1.c: Split the tests into individual
>   functions.

Jakub


[PATCH] Adjust ubsan/vla-1.c test

2013-12-02 Thread Marek Polacek
This patch puts every VLA test into its own function to make it
less likely to fail due to stack overflow.

Ran ubsan testsuite, ok for trunk?

2013-12-02  Marek Polacek  

testsuite/
* c-c++-common/ubsan/vla-1.c: Split the tests into individual
functions.

--- gcc/testsuite/c-c++-common/ubsan/vla-1.c.mp4	2013-12-02 16:32:21.139139722 +0100
+++ gcc/testsuite/c-c++-common/ubsan/vla-1.c	2013-12-02 16:48:28.791731232 +0100
@@ -1,33 +1,104 @@
 /* { dg-do run } */
 /* { dg-options "-fsanitize=vla-bound -Wall -Wno-unused-variable" } */
 
-static int
+typedef long int V;
+int x = -1;
+double di = -3.2;
+V v = -6;
+
+static int __attribute__ ((noinline, noclone))
 bar (void)
 {
-  return -42;
+  return -4;
 }
 
-typedef long int V;
-int
-main (void)
+static void __attribute__ ((noinline, noclone))
+fn1 (void)
 {
-  int x = -1;
-  double di = -3.2;
-  V v = -666;
-
   int a[x];
-  int aa[x][x];
-  int aaa[x][x][x];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn2 (void)
+{
+  int a[x][x];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn3 (void)
+{
+  int a[x][x][x];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn4 (void)
+{
   int b[x - 4];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn5 (void)
+{
   int c[(int) di];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn6 (void)
+{
   int d[1 + x];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn7 (void)
+{
   int e[1 ? x : -1];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn8 (void)
+{
   int f[++x];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn9 (void)
+{
   int g[(signed char) --x];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn10 (void)
+{
   int h[(++x, --x, x)];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn11 (void)
+{
   int i[v];
+}
+
+static void __attribute__ ((noinline, noclone))
+fn12 (void)
+{
   int j[bar ()];
+}
 
+int
+main (void)
+{
+  fn1 ();
+  fn2 ();
+  fn3 ();
+  fn4 ();
+  fn5 ();
+  fn6 ();
+  fn7 ();
+  fn8 ();
+  fn9 ();
+  fn10 ();
+  fn11 ();
+  fn12 ();
   return 0;
 }
 
@@ -44,5 +115,5 @@ main (void)
 /* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value 0(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -666(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -42(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -6(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -4(\n|\r\n|\r)" } */

Marek


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Uros Bizjak
Hello!

>> Does it support using libbacktrace in GCC?
>
> Not on its own, but the support in the upstream maintained files
> is there, so hopefully it will be just a matter of follow-up patch
> with configury/Makefile etc. stuff, I'll work on it once the merge is
> committed.
>
> What is more important now is to test the patch Kostya posted on non-x86_64
> targets and/or older kernel headers (say RHEL5, older SLES, etc.).

Unfortunately, the build breaks on CentOS 5.10 (= RHEL5) with:

libtool: compile:  /home/uros/gcc-build-xxx/./gcc/xgcc -shared-libgcc
-B/home/uros/gcc-build-xxx/./gcc -nostdinc++
-L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/src
-L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
-L/home/uros/gcc-build-xxx/x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
-B/usr/local/x86_64-unknown-linux-gnu/bin/
-B/usr/local/x86_64-unknown-linux-gnu/lib/ -isystem
/usr/local/x86_64-unknown-linux-gnu/include -isystem
/usr/local/x86_64-unknown-linux-gnu/sys-include -D_GNU_SOURCE -D_DEBUG
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I. -I../../../../gcc-svn/trunk/libsanitizer/sanitizer_common -I
../../../../gcc-svn/trunk/libsanitizer/include -Wall -W
-Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC
-fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer
-funwind-tables -fvisibility=hidden -Wno-variadic-macros
-I../../libstdc++-v3/include
-I../../libstdc++-v3/include/x86_64-unknown-linux-gnu
-I../../../../gcc-svn/trunk/libsanitizer/../libstdc++-v3/libsupc++ -g
-O2 -D_GNU_SOURCE -MT sanitizer_platform_limits_linux.lo -MD -MP -MF
.deps/sanitizer_platform_limits_linux.Tpo -c
../../../../gcc-svn/trunk/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
 -fPIC -DPIC -o .libs/sanitizer_platform_limits_linux.o
../../../../gcc-svn/trunk/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc:54:30:
fatal error: linux/perf_event.h: No such file or directory
 #include <linux/perf_event.h>
  ^
compilation terminated.
gmake[4]: *** [sanitizer_platform_limits_linux.lo] Error 1
gmake[4]: *** Waiting for unfinished jobs

Uros.
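
For reference, the 132640 threshold quoted earlier in this thread decodes,
under the kernel's usual (major << 16) + (minor << 8) + patch encoding, to
2.6.32.  A minimal sketch of that arithmetic follows; the macro and function
names are hypothetical and are not part of libsanitizer, and whether that
threshold is the right one for linux/perf_event.h is an assumption taken
from the quoted guard, not something verified here.

```c
/* Sketch only: the names below are hypothetical.  The encoding mirrors
   the kernel's KERNEL_VERSION macro from <linux/version.h>.  */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Return nonzero if headers with the given version code meet the
   threshold used by the guard quoted in this thread (assumed 2.6.32).  */
static int
sketch_headers_new_enough (long version_code)
{
  return version_code >= SKETCH_KERNEL_VERSION (2, 6, 32);
}
```

With this encoding a 2.6.18 header set, as on RHEL5-era systems, falls
below the threshold, so a guarded include would be skipped there.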


Re: [PING] [PATCH] Optional alternative base_expr in finding basis for CAND_REFs

2013-12-02 Thread Yufeng Zhang

Ping~

http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03360.html

Thanks,
Yufeng

On 11/26/13 15:02, Yufeng Zhang wrote:

On 11/26/13 12:45, Richard Biener wrote:

On Thu, Nov 14, 2013 at 12:25 AM, Yufeng Zhang   wrote:

On 11/13/13 20:54, Bill Schmidt wrote:

The second version of your original patch is ok with me with the
following changes.  Sorry for the little side adventure into the
next-interp logic; in the end that's going to hurt more than it helps in
this case.  Thanks for having a look at it, anyway.  Thanks also for
cleaning up this version to be less intrusive to common interfaces; I
appreciate it.



Thanks a lot for the review.  I've attached an updated patch with the
suggested changes incorporated.

For the next-interp adventure, I was quite happy to do the experiment; it's
a good chance of gaining insight into the pass.  Many thanks for your prompt
replies and patience in guiding!



Everything else looks OK to me.  Please ask Richard for final approval,
as I'm not a maintainer.



Hi Richard, would you be happy to OK the patch?


Hmm,

+static tree
+get_alternative_base (tree base)
+{
+  tree *result = (tree *) pointer_map_contains (alt_base_map, base);
+
+  if (result == NULL)
+{
+  tree expr;
+  aff_tree aff;
+
+  tree_to_aff_combination_expand (base, TREE_TYPE (base),
+&aff, &name_expansions);
+  aff.offset = tree_to_double_int (integer_zero_node);
+  expr = aff_combination_to_tree (&aff);
+
+  result = (tree *) pointer_map_insert (alt_base_map, base);
+  gcc_assert (!*result);

I believe this cache will never hit (unless you repeatedly ask for
the exact same statement?) - any non-trivial 'base' trees are
not shared and thus not pointer equivalent.


Yes, you are right that non-trivial 'base' trees are rarely shared.
The cache is introduced mainly because get_alternative_base () may be
called twice on the same 'base' tree, once in the
find_basis_for_candidate () for look-up and the other time in
alloc_cand_and_find_basis () for record_potential_basis ().  I'm happy
to leave out the cache if you think the benefit is trivial.
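
To illustrate the point about pointer equivalence, here is a minimal
sketch of a cache keyed on addresses.  The struct and function names are
hypothetical stand-ins, not GCC's tree or pointer_map; the sketch only
shows that two structurally identical but separately allocated nodes miss
each other, while a second query with the very same pointer hits.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tree node; not GCC's real 'tree'.  */
struct node { int code; int value; };

/* A tiny cache keyed on the node's address, like a pointer map.  */
struct ptr_cache { const struct node *key[8]; int val[8]; size_t n; };

/* Return a pointer to the cached value, or NULL on a miss.  */
static int *
cache_lookup (struct ptr_cache *c, const struct node *k)
{
  for (size_t i = 0; i < c->n; i++)
    if (c->key[i] == k)		/* Address comparison only.  */
      return &c->val[i];
  return NULL;
}

/* Record value V for key K if there is room left.  */
static void
cache_insert (struct ptr_cache *c, const struct node *k, int v)
{
  if (c->n < 8)
    {
      c->key[c->n] = k;
      c->val[c->n] = v;
      c->n++;
    }
}
```

This matches the usage described above: the cache only pays off when the
same 'base' pointer is queried more than once.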


Also using tree_to_aff_combination_expand to get at - what
exactly? The address with any constant offset stripped?
Where do you re-construct that offset?  That is, aff.offset,
which you definitely need to get at a candidate?


As explained in the previous RFC emails, the expanded and
constant-offset-stripped base expr is only used for the purpose of basis
look-up.  The corresponding candidate still has the unexpanded base expr
as its 'base_expr', therefore the info on the constant offset is not
lost and doesn't need to be re-constructed.
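
A minimal sketch of that idea, assuming a much-simplified affine address
of the form sym + index * stride + offset.  The names are hypothetical and
this is not the tree-affine API; it only shows that the key used for
look-up drops the constant offset while the original value keeps it.

```c
/* Hypothetical, simplified affine address: sym + index * stride + offset.  */
struct aff_sketch { int sym; long stride; long offset; };

/* Build the look-up key: same address with the constant offset stripped.
   The argument is passed by value, so the caller's copy is untouched.  */
static struct aff_sketch
lookup_key (struct aff_sketch a)
{
  a.offset = 0;
  return a;
}

/* Two references can share a basis if their stripped keys agree.  */
static int
same_basis_key (struct aff_sketch a, struct aff_sketch b)
{
  struct aff_sketch ka = lookup_key (a), kb = lookup_key (b);
  return ka.sym == kb.sym && ka.stride == kb.stride;
}
```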


+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-slsr" } */
+
+typedef int arr_2[50][50];
+
+void foo (arr_2 a2, int v1)
+{
+  int i, j;
+
+  i = v1 + 5;
+  j = i;
+  a2 [i-10] [j] = 2;
+  a2 [i] [j++] = i;
+  a2 [i+20] [j++] = i;
+  a2 [i-3] [i-1] += 1;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "MEM" 5 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */

scanning for 5 MEMs looks non-sensical.  What transform do
you expect?  I see other slsr testcases do similar non-sensical
checking which is bad, too.


As the slsr optimizes CAND_REF candidates by simply lowering them to
MEM_REF from e.g. ARRAY_REF, I think scanning for the number of MEM_REFs
is an effective check.  Alternatively, I can add a follow-up patch to
add some dumping facility in replace_ref () to print out the replacing
actions when -fdump-tree-slsr-details is on.

I hope these can address your concerns.


Regards,
Yufeng





Richard.


Regards,

Yufeng

gcc/

  * gimple-ssa-strength-reduction.c: Include tree-affine.h.
  (name_expansions): New static variable.
  (alt_base_map): Ditto.
  (get_alternative_base): New function.
  (find_basis_for_candidate): For CAND_REF, optionally call
  find_basis_for_base_expr with the returned value from
  get_alternative_base.
  (record_potential_basis): Add new parameter 'base' of type 'tree';
  add an assertion of non-NULL base; use base to set node->base_expr.

  (alloc_cand_and_find_basis): Update; call record_potential_basis
  for CAND_REF with the returned value from get_alternative_base.
  (execute_strength_reduction): Call pointer_map_create for
  alt_base_map; call free_affine_expand_cache with &name_expansions.

gcc/testsuite/

  * gcc.dg/tree-ssa/slsr-41.c: New test.






diff --git a/gcc/gimple-ssa-strength-reduction.c b/gcc/gimple-ssa-strength-reduction.c
index 88afc91..26502c3 100644
--- a/gcc/gimple-ssa-strength-reduction.c
+++ b/gcc/gimple-ssa-strength-reduction.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "hash-table.h"
 #include "tree-ssa-address.h"
+#include "tree-affine.h"
 
 /* Information about a strength reduction candidate.  Each statement
in the candidate table repr

Re: wide-int, misc

2013-12-02 Thread Richard Biener
On Sat, Nov 23, 2013 at 8:22 PM, Mike Stump  wrote:
> Richi has asked that we break the wide-int patch so that the individual port
> and front end maintainers can review their parts without having to go through
> the entire patch.  This patch covers the random pieces that didn't seem to
> fit nicely into other bins.
>
> Ok?

Ok.

Thanks,
Richard.


Re: wide-int, tree-vec

2013-12-02 Thread Richard Biener
On Sat, Nov 23, 2013 at 8:23 PM, Mike Stump  wrote:
> Richi has asked that we break the wide-int patch so that the individual port
> and front end maintainers can review their parts without having to go through
> the entire patch.  This patch covers the tree-vec code.
>
> Ok?

Ok.

Thanks,
Richard.


[PATCH] Fix PR59139

2013-12-02 Thread Richard Biener

This fixes PR59139, ternary support was missing from get_val_for.
Instead of supporting it I simply chose to properly disable its
support.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2013-12-02  Richard Biener  

PR tree-optimization/59139
* tree-ssa-loop-niter.c (chain_of_csts_start): Properly match
code in get_val_for.
(get_val_for): Use gcc_checking_asserts.

* gcc.dg/torture/pr59139.c: New testcase.

Index: gcc/tree-ssa-loop-niter.c
===
*** gcc/tree-ssa-loop-niter.c   (revision 205585)
--- gcc/tree-ssa-loop-niter.c   (working copy)
*** chain_of_csts_start (struct loop *loop,
*** 2075,2081 
return NULL;
  }
  
!   if (gimple_code (stmt) != GIMPLE_ASSIGN)
  return NULL;
  
code = gimple_assign_rhs_code (stmt);
--- 2075,2082 
return NULL;
  }
  
!   if (gimple_code (stmt) != GIMPLE_ASSIGN
!   || gimple_assign_rhs_class (stmt) == GIMPLE_TERNARY_RHS)
  return NULL;
  
code = gimple_assign_rhs_code (stmt);
*** get_val_for (tree x, tree base)
*** 2143,2149 
  {
gimple stmt;
  
!   gcc_assert (is_gimple_min_invariant (base));
  
if (!x)
  return base;
--- 2144,2150 
  {
gimple stmt;
  
!   gcc_checking_assert (is_gimple_min_invariant (base));
  
if (!x)
  return base;
*** get_val_for (tree x, tree base)
*** 2152,2158 
if (gimple_code (stmt) == GIMPLE_PHI)
  return base;
  
!   gcc_assert (is_gimple_assign (stmt));
  
/* STMT must be either an assignment of a single SSA name or an
   expression involving an SSA name and a constant.  Try to fold that
--- 2153,2159 
if (gimple_code (stmt) == GIMPLE_PHI)
  return base;
  
!   gcc_checking_assert (is_gimple_assign (stmt));
  
/* STMT must be either an assignment of a single SSA name or an
   expression involving an SSA name and a constant.  Try to fold that
Index: gcc/testsuite/gcc.dg/torture/pr59139.c
===
*** gcc/testsuite/gcc.dg/torture/pr59139.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr59139.c  (working copy)
***
*** 0 
--- 1,20 
+ /* { dg-do compile } */
+ 
+ int a, b, c, d, e;
+ int fn1(p1, p2) { return p2 == 0 ? p1 : 1 % p2; }
+ 
+ void fn2()
+ {
+   c = 0;
+   for (;; c = (unsigned short)c)
+ {
+   b = 2;
+   for (; b; b = a)
+   {
+ e = fn1(2, c && 1);
+ d = c == 0 ? e : c;
+ if (d)
+   return;
+   }
+ }
+ }


Re: [PATCH i386] Enable -freorder-blocks-and-partition

2013-12-02 Thread Teresa Johnson
On Thu, Nov 28, 2013 at 6:06 AM, Jan Hubicka  wrote:
>> Dear Teresa and Jan,
>>I tried to test Teresa's patch, but I've encountered two bugs
>> during usage of -fprofile-generate/use (one in SPEC CPU 2006 and
>> Inkscape).
>
> Thanks, this is non-LTO run. Is there a chance to get -flto version, too?
> So we see how things combine with -freorder-functions
>>
>> This will be probably for Jan:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59266
>>
>> second one:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59265
>>
>> There are numbers I recorded for GIMP with and without block reordering.
>>
>> GIMP (-freorder-blocks-and-partition)
>> pages read (no readahead): 597 pages (4K)
>>
>> GIMP (-fno-reorder-blocks-and-partition)
>> pages read (no readahead): 596 pages (4K)
>
> The graphs themselves seem a bit odd, however; why do we have so many accesses
> to cold section with -fno-reorder-blocks-and-partition again?

Comparing the two graphs I don't see additional accesses in the cold
section from -freorder-blocks-and-partition. For the most part the
graphs look identical. In contrast, the graphs Martin had generated
with and without -freorder-blocks-and-partition back in August had a
significant increase in execution out of text.unlikely.

I'm wondering if the -fno-reorder-blocks-and-partition graph really
had that disabled. I am surprised that the size of the .text and
.text.hot did not shrink from splitting. And the accesses in the cold
section in both graphs look suspiciously like the accesses we ended up
with in the cold section when enabling -freorder-blocks-and-partition
back in Aug (although there are certainly a lot fewer than before,
which is good news).

Martin, can you check that the binary used for
-fno-reorder-blocks-and-partition really doesn't have any splitting?

Thanks,
Teresa

>
> Honza
>>
>> Martin
>>
>> On 19 November 2013 23:18, Teresa Johnson  wrote:
>> > On Tue, Nov 19, 2013 at 9:40 AM, Jeff Law  wrote:
>> >> On 11/19/13 10:24, Teresa Johnson wrote:
>> >>>
>> >>> On Tue, Nov 19, 2013 at 7:44 AM, Jan Hubicka  wrote:
>> 
>>  Martin,
>>  can you, please, generate the updated systemtap with
>>  -freorder-blocks-and-partition enabled?
>> 
>>  I am in favour of enabling this - it is a useful pass and it is pointless to
>>  have passes that are not enabled by default.
>>  Is there reason why this would not work on other ELF target? Is it
>>  working
>>  with Darwin and Windows?
>> >>>
>> >>>
>> >>> I don't know how to test these (I don't see any machines listed in the
>> >>> gcc compile farm of those types). For Windows, I assume you mean
>> >>> MinGW, which should be enabled as it is under i386. Should I disable
>> >>> it there and for Darwin?
>> >>>
>> 
>> > This patch enables -freorder-blocks-and-partition by default for x86
>> > at -O2 and up. It is showing some modest gains in cpu2006 performance
>> > with profile feedback and -O2 on an Intel Westmere system. 
>> > Specifically,
>> > I am seeing consistent improvements in 401.bzip2 (1.5-3%), 
>> > 483.xalancbmk
>> > (1.5-3%), and 453.povray (2.5-3%), and no apparent regressions.
>> 
>> 
>>  This actually sounds very good ;)
>> 
>>  Let's see how the systemtap graphs go.  If we end up with the problem
>>  of too many accesses to cold section, I would suggest making cold 
>>  section
>>  subdivided into .unlikely and .unlikely.part (we could have better name)
>>  with the second consisting only of unlikely parts of hot&normal
>>  functions.
>> 
>>  This should reduce the problems we are seeing with mistakenly identifying
>>  code to be cold because of roundoff errors (and it probably makes sense
>>  in general, too).
>>  We will however need to update gold and ld for that.
>> >>>
>> >>>
>> >>> Note that I don't think this would help much unless the linker is
>> >>> changed to move the cold split section close to the hot section. There
>> >>> is probably some fine-tuning we could do eventually in the linker
>> >>> under -ffunction-sections without putting the split portions in a
>> >>> separate section. I.e. clump the split parts together within unlikely.
>> >>> But hopefully this can all be done later on as follow-on work to boost
>> >>> the performance further.
>> >>>
>> >
>> > Bootstrapped and tested on x86-64-unknown-linux-gnu with a normal
>> > bootstrap, a profiledbootstrap and an LTO profiledbootstrap. All were
>> > configured with --enable-languages=all,obj-c++ and tested for both
>> > 32 and 64-bit with RUNTESTFLAGS="--target_board=unix\{-m32,-m64\}".
>> >
>> > It would be good to enable this for additional targets as a follow on,
>> > but it needs more testing for both correctness and performance on those
>> > other targets (i.e for correctness because I see a number of places
>> > in other config/*/*.c files that do some special handling under this
>> >>

Re: [PATCH] Fix PR58115

2013-12-02 Thread Richard Biener
On Sun, Nov 3, 2013 at 11:25 AM, Bernd Edlinger
 wrote:
> Hello,
>
> on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails 
> because of
> an internal compiler error, see PR58115.
>
> The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
> should be enabled.
>
> This happens because invoke_set_current_function_hook changes the pointer
> "this_fn_optabs" after targetm.set_current_function has already modified the
> optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
> in an undefined state.
>
> Boot-strapped and regression-tested on i686-pc-linux-gnu.
>
> Ok for trunk?

Ok.

Thanks,
Richard.

> Regards
> Bernd.


Re: [PATCH] Fix PR56344

2013-12-02 Thread Richard Biener
On Wed, Mar 13, 2013 at 1:57 PM, Marek Polacek  wrote:
> Ping.

Ok.  (yay, oldest patch in my review queue ...)

Thanks,
Richard.

> On Tue, Mar 05, 2013 at 05:06:21PM +0100, Marek Polacek wrote:
>> On Fri, Mar 01, 2013 at 09:41:27AM +0100, Richard Biener wrote:
>> > On Wed, Feb 27, 2013 at 6:38 PM, Joseph S. Myers
>> >  wrote:
>> > > On Wed, 27 Feb 2013, Richard Biener wrote:
>> > >
>> > >> Wouldn't it be better to simply pass this using the variable size 
>> > >> handling
>> > >> code?  Thus, initialize args_size.var for too large constant size 
>> > >> instead?
>> > >
>> > > Would that be compatible with the ABI definition of how a large (constant
>> > > size) argument should be passed?
>> >
>> > I'm not sure.  Another alternative is to expand to __builtin_trap (), but 
>> > that's
>> > probably not easy at this very point.
>> >
>> > Or simply fix the size calculation to not overflow (either don't count bits
>> > or use a double-int).
>>
>> I don't think double_int will help us here.  We won't detect overflow,
>> because we overflowed here (when lower_bound is an int):
>>   lower_bound = INTVAL (XEXP (XEXP (arg->stack_slot, 0), 1));
>> The value from INTVAL () fits when lower_bound is a double_int, but
>> then:
>>   i = lower_bound;
>>   ...
>>   stack_usage_map[i]
>> the size of stack_usage_map is stored in highest_outgoing_arg_in_use,
>> which is an int, so we're limited by an int size here.
>> Changing the type of highest_outgoing_arg_in_use from an int to a
>> double_int isn't worth the trouble, IMHO.
>>
>> Maybe the original approach, only with sorry () instead of error ()
>> and e.g. HOST_BITS_PER_INT - 1 instead of 30 would be appropriate
>> after all.  Dunno.
>>
>>   Marek
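
The truncation discussed above (a wide value stored into an int-sized
index) can be avoided in general by checking the range before narrowing.
A minimal sketch, with a hypothetical helper name that does not exist in
GCC; it says nothing about which fix was finally committed.

```c
#include <limits.h>

/* Hypothetical helper, not GCC code: narrow a wide offset to int,
   reporting failure instead of silently truncating.  Returns 1 and
   stores the result in *OUT on success, 0 (leaving *OUT alone) if the
   value does not fit.  */
static int
narrow_to_int (long long wide, int *out)
{
  if (wide < INT_MIN || wide > INT_MAX)
    return 0;
  *out = (int) wide;
  return 1;
}
```

A caller could then emit a diagnostic on failure rather than indexing an
array with a wrapped value.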


Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2

2013-12-02 Thread Richard Biener
On Mon, Nov 18, 2013 at 1:11 PM, Bernd Edlinger
 wrote:
> Hi,
>
>
> This modified test case exposes a bug in the already approved part of the 
> strict-volatile-bitfields patch:
>
> #include <stdlib.h>
>
> typedef struct {
>   char pad;
>   int arr[0];
> } __attribute__((packed)) str;
>
> str *
> foo (int* src)
> {
>   str *s = malloc (sizeof (str) + sizeof (int));
>   s->arr[0] = 0x12345678;
>   asm volatile("":::"memory");
>   *src = s->arr[0];
>   return s;
> }
>
>
> As we know this test case triggered a recursion in the store_bit_field on ARM 
> and on PowerPC,
> which is no longer reproducible after this patch is applied: 
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02025.html
>
> Additionally it triggered a recursion on extract_bit_field, but _only_ on my 
> local copy of the trunk.
> I had this patch installed, but did not expect it to change anything unless 
> the values are volatile.
> That was caused by this hunk in the strict-volatile-bitfields v4 patch:
>
>
> @@ -1691,45 +1736,19 @@ extract_fixed_bit_field (enum machine_mo
>  includes the entire field.  If such a mode would be larger than
>  a word, we won't be doing the extraction the normal way.  */
>
> -  if (MEM_VOLATILE_P (op0)
> - && flag_strict_volatile_bitfields> 0)
> -   {
> - if (GET_MODE_BITSIZE (GET_MODE (op0))> 0)
> -   mode = GET_MODE (op0);
> - else if (target && GET_MODE_BITSIZE (GET_MODE (target))> 0)
> -   mode = GET_MODE (target);
> - else
> -   mode = tmode;
> -   }
> -  else
> -   mode = get_best_mode (bitsize, bitnum, 0, 0,
> - MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P 
> (op0));
> +  mode = GET_MODE (op0);
> +  if (GET_MODE_BITSIZE (mode) == 0
> + || GET_MODE_BITSIZE (mode)> GET_MODE_BITSIZE (word_mode))
> +   mode = word_mode;
> +  mode = get_best_mode (bitsize, bitnum, 0, 0,
> +   MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
>
>if (mode == VOIDmode)
> /* The only way this should occur is if the field spans word
>boundaries.  */
> return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
>
> So the problem started, because initially this function did not look at 
> GET_MODE(op0)
> and always used word_mode. That was changed, but now also affected 
> non-volatile data.
>
>
> Now, if we solve this differently and install the C++ memory model patch,
> we can avoid introducing the recursion in the extract path,
> and remove these two hunks in the update patch at the same time:
>
> +  else if (MEM_P (str_rtx)
> +  && MEM_VOLATILE_P (str_rtx)
> +  && flag_strict_volatile_bitfields> 0)
> +/* This is a case where -fstrict-volatile-bitfields doesn't apply
> +   because we can't do a single access in the declared mode of the field.
> +   Since the incoming STR_RTX has already been adjusted to that mode,
> +   fall back to word mode for subsequent logic.  */
> +str_rtx = adjust_address (str_rtx, word_mode, 0);
>
>
>
> Attached you'll find a new version of the bitfields-update patch,
> it is again relative to the already approved version of the 
> volatile-bitfields patch v4, part 1/2.
>
> Boot-strapped and regression-tested on X86_64-linux-gnu.
> additionally tested with an ARM cross-compiler.
>
>
> OK for trunk?

Ok.

Thanks,
Richard.

>
> Thanks
> Bernd.


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Marek Polacek
On Mon, Dec 02, 2013 at 03:47:18PM +0100, Jakub Jelinek wrote:
> On Mon, Dec 02, 2013 at 03:41:18PM +0100, Marek Polacek wrote:
> > On Mon, Dec 02, 2013 at 02:41:05PM +0100, Marek Polacek wrote:
> > > On Mon, Dec 02, 2013 at 03:52:09PM +0400, Konstantin Serebryany wrote:
> > > > This change breaks one ubsan test:
> > > > make check -C gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} 
> > > > ubsan.exp'
> > > > FAIL: c-c++-common/ubsan/vla-1.c  -O0  execution test
> > > > I am asking gcc-ubsan maintainers to help me decipher dejagnu
> > > > diagnostics and fix the test failure.
> > > 
> > > Ok, reproduced.  I'll look into it.
> > 
> > Well, this should help.  The problem is that the testcase, when run,
> > SIGSEGVed, but since we're doing Ugly Things (VLAs with negative
> > size), it of course _can_ segfault, we're just relying that it
> > doesn't.  
> 
> Supposedly it might be better to split the main from the test into multiple
> functions, with __attribute__((noinline)) and just one invalid VLA in each.

Okay, I'll do this separately.

So, Kostya, I guess just do the merge and I'll take care of that ubsan
fail.

Marek


Re: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM

2013-12-02 Thread Richard Biener
On Mon, Nov 25, 2013 at 1:07 PM, Bernd Edlinger
 wrote:
> Hello,
>
> I had forgotten to run the Ada test suite when I submitted the previous 
> version of this patch.
> And indeed there were some Ada test cases failing because in Ada packed 
> structures are
> like bit fields, but without the DECL_BIT_FIELD_TYPE attribute.

I think they may have DECL_BIT_FIELD set though, not sure.

> Please find attached the updated version of this patch.
>
> Boot-strapped and regression-tested on x86_64-linux-gnu.
> Ok for trunk?

So you mimic what Eric added in get_bit_range?  Btw, I'm not sure
the "conservative" way of failing get_bit_range with not limiting the
access at all is good.

That is, we may want to do

+  /* The C++ memory model naturally applies to byte-aligned fields.
+However, if we do not have a DECL_BIT_FIELD_TYPE but BITPOS or
+BITSIZE are not byte-aligned, there is no need to limit the range
+we can access.  This can occur with packed structures in Ada.  */
+  if (bitregion_start == 0 && bitregion_end == 0
+  && bitsize > 0
+  && bitsize % BITS_PER_UNIT == 0
+  && bitpos % BITS_PER_UNIT == 0)
+   {
+ bitregion_start = bitpos;
+ bitregion_end = bitpos + bitsize - 1;
+   }

thus not else if but also apply it when get_bit_range "failed" (as it may
fail for other reasons).  A better fallback would be to track down
the outermost byte-aligned handled-component and limit the access
to that (though I guess Ada doesn't care at all about the C++ memory
model and only Ada has bit-aligned aggregates).
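
The condition in the hunk above can be restated with plain integers.  The
sketch below uses hypothetical names (SKETCH_BITS_PER_UNIT, struct
bit_region) and assumes 8-bit units; it is not the code in expr.c, only a
way to check the arithmetic for a given bitpos/bitsize pair.

```c
#define SKETCH_BITS_PER_UNIT 8	/* Assumption: 8-bit bytes.  */

struct bit_region { unsigned long start, end; };

/* If the region is still unconstrained (0, 0) and the access is
   byte-aligned in both position and size, limit the region to the
   access itself.  Returns 1 if the region was set, 0 otherwise.  */
static int
fallback_bit_region (unsigned long bitpos, unsigned long bitsize,
		     struct bit_region *r)
{
  if (r->start == 0 && r->end == 0
      && bitsize > 0
      && bitsize % SKETCH_BITS_PER_UNIT == 0
      && bitpos % SKETCH_BITS_PER_UNIT == 0)
    {
      r->start = bitpos;
      r->end = bitpos + bitsize - 1;
      return 1;
    }
  return 0;
}
```

For a 16-bit field at bit position 32 this yields the region 32..47; a
5-bit field at bit position 3 leaves the region unconstrained.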

That said, the patch looks ok as-is to me, let's see if we can clean
things up for the next stage1.

Thanks,
Richard.

> Bernd.
>
>> On Mon, 18 Nov 2013 11:37:05, Bernd Edlinger wrote:
>>
>> Hi,
>>
>> On Fri, 15 Nov 2013 13:30:51, Richard Biener wrote:
 That looks like always pretending it is a bit field.
 But it is not a bit field, and bitregion_start=bitregion_end=0
 means it is an ordinary value.
>>>
>>> I don't think it is supposed to mean that. It's supposed to mean
>>> "the access is unconstrained".
>>>
>>
>> Ok, agreed, I did not regard that as a feature.
>> And apparently only the code path in expand_assignment
>> really has a problem with it.
>>
>> So here is my second attempt at fixing this.
>>
>> Boot-strapped and regression-tested on x86_64-linux-gnu.
>>
>> OK for trunk?
>>
>>
>> Thanks
>> Bernd.


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Jakub Jelinek
On Mon, Dec 02, 2013 at 03:41:18PM +0100, Marek Polacek wrote:
> On Mon, Dec 02, 2013 at 02:41:05PM +0100, Marek Polacek wrote:
> > On Mon, Dec 02, 2013 at 03:52:09PM +0400, Konstantin Serebryany wrote:
> > > This change breaks one ubsan test:
> > > make check -C gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} 
> > > ubsan.exp'
> > > FAIL: c-c++-common/ubsan/vla-1.c  -O0  execution test
> > > I am asking gcc-ubsan maintainers to help me decipher dejagnu
> > > diagnostics and fix the test failure.
> > 
> > Ok, reproduced.  I'll look into it.
> 
> Well, this should help.  The problem is that the testcase, when run,
> SIGSEGVed, but since we're doing Ugly Things (VLAs with negative
> size), it of course _can_ segfault, we're just relying that it
> doesn't.  

Supposedly it might be better to split the main from the test into multiple
functions, with __attribute__((noinline)) and just one invalid VLA in each.

> diff --git a/gcc/testsuite/c-c++-common/ubsan/vla-1.c 
> b/gcc/testsuite/c-c++-common/ubsan/vla-1.c
> index 3e47bd3..1c5d14a 100644
> --- a/gcc/testsuite/c-c++-common/ubsan/vla-1.c
> +++ b/gcc/testsuite/c-c++-common/ubsan/vla-1.c
> @@ -13,7 +13,7 @@ main (void)
>  {
>int x = -1;
>double di = -3.2;
> -  V v = -666;
> +  V v = -6;
>  
>int a[x];
>int aa[x][x];
> @@ -44,5 +44,5 @@ main (void)
>  /* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value 0(\n|\r\n|\r)" } */
>  /* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -1(\n|\r\n|\r)" } */
>  /* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -1(\n|\r\n|\r)" } */
> -/* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -666(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -6(\n|\r\n|\r)" } */
>  /* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -42(\n|\r\n|\r)" } */

Jakub


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Konstantin Serebryany
On Mon, Dec 2, 2013 at 6:41 PM, Marek Polacek  wrote:
> On Mon, Dec 02, 2013 at 02:41:05PM +0100, Marek Polacek wrote:
>> On Mon, Dec 02, 2013 at 03:52:09PM +0400, Konstantin Serebryany wrote:
>> > This change breaks one ubsan test:
>> > make check -C gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} ubsan.exp'
>> > FAIL: c-c++-common/ubsan/vla-1.c  -O0  execution test
>> > I am asking gcc-ubsan maintainers to help me decipher dejagnu
>> > diagnostics and fix the test failure.
>>
>> Ok, reproduced.  I'll look into it.
>
> Well, this should help.  The problem is that the testcase, when run,
> SIGSEGVed, but since we're doing Ugly Things (VLAs with negative
> size), it of course _can_ segfault, we're just relying that it
> doesn't.

Thanks!
Shall I add this change to mine, or you want to commit it separately?

An alternative and more stable fix would be to rewrite the test to run
each case independently
and fail after the report, but that's up to you.

--kcc

>
> diff --git a/gcc/testsuite/c-c++-common/ubsan/vla-1.c 
> b/gcc/testsuite/c-c++-common/ubsan/vla-1.c
> index 3e47bd3..1c5d14a 100644
> --- a/gcc/testsuite/c-c++-common/ubsan/vla-1.c
> +++ b/gcc/testsuite/c-c++-common/ubsan/vla-1.c
> @@ -13,7 +13,7 @@ main (void)
>  {
>int x = -1;
>double di = -3.2;
> -  V v = -666;
> +  V v = -6;
>
>int a[x];
>int aa[x][x];
> @@ -44,5 +44,5 @@ main (void)
>  /* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value 0(\n|\r\n|\r)" } */
>  /* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -1(\n|\r\n|\r)" } */
>  /* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -1(\n|\r\n|\r)" } */
> -/* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -666(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -6(\n|\r\n|\r)" } */
>  /* { dg-output "\[^\n\r]*variable length array bound evaluates to 
> non-positive value -42(\n|\r\n|\r)" } */
>
> Marek


Re: [PATCH] Strict volatile bit-fields clean-up

2013-12-02 Thread Richard Biener
On Wed, Nov 20, 2013 at 11:48 AM, Bernd Edlinger
 wrote:
> Hello Richard,
>
> as a follow-up patch to the bit-fields patch(es), I wanted to remove the 
> dependencies on
> the variable flag_strict_volatile_bitfields from expand_assignment and 
> expand_expr_real_1.
> Additionally I want the access mode of the field to be selected in the memory 
> context,
> instead of the structure's mode.
>
> Boot-strapped and regression-tested on x86_64-linux-gnu.
>
> OK for trunk?

Ok.

Thanks,
Richard.

> Thanks
> Bernd.
>
>
> FYI - these are my in-flight patches, which would be nice to go into the GCC 
> 4.9.0 release:
>
>
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02046.html :[PATCH] reimplement 
> -fstrict-volatile-bitfields v4, part 1/2
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02025.html :[PATCH] Fix C++0x 
> memory model for -fno-strict-volatile-bitfields on ARM
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02291.html :
> [PATCH, PR 57748] Check for out of bounds access, Part 2
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00581.html :[PATCH, PR 57748] 
> Check for out of bounds access, Part 3
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00133.html:[PATCH] Fix PR58115


Re: libsanitizer merge from upstream r196090

2013-12-02 Thread Marek Polacek
On Mon, Dec 02, 2013 at 02:41:05PM +0100, Marek Polacek wrote:
> On Mon, Dec 02, 2013 at 03:52:09PM +0400, Konstantin Serebryany wrote:
> > This change breaks one ubsan test:
> > make check -C gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} ubsan.exp'
> > FAIL: c-c++-common/ubsan/vla-1.c  -O0  execution test
> > I am asking gcc-ubsan maintainers to help me decipher dejagnu
> > diagnostics and fix the test failure.
> 
> Ok, reproduced.  I'll look into it.

Well, this should help.  The problem is that the testcase SIGSEGVed
when run.  Since we're doing Ugly Things (VLAs with negative size),
it of course _can_ segfault; we're just relying on it not doing so.
The patch below switches to a smaller negative bound (-6 instead of
-666).
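
For readers unfamiliar with the failure mode, here is a minimal sketch of
the kind of check involved.  It is not taken from the patch or from the
ubsan runtime; the helper names vla_bound_ok and sum_first are
hypothetical, and the message text is only borrowed from the dg-output
lines of the testcase.  Unlike vla-1.c, the sketch refuses to declare the
array when the bound is non-positive, so it never invokes undefined
behavior itself.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 when BOUND is usable as a VLA bound.
   Otherwise it prints a message modelled on the one the testcase
   expects and returns 0.  */
int
vla_bound_ok (long bound)
{
  if (bound > 0)
    return 1;
  fprintf (stderr,
	   "variable length array bound evaluates to non-positive value %ld\n",
	   bound);
  return 0;
}

/* Hypothetical example: sums 0..N-1 through a VLA, declaring the array
   only after the bound has been validated.  Returns -1 when the bound
   is rejected.  */
long
sum_first (long n)
{
  if (!vla_bound_ok (n))
    return -1;

  int a[n];
  long sum = 0;

  for (long i = 0; i < n; i++)
    a[i] = (int) i;
  for (long i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
```

Calling sum_first (-666) here only prints the diagnostic and returns -1.
The testcase, by contrast, goes on to declare the array after the
diagnostic, which is why it can still crash at run time.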

diff --git a/gcc/testsuite/c-c++-common/ubsan/vla-1.c b/gcc/testsuite/c-c++-common/ubsan/vla-1.c
index 3e47bd3..1c5d14a 100644
--- a/gcc/testsuite/c-c++-common/ubsan/vla-1.c
+++ b/gcc/testsuite/c-c++-common/ubsan/vla-1.c
@@ -13,7 +13,7 @@ main (void)
 {
   int x = -1;
   double di = -3.2;
-  V v = -666;
+  V v = -6;
 
   int a[x];
   int aa[x][x];
@@ -44,5 +44,5 @@ main (void)
 /* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value 0(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -1(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -666(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -6(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*variable length array bound evaluates to non-positive value -42(\n|\r\n|\r)" } */

Marek


Re: [ipa PATCH] Fix bug introduced by r202567

2013-12-02 Thread H.J. Lu
On Mon, Dec 2, 2013 at 5:30 AM, Marek Polacek wrote:
> On Mon, Dec 02, 2013 at 02:21:04PM +0100, Richard Biener wrote:
>> On Mon, Dec 2, 2013 at 1:36 PM, Yuri Rumyantsev wrote:
>> > Hi All,
>> >
>> > Attached is an obvious fix found while investigating PR 58721.
>> > Note that this fix does not resolve the PR.
>> >
>> > Is it OK for trunk?
>>
>> Ok.
>>
>> Thanks,
>> Richard.
>>
>> > ChangeLog:
>> >
>> > 2013-11-02  Yuri Rumyantsev  
>> >
>> > * gcc/ipa-inline.c (check_callers) : Add missed pointer de-reference.
>
> But please drop the gcc/ prefix from the ChangeLog.  Also, no space
> before :.
>
> Marek

Hi.

I checked it in for Yuri with updated ChangeLog.

Thanks.

-- 
H.J.

