Re: [PATCH] Fix the GNU Stack markings on libgcc.a

2018-05-02 Thread Daniel Santos

On 05/02/2018 06:17 PM, Magnus Granberg wrote:
> torsdag 3 maj 2018 kl. 01:07:51 CEST skrev  Daniel Santos:
>> Hello
>>
>> On 05/01/2018 06:32 AM, Magnus Granberg wrote:
>>> New patch
>>> libgcc/ChangeLog:
>>>
>>> 2018-05-01  Magnus Granberg  <zo...@gentoo.org>
>>>
>>> * config/i386/resms64.h: Add .note.GNU-stack section
>>> * config/i386/resms64f.h: Likewise.
>>> * config/i386/resms64fx.h: Likewise.
>>> * config/i386/resms64x.h: Likewise.
>>> * config/i386/savms64.h: Likewise.
>>> * config/i386/savms64f.h: Likewise.
>>>
>>> ---
>> Well this isn't correct either because you are outside of the inclusion
>> guard.  Can you please move this up a line?
>>
>> Thanks,
>> Daniel
> /libgcc/ChangeLog:
> 2018-05-01  Magnus Granberg  <zo...@gentoo.org>
>
>   * config/i386/resms64.h: Add .note.GNU-stack section
>   * config/i386/resms64f.h: Likewise.
>   * config/i386/resms64fx.h: Likewise.
>   * config/i386/resms64x.h: Likewise.
>   * config/i386/savms64.h: Likewise.
>   * config/i386/savms64f.h: Likewise.
>
> ---

No, I meant to move the changes up a line so that, if for some reason
the header was included twice, that it wouldn't output the section
twice.  Example:

 MS2SYSV_STUB_END(savms64_18) 
  
+#if·defined(__linux__)·&&·defined(__ELF__) 
+.section·.note.GNU-stack,"",%progbits 
+#endif
 #endif·/*·__x86_64__·*/ 


But upon further reflection, I think it can be cleanly added to
i386-asm.h.  Does that look sane Jakub?  (I haven't tried it)

Also, for the sake of my education, I don't exactly understand what the
problem is as I haven't been keeping up with pax and hardening.  I just
want to clarify that the stack shouldn't be executable.  These are not
actual "functions" per-se (i.e., they do not adhere to any ABI), they
operate on the stack of the calling function.

Thanks,
Daniel


diff --git a/libgcc/config/i386/i386-asm.h b/libgcc/config/i386/i386-asm.h
index 267133a9b75..7eb3c12fc85 100644
--- a/libgcc/config/i386/i386-asm.h
+++ b/libgcc/config/i386/i386-asm.h
@@ -80,6 +80,10 @@ ASMNAME(fn):
 
 #ifdef MS2SYSV_STUB_PREFIX
 
+# if·defined(__linux__)·&&·defined(__ELF__)
+.section·.note.GNU-stack,"",%progbits
+# endif
+
 # define MS2SYSV_STUB_BEGIN(base_name) \
 	HIDDEN_FUNC(PASTE2(MS2SYSV_STUB_PREFIX, base_name))
 


Re: [PATCH] Fix the GNU Stack markings on libgcc.a

2018-05-02 Thread Daniel Santos
Hello

On 05/01/2018 06:32 AM, Magnus Granberg wrote:
> New patch
> libgcc/ChangeLog:
>
> 2018-05-01  Magnus Granberg  
>
>   * config/i386/resms64.h: Add .note.GNU-stack section
>   * config/i386/resms64f.h: Likewise.
>   * config/i386/resms64fx.h: Likewise.
>   * config/i386/resms64x.h: Likewise.
>   * config/i386/savms64.h: Likewise.
>   * config/i386/savms64f.h: Likewise.
>
> ---

Well this isn't correct either because you are outside of the inclusion
guard.  Can you please move this up a line?

Thanks,
Daniel


Re: [PATCH] Correct debug for -mcall-ms2sysv-xlogues stubs (PR target/83917, take 2)

2018-02-26 Thread Daniel Santos


On 02/26/2018 02:20 AM, Jakub Jelinek wrote:
> On Sun, Feb 25, 2018 at 05:56:28PM -0600, Daniel Santos wrote:
>>> --- libgcc/config/i386/i386-asm.h.jj2018-01-03 10:42:56.317763517 
>>> +0100
>>> +++ libgcc/config/i386/i386-asm.h   2018-02-22 15:33:43.812922298 +0100
>>> @@ -27,8 +27,47 @@ see the files COPYING3 and COPYING.RUNTI
>>>  #define I386_ASM_H
>>>  
>>>  #include "auto-target.h"
>>> +#undef PACKAGE_VERSION
>>> +#undef PACKAGE_NAME
>>> +#undef PACKAGE_STRING
>>> +#undef PACKAGE_TARNAME
>>> +#undef PACKAGE_URL
>> This is a beautiful, temporary(?) fix to an ugly problem!
>>
>>>  #include "auto-host.h"
>>> --- libgcc/config/i386/cygwin.S.jj  2018-01-03 10:42:56.309763515 +0100
>>> +++ libgcc/config/i386/cygwin.S 2018-02-22 15:30:34.597925496 +0100
>>> @@ -23,31 +23,13 @@
>>>   * <http://www.gnu.org/licenses/>.
>>>   */
>>>  
>>> -#include "auto-host.h"
>> The following include should be here.
>>
>> +#include "i386-asm.h"
> I don't understand this.  i386-asm.h needs (both before my patch and after
> it) both auto-host.h and auto-target.h, as it tests
> HAVE_GAS_SECTIONS_DIRECTIVE (this one newly, comes from cygwin.S)

The problem is that HAVE_GAS_SECTIONS_DIRECTIVE gets defined (or not) in
../../gcc/auto-host.h, but you are testing it before including
auto-host.h, either directly or via i386-asm.h.  So if i386-asm.h
depends upon HAVE_GAS_SECTIONS_DIRECTIVE first being defined then it is
a circular dependency.

In its current form, cygwin.S would never define USE_GAS_CFI_DIRECTIVES
prior to including i386-asm.h and also never emit
    .cfi_sections    .debug_frame
and rather or not USE_GAS_CFI_DIRECTIVES ends up being defined to 1 or 0
depends upon the test of __GCC_HAVE_DWARF2_CFI_ASM in i386-asm.h.

So this area is new for me, but I don't understand why we're testing
HAVE_GAS_SECTIONS_DIRECTIVE in cygwin.S and __GCC_HAVE_DWARF2_CFI_ASM
when included from one of the stubs.  Is this an error, or a lack of my
understanding or both? :)

> HAVE_GAS_HIDDEN
> macros defined in auto-host.h
> and
> HAVE_AS_AVX
> macro defined in auto-target.h.
> Including auto-host.h when i386-asm.h will include it again just doesn't
> work, these headers don't have multiple inclusion guards.  And only including
> auto-target.h will work only if the
> .hidden
> and
> .cfi_sections .debug_frame
> tests are duplicated from gcc/configure.ac to libgcc/configure.ac, then we
> could include just auto-target.h in i386-asm.h.
> I've just followed what i386-asm.h has been doing.

And it's possible that I failed to test something correctly before
presuming it to be available, although I *think* the test for .hidden is
good.

>
>   Jakub
>

Thanks for your work on this.  If we need to test for CFI directives
differently when being included from cygwin.S, maybe we can just define
a simple cpp macro to indicate this and let i386-asm.h encapsulate the
implementation of it (e.g., testing HAVE_GAS_SECTIONS_DIRECTIVE or
__GCC_HAVE_DWARF2_CFI_ASM as appropriate).

Ultimately, the proper cleanup will be moving these tests out of
{gcc,libgcc}/configure.ac and into .m4 files in the root config
directory so that we don't uglify them with massive copy & pastes. 
These tests are also fairly complex as there are a lot of dependencies. 
m4 isn't my strong suite, but I can look at this after we're out of code
freeze.

Daniel



Re: [PATCH] Correct debug for -mcall-ms2sysv-xlogues stubs (PR target/83917, take 2)

2018-02-25 Thread Daniel Santos
Sorry for the dropping the ball on this and thank you Jakub for stepping in!

I've had a patch set sort-of rotting in my local repo, but I like yours
better.  I think I had gotten hung up on trying to figure out how to
write a test for this, and like you I just tested mine manually in gdb. 
I do have one correction though.


On 02/22/2018 08:56 AM, Jakub Jelinek wrote:
> Hi!
>
> On Sat, Jan 20, 2018 at 06:01:16PM -0600, Daniel Santos wrote:
>> Thanks.  I like the idea of commonizing the macros for consistency.
> Didn't see a progress on this P3 for a while, so I've written this
> version of the patch; no tests though, what I've been using in testing was:
> /* { dg-do compile { target lp64 } } */
> /* { dg-options "-mno-avx -msse2 -mcall-ms2sysv-xlogues -O2" } */
>
> void __attribute__((sysv_abi, noipa))
> foo (void)
> {
> }
>
> static void __attribute__((sysv_abi)) (*volatile foop) () = foo;
>
> void __attribute__((ms_abi, noipa))
> bar (void)
> {
>   foop ();
> }
>
> int
> main ()
> {
>   bar ();
>   return 0;
> }
>
> with/without -fno-omit-frame-pointer, disas bar; b on the tail
> call in there, stepi; bt (which before the patch failed, now works),
> also up; p $rbp to see if %rbp has been properly declared to be saved.
> There is no need to cfi_startproc/cfi_endproc for every single entrypoint in
> there, it is enough if the whole range is covered.  On the other side
> we need the cfi_offset for the frame pointer case, otherwise up; p/x $rbp
> doesn't work properly.
>
> Ok for trunk if it passes bootstrap/regtest on x86_64-linux and i686-linux?
>
> 2018-02-22  Jakub Jelinek  <ja...@redhat.com>
>
>   PR debug/83917
>   * config/i386/i386-asm.h (PACKAGE_VERSION, PACKAGE_NAME,
>   PACKAGE_STRING, PACKAGE_TARNAME, PACKAGE_URL): Undefine between
>   inclusion of auto-target.h and auto-host.h.
>   (USE_GAS_CFI_DIRECTIVES): Define if not defined already based on
>   __GCC_HAVE_DWARF2_CFI_ASM.
>   (cfi_startproc, cfi_endproc, cfi_adjust_cfa_offset,
>   cfi_def_cfa_register, cfi_def_cfa, cfi_register, cfi_offset, cfi_push,
>   cfi_pop): Define.
>   * config/i386/cygwin.S: Don't include auto-host.h here, just
>   define USE_GAS_CFI_DIRECTIVES to 1 or 0 and include i386-asm.h.
>   (cfi_startproc, cfi_endproc, cfi_adjust_cfa_offset,
>   cfi_def_cfa_register, cfi_register, cfi_push, cfi_pop): Remove.
>   * config/i386/resms64fx.h: Add cfi_* directives.
>   * config/i386/resms64x.h: Likewise.
>
> --- libgcc/config/i386/i386-asm.h.jj  2018-01-03 10:42:56.317763517 +0100
> +++ libgcc/config/i386/i386-asm.h 2018-02-22 15:33:43.812922298 +0100
> @@ -27,8 +27,47 @@ see the files COPYING3 and COPYING.RUNTI
>  #define I386_ASM_H
>  
>  #include "auto-target.h"
> +#undef PACKAGE_VERSION
> +#undef PACKAGE_NAME
> +#undef PACKAGE_STRING
> +#undef PACKAGE_TARNAME
> +#undef PACKAGE_URL

This is a beautiful, temporary(?) fix to an ugly problem!

>  #include "auto-host.h"
>  
> +#ifndef USE_GAS_CFI_DIRECTIVES
> +# ifdef __GCC_HAVE_DWARF2_CFI_ASM
> +#  define USE_GAS_CFI_DIRECTIVES 1
> +# else
> +#  define USE_GAS_CFI_DIRECTIVES 0
> +# endif
> +#endif
> +#if USE_GAS_CFI_DIRECTIVES
> +# define cfi_startproc() .cfi_startproc
> +# define cfi_endproc()   .cfi_endproc
> +# define cfi_adjust_cfa_offset(X).cfi_adjust_cfa_offset X
> +# define cfi_def_cfa_register(X) .cfi_def_cfa_register X
> +# define cfi_def_cfa(R,O).cfi_def_cfa R, O
> +# define cfi_register(D,S)   .cfi_register D, S
> +# define cfi_offset(R,O) .cfi_offset R, O
> +# ifdef __x86_64__
> +#  define cfi_push(X).cfi_adjust_cfa_offset 8; 
> .cfi_rel_offset X, 0
> +#  define cfi_pop(X) .cfi_adjust_cfa_offset -8; .cfi_restore X
> +# else
> +#  define cfi_push(X).cfi_adjust_cfa_offset 4; 
> .cfi_rel_offset X, 0
> +#  define cfi_pop(X) .cfi_adjust_cfa_offset -4; .cfi_restore X
> +# endif
> +#else
> +# define cfi_startproc()
> +# define cfi_endproc()
> +# define cfi_adjust_cfa_offset(X)
> +# define cfi_def_cfa_register(X)
> +# define cfi_def_cfa(R,O)
> +# define cfi_register(D,S)
> +# define cfi_offset(R,O)
> +# define cfi_push(X)
> +# define cfi_pop(X)
> +#endif
> +
>  #define PASTE2(a, b) PASTE2a(a, b)
>  #define PASTE2a(a, b) a ## b
>  
> --- libgcc/config/i386/cygwin.S.jj2018-01-03 10:42:56.309763515 +0100
> +++ libgcc/config/i386/cygwin.S   2018-02-22 15:30:34.597925496 +0100
> @@ -23,31 +23,13 @@
>   * <http://www.gnu.org/licenses/>.
>   */
>  
> -#include "auto-host.h&quo

Re: [PATCH, x86, libgcc] PR target/83917 Correct debug for -mcall-ms2sysv-xlogues stubs

2018-01-20 Thread Daniel Santos
On 01/19/2018 05:35 PM, Jakub Jelinek wrote:
> On Fri, Jan 19, 2018 at 05:33:10PM -0600, Daniel Santos wrote:
>> When stepping through tail-call restore stubs the debugger has to assume
>> that rsp - 8 is the CFA, although it is not.  This is because I did not
>> explicitly add any .cfi directives.  This patch adds them to the
>> tail-call restore stubs, but this is new territory for me, so I would
>> appreciate feedback.
>>
>> I've reg-tested on x86_64, but I still need to test on Solaris and
>> Darwin.  OK to commit after those tests?
> I think you can't assume that the assembler supports .cfi_* directives.
> While e.g. libgcc/config/i386/morestack.S uses them unconditionally,
> it is guarded with:
> if test "$libgcc_cv_cfi" = "yes"; then
> tmake_file="${tmake_file} t-stack i386/t-stack-i386"
> fi

Ah hah! That explains a lot.  Yeah, I wasn't thinking all assemblers
would support it but I saw them in the Solaris assembler manual and
figured that they were maybe more widely supported than I had thought.

> in config.host.  E.g. cygwin.S has:
> #ifdef HAVE_GAS_CFI_SECTIONS_DIRECTIVE
> .cfi_sections   .debug_frame
> # define cfi_startproc().cfi_startproc
> # define cfi_endproc()  .cfi_endproc
> # define cfi_adjust_cfa_offset(X)   .cfi_adjust_cfa_offset X
> # define cfi_def_cfa_register(X).cfi_def_cfa_register X
> # define cfi_register(D,S)  .cfi_register D, S
> # ifdef __x86_64__
> #  define cfi_push(X)   .cfi_adjust_cfa_offset 8; .cfi_rel_offset X, 0
> #  define cfi_pop(X).cfi_adjust_cfa_offset -8; .cfi_restore X
> # else
> #  define cfi_push(X)   .cfi_adjust_cfa_offset 4; .cfi_rel_offset X, 0
> #  define cfi_pop(X).cfi_adjust_cfa_offset -4; .cfi_restore X
> # endif
> #else
> # define cfi_startproc()
> # define cfi_endproc()
> # define cfi_adjust_cfa_offset(X)
> # define cfi_def_cfa_register(X)
> # define cfi_register(D,S)
> # define cfi_push(X)
> # define cfi_pop(X)
> #endif /* HAVE_GAS_CFI_SECTIONS_DIRECTIVE */
> perhaps you need something similar or commonize that (though, without
> .cfi_sections, you want the default).
>
>   Jakub

Thanks.  I like the idea of commonizing the macros for consistency.

As far as adding tests, I guess I would need to dig into
lib/gcc-gdb-test.exp to figure out how to do that.

Daniel


[PATCH, x86, libgcc] PR target/83917 Correct debug for -mcall-ms2sysv-xlogues stubs

2018-01-19 Thread Daniel Santos
When stepping through tail-call restore stubs the debugger has to assume
that rsp - 8 is the CFA, although it is not.  This is because I did not
explicitly add any .cfi directives.  This patch adds them to the
tail-call restore stubs, but this is new territory for me, so I would
appreciate feedback.

I've reg-tested on x86_64, but I still need to test on Solaris and
Darwin.  OK to commit after those tests?

Thanks,
Daniel

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 libgcc/config/i386/resms64fx.h | 19 +++
 libgcc/config/i386/resms64x.h  | 22 ++
 2 files changed, 41 insertions(+)

diff --git a/libgcc/config/i386/resms64fx.h b/libgcc/config/i386/resms64fx.h
index c5f63d879fe..7dc8c7d89ed 100644
--- a/libgcc/config/i386/resms64fx.h
+++ b/libgcc/config/i386/resms64fx.h
@@ -34,21 +34,40 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 
.text
 MS2SYSV_STUB_BEGIN(resms64fx_17)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x68(%rsi),%r15
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_16)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x60(%rsi),%r14
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_15)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x58(%rsi),%r13
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_14)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x50(%rsi),%r12
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_13)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x48(%rsi),%rbx
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64fx_12)
+.cfi_startproc
+.cfi_def_cfa %rbp, 16
mov -0x40(%rsi),%rdi
SSE_RESTORE
mov -0x38(%rsi),%rsi
leaveq
+.cfi_def_cfa %rsp, 8
ret
+.cfi_endproc
 MS2SYSV_STUB_END(resms64fx_12)
 MS2SYSV_STUB_END(resms64fx_13)
 MS2SYSV_STUB_END(resms64fx_14)
diff --git a/libgcc/config/i386/resms64x.h b/libgcc/config/i386/resms64x.h
index 1b44938ae7c..753be1f4c52 100644
--- a/libgcc/config/i386/resms64x.h
+++ b/libgcc/config/i386/resms64x.h
@@ -33,23 +33,45 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 
.text
 MS2SYSV_STUB_BEGIN(resms64x_18)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x70(%rsi),%r15
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_17)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x68(%rsi),%r14
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_16)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x60(%rsi),%r13
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_15)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x58(%rsi),%r12
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_14)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x50(%rsi),%rbp
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_13)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x48(%rsi),%rbx
+.cfi_endproc
 MS2SYSV_STUB_BEGIN(resms64x_12)
+.cfi_startproc
+.cfi_def_cfa %r10, 8
mov -0x40(%rsi),%rdi
SSE_RESTORE
mov -0x38(%rsi),%rsi
mov %r10,%rsp
+.cfi_def_cfa_register %rsp
ret
+.cfi_endproc
 MS2SYSV_STUB_END(resms64x_12)
 MS2SYSV_STUB_END(resms64x_13)
 MS2SYSV_STUB_END(resms64x_14)
-- 
2.15.0



Re: [PATCH] Fix ms-sysv.exp testsuite FAILs (PR c/83117)

2017-11-28 Thread Daniel Santos


On 11/28/2017 05:22 AM, Jakub Jelinek wrote:
> On Mon, Nov 27, 2017 at 05:02:32PM -0600, Daniel Santos wrote:
>>> --- gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc.jj   2017-05-22 
>>> 10:49:45.0 +0200
>>> +++ gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc  2017-11-27 
>>> 11:57:14.889570915 +0100
>>> @@ -392,7 +392,7 @@ static void make_do_tests_decl (const ve
>>> continue;
>>>  
>>>   comma.reset ();
>>> - out << "static __attribute__ ((ms_abi)) long (*const do_test_"
>>> + out << "static __attribute__ ((ms_abi)) long (*do_test_"
>>>   << (unaligned ? "u" : "")
>>>   << (varargs ? "v" : "") << i << ") (";
>> I don't have a problem with removing const, it's only there for
>> const-correctness and caution.  I just posted to the PR a bit ago and
>> I'm curious if there is a better approach when using assembly stubs that
>> are meant to be called in varying ways.  CV would work also, although
>> there's no real need to refetch the address before each use.
>>
>> If you don't have a better way to do this then please use this patch.
> I've verified the resulting *.optimized dump as well as assembly is
> practically identical without/with the patch, only differences are in
> SSA_NAME versions, in assembly the .LC and .LCFI constants are
> different but otherwise it is the same - the functions are emitted in
> different orders by cgraph and committed the patch.
>
> Using assembly stubs that are meant to be called in varying ways should
> just be avoided in portable programs, you could e.g. in the generator
> instead of all those:
> extern __attribute__ ((ms_abi)) long do_test_aligned ();
> extern __attribute__ ((ms_abi)) long do_test_unaligned ();
> static __attribute__ ((ms_abi)) long (*do_test_1) (long a) = 
> (void*)do_test_aligned;
> static __attribute__ ((ms_abi)) long (*do_test_v1) (long a, ...) = 
> (void*)do_test_aligned;
> static __attribute__ ((ms_abi)) long (*do_test_u1) (long a) = 
> (void*)do_test_unaligned;
> static __attribute__ ((ms_abi)) long (*do_test_uv1) (long a, ...) = 
> (void*)do_test_unaligned;
> emit:
> extern __attribute__ ((ms_abi)) long do_test_1 (long a);
> asm (".text; do_test_1: jmp do_test_aligned; .previous");
> extern __attribute__ ((ms_abi)) long do_test_v1 (long a, ...);
> asm (".text; do_test_v1: jmp do_test_aligned; .previous");
> extern __attribute__ ((ms_abi)) long do_test_1 (long a);
> asm (".text; do_test_u1: jmp do_test_unaligned; .previous");
> extern __attribute__ ((ms_abi)) long do_test_1 (long a, ...);
> asm (".text; do_test_uv1: jmp do_test_unaligned; .previous");
> or something similar.
>
>   Jakub

Ah hah! That would indeed work. Thanks for the tip.  I have some
improvements to make to this set of tests, mostly tests triggered by
GCC_TEST_RUN_EXPENSIVE, but perhaps I can make this modification as
well.  Come to think of it, attribute naked might work too.

Thanks,
Daniel


Re: [PATCH] Fix ms-sysv.exp testsuite FAILs (PR c/83117)

2017-11-27 Thread Daniel Santos
On 11/27/2017 04:34 PM, Jakub Jelinek wrote:
> Hi!
>
> As mentioned in the PR, my C FE rvalue folding patch allows folding
> const variable initializers into the uses of those variables in rvalue
> contexts more than before, and so we get warnings about UB in the test,
> because an unprototyped function is cast to a function type with ellipsis in
> it.
>
> It isn't entirely clear what exactly the test wants to test, as mentioned
> in the PR, this is one of the options how to solve it, by dropping the
> const it can't be optimized in the FEs (the optimizers can still figure out
> the static vars are never written to).  Another option would be just
> add -w to dg-options, another one is const volatile.
>
> Regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-11-27  Jakub Jelinek  
>
>   PR c/83117
>   * gcc.target/x86_64/abi/ms-sysv/gen.cc (make_do_tests_decl): Drop
>   const from do_test_{u,v}*.
>
> --- gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc.jj 2017-05-22 
> 10:49:45.0 +0200
> +++ gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc2017-11-27 
> 11:57:14.889570915 +0100
> @@ -392,7 +392,7 @@ static void make_do_tests_decl (const ve
>   continue;
>  
> comma.reset ();
> -   out << "static __attribute__ ((ms_abi)) long (*const do_test_"
> +   out << "static __attribute__ ((ms_abi)) long (*do_test_"
> << (unaligned ? "u" : "")
> << (varargs ? "v" : "") << i << ") (";
>  
>
>   Jakub
>

I don't have a problem with removing const, it's only there for
const-correctness and caution.  I just posted to the PR a bit ago and
I'm curious if there is a better approach when using assembly stubs that
are meant to be called in varying ways.  CV would work also, although
there's no real need to refetch the address before each use.

If you don't have a better way to do this then please use this patch.

Thanks!
Daniel





Re: [PATCH 2/2] [i386] PR82002 Part 2: Correct non-immediate offset/invalid INSN

2017-11-03 Thread Daniel Santos
On 11/03/2017 04:22 PM, Daniel Santos wrote:
> ...
> How does this patch look?  (Also, I've updated comments for
> choose_baseaddr.)  Currently re-running tests.
>
> Thanks,
> Daniel
>
> @@ -13110,10 +13125,26 @@ ix86_expand_prologue (void)
>target.  */
>if (TARGET_SEH)
>   m->fs.sp_valid = false;
> -}
>  
> -  if (m->call_ms2sysv)
> -ix86_emit_outlined_ms2sysv_save (frame);
> +  /* If SP offset is non-immediate after allocation of the stack frame,
> +  then emit SSE saves or stub call prior to allocating the rest of the
> +  stack frame.  This is less efficient for the out-of-line stub because
> +  we can't combine allocations across the call barrier, but it's better
> +  than using a scratch register.  */
> +  else if (!x86_64_immediate_operand (GEN_INT 
> (frame.stack_pointer_offset - m->fs.sp_realigned_offset), Pmode))

Oops, and also after fixing this formatting...

Daniel


Re: [PATCH 2/2] [i386] PR82002 Part 2: Correct non-immediate offset/invalid INSN

2017-11-03 Thread Daniel Santos
On 11/03/2017 02:09 AM, Uros Bizjak wrote:
> On Thu, Nov 2, 2017 at 11:43 PM, Daniel Santos <daniel.san...@pobox.com> 
> wrote:
>
>>>>int_registers_saved = (frame.nregs == 0);
>>>>sse_registers_saved = (frame.nsseregs == 0);
>>>> +  save_stub_call_needed = (m->call_ms2sysv);
>>>> +  gcc_assert (!(!sse_registers_saved && save_stub_call_needed));
>>> Oooh, double negation :(
>> I'm just saying that we shouldn't be saving SSE registers inline and via
>> the stub.  If I followed the naming convention of e.g.,
>> "see_registers_saved" then my variable would end up being called
>> "save_stub_called" which would be incorrect and misleading, similar to
>> how "see_registers_saved" is misleading when there are in fact no SSE
>> register that need to be saved.  Maybe I should rename
>> (int|sse)_registers_saved to (int|sse)_register_saves_needed with
>> inverted logic instead.
> But, we can just say
>
> gcc_assert (sse_registers_saved || !save_stub_call_needed);
>
> No?
>
> Uros.
>

Oh yes, I see.  Because "sse_registers_saved" really means that we've
either already saved them or don't have to, and not literally that they
have been saved.  I ranted about it's name but didn't think it all the
way through. :)

How does this patch look?  (Also, I've updated comments for
choose_baseaddr.)  Currently re-running tests.

Thanks,
Daniel
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2967876..fb81d4dba84 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11515,12 +11515,15 @@ choose_basereg (HOST_WIDE_INT cfa_offset, rtx _reg,
an alignment value (in bits) that is preferred or zero and will
recieve the alignment of the base register that was selected,
irrespective of rather or not CFA_OFFSET is a multiple of that
-   alignment value.
+   alignment value.  If it is possible for the base register offset to be
+   non-immediate then SCRATCH_REGNO should specify a scratch register to
+   use.
 
The valid base registers are taken from CFUN->MACHINE->FS.  */
 
 static rtx
-choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
+choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align,
+		 unsigned int scratch_regno = INVALID_REGNUM)
 {
   rtx base_reg = NULL;
   HOST_WIDE_INT base_offset = 0;
@@ -11534,6 +11537,19 @@ choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
 choose_basereg (cfa_offset, base_reg, base_offset, 0, align);
 
   gcc_assert (base_reg != NULL);
+
+  rtx base_offset_rtx = GEN_INT (base_offset);
+
+  if (!x86_64_immediate_operand (base_offset_rtx, Pmode))
+{
+  gcc_assert (scratch_regno != INVALID_REGNUM);
+
+  rtx scratch_reg = gen_rtx_REG (Pmode, scratch_regno);
+  emit_move_insn (scratch_reg, base_offset_rtx);
+
+  return gen_rtx_PLUS (Pmode, base_reg, scratch_reg);
+}
+
   return plus_constant (Pmode, base_reg, base_offset);
 }
 
@@ -12793,23 +12809,19 @@ ix86_emit_outlined_ms2sysv_save (const struct ix86_frame )
   rtx sym, addr;
   rtx rax = gen_rtx_REG (word_mode, AX_REG);
   const struct xlogue_layout  = xlogue_layout::get_instance ();
-  HOST_WIDE_INT allocate = frame.stack_pointer_offset - m->fs.sp_offset;
 
   /* AL should only be live with sysv_abi.  */
   gcc_assert (!ix86_eax_live_at_start_p ());
+  gcc_assert (m->fs.sp_offset >= frame.sse_reg_save_offset);
 
   /* Setup RAX as the stub's base pointer.  We use stack_realign_offset rather
  we've actually realigned the stack or not.  */
   align = GET_MODE_ALIGNMENT (V4SFmode);
   addr = choose_baseaddr (frame.stack_realign_offset
-			  + xlogue.get_stub_ptr_offset (), );
+			  + xlogue.get_stub_ptr_offset (), , AX_REG);
   gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
-  emit_insn (gen_rtx_SET (rax, addr));
 
-  /* Allocate stack if not already done.  */
-  if (allocate > 0)
-  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
-GEN_INT (-allocate), -1, false);
+  emit_insn (gen_rtx_SET (rax, addr));
 
   /* Get the stub symbol.  */
   sym = xlogue.get_stub_rtx (frame_pointer_needed ? XLOGUE_STUB_SAVE_HFP
@@ -12841,6 +12853,7 @@ ix86_expand_prologue (void)
   HOST_WIDE_INT allocate;
   bool int_registers_saved;
   bool sse_registers_saved;
+  bool save_stub_call_needed;
   rtx static_chain = NULL_RTX;
 
   if (ix86_function_naked (current_function_decl))
@@ -13016,6 +13029,8 @@ ix86_expand_prologue (void)
 
   int_registers_saved = (frame.nregs == 0);
   sse_registers_saved = (frame.nsseregs == 0);
+  save_stub_call_needed = (m->call_ms2sysv);
+  gcc_assert (sse_registers_saved || !save_stub_call_needed);
 
   if (frame_pointer_needed && !m->fs.fp_valid)
 {
@@ -13110,10 +13125,26 @@ ix86_expand_prologue (void)
 	 target.  */
   if (TARGET_SEH)
 	m->fs.sp_va

Re: [PATCH 2/2] [i386] PR82002 Part 2: Correct non-immediate offset/invalid INSN

2017-11-02 Thread Daniel Santos
On 10/31/2017 04:31 AM, Uros Bizjak wrote:
> On Tue, Oct 31, 2017 at 3:09 AM, Daniel Santos <daniel.san...@pobox.com> 
> wrote:
>> When we are realigning the stack pointer, making an ms_abi to sysv_abi
>> call and alllocating 2GiB or more on the stack we end up with an invalid
>> INSN due to a non-immediate offset.  This occurs both with and without
>> -mcall-ms2sysv-xlogues.  Additionally, I've discovered that the stack
>> allocation with -mcall-ms2sysv-xlogues is incorrect as it ignores stack
>> checking, stack clash checking and probing.
>>
>> This patch fixes these problems by
>>
>> 1. No longer allocate stack space in ix86_emit_outlined_ms2sysv_save.
>> 2. Rearrange where we emit SSE saves or stub call:
>>a. Before frame allocation when offset from frame to save area is >= 2GiB.
>>b. After frame allocation when frame is < 2GiB.  (Stack allocations
>>   prior to the stub call can't be combined with those afterwards, so
>>   this is better when possible.)
>> 3. Modify choose_baseaddr to take an optional scratch_regno argument
>>and never return rtx that cannot be used as an immediate.
>>
>> gcc:
>> config/i386/i386.c (choose_basereg): Use optional scratch
>> register and add assertion.
>> (x86_emit_outlined_ms2sysv_save): use scratch register when
>> needed, and don't allocate stack.
>> (ix86_expand_prologue): Rearrange where SSE saves/stub call is
>> emitted, correct wrong allocation with -mcall-ms2sysv-xlogues.
>> (ix86_emit_outlined_ms2sysv_restore): Fix non-immediate offsets.
>>
>> gcc/testsuite:
>> gcc.target/i386/pr82002-2a.c: Change from xfail to fail.
>> gcc.target/i386/pr82002-2b.c: Likewise.
>>
>> Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
>> ---
>>  gcc/config/i386/i386.c | 76 
>> --
>>  gcc/testsuite/gcc.target/i386/pr82002-2a.c |  2 -
>>  gcc/testsuite/gcc.target/i386/pr82002-2b.c |  2 -
>>  3 files changed, 62 insertions(+), 18 deletions(-)
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 83a07afb3e1..abd8e937e0d 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -11520,7 +11520,8 @@ choose_basereg (HOST_WIDE_INT cfa_offset, rtx 
>> _reg,
>> The valid base registers are taken from CFUN->MACHINE->FS.  */
>>
>>  static rtx
>> -choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
>> +choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align,
>> +int scratch_regno = -1)
>>  {
>>rtx base_reg = NULL;
>>HOST_WIDE_INT base_offset = 0;
>> @@ -11534,6 +11535,28 @@ choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned 
>> int *align)
>>  choose_basereg (cfa_offset, base_reg, base_offset, 0, align);
>>
>>gcc_assert (base_reg != NULL);
>> +
>> +  if (TARGET_64BIT)
>> +{
>> +  rtx base_offset_rtx = GEN_INT (base_offset);
>> +
>> +  if (scratch_regno >= 0)
>> +   {
>> + if (!x86_64_immediate_operand (base_offset_rtx, DImode))
>> +   {
>> + rtx tmp;
>> + rtx scratch_reg = gen_rtx_REG (DImode, scratch_regno);
>> +
>> + emit_insn (gen_rtx_SET (scratch_reg, base_offset_rtx));
>> + tmp = gen_rtx_PLUS (DImode, scratch_reg, base_reg);
>> + emit_insn (gen_rtx_SET (scratch_reg, tmp));
>> + return scratch_reg;
>> +   }
>> +   }
>> +  else
>> +   gcc_assert (x86_64_immediate_operand (base_offset_rtx, DImode));
>> +}
>> +
>>return plus_constant (Pmode, base_reg, base_offset);
>>  }
> This function doesn't need to return a register, it can return plus
> RTX. I'd suggest the following implementation:
>
> --cut here--
> Index: i386.c
> ===
> --- i386.c  (revision 254243)
> +++ i386.c  (working copy)
> @@ -11520,7 +11520,8 @@
> The valid base registers are taken from CFUN->MACHINE->FS.  */
>
>  static rtx
> -choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
> +choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align,
> +unsigned int scratch_regno = INVALID_REGNUM)
>  {
>rtx base_reg = NULL;
>HOST_WIDE_INT base_offset = 0;
> @@ -11534,6 +11535,19 @@
>  choose_basereg (cfa_offset, base_reg, base_offset, 0, align);
>
>gcc

Re: [PATCH 2/2] [i386] PR82002 Part 2: Correct non-immediate offset/invalid INSN

2017-10-30 Thread Daniel Santos
On 10/30/2017 09:09 PM, Daniel Santos wrote:
> 3. Modify choose_baseaddr to take an optional scratch_regno argument
>and never return rtx that cannot be used as an immediate.

I should amend this, it actually does a gcc_assert, so that won't happen
if --enable-checking=no, but it would still fail later in expand.

>  static rtx
> -choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
> +choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align,
> +  int scratch_regno = -1)
>  {
>rtx base_reg = NULL;
>HOST_WIDE_INT base_offset = 0;
> @@ -11534,6 +11535,28 @@ choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned 
> int *align)
>  choose_basereg (cfa_offset, base_reg, base_offset, 0, align);
>  
>gcc_assert (base_reg != NULL);
> +
> +  if (TARGET_64BIT)
> +{
> +  rtx base_offset_rtx = GEN_INT (base_offset);
> +
> +  if (scratch_regno >= 0)
> + {
> +   if (!x86_64_immediate_operand (base_offset_rtx, DImode))
> + {
> +   rtx tmp;
> +   rtx scratch_reg = gen_rtx_REG (DImode, scratch_regno);
> +
> +   emit_insn (gen_rtx_SET (scratch_reg, base_offset_rtx));
> +   tmp = gen_rtx_PLUS (DImode, scratch_reg, base_reg);
> +   emit_insn (gen_rtx_SET (scratch_reg, tmp));
> +   return scratch_reg;
> + }
> + }
> +  else
> + gcc_assert (x86_64_immediate_operand (base_offset_rtx, DImode));
> +}
> +
>return plus_constant (Pmode, base_reg, base_offset);
>  }

Daniel


[PATCH 2/2] [i386] PR82002 Part 2: Correct non-immediate offset/invalid INSN

2017-10-30 Thread Daniel Santos
When we are realigning the stack pointer, making an ms_abi to sysv_abi
call and alllocating 2GiB or more on the stack we end up with an invalid
INSN due to a non-immediate offset.  This occurs both with and without
-mcall-ms2sysv-xlogues.  Additionally, I've discovered that the stack
allocation with -mcall-ms2sysv-xlogues is incorrect as it ignores stack
checking, stack clash checking and probing.

This patch fixes these problems by

1. No longer allocate stack space in ix86_emit_outlined_ms2sysv_save.
2. Rearrange where we emit SSE saves or stub call:
   a. Before frame allocation when offset from frame to save area is >= 2GiB.
   b. After frame allocation when frame is < 2GiB.  (Stack allocations
  prior to the stub call can't be combined with those afterwards, so
  this is better when possible.)
3. Modify choose_baseaddr to take an optional scratch_regno argument
   and never return rtx that cannot be used as an immediate.

gcc:
config/i386/i386.c (choose_basereg): Use optional scratch
register and add assertion.
(x86_emit_outlined_ms2sysv_save): use scratch register when
needed, and don't allocate stack.
(ix86_expand_prologue): Rearrange where SSE saves/stub call is
emitted, correct wrong allocation with -mcall-ms2sysv-xlogues.
(ix86_emit_outlined_ms2sysv_restore): Fix non-immediate offsets.

gcc/testsuite:
gcc.target/i386/pr82002-2a.c: Change from xfail to fail.
gcc.target/i386/pr82002-2b.c: Likewise.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 76 --
 gcc/testsuite/gcc.target/i386/pr82002-2a.c |  2 -
 gcc/testsuite/gcc.target/i386/pr82002-2b.c |  2 -
 3 files changed, 62 insertions(+), 18 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 83a07afb3e1..abd8e937e0d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11520,7 +11520,8 @@ choose_basereg (HOST_WIDE_INT cfa_offset, rtx _reg,
The valid base registers are taken from CFUN->MACHINE->FS.  */
 
 static rtx
-choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
+choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align,
+int scratch_regno = -1)
 {
   rtx base_reg = NULL;
   HOST_WIDE_INT base_offset = 0;
@@ -11534,6 +11535,28 @@ choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned 
int *align)
 choose_basereg (cfa_offset, base_reg, base_offset, 0, align);
 
   gcc_assert (base_reg != NULL);
+
+  if (TARGET_64BIT)
+{
+  rtx base_offset_rtx = GEN_INT (base_offset);
+
+  if (scratch_regno >= 0)
+   {
+ if (!x86_64_immediate_operand (base_offset_rtx, DImode))
+   {
+ rtx tmp;
+ rtx scratch_reg = gen_rtx_REG (DImode, scratch_regno);
+
+ emit_insn (gen_rtx_SET (scratch_reg, base_offset_rtx));
+ tmp = gen_rtx_PLUS (DImode, scratch_reg, base_reg);
+ emit_insn (gen_rtx_SET (scratch_reg, tmp));
+ return scratch_reg;
+   }
+   }
+  else
+   gcc_assert (x86_64_immediate_operand (base_offset_rtx, DImode));
+}
+
   return plus_constant (Pmode, base_reg, base_offset);
 }
 
@@ -12793,23 +12816,22 @@ ix86_emit_outlined_ms2sysv_save (const struct 
ix86_frame )
   rtx sym, addr;
   rtx rax = gen_rtx_REG (word_mode, AX_REG);
   const struct xlogue_layout  = xlogue_layout::get_instance ();
-  HOST_WIDE_INT allocate = frame.stack_pointer_offset - m->fs.sp_offset;
 
   /* AL should only be live with sysv_abi.  */
   gcc_assert (!ix86_eax_live_at_start_p ());
+  gcc_assert (m->fs.sp_offset >= frame.sse_reg_save_offset);
 
   /* Setup RAX as the stub's base pointer.  We use stack_realign_offset rather
  we've actually realigned the stack or not.  */
   align = GET_MODE_ALIGNMENT (V4SFmode);
   addr = choose_baseaddr (frame.stack_realign_offset
- + xlogue.get_stub_ptr_offset (), );
+ + xlogue.get_stub_ptr_offset (), , AX_REG);
   gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
-  emit_insn (gen_rtx_SET (rax, addr));
 
-  /* Allocate stack if not already done.  */
-  if (allocate > 0)
-  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
-   GEN_INT (-allocate), -1, false);
+  /* If choose_baseaddr returned our scratch register, then we don't need to
+ do another SET.  */
+  if (!REG_P (addr) || REGNO (addr) != AX_REG)
+emit_insn (gen_rtx_SET (rax, addr));
 
   /* Get the stub symbol.  */
   sym = xlogue.get_stub_rtx (frame_pointer_needed ? XLOGUE_STUB_SAVE_HFP
@@ -12841,6 +12863,7 @@ ix86_expand_prologue (void)
   HOST_WIDE_INT allocate;
   bool int_registers_saved;
   bool sse_registers_saved;
+  bool save_stub_call_needed;
   rtx static_chain = NULL_RTX;
 
   if (ix86_function_naked (current_function_decl))
@@ -13016,6 +13039,8 @@ ix86_expand_prol

[PATCH 1/2] [i386] PR82002 Part 1: Correct ICE caused by wrong calculation.

2017-10-30 Thread Daniel Santos
This is a residual problem caused by the off-by-one error in sp_valid_at
and fp_valid_at originally corrected in r252099.  However, adding tests
that include an ms_abi to sysv_abi call reveals an additional, more
complex problem with an invalid INSN due to overflowing the s32 offset.
Therefore I'm including all new tests, but marking ones that are broken
by this additional problem as xfail and addressing that in the next
patch.

gcc:
config/i386/i386.c (ix86_expand_epilogue): Correct stack
calculation.

gcc/testsuite:
gcc.target/i386/pr82002-1.c: New test.
gcc.target/i386/pr82002-2a.c: New xfail test.
gcc.target/i386/pr82002-2b.c: New xfail test.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr82002-1.c  | 12 
 gcc/testsuite/gcc.target/i386/pr82002-2a.c | 14 ++
 gcc/testsuite/gcc.target/i386/pr82002-2b.c | 14 ++
 4 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82002-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82002-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82002-2b.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2de0dd0c283..83a07afb3e1 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13812,7 +13812,7 @@ ix86_expand_epilogue (int style)
 the stack pointer, if we will restore SSE regs via sp.  */
   if (TARGET_64BIT
  && m->fs.sp_offset > 0x7fff
- && sp_valid_at (frame.stack_realign_offset)
+ && sp_valid_at (frame.stack_realign_offset + 1)
  && (frame.nsseregs + frame.nregs) != 0)
{
  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
diff --git a/gcc/testsuite/gcc.target/i386/pr82002-1.c 
b/gcc/testsuite/gcc.target/i386/pr82002-1.c
new file mode 100644
index 000..86678a01992
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82002-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-Ofast -mstackrealign -mabi=ms" } */
+
+void a (char *);
+void
+b ()
+{
+  char c[100];
+  c[1099511627776] = 'b';
+  a (c);
+  a (c);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82002-2a.c 
b/gcc/testsuite/gcc.target/i386/pr82002-2a.c
new file mode 100644
index 000..bc85080ba8e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82002-2a.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-Ofast -mstackrealign -mabi=ms" } */
+/* { dg-xfail-if "" { *-*-* }  } */
+/* { dg-xfail-run-if "" { *-*-* }  } */
+
+void __attribute__((sysv_abi)) a (char *);
+void
+b ()
+{
+  char c[100];
+  c[1099511627776] = 'b';
+  a (c);
+  a (c);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82002-2b.c 
b/gcc/testsuite/gcc.target/i386/pr82002-2b.c
new file mode 100644
index 000..10e44cd7b1d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82002-2b.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-Ofast -mstackrealign -mabi=ms -mcall-ms2sysv-xlogues" } */
+/* { dg-xfail-if "" { *-*-* }  } */
+/* { dg-xfail-run-if "" { *-*-* }  } */
+
+void __attribute__((sysv_abi)) a (char *);
+void
+b ()
+{
+  char c[100];
+  c[1099511627776] = 'b';
+  a (c);
+  a (c);
+}
-- 
2.14.3



[PATCH 0/2] [i386] PR82002 Correct ICE with large stack frame

2017-10-30 Thread Daniel Santos
I originally intended to submit the first part of this patch set a few
weeks ago as it was simpler, but here is the full fix.  The first part
is a really simple follow-up fix to an off-by-one error H.J. originally
fixed with r252099, but in the process of testing I discovered a more
complex problem when we add a ms_abi to sysv_abi call that resulted in a
bad INSN because I didn't check for a non-immediate offset.

I originally wrote a different solution where I added a mechanism to
struct ix86_frame to track and reuse a scratch register in the
pro/epilogue, but then I realized that I didn't need that if I just
emitted the SSE saves or stub call after the SP realignment and prior to
allocating the remainder of the frame.  However, I still need to use a
scratch register sometimes in the epilogue, so I've added a simplified
mechanism to choose_baseaddr to manage that, but not to track and reuse
it for subsequent calls.

Unfortunately, this sat for so long that there's two duplicates in
Bugzilla now (pr82485 and pr82712). Regression tested with {,-m32} and
I've started one for x32 even though it *shouldn't* affect it (in theory).

Thanks,
Daniel



[PATCH] [testsuite/i386] PR 82268 Correct FAIL when configured --with-cpu

2017-10-27 Thread Daniel Santos
When I originally wrote this test I wasn't wasn't aware of the
--with-cpu configure option, so this change explicitly disables avx to
make sure we choose the sse implementation, even when --with-cpu
specifies an arch that has avx support.

OK for head?

gcc/testsuite/ChangeLog:

gcc.target/i386/pr82196-1.c (dg-options): Add -mno-avx.

Thanks,
Daniel

---
 gcc/testsuite/gcc.target/i386/pr82196-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr82196-1.c 
b/gcc/testsuite/gcc.target/i386/pr82196-1.c
index 541d975480d..ff108132bb5 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target lp64 } } */
-/* { dg-options "-msse -mcall-ms2sysv-xlogues -O2" } */
+/* { dg-options "-mno-avx -msse -mcall-ms2sysv-xlogues -O2" } */
 /* { dg-final { scan-assembler "call.*__sse_savms64f?_12" } } */
 /* { dg-final { scan-assembler "jmp.*__sse_resms64f?x_12" } } */
 
-- 
2.14.3



[PATCH try 3] [i386,testsuite,libgcc] Fix build breakage on Mac and test FAILS on Solaris caused by PR82196 patch

2017-09-23 Thread Daniel Santos
OK, hopefully I've gotten everything cleaned up.  I should note that
because I'm now including both auto-target.h from libgcc (for
HAVE_AS_AVX) and auto-host.h from gcc (HAVE_GAS_HIDDEN) that there are
several warnings for redefined macros, but I plan on fixing that once
this is resolved.  Thanks for all of the corrections and improvements!

The previous test run was good, but I still need to recheck after these
changes.

gcc/testsuite:
* gcc.target/i386/pr82196-1.c: (b): Remove volatile asm.
* gcc.target/i386/pr82196-2.c: (b): Likewise.

libgcc:
* configure.ac: Add Check for HAVE_AS_AVX.
* config.in: Regenerate.
* configure: Likewise.
* config/i386/i386-asm.h: Include auto-target.h from libgcc.
(SSE_SAVE, SSE_RESTORE): Emit .byte sequence for !HAVE_AS_AVX.
Correct out-of-date comments.


Please let me know if you find any other issues.

Thanks,
Daniel

diff --git a/gcc/testsuite/gcc.target/i386/pr82196-1.c b/gcc/testsuite/gcc.target/i386/pr82196-1.c
index ef858328f00..541d975480d 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-1.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target lp64 } } */
 /* { dg-options "-msse -mcall-ms2sysv-xlogues -O2" } */
-/* { dg-final { scan-assembler "call.*__sse_savms64_18" } } */
-/* { dg-final { scan-assembler "jmp.*__sse_resms64x_18" } } */
+/* { dg-final { scan-assembler "call.*__sse_savms64f?_12" } } */
+/* { dg-final { scan-assembler "jmp.*__sse_resms64f?x_12" } } */
 
 void __attribute__((sysv_abi)) a() {
 }
@@ -9,6 +9,5 @@ void __attribute__((sysv_abi)) a() {
 static void __attribute__((sysv_abi)) (*volatile a_noinfo)() = a;
 
 void __attribute__((ms_abi)) b() {
-  __asm__ __volatile__ ("" :::"rbx", "rbp", "r12", "r13", "r14", "r15");
   a_noinfo ();
 }
diff --git a/gcc/testsuite/gcc.target/i386/pr82196-2.c b/gcc/testsuite/gcc.target/i386/pr82196-2.c
index 8fe58411d5e..7166d068bc1 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-2.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target lp64 } } */
 /* { dg-options "-mavx -mcall-ms2sysv-xlogues -O2" } */
-/* { dg-final { scan-assembler "call.*__avx_savms64_18" } } */
-/* { dg-final { scan-assembler "jmp.*__avx_resms64x_18" } } */
+/* { dg-final { scan-assembler "call.*__avx_savms64f?_12" } } */
+/* { dg-final { scan-assembler "jmp.*__avx_resms64f?x_12" } } */
 
 void __attribute__((sysv_abi)) a() {
 }
@@ -9,6 +9,5 @@ void __attribute__((sysv_abi)) a() {
 static void __attribute__((sysv_abi)) (*volatile a_noinfo)() = a;
 
 void __attribute__((ms_abi)) b() {
-  __asm__ __volatile__ ("" :::"rbx", "rbp", "r12", "r13", "r14", "r15");
   a_noinfo ();
 }
diff --git a/libgcc/config.in b/libgcc/config.in
index 7de22ee0a72..f9fb253874f 100644
--- a/libgcc/config.in
+++ b/libgcc/config.in
@@ -1,5 +1,8 @@
 /* config.in.  Generated from configure.ac by autoheader.  */
 
+/* Define to 1 if the assembler supports AVX. */
+#undef HAVE_AS_AVX
+
 /* Define to 1 if the target assembler supports thread-local storage. */
 #undef HAVE_CC_TLS
 
diff --git a/libgcc/config/i386/i386-asm.h b/libgcc/config/i386/i386-asm.h
index 424e0f72aac..aad1a752106 100644
--- a/libgcc/config/i386/i386-asm.h
+++ b/libgcc/config/i386/i386-asm.h
@@ -26,6 +26,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #ifndef I386_ASM_H
 #define I386_ASM_H
 
+#include "auto-target.h"
 #include "auto-host.h"
 
 #define PASTE2(a, b) PASTE2a(a, b)
@@ -69,13 +70,15 @@ ASMNAME(fn):
 
 #ifdef MS2SYSV_STUB_AVX
 # define MS2SYSV_STUB_PREFIX __avx_
-# define MOVAPS vmovaps
+# ifdef HAVE_AS_AVX
+#  define MOVAPS vmovaps
+# endif
 #elif defined(MS2SYSV_STUB_SSE)
 # define MS2SYSV_STUB_PREFIX __sse_
 # define MOVAPS movaps
 #endif
 
-#if defined (MS2SYSV_STUB_PREFIX) && defined (MOVAPS)
+#ifdef MS2SYSV_STUB_PREFIX
 
 # define MS2SYSV_STUB_BEGIN(base_name) \
 	HIDDEN_FUNC(PASTE2(MS2SYSV_STUB_PREFIX, base_name))
@@ -83,8 +86,10 @@ ASMNAME(fn):
 # define MS2SYSV_STUB_END(base_name) \
 	FUNC_END(PASTE2(MS2SYSV_STUB_PREFIX, base_name))
 
-/* Save SSE registers 6-15. off is the offset of rax to get to xmm6.  */
-# define SSE_SAVE		   \
+/* If expanding for sse or avx and we have assembler support.  */
+# ifdef MOVAPS
+/* Save SSE registers 6-15 using rax as the base address.  */
+#  define SSE_SAVE		   \
 	MOVAPS %xmm15,-0x30(%rax); \
 	MOVAPS %xmm14,-0x20(%rax); \
 	MOVAPS %xmm13,-0x10(%rax); \
@@ -96,8 +101,8 @@ ASMNAME(fn):
 	MOVAPS %xmm7,  0x50(%rax); \
 	MOVAPS %xmm6,  0x60(%rax)
 
-/* Restore SSE registers 6-15. off is the offset of rsi to get to xmm6.  */
-# define SSE_RESTORE		\
+/* Restore SSE registers 6-15 using rsi as the base address.  */
+#  define SSE_RESTORE		\
 	MOVAPS -0x30(%rsi), %xmm15; \
 	MOVAPS -0x20(%rsi), %xmm14; \
 	MOVAPS -0x10(%rsi), %xmm13; \
@@ -108,6 +113,31 @@ ASMNAME(fn):
 	MOVAPS  0x40(%rsi), %xmm8 ; \
 	MOVAPS  0x50(%rsi), %xmm7 ; \
 	MOVAPS  0x60(%rsi), %xmm6
-
-#endif /* 

Re: [PATCH try 2] [i386,testsuite,libgcc] Fix build breakage on Mac and test FAILS on Solaris caused by PR82196 patch

2017-09-23 Thread Daniel Santos
On 09/22/2017 06:50 AM, Uros Bizjak wrote:
> On Fri, Sep 22, 2017 at 1:27 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>> On Fri, Sep 22, 2017 at 12:28 PM, Daniel Santos <daniel.san...@pobox.com> 
>> wrote:
>>> On 09/22/2017 03:28 AM, Rainer Orth wrote:
>>>> Hi Daniel,
>>>>
>>>>> On 09/22/2017 02:18 AM, Rainer Orth wrote:
>>>>>> Hi Daniel,
>>>>>>
>>>>>>> On 09/21/2017 05:18 PM, Daniel Santos wrote:
>>>>>>>> So libgcc doesn't use a config.in. :(
>>>>>>> Scratch that, I forgot that we're using gcc/config.in via auto-host.h.
>>>>>>> So I only have to add this to gcc/configure.ac and it will be available
>>>>>>> for my libgcc header -- this is what I used to sniff out support for the
>>>>>>> .hidden directive.
>>>>>> Please don't go that route: it's totally the wrong direction.  There's
>>>>>> work going on to further decouple libgcc from gcc-private headers and
>>>>>> configure results.  libgcc already has its own configure tests for
>>>>>> assembler features, and its own config.in.  What's wrong with adapting
>>>>>> libitm's avx test in libitm/acinclude.m4 (LIBITM_CHECK_AS_AVX) for
>>>>>> libgcc?  Should be trivial...
>>>>>>
>>>>>> Rainer
>>>>>>
>>>>> Oops, I just saw your email after submitting my other patch.  Yes, I am
>>>>> mistaken about config.in, sorry about that.  I didn't see a config.h
>>>>> file, but examining further it looks like it outputs to auto-target.h.
>>>>> Also, I was looking for some HAVE_AS* macros, but they are named
>>>>> differently.
>>>> Right: though some are for assembler features, the macros are named
>>>> differently.
>>>>
>>>>> I had previously included gcc's auto-host.h since it was in the include
>>>>> path in order to use HAVE_AS_HIDDEN, so in order to decouple this I'll
>>>> HAVE_GAS_HIDDEN actually ;-)
>>>>
>>>>> need to add that check into libgcc/configure.ac as well.  Again,
>>>>> shouldn't be that much code.  Sound sane to you?
>>>> You could do that, but it was already used before your patches, so
>>>> please separate it from the current issue if you go down that route.
>>>> libgcc is still full of cleanup possibilities :-)
>>>>
>>>>   Rainer
>>> OK, so I'm just adding HAVE_AS_AVX mostly as-is from libitm (we don't
>>> have $target_cpu so I'm using $target).  I do have minor concerns about
>>> how this test will work on a cross-build -- I'm not an autotools expert
>>> and I don't understand which assembler it will invoke, but the results
>>> of the test failing only means we use .byte instead of the real
>>> mnemonic, so it really shouldn't be a problem.
>>>
>>> I've got tests started again, so presuming that *this* one passes, is it
>>> OK for the trunk?
>>>
>>> gcc/testsuite:
>>> * gcc.target/i386/pr82196-1.c: Simplify so that it doesn't break
>>> on Solaris or with -mno-omit-frame-pointer.
>> No need to explain the change in the ChangeLog. Just say "(b): Remove
>> volatile asm."
>>
>>> * gcc.target/i386/pr82196-2.c: Likewise.
>>>
>>> libgcc:
>>> * configure.ac: Add check for HAVE_AS_AVX.
>>> * config.in: Regenerate.
>>> * configure: Likewise.
>>> * config/i386/i386-asm.h: Include auto-target.h from libgcc.
>>> (SSE_SAVE, SSE_RESTORE): Sniff HAVE_AS_AVX and directly emit raw
>>> .byte code when assembler doesn't support avx, correct
>>> out-of-date comments.
>> (SSE_SAVE, SSE_RESTORE): Emit .byte sequence for !HAVE_AS_AVX.
>> Correct out-of-date comments.
>>
>>>  gcc/testsuite/gcc.target/i386/pr82196-1.c |  5 ++-
>>>  gcc/testsuite/gcc.target/i386/pr82196-2.c |  5 ++-
>>>  libgcc/config.in  |  3 ++
>>>  libgcc/config/i386/i386-asm.h | 45 ++-
>>>  libgcc/configure  | 39 +++
>>>  libgcc/configure.ac   | 16 ++
>>>  6 files changed, 100 insertions(+), 13 deletions(-)
>>
>>  #ifdef MS2SYSV_STUB_AVX
>>  # define MS2SYSV_STUB_PREFIX __avx_
>> -# define MOVAPS vmovaps
>> +# 

Re: [PATCH try 2] [i386,testsuite,libgcc] Fix build breakage on Mac and test FAILS on Solaris caused by PR82196 patch

2017-09-23 Thread Daniel Santos
On 09/22/2017 05:33 AM, Jakub Jelinek wrote:
> On Fri, Sep 22, 2017 at 05:28:00AM -0500, Daniel Santos wrote:
>> +/* If the assembler doesn't support AVX then directly emit machine code
>> +   for the instructions above directly.  */
> Just a nit: too many "directly" words.
>
>   Jakub

Well I hate nits in my code (or comments), so it's appreciated. :)


[PATCH try 2] [i386,testsuite,libgcc] Fix build breakage on Mac and test FAILS on Solaris caused by PR82196 patch

2017-09-22 Thread Daniel Santos
On 09/22/2017 03:28 AM, Rainer Orth wrote:
> Hi Daniel,
>
>> On 09/22/2017 02:18 AM, Rainer Orth wrote:
>>> Hi Daniel,
>>>
>>>> On 09/21/2017 05:18 PM, Daniel Santos wrote:
>>>>> So libgcc doesn't use a config.in. :(
>>>> Scratch that, I forgot that we're using gcc/config.in via auto-host.h. 
>>>> So I only have to add this to gcc/configure.ac and it will be available
>>>> for my libgcc header -- this is what I used to sniff out support for the
>>>> .hidden directive.
>>> Please don't go that route: it's totally the wrong direction.  There's
>>> work going on to further decouple libgcc from gcc-private headers and
>>> configure results.  libgcc already has its own configure tests for
>>> assembler features, and its own config.in.  What's wrong with adapting
>>> libitm's avx test in libitm/acinclude.m4 (LIBITM_CHECK_AS_AVX) for
>>> libgcc?  Should be trivial...
>>>
>>> Rainer
>>>
>> Oops, I just saw your email after submitting my other patch.  Yes, I am
>> mistaken about config.in, sorry about that.  I didn't see a config.h
>> file, but examining further it looks like it outputs to auto-target.h. 
>> Also, I was looking for some HAVE_AS* macros, but they are named
>> differently.
> Right: though some are for assembler features, the macros are named
> differently.
>
>> I had previously included gcc's auto-host.h since it was in the include
>> path in order to use HAVE_AS_HIDDEN, so in order to decouple this I'll
> HAVE_GAS_HIDDEN actually ;-)
>
>> need to add that check into libgcc/configure.ac as well.  Again,
>> shouldn't be that much code.  Sound sane to you?
> You could do that, but it was already used before your patches, so
> please separate it from the current issue if you go down that route.
> libgcc is still full of cleanup possibilities :-)
>
>   Rainer

OK, so I'm just adding HAVE_AS_AVX mostly as-is from libitm (we don't
have $target_cpu so I'm using $target).  I do have minor concerns about
how this test will work on a cross-build -- I'm not an autotools expert
and I don't understand which assembler it will invoke, but the results
of the test failing only means we use .byte instead of the real
mnemonic, so it really shouldn't be a problem.

I've got tests started again, so presuming that *this* one passes, is it
OK for the trunk?

gcc/testsuite:
* gcc.target/i386/pr82196-1.c: Simplify so that it doesn't break
on Solaris or with -mno-omit-frame-pointer.
* gcc.target/i386/pr82196-2.c: Likewise.

libgcc:
* configure.ac: Add check for HAVE_AS_AVX.
* config.in: Regenerate.
* configure: Likewise.
* config/i386/i386-asm.h: Include auto-target.h from libgcc.
(SSE_SAVE, SSE_RESTORE): Sniff HAVE_AS_AVX and directly emit raw
.byte code when assembler doesn't support avx, correct
out-of-date comments.


 gcc/testsuite/gcc.target/i386/pr82196-1.c |  5 ++-
 gcc/testsuite/gcc.target/i386/pr82196-2.c |  5 ++-
 libgcc/config.in  |  3 ++
 libgcc/config/i386/i386-asm.h | 45 ++-
 libgcc/configure  | 39 +++
 libgcc/configure.ac   | 16 ++
 6 files changed, 100 insertions(+), 13 deletions(-)


Thanks,
Daniel
diff --git a/gcc/testsuite/gcc.target/i386/pr82196-1.c b/gcc/testsuite/gcc.target/i386/pr82196-1.c
index ef858328f00..541d975480d 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-1.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target lp64 } } */
 /* { dg-options "-msse -mcall-ms2sysv-xlogues -O2" } */
-/* { dg-final { scan-assembler "call.*__sse_savms64_18" } } */
-/* { dg-final { scan-assembler "jmp.*__sse_resms64x_18" } } */
+/* { dg-final { scan-assembler "call.*__sse_savms64f?_12" } } */
+/* { dg-final { scan-assembler "jmp.*__sse_resms64f?x_12" } } */
 
 void __attribute__((sysv_abi)) a() {
 }
@@ -9,6 +9,5 @@ void __attribute__((sysv_abi)) a() {
 static void __attribute__((sysv_abi)) (*volatile a_noinfo)() = a;
 
 void __attribute__((ms_abi)) b() {
-  __asm__ __volatile__ ("" :::"rbx", "rbp", "r12", "r13", "r14", "r15");
   a_noinfo ();
 }
diff --git a/gcc/testsuite/gcc.target/i386/pr82196-2.c b/gcc/testsuite/gcc.target/i386/pr82196-2.c
index 8fe58411d5e..7166d068bc1 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-2.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target lp64 } } */
 /* { dg-options "-mavx -mcall-ms2sysv-xlogues -O2" } */
-/* { dg-final { scan-assembler "call.*__avx_savms64_18" } } 

Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-22 Thread Daniel Santos
On 09/22/2017 02:18 AM, Rainer Orth wrote:
> Hi Daniel,
>
>> On 09/21/2017 05:18 PM, Daniel Santos wrote:
>>> So libgcc doesn't use a config.in. :(
>> Scratch that, I forgot that we're using gcc/config.in via auto-host.h. 
>> So I only have to add this to gcc/configure.ac and it will be available
>> for my libgcc header -- this is what I used to sniff out support for the
>> .hidden directive.
> Please don't go that route: it's totally the wrong direction.  There's
> work going on to further decouple libgcc from gcc-private headers and
> configure results.  libgcc already has its own configure tests for
> assembler features, and its own config.in.  What's wrong with adapting
> libitm's avx test in libitm/acinclude.m4 (LIBITM_CHECK_AS_AVX) for
> libgcc?  Should be trivial...
>
>   Rainer
>

Oops, I just saw your email after submitting my other patch.  Yes, I am
mistaken about config.in, sorry about that.  I didn't see a config.h
file, but examining further it looks like it outputs to auto-target.h. 
Also, I was looking for some HAVE_AS* macros, but they are named
differently.

I had previously included gcc's auto-host.h since it was in the include
path in order to use HAVE_AS_HIDDEN, so in order to decouple this I'll
need to add that check into libgcc/configure.ac as well.  Again,
shouldn't be that much code.  Sound sane to you?

Thanks,
Daniel


[PATCH] [i386, testsuite, libgcc] Fix build breakage on Mac and test FAILS on Solaris.

2017-09-22 Thread Daniel Santos
I've bootstrapped and reg-tested {,-m64} with this on
86_64-pc-linux-gnu, but I'm waiting for a reference test set to finish
to compare them.  I've verified that we're getting HAVE_AS_IX86_AVX in
auto-host.h and I've also built and tested w/o to double-verify that the
AVX code is correct.  Now that I think of it, I didn't run the tests
with -mno-omit-frame-pointer (to simulate Solaris), but since the tests
pass as-is, I presume that it understands the 'f?' part of my regex.  (A
separate set of stubs are used when rbp is the frame pointer and these
have 'f' appended to their names.)

OK to commit once I get a clean compare?

gcc:
configure.ac: Add Check for HAVE_AS_IX86_AVX
config.in: Regenerate.
configure: Likewise.

gcc/testsuite:
gcc.target/i386/pr82196-1.c: Simplify so that it doesn't break
on Solaris or with -mno-omit-frame-pointer.
gcc.target/i386/pr82196-2.c: Likewise.

libgcc:
config/i386/i386-asm.h (SSE_SAVE, SSE_RESTORE): Sniff
HAVE_AS_IX86_AVX and directly emit raw .byte code when assembler
doesn't support avx, correct out-of-date comments.

Thanks,
Daniel

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config.in |  6 +
 gcc/configure | 32 ++
 gcc/configure.ac  |  6 +
 gcc/testsuite/gcc.target/i386/pr82196-1.c |  5 ++--
 gcc/testsuite/gcc.target/i386/pr82196-2.c |  5 ++--
 libgcc/config/i386/i386-asm.h | 44 ++-
 6 files changed, 85 insertions(+), 13 deletions(-)

diff --git a/gcc/config.in b/gcc/config.in
index 89d7108e8db..df2e518baa6 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -406,6 +406,12 @@
 #endif
 
 
+/* Define if your assembler supports avx extensions. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_IX86_AVX
+#endif
+
+
 /* Define if your assembler supports the Sun syntax for cmov. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_AS_IX86_CMOV_SUN_SYNTAX
diff --git a/gcc/configure b/gcc/configure
index 13f97cd3663..e982b86c25c 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -25881,6 +25881,38 @@ if test $gcc_cv_as_ix86_swap = yes; then
 
 $as_echo "#define HAVE_AS_IX86_SWAP 1" >>confdefs.h
 
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for avx 
extensions" >&5
+$as_echo_n "checking assembler for avx extensions... " >&6; }
+if test "${gcc_cv_as_ix86_avx+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_ix86_avx=no
+  if test x$gcc_cv_as != x; then
+$as_echo 'vzeroupper' > conftest.s
+if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+then
+   gcc_cv_as_ix86_avx=yes
+else
+  echo "configure: failed program was" >&5
+  cat conftest.s >&5
+fi
+rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_ix86_avx" >&5
+$as_echo "$gcc_cv_as_ix86_avx" >&6; }
+if test $gcc_cv_as_ix86_avx = yes; then
+
+$as_echo "#define HAVE_AS_IX86_AVX 1" >>confdefs.h
+
 fi
 
 
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 82711389281..a05f2ca10b2 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4171,6 +4171,12 @@ foo: nop
   [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
 [Define if your assembler supports the swap suffix.])])
 
+gcc_GAS_CHECK_FEATURE([avx extensions],
+  gcc_cv_as_ix86_avx,,,
+  [vzeroupper],,
+  [AC_DEFINE(HAVE_AS_IX86_AVX, 1,
+[Define if your assembler supports avx extensions.])])
+
 gcc_GAS_CHECK_FEATURE([different section symbol subtraction],
   gcc_cv_as_ix86_diff_sect_delta,,,
   [.section .rodata
diff --git a/gcc/testsuite/gcc.target/i386/pr82196-1.c 
b/gcc/testsuite/gcc.target/i386/pr82196-1.c
index ef858328f00..541d975480d 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-1.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target lp64 } } */
 /* { dg-options "-msse -mcall-ms2sysv-xlogues -O2" } */
-/* { dg-final { scan-assembler "call.*__sse_savms64_18" } } */
-/* { dg-final { scan-assembler "jmp.*__sse_resms64x_18" } } */
+/* { dg-final { scan-assembler "call.*__sse_savms64f?_12" } } */
+/* { dg-final { scan-assembler "jmp.*__sse_resms64f?x_12" } } */
 
 void __attribute__((sysv_abi)) a() {
 }
@@ -9,6 +9,5 @@ void __attribute__((sysv_abi)) a() {
 static void __attribute__((sysv_abi)) (*volatile a_noinfo)() = a;
 
 void __attribute__((ms_abi)) b() {

Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-21 Thread Daniel Santos
On 09/21/2017 05:18 PM, Daniel Santos wrote:
> So libgcc doesn't use a config.in. :(

Scratch that, I forgot that we're using gcc/config.in via auto-host.h. 
So I only have to add this to gcc/configure.ac and it will be available
for my libgcc header -- this is what I used to sniff out support for the
.hidden directive.

I still wouldn't mind centralizing assembler sniffing in config/as.m4
(or similar) later on.

Thanks,
Daniel


Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-21 Thread Daniel Santos
On 09/21/2017 11:14 AM, Rainer Orth wrote:
> Hi Daniel,
>
>> On 09/19/2017 01:58 AM, Jakub Jelinek wrote:
>>> What can be done in libgcc is detect in configure whether the assembler
>>> supports AVX, and if not, provide some alternative (e.g. because the insns
>>> are always the same, you could just code them as .byte or something 
>>> similar).
>>>
>>> Say like:
>>> --- i386-asm.h  2017-09-18 18:34:30.917126996 +0200
>>> +++ i386-asm.h  2017-09-19 08:56:58.829559038 +0200
>>> @@ -70,6 +70,7 @@ ASMNAME(fn):
>>>  #ifdef MS2SYSV_STUB_AVX
>>>  # define MS2SYSV_STUB_PREFIX __avx_
>>>  # define MOVAPS vmovaps
>>> +# define BYTE .byte
>>>  #elif defined(MS2SYSV_STUB_SSE)
>>>  # define MS2SYSV_STUB_PREFIX __sse_
>>>  # define MOVAPS movaps
>>> @@ -84,7 +85,8 @@ ASMNAME(fn):
>>> FUNC_END(PASTE2(MS2SYSV_STUB_PREFIX, base_name))
>>>  
>>>  /* Save SSE registers 6-15. off is the offset of rax to get to xmm6.  */
>>> -# define SSE_SAVE \
>>> +# ifdef HAVE_AS_AVX
>> I'm not exactly an autotools expert, but libtim defines HAVE_AS_AVX from
>> libitm/acinclude.m4 -- of course I need it in libgcc.  Similarly, gcc
>> has a nice generic gcc_GAS_CHECK_FEATURE macro in gcc/acinclude.m4 which
>> it uses for all of its HAVE_AS_* macro tests defined in
>> gcc/configure.ac.  I can just copy, paste and edit what's in libitm, but
>> I find that rather distasteful.  Is there a cleaner way to do this?  Can
>> I suck gcc_GAS_CHECK_FEATURE and it's deps out of gcc/acinclude.m4 and
>> put it somewhere central, like config/as.m4?  The upside would be the
>> ability to make HAVE_AS_* macros in other sub-projects more uniform.
> This might be an option as a followup: less code duplication is
> certainly a good thing ;-)
>
>> Alternatively, I can just do the copy and paste and deal with it -- it's
>> not that much code. :)
> However, given that the above will take some time and testing and your
> patch has broken macOS bootstrap, I'd go this route now to unbreak the
> tree ASAP.
>
>   Rainer

A very good point!  So libgcc doesn't use a config.in. :(  So what about
committing my patch as is with HAVE_AS_AVX never defined and the avx
version of the stubs always being built via the .byte directives so that
the build is un-broken, and then figure out how (and where) to add
HAVE_AS_AVX afterwards?  I would still prefer to run a full bootstrap,
but updating an already build bootstrap is good and the tests should
hopefully be fixed on Solaris as well.

Thanks,
Daniel


diff --git a/gcc/testsuite/gcc.target/i386/pr82196-1.c b/gcc/testsuite/gcc.target/i386/pr82196-1.c
index ef858328f00..541d975480d 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-1.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target lp64 } } */
 /* { dg-options "-msse -mcall-ms2sysv-xlogues -O2" } */
-/* { dg-final { scan-assembler "call.*__sse_savms64_18" } } */
-/* { dg-final { scan-assembler "jmp.*__sse_resms64x_18" } } */
+/* { dg-final { scan-assembler "call.*__sse_savms64f?_12" } } */
+/* { dg-final { scan-assembler "jmp.*__sse_resms64f?x_12" } } */
 
 void __attribute__((sysv_abi)) a() {
 }
@@ -9,6 +9,5 @@ void __attribute__((sysv_abi)) a() {
 static void __attribute__((sysv_abi)) (*volatile a_noinfo)() = a;
 
 void __attribute__((ms_abi)) b() {
-  __asm__ __volatile__ ("" :::"rbx", "rbp", "r12", "r13", "r14", "r15");
   a_noinfo ();
 }
diff --git a/gcc/testsuite/gcc.target/i386/pr82196-2.c b/gcc/testsuite/gcc.target/i386/pr82196-2.c
index 8fe58411d5e..7166d068bc1 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-2.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target lp64 } } */
 /* { dg-options "-mavx -mcall-ms2sysv-xlogues -O2" } */
-/* { dg-final { scan-assembler "call.*__avx_savms64_18" } } */
-/* { dg-final { scan-assembler "jmp.*__avx_resms64x_18" } } */
+/* { dg-final { scan-assembler "call.*__avx_savms64f?_12" } } */
+/* { dg-final { scan-assembler "jmp.*__avx_resms64f?x_12" } } */
 
 void __attribute__((sysv_abi)) a() {
 }
@@ -9,6 +9,5 @@ void __attribute__((sysv_abi)) a() {
 static void __attribute__((sysv_abi)) (*volatile a_noinfo)() = a;
 
 void __attribute__((ms_abi)) b() {
-  __asm__ __volatile__ ("" :::"rbx", "rbp", "r12", "r13", "r14", "r15");
   a_noinfo ();
 }
diff --git a/libgcc/config/i386/i386-asm.h b/libgcc/config/i386/i386-asm.h
index 424e0f72aac..91e1c0123ff 100644
--- a/libgcc/config/i386/i386-asm.h
+++ b/libgcc/config/i386/i386-asm.h
@@ -69,13 +69,15 @@ ASMNAME(fn):
 
 #ifdef MS2SYSV_STUB_AVX
 # define MS2SYSV_STUB_PREFIX __avx_
-# define MOVAPS vmovaps
+# ifdef HAVE_AS_AVX
+#  define MOVAPS vmovaps
+# endif
 #elif defined(MS2SYSV_STUB_SSE)
 # define MS2SYSV_STUB_PREFIX __sse_
 # define MOVAPS movaps
 #endif
 
-#if defined (MS2SYSV_STUB_PREFIX) && defined (MOVAPS)
+#if defined (MS2SYSV_STUB_PREFIX)
 
 # define MS2SYSV_STUB_BEGIN(base_name) \
 	HIDDEN_FUNC(PASTE2(MS2SYSV_STUB_PREFIX, base_name))
@@ -83,8 +85,10 @@ ASMNAME(fn):
 # define 

Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-20 Thread Daniel Santos
On 09/19/2017 01:58 AM, Jakub Jelinek wrote:
> What can be done in libgcc is detect in configure whether the assembler
> supports AVX, and if not, provide some alternative (e.g. because the insns
> are always the same, you could just code them as .byte or something similar).
>
> Say like:
> --- i386-asm.h2017-09-18 18:34:30.917126996 +0200
> +++ i386-asm.h2017-09-19 08:56:58.829559038 +0200
> @@ -70,6 +70,7 @@ ASMNAME(fn):
>  #ifdef MS2SYSV_STUB_AVX
>  # define MS2SYSV_STUB_PREFIX __avx_
>  # define MOVAPS vmovaps
> +# define BYTE .byte
>  #elif defined(MS2SYSV_STUB_SSE)
>  # define MS2SYSV_STUB_PREFIX __sse_
>  # define MOVAPS movaps
> @@ -84,7 +85,8 @@ ASMNAME(fn):
>   FUNC_END(PASTE2(MS2SYSV_STUB_PREFIX, base_name))
>  
>  /* Save SSE registers 6-15. off is the offset of rax to get to xmm6.  */
> -# define SSE_SAVE   \
> +# ifdef HAVE_AS_AVX

I'm not exactly an autotools expert, but libtim defines HAVE_AS_AVX from
libitm/acinclude.m4 -- of course I need it in libgcc.  Similarly, gcc
has a nice generic gcc_GAS_CHECK_FEATURE macro in gcc/acinclude.m4 which
it uses for all of its HAVE_AS_* macro tests defined in
gcc/configure.ac.  I can just copy, paste and edit what's in libitm, but
I find that rather distasteful.  Is there a cleaner way to do this?  Can
I suck gcc_GAS_CHECK_FEATURE and it's deps out of gcc/acinclude.m4 and
put it somewhere central, like config/as.m4?  The upside would be the
ability to make HAVE_AS_* macros in other sub-projects more uniform.

Alternatively, I can just do the copy and paste and deal with it -- it's
not that much code. :)

Also, is there any sense in doing this same check for SSE support in the
assembler?  It's been out for 18 years now.

Thanks,
Daniel



Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-19 Thread Daniel Santos
On 09/19/2017 07:13 AM, Rainer Orth wrote:
> Daniel Santos <daniel.san...@pobox.com> writes:
>
>> On 09/17/2017 10:53 AM, Uros Bizjak wrote:
>>> OK.
>>>
>>> Thanks,
>>> Uros.
>> Thanks. I should have posted this Friday when my tests finished, but
>> I'll be committing with one minor change so tests don't run on m32 or mx32:
>>
>> --- a/gcc/testsuite/gcc.target/i386/pr82196-1.c
>> +++ b/gcc/testsuite/gcc.target/i386/pr82196-1.c
>> @@ -1,4 +1,4 @@
>> -/* { dg-do compile } */
>> +/* { dg-do compile { target lp64 } } */
>>  /* { dg-options "-msse -mcall-ms2sysv-xlogues -O2" } */
>>  /* { dg-final { scan-assembler "call.*__sse_savms64_18" } } */
>>  /* { dg-final { scan-assembler "jmp.*__sse_resms64x_18" } } */
>> diff --git a/gcc/testsuite/gcc.target/i386/pr82196-2.c
>> b/gcc/testsuite/gcc.target/i386/pr82196-2.c
>> index 31705bee29b..8fe58411d5e 100644
>> --- a/gcc/testsuite/gcc.target/i386/pr82196-2.c
>> +++ b/gcc/testsuite/gcc.target/i386/pr82196-2.c
>> @@ -1,4 +1,4 @@
>> -/* { dg-do compile } */
>> +/* { dg-do compile { target lp64 } } */
>>  /* { dg-options "-mavx -mcall-ms2sysv-xlogues -O2" } */
>>  /* { dg-final { scan-assembler "call.*__avx_savms64_18" } } */
>>  /* { dg-final { scan-assembler "jmp.*__avx_resms64x_18" } } */
>>
>> Other than that, full regression tests pass.
> However, they do FAIL on 64-bit Solaris/x86:
>
> +FAIL: gcc.target/i386/pr82196-1.c (test for excess errors)
>
> Excess errors:
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/pr82196-1.c:14:1: 
> error: bp cannot be used in asm here
>
> +UNRESOLVED: gcc.target/i386/pr82196-1.c scan-assembler call.*__sse_savms64_18
> +UNRESOLVED: gcc.target/i386/pr82196-1.c scan-assembler jmp.*__sse_resms64x_18
> +FAIL: gcc.target/i386/pr82196-2.c (test for excess errors)
> +UNRESOLVED: gcc.target/i386/pr82196-2.c scan-assembler call.*__avx_savms64_18
> +UNRESOLVED: gcc.target/i386/pr82196-2.c scan-assembler jmp.*__avx_resms64x_18
>
>   Rainer

Sorry about that, I forgot about Solaris' default enabled frame
pointers.  I don't even need a test this complicated, I'll make it much
simpler.

Thanks,
Daniel


Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-19 Thread Daniel Santos
On 09/19/2017 01:58 AM, Jakub Jelinek wrote:
> On Mon, Sep 18, 2017 at 06:10:29PM -0500, Daniel Santos wrote:
>> Mike, can you take a look at this please?
>>
>> On 09/18/2017 10:17 AM, Dominique d'Humières wrote:
>>> This patch (r252896) breaks bootstrap on x86_64-apple-darwin10 configured 
>>> with
>>>
>>> ../work/configure --prefix=/opt/gcc/gcc8w 
>>> --enable-languages=c,c++,fortran,objc,obj-c++,ada,lto 
>>> --with-gmp=/opt/mp-new --with-system-zlib --with-isl=/opt/mp-new 
>>> --enable-lto --enable-plugin
>>>
>>> /opt/gcc/build_w/./gcc/xgcc -B/opt/gcc/build_w/./gcc/ 
>>> -B/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/bin/ 
>>> -B/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/lib/ -isystem 
>>> /opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/include -isystem 
>>> /opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/sys-include-g -O2 -O2  -g -O2 
>>> -DIN_GCC-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format 
>>> -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition  -isystem 
>>> ./include   -mmacosx-version-min=10.5 -pipe -fno-common -g -DIN_LIBGCC2 
>>> -fbuilding-libgcc -fno-stack-protector   -mmacosx-version-min=10.5 -pipe 
>>> -fno-common -I. -I. -I../.././gcc -I../../../work/libgcc 
>>> -I../../../work/libgcc/. -I../../../work/libgcc/../gcc 
>>> -I../../../work/libgcc/../include  -DHAVE_CC_TLS -DUSE_EMUTLS -o 
>>> avx_savms64_s.o -MT avx_savms64_s.o -MD -MP -MF avx_savms64_s.dep -DSHARED 
>>> -c -xassembler-with-cpp ../../../work/libgcc/config/i386/avx_savms64.S
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm15,-0x30(%rax)'
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm14,-0x20(%rax)'
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm13,-0x10(%rax)'
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm12, (%rax)'
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm11, 0x10(%rax)'
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm10, 0x20(%rax)'
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm9, 0x30(%rax)'
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm8, 0x40(%rax)'
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm7, 0x50(%rax)'
>>> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
>>> %xmm6, 0x60(%rax)'
>>> make[3]: *** [avx_savms64_s.o] Error 1
>>>
>>> Dominique
>> Thanks for the report.  AVX has been out since early 2011 and Wikipedia
>> claims that AVX support was added to OSX in version 10.6.8 in June 2011
>> and you seem to be using 10.8.0.  I would presume that also means that
>> the assembler supports it.  So I'm going to guess that it's the
>> "-mmacosx-version-min=10.5" parameter.  Can you please try setting that
>> to 10.6.8 and let me know the result?  I don't know what the minimum
>> system requirements for GCC 8 are going to be, but if it includes these
>> older versions of OSX then I'll have to figure out how to cope with it
>> in the libgcc build.
> What can be done in libgcc is detect in configure whether the assembler
> supports AVX, and if not, provide some alternative (e.g. because the insns
> are always the same, you could just code them as .byte or something similar).
>
> Say like:
> --- i386-asm.h2017-09-18 18:34:30.917126996 +0200
> +++ i386-asm.h2017-09-19 08:56:58.829559038 +0200
> @@ -70,6 +70,7 @@ ASMNAME(fn):
>  #ifdef MS2SYSV_STUB_AVX
>  # define MS2SYSV_STUB_PREFIX __avx_
>  # define MOVAPS vmovaps
> +# define BYTE .byte
>  #elif defined(MS2SYSV_STUB_SSE)
>  # define MS2SYSV_STUB_PREFIX __sse_
>  # define MOVAPS movaps
> @@ -84,7 +85,8 @@ ASMNAME(fn):
>   FUNC_END(PASTE2(MS2SYSV_STUB_PREFIX, base_name))
>  
>  /* Save SSE registers 6-15. off is the offset of rax to get to xmm6.  */
> -# define SSE_SAVE   \
> +# ifdef HAVE_AS_AVX
> +#  define SSE_SAVE  \
>   MOVAPS %xmm15,-0x30(%rax); \
>   MOVAPS %xmm14,-0x20(%rax); \
>   MOVAPS %xmm13,-0x10(%rax); \
> @@ -95,6 +97,21 @@ ASMNAME(fn):
>   MOVAPS %xmm8,  0x40(%rax); \
>   MOVAPS %xmm7,  0x50(%rax); \
>   MOVAPS %xmm6,  0x60(%rax)
> +# else
> +/* If the assembler doesn't have AVX support,

Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-18 Thread Daniel Santos
Mike, can you take a look at this please?

On 09/18/2017 10:17 AM, Dominique d'Humières wrote:
> This patch (r252896) breaks bootstrap on x86_64-apple-darwin10 configured with
>
> ../work/configure --prefix=/opt/gcc/gcc8w 
> --enable-languages=c,c++,fortran,objc,obj-c++,ada,lto --with-gmp=/opt/mp-new 
> --with-system-zlib --with-isl=/opt/mp-new --enable-lto --enable-plugin
>
> /opt/gcc/build_w/./gcc/xgcc -B/opt/gcc/build_w/./gcc/ 
> -B/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/bin/ 
> -B/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/lib/ -isystem 
> /opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/include -isystem 
> /opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/sys-include-g -O2 -O2  -g -O2 
> -DIN_GCC-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format 
> -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition  -isystem 
> ./include   -mmacosx-version-min=10.5 -pipe -fno-common -g -DIN_LIBGCC2 
> -fbuilding-libgcc -fno-stack-protector   -mmacosx-version-min=10.5 -pipe 
> -fno-common -I. -I. -I../.././gcc -I../../../work/libgcc 
> -I../../../work/libgcc/. -I../../../work/libgcc/../gcc 
> -I../../../work/libgcc/../include  -DHAVE_CC_TLS -DUSE_EMUTLS -o 
> avx_savms64_s.o -MT avx_savms64_s.o -MD -MP -MF avx_savms64_s.dep -DSHARED -c 
> -xassembler-with-cpp ../../../work/libgcc/config/i386/avx_savms64.S
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm15,-0x30(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm14,-0x20(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm13,-0x10(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm12, (%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm11, 0x10(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm10, 0x20(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm9, 0x30(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm8, 0x40(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm7, 0x50(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm6, 0x60(%rax)'
> make[3]: *** [avx_savms64_s.o] Error 1
>
> Dominique

Thanks for the report.  AVX has been out since early 2011 and Wikipedia
claims that AVX support was added to OSX in version 10.6.8 in June 2011
and you seem to be using 10.8.0.  I would presume that also means that
the assembler supports it.  So I'm going to guess that it's the
"-mmacosx-version-min=10.5" parameter.  Can you please try setting that
to 10.6.8 and let me know the result?  I don't know what the minimum
system requirements for GCC 8 are going to be, but if it includes these
older versions of OSX then I'll have to figure out how to cope with it
in the libgcc build.

Thanks,
Daniel


Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-17 Thread Daniel Santos
On 09/17/2017 10:53 AM, Uros Bizjak wrote:
> OK.
>
> Thanks,
> Uros.

Thanks. I should have posted this Friday when my tests finished, but
I'll be committing with one minor change so tests don't run on m32 or mx32:

--- a/gcc/testsuite/gcc.target/i386/pr82196-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target lp64 } } */
 /* { dg-options "-msse -mcall-ms2sysv-xlogues -O2" } */
 /* { dg-final { scan-assembler "call.*__sse_savms64_18" } } */
 /* { dg-final { scan-assembler "jmp.*__sse_resms64x_18" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82196-2.c 
b/gcc/testsuite/gcc.target/i386/pr82196-2.c
index 31705bee29b..8fe58411d5e 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-2.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target lp64 } } */
 /* { dg-options "-mavx -mcall-ms2sysv-xlogues -O2" } */
 /* { dg-final { scan-assembler "call.*__avx_savms64_18" } } */
 /* { dg-final { scan-assembler "jmp.*__avx_resms64x_18" } } */

Other than that, full regression tests pass.

Thanks,
Daniel


[PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-13 Thread Daniel Santos
I made a silly mistake in libgcc by testing the cpp macro __AVX__ to
determine rather to use movaps or vmovaps in the stubs.  This resulted
in the stubs choice of instruction being decided by the machine flags
when the compiler was built rather than those being supplied at the
command line.  This patch splits stubs into separate sse and avx
versions so that both are available.

gcc:
config/i386/i386.c: (xlogue_layout::STUB_NAME_MAX_LEN): Increase to 20
bytes.
(xlogue_layout::s_stub_names): Add an additional size-2 diminsion.
(xlogue_layout::get_stub_name): Modify to select the appropairate sse
and avx version of the stub.

gcc/testsuite:
gcc.target/i386/pr82196-1.c: New test.
gcc.target/i386/pr82196-2.c: Likewise.

libgcc:
config/i386/i386-asm.h (PASTE2): New macro.
(ASMNAME): Modify to use PASTE2.
(MS2SYSV_STUB_PREFIX): New macro for isa prefix.
(MS2SYSV_STUB_BEGIN, MS2SYSV_STUB_END): New macros for stub headers.
config/i386/resms64.S: Rename to a header file, use MS2SYSV_STUB_BEGIN
instead of HIDDEN_FUNC and MS2SYSV_STUB_END instead of FUNC_END.
config/i386/resms64f.S: Likewise.
config/i386/resms64fx.S: Likewise.
config/i386/resms64x.S: Likewise.
config/i386/savms64.S: Likewise.
config/i386/savms64f.S: Likewise.
config/i386/avx_resms64.S: New file that only defines a macro and
includes it's corresponding header file.
config/i386/avx_resms64f.S: Likewise.
config/i386/avx_resms64fx.S: Likewise.
config/i386/avx_resms64x.S: Likewise.
config/i386/avx_savms64.S: Likewise.
config/i386/avx_savms64f.S: Likewise.
config/i386/sse_resms64.S: Likewise.
config/i386/sse_resms64f.S: Likewise.
config/i386/sse_resms64fx.S: Likewise.
config/i386/sse_resms64x.S: Likewise.
config/i386/sse_savms64.S: Likewise.
config/i386/sse_savms64f.S: Likewise.
config/i386/t-msabi: Modified to add avx and sse versions of stubs.
Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c  | 15 ++-
 gcc/testsuite/gcc.target/i386/pr82196-1.c   | 14 ++
 gcc/testsuite/gcc.target/i386/pr82196-2.c   | 14 ++
 libgcc/config/i386/avx_resms64.S|  2 ++
 libgcc/config/i386/avx_resms64f.S   |  2 ++
 libgcc/config/i386/avx_resms64fx.S  |  2 ++
 libgcc/config/i386/avx_resms64x.S   |  2 ++
 libgcc/config/i386/avx_savms64.S|  2 ++
 libgcc/config/i386/avx_savms64f.S   |  2 ++
 libgcc/config/i386/i386-asm.h   | 34 -
 libgcc/config/i386/{resms64.S => resms64.h} | 28 ++--
 libgcc/config/i386/{resms64f.S => resms64f.h}   | 24 -
 libgcc/config/i386/{resms64fx.S => resms64fx.h} | 24 -
 libgcc/config/i386/{resms64x.S => resms64x.h}   | 28 ++--
 libgcc/config/i386/{savms64.S => savms64.h} | 28 ++--
 libgcc/config/i386/{savms64f.S => savms64f.h}   | 24 -
 libgcc/config/i386/sse_resms64.S|  2 ++
 libgcc/config/i386/sse_resms64f.S   |  2 ++
 libgcc/config/i386/sse_resms64fx.S  |  2 ++
 libgcc/config/i386/sse_resms64x.S   |  2 ++
 libgcc/config/i386/sse_savms64.S|  2 ++
 libgcc/config/i386/sse_savms64f.S   |  2 ++
 libgcc/config/i386/t-msabi  | 18 -
 23 files changed, 173 insertions(+), 102 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82196-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82196-2.c
 create mode 100644 libgcc/config/i386/avx_resms64.S
 create mode 100644 libgcc/config/i386/avx_resms64f.S
 create mode 100644 libgcc/config/i386/avx_resms64fx.S
 create mode 100644 libgcc/config/i386/avx_resms64x.S
 create mode 100644 libgcc/config/i386/avx_savms64.S
 create mode 100644 libgcc/config/i386/avx_savms64f.S
 rename libgcc/config/i386/{resms64.S => resms64.h} (76%)
 rename libgcc/config/i386/{resms64f.S => resms64f.h} (79%)
 rename libgcc/config/i386/{resms64fx.S => resms64fx.h} (79%)
 rename libgcc/config/i386/{resms64x.S => resms64x.h} (77%)
 rename libgcc/config/i386/{savms64.S => savms64.h} (76%)
 rename libgcc/config/i386/{savms64f.S => savms64f.h} (79%)
 create mode 100644 libgcc/config/i386/sse_resms64.S
 create mode 100644 libgcc/config/i386/sse_resms64f.S
 create mode 100644 libgcc/config/i386/sse_resms64fx.S
 create mode 100644 libgcc/config/i386/sse_resms64x.S
 create mode 100644 libgcc/config/i386/sse_savms64.S
 create mode 100644 libgcc/config/i386/sse_savms64f.S

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b2b02acc58a..f0d7d0eb196 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ 

Re: [PATCH] [i386] PR 81850 Don't ignore -mabi=sysv on Cygwin/MinGW

2017-08-23 Thread Daniel Santos
On 08/23/2017 01:12 AM, Uros Bizjak wrote:
> On Wed, Aug 23, 2017 at 7:23 AM, Daniel Santos <daniel.san...@pobox.com> 
> wrote:
>> On 08/22/2017 03:00 PM, Uros Bizjak wrote:
>>> On Tue, Aug 22, 2017 at 9:47 PM, Daniel Santos <daniel.san...@pobox.com> 
>>> wrote:
>>>>> Please add UNKNOWN_ABI to the enum and initialize -mabi in i386.opt to
>>>>> UNKNOWN_ABI.
>>>> It would seem to me that UNSPECIFIED_ABI would be a better value name.
>>>>
>>>> Also, I don't really understand what opts_set and opts are, except that I 
>>>> had
>>>> guessed opts_set is what the user asked for (or didn't ask for) and opts is
>>>> what we're going to actually use.  Am I close?
>>> Yes. opts_set is a flag that user specified an option at the command line.
>>>
>>> However, I fail to see what is the problem. If nothing was specified,
>>> then opts->x_ix86_abi is set to DEFAULT_ABI.
>> That is not what is happening.  If -mabi=sysv is specified, then the
>> test (!opts_set->x_ix86_abi) is true since the value of SYSV_ABI is
>> zero.  When that is evaluated as true, then the abi is set to
>> DEFAULT_ABI, which on Windows is MS_ABI, thus ignoring the command line
>> option.
> Let's use the following patch:
>
> --cut here--
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 3c82ae64f4f2..f8590f663285 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -5682,7 +5682,7 @@ ix86_option_override_internal (bool main_args_p,
>  ? PMODE_DI : PMODE_SI;
>
>if (!opts_set->x_ix86_abi)
> -opts->x_ix86_abi = DEFAULT_ABI;
> +printf ("Using default ABI\n"), opts->x_ix86_abi = DEFAULT_ABI;
>
>/* For targets using ms ABI enable ms-extensions, if not
>   explicit turned off.  For non-ms ABI we turn off this
> --cut here--
>
> $ ./cc1 -O2 -quiet hello.c
> Using default ABI
> $ ./cc1 -O2 -mabi=sysv -quiet hello.c
> $
> $ ./cc1 -O2 -mabi=sysv -quiet hello.c
> $
>
> Again, opts_set is set to true when the option is specified on the
> command line, it has nothing to do with the value of the option.

Interesting, I get the same result and in fact I can't reproduce the bug
anymore.  Either I made a mistake somewhere (likely) or something else
fixed the problem (less likely).  I'll try again from where the trunk
was when I filed the bug and close it either invalid or fixed depending
upon which it is.

Thanks!
Daniel

>> I'm guessing that if we don't specify an Init() option then it will
>> default to zero?  We just need a valid way to differentiate when
>> -mabi=sysv has been passed from when nothing has been passed.
> Yes, it defaults to zero, but since we live in c++ world nowadays, we
> can't initialize enum with integer zero...
>
> Uros.




Re: [PATCH 4/4] [i386, testsuite] Add tests, fix bug in check_avx2_hw_available

2017-08-23 Thread Daniel Santos

On 08/23/2017 08:26 AM, Uros Bizjak wrote:

>> @@ -1822,6 +1845,7 @@ proc check_avx2_hw_available { } {
>> expr 0
>> } else {
>> check_runtime_nocache avx2_hw_available {
>> +   #include 
> Why is the above include needed?

It is only needed to #define NULL.  Without the include, I've had this
function fail due to NULL being undefined.

Daniel



Re: [PATCH] [i386] PR 81850 Don't ignore -mabi=sysv on Cygwin/MinGW

2017-08-22 Thread Daniel Santos
On 08/22/2017 03:00 PM, Uros Bizjak wrote:
> On Tue, Aug 22, 2017 at 9:47 PM, Daniel Santos <daniel.san...@pobox.com> 
> wrote:
>>> Please add UNKNOWN_ABI to the enum and initialize -mabi in i386.opt to
>>> UNKNOWN_ABI.
>> It would seem to me that UNSPECIFIED_ABI would be a better value name.
>>
>> Also, I don't really understand what opts_set and opts are, except that I had
>> guessed opts_set is what the user asked for (or didn't ask for) and opts is
>> what we're going to actually use.  Am I close?
> Yes. opts_set is a flag that user specified an option at the command line.
>
> However, I fail to see what is the problem. If nothing was specified,
> then opts->x_ix86_abi is set to DEFAULT_ABI.

That is not what is happening.  If -mabi=sysv is specified, then the
test (!opts_set->x_ix86_abi) is true since the value of SYSV_ABI is
zero.  When that is evaluated as true, then the abi is set to
DEFAULT_ABI, which on Windows is MS_ABI, thus ignoring the command line
option.

> Probably we don't need
> Init(SYSV_ABI) in mabi= declaration at all.

I'm guessing that if we don't specify an Init() option then it will
default to zero?  We just need a valid way to differentiate when
-mabi=sysv has been passed from when nothing has been passed.

Daniel

>
> Uros.
>
>> I'm re-running tests, so if they pass is this OK?
>>
>> Thanks,
>> Daniel
>> ---
>>  gcc/config/i386/i386-opts.h | 5 +++--
>>  gcc/config/i386/i386.c  | 3 +--
>>  gcc/config/i386/i386.opt| 2 +-
>>  3 files changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
>> index 542cd0f3d67..a1d1552a3c6 100644
>> --- a/gcc/config/i386/i386-opts.h
>> +++ b/gcc/config/i386/i386-opts.h
>> @@ -44,8 +44,9 @@ last_alg
>>  /* Available call abi.  */
>>  enum calling_abi
>>  {
>> -  SYSV_ABI = 0,
>> -  MS_ABI = 1
>> +  UNSPECIFIED_ABI = 0,
>> +  SYSV_ABI = 1,
>> +  MS_ABI = 2
>>  };
>>
>>  enum fpmath_unit
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 650bcbc65ae..c08ad55fcd9 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -5681,12 +5681,11 @@ ix86_option_override_internal (bool main_args_p,
>>  opts->x_ix86_pmode = TARGET_LP64_P (opts->x_ix86_isa_flags)
>>  ? PMODE_DI : PMODE_SI;
>>
>> -  if (!opts_set->x_ix86_abi)
>> +  if (opts_set->x_ix86_abi == UNSPECIFIED_ABI)
>>  opts->x_ix86_abi = DEFAULT_ABI;
>>
>>if (opts->x_ix86_abi == MS_ABI && TARGET_X32_P (opts->x_ix86_isa_flags))
>>  error ("-mabi=ms not supported with X32 ABI");
>> -  gcc_assert (opts->x_ix86_abi == SYSV_ABI || opts->x_ix86_abi == MS_ABI);
>>
>>/* For targets using ms ABI enable ms-extensions, if not
>>   explicit turned off.  For non-ms ABI we turn off this
>> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
>> index cd564315f04..f7b9f9707f7 100644
>> --- a/gcc/config/i386/i386.opt
>> +++ b/gcc/config/i386/i386.opt
>> @@ -525,7 +525,7 @@ Target Report Mask(IAMCU)
>>  Generate code that conforms to Intel MCU psABI.
>>
>>  mabi=
>> -Target RejectNegative Joined Var(ix86_abi) Enum(calling_abi) Init(SYSV_ABI)
>> +Target RejectNegative Joined Var(ix86_abi) Enum(calling_abi) 
>> Init(UNSPECIFIED_ABI)
>>  Generate code that conforms to the given ABI.
>>
>>  Enum
>> --
>> 2.13.3
>>



[PATCH 4/4] [i386, testsuite] Add tests, fix bug in check_avx2_hw_available

2017-08-22 Thread Daniel Santos
Changes to lib/target-supports.exp and documentation:
* Add effective-targets avx512f and avx512f_runtime (needed for new
  tests).
* Corrects bug in check_avx2_hw_available.
* Adds documentation for effective-targets avx2, avx2_runtime (both
  missing), avx512f and avx512f_runtime.

The following tests are added.  The testcase in the PR is used as a base
and relevant variants are added to test other factors affected by the
patch set.

pr80969-1.c   Base test case.
pr80969-2.c   With ms to sysv call.
pr80969-2a.c  With ms to sysv call using stubs.
pr80969-3.c   With alloca (for DRAP test).
pr80969-4.c   With va_args passed via va_list
pr80969-4a.c  With va_args passed via va_list and ms to sysv call.
pr80969-4b.c  With va_args passed via va_list and ms to sysv call using
  stubs.
pr80969-4.h   Common header for pr80969-4*.c.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/doc/sourcebuild.texi   |  12 +++
 gcc/testsuite/gcc.target/i386/pr80969-1.c  |  16 
 gcc/testsuite/gcc.target/i386/pr80969-2.c  |  27 +++
 gcc/testsuite/gcc.target/i386/pr80969-2a.c |   8 ++
 gcc/testsuite/gcc.target/i386/pr80969-3.c  |  32 
 gcc/testsuite/gcc.target/i386/pr80969-4.c  |   9 +++
 gcc/testsuite/gcc.target/i386/pr80969-4.h  | 119 +
 gcc/testsuite/gcc.target/i386/pr80969-4a.c |   9 +++
 gcc/testsuite/gcc.target/i386/pr80969-4b.c |   9 +++
 gcc/testsuite/lib/target-supports.exp  |  66 
 10 files changed, 307 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4.h
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4b.c

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index e6313dc031e..0bf4d6afeb6 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1855,6 +1855,18 @@ Target supports compiling @code{avx} instructions.
 @item avx_runtime
 Target supports the execution of @code{avx} instructions.
 
+@item avx2
+Target supports compiling @code{avx2} instructions.
+
+@item avx2_runtime
+Target supports the execution of @code{avx2} instructions.
+
+@item avx512f
+Target supports compiling @code{avx512f} instructions.
+
+@item avx512f_runtime
+Target supports the execution of @code{avx512f} instructions.
+
 @item cell_hw
 Test system can execute AltiVec and Cell PPU instructions.
 
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-1.c 
b/gcc/testsuite/gcc.target/i386/pr80969-1.c
new file mode 100644
index 000..e0520b45c40
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-1.c
@@ -0,0 +1,16 @@
+/* { dg-do run { target { ! x32 } } } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+int a[56];
+int b;
+int main (int argc, char *argv[]) {
+  int c;
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2.c
new file mode 100644
index 000..f885dee6512
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { { ! x32 } && avx512f_runtime } } } */
+/* { dg-do compile { target { { ! x32 } && { ! avx512f_runtime } } } } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func.  */
+
+int a[56];
+int b;
+
+static void __attribute__((sysv_abi)) sysv ()
+{
+}
+
+void __attribute__((sysv_abi)) (*volatile const sysv_noinfo)() = sysv;
+
+int main (int argc, char *argv[]) {
+  int c;
+  sysv_noinfo ();
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2a.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
new file mode 100644
index 000..baea0796d24
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
@@ -0,0 +1,8 @@
+/* { dg-do run { target { lp64 && avx512f_runtime } } } */
+/* { dg-do compile { target { lp64 && { ! avx512f_runtime } } } } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f -mcall-ms2sysv-xlogues" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func using save/restore stubs.  */
+
+#include "pr80969-2.c"
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-3.c 
b/gcc/testsuite/gcc.target/i386/pr80969-3.c
new file mode 100644
index 000..d902a771cc8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-3.c
@@ -0,0 +1,32 @

[PATCH 3/4] [i386] Modify SP realignment in ix86_expand_prologue, et. al.

2017-08-22 Thread Daniel Santos
My first version of this patch inited m->fs.sp_realigned_fp_last with
the value of m->fs.sp_offset prior to performing the stack realignment.
I had forgotten, however, that when we're saving GP regs using MOV that
we delay SP modification as long as possible so that the value of
m->fs.sp_offset at this point is correct when we've used push, but
incorrect when we've used mov.

This has been tested on both x86_64-pc-linux-gnu{,x32} with
--target_board=unix/\{-m64,-mx32,-m32\}.

Original patch description:

The SP allocation calculation is now done in ix86_compute_frame_layout
and the result stored in ix86_frame::stack_realign_allocate.  This
change also updates comments for choose_baseaddr to clarify that the
alignment returned doesn't necessarily reflect the alignment of the
cfa_offset passed (e.g., you can pass cfa_offset 48 and it can return an
alignment of 64 bytes).

Since the alignment required may be more than 16-bytes, we cannot defer
SP allocation to ix86_emit_outlined_ms2sysv_save (when it's enabled), so
that function needs to be updated as well.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 58 --
 1 file changed, 32 insertions(+), 26 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 30e84dd5303..dbc771da8aa 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13359,10 +13359,13 @@ choose_basereg (HOST_WIDE_INT cfa_offset, rtx 
_reg,
 }
 
 /* Return an RTX that points to CFA_OFFSET within the stack frame and
-   the alignment of address.  If align is non-null, it should point to
+   the alignment of address.  If ALIGN is non-null, it should point to
an alignment value (in bits) that is preferred or zero and will
-   recieve the alignment of the base register that was selected.  The
-   valid base registers are taken from CFUN->MACHINE->FS.  */
+   recieve the alignment of the base register that was selected,
+   irrespective of rather or not CFA_OFFSET is a multiple of that
+   alignment value.
+
+   The valid base registers are taken from CFUN->MACHINE->FS.  */
 
 static rtx
 choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
@@ -14445,35 +14448,35 @@ ix86_emit_outlined_ms2sysv_save (const struct 
ix86_frame )
   rtx sym, addr;
   rtx rax = gen_rtx_REG (word_mode, AX_REG);
   const struct xlogue_layout  = xlogue_layout::get_instance ();
-  HOST_WIDE_INT rax_offset = xlogue.get_stub_ptr_offset () + m->fs.sp_offset;
-  HOST_WIDE_INT stack_alloc_size = frame.stack_pointer_offset - 
m->fs.sp_offset;
-  HOST_WIDE_INT stack_align_off_in = xlogue.get_stack_align_off_in ();
+  HOST_WIDE_INT allocate = frame.stack_pointer_offset - m->fs.sp_offset;
+
+  /* AL should only be live with sysv_abi.  */
+  gcc_assert (!ix86_eax_live_at_start_p ());
+
+  /* Setup RAX as the stub's base pointer.  We use stack_realign_offset rather
+ we've actually realigned the stack or not.  */
+  align = GET_MODE_ALIGNMENT (V4SFmode);
+  addr = choose_baseaddr (frame.stack_realign_offset
+ + xlogue.get_stub_ptr_offset (), );
+  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
+  emit_insn (gen_rtx_SET (rax, addr));
 
-  /* Verify that the incoming stack 16-byte alignment offset matches the
- layout we're using.  */
-  gcc_assert (stack_align_off_in == (m->fs.sp_offset & UNITS_PER_WORD));
+  /* Allocate stack if not already done.  */
+  if (allocate > 0)
+  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+   GEN_INT (-allocate), -1, false);
 
   /* Get the stub symbol.  */
   sym = xlogue.get_stub_rtx (frame_pointer_needed ? XLOGUE_STUB_SAVE_HFP
  : XLOGUE_STUB_SAVE);
   RTVEC_ELT (v, vi++) = gen_rtx_USE (VOIDmode, sym);
 
-  /* Setup RAX as the stub's base pointer.  */
-  align = GET_MODE_ALIGNMENT (V4SFmode);
-  addr = choose_baseaddr (rax_offset, );
-  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
-  insn = emit_insn (gen_rtx_SET (rax, addr));
-
-  gcc_assert (stack_alloc_size >= xlogue.get_stack_space_used ());
-  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
-GEN_INT (-stack_alloc_size), -1,
-m->fs.cfa_reg == stack_pointer_rtx);
   for (i = 0; i < ncregs; ++i)
 {
   const xlogue_layout::reginfo  = xlogue.get_reginfo (i);
   rtx reg = gen_rtx_REG ((SSE_REGNO_P (r.regno) ? V4SFmode : word_mode),
 r.regno);
-  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);;
+  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);
 }
 
   gcc_assert (vi == (unsigned)GET_NUM_ELEM (v));
@@ -14728,14 +14731,15 @@ ix86_expand_prologue (void)
   gcc_assert (align_bytes > MIN_STACK_BOUNDARY / BITS_PER_UNIT);
 
   /* Record last valid frame pointer

[PATCH 2/4] [i386] Modify ix86_compute_frame_layout

2017-08-22 Thread Daniel Santos
These changes affect how the stack frame is calculated from the region
starting at frame.reg_save_offset until frame.frame_pointer_offset,
which includes either the stub save area or the (inline) SSE register
save area and the va_args register save area.

The calculation used when not realigning the stack pointer is the same,
but when when realigning we calculate the 16-byte aligned space needed
in reverse so that the stack realignment boundary at
frame.stack_realign_offset may not necessarily be a multiple of
stack_alignment_needed, but the value of frame.frame_pointer_offset
will. This results in a properly aligned stack for the function body and
avoids wasting stack space.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 116 +
 gcc/config/i386/i386.h |   2 +-
 2 files changed, 80 insertions(+), 38 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 601e3ef47f6..30e84dd5303 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12960,6 +12960,14 @@ ix86_compute_frame_layout (void)
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
   gcc_assert (preferred_alignment <= stack_alignment_needed);
 
+  /* The only ABI saving SSE regs should be 64-bit ms_abi.  */
+  gcc_assert (TARGET_64BIT || !frame->nsseregs);
+  if (TARGET_64BIT && m->call_ms2sysv)
+{
+  gcc_assert (stack_alignment_needed >= 16);
+  gcc_assert (!frame->nsseregs);
+}
+
   /* For SEH we have to limit the amount of code movement into the prologue.
  At present we do this via a BLOCKAGE, at which point there's very little
  scheduling that can be done, which means that there's very little point
@@ -13022,54 +13030,88 @@ ix86_compute_frame_layout (void)
   if (TARGET_SEH)
 frame->hard_frame_pointer_offset = offset;
 
-  /* When re-aligning the stack frame, but not saving SSE registers, this
- is the offset we want adjust the stack pointer to.  */
-  frame->stack_realign_allocate_offset = offset;
+  /* Calculate the size of the va-arg area (not including padding, if any).  */
+  frame->va_arg_size = ix86_varargs_gpr_size + ix86_varargs_fpr_size;
 
-  /* The re-aligned stack starts here.  Values before this point are not
- directly comparable with values below this point.  Use sp_valid_at
- to determine if the stack pointer is valid for a given offset and
- fp_valid_at for the frame pointer.  */
   if (stack_realign_fp)
-offset = ROUND_UP (offset, stack_alignment_needed);
-  frame->stack_realign_offset = offset;
-
-  if (TARGET_64BIT && m->call_ms2sysv)
 {
-  gcc_assert (stack_alignment_needed >= 16);
-  gcc_assert (!frame->nsseregs);
+  /* We may need a 16-byte aligned stack for the remainder of the
+register save area, but the stack frame for the local function
+may require a greater alignment if using AVX/2/512.  In order
+to avoid wasting space, we first calculate the space needed for
+the rest of the register saves, add that to the stack pointer,
+and then realign the stack to the boundary of the start of the
+frame for the local function.  */
+  HOST_WIDE_INT space_needed = 0;
+  HOST_WIDE_INT sse_reg_space_needed = 0;
 
-  m->call_ms2sysv_pad_in = !!(offset & UNITS_PER_WORD);
-  offset += xlogue_layout::get_instance ().get_stack_space_used ();
-}
+  if (TARGET_64BIT)
+   {
+ if (m->call_ms2sysv)
+   {
+ m->call_ms2sysv_pad_in = 0;
+ space_needed = xlogue_layout::get_instance 
().get_stack_space_used ();
+   }
 
-  /* Align and set SSE register save area.  */
-  else if (frame->nsseregs)
-{
-  /* The only ABI that has saved SSE registers (Win64) also has a
-16-byte aligned default stack.  However, many programs violate
-the ABI, and Wine64 forces stack realignment to compensate.
+ else if (frame->nsseregs)
+   /* The only ABI that has saved SSE registers (Win64) also has a
+  16-byte aligned default stack.  However, many programs violate
+  the ABI, and Wine64 forces stack realignment to compensate.  */
+   space_needed = frame->nsseregs * 16;
+
+ sse_reg_space_needed = space_needed = ROUND_UP (space_needed, 16);
+
+ /* 64-bit frame->va_arg_size should always be a multiple of 16, but
+rounding to be pedantic.  */
+ space_needed = ROUND_UP (space_needed + frame->va_arg_size, 16);
+   }
+  else
+   space_needed = frame->va_arg_size;
+
+  /* Record the allocation size required prior to the realignment AND.  */
+  frame->stack_realign_allocate = space_needed;
+
+  /* The re-aligned stack starts at frame->stack_realign_offset.  Values
+before this point are not directly co

[PATCH 1/4] [i386] Correct comments, add assertions to sp_valid_at and fp_valid_at

2017-08-22 Thread Daniel Santos
When we realign the stack frame (without DRAP), there may be a range of
CFA offsets that should never be touched because they are alignment
padding and any reference to them is almost certainly an error.
Previously, only the offset of where the realigned stack frame starts
was recorded and checked in sp_valid_at and fp_valid_at.

This change adds sp_realigned_fp_last to struct machine_frame_state to
record the last valid offset from which the frame pointer can be used
when the stack pointer is realigned and modifies sp_valid_at and
fp_valid_at to fail an assertion when passed an offset in the "no-man's
land" between these two values.

Comments for struct machine_frame_state incorrectly stated that a
realigned stack pointer could be used to access offsets equal to or
greater than sp_realigned_offset, but it is only valid for offsets that
are greater.  This was the (incorrect) behaviour of sp_valid_at and
fp_valid_at prior to r250587 and this change now corrects the
documentation and adds clarification of the CFA-relative calculation.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 45 ++---
 gcc/config/i386/i386.h | 18 +-
 2 files changed, 43 insertions(+), 20 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c08ad55fcd9..601e3ef47f6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13177,26 +13177,36 @@ choose_baseaddr_len (unsigned int regno, 
HOST_WIDE_INT offset)
   return len;
 }
 
-/* Determine if the stack pointer is valid for accessing the cfa_offset.
-   The register is saved at CFA - CFA_OFFSET.  */
+/* Determine if the stack pointer is valid for accessing the CFA_OFFSET in
+   the frame save area.  The register is saved at CFA - CFA_OFFSET.  */
 
-static inline bool
+static bool
 sp_valid_at (HOST_WIDE_INT cfa_offset)
 {
   const struct machine_frame_state  = cfun->machine->fs;
-  return fs.sp_valid && !(fs.sp_realigned
- && cfa_offset <= fs.sp_realigned_offset);
+  if (fs.sp_realigned && cfa_offset <= fs.sp_realigned_offset)
+{
+  /* Validate that the cfa_offset isn't in a "no-man's land".  */
+  gcc_assert (cfa_offset <= fs.sp_realigned_fp_last);
+  return false;
+}
+  return fs.sp_valid;
 }
 
-/* Determine if the frame pointer is valid for accessing the cfa_offset.
-   The register is saved at CFA - CFA_OFFSET.  */
+/* Determine if the frame pointer is valid for accessing the CFA_OFFSET in
+   the frame save area.  The register is saved at CFA - CFA_OFFSET.  */
 
 static inline bool
 fp_valid_at (HOST_WIDE_INT cfa_offset)
 {
   const struct machine_frame_state  = cfun->machine->fs;
-  return fs.fp_valid && !(fs.sp_valid && fs.sp_realigned
- && cfa_offset > fs.sp_realigned_offset);
+  if (fs.sp_realigned && cfa_offset > fs.sp_realigned_fp_last)
+{
+  /* Validate that the cfa_offset isn't in a "no-man's land".  */
+  gcc_assert (cfa_offset >= fs.sp_realigned_offset);
+  return false;
+}
+  return fs.fp_valid;
 }
 
 /* Choose a base register based upon alignment requested, speed and/or
@@ -14675,6 +14685,9 @@ ix86_expand_prologue (void)
   int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
   gcc_assert (align_bytes > MIN_STACK_BOUNDARY / BITS_PER_UNIT);
 
+  /* Record last valid frame pointer offset.  */
+  m->fs.sp_realigned_fp_last = m->fs.sp_offset;
+
   /* The computation of the size of the re-aligned stack frame means
 that we must allocate the size of the register save area before
 performing the actual alignment.  Otherwise we cannot guarantee
@@ -14688,13 +14701,15 @@ ix86_expand_prologue (void)
   insn = emit_insn (ix86_gen_andsp (stack_pointer_rtx,
stack_pointer_rtx,
GEN_INT (-align_bytes)));
-  /* For the purposes of register save area addressing, the stack
-pointer can no longer be used to access anything in the frame
-below m->fs.sp_realigned_offset and the frame pointer cannot be
-used for anything at or above.  */
   m->fs.sp_offset = ROUND_UP (m->fs.sp_offset, align_bytes);
   m->fs.sp_realigned = true;
   m->fs.sp_realigned_offset = m->fs.sp_offset - frame.nsseregs * 16;
+  /* The stack pointer may no longer be equal to CFA - m->fs.sp_offset.
+Beyond this point, stack access should be done via choose_baseaddr or
+by using sp_valid_at and fp_valid_at to determine the correct base
+register.  Henceforth, any CFA offset should be thought of as logical
+and not physical.  */
+  gcc_assert (m->fs.sp_realigned_offset >= m->fs.sp_realigned_fp_last);
   gcc_assert (m->fs.sp_realign

[PATCH v4 0/4] [i386] PR80969 Fix ICE with -mabi=ms -mavx512f

2017-08-22 Thread Daniel Santos
I had to fix a few things for x32 compatibility and I this is ready
now.  H.J. tested on machine with avx512 (including x32) and I've tested
both native x32 and normal x86_64 with m64, m32 and mx32 and all is
well.  I've made more changes to the tests so I'm just submitting a
version 2 of the whole patch set.

OK for trunk?

2017-08-22  Daniel Santos  <daniel.san...@pobox.com>

* config/i386/i386.h (ix86_frame::stack_realign_allocate_offset):
Remove field.
(ix86_frame::stack_realign_allocate): New field.
(struct machine_frame_state): Modify comments.
(machine_frame_state::sp_realigned_fp_end): New field.
* config/i386/i386.c (ix86_compute_frame_layout): Rework stack frame
layout calculation.
(sp_valid_at): Add assertion to assure no attempt to access invalid
offset of a realigned stack.
(fp_valid_at): Likewise.
(choose_baseaddr): Modify comments.
(ix86_emit_outlined_ms2sysv_save): Adjust to changes in
ix86_expand_prologue.
(ix86_expand_prologue): Modify stack realignment and allocation.
(ix86_expand_epilogue): Modify comments.

2017-08-22  Daniel Santos  <daniel.san...@pobox.com>

* gcc.target/i386/pr80969-1.c: New testcase.
* gcc.target/i386/pr80969-2a.c: Likewise.
* gcc.target/i386/pr80969-2.c: Likewise.
* gcc.target/i386/pr80969-3.c: Likewise.
* gcc.target/i386/pr80969-4a.c: Likewise.
* gcc.target/i386/pr80969-4b.c: Likewise.
* gcc.target/i386/pr80969-4.c: Likewise.
* gcc.target/i386/pr80969-4.h: New header common to pr80969-4*.c


Thanks,
Daniel


[PATCH] [i386] PR 81850 Don't ignore -mabi=sysv on Cygwin/MinGW

2017-08-22 Thread Daniel Santos
> Please add UNKNOWN_ABI to the enum and initialize -mabi in i386.opt to
> UNKNOWN_ABI.

It would seem to me that UNSPECIFIED_ABI would be a better value name.

Also, I don't really understand what opts_set and opts are, except that I had
guessed opts_set is what the user asked for (or didn't ask for) and opts is
what we're going to actually use.  Am I close?

I'm re-running tests, so if they pass is this OK?

Thanks,
Daniel
---
 gcc/config/i386/i386-opts.h | 5 +++--
 gcc/config/i386/i386.c  | 3 +--
 gcc/config/i386/i386.opt| 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
index 542cd0f3d67..a1d1552a3c6 100644
--- a/gcc/config/i386/i386-opts.h
+++ b/gcc/config/i386/i386-opts.h
@@ -44,8 +44,9 @@ last_alg
 /* Available call abi.  */
 enum calling_abi
 {
-  SYSV_ABI = 0,
-  MS_ABI = 1
+  UNSPECIFIED_ABI = 0,
+  SYSV_ABI = 1,
+  MS_ABI = 2
 };
 
 enum fpmath_unit
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 650bcbc65ae..c08ad55fcd9 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5681,12 +5681,11 @@ ix86_option_override_internal (bool main_args_p,
 opts->x_ix86_pmode = TARGET_LP64_P (opts->x_ix86_isa_flags)
 ? PMODE_DI : PMODE_SI;
 
-  if (!opts_set->x_ix86_abi)
+  if (opts_set->x_ix86_abi == UNSPECIFIED_ABI)
 opts->x_ix86_abi = DEFAULT_ABI;
 
   if (opts->x_ix86_abi == MS_ABI && TARGET_X32_P (opts->x_ix86_isa_flags))
 error ("-mabi=ms not supported with X32 ABI");
-  gcc_assert (opts->x_ix86_abi == SYSV_ABI || opts->x_ix86_abi == MS_ABI);
 
   /* For targets using ms ABI enable ms-extensions, if not
  explicit turned off.  For non-ms ABI we turn off this
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index cd564315f04..f7b9f9707f7 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -525,7 +525,7 @@ Target Report Mask(IAMCU)
 Generate code that conforms to Intel MCU psABI.
 
 mabi=
-Target RejectNegative Joined Var(ix86_abi) Enum(calling_abi) Init(SYSV_ABI)
+Target RejectNegative Joined Var(ix86_abi) Enum(calling_abi) 
Init(UNSPECIFIED_ABI)
 Generate code that conforms to the given ABI.
 
 Enum
-- 
2.13.3



Re: [PATCH] [docs] Explain how to use multiple file-name patterns in RUNTESTFLAGS

2017-08-22 Thread Daniel Santos
OK, the problem is at line 4014 of gcc/Makefile.in:

  $(MAKE) TESTSUITEDIR="$(TESTSUITEDIR)"
RUNTESTFLAGS="$(RUNTESTFLAGS)" \
check-parallel-$* \
 
Even worse, one can inject arbitrary shell commands here, not that I can
think of a scenario where it would be an actual security problem:

RUNTESTFLAGS="i386.exp=a b\"; beep\"" check-c

I presume that the solution would be to re-escape the contents of
RUNTESTFLAGS.

Daniel


Re: [PATCH] [docs] Explain how to use multiple file-name patterns in RUNTESTFLAGS

2017-08-22 Thread Daniel Santos
On 08/22/2017 12:32 PM, Mike Stump wrote:
> On Aug 22, 2017, at 10:32 AM, Daniel Santos <daniel.san...@pobox.com> wrote:
>>>  I would suggest "escaped or quoted."
>>> The whole argument to RUNTESTFLAGS can be quoted in either single
>>> or double quotes and, AFAICT, so can the space-separated test
>>> names within it.
>> Well, mysteriously, double quotes do not work.
> Did you try the obvious:
>
> "\"pdf pdf\" pdf"
>
> ?  I think it should work fine.

I have found one additional working mechanism:

RUNTESTFLAGS='i386.exp=\"pr80969-[12]*.c pr80969-4.c\"'

But using double quotes for both does NOT work:

RUNTESTFLAGS="i386.exp=\"pr80969-[12]*.c pr80969-4.c\""

So the three working options appears to be:
1. Escaping whitespace
2. Using double quotes for the whole value and single quotes for the
file.exp=patterns expression
3. Using single quotes for the whole value and double quotes for the
file.exp=patterns expression

Daniel


Re: [PATCH] [docs] Explain how to use multiple file-name patterns in RUNTESTFLAGS

2017-08-22 Thread Daniel Santos
On 08/22/2017 12:32 PM, Mike Stump wrote:
> On Aug 22, 2017, at 10:32 AM, Daniel Santos <daniel.san...@pobox.com> wrote:
>>>  I would suggest "escaped or quoted."
>>> The whole argument to RUNTESTFLAGS can be quoted in either single
>>> or double quotes and, AFAICT, so can the space-separated test
>>> names within it.
>> Well, mysteriously, double quotes do not work.
> Did you try the obvious:
>
> "\"pdf pdf\" pdf"
>
> ?  I think it should work fine.

Yes.  As I explained in the rest of my email I tried a great many
variations.  I can debug runtest some more and try to better understand
how this is getting parsed.

Daniel


[PATCH] [docs] Explain how to use multiple file-name patterns in RUNTESTFLAGS

2017-08-22 Thread Daniel Santos
OK, how's this one?

* doc/install.texi: Modify to add more details on running
selected tests.

Thanks,
Daniel

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/doc/install.texi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 7c9e2f25d44..da360da1c50 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2737,6 +2737,16 @@ the testsuite with filenames matching @samp{9805*}, you 
would use
 make check-g++ RUNTESTFLAGS="old-deja.exp=9805* @var{other-options}"
 @end smallexample
 
+The file-matching expression following @var{filename}@command{.exp=} is treated
+as a series of whitespace-delimited glob expressions so that multiple patterns
+may be passed, although any whitespace must either be escaped or surrounded by
+single quotes if multiple expressions are desired. For example,
+
+@smallexample
+make check-g++ RUNTESTFLAGS="old-deja.exp=9805*\ virtual2.c 
@var{other-options}"
+make check-g++ RUNTESTFLAGS="'old-deja.exp=9805* virtual2.c' 
@var{other-options}"
+@end smallexample
+
 The @file{*.exp} files are located in the testsuite directories of the GCC
 source, the most important ones being @file{compile.exp},
 @file{execute.exp}, @file{dg.exp} and @file{old-deja.exp}.
-- 
2.13.3



Re: [PATCH] [docs] Explain how to use multiple file-name patterns in RUNTESTFLAGS

2017-08-22 Thread Daniel Santos
On 08/22/2017 10:58 AM, Martin Sebor wrote:
> On 08/21/2017 07:41 PM, Daniel Santos wrote:
>> It took me a while to figure out how to do this so I figured that it
>> should be
>> in the docs.  OK for trunk?
>>
>> * doc/install.texi: Add more details on selecting multiple tests.
>
> Thank you!  It had taken me some time to figure this out.
>
>> +The file-matching expression following @var{filename}@command{.exp=}
>> is treated
>> +as a series of whitespace-delimited glob expressions so that
>> multiple patterns
>> +may be passed, although any whitespace must either be escaped or
>> surrounded by
>> +tick marks if multiple expressions are desired. For example,
>
> Do you mean single quotes?

Yes.  I guess I've heard the terms "tick marks" and "single quotes" used
before.  Perhaps using 'single quotes' would be a good way to express it
(with the quotes).

>   I would suggest "escaped or quoted."
> The whole argument to RUNTESTFLAGS can be quoted in either single
> or double quotes and, AFAICT, so can the space-separated test
> names within it.

Well, mysteriously, double quotes do not work.  So if I pass
RUNTESTFLAGS='"i386.exp=pr80969-[12]*.c pr80969-4.c"' then the second
pattern isn't used.  I have NO idea what happens to it because it I pass
RUNTESTFLAGS='i386.exp=pr80969-[12]*.c pr80969-4.c' then runtest
properly demands that I tell it what in the hell pr80969-4.c is supposed
to mean.  As an experiment, I created a symlink named \"pr80969-4.c and
using RUNTESTFLAGS='"i386.exp=pr80969-[12]*.c "pr80969-4.c' but it
didn't pick it up.  This is probably JAB (just another bug) in DejaGNU.

Among the variations I've tried are enclosing the expressions in
{braces},  \{escaped braces\} and comma-delimited \{escaped,braces\},
but none of these worked.

Daniel

> Martin
>



Re: [PATCH] [i386] PR 81850 Don't ignore -mabi=sysv on Cygwin/MinGW

2017-08-22 Thread Daniel Santos
On 08/22/2017 01:26 AM, Andreas Schwab wrote:
> On Aug 21 2017, Daniel Santos <daniel.san...@pobox.com> wrote:
>
>> This is a problem that occured because of this code in
>> ix86_option_override_internal:
>>
>>   if (!opts_set->x_ix86_abi)
>> opts->x_ix86_abi = DEFAULT_ABI;
> Why is that a problem?  Note opts_set vs opts.

Just because the test !opts_set->x_ix86_abi will be true rather we
supplied no -mabi parameter or we supplied -mabi=sysv.

Daniel

> Andreas.



[PATCH] [i386] PR 81850 Don't ignore -mabi=sysv on Cygwin/MinGW

2017-08-21 Thread Daniel Santos
This is a problem that occured because of this code in
ix86_option_override_internal:

  if (!opts_set->x_ix86_abi)
opts->x_ix86_abi = DEFAULT_ABI;

I tested this along with my other patches.  OK for trunk?

* config/i386/i386-opts.h (enum calling_abi): Modify so that no legal
values are equivalent to zero.

Thanks,
Daniel

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386-opts.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
index 542cd0f3d67..8c2b5380e49 100644
--- a/gcc/config/i386/i386-opts.h
+++ b/gcc/config/i386/i386-opts.h
@@ -44,8 +44,8 @@ last_alg
 /* Available call abi.  */
 enum calling_abi
 {
-  SYSV_ABI = 0,
-  MS_ABI = 1
+  SYSV_ABI = 1,
+  MS_ABI = 2
 };
 
 enum fpmath_unit
-- 
2.13.3



[PATCH] [i386, testsuite] [PR 71958] Error on -mx32 with -mabi=ms

2017-08-21 Thread Daniel Santos
We currently error when -mx32 and -mabi=sysv and we encounter a function
with attribute ms_abi, but we are not erroring on -mx32 and -mabi=ms
(either explicitly or when it is the default on Windows).  In fact, it
generates code that runs, but is of an undfined ABI.

I'm also changing pr64409.c because if you explicitly supply -m64, then
the test became ineffective.  This is because the -mx32 parameter passed
in dg-options is later overridden by the explicit -m64 parameter.

I've bootstrapped and tested on
*  an x86_64-pc-linux-gnux32 system building gcc with --with-abi=mx32,
*  a "normal" x86_64-pc-linux-gnu testing with
   --target_board=unix/\{,-m32\}, and
*  on Windows.

OK for trunk?

gcc/ChangeLog:
2017-08-11  Daniel Santos  <daniel.san...@pobox.com>

* config/i386/i386.c (ix86_option_override_internal): Error when
-mx32 is combined with -mabi=ms.
(ix86_function_type_abi): Limit errors for mixing -mx32 with
attribute ms_abi.

gcc/testsuite/ChangeLog:
2017-08-11  Daniel Santos  <daniel.san...@pobox.com>

* gcc.target/i386/pr71958.c: New test to verify error on -mx32
and -mabi=ms
* gcc.target/i386/pr64409.c: Modify to only run on x32.
* gcc.target/i386/pr46470.c: Modify to skip x32 target.
* gcc.target/i386/pr66275.c: Likewise.
* gcc.target/i386/pr68018.c: Likewise.

Thanks,
Daniel

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c  | 12 ++--
 gcc/testsuite/gcc.target/i386/pr46470.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr64409.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr66275.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr68018.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr71958.c |  7 +++
 6 files changed, 21 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71958.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1d88e4f247a..3b537f2608f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5684,6 +5684,10 @@ ix86_option_override_internal (bool main_args_p,
   if (!opts_set->x_ix86_abi)
 opts->x_ix86_abi = DEFAULT_ABI;
 
+  if (opts->x_ix86_abi == MS_ABI && TARGET_X32_P (opts->x_ix86_isa_flags))
+error ("-mabi=ms not supported with X32 ABI");
+  gcc_assert (opts->x_ix86_abi == SYSV_ABI || opts->x_ix86_abi == MS_ABI);
+
   /* For targets using ms ABI enable ms-extensions, if not
  explicit turned off.  For non-ms ABI we turn off this
  option.  */
@@ -8777,8 +8781,12 @@ ix86_function_type_abi (const_tree fntype)
   if (abi == SYSV_ABI
   && lookup_attribute ("ms_abi", TYPE_ATTRIBUTES (fntype)))
 {
-  if (TARGET_X32)
-   error ("X32 does not support ms_abi attribute");
+  static int warned;
+  if (TARGET_X32 && !warned)
+   {
+ error ("X32 does not support ms_abi attribute");
+ warned = 1;
+   }
 
   abi = MS_ABI;
 }
diff --git a/gcc/testsuite/gcc.target/i386/pr46470.c 
b/gcc/testsuite/gcc.target/i386/pr46470.c
index 9e8e731188e..c66a378a1ad 100644
--- a/gcc/testsuite/gcc.target/i386/pr46470.c
+++ b/gcc/testsuite/gcc.target/i386/pr46470.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! x32 } } } */
 /* The pic register save adds unavoidable stack pointer references.  */
 /* { dg-skip-if "" { ia32 && { ! nonpic } } } */
 /* These options are selected to ensure 1 word needs to be allocated
diff --git a/gcc/testsuite/gcc.target/i386/pr64409.c 
b/gcc/testsuite/gcc.target/i386/pr64409.c
index 917472653f4..7bf9d1e398d 100644
--- a/gcc/testsuite/gcc.target/i386/pr64409.c
+++ b/gcc/testsuite/gcc.target/i386/pr64409.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-do compile { target x32 } } */
 /* { dg-require-effective-target maybe_x32 } */
 /* { dg-options "-O0 -mx32" } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr66275.c 
b/gcc/testsuite/gcc.target/i386/pr66275.c
index b8759aeb5ec..51ae1f6859c 100644
--- a/gcc/testsuite/gcc.target/i386/pr66275.c
+++ b/gcc/testsuite/gcc.target/i386/pr66275.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { *-*-linux* && { ! ia32 } } } } */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
 /* { dg-options "-mabi=ms -fdump-rtl-dfinit" } */
 
 void
diff --git a/gcc/testsuite/gcc.target/i386/pr68018.c 
b/gcc/testsuite/gcc.target/i386/pr68018.c
index a0fa21e0b00..04929c6c13c 100644
--- a/gcc/testsuite/gcc.target/i386/pr68018.c
+++ b/gcc/testsuite/gcc.target/i386/pr68018.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { *-*-linux* && { ! ia32 } } } } */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
 /* { dg-options "-O -mabi=ms -mstackrealign" } */
 
 typedef float V __attribute__ ((vector_size (16)));
diff --git a/gcc/testsuite/gcc.target/i386/pr

[PATCH] [docs] Explain how to use multiple file-name patterns in RUNTESTFLAGS

2017-08-21 Thread Daniel Santos
It took me a while to figure out how to do this so I figured that it should be
in the docs.  OK for trunk?

* doc/install.texi: Add more details on selecting multiple tests.

Thanks,
Daniel

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/doc/install.texi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 7c9e2f25d44..6aefd213901 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2737,6 +2737,16 @@ the testsuite with filenames matching @samp{9805*}, you 
would use
 make check-g++ RUNTESTFLAGS="old-deja.exp=9805* @var{other-options}"
 @end smallexample
 
+The file-matching expression following @var{filename}@command{.exp=} is treated
+as a series of whitespace-delimited glob expressions so that multiple patterns
+may be passed, although any whitespace must either be escaped or surrounded by
+tick marks if multiple expressions are desired. For example,
+
+@smallexample
+make check-g++ RUNTESTFLAGS="old-deja.exp=9805*\ virtual2.c 
@var{other-options}"
+make check-g++ RUNTESTFLAGS="'old-deja.exp=9805* virtual2.c' 
@var{other-options}"
+@end smallexample
+
 The @file{*.exp} files are located in the testsuite directories of the GCC
 source, the most important ones being @file{compile.exp},
 @file{execute.exp}, @file{dg.exp} and @file{old-deja.exp}.
-- 
2.13.3



[PATCH] [i386,testsuite] [PR 71958] Error on -mx32 with -mabi=ms

2017-08-11 Thread Daniel Santos
We currently error when -mx32 -mabi=sysv and we encounter a function
with attribute ms_abi, but we are not erroring on -mx32 and -mabi=ms
(either explicitly or when it is the default on Windows).  In fact, it
generates code that runs, but is of an undfined ABI.

I'm running -m64 and -m32 tests now and will run x32 tests when those
are done.  Presuming that I've corrected all existing tests that do not
filter out x32 target and there are no additional failures, is this OK
for head?

Thanks,
Daniel

gcc/ChangeLog:
2017-08-11  Daniel Santos  <daniel.san...@pobox.com>

* config/i386/i386.c (ix86_option_override_internal): Modify.
(ix86_function_type_abi): Likewise.

gcc/testsuite/ChangeLog:
2017-08-11  Daniel Santos  <daniel.san...@pobox.com>

* gcc.target/i386/pr71958.c: New test.
* gcc.target/i386/pr64409.c: Modify to skip on Windows.
* gcc.target/i386/pr46470.c: Modify to skip x32 target.
* gcc.target/i386/pr66275.c: Likewise.
* gcc.target/i386/pr68018.c: Likewise.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c  | 11 +--
 gcc/testsuite/gcc.target/i386/pr46470.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr64409.c |  3 ++-
 gcc/testsuite/gcc.target/i386/pr66275.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr68018.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr71958.c |  8 
 6 files changed, 22 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71958.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b04321a8d40..311a52c2a1f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5585,6 +5585,9 @@ ix86_option_override_internal (bool main_args_p,
 
   if (TARGET_X32_P (opts->x_ix86_isa_flags))
 {
+  if (opts_set->x_ix86_abi == MS_ABI)
+   error ("-mx32 not supported with -mabi=ms");
+
   /* Always turn on OPTION_MASK_ISA_64BIT and turn off
 OPTION_MASK_ABI_64 for TARGET_X32.  */
   opts->x_ix86_isa_flags |= OPTION_MASK_ISA_64BIT;
@@ -8777,8 +8780,12 @@ ix86_function_type_abi (const_tree fntype)
   if (abi == SYSV_ABI
   && lookup_attribute ("ms_abi", TYPE_ATTRIBUTES (fntype)))
 {
-  if (TARGET_X32)
-   error ("X32 does not support ms_abi attribute");
+  static int warned;
+  if (TARGET_X32 && !warned)
+   {
+ error ("X32 does not support ms_abi attribute");
+ warned = 1;
+   }
 
   abi = MS_ABI;
 }
diff --git a/gcc/testsuite/gcc.target/i386/pr46470.c 
b/gcc/testsuite/gcc.target/i386/pr46470.c
index 9e8e731188e..c66a378a1ad 100644
--- a/gcc/testsuite/gcc.target/i386/pr46470.c
+++ b/gcc/testsuite/gcc.target/i386/pr46470.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! x32 } } } */
 /* The pic register save adds unavoidable stack pointer references.  */
 /* { dg-skip-if "" { ia32 && { ! nonpic } } } */
 /* These options are selected to ensure 1 word needs to be allocated
diff --git a/gcc/testsuite/gcc.target/i386/pr64409.c 
b/gcc/testsuite/gcc.target/i386/pr64409.c
index 917472653f4..3dbd9a09f01 100644
--- a/gcc/testsuite/gcc.target/i386/pr64409.c
+++ b/gcc/testsuite/gcc.target/i386/pr64409.c
@@ -1,6 +1,7 @@
 /* { dg-do compile { target { ! ia32 } } } */
 /* { dg-require-effective-target maybe_x32 } */
 /* { dg-options "-O0 -mx32" } */
+/* { xfail { "*-*-cygwin* *-*-mingw*" } } */
 
 int a;
-int* __attribute__ ((ms_abi)) fn1 () { return  } /* { dg-error "X32 does 
not support ms_abi attribute" } */
+int* __attribute__ ((ms_abi)) fn1 () { return  } /* { dg-error "X32 does 
not support ms_abi attribute" { target { ! "*-*-mingw* *-*-cygwin*" } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66275.c 
b/gcc/testsuite/gcc.target/i386/pr66275.c
index b8759aeb5ec..a1271857f6a 100644
--- a/gcc/testsuite/gcc.target/i386/pr66275.c
+++ b/gcc/testsuite/gcc.target/i386/pr66275.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { *-*-linux* && { ! ia32 } } } } */
+/* { dg-do compile { target { *-*-linux* && { ! { ia32 || x32 } } } } } */
 /* { dg-options "-mabi=ms -fdump-rtl-dfinit" } */
 
 void
diff --git a/gcc/testsuite/gcc.target/i386/pr68018.c 
b/gcc/testsuite/gcc.target/i386/pr68018.c
index a0fa21e0b00..871fdddf643 100644
--- a/gcc/testsuite/gcc.target/i386/pr68018.c
+++ b/gcc/testsuite/gcc.target/i386/pr68018.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { *-*-linux* && { ! ia32 } } } } */
+/* { dg-do compile { target { *-*-linux* && { ! { ia32 || x32 } } } } } */
 /* { dg-options "-O -mabi=ms -mstackrealign" } */
 
 typedef float V __attribute__ ((vector_size (16)));
diff --git a/gcc/testsuite/gcc.target/i386/pr71958.c 
b/gcc/testsuite/gcc.target/i386/pr71958.c
new file mode 100644
index 000..090d1970ca9
--- /dev/null
+++ b/gcc

PING Re: [PATCH 0/6] [i386] PR80969 Fix ICE with -mabi=ms -mavx512f

2017-08-08 Thread Daniel Santos
Original message: https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02005.html

Patches 2 and 3 have been committed and I have corrected the error in
patch 5.  I configuring with --enable-checking=yes,rtl
--enable-languages=all and retested with
RUNTESTFLAGS="--target_board=unix/\{,-m32\}"  The updated patches fix an
error when using mov instead of push and add documentation for changes
to target-supports.exp.  I have included modified ChangeLogs.

In addition to to fixing the ICE, this patch set makes more efficient
use of stack space in some cases the outgoing stack boundary is > 16
bytes and realignment is necessary.  This adds new tests, some of which
require avx512f (gcc/testsuite/gcc.target/i386/pr80969-4*.c) -- these I
have only tested these using Intel SDE.

Below is an updated list of the patches.

1. https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02006.html
2. Committed.
3. Committed.
4. https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02009.html
5. v2 -- https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00249.html
6. v2 -- https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00618.html

Thanks,
Daniel
2017-08-08  Daniel Santos  <daniel.san...@pobox.com>

* config/i386/i386.h (ix86_frame::stack_realign_allocate_offset):
Remove
(ix86_frame::stack_realign_allocate): New field.
(struct machine_frame_state): Modify comments.
(machine_frame_state::sp_realigned_fp_end): New field.
* config/i386/i386.c (ix86_compute_frame_layout): Modify.
(sp_valid_at): Likewise.
(fp_valid_at): Likewise.
(choose_baseaddr): Modify comments.
(ix86_emit_outlined_ms2sysv_save): Modify.
(ix86_expand_prologue): Likewise.
* doc/sourcebuild.texi (avx2, avx2_runtime): Add missing items to
effective-targets.
(avx512f, avx512f_runtime): Add new items to effective-tarets.
2017-08-08  Daniel Santos  <daniel.san...@pobox.com>

* lib/target-supports.exp (check_avx512_os_support_available): New
Procedure.
(check_avx2_hw_available): Modify.
(check_avx512f_hw_available): New Procedure.
(check_effective_target_avx512f_runtime): Likewise.
* gcc.target/i386/pr80969-1.c: New testcase.
* gcc.target/i386/pr80969-2a.c: Likewise.
* gcc.target/i386/pr80969-2.c: Likewise.
* gcc.target/i386/pr80969-3.c: Likewise.
* gcc.target/i386/pr80969-4a.c: Likewise.
* gcc.target/i386/pr80969-4b.c: Likewise.
* gcc.target/i386/pr80969-4.c: Likewise.


[PATCH 6/6 v2] [i386, testsuite] Add tests, fix bug in check_avx2_hw_available

2017-08-08 Thread Daniel Santos
This update adds documentation for the new effective taregts in addition to a
few existing effective targets that were undocumented.

Changes to lib/target-supports.exp and documentation:
* Add effective-targets avx512f and avx512f_runtime (needed for new
  tests).
* Corrects bug in check_avx2_hw_available.
* Adds documentation for effective-targets avx2, avx2_runtime (both
  missing), avx512f and avx512f_runtime.

The following tests are added.  The testcase in the PR is used as a base
and relevant variants are added to test other factors affected by the
patch set.

pr80969-1.c   Base test case.
pr80969-2.c   With ms to sysv call.
pr80969-2a.c  With ms to sysv call using stubs.
pr80969-3.c   With alloca (for DRAP test).
pr80969-4.c   With va_args passed via va_list
pr80969-4a.c  With va_args passed via va_list and ms to sysv call.
pr80969-4b.c  With va_args passed via va_list and ms to sysv call using
  stubs.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/doc/sourcebuild.texi   |  12 +++
 gcc/testsuite/gcc.target/i386/pr80969-1.c  |  16 
 gcc/testsuite/gcc.target/i386/pr80969-2.c  |  26 ++
 gcc/testsuite/gcc.target/i386/pr80969-2a.c |  26 ++
 gcc/testsuite/gcc.target/i386/pr80969-3.c  |  31 
 gcc/testsuite/gcc.target/i386/pr80969-4.c  | 123 
 gcc/testsuite/gcc.target/i386/pr80969-4a.c | 124 +
 gcc/testsuite/gcc.target/i386/pr80969-4b.c | 124 +
 gcc/testsuite/lib/target-supports.exp  |  66 +++
 9 files changed, 548 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4b.c

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 85af8778167..66f040f212d 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1852,6 +1852,18 @@ Target supports compiling @code{avx} instructions.
 @item avx_runtime
 Target supports the execution of @code{avx} instructions.
 
+@item avx2
+Target supports compiling @code{avx2} instructions.
+
+@item avx2_runtime
+Target supports the execution of @code{avx2} instructions.
+
+@item avx512f
+Target supports compiling @code{avx512f} instructions.
+
+@item avx512f_runtime
+Target supports the execution of @code{avx512f} instructions.
+
 @item cell_hw
 Test system can execute AltiVec and Cell PPU instructions.
 
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-1.c 
b/gcc/testsuite/gcc.target/i386/pr80969-1.c
new file mode 100644
index 000..eb8d767a778
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-1.c
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+int a[56];
+int b;
+int main (int argc, char *argv[]) {
+  int c;
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2.c
new file mode 100644
index 000..e868d6c7e5c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func.  */
+
+int a[56];
+int b;
+
+static void __attribute__((sysv_abi)) sysv ()
+{
+}
+
+void __attribute__((sysv_abi)) (*volatile const sysv_noinfo)() = sysv;
+
+int main (int argc, char *argv[]) {
+  int c;
+  sysv_noinfo ();
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2a.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
new file mode 100644
index 000..071a90534a4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f -mcall-ms2sysv-xlogues" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func using save/restore stubs.  */
+
+int a[56];
+int b;
+
+static void __attribute__((sysv_abi)) sysv ()
+{
+}
+
+void __attribute__((sysv_abi)) (*volatile const sysv_noinfo)() = sysv;
+
+int main (int argc, char *argv[]) {
+  int c;
+  sysv_noinfo ();
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-3.c 
b/gcc/testsuite/gcc.target/i386/pr80969-3.c
new file mode 100644
index 000..5982981b55c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i

[PATCH 5/6 v2] [i386] Modify SP realignment in ix86_expand_prologue, et. al.

2017-08-02 Thread Daniel Santos
My first version of this patch inited m->fs.sp_realigned_fp_last with
the value of m->fs.sp_offset prior to performing the stack realignment.
I had forgotten, however, that when we're saving GP regs using MOV that
we delay SP modification as long as possible so that the value of
m->fs.sp_offset at this point is correct when we've used push, but
incorrect when we've used mov.

This time I've bootstraped with --enable-checking=yes,rtl
--enable-languages=all and reg tested using the below command to test both 64-
and 32-bit code.

  make -kj8 RUNTESTFLAGS="--target_board=unix/\{,-m32\}" check

Original patch description:

The SP allocation calculation is now done in ix86_compute_frame_layout
and the result stored in ix86_frame::stack_realign_allocate.  This
change also updates comments for choose_baseaddr to clarify that the
alignment returned doesn't necessarily reflect the alignment of the
cfa_offset passed (e.g., you can pass cfa_offset 48 and it can return an
alignment of 64 bytes).

Since the alignment required may be more than 16-bytes, we cannot defer
SP allocation to ix86_emit_outlined_ms2sysv_save (when it's enabled), so
that function needs to be updated as well.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 58 --
 1 file changed, 32 insertions(+), 26 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0dc366cf16e..a1f39cd714c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13289,10 +13289,13 @@ choose_basereg (HOST_WIDE_INT cfa_offset, rtx 
_reg,
 }
 
 /* Return an RTX that points to CFA_OFFSET within the stack frame and
-   the alignment of address.  If align is non-null, it should point to
+   the alignment of address.  If ALIGN is non-null, it should point to
an alignment value (in bits) that is preferred or zero and will
-   recieve the alignment of the base register that was selected.  The
-   valid base registers are taken from CFUN->MACHINE->FS.  */
+   recieve the alignment of the base register that was selected,
+   irrespective of rather or not CFA_OFFSET is a multiple of that
+   alignment value.
+
+   The valid base registers are taken from CFUN->MACHINE->FS.  */
 
 static rtx
 choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
@@ -14338,35 +14341,35 @@ ix86_emit_outlined_ms2sysv_save (const struct 
ix86_frame )
   rtx sym, addr;
   rtx rax = gen_rtx_REG (word_mode, AX_REG);
   const struct xlogue_layout  = xlogue_layout::get_instance ();
-  HOST_WIDE_INT rax_offset = xlogue.get_stub_ptr_offset () + m->fs.sp_offset;
-  HOST_WIDE_INT stack_alloc_size = frame.stack_pointer_offset - 
m->fs.sp_offset;
-  HOST_WIDE_INT stack_align_off_in = xlogue.get_stack_align_off_in ();
+  HOST_WIDE_INT allocate = frame.stack_pointer_offset - m->fs.sp_offset;
+
+  /* AL should only be live with sysv_abi.  */
+  gcc_assert (!ix86_eax_live_at_start_p ());
+
+  /* Setup RAX as the stub's base pointer.  We use stack_realign_offset rather
+ we've actually realigned the stack or not.  */
+  align = GET_MODE_ALIGNMENT (V4SFmode);
+  addr = choose_baseaddr (frame.stack_realign_offset
+ + xlogue.get_stub_ptr_offset (), );
+  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
+  emit_insn (gen_rtx_SET (rax, addr));
 
-  /* Verify that the incoming stack 16-byte alignment offset matches the
- layout we're using.  */
-  gcc_assert (stack_align_off_in == (m->fs.sp_offset & UNITS_PER_WORD));
+  /* Allocate stack if not already done.  */
+  if (allocate > 0)
+  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+   GEN_INT (-allocate), -1, false);
 
   /* Get the stub symbol.  */
   sym = xlogue.get_stub_rtx (frame_pointer_needed ? XLOGUE_STUB_SAVE_HFP
  : XLOGUE_STUB_SAVE);
   RTVEC_ELT (v, vi++) = gen_rtx_USE (VOIDmode, sym);
 
-  /* Setup RAX as the stub's base pointer.  */
-  align = GET_MODE_ALIGNMENT (V4SFmode);
-  addr = choose_baseaddr (rax_offset, );
-  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
-  insn = emit_insn (gen_rtx_SET (rax, addr));
-
-  gcc_assert (stack_alloc_size >= xlogue.get_stack_space_used ());
-  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
-GEN_INT (-stack_alloc_size), -1,
-m->fs.cfa_reg == stack_pointer_rtx);
   for (i = 0; i < ncregs; ++i)
 {
   const xlogue_layout::reginfo  = xlogue.get_reginfo (i);
   rtx reg = gen_rtx_REG ((SSE_REGNO_P (r.regno) ? V4SFmode : word_mode),
 r.regno);
-  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);;
+  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);
 }
 
   gcc_assert (vi == (unsigned)GET_NUM_ELEM (v));
@@ -14621,14 +14624,15 @@ ix86_exp

Re: [PATCH 0/6] [i386] PR80969 Fix ICE with -mabi=ms -mavx512f

2017-07-31 Thread Daniel Santos
Well I just learned how to test 32-bit earlier and I've uncovered a 
problem when running 32-bit tests.  Do you want me to commit the the two 
patches (squashed together) in the mean time?


Thanks,
Daniel




[PATCH 6/6] [i386, testsuite] Add tests, fix bug in check_avx2_hw_available

2017-07-31 Thread Daniel Santos
The testcase in the PR is used as a base and relevant variants are added
to test other factors affected by the patch set.

pr80969-1.c   Base test case.
pr80969-2.c   With ms to sysv call.
pr80969-2a.c  With ms to sysv call using stubs.
pr80969-3.c   With alloca (for DRAP test).
pr80969-4.c   With va_args passed via va_list
pr80969-4a.c  With va_args passed via va_list and ms to sysv call.
pr80969-4b.c  With va_args passed via va_list and ms to sysv call using
  stubs.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/testsuite/gcc.target/i386/pr80969-1.c  |  16 
 gcc/testsuite/gcc.target/i386/pr80969-2.c  |  26 ++
 gcc/testsuite/gcc.target/i386/pr80969-2a.c |  26 ++
 gcc/testsuite/gcc.target/i386/pr80969-3.c  |  31 
 gcc/testsuite/gcc.target/i386/pr80969-4.c  | 123 
 gcc/testsuite/gcc.target/i386/pr80969-4a.c | 124 +
 gcc/testsuite/gcc.target/i386/pr80969-4b.c | 124 +
 gcc/testsuite/lib/target-supports.exp  |  66 +++
 8 files changed, 536 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4b.c

diff --git a/gcc/testsuite/gcc.target/i386/pr80969-1.c 
b/gcc/testsuite/gcc.target/i386/pr80969-1.c
new file mode 100644
index 000..eb8d767a778
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-1.c
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+int a[56];
+int b;
+int main (int argc, char *argv[]) {
+  int c;
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2.c
new file mode 100644
index 000..e868d6c7e5c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func.  */
+
+int a[56];
+int b;
+
+static void __attribute__((sysv_abi)) sysv ()
+{
+}
+
+void __attribute__((sysv_abi)) (*volatile const sysv_noinfo)() = sysv;
+
+int main (int argc, char *argv[]) {
+  int c;
+  sysv_noinfo ();
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2a.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
new file mode 100644
index 000..071a90534a4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f -mcall-ms2sysv-xlogues" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func using save/restore stubs.  */
+
+int a[56];
+int b;
+
+static void __attribute__((sysv_abi)) sysv ()
+{
+}
+
+void __attribute__((sysv_abi)) (*volatile const sysv_noinfo)() = sysv;
+
+int main (int argc, char *argv[]) {
+  int c;
+  sysv_noinfo ();
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-3.c 
b/gcc/testsuite/gcc.target/i386/pr80969-3.c
new file mode 100644
index 000..5982981b55c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test with alloca (and DRAP).  */
+
+#include 
+
+int a[56];
+volatile int b = -12345;
+volatile const int d = 42;
+
+void foo (int *x, int y, int z)
+{
+}
+
+void (*volatile const foo_noinfo)(int *, int, int) = foo;
+
+int main (int argc, char *argv[]) {
+  int c;
+  int *e = alloca (d);
+  foo_noinfo (e, d, 0);
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+foo_noinfo (e, d, c);
+a[-(b % 56)] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-4.c 
b/gcc/testsuite/gcc.target/i386/pr80969-4.c
new file mode 100644
index 000..1ec54d081cd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-4.c
@@ -0,0 +1,123 @@
+/* { dg-do run { target avx512f_runtime } } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test with avx512 and va_args.  */
+
+#include 
+#include 
+
+#include "avx-check.h"
+
+int a[56];
+int b;
+
+__m128 n1 = { -283.3, -23.3, 213.4, 1119.03 };
+__m512d n2 = { -93.83, 893.318, 

[PATCH 5/6] [i386] Modify SP realignment in ix86_expand_prologue, et. al.

2017-07-31 Thread Daniel Santos
The SP allocation calculation is now done in ix86_compute_frame_layout
and the result stored in ix86_frame::stack_realign_allocate.  This
change also updates comments for choose_baseaddr to clarify that the
alignment returned doesn't necessarily reflect the alignment of the
cfa_offset passed (e.g., you can pass cfa_offset 48 and it can return an
alignment of 64 bytes).

Since the alignment required may be more than 16-bytes, we cannot defer
SP allocation to ix86_emit_outlined_ms2sysv_save (when it's enabled), so
that function needs to be updated as well.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 54 +++---
 1 file changed, 29 insertions(+), 25 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e92f322de0c..7e1fc4dfbf5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13273,10 +13273,13 @@ choose_basereg (HOST_WIDE_INT cfa_offset, rtx 
_reg,
 }
 
 /* Return an RTX that points to CFA_OFFSET within the stack frame and
-   the alignment of address.  If align is non-null, it should point to
+   the alignment of address.  If ALIGN is non-null, it should point to
an alignment value (in bits) that is preferred or zero and will
-   recieve the alignment of the base register that was selected.  The
-   valid base registers are taken from CFUN->MACHINE->FS.  */
+   recieve the alignment of the base register that was selected,
+   irrespective of rather or not CFA_OFFSET is a multiple of that
+   alignment value.
+
+   The valid base registers are taken from CFUN->MACHINE->FS.  */
 
 static rtx
 choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
@@ -14322,35 +14325,35 @@ ix86_emit_outlined_ms2sysv_save (const struct 
ix86_frame )
   rtx sym, addr;
   rtx rax = gen_rtx_REG (word_mode, AX_REG);
   const struct xlogue_layout  = xlogue_layout::get_instance ();
-  HOST_WIDE_INT rax_offset = xlogue.get_stub_ptr_offset () + m->fs.sp_offset;
-  HOST_WIDE_INT stack_alloc_size = frame.stack_pointer_offset - 
m->fs.sp_offset;
-  HOST_WIDE_INT stack_align_off_in = xlogue.get_stack_align_off_in ();
+  HOST_WIDE_INT allocate = frame.stack_pointer_offset - m->fs.sp_offset;
+
+  /* AL should only be live with sysv_abi.  */
+  gcc_assert (!ix86_eax_live_at_start_p ());
+
+  /* Setup RAX as the stub's base pointer.  We use stack_realign_offset rather
+ we've actually realigned the stack or not.  */
+  align = GET_MODE_ALIGNMENT (V4SFmode);
+  addr = choose_baseaddr (frame.stack_realign_offset
+ + xlogue.get_stub_ptr_offset (), );
+  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
+  emit_insn (gen_rtx_SET (rax, addr));
 
-  /* Verify that the incoming stack 16-byte alignment offset matches the
- layout we're using.  */
-  gcc_assert (stack_align_off_in == (m->fs.sp_offset & UNITS_PER_WORD));
+  /* Allocate stack if not already done.  */
+  if (allocate > 0)
+  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+   GEN_INT (-allocate), -1, false);
 
   /* Get the stub symbol.  */
   sym = xlogue.get_stub_rtx (frame_pointer_needed ? XLOGUE_STUB_SAVE_HFP
  : XLOGUE_STUB_SAVE);
   RTVEC_ELT (v, vi++) = gen_rtx_USE (VOIDmode, sym);
 
-  /* Setup RAX as the stub's base pointer.  */
-  align = GET_MODE_ALIGNMENT (V4SFmode);
-  addr = choose_baseaddr (rax_offset, );
-  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
-  insn = emit_insn (gen_rtx_SET (rax, addr));
-
-  gcc_assert (stack_alloc_size >= xlogue.get_stack_space_used ());
-  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
-GEN_INT (-stack_alloc_size), -1,
-m->fs.cfa_reg == stack_pointer_rtx);
   for (i = 0; i < ncregs; ++i)
 {
   const xlogue_layout::reginfo  = xlogue.get_reginfo (i);
   rtx reg = gen_rtx_REG ((SSE_REGNO_P (r.regno) ? V4SFmode : word_mode),
 r.regno);
-  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);;
+  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);
 }
 
   gcc_assert (vi == (unsigned)GET_NUM_ELEM (v));
@@ -14608,8 +14611,8 @@ ix86_expand_prologue (void)
 that we must allocate the size of the register save area before
 performing the actual alignment.  Otherwise we cannot guarantee
 that there's enough storage above the realignment point.  */
-  allocate = frame.stack_realign_allocate_offset - m->fs.sp_offset;
-  if (allocate && !m->call_ms2sysv)
+  allocate = frame.stack_realign_allocate;
+  if (allocate)
 pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
   GEN_INT (-allocate), -1, false);
 
@@ -14618,8 +14621,7 @@ ix86_expand_prologu

[PATCH 4/6] [i386] Modify ix86_compute_frame_layout

2017-07-31 Thread Daniel Santos
These changes affect how the stack frame is calculated from the region
starting at frame.reg_save_offset until frame.frame_pointer_offset,
which includes either the stub save area or the (inline) SSE register
save area and the va_args register save area.

The calculation used when not realigning the stack pointer is the same,
but when when realigning we calculate the 16-byte aligned space needed
in reverse so that the stack realignment boundary at
frame.stack_realign_offset may not necessarily be a multiple of
stack_alignment_needed, but the value of frame.frame_pointer_offset
will. This results in a properly aligned stack for the function body and
avoids wasting stack space.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 116 +
 gcc/config/i386/i386.h |   2 +-
 2 files changed, 80 insertions(+), 38 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e2e9546a27c..e92f322de0c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12874,6 +12874,14 @@ ix86_compute_frame_layout (void)
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
   gcc_assert (preferred_alignment <= stack_alignment_needed);
 
+  /* The only ABI saving SSE regs should be 64-bit ms_abi.  */
+  gcc_assert (TARGET_64BIT || !frame->nsseregs);
+  if (TARGET_64BIT && m->call_ms2sysv)
+{
+  gcc_assert (stack_alignment_needed >= 16);
+  gcc_assert (!frame->nsseregs);
+}
+
   /* For SEH we have to limit the amount of code movement into the prologue.
  At present we do this via a BLOCKAGE, at which point there's very little
  scheduling that can be done, which means that there's very little point
@@ -12936,54 +12944,88 @@ ix86_compute_frame_layout (void)
   if (TARGET_SEH)
 frame->hard_frame_pointer_offset = offset;
 
-  /* When re-aligning the stack frame, but not saving SSE registers, this
- is the offset we want adjust the stack pointer to.  */
-  frame->stack_realign_allocate_offset = offset;
+  /* Calculate the size of the va-arg area (not including padding, if any).  */
+  frame->va_arg_size = ix86_varargs_gpr_size + ix86_varargs_fpr_size;
 
-  /* The re-aligned stack starts here.  Values before this point are not
- directly comparable with values below this point.  Use sp_valid_at
- to determine if the stack pointer is valid for a given offset and
- fp_valid_at for the frame pointer.  */
   if (stack_realign_fp)
-offset = ROUND_UP (offset, stack_alignment_needed);
-  frame->stack_realign_offset = offset;
-
-  if (TARGET_64BIT && m->call_ms2sysv)
 {
-  gcc_assert (stack_alignment_needed >= 16);
-  gcc_assert (!frame->nsseregs);
+  /* We may need a 16-byte aligned stack for the remainder of the
+register save area, but the stack frame for the local function
+may require a greater alignment if using AVX/2/512.  In order
+to avoid wasting space, we first calculate the space needed for
+the rest of the register saves, add that to the stack pointer,
+and then realign the stack to the boundary of the start of the
+frame for the local function.  */
+  HOST_WIDE_INT space_needed = 0;
+  HOST_WIDE_INT sse_reg_space_needed = 0;
 
-  m->call_ms2sysv_pad_in = !!(offset & UNITS_PER_WORD);
-  offset += xlogue_layout::get_instance ().get_stack_space_used ();
-}
+  if (TARGET_64BIT)
+   {
+ if (m->call_ms2sysv)
+   {
+ m->call_ms2sysv_pad_in = 0;
+ space_needed = xlogue_layout::get_instance 
().get_stack_space_used ();
+   }
 
-  /* Align and set SSE register save area.  */
-  else if (frame->nsseregs)
-{
-  /* The only ABI that has saved SSE registers (Win64) also has a
-16-byte aligned default stack.  However, many programs violate
-the ABI, and Wine64 forces stack realignment to compensate.
+ else if (frame->nsseregs)
+   /* The only ABI that has saved SSE registers (Win64) also has a
+  16-byte aligned default stack.  However, many programs violate
+  the ABI, and Wine64 forces stack realignment to compensate.  */
+   space_needed = frame->nsseregs * 16;
+
+ sse_reg_space_needed = space_needed = ROUND_UP (space_needed, 16);
+
+ /* 64-bit frame->va_arg_size should always be a multiple of 16, but
+rounding to be pedantic.  */
+ space_needed = ROUND_UP (space_needed + frame->va_arg_size, 16);
+   }
+  else
+   space_needed = frame->va_arg_size;
+
+  /* Record the allocation size required prior to the realignment AND.  */
+  frame->stack_realign_allocate = space_needed;
+
+  /* The re-aligned stack starts at frame->stack_realign_offset.  Values
+before this point are not directly co

[PATCH 3/6] [i386] Remove machine_function::call_ms2sysv_pad_out

2017-07-31 Thread Daniel Santos
The -mcall-ms2sysv-xlogues project added the boolean fields
call_ms2sysv_pad_in and call_ms2sysv_pad_out to struct machine_function
to track rather or not an additional 8 bytes of padding was needed for
stack alignment prior to and after the stub save area.  This design was
based upon the faulty assumption the function body would not require a
stack alignment greater than 16 bytes.  This continues to work well for
managing padding prior to the stub save area, but will not work for the
outgoing alignment.

Rather than changing machine_function::call_ms2sysv_pad_out to a larger
type, this patch removes it, thus transferring responsibility for stack
alignment following the stub save area from class xlogue_layout to the
body of ix86_compute_frame_layout.  Since the 64-bit va_arg register
save area is always a multiple of 16-bytes in size (176 for System V ABI
and 96 for Microsoft ABI), the ROUND_UP calculation for the stack offset
at the start of the function body (frame.frame_pointer_offset) will
assure there is enough room for any padding needed to keep the save area
for SSE va_args 16-byte aligned, so no modification is needed for that
calculation.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 18 --
 gcc/config/i386/i386.h |  8 ++--
 2 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 47c5608c3cd..e2e9546a27c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2491,9 +2491,7 @@ public:
 unsigned last_reg = m->call_ms2sysv_extra_regs + MIN_REGS - 1;
 
 gcc_assert (m->call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
-return m_regs[last_reg].offset
-  + (m->call_ms2sysv_pad_out ? 8 : 0)
-  + STUB_INDEX_OFFSET;
+return m_regs[last_reg].offset + STUB_INDEX_OFFSET;
   }
 
   /* Returns the offset for the base pointer used by the stub.  */
@@ -12849,13 +12847,12 @@ ix86_compute_frame_layout (void)
{
  unsigned count = xlogue_layout::count_stub_managed_regs ();
  m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS;
+ m->call_ms2sysv_pad_in = 0;
}
 }
 
   frame->nregs = ix86_nsaved_regs ();
   frame->nsseregs = ix86_nsaved_sseregs ();
-  m->call_ms2sysv_pad_in = 0;
-  m->call_ms2sysv_pad_out = 0;
 
   /* 64-bit MS ABI seem to require stack alignment to be always 16,
  except for function prologues, leaf functions and when the defult
@@ -12957,15 +12954,7 @@ ix86_compute_frame_layout (void)
   gcc_assert (!frame->nsseregs);
 
   m->call_ms2sysv_pad_in = !!(offset & UNITS_PER_WORD);
-
-  /* Select an appropriate layout for incoming stack offset.  */
-  const struct xlogue_layout  = xlogue_layout::get_instance ();
-
-  if ((offset + xlogue.get_stack_space_used ()) & UNITS_PER_WORD)
-   m->call_ms2sysv_pad_out = 1;
-
-  offset += xlogue.get_stack_space_used ();
-  gcc_assert (!(offset & 0xf));
+  offset += xlogue_layout::get_instance ().get_stack_space_used ();
 }
 
   /* Align and set SSE register save area.  */
@@ -12993,6 +12982,7 @@ ix86_compute_frame_layout (void)
 
   /* Align start of frame for local function.  */
   if (stack_realign_fp
+  || m->call_ms2sysv
   || offset != frame->sse_reg_save_offset
   || size != 0
   || !crtl->is_leaf
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 1648bdf1556..b08e45f68d4 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2646,17 +2646,13 @@ struct GTY(()) machine_function {
   BOOL_BITFIELD arg_reg_available : 1;
 
   /* If true, we're out-of-lining reg save/restore for regs clobbered
- by ms_abi functions calling a sysv function.  */
+ by 64-bit ms_abi functions calling a sysv_abi function.  */
   BOOL_BITFIELD call_ms2sysv : 1;
 
   /* If true, the incoming 16-byte aligned stack has an offset (of 8) and
- needs padding.  */
+ needs padding prior to out-of-line stub save/restore area.  */
   BOOL_BITFIELD call_ms2sysv_pad_in : 1;
 
-  /* If true, the size of the stub save area plus inline int reg saves will
- result in an 8 byte offset, so needs padding.  */
-  BOOL_BITFIELD call_ms2sysv_pad_out : 1;
-
   /* This is the number of extra registers saved by stub (valid range is
  0-6). Each additional register is only saved/restored by the stubs
  if all successive ones are. (Will always be zero when using a hard
-- 
2.13.3



[PATCH 2/6] [i386] Remove ix86_frame::outlined_save_offset

2017-07-31 Thread Daniel Santos
This value was used in an earlier incarnation of the
-mcall-ms2sysv-xlogues patch set but is now set and never read.  The
value of ix86_frame::sse_reg_save_offset serves the same purpose.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 1 -
 gcc/config/i386/i386.h | 4 +---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 690631dfe43..47c5608c3cd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12966,7 +12966,6 @@ ix86_compute_frame_layout (void)
 
   offset += xlogue.get_stack_space_used ();
   gcc_assert (!(offset & 0xf));
-  frame->outlined_save_offset = offset;
 }
 
   /* Align and set SSE register save area.  */
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index ce5bb7f6677..1648bdf1556 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2477,8 +2477,7 @@ enum avx_u128_state
<- end of stub-saved/restored regs
  [padding1]
]
-   <- outlined_save_offset
-   <- sse_regs_save_offset
+   <- sse_reg_save_offset
[padding2]
   |<- FRAME_POINTER
[va_arg registers]  |
@@ -2504,7 +2503,6 @@ struct GTY(()) ix86_frame
   HOST_WIDE_INT reg_save_offset;
   HOST_WIDE_INT stack_realign_allocate_offset;
   HOST_WIDE_INT stack_realign_offset;
-  HOST_WIDE_INT outlined_save_offset;
   HOST_WIDE_INT sse_reg_save_offset;
 
   /* When save_regs_using_mov is set, emit prologue using
-- 
2.13.3



[PATCH 1/6] [i386] Correct comments, add assertions to sp_valid_at and fp_valid_at

2017-07-31 Thread Daniel Santos
When we realign the stack frame (without DRAP), there may be a range of
CFA offsets that should never be touched because they are alignment
padding and any reference to them is almost certainly an error.
Previously, only the offset of where the realigned stack frame starts
was recorded and checked in sp_valid_at and fp_valid_at.

This change adds sp_realigned_fp_last to struct machine_frame_state to
record the last valid offset from which the frame pointer can be used
when the stack pointer is realigned and modifies sp_valid_at and
fp_valid_at to fail an assertion when passed an offset in the "no-man's
land" between these two values.

Comments for struct machine_frame_state incorrectly stated that a
realigned stack pointer could be used to access offsets equal to or
greater than sp_realigned_offset, but it is only valid for offsets that
are greater.  This was the (incorrect) behaviour of sp_valid_at and
fp_valid_at prior to r250587 and this change now corrects the
documentation and adds clarification of the CFA-relative calculation.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 45 ++---
 gcc/config/i386/i386.h | 18 +-
 2 files changed, 43 insertions(+), 20 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f1486ff3750..690631dfe43 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13102,26 +13102,36 @@ choose_baseaddr_len (unsigned int regno, 
HOST_WIDE_INT offset)
   return len;
 }
 
-/* Determine if the stack pointer is valid for accessing the cfa_offset.
-   The register is saved at CFA - CFA_OFFSET.  */
+/* Determine if the stack pointer is valid for accessing the CFA_OFFSET in
+   the frame save area.  The register is saved at CFA - CFA_OFFSET.  */
 
-static inline bool
+static bool
 sp_valid_at (HOST_WIDE_INT cfa_offset)
 {
   const struct machine_frame_state  = cfun->machine->fs;
-  return fs.sp_valid && !(fs.sp_realigned
- && cfa_offset <= fs.sp_realigned_offset);
+  if (fs.sp_realigned && cfa_offset <= fs.sp_realigned_offset)
+{
+  /* Validate that the cfa_offset isn't in a "no-man's land".  */
+  gcc_assert (cfa_offset <= fs.sp_realigned_fp_last);
+  return false;
+}
+  return fs.sp_valid;
 }
 
-/* Determine if the frame pointer is valid for accessing the cfa_offset.
-   The register is saved at CFA - CFA_OFFSET.  */
+/* Determine if the frame pointer is valid for accessing the CFA_OFFSET in
+   the frame save area.  The register is saved at CFA - CFA_OFFSET.  */
 
 static inline bool
 fp_valid_at (HOST_WIDE_INT cfa_offset)
 {
   const struct machine_frame_state  = cfun->machine->fs;
-  return fs.fp_valid && !(fs.sp_valid && fs.sp_realigned
- && cfa_offset > fs.sp_realigned_offset);
+  if (fs.sp_realigned && cfa_offset > fs.sp_realigned_fp_last)
+{
+  /* Validate that the cfa_offset isn't in a "no-man's land".  */
+  gcc_assert (cfa_offset >= fs.sp_realigned_offset);
+  return false;
+}
+  return fs.fp_valid;
 }
 
 /* Choose a base register based upon alignment requested, speed and/or
@@ -14560,6 +14570,9 @@ ix86_expand_prologue (void)
   int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
   gcc_assert (align_bytes > MIN_STACK_BOUNDARY / BITS_PER_UNIT);
 
+  /* Record last valid frame pointer offset.  */
+  m->fs.sp_realigned_fp_last = m->fs.sp_offset;
+
   /* The computation of the size of the re-aligned stack frame means
 that we must allocate the size of the register save area before
 performing the actual alignment.  Otherwise we cannot guarantee
@@ -14573,13 +14586,15 @@ ix86_expand_prologue (void)
   insn = emit_insn (ix86_gen_andsp (stack_pointer_rtx,
stack_pointer_rtx,
GEN_INT (-align_bytes)));
-  /* For the purposes of register save area addressing, the stack
-pointer can no longer be used to access anything in the frame
-below m->fs.sp_realigned_offset and the frame pointer cannot be
-used for anything at or above.  */
   m->fs.sp_offset = ROUND_UP (m->fs.sp_offset, align_bytes);
   m->fs.sp_realigned = true;
   m->fs.sp_realigned_offset = m->fs.sp_offset - frame.nsseregs * 16;
+  /* The stack pointer may no longer be equal to CFA - m->fs.sp_offset.
+Beyond this point, stack access should be done via choose_baseaddr or
+by using sp_valid_at and fp_valid_at to determine the correct base
+register.  Henceforth, any CFA offset should be thought of as logical
+and not physical.  */
+  gcc_assert (m->fs.sp_realigned_offset >= m->fs.sp_realigned_fp_last);
   gcc_assert (m->fs.sp_realign

[PATCH 0/6] [i386] PR80969 Fix ICE with -mabi=ms -mavx512f

2017-07-31 Thread Daniel Santos
When working on the Wine64 project to use aligned SSE MOVs after SP 
realignment and adding -mcall-ms2sysv-xlogues, I overlooked the fact 
that the function body may require a stack alignment greater than 
16-bytes.  This can result in an ICE with -mabi=ms -mavx512f and some 
other cases.  This patch set reworks the strategy for calculating the 
frame layout following normal (inline) integral register saves (at 
frame.reg_save_offset) to the start of the frame for the local function 
(frame.frame_pointer_offset).


I've completed a bootstrap and full regression test with no additional 
failures, but I don't have access to a machine with avx512 extensions.  
I have manually run the tests that need it using the Intel SDE, but I 
haven't been able to validate that my 
check_effective_target_avx512f_runtime code in 
gcc/testsuite/lib/target-supports.exp is correctly enabling the tests 
for pr80969-4*.c.


As an aside note, I still have some rework of the ms-sysv.exp tests that 
I haven't yet to submitted and in which I'm adding more tests for cases 
with uncommon stacks, as in PR 81563.


Thanks,
Daniel
2017-07-23  Daniel Santos  <daniel.san...@pobox.com>

* config/i386/i386.h (ix86_frame::outlined_save_offset): Remove field.
(ix86_frame::stack_realign_allocate_offset): Likewise.
(ix86_frame::stack_realign_allocate): New field.
(struct machine_frame_state): Modify comments.
(machine_frame_state::sp_realigned_fp_end): New field.
(machine_function::call_ms2sysv_pad_out): Remove field.
* config/i386/i386.c (xlogue_layout::get_stack_space_used): Modify.
(ix86_compute_frame_layout): Likewise.
(sp_valid_at): Likewise.
(fp_valid_at): Likewise.
(choose_baseaddr): Modify comments.
(ix86_emit_outlined_ms2sysv_save): Modify.
(ix86_expand_prologue): Likewise.
(ix86_expand_epilogue): Modify comments.
2017-07-23  Daniel Santos  <daniel.san...@pobox.com>
* gcc.target/i386/pr80969-1.c: New testcase.
* gcc.target/i386/pr80969-2a.c: Likewise.
* gcc.target/i386/pr80969-2.c: Likewise.
* gcc.target/i386/pr80969-3.c: Likewise.
* gcc.target/i386/pr80969-4a.c: Likewise.
* gcc.target/i386/pr80969-4b.c: Likewise.
* gcc.target/i386/pr80969-4.c: Likewise.


Re: [PING] [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-07-31 Thread Daniel Santos

On 07/28/2017 09:41 AM, H.J. Lu wrote:

On Fri, Jul 28, 2017 at 6:57 AM, Daniel Santos<daniel.san...@pobox.com>  wrote:

On 07/26/2017 02:03 PM, H.J. Lu wrote:

This patch caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81563

Hello.  I've rebased my patch set and I'm now retesting.  I'm afraid that
your changes are wrong because my my sp_valid_at and fp_valid_at functions
are wrong -- these are supposed to be for the base offset and not the CFA
offset, sorry about that.  This means that the check in choose_basereg (and
thus choose_baseaddr) have been wrong as well.  I'm retesting now.

Please check your change with gcc.target/i386/pr81563.c.

Thanks.


I'm still getting used to x86 stack math and and briefly I thought that 
my understanding of the CFA was wrong and that I had messed up 
sp_valid_at and fp_valid_at, but I was mistaken, so sorry for the false 
alarm.  My rebased patches pass all tests, so it's OK.


Re: [PING] [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-07-28 Thread Daniel Santos

On 07/26/2017 02:03 PM, H.J. Lu wrote:

This patch caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81563


Hello.  I've rebased my patch set and I'm now retesting.  I'm afraid 
that your changes are wrong because my my sp_valid_at and fp_valid_at 
functions are wrong -- these are supposed to be for the base offset and 
not the CFA offset, sorry about that.  This means that the check in 
choose_basereg (and thus choose_baseaddr) have been wrong as well.  I'm 
retesting now.


Re: [PING] [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-07-26 Thread Daniel Santos

On 07/26/2017 02:03 PM, H.J. Lu wrote:

This patch caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81563


Yes, I discovered this flaw while working on PR 80969 but I hadn't found 
an actual testcase where it caused a problem yet.  I'm about to submit 
my patchset for review, so sorry I didn't get it committed sooner.  My 
patch set further improves sp_valid_at and fp_valid_at since it's 
possible that the the last offset the frame pointer can be used to 
access is not equal to realignment offset.  I'll try to get this out 
tonight or tomorrow.


Thanks!
Daniel


PING: [PATCH v2 0/2] [testsuite, libgcc] PR80759 Fix FAILs on Solaris and Darwin

2017-07-17 Thread Daniel Santos

https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00025.html

Uros,
Can you review changes for i386 please?

Mike or Iain,
Can one of you review changes for Darwin please?  I'm not familiar with 
the platform, although Rainer tested on Darwin for me.


Ian,
Can you review changes to libgcc please?

Thank you all!
Daniel


On 07/02/2017 12:11 AM, Daniel Santos wrote:
This patchset addresses a number of testsuite issues for 
gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp, mostly occurring on Solaris 
and Darwin.  Additionally, it solves a bug in libgcc that caused link 
failures on Darwin when building with -mcall-ms2sysv-xlogues.  The 
issues are detailed in the notes for each patch.


I would particularly appreciate any feedback for Darwin as I am 
unfamiliar with the platform and Rainer and I have fashioned some of 
these changes by looking at other Darwin code in gcc.


 .../gcc.target/x86_64/abi/ms-sysv/do-test.S  | 200 
---

 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.c  |  83 +++-
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp| 153 +-
 libgcc/config.host   |   6 +-
 libgcc/config/i386/i386-asm.h|  89 +
 libgcc/config/i386/resms64.S |   2 +-
 libgcc/config/i386/resms64f.S|   2 +-
 libgcc/config/i386/resms64fx.S   |   2 +-
 libgcc/config/i386/resms64x.S|   2 +-
 libgcc/config/i386/savms64.S |   2 +-
 libgcc/config/i386/savms64f.S|   2 +-
 11 files changed, 274 insertions(+), 269 deletions(-)


Many thanks to Rainer for all of his help on this!

Thanks,
Daniel




[PATCH 2/2] [libgcc]: PR80759 fixes for Solaris & Darwin

2017-07-01 Thread Daniel Santos
The -mcall-ms2sysv-xlogues option is supposed to work on Solaris and
Darwin, but my changes to config.host weren't adding the sav/res stubs
to libgcc and the assembly code wasn't compatible with their assemblers
either.

* Change config.host to build -mcall-ms2sysv-xlogues sav/res stubs on
  Solaris and Darwin.
* Replace .macro/.endm with cpp macros
* Replace .global with .globl
* Append __USER_LABEL_PREFIX__ when defined (via ASMNAME macro).
* Only use .size when __ELF__ is defined.
* Only use .hidden when both __ELF__ and HAVE_GAS_HIDDEN are defined.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 libgcc/config.host |  6 +--
 libgcc/config/i386/i386-asm.h  | 89 ++
 libgcc/config/i386/resms64.S   |  2 +-
 libgcc/config/i386/resms64f.S  |  2 +-
 libgcc/config/i386/resms64fx.S |  2 +-
 libgcc/config/i386/resms64x.S  |  2 +-
 libgcc/config/i386/savms64.S   |  2 +-
 libgcc/config/i386/savms64f.S  |  2 +-
 8 files changed, 64 insertions(+), 43 deletions(-)

diff --git a/libgcc/config.host b/libgcc/config.host
index cf62e0e54f7..bee3e931106 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -588,12 +588,12 @@ hppa*-*-openbsd*)
tmake_file="$tmake_file pa/t-openbsd"
;;
 i[34567]86-*-darwin*)
-   tmake_file="$tmake_file i386/t-crtpc t-crtfm"
+   tmake_file="$tmake_file i386/t-crtpc t-crtfm i386/t-msabi"
tm_file="$tm_file i386/darwin-lib.h"
extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o 
crtfastmath.o"
;;
 x86_64-*-darwin*)
-   tmake_file="$tmake_file i386/t-crtpc t-crtfm"
+   tmake_file="$tmake_file i386/t-crtpc t-crtfm i386/t-msabi"
tm_file="$tm_file i386/darwin-lib.h"
extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o 
crtfastmath.o"
;;
@@ -670,7 +670,7 @@ i[34567]86-*-rtems*)
extra_parts="$extra_parts crti.o crtn.o"
;;
 i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*)
-   tmake_file="$tmake_file i386/t-crtpc t-crtfm"
+   tmake_file="$tmake_file i386/t-crtpc t-crtfm i386/t-msabi"
extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o 
crtfastmath.o"
tm_file="${tm_file} i386/elf-lib.h"
md_unwind_header=i386/sol2-unwind.h
diff --git a/libgcc/config/i386/i386-asm.h b/libgcc/config/i386/i386-asm.h
index c613e9fd83d..1387fd24b4f 100644
--- a/libgcc/config/i386/i386-asm.h
+++ b/libgcc/config/i386/i386-asm.h
@@ -26,22 +26,45 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 #ifndef I386_ASM_H
 #define I386_ASM_H
 
+#include "auto-host.h"
+
+/* These macros currently support GNU/Linux, Solaris and Darwin.  */
+
 #ifdef __ELF__
-# define ELFFN(fn) .type fn,@function
+# define FN_TYPE(fn) .type fn,@function
+# define FN_SIZE(fn) .size fn,.-fn
+# ifdef HAVE_GAS_HIDDEN
+#  define FN_HIDDEN(fn) .hidden fn
+# endif
+#else
+# define FN_TYPE(fn)
+# define FN_SIZE(fn)
+#endif
+
+#ifndef FN_HIDDEN
+# define FN_HIDDEN(fn)
+#endif
+
+#ifdef __USER_LABEL_PREFIX__
+# define ASMNAME2(prefix, name)prefix ## name
+# define ASMNAME1(prefix, name)ASMNAME2(prefix, name)
+# define ASMNAME(name) ASMNAME1(__USER_LABEL_PREFIX__, name)
 #else
-# define ELFFN(fn)
+# define ASMNAME(name) name
 #endif
 
-#define FUNC_START(fn) \
-   .global fn; \
-   ELFFN (fn); \
-fn:
+#define FUNC_BEGIN(fn) \
+   .globl ASMNAME(fn); \
+   FN_TYPE (ASMNAME(fn));  \
+ASMNAME(fn):
 
-#define HIDDEN_FUNC(fn)\
-   FUNC_START (fn) \
-   .hidden fn; \
+#define HIDDEN_FUNC(fn)\
+   .globl ASMNAME(fn); \
+   FN_TYPE(ASMNAME(fn));   \
+   FN_HIDDEN(ASMNAME(fn)); \
+ASMNAME(fn):
 
-#define FUNC_END(fn) .size fn,.-fn
+#define FUNC_END(fn) FN_SIZE(ASMNAME(fn))
 
 #ifdef __SSE2__
 # ifdef __AVX__
@@ -51,32 +74,30 @@ fn:
 # endif
 
 /* Save SSE registers 6-15. off is the offset of rax to get to xmm6.  */
-.macro SSE_SAVE off=0
-   MOVAPS %xmm15,(\off - 0x90)(%rax)
-   MOVAPS %xmm14,(\off - 0x80)(%rax)
-   MOVAPS %xmm13,(\off - 0x70)(%rax)
-   MOVAPS %xmm12,(\off - 0x60)(%rax)
-   MOVAPS %xmm11,(\off - 0x50)(%rax)
-   MOVAPS %xmm10,(\off - 0x40)(%rax)
-   MOVAPS %xmm9, (\off - 0x30)(%rax)
-   MOVAPS %xmm8, (\off - 0x20)(%rax)
-   MOVAPS %xmm7, (\off - 0x10)(%rax)
-   MOVAPS %xmm6, \off(%rax)
-.endm
+#define SSE_SAVE  \
+   MOVAPS %xmm15,-0x30(%rax); \
+   MOVAPS %xmm14,-0x20(%rax); \
+   MOVAPS %xmm13,-0x10(%rax); \
+   MOVAPS %xmm12, (%rax); \
+   MOVAPS %xmm11, 0x10(%rax); \
+   MOVAPS %xmm10, 0x20(%rax); \
+   MOVAPS %xmm9,  0x30(%rax); \
+   MOVAPS %xmm8,  0x40(%rax); \
+   MOVAPS %xmm7,  0x50(%rax); \
+   MOVAPS %xmm6,  0x60(%rax)
 
 /* Restore SSE r

[PATCH 1/2] [testsuite] PR80759 fix tests on Solaris and Darwin

2017-07-01 Thread Daniel Santos
The ms-sysv.exp tests were failing on Solaris and Darwin targets.  In
addition, a number of other problems have been identified.

* Assembly failed on Solaris and Darwin when not using gas due to use of
  .cfi directives and .struct.

* Tests were failing on Solaris due to hard frame pointer being always
  enabled on that platform and and not passing --omit-rbp-clobbers to
  the code generator.

* Manual compilation (via remote_exec as opposed to dg-runtest, et. al.)
  was missing TEST_ALWAYS_FLAGS, resulting in color codes in log files.
  It was also missing -m64 in some cases where it was needed.

* When built with make -j48 on an unsupported triplet, the "test
  unsupported" message appeared 48 times in the log (it appears that
  several other tests do this as well).

* Using hard-coded offsets in do-tests.S is ugly.  This is fixed by
  moving some code into inline assembly in ms-sysv.c.

* Custom parallelization code broke when running make without -j

* Accessing the test_data global from assembly requires(?) use of global
  offset table on Darwin.

This patch corrects all of these problems.  The custom parallelization
code has been removed and replaced with calls to procs in gcc's standard
testing framework: gcc_parallel_test_enable, runtest_file_p and
dg-runtest.  This results in much poorer parallelization, which I hope
to address in a future patch, but has little effect when built without
checking enabled.

Previously, each test job compiled and executed around 20k individual
tests.  This high number resulted in test jobs far exceeding the default
5 minute timeout for remote_/local_exec when gcc was built with
--enable-checking=rtl.  This has been resolved by splitting the tests
out to a maximum of around 3500 tests per job.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 .../gcc.target/x86_64/abi/ms-sysv/do-test.S| 200 +
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.c|  83 -
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp  | 153 +---
 3 files changed, 210 insertions(+), 226 deletions(-)

diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S 
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
index 1395235fd1e..ffe011bcc68 100644
--- a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
+++ b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
@@ -23,141 +23,101 @@ a copy of the GCC Runtime Library Exception along with 
this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 <http://www.gnu.org/licenses/>.  */
 
-#ifdef __x86_64__
-
-# ifdef __ELF__
-#  define ELFFN_BEGIN(fn)   .type fn,@function
-#  define ELFFN_END(fn) .size fn,.-fn
-# else
-#  define ELFFN_BEGIN(fn)
-#  define ELFFN_END(fn)
-# endif
-
-# define FUNC(fn)  \
-   .global fn; \
-   ELFFN_BEGIN(fn);\
-fn:
-
-#define FUNC_END(fn) ELFFN_END(fn)
-
-# ifdef __AVX__
-#  define MOVAPS vmovaps
-# else
-#  define MOVAPS movaps
-# endif
-
-/* TODO: Is there a cleaner way to provide these offsets?  */
-   .struct 0
-test_data_save:
-
-   .struct test_data_save + 224
-test_data_input:
-
-   .struct test_data_save + 448
-test_data_output:
-
-   .struct test_data_save + 672
-test_data_fn:
-
-   .struct test_data_save + 680
-test_data_retaddr:
+#if defined(__x86_64__) && defined(__SSE2__)
+
+/* These macros currently support GNU/Linux, Solaris and Darwin.  */
+
+#ifdef __ELF__
+# define FN_TYPE(fn) .type fn,@function
+# define FN_SIZE(fn) .size fn,.-fn
+#else
+# define FN_TYPE(fn)
+# define FN_SIZE(fn)
+#endif
+
+#ifdef __USER_LABEL_PREFIX__
+# define ASMNAME2(prefix, name)prefix ## name
+# define ASMNAME1(prefix, name)ASMNAME2(prefix, name)
+# define ASMNAME(name) ASMNAME1(__USER_LABEL_PREFIX__, name)
+#else
+# define ASMNAME(name) name
+#endif
+
+#define FUNC_BEGIN(fn) \
+   .globl ASMNAME(fn); \
+   FN_TYPE (ASMNAME(fn));  \
+ASMNAME(fn):
+
+#define FUNC_END(fn) FN_SIZE(ASMNAME(fn))
+
+#ifdef __AVX__
+# define MOVAPS vmovaps
+#else
+# define MOVAPS movaps
+#endif
 
.text
 
-regs_to_mem:
-   MOVAPS  %xmm6, (%rax)
-   MOVAPS  %xmm7, 0x10(%rax)
-   MOVAPS  %xmm8, 0x20(%rax)
-   MOVAPS  %xmm9, 0x30(%rax)
-   MOVAPS  %xmm10, 0x40(%rax)
-   MOVAPS  %xmm11, 0x50(%rax)
-   MOVAPS  %xmm12, 0x60(%rax)
-   MOVAPS  %xmm13, 0x70(%rax)
-   MOVAPS  %xmm14, 0x80(%rax)
-   MOVAPS  %xmm15, 0x90(%rax)
-   mov %rsi, 0xa0(%rax)
-   mov %rdi, 0xa8(%rax)
-   mov %rbx, 0xb0(%rax)
-   mov %rbp, 0xb8(%rax)
-   mov %r12, 0xc0(%rax)
-   mov %r13, 0xc8(%rax)
-   mov %r14, 0xd0(%rax)
-   mov %r15, 0xd8(%rax)
+FUNC_BEGIN(regs_to_mem)
+   MOVAPS  %xmm6, (%r10)
+   MOVAPS  %xmm7, 0x10(%r10)
+   MOVAPS  %xmm8, 0x20(%r10)
+   MOVAPS  %xmm9, 0x30(%r10)
+   MOVAPS  %xmm10, 0x4

Re: [PATCH v2 0/2] [testsuite, libgcc] PR80759 Fix FAILs on Solaris and Darwin

2017-07-01 Thread Daniel Santos
This patchset addresses a number of testsuite issues for 
gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp, mostly occurring on Solaris 
and Darwin.  Additionally, it solves a bug in libgcc that caused link 
failures on Darwin when building with -mcall-ms2sysv-xlogues.  The 
issues are detailed in the notes for each patch.


I would particularly appreciate any feedback for Darwin as I am 
unfamiliar with the platform and Rainer and I have fashioned some of 
these changes by looking at other Darwin code in gcc.


 .../gcc.target/x86_64/abi/ms-sysv/do-test.S  | 200 ---
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.c  |  83 +++-
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp| 153 +-
 libgcc/config.host   |   6 +-
 libgcc/config/i386/i386-asm.h|  89 +
 libgcc/config/i386/resms64.S |   2 +-
 libgcc/config/i386/resms64f.S|   2 +-
 libgcc/config/i386/resms64fx.S   |   2 +-
 libgcc/config/i386/resms64x.S|   2 +-
 libgcc/config/i386/savms64.S |   2 +-
 libgcc/config/i386/savms64f.S|   2 +-
 11 files changed, 274 insertions(+), 269 deletions(-)


Many thanks to Rainer for all of his help on this!

Thanks,
Daniel
2017-06-28  Daniel Santos  <daniel.san...@pobox.com>


2017-06-10  Daniel Santos  <daniel.san...@pobox.com>

PR testsuite/80759
* gcc.target/x86_64/abi/ms-sysv/do-test.S
(ELFFN_BEGIN): Rename to FN_TYPE.
(ELFFN_END): Rename to FN_SIZE.
(ASMNAME): New macro.
(FUNC): Rename to FUNC_BEGIN, use ASMNAME and use .globl instead of
.global.
(FUNC_END): Use ASMNAME.
(test_data_save): Remove.
(test_data_input): Likewise.
(test_data_output: Likewise.
(test_data_fn): Likewise.
(test_data_retaddr): Likewise.
(regs_to_mem): Make globals, use r10 instead of rax.
(mem_to_regs): Likewise.
(do_test_unaligned): Remove .cfi directives, remove pushf/popf, move
body to ms-sysv.c.
(do_test_aligned): Likewise.
* gcc.target/x86_64/abi/ms-sysv/ms-sysv.c:
Add dg-* directives.
(PASTE_STR): New macro.
(ASMNAME): Likewise.
(LOAD_TEST_DATA_ADDR): Likewise.
(TEST_DATA_OFFSET): Likewise.
(do_test_body0): New C function.
(do_test_body): New inline assembly routine.
* gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp
(runtest_ms_sysv): Modify.
2017-06-28  Daniel Santos  <daniel.san...@pobox.com>

PR testsuite/80759
* config.host: include i386/t-msabi for darwin and solaris.
* config/i386/i386-asm.h
(ELFFN): Rename to FN_TYPE.
(FN_SIZE): New macro.
(FN_HIDDEN): Likewise.
(ASMNAME): Likewise.
(FUNC_START): Rename to FUNC_BEGIN, use ASMNAME, replace .global with
.globl.
(HIDDEN_FUNC): Use ASMNAME and .globl instead of .global.
(SSE_SAVE): Convert to cpp macro, hard-code offset (always 0x60).
* config/i386/resms64.S: Use SSE_SAVE as cpp macro instead of gas
.macro.
* config/i386/resms64f.S: Likewise.
* config/i386/resms64fx.S: Likewise.
* config/i386/resms64x.S: Likewise.
* config/i386/savms64.S: Likewise.
* config/i386/savms64f.S: Likewise.


[PATCH try 2 resend] [i386] Remove warnings for ignoring -mcall-ms2sysv-xlogues.

2017-06-11 Thread Daniel Santos
I appear to have forgotten to cc gcc-patches, sorry about that.

There are currently three cases where we issue a warning when disabling
-mcall-ms2sysv-xlogues for a function, but I never added a proper
warning, so there's no mechanism for disabling it.  This is something
that I meant to address sooner.  I'm thinking that it's better to just
remove the warning entirely and document these cases, rather than adding
a new warning.  Any thoughts?

These are the conditions:

* the use of -fsplit-stack,
* the use of static call chains (not sure if we can ever have that), and
* if the function calls __buildin_eh_return.

Some of these cases can likely be supported, but they are just on the
"not yet tested" list.

2017-06-11  Daniel Santos  <daniel.san...@pobox.com

* config/i386/i386.c (warn_once_call_ms2sysv_xlogues): Remove.
(ix86_compute_frame_layout): Don't call warn_once_call_ms2sysv_xlogues.
(ix86_expand_call): Likewise.

Thanks,
Daniel

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 26 +++---
 gcc/doc/invoke.texi| 25 -
 2 files changed, 23 insertions(+), 28 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d5c2d46bf5e..2dc6e53c765 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12772,18 +12772,6 @@ ix86_builtin_setjmp_frame_value (void)
   return stack_realign_fp ? hard_frame_pointer_rtx : virtual_stack_vars_rtx;
 }
 
-/* Emits a warning for unsupported msabi to sysv pro/epilogues.  */
-static void warn_once_call_ms2sysv_xlogues (const char *feature)
-{
-  static bool warned_once = false;
-  if (!warned_once)
-{
-  warning (0, "-mcall-ms2sysv-xlogues is not compatible with %s",
-  feature);
-  warned_once = true;
-}
-}
-
 /* When using -fsplit-stack, the allocation routines set a field in
the TCB to the bottom of the stack plus this much space, measured
in bytes.  */
@@ -12814,18 +12802,10 @@ ix86_compute_frame_layout (void)
   gcc_assert (TARGET_SSE);
   gcc_assert (!ix86_using_red_zone ());
 
-  if (crtl->calls_eh_return)
+  if (crtl->calls_eh_return || ix86_static_chain_on_stack)
{
  gcc_assert (!reload_completed);
  m->call_ms2sysv = false;
- warn_once_call_ms2sysv_xlogues ("__builtin_eh_return");
-   }
-
-  else if (ix86_static_chain_on_stack)
-   {
- gcc_assert (!reload_completed);
- m->call_ms2sysv = false;
- warn_once_call_ms2sysv_xlogues ("static call chains");
}
 
   /* Finally, compute which registers the stub will manage.  */
@@ -29290,9 +29270,9 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
  else if (ix86_function_ms_hook_prologue (current_function_decl))
;
 
- /* TODO: Cases not yet examined.  */
+ /* TODO: Compatibility not yet examined.  */
  else if (flag_split_stack)
-   warn_once_call_ms2sysv_xlogues ("-fsplit-stack");
+   ;
 
  else
{
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c1168823af7..eec02b43a4f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -25389,11 +25389,26 @@ using the function attributes @code{ms_abi} and 
@code{sysv_abi}.
 @opindex mno-call-ms2sysv-xlogues
 Due to differences in 64-bit ABIs, any Microsoft ABI function that calls a
 System V ABI function must consider RSI, RDI and XMM6-15 as clobbered.  By
-default, the code for saving and restoring these registers is emitted inline,
-resulting in fairly lengthy prologues and epilogues.  Using
-@option{-mcall-ms2sysv-xlogues} emits prologues and epilogues that
-use stubs in the static portion of libgcc to perform these saves and restores,
-thus reducing function size at the cost of a few extra instructions.
+default, the instructions for saving and restoring these registers are emitted
+inline, resulting in fairly lengthy pro- and epilogues.  Using
+@option{-mcall-ms2sysv-xlogues} emits pro- and epilogues that use stubs in the
+static portion of libgcc to perform these saves and restores, thus reducing
+function size at the cost of executing a few extra instructions.  This cost is
+theoretically mitigated or eliminated by reduced instruction cache utilization,
+temporal locality of the stubs, and the stubs' use of MOV instructions over
+PUSH and POP.
+
+This option is not supported with SEH, so it is completely unavailable on
+Windows.  It is also silently disabled if a function:
+
+@enumerate
+@item is built with @option{-mno-sse2} or @option{-fsplit-stack},
+@item has @code{__attribute__ ((ms_hook_prologue))}, or
+@item either throws an exception or explicitly calls 
@code{__builtin_eh_return}.
+@end enumerate
+
+Support for @option{-fsplit-stack} and @code{__builtin_eh_return} may be
+added at some time in the future, but has not ye

Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-23 Thread Daniel Santos

On 05/23/2017 09:31 AM, Bernd Edlinger wrote:

Hi,

this is the latest version of my patch.

As already said, it attempts to compute
the frame layout only when relevant data have
changed.

Apologies for doing more clean-up on Daniel's
patch than absolutely necessary, but ...

Bootstrap and reg-tested successfully on
x86_64-pc-linux-gnu with unix\{,-m32\}.
Is it OK for trunk?


Thanks
Bernd.


OK with me.

Thanks,
Daniel


Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-23 Thread Daniel Santos

On 05/22/2017 01:32 PM, Bernd Edlinger wrote:

On 05/19/17 05:17, Daniel Santos wrote:


No, I'm not at all comfortable with you making so many seemingly
unnecessary changes to this code.  (Although I wish I got this much
feedback during my RFCs! :)  I can accept the changes to
is/count_stub_managed_reg (with some caveats), but I do not at all see
the rationale for changing m_stub_names to a static and adding another
dimension for the object instance -- from an OO standpoint, that's just
bad design.  Can you please share your rationale for that?


Hmm, sorry about that ...
I just thought it would be nice to avoid the const-cast here.


Well remember const-correctness isn't about an object's internal 
(bitwise) state, but it's externally visible (logical) state.  So a 
const member function need not avoid altering it's internal state if the 
externally visible state remains unchanged, such as when caching some 
result or lazy initing.  I have tended to prefer using const_cast for 
this, isolating its use to a single const accessor function (or if () 
block) to leave less room for the data members to be accidentally 
altered in another const member function.  But mutable is generally 
preferred over const_cast, which opens up the danger of accidentally 
modifying an object's logical state (especially by a subsequent edit), 
so using mutable is probably a better practice anyway.


However, ...


This moved the m_stub_names from all 4 instances to one static
array s_stub_names.  But looking at it again, I think the extra
dimension is not even necessary, because all instances share the
same data, so removing that extra dimension again will be fine.


You are correct!  And I see that you're new patch has already changed 
get_stub_name to a static member function, so great!



Incidentally, half of the space in that array is wasted and can be
trimmed since a given instance of xlogue_layout either supports hard
frame pointers or doesn't, I just never got around to splitting that
apart.  (The first three enum xlogue_stub values are for without HFP and
the last three for with.)  Also, if we wanted to further reduce the
memory footprint of xlogue_layout objects, the offset field of struct
reginfo could be changed to int, and if we really wanted to go nuts then
16 bits would work for both of its fields.

So for is/count_stub_managed_reg(s), you are obviously much more
familiar with gcc, its passes and the i386 backend, but my knowledge
level makes me very uncomfortable having the result of
xlogue_layout::is_stub_managed_reg() determined in a way that has the
(apparent) potential to differ from from the state at the time the last
call to ix86_compute_frame_layout() was made; I just don't understand

I fund it technically difficult to add a HARD_REG_SET to
struct machine_function, and tried to avoid the extra overhead of
calling ix86_save_reg so often, which you also felt uncomfortable with.

So, if you look at compute_stub_managed_regs I first saw that the
first loop can never terminate thru the "return 0", because the
registers in x86_64_ms_sysv_extra_clobbered_registers are guaranteed
to be clobbered.  Then I saw that the bits in stub_managed_regs
are always added in the same sequence, thus the result depends only
on the number call_ms2sysv_extra_regs and hfp so everything is already
available in struct machine_function.


Thanks
Bernd.


Yes, I agree with how you have refactored compute_stub_managed_regs 
given your rationale of not adding another header dependency to i386.h.  
It is only the overall scheme of calculating this outside of 
ix86_compute_frame_layout that I cannot validate due to my not having a 
good understanding of what can and cannot change in between the time 
that ix86_compute_frame_layout is last called and the last call to 
is_stub_managed_regs().


As Uros said, my patch set touches a "delicate part of the compiler, 
where lots of code-paths cross each other (and we have had quite some 
hard-to-fix bugs in this area)" 
(https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01924.html).  I wrote it 
the way I did with my understanding of what was safe to do and your 
alterations move it's functionality outside of that understanding.  So 
if you say that this won't break it, then I will have to trust you (and 
the testsuite) on that.


On that note, the tests are undergoing some change and bug fixes and I'm 
planning on adding more tests to validate non-breakage with various 
other stack frame-related options and probably additional tests (and 
test options) triggered by GCC_TEST_RUN_EXPENSIVE or some such.


Daniel





Re: [PATCH 2/2] [testsuite] PR 80759 Remove gas extensions from do-test.S, fix other problems

2017-05-19 Thread Daniel Santos

Thanks you for your assistance Rainer!

On 05/19/2017 04:03 AM, Rainer Orth wrote:

unfortunately, it still doesn't, as explained in the PR.  The multilib
support is still wrong/non-existant.


I guess I thought for some reason that would magically appear in 
TEST_ALWAYS_FLAGS. :)  I've explicitly added it for now, but I haven't 
yet found where -m64 gets fed in the normal flow of things and I would 
rather know I'm doing things as closely as possible to how the rest if 
the test harness does it.



(I have SVN write privs now, so I can even commit it myself).

Please always include ChangeLog entries with your patch submissions so
one can easily see what you've change
(cf. https://gcc.gnu.org/contribute.html).

Thanks.
 Rainer


I hate when I forget that!  I'll be sure to remember when I resubmit.


Use of .struct in do_test.S causes breakages when gas isn't the
assembler (e.g., Solaris).  I also wasn't including TEST_ALWAYS_FLAGS in
my CFLAGS resulting in super-ugly log files.  Finally, this patch
eliminates spam of "test unsupported" (limiting it to one printing).

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
  .../gcc.target/x86_64/abi/ms-sysv/do-test.S| 26 +-
  .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.c|  7 ++
  .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp  | 24 
  3 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S 
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
index 1395235fd1e..967eb959fbc 100644
--- a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
+++ b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
@@ -46,22 +46,6 @@ fn:
  #  define MOVAPS movaps
  # endif
  
-/* TODO: Is there a cleaner way to provide these offsets?  */

-   .struct 0
-test_data_save:
-
-   .struct test_data_save + 224
-test_data_input:
-
-   .struct test_data_save + 448
-test_data_output:
-
-   .struct test_data_save + 672
-test_data_fn:
-
-   .struct test_data_save + 680
-test_data_retaddr:
-
.text
  
  regs_to_mem:

@@ -132,23 +116,23 @@ L0:
callregs_to_mem
  
  	# Load register with random data

-   lea test_data + test_data_input(%rip), %rax
+   lea test_data + 224(%rip), %rax
callmem_to_regs
  
  	# Save original return address

pop %rax
-   movq%rax, test_data + test_data_retaddr(%rip)
+   movq%rax, test_data + 680(%rip)
  
  	# Call the test function

-   call*test_data + test_data_fn(%rip)
+   call*test_data + 672(%rip)
  
  	# Restore the original return address

-   movqtest_data + test_data_retaddr(%rip), %rcx
+   movqtest_data + 680(%rip), %rcx
push%rcx
  
  	# Save test function return value and store resulting register values

push%rax
-   lea test_data + test_data_output(%rip), %rax
+   lea test_data + 448(%rip), %rax
callregs_to_mem
  
  	# Restore registers

diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c 
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
index 2a011f5103d..7cec312c386 100644
--- a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
+++ b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
@@ -346,6 +346,13 @@ int main (int argc, char *argv[])
assert (!((long)_data.regdata[REG_SET_INPUT] & 15));
assert (!((long)_data.regdata[REG_SET_OUTPUT] & 15));
  
+  /* Verify offsets hard-coded into assembly.  */

+  assert (__builtin_offsetof (struct test_data, regdata[REG_SET_SAVE]) == 0);
+  assert (__builtin_offsetof (struct test_data, regdata[REG_SET_INPUT]) == 
224);
+  assert (__builtin_offsetof (struct test_data, regdata[REG_SET_OUTPUT]) == 
448);
+  assert (__builtin_offsetof (struct test_data, fn) == 672);
+  assert (__builtin_offsetof (struct test_data, retaddr) == 680);
+
while ((c = getopt (argc, argv, "s:f")) != -1)
  {
switch (c)

while .struct is a gas extension and doesn't work with the Solaris/x86
/bin/as, having the same (mostly unexplained) constants hardcoded in two
places isn't exactly helpful.  I'd suggest moving them to (say)
ms-sysv.h and include that from both do-test.S (which is preprocessed
assembler after all) and ms-sysv.c.

Rainer


Well, I don't have an ms-sysv.h, but I suppose I can add one.

I'm starting to lean more towards the idea of plucking out the portion 
of asm that uses these offsets, moving that to an inline asm function 
and having the code in do-test.S just jmp to it.  I wish there was some 
sort of "naked" attribute for x86 since I'm not well versed in every way 
that the compiler can change it in a way that wouldn't be friendly.


void __attribute__((optimize ("-O0 -fno-split-stack")))
do_test_body (void)
{
  __asm__ __volatile__ (
"# Save registers\n"
"  lea %0, %%

[PATCH 2/2] [testsuite] PR 80759 Remove gas extensions from do-test.S, fix other problems

2017-05-19 Thread Daniel Santos
Use of .struct in do_test.S causes breakages when gas isn't the
assembler (e.g., Solaris).  I also wasn't including TEST_ALWAYS_FLAGS in
my CFLAGS resulting in super-ugly log files.  Finally, this patch
eliminates spam of "test unsupported" (limiting it to one printing).

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 .../gcc.target/x86_64/abi/ms-sysv/do-test.S| 26 +-
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.c|  7 ++
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp  | 24 
 3 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S 
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
index 1395235fd1e..967eb959fbc 100644
--- a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
+++ b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
@@ -46,22 +46,6 @@ fn:
 #  define MOVAPS movaps
 # endif
 
-/* TODO: Is there a cleaner way to provide these offsets?  */
-   .struct 0
-test_data_save:
-
-   .struct test_data_save + 224
-test_data_input:
-
-   .struct test_data_save + 448
-test_data_output:
-
-   .struct test_data_save + 672
-test_data_fn:
-
-   .struct test_data_save + 680
-test_data_retaddr:
-
.text
 
 regs_to_mem:
@@ -132,23 +116,23 @@ L0:
callregs_to_mem
 
# Load register with random data
-   lea test_data + test_data_input(%rip), %rax
+   lea test_data + 224(%rip), %rax
callmem_to_regs
 
# Save original return address
pop %rax
-   movq%rax, test_data + test_data_retaddr(%rip)
+   movq%rax, test_data + 680(%rip)
 
# Call the test function
-   call*test_data + test_data_fn(%rip)
+   call*test_data + 672(%rip)
 
# Restore the original return address
-   movqtest_data + test_data_retaddr(%rip), %rcx
+   movqtest_data + 680(%rip), %rcx
push%rcx
 
# Save test function return value and store resulting register values
push%rax
-   lea test_data + test_data_output(%rip), %rax
+   lea test_data + 448(%rip), %rax
callregs_to_mem
 
# Restore registers
diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c 
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
index 2a011f5103d..7cec312c386 100644
--- a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
+++ b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
@@ -346,6 +346,13 @@ int main (int argc, char *argv[])
   assert (!((long)_data.regdata[REG_SET_INPUT] & 15));
   assert (!((long)_data.regdata[REG_SET_OUTPUT] & 15));
 
+  /* Verify offsets hard-coded into assembly.  */
+  assert (__builtin_offsetof (struct test_data, regdata[REG_SET_SAVE]) == 0);
+  assert (__builtin_offsetof (struct test_data, regdata[REG_SET_INPUT]) == 
224);
+  assert (__builtin_offsetof (struct test_data, regdata[REG_SET_OUTPUT]) == 
448);
+  assert (__builtin_offsetof (struct test_data, fn) == 672);
+  assert (__builtin_offsetof (struct test_data, retaddr) == 680);
+
   while ((c = getopt (argc, argv, "s:f")) != -1)
 {
   switch (c)
diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp 
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp
index 77c40dbf349..a9571f194b1 100644
--- a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp
+++ b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp
@@ -23,18 +23,12 @@
 # see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 # <http://www.gnu.org/licenses/>.
 
-# Exit immediately if this isn't a native x86_64 target.
-if { (![istarget x86_64-*-*] && ![istarget i?86-*-*])
- || ![is-effective-target lp64] || ![isnative] } then {
-unsupported "$subdir"
-return
-}
-
 load_lib gcc-dg.exp
 load_lib parallelize.exp
 
 proc runtest_ms_sysv { cflags generator_args } {
-global GCC_UNDER_TEST HOSTCXX HOSTCXXFLAGS tmpdir srcdir subdir
+global GCC_UNDER_TEST HOSTCXX HOSTCXXFLAGS tmpdir srcdir subdir \
+  TEST_ALWAYS_FLAGS
 
 set objdir "$tmpdir/ms-sysv"
 set generator "$tmpdir/ms-sysv-generate.exe"
@@ -93,7 +87,7 @@ proc runtest_ms_sysv { cflags generator_args } {
}
 }
 
-set cc "$GCC_UNDER_TEST -I$objdir -I$srcdir/$subdir $cflags $warn_flags"
+set cc "$GCC_UNDER_TEST -I$objdir -I$srcdir/$subdir $TEST_ALWAYS_FLAGS 
$cflags $warn_flags"
 
 # Assemble do-test.S
 set src "$srcdir/$subdir/do-test.S"
@@ -142,6 +136,18 @@ if { [parallel-init "ms2sysv"] != 0 } then {
 return;
 }
 
+# Exit if this isn't a native x86_64 target.
+if { (![istarget x86_64-*-*] && ![istarget i?86-*-*])
+ || ![is-effective-target lp64] || ![isnative] } then {
+
+# The first call to parallel-should-run-test is used so we only print the
+# "

[PATCH 1/2] [testsuite] Move non-standard parallelization support into new lib and fix flaw

2017-05-19 Thread Daniel Santos
This fixes a flaw in my parallelization code that caused it to fail when
GCC_RUNTEST_PARALLELIZE_DIR wasn't set.  It worked fine with make -j1,
but failed with just make.

As there could be other tests that might need to do their own
paralellization, I'm moving the that code into it's own file under
gcc/testsuite/lib.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp  | 48 
 gcc/testsuite/lib/parallelize.exp  | 88 ++
 2 files changed, 103 insertions(+), 33 deletions(-)
 create mode 100644 gcc/testsuite/lib/parallelize.exp

diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp 
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp
index e317af9bd85..77c40dbf349 100644
--- a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp
+++ b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp
@@ -30,13 +30,11 @@ if { (![istarget x86_64-*-*] && ![istarget i?86-*-*])
 return
 }
 
-global GCC_RUNTEST_PARALLELIZE_DIR
-
 load_lib gcc-dg.exp
+load_lib parallelize.exp
 
 proc runtest_ms_sysv { cflags generator_args } {
-global GCC_UNDER_TEST HOSTCXX HOSTCXXFLAGS tmpdir srcdir subdir \
-  parallel_dir next_test
+global GCC_UNDER_TEST HOSTCXX HOSTCXXFLAGS tmpdir srcdir subdir
 
 set objdir "$tmpdir/ms-sysv"
 set generator "$tmpdir/ms-sysv-generate.exe"
@@ -46,22 +44,6 @@ proc runtest_ms_sysv { cflags generator_args } {
 set ms_sysv_exe "$objdir/ms-sysv.exe"
 set status 0
 set warn_flags "-Wall"
-set this_test $next_test
-incr next_test
-
-# Do parallelization here
-if [catch {set fd [open "$parallel_dir/$this_test" \
-   [list RDWR CREAT EXCL]]} ] {
-   if { [lindex $::errorCode 1] eq "EEXIST" } then {
-   # Another job is running this test
-   return
-   } else {
-   error "Failed to open $parallel_dir/$this_test: $::errorCode"
-   set status 1
-   }
-} else {
-  close $fd
-}
 
 # Detect when hard frame pointers are enabled (or required) so we know not
 # to generate bp clobbers.
@@ -73,9 +55,17 @@ proc runtest_ms_sysv { cflags generator_args } {
 set descr "$subdir CFLAGS=\"$cflags\" generator_args=\"$generator_args\""
 verbose "$tmpdir: Running test $descr" 1
 
-# Cleanup any previous test in objdir
-file delete -force $objdir
-file mkdir $objdir
+set status [parallel-should-run-test]
+
+if { $status == 1 } then {
+   return
+}
+
+if { $status == 0 } then {
+   # Cleanup any previous test in objdir
+   file delete -force $objdir
+   file mkdir $objdir
+}
 
 # Build the generator (only needs to be done once).
 set src "$srcdir/$subdir/gen.cc"
@@ -148,16 +138,8 @@ proc runtest_ms_sysv { cflags generator_args } {
 }
 
 dg-init
-
-# Setup parallelization
-set next_test 0
-set parallel_dir "$env(GCC_RUNTEST_PARALLELIZE_DIR)/abi-ms-sysv"
-file mkdir "$env(GCC_RUNTEST_PARALLELIZE_DIR)"
-file mkdir "$parallel_dir"
-
-if { ![file isdirectory "$parallel_dir"] } then {
-error "Failed to create directory $parallel_dir: $::errorCode"
-return
+if { [parallel-init "ms2sysv"] != 0 } then {
+return;
 }
 
 set gen_opts "-p0-5"
diff --git a/gcc/testsuite/lib/parallelize.exp 
b/gcc/testsuite/lib/parallelize.exp
new file mode 100644
index 000..346a06f0fa0
--- /dev/null
+++ b/gcc/testsuite/lib/parallelize.exp
@@ -0,0 +1,88 @@
+# Functions for parallelizing tests that cannot use the standard dg-run,
+# dg-runtest or gcc-dg-runtest for some reason.
+#
+# Copyright (C) 2017 Free Software Foundation, Inc.
+# Contributed by Daniel Santos <daniel.san...@pobox.com>
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# Under Section 7 of GPL version 3, you are granted additional
+# permissions described in the GCC Runtime Library Exception, version
+# 3.1, as published by the Free Software Foundation.
+#
+# You should have received a copy of the GNU General Public License and
+# a copy of the GCC Runtime Library Exception along with this program;
+# see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+# <http://www.gnu.org/licenses/>.
+
+set is_parallel_build 0
+set parallel_next_test 0
+set parallel_dir ""

[PATCH 0/2] [testsuite] PR80759 Fix test breakages on i386-pc-solaris2.*

2017-05-19 Thread Daniel Santos

There are a few issues with my ms-sysv.exp tests:

1. Use of gas extensions in do_test.S cause breakages on Solaris,
2. Parallelization breaks when no make -j flag is passed,
3. Builds aren't adding TEST_ALWAYS_FLAGS, so log files filled with
   color escape codes, and
4. The "test unsupported" message is being spammed once for each -j

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

I've broken this apart into two patches because I don't know if you'll 
agree with the first one.  I fixed the make -j issue and moved the 
parallelization code into a new gcc/target/lib/parallelize.exp in the 
first patch and fixed all of the other issues in the second.  I've 
removed all usage of gas .struct in my assembly file, used hard-coded 
the offsets into the code and added asserts to main.c to make sure they 
don't change.


I've bootstrapped and retested on x86_64 Linux and have asked Rainer to 
retest on Solaris.  Presuming that succeeds, are you OK with this 
change?  (I have SVN write privs now, so I can even commit it myself).


Thanks,
Daniel



Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-18 Thread Daniel Santos



PS: Oh! it might be due to the difference between -j1 and no -j argument.


Yes, that's how I missed it.  This flaw isn't exposed with make -j1, but 
is exposed with just make.  Thanks for finding this!


Daniel



Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-18 Thread Daniel Santos

On 05/18/2017 08:37 AM, Bernd Edlinger wrote:

On 05/17/17 04:01, Daniel Santos wrote:

-  if (ignore_outlined && cfun->machine->call_ms2sysv
-  && in_hard_reg_set_p (stub_managed_regs, DImode, regno))
-return false;
+  if (ignore_outlined && cfun->machine->call_ms2sysv)
+{
+  /* Registers who's save & restore will be managed by stubs
called from
+ pro/epilogue.  */
+  HARD_REG_SET stub_managed_regs;
+  xlogue_layout::compute_stub_managed_regs (stub_managed_regs);

+  if (in_hard_reg_set_p (stub_managed_regs, DImode, regno))
+return false;
+}
+
if (crtl->drap_reg
&& regno == REGNO (crtl->drap_reg)
&& !cfun->machine->no_drap_save_restore)

This makes no sense.  The entire purpose of stub_managed_regs is to
cache the result of xlogue_layout::compute_stub_managed_regs() and this
would unnecessarily repeat that calculation for each time
ix86_save_reg() is called.  Since
xlogue_layout::compute_stub_managed_regs() calls ix86_save_reg many
times, this makes it even worse.Which registers are being saved
out-of-line and inline MUST be known at the time the stack layout is
determined.  So stub_managed_regsshould either be left a TU static or
just moved to struct machine_function.

As an aside, I've noticed that xlogue_layout::compute_stub_managed_regs
is calling ix86_save_reg (which isn't trivial) more often than it really
has to, so I've refactored it.


Well, meanwhile I think the stub_managed_regs contain zero information
and need not be saved at all, because it can easily be reconstructed
from  m->call_ms2sysv_extra_regs.

See the attached new version.  Daniel does it work for you?


No, I'm not at all comfortable with you making so many seemingly 
unnecessary changes to this code.  (Although I wish I got this much 
feedback during my RFCs! :)  I can accept the changes to 
is/count_stub_managed_reg (with some caveats), but I do not at all see 
the rationale for changing m_stub_names to a static and adding another 
dimension for the object instance -- from an OO standpoint, that's just 
bad design.  Can you please share your rationale for that?


Incidentally, half of the space in that array is wasted and can be 
trimmed since a given instance of xlogue_layout either supports hard 
frame pointers or doesn't, I just never got around to splitting that 
apart.  (The first three enum xlogue_stub values are for without HFP and 
the last three for with.)  Also, if we wanted to further reduce the 
memory footprint of xlogue_layout objects, the offset field of struct 
reginfo could be changed to int, and if we really wanted to go nuts then 
16 bits would work for both of its fields.


So for is/count_stub_managed_reg(s), you are obviously much more 
familiar with gcc, its passes and the i386 backend, but my knowledge 
level makes me very uncomfortable having the result of 
xlogue_layout::is_stub_managed_reg() determined in a way that has the 
(apparent) potential to differ from from the state at the time the last 
call to ix86_compute_frame_layout() was made; I just don't understand 
well enough what all can change in between the last call to 
ix86_compute_frame_layout() and the last call to 
xlogue_layout::is_stub_managed_reg().  I like your 
count_stub_managed_regs() is_stub_managed_regs() from a *performance* 
standpoint (and I know I get too uptight about that kind of thing, so 
appreciate that), but as to the change in scheme, I would have to trust 
you if you assert that this will always behave consistently.


I also want to give you a little background on some of these seemingly 
repetitive computations.  One of my design goals was for the code to be 
relatively easily to adapted to the management of out-of-line 
pro/epilogue stubs for other possible scenarios where there are a lot of 
clobbers and it could be useful.  Granted, I don't have enough knowledge 
of x86 architectures to identify situations other than this one (in 
64-bit Wine) where it could be helpful and I know that x86 push/pops are 
really small.  So theoretically, struct machine_function's 
"call_ms2sysv" could be changed to something like "outline_savres" and 
any combination of clobbered registers for which there is a descent stub 
could be used if it was a good choice.  I also realize that nobody likes 
complexity that isn't being used, and I respect that.  So if you are 
comfortable with this change and you believe you understand how it works 
then I will agree to it, but I'll be trusting you well beyond my 
knowledge level and ability to confidently predict the outcome (probably 
what a programmer hates the most).


Thanks,
Daniel


Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-18 Thread Daniel Santos


On 05/17/2017 01:39 PM, Bernd Edlinger wrote:

On 05/15/17 03:39, Daniel Santos wrote:


I should add that if you want to run faster tests just on the ms to sysv
abi code, you can use make RUNTESTFLAGS="ms-sysv.exp" check and then if
that succeeds run the full testsuite.

Daniel

Hmm, that's funny...

If I use "make check-c RUNTESTFLAGS="ms-sysv.exp" -j8" it seems to work,
but if I omit the -j8 it fails:

make check-c RUNTESTFLAGS="ms-sysv.exp"
...Test Run By ed on Wed May 17 20:38:24 2017
Native configuration is x86_64-pc-linux-gnu

=== gcc tests ===

Schedule of variations:
  unix

Running target unix
Using /usr/share/dejagnu/baseboards/unix.exp as board description file
for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for
target.
Using /home/ed/gnu/gcc-trunk/gcc/testsuite/config/default.exp as
tool-and-target-specific interface file.
Running
/home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp
...
ERROR: tcl error sourcing
/home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp.
ERROR: no such variable
  (read trace on "env(GCC_RUNTEST_PARALLELIZE_DIR)")
  invoked from within
"set parallel_dir "$env(GCC_RUNTEST_PARALLELIZE_DIR)/abi-ms-sysv""
  (file
"/home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp"
line 154)
  invoked from within
"source
/home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp"
  ("uplevel" body line 1)
  invoked from within
"uplevel #0 source
/home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp"
  invoked from within
"catch "uplevel #0 source $test_file_name""

=== gcc Summary ===

/home/ed/gnu/gcc-build/gcc/xgcc  version 8.0.0 20170514 (experimental)
(GCC)

make[2]: Leaving directory `/home/ed/gnu/gcc-build/gcc'
make[1]: Leaving directory `/home/ed/gnu/gcc-build/gcc'


Hmm, that might be something I hadn't actually tried.  And if I run it 
in a directory where I had previously run a multi-job check it doesn't 
blow up (maybe because the directory is already there?)  Due to the 
nature of my test program, I had to break with tradition and implement 
something akin to the test that generates random structs (I forgot what 
that one is called).  It ended up breaking the bastardized 
parallelization scheme, so I had to implement my own re-bastardized 
scheme.  Looks like I can just skip parallelization if 
GCC_RUNTEST_PARALLELIZE_DIR isn't defined.


I have another Solaris test issue on PR 80759 so I'll fix that along 
with it.


Thanks,
Daniel

PS: Oh! it might be due to the difference between -j1 and no -j argument.


Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-18 Thread Daniel Santos

On 05/17/2017 12:41 PM, Bernd Edlinger wrote:

Apologies if I ruined your patch...


As I said before, I'm the new guy here. :) So when this is done I'll 
rebase my changes.  I have some test stuff to fix and some refactoring 
and refinements to xlogue_layout::compute_stub_managed_regs(). And then 
I'll find a solution to the stub_managed_regs after that.



Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c(revision 248031)
+++ gcc/config/i386/i386.c(working copy)
@@ -2425,7 +2425,9 @@ static int const x86_64_int_return_registers[4] =

  /* Additional registers that are clobbered by SYSV calls.  */

-unsigned const x86_64_ms_sysv_extra_clobbered_registers[12] =
+#define NUM_X86_64_MS_CLOBBERED_REGS 12
+static int const x86_64_ms_sysv_extra_clobbered_registers
+ [NUM_X86_64_MS_CLOBBERED_REGS] =

Is there a reason you're changing this unsigned to signed int? While
AX_REG and such are just preprocessor macros, everywhere else it seems
that register numbers are dealt with as unsigned ints.


I actually there seems to be confusion about "int" vs. "unsigned int"
for regno, the advantage of int, is that it can contain -1 as a
exceptional value.  Furthermore there are 3 similar arrays just
above that also use int:

static int const x86_64_int_parameter_registers[6] =
{
DI_REG, SI_REG, DX_REG, CX_REG, R8_REG, R9_REG
};

static int const x86_64_ms_abi_int_parameter_registers[4] =
{
CX_REG, DX_REG, R8_REG, R9_REG
};

static int const x86_64_int_return_registers[4] =
{
AX_REG, DX_REG, DI_REG, SI_REG
};

/* Additional registers that are clobbered by SYSV calls.  */

#define NUM_X86_64_MS_CLOBBERED_REGS 12
static int const x86_64_ms_sysv_extra_clobbered_registers
   [NUM_X86_64_MS_CLOBBERED_REGS] =
{
SI_REG, DI_REG,
XMM6_REG, XMM7_REG,
XMM8_REG, XMM9_REG, XMM10_REG, XMM11_REG,
XMM12_REG, XMM13_REG, XMM14_REG, XMM15_REG
};

So IMHO it looked odd to have one array use a different type in the
first place.


OK.  I think that when I originally started this I was using elements of 
this array in comparisons and got the signed/unsigned warning and 
changed them.  None of the code gives that warning now however.



@@ -2484,13 +2486,13 @@ class xlogue_layout {
   needs to store registers based upon data in the
machine_function.  */
HOST_WIDE_INT get_stack_space_used () const
{
-const struct machine_function  = *cfun->machine;
-unsigned last_reg = m.call_ms2sysv_extra_regs + MIN_REGS - 1;
+const struct machine_function *m = cfun->machine;
+unsigned last_reg = m->call_ms2sysv_extra_regs + MIN_REGS - 1;

What is the reason for this change?


Because a mixture of C and C++ (C wants "struct" machine_function)
looks ugly, and everywhere else in this module, "m" is a pointer and no
reference.


I see, consistency with the rest of the file.


-gcc_assert (m.call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
+gcc_assert (m->call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
  return m_regs[last_reg].offset
-+ (m.call_ms2sysv_pad_out ? 8 : 0)
-+ STUB_INDEX_OFFSET;
+   + (m->call_ms2sysv_pad_out ? 8 : 0)
+   + STUB_INDEX_OFFSET;
}

/* Returns the offset for the base pointer used by the stub. */
@@ -2532,7 +2534,7 @@ class xlogue_layout {
/* Lazy-inited cache of symbol names for stubs.  */
char
m_stub_names[XLOGUE_STUB_COUNT][VARIANT_COUNT][STUB_NAME_MAX_LEN];

-  static const struct xlogue_layout GTY(())
s_instances[XLOGUE_SET_COUNT];
+  static const struct GTY(()) xlogue_layout
s_instances[XLOGUE_SET_COUNT];

Hmm, during development I originally had C-style xlogue_layout as a
struct and later decided to make it a class and apparently forgot to
remove the "struct" here.  None the less, it's bazaar that the GTY()
would go in between the "struct" and the "xlogue_layout."  As I said
before, I don't fully understand how this GTY works.  Can we just remove
the "struct" keyword?

Also, if the way I had it was wrong, (and resulted in garbage collection
not working right) then perhaps it was the cause of a problem I had with
caching symbol rtx objects.  I could not get this to work because my
cached objects would somehow become stale and I've since removed that
code (from xlogue_layout::get_stub_rtx).  (i.e., does GTY effect
lifespan of globals, TU statics and static C++ data members?)


Yes, I have not noticed the "struct", and agree to remove it.

I just saw every other place where GTY is used it is directly after
"struct" or "static", so my impulse was just to follow that examples.


Yeah, and not understanding how it worked I was just trying to follow suit.


But neither version actually makes the class GC-able.  Apparently
this class construct is too complicated for the gengtype machinery.
So I am inclined to remove the GTY keyword completely as it gives
you only false security in GC's ability to garbage collect anything
in this class.



Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-16 Thread Daniel Santos

On 05/16/2017 02:52 PM, Bernd Edlinger wrote:

I think I solved the problem with -fsplit-stack, I am not sure
if ix86_static_chain_on_stack might change after reload due to
final.c possibly calling targetm.calls.static_chain, but if that
is the case, that is an already pre-existing problem.

The goal of this patch is to make all decisions regarding the
frame layout before the reload pass, and to make sure that
the frame layout does not change unexpectedly it asserts
that the data that goes into the decision does not change
after reload_completed.

With the attached patch -fsplit-stack and the attribute ms_hook_prologue
is handed directly at the ix86_expand_call, because that data is
already known before expansion.

The calls_eh_return and ix86_static_chain_on_stack may become
known at a later time, but after reload it should not change any more.
To be sure, I added an assertion at ix86_static_chain, which the
regression test did not trigger, neither with -m64 nor with -m32.

I have bootstrapped the patch several times, and a few times I
encounterd a segfault in the garbage collection, but it did not
happen every time.  Currently I think that is unrelated to this patch.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu with -m64/-m32.
Is it OK for trunk?


Thanks
Bernd.


With as many formatting errors as I seem to have had, I would like to 
fix those then you patch on top of that if you wouldn't mind terribly.  
While gcc uses subversion, git-blame is still very helpful (then again, 
since Uros committed it for me, I guess that's already off).




Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c(revision 248031)
+++ gcc/config/i386/i386.c(working copy)
@@ -2425,7 +2425,9 @@ static int const x86_64_int_return_registers[4] =

 /* Additional registers that are clobbered by SYSV calls.  */

-unsigned const x86_64_ms_sysv_extra_clobbered_registers[12] =
+#define NUM_X86_64_MS_CLOBBERED_REGS 12
+static int const x86_64_ms_sysv_extra_clobbered_registers
+ [NUM_X86_64_MS_CLOBBERED_REGS] =


Is there a reason you're changing this unsigned to signed int? While 
AX_REG and such are just preprocessor macros, everywhere else it seems 
that register numbers are dealt with as unsigned ints.



@@ -2484,13 +2486,13 @@ class xlogue_layout {
  needs to store registers based upon data in the 
machine_function.  */

   HOST_WIDE_INT get_stack_space_used () const
   {
-const struct machine_function  = *cfun->machine;
-unsigned last_reg = m.call_ms2sysv_extra_regs + MIN_REGS - 1;
+const struct machine_function *m = cfun->machine;
+unsigned last_reg = m->call_ms2sysv_extra_regs + MIN_REGS - 1;


What is the reason for this change?



-gcc_assert (m.call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
+gcc_assert (m->call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
 return m_regs[last_reg].offset
-+ (m.call_ms2sysv_pad_out ? 8 : 0)
-+ STUB_INDEX_OFFSET;
+   + (m->call_ms2sysv_pad_out ? 8 : 0)
+   + STUB_INDEX_OFFSET;
   }

   /* Returns the offset for the base pointer used by the stub. */
@@ -2532,7 +2534,7 @@ class xlogue_layout {
   /* Lazy-inited cache of symbol names for stubs.  */
   char 
m_stub_names[XLOGUE_STUB_COUNT][VARIANT_COUNT][STUB_NAME_MAX_LEN];


-  static const struct xlogue_layout GTY(()) 
s_instances[XLOGUE_SET_COUNT];
+  static const struct GTY(()) xlogue_layout 
s_instances[XLOGUE_SET_COUNT];


Hmm, during development I originally had C-style xlogue_layout as a 
struct and later decided to make it a class and apparently forgot to 
remove the "struct" here.  None the less, it's bazaar that the GTY() 
would go in between the "struct" and the "xlogue_layout."  As I said 
before, I don't fully understand how this GTY works.  Can we just remove 
the "struct" keyword?


Also, if the way I had it was wrong, (and resulted in garbage collection 
not working right) then perhaps it was the cause of a problem I had with 
caching symbol rtx objects.  I could not get this to work because my 
cached objects would somehow become stale and I've since removed that 
code (from xlogue_layout::get_stub_rtx).  (i.e., does GTY effect 
lifespan of globals, TU statics and static C++ data members?)



 /* Constructor for xlogue_layout.  */
@@ -2639,11 +2643,11 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT 
stack_

   : m_hfp (hfp) , m_nregs (hfp ? 17 : 18),
 m_stack_align_off_in (stack_align_off_in)
 {
+  HOST_WIDE_INT offset = stack_align_off_in;
+  unsigned i, j;
+
   memset (m_regs, 0, sizeof (m_regs));
   memset (m_stub_names, 0, sizeof (m_stub_names));
-
-  HOST_WIDE_INT offset = stack_align_off_in;
-  unsigned i, j;
   for (i = j = 0; i < MAX_REGS; ++i)
 {
   unsigned regno = REG_ORDER[i];
@@ -2662,11 +2666,12 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT 
stack_

   m_regs[j].regno= regno;
   m_regs[j++].offset = offset - STUB_INDEX_OFFSET;
 }
-gcc_assert (j 

Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-16 Thread Daniel Santos

On 05/16/2017 12:19 PM, Ian Lance Taylor wrote:

On Mon, May 15, 2017 at 10:00 PM, Daniel Santos <daniel.san...@pobox.com> wrote:

Ian, would you mind looking at this please?  A combination of my
-mcall-ms2sysv-xlogues patch with Bernd's patch is causing problems when
ix86_expand_split_stack_prologue() calls ix86_expand_call().

I don't have a lot of context here.  I assume that ms2sysv is going to
be used on Windows systems, where -fsplit-stack isn't really going to
work anyhow, so I think it would probably be OK that reject that
combination if it causes trouble.


Sorry I wasn't more specific.  This -mcall-ms2sysv-xlogues actually 
targets Wine, although they don't use -fsplit-stack.  My patch set as-is 
is disabled when fsplit-stack is used, but during 
ix86_compute_frame_layout, which is too late in the case of 
-fsplit-stack.  I think I should just change this to a sorry() in 
ix86_option_override_internal.



Also, it's overkill for ix86_expand_split_stack_prologue to call
ix86_expand_call.  The call is always to __morestack, and __morestack
is written in assembler, so we could use a simpler version of
ix86_expand_call if that helps.  In particular we can decide that
__morestack doesn't clobber any unusual registers, if that is what is
causing the problem.

Ian


Well aside from the conflict of the two patches, it just looks like it 
has the potential to generate clobbers where none are needed, but I'm 
having trouble actually *proving* that, so maybe I'm just wrong.


Daniel



Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-16 Thread Daniel Santos

On 05/16/2017 03:34 AM, Bernd Edlinger wrote:

It would be good to have test cases for each of the not-supported warnings that
can happen, so far I only managed to get a test case for -fsplit-stack.


Yes, I'm inclined to agree.  I'll try to get this done today or 
tomorrow.  I've also put in a limiter of one warning per TU.  One 
problem is that there isn't a way to disable the warning, so I may want 
to add that.


Daniel


Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-15 Thread Daniel Santos
Ian, would you mind looking at this please?  A combination of my 
-mcall-ms2sysv-xlogues patch with Bernd's patch is causing problems when 
ix86_expand_split_stack_prologue() calls ix86_expand_call().


On 05/15/2017 06:46 PM, Daniel Santos wrote:
Rather or not m->call_ms2sysv is set determines which stack layout is 
used when ix86_compute_frame_layout() runs. But if we can run 
expand_call after the final time ix86_compute_frame_layout() then we 
have a problem.  It looks like ix86_expand_split_stack_prologue is the 
only function that manually calls ix86_expand_call, but maybe it would 
be better to modify the test to something like this:


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a78819d6b3f..c36383f6962 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -29325,7 +29325,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx 
callarg1,

}

   /* Set here, but it may get cleared later.  */
-  if (TARGET_CALL_MS2SYSV_XLOGUES)
+  if (TARGET_CALL_MS2SYSV_XLOGUES && !reload_completed)
cfun->machine->call_ms2sysv = true;
 }




Actually, I think this is wrong.  I happened to recall looking at the 
morestack code last year and remembered that it was all assembly.  I 
looked at it again and I don't see that it calls anything outside of 
it's implementation file (libgcc/config/i386/morestack.S) except for 
_Unwind_Resume and the calling function its self (I think it calls its 
caller).  It saves and restores rsi and rdi and doesn't use any sse 
registers, so it doesn't need to clobber all of the regs in the 
x86_64_ms_sysv_extra_clobbered_registers array.  I'm guessing that this 
should have it's own pattern instead of calling ix86_expand_call in the 
first place.


Of course, I'm the new guy here, so please enlighten me if I'm wrong.

Thanks,
Daniel


Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-15 Thread Daniel Santos

On 05/15/2017 03:39 PM, Bernd Edlinger wrote:

On 05/15/17 03:39, Daniel Santos wrote:

On 05/14/2017 11:31 AM, Bernd Edlinger wrote:

Hi Daniel,

there is one thing I don't understand in your patch:
That is, it introduces a static value:

/* Registers who's save & restore will be managed by stubs called from
  pro/epilogue.  */
static HARD_REG_SET GTY(()) stub_managed_regs;

This seems to be set as a side effect of ix86_compute_frame_layout,
and depends on the register usage of the current function.
But values that depend on the current function need usually be
attached to cfun->machine, because the passes can run in parallel
unless I am completely mistaken, and the stub_managed_regs may
therefore be computed from a different function.


Bernd.

I should add that if you want to run faster tests just on the ms to sysv
abi code, you can use make RUNTESTFLAGS="ms-sysv.exp" check and then if
that succeeds run the full testsuite.

Daniel

Unfortunately I encounter a serious problem when my patch is used
ontop of your patch, Yes, the test suite ran without error, but then
I tried to trigger the warning and that tripped an ICE.
The reason is that cfun->machine->call_ms2sysv can be set to true
*after* reload_completed, which can be seen using the following
patch:

Index: i386.c
===
--- i386.c  (revision 248031)
+++ i386.c  (working copy)
@@ -29320,7 +29320,10 @@

 /* Set here, but it may get cleared later.  */
 if (TARGET_CALL_MS2SYSV_XLOGUES)
+  {
+   gcc_assert(!reload_completed);
cfun->machine->call_ms2sysv = true;
+  }
   }

 if (vec_len > 1)


That assertion is triggered in this test case:

cat test.c
int test()
{
__builtin_printf("test\n");
return 0;
}

gcc -mabi=ms -mcall-ms2sysv-xlogues -fsplit-stack -c test.c
test.c: In function 'test':
test.c:5:1: internal compiler error: in ix86_expand_call, at
config/i386/i386.c:29324
   }
   ^
0x13390a4 ix86_expand_call(rtx_def*, rtx_def*, rtx_def*, rtx_def*,
rtx_def*, bool)
../../gcc-trunk/gcc/config/i386/i386.c:29324
0x1317494 ix86_expand_split_stack_prologue()
../../gcc-trunk/gcc/config/i386/i386.c:15920
0x162ba21 gen_split_stack_prologue()
../../gcc-trunk/gcc/config/i386/i386.md:12556
0x12f3f30 target_gen_split_stack_prologue
../../gcc-trunk/gcc/config/i386/i386.md:12325
0xb237b3 make_split_prologue_seq
../../gcc-trunk/gcc/function.c:5822
0xb23a08 thread_prologue_and_epilogue_insns()
../../gcc-trunk/gcc/function.c:5958
0xb24840 rest_of_handle_thread_prologue_and_epilogue
../../gcc-trunk/gcc/function.c:6428
0xb248c0 execute
../../gcc-trunk/gcc/function.c:6470
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


so, in ix86_expand_split_stack_prologue
we first call:
ix86_finalize_stack_realign_flags ();
ix86_compute_frame_layout ();

and later:
call_insn = ix86_expand_call (NULL_RTX, gen_rtx_MEM (QImode, fn),
  GEN_INT (UNITS_PER_WORD), constm1_rtx,
  pop, false);

which changes a flag with a huge impact on the frame layout, but there
is no absolutely no way how the frame layout can change once it is
finalized.


Any Thoughts?


Bernd.


Well, my intention was actually to punt on those cases, but I hadn't 
actually tested with -fsplit-stack.  It looks like 
ix86_expand_split_stack_prologue calls ix86_expand_call, and I hadn't 
anticipated it getting called after the last call to 
ix86_compute_frame_layout(), which your patch has probably eliminated.  
In the case of -fsplit-stack, I'm testing the macro flag_split_stack 
which (currently) just expands to check the global flag, so this could 
instead be done in ix86_option_override_internal () instead, but I think 
it highlights a somewhat deeper problem.


Rather or not m->call_ms2sysv is set determines which stack layout is 
used when ix86_compute_frame_layout() runs.  But if we can run 
expand_call after the final time ix86_compute_frame_layout() then we 
have a problem.  It looks like ix86_expand_split_stack_prologue is the 
only function that manually calls ix86_expand_call, but maybe it would 
be better to modify the test to something like this:


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a78819d6b3f..c36383f6962 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -29325,7 +29325,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
}
 
   /* Set here, but it may get cleared later.  */

-  if (TARGET_CALL_MS2SYSV_XLOGUES)
+  if (TARGET_CALL_MS2SYSV_XLOGUES && !reload_completed)
cfun->machine->call_ms2sysv = true;
 }
 

Or even use the same incompatibility tests from 
ix86_c

Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-14 Thread Daniel Santos

On 05/14/2017 11:31 AM, Bernd Edlinger wrote:

Hi Daniel,

there is one thing I don't understand in your patch:
That is, it introduces a static value:

/* Registers who's save & restore will be managed by stubs called from
 pro/epilogue.  */
static HARD_REG_SET GTY(()) stub_managed_regs;

This seems to be set as a side effect of ix86_compute_frame_layout,
and depends on the register usage of the current function.
But values that depend on the current function need usually be
attached to cfun->machine, because the passes can run in parallel
unless I am completely mistaken, and the stub_managed_regs may
therefore be computed from a different function.


Bernd.


I should add that if you want to run faster tests just on the ms to sysv 
abi code, you can use make RUNTESTFLAGS="ms-sysv.exp" check and then if 
that succeeds run the full testsuite.


Daniel


Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-14 Thread Daniel Santos

On 05/14/2017 11:31 AM, Bernd Edlinger wrote:

Hi Daniel,

there is one thing I don't understand in your patch:
That is, it introduces a static value:

/* Registers who's save & restore will be managed by stubs called from
 pro/epilogue.  */
static HARD_REG_SET GTY(()) stub_managed_regs;

This seems to be set as a side effect of ix86_compute_frame_layout,
and depends on the register usage of the current function.
But values that depend on the current function need usually be
attached to cfun->machine, because the passes can run in parallel
unless I am completely mistaken, and the stub_managed_regs may
therefore be computed from a different function.


Bernd.


I'm relatively new to GCC and still learning.  However, there are quite 
a lot of static TU variables in i386.c like this.  I am not aware of gcc 
having parallelism support, but if it were to be added then all of these 
TU variables should probably be moved to some class or struct (like 
cfun->machine) to reduce the number of TLS lookups required (which I 
presume is a little more expensive than a this/offset calculation).  
Having this (as well as other variables) in such a struct is better 
design IMO, but as I said, I'm still learning GCC's architecture, idioms 
and patterns.  (I should add that I don't really understand the GTY 
memory management either. :)


To be clear on class xlogue_layout, the only instances of this class are 
const and could be shared across multiple threads.  It is dependent upon 
the cfun->machine as well as the global struct rtl_data crtl, but is not 
so entangled that were these proper C++ classes (with private data) that 
it would need to be a friend -- it only needs read-access to their data 
members.


To be honest, it's a strange feeling programming in a mixture of C and 
C++ idioms, but I know it was only recently converted to C++ so I think 
it's better to try to use only one or the other in a given function.  
But if I were going to do this all OO, then ix86_compute_frame_layout 
would be a member function of ix86_frame (which would be a 
specialization of some generic "frame" class), machine_function would be 
class ix86_machine_function with it's own compute_frame_layout that 
called ix86_frame::compute_frame_layout, etc.  If I really wanted to go 
nuts, I would consider making class function, et.al. template classes 
with machine_function and machine_function_state part of the object 
instead of pointers to separate objects to reduce accesses down to a 
single this/offset, but now I I'm *really* digressing...


Please free to move it.

Thanks,
Daniel


Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-14 Thread Daniel Santos

On 05/14/2017 02:42 AM, Bernd Edlinger wrote:

Hi,


this patch uses the new TARGET_COMPUTE_FRAME_LAYOUT hook in the i386
backend to avoid re-computing the frame layout when not really
necessary.

It simplifies the logic in ix86_compute_frame_layout by removing
the use_fast_prologue_epilogue_nregs, which is no longer necessary,
because the frame layout can no longer change spontaneously.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.


I think Uros is about to commit my improvements to ms to sysv abi calls, 
which is a large change and will conflict with your patch. I've added 
several new fields to struct ix86_frame that will need to be merged (and 
moved to i386.h).  I believe that my only explicit check of 
crtl->stack_realign_finalized is during pro/epilogue expand, and not in 
ix86_compute_frame_layout.  A former incarnation of my patches needed 
ix86_compute_frame_layout to be called *after* it was set, but I believe 
that is no longer the case, and so shouldn't conflict, but retesting 
should certainly be done.


https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01338.html

Thanks,
Daniel


Re: [PING] [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-13 Thread Daniel Santos

On 05/13/2017 11:52 AM, Uros Bizjak wrote:

On Sat, May 13, 2017 at 1:01 AM, Daniel Santos <daniel.san...@pobox.com> wrote:

Ping?  I have posted revisions of the following in patch set:

05/12 - https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01442.html
09/12 - https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00348.html
11/12 - https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00350.html

I have retested them on Linux x86-64 in addition a Wine testsuite comparison
resulting in fewer failed tests (31) than when using unpatched 7.1.0 (78)
and 5.4.0 (78).  A cursory examination of the now working failures with
7.1.0 seemed to be to be due to race conditions in Wine that are
incidentally hidden after the patches.

Is there anything else needed before we can commit these?  They still rebase
cleanly onto the HEAD, but I can repost as "v5" if you prefer.

Please go ahead and commit the patches.

However, please stay around to fix possible fallout. As said - you are
touching quite complex part of the compiler ...

Thanks,
Uros.


Thanks!  I'll definitely be around, I have a lot more that I'm working 
on with C generics/pseudo-templates (all middle-end stuff). I also want 
to examine more ways that SSE saves/restores can be omitted in these ms 
to sysv calls through static analysis and such.


Anyway, I don't yet have SVN write access, will you sponsor my request?

Thanks,
Daniel



[PING] [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-12 Thread Daniel Santos

Ping?  I have posted revisions of the following in patch set:

05/12 - https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01442.html
09/12 - https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00348.html
11/12 - https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00350.html

I have retested them on Linux x86-64 in addition a Wine testsuite 
comparison resulting in fewer failed tests (31) than when using 
unpatched 7.1.0 (78) and 5.4.0 (78).  A cursory examination of the now 
working failures with 7.1.0 seemed to be to be due to race conditions in 
Wine that are incidentally hidden after the patches.


Is there anything else needed before we can commit these?  They still 
rebase cleanly onto the HEAD, but I can repost as "v5" if you prefer.


Thanks,
Daniel



Re: [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-08 Thread Daniel Santos

On 05/06/2017 03:22 PM, Daniel Santos wrote:


gcc-5.4.0 CFLAGS="-march=native -O2 -g": 74
gcc-7.1.0 CFLAGS="-march=native -O2 -g": 74
gcc-7.1.0 CFLAGS="-march=nocona -mtune=generic -O2 -g": 79
gcc-7.1.0 CFLAGS="-march=native -O2 -g -mcall-ms2sysv-xlogues" 
(patched): 31


I'm building out a clean test environment on another machine to try to 
rule out clutter issues (and video driver issues) on my workstation.


Daniel



I've re-run Wine's tests with a new clean VM environment and some 
changes to include more tests and similar results:


Compiler Failures
gcc-4.9.4:   39
gcc-7.1.0:   78
gcc-7.1.0-patched (with -mcall-ms2sysv-xlogues): 40


The first error not present in the gcc-4.9.4 tests that I examined 
looked like a run-of-the-mill race condition in Wine that just happened 
to not crash when built with 4.9.4.  So I'm going to guess that the 
disappearance of these failures with -mcall-ms2sysv-xlogues is just 
incidental.  I think we're in good condition with this patch set.


Daniel


Re: [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-06 Thread Daniel Santos

On 05/05/2017 03:56 AM, Daniel Santos wrote:

On 05/02/2017 05:40 AM, Kai Tietz wrote:

Right, and Wine people will tell, if something doesn't work for them.
So ok for me too.

Kai
Well, I haven't re-run these tests in a few months, but I got 272 
failed wine tests with gcc 7.1 and 234 with my patch set rebased onto 
7.1.  So it looks like I'll be trying to diagnose these failures this 
weekend.


Those are bad numbers.  I had forgotten to filter out the testlist.o 
files.  Below are my most recent numbers running Wine 2.7:


gcc-5.4.0 CFLAGS="-march=native -O2 -g": 74
gcc-7.1.0 CFLAGS="-march=native -O2 -g": 74
gcc-7.1.0 CFLAGS="-march=nocona -mtune=generic -O2 -g": 79
gcc-7.1.0 CFLAGS="-march=native -O2 -g -mcall-ms2sysv-xlogues" (patched): 31

I'm building out a clean test environment on another machine to try to 
rule out clutter issues (and video driver issues) on my workstation.


Daniel


Re: [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-05 Thread Daniel Santos

On 05/02/2017 05:40 AM, Kai Tietz wrote:

Right, and Wine people will tell, if something doesn't work for them.
So ok for me too.

Kai
Well, I haven't re-run these tests in a few months, but I got 272 failed 
wine tests with gcc 7.1 and 234 with my patch set rebased onto 7.1.  So 
it looks like I'll be trying to diagnose these failures this weekend.


Daniel


[PATCH 11/12 rev1] [i386] Add remainder of -mcall-ms2sysv-xlogues implementation

2017-05-04 Thread Daniel Santos
Now generates RTL with appropriate stack restore and leave patterns.  Slightly
cleaned up code that calculates the number of vector elements for clarity.

Tests are good when rebased onto gcc-7_1_0-release as HEAD currently fails to
bootstrap.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/i386.c | 287 +++--
 1 file changed, 278 insertions(+), 9 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f2772b2d10e..e43dc819f9a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -14148,6 +14148,78 @@ ix86_elim_entry_set_got (rtx reg)
 }
 }
 
+static rtx
+gen_frame_set (rtx reg, rtx frame_reg, int offset, bool store)
+{
+  rtx addr, mem;
+
+  if (offset)
+addr = gen_rtx_PLUS (Pmode, frame_reg, GEN_INT (offset));
+  mem = gen_frame_mem (GET_MODE (reg), offset ? addr : frame_reg);
+  return gen_rtx_SET (store ? mem : reg, store ? reg : mem);
+}
+
+static inline rtx
+gen_frame_load (rtx reg, rtx frame_reg, int offset)
+{
+  return gen_frame_set (reg, frame_reg, offset, false);
+}
+
+static inline rtx
+gen_frame_store (rtx reg, rtx frame_reg, int offset)
+{
+  return gen_frame_set (reg, frame_reg, offset, true);
+}
+
+static void
+ix86_emit_outlined_ms2sysv_save (const struct ix86_frame )
+{
+  struct machine_function *m = cfun->machine;
+  const unsigned ncregs = NUM_X86_64_MS_CLOBBERED_REGS
+ + m->call_ms2sysv_extra_regs;
+  rtvec v = rtvec_alloc (ncregs + 1);
+  unsigned int align, i, vi = 0;
+  rtx_insn *insn;
+  rtx sym, addr;
+  rtx rax = gen_rtx_REG (word_mode, AX_REG);
+  const struct xlogue_layout  = xlogue_layout::get_instance ();
+  HOST_WIDE_INT rax_offset = xlogue.get_stub_ptr_offset () + m->fs.sp_offset;
+  HOST_WIDE_INT stack_alloc_size = frame.stack_pointer_offset - 
m->fs.sp_offset;
+  HOST_WIDE_INT stack_align_off_in = xlogue.get_stack_align_off_in ();
+
+  /* Verify that the incoming stack 16-byte alignment offset matches the
+ layout we're using.  */
+  gcc_assert (stack_align_off_in == (m->fs.sp_offset & UNITS_PER_WORD));
+
+  /* Get the stub symbol.  */
+  sym = xlogue.get_stub_rtx (frame_pointer_needed ? XLOGUE_STUB_SAVE_HFP
+ : XLOGUE_STUB_SAVE);
+  RTVEC_ELT (v, vi++) = gen_rtx_USE (VOIDmode, sym);
+
+  /* Setup RAX as the stub's base pointer.  */
+  align = GET_MODE_ALIGNMENT (V4SFmode);
+  addr = choose_baseaddr (rax_offset, );
+  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
+  insn = emit_insn (gen_rtx_SET (rax, addr));
+
+  gcc_assert (stack_alloc_size >= xlogue.get_stack_space_used ());
+  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+GEN_INT (-stack_alloc_size), -1,
+m->fs.cfa_reg == stack_pointer_rtx);
+  for (i = 0; i < ncregs; ++i)
+{
+  const xlogue_layout::reginfo  = xlogue.get_reginfo (i);
+  rtx reg = gen_rtx_REG ((SSE_REGNO_P (r.regno) ? V4SFmode : word_mode),
+r.regno);
+  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);;
+}
+
+  gcc_assert (vi == (unsigned)GET_NUM_ELEM (v));
+
+  insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, v));
+  RTX_FRAME_RELATED_P (insn) = true;
+}
+
 /* Expand the prologue into a bunch of separate insns.  */
 
 void
@@ -14395,7 +14467,7 @@ ix86_expand_prologue (void)
 performing the actual alignment.  Otherwise we cannot guarantee
 that there's enough storage above the realignment point.  */
   allocate = frame.stack_realign_allocate_offset - m->fs.sp_offset;
-  if (allocate)
+  if (allocate && !m->call_ms2sysv)
 pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
   GEN_INT (-allocate), -1, false);
 
@@ -14403,7 +14475,6 @@ ix86_expand_prologue (void)
   insn = emit_insn (ix86_gen_andsp (stack_pointer_rtx,
stack_pointer_rtx,
GEN_INT (-align_bytes)));
-
   /* For the purposes of register save area addressing, the stack
 pointer can no longer be used to access anything in the frame
 below m->fs.sp_realigned_offset and the frame pointer cannot be
@@ -14420,6 +14491,9 @@ ix86_expand_prologue (void)
m->fs.sp_valid = false;
 }
 
+  if (m->call_ms2sysv)
+ix86_emit_outlined_ms2sysv_save (frame);
+
   allocate = frame.stack_pointer_offset - m->fs.sp_offset;
 
   if (flag_stack_usage_info)
@@ -14740,17 +14814,19 @@ ix86_emit_restore_regs_using_pop (void)
   unsigned int regno;
 
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, false))
+if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, false, true))
   ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, regno));
 

[PATCH 09/12 rev1] [i386] Add patterns and predicates mcall-ms2sysv-xlogues

2017-05-04 Thread Daniel Santos
I've cleaned up the patterns and predicates as per your instructions, resulting
in 74 less lines of code.  Adding explicit insns to restore the stack pointer
and pointer perform the "leave" (to the patterns restore_multiple_and_return
and restore_multiple_leave_return, respectively) disambiguates them just fine
without the const_int tag while correctly describing exactly what the pattern
does.

Thanks for your guidance.  I understand RTL much better now.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/config/i386/predicates.md | 81 +++
 gcc/config/i386/sse.md| 37 
 2 files changed, 118 insertions(+)

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 8f250a2e720..e7371a41b16 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1657,3 +1657,84 @@
   (ior (match_operand 0 "register_operand")
(and (match_code "const_int")
(match_test "op == constm1_rtx"
+
+;; Return true if the vector ends with between 12 and 18 register saves using
+;; RAX as the base address.
+(define_predicate "save_multiple"
+  (match_code "parallel")
+{
+  const unsigned len = XVECLEN (op, 0);
+  unsigned i;
+
+  /* Starting from end of vector, count register saves.  */
+  for (i = 0; i < len; ++i)
+{
+  rtx src, dest, addr;
+  rtx e = XVECEXP (op, 0, len - 1 - i);
+
+  if (GET_CODE (e) != SET)
+   break;
+
+  src  = SET_SRC (e);
+  dest = SET_DEST (e);
+
+  if (!REG_P (src) || !MEM_P (dest))
+   break;
+
+  addr = XEXP (dest, 0);
+
+  /* Good if dest address is in RAX.  */
+  if (REG_P (addr) && REGNO (addr) == AX_REG)
+   continue;
+
+  /* Good if dest address is offset of RAX.  */
+  if (GET_CODE (addr) == PLUS
+ && REG_P (XEXP (addr, 0))
+ && REGNO (XEXP (addr, 0)) == AX_REG)
+   continue;
+
+  break;
+}
+  return (i >= 12 && i <= 18);
+})
+
+
+;; Return true if the vector ends with between 12 and 18 register loads using
+;; RSI as the base address.
+(define_predicate "restore_multiple"
+  (match_code "parallel")
+{
+  const unsigned len = XVECLEN (op, 0);
+  unsigned i;
+
+  /* Starting from end of vector, count register restores.  */
+  for (i = 0; i < len; ++i)
+{
+  rtx src, dest, addr;
+  rtx e = XVECEXP (op, 0, len - 1 - i);
+
+  if (GET_CODE (e) != SET)
+   break;
+
+  src  = SET_SRC (e);
+  dest = SET_DEST (e);
+
+  if (!MEM_P (src) || !REG_P (dest))
+   break;
+
+  addr = XEXP (src, 0);
+
+  /* Good if src address is in RSI.  */
+  if (REG_P (addr) && REGNO (addr) == SI_REG)
+   continue;
+
+  /* Good if src address is offset of RSI.  */
+  if (GET_CODE (addr) == PLUS
+ && REG_P (XEXP (addr, 0))
+ && REGNO (XEXP (addr, 0)) == SI_REG)
+   continue;
+
+  break;
+}
+  return (i >= 12 && i <= 18);
+})
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 094404bc913..d488b25c254 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -20010,3 +20010,40 @@
   (match_operand:VI48_512 1 "nonimmediate_operand" "vm")))]
   "TARGET_AVX512VPOPCNTDQ"
   "vpopcnt\t{%1, %0|%0, %1}")
+
+;; Save multiple registers out-of-line.
+(define_insn "save_multiple"
+  [(match_parallel 0 "save_multiple"
+[(use (match_operand:P 1 "symbol_operand"))])]
+  "TARGET_SSE && TARGET_64BIT"
+  "call\t%P1")
+
+;; Restore multiple registers out-of-line.
+(define_insn "restore_multiple"
+  [(match_parallel 0 "restore_multiple"
+[(use (match_operand:P 1 "symbol_operand"))])]
+  "TARGET_SSE && TARGET_64BIT"
+  "call\t%P1")
+
+;; Restore multiple registers out-of-line and return.
+(define_insn "restore_multiple_and_return"
+  [(match_parallel 0 "restore_multiple"
+[(return)
+ (use (match_operand:P 1 "symbol_operand"))
+ (set (reg:DI SP_REG) (reg:DI R10_REG))
+])]
+  "TARGET_SSE && TARGET_64BIT"
+  "jmp\t%P1")
+
+;; Restore multiple registers out-of-line when hard frame pointer is used,
+;; perform the leave operation prior to returning (from the function).
+(define_insn "restore_multiple_leave_return"
+  [(match_parallel 0 "restore_multiple"
+[(return)
+ (use (match_operand:P 1 "symbol_operand"))
+ (set (reg:DI SP_REG) (plus:DI (reg:DI BP_REG) (const_int 8)))
+ (set (reg:DI BP_REG) (mem:DI (reg:DI BP_REG)))
+ (clobber (mem:BLK (scratch)))
+])]
+  "TARGET_SSE && TARGET_64BIT"
+  "jmp\t%P1")
-- 
2.11.0



Re: [PATCH 09/12] [i386] Add patterns and predicates foutline-msabi-xlouges

2017-05-03 Thread Daniel Santos

On 05/03/2017 01:10 AM, Uros Bizjak wrote:

The order of subexpressions of parallel in general does not matter.


Thanks, this makes things much clearer.


Also, I'm wondering if there's anything wrong with calling ix86_gen_leave ()
and plucking the insns out of the generated parallel insn and moving that
into my own parallel rather than generating them in my own function.  I
guess all the matters is what is cleanest.

Hm... I'd rather see subexpressions generated "by hand".


OK.  While we're on the topic, are you OK with my changes to 
ix86_emit_leave to generate the notes or would you prefer those by hand 
as well?


Also, are these predicates what you had in mind?  (I haven't actually 
tested them just yet.)


(define_predicate "save_multiple"
  (match_code "parallel")
{
  const unsigned len = XVECLEN (op, 0);
  unsigned i;

  /* Starting from end of vector, count register saves.  */
  for (i = 0; i < len; ++i)
{
  rtx src, dest, addr;
  rtx e = XVECEXP (op, 0, len - 1 - i);

  if (GET_CODE (e) != SET)
break;

  src  = SET_SRC (e);
  dest = SET_DEST (e);

  if (!REG_P (src) || !MEM_P (dest))
break;

  addr = XEXP (dest, 0);

  /* Good if dest address is in RAX.  */
  if (REG_P (addr) && REGNO (addr) == AX_REG)
continue;

  /* Good if dest address is offset of RAX.  */
  if (GET_CODE (addr) == PLUS
  && REG_P (XEXP (addr, 0))
  && REGNO (XEXP (addr, 0)) == AX_REG)
continue;

  break;
}
  return (i >= 12 && i <= 18);
})


(define_predicate "restore_multiple"
  (match_code "parallel")
{
  const unsigned len = XVECLEN (op, 0);
  unsigned i;

  /* Starting from end of vector, count register restores.  */
  for (i = 0; i < len; ++i)
{
  rtx src, dest, addr;
  rtx e = XVECEXP (op, 0, len - 1 - i);

  if (GET_CODE (e) != SET)
break;

  src  = SET_SRC (e);
  dest = SET_DEST (e);

  if (!MEM_P (src) || !REG_P (dest))
break;

  addr = XEXP (src, 0);

  /* Good if src address is in RSI.  */
  if (REG_P (addr) && REGNO (addr) == SI_REG)
continue;

  /* Good if src address is offset of RSI.  */
  if (GET_CODE (addr) == PLUS
  && REG_P (XEXP (addr, 0))
  && REGNO (XEXP (addr, 0)) == SI_REG)
continue;

  break;
}
  return (i >= 12 && i <= 18);
})


Thanks,
Daniel



Re: [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-02 Thread Daniel Santos

On 05/02/2017 05:40 AM, Kai Tietz wrote:

Right, and Wine people will tell, if something doesn't work for them.
So ok for me too.

Kai


Yes, and I although I haven't repeated the Wine tests in a few months, 
little has changed since my last run.  I'll be running them again soon 
anyway.


Daniel


Re: [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-02 Thread Daniel Santos

On 05/02/2017 05:21 AM, JonY wrote:

On 05/01/2017 11:31 AM, Uros Bizjak wrote:


I also assume that Cygwin and MinGW people agree with the patch and
the functionality itself.

Uros.


Cygwin and MinGW does not use SysV/MS transitions directly in their own
code, changes should be OK.


And to be clear, this did initially have a failed gcc test on Cygwin due 
to the aligned SSE MOVs portion of the patch set (this is the first 
three patches) and this is resolved by disabling that feature on SEH 
targets.  This is the last two lines in the below chunk from 3/12:



@@ -14080,11 +14102,19 @@ ix86_expand_prologue (void)
GEN_INT (-align_bytes)));
  
/* For the purposes of register save area addressing, the stack

- pointer is no longer valid.  As for the value of sp_offset,
-see ix86_compute_frame_layout, which we need to match in order
-to pass verification of stack_pointer_offset at the end.  */
+pointer can no longer be used to access anything in the frame
+below m->fs.sp_realigned_offset and the frame pointer cannot be
+used for anything at or above.  */
m->fs.sp_offset = ROUND_UP (m->fs.sp_offset, align_bytes);
-  m->fs.sp_valid = false;
+  m->fs.sp_realigned = true;
+  m->fs.sp_realigned_offset = m->fs.sp_offset - frame.nsseregs * 16;
+  gcc_assert (m->fs.sp_realigned_offset == frame.stack_realign_offset);
+  /* SEH unwind emit doesn't currently support REG_CFA_EXPRESSION, which
+is needed to describe where a register is saved using a realigned
+stack pointer, so we need to invalidate the stack pointer for that
+target.  */
+  if (TARGET_SEH)
+   m->fs.sp_valid = false;
  }
  
allocate = frame.stack_pointer_offset - m->fs.sp_offset;


Still I have run complete tests on Cygwin with 32- and 64-bit using both 
the Cygwin and MinGW compilers.


Daniel



Re: [PATCH 09/12] [i386] Add patterns and predicates foutline-msabi-xlouges

2017-05-02 Thread Daniel Santos

Thank you for the review.

On 05/01/2017 06:18 AM, Uros Bizjak wrote:

On Thu, Apr 27, 2017 at 10:09 AM, Daniel Santos <daniel.san...@pobox.com> wrote:

Adds the predicates save_multiple and restore_multiple to predicates.md,
which are used by following patterns in sse.md:

* save_multiple - insn that calls a save stub
* restore_multiple - call_insn that calls a save stub and returns to the
   function to allow a sibling call (which should typically offer better
   optimization than the restore stub as the tail call)
* restore_multiple_and_return - a jump_insn that returns from the
   function as a tail-call.
* restore_multiple_leave_return - like the above, but restores the frame
   pointer before returning.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
  gcc/config/i386/predicates.md | 155 ++
  gcc/config/i386/sse.md|  37 ++
  2 files changed, 192 insertions(+)

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 8f250a2e720..36fe8abc3f4 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1657,3 +1657,158 @@
(ior (match_operand 0 "register_operand")
 (and (match_code "const_int")
 (match_test "op == constm1_rtx"
+
+;; Return true if:
+;; 1. first op is a symbol reference,
+;; 2. >= 13 operands, and
+;; 3. operands 2 to end is one of:
+;;   a. save a register to a memory location, or
+;;   b. restore stack pointer.
+(define_predicate "save_multiple"
+  (match_code "parallel")
+{
+  const unsigned nregs = XVECLEN (op, 0);
+  rtx head = XVECEXP (op, 0, 0);
+  unsigned i;
+
+  if (GET_CODE (head) != USE)
+return false;
+  else
+{
+  rtx op0 = XEXP (head, 0);
+  if (op0 == NULL_RTX || GET_CODE (op0) != SYMBOL_REF)
+   return false;
+}
+
+  if (nregs < 13)
+return false;
+
+  for (i = 2; i < nregs; i++)
+{
+  rtx e, src, dest;
+
+  e = XVECEXP (op, 0, i);
+
+  switch (GET_CODE (e))
+   {
+ case SET:
+   src  = SET_SRC (e);
+   dest = SET_DEST (e);
+
+   /* storing a register to memory.  */
+   if (GET_CODE (src) == REG && GET_CODE (dest) == MEM)

Please use REG_P (...) and MEM_P (...) - and possible others -
predicates in the code.


+ {
+   rtx addr = XEXP (dest, 0);
+
+   /* Good if dest address is in RAX.  */
+   if (GET_CODE (addr) == REG
+   && REGNO (addr) == AX_REG)
+ continue;
+
+   /* Good if dest address is offset of RAX.  */
+   if (GET_CODE (addr) == PLUS
+   && GET_CODE (XEXP (addr, 0)) == REG
+   && REGNO (XEXP (addr, 0)) == AX_REG)
+ continue;
+ }
+   break;
+
+ default:
+   break;
+   }
+   return false;
+}
+  return true;
+})
+
+;; Return true if:
+;; * first op is (return) or a a use (symbol reference),
+;; * >= 14 operands, and
+;; * operands 2 to end are one of:
+;;   - restoring a register from a memory location that's an offset of RSI.
+;;   - clobbering a reg
+;;   - adjusting SP
+(define_predicate "restore_multiple"
+  (match_code "parallel")
+{
+  const unsigned nregs = XVECLEN (op, 0);
+  rtx head = XVECEXP (op, 0, 0);
+  unsigned i;
+
+  switch (GET_CODE (head))
+{
+  case RETURN:
+   i = 3;
+   break;
+
+  case USE:
+  {
+   rtx op0 = XEXP (head, 0);
+
+   if (op0 == NULL_RTX || GET_CODE (op0) != SYMBOL_REF)
+ return false;
+
+   i = 1;
+   break;
+  }
+
+  default:
+   return false;
+}
+
+  if (nregs < i + 12)
+return false;
+
+  for (; i < nregs; i++)
+{
+  rtx e, src, dest;
+
+  e = XVECEXP (op, 0, i);
+
+  switch (GET_CODE (e))
+   {
+ case CLOBBER:
+   continue;

I don't see where CLOBBER is genreated in ix86_emit_outlined_ms2sysv_restore.


I think this is clutter that I didn't remove after changing the stubs.


+
+ case SET:
+   src  = SET_SRC (e);
+   dest = SET_DEST (e);
+
+   /* Restoring a register from memory.  */
+   if (GET_CODE (src) == MEM && GET_CODE (dest) == REG)
+ {
+   rtx addr = XEXP (src, 0);
+
+   /* Good if src address is in RSI.  */
+   if (GET_CODE (addr) == REG
+   && REGNO (addr) == SI_REG)
+ continue;
+
+   /* Good if src address is offset of RSI.  */
+   if (GET_CODE (addr) == PLUS
+   && GET_CODE (XEXP (addr, 0)) == REG
+   && REGNO (XEXP (addr, 0)) == SI_REG)
+ continue;
+
+   /* Good if adjusting stack pointer.  */
+   if (GET_C

Re: [PATCH 05/12 rev 1] [i386] Add option -mcall-ms2sysv-xlogues

2017-04-28 Thread Daniel Santos
Oops.  I blame my fingers.  :)

Daniel
---
 gcc/config/i386/i386.c   |  6 +-
 gcc/config/i386/i386.opt |  4 
 gcc/doc/invoke.texi  | 13 -
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 113f83742c2..521116195cb 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4508,7 +4508,8 @@ ix86_target_string (HOST_WIDE_INT isa, HOST_WIDE_INT isa2,
 { "-mstv", MASK_STV },
 { "-mavx256-split-unaligned-load", MASK_AVX256_SPLIT_UNALIGNED_LOAD },
 { "-mavx256-split-unaligned-store",
MASK_AVX256_SPLIT_UNALIGNED_STORE },
-{ "-mprefer-avx128",   MASK_PREFER_AVX128 }
+{ "-mprefer-avx128",   MASK_PREFER_AVX128 },
+{ "-mcall-ms2sysv-xlogues",MASK_CALL_MS2SYSV_XLOGUES }
   };
 
   /* Additional flag options.  */
@@ -6319,6 +6320,9 @@ ix86_option_override_internal (bool main_args_p,
 #endif
}
 
+  if (TARGET_SEH && TARGET_CALL_MS2SYSV_XLOGUES)
+sorry ("-mcall-ms2sysv-xlogues isn%'t currently supported with SEH");
+
   if (!(opts_set->x_target_flags & MASK_VZEROUPPER))
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 9384e29b1de..65b228544a5 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -538,6 +538,10 @@ Enum(calling_abi) String(sysv) Value(SYSV_ABI)
 EnumValue
 Enum(calling_abi) String(ms) Value(MS_ABI)
 
+mcall-ms2sysv-xlogues
+Target Report Mask(CALL_MS2SYSV_XLOGUES) Save
+Use libgcc stubs to save and restore registers clobbered by 64-bit Microsoft 
to System V ABI calls.
+
 mveclibabi=
 Target RejectNegative Joined Var(ix86_veclibabi_type) Enum(ix86_veclibabi) 
Init(ix86_veclibabi_type_none)
 Vector library ABI to use.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0eeea7b3b87..d9894f37ee5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1209,7 +1209,7 @@ See RS/6000 and PowerPC Options.
 -msse2avx  -mfentry  -mrecord-mcount  -mnop-mcount  -m8bit-idiv @gol
 -mavx256-split-unaligned-load  -mavx256-split-unaligned-store @gol
 -malign-data=@var{type}  -mstack-protector-guard=@var{guard} @gol
--mmitigate-rop  -mgeneral-regs-only}
+-mmitigate-rop  -mgeneral-regs-only  -mcall-ms2sysv-xlogues}
 
 @emph{x86 Windows Options}
 @gccoptlist{-mconsole  -mcygwin  -mno-cygwin  -mdll @gol
@@ -25308,6 +25308,17 @@ You can control this behavior for specific functions by
 using the function attributes @code{ms_abi} and @code{sysv_abi}.
 @xref{Function Attributes}.
 
+@item -mcall-ms2sysv-xlogues
+@opindex mcall-ms2sysv-xlogues
+@opindex mno-call-ms2sysv-xlogues
+Due to differences in 64-bit ABIs, any Microsoft ABI function that calls a
+System V ABI function must consider RSI, RDI and XMM6-15 as clobbered.  By
+default, the code for saving and restoring these registers is emitted inline,
+resulting in fairly lengthy prologues and epilogues.  Using
+@option{-mcall-ms2sysv-xlogues} emits prologues and epilogues that
+use stubs in the static portion of libgcc to perform these saves and restores,
+thus reducing function size at the cost of a few extra instructions.
+
 @item -mtls-dialect=@var{type}
 @opindex mtls-dialect
 Generate code to access thread-local storage using the @samp{gnu} or
-- 
2.11.0



Re: [PATCH v4 0/12 GCC8] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-04-27 Thread Daniel Santos

I probably should have mentioned that these are all for GCC8.


[PATCH 12/12] [i386,testsuite] Test program for ms to sysv abi function calls.

2017-04-27 Thread Daniel Santos
A comprehensive program for testing x86_64 ms_abi functions that call
sysv_abi functions to help validate -mcall-ms2sysv-xlogues and use of
aligned SSE MOVs after a (non-DRAP) realigned stack.

Signed-off-by: Daniel Santos <daniel.san...@pobox.com>
---
 gcc/Makefile.in|   2 +
 .../gcc.target/x86_64/abi/ms-sysv/do-test.S| 163 +
 gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc | 807 +
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.c| 373 ++
 .../gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp  | 178 +
 5 files changed, 1523 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f675e073ecc..7f7c238127b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3807,7 +3807,9 @@ site.exp: ./config.status Makefile
@echo "set CFLAGS \"\"" >> ./site.tmp
@echo "set CXXFLAGS \"\"" >> ./site.tmp
@echo "set HOSTCC \"$(CC)\"" >> ./site.tmp
+   @echo "set HOSTCXX \"$(CXX)\"" >> ./site.tmp
@echo "set HOSTCFLAGS \"$(CFLAGS)\"" >> ./site.tmp
+   @echo "set HOSTCXXFLAGS \"$(CXXFLAGS)\"" >> ./site.tmp
 # TEST_ALWAYS_FLAGS are flags that should be passed to every compilation.
 # They are passed first to allow individual tests to override them.
@echo "set TEST_ALWAYS_FLAGS \"$(SYSROOT_CFLAGS_FOR_TARGET)\"" >> 
./site.tmp
diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S 
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
new file mode 100644
index 000..1395235fd1e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
@@ -0,0 +1,163 @@
+/* Assembly proxy functions for ms_abi tests.
+   Copyright (C) 2016-2017 Free Software Foundation, Inc.
+   Contributed by Daniel Santos <daniel.san...@pobox.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#ifdef __x86_64__
+
+# ifdef __ELF__
+#  define ELFFN_BEGIN(fn)   .type fn,@function
+#  define ELFFN_END(fn) .size fn,.-fn
+# else
+#  define ELFFN_BEGIN(fn)
+#  define ELFFN_END(fn)
+# endif
+
+# define FUNC(fn)  \
+   .global fn; \
+   ELFFN_BEGIN(fn);\
+fn:
+
+#define FUNC_END(fn) ELFFN_END(fn)
+
+# ifdef __AVX__
+#  define MOVAPS vmovaps
+# else
+#  define MOVAPS movaps
+# endif
+
+/* TODO: Is there a cleaner way to provide these offsets?  */
+   .struct 0
+test_data_save:
+
+   .struct test_data_save + 224
+test_data_input:
+
+   .struct test_data_save + 448
+test_data_output:
+
+   .struct test_data_save + 672
+test_data_fn:
+
+   .struct test_data_save + 680
+test_data_retaddr:
+
+   .text
+
+regs_to_mem:
+   MOVAPS  %xmm6, (%rax)
+   MOVAPS  %xmm7, 0x10(%rax)
+   MOVAPS  %xmm8, 0x20(%rax)
+   MOVAPS  %xmm9, 0x30(%rax)
+   MOVAPS  %xmm10, 0x40(%rax)
+   MOVAPS  %xmm11, 0x50(%rax)
+   MOVAPS  %xmm12, 0x60(%rax)
+   MOVAPS  %xmm13, 0x70(%rax)
+   MOVAPS  %xmm14, 0x80(%rax)
+   MOVAPS  %xmm15, 0x90(%rax)
+   mov %rsi, 0xa0(%rax)
+   mov %rdi, 0xa8(%rax)
+   mov %rbx, 0xb0(%rax)
+   mov %rbp, 0xb8(%rax)
+   mov %r12, 0xc0(%rax)
+   mov %r13, 0xc8(%rax)
+   mov %r14, 0xd0(%rax)
+   mov %r15, 0xd8(%rax)
+   retq
+
+mem_to_regs:
+   MOVAPS  (%rax), %xmm6
+   MOVAPS  0x10(%rax),%xmm7
+   MOVAPS  0x20(%rax),%xmm8
+   MOVAPS  0x30(%rax),%xmm9
+   MOVAPS  0x40(%rax),%xmm10
+   MOVAPS  0x50(%rax),%xmm11
+   MOVAPS  0x60(%rax),%xmm12
+   MOVAPS  0x70(%rax),%xmm13
+   MOVAPS  0x80(%rax),%xmm14
+   MOVAPS  0x90(%rax),%xmm15
+   mov 0

  1   2   >